Standard deviation of the mean. Mean square (standard) deviation

Values obtained from experiment inevitably contain errors, which arise for a wide variety of reasons. Among them, one should distinguish between systematic and random errors. Systematic errors are caused by factors that act in a definite way, and they can always be eliminated or taken into account quite accurately. Random errors are caused by a very large number of individual factors that cannot be accounted for accurately and that act differently in each individual measurement. These errors cannot be excluded completely; they can only be taken into account on average, which requires knowing the laws that govern random errors.

We will denote the measured quantity by A, and the random error in the measurement by x. Since the error x can take on any value, it is a continuous random variable, which is fully characterized by its distribution law.

The simplest law, and the one that most accurately reflects reality in the vast majority of cases, is the so-called normal error distribution law:

$$\varphi(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^{2}/(2\sigma^{2})}$$

This distribution law can be derived from various theoretical premises, in particular from the requirement that the most probable value of an unknown quantity, for which a series of equally accurate values has been obtained by direct measurement, is the arithmetic mean of these values. The quantity σ² is called the dispersion (variance) of this normal law.

Average

Determination of dispersion from experimental data. If n values a_i of a quantity A are obtained by direct measurement with the same degree of accuracy, and if the errors of A obey the normal distribution law, then the most probable value of A is the arithmetic mean:

$$\bar{a} = \frac{1}{n}\sum_{i=1}^{n} a_i$$

ā - arithmetic mean,

a_i - measured value at the i-th step.

The deviation of each observed value a_i of the quantity A from the arithmetic mean is a_i - ā.

To determine the variance of the normal error distribution law in this case, use the formula:

$$\sigma^{2} = \frac{1}{n-1}\sum_{i=1}^{n}\left(a_i - \bar{a}\right)^{2}$$

σ² - dispersion,
ā - arithmetic mean,
n - number of parameter measurements,
a_i - measured value at the i-th step.

Standard deviation

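Standard deviation shows the absolute deviation of the measured values from the arithmetic mean. In accordance with the formula for the measure of accuracy of a linear combination, the mean square error of the arithmetic mean is determined by the formula:

$$\sigma_{\bar{a}} = \frac{\sigma}{\sqrt{n}} = \sqrt{\frac{\sum_{i=1}^{n}\left(a_i - \bar{a}\right)^{2}}{n\,(n-1)}}$$

where

σ_ā - mean square error of the arithmetic mean,
σ - standard deviation,
ā - arithmetic mean,
n - number of parameter measurements,
a_i - measured value at the i-th step.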
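The quantities defined so far can be computed in a few lines. The following is a minimal sketch, not a definitive implementation: the function name `mean_and_errors` and the sample readings are illustrative assumptions, and the variance is assumed to use the n - 1 divisor.

```python
import math

def mean_and_errors(values):
    """Return (mean, variance, std_dev, std_error_of_mean) for repeated measurements."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two measurements are required")
    mean = sum(values) / n
    # Sum of squared deviations of each measurement from the arithmetic mean
    ss = sum((a - mean) ** 2 for a in values)
    variance = ss / (n - 1)              # assumed n - 1 divisor
    std_dev = math.sqrt(variance)
    std_error = std_dev / math.sqrt(n)   # same as sqrt(ss / (n * (n - 1)))
    return mean, variance, std_dev, std_error

readings = [10.1, 9.9, 10.0, 10.2, 9.8]  # hypothetical readings of one quantity
mean, variance, std_dev, std_error = mean_and_errors(readings)
print(f"mean={mean:.3f} variance={variance:.4f} std_error={std_error:.4f}")
```

The error of the mean shrinks as 1/sqrt(n), so quadrupling the number of measurements halves it.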

The coefficient of variation

The coefficient of variation characterizes the relative deviation of the measured values from the arithmetic mean:

$$V = \frac{\sigma}{\bar{a}} \cdot 100\%$$

where

V - coefficient of variation,
σ - standard deviation,
ā - arithmetic mean.

The higher the coefficient of variation, the greater the relative scatter and the lower the uniformity of the studied values. If the coefficient of variation is less than 10%, the variability of the variation series is considered insignificant; from 10% to 20% it is considered average; above 20% but below 33% it is considered significant. A coefficient of variation above 33% indicates that the data are heterogeneous and that the largest and smallest values should be excluded.
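The thresholds above translate directly into a small classifier. This is a sketch under stated assumptions: the function names are illustrative, the standard deviation is taken with the n - 1 divisor (`statistics.stdev`), and a value of exactly 33% is assigned to the "significant" band because the text only calls values exceeding 33% heterogeneous.

```python
import statistics

def coefficient_of_variation(values):
    """Coefficient of variation in percent: standard deviation / mean * 100."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.stdev(values) / mean * 100.0

def classify_variability(v_percent):
    """Map a coefficient of variation (in %) to the bands quoted in the text."""
    if v_percent < 10:
        return "insignificant"
    if v_percent <= 20:
        return "average"
    if v_percent <= 33:   # boundary handling at exactly 33% is an assumption
        return "significant"
    return "heterogeneous"

data = [48.0, 52.0, 50.0, 47.0, 53.0]  # hypothetical measurements
v = coefficient_of_variation(data)
print(f"V = {v:.1f}% -> {classify_variability(v)}")
```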

Average linear deviation

One of the indicators of the range and intensity of variation is the average linear deviation (mean absolute deviation) from the arithmetic mean. It is calculated by the formula:

$$\bar{d} = \frac{1}{n}\sum_{i=1}^{n}\left|a_i - \bar{a}\right|$$

where

d̄ - average linear deviation,
ā - arithmetic mean,
n - number of parameter measurements,
a_i - measured value at the i-th step.
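A minimal sketch of the average linear deviation follows; the function name is an illustrative assumption.

```python
def average_linear_deviation(values):
    """Mean of the absolute deviations of the values from their arithmetic mean."""
    n = len(values)
    if n == 0:
        raise ValueError("at least one value is required")
    mean = sum(values) / n
    return sum(abs(a - mean) for a in values) / n

# Deviations from the mean 7 are 7, 1, 1 and 7, so the result is 16 / 4
print(average_linear_deviation([0, 6, 8, 14]))  # 4.0
```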

To check whether the studied values follow the normal distribution law, two ratios are used: the ratio of the asymmetry indicator to its error and the ratio of the kurtosis indicator to its error.

Asymmetry indicator

The asymmetry indicator (A) and its error (m_a) are calculated using the following formulas:

$$A = \frac{\sum_{i=1}^{n}\left(a_i - \bar{a}\right)^{3}}{n\,\sigma^{3}}, \qquad m_a = \sqrt{\frac{6}{n}}$$

where

A - asymmetry indicator,
σ - standard deviation,
ā - arithmetic mean,
n - number of parameter measurements,
a_i - measured value at the i-th step.

Kurtosis indicator

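The kurtosis indicator (E) and its error (m_e) are calculated using the following formulas:

$$E = \frac{\sum_{i=1}^{n}\left(a_i - \bar{a}\right)^{4}}{n\,\sigma^{4}} - 3, \qquad m_e = 2\sqrt{\frac{6}{n}}$$

where

E - kurtosis indicator,
σ - standard deviation,
ā - arithmetic mean,
n - number of parameter measurements,
a_i - measured value at the i-th step.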
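The normality check based on these two ratios can be sketched as follows. Several things here are assumptions rather than statements from the text: the moment-based definitions of A and E, the approximate errors sqrt(6/n) and 2·sqrt(6/n), the use of the divisor n for σ, and the cut-off of 3 for both ratios. The function names are illustrative.

```python
import math

def skewness_and_kurtosis(values):
    """Return (A, m_a, E, m_e) using moment-based formulas (assumed definitions)."""
    n = len(values)
    mean = sum(values) / n
    sigma = math.sqrt(sum((a - mean) ** 2 for a in values) / n)  # divisor n assumed
    if sigma == 0:
        raise ValueError("all values are identical")
    asym = sum((a - mean) ** 3 for a in values) / (n * sigma ** 3)
    kurt = sum((a - mean) ** 4 for a in values) / (n * sigma ** 4) - 3
    m_a = math.sqrt(6 / n)       # approximate error of the asymmetry indicator
    m_e = 2 * math.sqrt(6 / n)   # approximate error of the kurtosis indicator
    return asym, m_a, kurt, m_e

def looks_normal(values, threshold=3.0):
    """True if both |A| / m_a and |E| / m_e stay below the assumed threshold."""
    asym, m_a, kurt, m_e = skewness_and_kurtosis(values)
    return abs(asym) / m_a < threshold and abs(kurt) / m_e < threshold

print(looks_normal([1, 2, 3, 4, 5]))
```

With very small samples the errors m_a and m_e are large, so the check rarely rejects anything; it is more informative for longer series.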

18. Standard deviation, calculation method, value.

    An approximate way to assess the variability of a variation series is to determine its limits and amplitude, but these ignore the values of the variants inside the series. The main generally accepted measure of the variability of a quantitative characteristic within a variation series is the standard deviation (σ - sigma). The larger the standard deviation, the greater the fluctuation of the series.

    The method for calculating the standard deviation includes the following steps:

    1. Find the arithmetic mean (M).

    2. Determine the deviations of the individual variants from the arithmetic mean (d = V - M). In medical statistics, deviations from the mean are denoted by d (deviate). The sum of all deviations is zero.

    3. Square each deviation: d².

    4. Multiply the squared deviations by the corresponding frequencies: d²·p.

    5. Find the sum of the products: Σ(d²·p).

    6. Calculate the standard deviation using the formula:

    $$\sigma = \sqrt{\frac{\sum d^{2} p}{n}}$$ when n is greater than 30, or $$\sigma = \sqrt{\frac{\sum d^{2} p}{n-1}}$$ when n is less than or equal to 30, where n is the number of all variants.
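Steps 1 to 6 can be sketched for a grouped variation series as follows. The function name `grouped_std` and the example data are illustrative assumptions; V are the variants and p their frequencies, as in the text.

```python
import math

def grouped_std(variants, frequencies):
    """Return (M, sigma) for a variation series given as variants and frequencies."""
    n = sum(frequencies)                      # number of all observations
    if n < 2:
        raise ValueError("at least two observations are required")
    # Step 1: weighted arithmetic mean M
    m = sum(v * p for v, p in zip(variants, frequencies)) / n
    # Steps 2-5: deviations d = V - M, squared, weighted by p, and summed
    sum_d2p = sum((v - m) ** 2 * p for v, p in zip(variants, frequencies))
    # Step 6: divide by n for large series (n > 30), otherwise by n - 1
    divisor = n if n > 30 else n - 1
    return m, math.sqrt(sum_d2p / divisor)

m, sigma = grouped_std([1, 2, 3], [1, 2, 1])  # hypothetical series, n = 4
print(m, round(sigma, 4))
```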

    Standard deviation value:

    1. The standard deviation characterizes the spread of the variants around the mean (i.e., the variability of the variation series). The larger the sigma, the higher the degree of diversity of the series.

    2. The standard deviation is used for a comparative assessment of the degree of correspondence of the arithmetic mean to the variation series for which it was calculated.

    Variations of mass phenomena obey the law of normal distribution. The curve representing this distribution is a smooth, symmetrical, bell-shaped curve (the Gaussian curve). According to probability theory, in phenomena that obey the law of normal distribution there is a strict mathematical relationship between the arithmetic mean and the standard deviation. The theoretical distribution of the variants in a homogeneous variation series obeys the three-sigma rule.

    If, in a rectangular coordinate system, the values of a quantitative characteristic (variants) are plotted on the abscissa and the frequency with which each variant occurs in the variation series is plotted on the ordinate, then the variants with larger and smaller values lie symmetrically on either side of the arithmetic mean.

    It has been established that with a normal distribution of the trait:

    68.3% of the variant values lie within M ± 1σ

    95.5% of the variant values lie within M ± 2σ

    99.7% of the variant values lie within M ± 3σ

    3. The standard deviation makes it possible to establish normal values for clinical and biological parameters. In medicine, the interval M ± 1σ is usually taken as the normal range for the phenomenon being studied. A deviation of the estimated value from the arithmetic mean by more than 1σ indicates that the studied parameter deviates from the norm.

    4. In medicine, the three-sigma rule is used in pediatrics for individual assessment of children's physical development (the sigma deviation method) and for developing standards for children's clothing.

    5. The standard deviation is necessary to characterize the degree of diversity of the characteristic being studied and to calculate the error of the arithmetic mean.
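The shares quoted for the intervals M ± 1σ, M ± 2σ and M ± 3σ can be checked numerically. This is a minimal sketch that assumes an exactly normal distribution; the helper name `share_within` is illustrative.

```python
import math

def share_within(k):
    """Probability that a normally distributed value lies within M ± k·sigma."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"M ± {k}σ: {share_within(k) * 100:.2f}%")
# prints 68.27%, 95.45% and 99.73%
```

The share for M ± 2σ is about 95.45%, which is commonly rounded to the 95.5% quoted in the text.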

    The standard deviation is usually used to compare the variability of series of the same type. If two series with different characteristics are compared (height and weight, average duration of hospital treatment and hospital mortality, etc.), the sigma values cannot be compared directly, because the standard deviation is a dimensional quantity expressed in absolute numbers. In such cases the coefficient of variation (Cv) is used. It is a relative value: the ratio of the standard deviation to the arithmetic mean, expressed as a percentage.

    The coefficient of variation is calculated using the formula:

    $$C_v = \frac{\sigma}{M} \cdot 100\%$$

    The higher the coefficient of variation, the greater the variability of the series. It is believed that a coefficient of variation of more than 30% indicates the qualitative heterogeneity of the population.

    Standard deviation (synonyms: root-mean-square deviation, mean square deviation; related term: standard spread) - in probability theory and statistics, the most common indicator of the dispersion of the values of a random variable relative to its mathematical expectation. For limited samples of values, the arithmetic mean of the sample is used instead of the mathematical expectation.


      The standard deviation is measured in units of measurement of the random variable itself and is used when calculating the standard error of the arithmetic mean, when constructing confidence intervals, when statistically testing hypotheses, when measuring the linear relationship between random variables. Defined as the square root of the variance of a random variable.

      Root-mean-square deviation:

      $$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^{2}}.$$

      • Note: The names root-mean-square deviation (MSD) and standard deviation (STD) are very often matched to the wrong formulas. For example, in the NumPy module for Python the std() function is described as "standard deviation", while by default its formula is that of the root-mean-square deviation (division by n under the root). In Excel, the STDEV() function is different (division by n − 1 under the root).

      Standard deviation (an estimate of the root-mean-square deviation of a random variable x relative to its mathematical expectation, based on an unbiased estimate of its variance) s:

      $$s = \sqrt{\frac{n}{n-1}\,\sigma^{2}} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^{2}};$$

      where σ² is the dispersion; x_i is the i-th element of the sample; n is the sample size; x̄ is the arithmetic mean of the sample:

      $$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{n}\left(x_1 + \ldots + x_n\right).$$

      It should be noted that both estimates are biased. In the general case, it is impossible to construct an unbiased estimate. However, the estimate based on the unbiased variance estimate is consistent.

      In accordance with GOST R 8.736-2011, the standard deviation is calculated using the formula of this section with n − 1 in the denominator (s).
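The difference between dividing by n and by n − 1 is easy to see on a small sample. The sketch below uses Python's standard `statistics` module, where `pstdev` divides by n and `stdev` divides by n − 1; the sample values are hypothetical.

```python
import math
import statistics

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical sample
n = len(sample)

sigma = statistics.pstdev(sample)  # divides by n (root-mean-square deviation)
s = statistics.stdev(sample)       # divides by n - 1 (sample standard deviation)

# In NumPy the same two values correspond to np.std(sample, ddof=0)
# and np.std(sample, ddof=1); ddof=0 is the default.
print(sigma, s)

# The two estimates are linked by s = sqrt(n / (n - 1)) * sigma
assert math.isclose(s, math.sqrt(n / (n - 1)) * sigma)
```

The gap between the two values shrinks as n grows, which is why the choice matters mostly for small samples.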

      Three sigma rule

      Three sigma rule (3σ) - almost all values of a normally distributed random variable lie in the interval (x̄ − 3σ; x̄ + 3σ). More strictly, a normally distributed random variable lies in this interval with a probability of approximately 0.9973 (provided that the value x̄ is the true one and was not obtained by processing a sample).

      If the true value of x̄ is unknown, then s should be used instead of σ. Thus, the rule of three sigma becomes the rule of three s.

      Interpretation of the standard deviation value

      A larger standard deviation indicates a greater spread of the values in the set around its mean; a smaller one indicates that the values are grouped closely around the mean.

      For example, take three numerical sets: (0, 0, 14, 14), (0, 6, 8, 14) and (6, 6, 8, 8). All three sets have a mean of 7, and their standard deviations are 7, 5 and 1, respectively. The last set has a small standard deviation because its values are grouped around the mean; the first set has the largest standard deviation because its values diverge greatly from the mean.
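The three sets can be verified directly. The sketch below uses `statistics.pstdev`, which divides by n; that choice is an inference from the quoted values 7, 5 and 1, which are reproduced only with the divisor n.

```python
import statistics

sets = [(0, 0, 14, 14), (0, 6, 8, 14), (6, 6, 8, 8)]

# Mean and root-mean-square deviation (divisor n) for each set
results = [(statistics.mean(s), statistics.pstdev(s)) for s in sets]

for values, (mean, sd) in zip(sets, results):
    print(values, "mean =", mean, "sd =", sd)
```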

      In a general sense, standard deviation can be considered a measure of uncertainty. For example, in physics, standard deviation is used to determine the error of a series of successive measurements of some quantity. This value is very important for judging the plausibility of the phenomenon under study against the value predicted by theory: if the average of the measurements differs greatly from the value predicted by theory (a large standard deviation), the obtained values or the method of obtaining them should be rechecked. In finance, the standard deviation of portfolio returns is identified with portfolio risk.

      Climate

      Suppose there are two cities with the same average maximum daily temperature, one located on the coast and the other on an inland plain. It is known that maximum daily temperatures vary less in coastal cities than in inland ones. Therefore, the standard deviation of the maximum daily temperatures will be smaller for the coastal city than for the inland city, even though the average is the same. In practice this means that, on any given day of the year, the maximum air temperature is more likely to differ strongly from the average in the inland city.

      Sport

      Let's assume there are several football teams that are assessed on a certain set of parameters, for example the number of goals scored and conceded, scoring chances, etc. The best team in the group will most likely have better values for a larger number of parameters. The smaller a team's standard deviation for each of these parameters, the more predictable its results; such teams are balanced. Conversely, the results of a team with a large standard deviation are hard to predict, which is explained by an imbalance, for example a strong defense but a weak attack.

      Using the standard deviation of team parameters makes it possible, to some degree, to predict the result of a match between two teams by assessing the strengths and weaknesses of each team, and therefore the tactics they are likely to choose.

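      The square root of the variance is called the standard deviation from the mean, which is calculated as follows:

      $$\sigma = \sqrt{\sigma^{2}} = \sqrt{\frac{\sum \left(x_i - \bar{x}\right)^{2} f_i}{\sum f_i}}$$

      where x_i are the values of the characteristic and f_i are their frequencies.

      An elementary algebraic transformation of the standard deviation formula leads to the following form:

      $$\sigma = \sqrt{\overline{x^{2}} - \bar{x}^{2}}$$

      where the first term under the root is the mean of the squared values and the second is the square of the mean. This formula often turns out to be more convenient in practical calculation.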
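The equivalence of the two forms can be checked on a small example. This is a sketch for ungrouped data with the divisor n; the function names and the ages are illustrative assumptions.

```python
import math

def std_direct(values):
    """Square root of the mean squared deviation from the mean (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

def std_shortcut(values):
    """Square root of (mean of squares minus square of the mean)."""
    n = len(values)
    mean = sum(values) / n
    mean_of_squares = sum(x * x for x in values) / n
    # Guard against a tiny negative difference caused by rounding
    return math.sqrt(max(mean_of_squares - mean ** 2, 0.0))

ages = [19, 20, 21, 22, 23]  # hypothetical ages
print(std_direct(ages), std_shortcut(ages))
```

The shortcut form is convenient by hand, but in floating-point arithmetic it can lose precision when the mean is large compared with the spread, so the direct form is usually safer in code.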

      The standard deviation, like the average linear deviation, shows how much specific values of a characteristic deviate on average from their mean. The standard deviation is never smaller than the average linear deviation. For distributions close to normal, the following relationship holds between them:

      $$\sigma \approx 1.25\,\bar{d}$$

      Knowing this ratio, you can use a known indicator to determine the unknown one, for example calculate σ from d̄ and vice versa. The standard deviation measures the absolute size of the variability of a characteristic and is expressed in the same units of measurement as the values of the characteristic (rubles, tons, years, etc.). It is an absolute measure of variation.

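      For alternative attributes (for example, the presence or absence of higher education or of insurance), the dispersion and standard deviation formulas are as follows:

      $$\sigma^{2} = p\,q, \qquad \sigma = \sqrt{p\,q}$$

      where p is the share of units in the population that have the attribute and q = 1 − p is the share of units that do not.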
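For an alternative (yes/no) attribute the calculation reduces to a single share. The sketch below assumes the usual formulas σ² = p·q and σ = sqrt(p·q); the function name and the 20% share are illustrative.

```python
import math

def alternative_attribute_dispersion(p):
    """Return (variance, standard deviation) for an attribute present with share p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a share between 0 and 1")
    q = 1.0 - p                  # share of units without the attribute
    variance = p * q
    return variance, math.sqrt(variance)

# Hypothetical example: 20% of employees have higher education
variance, sd = alternative_attribute_dispersion(0.2)
print(round(variance, 2), round(sd, 2))  # 0.16 0.4
```

The variance is largest, 0.25, when the two shares are equal (p = q = 0.5).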

      Let us show the calculation of the standard deviation according to the data of a discrete series characterizing the distribution of students in one of the university faculties by age (Table 6.2).

      Table 6.2.

      The results of auxiliary calculations are given in columns 2-5 of Table 6.2.

      The average age of a student, years, is determined by the weighted arithmetic mean formula (column 2):

      The squared deviations of the student's individual age from the average are contained in columns 3-4, and the products of the squared deviations and the corresponding frequencies are contained in column 5.

      We find the variance of students’ age, years, using formula (6.2):

      Then σ = √3.43 ≈ 1.85 years, i.e. each specific value of a student's age deviates from the average by 1.85 years on average.

      The coefficient of variation

      In its absolute value, the standard deviation depends not only on the degree of variation of the characteristic, but also on the absolute levels of the variants and of the mean. Therefore, the standard deviations of variation series with different average levels cannot be compared directly. To make such a comparison possible, find the share of the average deviation (linear or quadratic) in the arithmetic mean, expressed as a percentage, i.e. calculate relative measures of variation.

      The linear coefficient of variation is calculated by the formula

      $$V_{\bar{d}} = \frac{\bar{d}}{\bar{x}} \cdot 100\%$$

      The coefficient of variation is determined by the following formula:

      $$V_{\sigma} = \frac{\sigma}{\bar{x}} \cdot 100\%$$

      In coefficients of variation, not only the incomparability associated with different units of measurement of the characteristic being studied is eliminated, but also the incomparability that arises due to differences in the value of arithmetic means. In addition, the indicators of variation characterize the homogeneity of the population. The population is considered homogeneous if the coefficient of variation does not exceed 33%.

      According to Table 6.2 and the calculation results obtained above, we determine the coefficient of variation, %, using formula (6.3):

      If the coefficient of variation exceeds 33%, this indicates that the population being studied is heterogeneous. The value obtained in our case indicates that the population of students is homogeneous by age. Thus, an important function of the summary indicators of variation is to assess the reliability of averages. The smaller d̄, σ² and V, the more homogeneous the set of phenomena and the more reliable the resulting average. According to the "three sigma rule" of mathematical statistics, in series that are normally distributed or close to normal, deviations from the arithmetic mean not exceeding ±3σ occur in 997 cases out of 1000. Thus, knowing x̄ and σ, you can get a general initial idea of the variation series. If, for example, the average wage of an employee in a company is 25,000 rubles and σ is 100 rubles, then it can be stated with a probability close to certainty that the wages of the company's employees lie within the range 25,000 ± 3 × 100, i.e. from 24,700 to 25,300 rubles.

      One of the main tools of statistical analysis is the calculation of the standard deviation. This indicator measures the spread of values in a sample or in a population. Let's learn how to use the standard deviation formula in Excel.

      Let's first define what the standard deviation is and what its formula looks like. It is the square root of the arithmetic mean of the squared differences between each value in the series and their arithmetic mean. The indicator is also called the root-mean-square deviation; the two names are completely equivalent.

      Naturally, in Excel the user does not have to calculate this by hand, since the program does everything. Let's learn how to calculate the standard deviation in Excel.

      Calculation in Excel

      You can calculate this value in Excel using two special functions: STDEV.S (based on a sample) and STDEV.P (based on the general population). They work on exactly the same principle, and they can be called in three ways, which we will discuss below.

      Method 1: Function Wizard


      Method 2: Formulas Tab


      Method 3: Manually entering the formula

      There is also a way in which you won't need to call the arguments window at all. To do this, you must enter the formula manually.


      As you can see, the mechanism for calculating standard deviation in Excel is very simple. The user only needs to enter numbers from the population or references to the cells that contain them. All calculations are performed by the program itself. It is much more difficult to understand what the calculated indicator is and how the calculation results can be applied in practice. But understanding this already relates more to the field of statistics than to learning to work with software.
