Mean and Standard Deviation Problems with Solutions
Mean and standard deviation problems are presented, along with their solutions at the bottom of the page. Problems related to data sets as well as grouped data are discussed.
Problems

Consider the following three data sets A, B and C.
A = {9,10,11,7,13}
B = {10,10,10,10,10}
C = {1,1,10,19,19}
a) Calculate the mean of each data set.
b) Calculate the standard deviation of each data set.
c) Which set has the largest standard deviation?
d) Is it possible to answer question c) without calculating the standard deviations?

A given data set has a mean μ and a standard deviation σ.
a) What are the new values of the mean and the standard deviation if the same constant k is added to each data value in the given set? Explain.
b) What are the new values of the mean and the standard deviation if each data value of the set is multiplied by the same constant k? Explain.

If the standard deviation of a given data set is equal to zero, what can we say about the data values included in the given data set?

The frequency table of the monthly salaries of 20 people is shown below.
salary (in $)  frequency 
3500  5 
4000  8 
4200  5 
4300  2 
a) Calculate the mean of the salaries of the 20 people.
b) Calculate the standard deviation of the salaries of the 20 people.

The following table shows the grouped data, in classes, for the heights of 50 people.
height classes (in cm)  frequency 
( 120 , 130 ]  2 
( 130 , 140 ]  5 
( 140 , 150 ]  25 
( 150 , 160 ]  10 
( 160 , 170 ]  8 
a) Calculate the mean of the heights of the 50 people.
b) Calculate the standard deviation of the heights of the 50 people.


Solutions

a) Mean of data set A = (9+10+11+7+13)/5 = 10
Mean of data set B = (10+10+10+10+10)/5 = 10
Mean of data set C = (1+1+10+19+19)/5 = 10

b) Standard deviation of data set A
= √[ ( (9-10)^{2}+(10-10)^{2}+(11-10)^{2}+(7-10)^{2}+(13-10)^{2} )/5 ] = 2
Standard deviation of data set B
= √[ ( (10-10)^{2}+(10-10)^{2}+(10-10)^{2}+(10-10)^{2}+(10-10)^{2} )/5 ] = 0
Standard deviation of data set C
= √[ ( (1-10)^{2}+(1-10)^{2}+(10-10)^{2}+(19-10)^{2}+(19-10)^{2} )/5 ] ≈ 8.05

c) Data set C has the largest standard deviation.

d) Yes. The three sets have the same mean, and the data values of set C are farther away from that mean than those of sets A and B.
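
The calculations for the three data sets can be checked with a short script. This is a minimal sketch that assumes Python's standard `statistics` module; `pstdev` is used because it divides by the number of values, as the formulas above do. The names `data_sets`, `results` and `largest` are chosen here for illustration.

```python
from statistics import mean, pstdev

# Data sets from the problem statement
data_sets = {
    "A": [9, 10, 11, 7, 13],
    "B": [10, 10, 10, 10, 10],
    "C": [1, 1, 10, 19, 19],
}

# pstdev is the population standard deviation (divides by n, not n - 1)
results = {name: (mean(values), pstdev(values)) for name, values in data_sets.items()}

for name, (m, s) in results.items():
    print(f"Set {name}: mean = {m}, standard deviation = {s:.2f}")

# Name of the set with the largest standard deviation
largest = max(results, key=lambda name: results[name][1])
print("Largest standard deviation: set", largest)
```

Using `stdev` instead of `pstdev` would divide by n - 1 and give different values from the ones worked out here.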


a) We limit the discussion to a data set with 3 values for simplicity, but the conclusions are true for any data set with quantitative data.
Let x, y and z be the data values making up a data set.
The mean μ = (x + y + z) / 3
The standard deviation σ = √[ ((x - μ)^{2} + (y - μ)^{2} + (z - μ)^{2})/3 ]
We now add a constant k to each data value and calculate the new mean μ'.
μ' = ((x + k) + (y + k) + (z + k)) / 3 = (x + y + z) / 3 + 3k/3 = μ + k
We now calculate the new standard deviation σ'.
σ' = √[ ((x + k - μ')^{2} + (y + k - μ')^{2} + (z + k - μ')^{2})/3 ]
Note that x + k - μ' = x + k - μ - k = x - μ
also y + k - μ' = y + k - μ - k = y - μ and z + k - μ' = z + k - μ - k = z - μ
Therefore σ' = √[ ((x - μ)^{2} + (y - μ)^{2} + (z - μ)^{2})/3 ] = σ
If we add the same constant k to all data values included in a data set, we obtain a new data set whose mean is the mean of the original data set PLUS k. The standard deviation does not change.
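
A quick numerical check of this result, as a sketch only: the data values and the constant `k = 5` are arbitrary choices for illustration, and `pstdev` from Python's `statistics` module is assumed as the divide-by-n standard deviation.

```python
from statistics import mean, pstdev

data = [9, 10, 11, 7, 13]  # example values (set A from the first problem)
k = 5                      # arbitrary constant, chosen for illustration

# Add the same constant to every data value
shifted = [x + k for x in data]

print("original:", mean(data), pstdev(data))
print("shifted: ", mean(shifted), pstdev(shifted))
```

The printed means should differ by exactly k, while the two standard deviations should agree.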

b) We now multiply all data values by a constant k and calculate the new mean μ' and the new standard deviation σ'.
μ' = (kx + ky + kz) / 3 = k μ
σ' = √[ ((kx - kμ)^{2} + (ky - kμ)^{2} + (kz - kμ)^{2})/3 ] = √[ k^{2} ((x - μ)^{2} + (y - μ)^{2} + (z - μ)^{2})/3 ] = |k| σ
If we multiply all data values included in a data set by a constant k, we obtain a new data set whose mean is the mean of the original data set TIMES k and whose standard deviation is the standard deviation of the original data set TIMES the absolute value of k.
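
The role of the absolute value is easiest to see with a negative constant. The sketch below uses example values and `k = -3`, both arbitrary choices for illustration, and again assumes `pstdev` from Python's `statistics` module.

```python
from statistics import mean, pstdev

data = [1, 1, 10, 19, 19]  # example values (set C from the first problem)
k = -3                     # negative constant, chosen to show why |k| is needed

# Multiply every data value by the same constant
scaled = [k * x for x in data]

print("mean:", mean(data), "->", mean(scaled))
print("standard deviation:", pstdev(data), "->", pstdev(scaled))
```

The mean changes sign along with k, but the standard deviation stays non-negative and is scaled by |k| = 3.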


Again, we limit the discussion to a data set with 4 values for simplicity, but the conclusions are true for any data set with quantitative data.
Let x, y, z and w be the data values making up a data set with mean μ.
The standard deviation σ = √[ ((x - μ)^{2} + (y - μ)^{2} + (z - μ)^{2} + (w - μ)^{2})/4 ]
Let σ = 0, hence
√[ ((x - μ)^{2} + (y - μ)^{2} + (z - μ)^{2} + (w - μ)^{2})/4 ] = 0
Which gives
(x - μ)^{2} + (y - μ)^{2} + (z - μ)^{2} + (w - μ)^{2} = 0
All terms in the equation are non-negative, so their sum can be zero only if each term is zero. The above equation is therefore equivalent to
(x - μ)^{2} = 0, (y - μ)^{2} = 0, (z - μ)^{2} = 0 and (w - μ)^{2} = 0.
Which gives
x = y = z = w = μ : all data values in a set with σ = 0 are equal.
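
As a small illustration of this conclusion (not a proof), the sketch below compares a set of equal values with the same set after one value is changed. Both lists are hypothetical examples, and `pstdev` from Python's `statistics` module is assumed.

```python
from statistics import pstdev

constant = [7, 7, 7, 7]   # hypothetical set in which all values are equal
perturbed = [7, 7, 7, 8]  # the same set with one value changed

print(pstdev(constant))   # zero, since every value equals the mean
print(pstdev(perturbed))  # strictly positive, since one value differs from the mean
```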


a) Let x_{i} be the i-th salary and f_{i} be the corresponding frequency.
mean of grouped data = μ = (Σ x_{i}*f_{i}) / Σ f_{i}
= (3500*5 + 4000*8 + 4200*5 + 4300*2) / (5 + 8 + 5 + 2)
= $3955
b) standard deviation of grouped data = √[ (Σ (x_{i} - μ)^{2}*f_{i}) / Σ f_{i} ]
= √[ (5*(3500-3955)^{2}+8*(4000-3955)^{2}+5*(4200-3955)^{2}+2*(4300-3955)^{2}) / 20 ]
≈ $282 (rounded to the nearest dollar)
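
The frequency-weighted formulas can be written as a small helper. This is a sketch; the function name `weighted_mean_std` is made up for this example and is not part of any library.

```python
from math import sqrt

salaries = [3500, 4000, 4200, 4300]  # salary values from the table
frequencies = [5, 8, 5, 2]           # number of people earning each salary


def weighted_mean_std(values, freqs):
    """Return the mean and population standard deviation of a frequency table."""
    n = sum(freqs)
    mu = sum(x * f for x, f in zip(values, freqs)) / n
    variance = sum(f * (x - mu) ** 2 for x, f in zip(values, freqs)) / n
    return mu, sqrt(variance)


mu, sigma = weighted_mean_std(salaries, frequencies)
print(f"mean = {mu:.0f}, standard deviation = {sigma:.0f}")
```

Weighting each value by its frequency gives the same result as listing all 20 salaries one by one and applying the ungrouped formulas.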


We first find the midpoints of the given classes.
height classes (in cm)  midpoint  frequency 
( 120 , 130 ]  (120+130) ÷ 2 = 125  2 
( 130 , 140 ]  (130+140) ÷ 2 = 135  5 
( 140 , 150 ]  (140+150) ÷ 2 = 145  25 
( 150 , 160 ]  (150+160) ÷ 2 = 155  10 
( 160 , 170 ]  (160+170) ÷ 2 = 165  8 
a) Let m_{i} be the midpoint of the i-th class and f_{i} be the corresponding frequency.
mean of grouped data = μ = (Σ m_{i}*f_{i}) / Σ f_{i}
= (125*2 + 135*5 + 145*25 + 155*10 + 165*8) / (2+5+25+10+8)
= 148.4 cm
b) standard deviation of grouped data = √[ (Σ (m_{i} - μ)^{2}*f_{i}) / Σ f_{i} ]
= √[ (2*(125-148.4)^{2}+5*(135-148.4)^{2}+25*(145-148.4)^{2}+10*(155-148.4)^{2}+8*(165-148.4)^{2}) / 50 ]
≈ 9.9 cm (rounded to one decimal place)
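
The midpoint method can be sketched in a few lines. The variable names are chosen for illustration. Because each class is represented by its midpoint, the results are approximations of the true mean and standard deviation of the 50 heights.

```python
from math import sqrt

# Class boundaries (in cm) and frequencies from the table
classes = [(120, 130), (130, 140), (140, 150), (150, 160), (160, 170)]
frequencies = [2, 5, 25, 10, 8]

# Represent each class by its midpoint
midpoints = [(low + high) / 2 for low, high in classes]

n = sum(frequencies)
mu = sum(m * f for m, f in zip(midpoints, frequencies)) / n
sigma = sqrt(sum(f * (m - mu) ** 2 for m, f in zip(midpoints, frequencies)) / n)

print(f"mean = {mu:.1f} cm, standard deviation = {sigma:.1f} cm")
```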