11.05 Comparisons using standard deviation

Lesson

Standard deviation ($\sigma$) is a measure of spread that helps us compare the variability of two or more data sets. It is a type of average of the distances of the data points from the mean. A small standard deviation indicates that most scores are close to the mean, while a large standard deviation indicates that the scores are more spread out from the mean.

[Figure: two distributions. Larger standard deviation: more spread out. Smaller standard deviation: closer to the mean.]

Note that the mean will determine where the scores are clustered, while the standard deviation tells us how tightly they are clustered. The two sets of data below have a similar mean but different standard deviations.

Standard deviation

Standard deviation measures how far, on average, the data values lie from the mean.
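
The lesson text does not display the formula itself, so the following is offered as a reference only. It is the standard definition of the population standard deviation, on the assumption that this is the version intended here. The symbols are labels chosen for illustration: $x_i$ are the data values, $\bar{x}$ is their mean and $n$ is the number of values.

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}}$$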

The standard deviation is more complex to calculate than some other measures of spread, but it takes every data point into account. As a result, it is significantly affected by outliers.

• A larger value indicates a more widely spread (more variable) data set.
• A smaller value indicates a more tightly packed (less variable) data set.
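
To see these points in action, here is a minimal Python sketch. The data sets `tight`, `spread` and `with_outlier` are invented for illustration and do not come from the lesson. The sketch assumes the population standard deviation, which `statistics.pstdev` computes.

```python
from statistics import mean, pstdev

# Hypothetical scores, invented for illustration: both sets have a mean of 50.
tight = [48, 49, 50, 51, 52]
spread = [30, 40, 50, 60, 70]

for name, data in [("tight", tight), ("spread", spread)]:
    # pstdev is the population standard deviation (it divides by n).
    print(f"{name}: mean = {mean(data)}, sd = {pstdev(data):.2f}")

# Adding one extreme value (an outlier) to the tight set inflates its spread.
with_outlier = tight + [100]
print(f"with outlier: mean = {mean(with_outlier):.2f}, sd = {pstdev(with_outlier):.2f}")
```

Both sets are centred on the same value, so only the standard deviation separates them. The single outlier increases the standard deviation of the tightly packed set considerably.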

Exploration

The scores of two groups of students (group A and group B) are recorded in the table below:

Class       Frequency group A   Frequency group B
$30-<40$    $14$                $3$
$40-<50$    $17$                $53$
$50-<60$    $23$                $12$
$60-<70$    $16$                $2$

Since we are given grouped data, we can only estimate the standard deviation. We first need to determine the class centres, which will be used to represent each class. For instance, the class centre for the first interval is $\frac{30+40}{2}=35$, as shown in the second column below.

Class       Class centre   Frequency group A   Frequency group B
$30-<40$    $35$           $14$                $3$
$40-<50$    $45$           $17$                $53$
$50-<60$    $55$           $23$                $12$
$60-<70$    $65$           $16$                $2$
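
With grouped data, each class centre stands in for every score in its class and is weighted by the class frequency. As a sketch, writing $c_i$ for the class centres and $f_i$ for the frequencies (labels chosen here for illustration), and assuming the population form of the standard deviation, the estimates are:

$$\bar{x} \approx \frac{\sum f_i c_i}{\sum f_i}, \qquad \sigma \approx \sqrt{\frac{\sum f_i (c_i - \bar{x})^2}{\sum f_i}}$$

For group A, the estimate of the mean works out as:

$$\bar{x}_A \approx \frac{35(14) + 45(17) + 55(23) + 65(16)}{14 + 17 + 23 + 16} = \frac{3560}{70} \approx 50.86$$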

We can calculate the mean and standard deviation of both groups using our calculator:

           Mean      Standard deviation
Group A    $50.86$   $10.49$
Group B    $46.86$   $5.42$
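
If a statistics calculator is not to hand, the same estimates can be reproduced with a short Python sketch. The function name `grouped_mean_sd` is hypothetical, and the sketch assumes the population standard deviation (dividing by the total frequency), since the lesson does not say which mode the calculator uses. The results should come out close to the values in the table, although the last decimal place can differ with rounding.

```python
from math import sqrt

# Class centres and frequencies taken from the table in the lesson.
centres = [35, 45, 55, 65]
freq_a = [14, 17, 23, 16]
freq_b = [3, 53, 12, 2]

def grouped_mean_sd(centres, freqs):
    """Estimate the mean and population standard deviation from grouped data."""
    n = sum(freqs)
    # Each class centre represents every score in its class.
    mean = sum(f * c for c, f in zip(centres, freqs)) / n
    variance = sum(f * (c - mean) ** 2 for c, f in zip(centres, freqs)) / n
    return mean, sqrt(variance)

for label, freqs in [("A", freq_a), ("B", freq_b)]:
    m, sd = grouped_mean_sd(centres, freqs)
    print(f"Group {label}: mean = {m:.2f}, sd = {sd:.2f}")
```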

The mean for group A is higher, but so is its standard deviation, so group A's scores are centred on a higher value but are more spread out. If we wanted the more consistent group, we would choose group B, which has a lower mean score but a much lower standard deviation.

The image above shows how the mean and standard deviation affect the shape of the graphs.

Practice questions

Question 1

The mean income of people in Canada is $\$43000$. This is the same as the mean income of people in Germany. The standard deviation of Canada is greater than the standard deviation of Germany. In which country is there likely to be the greatest difference between the incomes of the rich and poor?

A. Canada

B. Germany

Question 2

Two machines, $A$ and $B$, are producing chocolate bars with the following mean and standard deviation for the weight of the bars.

Machine   Mean (g)   Standard deviation (g)
$A$       $52$       $1.6$
$B$       $55$       $0.65$

1. What does a comparison of the means of the two machines tell us?

A. Machine $A$ produces chocolate bars with a more consistent weight.

B. Machine $A$ generally produces heavier chocolate bars.

C. Machine $B$ generally produces heavier chocolate bars.

D. Machine $B$ produces chocolate bars with a more consistent weight.

2. What does a comparison of the standard deviations of the two machines tell us?

A. Machine $B$ produces chocolate bars with a more consistent weight.

B. Machine $A$ generally produces heavier chocolate bars.

C. Machine $A$ produces chocolate bars with a more consistent weight.

D. Machine $B$ generally produces heavier chocolate bars.

Question 3

Luke, a cricketer, has made scores of $51$, $25$, $99$, $35$ and $90$ in his first five innings this season. In his sixth innings, he scores no runs.

1. What is the change in his season batting average before and after the sixth innings?

2. What is the change in his standard deviation before and after his sixth innings? Give your answer correct to two decimal places.

3. What is the change in his median score before and after his sixth innings?

4. What is the change in his range of scores before and after his sixth innings?
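
After attempting the question by hand, one possible way to check the answers is the Python sketch below. The helper `summary` is hypothetical. The sketch assumes that the batting average is the plain mean of the scores and that the population standard deviation (`statistics.pstdev`) is intended. If your course uses the sample standard deviation, `statistics.stdev` would be the function to swap in.

```python
from statistics import mean, median, pstdev

first_five = [51, 25, 99, 35, 90]
all_six = first_five + [0]  # the sixth innings, in which no runs are scored

def summary(scores):
    """Return the mean, population standard deviation, median and range."""
    return {
        "mean": mean(scores),
        "sd": pstdev(scores),
        "median": median(scores),
        "range": max(scores) - min(scores),
    }

before, after = summary(first_five), summary(all_six)
for key in before:
    change = after[key] - before[key]
    print(f"{key}: before = {before[key]:.2f}, after = {after[key]:.2f}, change = {change:.2f}")
```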

Outcomes

VCMSP372 (10a)

Calculate and interpret the mean and standard deviation of data and use these to compare data sets. Investigate the effect of individual data values, including outliers, on the standard deviation.