7.04 Standard deviation

Lesson

Standard deviation is a measure of spread, which helps give us a meaningful estimate of the variability in a data set. A small standard deviation means most scores are close to the mean. Conversely, a large standard deviation means the scores are very spread out.

Standard deviation is related to another measurement, variance, which is the average of the squared differences from the mean.

The standard deviation is found by calculating the square root of the variance. We use the symbol $\sigma$ (pronounced "sigma") for standard deviation, and $\sigma^2$ (pronounced "sigma-squared") for variance.
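Written as formulas, the two definitions above look like this. The notation is an assumption added for illustration: $n$ is the number of scores, $x_1,\dots,x_n$ are the scores and $\bar{x}$ is their mean.

```latex
\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2,
\qquad
\sigma = \sqrt{\sigma^2}
```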

Calculating variance and standard deviation by hand can be very tedious for large data sets. However, in practice, we will be able to use the statistics mode of our CAS calculator to determine these values.

Measures of spread

Measures of spread tell us how much variation we see in the data.

  • variance - average of the squared differences from the mean
  • standard deviation - square-root of variance
  • range - the difference between the highest value and lowest value
  • interquartile range - the spread of the middle $50\%$ of data values

 

Worked examples

Example 1 - using CAS to calculate the standard deviation and variance

Calculate the variance and standard deviation for this data set: $75,75,75,80,80,80,80,92,107,107,107,107$

Classpad

Using Statistics mode, enter the data values into "list1".

Use the Calc -> One-variable menu to calculate the standard deviation (and other statistics), with "Freq" set to "1" because "list1" contains individual data values.

 

Standard deviation is given as $\sigma_x\approx13.59$.

Variance can be calculated from the standard deviation as $\sigma_x^2\approx13.59^2\approx184.69$.
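As a cross-check on the calculator result, the same values can be reproduced away from the CAS. The following is a minimal Python sketch, not part of the original lesson, that applies the definition of variance directly and then compares it with the standard library's `statistics.pvariance`.

```python
import math
import statistics

data = [75, 75, 75, 80, 80, 80, 80, 92, 107, 107, 107, 107]

# Mean of the data set
mean = sum(data) / len(data)

# Population variance: the average of the squared differences from the mean
variance = sum((x - mean) ** 2 for x in data) / len(data)

# Standard deviation: the square root of the variance
std_dev = math.sqrt(variance)

print(round(mean, 2))      # 88.75
print(round(variance, 2))  # 184.69
print(round(std_dev, 2))   # 13.59

# The standard library's population variance should agree with the manual result
assert math.isclose(variance, statistics.pvariance(data))
```

Working from the unrounded standard deviation gives a variance of exactly $184.6875$, which agrees with the rounded value above.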

 

Example 2 - using CAS to calculate the standard deviation from a frequency table

Calculate the standard deviation for the data represented in the frequency table.

Value    Frequency
$75$     $3$
$80$     $4$
$92$     $1$
$107$    $4$

Classpad

Using Statistics mode, enter values into "list1" and frequencies into "list2".

Use the Calc -> One-variable menu to calculate the standard deviation (and other statistics), using the "Freq" setting to select frequencies from "list2".


This data set is equivalent to the one in the previous example so, once again, the standard deviation is given as $\sigma_x\approx13.59$.
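A frequency table changes only the bookkeeping: each value is counted as many times as its frequency. The Python sketch below illustrates that idea. It is an assumption about what the "Freq" setting represents mathematically, not a description of how the calculator works internally.

```python
import math

values = [75, 80, 92, 107]
frequencies = [3, 4, 1, 4]

# Total number of scores is the sum of the frequencies
n = sum(frequencies)

# Each value contributes to the mean once per occurrence
mean = sum(f * x for x, f in zip(values, frequencies)) / n

# Each squared difference is likewise weighted by its frequency
variance = sum(f * (x - mean) ** 2 for x, f in zip(values, frequencies)) / n
std_dev = math.sqrt(variance)

print(n)                  # 12
print(round(std_dev, 2))  # 13.59
```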

 

Example 3 - using CAS to estimate the standard deviation for grouped data

Estimate the standard deviation for the data represented in the grouped frequency table.

Class      Frequency
$30-<40$   $12$
$40-<50$   $16$
$50-<60$   $25$
$60-<70$   $4$

Since we are given grouped data, we can only get an estimate of the standard deviation. We first need to determine the class centres, which will be used to represent each class. For instance, the class centre for the first interval is $\frac{30+40}{2}=35$.

Class      Class Centre   Frequency
$30-<40$   $35$           $12$
$40-<50$   $45$           $16$
$50-<60$   $55$           $25$
$60-<70$   $65$           $4$

Classpad

Using Statistics mode, enter class centres into "list1" and frequencies into "list2".

Use the Calc -> One-variable menu to calculate the standard deviation (and other statistics), using the "Freq" setting to select frequencies from "list2".

 

For this data set, the standard deviation is given as $\sigma_x\approx8.91$.
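The grouped-data estimate can be sketched the same way in Python: compute each class centre, then treat every score in a class as if it sat at that centre. This is an illustrative sketch under that assumption, and the variable names are hypothetical.

```python
import math

# Class boundaries (lower, upper) and their frequencies from the table
classes = [(30, 40), (40, 50), (50, 60), (60, 70)]
frequencies = [12, 16, 25, 4]

# Class centre: midpoint of the lower and upper boundaries
centres = [(lower + upper) / 2 for lower, upper in classes]

n = sum(frequencies)
mean = sum(f * c for c, f in zip(centres, frequencies)) / n
variance = sum(f * (c - mean) ** 2 for c, f in zip(centres, frequencies)) / n
std_dev = math.sqrt(variance)

print(centres)            # [35.0, 45.0, 55.0, 65.0]
print(round(mean, 2))     # 48.68
print(round(std_dev, 2))  # 8.91
```

Because the true scores inside each class are unknown, the result is only an estimate of the standard deviation of the original data.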

 

Sample vs population standard deviation

When our data set is a sample that is taken to represent a larger population, a slightly different formula is used to calculate the standard deviation so that it is a better estimate of the spread of the entire population. This is the sample standard deviation, with the symbol $s_x$. You might have noticed this value in the CAS results in previous examples, with a value very close to $\sigma_x$.
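To see how close the two values are, the sketch below reuses the data set from Example 1 and computes both with Python's standard library. The lesson does not state the sample formula; the usual convention, assumed here, is that it divides the sum of squared differences by $n-1$ instead of $n$.

```python
import statistics

data = [75, 75, 75, 80, 80, 80, 80, 92, 107, 107, 107, 107]

# Population standard deviation (divides by n), matching sigma_x
sigma_x = statistics.pstdev(data)

# Sample standard deviation (divides by n - 1), matching s_x
s_x = statistics.stdev(data)

print(round(sigma_x, 2))  # 13.59
print(round(s_x, 2))      # 14.19
```

The sample value is always slightly larger than the population value, and the gap shrinks as the number of scores grows.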

Remember!

For the Mathematics Applications course you should always use the $\sigma_x$ value for standard deviation.

 

Practice questions

Question 1

The mean income of people in Country A is $\$19069$. This is the same as the mean income of people in Country B. The standard deviation of Country A is greater than the standard deviation of Country B. In which country is there likely to be the greatest difference between the incomes of the rich and poor?

  A. Country A

  B. Country B

Question 2

Find the population standard deviation of the following set of scores, to two decimal places, by using the statistics mode on the calculator:

$8,20,9,9,8,19,9,18,5,10$

Question 3

Using your calculator, find the population standard deviation for the data given in the dot plot below. Round your answer to two decimal places.

Question 4

Fill in the table and answer the questions below.

  1. Complete the table given below.

    Class      Class Centre   Frequency   $fx$
    $1-9$      ___            $8$         ___
    $10-18$    ___            $6$         ___
    $19-27$    ___            $4$         ___
    $28-36$    ___            $6$         ___
    $37-45$    ___            $8$         ___
    Totals                    ___         ___
  2. Use the class centres to estimate the mean of the data set, correct to two decimal places.

  3. Use the class centres to estimate the population standard deviation, correct to two decimal places.

  4. If we used the original ungrouped data to calculate standard deviation, do you expect that the ungrouped data would have a higher or lower standard deviation?

    A. Higher standard deviation

    B. Lower standard deviation

Outcomes

2.1.5

determine the mean and standard deviation of a data set using technology and use these statistics as measures of location and spread of a data distribution, being aware of their limitations

2.1.6

use the number of deviations from the mean (standard scores) to describe deviations from the mean in normally distributed data sets
