7.045 Standard deviation

Standard deviation is a measure of spread, which helps give a meaningful estimate of the variability in a data set. While the quartiles gave us a measure of spread about the median, the standard deviation gives us a measure of spread with respect to the mean. It can be thought of as a kind of average distance of the data points from the mean (formally, the square root of the mean squared distance). A small standard deviation indicates that most scores are close to the mean, while a large standard deviation indicates that the scores are more spread out from the mean.

The standard deviation can be calculated for a population or a sample.

The symbols used are:

$\text{Population Standard Deviation}=\sigma$ (lowercase sigma)
$\text{Sample Standard Deviation}=s$

In statistics mode on a calculator, the following symbols might be used:

$\text{Population Standard Deviation}=\sigma_n$
$\text{Sample Standard Deviation}=\sigma_{n-1}$
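
For reference, the two quantities are commonly defined as shown here. The notation is an assumption, since the lesson does not print the formula: $x_i$ stands for each score, $\mu$ and $\bar{x}$ for the population and sample means, and $N$ and $n$ for the number of scores.

$\sigma=\sqrt{\dfrac{\sum\left(x_i-\mu\right)^2}{N}}$

$s=\sqrt{\dfrac{\sum\left(x_i-\bar{x}\right)^2}{n-1}}$

The only difference is the divisor, which is why calculators label the two results $\sigma_n$ and $\sigma_{n-1}$.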

Note: In this exercise the standard deviation refers to the population standard deviation unless clearly stated otherwise.

Standard deviation can be calculated using a formula. However, as this process is time-consuming, we will use a calculator to find the standard deviation. Ensure the calculator settings are correct for the data given. This is particularly important when switching between data in a simple list and data in a frequency table.
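
For readers who would like to check a calculator result, the following is a minimal Python sketch of the population standard deviation (the square root of the mean squared distance from the mean), for a simple list and for a frequency table. The function names and the data are hypothetical and invented for illustration; only `statistics.pstdev` and `statistics.stdev` come from Python's standard library.

```python
import math
import statistics


def population_sd(values):
    """Population standard deviation of a simple list of scores."""
    n = len(values)
    mean = sum(values) / n
    # Average the squared distances from the mean, then take the square root.
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


def population_sd_from_table(table):
    """Population standard deviation from {score: frequency} pairs."""
    n = sum(table.values())  # total number of scores
    mean = sum(x * f for x, f in table.items()) / n
    # Each squared distance is counted f times.
    return math.sqrt(sum(f * (x - mean) ** 2 for x, f in table.items()) / n)


# Hypothetical data, invented for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
table = {2: 1, 4: 3, 5: 2, 7: 1, 9: 1}  # the same scores as a frequency table

print(population_sd(scores))               # 2.0
print(population_sd_from_table(table))     # 2.0
print(statistics.pstdev(scores))           # 2.0 (library population value)
print(round(statistics.stdev(scores), 2))  # 2.14 (sample value, divisor n - 1)
```

Both forms of the data give the same answer, which mirrors the calculator advice above: the list and the frequency table describe the same scores, but they must be entered with the matching setting.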

Standard deviation is also a very powerful way of comparing different data sets, particularly when they have different means or different numbers of scores.

Remember!

The three main measures of spread are:

  • Range: the size of the interval over which the data is spread.

$\text{Range}=\text{Highest score}-\text{Lowest score}$

The range is simple to calculate but only takes into account two values. The range is also significantly impacted by outliers.

  • Interquartile range: the range of the middle $50\%$ of the data.

$IQR=Q_3-Q_1$

The interquartile range is relatively simple to calculate but only takes into account two values. It is not significantly affected by outliers.

  • Standard deviation: a measure of the typical distance of each data point from the mean.

The standard deviation is a more complex calculation but takes every data point into account. The standard deviation is significantly impacted by outliers.

For each measure of spread:

  • A larger value indicates a more widely spread (more variable) data set.
  • A smaller value indicates a more tightly packed (less variable) data set.
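
The effect of an outlier on each measure can be seen in a short Python sketch. The data sets and the helper name `spread` are hypothetical, and the quartiles are taken as the medians of the lower and upper halves of the sorted data, which is an assumption: calculators and software may use slightly different quartile conventions.

```python
import statistics


def spread(data):
    """Return the range, interquartile range and population standard deviation."""
    values = sorted(data)
    half = len(values) // 2
    # Assumed convention: Q1 and Q3 are the medians of the lower and upper
    # halves (the middle score is left out when the number of scores is odd).
    q1 = statistics.median(values[:half])
    q3 = statistics.median(values[-half:])
    return {
        "range": values[-1] - values[0],
        "iqr": q3 - q1,
        "sd": statistics.pstdev(values),
    }


# Hypothetical data: the second set replaces the score 8 with an outlier, 30.
typical = [4, 5, 5, 6, 6, 7, 7, 8]
with_outlier = [4, 5, 5, 6, 6, 7, 7, 30]

print(spread(typical))
print(spread(with_outlier))
```

In this sketch the range grows from 4 to 26 and the standard deviation from about 1.22 to about 8.09, while the interquartile range stays at 2, which matches the remarks above about outliers.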

 


 

Practice questions

Question 1

The mean income of people in Country A is $\$19069$. This is the same as the mean income of people in Country B. The standard deviation of Country A is greater than the standard deviation of Country B. In which country is there likely to be the greatest difference between the incomes of the rich and poor?

    A. Country A
    B. Country B

Question 2

Find the population standard deviation of the following set of scores, to two decimal places, by using the statistics mode on the calculator:

$8,20,9,9,8,19,9,18,5,10$

Question 3

The table shows the number of goals scored by a football team in each game of the year.

| Score ($x$) | Frequency ($f$) |
|---|---|
| $0$ | $3$ |
| $1$ | $1$ |
| $2$ | $5$ |
| $3$ | $1$ |
| $4$ | $5$ |
| $5$ | $5$ |

  1. In how many games were $0$ goals scored?

  2. Determine the median number of goals scored. Leave your answer to one decimal place if necessary.

  3. Calculate the mean number of goals scored each game. Leave your answer to two decimal places if necessary.

  4. Use your calculator to find the population standard deviation. Leave your answer to two decimal places if necessary.

Question 4

Two machines $A$ and $B$ are producing chocolate bars with the following mean and standard deviation for the weight of the bars.

| Machine | Mean (g) | Standard deviation (g) |
|---|---|---|
| $A$ | $52$ | $1.5$ |
| $B$ | $56$ | $0.65$ |

  1. What does a comparison of the means of the two machines tell us?

    A. Machine $A$ produces chocolate bars with a more consistent weight.
    B. Machine $B$ produces chocolate bars with a more consistent weight.
    C. Machine $A$ generally produces heavier chocolate bars.
    D. Machine $B$ generally produces heavier chocolate bars.

  2. What does a comparison of the standard deviations of the two machines tell us?

    A. Machine $B$ generally produces heavier chocolate bars.
    B. Machine $A$ generally produces heavier chocolate bars.
    C. Machine $B$ produces chocolate bars with a more consistent weight.
    D. Machine $A$ produces chocolate bars with a more consistent weight.

Outcomes

ACMGM030

determine the mean and standard deviation of a dataset and use these statistics as measures of location and spread of a data distribution, being aware of their limitations
