
8.02 The empirical rule

Lesson

The empirical rule, also known as the $68$-$95$-$99.7\%$ rule, is an estimate of the spread of data that is normally distributed. As a general rule, almost all scores lie within $3$ standard deviations of the mean. More specifically:

  • $68\%$ of scores lie within $1$ standard deviation of the mean.

  • $95\%$ of scores lie within $2$ standard deviations of the mean.

  • $99.7\%$ of scores lie within $3$ standard deviations of the mean.
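
The three percentages above are rounded values. As a quick check, the sketch below computes them from a standard normal distribution using `NormalDist` from Python's standard `statistics` module. The helper name `percent_within` is made up for this illustration and is not part of the lesson.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def percent_within(k):
    """Percentage of scores within k standard deviations of the mean."""
    return 100 * (standard_normal.cdf(k) - standard_normal.cdf(-k))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {percent_within(k):.2f}%")
```

The printed values come out close to $68\%$, $95\%$ and $99.7\%$, which is why the empirical rule works well as an estimate.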

A normal distribution is beautifully symmetrical, so we can actually divide these regions further. For example, as $95\%$ of scores lie within $2$ standard deviations of the mean, $47.5\%$ (half of $95\%$) of scores will lie between the mean and $2$ standard deviations above the mean, as shown below.

This same principle applies for any of the empirical rule values and we can use this information to work out the spread of scores. For example, we can say $81.5\%$ ($47.5\%+34\%$) of scores lie between $2$ standard deviations below and $1$ standard deviation above the mean.
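
The halving idea can be sketched in a few lines of Python. The table `HALF_PERCENT` and the function `percent_between` are hypothetical names chosen for this example. The sketch assumes both endpoints are whole numbers of standard deviations between $-3$ and $3$.

```python
# Percentage of scores between the mean and k standard deviations on ONE side,
# found by halving the empirical rule values (68, 95 and 99.7).
HALF_PERCENT = {0: 0.0, 1: 34.0, 2: 47.5, 3: 49.85}

def percent_between(z_low, z_high):
    """Approximate percentage of scores between z_low and z_high,
    both given as whole numbers of standard deviations from the mean."""
    def signed_area(z):
        # Area from the mean out to z, counted as negative below the mean.
        return HALF_PERCENT[abs(z)] if z >= 0 else -HALF_PERCENT[abs(z)]
    return signed_area(z_high) - signed_area(z_low)

print(percent_between(-2, 1))  # 47.5 + 34 = 81.5
print(percent_between(-1, 1))  # 34 + 34 = 68.0
```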


Watch out!

As the normal distribution is bell shaped, scores are not spread evenly across each standard deviation: more scores fall in the intervals closest to the mean. So the percentage found for one region can't be transferred to another region of the same width.

For example, $68\%$ of scores lie between $1$ standard deviation below and $1$ standard deviation above the mean. However, only $47.5\%$ of scores lie between the mean and $2$ standard deviations below the mean.
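
To see the uneven spread in numbers, the halved empirical rule values can be subtracted to give the approximate percentage of scores in each band that is one standard deviation wide on one side of the mean. This is a worked sketch using only the rounded rule values:

$$\begin{aligned}
\text{mean to } 1 \text{ standard deviation:} &\quad \tfrac{1}{2}\times 68\% = 34\% \\
1 \text{ to } 2 \text{ standard deviations:} &\quad 47.5\% - 34\% = 13.5\% \\
2 \text{ to } 3 \text{ standard deviations:} &\quad 49.85\% - 47.5\% = 2.35\%
\end{aligned}$$

Each band has the same width, but the percentage of scores in it drops quickly as we move away from the mean.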

Exploration

Standard deviation is a measure of spread that we can apply to everyday contexts. For example, let's say the mean score in a test was $67$ and the standard deviation was $7$ marks. This means that:

  • a person who was $1$ standard deviation above the mean would have received a mark of $74$ (as this is $67+7$).
  • a person who was $2$ standard deviations below the mean would have received a mark of $53$ (as this is $67-2\times7$).
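
The two calculations above follow the same pattern: mark = mean + (number of standard deviations) × (standard deviation). A minimal Python sketch is shown here, with `score_at` and `deviations_from_mean` as illustrative names that do not come from the lesson:

```python
def score_at(mean, sd, k):
    """Mark that sits k standard deviations from the mean
    (k is negative for marks below the mean)."""
    return mean + k * sd

def deviations_from_mean(mean, sd, score):
    """How many standard deviations a mark is above (+) or below (-) the mean."""
    return (score - mean) / sd

print(score_at(67, 7, 1))                # 67 + 7 = 74
print(score_at(67, 7, -2))               # 67 - 2 * 7 = 53
print(deviations_from_mean(67, 7, 74))   # 1.0
print(deviations_from_mean(67, 7, 53))   # -2.0
```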

If we're told that the scores were approximately normally distributed, we could go one step further and determine the percentage of students who scored between $53$ and $74$.

The percentage of students who scored within $2$ standard deviations of the mean is approximately $95\%$. The normal distribution is symmetric, so half of this $95\%$ scored between the mean and $2$ standard deviations below it. In other words, $47.5\%$ of students scored between $53$ and $67$.

Using the same reasoning, we know that half of the $68\%$ of students who scored within $1$ standard deviation of the mean scored between the mean and $1$ standard deviation above it. This means that $34\%$ of students scored between $67$ and $74$.

So putting the two percentages together, we can say that $\left(47.5+34\right)\%=81.5\%$ of students scored between $53$ and $74$.
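
As a check on this estimate, the sketch below compares it with the value given by an exact normal model with mean $67$ and standard deviation $7$. It assumes the test scores follow that model perfectly, which real data only does approximately, and it uses `NormalDist` from Python's standard `statistics` module.

```python
from statistics import NormalDist

# Assumed model for the test scores: mean 67, standard deviation 7.
test_scores = NormalDist(mu=67, sigma=7)

# Percentage of scores between 53 and 74 according to the model.
model_percent = 100 * (test_scores.cdf(74) - test_scores.cdf(53))

# Estimate from the empirical rule: half of 95% plus half of 68%.
estimate_percent = 47.5 + 34

print(f"empirical rule estimate: {estimate_percent}%")
print(f"normal model value: {model_percent:.1f}%")
```

The model value is about $81.9\%$, so the empirical rule estimate of $81.5\%$ is within half a percentage point.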

The empirical rule
  • $68\%$ of scores lie within $1$ standard deviation of the mean.
  • $95\%$ of scores lie within $2$ standard deviations of the mean.
  • $99.7\%$ of scores lie within $3$ standard deviations of the mean.

Remember, since the normal distribution is symmetric, we can halve the interval at the mean to halve the percentage of scores.

Practice questions

QUESTION 1

Consider the normal distribution shown below. Each unit on the horizontal axis indicates $1$ standard deviation.

  1. Approximately what percentage of scores lie in the shaded region? Use the empirical rule to find your answer.

QUESTION 2

The grades in a test are approximately normally distributed. The mean mark is $47$ with a standard deviation of $3$.

  1. Between which two scores do approximately $68\%$ of the results lie symmetrically about the mean? Write both scores on the same line, separated by a comma.

  2. Between which two scores do approximately $95\%$ of the results lie symmetrically about the mean? Write both scores on the same line, separated by a comma.

  3. Between which two scores do approximately $99.7\%$ of the results lie symmetrically about the mean? Write both scores on the same line, separated by a comma.

QUESTION 3

The number of biscuits in a box is approximately normally distributed with a mean of $30$ and a standard deviation of $3$.

  1. Complete the following sentence.

    Approximately $81.5\%$ of the scores lie between $2$ standard deviations below and $\editable{}$ standard deviation(s) above the mean.

  2. Complete the following sentence.

    From part 1, this means that approximately $81.5\%$ of the boxes have between $\editable{}$ and $\editable{}$ biscuits in them.

Outcomes

MS2-12-2

analyses representations of data in order to make inferences, predictions and draw conclusions

MS2-12-7

solves problems requiring statistical processes, including the use of the normal distribution and the correlation of bivariate data
