Lesson

The empirical rule, also known as the $68$-$95$-$99.7\%$ rule, is a way of estimating how normally distributed data spreads out. These numbers are the approximate percentages of scores that lie within one, two, and three standard deviations of the mean.

- Approximately $68\%$ of scores lie within $1$ standard deviation of the mean.

- Approximately $95\%$ of scores lie within $2$ standard deviations of the mean.

- Approximately $99.7\%$ of scores lie within $3$ standard deviations of the mean.

A normal distribution is symmetrical, so we can use these basic values to find approximations of other regions. For example, as $95\%$ of scores lie within $2$ standard deviations of the mean, $47.5\%$ (half of $95\%$) will lie between the mean and $2$ standard deviations above the mean.

We can use a similar trick to conclude that $34\%$ (half of $68\%$) lies between $1$ standard deviation below the mean and the mean itself.

If we add these approximations together, we conclude that $81.5\%$ (which is $34\%+47.5\%$) of scores lie between $1$ standard deviation below and $2$ standard deviations above the mean.
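The addition above can be checked with a short Python sketch. The names `WITHIN`, `signed_region`, and `percent_between` are illustrative assumptions rather than part of the lesson, and the function only handles whole numbers of standard deviations from $-3$ to $3$.

```python
# Empirical-rule percentages of scores within k standard deviations of the mean.
WITHIN = {0: 0.0, 1: 68.0, 2: 95.0, 3: 99.7}


def signed_region(k):
    """Approximate % of scores between the mean and k standard deviations.

    The result is negative when k is below the mean, so that two regions
    can be combined by subtraction.
    """
    half = WITHIN[abs(k)] / 2  # symmetry: half the band lies on each side
    return half if k >= 0 else -half


def percent_between(lower, upper):
    """Approximate % of scores between `lower` and `upper`.

    Both arguments are whole numbers of standard deviations from the mean
    (negative means below the mean) and must lie between -3 and 3.
    """
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    return signed_region(upper) - signed_region(lower)


# 1 standard deviation below to 2 standard deviations above: 34 + 47.5
print(percent_between(-1, 2))  # 81.5
```

Treating regions below the mean as negative is only one way to organise the calculation. It is the same halving-and-adding reasoning used in the text.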


Watch out!

The empirical rule is only an approximation, and better approximations exist. For example, a better approximation for the area between $1$ standard deviation below and above the mean is

$68.268949213\ldots\%$

An exact value is impossible to write down in full, as is the case for other important numbers in mathematics, so we need to round somewhere. For now, the empirical rule is a good place to start thinking about the distribution.
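To see how close the empirical rule is, the following Python sketch compares it with values computed by technology. It relies on a standard result that is not covered in this lesson: the proportion of a normal distribution within $k$ standard deviations of the mean equals $\operatorname{erf}\left(k/\sqrt{2}\right)$, where $\operatorname{erf}$ is the error function provided by Python's `math` module. The function name `within` is an illustrative assumption.

```python
import math


def within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


# Compare each empirical-rule value with the more precise computed value.
for k, rule in [(1, 68.0), (2, 95.0), (3, 99.7)]:
    print(f"{k} sd: empirical rule {rule}%, closer value {100 * within(k):.4f}%")
```

The computed values are roughly $68.27\%$, $95.45\%$, and $99.73\%$, so the empirical rule rounds each of them to a figure that is easier to remember.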

Standard deviation is a measure of spread that we can apply to everyday contexts. For example, let's say the mean score in a test was $67$ and the standard deviation was $7$ marks. This means that:

- a person who was $1$ standard deviation above the mean would have received a mark of $74$ (as this is $67+7$).
- a person who was $2$ standard deviations below the mean would have received a mark of $53$ (as this is $67-2\times7$).

If we're told that the scores were approximately normally distributed, we could go one step further and determine the percentage of students who scored between $53$ and $74$.

Approximately $95\%$ of students scored within $2$ standard deviations of the mean. The normal distribution is symmetric, so half of that $95\%$ scored between the mean and $2$ standard deviations below it. In other words, $47.5\%$ of students scored between $53$ and $67$.

Using the same reasoning, half of the $68\%$ of students within $1$ standard deviation of the mean scored between the mean and $1$ standard deviation above it. This means that $34\%$ of students scored between $67$ and $74$.

So putting the two percentages together, we can say that $\left(47.5+34\right)\%=81.5\%$ of students scored between $53$ and $74$.
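The whole test-score example can be retraced in a few lines of Python. This is a minimal sketch using the numbers from the text; the variable names are illustrative assumptions.

```python
mean = 67  # mean test score
sd = 7     # standard deviation in marks

# Convert "number of standard deviations from the mean" into actual marks.
low = mean - 2 * sd   # 2 standard deviations below the mean
high = mean + 1 * sd  # 1 standard deviation above the mean

# Halve the empirical-rule percentages, using the symmetry of the distribution.
below_mean = 95 / 2   # % of students between `low` and the mean
above_mean = 68 / 2   # % of students between the mean and `high`

total = below_mean + above_mean
print(f"About {total}% of students scored between {low} and {high}.")
# About 81.5% of students scored between 53 and 74.
```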

The empirical rule

- $68\%$ of scores lie within $1$ standard deviation of the mean.
- $95\%$ of scores lie within $2$ standard deviations of the mean.
- $99.7\%$ of scores lie within $3$ standard deviations of the mean.

Remember, since the normal distribution is symmetric, splitting any of these intervals at the mean halves the percentage of scores.

The grades in a test are approximately normally distributed. The mean mark is $60$ with a standard deviation of $2$.

Between which two scores do approximately $68\%$ of the results lie symmetrically about the mean? Write both scores on the same line, separated by a comma.

Between which two scores do approximately $95\%$ of the results lie symmetrically about the mean? Write both scores on the same line, separated by a comma.

Between which two scores do approximately $99.7\%$ of the results lie symmetrically about the mean? Write both scores on the same line, separated by a comma.

The following figure shows the approximate percentage of scores lying within various standard deviations from the mean of a normal distribution. The heights of $600$ boys are found to approximately follow such a distribution, with a mean height of $145$ cm and a standard deviation of $20$ cm. Find the number of boys with heights between:

$125$ cm and $165$ cm

$105$ cm and $185$ cm

$85$ cm and $205$ cm (to the nearest whole number)

$145$ cm and $165$ cm

$165$ cm and $185$ cm (to the nearest whole number)

In a normal distribution, what percentage of scores lie between $2$ standard deviations below and $3$ standard deviations above the mean? Use the empirical rule to find your answer.

S7-4 Investigate situations that involve elements of chance:

- A. comparing theoretical continuous distributions, such as the normal distribution, with experimental distributions
- B. calculating probabilities, using such tools as two-way tables, tree diagrams, simulations, and technology.

Apply probability methods in solving problems