NZ Level 7 (NZC) Level 2 (NCEA)

Using z-scores to identify probabilities

Lesson

The statistic known as a *z-score* is often used in relation to the *normal* probability distribution, although it can be used with other distributions. The process of obtaining a z-score is called *standardisation*.

In this process, the raw scores are shifted by subtracting a fixed number, the mean, from all of them so that the new mean becomes zero. Then, each of these numbers is divided by the standard deviation so that the new standard deviation is one. Thus, each of the raw scores has a corresponding z-score and, taken as a whole, they have a mean of $0$ and a standard deviation of $1$.

The advantage of the standardisation procedure is that scores of the same kind from different experiments can be compared.

To obtain a z-score, it is necessary to know the mean $\mu$ and the standard deviation $\sigma$ of a complete population. If $x$ is a raw score, the corresponding z-score is $z=\frac{x-\mu}{\sigma}$.
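As a small illustration of why standardised scores can be compared across experiments, the sketch below applies the formula to two results from different tests. The marks, means and standard deviations are made-up values for illustration only, and `z_score` is a hypothetical helper name.

```python
def z_score(x, mu, sigma):
    """Return the z-score of raw score x, given population mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical results: the same student sits two different tests.
maths_z = z_score(72, mu=65, sigma=10)    # 72 on a test with mean 65, sd 10
physics_z = z_score(58, mu=50, sigma=5)   # 58 on a test with mean 50, sd 5

print(round(maths_z, 2), round(physics_z, 2))
```

Although 72 is the higher raw mark, the second result has the larger z-score, so it sits further above the mean of its own test.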

Each number in the data set is a single measurement of some feature of an experimental subject. (They might be test scores from a group of high school students, for example.) To standardise a set of scores we first calculate the mean and standard deviation of the raw data.

The following scores were obtained from a mathematics class.

$35,44,51,52,59,64,64,67,69,69,70,70,71,73,73,75,78,79,83,91$

The average is $66.85$ and the (population) standard deviation is $13.02$.

We subtract the mean from each score:

$-31.85$ | $-22.85$ | $-15.85$ | $-14.85$ | $-7.85$ |

$-2.85$ | $-2.85$ | $0.15$ | $2.15$ | $2.15$ |

$3.15$ | $3.15$ | $4.15$ | $6.15$ | $6.15$ |

$8.15$ | $11.15$ | $12.15$ | $16.15$ | $24.15$ |

We divide each of these numbers by the standard deviation:

$-2.45$ | $-1.75$ | $-1.22$ | $-1.14$ | $-0.60$ |

$-0.22$ | $-0.22$ | $0.01$ | $0.17$ | $0.17$ |

$0.24$ | $0.24$ | $0.32$ | $0.47$ | $0.47$ |

$0.63$ | $0.86$ | $0.93$ | $1.24$ | $1.85$ |

These are the standardised scores. A negative z-score is below the average and a positive z-score is above.
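The two steps above can be reproduced with Python's standard `statistics` module. This is a minimal sketch; it assumes the population standard deviation (dividing by the number of scores) is the intended one, which is what reproduces the $13.02$ quoted in the lesson.

```python
from statistics import fmean, pstdev

scores = [35, 44, 51, 52, 59, 64, 64, 67, 69, 69,
          70, 70, 71, 73, 73, 75, 78, 79, 83, 91]

mu = fmean(scores)        # mean of the raw scores
sigma = pstdev(scores)    # population standard deviation (divides by n)

# Step 1: subtract the mean; step 2: divide by the standard deviation.
z_scores = [(x - mu) / sigma for x in scores]
z_rounded = [round(z, 2) for z in z_scores]

print(round(mu, 2), round(sigma, 2))
print(z_rounded)

# The standardised scores have mean 0 and standard deviation 1, up to rounding error.
print(abs(fmean(z_scores)) < 1e-9, abs(pstdev(z_scores) - 1) < 1e-9)
```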

If a set of scores comes from a normal probability distribution, then approximately 68% of the observations will be within 1 standard deviation of the mean, 95% will be within 2 standard deviations of the mean, and 99.7% will be within 3 standard deviations of the mean. The observations tend to be densest near the mean, and the density falls off with distance from the mean. The typical density curve is bell-shaped, as in the following diagram.

If the mathematics class scores in Example $1$ really are from a normal distribution, we would expect about $68\%$ of the z-scores to lie between $-1$ and $1$. In fact, there are $14$ in this region, which is $70\%$ of the $20$ scores and close to the expected $68\%$. Only $1$ of the z-scores is further than $2$ standard deviations from the mean, which matches the roughly $5\%$ of $20$ scores (that is, $1$ score) that a normal distribution would place there.
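The counts for the class data can be checked directly. The sketch below recomputes the z-scores (again assuming the population standard deviation) and counts how many fall within 1 standard deviation of the mean and how many fall more than 2 standard deviations away.

```python
from statistics import fmean, pstdev

scores = [35, 44, 51, 52, 59, 64, 64, 67, 69, 69,
          70, 70, 71, 73, 73, 75, 78, 79, 83, 91]

mu, sigma = fmean(scores), pstdev(scores)
z_scores = [(x - mu) / sigma for x in scores]

n = len(scores)
within_one = sum(1 for z in z_scores if abs(z) < 1)   # z-scores between -1 and 1
beyond_two = sum(1 for z in z_scores if abs(z) > 2)   # z-scores below -2 or above 2

print(f"within 1 sd: {within_one} of {n} ({within_one / n:.0%})")
print(f"beyond 2 sd: {beyond_two} of {n} ({beyond_two / n:.0%})")
```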

If a student is chosen at random from the same class as in Example $1$, what is the probability that the student had a z-score of greater than $1$ in the test?

In a normal distribution, $100\%-68\%=32\%$ of scores are more than $1$ standard deviation from the mean. By symmetry, half of these, or $16\%$, should be above the mean. So, the probability is $16\%$ or $0.16$ that the chosen student has a z-score of more than $1$.
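For comparison, the rule-of-thumb figure can be set beside the tail area computed from the standard normal distribution itself. This sketch uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()             # mean 0, standard deviation 1

rule_estimate = (1 - 0.68) / 2             # half of the 32% lying outside one standard deviation
exact_tail = 1 - standard_normal.cdf(1)    # P(Z > 1) from the normal cumulative distribution

print(round(rule_estimate, 2), round(exact_tail, 4))
```

The two values agree to two decimal places, so the 68% rule is adequate for an estimate at this precision.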

This probability estimate should be treated with some caution, however, because the data may not be strictly normally distributed. Notice, for example, that there are more positive z-scores than negative. In a normal distribution, the observations would be arranged more symmetrically about the mean. Since there are more positive z-scores than negative in the data, it may be that the probability we seek is a little higher than the calculated $0.16$.

Tables are available to determine probabilities associated with non-whole-number z-scores from the standard normal distribution. Using these, one can answer questions like "What is the probability that an observed value will be at least $z$?" or "What is the probability that an observation will fall between $0$ and $z$?"
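Where a printed table is not to hand, the same areas can be computed. The sketch below mimics a table that lists the area between $0$ and $z$; the helper name `area_zero_to_z` and the value $z=1.5$ are illustrative choices, not taken from the lesson.

```python
from statistics import NormalDist

def area_zero_to_z(z):
    """Area under the standard normal curve between 0 and z, for z >= 0."""
    return NormalDist().cdf(z) - 0.5

z = 1.5  # illustrative value only

table_value = round(area_zero_to_z(z), 4)          # what the table would list for z
p_at_least = round(0.5 - area_zero_to_z(z), 4)     # P(Z >= z): upper half minus the tabled area
p_less_than = round(0.5 + area_zero_to_z(z), 4)    # P(Z < z): lower half plus the tabled area

print(table_value, p_at_least, p_less_than)
```

Because the curve is symmetric with total area $1$, each half has area $0.5$, which is why the tabled area is added to or subtracted from $0.5$.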

The table below shows the area under the standard normal curve between $0$ and a given $z$-score. Use this table to find the probability that a variable has a $z$-score less than $0.85$.

Give your answer to four decimal places.

A sprinter is training for a national competition. She runs 400 m in an average time of $75$ seconds, with a standard deviation of $6$ seconds.

Use the table below showing the area under the standard normal curve between $0$ and a given $z$-score to answer the following questions.

Determine the $z$-score of a time of $67$ seconds. Round your answer to two decimal places.

Find $P(X<67)$. Round your answer to four decimal places.

The value $0.0918$ represents the probability that:

A The sprinter will run 400 m in less than $67$ seconds.

B The sprinter will run 400 m in exactly $67$ seconds.

C The sprinter will run 400 m in more than $67$ seconds.
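The calculation behind the quoted value $0.0918$ can be sketched as follows, assuming the sprinter's times are normally distributed with the stated mean and standard deviation. The normal cumulative distribution from `statistics.NormalDist` stands in for the printed table here.

```python
from statistics import NormalDist

mu, sigma = 75, 6    # mean and standard deviation of the 400 m times, in seconds
x = 67               # time of interest, in seconds

z = round((x - mu) / sigma, 2)    # rounded to two decimals, as a table lookup requires
p_less = NormalDist().cdf(z)      # P(X < 67) = P(Z < z)

print(z, round(p_less, 4))
```

The z-score is rounded before the lookup because the table is indexed to two decimal places; using the unrounded z-score can change the fourth decimal place of the probability.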

The mean height of an adult male is $1.78$ m, with a standard deviation of $9$ cm.

Determine the $z$-score of a height of $1.69$ m.

If $700$ males are chosen at random, approximately how many males will be taller than $1.69$ m?

Round your answer to the nearest whole number of people.

S7-4 Investigate situations that involve elements of chance: A. comparing theoretical continuous distributions, such as the normal distribution, with experimental distributions; B. calculating probabilities, using such tools as two-way tables, tree diagrams, simulations, and technology.

Apply probability methods in solving problems