8.05 Understanding z-scores

Lesson

Now that we've had practice calculating $z$-scores and know how to interpret them, we can use $z$-scores to identify events that are more or less likely to occur. This works because a $z$-score measures how many standard deviations a score lies from the mean, and in a normally distributed data set, the further a value is from the mean, the less likely it is to occur.

Exploration

The arrival times for a particular train are approximately normally distributed with an expected arrival time of $9:00$ am and a standard deviation of $10$ minutes. On the bell curve, each step of $10$ minutes away from $9:00$ am corresponds to $1$ standard deviation.

So for the arrival time of $8:50$ am, the corresponding $z$-score is $-1$, since this arrival time is $1$ standard deviation below the mean. An arrival time of $9:10$ am is $1$ standard deviation above the mean, so its $z$-score is $1$. Following the same procedure, we can obtain the following table:

| Times | $8:30$ am | $8:40$ am | $8:50$ am | $9:00$ am | $9:10$ am | $9:20$ am | $9:30$ am |
| $z$-scores | $-3$ | $-2$ | $-1$ | $0$ | $1$ | $2$ | $3$ |
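
As a cross-check on the table, here is a minimal Python sketch (not part of the original lesson) that applies the usual formula $z = \dfrac{x - \mu}{\sigma}$ to each arrival time. The helper name `z_score` and the choice to measure times in minutes after midnight are illustrative assumptions.

```python
# Assumption: times are converted to minutes after midnight for the arithmetic.
MEAN_MINUTES = 9 * 60   # expected arrival time of 9:00 am
SD_MINUTES = 10         # standard deviation of 10 minutes


def z_score(hour, minute, mean=MEAN_MINUTES, sd=SD_MINUTES):
    """Return how many standard deviations a clock time lies from the mean."""
    return (hour * 60 + minute - mean) / sd


# The seven arrival times from the table, as (hour, minute) pairs.
times = [(8, 30), (8, 40), (8, 50), (9, 0), (9, 10), (9, 20), (9, 30)]
z_scores = [z_score(h, m) for h, m in times]

for (h, m), z in zip(times, z_scores):
    print(f"{h}:{m:02d} am has z-score {z:+.0f}")
```

Each arrival time is $10$ minutes, or one standard deviation, later than the previous one, so the $z$-scores step up by $1$ across the table.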

We might be interested in answering some of the following questions:

  1. If I’m waiting for the train, is it more likely that it arrives after $9:00$ am or after $9:10$ am?
  2. Is it more likely that the train will arrive between $8:40$ am and $8:50$ am or between $9:00$ am and $9:10$ am?

For each question, we will find the relevant $z$-scores and interpret them.

1. A time of $9:00$ am has a $z$-score of $0$, while a time of $9:10$ am has a $z$-score of $1$. So for a train to arrive after $9:00$ am, its arrival time has to fall in the region of the bell curve to the right of $z = 0$, which is the entire right half.

For a train to arrive after $9:10$ am, its arrival time has to fall in the region of the bell curve to the right of $z = 1$, which is only part of the right half.

So it’s more likely that the train will arrive after $9:00$ am, because that region covers more of the bell curve. We can go one step further and say that the train will arrive after $9:00$ am $50\%$ of the time, since the normal distribution is symmetrical about its mean.

Using the empirical rule, we know that $68\%$ of arrivals are within $1$ standard deviation of the mean. This means $32\%$ of arrivals are more than $1$ standard deviation from the mean.

The curve is symmetrical, so $16\%$ of arrivals are before $8:50$ am and $16\%$ of arrivals are after $9:10$ am.

In summary, the train arrives after $9:00$ am $50\%$ of the time and after $9:10$ am $16\%$ of the time.
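
These two percentages can be compared with exact values from the normal distribution. The sketch below is an illustration only: it assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`, and the helper names `empirical_upper_tail` and `exact_upper_tail` are hypothetical names introduced here.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Share of data within z standard deviations of the mean,
# according to the empirical (68-95-99.7) rule.
EMPIRICAL_WITHIN = {1: 0.68, 2: 0.95, 3: 0.997}


def empirical_upper_tail(z):
    """Approximate P(Z > z) for z = 1, 2 or 3 using the empirical rule."""
    return (1 - EMPIRICAL_WITHIN[z]) / 2


def exact_upper_tail(z):
    """P(Z > z) from the standard normal cumulative distribution."""
    return 1 - standard_normal.cdf(z)


p_after_900 = exact_upper_tail(0)         # z = 0 corresponds to 9:00 am
p_after_910 = exact_upper_tail(1)         # z = 1 corresponds to 9:10 am
rule_after_910 = empirical_upper_tail(1)  # empirical-rule estimate

print(f"After 9:00 am: {p_after_900:.1%}")
print(f"After 9:10 am: {p_after_910:.1%} exact, "
      f"{rule_after_910:.0%} by the empirical rule")
```

The exact value for $z = 1$ is about $15.9\%$, which the empirical rule rounds to $16\%$.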

2. Using the $z$-scores, a train will arrive between $8:40$ am and $8:50$ am if its arrival time is between $1$ and $2$ standard deviations below the mean.

Using the empirical rule, we know the percentage of trains that arrive in the following two regions: $47.5\%$ (half of $95\%$) arrive between the mean and $2$ standard deviations below it, and $34\%$ (half of $68\%$) arrive between the mean and $1$ standard deviation below it.

We can then subtract one percentage from the other to find that $47.5\% - 34\% = 13.5\%$ of trains arrive between $8:40$ am and $8:50$ am. By symmetry, we can see that $34\%$ of trains arrive between $9:00$ am and $9:10$ am, so this time range is more than twice as likely.
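
The interval calculation can be sketched the same way. This is a hedged illustration rather than part of the lesson: it assumes Python 3.8 or later for `statistics.NormalDist`, and `p_between` is a hypothetical helper name.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def p_between(z_low, z_high):
    """Exact P(z_low < Z < z_high) for a standard normal variable."""
    return standard_normal.cdf(z_high) - standard_normal.cdf(z_low)


# Empirical-rule estimates: halve the central 68% and 95% regions.
rule_mean_to_1sd = 0.68 / 2                     # 34% between z = 0 and z = 1
rule_1sd_to_2sd = 0.95 / 2 - rule_mean_to_1sd   # 47.5% - 34% = 13.5%

# Exact values for the two time windows in the question.
exact_840_to_850 = p_between(-2, -1)  # 8:40 am to 8:50 am
exact_900_to_910 = p_between(0, 1)    # 9:00 am to 9:10 am

print(f"8:40 to 8:50 am: {rule_1sd_to_2sd:.1%} (rule), {exact_840_to_850:.1%} (exact)")
print(f"9:00 to 9:10 am: {rule_mean_to_1sd:.1%} (rule), {exact_900_to_910:.1%} (exact)")
```

The exact values are close to $13.6\%$ and $34.1\%$, so the $9:00$ am to $9:10$ am window is about $2.5$ times as likely, in line with the empirical-rule answer.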

Practice questions

Question 1

The amount of time spent waiting in the reception area at a doctor's office is approximately normally distributed.

The $z$-scores of the waiting times of four patients, represented by the letters $K$, $L$, $M$, and $N$, are given in the table below.

| Patient | $K$ | $L$ | $M$ | $N$ |
| $z$-score | $-2.91$ | $1.84$ | $-2.15$ | $1.48$ |

  1. Which event is least likely?

    A. Waiting longer than $L$.
    B. Waiting longer than $M$.
    C. Waiting longer than $N$.
    D. Waiting longer than $K$.

  2. Which event is more likely?

    A. Waiting longer than $M$ but less than $N$.
    B. Waiting longer than $K$ but less than $L$.

  3. Which event is more likely?

    A. Waiting less than $M$ or longer than $N$.
    B. Waiting less than $K$ or longer than $L$.

Question 2

The number of babies born in a country each day is approximately normally distributed with mean $153$ and standard deviation $28$. The numbers of babies born on two consecutive days, along with their $z$-scores, are provided below.

| Number of newborns | $265$ | $69$ |
| $z$-scores | $4$ | $-3$ |

  1. Which event is more likely?

    A. The number of babies born the following day is less than $69$.
    B. The number of babies born the following day is greater than $265$.

Question 3

The owner of a cafe records the arrival times of every customer each morning. The data set is approximately normally distributed with the busiest time (the mean) at $9:00$ am and a standard deviation of $14$ minutes.

  1. Complete the following table by finding the $z$-score of each arrival time.

    | Times | $8:18$ am | $8:32$ am | $8:46$ am |
    | $z$-scores | $\editable{}$ | $\editable{}$ | $\editable{}$ |

  2. The cafe owner has a friend who can help for a short period of time in the morning. Which of the following is the busiest time period for her friend to help?

    A. Before $8:18$ am.
    B. Between $8:18$ am and $8:32$ am.
    C. Between $8:32$ am and $8:46$ am.
    D. Between $8:46$ am and $9:00$ am.

Outcomes

MS2-12-2

analyses representations of data in order to make inferences, predictions and draw conclusions

MS2-12-7

solves problems requiring statistical processes, including the use of the normal distribution and the correlation of bivariate data
