9.06 Calculations with the normal distribution

As we discovered previously, the spread of a normal distribution is determined by its standard deviation, which varies from one data set to another. To compare normally distributed data sets directly, we need a common unit of measurement. For the normal distribution, that unit is the number of standard deviations a score lies away from the mean, called a $z$-score.

 

What is a $z$-score?

A $z$-score is a value that shows how many standard deviations a score is above or below the mean. Mathematically, it is the ratio of the distance a score is above or below the mean to the standard deviation. In other words, it indicates how far an individual's score deviates from the population mean, as shown in the picture below.

  • A positive $z$-score indicates the score was above the mean.
  • A $z$-score of zero indicates the score was equal to the mean.
  • A negative $z$-score indicates the score was below the mean.
Careful!

What's really important to remember is that $z$-scores can only be defined if the population parameters (i.e. the mean and standard deviation of the population) are known.

Remember a "population" just means every member of a group is counted. It doesn't have to be people. For example, it may be the Australian population, all the students in Year 10 in a school or all the chickens on a farm.

 

What are $z$-scores used for?

$z$-scores are used to compare different normally distributed data sets. For example, let's say Sam got $75$ on his Biology exam and $80$ on his Chemistry exam. At first glance, it would seem that he did better on his Chemistry exam. However, he then received this information from his teacher:

Subject      Mean    Standard deviation
Chemistry    $75$    $6$
Biology      $70$    $3$

What does this mean for Sam?

To really understand how Sam performed in his exams, we need to calculate his $z$-score for both of them. Let's do that now!

 

Calculating $z$-scores

There is a formula we can use for calculating the $z$-score of a score from a population.

Formula for calculating $z$-scores from a population

$z=\frac{x-\mu}{\sigma}$

This means:

$\text{standardised } z\text{-score}=\frac{\text{raw score}-\text{population mean score}}{\text{standard deviation}}$

Note: for sample (not population) data, we use $\overline{x}$ for the mean and $s$ for the standard deviation to estimate the population parameters.

So let's start by calculating Sam's $z$-score for Biology:

$z = \frac{x-\mu}{\sigma}$
$= \frac{75-70}{3}$
$= 1.6666\ldots$
$z = 1.67$ (to 2 d.p.)
 

This means he is $1.67$ standard deviations above the mean in Biology.

Now let's calculate his $z$-score for Chemistry:

$z = \frac{x-\mu}{\sigma}$
$= \frac{80-75}{6}$
$= 0.8333\ldots$
$z = 0.83$ (to 2 d.p.)

This means he is $0.83$ standard deviations above the mean in Chemistry.

His $z$-score for Biology was larger than his $z$-score for Chemistry, indicating that he performed better in Biology, relative to the class. Furthermore, assuming the marks are normally distributed, we can say that he was in approximately the top $20.3\%$ of the class for Chemistry, but in approximately the top $4.7\%$ for Biology. We'll take a look at how to calculate these probabilities in the sections that follow.
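
The comparison above can also be sketched in software. The following is a minimal illustration only, assuming Python 3.8 or later and its standard-library `statistics.NormalDist` class; the helper name `z_score` and the variable names are our own and are not part of the lesson.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) mu."""
    return (x - mu) / sigma


# Sam's marks with the class mean and standard deviation from the table.
z_bio = z_score(75, 70, 3)    # about 1.67
z_chem = z_score(80, 75, 6)   # about 0.83

# Proportion of a normally distributed class scoring above Sam:
# P(Z > z) = 1 - P(Z < z), where Z ~ N(0, 1).
standard_normal = NormalDist(0, 1)
top_bio = 1 - standard_normal.cdf(z_bio)
top_chem = 1 - standard_normal.cdf(z_chem)

print(f"Biology:   z = {z_bio:.2f}, top {top_bio:.1%} of the class")
print(f"Chemistry: z = {z_chem:.2f}, top {top_chem:.1%} of the class")
```

Because this sketch uses the unrounded $z$-scores, its percentages may differ in the last digit from values obtained by rounding $z$ to two decimal places first.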

 

The empirical rule in terms of $z$-scores

We were introduced to the empirical rule in the previous lesson and we can now express the rule in terms of $z$-scores.

The empirical rule
  • Approximately $68\%$ of scores have a $z$-score between $-1$ and $1$.
  • Approximately $95\%$ of scores have a $z$-score between $-2$ and $2$.
  • Approximately $99.7\%$ of scores have a $z$-score between $-3$ and $3$.

Remember, since the normal distribution is symmetric, we can halve the interval at the mean to halve the percentage of scores.
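
As a quick check of where these percentages come from, here is a short sketch, again assuming Python's standard-library `statistics.NormalDist` (the name `Z` is our own choice). It computes the area under the standard normal curve within $1$, $2$ and $3$ standard deviations of the mean.

```python
from statistics import NormalDist

Z = NormalDist(0, 1)  # standard normal distribution

for k in (1, 2, 3):
    # P(-k < Z < k) = P(Z < k) - P(Z < -k)
    within = Z.cdf(k) - Z.cdf(-k)
    print(f"P(-{k} < Z < {k}) = {within:.4f}")
```

The three areas are about $0.6827$, $0.9545$ and $0.9973$, which the empirical rule rounds to $68\%$, $95\%$ and $99.7\%$.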

Practice questions

question 1

A general ability test has a mean score of $100$ and a standard deviation of $15$.

  1. If Paul received a score of $102$ in the test, what was his $z$-score correct to two decimal places?

  2. If Georgia had a $z$-score of $3.13$, what was her score in the test, correct to the nearest integer?

question 2

Marge scored $43$ in her Mathematics exam, in which the mean score was $49$ and the standard deviation was $5$. She also scored $92.2$ in her Philosophy exam, in which the mean score was $98$ and the standard deviation was $2$.

  1. Find Marge’s $z$-score in Mathematics.

  2. Find Marge’s $z$-score in Philosophy.

  3. Which exam did Marge do better in, compared to the rest of her class?

    A. Philosophy

    B. Mathematics

 

Calculations with the standard and general normal distributions

We saw in our previous chapter how we can use the graph of the normal distribution and the $68$-$95$-$99.7\%$ rule to calculate probabilities for both the standard normal distribution, where $\mu=0$ and $\sigma=1$, and the general normal distribution, where $\mu$ and $\sigma$ can take any value.

When we wish to calculate probabilities for values that are not exactly $1$, $2$ or $3$ standard deviations from the mean, we turn to technology to perform our calculations.

Let's take a look at how to do these using our calculator.

Casio ClassPad

How to use the Casio ClassPad to complete the following tasks involving calculating probabilities for the normal distribution.

Use your calculator to calculate the following probabilities. Give answers correct to $4$ decimal places.

  1. $P\left(Z<2.2\right)$

  2. $P\left(-1.2<Z<1.72\right)$

  3. $P\left(X>90\right)$ where $X\sim N\left(100,7^2\right)$

  4. $P\left(12<X<20\right)$ where $X\sim N\left(15,5^2\right)$

TI Nspire

How to use the TI Nspire to complete the following tasks involving calculating probabilities for the normal distribution.

Use your calculator to calculate the following probabilities. Give answers correct to $4$ decimal places.

  1. $P\left(Z<2.2\right)$

  2. $P\left(-1.2<Z<1.72\right)$

  3. $P\left(X>90\right)$ where $X\sim N\left(100,7^2\right)$

  4. $P\left(12<X<20\right)$ where $X\sim N\left(15,5^2\right)$
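
If a calculator is not at hand, the same four probabilities can be checked in software. This is only a sketch, assuming Python's standard-library `statistics.NormalDist`; it is not a substitute for the calculator steps, and the variable names are our own. Note that `NormalDist` takes the standard deviation $\sigma$ as its second argument, whereas the notation $N\left(\mu,\sigma^2\right)$ states the variance.

```python
from statistics import NormalDist

Z = NormalDist(0, 1)                  # standard normal, Z ~ N(0, 1)

# 1. P(Z < 2.2): area to the left of 2.2
p1 = Z.cdf(2.2)

# 2. P(-1.2 < Z < 1.72): difference of two left-hand areas
p2 = Z.cdf(1.72) - Z.cdf(-1.2)

# 3. P(X > 90) where X ~ N(100, 7^2): complement of the left-hand area
X3 = NormalDist(mu=100, sigma=7)      # sigma = 7, not the variance 49
p3 = 1 - X3.cdf(90)

# 4. P(12 < X < 20) where X ~ N(15, 5^2)
X4 = NormalDist(mu=15, sigma=5)
p4 = X4.cdf(20) - X4.cdf(12)

for label, p in (("1", p1), ("2", p2), ("3", p3), ("4", p4)):
    print(f"Part {label}: {p:.4f}")
```

Each part uses only the cumulative area to the left of a value, which mirrors entering a lower and an upper bound on the calculator.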

Practice questions

question 3

Using your calculator, find the probability that a $z$-score is at most $1.60$ given that it is greater than $-0.69$ in the standard normal distribution.

Give your answer correct to $4$ decimal places.

question 4

If $X\sim N\left(20,5^2\right)$, use your calculator to determine $k$ in the following parts.

Round your answers to two decimal places.

  1. $P\left(X<k\right)=0.65$

  2. $P\left(X>k\right)=0.45$

  3. $P\left(k<X<27\right)=0.89$

  4. $P\left(21<X<k\right)=0.4$

 

Quantiles and percentiles

What does it mean when you hear that someone is in the $98$th percentile?

Have you ever completed a mathematics competition and found, when you received your result, that you were above the $0.85$ quantile?

You know this is a good thing, but do you know what it means?

As the name suggests, percentiles split an ordered set of data into one hundred parts, where each percentile indicates the percentage of the population below that value.

Let’s think about the height of female adults. If the $45$th percentile height is $155$ cm, this means that $45\%$ of female adults are shorter than $155$ cm.

We often see our test results expressed in this way: “You are placed in the $78$th percentile”. But what does this mean?

If you are at the $78$th percentile, then you performed better than $78\%$ of those who did the test.

But if you are in the $78$th percentile, then you performed better than at least $78\%$ of those who did the test.

In the second case, we can say "at least" $78\%$ because you may have scored better than other students also in the $78$th percentile. But in both cases your score is still below the percentiles above the $78$th (i.e. the $79$th, $80$th, etc. up to the $99$th percentile). This is illustrated in the images below:

Note that the percentile is not the same as your actual score in the test. Rather it is a reflection of your rank in the ordered set of scores from all the participants.

A quantile is exactly like a percentile, but expressed as a decimal instead.

So for the situation above, where you're in the $45$th percentile for your height, you would say you're in the $0.45$ quantile.

If you're in the $94$th percentile for your mathematics competition result, then you're in the $0.94$ quantile.

Worked examples

example 1

The results of an IQ test are known to be normally distributed with a mean of $100$ and a standard deviation of $10$.

(a) What is the $84$th percentile?

Think: We need to consider how many standard deviations above or below the mean indicates that an area of $84\%$ has been shaded on our normal distribution curve. Thinking about $84\%$, we know this is made up of $50\%$ (half the curve) and $34\%$. From our experience with the $68$-$95$-$99.7\%$ rule, we know that $34\%$ means we have one standard deviation above the mean. We can sketch this on a graph.

Do:

From our graph, we can see that the $84$th percentile is $110$ on this IQ test.

(b) What is the lowest mark achieved by the $0.975$ quantile?

Think: The lowest mark achieved by the $0.975$ quantile is the same as the mark that is at the $0.975$ quantile. Using similar reasoning as we did in part (a), we know that $0.975$ is made up of $0.5$ (half the curve) and $0.475$, which is half of $0.95$. Thus we're looking at the score that is two standard deviations above the mean. Again, a graph is useful here.

Do:

From our graph, we can see that the lowest mark achieved by the $0.975$ quantile is $120$ on the IQ test.
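
The answers in example 1 rely on the rounded percentages of the empirical rule. As a rough check, the sketch below assumes Python's standard-library `statistics.NormalDist` and uses its `inv_cdf` method (an inverse normal function) to find the same two scores without rounding; the variable names are our own.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=10)

# inv_cdf(p) returns the score x for which P(X < x) = p.
percentile_84 = iq.inv_cdf(0.84)     # part (a): the 84th percentile
quantile_975 = iq.inv_cdf(0.975)     # part (b): the 0.975 quantile

print(f"84th percentile: {percentile_84:.1f}")
print(f"0.975 quantile:  {quantile_975:.1f}")
```

The results are close to, but not exactly, $110$ and $120$, because $68\%$ and $95\%$ are themselves rounded values.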

 

Using the calculator

Let's look at an example using the inverse normal function of our calculator to find scores given a probability. Select your calculator brand below to work through an example.

Casio ClassPad

How to use the Casio ClassPad to complete the following tasks involving the inverse normal distribution.

The heights of $16$-year-old females are known to be normally distributed with a mean of $165$ cm and a standard deviation of $2$ cm.

Give your answers correct to $2$ decimal places.

  1. Calculate the height that $98\%$ of $16$-year-old females fall below.

  2. What is the height of the $0.4$ quantile?

  3. What is the shortest height of the tallest $15\%$ of females?

TI Nspire

How to use the TI Nspire to complete the following tasks involving the inverse normal distribution.

The heights of $16$-year-old females are known to be normally distributed with a mean of $165$ cm and a standard deviation of $2$ cm.

Give your answers correct to $2$ decimal places.

  1. Calculate the height that $98\%$ of $16$-year-old females fall below.

  2. What is the height of the $0.4$ quantile?

  3. What is the shortest height of the tallest $15\%$ of females?
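
The three height tasks above can also be sketched in software. This illustration assumes Python's standard-library `statistics.NormalDist`, whose `inv_cdf` method plays the role of the calculator's inverse normal function; the variable names are our own.

```python
from statistics import NormalDist

heights = NormalDist(mu=165, sigma=2)   # heights in cm

# 1. Height that 98% of females fall below: left-hand area of 0.98.
h_98 = heights.inv_cdf(0.98)

# 2. The 0.4 quantile: left-hand area of 0.4, so below the mean.
h_40 = heights.inv_cdf(0.40)

# 3. Shortest height of the tallest 15%: the area to the RIGHT is 0.15,
#    so the area to the left is 1 - 0.15 = 0.85.
h_top15 = heights.inv_cdf(1 - 0.15)

for label, h in (("1", h_98), ("2", h_40), ("3", h_top15)):
    print(f"Task {label}: {h:.2f} cm")
```

The key step is in the third task: the inverse normal function works with the area to the left, so a "top $15\%$" condition has to be converted to a left-hand area of $0.85$ first.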

Practice questions

question 5

For the standard normal variable $X\sim N\left(0,1\right)$, use a graphics calculator to determine the following values.

Round your answers to three decimal places.

  1. The $0.7$ quantile

  2. The $65$th percentile

  3. The lowest score in the top $20$ percent

question 6

If $X\sim N\left(30,4^2\right)$, determine:

  1. the $0.5$ quantile.

  2. the $0.83$ quantile.

    Round your answer to two decimal places.

  3. the $35$th percentile.

    Round your answer to two decimal places.

Finding an unknown mean or standard deviation

Since we can rely on technology to calculate probabilities for any general normal distribution, we rely less on the use of the standard normal distribution than we did in the past. However, when we don't know the mean and/or the standard deviation of a normal variable $X$, we can use the standard normal distribution $Z$ to help us calculate these.

Worked example

example 2

A machine produces components whose weights are normally distributed. The intention is for this machine to be calibrated to produce components whose weights have a mean of $600$ g, with only $0.95\%$ of components having a weight less than $590$ g. Determine the standard deviation of the calibrated machine.

Think: When we do not know the mean, the standard deviation, or both for a general normal distribution, we can use the link between the general normal distribution and the standard normal distribution to help us calculate them.

The link between the two is the standardising formula: $z=\frac{x-\mu}{\sigma}$

In this situation we have $x=590$ and $\mu=600$. Visually we can represent the information like this:

If we duplicate this diagram, but this time for the standard normal distribution, we can see that we can then use the inverse normal function of the calculator to calculate the associated $z$-score.

Do: Using the inverse normal function of the calculator, we find $z_1=-2.3455$. We can now substitute all our information into the standardising formula, where the score of interest is $x=590$, the population mean is $\mu=600$ and the corresponding $z$-score is $-2.3455$.

$z = \frac{x-\mu}{\sigma}$ (the formula for a $z$-score)

$-2.3455 = \frac{590-600}{\sigma}$ (substitute in known values)

$\sigma = \frac{-10}{-2.3455}$ (rearrange for $\sigma$)

$\sigma = 4.26$ g (to $2$ d.p.)
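
The steps of example 2 can be sketched as follows. This is a minimal illustration, assuming Python's standard-library `statistics.NormalDist`; the variable names are our own, and `inv_cdf` stands in for the calculator's inverse normal function.

```python
from statistics import NormalDist

x = 590           # score of interest (g)
mu = 600          # intended mean weight (g)
p_below = 0.0095  # 0.95% of components weigh less than 590 g

# Step 1: find the z-score with a left-hand area of 0.0095 under N(0, 1).
z = NormalDist(0, 1).inv_cdf(p_below)

# Step 2: rearrange z = (x - mu) / sigma to get sigma = (x - mu) / z.
sigma = (x - mu) / z

# Step 3: check that N(mu, sigma^2) really puts 0.95% of weights below 590 g.
check = NormalDist(mu, sigma).cdf(x)

print(f"z = {z:.4f}, sigma = {sigma:.2f} g, P(X < 590) = {check:.4f}")
```

Both $x-\mu$ and $z$ are negative here, so their quotient is positive, as a standard deviation must be.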

 

Practice questions

question 7

If $X\sim N\left(\mu,100\right)$, use your calculator to find $\mu$ if $P\left(\mu\le X\le 20\right)=0.3013$

  1. Round your answer to two decimal places.

question 8

If $X\sim N\left(\mu,\sigma^2\right)$, use your calculator to find $\mu$ and $\sigma$ if $P\left(X<70\right)=0.1817$ and $P\left(X<80\right)=0.9655$

  1. Round your answers to two decimal places.


Outcomes

U34.AoS4.3

continuous random variables:
  • construction of probability density functions from non-negative functions of a real variable
  • specification of probability distributions for continuous random variables using probability density functions
  • calculation and interpretation of mean, $\mu$, variance, $\sigma^2$, and standard deviation of a continuous random variable and their use
  • standard normal distribution, $N(0,1)$, and transformed normal distributions, $N(\mu,\sigma^2)$, as examples of a probability distribution for a continuous random variable
  • effect of variation in the value(s) of defining parameters on the graph of a given probability density function for a continuous random variable
  • calculation of probabilities for intervals defined in terms of a random variable, including conditional probability (the cumulative distribution function may be used but is not required)

U34.AoS4.4

statistical inference, including definition and distribution of sample proportions, simulations and confidence intervals:
  • distinction between a population parameter and a sample statistic and the use of the sample statistic to estimate the population parameter
  • simulation of random sampling, for a variety of values of $p$ and a range of sample sizes, to illustrate the distribution of $\hat{P}$ and variations in confidence intervals between samples
  • concept of the sample proportion as a random variable whose value varies between samples, where $X$ is a binomial random variable which is associated with the number of items that have a particular characteristic and $n$ is the sample size
  • approximate normality of the distribution of $\hat{P}$ for large samples and, for such a situation, the mean $p$ (the population proportion) and standard deviation
  • determination and interpretation of, from a large sample, an approximate confidence interval for a population proportion where $z$ is the appropriate quantile for the standard normal distribution, in particular the $95\%$ confidence interval as an example of such an interval where $z\approx 1.96$ (the term standard error may be used but is not required).
