We've already been introduced to the normal distribution. In this chapter, we are going to focus on the standard normal distribution which is the simplest form of the normal distribution. It has three key features:
The standard normal distribution is of great interest because any data set with some other normal distribution can be rescaled so that it has the standard normal distribution. This means that we only need to know the properties of the standard normal distribution in order to make predictions about data from infinitely many other normal distributions.
For example, suppose a great many components produced by a machine are measured. All the components should be identical, but it is found that there is some variation in the measurements, and a frequency histogram of the measurements shows the characteristic normal curve shape. From the data, it is found that the mean is $\mu$ and the standard deviation is $\sigma$. (We are using symbols rather than particular numbers to make the argument general.) We would say that the data has the $N(\mu,\sigma)$ distribution: normal with mean $\mu$ and standard deviation $\sigma$.
We wish to rescale the data set so that it has the $N(0,1)$ distribution. To do this, we first subtract $\mu$ from each of the data values. After this transformation, the data will have mean $0$. Next, we divide each data value by $\sigma$. This operation causes the transformed data to have standard deviation $1$.
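The two rescaling steps can be checked numerically. The following is a minimal sketch in Python, assuming a small set of hypothetical measurements (the values and the function name `standardise` are invented for illustration and do not come from the text). It subtracts the mean, divides by the standard deviation, and confirms that the result has mean $0$ and standard deviation $1$.

```python
from statistics import fmean, pstdev

def standardise(values):
    """Return the z-scores (x - mean) / sd for a list of measurements."""
    mu = fmean(values)
    sigma = pstdev(values)  # population standard deviation of the data
    return [(x - mu) / sigma for x in values]

# Hypothetical measurements in cm, invented for illustration only.
measurements = [6.42, 6.48, 6.50, 6.53, 6.57]
z_scores = standardise(measurements)

print(fmean(z_scores))   # approximately 0 (up to rounding error)
print(pstdev(z_scores))  # approximately 1 (up to rounding error)
```

The sketch uses the population standard deviation (`pstdev`) because the rescaling divides by the standard deviation of the data set itself.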
Verify, using previously established facts about linear transformations of random variables, that the standardisation operation described above is consistent with earlier work.
Let $X$ be a random variable with mean $\mu$ and variance $\sigma^2$. We wish to know the mean and standard deviation of the transformed random variable $Y=\frac{X-\mu}{\sigma}$.
The mean of $X$ is expressed as the expected value $E(X)$. From previous work, we know that
$$\begin{aligned}
E(Y) &= E\left[\frac{X-\mu}{\sigma}\right] \\
&= E\left[\frac{X}{\sigma}-\frac{\mu}{\sigma}\right] \\
&= \frac{E(X)}{\sigma}-\frac{\mu}{\sigma} \\
&= \frac{\mu}{\sigma}-\frac{\mu}{\sigma} \\
&= 0
\end{aligned}$$
Also,
$$\begin{aligned}
Var(Y) &= Var\left(\frac{X-\mu}{\sigma}\right) \\
&= Var\left(\frac{X}{\sigma}-\frac{\mu}{\sigma}\right) \\
&= \frac{1}{\sigma^2}Var\left(X\right) \\
&= \frac{\sigma^2}{\sigma^2} \\
&= 1
\end{aligned}$$
Then, to get the standard deviation of $Y$, we take $\sqrt{Var(Y)}=1$.
Continuing the component production scenario, suppose the component produced by the machine was supposed to measure $6.5$ cm and the data showed a standard deviation of $0.05$ cm. The quality control requirement for the machine is that no more than $1\%$ of the components it produces should measure more than $6.65$ cm or less than $6.35$ cm.
That is, at least $99\%$ of the components are required to be within $3\sigma=0.15$ cm of the mean. In terms of probability, a component produced by the machine should have a probability of at least $0.99$ of being within the $3\sigma$ limit.
The standard normal distribution table makes it possible to determine this probability in terms of numbers of standard deviations away from the mean. In the tables given, the leftmost columns, labelled $z$, and the column headings are used to determine the numbers of standard deviations. The probabilities are in the body of the table.
Only the positive numbers of standard deviations need be shown in the table because the distribution is symmetrical. (Half of the probability is on the positive side and half on the negative.)
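A table entry of this kind can also be computed directly. The sketch below assumes the table lists the area between $0$ and $z$, so that an entry equals the cumulative probability up to $z$ minus the $0.5$ that lies to the left of the mean. It uses `NormalDist` from Python's standard library; the function names `area_from_zero` and `area_within` are illustrative only.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0, sigma=1)

def area_from_zero(z):
    """Area under the standard normal curve between 0 and z (for z >= 0)."""
    return STANDARD_NORMAL.cdf(z) - 0.5

def area_within(z):
    """Area between -z and z, obtained by doubling thanks to symmetry."""
    return 2 * area_from_zero(z)

print(round(area_from_zero(1.96), 4))  # 0.475
print(round(area_within(1.96), 4))     # 0.95
```

Doubling in `area_within` is where the symmetry of the distribution is used: the area between $-z$ and $0$ equals the area between $0$ and $z$.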
Because the table gives the area on one side of the mean only, half of the central $99\%$ lies on each side, so we look in the body of the table for the probability $\frac{0.99}{2}=0.495$. Then, by looking at the row and column headings, we see that this corresponds to a $z$-score of between $2.57$ and $2.58$ standard deviations. We could take $z=2.575$.
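The reverse lookup can be sketched in Python as a cross-check on reading the table. This assumes, as above, that the area sought is measured from $0$ to $z$, so the cumulative area to the left of $z$ is $0.5+0.495$. The variable names are illustrative.

```python
from statistics import NormalDist

# Area between 0 and z that we want: half of the central 0.99.
target_area = 0.99 / 2

# The cumulative area to the left of z is the left half (0.5) plus target_area.
z = NormalDist().inv_cdf(0.5 + target_area)

print(round(z, 3))  # 2.576
```

The computed value lies between the two table headings $2.57$ and $2.58$, in line with the choice of $z=2.575$.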
Reverting to the actual distribution $N(6.5,0.05)$, we conclude that $99\%$ of the machine's output measures between $6.5-0.05\times2.575\approx6.37$ cm and $6.5+0.05\times2.575\approx6.63$ cm.
Since the required quality control limits are wider than this range, we conclude that the machine is operating satisfactorily and there will be fewer than $1\%$ rejected components.
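As a check on this conclusion, the proportion of components falling outside the quality control limits can be estimated directly. The sketch below assumes the measurements follow the $N(6.5,0.05)$ distribution exactly, which is an idealisation of the scenario; the variable names are illustrative.

```python
from statistics import NormalDist

# Model of the machine's output, assuming an exact normal distribution.
machine = NormalDist(mu=6.5, sigma=0.05)
lower_limit, upper_limit = 6.35, 6.65  # quality control limits in cm

# Probability that a component falls inside the limits, and outside them.
within = machine.cdf(upper_limit) - machine.cdf(lower_limit)
rejected = 1 - within

print(round(within, 4))    # 0.9973
print(round(rejected, 4))  # 0.0027
```

Under this assumption, about $0.27\%$ of components would be rejected, which is below the $1\%$ allowed.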
The given table gives us the area between $0$ and a given $z$-score.
Using the table, find the area under the normal curve between one standard deviation below the mean and two standard deviations above the mean.
Give your answer to four decimal places.
The given table gives us the area between $0$ and a given $z$-score.
Using the table, find the area under the normal curve between $z=1.51$ and $z=1.89$.
Give your answer to four decimal places.
The table below shows the area between $0$ and a given $z$-score. Use this table to find the percentage of data that is less than $z=0.18$.
Give your answer as a percentage to two decimal places.