Consider a set of measurements $x_1,x_2,x_3,\ldots,x_n$. They could be students' scores achieved in a test, the lengths of pieces of string, or many other things. Whatever their physical origin, we can think of them as values of a random variable $X$.
We calculate the mean of $X$ by summing the measurements and dividing by the number of them. In symbols, we write
$\mu_X=\frac{1}{n}\sum_{i=1}^nx_i$.
We calculate the variance of $X$ by summing the squared distances of the measurements from the mean and dividing by the number of them. In symbols, we write
$\text{Var}X=\frac{1}{n}\sum_{i=1}^n(x_i-\mu_X)^2$.
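The two definitions above can be sketched in a few lines of Python. The data values and the function names `mean` and `variance` are made up for illustration; the only point is that the variance here divides by $n$, which matches what the standard library calls `pvariance`.

```python
from statistics import pvariance

# Hypothetical test scores, chosen only for illustration.
scores = [12.0, 15.0, 9.0, 18.0, 16.0]

def mean(xs):
    """Sum the measurements and divide by how many there are."""
    return sum(xs) / len(xs)

def variance(xs):
    """Average squared distance from the mean (divides by n, not n - 1)."""
    mu = mean(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)

mu_x = mean(scores)
var_x = variance(scores)

print(mu_x, var_x)
# The standard library's population variance uses the same definition.
print(pvariance(scores))
```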
Suppose we added $20$ points to every student's test score, or decreased the length of every piece of string by $10$ cm, or, more generally, added a fixed amount $a$ to every $x_i$ to obtain a new random variable $Y$. We could write $Y=X+a$.
Now, the mean of $Y$ is $\mu_Y=\frac{1}{n}\sum_{i=1}^n(x_i+a)$, since $a$ has been added to every measurement. Thus,
$\begin{aligned}
\mu_Y&=\frac{1}{n}\left(\sum_{i=1}^nx_i+na\right)\\
&=\mu_X+a
\end{aligned}$
This is not unexpected. If the value of every observation is shifted by a fixed amount, then the mean should also move by that amount.
The variance of $Y$ must be
$\begin{aligned}
\text{Var}Y&=\frac{1}{n}\sum_{i=1}^n\left(y_i-\mu_Y\right)^2\\
&=\frac{1}{n}\sum_{i=1}^n\left(x_i+a-(\mu_X+a)\right)^2\\
&=\frac{1}{n}\sum_{i=1}^n\left(x_i-\mu_X\right)^2\\
&=\text{Var}X
\end{aligned}$
Thus, adding a constant to every measurement does not affect the variance.
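As a quick numerical check of the shift result, here is a minimal sketch using Python's `statistics` module. The scores are hypothetical, and $a=20$ is taken from the test-score illustration in the text.

```python
from statistics import fmean, pvariance

# Hypothetical test scores, for illustration only.
x = [31.0, 28.0, 36.0, 40.0, 25.0]
a = 20.0  # the fixed amount added to every score

# Y = X + a, applied to each measurement.
y = [xi + a for xi in x]

mean_shift = fmean(y) - fmean(x)          # expected to equal a
var_change = pvariance(y) - pvariance(x)  # expected to be zero, up to rounding

print(mean_shift, var_change)
```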
Suppose we scaled the scores of the students by a factor of $1.3$, or we decided to express the lengths of the pieces of string in metres rather than centimetres, or, more generally, we multiplied every $x_i$ by a number $b$ to obtain a new random variable $Z=bX$.
The mean of $Z$ must be
$\begin{aligned}
\mu_Z&=\frac{1}{n}\sum_{i=1}^nbx_i\\
&=\frac{b}{n}\sum_{i=1}^nx_i\\
&=b\mu_X
\end{aligned}$
Again, this is not unexpected. If the value of every observation is multiplied by a fixed amount, then the mean should also be multiplied by that amount.
The variance of $Z$ must be
$\begin{aligned}
\text{Var}Z&=\frac{1}{n}\sum_{i=1}^n(z_i-\mu_Z)^2\\
&=\frac{1}{n}\sum_{i=1}^n(bx_i-b\mu_X)^2\\
&=\frac{b^2}{n}\sum_{i=1}^n(x_i-\mu_X)^2\\
&=b^2\text{Var}X
\end{aligned}$
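The scaling result can be checked in the same way. This sketch uses the centimetres-to-metres illustration from the text, so $b=0.01$; the string lengths themselves are hypothetical.

```python
from statistics import fmean, pvariance

# Hypothetical string lengths in centimetres, for illustration only.
lengths_cm = [120.0, 95.0, 150.0, 110.0]
b = 0.01  # scale factor for converting centimetres to metres

# Z = bX, applied to each measurement.
lengths_m = [b * x for x in lengths_cm]

mean_ratio = fmean(lengths_m) / fmean(lengths_cm)        # expected to be b
var_ratio = pvariance(lengths_m) / pvariance(lengths_cm)  # expected to be b squared

print(mean_ratio, var_ratio)
```

Up to floating-point rounding, the mean is multiplied by $b$ while the variance is multiplied by $b^2$.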
Putting all of this together, we see that if $X$ and $W$ are random variables with $W$ a linear transformation of $X$, say
$W=aX+b$
(where $a$ is now the scale factor and $b$ is the constant added), then
$\mu_W=a\mu_X+b$
and
$\text{Var}W=a^2\text{Var}X$.
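The combined rule lets us work from summary statistics alone, without transforming every measurement. The sketch below assumes a hypothetical helper named `transform_summary` and made-up data, and compares its prediction with the mean and variance computed directly from the transformed values.

```python
from statistics import fmean, pvariance

def transform_summary(mean_x, var_x, a, b):
    """Mean and variance of W = a*X + b, given the mean and variance of X."""
    return a * mean_x + b, a ** 2 * var_x

# Hypothetical measurements and coefficients, for illustration only.
x = [4.0, 7.0, 1.0, 8.0]
a, b = -3.0, 5.0

# Transform every measurement, then summarise directly.
w = [a * xi + b for xi in x]
observed = (fmean(w), pvariance(w))

# Apply the rule to the summary statistics of X instead.
predicted = transform_summary(fmean(x), pvariance(x), a, b)

print(predicted, observed)
```

A negative $a$ is used deliberately: the mean changes sign with it, but the variance is multiplied by $a^2$ and so stays non-negative.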
A school mathematics test was given marks out of $50$. It was found that some of the questions on the test were very easy and every student was able to answer them correctly. The easy questions were worth $14$ marks, and the teacher decided to remove those marks from all the scores so that the test was now effectively marked out of $36$. The school required that the test results be recorded as a mark out of $100$, so every score then had to be scaled up by a factor of $\frac{100}{36}$.
If the original mean was $33$ and the original variance was $9$, what were the mean and standard deviation reported in the school's records?
The original scores can be represented as a random variable $X$ and the final transformed scores by $Y$. The transformation is given by $Y=\frac{25}{9}(X-14)$. That is, $Y=\frac{25X}{9}-\frac{350}{9}$.
We must have, for the mean, $\mu_Y=\frac{25}{9}\mu_X-\frac{350}{9}=\frac{25}{9}\times33-\frac{350}{9}=\frac{475}{9}\approx52.8$.
For the variance, $\text{Var}Y=\left(\frac{25}{9}\right)^2\text{Var}X=\left(\frac{25}{9}\right)^2\times9=\frac{625}{9}$.
The standard deviation is the square root of the variance. So, the standard deviation of $Y$ is $\frac{25}{3}\approx8.3$.
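The arithmetic in this example can be reproduced exactly with Python's `fractions` module. This is only a sketch of the calculation above; the variable names are chosen here for illustration.

```python
from fractions import Fraction
from math import sqrt

mean_x = Fraction(33)  # original mean
var_x = Fraction(9)    # original variance

a = Fraction(100, 36)  # scale factor, reduces to 25/9
b = -a * 14            # constant term, -350/9, from Y = (25/9)(X - 14)

mean_y = a * mean_x + b   # exact value 475/9
var_y = a ** 2 * var_x    # exact value 625/9
sd_y = sqrt(var_y)        # standard deviation as a float

print(mean_y, float(mean_y))
print(var_y, sd_y)
```

Working in fractions avoids rounding until the final step, which makes it easier to compare against the hand calculation.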
You will often find another notation for the variance of a random variable. Instead of $\text{Var}X$, we write $\sigma_X^2$. Then, the standard deviation is $\sigma_X$.
You will also find the mean of a random variable referred to as its expected value, so $\mu_X$ and $E(X)$ are alternatives.
The heights of a certain species of fully grown plants are thought to be normally distributed with a mean of $55$ cm and a standard deviation of $4$ cm.
If the heights were recorded in mm instead of cm:
State the new mean.
State the new variance.
In a given population, a certain variable $X$ is considered to be normally distributed with a mean of $80$ and a standard deviation of $4$.
If the data is transformed according to the rule $Y=-8-4X$:
Calculate the new mean.
Calculate the new standard deviation.
The marks in the Chemistry ATAR exam were normally distributed. Let $X\%$ be the random variable representing the distribution of these marks.
Dylan scored a raw mark of $57\%$ and after average marks scaling scored $63.28\%$.
Danielle scored a raw mark of $86\%$ and after average marks scaling scored $93.44\%$.
If these marks were scaled according to the rule $aX+b$, determine the values of $a$ and $b$.
If the raw mean mark was $62\%$, determine the scaled mean mark for Chemistry.
Investigate situations that involve elements of chance:
A calculating probabilities of independent, combined, and conditional events
B calculating and interpreting expected values and standard deviations of discrete random variables
C applying distributions such as the Poisson, binomial, and normal
Apply probability distributions in solving problems