
11.03 Variance and standard deviation

Lesson

Having calculated the expected value as a measure of the central tendency of a probability distribution, we will now learn to calculate the variance as a measure of spread.

When working with data, the variance is the average of the squared differences, or deviations, from the mean. (Squaring each deviation gives a non-negative value, so deviations above and below the mean cannot cancel out, and the result indicates how spread out our distribution is about the mean.) The larger the variance, the more spread out the data set is.

This gives us the following formula, with $\mu$ for the mean, $x$ for the scores we have in our set and $n$ for the number of scores:

$\sigma^2=\frac{1}{n}\Sigma(x-\mu)^2$

A class of Mathematics students were asked how many siblings they had.

The following set lists the number of siblings of each student in the class:

$1,2,1,3,0,2,2,2,1,4$

To calculate variance using the formula stated above, we first need to find the mean:

$\mu=\frac{1+2+1+3+0+2+2+2+1+4}{10}$
$\mu=1.8$

Now, we need to calculate the square of the difference of each value from the mean. For our lowest value of $x$, $0$, this would be:

$(0-1.8)^2=\frac{81}{25}$

And so for our data set:

$\sigma^2=\frac{1}{10}[1\times(0-1.8)^2+3\times(1-1.8)^2+4\times(2-1.8)^2+1\times(3-1.8)^2+1\times(4-1.8)^2]$
$\sigma^2=\frac{29}{25}$
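
The calculation above can also be checked numerically. The following is a minimal Python sketch (an addition, not part of the original lesson) that works with exact fractions. The function name `population_variance` is our own choice for illustration.

```python
from fractions import Fraction

def population_variance(scores):
    """Average of the squared deviations from the mean (divides by n)."""
    n = len(scores)
    mean = Fraction(sum(scores), n)
    return sum((x - mean) ** 2 for x in scores) / n

# Number of siblings of each student in the class
siblings = [1, 2, 1, 3, 0, 2, 2, 2, 1, 4]

print(Fraction(sum(siblings), len(siblings)))  # mean: 9/5, i.e. 1.8
print(population_variance(siblings))           # variance: 29/25
```

Using `Fraction` rather than decimals avoids rounding, so the result matches the hand calculation exactly.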

Calculating variance using probabilities

Just as we saw with expected value, we can rewrite this calculation in terms of the probabilities of a discrete probability distribution. Summarising the distribution as:

| $x$ | $0$ | $1$ | $2$ | $3$ | $4$ |
|---|---|---|---|---|---|
| $P(X=x)$ | $\frac{1}{10}$ | $\frac{3}{10}$ | $\frac{4}{10}$ | $\frac{1}{10}$ | $\frac{1}{10}$ |

We can manipulate the calculation for variance to the form: 

$\sigma^2=\frac{1}{10}\times(0-1.8)^2+\frac{3}{10}\times(1-1.8)^2+\frac{4}{10}\times(2-1.8)^2+\frac{1}{10}\times(3-1.8)^2+\frac{1}{10}\times(4-1.8)^2$

And so, we can think of variance as a weighted mean of the squared deviations, weighted by the probabilities. This gives us the formula:

$Var(X)=\Sigma(x-\mu)^2P(x)$

summing over all values of our probability distribution.

Weighting a mean using probabilities is the process you use to calculate an expected value. And so, the variance is, in fact, the expected value of $(X-\mu)^2$. This gives us the further formula:

$Var(X)=E((X-\mu)^2)$
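
To make the idea of "variance as an expected value" concrete, here is a short Python sketch (our own illustration, not part of the original lesson). The helper `expected_value` and the dictionary `dist` are names we have made up. The probabilities are those of the siblings distribution.

```python
from fractions import Fraction

# P(X = x) for the number of siblings
dist = {
    0: Fraction(1, 10),
    1: Fraction(3, 10),
    2: Fraction(4, 10),
    3: Fraction(1, 10),
    4: Fraction(1, 10),
}

def expected_value(dist, f=lambda x: x):
    """E(f(X)): weight each f(x) by its probability and add."""
    return sum(f(x) * p for x, p in dist.items())

mu = expected_value(dist)                             # E(X)
var = expected_value(dist, lambda x: (x - mu) ** 2)   # E((X - mu)^2)

print(mu, var)  # 9/5 29/25
```

The same helper computes both quantities: the variance is just the expected value of a different function of $X$.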

As you have seen, this calculation is quite laborious: it needs careful calculator use, with lots of fractions and brackets. But there is good news, as there is a simpler formula and method we can use:

$Var(X)=E(X^2)-\mu^2$

With this formula, we calculate the expected value of $X^2$ by squaring each outcome and multiplying by the associated probability. After adding these together, we only need to subtract $\mu^2$, the square of the original expected value, once. (Don't worry, we don't need to prove the derivation of this simplified formula - we just need to use it!)

Recalculating the variance using this new formula:

$Var(X)=0^2\times\frac{1}{10}+1^2\times\frac{3}{10}+2^2\times\frac{4}{10}+3^2\times\frac{1}{10}+4^2\times\frac{1}{10}-\frac{81}{25}=\frac{29}{25}$
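
As a quick check that the two formulae agree for this distribution, the following Python sketch (an illustrative addition, with variable names of our own choosing) computes the variance both ways using exact fractions.

```python
from fractions import Fraction

# P(X = x) for the number of siblings
dist = {
    0: Fraction(1, 10),
    1: Fraction(3, 10),
    2: Fraction(4, 10),
    3: Fraction(1, 10),
    4: Fraction(1, 10),
}

mu = sum(x * p for x, p in dist.items())          # E(X)
e_x2 = sum(x ** 2 * p for x, p in dist.items())   # E(X^2)

var_definition = sum((x - mu) ** 2 * p for x, p in dist.items())
var_shortcut = e_x2 - mu ** 2

print(e_x2)                          # 22/5
print(var_definition, var_shortcut)  # 29/25 29/25
```

Here $E(X^2)=\frac{22}{5}$ and $\mu^2=\frac{81}{25}$, so the shortcut needs only one subtraction at the end.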

Standard deviation

Standard deviation, written $\sigma$ when referring to a population, is simply the square root of the variance. For us, it is just another way to measure the spread of a distribution. Its main benefit is that it is measured in the same units as the original scores, whereas variance is measured in square units, since its calculation involves squaring the distances of the scores from the mean.

Mathematically, standard deviation has some interesting properties, such as proportionality: when every score of a distribution is multiplied by a constant $c$, the standard deviation is multiplied by $|c|$. It is also used to normalise scores for comparison.
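
As a rough illustration of the proportionality property, the Python sketch below (an addition, not part of the original lesson) multiplies every score in the siblings distribution by a constant `c = 3`, a value chosen arbitrarily, and checks that the standard deviation is multiplied by the same factor. The helper name `std_dev` is our own.

```python
import math

def std_dev(dist):
    """Standard deviation of a discrete distribution {x: P(X = x)}."""
    mu = sum(x * p for x, p in dist.items())
    var = sum(x ** 2 * p for x, p in dist.items()) - mu ** 2
    return math.sqrt(var)

# P(X = x) for the number of siblings
dist = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.1, 4: 0.1}

c = 3  # arbitrary positive scale factor, for illustration only
scaled = {c * x: p for x, p in dist.items()}

# Compare with a tolerance, since these are floating-point values
print(math.isclose(std_dev(scaled), c * std_dev(dist)))  # True
```

The variance, by contrast, is multiplied by $c^2$, which is one reason the standard deviation is easier to interpret.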

Just as there are sample and population means, there are sample and population standard deviations. A sample standard deviation is notated as $s$. Remember though, a discrete probability distribution is always considered as a population, so we use $\sigma$, not $s$. And just as a sample mean is an estimate of the population mean, you can think of a sample standard deviation as an estimate of the population's standard deviation. The bigger the sample, the better the estimate.
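
To see the difference in practice, here is a small sketch using Python's built-in `statistics` module: `pstdev` treats the data as a whole population (dividing by $n$), while `stdev` treats it as a sample (dividing by $n-1$). Treating the class's sibling counts as a sample is purely an assumption made for illustration.

```python
import statistics

# Number of siblings of each student in the class
siblings = [1, 2, 1, 3, 0, 2, 2, 2, 1, 4]

sigma = statistics.pstdev(siblings)  # population standard deviation
s = statistics.stdev(siblings)       # sample standard deviation

print(round(sigma, 3))  # 1.077
print(round(s, 3))      # 1.135
```

For a discrete probability distribution we always want the population version, which here is the square root of $\frac{29}{25}$.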

The good news is that the final two formulae for variance are given to you on your reference sheet for your exams. So, as long as you remember how to calculate an expected value as a weighted mean, the process will hopefully become quite routine.

Variance and standard deviation

The variance of a discrete probability distribution as a measure of spread:

$Var(X)=E((X-\mu)^2)$ or $Var(X)=E(X^2)-\mu^2$

Standard deviation is the square root of the variance:

$\sigma=\sqrt{Var(X)}$

Remember that the expected value $E(X)$ is found by multiplying the value of each outcome $x$ by its probability $P(x)$ and calculating the sum.

Practice questions

Question 1

Consider the table.

| $x$ | $0$ | $1$ | $2$ | $3$ | $4$ |
|---|---|---|---|---|---|
| $P(X=x)$ | $0.12$ | $0.15$ | $0.22$ | $0.23$ | $0.28$ |

  1. Does the table represent a discrete probability distribution?

    A. Yes

    B. No

  2. Calculate $E(X)$.

  3. Calculate the variance of $X$.

  4. Hence calculate the standard deviation.

    Give your answer to two decimal places.

Question 2

$X$ is a random variable with the following probability distribution table.

| $x$ | $1$ | $3$ | $5$ | $7$ | $9$ |
|---|---|---|---|---|---|
| $P(X=x)$ | $\frac{1}{12}$ | $k$ | $\frac{1}{2}$ | $\frac{1}{20}$ | $\frac{1}{5}$ |

  1. Find the value of $k$.

  2. Find $E(X)$.

  3. Find $Var(X)$.

Question 3

At a local fair, in a game that involves rolling a standard die, players can win a prize depending on what they roll. Each player must pay $\$3$ to play. The prizes are awarded as follows:

  • The player wins $\$3$ if a $1$, $3$ or $5$ is rolled.
  • The player wins $\$6$ if a $4$ or $6$ is rolled.
  • The player wins $\$9$ if a $2$ is rolled.

  1. Let $X$ be the prize received by the player.

    Complete the probability distribution table.

    Write your answers as fractions.

    | $x$ | $3$ | $6$ | $9$ |
    |---|---|---|---|
    | $P(X=x)$ | | | |

  2. Calculate the expected prize value.

  3. Calculate the standard deviation of the distribution.

    Give your answer to the nearest cent.

Outcomes

MA11-7

uses concepts and techniques from probability to present and interpret data and solve problems in a variety of contexts, including the use of probability distributions
