
4.02 Median and mode

Lesson

Measures of central tendency attempt to summarise a set of data with a single value that describes the centre or middle of the scores.

The three main measures of central tendency are the mean, median, and mode. Deciding which one is best depends on other characteristics of the particular set of data. We will look further into the suitability of the different measures, and observe the effects of outliers, in our next lesson.

 

Measures of centre

Mean: Often referred to as the average. This is the sum of the scores divided by the number of scores.

Median: The middle value of an ordered set of data, or the value that separates the bottom half and top half of the scores.

Mode: The most frequently occurring value. For continuous data or data grouped in class intervals we talk about the modal class - the most frequently occurring class, rather than a mode.
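As a rough illustration of these three definitions, the sketch below uses Python's standard `statistics` module. The scores are made up for this example and are not taken from the lesson.

```python
import statistics

# Illustrative scores (an assumption for this sketch, not data from the lesson).
scores = [2, 3, 3, 5, 7]

mean = statistics.mean(scores)      # sum of the scores divided by the number of scores
median = statistics.median(scores)  # middle value once the scores are ordered
mode = statistics.mode(scores)      # most frequently occurring value

print(mean, median, mode)  # 4 3 3
```

Here the mean is $\frac{20}{5}=4$, while the median and the mode are both $3$, so the three measures do not have to agree.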

 

Median

The median is one way of describing the middle or the centre of a data set using a single value. The median is the middle score in a data set.

The data must be ordered (usually in ascending order) before calculating the median.

Which term is in the middle?

Suppose we have five numbers in our data set: $4$, $11$, $15$, $20$ and $24$.

The median would be $15$ because it is the value right in the middle. There are two numbers on either side of it.

$4, 11, \boxed{15}, 20, 24$

If we have a larger data set, however, we may not be able to see straight away which term is in the middle. There are two methods we can use to help us work this out.

 

The "cross out" method

Exploration

Once a data set is ordered, we can cross out numbers in pairs (one high number and one low number) until there is only one number left. Let's check out this process using an example. Here is a data set with nine numbers: $1,1,3,5,7,9,9,10,15$.

  1. Check that the data is sorted in ascending order (i.e. in order from smallest to largest).

  2. Cross out the smallest and the largest number, which leaves $1,3,5,7,9,9,10$.

  3. Repeat step 2, working from the outside in and taking the smallest number and the largest number each time, until there is only one term left. In this example the remaining term is $7$, so the median is $7$.

Note that this process will only leave one term if there is an odd number of terms to start with. If there is an even number of terms, the process will leave two terms instead. If you cross them all out, you've gone too far! To find the median of a set with an even number of terms, we take the mean of the two remaining middle terms.
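The "cross out" steps can be mimicked in a few lines of Python. This is only a sketch: the function name `median_by_crossing_out` is made up for illustration, and the two data sets are the nine-number and four-number sets used in this lesson.

```python
def median_by_crossing_out(data):
    """Find the median by repeatedly removing the smallest and largest values."""
    if not data:
        raise ValueError("the data set must contain at least one score")
    remaining = sorted(data)  # step 1: put the data in ascending order
    # Steps 2 and 3: cross out one low and one high value while more than two remain.
    while len(remaining) > 2:
        remaining = remaining[1:-1]
    # One term left (odd count): that term is the median.
    # Two terms left (even count): the median is their mean.
    return sum(remaining) / len(remaining)


print(median_by_crossing_out([1, 1, 3, 5, 7, 9, 9, 10, 15]))  # 7.0
print(median_by_crossing_out([8, 12, 17, 20]))                # 14.5
```

Stopping while one or two terms remain is what prevents "going too far" with an even number of terms.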

 

The "counting terms" method

We can also work out which term will be the middle number by considering whether there is an odd or even number of scores, and then using a formula.

We summarise the formulas below.

Finding the median position

Let $n$ be the number of terms.

  • If $n$ is odd, then the median is the middle term, which is the $\frac{n+1}{2}$th term.
  • If $n$ is even, then the median is the average of the two middle terms, that being the $\frac{n}{2}$th and $\left(\frac{n}{2}+1\right)$th terms.

Exploration

Let's use the same set of nine numbers from the previous example, $1,1,3,5,7,9,9,10,15$. We can see that there is an odd number of scores, $n=9$, so the position of the median is:

$\text{Position of median} = \frac{9+1}{2}$ (using $\frac{n+1}{2}$)

$= 5\text{th term}$ (simplifying the fraction)

 

This means the fifth term will be the median: $1,1,3,5,\boxed{7},9,9,10,15$.

So again, we find that the median is $7$.

Let's now try this with an even number of terms. Here is a data set with four terms: $8,12,17,20$. This time, we have $n=4$. What would happen if we used the same procedure as above?

$\text{Position of median} = \frac{4+1}{2}$ (using $\frac{n+1}{2}$ again)

$= 2.5\text{th term}$ (simplifying the fraction)

 

What does the "$2.5$th term" mean? Just as with the "cross out" method, the $2.5$th term means the average (mean) of the $2$nd and $3$rd terms. This is why, when the number of scores, $n$, is even, we find the average of the $\frac{n}{2}$th and $\left(\frac{n}{2}+1\right)$th terms.

Again, remember that the data must be in order before counting along to the median position. So in this example, the median will be the average of $12$ and $17$.

$\text{Median} = \frac{12+17}{2}$ (taking the average of the $2$nd and $3$rd scores)

$= 14.5$ (simplifying the fraction)
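The two position formulas can also be checked with a short Python sketch. The function name `median_by_position` is made up for illustration. Note that the formulas count terms from $1$, while Python lists count from $0$, so each position is shifted down by one when indexing.

```python
def median_by_position(data):
    """Find the median using the 'counting terms' position formulas."""
    ordered = sorted(data)  # the data must be in order before counting
    n = len(ordered)
    if n == 0:
        raise ValueError("the data set must contain at least one score")
    if n % 2 == 1:
        position = (n + 1) // 2       # the (n+1)/2 th term, counting from 1
        return ordered[position - 1]  # shift by one for 0-based indexing
    lower = ordered[n // 2 - 1]       # the (n/2)th term
    upper = ordered[n // 2]           # the (n/2 + 1)th term
    return (lower + upper) / 2        # average of the two middle terms


print(median_by_position([1, 1, 3, 5, 7, 9, 9, 10, 15]))  # 7
print(median_by_position([8, 12, 17, 20]))                # 14.5
```

Both results match the worked examples above: the $5$th of nine terms is $7$, and the average of the $2$nd and $3$rd of four terms is $14.5$.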

Practice questions

Question 1

Consider the following scores:

$23,25,13,9,11,21,24,17,20$

  1. Sort the scores in ascending order.

  2. Calculate the median.

Question 2

Write down $4$ consecutive odd numbers whose median is $40$.

  1. Write all solutions on the same line separated by a comma.

 
 

The mode

The mode is another measure of central tendency - that is, it's a third way of describing a value that represents the centre of the data set. The mode describes the most frequently occurring score. For continuous data or data grouped in class intervals we talk about the modal class - the most frequently occurring class, rather than a mode.

Let's say we ask $10$ people how many pets they have. $2$ people say no pets, $6$ people say one pet and $2$ people say they have two pets. What is the most common number of pets for people to have? In this case, the most common number is one pet, because the largest number of people, which was $6$, had one pet. So the mode of this data set is $1$.

Data can have more than one mode when several outcomes have the same highest frequency. When the data has two or more modes we refer to it as being multimodal, and if it has exactly two modes it is called bimodal.

Note: We can also refer to the general shape of the data as being bimodal if the data has two clear peaks. When talking about the general shape the peaks do not need to be of exactly the same height.
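To see how a data set can have one mode or several, here is a small Python sketch that returns every value sharing the highest frequency. The helper name `find_modes` and the second data set are made up for illustration. The first data set is the pets survey described above.

```python
from collections import Counter


def find_modes(data):
    """Return all values that share the highest frequency, in ascending order."""
    if not data:
        return []
    counts = Counter(data)           # value -> how many times it occurs
    highest = max(counts.values())   # the highest frequency
    return sorted(value for value, count in counts.items() if count == highest)


# Pets survey: 2 people with no pets, 6 with one pet, 2 with two pets.
pets = [0] * 2 + [1] * 6 + [2] * 2
print(find_modes(pets))             # [1]

# A made-up bimodal data set: 3 and 8 both occur twice.
print(find_modes([3, 3, 5, 8, 8]))  # [3, 8]
```

Python's standard library offers `statistics.multimode` for the same purpose. The hand-written version is shown only to make the counting explicit.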

 

Worked example

A statistician organised a set of data into the frequency table shown below. Find the mode of the data.

Score ($x$) | Frequency ($f$)
$10$ | $26$
$20$ | $10$
$30$ | $18$
$40$ | $18$
$50$ | $15$

Think: The mode is the score that occurs most frequently.

Do: The highest number in the frequency column is $26$. This corresponds to the score of $10$, and therefore the mode is $10$.

Reflect: At a glance, it may seem unusual that $10$ is the mode, since the mode measures central tendency, and $10$ is far from being the centre of the numbers that we saw between $10$ and $50$.

The mode measures central tendency, but a different kind of central tendency. It tells us where the data likes to "bunch up", which gives us an approximation for what score we're likely to draw if we sample from the data set.
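The "Think" and "Do" steps of this worked example can be sketched in Python by storing the frequency table as a dictionary. The variable names are made up for illustration.

```python
# Frequency table from the worked example: score -> frequency.
frequency_table = {10: 26, 20: 10, 30: 18, 40: 18, 50: 15}

# The mode is the score (or scores) with the highest frequency.
highest_frequency = max(frequency_table.values())
modes = [score for score, freq in frequency_table.items()
         if freq == highest_frequency]

print(highest_frequency)  # 26
print(modes)              # [10]
```

Collecting the result in a list means the same sketch would report more than one score if two rows tied for the highest frequency.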

Practice questions

Question 3

Find the mode of the following scores:

$8,18,5,2,2,10,8,5,14,14,8,8,10,18,14,5$

  1. Mode = ______

Question 4

Find the mode from the histogram shown.

[Histogram: Scores from $68$ to $72$ on the horizontal axis, Frequency from $5$ to $30$ on the vertical axis]

Outcomes

3.3.1.1

identify the mode from a dataset

3.3.1.2

calculate measures of central tendency, the mean and the median from a dataset
