
11.01 Quartiles

Lesson

Introduction

Measures of spread in a numerical data set seek to describe whether the scores in a data set are very similar and clustered together, or whether there is a lot of variation in the scores and they are very spread out.

There are several methods to describe the spread of data, which vary in complexity. We can simply look at the numerical range of the entire data set (the difference between the largest and smallest values), or we can break the data into chunks to examine the range of smaller sections within the data.

Remember, the range changes only if the highest or lowest score in a data set changes; otherwise it stays the same. This means that a single outlier in the data can significantly affect the range.

In this lesson, we will look at an alternative to the range called the interquartile range.

Quartiles

Whilst the range is very simple to calculate, it is based on only two numbers in the data set, so it does not tell us about the spread of the data between these two values. To get a better picture of the internal spread in a data set, it is often more useful to find the set's quartiles, which are used to calculate a measure of spread called the interquartile range (IQR).

Quartiles are scores at particular locations in the data set, similar to the median. Instead of dividing a data set into halves, they divide it into quarters. Let's look at how we would divide up some data sets into quarters now.

Make sure the data set is ordered before finding the quartiles or the median.

Here is a data set with 8 scores: 1, 3, 4, 7, 11, 12, 14, 19.

First locate the median, which lies between the 4\text{th} and 5\text{th} scores (between 7 and 11).

Now there are four scores in each half of the data set, so split each group of four scores in half to find the quartiles. Quartile 1 is between 3 and 4, and quartile 3 is between 12 and 14.

We can see that the first quartile, Q_{1}, is between the 2\text{nd} and 3\text{rd} scores. Similarly, the third quartile, Q_{3}, is between the 6\text{th} and 7\text{th} scores.

Now let's look at a situation with 9 scores:

Scores 8, 8, 10, 11, 13, 14, 18, 22, 25. Quartile 1 is between 8 and 10, the median is 13, quartile 3 is between 18 and 22.

This time, the 5\text{th} score is the median. There are four scores on either side of the median. So Q_{1} is still between the 2\text{nd} and 3\text{rd} scores, but Q_{3} is now between the 7\text{th} and 8\text{th} scores.

Finally, let's look at a set with 10 scores:

Scores 12, 13, 14, 19, 19, 21, 22, 22, 28, 30. Quartile 1 is 14, the median is between 19 and 21, quartile 3 is 22

For this set, the median is between the 5\text{th} and 6\text{th} scores. This time, however, there are 5 scores on either side of the median. So Q_{1} is the 3\text{rd} term and Q_{3} is the 8\text{th} term.
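The three cases above can be checked with a short Python sketch. It assumes the method used in this lesson: split the ordered data at the median, leave the median out of both halves when the number of scores is odd, and take the median of each half. The function name `quartiles` is our own choice for illustration.

```python
from statistics import median


def quartiles(scores):
    """Return (Q1, median, Q3) using the median-of-halves method."""
    data = sorted(scores)        # quartiles only make sense on ordered data
    n = len(data)
    half = n // 2                # number of scores in each half
    lower = data[:half]          # scores before the median position
    upper = data[n - half:]      # scores after the median position
    # For odd n, the middle score belongs to neither half.
    return median(lower), median(data), median(upper)


for scores in (
    [1, 3, 4, 7, 11, 12, 14, 19],
    [8, 8, 10, 11, 13, 14, 18, 22, 25],
    [12, 13, 14, 19, 19, 21, 22, 22, 28, 30],
):
    print(len(scores), "scores:", quartiles(scores))
```

For the set of 8 scores this gives Q_{1}=3.5 and Q_{3}=13, which sit between 3 and 4 and between 12 and 14, as described above.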

The quartiles divide the data set into four sections, each containing approximately 25\% of the scores. The least score to the first quartile is approximately 25\% of the data, the first quartile to the median is another 25\%, the median to the third quartile is another 25\%, and the third quartile to the greatest score represents the last 25\% of the data. We can combine these sections together: for example, approximately 50\% of the scores in a data set lie between the first and third quartiles.

Quartiles can also be described using percentiles. A percentile is the value below which a given percentage of the observations in a group of observations fall. For example, if a score is at the 75\text{th} percentile in a statistical test, approximately 75\% of all scores are lower than it. The median represents the 50\text{th} percentile, or the halfway point in a data set.

Examples

Example 1

Here are Ray's scores from his last 13 rounds of golf played: 66,\,66,\,68,\,68,\,70,\,78,\,80,\,84,\,106,\,116,\,126,\,130,\,132

a

What is his median?

Worked Solution
Create a strategy

Use the formula: \left(\dfrac{n+1}{2}\right)\text{th} where n is the total number of scores, to find the position of the median.

Apply the idea

There are n=13 scores in the list.

\begin{aligned}
\text{Position of median} &= \dfrac{13+1}{2} && \text{Substitute } n=13 \\
&= 7\text{th} && \text{Evaluate} \\
\text{Median} &= 80 && \text{Choose the } 7\text{th} \text{ score}
\end{aligned}
b

What is the lower quartile?

Worked Solution
Create a strategy

Find the median of the lower half of the scores, not including the overall median.

Apply the idea

The lower half of the scores are: 66,\,66,\,68,\,68,\,70,\,78.

\begin{aligned}
\text{Lower quartile} &= \dfrac{68+68}{2} && \text{Average the middle scores} \\
&= 68 && \text{Evaluate}
\end{aligned}
c

What is the upper quartile?

Worked Solution
Create a strategy

Find the median of the upper half of the scores, not including the overall median.

Apply the idea

The upper half of the scores are: 84,\,106,\,116,\,126,\,130,\,132.

\begin{aligned}
\text{Upper quartile} &= \dfrac{116+126}{2} && \text{Average the middle scores} \\
&= 121 && \text{Evaluate}
\end{aligned}
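As a check on Example 1, here is a minimal Python sketch that follows the same steps: find the position of the median, split the scores into a lower and an upper half without the median, and take the median of each half. The variable names are our own, and the sketch assumes the scores are already in order and that n is odd, as it is here.

```python
from statistics import median

# Ray's scores from Example 1, already in ascending order
scores = [66, 66, 68, 68, 70, 78, 80, 84, 106, 116, 126, 130, 132]
n = len(scores)

# n is odd, so (n + 1) / 2 is a whole number
position = (n + 1) // 2
median_score = scores[position - 1]   # Python lists are indexed from 0

lower_half = scores[:position - 1]    # scores before the median
upper_half = scores[position:]        # scores after the median

lower_quartile = median(lower_half)
upper_quartile = median(upper_half)

print("Position of median:", position)
print("Median:", median_score)
print("Lower quartile:", lower_quartile)
print("Upper quartile:", upper_quartile)
```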
Idea summary
  • Q_{1} is the first quartile (sometimes called the lower quartile). It is the middle score in the bottom half of the data set, and it represents the 25\text{th} percentile.

  • Q_{2} is the second quartile, and is usually called the median. It represents the 50\text{th} percentile of the data set.

  • Q_{3} is the third quartile (sometimes called the upper quartile). It is the middle score in the top half of the data set, and represents the 75\text{th} percentile.

Interquartile range

The interquartile range (IQR) is the difference between the third quartile and the first quartile. Approximately 50\% of the scores lie between Q_{1} and Q_{3}, because this interval contains the scores between the first quartile and the median, as well as those between the median and the third quartile.

Since it focuses on the middle 50\% of the data set, the interquartile range often gives a better indication of the internal spread than the range does. It is also less affected by outliers, which are individual scores that are unusually high or low.

Subtract the first quartile from the third quartile. That is, \text{IQR} = Q_{3}-Q_{1}
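To see why the interquartile range is less affected by outliers than the range, the Python sketch below compares the two measures on the set of 9 scores from earlier in this lesson, and then on a copy where the greatest score has been changed to 250. The changed score is a made-up value for illustration only, and the function names `iqr` and `value_range` are our own.

```python
from statistics import median


def iqr(scores):
    """Interquartile range, Q3 - Q1, using the median-of-halves method."""
    data = sorted(scores)
    n = len(data)
    half = n // 2
    q1 = median(data[:half])        # median of the lower half
    q3 = median(data[n - half:])    # median of the upper half
    return q3 - q1


def value_range(scores):
    """Range: greatest score minus least score."""
    return max(scores) - min(scores)


original = [8, 8, 10, 11, 13, 14, 18, 22, 25]
with_outlier = [8, 8, 10, 11, 13, 14, 18, 22, 250]  # hypothetical outlier

print("Range:", value_range(original), "->", value_range(with_outlier))
print("IQR:  ", iqr(original), "->", iqr(with_outlier))
```

The range jumps from 17 to 242, while the interquartile range stays at 11, because the changed score lies outside the middle 50\% of the data.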

Examples

Example 2

Answer the following given the frequency table:

Score    Frequency
5        1
14       1
18       3
24       2
32       1
38       2
50       5
a

Find the number of scores.

Worked Solution
Create a strategy

Add all the frequencies.

Apply the idea
\begin{aligned}
\text{Number of scores} &= 1+1+3+2+1+2+5 && \text{Find the sum} \\
&= 15 && \text{Evaluate the addition}
\end{aligned}
b

Find the median.

Worked Solution
Create a strategy

Use the formula: \left(\dfrac{n+1}{2}\right)\text{th} where n is the total number of scores, to find the position of the median.

Apply the idea

There are 15 scores in the list.

\begin{aligned}
\text{Position of median} &= \dfrac{15+1}{2} && \text{Substitute } n=15 \\
&= 8\text{th} && \text{Evaluate}
\end{aligned}

Using the frequency table, we can see that the 8\text{th} score is 32.

\text{Median}=32

c

Find the lower quartile of the set of scores.

Worked Solution
Create a strategy

Find the median of the lower half of the scores.

Apply the idea

There are 7 scores in the lower half, excluding the median. The lower quartile will be the 4\text{th} score from the frequency table.

\text{Lower quartile}=18

d

Find the upper quartile of the set of scores.

Worked Solution
Create a strategy

Find the median of the upper half of the scores.

Apply the idea

There are 7 scores in the upper half, excluding the median. The upper quartile will be the 4\text{th} score in the upper half, which is the (8+4)\text{th}=12\text{th} score overall.

\text{Upper quartile}=50

e

Find the interquartile range.

Worked Solution
Create a strategy

Use the interquartile range formula: \text{IQR} = Q_{3} - Q_{1}

Apply the idea
\begin{aligned}
\text{IQR} &= 50-18 && \text{Substitute the quartiles} \\
&= 32 && \text{Evaluate}
\end{aligned}
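The working in Example 2 can be sketched in Python by expanding the frequency table into an ordered list of individual scores and then applying the same median-of-halves method. The dictionary layout and variable names below are our own choices for illustration.

```python
from statistics import median

# Frequency table from Example 2: score -> frequency
frequency_table = {5: 1, 14: 1, 18: 3, 24: 2, 32: 1, 38: 2, 50: 5}

# Write out each score as many times as its frequency, in ascending order
scores = [
    score
    for score, frequency in sorted(frequency_table.items())
    for _ in range(frequency)
]

n = len(scores)                         # total number of scores
half = n // 2                           # scores in each half, without the median

q2 = median(scores)                     # the median
q1 = median(scores[:half])              # median of the lower half
q3 = median(scores[n - half:])          # median of the upper half
interquartile_range = q3 - q1

print("Number of scores:", n)
print("Median:", q2)
print("Lower quartile:", q1)
print("Upper quartile:", q3)
print("IQR:", interquartile_range)
```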

Example 3

Consider the dot plot below:

A dot plot with scores ranging from 4 to 19.
a

Find the total number of scores.

Worked Solution
Create a strategy

Count the number of dots in the dot plot.

Apply the idea
\begin{aligned}
\text{Number of scores} &= 15 && \text{Count the dots}
\end{aligned}
b

Find the median.

Worked Solution
Create a strategy

Use the formula: \left(\dfrac{n+1}{2}\right)\text{th} where n is the total number of scores, to find the position of the median.

Apply the idea
\begin{aligned}
\text{Position of median} &= \dfrac{15+1}{2} && \text{Substitute } n=15 \\
&= 8\text{th} && \text{Evaluate the division}
\end{aligned}

The 8\text{th} score in the dot plot is 15.

\text{Median}=15

c

Find the lower quartile of the set of scores.

Worked Solution
Create a strategy

Find the median of the lower half of the scores.

Apply the idea

There are 7 scores in the first half, excluding the median. The lower quartile will be the 4\text{th} score in the dot plot.

\text{Lower quartile}=9

d

Find the upper quartile of the set of scores.

Worked Solution
Create a strategy

Find the median of the upper half of the scores.

Apply the idea

There are 7 scores in the second half, excluding the median. The upper quartile will be the (8+4)\text{th}=12\text{th} score in the dot plot.

\text{Upper quartile}=16

e

Find the interquartile range.

Worked Solution
Create a strategy

Use the interquartile range formula: \text{IQR} = Q_{3} - Q_{1}

Apply the idea
\begin{aligned}
\text{IQR} &= 16-9 && \text{Substitute the quartiles} \\
&= 7 && \text{Evaluate}
\end{aligned}
Idea summary

To calculate the interquartile range:

\text{IQR}=Q_{3}-Q_{1}

  • \text{IQR} is the interquartile range

  • Q_{1} is the first quartile

  • Q_{3} is the third quartile

Outcomes

VCMSP349

Determine quartiles and interquartile range and investigate the effect of individual data values, including outliers on the interquartile range
