
8.04 Range and interquartile range (IQR)

Introduction

Measures of spread in a numerical data set describe whether the values in a data set are very similar and clustered together, or whether there is a lot of variation in the values and they are very spread out. In this section, we will look at the range and interquartile range as measures of spread.

Range

The range is the simplest measure of spread in a numerical data set. It is the difference between the maximum and minimum values in a data set.

Two bus drivers, Kenji and Bjorn, track how many passengers board their buses each day for a week. Their results are displayed in this table:

| | Mon | Tue | Wed | Thu | Fri |
|---|---|---|---|---|---|
| Kenji | 10 | 13 | 14 | 16 | 11 |
| Bjorn | 2 | 27 | 13 | 5 | 17 |

Both data sets have the same median (13) and the same mean (12.8), but the sets are quite different. To calculate the range, we start by finding the greatest and least number of passengers for each driver:

| | Greatest | Least |
|---|---|---|
| Kenji | 16 | 10 |
| Bjorn | 27 | 2 |

Now we subtract the least from the greatest to find the difference, which is the range:

| | Range |
|---|---|
| Kenji | 16 - 10 = 6 |
| Bjorn | 27 - 2 = 25 |

Notice how Kenji's range is quite small compared to Bjorn's. We might say that Kenji's route is more predictable and that Bjorn's route is much more variable. We can see that the range does not say anything about the size of the values, just their spread.

The range of a numerical data set is the difference between the greatest and the least value in the data set.

\text{Range} = \text{Greatest value} - \text{Least value}
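As a quick check of the bus driver example, the range formula can be sketched in Python. The function name `data_range` is a hypothetical helper chosen for this illustration, and the lists hold the daily passenger counts from the table.

```python
# Daily passenger counts for Monday to Friday, taken from the table.
kenji = [10, 13, 14, 16, 11]
bjorn = [2, 27, 13, 5, 17]


def data_range(values):
    """Return the greatest value minus the least value (hypothetical helper)."""
    return max(values) - min(values)


kenji_range = data_range(kenji)
bjorn_range = data_range(bjorn)

print("Kenji:", kenji_range)  # 16 - 10 = 6
print("Bjorn:", bjorn_range)  # 27 - 2 = 25
```

The data does not need to be sorted first, because only the greatest and least values matter.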

Examples

Example 1

Which of the following data sets has the largest range?

A: 101,\,105,\,118,\,129,\,136
B: 19,\,23,\,25,\,28,\,29
C: 22,\,25,\,43,\,64
D: 104,\,107,\,113,\,120,\,125
Worked Solution
Create a strategy

Use the formula \text{Range} = \text{Greatest value} - \text{Least value}.

Apply the idea

Find the range for each option.

Option A: 136-101=35

Option B: 29-19=10

Option C: 64-22=42

Option D: 125-104=21

The option with the largest difference is C, which has a range of 42.
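The comparison in this example could also be carried out with a short Python sketch. The dictionary name `options` and the variable names are assumptions made for illustration only.

```python
# The four data sets from Example 1, keyed by option letter.
options = {
    "A": [101, 105, 118, 129, 136],
    "B": [19, 23, 25, 28, 29],
    "C": [22, 25, 43, 64],
    "D": [104, 107, 113, 120, 125],
}

# Range of each option: greatest value minus least value.
ranges = {label: max(data) - min(data) for label, data in options.items()}

# The option whose range is largest.
largest = max(ranges, key=ranges.get)

print(ranges)   # {'A': 35, 'B': 10, 'C': 42, 'D': 21}
print(largest)  # C
```

Notice that option C has the largest range even though it has the fewest values and options A and D contain much larger numbers.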

Idea summary

Range is the difference between the greatest and the least value in the data set.

\text{Range} = \text{Greatest value} - \text{Least value}

Interquartile range

To get a better picture of the internal spread in a data set, it is often more useful to find the set's quartiles, from which the interquartile range (IQR) can be calculated.

Quartiles are values at particular locations in the data set. They are similar to the median, but instead of dividing a data set into halves, they divide it into quarters. Let's look at how we would divide some data sets into quarters.

Make sure the data set is ordered before finding the quartiles or the median.

A data set with 8 values. The values are 1, 3, 4, 7, 11, 12, 14, 19.

First locate the median, between the 4\text{th} and 5\text{th} values:

A data set with 8 values. The values are 1, 3, 4, 7, 11, 12, 14, 19, the median is located between 7 and 11.

Now there are four values in each half of the data set, so split each group of four values in half to find the quartiles. We can see that the first quartile, Q_{1}, is between the 2\text{nd} and 3\text{rd} values; that is, there are two values of the lower half on either side of Q_{1}. Similarly, the third quartile, Q_{3}, is between the 6\text{th} and 7\text{th} values:

Values 1, 3, 4, 7, 11, 12, 14, 19. Quartile 1 is between 3 and 4, quartile 3 is between 12 and 14.

To find Q_{1} for this data set, we would need to find the mean of 3 and 4, which is 3.5. And to find Q_{3}, we would find the mean of 12 and 14, which is 13.

Now let's look at a data set with 9 values:

Values 8, 8, 10, 11, 13, 14, 18, 22, 25. Quartile 1 is between 8 and 10, the median is 13, quartile 3 is between 18 and 22.

This time, the 5\text{th} value is the median, and there are four values on either side of it. So Q_{1} is between the 2\text{nd} and 3\text{rd} values and Q_{3} is between the 7\text{th} and 8\text{th} values. Again, we would need to find the mean of the 2\text{nd} and 3\text{rd} values (8 and 10, giving Q_{1}=9) and the mean of the 7\text{th} and 8\text{th} values (18 and 22, giving Q_{3}=20).

Finally, let's look at a set with 10 values:

A data set with values 12, 13, 14, 19, 19, 21, 22, 22, 28, 30. Quartile 1 is 14, the median is between 19 and 21, quartile 3 is 22

For this set, the median is between the 5\text{th} and 6\text{th} values. This time, there are 5 values on either side of the median. So Q_{1} is the 3\text{rd} term and Q_{3} is the 8\text{th} term.
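The three cases above follow the same pattern: find the median, then find the median of each half, leaving the median itself out of the halves when the number of values is odd. A minimal Python sketch of that pattern is shown here. The function name `quartiles` is a hypothetical helper, and the sketch assumes the data set has at least two values. It relies only on `statistics.median` from the standard library.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method.

    Hypothetical helper for illustration; assumes at least two values.
    """
    ordered = sorted(values)      # the data must be ordered first
    n = len(ordered)
    half = n // 2                 # number of values in each half
    lower = ordered[:half]        # values before the median position
    upper = ordered[n - half:]    # values after the median position
    return median(lower), median(ordered), median(upper)


print(quartiles([1, 3, 4, 7, 11, 12, 14, 19]))             # (3.5, 9.0, 13.0)
print(quartiles([8, 8, 10, 11, 13, 14, 18, 22, 25]))       # (9.0, 13, 20.0)
print(quartiles([12, 13, 14, 19, 19, 21, 22, 22, 28, 30])) # (14, 20.0, 22)
```

Statistical software sometimes uses other conventions for locating quartiles, so results from a library function may differ slightly from the method taught in this section.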

The quartiles divide the data set into four sections, each containing approximately 25\% of the data. The least value to the first quartile is approximately 25\% of the data, the first quartile to the median is another 25\%, the median to the third quartile is another 25\%, and the third quartile to the greatest value represents the last 25\% of the data. We can combine these sections; for example, approximately 50\% of the values in a data set lie between the first and third quartiles.

  • Q_{1} is the first quartile (sometimes called the lower quartile). It is the middle value in the bottom half of the data set.

  • Q_{2} is the second quartile, and is usually called the median, which we have already learned about.

  • Q_{3} is the third quartile (sometimes called the upper quartile). It is the middle value in the top half of the data set.

The interquartile range (IQR) is the difference between the third quartile and the first quartile. About 50\% of the values lie within the IQR, because it spans the data from the first quartile to the median and from the median to the third quartile. Since it focuses on the middle 50\% of the data set, the interquartile range often gives a better indication of the internal spread than the range does, and it is less affected by outliers, which are individual values that are unusually high or low.

\text{Interquartile range} = Q_{3}-Q_{1}
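To illustrate why the interquartile range is less affected by unusually high or low values, here is a small Python sketch. It uses the eight-value data set from earlier and a modified copy in which the greatest value, 19, is replaced by 190. The modified set and the helper name `iqr` are assumptions made purely for this illustration.

```python
from statistics import median


def iqr(values):
    """Return Q3 - Q1 using the median-of-halves method (hypothetical helper)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    q1 = median(ordered[:half])                 # median of the lower half
    q3 = median(ordered[len(ordered) - half:])  # median of the upper half
    return q3 - q1


original = [1, 3, 4, 7, 11, 12, 14, 19]
with_outlier = [1, 3, 4, 7, 11, 12, 14, 190]  # hypothetical: 19 replaced by 190

original_range = max(original) - min(original)
outlier_range = max(with_outlier) - min(with_outlier)

print(original_range, iqr(original))      # 18 9.5
print(outlier_range, iqr(with_outlier))   # 189 9.5
```

The single unusual value changes the range from 18 to 189, while the interquartile range stays at 9.5.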

Examples

Example 2

Consider the following set of values: 33,\,38,\,50,\,12,\,33,\,48,\,41

a

Sort the values in ascending order.

Worked Solution
Create a strategy

Arrange the values from smallest to largest.

Apply the idea

12,\,33,\,33,\,38,\,41,\,48,\,50

b

Find the number of values.

Worked Solution
Create a strategy

Count the total number of values.

Apply the idea

\text{Number of values} = 7

c

Find the median.

Worked Solution
Create a strategy

Repeatedly remove the smallest and largest values from the ordered data set until only one number remains.

Apply the idea

Start with the ordered set: 12,\,33,\,33,\,38,\,41,\,48,\,50

Remove the smallest and largest values: 33,\,33,\,38,\,41,\,48

Remove the smallest and largest values again: 33,\,38,\,41

Remove the smallest and largest values once more: 38

38 is the median in the set of values, because it is the middle value in the set.
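The strategy of removing values from both ends can be sketched in Python as follows. The function name `median_by_trimming` is hypothetical. The final averaging step for a data set with an even number of values is an added assumption, since this example only covers the case where one number remains. The sketch also assumes the data set is not empty.

```python
def median_by_trimming(values):
    """Find the median by repeatedly removing the smallest and largest values.

    Hypothetical helper; assumes a non-empty data set.
    """
    remaining = sorted(values)
    # Stop once one value (odd count) or two values (even count) are left.
    while len(remaining) > 2:
        remaining = remaining[1:-1]   # drop the smallest and the largest
    # One value left: that value. Two left: their mean (added assumption).
    return sum(remaining) / len(remaining)


result = median_by_trimming([33, 38, 50, 12, 33, 48, 41])
print(result)  # 38.0
```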

d

Find the first quartile of the set of values.

Worked Solution
Create a strategy

Find the median of the first half of the data set.

Apply the idea

The first half of the values, not including the median, is: 12,\,33,\,33

The median of this set is 33.

So, the first quartile of the original set of values is 33.

e

Find the third quartile of the set of values.

Worked Solution
Create a strategy

Find the median of the second half of the data set.

Apply the idea

The second half of the values, not including the median, is: 41,\,48,\,50

The median of this set is 48.

So, the third quartile of the original set of values is 48.

f

Find the interquartile range.

Worked Solution
Create a strategy

We can use the interquartile range formula: \text{IQR} = Q_{3} - Q_{1}

Apply the idea

\text{IQR} = 48 - 33 \quad \text{(Substitute the quartiles)}

\text{IQR} = 15 \quad \text{(Evaluate)}

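Parts a to f of this example can be retraced in a few lines of Python. This is only a sketch of the steps in the worked solution, and the variable names are chosen for illustration.

```python
from statistics import median

values = [33, 38, 50, 12, 33, 48, 41]

ordered = sorted(values)                  # part a: ascending order
count = len(ordered)                      # part b: number of values
middle = median(ordered)                  # part c: the median

# Halves on either side of the median; with an odd count the median is left out.
lower_half = ordered[:count // 2]
upper_half = ordered[(count + 1) // 2:]

q1 = median(lower_half)                   # part d: first quartile
q3 = median(upper_half)                   # part e: third quartile
iqr = q3 - q1                             # part f: interquartile range

print(ordered)         # [12, 33, 33, 38, 41, 48, 50]
print(count, middle)   # 7 38
print(q1, q3, iqr)     # 33 48 15
```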
Idea summary

Interquartile range is the difference between the third quartile and the first quartile.

\text{Interquartile range} = Q_{3}-Q_{1}

To find the first quartile, find the median of the first half of the data set. To find the third quartile, find the median of the second half of the data set.

Outcomes

6.SP.A.2

Understand that a set of data collected to answer a statistical question has a distribution which can be described by its center, spread, and overall shape.

6.SP.A.3

Recognize that a measure of center for a numerical data set summarizes all of its values with a single number, while a measure of variation describes how its values vary with a single number.

6.SP.B.5

Summarize numerical data sets in relation to their context, such as by:

6.SP.B.5.C

Giving quantitative measures of center (median and/or mean) and variability (interquartile range and/or mean absolute deviation), as well as describing any overall pattern and any striking deviations from the overall pattern with reference to the context in which the data was gathered.
