Australia VIC
VCE 11 General 2023

1.07 Box plots

Lesson

Five number summary

A previous lesson defined the quartiles of a data set and showed how to find the first quartile, the median, and the third quartile. Quartiles give some basic insight into the internal spread of the data, whereas the range uses only the difference between the two extreme data points, the maximum and the minimum. Combining the quartiles with the two extremes simplifies a data set into a five number summary:

A five number summary for a data set consists of: \text{Min}, Q_1,\,\text{Median},Q_3,\,\text{Max}

The five numbers from the five number summary break up a set of scores into four parts with 25\% of the scores in each quartile. Have a look at the diagram here:

[Figure: a diagram showing the five number summary.]

So knowing these five key numbers can help us identify regions of 25\%, 50\%, and 75\% of the scores.

Examples

Example 1

The list shows the number of points scored by a basketball team in each game of their previous season.

77,\, 97,\, 96,\, 89,\, 52,\, 99,\, 58,\, 69,\, 96,\, 59,\, 96,\, 55,\, 80,\, 52,\, 68

a

Sort the data in ascending order.

Worked Solution
Create a strategy

Arrange the numbers from smallest to largest.

Apply the idea

52, \, 52, \, 55,\, 58, \, 59, \, 68, \, 69, \, 77, \, 80, \, 89, \, 96, \, 96, \, 96, \, 97, \, 99

b

Find the maximum value.

Worked Solution
Create a strategy

Choose the last value when the data has been sorted in ascending order.

Apply the idea
\text{Maximum value}=99
c

Find the minimum value.

Worked Solution
Create a strategy

Choose the first value when the data has been sorted in ascending order.

Apply the idea
\text{Minimum value}=52
d

Find the median value.

Worked Solution
Create a strategy

The median is the middle score, or the average of the two middle scores.

Apply the idea

There are 15 scores, so the middle score will be the eighth score when the scores are in order.

\text{Median}=77
e

Find the lower quartile.

Worked Solution
Create a strategy

Find the middle score of the scores below the median.

Apply the idea

Scores below the median: 52, \, 52, \, 55, \, 58, \, 59, \, 68, \, 69

Q_1=58 (the middle score is 58)
f

Find the upper quartile.

Worked Solution
Create a strategy

Find the middle score of the scores above the median.

Apply the idea

Scores above the median: 80, \, 89, \, 96, \, 96, \, 96, \, 97, \, 99

Q_3=96 (the middle score is 96)
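The steps of Example 1 can be sketched in Python. This is a minimal sketch, not a definitive implementation: the function names `median` and `five_number_summary` are made up for illustration, and the code assumes the convention used in this example, where the median is left out of both halves when there is an odd number of scores. It also assumes the data set has at least two scores.

```python
def median(sorted_scores):
    """Middle score, or the average of the two middle scores."""
    n = len(sorted_scores)
    mid = n // 2
    if n % 2 == 1:
        return sorted_scores[mid]
    return (sorted_scores[mid - 1] + sorted_scores[mid]) / 2


def five_number_summary(scores):
    """Return (min, Q1, median, Q3, max), leaving the median out of both halves."""
    ordered = sorted(scores)           # part (a): ascending order
    n = len(ordered)
    lower = ordered[: n // 2]          # scores below the median position
    upper = ordered[(n + 1) // 2 :]    # scores above the median position
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


# Points scored by the basketball team in Example 1.
points = [77, 97, 96, 89, 52, 99, 58, 69, 96, 59, 96, 55, 80, 52, 68]
summary = five_number_summary(points)
print(summary)  # (52, 58, 77, 96, 99)
```

The printed values match parts (b) to (f): minimum 52, Q_1 58, median 77, Q_3 96, maximum 99.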
Idea summary

A five number summary for a data set consists of: \text{Min}, Q_1,\,\text{Median},Q_3,\,\text{Max}

Box plots

Box plots, sometimes called box-and-whisker plots, are a useful way of getting a quick overview of a numerical data set as they visually display the five number summary of a data set. In particular, a box plot highlights the middle 50\% of the scores in the data set, between Q_1 and Q_3. Box plots provide a clear picture of the central tendency and spread of a set of data.

Start with a number line that covers the full range of values in the data set. Next, plot the values from the five number summary on the number line, and connect them in a certain way to create a box plot. Here is an example:

[Figure: a box plot showing the five number summary of a data set.]

The two vertical edges of the box show the quartiles of the data set. The left-hand side of the box is Q_1 and the right-hand side of the box is Q_3. The vertical line inside the box shows the median (the middle score) of the data.

Then there are two lines that extend from the box outwards. The endpoint of the left line is at the minimum score, while the endpoint of the right line is at the maximum score.
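To make the construction concrete, here is a rough text-only sketch that places the five numbers on a number line, one character per unit. The function `ascii_box_plot` and its symbols are assumptions chosen for illustration (`[` and `]` for the box edges, `|` for the median and whisker ends). It only handles whole-number values, and it uses the five number summary found in Example 1.

```python
def ascii_box_plot(minimum, q1, median, q3, maximum, axis_min, axis_max):
    """Draw a one-line box plot, one character per unit on the number line.

    Assumes all arguments are integers with
    axis_min <= minimum <= q1 <= median <= q3 <= maximum <= axis_max.
    """
    cells = [" "] * (axis_max - axis_min + 1)
    for x in range(minimum, maximum + 1):
        cells[x - axis_min] = "-"      # whiskers span minimum to maximum
    for x in range(q1, q3 + 1):
        cells[x - axis_min] = "="      # box spans Q1 to Q3
    cells[minimum - axis_min] = "|"    # end of the left whisker
    cells[maximum - axis_min] = "|"    # end of the right whisker
    cells[q1 - axis_min] = "["         # left edge of the box
    cells[q3 - axis_min] = "]"         # right edge of the box
    cells[median - axis_min] = "|"     # median line inside the box
    return "".join(cells)


# Five number summary from Example 1, drawn on a number line from 50 to 100.
plot = ascii_box_plot(52, 58, 77, 96, 99, axis_min=50, axis_max=100)
print(plot)
```

The box takes up the stretch between 58 and 96, which is where the middle 50\% of the scores lie.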

Examples

Example 2

For the following box-and-whisker plot:

[Figure: box-and-whisker plot on a number line labelled "score", marked from 0 to 20 in steps of 2.]

a

Find the lowest score.

Worked Solution
Create a strategy

The lowest score is at the end of the left whisker.

Apply the idea

\text{Lowest score}=3

b

Find the highest score.

Worked Solution
Create a strategy

The highest score is at the end of the right whisker.

Apply the idea

\text{Highest score}=18

c

Find the range.

Worked Solution
Create a strategy

The range is the difference between the highest score and the lowest score.

Apply the idea
\text{Range}=18-3 (find the difference of the scores)
=15 (evaluate the subtraction)

d

Find the median.

Worked Solution
Create a strategy

The median is marked by a line between the lower and upper quartile.

Apply the idea

\text{Median}=10

e

Find the interquartile range (\text{IQR}).

Worked Solution
Create a strategy

The interquartile range (\text{IQR}) is the difference between the upper quartile and the lower quartile.

Apply the idea
\text{IQR}=15-7 (find the difference between the quartiles)
=8 (evaluate the subtraction)
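As a quick check of parts (c) and (e), the two subtractions can be written out in Python. The values are the ones read from the box plot in the worked solutions above, and the variable names are chosen only for illustration.

```python
# Values read from the box plot in Example 2.
lowest, q1, median, q3, highest = 3, 7, 10, 15, 18

data_range = highest - lowest   # spread of all the scores
iqr = q3 - q1                   # spread of the middle 50% of the scores

print(data_range, iqr)  # 15 8
```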

Example 3

Using the following box-and-whisker plot:

[Figure: box-and-whisker plot on a number line labelled "Glass Width", marked from 10.7 to 11.4 in steps of 0.1.]

a

What percentage of scores lie between:

  • 10.9 and 11.2

  • 10.8 and 10.9

  • 11.1 and 11.3

  • 10.9 and 11.3

  • 10.8 and 11.2

Worked Solution
Create a strategy

Think about how many quartiles are in that range. One quartile represents 25\% of the data set.

Apply the idea

50\% of scores lie between Q_1 and Q_3. So 50\% of scores lie between 10.9 and 11.2.

25\% of scores lie between the lowest score and Q_1. So 25\% of scores lie between 10.8 and 10.9.

50\% of scores lie between the median and the highest score. So 50\% of scores lie between 11.1 and 11.3.

75\% of scores lie between Q_1 and the highest score. So 75\% of scores lie between 10.9 and 11.3.

75\% of scores lie between the lowest score and Q_3. So 75\% of scores lie between 10.8 and 11.2.
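The counting idea in part (a) can be sketched in Python: each step from one value of the five number summary to the next covers 25\% of the scores. The helper `percent_between` is a made-up name, and it assumes that both endpoints are values of the five number summary and that the five values are all different. The summary values are the ones read from the box plot in the worked solution.

```python
def percent_between(summary, low, high):
    """Percentage of scores between two values of a five number summary."""
    steps = summary.index(high) - summary.index(low)  # number of quarters covered
    return 25 * steps


# Lowest score, Q1, median, Q3, highest score for the glass widths.
glass = [10.8, 10.9, 11.1, 11.2, 11.3]

pairs = [(10.9, 11.2), (10.8, 10.9), (11.1, 11.3), (10.9, 11.3), (10.8, 11.2)]
results = [percent_between(glass, low, high) for low, high in pairs]
print(results)  # [50, 25, 50, 75, 75]
```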

b

In which quartile (or quartiles) is the data the most spread out?

Worked Solution
Create a strategy

Which quartile takes up the longest space on the graph?

Apply the idea

The second quartile is the most spread out.
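One way to check part (b) is to compare the width of each quarter on the number line. The sketch below uses the five values read from the box plot in part (a). Rounding to one decimal place is an assumption that suits this axis, where the values are given to one decimal place.

```python
# Lowest score, Q1, median, Q3, highest score for the glass widths.
glass = [10.8, 10.9, 11.1, 11.2, 11.3]

# Width of each quarter: the distance between neighbouring summary values.
widths = [round(b - a, 1) for a, b in zip(glass, glass[1:])]

# Quarters are numbered 1 to 4 from left to right.
widest = widths.index(max(widths)) + 1

print(widths, widest)  # [0.1, 0.2, 0.1, 0.1] 2
```

The second quarter, from 10.9 to 11.1, is twice as wide as the others, so the same 25\% of the scores is spread over a longer interval.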

Example 4

Consider the following data set: 20, \, 36, \, 52, \, 56, \, 24, \, 16, \, 40, \, 4, \, 28

a

Complete the table for the given data:

Minimum:
Lower quartile:
Median:
Upper quartile:
Maximum:

Worked Solution
Create a strategy

Order the numbers from smallest to largest to find the values of the five number summary.

Apply the idea

Ordered data: 4, \, 16, \, 20, \, 24, \, 28, \, 36, \, 40, \, 52, \, 56

\text{Minimum}=4 (the first score)
\text{Maximum}=56 (the last score)
\text{Median}=28 (the middle score)
Q_1=\dfrac{16+20}{2} (average the middle scores of 4, \, 16, \, 20, \, 24)
=18 (evaluate)
Q_3=\dfrac{40+52}{2} (average the middle scores of 36, \, 40, \, 52, \, 56)
=46 (evaluate)

The completed table is shown:

Minimum: 4
Lower quartile: 18
Median: 28
Upper quartile: 46
Maximum: 56

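The table can be cross-checked with Python's standard `statistics` module. This is only a sketch: software packages use several different quartile conventions, and `statistics.quantiles` with `method="exclusive"` happens to agree with the method of this lesson for this data set. It should not be assumed to agree for every data set.

```python
import statistics

data = [20, 36, 52, 56, 24, 16, 40, 4, 28]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")

print(min(data), q1, q2, q3, max(data))  # 4 18.0 28.0 46.0 56
```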
b

Construct a box plot for the data.

Worked Solution
Create a strategy

Use the five number summary in part (a) to construct the box plot.

Apply the idea
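[Figure: box plot on a number line labelled "Data", marked from 0 to 60 in steps of 10, with the whiskers ending at 4 and 56, the box from 18 to 46, and the median line at 28.]
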
Idea summary

Here are the features of a box plot:

[Figure: a box plot showing the five number summary of a data set.]

  • The left-hand side of the box is Q_1, and the right-hand side of the box is Q_3.

  • The vertical line inside the box shows the median (the middle score) of the data.

  • The endpoint of the left line is at the minimum score, while the endpoint of the right line is at the maximum score.

Outcomes

U1.AoS1.5

construct and interpret graphical displays of data, and describe the distributions of the variables involved and interpret in the context of the data
