A previous lesson defined the quartiles of a data set and showed how to find the first quartile, the median, and the third quartile. Remember that the quartiles give some basic insight into the internal spread of the data, whereas the range only uses the difference between the two extreme data points, the maximum and the minimum. We can use the quartiles in combination with the two extremes of a data set to simplify the data into a five number summary:
A five number summary for a data set consists of: \text{Min},\, Q_1,\, \text{Median},\, Q_3,\, \text{Max}
The five numbers from the five number summary break up a set of scores into four parts with 25\% of the scores in each quartile. Have a look at the diagram here:
So knowing these five key numbers can help us identify regions of 25\%, 50\%, and 75\% of the scores.
The list shows the number of points scored by a basketball team in each game of their previous season: 77,\, 97,\, 96,\, 89,\, 52,\, 99,\, 58,\, 69,\, 96,\, 59,\, 96,\, 55,\, 80,\, 52,\, 68
Sort the data in ascending order.
Find the maximum value.
Find the minimum value.
Find the median value.
Find the lower quartile.
Find the upper quartile.
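The steps above can be sketched in Python. This is a minimal sketch, not the only way to do it: the function name `five_number_summary` is hypothetical, and it assumes the convention where each quartile is the median of one half of the sorted data, with the middle score left out of both halves when the number of scores is odd. Other quartile conventions can give slightly different values for Q_1 and Q_3.

```python
from statistics import median

def five_number_summary(scores):
    """Return (min, Q1, median, Q3, max) for a list of at least two scores.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    excluding the middle score when the number of scores is odd.
    """
    ordered = sorted(scores)          # ascending order
    half = len(ordered) // 2          # size of each half
    lower = ordered[:half]            # scores below the middle position
    upper = ordered[-half:]           # scores above the middle position
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

points = [77, 97, 96, 89, 52, 99, 58, 69, 96, 59, 96, 55, 80, 52, 68]
print(five_number_summary(points))  # (52, 58, 77, 96, 99)
```

With 15 scores, the median is the 8th score in the sorted list, and each half contains 7 scores, so each quartile is the 4th score of its half.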
A five number summary for a data set consists of: \text{Min},\, Q_1,\, \text{Median},\, Q_3,\, \text{Max}
Box plots, sometimes called box-and-whisker plots, are a useful way of getting a quick overview of a numerical data set as they visually display the five number summary of a data set. In particular, a box plot highlights the middle 50\% of the scores in the data set, between Q_1 and Q_3. Box plots provide a clear picture of the central tendency and spread of a set of data.
Start with a number line that covers the full range of values in the data set. Next, plot the values from the five number summary on the number line, and connect them with a box and two lines to create a box plot. Here is an example:
The two vertical edges of the box show the quartiles of the data set. The left-hand side of the box is Q_1 and the right-hand side of the box is Q_3. The vertical line inside the box shows the median (the middle score) of the data.
Then there are two lines that extend from the box outwards. The endpoint of the left line is at the minimum score, while the endpoint of the right line is at the maximum score.
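To make the construction concrete, here is a rough text-only sketch in Python that places the five values on a scaled number line. The function name `text_box_plot` and the characters used are hypothetical choices for illustration, and the summary values are those of the basketball scores from earlier, assuming the median-of-halves convention for the quartiles.

```python
def text_box_plot(minimum, q1, median, q3, maximum, width=48):
    """Return a one-line text box plot, `width` characters wide.

    Assumes maximum > minimum so that the scale is well defined.
    """
    def column(value):
        # Map a score to a character position on the number line.
        return round((value - minimum) * (width - 1) / (maximum - minimum))

    line = ["-"] * width                          # the two lines span min to max
    for i in range(column(q1), column(q3) + 1):
        line[i] = "="                             # the box covers Q1 to Q3
    line[column(minimum)] = "|"                   # endpoint at the minimum
    line[column(maximum)] = "|"                   # endpoint at the maximum
    line[column(q1)] = "["                        # left-hand side of the box
    line[column(q3)] = "]"                        # right-hand side of the box
    line[column(median)] = "|"                    # median line inside the box
    return "".join(line)

# Five number summary of the basketball scores (assumed quartile convention).
plot = text_box_plot(52, 58, 77, 96, 99)
print(plot)
```

Here `[` and `]` mark Q_1 and Q_3, the `|` inside the box marks the median, and the `|` at each end marks the minimum and the maximum.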
For the following box-and-whisker plot:
Find the lowest score.
Find the highest score.
Find the range.
Find the median.
Find the interquartile range (\text{IQR}).
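Both the range and the interquartile range can be read straight off the five numbers: the range is the maximum minus the minimum, and the interquartile range is Q_3 minus Q_1. As an illustration, the short Python sketch below uses the five number summary of the basketball scores from earlier (an assumption, since those quartiles depend on the median-of-halves convention); the variable names are hypothetical.

```python
# Five number summary of the basketball scores: min, Q1, median, Q3, max.
minimum, q1, med, q3, maximum = 52, 58, 77, 96, 99

data_range = maximum - minimum   # spread of all the scores
iqr = q3 - q1                    # spread of the middle 50% of the scores

print(data_range, iqr)  # 47 38
```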
Using the following box-and-whisker plot:
What percentage of scores lie between:
10.9 and 11.2
10.8 and 10.9
11.1 and 11.3
10.9 and 11.3
10.8 and 11.2
In which quartile (or quartiles) is the data the most spread out?
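Questions like these rely on two ideas: each gap between neighbouring values of the five number summary holds about 25\% of the scores, and the widest gap is where the scores are most spread out. The Python sketch below illustrates both, using the five number summary of the basketball scores from earlier rather than the plot in this question. The function names `percent_between` and `widest_quartiles` are hypothetical, and the sketch assumes that both endpoints are values of the summary and that the five values are all different.

```python
def percent_between(summary, low, high):
    """Approximate percentage of scores between two values of the summary."""
    i, j = summary.index(low), summary.index(high)
    return 25 * (j - i)               # each gap holds about 25% of the scores

def widest_quartiles(summary):
    """Return the quartile number(s), 1 to 4, covering the widest interval."""
    widths = [b - a for a, b in zip(summary, summary[1:])]
    return [k + 1 for k, w in enumerate(widths) if w == max(widths)]

summary = (52, 58, 77, 96, 99)        # min, Q1, median, Q3, max
print(percent_between(summary, 58, 96))  # 50
print(percent_between(summary, 52, 96))  # 75
print(widest_quartiles(summary))         # [2, 3]
```

For the basketball scores, the second and third quartiles each span 19 points, so the scores are most spread out in those two quartiles.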
Consider the following data set: 20, \, 36, \, 52, \, 56, \, 24, \, 16, \, 40, \, 4, \, 28
Complete the table for the given data:
Statistic | Value |
---|---|
Minimum | \quad |
Lower quartile | \quad |
Median | \quad |
Upper quartile | \quad |
Maximum | \quad |
Construct a box plot for the data.
Here are the features of a box plot:
The left-hand side of the box is Q_1, and the right-hand side of the box is Q_3.
The vertical line inside the box shows the median (the middle score) of the data.
The endpoint of the left line is at the minimum score, while the endpoint of the right line is at the maximum score.