Box and whisker plots, or box plots, are a great way of displaying numerical data as they clearly show all the quartiles in a data set. Since statisticians are interested in what's "normal," they assume that most values will be somewhere in the middle. The "box" in box and whisker plots indicates the middle half of the data set. Let's look at how box plots give us a clear picture of a data set's central tendency and spread.
We start with a number line that displays the values in our data set. Above that, you'll see that there are two lines or "whiskers" that extend from the box outwards. The two end points of these lines show the maximum (upper extreme) and minimum (lower extreme) values in the data set.
The two vertical edges of the box show the quartiles of the data set. The left-hand edge of the box is the lower quartile (Q1) and the right-hand edge is the upper quartile (Q3).
Remember that the quartiles split the data into four parts, each containing 25\% of the values: from the minimum value to the lower quartile, from the lower quartile to the median, from the median to the upper quartile, and from the upper quartile to the maximum value.
50\% of the values in a data set lie between Q1 and Q3, which is the box portion of the plot. This middle 50\% of the data is often considered to hold the "normal" values.
Finally, the vertical line inside the box shows the median (the middle value), sometimes called Q2.
The box and whisker plot shows a nice summary of all this information:
We can also find the range of a data set, which is the distance between the minimum and maximum values, by subtracting the smallest value from the largest value.
Along the same lines, the interquartile range (IQR) is the distance between the lower and upper quartiles. To find the IQR, subtract Q1 from Q3: IQR = Q3 - Q1.
A list of the minimum, lower quartile, median, upper quartile, and maximum values is often called the five number summary.
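As a minimal sketch of how the range and the IQR fall out of a five number summary, the Python snippet below stores the five values in a small record and does the two subtractions described above. The type name `FiveNumberSummary`, its field names, and the sample values are all made up for illustration; they do not come from a data set in this lesson.

```python
from typing import NamedTuple


class FiveNumberSummary(NamedTuple):
    """The five values a box plot displays (hypothetical helper type)."""
    minimum: float
    q1: float      # lower quartile
    median: float  # sometimes called Q2
    q3: float      # upper quartile
    maximum: float

    @property
    def data_range(self) -> float:
        # Range = maximum - minimum (whisker tip to whisker tip).
        return self.maximum - self.minimum

    @property
    def iqr(self) -> float:
        # IQR = Q3 - Q1 (the length of the box).
        return self.q3 - self.q1


# Illustrative values only.
summary = FiveNumberSummary(minimum=2, q1=5, median=8, q3=11, maximum=15)
print(summary.data_range)  # 13
print(summary.iqr)         # 6
```

On a box plot, the range is the full length from one whisker tip to the other, while the IQR is just the length of the box.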
Creating a box plot:
Put the data in ascending order (from smallest to largest).
Find the median (middle value) of the data.
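To divide the data into quarters, find the median (middle value) of the lower half of the data, between the minimum value and the median, to get the lower quartile. Then find the median of the upper half, between the median and the maximum value, to get the upper quartile.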
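The three steps above can be sketched in Python roughly as follows. The function name `five_number_summary` and the sample data are hypothetical. The sketch also assumes that, when the number of values is odd, the median itself is left out of both halves before the quartiles are found; other conventions exist and can give slightly different quartiles.

```python
from statistics import median


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for at least two values.

    Assumption: when the number of values is odd, the median itself is
    left out of both halves. Other conventions exist.
    """
    values = sorted(data)        # Step 1: ascending order
    n = len(values)
    half = n // 2                # number of values in each half
    lower = values[:half]        # values before the median position
    upper = values[n - half:]    # values after the median position
    # Step 2 is the overall median; step 3 is the median of each half.
    return (values[0], median(lower), median(values), median(upper), values[-1])


# Made-up data set for illustration.
print(five_number_summary([7, 1, 9, 3, 12, 5, 10, 2]))
# (1, 2.5, 6.0, 9.5, 12)
```

Here the ordered data is 1, 2, 3, 5, 7, 9, 10, 12, so the median is halfway between 5 and 7, and each quartile is halfway between the two middle values of its half.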
If there are lots of values in a data set, it may be easier to work out the positions of the median and the upper and lower quartiles in the ordered list, rather than counting through every value.
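One possible way to avoid the counting is to calculate positions directly. In an ordered list of n values, the median sits at position (n + 1) / 2, and the same rule can be applied to each half to locate the quartiles. The sketch below assumes that the median is left out of both halves when n is odd, and the function name `summary_positions` is made up for illustration.

```python
def summary_positions(n):
    """Return the 1-based positions of Q1, the median and Q3 among n ordered values.

    A position ending in .5 means "average the two neighbouring values".
    Assumption: for odd n, the median is excluded from both halves.
    """
    half = n // 2                         # number of values in each half
    median_pos = (n + 1) / 2
    q1_pos = (half + 1) / 2               # median position within the lower half
    q3_pos = (n - half) + (half + 1) / 2  # same offset, shifted past the values before the upper half
    return q1_pos, median_pos, q3_pos


print(summary_positions(9))    # (2.5, 5.0, 7.5)
print(summary_positions(100))  # (25.5, 50.5, 75.5)
```

For 100 ordered values, for example, the lower quartile is the average of the 25th and 26th values, with no need to count inwards from both ends.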
For the following box plot:
Find the lowest value.
Find the highest value.
Find the range.
Find the median.
Find the interquartile range (IQR).
You have been asked to represent this data in a box plot: 20, 36, 52, 56, 24, 16, 40, 4, 28
Complete the table for the given data.
Statistic | Value |
---|---|
Minimum | |
Lower quartile | |
Median | |
Upper quartile | |
Maximum | |
Interquartile range | |
Construct a box plot for the data.
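If you would like to check a hand-drawn answer, the sketch below prints a rough one-line text version of the box plot for this data. It is only an illustration: the function name `text_box_plot` and the 53-column width are arbitrary choices, and the quartiles assume that the median is left out of both halves because the number of values is odd.

```python
from statistics import median

data = [20, 36, 52, 56, 24, 16, 40, 4, 28]  # data set from the question above

values = sorted(data)
n = len(values)
half = n // 2
# Assumption: with an odd count, the median is left out of both halves.
lo, q1, med, q3, hi = (values[0], median(values[:half]), median(values),
                       median(values[n - half:]), values[-1])


def text_box_plot(lo, q1, med, q3, hi, width=53):
    """Draw a rough one-line box plot using `width` character columns.

    Assumes hi > lo, so the scale is well defined.
    """
    def col(x):
        # Map a data value to a column index between 0 and width - 1.
        return round((x - lo) / (hi - lo) * (width - 1))

    row = [" "] * width
    for i in range(col(lo), col(hi) + 1):
        row[i] = "-"          # whiskers
    for i in range(col(q1), col(q3) + 1):
        row[i] = "="          # box
    for x in (lo, hi, q1, q3, med):
        row[col(x)] = "|"     # extremes, box edges and median line
    return "".join(row)


print(lo, q1, med, q3, hi)
print(text_box_plot(lo, q1, med, q3, hi))
```

With a width of 53 columns, each column corresponds to one unit on the number line for this data, so the bars line up with the values in your table.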
The features of a box plot are shown below:
Each quarter of the data set (the section between one quartile and the next) represents 25\% of the data.