Data displays help with summarizing, analyzing, and interpreting a data distribution. The best data displays make useful information easy to read for the intended audience.
Frequency tables can be used to organize both categorical and numerical data.
Favorite color | Frequency |
---|---|
red | 5 |
blue | 3 |
green | 10 |
yellow | 20 |
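As a rough illustration, a frequency table like the one above can be built by counting how often each value appears in a list of responses. The list `responses` and the helper `frequency_table` are made up for this sketch; the counts were chosen to match the table.

```python
from collections import Counter

# Hypothetical survey responses, chosen to reproduce the counts in the table.
responses = ["red"] * 5 + ["blue"] * 3 + ["green"] * 10 + ["yellow"] * 20


def frequency_table(values):
    """Return a dict mapping each distinct value to the number of times it occurs."""
    return dict(Counter(values))


table = frequency_table(responses)

# Print one row per category, in the order each category first appeared.
for color, count in table.items():
    print(f"{color:<8}{count}")
```

The same function works for numerical data, since it only counts repeated values.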
Categorical data can be displayed using circle graphs or bar graphs.
Numerical data can be displayed using histograms, box plots, line plots, and stem-and-leaf plots.
Histograms and box plots are good for large quantities of data. The shape, center, and spread of the data are easy to read, but the individual data values are lost in these types of displays.
Histograms display the frequency of data as either a count or relative proportion along the y-axis and divide the numerical data into bins of equal width along the x-axis.
When constructing a histogram:
There should be no gaps between the bars
The bin range includes the value on the left side and excludes the value on the right side
All bins should be the same width visually and numerically
Generally we include the lower bound and exclude the upper bound, so a bin labeled 10 to 20 holds values from 10 up to, but not including, 20.
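The binning rule can be sketched in code. This minimal example assumes equal-width bins where each bin includes its left edge and excludes its right edge; the function `histogram_counts` and the `scores` data are hypothetical.

```python
def histogram_counts(data, start, width, num_bins):
    """Count values per bin, where bin i covers [start + i*width, start + (i+1)*width)."""
    counts = [0] * num_bins
    for x in data:
        # Floor division places a value that sits exactly on an edge in the bin to its right.
        index = int((x - start) // width)
        if 0 <= index < num_bins:  # values outside the overall range are ignored
            counts[index] += 1
    return counts


# Hypothetical quiz scores.
scores = [50, 55, 60, 62, 69, 70, 70, 85, 90, 99]
counts = histogram_counts(scores, start=50, width=10, num_bins=5)

# Text version of the histogram: one row per bin, one '#' per data value.
for i, count in enumerate(counts):
    low = 50 + i * 10
    print(f"[{low}, {low + 10}): {'#' * count}")
```

Here the score 60 is counted in the 60 to 70 bin, not the 50 to 60 bin, because the left edge is included and the right edge is excluded.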
Box plots divide data into four sections, each containing about one quarter of the values, using the five-number summary: minimum, lower quartile, median, upper quartile, and maximum.
When constructing a box plot:
The box in the center spans from the lower quartile to the upper quartile, showing the middle half of the data and its spread
The line in the box represents the median
The two endpoints represent the minimum and maximum values
Since every individual data point must be included in a line plot and stem-and-leaf plot, these data displays are best for smaller data sets.
Line plots display the frequency of data by the number of dots at each value. This display is best used for discrete values with a small range.
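A text-only sketch of a line plot shows how each data value becomes one dot stacked above its position on the number line. It assumes small, single-digit whole-number data so the columns stay aligned; the function `line_plot` and the `siblings` data are hypothetical.

```python
from collections import Counter


def line_plot(data):
    """Return a text line plot: one 'o' per data value, stacked above a number line.

    Assumes single-digit whole numbers so that each column is one character wide.
    """
    counts = Counter(data)
    values = range(min(data), max(data) + 1)  # every position on the number line
    rows = []
    # Build the rows from the tallest stack down to the first row of dots.
    for level in range(max(counts.values()), 0, -1):
        row = " ".join("o" if counts[v] >= level else " " for v in values)
        rows.append(row.rstrip())
    rows.append(" ".join(str(v) for v in values))  # the number line itself
    return "\n".join(rows)


# Hypothetical data: number of siblings for eight students.
siblings = [0, 1, 1, 2, 2, 2, 3, 5]
print(line_plot(siblings))
```

Values with no data, such as 4 in this example, still appear on the number line so that gaps in the distribution stay visible.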
When constructing a line plot:
Determine the best type of data display(s) for each set of data.
The number of classmates who like tacos, pizza, or cheeseburgers the most, collected to determine the popularity of each meal.
The heights of all 196 Olympic gymnasts in the most recent summer games.
Use the data display to answer each question.
Estimate the median score from the math quiz.
Estimate the range and interquartile range of the data.
What are the disadvantages of this box plot?
What are the advantages of a box plot?