In 6th grade, we constructed and analyzed data in histograms, box plots, and dot plots. These displays highlight or hide different aspects of a set of data. We will choose appropriate displays and justify our reasoning in this lesson.
Data displays help us summarize, analyze, and interpret a data distribution. The best data displays make useful information easy to read for the intended audience.
Numerical data can be displayed using histograms, box plots, and dot plots.
Histograms display the frequency of data as either a count or relative proportion along the y-axis and divide the numerical data into bins of equal width along the x-axis.
Generally, we include the lower bound and exclude the upper bound, so the equivalent labels for the x-axis would be:
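To see how the "include the lower bound, exclude the upper bound" convention plays out, here is a minimal Python sketch. The function names `bin_labels` and `bin_index` and the starting value, width, and number of bins are made up for illustration; they are not taken from the lesson's figure.

```python
def bin_labels(start, width, n_bins):
    """Return inequality labels for bins that include only the lower bound."""
    labels = []
    for i in range(n_bins):
        low = start + i * width
        high = low + width
        labels.append(f"{low} <= x < {high}")
    return labels


def bin_index(value, start, width):
    """Return which bin a value falls in (0 means the first bin)."""
    return (value - start) // width


# Hypothetical bins: start at 60, width 2, three bins.
labels = bin_labels(60, 2, 3)
print(labels)  # ['60 <= x < 62', '62 <= x < 64', '64 <= x < 66']

# A value of exactly 62 sits on a boundary, so it belongs to the second bin.
print(bin_index(62, 60, 2))  # 1
```

Because the upper bound is excluded, a value that lands exactly on a boundary is counted once, in the bin that starts there.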
Box plots divide data into four sections, each containing about 25% of the data, using the five-number summary: minimum, lower quartile, median, upper quartile, and maximum.
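The five-number summary can be computed with a short Python sketch. This assumes one common classroom convention: the quartiles are the medians of the lower and upper halves of the sorted data, leaving out the middle value when the number of values is odd. Other conventions exist and can give slightly different quartiles. The function name and the sample data are hypothetical.

```python
from statistics import median


def five_number_summary(data):
    """Return (minimum, lower quartile, median, upper quartile, maximum).

    Assumption: quartiles are the medians of the lower and upper halves,
    leaving out the middle value when the number of values is odd.
    Needs at least two values.
    """
    values = sorted(data)
    half = len(values) // 2
    lower_half = values[:half]
    upper_half = values[-half:]
    return (values[0], median(lower_half), median(values),
            median(upper_half), values[-1])


# Hypothetical data set of nine values.
summary = five_number_summary([2, 4, 4, 5, 7, 8, 9, 11, 12])
print(summary)  # (2, 4.0, 7, 10.0, 12)
```

The five numbers mark the ends of the whiskers, the edges of the box, and the line inside the box.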
Dot plots display the frequency of data by the number of dots at each value. This display is best used for discrete values with a small range.
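A dot plot is essentially a tally of how often each value occurs. The sketch below builds a sideways text version, one row per value, assuming whole-number data. The function name `dot_plot_rows` and the sample values are hypothetical.

```python
from collections import Counter


def dot_plot_rows(data):
    """Return one text row per whole-number value, one '*' per occurrence."""
    counts = Counter(data)
    rows = []
    for value in range(min(data), max(data) + 1):
        row = f"{value:>3} | " + "*" * counts[value]
        rows.append(row.rstrip())  # values with no data get an empty row
    return rows


# Hypothetical data: a small set of whole numbers with a small range.
rows = dot_plot_rows([0, 1, 1, 2, 2, 2, 4])
for row in rows:
    print(row)
```

Every individual value stays visible, which is why this display works well only when the range is small enough to give each value its own row.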
The histogram, box plot, and dot plot summarize the heights of 29 students in a class, in inches:
What do you notice about the different displays for the same data?
Histograms, box plots, and dot plots may display the same data, but the different displays have their own strengths and weaknesses.
Neither histograms nor box plots show every individual data value, but histograms will show intervals where there may be gaps or a lower frequency of data. Box plots provide a quick, efficient overall view of the shape, center, and spread of the data if we're not interested in where there may be gaps in the data.
Determine the best type of data display(s) for each set of data:
The number of goals scored in each high school soccer game for the season
The height of all 196 Olympic gymnasts in the most recent summer games
Shown below are the quiz score percentages from Mr. Sanchez's first period math class: \{20, 25, 26, 30, 30, 40, 43, 63, 65, 67, 70, 70, 75, 90, 93 \}
Construct a box plot of the quiz scores.
What are the advantages and disadvantages of a box plot?
Explain whether a dot plot or a histogram could be a better display for the data.
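One way to explore the dot plot versus histogram question for these quiz scores is to check how spread out and how repetitive the values are. The Python sketch below is only an exploration aid, and its function names are illustrative rather than part of the lesson.

```python
from collections import Counter

scores = [20, 25, 26, 30, 30, 40, 43, 63, 65, 67, 70, 70, 75, 90, 93]


def largest_gap(data):
    """Return (gap, low, high) for the widest space between neighboring values."""
    values = sorted(data)
    pairs = zip(values, values[1:])
    low, high = max(pairs, key=lambda pair: pair[1] - pair[0])
    return (high - low, low, high)


def repeated_values(data):
    """Return the values that occur more than once, in increasing order."""
    counts = Counter(data)
    return sorted(value for value, count in counts.items() if count > 1)


print(max(scores) - min(scores))  # 73: the range of the scores
print(len(set(scores)))           # 13: number of different scores
print(repeated_values(scores))    # [30, 70]
print(largest_gap(scores))        # (20, 43, 63)
```

With a range of 73 and only two repeated scores, a dot plot would be mostly single dots spread across a long axis. A histogram with suitable bins could make the empty stretch between 43 and 63 stand out, which the five-number summary alone does not reveal.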
Consider the years of experience of various employees at a company: \{1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 6, 8, 8, 9, 10, 11, 12, 12, 12, 13, 13, 13, 13, 13, 13, 14, 16, 16, 16, 17, 18, 18, 18, 18, 18, 18, 20, 20, 21, 21, 22\}
Construct a histogram of the data. Choose appropriate scales and labels for the axes.
Would another data display represent the data well? Explain.
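For the years-of-experience data, the bin counts behind a histogram can be tallied with a short Python sketch. The bin width of 5 years is an assumption chosen for illustration, and other widths are equally valid. Each bin includes its lower bound and excludes its upper bound.

```python
from collections import Counter

experience = [
    1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 6,
    8, 8, 9, 10, 11, 12, 12, 12, 13, 13, 13, 13, 13, 13, 14,
    16, 16, 16, 17, 18, 18, 18, 18, 18, 18, 20, 20, 21, 21, 22,
]


def histogram_counts(data, width):
    """Return {bin lower bound: count} for whole-number data, bins starting at 0."""
    counts = Counter((value // width) * width for value in data)
    return {low: counts[low] for low in range(0, max(data) + 1, width)}


WIDTH = 5  # assumed bin width, in years
bins = histogram_counts(experience, WIDTH)
for low, count in bins.items():
    # Text version of the histogram: one '#' per employee in the bin.
    print(f"{low:>2} <= years < {low + WIDTH:<2} | " + "#" * count)
```

The bar lengths sum to 50, the size of the data set, which is information a box plot of the same data would not show.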
The best display for a data set is one that reveals the information we want to share. Some displays hide key information like the individual data points, the total number of data points, or features like the shape, clusters, gaps, and spread.
As a starting place, consider:
If there is a small quantity and range of data, try a dot plot.
If the data set is large or has a wide range, try a histogram or box plot.
Choose a box plot if you only need to see an overview of center, spread and shape.
Choose a histogram if, in addition to center, spread, and shape, you want to know the size of the data set and view any gaps or clusters among various intervals.
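These starting points can be written as a small decision function. The sketch below is one possible reading of the guidelines, not a rule from the lesson: the cutoffs for what counts as "small" (`max_count=30` and `max_range=20`) are hypothetical and should be adjusted to the situation.

```python
def suggest_display(data, need_size_or_gaps, max_count=30, max_range=20):
    """Suggest a display type using the lesson's starting-point guidelines.

    max_count and max_range are hypothetical cutoffs for "small" data.
    need_size_or_gaps is True when the audience should see the size of the
    data set or any gaps and clusters, not just an overview.
    """
    value_range = max(data) - min(data)
    if len(data) <= max_count and value_range <= max_range:
        return "dot plot"
    if need_size_or_gaps:
        return "histogram"
    return "box plot"


# Hypothetical small data set with a small range.
print(suggest_display([0, 1, 1, 2, 2, 2, 4], need_size_or_gaps=False))  # dot plot
```

The function only offers a starting place. The final choice still depends on what information we want the display to reveal.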