Data displays are useful in the "organize and represent" and "analyze and communicate results" stages of the data cycle. The best data displays make useful information easy to read for the intended audience.
Numerical data can be displayed using histograms, stem-and-leaf plots, and line plots (dot plots).
Categorical data can be displayed using circle graphs, bar graphs, and dot plots.
The histogram, stem-and-leaf plot, and line plot summarize the heights of 30 students in a class, in inches:
Stem | Leaf |
---|---|
5 | 1\ 4\ 4\ 4\ 7 |
6 | 0\ 0\ 0\ 1\ 1\ 2\ 2\ 2\ 2\ 2\ 3\ 3\ 3\ 3\ 3\ 4\ 4\ 4\ 5\ 5\ 5\ 6\ 7\ 7 |
7 | 1 |
Key 5\vert 1 = 51 inches |
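The stem-and-leaf plot above can be rebuilt from the raw heights by splitting each value into a stem (the tens digit) and a leaf (the ones digit). The following is a minimal Python sketch of that idea. The function name `stem_and_leaf` is made up for illustration, and the list `heights` is read back from the plot.

```python
from collections import defaultdict

# Heights in inches, read back from the stem-and-leaf plot above.
heights = [51, 54, 54, 54, 57,
           60, 60, 60, 61, 61, 62, 62, 62, 62, 62, 63, 63, 63, 63, 63,
           64, 64, 64, 65, 65, 65, 66, 67, 67,
           71]

def stem_and_leaf(values):
    """Group whole numbers by stem (all digits but the last); leaves are the last digits."""
    plot = defaultdict(list)
    for v in sorted(values):          # sorting keeps the leaves in order
        stem, leaf = divmod(v, 10)    # e.g. 51 -> stem 5, leaf 1
        plot[stem].append(leaf)
    return dict(plot)

plot = stem_and_leaf(heights)
for stem, leaves in plot.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Because every leaf is kept, the original 30 values can always be recovered from the plot, which is not true of a histogram.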
What do you notice about the different displays for the same data?
What does the line plot tell you that the histogram does not?
What does the stem-and-leaf tell you that the histogram does not?
What are the benefits of each display? What information is easier to read from one display than from another?
If the data was recorded to the nearest eighth of an inch, would these displays still be helpful? Explain.
Histograms display the frequency of data as either a count or relative proportion along the y-axis and divide the numerical data into bins of equal width along the x-axis.
Generally, each bin includes its lower bound and excludes its upper bound, so the equivalent label for the first bin would be 1\leq \text{Rainfall}\lt 5.
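The convention of including the lower bound and excluding the upper bound can be checked with a short Python sketch. The bin start of 1 and width of 4 are taken from the first bin label above. The helper name `bin_bounds` and the sample rainfall values are assumptions made for illustration.

```python
def bin_bounds(value, start=1, width=4):
    """Return (lower, upper) for the bin containing value, where lower <= value < upper."""
    index = int((value - start) // width)   # floor division picks the bin number
    lower = start + index * width
    return lower, lower + width

# Hypothetical rainfall readings chosen to sit on and around a bin boundary.
for rainfall in [1, 4.9, 5, 8.5]:
    lower, upper = bin_bounds(rainfall)
    print(f"{rainfall} falls in the bin {lower} <= Rainfall < {upper}")
```

A reading of exactly 5 lands in the second bin, not the first, so no value is ever counted twice.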
Advantages | Disadvantages |
---|---|
Good for large quantities of data | Individual data values are lost |
Easy to read off the spread, clusters, and trends | Bin size can affect the conclusion |
We do not need to round when collecting data |
Line plots or dot plots display the frequency of data by the number of dots at each value. This display is best used for countable values with a small range.
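As a rough sketch of how a dot plot is built, the Python below counts how often each height from the earlier class example occurs and prints one mark per data point. The plot is turned on its side so it can be printed as text. The function name `dot_plot` is an assumption made for illustration.

```python
from collections import Counter

# Heights in inches of the 30 students from the earlier example.
heights = [51, 54, 54, 54, 57,
           60, 60, 60, 61, 61, 62, 62, 62, 62, 62, 63, 63, 63, 63, 63,
           64, 64, 64, 65, 65, 65, 66, 67, 67,
           71]

def dot_plot(values):
    """Return one text row per whole number from min to max, with one '*' per data point."""
    counts = Counter(values)   # missing values count as 0, so gaps stay visible
    return [f"{v} | " + "*" * counts[v]
            for v in range(min(values), max(values) + 1)]

for row in dot_plot(heights):
    print(row)

# The most common responses are the values with the tallest stack of marks.
counts = Counter(heights)
most_common = [v for v, c in counts.items() if c == max(counts.values())]
print("Most common:", most_common)
```

Every whole number in the range gets its own row, which is why this display becomes hard to read when the spread is large.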
Advantages | Disadvantages |
---|---|
Simple to create | Difficult to show for large quantities of data |
Can read off the most and least common response | Difficult to show a large number of categories or a large spread |
Can see some clusters in the data |
Stem | Leaf |
---|---|
10 | 6\ 7\ 9 |
11 | 0\ 3\ 3\ 4\ 5\ 5\ 6\ 6 |
12 | 2\ 3\ 5\ 7 |
13 | 1\ 5 |
Key 12\vert 3 = 123 years |
Advantages | Disadvantages |
---|---|
Data is sorted | Difficult to show large quantities of data |
Keeps the original data | Not helpful if the data only varies in the last digit |
Can see some clusters in the data |
A circle graph is useful for showing proportions and parts of a whole for different categories.
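To show how a circle graph represents parts of a whole, the sketch below converts category counts into percentages and central angles. The survey data and the function name `circle_graph_sectors` are hypothetical and are not taken from this lesson.

```python
# Hypothetical survey counts, invented for illustration only.
counts = {"Walk": 6, "Bus": 12, "Car": 9, "Bike": 3}

def circle_graph_sectors(counts):
    """Return {category: (percent of total, central angle in degrees)}."""
    total = sum(counts.values())
    return {category: (100 * n / total, 360 * n / total)
            for category, n in counts.items()}

sectors = circle_graph_sectors(counts)
for category, (percent, angle) in sectors.items():
    print(f"{category}: {percent:.0f}% of responses, {angle:.0f} degree sector")
```

The angles always add up to 360 degrees, so each sector shows a share of the whole and the original counts are not visible unless they are labeled.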
Histograms, line plots, stem-and-leaf plots, and sometimes circle graphs can display the same data, but each display has its own strengths and weaknesses.
Histograms do not show individual data values, but they reveal intervals with gaps or low frequencies and can be used for very large data sets.
Line plots and stem-and-leaf plots show the original data values, but cannot easily represent very large sets of numerical data.
For numerical data, circle graphs can be used only when the data is grouped or when there are only a few possible numerical responses.
Classroom attendance over a month is shown in a histogram and a stem-and-leaf plot. Each data value tells us how many students were present each day.
Stem | Leaf |
---|---|
1 | 9 |
2 | 1\ 1\ 1\ 2\ 2\ 3\ 3\ 3\ 4\ 4\ 4\ 5\ 5\ 5\ 5\ 6\ 6\ 6\ 6\ 7\ 7\ 8\ 8\ 8\ 8\ 9 |
3 | 0 |
Which display shows more clearly how many days had 25\leq \text{ attendance } \lt 27?
Which display shows the day with the highest attendance more clearly?
Which display shows the shape more clearly?
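Because the stem-and-leaf plot keeps every data value, the attendance figures can be read back and checked directly. The Python sketch below does this, assuming the stem is the tens digit and the leaf is the ones digit (the plot has no key, so this reading is an assumption).

```python
# Stems and leaves copied from the attendance stem-and-leaf plot above.
attendance_plot = {
    1: "9",
    2: "1 1 1 2 2 3 3 3 4 4 4 5 5 5 5 6 6 6 6 7 7 8 8 8 8 9",
    3: "0",
}

# Rebuild the daily values: stem is the tens digit, leaf is the ones digit.
attendance = [10 * stem + int(leaf)
              for stem, leaves in attendance_plot.items()
              for leaf in leaves.split()]

days_25_to_26 = sum(1 for a in attendance if 25 <= a < 27)

print("Days recorded:", len(attendance))
print("Days with 25 <= attendance < 27:", days_25_to_26)
print("Highest attendance:", max(attendance))
```

A histogram would give the count for an interval at a glance if its bins match the interval, while the exact highest value can only be read from a display that keeps individual values.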
Determine the best type of data display(s) for each formulated question:
What is a typical number of goals for my high school's soccer team?
How do the heights of the 196 Olympic gymnasts in the most recent summer games compare?
Shown below are the quiz score percentages from Mr. Sanchez's first period math class: \{20, 25, 26, 30, 30, 40, 43, 63, 65, 67, 70, 70, 75, 90, 93 \}
Construct a histogram of the quiz scores with intervals of 15.
Explain whether a dot plot, stem-and-leaf plot, circle graph, or a histogram with a different bin width could be a better display for the data.
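One way to explore the quiz score questions is to count the scores in each bin for two different bin widths. The sketch below uses the scores listed above and assumes the first bin starts at the lowest score, 20. The source does not state where the first bin begins, and the function name `histogram_counts` is made up for illustration.

```python
# Quiz score percentages from Mr. Sanchez's class, as listed above.
scores = [20, 25, 26, 30, 30, 40, 43, 63, 65, 67, 70, 70, 75, 90, 93]

def histogram_counts(values, start, width):
    """Count values in equal-width bins lower <= value < upper, beginning at start."""
    n_bins = (max(values) - start) // width + 1
    counts = [0] * n_bins
    for v in values:
        counts[(v - start) // width] += 1
    return [(start + i * width, start + (i + 1) * width, c)
            for i, c in enumerate(counts)]

# Assumption: the first bin starts at the lowest score, 20.
for width in (15, 10):
    print(f"Bin width {width}:")
    for lower, upper, count in histogram_counts(scores, start=20, width=width):
        print(f"  {lower} <= score < {upper}: " + "#" * count)
```

With a width of 15 every bin contains at least one score, while a width of 10 leaves the bins from 50 to 60 and from 80 to 90 empty, so the narrower bins make the gaps in the data easier to see.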
The best display for a data set is one that reveals the information we want to share. Some displays hide key information like the individual data points, the total number of data points, or features like the shape, clusters, gaps, and spread.
As a starting place, consider the size, range, and quantity of data.
If there is a small quantity and range of data, try a dot plot.
If the data has a large range or quantity, try a histogram.
Choose a histogram if, in addition to center, spread, and shape, you want to know the size of the data set and view any gaps or clusters among various intervals.
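The starting-place guidelines above could be sketched as a simple rule in Python. The lesson does not define what counts as "small" or "large", so the thresholds `max_points=30` and `max_range=20` are illustrative assumptions, as are the function name `suggest_display` and the goals data.

```python
def suggest_display(values, max_points=30, max_range=20):
    """Suggest a display from the quantity and range of numerical data.

    The default thresholds are illustrative assumptions, not fixed rules.
    """
    quantity = len(values)
    data_range = max(values) - min(values)
    if quantity <= max_points and data_range <= max_range:
        return "dot plot"      # small quantity and small range
    return "histogram"         # large range or large quantity

# Hypothetical goals per game for a soccer team.
goals = [0, 1, 1, 2, 2, 2, 3, 4]
# Quiz score percentages from Mr. Sanchez's class.
scores = [20, 25, 26, 30, 30, 40, 43, 63, 65, 67, 70, 70, 75, 90, 93]

print(suggest_display(goals))
print(suggest_display(scores))
```

A rule like this is only a first guess. The final choice still depends on what information we want the audience to see.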
Display | Strengths | Drawbacks |
---|---|---|
Histogram | Easily displays large or spread-out data sets. The shape, mode, and spread are visible. | Cannot see individual data values. The shape can look different depending on the number or size of the bins. |
Circle graph | We can compare the proportions of different categories visually. They are commonly used. | It can be difficult to compare similar groups if they are not labeled. We may lose the original totals if we just show the percentages. |
Dot plot (line plot) | Useful for individual data values. The highest and lowest categories are easy to see. | Does not work well for large data sets. Can be slower to make by hand. |
Stem-and-leaf plot | Useful for individual data values. The highest and lowest values are easy to see. | Does not work well for large data sets. |