Representing data can be tricky because we want it to be easy to interpret without losing any information. Most of the time we are forced to compromise: either we simplify the data so that it is easier to read, or we use a more complex representation that keeps more of the information.
This lesson looks at a couple of the ways in which we can present our data visually to express information, with the trade-off that we need to learn how to read them.
The stem-and-leaf diagram is an example of a more complicated representation that lets us show more information visually. In particular, it is used when we have lots of numerical data points.
A stem-and-leaf diagram is made up of two components, the stem and the leaf. The stem is usually used to represent the tens part of a score while the leaf is used to represent the ones part of the score.
For example, the score $52$ would be expressed on a stem-and-leaf diagram like so:
Stem | Leaf
$5$ | $2$
As we can see, we expressed the score by writing the ones digit in the row corresponding to its tens digit. In other words, we attached the leaf, $2$, to its stem, $5$, to make the score $52$.
What is useful about the stem-and-leaf diagram is that we can record as many scores as we like by writing the leaves in the appropriate rows. As such, we could express the data set:
$52, 46, 31, 57, 49, 51, 52, 30$
with this stem-and-leaf diagram:
Stem | Leaf
$3$ | $0$ $1$
$4$ | $6$ $9$
$5$ | $1$ $2$ $2$ $7$
As we can see, each score has been expressed on the diagram as a ones digit written in its tens row.
Notice that the leaves have been arranged in ascending order from left to right. We need to do this so that we can find the median without jumping back and forth across our rows.
It is also worth noting that if a score appears more than once (in this case $52$ appears twice), each occurrence should have its own leaf.
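The construction described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `stem_and_leaf` and the dictionary layout are assumptions made for this example, and it assumes non-negative whole-number scores whose stem is the number of tens.

```python
from collections import defaultdict


def stem_and_leaf(scores):
    """Group scores by stem (tens) with leaves (ones) in ascending order.

    Hypothetical helper for illustration; assumes non-negative whole numbers.
    """
    rows = defaultdict(list)
    # Sorting first means the leaves in each row end up in ascending order.
    for score in sorted(scores):
        stem, leaf = divmod(score, 10)  # e.g. 52 -> stem 5, leaf 2
        rows[stem].append(leaf)  # repeated scores each get their own leaf
    return dict(rows)


data = [52, 46, 31, 57, 49, 51, 52, 30]
diagram = stem_and_leaf(data)

print("Stem | Leaf")
for stem in sorted(diagram):
    print(f"{stem} | {' '.join(str(leaf) for leaf in diagram[stem])}")
```

Sorting the scores before splitting them is what keeps each row ordered, which mirrors the advice above about arranging leaves from left to right.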
Consider the stem-and-leaf diagram below.
Stem | Leaf
$1$ | $1$ $3$ $7$
$2$ | $0$ $2$ $2$ $5$ $8$ $8$ $8$ $8$ $9$
$3$ | $0$ $1$ $2$ $7$ $7$ $9$ $9$
$4$ | $1$ $2$ $6$
$5$ | $0$ $3$ $7$
$6$ | $1$ $6$
$7$ | $2$ $6$
$8$ | $1$
What are the mode, range, median and mean of the data set represented by this stem-and-leaf diagram?
Think: We can summarise the data using our usual methods, translating the leaves back into scores where necessary.
Do: Since the row with the most leaves has the stem "$2$", the modal class will be the "$20$s". It is also worth noting that the individual score that has the highest frequency is "$28$", which is our mode.
We can find the range by finding the difference between the greatest and least scores. The greatest score is represented by the rightmost leaf in the bottom row. Attaching this leaf to its stem gives us $81$. Similarly, the least score is represented by the leftmost leaf in the top row, which gives us $11$. Subtracting $11$ from $81$ gives us the range:
$\text{Range} = 81 - 11 = 70$
By counting the total number of leaves in the diagram, we find that there are $30$ scores. Since there is an even number of scores, the median will be the average of the two middle scores: the fifteenth and sixteenth scores. By counting the leaves from left to right and top to bottom, we find that the fifteenth score is "$32$" and the sixteenth score is "$37$". Taking the average of these two scores, we get:
$\text{Median} = \frac{32+37}{2} = 34.5$
We can find the mean by adding all the scores together and dividing the sum by the number of scores. After translating all the leaves into scores we can add them up to get a sum of $1161$. Dividing this by the number of scores, $30$, will give us:
$\text{Mean} = \frac{1161}{30} = 38.7$
As such, we can summarise the data expressed in the stem-and-leaf diagram as:
Modal class | $20$s
Mode | $28$
Range | $70$
Median | $34.5$
Mean | $38.7$
Reflect: When interpreting the data in a stem-and-leaf diagram we want to take advantage of the diagram's compact nature to quickly find the mode, modal class, range and median. After doing this, we can translate the leaves back into scores to find the mean.
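The summary statistics from this worked example can be checked with a short Python sketch. The dictionary `diagram` below is simply the diagram from the example typed in by hand, and the variable names are assumptions made for illustration; the calculations use Python's standard `statistics` module.

```python
from statistics import mean, median, multimode

# The worked example's diagram: each stem (tens) maps to its leaves (ones).
diagram = {
    1: [1, 3, 7],
    2: [0, 2, 2, 5, 8, 8, 8, 8, 9],
    3: [0, 1, 2, 7, 7, 9, 9],
    4: [1, 2, 6],
    5: [0, 3, 7],
    6: [1, 6],
    7: [2, 6],
    8: [1],
}

# Translate every leaf back into a score by attaching it to its stem.
scores = [10 * stem + leaf for stem, leaves in diagram.items() for leaf in leaves]

# The modal class is the row with the most leaves.
modal_stem = max(diagram, key=lambda stem: len(diagram[stem]))

summary = {
    "modal class": f"{10 * modal_stem}s",
    "mode": multimode(scores),  # list of the most frequent score(s)
    "range": max(scores) - min(scores),
    "median": median(scores),
    "mean": mean(scores),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Note that only the mean needs every leaf translated into a score; the other values can be read almost directly from the rows, which is the point made in the reflection above.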
A city council selected a number of houses at random. They determined the fastest travel time from each house to the nearest hospital, and produced these results (in minutes):
$25, 37, 16, 27, 27, 35, 21, 18, 19, 49, 14, 19, 31, 42, 18$
Represent this data in an ordered stem-and-leaf plot, with one leaf for each score and commas between each leaf:
Stem | Leaf
$1$ | ____
$2$ | ____
$3$ | ____
$4$ | ____
The following stem-and-leaf plot shows the ages of $20$ employees in a company.
Stem | Leaf
$2$ | $0$ $1$ $1$ $2$ $8$ $8$ $9$
$3$ | $0$ $2$ $4$ $8$ $8$
$4$ | $1$ $1$ $1$ $2$ $5$
$5$ | $3$ $4$ $8$
How many of the employees are in their 30s?
What is the age of the oldest employee?
What is the age of the youngest employee?
What is the median age of the employees?
What is the modal age group?
$20$s
$30$s
$40$s
$50$s
While stem-and-leaf diagrams are used primarily to record two-digit numbers, there are some cases where the stem and leaf might mean something different. It is for this reason that we should always check the key before translating the leaves into scores.
For example, in the stem-and-leaf diagram below the stem represents the whole number of kilometres while the leaf represents tenths of a kilometre, so the leaf "$3$" attached to the "$1$" stem represents $1.3$ kilometres.
Stem | Leaf
$1$ | $3$ $6$ $7$
$2$ | $0$ $2$ $2$ $7$
$3$ | $8$ $9$
There are also cases where the stem represents the number of tens as usual, but uses two-digit numbers in the stem to express three-digit scores. In this case, the score $128$ is represented by the leaf "$8$" attached to the "$12$" stem.
Stem | Leaf
$9$ | $2$ $5$
$10$ | $0$ $3$ $3$ $9$
$11$ | $7$ $8$
$12$ | $3$ $4$ $6$ $8$
$13$ | $1$ $1$ $4$
In both cases, we need the key to tell us how to interpret the stem-and-leaf diagram since the data is different from our usual two-digit scores.
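One way to think about the key is as a single scaling value applied when leaves are translated back into scores. The Python sketch below illustrates this for the two diagrams above; the function `decode` and its `leaf_unit` parameter are hypothetical names introduced for this example, not a standard notation.

```python
def decode(diagram, leaf_unit):
    """Translate stems and leaves back into scores.

    Hypothetical helper: leaf_unit plays the role of the key and gives the
    place value of a leaf (1 for ones, 0.1 for tenths).
    """
    return [
        # Rounding removes tiny floating-point errors such as 1.3000000000000003.
        round((10 * stem + leaf) * leaf_unit, 10)
        for stem, leaves in diagram.items()
        for leaf in leaves
    ]


# Stem = whole kilometres, leaf = tenths of a kilometre.
distances = {1: [3, 6, 7], 2: [0, 2, 2, 7], 3: [8, 9]}
print(decode(distances, leaf_unit=0.1))

# Stem = number of tens (two digits), leaf = ones.
three_digit = {
    9: [2, 5],
    10: [0, 3, 3, 9],
    11: [7, 8],
    12: [3, 4, 6, 8],
    13: [1, 1, 4],
}
print(decode(three_digit, leaf_unit=1))
```

The same stems and leaves give very different scores depending on `leaf_unit`, which is why the key has to be read before anything else.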