
13.04 Stem-and-leaf and dot plots

Lesson

Representing data can be tricky because we want it to be easy to interpret without losing any information. Most of the time we are forced to compromise: either we simplify the data so that it can be displayed simply, or we keep all of the information and accept a display that takes more effort to read.

This lesson looks at a couple of the ways in which we can present our data visually to express information, with the trade-off that we need to learn how to read them.

 

The dot plot

The dot plot is a useful way to express discrete data in a visually simple manner. The main advantages of the dot plot are that we can find the mode and range very easily, as well as quickly see how the data is distributed. The main disadvantages are that we need to count each dot when finding the median and it is often easier to convert to a table to find the mean.

The dot plot is particularly suited to discrete data where the frequency of each result is often greater than one.

Dot plot

In a dot plot, each dot represents one data point belonging to the result that it is placed above.

The mode(s) of a dot plot will be the result(s) with the most dots. Since a dot plot stacks vertically, the highest column(s) will belong to the mode(s).

Let's have a look at an example of a dot plot.

Worked example

example 1

Consider the dot plot below.

What are the mode, range, median and mean of the data set represented by this dot plot?

Think: We have just learned that the mode will be the result with the highest column. We learned how to find the range, median and mean from the lesson about summarising data.

Do: Since the stack of dots above "$1$" is the highest, we know that the mode is "$1$".

The range will be the difference between the highest score, $6$, and the lowest score, $0$. Since $6-0=6$, the range is "$6$".

By counting the total number of dots in the dot plot, we find that there are $19$ dots. Since there is an odd number of scores, the median will be the middle score, in this case the tenth score. By counting the dots moving top to bottom and left to right, we find that the tenth dot is in the "$2$" column. So the median is "$2$".

To find the mean, we want to add all the scores together and then divide that sum by $19$ (the number of scores). Recall that we can add the scores together more easily by multiplying each result by its frequency and then adding those products. For this, we want to convert our dot plot into a frequency table by counting how many dots are in each column.

No. of children | Frequency | $fx$
$0$ | $3$ | $0$
$1$ | $6$ | $6$
$2$ | $4$ | $8$
$3$ | $2$ | $6$
$4$ | $3$ | $12$
$5$ | $0$ | $0$
$6$ | $1$ | $6$

We obtained the values in the third column by multiplying each result by its frequency. Adding up all the numbers in the third column is equivalent to finding the sum of all the scores. So we can find the mean by summing the numbers in the third column and then dividing that sum by $19$. This gives us:

Mean $= \frac{0+6+8+6+12+0+6}{19}$
  $= \frac{38}{19}$
  $= 2$

So our summary of the data in the dot plot would be:

  • Mode = $1$
  • Range = $6$
  • Median = $2$
  • Mean = $2$

Reflect: As we saw, the mode and range are easy to identify on a dot plot, while the median takes a bit more work and the mean takes a lot more work.

The other feature of the data that we can see clearly with the dot plot is that the data is concentrated around "$1$" and "$2$", so it makes sense that our mean, median and mode would be close to these results.
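The summary from example 1 can be checked with a few lines of Python. This is only a sketch: the dictionary `frequencies` is our own transcription of the frequency table above, and the variable names are illustrative rather than part of the lesson.

```python
from statistics import mean, median, multimode

# Frequency table from example 1:
# number of children -> number of dots in that column.
frequencies = {0: 3, 1: 6, 2: 4, 3: 2, 4: 3, 5: 0, 6: 1}

# Expand the table back into a sorted list of individual scores.
scores = [
    value
    for value, count in sorted(frequencies.items())
    for _ in range(count)
]

summary = {
    "mode": multimode(scores),           # a list, since there can be several modes
    "range": max(scores) - min(scores),
    "median": median(scores),
    "mean": mean(scores),
}
print(summary)
```

Expanding the table into individual scores mirrors what the dot plot shows (one dot per score), so the standard library functions can be applied directly.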

The other useful thing about a dot plot is that we can very easily plot data onto it by adding a dot into the relevant column for each score. This is surprisingly easy since it does not involve ordering the data or counting each specific result.
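To illustrate how little work is needed to plot scores, here is a minimal sketch that draws a text dot plot. The function name `text_dot_plot` and the short list of scores are made up for illustration (they are not data from this lesson), and the sketch assumes the scores are non-negative whole numbers.

```python
from collections import Counter


def text_dot_plot(scores):
    """Return a dot plot of non-negative whole-number scores as text."""
    counts = Counter(scores)                 # one tally per result, no sorting needed
    columns = range(min(scores), max(scores) + 1)  # keep results with no dots
    width = len(str(max(scores))) + 1        # room for the widest label
    tallest = max(counts.values())

    rows = []
    for level in range(tallest, 0, -1):      # draw from the top row down
        rows.append("".join(
            ("*" if counts[x] >= level else " ").rjust(width) for x in columns
        ))
    rows.append("".join(str(x).rjust(width) for x in columns))  # axis labels
    return "\n".join(rows)


# Made-up scores, in the order they might have been collected.
print(text_dot_plot([2, 0, 1, 1, 4, 1, 2]))
```

Each score simply adds one to the tally for its column, which is the same as adding one dot to the plot by hand.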

Have a go at this in the practice question below.

Practice question

Question 1

A group of adults is asked: "How old were you when you passed your driving test?".

The responses were:
$22,17,17,17,19,21,17,22,21,18,18,17,18,22,18$

Represent the responses with a dot plot and answer the questions below.


 

  1. What is the range of this data set?

  2. What is the mode of this data set?

  3. What is the median of this data set?

  4. How many people passed their driving test on or after their $19$th birthday?

 

The stem-and-leaf plot

The stem-and-leaf plot is an example of a more complicated display that lets us express more information visually. In particular, the stem-and-leaf plot is used when we have lots of numerical data points.

A stem-and-leaf plot is made up of two components, the stem and the leaf. The stem is usually used to represent the tens part of a score while the leaf is used to represent the ones part of the score.

For example, the score $52$ would be expressed on a stem-and-leaf plot like so:

Stem | Leaf
$5$ | $2$

Key: $5 \mid 2 = 52$

As we can see, we expressed the score by writing the ones digit in the row corresponding to its tens digit. In other words, we attached the leaf, $2$, to its stem, $5$, to make the score $52$.

What is useful about the stem-and-leaf plot is that we can record as many scores as we like by writing the leaves in the appropriate rows. As such, we could express the data set:

$52,46,31,57,49,51,52,30$

with this stem-and-leaf plot:

Stem | Leaf
$3$ | $0$ $1$
$4$ | $6$ $9$
$5$ | $1$ $2$ $2$ $7$

Key: $5 \mid 2 = 52$

As we can see, each score has been expressed on the plot as a ones digit written in its tens row.

Notice that the leaves have been arranged in ascending order from left to right. We need to do this so that we can find the median without jumping back and forth across our rows.

It is also worth noting that if the same score appears more than once, each occurrence should have its own leaf. In this case $52$ appears twice, so there are two "$2$" leaves in the "$5$" row.
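The construction described above can be sketched in Python as follows. The function name `stem_and_leaf` is our own, and the sketch assumes non-negative whole-number scores where the stem is the number of tens and the leaf is the ones digit.

```python
from collections import defaultdict


def stem_and_leaf(scores):
    """Group non-negative whole-number scores into {stem: sorted leaves}."""
    rows = defaultdict(list)
    for score in scores:
        stem, leaf = divmod(score, 10)   # tens part and ones digit
        rows[stem].append(leaf)          # repeated scores each get their own leaf
    # Order the stems, and order the leaves within each row.
    return {stem: sorted(leaves) for stem, leaves in sorted(rows.items())}


data = [52, 46, 31, 57, 49, 51, 52, 30]
for stem, leaves in stem_and_leaf(data).items():
    print(f"{stem} | {' '.join(map(str, leaves))}")
```

For this data set the printed rows match the plot above. Note that this sketch leaves out any stem that has no leaves, whereas a hand-drawn plot would normally keep an empty row in place.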

Worked example

example 2

Consider the stem-and-leaf plot below.

Stem | Leaf
$1$ | $1$ $3$ $7$
$2$ | $0$ $2$ $2$ $5$ $8$ $8$ $8$ $8$ $9$
$3$ | $0$ $1$ $2$ $7$ $7$ $9$ $9$
$4$ | $1$ $2$ $6$
$5$ | $0$ $3$ $7$
$6$ | $1$ $6$
$7$ | $2$ $6$
$8$ | $1$

Key: $5 \mid 2 = 52$

What are the mode, range, median and mean of the data set represented by this stem-and-leaf plot?

Think: We can summarise the data using our usual methods, translating the leaves back into scores where necessary.

Do: Since the row with the most leaves has the stem "$2$", the modal class will be the "$20$s". It is also worth noting that the individual score that has the highest frequency is "$28$", which is our mode.

We can find the range by finding the difference between the greatest and least scores. The greatest score is represented by the rightmost leaf in the bottom row. Attaching this leaf to its stem gives us $81$. Similarly, the least score is represented by the leftmost leaf in the top row, which gives us $11$. Subtracting $11$ from $81$ gives us the range:

Range $= 81-11$
  $= 70$

By counting the total number of leaves in the plot, we find that there are $30$ scores. Since there is an even number of scores, the median will be the average of the two middle scores: the fifteenth and sixteenth scores. By counting the leaves moving left to right and top to bottom, we find that the fifteenth score is "$32$" and the sixteenth score is "$37$". By taking the average of these two scores we get:

Median $= \frac{32+37}{2}$
  $= 34.5$

We can find the mean by adding all the scores together and dividing the sum by the number of scores. After translating all the leaves into scores we can add them up to get a sum of $1161$. Dividing this by the number of scores, $30$, will give us:

Mean $= \frac{1161}{30}$
  $= 38.7$

As such, we can summarise the data expressed in the stem-and-leaf plot as:

  • Mode = $28$
  • Modal class = $20$s
  • Range = $70$
  • Median = $34.5$
  • Mean = $38.7$

Reflect: When interpreting the data in a stem-and-leaf plot we want to take advantage of the plot's compact nature to quickly find the mode, modal class, range and median. After doing this, we can translate the leaves back into scores to find the mean.
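As a check on example 2, the sketch below translates each leaf back into a score and then summarises the data. The dictionary `plot` is our own transcription of the stem-and-leaf plot (stem mapped to its list of leaves), and the variable names are illustrative.

```python
from statistics import mean, median, multimode

# Stem -> leaves, copied from the plot in example 2 (key: 5 | 2 = 52).
plot = {
    1: [1, 3, 7],
    2: [0, 2, 2, 5, 8, 8, 8, 8, 9],
    3: [0, 1, 2, 7, 7, 9, 9],
    4: [1, 2, 6],
    5: [0, 3, 7],
    6: [1, 6],
    7: [2, 6],
    8: [1],
}

# Attach each leaf to its stem: the stem counts tens, the leaf counts ones.
scores = [10 * stem + leaf for stem, leaves in plot.items() for leaf in leaves]

# The modal class is the row with the most leaves.
modal_stem = max(plot, key=lambda stem: len(plot[stem]))

print("number of scores:", len(scores))
print("sum of scores:", sum(scores))
print("mode:", multimode(scores))
print("modal class:", f"{10 * modal_stem}s")
print("range:", max(scores) - min(scores))
print("median:", median(scores))
print("mean:", mean(scores))
```

The mode, modal class, range and median can be read straight from the plot, so the code is mostly useful for the sum and the mean, where adding thirty scores by hand invites slips.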

Practice questions

Question 2

A city council selected a number of houses at random. They determined the fastest travel time from each house to the nearest hospital, and produced these results (in minutes):
$25,37,16,27,27,35,21,18,19,49,14,19,31,42,18$

  1. Represent this data in an ordered stem-and-leaf plot, with one leaf for each score and commas between each leaf:

    Stem | Leaf
    $1$ | ____
    $2$ | ____
    $3$ | ____
    $4$ | ____

    Key: $2 \mid 5 = 25$

Question 3

The following stem-and-leaf plot shows the ages of $20$ employees in a company.

Stem | Leaf
$2$ | $0$ $1$ $1$ $2$ $8$ $8$ $9$
$3$ | $0$ $2$ $4$ $8$ $8$
$4$ | $1$ $1$ $1$ $2$ $5$
$5$ | $3$ $4$ $8$

Key: $1 \mid 2 = 12$

  1. How many of the employees are in their $30$s?

  2. What is the age of the oldest employee?

  3. What is the age of the youngest employee?

  4. What is the median age of the employees?

  5. What is the modal age group?

    A. $20$s
    B. $30$s
    C. $40$s
    D. $50$s

 

Using the key for stem-and-leaf plots

While stem-and-leaf plots are used primarily to display two-digit numbers, there are some cases where the stem and leaf might mean something different. It is for this reason that we should always check the key before translating the leaves into scores.

For example, in the stem-and-leaf plot below the stem represents the whole number of kilometres while the leaf represents tenths of a kilometre.

Stem | Leaf
$1$ | $3$ $6$ $7$
$2$ | $0$ $2$ $2$ $7$
$3$ | $8$ $9$

Key: $2 \mid 7 = 2.7$ km

There are also cases where the stem represents the number of tens as usual, but two-digit stems are used to express three-digit numbers. In this case, the score $128$ is represented by the leaf "$8$" attached to the "$12$" stem.

Stem | Leaf
$9$ | $2$ $5$
$10$ | $0$ $3$ $3$ $9$
$11$ | $7$ $8$
$12$ | $3$ $4$ $6$ $8$
$13$ | $1$ $1$ $4$

Key: $12 \mid 8 = 128$

 

In both cases, we need the key to tell us how to interpret the stem-and-leaf plot since the data is different from our usual two-digit scores.
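One way to think about the key is that it tells us the place value of a leaf. The sketch below uses a hypothetical helper, `decode`, with a parameter `leaf_unit` that we have introduced for illustration: it is `"0.1"` when the key is 2 | 7 = 2.7 km and `"1"` when the key is 12 | 8 = 128. `Decimal` is used so that tenths are represented exactly.

```python
from decimal import Decimal


def decode(stem, leaf, leaf_unit):
    """Turn one stem and leaf into a score, given the place value of a leaf."""
    # The stem is always worth ten times the leaf's place value.
    return (10 * stem + leaf) * Decimal(leaf_unit)


# Key: 2 | 7 = 2.7 km, so each leaf is worth one tenth of a kilometre.
km_plot = {1: [3, 6, 7], 2: [0, 2, 2, 7], 3: [8, 9]}
distances = [decode(s, l, "0.1") for s, leaves in km_plot.items() for l in leaves]
print([str(d) for d in distances])

# Key: 12 | 8 = 128, so each leaf is worth one unit and the stem counts tens.
print(decode(12, 8, "1"))
```

The same plot layout gives very different scores depending on `leaf_unit`, which is why the key has to be read before any leaf is translated.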
