7. Statistics

Lesson

The dot plot is a useful way to express discrete data in a visually simple manner. The main advantages of the dot plot are that we can find the mode and range very easily, as well as quickly see how the data is distributed. The main disadvantages are that we need to count each dot when finding the median and it is often easier to convert to a table to find the mean.

The dot plot is particularly suited to discrete data where the frequency of a result is often greater than one.

Dot plot

In a dot plot, each dot represents one data point belonging to the result that it is placed above.

The mode(s) of a dot plot will be the result(s) with the most dots. Since the dots stack vertically, the tallest column(s) will belong to the mode(s).
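As a rough illustration of how the columns stack, the Python sketch below prints a text dot plot and picks out the mode(s) as the result(s) with the tallest column. The sample data and the function names `text_dot_plot` and `modes` are made up for illustration and are not part of the lesson.

```python
from collections import Counter


def text_dot_plot(data):
    """Return a text dot plot: one '*' per data point, stacked above its result.

    Assumes whole-number results with a single digit, so the columns line up.
    """
    counts = Counter(data)
    results = range(min(data), max(data) + 1)  # include results with no dots
    tallest = max(counts.values())
    rows = []
    for height in range(tallest, 0, -1):  # build from the top row down
        row = " ".join("*" if counts[r] >= height else " " for r in results)
        rows.append(row.rstrip())
    rows.append(" ".join(str(r) for r in results))  # axis labels
    return "\n".join(rows)


def modes(data):
    """Return every result whose column is the tallest."""
    counts = Counter(data)
    tallest = max(counts.values())
    return sorted(r for r, c in counts.items() if c == tallest)


# Hypothetical data, chosen only to illustrate the idea.
sample = [1, 2, 2, 3, 3, 3, 4]
print(text_dot_plot(sample))
print("Mode(s):", modes(sample))
```

For this sample the column above 3 is the tallest, so 3 is the only mode. If two columns tied for tallest, `modes` would return both results.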

Let's have a look at an example of a dot plot.

Consider the dot plot below.

What are the mode, range, median and mean of the data set represented by this dot plot?

**Think**: We have just learned that the mode will be the result with the tallest column. To find the range, median, and mean, we'll need to find the numerical values of each piece of data displayed in the dot plot.

**Do**: Since the stack of dots above "$1$" is the tallest, we know that the mode is "$1$".

The range will be the difference between the greatest score, $6$, and the least score, $0$. Since $6-0=6$, the range is "$6$".

By counting the total number of dots in the dot plot, we find that there are $19$ dots. Since there is an odd number of scores, the median will be the middle score, in this case the tenth score (since $\frac{19+1}{2}=10$). Counting the dots column by column from left to right, we find that the tenth dot is in the "$2$" column. So the median is "$2$".

To find the mean, we want to add all the scores together and then divide that sum by $19$ (the number of scores). We recall that we can add our scores together more easily by adding the product of each result and its frequency. For this, we want to convert our dot plot into a frequency table by counting how many dots are in each column.

No. of children ($x$) | Frequency ($f$) | $fx$ |
---|---|---|
$0$ | $3$ | $0$ |
$1$ | $6$ | $6$ |
$2$ | $4$ | $8$ |
$3$ | $2$ | $6$ |
$4$ | $3$ | $12$ |
$5$ | $0$ | $0$ |
$6$ | $1$ | $6$ |

We obtained the values in the third column by multiplying each result by its frequency. Adding up all the numbers in the third column is equivalent to finding the sum of all the scores. So we can find the mean by summing the numbers in the third column and then dividing that sum by $19$. This gives us:

$\text{Mean} = \frac{0+6+8+6+12+0+6}{19} = \frac{38}{19} = 2$
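In general, writing $x$ for each result and $f$ for its frequency (the same letters used in the $fx$ column heading), the calculation follows this pattern:

$$\text{Mean} = \frac{\sum fx}{\sum f}$$

Here $\sum f = 19$ is the total number of dots and $\sum fx = 38$ is the sum of all the scores.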

So our summary of the data in the dot plot would be:

- Mode = $1$
- Range = $6$
- Median = $2$
- Mean = $2$

**Reflect**: As we saw, the mode and range are easy to identify on a dot plot, while the median takes a bit more work and the mean takes a lot more work.

The other feature of the data that we can see clearly with the dot plot is that the data is concentrated around "$1$" and "$2$", so it makes sense that our mean, median and mode would be close to these results.
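The four results can be checked with a short script. The Python sketch below takes the frequency table from the worked example and recomputes the mode, range, median and mean. The function name `summarize` is our own choice for illustration and does not come from the lesson.

```python
# Frequency table from the worked example: result -> number of dots.
frequency = {0: 3, 1: 6, 2: 4, 3: 2, 4: 3, 5: 0, 6: 1}


def summarize(freq):
    """Return (modes, range, median, mean) for a result -> frequency table."""
    n = sum(freq.values())  # total number of dots
    tallest = max(freq.values())
    modes = [x for x, f in sorted(freq.items()) if f == tallest]

    observed = [x for x, f in freq.items() if f > 0]  # ignore empty columns
    data_range = max(observed) - min(observed)

    # Expand the table into the sorted list of scores, then take the middle.
    scores = [x for x, f in sorted(freq.items()) for _ in range(f)]
    mid = n // 2
    if n % 2 == 1:
        median = scores[mid]
    else:
        median = (scores[mid - 1] + scores[mid]) / 2

    mean = sum(x * f for x, f in freq.items()) / n  # sum of fx over sum of f
    return modes, data_range, median, mean


print(summarize(frequency))  # ([1], 6, 2, 2.0)
```

Expanding the table into a list of scores mirrors counting the dots one at a time. For a large data set, walking through cumulative frequencies would avoid building the full list.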

The dot plot shows the temperature ($^\circ\text{C}$) in a town over a period of several weeks. Identify the temperature that is an outlier.

A group of adults is asked: "How old were you when you passed your driving test?".

The responses were:

$22,17,17,17,19,21,17,22,21,18,18,17,18,22,18$

Represent the responses with a dot plot and answer the questions below.


What is the range of this data set?

What is the mode of this data set?

What is the median of this data set?

How many people passed their driving test on or after their $19$th birthday?

A supermarket manager takes a note every time an employee is late to work, and how late they were (rounded to the nearest half hour). The dot plot below shows their results for the last month:

What was the median amount of time that employees were late?

What fraction of late employees were later than the median amount?

If an employee is more than $1$ hour late, the manager fines them $\$10$. How much money did the manager collect in fines over the last month?

Display numerical data in plots on a number line, including dot plots, histograms, and box plots.

Summarize numerical data sets in relation to their context, such as by:

- Reporting the number of observations.
- Describing the nature of the attribute under investigation, including how it was measured and its units of measurement.
- Giving quantitative measures of center (median and/or mean) and variability (interquartile range and/or mean absolute deviation), as well as describing any overall pattern and any striking deviations from the overall pattern with reference to the context in which the data was gathered.
- Relating the choice of measures of center and variability to the shape of the data distribution and the context in which the data was gathered.