Sometimes we want to talk about a data set without having to refer to every single result. In other words, we want to summarise the data set to learn more about it and make comparisons. In the last lesson, we introduced the mode, the most frequently occurring score. In this lesson, we will learn about three more ways we can summarise numerical data sets: the mean, the median and the range.
The mean of a data set is an average score.
Three friends are planning a trip to Alice Springs. They plan to fly there, and discover that the airline imposes a weight limit on their luggage of 20\text{ kg} per person. On the night before the flight they weigh their luggage and find that their luggage weights form this data set: 17,\,18,\,22
One of them has packed too much. They decide to share their luggage around so that they all carry the same amount. How much does each person carry now? Thinking about it using more mathematical language, we are sharing the total luggage equally among three groups. As a mathematical expression, we find: \dfrac{17+18+22}{3}=\dfrac{57}{3}=19
Each person carries 19\text{ kg}. This amount is the mean of the data set.
If we replace every number in a numerical data set with the mean, the sum of the numbers in the data set will be the same. To calculate the mean, use the formula: \text{Mean}=\dfrac{\text{Sum of scores}}{\text{Number of scores}}
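The formula can be checked with a short Python sketch. The function name `mean` and the variable names are chosen here for illustration only; the data is the luggage example from above.

```python
def mean(scores):
    """Return the sum of the scores divided by the number of scores."""
    return sum(scores) / len(scores)


luggage = [17, 18, 22]  # luggage weights in kg, from the example above
average = mean(luggage)
print(average)  # 19.0

# Replacing every score with the mean leaves the total unchanged.
shared = [average] * len(luggage)
print(sum(luggage), sum(shared))  # 57 57.0
```

The last two lines illustrate the sharing idea: three bags of 19 kg weigh the same in total as the original three bags.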
Find the mean of the following scores: 6,\,14,\,10,\,13,\,5,\,9,\,14,\,15
Give your answer as a decimal.
The median of a data set is another kind of average.
Seven people were asked about their weekly income, and their responses form this data set: \$300,\,\$400,\,\$400,\,\$430,\,\$470,\,\$490,\,\$2900
The mean of this data set is \dfrac{\$5390}{7}=\$770, but this amount doesn't represent the data set very well. Six out of seven people earn much less than this.
Instead we can select the median, which is the middle score. So for this data set the median is \$430. This weekly income is much closer to the other scores in the data set, and summarises the set better.
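As a rough check of this comparison, the sketch below uses Python's standard `statistics` module on the income data. The variable names are illustrative assumptions, not part of the lesson.

```python
import statistics

# Weekly incomes in dollars, from the example above.
incomes = [300, 400, 400, 430, 470, 490, 2900]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

# Count how many people earn less than the mean.
below_mean = sum(1 for income in incomes if income < mean_income)

print("mean:", mean_income)
print("median:", median_income)
print("people earning less than the mean:", below_mean)
```

The single income of \$2900 pulls the mean up to \$770, while the median stays at \$430, close to what most of the group earns.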
The median of a numerical data set is the "middle" score once the scores have been arranged in order. How we find it depends on the number of scores in the data set. If there is an odd number of scores, the median is the middle score. If there is an even number of scores, the median is the number halfway between the middle two scores (their mean). In both cases, half the scores lie at or above the median and half lie at or below it.
Find the median of the following scores: 3,\,18,\,10,\,19,\,12,\,5,\,6,\,20,\,7
To find the median of a numerical data set, first arrange the scores in order from lowest to highest. Then:
If there are an odd number of scores, the median will be the middle score.
If there are an even number of scores, the median will be the number in between the middle two scores.
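The two cases can be sketched in Python as follows. This is a minimal illustration, assuming that "the number in between the middle two scores" means the value halfway between them; the function name `median` is chosen here for convenience. The even-sized data set is made up for illustration by dropping the largest income from the earlier example.

```python
def median(scores):
    """Return the middle score of a data set."""
    ordered = sorted(scores)  # the scores must be in order first
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of scores: take the single middle score.
        return ordered[middle]
    # Even number of scores: take the value halfway between the middle two.
    return (ordered[middle - 1] + ordered[middle]) / 2


# Odd number of scores: the weekly incomes from the example above.
print(median([300, 400, 400, 430, 470, 490, 2900]))  # 430

# Even number of scores: an illustrative set of six incomes.
print(median([300, 400, 400, 430, 470, 490]))  # 415.0
```

Sorting inside the function means the scores can be given in any order.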
The range is the difference between the highest and the lowest score in a data set. Unlike the mean and the median, the range doesn't measure the centre of a data set. Instead, it measures how spread out the scores are.
| | Mon | Tue | Wed | Thu | Fri |
|---|---|---|---|---|---|
| Kenji | 10 | 13 | 14 | 16 | 11 |
| Bjorn | 2 | 27 | 13 | 5 | 17 |

| | Highest | Lowest |
|---|---|---|
| Kenji | 16 | 10 |
| Bjorn | 27 | 2 |

| | Range |
|---|---|
| Kenji | 16-10=6 |
| Bjorn | 27-2=25 |
Notice how Kenji's range is quite small, at least compared to Bjorn's. Kenji's route may be more predictable and Bjorn's route may be more variable.
We can see that the range does not say anything about the size of the scores, just their spread.
The range of a numerical data set is the difference between the highest and the lowest score. \text{Range}=\text{Highest score}-\text{Lowest score}
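The range formula can be tried out on Kenji's and Bjorn's scores with a short Python sketch. The function name `score_range` is an illustrative choice, used to avoid clashing with Python's built-in `range`.

```python
def score_range(scores):
    """Return the highest score minus the lowest score."""
    return max(scores) - min(scores)


# Scores for the five days, from the table above.
kenji = [10, 13, 14, 16, 11]
bjorn = [2, 27, 13, 5, 17]

print(score_range(kenji))  # 6
print(score_range(bjorn))  # 25
```

Both results match the table: Kenji's scores are packed closely together, while Bjorn's are far more spread out.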
Find the range of the following scores: 10,\,7,\,2,\,14,\,13,\,15,\,11,\,4