Sometimes we want to talk about a data set without having to refer to every single result. In other words, we want to summarize the data set to learn more about it and make comparisons. In the last lesson, we introduced the mode, the most frequently occurring data value. In this lesson, we will learn about two measures of center that we can use to summarize numerical data sets.
The mean of a data set is an average of all the data values in the set. Let's look at an example.
Three friends are planning a trip to Alice Springs. They plan to fly there, and discover that the airline imposes a weight limit on their luggage of 20\text{ kg} per person. On the night before the flight they weigh their luggage and find that their luggage weights form this data set: 17,\,18,\,22
One of them has packed too much. They decide to share their luggage around so that they all carry the same amount. How much does each person carry now? Thinking about it mathematically, we are sharing the total luggage equally among three groups. We can represent that as an expression: \dfrac{17+18+22}{3}=\dfrac{57}{3}=19
Each person carries 19\text{ kg}. This amount is the mean of the data set.
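As a quick check of the sharing calculation above, here is a minimal Python sketch. The variable names are illustrative and are not part of the lesson.

```python
# Luggage weights in kilograms, taken from the example above.
luggage_weights = [17, 18, 22]

# The mean is the total shared equally among the number of values.
total = sum(luggage_weights)   # 17 + 18 + 22
count = len(luggage_weights)   # three people
mean_weight = total / count

print(mean_weight)  # 19.0
```

The result matches the 19\text{ kg} that each person carries after the luggage is shared equally.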
To calculate the mean, use the formula: \text{mean}=\dfrac{\text{sum of the data values}}{\text{number of data values}}
Find the mean of this data set: 4,\,7,\,1,\,2,\,3
The mean of a data set is the average of all the data values in the set.
The median of a numerical data set is another measure of center. It is the "middle" value when the values are arranged in order, and the way we find it depends on the number of values in the data set. If there is an odd number of values, the median is the middle value. If there is an even number of values, the median is the number halfway between the middle two values (the mean of those two values).
Because we are finding the middle value, half the values will be greater than or equal to the median, and half will be less than or equal to the median. Let's look at an example.
Seven people were asked about their weekly income, and their responses form this data set: \$300,\,\$400,\,\$400,\,\$430,\,\$470,\,\$490,\,\$2900
The mean of this data set is \dfrac{\$5390}{7}=\$770, but this amount doesn't represent the data set very well because six out of seven people earn much less than this.
Instead we can find the median, which is the middle of the data set. To find the median we remove the biggest and the smallest values: \$400,\,\$400,\,\$430,\,\$470,\,\$490
Then the next biggest and the next smallest: \$400,\,\$430,\,\$470
Then the next biggest and the next smallest: \$430
There is only one number left, and this is the median, so for this data set the median is \$430. This weekly income is much closer to most of the other values in the data set, and summarizes the set better.
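The removal process described above can be sketched in Python. This is only an illustration: the function name `median_by_trimming` is hypothetical, and the sketch assumes the seven weekly incomes from the example.

```python
def median_by_trimming(values):
    """Drop the smallest and largest values until one or two remain."""
    remaining = sorted(values)        # smallest to largest
    while len(remaining) > 2:
        remaining = remaining[1:-1]   # remove the smallest and the biggest
    return remaining

# The seven weekly incomes (in dollars) from the example.
incomes = [300, 400, 400, 430, 470, 490, 2900]

print(sum(incomes) / len(incomes))   # 770.0
print(median_by_trimming(incomes))   # [430]
```

With an odd number of values, a single value is left over and that value is the median. With an even number of values, two values are left over and the median lies halfway between them.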
Find the median of the following values: 11,\,17,\,3,\,14,\,19,\,7
Find the median of the following scores: 3,\,18,\,10,\,19,\,12,\,5,\,6,\,20,\,7
To find the median of a numerical data set, first arrange the values in order from smallest to largest. Then:
If there is an odd number of values, the median is the middle value.
If there is an even number of values, the median is the number halfway between the middle two values.
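The two cases can be sketched as one short Python function. This is a minimal sketch that assumes a non-empty list of numbers. The function name `median` is our own choice, and the two lists are the sets of values and scores given earlier in the lesson.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)          # arrange from smallest to largest
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of values: the median is the middle value.
        return ordered[middle]
    # Even number of values: halfway between the middle two values.
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([3, 18, 10, 19, 12, 5, 6, 20, 7]))  # 10
print(median([11, 17, 3, 14, 19, 7]))            # 12.5
```

Python's standard library provides `statistics.median`, which gives the same results for these lists. The version above is written out by hand only to make the odd and even cases visible.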