
9.04 Review: measures of center and spread

Mean

The mean is the average of the values in the data set. It is a measure of center, meaning it is an approximation of where the middle of a data set is.

Let's think about a situation where three friends are planning a trip to Palm Springs. They plan to fly there, and learn that the airline has a rule: each person can only bring 35 lbs of luggage. On the night before the flight they weigh their luggage and find that the weights form this data set: 29,\,32,\,37

One of them has packed too much. They decide to share their luggage around so that they all carry the same amount. How much does each person carry now?

Thinking about it using more mathematical language, we are sharing the total luggage equally among three people. As a mathematical expression, we find: \dfrac{29+32+37}{3}=\dfrac{98}{3}\approx 32.67

Each person carries about 32.67 lbs. This amount is the mean of the data set.

If we replace every number in a numerical data set with the mean, the sum of the numbers in the data set will be the same. To calculate the mean, use the formula: \text{Mean}=\dfrac{\text{Sum of all numbers}}{\text{How many numbers there are}}
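The luggage calculation can be checked with a few lines of Python. This is only a minimal sketch; the variable names `weights`, `mean`, and `shared` are illustrative and not part of the lesson.

```python
# Luggage weights (in lbs) from the example above.
weights = [29, 32, 37]

# Mean = sum of all numbers / how many numbers there are.
mean = sum(weights) / len(weights)
print(round(mean, 2))  # 32.67

# Replacing every value with the mean leaves the total unchanged
# (up to floating-point rounding).
shared = [mean] * len(weights)
print(abs(sum(shared) - sum(weights)) < 1e-9)  # True
```

After sharing, every bag is under the 35 lb limit, because the mean is below 35.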

Examples

Example 1

Find the mean of the scores: 6,\,14,\,10,\,13,\,5,\,9,\,14,\,15

Give your answer as a decimal.

Worked Solution
Create a strategy

Use the formula \text{Mean}=\dfrac{\text{Sum of scores}}{\text{Number of scores}}

Apply the idea

\begin{aligned}
\text{Mean} &= \dfrac{6+14+10+13+5+9+14+15}{8} && \text{Use the formula} \\
&= \dfrac{86}{8} && \text{Add the numbers in the numerator} \\
&= 10.75 && \text{Perform the division}
\end{aligned}

Idea summary

\text{Mean}=\dfrac{\text{Sum of all numbers}}{\text{How many numbers there are}}

The mean is the average of a data set.

Median

The median is the middle of the data set when ordered least to greatest. It is also a measure of center.

Let's say seven people were asked about their weekly income, and their responses form this data set: \$300,\,\$400,\,\$400,\,\$430,\,\$470,\,\$490,\,\$2900

The mean of this data set is \dfrac{\$5390}{7}=\$770, but this amount doesn't represent the data set very well. Six out of seven people earn much less than this.

Instead we can select the median, which is the middle income. We remove the biggest and the smallest incomes to get: \$400,\,\$400,\,\$430,\,\$470,\,\$490

Then the next biggest and the next smallest to get: \$400,\,\$430,\,\$470

Then the next biggest and the next smallest to get: \$430

There is only one number left, and this is the median, so for this data set the median is \$430. This weekly income is much closer to the other incomes in the data set, and summarizes the set better.
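The "remove the biggest and the smallest" procedure can be sketched in Python as follows. The function name `median_by_trimming` is hypothetical, and the sketch assumes a non-empty list; when two values remain it averages them, which is one reasonable way to finish for an even-sized list.

```python
def median_by_trimming(values):
    """Repeatedly drop the smallest and largest values until one or two remain."""
    remaining = sorted(values)
    while len(remaining) > 2:
        remaining = remaining[1:-1]  # remove the smallest and the largest
    # One value left: that is the median. Two left: take their average.
    return sum(remaining) / len(remaining)

incomes = [300, 400, 400, 430, 470, 490, 2900]
print(median_by_trimming(incomes))   # 430.0
print(sum(incomes) / len(incomes))   # 770.0
```

The two printed values show the contrast described above: the single large income pulls the mean up to 770, while the median stays at 430.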

The median is the number in the middle of a numerical data set.

  • If the list has an odd number of data points, the median is the one right in the center.

  • If the list has an even number of data points, the median is the number halfway between (or the average of) the two middle ones.

In the ordered list, as many numbers sit above the median's position as below it, so at most half of the numbers are bigger than the median and at most half are smaller.
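The odd and even cases above can be sketched by sorting the list and picking from the middle. This is an illustrative sketch, not a definitive implementation: the function name `median` is chosen here for convenience, the list is assumed to be non-empty, and the six-value list is a made-up variation of the income data with the largest income removed.

```python
def median(values):
    """Median of a non-empty list of numbers, using the odd/even rule."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd count: the value right in the center.
        return ordered[middle]
    # Even count: the average of the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2

incomes = [300, 400, 400, 430, 470, 490, 2900]
print(median(incomes))       # 430

six_incomes = incomes[:-1]   # hypothetical: drop the 2900 income
print(median(six_incomes))   # 415.0
```

Python's standard library also provides `statistics.median`, which follows the same rule.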

[Image: two sets of scores ordered from smallest to largest, with their medians marked.]

Examples

Example 2

Find the median of the scores: 3,\,18,\,10,\,19,\,12,\,5,\,6,\,20,\,7

Worked Solution
Create a strategy

We need to put the scores in order and find the middle score.

Apply the idea

The scores in order are: 3,\,5,\,6,\,7,\,10,\,12,\,18,\,19,\,20

The middle score is 10 because it has 4 scores above it and 4 scores below it.

The median of the scores is 10.

Idea summary

The median of a numerical data set is the data value in the middle when the data is ordered from least to greatest.

To find the median of a data set:

  • If the list has an odd number of data points, the median is the one right in the center.

  • If the list has an even number of data points, the median is the number halfway between (the average of) the two middle ones.

Mode

The mode of a data set is the result with the greatest frequency, or the data value that appears most often in the data set. If there are multiple results that share the greatest frequency then there will be more than one mode.

Yvonne asks 15 of her friends what their favorite color is. She writes down their answers. Here is what she wrote down: \text{Blue, Pink, Blue, Yellow, Green, Pink, Pink, Yellow,}\\ \text{ Green, Blue, Yellow, Pink, Yellow, Pink, Pink}

She then counts how many friends chose each color to see which was picked most often.

| Color | Number of friends |
| --- | --- |
| Pink | 6 |
| Green | 2 |
| Blue | 3 |
| Yellow | 4 |

The mode of the data is pink.
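Yvonne's tally can be reproduced with a short Python sketch. It uses `collections.Counter` from the standard library; the variable names are illustrative. Collecting every color that reaches the highest count means the sketch also handles data sets with more than one mode.

```python
from collections import Counter

answers = ["Blue", "Pink", "Blue", "Yellow", "Green", "Pink", "Pink", "Yellow",
           "Green", "Blue", "Yellow", "Pink", "Yellow", "Pink", "Pink"]

# Count how many times each color appears.
counts = Counter(answers)
print(counts["Pink"], counts["Yellow"], counts["Blue"], counts["Green"])  # 6 4 3 2

# The mode is every color that shares the greatest frequency.
highest = max(counts.values())
modes = [color for color, count in counts.items() if count == highest]
print(modes)  # ['Pink']
```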

Examples

Example 3

Thomas conducted a survey on the average number of hours his classmates exercised per day and displayed his data in a table.

| No. of exercise hours | Frequency |
| --- | --- |
| 0 | 2 |
| 1 | 12 |
| 2 | 7 |
| 3 | 5 |
| 4 | 0 |
| 5 | 3 |

What is the mode of the data?

Worked Solution
Create a strategy

Choose the result with the greatest frequency in the data.

Apply the idea

1 hour of exercise is the mode because it has the greatest frequency.
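When the data is already summarized in a frequency table, the mode can be read straight from the frequencies. A minimal Python sketch, assuming the table is stored as a dictionary (the name `frequency` is illustrative):

```python
# Frequency table from Example 3: hours of exercise -> number of classmates.
frequency = {0: 2, 1: 12, 2: 7, 3: 5, 4: 0, 5: 3}

# The mode is the result (or results) with the greatest frequency.
highest = max(frequency.values())
modes = [hours for hours, count in frequency.items() if count == highest]
print(modes)  # [1]
```

Note that the mode is the number of hours (1), not the frequency itself (12).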

Idea summary

The mode of a data set is the result with the greatest frequency. If there are multiple results that share the greatest frequency then there will be more than one mode.

Range

The range is a measure of the spread of a data set from the highest value to the lowest.

Two bus drivers, Kenji and Bjorn, track how many passengers board their buses each day for a week. Their results are displayed in this table:

| | M | T | W | T | F |
| --- | --- | --- | --- | --- | --- |
| Kenji | 10 | 13 | 14 | 16 | 11 |
| Bjorn | 2 | 27 | 13 | 5 | 17 |
Both data sets have the same median (13) and the same mean (12.8), but the sets are quite different. To calculate the range, we start by finding the highest and lowest number of passengers for each driver:

| | Highest | Lowest |
| --- | --- | --- |
| Kenji | 16 | 10 |
| Bjorn | 27 | 2 |

Now we subtract the lowest from the highest to find the difference, which is the range:

| | Range |
| --- | --- |
| Kenji | 16-10=6 |
| Bjorn | 27-2=25 |

Notice how Kenji's range is quite small, at least compared to Bjorn's. We might say that Kenji's route is more predictable and that Bjorn's route is much more variable (is more likely to change).

The range of a numerical data set is the difference between the highest and the lowest data point. \text{Range = Highest data point - Lowest data point}
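The comparison of the two bus routes can be sketched in Python. This is an illustrative check only; the names `passengers` and `summary` are assumptions made for the example, and `statistics.mean` and `statistics.median` come from Python's standard library.

```python
import statistics

# Daily passenger counts, Monday to Friday, from the table above.
passengers = {
    "Kenji": [10, 13, 14, 16, 11],
    "Bjorn": [2, 27, 13, 5, 17],
}

# For each driver, store (mean, median, range).
summary = {}
for driver, counts in passengers.items():
    value_range = max(counts) - min(counts)  # highest minus lowest
    summary[driver] = (statistics.mean(counts), statistics.median(counts), value_range)
    print(driver, *summary[driver])
# Kenji 12.8 13 6
# Bjorn 12.8 13 25
```

The mean and median match for both drivers, so only the range reveals how differently the two routes behave.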

Examples

Example 4

Find the range of the following scores: 10,\,7,\,2,\,14,\,13,\,15,\,11,\,4

Worked Solution
Create a strategy

Use the formula \text{Range} = \text{Highest score} - \text{Lowest score}

Apply the idea

The highest score is 15 and the lowest score is 2.

\begin{aligned}
\text{Range} &= 15-2 && \text{Subtract 2 from 15} \\
&= 13 && \text{Perform the subtraction}
\end{aligned}

Idea summary

\text{Range = Highest data point - Lowest data point}

The range is a measure of how spread apart a data set is from its highest to lowest value.
