10.02 Calculating centre and spread

Measures of centre

The mean is often referred to as the average. To calculate the mean, add all the scores in a data set, then divide the sum by the number of scores.

To find the mean from a graphical representation, we can use a frequency table to list the values shown on the graph. Consider the histogram below:

The image shows a histogram for scores 1 to 5.

We can construct a frequency table like the one below:

\begin{array}{c|c|c}
\text{Score }(x) & \text{Frequency }(f) & xf \\
\hline
1 & 3 & 3 \\
2 & 8 & 16 \\
3 & 5 & 15 \\
4 & 3 & 12 \\
5 & 1 & 5
\end{array}

The mean is calculated by dividing the sum of the last column by the sum of the second column. The last column sums to 3+16+15+12+5=51 and the frequency column sums to 3+8+5+3+1=20, so the mean is \dfrac{51}{20}=2.55.
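The same calculation can be checked with a short Python sketch. The variable names (`frequency_table`, `total_scores`, `total_xf`) are illustrative and not part of the lesson.

```python
# Frequency table read from the histogram: score -> frequency
frequency_table = {1: 3, 2: 8, 3: 5, 4: 3, 5: 1}

# Sum of the frequency column: the total number of scores
total_scores = sum(frequency_table.values())

# Sum of the xf column: each score multiplied by its frequency
total_xf = sum(score * freq for score, freq in frequency_table.items())

mean = total_xf / total_scores
print(total_xf, total_scores, mean)  # 51 20 2.55
```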

The median is one way of describing the middle or the centre of a data set using a single value. The median is the middle score in a data set.

Suppose we have five numbers in our data set: 4,\,11,\,15,\,20, and 24.

The median would be 15 because it is the value right in the middle. There are two numbers on either side of it. 4,\,11,\,15,\,20,\,24

If we have an even number of terms, we need to find the average of the middle two terms. Suppose we want to find the median of the set 2,\,3,\,6,\,9. We want the value halfway between 3 and 6. The average of 3 and 6 is \dfrac{3+6}{2}=\dfrac{9}{2}, or 4.5, so the median is 4.5.

2,\,3,\,4.5,\,6,\,9
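Both cases can be sketched in a few lines of Python. The function name `median` and its structure are assumptions made for illustration; the sketch also assumes the list is not empty.

```python
def median(scores):
    """Return the middle score, or the mean of the two middle scores."""
    ordered = sorted(scores)  # the scores must be in order first
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of scores: a single middle score exists
        return ordered[middle]
    # Even number of scores: average the two middle scores
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([4, 11, 15, 20, 24]))  # 15
print(median([2, 3, 6, 9]))         # 4.5
```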

If we have a larger data set, however, we may not be able to see right away which term is in the middle. We can use the "cross out" method.

Once a data set is ordered, we can cross out numbers in pairs (one high number and one low number) until there is only one number left.

Here is a data set with nine numbers: 1,\,1,\,3,\,5,\,7,\,9,\,9,\,10,\,15.

1. Check that the data is sorted in ascending order (i.e. in order from smallest to largest).

The image shows a set of numbers, 1,1,3,5,7,9,9,10, and 15, with 15 and 1 crossed out.

2. Cross out the smallest and the largest number.

The image shows a set of numbers, 1,1,3,5,7,9,9,10, and 15, with all numbers crossed out except 7.

3. Repeat step 2, working from the outside in and crossing out the smallest and the largest remaining number each time, until there is only one term left. We can see in this example that the median is 7.

Note that this process will only leave one term if there is an odd number of terms to start with. If there is an even number of terms, this process will leave two terms instead. If you cross them all out, you've gone too far. To find the median of a set with an even number of terms, we take the mean of the two remaining middle terms.

The idea behind the cross out method can also be used with graphical representations, by crossing off data points from each side.
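The cross out method can be imitated in Python by repeatedly removing the first and last items of a sorted list. This is a minimal sketch: the function name `cross_out_median` is hypothetical and the list is assumed to be non-empty.

```python
def cross_out_median(scores):
    """Find the median by crossing out the smallest and largest scores in pairs."""
    remaining = sorted(scores)  # step 1: put the data in ascending order
    # Steps 2 and 3: stop once one or two terms are left,
    # so we never cross out every term
    while len(remaining) > 2:
        remaining = remaining[1:-1]  # drop the smallest and the largest
    if len(remaining) == 1:
        return remaining[0]  # odd number of terms
    return (remaining[0] + remaining[1]) / 2  # even number of terms

print(cross_out_median([1, 1, 3, 5, 7, 9, 9, 10, 15]))  # 7
print(cross_out_median([2, 3, 6, 9]))                   # 4.5
```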

The mode describes the most frequently occurring score.

Suppose that 10 people were asked how many pets they had. 2 people said they didn't own any pets, 6 people had one pet and 2 people said they had two pets.

In this data set, the most common number of pets that people have is one pet, and so the mode of this data set is 1.

A data set can have more than one mode if two or more scores are tied as the most frequently occurring.
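A short Python sketch can count each score and return every score tied for the highest count. The function name `modes` is an assumption, and the second data set is made up purely to show a tie.

```python
from collections import Counter

def modes(scores):
    """Return every score that occurs most frequently, in ascending order."""
    counts = Counter(scores)  # score -> number of occurrences
    highest = max(counts.values())
    return sorted(score for score, count in counts.items() if count == highest)

# Pet survey from the lesson: 2 people with no pets, 6 with one, 2 with two
pets = [0] * 2 + [1] * 6 + [2] * 2
print(modes(pets))             # [1]

# Hypothetical data set with two modes
print(modes([2, 2, 5, 7, 7]))  # [2, 7]
```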

Examples

Example 1

Answer the following given this set of scores: 9,\,4,\,14,\,19,\,20,\,15,\,12

a

Sort the scores in ascending order.

Worked Solution
Create a strategy

Arrange the numbers from smallest to largest.

Apply the idea

\text{List}=4,\,9,\,12,\,14,\,15,\,19,\,20

b

Find the total number of scores.

Worked Solution
Create a strategy

Count the scores.

Apply the idea

\text{Number}=7

c

Find the median.

Worked Solution
Create a strategy

The median of a set with an odd number of scores is the \left( \dfrac{n+1}{2} \right)th score, where n is the total number of scores.

Apply the idea

There are 7 scores.

\begin{aligned}
\text{Median position} &= \dfrac{7+1}{2} && \text{Substitute the values} \\
&= \dfrac{8}{2} && \text{Evaluate the numerator} \\
&= 4\text{th score} && \text{Perform the division}
\end{aligned}

Therefore the median will be the 4th score. \text{Median}=14

Idea summary

Mean

  • The numerical average of a data set: the sum of the data values divided by the number of data values.

  • Appropriate for sets of data where there are no values much higher or lower than those in the rest of the data set.

Median

  • The middle value of a data set ranked in order.

  • A good choice when data sets have a couple of values much higher or lower than most of the others.

Mode

  • The data value that occurs most frequently.

  • A good descriptor to use when the set of data has some identical values, when data is non-numeric (categorical) or when data reflects the most popular item.
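To see why the median suits data with extreme values, the sketch below compares the mean and the median. The first list is the five-number set used earlier in this lesson. The second replaces 24 with 240, a made-up extreme value chosen only for illustration.

```python
from statistics import median

def mean(scores):
    """Sum of the scores divided by the number of scores."""
    return sum(scores) / len(scores)

typical = [4, 11, 15, 20, 24]        # data set from the median discussion
with_outlier = [4, 11, 15, 20, 240]  # hypothetical: 24 replaced by 240

print(mean(typical), median(typical))            # 14.8 15
print(mean(with_outlier), median(with_outlier))  # 58.0 15
```

The single extreme value pulls the mean from 14.8 up to 58, while the median stays at 15.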

Measure of spread

The range of a numerical data set is the difference between the smallest and largest scores in the set. The range is one type of measure of spread.

For example, at one school the ages of students in Year 7 vary between 11 and 14. So the range for this set is 14-11=3.

As a different example, if we looked at the ages of people waiting at a bus stop, the youngest person might be a 7 year old and the oldest person might be a 90 year old. The range of this set of data is 90-7=83, which is a much larger range of ages.
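The two ranges can be reproduced with Python's built-in `max` and `min`. The function name `data_range` is hypothetical. The lesson gives only the youngest and oldest ages, so each list below holds just those two values.

```python
def data_range(scores):
    """Range = maximum score - minimum score."""
    return max(scores) - min(scores)

year_7_ages = [11, 14]   # youngest and oldest Year 7 students
bus_stop_ages = [7, 90]  # youngest and oldest people at the bus stop

print(data_range(year_7_ages))    # 3
print(data_range(bus_stop_ages))  # 83
```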

Examples

Example 2

Find the range of the following set of scores: 10,\,7,\,2,\,14,\,13,\,15,\,11,\,4

Worked Solution
Create a strategy

Use the formula: \text{Range}=\text{Maximum score}-\text{Minimum score}

Apply the idea

The maximum score is 15 and the minimum score is 2.

\begin{aligned}
\text{Range} &= 15-2 && \text{Substitute the values} \\
&= 13 && \text{Evaluate the subtraction}
\end{aligned}

Example 3

Assess how various changes to data sets alter their characteristics.

a

Consider the set of data:

1,\,2,\,2,\,4,\,4,\,5,\,6,\,6,\,8,\,8,\,8,\,9,\,9

If one score of 8 is changed to a 9, which two of the following would be altered?

A
Median
B
Mean
C
Range
D
Mode
Worked Solution
Create a strategy

Find each statistic before and after the change.

Apply the idea

We change the set of data into:

1,\,2,\,2,\,4,\,4,\,5,\,6,\,6,\,8,\,8,\,9,\,9,\,9

For the median, we know there are 13 scores, so the median will be the 7th score. Examining both data sets shows that the 7th score is 6 in each, so the median is the same before and after the change. So option A is incorrect.

For the mean, by changing one score from 8 to 9, the new data set will have a greater sum of scores. So the mean will also increase. So option B is correct.

For the range, the initial data set has 9 as the maximum score and 1 as the minimum score. In the new set, the minimum and maximum stay the same, so the range will not change. So option C is incorrect.

For the mode, the initial data set has 8 as the most frequent score, while 9 is the most frequent score in the new data set. So the mode has changed from 8 to 9. So option D is correct.
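The strategy of finding each statistic before and after the change can be sketched with Python's standard `statistics` module. The helper `summary` and the variable names are assumptions for illustration; `multimode` needs Python 3.8 or later.

```python
from statistics import mean, median, multimode

def summary(scores):
    """Collect the four statistics for one data set."""
    return {
        "mean": mean(scores),
        "median": median(scores),
        "mode": multimode(scores),
        "range": max(scores) - min(scores),
    }

before = summary([1, 2, 2, 4, 4, 5, 6, 6, 8, 8, 8, 9, 9])
after = summary([1, 2, 2, 4, 4, 5, 6, 6, 8, 8, 9, 9, 9])  # one 8 changed to a 9

# Keep the names of the statistics whose values differ
changed = [name for name in before if before[name] != after[name]]
print(changed)  # ['mean', 'mode']
```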

b

Consider this set of data that represents the number of apps on six people’s phones.

11,\,12,\,15,\,17,\,19,\,19

If each person downloads another 7 apps, which three of the following would change?

A
Mode
B
Mean
C
Range
D
Median
Worked Solution
Apply the idea

Adding 7 to each score gives us:

\begin{aligned}
\text{Final set} &= 11+7,\,12+7,\,15+7,\,17+7,\,19+7,\,19+7 && \text{Add 7} \\
&= 18,\,19,\,22,\,24,\,26,\,26 && \text{Perform the addition}
\end{aligned}

For the mode, the initial data set had 19 as the most frequent score, while 26 is the most frequent score in the new data set. So the mode has been changed. Option A is correct.

For the mean, by adding 7 to all the scores, the new data set will have a greater sum, so the mean will also increase. So option B is correct.

For the range, since we increased both the maximum and minimum scores by 7, the difference between them will not change. The range for both data sets is 8. Option C is incorrect.

For the median, we know there are 6 scores, so the median will be the average of the 3rd and 4th scores. The median of the original set was 16, but the median of the new set is 23. So option D is correct.
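A short Python sketch confirms how adding the same amount to every score affects each statistic. The names `apps`, `more_apps` and `data_range` are illustrative, and `multimode` needs Python 3.8 or later.

```python
from statistics import mean, median, multimode

def data_range(scores):
    """Range = maximum score - minimum score."""
    return max(scores) - min(scores)

apps = [11, 12, 15, 17, 19, 19]
more_apps = [score + 7 for score in apps]  # everyone downloads 7 more apps

# Print each statistic for the original and the shifted data set
for name, statistic in [("mean", mean), ("median", median),
                        ("mode", multimode), ("range", data_range)]:
    print(name, statistic(apps), statistic(more_apps))
```

The mean, median and mode each rise by 7, while the range stays at 8 for both sets.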

Idea summary

The range of a numerical data set is given by:

\text{Range}=\text{Maximum score}-\text{Minimum score}

Outcomes

MA5.1-12SP

uses statistical displays to compare sets of data, and evaluates statistical claims made in the media
