Statistical data can be divided into two types: categorical and numerical. Four common ways of summarising numerical data are the mode, mean, median and range.

Ideas

Types of data
Mode
Mean
Median
Range

Types of data

Data that is collected as a set of words is called categorical data.

Imagine asking someone for their favourite colour, country of birth, or gender. Their answer would always be a word. We can also think of categorical data as values which can be sorted into groups or categories.

When the data is a set of numbers, it is called numerical data.

Imagine asking someone for their height, their age, how many pets they own, or how long they spend on social media each day. Their answers would always be a number.

Numerical data is divided into two types, continuous and discrete.

Discrete numerical data is counted, so its values are separated. If you asked someone how many pets they have, they might say "4", but they would not say "4 and seven sixteenths".

Continuous data is measured, so it can take any value within a range - there are an infinite number of possible values.

A large ruler measuring the height of an elephant.

If we measure an animal's height, we might find any reasonable value, limited only by the precision of our ruler. If you have to measure to find the answer, the data is continuous.

Examples

Example 1

A class was surveyed about where they went on their most recent holiday. What kind of data are the survey results?

Worked Solution

Create a strategy

Determine whether the answers would be words or numbers.

Apply the idea

Names of places are words. So the data is categorical.

Idea summary

Categorical data is made up of words.

Numerical data is made up of numbers.

Discrete numerical data is counted.
Continuous numerical data is measured.
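The summary above can be turned into a rough sorting rule. The sketch below is only an illustration: the function name `classify` and the sample answers are made up for this example, and it assumes that whole numbers were counted and decimals were measured, which is not always true (an age given in whole years is still a measurement).

```python
def classify(answers):
    """Guess the data type of a list of survey answers.

    Assumption for this sketch only: whole numbers were counted and
    decimals were measured. Real data still needs a person to check.
    """
    def is_number(value):
        # bool is a subclass of int in Python, so exclude it explicitly
        return isinstance(value, (int, float)) and not isinstance(value, bool)

    if not answers:
        return "unclear"
    if all(isinstance(a, str) for a in answers):
        return "categorical"
    if all(is_number(a) for a in answers):
        if all(isinstance(a, int) for a in answers):
            return "discrete numerical"
        return "continuous numerical"
    return "unclear"


print(classify(["Fiji", "Japan", "Bali"]))   # words
print(classify([4, 0, 2, 1]))                # counted values
print(classify([152.5, 160.0, 148.2]))       # measured values
```

The question "was it counted or measured?" is still the real test, so treat the output as a first guess rather than a final answer.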

Mode

The mode of a data set is the most commonly occurring score.

To find the mode we can count how many times each score occurred (the frequency). The score with the highest frequency is the mode.

Examples

Example 2

Find the mode of the following scores: 6,\,1,\,8,\,1,\,6,\,9,\,7,\,6,\,8

What is the mode?

Worked Solution

Create a strategy

Choose the score that appears the most often.

Apply the idea

6 has a frequency of 3, which is the highest frequency of all the scores. So the mode is 6.

Idea summary

The mode of a data set is the result with the highest frequency.

The frequency is the number of times that a score occurs. If there are multiple results that share the highest frequency then there will be more than one mode.
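Counting frequencies by hand can be checked with a short Python sketch. The function name `modes` is made up for this example, and the second data set is invented to show what happens when two scores share the highest frequency.

```python
from collections import Counter


def modes(scores):
    """Return every score that shares the highest frequency, smallest first."""
    if not scores:
        return []
    frequency = Counter(scores)          # score -> number of times it occurs
    highest = max(frequency.values())    # the highest frequency
    return sorted(s for s, count in frequency.items() if count == highest)


print(modes([6, 1, 8, 1, 6, 9, 7, 6, 8]))  # the scores from Example 2
print(modes([1, 1, 2, 2, 3]))              # invented set with two modes
```

Returning a list rather than a single number is a design choice here, so that a data set with more than one mode is handled the same way as a data set with one.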

Mean

The mean of a data set is an average score.

Three friends are planning a trip to Alice Springs. They plan to fly there, and discover that the airline imposes a weight limit on their luggage of 20\text{ kg} per person. On the night before the flight they weigh their luggage and find that their luggage weights form this data set: 17,\,18,\,22

One of them has packed too much. They decide to share their luggage around so that they all carry the same amount. How much does each person carry now? Thinking about it using more mathematical language, we are sharing the total luggage equally among three groups. As a mathematical expression, we find: \dfrac{17+18+22}{3}=\dfrac{57}{3}=19

Each person carries 19\text{ kg}. This amount is the mean of the data set.

If we replace every number in a numerical data set with the mean, the sum of the numbers in the data set will be the same. To calculate the mean, use the formula: \text{Mean}=\dfrac{\text{Sum of scores}}{\text{Number of scores}}
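The formula can be sketched in a few lines of Python. The function name `mean` is chosen for this example only. The last lines check the idea above: replacing every score with the mean leaves the total unchanged.

```python
def mean(scores):
    """Sum of scores divided by number of scores."""
    if not scores:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(scores) / len(scores)


luggage = [17, 18, 22]          # luggage weights in kg from the text
average = mean(luggage)
print(average)

# Replace every score with the mean and compare the totals.
shared = [average] * len(luggage)
print(sum(luggage), sum(shared))
```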

Examples

Example 3

Find the mean of the following scores: 4,\,8,\,2,\,5,\,1

Give your answer as a decimal.

Worked Solution

Create a strategy

Use the formula \text{Mean}=\dfrac{\text{Sum of scores}}{\text{Number of scores}}

Apply the idea

\begin{aligned}
\text{Mean} &= \dfrac{4+8+2+5+1}{5} && \text{Use the formula} \\
&= \dfrac{20}{5} && \text{Add the numbers in the numerator} \\
&= 4 && \text{Perform the division}
\end{aligned}

Idea summary

\text{Mean}=\dfrac{\text{Sum of scores}}{\text{Number of scores}}

The mean is the average of the scores.

Median

The median of a data set is another kind of average.

Seven people were asked about their weekly income, and their responses form this data set: \$300,\,\$400,\,\$400,\,\$430,\,\$470,\,\$490,\,\$2900

The mean of this data set is \dfrac{\$5390}{7}=\$770, but this amount doesn't represent the data set very well. Six out of seven people earn much less than this.

Instead we can select the median, which is the middle score. We remove the biggest and the smallest scores to get: \$400,\,\$400,\,\$430,\,\$470,\,\$490

Then the next biggest and the next smallest to get: \$400,\,\$430,\,\$470

Then the next biggest and the next smallest to get: \$430

There is only one number left, and this is the median - so for this data set the median is \$430. This weekly income is much closer to the other scores in the data set, and summarises the set better.
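The removal process described above can be sketched in Python. The function name `median_by_trimming` is made up for this example. It assumes that when two scores are left at the end (an even number of scores), the value halfway between them is taken.

```python
def median_by_trimming(scores):
    """Repeatedly remove the smallest and biggest score until 1 or 2 remain."""
    if not scores:
        raise ValueError("the median of an empty data set is undefined")
    remaining = sorted(scores)
    while len(remaining) > 2:
        remaining = remaining[1:-1]   # drop the smallest and the biggest
    # One score left: that is the median.
    # Two scores left: take the value halfway between them.
    return sum(remaining) / len(remaining)


incomes = [300, 400, 400, 430, 470, 490, 2900]   # weekly incomes from the text
print(sum(incomes) / len(incomes))    # the mean, pulled up by the 2900
print(median_by_trimming(incomes))    # the median
```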

The median of a numerical data set is the "middle" score once the scores are arranged in order, and how we find it depends on the number of scores in the data set. If there is an odd number of scores, the median is the middle score. If there is an even number of scores, the median is the number halfway between the middle two scores, so that half the scores are greater than the median and half are less than the median.

The image shows 2 sets of scores ordered from smallest to largest and their medians.

Examples

Example 4

Find the median of the following scores: 3.2,\,2.3,\,5.5,\,4.6,\,8.5

Worked Solution

Create a strategy

We need to put the scores in order and find the middle score.

Apply the idea

The scores in order are: 2.3,\,3.2,\,4.6,\,5.5,\,8.5

The middle score is 4.6 because it has 2 scores above it and 2 scores below it.

\text{Median}=4.6

Idea summary

To find the median of a numerical data set, first put the scores in order. Then:

If there is an odd number of scores, the median will be the middle score.
If there is an even number of scores, the median will be the number halfway between the middle two scores.
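The two cases can be sketched in Python as follows. The function name `median` is chosen for this example, the second data set is invented to show the even case, and "in between the middle two scores" is read as the value halfway between them, which is the usual convention.

```python
def median(scores):
    """Middle score of the ordered data, or halfway between the middle two."""
    if not scores:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(scores)          # the scores must be in order first
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]        # odd number of scores
    return (ordered[middle - 1] + ordered[middle]) / 2   # even number of scores


print(median([3.2, 2.3, 5.5, 4.6, 8.5]))  # the scores from Example 4
print(median([1, 3, 5, 7]))               # invented set with an even number of scores
```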

Range

The range is the simplest measure of spread in a numerical data set. Unlike the mean and the median, the range doesn't measure the centre of the data - instead it measures how spread out the scores are.

Two bus drivers, Kenji and Bjorn, track how many passengers board their buses each day for a week. Their results are displayed in this table:

	Mon	Tue	Wed	Thu	Fri
Kenji	10	13	14	16	11
Bjorn	2	27	13	5	17

Both data sets have the same median and the same mean, but the sets are quite different. To calculate the range, we start by finding the highest and lowest number of passengers for each driver:

	Highest	Lowest
Kenji	16	10
Bjorn	27	2

Now we subtract the lowest from the highest to find the difference, which is the range:

	Range
Kenji	16-10=6
Bjorn	27-2=25

Notice how Kenji's range is quite small, at least compared to Bjorn's. We might say that Kenji's route is more predictable and that Bjorn's route is much more variable. We can see that the range does not say anything about the size of the scores, just their spread.

The range of a numerical data set is the difference between the highest and the lowest score. \text{Range} = \text{Highest score} - \text{Lowest score}
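The comparison between the two drivers can be checked with a short Python sketch. The function name `data_range` is made up for this example (it avoids clashing with Python's built-in `range`), and `mean` and `median` come from Python's standard `statistics` module.

```python
from statistics import mean, median


def data_range(scores):
    """Highest score minus lowest score."""
    return max(scores) - min(scores)


kenji = [10, 13, 14, 16, 11]   # passengers per day, from the table
bjorn = [2, 27, 13, 5, 17]

for name, passengers in (("Kenji", kenji), ("Bjorn", bjorn)):
    print(name, mean(passengers), median(passengers), data_range(passengers))
```

The mean and median printed for each driver match, while the ranges differ, which is the point of the comparison: the range captures spread that the two averages miss.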

Examples

Example 5

Find the range of the following scores: 11,\,-19,\,14,\,17,\,-11,\,15,\,13,\,-5,\,-20

Worked Solution

Create a strategy

Use the formula \text{Range} = \text{Highest score} - \text{Lowest score}.

Apply the idea

The highest score is 17 and the lowest score is -20.

\begin{aligned}
\text{Range} &= 17-(-20) && \text{Subtract } -20 \text{ from } 17 \\
&= 37 && \text{Perform the subtraction}
\end{aligned}

Idea summary

\text{Range} = \text{Highest score} - \text{Lowest score}

The range is the difference between the highest and the lowest score.

Outcomes

MA4-19SP

collects, represents and interprets single sets of data, using appropriate statistical displays

MA4-20SP

analyses single sets of data using measures of location, and range

11.01 Summarising data
