Data Analysis

Lesson

Measures of central tendency give us an estimate of which score represents the centre of a data set. We've looked at different measures of central tendency, particularly the:

- mean: the average of the scores,
- median: the middle score, and
- mode: the most frequently occurring score.

Measures of spread tell us how far apart the scores in a data set are.

We've already looked at one measure of spread, the range, which is the difference between the highest and lowest scores in a data set.

Now we are going to learn about a new measure of spread called the Mean Absolute Deviation (MAD). The MAD of a set of data is the average distance between the scores and the mean.
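
The same definition can be written in symbols. The notation here is introduced only for illustration and is not used elsewhere in the lesson: $n$ is the number of scores, $x_1, x_2, \dots, x_n$ are the scores, and $\bar{x}$ is their mean.

$\text{MAD}=\frac{\left|x_1-\bar{x}\right|+\left|x_2-\bar{x}\right|+\dots+\left|x_n-\bar{x}\right|}{n}$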

Let's use an example to help explain this.

Find the mean absolute deviation of $\{23, 18, 31, 28, 20\}$.

1. Find the mean.

$\text{Mean}=\frac{23+18+31+28+20}{5}=24$

2. Find the difference between each individual score and the mean.

$23-24=-1$

$18-24=-6$

$31-24=7$

$28-24=4$

$20-24=-4$

3. Take the absolute value of each difference.

$\left|-1\right|=1$

$\left|-6\right|=6$

$\left|7\right|=7$

$\left|4\right|=4$

$\left|-4\right|=4$

4. Find the mean of these absolute values.

$\text{Mean}=\frac{1+6+7+4+4}{5}=4.4$

This means that, on average, scores in this data set are $4.4$ units above or below the mean.
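
The steps above can also be sketched in Python as a way to check the working. This is a minimal illustration only: the function name `mean_absolute_deviation` is made up for this example and does not come from any library.

```python
def mean_absolute_deviation(scores):
    """Return the average distance between each score and the mean."""
    mean = sum(scores) / len(scores)            # find the mean
    differences = [x - mean for x in scores]    # each score minus the mean
    distances = [abs(d) for d in differences]   # absolute value of each difference
    return sum(distances) / len(distances)      # mean of the absolute values


scores = [23, 18, 31, 28, 20]
mad = mean_absolute_deviation(scores)
print(mad)  # 4.4
```

The result agrees with the value found by hand.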

A batsman's mean number of runs is $57$ and the MAD is $13$. In the next match he makes $58$ runs. If this score is added to the existing scores, which is true of the new mean and MAD?

A) $\text{Mean}<57$, with $\text{MAD}<13$

B) $\text{Mean}>57$, with $\text{MAD}>13$

C) $\text{Mean}<57$, with $\text{MAD}>13$

D) $\text{Mean}>57$, with $\text{MAD}<13$

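Think: Is the new score higher or lower than the old mean? And is it closer to the old mean than the other scores are, on average?

Do: The new score of $58$ is above the old mean of $57$, so the mean increases. It is only $1$ run from the old mean, much closer than the average distance of $13$, so the MAD decreases. The answer is D) $\text{Mean}>57$, with $\text{MAD}<13$.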
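
To see this effect with actual numbers, here is a small Python sketch. The question does not give the batsman's previous scores, so the list `old_scores` is hypothetical, chosen only so that its mean is $57$ and its MAD is $13$. The helper names `mean` and `mad` are also made up for this example.

```python
def mean(scores):
    """Average of the scores."""
    return sum(scores) / len(scores)


def mad(scores):
    """Mean absolute deviation: average distance from the mean."""
    m = mean(scores)
    return sum(abs(x - m) for x in scores) / len(scores)


# Hypothetical previous scores (not from the question),
# chosen so that the mean is 57 and the MAD is 13.
old_scores = [44, 70, 44, 70]
new_scores = old_scores + [58]  # add the score from the next match

print("old mean and MAD:", mean(old_scores), mad(old_scores))
print("new mean and MAD:", mean(new_scores), mad(new_scores))
print("mean increased:", mean(new_scores) > mean(old_scores))
print("MAD decreased:", mad(new_scores) < mad(old_scores))
```

For this made-up set of scores, the mean moves slightly above $57$ and the MAD drops below $13$, which matches option D.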

Two months in a row, the snow depth at a particular location was measured every day for the first week of the month, and the results are shown in the table.

| | Sun | Mon | Tue | Wed | Thu | Fri | Sat |
|---|---|---|---|---|---|---|---|
| Depth in first month | $7$ | $5$ | $8$ | $14$ | $6$ | $12$ | $20$ |
| Depth in second month | $9$ | $7$ | $10$ | $16$ | $8$ | $14$ | $22$ |

How is the distribution of snow depth in the first month similar to the distribution of snow depth in the second month?

A) Same mean snow depth

B) Same MAD

C) Same range of snow depth

How is the distribution of snow depth in the first month different to the distribution of snow depth in the second month?

A) Greater average snow depth in the second month

B) Greater variation in snow depth in the second month
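
One way to check your answers to both snow depth questions is to calculate the mean, MAD and range for each month. The sketch below uses the values from the table. The helper names `mean`, `mad` and `data_range` are made up for this example.

```python
def mean(scores):
    """Average of the scores."""
    return sum(scores) / len(scores)


def mad(scores):
    """Mean absolute deviation: average distance from the mean."""
    m = mean(scores)
    return sum(abs(x - m) for x in scores) / len(scores)


def data_range(scores):
    """Difference between the highest and lowest scores."""
    return max(scores) - min(scores)


# Snow depths from the table, Sunday to Saturday.
first_month = [7, 5, 8, 14, 6, 12, 20]
second_month = [9, 7, 10, 16, 8, 14, 22]

for label, depths in [("first month", first_month), ("second month", second_month)]:
    print(label,
          "mean:", round(mean(depths), 2),
          "MAD:", round(mad(depths), 2),
          "range:", data_range(depths))
```

Every depth in the second month is $2$ more than on the same day in the first month, so compare which of the three measures change and which stay the same.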

Consider the distribution of the data presented in this histogram.

Which of the following statements is true?

A) Since the data is symmetric, the MAD of this data is the same as the mean.

B) Since the data is symmetric, the mean of this data is $12.5$ and the MAD is greater than $0$.

C) Since the data is symmetric, the MAD of this data is $0$ and the mean is $12.5$.