7. Categorical & Quantitative Data

Lesson

A measure of center summarizes the data in a single number. The most common ones are the mean, median and mode.

The mean is the **average** of the numbers in a data set.

To calculate the mean, we add up all the scores in a data set, then divide this total by the number of scores (i.e. the total frequency).

Hint

To find the sum of all the scores, we can either add up each individual score or, if certain scores are repeated, add the products of each score and its frequency (i.e. $f\times x=fx$).
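As a rough illustration of the two approaches in the hint, the sketch below computes the mean of a small data set both ways. The scores and the variable names (`scores`, `frequency_table`) are made up for this example and are not taken from the lesson's questions.

```python
# Hypothetical data set: the scores 2 and 5 are repeated.
scores = [2, 2, 3, 5, 5, 5]

# Approach 1: add up each individual score, then divide by the number of scores.
mean_from_scores = sum(scores) / len(scores)

# Approach 2: use a frequency table (score -> frequency) and add the products f * x.
frequency_table = {2: 2, 3: 1, 5: 3}
sum_fx = sum(f * x for x, f in frequency_table.items())  # sum of f*x
sum_f = sum(frequency_table.values())                    # total frequency
mean_from_table = sum_fx / sum_f

print(mean_from_scores, mean_from_table)
```

Both approaches give the same total ($22$) and the same number of scores ($6$), so they produce the same mean.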

Now let's try calculating the mean of data sets ourselves.

Find the mean of the following scores:

$-14$, $0$, $-2$, $-18$, $-8$, $0$, $-15$, $-1$.

A statistician organized a set of data into the frequency table shown:

Complete the frequency distribution table:

| Score ($x$) | Frequency ($f$) | $f\times x$ |
|---|---|---|
| $31$ | $12$ | |
| $32$ | $14$ | |
| $33$ | $7$ | |
| $34$ | $20$ | |
| $35$ | $15$ | |
| **Totals** | | |

Calculate the mean, correct to two decimal places.

Find the range of the scores in the table above.

Find the mode of the set of scores in the table.

The mean of $4$ scores is $21$. If three of the scores are $17$, $3$ and $8$, find the fourth score (call it $x$).

The frequency table below shows the resting heart rate of some people taking part in a study.

Complete the table:

| Heart Rate | Class Center ($x$) | Frequency ($f$) | $fx$ |
|---|---|---|---|
| $30$-$39$ | | $13$ | |
| $40$-$49$ | | $22$ | |
| $50$-$59$ | | $24$ | |
| $60$-$69$ | | $36$ | |

Determine an estimate for the mean resting heart rate. Round your answer to two decimal places if necessary.
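When data is grouped into classes, the individual scores are not known, so the class center (the midpoint of each class) stands in for every score in that class. The sketch below shows the idea on a hypothetical grouped table; the classes, frequencies and variable names are invented for illustration and are not the heart rate data above.

```python
# Hypothetical grouped data: ((lower bound, upper bound), frequency).
classes = [((10, 19), 4), ((20, 29), 6), ((30, 39), 10)]

total_fx = 0  # running sum of f * x, where x is the class center
total_f = 0   # running sum of the frequencies

for (low, high), f in classes:
    center = (low + high) / 2  # class center, e.g. (10 + 19) / 2 = 14.5
    total_fx += f * center
    total_f += f

# The result is only an estimate, because class centers replace the real scores.
estimated_mean = total_fx / total_f
print(estimated_mean)
```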

The median is a **measure of center**. In other words, it's one way of describing a value that represents the middle or the center of a data set. The median is the **middle score** in an ordered data set.

Remember!

The data must be ordered (usually in ascending order) to calculate the median.

Say we have five numbers in our data set: $4$, $11$, $15$, $20$ and $24$.

The median would be $15$ because it is right in the middle: there are two numbers on either side of it.

4, 11, 15, 20, 24

However, if we have a larger data set, we may not be able to see straight away which term is in the middle.

You can also determine which term will be the middle number using the following formula:

Let $n$`n` be the number of terms:

$\text{middle term}=\frac{n+1}{2}\text{th term}$

For example, take this ordered set of nine numbers: $1$, $1$, $3$, $5$, $7$, $9$, $9$, $10$, $15$. To work out which term is in the middle:

$\text{middle term}=\frac{9+1}{2}=5$

This means the fifth term will be the median: $1$, $1$, $3$, $5$, **$7$**, $9$, $9$, $10$, $15$.

So the median is $7$.

Let's try that with an even number of terms. Let's look at this data set with four terms: $8$, $12$, $17$, $20$.

$\text{middle term}=\frac{4+1}{2}=2.5\text{th term}$

But what is the $2.5$th term? It means the average of the second and third terms. Again, remember your data must be in order before you count the terms. So in this example, the median will be the average of $12$ and $17$.

$\text{median}=\frac{12+17}{2}=14.5$
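The position formula $\frac{n+1}{2}$ can be sketched in code as follows. The function name `median_by_position` is our own choice for this illustration; it sorts the data first, then either picks the middle term or averages the two terms on either side of a ".5" position.

```python
def median_by_position(scores):
    """Find the median using the (n + 1) / 2 position formula."""
    if not scores:
        raise ValueError("the data set must contain at least one score")

    ordered = sorted(scores)   # the data must be in order first
    n = len(ordered)
    position = (n + 1) / 2     # 1-based position of the middle term

    if position.is_integer():
        # Odd number of terms: the middle term itself is the median.
        return ordered[int(position) - 1]

    # Even number of terms: average the terms just below and just above.
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    return (lower + upper) / 2


print(median_by_position([4, 11, 15, 20, 24]))             # 15
print(median_by_position([1, 1, 3, 5, 7, 9, 9, 10, 15]))   # 7
print(median_by_position([8, 12, 17, 20]))                 # 14.5
```

The three calls reuse the data sets from the worked examples above, so the outputs match the medians found by hand.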

Find the median of this set of scores:

$11$, $11$, $13$, $14$, $18$, $22$, $23$, $25$

Write down $4$ consecutive odd numbers whose median is $40$.

Write all solutions on the same line separated by a comma.

Solve the following using the bar graph:

Find the total number of scores.

Find the median.

The mode is a measure of **central tendency**. In other words, it's one way of describing a value that represents the middle or the center of a data set so we get a sense of what is "normal." The **mode** describes the **mo**st **frequently occurring score**. Remember the word and the meaning start with the same two letters.

Let's say I asked $10$ people how many pets they had: $2$ people said they had no pets, $6$ people said they had one pet and $2$ people said they had two pets. What is the most common number of pets for people to have? The answer is one pet, because the majority of people ($\frac{6}{10}$) had one pet. So the mode in this data set is $1$.
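The pets example can be checked with a short tally. This is just one possible sketch; the list `pets` is our own encoding of the ten responses described above, and the code returns a list because a data set can have more than one mode.

```python
from collections import Counter

# Ten responses: 2 people with no pets, 6 with one pet, 2 with two pets.
pets = [0] * 2 + [1] * 6 + [2] * 2

counts = Counter(pets)          # maps each score to its frequency
highest = max(counts.values())  # the largest frequency

# Every score that reaches the largest frequency is a mode.
modes = sorted(score for score, f in counts.items() if f == highest)
print(modes)  # [1]
```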


Find the mode of the following scores:

$8,18,5,2,2,10,8,5,14,14,8,8,10,18,14,5$


Find the mode from the histogram shown.

In statistics, we tend to assume that our data will fit some kind of trend and that most things will fit into a "normal" range. This is why we look at measures of center, such as the mean, median, and mode.

An outlier is a response that is very different from the rest of the data set, lying well above or well below the other values. For example, if there are five people in a group and four of them are between $120$ cm and $130$ cm tall, whereas Jim is $165$ cm tall, Jim would be an outlier as he is *much* taller than everyone else in the group.

Outliers can skew or change the shape of our data. In general, when a data set contains an outlier:

- The mode will remain unchanged
- The median will change a little bit
- The mean will change significantly

If we have...

| A really low outlier | A really high outlier |
|---|---|
| The median decreases a little bit | The median increases a little bit |
| The mean decreases significantly | The mean increases significantly |

Since the mean is heavily impacted by outliers, it is best to use the median if a data set has outliers.
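To see this effect in numbers, here is a small sketch based on the height example. The lesson only says four people are between $120$ cm and $130$ cm, so the four specific heights below are assumed values chosen for illustration; only Jim's height of $165$ cm comes from the text.

```python
from statistics import mean, median

# Assumed heights (cm) for the four people between 120 cm and 130 cm.
group = [122, 125, 127, 129]
jim = 165  # the outlier from the example

without_outlier = (mean(group), median(group))
with_outlier = (mean(group + [jim]), median(group + [jim]))

print("mean, median without Jim:", without_outlier)
print("mean, median with Jim:   ", with_outlier)
```

With these assumed heights, adding Jim raises the median by $1$ cm (from $126$ to $127$) but raises the mean by almost $8$ cm (from $125.75$ to $133.6$), which is why the median describes the center of this group better.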

Which measure of center would be best for the following data set?

$8,10,14,18,19,91$

- A: Mean
- B: Median
- C: Mode

Carl has been recording his spelling test scores for the past semester. His scores were $14,16,2,15,15,16,15$.

Calculate the median of Carl's scores.

Calculate the mean of Carl's scores.

Round your answer to two decimal places if necessary.

Which measure of center more accurately describes the center of this data set?

- A: The median
- B: The mean

Summarize, represent, and interpret data on a single count or measurement variable.

Estimate or calculate to make predictions based on a circle, line, bar graph, measure of central tendency, or other representation.