Univariate Data

Lesson

You may have seen frequency tables that have an extra column for cumulative frequency. Cumulative frequency is a running total of the frequencies. In other words, the cumulative frequency of a score is its own frequency plus the frequencies of all the lower scores in the data set.

Sam recorded the number of pets owned by $10$ people in his class. Here is the regular frequency table.

Number of Pets | Frequency |
---|---|
$0$ | $2$ |
$1$ | $5$ |
$2$ | $2$ |
$3$ | $1$ |

Now let's look at how we would calculate and add in a cumulative frequency column. Remember, we add each frequency to the previous cumulative total. The first value in the cumulative frequency column will be the same as the first value in the frequency column (since there's no previous total to add it to).

Number of Pets | Frequency | Cumulative Frequency |
---|---|---|
$0$ | $2$ | $2$ |
$1$ | $5$ | $2+5=7$ |
$2$ | $2$ | $7+2=9$ |
$3$ | $1$ | $9+1=10$ |

Notice that the final value in the cumulative frequency column is the same as the total number of people that were surveyed. That's a useful check that we've added up our running totals correctly!
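The running total described above can be sketched in a few lines of Python. This is only an illustration using Sam's table; the variable names are our own, and `itertools.accumulate` is one of several ways to build a running total.

```python
from itertools import accumulate

# Frequencies from Sam's table: number of pets -> number of people
frequencies = {0: 2, 1: 5, 2: 2, 3: 1}

# accumulate() yields the running total of the frequency column
cumulative = dict(zip(frequencies, accumulate(frequencies.values())))

print(cumulative)  # {0: 2, 1: 7, 2: 9, 3: 10}

# Check: the last cumulative frequency equals the number of people surveyed
print(cumulative[3] == sum(frequencies.values()))  # True
```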

We've looked at histograms as graphical representations of the distribution of data, made by plotting the frequency of each individual score. In cumulative frequency histograms, we plot the cumulative frequency of each score instead. As such, the columns in a cumulative frequency histogram never decrease in height from left to right, and the column for the largest score is the tallest: its height equals the total number of scores.
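To see the "never decreasing" shape without any plotting software, here is a rough text sketch that draws the cumulative frequencies from Sam's pets table as sideways bars of `#` characters. The helper name `text_histogram` is hypothetical, and a real histogram would draw vertical columns instead.

```python
from itertools import accumulate

scores = [0, 1, 2, 3]        # number of pets
frequencies = [2, 5, 2, 1]   # from Sam's frequency table
cumulative = list(accumulate(frequencies))  # [2, 7, 9, 10]

def text_histogram(labels, heights):
    """Return one line per score, with a bar of '#' as long as its height."""
    return [f"{label}: {'#' * h} ({h})" for label, h in zip(labels, heights)]

for line in text_histogram(scores, cumulative):
    print(line)
# 0: ## (2)
# 1: ####### (7)
# 2: ######### (9)
# 3: ########## (10)
```

Each bar is at least as long as the one before it, and the last bar is as long as the total number of people surveyed.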

A pair of dice is rolled $50$ times and the numbers appearing on the uppermost faces are added to give a score.

a) What is the lowest possible score?

b) What is the highest possible score?

c) The frequency of each score is given in the table. Complete the cumulative frequency values.

d) How many times did a score of $8$ appear?

e) How many times did a score more than $9$ appear?

f) How many times did a score of at most $6$ appear?
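Parts d) to f) can each be read from the cumulative frequency column. As a sketch, write $CF(k)$ for the cumulative frequency of score $k$ and $N$ for the total number of rolls (here $N=50$). Both symbols are our own notation rather than part of the question, and the scores are assumed to be whole numbers.

$$
\begin{aligned}
\text{number of scores equal to } k &= CF(k) - CF(k-1) \\
\text{number of scores more than } k &= N - CF(k) \\
\text{number of scores at most } k &= CF(k)
\end{aligned}
$$

So part d) uses $CF(8)-CF(7)$, part e) uses $50-CF(9)$, and part f) uses $CF(6)$.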

The heights (in cm) of $26$ boys in a class are listed:

$156,143,143,163,153,156,163,143,163,163,156,156,156,153,150,156,156,150,150,156,163,163,153,156,153,156$

a) Construct a cumulative frequency histogram to represent the data.
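Before drawing the histogram, it helps to tally the heights into a cumulative frequency table. The following is a minimal sketch of that tally using the listed heights; the variable names are our own.

```python
from collections import Counter
from itertools import accumulate

heights = [156, 143, 143, 163, 153, 156, 163, 143, 163, 163, 156, 156, 156,
           153, 150, 156, 156, 150, 150, 156, 163, 163, 153, 156, 153, 156]

counts = Counter(heights)                    # frequency of each height
scores = sorted(counts)                      # distinct heights, smallest first
frequencies = [counts[s] for s in scores]
cumulative = list(accumulate(frequencies))   # running total of the frequencies

for s, f, cf in zip(scores, frequencies, cumulative):
    print(f"{s} cm | {f:2d} | {cf:2d}")
# 143 cm |  3 |  3
# 150 cm |  3 |  6
# 153 cm |  4 | 10
# 156 cm | 10 | 20
# 163 cm |  6 | 26
```

The last column gives the column heights for the cumulative frequency histogram.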

b) How many students are taller than $150$ cm?

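Think: The cumulative frequency for $150$ cm is $6$, so $6$ students are at most $150$ cm tall. Subtract this from the total number of students.

Do: $26-6=20$

$20$ students are taller than $150$ cm.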

c) How many students are at most $150$ cm tall?

Think: We worked out in part b) that the cumulative frequency for $150$ cm is $6$.

Do: $6$ students are at most $150$ cm tall.

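d) How many students are taller than $155$ cm but shorter than $160$ cm?

Think: The only score that fits in this range is $156$ cm. How many times does this score appear? Subtract the cumulative frequency for $153$ cm from the cumulative frequency for $156$ cm.

Do: $20-10=10$

$10$ students are taller than $155$ cm but shorter than $160$ cm.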
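Parts b) to d) all come down to looking up cumulative frequencies and subtracting. Here is one way to sketch those lookups, assuming the cumulative frequencies tallied from the listed heights. The helper `at_most` is hypothetical, and the last calculation relies on the heights being whole numbers of centimetres.

```python
from bisect import bisect_right

scores = [143, 150, 153, 156, 163]   # distinct heights in cm
cumulative = [3, 6, 10, 20, 26]      # cumulative frequency of each height
total = cumulative[-1]

def at_most(height):
    """Number of students whose height is <= height (0 if below every score)."""
    i = bisect_right(scores, height)
    return cumulative[i - 1] if i > 0 else 0

taller_than_150 = total - at_most(150)     # 26 - 6
at_most_150 = at_most(150)
# Heights are whole centimetres, so "shorter than 160" means "at most 159"
between_155_and_160 = at_most(159) - at_most(155)   # 20 - 10

print(taller_than_150, at_most_150, between_155_and_160)  # 20 6 10
```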

e) Find the modal height of the students.

Think: The mode is the most frequently occurring score. In the cumulative frequency histogram, this is indicated by the largest difference in height between a column and the column before it.

Do: $156$ cm is the modal height.
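The "largest jump between columns" idea can be checked with a short sketch. It assumes the cumulative frequencies tallied from the listed heights and treats the first column as a jump up from zero.

```python
scores = [143, 150, 153, 156, 163]   # distinct heights in cm
cumulative = [3, 6, 10, 20, 26]      # column heights in the histogram

# Each frequency is the jump from the previous column (the first jumps from 0)
frequencies = [cf - prev for prev, cf in zip([0] + cumulative, cumulative)]
print(frequencies)  # [3, 3, 4, 10, 6]

# The mode is the score with the biggest jump
modal_height = scores[frequencies.index(max(frequencies))]
print(modal_height)  # 156
```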

f) Find the mean height. Write your answer correct to $2$ decimal places.

Think: We need to add up the scores and divide the sum by the number of scores. Multiplying each height by its frequency saves adding repeated heights one at a time.

Do:

$$
\begin{aligned}
\text{Mean} &= \frac{3\times143+3\times150+4\times153+10\times156+6\times163}{26} \\
&= \frac{4029}{26} \\
&= 154.961\ldots \\
&\approx 154.96\text{ cm}
\end{aligned}
$$
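The same weighted sum can be verified with a few lines of Python. This is a sketch using the frequencies that appear in the working above; the variable names are our own.

```python
scores = [143, 150, 153, 156, 163]   # distinct heights in cm
frequencies = [3, 3, 4, 10, 6]       # how many boys have each height

# Multiply each height by its frequency, then add the products
total_height = sum(s * f for s, f in zip(scores, frequencies))
n = sum(frequencies)

mean = total_height / n
print(total_height, n)   # 4029 26
print(round(mean, 2))    # 154.96
```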

g) Find the median height.

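Think: There are $26$ scores. So where would the middle score lie?

Do: The median lies between the thirteenth and fourteenth scores. The cumulative frequency reaches $10$ at $153$ cm and $20$ at $156$ cm, so both of these scores are $156$ cm. So the median height is $156$ cm.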
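Finding "the thirteenth and fourteenth scores" from cumulative frequencies can be sketched as a lookup: the score in a given position is the first one whose cumulative frequency reaches that position. The helper `score_at` is hypothetical, and the cumulative frequencies are those tallied from the listed heights.

```python
from bisect import bisect_left

scores = [143, 150, 153, 156, 163]   # distinct heights in cm
cumulative = [3, 6, 10, 20, 26]      # cumulative frequency of each height
n = cumulative[-1]

def score_at(position):
    """Score in the given 1-based position when the data are listed in order."""
    return scores[bisect_left(cumulative, position)]

if n % 2 == 0:
    # Even number of scores: average the two middle scores
    median = (score_at(n // 2) + score_at(n // 2 + 1)) / 2
else:
    median = score_at((n + 1) // 2)

print(score_at(13), score_at(14), median)  # 156 156 156.0
```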

Plan and conduct surveys and experiments using the statistical enquiry cycle: determining appropriate variables and measures; considering sources of variation; gathering and cleaning data; using multiple displays, and re-categorising data to find patterns, variations, relationships, and trends in multivariate data sets; comparing sample distributions visually, using measures of centre, spread, and proportion; presenting a report of findings

Evaluate statistical investigations or probability activities undertaken by others, including data collection methods, choice of measures, and validity of findings