Standard Level

12.04 Create and interpret cumulative frequency tables and polygons

Lesson

We have already seen how the frequencies of data values can be used to create a histogram. The cumulative frequencies can also be plotted to create another type of chart, called a cumulative frequency graph. This graph will be used in the next chapter for finding values such as the median and interquartile range from a set of grouped data.

Cumulative frequency is a 'running total' of the frequencies. To calculate it, we add an additional column to the frequency distribution table:

class interval | frequency | cumulative frequency
$50\le t<55$ | $5$ | $5$
$55\le t<60$ | $10$ | $5+10=15$
$60\le t<65$ | $25$ | $15+25=40$
$65\le t<70$ | $26$ | $40+26=66$
$70\le t<75$ | $40$ | $66+40=106$
$75\le t<80$ | $49$ | $106+49=155$
$80\le t<85$ | $28$ | $155+28=183$
Total | $183$ |

 

  • The first value in the cumulative frequency column will always be the same as the first value in the frequency column.
  • To get the second cumulative frequency value we add the second frequency to the first cumulative frequency value, $5+10=15$.
  • The second cumulative frequency value tells us there are $15$ values in the interval $50\le t<60$.
  • The third cumulative frequency value tells us there are $40$ values in the interval $50\le t<65$, and so on.
  • The final cumulative frequency value is always equal to the sum of the frequencies. In this case, there are $183$ values in the entire data set, represented by $50\le t<85$.
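The running total described above can be sketched in a few lines of Python. This is only an illustration: the variable names are our own, and `itertools.accumulate` is one of several ways to build the column.

```python
from itertools import accumulate

# Class intervals and frequencies copied from the table above.
intervals = [(50, 55), (55, 60), (60, 65), (65, 70), (70, 75), (75, 80), (80, 85)]
frequencies = [5, 10, 25, 26, 40, 49, 28]

# accumulate() yields the running total after each frequency is added.
cumulative = list(accumulate(frequencies))

for (low, high), f, cf in zip(intervals, frequencies, cumulative):
    print(f"{low} <= t < {high}: frequency {f}, cumulative frequency {cf}")
```

The first entry of `cumulative` matches the first frequency, and the last entry matches the total of the frequency column.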

 

Worked example

example 1

The frequency distribution table below shows the heights ($h$), in centimetres, of a group of children aged $5$ to $11$.

Child's height in cm | frequency | cumulative frequency
$90<h\le 100$ | $5$ | $5$
$100<h\le 110$ | $22$ | $27$
$110<h\le 120$ | $30$ | $57$
$120<h\le 130$ | $31$ | $88$
$130<h\le 140$ | $18$ | $106$
$140<h\le 150$ | $6$ | $112$

Use the table to answer the following questions:

  1. How many children were in the group?
  2. How many children had heights greater than $130$ cm but less than or equal to $140$ cm?
  3. Which class interval contained the most children?
  4. How many children had a height less than or equal to $120$ cm?
  5. How many children had a height greater than $130$ cm?

Solution

  1. The final cumulative frequency value tells us there were $112$ children in the group. This is equal to the sum of the values in the frequency column.
  2. The frequency column indicates there are $18$ children with heights in the range $130<h\le 140$.
  3. The class interval with the highest frequency is $120<h\le 130$.
  4. The cumulative frequency of the $110<h\le 120$ class interval tells us that $57$ children had a height less than or equal to $120$ cm.
  5. Here we can add the final two frequencies: $18+6=24$. Alternatively, we could subtract the cumulative frequency of $88$ (corresponding to the class interval whose upper endpoint is $130$ cm) from the total number of children in the group: $112-88=24$.
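As a quick check of the solution, the same answers can be read off programmatically. The sketch below assumes each class is identified by its upper endpoint; the variable names are illustrative only.

```python
from itertools import accumulate

# Upper endpoint of each class (e.g. 100 stands for 90 < h <= 100) and its frequency.
upper_bounds = [100, 110, 120, 130, 140, 150]
frequencies = [5, 22, 30, 31, 18, 6]
cumulative = list(accumulate(frequencies))

# Question 1: the last cumulative frequency is the group size.
total = cumulative[-1]

# Question 3: the class with the largest frequency, named by its upper endpoint.
modal_upper_bound = upper_bounds[frequencies.index(max(frequencies))]

# Question 4: the cumulative frequency at an upper endpoint counts heights up to it.
at_most_120 = cumulative[upper_bounds.index(120)]

# Question 5: everything above 130 cm is the total minus the count up to 130 cm.
above_130 = total - cumulative[upper_bounds.index(130)]

print(total, modal_upper_bound, at_most_120, above_130)
```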

 

Summarising data from a grouped frequency table

When finding the mean and median of grouped data we want to first find the class centre of each group. The class centre is the mean of the highest and lowest possible scores in the group.

Exploration

Estimate the mean and median of the following data.

Group | Frequency ($f$)
$1-5$ | $7$
$6-10$ | $2$
$11-15$ | $4$
$16-20$ | $7$

First we find the class centre for each group. This is just the average of the endpoints of the group. For example, the first group is $1$ to $5$, so the class centre is $\frac{1+5}{2}=3$.

Group | Class Centre ($x$) | Frequency ($f$)
$1-5$ | $3$ | $7$
$6-10$ | $8$ | $2$
$11-15$ | $13$ | $4$
$16-20$ | $18$ | $7$

Notice that we've given the class centre the pronumeral $x$ this time. This is because we will use the class centre in the same way that we used the score for ungrouped data.

To find the mean, we want to make an $xf$ column again. In this case, $x$ is the class centre.

Group | Class Centre ($x$) | Frequency ($f$) | $xf$
$1-5$ | $3$ | $7$ | $21$
$6-10$ | $8$ | $2$ | $16$
$11-15$ | $13$ | $4$ | $52$
$16-20$ | $18$ | $7$ | $126$

Dividing the sum of the $xf$ column by the sum of the $f$ column gives us $\frac{21+16+52+126}{7+2+4+7}=\frac{215}{20}=10.75$.
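The class-centre method for estimating the mean can be sketched as follows. The names `groups`, `centres` and `xf` are our own labels for the columns of the table.

```python
# Groups and frequencies copied from the exploration table.
groups = [(1, 5), (6, 10), (11, 15), (16, 20)]
frequencies = [7, 2, 4, 7]

# Class centre: the mean of the lowest and highest score in each group.
centres = [(low + high) / 2 for low, high in groups]

# The xf column: class centre multiplied by frequency.
xf = [x * f for x, f in zip(centres, frequencies)]

# Estimated mean: sum of the xf column divided by the sum of the f column.
estimated_mean = sum(xf) / sum(frequencies)
print(estimated_mean)  # 10.75
```

Because the exact scores inside each group are unknown, this value is an estimate of the mean rather than the true mean.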

Similarly for the median we want to make a cumulative frequency table.

Group | Class Centre ($x$) | Frequency ($f$) | Cumulative frequency
$1-5$ | $3$ | $7$ | $7$
$6-10$ | $8$ | $2$ | $9$
$11-15$ | $13$ | $4$ | $13$
$16-20$ | $18$ | $7$ | $20$

Since there are $20$ scores, we look for the $10$th and $11$th scores, which are both in the group $11-15$. While we don't know the exact score of the median, we can use the class centre $13$ as our estimate for the median.
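The search for the group containing the middle scores can be sketched like this. The helper `group_containing` is a hypothetical name, and averaging two class centres when the middle scores fall in different groups is our own assumption; the lesson only covers the case where both fall in the same group.

```python
from itertools import accumulate

groups = [(1, 5), (6, 10), (11, 15), (16, 20)]
frequencies = [7, 2, 4, 7]
cumulative = list(accumulate(frequencies))  # [7, 9, 13, 20]


def group_containing(position):
    """Return the index of the first group whose cumulative frequency reaches position."""
    for index, cf in enumerate(cumulative):
        if position <= cf:
            return index
    raise ValueError("position exceeds the number of scores")


n = cumulative[-1]
# Middle positions: the 10th and 11th scores when n = 20 (they coincide when n is odd).
middle_positions = [(n + 1) // 2, n // 2 + 1]
median_groups = [groups[group_containing(p)] for p in middle_positions]

# Assumption: average the class centres of the groups holding the middle scores.
centres = [(low + high) / 2 for low, high in median_groups]
estimated_median = sum(centres) / len(centres)
print(median_groups, estimated_median)
```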

 

 

Cumulative frequency graphs

Using the values in the cumulative frequency column, we can create a cumulative frequency histogram. 

class interval | frequency | cumulative frequency
$50\le t<55$ | $5$ | $5$
$55\le t<60$ | $10$ | $15$
$60\le t<65$ | $25$ | $40$
$65\le t<70$ | $26$ | $66$
$70\le t<75$ | $40$ | $106$
$75\le t<80$ | $49$ | $155$
$80\le t<85$ | $28$ | $183$
Total | $183$ |

Notice that the columns in a cumulative frequency histogram will always increase in size from left to right. The frequency represented by any particular column will be equal to the difference in height between that column and the one before it.
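The relationship between column heights and frequencies can be checked with a short sketch: subtracting each cumulative frequency from the next recovers the original frequency column. Treating the height before the first column as zero is the only assumption made here.

```python
# Column heights of the cumulative frequency histogram, from the table above.
cumulative = [5, 15, 40, 66, 106, 155, 183]

# Pair each height with the one before it (zero before the first column)
# and take the difference to recover the frequency of that class.
previous_heights = [0] + cumulative[:-1]
frequencies = [cf - prev for prev, cf in zip(previous_heights, cumulative)]
print(frequencies)  # [5, 10, 25, 26, 40, 49, 28]
```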

A cumulative frequency polygon, also known as an ogive, is a line graph connecting cumulative frequencies at the upper endpoint of each class interval. Sometimes the cumulative frequency histogram and polygon are displayed together:

The cumulative frequency polygon can also be displayed on its own.
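The points joined by a cumulative frequency polygon can be listed with a short sketch. Starting the polygon at zero at the lower endpoint of the first class is a common convention that we assume here; it is not stated in this lesson.

```python
from itertools import accumulate

intervals = [(50, 55), (55, 60), (60, 65), (65, 70), (70, 75), (75, 80), (80, 85)]
frequencies = [5, 10, 25, 26, 40, 49, 28]
cumulative = list(accumulate(frequencies))

# Each vertex pairs the upper endpoint of a class with its cumulative frequency.
vertices = [(high, cf) for (_, high), cf in zip(intervals, cumulative)]

# Assumption: begin the polygon at a cumulative frequency of zero
# at the lower endpoint of the first class.
vertices.insert(0, (intervals[0][0], 0))
print(vertices)
```

Joining these points in order with straight line segments gives the polygon.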

To find the median, we can use the cumulative frequency for each score. Consider the table below:

Score ($x$) | Frequency ($f$) | Cumulative frequency
$1$ | $6$ | $6$
$2$ | $9$ | $15$
$3$ | $1$ | $16$
$4$ | $6$ | $22$
$5$ | $8$ | $30$
$6$ | $6$ | $36$
$7$ | $6$ | $42$
$8$ | $2$ | $44$
$9$ | $8$ | $52$

The final row has a cumulative frequency of $52$, so there are $52$ scores in total. This means that the median will be the mean of the $26$th and $27$th scores in order.

Looking at the cumulative frequency table, there are $22$ scores less than or equal to $4$ and $30$ scores less than or equal to $5$. This means that the $26$th and $27$th scores are both $5$, so the median is $5$.

Finally, we can find the range just by looking at the score column. The highest score is $9$ and the lowest is $1$, so the range will be $9-1=8$.
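The median and range found above can be reproduced with a short sketch. The helper `score_at` is a hypothetical name. As a cross-check, the table is also expanded into a full list of scores and passed to the standard library's `statistics.median`.

```python
from bisect import bisect_left
from itertools import accumulate
from statistics import median

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
frequencies = [6, 9, 1, 6, 8, 6, 6, 2, 8]
cumulative = list(accumulate(frequencies))
n = cumulative[-1]  # 52 scores in total


def score_at(position):
    """Return the score at a 1-based position in the ordered data."""
    # The first cumulative frequency that reaches the position marks the score.
    return scores[bisect_left(cumulative, position)]


# Middle positions: the 26th and 27th scores when n = 52.
middle_positions = [(n + 1) // 2, n // 2 + 1]
median_from_table = sum(score_at(p) for p in middle_positions) / 2

# Range: highest score minus lowest score (every score here has a nonzero frequency).
data_range = max(scores) - min(scores)

# Cross-check by writing out every score as many times as its frequency.
expanded = [s for s, f in zip(scores, frequencies) for _ in range(f)]
print(median_from_table, median(expanded), data_range)
```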

 

 

Practice questions

Question 1

A principal wants to investigate the performance of students at his school in Performing Arts. To do this, he has the marks of each student studying Performing Arts collected into groups and put into a frequency table. Each group of marks is assigned a grade.

The frequency table for this is shown below.

  1. Complete the cumulative frequency column.

    Grade | Score $(x)$ | Frequency $(f)$ | Cumulative Frequency $(cf)$
    $E$ | $0\le x<20$ | $7$ | ____
    $D$ | $20\le x<40$ | $14$ | ____
    $C$ | $40\le x<60$ | $32$ | ____
    $B$ | $60\le x<80$ | $97$ | ____
    $A$ | $80\le x<100$ | $62$ | ____
  2. Calculate the total frequency.

  3. Identify the class size.

  4. Complete the sentence below.

    Approximately three quarters of the scores recorded are greater than ____.

Question 2

Consider the table.

Score ($x$) | Cumulative Frequency ($cf$)
$10$ | $7$
$11$ | $15$
$12$ | $18$
$13$ | $20$
$14$ | $26$
  1. How many scores were there in total?

  2. How many scores of $14$ were there?

  3. How many scores of less than $13$ were there?

Question 3

Consider the histogram attached.

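[Figure: cumulative frequency histogram. Horizontal axis: Score ($42$ to $46$); vertical axis: Cumulative Frequency (ticks at $5$, $10$, $15$).]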

  1. How many scores were there in total?

  2. How many scores of $46$ occurred?
