We have already seen how the frequencies of data values can be used to create a histogram. The cumulative frequencies can also be plotted to create another type of chart, called a cumulative frequency graph. This graph will be used in the next chapter for finding values such as the median and interquartile range from a set of grouped data.
Cumulative frequency is a 'running total' of the frequencies. To calculate it, we add an additional column to the frequency distribution table:
class interval | frequency | cumulative frequency |
---|---|---|
$50\le t<55$ | $5$ | $5$ |
$55\le t<60$ | $10$ | $5+10=15$ |
$60\le t<65$ | $25$ | $15+25=40$ |
$65\le t<70$ | $26$ | $40+26=66$ |
$70\le t<75$ | $40$ | $66+40=106$ |
$75\le t<80$ | $49$ | $106+49=155$ |
$80\le t<85$ | $28$ | $155+28=183$ |
Total | $183$ | |
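The running total can also be checked with a few lines of code. The following is a minimal sketch in Python, not part of the original lesson; the variable names `frequencies` and `cumulative` are our own, and the data is copied from the frequency column above.

```python
from itertools import accumulate

# Frequencies from the table above, listed in class-interval order.
frequencies = [5, 10, 25, 26, 40, 49, 28]

# Each cumulative frequency is the previous running total plus the
# frequency of the current class interval.
cumulative = list(accumulate(frequencies))

print(cumulative)      # [5, 15, 40, 66, 106, 155, 183]
print(cumulative[-1])  # 183, the same as the total frequency
```

The last cumulative frequency always equals the total frequency, which is a quick way to check the column for arithmetic slips.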
The frequency distribution table below shows the heights ($h$), in centimetres, of a group of children aged $5$ to $11$.
Child's height in cm | frequency | cumulative frequency |
---|---|---|
$90$ … | $5$ | $5$ |
$100$ … | $22$ | $27$ |
$110$ … | $30$ | $57$ |
$120$ … | $31$ | $88$ |
$130$ … | $18$ | $106$ |
$140$ … | $6$ | $112$ |
Use the table to answer the following questions:
When finding the mean and median of grouped data we want to first find the class centre of each group. The class centre is the mean of the highest and lowest possible scores in the group.
Estimate the mean and median of the following data.
Group | Frequency ($f$) |
---|---|
$1-5$ | $7$ |
$6-10$ | $2$ |
$11-15$ | $4$ |
$16-20$ | $7$ |
First we find the class centre for each group. This is just the average of the endpoints of the group. For example, the first group is $1$ to $5$, so the class centre is $\frac{1+5}{2}=3$.
Group | Class Centre ($x$) | Frequency ($f$) |
---|---|---|
$1-5$ | $3$ | $7$ |
$6-10$ | $8$ | $2$ |
$11-15$ | $13$ | $4$ |
$16-20$ | $18$ | $7$ |
Notice that we've given the class centre the pronumeral $x$ this time. This is because we will use the class centre in the same way that we used the score for ungrouped data.
To find the mean, we want to make an $xf$ column again. In this case, $x$ is the class centre.
Group | Class Centre ($x$) | Frequency ($f$) | $xf$ |
---|---|---|---|
$1-5$ | $3$ | $7$ | $21$ |
$6-10$ | $8$ | $2$ | $16$ |
$11-15$ | $13$ | $4$ | $52$ |
$16-20$ | $18$ | $7$ | $126$ |
Dividing the sum of the $xf$ column by the sum of the $f$ column gives us $\frac{21+16+52+126}{7+2+4+7}=\frac{215}{20}=10.75$.
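The same estimate can be reproduced in code. The following is a minimal sketch in Python, assuming each group is stored as a hypothetical tuple of (lowest score, highest score, frequency); the names `groups`, `centres`, `xf` and `mean_estimate` are our own.

```python
# Each group is (lowest score, highest score, frequency), copied from the table.
groups = [(1, 5, 7), (6, 10, 2), (11, 15, 4), (16, 20, 7)]

# Class centre: the mean of the lowest and highest possible scores in the group.
centres = [(low + high) / 2 for low, high, _ in groups]

# The xf column: class centre multiplied by frequency.
xf = [x * f for x, (_, _, f) in zip(centres, groups)]

# Estimated mean: sum of the xf column divided by the sum of the f column.
total_frequency = sum(f for _, _, f in groups)
mean_estimate = sum(xf) / total_frequency

print(centres)        # [3.0, 8.0, 13.0, 18.0]
print(xf)             # [21.0, 16.0, 52.0, 126.0]
print(mean_estimate)  # 10.75
```

Remember that this is only an estimate of the mean, because the class centre stands in for every score in its group.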
Similarly for the median we want to make a cumulative frequency table.
Group | Class Centre ($x$) | Frequency ($f$) | Cumulative frequency |
---|---|---|---|
$1-5$ | $3$ | $7$ | $7$ |
$6-10$ | $8$ | $2$ | $9$ |
$11-15$ | $13$ | $4$ | $13$ |
$16-20$ | $18$ | $7$ | $20$ |
Since there are $20$ scores, we look for the $10$th and $11$th scores, which are both in the group $11-15$. While we don't know the exact score of the median, we can use the class centre $13$ as our estimate for the median.
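Locating the group that contains the middle scores can be sketched in Python as follows. This is an illustration only: it assumes an even number of scores and, as in the example above, that both middle scores fall in the same group. The variable names are our own.

```python
from bisect import bisect_left
from itertools import accumulate

groups = ["1-5", "6-10", "11-15", "16-20"]
centres = [3, 8, 13, 18]
frequencies = [7, 2, 4, 7]

cumulative = list(accumulate(frequencies))  # [7, 9, 13, 20]
n = cumulative[-1]                          # 20 scores in total

# With an even number of scores, the median lies between
# the scores in positions n/2 and n/2 + 1.
positions = [n // 2, n // 2 + 1]            # [10, 11]

# bisect_left returns the index of the first group whose
# cumulative frequency reaches the given position.
indices = [bisect_left(cumulative, p) for p in positions]

print([groups[i] for i in indices])  # ['11-15', '11-15']

# Both middle scores are in the same group, so its class centre
# is used as the estimate for the median.
median_estimate = centres[indices[0]]
print(median_estimate)               # 13
```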
Using the values in the cumulative frequency column, we can create a cumulative frequency histogram.
class interval | frequency | cumulative frequency |
---|---|---|
$50\le t<55$ | $5$ | $5$ |
$55\le t<60$ | $10$ | $15$ |
$60\le t<65$ | $25$ | $40$ |
$65\le t<70$ | $26$ | $66$ |
$70\le t<75$ | $40$ | $106$ |
$75\le t<80$ | $49$ | $155$ |
$80\le t<85$ | $28$ | $183$ |
Total | $183$ | |
Notice that the columns in a cumulative frequency histogram will always increase in size from left to right. The frequency represented by any particular column will be equal to the difference in height between that column and the one before it.
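This relationship lets us recover the original frequencies from the cumulative frequencies alone. A minimal sketch in Python, using the cumulative frequency column from the table above (the variable names are our own):

```python
# Column heights of the cumulative frequency histogram, left to right.
cumulative = [5, 15, 40, 66, 106, 155, 183]

# The first frequency equals the first column height. Every other frequency
# is the difference between a column and the one before it.
frequencies = [cumulative[0]] + [
    current - previous
    for previous, current in zip(cumulative, cumulative[1:])
]

print(frequencies)  # [5, 10, 25, 26, 40, 49, 28]
```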
A cumulative frequency polygon, also known as an ogive, is a line graph connecting cumulative frequencies at the upper endpoint of each class interval. Sometimes the cumulative frequency histogram and polygon are displayed together:
The cumulative frequency polygon can also be displayed on its own.
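The points joined by the polygon can be listed with a short Python sketch, using the class intervals and cumulative frequencies from the table above. The extra starting point at the lower endpoint of the first class, with a cumulative frequency of $0$, is a common convention that we are assuming here; it is not stated in the text above.

```python
# Class intervals as (lower endpoint, upper endpoint), with their
# cumulative frequencies, copied from the table above.
intervals = [(50, 55), (55, 60), (60, 65), (65, 70), (70, 75), (75, 80), (80, 85)]
cumulative = [5, 15, 40, 66, 106, 155, 183]

# Each point of the polygon sits at the upper endpoint of a class interval.
points = [(upper, cf) for (_, upper), cf in zip(intervals, cumulative)]

# Assumed convention: start the polygon at the lower endpoint of the
# first class, where no scores have been counted yet.
points_with_start = [(intervals[0][0], 0)] + points

print(points_with_start)
# [(50, 0), (55, 5), (60, 15), (65, 40), (70, 66), (75, 106), (80, 155), (85, 183)]
```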
To find the median, we can use the cumulative frequency for each score. Consider the table below:
Score ($x$) | Frequency ($f$) | Cumulative frequency |
---|---|---|
$1$ | $6$ | $6$ |
$2$ | $9$ | $15$ |
$3$ | $1$ | $16$ |
$4$ | $6$ | $22$ |
$5$ | $8$ | $30$ |
$6$ | $6$ | $36$ |
$7$ | $6$ | $42$ |
$8$ | $2$ | $44$ |
$9$ | $8$ | $52$ |
The final row has a cumulative frequency of $52$, so there are $52$ scores in total. This means that the median will be the mean of the $26$th and $27$th scores in order.
Looking at the cumulative frequency table, there are $22$ scores less than or equal to $4$ and $30$ scores less than or equal to $5$. This means that the $26$th and $27$th scores are both $5$, so the median is $5$.
Finally, we can find the range just by looking at the score column. The highest score is $9$ and the lowest is $1$, so the range will be $9-1=8$.
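The steps above can be sketched in Python. This is an illustration only, with names of our own choosing (`score_at` is a hypothetical helper); it relies on every listed score having a frequency of at least $1$, as is the case in this table.

```python
from bisect import bisect_left
from itertools import accumulate

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
frequencies = [6, 9, 1, 6, 8, 6, 6, 2, 8]

cumulative = list(accumulate(frequencies))
# [6, 15, 16, 22, 30, 36, 42, 44, 52]
n = cumulative[-1]  # 52 scores in total


def score_at(position):
    """Return the score in the given position (counting from 1) when
    all scores are listed in order."""
    # The first score whose cumulative frequency reaches the position.
    return scores[bisect_left(cumulative, position)]


if n % 2 == 0:
    # Even number of scores: mean of the two middle scores.
    median = (score_at(n // 2) + score_at(n // 2 + 1)) / 2
else:
    # Odd number of scores: the single middle score.
    median = score_at((n + 1) // 2)

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

print(median)       # 5.0
print(score_range)  # 8
```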
A principal wants to investigate the performance of students at his school in Performing Arts. To do this, he has the marks of each student studying Performing Arts collected into groups and put into a frequency table. Each group of marks is assigned a grade.
The frequency table for this is shown below.
Complete the cumulative frequency column.
Grade | Score $(x)$ | Frequency $(f)$ | Cumulative Frequency $(cf)$ |
---|---|---|---|
$E$ | $0\le x<20$ | $7$ | $\square$ |
$D$ | $20\le x<40$ | $14$ | $\square$ |
$C$ | $40\le x<60$ | $32$ | $\square$ |
$B$ | $60\le x<80$ | $97$ | $\square$ |
$A$ | $80\le x<100$ | $62$ | $\square$ |
Calculate the total frequency.
Identify the class size.
Complete the sentence below.
Approximately three quarters of the scores recorded are greater than $\square$.
Consider the table.
Score ($x$) | Cumulative Frequency ($cf$) |
---|---|
$10$ | $7$ |
$11$ | $15$ |
$12$ | $18$ |
$13$ | $20$ |
$14$ | $26$ |
How many scores were there in total?
How many scores of $14$ were there?
How many scores less than $13$ were there?
Consider the histogram attached.
How many scores were there in total?
How many scores of $46$ occurred?