10. Representing Data

Lesson

We have already seen how the frequencies of data values can be used to create a histogram. The cumulative frequencies can also be plotted to create another type of chart, called a cumulative frequency graph. This graph will be used in the next chapter for finding values such as the median and interquartile range from a set of grouped data.

Cumulative frequency is a 'running total' of the frequencies. To calculate it, we add an additional column to the frequency distribution table:

class interval | frequency | cumulative frequency |
---|---|---|
$50\le t<55$ | $5$ | $5$ |
$55\le t<60$ | $10$ | $5+10=15$ |
$60\le t<65$ | $25$ | $15+25=40$ |
$65\le t<70$ | $26$ | $40+26=66$ |
$70\le t<75$ | $40$ | $66+40=106$ |
$75\le t<80$ | $49$ | $106+49=155$ |
$80\le t<85$ | $28$ | $155+28=183$ |
Total | $183$ | |

- The first value in the cumulative frequency column will always be the same as the first value in the frequency column.
- To get the second cumulative frequency value, we add the second frequency to the first cumulative frequency value: $5+10=15$.
- The second cumulative frequency value tells us there are $15$ values in the interval $50\le t<60$.
- The third cumulative frequency value tells us there are $40$ values in the interval $50\le t<65$, and so on.
- The final cumulative frequency value is always equal to the sum of the frequencies. In this case, there are $183$ values in the entire data set, represented by $50\le t<85$.
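
The running total described above can be sketched in a few lines of Python. This is only an illustration using the frequencies from the table; the variable names are our own choice, and `itertools.accumulate` from the standard library does the adding.

```python
from itertools import accumulate

# Frequencies from the table above, in class-interval order (50-55, 55-60, ..., 80-85).
frequencies = [5, 10, 25, 26, 40, 49, 28]

# accumulate() yields the running total: each value is the previous total plus the next frequency.
cumulative_frequencies = list(accumulate(frequencies))

print(cumulative_frequencies)  # [5, 15, 40, 66, 106, 155, 183]

# The final cumulative frequency equals the sum of all the frequencies.
print(cumulative_frequencies[-1] == sum(frequencies))  # True
```

The first entry matches the first frequency, and the last entry matches the total of $183$, just as the points above describe.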

The frequency distribution table below shows the heights ($h$), in centimetres, of a group of children aged $5$ to $11$.

Child's height in cm | frequency | cumulative frequency |
---|---|---|
$90<h\le 100$ | $5$ | $5$ |
$100<h\le 110$ | $22$ | $27$ |
$110<h\le 120$ | $30$ | $57$ |
$120<h\le 130$ | $31$ | $88$ |
$130<h\le 140$ | $18$ | $106$ |
$140<h\le 150$ | $6$ | $112$ |

Use the table to answer the following questions:

- How many children were in the group?
- How many children had heights greater than $130$ cm but less than or equal to $140$ cm?
- Which class interval contained the most children?
- How many children had a height less than or equal to $120$ cm?
- How many children had a height greater than $130$ cm?

**Solution**

- The final cumulative frequency value tells us there were $112$ children in the group. This is equal to the sum of the values in the frequency column.
- The frequency column indicates there are $18$ children with heights in the range $130<h\le 140$.
- The class interval with the highest frequency is $120<h\le 130$.
- The cumulative frequency of the $110<h\le 120$ class interval tells us that $57$ children had a height less than or equal to $120$ cm.
- Here we can add the final two frequencies: $18+6=24$. Alternatively, we could subtract the cumulative frequency of $88$ (corresponding to the class interval $120<h\le 130$, whose upper endpoint is $130$ cm) from the total number of children in the group: $112-88=24$.
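
As a check on the worked solution, the same five answers can be read off the table programmatically. The sketch below assumes each class interval is stored as a `(lower, upper)` pair meaning lower $<h\le$ upper; the names `intervals`, `frequencies` and `cumulative` are our own.

```python
from itertools import accumulate

# Class intervals (lower < h <= upper, in cm) and frequencies copied from the table above.
intervals = [(90, 100), (100, 110), (110, 120), (120, 130), (130, 140), (140, 150)]
frequencies = [5, 22, 30, 31, 18, 6]
cumulative = list(accumulate(frequencies))  # [5, 27, 57, 88, 106, 112]

# 1. Total number of children: the final cumulative frequency.
total_children = cumulative[-1]

# 2. Children with 130 < h <= 140: the frequency of that class.
in_130_to_140 = frequencies[intervals.index((130, 140))]

# 3. Class interval with the highest frequency.
largest_class = intervals[frequencies.index(max(frequencies))]

# 4. Children with h <= 120: cumulative frequency of the class ending at 120.
at_most_120 = cumulative[intervals.index((110, 120))]

# 5. Children with h > 130: total minus the cumulative frequency up to 130.
above_130 = total_children - cumulative[intervals.index((120, 130))]

print(total_children, in_130_to_140, largest_class, at_most_120, above_130)
# 112 18 (120, 130) 57 24
```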

Using the values in the cumulative frequency column, we can create a cumulative frequency histogram.

class interval | frequency | cumulative frequency |
---|---|---|
$50\le t<55$ | $5$ | $5$ |
$55\le t<60$ | $10$ | $15$ |
$60\le t<65$ | $25$ | $40$ |
$65\le t<70$ | $26$ | $66$ |
$70\le t<75$ | $40$ | $106$ |
$75\le t<80$ | $49$ | $155$ |
$80\le t<85$ | $28$ | $183$ |
Total | $183$ | |

Notice that the columns in a cumulative frequency histogram never decrease in height from left to right, and they increase whenever a class has a non-zero frequency. The frequency represented by any particular column is equal to the difference in height between that column and the one before it.
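
The link between column heights and frequencies can be checked by working backwards from the cumulative frequencies. In this small sketch the column heights are taken from the table above, and the height "before" the first column is assumed to be $0$.

```python
# Column heights of the cumulative frequency histogram, from the table above.
cumulative = [5, 15, 40, 66, 106, 155, 183]

# Pair each column with the one before it; nothing comes before the first column, so use 0.
previous = [0] + cumulative[:-1]

# Each frequency is the difference in height between a column and the one before it.
frequencies = [current - before for current, before in zip(cumulative, previous)]

print(frequencies)  # [5, 10, 25, 26, 40, 49, 28]
```

The differences reproduce the original frequency column exactly.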

A cumulative frequency polygon, also known as an **ogive**, is a line graph connecting cumulative frequencies at the upper endpoint of each class interval. Sometimes the cumulative frequency histogram and polygon are displayed together:

The cumulative frequency polygon can also be displayed on its own.
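
The points that an ogive joins can be listed directly from the table. The sketch below pairs each upper endpoint with its cumulative frequency. Starting the polygon at the lower bound of the first class with a cumulative frequency of $0$ is a common convention that we assume here; it is not stated in the lesson.

```python
from itertools import accumulate

# Upper endpoints of the class intervals and their frequencies, from the table above.
upper_endpoints = [55, 60, 65, 70, 75, 80, 85]
frequencies = [5, 10, 25, 26, 40, 49, 28]
cumulative = list(accumulate(frequencies))

# Assumption: the polygon starts at (50, 0), the lower bound of the first class.
first_lower_bound = 50
ogive_points = [(first_lower_bound, 0)] + list(zip(upper_endpoints, cumulative))

for t, cf in ogive_points:
    print(t, cf)
```

Joining these points in order with straight line segments, using any plotting tool, gives the cumulative frequency polygon.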

A principal wants to investigate the performance of students at his school in Performing Arts. To do this, he has the marks of each student studying Performing Arts collected into groups and put into a frequency table. Each group of marks is assigned a grade.

The frequency table for this is shown below.

Complete the cumulative frequency column.

Grade | Score ($x$) | Frequency ($f$) | Cumulative Frequency ($cf$) |
---|---|---|---|
$E$ | $0\le x<20$ | $7$ | |
$D$ | $20\le x<40$ | $14$ | |
$C$ | $40\le x<60$ | $32$ | |
$B$ | $60\le x<80$ | $97$ | |
$A$ | $80\le x<100$ | $62$ | |

Calculate the total frequency.

Identify the class size.

Complete the sentence below.

Approximately three quarters of the scores recorded are greater than ______.

Consider the table.

Score ($x$) | Cumulative Frequency ($cf$) |
---|---|
$10$ | $7$ |
$11$ | $15$ |
$12$ | $18$ |
$13$ | $20$ |
$14$ | $26$ |

How many scores were there in total?

How many scores of $14$ were there?

How many scores of less than $13$ were there?

Consider the histogram attached.

How many scores were there in total?

How many scores of $46$ occurred?

represents information in symbolic, graphical and tabular form

develops and carries out simple statistical processes to answer questions posed