iGCSE (2021 Edition)

18.08 Grouped data graphs

Lesson

 

Grouped data

Continuous numerical data, such as times, heights, weights or temperatures, are based on measurements, so any data value is possible within a large range of values. 

As an example, the following frequency distribution table represents the times taken for $72$ runners to complete a ten kilometre race.

| Class interval | Frequency |
| --- | --- |
| $45\le\text{time }<50$ | $9$ |
| $50\le\text{time }<55$ | $7$ |
| $55\le\text{time }<60$ | $20$ |
| $60\le\text{time }<65$ | $30$ |
| $65\le\text{time }<70$ | $6$ |
 

 

Class intervals

What may surprise us at first is that the table above has only five rows, even though it represents $72$ different data values. This is because the data is first grouped into class intervals (also known as classes or bins) in the frequency distribution table.

In the table above,

  • The first class interval includes the running times for $9$ different runners. Each of their times falls within a range that is greater than or equal to $45$ minutes, but less than $50$ minutes.

  • The second class interval includes the running times for $7$ different runners, each with a time falling within a range greater than or equal to $50$ minutes, but less than $55$ minutes.

 

Important!

Every data value must go into exactly one class interval.

Class intervals should be of equal width.
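
Both rules can be checked mechanically for the running-times table. The following is a minimal Python sketch, not part of the original lesson. The names `intervals`, `frequencies` and `count_matching_intervals` are illustrative assumptions, and the code assumes the convention used in that table (lower endpoint included, upper endpoint excluded).

```python
# Class intervals from the running-times table as (lower, upper) pairs in minutes.
# Assumed convention: lower endpoint included, upper endpoint excluded.
intervals = [(45, 50), (50, 55), (55, 60), (60, 65), (65, 70)]
frequencies = [9, 7, 20, 30, 6]


def count_matching_intervals(time):
    """Count how many class intervals contain the given time."""
    return sum(1 for lower, upper in intervals if lower <= time < upper)


# Rule 1: a value on a shared boundary still lands in exactly one interval.
print(count_matching_intervals(50))  # 1

# Rule 2: every interval has the same width.
widths = {upper - lower for lower, upper in intervals}
print(widths)  # {5}

# The frequencies account for every runner in the race.
print(sum(frequencies))  # 72
```

A time of $70$ minutes or more would match no interval here, which is a sign that another class interval would be needed for such a value.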

 

There are several different ways that class intervals are defined. Here are some examples with two adjacent class intervals:

| Class interval formats | Description |
| --- | --- |
| $45<\text{time }\le50$, $50<\text{time }\le55$ | Upper endpoint included, lower endpoint excluded. |
| $45\le\text{time }<50$, $50\le\text{time }<55$ | Lower endpoint included, upper endpoint excluded. |
| $45$ to $<50$, $50$ to $<55$ | Lower endpoint included, upper endpoint excluded. |
| $45$ - $49$, $50$ - $54$ | Suitable for data rounded to the nearest whole number, or discrete data. |
| $45$ → $50$, $50$ → $55$ | Not clear which endpoints are included or excluded. Assume upper endpoint is included. |

Regardless of the format used, it should be applied consistently across all class intervals for a given set of data.
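
The choice of endpoint convention matters most for a value that sits exactly on a boundary. As a hedged illustration (the function `in_interval` and its `included` parameter are hypothetical names, not from the lesson), the sketch below tests a time of exactly $50$ minutes against two adjacent intervals under each convention.

```python
def in_interval(value, lower, upper, included="lower"):
    """Check membership under one of the two endpoint conventions.

    included="lower": lower <= value < upper
    included="upper": lower < value <= upper
    """
    if included == "lower":
        return lower <= value < upper
    if included == "upper":
        return lower < value <= upper
    raise ValueError("included must be 'lower' or 'upper'")


# A time of exactly 50 minutes sits on the boundary between two intervals.
print(in_interval(50, 45, 50, included="lower"))  # False
print(in_interval(50, 50, 55, included="lower"))  # True
print(in_interval(50, 45, 50, included="upper"))  # True
print(in_interval(50, 50, 55, included="upper"))  # False
```

Under either convention the boundary value belongs to exactly one interval, but which one depends on the convention, so mixing conventions within one table could count a value twice or not at all.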

Note: In this course, class intervals for any particular set of data will be the same width. There are situations in data representation when class intervals are different widths, but this is beyond the scope of this course.

 

 

Class centre

The class centre is the average of the endpoints of each interval.

For example, if the class interval is $45\le\text{time }<50$, or $45$ - $50$, the class centre is calculated as follows:

class centre $=$ $\frac{45+50}{2}$
  $=$ $47.5$

 

Because the class centre is an average of the endpoints, it is often used as a single value to represent the class interval. 
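
The same calculation can be written as a small Python function. This is a minimal sketch; the name `class_centre` is an assumption chosen for illustration.

```python
def class_centre(lower, upper):
    """Return the average of the two endpoints of a class interval."""
    return (lower + upper) / 2


# The interval 45 <= time < 50 from the running-times table.
print(class_centre(45, 50))  # 47.5
```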

 

Practice question

Question 1

Find the class centre for the class interval $19\le t<23$ where $t$ represents time.

 

Frequency polygon

Using the example of running times, we can add a 'class centre' column to the frequency distribution table.

| Class interval | Class centre | Frequency |
| --- | --- | --- |
| $45\le\text{time }<50$ | $47.5$ | $9$ |
| $50\le\text{time }<55$ | $52.5$ | $7$ |
| $55\le\text{time }<60$ | $57.5$ | $20$ |
| $60\le\text{time }<65$ | $62.5$ | $30$ |
| $65\le\text{time }<70$ | $67.5$ | $6$ |

The class centre is used to create a frequency polygon. 

A frequency polygon is a line graph that displays the frequency distribution of a set of data. 

  • The class centres are used as the scale on the horizontal axis. Each point on the frequency polygon is a coordinate pair made up of the class centre and the frequency: $\left(\text{class centre },\text{frequency }\right)$.
  • A frequency polygon can be drawn together with a frequency histogram or it can be displayed on its own.
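
The points of the frequency polygon for the running-times data can be generated directly from the table. The sketch below is an illustration only, with `intervals`, `frequencies` and `points` as assumed variable names. It stops at producing the coordinate pairs, which could then be plotted and joined with straight line segments by hand or in any graphing tool.

```python
# Class intervals (in minutes) and frequencies from the running-times table.
intervals = [(45, 50), (50, 55), (55, 60), (60, 65), (65, 70)]
frequencies = [9, 7, 20, 30, 6]

# Each point is (class centre, frequency).
points = [
    ((lower + upper) / 2, frequency)
    for (lower, upper), frequency in zip(intervals, frequencies)
]

print(points)
# [(47.5, 9), (52.5, 7), (57.5, 20), (62.5, 30), (67.5, 6)]
```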

 

Practice questions

Question 2

As part of a fuel watch initiative, the price of petrol, $p$, at a service station was recorded each day for $21$ days. The frequency table shows the findings.

| Price (in cents per litre) | Class centre | Frequency |
| --- | --- | --- |
| $120.9<p\le125.9$ | $123.4$ | $4$ |
| $125.9<p\le130.9$ | $128.4$ | $6$ |
| $130.9<p\le135.9$ | $133.4$ | $5$ |
| $135.9<p\le140.9$ | $138.4$ | $6$ |

  1. What was the highest price that could have been recorded?

  2. On how many days was the price above $130.9$ cents?

Outcomes

0607C11.4C

Mean, mode, median and range from grouped discrete data.

0607C11.7B

Use of a graphic display calculator to calculate mean for grouped data.

0607E11.4C

Mean, mode, median and range from grouped discrete data.

0607E11.5

Mean from continuous data.

0607E11.7B

Use of a graphic display calculator to calculate mean for grouped data.
