
8.08 Histograms

Histograms

Numerical data, such as times, heights, weights or temperatures, come from measurements, so a data value can fall anywhere within a large range of values. Instead of having a visual for every single data point, we can represent the frequency of each value, that is, the number of times that the value occurs in the data set. When displaying frequency information for this type of data, a special chart called a histogram is used.

The histogram represents the distribution of the data. It allows us to see clearly where the data values fall along a numerical scale. The x-axis represents the measurements in the data set, and the y-axis represents the frequency, or the number of times that each measurement occurs in the data set.

Let's look at some examples of histograms and practice interpreting them.

Examples

Example 1

A government agency records how long people wait on hold to speak to their representatives. The results are displayed in the histogram below:

A histogram titled Time on hold with frequency on the y axis, Length of hold on x axis. Ask your teacher for more information.
a

Complete the corresponding frequency table:

Length of hold (minutes) | Frequency
1 |
2 |
3 |
4 |
5 |
Worked Solution
Create a strategy

List the corresponding frequency of each length of hold (minutes).

Apply the idea
Length of hold (minutes) | Frequency
1 | 11
2 | 12
3 | 11
4 | 2
5 | 4
b

How many phone calls were made?

Worked Solution
Create a strategy

Add all the frequencies from part (a).

Apply the idea
\text{Number of calls} = 11+12+11+2+4    (Find the sum of all frequencies)
= 40    (Evaluate)
c

How long in total did these people wait on hold?

Worked Solution
Create a strategy

Multiply each hold time by its frequency, and add the results together.

Apply the idea
\text{Total time} = (1\times 11)+(2\times 12)+(3\times 11)+(4\times 2)+(5\times 4)    (Multiply each time by its frequency)
= 11+24+33+8+20    (Evaluate the multiplication)
= 96\text{ minutes}    (Evaluate the addition)
d

What was the mean wait time? Give your answer as a decimal.

Worked Solution
Create a strategy

To find the mean wait time, divide the total hold time by the total calls.

Apply the idea
\text{Mean} = \dfrac{96}{40}    (Divide the total hold time by the total calls)
= 2.4\text{ minutes}    (Evaluate)
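
Parts (b) to (d) can also be checked with a short script. The following is a minimal Python sketch, not part of the original lesson; the variable names are chosen here for illustration, and the frequencies are the ones read from the table in part (a).

```python
# Frequency table from Example 1: hold time in minutes -> number of calls.
hold_frequencies = {1: 11, 2: 12, 3: 11, 4: 2, 5: 4}

# Part (b): the number of calls is the sum of the frequencies.
total_calls = sum(hold_frequencies.values())

# Part (c): multiply each hold time by its frequency, then add the results.
total_minutes = sum(minutes * count for minutes, count in hold_frequencies.items())

# Part (d): the mean is the total hold time divided by the number of calls.
mean_wait = total_minutes / total_calls

print(total_calls, total_minutes, mean_wait)
```

The three printed values should agree with the answers to parts (b), (c) and (d).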

Example 2

The amount of snowfall (in centimeters) is recorded at the base of the mountain each day.

a

To create a frequency histogram of the data, which values go on the horizontal axis?

A
Number of days it snowed each amount.
B
Amount of snowfall.
Worked Solution
Create a strategy

The horizontal axis of a histogram shows the measured values, which may be grouped into intervals (also called bins). Choose the option that describes the measured values.

Apply the idea

The correct option is B: Amount of snowfall.

b

The snowfall recorded each day, to the nearest centimeter, is as follows: 6,\,2,\,0,\,3,\,2,\,2,\,3,\,4,\,2,\,0,\,3,\,2,\,3,\,4,\,6,\,4,\,3,\,0,\,5,\,3

Construct a frequency histogram of the data.

Worked Solution
Create a strategy

Create a frequency table for the given data by counting the number of times each value has been recorded, then construct the histogram.

Apply the idea

The data can be represented by the following frequency table.

Amount of Snowfall (cm) | Frequency
0 | 3
2 | 5
3 | 6
4 | 3
5 | 1
6 | 2

Based on the frequency table, the histogram should look like the following:

A histogram titled Snowfall with frequency on y-axis, Amount of Snowfall on x-axis. Ask your teacher for more information.
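
The tally in part (b) can be reproduced in code. This is a minimal Python sketch using the standard library's `collections.Counter`; the variable names are illustrative. It loops over every whole number from the smallest to the largest value, so an amount that never occurs (1 centimeter here) is listed with a frequency of 0.

```python
from collections import Counter

# Daily snowfall in centimeters, copied from part (b).
snowfall_cm = [6, 2, 0, 3, 2, 2, 3, 4, 2, 0, 3, 2, 3, 4, 6, 4, 3, 0, 5, 3]

# Counter tallies how many times each value appears.
snowfall_counts = Counter(snowfall_cm)

# Print one row per whole-number amount; missing amounts count as 0.
for amount in range(min(snowfall_cm), max(snowfall_cm) + 1):
    print(amount, snowfall_counts[amount])
```
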
c

On how many days did 3 centimeters of snow fall?

Worked Solution
Create a strategy

Look at the histogram from part (b) and read off the frequency for 3 centimeters of snowfall.

Apply the idea

\text{Days}=6

d

On how many days did at least 4 centimeters of snow fall?

Worked Solution
Create a strategy

Add the frequencies of all the amounts that are 4 centimeters or higher.

Apply the idea
\text{Days} = 3+1+2    (Add the frequencies for 4, 5 and 6 centimeters)
= 6    (Evaluate)
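
The same "at least 4" count can be written as a filtered sum. This is a small Python sketch with illustrative names, using the frequencies from the table in part (b).

```python
# Snowfall amount in centimeters -> number of days, from part (b).
snowfall_frequencies = {0: 3, 2: 5, 3: 6, 4: 3, 5: 1, 6: 2}

# "At least 4" means 4 or more, so keep only amounts >= 4 and add their frequencies.
days_at_least_4 = sum(
    count for amount, count in snowfall_frequencies.items() if amount >= 4
)

print(days_at_least_4)
```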

Intervals or bins

In the next example, the data needs to be grouped into intervals (also called bins) before we can construct the frequency table and the histogram. The data are the times taken by 72 runners to complete a ten-kilometer race.

This image shows a histogram showing the distribution of running times. Ask your teacher for more information.

What may surprise us at first is that the histogram has only five columns, even though it represents 72 different data values.

To produce the histogram, the data is first grouped into intervals, which are ranges of values in the data set, and the number of values in each interval is recorded in a frequency table.

The first interval includes the running times for 9 different runners. Each of their times falls within a range that is greater than or equal to 45 minutes, but less than 50 minutes. This interval is represented by the first column in the histogram.

Interval (minutes) | Frequency
45\ - \lt 50 | 9
50\ - \lt 55 | 7
55\ - \lt 60 | 20
60\ - \lt 65 | 30
65\ - \lt 70 | 6

The second interval includes the running times for 7 different runners, each with a time falling within a range greater than or equal to 50 minutes, but less than 55 minutes. This interval is represented by the second column in the histogram, and so on.

Every data value must go into exactly one interval (or bin).
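
One way to see why each value lands in exactly one interval is to compute the interval directly from the value. This is a minimal Python sketch under stated assumptions: the function name `interval_label` and its parameters are hypothetical, and the times passed to it are made-up illustrations, not the runners' actual data. The intervals match the table above: five intervals of width 5 starting at 45 minutes, each including its lower boundary but not its upper boundary.

```python
def interval_label(time_minutes, start=45, width=5, count=5):
    """Return the label of the interval containing time_minutes, e.g. '50 - <55'."""
    end = start + width * count
    if not (start <= time_minutes < end):
        raise ValueError(f"{time_minutes} is outside {start} - <{end}")
    # Floor division gives the position of the interval, counting from 0.
    index = int((time_minutes - start) // width)
    lower = start + index * width
    return f"{lower} - <{lower + width}"

# Hypothetical times: a boundary value such as 50 belongs to the interval it starts.
print(interval_label(49.9))
print(interval_label(50))
```

Because the lower boundary is included and the upper boundary is excluded, a time of exactly 50 minutes is counted once, in the second interval, and never in the first.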

There are some general guidelines to use when attempting to create intervals:

  • Intervals should be all the same size.
  • Intervals should include all of the data.
  • Boundaries for intervals should reflect the data values being represented.
  • Determine the number of intervals based upon the data.
  • If possible, choose a number of intervals that is a factor of the number of data values (e.g. a histogram representing 20 data values might have 4 or 5 intervals). This will simplify the process.

The key features of a histogram are:

  • The horizontal axis is a numerical scale (like a number line)

  • The data on the horizontal axis may be grouped into intervals

  • There are no gaps between the columns of a histogram

  • The height of each column will be the frequency

Histograms are not the same as bar graphs. The two major differences between them are:

  1. In a bar graph, the bars do not touch.

  2. Bar graphs are normally used to represent categorical data (e.g. eye color, hair color, gender, etc.) along the horizontal axis, rather than numerical data.

Examples

Example 3

Consider the following set of values:

44,62,56,53,31,78,59,46,32,41,65

a

Which set of five intervals should we use to analyze this data?

A
40\to 49,\, 50 \to 59,\, 60 \to 69,\, 70\to 79 and 80 \to 89
B
20\to 29,\, 30 \to 39,\, 50 \to 59,\, 60\to 69 and 70 \to 79
C
30\to 39,\, 40 \to 49,\, 50 \to 59,\, 60\to 69 and 70 \to 79
D
30\to 44,\, 40 \to 44,\, 50 \to 54,\, 60\to 74 and 70 \to 74
Worked Solution
Create a strategy

The intervals we choose for this data need to cover all the values in the data set. That is, there should be no value that does not belong to an interval.

Also, the upper boundary of any interval should be adjacent to the lower boundary of the next interval, i.e. there should be no gaps between the intervals.

The size of each interval must be the same.

Apply the idea

Which of the sets of intervals satisfy the conditions above?

A quick way to check this is to ask: Do these intervals contain every whole number from the smallest number in the data (31) to the largest number in the data (78)?

If so, and if each interval is the same size, then it is a good choice of intervals.

The correct answer is letter C.

30\to 39,\, 40 \to 49,\, 50 \to 59,\, 60\to 69 and 70 \to 79
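
The three conditions from the strategy can be tested against each option in code. This is a minimal Python sketch; the function name `is_good_choice` is hypothetical, and each interval is written as a pair of whole-number boundaries, both included.

```python
data = [44, 62, 56, 53, 31, 78, 59, 46, 32, 41, 65]

# Each option is a list of (lowest value, highest value) pairs.
options = {
    "A": [(40, 49), (50, 59), (60, 69), (70, 79), (80, 89)],
    "B": [(20, 29), (30, 39), (50, 59), (60, 69), (70, 79)],
    "C": [(30, 39), (40, 49), (50, 59), (60, 69), (70, 79)],
    "D": [(30, 44), (40, 44), (50, 54), (60, 74), (70, 74)],
}

def is_good_choice(intervals, values):
    # Same size: every interval holds the same number of whole numbers.
    same_size = len({high - low + 1 for low, high in intervals}) == 1
    # No gaps or overlaps: each interval starts right after the previous one ends.
    adjacent = all(
        nxt[0] == prev[1] + 1 for prev, nxt in zip(intervals, intervals[1:])
    )
    # Coverage: every data value falls inside some interval.
    covered = all(
        any(low <= v <= high for low, high in intervals) for v in values
    )
    return same_size and adjacent and covered

good = [name for name, intervals in options.items() if is_good_choice(intervals, data)]
print(good)
```

Option A misses 31 and 32, option B skips the forties, and option D mixes interval sizes, so only option C passes all three checks.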

b

Create a frequency table using the set of intervals from part (a).

Worked Solution
Create a strategy

Create a table with two columns. The first column contains the intervals of values while the second column contains the frequencies.

The frequency of an interval of values is the number of times values that belong to that interval appear in the given data.

Apply the idea

How many numbers in the given data are between 30 and 39? This will be the frequency of the first interval.

Count the number of values in each interval. Write the frequency in the corresponding box.

Values | Frequency
30 \, - 39 | 2
40 \, - 49 | 3
50 \, - 59 | 3
60 \, - 69 | 2
70 \, - 79 | 1
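
The counting step can be sketched in Python as follows. The names are illustrative; the data and intervals are those of this example, with both boundaries of each interval included.

```python
data = [44, 62, 56, 53, 31, 78, 59, 46, 32, 41, 65]
intervals = [(30, 39), (40, 49), (50, 59), (60, 69), (70, 79)]

# For each interval, count the data values between its boundaries (inclusive).
frequency_table = {
    f"{low} - {high}": sum(1 for v in data if low <= v <= high)
    for low, high in intervals
}

for label, count in frequency_table.items():
    print(label, count)
```

A useful check is that the frequencies add up to the number of data values, which is 11 here.
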
c

Construct a histogram to display the data shown in the frequency table.

Worked Solution
Create a strategy

Examine the frequency table to determine how many times each range of values occurred to construct the histogram.

Apply the idea
A histogram showing the distribution of values and frequency. Ask your teacher for more information.
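
As a rough stand-in for the drawn histogram, the frequency table can be printed as rows of symbols. This Python sketch is only an approximation: the bars run sideways rather than upward, and the names are illustrative. The length of each row plays the role of the column height.

```python
# Frequency table from part (b).
frequency_table = {"30 - 39": 2, "40 - 49": 3, "50 - 59": 3, "60 - 69": 2, "70 - 79": 1}

# One row per interval; the number of '#' symbols equals the frequency.
rows = [f"{label} | {'#' * count}" for label, count in frequency_table.items()]

print("\n".join(rows))
```
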
Idea summary

Every data value must go into exactly one interval (or bin).

The key features of a histogram are:

  • The horizontal axis is a numerical scale (like a number line)

  • The data on the horizontal axis may be grouped into intervals

  • There are no gaps between the columns of a histogram

  • The height of each column will be the frequency

Outcomes

6.SP.B.4

Display numerical data in plots on a number line, including dot plots, histograms, and box plots.

6.SP.B.5

Summarize numerical data sets in relation to their context, such as by:

6.SP.B.5.A

Reporting the number of observations

6.SP.B.5.B

Describing the nature of the attribute under investigation, including how it was measured and its units of measurement.

6.SP.B.5.C

Giving quantitative measures of center (median and/or mean) and variability (interquartile range and/or mean absolute deviation), as well as describing any overall pattern and any striking deviations from the overall pattern with reference to the context in which the data was gathered.

6.SP.B.5.D

Relating the choice of measures of center and variability to the shape of the data distribution and the context in which the data was gathered
