AustraliaVIC
VCE 12 General 2023

1.02 Tables and charts

Introduction

Graphs are a visual way of presenting information. They can be very useful as they help us sort and order the information we collect and present it in a clear, concise way. Selecting a good type of graph to display your data is important, and the best type of graph to choose will change depending on the type of information you need to display. Let's go through a few different types of graphs now.

Bar graphs

Bar graph is a generic name for any graph that displays information using rectangular or cylindrical bars.

Examples

Example 1

The sales (in thousands) of different products are shown in the following horizontal bar graph.

The image shows a horizontal bar graph. Ask your teacher for more information.
a

Which of the following is the best-selling product?

A
Product A
B
Product B
C
Product C
D
Product D
E
Product E
F
Product F
Worked Solution
Create a strategy

We need to choose which product has the longest bar in the graph.

Apply the idea

Product D has the highest sales, so the correct answer is option D.

b

How many units of all products were sold in total?

Worked Solution
Create a strategy

Add together the number of units sold of each product.

Apply the idea
\begin{aligned}
\text{Total sales} &= 9000 + 6000 + 5000 + 10\,000 + 9000 + 7000 && \text{Add all the units sold} \\
&= 46\,000 \text{ units} && \text{Evaluate}
\end{aligned}
c

If product B was sold at \$50 each, find the revenue generated by product B alone.

Worked Solution
Create a strategy

We need to multiply the number of units of product B sold by \$50.

Apply the idea
\begin{aligned}
\text{Revenue} &= 6000 \times 50 && \text{Multiply 6000 by \$50} \\
&= \$300\,000 && \text{Evaluate}
\end{aligned}
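The three parts of Example 1 can also be checked with a short Python sketch. The figures come from the sum in part (b). Assigning them to products A to F in that order is an assumption, since the graph is not reproduced here; only the value for product B (6000) and product D being the best seller are confirmed by the worked solutions.

```python
# Units sold per product, read off the bar graph in Example 1.
# Assumption: the values in the part (b) sum are listed in order A to F.
sales = {"A": 9000, "B": 6000, "C": 5000, "D": 10_000, "E": 9000, "F": 7000}

best_seller = max(sales, key=sales.get)   # product with the longest bar
total_units = sum(sales.values())         # part (b): add all units sold
revenue_b = sales["B"] * 50               # part (c): units of B times $50

print(best_seller, total_units, revenue_b)
```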
Idea summary

Bar graph is a generic name for any graph that displays information using rectangular or cylindrical bars.

Column graphs

A column graph is the name for a specific type of bar graph that uses vertical bars, so that they appear like columns.

Examples

Example 2

A survey of the preferred sport was done for a group of boys and the results are shown in the column graph below:

The column graph shows the number of boys who prefer each type of sport. Ask your teacher for more information.
a

How many boys prefer football to other sports?

Worked Solution
Create a strategy

Look across to the number that matches the height of the bar for football.

Apply the idea

The height of the bar for football goes up to the line next to the number 6 on the left.

6 boys prefer football to other sports.

b

Which type of sport is the most popular?

Worked Solution
Create a strategy

Choose the sport that has the tallest column in the graph.

Apply the idea

The most popular type of sport is the one preferred by the most boys. Hockey has the tallest column in the graph, so it is the most preferred.

The most popular type of sport is hockey.

c

How many boys took part in the survey?

Worked Solution
Create a strategy

Add the heights of each column.

Apply the idea
\begin{aligned}
\text{Total boys} &= 6 + 8 + 7 + 7 + 10 && \text{Add the heights} \\
&= 38 \text{ boys} && \text{Evaluate the addition}
\end{aligned}
Idea summary

A column graph is the name for a specific type of bar graph that uses vertical bars, so that they appear like columns.

When to use bar graphs and column graphs

  • Bar graphs and column graphs are used to display categorical data.

  • There is one bar for each category and the height or length of the bar represents the frequency.

  • Bars are drawn with gaps to show that each value is a separate category.

Let's say we decided to conduct an experiment to find the most common car colour in the neighbourhood, and we are going to record the colours of the next 50 cars that drive past.

How would we keep track of what we saw? We could write a list, but that might look a bit messy and be a bit hard to understand, like the list below:

Green, white, yellow, white, black, green, black, blue, blue, gold, silver, white, black, gold, green, blue, purple, blue, white, black, gold, silver, silver, red, red, red, black, gold, red, blue, white, black, silver, silver, purple, pink, white, blue, red, black, yellow, blue, white, white, red, green, pink, black, white, red.

A nicer way to keep track of data like this is to create a frequency table and keep a tally of the results.

Frequency refers to how often an event occurs. We make use of frequency tables as an easy way to display data because we can have one column showing a list of the possible outcomes that may occur, a second column with tally marks of the frequency of each event (although this column isn't always included), and a third with the total frequency as a number. Frequency tables are useful for surveys, as you can keep a running total easily each time someone responds.
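As a rough illustration, the tallying described above can be sketched in Python using the 50 car colours from the list. The variable names are chosen for this sketch only, and the tally column is drawn with simple `|` strokes rather than grouped tally marks.

```python
from collections import Counter

# The 50 car colours from the list above, in the order they were observed.
raw = (
    "green, white, yellow, white, black, green, black, blue, blue, gold, "
    "silver, white, black, gold, green, blue, purple, blue, white, black, "
    "gold, silver, silver, red, red, red, black, gold, red, blue, "
    "white, black, silver, silver, purple, pink, white, blue, red, black, "
    "yellow, blue, white, white, red, green, pink, black, white, red"
)
colours = raw.split(", ")
frequency = Counter(colours)

# Print a simple frequency table: outcome, tally strokes, frequency.
for colour, count in frequency.most_common():
    print(f"{colour:<8}{'|' * count:<10}{count}")

print("Total:", sum(frequency.values()))
```

Sorting by frequency makes the most common colour easy to spot, which is the question the experiment set out to answer.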

Examples

Example 3

In a survey some people were asked approximately how many minutes they take to decide between brands of a particular product.

a

Complete the frequency table.

A table with the tally of minutes taken. Ask your teacher for more information.
Worked Solution
Create a strategy

Count the number of strokes in each group, then write your answer in the frequency column.

Apply the idea
A table with the tally of minutes taken. Ask your teacher for more information.
b

How many people took part in the survey?

Worked Solution
Create a strategy

Add the numbers in the frequency column.

Apply the idea

To add the frequencies we can use a vertical algorithm and add the numbers down each column: \begin{array}{cccc} & &1 &3 \\ & &1 &7 \\ &+ &1 &2 \\ \hline & &4&2 \\ \hline \end{array}

42 people were surveyed.

c

What proportion of people surveyed took 2 minutes to make a decision?

Worked Solution
Create a strategy

We need to divide the number of people who took 2 minutes to make a decision by the total number of people that were surveyed.

Apply the idea

\text{Proportion}=\dfrac{17}{42}
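Parts (b) and (c) can be checked with a small Python sketch. The three frequencies are the ones added in part (b). Labelling the rows as 1, 2 and 3 minutes is an assumption made here, because the table is not reproduced; only the 2-minute frequency of 17 is confirmed by the worked solution.

```python
from fractions import Fraction

# Frequencies from the completed table in Example 3.
# Assumption: the rows are 1, 2 and 3 minutes; only the 2-minute
# frequency (17) is confirmed by the worked solution.
frequency = {1: 13, 2: 17, 3: 12}

total = sum(frequency.values())              # part (b)
proportion = Fraction(frequency[2], total)   # part (c)

print(total, proportion, round(float(proportion), 3))
```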

Idea summary

Frequency refers to how often an event occurs.

A nicer way to keep track of data is to create a frequency table and keep a tally of the results.

Divided bar graphs

A divided bar graph is a graph in which a single bar represents the whole data set, and the bar is divided into segments that show the proportional size of each category.

A bar graph can be any length, but it can be helpful to think about what length could make the data easier to divide up - multiples of 5 or 10 are often good. Remember that you don't want it to be too long or too short.

The image shows jellybeans of different colours. Ask your teacher for more information.

Say we had 100 jellybeans and we divided them up by colour. 30 were green, 28 were pink, 28 were orange and 14 were white.

Since there were 100 jellybeans, we may choose to make our divided bar graph 10 cm long so we have an easy ratio where 1 cm corresponds to 10 jellybeans.

To work out how much of the bar graph each colour represents, we want to write each colour as a fraction of the whole, then evaluate this fraction of the length of the bar. For example, \dfrac{30}{100} or \dfrac{3}{10} of the jellybeans are green and \dfrac{3}{10} \times 10 = 3. This means that 3 cm of the 10 cm bar graph should be given to the green jellybeans. Similarly, \dfrac{28}{100}\times 10=2.8, so 2.8 cm should be given to each of pink and orange, and \dfrac{14}{100}\times 10=1.4, so 1.4 cm should be given to white.

We can check we've calculated everything correctly by adding up the length values: 3 + 2.8 + 2.8 + 1.4 = 10, which matches the length of the bar.
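The segment lengths for the jellybean example can be reproduced with a short Python sketch. The names `counts`, `bar_length_cm` and `segments` are chosen for this illustration only.

```python
import math

# Jellybean counts from the example above.
counts = {"green": 30, "pink": 28, "orange": 28, "white": 14}
bar_length_cm = 10  # chosen so that 1 cm corresponds to 10 jellybeans

total = sum(counts.values())
# Each segment gets the same fraction of the bar as its share of the data.
segments = {colour: n * bar_length_cm / total for colour, n in counts.items()}

for colour, length in segments.items():
    print(f"{colour}: {length:.1f} cm")

# Check: the segment lengths should add back up to the whole bar.
assert math.isclose(sum(segments.values()), bar_length_cm)
```

Changing `bar_length_cm` rescales every segment in the same ratio, which is why a length that divides the total neatly is convenient.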

Examples

Example 4

The divided bar graph shows the percentage of total subscriptions that each newspaper has. The Age has 54\,000 subscriptions:

The image shows a divided bar with different percentages. Ask your teacher for more information.
a

What is 1\% of total subscriptions?

Worked Solution
Create a strategy

The Age makes up 25\% of total subscriptions, so we divide its number of subscriptions by 25.

Apply the idea
\begin{aligned}
1\% \text{ of total subscriptions} &= \dfrac{54\,000}{25} && \text{Divide } 54\,000 \text{ by } 25 \\
&= 2160 && \text{Evaluate the quotient}
\end{aligned}
b

Find the total number of subscriptions.

Worked Solution
Create a strategy

Multiply the number of subscriptions that represent 1\% from part (a) by 100.

Apply the idea
\begin{aligned}
\text{Total number of subscriptions} &= 2160 \times 100 && \text{Multiply 2160 by 100} \\
&= 216\,000 && \text{Evaluate}
\end{aligned}
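The two steps of Example 4 (find 1\%, then scale up to 100\%) can be written as one small Python function. The function name `total_from_share` is hypothetical and introduced only for this sketch; the inputs are the 54 000 subscriptions and the divisor of 25 used in the worked solution.

```python
def total_from_share(value, percent):
    """Unitary method: find 1% first, then scale up to 100%."""
    one_percent = value / percent
    return one_percent, one_percent * 100

# The Age: 54 000 subscriptions, divided by 25 as in part (a).
one_percent, total = total_from_share(54_000, 25)
print(one_percent, total)
```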
Idea summary

A divided bar graph is a graph in which a single bar represents the whole data set, and the bar is divided into segments that show the proportional size of each category.

Histograms, bar graphs, and column graphs

Histograms are similar to bar or column graphs. There are two main differences:

  1. Bar or column graphs are usually used to display categorical data, while histograms are used to display numerical data.

  2. Bar and column graphs are drawn with spaces between the columns, while histograms do not have spaces between the columns.

This is because histograms are used to display discrete or continuous numerical data. In other words, the groups are not distinct categories. Instead, histograms display ranges of data that are determined by the person creating the graph. The width of each column in a histogram shows the interval that it represents.

Each student in a class was surveyed and asked about the colour of their eyes. The data is categorical and the results are displayed in a column graph below:

The column graph shows the number of students with each eye colour. Ask your teacher for more information.

Each student in a class was surveyed and asked the size of their families. The data is numerical and the results are displayed in a histogram below:

The histogram shows the number of children in family. Ask your teacher for more information.

The data that was collected in this survey is called discrete data because it can only take particular values (in this case whole numbers). In histograms that display discrete data, the mark is located at the centre of each column along the horizontal axis. The height of each column represents the frequency of each data item.

Each student in a class was surveyed and asked their heights. The data is numerical and the results are displayed in a histogram below:

The histogram shows the heights of the students. Ask your teacher for more information.

The data that was collected in this survey is called continuous data because it can take any value within a range. In histograms that display continuous data, the column width represents the range of each interval or bin. The height of each column represents the frequency of the data items within that interval.
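To illustrate how continuous data is grouped into intervals before a histogram is drawn, here is a minimal Python sketch. The heights, the bin width of 5 cm and the starting value of 150 cm are all hypothetical; they are not the data behind the histogram above.

```python
from collections import Counter

# Hypothetical heights in cm (not the data behind the histogram above).
heights = [151.2, 154.8, 155.0, 158.3, 159.9, 160.0, 162.5, 163.1, 167.4, 171.0]
bin_width = 5   # chosen by the person drawing the graph
start = 150     # lower edge of the first bin

def bin_lower_edge(value):
    """Return the lower edge of the interval that contains value."""
    return start + bin_width * int((value - start) // bin_width)

# Count how many heights fall in each interval.
frequency = Counter(bin_lower_edge(h) for h in heights)
for lower in sorted(frequency):
    print(f"{lower} to <{lower + bin_width}: {frequency[lower]}")
```

Each interval includes its lower edge and excludes its upper edge, so a height of exactly 155.0 is counted once, in the 155 to <160 bin.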

Examples

Example 5

Continuous data is represented in a histogram as shown:

The histogram shows the scores. Ask your teacher for more information.

Complete the following frequency table:

Score    Frequency
21
23
25
27
29
31
Worked Solution
Create a strategy

List the corresponding frequency of each score.

Apply the idea
Score    Frequency
21       20
23       16
25       16
27       12
29       18
31       12
Idea summary

Histograms are similar to bar or column graphs. There are two main differences:

  1. Bar or column graphs are usually used to display categorical data, while histograms are used to display numerical data.

  2. Bar and column graphs are drawn with spaces between the columns, while histograms do not have spaces between the columns.

Shape of data

When we describe the shape of data sets, we want to focus on how the scores are distributed. Some questions that we might be interested in include:

  • Is the distribution symmetrical or not?

  • Are there any clusters or gaps in the data?

  • Are there any outliers?

  • Where is the centre of the data located approximately? (Recall our three measures of centre: mean, median, and mode)

  • Is the data widely spread or very compact? (Recall our three measures of spread: range, interquartile range and standard deviation)

Data may be described as symmetrical or asymmetrical.

There are many cases where the data tends to be around a central value with no bias left or right. In such a case, roughly 50\% of scores will be above the mean and 50\% of scores will be below the mean. In other words, the mean and median roughly coincide.

The normal distribution is a common example of a symmetrical distribution of data.

This image shows a bell-shaped curve.

The normal distribution looks like this bell-shaped curve.

The image shows a bell-shaped curve drawn over a histogram. Ask your teacher for more information.

This picture shows how a data set that is approximately normally distributed may appear in a histogram.

The dark line shows the nice, symmetrical curve that can be drawn over the histogram that the data roughly follows.

In this distribution, the peak of the data represents the mean, the median and the mode (taken as the centre of the modal class). All of these measures of central tendency are equal for this symmetrical distribution.

A uniform distribution is a symmetrical distribution where each outcome is equally likely, so the frequency should be the same for each outcome. For example, when rolling a die the outcomes are equally likely. We might get an irregular column graph if only a small number of rolls were performed, but if we continued to roll the die the distribution would approach a uniform distribution like that shown below.

The image shows a column graph with all columns of the same height. Ask your teacher for more information.
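The idea that more rolls bring the graph closer to uniform can be explored with a small simulation. This is only a sketch: it assumes a fair six-sided die, uses Python's pseudo-random generator with a fixed seed, and the function name `roll_frequencies` is made up for this example.

```python
import random
from collections import Counter

def roll_frequencies(n_rolls, seed=0):
    """Simulate n_rolls of a fair six-sided die and return relative frequencies."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    return {face: counts[face] / n_rolls for face in range(1, 7)}

# Compare a small experiment with a large one.
for n in (30, 60_000):
    freqs = roll_frequencies(n)
    print(n, {face: round(p, 3) for face, p in freqs.items()})
```

With the larger number of rolls, each relative frequency should sit close to 1/6, so the columns would be nearly the same height.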

If a data set is asymmetrical instead (i.e. it isn't symmetrical), it may be described as skewed.

A data set that has positive skew (sometimes called a 'right skew') has a longer tail of values to the right of the data set. The mass of the distribution is concentrated on the left of the figure.

The image shows a curve with its right side stretched out.

A positively skewed graph has this general shape with right side stretched out.

The image shows a curve shown over a histogram of positively skewed data. Ask your teacher for more information.

General shape shown over a histogram of positively skewed data.

A data set that has negative skew (sometimes called a 'left skew') has a longer tail of values to the left of the data set. The mass of the distribution is concentrated on the right of the figure.

The image shows a curve of negatively skewed data with left side stretched out.

A negatively skewed graph has this general shape with left side stretched out.

The image shows a curve over a histogram of negatively skewed data. Ask your teacher for more information.

General shape shown over a histogram of negatively skewed data.

In a set of data, a cluster occurs when a large number of the scores are grouped together within a small range. Clustering may occur at a single location or several locations. For example, annual wages for a factory may cluster around \$ 40\,000 for unskilled factory workers, \$ 55\,000 for tradespersons and \$ 70\,000 for management. The data may also have clear gaps where values are either very uncommon or not possible in the data set.

As we have seen previously, an outlier is a data point that varies significantly from the body of the data. An outlier will be a value that is either significantly larger or smaller than other observations. Outliers are important to identify as they point to unusual bits of data that may require further investigation and impact some calculations such as mean, range, and standard deviation.

A dot plot showing a data of scores. Ask your teacher for more information.

For the dot plot given above, the score of 9 would be considered an outlier as it is well above the body of the data.
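To see how a single outlier can affect some calculations, consider the following Python sketch. The scores are hypothetical (they are not read from the dot plot above) and simply include one unusually large value of 9.

```python
from statistics import mean, median

# Hypothetical scores with one unusually large value (9);
# this is not the data from the dot plot above.
scores = [1, 2, 2, 3, 3, 3, 4, 4, 9]
without_outlier = [s for s in scores if s != 9]

# Print the mean, median and range with and without the outlier.
for label, data in (("with 9", scores), ("without 9", without_outlier)):
    print(label, round(mean(data), 2), median(data), max(data) - min(data))
```

In this made-up data set the mean and range change noticeably when the 9 is removed, while the median stays the same.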

Idea summary

A distribution is said to be symmetric if its left and right sides are mirror images of one another.

A uniform distribution is a symmetrical distribution where each outcome is equally likely, so the frequency should be the same for each outcome.

A data set that has positive skew (sometimes called a 'right skew') has a longer tail of values to the right of the data set. The mass of the distribution is concentrated on the left of the figure.

A data set that has negative skew (sometimes called a 'left skew') has a longer tail of values to the left of the data set. The mass of the distribution is concentrated on the right of the figure.

To determine the modality of a data distribution:

  • If there is a single modal class (one peak), the data is uni-modal.

  • If there are two modal classes (two peaks), the data is bi-modal.

  • If there are more than two modal classes, the data is multi-modal.

An outlier is a value that is either noticeably greater or smaller than other observations.

Outcomes

U3.AoS1.2

frequency tables, bar charts including segmented bar charts, histograms, stem plots, dot plots, and their application in the context of displaying and describing distributions

U3.AoS1.14

construct frequency tables and bar charts and use them to describe and interpret the distributions of categorical variables

U3.AoS1.16

construct stem and dot plots, boxplots, histograms and appropriate summary statistics and use them to describe and interpret the distributions of numerical variables

U3.AoS1.17

answer statistical questions that require a knowledge of the distribution(s) of one or more categorical variables
