1.04 Associations between numerical variables

Lesson

Introduction

Bivariate data is numerical data in which two values are recorded for each individual, one for each of two variables. We are often interested in whether there seems to be any connection between the two variables. A scattergraph (or scatterplot) provides a visual representation of the data, which can help to determine whether there is a relationship between the two variables.

The explanatory variable is plotted on the horizontal axis and the response variable is plotted on the vertical axis. A single data point in a bivariate data set is written in the form (x,y), with the first number x being the explanatory variable and the second number y being the response variable.
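As a small illustration of the (x,y) form, the sketch below pairs up the first three rows of the age and accident table from Example 1 in this lesson. The variable names are assumptions chosen for this sketch only.

```python
# First three rows of the Example 1 table in this lesson.
ages = [20, 25, 30]        # explanatory variable, plotted on the horizontal axis
accidents = [41, 44, 39]   # response variable, plotted on the vertical axis

# Each data point is written as (x, y): explanatory value first, response second.
data_points = list(zip(ages, accidents))

for x, y in data_points:
    print(f"({x}, {y})")
```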

Describe correlation

When describing the correlation of the two variables in a scattergraph, we want to describe the strength of the correlation and the direction of the correlation.

To describe the strength of a correlation, we use the words perfect, strong, weak, and no correlation. Perfect correlation means that the points in the scattergraph form a perfect line, and no correlation means that the points form no trend at all.
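One way to make "the points form a perfect line" concrete is to test whether every point lies exactly on the straight line through two of them. The sketch below is only an illustration: the function name and both data sets are made up, and it assumes whole-number coordinates so that the equality test is exact.

```python
def lie_on_one_line(points):
    """Return True if every (x, y) point lies exactly on one straight line."""
    x0, y0 = points[0]
    # Find a second, different point to fix the direction of the line.
    others = [p for p in points if tuple(p) != (x0, y0)]
    if not others:
        return True  # all points coincide, so there is nothing to check
    x1, y1 = others[0]
    # A point (x, y) is on the line through (x0, y0) and (x1, y1) exactly
    # when (x1 - x0) * (y - y0) equals (y1 - y0) * (x - x0).
    return all((x1 - x0) * (y - y0) == (y1 - y0) * (x - x0) for x, y in points)


# Made-up data for illustration only.
on_a_line = [(1, 2), (2, 4), (3, 6), (4, 8)]
scattered = [(1, 2), (2, 5), (3, 4), (4, 9)]

print(lie_on_one_line(on_a_line))   # True
print(lie_on_one_line(scattered))   # False
```

Real data almost never passes this test, which is why most correlations are described as strong or weak rather than perfect. A line that is exactly horizontal or vertical would also pass, even though it shows no relationship between the variables, so the check is only a first step.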

To describe the direction of a correlation, we use the words positive and negative correlation. Positive correlation means that as the explanatory variable increases, the response variable also increases. Negative correlation means that as the explanatory variable increases, the response variable decreases. Even without a scattergraph, we can use these words to describe the relationship between two variables.
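The definitions of positive and negative correlation can be checked pair by pair: for any two data points, either the response rises as the explanatory variable rises, or it falls. The sketch below counts both kinds of pair and reports whichever is more common. It is an informal illustration of the definition, not a standard test; the function name and the data are made up for this sketch.

```python
from itertools import combinations


def direction(points):
    """Describe the direction suggested by a list of (x, y) data points."""
    rising = falling = 0
    for (xa, ya), (xb, yb) in combinations(points, 2):
        # The product is positive when x and y change in the same direction
        # and negative when they change in opposite directions.
        product = (xb - xa) * (yb - ya)
        if product > 0:
            rising += 1
        elif product < 0:
            falling += 1
    if rising > falling:
        return "positive"
    if falling > rising:
        return "negative"
    return "no clear direction"


# Made-up data for illustration only.
print(direction([(1, 2), (2, 3), (3, 5), (4, 4)]))  # positive
print(direction([(1, 9), (2, 7), (3, 8), (4, 3)]))  # negative
```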

Here are some examples of what each correlation description looks like.

Positive correlations

A scattergraph which shows perfect positive correlation.
A scattergraph which shows strong positive correlation.
A scattergraph which shows weak positive correlation.

Negative correlations

A scattergraph which shows perfect negative correlation.
A scattergraph which shows strong negative correlation.
A scattergraph which shows weak negative correlation.

Examples

Example 1

The following table shows the number of traffic accidents associated with a sample of drivers of different age groups.

Age    Accidents
20     41
25     44
30     39
35     34
40     30
45     25
50     22
55     18
60     19
65     17
a

Which of the following scatter plots correctly represents the above data?

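A
[Scatter plot: Age on the horizontal axis (10 to 60), Accidents on the vertical axis (10 to 40)]
B
[Scatter plot: Age on the horizontal axis (10 to 60), Accidents on the vertical axis (10 to 40)]
C
[Scatter plot: Age on the horizontal axis (10 to 60), Accidents on the vertical axis (10 to 40)]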
Worked Solution
Create a strategy

Compare the coordinates of the plotted points with the table of data.

Apply the idea

Comparing the points with the table, the correct graph is the graph in option B.

b

Is the correlation between a person's age and the number of accidents they are involved in positive or negative?

A
Positive
B
Negative
Worked Solution
Create a strategy

Look at the direction in which the points trend across the scatter plot.

Apply the idea

We can observe from the scatter plot in part (a) that as age increases, the number of accidents generally decreases. So the correlation is negative, option B.
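The same conclusion can be checked from the table without the plot. The sketch below compares every pair of age groups and counts how often the number of accidents rises or falls as age rises. This pair-counting approach is an informal check added here for illustration, not part of the worked solution.

```python
from itertools import combinations

# Data from the table in Example 1.
ages = [20, 25, 30, 35, 40, 45, 50, 55, 60, 65]
accidents = [41, 44, 39, 34, 30, 25, 22, 18, 19, 17]
points = list(zip(ages, accidents))

# For each pair of age groups, the product is positive when accidents rise
# with age and negative when accidents fall with age.
rising = sum(
    1 for (xa, ya), (xb, yb) in combinations(points, 2) if (xb - xa) * (yb - ya) > 0
)
falling = sum(
    1 for (xa, ya), (xb, yb) in combinations(points, 2) if (xb - xa) * (yb - ya) < 0
)

print(rising, falling)  # 2 43
```

Of the 45 pairs, 43 show accidents falling as age rises, which agrees with the negative direction seen in the scatter plot.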

c

Is the correlation between a person's age and the number of accidents they are involved in strong or weak?

A
Strong
B
Weak
Worked Solution
Create a strategy

Consider how closely the points follow a straight line.

Apply the idea

The scatter plot in part (a) shows points which lie close to a single straight line, so the correlation is strong, option A.
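This lesson judges strength by eye. One common way to put a number on it, which goes beyond what the lesson covers, is Pearson's correlation coefficient r: values close to 1 or -1 mean the points lie close to a straight line, and values close to 0 mean they do not. The sketch below computes r for the Example 1 table; the function name is an assumption made for this sketch.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of the deviations from each mean.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Data from the table in Example 1.
ages = [20, 25, 30, 35, 40, 45, 50, 55, 60, 65]
accidents = [41, 44, 39, 34, 30, 25, 22, 18, 19, 17]

r = pearson_r(ages, accidents)
print(round(r, 2))  # -0.97
```

A value this close to -1 is consistent with describing the correlation as strong and negative.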

d

Which age group's data represent an outlier?

A
30-year-olds
B
None of them
C
65-year-olds
D
20-year-olds
Worked Solution
Create a strategy

Choose the point, if any, on the scatter plot that is positioned far away from the trend of the data.

Apply the idea

We can see in the scatter plot that no point is positioned far away from the rest of the data. So none of the age groups represents an outlier, option B.

Example 2

Consider the two variables: time spent studying and exam performance.

a

Is there likely to be a relationship between the two?

Worked Solution
Create a strategy

Consider whether one variable affects the other.

Apply the idea

Generally the more people study, the better they do in exams. So there is likely to be a relationship between the two.

b

Do you think the correlation is positive or negative?

Worked Solution
Create a strategy

Consider how one variable changes as the other increases.

Apply the idea

Generally, the more people study, the higher the mark they get in an exam. So as one variable increases, the other also increases. The correlation is positive.

Idea summary

When describing the correlation of the two variables in a scattergraph, we want to describe the strength of the correlation and the direction of the correlation.

To describe the strength of a correlation, we use the words perfect, strong, weak, and no correlation.

To describe the direction of a correlation, we use the words positive and negative correlation.

Outcomes

ACMGM052

construct a scatterplot to identify patterns in the data suggesting the presence of an association

ACMGM053

describe an association between two numerical variables in terms of direction (positive/negative), form (linear/non-linear) and strength (strong/moderate/weak)

ACMGM056

use a scatterplot to identify the nature of the relationship between variables
