1.05 Correlation using r-values

Worksheet
Pearson's correlation coefficient
1

Describe the type of correlation the following correlation coefficients indicate:

a
r = 1
b
r = 0
c
r = -1
2

One pair of data sets has a correlation coefficient of \dfrac{1}{10}, while a second pair of data sets has a correlation coefficient of \dfrac{3}{5}. Which pair of data sets has the stronger correlation?

3

If the explanatory variable increases, describe the effect on the response variable for the following studies:

a

A study found that the correlation coefficient between the heights of women and the probability of being turned down for a promotion was -0.90.

b

A study found that the correlation coefficient between the population of a city and the number of speeding fines recorded was 0.83.

c

A study found that the correlation coefficient between length of hair and length of fingernails was 0.07.

d

A study found that the correlation coefficient between the number of bylaws a council has about dog breeding and the number of dogs available for adoption at the local shelter was 0.55.

4

For each of the following graphs, write down an appropriate value for the correlation coefficient:

a

[Scatter plot: x-axis from 1 to 9, y-axis from 5 to 35]

b

[Scatter plot: x-axis from 1 to 9, y-axis from 2 to 38]

c

[Scatter plot: x-axis from 1 to 10, y-axis from 4 to 16]

d

[Scatter plot: x-axis from 1 to 9, y-axis from 2 to 18]

5

The scatter diagram shows data of the height of a ball kicked into the air as a function of time:

a

Which type of model is appropriate for the data, linear or non-linear?

b

Write down a possible value of Pearson’s correlation coefficient, r, for this set of data.

[Scatter plot: t from 1 to 9 on the horizontal axis, Height from 5 to 55 on the vertical axis]

6

The scatter diagram shows data of a person's level of happiness as a function of their age:

a

Which type of model is appropriate for the data, linear or non-linear?

b

Write down a possible value of Pearson’s correlation coefficient, r, for this set of data.

[Scatter plot: Age from 1 to 9 on the horizontal axis, Happiness from 10 to 50 on the vertical axis]

7

The scatter diagram shows data of the height of an object after it is pushed off a rooftop as a function of time:

a

Which type of model is appropriate for the data, linear or quadratic?

b

Write down a possible value of Pearson’s correlation coefficient, r, for this set of data.

[Scatter plot: x-axis from 1 to 9, y-axis from 100 to 900]

8

A climate scientist wishes to investigate whether there is a relationship between the altitude of a city and the average maximum temperature of the city. Data was collected and is shown in the table below:

Altitude    | 940 | 627 | 440 | 342 | 302 | 525 | 250 | 775 | 896 | 116
Temperature | 17  | 23  | 25  | 29  | 28  | 22  | 25  | 22  | 16  | 26

a

State the explanatory variable in this problem.

b

Construct a scatter plot for this data.

c

Describe the correlation between the two variables.

d

From the following values, select the value that is most likely to be the correlation coefficient for this data.

  • 0.75
  • -1
  • 0.6
  • -0.8
e

Can the scientist conclude that the altitude of a city causes the average maximum temperature? Explain your answer.
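
One way to check an estimate for part d is to compute Pearson's correlation coefficient directly from the table. The sketch below is a minimal version using the standard deviation-product formula; the function name `pearson_r` and the variable names are illustrative choices, not part of the worksheet.

```python
import math

# Data copied from the table in this question.
altitude = [940, 627, 440, 342, 302, 525, 250, 775, 896, 116]
temperature = [17, 23, 25, 29, 28, 22, 25, 22, 16, 26]


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(altitude, temperature)
print(f"r = {r:.2f}")
```

The sign of the result gives the direction of the association and its size gives the strength. A value close to -1 or 1 still does not establish causation on its own.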

Correlation and causation
9

Explain whether each of the following statements is true or false.

a

If there is correlation between two variables, there must be causation.

b

If there is causation between two variables, there must be correlation.

10

Explain the difference between coincidence, causation, and a confounding factor.

11

For each of the following data examples, determine if there is a causal relationship between the variables:

a

The number of times a coin lands on heads and the likelihood that it lands on heads on the next flip.

b

The amount of weight training a person does and their strength.

12

Many trees lose their leaves in winter. Does this mean that cold temperatures cause the leaves to fall?

13

A geography teacher observes that many of the students who are involved in the music programme do better at tests. Does this mean that learning music makes students better at geography?

14

The table shows the number of fans sold at a store during days of various temperatures:

\text{Temperature } (\degree\text{C}) | 6 | 8 | 10 | 12 | 14 | 16 | 18 | 20
\text{Number of fans sold} | 12 | 13 | 14 | 17 | 18 | 19 | 21 | 23

a

Is there a causal relationship between the variables?

b

Without calculating, consider the correlation coefficient, r, for temperature and number of fans sold. Is the value of r positive or negative?

15

A study found a strong correlation between the approximate number of pirates out at sea and the average world temperature.

a

Does this mean that the number of pirates out at sea has an impact on world temperature?

b

Is the strong correlation found a coincidence? Explain your answer.

c

If there is correlation between two variables, is there causation?

16

A study found a strong positive association between the temperature and the number of beach drownings.

a

Does this mean that the temperature causes people to drown? Explain your answer.

b

Is the strong correlation found a coincidence? Explain your answer.

17

The fleet manager for the Australian Automotive Association wants to estimate how car maintenance costs, C in hundreds of dollars, are related to the distance, K in thousands of kilometres, driven each year. The data collected is shown in the scatter plot below along with the least squares regression line:

a

How much is the cost of the maintenance of a car that is not driven at all? Explain whether this prediction is reliable.

b

Predict the annual maintenance cost for a car that is driven 40\,000 \text{ km} per year.

c

Is this predicted value reliable? Explain your answer.

d

Is there any reason to believe that there is a causal relationship between distance driven and maintenance costs? Explain your answer.

[Scatter plot with least squares regression line: K from 5 to 35 on the horizontal axis, C from 100 to 500 on the vertical axis]

18

A medical study measured the blood glucose, G, and hormone, H, levels of a group of patients. The results are displayed in the scatter plot below, together with the least-squares regression line. The correlation coefficient for this data set is -0.48.

a

How many patients with a hormone level of less than 8 units had a glucose level less than 150 units?

b

Determine the upper and lower glucose levels for the patients involved in this study.

c

Having no knowledge of the effects of insulin and glucose, one researcher involved in the study claims that a high insulin (hormone) level will cause a patient to have a low glucose level.

Is this claim correct?

d

Is there a causal relationship between the blood glucose and hormone levels of a group of patients?

[Scatter plot with least-squares regression line: H from 1 to 10 on the horizontal axis, G from 60 to 240 on the vertical axis]

e

State the number of patients involved in the study.

f

How could the size of the study influence an explanation for an association between the variables?


Outcomes

3.1.2.2

understand an association between two numerical variables in terms of direction (positive/negative), form (linear) and strength (strong/moderate/weak)

3.1.3.2

use a scatterplot to identify the nature of the relationship between variables

3.1.4.1

recognise that an observed association between two variables does not necessarily mean that there is a causal relationship between them

3.1.4.2

identify and communicate possible non-causal explanations for an association, including coincidence and confounding due to a common response to another variable
