
7.02 Form and strength of association

Lesson

Most of the time, we use scatterplots of bivariate data to find patterns in the data and make inferences about a possible relationship between the two variables. To do this, we look at both the form (or shape) and the strength of the association.

 

Linear and non-linear relationships

The first way to analyse a scatterplot is to describe the shape that the bivariate data takes. The data often clusters around some kind of curve, and the relationship is either:

  • linear (the curve is a straight line), or
  • non-linear (the curve is not a straight line). Non-linear data could have a quadratic (parabolic), exponential or hyperbolic shape.

Linear relationship

Quadratic relationship

Exponential relationship

Hyperbolic relationship
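
To get a feel for these four shapes, we can tabulate a few points from one example of each. The sketch below is only an illustration: the function names and coefficients are our own choices, not part of the lesson. It also prints the change in y between neighbouring points, which is constant only for the straight line.

```python
# Example rules for each form. The coefficients are arbitrary choices
# made for illustration only.
def linear(x):
    return 2 * x + 1      # straight line

def quadratic(x):
    return (x - 3) ** 2   # parabola with its turning point at x = 3

def exponential(x):
    return 2 ** x         # doubles each time x increases by 1

def hyperbolic(x):
    return 12 / x         # y shrinks as x grows (x must not be 0)

def differences(values):
    """Return the change from each value to the next."""
    return [b - a for a, b in zip(values, values[1:])]

xs = [1, 2, 3, 4, 5, 6]
for rule in (linear, quadratic, exponential, hyperbolic):
    ys = [rule(x) for x in xs]
    print(rule.__name__, ys, differences(ys))
```

Plotting each list of points would give a picture like the corresponding graph above. Real data would scatter around these curves rather than sit exactly on them.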

 

Positive and negative relationships

We can further describe linear relationships by whether they are increasing (positive gradient) or decreasing (negative gradient). 

  • A linear relationship where the dependent variable increases as the independent variable increases is called a positive linear relationship. 
  • A linear relationship where the dependent variable decreases as the independent variable increases is called a negative linear relationship.

Positive relationship

Negative relationship

What does it mean if the line is close to horizontal (0 gradient)? In terms of data, the dependent variable does not change as the independent variable increases. In other words, the dependent variable doesn't actually depend on the other variable, so we say there is likely no relationship.
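
One way to check the direction of a linear relationship numerically is to estimate the gradient of a line through the data. The sketch below uses the least-squares gradient, which is our choice of method rather than something this lesson prescribes. The function names, the sample data and the `tolerance` used for "close to horizontal" are all assumptions made for illustration; a sensible tolerance depends on the units of the data.

```python
def least_squares_gradient(xs, ys):
    """Gradient of the least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def direction(xs, ys, tolerance=0.05):
    """Label the direction of a linear trend.

    `tolerance` is a hypothetical cut-off for 'close to horizontal'.
    """
    m = least_squares_gradient(xs, ys)
    if abs(m) <= tolerance:
        return "close to horizontal"
    return "positive" if m > 0 else "negative"

# Made-up sample data for illustration.
xs = [1, 2, 3, 4, 5]
print(direction(xs, [2.1, 3.9, 6.2, 8.0, 9.8]))   # y rises as x rises
print(direction(xs, [10.0, 8.0, 6.1, 3.9, 2.0]))  # y falls as x rises
print(direction(xs, [5.0, 5.0, 5.0, 5.0, 5.0]))   # y does not change
```

This only makes sense when the form is already judged to be linear; the gradient says nothing about how closely the points follow the line.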

Careful!

The words positive and negative only apply when describing linear relationships. For non-linear relationships (like a quadratic relationship) there can be a mix of gradients - positive in one part and negative in the other.
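
To see the mix of gradients in a quadratic relationship, we can estimate a gradient separately on each side of the turning point. This is a minimal sketch using made-up points that lie exactly on a parabola; the `gradient` helper is a hypothetical name and uses a least-squares estimate, which is our assumption rather than a method from this lesson.

```python
def gradient(xs, ys):
    """Least-squares gradient of the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Illustrative points on the parabola y = (x - 3)^2.
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [(x - 3) ** 2 for x in xs]

left = gradient(xs[:4], ys[:4])    # points with x from 0 to 3
right = gradient(xs[3:], ys[3:])   # points with x from 3 to 6
overall = gradient(xs, ys)         # all points together

print("left half:", left)
print("right half:", right)
print("overall:", overall)
```

The left half slopes down and the right half slopes up, so neither "positive" nor "negative" describes the whole relationship. The overall gradient balances out even though the two variables are clearly related.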

 

Strong and weak relationships

The second way to analyse a scatterplot is to describe the strength of the relationship. If the data points cluster very closely around a curve, we say that there is evidence of a strong relationship. If the data points are very spread out but still follow an overall curve, we say there is evidence of a weak relationship. If the spread is somewhere in between, we say that there is evidence of a moderate relationship.

For example, almost all data points of a strong linear relationship will lie on or very close to a straight line. If the data points are arbitrarily spread out, then there is probably no linear relationship at all. This could mean that there is a non-linear relationship, or that the two variables are completely unrelated.

Strong relationship

Moderate relationship

Weak relationship

No relationship
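
For linear relationships, one common way to put a number on how closely the points cluster around a straight line is Pearson's correlation coefficient, r. This lesson describes strength only by eye, so treat the sketch below as an optional illustration. The function names, the sample data and especially the cut-off values are assumptions; there is no single agreed set of cut-offs for "strong", "moderate" and "weak".

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient of the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def describe_strength(r, strong=0.8, moderate=0.5, weak=0.2):
    """Turn r into a word. The cut-offs are illustrative assumptions."""
    size = abs(r)  # the sign of r gives direction, not strength
    if size >= strong:
        return "strong"
    if size >= moderate:
        return "moderate"
    if size >= weak:
        return "weak"
    return "no linear relationship"

# Made-up sample data for illustration.
tight = pearson_r([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 8.0, 9.8])
scattered = pearson_r([1, 2, 3, 4, 5, 6], [3, 1, 4, 1, 5, 2])
print(describe_strength(tight))
print(describe_strength(scattered))
```

A value of r near 0 only rules out a linear relationship. The data could still follow a non-linear curve, so it is worth looking at the scatterplot as well.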

Careful!

We can never be completely certain from a scatterplot that there is a linear relationship between two variables. This is why we say that "there is evidence of a linear relationship" or that "there is probably no relationship". For brevity, we often say "there is a linear relationship" or "there is no relationship", but it is important to keep in mind what is meant by this.

Practice questions

Question 1

Describe the relationship between the variables observed in the scatterplot below.

[Scatterplot]

  A. Strong positive linear relationship
  B. Strong negative linear relationship
  C. No linear relationship
  D. Weak positive linear relationship
  E. Weak negative linear relationship

Question 2

Describe the relationship between the variables observed in the scatterplot below.

[Scatterplot]

  A. Weak parabolic relationship
  B. Strong parabolic relationship
  C. No relationship
  D. Weak linear relationship
  E. Strong linear relationship

Question 3

Which of the following scatterplots demonstrates no relationship?

  A. [Scatterplot]
  B. [Scatterplot]
  C. [Scatterplot]
  D. [Scatterplot]
  E. [Scatterplot]

Outcomes

MS2-12-2

analyses representations of data in order to make inferences, predictions and draw conclusions

MS2-12-7

solves problems requiring statistical processes, including the use of the normal distribution and the correlation of bivariate data
