
9.06 Scatter plots and lines of fit

Lesson

Concept summary

Functions can be used to model real-world events and interpret data from those events. Data that measures or compares two characteristics of a population is known as bivariate data.

When analyzing and interpreting data, we often look for a relationship between two variables, called an association. We can describe an association according to the form, direction, or strength of the relationship between the two variables.

For numerical data, descriptions include:

  • linear or nonlinear
  • positive or negative
  • strong or weak
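One common way to put a number on the direction and strength of a linear association is the correlation coefficient. This lesson does not define it, so the sketch below is only an illustration: the data set is made up, and the cutoff of 0.7 for "strong" is an assumed convention, not a rule from the lesson.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def describe(r, strong_cutoff=0.7):
    """Label direction and strength; strong_cutoff is an assumed threshold."""
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "no direction"
    strength = "strong" if abs(r) >= strong_cutoff else "weak"
    return f"{strength} {direction}"

# Hypothetical bivariate data: hours studied and test score
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 75]

r = correlation(hours, scores)
print(round(r, 3), describe(r))
```

A value of r close to 1 or -1 suggests the points lie near a straight line. It says nothing about whether a nonlinear form would describe the data better, so looking at the scatter plot is still needed.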

For categorical data, descriptions include:

  • strong or weak

To more easily analyze a set of data and determine whether there is an association between the variables, we often graph the data as a scatter plot. On a scatter plot, we can then draw a line that best estimates the relationship between the two variables, called a line of fit or trend line.

Worked examples

Example 1

A scatter plot showing the median value of property sales made in the United States is shown below.

[Figure: scatter plot. Horizontal axis: Year, 2010 to 2022. Vertical axis: Value (thousands of dollars), 150 to 400.]

(a) Sketch an approximate line of best fit for the scatter plot.

Approach

The line of best fit should pass through the data so that there are roughly the same number of data points on each side of the line and the distances between the points and the line are as small as possible.
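When sketching by eye is not precise enough, a line of fit can also be computed. The sketch below uses the least-squares method, which is a different technique from the by-eye approach described above: it minimizes the sum of the squared vertical distances from the points to the line. The data values are hypothetical and are not read from the scatter plot in this example.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data, NOT taken from the lesson's figure
years = [2010, 2012, 2014, 2016, 2018, 2020]
values = [170, 180, 210, 240, 260, 300]  # thousands of dollars

slope, intercept = least_squares_line(years, values)

# Count how many points fall above and below the fitted line
above = sum(1 for x, y in zip(years, values) if y > slope * x + intercept)
below = sum(1 for x, y in zip(years, values) if y < slope * x + intercept)

print(f"slope: {slope:.2f} thousand dollars per year")
print(f"points above: {above}, points below: {below}")
```

For this made-up data set the fitted line happens to leave three points on each side. Least squares does not guarantee an even split in general, so the two descriptions of a good line of fit can disagree slightly.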

Solution

[Figure: the scatter plot with an approximate line of best fit drawn through the data. Horizontal axis: Year, 2010 to 2022. Vertical axis: Value (thousands of dollars), 150 to 400.]

(b) Use the line of best fit to predict the median housing price in 2022.

Solution

Looking at the graph we sketched in part (a), the line of best fit approximately passes through the point \left(2022, 365\right).

[Figure: the scatter plot with the line of best fit from part (a), on the same axes.]

So we can predict that the median housing price in the U.S. in 2022 will be around \$365\,000.
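The same prediction can be made from an equation of the line instead of reading the graph. The sketch below builds the equation from two points on the line. Only the point (2022, 365) comes from the solution above; the second point, (2010, 170), is an assumed reading chosen for illustration, so the slope it produces is hypothetical.

```python
def line_through(p1, p2):
    """Return (slope, intercept) of the line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept

point_from_solution = (2022, 365)  # stated in the solution to part (b)
assumed_point = (2010, 170)        # hypothetical second reading from the line

slope, intercept = line_through(assumed_point, point_from_solution)

def predict(year):
    """Predicted median value, in thousands of dollars, for a given year."""
    return slope * year + intercept

print(f"slope: {slope} thousand dollars per year")
print(f"predicted value in 2022: {predict(2022)} thousand dollars")
print(f"predicted value in 2016: {predict(2016)} thousand dollars")
```

Under these assumed readings, the slope would mean the median value rises by about 16.25 thousand dollars each year. The y-intercept corresponds to the year 0, which is far outside the data, so it has no meaningful interpretation in this context.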

Outcomes

MA.912.AR.2.2

Write a linear two-variable equation to represent the relationship between two quantities from a graph, a written description or a table of values within a mathematical or real-world context.

MA.912.DP.2.4

Fit a linear function to bivariate numerical data that suggests a linear association and interpret the slope and y-intercept of the model. Use the model to solve real-world problems in terms of the context of the data.
