9.05 Fitted functions

Lesson

Concept summary

Bivariate data can be modeled with a fitted function, also called a regression model. Depending on the strength of the association, measured by the coefficient of determination (r^2), a regression function may pass exactly through all of the points, some of the points, or none of the points.

[Figure: scatterplot with fitted curve. Exponential regression, r^2=0.763]

[Figure: scatterplot with fitted curve. Quadratic regression, r^2=0.908]

Coefficient of determination

A measure of how much of the variability in one quantity can be explained by its relationship to another quantity.
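
One common way to write this measure as a formula is sketched below. The notation is an assumption introduced here rather than taken from the lesson: y_i are the observed values, \hat{y}_i are the values predicted by the fitted function, and \bar{y} is the mean of the observed values.

```latex
R^2 = 1 - \frac{\sum_{i}\left(y_i-\hat{y}_i\right)^2}{\sum_{i}\left(y_i-\bar{y}\right)^2}
```

A value close to 1 means the fitted function accounts for most of the variability in y, while a value close to 0 means it accounts for very little.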

A line of best fit and a regression line both refer to a linear regression model. The correlation coefficient, r, can be calculated with technology to describe the strength and direction of the linear association. To approximate a line of best fit by eye, balance the number of points above the line with the number of points below it. Outliers should generally be ignored when doing this, as they can skew the line of best fit.

Worked examples

Example 1

During an alcohol education program, 10 adults were offered up to 6 drinks and were then given a simulated driving test where they were scored out of a possible 100 points.

Number of drinks | 3 | 2 | 6 | 4 | 4 | 1 | 6 | 3 | 4 | 2
Driving score | 64 | 59 | 42 | 57 | 58 | 72 | 33 | 63 | 55 | 62

a

Describe the association between number of drinks and driving score.

Approach

Construct a scatterplot to get a visual of the data.

[Scatterplot: Drinks on the x-axis (1 to 6), Score on the y-axis (10 to 90)]

Then consider the form, strength, and direction.

Solution

The data appears to have a strong, negative, linear association.

b

Use technology to calculate the correlation coefficient and line of best fit.

Approach

  1. Enter the x- and y-values in two separate columns.
  2. Highlight the data and select \text{Two Variable Regression Analysis}.
  3. Select \text{Show Statistics} to see the correlation coefficient, r.
  4. Choose \text{Linear} under the \text{Regression Model} drop down menu to find the line of best fit.

Solution

The correlation coefficient is r=-0.9115 and the equation of the line of best fit is y=-6.22x+78.
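
For readers without the calculator tool, the same values can be reproduced from the table. The following is a minimal Python sketch, not the tool's own method: it uses the standard least-squares formulas, and the function name `linear_fit` is a hypothetical helper introduced only for illustration.

```python
from math import sqrt

# Data from the Example 1 table.
drinks = [3, 2, 6, 4, 4, 1, 6, 3, 4, 2]
scores = [64, 59, 42, 57, 58, 72, 33, 63, 55, 62]

def linear_fit(xs, ys):
    """Return (slope, intercept, r) for the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r

slope, intercept, r = linear_fit(drinks, scores)
print(f"r = {r:.4f}")                          # r = -0.9115
print(f"y = {slope:.2f}x + {intercept:.2f}")   # y = -6.22x + 78.29
```

The intercept comes out at about 78.29, which the solution above reports rounded to 78.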

c

Interpret the meaning of the slope and y-intercept of the line of best fit in context of the data.

Approach

From part (b) we know that the equation of the line of best fit is y=-6.22x+78, which tells us the slope is -6.22 and the y-intercept is 78.

Solution

The slope of -6.22 means the predicted driving score drops by about 6.22 points for every extra drink consumed.

The y-intercept tells us that an adult with 0 drinks has a predicted score of 78 according to the linear model.

Reflection

Matching the slope and the y-intercept to their respective units is a good strategy for interpreting their meaning in context. \text{slope}=\dfrac{\text{rise}}{\text{run}}=\dfrac{-6.22}{1}

The quantity on the y-axis represents the "rise" and the quantity on the x-axis represents the "run". So the slope represents a change of -6.22 points in driving score for every 1 additional drink.

The y-intercept can be written as an ordered pair \left(x,y\right)=\left(0,78\right) where x is the number of drinks and y is the score on the driving test.
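
The two interpretations can be checked numerically. This short Python sketch evaluates the line of best fit from part (b) at a few whole numbers of drinks; the function name `predicted_score` is a hypothetical label chosen for illustration.

```python
def predicted_score(drinks):
    """Line of best fit from part (b): y = -6.22x + 78."""
    return -6.22 * drinks + 78

# Predicted scores for 0 to 3 drinks.
for drinks in range(4):
    print(f"{drinks} drinks -> predicted score {predicted_score(drinks):.2f}")

# The y-intercept is the prediction at x = 0.
intercept = predicted_score(0)

# The slope is the change in prediction for each additional drink.
change_per_drink = predicted_score(1) - predicted_score(0)
```

Here `intercept` is 78, and `change_per_drink` is -6.22 (up to floating-point rounding), matching the ordered pair \left(0,78\right) and the slope described above.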

Example 2

Consider the data in the table:

Hours worked | 6 | 9 | 12 | 14 | 30 | 35 | 40 | 48 | 50 | 60
Happiness | 15 | 30 | 50 | 70 | 90 | 95 | 90 | 75 | 60 | 30

a

Use technology to find a regression model to represent the data.

Approach

We first need to determine the form of the data. To do this we can create a scatterplot:

[Scatterplot: Hours on the x-axis (5 to 65), Happiness on the y-axis (10 to 90)]

Since the data appears to follow a quadratic pattern, rising and then falling, we can use a quadratic regression for our model.

  1. Enter the x- and y-values in two separate columns.
  2. To find the quadratic regression model, choose \text{Polynomial} under the \text{Regression Model} drop down menu and select 2.

Solution

The equation of the quadratic regression model is y=-0.10x^2+6.58x-16.07.

Reflection

We can check how well the regression model fits the data in one of two ways:

  1. Use technology to calculate the coefficient of determination.
  2. Graph the regression model on the same plane as the data.

If we select \text{Show Statistics} in the calculator tool, we can see that the coefficient of determination for our model is R^2=0.9565.
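
The model and its coefficient of determination can also be reproduced from the table without the calculator tool. The Python sketch below is one possible approach, not the tool's own method: it solves the least-squares normal equations with Cramer's rule. The helper names `det3`, `fit_quadratic`, and `r_squared` are hypothetical and introduced only for illustration.

```python
# Data from the Example 2 table.
hours = [6, 9, 12, 14, 30, 35, 40, 48, 50, 60]
happiness = [15, 30, 50, 70, 90, 95, 90, 75, 60, 30]

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c via the normal equations."""
    n = len(xs)
    sx = sum(xs)
    sx2 = sum(x**2 for x in xs)
    sx3 = sum(x**3 for x in xs)
    sx4 = sum(x**4 for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2y = sum(x**2 * y for x, y in zip(xs, ys))
    matrix = [[sx4, sx3, sx2], [sx3, sx2, sx], [sx2, sx, n]]
    rhs = [sx2y, sxy, sy]
    d = det3(matrix)
    coeffs = []
    # Cramer's rule: replace one column at a time with the right-hand side.
    for col in range(3):
        replaced = [row[:] for row in matrix]
        for row_index in range(3):
            replaced[row_index][col] = rhs[row_index]
        coeffs.append(det3(replaced) / d)
    return tuple(coeffs)

def r_squared(xs, ys, a, b, c):
    """Coefficient of determination: 1 - (residual variation / total variation)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a * x**2 + b * x + c)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

a, b, c = fit_quadratic(hours, happiness)
r2 = r_squared(hours, happiness, a, b, c)
print(f"a = {a:.2f}, b = {b:.2f}, c = {c:.2f}, R^2 = {r2:.4f}")
```

The unrounded leading coefficient is close to -0.098, which the solution reports as -0.10 to two decimal places.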

The scatterplot with the regression model looks like:

[Scatterplot with the quadratic regression curve: Hours on the x-axis (5 to 65), Happiness on the y-axis (10 to 90)]

b

What happiness rating would someone who worked 25 hours per week be expected to have?

Approach

Use the regression model y=-0.10x^2+6.58x-16.07 and input x=25.

Solution

y=-0.10x^2+6.58x-16.07 (Regression model)
y=-0.10(25)^2+6.58(25)-16.07 (Substitute)
y=85.93 (Evaluate)

A person who works 25 hours per week has a predicted happiness rating of about 86.

c

What happiness rating would someone who worked 80 hours per week be expected to have?

Approach

Use the regression model y=-0.10x^2+6.58x-16.07 and input x=80.

Solution

y=-0.10x^2+6.58x-16.07 (Regression model)
y=-0.10(80)^2+6.58(80)-16.07 (Substitute)
y=-129.67 (Evaluate)

A person who works 80 hours per week has a predicted happiness rating of about -129.67.

Reflection

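This solution is not viable in this context since the happiness rating should range from 0 to 100 and should not be negative. The input x=80 also lies well outside the range of the data (6 to 60 hours), so the model is being extrapolated beyond the values it was fitted to.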
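
A prediction can be checked for viability before it is reported. The Python sketch below evaluates the rounded model from part (a) at 25 and 80 hours and flags two things: whether the input falls outside the hours in the data table, and whether the result falls outside a 0 to 100 rating scale. The function names and constants are hypothetical, and the 0 to 100 scale is the assumption made in this reflection.

```python
def happiness_model(hours):
    """Quadratic regression model from part (a), using the rounded coefficients."""
    return -0.10 * hours**2 + 6.58 * hours - 16.07

# Smallest and largest hours in the data table.
DATA_MIN_HOURS, DATA_MAX_HOURS = 6, 60
# Assumed rating scale, as in the reflection.
RATING_MIN, RATING_MAX = 0, 100

def describe_prediction(hours):
    """Return (rounded rating, is_extrapolated, is_viable) for a number of hours."""
    rating = happiness_model(hours)
    is_extrapolated = not (DATA_MIN_HOURS <= hours <= DATA_MAX_HOURS)
    is_viable = RATING_MIN <= rating <= RATING_MAX
    return round(rating, 2), is_extrapolated, is_viable

for hours in (25, 80):
    print(hours, describe_prediction(hours))
# 25 (85.93, False, True)
# 80 (-129.67, True, False)
```

The prediction for 25 hours is inside both ranges, while the prediction for 80 hours is both extrapolated and outside the rating scale.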

Outcomes

A1.N.Q.A.1

Use units as a way to understand real-world problems.*

A1.N.Q.A.1.A

Choose and interpret the scale and the origin in graphs and data displays.*

A1.N.Q.A.1.C

Define and justify appropriate quantities within a context for the purpose of modeling.*

A1.S.ID.B.4

Represent data from two quantitative variables on a scatter plot, and describe how the variables are related. Fit a function to the data; use functions fitted to data to solve problems in the context of the data.*

A1.S.ID.C.5

Interpret the rate of change and the constant term of a linear model in the context of data.*

A1.S.ID.C.6

Use technology to compute the correlation coefficient of a linear model; interpret the correlation coefficient in the context of the data.*

A1.S.ID.C.7

Explain the differences between correlation and causation. Recognize situations where an additional factor may be affecting correlated data.*

A1.MP2

Reason abstractly and quantitatively.

A1.MP3

Construct viable arguments and critique the reasoning of others.

A1.MP4

Model with mathematics.

A1.MP5

Use appropriate tools strategically.

A1.MP6

Attend to precision.

A1.MP7

Look for and make use of structure.

A1.MP8

Look for and express regularity in repeated reasoning.
