6.01 Scatter plots and fitted functions

Lesson

Concept summary

Regression analysis is used to study the relationship between paired quantities, usually represented in the form \left(x,y\right). The x-variable is the independent variable and the y-variable is the dependent variable. The data can be graphed in a scatter plot, and an equation that best fits the data, called the regression model, can be found.

Regression model

Describes the relationship between values of x and y, from which the most probable value of y can be predicted for any value of x.

We can use technology to find nonlinear regression models to fit a given set of data. For now, our nonlinear models will include polynomial (quadratic or cubic), exponential, logarithmic, or radical models.

Coefficient of determination \left(R\text{-squared}\right)

A statistical measure of how close the data values are to the fitted regression model. This value represents the percentage of the variation in y that can be explained by the variation in x.

The value of R^2 can be anything from 0 to 1, or equivalently from 0\% to 100\% when written as a percentage. The closer R^2 is to 1, the better the regression model fits the data. For a linear regression model, R^2 is the square of the correlation coefficient, r.
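As a rough illustration of how these quantities relate, the sketch below computes R^2 as 1 minus the ratio of the residual sum of squares to the total sum of squares, then compares it with r^2 for a linear fit. The data values and the function names (linear_fit, r_squared, correlation) are hypothetical and chosen only for illustration; they do not come from this lesson.

```python
import math

# Hypothetical paired data, for illustration only (not from the lesson).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def mean(values):
    return sum(values) / len(values)


def linear_fit(xs, ys):
    """Least-squares line y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def r_squared(ys, predicted):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def correlation(xs, ys):
    """Pearson correlation coefficient r."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


intercept, slope = linear_fit(xs, ys)
predicted = [intercept + slope * x for x in xs]
r2 = r_squared(ys, predicted)
r = correlation(xs, ys)
print(f"R^2 = {r2:.4f}, r^2 = {r * r:.4f}")
```

For a linear model the two printed values should agree up to floating-point rounding. For nonlinear models, R^2 is still computed from the residuals in the same way, but it is no longer simply the square of r between x and y.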

Worked examples

Example 1

The 50th percentile length of a male baby at every even month of age from 0 to 24 months is given in the table.

Age (months) | 0 | 2 | 4 | 6 | 8 | 10 | 12
Length (cm) | 49.88 | 58.42 | 63.89 | 67.62 | 70.60 | 73.28 | 75.75

Age (months) | 14 | 16 | 18 | 20 | 22 | 24
Length (cm) | 78.05 | 80.21 | 82.26 | 84.20 | 86.05 | 87.82

a

Fit a logarithmic function to the data.

Approach

We can use technology to fit a nonlinear function to data, but a logarithmic function is only defined for x>0, so we will need to remove the first data point, \left(0,49.88\right), from our table.

Enter the table in the spreadsheet.

Highlight the data and select 'Two Variable Regression'.

A screenshot of the GeoGebra Statistics tool with the data set inputted into the spreadsheet, the data being displayed on a scatter plot, and the Two Variable Regression Analysis button being selected. Speak to your teacher for more details.

Choose 'Log' as the regression model.

A screenshot of the GeoGebra Statistics tool with the data set inputted into the spreadsheet, the data being displayed on a scatter plot with a regression curve, and the Two Variable Regression Analysis button selected. The Regression Model has been changed to Log from None. Speak to your teacher for more details.

Solution

The logarithmic regression model is f\left(x\right)=47.4+12\ln(x)
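The coefficients can be cross-checked without GeoGebra. A model of the form y=a+b\ln(x) is linear in \ln(x), so one way to fit it is an ordinary least-squares line of y against \ln(x). The sketch below assumes that approach; it is not necessarily how GeoGebra computes its result internally, and the function name fit_log_model is hypothetical.

```python
import math

# Data from the table, with the age-0 point removed because ln(0) is undefined.
ages = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24]
lengths = [58.42, 63.89, 67.62, 70.60, 73.28, 75.75,
           78.05, 80.21, 82.26, 84.20, 86.05, 87.82]


def fit_log_model(xs, ys):
    """Least-squares fit of y = a + b*ln(x), via a linear fit of y on u = ln(x)."""
    us = [math.log(x) for x in xs]
    n = len(us)
    mean_u = sum(us) / n
    mean_y = sum(ys) / n
    suu = sum((u - mean_u) ** 2 for u in us)
    suy = sum((u - mean_u) * (y - mean_y) for u, y in zip(us, ys))
    b = suy / suu
    a = mean_y - b * mean_u
    return a, b


a, b = fit_log_model(ages, lengths)
print(f"f(x) = {a:.2f} + {b:.2f} ln(x)")
```

The printed coefficients should round to values close to those in the solution above.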

Reflection

The logarithmic model could also be written in other bases. For example, since \ln(x)=\ln(10)\cdot\log_{10}(x) and 12\ln(10)\approx 27.63, the model could be written as f\left(x\right)=47.4+27.63\log_{10}(x) using a base-10 logarithm.
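The base-10 coefficient follows from the change-of-base identity \ln(x)=\ln(10)\cdot\log_{10}(x). As a quick check, the sketch below converts the coefficient and confirms that both forms give the same value at a sample input. The variable and function names are illustrative only.

```python
import math

b_natural = 12  # coefficient of ln(x) in the model from part (a)
# ln(x) = ln(10) * log10(x), so the base-10 coefficient is 12 * ln(10).
b_base10 = b_natural * math.log(10)


def f_natural(x):
    return 47.4 + b_natural * math.log(x)


def f_base10(x):
    return 47.4 + b_base10 * math.log10(x)


print(f"base-10 coefficient: {b_base10:.2f}")
print(f_natural(15), f_base10(15))
```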

b

Use the regression model to predict the length of a 15-month-old baby in the 50th percentile.

Approach

We can use the regression model from part (a) and input 15 for x.

Solution

f\left(15\right)=47.4+12\ln(15)\approx 79.90. So, a baby in the 50th percentile will be about 80\text{ cm} long at 15 months old.
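The substitution can be reproduced in a few lines. This is a minimal sketch that uses the rounded coefficients from part (a); the function name f mirrors the notation in the solution.

```python
import math


def f(x):
    """Logarithmic regression model from part (a), with rounded coefficients."""
    return 47.4 + 12 * math.log(x)


prediction = f(15)
print(f"{prediction:.2f} cm")        # two decimal places, matching the table
print(f"{round(prediction)} cm")     # nearest centimeter, as reported in the solution
```

Because the model coefficients are themselves rounded, reporting the prediction to the nearest centimeter is a reasonable level of accuracy.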

c

Create a graph of the scatter plot with the regression model to check the reasonableness of the solution found in part (b).

Approach

To graph the data, we need to determine an appropriate scale. With the first data point removed, the table has a domain of [2,24] and a range of [58.42,87.82], so we can count by 2s on the x-axis and by 5s on the y-axis from 55 to 90.

Now we can use the graph to estimate the length at 15 months.

Solution

A scatter plot of the data with the logarithmic regression curve. The x-axis shows Age (months) from 2 to 24 in steps of 2, and the y-axis shows Length (cm) from 60 to 90 in steps of 5.

The function appears to pass through \left(15,80\right) which is consistent with our result in part (b).
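The same reasonableness check can be made numerically by comparing the prediction with the observed lengths at the neighboring ages in the table, 14 and 16 months. The sketch below is one possible check, not a replacement for the graph; the names observed, is_between, and residuals are illustrative.

```python
import math


def f(x):
    """Logarithmic regression model from part (a), with rounded coefficients."""
    return 47.4 + 12 * math.log(x)


# Observed 50th percentile lengths (cm) from the table at the neighboring ages.
observed = {14: 78.05, 16: 80.21}

prediction = f(15)
# A sensible prediction for 15 months should lie between the two observed values.
is_between = observed[14] < prediction < observed[16]

# Residual = observed value minus model value at each neighboring age.
residuals = {age: length - f(age) for age, length in observed.items()}

print(f"prediction at 15 months: {prediction:.2f} cm, between neighbors: {is_between}")
for age, res in residuals.items():
    print(f"residual at {age} months: {res:.2f} cm")
```

The residuals at both ages are negative, which suggests the model slightly overestimates length in this part of the age range, although the prediction still falls between the two observed values.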

Outcomes

M3.N.Q.A.1

Use units as a way to understand real-world problems.*

M3.N.Q.A.1.D

Choose an appropriate level of accuracy when reporting quantities.

M3.S.ID.B.6

Represent data from two quantitative variables on a scatter plot, and describe how the variables are related. Fit a function to the data; use functions fitted to data to solve problems in the context of the data.*

M3.MP2

Reason abstractly and quantitatively.

M3.MP4

Model with mathematics.

M3.MP5

Use appropriate tools strategically.

M3.MP6

Attend to precision.

M3.MP7

Look for and make use of structure.
