2.10 Extension: Fitting functions to quadratic data

Lesson

In our lesson from Algebra 1 on lines of best fit we saw how to fit data to a linear model to make predictions and interpret patterns in the data.  Now, we'll look at how we might do the same for data that more closely resembles a quadratic curve.

Using technology to fit quadratic regression models

Let's consider the following data set:

$x$   $4$     $4.8$   $5.1$   $6$     $7.1$   $8.2$   $9.4$
$y$   $19.4$  $20.4$  $20.2$  $19.1$  $18$    $14.9$  $10$

As we can see from the scatter plot, the data appears to be following a quadratic pattern better than a linear one.

Instead of trying to fit the data to a linear function, let's have our calculator fit a quadratic regression model to the data.

As you can see, when we go to choose a model for regression, there are many to choose from. Your knowledge of functions will help you make the best choice, but this lesson will focus on quadratics.

And here we have the equation of the quadratic function fitted to the data.

We can see that the coefficient of determination, $r^2$, is very close to $1$, which indicates that the quadratic model fits the data very well.

Given the coefficients $a$, $b$, and $c$, we have the quadratic function $y=-0.52x^2+5.27x+6.86$.
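If a graphing calculator isn't available, the same kind of fit can be sketched in a few lines of Python. This is only an illustration of least-squares quadratic regression applied to the data set above, not the calculator's actual procedure. The function names `solve_3x3`, `fit_quadratic`, and `r_squared` are made up for this example.

```python
# Sketch: least-squares quadratic fit y = a*x^2 + b*x + c via the normal equations.
# All function names here are illustrative assumptions, not calculator functions.

xs = [4, 4.8, 5.1, 6, 7.1, 8.2, 9.4]
ys = [19.4, 20.4, 20.2, 19.1, 18, 14.9, 10]


def solve_3x3(m, v):
    """Solve the 3x3 system m * u = v by Gaussian elimination with partial pivoting."""
    rows = [list(m[i]) + [v[i]] for i in range(3)]  # augmented matrix
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, 3):
            factor = rows[r][col] / rows[col][col]
            rows[r] = [rv - factor * cv for rv, cv in zip(rows[r], rows[col])]
    u = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):  # back-substitution
        known = sum(rows[i][j] * u[j] for j in range(i + 1, 3))
        u[i] = (rows[i][3] - known) / rows[i][i]
    return u


def fit_quadratic(xs, ys):
    """Return (a, b, c) minimizing the sum of squared vertical errors."""
    sx = lambda p: sum(x ** p for x in xs)
    sxy = lambda p: sum((x ** p) * y for x, y in zip(xs, ys))
    matrix = [
        [sx(4), sx(3), sx(2)],
        [sx(3), sx(2), sx(1)],
        [sx(2), sx(1), len(xs)],
    ]
    return solve_3x3(matrix, [sxy(2), sxy(1), sum(ys)])


def r_squared(xs, ys, a, b, c):
    """Coefficient of determination for the fitted quadratic."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a * x * x + b * x + c)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


a, b, c = fit_quadratic(xs, ys)
print(f"a = {a:.2f}, b = {b:.2f}, c = {c:.2f}")
print(f"r^2 = {r_squared(xs, ys, a, b, c):.3f}")
```

The printed coefficients should land close to the calculator's values, though the last digit may differ slightly depending on how each tool rounds.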

 

Practice questions

Question 1

The scatter diagram shows the number of fish in a pond as a function of time.

A scatter plot is shown with a vertical $y$-axis and a horizontal $x$-axis, both unlabeled apart from the letters $y$ and $x$ at their ends. Seven black points are plotted at $\left(1,8\right)$, $\left(2,\frac{25}{6}\right)$, $\left(3,\frac{4}{3}\right)$, $\left(4,\frac{4}{5}\right)$, $\left(5,\frac{4}{3}\right)$, $\left(6,\frac{25}{6}\right)$, and $\left(7,8\right)$.

  1. Which type of model would be appropriate for this data?

    A. Linear

    B. Quadratic
  2. Will the coefficient of $x^2$ in the quadratic model be positive or negative?

    A. Positive

    B. Negative

Question 2

A social researcher claims that the longer people stay in their job, the less satisfaction they gain from their work. She asked a sample of people how many years they had been employed in their current job and to rate their level of satisfaction out of $10$. The results are presented in the table.

Number of years employed    Satisfaction rating
$2$                         $9$
$3$                         $5$
$5$                         $4$
$6$                         $2$
$8$                         $5$
$10$                        $7$
$12$                        $8$
$13$                        $7$
$15$                        $9$
  1. Create a scatter plot for the data collected.

  2. Does the scatter plot support the social researcher's claim?

    A. Yes

    B. No
  3. Which form of equation would be best suited to model the relationship between the number of years employed and satisfaction with the job?

    A. $y=mx+b$

    B. $y=a^x+b$

    C. $y=ax^2+bx+c$
  4. The social researcher herself has been employed in her current job for $4$ years and rates her satisfaction with her work a $10$ out of $10$. What is the difference between the satisfaction rating approximated by the model $y=0.5x^2-6x+20$ and her actual rating?

Question 3

Nine data points have been plotted below with a quadratic curve of best fit.

  1. Predict the $y$-value of a point with an $x$-value of $13$.

  2. Which of the following points would be predicted by the quadratic curve of best fit?

    A. $\left(3,4\right)$

    B. $\left(14,10\right)$

    C. $\left(2,9\right)$

    D. $\left(15,15\right)$

 

Interpolation and extrapolation from a model

Given a set of data relating two variables $x$ and $y$, we can use a model to estimate how the dependent variable $y$ changes in response to the independent variable $x$. A model also lets us go one step further and predict other ordered pairs that fit this relationship.

Exploration

Say we gathered several measurements of the population $P$ of a small town $t$ years after an earthquake. We can then plot the data on the coordinate plane, with $t$ on the horizontal axis and $P$ on the vertical axis, as shown below.

Population of a town measured at several instances.

 

We can fit a model through the observed data to make predictions about the population at certain times after the earthquake. One plausible model might look like this function:

A curve modeling population of a town over time.

 

To make a prediction on the population, say two years after the earthquake, we first identify the point on the curve where $t=2$. Then we find the corresponding value of $P$. As you can see below, the model predicts that two years after the earthquake, the population of the town was $250$.

A predicted population of $250$ when $t=2$.

 

A prediction made within the range of the observed data is called an interpolation. Roughly speaking, we've gathered data between $t=0.3$ and $t=2.3$, so a prediction at $t=2$ is classified as an interpolation.

If we predict the population six years after the earthquake, we find that the population is roughly $16$. A prediction outside the range of the observed data, such as this one, is called an extrapolation.

A predicted population of $16$ when $t=6$.
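The distinction comes down to one comparison: is the input inside the range of inputs we actually observed? A minimal sketch of that check follows. The function name `classify_prediction` is made up for this example. Only the endpoints $t=0.3$ and $t=2.3$ are listed because those are the only observation times the lesson states.

```python
def classify_prediction(t, observed_times):
    """Label a prediction at input t as interpolation or extrapolation."""
    lowest, highest = min(observed_times), max(observed_times)
    return "interpolation" if lowest <= t <= highest else "extrapolation"


# Only the smallest and largest observation times are known from the lesson;
# the times in between do not affect the classification.
observed_times = [0.3, 2.3]

print(classify_prediction(2, observed_times))  # interpolation
print(classify_prediction(6, observed_times))  # extrapolation
```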

 

How reliable are these predictions? A model that fits the observed data well will generally make reliable interpolations, since the model passes roughly through the center of the data points. We can say that the model follows the trend of the observed data.

However, extrapolations are generally unreliable since we make assumptions about how the relationship continues outside of collected data. Sometimes extrapolation can be made more reliable if we have additional information about the relationship.

Consider if we were to use the following quadratic model to fit the data. We can see that interpolating doesn't change much from the previous model, but the predicted values from extrapolation are very different.

Polynomial curve modeling population of a town over time.

 

With further information, such as government funding and aid, we might expect the population of the town to increase after a certain point, and so the polynomial curve may be an appropriate model to use.
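We can see the same effect numerically with the data set from the start of this lesson (not the town's population data, whose values aren't listed). The sketch below compares the quadratic model found there, $y=-0.52x^2+5.27x+6.86$, with a least-squares line fitted to the same points. The function names `fit_line`, `quadratic`, and `line`, and the test inputs $x=6$ and $x=15$, are choices made for this example.

```python
xs = [4, 4.8, 5.1, 6, 7.1, 8.2, 9.4]
ys = [19.4, 20.4, 20.2, 19.1, 18, 14.9, 10]


def fit_line(xs, ys):
    """Least-squares line of best fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


def quadratic(x):
    """Quadratic model from the first part of the lesson."""
    return -0.52 * x ** 2 + 5.27 * x + 6.86


slope, intercept = fit_line(xs, ys)


def line(x):
    """Linear model fitted to the same data, for comparison only."""
    return slope * x + intercept


for x in (6, 15):
    kind = "interpolation" if min(xs) <= x <= max(xs) else "extrapolation"
    gap = abs(quadratic(x) - line(x))
    print(f"x = {x} ({kind}): models differ by about {gap:.1f}")
```

Inside the observed range the two models give similar predictions, but at $x=15$ they disagree by a wide margin. This is why the choice of model matters much more for extrapolation than for interpolation.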

Remember!

A prediction made within the observed data is called interpolation.

A prediction made outside the observed data is called extrapolation.

Generally, extrapolation is less reliable than interpolation since the model makes assumptions about the relationship outside the observed data set.

 

Practice question

Question 4

The height of a particular projectile $y$ in meters was measured at different times $t$ in seconds and the following curve of best fit was drawn.

  1. Using the curve of best fit, what is the predicted height of the projectile after $1$ second?

  2. Using the curve of best fit, what is the predicted height of the projectile after $4$ seconds?

  3. The height at a given time has the following relationship $y=-5t^2+100$. Which of the following statements about interpolation and extrapolation is true?

    A. Extrapolation is less reliable since we make assumptions about the relationship outside of the observed data.

    B. Extrapolation is less reliable since we should use a line of best fit instead.
