Australia, VIC
VCE 12 General 2023

3.03 Data transformation

Worksheet
Making predictions
1

A \log_{10} transformation was used to linearise the x-values from a bivariate set of data. The equation of the least squares line fitted to this data is y = 2.9 + 1.2 \log_{10} x.

Predict the value of y when x = 62. Round your answer to two decimal places.

2

A \log_{10} transformation was used to linearise the y-values from a bivariate set of data. The equation of the least squares line fitted to this data is \log_{10} y = 2.9 + 1.2 x.

Predict the value of y when x = 2. Round your answer to two decimal places.

3

A reciprocal transformation was used to linearise the x-values from a bivariate set of data. The equation of the least squares line fitted to this data is y = 17.9 + 12.9 \left(\dfrac{1}{x}\right).

Predict the value of y when x = 10.

4

A reciprocal transformation was used to linearise the y-values from a bivariate set of data. The equation of the least squares line fitted to this data is \dfrac{1}{y} = - 1.9 - 2.02 x.

Predict the value of y when x = 1.6. Round your answer to two decimal places.

5

A square transformation was used to linearise the x-values from a bivariate set of data. The equation of the least squares line fitted to this data is y = 290 + 1246 x^{2}.

Predict the value of y when x = - 3.

6

A square transformation was used to linearise the y-values from a bivariate set of data. The y-values were all non-negative. The equation of the least squares line fitted to this data is y^{2} = 3113 - 4 x.

Predict the value of y when x = - 20. Round your answer to two decimal places.
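The six questions above share one pattern: apply the transformation to x, or evaluate the line and then undo the transformation on y. The sketch below is one possible way to write that pattern in Python. The function names are hypothetical (they do not come from the worksheet), and the square-root helper assumes non-negative y-values, as question 6 states.

```python
import math

# Helpers for a transformed explanatory variable: transform x, then apply the line.
def predict_log_x(a, b, x):
    """Line fitted as y = a + b*log10(x)."""
    return a + b * math.log10(x)

def predict_reciprocal_x(a, b, x):
    """Line fitted as y = a + b*(1/x)."""
    return a + b / x

def predict_square_x(a, b, x):
    """Line fitted as y = a + b*x^2."""
    return a + b * x ** 2

# Helpers for a transformed response variable: apply the line, then undo the transformation.
def predict_log_y(a, b, x):
    """Line fitted as log10(y) = a + b*x; undo the log with a power of 10."""
    return 10 ** (a + b * x)

def predict_reciprocal_y(a, b, x):
    """Line fitted as 1/y = a + b*x; undo the reciprocal by inverting."""
    return 1 / (a + b * x)

def predict_square_y(a, b, x):
    """Line fitted as y^2 = a + b*x; take the non-negative root (assumes y >= 0)."""
    return math.sqrt(a + b * x)

# Question 2: log10(y) = 2.9 + 1.2x at x = 2
print(round(predict_log_y(2.9, 1.2, 2), 2))
# Question 6: y^2 = 3113 - 4x at x = -20
print(round(predict_square_y(3113, -4, -20), 2))
```

Rounding is left to the final step so that intermediate values keep full precision.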

Square transformations
7

Consider the following set of data:

x | 0 | 2 | 4 | 6 | 8 | 10 | 12 | 14 | 16
y | 0 | 1.3 | 2 | 2.5 | 2.7 | 3.1 | 3.4 | 3.6 | 3.9

Apply a square transformation to the y-values and determine the equation of the least squares line for the transformed data. Give your answer in the form y^{2}=a+bx, where a and b are rounded to two decimal places.
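Where a CAS calculator is not at hand, the same steps can be sketched in Python: square the y-values, then fit a least squares line to the pairs (x, y^2). The helper `least_squares` is a hypothetical name written out from the standard slope and intercept formulas, and the data lists are the values as read from the table in this question.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least squares line ys = a + b*xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx           # slope
    a = mean_y - b * mean_x   # intercept
    return a, b

xs = [0, 2, 4, 6, 8, 10, 12, 14, 16]
ys = [0, 1.3, 2, 2.5, 2.7, 3.1, 3.4, 3.6, 3.9]

# Square transformation of the response variable.
y_squared = [y ** 2 for y in ys]

a, b = least_squares(xs, y_squared)
print(f"y^2 = {a:.2f} + {b:.2f}x")
```

The line is fitted to the transformed values, so the left-hand side of the printed equation is y^2 rather than y.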

8

The following plot shows a least squares line obtained after applying a square transformation to the response variable. Determine the equation of the least squares line in the form: \text{Velocity}^{2} = a + b \left(\text{Resistance}\right)

[Plot: least squares line. Horizontal axis: \text{Resistance}, ticks from 10 to 90. Vertical axis: \text{Velocity}^2, ticks from 30 to 270.]

9

Consider the following data set:

x | 1 | 1.8 | 2.2 | 3 | 4 | 4.5 | 5.8 | 6.2 | 7 | 9
y | -5 | -0.5 | -0.3 | 7 | 7 | 19.5 | 42.3 | 79.9 | 116 | 155

a

Use technology to create a scattergraph of the above data.

b

Does the shape of the data appear to be linear or non-linear?

c

Complete the transformation of the x-values by completing the table below:

x^{2} |  |  |  |  |  |  |  |  |  | 
y | -5 | -0.5 | -0.3 | 7 | 7 | 19.5 | 42.3 | 79.9 | 116 | 155

d

Find the coefficient of correlation of the transformed data set. Round your answer to two decimal places.

e

Find the equation of the least squares regression line for your transformed data set. Give your answer in the form y = a + b x^{2}, where a and b are rounded to one decimal place.

f

Predict the value for y when x = 70. Round your answer to two decimal places.

g

Comment on the validity of this prediction.
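A comment on validity usually starts by checking whether the new x-value lies inside the range of the observed x-values (interpolation) or outside it (extrapolation). The sketch below shows that check. The function name `classify_prediction` is hypothetical, and the x-values are the ones read from the table in this question.

```python
def classify_prediction(x_new, xs):
    """Label a prediction by comparing x_new with the observed range of xs."""
    lowest, highest = min(xs), max(xs)
    if lowest <= x_new <= highest:
        return "interpolation"
    return "extrapolation"

# x-values as read from the table in this question.
xs = [1, 1.8, 2.2, 3, 4, 4.5, 5.8, 6.2, 7, 9]

print(classify_prediction(70, xs))
```

A range check is only a first step: a prediction inside the data range can still be unreliable if the correlation is weak or the residual plot shows a pattern.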

Logarithmic transformations
10

Consider the following set of data:

x | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7
y | 15 | 25 | 39 | 63 | 100 | 158 | 244

Apply a \log_{10} transformation to the y-values and determine the equation of the least squares line for the transformed data.

Give your answer in the form \log_{10} y = a + b x, where a and b are rounded to two decimal places.

11

Consider the following set of data:

x472815435091476112312
y80.369.759.147.538.721.314.883.2
a

Calculate the coefficient of correlation, r. Round your answer to two decimal places.

b

Apply a \log_{10} transformation to the x-values, and then find the new coefficient of correlation, r. Round your answer to two decimal places.

c

Comment on the effect of applying the transformation.

12

Consider the following data set:

x | 1 | 1.6 | 2.4 | 3.4 | 7 | 9 | 10.8 | 13.6 | 15 | 17
y | 3 | 5.8 | 7.42 | 8.77 | 11.15 | 13.55 | 12.27 | 13.67 | 11.62 | 17.07

a

Use technology to create a scattergraph of this data.

b

Does the shape of the data appear to be linear or non-linear?

c

Complete the transformation on the x-values by completing the following table. Round your answers to two decimal places.

\log_{10} x |  |  |  |  |  |  |  |  |  | 
y | 3 | 5.8 | 7.42 | 8.77 | 11.15 | 13.55 | 12.27 | 13.67 | 11.62 | 17.07

d

Find the coefficient of correlation of the transformed data set. Round your answer to two decimal places.

e

Determine the equation of the least squares regression line of your transformed data set. Give your answer in the form y = a + b \log_{10} x, where a and b are rounded to one decimal place.

f

Predict the value for y when x = 7. Round your answer to two decimal places.

g

Comment on the validity of your prediction.

13

The average number of avocados purchased per week by households with various weekly incomes is researched and recorded below:

a

Use technology to create a scattergraph of this data.

b

Does the shape of the data appear to be linear or non-linear?

c

Complete the transformation on the x-values by completing the last column of the table. Round your answers to two decimal places.

d

Find the coefficient of correlation for the transformed data set. Round your answer to two decimal places.

\text{Income } (x) | \text{Average number of avocados } (y) | \text{Log income } (\log_{10} x)
500 | 2.18 | 
700 | 2.46 | 
800 | 2.6 | 
900 | 2.74 | 
1000 | 2.75 | 
1200 | 2.9 | 
1250 | 2.78 | 
1500 | 2.8 | 
1800 | 2.94 | 
2000 | 2.83 | 

e

Find the least squares regression line for your transformed data set. Give your answer in the form y = a + b \log_{10} x, where a and b are rounded to one decimal place.

f

Make a prediction for the average number of avocados purchased by a household with a weekly income of \$600. Round your answer to two decimal places.

g

Comment on the validity of this prediction.
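One way to see what the transformation does is to compute the correlation coefficient twice: once for (x, y) and once for (\log_{10} x, y). The sketch below does this for the avocado data as read from the table in this question. The helper `pearson_r` is a hypothetical name written out from the standard formula for r.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient r for paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

income = [500, 700, 800, 900, 1000, 1200, 1250, 1500, 1800, 2000]
avocados = [2.18, 2.46, 2.6, 2.74, 2.75, 2.9, 2.78, 2.8, 2.94, 2.83]

# log10 transformation of the explanatory variable.
log_income = [math.log10(x) for x in income]

r_raw = pearson_r(income, avocados)
r_log = pearson_r(log_income, avocados)
print(f"r before transformation: {r_raw:.2f}")
print(f"r after transformation:  {r_log:.2f}")
```

A value of r closer to 1 after the transformation suggests a more linear association, although a residual plot is still needed to confirm it.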

14

The following plot shows a least squares line obtained after applying a \log_{10} transformation to the explanatory variable. Determine the equation of the least squares line in the form: y = a + b \log_{10} \left(x\right)

[Plot: least squares line. Horizontal axis: \log_{10}(x), ticks from -25 to 25. Vertical axis: y, ticks from 5 to 55.]

Reciprocal transformations
15

Consider the following set of data:

x | 0.8 | 1.77 | 2.7 | 3.62 | 4.9 | 5.7 | 7 | 8.6 | 10.1
y | 0.233 | 0.294 | 0.345 | 0.4 | 0.435 | 0.526 | 0.526 | 0.588 | 0.667

Apply a reciprocal transformation to the y-values and determine the equation of the least squares line for the transformed data.

Give your answer in the form \dfrac{1}{y} = a + b x, where a and b are rounded to two decimal places.

16

The following plot shows a least squares line obtained after applying a reciprocal transformation to the explanatory variable. Determine the equation of the least squares line in the form: \text{Length } = a + b \left(\dfrac{1}{\text{Height}}\right)

[Plot: least squares line. Horizontal axis: \dfrac{1}{\text{Height}}, ticks from 1 to 9. Vertical axis: \text{Length}, ticks from 2 to 22.]

17

Consider the following data set:

x | 0.1 | 0.6 | 0.9 | 1.7 | 3.5 | 4.7 | 6.6 | 9.9 | 10.6 | 14.9
y | 28 | 19.33 | 7.22 | 7.18 | 10.57 | 5.43 | 6.3 | 6.33 | 5.69 | 9.13

a

Use technology to create a scattergraph of this data.

b

Does the shape of the data appear to be linear or non-linear?

c

Complete the transformation of the x-values by completing the following table. Round your answers to two decimal places.

\dfrac{1}{x} |  |  |  |  |  |  |  |  |  | 
y | 28 | 19.33 | 7.22 | 7.18 | 10.57 | 5.43 | 6.3 | 6.33 | 5.69 | 9.13

d

Find the coefficient of correlation of the transformed data set. Round your answer to two decimal places.

e

Find the equation of the least squares regression line for your transformed data set. Give your answer in the form y = a + b \times \left(\dfrac{1}{x}\right), where a and b are rounded to one decimal place.

f

Predict the value for y when x = 100. Round your answer to two decimal places.

g

Comment on the validity of this prediction.

18

Google Maps algorithms calculate the time it will take to travel various distances at certain times of the day and at different speeds. For a particular stretch of highway, Google Maps collected the following data:

a

Use technology to create a scattergraph of this data.

b

Does the shape of the data appear to be linear or non-linear?

c

Complete the transformation on the x-values by completing the last column of the table. Round your answers to two decimal places.

d

Find the coefficient of correlation of the transformed data set. Round your answer to two decimal places.

e

Find the equation of the least squares regression line for your transformed data set. Give your answer in the form y = a + b \times \left(\dfrac{1}{x}\right), where a and b are rounded to one decimal place.

\text{Time in hours } (x) | \text{Average speed } (y) | \text{Reciprocal of time in hours } \left(\dfrac{1}{x}\right)
1.06 | 24.33 | 
1.5 | 22.5 | 
1.23 | 23.69 | 
0.68 | 27.41 | 
1.5 | 23.13 | 
0.79 | 25.8 | 
1.06 | 25.33 | 
1.31 | 23.29 | 
1.73 | 22.23 | 
0.89 | 24.87 | 
1.41 | 23 | 
1.1 | 23.73 | 

f

Predict the average speed of a vehicle which travelled for 30 minutes. Round your answer to two decimal places.

g

Comment on the validity of this prediction.

Which transformation is best?
19

Consider the first residual plot shown below:

a

Does the residual plot suggest that the data is linear or non-linear?

[Residual plot: horizontal axis x, ticks from 2 to 20. Vertical axis y, ticks from -4 to 11.]

b

A transformation is applied to the data. The resulting residual plot is shown on the right:

Does the residual plot suggest that the transformation has increased or decreased the linearity of the data?

[Residual plot: horizontal axis x, ticks from 2 to 20. Vertical axis ticks from -0.3 to 0.4.]

20

Consider the first residual plot shown:

a

Does the residual plot suggest that the data is linear or non-linear?

[Residual plot: horizontal axis x, ticks from 2 to 20. Vertical axis y, ticks from -1 to 1.]

b

A transformation is applied to the data. The resulting residual plot is shown on the right:

Does the residual plot suggest that the transformation has increased or decreased the linearity of the data?

[Residual plot: horizontal axis x, ticks from 2 to 20. Vertical axis y, ticks from -1 to 1.5.]

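The idea behind questions 19 and 20 can be tried numerically. A residual is the observed value minus the value predicted by the least squares line, and a curved run of residual signs suggests non-linearity. The sketch below borrows the data from question 10 as an illustration, fitting a line before and after a \log_{10} transformation of y. The helper names `fit_line` and `residuals` are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Return (a, b) for the least squares line ys = a + b*xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx
    return mean_y - b * mean_x, b

def residuals(xs, ys):
    """Residual = observed value minus value predicted by the fitted line."""
    a, b = fit_line(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Data borrowed from question 10 of this worksheet.
xs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
ys = [15, 25, 39, 63, 100, 158, 244]

raw_residuals = residuals(xs, ys)
log_residuals = residuals(xs, [math.log10(y) for y in ys])

print([round(e, 1) for e in raw_residuals])
print([round(e, 3) for e in log_residuals])
```

For this data the untransformed residuals should be positive at both ends and negative in the middle, which is the curved pattern a residual plot would reveal, while the residuals after the transformation are all close to zero.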
21

The table shows two transformations applied to a set of bivariate data, along with their r^{2} values and a description of their residual plot:

Based on this information, which transformation is more suitable for linearising the data: the square transformation or the logarithmic transformation?

\text{Transformation} | \text{Residual plot} | r^{2}
x^{2} | \text{Shows a pattern} | 0.68
\log_{10} y | \text{Random} | 0.56

22

Consider the following data set:

a

Use technology to create a scattergraph of this data.

b

State whether the shape of the data appears to be reciprocal, logarithmic or parabolic.

c

Complete the appropriate transformation on the x values by completing the last column of the table. Round your answers to two decimal places.

d

Find the correlation coefficient of the transformed data set. Round your answer to two decimal places.

e

Find the equation of the least squares regression line for your transformed data set.

xy\text{Transformed }x
0.1-19
0.6-4.32
1.413.9
2.123.92
3-22.12
621.68
6.8-9.02
17.6-0.5
20-1.2
3236.06
f

Predict the value for y when x = 100. Round your answer to two decimal places.

g

Comment on the validity of your prediction.

23

Consider the following data set:

a

Use technology to create a scattergraph of this data.

b

State whether the shape of the data appears to be logarithmic, reciprocal or parabolic.

c

Apply the appropriate transformation to the x values by completing the last column of the table.

d

Find the correlation coefficient of the transformed data set. Round your answer to two decimal places.

e

Find the equation of the least squares regression line for your transformed data set.

f

Predict the value for y when x = 30. Round your answer to two decimal places.

g

Comment on the validity of your prediction.

xy\text{Transformed }x
1-49
2-102
2.81707.3
4768
5.51191.5
8.56858.5
10.84773.7
15.24253.9
205201
39.4-9118.7
24

The following set of data shows the energy consumption \left(\dfrac{\text{kW hours}}{\text{month}}\right) of various households and the size of the house in square feet:

\text{Size of house } (x) | \text{Energy consumption } (y) | \text{Transformed } x
2010 | 20.1 | 
1760 | 49.75 | 
1460 | 28.12 | 
1880 | 29.89 | 
2980 | 36.29 | 
1910 | 49.96 | 
2500 | 32.8 | 
1800 | 14.82 | 
2380 | 55.66 | 
1240 | 23.63 | 

a

Use technology to create a scattergraph of this data.

b

State whether the shape of the data appears to be logarithmic, parabolic or reciprocal.

c

Apply the appropriate transformation on the x-values by completing the last column of the table. Round your answers to two decimal places.

d

Find the correlation coefficient of the transformed data set. Round your answer to two decimal places.

e

Find the equation of the least squares regression line of your transformed data set.

f

Predict the value for the energy consumption when the house is 1500 square feet. Round your answer to the nearest integer.

g

Comment on the validity of this prediction.

25

Consider the following set of data:

a

Apply a \log_{10} transformation to the y-values and determine the coefficient of determination. Round your answer to two decimal places.

b

Apply a reciprocal transformation to the y-values and determine the coefficient of determination. Round your answer to two decimal places.

x | y
0.05 | 20.01
0.1 | 9.98
0.2 | 5.05
2 | 0.533
4.62 | 0.216
4.9 | 0.204
5.7 | 0.075
8 | 0.225
10 | 0.1
12 | 0.073
14.1 | 0.071
19.3 | 0.052

c

The residual plots for each transformation are given below:

Logarithmic transformation

[Residual plot: horizontal axis x, ticks from 2 to 18. Vertical axis y, ticks from -0.8 to 0.8.]

Reciprocal transformation

[Residual plot: horizontal axis x, ticks from 2 to 18. Vertical axis y, ticks from -4 to 7.]

Which transformation is more appropriate for fitting a least squares line to this data?


Outcomes

U3.AoS1.12

data transformation and its purpose

U3.AoS1.27

construct a residual analysis to test the assumption of linearity and, in the case of clear non-linearity, transform the data to achieve linearity and repeat the modelling process using the transformed data
