Australia, VIC
VCE 12 General 2023

4.05 Forecasts

Fit a least squares line to time series data

Although it is not necessary to deseasonalise time series data before fitting a least squares regression line, this is often done first. The least squares line can then be used to make predictions, keeping in mind the limitations of extrapolation.

Examine the smoothing effect that deseasonalising the data has on the original data set:

The image shows a graph with raw data and deseasonalised data.

The deseasonalised data looks much smoother than the original data, which makes it easier to fit a least squares regression line.

To substitute values into a regression line, it is first necessary to give each time period a numerical value, as shown in the table below, where, for example, March 2012 is represented by the number 1.

The image shows a table of time period, raw data, proportion, and deseasonalised data.

Using a CAS calculator, it is now possible to enter the time values as the independent variable and the deseasonalised data as the dependent variable, and then fit a linear regression model to them.

The image shows a calculation of linear regression in a CAS calculator.

The equation of the least squares line is now y=494.4746-3.2519t.

This line can now be used to predict a future value.

For example, consider the value for September 2015.

The time period associated with September 2015 will be t=15. This is obtained by counting on from the last piece of available data in the table.

Substituting t=15 into the regression line gives the prediction y=445.6968.

It is important to remember, however, that this is the deseasonalised (smoothed) value for September 2015. In the graph above, this would be the value taken from the green plotted data.

To find the more realistic value which contains the seasonality component, it is necessary to simply reverse the deseasonalising process by multiplying the predicted value by the appropriate seasonal index.

Recall the seasonal indices from before:

The image shows a table of seasonal indices.

So to finalise the prediction for September 2015 we multiply by the seasonal index for September:

\begin{aligned} y&=445.6968\times 1.0135\\ y&=451.7137 \end{aligned}
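
As a quick arithmetic check, the two steps above (substitute into the line, then multiply by the seasonal index) can be sketched in Python. The names `trend_value` and `reseasonalise` are illustrative assumptions and not part of the lesson; the coefficients and the September index are the values quoted above. Because the quoted coefficients are rounded, the computed values may differ slightly from the quoted ones in the last decimal places.

```python
# Illustrative sketch only: the names are hypothetical, the numbers are quoted from the lesson.
INTERCEPT = 494.4746       # intercept of the least squares line
SLOPE = -3.2519            # slope of the least squares line
SEPTEMBER_INDEX = 1.0135   # seasonal index for the September quarter


def trend_value(t):
    """Deseasonalised (smoothed) prediction read off the regression line."""
    return INTERCEPT + SLOPE * t


def reseasonalise(deseasonalised, seasonal_index):
    """Reverse the deseasonalising step by multiplying by the seasonal index."""
    return deseasonalised * seasonal_index


smoothed = trend_value(15)  # September 2015 corresponds to t = 15
forecast = reseasonalise(smoothed, SEPTEMBER_INDEX)
print(round(smoothed, 4), round(forecast, 4))
```

Keeping the two steps as separate functions mirrors the lesson: the line predicts the smoothed value, and the seasonal index is only applied at the end.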

Idea summary

Steps to predict from time series data:

  1. Smooth the data by deseasonalising.

  2. Give each time period a number.

  3. Calculate the equation of the least-squares regression line.

  4. Substitute a future time value into the equation of the least-squares regression line to predict the deseasonalised value.

  5. Multiply the predicted value by the appropriate seasonal index to factor back in the seasonality of the data.

Reliability of the prediction

When using time series data, the prediction will almost always be an extrapolation. As such, predictions may be somewhat unreliable, because future events cannot be predicted accurately. However, as a general rule, a prediction made within one cycle of the available data is considered reliable.

In the example above, one cycle was four quarters, and the last available piece of data was December 2014. So any prediction made for any of the quarters in 2015 would be considered reliable as they are within one cycle of December 2014. When predicting for 2016 and beyond, there is a much higher risk of being inaccurate.
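
The one-cycle rule of thumb can be written as a small check. This is a minimal sketch: the function name `within_one_cycle` is an assumption, and so is the choice to count a forecast exactly one cycle after the last data point as still within one cycle, which matches the statement that every quarter of 2015 is considered reliable.

```python
def within_one_cycle(t_forecast, t_last, cycle_length):
    """True if the forecast period is after the last data point
    and no more than one full cycle beyond it."""
    return t_last < t_forecast <= t_last + cycle_length


T_LAST = 12   # December 2014 is the last available period
CYCLE = 4     # one cycle is four quarters

for label, t in [("September 2015", 15), ("December 2015", 16), ("March 2016", 17)]:
    print(label, within_one_cycle(t, T_LAST, CYCLE))
```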

Examples

Example 1

The following data shows the sales of air conditioners at a leading retailer in each quarter from 2012 to 2014.

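\begin{array}{|l|c|c|c|}
\hline
\text{Time period} & \text{Number of air conditioners sold} & \text{Proportion of yearly mean} & \text{Deseasonalised data} \\
\hline
\text{1 (March 2012)} & 1042 & 0.8529 & \\
\text{2 (June 2012)} & 486 & 0.3978 & \\
\text{3 (Sept 2012)} & 613 & 0.5017 & \\
\text{4 (Dec 2012)} & 2746 & 2.2476 & \\
\text{5 (March 2013)} & 1160 & 0.8183 & \\
\text{6 (June 2013)} & 609 & 0.4296 & \\
\text{7 (Sept 2013)} & 1139 & 0.8035 & \\
\text{8 (Dec 2013)} & 2762 & 1.9485 & \\
\text{9 (March 2014)} & 1795 & 0.9638 & \\
\text{10 (June 2014)} & 1181 & 0.6341 & \\
\text{11 (Sept 2014)} & 1094 & 0.5874 & \\
\text{12 (Dec 2014)} & 3380 & 1.8148 & \\
\hline
\end{array}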
a

Calculate the seasonal component for the quarters ending in March, June, September, and December, rounding to four decimal places if necessary.

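\begin{array}{|c|c|c|c|}
\hline
\text{March} & \text{June} & \text{September} & \text{December} \\
\hline
 & & & \\
\hline
\end{array}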
Worked Solution
Create a strategy

Average the proportion of yearly mean values for each quarter.

Apply the idea
\begin{aligned}
\text{March} &= \dfrac{0.8529+0.8183+0.9638}{3} && \text{Average the proportion of yearly means for March} \\
&= 0.8783 && \text{Evaluate} \\
\text{June} &= \dfrac{0.3978+0.4296+0.6341}{3} && \text{Average the proportion of yearly means for June} \\
&= 0.4872 && \text{Evaluate} \\
\text{September} &= \dfrac{0.5017+0.8035+0.5874}{3} && \text{Average the proportion of yearly means for September} \\
&= 0.6309 && \text{Evaluate} \\
\text{December} &= \dfrac{2.2476+1.9485+1.8148}{3} && \text{Average the proportion of yearly means for December} \\
&= 2.0036 && \text{Evaluate}
\end{aligned}
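\begin{array}{|c|c|c|c|}
\hline
\text{March} & \text{June} & \text{September} & \text{December} \\
\hline
0.8783 & 0.4872 & 0.6309 & 2.0036 \\
\hline
\end{array}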
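
The averaging in part (a) can be reproduced with a short Python sketch. It starts from the raw sales figures in the table, so it also recomputes the proportion of yearly mean column. The names `SALES`, `proportions_of_yearly_mean` and `seasonal_indices` are illustrative assumptions, and rounding each proportion to four decimal places before averaging is an assumption made to mirror the table.

```python
# Quarterly sales from the table, grouped by year (March, June, September, December).
SALES = {
    2012: [1042, 486, 613, 2746],
    2013: [1160, 609, 1139, 2762],
    2014: [1795, 1181, 1094, 3380],
}
QUARTERS = ["March", "June", "September", "December"]


def proportions_of_yearly_mean(values):
    """Divide each quarter's value by the mean of its own year."""
    yearly_mean = sum(values) / len(values)
    return [v / yearly_mean for v in values]


def seasonal_indices(sales_by_year):
    """Average the proportions for each quarter across all years."""
    # Round each proportion to 4 d.p. first, as in the table (an assumption).
    props = [
        [round(p, 4) for p in proportions_of_yearly_mean(values)]
        for values in sales_by_year.values()
    ]
    n_years = len(props)
    # zip(*props) groups the March values together, then June, and so on.
    return [round(sum(column) / n_years, 4) for column in zip(*props)]


indices = dict(zip(QUARTERS, seasonal_indices(SALES)))
print(indices)
```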
b

Deseasonalise the data in the table and fill in the last column. Round off to the nearest whole air conditioner sold.

Worked Solution
Create a strategy

Divide each raw data value by the corresponding seasonal index found in part (a).

Apply the idea
\begin{array}{|l|c|c|c|}
\hline
\text{Time period} & \text{Number of air conditioners sold} & \text{Proportion of yearly mean} & \text{Deseasonalised data} \\
\hline
\text{1 (March 2012)} & 1042 & 0.8529 & \dfrac{1042}{0.8783}=1186 \\
\text{2 (June 2012)} & 486 & 0.3978 & \dfrac{486}{0.4872}=998 \\
\text{3 (Sept 2012)} & 613 & 0.5017 & \dfrac{613}{0.6309}=972 \\
\text{4 (Dec 2012)} & 2746 & 2.2476 & \dfrac{2746}{2.0036}=1371 \\
\text{5 (March 2013)} & 1160 & 0.8183 & \dfrac{1160}{0.8783}=1321 \\
\text{6 (June 2013)} & 609 & 0.4296 & \dfrac{609}{0.4872}=1250 \\
\text{7 (Sept 2013)} & 1139 & 0.8035 & \dfrac{1139}{0.6309}=1805 \\
\text{8 (Dec 2013)} & 2762 & 1.9485 & \dfrac{2762}{2.0036}=1379 \\
\text{9 (March 2014)} & 1795 & 0.9638 & \dfrac{1795}{0.8783}=2044 \\
\text{10 (June 2014)} & 1181 & 0.6341 & \dfrac{1181}{0.4872}=2424 \\
\text{11 (Sept 2014)} & 1094 & 0.5874 & \dfrac{1094}{0.6309}=1734 \\
\text{12 (Dec 2014)} & 3380 & 1.8148 & \dfrac{3380}{2.0036}=1687 \\
\hline
\end{array}
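
The divisions in the last column can be checked with a few lines of Python. This is a sketch under the assumption that the rounded seasonal indices from part (a) are used, as in the table; the names `SEASONAL_INDEX`, `RAW_SALES` and `deseasonalise` are illustrative.

```python
# Rounded seasonal indices from part (a).
SEASONAL_INDEX = {"March": 0.8783, "June": 0.4872, "September": 0.6309, "December": 2.0036}

# (quarter, number sold) for time periods 1 to 12, in order.
RAW_SALES = [
    ("March", 1042), ("June", 486), ("September", 613), ("December", 2746),
    ("March", 1160), ("June", 609), ("September", 1139), ("December", 2762),
    ("March", 1795), ("June", 1181), ("September", 1094), ("December", 3380),
]


def deseasonalise(value, seasonal_index):
    """Remove the seasonal effect by dividing by the seasonal index."""
    return value / seasonal_index


# Round to the nearest whole air conditioner, as the question asks.
deseasonalised = [
    round(deseasonalise(sold, SEASONAL_INDEX[quarter])) for quarter, sold in RAW_SALES
]
print(deseasonalised)
```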
c

Use your calculator to calculate the least squares regression line that fits the deseasonalised data, rounding values to a single decimal place if necessary.

Give the equation of the line in the form y=at+b.

Worked Solution
Create a strategy

Enter the t-values in the first list or column, and deseasonalised values in the second list or column.

Apply the idea

Using your calculator, you should get the following equation: y=92.3t+914.4
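
If a CAS calculator is not to hand, the same line can be found from the standard least squares formulas, where the slope is the sum of products of deviations divided by the sum of squared deviations in t, and the intercept follows from the two means. The sketch below is not the calculator's own procedure. The function name `least_squares_line` is an assumption, and the data are the rounded deseasonalised values from part (b).

```python
T = list(range(1, 13))  # time periods 1 to 12
Y = [1186, 998, 972, 1371, 1321, 1250, 1805, 1379, 2044, 2424, 1734, 1687]


def least_squares_line(ts, ys):
    """Return (slope, intercept) of the least squares line y = slope*t + intercept."""
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    # Sum of products of deviations, and sum of squared deviations in t.
    s_ty = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    s_tt = sum((t - t_mean) ** 2 for t in ts)
    slope = s_ty / s_tt
    intercept = y_mean - slope * t_mean
    return slope, intercept


a, b = least_squares_line(T, Y)
print(f"y = {a:.1f}t + {b:.1f}")
```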

d

Predict the number of air conditioners sold in the quarter ending December 2015.

Round off to the nearest whole air conditioner sold.

Worked Solution
Create a strategy

Substitute the time value into the regression line equation from part (c), then multiply the result by the December seasonal index from part (a) to reverse the effect of deseasonalisation.

Apply the idea

For December 2015, the t-value would be t=12+4=16. So we can substitute this value in the equation from part (c).

\begin{aligned}
y &= 92.3t+914.4 && \text{Write the equation} \\
&= 92.3 \times 16 + 914.4 && \text{Substitute } t=16 \\
&= 2391.2 && \text{Evaluate}
\end{aligned}

Now we need to multiply this value by the seasonal component for December from part (a) which was 2.0036.

\begin{aligned}
\text{Air conditioners} &= 2391.2 \times 2.0036 && \text{Multiply by the seasonal component} \\
&= 4791 && \text{Evaluate and round}
\end{aligned}
e

Comment on the reliability of your prediction.

A. Reliable due to the prediction being made within one cycle of the available data.
B. Unreliable due to the prediction being made beyond one cycle of the available data.
Worked Solution
Create a strategy

Determine whether the prediction made in part (d) is within or beyond one cycle of the data.

Apply the idea

December 2015 is only one year (four quarters) after December 2014, the last data point recorded in the table, so it is within one cycle.

The correct option is A.

Idea summary

If asked to comment on the reliability of the prediction, consider whether the future time value lies within one cycle of the last available data.

Outcomes

U3.AoS1.30

model linear trends using the least squares line of best fit, interpret the model in the context of the trend being modelled, use the model to make forecasts with consideration of the limitations of extending forecasts too far into the future
