NZ Level 8 (NZC) Level 3 (NCEA) [In development] Making Predictions from Time Series Data (using LSRL)
Once we've calculated the seasonal index for each season and used these indices to deseasonalise our data, we're finally ready to fit a least squares regression line to the deseasonalised data and use it to make predictions.

Let's begin by examining the smoothing effect that deseasonalising the data had on the original data set. We can see that the deseasonalised data looks much smoother than the original data and we can now fit a least squares regression line.

We first need to give each time period a numerical value, as shown in the table below. Using our CAS calculator, we enter the time values as the independent variable and the deseasonalised data as the dependent variable, then fit a linear regression model. This gives $\hat{y}=-3.2519t+494.4746$ as the equation of our least squares regression line.
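The calculator step can also be sketched in code. The block below is a minimal illustration only: the deseasonalised values are made up (they are not the data from this lesson), and the function name `fit_lsrl` is hypothetical. It applies the standard least squares formulas for the slope and intercept.

```python
def fit_lsrl(t_values, y_values):
    """Return (slope, intercept) of the least squares line y = slope * t + intercept."""
    n = len(t_values)
    t_mean = sum(t_values) / n
    y_mean = sum(y_values) / n
    # Slope is the sum of cross-deviations divided by the sum of squared t-deviations.
    s_ty = sum((t - t_mean) * (y - y_mean) for t, y in zip(t_values, y_values))
    s_tt = sum((t - t_mean) ** 2 for t in t_values)
    slope = s_ty / s_tt
    # The least squares line always passes through the point (t_mean, y_mean).
    intercept = y_mean - slope * t_mean
    return slope, intercept


# Hypothetical deseasonalised values for time periods t = 1, ..., 6 (illustration only).
time_periods = [1, 2, 3, 4, 5, 6]
deseasonalised = [490.0, 488.0, 485.0, 481.0, 479.0, 475.0]

slope, intercept = fit_lsrl(time_periods, deseasonalised)
print(f"y-hat = {slope:.4f}t + {intercept:.4f}")
```

A CAS calculator or statistics package should give the same coefficients for the same inputs, since the least squares line is unique.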

We can now use this line to predict a future value. Let's consider the value for September 2015.

The time period associated with September 2015 is $t=15$. This is obtained by counting on from the last piece of available data in the table.
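The counting can be written as a small helper. This sketch assumes the series is quarterly with no gaps and that December 2014, the last available data point, is $t=12$ (so $t=1$ would be March 2012); the function name `time_period` is hypothetical.

```python
# Position of each quarter-ending month within a year.
MONTH_TO_QUARTER = {"March": 1, "June": 2, "September": 3, "December": 4}


def time_period(year, month, first_year=2012):
    """Quarter number counted from t = 1 at March of first_year (assumed start)."""
    return 4 * (year - first_year) + MONTH_TO_QUARTER[month]


print(time_period(2014, "December"))   # last available data point -> 12
print(time_period(2015, "September"))  # quarter being predicted -> 15
```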

We now substitute that value into our regression line to make our prediction and obtain $\hat{y}=445.6968$.

But we're not quite finished yet.

We've just found the deseasonalised value for September 2015. As we know, this is the smoothed value, the one that belongs with the green plotted data on the graph above. What we're really interested in is the more realistic value that includes the seasonality and fits in with the blue plotted data.

So how do we add the seasonality back into our predicted value?

We simply reverse the deseasonalising process by multiplying our predicted value by the seasonal index for the season we're predicting.

Recall our seasonal indices from before. To finalise our prediction for September 2015, we multiply the deseasonalised prediction by the seasonal index for that quarter:

$\hat{y}=445.6968\times1.0135$

$\hat{y}=451.7137$
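The two steps, predicting the deseasonalised value and then restoring the seasonality, can be sketched together. The coefficients and the seasonal index are taken from the text; the function names are hypothetical. Because the sketch uses the rounded coefficients, its results may differ from the values above in the last decimal places.

```python
SLOPE = -3.2519           # from the regression line in the text
INTERCEPT = 494.4746      # from the regression line in the text
SEPTEMBER_INDEX = 1.0135  # seasonal index used in the text for September


def predict_deseasonalised(t):
    """Smoothed (deseasonalised) prediction from the regression line."""
    return SLOPE * t + INTERCEPT


def reseasonalise(value, seasonal_index):
    """Reverse deseasonalising: multiply by the relevant seasonal index."""
    return value * seasonal_index


smoothed = predict_deseasonalised(15)
final = reseasonalise(smoothed, SEPTEMBER_INDEX)
print(round(smoothed, 4), round(final, 4))
```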

## How reliable is our prediction?

When using time series data our prediction will almost always be an extrapolation. As such, we know our predictions might be somewhat unreliable due to our inability to accurately predict future events. However, in general we say that as long as our prediction has been made within one cycle of the available data, we can consider our prediction reliable.

In the example above, one cycle was four quarters, and our last available piece of data was December 2014. So any prediction made for any of the quarters in 2015 would be considered reliable as they are within one cycle since December 2014. If we started predicting for 2016 and beyond, we'd run the risk of being too inaccurate.
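The one-cycle rule of thumb can be expressed as a simple check. This is a sketch of the guideline described above, not a formal statistical test; the function name `within_one_cycle` is hypothetical, and the time periods assume December 2014 is $t=12$.

```python
def within_one_cycle(last_t, target_t, cycle_length):
    """True if target_t is a future period at most one full cycle past last_t."""
    return 0 < target_t - last_t <= cycle_length


# Last available data: December 2014 (t = 12); one cycle is four quarters.
print(within_one_cycle(12, 15, 4))  # September 2015 -> True
print(within_one_cycle(12, 17, 4))  # March 2016 -> False
```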

#### Worked Examples

##### Question 1:

The following data shows the quarterly sales of air conditioners at a leading retailer from 2012 to 2014.

| Time Period | Number Of Air Conditioners Sold | Proportion Of Yearly Mean | Deseasonalised Data |
|---|---|---|---|
| 1 (March 2012) | $1042$ | $0.8529$ | |
| 2 (June 2012) | $486$ | $0.3978$ | |
| 3 (Sept 2012) | $613$ | $0.5017$ | |
| 4 (Dec 2012) | $2746$ | $2.2476$ | |
| 5 (March 2013) | $1160$ | $0.8183$ | |
| 6 (June 2013) | $609$ | $0.4296$ | |
| 7 (Sept 2013) | $1139$ | $0.8035$ | |
| 8 (Dec 2013) | $2762$ | $1.9485$ | |
| 9 (March 2014) | $1795$ | $0.9638$ | |
| 10 (June 2014) | $1181$ | $0.6341$ | |
| 11 (Sept 2014) | $1094$ | $0.5874$ | |
| 12 (Dec 2014) | $3380$ | $1.8148$ | |

1. Calculate the seasonal component for the quarters ending in March, June, September and December, rounding to four decimal places if necessary.

| March | June | September | December |
|---|---|---|---|
| | | | |
2. Deseasonalise the data in the table and fill in the last column.

Round to the nearest whole number of air conditioners sold.

| Time Period | Number Of Air Conditioners Sold | Proportion Of Yearly Mean | Deseasonalised Data |
|---|---|---|---|
| 1 (March 2012) | $1042$ | $0.8529$ | |
| 2 (June 2012) | $486$ | $0.3978$ | |
| 3 (Sept 2012) | $613$ | $0.5017$ | |
| 4 (Dec 2012) | $2746$ | $2.2476$ | |
| 5 (March 2013) | $1160$ | $0.8183$ | |
| 6 (June 2013) | $609$ | $0.4296$ | |
| 7 (Sept 2013) | $1139$ | $0.8035$ | |
| 8 (Dec 2013) | $2762$ | $1.9485$ | |
| 9 (March 2014) | $1795$ | $0.9638$ | |
| 10 (June 2014) | $1181$ | $0.6341$ | |
| 11 (Sept 2014) | $1094$ | $0.5874$ | |
| 12 (Dec 2014) | $3380$ | $1.8148$ | |

3. Use your calculator to calculate the least squares regression line that fits the deseasonalised data, rounding values to a single decimal place if necessary.

Give the equation of the line in the form $y=at+b$.

4. Predict the number of air conditioners sold in the quarter ending December 2015.

Round off to the nearest whole air conditioner sold.

5. Comment on the reliability of your prediction.

A. Reliable due to the prediction being made within one cycle of the available data.

B. Unreliable due to the prediction being made beyond one cycle of the available data.

##### Question 2:

A new pop-up ice-cream shop records its sales over its first month. The data is tabulated below.

Note that the shop is only open over the weekend.

| Day | Sales (dollars) | Deseasonalised Data |
|---|---|---|
| Fri Wk 1 | $2036$ | $2101.14$ |
| Sat Wk 1 | $2257$ | $2040.87$ |
| Sun Wk 1 | $1936$ | $2092.75$ |
| Fri Wk 2 | $2224$ | $X$ |
| Sat Wk 2 | $2547$ | $2303.10$ |
| Sun Wk 2 | $2060$ | $2226.79$ |
| Fri Wk 3 | $2349$ | $2424.15$ |
| Sat Wk 3 | $2706$ | $2446.88$ |
| Sun Wk 3 | $Y$ | $2431.09$ |
| Fri Wk 4 | $2435$ | $2512.90$ |
| Sat Wk 4 | $2824$ | $2553.58$ |
| Sun Wk 4 | $2398$ | $2592.15$ |

Seasonal Components:

| Fri | Sat | Sun |
|---|---|---|
| $0.9690$ | $1.1059$ | $0.9251$ |
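As a reminder of how the two data columns relate, the sketch below checks two rows that are already filled in: dividing the raw sales by the day's seasonal component gives the deseasonalised value, and multiplying reverses the process. The function names are hypothetical.

```python
# Seasonal components given for this question.
SEASONAL_INDEX = {"Fri": 0.9690, "Sat": 1.1059, "Sun": 0.9251}


def deseasonalise(raw_value, day):
    """Remove seasonality: divide the raw value by the day's seasonal component."""
    return raw_value / SEASONAL_INDEX[day]


def reseasonalise(value, day):
    """Restore seasonality: multiply by the day's seasonal component."""
    return value * SEASONAL_INDEX[day]


print(round(deseasonalise(2036, "Fri"), 2))  # Fri Wk 1 -> 2101.14
print(round(deseasonalise(2257, "Sat"), 2))  # Sat Wk 1 -> 2040.87
```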

1. On which day will the shop be most likely to need extra help?

A. Saturday

B. Sunday

C. Friday

2. Calculate the value of $X$ in the table.

Round the value off to two decimal places if necessary.

3. Calculate the value of $Y$ in the table.

Round the value off to a single decimal place if necessary.

4. Using your calculator, determine the equation of the least squares regression line for the deseasonalised data, where $t=1$ is Friday of Week 1.

Give the equation in the form $y=at+b$ and round off any figures to two decimal places.

You can make use of $a$ and $b$ in your working.

5. Predict the sales for Friday of the sixth week.

Give your answer in dollars and round off any figures to two decimal places if needed.

6. Comment on the reliability of your prediction.

A. Reliable due to the prediction being made within one cycle of the available data.

B. Unreliable due to the prediction being made beyond one cycle of the available data.

### Outcomes

#### S8-1

Carry out investigations of phenomena, using the statistical enquiry cycle:

- A: conducting experiments using experimental design principles, conducting surveys, and using existing data sets
- B: finding, using, and assessing appropriate models (including linear regression for bivariate data and additive models for time-series data), seeking explanations, and making predictions
- C: using informed contextual knowledge, exploratory data analysis, and statistical inference
- D: communicating findings and evaluating all stages of the cycle.

#### 91581

Investigate bivariate measurement data