The purpose of smoothing time series data using a moving average or deseasonalising the data is to take out the 'peaks' and 'troughs' and view the underlying trend. Often, but not always, the smoothed data will appear to be linear in nature. If it is linear in nature we can calculate the least-squares regression line for the smoothed data and use this line to make future predictions. Making future predictions with time series data is also called forecasting.
Once we have a predicted value from the underlying trend line, we will need to factor back in the season it's from. If it's from a 'peak' season then our prediction should be adjusted upwards. If it's from a 'trough' season then it should be adjusted downwards. We multiply the predicted score from the regression line by the appropriate seasonal index to adjust the predicted value.
Step | Action |
---|---|
1. | Smooth the data using the appropriate moving average or by deseasonalising the data. |
2. | Give each time period a number, e.g. Mon Week 1 becomes $t=1$, Tues Week 1 becomes $t=2$, etc. |
3. | Calculate the equation of the least-squares regression line using the time period in list 1 and the smoothed data in list 2 of your calculator. Round numbers to four decimal places unless instructed otherwise. |
4. | Substitute a future time value into the equation of the least-squares regression line to obtain a predicted score. |
5. | Multiply the predicted score by the appropriate seasonal index (as a decimal) to factor back in the seasonality of the data. |
6. | If asked to comment on the reliability of the prediction, consider whether the future time value is within one cycle of the existing data. If it is, we say it is 'reliable'. |
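For reference, steps 3 to 5 can be written out symbolically. This is only a sketch using the standard least-squares formulas, which is what a calculator's linear regression function is assumed to compute. The symbols $\bar{t}$ and $\bar{y}$ (the means of the time values and the smoothed values), $t^*$ (the future time value) and $SI$ (the seasonal index for that season) are notation introduced here, not taken from the steps above.

$$a=\frac{\sum (t_i-\bar{t})(y_i-\bar{y})}{\sum (t_i-\bar{t})^2},\qquad b=\bar{y}-a\,\bar{t}$$

$$\text{seasonally adjusted prediction}=(a\,t^*+b)\times SI$$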
Note:
An aluminium window company records quarterly figures on the number of windows manufactured. For planning purposes the company management wishes to predict the number of windows that will be manufactured in March 2020.
Month/Yr | Time ($t$) | Raw data | Proportion of yearly mean | Deseasonalised data |
---|---|---|---|---|
Mar 2017 | $1$ | $471$ | $0.9598$ | $505.81$ |
Jun 2017 | $2$ | $480$ | $0.9781$ | $485.01$ |
Sep 2017 | $3$ | $492$ | $1.0025$ | $485.43$ |
Dec 2017 | $4$ | $520$ | $1.0596$ | $487.97$ |
Mar 2018 | $5$ | $427$ | $0.9143$ | $458.56$ |
Jun 2018 | $6$ | $463$ | $0.9914$ | $467.84$ |
Sep 2018 | $7$ | $484$ | $1.0364$ | $477.54$ |
Dec 2018 | $8$ | $494$ | $1.0578$ | $463.57$ |
Mar 2019 | $9$ | $425$ | $0.9190$ | $456.41$ |
Jun 2019 | $10$ | $462$ | $0.9995$ | $466.83$ |
Sep 2019 | $11$ | $463$ | $1.0016$ | $456.82$ |
Dec 2019 | $12$ | $499$ | $1.0795$ | $468.26$ |
Think: Look at the table carefully. We can see that steps 1 and 2 have been completed already.
Step 1: the data has been smoothed by deseasonalising it, so the deseasonalised data can be fitted to a linear regression model. Step 2: each quarter has been given a time value, from $t=1$ to $t=12$.
Do: Step 3: calculate the least-squares regression line. Using a calculator, we enter the time values in list 1 and the deseasonalised data in list 2, then fit a linear regression model. To complete the remaining steps, we use the linear regression model to make a prediction for March 2020, and then adjust this value to include seasonality.
Do: Write down $y=-3.2519t+494.4746$ as the equation of the least-squares regression line. To predict for March 2020 we first need to find its $t$ value. March 2019 was $t=9$, so one cycle (four time periods) later, March 2020 will be $t=13$. We now substitute that value into our regression line to make our deseasonalised prediction.
Working | Explanation |
---|---|
$y=-3.2519t+494.4746$ | Writing down the least-squares equation |
$y=-3.2519\times13+494.4746$ | Substituting $t=13$ |
$y=452.1999$ | Simplifying |
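The calculator step can also be checked by hand. The following is a minimal sketch in plain Python, assuming the deseasonalised values from the table above and the standard least-squares formulas; the function name `least_squares_line` is made up for this illustration. Because the table values are rounded to two decimal places, the fitted coefficients may differ from the calculator's equation in the last decimal place.

```python
# Deseasonalised window data from the table above, for t = 1 to t = 12.
deseasonalised = [505.81, 485.01, 485.43, 487.97, 458.56, 467.84,
                  477.54, 463.57, 456.41, 466.83, 456.82, 468.26]
times = list(range(1, len(deseasonalised) + 1))


def least_squares_line(t_values, y_values):
    """Return (slope, intercept) of the least-squares line y = slope*t + intercept."""
    n = len(t_values)
    t_mean = sum(t_values) / n
    y_mean = sum(y_values) / n
    # Sum of cross-deviations and sum of squared deviations in t.
    s_ty = sum((t - t_mean) * (y - y_mean) for t, y in zip(t_values, y_values))
    s_tt = sum((t - t_mean) ** 2 for t in t_values)
    slope = s_ty / s_tt
    intercept = y_mean - slope * t_mean
    return slope, intercept


slope, intercept = least_squares_line(times, deseasonalised)

# March 2020 is one cycle (four quarters) after March 2019 (t = 9).
t_march_2020 = 9 + 4
deseasonalised_prediction = slope * t_march_2020 + intercept

print(round(slope, 4), round(intercept, 4), round(deseasonalised_prediction, 2))
```

The printed slope and intercept should agree closely with $y=-3.2519t+494.4746$, and the prediction with the value of about $452.2$ found above.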
Now we want to adjust for seasonality. We've just found the deseasonalised value for March 2020 which is the predicted value taken from the deseasonalised values. What we're interested in is the more realistic value that includes the seasonal trend in our actual data. So we must multiply our predicted value by the appropriate seasonal index. In this case we want to multiply by the seasonal index for March.
The seasonal indices for this data are given in the table below.
March | June | September | December |
---|---|---|---|
$0.9310$ | $0.9897$ | $1.0135$ | $1.0656$ |
So the predicted value for March 2020 is $452.1999\times0.9310=420.9981$. In other words, the window company predicts that $421$ windows will be manufactured in March 2020.
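The seasonal adjustment can be sketched in Python as well. This example assumes that each seasonal index is the average of the "Proportion of yearly mean" values for that quarter, which reproduces the indices in the table; the function name `seasonal_indices` is hypothetical.

```python
# "Proportion of yearly mean" column from the window data, for t = 1 to t = 12.
proportions = [0.9598, 0.9781, 1.0025, 1.0596,
               0.9143, 0.9914, 1.0364, 1.0578,
               0.9190, 0.9995, 1.0016, 1.0795]
quarters = ["March", "June", "September", "December"]


def seasonal_indices(values, season_names):
    """Average the proportions belonging to each season (assumed method)."""
    cycle_length = len(season_names)
    indices = {}
    for position, name in enumerate(season_names):
        # Every cycle_length-th value, starting at this season's position.
        same_season = values[position::cycle_length]
        indices[name] = round(sum(same_season) / len(same_season), 4)
    return indices


indices = seasonal_indices(proportions, quarters)

# Deseasonalised prediction for March 2020, taken from the working above.
deseasonalised_prediction = 452.1999
forecast = deseasonalised_prediction * indices["March"]

print(indices)
print(round(forecast))
```

Multiplying by an index below $1$ (March) pulls the prediction down, while an index above $1$ (December) would push it up.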
Reflect: The original data contains peaks and troughs that reflect the seasonality. We use the deseasonalised values to fit a least-squares line, which is then used to make a deseasonalised prediction. To include seasonality, this prediction is multiplied by the appropriate seasonal index.
When using time series data our prediction will almost always be an extrapolation. As such, we know our predictions might be somewhat unreliable due to our inability to accurately predict future events. However, in general we say that as long as our prediction has been made within one cycle of the available data, we can consider our prediction reliable.
In the example above, one cycle was four quarters, and our last available piece of data was December 2019. So any prediction made for any of the quarters in 2020 would be considered reliable as they are within one cycle since December 2019. If we started predicting for 2021 and beyond, we'd run the risk of being too inaccurate.
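The "within one cycle" rule of thumb is easy to express as a check. This is a minimal sketch; the function name `is_within_one_cycle` and its parameters are hypothetical, and the rule itself is only the guideline described above, not a statistical test.

```python
def is_within_one_cycle(t_future, t_last_observed, cycle_length):
    """True if t_future lies at most one cycle beyond the last observed period."""
    periods_ahead = t_future - t_last_observed
    return 0 < periods_ahead <= cycle_length


# Window example: last data point is December 2019 (t = 12), cycle = 4 quarters.
print(is_within_one_cycle(13, 12, 4))  # March 2020
print(is_within_one_cycle(16, 12, 4))  # December 2020
print(is_within_one_cycle(17, 12, 4))  # March 2021
```

The first two calls correspond to quarters in 2020, which fall inside one cycle; the third is the first quarter of 2021, which does not.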
The following data shows the sales of air conditioners at a leading retailer over four quarters of three consecutive years.
Time period | Time ($t$) | Number of air conditioners sold | Proportion of yearly mean |
---|---|---|---|
March year $1$ | $1$ | $1042$ | $0.8529$ |
June year $1$ | $2$ | $486$ | $0.3978$ |
Sept year $1$ | $3$ | $613$ | $0.5017$ |
Dec year $1$ | $4$ | $2746$ | $2.2476$ |
March year $2$ | $5$ | $1160$ | $0.8183$ |
June year $2$ | $6$ | $609$ | $0.4296$ |
Sept year $2$ | $7$ | $1139$ | $0.8035$ |
Dec year $2$ | $8$ | $2762$ | $1.9485$ |
March year $3$ | $9$ | $1795$ | $0.9638$ |
June year $3$ | $10$ | $1181$ | $0.6341$ |
Sept year $3$ | $11$ | $1094$ | $0.5874$ |
Dec year $3$ | $12$ | $3380$ | $1.8148$ |
Calculate the seasonal component for the quarters ending in March, June, September and December, rounding to four decimal places if necessary.
March | June | September | December |
---|---|---|---|
$\editable{}$ | $\editable{}$ | $\editable{}$ | $\editable{}$ |
The data is smoothed using a $4$-point centred moving average (4CMA), as shown in the table below. Calculate the missing values.
Time period | Time ($t$) | Number of air conditioners sold | 4CMA |
---|---|---|---|
March year $1$ | $1$ | $1042$ | |
June year $1$ | $2$ | $486$ | |
Sept year $1$ | $3$ | $613$ | $1236.5$ |
Dec year $1$ | $4$ | $2746$ | $1266.625$ |
March year $2$ | $5$ | $1160$ | $1347.75$ |
June year $2$ | $6$ | $609$ | $\editable{}$ |
Sept year $2$ | $7$ | $1139$ | $1496.875$ |
Dec year $2$ | $8$ | $2762$ | $1647.75$ |
March year $3$ | $9$ | $1795$ | $1713.625$ |
June year $3$ | $10$ | $1181$ | $\editable{}$ |
Sept year $3$ | $11$ | $1094$ | |
Dec year $3$ | $12$ | $3380$ | |
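As a reminder of how a 4-point centred moving average is built, here is a minimal Python sketch. It assumes the usual weighting in which the two end values of each five-value window count for half; the function name `centred_moving_average_4` is made up for this illustration. Only the first six sales figures are used, so it reproduces the two 4CMA values already given at the top of the table.

```python
def centred_moving_average_4(values):
    """4-point centred moving average; None where no value can be computed."""
    smoothed = [None] * len(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]  # five consecutive values centred on i
        # Half weight on the two end values, full weight on the middle three.
        weighted_sum = 0.5 * window[0] + sum(window[1:4]) + 0.5 * window[4]
        smoothed[i] = weighted_sum / 4
    return smoothed


# Sales for t = 1 to t = 6 from the table above.
sales = [1042, 486, 613, 2746, 1160, 609]
cma = centred_moving_average_4(sales)
print(cma[2], cma[3])  # 4CMA for t = 3 and t = 4
```

The first two and last two positions have no 4CMA because a full window is not available there, which matches the blank cells in the table.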
Use your calculator to calculate the equation of the least squares regression line that fits the 4CMA data.
Give the equation of the line in the form $y=at+b$.
Round $a$ and $b$ to four decimal places.
Predict the number of air conditioners sold in the quarter ending December year $4$.
Round your answer to the nearest whole air conditioner sold.
Comment on the reliability of your prediction.
Reliable due to the prediction being made within one cycle of the available data.
Unreliable due to the prediction being made beyond one cycle of the available data.
A new pop up ice-cream shop records their sales over their first month. The data is tabulated below.
Note that the shop is only open over the weekend.
Day | Fri Wk 1 | Sat Wk 1 | Sun Wk 1 | Fri Wk 2 | Sat Wk 2 | Sun Wk 2 | Fri Wk 3 | Sat Wk 3 | Sun Wk 3 | Fri Wk 4 | Sat Wk 4 | Sun Wk 4 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Sales (dollars) | $2036$ | $2257$ | $1936$ | $2224$ | $2547$ | $2060$ | $2349$ | $2706$ | $Y$ | $2435$ | $2824$ | $2398$ |
Deseasonalised data | $2101.14$ | $2040.87$ | $2092.75$ | $X$ | $2303.10$ | $2226.79$ | $2424.15$ | $2446.88$ | $2431.09$ | $2512.90$ | $2553.58$ | $2592.15$ |
Seasonal components:
Fri | Sat | Sun |
---|---|---|
$0.9690$ | $1.1059$ | $0.9251$ |
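The link between the two rows of the sales table is the seasonal component: raw sales divided by the day's component gives the deseasonalised value, and multiplying reverses it. The sketch below illustrates this with the three Week 1 values that are already given; the function names `deseasonalise` and `reseasonalise` are hypothetical.

```python
# Seasonal components from the table above.
seasonal_component = {"Fri": 0.9690, "Sat": 1.1059, "Sun": 0.9251}


def deseasonalise(raw_value, day):
    """Remove the seasonal effect: raw value divided by the seasonal component."""
    return raw_value / seasonal_component[day]


def reseasonalise(deseasonalised_value, day):
    """Put the seasonal effect back: deseasonalised value times the component."""
    return deseasonalised_value * seasonal_component[day]


# Week 1 sales in dollars, taken from the table.
week_1 = [("Fri", 2036), ("Sat", 2257), ("Sun", 1936)]
week_1_deseasonalised = [round(deseasonalise(sales, day), 2) for day, sales in week_1]
print(week_1_deseasonalised)
```

The same two operations, applied in the appropriate direction, are what the questions about $X$ and $Y$ call for.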
On which day will the shop be most likely to need extra help?
Saturday
Sunday
Friday
Calculate the value of $X$ in the table.
Round the value to two decimal places if necessary.
Calculate the value of $Y$ in the table.
Round the value to one decimal place if necessary.
Using your calculator, determine the equation of the least-squares regression line for the deseasonalised data, where $t=1$ is Friday of Week 1.
Give the equation in the form $y=at+b$, rounding $a$ and $b$ to two decimal places.
You can make use of $a$ and $b$ in your working.
Predict the sales for Friday of the sixth week.
Give your answer in dollars and round off any figures to two decimal places if needed.
Comment on the reliability of your prediction
Reliable due to the prediction being made within one cycle of the available data
Unreliable due to the prediction being made beyond one cycle of the available data