Time series data is a type of bivariate data and is typically used to make predictions. However, its fluctuating nature makes this difficult: a least squares line fitted to the raw data would often show only a weak correlation. To help identify trends and make predictions, a process called smoothing can be used. Smoothing reduces the peaks and troughs in the data so that the underlying trend can be seen more easily.
One smoothing method is the moving mean. As the name suggests, the means of successive sets of data points are calculated and plotted in place of the original data, producing a smoothed effect.
The following example demonstrates how to calculate a three-point moving mean and a five-point moving mean. This method is also referred to as a moving average (MA).
Consider the time series data presented in the table:
\text{Time period} | \text{Raw data} | \text{3-moving average} | \text{5-moving average} |
---|---|---|---|
1 | 113.8 | | |
2 | 109 | 112.1 | |
3 | 113.4 | 107.4 | 97.7 |
4 | 99.7 | a | b |
5 | 52.6 | 86.4 | 93.8 |
6 | 107 | 85.3 | 91.6 |
7 | 96.3 | 101.8 | 90.7 |
8 | c | 98 | 89.5 |
9 | 95.6 | 81.4 | 88.2 |
10 | 46.5 | 80.9 | 86.7 |
11 | 100.5 | 78.6 | |
12 | d | | |
Calculate the value of a correct to one decimal place.
Calculate the value of b correct to one decimal place.
Solve for the value of c in the table.
Calculate the value of d correct to one decimal place.
Which moving average best smooths the data?
To find the 3-moving average for a particular time period, we find the mean of the data values for that time period, the previous time period, and the next time period.
To find the 5-moving average for a particular time period, we find the mean of the data values for that time period, the 2 previous time periods, and the next 2 time periods.
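The two rules above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the function name `moving_mean` and the choice to mark cells with too few neighbours as `None` are assumptions made for this example. It uses the raw data for time periods 1 to 5 from the table and reproduces entries already given there.

```python
def moving_mean(values, points):
    """Return the moving means for an odd number of points.

    Positions without enough neighbours on both sides are left as None,
    matching the blank cells at the start and end of the table.
    """
    if points % 2 == 0:
        raise ValueError("this sketch handles odd numbers of points only")
    half = points // 2
    result = [None] * len(values)
    for i in range(half, len(values) - half):
        # The value itself plus `half` values on either side.
        window = values[i - half : i + half + 1]
        result[i] = round(sum(window) / points, 1)
    return result


# Raw data for time periods 1 to 5 from the table above.
raw = [113.8, 109, 113.4, 99.7, 52.6]

three_point = moving_mean(raw[:4], 3)  # time periods 1 to 4 only
five_point = moving_mean(raw, 5)

print(three_point)  # [None, 112.1, 107.4, None]
print(five_point)   # [None, None, 97.7, None, None]
```

The values 112.1 and 107.4 match the 3-moving averages for time periods 2 and 3, and 97.7 matches the 5-moving average for time period 3.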
The number of points chosen affects how the data will be smoothed. It is not always the case that a greater number of points better smooths the data. Consider the graph below.
In the graph, the three-point moving mean gives a far smoother line than the five-point moving mean, which still shows some seasonality in its peaks and troughs.
The best way to determine the number of points to use for a moving mean is to count the number of points, or seasons, per cycle. Counting from one peak to the next, it can be seen that there are three points per cycle. Therefore, a three-point moving mean will most likely smooth the data best, because each mean then spans all three seasons in a cycle.
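The effect of matching the number of points to the number of seasons per cycle can be illustrated with a small experiment. The sketch below uses made-up data (not the data behind the graph): a steady rise of 3 per period plus a repeating three-season pattern. The helper names `moving_mean` and `wobble` are hypothetical, and `wobble` is just one simple way to measure how far a smoothed series is from a straight line.

```python
def moving_mean(values, points):
    """Moving means for an odd number of points (interior positions only)."""
    half = points // 2
    return [
        sum(values[i - half : i + half + 1]) / points
        for i in range(half, len(values) - half)
    ]


def wobble(smoothed):
    """Difference between the largest and smallest step in a smoothed series.

    A value of 0 means the smoothed points lie on a straight line.
    """
    steps = [b - a for a, b in zip(smoothed, smoothed[1:])]
    return max(steps) - min(steps)


# Made-up data: a steady rise of 3 per period plus a repeating
# three-season pattern (+12, -3, -9), so there are three points per cycle.
seasonal_pattern = [12, -3, -9]
raw = [100 + 3 * t + seasonal_pattern[t % 3] for t in range(12)]

three_point = moving_mean(raw, 3)
five_point = moving_mean(raw, 5)

print(wobble(three_point))  # 0.0, the seasonal pattern is removed completely
print(wobble(five_point))   # greater than 0, some seasonality remains
```

Because every three-point window contains each season exactly once, the seasonal effects cancel and only the trend is left. A five-point window contains some seasons twice, so part of the pattern survives.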
Consider the Time Series graph drawn below, along with two sets of moving averages.
Which moving average is most appropriate for this data?
Why is the 5 point moving average the most appropriate?
The moving average used should always match the number of seasons per cycle in the original data.
The above examples used an odd number of points when calculating the means. Calculating the moving means for an even number of points requires the use of a process called centring.
If there are five data points: a, b, c, d, and e:
then a \text{four-point moving mean}=\dfrac{\dfrac12a+b+c+d+\dfrac12e}{4}
Halving the first and last data points makes them count together as a single data point, so the five values carry the weight of only four, and the sum is divided by 4.
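One way to see where the half weights come from is the following short derivation, added here as an illustration. An ordinary four-point mean of a, b, c, and d falls between b and c, and the next one, of b, c, d, and e, falls between c and d. Averaging these two means gives a value centred on c:

```latex
\dfrac12\left(\dfrac{a+b+c+d}{4}+\dfrac{b+c+d+e}{4}\right)
=\dfrac{a+2b+2c+2d+e}{8}
=\dfrac{\dfrac12a+b+c+d+\dfrac12e}{4}
```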
Likewise, if there are seven data points: a, b, c, d, e, f, and g:
then a \text{six-point moving mean}=\dfrac{\dfrac12a+b+c+d+e+f+\dfrac12g}{6}
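The centring process can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function name `centred_moving_mean` is hypothetical, unrounded values are returned, and positions without enough neighbours are marked `None`. The example values are the raw data for time periods 1 to 5 of the table that follows.

```python
def centred_moving_mean(values, points):
    """Centred moving means for an even number of points.

    Each window spans points + 1 values. The first and last values in the
    window are given half weight, and the total is divided by `points`.
    """
    if points % 2 != 0:
        raise ValueError("this sketch handles even numbers of points only")
    half = points // 2
    result = [None] * len(values)
    for i in range(half, len(values) - half):
        window = values[i - half : i + half + 1]
        weighted_sum = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        result[i] = weighted_sum / points
    return result


# Raw data for time periods 1 to 5 from the table that follows.
raw = [89, 102.5, 93.5, 111, 81.7]

four_point = centred_moving_mean(raw, 4)
print(f"{four_point[2]:.4f}")  # 98.0875
```

The result for time period 3 is 98.0875, which the table shows as 98.09 to two decimal places.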
Consider the time series data presented in the table.
\text{Time period} | \text{Raw data} | \text{4-point centred moving average} | \text{6-point centred moving average} |
---|---|---|---|
1 | 89 | | |
2 | 102.5 | | |
3 | 93.5 | 98.09 | |
4 | 111 | a | b |
5 | 81.7 | 94 | 94.77 |
6 | 92.6 | 92.09 | 93.08 |
7 | 87.9 | 89.96 | 88.96 |
8 | c | 87.56 | 85.93 |
9 | 74.4 | 84.54 | 85.79 |
10 | 80.7 | 82.48 | |
11 | 75.6 | | |
12 | d | | |
Calculate the value of a in the table. Round your answer to two decimal places.
Calculate the value of b in the table. Round your answer to two decimal places.
Calculate the value of c in the table. Round your answer to one decimal place.
Calculate the value of d in the table. Round your answer to one decimal place.
Which centred moving average best smooths the data?