NZ Level 8 (NZC) Level 3 (NCEA) [In development]
Seasonal Indices and Deseasonalising Data

We now take a look at our final method for smoothing time series data. This is the most involved of the processes we've covered so far and will also be the focus of our next chapter.

Rather than calculating a moving average (whether a moving mean or a moving median), we will smooth, or deseasonalise, our data by calculating the seasonal index. We do this using the average percentage method.

The seasonal index is a measure of how a particular season compares with the average season.

There is a four-step process for deseasonalising our data:

1. Calculate the average for each cycle
2. Calculate the proportion or average percentage for each piece of original data
3. Calculate the seasonal index or seasonal component for each season
4. Deseasonalise the data
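
The four steps above can be sketched in Python as follows. This is a minimal illustration rather than a prescribed implementation: the function name `deseasonalise`, its parameters, and the quarterly figures are all made up for this example and do not come from the lesson's data set. It also assumes the data contain only complete cycles.

```python
from statistics import mean


def deseasonalise(values, seasons_per_cycle):
    """Average percentage method (hypothetical helper, complete cycles only)."""
    if len(values) % seasons_per_cycle != 0:
        raise ValueError("data must contain only complete cycles")

    # Split the series into cycles, e.g. one list of four quarters per year.
    cycles = [
        values[i:i + seasons_per_cycle]
        for i in range(0, len(values), seasons_per_cycle)
    ]

    # Step 1: the average for each cycle.
    cycle_means = [mean(cycle) for cycle in cycles]

    # Step 2: each value as a proportion of its own cycle's average.
    proportions = [
        [value / cycle_mean for value in cycle]
        for cycle, cycle_mean in zip(cycles, cycle_means)
    ]

    # Step 3: the seasonal index is the average proportion for each season.
    seasonal_indices = [
        mean([cycle[s] for cycle in proportions])
        for s in range(seasons_per_cycle)
    ]

    # Step 4: divide each raw value by the index for its season.
    deseasonalised = [
        value / seasonal_indices[i % seasons_per_cycle]
        for i, value in enumerate(values)
    ]
    return seasonal_indices, deseasonalised


# Made-up quarterly figures for two years (illustrative only, not lesson data).
example = [80, 100, 120, 100, 160, 200, 240, 200]
indices, smoothed = deseasonalise(example, seasons_per_cycle=4)
print([round(i, 4) for i in indices])
print([round(v, 2) for v in smoothed])
```

In this made-up series every year has exactly the same seasonal shape, so the deseasonalised values come out flat within each year (about 100 in the first year and about 200 in the second). Real data will not be this tidy.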

We will take each of these steps one at a time. But first let's consider why we're doing this.

## The benefits of the average percentage method

When calculating a moving mean or a moving median, we must use data points that are right next to each other. The problem with this method is that we often use the peak and the trough of the cycle in the same calculation, effectively ignoring the significant difference in the seasons. In a way, we're only roughly smoothing the data!

The method we'll investigate here avoids this problem, which makes it more robust for creating smoothed data that can then be used for predictions.

## Calculating the average for each cycle

We'll use a familiar set of data from an earlier chapter to illustrate each of the processes involved.

Our first step is to calculate the average, or mean, for each cycle. We can see below that there are three cycles, each with four quarters (seasons). We therefore need the mean for each of the 2012, 2013 and 2014 cycles.

The first and last are already done for us, so we only need to calculate the mean for the 2013 cycle.

2013 Mean = $\frac{427+463+484+494}{4}=467$
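
As a quick check of this arithmetic, here is a minimal Python sketch. The variable names are our own; the four quarterly values are the ones used in the calculation above.

```python
# Quarterly values for the 2013 cycle, as used in the calculation above.
cycle_2013 = [427, 463, 484, 494]

# The cycle mean is the total for the cycle divided by the number of seasons.
mean_2013 = sum(cycle_2013) / len(cycle_2013)
print(mean_2013)  # 467.0
```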

## Expressing the raw data as a proportion of the cycle average

To calculate X, Y and Z in the table, we need to calculate each of their corresponding raw data values as a proportion of the mean for the cycle (or year) that they belong to.

$X=\frac{480}{490.75}=0.9781$

$Y=\frac{427}{467}=0.9143$

$Z=\frac{499}{462.25}=1.0795$
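
The same division can be applied to every value in a cycle. Below is a short Python sketch for the 2013 cycle only. The variable names are our own; the values and the mean of 467 come from the working above.

```python
cycle_2013 = [427, 463, 484, 494]
mean_2013 = sum(cycle_2013) / len(cycle_2013)

# Each raw value expressed as a proportion of its cycle mean, to 4 d.p.
proportions_2013 = [round(value / mean_2013, 4) for value in cycle_2013]
print(proportions_2013)  # [0.9143, 0.9914, 1.0364, 1.0578]
```

The first entry matches the value of Y above. As a handy check, the unrounded proportions within a complete cycle always add up to the number of seasons (here 4).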

## Calculating the seasonal index

Looking at the proportions in the table, you'll now notice that March tends to be the lowest point in the cycle as it has the smallest proportion each year, while December tends to be the highest point in the cycle and above the yearly average as indicated by a proportion greater than 1.

We will now calculate the average proportion for each season which we'll use as our basis to deseasonalise our data. This is known as the seasonal index or seasonal component.

To calculate the seasonal index for March we calculate the average of the proportions for each March quarter. We can similarly calculate the seasonal index for September.

Seasonal Index for March = $\frac{0.9598+0.9143+0.9194}{3}=0.9312$

Seasonal Index for September = $\frac{1.0025+1.0364+1.0016}{3}=1.0135$
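
These two averages can be reproduced with a few lines of Python. This is only a sketch: `seasonal_index` is a helper name chosen for this example, and the proportions are the ones quoted in the calculations above.

```python
# Proportions for each March and September quarter, from the working above.
march_proportions = [0.9598, 0.9143, 0.9194]
september_proportions = [1.0025, 1.0364, 1.0016]


def seasonal_index(proportions):
    """Average the proportions for one season across all cycles."""
    return sum(proportions) / len(proportions)


print(round(seasonal_index(march_proportions), 4))      # 0.9312
print(round(seasonal_index(september_proportions), 4))  # 1.0135
```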

## Deseasonalising the raw data

Now it's time to use our seasonal indices to smooth or deseasonalise our data.

Deseasonalising Data

Deseasonalised Data = $\frac{\text{Raw Value}}{\text{Seasonal Index}}$

$X=\frac{492}{1.0135}=485.4465$

$Y=\frac{463}{0.9897}=467.8185$

$Z=\frac{425}{0.9312}=456.4003$
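
The division of each raw value by its seasonal index can be sketched in Python as below. The helper name `deseasonalise_value` and the dictionary layout are assumptions made for this example; the raw values and seasonal indices are those used in the three calculations above.

```python
def deseasonalise_value(raw_value, seasonal_index):
    """Deseasonalised value = raw value / seasonal index."""
    return raw_value / seasonal_index


# (raw value, seasonal index) pairs from the working above.
examples = {"X": (492, 1.0135), "Y": (463, 0.9897), "Z": (425, 0.9312)}

results = {
    label: round(deseasonalise_value(raw, index), 4)
    for label, (raw, index) in examples.items()
}
for label, value in results.items():
    print(label, value)
```

Rounded to four decimal places, the results agree with the values of X, Y and Z above.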

We'll take a look at what we can do with this deseasonalised data in the next chapter.

#### Worked Examples

##### Question 1:

The local police station records the number of speeding fines issued each quarter.

The table alongside has the data for each quarter from 2016 to 2018.

| Time Period | Data | Percentage of yearly mean |
|---|---|---|
| March 2016 | $105$ | $106.06\%$ |
| June 2016 | $91$ | $x$ |
| September 2016 | $101$ | $102.02\%$ |
| December 2016 | $99$ | $100\%$ |
| March 2017 | $101$ | $y$ |
| June 2017 | $83$ | $89.01\%$ |
| September 2017 | $96$ | $102.95\%$ |
| December 2017 | $93$ | $99.73\%$ |
| March 2018 | $99$ | $108.2\%$ |
| June 2018 | $82$ | $89.62\%$ |
| September 2018 | $94$ | $102.73\%$ |
| December 2018 | $91$ | $z$ |

1. For each of 2016, 2017 and 2018, calculate the mean number of speeding fines issued per quarter.

| Year | 2016 | 2017 | 2018 |
|---|---|---|---|
| Mean | | | |

2. Use your answers from part 1 to calculate the value of $x$.

3. Use your answers from part 1 to calculate the value of $y$.

4. Use your answers from part 1 to calculate the value of $z$.

##### Question 2:

The table shows the number of new gaming apps released each quarter, from the beginning of 2016 through to the end of 2018.

| Time Period | Data | Proportion of yearly mean |
|---|---|---|
| March 2016 | $43$ | $0.77$ |
| June 2016 | $45$ | $0.81$ |
| September 2016 | $54$ | $0.97$ |
| December 2016 | $81$ | $1.45$ |
| March 2017 | $51$ | $0.82$ |
| June 2017 | $50$ | $0.8$ |
| September 2017 | $60$ | $0.96$ |
| December 2017 | $89$ | $1.42$ |
| March 2018 | $57$ | $0.81$ |
| June 2018 | $52$ | $0.74$ |
| September 2018 | $69$ | $0.98$ |
| December 2018 | $103$ | $1.47$ |

1. Calculate the seasonal component for the first quarter. Give your answer to two decimal places.

2. Deseasonalise the data for March 2018. Give your answer to two decimal places.

3. Calculate the seasonal component for the third quarter. Give your answer to two decimal places.

4. Calculate the seasonal component for the fourth quarter. Give your answer to two decimal places.

##### Question 3:

Every four months Neil records the growth of his bean plant (starting with a new plant every year).

The data provided is from the beginning of 2016 to the end of 2019.

| Time Period | Growth (in cm) | Proportion of yearly mean |
|---|---|---|
| April 2016 | $95.6$ | $0.99$ |
| August 2016 | $106.7$ | $a$ |
| December 2016 | $87.8$ | $0.91$ |
| April 2017 | $c$ | $0.99$ |
| August 2017 | $101.2$ | $1.1$ |
| December 2017 | $84.1$ | $0.91$ |
| April 2018 | $86.3$ | $1.01$ |
| August 2018 | $93.6$ | $1.09$ |
| December 2018 | $77.3$ | $0.9$ |
| April 2019 | $76.1$ | $0.99$ |
| August 2019 | $83.4$ | $b$ |
| December 2019 | $71.8$ | $0.93$ |

1. Calculate the value of $a$ in the table. Give your answer to two decimal places.

2. Calculate the value of $b$ in the table. Give your answer to two decimal places.

3. If the mean for 2017 is $92.2$, calculate the value of $c$.

Write each line of working as an equation, and give your answer to two decimal places.

4. Calculate the seasonal component for April. Give your answer to two decimal places.

### Outcomes

#### S8-1

Carry out investigations of phenomena, using the statistical enquiry cycle:

A. conducting experiments using experimental design principles, conducting surveys, and using existing data sets

B. finding, using, and assessing appropriate models (including linear regression for bivariate data and additive models for time-series data), seeking explanations, and making predictions

C. using informed contextual knowledge, exploratory data analysis, and statistical inference

D. communicating findings and evaluating all stages of the cycle.

#### 91580

Investigate time series data