Lesson

We have now collected a lot of skills and tools which we can use to analyse time series data.

We can put these all together, along with the statistical investigation process, to analyse some real-world data. Below is a step-by-step example of an investigation into Perth petrol prices in 2016. The data is a little out of date now, but the process for investigating the problem remains the same.

Read through the example and then create and investigate your own problem related to more recent petrol prices in Perth. For example: can you predict the price of petrol at your local station in 2030?

Reminder - the following steps form a rigorous and complete statistical investigation process:

- Identify a problem
- Collect or obtain data
- Analyse data
- Interpret and communicate results

Let's pose the following problem: We wish to understand the fuel price cycle for Perth.

When is the cheapest time to buy petrol in Perth and what can we expect to pay for unleaded petrol on the cheapest day next week?

For our purposes, we are conducting this analysis at the end of February 2016, so "next week" runs from Monday 29th February to Sunday 6th March 2016.

To collect our data we can refer to online government resources which show the average price movements of unleaded petrol in Perth.

The ACCC website is updated regularly with petrol prices in all major cities in Australia and is a great place to start.

Perth also has a website called 'FuelWatch', which lets residents find the cheapest petrol stations in Perth on any given day. You can also download files of historical data.

To investigate our problem we need the February 2016 file which gives data on petrol stations from all around Perth and country areas.

We can add some filters and sort the data so we are looking at the price cycle at the Caltex petrol station in Balcatta.
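The filtering and sorting step can also be done in code rather than a spreadsheet. The following is a minimal sketch only: the field names (`date`, `brand`, `suburb`, `product`, `price`) and every value in `rows` are made up for illustration and are not taken from a real FuelWatch file, so they would need to be matched to the columns of the file you actually download.

```python
# Hypothetical rows: field names and prices are invented for illustration only.
rows = [
    {"date": "2016-02-02", "brand": "Caltex", "suburb": "BALCATTA", "product": "ULP", "price": 110.0},
    {"date": "2016-02-01", "brand": "Caltex", "suburb": "BALCATTA", "product": "ULP", "price": 100.0},
    {"date": "2016-02-01", "brand": "Caltex", "suburb": "BALCATTA", "product": "Diesel", "price": 105.0},
    {"date": "2016-02-01", "brand": "BP", "suburb": "OSBORNE PARK", "product": "ULP", "price": 101.0},
]


def station_series(rows, brand, suburb, product="ULP"):
    """Return (date, price) pairs for one station and fuel type, sorted by date."""
    selected = [
        (row["date"], row["price"])
        for row in rows
        if row["brand"] == brand
        and row["suburb"] == suburb
        and row["product"] == product
    ]
    # Dates written as YYYY-MM-DD sort correctly as plain strings.
    return sorted(selected)


series = station_series(rows, "Caltex", "BALCATTA")
for date, price in series:
    print(date, price)
```

Keeping the result as date and price pairs in date order means it can be graphed directly or passed on to the later steps of the analysis.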

Here are the unleaded petrol prices for each day in February, 2016 at this Caltex petrol station.

We can then use a spreadsheet to create a line graph of this data.

We can certainly see a $7$ day cycle! The slight downward trend in the graph also suggests that petrol prices are decreasing each week.

The first part of our question is easily answered: fuel appears to be cheapest on a Monday. Therefore we should aim to purchase our unleaded petrol on Monday 29th February.

To work out what it will cost on that day, we need to prepare our data for making a prediction.

Firstly we calculate the mean for each of the four weeks.

Then we express each daily price as a proportion of its weekly mean, and average these proportions for each day of the week to obtain the seasonal indices.

We can then deseasonalise the data.
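The three calculations above (weekly means, proportions of the weekly mean, and seasonal indices), followed by deseasonalising, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the spreadsheet used in the lesson: the function names are our own, the prices are made up (they are not the Balcatta data), and the data is assumed to start on a Monday and cover whole weeks.

```python
def seasonal_indices(prices, period=7):
    """Seasonal index for each position in the cycle (Monday first here)."""
    # Split the series into complete cycles (weeks).
    weeks = [prices[i:i + period] for i in range(0, len(prices), period)]
    # Step 1: the mean of each week.
    weekly_means = [sum(week) / period for week in weeks]
    # Step 2: each price as a proportion of its own week's mean.
    proportions = [
        [price / mean for price in week]
        for week, mean in zip(weeks, weekly_means)
    ]
    # Step 3: average the proportions for each day of the week.
    return [
        sum(week[day] for week in proportions) / len(weeks)
        for day in range(period)
    ]


def deseasonalise(prices, indices):
    """Divide each price by the seasonal index for its day of the week."""
    period = len(indices)
    return [price / indices[i % period] for i, price in enumerate(prices)]


# Made-up prices (cents/litre) for two weeks, Monday to Sunday.
week_1 = [90.0, 110.0, 105.0, 102.0, 100.0, 98.0, 95.0]
week_2 = [round(price * 0.98, 2) for price in week_1]  # same pattern, 2% lower
prices = week_1 + week_2

indices = seasonal_indices(prices)
smoothed = deseasonalise(prices, indices)
print([round(index, 4) for index in indices])
print([round(value, 2) for value in smoothed])
```

In this made-up example the deseasonalised values are approximately 100 throughout the first week and 98 throughout the second: the weekly cycle has been removed and only the downward movement between weeks remains. The seasonal indices always add up to the number of days in the cycle, which is a useful check on the calculation.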

We can now graph our raw data along with our deseasonalised data to see the smoothing effect.

We are now ready to fit our least squares regression line to the deseasonalised data, where $t=1$ is Monday $1$st February.

Doing so we obtain the following least squares regression line:

$y=-0.3278t+114.9397$
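For readers who would like to see how a least squares line is calculated without a spreadsheet or calculator, here is a minimal sketch using the standard formulas for the slope and intercept. The function name `least_squares_line` and the sample values are our own; the sample values are made up so that they lie exactly on a known line, and they are not the deseasonalised Balcatta data.

```python
def least_squares_line(t_values, y_values):
    """Return (slope, intercept) of the least squares line y = slope * t + intercept."""
    n = len(t_values)
    t_mean = sum(t_values) / n
    y_mean = sum(y_values) / n
    # Sum of products of deviations, and sum of squared deviations in t.
    s_ty = sum((t - t_mean) * (y - y_mean) for t, y in zip(t_values, y_values))
    s_tt = sum((t - t_mean) ** 2 for t in t_values)
    slope = s_ty / s_tt
    intercept = y_mean - slope * t_mean
    return slope, intercept


# Made-up deseasonalised prices lying exactly on y = -0.5t + 110.
t_values = [1, 2, 3, 4, 5]
y_values = [109.5, 109.0, 108.5, 108.0, 107.5]

slope, intercept = least_squares_line(t_values, y_values)
print(slope, intercept)
```

With real data the points will not sit exactly on a line, and the function returns the line that minimises the sum of the squared vertical distances from the points.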

We're now ready to work out what the petrol will cost us on Monday $29$th February 2016.

Note that $t=29$ in this case.

$y=105.4324$

We now need to add the seasonality back into our prediction and we do so by multiplying by the seasonal index for Monday.

$105.4324\times0.9195=96.95$ cents/litre

On Monday 29th February 2016, we can expect to pay $96.95$ cents/litre for ULP at Caltex in Balcatta.
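The two-step prediction (evaluate the regression line, then multiply by the seasonal index) can be checked with a short calculation. This sketch uses the rounded coefficients and the Monday seasonal index quoted in this lesson; the names `SLOPE`, `INTERCEPT`, `MONDAY_INDEX` and `predict_price` are our own. With the rounded coefficients the trend value comes out at about $105.43$, which agrees with the value above to two decimal places; any difference in later decimal places is presumably due to rounding of the coefficients.

```python
# Rounded values quoted in the lesson.
SLOPE = -0.3278        # change in deseasonalised price per day (cents/litre)
INTERCEPT = 114.9397   # value of the regression line at t = 0
MONDAY_INDEX = 0.9195  # seasonal index for Monday


def predict_price(t, seasonal_index):
    """Predict the actual price on day t by re-seasonalising the trend value."""
    trend_value = SLOPE * t + INTERCEPT  # deseasonalised prediction
    return trend_value * seasonal_index  # put the seasonality back in


prediction = predict_price(29, MONDAY_INDEX)
print(round(prediction, 2))  # 96.95
```

The same function can be reused for any other day by changing $t$ and using the seasonal index for that day of the week.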

We can be reasonably confident in the reliability of our prediction, since it is made within one cycle of the last collected data.

Side note: In fact, on this day, the price was $93.9$ cents/litre at the Balcatta Caltex, so even lower than anticipated by the historical records.

- If we predicted the price for the Balcatta Caltex for Wednesday $9$th March, 2016 what comment on reliability would we make?
- What underlying assumptions are we making when we use this model to predict future values?
- What could be the reason(s) for the spike in petrol prices on Tuesdays?
- Suggest some real life events that could make predictions from historical records such as these unreliable.

implement the statistical investigation process to answer questions that involve the analysis of time series data