Path: blob/master/april_18/lessons/lesson-16/code/starter-code/starter-code-16.ipynb
Kernel: Python 2
In [2]:
Walmart Sales Data
For the independent practice, we will analyze weekly sales data from Walmart over a two-year period from 2010 to 2012.
The data is again separated by store and by department, but we will focus on analyzing one store for simplicity.
The data includes:
Store - the store number
Dept - the department number
Date - the week
Weekly_Sales - sales for the given department in the given store
IsHoliday - whether the week is a special holiday week
Loading the data and setting the DateTimeIndex
In [3]:
Out[3]:
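The contents of the loading cell are not shown in this export. A minimal sketch of the loading step, assuming the data sits in a CSV named walmart_sales.csv (a hypothetical filename) with the columns listed above:

    import pandas as pd

    # Hypothetical filename/path; substitute the CSV used in the lesson.
    df = pd.read_csv('walmart_sales.csv')

    # Parse the Date column and use it as the DateTimeIndex.
    df['Date'] = pd.to_datetime(df['Date'])
    df = df.set_index('Date').sort_index()

    df.head()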
Filter the dataframe to Store 1 sales and aggregate over departments to compute the total sales per store.
In [ ]:
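One possible way to fill in this cell, assuming the DataFrame from the loading step is named df:

    # Restrict to Store 1, then sum Weekly_Sales across departments for each week.
    store1 = df[df['Store'] == 1]
    store1_sales = store1.groupby(store1.index)['Weekly_Sales'].sum()
    store1_sales.head()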
Plot the rolling_mean for Weekly_Sales. What general trends do you observe?
In [ ]:
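A sketch using store1_sales from the previous step. pd.rolling_mean is the older pandas API this Python 2 notebook targets; current pandas uses .rolling().mean(). The 4-week window is an arbitrary choice for illustration:

    import matplotlib.pyplot as plt

    # Older pandas (this notebook's era): pd.rolling_mean(store1_sales, window=4).plot()
    # Current pandas equivalent:
    store1_sales.rolling(window=4).mean().plot(title='4-week rolling mean of Weekly_Sales')
    plt.show()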
Compute the 1-, 2-, and 52-week autocorrelations for Weekly_Sales and/or create an autocorrelation plot.
In [ ]:
In [ ]:
In [ ]:
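One way to approach this, again assuming store1_sales. Series.autocorr gives single-lag autocorrelations, and pandas ships an autocorrelation_plot helper (pandas.plotting in current versions, pandas.tools.plotting in the Python 2 era):

    from pandas.plotting import autocorrelation_plot

    # Lag-k autocorrelations; lag 52 checks for yearly seasonality in weekly data.
    for lag in (1, 2, 52):
        print(lag, store1_sales.autocorr(lag=lag))

    autocorrelation_plot(store1_sales)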
Split the weekly sales data into a training and a test set, using 75% of the data for training.
In [ ]:
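A minimal sketch that keeps the chronological order of the series rather than shuffling it:

    # First 75% of the weeks for training, the remainder for testing.
    n_train = int(len(store1_sales) * 0.75)
    train = store1_sales.iloc[:n_train]
    test = store1_sales.iloc[n_train:]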
Create an AR(1) model on the training data and compute the mean absolute error of the predictions.
In [ ]:
In [ ]:
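A sketch using the current statsmodels ARIMA class (the original lesson likely used the older statsmodels.tsa.arima_model API, since removed). An AR(1) model is ARIMA with order (1, 0, 0):

    from statsmodels.tsa.arima.model import ARIMA
    from sklearn.metrics import mean_absolute_error

    # Fit AR(1) on the training window.
    ar1 = ARIMA(train, order=(1, 0, 0)).fit()

    # Forecast the length of the test set and score it.
    preds = ar1.forecast(steps=len(test))
    print(mean_absolute_error(test, preds))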
Plot the residuals. Where are there significant errors?
In [ ]:
In [ ]:
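A sketch of the residual plot, assuming the fitted AR(1) results object from above is named ar1:

    import matplotlib.pyplot as plt

    # In-sample residuals of the AR(1) fit; large spikes often line up with holiday weeks.
    ar1.resid.plot(title='AR(1) residuals')
    plt.show()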
Compute an AR(2) model and an ARMA(2, 2) model. Does this improve your mean absolute error on the held-out set?
In [ ]:
In [ ]:
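A sketch that reuses the same train/test split; in ARIMA terms, AR(2) is order (2, 0, 0) and ARMA(2, 2) is order (2, 0, 2):

    from statsmodels.tsa.arima.model import ARIMA
    from sklearn.metrics import mean_absolute_error

    for order in [(2, 0, 0), (2, 0, 2)]:
        fit = ARIMA(train, order=order).fit()
        preds = fit.forecast(steps=len(test))
        print(order, mean_absolute_error(test, preds))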
Finally, compute an ARIMA model to improve your prediction error. Iterate on the p, d, and q parameters, comparing the models' performance.
In [ ]:
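One way to iterate is a small brute-force grid over (p, d, q), scored by held-out mean absolute error. The ranges below are arbitrary, and some orders may fail to converge:

    import itertools
    from statsmodels.tsa.arima.model import ARIMA
    from sklearn.metrics import mean_absolute_error

    scores = {}
    for p, d, q in itertools.product(range(3), range(2), range(3)):
        try:
            fit = ARIMA(train, order=(p, d, q)).fit()
            preds = fit.forecast(steps=len(test))
            scores[(p, d, q)] = mean_absolute_error(test, preds)
        except Exception:
            continue  # skip orders that fail to fit

    # Best five orders by mean absolute error.
    print(sorted(scores.items(), key=lambda kv: kv[1])[:5])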