ARIMA (Auto Regressive Integrated Moving Average)
The ARIMA model, which stands for AutoRegressive Integrated Moving Average, is a popular statistical method for time series forecasting. It combines three components:
AutoRegressive (AR) part
Integrated (I) part
Moving Average (MA) part
Below is a mathematical explanation of each component and how they come together in the ARIMA model.
1. AutoRegressive (AR) Part
The AR part of the model specifies that the output variable depends linearly on its own previous values. Mathematically, an AR model of order $p$ (AR($p$)) is written as:

$$y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \epsilon_t$$

where:
$y_t$ is the value at time $t$.
$\phi_1, \phi_2, \ldots, \phi_p$ are parameters of the model.
$\epsilon_t$ is the white noise error term at time $t$.
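For intuition, here is a minimal NumPy sketch of the AR recursion above; the coefficients, seed, and series length are illustrative choices, not values from this notebook:

```python
import numpy as np

# Simulate an AR(2) process: y_t = phi_1*y_{t-1} + phi_2*y_{t-2} + eps_t
# (phi_1, phi_2, n, and the seed are arbitrary illustrative values)
rng = np.random.default_rng(42)
phi = [0.6, -0.2]           # phi_1, phi_2
n = 200
eps = rng.normal(size=n)    # white noise error terms eps_t
y = np.zeros(n)
for t in range(2, n):
    y[t] = phi[0] * y[t - 1] + phi[1] * y[t - 2] + eps[t]
```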
2. Integrated (I) Part
The integrated part of the model involves differencing the data to make it stationary, which means that the statistical properties (mean, variance) do not change over time. The order of differencing is denoted by ( d ). If ( y_t ) is the original time series, the differenced series ( y_t' ) of order ( d ) is given by: [ y_t' = (1 - B)^d y_t ] where ( B ) is the backshift operator defined as ( By_t = y_{t-1} ).
For example, if ( d = 1 ), the differenced series is: [ y_t' = y_t - y_{t-1} ]
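As a quick sketch, differencing can be done with pandas; the random-walk series `y` below is purely illustrative:

```python
import numpy as np
import pandas as pd

# An illustrative non-stationary series (random walk)
rng = np.random.default_rng(0)
y = pd.Series(rng.normal(size=100).cumsum())

y_diff1 = y.diff().dropna()         # d = 1: y_t' = y_t - y_{t-1}
y_diff2 = y.diff().diff().dropna()  # d = 2: difference the already-differenced series
```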
3. Moving Average (MA) Part
The MA part of the model specifies that the output variable depends linearly on the current and past values of a stochastic (white noise) term. Mathematically, an MA model of order $q$ (MA($q$)) is written as:

$$y_t = \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \cdots + \theta_q \epsilon_{t-q}$$

where:
$y_t$ is the value at time $t$.
$\theta_1, \theta_2, \ldots, \theta_q$ are parameters of the model.
$\epsilon_t, \epsilon_{t-1}, \ldots, \epsilon_{t-q}$ are white noise error terms.
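A matching NumPy sketch of the MA equation, again with purely illustrative values:

```python
import numpy as np

# Simulate an MA(1) process: y_t = eps_t + theta_1*eps_{t-1}
# (theta_1, n, and the seed are arbitrary illustrative values)
rng = np.random.default_rng(7)
theta1 = 0.5
n = 200
eps = rng.normal(size=n)     # white noise error terms
y = eps.copy()
y[1:] += theta1 * eps[:-1]   # add the lagged noise contribution
```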
ARIMA Model
Combining the AR, I, and MA components, the ARIMA model of order $(p, d, q)$ is represented as:

$$(1 - B)^d y_t = \phi_1 (1 - B)^d y_{t-1} + \phi_2 (1 - B)^d y_{t-2} + \cdots + \phi_p (1 - B)^d y_{t-p} + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \cdots + \theta_q \epsilon_{t-q}$$
Simplified with backshift (lag) polynomials, it becomes:

$$\left(1 - \sum_{i=1}^{p} \phi_i B^i\right)(1 - B)^d y_t = \left(1 + \sum_{j=1}^{q} \theta_j B^j\right)\epsilon_t$$
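One way to see this structure is to simulate it: generate an ARMA($p$, $q$) sample and integrate it $d$ times. The sketch below, assuming statsmodels is available, uses illustrative coefficients for an ARIMA(1,1,1)-style series:

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess

# Build an ARIMA(1,1,1)-style series: simulate ARMA(1,1), then integrate once
np.random.seed(1)
ar = np.array([1, -0.6])    # AR polynomial (1 - phi_1 B), phi_1 = 0.6 (illustrative)
ma = np.array([1, 0.4])     # MA polynomial (1 + theta_1 B), theta_1 = 0.4 (illustrative)
arma_sample = ArmaProcess(ar, ma).generate_sample(nsample=300)
y = np.cumsum(arma_sample)  # cumulative sum undoes one difference (d = 1)
```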
Steps to Fit an ARIMA Model
Identification: Determine the values of $p$, $d$, and $q$ using techniques like the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) plots.
Estimation: Estimate the parameters $\phi$ and $\theta$ using methods like Maximum Likelihood Estimation (MLE).
Diagnostic Checking: Check the residuals to ensure they resemble white noise using tools like the Ljung-Box test.
Forecasting: Use the fitted ARIMA model to forecast future values.
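The four steps above can be sketched with statsmodels; the series `y`, the chosen order (1, 1, 1), and the number of lags are illustrative assumptions rather than part of this notebook:

```python
import numpy as np
import pandas as pd
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

# Illustrative data: a simulated random walk standing in for a real time series
rng = np.random.default_rng(0)
y = pd.Series(rng.normal(size=300).cumsum())

# 1. Identification: inspect ACF/PACF of the differenced series to choose p and q
plot_acf(y.diff().dropna(), lags=20)
plot_pacf(y.diff().dropna(), lags=20)

# 2. Estimation: fit an ARIMA(p, d, q) model (order chosen for illustration)
res = ARIMA(y, order=(1, 1, 1)).fit()
print(res.summary())

# 3. Diagnostic checking: residuals should resemble white noise (large Ljung-Box p-values)
print(acorr_ljungbox(res.resid, lags=[10]))

# 4. Forecasting: predict the next 10 values with confidence intervals
forecast = res.get_forecast(steps=10)
print(forecast.predicted_mean)
print(forecast.conf_int())
```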
Example
Suppose we have a time series $y_t$ and we fit an ARIMA(1,1,1) model. This means:
$p = 1$: The series has an autoregressive part of order 1.
$d = 1$: The series has been differenced once to achieve stationarity.
$q = 1$: The series has a moving average part of order 1.
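Written out explicitly using the backshift operator $B$, the ARIMA(1,1,1) model is

$$(1 - \phi_1 B)(1 - B) y_t = (1 + \theta_1 B)\epsilon_t,$$

which expands to

$$y_t = (1 + \phi_1) y_{t-1} - \phi_1 y_{t-2} + \epsilon_t + \theta_1 \epsilon_{t-1}.$$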