AR(1) - Auto-correlation tracking
In this notebook, I would like to show the power of the Kalman filter for tracking hidden variables.
In the notebook 5.1 I introduced the Kalman algorithm and applied it to real financial data. However, it was not possible to verify the goodness of the algorithm (by computing the MSE for instance). In the present notebook, I prefer to work with a simulated time series, where the hidden state process (the autocorrelation coefficient $\rho$) is known!
Contents
The AR(1) is the Auto-Regressive process of order 1 wiki:
$$ x_t - \mu = \rho\,(x_{t-1} - \mu) + \epsilon_t $$
where $\mu$ is the mean of $x_t$, and $\epsilon_t \sim \mathcal{N}(0, \sigma_\epsilon^2)$. We also assume uncorrelated errors, $\mathbb{E}[\epsilon_t \epsilon_s] = 0$ for $t \neq s$.
The process can be written in the form of a linear regression, where the value of $x_t$ is modelled as a linear function of $x_{t-1}$:
$$ x_t = \alpha + \beta\, x_{t-1} + \epsilon_t $$
with $\alpha = \mu\,(1-\rho)$ and $\beta = \rho$. We also have to require that $x_{t-1}$ and $\epsilon_t$ are uncorrelated, and impose the condition $|\rho| < 1$ in order to have a stationary process.
The ACF (Auto-Correlation Function) wiki is
$$ \rho(k) = \frac{\mathrm{Cov}(x_t,\, x_{t+k})}{\mathrm{Var}(x_t)} $$
The PACF (Partial Auto-Correlation Function) wiki is the correlation between $x_t$ and $x_{t+k}$ after removing the linear effect of the intermediate values $x_{t+1}, \dots, x_{t+k-1}$.
An important thing to notice is the form of the autocorrelation coefficient with lag $k$: for an AR(1) process
$$ \rho(k) = \rho^{\,k}, \qquad k \geq 0 $$
so the ACF decays exponentially, while the PACF equals $\rho$ at lag 1 and is zero for all higher lags.
For simplicity I will fix the values of $\mu$ and $\sigma_\epsilon$. Let us simulate the AR(1) process:
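A minimal simulation sketch (the specific values of $\rho$, $\mu$, $\sigma_\epsilon$ and the sample size below are illustrative choices, not necessarily those of the original cells):

```python
import numpy as np

np.random.seed(seed=42)

N = 2000          # number of observations (illustrative)
rho = 0.9         # true autocorrelation coefficient (illustrative)
mu = 0.0          # mean of the process
sig_eps = 1.0     # standard deviation of the error term

eps = np.random.normal(loc=0, scale=sig_eps, size=N)
X = np.zeros(N)
X[0] = mu + eps[0]
for t in range(1, N):
    X[t] = mu + rho * (X[t-1] - mu) + eps[t]
```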
We can estimate the parameters in the same way we do it for linear regressions. The estimators considered below are asymptotically unbiased.
Plots
Let us compute and plot the ACF and PACF as functions of the lag $k$.
The PACF can be computed with several methods (type `pacf??` in a new cell for more information). The parameter `alpha` is used for the confidence interval computation.
Plot made by me
The function `acf` uses Bartlett's approximation for the computation of the standard error (necessary for the computation of confidence intervals). The formulas for the standard errors used in `acf` and `pacf` are:
$$ SE\big(\hat\rho(k)\big) = \sqrt{\frac{1 + 2\sum_{i=1}^{k-1} \hat\rho(i)^2}{N}}, \qquad SE\big(\hat\phi_{kk}\big) = \frac{1}{\sqrt{N}} $$
where $N$ is the number of observations.
Alternatively, we can use the `statsmodels` functions:
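For instance, assuming the simulated series is stored in `X` (as in the sketch above):

```python
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import acf, pacf
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# numerical values, with 95% confidence intervals (alpha=0.05)
acf_vals, acf_conf = acf(X, nlags=20, alpha=0.05)
pacf_vals, pacf_conf = pacf(X, nlags=20, alpha=0.05)

# ready-made plots
fig, axes = plt.subplots(1, 2, figsize=(12, 4))
plot_acf(X, lags=20, alpha=0.05, ax=axes[0])
plot_pacf(X, lags=20, alpha=0.05, ax=axes[1])
plt.show()
```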
The autocorrelation function decays exponentially, in two forms depending on the sign of $\rho$. If $\rho > 0$, all autocorrelations are positive. If $\rho < 0$, the sign alternates and the graph looks like a damped cosine function.
Estimation using statsmodels methods: OLS and ols
Method 1
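A sketch of this first approach, using `sm.OLS` together with `sm.add_constant` (the same functions referenced further below) on the simulated series `X`:

```python
import statsmodels.api as sm

# regress X_t on X_{t-1}; sm.add_constant adds the intercept column
y = X[1:]
X_reg = sm.add_constant(X[:-1])

ols_fit = sm.OLS(y, X_reg).fit()
print(ols_fit.params)   # [alpha_hat, beta_hat]
```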
Method 2
The library `statsmodels` lets you use R-style formulas to specify the regression parameters. This is very useful, in particular when working with multiple regression. The right method to call is `ols`, written with lowercase letters.
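A sketch of the formula interface, assuming the data has been collected in a DataFrame with (hypothetical) column names `X` and `X_lag`:

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({"X": X[1:], "X_lag": X[:-1]})

# R-style formula: the intercept is included automatically
ols_formula_fit = smf.ols("X ~ X_lag", data=df).fit()
print(ols_formula_fit.params)   # Intercept and X_lag (i.e. alpha and beta)
```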
Method 3
The same effect of `sm.add_constant` can be obtained with the `dmatrices` function from the patsy library. This function lets you create a regressors matrix by selecting the columns of a dataframe and using R-style formulas (for more information see here). This is the code to use:
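One possible version of that code (using the hypothetical DataFrame and column names from the sketch above):

```python
import statsmodels.api as sm
from patsy import dmatrices

# build the response vector and the design matrix from the R-style formula
y_design, X_design = dmatrices("X ~ X_lag", data=df, return_type="dataframe")

patsy_fit = sm.OLS(y_design, X_design).fit()
print(patsy_fit.params)
```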
Note that we used again the function `sm.OLS`, written with capital letters.
Comment: Regarding the fit methods, there are two possibilities. The OLS method for multiple linear regression returns the estimated vector
$$ \hat{\beta} = (X^T X)^{-1} X^T y $$
which requires the computation of the inverse of the matrix $X^T X$ (see here, wiki). The two available methods are:
- pinv method: uses the Moore-Penrose pseudo-inverse
- qr method: uses the QR decomposition
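Both options can be selected when calling `fit`; a minimal sketch reusing the regression above:

```python
import statsmodels.api as sm

y, X_reg = X[1:], sm.add_constant(X[:-1])

# same regression, solved with the two different methods
fit_pinv = sm.OLS(y, X_reg).fit(method="pinv")  # Moore-Penrose pseudo-inverse
fit_qr = sm.OLS(y, X_reg).fit(method="qr")      # QR decomposition

print(fit_pinv.params, fit_qr.params)           # the two estimates coincide
```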
Estimation using ARMA class
Method 4
Above I presented the results obtained by two different methods.
The first parameter above corresponds to what we previously called $\mu$ (the mean). We can calculate the regression intercept by
$$ \alpha = \mu\,(1 - \rho) $$
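A sketch of this fit with the current `statsmodels` time-series API (the older `ARMA` class has been replaced by `ARIMA`; an AR(1) corresponds to order (1, 0, 0)):

```python
from statsmodels.tsa.arima.model import ARIMA

# AR(1) = ARIMA(1, 0, 0); trend="c" estimates the mean of the process
arma_fit = ARIMA(X, order=(1, 0, 0), trend="c").fit()
print(arma_fit.params)  # const (the mean mu), ar.L1 (rho), sigma2

mu_hat = arma_fit.params["const"]
rho_hat = arma_fit.params["ar.L1"]
alpha_hat = mu_hat * (1 - rho_hat)  # regression intercept: alpha = mu * (1 - rho)
print(alpha_hat)
```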
Plot of true rho and the rho with model errors
I use the name `beta` to refer to the "rho with model error".
Let us estimate the regression parameters using the entire dataset:
Train and Test sets
Alright... the MLE method doesn't work very well.
As discussed in the notebook 5.1, sometimes the log-likelihood function attains its maximum at 0. If we try to set `sig_mod` to smaller values, the MLE doesn't work anymore.
Window-dependent estimation:
The previous method is an informal way to get an approximate value of the standard deviation $\sigma_\eta$ of the process noise.
The estimation of $\sigma_\eta$ is a difficult task, and it is considered one of the hardest parts of filter design. Here the designer has a strong responsibility, because these choices will affect the performance of the filter.
Kalman filter application:
The last values of $\beta$ and $P$ in the training set are used as initial values for the Kalman filter in the test set.
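A minimal sketch of such a filter, tracking $\beta_t$ in $x_t = \alpha + \beta_t x_{t-1} + \epsilon_t$ under the random-walk state model; the arguments (initial `beta0`, `P0` and the noise variances `var_eps`, `var_eta`) would be the values obtained from the training set:

```python
import numpy as np

def kalman_beta(x, alpha, beta0, P0, var_eps, var_eta):
    """Scalar Kalman filter for x[t] = alpha + beta[t]*x[t-1] + eps,
    with beta[t] = beta[t-1] + eta (random walk)."""
    n = len(x)
    betas = np.zeros(n)       # filtered state means
    Ps = np.zeros(n)          # filtered state variances
    betas_pred = np.zeros(n)  # one-step-ahead predicted means
    Ps_pred = np.zeros(n)     # one-step-ahead predicted variances
    betas[0], Ps[0] = beta0, P0
    betas_pred[0], Ps_pred[0] = beta0, P0
    beta, P = beta0, P0

    for t in range(1, n):
        # prediction step (random walk: the mean is unchanged)
        beta_pred, P_pred = beta, P + var_eta
        # update step: the observation "matrix" is H = x[t-1]
        H = x[t-1]
        S = H * P_pred * H + var_eps        # innovation variance
        K = P_pred * H / S                  # Kalman gain
        r = x[t] - (alpha + beta_pred * H)  # innovation (residual)
        beta = beta_pred + K * r
        P = (1 - K * H) * P_pred
        betas[t], Ps[t] = beta, P
        betas_pred[t], Ps_pred[t] = beta_pred, P_pred

    return betas, Ps, betas_pred, Ps_pred
```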
Rolling regression beta
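A simple rolling-regression sketch (the window size of 60 is illustrative):

```python
import numpy as np

def rolling_beta(x, window=60):
    """Slope of the OLS regression of x[t] on x[t-1] over a rolling window."""
    betas = np.full(len(x), np.nan)
    for t in range(window, len(x)):
        xx, yy = x[t-window:t], x[t-window+1:t+1]
        # OLS slope = Cov(x_{t-1}, x_t) / Var(x_{t-1})
        betas[t] = np.cov(xx, yy)[0, 1] / np.var(xx, ddof=1)
    return betas
```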
RTS Smoother
In the following cell I also calculate the Rauch-Tung-Striebel smoother.
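A sketch of the backward RTS pass, assuming the filtered and predicted means/variances returned by the filter sketch above:

```python
import numpy as np

def rts_smoother(betas, Ps, betas_pred, Ps_pred):
    """Rauch-Tung-Striebel smoother for the random-walk state model."""
    n = len(betas)
    betas_smooth = np.copy(betas)
    Ps_smooth = np.copy(Ps)
    for t in range(n - 2, -1, -1):
        # smoother gain; the state transition coefficient is 1 (random walk)
        C = Ps[t] / Ps_pred[t + 1]
        betas_smooth[t] = betas[t] + C * (betas_smooth[t + 1] - betas_pred[t + 1])
        Ps_smooth[t] = Ps[t] + C**2 * (Ps_smooth[t + 1] - Ps_pred[t + 1])
    return betas_smooth, Ps_smooth
```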
Plot of betas
Let us compare the MSE:
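For example, with a small helper (skipping the initial values where the rolling estimate is not defined):

```python
import numpy as np

def mse(estimate, truth, skip=60):
    # compare only where both estimates are available
    return np.mean((estimate[skip:] - truth[skip:]) ** 2)
```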
As expected, the RTS smoother has a smaller MSE than the filter output.
When we compare the rolling regression MSE with the Kalman MSE, we have to remember that the rolling betas depend on the window size!
The Kalman estimator represents the spot value of beta. This value is, however, affected by the designer's choice of the unknown parameter $\sigma_\eta$. Let us recall that a small process error means that we are very confident in the model, and the filter will not be very sensitive to new measurements.
Non-constant Autocorrelation
In the previous section, we considered a constant true value of $\rho$. In this case, regression analysis works well and the results are quite satisfactory. But what happens if $\rho$ is a function of time?
In the next two examples, when we try to compute the autocorrelation coefficient of the process using the entire dataset, we will obtain just a constant "average value". But we will not be able to acquire knowledge of the spot autocorrelation.
In this section, we will see that the Kalman filter is a good solution for this problem! And can be a good alternative to the rolling regression approach.
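As an illustration of the setup (the sinusoidal form of $\rho_t$ below is my own choice for this sketch, not necessarily the one used in the original cells):

```python
import numpy as np

np.random.seed(seed=42)

N = 2000
t_grid = np.arange(N)
rho_t = 0.5 + 0.4 * np.sin(2 * np.pi * t_grid / 500)  # slowly varying true rho
sig_eps = 1.0

eps = np.random.normal(scale=sig_eps, size=N)
X = np.zeros(N)
for t in range(1, N):
    X[t] = rho_t[t] * X[t-1] + eps[t]

# a single autocorrelation coefficient estimated on the whole dataset
rho_full = np.corrcoef(X[:-1], X[1:])[0, 1]
print(rho_full)   # roughly an "average" of rho_t, not the spot value
```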
The auto-correlation value computed above is not informative of the real auto-correlation dynamics. The two plots below show the true dynamics of $\rho_t$ (left), and the poor fit of the regression line with the slope computed above (right).
Train-Test split.
MLE estimator
Yeah! It works! Although the value of var_eta is a little higher than expected, we will use it.
Kalman filter application
The last values of $\beta$ and $P$ in the training set are used as initial values for the Kalman filter in the test set.
Rolling betas and smoothed betas
Plot all betas
MSE:
Comments:
We see that the MSE of the filter is similar to the MSE of the rolling regression. However, the Kalman filter does not depend on the window size, whereas the rolling regression output does!
The presented result looks nice for a rolling regression with `rolling_window=60`, but if you try to increase it to `rolling_window=100` or more, you will notice a very bad performance.
This behavior is the opposite of what we saw for a constant beta.
For the Kalman filter, instead, we used the MLE-estimated parameters.
The smoothed beta has an incredibly small MSE! In general, the purpose of the smoothing approach is to remove the noise. It produces a very good estimate of the true value of beta (i.e. rho).
Example 2: Discontinuous behavior
In the following example, I will give space to my imagination to create a highly irregular (but not too much) function.
The purpose is to test the behavior of the Kalman filter when the process model is completely wrong! Let us recall that we assumed (see notebook 5.1) a beta process following a random walk:
$$ \beta_t = \beta_{t-1} + \eta_t, \qquad \eta_t \sim \mathcal{N}(0, \sigma_\eta^2) $$
A process with big discontinuities cannot be well described by a random walk. Each discontinuity is interpreted as an extreme Gaussian event whose probability is negligible. However, we will see that the Kalman filter works well even in these circumstances, although it takes some time to adjust to new measurements (this time is inversely proportional to the process error $\sigma_\eta^2$).
I collected all the code used in Example 1 inside the function `plot_betas`.
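One illustrative possibility for such an irregular $\rho_t$, with jumps (any piecewise curve with discontinuities would do; the shape below is not the one from the original cells):

```python
import numpy as np

np.random.seed(seed=42)

N = 2000
rho_t = np.full(N, 0.2)
rho_t[500:900] = 0.9      # sudden jump up
rho_t[900:1400] = -0.5    # jump to negative autocorrelation
rho_t[1400:] = 0.7        # final regime

eps = np.random.normal(scale=1.0, size=N)
X = np.zeros(N)
for t in range(1, N):
    X[t] = rho_t[t] * X[t-1] + eps[t]

# plot_betas(X, rho_t)   # hypothetical call to the helper collected above
```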
Comments:
The two plots above contain the results of the Kalman estimation compared with rolling regression and smoothed values.
Smoothed values are always the best in terms of MSE. Of course, they cannot be used "online", as they require knowledge of the beta estimates at every time in the dataset.
In the first plot I used MLE parameters. The estimated value of `var_eta` is very close to the true value. However, this small value creates problems in the presence of jumps. For this reason the MSE is higher than the rolling regression one. To overcome this problem, in the second plot I increased the parameter `var_eta` in order to make the filter more reactive to the jumps in the process. This example is important because it shows that in some situations the filter designer's decisions matter.
Recall that when you choose a window that is too small in the rolling regression, the MSE increases instead of decreasing. This is because the variance of beta increases.
One last tip to remember: Always compare the Kalman output with the rolling regression output, to get a better understanding of the functioning of the filter.