SDE simulations and statistics
Brownian motion
Let us simulate some Brownian paths. Remember that the (standard) Brownian motion $\{W_t\}_{t\geq 0}$ is a continuous time stochastic process that satisfies the following properties:

- $W_0 = 0$.
- The increments are stationary and independent. (see A3 for the definition)
- It is a martingale.
- It has continuous paths, but it is nowhere differentiable.
- $W_t \sim \mathcal{N}(0,\,t)$ for $t > 0$.
For more info see the wiki.
In our simulation, each increment is such that:

$$ W_{t_{i+1}} - W_{t_i} = \sqrt{\Delta t}\; Z_i, \qquad Z_i \sim \mathcal{N}(0,1). $$

The process at time $T$ is given by $W_T = \sum_{i=0}^{N-1} \bigl( W_{t_{i+1}} - W_{t_i} \bigr)$ and follows the distribution:

$$ W_T \sim \mathcal{N}(0,\, T). $$
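A minimal simulation sketch (the number of steps, the number of paths and the seed are my assumptions, not values from the original cell):

```python
import numpy as np

np.random.seed(seed=42)  # assumed seed, just for reproducibility
T = 1.0                  # terminal time (assumed)
N = 100                  # number of time steps (assumed)
M = 10000                # number of simulated paths (assumed)
dt = T / N

# each increment is sqrt(dt) * Z with Z ~ N(0,1)
increments = np.sqrt(dt) * np.random.randn(M, N)
# prepend W_0 = 0 and cumulate the increments along the time axis
W = np.concatenate([np.zeros((M, 1)), np.cumsum(increments, axis=1)], axis=1)
W_T = W[:, -1]           # terminal values, distributed as N(0, T)
```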
Now we consider the terminal values of each path. These values should give us some indications about the distribution of $W_T$.

We expect to have $\mathbb{E}[W_T] = 0$ and $\text{Std}(W_T) = \sqrt{T}$.
Here I'm using the unbiased standard deviation estimator, with one degree of freedom i.e. `ddof=1`:

$$ \hat{\sigma} = \sqrt{ \frac{1}{N-1} \sum_{i=1}^{N} \bigl( x_i - \bar{x} \bigr)^2 }. $$

Since the number of samples is quite big, there is no big difference between this and other possible estimators. Let us recall that a common alternative is the MLE (maximum likelihood estimator), with `ddof=0`:

$$ \hat{\sigma}_{MLE} = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \bigl( x_i - \bar{x} \bigr)^2 }. $$
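For instance, with the hypothetical `W_T` array from the sketch above:

```python
mean_hat = W_T.mean()
std_unbiased = W_T.std(ddof=1)  # divides by (M - 1)
std_mle = W_T.std(ddof=0)       # divides by M, the MLE version
print(mean_hat, std_unbiased, std_mle)  # expected: close to 0 and sqrt(T) = 1
```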
Confidence intervals
Ok... what we have seen so far was quite easy. Right?
But... we want to be precise! We want to associate a confidence interval to each estimated parameter! In other words, we want to indicate how close the sample statistic (the estimated parameter) is to the population parameter (the real value of the parameter).
We can find an interval of values that contains the real parameter, with a confidence level of $1-\alpha$, for $\alpha \in (0,1)$. This is called a confidence interval.
Let us introduce some basic concepts:
Let $X_1, \dots, X_N$ be i.i.d. (independent and identically distributed), with $\mathbb{E}[X_i] = \mu$ and $\text{Var}(X_i) = \sigma^2$. We call $\bar{X}_N = \frac{1}{N} \sum_{i=1}^{N} X_i$ the sample mean.

- $\bar{X}_N$ has expectation: $\mathbb{E}[\bar{X}_N] = \mu$.
- $\bar{X}_N$ has variance: $\text{Var}(\bar{X}_N) = \frac{\sigma^2}{N}$.
Central limit theorem:

$$ \sqrt{N}\, \frac{\bar{X}_N - \mu}{\sigma} \;\xrightarrow[N \to \infty]{d}\; \mathcal{N}(0,1). $$
Normally distributed variables
For $\alpha \in (0,1)$, we call $z_{\alpha}$ the value such that:

$$ \mathbb{P}(Z > z_{\alpha}) = \alpha, \qquad Z \sim \mathcal{N}(0,1). $$
Let $X_i \sim \mathcal{N}(\mu, \sigma^2)$. Let us assume $\sigma^2$ is known.

The sum of normal random variables is again normal, i.e. $\sum_{i=1}^{N} X_i \sim \mathcal{N}(N\mu,\, N\sigma^2)$, therefore we have that $\bar{X}_N \sim \mathcal{N}\bigl(\mu,\, \frac{\sigma^2}{N}\bigr)$. We can standardize and obtain $\sqrt{N}\, \frac{\bar{X}_N - \mu}{\sigma} \sim \mathcal{N}(0,1)$.

We can write:

$$ \mathbb{P}\biggl( -z_{\alpha/2} \;\leq\; \sqrt{N}\, \frac{\bar{X}_N - \mu}{\sigma} \;\leq\; z_{\alpha/2} \biggr) = 1 - \alpha. $$

Then, the confidence interval for the mean is:

$$ \biggl[\, \bar{x}_N - z_{\alpha/2}\, \frac{\sigma}{\sqrt{N}},\;\; \bar{x}_N + z_{\alpha/2}\, \frac{\sigma}{\sqrt{N}} \,\biggr], $$

where $\bar{x}_N$ is one realization of $\bar{X}_N$. Recall that confidence intervals are random variables, and can have different values for different samples!

Usually it is common to calculate the $95\%$ confidence interval. For a $95\%$ confidence interval, $\alpha = 0.05$, so that $\alpha/2 = 0.025$, and $z_{0.025} \approx 1.96$.
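A sketch of this z-interval in Python, reusing the hypothetical `W_T` sample from above (for $W_T$ with $T=1$ the standard deviation $\sigma = 1$ is indeed known):

```python
import numpy as np
import scipy.stats as ss

alpha = 0.05
sigma = 1.0                     # known standard deviation of W_T for T = 1
z = ss.norm.ppf(1 - alpha / 2)  # z_{alpha/2}, approximately 1.96
x_bar = W_T.mean()
half_width = z * sigma / np.sqrt(len(W_T))
print("95% CI for the mean:", (x_bar - half_width, x_bar + half_width))
```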
Let $X_i \sim \mathcal{N}(\mu, \sigma^2)$. Let us assume $\sigma^2$ is unknown.

We can consider instead the sample variance (unbiased version, `ddof=1`):

$$ S_N^2 = \frac{1}{N-1} \sum_{i=1}^{N} \bigl( X_i - \bar{X}_N \bigr)^2 $$

and the statistic

$$ T = \sqrt{N}\, \frac{\bar{X}_N - \mu}{S_N}. $$

Let us recall that:

- $\bar{X}_N$ and $S_N^2$ are independent.
- $\frac{(N-1)\, S_N^2}{\sigma^2} \sim \chi^2_{N-1}$.
- $T \sim t_{N-1}$ (Student t distribution) wiki.

Following the same steps as above, we can compute the t-confidence interval for the mean:

$$ \biggl[\, \bar{x}_N - t_{N-1,\,\alpha/2}\, \frac{s_N}{\sqrt{N}},\;\; \bar{x}_N + t_{N-1,\,\alpha/2}\, \frac{s_N}{\sqrt{N}} \,\biggr], $$

where $\bar{x}_N$ and $s_N$ are realizations of $\bar{X}_N$ and $S_N$. For $\alpha \in (0,1)$, we call $t_{N-1,\,\alpha}$ the value such that:

$$ \mathbb{P}(T > t_{N-1,\,\alpha}) = \alpha. $$

Recall that the Student t density is symmetric. The term $\frac{s_N}{\sqrt{N}}$ is called the standard error for the mean.

For big $N$, the values of $t_{N-1,\,\alpha}$ and $z_{\alpha}$ are very close. Therefore they are interchangeable.
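The same computation with the t quantiles, again on the hypothetical `W_T` sample:

```python
import numpy as np
import scipy.stats as ss

alpha = 0.05
N = len(W_T)
s = W_T.std(ddof=1)                      # sample standard deviation
t_q = ss.t.ppf(1 - alpha / 2, df=N - 1)  # t_{N-1, alpha/2}
half_width = t_q * s / np.sqrt(N)        # quantile times the standard error
print("95% t-CI for the mean:", (W_T.mean() - half_width, W_T.mean() + half_width))
```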
Non-normal data
Thanks to the central limit theorem, the ratio $\sqrt{N}\, \frac{\bar{X}_N - \mu}{S_N}$ approaches a standard normal distribution for $N$ big enough. In general, $N > 30$ is considered enough.
Confidence intervals for the variance
The Chi-square distribution is asymmetric, therefore we define the two quantiles $\chi^2_{\alpha/2,\,N-1}$ and $\chi^2_{1-\alpha/2,\,N-1}$ separately and write:

$$ \mathbb{P}\biggl( \chi^2_{\alpha/2,\,N-1} \;\leq\; \frac{(N-1)\, S_N^2}{\sigma^2} \;\leq\; \chi^2_{1-\alpha/2,\,N-1} \biggr) = 1 - \alpha, $$

where $\chi^2_{q,\,N-1}$ denotes the $q$-quantile of the $\chi^2_{N-1}$ distribution. The confidence interval for the variance is therefore:

$$ \biggl[\, \frac{(N-1)\, s_N^2}{\chi^2_{1-\alpha/2,\,N-1}},\;\; \frac{(N-1)\, s_N^2}{\chi^2_{\alpha/2,\,N-1}} \,\biggr]. $$

For the standard deviation $\sigma$, the confidence interval is obtained by taking the square root of both endpoints:

$$ \biggl[\, \sqrt{\frac{(N-1)\, s_N^2}{\chi^2_{1-\alpha/2,\,N-1}}},\;\; \sqrt{\frac{(N-1)\, s_N^2}{\chi^2_{\alpha/2,\,N-1}}} \,\biggr]. $$
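A sketch with `scipy.stats.chi2`, still on the hypothetical `W_T` sample:

```python
import numpy as np
import scipy.stats as ss

alpha = 0.05
N = len(W_T)
s2 = W_T.var(ddof=1)                           # unbiased sample variance
chi_hi = ss.chi2.ppf(1 - alpha / 2, df=N - 1)  # (1 - alpha/2)-quantile
chi_lo = ss.chi2.ppf(alpha / 2, df=N - 1)      # (alpha/2)-quantile
ci_var = ((N - 1) * s2 / chi_hi, (N - 1) * s2 / chi_lo)
ci_std = tuple(np.sqrt(ci_var))
print("95% CI for the variance:", ci_var)
print("95% CI for the std:", ci_std)
```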
A step back
Ok... cool... a lot of theory. Let's see how it works in practice for the terminal value of the Brownian motion $W_T$.
Sometimes the confidence interval fails!
We chose a $95\%$ confidence interval. This means that $5\%$ of the times it fails!

In the following cell, I want to check how many times the confidence interval fails to contain the population mean (in this case it is $0$), if we repeat the experiment 100 times. For this purpose we just need to change the seed of the random number generator.
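A sketch of such an experiment (the sample size and the z-interval come from the assumptions above; the exact failure count depends on the seeds used):

```python
import numpy as np
import scipy.stats as ss

z = ss.norm.ppf(0.975)                  # z_{0.025} for a 95% interval
failures = 0
for seed in range(100):                 # repeat the experiment with 100 different seeds
    rng = np.random.default_rng(seed)
    sample = rng.standard_normal(10000)  # W_T ~ N(0, 1) for T = 1
    half_width = z * sample.std(ddof=1) / np.sqrt(len(sample))
    if abs(sample.mean()) > half_width:  # the true mean 0 is outside the interval
        failures += 1
print("failures out of 100:", failures)
```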
Well... it failed 3 times!!
The parameters can be estimated by the following `scipy.stats` function:
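Presumably this refers to `ss.norm.fit` (an assumption on my part), which returns the MLE estimates of location and scale:

```python
import scipy.stats as ss

loc_hat, scale_hat = ss.norm.fit(W_T)  # MLE fit: the scale corresponds to ddof=0
print(loc_hat, scale_hat)
```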
Hypothesis testing
Hypothesis testing is a broad topic. A summary of the topic can be found here: statistical hypothesis testing. The two fundamental concepts to understand are the test statistic and the p-value.

For practical applications, only the p-value is necessary. It is the probability of obtaining values of the test statistic more extreme than those observed in the data, assuming the null hypothesis is true. The p-value is the smallest significance level that leads us to reject the null hypothesis.
Here I want to use some "classical tests" for testing the normality of $W_T$.
Shapiro-Wilk, doc: The null hypothesis is that the data are normally distributed. It returns the test statistic and the p-value.

Jarque-Bera, doc: The null hypothesis is that the data are normally distributed. It returns the test statistic and the p-value.

Kolmogorov-Smirnov, doc: It compares two distributions. The null hypothesis is that the two distributions are identical.
Other common tests are: The Anderson-Darling test and the D'Agostino test.
It is also quite useful to visualize the Q-Q plot.
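A sketch of the three tests plus the Q-Q plot with `scipy.stats`, applied to the hypothetical `W_T` sample from above:

```python
import scipy.stats as ss
import matplotlib.pyplot as plt

stat_sw, p_sw = ss.shapiro(W_T)        # Shapiro-Wilk
stat_jb, p_jb = ss.jarque_bera(W_T)    # Jarque-Bera
# Kolmogorov-Smirnov against a normal with the estimated parameters
stat_ks, p_ks = ss.kstest(W_T, "norm", args=(W_T.mean(), W_T.std(ddof=1)))
print("p-values:", p_sw, p_jb, p_ks)

ss.probplot(W_T, dist="norm", plot=plt)  # Q-Q plot against the normal quantiles
plt.show()
```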
Shapiro - Wilk:
Assuming a confidence level of $95\%$, i.e. $\alpha = 0.05$. Since the p-value is not smaller than 0.05, we cannot reject $H_0$.
Jarque - Bera:
Assuming a confidence level of $95\%$, i.e. $\alpha = 0.05$. Since the p-value is not smaller than 0.05, we cannot reject $H_0$.
Kolmogorov - Smirnov:
Assuming a confidence level of $95\%$, i.e. $\alpha = 0.05$. Since the p-value is not smaller than 0.05, we cannot reject $H_0$.
Q-Q plot:
Geometric Brownian Motion
The GBM has the following stochastic differential equation (SDE):

$$ dS_t = \mu\, S_t\, dt + \sigma\, S_t\, dW_t. $$

By applying the Itô lemma on $\log S_t$ it is possible to solve this equation (see the computations here). The solution is:

$$ S_t = S_0\, e^{\left( \mu - \frac{1}{2}\sigma^2 \right) t + \sigma W_t}. $$

When we consider a lognormal random variable $X = e^{Y}$ with $Y \sim \mathcal{N}(\mu, \sigma^2)$, the two parameters $\mu$ and $\sigma$ are not the location and scale parameters of the lognormal distribution. They are respectively the location and scale parameters of the normally distributed $\log X$.

Using our specific notation, $\log S_t \sim \mathcal{N}\bigl( \log S_0 + (\mu - \frac{1}{2}\sigma^2)\, t,\; \sigma^2 t \bigr)$, the scale parameter is $S_0\, e^{(\mu - \frac{1}{2}\sigma^2)\, t}$ and the shape parameter is $\sigma \sqrt{t}$. The location is zero, because the distribution starts at zero (changing the `loc` parameter would shift the support of the distribution to the domain $[loc, \infty)$).
Parameter estimation:
In order to estimate the parameters, we first need to take the logarithm of the data!

The variable of interest is the log-return:

$$ \log \frac{S_{t_{i+1}}}{S_{t_i}} \;\sim\; \mathcal{N}\Bigl( \bigl(\mu - \tfrac{1}{2}\sigma^2\bigr)\, \Delta t,\;\; \sigma^2\, \Delta t \Bigr). $$

Now we can estimate the mean and the variance of a normally distributed sample. This is quite easy!
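A sketch of the estimation (the "true" parameters, the path and the seed are assumptions used only to produce some data):

```python
import numpy as np

np.random.seed(seed=42)
mu, sigma, S0, T, N = 0.1, 0.2, 100.0, 1.0, 10000  # assumed true parameters
dt = T / N

# simulate one GBM path to play the role of the data
incr = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * np.random.randn(N)
S = S0 * np.exp(np.concatenate([[0.0], np.cumsum(incr)]))

# estimation: match the moments of the normally distributed log-returns
log_returns = np.diff(np.log(S))
sigma_hat = log_returns.std(ddof=1) / np.sqrt(dt)
mu_hat = log_returns.mean() / dt + 0.5 * sigma_hat**2
print(mu_hat, sigma_hat)
```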
Path simulation
There is no need to use advanced techniques.
If we want to simulate GBM paths, it is enough to simulate Brownian paths (with the adjusted drift) and then take their exponentials!
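For example, reusing the Brownian increments recipe from the beginning (the parameters are again my assumptions):

```python
import numpy as np

np.random.seed(seed=42)
mu, sigma, S0 = 0.1, 0.2, 100.0  # assumed GBM parameters
T, N, M = 1.0, 100, 10000        # horizon, steps, paths (assumed)
dt = T / N

# Brownian motion with the adjusted drift (mu - sigma^2/2), then exponentiate
incr = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * np.random.randn(M, N)
X = np.concatenate([np.zeros((M, 1)), np.cumsum(incr, axis=1)], axis=1)
S = S0 * np.exp(X)               # GBM paths, shape (M, N+1)
```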
The Cox-Ingersoll-Ross process (CIR) is described by the SDE:

$$ dX_t = \kappa\, (\theta - X_t)\, dt + \sigma \sqrt{X_t}\, dW_t. $$

The parameters are:

- $\kappa$: mean reversion coefficient
- $\theta$: long term mean
- $\sigma$: volatility coefficient

If $2\kappa\theta > \sigma^2$ (Feller condition), the process is always strictly positive!
Curiosity for mathematicians: the diffusion coefficient of the CIR equation does not satisfy the Lipschitz condition (the square root is not Lipschitz)!! However, it can be proved that the CIR equation admits a unique solution.
Here I recall a common technique used to solve SDEs numerically. wiki
Euler-Maruyama method
Let us divide the time interval $[0, T]$ in $N$ intervals. We can choose equally spaced points $\{t_i\}_{i=0,\dots,N}$ such that $t_{i+1} - t_i = \Delta t = \frac{T}{N}$ for each $i$.

A generic Itô-diffusion SDE

$$ dX_t = \mu(t, X_t)\, dt + \sigma(t, X_t)\, dW_t $$

can be approximated by:

$$ X_{t_{i+1}} = X_{t_i} + \mu(t_i, X_{t_i})\, \Delta t + \sigma(t_i, X_{t_i})\, \Delta W_i, $$

with $X_{t_0} = X_0$ and $\Delta W_i = W_{t_{i+1}} - W_{t_i} \sim \mathcal{N}(0, \Delta t)$. The quantity to simulate is $\Delta W_i = \sqrt{\Delta t}\, Z_i$, with $Z_i \sim \mathcal{N}(0,1)$.

Applying this method to the CIR SDE we obtain:

$$ X_{t_{i+1}} = X_{t_i} + \kappa\, (\theta - X_{t_i})\, \Delta t + \sigma \sqrt{X_{t_i}}\; \Delta W_i. $$
Despite the presence of the Feller condition, when we discretize the process it is possible that $X_{t_{i+1}}$ becomes negative, creating problems in the evaluation of the square root. Here I summarize the most common methods to overcome this problem:

1) Replace $\sqrt{X_{t_i}}$ by $\sqrt{|X_{t_i}|}$.
2) Replace $\sqrt{X_{t_i}}$ by $\sqrt{X_{t_i}^+}$.
3) Full truncation: use $X_{t_i}^+$ in the drift and in the diffusion,
$$ X_{t_{i+1}} = X_{t_i} + \kappa\, \bigl( \theta - X_{t_i}^+ \bigr)\, \Delta t + \sigma \sqrt{X_{t_i}^+}\; \Delta W_i. $$
4) Reflection: take the absolute value of the whole right-hand side,
$$ X_{t_{i+1}} = \Bigl| X_{t_i} + \kappa\, \bigl( \theta - X_{t_i} \bigr)\, \Delta t + \sigma \sqrt{X_{t_i}}\; \Delta W_i \Bigr|, $$

where $x^+ = \max(x, 0)$ indicates the positive part. The methods 1), 2) and 3) just resolve the square root problem, but the process can still become negative. The method 4) prevents this possibility!
Let us have a look at the implementation of the method 4)
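A minimal sketch, under assumed parameters (reflection is my reading of method 4 from the list above):

```python
import numpy as np

np.random.seed(seed=42)
kappa, theta, sigma = 3.0, 1.0, 0.5  # assumed parameters, Feller condition holds
T, N, M = 1.0, 200, 1000             # horizon, steps, paths (assumed)
dt = T / N

X = np.zeros((M, N + 1))
X[:, 0] = 0.5                        # assumed initial value
for i in range(N):
    dW = np.sqrt(dt) * np.random.randn(M)
    # reflection: the absolute value keeps the process non-negative
    X[:, i + 1] = np.abs(X[:, i] + kappa * (theta - X[:, i]) * dt
                         + sigma * np.sqrt(X[:, i]) * dW)
```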
In the plot above I also included the asymptotic (stationary) mean and standard deviation:

$$ \mathbb{E}[X_\infty] = \theta, \qquad \text{Std}(X_\infty) = \sigma \sqrt{\frac{\theta}{2\kappa}}. $$
You can find the complete formulas here.
Let us also recall that the conditional distribution of $X_{t+\Delta t}$ given $X_t$ is a (scaled) non-central Chi-squared distribution:

$$ X_{t+\Delta t} \,\big|\, X_t \;\sim\; \frac{Y}{2c}, $$

with

$$ c = \frac{2\kappa}{\sigma^2 \bigl( 1 - e^{-\kappa\, \Delta t} \bigr)}, $$

and the random variable $Y = 2 c\, X_{t+\Delta t}$ follows the non-central $\chi^2$ distribution:

$$ Y \sim {\chi'}^2\bigl( d,\, \lambda \bigr), $$

with $d = \frac{4\kappa\theta}{\sigma^2}$ degrees of freedom and non-centrality parameter $\lambda = 2 c\, X_t\, e^{-\kappa\, \Delta t}$. (`scipy.stats.ncx2` uses this parameterization, see link)

After the variable transformation, the density of $X_{t+\Delta t}$ is:

$$ f\bigl( x_{t+\Delta t} \,\big|\, x_t \bigr) = c\, e^{-u-v}\, \Bigl( \frac{v}{u} \Bigr)^{q/2} I_q\bigl( 2\sqrt{uv} \bigr), $$

with $q = \frac{2\kappa\theta}{\sigma^2} - 1$, $u = c\, x_t\, e^{-\kappa\, \Delta t}$, $v = c\, x_{t+\Delta t}$.

The function $I_q$ is a modified Bessel function of the first kind. See [1] for more details.

The two parameterizations are related by $d = 2q + 2$ and $\lambda = 2u$.
OLS
We can rewrite the discretized equation as:

$$ \frac{X_{t_{i+1}} - X_{t_i}}{\sqrt{X_{t_i}}} \;=\; \kappa\theta\, \frac{\Delta t}{\sqrt{X_{t_i}}} \;-\; \kappa\, \sqrt{X_{t_i}}\; \Delta t \;+\; \sigma\, \Delta W_i. $$

Now we can find the parameters $\kappa$ and $\theta$ by OLS.

After a lot of calculations, it is possible to find an expression for the OLS estimators. I will not write it in latex (it is long), but just in python in the following cell.

We call:

- $YY_i = \frac{X_{t_{i+1}} - X_{t_i}}{\sqrt{X_{t_i}}}$ the response variable.
- $ZZ_i^{(1)} = \frac{\Delta t}{\sqrt{X_{t_i}}}$ the first regressor, with coefficient $\kappa\theta$.
- $ZZ_i^{(2)} = \sqrt{X_{t_i}}\; \Delta t$ the second regressor, with coefficient $-\kappa$.
The parameter $\sigma$ can be estimated from the residuals. The choice of `ddof=2` is common when estimating the standard deviation of the residuals in a linear regression: it corresponds to the number of parameters already estimated.
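A sketch of the regression in Python (reusing the hypothetical path `X[0]` and the spacing `dt` from the CIR simulation above; `np.linalg.lstsq` stands in for the long closed-form expression):

```python
import numpy as np

x, x_next = X[0, :-1], X[0, 1:]    # one realized path, sampled at spacing dt
YY = (x_next - x) / np.sqrt(x)                             # response
ZZ = np.column_stack([dt / np.sqrt(x), np.sqrt(x) * dt])   # regressors
beta, *_ = np.linalg.lstsq(ZZ, YY, rcond=None)             # OLS solution
kappa_ols = -beta[1]
theta_ols = beta[0] / kappa_ols
resid = YY - ZZ @ beta
sigma_ols = resid.std(ddof=2) / np.sqrt(dt)  # ddof=2: two parameters estimated
print(kappa_ols, theta_ols, sigma_ols)
```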
If you are lazy... and you don't want to read, write or understand this extremely long OLS formula, you can use a solver to minimize the sum of the squares.

Here I'm using the `scipy.optimize` function `minimize`.
There are several algorithms that work quite well. An alternative to Nelder-Mead can be the L-BFGS-B algorithm. Since the variables are positive, I'm also specifying the bounds (but it helps only for L-BFGS-B).
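A sketch of the optimization approach, reusing the hypothetical `x`, `x_next` and `dt` from the cell above (the initial guess and the bounds are assumptions):

```python
import numpy as np
from scipy.optimize import minimize

def sum_of_squares(params, x, x_next, dt):
    """Sum of squared (normalized) Euler residuals of the CIR equation."""
    kappa, theta = params
    resid = (x_next - x - kappa * (theta - x) * dt) / np.sqrt(x)
    return resid @ resid

res = minimize(sum_of_squares, x0=np.array([1.0, 1.0]),
               args=(x, x_next, dt), method="L-BFGS-B",
               bounds=[(1e-8, None), (1e-8, None)])   # kappa, theta > 0
kappa_min, theta_min = res.x
resid = (x_next - x - kappa_min * (theta_min - x) * dt) / np.sqrt(x)
sigma_min = resid.std(ddof=2) / np.sqrt(dt)
print(kappa_min, theta_min, sigma_min)
```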
The coefficients are given in the last row. We can see that they coincide with those obtained before.
Of course, if you change the initial parameters of the simulation, the estimation procedure can give different results.
I tried to change the seed several times. After some trials, I found that $\kappa$ is the most "volatile" parameter, while $\theta$ and $\sigma$ are quite stable.
Finally, let us compare the density with the histogram of all the realizations of $X_T$.
We can also use the `scipy.stats.ncx2` class (doc) to fit the scaled non-central Chi-squared density. However, the results can be very different.

The `CIR_pdf` function defined above corresponds to the function `ss.ncx2.pdf` after the change of parameterization.
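The change of parameterization can be written explicitly; the following sketch evaluates the transition density of $X_t$ given $X_0 = x_0$ through `ss.ncx2.pdf` (the parameters are assumptions):

```python
import numpy as np
import scipy.stats as ss

kappa, theta, sigma, x0, t = 3.0, 1.0, 0.5, 0.5, 1.0  # assumed parameters
c = 2 * kappa / (sigma**2 * (1 - np.exp(-kappa * t)))
df = 4 * kappa * theta / sigma**2          # degrees of freedom d
nc = 2 * c * x0 * np.exp(-kappa * t)       # non-centrality lambda
xx = np.linspace(0.1, 2.5, 200)
# density of X_t: change of variables Y = 2c X, so multiply by the Jacobian 2c
pdf = 2 * c * ss.ncx2.pdf(2 * c * xx, df, nc)
```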
In practice, when working with financial market data, we just have 1 realized path! (we just have one realization of the process).
The OLS method described above is the right method to use! An alternative can be to use the MLE method, considering the conditional probability density.
Comment
If we know the conditional density, why do we need to solve the SDE numerically? Why can't we just generate random variables from this density?
Well, we can! And in some circumstances it is the best thing to do! But the CIR process is mainly used as a model for the volatility dynamics in the Heston model, where the SDE approach is easier to implement!
A nice trick to solve the problem of negative values is to change variables. (this is a recurrent trick in financial math)
Let us consider the log-variable:

$$ Y_t = \log X_t. $$

The new variable can be negative, for $0 < X_t < 1$.

By the Itô lemma we obtain the evolution of $Y_t$:

$$ dY_t = \Bigl[ \bigl( \kappa\theta - \tfrac{1}{2}\sigma^2 \bigr)\, e^{-Y_t} - \kappa \Bigr]\, dt \;+\; \sigma\, e^{-Y_t/2}\, dW_t. $$
Using the Euler-Maruyama method, let us simulate some paths.
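A sketch, reusing the assumed CIR parameters, the path array `X` and the grid from the simulation above:

```python
import numpy as np

np.random.seed(seed=42)                  # same seed as the direct scheme
Y = np.zeros((M, N + 1))
Y[:, 0] = np.log(X[:, 0])                # start from the same initial value
for i in range(N):
    dW = np.sqrt(dt) * np.random.randn(M)
    drift = (kappa * theta - 0.5 * sigma**2) * np.exp(-Y[:, i]) - kappa
    Y[:, i + 1] = Y[:, i] + drift * dt + sigma * np.exp(-Y[:, i] / 2) * dW
X_log = np.exp(Y)                        # back to X: positive by construction
```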
As expected, the path did not change (same parameters and same seed as before).
This is (in my opinion) the best theoretical approach!
However, in practice, this is not a perfect method. It can have several problems for small values of $X_t$, i.e. very negative values of $Y_t$. For instance, since the algorithm involves an exponential function, it can generate huge numbers and even NaNs.
References
[1] Cox, J.C., Ingersoll, J.E., and Ross, S.A. (1985). "A Theory of the Term Structure of Interest Rates". Econometrica, Vol. 53, No. 2, pp. 385-407.