Risk measure estimation
Camilo Garcia Trillos 2020
In this notebook
In this notebook, we build a first complete application: estimating Value at Risk (at level 99%) and Expected Shortfall (at level 95%) for one stock (simplest case). We also look at backtesting.
Acquire and clean data
We start by reading the data from the csv table 'AIG_20171201_15y.csv'. It corresponds to AIG, an insurance company.
Let us read the data and perform some basic cleaning procedures:
Understand what the entries of the data represent
Guarantee that the type of each entry corresponds to its meaning
Check if there are missing data
Choose an appropriate index
Make a visual inspection in search of abnormal observations
We start by reading and inspecting the data
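As a sketch of this step (pandas assumed; a tiny in-memory sample stands in for the actual 'AIG_20171201_15y.csv' file, and the column names here are illustrative):

```python
import io
import pandas as pd

# In the notebook the data comes from 'AIG_20171201_15y.csv';
# here a small in-memory sample stands in for it.
csv_sample = io.StringIO(
    "date,open,high,low,close,adj_close,volume\n"
    "2003-01-02,29.10,29.50,28.90,29.30,1146.0,5000000\n"
    "2003-01-03,29.30,29.80,29.10,29.60,1158.0,4800000\n"
)
df = pd.read_csv(csv_sample)

print(df.head())      # first rows: what the entries are
print(df.describe())  # summary statistics: count, mean, std, min, ...
```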
It looks like it contains the same information as the APPLE database, but this database starts in January 2003. Let us check some summary statistics.
First, comparing the 'count' rows, it is likely that there are no missing data. Then, looking at each of the summary statistics (mean, std, min, ...) for open, high, low and close (and their respective adjusted versions), they seem rather close to each other, which is a good sign of coherence in the information.
We can now turn our attention to the content itself. It seems that, unlike Apple, this share did not split but rather diminished the number of shares on the market (split ratio less than one).
Also, there are huge gaps in the prices, but unlike the Apple share this happens in the adjusted prices. Finally, there are also big differences in the traded volume of shares.
These are things to take into account when looking at the plots.
Let us now see what the data types of the entries are
Once again we need to fix the date entry type.
We can now check we have no missing data, and that the date column is unique (in view of defining it as the index of the database).
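These cleaning steps can be sketched as follows (pandas assumed; the DataFrame here is a small illustrative stand-in for the AIG data):

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["2003-01-02", "2003-01-03", "2003-01-06"],
    "adj_close": [29.3, 29.6, 29.9],
})

# Fix the date entry type: parse strings into proper timestamps
df["date"] = pd.to_datetime(df["date"])

# Check for missing data and uniqueness of the date column
assert df.isna().sum().sum() == 0, "missing data found"
assert df["date"].is_unique, "duplicate dates found"

# Use the date as index of the database
df = df.set_index("date").sort_index()
```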
We are ready to make some plots. Let us make four separate plots:
One with open, close, low, and high for each date
One with adj. open, close, low, and high for each date
One with split ratio per day
One with adjusted volume per day
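A minimal sketch of the four-panel figure (matplotlib and pandas assumed; the column names and the synthetic data are illustrative, not the AIG series):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Illustrative data; in the notebook these are the AIG columns
idx = pd.date_range("2003-01-02", periods=100, freq="B")
rng = np.random.default_rng(0)
close = 29 + rng.normal(0, 0.5, 100).cumsum()
df = pd.DataFrame({
    "open": close + 0.1, "close": close,
    "low": close - 0.3, "high": close + 0.3,
    "adj_open": close + 0.1, "adj_close": close,
    "adj_low": close - 0.3, "adj_high": close + 0.3,
    "split_ratio": np.ones(100),
    "adj_volume": rng.integers(4_000_000, 6_000_000, 100),
}, index=idx)

fig, axes = plt.subplots(4, 1, figsize=(8, 12), sharex=True)
df[["open", "close", "low", "high"]].plot(ax=axes[0], title="Raw prices")
df[["adj_open", "adj_close", "adj_low", "adj_high"]].plot(ax=axes[1], title="Adjusted prices")
df["split_ratio"].plot(ax=axes[2], title="Split ratio")
df["adj_volume"].plot(ax=axes[3], title="Adjusted volume")
fig.tight_layout()
```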
Two very steep changes within a couple of days. Are they caused by splits? Before trying to answer, let us zoom in on this area
The decreasing change is more progressive, although there is a strong decrease around August 2008. At the time the actual effects of the subprime crisis on AIG were being revealed.
The upward tendency occurs quite quickly starting from 2009-07. Now, let us look at the splits
This corresponds to the upward trend seen before. Unlike Apple, this share, rather than splitting, actually merged some stocks!
We can now check the adjusted prices
The picture is grim. The stock lost an important part of its value and has since recovered some ground (albeit negligible with respect to the initial losses).
Finally, we look at the traded adjusted volume
The volume has a very large peak precisely on the date of the split. This is not implausible, but if this number were relevant to us, we would need to find out what happened that day.
Risk measure estimation : one day
In the following, we will implement and compare the closed-form and historical simulation approaches for the estimation of the value at risk of the changes in the AIG adjusted close price.
To simplify the exercise, we suppose that we want to estimate a one-day value at risk. Moreover, we suppose that we are making the calculations on 01/01/2008. Therefore, we will only consider the database before this date.
Let us build another database that only contains these dates
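With a date index in place, restricting attention to the data before 01/01/2008 is a one-line selection. A sketch, using an illustrative series in place of the AIG data:

```python
import numpy as np
import pandas as pd

# Illustrative price series indexed by business days
idx = pd.date_range("2003-01-02", "2017-12-01", freq="B")
df = pd.DataFrame({"adj_close": np.linspace(29, 60, len(idx))}, index=idx)

# Keep only observations strictly before the calculation date
df_past = df.loc[df.index < "2008-01-01"]
```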
Closed form
Remember that the key point in using a closed-form approximation is to find a suitable risk mapping where the risk factors are Gaussian.
Our candidate is to define as risk factor the log-return $X_{t+1} = \log(S_{t+1}/S_t)$ of the adjusted close price.
Let us examine if:
This factor seems stationary
This factor seems Gaussian
Stationarity
Let us also look at variances and covariances within one month periods.
The mean of each subsample seems to vary rather randomly around a constant mean value. This is not evidence against stationarity. Let us test the standard deviation as well
This plot seems slightly less random, but still reasonable.
Very small estimated autocorrelation. Finally, let us check the augmented Dickey-Fuller test.
All in all, we do not seem to have enough evidence against stationarity. How about normality?
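The stationarity checks above can be sketched with plain numpy (the series here is a synthetic stand-in for the log-returns; the augmented Dickey-Fuller test itself is typically run with statsmodels' `adfuller`, not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0005, 0.02, 1000)  # stand-in for the log-return series

# Lag-1 sample autocorrelation: correlate the series with itself shifted by one day
rho1 = np.corrcoef(x[:-1], x[1:])[0, 1]

# Means and standard deviations over ~one-month (21 trading day) subsamples
blocks = x[: len(x) - len(x) % 21].reshape(-1, 21)
block_means = blocks.mean(axis=1)
block_stds = blocks.std(axis=1)
```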
Normality
Visually, the fit is not clear, although the divergences are not extraordinary. Let us look at another tool to check this
So the p-value is very small. Yet another indication against normality.
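The specific test used in the notebook is not shown here; as an assumption, a Jarque-Bera test from scipy illustrates the same kind of conclusion, producing a very small p-value on clearly non-Gaussian (heavy-tailed) data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
heavy_tailed = rng.standard_t(df=3, size=2000)  # stand-in for the log-returns

# Jarque-Bera tests skewness and kurtosis against the Gaussian benchmark;
# a small p-value is evidence against normality
jb_stat, jb_pvalue = stats.jarque_bera(heavy_tailed)
```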
This means we cannot assume full normality of the whole series.
Despite the above, for the sake of comparison, let us see what the risk measures would be if assuming normality. Recall from the lecture that the delta approximation gives

$$\Delta V_{t+1} \approx S_t\, X_{t+1},$$

and so the formulas for the risk measures are

$$\mathrm{VaR}_\alpha = S_t\left(-\mu + \sigma\,\Phi^{-1}(\alpha)\right), \qquad \mathrm{ES}_\alpha = S_t\left(-\mu + \sigma\,\frac{\phi(\Phi^{-1}(\alpha))}{1-\alpha}\right),$$

where $\mu$ and $\sigma$ are the mean and standard deviation of the log-returns, $\Phi$ is the standard normal c.d.f. and $\phi$ its density.
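A sketch of the closed-form Gaussian computation (scipy assumed; the inputs `S_t`, `mu` and `sigma` below are placeholders, in the notebook they come from the estimated log-returns and the last adjusted close):

```python
from scipy.stats import norm

def gaussian_var_es(S_t, mu, sigma, alpha):
    """Closed-form (delta-approximation) VaR and ES for a single stock
    whose log-return is N(mu, sigma^2), at confidence level alpha."""
    q = norm.ppf(alpha)                                  # standard normal quantile
    var = S_t * (-mu + sigma * q)                        # value at risk
    es = S_t * (-mu + sigma * norm.pdf(q) / (1 - alpha)) # expected shortfall
    return var, es

# Placeholder inputs (not the AIG numbers)
var99, _ = gaussian_var_es(S_t=100.0, mu=0.0, sigma=0.02, alpha=0.99)
_, es95 = gaussian_var_es(S_t=100.0, mu=0.0, sigma=0.02, alpha=0.95)
```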
Historical simulation: one day
We have already tested stationarity, so we can directly calculate the one-day historical simulation estimates of value at risk and expected shortfall at the assumed levels.
To do that, we first calculate what the P&L between 'today' and 'tomorrow' would be (that is, from 2007/12/31 to 2008/01/01), using as possible outcomes the past values of the log-returns. The formula in this case is simply

$$\Delta V^{(s)} = S_t\left(e^{X_s} - 1\right),$$

where $S_t$ is today's adjusted close price and $X_s$ runs over the historical log-returns.
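This step can be sketched as follows (numpy assumed; the price and log-returns below are illustrative placeholders, not the AIG values):

```python
import numpy as np

S_t = 100.0  # today's adjusted close price, placeholder value
log_returns = np.array([0.01, -0.02, 0.005, -0.015, 0.03])  # historical log-returns

# One simulated P&L per historical log-return: S_t * (exp(X) - 1)
pnl = S_t * np.expm1(log_returns)
```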
Let us look quickly at the histogram of possible changes
Now we calculate the estimators of the position's value at risk and expected shortfall. We first order the samples and choose one of the estimators we saw in class. I will simply take the '+' value at risk and its corresponding expected shortfall. Please remember that observations start at zero
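One common order-statistic convention can be sketched as follows; the exact '+' estimator from the lecture may differ by one index, so treat the indexing here as an assumption:

```python
import numpy as np

def empirical_var_es(pnl, alpha):
    """Historical-simulation VaR and ES from simulated P&L values.
    Losses are sorted from worst to best; with 0-based observations,
    the VaR estimate is the loss at index floor(N*(1-alpha))."""
    losses = np.sort(-np.asarray(pnl))[::-1]          # largest loss first
    k = int(np.floor(len(losses) * (1 - alpha)))
    var = losses[k]                                   # order-statistic VaR
    es = losses[:k].mean() if k > 0 else losses[0]    # average of losses beyond VaR
    return var, es

pnl = -np.arange(1.0, 101.0)        # toy P&L sample: losses 1, 2, ..., 100
var95, es95 = empirical_var_es(pnl, alpha=0.95)
```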
Clearly these values represent a significant increase of 44% and 32.7%!