
Risk measures

Camilo A. Garcia Trillos - 2020


In this notebook

  • we learn about some probability distributions available in scipy.stats.

  • we use the associated functions to calculate V@R and ES

  • we introduce a Monte Carlo estimator for both V@R and ES

Some distributions do not have easy formulas for their value at risk or expected shortfall. We introduce here some functions and procedures in Python to approximate their values.

We start by importing the modules we use in this notebook. The package scipy.stats contains some statistical distributions and tests. We will use some of the functions in it.

import numpy as np
from numpy.random import default_rng
import matplotlib.pyplot as plt
import scipy.stats as st

PDF, CDF and Quantile functions

It is possible to calculate directly the PDF, CDF and quantile function for several distributions in Python, thanks to the scipy.stats library. This is achieved with the following general structure:

  • PDF: st.[name distribution].pdf(value, parameters)

  • CDF: st.[name distribution].cdf(value, parameters)

  • Quantile function: st.[name distribution].ppf(probability, parameters)

Let us illustrate with some examples.

Standard Gaussian

# Plotting the pdf
x = np.linspace(-3,3,50) # take 50 equally spaced points between -3 and 3
y = st.norm.pdf(x)       # calculate the pdf for each one
plt.plot(x, y)           # plot
plt.title('PDF standard Gaussian')
plt.xlabel('x')
plt.ylabel('pdf')
Text(0, 0.5, 'pdf')
Image in a Jupyter notebook

We recover the familiar bell-shaped pdf.

# Plotting the cdf
x = np.linspace(-3,3,50) # take 50 equally spaced points between -3 and 3
y = st.norm.cdf(x)       # calculate the cdf for each one
plt.plot(x, y)           # plot
plt.title('CDF standard Gaussian')
plt.xlabel('x')
plt.ylabel('cdf')
Text(0, 0.5, 'cdf')
Image in a Jupyter notebook

The plot illustrates the properties of the cdf: non-decreasing, with limits at $-\infty$ and $\infty$ equal to 0 and 1, respectively.

# Plotting the quantile function
x = np.linspace(0,1,100) # take 100 equally spaced points between 0 and 1
y = st.norm.ppf(x)       # calculate the quantile for each one
plt.plot(x, y)           # plot
plt.title('Quantile function - standard Gaussian')
plt.xlabel('probability')
plt.ylabel('x')
Text(0, 0.5, 'x')
Image in a Jupyter notebook

Note how this is the reflection of the CDF plot around the line $y = x$.
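As a quick sanity check (a minimal sketch; the points 0.95 and 1.3 are arbitrary choices), we can verify numerically that cdf and ppf are inverses of each other:

# cdf and ppf are inverses of each other for continuous distributions
print(st.norm.cdf(st.norm.ppf(0.95))) # recovers 0.95 (up to floating-point error)
print(st.norm.ppf(st.norm.cdf(1.3)))  # recovers 1.3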

Some numbers that appear commonly are the quantiles at levels 0.95 and 0.99 of the Gaussian distribution:

# Standard Gaussian, quantiles at levels 0.95 and 0.99
print('Quantile at level 0.95: ', st.norm.ppf(0.95))
print('Quantile at level 0.99: ', st.norm.ppf(0.99))
Quantile at level 0.95:  1.6448536269514722
Quantile at level 0.99:  2.3263478740408408

Other examples of continuous distributions

# Gaussian with mean 1 and variance 4, quantile 0.99
print(st.norm.ppf(0.99, 1, 2))
# Due to the specific structure of Gaussian variables, this coincides with a rescaling of the standard one:
print(st.norm.ppf(0.99) * 2 + 1)
5.6526957480816815
5.6526957480816815
# The pdf of a standard t with 4 degrees of freedom
x = np.linspace(-5,5,50) # take 50 equally spaced points between -5 and 5
y = st.t.pdf(x,4)        # calculate the pdf for each one
plt.plot(x, y)           # plot
plt.title('PDF - t with 4 d. of f.')
plt.xlabel('x')
plt.ylabel('pdf')
Text(0, 0.5, 'pdf')
Image in a Jupyter notebook
# A t-distribution with 4 degrees of freedom and standard form, quantile 0.99
st.t.ppf(0.99,4)
3.7469473879811366
# A chi distribution with 4 degrees of freedom (st.chi2 would give the chi-square)
st.chi.ppf(0.99,4)
3.6437211935036444

There are many more distributions than the ones above. Have a look at the help to know more.

Examples of discrete distributions

In the case of discrete distributions, there is no pdf. Instead, we have a probability mass function ($p(x) := P(X = x)$), which is calculated in the scipy.stats package using the following structure:

  • st.[name distribution].pmf(value, parameters)

The CDF and quantile are calculated as for the continuous case.

Let us see one example with the Poisson distribution. Recall that if $X$ is Poisson distributed with parameter $\mu$, then $$P[X=n] = e^{-\mu}\, \frac{\mu^n}{n!}.$$
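Before plotting, we can sanity-check the pmf against this formula (a small sketch; the choices $\mu = 5$ and $n = 3$ are arbitrary):

from math import exp, factorial
mu, n = 5, 3
print(st.poisson.pmf(n, mu))           # pmf from scipy.stats
print(exp(-mu) * mu**n / factorial(n)) # direct evaluation of the formula above
# both print approximately 0.1403738958142805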

# Plotting the probability mass function
x = np.arange(15)
y = st.poisson.pmf(x,5)
plt.plot(x, y, 'bo') # plot
plt.title('Probability Poisson mu=5')
plt.vlines(x, 0, y, colors='b', lw=1, alpha=0.2)
plt.xlabel('x')
plt.ylabel('probability')
Text(0, 0.5, 'probability')
Image in a Jupyter notebook
# Plotting the cdf
x = np.arange(15)
y = st.poisson.cdf(x,5)
plt.step(x, y, where='post') # the 'post' label assumes a right-continuous function
plt.plot(x, y, 'bo') # plot
plt.title('CDF Poisson mu=5')
plt.xlabel('x')
plt.ylabel('probability');
Image in a Jupyter notebook
# Plotting the quantile function
x = st.poisson.cdf(np.arange(15),5) # take the values we plotted before
y = st.poisson.ppf(x,5)
plt.step(x, y, where='pre') # the 'pre' label assumes a left-continuous function
plt.plot(x, y, 'bo') # plot
plt.title('Quantile function Poisson mu=5')
plt.xlabel('probability')
plt.ylabel('quantile');
Image in a Jupyter notebook

Calculating V@R and ES using quantiles

Let us implement two functions that receive a frozen instance of a probability distribution from scipy.stats (a distribution with its parameters fixed) and return V@R and ES, respectively.

Value at risk

Taking $X$ to be a random variable denoting, for example, profits and losses, we can easily calculate value at risk whenever $X$ follows one of the distributions in scipy.stats, by using the quantile function.

We can show (see the lecture notes) that $$\mathrm{V@R}^\alpha(X) = -q_{X}\left(1-\alpha + \epsilon\, \mathbb{1}_{\{F_X(q_X(1-\alpha)) = 1-\alpha\}}\right),$$ for some $0 < \epsilon < P[X = q_X(1-\alpha)]$. Note that in the case of a continuous distribution we get $\mathrm{V@R}^\alpha(X) = -q_{X}(1-\alpha)$.

def varisk(dist, alpha):
    if alpha <= 0 or alpha >= 1:
        raise ValueError('Alpha is outside of valid interval')
    x_aux = dist.ppf(1-alpha)
    try: # check if calling the function pmf raises an error
        dist.pmf(0)
    except: # if an error, the distribution is continuous because there is no prob. mass function
        return -x_aux
    # Otherwise, the distribution is discrete
    if dist.cdf(x_aux) == 1-alpha:
        return -1.*dist.ppf(1-alpha + dist.pmf(x_aux)*alpha)
    return -x_aux

Let us see some examples:

# Value at risk at level 0.99 for a Gaussian with mean 1 and variance 4
print(varisk(st.norm(1,2), 0.99))
# The same from an exact formula (see lecture notes):
print(2*st.norm.ppf(0.99,0,1) - 1)
3.6526957480816815
3.6526957480816815

We got the value we were expecting. Let us now check the case of a Bernoulli random variable with $P[X=1]=0.75$.

print('V@R level 0.99', varisk(st.bernoulli(0.75), 0.99))
print('V@R level 0.75', varisk(st.bernoulli(0.75), 0.75))
print('V@R level 0.60', varisk(st.bernoulli(0.75), 0.60))
V@R level 0.99 -0.0
V@R level 0.75 -1.0
V@R level 0.60 -1.0
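To see where the discrete correction in varisk comes into play, let us trace the level $\alpha = 0.75$ step by step (a sketch reproducing the intermediate quantities of the function):

dist = st.bernoulli(0.75)
alpha = 0.75
x_aux = dist.ppf(1-alpha) # q_X(0.25) = 0
print(dist.cdf(x_aux))    # F_X(0) = 0.25 equals 1-alpha exactly, so the correction applies
print(dist.ppf(1-alpha + dist.pmf(x_aux)*alpha)) # shifted quantile: 1.0, hence V@R = -1.0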

Expected shortfall

Recall (see the lecture notes) that

$$\mathrm{ES}^\alpha = \frac{1}{1-\alpha} \int_{\alpha}^{1} \mathrm{V@R}^u(X)\, \mathrm{d}u = \frac{1}{1-\alpha} \left\{ -\mathbb{E}\left[X\, \mathbb{1}_{\{X < -\mathrm{V@R}^\alpha(X)\}}\right] + (\tilde{\alpha}-\alpha)\, \mathrm{V@R}^\alpha(X) \right\}$$

where $\tilde{\alpha} := F_{-X}(\mathrm{V@R}^\alpha(X)) = 1 - F_{X}(-\mathrm{V@R}^\alpha(X)) + P[X = -\mathrm{V@R}^\alpha(X)]$.

We are going to use the method expect associated with a given distribution in scipy.stats. Look at the help of this function to understand more, but here are some examples of use:

# Examples for expect
print('Expected value of a standard Gaussian:', st.norm(0,1).expect(lambda x: x))
print('Second moment of a standard Gaussian: ', st.norm(0,1).expect(lambda x: x**2))
print('Expected value of X in the interval [0,\infty):', st.norm(0,1).expect(lambda x: x, lb=0))
Expected value of a standard Gaussian: 0.0
Second moment of a standard Gaussian:  1.000000000000001
Expected value of X in the interval [0,\infty): 0.39894228040143215

We can use this available function to define our own function to calculate expected shortfall.

def es(dist, alpha):
    var_alpha = varisk(dist, alpha)
    x_aux = (-1*dist.expect(func=lambda x: x, ub=-var_alpha))/(1-alpha)
    try: # check if calling the function pmf raises an error
        dist.pmf(0)
    except: # if an error, the distribution is continuous because there is no prob. mass function
        return x_aux
    # Otherwise, the distribution is discrete
    p_varalpha = dist.pmf(-1*var_alpha)
    alpha_tilde = 1 - dist.cdf(-1*var_alpha) + p_varalpha
    return x_aux + (alpha_tilde - alpha - p_varalpha)*var_alpha/(1-alpha)
# Expected shortfall at level 0.99 for a Gaussian with mean 1 and variance 4
alpha = 0.99
print(es(st.norm(1,2), alpha))
# The same from an exact formula (see lecture notes):
print(-1 + 2*(st.norm(0,1).pdf(st.norm(0,1).ppf(alpha))/(1-alpha)))
4.330428440691598
4.330428440691612

As can be seen, there is a small difference due to the approximation error in the function "expect", but the approximation is very good.

Let us compare value at risk and expected shortfall for different levels.

alpha_vec = np.arange(0.01,1,0.03)
test_dist = st.norm(0,1)
plt.plot(alpha_vec, [varisk(test_dist,i) for i in alpha_vec], label='V@R')
plt.plot(alpha_vec, [es(test_dist,i) for i in alpha_vec], label='ES')
plt.legend()
plt.title('V@R and ES for standard Gaussian')
plt.xlabel('Probability')
plt.ylabel('Value');
Image in a Jupyter notebook

We can visually verify several properties:

  • Expected shortfall is always greater than or equal to value at risk.

  • Expected shortfall tends to $-E[X]$ as $\alpha \downarrow 0$ (see the quick check below).
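A quick numerical illustration of the second property (a sketch; the level 0.001 is just an arbitrarily small value):

# As alpha decreases to 0, ES should approach -E[X] = 0 for the standard Gaussian
print(es(st.norm(0,1), 0.001)) # close to 0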

Let us now check the case of a Bernoulli random variable with $P[X=1]=0.75$.

print('ES level 0.99', es(st.bernoulli(0.75), 0.99))
print('ES level 0.60', es(st.bernoulli(0.75), 0.6))
ES level 0.99 0.0
ES level 0.60 -0.3750000000000002
alpha_vec = np.arange(0.01,1,0.03)
test_dist = st.bernoulli(0.75)
plt.plot(alpha_vec, [varisk(test_dist,i) for i in alpha_vec], label='V@R')
plt.plot(alpha_vec, [es(test_dist,i) for i in alpha_vec], label='ES')
plt.legend()
plt.title('V@R and ES for Bernoulli(0.75)')
plt.xlabel('Probability')
plt.ylabel('Value');
Image in a Jupyter notebook

This example shows that expected shortfall is more regular with respect to the level: here it is continuous in $\alpha$, while value at risk is not.

Monte Carlo approximations

A Monte Carlo approximation might be useful either to replace the calculation of integrals for expected shortfall as above, or to calculate both value at risk and expected shortfall in cases where we do not have an explicit expression for the pdf.

The idea is simply to replace the calculation of expected shortfall or value at risk on the distribution by that on the empirical measure coming from a large sample of the distribution.

Suppose that $X_1, \ldots, X_n$ are i.i.d. samples from the same distribution on $\mathbb{R}$. Let us also denote by $X_{(1)}, \ldots, X_{(n)}$ this sample after ordering (i.e. $X_{(i)} \leq X_{(j)}$ whenever $i \leq j$). Then we have that $$\mathrm{V@R}^\alpha(X) \approx -X_{(i_\alpha)}, \qquad \mathrm{ES}^\alpha(X) \approx -\frac{1}{i_\alpha} \sum_{k=1}^{i_\alpha} X_{(k)},$$

where $i_\alpha := \lfloor n(1-\alpha) \rfloor$.

Remark: Naturally, this is not the only estimator. Other choices of estimator will be presented later in the lecture notes.

Let us define a function that calculates the approximation of value at risk and expected shortfall for any sample.

def var_es_sample(sample, alpha):
    ss = np.sort(sample)
    ialpha = int(sample.size * (1-alpha))
    return -ss[ialpha], -ss[:ialpha].mean()

We can test it by comparing with cases like the ones we had before. Remember the use of the random number generator.

We start with the standard normal case.

NMC = 1000000 # number of samples
rng = default_rng(0)
sample_normal = rng.normal(1,2,NMC)
var_es_sample(sample_normal, 0.99)
(3.6554153431984657, 4.344458055149295)

Comparing with the values before (3.6526957480816815, 4.330428440691612), this is not a bad approximation. Note, though, that we required a large sample (here, 1000000). An even larger sample might be required for higher levels of V@R and ES.

Also, remember that this estimator is random (this is why we had to fix the seed of the generator). You can check this by removing the seed.
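For instance (a sketch; the seeds below are arbitrary), re-running the estimator with a few different seeds shows the sampling variability:

for seed in range(3):
    rng = default_rng(seed)
    sample = rng.normal(1, 2, NMC)
    print(var_es_sample(sample, 0.99)) # slightly different estimates each time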

Let us also see the case of the Bernoulli(0.75).

rng = default_rng(123)
sample_bernoulli = (rng.random(size=NMC) <= 0.75) + 0
var_es_sample(sample_bernoulli, 0.99)
(0, -0.0)
var_es_sample(sample_bernoulli, 0.6)
(-1, -0.3761125)

Again, we get values close to the ones obtained before.

Exercises

  1. Plot the value at risk for a t-distribution as a function of the number of degrees of freedom. Repeat the exercise with expected shortfall. What do you observe?

n_vec = np.arange(1,11,1)
alpha = 0.95
dist = st.t(n_vec) # frozen t-distributions with 1 to 10 degrees of freedom
VaR = varisk(dist, alpha)
plt.plot(n_vec, VaR, 'g')
plt.xlabel("degree of freedom of t distribution")
plt.ylabel("VaR at level 0.95")
plt.show()

ES = np.zeros(len(n_vec)) # use a float array (np.arange would give integers and truncate the values)
for n in n_vec:
    ES[n-1] = es(st.t(n), alpha)
plt.plot(n_vec, ES, 'b')
plt.xlabel("degree of freedom of t distribution")
plt.ylabel("ES at level 0.95")
plt.show()
Image in a Jupyter notebook
/usr/local/lib/python3.8/dist-packages/scipy/stats/_distn_infrastructure.py:2561: IntegrationWarning: The maximum number of subdivisions (50) has been achieved. If increasing the limit yields no improvement it is advised to analyze the integrand in order to determine the difficulties. If the position of a local difficulty can be determined (singularity, discontinuity) one will probably gain from splitting up the interval and calling the integrator on the subranges. Perhaps a special-purpose integrator should be used. vals = integrate.quad(fun, lb, ub, **kwds)[0] / invfac
Image in a Jupyter notebook
  2. Assume $X$ follows a standard Gaussian. Write a function $\phi(\alpha)$ such that $\mathrm{V@R}^{\phi(\alpha)}(X) = \mathrm{ES}^\alpha(X)$. Test the function. What is the value associated with 0.975?

def phi(alpha):
    dist = st.norm(0,1)
    x_aux = es(dist, alpha)
    vec = np.arange(0.0001, 1, 0.0001)
    epsilon = 0.001
    for y in vec:
        if np.abs(varisk(dist, y) - x_aux) < epsilon:
            return y
    return None

print(phi(0.975))
0.9903000000000001
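Alternatively (a sketch exploiting the symmetry of the standard Gaussian, for which $\mathrm{V@R}^{p}(X) = \Phi^{-1}(p)$; phi_closed is a hypothetical name), we can invert directly with the cdf instead of searching over a grid:

def phi_closed(alpha):
    # For a standard Gaussian, V@R^p(X) = -q_X(1-p) = Phi^{-1}(p),
    # so phi(alpha) = Phi(ES^alpha(X))
    return st.norm.cdf(es(st.norm(0,1), alpha))

print(phi_closed(0.975)) # approximately 0.9903, in line with the grid search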
  3. Assume that you have invested £100, equally, in two investments with P&L given respectively by a) a Gaussian random variable with mean 60 and sd 100; and b) an exponential random variable with mean 75. Assume further that both are independent.

Calculate the value at risk and expected shortfall of your total P&L.

NMC = 1000000 # number of samples
rng = default_rng(0)
sample = rng.normal(60,100,size=NMC) + rng.exponential(75,size=NMC)
print("The VaR at level 0.95 of total P&L is", var_es_sample(sample,0.95)[0])
print("The ES at level 0.95 of total P&L is", var_es_sample(sample,0.95)[1])
The VaR at level 0.95 of total P&L is 56.90157098600528
The ES at level 0.95 of total P&L is 101.64082287263861