Mean-Variance Optimization
In this notebook I implement the classical mean-variance optimization (MVO) algorithm that finds the optimal weights of a portfolio. For a theoretical background see modern portfolio theory.
Since the number of stocks is small, it could be useful to plot the normalized time series and the correlation matrix of the log-returns.
plots
Some mathematics
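Let us indicate the stock price at time $t$ with $S_t$. We can now recall some important definitions:
Log returns
$$ r_t = \log \frac{S_t}{S_{t-1}} = \log S_t - \log S_{t-1} $$
Linear returns
$$ R_t = \frac{S_t - S_{t-1}}{S_{t-1}} = \frac{S_t}{S_{t-1}} - 1 $$
There are two important properties to remember:
Thanks to the properties of logarithms, for $t_0 < t_1 < \dots < t_n$ we can write:
$$ \log \frac{S_{t_n}}{S_{t_0}} = \sum_{k=1}^{n} \log \frac{S_{t_k}}{S_{t_{k-1}}} = \sum_{k=1}^{n} r_{t_k} $$
If we have a portfolio with $N$ stocks, we can prove that the linear return of the portfolio at time $t$ before rebalancing (with weights $w^i_{t-1}$ selected at time $t-1$) is:
$$ R^P_t = \sum_{i=1}^{N} w^i_{t-1} R^i_t $$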
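Let us consider an investor with a total capital $V_{t-1}$ in cash at time $t-1$. At time $t-1$ he decides to buy $n^i$ shares of the stock $i$, whose price is $S^i_{t-1}$. Of course the conditions
$$ n^i \geq 0, \qquad \sum_{i=1}^{N} n^i S^i_{t-1} = V_{t-1} $$
must hold. It is convenient to define the relative weights of the portfolio:
$$ w^i_{t-1} = \frac{n^i S^i_{t-1}}{V_{t-1}} $$
such that $\sum_{i=1}^{N} w^i_{t-1} = 1$ and $w^i_{t-1} \geq 0$ for each $i$.
At this point it is easy to prove the initial expression:
$$ R^P_t = \frac{V_t - V_{t-1}}{V_{t-1}} = \sum_{i=1}^{N} \frac{n^i \left( S^i_t - S^i_{t-1} \right)}{V_{t-1}} = \sum_{i=1}^{N} \frac{n^i S^i_{t-1}}{V_{t-1}} \, \frac{S^i_t - S^i_{t-1}}{S^i_{t-1}} = \sum_{i=1}^{N} w^i_{t-1} R^i_t $$
where in $V_t = \sum_{i} n^i S^i_t$ the values $n^i$ are the number of shares selected at time $t-1$ i.e. before the rebalancing.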
Why did I recall these formulas?
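Simply because log-returns add up over time but not across assets, while linear returns add up across assets but not over time:
$$ \log\left(1 + R^P_t\right) \neq \sum_{i} w^i_{t-1} r^i_t \qquad \text{and} \qquad \frac{S_{t_n}}{S_{t_0}} - 1 \neq \sum_{k=1}^{n} R_{t_k} $$
therefore DO NOT USE THEM with the equal sign!!
Ok... if the time interval is short, we can use the first order Taylor approximation of the logarithmic function such that:
$$ r_t = \log(1 + R_t) \approx R_t $$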
and linear returns are not so different from the log-returns. But if we consider monthly or annual returns, the difference becomes significant!
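The two aggregation properties, and the mismatch warned about above, can be checked numerically. This is a minimal sketch with made-up prices and share counts (all numbers and variable names are illustrative assumptions, not the notebook's data):

```python
import numpy as np

# Hypothetical prices of two stocks at three dates (rows: time, columns: stock)
prices = np.array([[100.0, 50.0],
                   [110.0, 45.0],
                   [121.0, 54.0]])

log_ret = np.log(prices[1:] / prices[:-1])   # log-returns r_t
lin_ret = prices[1:] / prices[:-1] - 1.0     # linear returns R_t

# Property 1: log-returns add up over time
total_log = np.log(prices[-1] / prices[0])
time_additive = np.allclose(log_ret.sum(axis=0), total_log)

# Property 2: linear returns aggregate across assets with the initial weights
shares = np.array([3.0, 4.0])                # shares bought at the first date
value = prices @ shares                      # portfolio value at each date
weights = shares * prices[0] / value[0]      # relative weights at the first date
port_lin = value[1] / value[0] - 1.0         # portfolio linear return, first period
asset_additive = np.isclose(port_lin, weights @ lin_ret[0])

# Log-returns do NOT aggregate across assets: the difference is not zero
port_log = np.log(value[1] / value[0])
log_mismatch = port_log - weights @ log_ret[0]

print(time_additive, asset_additive, log_mismatch)
```

The mismatch is small for small returns and grows with the size of the returns, which is the point of the Taylor argument above.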
1) We use log-returns to estimate the monthly mean and covariance matrix.
We can assume that the daily log-returns $r^i_t$ of the stock $i$ are i.i.d. with mean $\mu^d_i$ and standard deviation $\sigma^d_i$. Later, we will also assume that the log-returns are normally distributed, but the reality is that this assumption is wrong! If we try to test for normality using Shapiro-Wilk (here below), we can see that for each time series this assumption is rejected. Well, this is a well known fact. I didn't waste time writing my Lévy processes notebooks 😃
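A minimal sketch of such a test, assuming `scipy.stats.shapiro` and synthetic samples in place of the stock data: a Gaussian sample and a heavy-tailed Student-t sample, both made up for illustration. Real daily log-returns usually behave like the heavy-tailed case.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 1000

# Synthetic "daily log-returns" (assumed parameters, for illustration only)
normal_sample = rng.normal(loc=0.0005, scale=0.01, size=n)
heavy_sample = 0.01 * rng.standard_t(df=3, size=n)   # fat tails

# Shapiro-Wilk: the null hypothesis is that the sample is normally distributed
stat_normal, p_normal = stats.shapiro(normal_sample)
stat_heavy, p_heavy = stats.shapiro(heavy_sample)

for name, p in [("normal", p_normal), ("heavy-tailed", p_heavy)]:
    verdict = "rejected" if p < 0.05 else "not rejected"
    print(f"{name}: p-value = {p:.3g}, normality {verdict} at the 5% level")
```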
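We can calculate the monthly mean and covariance matrix from the daily mean $\mu^d_i$ and the daily covariance $\rho_{ij} \sigma^d_i \sigma^d_j$. For a month of $T$ trading days (roughly 21):
$$ \mu^{\log}_i = E\left[ \sum_{t=1}^{T} r^i_t \right] = T \mu^d_i $$
$$ \Sigma^{\log}_{ij} = Cov\left[ \sum_{t=1}^{T} r^i_t, \sum_{s=1}^{T} r^j_s \right] = \sum_{t=1}^{T} Cov\left[ r^i_t, r^j_t \right] = T \rho_{ij} \sigma^d_i \sigma^d_j $$
where I used the i.i.d. property of log-returns such that $Cov[r^i_t, r^j_s] = 0$ for $t \neq s$. The term $\rho_{ij}$ is the correlation coefficient between the daily log-returns $r^i_t$ and $r^j_t$, and $\rho_{ij} \sigma^d_i \sigma^d_j$ the daily covariance.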
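The scaling can be sketched as follows, assuming 21 trading days per month and simulated i.i.d. daily log-returns (the parameters are invented for the example). The cross-check sums the daily data into non-overlapping months.

```python
import numpy as np

T = 21  # assumed number of trading days in a month
rng = np.random.default_rng(1)

# Simulated daily log-returns of two stocks over 120 months (illustrative values)
true_daily_cov = np.array([[1.00e-4, 0.40e-4],
                           [0.40e-4, 2.25e-4]])
daily_log_ret = rng.multivariate_normal([5e-4, 3e-4], true_daily_cov, size=120 * T)

mu_daily = daily_log_ret.mean(axis=0)
cov_daily = np.cov(daily_log_ret, rowvar=False)

# Monthly moments under the i.i.d. assumption: both scale linearly with T
mu_month = T * mu_daily
cov_month = T * cov_daily

# Cross-check: build monthly log-returns by summing blocks of T daily log-returns
monthly_log_ret = daily_log_ret.reshape(-1, T, 2).sum(axis=1)
print(mu_month, monthly_log_ret.mean(axis=0))
print(cov_month)
print(np.cov(monthly_log_ret, rowvar=False))   # close, but noisier (only 120 points)
```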
2) We use monthly linear returns to compute the monthly portfolio linear return.
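Once we have obtained the monthly moments of the log-returns, we need to convert them into moments of the linear returns. For this purpose, we can assume that log-returns are normally distributed. It follows that the prices are log-normally distributed, and the gross linear returns as well:
$$ 1 + R_i = e^{r_i}, \qquad r \sim \mathcal{N}\left( \mu^{\log}, \Sigma^{\log} \right) $$
The formulas for the mean and covariance of the multivariate log-normal distribution can be found on wiki. Let us call $\mu^{\log}$ and $\Sigma^{\log}$ the mean and covariance of the monthly log-returns. The monthly linear returns have:
$$ E[R_i] = e^{\mu^{\log}_i + \frac{1}{2} \Sigma^{\log}_{ii}} - 1 $$
$$ Cov[R_i, R_j] = e^{\mu^{\log}_i + \mu^{\log}_j + \frac{1}{2} \left( \Sigma^{\log}_{ii} + \Sigma^{\log}_{jj} \right)} \left( e^{\Sigma^{\log}_{ij}} - 1 \right) $$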
This topic is also discussed in detail in [1] (see equation 6.162).
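The conversion from log-return moments to linear-return moments can be sketched with the standard multivariate log-normal formulas. The function name `log_to_linear_moments` and the input numbers are assumptions made for this example.

```python
import numpy as np

def log_to_linear_moments(mu_log, cov_log):
    """Mean and covariance of linear returns R = exp(r) - 1, with r ~ N(mu_log, cov_log)."""
    mu_log = np.asarray(mu_log, dtype=float)
    cov_log = np.asarray(cov_log, dtype=float)
    gross_mean = np.exp(mu_log + 0.5 * np.diag(cov_log))   # E[1 + R_i]
    mu_lin = gross_mean - 1.0
    # Cov[R_i, R_j] = E[1 + R_i] E[1 + R_j] (exp(cov_log_ij) - 1)
    cov_lin = np.outer(gross_mean, gross_mean) * (np.exp(cov_log) - 1.0)
    return mu_lin, cov_lin

# Illustrative monthly log-return moments for two stocks
mu_log = np.array([0.010, 0.020])
cov_log = np.array([[0.0025, 0.0010],
                    [0.0010, 0.0064]])
mu_lin, cov_lin = log_to_linear_moments(mu_log, cov_log)
print(mu_lin)
print(cov_lin)
```

For monthly data the linear moments stay close to the log ones; the gap widens as the variances grow, for instance with annual returns.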
Variance of the portfolio
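The expected return of the portfolio is simply the weighted sum of the expected returns of each stock (here I write the stock index $i$ as a subscript to simplify the notation), with $\mu_i = E[R_i]$:
$$ \mu_P = E[R^P] = \sum_{i=1}^{N} w_i \mu_i $$
But the variance involves all the covariance terms $\Sigma_{ij} = Cov[R_i, R_j] = \rho_{ij} \sigma_i \sigma_j$:
$$ \sigma_P^2 = \sum_{i=1}^{N} \sum_{j=1}^{N} w_i w_j \Sigma_{ij} = \sum_{i=1}^{N} w_i^2 \sigma_i^2 + \sum_{i \neq j} w_i w_j \rho_{ij} \sigma_i \sigma_j $$
There are $N$ variance terms and $N(N-1)$ covariance terms, of which $N(N-1)/2$ are distinct because $\Sigma$ is symmetric.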
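As a small numerical illustration, the portfolio variance can be computed in matrix form and split into its variance and covariance parts. The three-asset moments and weights below are invented for the example.

```python
import numpy as np

# Hypothetical monthly moments of the linear returns of three stocks
mu = np.array([0.010, 0.015, 0.007])
vol = np.array([0.05, 0.08, 0.03])
corr = np.array([[1.0, 0.3, 0.1],
                 [0.3, 1.0, 0.2],
                 [0.1, 0.2, 1.0]])
Sigma = np.outer(vol, vol) * corr       # Sigma_ij = rho_ij * sigma_i * sigma_j
w = np.array([0.5, 0.3, 0.2])

port_mean = w @ mu                      # weighted sum of expected returns
port_var = w @ Sigma @ w                # double sum over all i, j

# Split the double sum into the N variance terms and the N(N-1) covariance terms
N = len(w)
variance_part = sum(w[i] ** 2 * Sigma[i, i] for i in range(N))
covariance_part = sum(w[i] * w[j] * Sigma[i, j]
                      for i in range(N) for j in range(N) if i != j)

print(port_mean, np.sqrt(port_var))
print(variance_part, covariance_part)
```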
Optimization with scipy.optimize
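Let us write the optimization problem in matrix form. Let $\mu$ and $\Sigma$ be the expected returns vector and the covariance matrix of the monthly linear returns. Let $w$ be the vector of weights. With $\mu_P$ I indicate the target portfolio return. Then the optimization problem can be written as:
$$ \min_{w} \; w^T \Sigma w \quad \text{subject to} \quad \mu^T w = \mu_P, \quad \mathbf{1}^T w = 1, \quad w \geq 0 $$
Here I use scipy.optimize.minimize to solve the optimization problem.
It can be useful to read the doc for the LinearConstraint. It is convenient to write the two linear equality constraints in a compact form:
$$ \begin{pmatrix} \mu^T \\ \mathbf{1}^T \end{pmatrix} w = \begin{pmatrix} \mu_P \\ 1 \end{pmatrix} $$
The condition $w \geq 0$ is for an investor that is not allowed to take short positions.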
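A minimal sketch of this setup with `scipy.optimize.minimize`, `LinearConstraint` and `Bounds`. The helper name `min_variance_weights`, the choice of the SLSQP method and the three-asset inputs are assumptions for illustration, not necessarily what the notebook's code cell does.

```python
import numpy as np
from scipy.optimize import Bounds, LinearConstraint, minimize

# Hypothetical monthly moments of the linear returns (illustrative numbers only)
mu = np.array([0.010, 0.015, 0.007])
vol = np.array([0.05, 0.08, 0.03])
corr = np.array([[1.0, 0.3, 0.1],
                 [0.3, 1.0, 0.2],
                 [0.1, 0.2, 1.0]])
Sigma = np.outer(vol, vol) * corr

def min_variance_weights(mu, Sigma, target, long_only=True):
    """Minimize w' Sigma w subject to mu' w = target and sum(w) = 1."""
    n = len(mu)
    A = np.vstack([mu, np.ones(n)])             # the two equality constraints, stacked
    b = np.array([target, 1.0])
    constraints = LinearConstraint(A, b, b)     # lower bound == upper bound: equality
    bounds = Bounds(np.zeros(n), np.ones(n)) if long_only else None
    return minimize(lambda w: w @ Sigma @ w, x0=np.full(n, 1.0 / n),
                    jac=lambda w: 2.0 * Sigma @ w, method="SLSQP",
                    bounds=bounds, constraints=constraints,
                    options={"ftol": 1e-12, "maxiter": 500})

res = min_variance_weights(mu, Sigma, target=0.011)
w_opt = res.x
sigma_opt = np.sqrt(w_opt @ Sigma @ w_opt)
print(res.success, w_opt.round(4), sigma_opt)
```

Calling the helper for a grid of targets between the smallest and the largest expected return gives the points of the long-only efficient frontier.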
Now I compute the optimal weights for several values of target expected return $\mu_P$. With these weights we can compute the standard deviation $\sigma_P$ of the corresponding portfolio. The curve of all the points $(\sigma_P, \mu_P)$ is called efficient frontier.
The Sharpe ratio is a performance measure defined as
$$ SR = \frac{\mu_P - r_f}{\sigma_P} $$
where $r_f$ is the risk free rate. The line that starts at the point $(0, r_f)$ with a slope equal to the maximum Sharpe ratio is called capital market line (CML). The point on the efficient frontier with maximum Sharpe ratio, where the CML touches the frontier, is called tangent portfolio.
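The tangent portfolio can also be found by maximizing the Sharpe ratio directly instead of scanning the frontier. The sketch below assumes a long-only investor, a hypothetical monthly risk-free rate `rf`, and made-up moments; SLSQP is one possible solver choice.

```python
import numpy as np
from scipy.optimize import Bounds, LinearConstraint, minimize

# Hypothetical monthly moments of the linear returns (illustrative numbers only)
mu = np.array([0.010, 0.015, 0.007])
vol = np.array([0.05, 0.08, 0.03])
corr = np.array([[1.0, 0.3, 0.1],
                 [0.3, 1.0, 0.2],
                 [0.1, 0.2, 1.0]])
Sigma = np.outer(vol, vol) * corr
rf = 0.002   # assumed monthly risk-free rate

def sharpe(w):
    """Sharpe ratio of the portfolio with weights w."""
    return (w @ mu - rf) / np.sqrt(w @ Sigma @ w)

n = len(mu)
res = minimize(lambda w: -sharpe(w), x0=np.full(n, 1.0 / n), method="SLSQP",
               bounds=Bounds(np.zeros(n), np.ones(n)),                   # no short positions
               constraints=LinearConstraint(np.ones((1, n)), 1.0, 1.0),  # weights sum to one
               options={"ftol": 1e-10})
w_tan = res.x
sharpe_tan = sharpe(w_tan)
print(w_tan.round(4), sharpe_tan)
```

The maximum Sharpe ratio `sharpe_tan` is the slope of the CML for these inputs.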
Plot
Weights for the tangent portfolio:
The weights are a bit different than before because here the maximum Sharpe ratio is more accurate. Before it was computed by grid search.
Optimal weights between stocks and bond
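Once we have computed the tangency portfolio, it is optimal to stay on the CML. Let us call $R_C$ the return of a portfolio on the CML that allocates a fraction $\alpha$ of the initial capital to the tangency portfolio (with return $R_T$) and $1-\alpha$ to the risk-free asset $r_f$, i.e.
$$ R_C = \alpha R_T + (1 - \alpha) r_f $$
such that
$$ E[R_C] = \alpha E[R_T] + (1 - \alpha) r_f, \qquad Var[R_C] = \alpha^2 Var[R_T] $$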
From these equations, given a desired level of expected return or variance, it is possible to calculate the optimal balance between the stocks portfolio (the tangency portfolio) and a risk free (a bond) asset.
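Inverting the expected-return equation gives the fraction to hold in the tangency portfolio. A minimal sketch, where `cml_allocation` is a hypothetical helper and the tangency moments and risk-free rate are invented numbers:

```python
def cml_allocation(target_return, mu_tan, sigma_tan, rf):
    """Fraction alpha in the tangency portfolio that reaches target_return on the CML.

    Returns alpha and the standard deviation of the resulting portfolio.
    alpha > 1 means borrowing at the risk-free rate to leverage the tangency portfolio.
    """
    alpha = (target_return - rf) / (mu_tan - rf)
    sigma = abs(alpha) * sigma_tan
    return alpha, sigma

# Assumed monthly values: tangency mean 1.2%, tangency std 4%, risk-free rate 0.2%
mu_tan, sigma_tan, rf = 0.012, 0.04, 0.002
alpha, sigma_c = cml_allocation(0.007, mu_tan, sigma_tan, rf)
print(f"stocks: {alpha:.1%}, bond: {1 - alpha:.1%}, portfolio std: {sigma_c:.2%}")
```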
The following function does it. Let us give a value of expected return as input and see how it works:
In this function I also introduced the possibility to give an upper bound $w_{\max}$ to the weights:
$$ 0 \leq w_i \leq w_{\max} \quad \text{for each } i $$
Let us try it. We can see that although there is more diversification, the introduction of this bound reduced the Sharpe ratio.
Optimization with cvxpy
The code of this section closely follows the example given on the cvxpy website. CVXPY uses a better solver which, as we will see, is much faster. It is also very easy to use.
Here I also present an alternative and equivalent formulation of the problem, with $\mu$ and $\Sigma$ the expected returns vector and covariance matrix and $w$ the vector of weights:
$$ \max_{w} \; \mu^T w - \lambda \, w^T \Sigma w \quad \text{subject to} \quad \mathbf{1}^T w = 1, \quad w \geq 0 $$
where $\lambda$ represents the risk aversion coefficient of the investor. The choice of a positive lambda follows the common requirement to describe a risk-averse investor. For $\lambda \to 0$, the investor tends to be more risk-neutral. For $\lambda \to \infty$ the investor is very risk-averse and the optimal portfolio converges to the minimum variance portfolio i.e. the portfolio with the least variance.
From a mathematical point of view, this formulation corresponds to a problem where we want to maximize the expected return $\mu^T w$ for a fixed variance level $w^T \Sigma w = \sigma_P^2$ plus the other constraints. We can introduce the Lagrange multiplier $\lambda$ such that the objective function becomes $\mu^T w - \lambda \left( w^T \Sigma w - \sigma_P^2 \right)$, which is equivalent to the initial problem (since $\lambda \sigma_P^2$ is just a constant and does not affect the weights).
Now I replicate the results of the previous section.
I compute the optimal portfolio for several values of the risk aversion $\lambda$, which is equivalent to optimizing for several values of the portfolio standard deviation. The resulting curve is again the efficient frontier, where each point corresponds to a value of $\lambda$.
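The sweep over the risk aversion can be sketched as follows. To keep the block dependency-free it uses `scipy.optimize.minimize` rather than cvxpy (a swapped-in solver, not the one used in this section); a cvxpy version would state the same objective and constraints. The moments and the grid of $\lambda$ values are assumptions.

```python
import numpy as np
from scipy.optimize import Bounds, LinearConstraint, minimize

# Hypothetical monthly moments of the linear returns (illustrative numbers only)
mu = np.array([0.010, 0.015, 0.007])
vol = np.array([0.05, 0.08, 0.03])
corr = np.array([[1.0, 0.3, 0.1],
                 [0.3, 1.0, 0.2],
                 [0.1, 0.2, 1.0]])
Sigma = np.outer(vol, vol) * corr
n = len(mu)

def risk_aversion_weights(lam):
    """Maximize mu'w - lam * w' Sigma w with sum(w) = 1 and w >= 0."""
    res = minimize(lambda w: -(w @ mu - lam * (w @ Sigma @ w)),
                   x0=np.full(n, 1.0 / n),
                   jac=lambda w: -(mu - 2.0 * lam * (Sigma @ w)),
                   method="SLSQP",
                   bounds=Bounds(np.zeros(n), np.ones(n)),
                   constraints=LinearConstraint(np.ones((1, n)), 1.0, 1.0),
                   options={"ftol": 1e-12, "maxiter": 500})
    return res.x

lambdas = np.logspace(-1, 3, 9)                       # from nearly risk-neutral to very risk-averse
frontier_w = np.array([risk_aversion_weights(lam) for lam in lambdas])
frontier_ret = frontier_w @ mu                        # expected return of each point
frontier_std = np.sqrt(np.einsum("ki,ij,kj->k", frontier_w, Sigma, frontier_w))

for lam, s, r in zip(lambdas, frontier_std, frontier_ret):
    print(f"lambda = {lam:8.2f}   std = {s:.4f}   return = {r:.4f}")
```

As $\lambda$ grows the points move down the frontier towards the minimum variance portfolio.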
Plot:
Probability density of the tangency portfolio
What if we want to compute some probabilities? For instance the probability of losing money e.g. $P(R^P < 0)$?
Well, we do not have a closed form for the density of the portfolio. Since each $1 + R_i$ is log-normal, then $1 + R^P = \sum_i w_i (1 + R_i)$ is not a log-normal. You can see on wiki (related distributions of the log-normal) that the sum of independent LN random variables is approximately log-normal, but not exactly log-normal. In our case the variables are also correlated.
We can use Monte Carlo to simulate many LN linear returns and compute the empirical density of the portfolio. It is convenient, in order to obtain a smooth density, to work with the Gaussian kernel density estimation (KDE).
Let us generate some multivariate log-normal (LN) random variables $1 + R = e^{r}$, starting from the knowledge of the multivariate normal random variables $r$. We previously computed the monthly mean and covariance of $r$.
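A minimal sketch of this simulation, assuming `scipy.stats.gaussian_kde` for the smoothing. The log-return moments, the weights and the maximum loss threshold are invented; in the notebook they would be the monthly moments estimated earlier and the tangency weights.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(42)

# Assumed monthly log-return moments and portfolio weights (illustrative only)
mu_log = np.array([0.008, 0.012, 0.005])
vol_log = np.array([0.05, 0.08, 0.03])
corr = np.array([[1.0, 0.3, 0.1],
                 [0.3, 1.0, 0.2],
                 [0.1, 0.2, 1.0]])
cov_log = np.outer(vol_log, vol_log) * corr
w = np.array([0.3, 0.3, 0.4])

# Normal log-returns -> log-normal gross returns -> portfolio linear return
log_ret = rng.multivariate_normal(mu_log, cov_log, size=20_000)
lin_ret = np.exp(log_ret) - 1.0
port_ret = lin_ret @ w

# Smooth empirical density of the portfolio return
kde = gaussian_kde(port_ret)
grid = np.linspace(port_ret.min(), port_ret.max(), 200)
density = kde(grid)

# Empirical probabilities
prob_loss = np.mean(port_ret < 0.0)
max_loss = 0.05                                  # hypothetical threshold: losing more than 5%
prob_big_loss = np.mean(port_ret < -max_loss)
print(prob_loss, prob_big_loss)
```

The probabilities are Monte Carlo estimates, so they change slightly with the seed and the number of simulations.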
Comment:
We can see that the two curves are not so different. This is because 1 month is still a small interval of time, and therefore the linear returns are approximately distributed like the log-returns (normally distributed) and the sum of jointly normally distributed random variables is normal. For a bigger time interval e.g. 1 year, the two curves become very different.
The loss probability in the plot is the probability of losing more than the Maximum loss value i.e. $P(R^P < -\text{Maximum loss})$.
Short positions - closed formula
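If we are allowed to take short positions, we can remove the bound $w \geq 0$ and the problem becomes:
$$ \min_{w} \; w^T \Sigma w \quad \text{subject to} \quad \mu^T w = \mu_P, \quad \mathbf{1}^T w = 1 $$
where $\mu$ and $\Sigma$ are the expected returns vector and covariance matrix, $w$ the weights and $\mu_P$ the target return.
Following [3] we can compute the weights and the efficient frontier in closed form.
We can define the following new variables:
$$ A = \mathbf{1}^T \Sigma^{-1} \mu, \qquad B = \mu^T \Sigma^{-1} \mu, \qquad C = \mathbf{1}^T \Sigma^{-1} \mathbf{1}, \qquad D = BC - A^2 $$
The weights have the following closed expression:
$$ w = \frac{1}{D} \left[ \left( B - A \mu_P \right) \Sigma^{-1} \mathbf{1} + \left( C \mu_P - A \right) \Sigma^{-1} \mu \right] $$
The efficient frontier is a parabola in the $(\mu_P, \sigma_P^2)$ plane with the following equation:
$$ \sigma_P^2 = \frac{C \mu_P^2 - 2 A \mu_P + B}{D} $$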
It is good to check that the covariance matrix is full rank before the inversion.
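A minimal sketch of the closed form with the rank check, using the scalars $A = \mathbf{1}^T \Sigma^{-1} \mu$, $B = \mu^T \Sigma^{-1} \mu$, $C = \mathbf{1}^T \Sigma^{-1} \mathbf{1}$ and $D = BC - A^2$ (the notation assumed here for Merton's result). The function name and the input moments are hypothetical.

```python
import numpy as np

def frontier_closed_form(mu, Sigma, target):
    """Minimum variance weights and variance for a target return, short positions allowed."""
    n = len(mu)
    if np.linalg.matrix_rank(Sigma) < n:
        raise ValueError("covariance matrix is not full rank, cannot invert it")
    ones = np.ones(n)
    inv_mu = np.linalg.solve(Sigma, mu)       # Sigma^{-1} mu
    inv_one = np.linalg.solve(Sigma, ones)    # Sigma^{-1} 1
    A = ones @ inv_mu
    B = mu @ inv_mu
    C = ones @ inv_one
    D = B * C - A ** 2
    w = ((B - A * target) * inv_one + (C * target - A) * inv_mu) / D
    var = (C * target ** 2 - 2.0 * A * target + B) / D
    return w, var

# Hypothetical monthly moments of the linear returns (illustrative numbers only)
mu = np.array([0.010, 0.015, 0.007])
vol = np.array([0.05, 0.08, 0.03])
corr = np.array([[1.0, 0.3, 0.1],
                 [0.3, 1.0, 0.2],
                 [0.1, 0.2, 1.0]])
Sigma = np.outer(vol, vol) * corr

# A target above the largest expected return can only be reached with short positions
w_theory, var_theory = frontier_closed_form(mu, Sigma, target=0.02)
print(w_theory.round(4), np.sqrt(var_theory))
```

The weights can be compared with a numerical optimizer run without the lower bound on $w$.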
Theoretical weights:
Numerical weights:
Let us compute the efficient frontier
Plot
References
[1] Attilio Meucci (2005) Risk and asset allocation.
[2] D. Ruppert, D. Matteson (2015) Statistics and Data analysis for financial engineering
[3] Robert Merton (1970) An analytical derivation of the efficient portfolio frontier