CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutSign UpSign In

Real-time collaboration for Jupyter Notebooks, Linux Terminals, LaTeX, VS Code, R IDE, and more,
all in one place.

| Download
Views: 3376
Image: ubuntu2004
Kernel: Python 3 (system-wide)

Mean-variance analysis with simulated data

Camilo A. Garcia Trillos - 2020

In this class

- We are implementing a Markowitz portfolio optimization, after some simulations (loosely based on a version on the blog http://blog.quantopian.com). - We learn how to perform an optimisation using the scipy package.
%matplotlib inline import numpy as np import matplotlib.pyplot as plt import pandas as pd from scipy.optimize import minimize np.random.seed(123)

Markowitz mean-variance analysis

In this class we will start by simulated random data. Later we will learn how to use actual stock data.

Assume that we have 44 assets, each with a return series of length 10001000. We can use numpy.random.randn to sample returns from a normal distribution (recall Class A).

## NUMBER OF ASSETS n_assets = 4 ## NUMBER OF OBSERVATIONS n_obs = 1000 ## VECTOR OF MEANS mu_bar = np.array([0.,0.,0., 0.]) #mu_bar = np.array([0.001,0.002,0.005, 0.008]) ## MATRIX OF COVARIANCE #With this auxSigma matrix, inverting is stable auxSigma = np.array ( [[1.,0.,0.,0.],[0,1,0,0],[0,0,1,0],[0,0,0,1]]) #With this auxSigma matrix, inverting is less stable #auxSigma = np.array ( [[2,0.,0.,0.],[-0.2,0.7,0,0],[-0.3,0.4,0.7,0],[0.1,0.4,0.4,2.3]]) Sigma_exact =auxSigma.T@auxSigma print (Sigma_exact)
[[1. 0. 0. 0.] [0. 1. 0. 0.] [0. 0. 1. 0.] [0. 0. 0. 1.]]
return_vec = np.random.multivariate_normal(mu_bar,Sigma_exact,n_obs) print (return_vec.shape)
(1000, 4)
plt.plot(return_vec, alpha=0.3); plt.xlabel('Time') plt.ylabel('Returns')
Text(0, 0.5, 'Returns')
Image in a Jupyter notebook

Random portfolios

Let's generate a couple of random portfolios. Let's assume for the moment that we restrict ourselves to have "long-only" portfolios, that is, the weights are nonnegative.

def rand_weights_pos(n): k = np.random.rand(n) return k / sum(k) print(rand_weights_pos(n_assets)) print(rand_weights_pos(n_assets))
[0.54066805 0.2360283 0.11660484 0.1066988 ] [0.27638339 0.03006307 0.47850085 0.21505269]

Let's next compute the expected return and standard deviation for a portfolio.

def random_portfolio(returns, weights): if returns.shape[1] != weights.shape[0]: print("Error: MSG by random_portfolio: dimensions don't fit") return 0 w = np.asmatrix(weights) p = np.asmatrix(np.mean(returns, axis=0)) C = np.asmatrix(np.cov(returns.T)) mu = w * p.T sigma = np.sqrt(w * C * w.T) return mu, sigma

Now, let's plot several such random risk-return profiles:

n_portfolios = 3000 means, stds = np.column_stack([random_portfolio(return_vec, rand_weights_pos(n_assets)) for _ in range(n_portfolios)]) plt.plot(stds, means,'o', markersize=5) plt.xlabel('std') plt.ylabel('mean') plt.title('Mean and standard deviation of returns of randomly generated portfolios (long-only)')
Text(0.5, 1.0, 'Mean and standard deviation of returns of randomly generated portfolios (long-only)')
Image in a Jupyter notebook
def rand_weights_unconstrained(n): k = np.random.rand(n)*2-1 # Uniformly distributed on [-1,1] return k / sum(k) means_unc, stds_unc = np.column_stack([random_portfolio(return_vec, rand_weights_unconstrained(n_assets)) for _ in range(2 * n_portfolios)]) plt.plot(stds_unc, means_unc, 'og', markersize=5) plt.xlabel('std') plt.ylabel('mean') plt.title('Mean and standard deviation of returns of randomly generated portfolios')
Text(0.5, 1.0, 'Mean and standard deviation of returns of randomly generated portfolios')
Image in a Jupyter notebook
plt.plot(stds_unc[stds_unc<0.9], means_unc[stds_unc<0.9], 'og', markersize=5, alpha=0.4) plt.plot(stds, means, 'o', markersize=5, alpha=0.2) plt.xlabel('std') plt.ylabel('mean') plt.title('Mean and standard deviation of returns of randomly generated portfolios')
Text(0.5, 1.0, 'Mean and standard deviation of returns of randomly generated portfolios')
Image in a Jupyter notebook

Let us now check the covariance matrix and find its inverse.

Sigma = np.cov(return_vec.T) print(Sigma)
[[ 0.95457653 0.00415482 0.04630785 -0.03329693] [ 0.00415482 0.94767103 0.00220586 -0.0675899 ] [ 0.04630785 0.00220586 1.02460895 0.06388503] [-0.03329693 -0.0675899 0.06388503 0.94171693]]

Let us check the stability of the matrix with the condioned number

np.linalg.cond(Sigma)
1.26406561814308

This is not too bad. We could move on without expecting major stability problems.

Sigma_inv = np.linalg.inv(Sigma)

Markowitz, as done in the lecture notes

def Markowitz_weights(Sigma, mean_returns, mu_p): S_inv = np.linalg.inv(Sigma) pi_mu = S_inv @ (mean_returns)/sum(S_inv @ mean_returns) pi_1 = np.sum(S_inv, axis = 1) / sum(np.sum(S_inv, axis = 1) ) lambda_demoninator = (mean_returns @ pi_mu) - (mean_returns@pi_1) ll = np.array((mu_p - (mean_returns @ pi_1))/lambda_demoninator) # previous line: to convert into array in case that mu_p is a single number ll.shape=(ll.size,1) return pi_mu * ll + pi_1 * (1-ll)

Remark: recall from the lecture that Markowitz needs at least two expected returns to differ. This is not the case via our simulated data. Indeed, any portfolio will have a zero expected return and the equally weighted portfolio will have the smallest variance. Let's just proceed, nevertheless, and see how far we get ...

Sigma = np.cov(return_vec.T) mean_returns = np.mean(return_vec, axis = 0) Markowitz_weights(Sigma, mean_returns, 0.05)
array([[-0.14820352, 0.83623086, -0.24774642, 0.55971907]])
mu_p = np.arange(0.02,0.06,0.002)
Markowitz_weights(Sigma, mean_returns, mu_p)
array([[ 1.77270001e-01, 3.76150124e-01, 1.22801901e-01, 3.23777975e-01], [ 1.55571766e-01, 4.06822173e-01, 9.80986790e-02, 3.39507381e-01], [ 1.33873532e-01, 4.37494223e-01, 7.33954574e-02, 3.55236788e-01], [ 1.12175297e-01, 4.68166272e-01, 4.86922358e-02, 3.70966195e-01], [ 9.04770630e-02, 4.98838322e-01, 2.39890143e-02, 3.86695601e-01], [ 6.87788285e-02, 5.29510371e-01, -7.14207263e-04, 4.02425008e-01], [ 4.70805940e-02, 5.60182420e-01, -2.54174288e-02, 4.18154415e-01], [ 2.53823595e-02, 5.90854470e-01, -5.01206504e-02, 4.33883821e-01], [ 3.68412503e-03, 6.21526519e-01, -7.48238719e-02, 4.49613228e-01], [-1.80141094e-02, 6.52198568e-01, -9.95270935e-02, 4.65342635e-01], [-3.97123439e-02, 6.82870618e-01, -1.24230315e-01, 4.81072041e-01], [-6.14105784e-02, 7.13542667e-01, -1.48933537e-01, 4.96801448e-01], [-8.31088129e-02, 7.44214716e-01, -1.73636758e-01, 5.12530855e-01], [-1.04807047e-01, 7.74886766e-01, -1.98339980e-01, 5.28260261e-01], [-1.26505282e-01, 8.05558815e-01, -2.23043201e-01, 5.43989668e-01], [-1.48203516e-01, 8.36230864e-01, -2.47746423e-01, 5.59719075e-01], [-1.69901751e-01, 8.66902914e-01, -2.72449644e-01, 5.75448481e-01], [-1.91599985e-01, 8.97574963e-01, -2.97152866e-01, 5.91177888e-01], [-2.13298220e-01, 9.28247013e-01, -3.21856087e-01, 6.06907295e-01], [-2.34996454e-01, 9.58919062e-01, -3.46559309e-01, 6.22636701e-01]])
Mw1 = _

Markowitz via an optimization algorithm

Next a different that also allows for additional constraints (e.g., nonnegative portfolio weights).

from scipy.optimize import minimize minimize?
def Markowitz_weights_opt(Sigma, mean_returns, mu_p, nonnegative = False, print_info = False): n = Sigma.shape[0] mu_p_vect = np.array(mu_p) mu_p_vect.shape = (mu_p_vect.size,) # previous lines: to convert into array in case that mu_p is a single number portfolios = [] x0 = np.ones(n)/n cost_fun = lambda x: (x@Sigma)@x for mu_p in mu_p_vect: if nonnegative: cons = ({'type': 'eq', 'fun': lambda x: x@mean_returns - mu_p}, {'type': 'eq', 'fun': lambda x: x@np.ones(n) - 1}, {'type': 'ineq', 'fun': lambda x: x}) else: cons = ({'type': 'eq', 'fun': lambda x: x@mean_returns - mu_p}, {'type': 'eq', 'fun': lambda x: x@np.ones(n) - 1}) # Calculate efficient frontier weights using quadratic programming res = minimize(cost_fun, x0, method = 'SLSQP', constraints=cons) if print_info: print(mu_p, res) portfolios.append(res.x) # Turning the portfolios into a numpy array portfolios = np.array([np.array(p).squeeze() for p in portfolios]) return portfolios
Markowitz_weights_opt(Sigma, mean_returns, 0.012)
array([[0.26426984, 0.2540042 , 0.22165789, 0.26006807]])

In the following yet another implementation. Now, we are using the constraint μTπ≥μp\mu^T \pi \geq \mu_p instead of μTπ=μp\mu^T \pi = \mu_p.

def Markowitz_weights_optIneq(Sigma, mean_returns, mu_p_vect, nonnegative = False, print_info = False): n = Sigma.shape[0] mu_p_vect = np.array(mu_p_vect) mu_p_vect.shape = (mu_p_vect.size,) # previous lines: to convert into array in case that mu_p is a single number portfolios = [] x0 = np.ones(n)/n cost_fun = lambda x: (x@Sigma)@x for mu_p in mu_p_vect: if nonnegative: cons = ({'type': 'ineq', 'fun': lambda x: x@mean_returns - mu_p}, {'type': 'eq', 'fun': lambda x: x@np.ones(n) - 1}, {'type': 'ineq', 'fun': lambda x: x}) else: cons = ({'type': 'ineq', 'fun': lambda x: x@mean_returns - mu_p}, {'type': 'eq', 'fun': lambda x: x@np.ones(n) - 1}) # Calculate efficient frontier weights using quadratic programming res = minimize(cost_fun, x0, method = 'SLSQP', constraints=cons) if print_info: print(mu_p, res) portfolios.append(res.x) # Calculate efficient frontier weights using quadratic programming #portfolios = [opt.solvers.qp(S, opt.matrix(np.zeros(n)), G, h, A, opt.matrix([1.0,mu]))['x'] # for mu in mu_p] # Turning the portfolios into a numpy array portfolios = np.array([np.array(p).squeeze() for p in portfolios]) return portfolios
Markowitz_weights_opt(Sigma, mean_returns, 0.02)
array([[0.17722274, 0.3761428 , 0.1228422 , 0.32379226]])
Markowitz_weights_optIneq(Sigma, mean_returns, 0.02)
array([[0.17722274, 0.3761428 , 0.1228422 , 0.32379226]])
Markowitz_weights_opt(Sigma, mean_returns, mu_p)
array([[ 1.77222742e-01, 3.76142797e-01, 1.22842200e-01, 3.23792261e-01], [ 1.55524588e-01, 4.06812914e-01, 9.81380733e-02, 3.39524425e-01], [ 1.33827210e-01, 4.37483227e-01, 7.34333174e-02, 3.55256246e-01], [ 1.12175308e-01, 4.68166247e-01, 4.86922151e-02, 3.70966229e-01], [ 9.04770623e-02, 4.98838321e-01, 2.39890150e-02, 3.86695601e-01], [ 6.87788279e-02, 5.29510372e-01, -7.14205968e-04, 4.02425006e-01], [ 4.70805922e-02, 5.60182422e-01, -2.54174260e-02, 4.18154412e-01], [ 2.53823589e-02, 5.90854469e-01, -5.01206496e-02, 4.33883822e-01], [ 3.68412398e-03, 6.21526518e-01, -7.48238709e-02, 4.49613229e-01], [-1.80141092e-02, 6.52198568e-01, -9.95270934e-02, 4.65342634e-01], [-3.97122228e-02, 6.82870473e-01, -1.24230489e-01, 4.81072239e-01], [-6.14104911e-02, 7.13542570e-01, -1.48933659e-01, 4.96801580e-01], [-8.31086328e-02, 7.44214538e-01, -1.73637001e-01, 5.12531096e-01], [-1.04807047e-01, 7.74886766e-01, -1.98339980e-01, 5.28260260e-01], [-1.26505153e-01, 8.05558699e-01, -2.23043370e-01, 5.43989824e-01], [-1.48203105e-01, 8.36230507e-01, -2.47746956e-01, 5.59719554e-01], [-1.69901625e-01, 8.66902806e-01, -2.72449807e-01, 5.75448626e-01], [-1.91599983e-01, 8.97574964e-01, -2.97152867e-01, 5.91177886e-01], [-2.13298221e-01, 9.28247011e-01, -3.21856086e-01, 6.06907296e-01], [-2.34996454e-01, 9.58919063e-01, -3.46559308e-01, 6.22636699e-01]])
Markowitz_weights_optIneq(Sigma, mean_returns, mu_p)
array([[ 1.77222742e-01, 3.76142798e-01, 1.22842200e-01, 3.23792260e-01], [ 1.55524588e-01, 4.06812912e-01, 9.81380717e-02, 3.39524429e-01], [ 1.33827210e-01, 4.37483229e-01, 7.34333178e-02, 3.55256243e-01], [ 1.12175298e-01, 4.68166272e-01, 4.86922354e-02, 3.70966194e-01], [ 9.04770710e-02, 4.98838306e-01, 2.39889999e-02, 3.86695624e-01], [ 6.87788286e-02, 5.29510371e-01, -7.14207308e-04, 4.02425008e-01], [ 4.70805965e-02, 5.60182420e-01, -2.54174313e-02, 4.18154415e-01], [ 2.53823837e-02, 5.90854436e-01, -5.01206874e-02, 4.33883868e-01], [ 3.68412479e-03, 6.21526519e-01, -7.48238718e-02, 4.49613228e-01], [-1.80141097e-02, 6.52198568e-01, -9.95270930e-02, 4.65342635e-01], [-3.97123351e-02, 6.82870607e-01, -1.24230328e-01, 4.81072056e-01], [-6.14104338e-02, 7.13542510e-01, -1.48933738e-01, 4.96801661e-01], [-8.31088149e-02, 7.44214716e-01, -1.73636756e-01, 5.12530855e-01], [-1.04807051e-01, 7.74886767e-01, -1.98339976e-01, 5.28260260e-01], [-1.26505152e-01, 8.05558698e-01, -2.23043371e-01, 5.43989826e-01], [-1.48203309e-01, 8.36230686e-01, -2.47746691e-01, 5.59719314e-01], [-1.69901753e-01, 8.66902912e-01, -2.72449642e-01, 5.75448483e-01], [-1.91599984e-01, 8.97574964e-01, -2.97152866e-01, 5.91177885e-01], [-2.13298221e-01, 9.28247012e-01, -3.21856086e-01, 6.06907295e-01], [-2.34996458e-01, 9.58919063e-01, -3.46559305e-01, 6.22636699e-01]])
Mw1
array([[ 1.77270001e-01, 3.76150124e-01, 1.22801901e-01, 3.23777975e-01], [ 1.55571766e-01, 4.06822173e-01, 9.80986790e-02, 3.39507381e-01], [ 1.33873532e-01, 4.37494223e-01, 7.33954574e-02, 3.55236788e-01], [ 1.12175297e-01, 4.68166272e-01, 4.86922358e-02, 3.70966195e-01], [ 9.04770630e-02, 4.98838322e-01, 2.39890143e-02, 3.86695601e-01], [ 6.87788285e-02, 5.29510371e-01, -7.14207263e-04, 4.02425008e-01], [ 4.70805940e-02, 5.60182420e-01, -2.54174288e-02, 4.18154415e-01], [ 2.53823595e-02, 5.90854470e-01, -5.01206504e-02, 4.33883821e-01], [ 3.68412503e-03, 6.21526519e-01, -7.48238719e-02, 4.49613228e-01], [-1.80141094e-02, 6.52198568e-01, -9.95270935e-02, 4.65342635e-01], [-3.97123439e-02, 6.82870618e-01, -1.24230315e-01, 4.81072041e-01], [-6.14105784e-02, 7.13542667e-01, -1.48933537e-01, 4.96801448e-01], [-8.31088129e-02, 7.44214716e-01, -1.73636758e-01, 5.12530855e-01], [-1.04807047e-01, 7.74886766e-01, -1.98339980e-01, 5.28260261e-01], [-1.26505282e-01, 8.05558815e-01, -2.23043201e-01, 5.43989668e-01], [-1.48203516e-01, 8.36230864e-01, -2.47746423e-01, 5.59719075e-01], [-1.69901751e-01, 8.66902914e-01, -2.72449644e-01, 5.75448481e-01], [-1.91599985e-01, 8.97574963e-01, -2.97152866e-01, 5.91177888e-01], [-2.13298220e-01, 9.28247013e-01, -3.21856087e-01, 6.06907295e-01], [-2.34996454e-01, 9.58919062e-01, -3.46559309e-01, 6.22636701e-01]])

All three methods therefore yield more or less the same portfolio, which is nice ... (Markowitz_weights_optIneq only have an inequality constraint and yields results that have further away from the other two approaches).

(If we run again the problem with different means and variances the solution with inequality constraints drift further away. This is simply because those means cannot be attained using only positive portfolios).

portfolio_mu = [mean_returns @ x for x in Mw1]
portfolio_mu
[0.02, 0.022, 0.023999999999999997, 0.025999999999999995, 0.02799999999999999, 0.029999999999999992, 0.03199999999999999, 0.033999999999999996, 0.03599999999999999, 0.03799999999999999, 0.03999999999999998, 0.04199999999999998, 0.043999999999999984, 0.04599999999999998, 0.04799999999999999, 0.049999999999999975, 0.05199999999999998, 0.05399999999999999, 0.05599999999999998, 0.05799999999999997]
portfolio_sigma = [np.sqrt((x@Sigma)@x) for x in Mw1]
portfolio_sigma
[0.515581632416369, 0.5315419321289605, 0.5509491496500153, 0.5734534325916862, 0.5987056431709378, 0.6263735219026193, 0.6561515561812542, 0.6877657118379038, 0.7209744919504029, 0.755567664520399, 0.7913637009977808, 0.8282066486133915, 0.8659628931544197, 0.9045180737805882, 0.9437742808709849, 0.9836475866049637, 1.0240659109629717, 1.064967201054332, 1.106297890538078, 1.1480116026673817]
plt.plot(stds, means, 'o', markersize=3) plt.plot(stds_unc[stds_unc<1], means_unc[stds_unc<1], 'og', markersize=3, alpha=0.4) plt.ylabel('mean') plt.xlabel('std') plt.plot(portfolio_sigma, portfolio_mu, 'ro') plt.title('Random portfolios and mean-variance frontier')
Text(0.5, 1.0, 'Random portfolios and mean-variance frontier')
Image in a Jupyter notebook