GitHub Repository: packtpublishing/machine-learning-for-algorithmic-trading-second-edition
Path: blob/master/05_strategy_evaluation/04_mean_variance_optimization.ipynb
Kernel: Python 3

Mean-Variance Optimization

MPT solves for the optimal portfolio weights to minimize volatility for a given expected return, or to maximize return for a given level of volatility. The key inputs are the expected asset returns, the standard deviations, and the covariance matrix.

Diversification works because the variance of portfolio returns depends on the covariance of the assets and can be reduced below the weighted average of the asset variances by including assets with less than perfect correlation. In particular, given a vector of portfolio weights, $\omega$, and the covariance matrix, $\Sigma$, the portfolio variance, $\sigma^2_{\text{PF}}$, is defined as:

$$\sigma^2_{\text{PF}}=\omega^T\Sigma\omega$$
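For intuition, here is a minimal two-asset sketch (with assumed volatilities of 20% each and a correlation of 0.3, values chosen purely for illustration) showing that the portfolio volatility falls below the 20% weighted average of the asset volatilities:

import numpy as np

w = np.array([0.5, 0.5])    # equal weights
vol = np.array([0.2, 0.2])  # assumed asset volatilities
rho = 0.3                   # assumed correlation
cov = np.outer(vol, vol) * np.array([[1, rho], [rho, 1]])  # covariance matrix
print(f'{np.sqrt(w @ cov @ w):.2%}')  # ~16.12%, below the 20% weighted average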

Markowitz showed that the problem of maximizing the expected portfolio return subject to a target risk has an equivalent dual representation of minimizing portfolio risk subject to a target expected return level, $\mu_{\text{PF}}$. Hence, the optimization problem becomes:

$$\begin{align}
\min_\omega &\quad\quad \sigma^2_{\text{PF}} = \omega^T\Sigma\omega\\
\text{s.t.} &\quad\quad \mu_{\text{PF}} = \omega^T\mu\\
&\quad\quad \lVert\omega\rVert_1 = 1
\end{align}$$
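For reference (not used in the code below), when short sales are allowed and the only constraint is that the weights sum to one, the global minimum-variance portfolio has a well-known closed-form solution in terms of the inverse covariance matrix:

$$\omega^*_{\text{min var}}=\frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}^T\Sigma^{-1}\mathbf{1}}$$

With additional constraints such as the no-short-sale bounds used later, no closed form exists, which is why this notebook relies on a numerical solver.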

We can calculate an efficient frontier using scipy.optimize.minimize and the historical estimates for asset returns, standard deviations, and the covariance matrix.

Imports & Settings

import warnings
warnings.filterwarnings('ignore')

%matplotlib inline
import pandas as pd
import numpy as np
from numpy.random import random, uniform, dirichlet, choice
from numpy.linalg import inv
from scipy.optimize import minimize
import pandas_datareader.data as web
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter
import seaborn as sns

sns.set_style('whitegrid')
np.random.seed(42)

cmap = sns.diverging_palette(10, 240, n=9, as_cmap=True)

Prepare Data

We select historical adjusted close prices from 1998-2017 for a random sample of 30 tickers included in the S&P 500 (according to Wikipedia).

with pd.HDFStore('../data/assets.h5') as store:
    sp500_stocks = store['sp500/stocks']
sp500_stocks.head()

with pd.HDFStore('../data/assets.h5') as store:
    prices = (store['quandl/wiki/prices']
              .adj_close
              .unstack('ticker')
              .filter(sp500_stocks.index)
              .sample(n=30, axis=1))

Compute Inputs

Compute Returns

start = 2008
end = 2017

Create weekly returns for the 2008-2017 period and drop dates and tickers that have no observations:

weekly_returns = prices.loc[f'{start}':f'{end}'].resample('W').last().pct_change().dropna(how='all')
weekly_returns = weekly_returns.dropna(axis=1)
weekly_returns.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 521 entries, 2008-01-13 to 2017-12-31
Freq: W-SUN
Data columns (total 25 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   WYNN    521 non-null    float64
 1   FIS     521 non-null    float64
 2   VLO     521 non-null    float64
 3   MRK     521 non-null    float64
 4   DTE     521 non-null    float64
 5   REGN    521 non-null    float64
 6   TEL     521 non-null    float64
 7   YUM     521 non-null    float64
 8   HST     521 non-null    float64
 9   AES     521 non-null    float64
 10  XOM     521 non-null    float64
 11  NEE     521 non-null    float64
 12  HAL     521 non-null    float64
 13  TJX     521 non-null    float64
 14  AXP     521 non-null    float64
 15  IR      521 non-null    float64
 16  CERN    521 non-null    float64
 17  IVZ     521 non-null    float64
 18  XLNX    521 non-null    float64
 19  LOW     521 non-null    float64
 20  WEC     521 non-null    float64
 21  ETFC    521 non-null    float64
 22  XEL     521 non-null    float64
 23  CDNS    521 non-null    float64
 24  BXP     521 non-null    float64
dtypes: float64(25)
memory usage: 105.8 KB

Set Parameters

stocks = weekly_returns.columns
n_obs, n_assets = weekly_returns.shape
n_assets, n_obs
(25, 521)
NUM_PF = 100000 # no of portfolios to simulate
x0 = uniform(0, 1, n_assets)
x0 /= np.sum(np.abs(x0))

Annualization Factor

periods_per_year = round(weekly_returns.resample('A').size().mean())
periods_per_year
52
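This factor is used throughout to annualize the weekly statistics: returns compound geometrically over the 52 weekly periods, while volatility is scaled by the square root of time (which assumes roughly uncorrelated weekly returns):

$$r_{\text{ann}}=(1+r_{\text{weekly}})^{52}-1,\qquad \sigma_{\text{ann}}=\sigma_{\text{weekly}}\sqrt{52}$$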

Compute Mean Returns, Covariance and Precision Matrix

mean_returns = weekly_returns.mean()
cov_matrix = weekly_returns.cov()

The precision matrix is the inverse of the covariance matrix:

precision_matrix = pd.DataFrame(inv(cov_matrix), index=stocks, columns=stocks)
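As a quick, optional sanity check (not part of the original workflow), the product of the covariance and precision matrices should recover the identity matrix up to floating-point error:

np.allclose(cov_matrix @ precision_matrix, np.eye(n_assets))  # expected: True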

Risk-Free Rate

Load historical 10-year Treasury rate:

treasury_10yr_monthly = (web.DataReader('DGS10', 'fred', start, end)
                         .resample('M')
                         .last()
                         .div(periods_per_year)
                         .div(100)
                         .squeeze())
rf_rate = treasury_10yr_monthly.mean()

Simulate Random Portfolios

The simulation generates random weights using the Dirichlet distribution and computes the annualized mean return, standard deviation, and Sharpe ratio (SR) for each sample portfolio using the historical return data:
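Before running the full simulation, a brief illustration (using the same concentration parameter as below) shows that Dirichlet draws are non-negative and sum to one, so each row is a valid long-only weight vector; the small α = 0.05 concentrates weight in a few assets:

sample_wt = dirichlet(alpha=np.full(n_assets, .05), size=3)  # three example portfolios
print(sample_wt.sum(axis=1))   # each row sums to 1
print((sample_wt >= 0).all())  # all weights are non-negative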

def simulate_portfolios(mean_ret, cov, rf_rate=rf_rate, short=True):
    alpha = np.full(shape=n_assets, fill_value=.05)
    weights = dirichlet(alpha=alpha, size=NUM_PF)
    if short:
        weights *= choice([-1, 1], size=weights.shape)

    returns = weights @ mean_ret.values + 1
    returns = returns ** periods_per_year - 1
    std = (weights @ weekly_returns.T).std(1)
    std *= np.sqrt(periods_per_year)
    sharpe = (returns - rf_rate) / std

    return pd.DataFrame({'Annualized Standard Deviation': std,
                         'Annualized Returns': returns,
                         'Sharpe Ratio': sharpe}), weights
simul_perf, simul_wt = simulate_portfolios(mean_returns, cov_matrix, short=False)
df = pd.DataFrame(simul_wt)
df.describe()

Plot Simulated Portfolios

ax = simul_perf.plot.scatter(x=0, y=1, c=2, cmap='Blues', alpha=0.5,
                             figsize=(14, 9), colorbar=True,
                             title=f'{NUM_PF:,d} Simulated Portfolios')

max_sharpe_idx = simul_perf.iloc[:, 2].idxmax()
sd, r = simul_perf.iloc[max_sharpe_idx, :2].values
print(f'Max Sharpe: {sd:.2%}, {r:.2%}')
ax.scatter(sd, r, marker='*', color='darkblue', s=500, label='Max. Sharpe Ratio')

min_vol_idx = simul_perf.iloc[:, 0].idxmin()
sd, r = simul_perf.iloc[min_vol_idx, :2].values
ax.scatter(sd, r, marker='*', color='green', s=500, label='Min Volatility')

plt.legend(labelspacing=1, loc='upper left')
plt.tight_layout()
Max Sharpe: 19.42%, 25.70%
[Figure: 100,000 simulated portfolios, with the maximum Sharpe ratio and minimum volatility portfolios highlighted]

Compute Annualized PF Performance

Now we'll set up the quadratic optimization problem to solve for the minimum standard deviation for a given return or the maximum SR.

To this end, define the functions that measure the key metrics:

def portfolio_std(wt, rt=None, cov=None):
    """Annualized PF standard deviation"""
    return np.sqrt(wt @ cov @ wt * periods_per_year)

def portfolio_returns(wt, rt=None, cov=None):
    """Annualized PF returns"""
    return (wt @ rt + 1) ** periods_per_year - 1

def portfolio_performance(wt, rt, cov):
    """Annualized PF returns & standard deviation"""
    r = portfolio_returns(wt, rt=rt)
    sd = portfolio_std(wt, cov=cov)
    return r, sd

Max Sharpe PF

Define a target function that returns the negative SR for scipy's minimize function to optimize, subject to the constraints that the weights are bounded by [-1, 1] if short selling is permitted (and by [0, 1] otherwise) and sum to one in absolute terms.

def neg_sharpe_ratio(weights, mean_ret, cov):
    r, sd = portfolio_performance(weights, mean_ret, cov)
    return -(r - rf_rate) / sd

weight_constraint = {'type': 'eq',
                     'fun': lambda x: np.sum(np.abs(x)) - 1}

def max_sharpe_ratio(mean_ret, cov, short=False):
    return minimize(fun=neg_sharpe_ratio,
                    x0=x0,
                    args=(mean_ret, cov),
                    method='SLSQP',
                    bounds=((-1 if short else 0, 1),) * n_assets,
                    constraints=weight_constraint,
                    options={'tol': 1e-10, 'maxiter': 1e4})

Compute Efficient Frontier

The solution requires iterating over a range of target returns to identify the optimal risk-return combinations.

def min_vol_target(mean_ret, cov, target, short=False):

    def ret_(wt):
        return portfolio_returns(wt, mean_ret)

    constraints = [{'type': 'eq',
                    'fun': lambda x: ret_(x) - target},
                   weight_constraint]
    bounds = ((-1 if short else 0, 1),) * n_assets
    return minimize(portfolio_std,
                    x0=x0,
                    args=(mean_ret, cov),
                    method='SLSQP',
                    bounds=bounds,
                    constraints=constraints,
                    options={'tol': 1e-10, 'maxiter': 1e4})

The mean-variance frontier relies on in-sample, backward-looking optimization. In practice, portfolio optimization requires forward-looking input. Unfortunately, expected returns are notoriously difficult to estimate accurately.

The covariance matrix can be estimated somewhat more reliably, which has given rise to several alternative approaches. However, covariance matrices of correlated assets pose computational challenges because the optimization problem requires inverting the matrix. The resulting high condition number induces numerical instability, which in turn gives rise to the Markowitz curse: the more diversification is needed (because investment opportunities are correlated), the more unreliable the weights produced by the algorithm.
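One way to gauge this instability for the sample at hand (an optional check, not part of the original analysis) is to compute the condition number of the estimated covariance matrix; large values indicate that inverting the matrix amplifies estimation error in the resulting weights:

np.linalg.cond(cov_matrix.values)  # condition number of the sample covariance matrix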

Min Volatility Portfolio

def min_vol(mean_ret, cov, short=False):
    bounds = ((-1 if short else 0, 1),) * n_assets
    return minimize(fun=portfolio_std,
                    x0=x0,
                    args=(mean_ret, cov),
                    method='SLSQP',
                    bounds=bounds,
                    constraints=weight_constraint,
                    options={'tol': 1e-10, 'maxiter': 1e4})

def efficient_frontier(mean_ret, cov, ret_range, short=False):
    return [min_vol_target(mean_ret, cov, ret, short=short) for ret in ret_range]

Run Calculation

Get random PF

simul_perf, simul_wt = simulate_portfolios(mean_returns, cov_matrix, short=False)
print(simul_perf.describe())
       Annualized Standard Deviation  Annualized Returns   Sharpe Ratio
count                  100000.000000       100000.000000  100000.000000
mean                        0.272889            0.177186       0.675622
std                         0.076975            0.055340       0.213748
min                         0.139966            0.019720       0.055767
25%                         0.217795            0.148334       0.511378
50%                         0.260579            0.173843       0.665910
75%                         0.314480            0.202062       0.834405
max                         0.654543            0.466036       1.332866
simul_max_sharpe = simul_perf.iloc[:, 2].idxmax()
simul_perf.iloc[simul_max_sharpe]

Annualized Standard Deviation    0.166130
Annualized Returns               0.221928
Sharpe Ratio                     1.332866
Name: 67110, dtype: float64

Get Max Sharpe PF

max_sharpe_pf = max_sharpe_ratio(mean_returns, cov_matrix, short=False)
max_sharpe_perf = portfolio_performance(max_sharpe_pf.x, mean_returns, cov_matrix)

r, sd = max_sharpe_perf
pd.Series({'ret': r, 'sd': sd, 'sr': (r - rf_rate) / sd})

ret    0.228716
sd     0.166360
sr     1.371822
dtype: float64


Get Min Vol PF

min_vol_pf = min_vol(mean_returns, cov_matrix, short=False)
min_vol_perf = portfolio_performance(min_vol_pf.x, mean_returns, cov_matrix)

Get Efficient PFs

ret_range = np.linspace(simul_perf.iloc[:, 1].min(),
                        simul_perf.iloc[:, 1].max(),
                        50)
eff_pf = efficient_frontier(mean_returns, cov_matrix, ret_range, short=True)
eff_pf = pd.Series(dict(zip([p['fun'] for p in eff_pf], ret_range)))

Plot Result

The simulation yields a subset of the feasible portfolios, while the efficient frontier identifies the optimal in-sample risk-return combinations that were achievable given the historical data.

The figure below shows the result, including the minimum-volatility portfolio, the portfolio that maximizes the SR, and several portfolios produced by alternative strategies (Kelly-weighted, volatility-weighted "risk parity", and equal-weighted 1/n portfolios), together with the efficient frontier.

fig, ax = plt.subplots()
simul_perf.plot.scatter(x=0, y=1, c=2, ax=ax, cmap='Blues', alpha=0.25,
                        figsize=(14, 9), colorbar=True)

eff_pf[eff_pf.index.min():].plot(linestyle='--', lw=2, ax=ax, c='k',
                                 label='Efficient Frontier')

r, sd = max_sharpe_perf
ax.scatter(sd, r, marker='*', color='k', s=500, label='Max Sharpe Ratio PF')

r, sd = min_vol_perf
ax.scatter(sd, r, marker='v', color='k', s=200, label='Min Volatility PF')

kelly_wt = precision_matrix.dot(mean_returns).clip(lower=0).values
kelly_wt /= np.sum(np.abs(kelly_wt))
r, sd = portfolio_performance(kelly_wt, mean_returns, cov_matrix)
ax.scatter(sd, r, marker='D', color='k', s=150, label='Kelly PF')

std = weekly_returns.std()
std /= std.sum()
r, sd = portfolio_performance(std, mean_returns, cov_matrix)
ax.scatter(sd, r, marker='X', color='k', s=250, label='Risk Parity PF')

r, sd = portfolio_performance(np.full(n_assets, 1 / n_assets), mean_returns, cov_matrix)
ax.scatter(sd, r, marker='o', color='k', s=200, label='1/n PF')

ax.legend(labelspacing=0.8)
ax.set_xlim(0, eff_pf.max() + .4)
ax.set_title('Mean-Variance Efficient Frontier', fontsize=16)
ax.yaxis.set_major_formatter(FuncFormatter(lambda y, _: '{:.0%}'.format(y)))
ax.xaxis.set_major_formatter(FuncFormatter(lambda y, _: '{:.0%}'.format(y)))
sns.despine()
fig.tight_layout();
[Figure: Mean-Variance Efficient Frontier, with simulated portfolios and the Max Sharpe Ratio, Min Volatility, Kelly, Risk Parity, and 1/n portfolios]