
Mean variance frontier with market data

**Camilo A. Garcia Trillos 2020**

In this notebook

We revisit the mean-variance frontier construction, this time using market data. We use a sub-sample of data, belonging to a selection of stocks in the market.

%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.stats as stats
from math import ceil

Data cleaning

We start by reading in data

data = pd.read_csv('~/Data/basket_20171201_10y.csv')
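The path above refers to a local copy of the file; if the notebook runs in a different environment, a minimal defensive variant (assuming the same filename) fails with a clearer message when the file is missing:

from pathlib import Path

csv_path = Path('~/Data/basket_20171201_10y.csv').expanduser()
if not csv_path.exists():                      # adjust csv_path to wherever the data is stored
    raise FileNotFoundError(f'Place basket_20171201_10y.csv under {csv_path.parent}')
data = pd.read_csv(csv_path)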

This corresponds to 10 years of daily information for a selection of assets.

First of all, we need to summarise, understand, visualize and clean the data. Let us look first at the different entries and their types:

data.dtypes
ticker          object
date            object
open           float64
high           float64
low            float64
close          float64
volume         float64
ex-dividend    float64
split_ratio    float64
adj_open       float64
adj_high       float64
adj_low        float64
adj_close      float64
adj_volume     float64
dtype: object

Note that once again the 'date' field is not recognised as a date but as a generic object. Let us have a look at it before deciding how to transform it.

data.head()

It does look like a date with the format 'yyyy-mm-dd'. We can therefore transform this field and update it.

data['date']=pd.to_datetime(data['date'], yearfirst=True)
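Equivalently, the expected format can be passed explicitly, which is stricter and faster on large files; the format string below assumes the 'yyyy-mm-dd' layout seen above:

data['date'] = pd.to_datetime(data['date'], format='%Y-%m-%d')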

Let us check that it looks the same, but that the field is now a date

data.head()
data.dtypes
ticker                 object
date           datetime64[ns]
open                  float64
high                  float64
low                   float64
close                 float64
volume                float64
ex-dividend           float64
split_ratio           float64
adj_open              float64
adj_high              float64
adj_low               float64
adj_close             float64
adj_volume            float64
dtype: object

Let us check whether the entries are unique per date.

print( len(np.unique(data['date'])), len(data['date']))
2749 24741

Not equal... because there are several tickers for each date. This time the index should therefore be a combination of ticker and date. However, since from now on we only work with the adjusted close price ('adj_close'), an alternative is to "pivot" the table, so that 'ticker', which identifies the asset, becomes a column label.
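For reference, the MultiIndex route would look roughly as follows (a sketch, keeping all price fields rather than just the adjusted close):

data_multi = data.set_index(['ticker', 'date']).sort_index()   # one row per (ticker, date) pair
data_multi.loc['MSFT'].head()                                   # e.g. all fields for Microsoft

Since we only need the adjusted close here, we pivot instead: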

P = data.pivot(index='date', columns='ticker', values='adj_close')
P.head()

The dataframe 'P' only contains the adjusted close price, per date, for each of the companies in the database, namely BAC (Bank of America), C (Citigroup), GE (General Electric), JNJ (Johnson & Johnson), MSFT (Microsoft), PG (Procter & Gamble), T (AT&T), WMT (Walmart), and XOM (Exxon Mobil). All of these are large companies from different economic sectors, including finance, technology, energy, retail and pharmaceuticals.

Let us check that we did not lose any data when creating our pivot table.

P.isnull().any()
ticker
BAC     False
C       False
GE      False
JNJ     False
MSFT    False
PG      False
T       False
WMT     False
XOM     False
dtype: bool

No null data, so all fields are complete. Now, let us check that the data makes sense visually.

P.plot()
<AxesSubplot:xlabel='date'>
Image in a Jupyter notebook

That 'C' line looks suspicious... until we realise it belongs to Citigroup: a huge drop between 2008 and 2010 is not so strange if we recall the financial crisis of the time.

Efficient frontier calculation

We are now going to build the efficient frontier in this reduced portfolio.

  • Using one week as the period, find the mean-variance frontier, assuming that these are all the stocks in the market and using only information before 2012-01-01.

  • We find the minimal variance portfolio (note that there is no risk-free asset).

Building the log-returns

Let us now resample using only weekly data...

P = P.resample('W-FRI').last()
P.head()
type(P)
pandas.core.frame.DataFrame

Replot to check we did not do anything wrong

P.plot()
<AxesSubplot:xlabel='date'>
Image in a Jupyter notebook

Good. Now, let us calculate both the returns and the log-returns.

logR = np.log(P).diff()
logR.drop(logR.index[0], inplace=True)   # drop the first row, which is NaN
R = P.pct_change()                       # simple (net) returns, used in the later analysis
R.drop(R.index[0], inplace=True)
logR.head()
logR.head()
logR.plot()
<AxesSubplot:xlabel='date'>
Image in a Jupyter notebook

Remember that negative log-returns correspond to gross returns lower than 1 (i.e. effective losses).
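A quick numerical check of this statement, recovering the gross returns from the log-returns:

print(np.log(0.95))        # ≈ -0.0513: a 5% weekly loss gives a negative log-return
np.exp(logR).head()        # gross returns P_t / P_{t-1}, all positive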

Stationarity check

How stationary are the data? Let us look first at the log-returns. I will simply plot the rolling mean and standard deviation with a window of 20 observations.

logR.rolling(20).mean().plot(alpha=0.5)
plt.title('Log returns')
Text(0.5, 1.0, 'Log returns')
Image in a Jupyter notebook

These values seem to be significantly bigger initially. What about standard deviation?

logR.rolling(20).std().plot(alpha=0.5)
<AxesSubplot:xlabel='date'>
Image in a Jupyter notebook

It looks like there is a significant "change of regime" around 2010, associated of course with the aftermath of the financial crisis. We are going to construct the mean-variance frontier first using the entire history up to the end of 2011, and then compare with the result obtained using only the data from 2010 to the end of 2011.

Note: In class we worked with mean-variance analysis of returns. It is often convenient to work with log-returns instead: on the one hand, log-returns are usually closer to stationary; on the other hand, they tend to look closer to Gaussian.
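An informal way to eyeball the Gaussianity claim for one ticker, using the scipy.stats module imported at the top (MSFT chosen arbitrarily):

stats.probplot(logR['MSFT'], dist='norm', plot=plt)
plt.title('QQ plot of MSFT weekly log-returns')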

Mean-variance frontier construction

We start by calculating the mean and covariance matrix over the selected history for the first case.

logR_sub = logR[logR.index < '2012-01-01']
mu = logR_sub.mean()
Sigma = logR_sub.cov()
Rho = logR_sub.corr()
print('Mean:', mu)
print('Variances', np.diag(Sigma))
print('Var-cov:', Sigma)
print('Correlation', Rho)
Mean: ticker BAC -0.008158 C -0.011287 GE -0.002101 JNJ 0.000542 MSFT -0.000119 PG 0.000720 T 0.000589 WMT 0.001307 XOM 0.000983 dtype: float64 Variances [0.01065754 0.01643106 0.00302104 0.00060041 0.00164302 0.00067786 0.00114917 0.0008024 0.00119484] Var-cov: ticker BAC C GE JNJ MSFT PG T \ ticker BAC 0.010658 0.010821 0.003996 0.001206 0.001436 0.001176 0.001680 C 0.010821 0.016431 0.004706 0.001277 0.001674 0.001060 0.002169 GE 0.003996 0.004706 0.003021 0.000551 0.000855 0.000614 0.000794 JNJ 0.001206 0.001277 0.000551 0.000600 0.000493 0.000410 0.000463 MSFT 0.001436 0.001674 0.000855 0.000493 0.001643 0.000460 0.000587 PG 0.001176 0.001060 0.000614 0.000410 0.000460 0.000678 0.000492 T 0.001680 0.002169 0.000794 0.000463 0.000587 0.000492 0.001149 WMT 0.001000 0.000953 0.000495 0.000375 0.000495 0.000437 0.000505 XOM 0.001490 0.001562 0.000762 0.000490 0.000745 0.000483 0.000675 ticker WMT XOM ticker BAC 0.001000 0.001490 C 0.000953 0.001562 GE 0.000495 0.000762 JNJ 0.000375 0.000490 MSFT 0.000495 0.000745 PG 0.000437 0.000483 T 0.000505 0.000675 WMT 0.000802 0.000479 XOM 0.000479 0.001195 Correlation ticker BAC C GE JNJ MSFT PG T \ ticker BAC 1.000000 0.817741 0.704303 0.476717 0.343086 0.437369 0.480161 C 0.817741 1.000000 0.668015 0.406486 0.322244 0.317525 0.499051 GE 0.704303 0.668015 1.000000 0.409441 0.383542 0.428870 0.426370 JNJ 0.476717 0.406486 0.409441 1.000000 0.496616 0.642216 0.557587 MSFT 0.343086 0.322244 0.383542 0.496616 1.000000 0.435775 0.427469 PG 0.437369 0.317525 0.428870 0.642216 0.435775 1.000000 0.557387 T 0.480161 0.499051 0.426370 0.557587 0.427469 0.557387 1.000000 WMT 0.341866 0.262506 0.318131 0.540748 0.431313 0.592972 0.526187 XOM 0.417678 0.352628 0.401172 0.579001 0.531398 0.536973 0.575743 ticker WMT XOM ticker BAC 0.341866 0.417678 C 0.262506 0.352628 GE 0.318131 0.401172 JNJ 0.540748 0.579001 MSFT 0.431313 0.531398 PG 0.592972 0.536973 T 0.526187 0.575743 WMT 1.000000 0.489380 XOM 0.489380 1.000000

Correlations are positive, as expected... and the means seem reasonable.

Before calculating the optimal portfolio, we transform the variable Sigma into a numpy array and calculate its condition number to assess the stability of its inverse.

Sigma = Sigma.values
np.linalg.cond(Sigma)
128.7153892613439

It is a large number, but still OK: we do not need to stabilise the inverse.
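Had the condition number been much larger, one simple (illustrative) fix would be to shrink the covariance towards its diagonal before inverting; delta below is a hypothetical tuning parameter:

delta = 0.1                                                     # shrinkage intensity (illustrative)
Sigma_shrunk = (1 - delta) * Sigma + delta * np.diag(np.diag(Sigma))
print(np.linalg.cond(Sigma_shrunk))                             # typically a smaller condition number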

Now, we are going to re-define the Markowitz weights function to find the mean-variance frontier.
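The function below implements a two-fund characterisation of the frontier; written out, the weights it computes are

$$\pi_\mu = \frac{\Sigma^{-1}\mu}{\mathbf{1}^\top\Sigma^{-1}\mu},\qquad \pi_1 = \frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}^\top\Sigma^{-1}\mathbf{1}},\qquad \lambda(\mu_p) = \frac{\mu_p - \mu^\top\pi_1}{\mu^\top\pi_\mu - \mu^\top\pi_1},$$

and the frontier portfolio with target mean $\mu_p$ is $\pi(\mu_p) = \lambda(\mu_p)\,\pi_\mu + \bigl(1-\lambda(\mu_p)\bigr)\,\pi_1$.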

def Markowitz_weights(Sigma, mean_returns, mu_p):
    S_inv = np.linalg.inv(Sigma)
    pi_mu = S_inv @ mean_returns / sum(S_inv @ mean_returns)
    pi_1 = np.sum(S_inv, axis=1) / sum(np.sum(S_inv, axis=1))
    lambda_denominator = (mean_returns @ pi_mu) - (mean_returns @ pi_1)
    ll = np.array((mu_p - (mean_returns @ pi_1)) / lambda_denominator)
    # previous line: convert into an array in case mu_p is a single number
    ll.shape = (ll.size, 1)
    return pi_mu * ll + pi_1 * (1 - ll)
# Using the function: Sigma was converted into a numpy array above; mu is still a pandas Series, which also works with @
print(mu)
print('======')
#mup = np.linspace(0.999,1.003)
mup = np.linspace(-0.002, 0.004)
mvf_portfolios = Markowitz_weights(Sigma, mu, mup)
print(mvf_portfolios[0:3])
ticker BAC -0.008158 C -0.011287 GE -0.002101 JNJ 0.000542 MSFT -0.000119 PG 0.000720 T 0.000589 WMT 0.001307 XOM 0.000983 dtype: float64 ====== [[ 0.04274296 0.15931801 0.02639219 0.59758119 0.20169459 0.41534182 -0.14901258 -0.06063188 -0.23342629] [ 0.03794259 0.15316594 0.02884724 0.59361264 0.19470381 0.40944844 -0.14098252 -0.05195897 -0.22477917] [ 0.03314222 0.14701386 0.03130228 0.5896441 0.18771302 0.40355506 -0.13295245 -0.04328606 -0.21613205]]

Let us calculate the means and standard deviations of those portfolios.
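For each row $\pi$ of the matrix of frontier portfolios, the cell below evaluates

$$\mu_\pi = \pi^\top \mu, \qquad \sigma_\pi = \sqrt{\pi^\top \Sigma\, \pi}.$$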

mu_mvf = mvf_portfolios @ mu
sd_mvf = np.diag(((mvf_portfolios @ Sigma) @ mvf_portfolios.T))**0.5

We can also calculate the (quasi-)minimal variance portfolio.
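Recall that, with no risk-free asset, the minimum-variance portfolio solves $\min_\pi \pi^\top\Sigma\pi$ subject to $\mathbf{1}^\top\pi = 1$, whose solution is

$$\pi_{\min} = \frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}^\top\Sigma^{-1}\mathbf{1}},$$

which is exactly what the next cell computes (up to the numerical error in inverting $\Sigma$, hence "quasi-").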

sigma_inv = np.linalg.inv(Sigma)
ones_vec = np.ones(Sigma.shape[0])
min_var_p = sigma_inv @ ones_vec
min_var_p /= min_var_p.sum()
plt.bar(R.columns, min_var_p)
print(R.columns)
print(min_var_p)
Index(['BAC', 'C', 'GE', 'JNJ', 'MSFT', 'PG', 'T', 'WMT', 'XOM'], dtype='object', name='ticker') [-0.09416917 -0.01614631 0.0964129 0.48439373 0.00230924 0.24725583 0.08001433 0.18672968 0.01319977]
Image in a Jupyter notebook

So the minimal variance portfolio invests mostly in health care and consumer staples (Johnson & Johnson, Procter & Gamble, and Walmart). Note that this is still a risky strategy, because it is too concentrated on one type of asset.

Let us check that everything is correct. We plot our frontier and the minimum variance portfolio. To do so, we generate a bunch of random portfolios to visually verify that our calculated values are right.

#Generating random portfolios
random_portfolios = 2.*np.random.rand(1000, Sigma.shape[0]) - 1
random_portfolios /= random_portfolios.sum(axis=1)[..., np.newaxis]
rand_mean = np.dot(random_portfolios, mu)
rand_sd = np.diag((random_portfolios @ Sigma) @ random_portfolios.T)**0.5

#Calculating mean and variance of the minimal variance portfolio
min_var_mean = np.dot(min_var_p, mu)
min_var_sd = min_var_p.dot(Sigma).dot(min_var_p.T)**0.5
print(min_var_sd, min_var_mean)

max_sd_plot = 0.06

#Making the plot
plt.plot(rand_sd[rand_sd < max_sd_plot], rand_mean[rand_sd < max_sd_plot], 'b.')
plt.plot(sd_mvf, mu_mvf, 'go')
plt.plot(min_var_sd, min_var_mean, 'rx')
0.02063705581048089 0.001492388376839055
[<matplotlib.lines.Line2D at 0x7f2c334cfd00>]
Image in a Jupyter notebook

Stability in time

What happens if we consider only recent history? If we had perfect stationarity, the optimum should still be very close to the one we found in the previous part. Let us make some calculations.

logR_sub2 = logR[(logR.index < '2012-01-01') & (logR.index > '2010-01-01')]
mu2 = logR_sub2.mean().values
Sigma2 = logR_sub2.cov().values
Rho2 = logR_sub2.corr().values
print('Mean:', mu2)
print('Variances', np.diag(Sigma2))
print('Var-cov:', Sigma2)
print('Correlation', Rho2)

mvf_portfolios2 = Markowitz_weights(Sigma2, mu2, mup)
mu2_mvf = mvf_portfolios2 @ mu2
sd2_mvf = np.diag(((mvf_portfolios2 @ Sigma2) @ mvf_portfolios2.T))**0.5

print(np.linalg.cond(Sigma2))

sigma2_inv = np.linalg.inv(Sigma2)
ones_vec = np.ones(Sigma2.shape[0])
min_var_p2 = sigma2_inv @ ones_vec
min_var_p2 /= min_var_p2.sum()
plt.bar(R.columns, min_var_p2)
print(R.columns)
print(min_var_p2)
Mean: [-0.00950856 -0.00219951 0.00220996 0.00085082 -0.00109434 0.00151515 0.00188769 0.00155423 0.00257817] Variances [0.00311498 0.00377738 0.00168709 0.00036972 0.00095914 0.00035433 0.00046654 0.00036697 0.00091758] Var-cov: [[0.00311498 0.00284224 0.00164264 0.00047208 0.00086103 0.00042768 0.00067589 0.00043497 0.00107392] [0.00284224 0.00377738 0.00180427 0.00052624 0.00108874 0.00050054 0.00083692 0.00047916 0.001286 ] [0.00164264 0.00180427 0.00168709 0.00041486 0.00086875 0.00044694 0.00058337 0.00046338 0.00093544] [0.00047208 0.00052624 0.00041486 0.00036972 0.00028924 0.00019963 0.00021918 0.00016211 0.00036518] [0.00086103 0.00108874 0.00086875 0.00028924 0.00095914 0.00029394 0.00037959 0.00031198 0.00062288] [0.00042768 0.00050054 0.00044694 0.00019963 0.00029394 0.00035433 0.00023336 0.00021372 0.00033215] [0.00067589 0.00083692 0.00058337 0.00021918 0.00037959 0.00023336 0.00046654 0.00022388 0.00044079] [0.00043497 0.00047916 0.00046338 0.00016211 0.00031198 0.00021372 0.00022388 0.00036697 0.00031368] [0.00107392 0.001286 0.00093544 0.00036518 0.00062288 0.00033215 0.00044079 0.00031368 0.00091758]] Correlation [[1. 0.82858805 0.71655018 0.43989882 0.49813507 0.40708948 0.56066372 0.40683055 0.63521888] [0.82858805 1. 0.71472261 0.44530459 0.57199022 0.43265766 0.63044506 0.40697403 0.6907519 ] [0.71655018 0.71472261 1. 0.52529064 0.68294018 0.57806134 0.65755684 0.58891775 0.75184149] [0.43989882 0.44530459 0.52529064 1. 0.48572049 0.55154699 0.52774308 0.44010584 0.62698097] [0.49813507 0.57199022 0.68294018 0.48572049 1. 0.50421683 0.56745031 0.52585777 0.66395411] [0.40708948 0.43265766 0.57806134 0.55154699 0.50421683 1. 0.57396207 0.59268311 0.58251898] [0.56066372 0.63044506 0.65755684 0.52774308 0.56745031 0.57396207 1. 0.54107775 0.6736912 ] [0.40683055 0.40697403 0.58891775 0.44010584 0.52585777 0.59268311 0.54107775 1. 0.54057532] [0.63521888 0.6907519 0.75184149 0.62698097 0.66395411 0.58251898 0.6736912 0.54057532 1. ]] 66.18958930410454 Index(['BAC', 'C', 'GE', 'JNJ', 'MSFT', 'PG', 'T', 'WMT', 'XOM'], dtype='object', name='ticker') [ 0.02480495 -0.06254287 -0.17280838 0.35353751 0.04500131 0.26564107 0.27430629 0.34214459 -0.07008448]
Image in a Jupyter notebook

The first big change is that in the minimal variance portfolio there is no longer room for a positive position in GE or Exxon Mobil. On the other hand, Bank of America makes a comeback. Note also the reduction in exposure to JNJ.

#Generating random portfolios
random_portfolios2 = 2.*np.random.rand(1000, Sigma2.shape[0]) - 1
random_portfolios2 /= random_portfolios2.sum(axis=1)[..., np.newaxis]
rand_mean2 = np.dot(random_portfolios2, mu2)
rand_sd2 = np.diag((random_portfolios2 @ Sigma2) @ random_portfolios2.T)**0.5

#Calculating mean and variance of the minimal variance portfolio
min_var_mean2 = np.dot(min_var_p2, mu2)
min_var_sd2 = min_var_p2.dot(Sigma2).dot(min_var_p2.T)**0.5   # use min_var_p2 throughout
print(min_var_sd2, min_var_mean2)

max_sd_plot = 0.06

#Making the plot
plt.plot(rand_sd2[rand_sd2 < max_sd_plot], rand_mean2[rand_sd2 < max_sd_plot], 'b.')
plt.plot(sd2_mvf, mu2_mvf, 'go')
plt.plot(min_var_sd2, min_var_mean2, 'rx')
0.013923096082564245 0.0010427265331609799
[<matplotlib.lines.Line2D at 0x7f2c3342a2e0>]
Image in a Jupyter notebook

Note that with this data, the mean-variance frontier achieves the same returns with a much smaller variance.

How does the efficient portfolio calculated with the data of the first part do on this second part?

#Making the plot
mu1in2_mvf = mvf_portfolios @ mu2
sd1in2_mvf = np.diag(((mvf_portfolios @ Sigma2) @ mvf_portfolios.T))**0.5
plt.plot(rand_sd2[rand_sd2 < max_sd_plot], rand_mean2[rand_sd2 < max_sd_plot], 'b.')
plt.plot(sd2_mvf, mu2_mvf, 'go')
plt.plot(sd1in2_mvf, mu1in2_mvf, 'y*')
plt.plot(min_var_sd2, min_var_mean2, 'rx')
[<matplotlib.lines.Line2D at 0x7f2c33382c40>]
Image in a Jupyter notebook

We can see that the efficient part of the frontier is not doing too badly.

Let us move forward in time a couple of years and evaluate how these two portfolios would have done in the period from 2015 onwards.

R_sub3 = R[(R.index > '2015-01-01')]
mu3 = R_sub3.mean().values
Sigma3 = R_sub3.cov().values
Rho3 = R_sub3.corr().values
print('Mean:', mu3)
print('Variances', np.diag(Sigma3))
print('Var-cov:', Sigma3)
print('Correlation', Rho3)

mvf_portfolios3 = Markowitz_weights(Sigma3, mu3, mup)
mu3_mvf = mvf_portfolios3 @ mu3
sd3_mvf = np.diag(((mvf_portfolios3 @ Sigma3) @ mvf_portfolios3.T))**0.5

print(np.linalg.cond(Sigma3))

sigma3_inv = np.linalg.inv(Sigma3)
ones_vec = np.ones(Sigma3.shape[0])
min_var_p3 = sigma3_inv @ ones_vec
min_var_p3 /= min_var_p3.sum()
plt.bar(R.columns, min_var_p3)
print(R.columns)
print(min_var_p3)
Mean: [ 3.17089980e-03 2.31414528e-03 -1.79822463e-03 2.41587099e-03 4.18994786e-03 4.04470847e-04 1.44399790e-03 1.23922279e-03 -2.76520721e-05] Variances [0.00148144 0.00116645 0.00096945 0.00032701 0.00095902 0.00035725 0.00046562 0.00068835 0.00048457] Var-cov: [[1.48144347e-03 1.16573124e-03 4.35206897e-04 1.81572602e-04 4.49598304e-04 7.74107987e-05 1.37466132e-04 1.28871207e-04 3.04613149e-04] [1.16573124e-03 1.16644598e-03 4.71421081e-04 1.88652743e-04 4.15244861e-04 1.25625235e-04 1.65589075e-04 7.98007726e-05 2.87273708e-04] [4.35206897e-04 4.71421081e-04 9.69450039e-04 2.12555959e-04 2.65003065e-04 1.66592903e-04 2.40478405e-04 8.25903973e-06 2.70597336e-04] [1.81572602e-04 1.88652743e-04 2.12555959e-04 3.27014030e-04 2.42208339e-04 1.52696242e-04 1.02605747e-04 1.18529988e-04 1.36443676e-04] [4.49598304e-04 4.15244861e-04 2.65003065e-04 2.42208339e-04 9.59021380e-04 2.65334278e-04 1.44848437e-04 2.23905822e-04 2.17360718e-04] [7.74107987e-05 1.25625235e-04 1.66592903e-04 1.52696242e-04 2.65334278e-04 3.57253412e-04 1.35743174e-04 1.44619226e-04 1.20544663e-04] [1.37466132e-04 1.65589075e-04 2.40478405e-04 1.02605747e-04 1.44848437e-04 1.35743174e-04 4.65622780e-04 6.59445138e-05 1.44725823e-04] [1.28871207e-04 7.98007726e-05 8.25903973e-06 1.18529988e-04 2.23905822e-04 1.44619226e-04 6.59445138e-05 6.88354591e-04 7.13161133e-05] [3.04613149e-04 2.87273708e-04 2.70597336e-04 1.36443676e-04 2.17360718e-04 1.20544663e-04 1.44725823e-04 7.13161133e-05 4.84568573e-04]] Correlation [[1. 0.8867957 0.36315373 0.26087027 0.37719678 0.10640715 0.16551451 0.12761662 0.35952449] [0.8867957 1. 0.44331657 0.30545526 0.39260679 0.19460607 0.22468923 0.08905713 0.38210783] [0.36315373 0.44331657 1. 0.37750906 0.27483626 0.28307765 0.35792848 0.01011022 0.39480549] [0.26087027 0.30545526 0.37750906 1. 0.43250562 0.44674219 0.26294892 0.24982675 0.34276194] [0.37719678 0.39260679 0.27483626 0.43250562 1. 0.45330554 0.21676171 0.27557838 0.31885211] [0.10640715 0.19460607 0.28307765 0.44674219 0.45330554 1. 0.33282251 0.29162969 0.28972246] [0.16551451 0.22468923 0.35792848 0.26294892 0.21676171 0.33282251 1. 0.1164811 0.30468516] [0.12761662 0.08905713 0.01011022 0.24982675 0.27557838 0.29162969 0.1164811 1. 0.12348208] [0.35952449 0.38210783 0.39480549 0.34276194 0.31885211 0.28972246 0.30468516 0.12348208 1. ]] 24.900558004055362 Index(['BAC', 'C', 'GE', 'JNJ', 'MSFT', 'PG', 'T', 'WMT', 'XOM'], dtype='object', name='ticker') [ 0.03548436 -0.00513189 -0.0128339 0.29949406 -0.06414959 0.23864434 0.20060666 0.14062327 0.16726271]
Image in a Jupyter notebook
#Generating random portfolios
random_portfolios3 = 2.*np.random.rand(1000, Sigma3.shape[0]) - 1
random_portfolios3 /= random_portfolios3.sum(axis=1)[..., np.newaxis]
rand_mean3 = np.dot(random_portfolios3, mu3)
rand_sd3 = np.diag((random_portfolios3 @ Sigma3) @ random_portfolios3.T)**0.5   # Sigma3, not Sigma2

#Calculating mean and variance of the minimal variance portfolio
min_var_mean3 = np.dot(min_var_p3, mu3)
min_var_sd3 = min_var_p3.dot(Sigma3).dot(min_var_p3.T)**0.5
print(min_var_sd3, min_var_mean3)

max_sd_plot = 0.06

#Calculating the results of the previous optimal portfolios under the new data
mu2in3_mvf = mvf_portfolios2 @ mu3
sd2in3_mvf = np.diag(((mvf_portfolios2 @ Sigma3) @ mvf_portfolios2.T))**0.5
mu1in3_mvf = mvf_portfolios @ mu3
sd1in3_mvf = np.diag(((mvf_portfolios @ Sigma3) @ mvf_portfolios.T))**0.5

#Making the plot
plt.plot(rand_sd3[rand_sd3 < max_sd_plot], rand_mean3[rand_sd3 < max_sd_plot], 'b.')
plt.plot(sd3_mvf, mu3_mvf, 'go')
plt.plot(min_var_sd3, min_var_mean3, 'rx')
plt.plot(sd1in3_mvf, mu1in3_mvf, '*', color='orange')
plt.plot(sd2in3_mvf, mu2in3_mvf, 'y*')
0.013478190320770416 0.0011343138763670057
[<matplotlib.lines.Line2D at 0x7f2c332e3e50>]
Image in a Jupyter notebook

Note that the efficient portfolios calculated with the first dataset (all data before 2012) are underperforming (even if they are not that bad). This shows that periodic recalculation is needed. If trading costs are significant, this might limit the extent to which truly efficient portfolios can be maintained over time.
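As a rough illustration of that last point, one could quantify the rebalancing needed to move from the first minimum-variance portfolio to the second, under a hypothetical proportional cost rate (the 10 bp figure below is purely illustrative):

turnover = np.abs(min_var_p2 - min_var_p).sum()   # total absolute weight traded when switching portfolios
cost_rate = 0.001                                 # hypothetical 10 bp proportional transaction cost
print(turnover, turnover * cost_rate)             # turnover and the implied cost per unit of wealth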