GitHub Repository: packtpublishing/machine-learning-for-algorithmic-trading-second-edition
Path: blob/master/05_strategy_evaluation/03_pyfolio_demo.ipynb
Kernel: Python 3

From zipline to pyfolio

Pyfolio facilitates the analysis of portfolio performance and risk in-sample and out-of-sample using many standard metrics. It produces tear sheets covering the analysis of returns, positions, and transactions, as well as event risk during periods of market stress using several built-in scenarios, and also includes Bayesian out-of-sample performance analysis.

  • Open-source portfolio performance and risk analytics library by Quantopian Inc.

  • Powers Quantopian.com

  • State-of-the-art portfolio and risk analytics

  • Various models for transaction costs and slippage.

  • Open source and free: Apache v2 license

  • Can be used:

    • stand alone

    • with Zipline

    • on Quantopian

Imports & Settings

import warnings
warnings.filterwarnings('ignore')

%matplotlib inline

from pathlib import Path

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

from pyfolio.utils import extract_rets_pos_txn_from_zipline
from pyfolio.plotting import (plot_perf_stats,
                              show_perf_stats,
                              plot_rolling_beta,
                              plot_rolling_returns,
                              plot_rolling_sharpe,
                              plot_drawdown_periods,
                              plot_drawdown_underwater)
from pyfolio.timeseries import perf_stats, extract_interesting_date_ranges

sns.set_style('whitegrid')

Converting data from zipline to pyfolio

with pd.HDFStore('backtests.h5') as store:
    backtest = store['backtest/equal_weight']
backtest.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1008 entries, 2013-01-02 00:00:00+00:00 to 2016-12-30 00:00:00+00:00
Data columns (total 39 columns):
 #   Column                   Non-Null Count  Dtype
---  ------                   --------------  -----
 0   period_open              1008 non-null   datetime64[ns, UTC]
 1   period_close             1008 non-null   datetime64[ns, UTC]
 2   starting_cash            1008 non-null   float64
 3   ending_cash              1008 non-null   float64
 4   portfolio_value          1008 non-null   float64
 5   returns                  1008 non-null   float64
 6   longs_count              1008 non-null   int64
 7   shorts_count             1008 non-null   int64
 8   long_value               1008 non-null   float64
 9   short_value              1008 non-null   float64
 10  long_exposure            1008 non-null   float64
 11  pnl                      1008 non-null   float64
 12  short_exposure           1008 non-null   float64
 13  capital_used             1008 non-null   float64
 14  orders                   1008 non-null   object
 15  transactions             1008 non-null   object
 16  gross_leverage           1008 non-null   float64
 17  positions                1008 non-null   object
 18  net_leverage             1008 non-null   float64
 19  starting_exposure        1008 non-null   float64
 20  ending_exposure          1008 non-null   float64
 21  starting_value           1008 non-null   float64
 22  ending_value             1008 non-null   float64
 23  factor_data              1008 non-null   object
 24  prices                   1008 non-null   object
 25  treasury_period_return   1008 non-null   float64
 26  trading_days             1008 non-null   int64
 27  period_label             1008 non-null   object
 28  algorithm_period_return  1008 non-null   float64
 29  algo_volatility          1007 non-null   float64
 30  benchmark_period_return  1008 non-null   float64
 31  benchmark_volatility     1007 non-null   float64
 32  alpha                    0 non-null      object
 33  beta                     0 non-null      object
 34  sharpe                   1004 non-null   float64
 35  sortino                  1004 non-null   float64
 36  max_drawdown             1008 non-null   float64
 37  max_leverage             1008 non-null   float64
 38  excess_return            1008 non-null   float64
dtypes: datetime64[ns, UTC](2), float64(26), int64(3), object(8)
memory usage: 315.0+ KB

pyfolio relies on portfolio returns and position data and can also take into account the transaction costs and slippage losses of trading activity. The metrics are computed using the empyrical library, which can also be used on its own. The performance DataFrame produced by the zipline backtesting engine can be translated into the requisite pyfolio input.

returns, positions, transactions = extract_rets_pos_txn_from_zipline(backtest)
# Series.append was removed in pandas 2.0; pd.concat works across versions
pd.concat([returns.head(), returns.tail()])

2013-01-02 00:00:00+00:00    0.000000
2013-01-03 00:00:00+00:00    0.000000
2013-01-04 00:00:00+00:00    0.000000
2013-01-07 00:00:00+00:00    0.000000
2013-01-08 00:00:00+00:00   -0.000005
2016-12-23 00:00:00+00:00   -0.000233
2016-12-27 00:00:00+00:00    0.000160
2016-12-28 00:00:00+00:00   -0.000847
2016-12-29 00:00:00+00:00    0.000735
2016-12-30 00:00:00+00:00   -0.000606
Name: returns, dtype: float64
positions.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1004 entries, 2013-01-08 00:00:00+00:00 to 2016-12-30 00:00:00+00:00
Columns: 750 entries, Equity(0 [A]) to cash
dtypes: float64(750)
memory usage: 5.8 MB

positions.columns = [c for c in positions.columns[:-1]] + ['cash']
positions.index = positions.index.normalize()
positions.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1004 entries, 2013-01-08 00:00:00+00:00 to 2016-12-30 00:00:00+00:00
Columns: 750 entries, Equity(0 [A]) to cash
dtypes: float64(750)
memory usage: 5.8 MB
transactions.symbol = transactions.symbol.apply(lambda x: x.symbol)
# DataFrame.append was removed in pandas 2.0; pd.concat works across versions
pd.concat([transactions.head(), transactions.tail()])
HDF_PATH = Path('..', 'data', 'assets.h5')

Sector Map

assets = positions.columns[:-1]
with pd.HDFStore(HDF_PATH) as store:
    df = store.get('us_equities/stocks')['sector'].dropna()
    df = df[~df.index.duplicated()]
sector_map = df.reindex(assets).fillna('Unknown').to_dict()

Benchmark

with pd.HDFStore(HDF_PATH) as store:
    benchmark_rets = store['sp500/fred'].close.pct_change()
benchmark_rets.name = 'S&P500'
benchmark_rets = benchmark_rets.tz_localize('UTC').filter(returns.index)
benchmark_rets.tail()

DATE
2016-12-23 00:00:00+00:00    0.001252
2016-12-27 00:00:00+00:00    0.002248
2016-12-28 00:00:00+00:00   -0.008357
2016-12-29 00:00:00+00:00   -0.000293
2016-12-30 00:00:00+00:00   -0.004637
Name: S&P500, dtype: float64
perf_stats(returns=returns,
           factor_returns=benchmark_rets)
#          positions=positions,
#          transactions=transactions)

Annual return          0.019619
Cumulative returns     0.080817
Annual volatility      0.047487
Sharpe ratio           0.432879
Calmar ratio           0.336024
Stability              0.555919
Max drawdown          -0.058387
Omega ratio            1.085094
Sortino ratio          0.630497
Skew                   0.223701
Kurtosis               6.125539
Tail ratio             0.988875
Daily value at risk   -0.005901
Alpha                  0.005922
Beta                   0.121033
dtype: float64
fig, ax = plt.subplots(figsize=(14, 5))
plot_perf_stats(returns=returns,
                factor_returns=benchmark_rets,
                ax=ax)
sns.despine()
fig.tight_layout();
Image in a Jupyter notebook

Returns Analysis

Testing a trading strategy involves backtesting against historical data to fine-tune alpha factor parameters, as well as forward-testing against new market data to check whether the strategy performs well out of sample or whether its parameters are too closely tailored to specific historical circumstances.

Pyfolio allows for the designation of an out-of-sample period to simulate walk-forward testing. There are numerous aspects to take into account when testing a strategy to obtain statistically reliable results, which we will address here.

oos_date = '2016-01-01'
show_perf_stats(returns=returns,
                factor_returns=benchmark_rets,
                positions=positions,
                transactions=transactions,
                live_start_date=oos_date)
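As a rough illustration of what designating an out-of-sample period amounts to, the sketch below splits a daily return series at a cutoff date and compares the annualized return of each part. The helper names `split_returns` and `annual_return`, the 252-day year, and the synthetic data are assumptions made for this example; pyfolio's own implementation may differ in detail.

```python
import numpy as np
import pandas as pd


def split_returns(returns, live_start_date):
    """Split a daily return series into in-sample and out-of-sample parts."""
    live_start = pd.Timestamp(live_start_date)
    if returns.index.tz is not None:
        # Match the time zone of the index so the comparison is well defined
        live_start = live_start.tz_localize(returns.index.tz)
    in_sample = returns[returns.index < live_start]
    out_of_sample = returns[returns.index >= live_start]
    return in_sample, out_of_sample


def annual_return(returns, periods_per_year=252):
    """Compound annual growth rate implied by a series of periodic returns."""
    total_growth = (1 + returns).prod()
    return total_growth ** (periods_per_year / len(returns)) - 1


# Synthetic daily returns for illustration only (not the backtest results)
idx = pd.bdate_range('2015-01-01', '2016-12-30', tz='UTC')
rng = np.random.default_rng(42)
demo_returns = pd.Series(rng.normal(0.0003, 0.005, len(idx)), index=idx)

is_rets, oos_rets = split_returns(demo_returns, '2016-01-01')
print(f'in-sample:     {annual_return(is_rets):.2%} over {len(is_rets)} days')
print(f'out-of-sample: {annual_return(oos_rets):.2%} over {len(oos_rets)} days')
```

A large gap between the two figures is one informal sign that parameters may have been fitted too closely to the in-sample period.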

Rolling Returns OOS

The plot_rolling_returns function displays cumulative in-sample and out-of-sample returns against a user-defined benchmark (we are using the S&P 500):

plot_rolling_returns(returns=returns,
                     factor_returns=benchmark_rets,
                     live_start_date=oos_date,
                     cone_std=(1.0, 1.5, 2.0))
plt.gcf().set_size_inches(14, 8)
sns.despine()
plt.tight_layout();
Image in a Jupyter notebook

The plot includes a cone that shows expanding confidence intervals to indicate when out-of-sample returns appear unlikely given random-walk assumptions. Here, our strategy did not perform well against the benchmark during the simulated 2016 out-of-sample period.
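To make the idea of an expanding cone concrete, here is a simplified sketch under a random-walk assumption: if daily returns are independent with in-sample mean mu and standard deviation sigma, the cumulative (additive) return after t days has mean mu * t and standard deviation sigma * sqrt(t). The function name `random_walk_cone` and the synthetic data are hypothetical, and this is not pyfolio's implementation, which may estimate the cone differently.

```python
import numpy as np


def random_walk_cone(in_sample_returns, num_days, cone_std=(1.0, 1.5, 2.0)):
    """Expected cumulative-return path and +/- k-sigma bounds.

    Uses the additive approximation: after t days the cumulative return has
    mean mu * t and standard deviation sigma * sqrt(t).
    """
    r = np.asarray(in_sample_returns, dtype=float)
    mu, sigma = r.mean(), r.std(ddof=1)
    t = np.arange(1, num_days + 1)
    center = mu * t
    bounds = {k: (center - k * sigma * np.sqrt(t),
                  center + k * sigma * np.sqrt(t)) for k in cone_std}
    return center, bounds


rng = np.random.default_rng(0)
is_rets = rng.normal(0.0004, 0.003, 750)   # synthetic in-sample returns
oos_rets = rng.normal(0.0004, 0.003, 250)  # synthetic out-of-sample returns

center, bounds = random_walk_cone(is_rets, num_days=len(oos_rets))
lower, upper = bounds[2.0]
oos_cum = np.cumsum(oos_rets)  # additive cumulative return, matching the cone
outside = (oos_cum < lower) | (oos_cum > upper)
print(f'days outside the 2-sigma cone: {outside.sum()} of {len(oos_cum)}')
```

Out-of-sample paths that spend many days outside the widest band are the ones the cone is meant to flag as unlikely under these assumptions.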

Summary Performance Statistics

pyfolio offers several analytic functions and plots. The perf_stats summary displays the annual and cumulative returns, volatility, skew, and kurtosis of returns, and the Sharpe ratio (SR). The following additional metrics (which can also be calculated individually) are most important:

  • Max drawdown: Highest percentage loss from the previous peak

  • Calmar ratio: Annual portfolio return relative to maximal drawdown

  • Omega ratio: The probability-weighted ratio of gains versus losses for a return target, zero by default

  • Sortino ratio: Excess return relative to downside standard deviation

  • Tail ratio: Size of the right tail (gains, the absolute value of the 95th percentile) relative to the size of the left tail (losses, the absolute value of the 5th percentile)

  • Daily value at risk (VaR): Loss corresponding to a return two standard deviations below the daily mean

  • Alpha: Portfolio return unexplained by the benchmark return

  • Beta: Exposure to the benchmark
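The definitions above can be sketched directly with NumPy. These are simplified re-implementations for illustration, not the empyrical code: the function names, the 252-day annualization, and a zero risk-free rate and return target are assumptions of this example.

```python
import numpy as np


def max_drawdown(returns):
    """Largest decline of cumulative wealth from its previous peak (<= 0)."""
    wealth = np.cumprod(1 + np.asarray(returns, dtype=float))
    # The starting wealth of 1.0 counts as the first peak
    running_peak = np.maximum.accumulate(np.maximum(wealth, 1.0))
    return (wealth / running_peak - 1).min()


def annual_return(returns, periods=252):
    r = np.asarray(returns, dtype=float)
    return np.prod(1 + r) ** (periods / len(r)) - 1


def calmar_ratio(returns, periods=252):
    """Annual return divided by the absolute maximum drawdown."""
    mdd = max_drawdown(returns)
    return annual_return(returns, periods) / abs(mdd) if mdd < 0 else np.nan


def sortino_ratio(returns, target=0.0, periods=252):
    """Annualized excess return over annualized downside deviation."""
    r = np.asarray(returns, dtype=float) - target
    downside = np.sqrt(np.mean(np.minimum(r, 0.0) ** 2)) * np.sqrt(periods)
    return r.mean() * periods / downside if downside > 0 else np.nan


def tail_ratio(returns):
    """|95th percentile| relative to |5th percentile| of returns."""
    r = np.asarray(returns, dtype=float)
    return abs(np.percentile(r, 95)) / abs(np.percentile(r, 5))


def daily_value_at_risk(returns, sigma=2.0):
    """Return lying `sigma` standard deviations below the daily mean."""
    r = np.asarray(returns, dtype=float)
    return r.mean() - sigma * r.std(ddof=1)


rng = np.random.default_rng(3)
demo = rng.normal(0.0004, 0.008, 1000)  # synthetic daily returns
for name, fn in [('Max drawdown', max_drawdown),
                 ('Calmar ratio', calmar_ratio),
                 ('Sortino ratio', sortino_ratio),
                 ('Tail ratio', tail_ratio),
                 ('Daily value at risk', daily_value_at_risk)]:
    print(f'{name:>20}: {fn(demo):.4f}')
```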

Rolling Sharpe

plot_rolling_sharpe(returns=returns)
plt.gcf().set_size_inches(14, 8)
sns.despine()
plt.tight_layout();
Image in a Jupyter notebook
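A rolling Sharpe ratio of this kind can be approximated with pandas by dividing the trailing mean return by the trailing standard deviation and annualizing. The sketch below assumes a 126-day (roughly six-month) window, 252 trading days per year, and a zero risk-free rate; the function name and synthetic data are hypothetical, and the window pyfolio uses by default may differ.

```python
import numpy as np
import pandas as pd


def rolling_sharpe(returns, window=126, periods_per_year=252):
    """Annualized Sharpe ratio over a trailing window (risk-free rate = 0)."""
    rolling = returns.rolling(window)
    return rolling.mean() / rolling.std() * np.sqrt(periods_per_year)


# Synthetic daily returns for illustration only
idx = pd.bdate_range('2013-01-02', periods=504, tz='UTC')
rng = np.random.default_rng(1)
demo_returns = pd.Series(rng.normal(0.0005, 0.01, len(idx)), index=idx)

rs = rolling_sharpe(demo_returns)
print(rs.dropna().describe())
```

The first `window - 1` values are missing because a full window of observations is not yet available.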

Rolling Beta

plot_rolling_beta(returns=returns,
                  factor_returns=benchmark_rets)
plt.gcf().set_size_inches(14, 6)
sns.despine()
plt.tight_layout();
Image in a Jupyter notebook
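Rolling beta measures how the exposure to the benchmark changes over time. A minimal sketch, assuming beta is estimated as the trailing covariance with the benchmark divided by the trailing benchmark variance over a 126-day window (the function name, window length, and synthetic data are assumptions of this example):

```python
import numpy as np
import pandas as pd


def rolling_beta(returns, benchmark, window=126):
    """Trailing-window beta: cov(returns, benchmark) / var(benchmark)."""
    cov = returns.rolling(window).cov(benchmark)
    var = benchmark.rolling(window).var()
    return cov / var


# Synthetic data: a strategy with a true beta of 0.5 plus idiosyncratic noise
idx = pd.bdate_range('2013-01-02', periods=504, tz='UTC')
rng = np.random.default_rng(2)
bench = pd.Series(rng.normal(0.0004, 0.01, len(idx)), index=idx)
strategy = 0.5 * bench + pd.Series(rng.normal(0, 0.002, len(idx)), index=idx)

beta = rolling_beta(strategy, bench)
print(beta.dropna().describe())
```

With noisy data the estimate fluctuates around the true exposure, which is why the plotted line is rarely flat even for a strategy with constant beta.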

Drawdown Periods

The plot_drawdown_periods(returns) function plots the principal drawdown periods for the portfolio, and several other plotting functions show the rolling Sharpe ratio and rolling factor exposures to the market beta or the Fama-French size, growth, and momentum factors:

fig, ax = plt.subplots(nrows=2, ncols=2, figsize=(16, 10))
axes = ax.flatten()

plot_drawdown_periods(returns=returns, ax=axes[0])
plot_rolling_beta(returns=returns, factor_returns=benchmark_rets, ax=axes[1])
plot_drawdown_underwater(returns=returns, ax=axes[2])
# Pass the fourth panel explicitly instead of relying on the current axes
plot_rolling_sharpe(returns=returns, ax=axes[3])
sns.despine()
plt.tight_layout();
Image in a Jupyter notebook

This plot, which highlights a subset of the visualizations contained in the various tear sheets, illustrates how pyfolio allows us to drill down into the performance characteristics and exposure to fundamental drivers of risk and returns.
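The drawdown panels rest on a simple calculation: the underwater curve is the percentage distance of cumulative wealth from its running peak, and a drawdown period runs from a peak through the valley to the eventual recovery. The sketch below uses hypothetical helper names and a small hand-made return series, and it only locates the single worst drawdown rather than the several periods that the plot shows.

```python
import pandas as pd


def underwater_curve(returns):
    """Percentage distance of cumulative wealth from its running peak."""
    wealth = (1 + returns).cumprod()
    # The starting wealth of 1.0 counts as the first peak
    running_peak = wealth.cummax().clip(lower=1.0)
    return wealth / running_peak - 1


def worst_drawdown_period(returns):
    """Return (peak, valley, recovery) dates of the deepest drawdown."""
    underwater = underwater_curve(returns)
    valley = underwater.idxmin()
    before = underwater.loc[:valley]
    peaks = before[before == 0]
    # If the series starts below water, fall back to the first date
    peak = peaks.index[-1] if len(peaks) else underwater.index[0]
    after = underwater.loc[valley:]
    recovered = after[after == 0]
    recovery = recovered.index[0] if len(recovered) else None
    return peak, valley, recovery


# Small hand-made example: a dip over days 2-4 that recovers on day 5
idx = pd.date_range('2016-01-04', periods=6, freq='B')
demo = pd.Series([0.10, -0.05, -0.10, 0.05, 0.15, 0.01], index=idx)

peak, valley, recovery = worst_drawdown_period(demo)
print(f'peak: {peak.date()}, valley: {valley.date()}, recovery: {recovery.date()}')
print(f'depth: {underwater_curve(demo).min():.2%}')
```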

Modeling Event Risk

Pyfolio also includes timelines for various events that you can use to compare the performance of a portfolio to a benchmark during those periods, for example, during the fall 2015 market selloff.

interesting_times = extract_interesting_date_ranges(returns=returns)
# The 'Fall2015' window predates the June 2016 Brexit vote, so title it accordingly
(interesting_times['Fall2015']
 .to_frame('momentum_equal_weights').join(benchmark_rets)
 .add(1).cumprod().sub(1)
 .plot(lw=2, figsize=(14, 6), title='Fall 2015 Selloff'))
sns.despine()
plt.tight_layout();
Image in a Jupyter notebook
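The same comparison can be made for any custom date range without relying on the built-in event list. The sketch below compounds strategy and benchmark returns inside a window; the function name, the window dates, and the synthetic data are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd


def cumulative_returns_in_window(returns, benchmark, start, end):
    """Compound strategy and benchmark returns between start and end (inclusive)."""
    both = pd.concat([returns.rename('strategy'),
                      benchmark.rename('benchmark')], axis=1)
    window = both.loc[start:end]
    return window.add(1).cumprod().sub(1)


# Synthetic daily returns for illustration only
idx = pd.bdate_range('2015-07-01', '2015-12-31', tz='UTC')
rng = np.random.default_rng(7)
strategy = pd.Series(rng.normal(0.0002, 0.004, len(idx)), index=idx)
benchmark = pd.Series(rng.normal(0.0001, 0.010, len(idx)), index=idx)

start = pd.Timestamp('2015-08-15', tz='UTC')
end = pd.Timestamp('2015-09-30', tz='UTC')
event = cumulative_returns_in_window(strategy, benchmark, start, end)
print(event.tail(1))
```

Calling `event.plot()` on the result produces a chart comparable to the one above, with both lines starting near zero at the beginning of the window.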