GitHub Repository: packtpublishing/machine-learning-for-algorithmic-trading-second-edition
Path: blob/master/05_strategy_evaluation/03_pyfolio_demo.ipynb

Kernel: Python 3

From `zipline` to `pyfolio`

Pyfolio facilitates the analysis of portfolio performance and risk in-sample and out-of-sample using many standard metrics. It produces tear sheets covering the analysis of returns, positions, and transactions, as well as event risk during periods of market stress using several built-in scenarios, and also includes Bayesian out-of-sample performance analysis.

Open-source performance analysis library by Quantopian Inc.
Powers Quantopian.com
State-of-the-art portfolio and risk analytics
Various models for transaction costs and slippage.
Open source and free: Apache v2 license
Can be used:
- stand alone
- with Zipline
- on Quantopian

Imports & Settings

In [1]:

import warnings
warnings.filterwarnings('ignore')

In [2]:

%matplotlib inline
from pathlib import Path

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

from pyfolio.utils import extract_rets_pos_txn_from_zipline
from pyfolio.plotting import (plot_perf_stats,
                              show_perf_stats,
                              plot_rolling_beta,
                              plot_rolling_returns,
                              plot_rolling_sharpe,
                              plot_drawdown_periods,
                              plot_drawdown_underwater)

from pyfolio.timeseries import perf_stats, extract_interesting_date_ranges

In [3]:

sns.set_style('whitegrid')

Converting data from zipline to pyfolio

In [4]:

with pd.HDFStore('backtests.h5') as store:
    backtest = store['backtest/equal_weight']
backtest.info()

Out[4]:

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1008 entries, 2013-01-02 00:00:00+00:00 to 2016-12-30 00:00:00+00:00
Data columns (total 39 columns):
 #   Column                   Non-Null Count  Dtype
---  ------                   --------------  -----
 0   period_open              1008 non-null   datetime64[ns, UTC]
 1   period_close             1008 non-null   datetime64[ns, UTC]
 2   starting_cash            1008 non-null   float64
 3   ending_cash              1008 non-null   float64
 4   portfolio_value          1008 non-null   float64
 5   returns                  1008 non-null   float64
 6   longs_count              1008 non-null   int64
 7   shorts_count             1008 non-null   int64
 8   long_value               1008 non-null   float64
 9   short_value              1008 non-null   float64
 10  long_exposure            1008 non-null   float64
 11  pnl                      1008 non-null   float64
 12  short_exposure           1008 non-null   float64
 13  capital_used             1008 non-null   float64
 14  orders                   1008 non-null   object
 15  transactions             1008 non-null   object
 16  gross_leverage           1008 non-null   float64
 17  positions                1008 non-null   object
 18  net_leverage             1008 non-null   float64
 19  starting_exposure        1008 non-null   float64
 20  ending_exposure          1008 non-null   float64
 21  starting_value           1008 non-null   float64
 22  ending_value             1008 non-null   float64
 23  factor_data              1008 non-null   object
 24  prices                   1008 non-null   object
 25  treasury_period_return   1008 non-null   float64
 26  trading_days             1008 non-null   int64
 27  period_label             1008 non-null   object
 28  algorithm_period_return  1008 non-null   float64
 29  algo_volatility          1007 non-null   float64
 30  benchmark_period_return  1008 non-null   float64
 31  benchmark_volatility     1007 non-null   float64
 32  alpha                    0 non-null      object
 33  beta                     0 non-null      object
 34  sharpe                   1004 non-null   float64
 35  sortino                  1004 non-null   float64
 36  max_drawdown             1008 non-null   float64
 37  max_leverage             1008 non-null   float64
 38  excess_return            1008 non-null   float64
dtypes: datetime64[ns, UTC](2), float64(26), int64(3), object(8)
memory usage: 315.0+ KB

pyfolio relies on portfolio returns and position data, and can also take into account the transaction costs and slippage losses of trading activity. The metrics are computed using the empyrical library that can also be used on a standalone basis. The performance DataFrame produced by the zipline backtesting engine can be translated into the requisite pyfolio input.
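To make the shape of the positions input concrete, here is a minimal sketch that pivots per-day position records into a wide table of dollar values with one column per asset plus a `cash` column. The record layout (`dt`, `sid`, `amount`, `last_sale_price`) and the helper name `toy_positions` are assumptions made for this illustration; this is not pyfolio's own conversion code.

```python
import pandas as pd


def toy_positions(records, ending_cash):
    """Pivot long-format position records into a wide dollar-value table.

    records: list of dicts with keys 'dt', 'sid', 'amount', 'last_sale_price'
             (hypothetical layout chosen for this example).
    ending_cash: Series of end-of-day cash, indexed by date.
    """
    raw = pd.DataFrame(records)
    # Dollar value of each holding: shares held times last sale price.
    raw['value'] = raw['amount'] * raw['last_sale_price']
    wide = raw.pivot_table(index='dt', columns='sid',
                           values='value', aggfunc='sum').fillna(0.0)
    wide.columns.name = None
    # Cash goes in the last column.
    wide['cash'] = ending_cash.reindex(wide.index)
    return wide


dates = pd.to_datetime(['2013-01-08', '2013-01-09'], utc=True)
records = [
    {'dt': dates[0], 'sid': 'A', 'amount': 10, 'last_sale_price': 20.0},
    {'dt': dates[0], 'sid': 'B', 'amount': -5, 'last_sale_price': 40.0},
    {'dt': dates[1], 'sid': 'A', 'amount': 10, 'last_sale_price': 21.0},
]
cash = pd.Series([1000.0, 1200.0], index=dates)
toy = toy_positions(records, cash)
print(toy)
```

Short positions show up as negative dollar values, and assets not held on a given day are filled with zero.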

In [5]:

returns, positions, transactions = extract_rets_pos_txn_from_zipline(backtest)

In [6]:

# Series.append was removed in pandas 2.0; pd.concat works in old and new versions
pd.concat([returns.head(), returns.tail()])

Out[6]:

2013-01-02 00:00:00+00:00    0.000000
2013-01-03 00:00:00+00:00    0.000000
2013-01-04 00:00:00+00:00    0.000000
2013-01-07 00:00:00+00:00    0.000000
2013-01-08 00:00:00+00:00   -0.000005
2016-12-23 00:00:00+00:00   -0.000233
2016-12-27 00:00:00+00:00    0.000160
2016-12-28 00:00:00+00:00   -0.000847
2016-12-29 00:00:00+00:00    0.000735
2016-12-30 00:00:00+00:00   -0.000606
Name: returns, dtype: float64

In [7]:

positions.info()

Out[7]:

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1004 entries, 2013-01-08 00:00:00+00:00 to 2016-12-30 00:00:00+00:00
Columns: 750 entries, Equity(0 [A]) to cash
dtypes: float64(750)
memory usage: 5.8 MB

In [8]:

positions.columns = [c for c in positions.columns[:-1]] + ['cash']
positions.index = positions.index.normalize()
positions.info()

Out[8]:

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1004 entries, 2013-01-08 00:00:00+00:00 to 2016-12-30 00:00:00+00:00
Columns: 750 entries, Equity(0 [A]) to cash
dtypes: float64(750)
memory usage: 5.8 MB

In [9]:

transactions.symbol = transactions.symbol.apply(lambda x: x.symbol)

In [10]:

# DataFrame.append was removed in pandas 2.0; pd.concat works in old and new versions
pd.concat([transactions.head(), transactions.tail()])

Out[10]:

In [11]:

HDF_PATH = Path('..', 'data', 'assets.h5')

Sector Map

In [12]:

assets = positions.columns[:-1]
with pd.HDFStore(HDF_PATH) as store:
    df = store.get('us_equities/stocks')['sector'].dropna()
    df = df[~df.index.duplicated()]
sector_map = df.reindex(assets).fillna('Unknown').to_dict()

Benchmark

In [13]:

with pd.HDFStore(HDF_PATH) as store:
    benchmark_rets = store['sp500/fred'].close.pct_change()
benchmark_rets.name = 'S&P500'
benchmark_rets = benchmark_rets.tz_localize('UTC').filter(returns.index)
benchmark_rets.tail()

Out[13]:

DATE
2016-12-23 00:00:00+00:00    0.001252
2016-12-27 00:00:00+00:00    0.002248
2016-12-28 00:00:00+00:00   -0.008357
2016-12-29 00:00:00+00:00   -0.000293
2016-12-30 00:00:00+00:00   -0.004637
Name: S&P500, dtype: float64

In [14]:

perf_stats(returns=returns,
           factor_returns=benchmark_rets)
#            positions=positions, 
#            transactions=transactions)

Out[14]:

Annual return          0.019619
Cumulative returns     0.080817
Annual volatility      0.047487
Sharpe ratio           0.432879
Calmar ratio           0.336024
Stability              0.555919
Max drawdown          -0.058387
Omega ratio            1.085094
Sortino ratio          0.630497
Skew                   0.223701
Kurtosis               6.125539
Tail ratio             0.988875
Daily value at risk   -0.005901
Alpha                  0.005922
Beta                   0.121033
dtype: float64

In [15]:

fig, ax = plt.subplots(figsize=(14, 5))
plot_perf_stats(returns=returns, 
                factor_returns=benchmark_rets,     
                ax=ax)
sns.despine()
fig.tight_layout();

Out[15]:

Returns Analysis

Testing a trading strategy involves backtesting against historical data to fine-tune alpha factor parameters, as well as forward-testing against new market data to check whether the strategy performs well out of sample or whether the parameters are too closely tailored to specific historical circumstances.

Pyfolio allows for the designation of an out-of-sample period to simulate walk-forward testing. There are numerous aspects to take into account when testing a strategy to obtain statistically reliable results, which we will address here.
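The idea of designating an out-of-sample period can be sketched with plain pandas: split the return series at a cutoff date and compute the same statistic on each part. The helper names and the toy data below are hypothetical; the risk-free rate is assumed to be zero and a year is assumed to have 252 trading days.

```python
import numpy as np
import pandas as pd


def split_in_out_of_sample(returns, live_start_date):
    """Split daily returns at live_start_date (the first out-of-sample day)."""
    cutoff = pd.Timestamp(live_start_date)
    # Match the timezone of the index so the comparison is valid.
    if returns.index.tz is not None and cutoff.tzinfo is None:
        cutoff = cutoff.tz_localize(returns.index.tz)
    in_sample = returns[returns.index < cutoff]
    out_of_sample = returns[returns.index >= cutoff]
    return in_sample, out_of_sample


def annualized_sharpe(returns, periods_per_year=252):
    """Annualized mean over standard deviation of daily returns."""
    return np.sqrt(periods_per_year) * returns.mean() / returns.std(ddof=1)


idx = pd.bdate_range('2015-12-28', '2016-01-06', tz='UTC')
toy_returns = pd.Series(
    [0.01, -0.005, 0.002, 0.004, -0.003, 0.006, 0.001, -0.002], index=idx)

ins, oos = split_in_out_of_sample(toy_returns, '2016-01-01')
print('in-sample Sharpe:    ', annualized_sharpe(ins))
print('out-of-sample Sharpe:', annualized_sharpe(oos))
```

A large gap between the two values is one sign that parameters may have been fitted too closely to the in-sample period.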

In [16]:

oos_date = '2016-01-01'

In [17]:

show_perf_stats(returns=returns, 
                factor_returns=benchmark_rets, 
                positions=positions, 
                transactions=transactions, 
                live_start_date=oos_date)

Out[17]:

Rolling Returns OOS

The plot_rolling_returns function displays cumulative in-sample and out-of-sample returns against a user-defined benchmark (we are using the S&P 500):

In [18]:

plot_rolling_returns(returns=returns, 
                     factor_returns=benchmark_rets, 
                     live_start_date=oos_date, 
                     cone_std=(1.0, 1.5, 2.0))
plt.gcf().set_size_inches(14, 8)
sns.despine()
plt.tight_layout();

Out[18]:

The plot includes a cone that shows expanding confidence intervals to indicate when out-of-sample returns appear unlikely given random-walk assumptions. Here, our strategy did not perform well against the benchmark during the simulated 2016 out-of-sample period.
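The intuition behind such a cone can be sketched as follows: under a random walk with the in-sample mean and standard deviation of daily returns, the expected cumulative return after t days grows linearly in t, while its standard deviation grows with the square root of t. This is a simplified illustration using summed returns and a hypothetical helper `simple_cone`; it is not pyfolio's cone computation.

```python
import numpy as np
import pandas as pd


def simple_cone(in_sample_returns, n_days, num_std=(1.0, 1.5, 2.0)):
    """Random-walk cone for cumulative (summed) returns over n_days."""
    mu = in_sample_returns.mean()
    sigma = in_sample_returns.std(ddof=1)
    t = np.arange(1, n_days + 1)
    cone = pd.DataFrame({'mid': mu * t}, index=t)
    for k in num_std:
        # Drift grows with t, dispersion with sqrt(t).
        cone[f'upper_{k}'] = mu * t + k * sigma * np.sqrt(t)
        cone[f'lower_{k}'] = mu * t - k * sigma * np.sqrt(t)
    return cone


rng = np.random.default_rng(0)
in_sample = pd.Series(rng.normal(0.0005, 0.01, 500))  # synthetic daily returns
cone = simple_cone(in_sample, n_days=250)
print(cone.tail())
```

Out-of-sample cumulative returns that drift outside the wider bands are unlikely under these assumptions.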

Summary Performance Statistics

pyfolio offers several analytic functions and plots. The perf_stats summary displays the annual and cumulative returns, volatility, skew, and kurtosis of returns, and the Sharpe ratio (SR). The following additional metrics (which can also be calculated individually) are most important:

Max drawdown: Highest percentage loss from the previous peak
Calmar ratio: Annual portfolio return relative to maximal drawdown
Omega ratio: The probability-weighted ratio of gains versus losses for a return target, zero by default
Sortino ratio: Excess return relative to downside standard deviation
Tail ratio: Size of the right tail (gains, the absolute value of the 95th percentile) relative to the size of the left tail (losses, the absolute value of the 5th percentile)
Daily value at risk (VaR): Loss corresponding to a return two standard deviations below the daily mean
Alpha: Portfolio return unexplained by the benchmark return
Beta: Exposure to the benchmark
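The definitions above can be written out directly with NumPy and pandas. The following is a minimal sketch under stated assumptions (daily returns, 252 periods per year, zero risk-free rate and return target); the function names are chosen for this example and the results may differ in detail from the library's output.

```python
import numpy as np
import pandas as pd


def max_drawdown(returns):
    wealth = (1 + returns).cumprod()
    # Measure losses against the running peak, starting from initial wealth of 1.
    peak = wealth.cummax().clip(lower=1.0)
    return (wealth / peak - 1).min()


def annual_return(returns, periods_per_year=252):
    return (1 + returns).prod() ** (periods_per_year / len(returns)) - 1


def calmar_ratio(returns):
    return annual_return(returns) / abs(max_drawdown(returns))


def omega_ratio(returns, target=0.0):
    excess = returns - target
    return excess[excess > 0].sum() / -excess[excess < 0].sum()


def sortino_ratio(returns, periods_per_year=252):
    # Downside deviation: root mean square of the negative returns only.
    downside = np.sqrt((np.minimum(returns, 0.0) ** 2).mean())
    return np.sqrt(periods_per_year) * returns.mean() / downside


def tail_ratio(returns):
    return abs(np.percentile(returns, 95)) / abs(np.percentile(returns, 5))


def daily_var(returns, num_std=2.0):
    return returns.mean() - num_std * returns.std(ddof=1)


r = pd.Series([0.10, -0.20, 0.05, 0.10])  # toy data, not the backtest
print(max_drawdown(r), omega_ratio(r), sortino_ratio(r))
```

On the toy series, wealth peaks at 1.1 and falls to 0.88, so the maximum drawdown is 20 percent.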

Rolling Sharpe

In [19]:

plot_rolling_sharpe(returns=returns)
plt.gcf().set_size_inches(14, 8)
sns.despine()
plt.tight_layout();

Out[19]:

Rolling Beta

In [20]:

plot_rolling_beta(returns=returns, factor_returns=benchmark_rets)
plt.gcf().set_size_inches(14, 6)
sns.despine()
plt.tight_layout();

Out[20]:

Drawdown Periods

The plot_drawdown_periods(returns) function plots the principal drawdown periods for the portfolio, and several other plotting functions show the rolling Sharpe ratio and rolling factor exposures to the market beta or the Fama-French size, growth, and momentum factors:

In [21]:

fig, ax = plt.subplots(nrows=2, ncols=2, figsize=(16, 10))
axes = ax.flatten()

plot_drawdown_periods(returns=returns, ax=axes[0])
plot_rolling_beta(returns=returns, factor_returns=benchmark_rets, ax=axes[1])
plot_drawdown_underwater(returns=returns, ax=axes[2])
plot_rolling_sharpe(returns=returns, ax=axes[3])  # draw into the fourth panel
sns.despine()
plt.tight_layout();

Out[21]:

This plot, which highlights a subset of the visualization contained in the various tear sheets, illustrates how pyfolio allows us to drill down into the performance characteristics and exposure to fundamental drivers of risk and returns.
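The quantities behind these panels can be sketched with pandas rolling windows. The window of 126 trading days (about six months), the helper names, and the synthetic data are assumptions made for this illustration rather than a reproduction of the library's plotting code.

```python
import numpy as np
import pandas as pd


def underwater(returns):
    """Percentage distance of cumulative wealth from its running peak."""
    wealth = (1 + returns).cumprod()
    return wealth / wealth.cummax().clip(lower=1.0) - 1


def rolling_sharpe(returns, window=126, periods_per_year=252):
    roll = returns.rolling(window)
    return np.sqrt(periods_per_year) * roll.mean() / roll.std()


def rolling_beta(returns, factor_returns, window=126):
    # Beta is the covariance with the benchmark over the benchmark variance.
    cov = returns.rolling(window).cov(factor_returns)
    var = factor_returns.rolling(window).var()
    return cov / var


rng = np.random.default_rng(42)
idx = pd.bdate_range('2015-01-01', periods=300)
bench = pd.Series(rng.normal(0.0004, 0.01, len(idx)), index=idx)
strategy = 0.5 * bench  # by construction, beta to the benchmark is 0.5

uw = underwater(strategy)
rs = rolling_sharpe(strategy)
rb = rolling_beta(strategy, bench)
print(rb.dropna().head())
```

Because the toy strategy is exactly half the benchmark, the rolling beta is 0.5 in every window, which makes the sketch easy to check.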

Modeling Event Risk

Pyfolio also includes timelines for various events that you can use to compare the performance of a portfolio to a benchmark during those periods, for example, during the fall 2015 market selloff (which preceded the June 2016 Brexit vote).

In [22]:

interesting_times = extract_interesting_date_ranges(returns=returns)
(interesting_times['Fall2015']
 .to_frame('momentum_equal_weights').join(benchmark_rets)
 .add(1).cumprod().sub(1)
 .plot(lw=2, figsize=(14, 6), title='Fall 2015 Selloff'))
sns.despine()
plt.tight_layout();

Out[22]:
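Conceptually, extracting an event window amounts to slicing the return series by a named date range and compounding the returns inside it. The sketch below uses a hypothetical event calendar; the dates and the helper `event_returns` are illustrative assumptions, not values read from the library.

```python
import pandas as pd

# Hypothetical event calendar; dates are assumptions for this example.
EVENTS = {
    'Fall2015': ('2015-08-15', '2015-09-30'),
    'Lehman': ('2008-08-01', '2008-10-01'),
}


def event_returns(returns, events):
    """Return {event name: returns inside the window}, skipping empty windows."""
    out = {}
    tz = returns.index.tz
    for name, (start, end) in events.items():
        lo, hi = pd.Timestamp(start, tz=tz), pd.Timestamp(end, tz=tz)
        window = returns[(returns.index >= lo) & (returns.index <= hi)]
        if not window.empty:
            out[name] = window
    return out


idx = pd.bdate_range('2015-08-03', '2015-10-30', tz='UTC')
toy_returns = pd.Series(0.001, index=idx)  # constant toy returns

windows = event_returns(toy_returns, EVENTS)
fall = windows['Fall2015']
cumulative = (1 + fall).cumprod() - 1  # compounded return over the window
print(cumulative.iloc[-1])
```

Events that fall outside the backtest period, such as the 2008 window here, are simply dropped.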
