Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
QuantConnect
GitHub Repository: QuantConnect/Research
Path: blob/master/Analysis/02 Kalman Filter Based Pairs Trading.ipynb
995 views
Kernel: Python 3

QuantConnect Logo Welcome to The QuantConnect Research Page!

Refer to this page for documentation https://www.quantconnect.com/docs#Introduction-to-Jupyter

Contribute to this template file https://github.com/QuantConnect/Research/tree/master/Notebooks

%matplotlib inline # Imports from clr import AddReference AddReference("System") AddReference("QuantConnect.Common") AddReference("QuantConnect.Jupyter") AddReference("QuantConnect.Indicators") from System import * from QuantConnect import * from QuantConnect.Data.Market import TradeBar, QuoteBar from QuantConnect.Jupyter import * from QuantConnect.Indicators import * from datetime import datetime, timedelta import matplotlib.pyplot as plt import pandas as pd # Create an instance qb = QuantBook()
import matplotlib.pyplot as plt import matplotlib.cm as cm import numpy as np import pandas as pd from pykalman import KalmanFilter import statsmodels.api as sm plt.style.use('seaborn-whitegrid')

pairs trading

Pairs trading is a market neutral trading strategy and it belongs to statistical arbitrage. The basic idea is to select two stocks which move similarly, sell high priced stock and buy low priced stock where there is a price divergence between the pairs.

Cointegration

Before using pairs trading, we need to know the cointegration. Cointegration is a statistical property of time series (that is a series of random variables)

  • Correlation specifies the co-movement of return, it is a short-term relationship

  • Cointegration specifies co-movement of price, it is a long-term relationship

Generally speaking, the weak stationary process means the mean and the autocovariance do not vary from time. For example, white noise ϵt\epsilon_t is an example of stationary time series because E(ϵt)=0E(\epsilon_t)=0 and Var(ϵt)=σ2Var(\epsilon_t)=\sigma^2.

If two series {xtx_t} and {yty_t} are not stationary but their linear combination ut=βxt−yt u_t = \beta x_t - y_t is a stationary process, then we say {xt} and {yt} are cointegrated. Here we take the first-order difference of x and y, then we say their relationship is first-order cointegration. Most of the financial time series are integrated into order 1.

In general, we use Augmented Dickey-Fuller test to test cointegration.

Can we apply this idea to trading strategy?

The most widely used model of stock price is dStSt=μdt+σdWt\frac{\text{d}S_t}{S_t}=\mu \text{d}t+\sigma \text{d}W_t StS_t is stock price, μ\mu is average return(drift term), σ\sigma is volatility (variance term), WtW_t is Brownian motion. From ito's Lemma we get dlog(St)=(μ−σ22)dt+σdWt\text{d} log(S_t)=(\mu-\frac{\sigma^2}{2}) \text{d}t + \sigma \text{d}W_t So logarithm price is integrated into order 1.

Step 1: Find two likely cointegrated stocks

symbols = ["XOM","CVX"] data = {} plt.figure(figsize =(15,7)) for syl in symbols: qb.AddEquity(syl) data[syl] = qb.History([syl], datetime(2003,1,1), datetime(2009,1,1), Resolution.Daily).loc[syl]['close'] data[syl].index = data[syl].index.date data[syl].plot(label=syl) plt.legend() plt.ylabel('Adjusted Price') plt.figure(figsize =(15,7)) for syl in symbols: (1+data[syl].pct_change()).cumprod().plot(label=syl) plt.legend() plt.ylabel('Cumulative Return')
Text(0,0.5,'Cumulative Return')
Image in a Jupyter notebookImage in a Jupyter notebook

We choose stock A :XOM, stock B:CVX. From the above plot, we can see two series are highly correlated with each other. Next, we will explore this relationship further.

Step 2: Estimate Spreads

If we have two stocks, X and Y, which are cointegrated in their price movements, then any divergence in the spread from 0 should be temporary and mean-reverting. Next step we will estimate the spread series.

# Run linear regression over two log price series x = list(np.log(data[symbols[0]])) x_const = sm.add_constant(x) y = list(np.log(data[symbols[1]])) linear_reg = sm.OLS(y,x_const) results = linear_reg.fit() results.summary()
beta = results.params[1] alpha = results.params[0]
y_pred = np.log(data[symbols[0]])*beta + alpha
plt.figure(figsize =(15,7)) cm = plt.get_cmap('jet') sc = plt.scatter(np.log(data[symbols[0]]), np.log(data[symbols[1]]), s=50, c=x, cmap=cm, marker='o', alpha=0.6,label='Log Price',edgecolor='k') plt.plot(x, y_pred, '-',c='black',linewidth=3, label='OLS Fit') plt.legend() cb = plt.colorbar(sc) cb.ax.set_yticklabels([str(p) for p in data[symbols[0]].index]) plt.xlabel(symbols[0]) plt.ylabel(symbols[1])
Text(0,0.5,'CVX')
Image in a Jupyter notebook
# construct the spread series according to OLS result df = pd.DataFrame(np.log(data[symbols[1]]) - np.log(data[symbols[0]])*beta-alpha,index=data[symbols[0]].index) df.columns = ['spread'] df.plot(figsize =(15,10)) plt.ylabel('spread')
Text(0,0.5,'spread')
Image in a Jupyter notebook

Step 3: Check Stationarity

From the above plot, the first order difference Spreadt=log(yt)−βlog(xt)−αSpread_t=log(y_t) -\beta log(x_t)-\alpha seems to be stationary and mean-reversion. Next, we will check if it is stationary.

# check if the spread is stationary adf = sm.tsa.stattools.adfuller(df['spread'], maxlag=1) print('ADF test statistic: %.02f' % adf[0]) for key, value in adf[4].items(): print('\t%s: %.3f' % (key, value)) print('p-value: %.03f' % adf[1])
ADF test statistic: -3.45 1%: -3.435 5%: -2.863 10%: -2.568 p-value: 0.009

Running the example prints the test statistic value of -3.45. The more negative this statistic, the more likely we are to reject the null hypothesis (there is a unit root).

As part of the output, we get a table to help determine the ADF statistic. We can see that our statistic value of -3.45 is less than the value of -3.435 at 1%. The p-value 0.009 is less than 0.05.

This suggests that we can reject the null hypothesis with a significance level of less than 1%. Rejecting the null hypothesis means that the process has no unit root, and in turn that the time series is stationary or does not have time-dependent structure.

Step 4: Create Trading Signal

Here we use 1.96 times the standard deviation as our threshold. 1.96 is the approximate value of the 97.5 percentile point of the normal distribution. 95% of the area under a normal curve lies within roughly 1.96 standard deviations of the mean. If the sample is out of this range, then we think there is a price divergence between these two stocks price which means the pairs trading opportunity.

df['mean'] = df['spread'].mean() df['upper'] = df['mean'] + 1.96*df['spread'].std() df['lower'] = df['mean'] - 1.96*df['spread'].std()
df.plot(figsize =(15,10),style=['g', '--r', '--b', '--b'])
<matplotlib.axes._subplots.AxesSubplot at 0x7f2a134a8b00>
Image in a Jupyter notebook
df['buy'] = df['spread'][((df['spread'] < df['lower']) & (df['spread'].shift(1) > df['lower']) | (df['spread'] < df['mean']) & (df['spread'].shift(1) > df['mean']))] df['sell'] = df['spread'][((df['spread'] > df['upper']) & (df['spread'].shift(1) < df['upper']) | (df['spread'] > df['mean']) & (df['spread'].shift(1) < df['mean']))] df.plot(figsize =(17,10), style=['g', '--r', '--b', '--b', 'm^','cv'])
<matplotlib.axes._subplots.AxesSubplot at 0x7f2a13391710>
Image in a Jupyter notebook

Here buying the spread means buy 1 unit of stock B(CVX) and sell β\beta units of stock A(XOM). We expect that the relationship between x and y will hold in the future. Buying the spread when it is lower than the standard deviation σ\sigma and closing out the position when it returns to mean μ\mu. Selling the spread means to sell 1 unit of stock B(CVX) and buy β\beta units of stock A(XOM) and when it is above σ\sigma and close the position when reaching the long-term mean to realize a profit.

Method: Kalman Filter

If we use linear regression to estimate those two parameters for the spread calculation, the main issue is that we have to pick an arbitrary lookback window and assume in the near future they will still keep this relationship and the spread will convergence to its long-term equilibrium. But in practice, α\alpha and β\beta are not constants and vary over time. They are not market observables. On the other hand, the long-term relationship can break down.

In the following research, we use Kalman filter to model the spread. This is an adaptive filter which updates itself iteratively and produces α\alpha, β\beta dynamically. We use the python package pykalman which has the EM method that calibrates the covariance matrices over the training period.

For the model details https://www.quantstart.com/articles/Dynamic-Hedge-Ratio-Between-ETF-Pairs-Using-the-Kalman-Filter

obs_mat = sm.add_constant(np.log(data[symbols[0]]).values, prepend=False)[:, np.newaxis] trans_cov = 1e-5 / (1 - 1e-5) * np.eye(2)
kf = KalmanFilter(n_dim_obs=1, n_dim_state=2, initial_state_mean=np.ones(2), initial_state_covariance=np.ones((2, 2)), transition_matrices=np.eye(2), observation_matrices=obs_mat, observation_covariance=0.5, transition_covariance=0.000001 * np.eye(2))
state_means, state_covs = kf.filter(np.log(data[symbols[1]]).values) slope=state_means[:, 0] intercept=state_means[:, 1] plt.figure(figsize =(15,7)) plt.plot(data[symbols[1]].index, slope, c='b') plt.ylabel('slope') plt.figure(figsize =(15,7)) plt.plot(data[symbols[1]].index,intercept,c='r') plt.ylabel('intercept')
Text(0,0.5,'intercept')
Image in a Jupyter notebookImage in a Jupyter notebook

From the result above, the slope increases and the intercept doesn't change too much as the time goes.

# visualize the correlation between assest prices over time plt.figure(figsize =(15,7)) cm = plt.get_cmap('jet') sc = plt.scatter(np.log(data[symbols[0]]), np.log(data[symbols[1]]), s=50, c=x, cmap=cm, marker='o', alpha=0.6,label='Price',edgecolor='k') cb = plt.colorbar(sc) cb.ax.set_yticklabels([str(p) for p in data[symbols[0]].index]) plt.xlabel("price (%s)" %symbols[0]) plt.ylabel("price (%s)" %symbols[0]) # add regression lines step = 50 # pick slope and intercept every 50 days colors_l = np.linspace(0.1, 1, len(state_means[::step])) for i, b in enumerate(state_means[::step]): plt.plot(np.log(data[symbols[0]]), b[0] *np.log(data[symbols[0]]) + b[1], alpha=.5, lw=2, c=cm(colors_l[i]))
Image in a Jupyter notebook

Next we plot the original spread and the spread estimated using Kalman Filter.

kl_spread = np.log(data[symbols[1]]) - np.log(data[symbols[0]]) * state_means[:,0] - state_means[:,1] df['kl_spread'] = kl_spread
new_df = df.drop(['buy','sell'],1) new_df.plot(figsize =(15,8),style=['r','--y', '--y', '--y','b'])
<matplotlib.axes._subplots.AxesSubplot at 0x7f2a13474240>
Image in a Jupyter notebook