GitHub Repository: AI4Finance-Foundation/FinRL
Path: blob/master/examples/Stock_NeurIPS2018_3_Backtest.ipynb
Kernel: Python 3

Stock NeurIPS2018 Part 3. Backtest

This series reproduces the process described in the paper Practical Deep Reinforcement Learning Approach for Stock Trading.

This is the third and last part of the NeurIPS2018 series. It shows how to backtest the agents we trained and compare them with baselines such as Mean Variance Optimization and the DJIA index.

Other demos can be found at the repo of FinRL-Tutorials.

Part 1. Install Packages

## install finrl library
!pip install git+https://github.com/AI4Finance-Foundation/FinRL.git

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from stable_baselines3 import A2C, DDPG, PPO, SAC, TD3

from finrl.agents.stablebaselines3.models import DRLAgent
from finrl.config import INDICATORS, TRAINED_MODEL_DIR
from finrl.meta.env_stock_trading.env_stocktrading import StockTradingEnv
from finrl.meta.preprocessor.yahoodownloader import YahooDownloader

%matplotlib inline

Part 2. Backtesting

To backtest the agents, upload train_data.csv and trade_data.csv to the same directory as this notebook (the training data is needed later for the mean-variance baseline). For Colab users, just upload both files to the default directory.

train = pd.read_csv('train_data.csv')
trade = pd.read_csv('trade_data.csv')

# If you are not using the data generated in part 1 of this tutorial, make sure
# it has the columns and index in a form that can be fed into the environment.
# Then you can comment out and skip the following lines.
train = train.set_index(train.columns[0])
train.index.names = ['']
trade = trade.set_index(trade.columns[0])
trade.index.names = ['']

Then, upload the trained agents to the same directory, and set the corresponding variable to True for each agent you want to backtest.

if_using_a2c = True
if_using_ddpg = True
if_using_ppo = True
if_using_td3 = True
if_using_sac = True

Load the agents

trained_a2c = A2C.load(TRAINED_MODEL_DIR + "/agent_a2c") if if_using_a2c else None
trained_ddpg = DDPG.load(TRAINED_MODEL_DIR + "/agent_ddpg") if if_using_ddpg else None
trained_ppo = PPO.load(TRAINED_MODEL_DIR + "/agent_ppo") if if_using_ppo else None
trained_td3 = TD3.load(TRAINED_MODEL_DIR + "/agent_td3") if if_using_td3 else None
trained_sac = SAC.load(TRAINED_MODEL_DIR + "/agent_sac") if if_using_sac else None

Trading (Out-of-sample Performance)

In practice, we retrain periodically (e.g., quarterly, monthly, or weekly) to take full advantage of the data, and we tune the parameters along the way. In this notebook, we use the in-sample data from 2009-01 to 2020-07 to tune the parameters only once, so some alpha decay is expected as the trading period lengthens.
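The periodic retraining described above is not performed in this notebook. As a rough sketch of how the retraining schedule could be laid out, the snippet below builds quarterly windows over the trading period. The helper name `rolling_windows` is hypothetical, the expanding training window is an assumption, and the start date is assumed to fall on a quarter start.

```python
import pandas as pd


def rolling_windows(train_start, first_trade_start, trade_end, freq="QS"):
    """Return (train_start, train_end, trade_start, trade_end) tuples.

    Hypothetical helper: each window trains on everything before its
    trade_start (expanding window) and trades until the next cut date.
    Assumes first_trade_start falls on a period start for `freq`.
    """
    cuts = list(pd.date_range(first_trade_start, trade_end, freq=freq))
    end = pd.Timestamp(trade_end)
    if cuts[-1] != end:
        cuts.append(end)  # close the last, possibly shorter, window
    start = pd.Timestamp(train_start)
    return [(start, a, a, b) for a, b in zip(cuts[:-1], cuts[1:])]


windows = rolling_windows("2009-01-01", "2020-07-01", "2021-10-29")
for train_s, train_e, trade_s, trade_e in windows:
    print(f"train {train_s.date()} -> {train_e.date()}, "
          f"trade {trade_s.date()} -> {trade_e.date()}")
```

Each tuple could then drive one train-then-trade cycle; the notebook itself uses a single split.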

Numerous hyperparameters – e.g. the learning rate, the total number of samples to train on – influence the learning process and are usually determined by testing some variations.

stock_dimension = len(trade.tic.unique())
state_space = 1 + 2 * stock_dimension + len(INDICATORS) * stock_dimension
print(f"Stock Dimension: {stock_dimension}, State Space: {state_space}")
Stock Dimension: 29, State Space: 291
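The state-space size printed above can be checked by hand. The sketch below repeats the arithmetic with the stock dimension from the output; the indicator count of 8 is inferred from that output rather than read from `INDICATORS`, and the reading of the terms (one cash slot, one price and one holding per stock) is an assumption about how the environment lays out its state.

```python
stock_dim = 29      # from the printed output above
n_indicators = 8    # assumption: inferred from (291 - 1 - 2 * 29) / 29

cash_slots = 1                               # assumed: account balance
price_slots = stock_dim                      # assumed: one price per stock
holding_slots = stock_dim                    # assumed: shares held per stock
indicator_slots = n_indicators * stock_dim   # each indicator for each stock

state_space = cash_slots + price_slots + holding_slots + indicator_slots
print(state_space)  # 291
```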
buy_cost_list = sell_cost_list = [0.001] * stock_dimension
num_stock_shares = [0] * stock_dimension

env_kwargs = {
    "hmax": 100,
    "initial_amount": 1000000,
    "num_stock_shares": num_stock_shares,
    "buy_cost_pct": buy_cost_list,
    "sell_cost_pct": sell_cost_list,
    "state_space": state_space,
    "stock_dim": stock_dimension,
    "tech_indicator_list": INDICATORS,
    "action_space": stock_dimension,
    "reward_scaling": 1e-4
}

e_trade_gym = StockTradingEnv(df = trade, turbulence_threshold = 70,risk_indicator_col='vix', **env_kwargs)
# env_trade, obs_trade = e_trade_gym.get_sb_env()
df_account_value_a2c, df_actions_a2c = DRLAgent.DRL_prediction(
    model=trained_a2c,
    environment = e_trade_gym) if if_using_a2c else (None, None)
hit end!
df_account_value_ddpg, df_actions_ddpg = DRLAgent.DRL_prediction(
    model=trained_ddpg,
    environment = e_trade_gym) if if_using_ddpg else (None, None)
hit end!
df_account_value_ppo, df_actions_ppo = DRLAgent.DRL_prediction(
    model=trained_ppo,
    environment = e_trade_gym) if if_using_ppo else (None, None)
hit end!
df_account_value_td3, df_actions_td3 = DRLAgent.DRL_prediction(
    model=trained_td3,
    environment = e_trade_gym) if if_using_td3 else (None, None)
hit end!
df_account_value_sac, df_actions_sac = DRLAgent.DRL_prediction(
    model=trained_sac,
    environment = e_trade_gym) if if_using_sac else (None, None)
hit end!

Part 3. Mean Variance Optimization

Mean-variance optimization (MVO) is a classic strategy in portfolio management. Here, we walk through the whole MVO process and add the result as a baseline for comparison.

First, reshape the dataframe into the form needed for the MVO weight calculation.

def process_df_for_mvo(df):
    return df.pivot(index="date", columns="tic", values="close")
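To make the reshaping concrete, here is a small sketch of the same `pivot` call on a toy long-format frame. The tickers and prices are made up for illustration and are not taken from the tutorial data.

```python
import pandas as pd

# Toy long-format data: one row per (date, ticker) pair. Values are invented.
long_df = pd.DataFrame({
    "date": ["2020-07-01", "2020-07-01", "2020-07-02", "2020-07-02"],
    "tic": ["AAPL", "MSFT", "AAPL", "MSFT"],
    "close": [90.0, 200.0, 91.0, 202.0],
})

# Same call as in process_df_for_mvo: dates become rows, tickers become columns.
wide_df = long_df.pivot(index="date", columns="tic", values="close")
print(wide_df)
```

The wide layout (one column of closing prices per ticker) is what the return and covariance calculations below operate on.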

Helper functions for mean returns and variance-covariance matrix

# Codes in this section partially refer to Dr G A Vijayalakshmi Pai
# https://www.kaggle.com/code/vijipai/lesson-5-mean-variance-optimization-of-portfolios/notebook

def StockReturnsComputing(StockPrice, Rows, Columns):
    import numpy as np
    StockReturn = np.zeros([Rows-1, Columns])
    for j in range(Columns):        # j: Assets
        for i in range(Rows-1):     # i: Daily Prices
            StockReturn[i,j]=((StockPrice[i+1, j]-StockPrice[i,j])/StockPrice[i,j])* 100
    return StockReturn
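The nested loops above compute daily percentage returns, (P[t+1] - P[t]) / P[t] * 100, for each asset. The sketch below is an equivalent vectorized form; the function name `stock_returns_vectorized` is hypothetical and the prices are toy values.

```python
import numpy as np


def stock_returns_vectorized(prices):
    """Daily percentage returns per asset (rows: days, columns: assets)."""
    prices = np.asarray(prices, dtype=float)
    # Difference of consecutive rows divided by the earlier row, in percent.
    return (prices[1:] - prices[:-1]) / prices[:-1] * 100


# Toy prices for 3 days and 2 assets (invented values).
prices = np.array([[100.0, 50.0],
                   [110.0, 45.0],
                   [99.0, 54.0]])
returns = stock_returns_vectorized(prices)
print(returns)
```

For price histories of this size, the array slicing avoids the Python-level double loop while producing the same (Rows-1) x Columns result.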

Calculate the weights for mean-variance

StockData = process_df_for_mvo(train)
TradeData = process_df_for_mvo(trade)

TradeData.to_numpy()
array([[ 89.4945755 , 234.61436462, 90.74229431, ..., 46.98942566, 36.2900238 , 114.62765503], [ 89.4945755 , 237.48353577, 91.0124588 , ..., 47.09257507, 37.26652145, 114.16794586], [ 91.88856506, 235.65350342, 93.18331909, ..., 47.47934723, 38.31403732, 113.86148071], ..., [147.34196472, 197.85211182, 178.69203186, ..., 48.2805481 , 45.98464584, 146.54455566], [148.01603699, 198.85264587, 177.35902405, ..., 48.73966217, 45.13446426, 145.26525879], [147.55014038, 196.8515625 , 174.49697876, ..., 48.32645416, 44.022686 , 144.07385254]])
#compute asset returns
arStockPrices = np.asarray(StockData)
[Rows, Cols]=arStockPrices.shape
arReturns = StockReturnsComputing(arStockPrices, Rows, Cols)

#compute mean returns and variance covariance matrix of returns
meanReturns = np.mean(arReturns, axis = 0)
covReturns = np.cov(arReturns, rowvar=False)

#set precision for printing results
np.set_printoptions(precision=3, suppress = True)

#display mean returns and variance-covariance matrix of returns
print('Mean returns of assets in k-portfolio 1\n', meanReturns)
print('Variance-Covariance matrix of returns\n', covReturns)
Mean returns of assets in k-portfolio 1 [0.136 0.068 0.086 0.083 0.066 0.134 0.06 0.035 0.072 0.056 0.103 0.073 0.033 0.076 0.047 0.073 0.042 0.056 0.054 0.056 0.103 0.089 0.041 0.053 0.104 0.11 0.044 0.042 0.042] Variance-Covariance matrix of returns [[3.156 1.066 1.768 1.669 1.722 1.814 1.569 1.302 1.302 1.811 1.303 1.432 1.218 1.674 0.74 1.839 0.719 0.884 1.241 0.823 1.561 1.324 0.752 1.027 1.298 1.466 0.657 1.078 0.631] [1.066 2.571 1.306 1.123 1.193 1.319 1.116 1.053 1.045 1.269 1.068 1.089 0.899 1.218 0.926 1.391 0.682 0.727 1.025 1.156 1.166 0.984 0.798 0.956 1.259 1.111 0.688 1.091 0.682] [1.768 1.306 4.847 2.73 2.6 2.128 1.944 2.141 2.17 3.142 1.932 2.283 1.56 2.012 0.993 3.707 1.094 1.319 1.845 1.236 1.899 1.894 1.041 1.921 1.823 2.314 0.986 1.421 0.707] [1.669 1.123 2.73 4.892 2.363 1.979 1.7 2.115 1.959 2.387 1.773 2.319 1.571 1.797 0.968 2.597 1.144 1.298 1.643 1.071 1.615 1.775 0.91 1.666 1.707 1.784 0.82 1.345 0.647] [1.722 1.193 2.6 2.363 4.019 2.127 1.917 2.059 1.817 2.46 1.577 2.238 1.513 1.929 0.925 2.64 0.947 0.971 1.894 1.089 1.711 1.642 0.865 1.456 1.478 1.687 0.92 1.326 0.697] [1.814 1.319 2.128 1.979 2.127 5.384 1.974 1.549 1.683 2.122 1.624 1.771 1.441 1.939 0.846 2.191 0.837 1.075 1.475 1.041 1.978 1.768 0.784 1.328 1.365 1.912 0.787 1.28 0.666] [1.569 1.116 1.944 1.7 1.917 1.974 3.081 1.483 1.534 1.937 1.367 1.62 1.399 1.843 0.894 2.057 0.794 0.905 1.438 1.014 1.72 1.382 0.865 1.206 1.273 1.488 0.811 1.173 0.753] [1.302 1.053 2.141 2.115 2.059 1.549 1.483 2.842 1.525 2.044 1.428 1.783 1.308 1.533 0.878 2.279 0.938 1.092 1.385 1.078 1.429 1.314 0.831 1.459 1.466 1.48 0.83 1.042 0.567] [1.302 1.045 2.17 1.959 1.817 1.683 1.534 1.525 2.661 1.987 1.454 1.748 1.217 1.475 0.791 2.216 0.896 0.973 1.396 0.949 1.379 1.407 0.859 1.268 1.281 1.454 0.81 1.143 0.667] [1.811 1.269 3.142 2.387 2.46 2.122 1.937 2.044 1.987 4.407 1.789 2.12 1.593 1.982 0.945 3.96 0.956 1.094 1.758 1.157 1.788 1.692 0.905 1.879 1.712 2. 
0.945 1.421 0.713] [1.303 1.068 1.932 1.773 1.577 1.624 1.367 1.428 1.454 1.789 2.373 1.51 1.166 1.501 0.756 1.941 0.824 0.998 1.239 0.887 1.366 1.414 0.797 1.299 1.296 1.41 0.764 1.071 0.783] [1.432 1.089 2.283 2.319 2.238 1.771 1.62 1.783 1.748 2.12 1.51 2.516 1.326 1.575 0.889 2.345 0.958 1.022 1.623 1.02 1.489 1.532 0.848 1.377 1.444 1.547 0.81 1.211 0.63 ] [1.218 0.899 1.56 1.571 1.513 1.441 1.399 1.308 1.217 1.593 1.166 1.326 2.052 1.399 0.727 1.749 0.786 0.795 1.154 0.829 1.296 1.12 0.743 1.105 1.088 1.214 0.739 0.998 0.598] [1.674 1.218 2.012 1.797 1.929 1.939 1.843 1.533 1.475 1.982 1.501 1.575 1.399 3.289 0.853 2.112 0.85 0.89 1.412 1.002 1.9 1.352 0.842 1.317 1.334 1.487 0.847 1.165 0.766] [0.74 0.926 0.993 0.968 0.925 0.846 0.894 0.878 0.791 0.945 0.756 0.889 0.727 0.853 1.153 1.027 0.642 0.59 0.848 0.892 0.825 0.748 0.694 0.761 0.929 0.819 0.61 0.806 0.547] [1.839 1.391 3.707 2.597 2.64 2.191 2.057 2.279 2.216 3.96 1.941 2.345 1.749 2.112 1.027 5.271 1.08 1.235 1.892 1.297 1.91 1.85 1.068 2.164 1.85 2.169 1.112 1.555 0.779] [0.719 0.682 1.094 1.144 0.947 0.837 0.794 0.938 0.896 0.956 0.824 0.958 0.786 0.85 0.642 1.08 1.264 0.679 0.804 0.74 0.819 0.845 0.749 0.891 0.849 0.794 0.633 0.719 0.514] [0.884 0.727 1.319 1.298 0.971 1.075 0.905 1.092 0.973 1.094 0.998 1.022 0.795 0.89 0.59 1.235 0.679 1.518 0.816 0.719 0.943 1.027 0.615 1. 
0.947 0.994 0.533 0.673 0.504] [1.241 1.025 1.845 1.643 1.894 1.475 1.438 1.385 1.396 1.758 1.239 1.623 1.154 1.412 0.848 1.892 0.804 0.816 2.028 0.9 1.265 1.243 0.787 1.194 1.193 1.282 0.752 1.099 0.622] [0.823 1.156 1.236 1.071 1.089 1.041 1.014 1.078 0.949 1.157 0.887 1.02 0.829 1.002 0.892 1.297 0.74 0.719 0.9 2.007 0.952 0.849 0.732 1.008 1.15 0.933 0.722 0.897 0.614] [1.561 1.166 1.899 1.615 1.711 1.978 1.72 1.429 1.379 1.788 1.366 1.489 1.296 1.9 0.825 1.91 0.819 0.943 1.265 0.952 2.759 1.308 0.832 1.214 1.285 1.493 0.793 1.113 0.705] [1.324 0.984 1.894 1.775 1.642 1.768 1.382 1.314 1.407 1.692 1.414 1.532 1.12 1.352 0.748 1.85 0.845 1.027 1.243 0.849 1.308 2.864 0.751 1.153 1.26 1.411 0.71 1.046 0.651] [0.752 0.798 1.041 0.91 0.865 0.784 0.865 0.831 0.859 0.905 0.797 0.848 0.743 0.842 0.694 1.068 0.749 0.615 0.787 0.732 0.832 0.751 1.289 0.806 0.766 0.763 0.663 0.797 0.645] [1.027 0.956 1.921 1.666 1.456 1.328 1.206 1.459 1.268 1.879 1.299 1.377 1.105 1.317 0.761 2.164 0.891 1. 1.194 1.008 1.214 1.153 0.806 2.27 1.259 1.294 0.812 0.986 0.676] [1.298 1.259 1.823 1.707 1.478 1.365 1.273 1.466 1.281 1.712 1.296 1.444 1.088 1.334 0.929 1.85 0.849 0.947 1.193 1.15 1.285 1.26 0.766 1.259 3.352 1.267 0.697 1.137 0.685] [1.466 1.111 2.314 1.784 1.687 1.912 1.488 1.48 1.454 2. 1.41 1.547 1.214 1.487 0.819 2.169 0.794 0.994 1.282 0.933 1.493 1.411 0.763 1.294 1.267 2.982 0.709 1.007 0.656] [0.657 0.688 0.986 0.82 0.92 0.787 0.811 0.83 0.81 0.945 0.764 0.81 0.739 0.847 0.61 1.112 0.633 0.533 0.752 0.722 0.793 0.71 0.663 0.812 0.697 0.709 1.371 0.697 0.561] [1.078 1.091 1.421 1.345 1.326 1.28 1.173 1.042 1.143 1.421 1.071 1.211 0.998 1.165 0.806 1.555 0.719 0.673 1.099 0.897 1.113 1.046 0.797 0.986 1.137 1.007 0.697 3.073 0.759] [0.631 0.682 0.707 0.647 0.697 0.666 0.753 0.567 0.667 0.713 0.783 0.63 0.598 0.766 0.547 0.779 0.514 0.504 0.622 0.614 0.705 0.651 0.645 0.676 0.685 0.656 0.561 0.759 1.452]]

Use PyPortfolioOpt

from pypfopt.efficient_frontier import EfficientFrontier

ef_mean = EfficientFrontier(meanReturns, covReturns, weight_bounds=(0, 0.5))
raw_weights_mean = ef_mean.max_sharpe()
cleaned_weights_mean = ef_mean.clean_weights()
mvo_weights = np.array([1000000 * cleaned_weights_mean[i] for i in range(len(cleaned_weights_mean))])
mvo_weights
array([424250., 0., 0., 0., 0., 108650., 0., 0., 0., 0., 181450., 0., 0., 0., 0., 0., 0., 0., 0., 0., 16960., 0., 0., 0., 133540., 135150., 0., 0., 0.])
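# Note: despite its name, LastPrice holds the reciprocal of each stock's last
# training-period price, so the product below is a number of shares per stock.
LastPrice = np.array([1/p for p in StockData.tail(1).to_numpy()[0]])
Initial_Portfolio = np.multiply(mvo_weights, LastPrice)
Initial_Portfolio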
array([4731.544, 0. , 0. , 0. , 0. , 579.993, 0. , 0. , 0. , 0. , 771.759, 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 85.465, 0. , 0. , 0. , 470.265, 712.801, 0. , 0. , 0. ])
Portfolio_Assets = TradeData @ Initial_Portfolio
MVO_result = pd.DataFrame(Portfolio_Assets, columns=["Mean Var"])
MVO_result
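The cells above turn the optimized weights into a buy-and-hold portfolio: the capital allocated to each stock is divided by its last training-period price to get a share count, and the trade-period prices are multiplied by those share counts to get the daily portfolio value. A toy version of that arithmetic, with made-up weights and prices, looks like this:

```python
import numpy as np

capital = 1_000_000
weights = np.array([0.5, 0.3, 0.2])                # toy weights, sum to 1
last_train_prices = np.array([100.0, 50.0, 20.0])  # toy last in-sample prices

# Dollars allocated per stock divided by price gives shares held per stock.
shares = capital * weights / last_train_prices

# Toy out-of-sample prices: rows are days, columns are stocks.
trade_prices = np.array([[100.0, 50.0, 20.0],
                         [110.0, 50.0, 19.0]])

# Buy-and-hold value per day: price matrix times the fixed share vector.
portfolio_value = trade_prices @ shares
print(shares)
print(portfolio_value)
```

Because the share counts are fixed once, this baseline never rebalances during the trading period.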

Part 4. DJIA index

Add the DJIA index as a baseline for comparison.

TRAIN_START_DATE = '2009-01-01'
TRAIN_END_DATE = '2020-07-01'
TRADE_START_DATE = '2020-07-01'
TRADE_END_DATE = '2021-10-29'

df_dji = YahooDownloader(
    start_date=TRADE_START_DATE, end_date=TRADE_END_DATE, ticker_list=["dji"]
).fetch_data()
[*********************100%***********************] 1 of 1 completed Shape of DataFrame: (319, 8)
df_dji = df_dji[["date", "close"]]
fst_day = df_dji["close"][0]
dji = pd.merge(
    df_dji["date"],
    df_dji["close"].div(fst_day).mul(1000000),
    how="outer",
    left_index=True,
    right_index=True,
).set_index("date")
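The cell above rescales the index so that it starts at the same 1,000,000 initial capital as the agents, which makes the curves directly comparable. A minimal sketch of that normalization on a toy series (invented index levels) is:

```python
import pandas as pd

initial_amount = 1_000_000

# Toy index closing levels; values are invented for illustration.
index_close = pd.Series(
    [26000.0, 26260.0, 25740.0],
    index=pd.to_datetime(["2020-07-01", "2020-07-02", "2020-07-06"]),
    name="close",
)

# Divide by the first close, then scale to the initial capital.
index_value = index_close / index_close.iloc[0] * initial_amount
print(index_value)
```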

Part 5. Backtesting Results

Backtesting plays a key role in evaluating the performance of a trading strategy. An automated backtesting tool is preferred because it reduces human error. We usually use the Quantopian pyfolio package to backtest our trading strategies. It is easy to use and provides a set of individual plots that together give a comprehensive picture of a strategy's performance.

df_result_a2c = (
    df_account_value_a2c.set_index(df_account_value_a2c.columns[0])
    if if_using_a2c
    else None
)
df_result_ddpg = (
    df_account_value_ddpg.set_index(df_account_value_ddpg.columns[0])
    if if_using_ddpg
    else None
)
df_result_ppo = (
    df_account_value_ppo.set_index(df_account_value_ppo.columns[0])
    if if_using_ppo
    else None
)
df_result_td3 = (
    df_account_value_td3.set_index(df_account_value_td3.columns[0])
    if if_using_td3
    else None
)
df_result_sac = (
    df_account_value_sac.set_index(df_account_value_sac.columns[0])
    if if_using_sac
    else None
)

result = pd.DataFrame(
    {
        "a2c": df_result_a2c["account_value"] if if_using_a2c else None,
        "ddpg": df_result_ddpg["account_value"] if if_using_ddpg else None,
        "ppo": df_result_ppo["account_value"] if if_using_ppo else None,
        "td3": df_result_td3["account_value"] if if_using_td3 else None,
        "sac": df_result_sac["account_value"] if if_using_sac else None,
        "mvo": MVO_result["Mean Var"],
        "dji": dji["close"],
    }
)
result

Now that everything is ready, we can plot the backtest result.

plt.rcParams["figure.figsize"] = (15,5)
plt.figure()
result.plot()
<Axes: xlabel='date'>
<Figure size 1500x500 with 0 Axes>
Image in a Jupyter notebook
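Beyond the plot, each account-value column can be summarized with a few standard statistics. The sketch below computes cumulative return, an annualized Sharpe ratio, and maximum drawdown with plain pandas. It is not the pyfolio API: the helper name `summarize` is hypothetical, and the zero risk-free rate and 252 trading days per year are assumptions.

```python
import numpy as np
import pandas as pd


def summarize(account_value, periods_per_year=252):
    """Hypothetical helper: basic statistics for one account-value series."""
    daily_ret = account_value.pct_change().dropna()
    cumulative = account_value.iloc[-1] / account_value.iloc[0] - 1
    # Assumes a zero risk-free rate.
    sharpe = np.sqrt(periods_per_year) * daily_ret.mean() / daily_ret.std()
    # Drawdown: distance of each value below the running maximum.
    drawdown = account_value / account_value.cummax() - 1
    return {
        "cumulative_return": cumulative,
        "sharpe": sharpe,
        "max_drawdown": drawdown.min(),
    }


# Toy account values (invented); in the notebook this could be result["a2c"].
toy = pd.Series([1_000_000.0, 1_020_000.0, 990_000.0, 1_050_000.0])
stats = summarize(toy)
print(stats)
```

Applied column by column to `result`, this would give a compact numerical comparison to accompany the plot.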