Intraday Strategy, Part 2: Model Training & Signal Evaluation
In this notebook, we load the high-quality NASDAQ100 minute-bar trade-and-quote data generously provided by Algoseek (available here) and use the features engineered in the previous notebook to train a gradient boosting model that predicts returns for the NASDAQ100 stocks over the next 1-minute bar.
Note that we will assume throughout that we can always buy (sell) at the first (last) trade price of a given bar, at no cost and without market impact. This certainly does not reflect market reality; rather, it reflects the challenges of realistically simulating a trading strategy at this much higher intraday frequency using open-source tools.
Note also that this section has changed slightly from the version published in the book to permit replication using the Algoseek data sample.
Imports & Settings
Ensuring we can import utils.py in the repo's root directory:
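A minimal way to do this, assuming the notebook sits one level below the repo root:

```python
import sys
from pathlib import Path

# make utils.py in the repo's root directory importable
sys.path.insert(0, Path('..').resolve().as_posix())
```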
Load Model Data
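The following sketch shows how the features engineered in the previous notebook might be loaded; the HDF5 path and key are assumptions and should match whatever that notebook stored:

```python
import pandas as pd

# hypothetical store path and key; adjust to wherever the previous
# notebook saved the engineered minute-bar features
data = pd.read_hdf('data/algoseek.h5', 'model_data')
data.info(show_counts=True)
```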
Model Training
Helper functions
Categorical Variables
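LightGBM handles categorical features natively as long as they are encoded as non-negative integers. A sketch along these lines, where the column names are assumptions:

```python
import pandas as pd

# assumed categorical columns; adjust to the engineered feature set
categoricals = ['ticker', 'minute']

# LightGBM expects categoricals as non-negative integer codes
for col in categoricals:
    data[col] = pd.factorize(data[col], sort=True)[0]
```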
Custom Metric
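The custom metric is the Information Coefficient (IC), i.e., the Spearman rank correlation between predictions and realized forward returns. A minimal version compatible with LightGBM's native `feval` interface:

```python
from scipy.stats import spearmanr

def ic_lgbm(preds, train_data):
    """Information Coefficient: Spearman rank correlation between
    predictions and the realized forward returns."""
    ic = spearmanr(preds, train_data.get_label())[0]
    return 'ic', ic, True  # metric name, value, higher-is-better
```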
Cross-validation setup
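A sketch of a walk-forward splitter; the parameter defaults are assumptions chosen to be consistent with the 23 monthly test periods evaluated below:

```python
import numpy as np

def get_cv_dates(dates, n_splits=23, train_length=252, test_length=21):
    """Walk-forward splits: each fold trains on train_length distinct dates
    and validates on the following test_length dates."""
    unique_dates = np.unique(dates)  # sorted unique trading dates
    for i in range(n_splits):
        train_start = i * test_length
        train_end = train_start + train_length
        test_end = train_end + test_length
        if test_end > len(unique_dates):
            break
        yield (unique_dates[train_start:train_end],
               unique_dates[train_end:test_end])
```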
Show train/validation periods:
Train model
The following model-training loop takes more than 10 hours to run and also consumes substantial memory. If you run into resource constraints, you can modify the code, e.g., by:
Only loading the data required for one iteration.
Shortening the training period to less than one year.
You can also speed up the process by using fewer `n_splits`, which implies longer test periods.
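A condensed sketch of the loop, reusing the helpers defined above; the hyperparameters, column names, and index structure (a `date_time` level holding bar timestamps, a `fwd1min` label column) are illustrative rather than the exact settings used:

```python
import lightgbm as lgb
import pandas as pd

params = dict(objective='regression',
              metric='None',  # disable defaults; rely on the custom IC metric
              learning_rate=0.1,
              verbose=-1)

# assumes a MultiIndex with a 'date_time' level of bar timestamps
dates = data.index.get_level_values('date_time').date
predictions, models = [], []

for fold, (train_dates, test_dates) in enumerate(get_cv_dates(dates)):
    train = data.loc[pd.Index(dates).isin(train_dates)]
    test = data.loc[pd.Index(dates).isin(test_dates)]

    lgb_train = lgb.Dataset(train.drop('fwd1min', axis=1),
                            label=train.fwd1min,
                            categorical_feature=categoricals)

    model = lgb.train(params=params,
                      train_set=lgb_train,
                      num_boost_round=100,
                      feval=ic_lgbm,
                      valid_sets=[lgb_train])
    models.append(model)

    # collect out-of-sample predictions alongside the realized returns
    y_pred = model.predict(test.drop('fwd1min', axis=1))
    predictions.append(test[['fwd1min']].assign(y_pred=y_pred))

predictions = pd.concat(predictions)
```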
Signal Evaluation
We have out-of-sample predictions for 484 days from February 2016 through December 2017:
We only use minutes with at least 100 predictions:
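One way to apply this filter, assuming the out-of-sample `predictions` DataFrame carries a `date_time` index level:

```python
# number of individual stock predictions per minute bar
counts = predictions.groupby(level='date_time').size()

# keep only minute bars with at least 100 predictions
valid_minutes = counts[counts >= 100].index
predictions = predictions[predictions.index
                          .get_level_values('date_time')
                          .isin(valid_minutes)]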
There are ~700 periods, equivalent to a bit over a single trading day (0.67% of all periods in the sample), with fewer than 100 predictions over the 23 test months:
Information Coefficient
Across all periods
By minute
We are making new predictions every minute, so it makes sense to look at the average performance across all short-term forecasts:
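Both aggregations can be computed as follows, assuming the columns `y_pred` (model forecast) and `fwd1min` (realized return):

```python
from scipy.stats import spearmanr

# pooled IC across all (stock, minute) observations
ic_all = spearmanr(predictions.y_pred, predictions.fwd1min)[0]

# IC per minute bar: rank correlation across stocks at each timestamp
ic_by_minute = (predictions
                .groupby(level='date_time')
                .apply(lambda df: spearmanr(df.y_pred, df.fwd1min)[0]))

print(f'Pooled IC: {ic_all:.4f} | Mean minute-bar IC: {ic_by_minute.mean():.4f}')
```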
Plotted as a five-day rolling average, we see that the IC was mostly below the out-of-sample period mean, and increased during the last quarter of 2017 (as reflected in the validation results we observed while training the model).
Vectorized backtest of a naive strategy: financial performance by signal quantile
Alphalens does not work with minute data, so we need to compute our own signal performance measures.
Unfortunately, Zipline's Pipeline also does not work with minute data, and Backtrader takes a very long time with such a large dataset. Hence, instead of an event-driven backtest of entry/exit rules as in previous examples, we can only create a rough sketch of the financial performance of a naive trading strategy driven by the model's predictions using a vectorized backtest (see Chapter 8 on the ML4T workflow). As we will see below, this does not produce particularly helpful results.
This naive strategy invests in equal-weighted portfolios of the stocks in each decile under the following assumptions (mentioned at the beginning of this notebook):
Based on the predictions using inputs from the current and previous bars, we can enter positions at the first trade price of the following minute bar.
We exit all positions at the last trade price of that same bar.
There are no trading costs or market impact (slippage) for our trades (but we check below how sensitive the results are to costs).
Average returns by minute bar and signal quantile
To this end, we compute the quintiles and deciles of the model's fwd1min predictions for each minute:
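A vectorized way to do this, assuming the `predictions` DataFrame from the training loop sketched above:

```python
import pandas as pd

# assign each stock to a quintile/decile of that minute's predicted returns
by_minute = predictions.groupby(level='date_time', group_keys=False)
predictions['quintile'] = by_minute.y_pred.apply(
    lambda x: pd.qcut(x, q=5, labels=False, duplicates='drop'))
predictions['decile'] = by_minute.y_pred.apply(
    lambda x: pd.qcut(x, q=10, labels=False, duplicates='drop'))
```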
Descriptive statistics of intraday returns by quintile and decile of model predictions
Next, we compute the average one-minute returns for each quintile / decile and minute.
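For the summary statistics quoted below, a simple aggregation suffices (assuming `fwd1min` is stored as a decimal return, hence the scaling to basis points):

```python
# mean 1-minute forward return by signal quintile, in basis points
avg_ret_bp = predictions.groupby('quintile').fwd1min.mean().mul(1e4)
print(avg_ret_bp)
```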
The returns per minute, averaged over the 23-month period, increase by quintile/decile and range from -0.3 (-0.4) to 0.27 (0.37) basis points for the bottom and top quintile (decile), respectively. While this aligns with the weakly positive rank correlation we found above, it also suggests that such small gains are unlikely to survive the impact of trading costs.
Cumulative Performance by Quantile
To simulate the performance of our naive strategy that trades all available stocks every minute, we simply assume that we can reinvest (including potential gains/losses) every minute. To check the sensitivity with respect to trading costs, we assume they amount to a constant number (fraction) of basis points and subtract this amount from the minute-bar returns.
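The following sketch compounds the equal-weighted minute-bar quintile returns, gross and net of an assumed constant cost; the cost level is an assumption for the sensitivity check:

```python
# per-minute return of an equal-weighted portfolio for each quintile
minute_idx = predictions.index.get_level_values('date_time')
portfolio_returns = (predictions
                     .groupby(['quintile', minute_idx])
                     .fwd1min.mean()
                     .unstack('quintile'))

TRADING_COST_BP = 0.3  # assumed constant per-trade cost in basis points

# compound minute-bar returns, gross and net of the assumed cost
cum_gross = portfolio_returns.add(1).cumprod()
cum_net = portfolio_returns.sub(TRADING_COST_BP / 1e4).add(1).cumprod()
```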
Without trading costs, the compounding of even fairly small gains leads to extremely large cumulative profits for the top quantile. However, these disappear as soon as we allow for minuscule trading costs that reduce the average quantile return close to zero.
Without trading costs
With extremely low trading costs
Feature Importance
We'll take a quick look at the features that most contributed to improving the IC across the 23 folds:
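One way to aggregate the per-fold importances, assuming the fitted boosters were collected in the `models` list during the training loop sketched earlier:

```python
import pandas as pd

# normalized gain importance, averaged over the fitted boosters from the CV loop
fi = pd.concat([pd.Series(m.feature_importance(importance_type='gain'),
                          index=m.feature_name())
                for m in models], axis=1)
avg_fi = fi.div(fi.sum()).mean(axis=1).sort_values(ascending=False)
avg_fi.head(10).sort_values().plot.barh(title='Avg. normalized gain importance');
```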
The top features from a conventional feature importance perspective are the ticker, followed by NATR, the minute of the day, the latest 1-minute return, and the CCI:
Explore in greater detail how feature values affect predictions using SHAP values, as demonstrated in various other notebooks in this chapter and the appendix!
Conclusion
We have seen that a relatively simple gradient boosting model is able to achieve fairly consistent predictive performance that is significantly better than a random guess even on a very short horizon.
However, the resulting economic gains of our naive strategy of frequently buying (short-selling) the top (bottom) quantiles are too small to overcome the inevitable transaction costs. On the one hand, this demonstrates the challenges of extracting value from a predictive signal; on the other hand, it shows that we need a more sophisticated backtesting platform before we can even begin to design and evaluate a strategy that requires far fewer trades to exploit the signal in our ML predictions.
In addition, we would also want to work on improving the model by adding more informative features, e.g., based on the quote/trade information contained in the Algoseek data, or by fine-tuning the model architecture and hyperparameter settings.