Installation Setup
Data
Column Details
permno is the permanent stock identifier used in CRSP
Retail Market Order Imbalance (moribvol) = (Shares Purchased - Shares Sold) / (Shares Purchased + Shares Sold)
We standardize the net order flow by dividing it by the sum of retail buy and sell market orders, which makes the variable comparable across stocks
Tracking Retail Investor Activity: https://onlinelibrary.wiley.com/doi/abs/10.1111/jofi.13033
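As a minimal sketch of the imbalance formula above, the ratio can be computed from buy and sell share counts with pandas. The column names `buy_shares` and `sell_shares` and the toy numbers are assumptions made for illustration; they are not taken from the dataset.

```python
import numpy as np
import pandas as pd


def retail_order_imbalance(buy_vol: pd.Series, sell_vol: pd.Series) -> pd.Series:
    """Return (buy - sell) / (buy + sell); NaN where there is no retail volume."""
    total = buy_vol + sell_vol
    # A zero denominator becomes NaN so the ratio is undefined instead of inf.
    return (buy_vol - sell_vol) / total.replace(0, np.nan)


# Toy data with hypothetical column names.
trades = pd.DataFrame(
    {
        "buy_shares": [600.0, 100.0, 0.0],
        "sell_shares": [400.0, 300.0, 0.0],
    }
)
trades["moribvol"] = retail_order_imbalance(trades["buy_shares"], trades["sell_shares"])
print(trades)
```

Because the numerator is divided by total retail volume, the result always lies in [-1, 1], which is what makes it comparable between heavily and thinly traded stocks.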
5 Russell groups
Classification of stocks according to the constituent members of Russell indices
Large-Cap (russellgroup = 1): stocks in Russell Top 200, which consists of the largest 200 members in Russell 1000
Mid-Cap (russellgroup = 2): stocks in Russell Mid-Cap, which consists of the smallest 800 members in Russell 1000
Small-Cap (russellgroup = 3): the largest 1000 members in Russell 2000
Micro-Cap (russellgroup = 4): stocks in Russell Micro-Cap, which consists of the smallest 1000 members in Russell 2000 plus the largest 1000 stocks outside Russell 2000
Nano-Cap (russellgroup = 5): all remaining stocks
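The group codes above can be turned into readable labels, or used directly as a filter. This is only a sketch: the `russellgroup` column name comes from the description above, while the `russell_label` column, the dictionary name, and the toy permno values are assumptions.

```python
import pandas as pd

# Code-to-label mapping taken from the list above.
RUSSELL_GROUP_LABELS = {
    1: "Large-Cap",
    2: "Mid-Cap",
    3: "Small-Cap",
    4: "Micro-Cap",
    5: "Nano-Cap",
}

# Toy rows; the permno values are made up for illustration.
stocks = pd.DataFrame({"permno": [10001, 10002, 10003], "russellgroup": [1, 3, 5]})
stocks["russell_label"] = stocks["russellgroup"].map(RUSSELL_GROUP_LABELS)

# Keep only Large-Cap stocks (russellgroup = 1).
large_caps = stocks[stocks["russellgroup"] == 1]
print(stocks)
print(large_caps)
```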
11 sectors
The Global Industry Classification Standard (GICS)
I name each sector after the ticker of the corresponding SPDR sector ETF (see https://www.sectorspdr.com/sectorspdr/)
ret_i, for i = 1, 5, 10, 20, is the return over the next day, 5 days, 10 days, and 20 days, respectively
The returns are already shifted back by one trading day to avoid potential forward-looking bias, so the value stored on a given date is earned starting from the next trading day. For example, the ret_1 of -0.003 for stock 93436 on 2022-06-24 is the return this stock earned on 2022-06-27
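The alignment described above can be sketched as follows. This is not necessarily how the dataset was built: it assumes a hypothetical `daily_ret` column holding simple daily returns, and it assumes that multi-day returns are compounded, which the text does not state. The toy returns are invented, except that the 2022-06-27 value mirrors the -0.003 example.

```python
import numpy as np
import pandas as pd


def add_forward_returns(df: pd.DataFrame, horizons=(1, 5, 10, 20)) -> pd.DataFrame:
    """Add ret_k columns: the compounded return over the NEXT k rows per stock."""
    out = df.sort_values(["permno", "date"]).copy()
    log_ret = np.log1p(out["daily_ret"])
    for k in horizons:
        # rolling(k).sum() at row t covers rows t-k+1..t; shifting by -k moves
        # the window so that row t holds the sum over rows t+1..t+k.
        fwd_log = log_ret.groupby(out["permno"]).transform(
            lambda s: s.rolling(k).sum().shift(-k)
        )
        out[f"ret_{k}"] = np.expm1(fwd_log)
    return out


# Toy daily returns for one stock (hypothetical values).
daily = pd.DataFrame(
    {
        "permno": [93436] * 4,
        "date": pd.to_datetime(["2022-06-23", "2022-06-24", "2022-06-27", "2022-06-28"]),
        "daily_ret": [0.01, 0.02, -0.003, 0.005],
    }
)
aligned = add_forward_returns(daily, horizons=(1, 2))
print(aligned)
```

In this toy frame, the 2022-06-24 row carries the return earned on 2022-06-27 as ret_1, and the last rows are NaN because no future data exists for them. A feature observed on a given date is therefore only ever paired with returns earned afterwards.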
More Data Explorations
In hindsight, we present a stock-level correlation analysis between retail order imbalance and the 5-day return, as an example
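One way such a per-stock correlation could be computed is a grouped Pearson correlation between `moribvol` and `ret_5`. The sketch below assumes a long-format table with one row per stock and date; the toy values and permno codes are invented.

```python
import pandas as pd

# Toy panel: two stocks, four dates each (hypothetical values).
panel = pd.DataFrame(
    {
        "permno": [10001] * 4 + [10002] * 4,
        "moribvol": [0.1, 0.2, 0.3, 0.4] * 2,
        "ret_5": [0.01, 0.02, 0.03, 0.04, 0.04, 0.03, 0.02, 0.01],
    }
)

# Pearson correlation between imbalance and 5-day return, one value per stock.
corr_by_stock = panel.groupby("permno")[["moribvol", "ret_5"]].apply(
    lambda g: g["moribvol"].corr(g["ret_5"])
)
print(corr_by_stock.sort_values(ascending=False))
```

Selecting the two columns before `apply` keeps the grouping key out of the per-group frame, so each group only contains the series being correlated.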
Stock Selection
Pick 11 Large-Cap Tech (XLK) firms whose retail investor trades are significantly correlated with their 5-day returns. The selected stocks are ["QCOM", "ADSK", "FSLR", "MSFT", "AMD", "ORCL", "INTU", "WU", "LRCX", "TXN", "CSCO"]
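The text does not say which significance test was used, so the following is only one possible sketch of the selection step. It assumes a hypothetical `sector` column, a 5% significance level, and an approximate two-sided p-value from the Fisher z-transform (a large-sample normal approximation). The data are synthetic.

```python
import math
from statistics import NormalDist

import numpy as np
import pandas as pd


def corr_with_pvalue(x: pd.Series, y: pd.Series) -> pd.Series:
    """Pearson r with an approximate two-sided p-value (Fisher z-transform)."""
    pair = pd.concat([x, y], axis=1).dropna()
    n = len(pair)
    r = pair.iloc[:, 0].corr(pair.iloc[:, 1]) if n >= 4 else np.nan
    if not np.isfinite(r):
        return pd.Series({"r": np.nan, "p": np.nan, "n": n})
    # Clip so atanh stays finite when |r| is numerically equal to 1.
    z = math.atanh(float(np.clip(r, -0.999999, 0.999999))) * math.sqrt(n - 3)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return pd.Series({"r": r, "p": p, "n": n})


def select_stocks(df: pd.DataFrame, group: int = 1, sector: str = "XLK",
                  alpha: float = 0.05) -> list:
    """Return permnos in the given group and sector with p-value below alpha."""
    subset = df[(df["russellgroup"] == group) & (df["sector"] == sector)]
    stats = subset.groupby("permno")[["moribvol", "ret_5"]].apply(
        lambda g: corr_with_pvalue(g["moribvol"], g["ret_5"])
    )
    return stats[stats["p"] < alpha].index.tolist()


# Synthetic panel: stock 1 is strongly correlated, stock 2 is uncorrelated,
# stock 3 is strongly correlated but belongs to another sector.
x = np.tile([-0.2, -0.1, 0.0, 0.1, 0.2], 10)
frames = [
    pd.DataFrame({"permno": 1, "sector": "XLK", "moribvol": x, "ret_5": 0.1 * x + 0.01 * x**2}),
    pd.DataFrame({"permno": 2, "sector": "XLK", "moribvol": x, "ret_5": x**2}),
    pd.DataFrame({"permno": 3, "sector": "XLF", "moribvol": x, "ret_5": 0.1 * x}),
]
toy = pd.concat(frames, ignore_index=True).assign(russellgroup=1)
selected = select_stocks(toy)
print(selected)
```

An exact t-test would be a reasonable alternative to the normal approximation. With many stocks tested at once, some will pass a 5% threshold by chance, so a multiple-testing correction may be worth considering.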