Path: blob/master/docs/source/tutorial/stocktrading/1-data.rst
728 views
:github_url: https://github.com/AI4Finance-Foundation/FinRL ================= Section 1. Data ================= Part 1. Install Packages ================================== .. code-block:: python ## install required packages !pip install swig !pip install wrds !pip install pyportfolioopt ## install finrl library !pip install git+https://github.com/AI4Finance-Foundation/FinRL.git .. code-block:: python import pandas as pd import numpy as np import datetime import yfinance as yf from finrl.meta.preprocessor.yahoodownloader import YahooDownloader from finrl.meta.preprocessor.preprocessors import FeatureEngineer, data_split from finrl import config_tickers from finrl.config import INDICATORS import itertools Part 2. Fetch data ================================== `yfinance <https://github.com/ranaroussi/yfinance>`_ is an open-source library that provides APIs fetching historical data form Yahoo Finance. In FinRL, we have a class called YahooDownloader that use yfinance to fetch data from Yahoo Finance. **OHLCV**: Data downloaded are in the form of OHLCV, corresponding to **open, high, low, close, volume,** respectively. OHLCV is important because they contain most of numerical information of a stock in time series. From OHLCV, traders can get further judgement and prediction like the momentum, people's interest, market trends, etc. Data for a single ticker ---------------------------------------- **using yfinance** .. code-block:: python aapl_df_yf = yf.download(tickers = "aapl", start='2020-01-01', end='2020-01-31') **using FinRL** In FinRL's YahooDownloader, we modified the data frame to the form that convenient for further data processing process. We use adjusted close price instead of close price, and add a column representing the day of a week (0-4 corresponding to Monday-Friday). .. code-block:: python aapl_df_finrl = YahooDownloader(start_date = '2020-01-01', end_date = '2020-01-31', ticker_list = ['aapl']).fetch_data() Data for the chosen ticker ---------------------------------------- .. code-block:: python TRAIN_START_DATE = '2009-01-01' TRAIN_END_DATE = '2020-07-01' TRADE_START_DATE = '2020-07-01' TRADE_END_DATE = '2021-10-29' .. code-block:: python df_raw = YahooDownloader(start_date = TRAIN_START_DATE, end_date = TRADE_END_DATE, ticker_list = config_tickers.DOW_30_TICKER).fetch_data() Part 3. Preprocess Data ================================== We need to check for missing data and do feature engineering to convert the data point into a state. - **Adding technical indicators**. In practical trading, various information needs to be taken into account, such as historical prices, current holding shares, technical indicators, etc. Here, we demonstrate two trend-following technical indicators: MACD and RSI. - **Adding turbulence index**. Risk-aversion reflects whether an investor prefers to protect the capital. It also influences one's trading strategy when facing different market volatility level. To control the risk in a worst-case scenario, such as financial crisis of 2007–2008, FinRL employs the turbulence index that measures extreme fluctuation of asset price. Hear let's take MACD as an example. Moving average convergence/divergence (MACD) is one of the most commonly used indicator showing bull and bear market. Its calculation is based on EMA (Exponential Moving Average indicator, measuring trend direction over a period of time.)