Download and store data
This notebook contains information on downloading the Quandl Wiki stock prices and a few other sources that we use throughout the book.
Imports & Settings
Set Data Store path
Modify the path if you would like to store the data elsewhere, and change the notebooks accordingly.
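A minimal sketch of such a path setting, assuming the store is a single HDF5 file in the current directory. The file name `assets.h5` is an assumption for illustration; adjust it to your setup:

```python
from pathlib import Path

# Assumed location of the HDF5 store; change it if you keep the data elsewhere.
DATA_STORE = Path('assets.h5')
```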
Quandl Wiki Prices
Quandl was acquired by NASDAQ in late 2018. In 2021, NASDAQ integrated Quandl's data platform. Free US equity data is still available under a new URL, subject to the limitations mentioned below.
NASDAQ makes available a dataset with stock prices, dividends and splits for 3,000 US publicly-traded companies. On April 11, 2018, before the acquisition, Quandl announced the end of community support (updates) for this dataset. The historical data are still useful as a first step towards demonstrating the application of machine learning solutions; just make sure you design and test your own algorithms using current, professional-grade data.
Follow the instructions to create a free NASDAQ account
Download the entire WIKI/PRICES data
Extract the .zip file
Move the extracted file to this directory and rename it to wiki_prices.csv
Store the data in the fast HDF format (see Chapter 02 on Market & Fundamental Data for details).
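The last step can be sketched roughly as follows. This assumes the CSV has `date` and `ticker` columns, that PyTables is installed for HDF5 support, and that the store lives in `assets.h5`; the helper names are hypothetical:

```python
from pathlib import Path

import pandas as pd

DATA_STORE = Path('assets.h5')  # assumed store location


def load_wiki_prices(csv_path='wiki_prices.csv'):
    """Read the extracted price file, indexed and sorted by (date, ticker)."""
    return (pd.read_csv(csv_path,
                        parse_dates=['date'],
                        index_col=['date', 'ticker'])
            .sort_index())


def store_frame(df, key, store_path=DATA_STORE):
    """Write a DataFrame to the HDF5 store under the given key (needs PyTables)."""
    with pd.HDFStore(store_path) as store:
        store.put(key, df)
```

A call such as `store_frame(load_wiki_prices(), 'quandl/wiki/prices')` would then persist the data; the key name is an assumption, so use whatever key your other notebooks read from.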
Wiki Prices Metadata
Quandl used to make some stock metadata available on its website; I'm making the file available so that readers can run some of the examples in the book:
Instead of using the Quandl API, load the file wiki_stocks.csv as described and store it in HDF5 format.
S&P 500 Prices
Historical S&P 500 prices can be downloaded from FRED (only the last 10 years of daily data are freely available).
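One way to do this is sketched below. It assumes the third-party `pandas-datareader` package is installed and that the FRED series id is `SP500`; the function names are hypothetical:

```python
from datetime import date

import pandas as pd


def tidy_sp500(raw):
    """Rename the FRED series column to 'close' and drop days without a quote."""
    return raw.rename(columns={'SP500': 'close'}).dropna()


def fetch_sp500_from_fred(years=10):
    """Download daily S&P 500 values from FRED (needs network access)."""
    import pandas_datareader.data as web  # third-party, imported lazily

    start = date(date.today().year - years, 1, 1)
    raw = web.DataReader(name='SP500', data_source='fred', start=start)
    return tidy_sp500(raw)
```

Keeping the clean-up in its own function makes it possible to check that step without touching the network.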
Alternatively, download S&P 500 data from stooq.com; at the time of writing, the data went back to 1789. You can switch from Polish to English on the lower right-hand side.
We store the data from 1950 to 2020.
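A minimal sketch of loading a manually downloaded stooq file and keeping the 1950 to 2020 range. The file name `^spx_d.csv` and the layout (a date column first, followed by price columns) are assumptions about the download:

```python
import pandas as pd


def load_stooq_sp500(csv_path='^spx_d.csv', start='1950', end='2020'):
    """Load a daily stooq CSV and keep rows from `start` to `end` (inclusive years)."""
    df = pd.read_csv(csv_path, index_col=0, parse_dates=True)
    # Lower-case the column labels and sort by date before slicing by year.
    return df.rename(columns=str.lower).sort_index().loc[start:end]
```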
S&P 500 Constituents
The current S&P 500 constituents can be downloaded from Wikipedia.
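A rough sketch of this download, assuming the constituents are the first table on the Wikipedia page and that it has a `Symbol` column. The page layout changes over time, so treat both as assumptions; the function names are hypothetical:

```python
import pandas as pd

SP500_URL = 'https://en.wikipedia.org/wiki/List_of_S%26P_500_companies'


def tidy_constituents(table):
    """Convert column labels to snake_case and index the table by ticker symbol."""
    table = table.copy()
    table.columns = (table.columns.str.strip().str.lower()
                     .str.replace(r'[^a-z0-9]+', '_', regex=True)
                     .str.strip('_'))
    return table.rename(columns={'symbol': 'ticker'}).set_index('ticker')


def fetch_sp500_constituents(url=SP500_URL):
    """Read the first table on the page (needs network and an HTML parser such as lxml)."""
    return tidy_constituents(pd.read_html(url, header=0)[0])
```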
Metadata on US-traded companies
This section collects several attributes for companies traded on NASDAQ, AMEX and NYSE.
Update: unfortunately, NASDAQ has disabled automatic downloads. However, you can still access and manually download the files at the URL below after filling in the exchange name. For AMEX, the URL becomes
https://www.nasdaq.com/market-activity/stocks/screener?exchange=AMEX&letter=0&render=download
Convert market cap information to numerical format
Market cap is provided as strings, so we need to convert it to numerical format.
Keep only values that carry a unit suffix:
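The conversion could look roughly like the sketch below. It assumes values are formatted like `$1.5B` or `$250M`, with `M` for millions and `B` for billions; the function name and the exact format are assumptions, so check them against the file you downloaded:

```python
import pandas as pd

UNIT_FACTORS = {'M': 1e6, 'B': 1e9}  # assumed unit suffixes


def parse_market_cap(values):
    """Convert strings such as '$1.5B' to floats; entries without a known unit are dropped."""
    text = values.dropna().astype(str).str.strip()
    # Split each entry into its numeric part and its unit suffix.
    parts = text.str.extract(r'^\$?([\d.,]+)\s*([MB])$')
    number = pd.to_numeric(parts[0].str.replace(',', '', regex=False),
                           errors='coerce')
    factor = parts[1].map(UNIT_FACTORS)
    return (number * factor).dropna()
```

Entries that are missing or do not match the assumed pattern are excluded from the result, which mirrors the filtering step described above.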
Store result
The file us_equities_meta_data.csv contains a version of the data used for many of the examples. Load it with pandas and proceed to store it in HDF5 format.
MNIST Data
Fashion MNIST Image Data
We will use the Fashion MNIST image data created by Zalando Research for some demonstrations.
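One possible way to obtain the data is via OpenML, sketched below. This assumes scikit-learn is installed and that the dataset is registered on OpenML under the name `Fashion-MNIST`; the function names are hypothetical:

```python
import numpy as np


def to_images(flat):
    """Reshape rows of 784 pixel values into 28 x 28 grayscale images."""
    return np.asarray(flat).reshape(-1, 28, 28)


def fetch_fashion_mnist():
    """Download Fashion MNIST from OpenML (needs scikit-learn and network access)."""
    from sklearn.datasets import fetch_openml  # third-party, imported lazily

    data = fetch_openml('Fashion-MNIST', version=1, as_frame=False)
    # Labels arrive as strings, so convert them to integers.
    return to_images(data.data), data.target.astype(int)
```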
Bond Price Indexes
Several bond indexes can be downloaded from the Federal Reserve Economic Data service (FRED).
Warning: Unfortunately, most of this data has recently been removed from the FRED service. It is not important for the examples in the book, so you can safely skip this step.