QuantConnect
GitHub Repository: QuantConnect/Research
Path: blob/master/Documentation/Python/Predicting-Future-Prices-With-TensorFlow.ipynb
Kernel: Python 3



Predicting Future Prices With TensorFlow

An example of building and training a TensorFlow model, saving it in the ObjectStore, and loading it back.

Import Libraries

Let's start by importing the functionality we'll need to build the model, split the data, and serialize/unserialize the model for saving and loading.

import tensorflow as tf
from sklearn.model_selection import train_test_split
import json5
from google.protobuf import json_format

Gather & Prepare Data

Let's retrieve some intraday data for SPY by making a History request.

qb = QuantBook()
spy = qb.AddEquity("SPY").Symbol
data = qb.History(spy, datetime(2020, 6, 22), datetime(2020, 6, 27), Resolution.Minute).loc[spy].close
data

We'll use the last 5 closing prices of the SPY as inputs to our model. Here, we create a DataFrame containing this data.

lookback = 5
lookback_series = []
for i in range(1, lookback + 1):
    df = data.shift(i)[lookback:-1]
    df.name = f"close_-{i}"
    lookback_series.append(df)
X = pd.concat(lookback_series, axis=1).reset_index(drop=True)
X
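To see what the shift-based feature construction produces, here is a minimal sketch on a short synthetic price series (the values are hypothetical, standing in for the SPY closes): each column close_-i holds the price i steps before the current row.

```python
import pandas as pd

# Synthetic closing prices standing in for the SPY series (hypothetical values)
data = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0])

lookback = 5
lookback_series = []
for i in range(1, lookback + 1):
    s = data.shift(i)[lookback:-1]  # lag by i steps; drop rows without a full history and the final row
    s.name = f"close_-{i}"
    lookback_series.append(s)

X = pd.concat(lookback_series, axis=1).reset_index(drop=True)
print(X)
```

Only two rows survive here because a row needs a full 5-step history and the last row is reserved for the 1-step-ahead target.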

We'd like the model to predict the closing price of the SPY 1 timestep into the future, so let's create a DataFrame containing this data.

Y = data.shift(-1)[lookback:-1].reset_index(drop=True)
Y.plot(figsize=(16, 6))
plt.title("SPY price 1 timestep in the future")
plt.xlabel("Time step")
plt.ylabel("Price")
plt.show()

Now, let's split the data into testing and training sets.

test_size = 0.33
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=test_size, shuffle=False)
print(f"Train index: {X_train.index[0]}...{X_train.index[-1]}")
print(f"Test index: {X_test.index[0]}...{X_test.index[-1]}")
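Passing shuffle=False matters for time series: the test set must be the most recent slice so the model never trains on prices from the future. A minimal sketch of the same chronological split using plain slicing (toy data, illustrative only):

```python
import numpy as np

def chrono_split(X, y, test_size=0.33):
    # Chronological split (no shuffling), mirroring the behaviour of
    # train_test_split(..., shuffle=False): the test set is the most
    # recent slice, so there is no look-ahead leakage
    n_test = int(np.ceil(len(y) * test_size))
    split = len(y) - n_test
    return X[:split], X[split:], y[:split], y[split:]

X_toy = np.arange(12).reshape(-1, 1)  # stand-in time-series features
y_toy = np.arange(12)                 # stand-in targets, ordered in time
X_tr, X_te, y_tr, y_te = chrono_split(X_toy, y_toy)
print(len(y_tr), list(y_te))
```

Every training index precedes every test index, which is exactly the property a shuffled split would destroy.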

Define a Testing Method

To test the model, we'll set up a method that plots the test-set predictions on top of the actual SPY price.

def test_model(sess, output, title, X):
    prediction = sess.run(output, feed_dict={X: X_test})
    prediction = prediction.reshape(prediction.shape[1], 1)
    y_test.reset_index(drop=True).plot(figsize=(16, 6), label="Actual")
    plt.plot(prediction, label="Prediction")
    plt.title(title)
    plt.xlabel("Time step")
    plt.ylabel("SPY Price")
    plt.legend()
    plt.show()

Manually Build the Model

Let's build the neural network architecture using the TensorFlow library. Note how we name the input and output nodes so we can retrieve them when loading the model from the ObjectStore.

tf.reset_default_graph()
sess = tf.Session()

num_factors = X_test.shape[1]
num_neurons_1 = 32
num_neurons_2 = 16
num_neurons_3 = 8

X = tf.placeholder(dtype=tf.float32, shape=[None, num_factors], name='X')
Y = tf.placeholder(dtype=tf.float32, shape=[None])

# Initializers
weight_initializer = tf.variance_scaling_initializer(mode="fan_avg", distribution="uniform", scale=1)
bias_initializer = tf.zeros_initializer()

# Hidden weights
W_hidden_1 = tf.Variable(weight_initializer([num_factors, num_neurons_1]))
bias_hidden_1 = tf.Variable(bias_initializer([num_neurons_1]))
W_hidden_2 = tf.Variable(weight_initializer([num_neurons_1, num_neurons_2]))
bias_hidden_2 = tf.Variable(bias_initializer([num_neurons_2]))
W_hidden_3 = tf.Variable(weight_initializer([num_neurons_2, num_neurons_3]))
bias_hidden_3 = tf.Variable(bias_initializer([num_neurons_3]))

# Output weights
W_out = tf.Variable(weight_initializer([num_neurons_3, 1]))
bias_out = tf.Variable(bias_initializer([1]))

# Hidden layers
hidden_1 = tf.nn.relu(tf.add(tf.matmul(X, W_hidden_1), bias_hidden_1))
hidden_2 = tf.nn.relu(tf.add(tf.matmul(hidden_1, W_hidden_2), bias_hidden_2))
hidden_3 = tf.nn.relu(tf.add(tf.matmul(hidden_2, W_hidden_3), bias_hidden_3))

# Output layer
output = tf.transpose(tf.add(tf.matmul(hidden_3, W_out), bias_out), name='outer')
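The graph above is a plain feed-forward network: 5 lagged closes in, three ReLU hidden layers of 32, 16, and 8 units, and one linear output. As a sanity check on the shapes, here is a minimal NumPy sketch of the same forward pass (randomly initialized, illustrative only, not the trained model):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward(x, params):
    # Three ReLU hidden layers (32, 16, 8 units) and a linear output,
    # matching the layer structure of the TensorFlow graph above
    (W1, b1), (W2, b2), (W3, b3), (W_out, b_out) = params
    h1 = relu(x @ W1 + b1)
    h2 = relu(h1 @ W2 + b2)
    h3 = relu(h2 @ W3 + b3)
    return h3 @ W_out + b_out

rng = np.random.default_rng(0)
sizes = [5, 32, 16, 8, 1]  # 5 lagged closes in, 1 price prediction out
params = [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]

batch = rng.standard_normal((4, 5))  # 4 hypothetical input samples
out = forward(batch, params)
print(out.shape)
```

Each row of the batch yields one scalar prediction; the TF version additionally transposes the output, which is why test_model reshapes it back.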

We'll train the neural network by iteratively minimizing the mean squared difference between the model predictions and the actual SPY price.

loss = tf.reduce_mean(tf.squared_difference(output, Y))
optimizer = tf.train.AdamOptimizer().minimize(loss)
sess.run(tf.global_variables_initializer())

batch_size = len(y_train) // 10
epochs = 20
for _ in range(epochs):
    for i in range(0, len(y_train) // batch_size):
        start = i * batch_size
        batch_x = X_train[start:start + batch_size]
        batch_y = y_train[start:start + batch_size]
        sess.run(optimizer, feed_dict={X: batch_x, Y: batch_y})
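The objective and the batching are both simple enough to check by hand. A minimal sketch (toy numbers, not the notebook's data) of the mean-squared-error loss and the contiguous mini-batch slicing used in the loop above:

```python
import numpy as np

def mse(pred, target):
    # Mean squared difference, the same objective as
    # tf.reduce_mean(tf.squared_difference(output, Y))
    pred, target = np.asarray(pred, dtype=float), np.asarray(target, dtype=float)
    return float(np.mean((pred - target) ** 2))

# The training loop slices the training set into 10 contiguous mini-batches
y = np.arange(100)  # stand-in training targets
batch_size = len(y) // 10
batches = [y[i * batch_size:(i + 1) * batch_size]
           for i in range(len(y) // batch_size)]
print(mse([1, 2, 3], [1, 2, 5]), len(batches))
```

With one prediction off by 2, the squared errors are [0, 0, 4] and the mean is 4/3; the 100 targets split cleanly into 10 batches of 10.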

To ensure the model we've built and trained is working, let's plot its predictions on the test set.

test_model(sess, output, "Test Set Results from Original Model", X)

Save Model to the ObjectStore

We first serialize the TensorFlow graph and weights to JSON format, then save these in the ObjectStore by using the Save method.

graph_definition = tf.compat.v1.train.export_meta_graph()
json_graph = json_format.MessageToJson(graph_definition)

def get_json_weights(sess):
    weights = sess.run(tf.compat.v1.trainable_variables())
    weights = [w.tolist() for w in weights]
    weights_list = json5.dumps(weights)
    return weights_list

json_weights = get_json_weights(sess)
sess.close()

qb.ObjectStore.Save('graph', json_graph)
qb.ObjectStore.Save('weights', json_weights)
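Since the weights travel through the ObjectStore as JSON strings, it is worth confirming that the list-of-lists round trip is lossless. A minimal sketch using the standard json module (json5 also parses plain JSON), with hypothetical weight matrices standing in for the trained variables:

```python
import json
import numpy as np

# Hypothetical weight matrices standing in for the trained variables
weights = [np.arange(6, dtype=np.float32).reshape(2, 3),
           np.zeros(3, dtype=np.float32)]

# Serialize as get_json_weights does: arrays -> nested lists -> JSON string
json_weights = json.dumps([w.tolist() for w in weights])

# Deserialize back to arrays, as in the loading step
restored = [np.asarray(w) for w in json.loads(json_weights)]
print([r.shape for r in restored])
```

The values and shapes survive the round trip, which is all the loading step relies on.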

Load Model from the ObjectStore

Let's first retrieve the JSON for the TensorFlow graph and weights that we saved in the ObjectStore.

json_graph = qb.ObjectStore.Read('graph')
json_weights = qb.ObjectStore.Read('weights')

Now let's restore the TensorFlow graph from JSON and select the input and output nodes.

tf.reset_default_graph()
graph_definition = json_format.Parse(json_graph, tf.compat.v1.MetaGraphDef())
sess = tf.Session()
tf.compat.v1.train.import_meta_graph(graph_definition)
X = tf.compat.v1.get_default_graph().get_tensor_by_name('X:0')
output = tf.compat.v1.get_default_graph().get_tensor_by_name('outer:0')

To avoid retraining the model after loading, let's restore the weights.

weights = [np.asarray(x) for x in json5.loads(json_weights)]

assign_ops = []
feed_dict = {}
vs = tf.compat.v1.trainable_variables()
for var, value in zip(vs, weights):
    value = np.asarray(value)
    assign_placeholder = tf.placeholder(var.dtype, shape=value.shape)
    assign_op = var.assign(assign_placeholder)
    assign_ops.append(assign_op)
    feed_dict[assign_placeholder] = value
sess.run(assign_ops, feed_dict=feed_dict);

To ensure the model loaded successfully, let's test it.

test_model(sess, output, "Test Set Results from Loaded Model", X)
sess.close()

Appendix

Below are some helper methods to manage the ObjectStore keys. We can use these to validate that saving and loading were successful.

def get_ObjectStore_keys():
    return [str(j).split(',')[0][1:] for j in qb.ObjectStore.GetEnumerator()]

def clear_ObjectStore():
    for key in get_ObjectStore_keys():
        qb.ObjectStore.Delete(key)

clear_ObjectStore()