
Recurrent Neural Networks (RNNs) are widely used for time series forecasting because of their ability to handle sequential data. This notebook walks through a high-level approach to forecasting with RNNs.

RNNs are a type of neural network architecture that is particularly well suited to tasks involving sequential data. Unlike feedforward neural networks, which process data in fixed-size chunks, RNNs can handle input sequences of arbitrary length.

Key features of RNNs:

  • Recurrent Connections: RNNs have recurrent connections that allow information to persist across different time steps in a sequence. This means that information from previous inputs is considered when processing the current input.

  • Shared Parameters: The same set of weights and biases is applied at each time step, so the network performs the same computation on every element of the sequence (see the NumPy sketch after this list).

  • Time Dependency: RNNs are well-suited for tasks where the order or temporal dependency of data matters, such as time series prediction, language modeling, and speech recognition.

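To make these features concrete, here is a minimal NumPy sketch of a single recurrent step. The dimensions and weights are illustrative assumptions, not part of the original notebook; in practice a layer such as Keras's SimpleRNN implements this loop for you.

import numpy as np

# Illustrative (assumed) dimensions: hidden state of size 4, one input feature
hidden_size, input_size = 4, 1
rng = np.random.default_rng(0)
W_xh = rng.normal(size=(hidden_size, input_size))   # input-to-hidden weights
W_hh = rng.normal(size=(hidden_size, hidden_size))  # hidden-to-hidden (recurrent) weights
b_h = np.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    # The same W_xh, W_hh, b_h are reused at every time step (shared parameters),
    # and h_prev carries information from earlier inputs (recurrent connection).
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

h = np.zeros(hidden_size)  # initial hidden state
for x_t in [np.array([0.1]), np.array([0.2]), np.array([0.3])]:
    h = rnn_step(x_t, h)   # the hidden state accumulates sequence context
print(h)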

1. Understanding RNN for Forecasting

  • Recurrent Layers: RNNs process inputs sequentially, making them effective for time series forecasting. Unlike feedforward neural networks, RNNs have connections that loop back, which allows them to maintain a "memory" of previous inputs.

  • Handling Time Steps: Time series data is naturally structured in sequences, which RNNs can use to model temporal dependencies.

2. Steps for Time Series Forecasting Using RNN

a) Preprocessing the Data
  • Data Normalization: Time series data often requires normalization to a range (like 0-1) to improve model convergence.

  • Create Time Steps: You need to transform the time series into input sequences. For example, if your time series is daily stock prices, you can create windows like [t-10, t-9, ..., t] to predict the value at time t+1. A minimal sketch of both preprocessing steps follows this list.
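A minimal sketch of both preprocessing steps on a synthetic series (the series, the lookback of 10, and the hand-written min-max formula are assumptions for illustration; the examples below do the same with scikit-learn's MinMaxScaler):

import numpy as np

series = np.arange(100, dtype=float)  # stand-in for e.g. daily closing prices
series = (series - series.min()) / (series.max() - series.min())  # scale to [0, 1]

lookback = 10
# Each input window holds `lookback` consecutive values; the target is the next value
X = np.array([series[i:i+lookback] for i in range(len(series) - lookback)])
y = series[lookback:]
X = X.reshape((-1, lookback, 1))  # (samples, time steps, features)
print(X.shape, y.shape)           # (90, 10, 1) (90,)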

b) Splitting the Dataset
  • Train, Validation, and Test Split: It's important to divide the data into training, validation, and test sets. You train the model on the training data and validate it on the validation set to avoid overfitting. Because the observations are ordered in time, the split should be chronological rather than shuffled, as sketched below.
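A minimal sketch of such a chronological split, assuming X and y built as above (the 70/15/15 proportions are illustrative, not prescribed):

# Keep the time order intact: train on the oldest data, test on the newest
n = len(X)
train_end = int(n * 0.70)
val_end = int(n * 0.85)

X_train, y_train = X[:train_end], y[:train_end]
X_val, y_val = X[train_end:val_end], y[train_end:val_end]
X_test, y_test = X[val_end:], y[val_end:]  # most recent data held out for testing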

c) Building the RNN Model
  • You can use frameworks like TensorFlow or PyTorch for this task. The typical RNN layers are:

  • SimpleRNN: The basic RNN layer, but not commonly used for long sequences due to vanishing gradient problems.

  • LSTM (Long Short-Term Memory): A more advanced variant that mitigates the vanishing gradient issue. It's widely used for time series.

  • GRU (Gated Recurrent Unit): A simplified version of LSTM, often faster to train with similar performance. All three layers are drop-in replacements for one another, as the sketch below shows.
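Because these layers share the same interface in Keras, swapping one for another is a one-line change. A minimal sketch, assuming TensorFlow's Keras API and an arbitrary lookback of 10:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, LSTM, GRU, Dense

def build_model(layer_cls, lookback=10):
    # Any of the three recurrent layers fills the same slot in the model
    model = Sequential([
        layer_cls(50, activation='tanh', input_shape=(lookback, 1)),
        Dense(1),
    ])
    model.compile(optimizer='adam', loss='mse')
    return model

# LSTM and GRU carry extra gating parameters, hence larger parameter counts
for layer_cls in (SimpleRNN, LSTM, GRU):
    print(layer_cls.__name__, build_model(layer_cls).count_params())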

Let's create a simple RNN using Keras with some sample data. In this example, we'll feed the model windows of four consecutive numbers and train it to predict the same window shifted one step ahead.

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense

# Generate a simple sequence of numbers
def generate_sequence(length):
    return np.array([i for i in range(1, length + 1)])

# Prepare data
sequence_length = 30
data = generate_sequence(sequence_length)
data
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30])
# Create input-output pairs
X = []
y = []
for i in range(len(data) - 4):
    X.append(data[i:i+4])
    y.append(data[i+1:i+5])
X = np.array(X).reshape((-1, 4, 1))
y = np.array(y).reshape((-1, 4, 1))

# Print first few values of X and y
print("First few input sequences (X):")
print(X[:3])
print("\nFirst few target sequences (y):")
print(y[:1])
First few input sequences (X):
[[[1]
  [2]
  [3]
  [4]]

 [[2]
  [3]
  [4]
  [5]]

 [[3]
  [4]
  [5]
  [6]]]

First few target sequences (y):
[[[2]
  [3]
  [4]
  [5]]]
# Define RNN model
model = Sequential()
model.add(SimpleRNN(50, activation='relu', input_shape=(4, 1)))
model.add(Dense(4))
model.compile(optimizer='adam', loss='mse')
# Train the model
model.fit(X, y, epochs=300, verbose=0)
<keras.callbacks.History at 0x1a3a088b0d0>
# Generate a new sequence
input_sequence = np.array([[24, 25, 26, 27]]).reshape((1, 4, 1))
prediction = model.predict(input_sequence, verbose=0)
# Print input sequence in 2D format
print("\nInput Sequence (2D format):")
print(input_sequence.reshape((4, 1)))
Input Sequence (2D format):
[[24]
 [25]
 [26]
 [27]]
# Print the next number prediction
print("\nNext Sequence Prediction:")
print(prediction.flatten())
Next Sequence Prediction:
[24.95547 25.974632 27.013697 28.19194 ]

Example 2: Steps for Simple RNN Time Series Forecasting

# Import libraries
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import SimpleRNN, Dense
from sklearn.preprocessing import MinMaxScaler
# Generate a sine wave as an example of time series data
def generate_sine_wave(length, time_steps):
    x = np.linspace(0, length, time_steps)
    return np.sin(x)

# Example: Generate a sine wave
data = generate_sine_wave(50, 200)  # 50 units length, 200 data points

# Plot the data
plt.plot(data)
plt.title("Generated Sine Wave")
plt.show()
[Figure: "Generated Sine Wave" plot]

We need to scale the data and create sequences of past observations to predict the future.

# Scale the data to [0, 1] using MinMaxScaler
scaler = MinMaxScaler(feature_range=(0, 1))
data = scaler.fit_transform(data.reshape(-1, 1))

# Create sequences (e.g., using the last 10 time steps to predict the next one)
def create_sequences(data, sequence_length):
    X, y = [], []
    for i in range(len(data) - sequence_length):
        X.append(data[i:i+sequence_length])
        y.append(data[i+sequence_length])
    return np.array(X), np.array(y)

sequence_length = 10  # Look back 10 time steps
X, y = create_sequences(data, sequence_length)

# Reshape X to be (samples, time steps, features)
X = X.reshape((X.shape[0], X.shape[1], 1))  # 1 feature (the sine wave value)

Build the Simple RNN Model

Now, we create a basic Simple RNN model. We'll use one RNN layer followed by a Dense layer to output the predicted value.

# Define the Simple RNN model
model = Sequential()
model.add(SimpleRNN(units=50, activation='tanh', input_shape=(sequence_length, 1)))
model.add(Dense(1))  # Output layer (1 value as output)

# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')

# Train the model
history = model.fit(X, y, epochs=50, batch_size=16, verbose=1)
Epoch 1/50
12/12 [==============================] - 1s 2ms/step - loss: 0.2594
Epoch 2/50
12/12 [==============================] - 0s 3ms/step - loss: 0.0463
Epoch 3/50
12/12 [==============================] - 0s 4ms/step - loss: 0.0133
...
Epoch 49/50
12/12 [==============================] - 0s 2ms/step - loss: 5.3390e-06
Epoch 50/50
12/12 [==============================] - 0s 2ms/step - loss: 5.0227e-06

Make Predictions

Once the model is trained, we can check how well it fits by predicting on the training data. (A rigorous evaluation would use a held-out test set, as in the stock example below.)

# Predict on the training data (to demonstrate how the model fits)
predicted = model.predict(X)

# Inverse transform the scaled data back to the original values
predicted = scaler.inverse_transform(predicted)
y_true = scaler.inverse_transform(y)

# Plot the actual vs. predicted
plt.plot(y_true, label='True Data')
plt.plot(predicted, label='Predicted Data')
plt.legend()
plt.show()
6/6 [==============================] - 0s 2ms/step
[Figure: true vs. predicted values on the training data]

Forecast Future Values

We can also generate forecasts by feeding the model the last few time steps and predicting the next step.

# Forecast future steps
def forecast_future(model, input_data, n_steps):
    predictions = []
    current_input = input_data[-sequence_length:]  # Start from the last sequence
    for _ in range(n_steps):
        current_input = current_input.reshape((1, sequence_length, 1))
        next_value = model.predict(current_input)
        predictions.append(next_value[0, 0])
        current_input = np.append(current_input[0, 1:], next_value, axis=0)  # Move the window forward
    return predictions

# Forecast 20 future time steps
future_steps = 20
forecast = forecast_future(model, data, future_steps)

# Plot the forecasted future values
plt.plot(np.arange(len(data)), scaler.inverse_transform(data), label='Historical Data')
plt.plot(np.arange(len(data), len(data) + future_steps),
         scaler.inverse_transform(np.array(forecast).reshape(-1, 1)), label='Forecast')
plt.legend()
plt.show()
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 20ms/step
...
1/1 [==============================] - 0s 28ms/step
[Figure: historical sine wave data with 20-step forecast]

Forecasting Stock Prices Using an LSTM Model in Python

# pip install numpy pandas matplotlib scikit-learn keras tensorflow
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import LSTM, Dense
# Load the stock data
df = pd.read_csv('stockdata.csv')

# Display the first few rows
print(df.head())

# Set 'Date' as the index (optional, but helpful for time series analysis)
df['Date'] = pd.to_datetime(df['Date'])
df.set_index('Date', inplace=True)

# Extract only the 'Close' column for prediction
data = df['Close'].values.reshape(-1, 1)

# Scale the data to the range [0, 1]
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(data)

# Plot the stock price history
plt.plot(df['Close'], label='Stock Price History')
plt.title("Stock Price Over Time")
plt.xlabel('Date')
plt.ylabel('Close Price')
plt.legend()
plt.show()

Create Sequences for LSTM

For LSTM, we need to create sequences of data (e.g., 60 past days to predict the next day). You can choose how many time steps (lookback period) you want the LSTM to use for prediction.

# Create sequences: using 60 time steps (you can adjust this)
sequence_length = 60

def create_sequences(data, sequence_length):
    X, y = [], []
    for i in range(sequence_length, len(data)):
        X.append(data[i-sequence_length:i, 0])
        y.append(data[i, 0])
    return np.array(X), np.array(y)

X, y = create_sequences(scaled_data, sequence_length)

# Reshape X to be (samples, time steps, features) as required for LSTM
X = np.reshape(X, (X.shape[0], X.shape[1], 1))

Split the Data into Training and Testing Sets

We need to split the dataset into training and testing sets. Typically, we train on the first 80% of the data and test on the remaining 20%.

# Split data: 80% for training, 20% for testing
train_size = int(len(X) * 0.8)
X_train, X_test = X[:train_size], X[train_size:]
y_train, y_test = y[:train_size], y[train_size:]

Build the LSTM Model

We will now define the LSTM model architecture. It will consist of two LSTM layers and a Dense layer to predict the next value.

# Build the LSTM model
model = Sequential()

# Add LSTM layers
model.add(LSTM(units=50, return_sequences=True, input_shape=(X_train.shape[1], 1)))
model.add(LSTM(units=50))

# Add a Dense layer for the output
model.add(Dense(1))

# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')

# Train the model
model.fit(X_train, y_train, epochs=20, batch_size=32)

Make Predictions

Once the model is trained, we can use it to predict stock prices on the test data and evaluate the performance; a numeric error metric is sketched after the plot.

# Make predictions on the test data
predicted_stock_price = model.predict(X_test)

# Inverse transform the predictions back to the original scale
predicted_stock_price = scaler.inverse_transform(predicted_stock_price)
y_test_actual = scaler.inverse_transform(y_test.reshape(-1, 1))

# Plot the results
plt.plot(y_test_actual, label='Actual Stock Price')
plt.plot(predicted_stock_price, label='Predicted Stock Price')
plt.title('Stock Price Prediction')
plt.xlabel('Time')
plt.ylabel('Stock Price')
plt.legend()
plt.show()
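Beyond the visual comparison, it helps to report a numeric error. A minimal sketch using RMSE and MAE on the arrays computed above (the choice of metrics is an assumption, not part of the original notebook):

import numpy as np
from sklearn.metrics import mean_squared_error

# RMSE is in the same units as the stock price, which makes it easy to interpret
rmse = np.sqrt(mean_squared_error(y_test_actual, predicted_stock_price))
mae = np.mean(np.abs(y_test_actual - predicted_stock_price))
print(f"Test RMSE: {rmse:.2f}")
print(f"Test MAE:  {mae:.2f}")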

Forecast Future Stock Prices

Now, we can use the trained model to forecast future stock prices by feeding it the last sequence of data.

# Predict future prices
def forecast_future(model, last_sequence, n_steps):
    predictions = []
    current_input = last_sequence.reshape((1, sequence_length, 1))
    for _ in range(n_steps):
        next_value = model.predict(current_input)
        predictions.append(next_value[0, 0])
        # Move the window forward: drop the oldest step, append the new prediction
        current_input = np.append(current_input[0, 1:], next_value, axis=0)
        current_input = current_input.reshape((1, sequence_length, 1))
    return np.array(predictions)

# Forecast the next 30 days
last_sequence = scaled_data[-sequence_length:]
future_predictions = forecast_future(model, last_sequence, 30)

# Inverse transform predictions
future_predictions = scaler.inverse_transform(future_predictions.reshape(-1, 1))

# Plot future predictions
plt.plot(np.arange(len(df), len(df) + len(future_predictions)), future_predictions, label='Future Forecast')
plt.title('Future Stock Price Forecast')
plt.xlabel('Time')
plt.ylabel('Stock Price')
plt.legend()
plt.show()