Recurrent Neural Networks (RNNs) are a type of neural network architecture that is particularly well suited to tasks involving sequential data. Unlike feedforward neural networks, which require fixed-size inputs, RNNs can handle input sequences of arbitrary length.
Key features of RNNs:
Recurrent Connections: RNNs have recurrent connections that allow information to persist across different time steps in a sequence. This means that information from previous inputs is considered when processing the current input.
Shared Parameters: The same set of weights and biases are applied at each time step. This allows the network to use the same computation for different elements of the sequence.
Time Dependency: RNNs are well-suited for tasks where the order or temporal dependency of data matters, such as time series prediction, language modeling, and speech recognition.
Applications of RNNs:
Language Modeling and Text Generation: RNNs can be used to model the probability distribution of sequences of words. This enables tasks like auto-completion, machine translation, and text generation.
Time Series Prediction: RNNs are effective for tasks like stock price prediction, weather forecasting, and any scenario where the current state depends on previous states.
Speech Recognition: RNNs can be used to convert spoken language into written text. This is crucial for applications like voice assistants (e.g., Siri, Alexa).
Handwriting Recognition: RNNs can recognize handwritten text, enabling applications like digit recognition and signature verification.
Image Captioning: RNNs can be combined with Convolutional Neural Networks (CNNs) to generate captions for images.
Video Analysis: RNNs can process sequences of images or video frames, making them useful for tasks like action recognition, video captioning, and video prediction.
Anomaly Detection: RNNs can be used to detect anomalies in sequences of data, making them valuable for tasks like fraud detection in finance or detecting defects in manufacturing.
Sentiment Analysis: RNNs can analyze sequences of text to determine the sentiment expressed.
Mathematical Implementation:
Terms:
$x_t$: Input at time step $t$
$h_t$: Hidden state at time step $t$
$W_{hx}$: Weight matrix for the input-to-hidden connections
$W_{hh}$: Weight matrix for the hidden-to-hidden connections
$b_h$: Bias term for the hidden layer
$W_{yh}$: Weight matrix for the hidden-to-output connections
$b_y$: Bias term for the output layer
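With these terms, a vanilla RNN computes its hidden state and output at each time step as follows (a tanh hidden activation is assumed here, matching the NumPy sketch below; the output activation $f$ is task-dependent, e.g. sigmoid or softmax):

$$h_t = \tanh(W_{hx} x_t + W_{hh} h_{t-1} + b_h)$$

$$y_t = f(W_{yh} h_t + b_y)$$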
Training:
During training, you would use backpropagation through time (BPTT) to compute gradients and update the weights and biases to minimize the loss function.
Prediction:
Once the network is trained, you can make predictions by passing a sequence of inputs through it. This is a basic mathematical description of a simple RNN. In practice, more sophisticated variants such as LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit) are often used to address issues like vanishing gradients and to better capture long-term dependencies.
Below is a basic implementation of a simple RNN using only the NumPy library. This code demonstrates how you can manually perform the forward and backward passes through time.
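A minimal sketch of such an implementation, assuming a tanh hidden activation and a sigmoid output with squared-error loss; the initialization scale and learning rate are illustrative choices:

```python
import numpy as np

# Activation functions and their derivatives.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(y):
    # y is already sigmoid(x), so the derivative is y * (1 - y).
    return y * (1.0 - y)

def tanh(x):
    return np.tanh(x)

def tanh_derivative(y):
    # y is already tanh(x), so the derivative is 1 - y^2.
    return 1.0 - y ** 2

class SimpleRNN:
    def __init__(self, input_size, hidden_size, output_size, lr=0.01):
        self.lr = lr
        # Weight matrices and biases, matching the terms defined above.
        self.W_hx = np.random.randn(hidden_size, input_size) * 0.01
        self.W_hh = np.random.randn(hidden_size, hidden_size) * 0.01
        self.b_h = np.zeros((hidden_size, 1))
        self.W_yh = np.random.randn(output_size, hidden_size) * 0.01
        self.b_y = np.zeros((output_size, 1))

    def forward(self, inputs):
        """inputs: list of column vectors x_t. Caches states for BPTT."""
        h = np.zeros((self.W_hh.shape[0], 1))
        self.inputs, self.hs, self.ys = inputs, {-1: h}, {}
        for t, x in enumerate(inputs):
            # h_t = tanh(W_hx x_t + W_hh h_{t-1} + b_h)
            h = tanh(self.W_hx @ x + self.W_hh @ self.hs[t - 1] + self.b_h)
            self.hs[t] = h
            # y_t = sigmoid(W_yh h_t + b_y)
            self.ys[t] = sigmoid(self.W_yh @ h + self.b_y)
        return self.ys

    def backward(self, targets):
        """BPTT with squared-error loss; updates weights in place."""
        dW_hx, dW_hh = np.zeros_like(self.W_hx), np.zeros_like(self.W_hh)
        dW_yh, db_y = np.zeros_like(self.W_yh), np.zeros_like(self.b_y)
        db_h = np.zeros_like(self.b_h)
        dh_next = np.zeros_like(self.b_h)  # gradient flowing from t+1
        for t in reversed(range(len(self.inputs))):
            # Output-layer gradient at step t.
            dy = (self.ys[t] - targets[t]) * sigmoid_derivative(self.ys[t])
            dW_yh += dy @ self.hs[t].T
            db_y += dy
            # Hidden-state gradient: from the output plus the future step.
            dh = self.W_yh.T @ dy + dh_next
            dh_raw = tanh_derivative(self.hs[t]) * dh
            dW_hx += dh_raw @ self.inputs[t].T
            dW_hh += dh_raw @ self.hs[t - 1].T
            db_h += dh_raw
            dh_next = self.W_hh.T @ dh_raw
        # Gradient-descent update on all parameters.
        for param, grad in [(self.W_hx, dW_hx), (self.W_hh, dW_hh),
                            (self.b_h, db_h), (self.W_yh, dW_yh),
                            (self.b_y, db_y)]:
            param -= self.lr * grad
```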
Explanation:
The code defines a basic RNN class (SimpleRNN) with methods for forward pass (forward) and backward pass (backward).
The activation functions (sigmoid and tanh) and their derivatives are defined.
The forward method performs a forward pass through the RNN, storing intermediate values for backpropagation.
The backward method computes gradients and updates the weights and biases using backpropagation through time (BPTT).
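A quick usage sketch for the class above, training it to predict the next value of a tiny toy sequence (the shapes and hyperparameters here are illustrative):

```python
import numpy as np

# Inputs are column vectors of size 1; the target at each step
# is simply the next value in the sequence.
rnn = SimpleRNN(input_size=1, hidden_size=8, output_size=1, lr=0.1)
seq = [np.array([[v]]) for v in (0.1, 0.2, 0.3, 0.4, 0.5)]
inputs, targets = seq[:-1], seq[1:]

for _ in range(2000):  # simple training loop
    rnn.forward(inputs)
    rnn.backward(targets)

preds = rnn.forward(inputs)
print({t: round(p.item(), 3) for t, p in preds.items()})  # should approach 0.2 .. 0.5
```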
Let us use the Keras library to create and train a basic RNN on a toy sequence-prediction example. This example uses a very simple sequence (1, 2, 3, 4, 5) and tries to predict the next number in the sequence.
Here X holds sliding windows over the sequence, and Y = (5, 6, 7, 8) holds the corresponding next-number targets.
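A minimal sketch of this Keras example, assuming sliding windows of length 4 over the pattern extended through 8 (so the targets are Y = (5, 6, 7, 8) as above); the layer width, epoch count, and optimizer are illustrative choices:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Build sliding-window training pairs from the sequence 1..8:
# X = [1,2,3,4] -> y = 5, [2,3,4,5] -> y = 6, and so on.
seq = np.arange(1, 9, dtype=np.float32)
window = 4
X = np.array([seq[i:i + window] for i in range(len(seq) - window)])
y = seq[window:]

# SimpleRNN expects input shaped (batch, time steps, features).
X = X.reshape(-1, window, 1)

model = keras.Sequential([
    keras.Input(shape=(window, 1)),
    layers.SimpleRNN(32, activation="tanh"),
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=300, verbose=0)

# Predict the number that follows [5, 6, 7, 8].
test = np.array([5, 6, 7, 8], dtype=np.float32).reshape(1, window, 1)
print(model.predict(test, verbose=0))  # expect a value near 9
```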