Day 6: GAN Fundamentals and Unsupervised Training

GAN stands for Generative Adversarial Networks.

They are a class of artificial intelligence algorithms used in unsupervised machine learning, implemented as a system of two neural networks contesting with each other in a zero-sum game framework. The following points explain GANs:

    1. Two Networks: GANs consist of two neural networks: a generator and a discriminator.

    2. Competitive Training: The generator creates synthetic data samples, while the discriminator distinguishes between real and fake data.

    3. Adversarial Process: The generator aims to generate data that is indistinguishable from real data, while the discriminator aims to correctly classify real and fake samples.

    4. Improvement Iteration: Through iterative training, the generator improves its ability to produce realistic samples, while the discriminator becomes better at distinguishing real from fake.

    5. Wide Applications: GANs have been used successfully in generating realistic images, audio, text, and more, with applications in art generation, data augmentation, and anomaly detection, among others.

The "zero-sum rule" is a concept often used in game theory and economics. It refers to a situation where the gains and losses of one participant are exactly balanced by the gains and losses of another participant. In other words, the total benefit to all participants in the system adds up to zero

Zero-sum rule:

  • Total Gain = Total Loss: In a zero-sum game, the total gains made by all participants in the game equal the total losses suffered by all participants. This means that any gain by one player is directly offset by a loss experienced by another player

  • No Net Benefit: In such scenarios, there is no net benefit gained or lost across all participants. Any advantage gained by one player necessarily comes at the expense of others, resulting in a redistribution rather than a creation of wealth
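A minimal numeric sketch of the two points above, using hypothetical payoffs chosen purely for illustration:

# Hypothetical two-player zero-sum game: whatever player A wins, player B loses.
payoffs = {"player_A": +5, "player_B": -5}

# Total gain equals total loss, so the payoffs sum to zero.
assert sum(payoffs.values()) == 0
print(sum(payoffs.values()))  # 0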


Examples

  • In competitive sports matches, where one team's victory corresponds to another team's defeat, the outcome often follows a zero-sum dynamic.

  • Economic transactions can sometimes be modeled as zero-sum games in certain contexts. For instance, in a simple barter economy, the exchange of goods between two parties can be seen as a zero-sum game, as the value gained by one party is equivalent to the value lost by the other.

  • Contrast with Non-Zero-Sum Games: In contrast to zero-sum games, non-zero-sum games allow for scenarios where all participants can benefit simultaneously. These games often involve cooperative strategies where collaboration can lead to outcomes that are more favorable for all parties involved.

GAN Components


1. Generator

  • In a Generative Adversarial Network (GAN), the generator function is a crucial component responsible for creating new data samples that resemble the training data distribution.

  • The generator takes random noise as input and produces synthetic data that ideally becomes indistinguishable from real data to an external observer, such as the discriminator in the GAN setup.

The components of a typical generator function in a GAN:

  • Input Layer: The generator typically starts with an input layer that takes random noise as input. This noise is usually drawn from a simple probability distribution, such as a Gaussian distribution.

  • Dense Layer: Following the input layer, there is often a dense (fully connected) layer that maps the input noise to a higher-dimensional space. This layer helps to transform the random noise into a format that can be further processed by subsequent layers.

  • Batch Normalization: Batch normalization layers are commonly used in the generator to stabilize and speed up the training process. They normalize the activations of the previous layer across the mini-batch.

  • Activation Function: Activation functions, such as ReLU (Rectified Linear Unit) or Leaky ReLU, are applied after each layer to introduce non-linearity into the network. This allows the generator to learn complex patterns and generate diverse outputs.

  • Reshaping Layer: After several dense layers, there is often a reshaping layer that transforms the output into a 3D tensor, typically representing an image-like structure. This prepares the data for the subsequent convolutional layers.

  • Convolutional Transpose Layers: Convolutional transpose layers, also known as deconvolutional layers, are used to upsample the data, gradually increasing its spatial dimensions. These layers learn to generate high-resolution images from the low-dimensional input noise.

  • Output Layer: Finally, the output layer typically consists of a convolutional transpose layer followed by an activation function, such as Tanh. This layer generates the final synthetic data samples, which ideally resemble the real data samples from the training set.

Note: The generator function in a GAN is trained adversarially with the discriminator. The goal of the generator is to produce synthetic data that is realistic enough to fool the discriminator into classifying it as real. Through this adversarial training process, the generator learns to generate increasingly realistic data samples that capture the underlying structure of the training data distribution.

  • Overall, the generator function plays a central role in the GAN framework, driving the generation of new data samples and enabling the model to learn to produce realistic outputs.
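As a quick sketch of how these pieces fit together (assuming the make_generator_model() defined in the code cells later in this notebook), a random noise vector drawn from a Gaussian is mapped to a synthetic 28x28 image:

import tensorflow as tf

# Draw one random noise vector from a standard Gaussian (the generator's input).
noise = tf.random.normal([1, 100])

# make_generator_model() is defined later in this notebook.
generator = make_generator_model()

# The output is a (1, 28, 28, 1) tensor with values in [-1, 1] due to the tanh output layer.
fake_image = generator(noise, training=False)
print(fake_image.shape)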

2. Discriminator

The discriminator function in a Generative Adversarial Network (GAN) is responsible for distinguishing between real and fake data samples. It learns to classify input data as either coming from the real training data distribution or generated by the generator network.

Components of a discriminator function in a GAN:

  • Input Layer: The discriminator starts with an input layer that receives data samples, which could be either real or generated by the generator. These data samples are typically images, text, or any other type of data that the GAN is designed to generate.

  • Convolutional Layers: Convolutional layers are commonly used in the discriminator to extract features from the input data. These layers consist of multiple filters that slide over the input data, detecting patterns and spatial relationships. Convolutional layers are effective for processing high-dimensional data such as images.

  • Activation Functions: Activation functions like Leaky ReLU (Rectified Linear Unit) are often applied after each convolutional layer to introduce non-linearity into the network. Leaky ReLU helps prevent the vanishing gradient problem by allowing a small, non-zero gradient when the input is negative.

  • Pooling or Strided Convolutions: Pooling layers or strided convolutions are used to downsample the feature maps produced by the convolutional layers. This reduces the spatial dimensions of the data while retaining important features. Pooling layers typically use max-pooling or average-pooling operations.

  • Dropout: Dropout layers may be included to prevent overfitting by randomly dropping a fraction of the neurons during training. This regularization technique helps improve the generalization ability of the discriminator.

  • Flattening Layer: After the convolutional layers, the feature maps are flattened into a one-dimensional vector. This prepares the data for input into the fully connected layers.

  • Fully Connected Layers: Fully connected (dense) layers are used to perform classification based on the extracted features. These layers receive the flattened feature vector as input and output a single value indicating the probability that the input data is real.

  • Output Layer: The output layer typically consists of a single neuron with a sigmoid activation function. This neuron produces a scalar output in the range [0, 1], representing the probability that the input data is real. A value close to 1 indicates a high probability of the input being real, while a value close to 0 indicates a high probability of the input being fake. (In the implementation below, the final Dense(1) layer outputs a raw logit and the sigmoid is folded into the loss via from_logits=True.)

The discriminator function is trained alongside the generator in an adversarial manner. It learns to differentiate between real and fake data samples by minimizing a suitable loss function, such as binary cross-entropy loss. As training progresses, the discriminator becomes better at distinguishing between real and fake data, while the generator learns to produce increasingly realistic data samples to fool the discriminator.
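For reference, writing D(x) for the discriminator's estimated probability that x is real and G(z) for a sample produced by the generator from noise z, the binary cross-entropy loss minimized by the discriminator (the form computed by the discriminator_loss function later in this notebook) is:

$$L_D = -\,\mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big] \;-\; \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]$$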


Step-by-Step explanation of how GANs function:

  1. Initialization: The generator and discriminator neural networks are initialized with random weights.

  2. Training Loop:

  • Generator Input: The generator takes random noise (usually sampled from a Gaussian distribution) as input and generates synthetic data.

  • Real Data: The discriminator is fed with real data samples from the training dataset along with the synthetic data generated by the generator.

  • Discriminator Training: The discriminator is trained to distinguish between real and fake data. It adjusts its weights to improve its ability to classify the input as either real or fake.

  • Generator Training: The generator is trained to generate data that is realistic enough to fool the discriminator. It generates synthetic data and passes it through the discriminator. The generator's weights are adjusted to minimize the discriminator's ability to distinguish between real and fake data. In other words, the generator aims to maximize the probability that the discriminator classifies its generated data as real.

  • Back and Forth Training: This process continues iteratively, with the generator and discriminator improving their performance through successive rounds of training. As the discriminator gets better at distinguishing real from fake data, the generator must also improve to produce more realistic data.

  3. Adversarial Loss Function: The training of GANs is driven by an adversarial loss function. The goal of the generator is to minimize this loss, while the goal of the discriminator is to maximize it. This adversarial setup creates a competitive dynamic where the generator and discriminator are constantly trying to outperform each other (see the minimax objective after this list).

  4. Convergence: Ideally, with enough training, the generator becomes proficient at generating data that is indistinguishable from real data, and the discriminator becomes unable to differentiate between real and fake data. At this point, the GAN is said to have converged.

  5. Evaluation: The trained generator can be used to generate new data samples that are similar to the training data. These generated samples can be evaluated for quality and realism using various metrics and visual inspection.
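The adversarial loss described in point 3 is commonly written as the minimax value function from the original GAN formulation, which the generator tries to minimize and the discriminator tries to maximize; here D(x) is the discriminator's estimated probability that x is real, and G(z) is a sample generated from noise z:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]$$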


Basic implementation of a GAN using TensorFlow.

  • It defines both the generator and discriminator networks, along with the training loop.

import tensorflow as tf
from tensorflow.keras import layers
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.pyplot as plt
from keras.datasets import mnist

# Load the MNIST dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Print a few examples from the dataset
plt.figure(figsize=(10, 5))
for i in range(5):
    plt.subplot(2, 5, i + 1)
    plt.imshow(train_images[i], cmap='gray')
    plt.title(f"Label: {train_labels[i]}")
    plt.axis('off')
plt.show()
[Output: the first five MNIST training digits displayed with their labels]
  • In a convolutional generator, Leaky ReLU introduces a small negative slope to prevent dead neurons, enhancing learning by allowing gradients to flow during backpropagation and thereby improving model stability and convergence. This helps capture subtle features and nuances in generated images, enhancing the overall quality of the output.

  • The assert statements in the generator verify the output dimensions of each layer before proceeding, ensuring compatibility between the data and the layer's parameters, thereby preventing runtime errors and ensuring the integrity of the network's operations.

  • Setting use_bias=False in a convolutional layer means that the layer won't use bias terms. This can be useful where bias terms are unnecessary (for example, when the layer is immediately followed by batch normalization) or could potentially lead to overfitting. By excluding bias terms, the model becomes more constrained, reducing the overall number of parameters and computational complexity while still allowing for effective feature extraction and learning.

  • Stride: The stride determines the step size of the convolution filter. In the discriminator's Conv2D layers, strides greater than one reduce the spatial dimensions of the output volume; in the generator's Conv2DTranspose layers, they increase it, as illustrated in the short sketch after this list.
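A small sketch of the stride effect, using the TensorFlow/Keras layers imported above (the printed shapes are what these layer settings produce):

import tensorflow as tf
from tensorflow.keras import layers

# In a regular Conv2D, a stride of 2 with 'same' padding halves the spatial dimensions.
x = tf.random.normal([1, 28, 28, 1])
down = layers.Conv2D(8, (5, 5), strides=(2, 2), padding='same')(x)
print(down.shape)  # (1, 14, 14, 8)

# In a Conv2DTranspose, a stride of 2 with 'same' padding doubles the spatial dimensions.
z = tf.random.normal([1, 7, 7, 8])
up = layers.Conv2DTranspose(8, (5, 5), strides=(2, 2), padding='same')(z)
print(up.shape)  # (1, 14, 14, 8)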

# Define the Generator network
def make_generator_model():
    model = tf.keras.Sequential()
    model.add(layers.Dense(7*7*256, use_bias=False, input_shape=(100,)))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Reshape((7, 7, 256)))
    assert model.output_shape == (None, 7, 7, 256)  # None is the batch size

    model.add(layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False))
    assert model.output_shape == (None, 7, 7, 128)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 14, 14, 64)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(1, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh'))
    assert model.output_shape == (None, 28, 28, 1)

    return model

# Create the generator
generator = make_generator_model()
# Define the Discriminator network
def make_discriminator_model():
    model = tf.keras.Sequential()
    model.add(layers.Conv2D(64, (5, 5), strides=(2, 2), padding='same', input_shape=[28, 28, 1]))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(0.2))  # Dropout regularization reduces the risk of overfitting

    model.add(layers.Conv2D(128, (5, 5), strides=(2, 2), padding='same'))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(0.2))

    model.add(layers.Flatten())
    model.add(layers.Dense(1))  # Outputs a single logit: higher means "more likely real"
    return model

# Create the discriminator
discriminator = make_discriminator_model()
# Define the loss functions
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

# Define the discriminator loss: real samples should be classified as 1, fake samples as 0
def discriminator_loss(real_output, fake_output):
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    total_loss = real_loss + fake_loss
    return total_loss

# Define the generator loss: the generator wants its fakes to be classified as 1 (real)
def generator_loss(fake_output):
    return cross_entropy(tf.ones_like(fake_output), fake_output)
# Define the optimizers
generator_optimizer = tf.keras.optimizers.Adam(1e-4)
discriminator_optimizer = tf.keras.optimizers.Adam(1e-4)

# Define the training step function
@tf.function
def train_step(images):
    noise = tf.random.normal([BATCH_SIZE, noise_dim])

    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        generated_images = generator(noise, training=True)

        real_output = discriminator(images, training=True)
        fake_output = discriminator(generated_images, training=True)

        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)

    gradients_of_generator = gen_tape.gradient(gen_loss, generator.trainable_variables)
    gradients_of_discriminator = disc_tape.gradient(disc_loss, discriminator.trainable_variables)

    generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))

    return gen_loss, disc_loss
# Define the training loop with visualizations
def train(dataset, epochs):
    for epoch in range(epochs):
        for image_batch in dataset:
            gen_loss, disc_loss = train_step(image_batch)

        print(f'Epoch {epoch+1}/{epochs}, Generator Loss: {gen_loss}, Discriminator Loss: {disc_loss}')

        # Generate and save images for visualization
        if epoch % 10 == 0:
            generate_and_save_images(generator, epoch + 1, seed)

# Function to generate and save images
def generate_and_save_images(model, epoch, test_input):
    predictions = model(test_input, training=False)

    fig = plt.figure(figsize=(4, 4))
    for i in range(predictions.shape[0]):
        plt.subplot(4, 4, i + 1)
        plt.imshow(predictions[i, :, :, 0] * 127.5 + 127.5, cmap='gray')
        plt.axis('off')

    plt.savefig('image_at_epoch_{:04d}.png'.format(epoch))
    plt.show()
# Load and prepare the dataset (MNIST)
mnist = tf.keras.datasets.mnist
(train_images, _), (_, _) = mnist.load_data()
train_images = train_images.reshape(train_images.shape[0], 28, 28, 1).astype('float32')
train_images = (train_images - 127.5) / 127.5  # Normalize the images to [-1, 1]

# Batch and shuffle the data
BUFFER_SIZE = 60000
BATCH_SIZE = 256
train_dataset = tf.data.Dataset.from_tensor_slices(train_images).shuffle(BUFFER_SIZE).batch(BATCH_SIZE)

# Define the dimensionality of the random noise vector
noise_dim = 100
num_examples_to_generate = 16
seed = tf.random.normal([num_examples_to_generate, noise_dim])

# Train the GAN
train(train_dataset, epochs=50)
Epoch 1/50, Generator Loss: 0.8886315822601318, Discriminator Loss: 1.1144211292266846
[Output: 4x4 grid of generated digit images after epoch 1]
Epoch 2/50, Generator Loss: 0.80439293384552, Discriminator Loss: 1.5225005149841309
Epoch 3/50, Generator Loss: 1.018770456314087, Discriminator Loss: 1.027886152267456
Epoch 4/50, Generator Loss: 0.8793073296546936, Discriminator Loss: 1.2916512489318848
Epoch 5/50, Generator Loss: 0.9028756618499756, Discriminator Loss: 1.205275058746338
Epoch 6/50, Generator Loss: 0.7213987112045288, Discriminator Loss: 1.4907300472259521
Epoch 7/50, Generator Loss: 0.7699480652809143, Discriminator Loss: 1.375025749206543
Epoch 8/50, Generator Loss: 1.1869161128997803, Discriminator Loss: 0.8523796796798706
Epoch 9/50, Generator Loss: 1.0505003929138184, Discriminator Loss: 1.0612211227416992
Epoch 10/50, Generator Loss: 1.0153477191925049, Discriminator Loss: 1.0271923542022705
Epoch 11/50, Generator Loss: 1.2612522840499878, Discriminator Loss: 0.871403157711029
[Output: 4x4 grid of generated digit images after epoch 11]
Epoch 12/50, Generator Loss: 0.9438713788986206, Discriminator Loss: 1.238171100616455
Epoch 13/50, Generator Loss: 0.8856446146965027, Discriminator Loss: 1.2689669132232666
Epoch 14/50, Generator Loss: 1.1267590522766113, Discriminator Loss: 1.1643075942993164
Epoch 15/50, Generator Loss: 1.258558988571167, Discriminator Loss: 0.8452894687652588
Epoch 16/50, Generator Loss: 1.2043936252593994, Discriminator Loss: 1.0385801792144775
Epoch 17/50, Generator Loss: 1.2196786403656006, Discriminator Loss: 0.9326040744781494
Epoch 18/50, Generator Loss: 1.3994824886322021, Discriminator Loss: 0.9160407185554504
Epoch 19/50, Generator Loss: 1.2086116075515747, Discriminator Loss: 1.0605415105819702
Epoch 20/50, Generator Loss: 1.3892619609832764, Discriminator Loss: 0.8507324457168579
Epoch 21/50, Generator Loss: 1.2455077171325684, Discriminator Loss: 0.9571502804756165
[Output: 4x4 grid of generated digit images after epoch 21]
Epoch 22/50, Generator Loss: 1.3776977062225342, Discriminator Loss: 0.9134082794189453
Epoch 23/50, Generator Loss: 1.4289569854736328, Discriminator Loss: 0.8536520600318909
Epoch 24/50, Generator Loss: 1.6960281133651733, Discriminator Loss: 0.9384723901748657
Epoch 25/50, Generator Loss: 1.3601925373077393, Discriminator Loss: 0.9052325487136841

Explanation:

  • In Epoch 2, the generator loss is about 0.80 and the discriminator loss is about 1.52. The generator is improving but still isn't producing samples that are very close to the real data, and the discriminator is doing a reasonable job of telling real from fake.

  • Around Epochs 6 and 7, the generator loss drops to roughly 0.72-0.77, indicating improvement in generating realistic samples, while the discriminator loss rises toward 1.4-1.5, indicating that it is finding it harder to differentiate real from fake.

  • From Epoch 8 onward, the generator loss climbs above 1.0 (for example, about 1.19 in Epoch 8 and 1.26 in Epoch 11), which might indicate that the discriminator has become more effective and the generator is finding it harder to fool it. After that, the two losses oscillate around each other, which is the typical back-and-forth of adversarial training as the networks compete.
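As a follow-up, a short sketch (reusing the generator, noise_dim, and matplotlib imports from the cells above) for sampling fresh noise and inspecting newly generated digits after training:

# Sample fresh noise and generate a new batch of synthetic digits with the trained generator.
new_noise = tf.random.normal([16, noise_dim])
generated = generator(new_noise, training=False)

# Rescale from [-1, 1] back to grayscale pixel values and display a 4x4 grid.
plt.figure(figsize=(4, 4))
for i in range(generated.shape[0]):
    plt.subplot(4, 4, i + 1)
    plt.imshow(generated[i, :, :, 0] * 127.5 + 127.5, cmap='gray')
    plt.axis('off')
plt.show()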