Day 6: GAN Fundamentals and Unsupervised Training

GAN stands for Generative Adversarial Networks.

They are a class of artificial intelligence algorithms used in unsupervised machine learning, implemented as a system of two neural networks contesting with each other in a zero-sum game framework. The following points explain GANs:

    1. Two Networks: GANs consist of two neural networks: a generator and a discriminator.

    2. Competitive Training: The generator creates synthetic data samples, while the discriminator distinguishes between real and fake data.

    3. Adversarial Process: The generator aims to generate data that is indistinguishable from real data, while the discriminator aims to correctly classify real and fake samples.

    4. Improvement Iteration: Through iterative training, the generator improves its ability to produce realistic samples, while the discriminator becomes better at distinguishing real from fake.

    5. Wide Applications: GANs have been used successfully in generating realistic images, audio, text, and more, with applications in art generation, data augmentation, and anomaly detection, among others.

The "zero-sum rule" is a concept often used in game theory and economics. It refers to a situation where the gains and losses of one participant are exactly balanced by the gains and losses of another participant. In other words, the total benefit to all participants in the system adds up to zero

Zero-sum rule:

  • Total Gain = Total Loss: In a zero-sum game, the total gains made by all participants in the game equal the total losses suffered by all participants. This means that any gain by one player is directly offset by a loss experienced by another player

  • No Net Benefit: In such scenarios, there is no net benefit gained or lost across all participants. Any advantage gained by one player necessarily comes at the expense of others, resulting in a redistribution rather than a creation of wealth
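A minimal numeric sketch of the two points above, using hypothetical payoffs chosen purely for illustration:

# Hypothetical two-player zero-sum game: whatever player A wins, player B loses.
payoffs = {"player_A": +5, "player_B": -5}

# Total gain equals total loss, so the payoffs sum to zero.
assert sum(payoffs.values()) == 0
print(sum(payoffs.values()))  # 0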


Examples

  • In competitive sports matches, where one team's victory corresponds to another team's defeat, the outcome often follows a zero-sum dynamic.

  • Economic transactions can sometimes be modeled as zero-sum games in certain contexts. For instance, in a simple barter economy, the exchange of goods between two parties can be seen as a zero-sum game, as the value gained by one party is equivalent to the value lost by the other.

  • Contrast with Non-Zero-Sum Games: In contrast to zero-sum games, non-zero-sum games allow for scenarios where all participants can benefit simultaneously. These games often involve cooperative strategies where collaboration can lead to outcomes that are more favorable for all parties involved.

GAN Components


1. Generator

  • In a Generative Adversarial Network (GAN), the generator function is a crucial component responsible for creating new data samples that resemble the training data distribution.

  • The generator takes random noise as input and produces synthetic data that ideally becomes indistinguishable from real data to an external observer, such as the discriminator in the GAN setup.

The components of a typical generator function in a GAN:

  • Input Layer: The generator typically starts with an input layer that takes random noise as input. This noise is usually drawn from a simple probability distribution, such as a Gaussian distribution.

  • Dense Layer: Following the input layer, there is often a dense (fully connected) layer that maps the input noise to a higher-dimensional space. This layer helps to transform the random noise into a format that can be further processed by subsequent layers.

  • Batch Normalization: Batch normalization layers are commonly used in the generator to stabilize and speed up the training process. They normalize the activations of the previous layer across the mini-batch.

  • Activation Function: Activation functions, such as ReLU (Rectified Linear Unit) or Leaky ReLU, are applied after each layer to introduce non-linearity into the network. This allows the generator to learn complex patterns and generate diverse outputs.

  • Reshaping Layer: After several dense layers, there is often a reshaping layer that transforms the output into a 3D tensor, typically representing an image-like structure. This prepares the data for the subsequent convolutional layers.

  • Convolutional Transpose Layers: Convolutional transpose layers, also known as deconvolutional layers, are used to upsample the data, gradually increasing its spatial dimensions. These layers learn to generate high-resolution images from the low-dimensional input noise.

  • Output Layer: Finally, the output layer typically consists of a convolutional transpose layer followed by an activation function, such as Tanh. This layer generates the final synthetic data samples, which ideally resemble the real data samples from the training set.

Note: The generator function in a GAN is trained adversarially with the discriminator. The goal of the generator is to produce synthetic data that is realistic enough to fool the discriminator into classifying it as real. Through this adversarial training process, the generator learns to generate increasingly realistic data samples that capture the underlying structure of the training data distribution.

  • Overall, the generator function plays a central role in the GAN framework, driving the generation of new data samples and enabling the model to learn to produce realistic outputs.
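As a quick sketch of how these pieces fit together (assuming the make_generator_model() defined in the code cells later in this notebook), a random noise vector drawn from a Gaussian is mapped to a synthetic 28x28 image:

import tensorflow as tf

# Draw one random noise vector from a standard Gaussian (the generator's input).
noise = tf.random.normal([1, 100])

# make_generator_model() is defined later in this notebook.
generator = make_generator_model()

# The output is a (1, 28, 28, 1) tensor with values in [-1, 1] due to the tanh output layer.
fake_image = generator(noise, training=False)
print(fake_image.shape)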

2. Discriminator

The discriminator function in a Generative Adversarial Network (GAN) is responsible for distinguishing between real and fake data samples. It learns to classify input data as either coming from the real training data distribution or generated by the generator network.

Components of a discriminator function in a GAN:

  • Input Layer: The discriminator starts with an input layer that receives data samples, which could be either real or generated by the generator. These data samples are typically images, text, or any other type of data that the GAN is designed to generate.

  • Convolutional Layers: Convolutional layers are commonly used in the discriminator to extract features from the input data. These layers consist of multiple filters that slide over the input data, detecting patterns and spatial relationships. Convolutional layers are effective for processing high-dimensional data such as images.

  • Activation Functions: Activation functions like Leaky ReLU (Rectified Linear Unit) are often applied after each convolutional layer to introduce non-linearity into the network. Leaky ReLU helps prevent the vanishing gradient problem by allowing a small, non-zero gradient when the input is negative.

  • Pooling or Strided Convolutions: Pooling layers or strided convolutions are used to downsample the feature maps produced by the convolutional layers. This reduces the spatial dimensions of the data while retaining important features. Pooling layers typically use max-pooling or average-pooling operations.

  • Dropout: Dropout layers may be included to prevent overfitting by randomly dropping a fraction of the neurons during training. This regularization technique helps improve the generalization ability of the discriminator.

  • Flattening Layer: After the convolutional layers, the feature maps are flattened into a one-dimensional vector. This prepares the data for input into the fully connected layers.

  • Fully Connected Layers: Fully connected (dense) layers are used to perform classification based on the extracted features. These layers receive the flattened feature vector as input and output a single value indicating the probability that the input data is real.

  • Output Layer: The output layer typically consists of a single neuron with a sigmoid activation function. This neuron produces a scalar output in the range [0, 1], representing the probability that the input data is real. A value close to 1 indicates a high probability of the input being real, while a value close to 0 indicates a high probability of the input being fake. (In the implementation below, the final Dense(1) layer outputs a raw logit and the sigmoid is folded into the loss via from_logits=True.)

The discriminator function is trained alongside the generator in an adversarial manner. It learns to differentiate between real and fake data samples by minimizing a suitable loss function, such as binary cross-entropy loss. As training progresses, the discriminator becomes better at distinguishing between real and fake data, while the generator learns to produce increasingly realistic data samples to fool the discriminator.
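For reference, writing D(x) for the discriminator's estimated probability that x is real and G(z) for a sample produced by the generator from noise z, the binary cross-entropy loss minimized by the discriminator (the form computed by the discriminator_loss function later in this notebook) is:

$$L_D = -\,\mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big] \;-\; \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]$$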


Step-by-Step explanation of how GANs function:

  1. Initialization: The generator and discriminator neural networks are initialized with random weights.

  2. Training Loop:

  • Generator Input: The generator takes random noise (usually sampled from a Gaussian distribution) as input and generates synthetic data.

  • Real Data: The discriminator is fed with real data samples from the training dataset along with the synthetic data generated by the generator.

  • Discriminator Training: The discriminator is trained to distinguish between real and fake data. It adjusts its weights to improve its ability to classify the input as either real or fake.

  • Generator Training: The generator is trained to generate data that is realistic enough to fool the discriminator. It generates synthetic data and passes it through the discriminator. The generator's weights are adjusted to minimize the discriminator's ability to distinguish between real and fake data. In other words, the generator aims to maximize the probability that the discriminator classifies its generated data as real.

  • Back and Forth Training: This process continues iteratively, with the generator and discriminator improving their performance through successive rounds of training. As the discriminator gets better at distinguishing real from fake data, the generator must also improve to produce more realistic data.

  3. Adversarial Loss Function: The training of GANs is driven by an adversarial loss function. The goal of the generator is to minimize this loss, while the goal of the discriminator is to maximize it. This adversarial setup creates a competitive dynamic where the generator and discriminator are constantly trying to outperform each other (see the minimax objective after this list).

  4. Convergence: Ideally, with enough training, the generator becomes proficient at generating data that is indistinguishable from real data, and the discriminator becomes unable to differentiate between real and fake data. At this point, the GAN is said to have converged.

  5. Evaluation: The trained generator can be used to generate new data samples that are similar to the training data. These generated samples can be evaluated for quality and realism using various metrics and visual inspection.
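The adversarial loss described in point 3 is commonly written as the minimax value function from the original GAN formulation, which the generator tries to minimize and the discriminator tries to maximize; here D(x) is the discriminator's estimated probability that x is real, and G(z) is a sample generated from noise z:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]$$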


Basic implementation of a GAN using TensorFlow.

  • It defines both the generator and discriminator networks, along with the training loop.

import tensorflow as tf
from tensorflow.keras import layers
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.pyplot as plt
from keras.datasets import mnist

# Load the MNIST dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Print a few examples from the dataset
plt.figure(figsize=(10, 5))
for i in range(5):
    plt.subplot(2, 5, i + 1)
    plt.imshow(train_images[i], cmap='gray')
    plt.title(f"Label: {train_labels[i]}")
    plt.axis('off')
plt.show()
[Output: the first five MNIST training digits displayed with their labels]
  • In a convolutional generator, Leaky ReLU introduces a small negative slope to prevent dead neurons, enhancing learning by allowing gradients to flow during backpropagation and thereby improving model stability and convergence. This helps capture subtle features and nuances in generated images, enhancing the overall quality of the output.

  • The assert statements in the generator verify the output dimensions of each layer before proceeding, ensuring compatibility between the data and the layer's parameters, thereby preventing runtime errors and ensuring the integrity of the network's operations.

  • Setting use_bias=False in a convolutional layer means that the layer won't use bias terms. This can be useful where bias terms are unnecessary (for example, when the layer is immediately followed by batch normalization) or could potentially lead to overfitting. By excluding bias terms, the model becomes more constrained, reducing the overall number of parameters and computational complexity while still allowing for effective feature extraction and learning.

  • Stride: The stride determines the step size of the convolution filter. In the discriminator's Conv2D layers, strides greater than one reduce the spatial dimensions of the output volume; in the generator's Conv2DTranspose layers, they increase it, as illustrated in the short sketch after this list.
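A small sketch of the stride effect, using the TensorFlow/Keras layers imported above (the printed shapes are what these layer settings produce):

import tensorflow as tf
from tensorflow.keras import layers

# In a regular Conv2D, a stride of 2 with 'same' padding halves the spatial dimensions.
x = tf.random.normal([1, 28, 28, 1])
down = layers.Conv2D(8, (5, 5), strides=(2, 2), padding='same')(x)
print(down.shape)  # (1, 14, 14, 8)

# In a Conv2DTranspose, a stride of 2 with 'same' padding doubles the spatial dimensions.
z = tf.random.normal([1, 7, 7, 8])
up = layers.Conv2DTranspose(8, (5, 5), strides=(2, 2), padding='same')(z)
print(up.shape)  # (1, 14, 14, 8)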

# Define the Generator network
def make_generator_model():
    model = tf.keras.Sequential()
    model.add(layers.Dense(7*7*256, use_bias=False, input_shape=(100,)))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Reshape((7, 7, 256)))
    assert model.output_shape == (None, 7, 7, 256)  # None is the batch size

    model.add(layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False))
    assert model.output_shape == (None, 7, 7, 128)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 14, 14, 64)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(1, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh'))
    assert model.output_shape == (None, 28, 28, 1)

    return model

# Create the generator
generator = make_generator_model()
# Define the Discriminator network
def make_discriminator_model():
    model = tf.keras.Sequential()
    model.add(layers.Conv2D(64, (5, 5), strides=(2, 2), padding='same', input_shape=[28, 28, 1]))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(0.2))  # Dropout regularization reduces the risk of overfitting

    model.add(layers.Conv2D(128, (5, 5), strides=(2, 2), padding='same'))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(0.2))

    model.add(layers.Flatten())
    model.add(layers.Dense(1))  # Outputs a single logit: higher means "more likely real"
    return model

# Create the discriminator
discriminator = make_discriminator_model()
# Define the loss functions
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

# Define the discriminator loss: real samples should be classified as 1, fake samples as 0
def discriminator_loss(real_output, fake_output):
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    total_loss = real_loss + fake_loss
    return total_loss

# Define the generator loss: the generator wants its fakes to be classified as 1 (real)
def generator_loss(fake_output):
    return cross_entropy(tf.ones_like(fake_output), fake_output)
# Define the optimizers
generator_optimizer = tf.keras.optimizers.Adam(1e-4)
discriminator_optimizer = tf.keras.optimizers.Adam(1e-4)

# Define the training step function
@tf.function
def train_step(images):
    noise = tf.random.normal([BATCH_SIZE, noise_dim])

    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        generated_images = generator(noise, training=True)

        real_output = discriminator(images, training=True)
        fake_output = discriminator(generated_images, training=True)

        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)

    gradients_of_generator = gen_tape.gradient(gen_loss, generator.trainable_variables)
    gradients_of_discriminator = disc_tape.gradient(disc_loss, discriminator.trainable_variables)

    generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))

    return gen_loss, disc_loss
# Define the training loop with visualizations
def train(dataset, epochs):
    for epoch in range(epochs):
        for image_batch in dataset:
            gen_loss, disc_loss = train_step(image_batch)

        print(f'Epoch {epoch+1}/{epochs}, Generator Loss: {gen_loss}, Discriminator Loss: {disc_loss}')

        # Generate and save images for visualization
        if epoch % 10 == 0:
            generate_and_save_images(generator, epoch + 1, seed)

# Function to generate and save images
def generate_and_save_images(model, epoch, test_input):
    predictions = model(test_input, training=False)

    fig = plt.figure(figsize=(4, 4))
    for i in range(predictions.shape[0]):
        plt.subplot(4, 4, i + 1)
        plt.imshow(predictions[i, :, :, 0] * 127.5 + 127.5, cmap='gray')
        plt.axis('off')

    plt.savefig('image_at_epoch_{:04d}.png'.format(epoch))
    plt.show()
# Load and prepare the dataset (MNIST)
mnist = tf.keras.datasets.mnist
(train_images, _), (_, _) = mnist.load_data()
train_images = train_images.reshape(train_images.shape[0], 28, 28, 1).astype('float32')
train_images = (train_images - 127.5) / 127.5  # Normalize the images to [-1, 1]

# Batch and shuffle the data
BUFFER_SIZE = 60000
BATCH_SIZE = 256
train_dataset = tf.data.Dataset.from_tensor_slices(train_images).shuffle(BUFFER_SIZE).batch(BATCH_SIZE)

# Define the dimensionality of the random noise vector
noise_dim = 100
num_examples_to_generate = 16
seed = tf.random.normal([num_examples_to_generate, noise_dim])

# Train the GAN
train(train_dataset, epochs=50)
Epoch 1/50, Generator Loss: 0.8886315822601318, Discriminator Loss: 1.1144211292266846
[Output: 4x4 grid of generated digit images after epoch 1]
Epoch 2/50, Generator Loss: 0.80439293384552, Discriminator Loss: 1.5225005149841309
Epoch 3/50, Generator Loss: 1.018770456314087, Discriminator Loss: 1.027886152267456
Epoch 4/50, Generator Loss: 0.8793073296546936, Discriminator Loss: 1.2916512489318848
Epoch 5/50, Generator Loss: 0.9028756618499756, Discriminator Loss: 1.205275058746338
Epoch 6/50, Generator Loss: 0.7213987112045288, Discriminator Loss: 1.4907300472259521
Epoch 7/50, Generator Loss: 0.7699480652809143, Discriminator Loss: 1.375025749206543
Epoch 8/50, Generator Loss: 1.1869161128997803, Discriminator Loss: 0.8523796796798706
Epoch 9/50, Generator Loss: 1.0505003929138184, Discriminator Loss: 1.0612211227416992
Epoch 10/50, Generator Loss: 1.0153477191925049, Discriminator Loss: 1.0271923542022705
Epoch 11/50, Generator Loss: 1.2612522840499878, Discriminator Loss: 0.871403157711029
[Output: 4x4 grid of generated digit images after epoch 11]
Epoch 12/50, Generator Loss: 0.9438713788986206, Discriminator Loss: 1.238171100616455
Epoch 13/50, Generator Loss: 0.8856446146965027, Discriminator Loss: 1.2689669132232666
Epoch 14/50, Generator Loss: 1.1267590522766113, Discriminator Loss: 1.1643075942993164
Epoch 15/50, Generator Loss: 1.258558988571167, Discriminator Loss: 0.8452894687652588
Epoch 16/50, Generator Loss: 1.2043936252593994, Discriminator Loss: 1.0385801792144775
Epoch 17/50, Generator Loss: 1.2196786403656006, Discriminator Loss: 0.9326040744781494
Epoch 18/50, Generator Loss: 1.3994824886322021, Discriminator Loss: 0.9160407185554504
Epoch 19/50, Generator Loss: 1.2086116075515747, Discriminator Loss: 1.0605415105819702
Epoch 20/50, Generator Loss: 1.3892619609832764, Discriminator Loss: 0.8507324457168579
Epoch 21/50, Generator Loss: 1.2455077171325684, Discriminator Loss: 0.9571502804756165
[Output: 4x4 grid of generated digit images after epoch 21]
Epoch 22/50, Generator Loss: 1.3776977062225342, Discriminator Loss: 0.9134082794189453
Epoch 23/50, Generator Loss: 1.4289569854736328, Discriminator Loss: 0.8536520600318909
Epoch 24/50, Generator Loss: 1.6960281133651733, Discriminator Loss: 0.9384723901748657
Epoch 25/50, Generator Loss: 1.3601925373077393, Discriminator Loss: 0.9052325487136841

Explanation:

  • In Epoch 2, the generator loss is about 0.80 and the discriminator loss is about 1.52. The generator is improving but still isn't producing samples that are very close to the real data, and the discriminator is doing a reasonable job of telling real from fake.

  • Around Epochs 6 and 7, the generator loss drops to roughly 0.72-0.77, indicating improvement in generating realistic samples, while the discriminator loss rises toward 1.4-1.5, indicating that it is finding it harder to differentiate real from fake.

  • From Epoch 8 onward, the generator loss climbs above 1.0 (for example, about 1.19 in Epoch 8 and 1.26 in Epoch 11), which might indicate that the discriminator has become more effective and the generator is finding it harder to fool it. After that, the two losses oscillate around each other, which is the typical back-and-forth of adversarial training as the networks compete.
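As a follow-up, a short sketch (reusing the generator, noise_dim, and matplotlib imports from the cells above) for sampling fresh noise and inspecting newly generated digits after training:

# Sample fresh noise and generate a new batch of synthetic digits with the trained generator.
new_noise = tf.random.normal([16, noise_dim])
generated = generator(new_noise, training=False)

# Rescale from [-1, 1] back to grayscale pixel values and display a 4x4 grid.
plt.figure(figsize=(4, 4))
for i in range(generated.shape[0]):
    plt.subplot(4, 4, i + 1)
    plt.imshow(generated[i, :, :, 0] * 127.5 + 127.5, cmap='gray')
    plt.axis('off')
plt.show()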