Path: blob/master/Applied Generative AI with GANS/7 Convolutional Neural Networks (CNNs).ipynb
Convolutional Neural Networks (CNNs) are a specialized type of neural network architecture that excel at processing grid-like data, such as images and videos. They are designed to automatically and adaptively learn spatial hierarchies of features from input data.
Applications of CNNs:
Image Classification: CNNs assign a single class label to an entire image, e.g., deciding whether a photo shows a cat or a dog.
Object Detection: CNNs locate objects within an image and classify them, typically by predicting a bounding box and a class label for each object.
Semantic Segmentation: In semantic segmentation, CNNs assign a class label to each pixel in an image. This is useful for tasks like medical image analysis, where precise delineation of structures (e.g., tumors) is critical.
Instance Segmentation: This is a more advanced form of segmentation where, in addition to labeling pixels by category, each distinct instance of an object is uniquely identified. It's used in scenarios like robotics, where a robot must identify individual objects in a scene.
Object Recognition in Videos: CNNs can be applied to individual frames of a video stream to perform object recognition. This is a critical component in applications like video surveillance and action recognition.
Face Recognition: CNNs have proven to be highly effective in the field of face recognition. They can learn features that are distinctive to individual faces, enabling tasks like biometric authentication.
Style Transfer: CNNs can be used to alter the style of an image while preserving its content. This is popular in artistic applications where one might want to apply the style of a famous painting to a photograph.
Super-Resolution: CNNs can enhance the resolution of images, which is useful in scenarios like upscaling low-resolution images or enhancing the quality of medical images.
Medical Image Analysis: CNNs are widely used in medical imaging for tasks like tumor detection, organ segmentation, and disease classification.
Natural Language Processing (NLP): While not as commonly associated with CNNs as with recurrent networks, CNNs have been used in NLP tasks like text classification and sentiment analysis, particularly for tasks where local context is important.
Here are some key characteristics and concepts related to CNNs:
Convolutional Layers: The core building blocks of CNNs are convolutional layers. These layers apply filters or "kernels" to small, overlapping regions of the input data. This allows the network to learn spatial hierarchies of features.
Feature Learning: CNNs automatically learn hierarchies of features from the input data. For example, in image processing, initial layers might learn to recognize edges, while deeper layers learn to recognize complex shapes or patterns.
Pooling Layers: These layers reduce the spatial dimensions (width and height) of the data volume while keeping the depth unchanged. Common pooling operations include max pooling and average pooling (the effect on feature-map sizes is traced in the sketch after this list).
Activation Functions: Non-linear activation functions (like ReLU - Rectified Linear Unit) are applied after each convolutional layer to introduce non-linearity into the model. This enables the network to learn more complex patterns.
Fully Connected Layers: After several convolutional and pooling layers, the network typically ends with one or more fully connected layers. These layers perform the high-level reasoning on the learned features.
Loss Function and Optimization: The choice of loss function depends on the task (classification, regression, etc.). Common choices include categorical cross-entropy for classification tasks. Optimization techniques like stochastic gradient descent (SGD) or more advanced variants like Adam are used to minimize the loss.
Backpropagation: CNNs are trained using backpropagation, where the gradients of the loss function with respect to the parameters are computed and used to update the weights of the network.
Transfer Learning: CNNs trained on large datasets for tasks like image recognition (e.g., ImageNet) are often used as a starting point for other tasks. This is called transfer learning.
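To make the effect of convolution and pooling on feature-map sizes concrete, here is a small shape-tracing sketch; the layer sizes (a 3x3 convolution followed by 2x2 max pooling on a 28x28 input) and the helper name conv_output_size are illustrative assumptions, not code from this notebook.

```python
# Illustrative sketch: how spatial dimensions shrink through a conv -> pool stack.
# The helper and the example layer sizes are assumptions for illustration.

def conv_output_size(size, kernel, stride=1, padding=0):
    """Output spatial size of a convolution (or pooling) along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

h = w = 28  # e.g. an MNIST-sized grayscale image
print(f"input:          {h}x{w}x1")

# 3x3 convolution with 32 filters, no padding, stride 1
h = conv_output_size(h, kernel=3)
w = conv_output_size(w, kernel=3)
print(f"after 3x3 conv: {h}x{w}x32")

# 2x2 max pooling, stride 2: halves height and width, depth unchanged
h = conv_output_size(h, kernel=2, stride=2)
w = conv_output_size(w, kernel=2, stride=2)
print(f"after 2x2 pool: {h}x{w}x32")
```

Running this prints 28x28x1, then 26x26x32 after the convolution, then 13x13x32 after pooling, which is the shrinking spatial hierarchy the layers above describe.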

Mathematical Interpretation of CNNs
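Before the code, it helps to write the operation down. For an input image $I$ of size $H \times W$ and a kernel $K$ of size $k_h \times k_w$, a convolutional layer (stride 1, no padding, as in the examples below) computes each output value as the sum of element-wise products between the kernel and the image patch under it:

$$
S(i, j) = \sum_{m=0}^{k_h - 1} \sum_{n=0}^{k_w - 1} I(i + m,\, j + n)\, K(m, n),
\qquad 0 \le i \le H - k_h,\;\; 0 \le j \le W - k_w
$$

so the output feature map $S$ has size $(H - k_h + 1) \times (W - k_w + 1)$. Strictly speaking this is cross-correlation (the kernel is not flipped), but deep-learning libraries conventionally call it convolution; with a learned kernel the distinction does not matter.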
Below is a simple implementation of a 2D convolution operation using pure Python.
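What follows is a minimal sketch of such an implementation (stride 1, no padding), consistent with the steps described next; the concrete input_image and kernel values are illustrative.

```python
# Minimal pure-Python sketch of 2D convolution (stride 1, no padding):
# nested loops compute the dot product between the kernel and each
# overlapping region of the input image.

def convolution2D(input_image, kernel):
    image_h, image_w = len(input_image), len(input_image[0])
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    output_h = image_h - kernel_h + 1
    output_w = image_w - kernel_w + 1

    output = [[0] * output_w for _ in range(output_h)]
    for i in range(output_h):
        for j in range(output_w):
            # Dot product of the kernel with the overlapping image region
            total = 0
            for m in range(kernel_h):
                for n in range(kernel_w):
                    total += input_image[i + m][j + n] * kernel[m][n]
            output[i][j] = total
    return output

# Sample 4x4 input image and 3x3 kernel (illustrative values)
input_image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [0, 1, 2, 3],
]
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]

print(convolution2D(input_image, kernel))  # 2x2 output feature map
```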
Steps for the above example:
convolution2D is a function that takes an input image (a 2D list) and a kernel (another 2D list) as input. It performs the convolution operation and returns the resulting image.
input_image is a sample 4x4 input image, and kernel is a 3x3 kernel.
The convolution operation is performed manually using nested loops. For each position in the output image, the dot product between the kernel and the corresponding region of the input image is computed.
Implementation using NumPy:
Implementing a Convolutional Neural Network (CNN) using NumPy involves creating the layers (convolution, pooling, fully connected) and implementing the forward pass. Below is an example of a simple CNN using NumPy for a binary image classification task:
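A minimal sketch of such a forward pass is shown below; convolution2D and max_pooling2D match the explanation that follows, while the 6x6 sample image, the kernel values, and the random weights of the final fully connected sigmoid layer are illustrative assumptions (no training is performed).

```python
import numpy as np

def convolution2D(image, kernel):
    """Valid 2D convolution (stride 1, no padding) of a 2D array with a 2D kernel."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    output = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            output[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return output

def max_pooling2D(image, pool_size=2):
    """Non-overlapping max pooling with a square window."""
    oh, ow = image.shape[0] // pool_size, image.shape[1] // pool_size
    output = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            window = image[i * pool_size:(i + 1) * pool_size,
                           j * pool_size:(j + 1) * pool_size]
            output[i, j] = np.max(window)
    return output

def relu(x):
    return np.maximum(0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Sample 6x6 input image and 3x3 kernel (illustrative values)
rng = np.random.default_rng(0)
input_image = rng.random((6, 6))
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)

# Forward pass: convolution -> ReLU -> max pooling -> flatten -> fully connected
conv_out = relu(convolution2D(input_image, kernel))   # 4x4
pooled = max_pooling2D(conv_out, pool_size=2)         # 2x2
flat = pooled.flatten()                               # 4 values

# Fully connected layer with random (untrained) weights for a binary output
weights = rng.normal(size=flat.shape)
bias = 0.0
probability = sigmoid(np.dot(flat, weights) + bias)
print("Predicted probability of class 1:", probability)
```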
Explanation:
convolution2D is a function that takes an input image (a 2D NumPy array) and a kernel (another 2D NumPy array) as input. It performs the convolution operation and returns the resulting image.
max_pooling2D is a function that takes an input image and a pool size as input and performs max pooling.
The sample input image and kernel are provided.
The convolution operation is performed, followed by max pooling.
Below is a simple example code for a Convolutional Neural Network (CNN) implementation using Keras. This example uses the MNIST dataset for handwritten digit classification.
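A minimal sketch consistent with the training log below (5 epochs, batch size 64 with a 20% validation split giving 750 steps per epoch, and a single-sample prediction at the end) might look like this; the exact layer sizes and filter counts are assumptions.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Load MNIST, scale pixel values to [0, 1], and add a channel dimension
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train.astype("float32")[..., np.newaxis] / 255.0
x_test = x_test.astype("float32")[..., np.newaxis] / 255.0

# A small CNN: two conv/pool blocks followed by a dense classifier
# (layer sizes are illustrative assumptions)
model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# 48,000 training samples / batch size 64 = 750 steps per epoch, as in the log
model.fit(x_train, y_train, epochs=5, batch_size=64, validation_split=0.2)

test_loss, test_acc = model.evaluate(x_test, y_test)
print("Test accuracy:", test_acc)

# Predict the class of a single test image (the final "1/1" step in the log)
prediction = model.predict(x_test[:1])
print("Predicted digit:", np.argmax(prediction))
```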
Epoch 1/5
750/750 ━━━━━━━━━━━━━━━━━━━━ 5s 6ms/step - accuracy: 0.9326 - loss: 0.2326 - val_accuracy: 0.9744 - val_loss: 0.0928
Epoch 2/5
750/750 ━━━━━━━━━━━━━━━━━━━━ 5s 6ms/step - accuracy: 0.9773 - loss: 0.0754 - val_accuracy: 0.9811 - val_loss: 0.0686
Epoch 3/5
750/750 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.9849 - loss: 0.0510 - val_accuracy: 0.9820 - val_loss: 0.0597
Epoch 4/5
750/750 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.9879 - loss: 0.0386 - val_accuracy: 0.9845 - val_loss: 0.0522
Epoch 5/5
750/750 ━━━━━━━━━━━━━━━━━━━━ 5s 6ms/step - accuracy: 0.9914 - loss: 0.0288 - val_accuracy: 0.9847 - val_loss: 0.0557
313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9850 - loss: 0.0481
Test accuracy: 0.9850000143051147
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 61ms/step