CoCalc -- Day 6 Convolutional Neural Networks (CNNs).ipynb

GitHub Repository: suyashi29/python-su
Path: blob/master/Generative AI for Intelligent Data Handling/Day 6 Convolutional Neural Networks (CNNs).ipynb

Kernel: Python 3 (ipykernel)

Convolutional Neural Networks (CNNs) are a specialized neural network architecture that excels at processing grid-like data, such as images and videos. They are designed to learn spatial hierarchies of features automatically and adaptively from input data.

Applications of CNNs:

Image Classification: CNNs are extensively used for tasks like object recognition and image classification. They can identify and classify objects within images into predefined categories.
Object Detection: CNNs can not only identify objects but also locate and outline them within an image. This is crucial for tasks like pedestrian detection in autonomous vehicles or face detection in image processing applications.
Semantic Segmentation: In semantic segmentation, CNNs assign a class label to each pixel in an image. This is useful for tasks like medical image analysis, where precise delineation of structures (e.g., tumors) is critical.
Instance Segmentation: This is a more advanced form of segmentation where, in addition to labeling pixels by category, each distinct instance of an object is uniquely identified. It's used in scenarios like robotics, where a robot must identify individual objects in a scene.
Object Recognition in Videos: CNNs can be applied to individual frames of a video stream to perform object recognition. This is a critical component in applications like video surveillance and action recognition.
Face Recognition: CNNs have proven to be highly effective in the field of face recognition. They can learn features that are distinctive to individual faces, enabling tasks like biometric authentication.
Style Transfer: CNNs can be used to alter the style of an image while preserving its content. This is popular in artistic applications where one might want to apply the style of a famous painting to a photograph.
Super-Resolution: CNNs can enhance the resolution of images, which is useful in scenarios like upscaling low-resolution images or enhancing the quality of medical images.
Medical Image Analysis: CNNs are widely used in medical imaging for tasks like tumor detection, organ segmentation, and disease classification.
Natural Language Processing (NLP): While not as commonly associated with CNNs as with recurrent networks, CNNs have been used in NLP tasks like text classification and sentiment analysis, particularly for tasks where local context is important.

The key differences between Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs):

Feature	Recurrent Neural Networks (RNNs)	Convolutional Neural Networks (CNNs)
Data Type	Sequential data (e.g., time series, natural language)	Grid-like data (e.g., images, spatial information)
Architecture	Recurrent connections capturing temporal dependencies	Convolutional layers for hierarchical feature extraction and parameter sharing
Memory Handling	Maintains hidden state to capture sequential dependencies	Focuses on local patterns and spatial relationships, less emphasis on memory
Parameter Sharing	Shared weights across time steps	Convolutional operations with shared weights across input
Applications	Natural Language Processing (NLP), time series analysis, speech recognition	Image classification, object detection, image segmentation
Challenges	Vanishing gradient problem for long-term dependencies	May struggle with capturing sequential dependencies
Typical Use Cases	Text generation, machine translation, speech recognition	Image classification, object detection, image segmentation
Example Libraries	Keras, PyTorch, TensorFlow	Keras, PyTorch, TensorFlow

It's important to note that while RNNs and CNNs have their specific strengths and applications, in some cases, hybrid models that combine both architectures are used to address tasks that require handling both sequential and spatial information.

Here are some key characteristics and concepts related to CNNs:

Convolutional Layers: The core building blocks of CNNs are convolutional layers. These layers apply filters or "kernels" to small, overlapping regions of the input data. This allows the network to learn spatial hierarchies of features.
Feature Learning: CNNs automatically learn hierarchies of features from the input data. For example, in image processing, initial layers might learn to recognize edges, while deeper layers learn to recognize complex shapes or patterns.
Pooling Layers: These layers reduce the spatial dimensions (width and height) of the data volume, while keeping the depth unchanged. Common pooling operations include max pooling and average pooling.
Activation Functions: Non-linear activation functions (like ReLU - Rectified Linear Unit) are applied after each convolutional layer to introduce non-linearity into the model. This enables the network to learn more complex patterns.
Fully Connected Layers: After several convolutional and pooling layers, the network typically ends with one or more fully connected layers. These layers perform the high-level reasoning on the learned features.
Loss Function and Optimization: The choice of loss function depends on the task (classification, regression, etc.). Common choices include categorical cross-entropy for classification tasks. Optimization techniques like stochastic gradient descent (SGD) or more advanced variants like Adam are used to minimize the loss.
Backpropagation: CNNs are trained using backpropagation, where the gradients of the loss function with respect to the parameters are computed and used to update the weights of the network.
Transfer Learning: CNNs trained on large datasets for tasks like image recognition (e.g., ImageNet) are often used as a starting point for other tasks. Reusing pretrained weights in this way is called transfer learning.
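
To make the activation and loss concepts above concrete, here is a minimal NumPy sketch of ReLU, softmax, and categorical cross-entropy. The function names and the sample values are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

def relu(x):
    # ReLU keeps positive values and replaces negative values with zero
    return np.maximum(x, 0)

def softmax(logits):
    # Subtract the maximum before exponentiating for numerical stability
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / np.sum(exp)

def categorical_cross_entropy(probs, one_hot_target):
    # Negative log-probability that the model assigns to the true class
    return -np.sum(one_hot_target * np.log(probs))

# Hypothetical feature map, as a convolutional layer might produce
feature_map = np.array([[-2.0, 1.5],
                        [0.5, -3.0]])
activated = relu(feature_map)

# Hypothetical class scores for a 3-class problem; the true class is index 0
logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
target = np.array([1.0, 0.0, 0.0])
loss = categorical_cross_entropy(probs, target)

print(activated)
print(probs, loss)
```

The loss shrinks as the probability assigned to the true class approaches 1, which is the quantity an optimizer such as SGD or Adam tries to minimize.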

Mathematical Interpretation for CNN

Below is a simple implementation of a 2D convolution operation in pure Python:

In [1]:

def convolution2D(input_image, kernel):
    input_height, input_width = len(input_image), len(input_image[0])
    kernel_height, kernel_width = len(kernel), len(kernel[0])

    output_height = input_height - kernel_height + 1
    output_width = input_width - kernel_width + 1

    # Initialize the output image
    output_image = [[0 for _ in range(output_width)] for _ in range(output_height)]

    # Perform the convolution
    for i in range(output_height):
        for j in range(output_width):
            # Compute the dot product between the kernel and the input region
            dot_product = 0
            for m in range(kernel_height):
                for n in range(kernel_width):
                    dot_product += input_image[i+m][j+n] * kernel[m][n]
            output_image[i][j] = dot_product

    return output_image

# Define a sample 2D image and a 3x3 kernel
input_image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16]
]

kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1]
]

# Perform convolution
output_image = convolution2D(input_image, kernel)

# Print the result
for row in output_image:
    print(row)

Out[1]:

[-6, -6]
[-6, -6]

Steps for above example:

convolution2D is a function that takes an input image (a 2D list) and a kernel (another 2D list). It performs the convolution operation and returns the resulting image.
input_image is a sample 4x4 input image, and kernel is a 3x3 kernel.
The convolution is performed manually using nested loops. For each position in the output image, the dot product between the kernel and the corresponding region of the input image is computed. No padding is used and the kernel moves one pixel at a time, so the 4x4 input and 3x3 kernel give a 2x2 output.
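
As a check on the steps above, the following sketch works out the top-left output value by hand and derives the output size from the input and kernel sizes. It reuses the same sample image and kernel; the variable names window, products, top_left, and output_size are illustrative. Note that the loop does not flip the kernel, so strictly speaking it computes a cross-correlation, which is the operation deep learning libraries usually call convolution.

```python
# Same 4x4 image and 3x3 kernel as in the example above
input_image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16]
]
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1]
]

# Top-left 3x3 region of the input (rows 0..2, columns 0..2)
window = [row[0:3] for row in input_image[0:3]]

# Element-wise products between the region and the kernel
products = [[window[m][n] * kernel[m][n] for n in range(3)] for m in range(3)]

# Summing the products gives output_image[0][0]
top_left = sum(sum(row) for row in products)

# Output size without padding and with stride 1: input - kernel + 1
output_size = (len(input_image) - len(kernel) + 1,
               len(input_image[0]) - len(kernel[0]) + 1)

print(products)     # [[1, 0, -3], [5, 0, -7], [9, 0, -11]]
print(top_left)     # -6
print(output_size)  # (2, 2)
```

The left column of the region sums to 15 and the right column to 21, so the vertical-edge kernel returns 15 - 21 = -6. Every region of this image has the same left-to-right increase, which is why all four outputs are -6.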

Implementation using NumPy:

Implementing a Convolutional Neural Network (CNN) using NumPy involves creating the layers (convolution, pooling, fully connected) and implementing the forward pass. Below is an example of the first two steps of such a forward pass, a convolution followed by max pooling, applied to a small sample image:

In [9]:

import numpy as np
a=np.ones((3,4),dtype=int)
a
2/3

Out[9]:

0.6666666666666666

In [10]:

import numpy as np

def convolution2D(input_image, kernel):
    input_height, input_width = input_image.shape
    kernel_height, kernel_width = kernel.shape

    output_height = input_height - kernel_height + 1
    output_width = input_width - kernel_width + 1

    # Initialize the output image
    output_image = np.zeros((output_height, output_width))

    # Perform the convolution
    for i in range(output_height):
        for j in range(output_width):
            output_image[i, j] = np.sum(input_image[i:i+kernel_height, j:j+kernel_width] * kernel)

    return output_image

def max_pooling2D(input_image, pool_size):
    input_height, input_width = input_image.shape
    pool_height, pool_width = pool_size

    output_height = input_height // pool_height
    output_width = input_width // pool_width

    # Initialize the output image
    output_image = np.zeros((output_height, output_width))

    # Perform max pooling
    for i in range(output_height):
        for j in range(output_width):
            output_image[i, j] = np.max(input_image[i*pool_height:(i+1)*pool_height, j*pool_width:(j+1)*pool_width])

    return output_image

# Sample input image
input_image = np.array([[1, 2, 1, 0],
                        [0, 1, 3, 2],
                        [2, 0, 1, 2],
                        [1, 2, 2, 1]])

# Sample kernel
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]])

# Sample max pooling size
pool_size = (2, 2)

# Perform convolution
conv_output = convolution2D(input_image, kernel)

# Perform max pooling
pool_output = max_pooling2D(conv_output, pool_size)

# Print the results
print("Convolution output:")
print(conv_output)

print("\nMax pooling output:")
print(pool_output)

Out[10]:

Convolution output:
[[-2. -1.]
 [-3. -2.]]

Max pooling output:
[[-1.]]

Explanation:

convolution2D is a function that takes an input image (a 2D NumPy array) and a kernel (another 2D NumPy array) as input. It performs the convolution operation and returns the resulting image.
max_pooling2D is a function that takes an input image and a pool size as input and performs max pooling.
The sample input image and kernel are provided.
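The convolution operation is performed, followed by max pooling. Because the convolution output is 2x2 and the pool size is also 2x2, pooling reduces it to a single value, its maximum (-1).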
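
The explicit loops in convolution2D can also be written in vectorized form. The sketch below is one possible alternative, assuming NumPy 1.20 or later (which provides sliding_window_view); the function name convolution2d_vectorized is hypothetical and not part of the notebook.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def convolution2d_vectorized(input_image, kernel):
    # windows has shape (output_height, output_width, kernel_height, kernel_width)
    windows = sliding_window_view(input_image, kernel.shape)
    # Multiply each window by the kernel and sum over the two kernel axes
    return np.einsum("ijmn,mn->ij", windows, kernel)

# Same sample image and kernel as in the loop-based example
input_image = np.array([[1, 2, 1, 0],
                        [0, 1, 3, 2],
                        [2, 0, 1, 2],
                        [1, 2, 2, 1]])
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]])

conv_output = convolution2d_vectorized(input_image, kernel)
print(conv_output)
```

The values match the loop-based result. The vectorized form avoids Python-level loops, which tends to matter once images grow beyond toy sizes.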

Below is a simple Convolutional Neural Network (CNN) implemented with Keras. It is trained on the MNIST dataset to classify handwritten digits.

In [11]:

import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.datasets import mnist
from keras.utils import to_categorical

In [12]:


# Load and preprocess the MNIST dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
train_images = train_images.reshape((60000, 28, 28, 1)).astype('float32') / 255
test_images = test_images.reshape((10000, 28, 28, 1)).astype('float32') / 255

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
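
To show what the preprocessing does without downloading MNIST, here is a small NumPy-only sketch on made-up data. The random images and the three labels are placeholders, and the one-hot step is meant to mirror what to_categorical produces for integer labels 0 to 9.

```python
import numpy as np

# Placeholder data: three random 28x28 grayscale "images" with pixel values 0..255
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(3, 28, 28), dtype=np.uint8)
labels = np.array([5, 0, 4])  # hypothetical digit labels

# Add a channel axis and rescale the pixels to the range [0, 1]
scaled = images.reshape((3, 28, 28, 1)).astype("float32") / 255

# One-hot encode: row i has a 1 at position labels[i] and 0 elsewhere
one_hot = np.eye(10, dtype="float32")[labels]

print(scaled.shape, scaled.min(), scaled.max())
print(one_hot)
```

The extra channel axis is what Conv2D expects for grayscale input, and the one-hot rows match the 10-unit softmax output used with categorical cross-entropy.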

MaxPooling2D

In [13]:

# Build the CNN model
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(MaxPooling2D((2, 2)))
#model.add(Conv2D(64, (3, 3), activation='relu'))
#model.add(MaxPooling2D((2, 2)))
#model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dense(10, activation='softmax'))
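
The shapes and parameter counts of the model above can be worked out by hand. The sketch below does the arithmetic, assuming the Keras defaults for these layers (no padding and stride 1 for Conv2D, and a pooling stride equal to the pool size). The helper name conv2d_output is hypothetical.

```python
def conv2d_output(height, width, kernel_size, filters):
    # No padding, stride 1: each spatial dimension shrinks by kernel_size - 1
    return height - kernel_size + 1, width - kernel_size + 1, filters

# Input: 28x28 grayscale image with 1 channel
height, width, channels = 28, 28, 1

# Conv2D(32, (3, 3)): 32 filters, each with 3*3*channels weights plus 1 bias
conv_h, conv_w, conv_c = conv2d_output(height, width, 3, 32)
conv_params = (3 * 3 * channels + 1) * 32

# MaxPooling2D((2, 2)): halves each spatial dimension, depth unchanged
pool_h, pool_w = conv_h // 2, conv_w // 2

# Flatten: one long vector
flat_units = pool_h * pool_w * conv_c

# Dense layers: (inputs + 1 bias) * units
dense1_params = (flat_units + 1) * 64
dense2_params = (64 + 1) * 10

total_params = conv_params + dense1_params + dense2_params

print((conv_h, conv_w, conv_c))  # (26, 26, 32)
print((pool_h, pool_w, conv_c))  # (13, 13, 32)
print(flat_units)                # 5408
print(total_params)              # 347146
```

Almost all of the parameters sit in the first Dense layer, because every one of the 5408 flattened features connects to each of its 64 units.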

In [14]:


# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(train_images, train_labels, epochs=5, batch_size=64, validation_split=0.2)

Out[14]:

Epoch 1/5
750/750 [==============================] - 12s 15ms/step - loss: 0.2364 - accuracy: 0.9323 - val_loss: 0.0927 - val_accuracy: 0.9735
Epoch 2/5
750/750 [==============================] - 11s 15ms/step - loss: 0.0748 - accuracy: 0.9777 - val_loss: 0.0759 - val_accuracy: 0.9770
Epoch 3/5
750/750 [==============================] - 13s 17ms/step - loss: 0.0511 - accuracy: 0.9847 - val_loss: 0.0561 - val_accuracy: 0.9827
Epoch 4/5
750/750 [==============================] - 13s 18ms/step - loss: 0.0385 - accuracy: 0.9886 - val_loss: 0.0641 - val_accuracy: 0.9807
Epoch 5/5
750/750 [==============================] - 13s 18ms/step - loss: 0.0301 - accuracy: 0.9909 - val_loss: 0.0559 - val_accuracy: 0.9838

<keras.callbacks.History at 0x1af803f1790>

In [17]:



# Evaluate the model on the test data
test_loss, test_acc = model.evaluate(test_images, test_labels)
print('Test accuracy:', test_acc)

# Prediction on a single test image
sample_index = 3
sample_image = test_images[sample_index].reshape((1, 28, 28, 1))
prediction = model.predict(sample_image)
predicted_label = np.argmax(prediction)

# Display the same sample image together with its predicted label
plt.imshow(test_images[sample_index].reshape((28, 28)), cmap='gray')
plt.title(f"Predicted Label: {predicted_label}")
plt.show()

Out[17]:

313/313 [==============================] - 1s 4ms/step - loss: 0.0501 - accuracy: 0.9836
Test accuracy: 0.9836000204086304
1/1 [==============================] - 0s 38ms/step
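
The step counts in the logs above (750 per training epoch, 313 for evaluation) follow from the dataset sizes and batch sizes. The sketch below reproduces them, assuming that evaluate uses a batch size of 32 when none is given; the variable names are illustrative.

```python
import math

train_samples = 60000
test_samples = 10000
validation_percent = 20   # validation_split=0.2, written as a percentage
fit_batch_size = 64
eval_batch_size = 32      # assumed default when evaluate gets no batch_size

# Samples held out for validation and samples left for training
val_samples = train_samples * validation_percent // 100
fit_samples = train_samples - val_samples

# The last batch may be partial, so round up
steps_per_epoch = math.ceil(fit_samples / fit_batch_size)
eval_steps = math.ceil(test_samples / eval_batch_size)

print(fit_samples, val_samples)  # 48000 12000
print(steps_per_epoch)           # 750
print(eval_steps)                # 313
```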
