GitHub Repository: codebasics/deep-learning-keras-tf-tutorial
Path: blob/master/18_transfer_learning/cnn_transfer_learning.ipynb
Kernel: Python 3

Transfer learning in image classification

In this notebook we will use transfer learning: we take a pre-trained model from Google's TensorFlow Hub and re-train it on a flowers dataset. Using a pre-trained model saves a lot of time and computational budget for the new classification problem at hand.

# Install tensorflow_hub using pip install tensorflow_hub first
import numpy as np
import cv2
import PIL.Image as Image
import os
import matplotlib.pylab as plt
import tensorflow as tf
import tensorflow_hub as hub

from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

Make predictions using the ready-made model (without any training)

IMAGE_SHAPE = (224, 224)

classifier = tf.keras.Sequential([
    hub.KerasLayer("https://tfhub.dev/google/tf2-preview/mobilenet_v2/classification/4", input_shape=IMAGE_SHAPE+(3,))
])
gold_fish = Image.open("goldfish.jpg").resize(IMAGE_SHAPE)
gold_fish
Image in a Jupyter notebook
gold_fish = np.array(gold_fish)/255.0
gold_fish.shape
(224, 224, 3)
gold_fish[np.newaxis, ...]
array([[[[0.28235294, 0.33333333, 0.07058824], [0.31372549, 0.37254902, 0.09019608], [0.34901961, 0.41960784, 0.11764706], ..., [0.32941176, 0.39215686, 0.00392157], [0.32156863, 0.38431373, 0.00392157], [0.30980392, 0.36862745, 0. ]], [[0.28627451, 0.33333333, 0.08235294], [0.3254902 , 0.38039216, 0.10980392], [0.35294118, 0.42352941, 0.12941176], ..., [0.32156863, 0.38039216, 0.00392157], [0.31372549, 0.37254902, 0.00392157], [0.30196078, 0.36078431, 0. ]], [[0.28627451, 0.33333333, 0.08627451], [0.31372549, 0.36862745, 0.10196078], [0.34509804, 0.41568627, 0.12941176], ..., [0.31764706, 0.37647059, 0.00392157], [0.30980392, 0.36862745, 0.00784314], [0.29803922, 0.35686275, 0.00392157]], ..., [[0.05490196, 0.10980392, 0.01568627], [0.05098039, 0.11372549, 0.01960784], [0.05098039, 0.12156863, 0.02352941], ..., [0.15686275, 0.21960784, 0.03921569], [0.15686275, 0.22352941, 0.03529412], [0.16078431, 0.22352941, 0.03137255]], [[0.0627451 , 0.1254902 , 0.01568627], [0.05882353, 0.13333333, 0.01960784], [0.05490196, 0.1372549 , 0.01960784], ..., [0.1372549 , 0.20392157, 0.04705882], [0.14117647, 0.20784314, 0.04313725], [0.14117647, 0.20784314, 0.03529412]], [[0.06666667, 0.14509804, 0.01176471], [0.07058824, 0.15294118, 0.01960784], [0.05490196, 0.14901961, 0.01176471], ..., [0.11372549, 0.18039216, 0.04313725], [0.11764706, 0.18431373, 0.03921569], [0.11764706, 0.18823529, 0.03529412]]]])
result = classifier.predict(gold_fish[np.newaxis, ...])
result.shape
(1, 1001)
predicted_label_index = np.argmax(result)
predicted_label_index
2
# tf.keras.utils.get_file('ImageNetLabels.txt','https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt')
image_labels = []
with open("ImageNetLabels.txt", "r") as f:
    image_labels = f.read().splitlines()
image_labels[:5]
['background', 'tench', 'goldfish', 'great white shark', 'tiger shark']
image_labels[predicted_label_index]
'goldfish'
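
The classifier returns 1001 raw scores (logits), one per ImageNet label. As a small sketch (not part of the original notebook), the scores can be converted to probabilities with a softmax to inspect the top predictions; it reuses the result and image_labels variables defined above.

# Sketch (addition): show the top-5 ImageNet labels with their softmax probabilities
probabilities = tf.nn.softmax(result[0]).numpy()
top_5 = np.argsort(probabilities)[-5:][::-1]
for i in top_5:
    print(f"{image_labels[i]}: {probabilities[i]:.4f}")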

Load flowers dataset

dataset_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"
data_dir = tf.keras.utils.get_file('flower_photos', origin=dataset_url, cache_dir='.', untar=True)
# cache_dir indicates where to download the data. I specified '.' which means the current directory
# untar=True will unzip it
data_dir
'.\\datasets\\flower_photos'
import pathlib
data_dir = pathlib.Path(data_dir)
data_dir
WindowsPath('datasets/flower_photos')
list(data_dir.glob('*/*.jpg'))[:5]
[WindowsPath('datasets/flower_photos/daisy/100080576_f52e8ee070_n.jpg'), WindowsPath('datasets/flower_photos/daisy/10140303196_b88d3d6cec.jpg'), WindowsPath('datasets/flower_photos/daisy/10172379554_b296050f82_n.jpg'), WindowsPath('datasets/flower_photos/daisy/10172567486_2748826a8b.jpg'), WindowsPath('datasets/flower_photos/daisy/10172636503_21bededa75_n.jpg')]
image_count = len(list(data_dir.glob('*/*.jpg')))
print(image_count)
3670
roses = list(data_dir.glob('roses/*'))
roses[:5]
[WindowsPath('datasets/flower_photos/roses/10090824183_d02c613f10_m.jpg'), WindowsPath('datasets/flower_photos/roses/102501987_3cdb8e5394_n.jpg'), WindowsPath('datasets/flower_photos/roses/10503217854_e66a804309.jpg'), WindowsPath('datasets/flower_photos/roses/10894627425_ec76bbc757_n.jpg'), WindowsPath('datasets/flower_photos/roses/110472418_87b6a3aa98_m.jpg')]
PIL.Image.open(str(roses[1]))
Image in a Jupyter notebook
tulips = list(data_dir.glob('tulips/*'))
PIL.Image.open(str(tulips[0]))
Image in a Jupyter notebook

Read flower images from disk into a numpy array using OpenCV

flowers_images_dict = {
    'roses': list(data_dir.glob('roses/*')),
    'daisy': list(data_dir.glob('daisy/*')),
    'dandelion': list(data_dir.glob('dandelion/*')),
    'sunflowers': list(data_dir.glob('sunflowers/*')),
    'tulips': list(data_dir.glob('tulips/*')),
}
flowers_labels_dict = {
    'roses': 0,
    'daisy': 1,
    'dandelion': 2,
    'sunflowers': 3,
    'tulips': 4,
}
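
As a quick sanity check (an addition, not in the original notebook), the per-class image counts can be printed from the flowers_images_dict defined above; together they should add up to the 3670 images counted earlier.

# Sketch: number of images per flower class
for flower_name, images in flowers_images_dict.items():
    print(flower_name, len(images))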
flowers_images_dict['roses'][:5]
[WindowsPath('datasets/flower_photos/roses/10090824183_d02c613f10_m.jpg'), WindowsPath('datasets/flower_photos/roses/102501987_3cdb8e5394_n.jpg'), WindowsPath('datasets/flower_photos/roses/10503217854_e66a804309.jpg'), WindowsPath('datasets/flower_photos/roses/10894627425_ec76bbc757_n.jpg'), WindowsPath('datasets/flower_photos/roses/110472418_87b6a3aa98_m.jpg')]
str(flowers_images_dict['roses'][0])
'datasets\\flower_photos\\roses\\10090824183_d02c613f10_m.jpg'
img = cv2.imread(str(flowers_images_dict['roses'][0]))
img.shape
(240, 179, 3)
cv2.resize(img,(224,224)).shape
(224, 224, 3)
X, y = [], []

for flower_name, images in flowers_images_dict.items():
    for image in images:
        img = cv2.imread(str(image))
        resized_img = cv2.resize(img, (224, 224))
        X.append(resized_img)
        y.append(flowers_labels_dict[flower_name])
X = np.array(X)
y = np.array(y)

Train test split

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

Preprocessing: scale images

X_train_scaled = X_train / 255
X_test_scaled = X_test / 255
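
As an aside (a sketch, not the approach used in this notebook, and assuming a TF 2.x Rescaling preprocessing layer is available), the same 0-1 scaling can instead be built into the model itself, so raw uint8 images could be fed directly:

# Sketch: scaling inside the model instead of scaling the numpy arrays
# (in newer TF releases this layer also lives at tf.keras.layers.Rescaling)
scaling_layer = tf.keras.layers.experimental.preprocessing.Rescaling(1./255, input_shape=(224, 224, 3))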

Make predictions using the pre-trained model on the new flowers dataset

X[0].shape
(180, 180, 3)
IMAGE_SHAPE+(3,)
(224, 224, 3)
x0_resized = cv2.resize(X[0], IMAGE_SHAPE)
x1_resized = cv2.resize(X[1], IMAGE_SHAPE)
x2_resized = cv2.resize(X[2], IMAGE_SHAPE)
plt.axis('off')
plt.imshow(X[0])
<matplotlib.image.AxesImage at 0x1e7aec49cd0>
Image in a Jupyter notebook
plt.axis('off')
plt.imshow(X[1])
<matplotlib.image.AxesImage at 0x1eec795f610>
Image in a Jupyter notebook
plt.axis('off')
plt.imshow(X[2])
<matplotlib.image.AxesImage at 0x1eec7e39e20>
Image in a Jupyter notebook
predicted = classifier.predict(np.array([x0_resized, x1_resized, x2_resized]))
predicted = np.argmax(predicted, axis=1)
predicted
array([795, 795, 795], dtype=int64)
image_labels[795]
'shower curtain'

Now take the pre-trained model and retrain it on the flower images

The ImageNet classifier above labels all three flower photos as 'shower curtain' because it was never trained on these five flower categories. Below we keep only the pre-trained feature extractor (the model without its top classification layer), freeze it, and train a new 5-class Dense layer on top.

feature_extractor_model = "https://tfhub.dev/google/tf2-preview/mobilenet_v2/feature_vector/4"

pretrained_model_without_top_layer = hub.KerasLayer(
    feature_extractor_model, input_shape=(224, 224, 3), trainable=False)
num_of_flowers = 5

model = tf.keras.Sequential([
    pretrained_model_without_top_layer,
    tf.keras.layers.Dense(num_of_flowers)
])

model.summary()
Model: "sequential_4" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= keras_layer_3 (KerasLayer) (None, 1280) 2257984 _________________________________________________________________ dense_1 (Dense) (None, 5) 6405 ================================================================= Total params: 2,264,389 Trainable params: 6,405 Non-trainable params: 2,257,984 _________________________________________________________________
model.compile(
    optimizer="adam",
    # from_logits=True because the Dense output layer has no softmax activation
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['acc'])

model.fit(X_train_scaled, y_train, epochs=5)
Epoch 1/5 1/86 [..............................] - ETA: 0s - loss: 1.9482 - acc: 0.2188WARNING:tensorflow:Callbacks method `on_train_batch_end` is slow compared to the batch time (batch time: 0.0080s vs `on_train_batch_end` time: 0.0130s). Check your callbacks.
86/86 [==============================] - 2s 19ms/step - loss: 0.7985 - acc: 0.7028
Epoch 2/5
86/86 [==============================] - 2s 19ms/step - loss: 0.4163 - acc: 0.8517
Epoch 3/5
86/86 [==============================] - 2s 19ms/step - loss: 0.3264 - acc: 0.8895
Epoch 4/5
86/86 [==============================] - 2s 19ms/step - loss: 0.2682 - acc: 0.9106
Epoch 5/5
86/86 [==============================] - 2s 19ms/step - loss: 0.2305 - acc: 0.9266
<tensorflow.python.keras.callbacks.History at 0x1e7fc8112b0>
model.evaluate(X_test_scaled, y_test)
29/29 [==============================] - 1s 23ms/step - loss: 0.3703 - acc: 0.8682
[0.37029528617858887, 0.8681917190551758]
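
As a final sketch (not part of the original notebook), the retrained model's logits can be mapped back to flower names by inverting flowers_labels_dict; it assumes the model, X_test_scaled and flowers_labels_dict defined above.

# Sketch: predict flower names for the first few test images
flower_names = {v: k for k, v in flowers_labels_dict.items()}
predictions = model.predict(X_test_scaled[:5])
predicted_classes = np.argmax(predictions, axis=1)
print([flower_names[i] for i in predicted_classes])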