Quantization Tutorial
Quantization is a technique for shrinking a trained model so that you can deploy it on edge devices. In this tutorial we will:
(1) Train a handwritten digits (MNIST) model
(2) Export it to disk and check the size of that model
(3) Apply two quantization techniques: (1) post training quantization and (2) quantization aware training
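The code cells themselves are not included in this export, so below is a minimal sketch of the data-loading steps, assuming keras.datasets.mnist; it matches the outputs that follow (60000 training images, 10000 test images, 28x28 image shape, first label 5):

```python
import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt

# Load the handwritten digits dataset: 60,000 train and 10,000 test images of 28x28 pixels
(X_train, y_train), (X_test, y_test) = keras.datasets.mnist.load_data()

len(X_train)       # 60000
len(X_test)        # 10000
X_train[0].shape   # (28, 28)

plt.imshow(X_train[0])   # visualize the first digit
y_train[0]               # 5 -- the label of the first image
```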
In [1]:
In [2]:
In [3]:
Out[3]:
60000
In [4]:
Out[4]:
10000
In [5]:
Out[5]:
(28, 28)
In [6]:
Out[6]:
<matplotlib.image.AxesImage at 0x255fe7bb760>
In [7]:
Out[7]:
5
In [8]:
In [9]:
In [10]:
Out[10]:
(60000, 784)
Using a Flatten layer so that we don't have to call .reshape on the input dataset
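The training cell is not shown here; below is a minimal sketch of the scaling, model definition and training, with layer sizes inferred from the quantization-aware summary later in the notebook (Flatten -> Dense(100) -> Dense(10)). The activation functions, optimizer and loss are assumptions:

```python
# Scale pixel values to the 0-1 range
X_train_scaled = X_train / 255
X_test_scaled = X_test / 255

# Flatten turns each 28x28 image into a 784-length vector inside the model,
# so no explicit .reshape call is needed on the dataset
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(100, activation='relu'),
    keras.layers.Dense(10, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(X_train_scaled, y_train, epochs=5)
model.evaluate(X_test_scaled, y_test)
```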
In [11]:
Out[11]:
Epoch 1/5
1875/1875 [==============================] - 4s 2ms/step - loss: 0.2734 - accuracy: 0.9226
Epoch 2/5
1875/1875 [==============================] - 3s 2ms/step - loss: 0.1245 - accuracy: 0.9635
Epoch 3/5
1875/1875 [==============================] - 3s 2ms/step - loss: 0.0866 - accuracy: 0.9745
Epoch 4/5
1875/1875 [==============================] - 3s 2ms/step - loss: 0.0652 - accuracy: 0.9800
Epoch 5/5
1875/1875 [==============================] - 3s 2ms/step - loss: 0.0516 - accuracy: 0.9847
<tensorflow.python.keras.callbacks.History at 0x25bbb173820>
In [12]:
Out[12]:
313/313 [==============================] - 1s 2ms/step - loss: 0.0888 - accuracy: 0.9708
[0.08878576755523682, 0.97079998254776]
In [13]:
Out[13]:
INFO:tensorflow:Assets written to: ./saved_model/assets
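The INFO line above indicates the trained Keras model was exported in TensorFlow's SavedModel format; a hedged sketch, with the directory name taken from the log:

```python
# Export the trained model to disk so it can be converted to TFLite
model.save('./saved_model/')
```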
(1) Post training quantization
Without quantization
In [14]:
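The conversion cell is not shown; a minimal sketch of converting the SavedModel to TFLite with no optimization applied (variable names are assumptions):

```python
# Plain TFLite conversion -- weights stay in float32
converter = tf.lite.TFLiteConverter.from_saved_model('./saved_model/')
tflite_model = converter.convert()
```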
With quantization
In [15]:
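A sketch of the same conversion with post-training (dynamic range) quantization enabled via the DEFAULT optimization flag:

```python
# Quantized TFLite conversion -- weights are stored as 8-bit integers
converter = tf.lite.TFLiteConverter.from_saved_model('./saved_model/')
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_quant_model = converter.convert()
```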
Read this article for more on post training quantization: https://www.tensorflow.org/model_optimization/guide/quantization/post_training
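The two outputs below are the sizes of the converted models; they can be measured directly on the in-memory buffers:

```python
len(tflite_model)         # size in bytes of the unquantized model
len(tflite_quant_model)   # size in bytes of the quantized model
```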
In [17]:
Out[17]:
319792
In [18]:
Out[18]:
84752
You can see above that the quantized model is roughly 1/4th the size of the non-quantized model
In [19]:
In [20]:
Once you have the above files saved to disk, check their sizes. The quantized model will be obviously smaller (roughly a quarter of the size).
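A sketch of how the two models could have been written to disk in the cells above (file names are assumptions):

```python
# Save both TFLite flatbuffers so their sizes can be compared on disk
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)

with open('model_quant.tflite', 'wb') as f:
    f.write(tflite_quant_model)
```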
(2) Quantization aware training
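The quantization aware training cells are also not shown in this export. Below is a minimal sketch using the tensorflow_model_optimization package; it is consistent with the model summary, the single training epoch and the evaluation in the next cells, but the compile settings and epoch count are assumptions:

```python
import tensorflow_model_optimization as tfmot

# Wrap the already-trained model with fake-quantization nodes so it learns
# weights that remain accurate after 8-bit quantization
qat_model = tfmot.quantization.keras.quantize_model(model)

qat_model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
qat_model.summary()

# Fine-tune briefly with quantization simulated during training
qat_model.fit(X_train_scaled, y_train, epochs=1)
qat_model.evaluate(X_test_scaled, y_test)
```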
In [21]:
Out[21]:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
quantize_layer (QuantizeLaye (None, 28, 28) 3
_________________________________________________________________
quant_flatten (QuantizeWrapp (None, 784) 1
_________________________________________________________________
quant_dense (QuantizeWrapper (None, 100) 78505
_________________________________________________________________
quant_dense_1 (QuantizeWrapp (None, 10) 1015
=================================================================
Total params: 79,524
Trainable params: 79,510
Non-trainable params: 14
_________________________________________________________________
In [22]:
Out[22]:
1875/1875 [==============================] - 7s 4ms/step - loss: 0.0438 - accuracy: 0.9866
<tensorflow.python.keras.callbacks.History at 0x255fe86ba30>
In [23]:
Out[23]:
313/313 [==============================] - 1s 2ms/step - loss: 0.0802 - accuracy: 0.9755
[0.08016839623451233, 0.9754999876022339]
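A hedged sketch of converting the quantization-aware model to a quantized TFLite model and checking its size, matching the conversion log and the byte count reported below:

```python
# Convert the quantization-aware Keras model to a quantized TFLite model
converter = tf.lite.TFLiteConverter.from_keras_model(qat_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_qat_model = converter.convert()

len(tflite_qat_model)   # size in bytes -- comparable to the post-training quantized model
```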
In [24]:
Out[24]:
WARNING:absl:Found untraced functions such as flatten_layer_call_fn, flatten_layer_call_and_return_conditional_losses, dense_layer_call_fn, dense_layer_call_and_return_conditional_losses, dense_1_layer_call_fn while saving (showing 5 of 15). These functions will not be directly callable after loading.
INFO:tensorflow:Assets written to: C:\Users\dhava\AppData\Local\Temp\tmpqnsx4bvx\assets
In [25]:
Out[25]:
82376
In [26]: