Path: blob/master/47_BERT_text_classification/BERT_email_classification.ipynb
Kernel: Python 3
BERT tutorial: Classify spam vs no spam emails
In [106]:
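The contents of the setup cell are not shown in this export. A minimal sketch of the imports the rest of the notebook relies on (tensorflow_text must be imported so the ops used by the TF Hub BERT preprocessing model are registered):

import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text as text  # registers the ops needed by the BERT preprocessing model
import pandas as pd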
Import the dataset (the dataset is taken from Kaggle)
In [107]:
Out[107]:
In [108]:
Out[108]:
In [109]:
Out[109]:
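The loading cells above are not shown in this export. A hedged sketch, assuming the Kaggle SMS spam file is named spam.csv and has Category/Message columns (Message appears in the output further down; the file name and the Category column are assumptions):

df = pd.read_csv("spam.csv")           # file name is an assumption
df.head()

df['Category'].value_counts()          # 'Category' column name is an assumption

# add a numeric label: 1 for spam, 0 for ham
df['spam'] = df['Category'].apply(lambda x: 1 if x == 'spam' else 0)
df.head()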
Split it into training and test data sets
In [110]:
In [111]:
Out[111]:
4894 Send me the new number
1682 Y lei?
1103 Black shirt n blue jeans... I thk i c ü...
2939 Hey i've booked the pilates and yoga lesson al...
Name: Message, dtype: object
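A sketch of the split, assuming scikit-learn's train_test_split with its default 75/25 split and stratification on the label so both sets keep a similar spam/ham ratio (column names follow the loading sketch above):

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    df['Message'], df['spam'], stratify=df['spam'])
X_train.head()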
Now let's import the BERT model and get embedding vectors for a few sample statements
In [112]:
In [113]:
Out[113]:
<tf.Tensor: shape=(2, 768), dtype=float32, numpy=
array([[-0.8435169 , -0.51327276, -0.8884574 , ..., -0.74748874,
-0.75314736, 0.91964483],
[-0.87208366, -0.50543964, -0.94446677, ..., -0.858475 ,
-0.7174535 , 0.8808298 ]], dtype=float32)>
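The Hub URLs used above are not shown in this export; the 768-dimensional pooled output here and the 109,482,241 encoder parameters in the model summary further down match bert_en_uncased_L-12_H-768_A-12, so here is a sketch under that assumption (the two sample sentences are placeholders):

bert_preprocess = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")
bert_encoder = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4")

def get_sentence_embedding(sentences):
    preprocessed = bert_preprocess(sentences)
    # 'pooled_output' is a 768-dimensional embedding of each whole sentence
    return bert_encoder(preprocessed)['pooled_output']

get_sentence_embedding([
    "50% discount, claim your reward now",      # placeholder sentences
    "Are we still on for dinner tonight?",
])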
Get embedding vectors for a few sample words and compare them using cosine similarity
In [114]:
In [115]:
Out[115]:
array([[0.9911089]], dtype=float32)
Values near 1 mean the embeddings are similar; values near 0 mean they are very different. Above, comparing "banana" vs "grapes" gives 0.99 similarity, as both are fruits.
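A sketch of how such a comparison can be computed with scikit-learn's cosine_similarity and the get_sentence_embedding helper sketched above; the same call with different row pairs gives the remaining numbers below:

from sklearn.metrics.pairwise import cosine_similarity

# embed the words being compared (one embedding per row)
e = get_sentence_embedding(["banana", "grapes", "jeff bezos", "elon musk"])

cosine_similarity([e[0]], [e[1]])   # banana vs grapes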
In [116]:
Out[116]:
array([[0.8470385]], dtype=float32)
Comparing "banana" with "Jeff Bezos" still gives 0.84, but that is not as close as the 0.99 we got with "grapes".
In [117]:
Out[117]:
array([[0.98720354]], dtype=float32)
"Jeff Bezos" and "Elon Musk" are more similar than "Jeff Bezos" and "banana", as indicated above.
Build Model
There are two types of models you can build in TensorFlow:
(1) Sequential (2) Functional
So far we have built Sequential models, but below we will build a Functional model. More information on the two is here: https://becominghuman.ai/sequential-vs-functional-model-in-keras-20684f766057
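A sketch of the Functional model summarized below, reusing the two Hub layers from the embedding sketch above. The text input is a plain string tensor, the BERT encoder stays frozen, and only the final Dense layer is trained, which is why the summary reports 769 trainable parameters (768 weights + 1 bias). The dropout rate is an assumption:

# Input is raw text; BERT preprocessing and encoding happen inside the model
text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')
preprocessed_text = bert_preprocess(text_input)
outputs = bert_encoder(preprocessed_text)

# Classification head on top of the frozen encoder (dropout rate is an assumption)
l = tf.keras.layers.Dropout(0.1, name='dropout')(outputs['pooled_output'])
l = tf.keras.layers.Dense(1, activation='sigmoid', name='output')(l)

model = tf.keras.Model(inputs=[text_input], outputs=[l])
model.summary()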
In [127]:
In [128]:
Out[128]:
Model: "model_3"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
text (InputLayer) [(None,)] 0
__________________________________________________________________________________________________
keras_layer_2 (KerasLayer) {'input_mask': (None 0 text[0][0]
__________________________________________________________________________________________________
keras_layer_3 (KerasLayer) {'default': (None, 7 109482241 keras_layer_2[1][0]
keras_layer_2[1][1]
keras_layer_2[1][2]
__________________________________________________________________________________________________
dropout (Dropout) (None, 768) 0 keras_layer_3[1][13]
__________________________________________________________________________________________________
output (Dense) (None, 1) 769 dropout[0][0]
==================================================================================================
Total params: 109,483,010
Trainable params: 769
Non-trainable params: 109,482,241
__________________________________________________________________________________________________
In [130]:
Out[130]:
4179
In [131]:
Train the model
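A sketch of the compile-and-train step; the loss and metric match the names in the log below (binary cross-entropy, accuracy) and the 5 epochs match the output, while the optimizer choice is an assumption:

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.fit(X_train, y_train, epochs=5)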
In [132]:
Out[132]:
Epoch 1/5
131/131 [==============================] - 25s 181ms/step - loss: 0.3450 - accuracy: 0.8615
Epoch 2/5
131/131 [==============================] - 24s 182ms/step - loss: 0.2509 - accuracy: 0.8894
Epoch 3/5
131/131 [==============================] - 24s 181ms/step - loss: 0.2136 - accuracy: 0.9172
Epoch 4/5
131/131 [==============================] - 24s 180ms/step - loss: 0.1872 - accuracy: 0.9296
Epoch 5/5
131/131 [==============================] - 24s 181ms/step - loss: 0.1736 - accuracy: 0.9373
<tensorflow.python.keras.callbacks.History at 0x25a2ef372e0>
In [123]:
Out[123]:
44/44 [==============================] - 9s 182ms/step - loss: 0.1475 - accuracy: 0.9548
[0.14750021696090698, 0.9547738432884216]
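The evaluation above amounts to a single call on the held-out test set; it returns the [loss, accuracy] list shown in Out[123]:

model.evaluate(X_test, y_test)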
Inference
In [126]:
Out[126]:
array([[0.6472808 ],
[0.7122627 ],
[0.5710311 ],
[0.06721176],
[0.02479185]], dtype=float32)
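The predictions above come from calling the trained model on raw message text; since the output layer is a sigmoid, values above 0.5 can be read as spam and values below as not spam. A sketch with placeholder messages (the five actual messages are not shown in this export):

sample_messages = [
    "You have won a free prize, reply now to claim it",   # placeholder spam-like message
    "Hey, are we still meeting for lunch tomorrow?",      # placeholder ham-like message
]
model.predict(sample_messages)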