BERT tutorial: Classify spam vs no spam emails
In [1]:
Import the dataset (the dataset is taken from Kaggle)
In [2]:
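A minimal sketch of loading the data with pandas, assuming the Kaggle file is saved as spam.csv with Category and Message columns (the file name is an assumption):

import pandas as pd

# Load the Kaggle spam dataset (file name assumed)
df = pd.read_csv("spam.csv")
df.head()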
Out[2]:
In [3]:
Out[3]:
In [4]:
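The split further down needs a numeric target, so presumably the text Category was mapped to 0/1 around here; a sketch (the spam column name is an assumption):

# Map the text label to a binary target: 1 for spam, 0 for ham
df['spam'] = df['Category'].apply(lambda x: 1 if x == 'spam' else 0)
df.head()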
Out[4]:
Split it into training and test data sets
In [5]:
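A sketch of the split with scikit-learn, assuming the Message column is the input and the binary spam column is the target:

from sklearn.model_selection import train_test_split

# Stratify so train and test keep the same spam/ham ratio (an assumption)
X_train, X_test, y_train, y_test = train_test_split(
    df['Message'], df['spam'], stratify=df['spam'])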
In [6]:
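The four-row preview below is consistent with a call like:

X_train.head(4)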
Out[6]:
1717 Sorry about earlier. Putting out fires.Are you...
707 So when do you wanna gym harri
4667 Not..tel software name..
5188 Okie
Name: Message, dtype: object
Now let's import the BERT model and get embedding vectors for a few sample statements
In [ ]:
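A sketch of loading the preprocessing model and the encoder from TensorFlow Hub; the exact handle versions are assumptions, but these are the standard bert_en_uncased preprocess/encoder pair:

import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text as text  # registers the ops the preprocess model needs

# BERT preprocessing (tokenization) and the 12-layer uncased encoder
bert_preprocess = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")
bert_encoder = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4")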
In [8]:
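A sketch of a helper that returns the pooled 768-dimensional sentence embedding; the function name and the two sample sentences are placeholders, but the (2, 768) shape matches the output below:

def get_sentence_embedding(sentences):
    # Tokenize, then take the pooled [CLS] representation of each sentence
    preprocessed = bert_preprocess(sentences)
    return bert_encoder(preprocessed)['pooled_output']

get_sentence_embedding([
    "500$ discount. hurry up",
    "Are you up for a volleyball game tomorrow?",
])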
Out[8]:
ERROR:absl:hub.KerasLayer is trainable but has zero trainable weights.
<tf.Tensor: shape=(2, 768), dtype=float32, numpy=
array([[-0.8435169 , -0.51327276, -0.8884574 , ..., -0.74748874,
-0.75314736, 0.91964483],
[-0.87208366, -0.50543964, -0.94446677, ..., -0.858475 ,
-0.7174535 , 0.8808298 ]], dtype=float32)>
Get embedding vectors for a few sample words and compare them using cosine similarity
In [9]:
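A sketch that embeds the words compared in the next few cells; the variable name e is an assumption:

e = get_sentence_embedding([
    "banana",      # e[0]
    "grapes",      # e[1]
    "jeff bezos",  # e[2]
    "elon musk",   # e[3]
])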
In [10]:
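Comparing banana with grapes, sketched with scikit-learn's cosine_similarity:

from sklearn.metrics.pairwise import cosine_similarity

cosine_similarity([e[0]], [e[1]])  # banana vs grapes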
Out[10]:
array([[0.9911089]], dtype=float32)
Values near 1 mean the embeddings are similar; values near 0 mean they are very different. Above, comparing "banana" vs "grapes" gives a similarity of 0.99, as they are both fruits.
In [11]:
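Likely a comparison along the lines of:

cosine_similarity([e[0]], [e[2]])  # banana vs jeff bezos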
Out[11]:
array([[0.8470385]], dtype=float32)
Comparing "banana" with "jeff bezos" still gives 0.84, but that is not as close as the 0.99 we got with "grapes".
In [12]:
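And similarly:

cosine_similarity([e[2]], [e[3]])  # jeff bezos vs elon musk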
Out[12]:
array([[0.98720354]], dtype=float32)
Jeff Bezos and Elon Musk are more similar than Jeff Bezos and banana, as indicated above.
Build Model
There are two types of models you can build in TensorFlow: (1) Sequential and (2) Functional.
So far we have built sequential models, but below we will build a functional model. More information on the two is here: https://becominghuman.ai/sequential-vs-functional-model-in-keras-20684f766057
In [13]:
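A sketch of the functional model that matches the summary below: a string input feeds the two hub layers, dropout is applied to the pooled output, and a single sigmoid unit produces the spam probability (the 0.1 dropout rate is an assumption):

# Functional API: each layer is called on the previous layer's output
text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')
preprocessed = bert_preprocess(text_input)
encoded = bert_encoder(preprocessed)

d = tf.keras.layers.Dropout(0.1, name='dropout')(encoded['pooled_output'])
output = tf.keras.layers.Dense(1, activation='sigmoid', name='output')(d)

model = tf.keras.Model(inputs=[text_input], outputs=[output])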
In [14]:
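The summary below comes from:

model.summary()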
Out[14]:
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
text (InputLayer) [(None,)] 0
__________________________________________________________________________________________________
keras_layer (KerasLayer) {'input_mask': (None 0 text[0][0]
__________________________________________________________________________________________________
keras_layer_1 (KerasLayer) {'default': (None, 7 109482241 keras_layer[0][0]
keras_layer[0][1]
keras_layer[0][2]
__________________________________________________________________________________________________
dropout (Dropout) (None, 768) 0 keras_layer_1[0][13]
__________________________________________________________________________________________________
output (Dense) (None, 1) 769 dropout[0][0]
==================================================================================================
Total params: 109,483,010
Trainable params: 109,483,009
Non-trainable params: 1
__________________________________________________________________________________________________
In [15]:
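The 4179 below is the size of the training set:

len(X_train)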
Out[15]:
4179
In [16]:
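A sketch of compiling for binary classification; the optimizer choice is an assumption, but binary cross-entropy with an accuracy metric matches the evaluation output further down:

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])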
Train the model
In [ ]:
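The Epoch 1/5 line below is consistent with a fit call like:

model.fit(X_train, y_train, epochs=5)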
Epoch 1/5
In [123]:
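Evaluating on the held-out test set:

model.evaluate(X_test, y_test)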
Out[123]:
44/44 [==============================] - 9s 182ms/step - loss: 0.1475 - accuracy: 0.9548
[0.14750021696090698, 0.9547738432884216]
Inference
In [126]:
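A sketch of scoring new messages; the five example strings are placeholders (not the originals), chosen so that the first three look like spam and the last two look like normal mail, matching the pattern of scores below. Outputs above 0.5 can be read as spam:

sample_emails = [
    "You won a $5000 prize! Click here to claim now",         # spam-like
    "URGENT: your account is suspended, verify immediately",  # spam-like
    "Free entry in a weekly competition, text WIN to enter",  # spam-like
    "Hey, are you coming to the cricket game tomorrow?",      # normal
    "Can we move our meeting to Wednesday afternoon?",        # normal
]
model.predict(sample_emails)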
Out[126]:
array([[0.6472808 ],
[0.7122627 ],
[0.5710311 ],
[0.06721176],
[0.02479185]], dtype=float32)