Path: blob/master/second_edition/chapter11_part01_introduction.ipynb
This is a companion notebook for the book Deep Learning with Python, Second Edition. For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.
If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.
This notebook was generated for TensorFlow 2.6.
Deep learning for text
Natural-language processing: The bird's eye view
Preparing text data
Text standardization
Text splitting (tokenization)
Vocabulary indexing
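The three steps above (standardization, tokenization, vocabulary indexing) can be sketched in pure Python before reaching for Keras. All names below are illustrative, not the book's code; index 0 is reserved for padding and 1 for out-of-vocabulary tokens, following the convention TextVectorization also uses.

```python
import string

def standardize(text):
    # Lowercase and strip punctuation.
    text = text.lower()
    return "".join(c for c in text if c not in string.punctuation)

def tokenize(text):
    # Whitespace splitting: the simplest tokenization scheme.
    return text.split()

# Vocabulary indexing: map each known token to an integer.
# Index 0 is reserved for padding, 1 for out-of-vocabulary tokens.
vocabulary = {"": 0, "[UNK]": 1}

def index_tokens(tokens):
    indices = []
    for token in tokens:
        if token not in vocabulary:
            vocabulary[token] = len(vocabulary)
        indices.append(vocabulary[token])
    return indices

tokens = tokenize(standardize("The cat sat on the mat."))
print(tokens)                 # ['the', 'cat', 'sat', 'on', 'the', 'mat']
print(index_tokens(tokens))   # [2, 3, 4, 5, 2, 6]
```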
Using the TextVectorization layer
Displaying the vocabulary
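The workflow for these two sections can be sketched as follows: `adapt()` builds the vocabulary from a sample of data, and `get_vocabulary()` lets you inspect it. The toy dataset is illustrative, not necessarily the notebook's.

```python
from tensorflow.keras.layers import TextVectorization

dataset = [
    "I write, erase, rewrite",
    "Erase again, and then",
    "A poppy blooms.",
]

# Default behavior: standardize (lowercase + strip punctuation),
# split on whitespace, and encode tokens as integer indices.
text_vectorization = TextVectorization(output_mode="int")
text_vectorization.adapt(dataset)

# Index 0 is the padding token, index 1 the out-of-vocabulary token;
# the rest are sorted by frequency.
print(text_vectorization.get_vocabulary())
print(text_vectorization("I write, rewrite, and still rewrite again"))
```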
Two approaches for representing groups of words: Sets and sequences
Preparing the IMDB movie reviews data
Displaying the shapes and dtypes of the first batch
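The notebook builds its datasets from the downloaded IMDB review files (via `text_dataset_from_directory`). As a self-contained stand-in, this sketch builds an equivalent batched `(text, label)` dataset from in-memory samples and inspects shapes and dtypes the same way; the samples are made up.

```python
import tensorflow as tf

samples = ["a great movie", "terrible acting", "loved it", "not for me"]
labels = [1, 0, 1, 0]
dataset = tf.data.Dataset.from_tensor_slices((samples, labels)).batch(2)

for inputs, targets in dataset:
    print("inputs.shape:", inputs.shape)    # (2,)
    print("inputs.dtype:", inputs.dtype)    # tf.string
    print("targets.shape:", targets.shape)  # (2,)
    print("targets.dtype:", targets.dtype)
    break
```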
Processing words as a set: The bag-of-words approach
Single words (unigrams) with binary encoding
Preprocessing our datasets with a TextVectorization layer
Inspecting the output of our binary unigram dataset
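Binary unigram encoding can be sketched like this: `output_mode="multi_hot"` maps each text to a fixed-size vector of 0s and 1s marking which vocabulary tokens occur, discarding order and counts. The toy texts and `max_tokens` value are illustrative.

```python
import tensorflow as tf
from tensorflow.keras.layers import TextVectorization

texts = ["the movie was great", "the movie was terrible"]
vectorizer = TextVectorization(max_tokens=20, output_mode="multi_hot")
vectorizer.adapt(texts)

# Each row is a 0/1 vector over the vocabulary: 1 if the token occurs
# in the text (however many times), 0 otherwise.
encoded = vectorizer(texts)
print(encoded)
```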
Our model-building utility
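A plausible reconstruction of the kind of utility this section defines: a small densely connected classifier over multi-hot vectors, compiled for binary classification. Layer sizes and the dropout rate are illustrative defaults, not necessarily the book's exact values.

```python
from tensorflow import keras
from tensorflow.keras import layers

def get_model(max_tokens=20000, hidden_dim=16):
    # Input: one multi-hot (or count/TF-IDF) vector per review.
    inputs = keras.Input(shape=(max_tokens,))
    x = layers.Dense(hidden_dim, activation="relu")(inputs)
    x = layers.Dropout(0.5)(x)
    # Single sigmoid unit for binary sentiment classification.
    outputs = layers.Dense(1, activation="sigmoid")(x)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="rmsprop",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```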
Training and testing the binary unigram model
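A toy-scale sketch of the train/evaluate step: vectorize a handful of texts into multi-hot vectors, fit a small dense classifier, and evaluate. Data, layer sizes, and epoch count are illustrative, not the book's.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

texts = ["great fun", "loved it", "awful film", "boring mess"]
labels = [1, 1, 0, 0]

# Vectorize the raw strings into binary unigram vectors.
vectorizer = layers.TextVectorization(output_mode="multi_hot")
vectorizer.adapt(texts)
x = vectorizer(texts)

# Tiny dense classifier over the multi-hot vectors.
inputs = keras.Input(shape=(x.shape[1],))
h = layers.Dense(8, activation="relu")(inputs)
outputs = layers.Dense(1, activation="sigmoid")(h)
model = keras.Model(inputs, outputs)
model.compile(optimizer="rmsprop",
              loss="binary_crossentropy",
              metrics=["accuracy"])

model.fit(x, tf.constant(labels), epochs=5, verbose=0)
loss, acc = model.evaluate(x, tf.constant(labels), verbose=0)
print(f"accuracy: {acc:.3f}")
```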
Bigrams with binary encoding
Configuring the TextVectorization layer to return bigrams
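The configuration change amounts to passing `ngrams=2`, which makes the layer index space-joined word pairs alongside single words. A minimal sketch with a made-up sentence:

```python
from tensorflow.keras.layers import TextVectorization

texts = ["the cat sat"]
# ngrams=2 adds every adjacent word pair ("the cat", "cat sat")
# to the vocabulary alongside the unigrams.
vectorizer = TextVectorization(ngrams=2, output_mode="multi_hot")
vectorizer.adapt(texts)

vocab = vectorizer.get_vocabulary()
print(vocab)
```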
Training and testing the binary bigram model
Bigrams with TF-IDF encoding
Configuring the TextVectorization layer to return token counts
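With `output_mode="count"` the layer returns how many times each vocabulary entry occurs, rather than a 0/1 flag. A minimal sketch (toy text, not the book's data):

```python
import tensorflow as tf
from tensorflow.keras.layers import TextVectorization

texts = ["the cat sat on the mat"]
vectorizer = TextVectorization(output_mode="count")
vectorizer.adapt(texts)

# "the" occurs twice, so its entry in the count vector is 2.
counts = vectorizer(texts)
print(counts)
```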
Configuring TextVectorization to return TF-IDF-weighted outputs
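With `output_mode="tf_idf"`, `adapt()` also computes document frequencies so the layer can down-weight terms that appear in many documents. A sketch on a made-up corpus: "the" occurs in every document, so its weight comes out lower than "cat"'s.

```python
from tensorflow.keras.layers import TextVectorization

corpus = ["the cat sat", "the dog ran", "the cat ran"]
vectorizer = TextVectorization(output_mode="tf_idf")
vectorizer.adapt(corpus)

vocab = vectorizer.get_vocabulary()
# Encode one text: each entry is term count times the term's
# inverse document frequency learned during adapt().
weights = vectorizer(["the cat"])[0]
print(dict(zip(vocab, weights.numpy())))
```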
Training and testing the TF-IDF bigram model