Path: blob/master/text-pre-processing/Text Preprocessing Examples.ipynb
Kernel: Python 3
Code tidbits for preprocessing texts
Lowercasing
In [5]:
Out[5]:
['canada', 'canada', 'canada', 'canada']
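The code cell that produced this output did not survive in this copy. A minimal sketch that yields the same list, assuming the input was one word written in several casings (the `texts` list below is hypothetical), might look like this:

```python
# Hypothetical input: the same word in mixed casings.
texts = ["Canada", "CANADA", "canada", "CanadA"]

# str.lower() maps every variant to one canonical form.
lowered = [t.lower() for t in texts]
print(lowered)
```

Lowercasing is usually the first step because it lets later steps (stop word lookup, counting, matching) treat "Canada" and "canada" as the same token.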
Stemming
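The stemming cells are missing from this copy, so the following is only a sketch of the idea, not the notebook's original code. It assumes NLTK's `PorterStemmer` (the notebook uses NLTK elsewhere, but the choice of stemmer and the word list here are assumptions):

```python
from nltk.stem import PorterStemmer

# Porter stemmer: rule-based suffix stripping, needs no extra data download.
stemmer = PorterStemmer()

# Hypothetical word list: inflected forms of one root.
words = ["connect", "connected", "connection", "connections", "connects"]

# Map each word to its stem.
stems = {w: stemmer.stem(w) for w in words}
for word, stem in stems.items():
    print(word, "->", stem)
```

A stemmer chops suffixes by rule, so the result is not guaranteed to be a dictionary word; that is the main difference from lemmatization.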
Lemmatization
In [5]:
Out[5]:
[nltk_data] Downloading package wordnet to /Users/kavgan/nltk_data...
[nltk_data] Package wordnet is already up-to-date!
Stop Word Removal
In [10]:
Out[10]:
original sentence = this is a text full of content and we need to clean it up
sentence with stop words removed= W W W text full W content W W W W clean W W
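The cell behind this output is missing. The output implies that "need" was treated as a stop word, which suggests a custom list. The sketch below uses a small hypothetical `stopwords` set chosen only so that it reproduces the output shown; it is not the notebook's original list.

```python
# Hypothetical stop word list, chosen to match the output shown above.
stopwords = {"this", "is", "a", "of", "and", "we", "need", "to", "it", "up"}

sentence = "this is a text full of content and we need to clean it up"

# Replace each stop word with the placeholder "W" so removals stay visible.
masked = " ".join("W" if w in stopwords else w for w in sentence.split())

print("original sentence = " + sentence)
print("sentence with stop words removed= " + masked)
```

In practice the placeholder would simply be dropped; masking with `W` is only a way to show which tokens were filtered out.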
Noise Removal
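The noise removal cells are missing from this copy. As an illustration only, the sketch below strips markup, punctuation, and digits from tokens with regular expressions. The helper name `scrub_word`, the rules, and the sample tokens are all assumptions, not the notebook's original code.

```python
import re

def scrub_word(text):
    """Strip markup, punctuation, and digits from one token (illustrative rules)."""
    # Drop HTML-like tags such as <a> or </a>.
    text = re.sub(r"<[^>]*>", "", text)
    # Drop every character that is not an ASCII letter or whitespace.
    text = re.sub(r"[^A-Za-z\s]", "", text)
    return text.strip().lower()

# Hypothetical noisy tokens that should all reduce to the same word.
raw_words = ["..trouble..", "trouble<", "trouble!", "<a>trouble</a>", "1.trouble"]

cleaned = [scrub_word(w) for w in raw_words]
print(cleaned)
```

Running noise removal before stemming or lemmatization matters because those steps expect clean tokens; a token like `..trouble..` would otherwise pass through unchanged.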