Path: blob/master/43_text_classification_rnn/rnn_text_classification.ipynb
Kernel: Python 3
In [1]:
In [28]:
Out[28]:
'2.5.0'
In [2]:
Out[2]:
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
In [3]:
Out[3]:
Success
Read more about this dataset here: https://ai.stanford.edu/~amaas/data/sentiment/

As described on that page: "This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well."
In [4]:
In [5]:
Out[5]:
tfds.core.DatasetInfo(
name='imdb_reviews',
version=1.0.0,
description='Large Movie Review Dataset.
This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well.',
homepage='http://ai.stanford.edu/~amaas/data/sentiment/',
features=FeaturesDict({
'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=2),
'text': Text(shape=(), dtype=tf.string),
}),
total_num_examples=100000,
splits={
'test': 25000,
'train': 25000,
'unsupervised': 50000,
},
supervised_keys=('text', 'label'),
citation="""@InProceedings{maas-EtAl:2011:ACL-HLT2011,
author = {Maas, Andrew L. and Daly, Raymond E. and Pham, Peter T. and Huang, Dan and Ng, Andrew Y. and Potts, Christopher},
title = {Learning Word Vectors for Sentiment Analysis},
booktitle = {Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies},
month = {June},
year = {2011},
address = {Portland, Oregon, USA},
publisher = {Association for Computational Linguistics},
pages = {142--150},
url = {http://www.aclweb.org/anthology/P11-1015}
}""",
redistribution_info=,
)
In [6]:
Out[6]:
{'test': <PrefetchDataset shapes: ((), ()), types: (tf.string, tf.int64)>,
'train': <PrefetchDataset shapes: ((), ()), types: (tf.string, tf.int64)>,
'unsupervised': <PrefetchDataset shapes: ((), ()), types: (tf.string, tf.int64)>}
In [7]:
In [8]:
Out[8]:
tensorflow.python.data.ops.dataset_ops.PrefetchDataset
In [9]:
Out[9]:
25000
In [11]:
Out[11]:
25000
In [9]:
Out[9]:
b"This was an absolutely terrible movie. Don't be lured in by Christopher Walken or Michael Ironside. Both are great actors, but this must simply be their worst role in history. Even their great acting could not redeem this movie's ridiculous storyline. This movie is an early nineties US propaganda piece. The most pathetic scenes were those when the Columbian rebels were making their cases for revolutions. Maria Conchita Alonso appeared phony, and her pseudo-love affair with Walken was nothing but a pathetic emotional plug in a movie that was devoid of any real meaning. I am disappointed that there are movies like this, ruining actor's like Christopher Walken's good name. I could barely sit through it."
0
In [12]:
In [13]:
In [14]:
Out[14]:
texts: [b"This movie was exactly what I expected, not great, but also not that bad either. In my opinion PG13 movies aren't scary enough so that's why I already knew I was going to be bored throughout the entire film. Sure there were scary things going on in the hotel room, but nothing we all haven't already seen. I guess I didn't like it because I thought there were too many twists and turns happening; it got old and repetitive. I also didn't understand if all the things Cusack was experiencing in the room was real or not. There is no explanation for any of the events that occurred. The movie just drags on and when it finally does come to an end you want it to keep going because you are still waiting around for someone to tell you what the whole movie was about. What I did like was the special effects. Other than that there wasn't much enjoyment from it. Maybe its just me but I thought this was below average."
b"Sad story of a downed B-17 pilot. Brady is shot down over occupied territory. The local ranchers extended him kindness and protection at the cost of their own lives. I had never heard of this movie and it snagged me for two hours. After the film is over, I'm glad I took the time. It's an entire story told to explain the look on Brady's face at the start of the film."
b"There is a scene near the beginning after a shootout where horses are running. If something red catches your eye it is because a white van is parked behind a bush by the trail. I thought I had seen bad but this is it. A white van in a western. Did they not catch this? Oh well, and I paid top dollar at the rental. It will make you want to grab your buddies and have them all put in 10 grand and make a better movie. The talking was so so slow, the acting was mostly OK but couldn't be taken seriously due to the poor nature of the filming. There is a door at the sheriffs that looks like a door today with the particular trimming. I say watch this movie, and move Cabin boy into #2 on the worst of all time."]
labels: [0 1 0]
In [15]:
In [16]:
Out[16]:
['',
'[UNK]',
'love',
'i',
'and',
'yoga',
'tensorflow',
'samosas',
'jalebi',
'biking']
In [17]:
Out[17]:
array([[3, 2, 1]], dtype=int64)
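The vocabulary listing and the encoded row above can be read together: index 0 is reserved for padding, index 1 is the out-of-vocabulary token `[UNK]`, and every other word maps to its position in the list. The following is a minimal pure-Python sketch of that lookup, not the Keras layer itself. The input sentence `"I love cricket"` is a hypothetical example (the cell that produced the output is not shown), and the lowercase-and-strip-punctuation step is an assumption about how the text was standardised.

```python
import re

# Vocabulary copied from the listing above.
# Index 0 is the padding slot, index 1 is the out-of-vocabulary token.
VOCAB = ['', '[UNK]', 'love', 'i', 'and', 'yoga',
         'tensorflow', 'samosas', 'jalebi', 'biking']
WORD_TO_ID = {word: idx for idx, word in enumerate(VOCAB)}
UNK_ID = 1


def encode(text):
    """Map a sentence to vocabulary indices (illustrative helper, not a Keras API)."""
    # Assumed standardisation: lowercase, then drop punctuation.
    cleaned = re.sub(r"[^\w\s]", "", text.lower())
    # Unknown words fall back to the [UNK] index.
    return [WORD_TO_ID.get(token, UNK_ID) for token in cleaned.split()]


# Hypothetical input: 'i' -> 3, 'love' -> 2, 'cricket' is unknown -> 1
print(encode("I love cricket"))  # [3, 2, 1]
```

Any word that is not in the vocabulary collapses to the same index, which is why a small vocabulary loses information quickly.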
In [18]:
In [19]:
Out[19]:
array(['', '[UNK]', 'the', 'and', 'a', 'of', 'to', 'is', 'in', 'it', 'i',
'this', 'that', 'br', 'was', 'as', 'for', 'with', 'movie', 'but',
'film', 'on', 'not', 'you', 'are'], dtype='<U14')
In [20]:
Out[20]:
<tf.Tensor: shape=(2,), dtype=string, numpy=
array([b"This movie was exactly what I expected, not great, but also not that bad either. In my opinion PG13 movies aren't scary enough so that's why I already knew I was going to be bored throughout the entire film. Sure there were scary things going on in the hotel room, but nothing we all haven't already seen. I guess I didn't like it because I thought there were too many twists and turns happening; it got old and repetitive. I also didn't understand if all the things Cusack was experiencing in the room was real or not. There is no explanation for any of the events that occurred. The movie just drags on and when it finally does come to an end you want it to keep going because you are still waiting around for someone to tell you what the whole movie was about. What I did like was the special effects. Other than that there wasn't much enjoyment from it. Maybe its just me but I thought this was below average.",
b"Sad story of a downed B-17 pilot. Brady is shot down over occupied territory. The local ranchers extended him kindness and protection at the cost of their own lives. I had never heard of this movie and it snagged me for two hours. After the film is over, I'm glad I took the time. It's an entire story told to explain the look on Brady's face at the start of the film."],
dtype=object)>
In [21]:
Out[21]:
array([[ 11, 18, 14, ..., 0, 0, 0],
[614, 64, 5, ..., 0, 0, 0],
[ 48, 7, 4, ..., 0, 0, 0]], dtype=int64)
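The trailing zeros in each row above are padding: reviews in a batch have different lengths, so shorter ones are filled with index 0 until every row is as long as the longest. A minimal sketch of that step in plain Python follows. The helper name `pad_batch` is hypothetical, and the toy sequences reuse only the leading ids shown above, truncated to different lengths for illustration.

```python
def pad_batch(sequences, pad_id=0):
    """Right-pad every sequence with pad_id up to the longest length in the batch."""
    width = max(len(seq) for seq in sequences)
    return [list(seq) + [pad_id] * (width - len(seq)) for seq in sequences]


# Toy sequences built from the leading ids in the output above;
# the differing lengths are made up for illustration.
batch = pad_batch([[11, 18, 14], [614, 64], [48]])
for row in batch:
    print(row)
# [11, 18, 14]
# [614, 64, 0]
# [48, 0, 0]
```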
In [22]:
Out[22]:
Original: b"This movie was exactly what I expected, not great, but also not that bad either. In my opinion PG13 movies aren't scary enough so that's why I already knew I was going to be bored throughout the entire film. Sure there were scary things going on in the hotel room, but nothing we all haven't already seen. I guess I didn't like it because I thought there were too many twists and turns happening; it got old and repetitive. I also didn't understand if all the things Cusack was experiencing in the room was real or not. There is no explanation for any of the events that occurred. The movie just drags on and when it finally does come to an end you want it to keep going because you are still waiting around for someone to tell you what the whole movie was about. What I did like was the special effects. Other than that there wasn't much enjoyment from it. Maybe its just me but I thought this was below average."
Round-trip: this movie was exactly what i expected not great but also not that bad either in my opinion [UNK] movies arent scary enough so thats why i already knew i was going to be [UNK] throughout the entire film sure there were scary things going on in the [UNK] room but nothing we all havent already seen i guess i didnt like it because i thought there were too many [UNK] and turns [UNK] it got old and [UNK] i also didnt understand if all the things [UNK] was [UNK] in the room was real or not there is no [UNK] for any of the events that [UNK] the movie just [UNK] on and when it finally does come to an end you want it to keep going because you are still [UNK] around for someone to tell you what the whole movie was about what i did like was the special effects other than that there wasnt much [UNK] from it maybe its just me but i thought this was [UNK] average
Original: b"Sad story of a downed B-17 pilot. Brady is shot down over occupied territory. The local ranchers extended him kindness and protection at the cost of their own lives. I had never heard of this movie and it snagged me for two hours. After the film is over, I'm glad I took the time. It's an entire story told to explain the look on Brady's face at the start of the film."
Round-trip: sad story of a [UNK] [UNK] [UNK] [UNK] is shot down over [UNK] [UNK] the local [UNK] [UNK] him [UNK] and [UNK] at the [UNK] of their own lives i had never heard of this movie and it [UNK] me for two hours after the film is over im [UNK] i took the time its an entire story told to [UNK] the look on [UNK] face at the start of the film
Original: b"There is a scene near the beginning after a shootout where horses are running. If something red catches your eye it is because a white van is parked behind a bush by the trail. I thought I had seen bad but this is it. A white van in a western. Did they not catch this? Oh well, and I paid top dollar at the rental. It will make you want to grab your buddies and have them all put in 10 grand and make a better movie. The talking was so so slow, the acting was mostly OK but couldn't be taken seriously due to the poor nature of the filming. There is a door at the sheriffs that looks like a door today with the particular trimming. I say watch this movie, and move Cabin boy into #2 on the worst of all time."
Round-trip: there is a scene near the beginning after a [UNK] where [UNK] are running if something red [UNK] your eye it is because a white [UNK] is [UNK] behind a [UNK] by the [UNK] i thought i had seen bad but this is it a white [UNK] in a [UNK] did they not [UNK] this oh well and i [UNK] top [UNK] at the [UNK] it will make you want to [UNK] your [UNK] and have them all put in 10 [UNK] and make a better movie the talking was so so slow the acting was mostly ok but couldnt be taken seriously due to the poor nature of the [UNK] there is a [UNK] at the [UNK] that looks like a [UNK] today with the particular [UNK] i say watch this movie and move [UNK] boy into 2 on the worst of all time
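The round-trip output above shows what the encoder keeps: text is lowercased, punctuation is dropped, and every word outside the vocabulary comes back as `[UNK]`. The sketch below reproduces that effect in plain Python using only the 25 vocabulary entries printed earlier in the notebook, so it marks more words as `[UNK]` than the notebook's larger vocabulary does. The helper names `encode` and `decode` are hypothetical, and the standardisation regex is an assumption rather than the layer's exact rule.

```python
import re

# The 25 most frequent entries printed earlier in the notebook.
VOCAB = ['', '[UNK]', 'the', 'and', 'a', 'of', 'to', 'is', 'in', 'it', 'i',
         'this', 'that', 'br', 'was', 'as', 'for', 'with', 'movie', 'but',
         'film', 'on', 'not', 'you', 'are']
WORD_TO_ID = {word: idx for idx, word in enumerate(VOCAB)}
PAD_ID, UNK_ID = 0, 1


def encode(text):
    """Lowercase, strip punctuation (assumed), and map words to indices."""
    cleaned = re.sub(r"[^\w\s]", "", text.lower())
    return [WORD_TO_ID.get(token, UNK_ID) for token in cleaned.split()]


def decode(ids):
    """Map indices back to words, skipping padding."""
    return " ".join(VOCAB[i] for i in ids if i != PAD_ID)


original = "This movie was exactly what I expected, not great."
print("Original:  ", original)
print("Round-trip:", decode(encode(original)))
# Round-trip: this movie was [UNK] [UNK] i [UNK] not [UNK]
```

The encoding is lossy by design: casing, punctuation, and rare words cannot be recovered from the indices.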
In [23]:
In [24]:
Out[24]:
[-0.00647295]
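The value above is negative, so it cannot be a sigmoid probability. Assuming it is a raw logit from a single-unit output layer (the model definition is not shown here, so this is an assumption), it can be converted to a probability with the sigmoid function. A value this close to zero corresponds to roughly 0.5, which is what an untrained model would be expected to produce.

```python
import math


def sigmoid(logit):
    """Convert a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))


logit = -0.00647295  # value shown in the output above, assumed to be a logit
probability = sigmoid(logit)
# Threshold at 0.5: label 1 is positive, label 0 is negative in this dataset.
predicted_label = int(probability >= 0.5)

print(round(probability, 4))  # 0.4984
print(predicted_label)        # 0
```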
In [25]:
In [26]:
Out[26]:
Epoch 1/10
32/391 [=>............................] - ETA: 21s - loss: 0.6929 - accuracy: 0.4927
---------------------------------------------------------------------------
CancelledError Traceback (most recent call last)
<ipython-input-26-d9321db1417e> in <module>
----> 1 model.fit(train_dataset, epochs=10,
2 validation_data=test_dataset,
3 validation_steps=30)
~\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
1181 _r=1):
1182 callbacks.on_train_batch_begin(step)
-> 1183 tmp_logs = self.train_function(iterator)
1184 if data_handler.should_sync:
1185 context.async_wait()
~\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\eager\def_function.py in __call__(self, *args, **kwds)
887
888 with OptionalXlaContext(self._jit_compile):
--> 889 result = self._call(*args, **kwds)
890
891 new_tracing_count = self.experimental_get_tracing_count()
~\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\eager\def_function.py in _call(self, *args, **kwds)
915 # In this case we have created variables on the first call, so we run the
916 # defunned version which is guaranteed to never create variables.
--> 917 return self._stateless_fn(*args, **kwds) # pylint: disable=not-callable
918 elif self._stateful_fn is not None:
919 # Release the lock early so that multiple threads can perform the call
~\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\eager\function.py in __call__(self, *args, **kwargs)
3021 (graph_function,
3022 filtered_flat_args) = self._maybe_define_function(args, kwargs)
-> 3023 return graph_function._call_flat(
3024 filtered_flat_args, captured_inputs=graph_function.captured_inputs) # pylint: disable=protected-access
3025
~\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\eager\function.py in _call_flat(self, args, captured_inputs, cancellation_manager)
1958 and executing_eagerly):
1959 # No tape is watching; skip to running the function.
-> 1960 return self._build_call_outputs(self._inference_function.call(
1961 ctx, args, cancellation_manager=cancellation_manager))
1962 forward_backward = self._select_forward_and_backward_functions(
~\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\eager\function.py in call(self, ctx, args, cancellation_manager)
589 with _InterpolateFunctionError(self):
590 if cancellation_manager is None:
--> 591 outputs = execute.execute(
592 str(self.signature.name),
593 num_outputs=self._num_outputs,
~\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\eager\execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
57 try:
58 ctx.ensure_initialized()
---> 59 tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
60 inputs, attrs, num_outputs)
61 except core._NotOkStatusException as e:
CancelledError: [_Derived_]RecvAsync is cancelled.
[[{{node Adam/Adam/update/AssignSubVariableOp/_57}}]]
[[gradient_tape/sequential/embedding/embedding_lookup/Reshape/_54]] [Op:__inference_train_function_22707]
Function call stack:
train_function
In [27]:
Out[27]:
3.8.5 (tags/v3.8.5:580fbb0, Jul 20 2020, 15:57:54) [MSC v.1924 64 bit (AMD64)]