Week 2: Implementing Callbacks in TensorFlow using the MNIST Dataset
In the lectures you learned how to do classification using Fashion MNIST, a dataset containing items of clothing. There's another, similar dataset called MNIST which contains handwritten digits 0 through 9.
In this assignment you will code a classifier for the MNIST dataset that trains until it reaches 98% accuracy, stopping once this threshold is achieved. In the lectures you saw how this was done for the loss, but here you will use accuracy instead.
Some notes:
Your network should reach the target in fewer than 10 epochs.
When accuracy reaches 98% or greater, it should print out the string "Reached 98% accuracy so cancelling training!" and stop training.
TIPS FOR SUCCESSFUL GRADING OF YOUR ASSIGNMENT:
All cells are frozen except for the ones where you need to submit your solutions or those where it is explicitly mentioned that you can interact with them.
You can add new cells to experiment, but these will be omitted by the grader, so don't rely on newly created cells to host your solution code; use the provided places for this.
You can add the comment # grade-up-to-here in any graded cell to signal the grader that it should only evaluate up to that point. This is helpful if you want to check whether you are on the right track even if you are not done with the whole assignment. Be sure to delete the comment afterwards!
Avoid using global variables unless you absolutely have to. The grader tests your code in an isolated environment without running all cells from the top. As a result, global variables may be unavailable when scoring your submission. Global variables that are meant to be used will be defined in UPPERCASE.
To submit your notebook, save it and then click on the blue submit button at the beginning of the page.
Load and inspect the data
Begin by loading the data. A couple of things to notice:
The file mnist.npz is already included in the current workspace under the data directory. By default, load_data from Keras accepts a path relative to ~/.keras/datasets, but in this case the file is stored somewhere else, so you need to specify the full path.
tf.keras.datasets.mnist.load_data returns the train and test sets as the tuples (training_images, training_labels), (testing_images, testing_labels), but in this exercise you will only need the train set, so you can ignore the second tuple.
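As a rough sketch of the loading step (assuming the data directory sits next to the notebook, as described above), this could look like:

```python
import os
import tensorflow as tf

# Build the full path to the bundled dataset; the "data" directory name
# comes from the note above -- adjust if your workspace differs
data_path = os.path.join(os.getcwd(), "data", "mnist.npz")

# load_data returns two tuples; discard the test split with an underscore
(training_images, training_labels), _ = tf.keras.datasets.mnist.load_data(path=data_path)
```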
One important step is to normalize the pixel values. The dataset includes black and white images whose pixel values usually range from 0 to 255, but the network will have an easier time learning if these values are scaled to the range 0 to 1.
The data comes as numpy arrays, so you can easily normalize the pixel values using vectorization, as in the sketch below.
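A minimal sketch, reusing the training_images array loaded above:

```python
# uint8 pixel values in [0, 255] become floats in [0.0, 1.0]
training_images = training_images / 255.0
```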
Exercise 1: create_and_compile_model
Your first task is to create and compile the model that you will later train to recognize handwritten digits.
Feel free to try whatever architecture you see fit for the neural network, but if you need extra help you can check out an architecture that works pretty well at the end of this notebook. Notice that the part where the model is compiled is already provided (and the accuracy metric is defined so it can be accessed by your callback later on), so you only need to specify the layers of the model.
Hints:
The first layer should take into consideration the input_shape of the data, which in this case is the size of each image.
The last layer should take into account the number of classes you are trying to predict (see the sketch after these hints).
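To make the hints concrete, here is a minimal skeleton; the hidden-layer width here is an arbitrary assumption, not the reference architecture:

```python
import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.Input(shape=(28, 28)),                   # input_shape: one 28x28 image
    tf.keras.layers.Flatten(),                        # flatten to a 784-long vector
    tf.keras.layers.Dense(64, activation='relu'),     # hidden width is a free choice
    tf.keras.layers.Dense(10, activation='softmax'),  # one unit per digit class 0-9
])
```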
The next cell allows you to check the total and trainable parameter counts of your model and prints a warning if they exceed those of a reference solution. This serves the following 3 purposes, listed in order of priority:
Helps you prevent crashing the kernel during training.
Helps you avoid longer-than-necessary training times.
Provides a reasonable estimate of the size of your model. In general you will prefer smaller models, provided they still accomplish their goal successfully.
Notice that this is just informative and may well be below the actual model size needed to crash the kernel, so even if you exceed this reference you are probably fine. However, if the kernel crashes during training, or training is taking a very long time and your model is larger than the reference, come back here and try to bring the number of parameters closer to the reference.
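If you want to inspect these numbers yourself, the standard Keras calls below will do it (a sketch reusing the model built earlier):

```python
import tensorflow as tf

# Total parameter count, trainable or not
print("Total params:", model.count_params())

# Trainable parameters only, summed over the trainable weight tensors
trainable = sum(
    tf.keras.backend.count_params(w) for w in model.trainable_weights
)
print("Trainable params:", trainable)
```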
Expected Output:
Exercise 2: EarlyStoppingCallback
Now it is time to create your own custom callback. For this, complete the EarlyStoppingCallback class and its on_epoch_end method in the cell below. If you need some guidance on how to proceed, check out this link.
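A minimal sketch following the standard Keras custom-callback pattern; the 98% threshold and the exact message come from the notes at the top of this notebook:

```python
import tensorflow as tf

class EarlyStoppingCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        # 'accuracy' appears in logs because it was set as a metric at compile time
        if logs.get('accuracy', 0.0) >= 0.98:
            print("\nReached 98% accuracy so cancelling training!")
            # This flag tells Keras to stop once the current epoch finishes
            self.model.stop_training = True
```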
Exercise 3: train_mnist
Now that you have defined your callback it is time to complete the train_mnist function below. This function will receive the training data (features and targets encoded as numpy arrays) and should use it to train the model you defined earlier. It should also return the training history of the model; this object is returned by the fit method of a tf.keras.Model, as explained in the docs.
You must set your model to train for 10 epochs, and the callback should fire before the 10th epoch for you to pass this part of the assignment.
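One way the function could look, assuming the create_and_compile_model function from Exercise 1 and the EarlyStoppingCallback from Exercise 2 (a sketch, not the official solution):

```python
def train_mnist(training_images, training_labels):
    # Build a fresh compiled model (Exercise 1)
    model = create_and_compile_model()

    # fit returns a History object whose .history dict holds per-epoch metrics
    history = model.fit(
        training_images,
        training_labels,
        epochs=10,
        callbacks=[EarlyStoppingCallback()],
    )
    return history
```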
Now train the model and get the training history by calling the train_mnist function, passing in the appropriate parameters:
Expected Output:
The string "Reached 98% accuracy so cancelling training!" printed out before reaching 10 epochs.
Need more help?
Run the following cell to see an architecture that works well for the problem at hand:
Congratulations on finishing this week's assignment!
You have successfully implemented a callback that gives you more control over the training loop for your model. Nice job!
Keep it up!