Copyright 2020 The TensorFlow Authors.
Overview
This notebook demonstrates how to use the Moving Average Optimizer along with the Model Average Checkpoint from the TensorFlow Addons package.
Moving Averaging
The advantage of Moving Averaging is that the averaged weights are less prone to abrupt loss shifts or to irregular data representation in the latest batch. It gives a smoothed and more general picture of the model's training up to some point.
Stochastic Averaging
Stochastic Weight Averaging converges to wider optima; in doing so, it resembles geometric ensembling. SWA is a simple method to improve model performance when used as a wrapper around other optimizers, averaging weights from different points along the trajectory of the inner optimizer.
Model Average Checkpoint
callbacks.ModelCheckpoint doesn't give you the option to save moving average weights in the middle of training, which is why Model Average Optimizers required a custom callback. Using the update_weights parameter, ModelAverageCheckpoint (sketched below) allows you to:
1. Assign the moving average weights to the model, and save them.
2. Keep the old non-averaged weights, but the saved model uses the average weights.
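A minimal sketch of constructing this callback, assuming the TensorFlow Addons API, where it is exposed as tfa.callbacks.AverageModelCheckpoint (the filepath here is a hypothetical example):

```python
import tensorflow_addons as tfa

# update_weights=True: assign the moving average weights to the model, then save.
# update_weights=False: the model keeps its non-averaged weights; only the saved
# checkpoint contains the averaged weights.
avg_callback = tfa.callbacks.AverageModelCheckpoint(
    filepath='./checkpoints/average',  # hypothetical checkpoint location
    update_weights=True)
```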
Setup
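A typical setup cell installs TensorFlow Addons and imports the packages used below; this is a standard sketch rather than the original notebook's exact cell:

```python
!pip install -U tensorflow-addons

import tensorflow as tf
import tensorflow_addons as tfa
import numpy as np
import os
```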
Build Model
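As a sketch, a small fully connected classifier that can be rebuilt identically for each optimizer; the layer sizes and the 28x28 input shape are illustrative assumptions:

```python
import tensorflow as tf

def create_model(opt):
    """Build a small dense classifier compiled with the given optimizer."""
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),  # flatten 28x28 images
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax'),  # 10-class output
    ])
    model.compile(optimizer=opt,
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model
```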
Prepare Dataset
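One plausible data preparation step, assuming Fashion-MNIST (a common choice in TensorFlow tutorials) and the usual scaling of pixel values to [0, 1]:

```python
import tensorflow as tf

# Load Fashion-MNIST and normalize pixel values to [0, 1].
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
```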
We will be comparing three optimizers here:
Unwrapped SGD
SGD with Moving Average
SGD with Stochastic Weight Averaging
We will see how each performs with the same model; a sketch of the three optimizers follows.
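This sketch constructs the three optimizers with the TensorFlow Addons wrappers; the SGD hyperparameters and the SWA averaging schedule are illustrative assumptions:

```python
import tensorflow as tf
import tensorflow_addons as tfa

# Plain SGD as the baseline.
sgd = tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9)

# SGD wrapped so that an exponential moving average of the weights is tracked.
moving_avg_sgd = tfa.optimizers.MovingAverage(
    tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9))

# SGD wrapped with Stochastic Weight Averaging: averaging starts after
# `start_averaging` steps, and a snapshot is averaged in every `average_period` steps.
swa_sgd = tfa.optimizers.SWA(
    tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
    start_averaging=0,
    average_period=10)
```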
Both MovingAverage and StochasticAverage optimizers use ModelAverageCheckpoint.
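Putting the pieces together, an end-to-end sketch that reuses the hypothetical create_model, moving_avg_sgd, and dataset names from the sketches above:

```python
import tensorflow_addons as tfa

# Build the model around the MovingAverage-wrapped optimizer.
model = create_model(moving_avg_sgd)

# The callback writes checkpoints containing the averaged weights during training.
avg_callback = tfa.callbacks.AverageModelCheckpoint(
    filepath='./checkpoints/moving_avg',  # hypothetical checkpoint location
    update_weights=True)

model.fit(x_train, y_train, epochs=5, callbacks=[avg_callback])

# With update_weights=True the model itself carries the averaged weights,
# so evaluation reflects the averaged model.
model.evaluate(x_test, y_test, verbose=2)
```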