"""1Title: Writing your own callbacks2Authors: Rick Chao, Francois Chollet3Date created: 2019/03/204Last modified: 2023/06/255Description: Complete guide to writing new Keras callbacks.6Accelerator: GPU7"""89"""10## Introduction1112A callback is a powerful tool to customize the behavior of a Keras model during13training, evaluation, or inference. Examples include `keras.callbacks.TensorBoard`14to visualize training progress and results with TensorBoard, or15`keras.callbacks.ModelCheckpoint` to periodically save your model during training.1617In this guide, you will learn what a Keras callback is, what it can do, and how you can18build your own. We provide a few demos of simple callback applications to get you19started.20"""2122"""23## Setup24"""2526import numpy as np27import keras2829"""30## Keras callbacks overview3132All callbacks subclass the `keras.callbacks.Callback` class, and33override a set of methods called at various stages of training, testing, and34predicting. Callbacks are useful to get a view on internal states and statistics of35the model during training.3637You can pass a list of callbacks (as the keyword argument `callbacks`) to the following38model methods:3940- `keras.Model.fit()`41- `keras.Model.evaluate()`42- `keras.Model.predict()`43"""4445"""46## An overview of callback methods4748### Global methods4950#### `on_(train|test|predict)_begin(self, logs=None)`5152Called at the beginning of `fit`/`evaluate`/`predict`.5354#### `on_(train|test|predict)_end(self, logs=None)`5556Called at the end of `fit`/`evaluate`/`predict`.5758### Batch-level methods for training/testing/predicting5960#### `on_(train|test|predict)_batch_begin(self, batch, logs=None)`6162Called right before processing a batch during training/testing/predicting.6364#### `on_(train|test|predict)_batch_end(self, batch, logs=None)`6566Called at the end of training/testing/predicting a batch. Within this method, `logs` is67a dict containing the metrics results.6869### Epoch-level methods (training only)7071#### `on_epoch_begin(self, epoch, logs=None)`7273Called at the beginning of an epoch during training.7475#### `on_epoch_end(self, epoch, logs=None)`7677Called at the end of an epoch during training.78"""7980"""81## A basic example8283Let's take a look at a concrete example. 
"""
## A basic example

Let's take a look at a concrete example. To get started, let's
define a simple Sequential Keras model:
"""


# Define the Keras model to add callbacks to
def get_model():
    model = keras.Sequential()
    model.add(keras.layers.Dense(1))
    model.compile(
        optimizer=keras.optimizers.RMSprop(learning_rate=0.1),
        loss="mean_squared_error",
        metrics=["mean_absolute_error"],
    )
    return model


"""
Then, load the MNIST data for training and testing from the Keras datasets API:
"""

# Load example MNIST data and pre-process it
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

# Limit the data to 1000 samples
x_train = x_train[:1000]
y_train = y_train[:1000]
x_test = x_test[:1000]
y_test = y_test[:1000]

"""
Now, define a simple custom callback that logs:

- When `fit`/`evaluate`/`predict` starts & ends
- When each epoch starts & ends
- When each training batch starts & ends
- When each evaluation (test) batch starts & ends
- When each inference (prediction) batch starts & ends
"""


class CustomCallback(keras.callbacks.Callback):
    def on_train_begin(self, logs=None):
        keys = list(logs.keys())
        print("Starting training; got log keys: {}".format(keys))

    def on_train_end(self, logs=None):
        keys = list(logs.keys())
        print("Stop training; got log keys: {}".format(keys))

    def on_epoch_begin(self, epoch, logs=None):
        keys = list(logs.keys())
        print("Start epoch {} of training; got log keys: {}".format(epoch, keys))

    def on_epoch_end(self, epoch, logs=None):
        keys = list(logs.keys())
        print("End epoch {} of training; got log keys: {}".format(epoch, keys))

    def on_test_begin(self, logs=None):
        keys = list(logs.keys())
        print("Start testing; got log keys: {}".format(keys))

    def on_test_end(self, logs=None):
        keys = list(logs.keys())
        print("Stop testing; got log keys: {}".format(keys))

    def on_predict_begin(self, logs=None):
        keys = list(logs.keys())
        print("Start predicting; got log keys: {}".format(keys))

    def on_predict_end(self, logs=None):
        keys = list(logs.keys())
        print("Stop predicting; got log keys: {}".format(keys))

    def on_train_batch_begin(self, batch, logs=None):
        keys = list(logs.keys())
        print("...Training: start of batch {}; got log keys: {}".format(batch, keys))

    def on_train_batch_end(self, batch, logs=None):
        keys = list(logs.keys())
        print("...Training: end of batch {}; got log keys: {}".format(batch, keys))

    def on_test_batch_begin(self, batch, logs=None):
        keys = list(logs.keys())
        print("...Evaluating: start of batch {}; got log keys: {}".format(batch, keys))

    def on_test_batch_end(self, batch, logs=None):
        keys = list(logs.keys())
        print("...Evaluating: end of batch {}; got log keys: {}".format(batch, keys))

    def on_predict_batch_begin(self, batch, logs=None):
        keys = list(logs.keys())
        print("...Predicting: start of batch {}; got log keys: {}".format(batch, keys))

    def on_predict_batch_end(self, batch, logs=None):
        keys = list(logs.keys())
        print("...Predicting: end of batch {}; got log keys: {}".format(batch, keys))


"""
Let's try it out:
"""

model = get_model()
model.fit(
    x_train,
    y_train,
    batch_size=128,
    epochs=1,
    verbose=0,
    validation_split=0.5,
    callbacks=[CustomCallback()],
)

res = model.evaluate(
    x_test, y_test, batch_size=128, verbose=0, callbacks=[CustomCallback()]
)

res = model.predict(x_test, batch_size=128, callbacks=[CustomCallback()])
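"""
The demo above shows when each hook fires. As a small practical use of the
batch-level hooks, here is a minimal sketch of a callback that times each training
batch; the `BatchTimer` name and the `time` import are arbitrary additions for this
illustration.
"""

import time


class BatchTimer(keras.callbacks.Callback):
    def on_train_begin(self, logs=None):
        # Collect per-batch durations over the whole training run.
        self.batch_times = []

    def on_train_batch_begin(self, batch, logs=None):
        self._batch_start = time.time()

    def on_train_batch_end(self, batch, logs=None):
        self.batch_times.append(time.time() - self._batch_start)

    def on_train_end(self, logs=None):
        print(f"Mean batch time: {np.mean(self.batch_times):.4f}s")


model = get_model()
model.fit(
    x_train, y_train, batch_size=128, epochs=1, verbose=0, callbacks=[BatchTimer()]
)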
"""
### Usage of `logs` dict

The `logs` dict contains the loss value, and all the metrics at the end of a batch or
epoch. Examples include the loss and mean absolute error.
"""


class LossAndErrorPrintingCallback(keras.callbacks.Callback):
    def on_train_batch_end(self, batch, logs=None):
        print(
            "Up to batch {}, the average loss is {:7.2f}.".format(batch, logs["loss"])
        )

    def on_test_batch_end(self, batch, logs=None):
        print(
            "Up to batch {}, the average loss is {:7.2f}.".format(batch, logs["loss"])
        )

    def on_epoch_end(self, epoch, logs=None):
        print(
            "The average loss for epoch {} is {:7.2f} "
            "and mean absolute error is {:7.2f}.".format(
                epoch, logs["loss"], logs["mean_absolute_error"]
            )
        )


model = get_model()
model.fit(
    x_train,
    y_train,
    batch_size=128,
    epochs=2,
    verbose=0,
    callbacks=[LossAndErrorPrintingCallback()],
)

res = model.evaluate(
    x_test,
    y_test,
    batch_size=128,
    verbose=0,
    callbacks=[LossAndErrorPrintingCallback()],
)

"""
## Usage of `self.model` attribute

In addition to receiving log information when one of their methods is called,
callbacks have access to the model associated with the current round of
training/evaluation/inference: `self.model`.

Here are a few of the things you can do with `self.model` in a callback:

- Set `self.model.stop_training = True` to immediately interrupt training.
- Mutate hyperparameters of the optimizer (available as `self.model.optimizer`),
such as `self.model.optimizer.learning_rate`.
- Save the model at periodic intervals (see the sketch right after this list).
- Record the output of `model.predict()` on a few test samples at the end of each
epoch, to use as a sanity check during training.
- Extract visualizations of intermediate features at the end of each epoch, to monitor
what the model is learning over time.
- etc.
"""
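"""
As a minimal sketch of the periodic-saving idea from the list above: the
`PeriodicSaver` name, the saving period, and the filename pattern are all arbitrary
choices for this example.
"""


class PeriodicSaver(keras.callbacks.Callback):
    def __init__(self, period=2):
        super().__init__()
        self.period = period

    def on_epoch_end(self, epoch, logs=None):
        if (epoch + 1) % self.period == 0:
            # Whole-model saving expects the `.keras` file extension.
            self.model.save(f"model_at_epoch_{epoch + 1}.keras")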
"""
Let's see this in action in a couple of more complete examples.
"""

"""
## Examples of Keras callback applications

### Early stopping at minimum loss

This first example shows the creation of a `Callback` that stops training when the
loss has reached its minimum, by setting the attribute `self.model.stop_training`
(boolean). Optionally, you can provide an argument `patience` to specify how many
epochs we should wait before stopping after having reached a local minimum.

`keras.callbacks.EarlyStopping` provides a more complete and general implementation.
"""


class EarlyStoppingAtMinLoss(keras.callbacks.Callback):
    """Stop training when the loss is at its min, i.e. the loss stops decreasing.

    Arguments:
        patience: Number of epochs to wait after min has been hit. After this
            number of epochs with no improvement, training stops.
    """

    def __init__(self, patience=0):
        super().__init__()
        self.patience = patience
        # best_weights to store the weights at which the minimum loss occurs.
        self.best_weights = None

    def on_train_begin(self, logs=None):
        # The number of epochs waited while the loss is no longer at a minimum.
        self.wait = 0
        # The epoch the training stops at.
        self.stopped_epoch = 0
        # Initialize the best as infinity.
        self.best = np.inf

    def on_epoch_end(self, epoch, logs=None):
        current = logs.get("loss")
        if np.less(current, self.best):
            self.best = current
            self.wait = 0
            # Record the best weights if the current result is better (lower).
            self.best_weights = self.model.get_weights()
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.stopped_epoch = epoch
                self.model.stop_training = True
                print("Restoring model weights from the end of the best epoch.")
                self.model.set_weights(self.best_weights)

    def on_train_end(self, logs=None):
        if self.stopped_epoch > 0:
            print(f"Epoch {self.stopped_epoch + 1}: early stopping")


model = get_model()
model.fit(
    x_train,
    y_train,
    batch_size=64,
    epochs=30,
    verbose=0,
    callbacks=[LossAndErrorPrintingCallback(), EarlyStoppingAtMinLoss()],
)

"""
### Learning rate scheduling

In this example, we show how a custom Callback can be used to dynamically change the
learning rate of the optimizer during the course of training.

See `keras.callbacks.LearningRateScheduler` for a more general implementation.
"""


class CustomLearningRateScheduler(keras.callbacks.Callback):
    """Learning rate scheduler which sets the learning rate according to schedule.

    Arguments:
        schedule: a function that takes an epoch index
            (integer, indexed from 0) and current learning rate
            as inputs and returns a new learning rate as output (float).
    """

    def __init__(self, schedule):
        super().__init__()
        self.schedule = schedule

    def on_epoch_begin(self, epoch, logs=None):
        if not hasattr(self.model.optimizer, "learning_rate"):
            raise ValueError('Optimizer must have a "learning_rate" attribute.')
        # Get the current learning rate from the model's optimizer.
        lr = self.model.optimizer.learning_rate
        # Call the schedule function to get the scheduled learning rate.
        scheduled_lr = self.schedule(epoch, lr)
        # Set the value back on the optimizer before this epoch starts.
        self.model.optimizer.learning_rate = scheduled_lr
        print(f"\nEpoch {epoch}: Learning rate is {float(np.array(scheduled_lr))}.")


LR_SCHEDULE = [
    # (epoch to start, learning rate) tuples
    (3, 0.05),
    (6, 0.01),
    (9, 0.005),
    (12, 0.001),
]


def lr_schedule(epoch, lr):
    """Helper function to retrieve the scheduled learning rate based on epoch."""
    if epoch < LR_SCHEDULE[0][0] or epoch > LR_SCHEDULE[-1][0]:
        return lr
    for i in range(len(LR_SCHEDULE)):
        if epoch == LR_SCHEDULE[i][0]:
            return LR_SCHEDULE[i][1]
    return lr


model = get_model()
model.fit(
    x_train,
    y_train,
    batch_size=64,
    epochs=15,
    verbose=0,
    callbacks=[
        LossAndErrorPrintingCallback(),
        CustomLearningRateScheduler(lr_schedule),
    ],
)

"""
### Built-in Keras callbacks

Be sure to check out the existing Keras callbacks by
reading the [API docs](https://keras.io/api/callbacks/).
Applications include logging to CSV, saving
the model, visualizing metrics in TensorBoard, and a lot more!
"""
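"""
As a closing sketch, here is one plausible way to combine several built-in callbacks
in a single `fit()` call. The monitored quantity, patience, and file names below are
placeholder choices for this example.
"""

model = get_model()
model.fit(
    x_train,
    y_train,
    batch_size=64,
    epochs=5,
    verbose=0,
    callbacks=[
        # Stop training when the loss has not improved for 2 consecutive epochs.
        keras.callbacks.EarlyStopping(monitor="loss", patience=2),
        # Save the best model (lowest loss) seen so far.
        keras.callbacks.ModelCheckpoint(
            filepath="best_model.keras", monitor="loss", save_best_only=True
        ),
        # Append per-epoch metrics to a CSV file.
        keras.callbacks.CSVLogger("training_log.csv"),
    ],
)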