GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/ja/guide/migrate/early_stopping.ipynb

Kernel: Python 3

Copyright 2021 The TensorFlow Authors.

In [ ]:

#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

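Migrate early stopping

This notebook demonstrates how to set up model training with early stopping, first in TensorFlow 1 with tf.estimator.Estimator and an early stopping hook, and then in TensorFlow 2 with Keras APIs or a custom training loop. Early stopping is a regularization technique that stops training if, for example, the validation loss reaches a certain threshold.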

In TensorFlow 2, there are three ways to implement early stopping:

Use a built-in Keras callback (tf.keras.callbacks.EarlyStopping) and pass it to Model.fit.
Define a custom callback and pass it to Keras Model.fit.
Write a custom early stopping rule in a custom training loop (with tf.GradientTape).

Setup

In [ ]:

import time
import numpy as np
import tensorflow as tf
import tensorflow.compat.v1 as tf1
import tensorflow_datasets as tfds

TensorFlow 1: Early stopping with an early stopping hook and tf.estimator

Start by defining functions that load and preprocess the MNIST dataset, along with the model definition used with tf.estimator.Estimator.

In [ ]:

def normalize_img(image, label):
  return tf.cast(image, tf.float32) / 255., label

def _input_fn():
  ds_train = tfds.load(
    name='mnist',
    split='train',
    shuffle_files=True,
    as_supervised=True)

  ds_train = ds_train.map(
      normalize_img, num_parallel_calls=tf.data.AUTOTUNE)
  ds_train = ds_train.batch(128)
  ds_train = ds_train.repeat(100)
  return ds_train

def _eval_input_fn():
  ds_test = tfds.load(
    name='mnist',
    split='test',
    shuffle_files=True,
    as_supervised=True)
  ds_test = ds_test.map(
    normalize_img, num_parallel_calls=tf.data.AUTOTUNE)
  ds_test = ds_test.batch(128)
  return ds_test

def _model_fn(features, labels, mode):
  flatten = tf1.layers.Flatten()(features)
  features = tf1.layers.Dense(128, 'relu')(flatten)
  logits = tf1.layers.Dense(10)(features)

  loss = tf1.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
  optimizer = tf1.train.AdagradOptimizer(0.005)
  train_op = optimizer.minimize(loss, global_step=tf1.train.get_global_step())

  return tf1.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)

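In TensorFlow 1, early stopping works by creating an early stopping hook with tf.estimator.experimental.make_early_stopping_hook. You pass a function that takes no arguments as the should_stop_fn parameter of make_early_stopping_hook. Training stops once should_stop_fn returns True.

The following example shows how to implement an early stopping rule that limits the training time to a maximum of 20 seconds.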

In [ ]:

estimator = tf1.estimator.Estimator(model_fn=_model_fn)

start_time = time.time()
max_train_seconds = 20

def should_stop_fn():
  return time.time() - start_time > max_train_seconds

early_stopping_hook = tf1.estimator.experimental.make_early_stopping_hook(
    estimator=estimator,
    should_stop_fn=should_stop_fn,
    run_every_secs=1,
    run_every_steps=None)

train_spec = tf1.estimator.TrainSpec(
    input_fn=_input_fn,
    hooks=[early_stopping_hook])

eval_spec = tf1.estimator.EvalSpec(input_fn=_eval_input_fn)

tf1.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
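
Because should_stop_fn is just a zero-argument callable, the stopping rule can be built and checked separately from the Estimator. The following is a minimal sketch that is not part of the original notebook: make_time_limit_fn and its clock parameter are hypothetical names chosen for illustration, and the fake clock exists only so the behavior can be checked without waiting.

```python
import time


def make_time_limit_fn(max_seconds, clock=time.monotonic):
  """Returns a zero-argument function that is True once the budget is spent.

  `make_time_limit_fn` and `clock` are illustrative names, not TensorFlow APIs.
  """
  start = clock()  # The budget starts when the rule is created.

  def should_stop_fn():
    return clock() - start > max_seconds

  return should_stop_fn


# Drive the rule with a fake clock so the example runs instantly.
fake_now = [0.0]
should_stop = make_time_limit_fn(20, clock=lambda: fake_now[0])

fake_now[0] = 5.0
stop_at_5s = should_stop()   # 5 seconds elapsed, within the 20 second budget.
fake_now[0] = 21.0
stop_at_21s = should_stop()  # 21 seconds elapsed, budget exceeded.
print(stop_at_5s, stop_at_21s)
```

With the default clock, a function built this way could be passed as should_stop_fn in place of the hand-written one above.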

TensorFlow 2: Early stopping with a built-in callback and Model.fit

Prepare the MNIST dataset and a simple Keras model:

In [ ]:

(ds_train, ds_test), ds_info = tfds.load(
    'mnist',
    split=['train', 'test'],
    shuffle_files=True,
    as_supervised=True,
    with_info=True,
)

ds_train = ds_train.map(
    normalize_img, num_parallel_calls=tf.data.AUTOTUNE)
ds_train = ds_train.batch(128)

ds_test = ds_test.map(
    normalize_img, num_parallel_calls=tf.data.AUTOTUNE)
ds_test = ds_test.batch(128)

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dense(10)
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(0.005),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[tf.keras.metrics.SparseCategoricalAccuracy()],
)

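In TensorFlow 2, when you use the built-in Keras Model.fit (or Model.evaluate), you can configure early stopping by passing a built-in callback, tf.keras.callbacks.EarlyStopping, to the callbacks parameter of Model.fit.

The EarlyStopping callback monitors a user-specified metric and ends training when it stops improving. (See the Training and evaluation with the built-in methods guide or the API docs for more information.)

Below is an example of an early stopping callback that monitors the training loss (monitor='loss') and stops training once the loss has shown no improvement for 3 epochs (patience=3):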

In [ ]:

callback = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=3)

# Only around 25 epochs are run during training, instead of 100.
history = model.fit(
    ds_train,
    epochs=100,
    validation_data=ds_test,
    callbacks=[callback]
)

len(history.history['loss'])
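
To make the role of patience concrete, here is a framework-free sketch of the counting rule. It is a simplification written for illustration, not the actual Keras implementation: epochs_until_stop is a hypothetical helper, the loss values are made up, and min_delta only mirrors the idea behind the EarlyStopping parameter of the same name.

```python
def epochs_until_stop(losses, patience, min_delta=0.0):
  """Returns how many epochs run before a patience rule stops training.

  Hypothetical helper: an epoch counts as an improvement only if its loss
  beats the best value seen so far by more than `min_delta`.
  """
  best = float('inf')
  wait = 0  # Epochs since the last improvement.
  for epoch, loss in enumerate(losses, start=1):
    if loss < best - min_delta:
      best = loss
      wait = 0
    else:
      wait += 1
      if wait >= patience:
        return epoch
  return len(losses)  # The rule never fired.


# Made-up per-epoch losses: the best value (0.8) is not beaten for 3 epochs.
losses = [1.0, 0.9, 0.8, 0.85, 0.82, 0.81, 0.7]
stopped_with_patience_3 = epochs_until_stop(losses, patience=3)
stopped_with_patience_5 = epochs_until_stop(losses, patience=5)
print(stopped_with_patience_3, stopped_with_patience_5)
```

A larger patience lets training ride out a temporary plateau, at the cost of extra epochs when the plateau is permanent. Since validation_data is passed to Model.fit above, monitor='val_loss' is an alternative to watching the training loss.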

TensorFlow 2: Early stopping with a custom callback and Model.fit

You can also implement a custom early stopping callback and pass it to the callbacks parameter of Model.fit (or Model.evaluate).

In this example, the training process stops once self.model.stop_training is set to True:

In [ ]:

class LimitTrainingTime(tf.keras.callbacks.Callback):
  def __init__(self, max_time_s):
    super().__init__()
    self.max_time_s = max_time_s
    self.start_time = None

  def on_train_begin(self, logs=None):
    # Start the clock when training begins.
    self.start_time = time.time()

  def on_train_batch_end(self, batch, logs=None):
    # Check the time budget after every batch.
    now = time.time()
    if now - self.start_time > self.max_time_s:
      self.model.stop_training = True

In [ ]:

# Limit the training time to 30 seconds.
callback = LimitTrainingTime(30)
history = model.fit(
    ds_train,
    epochs=100,
    validation_data=ds_test,
    callbacks=[callback]
)
len(history.history['loss'])
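
The stopping condition in a custom callback need not be time-based. As a hedged sketch of the threshold rule mentioned in the introduction, the class below decides whether to stop from a logs dictionary like the one Keras passes to callback methods. StopAtThreshold, its arguments, and the metric values are hypothetical and not part of the original notebook.

```python
class StopAtThreshold:
  """Signals a stop once a monitored value falls to or below a threshold.

  Hypothetical helper for illustration; it is not a Keras class.
  """

  def __init__(self, monitor, threshold):
    self.monitor = monitor
    self.threshold = threshold

  def should_stop(self, logs):
    value = (logs or {}).get(self.monitor)
    # A missing metric never triggers a stop.
    return value is not None and value <= self.threshold


rule = StopAtThreshold(monitor='val_loss', threshold=0.1)

# Made-up end-of-epoch logs; the second one lacks the monitored metric.
decisions = [rule.should_stop(logs) for logs in (
    {'loss': 0.30, 'val_loss': 0.25},
    {'loss': 0.12},
    {'loss': 0.08, 'val_loss': 0.09},
)]
print(decisions)
```

Inside a tf.keras.callbacks.Callback subclass, one way to wire this in would be to assign the result to self.model.stop_training in on_epoch_end, in the same way LimitTrainingTime sets the flag above.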

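TensorFlow 2: Early stopping with a custom training loop

In TensorFlow 2, you can implement early stopping in a custom training loop if you are not training and evaluating with the built-in Keras methods.

Start by using Keras APIs to define another simple model, an optimizer, a loss function, and metrics: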

In [ ]:

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dense(10)
])

optimizer = tf.keras.optimizers.Adam(0.005)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

train_acc_metric = tf.keras.metrics.SparseCategoricalAccuracy()
train_loss_metric = tf.keras.metrics.SparseCategoricalCrossentropy()
val_acc_metric = tf.keras.metrics.SparseCategoricalAccuracy()
val_loss_metric = tf.keras.metrics.SparseCategoricalCrossentropy()

Define the parameter update functions with tf.GradientTape, and decorate them with @tf.function for a speedup:

In [ ]:

@tf.function
def train_step(x, y):
  with tf.GradientTape() as tape:
    logits = model(x, training=True)
    loss_value = loss_fn(y, logits)
  grads = tape.gradient(loss_value, model.trainable_weights)
  optimizer.apply_gradients(zip(grads, model.trainable_weights))
  train_acc_metric.update_state(y, logits)
  train_loss_metric.update_state(y, logits)
  return loss_value

@tf.function
def test_step(x, y):
  logits = model(x, training=False)
  val_acc_metric.update_state(y, logits)
  val_loss_metric.update_state(y, logits)

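Next, write a custom training loop, where you can implement your early stopping rule manually.

The example below shows how to stop training when the validation loss does not improve over a certain number of epochs: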

In [ ]:

epochs = 100
patience = 5
wait = 0
best = float('inf')

for epoch in range(epochs):
    print("\nStart of epoch %d" % (epoch,))
    start_time = time.time()

    for step, (x_batch_train, y_batch_train) in enumerate(ds_train):
      loss_value = train_step(x_batch_train, y_batch_train)
      if step % 200 == 0:
        print("Training loss at step %d: %.4f" % (step, loss_value.numpy()))
        print("Seen so far: %s samples" % ((step + 1) * 128))
    train_acc = train_acc_metric.result()
    train_loss = train_loss_metric.result()
    train_acc_metric.reset_states()
    train_loss_metric.reset_states()
    print("Training acc over epoch: %.4f" % (train_acc.numpy()))

    for x_batch_val, y_batch_val in ds_test:
      test_step(x_batch_val, y_batch_val)
    val_acc = val_acc_metric.result()
    val_loss = val_loss_metric.result()
    val_acc_metric.reset_states()
    val_loss_metric.reset_states()
    print("Validation acc: %.4f" % (float(val_acc),))
    print("Time taken: %.2fs" % (time.time() - start_time))

    # The early stopping strategy: stop the training if `val_loss` does not
    # decrease over a certain number of epochs.
    wait += 1
    if val_loss < best:
      best = val_loss
      wait = 0
    if wait >= patience:
      break
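
The wait and best bookkeeping in the loop above can also be packaged into a small object that remembers which epoch was best. This is a minimal sketch under stated assumptions: PatienceTracker is a hypothetical class, not a TensorFlow API, and the validation losses are made up.

```python
class PatienceTracker:
  """Tracks the best validation loss and how long it has gone unbeaten."""

  def __init__(self, patience):
    self.patience = patience
    self.best = float('inf')
    self.best_epoch = None
    self.wait = 0

  def update(self, epoch, val_loss):
    """Records one epoch and returns True when training should stop."""
    if val_loss < self.best:
      # New best: a natural place to snapshot the model weights.
      self.best = val_loss
      self.best_epoch = epoch
      self.wait = 0
      return False
    self.wait += 1
    return self.wait >= self.patience


tracker = PatienceTracker(patience=2)
val_losses = [0.50, 0.40, 0.42, 0.41, 0.39]  # Made-up values.
stopped_at = None
for epoch, val_loss in enumerate(val_losses):
  if tracker.update(epoch, val_loss):
    stopped_at = epoch
    break
print(stopped_at, tracker.best_epoch, tracker.best)
```

In the training loop, float(val_loss) could be passed to update after each validation pass. Recording the best epoch also makes it possible to keep a copy of the weights from that epoch, for example with model.get_weights() and model.set_weights(), so that the final model is not the one from the last, worse epochs.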

Next steps

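Learn more about the Keras built-in early stopping callback API in the API docs.
Learn to write custom Keras callbacks, including early stopping at a minimum loss.
Learn about training and evaluation with Keras built-in methods.
Explore common regularization techniques in the Overfit and underfit tutorial, which uses the EarlyStopping callback.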