GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/ja/lattice/tutorials/custom_estimators.ipynb

Kernel: Python 3

Copyright 2020 The TensorFlow Authors.

In [ ]:

#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

TF Lattice Custom Estimators

Warning: Estimators are not recommended for new code. Estimators run v1.Session-style code, which is more difficult to write correctly and can behave unexpectedly, especially when combined with TF2 code. Estimators do fall under the compatibility guarantees (https://tensorflow.org/guide/versions), but they will receive no new features and no fixes other than for security vulnerabilities. See the [migration guide](https://tensorflow.org/guide/migrate) for details.

Overview

You can use custom estimators to create arbitrarily monotonic models using TFL (TensorFlow Lattice) layers. This guide outlines the steps needed to create such estimators.

Setup

Install the TF Lattice package.

In [ ]:

#@test {"skip": true}
!pip install tensorflow-lattice

Import the required packages.

In [ ]:

import tensorflow as tf

import logging
import numpy as np
import pandas as pd
import sys
import tensorflow_lattice as tfl
from tensorflow import feature_column as fc

from tensorflow_estimator.python.estimator.canned import optimizers
from tensorflow_estimator.python.estimator.head import binary_class_head
logging.disable(sys.maxsize)

Download the UCI Statlog (Heart) dataset.

In [ ]:

csv_file = tf.keras.utils.get_file(
    'heart.csv', 'http://storage.googleapis.com/download.tensorflow.org/data/heart.csv')
df = pd.read_csv(csv_file)
target = df.pop('target')
train_size = int(len(df) * 0.8)
train_x = df[:train_size]
train_y = target[:train_size]
test_x = df[train_size:]
test_y = target[train_size:]
df.head()

Set the default values used for training in this guide.

In [ ]:

LEARNING_RATE = 0.1
BATCH_SIZE = 128
NUM_EPOCHS = 1000

Feature Columns

As with any other TF estimator, data needs to be passed to the estimator. This is typically done via an input_fn, and the data is parsed using FeatureColumns.

In [ ]:

# Feature columns.
# - age
# - sex
# - ca        number of major vessels (0-3) colored by fluoroscopy
# - thal      3 = normal; 6 = fixed defect; 7 = reversible defect
feature_columns = [
    fc.numeric_column('age', default_value=-1),
    fc.categorical_column_with_vocabulary_list('sex', [0, 1]),
    fc.numeric_column('ca'),
    fc.categorical_column_with_vocabulary_list(
        'thal', ['normal', 'fixed', 'reversible']),
]

Note that categorical features do not need to be wrapped in a dense feature column, because the tfl.layers.CategoricalCalibration layer can consume category indices directly.
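
To make the note above concrete, here is a minimal pure-Python sketch of the idea. It is not TFL's implementation: a vocabulary-list column maps each raw value to an integer index, and a categorical calibrator can then be thought of as a lookup of one output per index. The helper names vocabulary_index and calibrate_category, and the output values, are hypothetical and exist only for illustration.

```python
def vocabulary_index(value, vocabulary, default_index=-1):
    """Return the position of `value` in `vocabulary`, or `default_index`."""
    try:
        return vocabulary.index(value)
    except ValueError:
        return default_index


def calibrate_category(index, outputs):
    """Look up the calibrated output for a category index."""
    if not 0 <= index < len(outputs):
        raise ValueError(f"category index {index} is out of range")
    return outputs[index]


thal_vocabulary = ['normal', 'fixed', 'reversible']
# Hypothetical calibrator outputs, one per category (not trained values).
thal_outputs = [0.0, 0.6, 1.0]

index = vocabulary_index('fixed', thal_vocabulary)
calibrated = calibrate_category(index, thal_outputs)
print(index, calibrated)
```

Because the calibrator works on the index itself, no one-hot or embedding wrapper is needed between the categorical column and the calibration layer.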

Creating input_fn

As with any other estimator, you can use an input_fn to feed data to the model for training and evaluation.

In [ ]:

train_input_fn = tf.compat.v1.estimator.inputs.pandas_input_fn(
    x=train_x,
    y=train_y,
    shuffle=True,
    batch_size=BATCH_SIZE,
    num_epochs=NUM_EPOCHS,
    num_threads=1)

test_input_fn = tf.compat.v1.estimator.inputs.pandas_input_fn(
    x=test_x,
    y=test_y,
    shuffle=False,
    batch_size=BATCH_SIZE,
    num_epochs=1,
    num_threads=1)
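
As a rough mental model of what an input_fn supplies, the sketch below yields (features, labels) batches from in-memory columns in plain Python. It is an illustration under simplifying assumptions, not how pandas_input_fn is implemented (the real function builds TensorFlow input ops, and its batching across epoch boundaries may differ). The helper iterate_batches and the toy data are hypothetical.

```python
import random


def iterate_batches(features, labels, batch_size, num_epochs, shuffle, seed=0):
    """Yield (feature_dict, label_list) batches, epoch by epoch."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    for _ in range(num_epochs):
        if shuffle:
            rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            yield ({name: [column[i] for i in batch]
                    for name, column in features.items()},
                   [labels[i] for i in batch])


# Toy columns standing in for the heart dataset (made-up values).
toy_features = {'age': [63, 67, 41, 56, 57], 'ca': [0, 3, 0, 0, 1]}
toy_labels = [0, 1, 0, 0, 1]

batches = list(iterate_batches(toy_features, toy_labels, batch_size=2,
                               num_epochs=1, shuffle=False))
print(len(batches), batches[0])
```

The same pattern explains the two configurations above: training shuffles and repeats the data for many epochs, while evaluation makes a single unshuffled pass.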

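Creating model_fn

There are several ways to create a custom estimator. Here we construct a model_fn that calls a Keras model on the parsed input tensors. To parse the input features, you can use tf.feature_column.input_layer, tf.keras.layers.DenseFeatures, or tfl.estimators.transform_features. If you use the last of these, you do not need to wrap categorical features in dense feature columns, and the resulting tensors are not concatenated, which makes it easier to use the features in the calibration layers.

To build the model, you can mix and match TFL layers with any other Keras layers. Here we create a calibrated lattice Keras model out of TFL layers and impose several monotonicity constraints. We then use the Keras model to create the custom estimator.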
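
Before the full model, it may help to see what a piecewise-linear calibrator computes. The following is a minimal pure-Python sketch, not the tfl.layers.PWLCalibration implementation: the keypoint outputs are hypothetical rather than learned, and the helper name pwl_calibrate is an assumption made for illustration. When the keypoint outputs are non-decreasing, the resulting function is monotonically increasing, which is the kind of constraint the model imposes on age.

```python
import bisect


def pwl_calibrate(x, input_keypoints, output_keypoints):
    """Linearly interpolate between keypoints, clamping outside their range."""
    if x <= input_keypoints[0]:
        return output_keypoints[0]
    if x >= input_keypoints[-1]:
        return output_keypoints[-1]
    # Index of the first keypoint strictly greater than x.
    right = bisect.bisect_right(input_keypoints, x)
    left = right - 1
    x0, x1 = input_keypoints[left], input_keypoints[right]
    y0, y1 = output_keypoints[left], output_keypoints[right]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical keypoints: ages from 10 to 100 mapped onto the range [0, 2].
age_keypoints = [10.0, 40.0, 70.0, 100.0]
age_outputs = [0.0, 0.5, 1.5, 2.0]

print(pwl_calibrate(55.0, age_keypoints, age_outputs))
```

In this sketch the output range [0, 2] is chosen to match a lattice dimension of size 3, mirroring the output_min and output_max settings used for the first calibrator in the model.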

In [ ]:

def model_fn(features, labels, mode, config):
  """model_fn for the custom estimator."""
  del config
  input_tensors = tfl.estimators.transform_features(features, feature_columns)
  inputs = {
      key: tf.keras.layers.Input(shape=(1,), name=key) for key in input_tensors
  }

  lattice_sizes = [3, 2, 2, 2]
  lattice_monotonicities = ['increasing', 'none', 'increasing', 'increasing']
  lattice_input = tf.keras.layers.Concatenate(axis=1)([
      tfl.layers.PWLCalibration(
          input_keypoints=np.linspace(10, 100, num=8, dtype=np.float32),
          # The output range of the calibrator should be the input range of
          # the following lattice dimension.
          output_min=0.0,
          output_max=lattice_sizes[0] - 1.0,
          monotonicity='increasing',
      )(inputs['age']),
      tfl.layers.CategoricalCalibration(
          # Number of categories including any missing/default category.
          num_buckets=2,
          output_min=0.0,
          output_max=lattice_sizes[1] - 1.0,
      )(inputs['sex']),
      tfl.layers.PWLCalibration(
          input_keypoints=[0.0, 1.0, 2.0, 3.0],
          output_min=0.0,
          output_max=lattice_sizes[2] - 1.0,
          # You can specify TFL regularizers as tuple
          # ('regularizer name', l1, l2).
          kernel_regularizer=('hessian', 0.0, 1e-4),
          monotonicity='increasing',
      )(inputs['ca']),
      tfl.layers.CategoricalCalibration(
          num_buckets=3,
          output_min=0.0,
          output_max=lattice_sizes[3] - 1.0,
          # Categorical monotonicity can be partial order.
          # (i, j) indicates that we must have output(i) <= output(j).
          # Make sure to set the lattice monotonicity to 'increasing' for this
          # dimension.
          monotonicities=[(0, 1), (0, 2)],
      )(inputs['thal']),
  ])
  output = tfl.layers.Lattice(
      lattice_sizes=lattice_sizes, monotonicities=lattice_monotonicities)(
          lattice_input)

  training = (mode == tf.estimator.ModeKeys.TRAIN)
  model = tf.keras.Model(inputs=inputs, outputs=output)
  logits = model(input_tensors, training=training)

  if training:
    optimizer = optimizers.get_optimizer_instance_v2('Adagrad', LEARNING_RATE)
  else:
    optimizer = None

  head = binary_class_head.BinaryClassHead()
  return head.create_estimator_spec(
      features=features,
      mode=mode,
      labels=labels,
      optimizer=optimizer,
      logits=logits,
      trainable_variables=model.trainable_variables,
      update_ops=model.updates)
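
Conceptually, the Lattice layer in the model_fn above is an interpolated look-up table over the calibrated inputs. The sketch below shows multilinear interpolation on a small 2-D lattice in plain Python. It is a simplified illustration with hypothetical vertex values, not the tfl.layers.Lattice implementation, and lattice_interpolate_2d is a name introduced here. It also shows why each calibrator's output range should be [0, lattice size - 1]: that is the valid coordinate range of the corresponding lattice dimension.

```python
def lattice_interpolate_2d(x, y, vertices):
    """Bilinear interpolation; vertices[i][j] is the value at grid point (i, j)."""
    size_x, size_y = len(vertices), len(vertices[0])
    # Clamp inputs to the valid range [0, size - 1] of each dimension.
    x = min(max(x, 0.0), size_x - 1.0)
    y = min(max(y, 0.0), size_y - 1.0)
    # Lower corner of the grid cell that contains the point.
    i = min(int(x), size_x - 2)
    j = min(int(y), size_y - 2)
    fx, fy = x - i, y - j
    return ((1 - fx) * (1 - fy) * vertices[i][j]
            + fx * (1 - fy) * vertices[i + 1][j]
            + (1 - fx) * fy * vertices[i][j + 1]
            + fx * fy * vertices[i + 1][j + 1])


# A 3 x 2 lattice with hypothetical vertex values that increase along both axes.
vertices = [[0.0, 1.0],
            [2.0, 3.0],
            [4.0, 5.0]]

print(lattice_interpolate_2d(1.5, 0.5, vertices))
```

Because the vertex values here increase along both axes, the interpolated function is increasing in both inputs, which is the intuition behind setting a lattice dimension's monotonicity to 'increasing'.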

Training and Estimator

Using the model_fn, we can create and train the estimator.

In [ ]:

estimator = tf.estimator.Estimator(model_fn=model_fn)
estimator.train(input_fn=train_input_fn)
results = estimator.evaluate(input_fn=test_input_fn)
print('AUC: {}'.format(results['auc']))
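
For intuition about the reported metric, AUC can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes that pairwise definition in plain Python on made-up scores. The estimator's own auc metric is computed by TensorFlow and may use a different, approximate method, so the values need not match exactly. pairwise_auc is a hypothetical helper.

```python
def pairwise_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up labels and model scores, for illustration only.
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]

print(pairwise_auc(example_labels, example_scores))
```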