GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/ko/guide/migrate/canned_estimators.ipynb
Kernel: Python 3

Copyright 2021 The TensorFlow Authors.

In [ ]:

#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

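Migration examples: Canned Estimators

Canned (or premade) Estimators have traditionally been used in TensorFlow 1 as a quick and easy way to train models for a variety of typical use cases. TensorFlow 2 provides straightforward approximate substitutes for many of them through Keras models. For canned estimators that have no built-in TensorFlow 2 substitute, you can still build your own replacement fairly easily.

This guide walks through a few examples of direct equivalents and custom substitutions to show how models derived from TensorFlow 1's tf.estimator can be migrated to TensorFlow 2 with Keras.

Namely, this guide includes examples for migrating: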

From tf.estimator's LinearEstimator, Classifier or Regressor in TensorFlow 1 to Keras tf.compat.v1.keras.models.LinearModel in TensorFlow 2
From tf.estimator's DNNEstimator, Classifier or Regressor in TensorFlow 1 to a custom Keras DNN model in TensorFlow 2
From tf.estimator's DNNLinearCombinedEstimator, Classifier or Regressor in TensorFlow 1 to tf.compat.v1.keras.models.WideDeepModel in TensorFlow 2
From tf.estimator's BoostedTreesEstimator, Classifier or Regressor in TensorFlow 1 to tfdf.keras.GradientBoostedTreesModel in TensorFlow 2

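A common precursor to model training is feature preprocessing, which for TensorFlow 1 Estimator models is done with tf.feature_column. For more information on feature preprocessing in TensorFlow 2, see the guide on migrating from feature columns to the Keras preprocessing layers API.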
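
As a rough illustration of what replacing a numeric feature column with a Keras preprocessing layer can look like, the sketch below standardizes one numeric feature with tf.keras.layers.Normalization. The sample values are invented for illustration and are not taken from the Titanic data. The rest of this guide does not depend on this snippet, and the referenced guide remains the authoritative source for the full migration.

```python
import numpy as np
import tensorflow as tf

# Hypothetical numeric feature values (not taken from the Titanic dataset).
ages = np.array([[22.0], [38.0], [26.0], [35.0], [54.0]], dtype="float32")

# `adapt` computes the mean and variance of the data; calling the layer then
# standardizes its inputs to roughly zero mean and unit variance.
normalizer = tf.keras.layers.Normalization(axis=-1)
normalizer.adapt(ages)

normalized_ages = normalizer(ages).numpy()
print(normalized_ages.ravel())
```

Such a layer can be placed at the front of a Keras model so that preprocessing travels with the model when it is saved.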

Setup

Start with a few necessary TensorFlow imports:

In [ ]:

!pip install tensorflow_decision_forests

In [ ]:

import keras
import pandas as pd
import tensorflow as tf
import tensorflow.compat.v1 as tf1
import tensorflow_decision_forests as tfdf

Prepare some simple data for demonstration from the standard Titanic dataset:

In [ ]:

x_train = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/train.csv')
x_eval = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/eval.csv')
# Encode the categorical columns as integers. Assigning the result back to the
# column avoids chained in-place modification, which newer pandas versions do
# not propagate to the parent DataFrame.
x_train['sex'] = x_train['sex'].map({'male': 0, 'female': 1})
x_eval['sex'] = x_eval['sex'].map({'male': 0, 'female': 1})

x_train['alone'] = x_train['alone'].map({'n': 0, 'y': 1})
x_eval['alone'] = x_eval['alone'].map({'n': 0, 'y': 1})

x_train['class'] = x_train['class'].map({'First': 1, 'Second': 2, 'Third': 3})
x_eval['class'] = x_eval['class'].map({'First': 1, 'Second': 2, 'Third': 3})

x_train.drop(['embark_town', 'deck'], axis=1, inplace=True)
x_eval.drop(['embark_town', 'deck'], axis=1, inplace=True)

y_train = x_train.pop('survived')
y_eval = x_eval.pop('survived')

In [ ]:

# Data setup for TensorFlow 1 with `tf.estimator`
def _input_fn():
  return tf1.data.Dataset.from_tensor_slices((dict(x_train), y_train)).batch(32)


def _eval_input_fn():
  return tf1.data.Dataset.from_tensor_slices((dict(x_eval), y_eval)).batch(32)


FEATURE_NAMES = [
    'age', 'fare', 'sex', 'n_siblings_spouses', 'parch', 'class', 'alone'
]

feature_columns = []
for fn in FEATURE_NAMES:
  feat_col = tf1.feature_column.numeric_column(fn, dtype=tf.float32)
  feature_columns.append(feat_col)
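
The input functions above return datasets grouped into batches of 32 examples. As a plain-Python sketch of what that batching means (an illustration only, not how tf.data is implemented), the helper below slices a list into consecutive batches; the 70-row dataset is hypothetical. As with Dataset.batch in its default configuration, the last batch is smaller when the row count is not a multiple of the batch size.

```python
def batch(rows, batch_size):
  """Yields consecutive slices of `rows`, mimicking `Dataset.batch`."""
  for start in range(0, len(rows), batch_size):
    yield rows[start:start + batch_size]


rows = list(range(70))  # Hypothetical dataset with 70 examples.
batch_sizes = [len(b) for b in batch(rows, 32)]
print(batch_sizes)  # [32, 32, 6]
```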

Create a method that instantiates a simple sample optimizer to use with the various TensorFlow 1 Estimator and TensorFlow 2 Keras models.

In [ ]:

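def create_sample_optimizer(tf_version):
  """Returns a sample FTRL optimizer for 'tf1' Estimators or 'tf2' Keras models."""
  if tf_version == 'tf1':
    # Estimators accept a callable, so the optimizer and its learning rate
    # (which depends on the global step) are created inside the Estimator's graph.
    optimizer = lambda: tf.keras.optimizers.legacy.Ftrl(
        l1_regularization_strength=0.001,
        learning_rate=tf1.train.exponential_decay(
            learning_rate=0.1,
            global_step=tf1.train.get_global_step(),
            decay_steps=10000,
            decay_rate=0.9))
  elif tf_version == 'tf2':
    optimizer = tf.keras.optimizers.legacy.Ftrl(
        l1_regularization_strength=0.001,
        learning_rate=tf.keras.optimizers.schedules.ExponentialDecay(
            initial_learning_rate=0.1, decay_steps=10000, decay_rate=0.9))
  else:
    raise ValueError(f"Expected 'tf1' or 'tf2', got {tf_version!r}")
  return optimizer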
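
Both branches above use the same exponential decay schedule for the learning rate. The plain-Python sketch below reproduces the non-staircase formula, initial_learning_rate * decay_rate ** (step / decay_steps), with the values used in this guide. The helper name exponential_decay is chosen here for illustration and is not a TensorFlow function.

```python
def exponential_decay(step, initial_learning_rate=0.1, decay_steps=10000,
                      decay_rate=0.9):
  """Learning rate at `step` under continuous (non-staircase) exponential decay."""
  return initial_learning_rate * decay_rate ** (step / decay_steps)


for step in (0, 100, 10000):
  print(step, exponential_decay(step))
```

With decay_steps=10000, the learning rate stays very close to 0.1 over the 100 training steps used in the Estimator examples and reaches 0.09 only after 10000 steps.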

Example 1: Migrating from LinearEstimator

TensorFlow 1: Using LinearEstimator

In TensorFlow 1, you can use tf.estimator.LinearEstimator to create a baseline linear model for regression and classification problems.

In [ ]:

linear_estimator = tf.estimator.LinearEstimator(
    head=tf.estimator.BinaryClassHead(),
    feature_columns=feature_columns,
    optimizer=create_sample_optimizer('tf1'))

In [ ]:

linear_estimator.train(input_fn=_input_fn, steps=100)
linear_estimator.evaluate(input_fn=_eval_input_fn, steps=10)

TensorFlow 2: Using Keras LinearModel

In TensorFlow 2, you can create an instance of the Keras tf.compat.v1.keras.models.LinearModel, which is the substitute for tf.estimator.LinearEstimator. The tf.compat.v1.keras path signals that the premade model exists for compatibility.

In [ ]:

linear_model = tf.compat.v1.keras.experimental.LinearModel()
linear_model.compile(loss='mse', optimizer=create_sample_optimizer('tf2'), metrics=['accuracy'])
linear_model.fit(x_train, y_train, epochs=10)
linear_model.evaluate(x_eval, y_eval, return_dict=True)
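
The Estimator in this example trains through BinaryClassHead, which uses a sigmoid cross-entropy loss on the model's logits, whereas the Keras example above compiles the model with 'mse'. The plain-Python sketch below computes a linear model's output for one example and compares the two losses. The feature values, weights, and bias are made up for illustration.

```python
import math


def linear_output(features, weights, bias):
  """Weighted sum of the features plus a bias, as a linear model computes it."""
  return sum(w * x for w, x in zip(weights, features)) + bias


def squared_error(label, prediction):
  return (label - prediction) ** 2


def sigmoid_cross_entropy(label, logit):
  """Binary cross-entropy where `logit` is the raw (pre-sigmoid) model output."""
  probability = 1.0 / (1.0 + math.exp(-logit))
  return -(label * math.log(probability) +
           (1 - label) * math.log(1 - probability))


# Hypothetical example: two features, made-up weights and bias.
logit = linear_output(features=[1.0, 0.5], weights=[0.8, -0.4], bias=0.1)
print(logit)
print(squared_error(1, logit))
print(sigmoid_cross_entropy(1, logit))
```

If you want the Keras model to follow the Estimator's loss more closely, one option is to compile it with tf.keras.losses.BinaryCrossentropy(from_logits=True) instead of 'mse'. This is a suggestion rather than part of the original example.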

Example 2: Migrating from DNNEstimator

TensorFlow 1: Using DNNEstimator

In TensorFlow 1, you can use tf.estimator.DNNEstimator to create a baseline deep neural network (DNN) model for regression and classification problems.

In [ ]:

dnn_estimator = tf.estimator.DNNEstimator(
    head=tf.estimator.BinaryClassHead(),
    feature_columns=feature_columns,
    hidden_units=[128],
    activation_fn=tf.nn.relu,
    optimizer=create_sample_optimizer('tf1'))

In [ ]:

dnn_estimator.train(input_fn=_input_fn, steps=100)
dnn_estimator.evaluate(input_fn=_eval_input_fn, steps=10)

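TensorFlow 2: Using Keras to create a custom DNN model

In TensorFlow 2, you can create a custom DNN model to substitute for one generated by tf.estimator.DNNEstimator, with a similar level of user-specified customization (for instance, as in the previous example, the ability to customize the chosen model optimizer).

A similar workflow can be used to replace tf.estimator.experimental.RNNEstimator with a Keras recurrent neural network (RNN) model. Keras provides a number of built-in, customizable choices through tf.keras.layers.RNN, tf.keras.layers.LSTM, and tf.keras.layers.GRU. To learn more, see the "Built-in RNN layers: a simple example" section of the RNN with Keras guide.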

In [ ]:

dnn_model = tf.keras.models.Sequential(
    [tf.keras.layers.Dense(128, activation='relu'),
     tf.keras.layers.Dense(1)])

dnn_model.compile(loss='mse', optimizer=create_sample_optimizer('tf2'), metrics=['accuracy'])

In [ ]:

dnn_model.fit(x_train, y_train, epochs=10)
dnn_model.evaluate(x_eval, y_eval, return_dict=True)
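
The RNN substitution mentioned above could follow the same pattern as the DNN model. The sketch below is a minimal, assumed setup rather than a drop-in replacement for RNNEstimator. It trains a small tf.keras.layers.LSTM model on random, hypothetical sequence data (8 sequences, 5 time steps, 3 features per step), because the Titanic data used in this guide is not sequential.

```python
import numpy as np
import tensorflow as tf

# Hypothetical sequence data: 8 sequences, 5 time steps, 3 features per step.
sequences = np.random.rand(8, 5, 3).astype("float32")
labels = np.random.randint(0, 2, size=(8, 1)).astype("float32")

# An LSTM layer summarizes each sequence; the Dense layer outputs one logit.
rnn_model = tf.keras.models.Sequential([
    tf.keras.layers.LSTM(16),
    tf.keras.layers.Dense(1),
])
rnn_model.compile(
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
    optimizer="adam")
rnn_model.fit(sequences, labels, epochs=1, verbose=0)

predictions = rnn_model.predict(sequences, verbose=0)
print(predictions.shape)  # (8, 1)
```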

Example 3: Migrating from DNNLinearCombinedEstimator

TensorFlow 1: Using DNNLinearCombinedEstimator

In TensorFlow 1, you can use tf.estimator.DNNLinearCombinedEstimator to create a baseline combined model for regression and classification problems, with customization capacity for both its linear and DNN components.

In [ ]:

optimizer = create_sample_optimizer('tf1')

combined_estimator = tf.estimator.DNNLinearCombinedEstimator(
    head=tf.estimator.BinaryClassHead(),
    # Wide settings
    linear_feature_columns=feature_columns,
    linear_optimizer=optimizer,
    # Deep settings
    dnn_feature_columns=feature_columns,
    dnn_hidden_units=[128],
    dnn_optimizer=optimizer)

In [ ]:

combined_estimator.train(input_fn=_input_fn, steps=100)
combined_estimator.evaluate(input_fn=_eval_input_fn, steps=10)

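TensorFlow 2: Using Keras WideDeepModel

In TensorFlow 2, you can create an instance of the Keras tf.compat.v1.keras.models.WideDeepModel to substitute for one generated by tf.estimator.DNNLinearCombinedEstimator, with a similar level of user-specified customization (for instance, as in the previous example, the ability to customize the chosen model optimizer).

This WideDeepModel is built from a constituent LinearModel and a custom DNN model, both of which are discussed in the preceding two examples. If desired, a custom linear model can be used in place of the built-in Keras LinearModel.

If you would like to build your own model instead of using a canned estimator, see the Keras Sequential model guide. For more information on custom training and optimizers, see the Custom training: walkthrough guide.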

In [ ]:

# Create LinearModel and DNN Model as in Examples 1 and 2
optimizer = create_sample_optimizer('tf2')

linear_model = tf.compat.v1.keras.experimental.LinearModel()
linear_model.compile(loss='mse', optimizer=optimizer, metrics=['accuracy'])
linear_model.fit(x_train, y_train, epochs=10, verbose=0)

dnn_model = tf.keras.models.Sequential(
    [tf.keras.layers.Dense(128, activation='relu'),
     tf.keras.layers.Dense(1)])
dnn_model.compile(loss='mse', optimizer=optimizer, metrics=['accuracy'])

In [ ]:

combined_model = tf.compat.v1.keras.experimental.WideDeepModel(linear_model,
                                                               dnn_model)
combined_model.compile(
    optimizer=[optimizer, optimizer], loss='mse', metrics=['accuracy'])
combined_model.fit([x_train, x_train], y_train, epochs=10)
combined_model.evaluate(x_eval, y_eval, return_dict=True)
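
Conceptually, the combined model above feeds its inputs to both sub-models and adds their outputs element-wise. The plain-Python sketch below shows that combination step for the default case with no final activation. The output values are made up, and the helper name wide_deep_output is hypothetical.

```python
def wide_deep_output(linear_output, dnn_output):
  """Element-wise sum of the wide (linear) and deep (DNN) model outputs."""
  return [wide + deep for wide, deep in zip(linear_output, dnn_output)]


# Hypothetical per-example outputs from the two sub-models.
linear_output = [0.2, -0.1]
dnn_output = [0.5, 0.3]
combined = wide_deep_output(linear_output, dnn_output)
print(combined)
```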

Example 4: Migrating from BoostedTreesEstimator

TensorFlow 1: Using BoostedTreesEstimator

In TensorFlow 1, you could use tf.estimator.BoostedTreesEstimator to create a baseline gradient boosting model that uses an ensemble of decision trees for regression and classification problems. This functionality is no longer included in TensorFlow 2.

bt_estimator = tf1.estimator.BoostedTreesEstimator(
    head=tf.estimator.BinaryClassHead(),
    n_batches_per_layer=1,
    max_depth=10,
    n_trees=1000,
    feature_columns=feature_columns)

bt_estimator.train(input_fn=_input_fn, steps=1000)
bt_estimator.evaluate(input_fn=_eval_input_fn, steps=100)
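
For readers unfamiliar with gradient boosting, the plain-Python sketch below shows the core idea on a toy one-dimensional problem: each new tree (here a single-split "stump") is fitted to the residuals of the current ensemble, and its prediction is added with a learning rate. This is a simplified illustration with squared-error loss and made-up data. It does not reflect how BoostedTreesEstimator or TensorFlow Decision Forests are implemented, and all function names are hypothetical.

```python
def fit_stump(xs, residuals):
  """Finds the single threshold split that minimizes squared error."""
  best = None
  # Skip the largest value so the right-hand side is never empty.
  for threshold in sorted(set(xs))[:-1]:
    left = [r for x, r in zip(xs, residuals) if x <= threshold]
    right = [r for x, r in zip(xs, residuals) if x > threshold]
    left_mean = sum(left) / len(left)
    right_mean = sum(right) / len(right)
    error = (sum((r - left_mean) ** 2 for r in left) +
             sum((r - right_mean) ** 2 for r in right))
    if best is None or error < best[0]:
      best = (error, threshold, left_mean, right_mean)
  _, threshold, left_mean, right_mean = best
  return lambda x: left_mean if x <= threshold else right_mean


def fit_boosted_stumps(xs, ys, n_trees=20, learning_rate=0.5):
  """Fits an additive ensemble of stumps to the residuals, one tree at a time."""
  base = sum(ys) / len(ys)
  trees = []
  predictions = [base] * len(xs)
  for _ in range(n_trees):
    # For squared error, the residual is the negative gradient of the loss.
    residuals = [y - p for y, p in zip(ys, predictions)]
    tree = fit_stump(xs, residuals)
    trees.append(tree)
    predictions = [p + learning_rate * tree(x)
                   for x, p in zip(xs, predictions)]
  return lambda x: base + learning_rate * sum(tree(x) for tree in trees)


# Made-up toy data: the label switches from 0 to 1 between x=3 and x=4.
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
model = fit_boosted_stumps(xs, ys)
print(model(2), model(5))
```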

TensorFlow 2: Using TensorFlow Decision Forests

In TensorFlow 2, tf.estimator.BoostedTreesEstimator is replaced by tfdf.keras.GradientBoostedTreesModel from the TensorFlow Decision Forests package.

TensorFlow Decision Forests provides various advantages over tf.estimator.BoostedTreesEstimator, notably in quality, speed, ease of use, and flexibility. To learn about TensorFlow Decision Forests, start with the beginner colab.

The following example shows how to train a Gradient Boosted Trees model using TensorFlow 2.

Install TensorFlow Decision Forests.

In [ ]:

!pip install tensorflow_decision_forests

Create a TensorFlow dataset. Note that Decision Forests natively support many types of features and do not require preprocessing.

In [ ]:

train_dataframe = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/train.csv')
eval_dataframe = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/eval.csv')

# Convert the Pandas Dataframes into TensorFlow datasets.
train_dataset = tfdf.keras.pd_dataframe_to_tf_dataset(train_dataframe, label="survived")
eval_dataset = tfdf.keras.pd_dataframe_to_tf_dataset(eval_dataframe, label="survived")

Train the model on the train_dataset dataset.

In [ ]:

# Use the default hyper-parameters of the model.
gbt_model = tfdf.keras.GradientBoostedTreesModel()
gbt_model.fit(train_dataset)

Evaluate the quality of the model on the eval_dataset dataset.

In [ ]:

gbt_model.compile(metrics=['accuracy'])
gbt_evaluation = gbt_model.evaluate(eval_dataset, return_dict=True)
print(gbt_evaluation)

Gradient Boosted Trees is just one of the many decision forest algorithms available in TensorFlow Decision Forests. For example, Random Forests (available as tfdf.keras.RandomForestModel) is very resistant to overfitting, while CART (available as tfdf.keras.CartModel) is well suited to model interpretation.

The next example trains and evaluates a Random Forest model.

In [ ]:

# Train a Random Forest model
rf_model = tfdf.keras.RandomForestModel()
rf_model.fit(train_dataset)

# Evaluate the Random Forest model
rf_model.compile(metrics=['accuracy'])
rf_evaluation = rf_model.evaluate(eval_dataset, return_dict=True)
print(rf_evaluation)

The final example trains and plots a CART model.

In [ ]:

# Train a CART model
cart_model = tfdf.keras.CartModel()
cart_model.fit(train_dataset)

# Plot the CART model
tfdf.model_plotter.plot_model_in_colab(cart_model, max_depth=2)