GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/zh-cn/tutorials/keras/regression.ipynb

Kernel: Python 3

Copyright 2018 The TensorFlow Authors.

In [ ]:

#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

In [ ]:

#@title MIT License
#
# Copyright (c) 2017 François Chollet
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.

Basic regression: Predict fuel efficiency

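Note: The TensorFlow community translated these documents. Because community translations are best-effort, there is no guarantee that they are accurate or that they reflect the latest official English documentation. If you have suggestions for improving this translation, please submit a pull request to the tensorflow/docs GitHub repository. To volunteer to write or review translations, join the [email protected] Google Group.

This tutorial uses the classic Auto MPG dataset and demonstrates how to build models that predict the fuel efficiency of late-1970s and early-1980s automobiles. To do this, you will provide the models with descriptions of many automobiles from that period. Each description includes attributes such as cylinders, displacement, horsepower, and weight.

This example uses the Keras API. (Visit the Keras tutorials and guides to learn more.)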

In [ ]:

# Use seaborn for pairplot.
!pip install -q seaborn

In [ ]:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Make NumPy printouts easier to read.
np.set_printoptions(precision=3, suppress=True)

In [ ]:

import tensorflow as tf

from tensorflow import keras
from tensorflow.keras import layers

print(tf.__version__)

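The Auto MPG dataset

The dataset is available from the UCI Machine Learning Repository.

Get the data

First, download the dataset and load it with pandas: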

In [ ]:

url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data'
column_names = ['MPG', 'Cylinders', 'Displacement', 'Horsepower', 'Weight',
                'Acceleration', 'Model Year', 'Origin']

raw_dataset = pd.read_csv(url, names=column_names,
                          na_values='?', comment='\t',
                          sep=' ', skipinitialspace=True)

In [ ]:

dataset = raw_dataset.copy()
dataset.tail()

Clean the data

The dataset contains a few unknown values:

In [ ]:

dataset.isna().sum()

To keep this initial tutorial simple, drop those rows:

In [ ]:

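dataset = dataset.dropna()

The "Origin" column is categorical, not numeric. So the next step is to one-hot encode the values in the column with pd.get_dummies.

Note: You can set up a tf.keras.Model to do this kind of transformation for you, but that is beyond the scope of this tutorial. For examples, see the Classify structured data using Keras preprocessing layers and Load CSV data tutorials.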

In [ ]:

dataset['Origin'] = dataset['Origin'].map({1: 'USA', 2: 'Europe', 3: 'Japan'})

In [ ]:

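# dtype=float keeps the one-hot columns numeric; recent pandas versions
# default to bool, which breaks the later np.array(train_features) call.
dataset = pd.get_dummies(dataset, columns=['Origin'], prefix='', prefix_sep='',
                         dtype=float)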
dataset.tail()
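
To make the encoding step concrete, here is a minimal sketch on a three-row toy frame. The toy values and the `dtype=float` argument are assumptions made for illustration; only the `map` and `pd.get_dummies` calls mirror the tutorial.

```python
import pandas as pd

# Toy stand-in for the real dataset (values are illustrative only).
toy = pd.DataFrame({'MPG': [18.0, 26.0, 31.0], 'Origin': [1, 2, 3]})

# Map the integer codes to readable names, as the tutorial does.
toy['Origin'] = toy['Origin'].map({1: 'USA', 2: 'Europe', 3: 'Japan'})

# One new column per category; each row has a single 1.0 among them.
encoded = pd.get_dummies(toy, columns=['Origin'], prefix='', prefix_sep='',
                         dtype=float)
print(encoded)
```

Each original "Origin" value becomes a row with exactly one non-zero indicator, so the model never treats the country codes as ordered numbers.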

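Split the data into training and test sets

Now, split the dataset into a training set and a test set. You will use the test set in the final evaluation of your models.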

In [ ]:

train_dataset = dataset.sample(frac=0.8, random_state=0)
test_dataset = dataset.drop(train_dataset.index)
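
The sample-then-drop pattern above yields two disjoint sets that together cover every row. A small sketch on a hypothetical ten-row frame (the frame and its column name are assumptions) shows this:

```python
import pandas as pd

# Hypothetical ten-row frame standing in for the real dataset.
toy = pd.DataFrame({'value': range(10)})

# Draw 80% of the rows at random (reproducibly), then keep the rest for testing.
train = toy.sample(frac=0.8, random_state=0)
test = toy.drop(train.index)

print(len(train), len(test))
print(sorted(train.index), sorted(test.index))
```

Because `drop` removes rows by index label, this relies on the index labels being unique, which holds for the default integer index used here.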

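Inspect the data

Review the joint distribution of a few pairs of columns from the training set.

The top row suggests that fuel efficiency (MPG) is a function of all the other parameters. The other rows indicate that those parameters are also functions of each other.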

In [ ]:

sns.pairplot(train_dataset[['MPG', 'Cylinders', 'Displacement', 'Weight']], diag_kind='kde')

Also check the overall statistics. Note how each feature covers a very different range:

In [ ]:

train_dataset.describe().transpose()

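Split features from labels

Separate the target value (the "label") from the features. This label is the value that you will train the model to predict.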

In [ ]:

train_features = train_dataset.copy()
test_features = test_dataset.copy()

train_labels = train_features.pop('MPG')
test_labels = test_features.pop('MPG')

Normalization

In the table of statistics it is easy to see how different the ranges of the features are:

In [ ]:

train_dataset.describe().transpose()[['mean', 'std']]

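It is good practice to normalize features that use different scales and ranges. Although a model might converge without feature normalization, skipping it makes training more difficult and makes the resulting model depend on the units chosen for the inputs.

One reason normalization matters is that the features are multiplied by the model weights. So the scale of the outputs and the scale of the gradients are affected by the scale of the inputs.

Normalization also makes training much more stable.

Note: There is no advantage to normalizing the one-hot features; it is done here for simplicity. For more details on how to use the preprocessing layers, refer to the Working with preprocessing layers guide and the Classify structured data using Keras preprocessing layers tutorial.

The Normalization layer

tf.keras.layers.Normalization is a clean and simple way to add feature normalization to your model.

The first step is to create the layer: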

In [ ]:

normalizer = tf.keras.layers.Normalization(axis=-1)

Then, fit the state of the preprocessing layer to the data by calling Normalization.adapt:

In [ ]:

normalizer.adapt(np.array(train_features))

This calculates the mean and variance of each feature and stores them in the layer:

In [ ]:

print(normalizer.mean.numpy())

When the layer is called, it returns the input data with each feature normalized independently:

In [ ]:

first = np.array(train_features[:1])

with np.printoptions(precision=2, suppress=True):
  print('First example:', first)
  print()
  print('Normalized:', normalizer(first).numpy())
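
As a rough cross-check of what the layer computes, the same per-feature scaling can be sketched in plain NumPy. This assumes the layer subtracts the stored mean and divides by the square root of the stored (population) variance; the three sample rows are made-up values, not rows from the dataset.

```python
import numpy as np

# Illustrative rows: cylinders, displacement, weight (made-up values).
features = np.array([[8.0, 307.0, 3504.0],
                     [4.0, 97.0, 2130.0],
                     [6.0, 232.0, 2634.0]])

# Statistics are computed per column, i.e. per feature (axis=0 here).
mean = features.mean(axis=0)
variance = features.var(axis=0)  # population variance (ddof=0)

# Each feature ends up with zero mean and unit variance.
normalized = (features - mean) / np.sqrt(variance)
print(normalized)
```

After this transformation, weight (in the thousands) and cylinders (single digits) live on the same scale, which is the point of the normalization step.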

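Linear regression

Before building a deep neural network model, start with linear regression using one variable and then several variables.

Linear regression with one variable

Begin with a single-variable linear regression to predict 'MPG' from 'Horsepower'.

Training a model with tf.keras typically starts by defining the model architecture. Use a tf.keras.Sequential model, which represents a sequence of steps.

There are two steps in the single-variable linear regression model:

Normalize the 'Horsepower' input feature using the tf.keras.layers.Normalization preprocessing layer.
Apply a linear transformation ( $y = mx+b$ ) to produce 1 output using a linear layer (tf.keras.layers.Dense).

The number of inputs can either be set by the input_shape argument, or automatically when the model is run for the first time.

First, create a NumPy array made of the 'Horsepower' feature. Then, instantiate tf.keras.layers.Normalization and fit its state to the horsepower data: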

In [ ]:

horsepower = np.array(train_features['Horsepower'])

horsepower_normalizer = layers.Normalization(input_shape=[1,], axis=None)
horsepower_normalizer.adapt(horsepower)

Build the Keras Sequential model:

In [ ]:

horsepower_model = tf.keras.Sequential([
    horsepower_normalizer,
    layers.Dense(units=1)
])

horsepower_model.summary()

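This model will predict 'MPG' from 'Horsepower'.

Run the untrained model on the first 10 'Horsepower' values. The output won't be good, but notice that it has the expected shape of (10, 1):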

In [ ]:

horsepower_model.predict(horsepower[:10])

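Once the model is built, configure the training procedure using the Keras Model.compile method. The most important arguments to compile are loss and optimizer, since these define what will be optimized (mean_absolute_error) and how (using tf.keras.optimizers.Adam).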

In [ ]:

horsepower_model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.1),
    loss='mean_absolute_error')

Use Keras Model.fit to run the training for 100 epochs:

In [ ]:

%%time
history = horsepower_model.fit(
    train_features['Horsepower'],
    train_labels,
    epochs=100,
    # Suppress logging.
    verbose=0,
    # Calculate validation results on 20% of the training data.
    validation_split = 0.2)
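
The Keras documentation states that `validation_split` takes the validation data from the last samples of the provided arrays, before any shuffling. The sketch below imitates that slicing in NumPy; `split_last_fraction` is a hypothetical helper written for this illustration, not a Keras function, and the exact rounding Keras applies is an assumption.

```python
import numpy as np

def split_last_fraction(x, y, validation_split=0.2):
    """Hold out the trailing fraction of the samples for validation."""
    split_at = int(len(x) * (1.0 - validation_split))
    return (x[:split_at], y[:split_at]), (x[split_at:], y[split_at:])

# Ten toy samples with a known ordering.
x = np.arange(10.0)
y = 2.0 * x

(train_x, train_y), (val_x, val_y) = split_last_fraction(x, y)
print(train_x, val_x)
```

Because the held-out samples come from the end of the arrays, data that is sorted in some meaningful way should be shuffled before it is passed to Model.fit. In this tutorial the training set was already drawn at random with `sample`.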

Visualize the model's training progress using the statistics stored in the history object:

In [ ]:

hist = pd.DataFrame(history.history)
hist['epoch'] = history.epoch
hist.tail()

In [ ]:

def plot_loss(history):
  plt.plot(history.history['loss'], label='loss')
  plt.plot(history.history['val_loss'], label='val_loss')
  plt.ylim([0, 10])
  plt.xlabel('Epoch')
  plt.ylabel('Error [MPG]')
  plt.legend()
  plt.grid(True)

In [ ]:

plot_loss(history)

Collect the results on the test set for later:

In [ ]:

test_results = {}

test_results['horsepower_model'] = horsepower_model.evaluate(
    test_features['Horsepower'],
    test_labels, verbose=0)

Since this is a single-variable regression, it is easy to view the model's predictions as a function of the input:

In [ ]:

x = tf.linspace(0.0, 250, 251)
y = horsepower_model.predict(x)

In [ ]:

def plot_horsepower(x, y):
  plt.scatter(train_features['Horsepower'], train_labels, label='Data')
  plt.plot(x, y, color='k', label='Predictions')
  plt.xlabel('Horsepower')
  plt.ylabel('MPG')
  plt.legend()

In [ ]:

plot_horsepower(x, y)

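Linear regression with multiple inputs

You can use an almost identical setup to make predictions based on multiple inputs. This model still computes the same $y = mx+b$ , except that $m$ is a matrix and $b$ is a vector.

Create a two-step Keras Sequential model again, with the first layer being the normalizer (tf.keras.layers.Normalization(axis=-1)) that you defined earlier and adapted to all of the training features: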

In [ ]:

linear_model = tf.keras.Sequential([
    normalizer,
    layers.Dense(units=1)
])

When you call Model.predict on a batch of inputs, it produces units=1 outputs for each example:

In [ ]:

linear_model.predict(train_features[:10])

When you call the model, its weight matrices are built. Check that the kernel weights (the $m$ in $y=mx+b$ ) have a shape of (9, 1):

In [ ]:

linear_model.layers[1].kernel
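
The (9, 1) kernel shape follows directly from the matrix form of $y = mx+b$ . Here is a NumPy sketch of the Dense layer's arithmetic, using random stand-in values for the inputs and weights (these are not the trained parameters):

```python
import numpy as np

rng = np.random.default_rng(0)

# 10 examples with 9 features each, standing in for normalized inputs.
batch = rng.normal(size=(10, 9))

# A Dense layer with units=1 holds a (9, 1) kernel (m) and a length-1 bias (b).
kernel = rng.normal(size=(9, 1))
bias = np.zeros(1)

# One output per example: (10, 9) @ (9, 1) + (1,) -> (10, 1).
predictions = batch @ kernel + bias
print(predictions.shape)
```

Each prediction is a weighted sum of the nine features plus the bias, which is all that a multiple-input linear regression computes.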

Configure the model with Keras Model.compile and train it with Model.fit for 100 epochs:

In [ ]:

linear_model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.1),
    loss='mean_absolute_error')

In [ ]:

%%time
history = linear_model.fit(
    train_features,
    train_labels,
    epochs=100,
    # Suppress logging.
    verbose=0,
    # Calculate validation results on 20% of the training data.
    validation_split = 0.2)

Using all the inputs in this regression model achieves a much lower training and validation error than the horsepower_model, which had one input:

In [ ]:

plot_loss(history)

Collect the results on the test set for later:

In [ ]:

test_results['linear_model'] = linear_model.evaluate(
    test_features, test_labels, verbose=0)

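Regression with a deep neural network (DNN)

In the previous section, you implemented two linear models, one for a single input and one for multiple inputs.

Here, you will implement single-input and multiple-input DNN models.

The code is basically the same, except that the model is expanded to include some "hidden" non-linear layers. The name "hidden" here just means not directly connected to the inputs or outputs.

These models contain a few more layers than the linear model:

The normalization layer, as before (horsepower_normalizer for the single-input model and normalizer for the multiple-input model).
Two hidden, non-linear Dense layers with the ReLU (relu) activation function.
A linear Dense single-output layer.

Both models will use the same training procedure, so the compile method is included in the build_and_compile_model function below.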

In [ ]:

def build_and_compile_model(norm):
  model = keras.Sequential([
      norm,
      layers.Dense(64, activation='relu'),
      layers.Dense(64, activation='relu'),
      layers.Dense(1)
  ])

  model.compile(loss='mean_absolute_error',
                optimizer=tf.keras.optimizers.Adam(0.001))
  return model
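
To show what the stacked layers compute, here is a NumPy sketch of a forward pass through the same 64-64-1 architecture. The weights are random placeholders, the input is assumed to be already normalized, and `relu` and `dense` are helper names introduced only for this illustration.

```python
import numpy as np

def relu(z):
    """ReLU non-linearity: negative values are clipped to zero."""
    return np.maximum(z, 0.0)

def dense(x, kernel, bias, activation=None):
    """Affine transform followed by an optional activation."""
    out = x @ kernel + bias
    return activation(out) if activation is not None else out

rng = np.random.default_rng(0)
n_features = 9
sizes = [n_features, 64, 64, 1]

# One (kernel, bias) pair per Dense layer, with random placeholder weights.
params = [(rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

x = rng.normal(size=(5, n_features))         # 5 normalized examples
h1 = dense(x, *params[0], activation=relu)   # first hidden layer
h2 = dense(h1, *params[1], activation=relu)  # second hidden layer
y = dense(h2, *params[2])                    # linear output layer
print(y.shape)
```

Without the ReLU calls, the three affine transforms would collapse into a single linear map, so the hidden layers are what let the model fit curves rather than straight lines.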

Regression using a DNN and a single input

Create a DNN model with only 'Horsepower' as input and horsepower_normalizer (defined earlier) as the normalization layer:

In [ ]:

dnn_horsepower_model = build_and_compile_model(horsepower_normalizer)

This model has quite a few more trainable parameters than the linear models:

In [ ]:

dnn_horsepower_model.summary()
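
The difference in trainable parameters can be worked out by hand: a Dense layer with n inputs and u units has n * u kernel weights plus u biases. The sketch below applies that formula to the Dense layers only; it leaves out the statistics stored in the normalization layer, which are assumed to be non-trainable. The helper names are hypothetical.

```python
def dense_params(n_inputs, units):
    """Kernel weights (n_inputs * units) plus one bias per unit."""
    return n_inputs * units + units

def count_trainable(n_inputs, layer_units):
    """Sum Dense parameters for a stack of layers fed by n_inputs features."""
    total = 0
    for units in layer_units:
        total += dense_params(n_inputs, units)
        n_inputs = units  # this layer's outputs feed the next layer
    return total

linear_single = count_trainable(1, [1])        # 1*1 + 1 = 2
dnn_single = count_trainable(1, [64, 64, 1])   # 128 + 4160 + 65 = 4353
dnn_multi = count_trainable(9, [64, 64, 1])    # 640 + 4160 + 65 = 4865
print(linear_single, dnn_single, dnn_multi)
```

Most of the parameters sit in the 64-by-64 connection between the two hidden layers, which is why the single-input and multiple-input DNNs end up with similar sizes.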

Train the model with Keras Model.fit:

In [ ]:

%%time
history = dnn_horsepower_model.fit(
    train_features['Horsepower'],
    train_labels,
    validation_split=0.2,
    verbose=0, epochs=100)

This model does slightly better than the linear single-input horsepower_model:

In [ ]:

plot_loss(history)

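If you plot the predictions as a function of 'Horsepower', you should notice how this model takes advantage of the nonlinearity provided by the hidden layers: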

In [ ]:

x = tf.linspace(0.0, 250, 251)
y = dnn_horsepower_model.predict(x)

In [ ]:

plot_horsepower(x, y)

Collect the results on the test set for later:

In [ ]:

test_results['dnn_horsepower_model'] = dnn_horsepower_model.evaluate(
    test_features['Horsepower'], test_labels,
    verbose=0)

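Regression using a DNN and multiple inputs

Repeat the previous process using all the inputs. The model's performance improves slightly on the validation dataset.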

In [ ]:

dnn_model = build_and_compile_model(normalizer)
dnn_model.summary()

In [ ]:

%%time
history = dnn_model.fit(
    train_features,
    train_labels,
    validation_split=0.2,
    verbose=0, epochs=100)

In [ ]:

plot_loss(history)

Collect the results on the test set:

In [ ]:

test_results['dnn_model'] = dnn_model.evaluate(test_features, test_labels, verbose=0)

Performance

Since all the models have been trained, you can review their test set performance:

In [ ]:

pd.DataFrame(test_results, index=['Mean absolute error [MPG]']).T

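These results match the validation error observed during training.

Make predictions

You can now make predictions with the dnn_model on the test set using Keras Model.predict and review the loss: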

In [ ]:

test_predictions = dnn_model.predict(test_features).flatten()

a = plt.axes(aspect='equal')
plt.scatter(test_labels, test_predictions)
plt.xlabel('True Values [MPG]')
plt.ylabel('Predictions [MPG]')
lims = [0, 50]
plt.xlim(lims)
plt.ylim(lims)
_ = plt.plot(lims, lims)

It appears that the model predicts reasonably well.

Now, check the error distribution:

In [ ]:

error = test_predictions - test_labels
plt.hist(error, bins=25)
plt.xlabel('Prediction Error [MPG]')
_ = plt.ylabel('Count')

If you are happy with the model, save it for later use with Model.save:

In [ ]:

dnn_model.save('dnn_model.keras')

If you reload the model, it gives identical output:

In [ ]:

reloaded = tf.keras.models.load_model('dnn_model.keras')

test_results['reloaded'] = reloaded.evaluate(
    test_features, test_labels, verbose=0)

In [ ]:

pd.DataFrame(test_results, index=['Mean absolute error [MPG]']).T

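Conclusion

This notebook introduced a few techniques for handling a regression problem.

Mean squared error (MSE) (tf.keras.losses.MeanSquaredError) and mean absolute error (MAE) (tf.keras.losses.MeanAbsoluteError) are common loss functions used for regression problems. MAE is less sensitive to outliers. Different loss functions are used for classification problems.
Similarly, evaluation metrics used for regression differ from those used for classification. A common regression metric is mean absolute error (MAE).
When numeric input data features have values with different ranges, each feature should be scaled independently to the same range.
Overfitting is a common problem for DNN models, though it was not a problem in this tutorial. Visit the Overfit and underfit tutorial for more help with this.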
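
To illustrate the point about outlier sensitivity, the sketch below compares MAE and MSE on a small hand-made example. The `mae` and `mse` helpers are plain NumPy stand-ins for the Keras losses, and the numbers are invented for the illustration.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return np.mean(np.abs(y_true - y_pred))

def mse(y_true, y_pred):
    """Mean squared error."""
    return np.mean((y_true - y_pred) ** 2)

y_true = np.array([20.0, 25.0, 30.0, 35.0])

# Every prediction is off by 1 MPG.
y_even = np.array([21.0, 26.0, 31.0, 36.0])

# Same as above, except the last prediction is off by 10 MPG.
y_outlier = np.array([21.0, 26.0, 31.0, 45.0])

print(mae(y_true, y_even), mse(y_true, y_even))        # 1.0 1.0
print(mae(y_true, y_outlier), mse(y_true, y_outlier))  # 3.25 25.75
```

A single large error raises MAE from 1.0 to 3.25 but raises MSE from 1.0 to 25.75, because squaring amplifies large residuals.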
