GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/zh-cn/tfx/tutorials/serving/rest_simple.ipynb

Kernel: Python 3

Copyright 2020 The TensorFlow Authors.

In [ ]:

#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

Train and serve a TensorFlow model with TensorFlow Serving

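Warning: This notebook is designed to be run in Google Colab only. It installs packages on the system and requires root access. If you want to run it in a local Jupyter notebook, please proceed with caution.

Note: You can run this example right now in a Jupyter-style notebook, with no setup required. Just click "Run in Google Colab".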

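This guide trains a neural network model to classify images of clothing, such as sneakers and shirts, saves the trained model, and then serves it with TensorFlow Serving. The focus is on TensorFlow Serving rather than on modeling and training in TensorFlow. For a complete example that focuses on modeling and training, see the Basic Classification example.

This guide uses tf.keras, a high-level API for building and training models in TensorFlow.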

In [ ]:

import sys

# Confirm that we're using Python 3
assert sys.version_info.major == 3, 'Oops, not running Python 3. Use Runtime > Change runtime type'

In [ ]:

# TensorFlow and tf.keras
print("Installing dependencies for Colab environment")
!pip install -Uq grpcio==1.26.0

import tensorflow as tf
from tensorflow import keras

# Helper libraries
import numpy as np
import matplotlib.pyplot as plt
import os
import subprocess

print('TensorFlow version: {}'.format(tf.__version__))

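Create your model

Import the Fashion MNIST dataset

This guide uses the Fashion MNIST dataset, which contains 70,000 grayscale images in 10 categories. The images show individual articles of clothing at low resolution (28 by 28 pixels), as seen here:

Figure 1. Fashion-MNIST samples (by Zalando, MIT License).

Fashion MNIST is intended as a drop-in replacement for the classic MNIST dataset, which is often used as the "Hello, World" of machine learning programs for computer vision. You can access Fashion MNIST directly from TensorFlow: just import and load the data.

Note: Although these are really images, they are loaded as NumPy arrays and not as binary image objects.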

In [ ]:

fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

# scale the values to 0.0 to 1.0
train_images = train_images / 255.0
test_images = test_images / 255.0

# reshape for feeding into the model
train_images = train_images.reshape(train_images.shape[0], 28, 28, 1)
test_images = test_images.reshape(test_images.shape[0], 28, 28, 1)

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

print('\ntrain_images.shape: {}, of {}'.format(train_images.shape, train_images.dtype))
print('test_images.shape: {}, of {}'.format(test_images.shape, test_images.dtype))

Train and evaluate your model

We use the simplest possible CNN, since modeling is not the focus of this guide.

In [ ]:

model = keras.Sequential([
  keras.layers.Conv2D(input_shape=(28,28,1), filters=8, kernel_size=3, 
                      strides=2, activation='relu', name='Conv1'),
  keras.layers.Flatten(),
  keras.layers.Dense(10, name='Dense')
])
model.summary()

testing = False
epochs = 5

model.compile(optimizer='adam', 
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=[keras.metrics.SparseCategoricalAccuracy()])
model.fit(train_images, train_labels, epochs=epochs)

test_loss, test_acc = model.evaluate(test_images, test_labels)
print('\nTest accuracy: {}'.format(test_acc))
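The layer shapes and parameter counts that model.summary() reports for this small network can be checked by hand. The sketch below does the arithmetic in plain Python, assuming the Conv2D layer uses its default 'valid' (unpadded) padding, which is what the cell above relies on by not setting a padding argument. The helper name conv2d_output_size is hypothetical and not part of Keras.

```python
def conv2d_output_size(size, kernel, stride):
    """Output length along one axis for a 'valid' (unpadded) convolution."""
    return (size - kernel) // stride + 1

# Conv1: 28x28x1 input, 8 filters, 3x3 kernel, stride 2.
side = conv2d_output_size(28, kernel=3, stride=2)
conv_params = 3 * 3 * 1 * 8 + 8        # kernel weights plus one bias per filter

# Flatten: every activation of Conv1 becomes one input to the Dense layer.
flat_units = side * side * 8

# Dense: one weight per (input, class) pair plus one bias per class.
dense_params = flat_units * 10 + 10

total_params = conv_params + dense_params
print('Conv1 output: {0}x{0}x8, flattened: {1}'.format(side, flat_units))
print('Parameters: conv={}, dense={}, total={}'.format(
    conv_params, dense_params, total_params))
```

Almost all of the parameters sit in the Dense layer, which is typical for a network with a single strided convolution followed directly by a classifier.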

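Save your model

To load the trained model into TensorFlow Serving, we first need to save it in SavedModel format. This creates a protobuf file in a well-defined directory hierarchy that includes a version number. TensorFlow Serving lets us select which version of a model, or "servable", to use when we make inference requests. Each version is exported to a different subdirectory under the given path.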

In [ ]:

# Save the model in SavedModel format.
# The signature definition is derived from the model's input and output tensors
# and stored under the default serving key.
import tempfile

MODEL_DIR = tempfile.gettempdir()
version = 1
export_path = os.path.join(MODEL_DIR, str(version))
print('export_path = {}\n'.format(export_path))

tf.keras.models.save_model(
    model,
    export_path,
    overwrite=True,
    include_optimizer=True,
    save_format=None,
    signatures=None,
    options=None
)

print('\nSaved model:')
!ls -l {export_path}
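Because each version lives in its own numbered subdirectory, the set of available versions can be read straight off the base path. The following is a minimal sketch of that idea, not TensorFlow Serving's own logic: the helper list_versions is hypothetical, the directory names are made up for illustration, and skipping non-numeric entries is a choice made for this sketch. By default TensorFlow Serving serves the highest-numbered version it finds.

```python
import os
import tempfile

def list_versions(base_path):
    """Return the numeric version subdirectories under base_path, sorted ascending."""
    versions = []
    for entry in os.listdir(base_path):
        full_path = os.path.join(base_path, entry)
        if entry.isdigit() and os.path.isdir(full_path):
            versions.append(int(entry))
    return sorted(versions)

# Build a throwaway base path with three versions and one unrelated directory.
with tempfile.TemporaryDirectory() as base:
    for name in ("1", "2", "10", "notes"):
        os.mkdir(os.path.join(base, name))
    found_versions = list_versions(base)
    latest_version = found_versions[-1]

print('versions: {}, latest: {}'.format(found_versions, latest_version))
```

Versions are compared as integers, so 10 sorts after 2. A plain string sort would get that wrong.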

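Examine your saved model

We'll use the command-line utility saved_model_cli to look at the MetaGraphDefs (the models) and SignatureDefs (the methods you can call) in our SavedModel. See the discussion of the SavedModel CLI in the TensorFlow Guide.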

In [ ]:

!saved_model_cli show --dir {export_path} --all

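That tells us a lot about our model! In this case we just trained the model, so we already know its inputs and outputs, but if we hadn't, this would be important information. It doesn't tell us everything (for example, that the inputs are grayscale image data), but it's a great start.

Serve your model with TensorFlow Serving

Warning: If you are not running this in Google Colab, the following cells will install packages on the system with root access. If you want to run it in a local Jupyter notebook, please proceed with caution.

Add the TensorFlow Serving distribution URI as a package source: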

We're preparing to install TensorFlow Serving with apt, since this Colab runs in a Debian-based environment. We'll add the tensorflow-model-server package source to the list of sources that apt knows about. Note that we're running as root.

Note: This example runs TensorFlow Serving natively, but you can also run it in a Docker container, which is one of the easiest ways to get started with TensorFlow Serving.

In [ ]:

import sys
# We need sudo prefix if not on a Google Colab.
if 'google.colab' not in sys.modules:
  SUDO_IF_NEEDED = 'sudo'
else:
  SUDO_IF_NEEDED = ''

In [ ]:

# This is the same as you would do from your command line, but without the [arch=amd64], and no sudo
# You would instead do:
# echo "deb [arch=amd64] http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" | sudo tee /etc/apt/sources.list.d/tensorflow-serving.list && \
# curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | sudo apt-key add -

!echo "deb http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" | {SUDO_IF_NEEDED} tee /etc/apt/sources.list.d/tensorflow-serving.list && \
curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | {SUDO_IF_NEEDED} apt-key add -
!{SUDO_IF_NEEDED} apt update

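Install TensorFlow Serving

With the package source in place, installation normally takes a single command line. The cell below instead downloads and installs a pinned release (2.8.0) as a workaround for the Colab environment.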

In [ ]:

# TODO: Use the latest model server version when colab supports it.
#!{SUDO_IF_NEEDED} apt-get install tensorflow-model-server
# We install TensorFlow Model Server 2.8.0 instead of the latest version.
# TensorFlow Serving >2.9.0 requires `GLIBC_2.29` and `GLIBCXX_3.4.26`, which
# the Colab environment does not currently provide, so as a workaround we pin
# TensorFlow Serving to version `2.8.0`.
!wget 'http://storage.googleapis.com/tensorflow-serving-apt/pool/tensorflow-model-server-2.8.0/t/tensorflow-model-server/tensorflow-model-server_2.8.0_all.deb'
!dpkg -i tensorflow-model-server_2.8.0_all.deb
!pip3 install tensorflow-serving-api==2.8.0

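Start running TensorFlow Serving

This is where we start running TensorFlow Serving and load our model. Once the model has loaded, we can start making inference requests using REST. There are some important parameters:

rest_api_port: The port used for REST requests.
model_name: The name you'll use in the URL of REST requests. It can be anything.
model_base_path: The path to the directory where you've saved your model.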
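The first two parameters determine the URL that clients call. As a minimal sketch, the hypothetical helper below (build_predict_url is not part of TensorFlow Serving or of this notebook) assembles the predict endpoint in the form used by the requests later in this guide. The host defaults to localhost on the assumption that the server runs on the same machine as the notebook.

```python
def build_predict_url(model_name, rest_api_port, version=None, host="localhost"):
    """Assemble the REST predict URL for a model served by TensorFlow Serving."""
    url = "http://{}:{}/v1/models/{}".format(host, rest_api_port, model_name)
    if version is not None:
        # Pin a specific version; without this segment the server picks the latest.
        url += "/versions/{}".format(version)
    return url + ":predict"

latest_url = build_predict_url("fashion_model", 8501)
pinned_url = build_predict_url("fashion_model", 8501, version=1)
print(latest_url)
print(pinned_url)
```

Note that model_base_path does not appear in the URL at all: clients address the model only by name and, optionally, by version.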

In [ ]:

os.environ["MODEL_DIR"] = MODEL_DIR

In [ ]:

%%bash --bg 
nohup tensorflow_model_server \
  --rest_api_port=8501 \
  --model_name=fashion_model \
  --model_base_path="${MODEL_DIR}" >server.log 2>&1

In [ ]:

!tail server.log
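Besides reading the log, readiness can be checked through the model status endpoint of the REST API (a GET request to /v1/models/fashion_model on the REST port), which reports a state for each loaded version. The sketch below only parses such a response. The helper is_model_available is hypothetical, and sample_status is an illustrative payload written for this example, not output captured from a running server.

```python
import json

def is_model_available(status_text):
    """Return True if any model version in a status response is in state AVAILABLE."""
    status = json.loads(status_text)
    versions = status.get("model_version_status", [])
    return any(v.get("state") == "AVAILABLE" for v in versions)

# Illustrative payload in the shape of a model status response (assumed, not captured).
sample_status = json.dumps({
    "model_version_status": [
        {"version": "1",
         "state": "AVAILABLE",
         "status": {"error_code": "OK", "error_message": ""}}
    ]
})

print(is_model_available(sample_status))
print(is_model_available(json.dumps({"model_version_status": []})))
```

In a live setup the text passed to the helper would come from the body of the GET response, and the check could be repeated in a short polling loop until it returns True.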

Make a request to your model in TensorFlow Serving

First, let's take a look at a random example from our test data.

In [ ]:

def show(idx, title):
  plt.figure()
  plt.imshow(test_images[idx].reshape(28,28))
  plt.axis('off')
  plt.title('\n\n{}'.format(title), fontdict={'size': 16})

import random
rando = random.randint(0,len(test_images)-1)
show(rando, 'An Example Image: {}'.format(class_names[test_labels[rando]]))

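OK, that looks interesting. How hard is that for you to recognize? Now let's create the JSON object for a batch of three inference requests, and see how well our model recognizes things: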

In [ ]:

import json
data = json.dumps({"signature_name": "serving_default", "instances": test_images[0:3].tolist()})
print('Data: {} ... {}'.format(data[:50], data[len(data)-52:]))
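The request body uses the "instances" (row) format: a list with one entry per example, where each entry has the shape the model expects for a single input. The real payload is large, so the sketch below builds the same structure for three made-up 2x2x1 "images" to make the nesting visible. The pixel values are invented for illustration.

```python
import json

# Three tiny fake images, each 2 rows x 2 columns x 1 channel (illustrative values).
tiny_image = [[[0.0], [0.5]],
              [[1.0], [0.25]]]
tiny_batch = [tiny_image for _ in range(3)]

payload = json.dumps({"signature_name": "serving_default",
                      "instances": tiny_batch})
decoded = json.loads(payload)

print(payload)
print('examples in batch: {}'.format(len(decoded["instances"])))
```

With the real data, each instance is a 28x28x1 nested list, which is exactly what test_images[0:3].tolist() produces.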

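Make REST requests

Newest version of the servable

We'll send a predict request as a POST to our server's REST endpoint and pass it three examples. By not specifying a particular version, we ask the server to use the latest version of our servable.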

In [ ]:

# docs_infra: no_execute
!pip install -q requests

import requests
headers = {"content-type": "application/json"}
json_response = requests.post('http://localhost:8501/v1/models/fashion_model:predict', data=data, headers=headers)
predictions = json.loads(json_response.text)['predictions']

show(0, 'The model thought this was a {} (class {}), and it was actually a {} (class {})'.format(
  class_names[np.argmax(predictions[0])], np.argmax(predictions[0]), class_names[test_labels[0]], test_labels[0]))

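A particular version of the servable

Now let's specify a particular version of our servable. Since we only have one, we select version 1. This time we look at all three results.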

In [ ]:

# docs_infra: no_execute
headers = {"content-type": "application/json"}
json_response = requests.post('http://localhost:8501/v1/models/fashion_model/versions/1:predict', data=data, headers=headers)
predictions = json.loads(json_response.text)['predictions']

for i in range(0,3):
  show(i, 'The model thought this was a {} (class {}), and it was actually a {} (class {})'.format(
    class_names[np.argmax(predictions[i])], np.argmax(predictions[i]), class_names[test_labels[i]], test_labels[i]))
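Because the final Dense layer has no activation and the loss was configured with from_logits=True, each entry in predictions is a list of 10 raw logits rather than probabilities. Taking the argmax, as the cells above do, is enough to pick a class. A softmax is only needed if a confidence value is wanted. The following is a minimal sketch using only the standard library. The decode helper is hypothetical and sample_logits is a made-up vector, not a real server response.

```python
import math

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

def softmax(logits):
    """Convert raw logits to probabilities, subtracting the max for numerical stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode(logits):
    """Return (class name, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]

# Made-up logits for one example (illustrative, not server output).
sample_logits = [-2.0, -1.0, 0.5, -3.0, 0.0, -4.0, 1.0, -5.0, -1.5, 6.0]
label, confidence = decode(sample_logits)
print('{} ({:.1%})'.format(label, confidence))
```

Softmax preserves the ordering of the logits, so the predicted class is the same with or without it.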