GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/zh-cn/model_optimization/guide/combine/cqat_example.ipynb
Kernel: Python 3

Copyright 2021 The TensorFlow Authors.

#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

Cluster preserving quantization aware training (CQAT) Keras example

Overview

This is an end-to-end example showing the usage of the cluster preserving quantization aware training (CQAT) API, part of the TensorFlow Model Optimization Toolkit's collaborative optimization pipeline.

Other pages

For an introduction to the pipeline and other available techniques, see the collaborative optimization overview page.

Contents

In this tutorial, you will:

  1. Train a tf.keras model for the MNIST dataset from scratch.

  2. Fine-tune the model with clustering and see the accuracy.

  3. Apply QAT and observe the loss of clusters.

  4. Apply CQAT and observe that the clustering applied earlier has been preserved.

  5. Generate a TFLite model and observe the effects of applying CQAT on it.

  6. Compare the achieved CQAT model accuracy with a model quantized using post-training quantization.

Setup

You can run this Jupyter Notebook in a local virtualenv or in Colab. For details on setting up dependencies, please refer to the installation guide.

! pip install -q tensorflow-model-optimization
import tensorflow as tf
import numpy as np
import tempfile
import zipfile
import os
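Optionally, you can print the TensorFlow version in use. This line is not part of the original notebook, but it can help when reproducing the results in a local virtualenv or on Colab.

# Optional: confirm the TensorFlow version in use (not part of the original notebook).
print("TensorFlow version:", tf.__version__)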

Train a tf.keras model for MNIST without clustering

# Load MNIST dataset
mnist = tf.keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Normalize the input image so that each pixel value is between 0 to 1.
train_images = train_images / 255.0
test_images = test_images / 255.0

model = tf.keras.Sequential([
  tf.keras.layers.InputLayer(input_shape=(28, 28)),
  tf.keras.layers.Reshape(target_shape=(28, 28, 1)),
  tf.keras.layers.Conv2D(filters=12, kernel_size=(3, 3), activation=tf.nn.relu),
  tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(10)
])

# Train the digit classification model
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(
    train_images,
    train_labels,
    validation_split=0.1,
    epochs=10
)

Evaluate the baseline model and save it for later use

_, baseline_model_accuracy = model.evaluate(
    test_images, test_labels, verbose=0)

print('Baseline test accuracy:', baseline_model_accuracy)

_, keras_file = tempfile.mkstemp('.h5')
print('Saving model to: ', keras_file)
tf.keras.models.save_model(model, keras_file, include_optimizer=False)

Cluster and fine-tune the model with 8 clusters

Apply the cluster_weights() API to cluster the whole pre-trained model to demonstrate and observe its effectiveness in reducing the model size when applying zip compression, while maintaining accuracy. For how best to use the API to achieve the best compression rate while maintaining your target accuracy, refer to the clustering comprehensive guide.

Define the model and apply the clustering API

The model needs to be pre-trained before using the clustering API.

import tensorflow_model_optimization as tfmot

cluster_weights = tfmot.clustering.keras.cluster_weights
CentroidInitialization = tfmot.clustering.keras.CentroidInitialization

clustering_params = {
  'number_of_clusters': 8,
  'cluster_centroids_init': CentroidInitialization.KMEANS_PLUS_PLUS,
  'cluster_per_channel': True,
}

clustered_model = cluster_weights(model, **clustering_params)

# Use smaller learning rate for fine-tuning
opt = tf.keras.optimizers.Adam(learning_rate=1e-5)

clustered_model.compile(
  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
  optimizer=opt,
  metrics=['accuracy'])

clustered_model.summary()

Fine-tune the model and evaluate the accuracy against the baseline

Fine-tune the model with clustering for 3 epochs.

# Fine-tune model
clustered_model.fit(
  train_images,
  train_labels,
  epochs=3,
  validation_split=0.1)

Define a helper function to calculate and print the number of clusters in each kernel of the model.

def print_model_weight_clusters(model):
    for layer in model.layers:
        if isinstance(layer, tf.keras.layers.Wrapper):
            weights = layer.trainable_weights
        else:
            weights = layer.weights
        for weight in weights:
            # ignore auxiliary quantization weights
            if "quantize_layer" in weight.name:
                continue
            if "kernel" in weight.name:
                unique_count = len(np.unique(weight))
                print(
                    f"{layer.name}/{weight.name}: {unique_count} clusters "
                )

Check that the model kernels were correctly clustered. We need to strip the clustering wrapper first.

stripped_clustered_model = tfmot.clustering.keras.strip_clustering(clustered_model)

print_model_weight_clusters(stripped_clustered_model)

For this example, there is minimal loss in test accuracy after clustering, compared to the baseline.

_, clustered_model_accuracy = clustered_model.evaluate(
  test_images, test_labels, verbose=0)

print('Baseline test accuracy:', baseline_model_accuracy)
print('Clustered test accuracy:', clustered_model_accuracy)

Apply QAT and CQAT and check the effect on model clusters in both cases

Next, we apply both QAT and cluster preserving QAT (CQAT) to the clustered model and observe that CQAT preserves the weight clusters in the clustered model. Note that we stripped the clustering wrappers from the model with tfmot.clustering.keras.strip_clustering before applying the CQAT API.

# QAT
qat_model = tfmot.quantization.keras.quantize_model(stripped_clustered_model)

qat_model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
print('Train qat model:')
qat_model.fit(train_images, train_labels, batch_size=128, epochs=1, validation_split=0.1)

# CQAT
quant_aware_annotate_model = tfmot.quantization.keras.quantize_annotate_model(
              stripped_clustered_model)
cqat_model = tfmot.quantization.keras.quantize_apply(
              quant_aware_annotate_model,
              tfmot.experimental.combine.Default8BitClusterPreserveQuantizeScheme())

cqat_model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
print('Train cqat model:')
cqat_model.fit(train_images, train_labels, batch_size=128, epochs=1, validation_split=0.1)
print("QAT Model clusters:") print_model_weight_clusters(qat_model) print("CQAT Model clusters:") print_model_weight_clusters(cqat_model)

See the compression benefits of the CQAT model

Define a helper function to get the zipped model file.

def get_gzipped_model_size(file):
  # It returns the size of the gzipped model in kilobytes.
  _, zipped_file = tempfile.mkstemp('.zip')
  with zipfile.ZipFile(zipped_file, 'w', compression=zipfile.ZIP_DEFLATED) as f:
    f.write(file)
  return os.path.getsize(zipped_file)/1000

Note that this is a small model. Applying clustering and CQAT to a larger production model would yield a more significant compression.

# QAT model
converter = tf.lite.TFLiteConverter.from_keras_model(qat_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
qat_tflite_model = converter.convert()
qat_model_file = 'qat_model.tflite'
# Save the model.
with open(qat_model_file, 'wb') as f:
    f.write(qat_tflite_model)

# CQAT model
converter = tf.lite.TFLiteConverter.from_keras_model(cqat_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
cqat_tflite_model = converter.convert()
cqat_model_file = 'cqat_model.tflite'
# Save the model.
with open(cqat_model_file, 'wb') as f:
    f.write(cqat_tflite_model)

print("QAT model size: ", get_gzipped_model_size(qat_model_file), ' KB')
print("CQAT model size: ", get_gzipped_model_size(cqat_model_file), ' KB')
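As an optional extra (not in the original notebook), you can also gzip the float baseline Keras model saved earlier in keras_file with the same helper, to put the QAT and CQAT TFLite sizes in context.

# Optional extra: gzipped size of the float baseline Keras model saved earlier,
# for context next to the QAT and CQAT TFLite sizes.
print("Baseline Keras model size: ", get_gzipped_model_size(keras_file), ' KB')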

See the persistence of accuracy from TF to TFLite

Define a helper function to evaluate the TFLite model on the test dataset.

def eval_model(interpreter):
  input_index = interpreter.get_input_details()[0]["index"]
  output_index = interpreter.get_output_details()[0]["index"]

  # Run predictions on every image in the "test" dataset.
  prediction_digits = []
  for i, test_image in enumerate(test_images):
    if i % 1000 == 0:
      print(f"Evaluated on {i} results so far.")
    # Pre-processing: add batch dimension and convert to float32 to match with
    # the model's input data format.
    test_image = np.expand_dims(test_image, axis=0).astype(np.float32)
    interpreter.set_tensor(input_index, test_image)

    # Run inference.
    interpreter.invoke()

    # Post-processing: remove batch dimension and find the digit with highest
    # probability.
    output = interpreter.tensor(output_index)
    digit = np.argmax(output()[0])
    prediction_digits.append(digit)

  print('\n')
  # Compare prediction results with ground truth labels to calculate accuracy.
  prediction_digits = np.array(prediction_digits)
  accuracy = (prediction_digits == test_labels).mean()
  return accuracy

After evaluating the model, which has been clustered and quantized, you will see that the accuracy from TensorFlow persists in the TFLite backend.

interpreter = tf.lite.Interpreter(cqat_model_file)
interpreter.allocate_tensors()

cqat_test_accuracy = eval_model(interpreter)

print('Clustered and quantized TFLite test_accuracy:', cqat_test_accuracy)
print('Clustered TF test accuracy:', clustered_model_accuracy)
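If you want to confirm that the CQAT TFLite model's weights were indeed quantized to 8 bits, the following optional check (not part of the original notebook) inspects a few tensor details reported by the interpreter.

# Optional check: print a few tensor names and dtypes from the CQAT TFLite model;
# quantized weight tensors should report an int8 dtype.
check_interpreter = tf.lite.Interpreter(cqat_model_file)
check_interpreter.allocate_tensors()
for detail in check_interpreter.get_tensor_details()[:6]:
  print(detail['name'], detail['dtype'])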

Apply post-training quantization and compare to the CQAT model

Next, we apply post-training quantization (no fine-tuning) to the clustered model and check its accuracy against the CQAT model. This demonstrates why you would need to use CQAT to improve the quantized model's accuracy. The difference may not be very noticeable, because the MNIST model is quite small and over-parameterized.

First, define a generator for the calibration dataset from the first 1000 training images.

def mnist_representative_data_gen():
  for image in train_images[:1000]:
    image = np.expand_dims(image, axis=0).astype(np.float32)
    yield [image]

Quantize the model and compare its accuracy to the previously obtained CQAT model. Note that the model quantized with fine-tuning achieves higher accuracy.

converter = tf.lite.TFLiteConverter.from_keras_model(stripped_clustered_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = mnist_representative_data_gen
post_training_tflite_model = converter.convert()
post_training_model_file = 'post_training_model.tflite'
# Save the model.
with open(post_training_model_file, 'wb') as f:
    f.write(post_training_tflite_model)

# Compare accuracy
interpreter = tf.lite.Interpreter(post_training_model_file)
interpreter.allocate_tensors()

post_training_test_accuracy = eval_model(interpreter)

print('CQAT TFLite test_accuracy:', cqat_test_accuracy)
print('Post-training (no fine-tuning) TF test accuracy:', post_training_test_accuracy)
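To make the comparison explicit, you can optionally print the accuracy gap between the two approaches (this line is not part of the original notebook).

# Optional: accuracy gap between CQAT and plain post-training quantization.
print('Accuracy delta (CQAT - post-training):',
      cqat_test_accuracy - post_training_test_accuracy)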

Conclusion

In this tutorial, you learned how to create a model, cluster it using the cluster_weights() API, and apply cluster preserving quantization aware training (CQAT) to preserve clusters while using QAT. The final CQAT model was compared to the QAT one to show that the clusters are preserved in the former and lost in the latter. Next, the models were converted to TFLite to show the compression benefits of chaining the clustering and CQAT model optimization techniques, and the TFLite model was evaluated to ensure that the accuracy persists in the TFLite backend. Finally, the CQAT model was compared to a quantized clustered model obtained with the post-training quantization API to demonstrate CQAT's advantage in recovering the accuracy loss of normal quantization.