GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/zh-cn/model_optimization/guide/combine/cqat_example.ipynb
Kernel: Python 3

Copyright 2021 The TensorFlow Authors.

#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

Cluster preserving quantization aware training (CQAT) Keras example

Overview

This is an end-to-end example showing the usage of the cluster preserving quantization aware training (CQAT) API, part of the TensorFlow Model Optimization Toolkit's collaborative optimization pipeline.

Other pages

For an introduction to the pipeline and other available techniques, see the collaborative optimization overview page.

Contents

In this tutorial, you will:

  1. Train a tf.keras model for the MNIST dataset from scratch.

  2. Fine-tune the model with clustering and see the accuracy.

  3. Apply QAT and observe the loss of clusters.

  4. Apply CQAT and observe that the clustering applied earlier has been preserved.

  5. Generate a TFLite model and observe the effects of applying CQAT on it.

  6. Compare the achieved CQAT model accuracy with a model quantized using post-training quantization.

Setup

You can run this Jupyter Notebook in a local virtualenv or in Colab. For details on setting up dependencies, please refer to the installation guide.

! pip install -q tensorflow-model-optimization
import tensorflow as tf
import numpy as np
import tempfile
import zipfile
import os
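Optionally, you can print the TensorFlow version in use. This line is not part of the original notebook, but it can help when reproducing the results in a local virtualenv or on Colab.

# Optional: confirm the TensorFlow version in use (not part of the original notebook).
print("TensorFlow version:", tf.__version__)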

Train a tf.keras model for MNIST without clustering

# Load MNIST dataset
mnist = tf.keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Normalize the input image so that each pixel value is between 0 to 1.
train_images = train_images / 255.0
test_images = test_images / 255.0

model = tf.keras.Sequential([
  tf.keras.layers.InputLayer(input_shape=(28, 28)),
  tf.keras.layers.Reshape(target_shape=(28, 28, 1)),
  tf.keras.layers.Conv2D(filters=12, kernel_size=(3, 3), activation=tf.nn.relu),
  tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(10)
])

# Train the digit classification model
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(
    train_images,
    train_labels,
    validation_split=0.1,
    epochs=10
)

Evaluate the baseline model and save it for later use

_, baseline_model_accuracy = model.evaluate(
    test_images, test_labels, verbose=0)

print('Baseline test accuracy:', baseline_model_accuracy)

_, keras_file = tempfile.mkstemp('.h5')
print('Saving model to: ', keras_file)
tf.keras.models.save_model(model, keras_file, include_optimizer=False)

Cluster and fine-tune the model with 8 clusters

Apply the cluster_weights() API to cluster the whole pre-trained model to demonstrate and observe its effectiveness in reducing the model size when applying zip compression, while maintaining accuracy. For how best to use the API to achieve the best compression rate while maintaining your target accuracy, refer to the clustering comprehensive guide.

Define the model and apply the clustering API

The model needs to be pre-trained before using the clustering API.

import tensorflow_model_optimization as tfmot

cluster_weights = tfmot.clustering.keras.cluster_weights
CentroidInitialization = tfmot.clustering.keras.CentroidInitialization

clustering_params = {
  'number_of_clusters': 8,
  'cluster_centroids_init': CentroidInitialization.KMEANS_PLUS_PLUS,
  'cluster_per_channel': True,
}

clustered_model = cluster_weights(model, **clustering_params)

# Use smaller learning rate for fine-tuning
opt = tf.keras.optimizers.Adam(learning_rate=1e-5)

clustered_model.compile(
  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
  optimizer=opt,
  metrics=['accuracy'])

clustered_model.summary()

Fine-tune the model and evaluate the accuracy against the baseline

Fine-tune the model with clustering for 3 epochs.

# Fine-tune model
clustered_model.fit(
  train_images,
  train_labels,
  epochs=3,
  validation_split=0.1)

Define a helper function to calculate and print the number of clusters in each kernel of the model.

def print_model_weight_clusters(model):
    for layer in model.layers:
        if isinstance(layer, tf.keras.layers.Wrapper):
            weights = layer.trainable_weights
        else:
            weights = layer.weights
        for weight in weights:
            # ignore auxiliary quantization weights
            if "quantize_layer" in weight.name:
                continue
            if "kernel" in weight.name:
                unique_count = len(np.unique(weight))
                print(
                    f"{layer.name}/{weight.name}: {unique_count} clusters "
                )

Check that the model kernels were correctly clustered. We need to strip the clustering wrapper first.

stripped_clustered_model = tfmot.clustering.keras.strip_clustering(clustered_model)

print_model_weight_clusters(stripped_clustered_model)

For this example, there is minimal loss in test accuracy after clustering, compared to the baseline.

_, clustered_model_accuracy = clustered_model.evaluate(
  test_images, test_labels, verbose=0)

print('Baseline test accuracy:', baseline_model_accuracy)
print('Clustered test accuracy:', clustered_model_accuracy)

Apply QAT and CQAT and check the effect on model clusters in both cases

Next, we apply both QAT and cluster preserving QAT (CQAT) to the clustered model and observe that CQAT preserves the weight clusters in the clustered model. Note that we stripped the clustering wrappers from the model with tfmot.clustering.keras.strip_clustering before applying the CQAT API.

# QAT
qat_model = tfmot.quantization.keras.quantize_model(stripped_clustered_model)

qat_model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
print('Train qat model:')
qat_model.fit(train_images, train_labels, batch_size=128, epochs=1, validation_split=0.1)

# CQAT
quant_aware_annotate_model = tfmot.quantization.keras.quantize_annotate_model(
              stripped_clustered_model)
cqat_model = tfmot.quantization.keras.quantize_apply(
              quant_aware_annotate_model,
              tfmot.experimental.combine.Default8BitClusterPreserveQuantizeScheme())

cqat_model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
print('Train cqat model:')
cqat_model.fit(train_images, train_labels, batch_size=128, epochs=1, validation_split=0.1)
print("QAT Model clusters:") print_model_weight_clusters(qat_model) print("CQAT Model clusters:") print_model_weight_clusters(cqat_model)

See the compression benefits of the CQAT model

Define a helper function to get the zipped model file.

def get_gzipped_model_size(file):
  # It returns the size of the gzipped model in kilobytes.
  _, zipped_file = tempfile.mkstemp('.zip')
  with zipfile.ZipFile(zipped_file, 'w', compression=zipfile.ZIP_DEFLATED) as f:
    f.write(file)
  return os.path.getsize(zipped_file)/1000

Note that this is a small model. Applying clustering and CQAT to a larger production model would yield a more significant compression.

# QAT model
converter = tf.lite.TFLiteConverter.from_keras_model(qat_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
qat_tflite_model = converter.convert()
qat_model_file = 'qat_model.tflite'
# Save the model.
with open(qat_model_file, 'wb') as f:
    f.write(qat_tflite_model)

# CQAT model
converter = tf.lite.TFLiteConverter.from_keras_model(cqat_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
cqat_tflite_model = converter.convert()
cqat_model_file = 'cqat_model.tflite'
# Save the model.
with open(cqat_model_file, 'wb') as f:
    f.write(cqat_tflite_model)

print("QAT model size: ", get_gzipped_model_size(qat_model_file), ' KB')
print("CQAT model size: ", get_gzipped_model_size(cqat_model_file), ' KB')
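As an optional extra (not in the original notebook), you can also gzip the float baseline Keras model saved earlier in keras_file with the same helper, to put the QAT and CQAT TFLite sizes in context.

# Optional extra: gzipped size of the float baseline Keras model saved earlier,
# for context next to the QAT and CQAT TFLite sizes.
print("Baseline Keras model size: ", get_gzipped_model_size(keras_file), ' KB')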

See the persistence of accuracy from TF to TFLite

Define a helper function to evaluate the TFLite model on the test dataset.

def eval_model(interpreter):
  input_index = interpreter.get_input_details()[0]["index"]
  output_index = interpreter.get_output_details()[0]["index"]

  # Run predictions on every image in the "test" dataset.
  prediction_digits = []
  for i, test_image in enumerate(test_images):
    if i % 1000 == 0:
      print(f"Evaluated on {i} results so far.")
    # Pre-processing: add batch dimension and convert to float32 to match with
    # the model's input data format.
    test_image = np.expand_dims(test_image, axis=0).astype(np.float32)
    interpreter.set_tensor(input_index, test_image)

    # Run inference.
    interpreter.invoke()

    # Post-processing: remove batch dimension and find the digit with highest
    # probability.
    output = interpreter.tensor(output_index)
    digit = np.argmax(output()[0])
    prediction_digits.append(digit)

  print('\n')
  # Compare prediction results with ground truth labels to calculate accuracy.
  prediction_digits = np.array(prediction_digits)
  accuracy = (prediction_digits == test_labels).mean()
  return accuracy

After evaluating the model, which has been clustered and quantized, you will see that the accuracy from TensorFlow persists in the TFLite backend.

interpreter = tf.lite.Interpreter(cqat_model_file)
interpreter.allocate_tensors()

cqat_test_accuracy = eval_model(interpreter)

print('Clustered and quantized TFLite test_accuracy:', cqat_test_accuracy)
print('Clustered TF test accuracy:', clustered_model_accuracy)
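If you want to confirm that the CQAT TFLite model's weights were indeed quantized to 8 bits, the following optional check (not part of the original notebook) inspects a few tensor details reported by the interpreter.

# Optional check: print a few tensor names and dtypes from the CQAT TFLite model;
# quantized weight tensors should report an int8 dtype.
check_interpreter = tf.lite.Interpreter(cqat_model_file)
check_interpreter.allocate_tensors()
for detail in check_interpreter.get_tensor_details()[:6]:
  print(detail['name'], detail['dtype'])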

Apply post-training quantization and compare to the CQAT model

Next, we apply post-training quantization (no fine-tuning) to the clustered model and check its accuracy against the CQAT model. This demonstrates why you would need to use CQAT to improve the quantized model's accuracy. The difference may not be very noticeable, because the MNIST model is quite small and over-parameterized.

First, define a generator for the calibration dataset from the first 1000 training images.

def mnist_representative_data_gen():
  for image in train_images[:1000]:
    image = np.expand_dims(image, axis=0).astype(np.float32)
    yield [image]

Quantize the model and compare its accuracy to the previously obtained CQAT model. Note that the model quantized with fine-tuning achieves higher accuracy.

converter = tf.lite.TFLiteConverter.from_keras_model(stripped_clustered_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = mnist_representative_data_gen
post_training_tflite_model = converter.convert()
post_training_model_file = 'post_training_model.tflite'
# Save the model.
with open(post_training_model_file, 'wb') as f:
    f.write(post_training_tflite_model)

# Compare accuracy
interpreter = tf.lite.Interpreter(post_training_model_file)
interpreter.allocate_tensors()

post_training_test_accuracy = eval_model(interpreter)

print('CQAT TFLite test_accuracy:', cqat_test_accuracy)
print('Post-training (no fine-tuning) TF test accuracy:', post_training_test_accuracy)
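To make the comparison explicit, you can optionally print the accuracy gap between the two approaches (this line is not part of the original notebook).

# Optional: accuracy gap between CQAT and plain post-training quantization.
print('Accuracy delta (CQAT - post-training):',
      cqat_test_accuracy - post_training_test_accuracy)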

Conclusion

In this tutorial, you learned how to create a model, cluster it using the cluster_weights() API, and apply cluster preserving quantization aware training (CQAT) to preserve clusters while using QAT. The final CQAT model was compared to the QAT one to show that the clusters are preserved in the former and lost in the latter. Next, the models were converted to TFLite to show the compression benefits of chaining the clustering and CQAT model optimization techniques, and the TFLite model was evaluated to ensure that the accuracy persists in the TFLite backend. Finally, the CQAT model was compared to a quantized clustered model obtained with the post-training quantization API to demonstrate CQAT's advantage in recovering the accuracy loss of normal quantization.