GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/ko/model_optimization/guide/combine/cqat_example.ipynb

Copyright 2021 The TensorFlow Authors.

In [ ]:

#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

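Cluster preserving quantization aware training (CQAT) Keras example

Overview

This is an end-to-end example showing the usage of the cluster preserving quantization aware training (CQAT) API, part of the TensorFlow Model Optimization Toolkit's collaborative optimization pipeline.

Other pages

For an introduction to the pipeline and other available techniques, see the collaborative optimization overview page.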

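Contents

In this tutorial, you will:

Train a tf.keras model for the MNIST dataset from scratch.
Fine-tune the model with clustering and check the accuracy.
Apply QAT and observe the loss of clusters.
Apply CQAT and observe that the clustering applied earlier has been preserved.
Generate a TFLite model and observe the effects of applying CQAT on it.
Compare the achieved CQAT model accuracy with a model quantized using post-training quantization.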

Setup

You can run this Jupyter Notebook in your local virtualenv or in Colab. For details on setting up dependencies, refer to the installation guide.

In [ ]:

! pip install -q tensorflow-model-optimization

In [ ]:

import tensorflow as tf

import numpy as np
import tempfile
import zipfile
import os

Train a tf.keras model for MNIST without clustering

In [ ]:

# Load MNIST dataset
mnist = tf.keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Normalize the input image so that each pixel value is between 0 to 1.
train_images = train_images / 255.0
test_images  = test_images / 255.0

model = tf.keras.Sequential([
  tf.keras.layers.InputLayer(input_shape=(28, 28)),
  tf.keras.layers.Reshape(target_shape=(28, 28, 1)),
  tf.keras.layers.Conv2D(filters=12, kernel_size=(3, 3),
                         activation=tf.nn.relu),
  tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(10)
])

# Train the digit classification model
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(
    train_images,
    train_labels,
    validation_split=0.1,
    epochs=10
)

Evaluate the baseline model and save it for later use

In [ ]:

_, baseline_model_accuracy = model.evaluate(
    test_images, test_labels, verbose=0)

print('Baseline test accuracy:', baseline_model_accuracy)

_, keras_file = tempfile.mkstemp('.h5')
print('Saving model to: ', keras_file)
tf.keras.models.save_model(model, keras_file, include_optimizer=False)

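Cluster and fine-tune the model with 8 clusters

Apply the cluster_weights() API to cluster the whole pre-trained model, and observe how this reduces the model size once the file is zipped while maintaining accuracy. For how to use the API to achieve the best compression rate while keeping your target accuracy, refer to the clustering comprehensive guide.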
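To make the idea of weight clustering concrete, here is a minimal NumPy sketch that replaces every weight in a toy kernel with the nearest of 8 centroids. It is an illustration only, not the tfmot implementation: the function name cluster_weights_sketch, the linearly spaced initial centroids, and the random kernel are all assumptions made for this example.

```python
import numpy as np

def cluster_weights_sketch(weights, n_clusters=8, n_iters=20):
    """Toy 1-D k-means: replace every weight with its nearest centroid."""
    flat = weights.ravel()
    # Assumption: linearly spaced initial centroids (not KMEANS_PLUS_PLUS).
    centroids = np.linspace(flat.min(), flat.max(), n_clusters)
    for _ in range(n_iters):
        # Assign each weight to its nearest centroid.
        idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
        # Move each non-empty centroid to the mean of its assigned weights.
        for k in range(n_clusters):
            if np.any(idx == k):
                centroids[k] = flat[idx == k].mean()
    # Final assignment with the updated centroids.
    idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
    return centroids[idx].reshape(weights.shape)

# Hypothetical kernel with the same shape as the Conv2D kernel in this tutorial.
rng = np.random.default_rng(0)
kernel = rng.normal(size=(3, 3, 1, 12)).astype(np.float32)
clustered = cluster_weights_sketch(kernel, n_clusters=8)
print(len(np.unique(kernel)), "->", len(np.unique(clustered)))
```

After clustering, the kernel holds at most 8 distinct values, which is what makes the serialized weights compress well.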

Define the model and apply the clustering API

The model needs to be pre-trained before using the clustering API.

In [ ]:

import tensorflow_model_optimization as tfmot

cluster_weights = tfmot.clustering.keras.cluster_weights
CentroidInitialization = tfmot.clustering.keras.CentroidInitialization

clustering_params = {
  'number_of_clusters': 8,
  'cluster_centroids_init': CentroidInitialization.KMEANS_PLUS_PLUS,
  'cluster_per_channel': True,
}

clustered_model = cluster_weights(model, **clustering_params)

# Use smaller learning rate for fine-tuning
opt = tf.keras.optimizers.Adam(learning_rate=1e-5)

clustered_model.compile(
  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
  optimizer=opt,
  metrics=['accuracy'])

clustered_model.summary()

Fine-tune the model and evaluate the accuracy against baseline

Fine-tune the model with clustering for 3 epochs.

In [ ]:

# Fine-tune model
clustered_model.fit(
  train_images,
  train_labels,
  epochs=3,
  validation_split=0.1)

Define a helper function that counts and prints the number of clusters (unique weight values) in each kernel of the model.

In [ ]:

def print_model_weight_clusters(model):

    for layer in model.layers:
        if isinstance(layer, tf.keras.layers.Wrapper):
            weights = layer.trainable_weights
        else:
            weights = layer.weights
        for weight in weights:
            # ignore auxiliary quantization weights
            if "quantize_layer" in weight.name:
                continue
            if "kernel" in weight.name:
                unique_count = len(np.unique(weight))
                print(
                    f"{layer.name}/{weight.name}: {unique_count} clusters "
                )

Check that the model kernels were correctly clustered. The clustering wrapper must be stripped first.

In [ ]:

stripped_clustered_model = tfmot.clustering.keras.strip_clustering(clustered_model)

print_model_weight_clusters(stripped_clustered_model)

For this example, there is minimal loss in test accuracy after clustering, compared to the baseline.

In [ ]:

_, clustered_model_accuracy = clustered_model.evaluate(
  test_images, test_labels, verbose=0)

print('Baseline test accuracy:', baseline_model_accuracy)
print('Clustered test accuracy:', clustered_model_accuracy)

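Apply QAT and CQAT and check the effect on model clusters in both cases

Next, apply both QAT and cluster preserving QAT (CQAT) to the clustered model and observe that CQAT preserves the weight clusters in the clustered model. Note that the clustering wrappers were stripped from the model with tfmot.clustering.keras.strip_clustering before applying the CQAT API.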
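The following NumPy sketch gives some intuition for why ordinary fine-tuning can destroy clusters while a cluster-aware update keeps them. It is a conceptual illustration under stated assumptions, not how tfmot implements CQAT: the centroids, the random stand-in gradients, and the "average the gradient per cluster" rule are all made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
centroids = np.array([-0.5, 0.0, 0.5])
idx = rng.integers(0, 3, size=20)   # cluster assignment of each weight
idx[:3] = [0, 1, 2]                 # make sure every cluster is used
weights = centroids[idx]            # clustered weights: 3 unique values
grads = rng.normal(size=20)         # stand-in per-weight gradients
lr = 0.01

# Plain update: every weight moves independently, so tied weights drift apart.
free = weights - lr * grads

# Cluster-aware update: weights in one cluster share a single centroid,
# which moves by the average gradient of its members.
centroid_grads = np.array([grads[idx == k].mean() for k in range(3)])
tied = (centroids - lr * centroid_grads)[idx]

print("unique after plain update:", len(np.unique(free)))
print("unique after cluster-aware update:", len(np.unique(tied)))
```

The plain update leaves many more than 3 distinct values, while the cluster-aware update still has exactly 3, which mirrors the QAT versus CQAT comparison printed by print_model_weight_clusters.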

In [ ]:

# QAT
qat_model = tfmot.quantization.keras.quantize_model(stripped_clustered_model)

qat_model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
print('Train qat model:')
qat_model.fit(train_images, train_labels, batch_size=128, epochs=1, validation_split=0.1)

# CQAT
quant_aware_annotate_model = tfmot.quantization.keras.quantize_annotate_model(
              stripped_clustered_model)
cqat_model = tfmot.quantization.keras.quantize_apply(
              quant_aware_annotate_model,
              tfmot.experimental.combine.Default8BitClusterPreserveQuantizeScheme())

cqat_model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
print('Train cqat model:')
cqat_model.fit(train_images, train_labels, batch_size=128, epochs=1, validation_split=0.1)

In [ ]:

print("QAT Model clusters:")
print_model_weight_clusters(qat_model)
print("CQAT Model clusters:")
print_model_weight_clusters(cqat_model)

See compression benefits of the CQAT model

Define a helper function to get the size of the zipped model file.

In [ ]:

def get_gzipped_model_size(file):
  # Returns the size of the zip-compressed (DEFLATE) model file in kilobytes.

  _, zipped_file = tempfile.mkstemp('.zip')
  with zipfile.ZipFile(zipped_file, 'w', compression=zipfile.ZIP_DEFLATED) as f:
    f.write(file)

  return os.path.getsize(zipped_file)/1000

Note that this is a small model. Applying clustering and CQAT to a bigger production model would yield more significant compression.

In [ ]:

# QAT model
converter = tf.lite.TFLiteConverter.from_keras_model(qat_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
qat_tflite_model = converter.convert()
qat_model_file = 'qat_model.tflite'
# Save the model.
with open(qat_model_file, 'wb') as f:
    f.write(qat_tflite_model)
    
# CQAT model
converter = tf.lite.TFLiteConverter.from_keras_model(cqat_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
cqat_tflite_model = converter.convert()
cqat_model_file = 'cqat_model.tflite'
# Save the model.
with open(cqat_model_file, 'wb') as f:
    f.write(cqat_tflite_model)
    
print("QAT model size: ", get_gzipped_model_size(qat_model_file), ' KB')
print("CQAT model size: ", get_gzipped_model_size(cqat_model_file), ' KB')
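The size difference comes from how well DEFLATE compresses data with few distinct values. As a rough, self-contained illustration (toy data, not the actual model files), the sketch below compresses two int8 arrays of the same length: one with values spread over the full int8 range and one drawn from only 8 values. The array length and the 8 centroid values are arbitrary assumptions.

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)
n = 20000

# "Unclustered" int8 weights: values spread over the full int8 range.
dense = rng.integers(-128, 128, size=n).astype(np.int8)

# "Clustered" int8 weights: every value is one of 8 hypothetical centroids.
centroids = np.array([-100, -60, -20, -5, 5, 20, 60, 100], dtype=np.int8)
clustered = centroids[rng.integers(0, 8, size=n)]

# zlib uses DEFLATE, the same algorithm as zipfile.ZIP_DEFLATED.
dense_kb = len(zlib.compress(dense.tobytes())) / 1000
clustered_kb = len(zlib.compress(clustered.tobytes())) / 1000
print("dense:", dense_kb, "KB")
print("clustered:", clustered_kb, "KB")
```

Both arrays occupy the same number of bytes before compression; only the clustered one shrinks substantially, because each byte carries far less information.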

See the persistence of accuracy from TF to TFLite

Define a helper function to evaluate the TFLite model on the test dataset.

In [ ]:

def eval_model(interpreter):
  input_index = interpreter.get_input_details()[0]["index"]
  output_index = interpreter.get_output_details()[0]["index"]

  # Run predictions on every image in the "test" dataset.
  prediction_digits = []
  for i, test_image in enumerate(test_images):
    if i % 1000 == 0:
      print(f"Evaluated on {i} results so far.")
    # Pre-processing: add batch dimension and convert to float32 to match with
    # the model's input data format.
    test_image = np.expand_dims(test_image, axis=0).astype(np.float32)
    interpreter.set_tensor(input_index, test_image)

    # Run inference.
    interpreter.invoke()

    # Post-processing: remove batch dimension and find the digit with highest
    # probability.
    output = interpreter.tensor(output_index)
    digit = np.argmax(output()[0])
    prediction_digits.append(digit)

  print('\n')
  # Compare prediction results with ground truth labels to calculate accuracy.
  prediction_digits = np.array(prediction_digits)
  accuracy = (prediction_digits == test_labels).mean()
  return accuracy

Evaluate the model, which has been clustered and quantized, and then check that the accuracy from TensorFlow persists in the TFLite backend.

In [ ]:

interpreter = tf.lite.Interpreter(cqat_model_file)
interpreter.allocate_tensors()

cqat_test_accuracy = eval_model(interpreter)

print('Clustered and quantized TFLite test_accuracy:', cqat_test_accuracy)
print('Clustered TF test accuracy:', clustered_model_accuracy)

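Apply post-training quantization and compare to the CQAT model

Next, apply post-training quantization (no fine-tuning) to the clustered model and check its accuracy against the CQAT model. This demonstrates why you would use CQAT to improve the quantized model's accuracy. The difference may not be very visible, because the MNIST model is quite small and overparameterized.

First, define a generator for the calibration dataset from the first 1000 training images.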

In [ ]:

def mnist_representative_data_gen():
  for image in train_images[:1000]:  
    image = np.expand_dims(image, axis=0).astype(np.float32)
    yield [image]
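The calibration samples let the converter observe the value ranges it needs in order to choose quantization parameters. The sketch below shows the general idea of affine int8 quantization from an observed min/max range on a toy image. It is a simplified illustration, not the exact scheme the TFLite converter uses; the helpers quantize_int8 and dequantize are hypothetical.

```python
import numpy as np

def quantize_int8(x):
    """Affine-quantize a float array to int8 from its observed min/max."""
    # The representable range is widened to include 0.0.
    lo = min(float(x.min()), 0.0)
    hi = max(float(x.max()), 0.0)
    scale = (hi - lo) / 255.0 or 1.0   # guard against an all-zero input
    zero_point = int(round(-128 - lo / scale))
    q = np.clip(np.round(x / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map int8 values back to approximate float values."""
    return (q.astype(np.float32) - zero_point) * scale

# Toy "image" with pixel values in [0, 1), like the normalized MNIST inputs.
rng = np.random.default_rng(0)
x = rng.random((28, 28)).astype(np.float32)

q, scale, zero_point = quantize_int8(x)
restored = dequantize(q, scale, zero_point)
print("scale:", scale, "zero_point:", zero_point)
print("max round-trip error:", float(np.abs(restored - x).max()))
```

The round-trip error stays within one quantization step, which is the rounding noise that post-training quantization introduces without any fine-tuning to compensate for it.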

Quantize the model and compare its accuracy to the previously obtained CQAT model. Note that the model quantized with fine-tuning achieves higher accuracy.

In [ ]:

converter = tf.lite.TFLiteConverter.from_keras_model(stripped_clustered_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = mnist_representative_data_gen
post_training_tflite_model = converter.convert()
post_training_model_file = 'post_training_model.tflite'
# Save the model.
with open(post_training_model_file, 'wb') as f:
    f.write(post_training_tflite_model)
    
# Compare accuracy
interpreter = tf.lite.Interpreter(post_training_model_file)
interpreter.allocate_tensors()

post_training_test_accuracy = eval_model(interpreter)

print('CQAT TFLite test accuracy:', cqat_test_accuracy)
print('Post-training (no fine-tuning) TFLite test accuracy:', post_training_test_accuracy)

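Conclusion

In this tutorial, you learned how to create a model, cluster it using the cluster_weights() API, and apply cluster preserving quantization aware training (CQAT) to preserve clusters while using QAT. The final CQAT model was compared to the QAT model to show that the clusters are preserved in the former and lost in the latter. Next, the models were converted to TFLite to show the compression benefits of chaining the clustering and CQAT model optimization techniques, and the TFLite model was evaluated to ensure that the accuracy persists in the TFLite backend. Finally, the CQAT model was compared to a quantized clustered model obtained with the post-training quantization API to demonstrate the advantage of CQAT in recovering the accuracy loss from normal quantization.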