GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/zh-cn/addons/tutorials/layers_weightnormalization.ipynb
Kernel: Python 3
#@title Licensed under the Apache License, Version 2.0
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

Overview

This notebook demonstrates how to use the weight normalization layer and how it can improve convergence.

WeightNormalization

A Simple Reparameterization to Accelerate Training of Deep Neural Networks:

Tim Salimans, Diederik P. Kingma (2016)

By reparameterizing the weights in this way we improve the conditioning of the optimization problem and speed up convergence of stochastic gradient descent. Our reparameterization is inspired by batch normalization but does not introduce any dependencies between the examples in a minibatch. This means that our method can also be applied successfully to recurrent models such as LSTMs and to noise-sensitive applications such as deep reinforcement learning or generative models, for which batch normalization is less well suited. Although our method is much simpler, it still provides much of the speed-up of full batch normalization. In addition, the computational overhead of our method is lower, permitting more optimization steps to be taken in the same amount of time.

https://arxiv.org/abs/1602.07868
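
Concretely, the paper reparameterizes each weight vector w in terms of a direction vector v and a scalar scale g, so that w = g * v / ||v||; gradient descent then updates v and g instead of w, decoupling the length of the weight vector from its direction. A minimal NumPy sketch of the decomposition (the names v, g, and w are illustrative, not part of this tutorial):

import numpy as np

# Weight normalization: w = g * v / ||v||
# v carries the direction (unconstrained); g carries the length (a scalar).
v = np.array([3.0, 4.0])       # direction parameter
g = 2.0                        # scale parameter
w = g * v / np.linalg.norm(v)  # effective weight vector used by the layer
print(w)                       # [1.2 1.6]; note ||w|| == g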



Setup

!pip install -U tensorflow-addons
import tensorflow as tf
import tensorflow_addons as tfa

import numpy as np
from matplotlib import pyplot as plt

# Hyperparameters
batch_size = 32
epochs = 10
num_classes = 10

Build Models

# Standard ConvNet
reg_model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(6, 5, activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(16, 5, activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(120, activation='relu'),
    tf.keras.layers.Dense(84, activation='relu'),
    tf.keras.layers.Dense(num_classes, activation='softmax'),
])
# WeightNorm ConvNet
wn_model = tf.keras.Sequential([
    tfa.layers.WeightNormalization(tf.keras.layers.Conv2D(6, 5, activation='relu')),
    tf.keras.layers.MaxPooling2D(2, 2),
    tfa.layers.WeightNormalization(tf.keras.layers.Conv2D(16, 5, activation='relu')),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tfa.layers.WeightNormalization(tf.keras.layers.Dense(120, activation='relu')),
    tfa.layers.WeightNormalization(tf.keras.layers.Dense(84, activation='relu')),
    tfa.layers.WeightNormalization(tf.keras.layers.Dense(num_classes, activation='softmax')),
])
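
The tfa.layers.WeightNormalization wrapper also accepts a data_init argument (True by default), which enables the data-dependent initialization of the scale and bias described in the paper. A minimal sketch of wrapping a single layer (the names dense and wn_dense are illustrative):

# data_init=True (the default) initializes g and the bias from the
# activations of the first batch, as proposed in the paper.
dense = tf.keras.layers.Dense(120, activation='relu')
wn_dense = tfa.layers.WeightNormalization(dense, data_init=True)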

Load Data

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

# Convert class vectors to binary class matrices.
y_train = tf.keras.utils.to_categorical(y_train, num_classes)
y_test = tf.keras.utils.to_categorical(y_test, num_classes)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
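
As a quick sanity check (not part of the original notebook), the CIFAR-10 arrays loaded above should have the following shapes after one-hot encoding:

print(x_train.shape, y_train.shape)  # (50000, 32, 32, 3) (50000, 10)
print(x_test.shape, y_test.shape)    # (10000, 32, 32, 3) (10000, 10)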

Train Models

reg_model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])

reg_history = reg_model.fit(x_train, y_train,
                            batch_size=batch_size,
                            epochs=epochs,
                            validation_data=(x_test, y_test),
                            shuffle=True)

wn_model.compile(optimizer='adam',
                 loss='categorical_crossentropy',
                 metrics=['accuracy'])

wn_history = wn_model.fit(x_train, y_train,
                          batch_size=batch_size,
                          epochs=epochs,
                          validation_data=(x_test, y_test),
                          shuffle=True)
reg_accuracy = reg_history.history['accuracy']
wn_accuracy = wn_history.history['accuracy']

plt.plot(np.linspace(0, epochs, epochs), reg_accuracy,
         color='red', label='Regular ConvNet')
plt.plot(np.linspace(0, epochs, epochs), wn_accuracy,
         color='blue', label='WeightNorm ConvNet')
plt.title('WeightNorm Accuracy Comparison')
plt.legend()
plt.grid(True)
plt.show()
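
Because fit() was called with validation_data, Keras also records a val_accuracy series in each history object, so the same comparison can be plotted for the held-out set (a sketch mirroring the plot above, not part of the original notebook):

reg_val_accuracy = reg_history.history['val_accuracy']
wn_val_accuracy = wn_history.history['val_accuracy']

plt.plot(np.linspace(0, epochs, epochs), reg_val_accuracy,
         color='red', label='Regular ConvNet')
plt.plot(np.linspace(0, epochs, epochs), wn_val_accuracy,
         color='blue', label='WeightNorm ConvNet')
plt.title('WeightNorm Validation Accuracy Comparison')
plt.legend()
plt.grid(True)
plt.show()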