GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/zh-cn/addons/tutorials/layers_normalizations.ipynb

Kernel: Python 3

Copyright 2020 The TensorFlow Authors.

In [ ]:

#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

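Normalizations

Overview

This notebook gives a brief introduction to the normalization layers of TensorFlow. Currently supported layers are:

Group Normalization (TensorFlow Addons)
Instance Normalization (TensorFlow Addons)
Layer Normalization (TensorFlow Core)

The basic idea behind these layers is to normalize the output of an activation layer to improve convergence during training. In contrast to batch normalization, these normalizations do not work on batches; instead they normalize the activations of a single sample, which makes them suitable for recurrent neural networks as well.

Typically, the normalization is performed by computing the mean and the standard deviation of a subgroup in the input tensor. A scale and an offset factor can also be applied.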

$y_{i} = \frac{\gamma ( x_{i} - \mu )}{\sigma }+ \beta$

$y$ : Output

$x$ : Input

$\gamma$ : Scale factor

$\mu$ : Mean

$\sigma$ : Standard deviation

$\beta$ : Offset factor
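As a concrete illustration of the formula above, here is a minimal NumPy sketch. The function name `normalize` and the small `eps` term added under the square root are assumptions made for this example; they are not part of the formula as written.

```python
import numpy as np

def normalize(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Apply y = gamma * (x - mu) / sigma + beta over all values of x."""
    mu = x.mean()
    # eps is an assumed stabilizer that guards against division by zero.
    sigma = np.sqrt(x.var() + eps)
    return gamma * (x - mu) / sigma + beta

x = np.array([1.0, 2.0, 3.0, 4.0])
y = normalize(x, gamma=2.0, beta=0.5)

# After normalization the mean is close to beta and the
# standard deviation is close to gamma.
print(y.mean(), y.std())
```

With `gamma=1` and `beta=0` the output is simply the standardized input; the two factors let a layer rescale and shift it again.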

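The following image illustrates the differences between these techniques. Each subplot shows an input tensor, with N as the batch axis, C as the channel axis, and (H, W) as the spatial axes (for example, the height and width of an image). The pixels in blue are normalized by the same mean and variance, computed by aggregating the values of these pixels.

Source: (https://arxiv.org/pdf/1803.08494.pdf)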
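The aggregation described above can be sketched in NumPy by choosing which axes are reduced when the mean is computed. This is only an illustration on a random channels-last tensor; the tensor shape and the group count of 3 are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8, 8, 6))  # (N, H, W, C), channels last
n, h, w, c = x.shape

# Batch norm: one statistic per channel, aggregated over N, H, W.
batch_mean = x.mean(axis=(0, 1, 2))

# Layer norm: one statistic per sample, aggregated over H, W, C.
layer_mean = x.mean(axis=(1, 2, 3))

# Instance norm: one statistic per sample and channel, aggregated over H, W.
instance_mean = x.mean(axis=(1, 2))

# Group norm: split C into groups of adjacent channels, then aggregate
# over H, W and the channels inside each group.
groups = 3  # illustrative; C must be divisible by the number of groups
group_mean = x.reshape(n, h, w, groups, c // groups).mean(axis=(1, 2, 4))

print(batch_mean.shape, layer_mean.shape, instance_mean.shape, group_mean.shape)
```

The shape of each result shows how many separate mean and variance pairs a technique maintains: only batch normalization produces statistics that depend on the other samples in the batch.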

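The weights γ and β are trainable in all normalization layers to compensate for a possible loss of representational ability. You can activate these factors by setting the center or scale flag to True. You can also use initializers, constraints, and regularizers for beta and gamma to tune these values during training.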
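As a sketch of how these options fit together, the snippet below configures `tf.keras.layers.LayerNormalization` from TensorFlow Core. The particular initializer range, regularization strength, and constraint are illustrative assumptions, not recommended settings.

```python
import tensorflow as tf

layer = tf.keras.layers.LayerNormalization(
    axis=-1,
    center=True,  # learn the offset factor beta
    scale=True,   # learn the scale factor gamma
    beta_initializer="zeros",
    # Illustrative values: start gamma near 1, penalize large values,
    # and keep it non-negative during training.
    gamma_initializer=tf.keras.initializers.RandomUniform(minval=0.9, maxval=1.1),
    gamma_regularizer=tf.keras.regularizers.L2(1e-4),
    gamma_constraint=tf.keras.constraints.NonNeg(),
)

x = tf.random.normal((2, 26, 26, 10))
y = layer(x)

# With center=True and scale=True the layer owns two trainable weights.
print(len(layer.trainable_weights), y.shape)
```

Setting `center=False` or `scale=False` removes the corresponding weight, in which case its initializer, regularizer, and constraint have nothing to act on.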

Setup

Install TensorFlow 2.0 and TensorFlow Addons

In [ ]:

!pip install -U tensorflow-addons

In [ ]:

import tensorflow as tf
import tensorflow_addons as tfa

Prepare the dataset

In [ ]:

mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

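Group Normalization Tutorial

Introduction

Group normalization (GN) divides the channels of the input into smaller subgroups and normalizes the values within each subgroup using that subgroup's mean and variance. Because GN operates on a single sample, the technique is independent of the batch size.

In image classification tasks, GN scored very close to batch normalization in experiments. When your overall batch size is small, GN can be preferable to batch normalization, because batch normalization performs poorly with small batches.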
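To make the mechanics concrete, here is a minimal NumPy sketch of group normalization for a channels-last tensor, without the trainable scale and offset. The helper name `group_norm`, the `eps` value, and the grouping of adjacent channels are assumptions of this sketch and may differ from the TensorFlow Addons implementation.

```python
import numpy as np

def group_norm(x, groups, eps=1e-5):
    """Group-normalize a tensor of shape (N, H, W, C), without gamma/beta."""
    n, h, w, c = x.shape
    assert c % groups == 0, "channels must be divisible by groups"
    # Split the channel axis into (groups, channels_per_group).
    g = x.reshape(n, h, w, groups, c // groups)
    # Statistics per sample and per group, over H, W and the group's channels.
    mean = g.mean(axis=(1, 2, 4), keepdims=True)
    var = g.var(axis=(1, 2, 4), keepdims=True)
    return ((g - mean) / np.sqrt(var + eps)).reshape(n, h, w, c)

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(2, 26, 26, 10))
y = group_norm(x, groups=5)

# Every (sample, group) slice now has mean close to 0 and variance close to 1.
per_group = y.reshape(2, 26, 26, 5, 2)
print(per_group.mean(axis=(1, 2, 4)))

# Normalizing one sample alone gives the same result as normalizing it
# inside a larger batch, which is the batch-size independence noted above.
print(np.allclose(group_norm(x[:1], groups=5), y[:1]))
```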

Example

The following example splits 10 channels after a Conv2D layer into 5 subgroups in a standard "channels last" setting:

In [ ]:

model = tf.keras.models.Sequential([
  # Reshape into "channels last" setup.
  tf.keras.layers.Reshape((28,28,1), input_shape=(28,28)),
  tf.keras.layers.Conv2D(filters=10, kernel_size=(3,3),data_format="channels_last"),
  # Groupnorm Layer
  tfa.layers.GroupNormalization(groups=5, axis=3),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_test, y_test)

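Instance Normalization Tutorial

Introduction

Instance normalization is a special case of group normalization in which the number of groups equals the number of channels (the size of the normalized axis), so each group holds a single channel.

Experimental results show that instance normalization performs well on style transfer when it replaces batch normalization. More recently, instance normalization has also been used as a replacement for batch normalization in GANs.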
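The relationship to group normalization can be checked numerically. The NumPy sketch below computes instance normalization directly and then again as group normalization with one channel per group; the tensor shape and `eps` value are arbitrary assumptions, and the trainable scale and offset are left out.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 26, 26, 10))  # (N, H, W, C)
n, h, w, c = x.shape
eps = 1e-5  # assumed stabilizer

# Instance normalization: statistics per sample and per channel, over (H, W).
mean = x.mean(axis=(1, 2), keepdims=True)
var = x.var(axis=(1, 2), keepdims=True)
instance = (x - mean) / np.sqrt(var + eps)

# Group normalization with as many groups as channels (one channel per group).
g = x.reshape(n, h, w, c, 1)
g_mean = g.mean(axis=(1, 2, 4), keepdims=True)
g_var = g.var(axis=(1, 2, 4), keepdims=True)
grouped = ((g - g_mean) / np.sqrt(g_var + eps)).reshape(n, h, w, c)

print(np.allclose(instance, grouped))  # True
```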

Example

Apply InstanceNormalization after a Conv2D layer, with the scale and offset factors initialized from a uniform distribution.

In [ ]:

model = tf.keras.models.Sequential([
  # Reshape into "channels last" setup.
  tf.keras.layers.Reshape((28,28,1), input_shape=(28,28)),
  tf.keras.layers.Conv2D(filters=10, kernel_size=(3,3),data_format="channels_last"),
  # InstanceNorm Layer
  tfa.layers.InstanceNormalization(axis=3, 
                                   center=True, 
                                   scale=True,
                                   beta_initializer="random_uniform",
                                   gamma_initializer="random_uniform"),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_test, y_test)

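Layer Normalization Tutorial

Introduction

Layer normalization is a special case of group normalization in which the number of groups is 1. The mean and standard deviation are computed from all activations of a single sample.

Experimental results show that layer normalization is well suited to recurrent neural networks, since it works independently of the batch size.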
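The NumPy sketch below contrasts two choices of reduction axes on a channels-last tensor. Reducing over axes (1, 2, 3) matches the description above, one mean and standard deviation per sample. Reducing over the channel axis alone gives one pair per spatial position instead; according to the Keras documentation, the `axis` argument of `tf.keras.layers.LayerNormalization` names the axes to normalize across, so the choice of `axis` decides which of the two behaviors you get. The helper name `standardize` and the `eps` value are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 26, 26, 10))  # (N, H, W, C)
eps = 1e-3  # assumed stabilizer

def standardize(x, axes):
    """Subtract the mean and divide by the standard deviation over `axes`."""
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

# One mean/std per sample: group normalization with a single group.
per_sample = standardize(x, axes=(1, 2, 3))

# One mean/std per (sample, row, column): statistics over channels only.
per_position = standardize(x, axes=(3,))

print(per_sample.mean(axis=(1, 2, 3)))           # close to 0 for each sample
print(np.abs(per_position.mean(axis=3)).max())   # close to 0 at every position
print(np.allclose(per_sample, per_position))     # the two results differ
```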

Example

Apply LayerNormalization after a Conv2D layer, with the scale and offset factors enabled.

In [ ]:

model = tf.keras.models.Sequential([
  # Reshape into "channels last" setup.
  tf.keras.layers.Reshape((28,28,1), input_shape=(28,28)),
  tf.keras.layers.Conv2D(filters=10, kernel_size=(3,3),data_format="channels_last"),
  # LayerNorm Layer
  tf.keras.layers.LayerNormalization(axis=3, center=True, scale=True),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_test, y_test)

Literature

Layer norm

Instance norm

Group norm
