GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/zh-cn/guide/variable.ipynb

Kernel: Python 3

Copyright 2020 The TensorFlow Authors.

In [ ]:

#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

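Introduction to Variables

A TensorFlow variable is the recommended way to represent shared, persistent state that your program manipulates. This guide covers how to create, update, and manage instances of tf.Variable in TensorFlow.

Variables are created and tracked via the tf.Variable class. A tf.Variable represents a tensor whose value can be changed by running ops on it. Specific ops allow you to read and modify the value of this tensor. Higher-level libraries such as tf.keras use tf.Variable to store model parameters.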

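Setup

This notebook discusses variable placement. To see which device your variables are placed on, uncomment the tf.debugging.set_log_device_placement line in the cell below.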

In [ ]:

import tensorflow as tf

# Uncomment to see where your variables get placed (see below)
# tf.debugging.set_log_device_placement(True)

Create a variable

To create a variable, provide an initial value. The tf.Variable will have the same dtype as the initialization value.

In [ ]:

my_tensor = tf.constant([[1.0, 2.0], [3.0, 4.0]])
my_variable = tf.Variable(my_tensor)

# Variables can be all kinds of types, just like tensors
bool_variable = tf.Variable([False, False, False, True])
complex_variable = tf.Variable([5 + 4j, 6 + 1j])
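
As a small illustration of the dtype rule above, the following sketch shows how the dtype is inferred from the initial value and how it can be set explicitly through the dtype argument. The variable names are made up for this example.

```python
import tensorflow as tf

# The dtype is inferred from the initial value:
# Python floats become float32 and Python ints become int32.
float_var = tf.Variable([1.0, 2.0])
int_var = tf.Variable([1, 2])

# Passing dtype explicitly overrides the inferred type.
half_var = tf.Variable([1.0, 2.0], dtype=tf.float16)

print(float_var.dtype, int_var.dtype, half_var.dtype)
```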

A variable looks and acts like a tensor and is, in fact, a data structure backed by a tf.Tensor. Like tensors, variables have a dtype and a shape, and can be exported to NumPy.

In [ ]:

print("Shape: ", my_variable.shape)
print("DType: ", my_variable.dtype)
print("As NumPy: ", my_variable.numpy())

Most tensor operations work on variables as expected, although variables cannot be reshaped.

In [ ]:

print("A variable:", my_variable)
print("\nViewed as a tensor:", tf.convert_to_tensor(my_variable))
print("\nIndex of highest value:", tf.math.argmax(my_variable))

# This creates a new tensor; it does not reshape the variable.
print("\nCopying and reshaping: ", tf.reshape(my_variable, [1,4]))
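
To make the point about reshaping concrete, here is a minimal sketch (with illustrative names) that checks the variable keeps its original shape and that tf.reshape hands back a separate tensor rather than a variable.

```python
import tensorflow as tf

v = tf.Variable([[1.0, 2.0], [3.0, 4.0]])

# tf.reshape reads the variable's value and returns a new tensor.
reshaped = tf.reshape(v, [1, 4])

print("Variable shape:", v.shape)          # the variable itself is unchanged
print("Reshaped shape:", reshaped.shape)
print("Result is a tf.Variable:", isinstance(reshaped, tf.Variable))
```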

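As noted above, variables are backed by tensors. You can reassign the tensor using tf.Variable.assign. Calling assign does not (usually) allocate a new tensor; instead, the existing tensor's memory is reused.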

In [ ]:

a = tf.Variable([2.0, 3.0])
# This will keep the same dtype, float32
a.assign([1, 2])
# Not allowed as it resizes the variable:
try:
  a.assign([1.0, 2.0, 3.0])
except Exception as e:
  print(f"{type(e).__name__}: {e}")

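If you use a variable like a tensor in operations, you will usually operate on the backing tensor.

Creating a new variable from an existing variable duplicates the backing tensor. Two variables will not share the same memory.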

In [ ]:

a = tf.Variable([2.0, 3.0])
# Create b based on the value of a
b = tf.Variable(a)
a.assign([5, 6])

# a and b are different
print(a.numpy())
print(b.numpy())

# There are other versions of assign
print(a.assign_add([2,3]).numpy())  # [7. 9.]
print(a.assign_sub([7,9]).numpy())  # [0. 0.]

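Lifecycles, naming, and watching

In Python-based TensorFlow, a tf.Variable instance has the same lifecycle as any other Python object. When there are no references left to a variable, it is automatically deallocated.

Variables can also be named, which helps you track and debug them. Two variables can have the same name.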

In [ ]:

# Create a and b; they will have the same name but will be backed by
# different tensors.
a = tf.Variable(my_tensor, name="Mark")
# A new variable with the same name, but different value
# Note that the scalar add is broadcast
b = tf.Variable(my_tensor + 1, name="Mark")

# These are elementwise-unequal, despite having the same name
print(a == b)
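
As a rough sketch of what sharing a name means, the example below inspects the name attribute of two variables created with the same name. The exact string TensorFlow reports (for instance, whether a suffix is appended) may differ between versions, so it is printed rather than assumed.

```python
import tensorflow as tf

first = tf.Variable([1.0, 2.0], name="Mark")
second = tf.Variable([3.0, 4.0], name="Mark")

# Both variables report the same name...
print(first.name, second.name)

# ...but they are distinct objects holding different values.
print(first is second)
print(first.numpy(), second.numpy())
```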

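Variable names are preserved when saving and loading models. By default, variables in models acquire unique names automatically, so you don't need to assign them yourself unless you want to.

Although variables are important for differentiation, some variables do not need to be differentiated. You can turn off gradients for a variable by setting trainable to False at creation. A training step counter is one example of a variable that does not need gradients.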

In [ ]:

step_counter = tf.Variable(1, trainable=False)
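
The effect of trainable=False can be seen with tf.GradientTape, which by default watches only trainable variables. The sketch below uses a made-up weight and a float-valued counter (float so that it can take part in the arithmetic); it is an illustration, not a training loop.

```python
import tensorflow as tf

weight = tf.Variable(3.0)                      # trainable by default
counter = tf.Variable(1.0, trainable=False)    # excluded from automatic watching

with tf.GradientTape() as tape:
    y = weight * weight + counter

# The tape did not watch the non-trainable variable, so its gradient is None.
grad_weight, grad_counter = tape.gradient(y, [weight, counter])
print("d(y)/d(weight):", grad_weight)
print("d(y)/d(counter):", grad_counter)
```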

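Placing variables and tensors

For better performance, TensorFlow attempts to place tensors and variables on the fastest device compatible with their dtype. This means most variables are placed on a GPU if one is available.

However, you can override this. The snippet below places a float tensor and a variable on the CPU, even if a GPU is available. By turning on device placement logging (see Setup), you can see where the variable is placed.

Note: Although manual placement works, using distribution strategies can be a more convenient and scalable way to optimize your computation.

If you run this notebook on different backends with and without a GPU, you will see different logging. Note that device placement logging must be turned on at the start of the session.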

In [ ]:

with tf.device('CPU:0'):

  # Create some tensors
  a = tf.Variable([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
  b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
  c = tf.matmul(a, b)

print(c)
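
Besides the placement log, the device attribute offers a quick way to check where a variable or tensor ended up. The following sketch repeats the CPU placement above and prints the device strings; their exact format depends on the runtime.

```python
import tensorflow as tf

with tf.device('CPU:0'):
    a = tf.Variable([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
    b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    c = tf.matmul(a, b)

# Each object records the device it was placed on.
print("a:", a.device)
print("c:", c.device)
```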

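It is possible to place a variable or tensor on one device and run the computation on another. This introduces delay, because the data must be copied between the devices.

You might do this, however, if you have multiple GPU workers but want only one copy of the variables.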

In [ ]:

with tf.device('CPU:0'):
  a = tf.Variable([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
  b = tf.Variable([[1.0, 2.0, 3.0]])

with tf.device('GPU:0'):
  # Element-wise multiply
  k = a * b

print(k)

Note: Because tf.config.set_soft_device_placement is turned on by default, this code still runs on a machine without a GPU; the multiplication step simply happens on the CPU.
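
To check how the note above applies on a given machine, a sketch along these lines reports how many GPUs are visible, whether soft placement is enabled, and where the multiplication actually ran. The printed values depend on the hardware and configuration.

```python
import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
print("GPUs visible:", len(gpus))
print("Soft placement enabled:", tf.config.get_soft_device_placement())

with tf.device('CPU:0'):
    a = tf.Variable([[1.0, 2.0, 3.0]])

# With soft placement on, this falls back to the CPU when no GPU exists.
with tf.device('GPU:0'):
    k = a * 2.0

print("Multiplication ran on:", k.device)
```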

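For more on distributed training, see the distributed training guide.

Next steps

To understand how variables are typically used, see the guide on automatic differentiation.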
