GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/zh-cn/quantum/tutorials/gradients.ipynb
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

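Calculate gradients

This tutorial explores gradient calculation algorithms for the expectation values of quantum circuits.

Calculating the gradient of the expectation value of an observable in a quantum circuit is an involved process. Unlike traditional machine learning transformations such as matrix multiplication or vector addition, expectation values of observables do not always have analytic gradient formulas that are easy to write down. As a result, different quantum gradient calculation methods suit different scenarios. This tutorial compares two differentiation schemes.

Setup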

!pip install tensorflow==2.7.0

Install TensorFlow Quantum:

!pip install tensorflow-quantum==0.7.2
# Update package resources to account for version changes.
import importlib, pkg_resources
importlib.reload(pkg_resources)

Now import TensorFlow and the module dependencies:

import tensorflow as tf
import tensorflow_quantum as tfq

import cirq
import sympy
import numpy as np

# visualization tools
%matplotlib inline
import matplotlib.pyplot as plt
from cirq.contrib.svg import SVGCircuit

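1. Preliminary

Let's make the notion of gradient calculation for quantum circuits more concrete. Suppose you have a parameterized circuit like this one: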

qubit = cirq.GridQubit(0, 0)
my_circuit = cirq.Circuit(cirq.Y(qubit)**sympy.Symbol('alpha'))
SVGCircuit(my_circuit)

Along with an observable:

pauli_x = cirq.X(qubit)
pauli_x

For this operator, $⟨Y(\alpha)| X | Y(\alpha)⟩ = \sin(\pi \alpha)$.

def my_expectation(op, alpha):
    """Compute ⟨Y(alpha)| `op` | Y(alpha)⟩"""
    params = {'alpha': alpha}
    sim = cirq.Simulator()
    final_state_vector = sim.simulate(my_circuit, params).final_state_vector
    return op.expectation_from_state_vector(final_state_vector, {qubit: 0}).real


my_alpha = 0.3
print("Expectation=", my_expectation(pauli_x, my_alpha))
print("Sin Formula=", np.sin(np.pi * my_alpha))
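The sine formula can also be checked by hand without a simulator. The following is a minimal sketch, assuming that `Y**alpha` has eigenvalues 1 and exp(iπα), so that applying it to |0⟩ gives amplitudes proportional to (cos(πα/2), sin(πα/2)). The helper names `y_power_state`, `expectation_x`, and `expectation_z` are hypothetical and are not part of Cirq or TFQ.

```python
import cmath
import math


def y_power_state(alpha):
    """Amplitudes of Y**alpha applied to |0> (assumed eigenvalues: 1 and exp(i*pi*alpha))."""
    half_angle = math.pi * alpha / 2
    phase = cmath.exp(1j * half_angle)  # global phase, drops out of expectations
    return phase * math.cos(half_angle), phase * math.sin(half_angle)


def expectation_x(state):
    """<psi|X|psi> for a single-qubit state (amp0, amp1)."""
    amp0, amp1 = state
    return 2 * (amp0.conjugate() * amp1).real


def expectation_z(state):
    """<psi|Z|psi> for a single-qubit state (amp0, amp1)."""
    amp0, amp1 = state
    return abs(amp0) ** 2 - abs(amp1) ** 2


alpha = 0.3
state = y_power_state(alpha)
print("<X> =", expectation_x(state), " sin(pi*alpha) =", math.sin(math.pi * alpha))
print("<Z> =", expectation_z(state), " cos(pi*alpha) =", math.cos(math.pi * alpha))
```

Under that assumption the X expectation reduces to 2·cos(πα/2)·sin(πα/2) = sin(πα), which is the closed form quoted above.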

If you define $f_{1}(\alpha) = ⟨Y(\alpha)| X | Y(\alpha)⟩$, then $f_{1}^{'}(\alpha) = \pi \cos(\pi \alpha)$. Let's check this:

def my_grad(obs, alpha, eps=0.01):
    # Forward finite difference: (f(alpha + eps) - f(alpha)) / eps.
    f_x = my_expectation(obs, alpha)
    f_x_prime = my_expectation(obs, alpha + eps)
    return ((f_x_prime - f_x) / eps).real


print('Finite difference:', my_grad(pauli_x, my_alpha))
print('Cosine formula:   ', np.pi * np.cos(np.pi * my_alpha))
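The finite-difference estimate above only approximates the cosine formula, and how closely depends on the step size. The sketch below uses the closed form sin(πα) as a stand-in for the simulated expectation (an assumption made so the example runs without Cirq) and compares the forward difference used by `my_grad` with a central difference, which is shown here only for contrast.

```python
import math


def f(alpha):
    """Closed-form stand-in for the X expectation of the circuit."""
    return math.sin(math.pi * alpha)


def forward_difference(func, x, eps):
    """One-sided estimate, same form as my_grad."""
    return (func(x + eps) - func(x)) / eps


def central_difference(func, x, eps):
    """Two-sided estimate, included for comparison."""
    return (func(x + eps) - func(x - eps)) / (2 * eps)


alpha = 0.3
exact = math.pi * math.cos(math.pi * alpha)
for eps in (0.1, 0.01, 0.001):
    fwd_err = abs(forward_difference(f, alpha, eps) - exact)
    ctr_err = abs(central_difference(f, alpha, eps) - exact)
    print(f"eps={eps}: forward error={fwd_err:.2e}, central error={ctr_err:.2e}")
```

The forward error shrinks roughly in proportion to `eps`, while the central error shrinks roughly with `eps` squared, which is the usual behavior for these two schemes on a smooth function.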

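2. The need for a differentiator

With larger circuits, you will not always have a formula that precisely calculates the gradients of a given quantum circuit. When a simple formula is not enough to calculate the gradient, the tfq.differentiators.Differentiator class lets you define algorithms for computing the gradients of your circuits. For instance, you can recreate the above example in TensorFlow Quantum (TFQ) with: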

expectation_calculation = tfq.layers.Expectation(
    differentiator=tfq.differentiators.ForwardDifference(grid_spacing=0.01))

expectation_calculation(my_circuit,
                        operators=pauli_x,
                        symbol_names=['alpha'],
                        symbol_values=[[my_alpha]])

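However, if you switch to estimating the expectation from samples (as would happen on a real device), the values can change slightly. This means you now have an imperfect estimate: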

sampled_expectation_calculation = tfq.layers.SampledExpectation(
    differentiator=tfq.differentiators.ForwardDifference(grid_spacing=0.01))

sampled_expectation_calculation(my_circuit,
                                operators=pauli_x,
                                repetitions=500,
                                symbol_names=['alpha'],
                                symbol_values=[[my_alpha]])

This can quickly compound into a serious accuracy problem when it comes to gradients:

# Make input_points = [batch_size, 1] array.
input_points = np.linspace(0, 5, 200)[:, np.newaxis].astype(np.float32)
exact_outputs = expectation_calculation(my_circuit,
                                        operators=pauli_x,
                                        symbol_names=['alpha'],
                                        symbol_values=input_points)
imperfect_outputs = sampled_expectation_calculation(my_circuit,
                                                    operators=pauli_x,
                                                    repetitions=500,
                                                    symbol_names=['alpha'],
                                                    symbol_values=input_points)
plt.title('Forward Pass Values')
plt.xlabel('$x$')
plt.ylabel('$f(x)$')
plt.plot(input_points, exact_outputs, label='Analytic')
plt.plot(input_points, imperfect_outputs, label='Sampled')
plt.legend()

# Gradients are a much different story.
values_tensor = tf.convert_to_tensor(input_points)

with tf.GradientTape() as g:
    g.watch(values_tensor)
    exact_outputs = expectation_calculation(my_circuit,
                                            operators=pauli_x,
                                            symbol_names=['alpha'],
                                            symbol_values=values_tensor)
analytic_finite_diff_gradients = g.gradient(exact_outputs, values_tensor)

with tf.GradientTape() as g:
    g.watch(values_tensor)
    imperfect_outputs = sampled_expectation_calculation(
        my_circuit,
        operators=pauli_x,
        repetitions=500,
        symbol_names=['alpha'],
        symbol_values=values_tensor)
sampled_finite_diff_gradients = g.gradient(imperfect_outputs, values_tensor)

plt.title('Gradient Values')
plt.xlabel('$x$')
plt.ylabel('$f^{\'}(x)$')
plt.plot(input_points, analytic_finite_diff_gradients, label='Analytic')
plt.plot(input_points, sampled_finite_diff_gradients, label='Sampled')
plt.legend()
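A back-of-the-envelope estimate helps explain why the sampled gradient is so much noisier than the sampled forward pass. The sketch below assumes a simplified noise model that is not taken from TFQ: each repetition yields +1 or -1 with mean equal to the exact expectation sin(πα), and the two evaluations in the forward difference use independent samples. The function names are hypothetical.

```python
import math


def shot_noise_std(expectation, repetitions):
    """Std. dev. of the mean of `repetitions` independent +/-1 outcomes."""
    return math.sqrt((1.0 - expectation ** 2) / repetitions)


def forward_difference_noise_std(alpha, eps, repetitions):
    """Std. dev. of (f(alpha + eps) - f(alpha)) / eps under the assumed noise model."""
    f0 = math.sin(math.pi * alpha)
    f1 = math.sin(math.pi * (alpha + eps))
    variance_sum = (shot_noise_std(f0, repetitions) ** 2 +
                    shot_noise_std(f1, repetitions) ** 2)
    # Dividing by eps multiplies the standard deviation by 1 / eps.
    return math.sqrt(variance_sum) / eps


alpha, eps, repetitions = 0.3, 0.01, 500
value_noise = shot_noise_std(math.sin(math.pi * alpha), repetitions)
gradient_noise = forward_difference_noise_std(alpha, eps, repetitions)
print("noise on the expectation value:", value_noise)
print("noise on the forward difference:", gradient_noise)
print("largest possible exact gradient:", math.pi)
```

In this model the division by the grid spacing amplifies the shot noise by a factor of about 1/eps, so with 500 repetitions and a spacing of 0.01 the noise on the gradient is of the same order as the gradient itself.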

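Here you can see that although the finite difference formula computes the gradients quickly in the analytic case, it is far too noisy when used with sampling-based methods. More careful techniques are needed to ensure that a good gradient can be calculated. Next you will look at a much slower technique that is less well suited to analytic expectation gradient calculations, but performs much better in the real-world, sample-based case: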

# A smarter differentiation scheme.
gradient_safe_sampled_expectation = tfq.layers.SampledExpectation(
    differentiator=tfq.differentiators.ParameterShift())

with tf.GradientTape() as g:
    g.watch(values_tensor)
    imperfect_outputs = gradient_safe_sampled_expectation(
        my_circuit,
        operators=pauli_x,
        repetitions=500,
        symbol_names=['alpha'],
        symbol_values=values_tensor)

sampled_param_shift_gradients = g.gradient(imperfect_outputs, values_tensor)

plt.title('Gradient Values')
plt.xlabel('$x$')
plt.ylabel('$f^{\'}(x)$')
plt.plot(input_points, analytic_finite_diff_gradients, label='Analytic')
plt.plot(input_points, sampled_param_shift_gradients, label='Sampled')
plt.legend()
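The difference between the two schemes can be reproduced in a small Monte Carlo experiment that does not need TFQ. This is a sketch under stated assumptions: each repetition is modeled as a +1 or -1 outcome with mean sin(πα), the parameter shift estimate uses the two-term rule with shifts of ±1/2 and weights ±π/2 for this particular circuit, and every helper name is hypothetical.

```python
import math
import random


def sampled_expectation(alpha, repetitions, rng):
    """Average of +/-1 outcomes whose mean is sin(pi * alpha) (assumed noise model)."""
    p_plus = (1.0 + math.sin(math.pi * alpha)) / 2.0
    total = 0
    for _ in range(repetitions):
        total += 1 if rng.random() < p_plus else -1
    return total / repetitions


def forward_difference_grad(alpha, repetitions, rng, eps=0.01):
    """Forward difference built from two independently sampled expectations."""
    return (sampled_expectation(alpha + eps, repetitions, rng) -
            sampled_expectation(alpha, repetitions, rng)) / eps


def parameter_shift_grad(alpha, repetitions, rng):
    """Two-term parameter shift estimate for this single-gate circuit."""
    return (math.pi / 2) * (sampled_expectation(alpha + 0.5, repetitions, rng) -
                            sampled_expectation(alpha - 0.5, repetitions, rng))


def rms_error(grad_fn, alpha, repetitions, trials, rng):
    """Root-mean-square deviation from the exact gradient pi * cos(pi * alpha)."""
    exact = math.pi * math.cos(math.pi * alpha)
    squared = [(grad_fn(alpha, repetitions, rng) - exact) ** 2 for _ in range(trials)]
    return math.sqrt(sum(squared) / trials)


rng = random.Random(0)  # fixed seed so the run is repeatable
fd_rms = rms_error(forward_difference_grad, 0.3, 500, 200, rng)
ps_rms = rms_error(parameter_shift_grad, 0.3, 500, 200, rng)
print("forward difference RMS error:", fd_rms)
print("parameter shift RMS error:   ", ps_rms)
```

Both estimators use two sampled expectations per gradient in this toy setting. The parameter shift estimate avoids dividing by a small step, so the shot noise is scaled by π/2 rather than by 1/eps.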

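From the above you can see that certain differentiators are best suited to particular research scenarios. In general, the slower sample-based methods that are robust to device noise and similar effects are good differentiators when testing or implementing algorithms in a more "real world" setting. Faster methods such as finite difference are well suited to analytic calculations where you want higher throughput but are not yet concerned with whether your algorithm is viable on a device.

3. Multiple observables

Let's introduce a second observable and see how TensorFlow Quantum supports multiple observables for a single circuit.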

pauli_z = cirq.Z(qubit)
pauli_z

If this observable is used with the same circuit as before, then $f_{2}(\alpha) = ⟨Y(\alpha)| Z | Y(\alpha)⟩ = \cos(\pi \alpha)$ and $f_{2}^{'}(\alpha) = -\pi \sin(\pi \alpha)$. Perform a quick check:

test_value = 0.

print('Finite difference:', my_grad(pauli_z, test_value))
print('Sin formula:      ', -np.pi * np.sin(np.pi * test_value))

It's a match (close enough).
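The small gap between the two printed values is what a Taylor expansion predicts for a forward difference. As a sketch, using the closed form cos(πα) in place of the simulated expectation (an assumption made so the example runs without Cirq), the leading error term is (eps / 2) times the second derivative:

```python
import math


def f2(alpha):
    """Closed-form stand-in for the Z expectation of the circuit."""
    return math.cos(math.pi * alpha)


eps = 0.01
test_value = 0.0

finite_difference = (f2(test_value + eps) - f2(test_value)) / eps
exact_gradient = -math.pi * math.sin(math.pi * test_value)

# Leading truncation error of a forward difference: (eps / 2) * f''(x).
second_derivative = -math.pi ** 2 * math.cos(math.pi * test_value)
leading_error = 0.5 * eps * second_derivative

print("finite difference:      ", finite_difference)
print("exact gradient:         ", exact_gradient)
print("predicted leading error:", leading_error)
```

The finite-difference value differs from the exact gradient by almost exactly the predicted leading term, so the mismatch comes from the grid spacing rather than from the formula.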

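Now if you define $g(\alpha) = f_{1}(\alpha) + f_{2}(\alpha)$, then $g'(\alpha) = f_{1}^{'}(\alpha) + f_{2}^{'}(\alpha)$. Defining more than one observable for a circuit in TensorFlow Quantum is equivalent to adding more terms to $g$.

This means that the gradient of a particular symbol in a circuit is equal to the sum of the gradients of each observable with respect to that symbol. This is compatible with TensorFlow gradient computation and backpropagation, where the sum of the gradients over all observables is given as the gradient for a particular symbol.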

sum_of_outputs = tfq.layers.Expectation(
    differentiator=tfq.differentiators.ForwardDifference(grid_spacing=0.01))

sum_of_outputs(my_circuit,
               operators=[pauli_x, pauli_z],
               symbol_names=['alpha'],
               symbol_values=[[test_value]])

Here you can see that the first entry is the expectation with respect to Pauli X and the second is the expectation with respect to Pauli Z. Now take the gradient:

test_value_tensor = tf.convert_to_tensor([[test_value]])

with tf.GradientTape() as g:
    g.watch(test_value_tensor)
    outputs = sum_of_outputs(my_circuit,
                             operators=[pauli_x, pauli_z],
                             symbol_names=['alpha'],
                             symbol_values=test_value_tensor)

sum_of_gradients = g.gradient(outputs, test_value_tensor)

print(my_grad(pauli_x, test_value) + my_grad(pauli_z, test_value))
print(sum_of_gradients.numpy())

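Here you have verified that the sum of the gradients for each observable is indeed the gradient of $\alpha$. This behavior is supported by all TensorFlow Quantum differentiators and plays a crucial role in compatibility with the rest of TensorFlow.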
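The additivity used here follows from the linearity of differentiation, and it can be checked without TFQ. The sketch below assumes the closed forms sin(πα) and cos(πα) for the two expectations and uses a central difference as a generic numerical derivative. The helper names are hypothetical.

```python
import math


def f1(alpha):
    """Closed-form stand-in for the X expectation."""
    return math.sin(math.pi * alpha)


def f2(alpha):
    """Closed-form stand-in for the Z expectation."""
    return math.cos(math.pi * alpha)


def g(alpha):
    """Sum over both observables, as produced by summing the layer outputs."""
    return f1(alpha) + f2(alpha)


def numerical_grad(func, x, eps=1e-4):
    """Central difference used as a generic numerical derivative."""
    return (func(x + eps) - func(x - eps)) / (2 * eps)


alpha = 0.3
sum_of_grads = numerical_grad(f1, alpha) + numerical_grad(f2, alpha)
grad_of_sum = numerical_grad(g, alpha)
exact = math.pi * math.cos(math.pi * alpha) - math.pi * math.sin(math.pi * alpha)

print("sum of per-observable gradients:", sum_of_grads)
print("gradient of the summed output:  ", grad_of_sum)
print("analytic value:                 ", exact)
```

All three values agree to within the accuracy of the numerical derivative, which mirrors what the gradient tape returns when several observables share one symbol.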

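4. Advanced usage

All differentiators in TensorFlow Quantum subclass tfq.differentiators.Differentiator. To implement a differentiator, you must implement one of two interfaces. The standard approach is to implement get_gradient_circuits, which tells the base class which circuits to measure to obtain an estimate of the gradient. Alternatively, you can override differentiate_analytic and differentiate_sampled; the class tfq.differentiators.Adjoint takes this route.

The following uses TensorFlow Quantum to implement the gradient of a circuit. You will use a small example of parameter shifting.

Recall the circuit you defined above, $|\alpha⟩ = Y^{\alpha}|0⟩$. As before, you can define a function as the expectation value of this circuit against the $X$ observable, $f(\alpha) = ⟨\alpha|X|\alpha⟩$. Using parameter shift rules for this circuit, you can find that the derivative is

$$\frac{\partial}{\partial \alpha} f(\alpha) = \frac{\pi}{2} f\left(\alpha + \frac{1}{2}\right) - \frac{\pi}{2} f\left(\alpha - \frac{1}{2}\right)$$

The get_gradient_circuits function returns the components of this derivative.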
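As a quick consistency check (a sketch that assumes the closed form $f(\alpha) = \sin(\pi\alpha)$ from section 1), substituting it into the shift rule recovers the analytic derivative $\pi\cos(\pi\alpha)$:

```latex
\begin{aligned}
\frac{\pi}{2} f\left(\alpha + \tfrac{1}{2}\right) - \frac{\pi}{2} f\left(\alpha - \tfrac{1}{2}\right)
  &= \frac{\pi}{2}\left[\sin\left(\pi\alpha + \tfrac{\pi}{2}\right) - \sin\left(\pi\alpha - \tfrac{\pi}{2}\right)\right] \\
  &= \frac{\pi}{2}\left[\cos(\pi\alpha) + \cos(\pi\alpha)\right] \\
  &= \pi\cos(\pi\alpha)
\end{aligned}
```

Unlike a finite difference, this identity is exact for this circuit, so the only error in a sampled estimate comes from the measurements themselves.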

class MyDifferentiator(tfq.differentiators.Differentiator):
    """A Toy differentiator for <Y^alpha | X |Y^alpha>."""

    def __init__(self):
        pass

    def get_gradient_circuits(self, programs, symbol_names, symbol_values):
        """Return circuits to compute gradients for given forward pass circuits.

        Every gradient on a quantum computer can be computed via measurements
        of transformed quantum circuits.  Here, you implement a custom gradient
        for a specific circuit.  For a real differentiator, you will need to
        implement this function in a more general way.  See the differentiator
        implementations in the TFQ library for examples.
        """

        # The two terms in the derivative are the same circuit...
        batch_programs = tf.stack([programs, programs], axis=1)

        # ... with shifted parameter values.
        shift = tf.constant(1/2)
        forward = symbol_values + shift
        backward = symbol_values - shift
        batch_symbol_values = tf.stack([forward, backward], axis=1)

        # Weights are the coefficients of the terms in the derivative.
        num_program_copies = tf.shape(batch_programs)[0]
        batch_weights = tf.tile(tf.constant([[[np.pi/2, -np.pi/2]]]),
                                [num_program_copies, 1, 1])

        # The index map simply says which weights go with which circuits.
        batch_mapper = tf.tile(
            tf.constant([[[0, 1]]]), [num_program_copies, 1, 1])

        return (batch_programs, symbol_names, batch_symbol_values,
                batch_weights, batch_mapper)

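The Differentiator base class uses the components returned from get_gradient_circuits to calculate the derivative, as in the parameter shift formula above. This new differentiator can now be used with existing tfq.layer objects: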

custom_dif = MyDifferentiator()
custom_grad_expectation = tfq.layers.Expectation(differentiator=custom_dif)

# Now let's get the gradients with finite diff.
with tf.GradientTape() as g:
    g.watch(values_tensor)
    exact_outputs = expectation_calculation(my_circuit,
                                            operators=[pauli_x],
                                            symbol_names=['alpha'],
                                            symbol_values=values_tensor)

analytic_finite_diff_gradients = g.gradient(exact_outputs, values_tensor)

# Now let's get the gradients with custom diff.
with tf.GradientTape() as g:
    g.watch(values_tensor)
    my_outputs = custom_grad_expectation(my_circuit,
                                         operators=[pauli_x],
                                         symbol_names=['alpha'],
                                         symbol_values=values_tensor)

my_gradients = g.gradient(my_outputs, values_tensor)

plt.subplot(1, 2, 1)
plt.title('Exact Gradient')
plt.plot(input_points, analytic_finite_diff_gradients.numpy())
plt.xlabel('x')
plt.ylabel('f(x)')
plt.subplot(1, 2, 2)
plt.title('My Gradient')
plt.plot(input_points, my_gradients.numpy())
plt.xlabel('x')
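Conceptually, the shifted values, weights, and index map returned by get_gradient_circuits describe a weighted sum of expectation values. The sketch below illustrates that recombination in plain Python for a single value of alpha. It is not TFQ's implementation: sin(πα) stands in for measuring the shifted circuit, and the function names are hypothetical.

```python
import math


def expectation_x(alpha):
    """Stand-in for measuring <X> on the circuit at parameter value alpha."""
    return math.sin(math.pi * alpha)


def gradient_components(alpha):
    """Mirror of the toy differentiator's outputs for one parameter value."""
    shifted_values = [alpha + 0.5, alpha - 0.5]  # forward and backward shifts
    weights = [math.pi / 2, -math.pi / 2]        # coefficients of each term
    mapper = [0, 1]                              # weight k applies to circuit mapper[k]
    return shifted_values, weights, mapper


def combine(alpha):
    """Weighted sum of the shifted expectations, i.e. the parameter shift formula."""
    shifted_values, weights, mapper = gradient_components(alpha)
    expectations = [expectation_x(value) for value in shifted_values]
    return sum(weight * expectations[index]
               for weight, index in zip(weights, mapper))


alpha = 0.3
print("recombined gradient:", combine(alpha))
print("analytic gradient:  ", math.pi * math.cos(math.pi * alpha))
```

Keeping the weights and the index map separate from the circuits is what would let one measured circuit contribute to several terms, although this toy example has exactly one weight per circuit.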

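This new differentiator can now be used to generate differentiable ops.

Key Point: A differentiator that has previously been attached to an op must be refreshed before it is attached to a new op, because a differentiator may only be attached to one op at a time.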

# Create a noisy sample based expectation op.
expectation_sampled = tfq.get_sampled_expectation_op(
    cirq.DensityMatrixSimulator(noise=cirq.depolarize(0.01)))

# Make it differentiable with your differentiator:
# Remember to refresh the differentiator before attaching the new op
custom_dif.refresh()
differentiable_op = custom_dif.generate_differentiable_op(
    sampled_op=expectation_sampled)

# Prep op inputs.
circuit_tensor = tfq.convert_to_tensor([my_circuit])
op_tensor = tfq.convert_to_tensor([[pauli_x]])
single_value = tf.convert_to_tensor([[my_alpha]])
num_samples_tensor = tf.convert_to_tensor([[5000]])

with tf.GradientTape() as g:
    g.watch(single_value)
    forward_output = differentiable_op(circuit_tensor, ['alpha'], single_value,
                                       op_tensor, num_samples_tensor)

my_gradients = g.gradient(forward_output, single_value)

print('---TFQ---')
print('Forward: ', forward_output.numpy())
print('Gradient:', my_gradients.numpy())
print('---Original---')
print('Forward: ', my_expectation(pauli_x, my_alpha))
print('Gradient:', my_grad(pauli_x, my_alpha))

Success: You can now use all the differentiators that TensorFlow Quantum has to offer, and define your own.