GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/ja/quantum/tutorials/barren_plateaus.ipynb
²⁵¹¹⁸ views

Kernel: Python 3

Copyright 2020 The TensorFlow Authors.

In [ ]:

#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

不毛の高原

この例では、量子ニューラルネットワーク構造だけでは学習に不十分だという McClean、2019 の研究結果について見ていきます。特に、特定の大規模なランダム量子回路のファミリでは勾配が消えてしまうため量子ニューラルネットワークが優れた機能を発揮しないことがわかります。この例では、特定の学習問題のモデルをトレーニングするのではなく、勾配の動作を理解することに焦点を当てます。

セットアップ

In [ ]:

!pip install tensorflow==2.7.0

TensorFlow Quantum をインストールします。

In [ ]:

!pip install tensorflow-quantum==0.7.2

In [ ]:

# Update package resources to account for version changes.
import importlib, pkg_resources
importlib.reload(pkg_resources)

次に、TensorFlow とモジュールの依存関係をインポートします。

In [ ]:

import tensorflow as tf
import tensorflow_quantum as tfq

import cirq
import sympy
import numpy as np

# visualization tools
%matplotlib inline
import matplotlib.pyplot as plt
from cirq.contrib.svg import SVGCircuit

np.random.seed(1234)

1. 概要

以下は多くのブロックを持つランダム量子回路です ( $R_{P}(\theta)$ はランダムなパウリ回転です):

$f(x)$ が任意の量子ビット $a$ と $b$ の期待値 w.r.t $Z_{a}Z_{b}$ として定義されている場合、 $f'(x)$ の平均が 0 に非常に近くなり、あまり変化しないという問題があります。以下に示すとおりです。

2. ランダム回路の生成

論文からの構成は簡単に生成できです。以下は、ランダムな量子回路を生成する簡単な関数を実装します。これは量子ニューラルネットワーク（QNN）と呼ばれ、量子ビットのセットの深さを指定します。

In [ ]:

def generate_random_qnn(qubits, symbol, depth):
    """Generate random QNN's with the same structure from McClean et al."""
    circuit = cirq.Circuit()
    for qubit in qubits:
        circuit += cirq.ry(np.pi / 4.0)(qubit)

    for d in range(depth):
        # Add a series of single qubit rotations.
        for i, qubit in enumerate(qubits):
            random_n = np.random.uniform()
            random_rot = np.random.uniform(
            ) * 2.0 * np.pi if i != 0 or d != 0 else symbol
            if random_n > 2. / 3.:
                # Add a Z.
                circuit += cirq.rz(random_rot)(qubit)
            elif random_n > 1. / 3.:
                # Add a Y.
                circuit += cirq.ry(random_rot)(qubit)
            else:
                # Add a X.
                circuit += cirq.rx(random_rot)(qubit)

        # Add CZ ladder.
        for src, dest in zip(qubits, qubits[1:]):
            circuit += cirq.CZ(src, dest)

    return circuit


generate_random_qnn(cirq.GridQubit.rect(1, 3), sympy.Symbol('theta'), 2)

ここでは、1 つのパラメータ $\theta_{1,1}$ の勾配を調査します。 $\theta_{1,1}$ が存在する回路にsympy.Symbolを配置します。回路内の他のシンボルの統計は分析しないので、ここでランダムな値に置き換えます。

3. 回路の実行

勾配の変化が少ないという主張を調査するために、オブザーバブル、および、いくつかの回路を生成します。まず、ランダム回路のバッチを生成します。ランダムな ZZ オブザーバブルを選択し、TensorFlowQuantum を使用して勾配と分散をバッチ計算します。

3.1 バッチ分散計算

回路のバッチ全体で特定のオブザーバブルの勾配の分散を計算するヘルパー関数を作成します。

In [ ]:

def process_batch(circuits, symbol, op):
    """Compute the variance of a batch of expectations w.r.t. op on each circuit that 
    contains `symbol`. Note that this method sets up a new compute graph every time it is
    called so it isn't as performant as possible."""

    # Setup a simple layer to batch compute the expectation gradients.
    expectation = tfq.layers.Expectation()

    # Prep the inputs as tensors
    circuit_tensor = tfq.convert_to_tensor(circuits)
    values_tensor = tf.convert_to_tensor(
        np.random.uniform(0, 2 * np.pi, (n_circuits, 1)).astype(np.float32))

    # Use TensorFlow GradientTape to track gradients.
    with tf.GradientTape() as g:
        g.watch(values_tensor)
        forward = expectation(circuit_tensor,
                              operators=op,
                              symbol_names=[symbol],
                              symbol_values=values_tensor)

    # Return variance of gradients across all circuits.
    grads = g.gradient(forward, values_tensor)
    grad_var = tf.math.reduce_std(grads, axis=0)
    return grad_var.numpy()[0]

3.1 セットアップして実行する

生成するランダム回路の数、および、それらに対する量子ビットの深さおよび量を選択します。次に、結果をプロットします。

In [ ]:

n_qubits = [2 * i for i in range(2, 7)
           ]  # Ranges studied in paper are between 2 and 24.
depth = 50  # Ranges studied in paper are between 50 and 500.
n_circuits = 200
theta_var = []

for n in n_qubits:
    # Generate the random circuits and observable for the given n.
    qubits = cirq.GridQubit.rect(1, n)
    symbol = sympy.Symbol('theta')
    circuits = [
        generate_random_qnn(qubits, symbol, depth) for _ in range(n_circuits)
    ]
    op = cirq.Z(qubits[0]) * cirq.Z(qubits[1])
    theta_var.append(process_batch(circuits, symbol, op))

plt.semilogy(n_qubits, theta_var)
plt.title('Gradient Variance in QNNs')
plt.xlabel('n_qubits')
plt.xticks(n_qubits)
plt.ylabel('$\\partial \\theta$ variance')
plt.show()

このプロットは、量子機械学習の問題では、ランダムな QNN 仮説を単純に推測しても、最良の結果を期待することはできないことを示しています。学習が発生する可能性のある点まで勾配を変化させるには、モデル回路に何らかの構造が存在する必要があります。

4. ヒューリスティック

Grant、2019 による興味深いヒューリスティックによると、ランダムに非常に近い状態（完全にランダムではない）で開始できます。McClean et al. と同じ回路を使用して、著者は、「不毛のプラトー」を回避するために、古典的な制御パラメータに対して異なる初期化手法を提案します。初期化手法は、完全にランダムな制御パラメータで一部のレイヤーを開始しますが、その直後のレイヤーでは、最初のいくつかのレイヤーによって行われた最初の変換が取り消されるようにパラメータを選択します。著者はこれをアイデンティティブロックと呼んでいます。

このヒューリスティックの利点は、1 つのパラメータを変更するだけで、その時点のブロック以外の他のすべてのブロックがそのままになり、勾配信号は以前よりはるかに強くなります。これにより、ユーザーは、強い勾配信号を取得するために変更する変数とブロックを選択できます。このヒューリスティックは、トレーニングフェーズ中にユーザーが「不毛の高原」に陥るのを防ぐことはできません（完全な同時更新を制限します）。「高原」の外から開始できることを保証するだけです。

4.1 新しい QNN の構築

次に、アイデンティティブロック QNN を生成する関数を作成します。この実装は、論文の実装とは少し異なります。ここでは、1 つのパラメータの勾配の振る舞いを見て、McClean et al と一致するようにして、いくつかの簡略化を行います。

アイデンティティブロックを生成してモデルをトレーニングするには、通常、 $U1(\theta_1) U1(\theta_1)^{\dagger}$ ではなく $U1(\theta_{1a}) U1(\theta_{1b})^{\dagger}$ が必要です。そうしないと、トレーニング後も常にアイデンティティを取得します。アイデンティティブロックの数の選択は経験に基づいています。ブロックが深いほど、ブロックの中央の分散は小さくなります。ただし、ブロックの開始時と終了時では、パラメータ勾配の分散が大きい必要があります。

In [ ]:

def generate_identity_qnn(qubits, symbol, block_depth, total_depth):
    """Generate random QNN's with the same structure from Grant et al."""
    circuit = cirq.Circuit()

    # Generate initial block with symbol.
    prep_and_U = generate_random_qnn(qubits, symbol, block_depth)
    circuit += prep_and_U

    # Generate dagger of initial block without symbol.
    U_dagger = (prep_and_U[1:])**-1
    circuit += cirq.resolve_parameters(
        U_dagger, param_resolver={symbol: np.random.uniform() * 2 * np.pi})

    for d in range(total_depth - 1):
        # Get a random QNN.
        prep_and_U_circuit = generate_random_qnn(
            qubits,
            np.random.uniform() * 2 * np.pi, block_depth)

        # Remove the state-prep component
        U_circuit = prep_and_U_circuit[1:]

        # Add U
        circuit += U_circuit

        # Add U^dagger
        circuit += U_circuit**-1

    return circuit


generate_identity_qnn(cirq.GridQubit.rect(1, 3), sympy.Symbol('theta'), 2, 2)

4.2 比較

ここでは、ヒューリスティックが勾配の分散がすぐに消えないようにするのに役立つことがわかります。

In [ ]:

block_depth = 10
total_depth = 5

heuristic_theta_var = []

for n in n_qubits:
    # Generate the identity block circuits and observable for the given n.
    qubits = cirq.GridQubit.rect(1, n)
    symbol = sympy.Symbol('theta')
    circuits = [
        generate_identity_qnn(qubits, symbol, block_depth, total_depth)
        for _ in range(n_circuits)
    ]
    op = cirq.Z(qubits[0]) * cirq.Z(qubits[1])
    heuristic_theta_var.append(process_batch(circuits, symbol, op))

plt.semilogy(n_qubits, theta_var)
plt.semilogy(n_qubits, heuristic_theta_var)
plt.title('Heuristic vs. Random')
plt.xlabel('n_qubits')
plt.xticks(n_qubits)
plt.ylabel('$\\partial \\theta$ variance')
plt.show()

これは、（ほぼ）ランダムな QNN からより強い勾配信号を取得する上での大きな改善です。