GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/ko/quantum/tutorials/quantum_data.ipynb

Kernel: Python 3

Copyright 2020 The TensorFlow Authors.

In [1]:

#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

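Quantum data

Building on the comparisons made in the MNIST tutorial, this tutorial explores the recent work of Huang et al., which shows how different datasets affect performance comparisons. In that work, the authors seek to understand how and when classical machine learning models can learn as well as (or better than) quantum models. The work also showcases an empirical performance separation between classical and quantum machine learning models via a carefully crafted dataset. You will:

Prepare a reduced-dimension Fashion-MNIST dataset.
Use quantum circuits to relabel the dataset and compute Projected Quantum Kernel (PQK) features.
Train a classical neural network on the relabeled dataset and compare its performance with a model that has access to the PQK features.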

Setup

In [2]:

!pip install tensorflow==2.7.0 tensorflow-quantum==0.7.2

Out[2]:

WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.

In [ ]:

# Update package resources to account for version changes.
import importlib, pkg_resources
importlib.reload(pkg_resources)

In [3]:

import cirq
import sympy
import numpy as np
import tensorflow as tf
import tensorflow_quantum as tfq

# visualization tools
%matplotlib inline
import matplotlib.pyplot as plt
from cirq.contrib.svg import SVGCircuit
np.random.seed(1234)

1. Data preparation

You will begin by preparing the fashion-MNIST dataset for running on a quantum computer.

1.1 Download fashion-MNIST

The first step is to get the traditional fashion-mnist dataset. This can be done using the tf.keras.datasets module.

In [4]:

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()

# Rescale the images from [0,255] to the [0.0,1.0] range.
x_train, x_test = x_train/255.0, x_test/255.0

print("Number of original training examples:", len(x_train))
print("Number of original test examples:", len(x_test))

Out[4]:

Number of original training examples: 60000
Number of original test examples: 10000

Filter the dataset to keep only the T-shirts/tops and dresses, removing the other classes. At the same time, convert the label y to a boolean: True for 0 and False for 3.

In [5]:

def filter_03(x, y):
    keep = (y == 0) | (y == 3)
    x, y = x[keep], y[keep]
    y = y == 0
    return x,y

In [6]:

x_train, y_train = filter_03(x_train, y_train)
x_test, y_test = filter_03(x_test, y_test)

print("Number of filtered training examples:", len(x_train))
print("Number of filtered test examples:", len(x_test))

Out[6]:

Number of filtered training examples: 12000
Number of filtered test examples: 2000
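
As a quick sanity check of the masking logic, the following minimal sketch applies the same boolean-mask idea to a toy array. The names `toy_x` and `toy_y` and their values are made up for illustration and are not part of the dataset.

```python
import numpy as np

# Hypothetical toy data: six "images" (plain scalars here) with class labels.
toy_x = np.array([10, 11, 12, 13, 14, 15])
toy_y = np.array([0, 3, 5, 0, 9, 3])

# Keep only classes 0 and 3, using the same mask expression as filter_03.
keep = (toy_y == 0) | (toy_y == 3)
kept_x, kept_y = toy_x[keep], toy_y[keep]

# Convert labels to booleans: True for class 0, False for class 3.
bool_y = kept_y == 0

print(kept_x)
print(bool_y)
```

Because `x` and `y` are indexed with the same mask, each remaining example stays aligned with its label.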

In [7]:

print(y_train[0])

plt.imshow(x_train[0, :, :])
plt.colorbar()

Out[7]:

True

<matplotlib.colorbar.Colorbar at 0x7f6db42c3460>

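1.2 Downscale the images

Just like in the MNIST example, you will need to downscale these images to stay within the limits of current quantum computers. This time, however, you will use a PCA transformation to reduce the dimensions instead of a tf.image.resize operation.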

In [8]:

def truncate_x(x_train, x_test, n_components=10):
  """Perform PCA on image dataset keeping the top `n_components` components."""
  n_points_train = tf.gather(tf.shape(x_train), 0)
  n_points_test = tf.gather(tf.shape(x_test), 0)

  # Flatten to 1D
  x_train = tf.reshape(x_train, [n_points_train, -1])
  x_test = tf.reshape(x_test, [n_points_test, -1])

  # Normalize.
  feature_mean = tf.reduce_mean(x_train, axis=0)
  x_train_normalized = x_train - feature_mean
  x_test_normalized = x_test - feature_mean

  # Truncate.
  e_values, e_vectors = tf.linalg.eigh(
      tf.einsum('ji,jk->ik', x_train_normalized, x_train_normalized))
  return tf.einsum('ij,jk->ik', x_train_normalized, e_vectors[:,-n_components:]), \
    tf.einsum('ij,jk->ik', x_test_normalized, e_vectors[:, -n_components:])

In [9]:

DATASET_DIM = 10
x_train, x_test = truncate_x(x_train, x_test, n_components=DATASET_DIM)
print('New datapoint dimension:', len(x_train[0]))

Out[9]:

New datapoint dimension: 10
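
The eigendecomposition route used in `truncate_x` can be cross-checked on a small example. The sketch below is a NumPy illustration under stated assumptions: the random matrix `x` is a hypothetical stand-in for the flattened images. It shows why the last columns of the eigenvector matrix are the ones to keep.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for flattened images: 50 points, 6 features.
x = rng.normal(size=(50, 6)) @ rng.normal(size=(6, 6))
x_centered = x - x.mean(axis=0)

# Eigendecomposition of the (unnormalized) covariance X^T X.
# eigh returns eigenvalues in ascending order, so the top
# components are the last columns of e_vectors.
e_values, e_vectors = np.linalg.eigh(x_centered.T @ x_centered)
n_components = 2
projected = x_centered @ e_vectors[:, -n_components:]

# Cross-check against the SVD: squared singular values equal the eigenvalues.
singular_values = np.linalg.svd(x_centered, compute_uv=False)
top_from_svd = np.sort(singular_values ** 2)[-n_components:]

print(projected.shape)
print(np.allclose(e_values[-n_components:], top_from_svd))
```

The squared length of each projected column equals the corresponding eigenvalue, which is one way to confirm that the retained directions carry the most variance.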

The last step is to reduce the size of the dataset to just 1000 training datapoints and 200 testing datapoints.

In [10]:

N_TRAIN = 1000
N_TEST = 200
x_train, x_test = x_train[:N_TRAIN], x_test[:N_TEST]
y_train, y_test = y_train[:N_TRAIN], y_test[:N_TEST]

In [11]:

print("New number of training examples:", len(x_train))
print("New number of test examples:", len(x_test))

Out[11]:

New number of training examples: 1000
New number of test examples: 200

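2. Relabeling and computing PQK features

You will now prepare a "stilted" quantum dataset by incorporating quantum components and relabeling the truncated fashion-MNIST dataset created above. To get the most separation between quantum and classical methods, you will first prepare the PQK features and then relabel the outputs based on their values.

2.1 Quantum encoding and PQK features

You will create a new set of features, based on x_train, y_train, x_test and y_test, defined as the 1-RDM on all qubits of:

$V(x_{\text{train}} / n_{\text{trotter}}) ^ {n_{\text{trotter}}} U_{\text{1qb}} | 0 \rangle$

where $U_\text{1qb}$ is a wall of single-qubit rotations and $V(\hat{\theta}) = e^{-i\sum_i \hat{\theta_i} (X_i X_{i+1} + Y_i Y_{i+1} + Z_i Z_{i+1})}$

First, you can generate the wall of single-qubit rotations: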

In [12]:

def single_qubit_wall(qubits, rotations):
  """Prepare a single qubit X,Y,Z rotation wall on `qubits`."""
  wall_circuit = cirq.Circuit()
  for i, qubit in enumerate(qubits):
    for j, gate in enumerate([cirq.X, cirq.Y, cirq.Z]):
      wall_circuit.append(gate(qubit) ** rotations[i][j])

  return wall_circuit

You can quickly verify that this works by looking at the circuit:

In [13]:

SVGCircuit(single_qubit_wall(
    cirq.GridQubit.rect(1,4), np.random.uniform(size=(4, 3))))

Out[13]:
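
To see what a fractional gate power such as `cirq.X(qubit) ** 0.5` does, here is a minimal NumPy sketch. It assumes Cirq's convention for Pauli powers: the +1 eigenspace is left unchanged and the -1 eigenspace picks up a phase of exp(i·π·t). The helper `pauli_power` is hypothetical and written only for this illustration.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def pauli_power(pauli, t):
  """Return pauli ** t under the assumed eigenvalue-phase convention."""
  plus = (I2 + pauli) / 2   # projector onto the +1 eigenspace
  minus = (I2 - pauli) / 2  # projector onto the -1 eigenspace
  return plus + np.exp(1j * np.pi * t) * minus

half_x = pauli_power(X, 0.5)
# Applying the half power twice should reproduce the full gate.
print(np.allclose(half_x @ half_x, X))
# An exponent of 1 should reproduce the Pauli matrix itself.
print(np.allclose(pauli_power(Z, 1.0), Z))
```

Under this convention each entry of `rotations` is a rotation by that fraction of a half turn (up to a global phase), so the wall applies a tunable X, Y and Z rotation to every qubit.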

Next, you can prepare $V(\hat{\theta})$ with the help of tfq.util.exponential, which can exponentiate any commuting cirq.PauliSum objects:

In [14]:

def v_theta(qubits):
  r"""Prepares a circuit that generates V(\theta)."""
  ref_paulis = [
      cirq.X(q0) * cirq.X(q1) + \
      cirq.Y(q0) * cirq.Y(q1) + \
      cirq.Z(q0) * cirq.Z(q1) for q0, q1 in zip(qubits, qubits[1:])
  ]
  exp_symbols = list(sympy.symbols('ref_0:'+str(len(ref_paulis))))
  return tfq.util.exponential(ref_paulis, exp_symbols), exp_symbols

This circuit might be a little harder to verify by inspection, but you can still examine the two-qubit case to see what is happening:

In [15]:

test_circuit, test_symbols = v_theta(cirq.GridQubit.rect(1, 2))
print(f'Symbols found in circuit:{test_symbols}')
SVGCircuit(test_circuit)

Out[15]:

Symbols found in circuit:[ref_0]
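
The two-qubit case can also be checked numerically. The following NumPy sketch builds the matrix exp(-iθ(XX + YY + ZZ)) directly and compares it with a closed form that follows from the identity XX + YY + ZZ = 2·SWAP − I. The helper `expm_hermitian` and the value of `theta` are chosen here for illustration; this is not how tfq.util.exponential is implemented.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def expm_hermitian(h, theta):
  """Return exp(-i * theta * h) for a Hermitian matrix h."""
  w, v = np.linalg.eigh(h)
  return (v * np.exp(-1j * theta * w)) @ v.conj().T

heisenberg = np.kron(X, X) + np.kron(Y, Y) + np.kron(Z, Z)
theta = 0.37  # arbitrary illustrative angle
v_theta_matrix = expm_hermitian(heisenberg, theta)

# Closed form: since SWAP @ SWAP = I,
# exp(-i theta (2 SWAP - I)) = e^{i theta} (cos(2 theta) I - i sin(2 theta) SWAP).
swap = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=complex)
closed_form = np.exp(1j * theta) * (
    np.cos(2 * theta) * np.eye(4) - 1j * np.sin(2 * theta) * swap)

print(np.allclose(v_theta_matrix, closed_form))
```

In other words, for a single pair of qubits $V(\hat{\theta})$ acts as a partial swap whose strength is set by the angle.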

Now you have all the building blocks you need to put your full encoding circuits together:

In [16]:

def prepare_pqk_circuits(qubits, classical_source, n_trotter=10):
  """Prepare the pqk feature circuits around a dataset."""
  n_qubits = len(qubits)
  n_points = len(classical_source)

  # Prepare random single qubit rotation wall.
  random_rots = np.random.uniform(-2, 2, size=(n_qubits, 3))
  initial_U = single_qubit_wall(qubits, random_rots)

  # Prepare parametrized V
  V_circuit, symbols = v_theta(qubits)
  exp_circuit = cirq.Circuit(V_circuit for t in range(n_trotter))
  
  # Convert to `tf.Tensor`
  initial_U_tensor = tfq.convert_to_tensor([initial_U])
  initial_U_splat = tf.tile(initial_U_tensor, [n_points])

  full_circuits = tfq.layers.AddCircuit()(
      initial_U_splat, append=exp_circuit)
  # Replace placeholders in circuits with values from `classical_source`.
  return tfq.resolve_parameters(
      full_circuits, tf.convert_to_tensor([str(x) for x in symbols]),
      tf.convert_to_tensor(classical_source*(n_qubits/3)/n_trotter))
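
The reason for repeating the circuit `n_trotter` times with angles divided by `n_trotter` can be illustrated with a small NumPy sketch. It assumes that one copy of the circuit applies the exponential of each neighbouring-pair term in sequence; because neighbouring terms do not commute, a single pass only approximates the exponential of the full sum, and the approximation improves as the number of steps grows. The angles in `thetas` and all helper names are hypothetical.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def heisenberg_term(n_qubits, i):
  """XX + YY + ZZ acting on neighbouring qubits i and i + 1."""
  total = np.zeros((2 ** n_qubits, 2 ** n_qubits), dtype=complex)
  for pauli in (X, Y, Z):
    factors = [I2] * n_qubits
    factors[i] = pauli
    factors[i + 1] = pauli
    op = factors[0]
    for f in factors[1:]:
      op = np.kron(op, f)
    total += op
  return total

def expm_hermitian(h, theta):
  """Return exp(-i * theta * h) for a Hermitian matrix h."""
  w, v = np.linalg.eigh(h)
  return (v * np.exp(-1j * theta * w)) @ v.conj().T

thetas = [0.3, 0.2]  # illustrative angles, one per neighbouring pair
terms = [heisenberg_term(3, 0), heisenberg_term(3, 1)]
exact = expm_hermitian(sum(t * h for t, h in zip(thetas, terms)), 1.0)

def trotterized(n_trotter):
  """Product of per-term exponentials, repeated n_trotter times."""
  step = np.eye(8, dtype=complex)
  for t, h in zip(thetas, terms):
    step = expm_hermitian(h, t / n_trotter) @ step
  return np.linalg.matrix_power(step, n_trotter)

errors = {n: np.linalg.norm(trotterized(n) - exact, 2) for n in (1, 10, 100)}
print(errors)
```

The printed errors shrink as the number of steps increases, which is the motivation for the `n_trotter=10` default above.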

Choose some qubits and prepare the data encoding circuits:

In [17]:

qubits = cirq.GridQubit.rect(1, DATASET_DIM + 1)
q_x_train_circuits = prepare_pqk_circuits(qubits, x_train)
q_x_test_circuits = prepare_pqk_circuits(qubits, x_test)

Next, compute the PQK features based on the 1-RDM of the dataset circuits above and store the results in rdm, a tf.Tensor with shape [n_points, n_qubits, 3]. The entries are rdm[i][j][k] = $\langle \psi_i | OP^k_j | \psi_i \rangle$, where i indexes over datapoints, j indexes over qubits and k indexes over $\lbrace \hat{X}, \hat{Y}, \hat{Z} \rbrace$.

In [18]:

def get_pqk_features(qubits, data_batch):
  """Get PQK features based on above construction."""
  ops = [[cirq.X(q), cirq.Y(q), cirq.Z(q)] for q in qubits]
  ops_tensor = tf.expand_dims(tf.reshape(tfq.convert_to_tensor(ops), -1), 0)
  batch_dim = tf.gather(tf.shape(data_batch), 0)
  ops_splat = tf.tile(ops_tensor, [batch_dim, 1])
  exp_vals = tfq.layers.Expectation()(data_batch, operators=ops_splat)
  rdm = tf.reshape(exp_vals, [batch_dim, len(qubits), -1])
  return rdm

In [19]:

x_train_pqk = get_pqk_features(qubits, q_x_train_circuits)
x_test_pqk = get_pqk_features(qubits, q_x_test_circuits)
print('New PQK training dataset has shape:', x_train_pqk.shape)
print('New PQK testing dataset has shape:', x_test_pqk.shape)

Out[19]:

New PQK training dataset has shape: (1000, 11, 3)
New PQK testing dataset has shape: (200, 11, 3)
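
To make the meaning of a single `[n_qubits, 3]` slice of `rdm` concrete, the sketch below computes the same kind of expectation values with plain NumPy for a tiny two-qubit state vector. The state and the helper `single_qubit_expectations` are hypothetical and serve only as an illustration; the tutorial itself relies on tfq.layers.Expectation.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def single_qubit_expectations(state, n_qubits):
  """Return an [n_qubits, 3] array of <X>, <Y>, <Z> for each qubit."""
  features = np.zeros((n_qubits, 3))
  for j in range(n_qubits):
    for k, pauli in enumerate((X, Y, Z)):
      # Build the operator that acts with `pauli` on qubit j only.
      factors = [I2] * n_qubits
      factors[j] = pauli
      op = factors[0]
      for f in factors[1:]:
        op = np.kron(op, f)
      features[j, k] = np.real(np.vdot(state, op @ state))
  return features

zero = np.array([1, 0], dtype=complex)
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
state = np.kron(zero, plus)  # |0> on qubit 0, |+> on qubit 1

features = single_qubit_expectations(state, 2)
print(features)
```

Qubit 0 in |0> gives (⟨X⟩, ⟨Y⟩, ⟨Z⟩) = (0, 0, 1) and qubit 1 in |+> gives (1, 0, 0), so the three numbers per qubit fully describe its single-qubit reduced state.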

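2.2 Relabeling based on PQK features

Now that you have these quantum-generated features in x_train_pqk and x_test_pqk, it is time to relabel the dataset. To achieve maximum separation between quantum and classical performance, you can relabel the dataset based on the spectrum information found in x_train_pqk and x_test_pqk.

Note: Preparing your dataset to explicitly maximize the separation in performance between the classical and quantum models might feel like cheating, but it provides a very important proof of existence for datasets that are hard for classical computers and easy for quantum computers to model. There would be no point in searching for quantum advantage in QML if you couldn't first create something like this to demonstrate advantage.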

In [20]:

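def compute_kernel_matrix(vecs, gamma):
  """Computes d[i][j] = scaled_gamma * sum_k (vecs[i][k] - vecs[j][k]) ** 2.

  Note: no exponential is applied here; `scaled_gamma` is `gamma` divided by
  (feature dimension * standard deviation of `vecs`).
  """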
  scaled_gamma = gamma / (
      tf.cast(tf.gather(tf.shape(vecs), 1), tf.float32) * tf.math.reduce_std(vecs))
  return scaled_gamma * tf.einsum('ijk->ij',(vecs[:,None,:] - vecs) ** 2)

def get_spectrum(datapoints, gamma=1.0):
  """Compute the eigenvalues and eigenvectors of the kernel of datapoints."""
  KC_qs = compute_kernel_matrix(datapoints, gamma)
  S, V = tf.linalg.eigh(KC_qs)
  S = tf.math.abs(S)
  return S, V
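
The broadcasting expression `(vecs[:, None, :] - vecs) ** 2` used in `compute_kernel_matrix` can be illustrated with a small NumPy sketch. The toy points below are hypothetical; the sketch only covers the pairwise squared distances and the symmetric eigendecomposition, not the gamma scaling.

```python
import numpy as np

# Three hypothetical 2-D points lying on a line.
vecs = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])

# Shape (3, 1, 2) minus shape (3, 2) broadcasts to (3, 3, 2):
# diffs[i, j] holds vecs[i] - vecs[j].
diffs = vecs[:, None, :] - vecs
# Summing the squares over the last axis gives squared distances.
sq_dists = np.einsum('ijk->ij', diffs ** 2)
print(sq_dists)

# The matrix is symmetric, so eigh applies; eigenvalues come back ascending.
w, v = np.linalg.eigh(sq_dists)
print(np.allclose(v @ np.diag(w) @ v.T, sq_dists))
```

Since the result is symmetric, `eigh` returns real eigenvalues and orthonormal eigenvectors, which is what `get_spectrum` relies on.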

In [21]:

S_pqk, V_pqk = get_spectrum(
    tf.reshape(tf.concat([x_train_pqk, x_test_pqk], 0), [-1, len(qubits) * 3]))

S_original, V_original = get_spectrum(
    tf.cast(tf.concat([x_train, x_test], 0), tf.float32), gamma=0.005)

print('Eigenvectors of pqk kernel matrix:', V_pqk)
print('Eigenvectors of original kernel matrix:', V_original)

Out[21]:

Eigenvectors of pqk kernel matrix: tf.Tensor(
[[-2.09569391e-02  1.05973557e-02  2.16634180e-02 ...  2.80352887e-02
   1.55521873e-02  2.82677952e-02]
 [-2.29303762e-02  4.66355234e-02  7.91163836e-03 ... -6.14174758e-04
  -7.07804322e-01  2.85902526e-02]
 [-1.77853629e-02 -3.00758495e-03 -2.55225878e-02 ... -2.40783971e-02
   2.11018627e-03  2.69009806e-02]
 ...
 [ 6.05797209e-02  1.32483775e-02  2.69536003e-02 ... -1.38843581e-02
   3.05043962e-02  3.85345481e-02]
 [ 6.33309558e-02 -3.04112374e-03  9.77444276e-03 ...  7.48321265e-02
   3.42793856e-03  3.67484428e-02]
 [ 5.86028099e-02  5.84433973e-03  2.64811981e-03 ...  2.82612257e-02
  -3.80136147e-02  3.29943895e-02]], shape=(1200, 1200), dtype=float32)
Eigenvectors of original kernel matrix: tf.Tensor(
[[ 0.03835681  0.0283473  -0.01169789 ...  0.02343717  0.0211248
   0.03206972]
 [-0.04018159  0.00888097 -0.01388255 ...  0.00582427  0.717551
   0.02881948]
 [-0.0166719   0.01350376 -0.03663862 ...  0.02467175 -0.00415936
   0.02195409]
 ...
 [-0.03015648 -0.01671632 -0.01603392 ...  0.00100583 -0.00261221
   0.02365689]
 [ 0.0039777  -0.04998879 -0.00528336 ...  0.01560401 -0.04330755
   0.02782002]
 [-0.01665728 -0.00818616 -0.0432341  ...  0.00088256  0.00927396
   0.01875088]], shape=(1200, 1200), dtype=float32)

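You now have everything you need to relabel the dataset! You can consult the flowchart to better understand how to maximize performance separation when relabeling the dataset.

To maximize the separation between quantum and classical models, you will attempt to maximize the geometric difference between the original dataset and the PQK feature kernel matrices, $g(K_1 || K_2) = \sqrt{ || \sqrt{K_2} K_1^{-1} \sqrt{K_2} || _\infty}$, using S_pqk, V_pqk and S_original, V_original. A large value of $g$ ensures that you initially move to the right in the flowchart, down towards a prediction advantage in the quantum case.

Note: Computing the quantities $s$ and $d$ is also very useful when looking to better understand performance separations. In this case, ensuring a large $g$ value is enough to see a performance separation.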
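
The geometric difference formula can be evaluated directly for small matrices. The following NumPy sketch is an illustration under stated assumptions: it takes $|| \cdot ||_\infty$ to be the spectral norm, assumes both kernels are symmetric positive definite, and uses hypothetical helper names (`psd_sqrt`, `geometric_difference`) and a random test kernel. It is not the procedure used by `get_stilted_dataset`.

```python
import numpy as np

def psd_sqrt(k):
  """Matrix square root of a symmetric positive semidefinite matrix."""
  w, v = np.linalg.eigh(k)
  w = np.clip(w, 0.0, None)  # guard against tiny negative round-off
  return (v * np.sqrt(w)) @ v.T

def geometric_difference(k_1, k_2):
  """g(K_1 || K_2) = sqrt(|| sqrt(K_2) K_1^{-1} sqrt(K_2) ||), spectral norm assumed."""
  sqrt_k_2 = psd_sqrt(k_2)
  k_1_inv = np.linalg.inv(k_1)
  return np.sqrt(np.linalg.norm(sqrt_k_2 @ k_1_inv @ sqrt_k_2, 2))

rng = np.random.default_rng(0)
a = rng.normal(size=(5, 5))
k = a @ a.T + np.eye(5)  # hypothetical positive definite kernel matrix

g_same = geometric_difference(k, k)
g_scaled = geometric_difference(k, 4 * k)
print(g_same, g_scaled)
```

Identical kernels give $g = 1$, and scaling $K_2$ by 4 gives $g = 2$, which matches the square root in the definition.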

In [22]:

def get_stilted_dataset(S, V, S_2, V_2, lambdav=1.1):
  """Prepare new labels that maximize geometric distance between kernels."""
  S_diag = tf.linalg.diag(S ** 0.5)
  S_2_diag = tf.linalg.diag(S_2 / (S_2 + lambdav) ** 2)
  scaling = S_diag @ tf.transpose(V) @ \
            V_2 @ S_2_diag @ tf.transpose(V_2) @ \
            V @ S_diag

  # Generate new labels using the largest eigenvector.
  _, vecs = tf.linalg.eig(scaling)
  new_labels = tf.math.real(
      tf.einsum('ij,j->i', tf.cast(V @ S_diag, tf.complex64), vecs[-1])).numpy()
  # Create new labels and add some small amount of noise.
  final_y = new_labels > np.median(new_labels)
  noisy_y = (final_y ^ (np.random.uniform(size=final_y.shape) > 0.95))
  return noisy_y

In [23]:

y_relabel = get_stilted_dataset(S_pqk, V_pqk, S_original, V_original)
y_train_new, y_test_new = y_relabel[:N_TRAIN], y_relabel[N_TRAIN:]
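
The last two steps of `get_stilted_dataset`, thresholding at the median and flipping a small fraction of labels with XOR, can be seen in isolation in the sketch below. The array `scores` is a hypothetical stand-in for the continuous `new_labels` values, and the seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1234)
scores = rng.normal(size=1000)  # hypothetical continuous label scores

# Thresholding at the median yields a balanced binary labeling.
clean = scores > np.median(scores)

# Each label is flipped independently with probability of about 0.05.
flip = rng.uniform(size=clean.shape) > 0.95
noisy = clean ^ flip

print(clean.mean())
print((noisy != clean).mean())
```

XOR with a boolean mask flips exactly the entries where the mask is True, so the fraction of changed labels equals the fraction of True values in `flip`.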

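3. Comparing models

Now that you have prepared your dataset, it is time to compare model performance. You will create two small feedforward neural networks and compare their performance when one of them is given access to the PQK features found in x_train_pqk.

3.1 Create PQK enhanced model

Using standard tf.keras library features, you can now create and train a model on the x_train_pqk and y_train_new datapoints: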

In [24]:

#docs_infra: no_execute
def create_pqk_model():
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Dense(32, activation='sigmoid', input_shape=[len(qubits) * 3,]))
    model.add(tf.keras.layers.Dense(16, activation='sigmoid'))
    model.add(tf.keras.layers.Dense(1))
    return model

pqk_model = create_pqk_model()
pqk_model.compile(loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.003),
              metrics=['accuracy'])

pqk_model.summary()

Out[24]:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 32)                1088      
_________________________________________________________________
dense_1 (Dense)              (None, 16)                528       
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 17        
=================================================================
Total params: 1,633
Trainable params: 1,633
Non-trainable params: 0
_________________________________________________________________

In [25]:

#docs_infra: no_execute
pqk_history = pqk_model.fit(tf.reshape(x_train_pqk, [N_TRAIN, -1]),
          y_train_new,
          batch_size=32,
          epochs=1000,
          verbose=0,
          validation_data=(tf.reshape(x_test_pqk, [N_TEST, -1]), y_test_new))

3.2 Create a classical model

Similar to the code above, you can now also create a classical model that doesn't have access to the PQK features in your stilted dataset. This model can be trained using x_train and y_train_new.

In [26]:

#docs_infra: no_execute
def create_fair_classical_model():
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Dense(32, activation='sigmoid', input_shape=[DATASET_DIM,]))
    model.add(tf.keras.layers.Dense(16, activation='sigmoid'))
    model.add(tf.keras.layers.Dense(1))
    return model

model = create_fair_classical_model()
model.compile(loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.03),
              metrics=['accuracy'])

model.summary()

Out[26]:

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_3 (Dense)              (None, 32)                352       
_________________________________________________________________
dense_4 (Dense)              (None, 16)                528       
_________________________________________________________________
dense_5 (Dense)              (None, 1)                 17        
=================================================================
Total params: 897
Trainable params: 897
Non-trainable params: 0
_________________________________________________________________
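
The parameter counts in the two model summaries can be reproduced by hand: a dense layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. The helper names in the sketch below are hypothetical; the layer sizes come from the two model definitions above (33 = 11 qubits × 3 PQK features, and 10 = DATASET_DIM).

```python
def dense_params(n_in, n_out):
  """Weights plus biases of a fully connected layer."""
  return n_in * n_out + n_out

def mlp_params(layer_sizes):
  """Total parameters of a stack of dense layers with the given sizes."""
  return sum(dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:]))

pqk_total = mlp_params([11 * 3, 32, 16, 1])
classical_total = mlp_params([10, 32, 16, 1])
print(pqk_total, classical_total)  # 1633 897
```

The two networks differ only in the width of the input layer, so the comparison is not skewed by extra hidden capacity in either model.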

In [27]:

#docs_infra: no_execute
classical_history = model.fit(x_train,
          y_train_new,
          batch_size=32,
          epochs=1000,
          verbose=0,
          validation_data=(x_test, y_test_new))

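3.3 Compare performance

Now that you have trained the two models, you can quickly plot the performance gap between them on the validation data. Typically both models will achieve > 0.9 accuracy on the training data. On the validation data, however, it becomes clear that only the information found in the PQK features is enough to make the model generalize well to unseen instances.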

In [28]:

#docs_infra: no_execute
plt.figure(figsize=(10,5))
plt.plot(classical_history.history['accuracy'], label='accuracy_classical')
plt.plot(classical_history.history['val_accuracy'], label='val_accuracy_classical')
plt.plot(pqk_history.history['accuracy'], label='accuracy_quantum')
plt.plot(pqk_history.history['val_accuracy'], label='val_accuracy_quantum')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()

Out[28]:

<matplotlib.legend.Legend at 0x7f6d846ecee0>

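Success: You have engineered a stilted quantum dataset that can intentionally defeat classical models in a fair (but contrived) setting. Try comparing results using other types of classical models. The next step is to see whether you can find new and interesting datasets that can defeat classical models without needing to engineer them yourself!

4. Important conclusions

There are several important conclusions you can draw from this and the MNIST experiments:

It is very unlikely that the quantum models of today will beat classical model performance on classical data, especially on today's classical datasets, which can have upwards of a million datapoints.
Just because the data might come from a quantum circuit that is hard to simulate classically doesn't necessarily make the data hard to learn for a classical model.
Datasets (ultimately quantum in nature) that are easy for quantum models to learn and hard for classical models to learn do exist, regardless of the model architecture or training algorithm used.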