Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
quantum-kittens
GitHub Repository: quantum-kittens/platypus
Path: blob/main/notebooks/ch-applications/qaoa.ipynb
3855 views
Kernel: Python 3

Solving combinatorial optimization problems using QAOA

In this tutorial, we will:

  • introduce combinatorial optimization problems,

  • explain approximate optimization algorithms,

  • explain how the Quantum Approximate Optimization Algorithm (QAOA) works,

  • and run a simple example on a simulator or a real quantum system.

Combinatorial Optimization Problem

Combinatorial optimization problems involve finding an optimal object out of a finite set of objects. We would focus on problems that involve finding "optimal" bit strings composed of 0's and 1's among a finite set of bit strings. One such problem corresponding to a graph is the Max-Cut problem.

Max-Cut problem

A Max-Cut problem involves partitioning nodes of a graph into two sets, such that the number of edges between the sets is maximum. The example below has a graph with four nodes and some ways of partitioning it into two sets, "red" and "blue" is shown.

For 4 nodes, as each node can be assigned to either the "red" or "blue" sets, there are 24=162^4=16 possible assignments, out of which we have to find one that gives maximum number of edges between the sets "red" and "blue". The number of such edges between two sets in the figure, as we go from left to right, are 0, 2, 2, and 4. We can see, after enumerating all possible 24=162^4=16 assignments, that the rightmost figure is the assignment that gives the maximum number of edges between the two sets. Hence if we encode "red" as 0 and "blue" as 1, the bit strings "0101" and "1010" that represent the assignment of nodes to either set are the solutions.

As you may have realized, as the number of nodes in the graph increases, the number of possible assignments that you have to examine to find the solution increases exponentially.

QAOA

QAOA (Quantum Approximate Optimization Algorithm) introduced by Farhi et al.[1] is a quantum algorithm that attempts to solve such combinatorial problems.

It is a variational algorithm that uses a unitary U(β,γ)U(\boldsymbol{\beta}, \boldsymbol{\gamma}) characterized by the parameters (β,γ)(\boldsymbol{\beta}, \boldsymbol{\gamma}) to prepare a quantum state ψ(β,γ)\lvert \psi(\boldsymbol{\beta}, \boldsymbol{\gamma}) \rangle. The goal of the algorithm is to find optimal parameters {latex} (\boldsymbol{\beta}_{\text{opt}}, \boldsymbol{\gamma}_{\text{opt}}) such that the quantum state {latex} \lvert \psi(\boldsymbol{\beta}_{\text{opt}}, \boldsymbol{\gamma}_{\text{opt}}) \rangle encodes the solution to the problem.

The unitary U(β,γ)U(\boldsymbol{\beta}, \boldsymbol{\gamma}) has a specific form and is composed of two unitaries U(β)=eiβHBU(\boldsymbol{\beta}) = e^{-i \boldsymbol{\beta} H_B} and U(γ)=eiγHPU(\boldsymbol{\gamma}) = e^{-i \boldsymbol{\gamma} H_P} where HBH_B is the mixing Hamiltonian and HPH_P is the problem Hamiltonian. Such a choice of unitary drives its inspiration from a related scheme called quantum annealing.

The state is prepared by applying these unitaries as alternating blocks of the two unitaries applied pp times such that

ψ(β,γ)=U(β)U(γ)U(β)U(γ)p  timesψ0\lvert \psi(\boldsymbol{\beta}, \boldsymbol{\gamma}) \rangle = \underbrace{U(\boldsymbol{\beta}) U(\boldsymbol{\gamma}) \cdots U(\boldsymbol{\beta}) U(\boldsymbol{\gamma})}_{p \; \text{times}} \lvert \psi_0 \rangle

where ψ0\lvert \psi_0 \rangle is a suitable initial state.

We will demonstrate these steps using the Max-Cut problem discussed above. For that we would first define the underlying graph of the problem shown above.

import networkx as nx graph = nx.Graph() graph.add_nodes_from([0, 1, 2, 3]) graph.add_edges_from([(0, 1), (1, 2), (2, 3), (3, 0)]) nx.draw(graph, with_labels=True, alpha=0.8, node_size=500)
Image in a Jupyter notebook

The problem Hamiltonian specific to the Max-Cut problem up to a constant here is:

HP=12(Z0Z1I2I3)+12(I0Z1Z2I3)+12(Z0I1I2Z3)+12(I0I1Z2Z3)H_P = \frac{1}{2}\big(Z_0 \otimes Z_1 \otimes I_2 \otimes I_3\big) + \frac{1}{2}\big(I_0 \otimes Z_1 \otimes Z_2 \otimes I_3\big) + \frac{1}{2}\big(Z_0 \otimes I_1 \otimes I_2 \otimes Z_3\big) + \frac{1}{2}\big(I_0 \otimes I_1 \otimes Z_2 \otimes Z_3\big)

To construct such a Hamiltonian for a problem, one needs to follow a few steps that we'll cover in later sections of this page.

The mixer Hamiltonian HBH_B is usually of the form:

HB=(X0I1I2I3)+(I0X1I2I3)+(I0I1X2I3)+(I0I1I2X3)H_B = \big(X_0 \otimes I_1 \otimes I_2 \otimes I_3 \big) + \big(I_0 \otimes X_1 \otimes I_2 \otimes I_3 \big) + \big(I_0 \otimes I_1 \otimes X_2 \otimes I_3 \big) + \big(I_0 \otimes I_1 \otimes I_2 \otimes X_3 \big)

As individual terms in the summation of HPH_P and HBH_B both commute, we can write the unitaries as:

U(HB)=eiβHB=eiβX0eiβX1eiβX2eiβX3.U(H_B) = e^{-i \beta H_B} = e^{-i \beta X_0}e^{-i \beta X_1}e^{-i \beta X_2}e^{-i \beta X_3}.

Notice that each term in the product above corresponds to an X-rotation on each qubit. And we can write U(HP)U(H_P) as:

U(HP)=eiγHP=eiγZ0Z1eiγZ1Z2eiγZ2Z3eiγZ0Z3U(H_P) = e^{-i \gamma H_P} = e^{-i \gamma Z_0 Z_1}e^{-i \gamma Z_1 Z_2}e^{-i \gamma Z_2 Z_3}e^{-i \gamma Z_0 Z_3}

Let's now examine what the circuits of the two unitaries look like.

The Mixing Unitary

from qiskit import QuantumCircuit, Aer from qiskit.circuit import Parameter # Adjacency is essentially a matrix which tells you which nodes are # connected. This matrix is given as a sparse matrix, so we need to # convert it to a dense matrix adjacency = nx.adjacency_matrix(graph).todense() N_QUBITS = 4 beta = Parameter("$\\beta$") qc_mix = QuantumCircuit(N_QUBITS) for i in range(N_QUBITS): qc_mix.rx(2 * beta, i) qc_mix.draw()
/var/folders/z_/6s4ntyps5lsb232v7f82201r0000gn/T/ipykernel_64520/2153888928.py:7: FutureWarning: adjacency_matrix will return a scipy.sparse array instead of a matrix in Networkx 3.0. adjacency = nx.adjacency_matrix(graph).todense()
Image in a Jupyter notebook

The Problem Unitary

gamma = Parameter("$\\gamma$") qc_p = QuantumCircuit(N_QUBITS) for pair in list(graph.edges()): # pairs of nodes qc_p.rzz(2 * gamma, pair[0], pair[1]) qc_p.barrier() qc_p.decompose().draw()
Image in a Jupyter notebook

The Initial State

The initial state used during QAOA is usually an equal superposition of all the basis states i.e.

ψ0=(12(0+1))n\lvert \psi_0 \rangle = \bigg(\frac{1}{\sqrt{2}}\big(\lvert 0 \rangle + \lvert 1 \rangle\big)\bigg)^{\otimes n}

Such a state, when number of qubits is 4 (n=4n=4), can be prepared by applying Hadamard gates starting from an all zero state as shown in the circuit below.

qc_0 = QuantumCircuit(N_QUBITS) for i in range(N_QUBITS): qc_0.h(i) qc_0.draw()
Image in a Jupyter notebook

The QAOA circuit

So far we have seen that the preparation of a quantum state during QAOA is composed of three elements

  • Preparing an initial state

  • Applying the unitary {latex} U(H_P) = e^{-i \gamma H_P} corresponding to the problem Hamiltonian

  • Then, applying the mixing unitary {latex} U(H_B) = e^{-i \beta H_B}

Let's see what it looks like for the example problem:

qc_qaoa = QuantumCircuit(N_QUBITS) qc_qaoa.append(qc_0, range(N_QUBITS)) qc_qaoa.append(qc_p, range(N_QUBITS)) qc_qaoa.append(qc_mix, range(N_QUBITS)) qc_qaoa.decompose().decompose().draw()
Image in a Jupyter notebook

The next step is to find the optimal parameters {latex} (\boldsymbol{\beta_{\text{opt}}}, \boldsymbol{\gamma_{\text{opt}}}) such that the expectation value

ψ(βopt,γopt)HPψ(βopt,γopt)\langle \psi(\boldsymbol{\beta}_{\text{opt}}, \boldsymbol{\gamma}_{\text{opt}}) \rvert H_P \lvert \psi(\boldsymbol{\beta}_{\text{opt}}, \boldsymbol{\gamma}_{\text{opt}}) \rangle

is minimized. Such an expectation can be obtained by doing measurement in the Z-basis. We use a classical optimization algorithm to find the optimal parameters. Following steps are involved as shown in the schematic

  1. Initialize β\boldsymbol{\beta} and γ\boldsymbol{\gamma} to suitable real values.

  2. Repeat until some suitable convergence criteria is met:

    1. Prepare the state ψ(β,γ)\lvert \psi(\boldsymbol{\beta}, \boldsymbol{\gamma}) \rangle using the QAOA circuit

    2. Measure the state in standard basis

    3. Compute ψ(β,γ)HPψ(β,γ) \langle \psi(\boldsymbol{\beta}, \boldsymbol{\gamma}) \rvert H_P \lvert \psi(\boldsymbol{\beta}, \boldsymbol{\gamma}) \rangle

    4. Find new set of parameters {latex} (\boldsymbol{\beta}_{new}, \boldsymbol{\gamma}_{new}) using a classical optimization algorithm

    5. Set current parameters (β,γ)(\boldsymbol{\beta}, \boldsymbol{\gamma}) equal to the new parameters {latex} (\boldsymbol{\beta}_{new}, \boldsymbol{\gamma}_{new})

The code below implements the steps mentioned above.

def maxcut_obj(solution, graph): """Given a bit string as a solution, this function returns the number of edges shared between the two partitions of the graph. Args: solution: (str) solution bit string graph: networkx graph Returns: obj: (float) Objective """ # pylint: disable=invalid-name obj = 0 for i, j in graph.edges(): if solution[i] != solution[j]: obj -= 1 return obj def compute_expectation(counts, graph): """Computes expectation value based on measurement results Args: counts: (dict) key as bit string, val as count graph: networkx graph Returns: avg: float expectation value """ avg = 0 sum_count = 0 for bit_string, count in counts.items(): obj = maxcut_obj(bit_string, graph) avg += obj * count sum_count += count return avg/sum_count # We will also bring the different circuit components that # build the qaoa circuit under a single function def create_qaoa_circ(graph, theta): """Creates a parametrized qaoa circuit Args: graph: networkx graph theta: (list) unitary parameters Returns: (QuantumCircuit) qiskit circuit """ nqubits = len(graph.nodes()) n_layers = len(theta)//2 # number of alternating unitaries beta = theta[:n_layers] gamma = theta[n_layers:] qc = QuantumCircuit(nqubits) # initial_state qc.h(range(nqubits)) for layer_index in range(n_layers): # problem unitary for pair in list(graph.edges()): qc.rzz(2 * gamma[layer_index], pair[0], pair[1]) # mixer unitary for qubit in range(nqubits): qc.rx(2 * beta[layer_index], qubit) qc.measure_all() return qc # Finally we write a function that executes the circuit # on the chosen backend def get_expectation(graph, shots=512): """Runs parametrized circuit Args: graph: networkx graph """ backend = Aer.get_backend('qasm_simulator') backend.shots = shots def execute_circ(theta): qc = create_qaoa_circ(graph, theta) counts = backend.run(qc, seed_simulator=10, nshots=512).result().get_counts() return compute_expectation(counts, graph) return execute_circ
from scipy.optimize import minimize expectation = get_expectation(graph) res = minimize(expectation, [1.0, 1.0], method='COBYLA') res
fun: -2.994140625 maxcv: 0.0 message: 'Optimization terminated successfully.' nfev: 30 status: 1 success: True x: array([1.9793337 , 1.16663483])

Note that different choices of classical optimizers are present in qiskit. We choose COBYLA as our classical optimization algorithm here.

Analyzing the result

backend = Aer.get_backend('aer_simulator') backend.shots = 512 from qiskit.visualization import plot_histogram qc_res = create_qaoa_circ(graph, res.x) counts = backend.run(qc_res, seed_simulator=10).result().get_counts() plot_histogram(counts)
Image in a Jupyter notebook

As we notice that the bit strings "0101" and "1010" have the highest probability and are indeed the assignments of the graph (we started with) that gives 4 edges between the two partitions.

Appendix

1. Constructing Problem Hamiltonian

Any maximization problem can be cast in terms of a minimization problem and vice versa. Hence the general form a combinatorial optimization problem is given by

maximize     C(x)\text{maximize } \;\; C(x)subject to     xS\text{subject to } \;\; x \in S

where xSx \in S, is a discrete variable and C:DRC : D \rightarrow \mathbb{R} is the cost function, that maps from some domain SS in to the real numbers R\mathbb{R}. The variable xx can be subject to a set of constraints and lies within the set SDS \subset D of feasible points.

In binary combinatorial optimization problems, the cost function CC can typically be expressed as a sum of terms that only involve a subset Q[n]Q \subset[n] of the nn bits in the string x{0,1}nx \in \{0,1\}^n and is written in the canonical form

C(x)=(Q,Q)[n]w(Q,Q)  iQxi  jQ(1xj),C(x) = \sum_{(Q,\overline{Q}) \subset [n]} w_{(Q,\overline{Q})} \; \prod_{i\in Q} x_i \; \prod_{j\in \overline{Q}} (1- x_j),

where xi{0,1}x_i \in \{0,1\} and w(Q,Q)Rw_{(Q,\overline{Q})}\in \mathbb{R}. We want to find the n-bit string xx for which C(x)C(x) is the maximal.

1.1 Diagonal Hamiltonians

This cost function can be mapped to a Hamiltonian that is diagonal in the computational basis. Given the cost-function CC this Hamiltonian is then written as

H=x{0,1}nC(x)xxH = \sum_{x \in \{0,1\}^n} C(x) |x \rangle\langle x|

where x{0,1}nx \in \{0,1\}^n labels the computational basis states xC2n|x \rangle \in \mathbb{C}^{2^n}. If the cost function only has at most weight kk terms, i.e. when only QQ contribute that involve at most QkQ \leq k bits, then this diagonal Hamiltonian is also only a sum of weight kk Pauli ZZ operators.

The expansion of HH in to Pauli ZZ operators can be obtained from the canonical expansion of the cost-function CC by substituting for every binary variable xi{0,1}x_i \in \{0,1\} the matrix {latex} x_i \rightarrow 2^{-1}(1 - Z_i). Here ZiZ_i is read as the Pauli ZZ operator that acts on qubit ii and trivial on all others, i.e.

Zi=(1001).Z_i = \left(\begin{array}{cc} 1 & 0 \\ 0 & -1 \end{array}\right).

This means that the spin Hamiltonian encoding the classical cost function is written as a Q|Q| - local quantum spin Hamiltonian only involving Pauli ZZ- operators.

H=(Q,Q)[n]w(Q,Q)  12Q+QiQ(1Zi)  jQ(1+Zj).H = \sum_{(Q,\overline{Q}) \subset [n]} w_{(Q,\overline{Q})} \; \frac{1}{2^{|Q| + |\overline{Q}|}}\prod_{i\in Q} \left(1 - Z_i\right) \; \prod_{j\in \overline{Q}} \left(1 + Z_j\right).

Now, we will assume that only a few (polynomially many in nn) w(Q,Q)w_{(Q,\overline{Q})} will be non-zero. Moreover we will assume that the set (Q,Q)|(Q,\overline{Q})| is bounded and not too large. This means we can write the cost function as well as the Hamiltonian HH as the sum of mm local terms C^k\hat{C}_k,

H=k=1mC^k,H = \sum_{k = 1}^m \hat{C}_k,

where both mm and the support of C^k\hat{C}_k is reasonably bounded.

2 Examples:

We consider 2 examples to illustrate combinatorial optimization problems. We will only implement the first example as in Qiskit, but provide a sequence of exercises that give the instructions to implement the second example as well.

2.1 (weighted) MAXCUTMAXCUT

Consider an nn-node non-directed graph G = (V, E) where |V| = n with edge weights wij>0w_{ij}>0, {latex} w_{ij}=w_{ji}, for (i,j)E(i,j)\in E. A cut is defined as a partition of the original set V into two subsets. The cost function to be optimized is in this case the sum of weights of edges connecting points in the two different subsets, crossing the cut. By assigning xi=0x_i=0 or xi=1x_i=1 to each node ii, one tries to maximize the global profit function (here and in the following summations run over indices 1,2,...,n)

C(x)=i,j=1nwijxi(1xj).C(\textbf{x}) = \sum_{i,j = 1}^n w_{ij} x_i (1-x_j).

To simplify notation, we assume uniform weights wij=1 w_{ij} = 1 for (i,j)E(i,j) \in E. To find a solution to this problem on a quantum computer, one needs first to map it to a diagonal Hamiltonian as discussed above. We write the sum as a sum over edges in the set EE

C(x)=i,j=1nwijxi(1xj)=(i,j)E(xi(1xj)+xj(1xi))C(\textbf{x}) = \sum_{i,j = 1}^n w_{ij} x_i (1-x_j) = \sum_{(i,j) \in E} \left( x_i (1-x_j) + x_j (1-x_i)\right)

To map it to a spin Hamiltonian, we make the assignment {latex} x_i\rightarrow (1-Z_i)/2, where ZiZ_i is the Pauli Z operator that has eigenvalues ±1\pm 1 and obtain C(x)HC(\textbf{x}) \rightarrow H

H=(j,k)E12(1ZjZk).H = \sum_{(j,k) \in E} \frac{1}{2}\left(1 - Z_j Z_k \right).

This means that the Hamiltonian can be written as a sum of m=Em = |E| local terms:

C^e=12(1Ze1Ze2)\hat{C}_e = \frac{1}{2}\left(1 - Z_{e1}Z_{e2}\right)

with e=(e1,e2)Ee = (e1,e2) \in E.

2.2 Constraint satisfaction problems and MAX 3-SAT\text{MAX 3-SAT}.

Another example of a combinatorial optimization problem is 3-SAT\text{3-SAT}. Here the cost function {latex} C(\textbf{x}) = \sum_{k = 1}^m c_k(\textbf{x}) is a sum of clauses ck(x)c_k(\textbf{x}) that constrain the values of 33 bits of some x{0,1}n\textbf{x} \in \{0,1\}^n that participate in the clause. Consider for instance this example of a 3-SAT\text{3-SAT} clause

c1(x)=(1x1)(1x3)x132c_1(\textbf{x}) = (1-x_1)(1-x_3)x_{132}

for a bit string x{0,1}133\textbf{x} \in \{0,1\}^{133}. The clause can only be satisfied by setting the bits x1=0x_1 = 0,x3=0x_3 = 0 and x132=1x_{132} = 1. The 3-SAT\text{3-SAT} problem now asks whether there is a bit string that satisfies all the mm clauses or whether no such string exists. This decision problem is the prime example of a problem that is NPNP-complete.

The closely related optimization problem MAX 3-SAT\text{MAX 3-SAT} asks to find the bit string x\textbf{x} that satisfies the maximal number of clauses in C(x)C(\textbf{x}). This can of course be turned again in to a decision problem if we ask where there exists a bit string that satisfies more than m~\tilde{m} of the mm clauses, which is again NPNP-complete.

3. Approximate optimization algorithms

Both the previously considered problems MAXCUTMAXCUT and MAX 3-SAT\text{MAX 3-SAT} are actually known to be a NP-hard problems 3. In fact it turns out that many combinatorial optimization problems are computationally hard to solve in general. In light of this fact, we can't expect to find a provably efficient algorithm, i.e. an algorithm with polynomial runtime in the problem size, that solves these problems. This also applies to quantum algorithms. There are two main approaches to dealing with such problems. First approach is approximation algorithms that are guaranteed to find solution of specified quality in polynomial time. The second approach are heuristic algorithms that don't have a polynomial runtime guarantee but appear to perform well on some instances of such problems.

Approximate optimization algorithms are efficient and provide a provable guarantee on how close the approximate solution is to the actual optimum of the problem. The guarantee typically comes in the form of an approximation ratio, α1\alpha \leq 1. A probabilistic approximate optimization algorithm guarantees that it produces a bit-string x{0,1}n\textbf{x}^* \in \{0,1\}^n so that with high probability we have that with a positive {latex} C_\text{max} = \max_{\textbf{x}}C(\textbf{x})

CmaxC(x)αCmax.C_\text{max} \geq C(\textbf{x}^*) \geq \alpha C_\text{max}.

For the MAXCUTMAXCUT problem there is a famous approximate algorithm due to Goemans and Williamson 2 . This algorithm is based on an SDP relaxation of the original problem combined with a probabilistic rounding technique that yields an with high probability approximate solution x\textbf{x}^* that has an approximation ratio of α0.878\alpha \approx 0.878. This approximation ratio is actually believed to optimal so we do not expect to see an improvement by using a quantum algorithm.

4. The QAOA algorithm

The Quantum approximate optimization algorithm (QAOA) by Farhi, Goldstone and Gutmann 1 is an example of a heuristic algorithm. Unlike Goemans-Williamson algorithm, QAOA does not come with performance guarantees. QAOA takes the approach of classical approximate algorithms and looks for a quantum analogue that will likewise produce a classical bit string xx^* that with high probability is expected to have a good approximation ratio α\alpha. Before discussing the details, let us first present the general idea of this approach.

4.1 Overview:

We want to find a quantum state ψp(γ,β)|\psi_p(\vec{\gamma},\vec{\beta})\rangle, that depends on some real parameters γ,βRp\vec{\gamma},\vec{\beta} \in \mathbb{R}^p, which has the property that it maximizes the expectation value with respect to the problem Hamiltonian HH. Given this trial state we search for parameters γ,β\vec{\gamma}^*,\vec{\beta}^* that maximize {latex} F_p(\vec{\gamma},\vec{\beta}) = \langle \psi_p(\vec{\gamma},\vec{\beta})|H|\psi_p(\vec{\gamma},\vec{\beta})\rangle.

Once we have such a state and the corresponding parameters we prepare the state ψp(γ,β)|\psi_p(\vec{\gamma}^*,\vec{\beta}^*)\rangle on a quantum computer and measure the state in the ZZ basis {latex} |x \rangle = |x_1,\ldots x_n \rangle to obtain a random outcome xx^*.

We will see that this random xx^* is going to be a bit string that is with high probability close to the expected value {latex} M_p = F_p(\vec{\gamma}^*,\vec{\beta}^*). Hence, if MpM_p is close to CmaxC_\text{max} so is C(x)C(x^*).

4.2 The components of the QAOA algorithm.

4.2.1 The QAOA trial state

Central to QAOA is the trial state ψp(γ,β)|\psi_p(\vec{\gamma},\vec{\beta})\rangle that will be prepared on the quantum computer. Ideally we want this state to give a large expectation value {latex} F_p(\vec{\gamma},\vec{\beta}) = \langle \psi_p(\vec{\gamma},\vec{\beta})|H|\psi_p(\vec{\gamma},\vec{\beta})\rangle with respect to the problem Hamiltonian HH. In Farhi 1, the trial states ψp(γ,β)|\psi_p(\vec{\gamma},\vec{\beta})\rangle are constructed from the problem Hamiltonian HH together with single qubit Pauli XX rotations. That means, given a problems Hamiltonian

H=k=1mC^kH = \sum_{k = 1}^m \hat{C}_k

diagonal in the computational basis and a transverse field Hamiltonian

B=i=1nXiB = \sum_{i = 1}^n X_i

the trial state is prepared by applying pp alternating unitaries

ψp(γ,β)=eiβpBeiγpHeiβ1Beiγ1H+n|\psi_p(\vec{\gamma},\vec{\beta})\rangle = e^{ -i\beta_p B } e^{ -i\gamma_p H } \ldots e^{ -i\beta_1 B } e^{ -i\gamma_1 H } |+\rangle^n

to the product state +n|+\rangle^n with X+=+ X |+\rangle = |+\rangle.

This particular ansatz has the advantage that there exists an explicit choice for the vectors γ,β\vec{\gamma}^*,\vec{\beta}^* such that for {latex} M_p = F_p(\vec{\gamma}^*,\vec{\beta}^*) when we take the limit {latex} \lim_{p \rightarrow \infty} M_p = C_\text{max}. This follows by viewing the trial state ψp(γ,β)|\psi_p(\vec{\gamma},\vec{\beta}) \rangle as the state that follows from Trotterizing the adiabatic evolution with respect to HH and the transverse field Hamiltonian BB, c.f. Ref 1.

Conversely the disadvantage of this trial state is one would typically want a state that has been generated from a quantum circuit that is not too deep. Here depth is measured with respect to the gates that can be applied directly on the quantum chip. Hence there are other proposals that suggest using Ansatz trial state that are more tailored to the Hardware of the quantum chip Ref. 4, Ref. 5.

4.2.2 Computing the expectation value

An important component of this approach is that we will have to compute or estimate the expectation value

Fp(γ,β)=ψp(γ,β)Hψp(γ,β)F_p(\vec{\gamma},\vec{\beta}) = \langle \psi_p(\vec{\gamma},\vec{\beta})|H|\psi_p(\vec{\gamma},\vec{\beta})\rangle

so we can optimize the parameters γ,β\vec{\gamma},\vec{\beta}. We will be considering two scenarios here.

Classical evaluation

Note that when the circuit to prepare ψp(γ,β)|\psi_p(\vec{\gamma},\vec{\beta})\rangle is not too deep it may be possible to evaluate the expectation value FpF_p classically.

This happens for instance when one considers MAXCUTMAXCUT for graphs with bounded degree and one considers a circuit with p=1p=1. We will see an example of this in the Qiskit implementation below (section 5.2) and provide an exercise to compute the expectation value.

To illustrate the idea, recall that the Hamiltonian can be written as a sum of individual terms {latex} H = \sum_{k = 1}^m \hat{C}_k. Due to the linearity of the expectation value, it is sufficient to consider the expectation values of the individual summands. For p=1p = 1 one has that

ψ1(γ,β)C^kψ1(γ,β)=+neiγ1Heiβ1BC^keiβ1Beiγ1H+n.\langle \psi_1(\vec{\gamma},\vec{\beta})|\hat{C}_k|\psi_1(\vec{\gamma},\vec{\beta})\rangle = \langle +^n | e^{ i\gamma_1 H } e^{ i\beta_1 B } | \hat{C}_k | e^{ -i\beta_1 B } e^{ -i\gamma_1 H } |+^n\rangle.

Observe that with {latex} B = \sum_{i = 1}^n X_i the unitary eiβ1Be^{ -i\beta_1 B } is actually a product of single qubit rotations about XX with an angle β\beta for which we will write {latex} X(\beta)_k = \exp(i\beta X_k).

All the individual rotations that don't act on the qubits where C^k\hat{C}_k is supported commute with C^k\hat{C}_k and therefore cancel. This does not increase the support of the operator C^k\hat{C}_k. This means that the second set of unitary gates {latex} e^{ -i\gamma_1 H } = \prod_{l=1}^m U_l(\gamma) have a large set of gates {latex} U_l(\gamma) = e^{ -i\gamma_1 \hat{C}_l } that commute with the operator {latex} e^{ i\beta_1 B } \hat{C}_k e^{ -i\beta_1 B }. The only gates {latex} U_l(\gamma) = e^{ -i\gamma_1 \hat{C}_l } that contribute to the expectation value are those which involve qubits in the support of the original C^k\hat{C}_k.

Hence, for bounded degree interaction the support of {latex} e^{ i\gamma_1 H } e^{ i\beta_1 B } \hat{C}_k e^{ -i\beta_1 B } e^{ -i\gamma_1 H } only expands by an amount given by the degree of the interaction in HH and is therefore independent of the system size. This means that for these smaller sub problems the expectation values are independent of nn and can be evaluated classically. The case of a general degree 33 is considered in 1.

This is a general observation, which means that if we have a problem where the circuit used for the trial state preparation only increases the support of each term in the Hamiltonian by a constant amount the cost function can be directly evaluated.

When this is the case, and only a few parameters β,γ\beta, \gamma are needed in the preparation of the trial state, these can be found easily by a simple grid search. Furthermore, an exact optimal value of MpM_p can be used to bound the approximation ratio

MpCmaxα\frac{M_p}{C_\text{max}} \geq \alpha

to obtain an estimate of α\alpha. For this case the QAOA algorithm has the same characteristics as a conventional approximate optimization algorithm that comes with a guaranteed approximation ratio that can be obtained with polynomial efficiency in the problem size.

Evaluation on a quantum computer

When the quantum circuit becomes too deep to be evaluated classically, or when the connectivity of the Problem Hamiltonian is too high we can resort to other means of estimating the expectation value. This involves directly estimating Fp(γ,β)F_p(\vec{\gamma},\vec{\beta}) on the quantum computer. The approach here follows the path of the conventional expectation value estimation as used in VQE 4, where a trial state ψp(γ,β)| \psi_p(\vec{\gamma},\vec{\beta}) \rangle is prepared directly on the quantum computer and the expectation value is obtained from sampling.

Since QAOA has a diagonal Hamiltonian HH it is actually straight forward to estimate the expectation value. We only need to obtain samples from the trial state in the computational basis. Recall that H=x{0,1}nC(x)xxH = \sum_{x \in \{0,1\}^n} C(x) |x \rangle\langle x| so that we can obtain the sampling estimate of

ψp(γ,β)Hψp(γ,β)=x{0,1}nC(x)xψp(γ,β)2\langle \psi_p(\vec{\gamma},\vec{\beta})|H|\psi_p(\vec{\gamma},\vec{\beta})\rangle = \sum_{x \in \{0,1\}^n} C(x) |\langle x| \psi_p(\vec{\gamma},\vec{\beta}) \rangle |^2

by repeated single qubit measurements of the state ψp(γ,β)| \psi_p(\vec{\gamma},\vec{\beta}) \rangle in the ZZ basis. For every bit string xx obtained from the distribution xψp(γ,β)2|\langle x| \psi_p(\vec{\gamma},\vec{\beta}) \rangle |^2 we evaluate the cost function C(x)C(x) and average it over the total number of samples. The resulting empirical average approximates the expectation value up to an additive sampling error that lies within the variance of the state. The variance will be discussed below.

With access to the expectation value, we can now run a classical optimization algorithm, such as 6, to optimize the FpF_p.

While this approach does not lead to an a-priori approximation guarantee for xx^*, the optimized function value can be used later to provide an estimate for the approximation ratio α\alpha.

4.3.3 Obtaining a solution with a given approximation ratio with high probability

The algorithm is probabilistic in nature and produces random bit strings from the distribution xψp(γ,β)2|\langle x| \psi_p(\vec{\gamma},\vec{\beta}) \rangle |^2. So how can we be sure that we will sample an approximation xx^* that is close to the value of the optimized expectation value MpM_p? Note that this question is also relevant to the estimation of MpM_p on a quantum computer in the first place. If the samples drawn from xψp(γ,β)2|\langle x| \psi_p(\vec{\gamma},\vec{\beta}) \rangle |^2 have too much variance, many samples are necessary to determine the mean.

We will draw a bit string xx^* that is close to the mean MpM_p with high probability when the energy as variable has little variance.

Note that the number of terms in the Hamiltonian {latex} H = \sum_{k=1}^m \hat{C}_k are bounded by mm. Say each individual summand C^k\hat{C}_k has an operator norm that can be bounded by a universal constant C^kC~\|\hat{C}_k\| \leq \tilde{C} for all k=1mk = 1\ldots m. Then consider

ψp(γ,β)H2ψp(γ,β)ψp(γ,β)Hψp(γ,β)2ψp(γ,β)H2ψp(γ,β)=k,l=1mψp(γ,β)C^kC^lψp(γ,β)m2C~2\begin{aligned} \langle \psi_p(\vec{\gamma},\vec{\beta})|H^2|\psi_p(\vec{\gamma},\vec{\beta})\rangle - \langle \psi_p(\vec{\gamma},\vec{\beta})|H|\psi_p(\vec{\gamma},\vec{\beta})\rangle^2 &\leq \langle \psi_p(\vec{\gamma},\vec{\beta})|H^2|\psi_p(\vec{\gamma},\vec{\beta})\rangle \\ &= \sum_{k,l =1}^m \langle \psi_p(\vec{\gamma},\vec{\beta})|\hat{C}_k \hat{C}_l |\psi_p(\vec{\gamma},\vec{\beta})\rangle \\ &\leq m^2 \tilde{C}^2 \\ \end{aligned}

where we have used that {latex} \langle \psi_p(\vec{\gamma},\vec{\beta})|\hat{C}_k \hat{C}_l |\psi_p(\vec{\gamma},\vec{\beta})\rangle \leq \tilde{C}^2.

This means that the variance of any expectation Fp(γ,β)F_p(\vec{\gamma},\vec{\beta}) is bounded by m2C~2m^2 \tilde{C}^2. Hence this in particular applies for MpM_p. Furthermore if mm only grows polynomially in the number of qubits nn, we know that taking polynomially growing number of samples s=O(C~2m2ϵ2)s = O\left(\frac{\tilde{C}^2 m^2}{\epsilon^2}\right) from xψp(γ,β)2|\langle x| \psi_p(\vec{\gamma},\vec{\beta}) \rangle |^2 will be sufficient to obtain a xx^* that leads to an C(x)C(x^*) that will be close to MpM_p.

5. Problems

  1. The QAOA algorithm produces a bit string, is this string the optimal solution for this graph? Compare the experimental results from the superconducting chip with the results from the local QASM simulation.

  2. We have computed the cost function F1F_1 analytically in section 5.2. Verify the steps and compute fA(γ,β)f_A(\gamma,\beta) as well fB(γ,β)f_B(\gamma,\beta).

  3. We have given an exact expression for F1F_1 in the Qiskit implementation.

    • Write a routine to estimate the expectation value F1(γ,β)F_1(\gamma,\beta) from the samples obtained in the result

    • Use an optimization routine,e.g. SPSA from the VQE example in this tutorial, to optimize the parameters in the sampled F1(γ,β)F_1(\gamma,\beta) numerically. Do you find the same values for γ,β\gamma^*,\beta^* ?

  4. The Trial circuit in section 5.3 corresponds to depth p=1p=1 and was directly aimed at being compatible with the Hardware.

    • Use the routine from exercise 2 to evaluate the cost functions Fp(γ,β)F_p(\gamma,\beta) for p=2,3p=2,3. What do you expect to see in the actual Hardware?

    • Generalize this class of trial state to other candidate wave functions, such as the Hardware efficient ansatz of Ref. 4.

References

  1. Farhi, Edward, Jeffrey Goldstone, and Sam Gutmann. "A quantum approximate optimization algorithm." arXiv preprint arXiv:1411.4028 (2014).

  2. Goemans, Michel X., and David P. Williamson. Journal of the ACM (JACM) 42.6 (1995): 1115-1145.

  3. Garey, Michael R.; David S. Johnson (1979). Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman. ISBN 0-7167-1045-5

  4. Kandala, Abhinav, et al. "Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets." Nature 549.7671 (2017): 242.

  5. Farhi, Edward, et al. "Quantum algorithms for fixed qubit architectures." arXiv preprint arXiv:1703.06199 (2017).

  6. Spall, J. C. (1992), IEEE Transactions on Automatic Control, vol. 37(3), pp. 332–341.

  7. Michael Streif and Martin Leib "Training the quantum approximate optimization algorithm without access to a quantum processing unit" (2020) Quantum Sci. Technol. 5 034008

# pylint: disable=unused-import import qiskit.tools.jupyter %qiskit_version_table