Path: blob/main/notebooks/ch-applications/qaoa.ipynb
3855 views
Solving combinatorial optimization problems using QAOA
In this tutorial, we will:
introduce combinatorial optimization problems,
explain approximate optimization algorithms,
explain how the Quantum Approximate Optimization Algorithm (QAOA) works,
and run a simple example on a simulator or a real quantum system.
Combinatorial Optimization Problem
Combinatorial optimization problems involve finding an optimal object out of a finite set of objects. We would focus on problems that involve finding "optimal" bit strings composed of 0's and 1's among a finite set of bit strings. One such problem corresponding to a graph is the Max-Cut problem.
Max-Cut problem
A Max-Cut problem involves partitioning nodes of a graph into two sets, such that the number of edges between the sets is maximum. The example below has a graph with four nodes and some ways of partitioning it into two sets, "red" and "blue" is shown.
For 4 nodes, as each node can be assigned to either the "red" or "blue" sets, there are possible assignments, out of which we have to find one that gives maximum number of edges between the sets "red" and "blue". The number of such edges between two sets in the figure, as we go from left to right, are 0, 2, 2, and 4. We can see, after enumerating all possible assignments, that the rightmost figure is the assignment that gives the maximum number of edges between the two sets. Hence if we encode "red" as 0 and "blue" as 1, the bit strings "0101" and "1010" that represent the assignment of nodes to either set are the solutions.
As you may have realized, as the number of nodes in the graph increases, the number of possible assignments that you have to examine to find the solution increases exponentially.
QAOA
QAOA (Quantum Approximate Optimization Algorithm) introduced by Farhi et al.[1] is a quantum algorithm that attempts to solve such combinatorial problems.
It is a variational algorithm that uses a unitary characterized by the parameters to prepare a quantum state . The goal of the algorithm is to find optimal parameters {latex} (\boldsymbol{\beta}_{\text{opt}}, \boldsymbol{\gamma}_{\text{opt}}) such that the quantum state {latex} \lvert \psi(\boldsymbol{\beta}_{\text{opt}}, \boldsymbol{\gamma}_{\text{opt}}) \rangle encodes the solution to the problem.
The unitary has a specific form and is composed of two unitaries and where is the mixing Hamiltonian and is the problem Hamiltonian. Such a choice of unitary drives its inspiration from a related scheme called quantum annealing.
The state is prepared by applying these unitaries as alternating blocks of the two unitaries applied times such that
where is a suitable initial state.
We will demonstrate these steps using the Max-Cut problem discussed above. For that we would first define the underlying graph of the problem shown above.
The problem Hamiltonian specific to the Max-Cut problem up to a constant here is:
To construct such a Hamiltonian for a problem, one needs to follow a few steps that we'll cover in later sections of this page.
The mixer Hamiltonian is usually of the form:
As individual terms in the summation of and both commute, we can write the unitaries as:
Notice that each term in the product above corresponds to an X-rotation on each qubit. And we can write as:
Let's now examine what the circuits of the two unitaries look like.
The Mixing Unitary
The Problem Unitary
The Initial State
The initial state used during QAOA is usually an equal superposition of all the basis states i.e.
Such a state, when number of qubits is 4 (), can be prepared by applying Hadamard gates starting from an all zero state as shown in the circuit below.
The QAOA circuit
So far we have seen that the preparation of a quantum state during QAOA is composed of three elements
Preparing an initial state
Applying the unitary
{latex} U(H_P) = e^{-i \gamma H_P}corresponding to the problem HamiltonianThen, applying the mixing unitary
{latex} U(H_B) = e^{-i \beta H_B}
Let's see what it looks like for the example problem:
The next step is to find the optimal parameters {latex} (\boldsymbol{\beta_{\text{opt}}}, \boldsymbol{\gamma_{\text{opt}}}) such that the expectation value
is minimized. Such an expectation can be obtained by doing measurement in the Z-basis. We use a classical optimization algorithm to find the optimal parameters. Following steps are involved as shown in the schematic

Initialize and to suitable real values.
Repeat until some suitable convergence criteria is met:
Prepare the state using the QAOA circuit
Measure the state in standard basis
Compute
Find new set of parameters
{latex} (\boldsymbol{\beta}_{new}, \boldsymbol{\gamma}_{new})using a classical optimization algorithmSet current parameters equal to the new parameters
{latex} (\boldsymbol{\beta}_{new}, \boldsymbol{\gamma}_{new})
The code below implements the steps mentioned above.
Note that different choices of classical optimizers are present in qiskit. We choose COBYLA as our classical optimization algorithm here.
Analyzing the result
As we notice that the bit strings "0101" and "1010" have the highest probability and are indeed the assignments of the graph (we started with) that gives 4 edges between the two partitions.
Appendix
1. Constructing Problem Hamiltonian
Any maximization problem can be cast in terms of a minimization problem and vice versa. Hence the general form a combinatorial optimization problem is given by
where , is a discrete variable and is the cost function, that maps from some domain in to the real numbers . The variable can be subject to a set of constraints and lies within the set of feasible points.
In binary combinatorial optimization problems, the cost function can typically be expressed as a sum of terms that only involve a subset of the bits in the string and is written in the canonical form
where and . We want to find the n-bit string for which is the maximal.
1.1 Diagonal Hamiltonians
This cost function can be mapped to a Hamiltonian that is diagonal in the computational basis. Given the cost-function this Hamiltonian is then written as
where labels the computational basis states . If the cost function only has at most weight terms, i.e. when only contribute that involve at most bits, then this diagonal Hamiltonian is also only a sum of weight Pauli operators.
The expansion of in to Pauli operators can be obtained from the canonical expansion of the cost-function by substituting for every binary variable the matrix {latex} x_i \rightarrow 2^{-1}(1 - Z_i). Here is read as the Pauli operator that acts on qubit and trivial on all others, i.e.
This means that the spin Hamiltonian encoding the classical cost function is written as a - local quantum spin Hamiltonian only involving Pauli - operators.
Now, we will assume that only a few (polynomially many in ) will be non-zero. Moreover we will assume that the set is bounded and not too large. This means we can write the cost function as well as the Hamiltonian as the sum of local terms ,
where both and the support of is reasonably bounded.
2 Examples:
We consider 2 examples to illustrate combinatorial optimization problems. We will only implement the first example as in Qiskit, but provide a sequence of exercises that give the instructions to implement the second example as well.
2.1 (weighted)
Consider an -node non-directed graph G = (V, E) where |V| = n with edge weights , {latex} w_{ij}=w_{ji}, for . A cut is defined as a partition of the original set V into two subsets. The cost function to be optimized is in this case the sum of weights of edges connecting points in the two different subsets, crossing the cut. By assigning or to each node , one tries to maximize the global profit function (here and in the following summations run over indices 1,2,...,n)
To simplify notation, we assume uniform weights for . To find a solution to this problem on a quantum computer, one needs first to map it to a diagonal Hamiltonian as discussed above. We write the sum as a sum over edges in the set
To map it to a spin Hamiltonian, we make the assignment {latex} x_i\rightarrow (1-Z_i)/2, where is the Pauli Z operator that has eigenvalues and obtain
This means that the Hamiltonian can be written as a sum of local terms:
with .
2.2 Constraint satisfaction problems and .
Another example of a combinatorial optimization problem is . Here the cost function {latex} C(\textbf{x}) = \sum_{k = 1}^m c_k(\textbf{x}) is a sum of clauses that constrain the values of bits of some that participate in the clause. Consider for instance this example of a clause
for a bit string . The clause can only be satisfied by setting the bits , and . The problem now asks whether there is a bit string that satisfies all the clauses or whether no such string exists. This decision problem is the prime example of a problem that is -complete.
The closely related optimization problem asks to find the bit string that satisfies the maximal number of clauses in . This can of course be turned again in to a decision problem if we ask where there exists a bit string that satisfies more than of the clauses, which is again -complete.
3. Approximate optimization algorithms
Both the previously considered problems and are actually known to be a NP-hard problems 3. In fact it turns out that many combinatorial optimization problems are computationally hard to solve in general. In light of this fact, we can't expect to find a provably efficient algorithm, i.e. an algorithm with polynomial runtime in the problem size, that solves these problems. This also applies to quantum algorithms. There are two main approaches to dealing with such problems. First approach is approximation algorithms that are guaranteed to find solution of specified quality in polynomial time. The second approach are heuristic algorithms that don't have a polynomial runtime guarantee but appear to perform well on some instances of such problems.
Approximate optimization algorithms are efficient and provide a provable guarantee on how close the approximate solution is to the actual optimum of the problem. The guarantee typically comes in the form of an approximation ratio, . A probabilistic approximate optimization algorithm guarantees that it produces a bit-string so that with high probability we have that with a positive {latex} C_\text{max} = \max_{\textbf{x}}C(\textbf{x})
For the problem there is a famous approximate algorithm due to Goemans and Williamson 2 . This algorithm is based on an SDP relaxation of the original problem combined with a probabilistic rounding technique that yields an with high probability approximate solution that has an approximation ratio of . This approximation ratio is actually believed to optimal so we do not expect to see an improvement by using a quantum algorithm.
4. The QAOA algorithm
The Quantum approximate optimization algorithm (QAOA) by Farhi, Goldstone and Gutmann 1 is an example of a heuristic algorithm. Unlike Goemans-Williamson algorithm, QAOA does not come with performance guarantees. QAOA takes the approach of classical approximate algorithms and looks for a quantum analogue that will likewise produce a classical bit string that with high probability is expected to have a good approximation ratio . Before discussing the details, let us first present the general idea of this approach.
4.1 Overview:
We want to find a quantum state , that depends on some real parameters , which has the property that it maximizes the expectation value with respect to the problem Hamiltonian . Given this trial state we search for parameters that maximize {latex} F_p(\vec{\gamma},\vec{\beta}) = \langle \psi_p(\vec{\gamma},\vec{\beta})|H|\psi_p(\vec{\gamma},\vec{\beta})\rangle.
Once we have such a state and the corresponding parameters we prepare the state on a quantum computer and measure the state in the basis {latex} |x \rangle = |x_1,\ldots x_n \rangle to obtain a random outcome .
We will see that this random is going to be a bit string that is with high probability close to the expected value {latex} M_p = F_p(\vec{\gamma}^*,\vec{\beta}^*). Hence, if is close to so is .
4.2 The components of the QAOA algorithm.
4.2.1 The QAOA trial state
Central to QAOA is the trial state that will be prepared on the quantum computer. Ideally we want this state to give a large expectation value {latex} F_p(\vec{\gamma},\vec{\beta}) = \langle \psi_p(\vec{\gamma},\vec{\beta})|H|\psi_p(\vec{\gamma},\vec{\beta})\rangle with respect to the problem Hamiltonian . In Farhi 1, the trial states are constructed from the problem Hamiltonian together with single qubit Pauli rotations. That means, given a problems Hamiltonian
diagonal in the computational basis and a transverse field Hamiltonian
the trial state is prepared by applying alternating unitaries
to the product state with .
This particular ansatz has the advantage that there exists an explicit choice for the vectors such that for {latex} M_p = F_p(\vec{\gamma}^*,\vec{\beta}^*) when we take the limit {latex} \lim_{p \rightarrow \infty} M_p = C_\text{max}. This follows by viewing the trial state as the state that follows from Trotterizing the adiabatic evolution with respect to and the transverse field Hamiltonian , c.f. Ref 1.
Conversely the disadvantage of this trial state is one would typically want a state that has been generated from a quantum circuit that is not too deep. Here depth is measured with respect to the gates that can be applied directly on the quantum chip. Hence there are other proposals that suggest using Ansatz trial state that are more tailored to the Hardware of the quantum chip Ref. 4, Ref. 5.
4.2.2 Computing the expectation value
An important component of this approach is that we will have to compute or estimate the expectation value
so we can optimize the parameters . We will be considering two scenarios here.
Classical evaluation
Note that when the circuit to prepare is not too deep it may be possible to evaluate the expectation value classically.
This happens for instance when one considers for graphs with bounded degree and one considers a circuit with . We will see an example of this in the Qiskit implementation below (section 5.2) and provide an exercise to compute the expectation value.
To illustrate the idea, recall that the Hamiltonian can be written as a sum of individual terms {latex} H = \sum_{k = 1}^m \hat{C}_k. Due to the linearity of the expectation value, it is sufficient to consider the expectation values of the individual summands. For one has that
Observe that with {latex} B = \sum_{i = 1}^n X_i the unitary is actually a product of single qubit rotations about with an angle for which we will write {latex} X(\beta)_k = \exp(i\beta X_k).
All the individual rotations that don't act on the qubits where is supported commute with and therefore cancel. This does not increase the support of the operator . This means that the second set of unitary gates {latex} e^{ -i\gamma_1 H } = \prod_{l=1}^m U_l(\gamma) have a large set of gates {latex} U_l(\gamma) = e^{ -i\gamma_1 \hat{C}_l } that commute with the operator {latex} e^{ i\beta_1 B } \hat{C}_k e^{ -i\beta_1 B }. The only gates {latex} U_l(\gamma) = e^{ -i\gamma_1 \hat{C}_l } that contribute to the expectation value are those which involve qubits in the support of the original .
Hence, for bounded degree interaction the support of {latex} e^{ i\gamma_1 H } e^{ i\beta_1 B } \hat{C}_k e^{ -i\beta_1 B } e^{ -i\gamma_1 H } only expands by an amount given by the degree of the interaction in and is therefore independent of the system size. This means that for these smaller sub problems the expectation values are independent of and can be evaluated classically. The case of a general degree is considered in 1.
This is a general observation, which means that if we have a problem where the circuit used for the trial state preparation only increases the support of each term in the Hamiltonian by a constant amount the cost function can be directly evaluated.
When this is the case, and only a few parameters are needed in the preparation of the trial state, these can be found easily by a simple grid search. Furthermore, an exact optimal value of can be used to bound the approximation ratio
to obtain an estimate of . For this case the QAOA algorithm has the same characteristics as a conventional approximate optimization algorithm that comes with a guaranteed approximation ratio that can be obtained with polynomial efficiency in the problem size.
Evaluation on a quantum computer
When the quantum circuit becomes too deep to be evaluated classically, or when the connectivity of the Problem Hamiltonian is too high we can resort to other means of estimating the expectation value. This involves directly estimating on the quantum computer. The approach here follows the path of the conventional expectation value estimation as used in VQE 4, where a trial state is prepared directly on the quantum computer and the expectation value is obtained from sampling.
Since QAOA has a diagonal Hamiltonian it is actually straight forward to estimate the expectation value. We only need to obtain samples from the trial state in the computational basis. Recall that so that we can obtain the sampling estimate of
by repeated single qubit measurements of the state in the basis. For every bit string obtained from the distribution we evaluate the cost function and average it over the total number of samples. The resulting empirical average approximates the expectation value up to an additive sampling error that lies within the variance of the state. The variance will be discussed below.
With access to the expectation value, we can now run a classical optimization algorithm, such as 6, to optimize the .
While this approach does not lead to an a-priori approximation guarantee for , the optimized function value can be used later to provide an estimate for the approximation ratio .
4.3.3 Obtaining a solution with a given approximation ratio with high probability
The algorithm is probabilistic in nature and produces random bit strings from the distribution . So how can we be sure that we will sample an approximation that is close to the value of the optimized expectation value ? Note that this question is also relevant to the estimation of on a quantum computer in the first place. If the samples drawn from have too much variance, many samples are necessary to determine the mean.
We will draw a bit string that is close to the mean with high probability when the energy as variable has little variance.
Note that the number of terms in the Hamiltonian {latex} H = \sum_{k=1}^m \hat{C}_k are bounded by . Say each individual summand has an operator norm that can be bounded by a universal constant for all . Then consider
where we have used that {latex} \langle \psi_p(\vec{\gamma},\vec{\beta})|\hat{C}_k \hat{C}_l |\psi_p(\vec{\gamma},\vec{\beta})\rangle \leq \tilde{C}^2.
This means that the variance of any expectation is bounded by . Hence this in particular applies for . Furthermore if only grows polynomially in the number of qubits , we know that taking polynomially growing number of samples from will be sufficient to obtain a that leads to an that will be close to .
5. Problems
The QAOA algorithm produces a bit string, is this string the optimal solution for this graph? Compare the experimental results from the superconducting chip with the results from the local QASM simulation.
We have computed the cost function analytically in section 5.2. Verify the steps and compute as well .
We have given an exact expression for in the Qiskit implementation.
Write a routine to estimate the expectation value from the samples obtained in the result
Use an optimization routine,e.g. SPSA from the VQE example in this tutorial, to optimize the parameters in the sampled numerically. Do you find the same values for ?
The Trial circuit in section 5.3 corresponds to depth and was directly aimed at being compatible with the Hardware.
Use the routine from exercise 2 to evaluate the cost functions for . What do you expect to see in the actual Hardware?
Generalize this class of trial state to other candidate wave functions, such as the Hardware efficient ansatz of Ref. 4.
References
Farhi, Edward, Jeffrey Goldstone, and Sam Gutmann. "A quantum approximate optimization algorithm." arXiv preprint arXiv:1411.4028 (2014).
Goemans, Michel X., and David P. Williamson. Journal of the ACM (JACM) 42.6 (1995): 1115-1145.
Garey, Michael R.; David S. Johnson (1979). Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman. ISBN 0-7167-1045-5
Kandala, Abhinav, et al. "Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets." Nature 549.7671 (2017): 242.
Farhi, Edward, et al. "Quantum algorithms for fixed qubit architectures." arXiv preprint arXiv:1703.06199 (2017).
Spall, J. C. (1992), IEEE Transactions on Automatic Control, vol. 37(3), pp. 332–341.
Michael Streif and Martin Leib "Training the quantum approximate optimization algorithm without access to a quantum processing unit" (2020) Quantum Sci. Technol. 5 034008