Path: blob/main/notebooks/intro/grover-intro.ipynb
3855 views
Grover's search algorithm
Search problems
A lot of the problems that computers solve are types of search problems. You’ve probably already searched the web using a search engine, which is a program that builds a database from websites and allows you to search through it. We can think of a database as a program that takes an address as input, and outputs the data at that address. A phone book is one example of a database; each entry in the book contains a name and number. For example, we might ask the database to give us the data in at the 3441st address, and it will return the 3441st name and number in the book.
We call this process of providing an input and reading the output "querying the database". Often in computer science, we consider databases to be black boxes, which means we're not allowed to see how they work; we’ll just assume they're magical processes that do exactly as they promise. We call magical processes like these "oracles".
If we have someone's name and we’re trying to find their phone number, this is easy if the book is sorted alphabetically by name. We can use an algorithm called binary search.
Example: Binary search
Binary search is a very efficient classical algorithm for searching sorted databases. You’ve probably used something similar when searching for a specific page in a book (or even using a physical phone book). Let’s say we want to find Evelina's phone number.
First, we check the middle item in the database and see if it’s higher or lower than the item we’re searching for.
In this case “H” comes after “E”. Since the list is sorted we know that the address of the entry we’re looking for has to be lower than 7. We can ignore any addresses larger than 6 and repeat this algorithm on the reduced list.
This time, the middle entry’s name begins with “D”, which comes before “E”. Now we know our entry must have address higher than 3.
Each step halves the size of list we’re working on, so the search space shrinks exponentially.
Which means that even with very large databases, we can find entries quickly.
Quick quiz
The maximum number of database queries needed grows logarithmically (base 2) with the number of entries in the database.
Using binary search, what's the largest number of queries we'd need to search a sorted database with 1024 entries?
10
1
100
Hint: how many times do you need to halve the database to be left with only one item?
Since binary search grows logarithmically with the size of the database, there isn’t much room for improvement from a quantum computer. But we don’t always have the convenience of searching sorted lists. What if we were instead given a phone number, and we wanted to find the name associated with that number?
This is a lot more difficult, as phone books aren't usually sorted by number. If we assume the phone numbers are ordered randomly in the list, there’s no way of homing in on our target as we did last time. The best we can do with a classical computer is randomly pick an input address, see if it contains the phone number we’re looking for, and repeat until we happen upon the correct entry. For this reason, a lot of work goes into indexing databases to improve search times.
When the database is completely disordered like this, we say it's unstructured. And the quantum algorithm we'll learn about on this page is an algorithm for unstructured search.
Unstructured search
If we search an unstructured database by randomly choosing inputs, how many inputs would we need to check on average before we find the entry we're looking for?
Half the possible inputs.
All the possible inputs.
Three-quarters of the possible inputs.
Using random guessing, how does the average number of database queries needed grow with the number of entries in the database?
Linearly.
Logarithmically.
Quadratically.
Exponentially.
It may seem that we can't possibly do better than random guessing here; we don't have any idea where the correct entry will be in the database, and each incorrect query only rules out one entry.
For classical computers, our intuition is correct, but if our database can input and output quantum superpositions, it turns out we can do better than random guessing! On this page we will learn about our first quantum algorithm: Grover's quantum search algorithm. When searching any database (structured or unstructured), Grover's algorithm grows with the square root of the number of inputs, which for unstructured search is a quadratic improvement over the best classical algorithm.
Beyond black boxes
Search algorithms can search databases of collected information such as phone books, but they can also do more than that. If we can make a problem look like a database search problem, then we can use a search algorithm to solve it. For example, let’s consider the problem of solving a sudoku. If someone claims to have solved a sudoku, you can check if it’s solved pretty quickly: You check along each row, check along each column, check each square, and you’re finished. In this sense, you are the database, and the person that gave you the solution is querying you. They are trying to find the input that returns the information “yes this is a valid solution”.
In fact, we can present a lot of computational problems as "find the input that results in a certain output".
One example of a problem we can solve like this is the Boolean satisfiability problem (known as 'SAT').
SAT problems
SAT problems are widely studied in computer science, and lots of other computing problems can be converted to SAT problems. In this page we will use Grover’s algorithm to solve a simple SAT problem, and you can use the skills you learn here to apply quantum search algorithms to other problems.
A solution to a SAT problem is a string of bits, which makes it easy to map to a quantum circuit. The problem itself is essentially a bunch of conditions (we call them clauses) that rule out different combinations of bit values. For example, if we had three bits, one of the clauses might be "You can’t have the zeroth bit ON and the first bit OFF", which would rule out the combinations 101 and 001 as valid solutions.
Here's a file that encodes a "3-SAT" problem, which is a SAT problem where every clause refers to exactly 3 bits, and one of these bit conditions in each clause must be satisfied.
Example 3-SAT problem
Here is an example of a 3-SAT problem, stored in a file format called "DIMACS CNF". These files are very simple and are just one way of storing SAT problems.
ParseError: KaTeX parse error: Undefined control sequence: \cssId at position 1: \̲c̲s̲s̲I̲d̲{_dimacs-c}{\te…
ParseError: KaTeX parse error: Undefined control sequence: \cssId at position 1: \̲c̲s̲s̲I̲d̲{_dimacs-proble…
ParseError: KaTeX parse error: Undefined control sequence: \cssId at position 1: \̲c̲s̲s̲I̲d̲{_dimacs-clause…
ParseError: KaTeX parse error: Undefined control sequence: \cssId at position 1: \̲c̲s̲s̲I̲d̲{_dimacs-clause…
ParseError: KaTeX parse error: Undefined control sequence: \cssId at position 1: \̲c̲s̲s̲I̲d̲{_dimacs-clause…
Like with the sudoku, it’s easy to check if a bit string is a valid solution to a SAT problem; we just look at each clause in turn and see if our string disobeys any of them. In this course, we won’t worry about how we do this in a quantum circuit. Just remember we have efficient classical algorithms for checking SAT solutions, and for now we’ll just use Qiskit’s built-in tools to build a circuit that does this for us.
We've saved this file under examples/3sat.dimacs (relative to the code we're running).
We can use Qiskit's circuit library to build a circuit that does the job of the oracle we described above (we'll keep calling this circuit the 'oracle' even though it's no longer magic and all-powerful).
This circuit above acts similarly to the databases we described before. The input to this circuit is a string of 3 bits, and the output given depends on whether the input string is a solution to the SAT problem or not.
The result of this checking computation will still be either True or False, but the behaviour of this circuit is slightly different to how you might expect. To use this circuit with Grover's algorithm, we want the oracle to change the phase of the output state by 180° (i.e. multiply by -1) if the state is a solution. This is why Qiskit calls the class 'PhaseOracle'.
$$ U_\text{oracle}|x\rangle = \bigg\{ \begin{aligned} \phantom{-}|x\rangle & \quad \text{if $xParseError: KaTeX parse error: Expected 'EOF', got '}' at position 19: … not a solution}̲ \\ -|x\rangle …xParseError: KaTeX parse error: Expected 'EOF', got '}' at position 15: is a solution}̲ \\ \end{aligne…$
For example, the only solutions to this problem are 000, 011, and 101, so the circuit above has this matrix:
To summarise:
There are problems for which it's easy to check if a proposed solution is correct.
We can convert an algorithm that checks solutions into a quantum circuit that changes the phase of solution states
We can then use Grover's algorithm to work out which states have their phases changed.
In this sense, the database or oracle is the problem to be solved.
Overview of Grover's algorithm
Now we understand the problem, we finally come to Grover’s algorithm. Grover’s algorithm has three steps:
The first step is to create an equal superposition of every possible input to the oracle. If our qubits all start in the state , we can create this superposition by applying a H-gate to each qubit. We’ll call this equal superposition state ''.
The next step is to run the oracle circuit () on these qubits. On this page, we'll use the circuit (
oracle) Qiskit created for us above, but we could use any circuit or hardware that changes the phases of solution states.The final step is to run a circuit called the 'diffusion operator' or 'diffuser' () on the qubits. We'll go over this circuit when we explore Grover's algorithm in the next section, but it's a remarkably simple circuit that is the same for any oracle.
We then need to repeat steps 2 & 3 a few times depending on the size of the circuit. Note that we query the oracle in step #2, so the number of queries is roughly proportional to the square root of the number of possible inputs. If we repeat 2 & 3 the right amount of times, then when we measure, we'll have a high chance of measuring a solution to the oracle.

Next, we can again use Qiskit's tools to create a circuit that does steps 2 & 3 for us.
And we can combine this into a circuit that performs Grover's algorithm. Here, we won't repeat steps 2 & 3 as this is a small problem and doing them once is enough.
Finally, let's run this on a simulator and see what results we get:
We have a high probability of measuring one of the 3 solutions to the SAT problem.
Quick quiz
Which of these bit strings is a solution to the SAT problem solved by this quantum circuit?
011
001
010
110
How does Grover's algorithm work?
We’ve learnt about search problems, and seen Grover’s algorithm used to solve one. But how, and why, does this work?
Visualising Grover's algorithm
Grover’s algorithm has a nice geometric explanation. We’ve seen that we can represent quantum states through vectors. With search problems like these, there are only two vectors we care about: The solutions, and everything else. We'll call the superposition of all solution states '', so for the SAT problem above:
and we'll call the superposition of every other state '':
The plane
Since the two vectors and don't share any elements, they are perpendicular, so we can draw them at right angles on a 2D plane. These will be our y- and x-axes, respectively.
Step 1
Let's plot the states of our quantum computer on this plane at different points in the algorithm. The first state we'll plot is . This is the state after step 1 (the initialisation step). This state is an equal superposition of all computational basis states. Since any possible state is either a solution or not a solution, we can write as some combination of and , so it sits between them on the our plane.
Step 1
For difficult problems, we'd expect there to be lots of possible inputs, but only a small number of solutions. In this case, would be much closer to than (i.e. the angle, , between them is small), so it's unlikely that measuring would give one of the computational basis states that make up . Our goal is to end up with the computer in a state as close to as possible.
Step 2
Next we pass our qubits through the circuit . We saw above that, by definition, flips the phase of all solution states. In our diagram, this is a reflection through the vector . I.e.:
Step 3
We've just seen that we can reflect through the vector , so is there another vector could we reflect through that would move our state closer to ? The answer is 'yes', we can reflect through the vector . It may not be obvious at first how we can create a circuit that does this, but it's a relatively simple operation that we'll cover later in this page.
Finish (or repeat)
Now our state vector is closer to than before, which means we have a higher chance of measuring one of our solution states. If there is only one solution, we need to repeat steps 2 & 3 ~ times to have the highest probability of measuring that solution.
How many times do we need to query the oracle?
To work this out, we'll have to work how much each iteration rotates our state towards . Let's say we're somewhere in the middle of our algorithm, the state of our computer () is an angle from the starting state . The angle between and is .
The oracle reflects the state vector of our computer around , so the angle between our new, reflected state vector () and will also be .
Next we reflect through . The angle between the state of our computer () and is .
So, after one iteration, we know the angle between the state of our computer and is also .
Which means each iteration rotates the state of our computer towards by .
Now we just need to work out how many lots of fit into a right angle, and this will be roughly the number of iterations needed to rotate into .
So what's the angle in terms of ? With a bit of trigonometry, we know that is equal to the component of , divided by the length of (which is 1). If there's only one solution state, then . So .
Finally, for difficult problems, will be very small, which means we can use the small angle approximation to say radians.
Since, for small , we want to rotate around radians, this means we need to do roughly iterations. Since we query the oracle once per iteration, the number of oracle queries needed is proportional to , if there is exactly one solution.
Quick quiz
For an oracle with many possible inputs and exactly one solution, . What approximate value would have if there were two solutions?
Circuits for Grover's algorithm
To round off the chapter, we’ll create a simple circuit from scratch that implements Grover’s algorithm, and show it works. We’ll use two qubits, and we’ll start by creating an oracle circuit.
The oracle
To keep things simple, we're not going to solve a real problem here. For this demonstration, we'll create a circuit that flips the phase of the state and leaves everything else unchanged. Fortunately, we already know of a two-qubit gate that does exactly that!
Here's a short function to show the matrix representation of this circuit:
Try it
Can you create 3 more oracle circuits that instead target the other 3 computational basis states (, and )? Use display_unitary to check your answer.
Hint: Try to create circuits that transform to and from the basis state you're targeting, can you then use these circuits with the cz gate?
Creating the diffuser
Next we'll create a diffuser for two qubits. Remember that we want to do a reflection around the state , so let's see if we can use the tools we already have to build a circuit that does this reflection.
We've already seen that the cz gate does a reflection around (up to a global phase), so if we know the transformation that maps , we can:
Do the transformation
Reflect around (i.e the
czgate)Do the transformation
We know that we can create the state from the state by applying a H-gate to each qubit. Since the H-gate is its own inverse, applying H-gates to each qubit also does .
Now we need to work out how we transform .
Quick quiz
Which of these gates transforms ?
x
z
h
s
So applying an X-gate to each qubit will do the transformation . Let's do that:
Now we have the transformation , we can apply our cz gate and reverse the transformation.
Putting it together
We now have two circuits, oracle and diffuser, so we can put this together into a circuit that performs Grover's algorithm. Remember the three steps:
Initialise the qubits to the state
Perform the oracle
Perform the diffuser
And when we simulate, we can see a 100% probability of measuring , which was the solution to our oracle!
Try it
Try replacing the oracle in this circuit with the different oracles you created above. Do you get the expected result?
SAT problems are hard
Random guessing grows linearly with the number of entries in the database, which isn’t actually too bad (although we know we can do much better). But we usually measure how algorithms grow by their input length in bits, so how do these two connect? Each extra variable (bit) in our SAT problem doubles the number of possible solutions (i.e. entries to our database), so the search space grows exponentially with the number of bits.
ParseError: KaTeX parse error: Undefined control sequence: \cssId at position 1: \̲c̲s̲s̲I̲d̲{Big-N}{N} = 2^…Since random guessing grows linearly with , the running time will grow by roughly .
Quick quiz
How does the running time of Grover's algorithm grow with the number of input bits (when there is only one solution)?
Making use of structure
So far, we’ve treated SAT problems as if they’re completely unstructured, but unlike the unsorted phone book, we do have some clues that will help us in our search. A SAT problem isn’t a black box, but a set of individual clauses, and we can use these clauses to home in on a correct answer. We won’t get anything nearly as efficient as binary search, but it’s still much better than random guessing. One (classical) algorithm that uses the structure of SAT problems is Schöning’s algorithm.
Like random guessing, Schöning’s algorithm chooses an input at random and checks if it works. But unlike random guessing, it doesn’t just throw this string away. Instead, it picks an unsatisfied clause and toggles a bit in the string to satisfy that clause. Annoyingly, this new string might un-satisfy a different, previously-satisfied clause, but on average it's beneficial to keep toggling bits in this manner a few times. If the initial guess was close enough, there’s a fair chance we’ll stumble upon the correct solution. If not, then after some number of steps, the computer starts again with a new completely random guess. It turns out for 3-SAT (although not (>3)-SAT), this algorithm grows with roughly , which not only beats random guessing, but also beats Grover's algorithm!
It may not be obvious at first glance, but we can actually combine Grover and Schöning's algorithms to get something even better than either individually. If you create a circuit that carries out the bit-toggling part of Schöning's algorithm, you can use this as the oracle and use Grover's algorithm to find the best "initial guess". We won't go into it in this course, but it's a fun project to investigate it!