Full Lecture Notes on Quantum Algorithms
5 Quantum machine learning  32
5.1 A brief overview of classical machine learning  32
5.1.1 Types of machine learning  32
5.1.2 Neural networks  33
5.1.3 Training neural networks  34
5.1.4 Reasons for the success of classical machine learning  34
5.2 Quantum machine learning using qBLAS  35
5.3 Quantum support vector machines  35
5.3.1 Support vector machines  35
5.3.2 Classical computation  35
5.3.3 Quantum computation  37
5.4 Quantum principal component analysis  38
5.5 Quantum neural networks  39
5.5.1 Quantum feedforward neural networks  39
5.5.2 Quantum convolutional neural networks  40
5.5.3 Quantum Boltzmann machines  40
8.3.6 Example of the solution of a practical problem on a quantum annealer: Flight-gate assignment  62
8.4 Quantum Approximate Optimization Algorithm (QAOA)  66
8.4.1 Introduction to QAOA: From the quantum adiabatic algorithm to QAOA  66
8.4.2 The Quantum Approximate Optimization Algorithm for solving Max-cut  68
8.4.3 Other interesting remarks and extensions of QAOA  71
8.4.4 More on the relation between QAOA and quantum annealing  71
A Quantization of the electromagnetic field in a cavity  123
A.1 Quantizing the electromagnetic field  123
Chapter 1
In this course, we will give an overview of various approaches to quantum computation, reflecting many of
the latest developments in the field. We will cover several different models of quantum computation, from
the foundational circuit model through measurement-based and adiabatic quantum computation to boson
sampling. We will discuss quantum computation with both discrete and continuous variables. When it
comes to the algorithms that we study, they include both classics like Shor’s algorithm and newer, heuristic
approaches like the quantum approximate optimization algorithm (QAOA). We will also see how quantum
computing can be combined with machine learning.
We assume that the students taking this course already have some familiarity with quantum physics
(superposition, entanglement, etc.) and some basic concepts in quantum computation. We will repeat some
of these basic concepts at the beginning of the course, but perhaps give a more thorough justification for
why they can be used in quantum computation.
In this first chapter, we will study the circuit model of quantum computation. This introduces quantum
bits, quantum gates, and other components in close similarity with concepts in classical computing and gives
us the tools to begin investigating whether quantum computers can ever outperform classical computers.
For this chapter, we have borrowed parts from Refs. [Nielsen and Chuang, 2000, Aaronson, 2018, Kockum
and Nori, 2019].
• State preparation to initialize the qubits in the input state we need to begin the computation.
Figure 1.1: The Bloch-sphere representation of a qubit state. The north pole is the ground state |0⟩ and the south pole is the excited state |1⟩. To convert an arbitrary superposition of |0⟩ and |1⟩ to a point on the sphere, the parametrization |ψ⟩ = cos(ϑ/2)|0⟩ + e^{iφ} sin(ϑ/2)|1⟩ is used.
We assume for now that it is possible to initialize our quantum computer in some simple state, and that we can read out the state of the qubits at the end of a computation.
If there are N qubits in a system, the total state of that system can be a superposition of 2^N different states: |000 . . . 00⟩, |100 . . . 00⟩, |010 . . . 00⟩, . . ., |111 . . . 10⟩, |111 . . . 11⟩. Note that N classical bits can be in one of these 2^N states, but not in a superposition of several of them. To store all the information about a general N-qubit state, one needs to keep track of the 2^N amplitudes in the superposition. This means that at least 2^N classical bits are required to represent the quantum system. This explains why it is hard for classical computers to simulate some quantum systems, and gives a first hint that quantum computers can be more powerful than classical ones (at least when it comes to simulating quantum systems).
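As a small illustration of this counting, the following numpy sketch (a minimal example, not a prescribed implementation) builds an N-qubit product state via tensor products; the length of the resulting vector is the 2^N amplitudes a classical description must track.

import numpy as np

ket_plus = np.array([1, 1], dtype=complex) / np.sqrt(2)

def product_state(single_qubit_states):
    # Tensor together single-qubit states into one state vector.
    state = np.array([1], dtype=complex)
    for s in single_qubit_states:
        state = np.kron(state, s)
    return state

for N in [1, 2, 5, 10, 20]:
    psi = product_state([ket_plus] * N)
    print(N, psi.size)   # the classical description stores 2**N complex amplitudes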
There are many physical implementations of qubits, e.g., superconducting qubits, trapped ions, natural
atoms, etc. These implementations are a topic for another course. In the following, we assume that we have
access to qubits, but do not care much about how they are made.
The Pauli matrices generate rotations around the corresponding axes on the Bloch sphere when exponentiated, e.g.,
R_x(ϑ) = exp(−iϑX/2) = cos(ϑ/2) I − i sin(ϑ/2) X = ( cos(ϑ/2)  −i sin(ϑ/2) ; −i sin(ϑ/2)  cos(ϑ/2) ).   (1.7)
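Equation (1.7) can also be checked numerically; the following sketch (assuming scipy is available) compares the matrix exponential exp(−iϑX/2) with the closed-form rotation matrix.

import numpy as np
from scipy.linalg import expm

X = np.array([[0, 1], [1, 0]], dtype=complex)
I = np.eye(2, dtype=complex)

def rx(theta):
    # Closed form of Eq. (1.7): cos(theta/2) I - i sin(theta/2) X.
    return np.cos(theta / 2) * I - 1j * np.sin(theta / 2) * X

theta = 0.73   # arbitrary rotation angle
assert np.allclose(expm(-1j * theta * X / 2), rx(theta))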
This gate transforms the qubit state from the |0i , |1i basis to the |+i , |−i basis. The Hadamard is often
used to create superposition states at the beginning of a quantum algorithm.
The Z gate applies a phase factor −1 to the |1⟩ part of the qubit state. Two gates that apply other phase factors are often given their own names. One is the T, or π/8, gate:
T = ( 1  0 ; 0  e^{iπ/4} ) = e^{iπ/8} ( e^{−iπ/8}  0 ; 0  e^{iπ/8} ).   (1.10)
The other is the phase, or S, or P, gate
S = ( 1  0 ; 0  i ) = T^2.   (1.11)
An important example of a two-qubit gate is the controlled-NOT (CNOT) gate:
CNOT =
( 1  0  0  0 )
( 0  1  0  0 )
( 0  0  0  1 )
( 0  0  1  0 ).   (1.12)
Note that the two-qubit Hilbert space is spanned by the basis vectors |00⟩, |01⟩, |10⟩, and |11⟩, in that order. This is the tensor product of the two single-qubit Hilbert spaces. The action of the CNOT gate is thus to do nothing if the first qubit is in state |0⟩ (|00⟩, |01⟩ changes to |00⟩, |01⟩), and to apply NOT to the second qubit if the first qubit is in state |1⟩ (|10⟩, |11⟩ changes to |11⟩, |10⟩).
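The action of CNOT on the computational basis can be verified with a few lines of numpy (an illustrative sketch; basis ordering |00⟩, |01⟩, |10⟩, |11⟩ as above):

import numpy as np

CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

labels = ["00", "01", "10", "11"]   # basis ordering used in Eq. (1.12)
for i, label in enumerate(labels):
    ket = np.zeros(4, dtype=complex)
    ket[i] = 1
    out_label = labels[int(np.argmax(np.abs(CNOT @ ket)))]
    print(f"|{label}> -> |{out_label}>")   # |10> -> |11>, |11> -> |10>, others unchanged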
The controlled-Z (CZ) gate can be defined in the same manner:
CZ =
( 1  0  0  0 )
( 0  1  0  0 )
( 0  0  1  0 )
( 0  0  0  −1 ).   (1.13)
More generally, the controlled application of a single-qubit unitary U to the second qubit takes the form
( I_2  0_2 ; 0_2  U ).   (1.14)
There are also two-qubit gates that are not controlled operations. For example, the SWAP gate
SWAP =
( 1  0  0  0 )
( 0  0  1  0 )
( 0  1  0  0 )
( 0  0  0  1 )   (1.15)
swaps the states |01⟩, |10⟩ to |10⟩, |01⟩.
It is also possible to define gates involving more than two qubits. For three-qubit gates, the most well-known ones are the Toffoli and Fredkin gates. The Toffoli gate is a controlled-controlled-NOT (CCNOT), i.e., the state of the third qubit is flipped if and only if both the first two qubits are in state |1⟩:
Toffoli =
( 1  0  0  0  0  0  0  0 )
( 0  1  0  0  0  0  0  0 )
( 0  0  1  0  0  0  0  0 )
( 0  0  0  1  0  0  0  0 )
( 0  0  0  0  1  0  0  0 )
( 0  0  0  0  0  1  0  0 )
( 0  0  0  0  0  0  0  1 )
( 0  0  0  0  0  0  1  0 ).   (1.16)
The Fredkin gate is a controlled-SWAP (CSWAP) gate, swapping the states of the second and third qubits if and only if the state of the first qubit is |1⟩:
Fredkin =
( 1  0  0  0  0  0  0  0 )
( 0  1  0  0  0  0  0  0 )
( 0  0  1  0  0  0  0  0 )
( 0  0  0  1  0  0  0  0 )
( 0  0  0  0  1  0  0  0 )
( 0  0  0  0  0  0  1  0 )
( 0  0  0  0  0  1  0  0 )
( 0  0  0  0  0  0  0  1 ).   (1.17)
Most practical implementations of quantum computing have limited connectivity between qubits, only allowing for pairwise interactions between nearest-neighbour qubits. This can prohibit direct implementations of multi-qubit gates with three or more qubits, but it turns out that any multi-qubit gate can be decomposed into a number of single- and two-qubit gates.
• The Gottesman-Knill theorem
If we take our gate set to be {CNOT, H, S}, we circumvent all the objections above. However, it turns out that this still is not enough to achieve universal quantum computation. This is the
Gottesman-Knill theorem: A quantum circuit using only the following elements can be simulated efficiently on a classical computer:
• preparation of qubits in computational-basis states,
• gates from the Clifford group (Hadamard, phase gate S, CNOT, and the Pauli gates), and
• measurements in the computational basis (possibly conditioned on the classical outcomes of earlier measurements).
The Clifford group is the group of operations that map Pauli matrices onto (possibly different) Pauli
matrices, and is generated by the gates (H, CNOT, S). Quantum circuits with only these gates are also
called stabilizer circuits (we will return to this concept in Chapter 3, where we will give a proof of the
Gottesman-Knill theorem and discuss quantum error correction). If one starts in the computational
basis, the single-qubit states that can be reached using these circuits are the ±1 points on the x, y, and z axes of the Bloch sphere. This restriction of the state space turns out to make the circuit classically simulatable even though the state space contains states with superposition, entanglement, and complex amplitudes.
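The defining property of the Clifford group, that conjugating a Pauli operator by a Clifford gate gives back a Pauli operator, is easy to check numerically; a minimal sketch:

import numpy as np

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S = np.array([[1, 0], [0, 1j]], dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Conjugating a Pauli by a Clifford gate gives back a (possibly different) Pauli.
assert np.allclose(H @ X @ H.conj().T, Z)
assert np.allclose(H @ Z @ H.conj().T, X)
assert np.allclose(S @ X @ S.conj().T, Y)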
What is then a universal gate set? Actually, replacing the Hadamard gate in the last gate set considered
above, {CNOT, H, S}, with almost any other gate makes the set universal. We can also replace S with some
other gate. One universal gate set is {CNOT, H, T }. Almost any two-qubit gate on its own is also universal.
We thus now know that we can implement any unitary operation on N qubits in a realistic experimental
architecture that allows for a few single- and two-qubit gates. Does that mean that we can run quantum
algorithms that are faster than classical algorithms for some problems? This will be discussed in the next
chapter (and many other chapters after that).
Chapter 2
Could a quantum computer find the answer to a problem that cannot be solved on a classical computer,
even if there are no restrictions on the resources available to the classical computer? The answer is no.
The classical computer could always simulate the quantum computation by storing, to enough precision,
the 2^N amplitudes for the whole state of the quantum computer’s N qubits, and changing these amplitudes
according to the gates applied in the quantum algorithm. However, there is clearly a possibility that this
classical simulation requires a lot more resources than the quantum computation itself. So a better question
to ask is: are there problems that a quantum computer could solve faster than a classical computer? That
question is the topic of this chapter.
Solovay–Kitaev theorem: Let G be a finite subset of SU(2) and U ∈ SU(2). If the group generated by G is dense in SU(2), then for any ε > 0 it is possible to approximate U to precision ε using O(log^4(1/ε)) gates from G.
The proof can be found in Ref. [Nielsen and Chuang, 2000]. What the theorem essentially says is that if we have a gate set that can approximate any single-qubit rotation, then we only need relatively few gates from that set to do the approximation, so we can do the approximation fast. By adding some two-qubit gate to make the gate set universal, it can be shown that the total number of gates needed to approximate an arbitrary unitary on N qubits is at most O(4^N log^4(1/ε)). More recent results have shown that for certain gate sets, the number of gates can be brought down from O(log^4(1/ε)) to O(log(1/ε)). There are also good algorithms for finding the gate sequences that realize these approximations.
So far, we have seen that quantum computing has potential. We now know how to find a universal gate set to approximate any unitary transformation, and we know that we can implement that approximation efficiently. This means that we are ready to compare how fast classical and quantum computers are at solving different types of problems: we can define quantum complexity classes.
• P (polynomial): The set of decision problems solvable in polynomial (poly) time by a DTM.
• NP (non-deterministic polynomial): The set of decision problems whose solutions can be verified in
polynomial time by a DTM.
• NP-hard: The set of decision problems whose solutions allow solving all problems in NP.
• #P: The set of problems that count the number of solutions of NP problems.
• #P-hard: The set of problems whose solutions allow solving all other problems in #P.
It is known that P ⊆ NP, but the question of whether the inclusion is strict (and hence whether P ≠ NP) stands as one of the most important open problems in science.
A concept that is often mentioned in connection to P and NP is the polynomial hierarchy (PH). It is a
hierarchy of complexity classes that generalizes the classes P, NP to the case in which oracles are accessible.
An oracle is a black box that can output the solution of a decision problem contained in a given complexity
class using a single call. For example, A^B is the set of decision problems in class A that are solvable in polynomial time by a DTM augmented by an oracle for some complete problem in class B.
The first level of the PH is the class P; in symbols, Σ_0 = P. Successive levels are defined recursively, with each new level given access to an oracle for the previous one: Σ_{k+1} = NP^{Σ_k}. A problem is in the polynomial hierarchy if it is in some Σ_k, i.e., the polynomial hierarchy is the union of all Σ_k.
Analogously to what was said above concerning the relation between P and NP, it is known that Σ_k ⊆ Σ_{k+1}, i.e., higher levels of the PH contain lower levels, and it is strongly believed that the inclusion is strict, namely that Σ_k ≠ Σ_{k+1}. If there is a k for which Σ_k = Σ_{k+1}, the PH is said to collapse to level k. It can be shown that if a collapse occurs at level k, then for all k′ > k it would hold that Σ_{k′} = Σ_k, which justifies the terminology “collapse”.
2.2.2 Complexity classes for a probabilistic Turing machine
A PTM is much like a DTM, but it makes random choices of the state of the finite state machine when
reading from the tape, and it traverses all states in a random sequence. This randomness means that a PTM
will not get stuck away from a solution, which could happen for a DTM. This leads to the definition of the following complexity classes:
• BPP (bounded probabilistic polynomial): The class of decision problems that a PTM solves in poly-
nomial time with an error probability bounded away from (i.e., strictly less than) 1/3 for all instances.
• PP (probabilistic polynomial): The class of decision problems that a PTM solves in polynomial time with an error probability strictly less than 1/2 for all instances.
It is believed that a PTM can be simulated by a DTM with only polynomial overhead, i.e., that BPP = P.
• BQP (bounded quantum polynomial): This is the quantum analogue of BPP. Intuitively, BQP is the
class of problems that can be solved using at most a polynomial number of gates, with at most 1/3
probability of error.
• QMA-hard: The set of decision problems whose solutions allow solving all problems in QMA.
BQP is the class we refer to when we talk about problems efficiently solved by a universal quantum computer. Note that we do not have to specify which gates the definition is based upon, as long as they constitute a universal set. Thanks to the Solovay-Kitaev theorem, using one universal set or another merely results in a polylogarithmic overhead; this cost is dominated by a polynomial function.
Quantum computing subsumes classical computing. In terms of complexity classes, this is summarized by the statement BPP ⊆ BQP.
• If the answer is yes, then the second qubit has at least 2/3 probability of being measured 1, conditioned
on the first qubit having been measured 1.
• If the answer is no, then the second qubit has at most 1/3 probability of being measured 1, conditioned
on the first qubit having been measured 1.
• On any input, the first qubit has a nonzero probability of being measured 1. This condition can actually
be refined to an n-dependent probability.
Denoting by q_o (q_c) the output (post-selected) qubit, the relevant mathematical object is the conditional probability, which by definition reads
P(q_o = 1 | q_c = 1) = P(q_o = 1 ∧ q_c = 1) / P(q_c = 1).   (2.2)
Intuitively, the power of PostBQP relies upon the denominator P (qc = 1): since it can be arbitrarily low, it
may compensate for very unlikely events corresponding to the solution.
We now want to be more specific about the success probability P (+1 ). The Solovay-Kitaev theorem (see
above) actually sets a lower bound on the acceptable probabilities: it lets us approximate any desired unitary
within exponentially small error for only a polynomial increase in the circuit size. In other words, for an
exponentially unlikely probability, the theorem still ensures that arbitrary universal gate sets can be used
for polynomially long computations like BQP circuits—since a polynomial overhead remains in the BQP
class. And indeed the class PostBQP is based upon BQP circuits. Thus it is well-defined only if the relevant
output probabilities are at worst exponentially unlikely:
P(+1) ≳ 1/2^n.   (2.3)
It has been shown in [Kuperberg, 2015] that this condition was fulfilled whenever “reasonable” universal gate
sets were considered.
Additionally, suppose now that there is a polynomial p(n) such that P(+1) ≥ 1/p(n). In that case P(+1) is only polynomially unlikely. Then running the BQP circuit p(n) times more would still correspond to a polynomial-time computation and remain in BQP. On the other hand, such redundancy would enable recording enough statistics to simulate the quantum post-selection through classical postprocessing. Hence conditioning on an event whose probability scales as 1/p(n) does not give any power to the post-selection. So P(+1) has to be worse than polynomially unlikely.
Following the discussion in [Aaronson, b], the definition of the class PostBQP could be refined to account
for this feature: the conditioning probability P (+1 ) scales as the inverse of an exponential function,
P(+1) ∼ 1/2^n,   (2.4)
up to some scaling factor irrelevant in terms of computational classes.
Review
We have now seen that universal quantum computation can be performed with a rather small and simple set
of quantum gates, and that there is good reason to believe that a universal quantum computer will be able
to solve some problems faster than classical computers can. However, a crucial question remains to answer
before we go ahead and invest large resources into actually building a quantum computer: can a quantum
computer deal with errors?
If even a small error can wreck a quantum computation beyond repair, a practical quantum computer
can never be realized. In today’s classical computers, the probability p of an error occurring during a logical
operation is amazingly low. Numbers on the order of p ≈ 10−18 are often quoted. For state-of-the-art
quantum computers, however, p is rather on the order of 10−2 or, in the best case, 10−3 . In this chapter, we
show that quantum error correction is indeed possible and feasible, even with error rates close to what we
have today. For a more detailed account, see Chapter 10 in Ref. [Nielsen and Chuang, 2000].
• The no-cloning theorem [Wootters and Zurek, 1982, Dieks, 1982]. Quantum mechanics prohibits the existence of a unitary operation U that can change a known state |ϕ⟩ into a copy of an unknown arbitrary quantum state |ψ⟩ without perturbing the latter, i.e., U|ϕ⟩|ψ⟩ = |ψ⟩|ψ⟩. For the classical error correction described above, such cloning was essential for encoding.
• Measuring a quantum state causes it to collapse into an eigenstate of the measured observable. How to correct errors on an arbitrary superposition state α|0⟩ + β|1⟩ without disturbing the state when performing a measurement on it?
• In a classical bit, there is only one possible error: a bit flip, taking the bit from 0 to 1 or vice versa. For
a quantum bit, there are infinitely many possible errors: any single-qubit operation, i.e., any rotation
around any axis of the Bloch sphere by any angle, could possibly be induced by some external error
source. How to make an error-correction procedure general enough to be able to deal with all these
possible errors?
Figure 3.1: The three-qubit bit-flip error-correction code. In the first step, CNOT gates are applied to produce the state |ψ_3⟩ from |ψ⟩ [see Eq. (3.2)]. After a bit-flip error occurs, parity measurements are done on pairs of qubits and correcting bit flips are applied to the qubits depending on the measurement results (−1 means the qubits are in opposite states, +1 that they are in the same state). Figure from Ref. [Kockum, 2014].
To protect a general single-qubit state |ψ⟩ = α|0⟩ + β|1⟩ from bit-flip errors, i.e., from an unwanted application of the X gate, we encode it using three qubits as
|ψ_3⟩ = α|000⟩ + β|111⟩.   (3.2)
Note that this state can be created by application of CNOT gates, which is different from cloning the state; cloning (if it had been possible) would have resulted in |ψ_3⟩ = (α|0⟩ + β|1⟩)^{⊗3}.
Now, let us assume that the third qubit is flipped. This gives
|ψ_{3,err}⟩ = α|001⟩ + β|110⟩.   (3.3)
If we first perform a parity measurement on qubits 1 and 2, i.e., measure the product Z_1Z_2, and then measure the parity of qubits 2 and 3, i.e., Z_2Z_3, we would not affect the state |ψ_3⟩, since Z_1Z_2|ψ_3⟩ = |ψ_3⟩ = Z_2Z_3|ψ_3⟩. Similarly, performing these measurements on |ψ_{3,err}⟩ does not change the coefficients α and β determining the superposition; it just adds a global phase factor. However, the result of the measurements lets us draw the conclusion that qubit 3 has been flipped (assuming that the probability of more than one qubit flipping is negligible). We can then apply an X gate to this qubit, flipping it back to its original state and recovering |ψ_3⟩. As shown in Fig. 3.1, the same set of measurements also lets us correct the state if a bit-flip error
instead occurs on qubit 1 or 2. We have thus managed to circumvent the problem of quantum measurements
being projective.
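The whole procedure can be simulated in a few lines of numpy. The sketch below (an illustration with arbitrarily chosen amplitudes) encodes α|0⟩ + β|1⟩ as α|000⟩ + β|111⟩, applies a bit flip to one qubit, reads off the Z_1Z_2 and Z_2Z_3 parities, and applies the correcting X gate:

import numpy as np

def kron(*ops):
    out = np.array([[1]], dtype=complex)
    for op in ops:
        out = np.kron(out, op)
    return out

I = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

alpha, beta = 0.6, 0.8                        # arbitrary normalized amplitudes
psi3 = np.zeros(8, dtype=complex)
psi3[0b000], psi3[0b111] = alpha, beta        # encoded state alpha|000> + beta|111>

X_on = [kron(X, I, I), kron(I, X, I), kron(I, I, X)]
psi_err = X_on[2] @ psi3                      # bit-flip error on qubit 3

# The erred state is an eigenstate of both parity operators, so the expectation
# value equals the measurement outcome.
Z1Z2, Z2Z3 = kron(Z, Z, I), kron(I, Z, Z)
s1 = int(round(np.real(psi_err.conj() @ Z1Z2 @ psi_err)))   # parity of qubits 1 and 2
s2 = int(round(np.real(psi_err.conj() @ Z2Z3 @ psi_err)))   # parity of qubits 2 and 3

# Syndrome table: (+1,+1) no error, (-1,+1) qubit 1, (-1,-1) qubit 2, (+1,-1) qubit 3.
syndrome = {(1, 1): None, (-1, 1): 0, (-1, -1): 1, (1, -1): 2}
flipped = syndrome[(s1, s2)]
corrected = X_on[flipped] @ psi_err if flipped is not None else psi_err
assert np.allclose(corrected, psi3)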
What happens if the error is an arbitrary X rotation R_x(ϑ) on the third qubit instead of just X? As we can see from Eq. (1.7), the resulting state would then be
cos(ϑ/2)|ψ_3⟩ − i sin(ϑ/2)|ψ_{3,err}⟩.   (3.4)
Measuring the observables Z_1Z_2 and Z_2Z_3 will either project this state into |ψ_3⟩ or |ψ_{3,err}⟩ (modulo a global phase). In both cases, we will know what operation to apply to recover |ψ_3⟩. This demonstrates how quantum error-correcting codes deal with the continuum of possible errors: measurements are used to project the perturbed state into a finite set of states from which we know how to recover the original state.
By instead using the encoding
|ψ_3⟩ = α|+ + +⟩ + β|− − −⟩,   (3.5)
we achieve an encoded three-qubit state where a Z error (a phase flip) takes one of the qubits from one of its basis states to the other. This encoding is implemented by adding a Hadamard gate on each qubit at the end of the encoding step in Fig. 3.1.
To detect a phase-flip error in one of the three qubits, we measure the products H^{⊗3}Z_1Z_2H^{⊗3} = X_1X_2 and H^{⊗3}Z_2Z_3H^{⊗3} = X_2X_3. Just as above, these measurements do not change the coefficients α and β, but let us conclude whether a qubit has suffered a phase flip (and which qubit it was), allowing us to apply a Z gate to correct it. And just as before, this procedure also works for arbitrary Z rotations by projecting the state into either |ψ_3⟩ or a state that can be corrected by applying a Z gate.
Studying the procedure for error-correction in the nine-qubit code, it becomes clear that the code also is
able to correct a combined Z and X error on one qubit. The check for bit flip errors will locate the bit flip
without being affected by the presence of the phase flip error. Once the bit flip has been corrected, only the
phase-flip error remains and will thus be detected and corrected as before.
Since the nine-qubit code thus can correct X, Z, and XZ errors, it is able to correct any single-qubit error. This is seen by noting that any rotation of a qubit on the Bloch sphere can be decomposed into R_z(γ)R_x(β)R_z(α). Applying this operation to a qubit gives a superposition of the identity, X, Z, and XZ acting on the original state; the syndrome measurements project onto one of these possibilities, each of which the code can correct.
3.5 Stabilizers
The error-correction codes above, and many others, can be understood in terms of stabilizers. If a state |ψ⟩ is unchanged under the action of a unitary operator U, i.e., U|ψ⟩ = |ψ⟩, we say that the state |ψ⟩ is stabilized by U. We can extend this to a collection of operators and states. Let S be a group of N-qubit operators and V_S the set of N-qubit states that are fixed by (unchanged under the action of) every element in S. Then we say that S is the stabilizer of the vector space V_S. For V_S to be non-trivial (i.e., not just zero), it can be seen quite easily that the elements of S need to commute and that −I cannot be in S.
Recall that for the three-qubit bit-flip code in Sec. 3.2, we measured the operators Z_1Z_2 and Z_2Z_3, noting that they left the encoded state, a superposition of |000⟩ and |111⟩, unchanged. Here, the stabilized vector space V_S is spanned by |000⟩ and |111⟩ and the stabilizer S is the group generated by Z_1Z_2 and Z_2Z_3, i.e., the group {I, Z_1Z_2, Z_2Z_3, Z_1Z_3} (products of the elements in the group are elements in the group; the elements I and Z_1Z_3 can be constructed by multiplying together the generators Z_1Z_2 and Z_2Z_3 in various ways).
So by measuring the generators of S, we were able to correct certain errors on the stabilized states. We can imagine constructing other error-correction codes by finding a stabilizer, its generators, and the corresponding stabilized states. But how do we know which errors the code protects from? If we work with a stabilizer S and a set of errors {E_j} that are elements of the N-qubit Pauli group (products of single-qubit Pauli matrices and factors ±1, ±i), there is a theorem (Theorem 10.8 in Ref. [Nielsen and Chuang, 2000]) that tells us that these errors can be corrected if E_j†E_k ∉ N(S) − S for all j, k. Here, the normalizer N(S) of S consists of all elements A of the Pauli group such that A commutes with all elements in S. For our example with the three-qubit bit-flip code, it is quite straightforward to see that any product of two elements in the error set {I, X_1, X_2, X_3} anti-commutes with at least one of the generators Z_1Z_2 and Z_2Z_3 of the stabilizer (except when the product is I, but I ∈ S, so I ∉ N(S) − S), so all errors in the set can be corrected.
This means that U|ψ⟩ is stabilized by UgU†, and thus UV_S is stabilized by USU†, which is generated by the operators Ug_1U†, ..., Ug_nU† if g_1, ..., g_n are the generators of S.
The point of this is that it sometimes allows a compact representation of qubit states under transformations. Consider, for example, the state |0⟩^{⊗N}. Applying a Hadamard gate to each qubit in this state transforms it to |+⟩^{⊗N}, which requires 2^N amplitudes to represent in the computational basis. However, the state |0⟩^{⊗N} is the only state (up to a global phase) that is stabilized by the stabilizer generated by {Z_1, Z_2, ..., Z_N}. After the transformation, |+⟩^{⊗N} is uniquely determined by H^{⊗N}{Z_1, Z_2, ..., Z_N}(H^{⊗N})† = {X_1, X_2, ..., X_N}, which is only N generators.
It turns out that the gate set in the Gottesman–Knill theorem, {CNOT, H, S}, is such that any unitary that transforms elements of the N-qubit Pauli group to other elements of the N-qubit Pauli group can be composed from O(N^2) gates in that set. Thus, starting in a computational basis state, specified by N generators in the N-qubit Pauli group, we can keep track of changes to the state by keeping track of how these generators change under the action of {CNOT, H, S}. This only requires O(N^2) steps on a classical computer. The same goes for measurements of these states, so the quantum circuit is classically simulatable.
Figure 3.2: The layout for the surface code. The open circles represent qubits that are part of the encoded logical state. The solid circles are qubits that are not part of the logical state, but are used for measurements of the four-qubit XXXX and ZZZZ stabilizers of the code. From Ref. [Fowler et al., 2012].
The error-correction code attracting the most attention for large-scale implementation today is the surface code. For a detailed explanation of how the surface code works, we refer to Ref. [Fowler et al., 2012]. Here, we will only explain some basic points of the code.

The surface code is adapted to a square grid of qubits, where each qubit can interact (perform two-qubit gates) with its four nearest neighbours. The layout is depicted in Fig. 3.2. The surface code only uses four-qubit stabilizers of the form XXXX and ZZZZ, which can be measured using the nearest-neighbour interactions on the grid to perform sequential CNOT or CZ gates between the four “data qubits” to be measured and a “measurement qubit” connected to them, as shown in the figure.

Once the system state has been projected into a state that is stabilized by the four-qubit measurements, a Z error on a data qubit anti-commutes with the XXXX stabilizer measurements that involve one of the two X-measurement qubits connected to this data qubit. Similarly, an X error on a data qubit anti-commutes with the ZZZZ stabilizer measurements that involve one of the two Z-measurement qubits connected to this data qubit. In this way, it is possible to identify when single-qubit errors (also XZ errors) occur on data qubits. However, if the error probability increases such that several data qubits close to each other experience errors, it can be hard to deduce which error configuration actually caused the measurement results, even if a logical error has not occurred.

To flip the state of the encoded qubit, a chain of X or Z operations, stretching from one side to the other of the square grid, is required, as shown in Fig. 3.3. The larger the distance d across the grid of data qubits, the better protected the encoded qubit is, provided the error probability is below the threshold. For the surface code, the threshold is on the order of 1 %, which is better than many other error-correcting codes. This contributes to the great interest in implementing the surface code.

It is possible to perform two-qubit operations on encoded qubits in the surface code. However, explaining how this works is beyond the scope of these notes. See Ref. [Fowler et al., 2012] for details.
Figure 3.3: Logical operations on the surface code. The depicted setup has distance d = 5, i.e., 5 data qubits along each side of the square. There are 41 data qubits in the setup together with 40 measurement qubits for X and Z stabilizers. The difference leaves room for encoding one logical qubit. To perform logical operations on this encoded qubit, a chain of operators must be applied across the square grid, as shown. A chain of X operators connecting the two opposite sides with X measurements at the boundaries implements a logical X operation. Likewise, a chain of Z operators connecting the two opposite sides with Z measurements at the boundaries implements a logical Z operation. From Ref. [Fowler et al., 2012].
Chapter 4
For this part of the notes, we follow Ref. [Nielsen and Chuang, 2011].
The discrete Fourier transform (DFT) maps a vector of complex numbers {x_0, . . . , x_{N−1}} to a new vector {y_0, . . . , y_{N−1}} according to y_k = (1/√N) Σ_{j=0}^{N−1} e^{i2πjk/N} x_j. The Fourier transformation is extremely useful, e.g., to detect periods in a signal, where {x_0, . . . , x_{N−1}} could be the amplitude of some signal as a function of discretized time. The Fourier-transformed signal {y_0, . . . , y_{N−1}} then describes the frequency content. When solving the Schrödinger equation on a lattice, the DFT is what takes you between real space and momentum (k) space.
The quantum Fourier transform (QFT) is a unitary n-qubit operation, transforming the initial N = 2^n basis states {|0⟩, . . . , |N − 1⟩} into a new basis in a way which looks mathematically identical to the DFT,
|j⟩ → (1/√N) Σ_{k=0}^{N−1} e^{i2πjk/N} |k⟩.   (4.1)
The action on an arbitrary state is
Σ_{j=0}^{N−1} x_j |j⟩ → Σ_{k=0}^{N−1} y_k |k⟩,
where the amplitudes y_k are the DFT of the amplitudes x_j. One may easily verify that the new states are normalized and form an orthogonal set, and thus that the QFT is a unitary transform.
The QFT can be used to find periods and also to extract eigenvalues of unitary operators to a high precision. But before discussing these issues in more detail, let's see if we can find an efficient implementation of the QFT. Remember that there are operators that need exponentially many single- and two-qubit gates for implementation, so what about the QFT?
Here, j_1 j_2 . . . j_n denotes the binary n-bit representation of j = j_1 2^{n−1} + j_2 2^{n−2} + · · · + j_{n−1} 2^1 + j_n 2^0. E.g., in the 4-qubit case the state |5⟩ = |0101⟩ = |0_1⟩|1_2⟩|0_3⟩|1_4⟩. We also use the binary-fraction notation 0.j_l j_{l+1} . . . j_m = j_l/2 + j_{l+1}/4 + · · · + j_m/2^{m−l+1}. The QFT of Eq. (4.1) can then be rewritten in the product form
|j⟩ = |j_1, j_2, . . . , j_n⟩ → [ (|0⟩ + e^{i2π 0.j_n}|1⟩)(|0⟩ + e^{i2π 0.j_{n−1}j_n}|1⟩) · · · (|0⟩ + e^{i2π 0.j_1 j_2 ... j_n}|1⟩) ] / 2^{n/2}.   (4.2)
where in the last step we have used that k/2^n = Σ_{l=1}^{n} k_l 2^{−l} for k = k_1 k_2 . . . k_n, and that the integer part of j · 2^{−l} disappears in the exponent since it is multiplied by 2π.
Applying the Hadamard gate to the first qubit produces the state
(1/√2) (|0⟩ + e^{i2π 0.j_1}|1⟩) |j_2, . . . , j_n⟩,
since e^{i2π 0.j_1} = e^{i2πj_1/2} = −1 for j_1 = 1 and +1 otherwise. The controlled-R_2 gate rotates the component |1⟩ of the first qubit by e^{i2π/2^2} if j_2 = 1, i.e., it applies the phase e^{i2πj_2/2^2}. Therefore, it produces the state
(1/√2) (|0⟩ + e^{i2πj_2/2^2} e^{i2πj_1/2}|1⟩) |j_2, . . . , j_n⟩ = (1/√2) (|0⟩ + e^{i2π 0.j_1j_2}|1⟩) |j_2, . . . , j_n⟩.
After all the controlled-R_k operations on the first qubit, the state is
(1/√2) (|0⟩ + e^{i2π(j_1/2 + ... + j_n/2^n)}|1⟩) |j_2, . . . , j_n⟩ = (1/√2) (|0⟩ + e^{i2π 0.j_1j_2...j_n}|1⟩) |j_2, . . . , j_n⟩.
The Hadamard on the second qubit produces
(1/√{2^2}) (|0⟩ + e^{i2π(j_1/2 + ... + j_n/2^n)}|1⟩)(|0⟩ + e^{i2πj_2/2}|1⟩) |j_3, . . . , j_n⟩
= (1/√{2^2}) (|0⟩ + e^{i2π 0.j_1j_2...j_n}|1⟩)(|0⟩ + e^{i2π 0.j_2}|1⟩) |j_3, . . . , j_n⟩.   (4.6)
We continue in this fashion for all qubits, obtaining the final state
(1/√{2^n}) (|0⟩ + e^{i2πj·2^{−n}}|1⟩)(|0⟩ + e^{i2πj·2^{−(n−1)}}|1⟩) · · · (|0⟩ + e^{i2πj·2^{−1}}|1⟩)
= (1/√{2^n}) (|0⟩ + e^{i2π 0.j_1j_2...j_n}|1⟩)(|0⟩ + e^{i2π 0.j_2...j_n}|1⟩) · · · (|0⟩ + e^{i2π 0.j_n}|1⟩).   (4.8)
We now need to reverse the order of the qubits, which can be achieved using a series of SWAP gates. The number of gates needed is n on the first qubit plus n − 1 on the second and so on, adding up to n(n + 1)/2 = O(n^2) gates. Then we need on the order of n SWAP gates, not changing the scaling. Thus we can implement the QFT for n qubits using on the order of O(n^2) gates. The best classical algorithm (FFT) needs O(n2^n) gates, indicating why the QFT could be used for speedup. This does not translate into an immediate speed-up for computing the classical FFT, because we cannot access the amplitudes when measuring the Fourier-transformed quantum state, and we don't even know how to efficiently prepare the input state to be transformed. However, in the next section we'll see one problem where the quantum Fourier transform is useful.
Figure 4.1: An efficient circuit to perform the quantum Fourier transform. From Nielsen & Chuang, Fig. 5.1. Here 0.j_1...j_n = j·2^{−n}, 0.j_2...j_n = j·2^{−(n−1)}, . . . , 0.j_{n−1}j_n = j·2^{−2}, and 0.j_n = j·2^{−1}, all up to integer parts that drop out of the exponents.
Figure 4.2: A circuit performing the first step of the phase estimation algorithm. (From Nielsen & Chuang),
Fig. 5.2.
Figure 4.3: An overview circuit figure of the phase estimation circuit. (From Nielsen & Chuang), Fig. 5.3.
probability of reading out some other state "close" to the best approximation. A careful analysis gives that if we want n bits of precision, with a failure probability less than ε, we need a register of size
t = n + ⌈log_2(2 + 1/(2ε))⌉.
Phase estimation is interesting in its own right and may be used in quantum simulations. We will see how
it enters Shor’s algorithm for factoring.
4.3.1 Modular arithmetic
Order finding is defined in modular arithmetic. Modular arithmetic is based on the fact that, given any two positive integers x and n, x can uniquely be written as
x = k · n + r,
where k is a non-negative integer and the remainder r satisfies 0 ≤ r < n. We write this as
x = r mod n,
as an example
2 = 5 = 8 = 11 mod 3.
The greatest common divisor gcd(a, b) of two integers a and b is the largest integer dividing both a and
b. If gcd(a, b) = 1 then a and b are called co-prime.
Multiplicative inverse
Now let's look at modular multiplication by considering the series
m_k = k · 6 mod 15 = {6, 12, 3, 9, 0, 6, 12, 3, 9, 0, . . .},
showing that the equation x · 6 = y mod 15 has no solution for y ∈ {1, 2, 4, 5, 7, 8, 10, 11}. Note that in particular it has no solution for y = 1. Then take a = 7 and n = 15, which are co-prime, giving
m_k = k · 7 mod 15 = {7, 14, 6, 13, 5, 12, 4, 11, 3, 10, 2, 9, 1, 8, 0, . . .},
which does reach y = 1 (for k = 13); after that the series {m_k} will just repeat from the start. The multiplicative inverse a^{−1} of a modulo n is defined through
a^{−1} · a = 1 mod n,
and it exists if and only if a and n are co-prime. If the inverse exists we can solve the equation
x · a = c mod n
for x as x = a^{−1} · c mod n.
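These statements are easy to play with in Python (a small sketch; pow(a, -1, n) computes modular inverses in Python 3.8+):

from math import gcd

n = 15
print([k * 6 % n for k in range(1, 8)])    # [6, 12, 3, 9, 0, 6, 12]: never 1, since gcd(6, 15) = 3
print([k * 7 % n for k in range(1, 16)])   # reaches 1 at k = 13, since gcd(7, 15) = 1

a = 7
a_inv = pow(a, -1, n)                      # multiplicative inverse of a modulo n
assert a_inv == 13 and (a_inv * a) % n == 1

c = 4                                      # solve x * a = c mod n
x = (a_inv * c) % n
assert (x * a) % n == c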
4.3.2 Order finding
Consider the equation
x^r = 1 mod N,
which has solutions for the integers x and N being co-prime, and x < N. The lowest positive integer r solving the equation is called the order of x modulo N. One straightforward method to calculate r is to evaluate the series m_k = x^k mod N for 0 < k < N; it is then clear that the series {m_k} is periodic with period r, since x^{r+a} = x^r · x^a = x^a mod N. In other words, the order is the period of the modular exponentiation function m_k = x^k mod N. As an example, let's take x = 5 and N = 21, giving
m_k = {5, 4, 20, 16, 17, 1, 5, 4, 20, 16, 17, 1, 5, 4, 20, 16, 17, 1, 5, 4},
and we have the order r = 6. There is no known classical algorithm for finding r which scales polynomially in the number of bits L needed to represent the input, i.e., the integers x and N.
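For small numbers the order can of course be found by brute force; the sketch below reproduces the example x = 5, N = 21:

from math import gcd

def order(x, N):
    # Smallest positive r with x**r = 1 mod N (requires gcd(x, N) = 1).
    assert gcd(x, N) == 1
    r, m = 1, x % N
    while m != 1:
        m = (m * x) % N
        r += 1
    return r

x, N = 5, 21
print([pow(x, k, N) for k in range(1, 13)])   # 5, 4, 20, 16, 17, 1, 5, 4, 20, 16, 17, 1
assert order(x, N) == 6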
Assume now that N = pq is a product of two primes, that we have found the order r of some x co-prime with N, that r is even, and that y = x^{r/2} ≠ ±1 mod N. Then y^2 = x^r = 1 mod N, so that
y^2 − 1 = (y − 1)(y + 1) = 0 mod N.
Therefore (y − 1)(y + 1) is a multiple of N, i.e., it contains the two factors p and q in its decomposition. However, neither (y − 1) nor (y + 1) is a multiple of N, since y ≠ ±1 mod N. Therefore (y − 1) and (y + 1) must split the two factors q and p appearing in the decomposition of N between them. Say for instance:
(y − 1) = l·p;  (y + 1) = l′·q.
We finally obtain therefore
gcd(y − 1, N) = p,  gcd(y + 1, N) = q.
In other words, determining the order r of the modular exponentiation function x^k mod N yields the determination of the two factors p and q such that N = pq. For N having L bits, this common factor can be found using Euclid's algorithm in O(L^3) steps. For uniformly chosen x one may calculate a lower bound for the probability of r being even and y = x^{r/2} being non-trivial,
p(r is even and x^{r/2} ≠ −1 mod N) ≥ 1 − 1/2^m,
where m is the number of different prime factors in N, i.e., m ≥ 1.
In the following, we are going to derive an efficient quantum algorithm for order finding.
4.3.4 A quantum algorithm for order finding
Given an integer x of which we want to find the order mod N, consider the L-qubit unitary operation
U|y⟩ ≡ |x · y mod N⟩  for 0 ≤ y ≤ N − 1,
U|y⟩ ≡ |y⟩  for N ≤ y ≤ 2^L − 1.
The unitarity follows since U basically permutes the basis states; multiplication by x can be undone because x has a multiplicative inverse modulo N (x and N are co-prime). The states
|u_s⟩ = (1/√r) Σ_{k=0}^{r−1} exp(−i2πsk/r) |x^k mod N⟩,   s = 0, 1, . . . , r − 1,
are eigenstates of U with eigenvalues e^{i2πs/r}.
To have enough accuracy in the phase estimation we should use t = 2L + 1 + ⌈log(2 + 1/(2ε))⌉ qubits in the first register and prepare the second register in the |1⟩ state. We'll then get the phase φ = s/r, for a random 0 ≤ s < r, with 2L + 1 bits of precision, with a probability of at least 1 − ε. Knowing that the phase φ = s/r is a rational number, where s and r are integers not larger than L bits, we can classically determine s and r. The appropriate algorithm is called the continued fraction expansion and needs O(L^3) gates.
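The classical post-processing can be illustrated with Python's Fraction class, whose limit_denominator plays the role of the continued-fraction expansion here (a sketch; the ⌈log(2 + 1/(2ε))⌉ extra qubits are ignored for simplicity):

from fractions import Fraction

N = 21
L = N.bit_length()                                 # L = 5 bits
t = 2 * L + 1                                      # size of the phase register

s, r = 5, 6                                        # the unknown fraction phi = s/r
phi_measured = round((s / r) * 2 ** t) / 2 ** t    # t-bit estimate of phi

# Recover s/r from the finite-precision estimate; the true denominator r is < N.
frac = Fraction(phi_measured).limit_denominator(N)
assert (frac.numerator, frac.denominator) == (5, 6)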
4.3.5 Performance
The algorithm fails if s = 0, and also if s and r have common factors so that they cannot be extracted from s/r. The probability of failure can be shown to be small, and one only needs to repeat the procedure a polynomial (in L) number of times to obtain r with high probability. The number of gates needed is O(L) for the initial Hadamards, the inverse Fourier transform needs O(L^2) gates, implementing U^{2^k} through modular exponentiation requires O(L^3) gates, and the classical continued fraction algorithm needs O(L^3) (classical) gates. If we need to repeat an O(L) number of times the overall scaling would be O(L^4), but being more clever there are ways to guarantee success in a constant number of attempts, giving the scaling O(L^3) gates.
The algorithm
Find a factor of the composite L-bit integer N.
• 1. If N is even, return the factor 2.
• 2. Determine whether N = a^b for some integers a ≥ 1 and b ≥ 2, i.e., if there is only one prime factor. This can be done with O(L^3) operations. If so, return the factor a.
• 3. Randomly choose 1 < x < (N − 1), and check whether x and N are co-prime (O(L^3) operations). If not co-prime, return the factor gcd(x, N).
• 4. Find the order r of x modulo N, which can be done using O(L^3) quantum gates (quantum subroutine!!).
• 5. If r is even and x^{r/2} ≠ −1 mod N, then compute gcd(x^{r/2} + 1, N) and gcd(x^{r/2} − 1, N) and check if one is a non-trivial factor of N. Return this factor. If r is odd, or x^{r/2} = −1 mod N, the algorithm fails.
A classical sketch of the whole procedure is shown below.
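Here is a purely classical sketch of these steps (names and structure are illustrative only), with brute-force order finding standing in for the quantum subroutine and the prime-power check of step 2 omitted for brevity:

import random
from math import gcd

def order(x, N):
    r, m = 1, x % N
    while m != 1:
        m = (m * x) % N
        r += 1
    return r

def find_factor(N):
    # Classical sketch of the factoring procedure; order() is the quantum part in Shor's algorithm.
    if N % 2 == 0:
        return 2                              # step 1
    while True:
        x = random.randrange(2, N - 1)        # step 3
        if gcd(x, N) > 1:
            return gcd(x, N)
        r = order(x, N)                       # step 4 (quantum subroutine)
        if r % 2 == 0 and pow(x, r // 2, N) != N - 1:   # step 5
            for cand in (gcd(pow(x, r // 2, N) - 1, N), gcd(pow(x, r // 2, N) + 1, N)):
                if 1 < cand < N:
                    return cand
        # otherwise the attempt failed; try a new x

print(find_factor(21))   # prints 3 or 7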
Chapter 5
Quantum computing and machine learning are arguably two of the “hottest” topics in science at the moment.
Here in Sweden, this is reflected in the fact that the two largest programs supported by the Knut and Alice
Wallenberg foundation are centered around quantum technologies and artificial intelligence. In this chapter,
we will discuss efforts to combine the two fields into quantum machine learning. Since this is a course
about quantum algorithms, we do not cover applications of classical machine learning to simulating and
understanding quantum systems, but focus instead on how machine learning can be enhanced by quantum
computation. We begin with a brief overview of classical machine learning and then study some examples
of quantum machine learning algorithms.
• Unsupervised learning: Learning structure in P (data) given samples from P (data). Here, the
machine learning is used to generate knowledge by analyzing unlabelled data. Examples of unsupervised
learning are clustering (grouping data points), density estimation (estimating a probability density
function giving rise to the data), and much of what is called data mining.
• Supervised learning: Learning structure in P (labels|data) given samples from P (data, labels). Here,
the machine learning is used to generalize knowledge gained from studying labelled data to predict
correct labels for unlabelled data. Examples of supervised learning are foremost various classification tasks, e.g., image recognition.
• Reinforcement learning: Learning from (possibly rare) rewards. Here, an agent learns by acting on its environment, observing the results of its actions (the results can include rewards), and updating its policy for actions based on the outcomes. Examples of reinforcement learning include the superhuman-level game-playing programs for go, chess, StarCraft, etc. by Google's DeepMind.

Figure 5.1: Structure of a feed-forward neural network and a single neuron in that network. From Ref. [Tanikic and Despotovic, 2012].
A neural network can thus be said to be a complicated function, parameterized by all the weights and
biases in the network, which transforms an input into an output. What functions can a neural network
represent? The answer, provided by Cybenko in 1989 [Cybenko, 1989], gives a hint of why neural networks
are powerful. It turns out that any arbitrary smooth function with vector input and vector output can be
approximated to any desired precision by a feedforward neural network with only a single hidden layer. In
practice, deep neural networks, i.e., networks with many hidden layers, have turned out to be more efficient
at representing various functions. See, e.g., Ref. [Lin et al., 2017].
5.1.3 Training neural networks
Training a neural network to perform a specific task boils down to adjusting the weights w and biases b such that the network approximates a function that solves the task. To do this, we first need to be able to say how close the output of the current network is to the desired output. The difference between the actual and the desired output is measured by some cost function. One example is the mean square error
C(w, b) = (1/2n) Σ_x |y(x) − a|^2,   (5.2)
where n is the number of examples x (data points), y is the desired output, and a is the actual output. Finding good weights and biases for the task then means finding weights and biases that minimize C(w, b).
How does one minimize C in a smart way? Clearly, there are too many unknowns to simply find the
extremum by setting the gradient of C to zero and solving the resulting equation. Therefore, gradient descent
is used, with the update of the parameters being proportional to minus the gradient (the proportionality
constant is called the learning rate). However, the naïve approach to calculating the gradient of C is time-
consuming: to obtain each partial derivative of C the network would need to be run once for each data point
x and for each variable in w or b, to see how a small change in the input changes the output. One important
reason for the prevalence of neural networks today is that two tricks have been found that can reduce the
necessary calculations considerably.
The first trick is to use stochastic gradient descent, which means that the partial derivatives are not
calculated for all data points x in each step of the gradient descent, but only for a subset, a mini-batch, of x.
The next step uses another subset, and so on until all subsets have been used (marking the end of an epoch
of training), whereupon the selection of mini-batches starts over. The use of stochastic gradient descent will
only give an approximation to the actual gradient, but if the mini-batches are large enough and the learning
rate is low enough, this approximation is usually sufficient.
The second trick is to calculate the partial derivatives not one by one by running the network many times, but by using back-propagation. Essentially, back-propagation allows one to calculate all partial derivatives by running the network once, noting the result for each neuron, finding the derivative of the cost function with respect to the output from the last layer, and then applying the chain rule repeatedly, going backwards through the network to extract each partial derivative. For more details on how this works, see, e.g., Ref. [Nielsen, 2015].
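The following numpy sketch (a toy example with made-up sizes and learning rate, not a recipe) shows both tricks on the smallest possible scale: a one-hidden-layer network is trained on the XOR function with mini-batches, and the gradients are obtained by propagating the error backwards through the two layers with the chain rule.

import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)    # inputs
Y = np.array([[0], [1], [1], [0]], dtype=float)                # XOR labels

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)     # hidden layer
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)     # output layer
sigmoid = lambda z: 1 / (1 + np.exp(-z))
lr = 1.0                                           # learning rate

for step in range(20000):
    batch = rng.choice(4, size=2, replace=False)   # mini-batch for stochastic gradient descent
    x, y = X[batch], Y[batch]

    h = sigmoid(x @ W1 + b1)                       # forward pass
    a = sigmoid(h @ W2 + b2)

    # Backward pass (chain rule) for the cost C = 1/(2n) sum |y - a|^2.
    delta2 = (a - y) * a * (1 - a) / len(batch)
    delta1 = (delta2 @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ delta2; b2 -= lr * delta2.sum(axis=0)
    W1 -= lr * x.T @ delta1; b1 -= lr * delta1.sum(axis=0)

print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2), 2))    # should approach 0, 1, 1, 0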
With the network having so many parameters, often orders of magnitude more than the number of
training examples, a valid concern is whether the training will result in overfitting. Over the years, various
strategies have been developed to counter this, but that is beyond the scope of this short introduction to
classical machine learning.
• Data: There are now a large number of huge data sets that can be used for training of neural networks.
• Computational power: We now have much more powerful, and increasingly custom-built, hardware
to run machine-learning algorithms on.
• Algorithms: Clever algorithms like back-propagation, which enable efficient training, together with
a number of other clever tricks discovered in the past decade or so, have led to large improvements.
5.2 Quantum machine learning using qBLAS
To see how quantum computers can aid or enhance machine learning, a first entry point is to note that many
machine learning algorithms rely heavily on linear algebra. For classical computation of linear-algebra oper-
ations, there are optimized low-level routines called basic linear algebra subprograms (BLAS). For quantum
computers, there are several algorithms that deal with linear-algebra problems. Together, these algorithms
are sometimes referred to as quantum BLAS (qBLAS).
Some examples of qBLAS are the HHL algorithm for solving systems of linear equations [Harrow et al.,
2009], the quantum Fourier transform, and quantum phase estimation for finding eigenvalues and eigenvec-
tors. All of these examples have exponential speed-ups compared to their classical counterparts. However,
it is important to “read the fine print” [Aaronson, 2015] for these algorithms. They all rely on the problem
being encoded in a quantum random access memory (QRAM). In a QRAM, data is encoded in the probabil-
ity amplitudes of a large superposition state. For example, a vector b with n entries can be stored in log_2 n qubits as Σ_j b_j |j⟩, where b_j are the (suitably normalized) entries of the vector and |j⟩ are the computational basis states of the qubits.
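Numerically, amplitude encoding is nothing more than normalizing the data vector; a brief sketch of the bookkeeping (with illustrative values):

import numpy as np

b = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])   # classical data, n = 8 entries
amplitudes = b / np.linalg.norm(b)                        # a quantum state must be normalized

num_qubits = int(np.log2(len(b)))                         # 3 qubits suffice for 8 amplitudes
print(num_qubits, np.round(amplitudes, 3))

probs = np.abs(amplitudes) ** 2                           # probability of measuring |j> is |b_j|^2 / |b|^2
assert np.isclose(probs.sum(), 1.0)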
The problem with the QRAM is that no efficient way is known to encode the data in the QRAM in the
first place. The time it takes to encode the problem can therefore negate the exponential speed-up from the
qBLAS algorithms. This is sometimes called the input problem. There is also an output problem: the output
of the qBLAS algorithms is not necessarily the direct answer sought, but a state which lets you sample
properties of the answer. For example, solving the system of linear equations Ax = b does not give the
solution vector x as an easily measurable output, but just enables sampling properties of x.
Figure 5.2: Illustration of support vectors. From Wikipedia.
This constrained optimization problem can be handled by introducing the Lagrangian
L(w, b, λ) = (1/2)|w|^2 − Σ_j λ_j [y_j (w · x_j − b) − 1].   (5.3)
Setting the partial derivatives of L with respect to λ_j equal to zero gives the constraints (the λ_j not corresponding to support vectors will become zero). Setting the partial derivative of L with respect to w to zero leads to
0 = w − Σ_j λ_j y_j x_j  ⇒  w = Σ_j λ_j y_j x_j,   (5.4)
so we see that w will be determined by the support vectors. Finally, we also use
0 = ∂L/∂b = Σ_j λ_j y_j.   (5.5)
Inserting Eq. (5.4) back into Eq. (5.3) gives the dual form
L = Σ_j λ_j − (1/2) Σ_{j,k} λ_j λ_k y_j y_k (x_j · x_k).   (5.6)
The objective now is to find λ_j that maximize this expression under the constraint given by Eq. (5.5).
In many cases, the separation between two classes of data points cannot be parameterized as a simple
hyperplane, as illustrated to the left in Fig. 5.3. The solution commonly used is then to transform the data
points to a feature space that admits a hyperplane as a separator. This is encoded by a kernel function
K(xi , xj ), which then replaces the dot product in Eq. (5.6).
Figure 5.3: Illustration of a kernel for a support vector machine. From Wikipedia.
With this addition, and some further work, the maximization problem in Eq. (5.6) can be shown to lead to the following system of linear equations:
( 0  1 ; 1  K ) ( b ; λ ) = ( 0 ; y ),   (5.7)
where the ones are 1 × M row and M × 1 column vectors (M is the number of data points) and the entries in the M × M matrix K are given by K_ij = K(x_i, x_j).
We can now estimate the time it takes to find the support vectors on a classical computer. If the data points x_j ∈ R^N, calculating one entry in K takes O(N) time, so calculating all of K takes O(M^2 N) time. Solving the system of linear equations takes O(M^3) time, so in total the classical computer will require O(M^2 [N + M]) time.
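For comparison, the classical computation is easy to set up directly; the sketch below (toy data, with a Gaussian kernel chosen as an example) assembles the (M + 1) × (M + 1) system of Eq. (5.7) and solves it with the O(M^3) dense solver mentioned above:

import numpy as np

rng = np.random.default_rng(1)
M, N = 20, 2
X = rng.normal(size=(M, N))                  # data points x_j in R^N
y = np.sign(X[:, 0] + X[:, 1])               # toy labels +-1

sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq_dists)                         # kernel matrix, K_ij = K(x_i, x_j)

A = np.zeros((M + 1, M + 1))                  # the matrix in Eq. (5.7)
A[0, 1:] = 1
A[1:, 0] = 1
A[1:, 1:] = K
rhs = np.concatenate(([0.0], y))

sol = np.linalg.solve(A, rhs)                 # the O(M^3) classical step
b_offset, lam = sol[0], sol[1:]
print(b_offset, lam[:3])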
Note that we require QRAM to construct the state |x_i⟩. Next we do a “swap test” (see Fig. 5.4) on |ϕ⟩ and the ancilla qubit in |ψ⟩. The distance we seek is then given by |x_i|^2 + |x_j|^2 times the probability of measuring 1 in the swap test. The computational complexity for this distance calculation is O(log N).
Figure 5.4: The quantum circuit for a swap test. Figure from Peter Wittek. The resulting state of the system before the measurement is (1/2)[|0⟩(|f⟩|f′⟩ + |f′⟩|f⟩) + |1⟩(|f⟩|f′⟩ − |f′⟩|f⟩)]. This means that the probability of measuring 1 becomes zero when f = f′.
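The swap-test statistics quoted in the caption can be verified by simulating the three-qubit circuit directly (a sketch for single-qubit |f⟩ and |f′⟩; the same expression holds for larger registers):

import numpy as np

def swap_test_p1(f, f_prime):
    # Probability of measuring 1 on the control qubit of a swap test.
    state = np.kron(np.array([1, 0], dtype=complex), np.kron(f, f_prime))
    H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
    I4 = np.eye(4, dtype=complex)
    SWAP = np.array([[1, 0, 0, 0],
                     [0, 0, 1, 0],
                     [0, 1, 0, 0],
                     [0, 0, 0, 1]], dtype=complex)
    CSWAP = np.block([[I4, np.zeros((4, 4))], [np.zeros((4, 4)), SWAP]])
    H_ctrl = np.kron(H, I4)
    state = H_ctrl @ CSWAP @ H_ctrl @ state
    return float(np.sum(np.abs(state[4:]) ** 2))   # amplitudes with control qubit = 1

f = np.array([1, 0], dtype=complex)                       # |0>
g = np.array([np.cos(0.4), np.sin(0.4)], dtype=complex)   # some other state

assert np.isclose(swap_test_p1(f, f), 0.0)                # equal states: never measure 1
overlap = abs(np.vdot(f, g)) ** 2
assert np.isclose(swap_test_p1(f, g), (1 - overlap) / 2)  # P(1) = (1 - |<f|f'>|^2)/2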
Briefly, this is done by approximating exp{−iF t} (which is not trivial, since K is not sparse), and then using quantum phase estimation to extract eigenvalues and eigenvectors. These eigenvalues, together with y expressed in the eigenbasis, let us construct the solution state
|b, λ⟩ ∝ b|0⟩ + Σ_j λ_j |j⟩.   (5.12)
The complexity for this part of the computation is O(log M). The total complexity for the quantum SVM is thus O(log(NM)).
5.4 Quantum principal component analysis
There has also been work on “quantum-inspired” algorithms for PCA [Tang, 2018], where the methods from the quantum algorithm have been adapted to find an improved (in the scaling of some parameters) classical algorithm.
Figure 5.5: A quantum feedforward neural network. Figure from Ref. [Farhi and Neven, 2018].
5.5 Quantum neural networks
In this final section, we discuss quantum versions of neural networks. This is a quite new and rapidly
developing field, so it is possible that these notes can become outdated fast. We therefore only try to give a
few examples and discuss some general properties of quantum neural networks.
Although combining quantum computing and neural networks sounds interesting, given how they are both promising computing paradigms, it is not straightforward to do so. There are certainly potential upsides to quantum neural networks: they can work with quantum data, or a compact representation of classical data, and there may be quantum algorithms that can speed up training. However, there are several potential downsides or questions. For example, the input problem may be a factor here also. Furthermore, classical neural networks require nonlinear activation functions, but quantum mechanics is linear. Also, classical neural networks have a large number of neurons and many connections between them, which may be challenging for NISQ devices with few qubits and limited connectivity. Another question is whether back-propagation can be implemented in a quantum neural network, since classical back-propagation requires measuring the output at each point in the network, something that would destroy a quantum superposition.

5.5.1 Quantum feedforward neural networks

In 2018, Farhi (yes, the same Farhi that proposed the QAOA) and Neven proposed an architecture for a quantum feedforward neural network [Farhi and Neven, 2018], as depicted in Fig. 5.5. Here, instead of layers of neurons, there are layers of quantum gates, which act on an n-qubit input state |ψ⟩ and an ancilla qubit prepared in |1⟩. Instead of weights and biases as in a classical neural network, we now have the ϑ_j parameters parameterizing these gates; the action of the whole quantum circuit is a unitary operation U(ϑ). At the end, the ancilla is measured (here, in the Y basis). The outcome of the measurement is used to classify the input state, giving it one of two possible labels.
To train this quantum neural network for its classification task, the authors propose using a loss function
loss(ϑ, z) = 1 − l(z) ⟨z, 1| U†(ϑ) Y_{n+1} U(ϑ) |z, 1⟩,
where z is the input and l(z) is a label function giving the correct label (±1). This cost function is zero if
the network spits out the correct classification, and greater than zero otherwise. The authors use stochastic
gradient descent to find parameters ϑ that minimize this cost function. However, no back-propagation is
used (since this does not seem to be applicable to this architecture, as discussed above). Furthermore, to
evaluate the gradient, multiple runs of the quantum circuit are required for each partial derivative, since
one needs to collect enough statistics to find the expectation value of the output with sufficient precision.
The authors point out that a possible advantage of the quantum network is that the form of the unitary U
guarantees that the gradient does not “blow up”, which can be a problem in some classical machine-learning
algorithms.
Since the quantum feedforward neural network lacks a nonlinear element, one can ask whether it has the
ability to represent any label function. The authors show that the network indeed has this ability, but some
label functions may require an exponential circuit depth.
with ai and bj biases, and W a weight matrix. The form of the probability distribution (a Boltzmann
distribution) explains the name Boltzmann machines.
To train the network, one can use the cost function
C(a, b, W) = − ∑_v P_data(v) log P(v) .    (5.16)
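As a small illustration (assuming the standard Boltzmann-machine energy E(v, h) = −a·v − b·h − vᵀW h, with the usual sign convention), the following Python sketch evaluates the Boltzmann distribution and the cost of Eq. (5.16) by brute force for a tiny network:

import numpy as np
from itertools import product

rng = np.random.default_rng(0)
nv, nh = 3, 2
a, b = rng.normal(size=nv), rng.normal(size=nh)   # biases
W = rng.normal(size=(nv, nh))                      # weight matrix

def energy(v, h):
    # Standard Boltzmann-machine energy function (sign convention assumed)
    return -(a @ v) - (b @ h) - v @ W @ h

states_v = [np.array(s) for s in product([0, 1], repeat=nv)]
states_h = [np.array(s) for s in product([0, 1], repeat=nh)]
Z = sum(np.exp(-energy(v, h)) for v in states_v for h in states_h)

def P_v(v):
    """Marginal P(v) = sum_h P(v, h)."""
    return sum(np.exp(-energy(v, h)) for h in states_h) / Z

# Cost of Eq. (5.16) for a toy data distribution over visible configurations
P_data = {tuple(v): 1.0 / len(states_v) for v in states_v}   # uniform, for illustration
cost = -sum(P_data[tuple(v)] * np.log(P_v(v)) for v in states_v)
print("cost C(a, b, W) =", cost)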
Figure 5.6: Illustrations of (a) a classical convolutional neural network and (b) a quantum convolutional neural network. Figure from Ref. [Cong et al., 2019].
Chapter 6
Measurement-based quantum
computation
The circuit model introduced and discussed in the previous two chapters is not the only way to perform
quantum computation. In this and the next chapter, we will look at two other approaches: measurement-
based quantum computation (MBQC) and adiabatic quantum computation. We will also see some additional
examples later in the course; MBQC will resurface when we discuss quantum computation with continuous
variables.
Measurement-based quantum computation was first proposed by Raussendorf and Briegel in 2001 [Raussendorf
and Briegel, 2001]. More details can be found in the follow-up paper in Ref. [Raussendorf et al., 2003]. The
first experimental demonstration of all basic components of MBQC was performed by the Zeilinger group in
2005 [Walther et al., 2005].
6.2 The details of MBQC
6.2.1 Definition of the possible operations
Let us first recall from Chapter 1 some of the operations we can perform on single and multiple qubits.
The addition of any nontrivial two-qubit Clifford gate, e.g., CZ = |0⟩⟨0| ⊗ Î + |1⟩⟨1| ⊗ Ẑ, combined with the set of single-qubit operations above allows for any general multi-qubit Clifford operation:
H, U_z(π/2) = Z_{π/2}, CZ : universal set for multi-qubit Clifford operations.
These operations are enough to perform some algorithms, like error-correcting codes, but no algorithm that could not be efficiently simulated on a classical computer [Horodecki et al., 2006].
6.2.3 Measurements and their effect
The next step is to measure the input qubit in some rotated basis. Measuring in a rotated basis with angles
(ϑ, ϕ) is like measuring Rz (ϕ + π/2)Rx (ϑ)ZRx (−ϑ)Rz (−ϕ − π/2). Here we measure the first qubit with
ϑ = π/2 and an arbitrary ϕ, i.e., we measure the observable
Hence a measurement of the operator in Eq. (6.2) projects the second qubit into:
giving the result. The extra Pauli operator X m depends on the outcome of the measurement on qubit 1 and
is said to be a “by-product” operator. It can be compensated for by choosing the measurement basis of the
following steps in the computation (thus introducing, in general, time-ordering). In the following we rename
−2ϕ → ϕ.
where in the first step we used that X m H = HZ m , in the third that HZ m H = X m and later on that
XRz (ϕ) = Rz (−ϕ)X and ZRx (ϕ) = Rx (−ϕ)Z.
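These propagation rules are easy to check numerically; the following short Python sketch (assuming the conventions R_z(ϕ) = e^{−iϕZ/2} and R_x(ϕ) = e^{−iϕX/2}) verifies them for an arbitrary angle:

import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.diag([1, -1]).astype(complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

def Rz(phi):
    return np.diag([np.exp(-1j * phi / 2), np.exp(1j * phi / 2)])

def Rx(phi):
    return np.array([[np.cos(phi / 2), -1j * np.sin(phi / 2)],
                     [-1j * np.sin(phi / 2), np.cos(phi / 2)]])

phi = 0.73  # arbitrary angle
# By-product propagation rules used in the text
assert np.allclose(X @ Rz(phi), Rz(-phi) @ X)
assert np.allclose(Z @ Rx(phi), Rx(-phi) @ Z)
assert np.allclose(X @ H, H @ Z)          # X H = H Z
assert np.allclose(H @ Z @ H, X)          # H Z H = X
print("by-product commutation rules verified")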
Since the most general rotation of a single qubit can be decomposed as Rz (γ)Rx (β)Rz (α), the above
steps let us implement any single-qubit operation. For the result to be deterministic, we can perform
the measurements choosing the basis sequentially, depending on the preceding measurement outcomes, as
ϕ1 = α, ϕ2 = (−1)m1 β, ϕ3 = (−1)m2 γ. The Pauli corrections remaining at the end of the computation are
not important and never need to be physically applied; they can be accounted for in the final interpretation
of the result (classical post-processing).
If we want to implement a Clifford unitary C, we have by definition CΣ = Σ′C for Pauli operators Σ and Σ′, meaning that interchanging the order of a Clifford operator and a Pauli operator leaves the Clifford operator unchanged (only the Pauli by-product changes). This means that there is no need to choose measurements adaptively.
at chosen times, but where it is easier to create a large entangled state in one big operation and then only
perform single-qubit operations.
As an aside, we note that MBQC usually has been considered for two-dimensional cluster states. How-
ever, by working with cluster states in three dimensions, the quantum computation can be made fault-
tolerant [Briegel et al., 2009].
Chapter 7
The MBQC that we considered in Chapter 6 was, although distinct from the circuit model for quantum
computation, still rather similar to that circuit model. The paradigm of quantum computation that we
explore in this chapter, adiabatic quantum computation (AQC), is further removed from the circuit model
than MBQC. The idea of AQC evolved around the turn of the millennium. A recent extensive review of
the topic is Ref. [Albash and Lidar, 2018], from which we quote several of the following considerations.
assured if the evolution time τ satisfies the condition
τ ≫ [ max_{0≤s≤1} |⟨ψ1(s)| dĤ(s)/ds |ψ0(s)⟩| ] / [ min_{0≤s≤1} ∆²(s) ] ,   s ≡ t/τ ,    (7.2)
where |ψ0 (s)i and |ψ1 (s)i are the instantaneous ground and first excited eigenstate of the Hamiltonian in
Eq. (7.1), and ∆(s) = (E1 (s) − E0 (s)) is the instantaneous energy gap between the ground state and first
excited state energies. If the criterion of Eq. (7.2) is fulfilled, then we can be certain that the system will
evolve into the ground state of Ĥ1 .
While these are useful sufficient conditions, they involve bounding the minimum eigenvalue gap of a
complicated many-body Hamiltonian, a notoriously difficult problem. This is one reason that AQC has
generated so much interest among physicists: it has a rich connection to well-studied problems in condensed
matter physics. For example, because of the dependence of the run time on the gap, the performance of
quantum adiabatic algorithms is strongly influenced by the type of quantum phase transition the same
system would undergo in the thermodynamic limit.
Nevertheless, a number of examples are known where the gap analysis can be carried out. For example,
adiabatic quantum computers can perform a process analogous to Grover search and thus provide a quadratic
speedup over the best possible classical algorithm for the Grover search problem. Other examples are known
where the gap analysis can be used to demonstrate that AQC provides a speedup over classical computation,
including adiabatic versions of some of the key algorithms of the circuit model.
However, much more common is the scenario where either the gap analysis reveals no speedup over clas-
sical computation, or where a clear answer to the speedup question is unavailable. In fact, the least is known
about adiabatic quantum speedups in the original setting of solving classical combinatorial optimization
problems. This remains an area of very active research, partly due to the original (still unmaterialized) hope
that AQC would deliver quantum speedups for NP-complete problems [Farhi et al., 2001], and partly due to
the availability of commercial QA devices such as those manufactured by D-Wave Systems Inc., designed to
solve optimization problems using stoquastic Hamiltonians.
7.4 Reason for non-commuting Hamiltonians
As mentioned above, it is important in AQC that the initial Hamiltonian Ĥ0 and the final Hamiltonian Ĥ1 do not commute, i.e., [Ĥ0, Ĥ1] ≠ 0. This can be understood by considering the following trivial example, taken from Pontus Vikstål's Master thesis, Ref. [Vikstål, 2018]. Suppose that the initial and final Hamiltonians in AQC are given by
Ĥ0 = [ 1  0 ; 0  −1 ]   and   Ĥ1 = [ −1  0 ; 0  −1/2 ] ,    (7.3)
which clearly commute. Since the Hamiltonians are both diagonal in the z-basis, we label the corresponding
eigenvectors as
|0⟩ = (1, 0)ᵀ   and   |1⟩ = (0, 1)ᵀ .    (7.4)
It is easy to see that the ground state of Ĥ0 is |1i and that the ground state of Ĥ1 is |0i. We described above
how AQC is based on the adiabatic theorem. For the theorem to hold there must always exist an energy
gap between the eigenstates (see the proof below in Sec. 7.5). Following the AQC algorithm Eq. (7.1), it can
be seen that the energy gap between the eigenstates |1⟩ and |0⟩ closes at some point. In this example, the energies become equal at the point t/τ = 4/5, and so the gap between them closes. This is enough to see that [Ĥ0, Ĥ1] ≠ 0 is a necessary condition for keeping the gap open.
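A short numerical check of this example (assuming the linear interpolation Ĥ(s) = (1 − s)Ĥ0 + sĤ1 of Eq. (7.1)) confirms that the gap closes at s = t/τ = 4/5:

import numpy as np

H0 = np.diag([1.0, -1.0])
H1 = np.diag([-1.0, -0.5])

# Linear interpolation assumed, H(s) = (1 - s) H0 + s H1
for s in np.linspace(0, 1, 11):
    E = np.sort(np.linalg.eigvalsh((1 - s) * H0 + s * H1))
    print(f"s = {s:.1f}  gap = {E[1] - E[0]:.3f}")
# The printed gap decreases to zero at s = 4/5, as stated in the text.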
The general solution to the Schrödinger equation is given by a superposition of the separable solutions
Ψ(x, t) = ∑_n c_n Ψ_n(x, t) = ∑_n c_n ψ_n(x) e^{−iE_n t/ℏ} ,    (7.7)
Hence, the nth eigenstate of a time-independent Hamiltonian remains the nth eigenstate, simply picking up a phase factor e^{−iE_n t/ℏ}. For a time-dependent Hamiltonian, the eigenenergies and eigenfunctions are
themselves time-dependent. The instantaneous eigenstates and eigenenergies are defined as
At any instant of time, the eigenfunctions form a complete orthogonal set
where the dependence on position is implicit. We have also introduced the bra-ket notation ψn (x) = hx|ψn i.
The general solution to the Schrödinger equation is now given by
Ψ(x, t) = ∑_n c_n(t) Ψ_n(x, t) = ∑_n c_n(t) ψ_n(x, t) e^{iϑ_n(t)} ,    (7.11)
Our task is to determine the coefficients cn (t). By substituting (7.11) into the Schrödinger equation, we
obtain
iℏ ∑_n ( ċ_n ψ_n + c_n ψ̇_n + i c_n ψ_n ϑ̇_n ) e^{iϑ_n} = ∑_n c_n E_n ψ_n e^{iϑ_n} .    (7.13)
The third term on the left cancels the term on the right, since ℏϑ̇_n = −E_n. We are thus left with
iℏ ∑_n ( ċ_n ψ_n + c_n ψ̇_n ) e^{iϑ_n} = 0 .    (7.14)
Multiplying with an arbitrary eigenfunction hψm | from the left and using the orthogonality condition
Eq. (7.10) yields
ċ_m = − ∑_n c_n ⟨ψ_m|ψ̇_n⟩ e^{i(ϑ_n − ϑ_m)} .    (7.15)
Now, if the Hamiltonian is slowly changing, such that its time derivative can be considered to be very small and the energy difference |E_n − E_m| is large compared to |⟨ψ_m| Ḣ |ψ_n⟩|, the second term becomes negligible. This approximation is known as the adiabatic approximation, and we conclude that
By solving this equation one finds
c_m(t) = c_m(0) exp( − ∫₀ᵗ ⟨ψ_m(s)|ψ̇_m(s)⟩ ds ) = c_m(0) e^{iγ_m(t)} ,    (7.21)
where
γ_m(t) = i ∫₀ᵗ ⟨ψ_m(s)|ψ̇_m(s)⟩ ds    (7.22)
is the geometrical (Berry) phase factor. Putting the obtained expression for the coefficients cm (t) back into
(7.11), we obtain that the nth eigenstate is given by
Hence, a system that starts out in the nth eigenstate, will remain in the nth eigenstate, simply picking up
a couple of phase factors.
Chapter 8
For this part of the notes, we follow Refs. [Vikstål, 2018, Rodríguez-Laguna and Santalla, 2018, Albash and
Lidar, 2018].
• The traveling salesperson problem (TSP), as already mentioned. You are given a list of cities and the
distances between them, and you have to provide the shortest route to visit them all.
• The knapsack problem. You are given the weights and values of a set of objects and you have to
provide the most valuable subset of them to take with you, given a certain bound on the total weight.
• The integer factorization problem. You are given a big number M , and you have to provide two integer
factors, p and q, such that M = pq.
• The satisfiability problem (SAT). You are given a Boolean expression of many variables z_i ∈ {0, 1}, for example, P(z1, z2, z3, z4) = z1 ∨ z̄2 ∧ (z3 ∨ z̄4). Then, you are asked whether there is a valuation of those variables that makes the complete expression true. For example, in that case, making all z_i = 1 is a valid solution.
Notice that all these problems have a similar structure: you are given certain input data (the initial
numbers, the list of the cities, the list of objects or the integer to factorize) and you are asked to provide a
response. The first two problems are written as optimization problems, in which a certain target function
should be minimized (or maximized). The sorting problem can be restated as an optimization problem: we
can design a penalty function to be minimized, by counting the misplaced consecutive numbers. The factor-
ization problem can also be expressed in that way: find p and q such that E = (M − pq)2 becomes minimal,
and zero if possible. SAT can also be regarded as an optimization problem in which the evaluation of the
boolean formula should be maximized. Thus, all those problems can be seen as combinatorial optimization
problems. This means that, in order to solve them by brute force, a finite number of possibilities must be
checked. But this number of possibilities grows very fast with the number of variables or the size of the input. The number of configurations typically involved easily becomes gigantic. It is enough to consider 100 guests at a party with 100 chairs to obtain 100! ∼ 10^157 possible ways of assigning each guest to a chair. This is roughly the square of the number of particles in the universe, estimated to be about 10^80. Finding the correct configuration in this ocean of possibilities is often hopeless for classical computers, even for our best current and future supercomputers. This is why it makes sense to ask the question: could a quantum computer help?
• Max-cut
• Knapsack
• Traveling Salesperson
• SAT
• Exact-Cover
• Subset-sum
• Number partitioning
• Factoring?
• Sorting
For these problems, quantum algorithms could provide an exponential advantage. Furthermore, for the
NP-complete problems, maybe we can solve them on a quantum computer with a quadratic advantage
with respect to the classical solution. This kind of quantum advantage is of the same kind as that of the Grover algorithm. The Grover search algorithm can be run on a quantum computer, either within the circuit model or using adiabatic quantum computation [Albash and Lidar, 2018], providing the same quadratic advantage, i.e., O(√N) vs O(N) running time.
Finally, approximate solutions of combinatorial optimization problems (even NP-complete ones) might be obtainable efficiently with the QAOA algorithm, which we will see in Sec. 8.4. Not much is currently known about the time complexity of finding approximate solutions to those problems, and how this compares to the corresponding approximate classical algorithms.
total 2^n possible strings and the goal is to find the bit-string z = z1 . . . zn that minimizes the cost function C(z). Note that a minimization problem can be transformed into a maximization problem by a minus sign, C(z) → −C(z). One of the most widely used models in physics, which is also used to represent optimization problems, is the Ising model. It was developed in the 1920s as a way to understand phase transitions in magnetic materials.
The Ising model consists of N Ising spins on a lattice that can take the values si = +1 or si = −1, which
corresponds to the spin-↑ and the spin-↓ direction. Here si denotes the Ising spin at site i on the lattice. The
Ising spins are coupled together through long-range magnetic interactions, which can be either ferromagnetic
or antiferromagnetic, corresponding to the spins being encouraged to be aligned or anti-aligned respectively.
Moreover, an external magnetic field can be applied at each individual spin site which will give a different
energy to the spin-↑ and spin-↓ directions. The energy configuration of the Ising model with N Ising spins
is given by the Ising Hamiltonian
H(s₁, . . . , s_N) = − ∑_{i,j=1}^{N} J_ij s_i s_j − ∑_{i=1}^{N} h_i s_i ,   (s_i = ±1),    (8.1)
where Jij is the coupling strength between the ith spin and j th spin and hi is the magnetic field at the ith spin
site. Note that for a given set of couplings, magnetic fields and spin-configurations, the Ising Hamiltonian
will just “spit out” a number that represents the energy of that particular spin-configuration. For a given
set of couplings and magnetic fields there always exists at least one spin-configuration that minimizes the
energy of the Ising Hamiltonian.
A quantum version of this model is obtained by simply replacing the spin-variables si with Pauli-z
operators,
ĤC ≡ Ĥ(σ̂₁ᶻ, . . . , σ̂ₙᶻ) = − ∑_{1≤i<j≤n} J_ij σ̂ᵢᶻ σ̂ⱼᶻ − ∑_{i=1}^{n} h_i σ̂ᵢᶻ ,    (8.2)
where σ̂ᵢᶻ refers to the Pauli-z matrix acting on the ith qubit. The spectral decomposition of this Hamiltonian then encodes the different solutions in the computational basis,
ĤC = ∑_{z∈{0,1}ⁿ} C(z) |z⟩⟨z| ,    (8.3)
and as such, the eigenstate |z⟩ with the lowest eigenvalue corresponds to the optimal solution. The Hamiltonian of Eq. (8.2) is formally known as an Ising Hamiltonian, after its inventor, but it can also be referred to as a cost Hamiltonian in the context of optimisation problems, because its eigenvalues in the computational basis correspond to the different costs of the cost function.
Also note that in the context of QAOA, we will refer to the Ising or cost Hamiltonian as Eq. (8.2), where however we will take the plus sign in front of both terms of the Hamiltonian.
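As a concrete illustration of Eqs. (8.2) and (8.3), the following Python sketch (with toy couplings and fields, chosen arbitrarily) builds the diagonal of the cost Hamiltonian in the computational basis, using the mapping s_i = 2z_i − 1 between bits and spins, cf. Eq. (8.4) below:

import numpy as np
from itertools import product

# Toy instance: n = 3 spins with arbitrary couplings and fields (illustration only)
n = 3
J = {(0, 1): 1.0, (1, 2): -0.5, (0, 2): 0.3}
h = [0.2, -0.1, 0.0]

def cost(z):
    """C(z) for the Ising cost Hamiltonian, with s_i = 2 z_i - 1 mapping bits to spins."""
    s = [2 * zi - 1 for zi in z]
    return (-sum(Jij * s[i] * s[j] for (i, j), Jij in J.items())
            - sum(h[i] * s[i] for i in range(n)))

# The diagonal of H_C in the computational basis, cf. Eq. (8.3)
diag = [cost(z) for z in product([0, 1], repeat=n)]
best = min(range(2 ** n), key=lambda k: diag[k])
print("optimal bit-string:", format(best, f"0{n}b"), "with cost", diag[best])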
the values vi for each object i, i.e. wi = vi . It can be formulated as a decision problem, as follows: Given an
integer m (the total value) and a set of positive and negative integers n = {n1 , n2 , . . . , nN } of length N , is
there a subset of those integers that sums exactly to m?
Example: Consider the case when m = 7 and the set n = {−5, −3, 1, 4, 9}. In this particular example the answer is “yes”, and the subset is {−3, 1, 9}.
Example: Consider m = 13 and n = {−3, 2, 8, 4, 20}. This time the answer is “no”.
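These two examples are easy to verify with a brute-force Python check (illustration only; this is of course exactly the kind of exponential search that becomes hopeless for large instances):

from itertools import combinations

def subset_sum_exists(numbers, target):
    """Brute-force check: is there a subset of `numbers` summing exactly to `target`?"""
    for r in range(1, len(numbers) + 1):
        for subset in combinations(numbers, r):
            if sum(subset) == target:
                return True, subset
    return False, None

print(subset_sum_exists([-5, -3, 1, 4, 9], 7))    # (True, (-3, 1, 9))
print(subset_sum_exists([-3, 2, 8, 4, 20], 13))   # (False, None)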
This problem can be framed as an energy minimization problem. The energy function for the subset sum
problem can be formulated as
E(z₁, . . . , z_N) = ( ∑_{i=1}^{N} n_i z_i − m )² ,   z_i ∈ {0, 1},
where N is the size of the set n. Hence, if a configuration of the z_i exists such that E = 0, then the answer is “yes”. Likewise, if E > 0 for all possible configurations of z_i, then the answer is “no”. To map
this energy function onto an Ising Hamiltonian we introduce the Ising spins si = ±1 instead of zi as
1
zi = (1 + si ), (8.4)
2
such that si = +1 (spin-↑) corresponds to zi = 1, and si = −1 (spin-↓) corresponds to zi = 0. Then the
Hamiltonian is written as
H(s₁, . . . , s_N) = ( ∑_{i=1}^{N} ½ n_i (1 + s_i) − m )²
= ( ½ ∑_{i=1}^{N} n_i s_i + ½ ∑_{i=1}^{N} n_i − m )²
= ¼ ∑_{1≤i,j≤N} n_i n_j s_i s_j + ( ½ ∑_{j=1}^{N} n_j − m ) ∑_{i=1}^{N} n_i s_i + ( ½ ∑_{j=1}^{N} n_j − m )² .
Notice that the last term is simply a constant. This Hamiltonian is an Ising Hamiltonian cf. Eq. (8.1).
Furthermore, by observing that Jij is symmetric and that the sum of the diagonal elements i = j is equal
to the trace of Jij , we get
H(s₁, . . . , s_N) = ½ ∑_{1≤i<j≤N} J_ij s_i s_j + ∑_{i=1}^{N} h_i s_i + ( ½ ∑_{j=1}^{N} n_j − m )² + ¼ Tr[J_ij]
= ∑_{1≤i<j≤N} J_ij s_i s_j + ∑_{i=1}^{N} h_i s_i + const,    (8.6)
where we have absorbed the 1/2 into Jij in the second step and made an implicit redefinition of the couplings
Jij ≡ ni nj /2. To solve the problem on a quantum computer, one can now quantize the spin variables si → σ̂iz .
It is clear that the answer to the number partitioning problem is “yes” if H = 0, because then there exists a spin configuration where the sum of the n_i for the +1 spins is equal to the sum of the n_i for the −1 spins [Lucas, 2014]. Likewise, the answer is “no” if H > 0 for all possible spin configurations. Expanding
the square of Eq. (8.7) we get
H(s₁, . . . , s_N) = ∑_{1≤i,j≤N} J_ij s_i s_j = 2 ∑_{1≤i<j≤N} J_ij s_i s_j + Tr[J_ij] ,    (8.8)
(see also Scott Pakin Quantum Annealing lecture at the Quantum Science summer school 2017, available at
http://qs3.mit.edu/images/pdf/QS3-2017---Pakin-Lecture.pdf).
where σ̂ᵢˣ = Î ⊗ · · · ⊗ σ̂ˣ ⊗ · · · ⊗ Î, with the Pauli σ̂ˣ matrix in the ith place. The qubits (or spins) can then
be regarded as pointing simultaneously in the spin-↑ and spin-↓ directions along the z-axis at the beginning.
Indeed, it is easy to show using basic quantum mechanics that the ground state of Ĥ0 for N -qubits is
|ψ(0)⟩ = |+⟩^{⊗n} = (1/√(2ⁿ)) (|0⟩ + |1⟩) ⊗ · · · ⊗ (|0⟩ + |1⟩) = (1/√(2ⁿ)) ∑_{z∈{0,1}ⁿ} |z⟩ ,
which corresponds to a superposition of all possible spin configurations. Constructing Ĥ1 can be done in
a straightforward manner by replacing each si in Eq. (8.1) with a corresponding Pauli-z matrix, to yield
Eq.(8.2), that we report here for convenience:
Ĥ1 = − ∑_{i,j=1}^{N} J_ij σ̂ᵢᶻ σ̂ⱼᶻ − ∑_{i=1}^{N} h_i σ̂ᵢᶻ ,
where σ̂ᵢᶻ = Î ⊗ · · · ⊗ σ̂ᶻ ⊗ · · · ⊗ Î, with σ̂ᶻ in the ith place. When the time-dependent AQC Hamiltonian
ˆ with σ̂ z on the i:th place. When the time-dependent AQC Hamiltonian
smoothly interpolates between Ĥ0 and Ĥ1 ,
with s(0) = 0 and s(τ ) = 1, τ the total time of the algorithm, the qubits will gradually choose between 0
and 1 corresponding to spin-↑ and spin-↓, depending on which spin-configuration minimizes the energy of
the Ising Hamiltonian [Rodríguez-Laguna and Santalla, 2018]. At the end, t = τ, the system will have evolved into |ψ(τ)⟩ = |z1 z2 . . . zN⟩, and a readout of this state in the computational basis will reveal each bit value, from which the corresponding values of the Ising spins can be obtained. If the annealing is not slow enough, which in practice occurs in any realistic situation, the state that is read out will only encode the optimal solution to the problem with a certain success probability p, given by the overlap of the final state with the solution state. The success probability can be estimated by calculating the ratio of the number of runs where the optimal solution was found to the total number of runs.
In order to calculate the time-to-solution with 99% certainty, we can evaluate the success probability after repeating the annealing procedure m times and equate it to 0.99, i.e.,
P_succ^m = 1 − (1 − p)^m = 0.99,
where we have used the change of basis of logarithms logb x = loga x/ loga b, and where ln is the natural
logarithm. Therefore we obtain at the end:
T₉₉ = mτ = [ ln(1 − 0.99) / ln(1 − p) ] τ ,
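As a quick numerical illustration of this formula (with hypothetical numbers), a success probability of p = 10⁻³ per anneal and an annealing time of τ = 20 µs give T₉₉ ≈ 0.09 s:

import numpy as np

def time_to_solution(p, tau, target=0.99):
    """T99 = m * tau, with m = ln(1 - target) / ln(1 - p) anneal repetitions."""
    m = np.log(1.0 - target) / np.log(1.0 - p)
    return m * tau

print(time_to_solution(p=1e-3, tau=20e-6))   # ~0.092 s for a 20 microsecond anneal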
A challenge in quantum annealing is that full connectivity, which manifests itself in the interaction parameters J_ij being nonzero beyond nearest-neighbour interactions, is often required in the problem Hamiltonian (for typical hard problems). This full connectivity might be difficult to achieve experimentally. In case the hardware has limited connectivity, embeddings allow one to map the fully connected problem onto a locally connected spin system, at the price of an overhead in the total number of physical qubits used. Examples of embeddings are the Lechner-Hauke-Zoller (LHZ) scheme and minor embedding (we will talk about the latter later).
is the following:
q(z) = zᵀ Q z = ∑_{j=1}^{n} Q_jj z_j + ∑_{1≤j<k≤n} Q_jk z_j z_k .    (8.11)
Figure 8.2: A 3x3 Chimera graph, denoted C3. Qubits are arranged in 9 unit cells. From D-Wave website.
The D-Wave 2000Q, developed by the Canadian company D-Wave Systems, is a commercially available quantum annealer: a heuristic solver using quantum annealing for solving optimisation problems.
Their hardware is based on rf-SQUIDs (Superconducting Quantum Interference Devices). In these devices,
flux qubits are realized by means of long loops of superconducting wire interrupted by a set of Josephson
junctions. A “supercurrent" of Cooper pairs of electrons, condensed to a superconducting condensate, flows
through the wires; a large ensemble of these pairs behaves as a single quantum state with net positive or
net negative flux, thereby defining the two qubit states. The chip must be kept extremely cold for the
macroscopic circuit to behave like a two-level (qubit) system, and nominally runs at 15 mK.
In order to be cast onto D-Wave machines, optimisation problems can be formulated either in terms
of Ising Hamiltonians, or in terms of QUBOs.
The hardware layout of the D-Wave 2000Q quantum annealer restricts the connections between binary
variables to the so-called Chimera graph, see Fig. 8.2. In order to make problems with higher connectivity amenable to
the machine, an embedding must be employed, for instance minor embedding. This means coupling various
physical qubits together into one logical qubit, representing one binary variable, with a strong ferromagnetic
coupling JF in order to ensure that all physical qubits have the same value after readout.
For how to program a D-Wave machine, see https://docs.dwavesys.com/docs/latest/doc_getting_
started.html.
– Need to know the gap between the ground state and first excited state, which can be costly to
compute
– In contrast, circuit-model algorithms tend to be more straightforward to analyze
The good:
• Constants do matter
– If the gap is such that a correct answer is expected only once every million anneals, and an anneal
takes 5 microseconds, that is still only 5 seconds to get a correct answer - may be good enough
– On current systems, the gap scaling may be less of a problem than the number of available qubits
• We may be able to (classically) patch the output to get to the ground state
– Hill climbing or other such approaches may help get quickly from a near-groundstate solution into
the ground state
– We may not even need the exact ground state
– For many optimization problems, “good and fast" may be preferable to “perfect but slow"
Formal Problem Definition
The typical passenger flow in an airport can usually be divided into three parts: After the airplane has
arrived at the gate, one part of the passengers passes the baggage claim and leaves the airport. Other
passengers stay in the airport to take connecting flights. These transit passengers can take up a significant
fraction of the total passenger amount. The third part are passengers which enter the airport through the
security checkpoint and leave with a flight. The parameters of the problem are summarized in table 8.1.
Assignment problems can easily be represented with binary variables indicating whether or not a resource
is assigned to a certain facility. The variables form a matrix indexed over the resources and the facilities.
The binary decision variables are z ∈ {0, 1}^{F×G} with
z_iα = 1 if flight i ∈ F is assigned to gate α ∈ G, and z_iα = 0 otherwise.    (8.12)
Like already stated, the passenger flow divides into three parts and so does the objective function: The
partial sums of the arriving, the departing and the transfer passengers sum up to the total transfer time of
all passengers. For the arrival part we get a contribution of the corresponding time t_α^arr for each of the n_i^arr passengers if flight i is assigned to gate α. Together with the departure part, which is obtained analogously, the linear terms of the objective are
T^{arr/dep}(z) = ∑_{iα} n_i^{arr/dep} t_α^{arr/dep} z_iα .    (8.13)
The contribution of the transfer passengers is the reason for the hardness of the problem: Only if flight i is
assigned to gate α and flight j to gate β the corresponding time is added. This results in the quadratic term
T^{trans}(z) = ∑_{iαjβ} n_ij t_αβ z_iα z_jβ .    (8.14)
Constraints
Not all binary encodings for z form valid solutions to the problem. There are several further restrictions
which need to be added as constraints. In this model a flight corresponds to a single airplane arriving and
Table 8.1: Parameters of the flight-gate assignment problem.
F         Set of flights (i ∈ F)
G         Set of gates (α ∈ G)
n_i^dep   Number of passengers departing with flight i
n_i^arr   Number of passengers arriving with flight i
n_ij      Number of transfer passengers between flights i and j
t_α^arr   Transfer time from gate α to baggage claim
t_α^dep   Transfer time from check-in to gate α
t_αβ      Transfer time from gate α to gate β
t_i^in    Arrival time of flight i
t_i^out   Departure time of flight i
t^buf     Buffer time between two flights at the same gate
departing at a single gate. It is obvious that every flight can only be assigned to a single gate, therefore we have
∑_α z_iα = 1   ∀ i ∈ F.    (8.16)
Furthermore it is clear that no flight can be assigned to a gate which is already occupied by another flight
at the same time. The resulting linear inequalities ziα + zjα ≤ 1 are equivalent to the quadratic constraints
ziα · zjα = 0 ∀(i, j) ∈ P ∀α ∈ G, (8.17)
where P is the subset of forbidden flight pairs with overlapping time slots, that can be aggregated in
P = { (i, j) ∈ F² : t_i^in < t_j^in < t_i^out + t^buf } .    (8.18)
and
C^{not}(z) = ∑_α ∑_{(i,j)∈P} z_iα z_jα    (8.20)
fulfill
C^{one/not} > 0 if the constraint is violated, and C^{one/not} = 0 if the constraint is fulfilled,    (8.21)
and therefore are suitable penalty terms which can be combined with the objective function. Since the
benefit in the objective function in case of an invalid variable choice should not exceed the penalty, two
parameters λone , λnot ∈ R+ need to be introduced:
q(z) = T (z) + λone C one (z) + λnot C not (z). (8.22)
In theory these parameters could be set to infinity, but in practice this is not possible and they have to be
chosen carefully.
The parameters λ^{one} and λ^{not} need to be large enough to ensure that a solution always fulfills the constraints. However, due to precision restrictions of the D-Wave machine, it is favorable to choose them as
small as possible. Two different possibilities to obtain suitable values are studied in Ref. [Stollenwerk et al.,
2019]: a worst case analysis for each single constraint, and a bisection algorithm which iteratively checks
penalty weights against the solution validity. We leave the technical details of this discussion to [Stollenwerk
et al., 2019].
Note that this final cost function could be expressed in terms of an Ising Hamiltonian, by using the
mapping in Eq.(8.4) and by quantizing the spin variables. However, QUBO is the native way of encoding
cost-functions into D-Wave machines, therefore the authors leave out this step.
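To make the construction concrete, the following Python sketch assembles and brute-forces the cost function q(z) for a made-up two-flight, two-gate instance; all numbers, and the explicit quadratic form chosen for C^one, are illustrative assumptions rather than data or formulas from Ref. [Stollenwerk et al., 2019]:

import numpy as np
from itertools import product

# Tiny illustrative instance (2 flights, 2 gates); all numbers are made up.
F, G = 2, 2
n_arr = [10, 5]; n_dep = [8, 12]             # arriving/departing passengers per flight
t_arr = [3.0, 6.0]; t_dep = [4.0, 2.0]       # gate-to-baggage / check-in-to-gate times
n_trans = np.array([[0, 7], [7, 0]])         # transfer passengers between flights
t_gate = np.array([[0.0, 5.0], [5.0, 0.0]])  # gate-to-gate transfer times
lam_one = 100.0                              # penalty weight for the one-gate-per-flight constraint

def objective(z):
    """Total transfer time T(z) plus a quadratic penalty enforcing Eq. (8.16)."""
    T = sum(z[i, a] * (n_arr[i] * t_arr[a] + n_dep[i] * t_dep[a]) for i in range(F) for a in range(G))
    T += sum(n_trans[i, j] * t_gate[a, b] * z[i, a] * z[j, b]
             for i in range(F) for j in range(F) for a in range(G) for b in range(G))
    C_one = sum((z[i, :].sum() - 1) ** 2 for i in range(F))  # assumed form of C^one
    return T + lam_one * C_one

best = min((objective(np.array(bits).reshape(F, G)), bits)
           for bits in product([0, 1], repeat=F * G))
print("minimal cost", best[0], "for assignment", best[1])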
Since the constraint Eq. (8.16) introduces |F| complete graphs of size |G|, and given the connectivity of Chimera graphs, which requires the use of minor embedding, this results in at most a quadratic increase in the number of physical qubits with the number of logical qubits, which is |F| · |G|. With this, the authors were able to embed instances with up to 84 binary variables (extracted from real data, but simplified considerably),
which requires roughly 1600 qubits. The instances are then solved on D-Wave, and the time-to-solution is
then evaluated as a function of the size of the instance.
Figure 8.3: Top: Time to solution against the number of flights for the instances from IBP . The different
colors represent the 25th, 50th and 75th percentiles.
The annealing solutions were obtained using 1000 annealing runs, and majority voting as an un-embedding strategy for broken chains of logical qubits. The time to solution is then evaluated using the expression for T₉₉ from Sec. 8.3.1, where τ is the annealing time, which was fixed to 20 µs.
Figure 8.3 shows the time-to-solution as a function of the number of flights. There is an increase in the time-to-solution with the number of flights, and therefore with the problem size.
A conclusive assessment of the scaling behavior is not possible at the moment, because due to the high
connectivity of the problem there is a large embedding overhead, and as a result the size of the amenable
problems is very small. Future generations of quantum annealers with more qubits and higher connectivity
are needed to investigate larger problems.
8.4 Quantum Approximate Optimization Algorithm (QAOA)
8.4.1 Introduction to QAOA: From the quantum adiabatic algorithm to QAOA
This Section is taken from the licentiate thesis of Pontus Vikstål (Chalmers, 2021).
The QAOA [Farhi et al., 2014] is inspired by the adiabatic quantum computing (AQC), and in particular
its application to optimisation problems in the context of quantum annealing, that you have seen in Chap.7
and 8.3.1. We report here for convenience the Hamiltonian of Eq.(8.10):
Here s(0) = 0 and s(T) = 1, T is the total time of the algorithm, ĤM is the initial Hamiltonian, whose maximally excited state is easy to prepare, and ĤC is the cost Hamiltonian, whose maximally excited state encodes the solution to an optimization problem.
The QAOA is based on the observation that the easiest way to practically implement the quantum
annealing Hamiltonian evolution expressed by Eq.(8.23) is to Trotterize it, i.e. to decompose it in small time
steps:
Û(T) ≡ T exp[ −i ∫₀ᵀ Ĥ(t) dt ] ≈ ∏_{k=1}^{p} exp[ −i Ĥ(k∆t) ∆t ] .    (8.24)
Here Û (T ) is the evolution operator from 0 to T , T is the time-ordering operator, and p is a large integer so
that ∆t = T /p is a small time segment. Next, for two non-commuting operators A and B and sufficiently
small ∆t, one can use the Trotter formula e^{(A+B)∆t} ≈ e^{A∆t} e^{B∆t}.
Thus it is possible to approximate quantum annealing by applying the cost and mixing Hamiltonians in an alternating sequence.
alternating sequence.
The key idea underlying QAOA is to truncate this product to an arbitrary positive integer and redefine
the time dependence in each exponent (1 − s(k∆t))∆t → βk and s(k∆t)∆t → γk . In this way, and as a
crucial difference with quantum annealing, the fixed time segments become instead variational parameters to
be optimized. Finally letting the product act on the initial state of quantum annealing, the plus state, one
obtains the variational state
|γ⃗, β⃗⟩ ≡ ∏_{k=1}^{p} e^{−iβ_k ĤM} e^{−iγ_k ĤC} |+⟩^{⊗n} = ∑_z d_z^{(γ⃗,β⃗)} |z⟩ ,    (8.27)
where γ⃗ = (γ₁, γ₂, . . . , γ_p) and β⃗ = (β₁, β₂, . . . , β_p), and where in the last step we have just made explicit that the variational state is a given superposition of z-eigenstates. If the eigenvalues of the cost Hamiltonian are all integers, then the dependence on each γ_k is 2π-periodic, so γ_k can be restricted to [0, 2π]; similarly, the mixer angles are π-periodic, β_k ∈ [0, π]. However, the task of choosing the time-dependence of the angular variational parameters (γ⃗, β⃗)
remains. The possibility of optimising over variational parameters makes it irrelevant whether the initial
state is a ground state or the highest excited state. In the context of QAOA, we might hence re-define the
mixing Hamiltonian without the minus sign, i.e. we set ĤM to be
ĤM = ∑_i σ̂ᵢˣ .    (8.28)
Also, different authors use QAOA for minimisation or maximisation of a given cost-function. What you
should pay attention to is the following: once the problem is given, if your cost-function ĤC encodes the
solution in its ground state, then you should proceed to minimisation of the cost function.
Let E_p(γ⃗, β⃗) be the expectation value of ĤC in the variational state of Eq. (8.27):
E_p(γ⃗, β⃗) ≡ ⟨γ⃗, β⃗| ĤC |γ⃗, β⃗⟩ = ∑_z prob_z^{(γ⃗,β⃗)} C(z),    (8.29)
where, comparing with the last term in Eq. (8.27), we see that prob_z^{(γ⃗,β⃗)} = |d_z^{(γ⃗,β⃗)}|² and C(z) = ⟨z|ĤC|z⟩. By
finding good angles ~γ and β~ that minimize the expectation value above, the probability of finding the qubits
in their lowest energy configuration when measuring is increased, because the candidate variational state becomes closer to the ground state of the cost Hamiltonian. Therefore the angles are chosen such that the
expectation value is minimized
(γ⃗*, β⃗*) = arg min_{γ⃗,β⃗} E_p(γ⃗, β⃗) .    (8.30)
In general, this requires the quantum computer to query a classical optimizer, to tell the quantum computer
how it should update the variational state by slightly changing the angles in order to minimize the expectation
value, see Fig. 8.4. This has to be repeated until some convergence criterion is met or an optimal or good-enough solution is found. In other words, QAOA converts the search over a combinatorial number of discrete configurations into a search for 2p optimal angles in a continuous energy landscape.
Figure 8.4: Schematic representation of the QAOA. The quantum processor prepares the variational state,
~ are optimized in a closed loop using
depending on variational parameters. The variational parameters (~γ , β)
a classical optimizer. From Ref. [Vikstål et al., 2020].
1. Choose initial angles γ⃗ = (γ₁, . . . , γ_p) and β⃗ = (β₁, . . . , β_p).
2. Prepare the variational state |γ⃗, β⃗⟩ of Eq. (8.27) on the quantum computer and measure all qubits in the computational basis, obtaining a bit-string z.
3. Calculate C(z) = ⟨z|ĤC|z⟩ using a classical computer. This step is classically efficient.
4. Repeat steps 2–3 m times. Record the best observed string z_best, and the sample mean (1/m) ∑_{i=1}^{m} C(z_i), where z_i is the ith measurement outcome. Note that when m → ∞ the sample mean approaches the expectation value of Eq. (8.29) by the law of large numbers, as is clear in particular from the rightmost term.
5. If the optimal or a “good enough” solution is found, output C(z_best) together with the string z_best. Else, query a classical optimizer that updates the angles (γ⃗, β⃗) based on the minimization of the expectation value, and repeat from step 2.
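The following self-contained Python/NumPy sketch runs this loop for p = 1 MaxCut on a 4-vertex ring, replacing the classical optimizer by a simple grid search over (γ, β) and maximizing the cut value; the instance and the optimizer are illustrative choices, not those of the original papers:

import numpy as np

# Tiny MaxCut instance: a 4-vertex ring (illustration only)
n = 4
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]

def C(z):
    """Number of cut edges for bit-string z."""
    return sum(z[j] != z[k] for j, k in edges)

# Diagonal cost Hamiltonian and the mixer sum_i X_i as dense matrices
dim = 2 ** n
costs = np.array([C([(idx >> (n - 1 - q)) & 1 for q in range(n)]) for idx in range(dim)])
X = np.array([[0, 1], [1, 0]])
HM = sum(np.kron(np.kron(np.eye(2 ** i), X), np.eye(2 ** (n - i - 1))) for i in range(n))
eigval, eigvec = np.linalg.eigh(HM)

def qaoa_state(gamma, beta):
    psi = np.full(dim, 1 / np.sqrt(dim), dtype=complex)                     # |+>^n
    psi = np.exp(-1j * gamma * costs) * psi                                  # e^{-i gamma H_C}
    return eigvec @ (np.exp(-1j * beta * eigval) * (eigvec.conj().T @ psi))  # e^{-i beta H_M}

def expectation(gamma, beta):
    psi = qaoa_state(gamma, beta)
    return float(np.real(np.abs(psi) ** 2 @ costs))

best = max(((expectation(g, b), g, b) for g in np.linspace(0, 2 * np.pi, 40)
            for b in np.linspace(0, np.pi, 40)))
print("best <H_C> =", best[0], "out of max cut =", costs.max())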
Since QAOA is an algorithm that provides approximate solutions, a relevant metric in order to compare
its performance with respect to efficient classical algorithms also producing approximate solutions is the
approximation ratio. The approximation ratio is C(z), where z is the output of the quantum algorithm,
divided by the maximum of C.
Another relevant quantity in the context of QAOA is its success probability. Analogously to quantum annealing, it is defined as the squared overlap between the variational state obtained for the optimal angles and the actual solution of the problem.
Fixed p Algorithm
We now explain how, for fixed p, we can do classical preprocessing and determine the angles γ⃗ and β⃗ that maximize E_p(γ⃗, β⃗). This approach works more generally, but we illustrate it for a specific problem, MaxCut for graphs with bounded degree. The input is a graph with n vertices and an edge set {⟨jk⟩} of size m. The goal is to find a string z that makes
ĤC = ∑_{⟨jk⟩} Ĉ_{⟨jk⟩} ,    (8.31)
where
C_{⟨jk⟩} = ½ ( −σ_j^z σ_k^z + 1 ) ,    (8.32)
as large as possible. Now
E_p(γ⃗, β⃗) = ∑_{⟨jk⟩} ⟨s| e^{iγ₁ĤC} · · · e^{iβ_pĤM} C_{⟨jk⟩} e^{−iβ_pĤM} · · · e^{−iγ₁ĤC} |s⟩ .    (8.33)
Consider the operator
e^{iγ₁ĤC} · · · e^{iβ_pĤM} C_{⟨jk⟩} e^{−iβ_pĤM} · · · e^{−iγ₁ĤC} .    (8.34)
This operator only involves qubits j and k and those qubits whose distance on the graph from j or k is less
than or equal to p. To see this consider p = 1 where the previous expression is
e^{iγ₁ĤC} e^{iβ₁ĤM} C_{⟨jk⟩} e^{−iβ₁ĤM} e^{−iγ₁ĤC} .    (8.35)
The factors in the operator e−iβ1 ĤM which do not involve qubits j or k commute through Chjki and we get
e^{iγ₁ĤC} e^{iβ₁(σ_j^x + σ_k^x)} C_{⟨jk⟩} e^{−iβ₁(σ_j^x + σ_k^x)} e^{−iγ₁ĤC} .    (8.36)
Any factors in the operator eiγ1 ĤC which do not involve qubits j or k will commute through and cancel out.
So the operator in equation Eq. (8.36) only involves the edge hjki and edges adjacent to hjki, and qubits on
those edges. For any p we see that the operator in Eq. (8.34) only involves edges at most p steps away from
hjki and qubits on those edges.
Return to equation Eq. (8.33) and note that the state |si is the product of σ x eigenstates
so each term in equation Eq. (8.33) depends only on the subgraph involving qubits j and k and those at a
distance no more than p away. These subgraphs each contain a number of qubits that is independent of n
(because the degree is bounded) and this allows us to evaluate Ep in terms of quantum subsystems whose
sizes are independent of n.
As an illustration consider MaxCut restricted to input graphs of fixed degree 3. For p = 1, there are only
these possible subgraphs for the edge hjki:
(8.38)
For any subgraph G define the operator CG which is C restricted to G,
C_G = ∑_{⟨ℓℓ′⟩∈G} C_{⟨ℓℓ′⟩} ,    (8.39)
and
U (BG , β) = e−iβBG . (8.42)
Let the state |s, G⟩ be
|s, G⟩ = ∏_{ℓ∈G} |+⟩_ℓ .
Return to equation Eq. (8.33). Each edge hj, ki in the sum is associated with a subgraph g(j, k) and makes
a contribution to Ep of
hs, g(j, k)| U † (Cg(j,k) , γp ) · · · U † (Bg(j,k) , β1 )Chjki U (Bg(j,k) , β1 ) · · · U (Cg(j,k) , γp ) |s, g(j, k)i (8.43)
The sum in Eq. (8.33) is over all edges, but if two edges ⟨jk⟩ and ⟨j′k′⟩ give rise to isomorphic subgraphs, then the corresponding functions of (γ⃗, β⃗) are the same. Therefore we can view the sum in Eq. (8.33) as a
sum over subgraph types. Define
where g(j, k) is a subgraph of type g. Ep is then
E_p(γ⃗, β⃗) = ∑_g w_g f_g(γ⃗, β⃗) ,    (8.45)
where wg is the number of occurrences of the subgraph g in the original edge sum. The functions fg do not
depend on the number of decision variables n, nor on the number of constraints or clauses m. The only
dependence on those quantities comes through the weights wg and these are just read off the original graph.
Note that the expectation in Eq. (8.44) only involves the qubits in subgraph type g. The maximum number
of qubits that can appear in Eq. (8.43) comes when the subgraph is a tree. For a graph with maximum degree v, the number of qubits in this tree is
q_tree = 2 [ (v − 1)^{p+1} − 1 ] / [ (v − 1) − 1 ] ,    (8.46)
(or 2p + 2 if v = 2), which is n and m independent. For each p there are only finitely many subgraph types.
Using Eq. (8.44), E_p(γ⃗, β⃗) in Eq. (8.45) can be evaluated on a classical computer whose resources do not grow with n. Each f_g involves operators and states in a Hilbert space whose dimension is at most 2^{q_tree}.
Admittedly for large p this may be beyond current classical technology, but the resource requirements do
not grow with n.
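For example (a direct evaluation of Eq. (8.46), nothing more), the subgraph sizes and Hilbert-space dimensions for degree-3 graphs grow as follows:

def q_tree(v, p):
    """Worst-case number of qubits in the p-neighborhood subgraph (a tree), Eq. (8.46)."""
    return 2 * p + 2 if v == 2 else 2 * ((v - 1) ** (p + 1) - 1) // ((v - 1) - 1)

for p in range(1, 5):
    # subgraph size and Hilbert-space dimension 2^q_tree for v = 3
    print(p, q_tree(3, p), 2 ** q_tree(3, p))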
To run the quantum algorithm we first find the (γ⃗, β⃗) that maximize E_p. The only dependence on n and m is in the weights w_g, and these are easily evaluated. Given the best (γ⃗, β⃗) we turn to the quantum computer and produce the state |γ⃗, β⃗⟩ given in Eq. (8.27). We then measure in the computational basis, get a string z, and evaluate C(z) = ⟨z|ĤC|z⟩. Repeating gives a sample of values of C(z) between 0 and +m whose mean is E_p(γ⃗, β⃗). An outcome of at least E_p(γ⃗, β⃗) − 1 will be obtained with probability 1 − 1/m with order m log m repetitions.
Let us briefly discuss the performance of QAOA on MaxCut on (connected) 3-regular graphs. In
Ref. [Farhi et al., 2014], they show that for p = 1, the worst case approximation ratio that the quan-
tum algorithm produces is 0.6924. Hence this p = 1 result on 3-regular graphs is not as good as known
classical algorithms, i.e. 0.9326 [Halperin et al., 2004]. It is possible to analyze the performance of the
QAOA for p = 2 on 3-regular graphs. However it is more complicated than the p = 1 case. A numerical
optimisation yields 0.7559. Recently, Wurtz and Love speculated that there is no quantum advantage for
QAOA for solving Max-cut on 3-regular graphs for p < 6 [Wurtz and Love, 2020].
• It has been shown that QAOA is universal, meaning that for a problem of size n, and a choice of ĤC
and ĤM , QAOA can approximate any unitary U of dimension 2n ×2n to arbitrary precision when using
a sufficiently large iteration p, which in general depends on n [Farhi et al., 2017, Lloyd, 2018, Morales
et al., 2019].
• Farhi et al. showed that level p = 1 QAOA is computationally hard to simulate on a classical computer
without collapsing the polynomial hierarchy to the third level [Farhi and Harrow, 2019]. However, this
does not imply that QAOA for p = 1 is able to solve problems more efficiently than classically.
• In the absence of a fully connected set-up, one can split the role of the cost-function used for the
classical optimization within QAOA and the Hamiltonian implementing the evolution, and still retain
non-trivial approximation ratios, e.g. for Max-Cut [Farhi et al., 2017]. Furthermore, one can explore
the advantage stemming from rotating the qubits with different angles when acting with the mixer
Hamiltonian, and more generally introducing more free parameters.
• Brandao et al. [Brandao et al., 2018] demonstrated that if the problem instances come from a reasonable
distribution, then the expectation value of the cost function concentrates. This suggests that it is
possible to train a classical optimizer to find good angles on small instances and reuse those angles on
larger instances, as long as they come from the same distribution.
• Furthermore, it was shown that the QAOA is able to realize Grover’s search algorithm [Jiang et al.,
2017, Niu et al., 2019].
• In 2019, Hadfield et al. put forward the quantum alternating operator ansatz, which generalizes the original QAOA ansatz to allow for more general types of Hamiltonians and initial states [Hadfield et al., 2019].
• In general, QAOA is nowadays an active field of research, and researchers are fervently trying to characterise, understand and derive bounds on its performance.
where (we have to add an additional minus sign to H₀ such that the state |+⟩^{⊗N} we start from is the ground state of H(s) and the convention still conforms with the formulation of the QAOA)
H₀ = ∑_i σ_i^x ,    (8.48)
H_C = ∑_i h_i σ_i^z + ∑_{ij} J_ij σ_i^z σ_j^z .    (8.49)
We discretize the time-evolution operator of the annealing process into N time steps of size τ = ta /N .
Approximating each time step to second order in τ yields
γn = τ B(sn ), n = 1, . . . , N (8.52)
βn = −τ (A(sn+1 ) + A(sn )) /2, n = 1, . . . , N − 1 (8.53)
βN = −τ A(sN )/2. (8.54)
So N time steps for the second-order-accurate annealing scheme correspond to p = N steps for the QAOA.
As an example, we take
γ_n = τ (n − 1/2)/N ,    (8.56)
β_n = −τ (1 − n/N) ,    (8.57)
β_N = −τ/(4N) .    (8.58)
Since the time evolution of quantum annealing is necessarily finite, with quantum annealing, too, there
is an associated success probability, i.e. the probability of being in the actual desired ground state at the
end of the evolution. In general, how the QAOA and quantum annealing performances - measured by the
respective success probabilities – compare depends on the problem instance itself. Some detailed comparisons are presented in Ref. [Willsch et al., 2020] for weighted Max-Cut and 2-SAT.
Chapter 9
In this chapter, we discuss the variational quantum eigensolver (VQE). The VQE is a heuristic approach to
solving various problems with a combination of quantum and classical computation. As we will see later,
the QAOA of the preceding chapter can be considered a special case of the VQE.
The content of this chapter is mostly based on the review in Ref. [Moll et al., 2018]. We first outline how
the VQE works and then discuss details of some of the steps in the algorithm.
How hard is this problem in general? If the Hamiltonian is k-local, i.e., if terms in H act on at most k
qubits, the problem is known to be QMA-complete for k ≥ 2. The general problem would thus be hard even
for an ideal quantum computer. However, it is believed that physical systems have Hamiltonians that do
not correspond to hard instances of this problem, and a heuristic quantum algorithm could still outperform
a classical one.
A general Hamiltonian for N qubits can be written
H = ∑_α h_α P_α = ∑_α h_α ⊗_{j=1}^{N} σ_{α_j}^{(j)} ,    (9.2)
where the hα are coefficients and the Pα are called Pauli strings. The latter are products of single-qubit
Pauli matrices (including the identity matrix).
The steps of the VQE algorithm are the following (see also Fig. 9.1):
0. Map the problem that you wish to solve to finding the ground-state energy of a Hamiltonian on the
form in Eq. (9.2).
1. Prepare a trial state |Ψ(ϑ)⟩, parameterized by a vector ϑ of control parameters, on the quantum computer.
2. Measure expectation values of the Pauli strings in the Hamiltonian, i.e., measure ⟨Ψ(ϑ)| P_α |Ψ(ϑ)⟩.
3. Calculate the energy E corresponding to the trial state, E = ∑_α h_α ⟨Ψ(ϑ)| P_α |Ψ(ϑ)⟩, by summing up the results of the measurements in the preceding step.
4. Update the parameters ϑ using a classical optimization algorithm, aiming to lower E.
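Before discussing the steps in more detail, here is a minimal classical (statevector) Python sketch of steps 2–4; the two-qubit Hamiltonian, the RY-plus-CNOT trial state, and the grid-search "optimizer" are all toy assumptions chosen for illustration:

import numpy as np
from functools import reduce

# Single-qubit Paulis
I2 = np.eye(2); X = np.array([[0, 1], [1, 0]]); Z = np.diag([1, -1])
PAULI = {'I': I2, 'X': X, 'Z': Z}

def pauli_string(label):
    """Tensor product of single-qubit Paulis, e.g. 'ZZ' for 2 qubits."""
    return reduce(np.kron, [PAULI[c] for c in label])

# A toy 2-qubit Hamiltonian H = sum_a h_a P_a (coefficients chosen arbitrarily)
hamiltonian = [(-1.05, 'II'), (0.39, 'ZI'), (0.39, 'IZ'), (-0.01, 'ZZ'), (0.18, 'XX')]

def energy(state):
    """Steps 2-3: sum of Pauli-string expectation values weighted by h_a."""
    return sum(h * np.real(state.conj() @ pauli_string(p) @ state) for h, p in hamiltonian)

def trial_state(theta):
    """A minimal hardware-efficient-style trial state: RY rotations plus a CNOT."""
    def ry(t): return np.array([[np.cos(t / 2), -np.sin(t / 2)], [np.sin(t / 2), np.cos(t / 2)]])
    cnot = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]])
    psi0 = np.zeros(4); psi0[0] = 1.0
    return cnot @ np.kron(ry(theta[0]), ry(theta[1])) @ psi0

# Step 4 done crudely: scan parameters classically and keep the lowest energy
thetas = np.linspace(0, 2 * np.pi, 50)
best = min(((energy(trial_state([a, b])), a, b) for a in thetas for b in thetas))
print("estimated ground-state energy:", best[0])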
Steps 1 and 2 are run on a quantum computer, which can handle the quantum states more efficiently than a classical computer. Steps 3 and 4 are done on a classical computer. The algorithm is iterative, i.e., it starts over from step 1 after step 4, and continues to iterate until some convergence criterion is met, indicating that the ground-state energy has been found. In the following sections, we discuss steps 0, 1, and 4 in more detail.

9.2 More on step 0 – mapping to a Hamiltonian

Broadly speaking, the VQE is currently mostly being considered for two types of problems: optimization problems, where H is a cost function for the problem, and many-body fermionic quantum systems, e.g., molecules (quantum chemistry). The first type of problems was discussed extensively in Chapter 8, where we saw several examples of how optimization problems can be mapped to a Hamiltonian. In this chapter, we therefore focus on quantum-chemistry problems.

Even though a Hamiltonian can be written down for a molecular system, that is not the Hamiltonian that is used in VQE. In classical simulations of molecular systems, there are many different methods, e.g., density functional theory (DFT), where the actual system of interacting electrons is described as non-interacting electrons moving in a modified external potential. An approach more suited to VQE is to describe the system in second quantization. This requires calculating a number of spatial integrals on a classical computer, but that task can be accomplished efficiently. The Hilbert space consists of molecular electron orbitals. The Hamiltonian is

H_F = ∑_{i,j} t_ij a_i† a_j + ∑_{i,j,k,l} u_ijkl a_i† a_k† a_l a_j ,    (9.3)
where ai (a†i ) annihilates (creates) an electron in the ith orbital. The coefficients tij and uijkl describing
one- and two-electron interactions are calculated from the spatial integrals mentioned above.
n Theooperators in Eq. (9.3) are fermionic. They thus obey the fermionic anti-commutation relations, e.g.,
ai , a†j = δij . These are not the relations that the qubit Pauli operators obey. We thus need to translate
the Hamiltonian in Eq. (9.3) to a form that can be implemented on the quantum computer. One well-known
mapping from fermionic operators to qubit operators is the Jordan-Wigner transformation:
where N is the number of orbitals and qubits. This mapping is not well suited to the VQE, because it creates
highly non-local terms in the qubit Hamiltonian. In actual applications of VQE to quantum chemistry, other
mappings are used (Bravyi-Kitaev, parity, ...). There is ongoing research on finding more suitable mappings.
where |Φi is a simple state formed by the Slater determinant for low-energy orbitals. The operator T (ϑ) is
known as a cluster operator. It is given by
X
T (ϑ) = Tk (ϑ), (9.6)
k
ϑji a†j ai ,
X
T1 (ϑ) = (9.7)
i∈occ,j∈unocc
† †
X
T2 (ϑ) = ϑkl
ij al ak aj ai , (9.8)
i,j∈occ,k,l∈unocc
where the sums go over occupied and unoccupied orbitals. The coefficients of the higher-order cluster
operators decrease rapidly as more orders are included. For this reason, the expansion is usually truncated
at the second , “double”, order (UCCSD) or the third, “triple”, order (UCCSDT).
|Ψ (ϑ)i = Usingle (ϑ)Uent (ϑ)Usingle (ϑ)Uent (ϑ) . . . Usingle (ϑ)Uent (ϑ) Usingle (ϑ) |00 . . . 0i . (9.9)
| {z }
d repetitions
75
Here, Usingle (ϑ) represent arbitrary-single qubit rotations on each of the N qubits (different rotations in each
of the d+1 steps) and Uent (ϑ) represent two-qubit entangling operations (same in each step) that were easy to
implement in the available hardware. For the single-qubit operations alone, there are N (3d + 2) independent
rotation angles in the parameter vector ϑ (an arbitrary single-qubit rotation can be characterized by 3 Euler
angles).
Already for relatively small molecules, d needs to be more than just a few repetitions to reach accuracy
that can compete with classical methods. However, a larger d means that the quantum circuit takes longer to
run, and thus decoherence will limit the achievable d. Recently, researchers are exploring “error mitigation”
to get around this problem. In one type of error mitigation, the experiment is rerun several times with
varying levels of added noise. From this, one can extrapolate the answer towards what it would have been
for zero noise.
Note that the form of Eq. (9.9) is that of the QAOA in Eq. (8.51). This shows that the QAOA is an
example of the broader class of algorithms that is the VQE.
76
Chapter 10
These parts of the notes follow Refs. [Lund et al., 2017, Aaronson and Arkhipov, 2013], as well as the notes
by Sevag Gharibian at the Quantum Complexity Theory 2019, University of Paderborn, http://groups.
uni-paderborn.de/fg-qi/courses/UPB_QCOMPLEXITY/2019/UPB_QCOMPLEXITY_syllabus.html. We also
highly recommend to read the short review Ref. [Harrow and Montanaro, 2017].
3. Large-scale universal quantum computers can be built (and hence there is something wrong in quantum
mechanics textbooks).
One of the main motivations of studying (selected) sampling problems is that possible to prove that they
are indeed problems that can be solved efficiently in polynomial time by a quantum computer, while they
cannot be solved in polynomial time by a classical computer (up to widely believed conjectures in computer
77
science). Experimentally validating a sampling model would hence disprove the Extended Church-Turing
thesis and thereby solve Shor’s trilemma.
Sampling problems consist in generating output random numbers according to a particular probability
distribution (see Fig. 1). All quantum computations on n qubits can be expressed as the preparation of
an n-qubit initial state |0i⊗n , a unitary evolution corresponding to a uniformly generated quantum circuit
C followed by a measurement in the computational basis on this system. In this picture the computation
outputs a length n bitstring x ∈ {0, 1}n with probability
Px = |hx|C|0i⊗n |2 (10.1)
In this way quantum computers produce probabilistic samples from a distribution determined by the circuit
C.
One of the appealing features of these models is that they are restricted models of quantum computation,
or sub-universal, and hence they do not require a universal quantum computer: they might be imple-
mentable on a large scale via near-term “noisy intermediate scale quantum devices” (NISQ). This quest for
an experimental demonstration of quantum computational speedup has fallen under the moniker of “quan-
tum advantage”, with multiple candidate approaches to date: The Instantaneous Quantum Polynomial-Time
(IQP) model, random circuit sampling (RCS), and the deterministic quantum computation with one quan-
tum bit (DQC1) model. Among these candidates, the Google AI-group recently released a demonstration
of RCS with 53 qubits [Arute et al., 2019], which is at the edge of current simulation capability of classical
computers [Pednault et al., 2019], and the Pann group a proof of Gaussian Boson Sampling with 100 optical
modes and 50 input squeezed states [Zhong et al., 2020]. Despite disproving the Extended Church-Turing
Thesis is inherently challenging because it is a scaling statement, these two experiment constitute a first
proof of quantum advantage, in that classical computers cannot currently simulate the experiments them-
selves. Some comments on the Google experiments can be found in Sec. 10.3. Here, however, we shall focus
on two of the first and historically more important models for quantum advantage: Instantaneous Quantum
Polytime and Boson sampling. Note that the historically very first leap towards exhibiting a candidate
model for quantum advantage was taken by Terhal and DiVincenzo in their 2008 paper “Adaptive quantum
computation, constant depth quantum circuits and Arthur-Merlin games" [Terhal and DiVincenzo, 2002].
Their approach was already to appeal to a complexity-theoretic argument: they gave evidence that there
exists a certain class of quantum circuits that cannot be simulated classically by proving that if a classical
simulation existed, certain complexity classes strongly believed to be distinct would collapse to the same
class.
Beyond the conceptual interest in unveiling quantum advantage with sampling models, we also underline
that a convincing demonstration of the impossibility for a classical computer to produce the same output
probability distribution as a quantum architecture is also a crucial milestone towards demonstrating the
controllability of quantum computers, with the scope of using them to perform useful algorithms.
78
or (2) Z, CZ , and CCZ (doubly controlled-Z) gates. Between two layers of Hadamard gates is a collection
of diagonal gates. Although these diagonal gates may act on the same qubit many times, they all commute
so in principle could be applied simultaneously, hence the name “instantaneous". In the case of (1) these
circuits correspond to random instances of the Ising model drawn from the complete graph [Bremner et al.,
2016].
Ref. [Bremner et al., 2016] extends this result to the approximate sampling case relying on conjectures,
similarily to the Boson Sampling model. In contrast to Boson Sampling, though, these conjectures have later
been proven by Jens Eisert and collaborators.
Figure 10.1: Example of an IQP circuit, taken from Ref. [Harrow and Montanaro, 2017].
The worst-case complexity of the problems in both (C1) and (C2) can be seen to be hard to classically
sample in two steps.
First, one prove that these families of circuits are examples of sets that become universal under post-
selection and as a result their output probabilities are hard to classically sample.
Then, complexity-theoretical arguments can be applied, that allow to conclude on the hardness of the
model. We are going to review both aspects briefly here.
It can be shown that the complexity of computing the output probabilities of IQP circuits, Px =
|hx|H ⊗n DH ⊗n |0i⊗n |2 , is classically hard in the worst case and this also holds under multiplicative ap-
proximation.
|ψi • X̂
|+i • X h H |ψi
Figure 10.2: Hadamard gadget in a post-selected IQP circuit, where h takes value 0 if +1 is measured, while
h = 1 if the result is −1.
Output state
Suppose one wants to implement a Hadamard gate on an arbitrary qubit |ψi = α |0i + β |1i. Following
the circuit structure depicted in Fig.10.2, we add an ancillary qubit initialized in |+i so that we start from
79
(omitting normalization)
|ψi |+i = α |00i + α |01i + β |10i + β |11i .
Then we apply the controlled Z gate and the measurement in the X basis. Conditioned on getting the
outcome corresponding to the state |+i when measuring the first qubit we have:
ĈZ
α |00i + α |01i + β |10i + β |11i 7−→ α |00i + α |01i + β |10i − β |11i
h+|
7−→ α(|0i + |1i) + β(|0i − |1i) = H |ψi . (10.2)
If instead we get the outcome corresponding to the state |−i when we measure the first qubit, the same kind
of calculations give:
ĈZ
α |00i + α |01i + β |10i + β |11i 7−→ α |00i + α |01i + β |10i − β |11i
h−|
7−→ α(|0i + |1i) − β(|0i − |1i) = XH |ψi . (10.3)
Defining h the outcome of the measurement, so that h = 0 (resp. h = 1) corresponds to measuring the state
|+i (resp. |−i), then the result of the computation is, in the general case
X h H |ψi .
So the point of post selecting is to ensure it is indeed H and not −H that has been implemented.
80
PostBPP (that is, classical polynomial-time with postselection, also called BPPpath ). This would yield to
the following chain of inclusions of complexity classes:
which is known to imply a collapse of the polynomial hierarchy. The final argument why this happens is that
PP is as hard as PH (Toda’s theorem, Gödel prize 1998), while PostBPP is contained in the third level of
the polynomial hierarchy. Therefore, if exact IQP was efficiently classically simulatable, the full polynomial
hierarchy would be contained in the third level, which implies the collapse.
1. It is a major milestone, demonstrating that a gate-based quantum computer can indeed perform a
computational task in 3 minutes, which would take 2.5 days to solve on the most powerful supercom-
puter on earth. Importantly, if Google would now add a few more qubits to their chip, simulating the
outcome would become impossible even for that supercomputer, in decent times. E.g., for 60 qubits,
you would need 33 such supercomputers for just storing the quantum state of the Google chip.
2. The computation is not claimed to be useful in any way. The task is to sample from a particular
probability distribution. This task was chosen carefully, because it is easy to perform on Google’s
quantum computer, but hard on any classical computer.
3. This does not imply that quantum computers from now on outperform classical computers in general.
However as quantum computers evolve, the class of computational tasks where the quantum computer
performs better than classical computers will grow. The hope is of course that at some point it will
also contain useful computations.
4. The computation was performed without using error correction. The error rate was low enough to give
the right answer for this comparably short quantum algorithm. This is a so-called NISQ device, where
NISQ stands for Noisy Intermediate-Scale Quantum device.
5. The basic architecture of the 53-qubit device is similar to previously published devices from Google.
The breakthrough consists of careful engineering of control hardware and software as well as a thor-
ough analysis of which computational tasks are easy for a quantum computer and hard for classical
computers.
81
circuit sampling-figure.pdf
Figure 10.3: Sketch of a Random Circuit Sampling, taken from Ref. [Arute et al., 2019].
6. The algorithm creates a random entangled state by repeating layers of eight sets of gates, where most
qubits are taking part in eight entangling gates, two with each nearest neighbour, interlaced with eight
single qubit gates. For the longest algorithm, each layer is repeated twenty times, giving on the order
of 53 × (8/2) × 20 ∼ 4000 two-qubit gates. The algorithm is then repeated one million times, to
give appropriate statistics. The full run-time is 200 seconds. To perform the similar sampling on a
supercomputer with 1 million cores is estimated to take 2.5 days.
7. In WACQT, we are inspired by this breakthrough. We note that the average lifetime (T1) of the
Google qubits is 16 microseconds, while we have demonstrated reproducible lifetimes of around 80
microseconds. However, Google’s design gives them very fast two-qubit gates, taking less than 20
ns. At the moment, we are working on increasing the speed of our two-qubit gates. We also note
the importance of automated calibration and control software, which we are currently developing
also for our setup. WACQT also takes full part in the work to find useful algorithms suitable for
superconducting qubit architectures.
8. In contrast to the other circuits in this chapter that are candidate to yield quantum advantage, in [Arute
et al., 2019] the gates that are implemented are in principle√ drawn
√ √ from a universal gate set.
√ More
in detail, regarding single qubit gates, they implement X, Y , W , with W = (X + Y )/ 2 (see
Fig. 10.3). They generate random quantum circuits using the two-qubit unitaries measured for each
pair during simultaneous operation, rather than a standard gate for all pairs. The typical two-qubit
gate is a full iSWAP with 1/6th of a full CZ. Using individually calibrated gates in no way limits the
82
universality of the demonstration.
Figure 10.4
input ports of a multi-port splitter, which we feed with N photons. We assume that the input state only
has maximum one input photon maximum per mode (see Fig. 10.4). With no loss of generality we order the
modes such that the first N input modes contain a photon, while the others are empty. I.e,
|ψin i = |11 , ...1N , ..., 0M i = â†1 ...â†N |01 , ..., 0M i ≡ |T i. (10.6)
Now a linear optics network described by the M × M matrix U is applied to the input state. Linear bosonic
interactions, or linear scattering networks, are defined by dynamics in the Heisenberg picture that generate
a linear relationship between the annihilation operators of each mode, i.e.
M
Uj,k âk , i.e. ~b = U~a;
X
b̂j = U † âj U = (10.7)
k=1
M
b̂†j = U † â†j U = †
â†k , i.e. b~† = U † a~† ,
X
Uj,k (10.8)
k=1
1 In this sense, therefore, the Boson Sampling model already lives in the bosonic space associated with an infinite dimensional
Hilbert space and with continuous-variable operators, such as the quadratures of the field. However, we present this model in
the discrete-variable section of the notes, to highlight the contrast with models making use of squeezed states and homodyne
detection. The latter are more traditionally associated with the continuous-variable approach, and they will be presented in
Chapter 11.
83
which also implies that â†j = k=1 Uj,k b̂†k . It is important to make a distinction from the unitary operator
PM
U which acts upon the Fock basis and the unitary matrix defined by U which describes the linear mixing of
modes. For optical systems the matrix U is determined by how linear optical elements, such as beam-splitters
and phase shifters, are laid out. In fact all unitary networks can be constructed using just beam-splitter and
phase shifters.
The set of events which are then output by the algorithm is a tuple of M non-negative integers whose
sum is N . This set is denoted ΦM,N . As we will explicitly show in Sec.(10.4.2), the probability distribution of
output events is related to the matrix permanent of sub-matrices of U . The matrix permanent is defined in a
recursive way like the common matrix determinant, but without the alternation of addition and subtraction.
For example
a b
Per = ad + cb; (10.9)
c d
a b c
where Sn represents the elements of the symmetric group of permutations of n elements. With this, we can
now define the output distribution of the linear network with the input state from Eq.(10.6). For an output
event S = (S1 , S2 , ..., SM ) ∈ Φm,n , the probability of S
|Per(U S )|2
PS = (10.12)
S1 !S2 !...SN !
where the matrix US is a N × N sub-matrix of U where column i is repeated Si times and only the first N
rows are used. One critical observation of this distribution is that all events are proportional to the square
of a matrix permanent derived from the original network matrix U . The resulting photon distribution in
output is hard to be classically sampled, as we will show later.
The sum in Eq.(10.13) is composed of M N terms, which is the number of ways in which we can put N
objects in M modes allowing for repetitions; we can refer to this set of permutation as Ṽ , and then rename
84
the terms of the sum in Eq.(10.13), i.e.
N
N X
M M N
Uj,k b̂† = †
Y XY
kU k,Ṽkj b̂Ṽ j (10.14)
k
j=1 k=1 j=1 k=1
where k is the k-th boson and Ṽkj is in which mode that photon is found in the permutation j. In other
words, the ensemble Ṽ is the set of M N permutations of N photons in M modes, repetitions allowed. Let’s
consider as an example the case in which we have in input N = 2 photons in M = 3 modes. Then we have
to consider j=1 â†j , i.e. the first two rows of the vector
Q2
â†1
†
U11 b̂†1 + U12 b̂†2 + U13 b̂†3
U11 U12 U13 b̂1
†
â2 = U21 U22 U23 b̂†2 = U21 b̂†1 + U22 b̂†2 + U23 b̂†3 (10.15)
† † † † †
â3 U 31 U32 U33 b̂3 U31 b̂1 + U32 b̂2 + U33 b̂3
which gives
2
â†j (U11 b̂†1 + U12 b̂†2 + U13 b̂†3 )(U21 b̂†1 + U22 b̂†2 + U23 b̂†3 )
Y
=
j=1
85
Before going to the general case of arbitrary output configuration, we want to √ fix the ideas with an
example. Consider the case where we project onto the state | {S}i = |020i = b̂†2
2 / 2|000i. Then the only
†2
contributing terms in Eq.(10.16) is U12 U22 b̂2 , and we obtain:
1 2
2
PS = |hS|ψout i| = h000|b̂22 U b̂†1 b̂†2 |000i
2!
1 2
= h000|b̂22 U12 U22 b̂†2
2 |000i
2
1 2 2
= 4|U12 U22 | = 2|U12 U22 | , (10.19)
2
permanent of the submatrix of U identified by the rows corresponding to the input photons, and the columns
identified by the output configuration of interest, where the columns are repeated as many times as the
occupation of the output mode. In practice, we have:
so that
(S) U12 U12
U = , (10.21)
U22 U22
and
Per(U S ) = U12 U22 + U12 U22 = 2U12 U22 , (10.22)
yielding immediately Eq.(10.19) according to (10.12).
In the general case, let us now indicate with S̃kj the j-th permutations of the N photons in the output
state, where k is the photon index. At maximum, there will be N ! permutations, corresponding to the ways
of arranging N photons in N modes, repetitions not allowed (note that if in the output distribution there
are more photon per mode then the number of ways I can put N photons in M modes with S1 in the first,
S2 in the second... is N !/(S1 !S2 !...SM !)). For example, it is clear that e.g. if we project onto the state
| {S}i = |011i then we have S = [2, 3] and
j = 1 : k = 1 → S̃kj = 2; k = 2 → S̃kj = 3;
j = 2 : k = 1 → S̃kj = 3; k = 2 → S̃kj = 2. (10.23)
It is clear that the only non-zero terms in the sum of Eq.(10.18) will be those for which ṼkS = S̃kj , hence2
2
N! Y
N
1 X
PS = Uk,S̃ j , (10.24)
S1 !...SM ! j=1 k
k=1
2 2
|U12 U23 + U13 U22 | , while for the state |020i we have seen that PS = 2|U12 U22 | . We can now compare
2 We have used that
N
N
N M N M N N! Y
N
Uk,Ṽ j b̂† j |01 , ..., 0M i
Y X Y X Y X
h01 , ..., 0M | b̂S k = S1 !...SM ! Uk,S̃ j = Uk,S̃ j .
k Ṽk k k
k=1 j=1 k=1 j=1 k=1 j=1 k=1
86
Eq.(10.24) with the formula for the permanent of an L × L matrix A, which reads
L! Y
X L
Per(A) = ak,σ̃j . (10.25)
k
j=1 k=1
where σ̃kj is the k-th element of the j-th permutation of the numbers 1, ..., L. It is easy to compare Eq.(10.24)
to Eq.(10.25); note however that our original matrix was M × M , while we are computing here only the
permanent of the sub-matrix involving the first N rows, and columns which correspond to configuration S.
Hence we obtain Eq.(10.12).
(2) If we had a polynomial-time classical algorithm for exact BosonSampling, then we could approximate
p to within a multiplicative constant in the class BPPNP , by using a standard technique called universal
hashing.
Combining facts (1) and (2), we find that, if the classical BosonSampling algorithm exists, then P#P =
BPPNP , and therefore the polynomial hierarchy collapses.
The second proof is inspired by independent work of Bremner, Jozsa, and Shepherd [Bremner et al.,
2010], and namely by the proof of computational hardness for the IQP model that we have seen in Sec.10.2.
In this proof, one starts with a result of Knill, Laflamme, and Milburn, which says that linear optics with
adaptive measurements is universal for BQP, giving name to the respective KLM model. A straightforward
modification of their construction shows that linear optics with postselected measurements is universal for
PostBQP (that is, quantum polynomial-time with postselection on possibly exponentially-unlikely measure-
ment outcomes). Furthermore, Aaronson previously showed that PostBQP = PP. On the other hand, if a
classical BosonSampling algorithm existed, then we will show that we could simulate postselected linear
87
optics in PostBPP (that is, classical polynomial-time with postselection, also called BPPpath ). We would
therefore get exactly the same chain of inclusion as in Eq.(10.5), and the same conclusions on the hardness
of the model apply.
88
Chapter 11
Continuous-Variable Approach to
Quantum Information
In the Continuous-Variable (CV) approach to quantum information processing, √ relevant observables √are
characterised by a continuous spectrum, such as the amplitude q̂ = (â+↠)/( 2) and phase p̂ = (â−↠)/(i 2)
quadratures of the electromagnetic field, satisfying [q̂, p̂] = i. The associated Hilbert space is infinite-
dimensional, and the (infinite) energy levels are eigenstates of the number operator n̂ = ↠â. This is
opposed to the traditional Discrete Variable (DV) approach, where observable have a discrete spectrum,
and the Hilbert space is finite-dimensional. Generally speaking, CV systems offer the advantage that the
resource states (e.g. squeezed states or large cluster states, that we will introduce in the following) can be
deterministically produced. Moreover, new methods for experimental implementations, offering solutions to
scalability, have emerged in the context of CV. For instance, A. Furusawa (Tokyo, Japan) and O.Pfister
(Charlottesville, USA) have been able to produce CV entangled states of up to one-million optical modes; in
the experiments of N. Treps and V. Parigi (Paris, France) several squeezed states are simultaneously available
in the same optical cavity. With microwave cavities coupled to superconducting circuits, the Yale group has
demonstrated that it is possible to store quantum information for a longer time in a radiation state (namely
an encoded “cat state") then if the corresponding state is encoded in a qubit. CV are therefore promising
for the implementation of scalable and robust architectures for quantum computing.
89
This Hamiltonian has the following energy eigenvalues
1
En = ~ω n + , n = 0, 1, 2, . . .
2
and the eigenstates of the Hamiltonian are known as Fock states, written in the Dirac notation as |ni, where
n represents the number of quanta or photons in the single-mode field. The Fock states are eigenstates of
the number operator n̂ = ↠â, satisfying
↠â |ni = n |ni .
The vacuum state of the harmonic oscillator is defined by
â |0i = 0. (11.1)
Acting with the creation and annihilation operators on the Fock state yields
√
â |ni = n |n − 1i , (11.2)
√
↠|ni = n + 1 |n + 1i . (11.3)
Hence it is clear that the creation operator ↠, creates a quanta of energy ~ω and the annihilation operator
â destroys a quanta of energy ~ω in the single-mode field. Any Fock state can be generated by acting on the
vacuum state multiple times with the creation operator
â†n
√ |0i = |ni .
n!
Furthermore the number states are orthogonal
hm|ni = δmn
Coherent states
In quantum optics, the coherent states are the states with most resemblance to classical states, in the sense
that they give rise to expectation values that look like the classical electric field. These states are the
eigenstates of the annihilation operator:
â |αi = α |αi . (11.4)
Because â is non-Hermitian α is usually complex. For the creation operator ↠we have for obvious reasons
which is a superposition of infinite many number of Fock states! Furthermore calculating the expectation
value of the number operator n̂
2
n̄ = hα| n̂ |αi = |α| ,
90
2
we see that |α| is related to the mean number of photons in the field. Using this we can compute the
probability of finding n photons in the field
2n
2 |α| n̄n
2
|hn|αi| = e−|α| = en̄ ,
n! n!
which we recognize as a Poisson distribution with a mean of n̄. This distribution arises when the probability
that an event occurs is independent of earlier events.
Coherent states are known to be non-orthogonal. For example, consider the scalar product hβ|αi, where
|αi and |βi are to different coherent states
2 X X (β m )∗ αn
/2 −|α|2 /2
hβ|αi = e−|β| e √ √ hm|ni
m n m! n!
2 X (β ∗ α)n
+|α|2 )/2
= e−(|β|
n
n!
2
+|α|2 −2β ∗ α)/2
= e−(|β|
2
/2 (αβ ∗ −βα∗ )/2
= e−|α−β| e ,
taking the modulus square we get
2 2
|hβ|αi| = e−|α−β| . (11.5)
2
From Eq. (11.5) it is evident that two coherent states are non-orthogonal. Only when |α − β| is large, so
2
that |hβ|αi| ∼ 0, they become quasi-orthogonal.
Quadrature operators
It is convenient to introduce the two hermitian operators
1
q̂ = √ (â + ↠),
2
(11.6)
1
p̂ = √ (â − ↠),
2i
which satisfy the commutation relation
[q̂, p̂] = i.
These are called the quadratures of the field, and they are observables. Measurement if the quadratures
is called a homodyne measurement. These observables have a continuous spectrum, from which the name
“continuous variables",
91
with ψq (s) = q hs |ψi and with ψp (s) = p hs |ψi.
The variance of an arbitrary operator is defined by
h(∆Â)2 i = hÂ2 i − hÂi2 (11.11)
and can interpreted as the uncertainty of an observable. The expectation value of the quadratures
1
hq̂i = √ hn| (â + ↠) |ni = 0,
2
1
hp̂i = √ hn| (â − ↠) |ni = 0
2i
are evaluated to zero, which means that the expectation value of the electric field is also zero (see Eq. (A.17)
in the Appendix). On the other hand the expectation value of the square is non-zero
1 1
hn| (â2 + â↠+ ↠â + â†2 ) |ni = (1 + 2n),
hq̂ 2 i =
2 2
2 1 2 † † †2 1
hp̂ i = hn| (â + ââ + â â + â ) |ni = (1 + 2n).
2 2
Thus it follows from Eq. (11.11) that the uncertainty in both quadratures are equal, and when n = 0
(corresponding to the vacuum state), the uncertainty is minimum
1
= h(∆p̂)2 ivac ,
h(∆q̂)2 ivac = (11.12)
2
also implying the saturation of the Heisenberg uncertainty principle
1
h(∆q̂)2 ivac h(∆p̂)2 ivac =
. (11.13)
4
This is known as the vacuum fluctuations. Since coherent states are nothing else than vacuum displaced in
phase space, it can be easily verified that the quantum uncertainty for both quadratures on coherent states
is the same as for vacuum, for all amplitude α:
1
h(∆q̂)2 iα = = h(∆p̂)2 iα ,
2
also resulting in minimum-uncertainly states:
1
h(∆q̂)2 iα h(∆p̂)2 iα = . (11.14)
4
Squeezed states
In contrast to coherent states, squeezed states are characterised by asymmetric fluctuations in q and p, i.e.
h(∆q̂)2 iξ < h(∆p̂)2 iξ for a q-squeezed state, and h(∆q̂)2 iξ > h(∆p̂)2 iξ for a p-squeezed state, yet satisfying
in both cases the Heisenberg principle with equality sign Eq.(11.14). Squeezed states are obtained applying
the squeezing operator to the vacuum, i.e.
iξ(q̂ p̂+p̂q̂)
|ξi = S(ξ)|0i = e− 2 |0i.
In the limit of infinite squeezing, which corresponds to infinite energy, one obtains the infinitely squeezed
states, which are the eigenstates of position and momentum with zero eigenvalue:
|0iq infinitely q-squeezed state, such that q̂|0iq = 0|0iq (11.15)
|0ip infinitely p-squeezed state, such that p̂|0ip = 0|0ip . (11.16)
Generalization of these states exist, where the state that is squeezed in Eq.(11.1.1) is an arbitrary coherent
state.
92
Figure 11.1: Coherent, Squeezed and Thermal States. Wigner function ball-on-stick representations of (i)
vacuum state (blue), (ii) coherent state (red), (iii) thermal state (yellow, dashed line), and (iv) squeezed state
(green). This picture is taken from Ref.["Storage and manipulation of optical information using gradient
echo memory in warm vapours and cold ensembles", B. Sparkes, 2013].
returns the probability density of q, where for simplicity of calculation we have considered the case of a pure
93
state %̂ = |ψihψ|. Similarly, one can show that
Z ∞
2
Pr(p) = W (q, p)dq = |ψ(p)|
−∞
Mari-Eisert theorem
Any given CV quantum circuit is defined by (i) a specific input state, (ii) a unitary evolution and (iii)
measurements. The Mari-Eisert theorem [Mari and Eisert, 2012] states that if all these elements are de-
scribed by positive Wigner functions, then there exists a classical algorithm able to efficiently simulate
this circuit. This theorem is the analog of the Gottesman-Knill theorem seen for qubits in Chapter 1.
For a nice demonstration, see the licentiate thesis of Ingrid Strandberg (Chalmers, 2019), available at
https://research.chalmers.se/publication/513592/file/513592_Fulltext.pdf.
Hence, including a negative Wigner function element is mandatory in order to design a CV sub-universal
quantum circuit that cannot be efficiently simulated by a classical device. By virtue of the Hudson theorem,
this necessarily corresponds to the use of non-Gaussian resources. Indeed, the Hudson theorem states that
the only pure states to possess a non-negative Wigner function are Gaussian states.
Previous criteria for the simulatability of CV circuits were given in terms of the Gaussianity of a circuit:
if all elements in a CV circuits are Gaussian, then the circuit is classically efficiently simulatable [Bartlett
et al., 2002]. This criterion is strictly included in the Mari-Eisert theorem, i.e. it recognises as classically
efficiently simulatable a smaller class of CV circuits: there are indeed states and processes that are non-
Gaussian, and yet for which the Wigner function is positive. Consider for instance the mixed state of vacuum
|0i and single-photon state |1i, % = p|0ih0| + (1 − p)|1ih1| for any p > 1/2.
An even more general criterion with respect to the Mari-Eisert theorem for the simulatability of continuous-
variable circuits was given in Ref. [Rahimi-Keshari et al., 2016], based on other quasi-probability distributions
than the Wigner function.
94
eiH(p̂i ,q̂j ) consisting of a polynomial of the quadratures q̂i and p̂j in each mode i, j, to an arbitrary fixed
accuracy [Lloyd and Braunstein, 1999, Gu et al., 2009].
It is also convenient to introduce the notion of Gaussian universality. This is given by LUBOs operations,
for Linear unitary Bogoliubov transformations. They consist of any evolution operator involving a at most
quadratic polynomial in q̂ and p̂. Introducing x̂ = (q̂1 , q̂2 ...q̂N , p̂1 , p̂2 ...p̂N )T ≡ (~q, p~)T , then these operations
can be represented as
Û † x̂Û = Sx̂ + ĉ, (11.17)
with S a symplectic matrix, ĉ a displacement (see e.g. Ref. [Menicucci et al., 2011]). In the following, we are
going to list the relevant CV elementary Gaussian operations, which allows reaching Gaussian universality:
b) Quadrature displacements:
is(q̂ p̂+p̂q̂)
c) Squeezing: S(s) = e− 2
isq̂ 2
d) Shear: D2,q = e 2
Any single mode Gaussian operation can be decomposed in a) rotations; b) quadrature displacement; c) or
d), i.e. squeezing or shear. I.e,
{D1,q (s) = Z(s), D2,q (s), F = R(π/2)} universal set for single-mode Gaussian operations
1 Note that the action of a rotation of the quadratures of a single mode looks like
q̂ cos ϑ − sin ϑ q̂
R(ϑ) = , (11.18)
p̂ sin ϑ cos ϑ p̂
q̂1
not to be confused with an equivalent matrix acting on the two-mode amplitude quadratures vector , realising the
q̂2
rotation of one quadrature q̂1 with respect to another one q̂2 .
2 The corresponding rotation of the second mode quadrature is expressed by
0
q 0 −1 q −p
0 = = (11.19)
p 1 0 p q
which corresponds to the transformations F † q̂F = −p̂ and F † p̂F = q̂, i.e. F † âF = iâ and F † ↠F = −i↠.
95
Multimode Gaussian transformations
The addition of any non trivial two-mode Gaussian gate, such as the CZ interaction eigq̂×q̂ , combined
with the set of single-mode operations above allows for any general multi-mode Gaussian operation (see
Ref. [Menicucci et al., 2011] for comments on the generalization to imperfect Gaussian operations and
Gaussian measurements).
{D1,q (s) = Z(s), D2,q (s), F = R(π/2), CZ } universal set for multi-mode Gaussian operations
They are enough to perform some algorithms such as error correcting code regarding errors on single channels
(though, non-Gaussian measurements are required to correct errors such as loss on all channels simultane-
ously). Yet, all the algorithms involving only Gaussian unitaries acting on Gaussian states with Gaussian
measurements could be efficiently simulated on a classical computer, as we have seen in the previous section.
{D1,q (s) = Z(s), D2,q (s), D3,q (s), F = R(π/2)} universal set for single-mode q.c.
{D1,q (s) = Z(s), D2,q (s), D3,q (s), F = R(π/2), CZ } universal set for multi-mode q.c.
n 2 π 2 2 3
o
eiq̂s , eiq̂ s , ei 4 (q̂ +p̂ ) , eiq̂1 ⊗q̂2 , eiq̂ s universal set for multi-mode q.c.
96
11.2.1 Cluster states in Continuous Variables
Ideal cluster states are defined as follows: given a graph with N vertices and a certain number of edges
relying these vertices according to a specific structure modeled by an adjacency matrix V , a CV cluster
state is obtained starting from a collection of N infinitely p-squeezed states, and applying CZ interactions
according to the graph structure, i.e. the controlled-Z gate eigq̂×q̂ , yielding
N
Y i i T
|ψV i = ĈZ [V ]|0i⊗N
p = e 2 Vjk q̂j q̂k |0i⊗N
p = e 2 q̂ V q̂
|0i⊗N
p (11.22)
j,k
From Eqs.(11.25) and (11.24) it follows that the cluster state (11.22) is stabilized by the set
† i T i T
Ki = ĈZ [V ]Xi (s)ĈZ [V ] = e 2 q̂ V q̂ Xi (s)e− 2 q̂ V q̂
YY i i
= e 2 Vj,k q̂j q̂k e−isp̂i e− 2 Vl,m q̂l q̂m (11.26)
j,k l,m
for each i. Using that eiq̂1 q̂2 p̂1 e−iq̂1 q̂2 = p̂1 − q̂2 , we finally obtain that
Y Y
Ki = e−isp̂i Vi,k eisq̂k = Xi (s) Vi,k Zk (s). (11.27)
k k
Eq.(11.27) is formally equivalent to its analog in the discrete variable case (see Eq.(20.11) in Ref.[D. Druss
and G. Leuchs, “Lectures on Quantum Information", Wiley-VCH (2007)]). The Ki form a group. The
generators of the corresponding algebra are found by derivation since Ki = e−isHi . Note that because of
Eq.(11.23) it follows that Hi |ψV i = 0, i.e. ∀i. Hence the H operators are called ”nullifiers" for the state
|ψV i. They can be easily calculated as
" #
dKi d −isp̂i
Y
isq̂k
Hi = i =i e Vi,k e
ds (s=0) ds
k (s=0)
" #
−isp̂i d
Y Y
−isp̂i isq̂k isq̂k
= i −ip̂i e Vi,k e +e Vi,k e
ds
k k (s=0)
" #
X Y X
= p̂i + i e−isp̂i Vi,k iq̂k eisq̂l = p̂i − Vi,k q̂k (11.28)
k l (s=0) k
97
From Eq.(11.29) follows immediately that
!
X
hψV |∆2 p̂i − Vi,k q̂k |ψV i = 0, (11.30)
k
P 2
which for states with zero average also reads hψV | (p̂i − k Vi,k q̂k ) |ψV i = 0.
Figure 11.2
R
• 1) Preparation: One mode contains the initial state that we want to process |ϕi = dsf (s)|siq ; the
R
other mode is initialized to |0ip . The input state is hence |ϕi ⊗ |0ip = dsf (s)|siq |0ip . Apply a CZ
gate between the two, obtaining
Z Z Z
iq̂⊗q̂ isq̂2
CZ (|ϕi ⊗ |0ip ) = dsf (s)e |siq |0ip = dsf (s)e |siq |0ip = dsf (s)|siq |sip
1
Z Z
= √ dsf (s) dre−irs |rip |sip , (11.31)
2π
• 2-pre) Measure: Measure the input mode in the p̂ basis with outcome m: this projects the second
qumode into the state
Z Z
|ψiout ∝ dsf (s)e−ims |sip = e−imp̂ dsf (s)|sip = X(m)F |ϕi. (11.32)
R R
The last equality is obtained since F |ϕi = dsf (s)F |siq = dsf (s)|sip . The effect of this circuit is
to apply the identity (modulo a displacement and a rotation).
• 2) If we send as an input state the rotated state Dq̂ |ϕi = dsf (s)Dq̂ |siq where Dq̂ = eif (q̂) is an
R
operator diagonal in the computational basis, then measuring p̂ in the first qumode projects the second
mode into
Since the CZ gate commutes with Dq̂ , the same result is obtained with |ϕi as an input state, and
a rotation on the first mode after the CZ as in the left panel of Fig.11.2. This is in turn equivalent
to the situation in which no rotation is applied, but the first mode is measured in a rotated basis
Dq̂† p̂Dq̂ ≡ p̂f (q̂) . The extra displacement X(m) depends on the outcome of the measurement on mode
1, and can be compensated by choosing the measurement basis of the following steps (thus introducing
in general time-ordering).
98
• 3) Universality of single mode operations: Repeat twice the previous protocol, for two different
operators Dq̂1 and Dq̂2 . The output state is:
|ψiout = X(m2 )F (Dq̂2 X(m1 ))F Dq̂1 |ϕi
2
= X(m2 )F X(m1 )Dq̂+m 1
F Dq̂1 |ϕi
2 1
= X(m2 )F X(m1 )F D− p̂+m1 Dq̂ |ϕi
(11.34)
where we have used the inequalities X(−m)q̂X(m) = q̂ + m, Z(−m)p̂Z(m) = p̂ + m, F † (−q̂)F = p̂,
F † p̂F = q̂. If instead of measuring the second mode on p̂f (q̂) I would have measured it in the outcome-
dependent basis p̂f (−q̂−m1 ) I would have obtained as a result my deterministic desired output
|ψiout = X(m2 )F X(m1 )F Dp̂2 Dq̂1 |ϕi (11.35)
(universal for single-mode operations if I repeat other times: I can obtain the desired transforma-
tion concatenating various Dq̂ and Dp̂ ), a part from by-product operations which do not need to be
corrected.
• 3)-4) Triviality of measurement adaptivity for Gaussian unitaries Let us focus on the building
blocks of the universal set given above. For the Gaussian operations:
– F is obtained at each step of the computation.
– Dq̂ = eisq̂ is obtained by measuring p̂sq̂ = e−isq̂ p̂eisq̂ = p̂ + s (measure p̂ and add s to the result).
Note that p̂s(q̂+m) = p̂sq̂ = p̂ + s (no adaptation is required).
q̂ 2 q̂ 2 q̂ 2
– Dq̂ = eis 2 is obtained
√ by measuring in the basis p̂sq̂2 /2 = e−is 2 p̂eis 2 = p̂ + sq̂ = g(q̂ sin ϑ +
p̂ cos ϑ) with g = 1 + √ s2 and ϑ = arctan s. This
√ is readily verified because the latter defini-
tion implies cos ϑ = 1/ 1 + s2 and sin ϑ = s/ 1 + s2 . This corresponds to a rotated homo-
dyne quadrature. Note that if I would have to adapt the basis I should measure according to
p̂s(q̂+m)2 /2 = p̂ + sq̂ + ms. This can be achieved by measuring in the same basis as without
adaptation (i.e. p̂ + sq̂) and adding ms to the result
The adaptation required for these measurements is trivial and can be done after (as a post-processing).
Hence Gaussian operations can be implemented in any order or simultaneously ("parallelism").
A cubic phase gate would require instead
q̂ 3 q̂ 3 q̂ 3
– Dq̂ = eis 3 is obtained by measuring in the basis p̂sq̂3 /3 = e−is 3 p̂eis 3 = p̂ + sq̂ 2 . If I have to
measure according to p̂s(q̂+m)3 /3 = p̂ + sq̂ 2 + 2msq̂ + m2 s, the term 2msq̂ requires a non-trivial
adaptation of the measurement basis.
• 5) Cluster states as a resource: Given the fact that the CZ gates commute with the measurements,
in practice the state used as initial resource in the quantum computation protocol presented is a
generalized cluster state in which some of the modes (the input modes), also linked to the other nodes
of the cluster, are initialized to code the modes of the input state. However, one can think of taking
an initial cluster state (e.g, a square cluster state) and "writing" in some of its nodes the modes of the
physical input state (e.g. by CZ gates and measure of the input modes, or by teleportation [Ukai et al.,
2010]). A state which allows this for each U and each input state is said to be a universal resource.
It has been demonstrated by Briegel that a square lattice graph (a cluster state) with unit weights is
a universal resource for quantum computation. Depending on the specific kind of computation, other
graphs than a square lattice could be more suitable for implementing the computation [Horodecki et al.,
2006].
99
• 6) Two mode interactions: A sequence of single mode operations can be implemented via following
measurements on a linear cluster. To achieve full universality we have to add to the previous toolbox
a two-mode interaction, e.g. the CZ gate. Such two-qubit gates, e.g., the CZ and CNOT gates, can
be constructed in a two-dimensional cluster state where two input qubits are entangled with a few
other qubits, in analogy to the case of qubit MBQC discussed in Sec.6.2. By a series of single-qubit
measurements and rotations, we can end up with two of the other qubits representing the output state
corresponding to the two-qubit gate having acted on the input state.
In conclusions, note that the procedure above is an idealization: in real life, squeezed states will always
have finite energy, i.e. squeezing degree. As a result, the state output of the computation - even in the
presence of ideal entangling gates and measurements - will always be affected by Gaussian noise, caused
by the finite squeezing. How to avoid accumulation of this (and other types of) noise is the object of the
following section.
100
11.3 GKP encoding and Error Correction
In classical informatics, when it comes to make sure that the errors that can occur during a computation
can be corrected, it is convenient to resort to digitalized information, i.e. bits. For this reason combined
with versatility, analog computers have been outperformed by digital computers in the 50s-60s, when the
latter became sufficiently performant. Also note that from a computer science perspective, the definition of
computational models based on real numbers is problematic and less studied3 .
Analogously, with quantum information, if the goal is to achieve fault-tolerant quantum computation, we
must resort to qubit-like quantum information even when using continuous-variable hardware. An example
of qubit-like quantum information encoding in CV is based on the use of cat states, where the qubit-
like information is encoded in codewords |0iL = |αi + | − αi and |1iL = |iαi + | − iαi, where we have
omitted normalization constants. This encoding has allowed demonstrating a break-even point, in the sense
that quantum information encoded in such cat states has been living longer than the one encoded in the
corresponding qubit (within a transmon architecture) [Ofek et al., 2016].
In Ref. [Gottesman et al., 2001], another way of encoding qubits in quantized harmonic oscillators was
introduced by Gottesman, Kitaev and Preskill, yielding the GKP encoding. This encoding has been shown to
allow for the correction of arbitrary type of noise, while cat codes are specialized to correct for single-photon
losses. Essentially, and without attempting to be rigorous, this is because GKP codes allows one to correct for
single-mode displacements, and any noise-map can be decomposed in single-mode displacements [Gottesman
et al., 2001]. GKP state have been generated and encoded experimentally both in superconducting microwave
cavities [Campagne-Ibarcq et al., 2020] and with trapped ions [Flühmann et al., 2018].
In what follows, we are going to introduce the GKP encoding and the corresponding error-correcting
scheme in detail.
Realistic logical qubit states are normalizable finitely squeezed states, rather than non-normalizable infinitely
squeezed states. The Dirac peaks are hence replaced by a normalized Gaussian of width ∆, while the infinite
sum itself will become a Gaussian envelope function of width δ −1 (see Figure 11.3). Overall, the realistic
states wavefunctions read:
√
(2n)2 πδ 2 (q − 2n π)2
Z X
−iup̂ −iv q̂
hq |0̃L i = dudvG(u)F (v)e e hq |0iL = N0 exp − exp − ,
n
2 2∆2
√
(2n + 1)2 πδ 2 (q − (2n + 1) π)2
Z X
hq |1̃L i = dudvG(u)F (v)e−iup̂ e−ivq̂ hq |1iL = N1 exp − exp − ,
n
2 2∆2
(11.37)
3 If real computation were physically realizable, one could use it to solve NP-complete problems, and even #P-complete
problems, in polynomial time. Unlimited precision real numbers in the physical universe are prohibited by the holographic
principle and the Bekenstein bound.
101
where we have introduced the noise distributions
1 u2 1 v2
G(u) = √ e− 2∆2 ; F (v) = √ e− 2δ2 , (11.38)
∆ 2π δ 2π
and N0 and N1 are normalization constants.
Figure 11.3: Wavefunction in position representation of GKP |0̃L i state in continuous blue (|0̃L i in dashed
red) with δ = ∆ = 0.25 from Equation Eq. (11.37)
.
These are all implemented by Gaussian CV operations, as introduced in Sec.11.1.2. To promote this set of
operations to universality we need a non-Clifford gate. This requires instead a non-Gaussian operation:
3 2
2q̂
iπ
4
√
π
+ √q̂π − √ 2q̂
π
T̄ = e . (11.40)
102
In order to explain this EC procedure, we follow a toy-model approach that has been developed by
Glancy and Knill [Glancy and Knill, 2006]. This approach is based on a decomposition of the noise in
several realizations of displacements, resulting in blurred ideal GKP states. Within this approach, we are
going to show explicitly how GKP EC allows one to correct for displacements, by analyzing the output of
the circuit in Fig. 11.5 with merely displaced perfect GKP states at the input, Sections 11.3.2 and 11.3.3.
Since displacements form an operator basis, it follows that GKP states can correct any type of noise. Note
that this works in principle for arbitrary noise, even when this is non Gaussian.
This approach is exact in the infinite squeezing regime. For physical states, i.e. with finite squeezing
there are subtleties. Some of these are discussed in a recent paper by Barbara Terhal’s group, that discusses
the difference between error correction with coherent-enveloped GKP states versus blurred ideal GKP states.
These subtle differences are not completely captured by the Glancy-Knill picture, but for pedagogical reasons
we will omit further details of this discussion.
√
|ψ̃i • X(−s mod π)
|0̃L i • p̂ s
Figure 11.4: Procedure to correct for errors in the p̂ quadrature. |0̃L i is a noisy GKP state and |ψ̃i is a noisy
GKP-encoded CV state. X(m) is a displacement operator e−imp̂ .
Figure 11.5: Modeling the noise in the protocol. |0L i is a perfect – unphysical – GKP state and |ψi is a
perfect GKP-encoded CV state.
associated with the distinct measurement outcome s, where |sip are eigenstate of the p̂ operator. In the
following, we are going to omit the subscript when this does not cause confusion. In the following and unless
specified all integrals will run over the whole real axis.
R
More specifically, we’re interested in the observable I ⊗ p̂ where I = dq1 |q1 i hq1 | is the identity operator,
acting on mode 1 which is unmeasured. From the initial unphysical and perfect input states, a set of
displacements in position and momentum are applied before the ĈZ gate and the measurement. Thus we
want to work out the following quantity, in a sort of Heisenberg representation fashion:
eiv2 q̂2 eiu2 p̂2 eiv1 q̂1 eiu1 p̂1 ĈZ† (I ⊗ p̂)ĈZ e−iu1 p̂1 e−iv1 q̂1 e−iu2 p̂2 e−iv2 q̂2 (11.42)
103
Step by step we have:
Z
ĈZ† (I ⊗ p̂)ĈZ = ĈZ† dq1 dp2 p2 |q1 , p2 i hq1 , p2 | ĈZ
Z
= dq1 dp2 p2 |q1 , p2 − q1 i hq1 , p2 − q1 | (11.43)
Then:
eiv2 q̂2 eiu2 p̂2 eiv1 q̂1 eiu1 p̂1 ĈZ† (I ⊗ p̂)ĈZ e−iu1 p̂1 e−iv1 q̂1 e−iu2 p̂2 e−iv2 q̂2
Z
=e iv2 q̂2 iu2 p̂2 iv1 q̂1 iu1 p̂1
e e e dq1 dp2 p2 |q1 , p2 − q1 i hq1 , p2 − q1 | e−iu1 p̂1 e−iv1 q̂1 e−iu2 p̂2 e−iv2 q̂2
Z
= dq1 dp2 p2 |q1 − u1 , p2 − q1 + v2 i hq1 − u1 , p2 − q1 + v2 |
Z
= dq1 dp2 p2 |q1 , p2 − q1 − u1 + v2 i hq1 , p2 − q1 − u1 + v2 |
Z
= dq1 dp2 (q1 + p2 + u1 − v2 ) |q1 , p2 i hq1 , p2 | = q̂1 + p̂2 + u1 − v2 (11.44)
q1 − u1 = q10 → q1 = q10 + u1
p2 − q1 + v2 = p02 → p2 = p02 + q10 + u1 − v2
and we have renamed q10 → q1 and p02 → p2 . So what kind of measurement results are we to expect? To
figure it out we should express Eq. (11.44) in a different manner, namely:
eiv2 q̂2 eiu2 p̂2 eiv1 q̂1 eiu1 p̂1 ĈZ† (I ⊗ p̂)ĈZ e−iu1 p̂1 e−iv1 q̂1 e−iu2 p̂2 e−iv2 q̂2
Z
= dq1 dp2 (p2 + q1 + u1 − v2 ) |q1 , p2 i hq1 , p2 | . (11.45)
We recall that the quantum state that will be measured is |ψL , 0L i so a product of perfect GKP states. On
√
these states, a quadrature projection can only give rise to integer multiples of π. So the sum q1 + p2 will
√ √
be a multiple of π, say n π. Specifically we get
√
s = n π + u1 − v2 . (11.46)
The outcome of the measurement is a value corresponding to the noise of the data qubit, blurred by the
noise coming from the ancilla.
√
The error threshold is defined in relation with the following displacement X(−s mod π). The modulo
√ √
function has range [− π/2, π/2], and the procedure succeeds if u1 − v2 is small. Otherwise a logical
error occurs and the displacement acts as Pauli-X gate in terms of the GKP encoding. Mathematically the
constraint reads:
√
|u1 − v2 | ≤ π/2. (11.47)
The remaining logical errors can be taken care of by concatenating the GKP code with qubit codes and
quantum error correction. A limit on the tolerated logical error probability depending on the chosen qubit
code translates then into a constraint on the squeezing parameters [Menicucci, 2014].
104
11.3.3 Single noise realization: Output state of the GKP error-correcting gad-
get
We now compute the output state of the circuit Fig. 11.5. We show that the noise in the p̂ quadrature of
the logical qubit is replaced by the one given by the ancilla |0̃L i. Given Eq.(11.41) and standard quantum
measurement theory [Nielsen and Chuang, 2011] after that the outcome s is obtained, the state is projected
onto
|ψs i ∝ P̂s |ψi12 (11.48)
where |ψi12 is the state of input and ancilla after displacement and ĈZ gate, i.e.
|ψi12 = ĈZ e−iu1 p̂1 e−iv1 q̂1 e−iu2 p̂2 e−iv2 q̂2 |ψL , 0L i , (11.49)
R
and where again the identity I = dq1 |q1 i hq1 | is implicitly acted in mode 1, and the projector P̂s defined
in Eq.(11.41) acts on mode 2. We therefore now compute explicitly,
Z
P̂s |ψ12 i = dq1 |q1 i hq1 | ⊗ |sihs|ĈZ e−iu1 p̂1 e−iv1 q̂1 e−iu2 p̂2 e−iv2 q̂2 |ψL , 0L i ≡ |Φi |si (11.50)
so we have:
√ XZ √
|Φi = e−iu2 (n π−v2 )
dq1 δ(q1 − l π)e−i(v1 −u2 )q1 |q1 + u1 i hq1 |ψL i
l∈Z
√
−iu2 (n π−v2 )
X √ √ √
=e e−i(v1 −u2 )l π
|l π + u1 i hl π|ψL i. (11.54)
l∈Z
√
Since (the position wavefunction of) |ψL i is made of Dirac pikes on integer multiples of π, the following
P √ √
result is straightforward: using that l∈Z |l πi hl π|ψL i = |ψL i we obtain the final state before the
displacement:
√
|Φi = e−iu2 (n π−v2 ) −iu1 p̂1 −i(v1 −u2 )q̂1
e e |ψL i (11.55)
105
and from Eq.(11.50)
√
P̂s |ψ12 i = |Φi |u1 + n π − v2 ip . (11.56)
The conditional state on mode 2 is obtained as:
with p0 = u1 − v2 . So we have the equation of the output state on a single realization of the noise. Like
in [Glancy and Knill, 2006], we can see that the remaining p̂1 displacement is given by v2 and is independent
of the original noise u1 . For q̂1 though the noise from the ancilla u2 has been added to the original value v1 .
It appears more clearly if we write Eq. (11.60) as in from [Glancy and Knill, 2006]. Then the output state
would read – for an intermediate measurement result s:
106
11.4 Sampling models and sub-universal models in Continuous Vari-
ables
In the previous Section, we have dealt with MBQC, which is a model for (universal) quantum computation.
In this Section, we turn to sub-universal computational models in CV. In Sec. 11.1.1 we have seen that
Wigner negativity is necessary in order to obtain quantum advantage - at least of the exponential type.
However, if one aims at minimal extensions of Gaussian models, the Boson Sampling model that we
have seen in Section 10.4 appears as potentially “over-kill": both the input state and the measurement are
described by negative Wigner functions, as they correspond respectively to single-photon states and photon
counting measurement. Can we define other sub-universal model of quantum computation where only one of
these elements is non-Gaussian, and show that they yield hard-to-sample output probability distributions?
Three different families of non trivial sub-universal quantum circuits can be defined, depending on whether
the element yielding the Wigner function negativity is provided by the input state, the unitary evolution,
or the measurement. This concept is exemplified in Fig.11.6. Although Wigner negativity allows stepping
outside the range of applicability of the theorem in Ref. [Mari and Eisert, 2012], it is by itself not sufficient to
imply classical hardness [García-Álvarez et al., 2020]. The classical hardness of these circuits, therefore, has
yet to be proven, for each circuit type. For of the latter kind, corresponding to Gaussian Boson Sampling
(GBS), this was done in Refs. [Lund et al., 2014, Hamilton et al., 2017]. These circuits are composed of
input squeezed states, passive linear optics evolution, and photon counters. Circuits of the second kind
are for instance related to the CV implementation of Instantaneous Quantum Computing – another sub-
universal model, where input states and measurements are Gaussian, while the evolution contains non-
Gaussian gates [Douce et al., 2017, Douce et al., 2019]. First definitions of the former class of CV circuits,
i.e. that display non-Gaussian input state and Gaussian operations and measurements, have been very
recently considered [Chakhmakhchyan and Cerf, 2017, Chabaud et al., 2017].
In this Chapter we are going to introduce the model CV-IQP (the continuous-variable version of the IQP
model introduced in Sec.10.2) and to sketch the proof of its computational hardness.
All the gates in this model are diagonal in the position representation. As such, the Fourier transform is
absent. Therefore, this model is not universal. For pedagogical purposes in these notes, we can assume in
the definition that we are able to implement all the gates corresponding to all the possible choices of the
real parameters in Eq.(11.62). However, this has shown to be un-necessary [Douce et al., 2019]. We require
momentum squeezed states |σi with σ < 1 to be present at the input. This model is a simpler version
than the one introduced in Ref. [Douce et al., 2017]. Note the reminiscence with the IQP model defined
in Sec.10.2. Here, too, we have a “crossed" structure of the type: p − q − p in input state, evolution and
measurement, as it was x − z − x for the discrete-variable model.
For the proof of hardness, we proceed in the same way as seen in Chapter 10, when dwelling with the
IQP model: we show that adding post-selection, the model becomes universal. In analogy to the Hadamard
gadget of the discrete-variable model, we need therefore to develop a Fourier gadget.
107
Figure 11.6: Families of quantum circuits in continuous-variable displaying minimal non-Gaussian character.
Circuits with photon-subtracted squeezed states, linear evolution and heterodyne detection (instead than
homodyne as represented in the leftermost sketch) have been shown to be hard to classically sample in
Ref. [Chabaud et al., 2017]).
In the idealised case of infinite squeezing input states, this can be obtained fairly easily with the gadget
of Fig.11.8. Adding post-selection allows to recover the Fourier transform, which completes the set of gates
in the model.
However, a realistic model departs from this idealized situation in at least two unavoidable aspects.
First, in order to obtain a non-zero probability of success for the homodyne measurement, thereby
giving meaning to post-selection, we must introduce a binning of the real axis. This can still be modeled by a
projective measurement: instead of the ideal homodyne detector $\hat{p} = \int dp\, p\, |p\rangle\langle p|$,
we are going to consider the finitely-resolved homodyne detector $\hat{p}_\eta$, which we define as [Paris et al., 2003]
$$
\hat{p}_\eta = \sum_{k=-\infty}^{\infty} p_k \int_{-\infty}^{\infty} dp\, \chi_k^{\eta}(p)\, |p\rangle\langle p| \equiv \sum_{k=-\infty}^{\infty} p_k \hat{P}_k \qquad (11.63)
$$
with $\chi_k^{\eta}(p) = 1$ for $p \in [p_k - \eta, p_k + \eta]$ and 0 outside, $p_k = 2\eta k$, and $2\eta$ the resolution, associated with
the width of the detector pixels⁴. It is easy to check that this is still a projective measurement, since
$\sum_{k=-\infty}^{\infty} \hat{P}_k = \mathbb{I}$ and $\hat{P}_k \hat{P}_{k'} = \hat{P}_k \delta_{k,k'}$. Note that this model is distinct from modeling imperfect
detection efficiency [Leonhardt, 1997, Leonhardt and Paul, 1993, Paris et al., 2003]. Using this operator for
the measurements introduces errors in the Fourier transform applied with the Fourier gadget; in particular,
the output state is a mixed state. Second, realistic input states have finite squeezing. Below, we are going
to address the effect of these sources of noise, as well as how to deal with them.
4 Note that this model turns out to be equivalent to an ideal scheme with perfectly resolving homodyne detectors and a
Figure 11.7: IQP circuit in CVs. |σ⟩ are finitely squeezed states with variance σ in the p̂ representation. The
gate f′(q̂) is a uniform combination of elementary gates from the set in Eq. (11.62). The finitely-resolved
homodyne measurement p̂_η has resolution 2η.
Figure 11.8: Left: Hadamard gadget in a post-selected IQP circuit, where h takes the value 0 if +1 is measured,
while h = 1 if the result is −1. Right: Ideal Fourier gadget in CVs, exact translation of the Hadamard
gadget. |0⟩_p represents an infinitely p̂-squeezed state with σ = 0, thus satisfying p̂|0⟩_p = 0.
Output state We compute the output state of the realistic Fourier-transform gate implementation. The
circuit is reproduced in Fig. 11.9. By convention, the first (resp. second) ket in the tensor product will
refer to the upper (resp. lower) arm.
Figure 11.9: Realistic Fourier gadget: the input |ψ⟩ (upper arm) and the finitely squeezed ancilla |σ⟩_p (lower
arm) are coupled, the upper arm is measured with the finitely-resolved homodyne detector p̂_η, and the lower
arm carries the output |ψ_out⟩.
We measure on the upper arm the finitely resolved p̂_η operator defined in Eq. (11.63). When
obtaining an outcome p_k, the measurement yields the conditional state on the lower arm
$$
\hat{\varrho}_{k,\mathrm{cond}} = \mathrm{Tr}_1\!\left[ \left(\hat{P}_k \otimes \mathbb{I}_2\right) |\psi_{1,2}\rangle\langle\psi_{1,2}| \left(\hat{P}_k \otimes \mathbb{I}_2\right) \right]
= \int_{p_k-\eta}^{p_k+\eta} ds\, {}_{p1}\!\langle s|\psi_{1,2}\rangle\langle\psi_{1,2}|s\rangle_{p1}
$$
$$
= \frac{\eta}{\pi^{3/2}\sigma} \int dq\, dt\, dq'\, dt'\, e^{-\frac{(t-q)^2}{2\sigma^2}}\, e^{-\frac{(t'-q')^2}{2\sigma^2}}\, \psi(q)\psi^*(q')\, \mathrm{sinc}\!\left(\eta(q-q')\right) e^{i p_k (q-q')}\, |t\rangle_p\langle t'|_p \qquad (11.66)
$$
where we have used
$$
\int_{-\eta}^{\eta} ds\, e^{i s (q-q')} = 2\eta\, \mathrm{sinc}\!\left(\eta(q-q')\right). \qquad (11.67)
$$
We remark that the same expression as in Eq.(11.66) is obtained if the homodyne detectors are perfectly
resolved, and a discretization is performed after measurement by binning the measurement outcomes.
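As a quick sanity check of Eq. (11.67), one can evaluate the left-hand side by numerical quadrature. The following minimal Python sketch (SciPy; the values of η and of the argument are arbitrary test values, and sinc denotes the unnormalized sin(y)/y) compares both sides:

```python
import numpy as np
from scipy.integrate import quad

# Check: int_{-eta}^{eta} ds exp(i s x) = 2 eta sinc(eta x), with sinc(y) = sin(y)/y.
eta, x = 0.3, 1.7  # arbitrary test values (assumption)

re, _ = quad(lambda s: np.cos(s * x), -eta, eta)   # real part of the integral
im, _ = quad(lambda s: np.sin(s * x), -eta, eta)   # imaginary part (vanishes by symmetry)

lhs = re + 1j * im
rhs = 2 * eta * np.sin(eta * x) / (eta * x)
print(lhs, "vs", rhs)
```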
This state then has to be normalized by the probability of obtaining the outcome corresponding to the
projection operator above. What really matters to us is $\hat{\varrho}_{k=0,\mathrm{cond}}$, corresponding to the outcome $p_k = 0$,
because it is the particular post-selected state that corresponds to the implementation of the Fourier
transform. For this specific outcome we have
$$
\hat{\varrho}_{k=0,\mathrm{cond}} = \frac{\eta}{\pi^{3/2}\sigma} \int dq\, dt\, dq'\, dt'\, e^{-\frac{(t-q)^2}{2\sigma^2}}\, e^{-\frac{(t'-q')^2}{2\sigma^2}}\, \psi(q)\psi^*(q')\, \mathrm{sinc}\!\left(\eta(q-q')\right) |t\rangle_p\langle t'|_p. \qquad (11.68)
$$
Notice that in the limit of perfect resolution η → 0 (upon normalization) we recover the state that would
be obtained in an MBQC implementation of the Fourier transform with a finitely squeezed ancillary state.
As can be seen in Eq. (11.68), finite squeezing amounts to convolving the state with a Gaussian in the momentum
representation, or equivalently to multiplying it by a Gaussian in the position representation.
The probability of the post-selected outcome is computed in the state after the $\hat{C}_Z$ gate, that is (see Eq. (11.65))
$$
|\psi_{1,2}\rangle = \frac{1}{\pi^{1/4}\sqrt{\sigma}} \int dq\, dt\, e^{-\frac{(t-q)^2}{2\sigma^2}}\, \psi(q)\, |q\rangle_q |t\rangle_p.
$$
The calculation yields the post-selection probability Prob[k = 0], Eq. (11.70). From the second to the third
line of that calculation we used
$$
\int_{-\infty}^{+\infty} dt\, e^{-\frac{(t-q)^2}{2\sigma^2}}\, e^{-\frac{(t-q')^2}{2\sigma^2}} = \sqrt{\pi}\,\sigma\, e^{-\frac{(q-q')^2}{4\sigma^2}}, \qquad (11.71)
$$
while in the last step we have used Eq. (11.67). The probability can be Taylor expanded in powers of η:
$$
\mathrm{Prob}[k=0] = \frac{2\eta\sigma}{\sqrt{\pi}}\left( \int dq\, dq'\, \frac{1}{2\sigma\sqrt{\pi}}\, e^{-\frac{(q-q')^2}{4\sigma^2}}\, \psi^*(q')\psi(q) + O(\eta^2) \right). \qquad (11.72)
$$
The first term in the parenthesis is precisely the norm ⟨ψ_{1,2}|ψ_{1,2}⟩ and is hence equal to 1. Consequently the
probability reads
$$
\mathrm{Prob}[k=0] = \frac{2\eta\sigma}{\sqrt{\pi}} + O(\eta^3). \qquad (11.73)
$$
The dominant order is thus proportional to the resolution 2η.
Large squeezing limit We note that Gaussian distributions obey the following relation:
$\frac{1}{2\sqrt{\pi}\sigma}\, e^{-\frac{(q-q')^2}{4\sigma^2}} \xrightarrow{\sigma\to 0} \delta(q-q')$.
Based on this property, the integrals in Eq. (11.70) actually yield
$$
\int dq\, dq'\, \frac{1}{2\sigma\sqrt{\pi}}\, e^{-\frac{(q-q')^2}{4\sigma^2}}\, \psi^*(q')\psi(q)\, \mathrm{sinc}\!\left(\eta(q-q')\right) \underset{\sigma\to 0}{\sim} 1,
$$
$$
\int dq\, dq'\, \frac{1}{2\sigma\sqrt{\pi}}\, e^{-\frac{(q-q')^2}{4\sigma^2}}\, \psi(q)\psi^*(q') \underset{\sigma\to 0}{\sim} 1. \qquad (11.74)
$$
Thus the probability of obtaining the outcome $p_k = 0$ becomes dominated by the pure-state contribution,
and is determined by the expression
$$
\mathrm{Prob}[k=0] \underset{\sigma\to 0}{\sim} \mathrm{Prob}^{(1)}[k=0] \underset{\sigma\to 0}{\sim} \frac{2\eta\sigma}{\sqrt{\pi}}. \qquad (11.75)
$$
We notice that this probability is given as a function of the squeezed-state variance σ. Eq. (11.75) ensures
that the post-selection probability is non-zero, a necessary requirement to define post-selection properly. This probability
also needs to satisfy
$$
\mathrm{Prob}[k=0] \gtrsim \frac{1}{2^n}. \qquad (11.76)
$$
This exponentially low probability is still compatible with the definition of the post-selected class, as explained
in Sec. 2.2.3.
plus the Hadamard gate, which in the GKP encoding corresponds to the Fourier transform and is hence obtained
with post-selection:
$$
\bar{H} = F. \qquad (11.78)
$$
Therefore, any post-selected BQP computation can be simulated with a post-selected instance of CV-IQP circuits,
i.e. Post-CVIQP = Post-BQP. In other words, for every computation in Post-BQP we can find a CV-IQP
circuit that, with post-selection occurring with non-zero probability (and consistent with the definition
of the Post-BQP class), simulates that computation, despite the finite input squeezing and finite resolution.
However, it has later been shown in Ref. [Douce et al., 2019] that the circuit family is hard to sample
even without input GKP states. The trick is to show that the generation of GKP states can be subsumed into the
gates of the circuit itself. The technicalities of that work are beyond the scope of this lecture.
Figure 11.10: Left: Sketch of a cavity that confines photons between two mirrors. Right: Sketch of a
superconducting resonator where microwave photons are confined between two capacitors that act as semi-
transparent mirrors. Inspiration taken from Ida-Maria Svensson PhD Thesis (Chalmers, 2018).
By connecting two Josephson junctions in a superconducting loop we get what is called a Superconducting
QUantum Interference Device (SQUID). SQUIDs are very sensitive to magnetic fields and are therefore
mainly used as magnetometers for various purposes. However the SQUID can also be used as a two-photon
pump if a time-varying magnetic flux is threaded through the superconducting loop at twice the resonator
frequency. The SQUID will be an important component in the physical implementation of the continuous
variable quantum annealer.
where ω_r denotes the mode frequency of the resonator, K the amplitude of the Kerr nonlinearity, and
G the amplitude of the two-photon pump; the two-photon pump frequency is set to twice the
resonator frequency. The time dependence of the Hamiltonian can be removed by transforming to a frame
rotating at the resonator frequency⁶. Indeed, by performing the unitary transformation $\hat{U}(t) = e^{i\omega_r t\, \hat{a}^\dagger\hat{a}}$,
⁶ From here on we will use natural units where ℏ = c = 1.
Figure 11.11: Schematic illustration of a two-photon pumped KNR: it consists of a SQUID sandwiched
between two quarter wavelength (λ/4) resonators. The SQUID consists of two Josephson junctions connected
in a superconducting loop. The Josephson junctions are drawn as square boxes with a cross in the figure.
The loop is furthermore penetrated by a magnetic flux Φext and the ends of the resonators are capacitively
coupled to a transmission line.
we get
$$
\hat{H} = -K\hat{a}^{\dagger 2}\hat{a}^2 + G\left(\hat{a}^{\dagger 2} + \hat{a}^2\right)
= -K\left(\hat{a}^{\dagger 2} - \frac{G}{K}\right)\left(\hat{a}^{2} - \frac{G}{K}\right) + \frac{G^2}{K}. \qquad (11.79)
$$
From the last expression it is evident that the two coherent states $|\pm\alpha\rangle$ with $\alpha = \sqrt{G/K}$ are two
degenerate eigenstates of this Hamiltonian, with eigenenergy
$$
\hat{H}\,|\pm\alpha\rangle = \frac{G^2}{K}\,|\pm\alpha\rangle.
$$
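As an illustration (not part of the original notes), this degeneracy can be checked numerically. The sketch below, assuming QuTiP and a finite Fock-space truncation, verifies that both coherent states are eigenstates of Eq. (11.79) with eigenvalue G²/K up to truncation error:

```python
import numpy as np
import qutip as qt

# Minimal sketch: |+/-alpha> with alpha = sqrt(G/K) should satisfy H|+/-alpha> = (G^2/K)|+/-alpha>.
N, K, G = 40, 1.0, 3.0       # Fock truncation and parameters (assumptions)
alpha = np.sqrt(G / K)

a = qt.destroy(N)
H = -K * a.dag()**2 * a**2 + G * (a.dag()**2 + a**2)

for amp in (alpha, -alpha):
    psi = qt.coherent(N, amp)
    residual = (H * psi - (G**2 / K) * psi).norm()   # should be ~0 up to truncation error
    print("alpha =", amp, " residual norm:", residual)
```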
Following Puri et al. [Puri et al., 2017] we now take advantage of this well-defined two-state subspace
and choose to encode the logical spins $|\bar{0}\rangle$ and $|\bar{1}\rangle$ onto these coherent states, i.e. we perform the mapping
$\{|\bar{0}\rangle, |\bar{1}\rangle\} \to \{|-\sqrt{G/K}\rangle, |\sqrt{G/K}\rangle\}$, where the bar is used to distinguish the logical spin states from the
vacuum and single-photon Fock states. For sufficiently large $|\alpha|$ the states can be considered orthogonal;
indeed, following Eq. (11.5) we have that
$$
|\langle\bar{1}|\bar{0}\rangle|^2 = e^{-4|\alpha|^2}.
$$
For instance, if $|\alpha| = \sqrt{3}$, then $|\langle\bar{1}|\bar{0}\rangle|^2 \approx 10^{-6}$. Remarkably, these “code words” are quite stable even in the
presence of noise induced by single-photon losses from the resonator (which we will consider as the main loss
mechanism). Loosely speaking, this is due to the fact that the coherent states are eigenstates of the loss
operator â.
In order to corroborate this fact, we start by showing that single-photon loss does not lead to spin-flip
errors, i.e. a qubit flip such as $|\bar{0}\rangle \to |\bar{1}\rangle$. Consider the quantity $|\langle\bar{1}|\hat{a}|\bar{0}\rangle|^2$, which describes the overlap
between the two logical spin states $|\bar{0}\rangle$ and $|\bar{1}\rangle$ after a single photon has been lost from the resonator. Using
Eq. (11.4) and Eq. (11.5) we have that
$$
|\langle\bar{1}|\hat{a}|\bar{0}\rangle|^2 = |\alpha|^2\, |\langle\bar{1}|\bar{0}\rangle|^2 = |\alpha|^2\, e^{-4|\alpha|^2} \approx 10^{-5}
$$
for $|\alpha| = \sqrt{3}$. Therefore we propose to use a two-photon pump amplitude $G = 3K$, so that
$|\alpha| = \sqrt{3K/K} = \sqrt{3}$; as such, we will use $G = 3K$ throughout this section.
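The two overlaps quoted above are easy to verify numerically. A minimal QuTiP sketch (the Fock truncation is an assumption made here):

```python
import numpy as np
import qutip as qt

# Check |<1bar|0bar>|^2 = exp(-4|alpha|^2) and |<1bar| a |0bar>|^2 = |alpha|^2 exp(-4|alpha|^2).
N = 60                       # Fock truncation (assumption)
alpha = np.sqrt(3)

a = qt.destroy(N)
zero_bar = qt.coherent(N, -alpha)   # |0bar> = |-alpha>
one_bar = qt.coherent(N, +alpha)    # |1bar> = |+alpha>

print(abs(one_bar.overlap(zero_bar))**2, "vs", np.exp(-4 * alpha**2))
print(abs(one_bar.overlap(a * zero_bar))**2, "vs", alpha**2 * np.exp(-4 * alpha**2))
```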
More formally (beyond the scope of the lecture, but given here as a complement), we compute the semi-classical
steady-state solution of the KNR. In the presence of single-photon loss the collapse operator in the
Lindblad master equation takes the form $\hat{C} = \sqrt{\gamma}\,\hat{a}$, and the time evolution of the density matrix is then
given by
$$
\frac{d\hat{\varrho}(t)}{dt} = -i\left[\hat{H}, \hat{\varrho}\right] + \frac{\gamma}{2}\left(2\hat{a}\hat{\varrho}\hat{a}^\dagger - \hat{\varrho}\hat{a}^\dagger\hat{a} - \hat{a}^\dagger\hat{a}\hat{\varrho}\right).
$$
This furthermore assumes that the system is in thermal equilibrium with a zero-temperature environment,
so that the number of thermal photons in the system is negligible. This is a good approximation, since a
typical microwave cavity is operated at millikelvin temperatures. The described process is schematically sketched in Fig. 11.12. While it is
possible to obtain the steady-state solution analytically [Bartolo et al., 2016, Meaney et al., 2014], one finds
quite easily (see Appendix B.3), from the solution of the semi-classical equations of motion, that the
expected positions of the states $|\pm\alpha\rangle$ in the presence of single-photon loss are
$$
\alpha = \frac{1}{2}\left(\frac{16G^2 - \gamma^2}{K^2}\right)^{1/4} \exp\left(-\frac{i}{2}\arctan\left(\frac{\gamma}{\sqrt{16G^2 - \gamma^2}}\right)\right).
$$
Figure 11.12: Left: Schematic illustration of a cavity with a Kerr-nonlinear element that introduces an
effective photon-photon interaction of strength K. The cavity is subject to two-photon pumping with
amplitude G and single-photon loss at a rate γ. Right: The aforementioned effects sketched on the Fock
states |n⟩.
resonators [Kirchmair et al., 2013]. When this condition is satisfied, the steady-state density matrix of the
system, defined by dρ̂/dt = 0, takes the form
$$
\hat{\varrho}_{ss} \approx \frac{1}{2}\left(|\alpha\rangle\langle\alpha| + |-\alpha\rangle\langle-\alpha|\right),
$$
which is an equally weighted statistical mixture of the two coherent states $|\pm\alpha\rangle$, or equivalently of the two
logical spin states $\{|\bar{0}\rangle, |\bar{1}\rangle\}$. This can furthermore be confirmed by numerical integration of the master
equation Eq. (11.5.2). The numerical results show that the mapping remains robust in the presence of single-photon
loss as long as 12K ≫ γ. Figure 11.13 shows the density plot of the steady-state Wigner function
for γ = 0.12K and γ = 6K, respectively. When γ = 0.12K the fidelity is F = 99.99% with respect to the
ideal steady state Eq. (11.5.2). When γ = 6K, instead, outside the regime of low photon loss, the fidelity
drops to F = 85.13%.
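The robustness of the steady state can be explored with a small QuTiP sketch. The two loss rates below mirror the values quoted above, while the Fock truncation and the use of QuTiP's default Lindblad convention with collapse operator √γ â are assumptions made here; the script simply prints the fidelity with the ideal mixture.

```python
import numpy as np
import qutip as qt

# Sketch: steady state of a two-photon pumped KNR (Eq. (11.79)) with single-photon loss,
# compared with the ideal equal mixture of |+alpha> and |-alpha>.
N, K = 40, 1.0               # Fock truncation and Kerr amplitude (assumptions)
G = 3.0 * K
alpha = np.sqrt(G / K)

a = qt.destroy(N)
H = -K * a.dag()**2 * a**2 + G * (a.dag()**2 + a**2)
rho_ideal = 0.5 * (qt.coherent_dm(N, alpha) + qt.coherent_dm(N, -alpha))

for gamma in (0.12 * K, 6.0 * K):
    rho_ss = qt.steadystate(H, [np.sqrt(gamma) * a])   # collapse operator C = sqrt(gamma) a
    print("gamma =", gamma, " fidelity with ideal mixture:", qt.fidelity(rho_ss, rho_ideal))
```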
where F denotes the amplitude of the single-photon pump. With this additional pump it can once again be
found (see Appendix B.4), from the solution of the semi-classical equations of motion (neglecting losses), that
the expected positions of the coherent states $|\pm\alpha\rangle$ of the resonator are
$$
\pm\alpha \simeq \pm\sqrt{\frac{G}{K}} + \frac{F}{4G},
$$
Figure 11.13: Steady state of a two-photon driven KNR in the presence of single-photon loss. Panel (a) shows the
steady-state phase-space density plot of the Wigner function with a photon loss rate γ = 0.12K, and panel
(b) with γ = 6K. The two-photon drive amplitude was set to G = 3K in both panels. The white circles
indicate the expected positions of the coherent states. On the axes, Im(α) and Re(α) are the momentum and
position quadratures, respectively.
Figure 11.14: The two- and one-photon pumped KNR. One end of the two-photon pumped KNR is
capacitively coupled to a transmission line (labeled “single-photon pump” in the sketch) through which
microwave pulses of amplitude F and frequency ω_r drive the resonator.
i.e. they are slightly displaced by a factor F/4G, but if 4G ≫ F, or equivalently 12K ≫ F, then
$\pm\alpha \simeq \pm\sqrt{G/K}$, and the states are once again confined to the spin subspace spanned by |0̄⟩ and |1̄⟩.
Most importantly, this single-photon pump lifts the degeneracy between the states |0̄⟩ and |1̄⟩ by an amount
proportional to Fα, as shown below.
In the spin subspace spanned by our two logical spins |0̄⟩ and |1̄⟩, the Hamiltonian of Eq. (11.80) can therefore
be expressed as
$$
\hat{\bar{I}}\hat{H}_1\hat{\bar{I}} = \left(|\bar{0}\rangle\langle\bar{0}| + |\bar{1}\rangle\langle\bar{1}|\right)\left[-K\hat{a}^{\dagger 2}\hat{a}^2 + G\left(\hat{a}^{\dagger 2} + \hat{a}^2\right) + F\left(\hat{a}^\dagger + \hat{a}\right)\right]\left(|\bar{0}\rangle\langle\bar{0}| + |\bar{1}\rangle\langle\bar{1}|\right).
$$
Using the definition of a coherent state, $\hat{a}|\bar{0}/\bar{1}\rangle = \mp\alpha|\bar{0}/\bar{1}\rangle$, and the quasi-orthogonality condition $\langle\bar{1}|\bar{0}\rangle \approx 0$,
we get
$$
\hat{\bar{I}}\hat{H}_1\hat{\bar{I}} = \left(-K\alpha^4 + 2G\alpha^2\right)\hat{\bar{I}} + 2F\alpha\,\hat{\bar{\sigma}}^z = 2F\alpha\,\hat{\bar{\sigma}}^z + \text{const},
$$
where $\hat{\bar{\sigma}}^z = |\bar{1}\rangle\langle\bar{1}| - |\bar{0}\rangle\langle\bar{0}|$ is the Pauli z-matrix. This is the Ising Hamiltonian for a single Ising spin in a
magnetic field of strength $|2F\alpha|$. Therefore the application of a weak single-photon drive induces an effective
magnetic field.
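A quick numerical illustration of this effective field (assumed parameters, QuTiP): the energy splitting ⟨α|Ĥ₁|α⟩ − ⟨−α|Ĥ₁|−α⟩ should be close to 4Fα, i.e. twice the field strength |2Fα|:

```python
import numpy as np
import qutip as qt

# Sketch: splitting of the two coherent "spin" states under a weak single-photon pump F.
N, K = 40, 1.0               # Fock truncation and Kerr amplitude (assumptions)
G, F = 3.0 * K, 0.1 * K      # illustrative values, chosen so that 12K >> F
alpha = np.sqrt(G / K)

a = qt.destroy(N)
H1 = -K * a.dag()**2 * a**2 + G * (a.dag()**2 + a**2) + F * (a.dag() + a)

splitting = qt.expect(H1, qt.coherent(N, alpha)) - qt.expect(H1, qt.coherent(N, -alpha))
print("numerical splitting:", splitting, " expected 4*F*alpha:", 4 * F * alpha)
```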
Now, recall from Sec. 7.4 that we require an initial Hamiltonian that does not commute with the final
Hamiltonian and that furthermore has a simple, non-degenerate ground state. We consider the case where
the resonators are initialized to vacuum, since it is relatively simple to prepare the resonators in such a state.
We choose an initial Hamiltonian in the rotating frame of the form
$$
\hat{H}_0 = -\delta\hat{a}^\dagger\hat{a} - K\hat{a}^{\dagger 2}\hat{a}^2, \qquad (11.81)
$$
where δ is a finite positive detuning that separates the vacuum state |0⟩ from the single-photon Fock state |1⟩ by
an energy gap δ. At first glance Eq. (11.81) might fill us with fear and trepidation, since the Hamiltonian
is not bounded from below. Fear not! One can show that this is a consequence of the approximations used in
the derivation, namely the truncation of the expansion of the Josephson cosine potential to fourth order
(see Appendix B.1), and that the microscopic Hamiltonian from which it was derived, Eq. (B.5), does
not exhibit this property. We also have to remember that photon loss naturally drives the system towards lower
photon-number states. Therefore, as pointed out by Nigg et al. (2016) [Nigg et al., 2017], we have to think of the zero-photon
state in the rotating frame as the state with the highest energy instead of the one with the lowest energy.
We also adopt the terminology of Nigg et al. and refer to the state with the highest energy, which plays the
role of the ground state, as the roof state.
The time-dependent Hamiltonian for the QA algorithm is obtained by adiabatically varying the two- and
single-photon pump amplitudes and frequencies (see Appendix B.2):
$$
\hat{H}(t) = \left(1 - \frac{t}{\tau}\right)\hat{H}_0 + \frac{t}{\tau}\hat{H}_1
= \left(1 - \frac{t}{\tau}\right)\left(-\delta\hat{a}^\dagger\hat{a} - K\hat{a}^{\dagger 2}\hat{a}^2\right)
+ \frac{t}{\tau}\left(-K\hat{a}^{\dagger 2}\hat{a}^2 + G\left(\hat{a}^{\dagger 2} + \hat{a}^2\right) + F\left(\hat{a}^\dagger + \hat{a}\right)\right), \qquad (11.82)
$$
yielding Eq. (7.1) for the single Ising spin in a magnetic field problem. Notice that the Kerr term in the
Hamiltonian above is actually time-independent, so that one only needs to vary the frequency and amplitude
of the pumps.
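For illustration, the sweep of Eq. (11.82) for a single resonator can be simulated directly with QuTiP's Schrödinger-equation solver. The sketch below uses illustrative values for δ, F and τ (these are assumptions, not values from the notes) and prints the final overlaps with the two coherent states |±α⟩.

```python
import numpy as np
import qutip as qt

# Minimal sketch of the single-spin annealing sweep of Eq. (11.82): the resonator starts
# in vacuum (the roof state of H0) and is swept into the pumped KNR Hamiltonian H1.
N, K = 30, 1.0                                   # Fock truncation and Kerr amplitude (assumptions)
G, F, delta = 3.0 * K, 0.3 * K, 1.0 * K          # illustrative pump amplitudes and detuning
tau = 200.0 / K                                  # sweep time (assumption)

a = qt.destroy(N)
H0 = -delta * a.dag() * a - K * a.dag()**2 * a**2
H1 = -K * a.dag()**2 * a**2 + G * (a.dag()**2 + a**2) + F * (a.dag() + a)
H = [[H0, lambda t, args=None: 1 - t / tau],     # (1 - t/tau) H0
     [H1, lambda t, args=None: t / tau]]         # (t/tau) H1

tlist = np.linspace(0.0, tau, 400)
result = qt.sesolve(H, qt.basis(N, 0), tlist)    # start from vacuum
psi_final = result.states[-1]

alpha = np.sqrt(G / K)
for label, amp in [("|+alpha>", alpha), ("|-alpha>", -alpha)]:
    print(label, "population:", abs(qt.coherent(N, amp).overlap(psi_final))**2)
```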
pump for the moment. The Hamiltonian for two coupled KNRs in the rotating frame is given by
$$
\hat{H}_1 = \sum_{i=1}^{2}\left[-K\hat{a}_i^{\dagger 2}\hat{a}_i^2 + G\left(\hat{a}_i^{\dagger 2} + \hat{a}_i^2\right)\right] + g\left(\hat{a}_1^\dagger\hat{a}_2 + \hat{a}_1\hat{a}_2^\dagger\right).
$$
A steady-state analysis shows that for sufficiently small coupling strengths, g ≪ 6K, the two-photon pumped
KNRs are approximately kept in the subspace spanned by |0̄, 0̄⟩, |0̄, 1̄⟩, |1̄, 0̄⟩ and |1̄, 1̄⟩ (see Appendix B.5).
Similarly to the previous section, it can be shown that in the spin subspace spanned by the logical spins
{|0̄, 0̄⟩, |0̄, 1̄⟩, |1̄, 0̄⟩, |1̄, 1̄⟩} this Hamiltonian takes the form
$$
\hat{H}_1 = 2g\alpha^2\,\hat{\bar{\sigma}}_1^z\hat{\bar{\sigma}}_2^z + \text{const}.
$$
This is the Hamiltonian for two magnetically coupled spins, with g > 0 (g < 0) corresponding to
ferromagnetic (antiferromagnetic) coupling between the spins in the rotating frame. The initial Hamiltonian
for two coupled KNRs is chosen to be
$$
\hat{H}_0 = \sum_{i=1}^{2}\left[-\delta\hat{a}_i^\dagger\hat{a}_i - K\hat{a}_i^{\dagger 2}\hat{a}_i^2\right] + g\left(\hat{a}_1^\dagger\hat{a}_2 + \hat{a}_1\hat{a}_2^\dagger\right).
$$
Furthermore, in order to ensure that the vacuum state |0, 0⟩ is the roof state of the initial Hamiltonian Ĥ₀,
the detuning has to be greater than the single-photon exchange rate, δ > g, which can be seen by looking at
the eigenstate following the roof state.
What is interesting about the two-spin Ising problem is that the ground state exhibits frustration. This
implies that the ground state for the two-spin problem is degenerate. At t = τ the two degenerate roof states
for ferromagnetic and antiferromagnetic coupling are {|0̄, 0̄⟩, |1̄, 1̄⟩} and {|0̄, 1̄⟩, |1̄, 0̄⟩}, respectively. It might
seem like this poses a problem, since the gap between the two “highest” energy states closes, and the adiabatic
theorem is therefore violated. However, the theorem is not violated, for the following reason. Recall from
your introductory course in quantum mechanics that an observable A is conserved if its Hermitian operator Â
commutes with the Hamiltonian, i.e. $[\hat{A}, \hat{H}] = 0$. It can be shown that the parity operator, defined
by $\hat{P}_2 = \exp\left(i\pi\sum_{i=1}^{2}\hat{a}_i^\dagger\hat{a}_i\right)$, commutes with both the initial and the final Hamiltonian, so that $[\hat{P}_2, \hat{H}(t)] = 0$.
This means that parity is conserved and that transitions only occur between even-even and odd-odd parity
eigenstates. So, since the zero-photon Fock state |0⟩ has even parity and the single-photon Fock state |1⟩
has odd parity, the degeneracy is not a problem.
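The conservation of parity is straightforward to verify numerically. The following sketch (QuTiP, with a small Fock truncation and illustrative parameters, both assumptions) checks that the two-mode parity operator commutes with the initial and final Hamiltonians of the two coupled KNRs:

```python
import numpy as np
import qutip as qt

# Sketch: [P2, H0] = [P2, H1] = 0 for two coupled two-photon pumped KNRs.
N = 12                                   # small Fock truncation (assumption)
K, G, g, delta = 1.0, 3.0, 0.1, 1.0      # illustrative parameters

a1 = qt.tensor(qt.destroy(N), qt.qeye(N))
a2 = qt.tensor(qt.qeye(N), qt.destroy(N))

H1 = (-K * a1.dag()**2 * a1**2 + G * (a1.dag()**2 + a1**2)
      - K * a2.dag()**2 * a2**2 + G * (a2.dag()**2 + a2**2)
      + g * (a1.dag() * a2 + a1 * a2.dag()))
H0 = (-delta * a1.dag() * a1 - K * a1.dag()**2 * a1**2
      - delta * a2.dag() * a2 - K * a2.dag()**2 * a2**2
      + g * (a1.dag() * a2 + a1 * a2.dag()))

P2 = (1j * np.pi * (a1.dag() * a1 + a2.dag() * a2)).expm()   # two-mode parity operator
for name, H in [("H0", H0), ("H1", H1)]:
    print(name, "commutator norm:", qt.commutator(P2, H).norm())   # ~0 up to round-off
```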
We now possess the tools to write down the final Hamiltonian Ĥ₁ for N linearly coupled KNRs, which
reads
$$
\hat{H}_1 = \sum_{i=1}^{N}\left[-K\hat{a}_i^{\dagger 2}\hat{a}_i^2 + G\left(\hat{a}_i^{\dagger 2} + \hat{a}_i^2\right) + F_i\left(\hat{a}_i^\dagger + \hat{a}_i\right)\right]
+ \sum_{1\le i<j\le N} g_{ij}\left(\hat{a}_i^\dagger\hat{a}_j + \hat{a}_j^\dagger\hat{a}_i\right), \qquad (11.83)
$$
where g_ij is the single-photon coupling strength between resonators i and j. In the computational basis the
final Hamiltonian becomes
$$
\hat{H}_1 = 2\alpha^2\sum_{1\le i<j\le N} g_{ij}\,\hat{\bar{\sigma}}_i^z\hat{\bar{\sigma}}_j^z + 2\alpha\sum_{i=1}^{N} F_i\,\hat{\bar{\sigma}}_i^z + \text{const}, \qquad (11.84)
$$
which corresponds to the Ising Hamiltonian for N spins. The corresponding initial Hamiltonian can just as
readily be written down:
$$
\hat{H}_0 = \sum_{i=1}^{N}\left[-\delta\hat{a}_i^\dagger\hat{a}_i - K\hat{a}_i^{\dagger 2}\hat{a}_i^2\right] + \sum_{1\le i<j\le N} g_{ij}\left(\hat{a}_i^\dagger\hat{a}_j + \hat{a}_i\hat{a}_j^\dagger\right).
$$
We now compare the couplings and the single-photon pumping strengths of the KNR Ising Hamiltonian
Eq. (11.84) with the couplings and magnetic-field strengths of the Ising Hamiltonian Eq. (8.2). This yields
the following correspondence for the couplings:
$$
-2\alpha^2 g_{ij} \leftrightarrow J_{ij}. \qquad (11.85)
$$
The minus sign is introduced because g_ij > 0 (g_ij < 0) corresponds to ferromagnetic (antiferromagnetic)
coupling, while J_ij > 0 (J_ij < 0) corresponds to antiferromagnetic (ferromagnetic) coupling. Similarly, for the
single-photon pumps we make the correspondence
$$
2\alpha F_i \leftrightarrow h_i. \qquad (11.86)
$$
Lastly, the final component of this continuous-variable quantum annealing architecture is the readout of
the computational states. This can be implemented through balanced homodyne detection, enabled via
capacitively coupled transmission lines [Puri et al., 2017], corresponding to the measurement
of a suitably chosen field quadrature, namely $\hat{q} = \int dq\, q\, |q\rangle\langle q|$.
where A is the same scale factor. To make sure that the conditions 6K ≫ g_ij and 12K ≫ F_i are satisfied,
we chose A to be
$$
A = \frac{K}{2\max_i|h_i|} = \frac{K}{10}.
$$
It should be noted that the scale factor may vary depending on the given problem and the set of numbers in
the problem. After having defined all the couplings and single-photon pump strengths, we have diagonalized
the time-dependent QA Hamiltonian
$$
\hat{H}(t) = \left(1 - \frac{t}{\tau}\right)\hat{H}_0 + \frac{t}{\tau}\hat{H}_1, \qquad (11.87)
$$
where the initial and final Hamiltonians are
$$
\hat{H}_1 = \sum_{i=1}^{3}\left[-K\hat{a}_i^{\dagger 2}\hat{a}_i^2 + G\left(\hat{a}_i^{\dagger 2} + \hat{a}_i^2\right) + F_i\left(\hat{a}_i^\dagger + \hat{a}_i\right)\right]
+ \sum_{1\le i<j\le 3} g_{ij}\left(\hat{a}_i^\dagger\hat{a}_j + \hat{a}_j^\dagger\hat{a}_i\right),
$$
$$
\hat{H}_0 = \sum_{i=1}^{3}\left[-\delta\hat{a}_i^\dagger\hat{a}_i - K\hat{a}_i^{\dagger 2}\hat{a}_i^2\right] + \sum_{1\le i<j\le 3} g_{ij}\left(\hat{a}_i^\dagger\hat{a}_j + \hat{a}_i\hat{a}_j^\dagger\right).
$$
In the spin subspace spanned by the computational basis, the final Hamiltonian takes the form (dropping the
constant)
$$
\hat{H}_1 = A\left[-\frac{1}{2}\sum_{1\le i<j\le 3} n_i n_j\,\hat{\bar{\sigma}}_i^z\hat{\bar{\sigma}}_j^z
+ \sum_{i=1}^{3}\left(\frac{1}{2}\sum_{j=1}^{3} n_j - m\right) n_i\,\hat{\bar{\sigma}}_i^z\right],
$$
fair partitioning for this problem is S₁ = {3, 5} and S₂ = {8}, for which the Ising spin configurations that
satisfy the problem are s₁ = s₂ = +1 and s₃ = −1, together with the configuration with all spins flipped. We chose
A = 0.0015K to satisfy 6K ≫ g_ij. The initial and final Hamiltonians are given by
$$
\hat{H}_1 = \sum_{i=1}^{3}\left[-K\hat{a}_i^{\dagger 2}\hat{a}_i^2 + G\left(\hat{a}_i^{\dagger 2} + \hat{a}_i^2\right)\right] + \sum_{1\le i<j\le 3} g_{ij}\left(\hat{a}_i^\dagger\hat{a}_j + \hat{a}_j^\dagger\hat{a}_i\right),
$$
$$
\hat{H}_0 = \sum_{i=1}^{3}\left[-\delta\hat{a}_i^\dagger\hat{a}_i - K\hat{a}_i^{\dagger 2}\hat{a}_i^2\right] + \sum_{1\le i<j\le 3} g_{ij}\left(\hat{a}_i^\dagger\hat{a}_j + \hat{a}_i\hat{a}_j^\dagger\right).
$$
In the spin subspace spanned by the coherent eigenstates of the two-photon pumped KNRs in the rotating
frame, the final Hamiltonian takes the form
$$
\hat{H}_1 = -A\sum_{1\le i<j\le 3} n_i n_j\,\hat{\bar{\sigma}}_i^z\hat{\bar{\sigma}}_j^z,
$$
realizing the Ising Hamiltonian for the number partitioning problem. From diagonalizing the time-dependent
QA Hamiltonian we found that the minimum energy gap was Δ_min = 0.3K and that the adiabatic condition
was τ ≫ 3.92/K, which requires an evolution time of approximately τ = 392/K. Each resonator
was then initialized to the vacuum |0⟩ and we numerically solved the Schrödinger equation. From this, the
success probability of reaching the desired state |0̄, 0̄, 1̄⟩ or |1̄, 1̄, 0̄⟩ was found to be 99.97%. Furthermore, a
calculation of the fidelity showed that the system reaches the entangled state $\mathcal{N}\left(|\bar{0}, \bar{0}, \bar{1}\rangle + |\bar{1}, \bar{1}, \bar{0}\rangle\right)$, where
$\mathcal{N} = 1/\sqrt{2\left(1 + \exp(-6|\alpha|^2)\right)}$ is a normalization constant, with 100% fidelity.
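The Ising encoding of the number-partitioning instance {3, 5, 8} can also be checked classically by brute force: the spin configuration that maximizes Ĥ₁ above (the roof state reached by the annealer) should balance the two subsets. A minimal Python sketch:

```python
from itertools import product

# Brute-force check of the number-partitioning Ising encoding for {3, 5, 8}:
# maximize -A * sum_{i<j} n_i n_j s_i s_j over spin configurations s_i = +/-1.
numbers = [3, 5, 8]
A = 1.0  # overall scale; irrelevant for the location of the maximum

def energy(spins):
    return -A * sum(numbers[i] * numbers[j] * spins[i] * spins[j]
                    for i in range(3) for j in range(i + 1, 3))

best = max(product([-1, +1], repeat=3), key=energy)
print("optimal spins:", best)
print("subset sums:",
      sum(n for n, s in zip(numbers, best) if s == +1),
      sum(n for n, s in zip(numbers, best) if s == -1))
```

Up to a global spin flip, the maximizer is s₁ = s₂ = +1, s₃ = −1, i.e. the partition {3, 5} vs {8}.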
Discussion
The numerical results that we have obtained show that in the ideal case with no losses the success probability
was 99.98% for the subset-sum problem and 99.97% for the number partitioning problem. The 0.02%−0.03%
error mostly comes from deviations from the computational subspace. A higher success probability could
have been achieved by choosing a smaller scale factor, since then the conditions 6K ≫ g_ij and 12K ≫ F_i
would have been better satisfied. However, choosing a smaller scale factor decreases the minimum energy gap,
which leads to an increase in the evolution time of the algorithm. For example, choosing A = 0.00025K for
the number partitioning problem we get a success probability of 100%, but the evolution time increases by
approximately a factor of 6.
al. (2016) [Puri et al., 2017] where they showed how the LHZ scheme for two-photon pumped KNRs could
be physically realized. A third connectivity solution was presented by Nigg et al. (2016) [Nigg et al., 2017]
where they showed that by leveraging the phenomenon of flux quantization for a network of Kerr-parametric
oscillators they could achieve an all-to-all connected architecture.
Appendix A
Figure A.1: Cavity with two perfectly reflecting mirrors located at z = 0 and z = L. The electric field is
assumed to be polarized along the x-direction
In the absence of sources and charges the Maxwell equations read
$$
\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \qquad (A.1)
$$
$$
\nabla \times \mathbf{B} = \mu_0\epsilon_0 \frac{\partial \mathbf{E}}{\partial t}, \qquad (A.2)
$$
$$
\nabla \cdot \mathbf{E} = 0, \qquad (A.3)
$$
$$
\nabla \cdot \mathbf{B} = 0. \qquad (A.4)
$$
Using Maxwell's equations in the absence of sources and charges and the given boundary conditions, the
electric field has the form
$$
\mathbf{E}(z, t) = \left(\frac{2\omega^2}{V\epsilon_0}\right)^{1/2} q(t)\sin(kz)\,\mathbf{e}_x, \qquad (A.5)
$$
where V is the volume of the cavity, q(t) is a function of time and $\mathbf{e}_x$ is the unit vector along the x-direction.
From the boundary conditions the allowed frequencies are found to be
$$
\omega_n = \frac{c\pi n}{L}, \qquad n = 1, 2, 3, \ldots
$$
with $k_n = \omega_n/c$ the corresponding wave number. In Eq. (A.5) we have assumed a specific frequency ω,
i.e. a specific standing wave, which is called a mode of the field. We find the corresponding magnetic field
by substituting Eq. (A.5) into (A.2):
$$
\mathbf{B}(z, t) = \frac{\mu_0\epsilon_0}{k}\left(\frac{2\omega^2}{V\epsilon_0}\right)^{1/2} \dot{q}(t)\cos(kz)\,\mathbf{e}_y. \qquad (A.6)
$$
We now identify q(t) as a canonical coordinate, and p(t) ≡ q̇(t) as the momentum canonically conjugate to
q(t). The energy stored in the field of the single mode is
$$
H = \frac{1}{2}\int_V \left(\epsilon_0|\mathbf{E}|^2 + \frac{1}{\mu_0}|\mathbf{B}|^2\right)dV = \frac{1}{2}\left(p^2 + \omega^2 q^2\right).
$$
We see that this is nothing but the energy of a harmonic oscillator with unit mass¹. To quantize the
electromagnetic field we promote q and p to operators,
$$
q \to \hat{q} \quad \text{and} \quad p \to \hat{p}.
$$
Thus the electric and magnetic fields are also promoted to operators,
$$
\hat{E}_x(z, t) = \left(\frac{2\omega^2}{V\epsilon_0}\right)^{1/2} \hat{q}\sin(kz), \qquad (A.7)
$$
$$
\hat{B}_y(z, t) = \frac{\mu_0\epsilon_0}{k}\left(\frac{2\omega^2}{V\epsilon_0}\right)^{1/2} \hat{p}\cos(kz). \qquad (A.8)
$$
The subscripts x and y denote the constituent components of the fields. The Hamiltonian now reads
$$
\hat{H} = \frac{1}{2}\left(\hat{p}^2 + \omega^2\hat{q}^2\right). \qquad (A.9)
$$
1 The amplitude of the electric field in Eq. (A.5) was cleverly chosen to yield the energy of a harmonic oscillator of unit mass.
Next we define the very useful non-Hermitian ladder operators
$$
\hat{a} = \frac{1}{\sqrt{2\hbar\omega}}\left(\omega\hat{q} + i\hat{p}\right), \qquad (A.10)
$$
$$
\hat{a}^\dagger = \frac{1}{\sqrt{2\hbar\omega}}\left(\omega\hat{q} - i\hat{p}\right), \qquad (A.11)
$$
which obey the boson commutation relation $[\hat{a}, \hat{a}^\dagger] = 1$. Inverting Eq. (A.10) and Eq. (A.11) and substituting
into Eq. (A.9), we obtain a Hamiltonian written entirely in terms of the ladder operators, as in Eq. (11.1.1).
In the main text we have not discussed the time dependence of the operators. What we have done so far
is assumed to hold at some time t, for example t = 0. In the Schrödinger picture the states are time-dependent
and the operators are time-independent. On the contrary, in the Heisenberg picture the operators
are time-dependent and the states are time-independent. In the Heisenberg picture the time evolution of the
annihilation operator is given by
$$
\frac{d\hat{a}}{dt} = \frac{i}{\hbar}\left[\hat{H}, \hat{a}\right]
= \frac{i}{\hbar}\left[\hbar\omega\left(\hat{a}^\dagger\hat{a} + \frac{1}{2}\right), \hat{a}\right]
= i\omega\left(\hat{a}^\dagger\hat{a}\hat{a} - \hat{a}\hat{a}^\dagger\hat{a}\right)
= i\omega\left[\hat{a}^\dagger, \hat{a}\right]\hat{a} = -i\omega\hat{a}. \qquad (A.12)
$$
After substituting Eq. (A.10) into Eq. (A.7) and Eq. (A.11) into Eq. (A.8), the electric and magnetic fields,
with the inclusion of the time dependence, become, respectively,
$$
\hat{E}_x(z, t) = \left(\frac{\hbar\omega}{V\epsilon_0}\right)^{1/2}\left(\hat{a}e^{-i\omega t} + \hat{a}^\dagger e^{i\omega t}\right)\sin(kz), \qquad (A.14)
$$
$$
\hat{B}_y(z, t) = \frac{\mu_0\epsilon_0}{ik}\left(\frac{\hbar\omega^3}{V\epsilon_0}\right)^{1/2}\left(\hat{a}e^{-i\omega t} - \hat{a}^\dagger e^{i\omega t}\right)\cos(kz). \qquad (A.15)
$$
It is convenient to introduce the two Hermitian operators
$$
\hat{X} = \sqrt{\frac{\omega}{2\hbar}}\,\hat{q} = \frac{1}{2}\left(\hat{a} + \hat{a}^\dagger\right),
\qquad
\hat{X}_{\pi/2} = \frac{1}{\sqrt{2\hbar\omega}}\,\hat{p} = \frac{1}{2i}\left(\hat{a} - \hat{a}^\dagger\right), \qquad (A.16)
$$
which satisfy the commutation relation
$$
\left[\hat{X}, \hat{X}_{\pi/2}\right] = \frac{i}{2}.
$$
From here it is easy to show that the electric field operator Eq. (A.14) can be written in terms of the
dimensionless quantities $\hat{X}$ and $\hat{X}_{\pi/2}$ as
$$
\hat{E}_x(z, t) = 2\left(\frac{\hbar\omega}{\epsilon_0 V}\right)^{1/2}\left(\hat{X}\cos(\omega t) + \hat{X}_{\pi/2}\sin(\omega t)\right)\sin(kz). \qquad (A.17)
$$
This expression shows that $\hat{X}$ and $\hat{X}_{\pi/2}$ are associated with the electric-field amplitude, where the second
term is offset in phase by π/2 compared to the cos(ωt) term.
Now, let us understand why coherent states are the most classical states. To construct a state with close
resemblance to the classical electromagnetic field, one can observe that replacing â and ↠by a complex
variable in Eq. (A.14) and Eq. (A.15) would produce a “classical field”, i.e. a field that oscillates. This is
achieved in view of the definition of the coherent state as an eigenstate of the annihilation operator, as in
Eq. (11.4). The expectation value of the electric field given by Eq. (A.14) then becomes
$$
\langle\hat{E}_x(z, t)\rangle_\alpha = \langle\alpha|\hat{E}_x(z, t)|\alpha\rangle
= \left(\frac{\hbar\omega}{V\epsilon_0}\right)^{1/2}\langle\alpha|\left(\hat{a}e^{-i\omega t} + \hat{a}^\dagger e^{i\omega t}\right)|\alpha\rangle\sin(kz)
= \left(\frac{\hbar\omega}{V\epsilon_0}\right)^{1/2}\left(\alpha e^{-i\omega t} + \alpha^* e^{i\omega t}\right)\sin(kz).
$$
Writing α in polar coordinates, α = |α|e^{iϕ}, we get
$$
\langle\hat{E}_x(z, t)\rangle_\alpha = 2|\alpha|\left(\frac{\hbar\omega}{V\epsilon_0}\right)^{1/2}\cos(\omega t - \varphi)\sin(kz),
$$
and we see that the field oscillates very much like the classical electric field. Likewise, for the quadrature
operators Eq. (A.16), we have
$$
\langle\hat{X}\rangle_\alpha = \frac{1}{2}\langle\alpha|\left(\hat{a} + \hat{a}^\dagger\right)|\alpha\rangle = \frac{1}{2}\left(\alpha + \alpha^*\right) = \mathrm{Re}\,\alpha = |\alpha|\cos\varphi
$$
and similarly $\langle\hat{X}_{\pi/2}\rangle_\alpha = \mathrm{Im}\,\alpha = |\alpha|\sin\varphi$, so the means of the quadratures are related to the real and
imaginary parts of α.
Let us now derive Eq. (11.1.1) of the main text. Since the Fock states form a complete set, we will use
them to express |α⟩:
$$
|\alpha\rangle = \sum_{n=0}^{\infty} c_n |n\rangle, \qquad (A.18)
$$
where $c_n = \langle n|\alpha\rangle$ denotes a complex number which is to be determined. Inserting Eq. (A.18) into Eq. (11.4)
and using Eq. (11.1) and Eq. (11.2), we obtain
$$
\sum_{n=1}^{\infty} c_n \sqrt{n}\, |n-1\rangle = \sum_{n=0}^{\infty} c_n \alpha |n\rangle.
$$
Since the Fock states form an orthogonal basis, we can multiply with an arbitrary state ⟨m| from the left and
use the orthogonality condition $\langle m|n\rangle = \delta_{mn}$ to obtain
$$
c_{m+1}\sqrt{m+1} = \alpha c_m.
$$
By the substitution m → n − 1,
$$
c_n = \frac{\alpha}{\sqrt{n}}\, c_{n-1} = \frac{\alpha^2}{\sqrt{n(n-1)}}\, c_{n-2} = \ldots = \frac{\alpha^n}{\sqrt{n!}}\, c_0.
$$
Figure A.2: Phase-space diagram of a coherent state |α⟩. The displacement from the origin is equal to |α|
and the angle ϕ is the phase, measured from the X-axis. The quantum uncertainty (ΔX = ΔX_{π/2} = 1/2) is
displayed as a grey circle with a diameter of 1/2.
Therefore $|c_0| = e^{-|\alpha|^2/2}$, and our final expression for the coherent state |α⟩ expressed in terms of Fock states
is Eq. (11.1.1) in the main text.
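As a small numerical illustration of this result (not part of the original notes), the coefficients $c_n = e^{-|\alpha|^2/2}\alpha^n/\sqrt{n!}$ can be generated directly and checked to be normalized and to give a Poissonian photon-number distribution with mean |α|²:

```python
import numpy as np
from math import factorial

# Check normalization and mean photon number of the coherent-state Fock coefficients.
alpha = 1.5 + 0.5j          # arbitrary test amplitude (assumption)
n_max = 40                  # truncation, large enough for this |alpha|

c = np.array([np.exp(-abs(alpha)**2 / 2) * alpha**n / np.sqrt(factorial(n))
              for n in range(n_max)])
probs = np.abs(c)**2
print("norm:", probs.sum())                                   # ~1
print("mean photon number:", (np.arange(n_max) * probs).sum(),
      "vs |alpha|^2 =", abs(alpha)**2)
```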
A neat way to illustrate a coherent state is in phase space. In Fig. A.2 the phase space diagram of a
coherent state |αi is illustrated. The coherent state can be viewed as a displacement of the vacuum state
with a distance |α| from the origin and an angle ϕ measured from the X̂-axis. The grey circle area represents
the uncertainty of the coherent state and has a constant diameter of 1/2.
Appendix B
Figure B.1: Left: Schematic illustration of a SQUID, with junction phases φ₁, φ₂ and external flux Φ_ext. Right:
Circuit symbol for a Josephson junction with Josephson energy E_J and capacitance C_J.
In this section of the appendix we aim to derive the two-photon pumped Kerr Hamiltonian starting from
the circuit Lagrangian of a SQUID and follow the derivation presented by Nigg et al. [Nigg et al., 2017].
The SQUID consists of two Josephson junctions connected in a superconducting loop with a magnetic flux
Φext (t) penetrating the loop. Fig B.1 is meant to illustrate the SQUID and the corresponding circuit symbol
of the Josephson junction. The circuit Lagrangian for a SQUID is given by
$$
\mathcal{L} = \frac{1}{2}C_J\left(\dot{\phi}_1^2 + \dot{\phi}_2^2\right) + E_J\left(\cos(2\pi\phi_1/\Phi_0) + \cos(2\pi\phi_2/\Phi_0)\right), \qquad (B.1)
$$
where φ_i is the phase difference across the junction and Φ₀ = h/(2e) is the magnetic flux quantum. For
convenience we chose units such that the flux quantum is Φ₀ = h/(2e) = 2π. The parameters C_J and E_J
in Eq. (B.1) denote the capacitance and Josephson energy of each Josephson junction. Due to the fluxoid
quantization condition, φ₁ and φ₂ are not independent variables. If a flux Φ_ext is penetrating the loop, then
flux quantization requires that
$$
\phi_2 - \phi_1 = \Phi_{\mathrm{ext}}, \qquad (B.2)
$$
where we have set the geometrical phase term to zero and neglected the induced-flux term, which is reasonable
for small loops. We introduce the new variable
$$
\vartheta = \frac{1}{2}\left(\phi_1 + \phi_2\right),
$$
which together with Eq. (B.2) implies
$$
\phi_1 = \vartheta - \frac{1}{2}\Phi_{\mathrm{ext}}, \qquad (B.3)
$$
$$
\phi_2 = \vartheta + \frac{1}{2}\Phi_{\mathrm{ext}}. \qquad (B.4)
$$
Inserting Eq. (B.3) and Eq. (B.4) into Eq. (B.1) and simplifying, we get
$$
\mathcal{L} = \frac{1}{2}C\dot{\vartheta}^2 + 2E_J\cos\left(\frac{\Phi_{\mathrm{ext}}}{2}\right)\cos(\vartheta)
$$
with C = 2C_J. We have also neglected the term $\frac{1}{4}C_J\dot{\Phi}_{\mathrm{ext}}^2$, which is independent of the coordinates and does
not affect the dynamics of the system. The Hamiltonian function is obtained as usual through the Legendre
transformation
$$
H = \sum_i Q_i\dot{\vartheta}_i - \mathcal{L},
$$
with $Q_i = \frac{\partial\mathcal{L}}{\partial\dot{\vartheta}_i} = C\dot{\vartheta}_i$ being the generalized momenta. From the Legendre transformation we obtain the
Hamiltonian
$$
H = E_C\frac{Q^2}{2e^2} - 2E_J\cos\left(\frac{\Phi_{\mathrm{ext}}}{2}\right)\cos(\vartheta), \qquad (B.5)
$$
where $E_C \equiv e^2/C$ is the charging energy. Next, we consider a monochromatic magnetic flux modulation
threaded through the loop that is of the form
$$
\frac{\Phi_{\mathrm{ext}}(t)}{2} = \Phi_{\mathrm{dc}} + \delta\Phi(t),
$$
where Φ_dc is the static magnetic flux and δΦ(t) is a time-dependent perturbation that we choose of the form
$\delta\Phi(t) = \delta\Phi_{\mathrm{ac}}\cos(\Omega t)$, where δΦ_ac is the ac flux. The external flux cosine term in Eq. (B.5) can now be written as
$$
\cos\left(\frac{\Phi_{\mathrm{ext}}(t)}{2}\right) = \cos(\Phi_{\mathrm{dc}})\cos\left(\delta\Phi_{\mathrm{ac}}\cos(\Omega t)\right) - \sin(\Phi_{\mathrm{dc}})\sin\left(\delta\Phi_{\mathrm{ac}}\cos(\Omega t)\right).
$$
Using the following mathematical tricks,
$$
\cos(z\cos\vartheta) = J_0(z) + 2\sum_{n=1}^{\infty}(-1)^n J_{2n}(z)\cos(2n\vartheta),
$$
$$
\sin(z\cos\vartheta) = 2\sum_{n=1}^{\infty}(-1)^n J_{2n-1}(z)\cos\left[(2n-1)\vartheta\right],
$$
we obtain
$$
\cos\left(\frac{\Phi_{\mathrm{ext}}(t)}{2}\right) = \cos(\Phi_{\mathrm{dc}})\left(J_0(\delta\Phi_{\mathrm{ac}}) + 2\sum_{n=1}^{\infty}(-1)^n J_{2n}(\delta\Phi_{\mathrm{ac}})\cos(2n\Omega t)\right)
- \sin(\Phi_{\mathrm{dc}})\left(2\sum_{n=1}^{\infty}(-1)^n J_{2n-1}(\delta\Phi_{\mathrm{ac}})\cos\left[(2n-1)\Omega t\right]\right).
$$
If the ac flux is sufficiently small, |δΦ_ac| ≪ 1, then we can keep only the two leading terms of the sums, that is
J₀ and J₁, and use the approximations J₀(z) ≃ 1 and J₁(z) ≃ z/2, yielding
$$
\cos\left(\frac{\Phi_{\mathrm{ext}}(t)}{2}\right) \simeq \cos(\Phi_{\mathrm{dc}}) + \sin(\Phi_{\mathrm{dc}})\,\delta\Phi_{\mathrm{ac}}\cos(\Omega t).
$$
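The small-argument Bessel approximations used in this step, J₀(z) ≃ 1 and J₁(z) ≃ z/2, are easy to check numerically; a short sketch using SciPy's Bessel functions (the test values are arbitrary assumptions):

```python
from scipy.special import jv

# Check J0(z) ~ 1 and J1(z) ~ z/2 for weak ac flux |dPhi_ac| << 1.
for z in (0.01, 0.05, 0.1):
    print(f"z = {z}:  J0 = {jv(0, z):.6f} (vs 1),  "
          f"J1 = {jv(1, z):.6f} (vs z/2 = {z / 2:.6f})")
```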
For convenience we now separate the Hamiltonian into a time-independent and a time-dependent part as
$H_0 = H_1 + H_2(t)$, where
$$
H_1 = E_C\frac{Q^2}{2e^2} - 2E_J\cos(\Phi_{\mathrm{dc}})\cos(\vartheta),
$$
$$
H_2(t) = -2E_J\sin(\Phi_{\mathrm{dc}})\,\delta\Phi_{\mathrm{ac}}\cos(\Omega t)\cos(\vartheta).
$$
We now expand cos(ϑ) to fourth order, keeping only the leading nonlinearity. Furthermore, since we
focus on weak ac modulation, |δΦ_ac| ≪ 1, we can neglect the nonlinear part of the time-dependent term;
we then have
$$
H_1 \simeq E_C\frac{Q^2}{2e^2} + E_J\cos(\Phi_{\mathrm{dc}})\,\vartheta^2 - \frac{E_J}{12}\cos(\Phi_{\mathrm{dc}})\,\vartheta^4,
$$
$$
H_2(t) \simeq E_J\sin(\Phi_{\mathrm{dc}})\,\delta\Phi_{\mathrm{ac}}\cos(\Omega t)\,\vartheta^2,
$$
where we have dropped the coordinate-independent terms that do not contribute to the dynamics. As the
next step we promote ϑ and Q to operators with the commutation relation
$$
\left[\hat{\vartheta}, \hat{Q}\right] = i.
$$
By introducing the creation and annihilation operators ↠and â, which obey the bosonic commutation relation
$[\hat{a}, \hat{a}^\dagger] = 1$, we have that
$$
\hat{\vartheta} = \sqrt{\frac{1}{2Z}}\left(\hat{a} + \hat{a}^\dagger\right) \quad \text{and} \quad \hat{Q} = -i\sqrt{\frac{Z}{2}}\left(\hat{a} - \hat{a}^\dagger\right),
$$
where
$$
Z = \sqrt{\frac{2e^2 E_J\cos(\Phi_{\mathrm{dc}})}{E_C}}.
$$
Thus the Hamiltonian in terms of the creation and annihilation operators ↠and â can be written as
$$
\hat{H}_1 = \omega_r\hat{a}^\dagger\hat{a} - \frac{E_C}{96e^2}\left(\hat{a} + \hat{a}^\dagger\right)^4,
$$
$$
\hat{H}_2(t) = \frac{E_J}{2Z}\sin(\Phi_{\mathrm{dc}})\,\delta\Phi_{\mathrm{ac}}\cos(\Omega t)\left(\hat{a} + \hat{a}^\dagger\right)^2,
$$
where we have defined the resonator frequency
$$
\omega_r = \sqrt{\frac{2E_C E_J\cos(\Phi_{\mathrm{dc}})}{e^2}},
$$
and have furthermore dropped the constant ω_r/2, which simply corresponds to a constant energy shift. Next we
move to the interaction picture. To switch into the interaction picture we divide the Schrödinger-picture
Hamiltonian into two parts,
$$
\hat{H}_S = \hat{H}_{S,0} + \hat{H}_{S,1},
$$
where
$$
\hat{H}_{S,0} = \omega_r\hat{a}^\dagger\hat{a}
$$
and
$$
\hat{H}_{S,1} = -\frac{E_C}{96e^2}\left(\hat{a} + \hat{a}^\dagger\right)^4 + \frac{E_J}{2Z}\sin(\Phi_{\mathrm{dc}})\,\delta\Phi_{\mathrm{ac}}\cos(\Omega t)\left(\hat{a} + \hat{a}^\dagger\right)^2.
$$
The Hamiltonian in the interaction picture is defined as $\hat{H}_{\mathrm{int}} = e^{i\hat{H}_{S,0}t}\hat{H}_S\, e^{-i\hat{H}_{S,0}t}$, and we thus get
$$
\hat{H}_{\mathrm{int}} = \omega_r\hat{a}^\dagger\hat{a}
- \frac{E_C}{96e^2}\, e^{i\omega_r\hat{a}^\dagger\hat{a}t}\left(\hat{a} + \hat{a}^\dagger\right)^4 e^{-i\omega_r\hat{a}^\dagger\hat{a}t}
+ \frac{E_J}{2Z}\sin(\Phi_{\mathrm{dc}})\,\delta\Phi_{\mathrm{ac}}\cos(\Omega t)\, e^{i\omega_r\hat{a}^\dagger\hat{a}t}\left(\hat{a} + \hat{a}^\dagger\right)^2 e^{-i\omega_r\hat{a}^\dagger\hat{a}t}. \qquad (B.6)
$$
Next, normal ordering the terms yields
$$
:\!\left(\hat{a} + \hat{a}^\dagger\right)^4\!:\; = \hat{a}^{\dagger 4} + 6\hat{a}^{\dagger 2} + 4\hat{a}^{\dagger 3}\hat{a} + 6\hat{a}^{\dagger 2}\hat{a}^2 + 4\hat{a}^\dagger\hat{a}^3 + 12\hat{a}^\dagger\hat{a} + \hat{a}^4 + 6\hat{a}^2 + 3,
$$
$$
:\!\left(\hat{a} + \hat{a}^\dagger\right)^2\!:\; = \hat{a}^{\dagger 2} + 2\hat{a}^\dagger\hat{a} + \hat{a}^2 + 1.
$$
Under the unitary transformation $\hat{U}(t) = e^{-i\omega_r t\hat{a}^\dagger\hat{a}}$, each â and ↠transform according to
$$
e^{i\omega_r t\hat{a}^\dagger\hat{a}}\,\hat{a}\,e^{-i\omega_r t\hat{a}^\dagger\hat{a}} = \hat{a}\,e^{-i\omega_r t},
\qquad
e^{i\omega_r t\hat{a}^\dagger\hat{a}}\,\hat{a}^\dagger\,e^{-i\omega_r t\hat{a}^\dagger\hat{a}} = \hat{a}^\dagger\,e^{i\omega_r t}.
$$
Now we use the rotating wave approximation (RWA), which corresponds to neglecting all rotating terms
proportional to exp(iω_r t). In the second term of Eq. (B.6) the only non-rotating terms are $6\hat{a}^{\dagger 2}\hat{a}^2$ and
$12\hat{a}^\dagger\hat{a}$. The latter leads to a renormalization of the oscillator frequency by an amount $\Delta\omega_r = -E_C/(8e^2)$,
which we absorb into ω_r with an implicit redefinition of the resonator frequency. For the third and last
term in Eq. (B.6) we specifically consider a two-photon pump frequency that is close to twice the resonator
frequency, i.e. Ω ≃ 2ω_r. By writing cos(Ωt) in Euler form we find that the only non-rotating terms are â†²
and â². Putting it all together we have
$$
\hat{H}_{\mathrm{int}} = \omega_r\hat{a}^\dagger\hat{a} - \frac{E_C}{16e^2}\hat{a}^{\dagger 2}\hat{a}^2 + \frac{E_J}{4Z}\sin(\Phi_{\mathrm{dc}})\,\delta\Phi_{\mathrm{ac}}\left(\hat{a}^{\dagger 2} + \hat{a}^2\right).
$$
Switching back to the Schrödinger picture, we obtain the two-photon pumped Kerr-nonlinear resonator
Hamiltonian, where we have defined the Kerr-nonlinearity amplitude
$$
K \equiv \frac{E_C}{16e^2}
$$
and the two-photon pump amplitude
$$
G \equiv \frac{E_J}{4Z}\sin(\Phi_{\mathrm{dc}})\,\delta\Phi_{\mathrm{ac}}.
$$
where ω_r is the frequency of the resonator, ω_p(t) is the two-photon pump frequency, and the one-photon
pump frequency is chosen to be half the two-photon pump frequency. We transform to a frame rotating at
the single-photon pump frequency by performing the unitary transformation $\hat{U}(t) = e^{i\omega_p(t)t\,\hat{a}^\dagger\hat{a}/2}$.
Under a unitary transformation the state vector |ψ(t)⟩ transforms according to
$$
|\tilde{\psi}(t)\rangle = \hat{U}(t)|\psi(t)\rangle. \qquad (B.8)
$$
Substituting the transformed state vector into the Schrödinger equation and using Eq. (B.8) to replace |ψ(t)⟩ with $\hat{U}^\dagger(t)|\tilde{\psi}(t)\rangle$, we get
$$
i\frac{\partial|\tilde{\psi}(t)\rangle}{\partial t} = \left(i\frac{\partial\hat{U}(t)}{\partial t}\hat{U}^\dagger(t) + \hat{U}(t)\hat{H}(t)\hat{U}^\dagger(t)\right)|\tilde{\psi}(t)\rangle = \hat{\tilde{H}}(t)|\tilde{\psi}(t)\rangle,
$$
where
$$
\hat{\tilde{H}}(t) \equiv i\frac{\partial\hat{U}(t)}{\partial t}\hat{U}^\dagger(t) + \hat{U}(t)\hat{H}(t)\hat{U}^\dagger(t) \qquad (B.9)
$$
is the transformed Hamiltonian. We proceed by evaluating the first term in this expression,
$$
i\frac{\partial\hat{U}(t)}{\partial t}\hat{U}^\dagger(t) = -\frac{1}{2}\left(\omega_p(t) + \dot{\omega}_p(t)t\right)\hat{a}^\dagger\hat{a}\,\underbrace{\hat{U}(t)\hat{U}^\dagger(t)}_{1} = -\frac{1}{2}\left(\omega_p(t) + \dot{\omega}_p(t)t\right)\hat{a}^\dagger\hat{a},
$$
where the dot in $\dot{\omega}_p(t)$ denotes the time derivative. We then calculate the second term of Eq. (B.9) by invoking
the Baker-Hausdorff lemma,
$$
e^{A}ae^{-A} = a + [A, a] + \frac{1}{2!}[A, [A, a]] + \ldots,
$$
where A is a Hermitian operator. Applying this formula, each operator â and ↠transforms according to
$$
\hat{U}(t)\,\hat{a}\,\hat{U}^\dagger(t) = \hat{a}\left(1 - i\omega_p(t)t/2 + \frac{(-i\omega_p(t)t/2)^2}{2!} + \ldots\right) = \hat{a}\,e^{-i\omega_p(t)t/2},
$$
$$
\hat{U}(t)\,\hat{a}^\dagger\,\hat{U}^\dagger(t) = \hat{a}^\dagger\left(1 + i\omega_p(t)t/2 + \frac{(i\omega_p(t)t/2)^2}{2!} + \ldots\right) = \hat{a}^\dagger\,e^{i\omega_p(t)t/2},
$$
and thus the Hamiltonian in the rotating frame is given by (we can now drop the tilde)
$$
\hat{H}(t) = \left(\omega_r - \frac{1}{2}\left(\omega_p(t) + \dot{\omega}_p(t)t\right)\right)\hat{a}^\dagger\hat{a} - K\hat{a}^{\dagger 2}\hat{a}^2 + G(t)\left(\hat{a}^{\dagger 2} + \hat{a}^2\right) + F(t)\left(\hat{a}^\dagger + \hat{a}\right).
$$
the Hamiltonian in the rotating frame now reads
$$
\hat{H}(t) = -\left(1 - \frac{t}{\tau}\right)\delta\hat{a}^\dagger\hat{a} - K\hat{a}^{\dagger 2}\hat{a}^2 + \frac{t}{\tau}\,G\left(\hat{a}^{\dagger 2} + \hat{a}^2\right) + \frac{t}{\tau}\,F\left(\hat{a}^\dagger + \hat{a}\right)
$$
$$
= \left(1 - \frac{t}{\tau}\right)\left(-\delta\hat{a}^\dagger\hat{a} - K\hat{a}^{\dagger 2}\hat{a}^2\right)
+ \frac{t}{\tau}\left(-K\hat{a}^{\dagger 2}\hat{a}^2 + G\left(\hat{a}^{\dagger 2} + \hat{a}^2\right) + F\left(\hat{a}^\dagger + \hat{a}\right)\right) \qquad (B.10)
$$
$$
= \left(1 - \frac{t}{\tau}\right)\hat{H}_0 + \frac{t}{\tau}\hat{H}_1,
$$
with $\hat{H}_0 \equiv -\delta\hat{a}^\dagger\hat{a} - K\hat{a}^{\dagger 2}\hat{a}^2$ and $\hat{H}_1 \equiv -K\hat{a}^{\dagger 2}\hat{a}^2 + G\left(\hat{a}^{\dagger 2} + \hat{a}^2\right) + F\left(\hat{a}^\dagger + \hat{a}\right)$.
where P(α, α*, t) is a weight function, normalized such that Tr[ρ̂(t)] = 1. The integral of Eq. (B.11)
is formally known as the Glauber-Sudarshan P-representation of the state ρ̂(t). To derive an equation for
P(α, α*, t) we substitute the P-representation into the Lindblad master equation Eq. (11.5.2) and use
properties which can be derived from the definition of the coherent state. Next we perform integration by parts and
assume that the boundary terms at infinity vanish; this introduces a minus sign in front of each
differential operator. The above properties then show that we have the correspondences given in Eq. (B.12).
Substituting Eq. (B.11) into the master equation Eq. (11.5.2) and using the identities given by Eq. (B.12),
we get
$$
\frac{\partial P(\alpha, \alpha^*, t)}{\partial t} = \Bigg[-\frac{\partial}{\partial\alpha}\left(i2K\alpha^2\alpha^* - i2G\alpha^* - \frac{\gamma}{2}\alpha\right)
- \frac{\partial}{\partial\alpha^*}\left(-i2K\alpha^{*2}\alpha + i2G\alpha - \frac{\gamma}{2}\alpha^*\right)
+ \frac{\partial^2}{\partial\alpha^2}\left(iK\alpha^2 - iG\right) + \frac{\partial^2}{\partial\alpha^{*2}}\left(-iK\alpha^{*2} + iG\right)\Bigg] P(\alpha, \alpha^*, t). \qquad (B.13)
$$
This equation is of the form of a Fokker-Planck equation and can moreover be written as a stochastic
differential equation [Walls and Milburn, 2007],
$$
\frac{d\alpha}{dt} = i2\left(K\alpha^2\alpha^* - G\alpha^* + i\frac{\gamma}{4}\alpha\right) + \sqrt{i2\left(K\alpha^2 - G\right)}\,\eta(t),
$$
$$
\frac{d\alpha^*}{dt} = i2\left(-K\alpha^{*2}\alpha + G\alpha + i\frac{\gamma}{4}\alpha^*\right) + \sqrt{-i2\left(K\alpha^{*2} - G\right)}\,\eta^*(t). \qquad (B.14)
$$
It can be shown that this stochastic differential equation is indeed equivalent to the Fokker-Planck equation.
Here η(t), η*(t) are stochastic forces (Langevin terms) with zero mean. The semi-classical or mean-value
equations are obtained by taking the average; the Langevin terms then disappear, since ⟨η(t)⟩ = ⟨η*(t)⟩ = 0
[Walls and Milburn, 2007]. The semi-classical equations of motion thus are
$$
\frac{d\alpha}{dt} = i2\left(K\alpha^2\alpha^* - G\alpha^* + i\frac{\gamma}{4}\alpha\right),
$$
$$
\frac{d\alpha^*}{dt} = i2\left(-K\alpha^{*2}\alpha + G\alpha + i\frac{\gamma}{4}\alpha^*\right). \qquad (B.15)
$$
The steady state of the semi-classical equations of motion can now readily be obtained by setting dα/dt = 0
and solving the resulting equation. The equation $K\alpha^2\alpha^* - G\alpha^* + i\frac{\gamma}{4}\alpha = 0$ has three solutions, given by
$\alpha_1^s = 0$ and $\alpha_{2,3}^s = \pm r e^{i\vartheta}$, where
$$
r = \frac{1}{2}\left(\frac{16G^2 - \gamma^2}{K^2}\right)^{1/4}
\quad \text{and} \quad
\vartheta = -\frac{1}{2}\arctan\left(\frac{\gamma}{\sqrt{16G^2 - \gamma^2}}\right),
$$
which reduces to $\alpha_{2,3}^s \simeq \pm\sqrt{G/K}$ in the case 4G ≫ γ.
To investigate the stability of these solutions we linearise Eq. (B.15) around the steady states, writing
$\alpha_i(t) = \alpha_i^s + \delta\alpha_i(t)$, where $\alpha_i^s$ is the i-th steady-state solution of Eq. (B.15) and $\delta\alpha_i$ is a small perturbation.
The linearized equations written in matrix form are
$$
\frac{d}{dt}\begin{pmatrix}\delta\alpha_i(t)\\ \delta\alpha_i^*(t)\end{pmatrix}
\simeq i2\begin{pmatrix} 2K|\alpha_i^s|^2 + i\gamma/4 & K(\alpha_i^s)^2 - G\\ -K(\alpha_i^{s*})^2 + G & -2K|\alpha_i^s|^2 + i\gamma/4 \end{pmatrix}
\begin{pmatrix}\delta\alpha_i(t)\\ \delta\alpha_i^*(t)\end{pmatrix}.
$$
Stability of a fixed point requires all eigenvalues to have a real part less than or equal to zero [Walls
and Milburn, 2007]. We begin by examining the first steady-state solution, the origin $\alpha_1^s = 0$; it has
eigenvalues $\lambda_\pm = -\gamma/2 \pm 2G$, so if 4G > γ, then the origin is an unstable solution. The eigenvalues of the
second and third steady-state solutions $\alpha_{2,3}^s = \pm r e^{i\vartheta}$ are most easily analyzed in the case 4G ≫ γ, where
both steady-state solutions have the degenerate eigenvalue −γ/2, and are thus stable solutions.
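The relaxation towards these steady states can be illustrated by integrating Eq. (B.15) numerically. The sketch below (illustrative parameters, SciPy) propagates a couple of arbitrary initial conditions and checks that |α| approaches √(G/K) when 4G ≫ γ:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Sketch: integrate the semi-classical equation of motion (B.15) and check relaxation
# towards |alpha| = sqrt(G/K). Parameters are illustrative assumptions.
K, G, gamma = 1.0, 3.0, 0.1

def eom(t, y):
    alpha = y[0] + 1j * y[1]
    dalpha = 2j * (K * alpha**2 * np.conj(alpha) - G * np.conj(alpha)
                   + 1j * gamma / 4 * alpha)
    return [dalpha.real, dalpha.imag]

for a0 in (0.3 + 0.2j, -0.5 - 0.1j):
    sol = solve_ivp(eom, (0.0, 200.0), [a0.real, a0.imag], rtol=1e-8, atol=1e-10)
    alpha_final = sol.y[0, -1] + 1j * sol.y[1, -1]
    print("alpha(0) =", a0, " |alpha(T)| =", abs(alpha_final),
          " sqrt(G/K) =", np.sqrt(G / K))
```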
B.4 Effect of single-photon pump
In this section we will study the effect that is obtained when a single-photon pump is added to the two-photon
pumped KNR. For the moment we will neglect the effect of single-photon loss, since it has already
been treated in Appendix B.3. The Hamiltonian for the two- and one-photon pumped KNR, written in a frame
rotating at the resonator frequency, is given by Eq. (11.80), i.e.
$\hat{H} = -K\hat{a}^{\dagger 2}\hat{a}^2 + G\left(\hat{a}^{\dagger 2} + \hat{a}^2\right) + F\left(\hat{a}^\dagger + \hat{a}\right)$.
In the Heisenberg picture the time evolution of the annihilation operator is given by
$$
\frac{d\hat{a}}{dt} = i\left[\hat{H}, \hat{a}\right] = i\left(2K\hat{a}^\dagger\hat{a}^2 - 2G\hat{a}^\dagger - F\right).
$$
To obtain the semi-classical equation of motion we take the average ⟨â⟩ = α, which amounts to simply
replacing â with the complex variable α,
$$
\frac{d\alpha}{dt} = i\left(2K\alpha^*\alpha^2 - 2G\alpha^* - F\right).
$$
The steady state of this equation has three solutions, of the form $\alpha_1 = (-2\epsilon, 0)$ and $\alpha_{2,3} = (\pm\alpha + \epsilon, 0)$,
where $\alpha = \sqrt{G/K}$ and $\epsilon = F/4G$. Hence, if 4G ≫ F so that ε → 0, then $\alpha_{2,3} \simeq \pm\sqrt{G/K}$. To
investigate the stability of the steady-state solutions we perform a small perturbation around the fixed points. The
linearized equations of motion are
$$
\frac{d}{dt}\begin{pmatrix}\delta\alpha\\ \delta\alpha^*\end{pmatrix}
\simeq i2\begin{pmatrix} 2K|\alpha_i^s|^2 & K(\alpha_i^s)^2 - G\\ -K(\alpha_i^{s*})^2 + G & -2K|\alpha_i^s|^2 \end{pmatrix}
\begin{pmatrix}\delta\alpha\\ \delta\alpha^*\end{pmatrix}, \qquad (B.16)
$$
which has the eigenvalues
$$
\lambda_\pm = \pm 2\,\mathrm{Im}\sqrt{4K^2|\alpha_i^s|^4 - \left|G - K(\alpha_i^s)^2\right|^2}. \qquad (B.17)
$$
We begin by examining the eigenvalues for the first steady-state solution $\alpha_1^s = -2\epsilon$. It has eigenvalues of
the form $\lambda_\pm \simeq \pm 2\,\mathrm{Im}\sqrt{KF^2/(2G) - G^2}$. If the expression inside the square root is negative, then $\alpha_1^s$
will have an eigenvalue with a positive real part. This happens if $F < \sqrt{2}\,K\alpha^3$. The eigenvalues of the two
other solutions $\alpha_2^s$ and $\alpha_3^s$ are both $\lambda_\pm = 0$, and these solutions are therefore stable.
The Heisenberg equations of motion tell us how the operators evolve in time:
$$
\frac{d\hat{a}_1}{dt} = i\left[\hat{H}, \hat{a}_1\right] = i\left(2K\hat{a}_1^\dagger\hat{a}_1^2 - 2G\hat{a}_1^\dagger - g\hat{a}_2\right),
$$
$$
\frac{d\hat{a}_2}{dt} = i\left[\hat{H}, \hat{a}_2\right] = i\left(2K\hat{a}_2^\dagger\hat{a}_2^2 - 2G\hat{a}_2^\dagger - g\hat{a}_1\right).
$$
Once again the mean-field or semi-classical equations are obtained by replacing â₁ → α and â₂ → β, where
α and β are two complex variables. We thus get that the semi-classical equations of motion for two coupled
KNRs are
$$
\frac{d\alpha}{dt} = i\left(2K\alpha^*\alpha^2 - 2G\alpha^* - g\beta\right),
$$
$$
\frac{d\beta}{dt} = i\left(2K\beta^*\beta^2 - 2G\beta^* - g\alpha\right).
$$
The solutions to the steady-state equations are
and the error is bounded between $0 \le \epsilon \le 1 - e^{-|\alpha|^2}$. To have an error that is bounded between zero and
one we have to divide by $1 - e^{-|\alpha|^2}$, such that
$$
\epsilon = \left(1 - e^{-|\alpha|^2}\sum_{n=0}^{x}\frac{|\alpha|^{2n}}{n!}\right)\Big/\left(1 - e^{-|\alpha|^2}\right).
$$
Now, if x = 16 and $|\alpha| = \sqrt{3}$, we find that the numerical error is approximately ε ≈ 10⁻⁸, which is a
considerably small numerical error.
B.7 Generation of cat states using a two-photon pumped KNR
The two-photon pumped KNR can be used as a means to generate Schrödinger cat states. Cat states are
superpositions of two coherent states with opposite phase, defined by
$$
|\mathcal{C}_\alpha^\pm\rangle = \frac{|\alpha\rangle \pm |-\alpha\rangle}{\sqrt{2\left(1 \pm e^{-2|\alpha|^2}\right)}},
$$
where the denominator is a normalization factor. $|\mathcal{C}_\alpha^+\rangle$ is called an even cat state because,
when written in the Fock basis, it only contains Fock states with even photon numbers, and $|\mathcal{C}_\alpha^-\rangle$ is called an odd
cat state because, when written in the Fock basis, it only contains Fock states with odd photon numbers. Cat
states are of interest because they have the potential to be used as logical states for universal quantum
computation [Puri et al., 2017], and they hold potential for error correction against single-photon losses (see
experimental work from the Yale group). To generate a cat state, it can first be noted that, since the two coherent
states $|\pm\alpha\rangle = |\pm\sqrt{G/K}\rangle$ are two degenerate eigenstates of the two-photon pumped KNR Eq. (11.79), so is
any superposition of these states. It is easily checked that the even and odd cat states with $\alpha = \sqrt{G/K}$
are also two degenerate eigenstates of the two-photon pumped KNR Eq. (11.79), with eigenenergy G²/K.
Second, to generate a cat state, the detuning and single-photon pump are set to zero in the time-dependent
Hamiltonian Eq. (11.82). The vacuum and single-photon Fock states are then degenerate eigenstates of the
initial Hamiltonian. Under adiabatic evolution the zero-photon Fock state will evolve into the even cat state,
since parity is conserved, and the single-photon Fock state will evolve into the odd cat state for the same
reason.
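That the cat states inherit the eigenenergy G²/K can be verified numerically with a few lines of QuTiP (the Fock truncation and parameters are assumptions):

```python
import numpy as np
import qutip as qt

# Sketch: the even and odd cat states built from |+/-alpha>, alpha = sqrt(G/K), should be
# (near-)degenerate eigenstates of Eq. (11.79) with eigenvalue G^2/K.
N, K, G = 40, 1.0, 3.0       # Fock truncation and parameters (assumptions)
alpha = np.sqrt(G / K)

a = qt.destroy(N)
H = -K * a.dag()**2 * a**2 + G * (a.dag()**2 + a**2)

for sign, name in [(+1, "even"), (-1, "odd")]:
    cat = (qt.coherent(N, alpha) + sign * qt.coherent(N, -alpha)).unit()
    residual = (H * cat - (G**2 / K) * cat).norm()   # ~0 up to truncation error
    print(name, "cat residual norm:", residual)
```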
Numerical simulations show that, for G = 3K and τ = 158 as given by the adiabatic condition, if the
resonator is initialized to |0⟩ (|1⟩) it will evolve into the even (odd) cat state with 100% fidelity,
within the numerical error. Figure B.2 (a) and (b) show the Wigner functions of the even and odd cat states. The
Wigner negativity that appears between the two Gaussian peaks consists of interference fringes, sometimes
referred to as the cat's whiskers. These interference fringes are very characteristic of a cat state. When the
amplitude |α| is small, the states are often referred to as “kitten” states. For example, an even cat (“kitten”)
state with $|\alpha| = \sqrt{1/2}$ is shown in Fig. B.2 (c).
Figure B.2: (a) Even cat state. (b) Odd cat state.
Bibliography
[Aaronson, 2005] Aaronson, S. (2005). Quantum computing, postselection, and probabilistic polynomial-
time. Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences,
461:3473.
[Aaronson, 2015] Aaronson, S. (2015). Read the fine print. Nat. Phys., 11:291.
[Aaronson, 2018] Aaronson, S. (2018). Lecture Notes for Intro to Quantum Information Science.
[Aaronson and Arkhipov, 2013] Aaronson, S. and Arkhipov, A. (2013). The computational complexity of
linear optics. Theory of Computing, 9:143.
[Albash and Lidar, 2018] Albash, T. and Lidar, D. A. (2018). Adiabatic quantum computation. Reviews of
Modern Physics, 90(1):015002.
[Allcock and Zhang, 2019] Allcock, J. and Zhang, S. (2019). Quantum machine learning. Natl. Sci. Rev.,
6:26.
[Amin et al., 2018] Amin, M. H., Andriyash, E., Rolfe, J., Kulchytskyy, B., and Melko, R. (2018). Quantum
Boltzmann Machine. Phys. Rev. X, 8:021050.
[Arute et al., 2019] Arute, F., Arya, K., Babbush, R., Bacon, D., Bardin, J. C., Barends, R., Biswas, R.,
Boixo, S., Brandao, F. G. S. L., Buell, D. A., Burkett, B., Chen, Y., Chen, Z., Chiaro, B., Collins, R.,
Courtney, W., Dunsworth, A., Farhi, E., Foxen, B., Fowler, A., Gidney, C., Giustina, M., Graff, R.,
Guerin, K., Habegger, S., Harrigan, M. P., Hartmann, M. J., Ho, A., Hoffmann, M., Huang, T., Humble,
T. S., Isakov, S. V., Jeffrey, E., Jiang, Z., Kafri, D., Kechedzhi, K., Kelly, J., Klimov, P. V., Knysh, S.,
Korotkov, A., Kostritsa, F., Landhuis, D., Lindmark, M., Lucero, E., Lyakh, D., Mandrà, S., McClean,
J. R., McEwen, M., Megrant, A., Mi, X., Michielsen, K., Mohseni, M., Mutus, J., Naaman, O., Neeley,
M., Neill, C., Niu, M. Y., Ostby, E., Petukhov, A., Platt, J. C., Quintana, C., Rieffel, E. G., Roushan, P.,
Rubin, N. C., Sank, D., Satzinger, K. J., Smelyanskiy, V., Sung, K. J., Trevithick, M. D., Vainsencher,
A., Villalonga, B., White, T., Yao, Z. J., Yeh, P., Zalcman, A., Neven, H., and Martinis, J. M. (2019).
Quantum supremacy using a programmable superconducting processor. Nature, 574(7779):505–510.
[Barak et al., 2015] Barak, B., Moitra, A., O’Donnell, R., Raghavendra, P., Regev, O., Steurer, D., Trevisan,
L., Vijayaraghavan, A., Witmer, D., and Wright, J. (2015). Beating the random assignment on constraint
satisfaction problems of bounded degree.
[Bartlett et al., 2002] Bartlett, S. D., Sanders, B. C., Braunstein, S. L., and Nemoto, K. (2002). Efficient
classical simulation of continuous variable quantum information processes. Phys. Rev. Lett., 88:9.
[Bartolo et al., 2016] Bartolo, N., Minganti, F., Casteels, W., and Ciuti, C. (2016). Exact steady state of
a kerr resonator with one-and two-photon driving and dissipation: Controllable wigner-function multi-
modality and dissipative phase transitions. Physical Review A, 94(3):033841.
[Bouland et al., 2019] Bouland, A., Fefferman, B., Nirkhe, C., and Vazirani, U. (2019). On the complexity
and verification of quantum random circuit sampling. Nature Physics, 15(2):159–163.
[Brandao et al., 2018] Brandao, F. G. S. L., Broughton, M., Farhi, E., Gutmann, S., and Neven, H. (2018).
For Fixed Control Parameters the Quantum Approximate Optimization Algorithm’s Objective Function
Value Concentrates for Typical Instances.
[Bremner et al., 2010] Bremner, M. J., Jozsa, R., and Shepherd, D. (2010). Classical simulation of commut-
ing quantum computations implies collapse of the polynomial hierarchy. Proc. R. Soc. A, 459:459.
[Bremner et al., 2016] Bremner, M. J., Montanaro, A., and Shepherd, D. J. (2016). Average-case com-
plexity versus approximate simulation of commuting quantum computations. Physical review letters,
117(8):080501.
[Briegel et al., 2009] Briegel, H. J., Browne, D. E., Dür, W., Raussendorf, R., and Van den Nest, M. (2009).
Measurement-based quantum computation. Nat. Phys., 5:19.
[Campagne-Ibarcq et al., 2020] Campagne-Ibarcq, P., Eickbusch, A., Touzard, S., Zalys-Geller, E., Frattini,
N., Sivak, V., Reinhold, P., Puri, S., Shankar, S., Schoelkopf, R., et al. (2020). Quantum error correction
of a qubit encoded in grid states of an oscillator. Nature, 584(7821):368–372.
[Chabaud et al., 2017] Chabaud, U., Douce, T., Markham, D., van Loock, P., Kashefi, E., and Ferrini, G.
(2017). Continuous-variable sampling from photon-added or photon-subtracted squeezed states. Physical
Review A, 96:062307.
[Chakhmakhchyan and Cerf, 2017] Chakhmakhchyan, L. and Cerf, N. J. (2017). Boson sampling with gaus-
sian measurements. Physical Review A, 96(3):032326.
[Choi, 2008] Choi, V. (2008). Minor-embedding in adiabatic quantum computation: I. the parameter setting
problem. Quantum Information Processing, 7(5):193–209.
[Choi, 2011] Choi, V. (2011). Minor-embedding in adiabatic quantum computation: Ii. minor-universal
graph design. Quantum Information Processing, 10(3):343–353.
[Cong et al., 2019] Cong, I., Choi, S., and Lukin, M. D. (2019). Quantum convolutional neural networks.
Nat. Phys., 15:1273.
[Dieks, 1982] Dieks, D. (1982). Communication by EPR devices. Physics Letters A, 92:271.
[Douce et al., 2017] Douce, T., Markham, D., Kashefi, E., Diamanti, E., Coudreau, T., Milman, P., van
Loock, P., and Ferrini, G. (2017). Continuous-variable instantaneous quantum computing is hard to
sample. Phys. Rev. Lett., 118:070503.
[Douce et al., 2019] Douce, T., Markham, D., Kashefi, E., van Loock, P., and Ferrini, G. (2019). Probabilistic
fault-tolerant universal quantum computation and sampling problems in continuous variables. Physical
Review A, 99:012344.
[Farhi et al., 2014] Farhi, E., Goldstone, J., and Gutmann, S. (2014). A quantum approximate optimization
algorithm. arXiv preprint arXiv:1411.4028.
[Farhi et al., 2001] Farhi, E., Goldstone, J., Gutmann, S., Lapan, J., Lundgren, A., and Preda, D. (2001).
A Quantum Adiabatic Evolution Algorithm Applied to Random Instances of an NP-Complete Problem.
Science, 292:472.
[Farhi et al., 2017] Farhi, E., Goldstone, J., Gutmann, S., and Neven, H. (2017). Quantum algorithms for
fixed qubit architectures. arXiv preprint arXiv:1703.06199.
[Farhi and Harrow, 2019] Farhi, E. and Harrow, A. W. (2019). Quantum supremacy through the quantum
approximate optimization algorithm.
[Farhi and Neven, 2018] Farhi, E. and Neven, H. (2018). Classification with Quantum Neural Networks on
Near Term Processors.
[Flühmann et al., 2018] Flühmann, C., Negnevitsky, V., Marinelli, M., and Home, J. P. (2018). Sequential
modular position and momentum measurements of a trapped ion mechanical oscillator. Phys. Rev. X,
8:021001.
[Fowler et al., 2012] Fowler, A. G., Mariantoni, M., Martinis, J. M., and Cleland, A. N. (2012). Surface
codes: Towards practical large-scale quantum computation. Physical Review A, 86:032324.
[García-Álvarez et al., 2020] García-Álvarez, L., Calcluth, C., Ferraro, A., and Ferrini, G. (2020). Efficient
simulatability of continuous-variable circuits with large wigner negativity. Phys. Rev. Research, 2:043322.
[Gerry et al., 2005] Gerry, C., Knight, P., and Knight, P. L. (2005). Introductory quantum optics. Cambridge
university press.
[Glancy and Knill, 2006] Glancy, S. and Knill, E. (2006). Error analysis for encoding a qubit in an oscillator.
Phys. Rev. A, 73:012325.
[Glauber, 1963] Glauber, R. J. (1963). Coherent and incoherent states of the radiation field. Physical Review,
131(6):2766.
[Gottesman et al., 2001] Gottesman, D., Kitaev, A., and Preskill, J. (2001). Encoding a qubit in an oscillator.
Phys. Rev. A, 64:012310.
[Gu et al., 2009] Gu, M., Weedbrook, C., Menicucci, N. C., Ralph, T. C., and van Loock, P. (2009). Quantum
computing with continuous-variable clusters. Phys. Rev. A, 79:062318.
[Hadfield et al., 2019] Hadfield, S., Wang, Z., O’Gorman, B., Rieffel, E., Venturelli, D., and Biswas, R.
(2019). From the Quantum Approximate Optimization Algorithm to a Quantum Alternating Operator
Ansatz. Algorithms, 12(2):34.
[Halperin et al., 2004] Halperin, E., Livnat, D., and Zwick, U. (2004). Max cut in cubic graphs. Journal of
Algorithms, 53(2):169–185.
[Hamilton et al., 2017] Hamilton, C. S., Kruse, R., Sansoni, L., Barkhofen, S., Silberhorn, C., and Jex, I.
(2017). Gaussian boson sampling. Physical review letters, 119(17):170501.
[Harrow et al., 2009] Harrow, A. W., Hassidim, A., and Lloyd, S. (2009). Quantum Algorithm for Linear
Systems of Equations. Phys. Rev. Lett., 103:150502.
[Harrow and Montanaro, 2017] Harrow, A. W. and Montanaro, A. (2017). Quantum computational
supremacy. Nature, 549(7671):203.
[Hauke et al., 2020] Hauke, P., Katzgraber, H. G., Lechner, W., Nishimori, H., and Oliver, W. D. (2020).
Perspectives of quantum annealing: Methods and implementations. Reports on Progress in Physics,
83(5):054401.
[Hillmann et al., 2020] Hillmann, T., Quijandría, F., Johansson, G., Ferraro, A., Gasparinetti, S., and Fer-
rini, G. (2020). Universal gate set for continuous-variable quantum computation with microwave circuits.
Phys. Rev. Lett., 125:160501.
[Horodecki et al., 2006] Horodecki, P., Bruß, D., and Leuchs, G. (2006). Lectures on quantum information.
[Huh et al., 2015] Huh, J., Guerreschi, G. G., Peropadre, B., McClean, J. R., and Aspuru-Guzik, A. (2015).
Boson sampling for molecular vibronic spectra. Nature Photonics, 9(9):615–620.
[Jiang et al., 2017] Jiang, Z., Rieffel, E. G., and Wang, Z. (2017). Near-optimal quantum circuit for Grover’s
unstructured search using a transverse field. Physical Review A, 95(6):062317.
[Kandala et al., 2017] Kandala, A., Mezzacapo, A., Temme, K., Takita, M., Brink, M., Chow, J. M., and
Gambetta, J. M. (2017). Hardware-efficient variational quantum eigensolver for small molecules and
quantum magnets. Nature, 549:242.
[Kirchmair et al., 2013] Kirchmair, G., Vlastakis, B., Leghtas, Z., Nigg, S. E., Paik, H., Ginossar, E., Mir-
rahimi, M., Frunzio, L., Girvin, S. M., and Schoelkopf, R. J. (2013). Observation of quantum state collapse
and revival due to the single-photon kerr effect. Nature, 495(7440):205.
[Kockum, 2014] Kockum, A. F. (2014). Quantum optics with artificial atoms. PhD thesis, Chalmers Uni-
versity of Technology.
[Kockum and Nori, 2019] Kockum, A. F. and Nori, F. (2019). Quantum Bits with Josephson Junctions. In
Tafuri, F., editor, Fundamentals and Frontiers of the Josephson Effect, pages 703–741. Springer.
[Kuperberg, 2015] Kuperberg, G. (2015). How hard is it to approximate the jones polynomial? Theory of
Computing, 11:183.
[Lechner et al., 2015] Lechner, W., Hauke, P., and Zoller, P. (2015). A quantum annealing architecture with
all-to-all connectivity from local interactions. Science advances, 1(9):e1500838.
[Leonhardt, 1997] Leonhardt, U. (1997). Measuring the Quantum State of Light. Cambridge University
Press, New York, NY, USA, 1st edition.
[Leonhardt and Paul, 1993] Leonhardt, U. and Paul, H. (1993). Realistic optical homodyne measurements
and quasiprobability distributions. Phys. Rev. A, 48:4598.
[Li et al., 2020] Li, Y., Chen, M., Chen, Y., Lu, H., Gan, L., Lu, C., Pan, J., Fu, H., and Yang, G.
(2020). Benchmarking 50-photon gaussian boson sampling on the sunway taihulight. arXiv preprint
arXiv:2009.01177.
[Lin et al., 2017] Lin, H. W., Tegmark, M., and Rolnick, D. (2017). Why Does Deep and Cheap Learning
Work So Well? J. Stat. Phys., 168:1223.
[Lloyd and Braunstein, 1999] Lloyd, S. and Braunstein, S. L. (1999). Quantum computation over continuous
variables. Phys. Rev. Lett., 82:1784.
[Lloyd et al., 2014] Lloyd, S., Mohseni, M., and Rebentrost, P. (2014). Quantum principal component anal-
ysis. Nat. Phys., 10:631.
[Lucas, 2014] Lucas, A. (2014). Ising formulations of many np problems. Frontiers in Physics, 2:5.
[Lund et al., 2017] Lund, A., Bremner, M. J., and Ralph, T. (2017). Quantum sampling problems, boson-
sampling and quantum supremacy. npj Quantum Information, 3(1):15.
[Lund et al., 2014] Lund, A. P., Rahimi-Keshari, S., Rudolph, T., O'Brien, J. L., and Ralph, T. C. (2014).
Boson sampling from a gaussian state. Phys. Rev. Lett., 113:100502.
[Mari and Eisert, 2012] Mari, A. and Eisert, J. (2012). Positive wigner functions render classical simulation
of quantum computation efficient. Phys. Rev. Lett., 109:230503.
[Meaney et al., 2014] Meaney, C. H., Nha, H., Duty, T., and Milburn, G. J. (2014). Quantum and classical
nonlinear dynamics in a microwave cavity. EPJ Quantum Technology, 1(1):7.
[Menicucci et al., 2011] Menicucci, N. C., Flammia, S. T., and van Loock, P. (2011). Graphical calculus for
gaussian pure states. Physical Review A, 83(4):042335.
[Meystre and Sargent, 2007] Meystre, P. and Sargent, M. (2007). Elements of quantum optics. Springer
Science & Business Media.
[Moll et al., 2018] Moll, N., Barkoutsos, P., Bishop, L. S., Chow, J. M., Cross, A., Egger, D. J., Filipp, S.,
Fuhrer, A., Gambetta, J. M., Ganzhorn, M., Kandala, A., Mezzacapo, A., Müller, P., Riess, W., Salis, G.,
Smolin, J., Tavernelli, I., and Temme, K. (2018). Quantum optimization using variational algorithms on
near-term quantum devices. Quantum Sci. Technol., 3:030503.
[Morales et al., 2019] Morales, M. E. S., Biamonte, J., and Zimboras, Z. (2019). On the universality of the
quantum approximate optimization algorithm.
[Moylett et al., 2019] Moylett, A. E., García-Patrón, R., Renema, J. J., and Turner, P. S. (2019). Clas-
sically simulating near-term partially-distinguishable and lossy boson sampling. Quantum Science and
Technology, 5(1):015001.
[Neville et al., 2017] Neville, A., Sparrow, C., Clifford, R., Johnston, E., Birchall, P. M., Montanaro, A.,
and Laing, A. (2017). Classical boson sampling algorithms with superior performance to near-term ex-
periments. Nature Physics, 13(12):1153–1157.
[Nielsen, 2015] Nielsen, M. (2015). Neural Networks and Deep Learning. Determination Press.
[Nielsen and Chuang, 2000] Nielsen, M. A. and Chuang, I. L. (2000). Quantum Computation and Quantum
Information. Cambridge University Press.
[Nielsen and Chuang, 2011] Nielsen, M. A. and Chuang, I. L. (2011). Quantum Computation and Quantum
Information: 10th Anniversary Edition. Cambridge University Press, New York, NY, USA, 10th edition.
[Nigg et al., 2017] Nigg, S. E., Lörch, N., and Tiwari, R. P. (2017). Robust quantum optimizer with full connectivity. Science Advances, 3(4):e1602273.
[Niu et al., 2019] Niu, M. Y., Lu, S., and Chuang, I. L. (2019). Optimizing QAOA: Success Probability and
Runtime Dependence on Circuit Depth.
[Ofek et al., 2016] Ofek, N., Petrenko, A., Heeres, R., Reinhold, P., Leghtas, Z., Vlastakis, B., Liu, Y.,
Frunzio, L., Girvin, S., Jiang, L., et al. (2016). Extending the lifetime of a quantum bit with error
correction in superconducting circuits. Nature, 536(7617):441–445.
[Paris et al., 2003] Paris, M. G. A., Cola, M., and Bonifacio, R. (2003). Quantum-state engineering assisted
by entanglement. Phys. Rev. A, 67:042104.
[Pednault et al., 2019] Pednault, E., Gunnels, J. A., Nannicini, G., Horesh, L., and Wisnieff, R. (2019). Leveraging secondary storage to simulate deep 54-qubit Sycamore circuits.
[Puri et al., 2017] Puri, S., Andersen, C. K., Grimsmo, A. L., and Blais, A. (2017). Quantum annealing
with all-to-all connected nonlinear oscillators. Nature communications, 8:15785.
[Qi et al., 2020] Qi, H., Brod, D. J., Quesada, N., and García-Patrón, R. (2020). Regimes of classical simulability for noisy Gaussian boson sampling. Phys. Rev. Lett., 124:100502.
[Rahimi-Keshari et al., 2016] Rahimi-Keshari, S., Ralph, T. C., and Caves, C. M. (2016). Sufficient conditions for efficient classical simulation of quantum optics. Phys. Rev. X, 6:021039.
[Raussendorf and Briegel, 2001] Raussendorf, R. and Briegel, H. J. (2001). A One-Way Quantum Computer.
Phys. Rev. Lett., 86:5188.
[Raussendorf et al., 2003] Raussendorf, R., Browne, D. E., and Briegel, H. J. (2003). Measurement-based
quantum computation on cluster states. Phys. Rev. A, 68:022312.
[Rebentrost et al., 2014] Rebentrost, P., Mohseni, M., and Lloyd, S. (2014). Quantum Support Vector
Machine for Big Data Classification. Phys. Rev. Lett., 113:130503.
[Rodríguez-Laguna and Santalla, 2018] Rodríguez-Laguna, J. and Santalla, S. N. (2018). Building an adia-
batic quantum computer simulation in the classroom. American Journal of Physics, 86(5):360–367.
[Scheel, 2004] Scheel, S. (2004). Permanents in linear optical networks. arXiv preprint quant-ph/0406127.
[Shor, 1995] Shor, P. W. (1995). Scheme for reducing decoherence in quantum computer memory. Physical
Review A, 52:R2493.
[Spring et al., 2013] Spring, J. B., Metcalf, B. J., Humphreys, P. C., Kolthammer, W. S., Jin, X.-M., Bar-
bieri, M., Datta, A., Thomas-Peter, N., Langford, N. K., Kundys, D., Gates, J. C., Smith, B. J., Smith,
P. G. R., and Walmsley, I. A. (2013). Boson sampling on a photonic chip. Science, 339:798.
[Stollenwerk et al., 2019] Stollenwerk, T., Lobe, E., and Jung, M. (2019). Flight gate assignment with a
quantum annealer. In International Workshop on Quantum Technology and Optimization Problems, pages
99–110. Springer.
[Tang, 2018] Tang, E. (2018). Quantum-inspired classical algorithms for principal component analysis and
supervised clustering.
[Tanikic and Despotovic, 2012] Tanikic, D. and Despotovic, V. (2012). Artificial Intelligence Techniques for
Modelling of Temperature in the Metal Cutting Process. In Metall. - Adv. Mater. Process. InTech.
[Terhal and DiVincenzo, 2002] Terhal, B. M. and DiVincenzo, D. P. (2002). Adaptive quantum computation, constant depth quantum circuits and Arthur-Merlin games. arXiv preprint quant-ph/0205133.
[Ukai et al., 2010] Ukai, R., Yoshikawa, J.-i., Iwata, N., van Loock, P., and Furusawa, A. (2010). Universal linear Bogoliubov transformations through one-way quantum computation. Physical Review A, 81(3):032315.
[Vikstål, 2018] Vikstål, P. (2018). Continuous-variable quantum annealing with superconducting circuits. Master's thesis, Chalmers University of Technology.
[Vikstål et al., 2020] Vikstål, P., Grönkvist, M., Svensson, M., Andersson, M., Johansson, G., and Ferrini,
G. (2020). Applying the quantum approximate optimization algorithm to the tail-assignment problem.
Physical Review Applied, 14(3):034009.
[Walls and Milburn, 2007] Walls, D. F. and Milburn, G. J. (2007). Quantum optics. Springer Science &
Business Media.
[Walther et al., 2005] Walther, P., Resch, K. J., Rudolph, T., Schenck, E., Weinfurter, H., Vedral, V.,
Aspelmeyer, M., and Zeilinger, A. (2005). Experimental one-way quantum computing. Nature, 434:169.
[Wang et al., 2019] Wang, H., Qin, J., Ding, X., Chen, M.-C., Chen, S., You, X., He, Y.-M., Jiang, X., You, L., Wang, Z., et al. (2019). Boson sampling with 20 input photons and a 60-mode interferometer in a $10^{14}$-dimensional Hilbert space. Physical Review Letters, 123(25):250503.
[Watrous, 2009] Watrous, J. (2009). Encyclopedia of Complexity and Systems Science, chapter Quantum
Computational Complexity, pages 7174–7201. Springer New York, New York, NY.
[Wendin, 2017] Wendin, G. (2017). Quantum information processing with superconducting circuits: a re-
view. Reports Prog. Phys., 80:106001.
[Willsch et al., 2020] Willsch, M., Willsch, D., Jin, F., De Raedt, H., and Michielsen, K. (2020). Bench-
marking the quantum approximate optimization algorithm. Quantum Information Processing, 19:197.
[Wootters and Zurek, 1982] Wootters, W. K. and Zurek, W. H. (1982). A single quantum cannot be cloned.
Nature, 299:802.
[Wurtz and Love, 2020] Wurtz, J. and Love, P. J. (2020). Bounds on MaxCut QAOA performance for p > 1. arXiv preprint arXiv:2010.11209.
[Zhong et al., 2020] Zhong, H.-S., Wang, H., Deng, Y.-H., Chen, M.-C., Peng, L.-C., Luo, Y.-H., Qin, J.,
Wu, D., Ding, X., Hu, Y., Hu, P., Yang, X.-Y., Zhang, W.-J., Li, H., Li, Y., Jiang, X., Gan, L., Yang,
G., You, L., Wang, Z., Li, L., Liu, N.-L., Lu, C.-Y., and Pan, J.-W. (2020). Quantum computational
advantage using photons. Science.