0% found this document useful (0 votes)
6 views146 pages

Full Lecture Notes on Quantum Algorithms

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
6 views146 pages

Full Lecture Notes on Quantum Algorithms

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 146

Advanced Quantum Algorithms

Giulia Ferrini, Anton Frisk Kockum, Laura García-Álvarez, Pontus Vikstål

December 29, 2020


Contents

1 The circuit model for quantum computation 5


1.1 Components of the circuit model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2 Quantum bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3 Single-qubit gates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.4 Multi-qubit gates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.5 Universal quantum computation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

2 Comparing quantum and classical computers 11


2.1 The Solovay–Kitaev theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2 Complexity classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2.1 Complexity classes for a deterministic Turing machine . . . . . . . . . . . . . . . . . . 12
2.2.2 Complexity classes for a probabilistic Turing machine . . . . . . . . . . . . . . . . . . 13
2.2.3 Complexity classes for a quantum Turing machine . . . . . . . . . . . . . . . . . . . . 13
2.2.4 Summary of the complexity classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

3 Quantum error correction 16


3.1 Challenges for quantum error correction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.2 The three-qubit bit-flip code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.3 The three-qubit phase-flip code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.4 The nine-qubit Shor code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.5 Stabilizers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.6 Proving the Gottesman–Knill theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.7 Fault-tolerant quantum computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.8 The surface code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

4 Fast Quantum Algorithms 23


4.1 The Quantum Fourier Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.1.1 Another definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.1.2 An efficient implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.2 Phase estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.3 Factoring - Shor’s algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.3.1 Modular arithmetics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.3.2 Order finding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.3.3 Factoring as order finding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.3.4 A quantum algorithm for order finding . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4.3.5 Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

1
5 Quantum machine learning 32
5.1 A brief overview of classical machine learning . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
5.1.1 Types of machine learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
5.1.2 Neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
5.1.3 Training neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
5.1.4 Reasons for the success of classical machine learning . . . . . . . . . . . . . . . . . . . 34
5.2 Quantum machine learning using qBLAS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
5.3 Quantum support vector machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
5.3.1 Support vector machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
5.3.2 Classical computation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
5.3.3 Quantum computation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
5.4 Quantum principal component analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
5.5 Quantum neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.5.1 Quantum feedforward neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.5.2 Quantum convolutional neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . 40
5.5.3 Quantum Boltzmann machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

6 Measurement-based quantum computation 43


6.1 The basic idea of MBQC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
6.2 The details of MBQC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
6.2.1 Definition of the possible operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
6.2.2 Preparing the initial state . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
6.2.3 Measurements and their effect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
6.2.4 Universal single-qubit operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
6.2.5 Cluster states as a resource . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
6.2.6 Two-qubit gates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
6.3 Universality and efficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

7 Adiabatic quantum computation 48


7.1 The basic idea of AQC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
7.2 Adiabatic evolution and quantum speed-up . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
7.3 Universality and stoquasticity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
7.4 Reason for non-commuting Hamiltonians . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
7.5 Proof of the adiabatic theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

8 Algorithms for solving combinatorial optimization problems 53


8.1 Combinatorial optimization problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
8.1.1 Hardness of combinatorial optimization problems and promises of quantum computers
for solving them . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
8.2 Combinatorial optimization and the Ising model . . . . . . . . . . . . . . . . . . . . . . . . . 55
8.2.1 Mapping combinatorial optimization problems to spin Hamiltonians . . . . . . . . . . 56
8.3 Quantum annealing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
8.3.1 Solving optimization problems on a quantum annealer . . . . . . . . . . . . . . . . . . 59
8.3.2 Heuristic understanding of quantum annealing . . . . . . . . . . . . . . . . . . . . . . 60
8.3.3 QUBO optimisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
8.3.4 D-wave quantum annealer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
8.3.5 Summary of pros and cons for quantum annealing . . . . . . . . . . . . . . . . . . . . 62

2
8.3.6 Example of the solution of a practical problem on a quantum annealer: Flight-gate
assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
8.4 Quantum Approximate Optimization Algorithm (QAOA) . . . . . . . . . . . . . . . . . . . . 66
8.4.1 Introduction to QAOA: From the quantum adiabatic algorithm to QAOA . . . . . . . 66
8.4.2 The Quantum Approximate Optimization Algorithm for solving Max-cut . . . . . . . 68
8.4.3 Other interesting remarks and extensions of QAOA . . . . . . . . . . . . . . . . . . . . 71
8.4.4 More on the relation between QAOA and quantum annealing . . . . . . . . . . . . . . 71

9 The variational quantum eigensolver 73


9.1 Outline of the algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
9.2 More on step 0 – mapping to a Hamiltonian . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
9.3 More on step 1 – the trial state . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
9.3.1 Problem-specific trial states . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
9.3.2 Hardware-efficient trial states . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
9.4 More on step 4 – updating the parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

10 Sampling models and sub-universal models of quantum computation 77


10.1 Introduction: motivation for sampling models . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
10.2 Instantaneous Quantum Polytime . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
10.2.1 Hadamard gadget . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
10.3 Random Circuit Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
10.4 Boson Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
10.4.1 Definition of the Boson Sampling model . . . . . . . . . . . . . . . . . . . . . . . . . . 83
10.4.2 Proof that the Boson Sampling probability distribution is proportional to permanents 84
10.4.3 Sketch of the proof of computational hardness of the Boson Sampling probability dis-
tribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

11 Continuous-Variable Approach to Quantum Information 89


11.1 Quantum computing with continuous variables . . . . . . . . . . . . . . . . . . . . . . . . . . 89
11.1.1 The underlying physical model: the quantized harmonic oscillator . . . . . . . . . . . 89
11.1.2 First definitions, elementary operations and universal gate-sets . . . . . . . . . . . . . 94
11.2 Measurement-based quantum computation: the general paradigm in CV . . . . . . . . . . . . 96
11.2.1 Cluster states in Continuous Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
11.2.2 The CV MBQC paradigm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
11.3 GKP encoding and Error Correction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
11.3.1 GKP encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
11.3.2 Single noise realization: intermediate measurement and threshold condition . . . . . . 103
11.3.3 Single noise realization: Output state of the GKP error-correcting gadget . . . . . . . 105
11.4 Sampling models and sub-universal models in Continuous Variables . . . . . . . . . . . . . . . 107
11.4.1 Continuous-Variable Instantaneous Quantum Polytime . . . . . . . . . . . . . . . . . . 107
11.5 Quantum annealing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
11.5.1 Circuit QED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
11.5.2 Two-photon pumped Kerr-nonlinear resonator . . . . . . . . . . . . . . . . . . . . . . . 113
11.5.3 Two- & one-photon pumped Kerr-nonlinear resonator . . . . . . . . . . . . . . . . . . 115
11.5.4 Coupled two-photon pumped Kerr-nonlinear resonators . . . . . . . . . . . . . . . . . 117
11.5.5 Simulation of relevant combinatorial optimization problems . . . . . . . . . . . . . . . 119
11.5.6 Remarks on scalability & the model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121

3
A Quantization of the electromagnetic field in a cavity 123
A.1 Quantizing the electromagnetic field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

B Superconducting quantum circuits 128


B.1 Circuit Lagrangian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
B.2 Transformation to the rotating frame . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
B.3 Steady state & stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
B.4 Effect of single-photon pump . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
B.5 Coupling between two Kerr-nonlinear resonators . . . . . . . . . . . . . . . . . . . . . . . . . 135
B.6 Error estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
B.7 Generation of cat states using a two-photon pumped KNR . . . . . . . . . . . . . . . . . . . . 137

4
Chapter 1

The circuit model for quantum


computation

In this course, we will give an overview of various approaches to quantum computation, reflecting many of
the latest developments in the field. We will cover several different models of quantum computation, from
the foundational circuit model through measurement-based and adiabatic quantum computation to boson
sampling. We will discuss quantum computation with both discrete and continuous variables. When it
comes to the algorithms that we study, they include both classics like Shor’s algorithm and newer, heuristic
approaches like the quantum approximate optimization algorithm (QAOA). We will also see how quantum
computing can be combined with machine learning.
We assume that the students taking this course already have some familiarity with quantum physics
(superposition, entanglement, etc.) and some basic concepts in quantum computation. We will repeat some
of these basic concepts at the beginning of the course, but perhaps give a more thorough justification for
why they can be used in quantum computation.
In this first chapter, we will study the circuit model of quantum computation. This introduces quantum
bits, quantum gates, and other components in close similarity with concepts in classical computing and gives
us the tools to begin investigating whether quantum computers can ever outperform classical computers.
For this chapter, we have borrowed parts from Refs. [Nielsen and Chuang, 2000, Aaronson, 2018, Kockum
and Nori, 2019].

1.1 Components of the circuit model


Loosely speaking, a computation requires a system that can represent data, a way to perform manipulation
of that data, and a method for reading out the result of the computation. In the circuit model of quantum
computation, we use:
• Quantum bits (qubits) to represent the data.

• State preparation to initialize the qubits in the input state we need to begin the computation.

• Quantum gates on the qubits to manipulate the data.

• Measurements on the qubits to read out the final result.


Below, we first say a few words about what qubits are. We then discuss various quantum gates, and
what is required of such gates to allow us to perform any quantum computation we would like. We assume

5
Figure 1.1: The Bloch-sphere representation of a qubit state. The north pole is the ground state |0i and the
south pole is the excited state |1i. To convert an arbitrary superposition of |0i and |1i to a point on the
sphere, the parametrization |ψi = cos ϑ2 |0i + eiφ sin ϑ2 |1i is used.

for now that it is possible to initialize our quantum computer in some simple state, and that we can read
out the state of the qubits at the end of a computation.

1.2 Quantum bits


In a classical computer, the most basic unit of information is a bit, which can take two values: 0 and 1. In
a quantum computer, the laws of quantum physics allow phenomena like superposition and entanglement.
When discussing information processing in a quantum world, the most basic unit is therefore a quantum bit,
usually called qubit, a two-level quantum system with a ground state |0i and an excited state |1i. Unlike a
classical bit, which only has two possible states, a quantum bit has infinitely many states: all superpositions
of |0i and |1i,
 
α
|ψi = α |0i + β |1i = , (1.1)
β
2 2
where α and β are complex numbers satisfying |α| + |β| = 1. A measurement on this qubit state (in the
2 2
basis of |0i and |1i) gives the result 0 with probability |α| and the result 1 with probability |β| .
A useful tool for visualizing a qubit state is the Bloch sphere shown in Fig. 1.1. A state of the qubit is
represented as a point on the surface of the sphere, which has radius 1. The two states of a classical bit
correspond to the north and south poles on the sphere. The two states on opposite ends of the x axis are
often denoted
|0i + |1i |0i − |1i
|+i = √ , |−i = √ . (1.2)
2 2

6
If there are N qubits in a system, the total state of that system can be a superposition of 2N different
states: |000 . . . 00i, |100 . . . 00i, |010 . . . 00i, . . ., |111 . . . 10i, |111 . . . 11i. Note that N classical bits can be in
one of these 2N states, but not in a superposition of several of them. To store all the information about a
general N -qubit state, one needs to keep track of the 2N amplitudes in the superposition. This means that
at least 2N classical bits are required to represent the quantum system. This explains why it is hard for
classical computers to simulate some classical systems, and gives a first hint that quantum computers can
be more powerful than classical ones (at least when it comes to simulating quantum systems).
There are many physical implementations of qubits, e.g., superconducting qubits, trapped ions, natural
atoms, etc. These implementations are a topic for another course. In the following, we assume that we have
access to qubits, but do not care much about how they are made.

1.3 Single-qubit gates


Operations on a single qubit moves its state from |ψi = α |0i + β |1i to |ψ 0 i = α0 |0i + β 0 |1i while preserving
2 2 2 2
the norm |α| + |β| = 1 = |α0 | + |β 0 | . Such operations (gates) can be described by 2 × 2 unitary matrices.
Here we list some of the most common ones. First are the Pauli matrices:
 
0 1
X = , (1.3)
1 0
 
0 −i
Y = , (1.4)
i 0
 
1 0
Z = , (1.5)
0 −1
(1.6)

The Pauli matrices generate rotations around the corresponding axes on the Bloch sphere when exponenti-
ated, e.g.,
 
cos(ϑ/2) −i sin(ϑ/2)
Rx (ϑ) = exp(−iϑX/2) = cos(ϑ/2)I − i sin(ϑ/2)X = , (1.7)
−i sin(ϑ/2) cos(ϑ/2)

where I is the identity matrix.


The X gate is the quantum equivalent of the classical NOT gate:
    
0 1 α β
X(α |0i + β |1i) = = = β |0i + α |1i . (1.8)
1 0 β α

By adding the X and Z gates, one obtains the Hadamard gate


 
1 1 1 X +Z
H=√ = √ . (1.9)
2 1 −1 2

This gate transforms the qubit state from the |0i , |1i basis to the |+i , |−i basis. The Hadamard is often
used to create superposition states at the beginning of a quantum algorithm.
The Z gate applies a phase factor −1 to the |1i part of the qubit state. Two gates that apply other phase
factors are often given their own names. One is the T , or π/8, gate:
   
1 0 exp(−iπ/8) 0
T = = exp(iπ/8) , (1.10)
0 exp(iπ/4) 0 exp(iπ/8)

7
The other is the phase, or S, or P , gate
 
1 0
S= = T 2. (1.11)
0 i

1.4 Multi-qubit gates


To realize useful quantum algorithms, we also need to be able to make two or more qubits interact through
multi-qubit gates. One way to achieve this is to let the state of one qubit control whether a certain single-
qubit gate is applied to another qubit. One such gate is the controlled-NOT (CNOT) gate:

1 0 0 0
 
0 1 0 0
CNOT =  0 0 0 1 .
 (1.12)
0 0 1 0
Note that the two-qubit Hilbert space is spanned by the basis vectors |00i, |01i, |10i, and |11i, in that order.
This is the tensor product of the two single-qubit Hilbert spaces. The action of the CNOT gate is thus to
do nothing if the first qubit is in state |0i (|00i, |01i changes to |00i, |01i), and to apply NOT to the second
qubit if the first qubit is in state |1i (|10i, |11i changes to |11i, |10i).
The controlled-Z (CZ) gate can be defined in the same manner:

1 0 0 0
 
0 1 0 0 
CZ =  0 0 1 0  .
 (1.13)
0 0 0 −1
More generally, the controlled application of a single-qubit unitary U to the second qubit takes the form
 
I2 02
. (1.14)
02 U
There are also two-qubit gates that are not controlled operations. For example, the SWAP gate
1 0 0 0
 
0 0 1 0
SWAP =  0 1 0 0
 (1.15)
0 0 0 1
swaps the states |01i, |10i to |10i, |01i.
It is also possible to define gates involving than two qubits. For three-qubit gates, the most well-known
ones are the Toffoli and Fredking gates. The Toffoli gate is a controlled-controlled-NOT (CCNOT), i.e., the
state of the third qubit is flipped if and only if both the first two qubits are in state |1i:

1 0 0 0 0 0 0 0
 
0 1 0 0 0 0 0 0
 
0 0 1 0 0 0 0 0
 
0 0 0 1 0 0 0 0
 
Toffoli =  . (1.16)
0 0 0 0 1 0 0 0
 
0 0 0 0 0 1 0 0
 
0 0 0 0 0 0 0 1
0 0 0 0 0 0 1 0

8
The Fredkin gate is a controlled-SWAP (CSWAP) gate, swapping the states of the second and third qubits
if and only if the state of the first qubit is |1i:

1 0 0 0 0 0 0 0
 
0 1 0 0 0 0 0 0
 
0 0 1 0 0 0 0 0
 
0 0 0 1 0 0 0 0
 
Fredkin =  . (1.17)
0 0 0 0 1 0 0 0
 
0 0 0 0 0 0 1 0
 
0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 1

Most practical implementations of quantum computing have limited connectivity between qubits, only al-
lowing for pairwise interactions between nearest-neighbour qubits. This can prohibit direct implementations
of multi-qubit gates with three or more qubits, but it turns out that any multi-qubit gate can be decomposed
into a number of single- and multi-qubit gates.

1.5 Universal quantum computation


Are all the gates we specified above enough to carry out any quantum computation that we would like?
Could we do any quantum computation using just a small subset of the gates above? These are questions
about universality.
For classical computers, a set of gates is called universal if, by applying enough gates from this set in a
sequence, it is possible to express any Boolean function on any number of bits. The classical NAND gate
turns out to be universal all on its own, but there are other sets of gates that are not universal. For example,
the set {AND, OR} is not universal, because these gates cannot express non-monotone Boolean functions;
changing an input bit from 0 to 1 in a circuit with these gates will never result in an output bit changing
from 1 to 0.
For quantum computers, a gate set is called universal if the gates therein can be used to approximate
any unitary transformation on any number of qubits to any desired precision. To understand what makes a
gate set universal for quantum computing, let us see how a gate set can fail to be universal:

• Inability to create superposition states


Some gates, e.g., X and CNOT, only change states in the computational basis into other states in
the computational basis (e.g, CNOT |11i = |10i). These gates can maintain superpositions, but they
cannot create new ones.

• Inability to create entanglement


The Hadamard gate can create a superposition (e.g., H |0i = |+i), but it only acts on a single qubit,
so it cannot create entanglement between two or more qubits. Clearly, no single-qubit gate can. Since
an unentangled state of N qubits can be written (α1 |0i + β1 |1i) ⊗ . . . ⊗ (αN |0i + βN |1i), it can be
specified using only 2N amplitudes, which a classical computer would be able to simulate efficiently.

• Inability to create non-real amplitudes


A gate set like {CNOT, H} would be able to create both entanglement and superposition states.
However, the matrices specifying these gates only have real entries. Thus, they would never be able to
create states with complex amplitudes.

9
• The Gottesman-Knill theorem
If we take our gate set to be {CNOT, H, S}, we circumvent all the objections above. However, it turns
out that this still is not enough to achieve universal quantum computation. This is the

Gottesman-Knill theorem: A quantum circuit using only the following elements can be simulated
efficiently on a classical computer:

1. Preparation of qubits in computational basis states,


2. Quantum gates from the Clifford group (H, CNOT, S), and
3. Measurements in the computational basis.

The Clifford group is the group of operations that map Pauli matrices onto (possibly different) Pauli
matrices, and is generated by the gates (H, CNOT, S). Quantum circuits with only these gates are also
called stabilizer circuits (we will return to this concept in Chapter 3, where we will give a proof of the
Gottesman-Knill theorem and discuss quantum error correction). If one starts in the computational
basis, the single-qubit states that can be reached using these circuits are the ±1 points on the x,
y, and z axes of the Bloch sphere. This restriction of the state space turns out to make the circuit
classically simulatable even though the state space has states with both superposition, entanglement,
and complex amplitudes.

What is then a universal gate set? Actually, replacing the Hadamard gate in the last gate set considered
above, {CNOT, H, S}, with almost any other gate makes the set universal. We can also replace S with some
other gate. One universal gate set is {CNOT, H, T }. Almost any two-qubit gate on its own is also universal.
We thus now know that we can implement any unitary operation on N qubits in a realistic experimental
architecture that allows for a few single- and two-qubit gates. Does that mean that we can run quantum
algorithms that are faster than classical algorithms for some problems? This will be discussed in the next
chapter (and many other chapters after that).

10
Chapter 2

Comparing quantum and classical


computers

Could a quantum computer find the answer to a problem that cannot be solved on a classical computer,
even if there are no restrictions on the resources available to the classical computer? The answer is no.
The classical computer could always simulate the quantum computation by storing, to enough precision,
the 2N amplitudes for the whole state of the quantum computer’s N qubits, and changing these amplitudes
according to the gates applied in the quantum algorithm. However, there is clearly a possibility that this
classical simulation requires a lot more resources than the quantum computation itself. So a better question
to ask is: are there problems that a quantum computer could solve faster than a classical computer? That
question is the topic of this chapter.

2.1 The Solovay–Kitaev theorem


In the previous chapter, we saw that there are many gate sets that are universal for quantum computing,
i.e., they can approximate any unitary transformation on any number of qubits to any desired precision
ε. However, it is a different question whether that approximation is fast and efficient. For example, If the
number of gates needed would scale exponentially in 1/ε, there would probably not be any practical use for
such a universal gate set. Luckily, there is a theorem telling us that the scaling for any universal gate set is
much more benign:

Solovay–Kitaev theorem: Let G a finite subset of SU (2) and U ∈ SU (2). If the group generated by
G is dense in SU (2), then for any  > 0 it is possible to approximate U to precision  using O log4 (1/)


gates from G.

The proof can be found in Ref. [Nielsen and Chuang, 2000]. What the theorem essentially says is that
if we have a gate set that can approximate any single-qubit rotation, then we only need relatively few gates
from that set to do the approximation, so we can do the approximation fast. By adding some two-qubit gate
to make the gate set universal, it can be shown that the total number of gates needed to approximate U on
N qubits is at most O 4N log4 (1/) . More recent results have shown that for certain gate sets, the number


of gates can be brought down from O log4 (1/) to O(log(1/)). There are also good algorithms for finding


the gate sequences needed to achieve the Solovay-Kitaev bound.


This theorem, together with the observations in the previous chapter, is what gives us confidence that

11
quantum computing has potential. We now know how to find a universal gate set to approximate any unitary
transformation, and we know that we can implement that approximation efficiently. This means that we are
ready to compare how fast classical and quantum computers are at solving different types of problems: we
can define quantum complexity classes.

2.2 Complexity classes


The time it takes to solve a computational problem can scale differently with the problem size, and can depend
on what type of computational machine is used. This is the basis for defining computational complexity
classes, which are the topic of this section. Most of the following is based on the supplementary material of
Ref. [Douce et al., 2017] and section 2 in Ref. [Wendin, 2017].
For classical computers, the scaling is usually defined either for a deteriministic Turing machine (DTM)
or a probabilistic Turing machine (PTM). The DTM is the model that corresponds to ordinary classical
computers; it is a finite state machine that reads and writes on a finite tape.

2.2.1 Complexity classes for a deterministic Turing machine


The most basic and well-known complexity classes are those defined for a DTM:

• P (polynomial): The set of decision problems solvable in polynomial (poly) time by a DTM.

• NP (non-deterministic polynomial): The set of decision problems whose solutions can be verified in
polynomial time by a DTM.

• NP-hard: The set of decision problems whose solutions allows solving all problems in NP.

• #P: The set of problems that count the number of solutions of NP problems.

• #P-hard: The set of problems whose solutions allows solving all other problems in #P

It is known that P ⊆ NP, but the question whether the inclusion holds strictly (and hence ultimately P
6= NP) stands as one of the most important open problems in the modern age of science.
A concept that is often mentioned in connection to P and NP is the polynomial hierarchy (PH). It is a
hierarchy of complexity classes that generalizes the classes P, NP to the case in which oracles are accessible.
An oracle is a black box that can output the solution of a decision problem contained in a given complexity
class using a single call. For example, AB is the set of decision problems in class A that are solvable in
polynomial time by a DTM augmented by an oracle for some complete problem in class B.
The first level of the PH is the class P; in symbols, Σ0 = P. Successive levels are refined recursively:

Σk+1 = NPΣk . (2.1)

A problem is in the polynomial hierarchy if it is in some Σk , i.e., the polynomial hierarchy is the union of
all Σk .
Analogously to what was said above concerning the relation between P and NP, it is known that Σk ⊆
Σk+1 , i.e., higher levels of the PH contain lower levels, and it is strongly believed that the inclusion is strict,
namely that Σk 6= Σk+1 . If there is a k for which Σk = Σk+1 , the PH is said to collapse to level k. It can
be shown that if a collapse occurs at level k then for all k 0 > k it would hold that Σk0 = Σk , which justifies
the terminology “collapse”.

12
2.2.2 Complexity classes for a probabilistic Turing machine
A PTM is much like a DTM, but it makes random choices of the state of the finite state machine when
reading from the tape, and it traverses all states in a random sequence. This randomness means that a PTM
will not get stuck away from a solution, which could happen for a DTM. This leads to the definition fo the
following complexity classes:

• BPP (bounded probabilistic polynomial): The class of decision problems that a PTM solves in poly-
nomial time with an error probability bounded away from (i.e., strictly less than) 1/3 for all instances.

• PP (probabilistic polynomial): The class of decision problems that a PTM solves in polynomial time
with an error probability bounded less than 1/2 for all instances.

A PTM can be simulated by a DTM with only polynomial overhead. Therefore, it is believed that BPP
= P.

2.2.3 Complexity classes for a quantum Turing machine


We can define a quantum Turing machine (QTM) in analogy with the classical Turing machines, using a
quantum memory (tape) and a quantum processor. We then have a few new complexity classes:

• BQP (bounded quantum polynomial): This is the quantum analogue of BPP. Intuitively, BQP is the
class of problems that can be solved using at most a polynomial number of gates, with at most 1/3
probability of error.

• PostBQP (Post-selected bounded quantum polynomial): PostBQP is an extension of BQP where,


during a polynomial time computation, one is allowed to abort and start all over again for free whenever
the result on a specific conditioning qubit (or subset of qubits) is not satisfying.

• QMA (quantum Merlin–Arthur): Classically, Merlin–Arthur problems are a subclass of NP problems


where the oracle is referred to as Merlin and Arthur is the one doing the verification in polynomial
time. The QMA class is the one where Arthur can use a quantum computer to verify the solution in
polynomial time.

• QMA-hard: The set of decision problems whose solutions allows solving all problems in QMA.

BQP it is the class we refer to when we talk about problems efficiently solved by a universal quantum
computer. Note that we do not have to specify which gates the definition is based upon, as long as they
constitute a universal set. Thanks to the Solovay-Kitaev theorem, using one universal set or another merely
results in a polylogarithmic overhead; this cost is dominated by a polynomial function.
Quantum computing subsumes classical one. In terms of complexity classes, this is summarized by the
statement BPP ⊆ BQP.

More details on PostBQP


The post-selection procedure in PostBQP, which is not specific to quantum computing (one can define
the classical complexity class PostBPP similarly), is highly unrealistic and brings in a lot of power to the
model [Aaronson, 2005]. More formally, PostBQP is the class of problems solvable by a BQP machine such
that:

• If the answer is yes, then the second qubit has at least 2/3 probability of being measured 1, conditioned
on the first qubit having been measured 1.

13
• If the answer is no, then the second qubit has at most 1/3 probability of being measured 1, conditioned
on the first qubit having been measured 1.

• On any input, the first qubit has a nonzero probability of being measured 1. This condition can actually
be refined to an n-dependent probability.

Denoting qo (qc ) the output (post-selected) qubit, the relevant mathematical object is the conditional prob-
ability which reads by definition:

P (qo = 1 ∧ qc = 1)
P (qo = 1/qc = 1) = . (2.2)
P (qc = 1)

Intuitively, the power of PostBQP relies upon the denominator P (qc = 1): since it can be arbitrarily low, it
may compensate for very unlikely events corresponding to the solution.
We now want to be more specific about the success probability P (+1 ). The Solovay-Kitaev theorem (see
above) actually sets a lower bound on the acceptable probabilities: it lets us approximate any desired unitary
within exponentially small error for only a polynomial increase in the circuit size. In other words, for an
exponentially unlikely probability, the theorem still ensures that arbitrary universal gate sets can be used
for polynomially long computations like BQP circuits—since a polynomial overhead remains in the BQP
class. And indeed the class PostBQP is based upon BQP circuits. Thus it is well-defined only if the relevant
output probabilities are at worst exponentially unlikely:
1
P (+1 ) & . (2.3)
2n
It has been shown in [Kuperberg, 2015] that this condition was fulfilled whenever “reasonable” universal gate
sets were considered.
Additionally, suppose now that there is a polynomial p(n) such that P (+1 ) ≥ 1/p(n). In that case
P (+1 ) is polynomially unlikely. Then running the BQP circuit p(n) more times would still correspond to
a polynomial time computation and remain in BQP. On the other hand, such redundancy would enable
recording enough statistics to simulate the quantum post-selection through classical postprocessing. Hence
conditioning on an event which probability scales as 1/p(n) does not give any power to the post-selection.
So P (+1 ) has to be worst than polynomially unlikely.
Following the discussion in [Aaronson, b], the definition of the class PostBQP could be refined to account
for this feature: the conditioning probability P (+1 ) scales as the inverse of an exponential function,
1
P (+1 ) ∼ , (2.4)
2n
up to some scaling factor irrelevant in terms of computational classes.

2.2.4 Summary of the complexity classes


The Venn diagram in Fig. 2.1 summarizes most of the complexity classes defined above and their relations.
Another plot of the relationships between a few of the complexity classes is shown in Fig. 2.2.
Some of the definitions above, and a few more complexity classes, are given in 2.3. The definition of
further complexity classes can be found in [Watrous, 2009, Aaronson, a].

14
Review

ort-term QIP perspec-


quantum technologies
riving the present and
cessary for developing
g-term goals of useful
estigations of quantum
r the present efforts to
he work on improving
ts are extremely sensi-
e requirement of mak-
the coherence has led
tum limited supercon-
d quantum sensors and
o a new quantum tech-
ity of magnetic meas-
Figure Figure
2.1: A Venn 1. Computational
diagram complexity
of various classical is defined
and quantum by Turing
complexity classes. From Ref. [Wendin, 2017].
machines (TM) providing digital models of computing [1, 62,
rcuits for QIP has been 63]: deterministic TM (DTM); quantum TM (QTM); classical
[8 –10]. The first exper- polynomial time execution and define complexity classes: P denotes
non-deterministic TM (NTM). Tractable problems are defined by

t, a Josephson-junction problems that are efficiently solvable with a classical computer; P


he charge regime, was is a subset of NP, the problems efficiently checkable by a classical
the coherence time was computer. QMA denotes the problems efficiently checkable by a
ut it took only another quantum computer. NP-hard problems are the problems at least as
hard as any NP problem, and QMA-hard problems are the problems
useful qubit concepts: at least
Figure 2.2: Main as hardclasses
complexity as anyuseful
QMA forproblem. For aand
our purposes nice
thetutorial on relationships
inclusion how to (black line,
um CPB [15],inclusion
and the classifytocombinatorial
from bottom top). problems, including games, see [61].
ep was to embed qubits
ator, introducing circuit of the experimental work, only touches the surface and is cov-
7–22]. The subsequent ered by references to recent work. Reviews and analyses of the
ng coplanar microwave QIP field are provided by [42–53]. For a comprehensive trea-
dbreaking progress: tise on QIP we refer to Nielsen and Chuang [54]. Extensive
technical discussions of a broad range of superconducting cir-
ubit-resonator coupling cuits and qubits can be found in [55–60].
The present review also tries to look beyond the exper-
apping
√ excitations, in imental state of the art, to anticipate what will be coming up
iSWAP gate [24]. in the near future in the way of applications. There is so much
ulti-qubit systems with theoretical experience that waits to be implemented on super-
nsmon version of the conducting platforms. The ambition is to outline opportunities
2011 the for addressing real-world problems in Physics, Chemistry and
Figure 2.3: Summary of the definition of the complexity classes used in this chapter. From Ref. [Lund et al.,
Moreover, in 2017]
a 3D-cavity increased Materials Science.
]. At present there is
15
3D multi-qubit circuits 2. Easy and hard problems
apable of performing a
gates with control and Why are quantum computers and quantum simulators of
such great interest? Quantum computers are certainly able to
vide a snapshot of cur- solve some problems much faster than classical computers.
Chapter 3

Quantum error correction

We have now seen that universal quantum computation can be performed with a rather small and simple set
of quantum gates, and that there is good reason to believe that a universal quantum computer will be able
to solve some problems faster than classical computers can. However, a crucial question remains to answer
before we go ahead and invest large resources into actually building a quantum computer: can a quantum
computer deal with errors?
If even a small error can wreck a quantum computation beyond repair, a practical quantum computer
can never be realized. In today’s classical computers, the probability p of an error occurring during a logical
operation is amazingly low. Numbers on the order of p ≈ 10−18 are often quoted. For state-of-the-art
quantum computers, however, p is rather on the order of 10−2 or, in the best case, 10−3 . In this chapter, we
show that quantum error correction is indeed possible and feasible, even with error rates close to what we
have today. For a more detailed account, se chapter 10 in Ref. [Nielsen and Chuang, 2000].

3.1 Challenges for quantum error correction


Correcting for errors on classical bits is quite straightforward. The basic idea is to encode the state of
one bit redundantly using several bits such that an error on one of the latter does not change the encoded
information. The simplest example is a three-bit code with majority voting. The state of one bit, 0 or 1,
is encoded into three bits as 000 or 111. The encoded information is read out by measuring all three bits
and going with the majority vote. For example, if the third bit is flipped, 000 changes into 001 (and 111
into 110), but the majority vote among 001 (110) still tells us that the encoded bit was 0 (1). For the error
correction to fail, two bits need to be flipped. This means that while the error probability for the single
unencoded bit is p, the error probability for the encoded bit is ∼ p2 .
Correcting for errors on quantum bits is more complicated. Indeed, this was a major headache for
researchers in the early days of quantum computing. Before Peter Shor showed in 1995 [Shor, 1995] how to
achieve quantum error correction, it was hard to be optimistic about the prospects for quantum computing.
The major obstacles thought to prevent quantum error correction were three:

• The no-cloning theorem [Wootters and Zurek, 1982, Dieks, 1982]. Quantum mechanics prohibits the
existence of a unitary operation U that can change a known state |ϕi into a copy of an unknown
arbitrary quantum state |ψi without perturbing the latter, i.e., U |ϕi |ψi = |ψi |ψi. For the classical
error correction described above, such cloning was essential for encoding.

• Measuring a quantum state causes it to collapse into an eigenstate of the measured observable. How

16
to correct errors on an arbitrary superposition state α |0i + β |1i without disturbing the state when
performing a measurement on it?

• In a classical bit, there is only one possible error: a bit flip, taking the bit from 0 to 1 or vice versa. For
a quantum bit, there are infinitely many possible errors: any single-qubit operation, i.e., any rotation
around any axis of the Bloch sphere by any angle, could possibly be induced by some external error
source. How to make an error-correction procedure general enough to be able to deal with all these
possible errors?

3.2 The three-qubit bit-flip code


In this section, we will look at the simplest quantum error-correction code, which demonstrates that all
three obstacles above can be dealt with. The code is the three-qubit bit-flip code. The encoding and
error-correction process is shown schematically in Fig. 3.1.
Encoding Measurements Correction

|ψi • • Flip iff M12 = −1, M23 = 1


Bit M12
|0i flip Flip iff M12 = −1, M23 = −1
error M23
|0i Flip iff M12 = 1, M23 = −1

Figure 3.1: The three-qubit bit-flip error-correction code. In the first step, CNOT gates are applied to
produce the state |ψ3 i from |ψi [see Eq. (3.2)]. After a bit-flip error occurs, parity measurements are done
on pairs of qubits and correcting bit flips are applied to the qubits depending on the measurement results
(-1 means the qubits are in opposite states, +1 that they are in the same state). Figure from Ref. [Kockum,
2014].

We begin with an arbitrary one-qubit state

|ψi = α |0i + β |1i . (3.1)

To protect it from bit-flip errors, i.e., from an unwanted application of the X gate, we encode it using three
qubits as
|ψ3 i = α |000i + β |111i . (3.2)
Note that this state can be created by application of CNOT gates, which is different from cloning the state;
⊗3
that would have resulted in |ψ3 i = (α |0i + β |1i) (if cloning had been possible).
Now, let us assume that the third qubit is flipped. This gives

|ψ3,err i = α |001i + β |110i . (3.3)

If we first perform a parity measurement on qubits 1 and 2, i.e., measure the product Z1 Z2 , and then measure
the parity of qubits 2 and 3, i.e., Z2 Z3 , we would not affect the state |ψ3 i, since Z1 Z2 |ψ3 i = |ψ3 i = Z2 Z3 |ψ3 i.
Similarly, performing these measurements on |ψ3,err i does not change the coefficients α and β determining
the superposition; it just adds a global phase factor. However, the result of the measurements lets us draw
the conclusion that qubit 3 has been flipped (assuming that the probability of more than one qubit flipping
is negligible). We can then apply an X gate to this qubit, flipping it back to its original state and recovering
|ψ3 i. As shown in Fig. 3.1, the same set of measurements also lets us correct the state if a bit-flip error

17
instead occurs on qubit 1 or 2. We have thus managed to circumvent the problem of quantum measurements
being projective.
What happens if the error is an arbitrary X rotation Rx (ϑ) on the third qubit instead of just X? As we
can see from Eq. (1.7), the resulting state would then be

cos(ϑ/2) |ψ3 i − i sin(ϑ/2) |ψ3,err i . (3.4)

Measuring the observables Z1 Z2 and Z2 Z3 will either project this state into |ψ3 i or |ψ3,err i (modulo a
global phase). In both cases, we will know what operation to apply to recover |ψ3 i. This demonstrates
how quantum error-correcting codes deal with the continuum of possible errors: measurements are used to
project the perturbed state into a finite set of states from which we know how to recover the original state.

3.3 The three-qubit phase-flip code


The three-qubit bit-flip code can only correct for X errors on a single qubit. However, the idea of the code
can be extended to instead deal with Z errors, i.e., phase flips. The crucial observation is that just as X
flips |0i to |1i and vice versa, Z flips |+i to |−i and vice versa. So by encoding the one-qubit state we want
to protect in the {|+i , |−i} basis instead of the {|0i , |1i} basis,

|ψ3 i = α |+ + +i + β |− − −i . (3.5)

we achieve an encoded three-qubit state where a Z error flips one of the qubits from one of its basis states
to the other. This encoding is implemented by adding a Hadamard gate on each qubit at the end of the
encoding step in Fig. 3.1.
To detect a phase flip error in one of the three qubits, we measure the products H ⊗3 Z1 Z2 H ⊗3 = X1 X2
and H ⊗3 Z2 Z3 H ⊗3 = X2 X3 . Just as above, these measurements do not change the coefficients α and β, but
lets us conclude whether a qubit has suffered a phase flip (and which qubit it was), allowing us to apply a
Z gate to correct. And just as before, this procedure also works for arbitrary Z rotations by projecting the
state into either |ψ3 i or a state that can be corrected by applying a Z gate.

3.4 The nine-qubit Shor code


The breakthrough of Shor in 1995, which showed that quantum error correction could correct arbitrary
single-qubit errors, was a nine-qubit code [Shor, 1995] that combines the three-qubit bit-flip and phase-flip
codes into one. The two codes are concatenated such that the single-qubit state |psii first is encoded with
the three-qubit phase-flip code and then each of these three qubits are encoded with the three-qubit bit-flip
code into three more qubits each. The resulting state is
(|000i + |111i)(|000i + |111i)(|000i + |111i) (|000i − |111i)(|000i − |111i)(|000i − |111i)
|ψ9 i = α √ +β √ .
2 2 2 2
(3.6)
To find whether a bit flip has occurred, we can treat each of the three group of three qubits separately,
measuring the product of Zs on the first two and the last two qubits in each group. For phase-flip errors, the
procedure is slightly more complicated. We first determine in which of the three groups of three qubits that
a phase flip has occurred. This is done by measuring the product of Xs on all qubits in two groups of qubits
at a time, i.e., measuring X1 X2 X3 X4 X5 X6 and X4 X5 X6 X7 X8 X9 . The results of these measurements will
tell us if a phase flip has occurred in one of the three groups, but it will not tell us which of these three
qubits has been flipped. However, we do not need to know which qubit it was, because we can correct the
error to the encoded state by applying Zs to all three qubits in the group.

18
Studying the procedure for error-correction in the nine-qubit code, it becomes clear that the code also is
able to correct a combined Z and X error on one qubit. The check for bit flip errors will locate the bit flip
without being affected by the presence of the phase flip error. Once the bit flip has been corrected, only the
phase-flip error remains and will thus be detected and corrected as before.
Since the nine-qubit code thus can correct for both X, Z, and XZ errors, it is able to correct for any
single-qubit error. This is seen by noting that any rotation of a qubit on the Bloch sphere can be decomposed
into Rz (γ)Rx (β)Rz (α). Applying this operation to a qubit gives

Rz (γ)Rx (β)Rz (α) |ψi = (cos(γ/2) − i sin(γ/2)Z)(cos(β/2) − i sin(β/2)X)(cos(α/2) − i sin(α/2)Z) |ψi ,


(3.7)
which results in terms where either the identity operator, X, Z, or XZ has been applied to |ψi (note that
XZX = −Z). Measuring the observables for error correction will thus project any error to identity, a bit-flip
error, a phase-flip error, or a combined bit- and phase-flip error. Since each of these can be corrected, any
single-qubit error can be corrected.

3.5 Stabilizers
The error-correction codes above, and many others, can be understood in terms of stabilizers. If a state
|ψi is unchanged under the action of a unitary operator U , i.e., U |ψi = |ψi, we say that the state |ψi is
stabilized by U . We can extend this to a collection of operators and states. Let S be a group of N -qubit
operators and VS the set of N -qubit states that are fixed by (unchanged under the action of) every element
in S. Then we say that S is the stabilizer of the vector space VS . For VS to be non-trivial (i.e., not just
zero), it can be seen quite easily that the elements of S need to commute and that −I cannot be in S.
Recall that for the three-qubit bit flip code in Sec. 3.2, we measured the operators Z1 Z2 and Z2 Z3 , noting
that they left the encoded state, a superposition of |000i and |111i, unchanged. Here, the stabilized vector
space VS is spanned by |000i and |111i and the stabilizer S is the group generated by Z1 Z2 and Z2 Z3 ,
i.e., the group {I, Z1 Z2 , Z2 Z3 , Z1 Z3 } (products of the elements in the group are elements in the group; the
elements I and Z1 Z3 can be constructed by multiplying together the generators Z1 Z2 and Z2 Z3 in various
ways).
So by measuring the generators of S, we were able to correct certain errors on the stabilized states.
We can imagine constructing other error-correction codes by finding a stabilizer, its generators, and the
corresponding stabilized states. But how do we know which errors the code protects from? If we work with
a stabilizer S and errors {Ej } that are subgroups of the N -qubit Pauli group (products of single-qubit Pauli
matrices and factors ±1, ±i), there is a theorem (Theorem 10.8 in Ref. [Nielsen and Chuang, 2000]) that
tells us that these errors can be corrected if Ej† Ek ∈ / N (S) − S ∀j, k. Here, the normalizer of S, N (S),
consists of all elements A ∈ / S such that A commutes with all elements in S. For our example with the
three-qubit bit-flip code, it is quite straightforward to see that any product of two elements in the error set
{I, X1 , X2 , X3 } anti-commutes with the generators Z1 Z2 and Z2 Z3 of the stabilizer (except I, but I ∈ S, so
I∈ / N (S) − S), so all errors in the set can be corrected.

3.6 Proving the Gottesman–Knill theorem


Here we outline how to prove the Gottesman–Knill theorem using stabilizers. For a more detailed description
of the proof, see Sections 10.5.2–10.5.4 in Ref. [Nielsen and Chuang, 2000].
If we have a vector space VS stabilized by S, the action of any unitary U on a state |ψi ∈ VS can, for
any element g ∈ S, be written
U |ψi = U g |ψi = U gU † U |ψi . (3.8)

19
This means that U |ψi is stabilized by U gU † , and thus U VS is stabilized by U SU † , which is generated by
the operators U g1 U † , ..., U gn if g1 , ..., gn are the generators of S.
The point of this is that it sometimes allows a compact representation of qubit states under transforma-
⊗N
tions. Consider, for example, the state |0i . Applying a Hadamard gate to each qubit in this state trans-
⊗N N
forms it to |+i , which requires 2 amplitudes to represent in the computational basis. However, the state
⊗N
|0i is the only state (up to a global phase) that is stabilized by the stabilizer generated by {Z1 , Z2 , ..., ZN }.
⊗N
After the transformation, |+i is uniquely determined by H{Z1 , Z2 , ..., ZN }H † = {X1 , X2 , ..., XN }, which
only is N generators.
It turns out that the gate set in the Gottesman–Knill theorem, {CNOT, H, S}, is such that any unitary
that transforms elements of the N -qubit Pauli group to other elements of the N -qubit Pauli group can be
composed from O(N 2 ) gates in that set. Thus, starting in a computational basis state, specified by N
generators in the N -qubit Pauli group, we can keep track of changes to the state by keeping track of how
these generators change under the action of {CNOT, H, S}. This only requires O(N 2 ) steps on a classical
computer. The same goes for measurements of these states, so the quantum circuit is classically simulatable.

3.7 Fault-tolerant quantum computing


The error-correction codes we have looked at so far have helped us reduce the probability that a single-
qubit error will result in an actual error in an encoded single-logical-qubit state formed by several physical
qubits. However, this is just the first step towards quantum computing that works in the presence of errors.
Such fault-tolerant quantum computing requires schemes to perform logical operations and measurements on
logical qubits in a way that itself is robust against errors. Explaining how this is done is beyond the scope
of this course, but it can be done.
The most important concept in fault-tolerant quantum computing is the error threshold. This is the single-
qubit error probability which can be tolerated in practice and still allow fault-tolerant quantum computing.
The concept can be intuitively understood already from the three-qubit bit-flip code. For an unencoded
qubit, the error probability is p. For the encoded three-qubit state, we can correct for one single-qubit error,
but two single-qubit errors will result in an error for the logical qubit. This occurs with a probability cp2 for
some c that is determined by how the code works (in this case, c = 3). For it to make sense to invest qubit
resources into the three-qubit encoding, we must have cp2 < p, i.e., p < 1/c. The error threshold is thus 1/c.
If we are below the error threshold, we can reduce the logical error rate further by concatenating the
code at more levels. For the bit-flip case, each of the three qubits making up the logical qubit could in
turn be encoded into three qubits each to protect against bit-flip errors. And each of those qubits could be
encoded into three more, and so on. But will this not lead to having to invest too many resources to lower
the logical error rate enough? Fortunately, the answer is no. This is the threshold theorem for quantum
computation: A quantum circuit containing s gates may be simulated with probability error of at most ε
using O(poly(log s/ε)s) gates on hardware whose components fail with probability at most p, provided p is
below some constant threshold, p > pth , and given reasonable assumptions about the noise in the underlying
hardware. Here, “poly” is a polynomial of fixed degree.

3.8 The surface code


In practice, only small examples of error-correcting codes have been demonstrated in experiments so far.
There are several practical challenges for experimental implementations of error correction at larger scale.
One is that some concatenated codes may require high qubit connectivity. Recall, as a simple example, that
the nine-qubit Shor code required measuring two six-qubit stabilizers to identify the phase-flip errors.

20
SURFACE CODES: TOWARDS PRACTICAL LARGE-SCALE . . .

(a) Z Z Z Z Z Z Z Z Z
TABLE III. Eigens
Z
X Z X Z X Z X Z X Z X
and X̂a X̂b X̂c X̂d .
X X X X X X X X X X
X Z X Z X Z X Z X Z X Eigenvalue
Z Z Z Z Z Z Z Z Z Z
X Z X Z X Z X Z X Z X
+1
X X X X X X X X X X
X Z X Z X Z X Z X Z X
Z Z Z Z Z Z Z Z Z Z
X Z X Z X Z X Z X Z X
X X X X X X X X X X
X Z X Z X Z X Z X Z X
Z Z Z Z Z Z Z Z Z Z
−1
Figure 3.2: The layout for the surface code. The 2 3circles
1 open 4 5represent
6 7 8qubits that are part of the encoded
(b)The solid circles
logical state. a are qubits that areI not - + state,
g part of the logical I but are used for measurements
of the four-qubit XXXX and ZZZZ stabilizers a of the code. From Ref. [Fowler et al., 2012].
Z
b Z Z c b +1: ΨZ+
The error-correction Z
c 1: Ψ
code attracting most attention for large-scale implementation today is the surface
code. For a detailed explanation of how the surface code works, we refer to Ref. [FowlerZ-et al., 2012]. Here,
d
we will only explain some basic points of d the code.
repeat
The surface code is adapted to a square grid of qubits, where each qubit can interact (perform two-qubit
gates) with its four nearest neighbours. The g layout is depicted inHFig. - +3.2. The surface code only uses
(c) a H
four-qubit stabilizers on the form XXXX and ZZZZ, which can be measured using the nearest-neighbour
a
interactions on the grid Xto perform sequential CNOT or CZ gates between the four “data qubits”
+1: ΨX+ targetto bethe nearest-n
b X
measured and a “measurement c
X qubit” b
connected to them, as shown in the figure.
qubit as the contr
Once the system stateXhas been projected 1: ΨX- measurements, a
c into a state that is stabilized by the four-qubit
Z error on a data qubit anti-commutes with the XXXX stabilizer measurements that involve one of Hadamard
the two applied
d d
X-measurement qubits connected to this data qubit. Similarly, an X error on a data qubit anti-commutes
repeat the CNOTs; the proj
with the ZZZZ stabilizer measurements that involve one of the two Z-measurement qubits connected of X̂ato X̂b X̂c X̂d . Hen
this data qubit. In this way, it is possible to identify when single-qubit errors (also XZ errors) occur on data
qubits. However,FIG. 1. error
if the (Color online)increases
probability (a) A two-dimensional
such that several dataarray
qubitsimplementa- all the measure qub
close to each other experience
tionbeofhard
errors, it can the surface
to deducecode.
whichData
error qubits are open
configuration circles
actually (◦),the
caused measurement
measurement results,dataeven ifqubits simu
qubits
a logical error has are solid circles (•), with measure-Z qubits colored green
not occurred. Zabcd |ψ", with
To flip the state of the encoded qubit, a chain of X or Z operations, stretching from one side toX̂ theX̂other
(dark) and measure-X qubits colored orange (light). Away from a b X̂c X̂d |ψ" = X
of the square grid, is required, as shown in Fig. 3.3. The larger the distance d across the grid of data qubits,
the better protected the encoded qubit is, provided the error probability is below the threshold.Following measurem
the boundaries, each data qubit contacts four measure qubits, and For the
eachthemeasure
surface code, threshold qubit contacts
is on the order of four data isqubits;
1 %, which the many
better than measureother qubits qubits
error-correcting codes.in Figs. 1(b)
perform
This contributes to four-terminal
the great interestmeasurements.
in implementing On the boundaries,
the surface code. the measure every step in the cyc
qubits contact only three data qubits and perform three-terminal
It is possible to perform two-qubit operations on encoded qubits in the surface code. However,
the entire two-dime
explaining
how this works is beyond the scope of these notes. See Ref. [Fowler et al., 2012] for details.
measurements, and the data qubits contact either two or three measure begins. We note tha
qubits. The solid line surrounding the array indicates the array each of the measure
boundary. (b) Geometric sequence of operations (left), and quantum easily modified whi
circuit (right) for one surface code cycle for a measure-Z qubit, Appendix B).
which stabilizes Ẑa Ẑb Ẑc Ẑd . (c) Geometry
21 and quantum circuit for Stabilizer codes h
a measure-X qubit, which stabilizes X̂a X̂b X̂c X̂d . The two identity Iˆ
not operate from the
operators for the measure-Z process, which are performed by simply
state |ψ" that results
waiting, ensure that the timing on the measure-X qubit matches that
the stabilizers; we c
of the measure-Z qubit, the former undergoing two Hadamard Ĥ
X X X

Z
Z
Z
X
in a particular state, as in our earlier two-qubit example.
Z
X
X
X However, the set of surface code stabilizers is actually not
Z X
Z
X
always complete, so the array can have additional degrees of
X
X freedom. These additional degrees of freedom can be used to
define logical operators, the first step in defining a logical
lution of measurement qubit. We can see this by considering the small 2D array
segment of the 2D array. shown in Fig. 3. This array has been drawn with two types
the bottom of the figure, of boundaries, terminating with measure-X qubits on the right
orizontal plane. Vertical and left, which we call X boundaries, and terminating with
n which a measurement measure-Z qubits on the top and bottom, which we call Z
elation indicating an X̂
Ẑ X̂ error, and temporal
is sequential in time. ZL
d by the measurement
Z Z Z Z ZZ6 Z Z Z
X Z X Z X Z X Z X
X X X X X X X X
nt process itself must X Z XX11 Z X Z X Z X
yield a sign change for Z Z Z Z ZZ7 Z Z Z
cle, this measurement X Z X Z X Z X Z X
l typically be signaled X10
X X X XX X X X X
12X Z
hanges occurring on a X Z X Z
3 Z X Z X
surement error could, XL Z Z Z Z Z Z Z Z d=5
urement, with a lower XX1 Z XX2 Z X3 X Z XX4 Z XX5
X X X X X X X X
ollowing that, with an
X Z X Z X Z X Z X
ablishing the value of
quires several surface
Z Z Z Z ZZ8 Z Z Z
X Z X Z X Z X Z X
as well as sequential X X X X X X X X
X Z X Z X Z X Z X
ror, as it is typically Z Z Z Z ZZ9 Z Z Z
inct from that of a data
r more measurements
FIG. 3. (Color
Figure 3.3: Logical operations online)code.
on the surface A square 2D array
The depicted ofhas
setup data qubits,
distance withd =
5, i.e., X 5 data qubits
, such as CNOT errors,
along each side of the square. There are 41 data qubits in the setup and together with 40 measurement
boundaries on the left and right and Z boundaries on the top and
erate distinct qubits
patterns
for X and Z stabilizers. The difference leaves room for encoding one logical qubit. To perform
bottom. The array has 41 data qubits, but only 40 X̂ and Ẑ stabilizers.
measure-Z qubits.logical operations on this encoded qubit, a chain of operators must be applied across the square grid, as
A chain A
of product
X operatorschain X̂L = X̂the
1 X̂2two
X̂3 X̂4 X̂5 ofsides
X̂ operators connects the at
with X measurements two
e error signalsshown.
will be X boundaries,
connecting
commutes with
opposite
all the array stabilizers and changes
the boundaries
implement a logical X operation. Likewise, a chain of Z operators connecting the two opposite sides with
n space as well as in
X measurementsthe at array state from the quiescent state |ψ! to |ψFrom with et the
the boundaries implement a logical Z operation. X ! =Ref.
X̂L |ψ!
[Fowler al., 2012].
matched up to deduce same measurement outcomes as |ψ!. A second product chain ẐL =
h very high probability Ẑ6 Ẑ7 Ẑ3 Ẑ8 Ẑ9 connects the two Z boundaries and commutes with the
en the linearity of a array stabilizers; it changes the array state from |ψ! to |ψZ ! = ẐL |ψ!.
errors that occur are The operator chains X̂L and ẐL anticommute. A modification of
correct for all these the X̂L chain to the chain X̂L" = X̂1 X̂10 X̂11 X̂12 X̂3 X̂4 X̂5 generates a
by applying corrective quiescent state |ψX" ! = X2,10,11,12 |ψX !, related to |ψX ! by the result of
rements, as discussed the measurement X2,10,11,12 = ±1 of the encircled measure-X qubit
ray are not so sparse, (outlined in black).
22
032324-7
Chapter 4

Fast Quantum Algorithms

For this part of the notes, we follow Ref. [Nielsen and Chuang, 2011].

4.1 The Quantum Fourier Transform


You should all have seen the discrete Fourier transform (DFT) where a set of N complex numbers {x0 , . . . , xN −1 }
are Fourier transformed into N new complex numbers {y0 , . . . , yN −1 } according to
N −1
1 X i 2πmk
yk = √ e N xm .
N m=0

The Fourier transformation is extremely useful, e.g. to detect periods in a signal where {x0 , . . . , xN −1 }
could be the amplitude of some signal as a function of discretized time. The Fourier transformed signal
{y0 , . . . , yN −1 } then describes the frequency content. Solving the Schrödinger equation on a lattice, DFT is
what takes you between real space and momentum (k) space.
The quantum Fourier transform (QFT) is a unitary n-qubit operation, transforming the initial N = 2n
basis states {|0i, . . . , |N − 1i} into a new basis in a way which looks mathematically identical to the DFT,
N −1
1 X i 2πjk
|ji → √ e N |ki. (4.1)
N k=0
The action on an arbitrary state is
N
X −1 N
X −1
xj |ji → yk |ki,
j=0 k=0

where the amplitudes yk are the DFT transforms of the amplitudes xm . One may easily verify that the new
states are normalized and form an orthogonal set, and thus that the QFT is a unitary transform.
The QFT can be used to find periods and also to extract eigenvalues of unitary operators to a high
precision. But before discussing these issues in more detail, let’s see if we can find an effective implementation
of the QFT. Remember that there are operators that need exponentially many single- and two-qubit gates
for implementation, so what about the QFT?

4.1.1 Another definition


We’ll now rewrite the definition of the QFT in a way that is more transparent for constructing a circuit.
First we need a simple way to number the basis states. We simply number the n-qubit state |ji using the

23
binary n-bit representation of j = j1 2n−1 + j2 2n−2 + · · · + jn−1 21 + jn 20 . E.g. in the 4 qubit case the state
|5i = |0101i = |01 i|12 i|03 i|14 i.

Alternative notation: the binary fraction


Before starting, we introduce a notation that is used in NC to perform calculations on the quantum Fourier
transform and the phase estimation algorithms. However, in these notes we will make use of both notations
- the standard one and the one based on the binary fraction - leaving to the reader the choice of which one
is preferable.
The definition of binary fractions is the following:

0.j1 j2 j3 . . . jn = j1 /2 + j2 /22 + j3 /23 · · · + jn /2n ,

e.g. 0.101 = 0.5 + 0.125 = 0.625, and more generally

0.jl jl+1 . . . jm = jl /2 + jl+1 /22 + · · · + jm /2m−l+1 .

Using this notation we can write the QFT in Eq. (4.1) as

|0i + ei2π0.jn |1i |0i + ei2π0.jn−1 jn |1i . . . |0i + ei2π0.j1 j2 ...jn |1i
  
|ji = |j1 , j2 , . . . , jn i → . (4.2)
2n/2

Rewriting the output state of the quantum Fourier transform


The algebraic manipulations connecting the two expressions are straightforward, but need some afterthought.
Observing that
n
k k1 2n−1 kn 20 −1 −n
X
= + · · · + = k1 2 + · · · + k n 2 = kl 2−l , (4.3)
2n 2n 2n
l=0

from Eq.(4.1) we have:


N −1
1 X i 2πjk
|ji → √ e N |ki (4.4)
N k=0
1 1
1 X X Pn −l
= √ ··· ei2πj l=1 kl 2 |k1 . . . kn i
N k1 =0 kn =0
1 1 O n
1 X X −l
= √ ··· ei2πjkl 2 |kl i
N k1 =0 kn =0 l=1
n 1
1 O X i2πjkl 2−l
= √ e |kl i
N l=1 kl =0
n
1 O −l
= √ (|0i + ei2πj2 |1i)
N l=1
1 −1 −2 −n
= √ (|0i + ei2πj2 |1i)(|0i + ei2πj2 |1i) . . . (|0i + ei2πj2 |1i)
N
1
= √ (|0i + ei2π0.jn |1i)(|0i + ei2π0.jn−1 0.jn |1i) . . . (|0i + ei2π0.j1 ...jn |1i)
N

24
where in the last step we have used that

j2−n = j1 /2 + j2 /4 + . . . jn /2n = 0.j1 . . . jn (4.5)


...
−1
j2 = j1 2n−2 + j2 2n−3 + . . . jn /2 = j1 2n−2 + j2 2n−3 + · · · + 0.jn

and that the integer part of j · 2l disappears in the exponent since it is multiplied by 2π.

4.1.2 An efficient implementation


Using the form of the QFT in Eq. (4.2) it is straightforward to implement the desired transformation with
a quantum a circuit. We realize that we have to implement conditional phase shifts on each qubit, therefore
we define the single qubit operator " #
1 0
Rk = k .
0 ei2π/2
Now let’s see what happens an input state |j1 , j2 , . .√
. , jn i when it passes through√the circuit in Fig. 4.1. The
first Hadamard gate produces the state (|0i + |1i)/ 2 if j1 = 0 and (|0i − |1i)/ 2 if j1 = 1, i.e.
1   1
√ |0i + ei2πj1 /2 |1i |j2 , . . . , jn i = √ |0i + ei2π0.j1 |1i |j2 , . . . , jn i,

2 2

since ei2π0.j1 = ei2πj1 /2 = −1 for j1 = 1 and +1 otherwise. The controlled-R2 gate rotates the component
2 2
|1i of the first qubit by ei2π/2 if j2 = 1, i.e. it applies the phase ei2πj2 /2 . Therefore, it produces the state
1  2
 1
√ |0i + ei2πj2 /2 ei2πj1 /2 |1i |j2 , . . . , jn i = √ |0i + ei2π0.j1 j2 |1i |j2 , . . . , jn i.

2 2
After all the controlled-Rk operations on the first qubit, the state is
1  j1
 1
√ |0i + ei2π( 2 +...+ 2n ) |1i |j2 , . . . , jn i = √ |0i + ei2π0.j1 j2 ...jn |1i |j2 , . . . , jn i.
jn 
2 2
The Hadamard on the second qubit produces
j
  
√1 i2π ( 21 +...+ 2jn
n)
j
i2π 22
2
|0i + e |1i |0i + e |1i |j3 , . . . , jn i
= √122 |0i + ei2π0.j1 j2 ...jn |1i |0i + ei2π0.j2 |1i |j3 , . . . , jn i,
 
(4.6)

and the controlled R2 to Rn−1 gates yield the state


j j
  
√1 i2π ( 21 +...+ 2jn
n) i2π ( 22 +...+ n−1
jn
) |1i |j , . . . , j i
2
|0i + e |1i |0i + e 2
3 n
1
= √22 |0i + ei2π0.j1 j2 ...jn |1i |0i + ei2π0.j2 ...jn |1i |j3 , . . . , jn i.
 
(4.7)

We continue in this fashion for all qubits, obtaining the final state
    
√1 i2πj2−n i2πj2−(n−1) i2πj2−1
2
|0i + e |1i |0i + e |1i ... |0i + e |1i
1
= √2n |0i + ei2π0.j1 j2 ...jn |1i |0i + ei2π0.j2 ...jn |1i ... |0i + ei2π0.jn |1i .
  
(4.8)

We now need to reverse the order of the qubits, which can be achieved using a series of SWAP gates.
The number of gates needed are n on the first qubit plus n − 1 on the second and so on, adding up to
n(n + 1)/2 = O(n2 ) gates. Then we need on the order of n SWAP gates, not changing the scaling. Thus

25
we can implement the QFT for n qubits using on the order of O(n2 ) gates. The best classical algorithm
(FFT) needs O(n2n ) gates, indicating why the QFT could be used for speedup. This does not translate in an
immediate speed-up for computing classical FFT, because we cannot access the amplitudes when measuring
the Fourier-transformed quantum state, and we don’t even know how to efficiently prepare the input state
to be transformed. However, in the next section we’ll see one problem where the quantum Fourier transform
is useful.

Figure 4.1: An efficient circuit to perform the quantum Fourier transform. (From Nielsen & Chuang), Fig.
5.1. Here 0.j1 ...jn = j2−n , 0.j2 ...jn = j2−(n−1) , ..., 0.jn−1 jn = j/2−2 and 0.jn = j/2−1 .

4.2 Phase estimation


The aim of this algorithm is to estimate the angle in the eigenvalue ei2πφ corresponding to an eigenvector
|ui of a unitary operator U . The vector |ui is given, as well as a circuit (black box, oracle) effectively
implementing controlled-U n operations. A circuit solving a first stage of this problem is shown in Fig. 4.2
and an overview circuit showing the whole algorithm is given in Fig. 4.3. Two qubit registers are needed,
the first has t qubits which are initialized to zero. The number of qubits t is determined by the required
accuracy in the estimate of φ. The second register is large enough to represent the eigenvector |ui, and it is
also initialized to |ui and remains in this state throughout the computation. The initial set of Hadamard gates
puts all qubits of register 1 in an equal superposition of |0i and |1i. If the k-th control qubit is in the state
k 2 k k
|1i a unitary operation U 2 will be performed on the second register, picking up a phase ei2πφ = ei2πφ2 .
The first step (k = 0) gives for example:
1 1
CU √ (|0i + |1i) |ui = √ |0i + ei2πφ |1i |ui.

(4.9)
2 2
The final state of the first register in this first step is
2t −1
1  i2π2t−1 φ

i2π2t−2 φ
 
i2π21 φ

i2π20 φ
 1 X i2πkφ
√ |0i + e |1i |0i + e |1i . . . |0i + e |1i |0i + e |1i = √ e |ki.
2t 2t k=0
(4.10)
By comparison with (4.4) we see that this state is nothing else than the Fourier transform of the state
|2t φi = |φ1 φ2 . . . φt i, where in the last step we have assumed that the phase φ has an exact representation
in t bits as φ = 0.φ1 φ2 . . . φt (with a slight abuse of notation). The final step is hence to make an Inverse
quantum Fourier transform of the first register. This allows recovering the latter state. Register 1 is then
read out and φ is recovered. If the phase is not an exact binary fraction in t qubits there will be some finite

26
Figure 4.2: A circuit performing the first step of the phase estimation algorithm. (From Nielsen & Chuang),
Fig. 5.2.

Figure 4.3: An overview circuit figure of the phase estimation circuit. (From Nielsen & Chuang), Fig. 5.3.

probability of reading out some other state "close" to the best approximation. A careful analysis gives that
if we want n bits precision, with a failure probability less than ε, we need a register of size
  
1
t = n + log 2 + .

Phase estimation is interesting in its own right and may be used in quantum simulations. We will see how
it enters Shor’s algorithm for factoring.

4.3 Factoring - Shor’s algorithm


We now know how to efficiently determine the phase of an eigenvalue to a unitary operator. In this section
we’ll see how this enables us to efficiently solve a number theoretical problem which is considered hard on
classical computers: order finding. Finally we show how factoring can be reduced to order finding.

27
4.3.1 Modular arithmetics
Order finding is defined in modular arithmetics. Modular arithmetics is based on the fact that, given any
two positive integers x and n, x can uniquely be written as

x = k · n + r,

where k is a non-negative integer and 0 ≤ r < n is the reminder

x=r mod n,

as an example
2 = 5 = 8 = 11 mod 3.
The greatest common divisor gcd(a, b) of two integers a and b is the largest integer dividing both a and
b. If gcd(a, b) = 1 then a and b are called co-prime.

Multiplicative inverse
Now let’s look at modular multiplication by looking at the series

mk = k · a mod n, 0 < k < n.

As an example, take a = 6 and n = 15 with gcd(a, n) = 3 giving

mk = {6, 12, 3, 9, 0, 6, 12, 3, 9, 0, 6, 12, 3, 9},

showing that the equation x · 6 = y mod 15 has no solution for y ∈ {1, 2, 4, 5, 7, 8, 10, 11}. Note that in
particular it has no solution for y = 1. Then take a = 7 and n = 15 which are co-prime giving

mk = {7, 14, 6, 13, 5, 12, 4, 11, 3, 10, 2, 9, 1, 8},

showing that the equation x · 7 = y mod 15 may be solved for all y.


The multiplicative inverse a−1 of an integer a modulus n is another integer which fulfils

a−1 · a = 1 mod n,

and it exists if and only if a and n are co-prime. If the inverse exists we can solve the equation

x · a = c mod n

for all integers a, b and c through


x = c · a−1 mod n.
Another way of formulating this is the following: all the integers between 1 and (n − 1) appear once
and only once in {mk } if and only if a and n are co-prime. If not we can write a = x · gcd(a, n) and
n = y · gcd(a, n), where 0 < y < n. So for k = y we get

my = y · x · gcd(a, n) mod (y · gcd(a, n)) = 0,

and then the series {mk } will just repeat from the start.

28
4.3.2 Order finding
Consider the equation
xr = 1 mod N,
which has solutions for the integers x and N being co-prime, and x < N . The lowest positive integer r solving
the equation is called the order of x modulo N . One straightforward method to calculate r is to evaluate
the series mk = xk mod N for 0 < k < N , then it’s clear that the series {mk } is periodic with period r
since xr+a = xr · xa = xa mod N . In other words, the order is the period of the modular exponentiation
function mk = xk mod N . As an example let’s take x = 5 and N = 21 giving

mk = {5, 4, 20, 16, 17, 1, 5, 4, 20, 16, 17, 1, 5, 4, 20, 16, 17, 1, 5, 4},

and we have the order r = 6. There is no classical algorithm for finding r which scales polynomially in the
number of bits L needed to represent the input, i.e. the integers x and N .

4.3.3 Factoring as order finding


Factoring can be reduced to order finding as follows. Suppose we want to factor N = pq. Consider the
period r of the function of k defined as xk mod N for some x. x is chosen such that r is even and that
xr/2 6= N − 1 mod N . Then r allows to find p and q as follows. Define y = xr/2 . Then y 2 = xr = 1 mod N
by definition of period r. Therefore we have

y 2 − 1 = (y − 1)(y + 1) = 0 mod N.

Therefore(y − 1)(y + 1) is multiple of N , i.e. it contains the two factors pq in his decomposition. However
neither (y − 1) nor (y + 1) are multiple N :

(y − 1) 6= 0 mod N, because otherwise the period would be smaller than r. (4.11)


(y + 1) 6= 0 mod N, by construction. (4.12)

Therefore (y − 1) and (y + 1) must slip the two factors q and p appearing in the decomposition of N . Say
for instance:
(y − 1) = lp; (y + 1) = l0 q.
We finally obtain therefore

p = gcd(N, xr/2 − 1); q = gcd(N, xr/2 + 1).

In other words, determining the order r of the modular exponentiation function xk mod N yields the
determination of the two factors p and q such that N = pq. For n having L bits, this common factor can be
found using Euclid’s algorithm in O(L3 ) steps. For uniformly chosen x one may calculate a lower bound for
the probability of r being even and that y = xr/2 is non-trivial,
1
p(r is even and xr/2 6= −1 mod N ) ≥ 1 − ,
2m
where m is the number of different prime-factors in N , i.e. m ≥ 1.
In the following, we are going to derived an efficient quantum algorithm for order finding.

29
4.3.4 A quantum algorithm for order finding
Given an integer x of which we want to find the order mod N , consider the L-qubit unitary operation

|x · y mod N i, 0 ≤ y ≤ N − 1
U |yi ≡ .
|yi, N ≤ y ≤ 2L − 1

The unitarity follows since it basically permutes the basis states and y has a multiplicative inverse modulus
N since y and N are co-prime. The states
r−1
−i2πsk
 
1 X
|us i = √ exp |xk mod N i,
r r
k=0

defined for integers 0 ≤ s ≤ r − 1 are eigenstates of U , since


r−1 r
−i2πsk −i2πs(k − 1)
     
1 X 1 X i2πs
U |us i = √ exp |xk+1 mod N i = √ exp |xk mod N i = exp |us i,
r r r r r
k=0 k=1
nh io
−i2πs(r−1)
since xr = x0 mod N , and exp = exp i2πs
 
r r . Using the phase estimation algorithm we can
now efficiently determine s/r with high accuracy. One requirement is that we can implement the operators
k
U 2 efficiently, which can be done using a procedure known as modular exponentiation (see Box 5.2. on
page 228 in N & C), needing O(L3 ) gates. Furthermore we need to produce one or more of the eigenstates
|us i which is done by noting
r−1 r−1 r−1
−i2πsk
 
1 X 1 XX
√ |us i = exp |xk mod N i = |1i.
r s=0 r s=0 r
k=0

1
 
To have enough accuracy in the phase estimation we should use t = 2L + 1 + log 2 + 2ε qubits in the
first register and prepare the second register in the |1i state. We’ll then get the phase φ = s/r, for a random
0 ≤ s < r, with 2L + 1 bits precision, with a probability of at least (1 − ε). Knowing that the phase φ = s/r
is a rational number, where s and r are integers, not larger than L bits, we can classically determine s and
r. The appropriate algorithm is called the continued fraction expansion and needs O(L3 ) gates.

4.3.5 Performance
The algorithm fails if s = 0, and also if s and r have common factors so that they cannot be extracted
from s/r. The probability of failure can be shown to be small and one need only to repeat the procedure a
polynomial (in L) number of times to obtain r with high probability. The number of gates needed are O(L)
k
for the initial Hadamards, inverse Fourier transform needs O(L2 ) gates, implementing U 2 through modular
exponentiation requires O(L3 ) gates, and the classical continued fraction algorithm needs O(L3 ) (classical)
gates. If we need to repeat an O(L) number of times the overall scaling would be O(L4 ), but being more
clever there are ways to guarantee success in a constant number of attempts, giving the scaling O(L3 ) gates.

The algorithm
Find a factor of the composite L-bit integer N .

• 1. If N is even return the factor 2.

• 2. Determine whether N = ab , i.e. is if there is only one prime-factor. This can be done with O(L3 )
operations. If so return the factor a.

30
• 3. Randomly choose 1 < x < (N − 1), and check whether x and N are co-prime (O(L3 ) operations).
If not co-prime return the factor gcd(x, N ).

• 4. Find the order r of x modulo N , which can be done using O(L3 ) quantum gates (quantum subrou-
tine!!)

• 5. If r is even and xr/2 6= −1 mod N then compute gcd(xr/2 + 1, N ) and gcd(xr/2 − 1, N ) and check
if one is a non-trivial factor of N . Return this factor. If r is odd, or xr/2 = −1 mod N the algorithm
fails.

The algorithm will succeed with a probability larger than 3/4.

31
Chapter 5

Quantum machine learning

Quantum computing and machine learning are arguably two of the “hottest” topics in science at the moment.
Here in Sweden, this is reflected in the fact that the two largest programs supported by the Knut and Alice
Wallenberg foundation are centered around quantum technologies and artificial intelligence. In this chapter,
we will discuss efforts to combine the two fields into quantum machine learning. Since this is a course
about quantum algorithms, we do not cover applications of classical machine learning to simulating and
understanding quantum systems, but focus instead on how machine learning can be enhance by quantum
computation. We begin with a brief overview of classical machine learning and then study some examples
of quantum machine learning algorithms.

5.1 A brief overview of classical machine learning


What is machine learning? With the success of, and hype around, machine learning in recent years, there
are examples of companies and researchers calling many things “machine learning” that would have been
called something else a few years ago. According to Wikipedia, “Machine learning is the scientific study of
algorithms and statistical models that computer systems use to perform a specific task without using explicit
instructions, relying on patterns and inference instead”. We like the following definition (source unknown):
“Machine learning studies algorithms whose performance improves with data (‘learning from experience’)”.

5.1.1 Types of machine learning


Broadly speaking, there are three paradigms in machine learning for extracting meaning from some data:

• Unsupervised learning: Learning structure in P (data) given samples from P (data). Here, the
machine learning is used to generate knowledge by analyzing unlabelled data. Examples of unsupervised
learning are clustering (grouping data points), density estimation (estimating a probability density
function giving rise to the data), and much of what is called data mining.

• Supervised learning: Learning structure in P (labels|data) given samples from P (data, labels). Here,
the machine learning is used to generalize knowledge gained from studying labelled data to predict
correct labels for unlabelled data. Examples of supervised learning are foremost various classificiation
tasks, e.g., image recognition.

• Reinforcement learning: Learning from (possibly rare) rewards. Here, an agent learns by acting on
its environment, observing the results of its actions (the results can include rewards), and updating its

32
Figure 5.1: Structure of a feed-forward neural network and a single neuron in that network. From Ref. [Tani-
kic and Despotovic, 2012].

policy for actions based on the outcomes. Examples of reinforcement learning include the superhuman-
level game-playing programs for go, chess, StarCraft, etc. by Google’s DeepMind.

5.1.2 Neural networks


One common way to implement machine learning is using neural networks. The neurons in such a network
can be connected in different layouts. Figure 5.1 shows a feed-forward neural network, where some input
(data) is fed into the neurons in the input layer on the left. The output from these neurons becomes the
input for neurons in the next layer, and so forth. In the end, some result is defined by the output from the
last layer on the right. The layers between the input and output layers are called hidden layers.
The structure of a single neuron is shown on the left in Fig. 5.1. The inputs xj to neuron i are weighted
by weights wi,j . To the weighted sum of the inputs, a “bias” bi can be added. The result ai is fed into a
nonlinear activation function f , which usually is some smoothed version of a step function. The output is
then  
X
yi = f  wi,j xj + bi . (5.1)
j

A neural network can thus be said to be a complicated function, parameterized by all the weights and
biases in the network, which transforms an input into an output. What functions can a neural network
represent? The answer, provided by Cybenko in 1989 [Cybenko, 1989], gives a hint of why neural networks
are powerful. It turns out that any arbitrary smooth function with vector input and vector output can be
approximated to any desired precision by a feedforward neural network with only a single hidden layer. In
practice, deep neural networks, i.e., networks with many hidden layers, have turned out to be more efficient
at representing various functions. See, e.g., Ref. [Lin et al., 2017].

33
5.1.3 Training neural networks
To train a neural network to perform a specific task boils down to adjusting the weights w and biases b such
that the network approximates a function that solves the task. To do this, we first need to be able to say
how close the output of the current network is to the desired output. The difference between the actual and
the desired output is measured by some cost function. One example is the mean square error
1 X 2
C(w, b) = |y(x) − a| , (5.2)
2n x

where n is the number of examples x (data points), y is the desired output, and a is the actual output. To
find good weights and biases for the task is to find weights and biases that minimize C(w, b).
How does one minimize C in a smart way? Clearly, there are too many unknowns to simply find the
extremum by setting the gradient of C to zero and solving the resulting equation. Therefore, gradient descent
is used, with the update of the parameters being proportional to minus the gradient (the proportionality
constant is called the learning rate). However, the naïve approach to calculating the gradient of C is time-
consuming: to obtain each partial derivative of C the network would need to be run once for each data point
x and for each variable in w or b, to see how a small change in the input changes the output. One important
reason for the prevalence of neural networks today is that two tricks have been found that can reduce the
necessary calculations considerably.
The first trick is to use stochastic gradient descent, which means that the partial derivatives are not
calculated for all data points x in each step of the gradient descent, but only for a subset, a mini-batch, of x.
The next step uses another subset, and so on until all subsets have been used (marking the end of an epoch
of training), whereupon the selection of mini-batches starts over. The use of stochastic gradient descent will
only give an approximation to the actual gradient, but if the mini-batches are large enough and the learning
rate is low enough, this approximation is usually sufficient.
The second trick is to calculate the partial derivatives not one by one by running the network many times,
but by using back-propagation. Essentially, back-propagation allows one to calculate all partial derivates by
running the network once, noting the result for each neuron, finding the derivative of the cost function with
respect to the output from the last layer, and then applying the chain rule repeatedly, going backwards
through the network to extract each partial derivative. For more details on how this works, see, e.g.,
Ref. [Nielsen, 2015].
With the network having so many parameters, often orders of magnitude more than the number of
training examples, a valid concern is whether the training will result in overfitting. Over the years, various
strategies have been developed to counter this, but that is beyond the scope of this short introduction to
classical machine learning.

5.1.4 Reasons for the success of classical machine learning


In explanations of the current success of classical machine learning, three factors are usually brought forward:

• Data: There are now a large number of huge data sets that can be used for training of neural networks.

• Computational power: We now have much more powerful, and increasingly custom-built, hardware
to run machine-learning algorithms on.

• Algorithms: Clever algorithms like back-propagation, which enable efficient training, together with
a number of other clever tricks discovered in the past decade or so, have led to large improvements.

34
5.2 Quantum machine learning using qBLAS
To see how quantum computers can aid or enhance machine learning, a first entry point is to note that many
machine learning algorithms rely heavily on linear algebra. For classical computation of linear-algebra oper-
ations, there are optimized low-level routines called basic linear algebra subprograms (BLAS). For quantum
computers, there are several algorithms that deal with linear-algebra problems. Together, these algorithms
are sometimes referred to as quantum BLAS (qBLAS).
Some examples of qBLAS are the HHL algorithm for solving systems of linear equations [Harrow et al.,
2009], the quantum Fourier transform, and quantum phase estimation for finding eigenvalues and eigenvec-
tors. All of these examples have exponential speed-ups compared to their classical counterparts. However,
it is important to “read the fine print” [Aaronson, 2015] for these algorithms. They all rely on the problem
being encoded in a quantum random access memory (QRAM). In a QRAM, data is encoded in the probabil-
ity amplitudes of a large superposition state. For example, a vector b with n entries can be stored in log2 n
P
qubits as j bj |ji, where bj are the entries in the vector and |ji are the computational basis states of the
qubits.
The problem with the QRAM is that no efficient way is known to encode the data in the QRAM in the
first place. The time it takes to encode the problem can therefore negate the exponential speed-up from the
qBLAS algorithms. This is sometimes called the input problem. There is also an output problem: the output
of the qBLAS algorithms is not necessarily the direct answer sought, but a state which lets you sample
properties of the answer. For example, solving the system of linear equations Ax = b does not give the
solution vector x as an easily measurable output, but just enables sampling properties of x.

5.3 Quantum support vector machines


5.3.1 Support vector machines
We will now look at an example of a classic machine-learning problem that can be tackled with quantum
algorithms: support vector machines (SVMs. A support vector machine is a classifier that divides data
points into categories based on some boundary (a hyperplane) in space. To train an SVM is to show it
labelled data points such that it can identify the optimal boundary. New unlabelled examples can then be
classified by checking on which side of the boundary they fall. This is illustrated in Fig. 5.2. The green
line (H3 ) does not separate the two categories of points (black and white). The blue and red lines (H1 and
H2 ) both separate the points correctly, but the red line (H2 ) is optimal, since the distance between H2 and
the nearest points in the training data is maximized. The points nearest the hyperplane thus define the
hyperplane. These points are called support vectors.

5.3.2 Classical computation


Mathematically, the problem can be formulated as follows. We are given a set of data points xj with labels
yj ∈ {−1, 1}. We want to find the equation w · x − b = 0 defining the optimal separating hyperplane. Here,
w is the normal vector to the hyperplane, normalized such that the closest points in the two classes lie on
the hyperplanes w · x − b = ±1. The distance from the optimal separating hyperplane to the closest point is
2
1/|w|. Since we wish to maximize this distance, we should minimize |w|, or, equivalently, 21 |w| , under the
constraints yj (w · xj − b) ≥ 1 ∀j.
To perform this minimization under constraints, we introduce Lagrange multipliers, forming the La-

35
Figure 5.2: Illustration of support vectors. From Wikipedia.

grangian
1 2
X
L(w, b, λ) = |w| − λj [yj (w · xj − b) − 1]. (5.3)
2 j

Setting the partial derivatives of L with respect to λj equal to zero gives the constraints (the λj not corre-
sponding to support vectors will become zero). Setting the partial derivative of L with respect to w to zero
leads to X X
0=w− λ j yj x j ⇒ w = λj yj xj , (5.4)
j j

so we see that w will be determined by the support vectors. Finally, we also use
∂L X
0= = λj yj , (5.5)
∂b j

and substitute these results back into the Lagrangian to obtain


X 1X
λj − λi λj yi yj xi · xj . (5.6)
j
2

The objective now is to find λj that maximize this expression under the constraint given by Eq. (5.5).
In many cases, the separation between two classes of data points cannot be parameterized as a simple
hyperplane, as illustrated to the left in Fig. 5.3. The solution commonly used is then to transform the data
points to a feature space that admits a hyperplane as a separator. This is encoded by a kernel function
K(xi , xj ), which then replaces the dot product in Eq. (5.6).

36
Figure 5.3: Illustration of a kernel for a support vector machine. From Wikipedia.

With this addition, and some further work, the maximization problem in Eq. (5.6) can be shown to lead
to the following system of linear equations:
    
0 1 b 0
= , (5.7)
1 K λ y

where the ones are 1 × M row and column vectors (M is the number of data points) and the entries in the
M × M matrix K are given by Kij = K(xi , xj ).
We can now estimate the time it takes to find the support vectors on a classical computer. If the data
points xj ∈ RN , calculating one entry in K takes O(N ) time, so calculating all of K takes O(M 2 N ) time.
Solving the system of linear equations takes O(M 3 ) time, so in total the classical computer will require
O(M 2 [N + M ]) time.

5.3.3 Quantum computation


As we saw at the end of the previous subsection, the problem of SVMs boils down to two costly computations:
calculating the entries in the matrix K and solving the system of linear equations in Eq. (5.7). Quantum
algorithms can be applied to both these computations [Rebentrost et al., 2014].
To calculate the dot product xi · xj , we assume that |xi | and |xj | are known. We then use that
2 2 2
|xi | + |xj | − |xi − xj |
xi · xj = , (5.8)
2
2
which reduces our problem to finding the distance |xi − xj | . To find this distance, we first construct the
two states
1
|ψi = √ (|0i |xi i + |1i |xj i), (5.9)
2
1
|ϕi = q (|xi | |0i − |xj | |1i). (5.10)
2 2
|xi | + |xj |

Note that we require QRAM to construct the state |xi i. Next we q do a “swap test” (see Fig. 5.4) on |ϕi
2 2
and the ancilla qubit in |ψi. The distance we seek is then given by |xi | + |xj | times the probability of
measuring 1 in the swap test. The computational complexity for this distance calculation is O(log N ).

37
Figure 5.4: The quantum circuit for a swap test. Figure from Peter Wittek. The resulting state of the system
before the measurement is 12 [|0i (|f i |f 0 i + |f 0 i |f i) + |1i (|f i |f 0 i − |f 0 i |f i)]. This means that the probability
of measuring 1 becomes zero when f = f 0 .

To solve the system of linear equations, we need to invert the matrix


 
0 1
F = . (5.11)
1 K

Briefly, this is done by approximating exp{−iF t} (which is not trivial, since K is not sparse), and then using
quantum phase estimation to extract eigenvalues and eigenvectors. These eigenvalues, together with y in
the eigenbasis, lets us construct the solution state
X
|b, λi ∝ b |0i + λj |ji . (5.12)
j

The complexity for this part of the computation is O(log M ). The total complexity for the quantum SVM
is thus O(log N M ).

5.4 Quantum principal component analysis


The methods applied in the quantum approach to SVMs can also be used in other machine-learning problems.
For example, the algorithm for distance calculation can be applied to clustering. Another example is principal
component analysis (PCA), where the second part of the quantum SVM algorithm can be re-used. In PCA,
an unlabelled set of (high-dimensional) data points xj is analyzed to find which are the axes along which the
data varies the most (and which thus are most useful for classification). A nice example of PCA for physics
researchers is Paperscape (paperscape.org), which takes data from all papers on the arXiv and uses PCA to
show the relations between the papers on a two-dimensional map.
A quantum algorithm for PCA [Lloyd et al., 2014] uses the matrix exponentiation and phase estimation
from the quantum algorithm for SVMs to find the largest eigenvalues and eigenvectors of the covariance
matrix j xj xTj . Those eigenvectors are the principal components. We note here that there is recent work
P

on “quantum-inspired” algorithms for PCA [Tang, 2018], where the methods from the quantum algorithm
have been adapted to find an improved (in the scaling of some parameters) classical algorithm.

38
1
2
3

. .
. . . . .
. .
n
n+1

Figure 5.5: A quantum feedforward neural network. Figure from Ref. [Farhi and Neven, 2018].
Figure 1: Schematic of the proposed quantum neural network on a quantum processor. The
input state | , 1i is prepared and then transformed via a sequence of few qubit unitaries
5.5Ui (✓Quantum
i ) that dependneural networks
on parameters ✓i . These get adjusted during learning such that the
measurement of Yn+1 on the readout qubit tends to produce the desired label for | i.
In this final section, we discuss quantum versions of neural networks. This is a quite new and rapidly
developing field, so it is possible that these notes can become outdated fast. We therefore only try to give a
few On the readout
examples qubit
and discuss we general
some then measure
propertiesa Pauli operator,
of quantum neuralsaynetworks.
y , which we call Yn+1 . This
gives a +1
Although or a 1.
combining Our goal
quantum is to make
computing the measurement
and neural networks sounds outcome correspond
interesting, given howto they
the are
bothcorrect labelcomputing
promising of the input string, it
paradigms, that is, l(z).
is not Typically to
straightforward thedomeasurement
so. There areoutcome
certainlyispotential
not
upsides to quantum
certain. neural networks:
Our predicted label value theyis can
the work with quantum
real number between data,1oranda compact
1, representation of
classical data, and there may be quantum algorithms that can speed up training. However, there are several
potential downsides or questions. For example, ~✓ )Yinput
hz, 1| U † (the (~✓ ) |z, 1i
n+1 Uproblem (5)
may be a factor here also. Furthermore,
classical neural networks require nonlinear activiation functions, but quantum mechanics is linear. Also,
whichneural
classical is the averagehave
networks of the observed
a large numberoutcomes
of neuronsifand Yn+1manyis measured
connectionsin multiple
between copies
them, ofmay
which
(4).
be challenging for NISQ devices with few qubits and limited connectivity. Another question is whether back-
Our goal
propagation can beis implemented
to find parameters ✓~ so that
in a quantum the network,
neural predictedsincelabel is nearback-propagation
classical the true label. requires
We
will address
measuring the question
the output of whether
at each point such parameters
in the network, something thatevenwould
exist destroy
(representation)
a quantum as well as
superposition.
the question of whether such optimal parameters can then be efficiently found (learning).
For aQuantum
5.5.1 given circuit, that is, a choice
feedforward of L unitaries,
neural networks ~ and an input
and a set of parameters ✓,
string z, consider the sample loss
In 2018, Farhi (yes, the same Farhi that proposed the QAOA) and Neven proposed an architecture for a
quantum feedforward neuralloss( network
~ z) =
✓, [Farhi
1 andl(z)Neven,
hz, 1| U2018],
† ~ as depicted in Fig.
( ✓ )Yn+1 U (~✓ ) |z, 1i . 5.5. Here, instead of(6)
layers
of neurons, there are layers of quantum gates, which act on an n-qubit input state |ψi and an ancilla qubit
prepared
Note in |1i.the
that Instead
sampleof loss
weights
we and biases
use is in in
linear a classical
the marginneural networks,
(the product weofnow
thehave and the ϑj
labelparameters
parameterizing these gates; the action of the whole quantum circuit is a unitary operation
predicted label value) and its minimum is at 0 (not minus infinity) because the predicted U (ϑ). At the end,
the label
ancillavalue
is measured
is automatically Y basis). to
(here, in thebounded Thebeoutcome
betweenof -1theand
measurement
1. Suppose is used
thattothe
classify the input
quantum
state, giving it one of two possible labels.
neural network is working perfectly, so that for each input z, the measurement always gives
To train this quantum neural network for its classification task, the authors propose using a loss function

C(ϑ, z) = 1 − l(z) z, 13 U † (ϑ)Yn+1 U (ϑ) z, 1 , (5.13)

where z is the input and l(z) is a label function giving the correct label (±1). This cost function is zero if
the network spits out the correct classification, and greater than zero otherwise. The authors use stochastic

39
gradient descent to find parameters ϑ that minimize this cost function. However, no back-propagation is
used (since this does not seem to be applicable to this architecture, as discussed above). Furthermore, to
evaluate the gradient, multiple runs of the quantum circuit are required for each partial derivative, since
one needs to collect enough statistics to find the expectation value of the output with sufficient precision.
The authors point out that a possible advantage of the quantum network is that the form of the unitary U
guarantees that the gradient does not “blow up”, which can be a problem in some classical machine-learning
algorithms.
Since the quantum feedforward neural network lacks a nonlinear element, one can ask whether it has the
ability to represent any label function. The authors show that the network indeed has this ability, but some
label functions may require an exponential circuit depth.

5.5.2 Quantum convolutional neural networks


Another recent proposal [Cong et al., 2019] for a quantum neural network is a quantum version of a convolu-
tional neural network (CNN), illustrated in Fig. 5.6. Convolutional neural networks are often used for image
recognition. As depicted in Fig. 5.6(a), a classical CNN consists of convolutional layers (C) that essentially
scan a filter across the image, pooling layers (P) that reduce the size of the feature map produced by the
convolution, and a final part with fully connected layers (FC) that do classification based on the features
extracted in previous layers. In the quantum version [Fig. 5.6(b)], the filter in convolutional layer number
j is replaced by a two-qubit unitary operation Uj , which is applied to all pairs of neighbouring qubits. The
pooling layer number k is replaced by measuring half of the qubits and applying a unitary single-qubit oper-
ation Vk to the remaining qubits, conditioned on the measurement outcome for the neighbouring qubit. This
operation conditioned on measurement has the added benefit of adding a nonlinearity to the setup. Finally,
the fully connected layer is replaced by a quantum circuit similar to that proposed by Farhi and Neven, as
discussed in the preceding subsection. Similar to that proposal, the training here is also done by gradient
descent without any back-propagation.

5.5.3 Quantum Boltzmann machines


The neural networks that have been most frequently studied in some quantum form are Boltzmann ma-
chines [Amin et al., 2018], which come in a few different architectures, as shown in Fig. 5.7. A Boltzmann
machine has hidden (h) and visible (v) neurons, which take binary values (0 or 1). The aim of training a
Boltzmann machine is to make the states of its visible neurons follow a probability distribution P (v) that
mimics that of the training data. This probability distribution is given by
1 −E(v,h)
P (v, h) = e , (5.14)
Z
where Z is a normalization factor (partition function) and the energy function is, for the case of a restricted
Boltzmann machine (Fig. 5.7 center),
X X X
E(v, h) = ai vi + bj hj + vi Wij hj , (5.15)
i j i,j

with ai and bj biases, and W a weight matrix. The form of the probability distribution (a Boltzmann
distribution) explains the name Boltzmann machines.
To train the network, a cost function
X
C(a, b, W ) = − Pdata (v) log P (v) (5.16)
v

40
phase, and demonstrate that a QCNN trained on a set of ex-
INTRODUCTION machines (BMs). BMs consist of binary chanics could exp
he phase diagram over the entire parameter regime. Finally,
Quantum machine learning is at the neurons divided into visible and hidden better than classica
ons of QCNN are discussed. units, represented by vectors v and h re- Amin et al. [4]
crossroads of two of the most exciting
current areas of research: quantum com- spectively (see Fig. 1). To every assign- and take the so-c
puting and classical machine learning. Al- ment of the visible and hidden neurons, Ising Hamiltonian
s has re- though the field is stillC in its infancy, a BM associates a joint probability tion. This increa
P C P FC
the certain difficultie
practical (a) body of literature is already large P (v, h) = e
1 − E (v,h)
, transverse field te
enough to warrant several review arti- j Z
ealization cles [1–3]. This short survey focuses on quantum Boltzma
riguing to a selection of significant recent results where Z is a normalization factor,
= E(v, h) non-trivial. Althou
( )
on the subtopic of quantum neural inet- is a quadratic energy function of v and and performance
to further (
( ) (
h,
1) ( ) ( +
1)
Cat Dog
1) and there is a set W of real-valued well understood,
works, an area that hopes to build on
ems6 . In the enormous impact that classical neu-
( + (1) 1)
( + 1)
coefficients in the energy function that examples the auth
study of ral are weight parameters to be learned. The QBMs can learn d
(b)networks have had. In particular, we training process consists of using meth- than classical BMs
eme com- concentrate on quantum generalizations
ical anal- of popular neural-network concepts such ods such as gradient descent to optimize Kieferova and
as Boltzmann machines, generative ad- the weights so as to
! maximize the likeli- formalism of Am
versarial networks and autoencoders, as hood that P(v) = h P(v, h) reproduces more general Ham
s between in ⇢
well as applications of classical neural net- the statistics of observed F data. the transverse Ising
works to quantum information and vice eral believed to b
ems have simulatable. Furth
versa. Quantum Boltzmann machines
atural to new methods fo
The energy-based nature of BMs gives a for parameters co
e used for natural framework for considering quan-
complex BOLTZMANN MACHINE NEURAL tum generalizations of their behavior. tum terms to be
QBM enables a f
s intrigu- NETWORKS The BM energy function is equivalent to tomography that c
(c)Illustrations
The
ed to fur- Figure 5.6: neural networks that have received the Hamiltonian of a simple Ising model data, and also ge
of (a) a classical convolutional neural network and (b) a quantum convolutional
6,11 neural network.recent
most Figure from Ref.from
attention [Congaetquantum
and one might hope that more general reconstructed sta
blems . al., 2019].
perspective are known as Boltzmann Hamiltonians allowed by quantum me- numerical evidenc
= (MERA)
any-body
QCNN

MERA

proposals may pro


n (QPR). h1 h1 h5
classical BMs, alt
h1 to overfit data m
P and a ν1 ν1 ν1= QEC (QCNN)
further.
) ground h2 h2 h2 h6
P. While ν2 ν2 ν2
mmon ma- h3 h3 h7 Quantum algor
n, the di- Figure 1: (a) Simplified h3 ν3
illustration of CNNs.
ν3
A sequence of
challeng- image ν3
processing layers—convolution (C), pooling (P), and classical Boltz
ν4 The practical value
fully connected (FC)—transformsh4an input imageh4into a series h8
uch as su- cal or quantum, re
of(a)feature mapsBM
Fully connected (blue rectangles),
machines.(b) RBM and finally into
A fully connected Boltzmann(c) DBM anwithoutput with which they ar
pplication both between
Figure 5.7: Layouts
probability
for Boltzmann
the hiddendistribution
Left:
(purple
and visible layers, and within thebars).
hidden and(b)visible
QCNNs
machines,
inherit
layers. Center:
connections
a depends on sampl
A restricted
articular, Boltzmann
Figure
similar 1. Examples
machine, with of
layered Boltzmann
connections machines.
structure.
only going Inbetween
(c) a fully
QCNN connected Boltzmann
and
the hidden andMERA machine,
visible connections
share
layers. Right:the
A deep
Boltzmann distrib
existmachine,
put state Boltzmann between with
all visible and hidden
multiple hidden neurons. In a restricted
layers. Figure from Ref.Boltzmann
[Allcock machine (RBM),
and Zhang, there are
2019].
same circuit structure, but run in reverse directions. can be done appr
no connections between neurons of the same type. A deep Boltzmann machine (DBM) consists of
ow to effi- adding one or more additional layers of hidden neurons to an RBM, with no connections between
special case of rest
e without neurons in the same layer. chines (RBM), i.e.
41
thermore, (CNNs) in which the wavefunction | G i is given directly
on of neu- as input to the quantum circuit. To demonstrate the
se transi- success of our model for QPR, we theoretically analyze
is minimized using gradient descent, often aided by an efficient scheme for sampling from the training data.
It is unclear if quantum algorithms can be used for speeding up training of classical Boltzmann machines.
To construct a quantum version of a Boltzmann machine, the neurons are replaced by spins and the
energy function is given by a Hamiltonian for the spin system. It is believed that such a quantum Boltz-
mann machines can represent some probability distributions, particularly ones arising in quantum systems,
better than classical Boltzmann machines can (note that classical Boltzmann machines already are used for
simulating some quantum systems well). However, the scaling properties and other performance metrics for
quantum Boltzmann machines are still unclear. Furthermore, it seems that training of a quantum Boltzmann
machine can be more complicated than the training of a classical one [Amin et al., 2018]. A potential upside
is that quantum annealers could be used to implement quantum Boltzmann machines.

42
Chapter 6

Measurement-based quantum
computation

The circuit model introduced and discussed in the previous two chapters is not the only way to perform
quantum computation. In this and the next chapter, we will look at two other approaches: measurement-
based quantum computation (MBQC) and adiabatic quantum computation. We will also see some additional
examples later in the course; MBQC will resurface when we discuss quantum computation with continuous
variables.
Measurement-based quantum computation was first proposed by Raussendorf and Briegel in 2001 [Raussendorf
and Briegel, 2001]. More details can be found in the follow-up paper in Ref. [Raussendorf et al., 2003]. The
first experimental demonstration of all basic components of MBQC was performed by the Zeilinger group in
2005 [Walther et al., 2005].

6.1 The basic idea of MBQC


In the circuit model of quantum computation, we prepare an initial state, apply a sequence of quantum gates
to this state, and finally do some measurement on the output state to obtain the result of the computation.
While this seems a natural order of operations, which agrees with our intuition for how a classical computation
is performed, it is possible to mix up the order somewhat.
In MBQC, we first prepare a certain entangled state of qubits, where a subset of these qubits represent
the input state. This entangled state is the resource for our computation. In the rest of the computation, we
never need to apply any multi-qubit operations. All we do is apply single-qubit rotations and single-qubit
measurements in some sequence, where later rotations are conditioned on earlier measurement results. The
output state will be encoded in a subset of the qubits (different from the subset that encoded the input),
and a final measurement can be performed on this output state to read out the result of the computation.
Because of the presence of measurements in earlier steps of the computation, MBQC differs from the
circuit model for quantum computation in one important aspect: it is not reversible. In the circuit model,
we apply a large unitary transformation (decomposed into a sequence of gates) that takes us from the input
state to the output state. Before any measurement is done on the output state, this unitary transformation
could be reversed to bring us back to the input state. However, since measurements are done before we reach
the output state in MBQC, that computation cannot be reversed. For this reason, MBQC is often referred
to as one-way quantum computation.

43
6.2 The details of MBQC
6.2.1 Definition of the possible operations
Let us first recall from Chapter 1 some of the operations we can perform on single and multiple qubits.

Single-qubit Clifford transformations


Clifford operations C are the unitary operations which map Pauli group operators Σ to Pauli group operators
Σ 0 under conjugation, i.e., CΣC † = Σ 0 [Horodecki et al., 2006].

H, Rz (π/2) = Zπ/2 universal set for single-qubit Clifford operations

Multiqubit Clifford transformations

The addition of any non trivial two-qubit Clifford gate, e.g., CZ = |0ih0| ⊗ I + |1 |1ih1| ⊗ Ẑ, combined with
the set of single-qubit operations above allows for any general multi-qubit Clifford operation.

H, Uz (π/2) = Zπ/2 , CZ universal set for multi-qubit Clifford operations

These operations are enough to perform some algorithms like error correcting codes, but no algorithm which
could not be efficiently simulated on a classical computer [Horodecki et al., 2006].

Single-qubit universal transformations


A general single-qubit transformation can be decomposed as Rz (γ)Rx (β)Rz (α). To implement any such
arbitrary operation with arbitrary accuracy, it is sufficient to be able to perform the Clifford operations
together with any non-Clifford operation provide, e.g.,

H, Zπ/2 , Zπ/4 universal set for single-qubit quantum computation

Multi-qubit universal transformations


If we have universal control of single qubits, adding a non-trivial (entangling) two-qubit gate to the gate set
is enough to achieve universality for multiple qubits.

H, Zπ/2 , Zπ/4 , CZ universal set for multi-qubit quantum computation

6.2.2 Preparing the initial state


We first consider a setup with two qubits. One qubit contains
√ the initial state that we want to process:
|ψi = α |0i + β |1i. The other qubit is prepared in |+i = 1/ 2(|0i + |1i), e.g., by applying a Hadamard gate
to an initial state |0i. We then apply a CZ gate between the two qubits, obtaining
h i 1

CZ(|ψi ⊗ |+i) = |0ih0| ⊗ 1 + |1ih1| ⊗ Ẑ (α|0i + β|1i) ⊗ √ (|0i + |1i)
2
 
1 1
= α|0i ⊗ √ (|0i + |1i) + β|1i ⊗ √ (|0i − |1i)
2 2
= α|0i|+i + β|1i|−i (6.1)

44
6.2.3 Measurements and their effect
The next step is to measure the input qubit in some rotated basis. Measuring in a rotated basis with angles
(ϑ, ϕ) is like measuring Rz (ϕ + π/2)Rx (ϑ)ZRx (−ϑ)Rz (−ϕ − π/2). Here we measure the first qubit with
ϑ = π/2 and an arbitrary ϕ, i.e., we measure the observable

σ̂ϕ ≡ Rz (ϕ + π/2)Rx (π/2)ZRx (−π/2)Rz (−ϕ − π/2)


= cos ϕX + sin ϕY
= e−iϕ |0ih1| + eiϕ |1ih0| = |ϕ+ ihϕ+ | − |ϕ− ihϕ− |, (6.2)

where we have used Eqs. (1.3)-(1.7) and introduced the rotated measurement basis |ϕ± i = 1/ 2(|0i±eiϕ |1i).
To obtain the result, note that conversely |0i = √12 (|ϕ+ i+|ϕ− i) while |1i = √12 e−iϕ (|ϕ+ i−|ϕ− i), and rewrite
the state:
α β
α|0i|+i + β|1i|−i = √ (|ϕ+ i + |ϕ− i)|+i + √ e−iϕ (|ϕ+ i − |ϕ− i)|−i
2 2
   
α β −iϕ α β −iϕ
= |ϕ+ i √ |+i + √ e |−i + |ϕ− i √ |+i − √ e |−i .
2 2 2 2

Hence a measurement of the operator in Eq. (6.2) projects the second qubit into:

• |ψiout ∝ (α|+i + βe−iϕ |−i) if the outcome is 1 (m = 0)

• |ψiout ∝ (α|+i − βe−iϕ |−i) if the outcome −1 (m = 1)

The state of the second qubit can then be compactly written as

|ψiout = X m HRz (−2ϕ)|ψi. (6.3)

This is readily verified since

X m HRz (−2ϕ)(α|0i + β|1i) = X m H(αeiϕ |0i + βe−iϕ |1i)


= X m (αeiϕ |+i + βe−iϕ |−i), (6.4)

giving the result. The extra Pauli operator X m depends on the outcome of the measurement on qubit 1 and
is said to be a “by-product” operator. It can be compensated for by choosing the measurement basis of the
following steps in the computation (thus introducing, in general, time-ordering). In the following we rename
−2ϕ → ϕ.

6.2.4 Universal single-qubit operations


We can repeat the preceding protocol three times by using four qubits, first using qubit 1 as input and qubit
2 as output, then using qubit 2 as input and qubit 3 as output, and so on. The final output state (qubit 4)
then becomes

|ψiout = X m3 HRz (ϕ3 )X m2 HRz (ϕ2 )X m1 HRz (ϕ1 )|ψi


= HZ m3 Rz (ϕ3 )HZ m2 Rz (ϕ2 )HZ m1 Rz (ϕ1 )|ψi
= HZ m3 Rz (ϕ3 )(HZ m2 H)(HRz (ϕ2 )H)Z m1 Rz (ϕ1 )|ψi
= HZ m3 Rz (ϕ3 )X m2 Rx (ϕ2 )Z m1 Rz (ϕ1 )|ψi
= X m3 Z m2 X m1 HRz ((−1)m2 ϕ3 )Rx ((−1)m1 ϕ2 )Rz (ϕ1 )|ψi, (6.5)

45
where in the first step we used that X m H = HZ m , in the third that HZ m H = X m and later on that
XRz (ϕ) = Rz (−ϕ)X and ZRx (ϕ) = Rx (−ϕ)Z.
Since the most general rotation of a single qubit can be decomposed as Rz (γ)Rx (β)Rz (α), the above
steps lets us implement any single-qubit operation. For the result to be deterministic, we can perform
the measurements choosing the basis sequentially, depending on the preceding measurement outcomes, as
ϕ1 = α, ϕ2 = (−1)m1 β, ϕ3 = (−1)m2 γ. The Pauli corrections remaining at the end of the computation are
not important and never need to be physically applied; they can be accounted for in the final interpretation
of the result (classical post-processing).
If we want to implement a Clifford unitary, by definition CΣ = Σ 0 C meaning that interchanging the
order of Clifford operators and Pauli matrices will leave the Clifford operator unchanged. This means that
there is no need to choose measurements adaptively.

6.2.5 Cluster states as a resource


In the example above, the four qubits were entangled through CZ gates between nearest neighbours. These
CZ gates could all have been done before the start of the computation (when qubit 1 was in the input state
and qubits 2–4 were in the state |+i). This entangled initial state would then be a one-dimensional cluster
state, which is the resource enabling the rest of the computation being carried by just measurements and a
limited set of single-qubit operations.
More generally, a state like this which allows any input state and any unitary transformation on that
input (using only single-qubit rotations and measurements) is said to be a universal resource. It has been
shown that a square lattice graph (a cluster state) with unit weights is a universal resource for quantum
computation. Despite this fact, depending on the specific kind of computation other graphs than a square
lattice could be more suitable for implementing the computation [Horodecki et al., 2006].

6.2.6 Two-qubit gates


For universal quantum computation, we need some two-qubit gate in addition to the arbitrary single-qubit
rotations that we constructed above. Such two-qubit gates, e.g., the CZ and CNOT gates, can be constructed
in a two-dimensional cluster state where two input qubits are entangled with a few other qubits. By a series
of single-qubit measurements and rotations, we can end up with two of the other qubits representing the
output state corresponding to the two-qubit gate having acted on the input state. The example of the CNOT
gate will be demonstrated during the first tutorial of the course.

6.3 Universality and efficiency


From the previous section, it is clear that we can implement arbitrary single-qubit rotations and a two-qubit
gate (e.g., CZ or CNOT), which together form a universal gate set for quantum compuation (cf. Chapter 1).
Thus, MBQC can implement universal quantum computation.
However, it is clear that MBQC in general requires more qubits than the circuit model does. It is also
necessary to compare the number of qubit rotations and measurements needed in MBQC and in the circuit
model. If MBQC would turn out to simply be an inefficient reformulation of the circuit model, it would
not be useful. Fortunately, it turns out that the overhead of MBQC is polynomial in all these resources, so
MBQC is an efficient paradigm for quantum computation.
The efficiency of MBQC opens up new avenues for practical implementations of quantum computation.
There may be physical systems where it is hard to implement single two-qubit gates between chosen qubits

46
at chosen times, but where it is easier to create a large entangled state in one big operation and then only
perform single-qubit operations.
As an aside, we note that MBQC usually has been considered for two-dimensional cluster states. How-
ever, by working with cluster states in three dimensions, the quantum computation can be made fault-
tolerant [Briegel et al., 2009].

47
Chapter 7

Adiabatic quantum computation

The MBQC that we considered in Chapter 6 was, although distinct from the circuit model for quantum
computation, still rather similar to that circuit model. The paradigm of quantum computation that we
explore in this chapter, adiabatic quantum computation (AQC), is further removed from the circuit model
than MBQC. The idea for AQC was evolved around the turn of the millenium. A recent extensive review of
the topic is Ref. [Albash and Lidar, 2018], from which we quote several of the following considerations.

7.1 The basic idea of AQC


Adiabatic quantum computation is based on the adiabatic theorem (we give a proof for this theorem in
Sec. 7.5). The theorem states that a system which starts in a non-degenerate ground state of a time-
dependent Hamiltonian Ĥ(t) that is slowly changing from some initial form Ĥ0 to some final form Ĥ1 ,
during time τ , will remain in its instantaneous ground state throughout the evolution, provided that the
Hamiltonian varies sufficiently slowly. Assuming a linear time dependence the AQC Hamiltonian can be
written as  
t t
Ĥ(t) = 1 − Ĥ0 + Ĥ1 , (0 ≤ t ≤ τ ), (7.1)
τ τ
where the coefficient in front of Ĥ0 changes from unity to zero, and the coefficient in front of Ĥ1 changes from
zero to unity. Moreover, it is crucial that Ĥ0 and Ĥ1 are two noncommuting Hamiltonians (see Sec. 7.4).
If the initial Hamiltonian Ĥ0 is such that its ground state is easy to construct, it is easy to initialize the
system in this ground state. The Hamiltonian Ĥ1 is chosen such that its ground state encodes the solution
of a certain problem. This means that the adiabatic evolution will provide us with this solution.
Adiabatic quantum computation is closely related to quantum annealing (QA), which also encodes the
solution to a problem in the ground state of some final Hamiltonian. The relation between AQC and QA is
discussed later in the course.

7.2 Adiabatic evolution and quantum speed-up


What is the minimum evolution time τ , such that the time evolution in Eq. (7.1) is adiabatic? According
to the adiabatic theorem, it can be shown that for a non-degenerate ground state, the adiabatic evolution is

48
assured if the evolution time τ satisfies the condition

dĤ(s)
max hψ1 (s)| |ψ0 (s)i
0≤s≤1 ds t
τ ; s≡ , (7.2)
min ∆2 (s) τ
0≤s≤1

where |ψ0 (s)i and |ψ1 (s)i are the instantaneous ground and first excited eigenstate of the Hamiltonian in
Eq. (7.1), and ∆(s) = (E1 (s) − E0 (s)) is the instantaneous energy gap between the ground state and first
excited state energies. If the criterion of Eq. (7.2) is fulfilled, then we can be certain that the system will
evolve into the ground state of Ĥ1 .
While these are useful sufficient conditions, they involve bounding the minimum eigenvalue gap of a
complicated many-body Hamiltonian, a notoriously difficult problem. This is one reason that AQC has
generated so much interest among physicists: it has a rich connection to well-studied problems in condensed
matter physics. For example, because of the dependence of the run time on the gap, the performance of
quantum adiabatic algorithms is strongly influenced by the type of quantum phase transition the same
system would undergo in the thermodynamic limit.
Nevertheless, a number of examples are known where the gap analysis can be carried out. For example,
adiabatic quantum computers can perform a process analogous to Grover search and thus provide a quadratic
speedup over the best possible classical algorithm for the Grover search problem. Other examples are known
where the gap analysis can be used to demonstrate that AQC provides a speedup over classical computation,
including adiabatic versions of some of the key algorithms of the circuit model.
However, much more common is the scenario where either the gap analysis reveals no speedup over clas-
sical computation, or where a clear answer to the speedup question is unavailable. In fact, the least is known
about adiabatic quantum speedups in the original setting of solving classical combinatorial optimization
problems. This remains an area of very active research, partly due to the original (still unmaterialized) hope
that AQC would deliver quantum speedups for NP-complete problems [Farhi et al., 2001], and partly due
the availability of commercial QA devices such as those manufactured by D-Wave Systems Inc., designed to
solve optimization problems using stoquastic Hamiltonians.

7.3 Universality and stoquasticity


Adiabatic quantum computation is a universal model of quantum computation. Problems that are solvable
in polynomial time with the circuit model can be solved in polynomial time with an adiabatic quantum
computer, and vice versa. The fact that the circuit model can efficiently simulate AQC was proven by
Edward Fahri et al. in 2000. The fact that AQC can efficiently simulate the circuit model was proven by
Dorit Aharonov in 2007. As such, the two models are computationally equivalent. The proofs for these
statements can be found in Ref. [Albash and Lidar, 2018].
When discussing universality and efficiency for AQC, it is important to distinguish whether the final
Hamiltonian Ĥ1 is stoquastic or not. A Hamiltonian H is called stoquastic if it only has real-valued elements,
and, in some local basis B, all off-diagonal elements of H are either zero or negative, i.e., hx|H|yi ≤ 0 ∀x, y ∈
B. Stoquastic Hamiltonians seem to exist in the borderlands between the classical and quantum worlds. It
is an open question whether stoquastic AQC can be efficiently simulated by a classical computer. It has been
shown that stoquastic AQC cannot be universal for quantum computation unless the polynomial hierarchy
collapses. The proof by Aharonov of the universality for AQC thus requires non-stoquastic Hamiltonians.

49
7.4 Reason for non-commuting Hamiltonians
As mentioned above,
h it is important
i in AQC that the initial Hamiltonian Ĥ0 and final Hamiltonian Ĥ1 do
not commute, i.e., Ĥ0 , Ĥ1 6= 0. This can be understood by considering the following trivial example, taken
from Pontus Vikstål’s Master thesis, Ref. [Vikstål, 2018]. Suppose that the initial and final Hamiltonian in
AQC are be given by    
1 0 −1 0
Ĥ0 = and Ĥ1 = , (7.3)
0 −1 0 − 21
which clearly commute. Since the Hamiltonians are both diagonal in the z-basis, we label the corresponding
eigenvectors as    
1 0
|0i = and |1i = . (7.4)
0 1

It is easy to see that the ground state of Ĥ0 is |1i and that the ground state of Ĥ1 is |0i. We described above
how AQC is based on the adiabatic theorem. For the theorem to hold there must always exist an energy
gap between the eigenstates (see the proof below in Sec. 7.5). Following the AQC algorithm Eq. (7.1), it can
be seen that the energy gap between the eigenstates |1i and |0i closes at some point. In this example, the
energies
h becomesi equal at the point t/τ = 4/5, and so the gap between them closes. This is enough to see
that Ĥ0 , Ĥ1 6= 0 is a necessary condition for keeping the gap open.

7.5 Proof of the adiabatic theorem


In this section, we provide a simple and straightforward proof of the famous adiabatic theorem, first pub-
lished by Max Born and Vladimir Fock in 1928. This section is taken from Pontus Vikstål’s Master thesis,
Ref. [Vikstål, 2018].
Theorem: A particle that begins from the nth eigenstate of a Hamiltonian that is gradually (adiabatically)
changing from an initial form Ĥi into a final form Ĥf , will remain in the nth eigenstate.
Consider an arbitrary time-independent Hamiltonian Ĥ, for which the Schrödinger equation reads
dΨ (x, t)
i~ = ĤΨ (x, t). (7.5)
dt
Through separation of variables, the time-independent Schrödinger equation can be obtained:

Ĥψn (x) = En ψn (x). (7.6)

The general solution to the Schrödinger equation is given by a superposition of the separable solutions
X X
Ψ (x, t) = cn Ψn (x, t) = cn ψn (x)e−iEn t/~ , (7.7)
n n

or by simply considering a specific eigenfunction

Ψn (x, t) = ψn (x)e−iEn t/~ . (7.8)

Hence, the nth eigenstate for a time-independent Hamiltonian remains in the nth eigenstate, simply picking
up a phase factor −En t/~. For a time-dependent Hamiltonian, the eigenenergies and eigenfunctions are
themselves time-dependent. The instantaneous eigenstates and eigenenergies are defined as

Ĥ(t)ψn (x, t) = En (t)ψn (x, t). (7.9)

50
At any instant of time, the eigenfunctions form a complete orthogonal set

hψm (t)|ψn (t)i = δmn , (7.10)

where the dependence on position is implicit. We have also introduced the bra-ket notation ψn (x) = hx|ψn i.
The general solution to the Schrödinger equation is now given by
X X
Ψ (x, t) = cn (t)Ψn (x, t) = cn (t)ψn (x, t)eiϑn (t) , (7.11)
n n

where ϑn (t) is known as the dynamical phase factor


t
1
Z
ϑn (t) = − En (s)ds. (7.12)
~ 0

Our task is to determine the coefficients cn (t). By substituting (7.11) into the Schrödinger equation, we
obtain X X
i~ (ċn ψn + cn ψ̇n + icn ψn ϑ̇n )eiϑn = cn En ψn eiϑn . (7.13)
n n

The third term on the left cancels the term on the right, since ϑ̇n = En . We are thus left with
X
i~ (ċn ψn + cn ψ̇n )eiϑn = 0. (7.14)
n

Multiplying with an arbitrary eigenfunction hψm | from the left and using the orthogonality condition
Eq. (7.10) yields X
ċm = − cn hψm |ψ̇n iei(ϑn −ϑm ) . (7.15)
n

To calculate the quantity hψm |ψ̇n i, we first observe that for m 6= n


d ˙
(hψm | Ĥ |ψn i) = 0 = hψ̇m | Ĥ |ψn i + hψm | Ĥ |ψn i + hψm | Ĥ |ψ̇n i
dt | {z } | {z }
En |ψn i Em hψm | (7.16)
˙
= En hψ̇m |ψn i + hψm | Ĥ |ψn i + Em hψm |ψ̇n i.
We then note that
d
(hψm |ψn i) = 0 = hψ̇m |ψn i + hψm |ψ̇n i, (7.17)
dt | {z }
δmn

which implies the relation hψ̇m |ψn i = −hψm |ψ̇n i, so


˙
hψm | Ĥ |ψn i
hψm |ψ̇n i = , (m 6= n). (7.18)
En − Em
This holds as long as no transitions between eigenstates occur. The differential equation in Eq. (7.15) can
now be written
X hψm | Ĥ˙ |ψn i
ċm = −cm hψm |ψ̇m i − . (7.19)
En − Em
m6=n

Now, if the Hamiltonian is slowly changing, such that its time derivative can be considered to be very
˙
small and that the energy difference |En − Em | is large compared to hψm | Ĥ |ψn i , the second term becomes
negligible. This approximation is known as the the adiabatic approximation, and we conclude that

ċm ≈ −cm hψm |ψ̇m i. (7.20)

51
By solving this equation one finds
 Z t 
cm (t) = cm (0) exp − hψm (s)|ψ̇m (s)ids = cm (0)eiγm (t) , (7.21)
0

where Z t
γm (t) = i hψm (s)|ψ̇m (s)ids, (7.22)
0

is the geometrical (Berry) phase factor. Putting the obtained expression for the coefficients cm (t) back into
(7.11), we obtain that the nth eigenstate is given by

|Ψn (t)i = eiϑn (t) eiγn (t) |ψn (t)i . (7.23)

Hence, a system that starts out in the nth eigenstate, will remain in the nth eigenstate, simply picking up
a couple of phase factors.

52
Chapter 8

Algorithms for solving combinatorial


optimization problems

For this part of the notes, we follow Refs. [Vikstål, 2018, Rodríguez-Laguna and Santalla, 2018, Albash and
Lidar, 2018].

8.1 Combinatorial optimization problems


A combinatorial optimization problem seeks to find the best answer to a given problem from a vast collection
of configurations. A typical example of a combinatorial optimization problem is the travelling salesperson
problem, where a salesperson seeks to find the shortest travel distance between different locations, such that
all locations are visited once. The naive method to solve this problem would be to make a list of all the
different routes between locations that the salesperson could take and from that list find the best answer.
This could work when the number of locations is small, but the naive method would fail if the number
of locations grows large, since the number of possible routes increases exponentially with the number of
locations. Thus the naive method is not efficient in practice, and we should therefore develop more clever
optimisaton algorithms.
Typical examples of this kind of relevant combinatorial optimization problems are summarized in the list
below.

• The traveling salesperson problem (TSP), as already mentioned. You are given a list of cities and the
distances between them, and you have to provide the shortest route to visit them all.

• The knapsack problem. You are given the weights and values of a set of objects and you have to
provide the most valuable subset of them to take with you, given a certain bound on the total weight.

• Sorting. Given N numbers, return them in non-ascending order.

• The integer factorization problem. You are given a big number M , and you have to provide two integer
factors, p and q, such that M = pq.

• The satisfactibility problem (SAT). You are given a boolean expression of many variables zi ∈ {0, 1},
for example, P (z1 , z2 , z3 , z4 ) = z1 ∨ z̄2 ∧ (z3 ∨ z̄4 ). Then, you are asked whether there is a valuation of
those variables which will make the complete expression true. For example, in that case, making all
zi = 1 is a valid solution.

53
Notice that all these problems have a similar structure: you are given certain input data (the initial
numbers, the list of the cities, the list of objects or the integer to factorize) and you are asked to provide a
response. The first two problems are written as optimization problems, in which a certain target function
should be minimized (or maximized). The sorting problem can be restated as an optimization problem: we
can design a penalty function to be minimized, by counting the misplaced consecutive numbers. The factor-
ization problem can also be expressed in that way: find p and q such that E = (M − pq)2 becomes minimal,
and zero if possible. SAT can also be regarded as an optimization problem in which the evaluation of the
boolean formula should be maximized. Thus, all those problems can be seen as combinatorial optimization
problems. This means that, in order to solve them by brute force, a finite number of possibilities must be
checked. But this number of possibilities grows very fast with the number of variables or the size of the
input. The number of configurations typically involved becomes easily gigantic. It is enough to consider 100
guests at a party with 100 chairs, to obtain 100! ∼ 10157 possible configurations to chose in which assigning
each guest to a chair. This is roughly the square of number of particles in the universe, estimated to be
about 1080 . To calculate the correct configuration in this ocean of possible configurations is often hopeless
for classical computers - even for our best and future best super computers. This is why it makes sense to
ask the question: could a quantum computer help?

8.1.1 Hardness of combinatorial optimization problems and promises of quan-


tum computers for solving them
Not all optimization problems are equally hard. Let us focus on the minimal possible time required to solve
them as a function of the input size, i.e. to which time-complexity class they belong (see Sec.2.2 for a list of
relevant complexity classes).
We recall that a problem is said to be NP-hard if an algorithm solving it can be translated into an
algorithm for solving any NP-complete problem with a poly-time overhead. NP-hard problems are not
necessarily decision problems, but for any optimization problem that is NP-hard, the corresponding decision
problem is NP-complete. For example, the decision version of the traveling salesperson problem reads as
follows: given a length L, decide whether the graph has a tour of at most L. Another example of a famous
NP-hard problem is the Max-Cut problem.
All the decision problems associated to the optimisation problems in the list introduced in the previous
subsection are NP: given a candidate solution, it can always be checked in polynomial time. But only one of
them is known to be in P: the sorting problem, because it can always be solved in time O(N log(N )) which
is less than O(N 2 ). For the factorization problem we do not know whether it is in P or not. The other three
belong to a special subset: they are NP-complete. This means that they belong to a special group with
this property: if a polynomial algorithm to solve one of them is ever found, then we will have a polynomial
algorithm to solve all NP problems. The generality and power of this result is known as Cook’s theorem.
How can the solution to an NP-complete problem be useful to solve all NP problems? By using a strategy
called reduction. An instance of the sorting problem, for example, can be converted into an instance of SAT
in polynomial time. Thus, the strategy to solve any NP-problem would be: (i) translate your instance to an
instance of SAT, (ii) solve that instance of SAT, and (iii) translate back your solution. It is very relevant
for physicists to know which combinatorial optimization problems are NP-complete for many reasons, and
one of the most important is to avoid losing valuable time with a naive attempt to solve them in polynomial
time. The complexity classes relative to the problems listed above are summarized in Fig.8.1.
Actually, we do not expect quantum computers either to solve efficiently problems that are NP-complete.
That would mean that NP is contained in BQP, and we do not believe that this is the case (see e.g.
https://www.scottaaronson.com/blog/?p=206). Yet, we expect that some problems that are in NP and
not in P could be efficiently solvable by a quantum computer, i.e. they belong to BQP (such as Factoring).

54
• Max-cut
• Knapsack
• Traveling Salesperson

• SAT
• Exact-Cover
• Subset-sum
• Number partitioning

• Factoring?

• Sorting

Figure 8.1: Complexity classes relevant for optimisation problems.

For these problems, quantum algorithms could provide an exponential advantage. Furthermore, for the
NP-complete problems, maybe we can solve them on a quantum computer with a quadratic advantage
with respect to the classical solution. This kind of quantum advantage is of the same kind of the Grover
algorithm. The Grover search algorithm can be run on quantum computer, either within the circuit model
or using√adiabatic quantum computation [Albash and Lidar, 2018], providing the same quadratic advantage,
i.e. O( N ) vs O(N ) running time.
Finally, approximate solutions of combinatorial optimization problems (even NP-complete ones) might be
obtainable efficiently with the QAOA algorithm, that we will see in Sec. 8.4. Not much is known currently,
about the time-complexity of finding approximate solutions to those problems, and how this compares to
the corresponding approximate classical algorithms.

8.2 Combinatorial optimization and the Ising model


To solve a combinatorial optimization problem using a quantum algorithm, be it Quantum Annealing or
the QAOA as we are going to see in the next sections, the quantum algorithm must be able to interpret
the specific problem that we wish to solve. This can be done by encoding the optimization problem onto
a quantum system. In this section we will see how a combinatorial optimization can be framed as a cost
Hamiltonian in the Ising form.
Let C : {0, 1}n → R be a cost function that encode a combinatorial optimization problem. There are in

55
total 2n possible strings and the goal is to find the bit-string z = z1 . . . zn that minimizes the cost function
C(z). Note that a minimization problem can be transformed into a maximization problem by a minus sign
C(z) → −C(z). One of the most widely used models in physics, that also is used to represent optimization
problems, is the Ising model. It is was developed in the 1920s as a way to understand phase transitions in
magnetic materials.
The Ising model consists of N Ising spins on a lattice that can take the values si = +1 or si = −1, which
corresponds to the spin-↑ and the spin-↓ direction. Here si denotes the Ising spin at site i on the lattice. The
Ising spins are coupled together through long-range magnetic interactions, which can be either ferromagnetic
or antiferromagnetic, corresponding to the spins being encouraged to be aligned or anti-aligned respectively.
Moreover, an external magnetic field can be applied at each individual spin site which will give a different
energy to the spin-↑ and spin-↓ directions. The energy configuration of the Ising model with N Ising spins
is given by the Ising Hamiltonian
N
X N
X
H(s1 , . . . , sN ) = − Jij sj si − hi si , (si = ±1), (8.1)
i,j=1 i=1

where Jij is the coupling strength between the ith spin and j th spin and hi is the magnetic field at the ith spin
site. Note that for a given set of couplings, magnetic fields and spin-configurations, the Ising Hamiltonian
will just “spit out” a number that represents the energy of that particular spin-configuration. For a given
set of couplings and magnetic fields there always exists at least one spin-configuration that minimizes the
energy of the Ising Hamiltonian.
A quantum version of this model is obtained by simply replacing the spin-variables si with Pauli-z
operators
X n
X
ĤC ≡ Ĥ(σ̂1z , . . . , σ̂nz ) = − Jij σ̂iz σ̂jz − hi σ̂iz , (8.2)
1≤i<j≤n i=1

were σ̂iz refers to the Pauli-z matrix acting on the ith qubit. The spectral decomposition of the this Hamil-
tonian then encodes the different solutions in the computational basis
X
ĤC = C(z) |zihz| , (8.3)
z∈{0,1}n

and such the eigenstate |zi with the lowest eigenvalue corresponds the optimal solution. The Hamiltonian
of Eq. 8.2 is formally known as an Ising-Hamiltonian, after its inventor, but it can also be referred to as as
cost Hamiltonian in the context of optimisation problems, because of its eigenvalues in the computational
basis corresponds to the different costs of the cost function.
Also not that in the context of QAOA, we will refer to the Ising or cost Hamiltonian as Eq.(8.2), where
however we will take the plus sign in front of both terms of the Hamiltonian.

8.2.1 Mapping combinatorial optimization problems to spin Hamiltonians


Many optimization problems, including all of Karp’s 21 NP-complete problems, can be written in the form
of Eq. (8.2) and hence solved on a quantum computer, by choosing appropriate values for Jij and hi [Lucas,
2014]. We now turn to a couple of specific examples, and we will disregard here the minus sign in front of
the Ising Hamiltonian.

A special case of the knapsack problem: the subset sum problem


The subset sum problem is a famous combinatorial optimization problem that is known to be NP-complete,
and it is a special case of the decision version of the knapsack problem, where the weights wi are equal to

56
the values vi for each object i, i.e. wi = vi . It can be formulated as a decision problem, as follows: Given an
integer m (the total value) and a set of positive and negative integers n = {n1 , n2 , . . . , nN } of length N , is
there a subset of those integers that sums exactly to m?
Example: Consider the case when m = 7 and the set n = {−5, −3, 1, 4, 9}. In this particular example
answer is “yes”, and the subset is {−3, 1, 9}.
Example: Consider m = 13 and n = {−3, 2, 8, 4, 20}. This time the answer is “no”.
This problem can be framed as an energy minimization problem. The energy function for the subset sum
problem can be formulated as
N
!2
X
E(z1 , . . . , zN ) = ni zi − m , zi ∈ {0, 1},
i=1

where N corresponds to the size of the subset. Hence, if a configuration of zi exists such that E = 0, then
the answer is “yes”. Likewise if (E) > 0 for all possible configurations of zi , then the answer is “no”. To map
this energy function onto an Ising Hamiltonian we introduce the Ising spins si = ±1 instead of zi as
1
zi = (1 + si ), (8.4)
2
such that si = +1 (spin-↑) corresponds to zi = 1, and si = −1 (spin-↓) corresponds to zi = 0. Then the
Hamiltonian is written as
N
!2
X 1
H(s1 , . . . , sN ) = ni (1 + si ) − m
i=1
2
N N
!2
X 1 1X
= ni si + ni − m
i=1
2 2 i=1
   2
N N N
1 X X 1 X 1 X
= ni nj si sj +  nj − m ni si +  nj − m .
4 i=1
2 j=1
2 j=1
1≤i,j≤N

We now introduce the coupling Jij and the magnetic field hi as


 
N
1 X
Jij ≡ ni nj , and hi =  nj − m  ni , (8.5)
2 j=1

and finally obtain


 2
N N
1 X X 1X
H(s1 , . . . , sN ) = Jij si sj + hi si +  nj − m .
4 i=1
2 j=1
1≤i,j≤N

Notice that the last term is simply a constant. This Hamiltonian is an Ising Hamiltonian cf. Eq. (8.1).
Furthermore, by observing that Jij is symmetric and that the sum of the diagonal elements i = j is equal
to the trace of Jij , we get
 2
N N
1 X X 1 X 1
H(s1 , . . . , sN ) = Jij si sj + hi si +  nj − m + Tr [Jij ]
2 i=1
2 j=1
4
1≤i<j≤N
(8.6)
X XN
= Jij si sj + hi si + const,
1≤i<j≤N i=1

57
where we have absorbed the 1/2 into Jij in the second step and made an implicit redefinition of the couplings
Jij ≡ ni nj /2. To solve the problem on a quantum computer, one can now quantize the spin variables si → σ̂iz .

Number partitioning problem


The number partitioning problem is another well known combinatorial optimization problem that is also NP-
complete [Albash and Lidar, 2018]. The number partitioning problem can be defined as a decision problem,
as follows: Given a set S of positive integers {n1 , n2 , . . . , nN } of length N , can this set be partitioned into
two sets S1 and S2 such that the sum of the sets are equal?
Example: Consider the set {1, 2, 3, 4, 6, 10}, can this set be partitioned into two sets, such that the sum
of both sets are equal? The answer is “yes” and the partitions are S1 = {1, 2, 4, 6} and S2 = {3, 10}.
Example: Consider the set {1, 2, 3, 4, 6, 7}, can this set be partitioned into two sets, such that the sum
of both sets are equal? This time the answer is “no”, which you can try to convince yourself that it is.
The Ising Hamiltonian for the number partitioning problem can be straightforwardly written down as
follows !2
N
X
H(s1 , . . . , sN ) = ni si , (si = ±1). (8.7)
i=1

It is clear that the answer to the number partitioning problem is “yes” if H = 0, because then there exist
a spin configuration where the sum of the ni for the +1 spins is the equal to the sum of the ni for the −1
spins [Lucas, 2014]. Likewise the answer is “no” if (H) > 0 for all possible spin configurations. Expanding
the square of Eq. (8.7) we get
X X
H(s1 , . . . , sN ) = Jij si sj = 2 Jij si sj + Tr [Jij ] , (8.8)
1≤i,j≤N 1≤i<j≤N

where we have introduced the couplings as


Jij = ni nj .
It should be noted that the ground state of Ising Hamiltonian for the number partitioning problem is always
at least two-fold degenerate. This has to do with the fact that changing si to −si does not change H. What
is also notable about the number partitioning problem compared to the subset sum problem is that the
number partitioning problem does not require any additional magnetic fields on each spin site.
Number partitioning is known as the “easiest hard problem" (Hayes, 2002) due to the existence of efficient
approximation algorithms that apply in most (although of course not all) cases, e.g., a polynomial-time
approximation algorithm known as the differencing method (Karmarkar and Karp, 1982). This concludes
this section of examples.

8.3 Quantum annealing


There is no consensus in the literature, with respect to the therminology for quantum annealing. We adopt
the following definition [Hauke et al., 2020]: Quantum Annealing (QA) is a heuristic quantum algorithm that
is based on the Adiabatic Quantum Computation model seen in Chapter 7, and that aims at solving hard
combinatorial optimization problems, by using an Ising Hamiltonian as the target Hamiltonian. In other
words, the main idea of QA is to take advantage of adiabatic evolution expressed by Eq.(7.1) to go from a
simple non-degenerate ground state of an initial Hamiltonian to the ground state of an Ising Hamiltonian
that encodes the solution of the desired combinatorial optimization problem. It can hence be thought as the
restriction of adiabatic quantum computation to optimization problems. As a consequence, a less general
Hamiltonian is sufficient for addressing quantum annealing, as compared to adiabatic quantum computation

58
(see also Scott Pakin Quantum Annealing lecture at the Quantum Science summer school 2017, available at
http://qs3.mit.edu/images/pdf/QS3-2017---Pakin-Lecture.pdf).

8.3.1 Solving optimization problems on a quantum annealer


To solve an Ising problem on a qubit based quantum annealer one defines a set of qubits |q1 q2 . . . qN i to
store the answer to the problem and interpret qi = 0 to mean si = +1 (spin-↑) and qi = 1 to mean si = −1
(spin-↓). Then one chooses a pair of noncommuting Hamiltonians Ĥ0 and Ĥ1 , such that the ground state
of Ĥ0 is easy to prepare, and that the ground state of Ĥ1 encodes the solution to the Ising problem. To
achieve this the initial Hamiltonian is typically assumed to be:
 
X
x x 0 1
Ĥ0 = − σ̂i , σ̂ = (8.9)
1 0
i

where (σ̂ix = Iˆ ⊗ . . . ⊗ σ̂ x ⊗ . . . I)
ˆ with the Pauli σ̂ x matrix on the i:th place. The qubits or (spins) can then
be regarded as pointing simultaneously in the spin-↑ and spin-↓ directions along the z-axis at the beginning.
Indeed, it is easy to show using basic quantum mechanics that the ground state of Ĥ0 for N -qubits is
 N
⊗n 1 1 X
|ψ(0)i = |+i = √ (|0i + |1i) ⊗ . . . ⊗ (|0i + |1i) = √ |zi,
2 2n z∈{0,1}n

which corresponds to a superposition of all possible spin configurations. Constructing Ĥ1 can be done in
a straightforward manner by replacing each si in Eq. (8.1) with a corresponding Pauli-z matrix, to yield
Eq.(8.2), that we report here for convenience:
N
X N
X
Ĥ1 = − Jij σ̂jz σ̂iz − hi σ̂iz ,
i,j=1 i=1

where (σ̂iz = Iˆ ⊗ . . . ⊗ σ̂ z ⊗ . . . I)
ˆ with σ̂ z on the i:th place. When the time-dependent AQC Hamiltonian
smoothly interpolates between Ĥ0 and Ĥ1 ,

Ĥ(t) = (1 − s(t))Ĥ0 + s(t)Ĥ1 , (8.10)

with s(0) = 0 and s(τ ) = 1, τ the total time of the algorithm, the qubits will gradually choose between 0
and 1 corresponding to spin-↑ and spin-↓, depending on which spin-configuration minimizes the energy of
the Ising Hamiltonian [Rodríguez-Laguna and Santalla, 2018]. At the end t = τ the system will have evolved
into |ψ(τ )i = |z1 z2 . . . zN i and a readout of this state in the computational basis will reveal each bit value
from which the corresponding values of the Ising spins can be obtained. If the annealing time was not slow
enough, which is in practice occurring in any realistic situation, the state that is read-out will only encode
the optimal solution to the problem with a certain success probability p, given by the overlap of the final
state with the solution state. It is possible to estimate the success probability at each run by calculating the
ratio of the number of runs where the optimal solution was found to the total number of runs.
In order to calculate the time-to-solution with 99% certainty, we can evaluated the success probability
after repeating m times the annealing procedure and equate it to 0.99, i.e.
m
Psucc = 1 − (1 − p)m = 0.99,

from which we can extract


ln(1 − 0.99)
m= ,
ln(1 − p)

59
where we have used the change of basis of logarithms logb x = loga x/ loga b, and where ln is the natural
logarithm. Therefore we obtain at the end:

ln(1 − 0.99)
T99 = mτ = τ,
ln(1 − p)

A challenge in quantum annealing is that a full connectivity, which manifest in the interaction parameters
Jij being 6= 0 beyond nearest-neihgbours interactions, is often required in the problem Hamiltonian (for
typical hard problems). This full connectivity might be difficult to achieve experimentally. In case the
hardware has limited connectivity, embeddings allow one to map the fully connected problem onto a locally
connected spin-system, at the price of an overhead in the total number of physical qubits to use. Examples
of embedding are the Lechner, Hauke and Zoller (LHZ) scheme or the minor embedding (we will talk about
this later).

8.3.2 Heuristic understanding of quantum annealing


The term "annealing" is due to the analogy with the procedure used by metallurgists in order to make perfect
crystals: the metal is melted and allowed to reduce its temperature very slowly. When the temperature
becomes low enough, the system is expected to be in its global energy minimum with very high probability.
Yet, mistakes are bound to take place if the temperature is reduced too fast. Why does this procedure
work? Thermal fluctuations allow the system to explore a huge number of configurations. Sometimes, a
metastable state is found, but if temperature is still large enough, fluctuations will allow it to escape. The
escape probability is always related to the energy barrier separating the metastable state from the deeper
energy minima. So, annealing can be considered an analog computation.
Annealing (meaning this metallurgic version) works nicely when the metastable states are separated by
low energy barriers. Some target functions have barriers that are tall but very thin. Others will have different
peculiarities.
Is there any alternative analog computation suitable for problems with tall and thin barriers? Yes, there
is one. Instead of escaping metastable states through thermal fluctuations, we may try to escape them
through quantum tunneling, since the probability of such an event is known to decay with the product of
the height and width of the energy barriers.
This is what quantum annealing does: in QA, we engineer a quantum system so that its energy is related
to the target function that we want to minimize. Thus, its ground state will provide the solution to our
problem. If we start out at high temperature and cool the system down, then we are simply performing an
annealing analog computation. Alternatively, we can operate always at extremely low temperature, so that
quantum effects are always important, but add an extra element to the system which forces strong quantum
fluctuations. This extra element (the starting Hamiltonian) is then slowly reduced and, when it vanishes,
the GS will give us the solution to our problem. In other terms: we make the system Hamiltonian evolve
from a certain starting Hamiltonian, H0 , whose GS presents strong quantum fluctuations in the basis of the
final Hamiltonian eigenstates, to our target Hamiltonian, H1 , whose GS is the solution to our problem.

8.3.3 QUBO optimisation


Many optimisation problems can be recast in terms of QUBOs, where the acronym stands for Quadratic
Unconstrained Binary Optimisation, i.e. the function to be optimised is a quadratic function over binary
variables without further constraints.
The standard format for a QUBO objective function to be minimized over binary variables z ∈ {0, 1}n

60
is the following:
n
X n
X
q(z) = z > Qz = Qjj zj + Qjk zj zk (8.11)
j=1 j,k=1
j<k

with an upper-triangular quadratic matrix Q ∈ Rn×n .


Usually these transformations produce overhead, e.g. in terms of increasing the number of variables, the
required connections between them or the value of the coefficients.

8.3.4 D-wave quantum annealer

Figure 8.2: A 3x3 Chimera graph, denoted C3. Qubits are arranged in 9 unit cells. From D-Wave website.

D-Wave 200Q is the first commercially available quantum annealer, developed by the Canadian company
D-Wave Systems, and it is a heuristic solver using quantum annealing for solving optimisation problems.
Their hardware is based on rf-SQUIDs (Superconducting Quantum Interference Devices). In these devices,
flux qubits are realized by means of long loops of superconducting wire interrupted by a set of Josephson
junctions. A “supercurrent" of Cooper pairs of electrons, condensed to a superconducting condensate, flows
through the wires; a large ensemble of these pairs behaves as a single quantum state with net positive or
net negative flux, thereby defining the two qubits states. The chip must be kept extremely cold for the
macroscopic circuit to behave like a two-level (qubit) system, and nominally runs at 15 mK.
In order to be casted onto D-Wave machines, optimisation problems can be formulated either in terms
of Ising Hamiltonians, or in terms of QUBOs.
The hardware layout of the D-Wave 2000Q quantum annealer restricts the connections between binary
variables to the so called Chimera graph 8.2. In order to make problems with higher connectivity amenable to
the machine, an embedding must be employed, for instance minor embedding. This means coupling various
physical qubits together into one logical qubit, representing one binary variable, with a strong ferromagnetic
coupling JF in order to ensure that all physical qubits have the same value after readout.

61
For how to program a D-Wave machine, see https://docs.dwavesys.com/docs/latest/doc_getting_
started.html.

8.3.5 Summary of pros and cons for quantum annealing


As a summary of this quantum annealing section, and before turning to an example of a problem solved
on a quantum annealer, we present a list of pros and cons for quantum annealing, taken from the notes of
Scott Pakin Quantum Annealing lecture at the Quantum Science summer school 2017, http://qs3.mit.
edu/images/pdf/QS3-2017---Pakin-Lecture.pdf.
The bad:

• Very difficult to analyze an algorithm’s computational complexity

– Need to know the gap between the ground state and first excited state, which can be costly to
compute
– In contrast, circuit-model algorithms tend to be more straightforward to analyze

• Unknown if quantum annealing can outperform classical computation

– If gap always shrinks exponentially then no

The good:

• Constants do matter

– If the gap is such that a correct answer is expected only once every million anneals, and an anneal
takes 5 microseconds, that is still only 5 seconds to get a correct answer - may be good enough
– On current systems, the gap scaling may be less of a problem than the number of available qubits

• We may be able to (classically) patch the output to get to the ground state

– Hill climbing or other such approaches may help get quickly from a near-groundstate solution into
the ground state
– We may not even need the exact ground state
– For many optimization problems, “good and fast" may be preferable to “perfect but slow"

8.3.6 Example of the solution of a practical problem on a quantum annealer:


Flight-gate assignment
In this section, we present the results outlined in Ref. [Stollenwerk et al., 2019], where the solution of a
concrete problem is addressed on a quantum annealer, namely flight gate assignment. The goal of flight
gate assignment is reducing the total transit time for passengers in an airport, increasing passenger comfort
and punctuality. This problem is related to the quadratic assignment problem, a fundamental problem in
combinatorial optimization whose standard formulation is NP-hard. This problem can be brought into a
QUBO format by standard transformations. Here, we investigate the solvability of the optimal flight gate
assignment problem with a D-Wave 2000Q quantum annealer.

62
Formal Problem Definition
The typical passenger flow in an airport can usually be divided into three parts: After the airplane has
arrived at the gate, one part of the passengers passes the baggage claim and leaves the airport. Other
passengers stay in the airport to take connecting flights. These transit passengers can take up a significant
fraction of the total passenger amount. The third part are passengers which enter the airport through the
security checkpoint and leave with a flight. The parameters of the problem are summarized in table 8.1.
Assignment problems can easily be represented with binary variables indicating whether or not a resource
is assigned to a certain facility. The variables form a matrix indexed over the resources and the facilities.
The binary decision variables are x ∈ {0, 1}F ×G with
(
1, if flight i ∈ F is assigned to gate α ∈ G,
ziα = (8.12)
0, otherwise.

Like already stated, the passenger flow divides into three parts and so does the objective function: The
partial sums of the arriving, the departing and the transfer passengers sum up to the total transfer time of
all passengers. For the arrival part we get a contribution of the corresponding time tarr
α for each of the ni
arr

passengers if flight i is assigned to gate α. Together with the departure part, which is obtained analogously,
the linear terms of the objective are
X arr/dep
T arr/dep (z) = ni tarr/dep
α ziα . (8.13)

The contribution of the transfer passengers is the reason for the hardness of the problem: Only if flight i is
assigned to gate α and flight j to gate β the corresponding time is added. This results in the quadratic term
X
T trans (z) = nij tαβ ziα zjβ . (8.14)
iαjβ

The total objective function is


T (z) = T arr (z) + T dep (z) + T trans (z). (8.15)

Constraints
Not all binary encodings for z form valid solutions to the problem. There are several further restrictions
which need to be added as constraints. In this model a flight corresponds to a single airplane arriving and

Table 8.1: Input data for a flight gate assignment instance

F Set of flights (i ∈ F )
G Set of gates (α ∈ G)
ndep
i No. of passengers departing with flight i
narr
i No. of passengers arriving with flight i
nij No of transfer passengers between flights i and j
tarr
α Transfer time from gate α to baggage claim
tdep
α Transfer time from check-in to gate α
tαβ Transfer time from gate α to gate β
tin
i Arrival time of flight i
tout
i Departure time of flight i
tbuf Buffer time between two flights at the same gate

63
departing at a single gate. It is obvious, that every flight can only be assigned to a single gate, therefore we
have X
ziα = 1 ∀i ∈ F. (8.16)
α
Furthermore it is clear that no flight can be assigned to a gate which is already occupied by another flight
at the same time. The resulting linear inequalities ziα + zjα ≤ 1 are equivalent to the quadratic constraints
ziα · zjα = 0 ∀(i, j) ∈ P ∀α ∈ G, (8.17)
where P is the subset of forbidden flight pairs with overlapping time slots, that can be aggregated in
P = (i, j) ∈ F 2 : tin in out
+ tbuf .

i < tj < ti (8.18)

Mapping to QUBO and Penalty Terms


While the presented objective function already follows the QUBO format Eq.(8.11), the constraints need to
be reduced which is shown in this section. The standard way to reduce constraints is to introduce terms
penalizing those variable choices that violate the constraints. Just in these cases a certain positive value is
added to the objective function to favor valid configurations while minimizing. The quadratic terms
XX 2
one
C (z) = ziα − 1 (8.19)
i α

and X X
C not (z) = ziα zjα (8.20)
α (i,j)∈P

fulfill (
one/not > 0, if constraint is violated,
C (8.21)
= 0, if constraint is fulfilled,
and therefore are suitable penalty terms which can be combined with the objective function. Since the
benefit in the objective function in case of an invalid variable choice should not exceed the penalty, two
parameters λone , λnot ∈ R+ need to be introduced:
q(z) = T (z) + λone C one (z) + λnot C not (z). (8.22)
In theory these parameters could be set to infinity, but in practice this is not possible and they have to be
chosen carefully.
The parameters λone and λnot need to be large enough to ensure that a solution always fullfills the
constraints. However due to precision restrictions of the D-Wave machine it is favorable to choose them as
small as possible. Two different possibilities to obtain suitable values are studied in Ref. [Stollenwerk et al.,
2019]: a worst case analysis for each single constraint, and a bisection algorithm which iteratively checks
penalty weights against the solution validity. We leave the technical details of this discussion to [Stollenwerk
et al., 2019].
Note that this final cost function could be expressed in terms of an Ising Hamiltonian, by using the
mapping in Eq.(8.4) and by quantizing the spin variables. However, QUBO is the native way of encoding
cost-functions into D-Wave machines, therefore the authors leave out this step.
Since the constraint Eq. (8.16) introduces |F | complete graphs of size |G|, and given the connectivity of
chimera graphs, which requires in the use of minor embedding, this results at most in a quadratic increase
in the number of physical qubits with the number of logical qubits, which is |F | · |G|. With this, the authors
were able to embed instances up to 84 binary variables (extracted by real data, but simplified considerably),
which requires roughly 1600 qubits. The instances are then solved on D-Wave, and the time-to-solution is
then evaluated as a function of the size of the instance.

64
Time to solution

500 25 %
400 50 %
75 %
300
T99 / s

200
100

3 4 5 6 7
Number of flights |F|

Figure 8.3: Top: Time to solution against the number of flights for the instances from IBP . The different
colors represent the 25th, 50th and 75th percentiles.

The annealing solutions were obtained using 1000 annealing runs, and majority voting as a un-embedding
strategy for broken chains of logical qubits. The time to solution is then evaluated using Eq.(8.3.1), where
τ is the annealing time, which we fixed to 20µs.
Figure 8.3 shows the time-to-solution in dependence of the number of flights. There is an increase in the
time-to-solution with the number of flights, and therefore the problem size.
A conclusive assessment of the scaling behavior is not possible at the moment, because due to the high
connectivity of the problem there is a large embedding overhead, and as a result the size of the amenable
problems is very small. Future generations of quantum annealers with more qubits and higher connectivity
are needed to investigate larger problems.

65
8.4 Quantum Approximate Optimization Algorithm (QAOA)
8.4.1 Introduction to QAOA: From the quantum adiabatic algorithm to QAOA
This Section is taken from the licentiate thesis of Pontus Vikstål (Chalmers, 2021).
The QAOA [Farhi et al., 2014] is inspired by the adiabatic quantum computing (AQC), and in particular
its application to optimisation problems in the context of quantum annealing, that you have seen in Chap.7
and 8.3.1. We report here for convenience the Hamiltonian of Eq.(8.10):

Ĥ(t) = (1 − s(t))ĤM + s(t)ĤC . (8.23)

Here s(0) = 0 and s(T ) = 1, T is the total time of the algorithm, ĤM is the initial Hamiltonian, whose
maximally excited state is easy to prepare, and ĤC is the cost Hamiltonian maximally excited state encodes
the solution to an optimization problem.
The QAOA is based on the observation that the easiest way to practically implement the quantum
annealing Hamiltonian evolution expressed by Eq.(8.23) is to Trotterize it, i.e. to decompose it in small time
steps:
p
" Z #
T Y h i
Û (T ) ≡ T exp −i Ĥ(t) dt ≈ exp −iĤ(k∆t)∆t . (8.24)
0 k=1

Here Û (T ) is the evolution operator from 0 to T , T is the time-ordering operator, and p is a large integer so
that ∆t = T /p is a small time segment. Next, for two non-commuting operators A and B and sufficiently
small ∆t, one can use the Trotter formula:

ei(A+B)∆t = eiA∆t eiB∆t +O(∆t), (8.25)

and apply it to the discretized time evolution operator Eq. (8.24)


p
Y h i h i
Û (T ) ≈ exp −i(1 − s(k∆t))ĤM ∆t exp −is(k∆t)ĤC ∆t . (8.26)
k=1

Thus it is possible to approximate quantum annealing by applying the cost and mixing Hamiltonian in an
alternating sequence.
The key idea underlying QAOA is to truncate this product to an arbitrary positive integer and redefine
the time dependence in each exponent (1 − s(k∆t))∆t → βk and s(k∆t)∆t → γk . In this way, and as a
crucial difference with quantum annealing, the fixed time segments become instead variational parameters to
be optimized. Finally letting the product act on the initial state of quantum annealing, the plus state, one
obtains the variational state
p
~ ≡ ⊗n ~
Y X
|~γ , βi e−iβk ĤM e−iγk ĤC |+i = d(~
z
γ ,β)
|zi, (8.27)
k=1 z

where ~γ = (γ1 , γ2 , . . . , γp ) and β~ = (β1 , β2 , . . . , βp ), and where in the last step we have just made explicit that
the variational state is s given superposition of z-eigenstates. If the eigenvalues of the cost Hamiltonian are
all integers then γ is 2π-periodic, and can be restricted to lie between γk ∈ [0, 2π] and the mixer is π-periodic
βk ∈ [0, π]. However, the task of choosing the time-dependence of the angular variational parameters (~γ , β), ~
remains. The possibility of optimising over variational parameters makes it irrelevant whether the initial
state is a ground state or the highest excited state. In the context of QAOA, we might hence re-define the
mixing Hamiltonian without the minus sign, i.e. we set ĤM to be
X
ĤM = σ̂ix . (8.28)
i

66
Also, different authors use QAOA for minimisation or maximisation of a given cost-function. What you
should pay attention to is the following: once the problem is given, if your cost-function ĤC encodes the
solution in its ground state, then you should proceed to minimisation of the cost function.
~ be the expectation value of ĤC in the variational state of Eq.(8.27).
Let Ep (~γ , β)
~
~ ≡ h~γ , β|
~ ĤC |~γ , βi
~ =
X
Ep (~γ , β) prob(~
z
γ ,β)
C(z), (8.29)
z

~ (~ ~
γ ,β)
where, by obtaining the last term in Eq.(8.27), we see that probz(~γ ,β) = |dz |2 and C(z) = hz|HC |zi. By
finding good angles ~γ and β~ that minimize the expectation value above, the probability of finding the qubits
in their lowest energy configuration when measuring is increased, because the candidate variational states
becomes closer to the ground state of the cost Hamiltonian. Therefore the angles are chosen such that the
expectation value is minimized
~ ∗ ) = arg min Ep (~γ , β).
(~γ ∗ , β ~ (8.30)
~ ~
γ ,β

In general, this requires the quantum computer to query a classical optimizer, to tell the quantum computer
how it should update the variational state by slightly changing the angles in order to minimize the expectation
value, see Fig. 8.4. This has to be repeated until some convergence criteria is met or if an optimal or a good
enough solution is found. In other words, QAOA is converting the search in a space of a combinatorial
number of discrete configurations in a search for 2p optimal angles along a continuous energy landscape.

Figure 8.4: Schematic representation of the QAOA. The quantum processor prepares the variational state,
~ are optimized in a closed loop using
depending on variational parameters. The variational parameters (~γ , β)
a classical optimizer. From Ref. [Vikstål et al., 2020].

We end this section by giving a summary of the QAOA:


~
1. Pick a p and start with an initial set of 2p angles (~γ , β).
~ using a quantum computer and measure this state in the computational
2. Construct the state |~γ , βi
basis. The output is a string z with a probability given by the distribution of states |zi.

3. Calculate C(z) = hz|HC |zi using a classical computer. This step is classically efficient.
Pm
4. Repeat step 2-3, m times. Record the best observed string zbest , and the sample mean 1/m i=1 C(zi ),
where zi is the ith measurement outcome. Note that when m → ∞ the sample mean approaches the

67
expectation value Eq. (8.29) by the law of large numbers, as it is clear in particular from the righter-
most hand-side term.

5. If the optimal or a “good enough” solution is found, output C(zbest ) together with the string zbest . Else,
~ based on the minimization of the expectation
query a classical optimizer that updates the angles (~γ , β)
value and repeat from step 2.
Since QAOA is an algorithm that provides approximate solutions, a relevant metric in order to compare
its performance with respect to efficient classical algorithms also producing approximate solutions is the
approximation ratio. The approximation ratio is C(z), where z is the output of the quantum algorithm,
divided by the maximum of C.
Another relevant quantity in the context of QAOA is its success probability. Analogously as for quantum
annealing, it is defined as the square of the overlap between the variational states obtained for the optimal
angles, with the actual solution of the problem.

8.4.2 The Quantum Approximate Optimization Algorithm for solving Max-cut


The following Sections are taken from the paper of Fahri and Guttman [Farhi et al., 2014]. In this section,
we follow hence the notations of the original reference, and we consider a maximisation of the cost function.

Fixed p Algorithm

We now explain how for fixed p we can do classical preprocessing and determine the angles ~γ and β~ that
~
maximize Ep (~γ , β). This approach will work more generally but we illustrate it for a specific problem,
MaxCut for graphs with bounded degree. The input is a graph with n vertices and an edge set {hjki} of
size m. The goal is to find a string z that makes
X
ĤC = Ĉhjki , (8.31)
hjki

where
1
−σjz σkz + 1 ,

Chjki = (8.32)
2
as large as possible. Now
~ =
X
Ep (~γ , β) hs| eiγ1 ĤC · · · eiβp ĤM Chjki e−iβp ĤM · · · e−iγ1 ĤC |si . (8.33)
hjki

Consider the operator associated with edge hjki

eiγ1 ĤC · · · eiβp ĤM Chjki e−iβp ĤM · · · e−iγ1 ĤC . (8.34)

This operator only involves qubits j and k and those qubits whose distance on the graph from j or k is less
than or equal to p. To see this consider p = 1 where the previous expression is

eiγ1 ĤC eiβ1 ĤM Chjki e−iβ1 ĤM e−iγ1 ĤC . (8.35)

The factors in the operator e−iβ1 ĤM which do not involve qubits j or k commute through Chjki and we get
x x x x
eiγ1 ĤC eiβ1 (σj +σk ) Chjki e−iβ1 (σj +σk ) e−iγ1 ĤC . (8.36)

Any factors in the operator eiγ1 ĤC which do not involve qubits j or k will commute through and cancel out.
So the operator in equation Eq. (8.36) only involves the edge hjki and edges adjacent to hjki, and qubits on

68
those edges. For any p we see that the operator in Eq. (8.34) only involves edges at most p steps away from
hjki and qubits on those edges.
Return to equation Eq. (8.33) and note that the state |si is the product of σ x eigenstates

|si = |+i1 |+i2 . . . |+in (8.37)

so each term in equation Eq. (8.33) depends only on the subgraph involving qubits j and k and those at a
distance no more than p away. These subgraphs each contain a number of qubits that is independent of n
(because the degree is bounded) and this allows us to evaluate Ep in terms of quantum subsystems whose
sizes are independent of n.
As an illustration consider MaxCut restricted to input graphs of fixed degree 3. For p = 1, there are only
these possible subgraphs for the edge hjki:

(8.38)
For any subgraph G define the operator CG which is C restricted to G,
X
CG = Ch``0 i , (8.39)
h``0 iεG

and the associated operator


U (CG , γ) = e−iγ CG . (8.40)
Also define X
BG = σjx (8.41)
jεG

and
U (BG , β) = e−iβBG . (8.42)
Let the state |s, Gi be Y
|s, Gi = |+i` .
`εG

Return to equation Eq. (8.33). Each edge hj, ki in the sum is associated with a subgraph g(j, k) and makes
a contribution to Ep of

hs, g(j, k)| U † (Cg(j,k) , γp ) · · · U † (Bg(j,k) , β1 )Chjki U (Bg(j,k) , β1 ) · · · U (Cg(j,k) , γp ) |s, g(j, k)i (8.43)

The sum in Eq. (8.33) is over all edges, but if two edges hjki and hj 0 k 0 i give rise to isomorphic subgraphs,
~ are the same. Therefore we can view the sum in Eq. (8.33) as a
then the corresponding functions of (~γ , β)
sum over subgraph types. Define

~ = hs, g(j, k)| U † (Cg(j,k) , γ1 ) · · · U † (Bg(j,k) , βp )Chjki U (Bg(j,x) βp ) · · ·


fg (~γ , β)
U (Cg(j,k) , γ1 ) |s, g(j, k)i , (8.44)

69
where g(j, k) is a subgraph of type g. Ep is then

~ = ~
X
Ep (~γ , β) wg fg (~γ , β) (8.45)
g

where wg is the number of occurrences of the subgraph g in the original edge sum. The functions fg do not
depend on the number of decision variables n, neither on the number of constraints or clauses m. The only
dependence on those quantities comes through the weights wg and these are just read off the original graph.
Note that the expectation in Eq. (8.44) only involves the qubits in subgraph type g. The maximum number
of qubits that can appear in Eq. (8.43) comes when the subgraph is a tree. For a graph with maximum
degree v, the numbers of qubits in this tree is

(v − 1)p+1 − 1
 
qtree = 2 , (8.46)
(v − 1) − 1

(or 2p + 2 if v = 2), which is n and m independent. For each p there are only finitely many subgraph types.
~ in Eq. (8.45) can be evaluated on a classical computer whose resources are not
Using Eq. (8.44), Ep (~γ , β)
growing with n. Each fg involves operators and states in a Hilbert space whose dimension is at most 2qtree .
Admittedly for large p this may be beyond current classical technology, but the resource requirements do
not grow with n.
To run the quantum algorithm we first find the (~γ , β) ~ that maximize Ep . The only dependence on n
and m is in the weights wg and these are easily evaluated. Given the best (~γ , β) ~ we turn to the quantum
computer and produce the state |~γ , βi~ given in equation Eq. (8.27). We then measure in the computational
basis and get a string z and evaluate C(z) = hz|ĤC |zi. Repeating gives a sample of values of C(z) between
0 and +m whose mean is Ep (~γ , β). ~ An outcome of at least Ep (~γ , β)
~ − 1 will be obtained with probability
1 − 1/m with order m log m repetitions.

The Ring of Disagrees


We now analyze the performance of the quantum algorithm for MaxCut on 2-regular graphs. Regular of
degree 2 (and connected) means that the graph is a ring. The objective operator is again given by equation
Eq. (8.31) and its maximum is n or n − 1 depending on whether n is even or odd. We will analyze the
algorithm for all p.
For any p (less than n/2), for each edge in the ring, the subgraph of vertices within p of the edge is a
segment of 2p + 2 connected vertices with the given edge in the middle. So for each p there is only one type
of subgraph, a line segment of 2p + 2 qubits and the weight for this subgraph type is n. We numerically
maximize the function given in Eq. (8.44) and we find that for p = 1, 2, 3, 4, 5 and 6 the maxima are 3/4,
5/6, 7/8, 9/10, 11/12, and 13/14 to 13 decimal places from which we conclude that Ep = n(2p + 1)/(2p + 2)
for all p. So the quantum algorithm will find a cut of size n(2p + 1)/(2p + 2) − 1 or bigger. Since the best cut
is n, we see that our quantum algorithm can produce an approximation ratio that can be made arbitrarily
close to 1 by making p large enough, independent of n. For each p the circuit depth can be made 3p by
breaking the edge sum in C into two sums over hj, j + 1i with j even and j odd. So this algorithm has a
circuit depth independent of n.

Remarks on the performance of QAOA


It is still an open area of research to establish whether QAOA with a fixed p allows for a better performance
in solving approximatively NP-complete problems, with respect to classical algorithms. It is also an open
question how QAOA performs with respect to Quantum Annealing. An interesting comparative study for
Max-Cut on unweighted graphs and 2-SAT can be found in Ref. [Willsch et al., 2020].

70
Let us briefly discuss the performance of QAOA on MaxCut on (connected) 3-regular graphs. In
Ref. [Farhi et al., 2014], they show that for p = 1, the worst case approximation ratio that the quan-
tum algorithm produces is 0.6924. Hence this p = 1 result on 3-regular graphs is not as good as known
classical algorithms, i.e. 0.9326 [Halperin et al., 2004]. It is possible to analyze the performance of the
QAOA for p = 2 on 3-regular graphs. However it is more complicated than the p = 1 case. A numerical
optimisation yields 0.7559. Recently, Wurtz and Love speculated that there is no quantum advantage for
QAOA for solving Max-cut on 3-regular graphs for p < 6 [Wurtz and Love, 2020].

8.4.3 Other interesting remarks and extensions of QAOA


• Originally (see version 1 of Ref. [Farhi et al., 2014], also available on arXiv) the original authors
considered the problem Max E3LIN2, where they showed that QAOA achieves a better approximation
ratio than, at that time, the best classical approximation algorithm. The computer science community
engaged the challenge and soon came up with a better classical algorithm [Barak et al., 2015].

• It has been shown that QAOA is universal, meaning that for a problem of size n, and a choice of ĤC
and ĤM , QAOA can approximate any unitary U of dimension 2n ×2n to arbitrary precision when using
a sufficiently large iteration p, which in general depends on n [Farhi et al., 2017, Lloyd, 2018, Morales
et al., 2019].

• Farhi et al. showed that level p = 1 QAOA is computationally hard to simulate on a classical computer
without collapsing the polynomial hierarchy to the third level [Farhi and Harrow, 2019]. However, this
does not imply that QAOA for p = 1 is able to solve problems more efficiently than classically.

• In the absence of a fully connected set-up, one can split the role of the cost-function used for the
classical optimization within QAOA and the Hamiltonian implementing the evolution, and still retain
non-trivial approximation ratios, e.g. for Max-Cut [Farhi et al., 2017]. Furthermore, one can explore
the advantage stemming from rotating the qubits with different angles when acting with the mixer
Hamiltonian, and more generally introducing more free parameters.

• Brandao et al. [Brandao et al., 2018] demonstrated that if the problem instances come from a reasonable
distribution, then the expectation value of the cost function concentrates. This suggests that it is
possible to train a classical optimizer to find good angles on small instances and reuse those angles on
larger instances, as long as they come from the same distribution.

• Furthermore, it was shown that the QAOA is able to realize Grover’s search algorithm [Jiang et al.,
2017, Niu et al., 2019].

• In 2019, Hadfield et al. put forward the quantum operator alternating ansatz, which generalizes the
original QAOA ansatz to allow for more general types of Hamiltonians and initial states [Hadfield et al.,
2019].

• In general, QAOA is nowadays an active field of research, and researchers are fervidly trying to char-
acterise, understand and derive bounds on its performance.

8.4.4 More on the relation between QAOA and quantum annealing


In this section, we give a mapping between a Hamiltonian describing a quantum annealing scheme and the
QAOA for a given number of steps, taken from Ref. [Willsch et al., 2020]. The annealing Hamiltonian reads

H(s) = A(s)(−H0 ) + B(s)HC , s = t/ta ∈ [0, 1], (8.47)

71
⊗N
where (we have to add an additional minus sign to H0 such that the state |+i we start from is the ground
state of H(s) and the convention still conforms with the formulation of the QAOA)
X
H0 = σix , (8.48)
i
X X
HC = hi σiz + Jij σiz σjz . (8.49)
i ij

We discretize the time-evolution operator of the annealing process into N time steps of size τ = ta /N .
Approximating each time step to second order in τ yields

U = e+iτ A(sN )H0 /2 e−iτ B(sN )HC


× e+iτ (A(sN )+A(sN −1 ))H0 /2 · · ·
× e−iτ B(s2 )HC e+iτ (A(s2 )+A(s1 ))H0 /2
× e−iτ B(s1 )HC e+iτ A(s1 )H0 /2 , (8.50)

where sn = (n − 1/2)/N , and n = 1, . . . , N .


To map Eq. (8.50) to the QAOA evolution

V = e−iβp H0 e−iγp HC · · · e−iβ1 H0 e−iγ1 HC , (8.51)


⊗N
we can neglect e+iτ A(s1 )H0 /2 because its action on |+i yields only a global phase factor and we can choose

γn = τ B(sn ), n = 1, . . . , N (8.52)
βn = −τ (A(sn+1 ) + A(sn )) /2, n = 1, . . . , N − 1 (8.53)
βN = −τ A(sN )/2. (8.54)

So N time steps for the second-order-accurate annealing scheme correspond to p = N steps for the QAOA.
As an example, we take

A(s) = 1 − s, B(s) = s. (8.55)

Using Eqs. (8.52) - (8.54), we obtain

τ (n − 1/2)
γn = (8.56)
N n 
βn = −τ 1 − (8.57)
N
τ
βN = − . (8.58)
4N
Since the time evolution of quantum annealing is necessarily finite, with quantum annealing, too, there
is an associated success probability, i.e. the probability of being in the actual desired ground state at the
end of the evolution. In general, how the QAOA and quantum annealing performances - measured by the
respective success probabilities - compare depends on the problem instance itself. Some detailed comparison
are presented in Ref. [Willsch et al., 2020] for weighted Max-Cut and 2-SAT.

72
Chapter 9

The variational quantum eigensolver

In this chapter, we discuss the variational quantum eigensolver (VQE). The VQE is a heuristic approach to
solving various problems with a combination of quantum and classical computation. As we will see later,
the QAOA of the preceding chapter can be considered a special case of the VQE.
The content of this chapter is mostly based on the review in Ref. [Moll et al., 2018]. We first outline how
the VQE works and then discuss details of some of the steps in the algorithm.

9.1 Outline of the algorithm


The VQE is designed to solve problems that can be cast in the form of finding the ground-state energy EGS
of a Hamiltonian H. The ground-state energy is the lowest eigenvalue of the Hamiltonian,

H |ΨGS i = EGS |ΨGS i . (9.1)

How hard is this problem in general? If the Hamiltonian is k-local, i.e., if terms in H act on at most k
qubits, the problem is know to be QMA-complete for k ≥ 2. The general problem would thus be hard even
for an ideal quantum computer. However, it is believed that physical systems have Hamiltonians that do
not correspond to hard instances of this problem, and a heuristic quantum algorithm could still outperform
a classical one.
A general Hamiltonian for N qubits can be written

X X N
O
H= hα Pα = hα σα(j)
j
, (9.2)
α α j=1

where the hα are coefficients and the Pα are called Pauli strings. The latter are products of single-qubit
Pauli matrices (including the identity matrix).
The steps of the VQE algorithm are the following (see also Fig. 9.1):
0. Map the problem that you wish to solve to finding the ground-state energy of a Hamiltonian on the
form in Eq. (9.2).

1. Prepare a trial state |Ψ (ϑ)i set by a collection of parameters ϑ.

2. Measure expectation values of the Pauli strings in the Hamiltonian, i.e., measure E[Ψ (ϑ)]Pα Ψ (ϑ).
P
3. Calculate the energy E corresponding to the trial state, E = α hα E[Ψ (ϑ)]Pα Ψ (ϑ), by summing up
the results of the measurements in the preceding step.

73
Quantum Sci. Technol. 3 (2018) 030503 N Moll et al

Figure 3. Variational quantum


Figure eigensolver
9.1: The stepsmethod. TheVQE
of the trial states, which depend
algorithm. FromonRef.
a few[Moll
classical
etparameters q , are created on the
al., 2018].
quantum device and used for measuring the expectation values needed. These are combined on a classical computer to calculate the
energy Eq (q ) , i.e.the cost function, and find new parameters q to minimize it. The new q parameters are then fed back into the
algorithm. The parameters q* of the solution are obtained when the minimal energy is reached.
4. Update the parameters ϑ based on the result (and results in previous iterations).

Steps 1 and 2 are run on a quantum computer, which can handle the quantum states more efficiently than a
3.classical
Exploring Hilbert
computer. space
Steps 3 andwith
4 arethe
doneVQE
on a classical computer. The algorithm is iterative, i.e., it starts
over from step 1 after step 4, and continues to iterate until some convergence criterion is met, indicating
Tothat
exploit
the near-term quantum
ground-state energy devices,
has beenapplications andfollowing
found. In the algorithms have towebediscuss
sections, tailored to current
steps quantum
0, 1, and 4 in more
hardware
detail. with only tens or hundreds of qubits and without full quantum error correction. One main constraint
is the limited quantum volume that restricts the depth of meaningful quantum circuits. Still, a small-scale
quantum
9.2 computer
More on with step
hundred0qubits can process quantum
– mapping states that cannot even be stored in any classical
to a Hamiltonian
memory. A natural way to make use of this quantum advantage is via a hybrid quantum–classical architecture: a
quantum
Broadlyco-processor
speaking, theprepares
VQE ismulti-qubit quantum
currently mostly states
being ∣Y (q )ñ parametrized
considered for two typesbyofcontrol parameters
problems: q . The
optimization
subsequent
problems,measurement
where H is aofcost
a cost function
function (q )problem,
forEqthe = áY (q )∣and
Hq∣Y (q )ñ, typically
many-body the energy
fermionic of a problem
quantum systems, e.g.
Hamiltonian Hq, serveschemistry).
molecules (quantum a classical computer
The firsttotype
findofnew values qwas
problems in order to minimize
discussed Eq (in
extensively q ) Chapter
and find 8,thewhere
ground-state energy
we saw several examples of how optimization problems can be mapped to a Hamiltonian. In this chapter,
we therefore focus on quantum-chemistry problems.
Even though a Hamiltonian can be Eqmin = min
written down (áYfor
(q )∣
aH q∣Y (q )ñ).system, that is not the Hamiltonian that(3)
molecular
q
is used in VQE. In classical simulations of molecular systems, there are many different methods, e.g., density
This VQE approach
functional theory to Hamiltonian-problem
(DFT), where the actual solving
system has been recently
of interacting appliedisindescribed
electrons different contexts [34, 37, 40,
as non-interacting
70–72]. In fact, the Hamiltonian Hq can take many forms, the only requirement being that it can be mapped
electrons moving in a modified external potential. An approach more suited to VQE is to describe the to a
system
in second
system quantization.
of interacting qubitsThis
withrequires calculating a increasing
a non-exponentially number of number
spatial integrals
of terms.on a classical
Here computer,
we distinguish twobut
that task
relevant canHamiltonians
cases: be accomplished efficiently.
that describe The Hilbert
fermionic space consists or
condensed-matter of molecular system (section
electron orbitals. The Hamiltonian
4) and
is
Hamiltonians that describe a classical optimization problemX (section 5).
tij a†i aj + uijkl a†i a†k al aj ,
X
HF = (9.3)
i,j i,j,k,l
3.1. Variational quantum eigensolver method
In detail, the VQE method consists of four main steps as shown in figure 3. First, on the quantum processor a
tentative variational eigenstate, a trial state, ∣Y (q )ñ is generated by a sequence of gates parameterized by a set of
74
control parameters q . In the ideal case, this trial state depends on a small number of classical parameters q ,
whereas the set of gates is chosen to efficiently explore Hilbert space. In particular, the class of states forming the
solution to the minimization problem in equation (3) has to lie within the set of possible trial states. Suitable gate
sets which provide a good approximation to the wanted target state, which minimizes the cost function, have
been found for both classical optimization problems [41] (section 5) and quantum chemistry problems
where ai (a†i ) annihilates (creates) an electron in the ith orbital. The coefficients tij and uijkl describing
one- and two-electron interactions are calculated from the spatial integrals mentioned above.
n Theooperators in Eq. (9.3) are fermionic. They thus obey the fermionic anti-commutation relations, e.g.,
ai , a†j = δij . These are not the relations that the qubit Pauli operators obey. We thus need to translate
the Hamiltonian in Eq. (9.3) to a form that can be implemented on the quantum computer. One well-known
mapping from fermionic operators to qubit operators is the Jordan-Wigner transformation:

a†i → 1⊗i−1 ⊗ σ− ⊗ σz⊗N −i , (9.4)

where N is the number of orbitals and qubits. This mapping is not well suited to the VQE, because it creates
highly non-local terms in the qubit Hamiltonian. In actual applications of VQE to quantum chemistry, other
mappings are used (Bravyi-Kitaev, parity, ...). There is ongoing research on finding more suitable mappings.

9.3 More on step 1 – the trial state


The trial state |Ψ (ϑ)i can essentially be parameterized by ϑ in two ways: to form states that have a form
that is suggested by the problem Hamiltonian, or to form states that are easy to create with the available
quantum-computing hardware.

9.3.1 Problem-specific trial states


In quantum chemistry, a common class of trial states are created using a so-called coupled-cluster approach,
often the unitary coupled-cluster (UCC) one. Here, the unitary operator U (ϑ) creates the trial state:

|Ψ (ϑ)i = U (ϑ) |Φi = exp T (ϑ) − T † (ϑ) |Φi ,


 
(9.5)

where |Φi is a simple state formed by the Slater determinant for low-energy orbitals. The operator T (ϑ) is
known as a cluster operator. It is given by
X
T (ϑ) = Tk (ϑ), (9.6)
k

ϑji a†j ai ,
X
T1 (ϑ) = (9.7)
i∈occ,j∈unocc
† †
X
T2 (ϑ) = ϑkl
ij al ak aj ai , (9.8)
i,j∈occ,k,l∈unocc

where the sums go over occupied and unoccupied orbitals. The coefficients of the higher-order cluster
operators decrease rapidly as more orders are included. For this reason, the expansion is usually truncated
at the second , “double”, order (UCCSD) or the third, “triple”, order (UCCSDT).

9.3.2 Hardware-efficient trial states


On an actual quantum computer, particularly a NISQ one, implementing the cluster operators can be hard,
especially since the fermionic operators in the cluster operators must be mapped to qubit operators first.
Therefore, hardware-efficient trial states are preferred. In the work of Ref. [Kandala et al., 2017], where the
H2 , LiH, and BeH2 molecules were simulated using 2, 4, and 6 qubits, respectively, the trial states were of
the form

|Ψ (ϑ)i = Usingle (ϑ)Uent (ϑ)Usingle (ϑ)Uent (ϑ) . . . Usingle (ϑ)Uent (ϑ) Usingle (ϑ) |00 . . . 0i . (9.9)
| {z }
d repetitions

75
Here, Usingle (ϑ) represent arbitrary-single qubit rotations on each of the N qubits (different rotations in each
of the d+1 steps) and Uent (ϑ) represent two-qubit entangling operations (same in each step) that were easy to
implement in the available hardware. For the single-qubit operations alone, there are N (3d + 2) independent
rotation angles in the parameter vector ϑ (an arbitrary single-qubit rotation can be characterized by 3 Euler
angles).
Already for relatively small molecules, d needs to be more than just a few repetitions to reach accuracy
that can compete with classical methods. However, a larger d means that the quantum circuit takes longer to
run, and thus decoherence will limit the achievable d. Recently, researchers are exploring “error mitigation”
to get around this problem. In one type of error mitigation, the experiment is rerun several times with
varying levels of added noise. From this, one can extrapolate the answer towards what it would have been
for zero noise.
Note that the form of Eq. (9.9) is that of the QAOA in Eq. (8.51). This shows that the QAOA is an
example of the broader class of algorithms that is the VQE.

9.4 More on step 4 – updating the parameters


Just like the other steps in the VQE that we have discussed so far, step 4 is also the subject of ongoing
research. When searching for the ground-state energy of the problem Hamiltonian, there are several pit-
falls that the update step must deal with. For example, the parameter landscape may have local minima.
Furthermore, there is evidence that the landscape for larger problems can contain “barren plateaus”. Both
these problems are hard to deal with if one uses a standard gradient-descent-based search for the optimal
parameters. Also, the value of E obtained in step 3 is noisy, since it is based on limited sampling of the
expectation values for the Pauli strings making up the Hamiltonian (at some point, it becomes too costly to
run the quantum computer enough times to sample all strings enough time eliminate the noise). The search
method used needs to be robust against this noise. Another issue is that the number of parameters will be
large for a larger problem. One possibility is to use gradient-free algorithms like Nelder-Mead.
There are many considerations that go into choosing the right method for updating the parameters. Yet
another is that it can take non-negligible time to change all parameters and set up the instructions (pulse
shapes, etc.) needed to implement step 1 on the quantum computer again.
Although VQE is an interesting heuristic hybrid quantum-classical algorithm for NISQ devices, it is clear
that there is still much to be understood about the different steps of the algorithm. It is still unclear how
well the VQE will scale with the size of the problems it is applied to.

76
Chapter 10

Sampling models and sub-universal


models of quantum computation

These parts of the notes follow Refs. [Lund et al., 2017, Aaronson and Arkhipov, 2013], as well as the notes
by Sevag Gharibian at the Quantum Complexity Theory 2019, University of Paderborn, http://groups.
uni-paderborn.de/fg-qi/courses/UPB_QCOMPLEXITY/2019/UPB_QCOMPLEXITY_syllabus.html. We also
highly recommend to read the short review Ref. [Harrow and Montanaro, 2017].

10.1 Introduction: motivation for sampling models


One of the difficulties in rigorously proving quantum advantage for computation is linked to the difficulty
of bounding the power of classical computers. For instance, the celebrated Shor’s polynomial time quantum
algorithm for factorisation is important discovery for the utility of quantum computing, but is not satisfying
in addressing the issue of quantum advantage due to the unknown nature of the complexity of factoring.
The best known classical factoring algorithm, the general number field sieve, is exponential time (growing
1/3 2/3
as ecn ln n where n is the number of bits of the input number). However, in order to prove quantum
advantage, or really any separation between classical and quantum computational models, it must be proven
for all possible algorithms and not just those that are known.
The challenge of bounding the power of classical computation can be exemplified by the Shor’s trilemma.
The Extended Church-Turing Thesis asserts that any “reasonable” model of computation can be efficiently
simulated on a standard classical computational model such as a Turing Machine or a Random Access
Machine or a cellular automaton. Since Shor’s algorithm allows to solve factoring in polynomial time, it is
clear that one of the following statements must be false:

1. The Extended Church-Turing Thesis is true.

2. Factoring cannot be solved efficiently on a classical computer.

3. Large-scale universal quantum computers can be built (and hence there is something wrong in quantum
mechanics textbooks).

One of the main motivations of studying (selected) sampling problems is that possible to prove that they
are indeed problems that can be solved efficiently in polynomial time by a quantum computer, while they
cannot be solved in polynomial time by a classical computer (up to widely believed conjectures in computer

77
science). Experimentally validating a sampling model would hence disprove the Extended Church-Turing
thesis and thereby solve Shor’s trilemma.
Sampling problems consist in generating output random numbers according to a particular probability
distribution (see Fig. 1). All quantum computations on n qubits can be expressed as the preparation of
an n-qubit initial state |0i⊗n , a unitary evolution corresponding to a uniformly generated quantum circuit
C followed by a measurement in the computational basis on this system. In this picture the computation
outputs a length n bitstring x ∈ {0, 1}n with probability

Px = |hx|C|0i⊗n |2 (10.1)

In this way quantum computers produce probabilistic samples from a distribution determined by the circuit
C.
One of the appealing features of these models is that they are restricted models of quantum computation,
or sub-universal, and hence they do not require a universal quantum computer: they might be imple-
mentable on a large scale via near-term “noisy intermediate scale quantum devices” (NISQ). This quest for
an experimental demonstration of quantum computational speedup has fallen under the moniker of “quan-
tum advantage”, with multiple candidate approaches to date: The Instantaneous Quantum Polynomial-Time
(IQP) model, random circuit sampling (RCS), and the deterministic quantum computation with one quan-
tum bit (DQC1) model. Among these candidates, the Google AI-group recently released a demonstration
of RCS with 53 qubits [Arute et al., 2019], which is at the edge of current simulation capability of classical
computers [Pednault et al., 2019], and the Pann group a proof of Gaussian Boson Sampling with 100 optical
modes and 50 input squeezed states [Zhong et al., 2020]. Despite disproving the Extended Church-Turing
Thesis is inherently challenging because it is a scaling statement, these two experiment constitute a first
proof of quantum advantage, in that classical computers cannot currently simulate the experiments them-
selves. Some comments on the Google experiments can be found in Sec. 10.3. Here, however, we shall focus
on two of the first and historically more important models for quantum advantage: Instantaneous Quantum
Polytime and Boson sampling. Note that the historically very first leap towards exhibiting a candidate
model for quantum advantage was taken by Terhal and DiVincenzo in their 2008 paper “Adaptive quantum
computation, constant depth quantum circuits and Arthur-Merlin games" [Terhal and DiVincenzo, 2002].
Their approach was already to appeal to a complexity-theoretic argument: they gave evidence that there
exists a certain class of quantum circuits that cannot be simulated classically by proving that if a classical
simulation existed, certain complexity classes strongly believed to be distinct would collapse to the same
class.
Beyond the conceptual interest in unveiling quantum advantage with sampling models, we also underline
that a convincing demonstration of the impossibility for a classical computer to produce the same output
probability distribution as a quantum architecture is also a crucial milestone towards demonstrating the
controllability of quantum computers, with the scope of using them to perform useful algorithms.

10.2 Instantaneous Quantum Polytime


IQP circuits are an intermediate model of quantum computation where every circuit has the form C =
H ⊗n DH ⊗n , where H is a Hadamard gate and D is an efficiently generated quantum circuit that is diag-
onal in the computational basis. Sampling then simply corresponds to performing measurements in the
computational basis on the state H ⊗n DH ⊗n |0i⊗n . In [Bremner et al., 2010] it was argued that classical
computers could not efficiently sample from IQP circuits (exact sampling case) where D is chosen uniformly

 
1 0
at random from circuits composed of: (1) CZ (square-root of controlled-Z), and T = gates;
0 eiπ/4

78
or (2) Z, CZ , and CCZ (doubly controlled-Z) gates. Between two layers of Hadamard gates is a collection
of diagonal gates. Although these diagonal gates may act on the same qubit many times, they all commute
so in principle could be applied simultaneously, hence the name “instantaneous". In the case of (1) these
circuits correspond to random instances of the Ising model drawn from the complete graph [Bremner et al.,
2016].
Ref. [Bremner et al., 2016] extends this result to the approximate sampling case relying on conjectures,
similarily to the Boson Sampling model. In contrast to Boson Sampling, though, these conjectures have later
been proven by Jens Eisert and collaborators.

Figure 10.1: Example of an IQP circuit, taken from Ref. [Harrow and Montanaro, 2017].

The worst-case complexity of the problems in both (C1) and (C2) can be seen to be hard to classically
sample in two steps.
First, one prove that these families of circuits are examples of sets that become universal under post-
selection and as a result their output probabilities are hard to classically sample.
Then, complexity-theoretical arguments can be applied, that allow to conclude on the hardness of the
model. We are going to review both aspects briefly here.
It can be shown that the complexity of computing the output probabilities of IQP circuits, Px =
|hx|H ⊗n DH ⊗n |0i⊗n |2 , is classically hard in the worst case and this also holds under multiplicative ap-
proximation.

10.2.1 Hadamard gadget


This section is taken from the supplementary material of [Douce et al., 2017]. For either of the gate sets
(1) or (2), the only missing ingredient for universality is the ability to perform Hadamard gates at any
point within the circuit. In Ref. [Bremner et al., 2010] it was shown that such gates can be replaced with a
“Hadamard gadget", which requires one post-selected qubit and controlled-phase gate per Hadamard gate.
The Hadamard gadget [Bremner et al., 2010] is the very essence of the difficulty to simulate IQP circuits on
classical computers. It shows that under post-selection an IQP circuit can implement a Hadamard gate.

|ψi • X̂
|+i • X h H |ψi

Figure 10.2: Hadamard gadget in a post-selected IQP circuit, where h takes value 0 if +1 is measured, while
h = 1 if the result is −1.

Output state
Suppose one wants to implement a Hadamard gate on an arbitrary qubit |ψi = α |0i + β |1i. Following
the circuit structure depicted in Fig.10.2, we add an ancillary qubit initialized in |+i so that we start from

79
(omitting normalization)
|ψi |+i = α |00i + α |01i + β |10i + β |11i .
Then we apply the controlled Z gate and the measurement in the X basis. Conditioned on getting the
outcome corresponding to the state |+i when measuring the first qubit we have:
ĈZ
α |00i + α |01i + β |10i + β |11i 7−→ α |00i + α |01i + β |10i − β |11i
h+|
7−→ α(|0i + |1i) + β(|0i − |1i) = H |ψi . (10.2)

If instead we get the outcome corresponding to the state |−i when we measure the first qubit, the same kind
of calculations give:
ĈZ
α |00i + α |01i + β |10i + β |11i 7−→ α |00i + α |01i + β |10i − β |11i
h−|
7−→ α(|0i + |1i) − β(|0i − |1i) = XH |ψi . (10.3)

Defining h the outcome of the measurement, so that h = 0 (resp. h = 1) corresponds to measuring the state
|+i (resp. |−i), then the result of the computation is, in the general case

X h H |ψi .

So the point of post selecting is to ensure it is indeed H and not −H that has been implemented.

Probability of measuring |+i


A subtlety with post-selection that is worth mentioning concerns the probability of the conditioning result.
Specifically, if one wants to post-select on a qubit measured in a given state, then the probability associated
with this measurement must be non zero. Thus it ensures that the conditional probability describing the post-
selection is well-defined. In the case of the Hadamard gadget, we can compute the relevant success probability
explicitely. We have after the ĈZ gate – actually 1/2 times the following equation for normalization purposes:

α |00i + α |01i + β |10i − β |11i = (α + β) |+0i + (α − β) |+1i + (α − β) |−0i + (α + β) |−1i . (10.4)

It is then obvious to show that the probability to measure |+i is


1  1
|α + β|2 + |α − β|2 = .
4 2
An interesting feature of this result is that is doesn’t depend on the input state |ψi. So even if initialized in
|−i, the entangling ĈZ gate sort of smoothes the global state in such a way that the probability of measuring
the first qubit in |+i is now 1/2. Given that the number of post-selected lines l in a DV IQP circuit is of
order of the total number of lines in the circuit n, l ∼ O(n), the overall success probability distribution 1/2l
is exponentially low in the circuit size. However we stress that this post-selection should be regarded as a
mathematical tool for the hardness proof, and its actual implementation is not required in practice.

Complexity-theoretical arguments and proof of computational hardness


In Sec.10.2.1 we have shown that IQP with postselected measurements is universal for PostBQP (that is,
quantum polynomial-time with postselection on possibly exponentially-unlikely measurement outcomes). In
other words, to any computation in PostBQP corresponds a post-selected IQP circuit.
Furthermore, Aaronson previously showed that PostBQP = PP. On the other hand, if a classical
algorithm existed for the simulation of IQP, then we will show that we could simulate postselected IQP in

80
PostBPP (that is, classical polynomial-time with postselection, also called BPPpath ). This would yield to
the following chain of inclusions of complexity classes:

BPPpath = PostBPP = PostBQP = PP, (10.5)

which is known to imply a collapse of the polynomial hierarchy. The final argument why this happens is that
PP is as hard as PH (Toda’s theorem, Gödel prize 1998), while PostBPP is contained in the third level of
the polynomial hierarchy. Therefore, if exact IQP was efficiently classically simulatable, the full polynomial
hierarchy would be contained in the third level, which implies the collapse.

10.3 Random Circuit Sampling


Google has recently published a paper demonstrating quantum advantage using a 53-qubit quantum com-
puter [Arute et al., 2019]. From an information-theoretic perspective, the classical computational hardness
of sampling from this circuit family has been demonstrated in Ref. [Bouland et al., 2019].
What do these results mean and what are the implications for WACQT and other efforts to build
a quantum computer around the world? This section is the WACQT statement following release of the
Google-AI experiment, and has been written by G. Johansson. It has been edited further by G. Ferrini
after release of the IBM paper claiming that the output of the Google experiment can be simulated in 2,5
days [Pednault et al., 2019]. For more informations, and for a very clear explanation of the implications of
this experiment for quantum advantage, we recommend to read the relevant blogposts on Scott Aaronson
blog https://www.scottaaronson.com/blog/. For a very nice discussion on the terminology quantum
supremacy used by many authors and reasons to avoid it, as well as for a pedagogical introduction to the
Google experiment, visit https://phys.cam/2019/10/quantum-supremacy/.

1. It is a major milestone, demonstrating that a gate-based quantum computer can indeed perform a
computational task in 3 minutes, which would take 2.5 days to solve on the most powerful supercom-
puter on earth. Importantly, if Google would now add a few more qubits to their chip, simulating the
outcome would become impossible even for that supercomputer, in decent times. E.g., for 60 qubits,
you would need 33 such supercomputers for just storing the quantum state of the Google chip.

2. The computation is not claimed to be useful in any way. The task is to sample from a particular
probability distribution. This task was chosen carefully, because it is easy to perform on Google’s
quantum computer, but hard on any classical computer.

3. This does not imply that quantum computers from now on outperform classical computers in general.
However as quantum computers evolve, the class of computational tasks where the quantum computer
performs better than classical computers will grow. The hope is of course that at some point it will
also contain useful computations.

4. The computation was performed without using error correction. The error rate was low enough to give
the right answer for this comparably short quantum algorithm. This is a so-called NISQ device, where
NISQ stands for Noisy Intermediate-Scale Quantum device.

5. The basic architecture of the 53-qubit device is similar to previously published devices from Google.
The breakthrough consists of careful engineering of control hardware and software as well as a thor-
ough analysis of which computational tasks are easy for a quantum computer and hard for classical
computers.

81
circuit sampling-figure.pdf

Figure 10.3: Sketch of a Random Circuit Sampling, taken from Ref. [Arute et al., 2019].

6. The algorithm creates a random entangled state by repeating layers of eight sets of gates, where most
qubits are taking part in eight entangling gates, two with each nearest neighbour, interlaced with eight
single qubit gates. For the longest algorithm, each layer is repeated twenty times, giving on the order
of 53 × (8/2) × 20 ∼ 4000 two-qubit gates. The algorithm is then repeated one million times, to
give appropriate statistics. The full run-time is 200 seconds. To perform the similar sampling on a
supercomputer with 1 million cores is estimated to take 2.5 days.

7. In WACQT, we are inspired by this breakthrough. We note that the average lifetime (T1) of the
Google qubits is 16 microseconds, while we have demonstrated reproducible lifetimes of around 80
microseconds. However, Google’s design gives them very fast two-qubit gates, taking less than 20
ns. At the moment, we are working on increasing the speed of our two-qubit gates. We also note
the importance of automated calibration and control software, which we are currently developing
also for our setup. WACQT also takes full part in the work to find useful algorithms suitable for
superconducting qubit architectures.

8. In contrast to the other circuits in this chapter that are candidate to yield quantum advantage, in [Arute
et al., 2019] the gates that are implemented are in principle√ drawn
√ √ from a universal gate set.
√ More
in detail, regarding single qubit gates, they implement X, Y , W , with W = (X + Y )/ 2 (see
Fig. 10.3). They generate random quantum circuits using the two-qubit unitaries measured for each
pair during simultaneous operation, rather than a standard gate for all pairs. The typical two-qubit
gate is a full iSWAP with 1/6th of a full CZ. Using individually calibrated gates in no way limits the

82
universality of the demonstration.

10.4 Boson Sampling


10.4.1 Definition of the Boson Sampling model
Aaronson and Arkipov [Aaronson and Arkhipov, 2013] describe a simple model for producing output proba-
bilities that are hard to classically sample. Their model uses bosons that interact only by linear scattering 1 .
The bosons must be prepared in a Fock state and measured in the Fock basis. Consider at this purpose M

Figure 10.4

input ports of a multi-port splitter, which we feed with N photons. We assume that the input state only
has maximum one input photon maximum per mode (see Fig. 10.4). With no loss of generality we order the
modes such that the first N input modes contain a photon, while the others are empty. I.e,
|ψin i = |11 , ...1N , ..., 0M i = â†1 ...â†N |01 , ..., 0M i ≡ |T i. (10.6)
Now a linear optics network described by the M × M matrix U is applied to the input state. Linear bosonic
interactions, or linear scattering networks, are defined by dynamics in the Heisenberg picture that generate
a linear relationship between the annihilation operators of each mode, i.e.
M
Uj,k âk , i.e. ~b = U~a;
X
b̂j = U † âj U = (10.7)
k=1
M
b̂†j = U † â†j U = †
â†k , i.e. b~† = U † a~† ,
X
Uj,k (10.8)
k=1
1 In this sense, therefore, the Boson Sampling model already lives in the bosonic space associated with an infinite dimensional

Hilbert space and with continuous-variable operators, such as the quadratures of the field. However, we present this model in
the discrete-variable section of the notes, to highlight the contrast with models making use of squeezed states and homodyne
detection. The latter are more traditionally associated with the continuous-variable approach, and they will be presented in
Chapter 11.

83
which also implies that â†j = k=1 Uj,k b̂†k . It is important to make a distinction from the unitary operator
PM

U which acts upon the Fock basis and the unitary matrix defined by U which describes the linear mixing of
modes. For optical systems the matrix U is determined by how linear optical elements, such as beam-splitters
and phase shifters, are laid out. In fact all unitary networks can be constructed using just beam-splitter and
phase shifters.
The set of events which are then output by the algorithm is a tuple of M non-negative integers whose
sum is N . This set is denoted ΦM,N . As we will explicitly show in Sec.(10.4.2), the probability distribution of
output events is related to the matrix permanent of sub-matrices of U . The matrix permanent is defined in a
recursive way like the common matrix determinant, but without the alternation of addition and subtraction.
For example  
a b
Per = ad + cb; (10.9)
c d

a b c
 

Per  d e f  = aei + ahf + bdi + cdh + cge. (10.10)


g h i
In a more general form
n
X Y
Per(A) = ai,σ(i) (10.11)
σ∈Sn i=1

where Sn represents the elements of the symmetric group of permutations of n elements. With this, we can
now define the output distribution of the linear network with the input state from Eq.(10.6). For an output
event S = (S1 , S2 , ..., SM ) ∈ Φm,n , the probability of S

|Per(U S )|2
PS = (10.12)
S1 !S2 !...SN !
where the matrix US is a N × N sub-matrix of U where column i is repeated Si times and only the first N
rows are used. One critical observation of this distribution is that all events are proportional to the square
of a matrix permanent derived from the original network matrix U . The resulting photon distribution in
output is hard to be classically sampled, as we will show later.

10.4.2 Proof that the Boson Sampling probability distribution is proportional


to permanents
We now explicitly show that the output probability distribution of the boson sampling circuit is proportional
to the permanent of the relevant submatrix. We follow a derivation similar to the one in the supplementary
information of Ref. [Spring et al., 2013]. In a more general formulation the input population is |T i =
|T1 , ...TM i. The general case of arbitrary input state is treated in Ref. [Scheel, 2004].
In the Heisenberg representation, the input state does not evolve. However, using Eqs.(10.7) and (10.6)
we can re-express it as
   
N N X
M
â†j  |01 , ..., 0M i =  Uj,k b̂†k  |01 , ..., 0M i.
Y Y
|ψout i = |ψin i =  (10.13)
j=1 j=1 k=1

The sum in Eq.(10.13) is composed of M N terms, which is the number of ways in which we can put N
objects in M modes allowing for repetitions; we can refer to this set of permutation as Ṽ , and then rename

84
the terms of the sum in Eq.(10.13), i.e.
  N
N X
M M N
Uj,k b̂†  = †
Y XY

kU k,Ṽkj b̂Ṽ j (10.14)
k
j=1 k=1 j=1 k=1

where k is the k-th boson and Ṽkj is in which mode that photon is found in the permutation j. In other
words, the ensemble Ṽ is the set of M N permutations of N photons in M modes, repetitions allowed. Let’s
consider as an example the case in which we have in input N = 2 photons in M = 3 modes. Then we have
to consider j=1 â†j , i.e. the first two rows of the vector
Q2

â†1
 †  
U11 b̂†1 + U12 b̂†2 + U13 b̂†3
   
U11 U12 U13 b̂1
 †  
 â2  = U21 U22 U23   b̂†2  =  U21 b̂†1 + U22 b̂†2 + U23 b̂†3  (10.15)
   
† † † † †
â3 U 31 U32 U33 b̂3 U31 b̂1 + U32 b̂2 + U33 b̂3
which gives
2
â†j (U11 b̂†1 + U12 b̂†2 + U13 b̂†3 )(U21 b̂†1 + U22 b̂†2 + U23 b̂†3 )
Y
=
j=1

= U11 U21 b̂†2 † † † †


1 + U11 U22 b̂1 b̂2 + U11 U23 b̂1 b̂3
+ U12 U21 b̂†2 b̂†1 + U12 U22 b̂†2 † †
2 + U12 U23 b̂2 b̂3
+ U13 U21 b̂†3 b̂†1 + U13 U22 b̂†2 b̂†3 + U13 U23 b̂†2
3 . (10.16)
Hence here in the sum of Eq.(10.16) we have
j = 1 : k = 1 → Ṽkj = 1; k = 2 → Ṽkj = 1;
j = 2 : k = 1 → Ṽkj = 1; k = 2 → Ṽkj = 2;
j = 3 : k = 1 → Ṽkj = 1; k = 2 → Ṽkj = 3;
j = 4 : k = 1 → Ṽkj = 2; k = 2 → Ṽkj = 1;
... (10.17)
In practice, our linear network is going to redistribute the photons in the output modes (in the sense of
their expected presence). The output state is hence a coherent superposition of various multi-Fock states
according to a probability distribution which depends on the matrix U .
Let us indicate with Si we indicate the number of photons in mode i in the configuration S, S k indicates
the mode in which the photon k is found in the configuration S, and where for each configuration we have
PM
i=1 Si = N . The

total number of configurations is the number of ways of arranging N bosons in M modes,

N +M −1
i.e. Nconfig = (repetitions not allowed). If the total number of input photons is small
N

compared to the number of modes such that N ∼ M , then the probability that two photons are found in
the same output mode is rather small (birthday paradox). Nevertheless, we will consider here the general
case of an arbitrary output distribution. The probability distribution S is evaluated by projection of the
output state (10.13) onto the configuration state |Si ≡ |S1 , ..., SM i, i.e.
2
MN N
2 1  X Y
PS = |hS|ψout i| = h01 , ..., 0M | b̂S1 1 ...b̂SMM Uk,Ṽ j b̂†Ṽ j |01 , ..., 0M i
S1 !...SM ! j=1
k k
k=1
 N
 2
N M N
1
Uk,Ṽ j b̂†Ṽ j  |01 , ..., 0M i .
Y XY
= h01 , ..., 0M | b̂S k  (10.18)
(S1 !...SM !) j=1
k k
k=1 k=1

85
Before going to the general case of arbitrary output configuration, we want to √ fix the ideas with an
example. Consider the case where we project onto the state | {S}i = |020i = b̂†2
2 / 2|000i. Then the only
†2
contributing terms in Eq.(10.16) is U12 U22 b̂2 , and we obtain:

1 2
2
PS = |hS|ψout i| = h000|b̂22 U b̂†1 b̂†2 |000i
2!
1 2
= h000|b̂22 U12 U22 b̂†2
2 |000i
2
1 2 2
= 4|U12 U22 | = 2|U12 U22 | , (10.19)
2

where we have used that h02 |b̂22 b̂†S


2 |0i i = 2. We can now verify that this expression is equivalent to the
2

permanent of the submatrix of U identified by the rows corresponding to the input photons, and the columns
identified by the output configuration of interest, where the columns are repeated as many times as the
occupation of the output mode. In practice, we have:

U11 U12 U13


 

U =  U21 U22 U23  , (10.20)


U31 U32 U33

so that  
(S) U12 U12
U = , (10.21)
U22 U22
and
Per(U S ) = U12 U22 + U12 U22 = 2U12 U22 , (10.22)
yielding immediately Eq.(10.19) according to (10.12).
In the general case, let us now indicate with S̃kj the j-th permutations of the N photons in the output
state, where k is the photon index. At maximum, there will be N ! permutations, corresponding to the ways
of arranging N photons in N modes, repetitions not allowed (note that if in the output distribution there
are more photon per mode then the number of ways I can put N photons in M modes with S1 in the first,
S2 in the second... is N !/(S1 !S2 !...SM !)). For example, it is clear that e.g. if we project onto the state
| {S}i = |011i then we have S = [2, 3] and

j = 1 : k = 1 → S̃kj = 2; k = 2 → S̃kj = 3;
j = 2 : k = 1 → S̃kj = 3; k = 2 → S̃kj = 2. (10.23)

It is clear that the only non-zero terms in the sum of Eq.(10.18) will be those for which ṼkS = S̃kj , hence2
2
N! Y
N
1 X
PS = Uk,S̃ j , (10.24)
S1 !...SM ! j=1 k
k=1

where we have used that h0i |âSi i â†S


i |0i i = Si !. For instance, we obtain for the state | {S}i = |011i PS =
i

2 2
|U12 U23 + U13 U22 | , while for the state |020i we have seen that PS = 2|U12 U22 | . We can now compare
2 We have used that
 N
 N
N M N M N N! Y
N
Uk,Ṽ j b̂† j  |01 , ..., 0M i
Y X Y X Y X
h01 , ..., 0M | b̂S k  = S1 !...SM ! Uk,S̃ j = Uk,S̃ j .
k Ṽk k k
k=1 j=1 k=1 j=1 k=1 j=1 k=1

86
Eq.(10.24) with the formula for the permanent of an L × L matrix A, which reads
L! Y
X L
Per(A) = ak,σ̃j . (10.25)
k
j=1 k=1

where σ̃kj is the k-th element of the j-th permutation of the numbers 1, ..., L. It is easy to compare Eq.(10.24)
to Eq.(10.25); note however that our original matrix was M × M , while we are computing here only the
permanent of the sub-matrix involving the first N rows, and columns which correspond to configuration S.
Hence we obtain Eq.(10.12).

10.4.3 Sketch of the proof of computational hardness of the Boson Sampling


probability distribution
Exact Boson Sampling
The first main result of the original Aaronson paper states the following:
Theorem 1: hardness of exact Boson Sampling The exact Boson Sampling problem is not efficiently
solvable by a classical computer, unless P#P = BPPNP and the polynomial hierarchy collapses to the third
level. More generally, let O be any oracle that “simulates boson computers”, in the sense that O takes
as input a random string r (which O uses as its only source of randomness) and a description of a boson
computer A and returns a sample OU (r) from the probability distribution DU over possible outputs of U .
O
Then P#P = BPPNP .
At least for a computer scientist, it is tempting to interpret Theorem 1 as saying that “the exact Boson-
Sampling problem is #P-hard under BPPNP -reductions.” Notice that this would have a shocking implica-
tion: that quantum computers (indeed, quantum computers of a particularly simple kind) could efficiently
solve a #P-hard problem! There is a catch, though, that has to do with the fact that BosonSampling is a
sampling problem rather than a decision problem. In other words, the “reduction” from #P-complete prob-
lems to BosonSampling makes essential use of the hypothesis that we have a classical BosonSampling
algorithm. Details can be found in the original article.
Two proofs of Theorem 1 can be given. In the first proof, we consider the probability p of some particular
basis state when a boson computer is measured. We then prove two facts:

(1) Even approximating p to within a multiplicative constant is a #P-hard problem.

(2) If we had a polynomial-time classical algorithm for exact BosonSampling, then we could approximate
p to within a multiplicative constant in the class BPPNP , by using a standard technique called universal
hashing.

Combining facts (1) and (2), we find that, if the classical BosonSampling algorithm exists, then P#P =
BPPNP , and therefore the polynomial hierarchy collapses.
The second proof is inspired by independent work of Bremner, Jozsa, and Shepherd [Bremner et al.,
2010], and namely by the proof of computational hardness for the IQP model that we have seen in Sec.10.2.
In this proof, one starts with a result of Knill, Laflamme, and Milburn, which says that linear optics with
adaptive measurements is universal for BQP, giving name to the respective KLM model. A straightforward
modification of their construction shows that linear optics with postselected measurements is universal for
PostBQP (that is, quantum polynomial-time with postselection on possibly exponentially-unlikely measure-
ment outcomes). Furthermore, Aaronson previously showed that PostBQP = PP. On the other hand, if a
classical BosonSampling algorithm existed, then we will show that we could simulate postselected linear

87
optics in PostBPP (that is, classical polynomial-time with postselection, also called BPPpath ). We would
therefore get exactly the same chain of inclusion as in Eq.(10.5), and the same conclusions on the hardness
of the model apply.

Approximate Boson Sampling


While theoretically interesting, Theorem 1 unfortunately does not suffice to rule out the Extended Church
Turing thesis, as even an optical setup realistically cannot perform exact Boson Sampling due to experimental
noise. Thus, one must consider approximate Boson Sampling, and show that it is also hard to sample. More
precisely, what Aaronson has demonstrated is that even sampling from a probability distribution D0 U that
0i
P i
is away from the exact boson sampling one DU by a certain error bound ε, i.e. i DU − D U < ε, it is
classically hard.
The proof of computational hardness of approximate boson sampling relies on two extra conjectures,
beyond the fact that the polynomial hierarchy does not collapse, namely the Permanent of Gaussians con-
jecture, and the Permanent anti-concentration conjecture. We will not go into the details of the proof or
hardness of approximate boson sampling here.

Experimental realisations and classical simulatability of Boson Sampling


Several proof of principle experiments have been achieved, the former ones with a handful of modes and single
photons, see among others [Spring et al., 2013], and latest ones with up to 20 input photons injected in 60
modes [Wang et al., 2019]. Due to the probabilistic nature of single-photon sources, experimentalists have
now moved towards Gaussian Boson Sampling to seek for large-scale demonstrations of quantum advantage,
which was recently achieved (shortly after the Google experiment on RCS) using 50 input squeezed states,
and 100 optical modes [Zhong et al., 2020]. The input squeezed states can be deterministically produced, and
the results on the impossibility of simulating the outcome probability distributions still stand, although with
a different proof technique [Hamilton et al., 2017]. Also note that, in contrast to random circuit sampling. at
least one useful application of Gaussian Boson Sampling has been outlined, namely the calculation of vibronic
molecular spectra [Huh et al., 2015]. For an extensive discussion see https://www.scottaaronson.com/
blog/?p=5122. We will briefly review more sampling models with squeezed states and continuous-variables
in Sec. 11.4.
Finally, note that classical algorithms are also constantly improving, which allow one for the simulation
and benchmark of larger and larger Boson Sampling devices [Neville et al., 2017, Li et al., 2020]. Also,
imperfections such as noise (photon losses, partial distinguishability of the photons) might render Boson
Sampling circuits classically efficiently simulatable [Moylett et al., 2019, Qi et al., 2020].

88
Chapter 11

Continuous-Variable Approach to
Quantum Information

In the Continuous-Variable (CV) approach to quantum information processing, √ relevant observables √are
characterised by a continuous spectrum, such as the amplitude q̂ = (â+↠)/( 2) and phase p̂ = (â−↠)/(i 2)
quadratures of the electromagnetic field, satisfying [q̂, p̂] = i. The associated Hilbert space is infinite-
dimensional, and the (infinite) energy levels are eigenstates of the number operator n̂ = ↠â. This is
opposed to the traditional Discrete Variable (DV) approach, where observable have a discrete spectrum,
and the Hilbert space is finite-dimensional. Generally speaking, CV systems offer the advantage that the
resource states (e.g. squeezed states or large cluster states, that we will introduce in the following) can be
deterministically produced. Moreover, new methods for experimental implementations, offering solutions to
scalability, have emerged in the context of CV. For instance, A. Furusawa (Tokyo, Japan) and O.Pfister
(Charlottesville, USA) have been able to produce CV entangled states of up to one-million optical modes; in
the experiments of N. Treps and V. Parigi (Paris, France) several squeezed states are simultaneously available
in the same optical cavity. With microwave cavities coupled to superconducting circuits, the Yale group has
demonstrated that it is possible to store quantum information for a longer time in a radiation state (namely
an encoded “cat state") then if the corresponding state is encoded in a qubit. CV are therefore promising
for the implementation of scalable and robust architectures for quantum computing.

11.1 Quantum computing with continuous variables


11.1.1 The underlying physical model: the quantized harmonic oscillator
A review of the formalism underlying CV quantum information, namely the quantization of the harmonic
oscillator, is provided in Appendix A. Here we provide the basic tools that we will need and use in the
following. Also note that Appendix A uses a different systems of units than the one used in the rest of these
notes.

Hamiltonian of the quantized harmonic oscillator


The Hamiltonian of the quantized harmonic oscillator, expressed in terms of the creation and annihilation
(or ladder) operators, is given by
 1
Ĥ = ~ω ↠â + .
2

89
This Hamiltonian has the following energy eigenvalues
 
1
En = ~ω n + , n = 0, 1, 2, . . .
2

and the eigenstates of the Hamiltonian are known as Fock states, written in the Dirac notation as |ni, where
n represents the number of quanta or photons in the single-mode field. The Fock states are eigenstates of
the number operator n̂ = ↠â, satisfying
↠â |ni = n |ni .
The vacuum state of the harmonic oscillator is defined by

â |0i = 0. (11.1)

Acting with the creation and annihilation operators on the Fock state yields

â |ni = n |n − 1i , (11.2)

↠|ni = n + 1 |n + 1i . (11.3)

Hence it is clear that the creation operator ↠, creates a quanta of energy ~ω and the annihilation operator
â destroys a quanta of energy ~ω in the single-mode field. Any Fock state can be generated by acting on the
vacuum state multiple times with the creation operator

â†n
√ |0i = |ni .
n!
Furthermore the number states are orthogonal

hm|ni = δmn

and form a complete set



X
|ni hn| = 1.
n=0

Coherent states
In quantum optics, the coherent states are the states with most resemblance to classical states, in the sense
that they give rise to expectation values that look like the classical electric field. These states are the
eigenstates of the annihilation operator:
â |αi = α |αi . (11.4)
Because â is non-Hermitian α is usually complex. For the creation operator ↠we have for obvious reasons

hα| ↠= α∗ hα| .

It is possible to show (see Appendix) that



2 X αn
|αi = e−|α| /2
√ |ni ,
n=0 n!

which is a superposition of infinite many number of Fock states! Furthermore calculating the expectation
value of the number operator n̂
2
n̄ = hα| n̂ |αi = |α| ,

90
2
we see that |α| is related to the mean number of photons in the field. Using this we can compute the
probability of finding n photons in the field
2n
2 |α| n̄n
2
|hn|αi| = e−|α| = en̄ ,
n! n!
which we recognize as a Poisson distribution with a mean of n̄. This distribution arises when the probability
that an event occurs is independent of earlier events.
Coherent states are known to be non-orthogonal. For example, consider the scalar product hβ|αi, where
|αi and |βi are to different coherent states
2 X X (β m )∗ αn
/2 −|α|2 /2
hβ|αi = e−|β| e √ √ hm|ni
m n m! n!
2 X (β ∗ α)n
+|α|2 )/2
= e−(|β|
n
n!
2
+|α|2 −2β ∗ α)/2
= e−(|β|
2
/2 (αβ ∗ −βα∗ )/2
= e−|α−β| e ,
taking the modulus square we get
2 2
|hβ|αi| = e−|α−β| . (11.5)
2
From Eq. (11.5) it is evident that two coherent states are non-orthogonal. Only when |α − β| is large, so
2
that |hβ|αi| ∼ 0, they become quasi-orthogonal.

Quadrature operators
It is convenient to introduce the two hermitian operators
1
q̂ = √ (â + ↠),
2
(11.6)
1
p̂ = √ (â − ↠),
2i
which satisfy the commutation relation
[q̂, p̂] = i.
These are called the quadratures of the field, and they are observables. Measurement if the quadratures
is called a homodyne measurement. These observables have a continuous spectrum, from which the name
“continuous variables",

q̂|siq = s|siq (11.7)


p̂|sip = s|sip . (11.8)

As such, the quadratures possess the spectral decomposition


Z
q̂ = ds|siq q hs|s (11.9)
Z
p̂ = ds|sipp hs|s. (11.10)
R R
Quadrature eigenstates form a basis, i.e. I = ds|siq q hs| = ds|sipp hs|, which allows one to write arbitrary
bosonic states in either the q or p representations, respectively:
Z Z
|ψi = dsψq (s)|sip = dsψp (s)|sip ,

91
with ψq (s) = q hs |ψi and with ψp (s) = p hs |ψi.
The variance of an arbitrary operator is defined by
h(∆Â)2 i = hÂ2 i − hÂi2 (11.11)
and can interpreted as the uncertainty of an observable. The expectation value of the quadratures
1
hq̂i = √ hn| (â + ↠) |ni = 0,
2
1
hp̂i = √ hn| (â − ↠) |ni = 0
2i
are evaluated to zero, which means that the expectation value of the electric field is also zero (see Eq. (A.17)
in the Appendix). On the other hand the expectation value of the square is non-zero
1 1
hn| (â2 + â↠+ ↠â + â†2 ) |ni = (1 + 2n),
hq̂ 2 i =
2 2
2 1 2 † † †2 1
hp̂ i = hn| (â + ââ + â â + â ) |ni = (1 + 2n).
2 2
Thus it follows from Eq. (11.11) that the uncertainty in both quadratures are equal, and when n = 0
(corresponding to the vacuum state), the uncertainty is minimum
1
= h(∆p̂)2 ivac ,
h(∆q̂)2 ivac = (11.12)
2
also implying the saturation of the Heisenberg uncertainty principle
1
h(∆q̂)2 ivac h(∆p̂)2 ivac =
. (11.13)
4
This is known as the vacuum fluctuations. Since coherent states are nothing else than vacuum displaced in
phase space, it can be easily verified that the quantum uncertainty for both quadratures on coherent states
is the same as for vacuum, for all amplitude α:
1
h(∆q̂)2 iα = = h(∆p̂)2 iα ,
2
also resulting in minimum-uncertainly states:
1
h(∆q̂)2 iα h(∆p̂)2 iα = . (11.14)
4

Squeezed states
In contrast to coherent states, squeezed states are characterised by asymmetric fluctuations in q and p, i.e.
h(∆q̂)2 iξ < h(∆p̂)2 iξ for a q-squeezed state, and h(∆q̂)2 iξ > h(∆p̂)2 iξ for a p-squeezed state, yet satisfying
in both cases the Heisenberg principle with equality sign Eq.(11.14). Squeezed states are obtained applying
the squeezing operator to the vacuum, i.e.
iξ(q̂ p̂+p̂q̂)
|ξi = S(ξ)|0i = e− 2 |0i.
In the limit of infinite squeezing, which corresponds to infinite energy, one obtains the infinitely squeezed
states, which are the eigenstates of position and momentum with zero eigenvalue:
|0iq infinitely q-squeezed state, such that q̂|0iq = 0|0iq (11.15)
|0ip infinitely p-squeezed state, such that p̂|0ip = 0|0ip . (11.16)
Generalization of these states exist, where the state that is squeezed in Eq.(11.1.1) is an arbitrary coherent
state.

92
Figure 11.1: Coherent, Squeezed and Thermal States. Wigner function ball-on-stick representations of (i)
vacuum state (blue), (ii) coherent state (red), (iii) thermal state (yellow, dashed line), and (iv) squeezed state
(green). This picture is taken from Ref.["Storage and manipulation of optical information using gradient
echo memory in warm vapours and cold ensembles", B. Sparkes, 2013].

Phase space representation


In quantum mechanics a system can be fully described by its density operator %̂, however the density operator
can be a rather abstract object and it can be hard to read off its properties. Therefore we employ the phase
space representation. One can think of phase space as a mathematical abstract space, where the state of a
harmonic oscillator is represented in terms of its quadratures. For a general quantum system, since position
q̂ and momentum p̂ are non commutative operators in quantum physics, the state cannot be represented as
a point in phase space as it would in classical physics, because both position and momentum are not allowed
to have precise values. The same holds true for the quadratures of the electromagnetic field, q̂ and p̂.
Among the possible representations of states in phase space there is the Wigner function. It is defined
by Z ∞
1
W (q, p) ≡ hq + 12 x| %̂ |q − 12 xi eixp dx
2π −∞
where x, q and p are now interpreted as quadratures, and %̂ is the density operator for a quantum system.
W (q, p) is known as a quasi-probability distribution since it can take on negative values. Even though
the Wigner function is not a “real” probability distribution it can still be associated to one, for example
calculating the probability distribution (also referred to as the marginal distribution) over p
Z ∞ Z ∞ Z ∞
1
Pr(q) = W (q, p)dp = dp dxψ ∗ (q − 12 x)ψ(q + 21 x)eixp
−∞ 2π −∞ −∞
Z ∞

= dxψ (q − 12 x)ψ(q + 12 x)δ(x)
−∞
2
= ψ ∗ (q)ψ(q) = |ψ(q)|

returns the probability density of q, where for simplicity of calculation we have considered the case of a pure

93
state %̂ = |ψihψ|. Similarly, one can show that
Z ∞
2
Pr(p) = W (q, p)dq = |ψ(p)|
−∞

is the probability density of p, where ψ(p) is the wave function in p-representation.


For instance, both the vacuum and the coherent state are shaped like a Gaussian which does not display
any negativity, while the Wigner function for the single-photon Fock state shows clear indication of negativity.
A cut of the Wigner function (as seen from the top) for a coherent state, a squeezed state, the vacuum and
a thermal state is represented in Fig.11.1.
In Eq.(11.1.1), the Wigner function is associated to a quantum state (possibly mixed) %̂. However, it is
also possible to associate a Wigner function to a process, or also to a measurement, where in the latter case %
is replaced by the projector associated with a given outcome, e.g. |pihp| for the outcome p of the homodyne
R
measurement p̂ = ds|sipp hs|s. If the Wigner function is positive for all outcomes, then the we say that the
measurement is Wigner-positive.

Mari-Eisert theorem
Any given CV quantum circuit is defined by (i) a specific input state, (ii) a unitary evolution and (iii)
measurements. The Mari-Eisert theorem [Mari and Eisert, 2012] states that if all these elements are de-
scribed by positive Wigner functions, then there exists a classical algorithm able to efficiently simulate
this circuit. This theorem is the analog of the Gottesman-Knill theorem seen for qubits in Chapter 1.
For a nice demonstration, see the licentiate thesis of Ingrid Strandberg (Chalmers, 2019), available at
https://research.chalmers.se/publication/513592/file/513592_Fulltext.pdf.
Hence, including a negative Wigner function element is mandatory in order to design a CV sub-universal
quantum circuit that cannot be efficiently simulated by a classical device. By virtue of the Hudson theorem,
this necessarily corresponds to the use of non-Gaussian resources. Indeed, the Hudson theorem states that
the only pure states to possess a non-negative Wigner function are Gaussian states.
Previous criteria for the simulatability of CV circuits were given in terms of the Gaussianity of a circuit:
if all elements in a CV circuits are Gaussian, then the circuit is classically efficiently simulatable [Bartlett
et al., 2002]. This criterion is strictly included in the Mari-Eisert theorem, i.e. it recognises as classically
efficiently simulatable a smaller class of CV circuits: there are indeed states and processes that are non-
Gaussian, and yet for which the Wigner function is positive. Consider for instance the mixed state of vacuum
|0i and single-photon state |1i, % = p|0ih0| + (1 − p)|1ih1| for any p > 1/2.
An even more general criterion with respect to the Mari-Eisert theorem for the simulatability of continuous-
variable circuits was given in Ref. [Rahimi-Keshari et al., 2016], based on other quasi-probability distributions
than the Wigner function.

11.1.2 First definitions, elementary operations and universal gate-sets


In the following, we are going to introduce the basic CV quantum operations, and to learn about several
quantum computation models and protocols in CV. We start by introducing a notion of computational
universality in CV, as well as the elementary CV operations.

Definition of CV universality (1)


The first definition of computational universality in CV that we will encounter in this course (and the first
one historically) is the following: a CV QC system is universal if it can simulate the action of any Hamiltonian

94
eiH(p̂i ,q̂j ) consisting of a polynomial of the quadratures q̂i and p̂j in each mode i, j, to an arbitrary fixed
accuracy [Lloyd and Braunstein, 1999, Gu et al., 2009].
It is also convenient to introduce the notion of Gaussian universality. This is given by LUBOs operations,
for Linear unitary Bogoliubov transformations. They consist of any evolution operator involving a at most
quadratic polynomial in q̂ and p̂. Introducing x̂ = (q̂1 , q̂2 ...q̂N , p̂1 , p̂2 ...p̂N )T ≡ (~q, p~)T , then these operations
can be represented as
Û † x̂Û = Sx̂ + ĉ, (11.17)
with S a symplectic matrix, ĉ a displacement (see e.g. Ref. [Menicucci et al., 2011]). In the following, we are
going to list the relevant CV elementary Gaussian operations, which allows reaching Gaussian universality:

Single mode Gaussian transformations


iϑ(q̂ 2 +p̂2 )
1
a) Rotations: R(ϑ) = e 2 . Particular example: Fourier transform R(π/2) = F , with2
Z ∞
1
F |siq = √ dreirs |riq = |sip
2π ∞
Z ∞
1
F † |sip = √ dre−irs |rip = |siq (11.20)
2π ∞
eirs
with q hr|sip = √

and |sip , |siq the eigenstates of the quadrature operators introduced in Eq.(11.7).

b) Quadrature displacements:

X(s) = e−isp̂ , such that X(s)|riq = |r + siq


isq̂
Z(s) = e , such that Z(s)|rip = |r + sip . (11.21)

is(q̂ p̂+p̂q̂)
c) Squeezing: S(s) = e− 2

isq̂ 2
d) Shear: D2,q = e 2

Any single mode Gaussian operation can be decomposed in a) rotations; b) quadrature displacement; c) or
d), i.e. squeezing or shear. I.e,

{D1,q (s) = Z(s), D2,q (s), F = R(π/2)} universal set for single-mode Gaussian operations
1 Note that the action of a rotation of the quadratures of a single mode looks like
    
q̂ cos ϑ − sin ϑ q̂
R(ϑ) = , (11.18)
p̂ sin ϑ cos ϑ p̂
 
q̂1
not to be confused with an equivalent matrix acting on the two-mode amplitude quadratures vector , realising the
q̂2
rotation of one quadrature q̂1 with respect to another one q̂2 .
2 The corresponding rotation of the second mode quadrature is expressed by

 0      
q 0 −1 q −p
0 = = (11.19)
p 1 0 p q

which corresponds to the transformations F † q̂F = −p̂ and F † p̂F = q̂, i.e. F † âF = iâ and F † ↠F = −i↠.

95
Multimode Gaussian transformations
The addition of any non trivial two-mode Gaussian gate, such as the CZ interaction eigq̂×q̂ , combined
with the set of single-mode operations above allows for any general multi-mode Gaussian operation (see
Ref. [Menicucci et al., 2011] for comments on the generalization to imperfect Gaussian operations and
Gaussian measurements).

{D1,q (s) = Z(s), D2,q (s), F = R(π/2), CZ } universal set for multi-mode Gaussian operations

They are enough to perform some algorithms such as error correcting code regarding errors on single channels
(though, non-Gaussian measurements are required to correct errors such as loss on all channels simultane-
ously). Yet, all the algorithms involving only Gaussian unitaries acting on Gaussian states with Gaussian
measurements could be efficiently simulated on a classical computer, as we have seen in the previous section.

Single-mode universal transformations


For a single mode, all Gaussian operations together with any non-Gaussian operation provide universality.
isq̂ k
For example, the set Dk,q = e k for k = 1, 2, 3 for all s ∈ R together with F provides universal single mode
quantum computation (i.e., can be used to implement any single-mode unitary operation to arbitrary fixed
accuracy):

{D1,q (s) = Z(s), D2,q (s), D3,q (s), F = R(π/2)} universal set for single-mode q.c.

Multi-mode universal transformations


Adding to this any non-trivial two-mode interaction provide universal quantum computation [Lloyd and
Braunstein, 1999]. I.e.

{D1,q (s) = Z(s), D2,q (s), D3,q (s), F = R(π/2), CZ } universal set for multi-mode q.c.

n 2 π 2 2 3
o
eiq̂s , eiq̂ s , ei 4 (q̂ +p̂ ) , eiq̂1 ⊗q̂2 , eiq̂ s universal set for multi-mode q.c.

11.2 Measurement-based quantum computation: the general paradigm


in CV
It is in principle possible to perform sequences of gates from the elementary gate set that we have identified
above [Hillmann et al., 2020], and to engineer thereby quantum computations in CV within the circuit model.
However, CV quantum computation finds its most natural formulation within the measurement-based model.
The reason for this is the availability of massively large cluster states, that can be deterministically generated
(see experimental results by Furusawa, Pfister, Treps as well as latest releases from the Canadian start-up
Xanadu). In this framework, we can reformulate the notion of universal quantum computation introduced
in Sec. 11.1.2 as follows: for any CV unitary U = eiH(q̂i ,p̂j ) (corresponding to an arbitrary polynomial of q̂i
and p̂j ) and any given input |ϕi there exists an appropriate graph state such that by entangling the graph
state locally with |ϕi and applying an appropriate sequence of single-mode measurements, U |ϕi is computed.
We are now going to retrace all the steps seen in Sec. 6.2 when we have introduce the measurement-based
quantum computation model for qubits, but here in the framework of CV. We start with the definition of
CV cluster states.

96
11.2.1 Cluster states in Continuous Variables
Ideal cluster states are defined as follows: given a graph with N vertices and a certain number of edges
relying these vertices according to a specific structure modeled by an adjacency matrix V , a CV cluster
state is obtained starting from a collection of N infinitely p-squeezed states, and applying CZ interactions
according to the graph structure, i.e. the controlled-Z gate eigq̂×q̂ , yielding
N
Y i i T
|ψV i = ĈZ [V ]|0i⊗N
p = e 2 Vjk q̂j q̂k |0i⊗N
p = e 2 q̂ V q̂
|0i⊗N
p (11.22)
j,k

Here V is a real and symmetric matrix, with finite elements.


Eq.(11.22) implies that a cluster state satisfies a nullifier relation, as detailed here below. Let us first
introduce the following definition: If an operator K satisfies for a state |ϕi

K|ϕi = |ϕi (11.23)

we call it a "stabilizer" for the state |ϕi. Eq.(11.23) implies that

U KU † (U |ϕi) = U |ϕi, (11.24)

i.e that U KU † stabilizes U |ϕi. Note furthermore that

e−isp̂ |0ip ≡ X(s)|0ip = |0ip ∀s. (11.25)

From Eqs.(11.25) and (11.24) it follows that the cluster state (11.22) is stabilized by the set
† i T i T
Ki = ĈZ [V ]Xi (s)ĈZ [V ] = e 2 q̂ V q̂ Xi (s)e− 2 q̂ V q̂
YY i i
= e 2 Vj,k q̂j q̂k e−isp̂i e− 2 Vl,m q̂l q̂m (11.26)
j,k l,m

for each i. Using that eiq̂1 q̂2 p̂1 e−iq̂1 q̂2 = p̂1 − q̂2 , we finally obtain that
Y Y
Ki = e−isp̂i Vi,k eisq̂k = Xi (s) Vi,k Zk (s). (11.27)
k k

Eq.(11.27) is formally equivalent to its analog in the discrete variable case (see Eq.(20.11) in Ref.[D. Druss
and G. Leuchs, “Lectures on Quantum Information", Wiley-VCH (2007)]). The Ki form a group. The
generators of the corresponding algebra are found by derivation since Ki = e−isHi . Note that because of
Eq.(11.23) it follows that Hi |ψV i = 0, i.e. ∀i. Hence the H operators are called ”nullifiers" for the state
|ψV i. They can be easily calculated as
  " #
dKi d −isp̂i
Y
isq̂k
Hi = i =i e Vi,k e
ds (s=0) ds
k (s=0)
" #
−isp̂i d
Y Y
−isp̂i isq̂k isq̂k
= i −ip̂i e Vi,k e +e Vi,k e
ds
k k (s=0)
" #
X Y X
= p̂i + i e−isp̂i Vi,k iq̂k eisq̂l = p̂i − Vi,k q̂k (11.28)
k l (s=0) k

from which follows that !


X
p̂i − Vi,k q̂k |ψV i = 0. (11.29)
k

97
From Eq.(11.29) follows immediately that
!
X
hψV |∆2 p̂i − Vi,k q̂k |ψV i = 0, (11.30)
k

P 2
which for states with zero average also reads hψV | (p̂i − k Vi,k q̂k ) |ψV i = 0.

11.2.2 The CV MBQC paradigm


Consider the scheme in Fig. 11.2.

Figure 11.2

R
• 1) Preparation: One mode contains the initial state that we want to process |ϕi = dsf (s)|siq ; the
R
other mode is initialized to |0ip . The input state is hence |ϕi ⊗ |0ip = dsf (s)|siq |0ip . Apply a CZ
gate between the two, obtaining
Z Z Z
iq̂⊗q̂ isq̂2
CZ (|ϕi ⊗ |0ip ) = dsf (s)e |siq |0ip = dsf (s)e |siq |0ip = dsf (s)|siq |sip
1
Z Z
= √ dsf (s) dre−irs |rip |sip , (11.31)

where we have used that eisq̂ |0ip = |sip .

• 2-pre) Measure: Measure the input mode in the p̂ basis with outcome m: this projects the second
qumode into the state
Z Z
|ψiout ∝ dsf (s)e−ims |sip = e−imp̂ dsf (s)|sip = X(m)F |ϕi. (11.32)
R R
The last equality is obtained since F |ϕi = dsf (s)F |siq = dsf (s)|sip . The effect of this circuit is
to apply the identity (modulo a displacement and a rotation).

• 2) If we send as an input state the rotated state Dq̂ |ϕi = dsf (s)Dq̂ |siq where Dq̂ = eif (q̂) is an
R

operator diagonal in the computational basis, then measuring p̂ in the first qumode projects the second
mode into

|ψiout ∝ X(m)F Dq̂ |ϕi. (11.33)

Since the CZ gate commutes with Dq̂ , the same result is obtained with |ϕi as an input state, and
a rotation on the first mode after the CZ as in the left panel of Fig.11.2. This is in turn equivalent
to the situation in which no rotation is applied, but the first mode is measured in a rotated basis
Dq̂† p̂Dq̂ ≡ p̂f (q̂) . The extra displacement X(m) depends on the outcome of the measurement on mode
1, and can be compensated by choosing the measurement basis of the following steps (thus introducing
in general time-ordering).

98
• 3) Universality of single mode operations: Repeat twice the previous protocol, for two different
operators Dq̂1 and Dq̂2 . The output state is:
|ψiout = X(m2 )F (Dq̂2 X(m1 ))F Dq̂1 |ϕi
2
= X(m2 )F X(m1 )Dq̂+m 1
F Dq̂1 |ϕi
2 1
= X(m2 )F X(m1 )F D− p̂+m1 Dq̂ |ϕi
(11.34)
where we have used the inequalities X(−m)q̂X(m) = q̂ + m, Z(−m)p̂Z(m) = p̂ + m, F † (−q̂)F = p̂,
F † p̂F = q̂. If instead of measuring the second mode on p̂f (q̂) I would have measured it in the outcome-
dependent basis p̂f (−q̂−m1 ) I would have obtained as a result my deterministic desired output
|ψiout = X(m2 )F X(m1 )F Dp̂2 Dq̂1 |ϕi (11.35)
(universal for single-mode operations if I repeat other times: I can obtain the desired transforma-
tion concatenating various Dq̂ and Dp̂ ), a part from by-product operations which do not need to be
corrected.
• 3)-4) Triviality of measurement adaptivity for Gaussian unitaries Let us focus on the building
blocks of the universal set given above. For the Gaussian operations:
– F is obtained at each step of the computation.
– Dq̂ = eisq̂ is obtained by measuring p̂sq̂ = e−isq̂ p̂eisq̂ = p̂ + s (measure p̂ and add s to the result).
Note that p̂s(q̂+m) = p̂sq̂ = p̂ + s (no adaptation is required).
q̂ 2 q̂ 2 q̂ 2
– Dq̂ = eis 2 is obtained
√ by measuring in the basis p̂sq̂2 /2 = e−is 2 p̂eis 2 = p̂ + sq̂ = g(q̂ sin ϑ +
p̂ cos ϑ) with g = 1 + √ s2 and ϑ = arctan s. This
√ is readily verified because the latter defini-
tion implies cos ϑ = 1/ 1 + s2 and sin ϑ = s/ 1 + s2 . This corresponds to a rotated homo-
dyne quadrature. Note that if I would have to adapt the basis I should measure according to
p̂s(q̂+m)2 /2 = p̂ + sq̂ + ms. This can be achieved by measuring in the same basis as without
adaptation (i.e. p̂ + sq̂) and adding ms to the result
The adaptation required for these measurements is trivial and can be done after (as a post-processing).
Hence Gaussian operations can be implemented in any order or simultaneously ("parallelism").
A cubic phase gate would require instead
q̂ 3 q̂ 3 q̂ 3
– Dq̂ = eis 3 is obtained by measuring in the basis p̂sq̂3 /3 = e−is 3 p̂eis 3 = p̂ + sq̂ 2 . If I have to
measure according to p̂s(q̂+m)3 /3 = p̂ + sq̂ 2 + 2msq̂ + m2 s, the term 2msq̂ requires a non-trivial
adaptation of the measurement basis.
• 5) Cluster states as a resource: Given the fact that the CZ gates commute with the measurements,
in practice the state used as initial resource in the quantum computation protocol presented is a
generalized cluster state in which some of the modes (the input modes), also linked to the other nodes
of the cluster, are initialized to code the modes of the input state. However, one can think of taking
an initial cluster state (e.g, a square cluster state) and "writing" in some of its nodes the modes of the
physical input state (e.g. by CZ gates and measure of the input modes, or by teleportation [Ukai et al.,
2010]). A state which allows this for each U and each input state is said to be a universal resource.
It has been demonstrated by Briegel that a square lattice graph (a cluster state) with unit weights is
a universal resource for quantum computation. Depending on the specific kind of computation, other
graphs than a square lattice could be more suitable for implementing the computation [Horodecki et al.,
2006].

99
• 6) Two mode interactions: A sequence of single mode operations can be implemented via following
measurements on a linear cluster. To achieve full universality we have to add to the previous toolbox
a two-mode interaction, e.g. the CZ gate. Such two-qubit gates, e.g., the CZ and CNOT gates, can
be constructed in a two-dimensional cluster state where two input qubits are entangled with a few
other qubits, in analogy to the case of qubit MBQC discussed in Sec.6.2. By a series of single-qubit
measurements and rotations, we can end up with two of the other qubits representing the output state
corresponding to the two-qubit gate having acted on the input state.

In conclusions, note that the procedure above is an idealization: in real life, squeezed states will always
have finite energy, i.e. squeezing degree. As a result, the state output of the computation - even in the
presence of ideal entangling gates and measurements - will always be affected by Gaussian noise, caused
by the finite squeezing. How to avoid accumulation of this (and other types of) noise is the object of the
following section.

100
11.3 GKP encoding and Error Correction
In classical informatics, when it comes to make sure that the errors that can occur during a computation
can be corrected, it is convenient to resort to digitalized information, i.e. bits. For this reason combined
with versatility, analog computers have been outperformed by digital computers in the 50s-60s, when the
latter became sufficiently performant. Also note that from a computer science perspective, the definition of
computational models based on real numbers is problematic and less studied3 .
Analogously, with quantum information, if the goal is to achieve fault-tolerant quantum computation, we
must resort to qubit-like quantum information even when using continuous-variable hardware. An example
of qubit-like quantum information encoding in CV is based on the use of cat states, where the qubit-
like information is encoded in codewords |0iL = |αi + | − αi and |1iL = |iαi + | − iαi, where we have
omitted normalization constants. This encoding has allowed demonstrating a break-even point, in the sense
that quantum information encoded in such cat states has been living longer than the one encoded in the
corresponding qubit (within a transmon architecture) [Ofek et al., 2016].
In Ref. [Gottesman et al., 2001], another way of encoding qubits in quantized harmonic oscillators was
introduced by Gottesman, Kitaev and Preskill, yielding the GKP encoding. This encoding has been shown to
allow for the correction of arbitrary type of noise, while cat codes are specialized to correct for single-photon
losses. Essentially, and without attempting to be rigorous, this is because GKP codes allows one to correct for
single-mode displacements, and any noise-map can be decomposed in single-mode displacements [Gottesman
et al., 2001]. GKP state have been generated and encoded experimentally both in superconducting microwave
cavities [Campagne-Ibarcq et al., 2020] and with trapped ions [Flühmann et al., 2018].
In what follows, we are going to introduce the GKP encoding and the corresponding error-correcting
scheme in detail.

11.3.1 GKP encoding


GKP code-words
We start by recalling the basis of GKP encoding. The starting point relies on the definition of qubits as
continuous wave-functions made of an infinite number of Dirac peaks [Gottesman et al., 2001]:
X √ X √
|0L i = |2n πiq = |n πip ,
n n
X √ X √
|1L i = |(2n + 1) πiq = (−1)n |n πip . (11.36)
n n

Realistic logical qubit states are normalizable finitely squeezed states, rather than non-normalizable infinitely
squeezed states. The Dirac peaks are hence replaced by a normalized Gaussian of width ∆, while the infinite
sum itself will become a Gaussian envelope function of width δ −1 (see Figure 11.3). Overall, the realistic
states wavefunctions read:

(2n)2 πδ 2 (q − 2n π)2
Z X    
−iup̂ −iv q̂
hq |0̃L i = dudvG(u)F (v)e e hq |0iL = N0 exp − exp − ,
n
2 2∆2

(2n + 1)2 πδ 2 (q − (2n + 1) π)2
Z X    
hq |1̃L i = dudvG(u)F (v)e−iup̂ e−ivq̂ hq |1iL = N1 exp − exp − ,
n
2 2∆2
(11.37)
3 If real computation were physically realizable, one could use it to solve NP-complete problems, and even #P-complete

problems, in polynomial time. Unlimited precision real numbers in the physical universe are prohibited by the holographic
principle and the Bekenstein bound.

101
where we have introduced the noise distributions
1 u2 1 v2
G(u) = √ e− 2∆2 ; F (v) = √ e− 2δ2 , (11.38)
∆ 2π δ 2π
and N0 and N1 are normalization constants.

Figure 11.3: Wavefunction in position representation of GKP |0̃L i state in continuous blue (|0̃L i in dashed
red) with δ = ∆ = 0.25 from Equation Eq. (11.37)
.

Definition of CV universality (2)


A second definition of CV universality is based upon encodings, such as the GKP encoding: consider qubits
encoded in CV hardware. In this case, universality is achieved when one can implement at least one of the
universal gate sets for qubits quantum computation, that we introduced in Chapter 1, encoded in GKP.
On the (non-normalizable) states introduced in Eq.(11.36), Clifford operations correspond to the gates:

Z̄ = ei π q̂
, C̄Z = eiq̂k q̂l , H̄ = F. (11.39)

These are all implemented by Gaussian CV operations, as introduced in Sec.11.1.2. To promote this set of
operations to universality we need a non-Clifford gate. This requires instead a non-Gaussian operation:
 3  2  
2q̂

4

π
+ √q̂π − √ 2q̂
π
T̄ = e . (11.40)

GKP encoding and fault-tolerance


In this Section, prepared with the contribution of Tom Douce, we prove that Continuous Variables MBQC
with finite squeezing and an additional supply of GKP states yield fault tolerant quantum computation [Gottes-
man et al., 2001,Menicucci, 2014]. In Ref. [Gu et al., 2009] they showed how to implement standard quantum
gates in CV MBQC, which would be sufficient for universal QC with GKP states [Gottesman et al., 2001],
i.e. relying on a DV encoding embedded in a CV hardware. What remains to prove is that these gates can be
performed fault-tolerantly, admitting use of GKP ancillary resource states. This was achieved in [Menicucci,
2014], where it is shown that the noise in the p̂ quadrature of a GKP encoded quantum state can be replaced
by the noise of the ancillary |0̃L i state following the procedure shown in Fig. 11.4. Repeating this gadget
after a Fourier transform allows for correction of the other quadrature, thereby enabling fault tolerance.

102
In order to explain this EC procedure, we follow a toy-model approach that has been developed by
Glancy and Knill [Glancy and Knill, 2006]. This approach is based on a decomposition of the noise in
several realizations of displacements, resulting in blurred ideal GKP states. Within this approach, we are
going to show explicitly how GKP EC allows one to correct for displacements, by analyzing the output of
the circuit in Fig. 11.5 with merely displaced perfect GKP states at the input, Sections 11.3.2 and 11.3.3.
Since displacements form an operator basis, it follows that GKP states can correct any type of noise. Note
that this works in principle for arbitrary noise, even when this is non Gaussian.
This approach is exact in the infinite squeezing regime. For physical states, i.e. with finite squeezing
there are subtleties. Some of these are discussed in a recent paper by Barbara Terhal’s group, that discusses
the difference between error correction with coherent-enveloped GKP states versus blurred ideal GKP states.
These subtle differences are not completely captured by the Glancy-Knill picture, but for pedagogical reasons
we will omit further details of this discussion.

|ψ̃i • X(−s mod π)

|0̃L i • p̂ s

Figure 11.4: Procedure to correct for errors in the p̂ quadrature. |0̃L i is a noisy GKP state and |ψ̃i is a noisy
GKP-encoded CV state. X(m) is a displacement operator e−imp̂ .

|ψi e−iu1 p̂1 e−iv1 q̂1 • X(v2 − u1 )



|0L i e−iu2 p̂2 e−iv2 q̂2 • p̂ n π + u1 − v2

Figure 11.5: Modeling the noise in the protocol. |0L i is a perfect – unphysical – GKP state and |ψi is a
perfect GKP-encoded CV state.

11.3.2 Single noise realization: intermediate measurement and threshold con-


dition
Firstly, we compute the measurement result obtained at the output of the circuit in Fig. 11.5. The measure-
R
ment is homodyne detection, corresponding to the observable p̂ = p|sipp hs|, featuring the projectors

P̂s = |sipp hs| (11.41)

associated with the distinct measurement outcome s, where |sip are eigenstate of the p̂ operator. In the
following, we are going to omit the subscript when this does not cause confusion. In the following and unless
specified all integrals will run over the whole real axis.
R
More specifically, we’re interested in the observable I ⊗ p̂ where I = dq1 |q1 i hq1 | is the identity operator,
acting on mode 1 which is unmeasured. From the initial unphysical and perfect input states, a set of
displacements in position and momentum are applied before the ĈZ gate and the measurement. Thus we
want to work out the following quantity, in a sort of Heisenberg representation fashion:

eiv2 q̂2 eiu2 p̂2 eiv1 q̂1 eiu1 p̂1 ĈZ† (I ⊗ p̂)ĈZ e−iu1 p̂1 e−iv1 q̂1 e−iu2 p̂2 e−iv2 q̂2 (11.42)

103
Step by step we have:
Z
ĈZ† (I ⊗ p̂)ĈZ = ĈZ† dq1 dp2 p2 |q1 , p2 i hq1 , p2 | ĈZ
Z
= dq1 dp2 p2 |q1 , p2 − q1 i hq1 , p2 − q1 | (11.43)

Then:

eiv2 q̂2 eiu2 p̂2 eiv1 q̂1 eiu1 p̂1 ĈZ† (I ⊗ p̂)ĈZ e−iu1 p̂1 e−iv1 q̂1 e−iu2 p̂2 e−iv2 q̂2
Z
=e iv2 q̂2 iu2 p̂2 iv1 q̂1 iu1 p̂1
e e e dq1 dp2 p2 |q1 , p2 − q1 i hq1 , p2 − q1 | e−iu1 p̂1 e−iv1 q̂1 e−iu2 p̂2 e−iv2 q̂2
Z
= dq1 dp2 p2 |q1 − u1 , p2 − q1 + v2 i hq1 − u1 , p2 − q1 + v2 |
Z
= dq1 dp2 p2 |q1 , p2 − q1 − u1 + v2 i hq1 , p2 − q1 − u1 + v2 |
Z
= dq1 dp2 (q1 + p2 + u1 − v2 ) |q1 , p2 i hq1 , p2 | = q̂1 + p̂2 + u1 − v2 (11.44)

where in the last steps we have performed the changes of variables:

q1 − u1 = q10 → q1 = q10 + u1
p2 − q1 + v2 = p02 → p2 = p02 + q10 + u1 − v2

and we have renamed q10 → q1 and p02 → p2 . So what kind of measurement results are we to expect? To
figure it out we should express Eq. (11.44) in a different manner, namely:

eiv2 q̂2 eiu2 p̂2 eiv1 q̂1 eiu1 p̂1 ĈZ† (I ⊗ p̂)ĈZ e−iu1 p̂1 e−iv1 q̂1 e−iu2 p̂2 e−iv2 q̂2
Z
= dq1 dp2 (p2 + q1 + u1 − v2 ) |q1 , p2 i hq1 , p2 | . (11.45)

We recall that the quantum state that will be measured is |ψL , 0L i so a product of perfect GKP states. On

these states, a quadrature projection can only give rise to integer multiples of π. So the sum q1 + p2 will
√ √
be a multiple of π, say n π. Specifically we get

s = n π + u1 − v2 . (11.46)

The outcome of the measurement is a value corresponding to the noise of the data qubit, blurred by the
noise coming from the ancilla.

The error threshold is defined in relation with the following displacement X(−s mod π). The modulo
√ √
function has range [− π/2, π/2], and the procedure succeeds if u1 − v2 is small. Otherwise a logical
error occurs and the displacement acts as Pauli-X gate in terms of the GKP encoding. Mathematically the
constraint reads:

|u1 − v2 | ≤ π/2. (11.47)
The remaining logical errors can be taken care of by concatenating the GKP code with qubit codes and
quantum error correction. A limit on the tolerated logical error probability depending on the chosen qubit
code translates then into a constraint on the squeezing parameters [Menicucci, 2014].

104
11.3.3 Single noise realization: Output state of the GKP error-correcting gad-
get
We now compute the output state of the circuit Fig. 11.5. We show that the noise in the p̂ quadrature of
the logical qubit is replaced by the one given by the ancilla |0̃L i. Given Eq.(11.41) and standard quantum
measurement theory [Nielsen and Chuang, 2011] after that the outcome s is obtained, the state is projected
onto
|ψs i ∝ P̂s |ψi12 (11.48)
where |ψi12 is the state of input and ancilla after displacement and ĈZ gate, i.e.
|ψi12 = ĈZ e−iu1 p̂1 e−iv1 q̂1 e−iu2 p̂2 e−iv2 q̂2 |ψL , 0L i , (11.49)
R
and where again the identity I = dq1 |q1 i hq1 | is implicitly acted in mode 1, and the projector P̂s defined
in Eq.(11.41) acts on mode 2. We therefore now compute explicitly,
Z
P̂s |ψ12 i = dq1 |q1 i hq1 | ⊗ |sihs|ĈZ e−iu1 p̂1 e−iv1 q̂1 e−iu2 p̂2 e−iv2 q̂2 |ψL , 0L i ≡ |Φi |si (11.50)

with |Φi given by


Z
|Φi = dq1 |q1 i hq1 , s| ĈZ e−iu1 p̂1 e−iv1 q̂1 e−iu2 p̂2 e−iv2 q̂2 |ψL , 0L i (11.51)
Z
= dq1 |q1 i hq1 , s − q1 | e−iu1 p̂1 e−iv1 q̂1 e−iu2 p̂2 e−iv2 q̂2 |ψL , 0L i
Z
= ei(v1 u1 −u2 s) dq1 e−i(v1 −u2 )q1 |q1 i hq1 − u1 , s − q1 + v2 |ψL , 0L i.

Now we make explicit use of the form of the variable s = u1 + n π − v2 , and we have
√ √
Z
|Φi = ei(v1 u1 −u2 (u1 +n π−v2 )) dq1 e−i(v1 −u2 )q1 |q1 i hq1 − u1 , n π − q1 + u1 |ψL , 0L i (11.52)
√ √
Z
−iu2 (n π−v2 )
=e dq1 e−i(v1 −u2 )q1 |q1 + u1 i hq1 , n π − q1 |ψL , 0L i.

Now let’s go through the inner product:


√ X √ √
hn π − q1 |0L i = hn π − q1 |l πi
l∈Z
X √
= δ(q1 − (n − l) π)
l∈Z
X √
= δ(q1 − l π), (11.53)
l∈Z

so we have:
√ XZ √
|Φi = e−iu2 (n π−v2 )
dq1 δ(q1 − l π)e−i(v1 −u2 )q1 |q1 + u1 i hq1 |ψL i
l∈Z

−iu2 (n π−v2 )
X √ √ √
=e e−i(v1 −u2 )l π
|l π + u1 i hl π|ψL i. (11.54)
l∈Z

Since (the position wavefunction of) |ψL i is made of Dirac pikes on integer multiples of π, the following
P √ √
result is straightforward: using that l∈Z |l πi hl π|ψL i = |ψL i we obtain the final state before the
displacement:

|Φi = e−iu2 (n π−v2 ) −iu1 p̂1 −i(v1 −u2 )q̂1
e e |ψL i (11.55)

105
and from Eq.(11.50)

P̂s |ψ12 i = |Φi |u1 + n π − v2 ip . (11.56)
The conditional state on mode 2 is obtained as:

%̂k,cond 1 = Tr2 [P̂s |ψ12 i hψ12 | P̂s ] (11.57)



Z
= dpδ(p − (u1 + n π − v2 )) |Φi hΦ| = |Φi hΦ| (11.58)

corresponding to the pure state |Φi.


Now we just have to deal with the remaining correction, i.e. the displacement by the actual measurement
√ √ √
result obtained modulo π, yielding s[mod π] = (u1 − v2 )[mod π]
√ √ √
eis[mod π]p̂1
|si ∝ eis[mod π]p̂1
|Φi = e−iu2 (n π−v2 ) −iv2 p̂1 −i(v1 −u2 )q̂1
e e |ψL i ≡ |ψL iout , (11.59)

which can also be expressed as



|ψL i 7−→ e−iu2 (p0 −u1 +n π) −iv2 p̂1 −i(v1 −u2 )q̂1
e e |ψL i (11.60)

with p0 = u1 − v2 . So we have the equation of the output state on a single realization of the noise. Like
in [Glancy and Knill, 2006], we can see that the remaining p̂1 displacement is given by v2 and is independent
of the original noise u1 . For q̂1 though the noise from the ancilla u2 has been added to the original value v1 .
It appears more clearly if we write Eq. (11.60) as in from [Glancy and Knill, 2006]. Then the output state
would read – for an intermediate measurement result s:

e−iu2 (s−u1 ) e−iv2 p̂1 e−i(v1 −u2 )q̂1 |ψL i (11.61)

and we can see that the same discussion applies.


This error-correcting gadget can then be followed by a Fourier transform and by a second round of the
same error-correcting gadget, with the use of another fresh GKP ancilla. This has the effect to correct for
the noise in the second quadrature.

106
11.4 Sampling models and sub-universal models in Continuous Vari-
ables
In the previous Section, we have dealt with MBQC, which is a model for (universal) quantum computation.
In this Section, we turn to sub-universal computational models in CV. In Sec. 11.1.1 we have seen that
Wigner negativity is necessary in order to obtain quantum advantage - at least of the exponential type.
However, if one aims at minimal extensions of Gaussian models, the Boson Sampling model that we
have seen in Section 10.4 appears as potentially “over-kill": both the input state and the measurement are
described by negative Wigner functions, as they correspond respectively to single-photon states and photon
counting measurement. Can we define other sub-universal model of quantum computation where only one of
these elements is non-Gaussian, and show that they yield hard-to-sample output probability distributions?
Three different families of non trivial sub-universal quantum circuits can be defined, depending on whether
the element yielding the Wigner function negativity is provided by the input state, the unitary evolution,
or the measurement. This concept is exemplified in Fig.11.6. Although Wigner negativity allows stepping
outside the range of applicability of the theorem in Ref. [Mari and Eisert, 2012], it is by itself not sufficient to
imply classical hardness [García-Álvarez et al., 2020]. The classical hardness of these circuits, therefore, has
yet to be proven, for each circuit type. For of the latter kind, corresponding to Gaussian Boson Sampling
(GBS), this was done in Refs. [Lund et al., 2014, Hamilton et al., 2017]. These circuits are composed of
input squeezed states, passive linear optics evolution, and photon counters. Circuits of the second kind
are for instance related to the CV implementation of Instantaneous Quantum Computing – another sub-
universal model, where input states and measurements are Gaussian, while the evolution contains non-
Gaussian gates [Douce et al., 2017, Douce et al., 2019]. First definitions of the former class of CV circuits,
i.e. that display non-Gaussian input state and Gaussian operations and measurements, have been very
recently considered [Chakhmakhchyan and Cerf, 2017, Chabaud et al., 2017].
In this Chapter we are going to introduce the model CV-IQP (the continuous-variable version of the IQP
model introduced in Sec.10.2) and to sketch the proof of its computational hardness.

11.4.1 Continuous-Variable Instantaneous Quantum Polytime


Definition of the model
We define the model Instantaneous Quantum Computing in CV (Fig.11.7). This model is composed of the
elementary gates n o
2 3
eidq̂ , eisq̂ , eicq̂ , eibq̂1 q̂2 . (11.62)

All the gates in this model are diagonal in the position representation. As such, the Fourier transform is
absent. Therefore, this model is not universal. For pedagogical purposes in these notes, we can assume in
the definition that we are able to implement all the gates corresponding to all the possible choices of the
real parameters in Eq.(11.62). However, this has shown to be un-necessary [Douce et al., 2019]. We require
momentum squeezed states |σi with σ < 1 to be present at the input. This model is a simpler version
than the one introduced in Ref. [Douce et al., 2017]. Note the reminiscence with the IQP model defined
in Sec.10.2. Here, too, we have a “crossed" structure of the type: p − q − p in input state, evolution and
measurement, as it was x − z − x for the discrete-variable model.
For the proof of hardness, we proceed in the same way as seen in Chapter 10, when dwelling with the
IQP model: we show that adding post-selection, the model becomes universal. In analogy to the Hadamard
gadget of the discrete-variable model, we need therefore to develop a Fourier gadget.

107
Figure 11.6: Families of quantum circuits in continuous-variable displaying minimal non-Gaussian character.
Circuits with photon-subtracted squeezed states, linear evolution and heterodyne detection (instead than
homodyne as represented in the leftermost sketch) have been shown to be hard to classically sample in
Ref. [Chabaud et al., 2017]).

In the idealised case of infinite squeezing input states, this can be obtained fairly easily with the gadget
of Fig.11.8. Adding post-selection allows to recover the Fourier transform, which completes the set of gates
in the model.
However, a realistic model departs from this idealized situation in at least two unavoidable aspects.
First, in order to obtain a probability of success different from zero for the homodyne measurement, thereby
giving meaning to post-selection, we must introduce a binning of the real axis. This can still be equivalently
R
modeled by using as a projective measurement. Instead that the ideal homodyne detector p̂ = p |pi hp|,
we are going to consider the finitely-resolved homodyne detector p̂η operator that we define as [Paris et al.,
2003]
∞ Z ∞ ∞
dpχηk (p)|pihp| ≡
X X
p̂η = pk pk P̂k (11.63)
k=−∞ −∞ k=−∞

with χηk (p)= 1 for p ∈ [pk − η, pk + η] and 0 outside, pk = 2ηk and 2η the resolution, associated with
the width of the detector pixels 4 . It is easy to check that this is still a projective measurement, since
P∞ 5
k=−∞ P̂k = I, and P̂k P̂k = P̂k δk,k . Note that this modelization is distinct from modeling imperfect
0 0

detection efficiency [Leonhardt, 1997, Leonhardt and Paul, 1993, Paris et al., 2003]. Using this operator for
the measurements introduces errors in the Fourier-transform applied with the Fourier-gadget. In particular,
the output state is a mixes state. Second, realistic input states do have finite squeezing. Below, we are going
to address the effect of these sources of noise, as well as how to deal with them.
4 Note that this model turns out to be equivalent to an ideal scheme with perfectly resolving homodyne detectors and a

discretization (binning) ofR the measurement outcomes.


5 This result uses that ∞ dp0 χη (p0 )hp0 |δ(p − p0 ) = χη (p)hp| despite χη (p0 ) is not a smooth function, which can be verified
−∞ k0 k0 k0
with Riemann sum formalism.

108
|σi p̂η

.. / f 0 (q̂) / ..
. .

|σi p̂η

Figure 11.7: IQP circuit in CVs. |σi are finitely squeezed states with variance σ in the p̂ representation. The
gate f 0 (q̂) is a uniform combination of elementary gates from the set in Eqs.(11.62). The finitely-resolved
homodyne measurement p̂η has resolution 2η.

| i • X h | i • p̂ p
|+i • X hH | i |0ip • X(p)F | i
Figure 11.8: Left: Hadamard gadget in a post-selected IQP circuit, where h takes value 0 if +1 is measured,
while h = 1 if the result is −1. Right: Ideal Fourier gadget in CVs, exact translation of the Hadamard
gadget. |0ip represents an infinitely p̂-squeezed state with σ = 0, thus satisfying p̂|0ip = 0.

Fourier Gadget for Continuous Variables


We consider in this subsection the actual Fourier gadget, provided by the circuit in Fig. 11.9, where we have
removed the idealizations introduced for simplifying the discussion above. Namely, the ancillary squeezed
state is finitely squeezed, and the homodyne detection performed on the first mode possesses a finite resolu-
tion. This subsection is taken from the Supplementary Material of Ref. [Douce et al., 2017].

Output state We compute the output state of the realistic Fourier transform gate implementation. The
circuit is reproduced in Fig. 11.9. By convention the first (resp. second) ket in the tensorial product will
refer to the upper (resp. lower) arm.

|ψi • p̂η
|σip • h
|ψout i

Figure 11.9

We recall that we start from:


1
Z Z
t2
|ψi ⊗ |σip = dq ψ(q) |qiq ⊗ 1/4
√ dt e− 2σ2 |tip . (11.64)
π σ
Step by step we have first the ĈZ gate:
1
Z
t2
ĈZ |ψi ⊗ |σip = 1/4
√ dqdt e− 2σ2 ψ(q) |qiq |q + tip
π σ
1
Z
(t−q)2
= 1/4
√ dqdt e− 2σ 2 ψ(q) |qiq |tip ≡ |ψ1,2 i . (11.65)
π σ

109
We measure on the upper arm the finitely resolved p̂η operator defined in Eq.(1) of the main text. When
obtaining an outcome pk , the measurement yields the conditional state on the lower arm
h i
%̂k,cond = Tr1 P̂k ⊗ I2 |ψ1,2 i hψ1,2 | P̂k ⊗ I2
Z pk +η
= ds p1hs |ψ1,2 i hψ1,2 | sip1
pk −η
η
Z
(t−q)2 (t0 −q 0 )2 0 0
= 3/2 dqdtdq 0 dt0 e− 2σ2 e− 2σ2 ψ(q)ψ ∗ (q 0 )sinc(η(q − q 0 ))eipk (q−q ) |tip ht|p (11.66)
π σ
where we have used Z η
0
ds eis(q−q ) = 2ηsinc(η(q − q 0 )). (11.67)
−η

We remark that the same expression as in Eq.(11.66) is obtained if the homodyne detectors are perfectly
resolved, and a discretization is performed after measurement by binning the measurement outcomes.
This state then has to be normalized by the probability of getting the outcome corresponding to the
projection operator above. What really matters to us is %̂k=0,cond corresponding to the outcome pk = 0,
because it is indeed the particular post-selected state that corresponds to the implementation of the Fourier
transform. For this specific outcome we have:
η
Z
(t−q)2 (t0 −q 0 )2
0
%̂k=0,cond = 3/2 dqdtdq 0 dt0 e− 2σ2 e− 2σ2 ψ(q)ψ ∗ (q 0 )sinc(η(q − q 0 )) |tip ht|p . (11.68)
π σ
Notice that in the limit of perfect resolution η → 0 (upon normalization) we re-obtain the state that would
be obtained in an MBQC implementation of the Fourier transform with a finitely squeezed ancillary state.
As can be seen in Eq. (11.68), finite squeezing means convoluting the state with a Gaussian in the momentum
representation, or equivalently multiplication with a Gaussian in the position representation.

Probability of measuring pk = 0, Prob[k = 0] We evaluate here the probability of measuring an


outcome pk = 0 within a window function of width 2η, yielding the conditional state in Eq.(11.68). More
precisely, we consider the expectation value of the following operator:
Z η
P̂0 = ds |sip hs| (11.69)
−η

taken in the state after the ĈZ gate, that is (see Eq. (11.65))
1
Z
(t−q)2
|ψ1,2 i = 1/4 √ dqdt e− 2σ2 ψ(q) |qiq |tip .
π σ
The calculation reads:

Prob[k = 0] = hψ1,2 | P̂0 ⊗ I2 |ψ1,2 i


1
Z
(t−q)2 (t0 −q 0 )2
= √ dqdq 0 dtdt0 ds e− 2σ2 e− 2σ2 ψ ∗ (q 0 )ψ(q)δ(t − t0 )q hq 0 |sipp hs|qiq
σ π
1
Z
(t−q)2 (t−q 0 )2 0
0 − 2σ2 − 2σ2
= dqdq dtds e e ψ ∗ (q 0 )ψ(q)eis(q−q )
2σπ 3/2
1
Z
(q−q 0 )2 0
= dqdq 0 ds e− 4σ2 ψ ∗ (q 0 )ψ(q)eis(q−q )

2ησ 1
Z
(q−q 0 )2
= √ dqdq 0 √ e− 4σ2 ψ ∗ (q 0 )ψ(q)sinc(η(q − q 0 )). (11.70)
π 2σ π

110
where from the second to the third line we used that
Z +∞
(t−q)2 (t−q 0 )2 √ (q−q 0 )2
dt e− 2σ2 e− 2σ2 = πσe− 4σ2 . (11.71)
−∞

while in the last step we have used Eq.(11.67). The probability can be Taylor expanded in terms of powers
of η: Z 
2ησ 1 (q−q 0 )2
Prob[k = 0] = √ dqdq 0 √ e− 4σ2 ψ ∗ (q 0 )ψ(q) + O(η 2 ) . (11.72)
π 2σ π
The first term in the parenthesis is precisely the norm hψ1,2 |ψ1,2 i hence is equal to 1. Consequently the
probability reads:
2ησ
Prob[k = 0] = √ + O(η 3 ). (11.73)
π
The dominating order is thus proportional to the resolution 2η.

(q−q 0 )2
Large squeezing limit We note that Gaussian distributions obey the following relation: √1 e− 4σ 2 −→
2 πσ σ→0
0
δ(q − q ). Based on this property, the integral in Eq. 11.70 actually yield:
1
Z
(q−q 0 )2
dqdq 0 √ e− 4σ2 ψ ∗ (q 0 )ψ(q)sinc(η(q − q 0 )) ∼ 1
2σ π σ→0
1
Z
(q−q 0 )2
dqdq 0 √ e− 4σ2 ψ(q)ψ ∗ (q 0 ) ∼ 1. (11.74)
2σ π σ→0

Thus the probability of obtaining the outcome pk = 0 becomes dominated by the pure state contribution,
and is determined by the expression:
2ησ
Prob[k = 0] ∼ Prob(1) [k = 0] ∼ √ . (11.75)
σ→0 σ→0 π
We notice that this probability is given as a function of the squeezed state variance σ. Eq. (11.75) ensures
that the post-selection probability is non-zero, a necessary requirement to define it properly. This probability
also needs to satisfy
1
Prob[k = 0] & n . (11.76)
2
This exponentially low probability is still compatible with the definition of post-selected class as explained
in 2.2.3.

Dealing with errors


These sources of noise can in principle spoil the result of the computation and reduce the power of the post-
selected version of the circuit family, thereby preventing from achieving arbitrary Post-BQP computations
and spoiling the proof of hardness structure. How to solve this problem?
The proof of hardness of CV-IQP with input finite-squeezing states makes use of GKP states. In Ref.
[Douce et al., 2017] it was assumed that input GKP states were present at the input. The universal set of
CV operations given in Eq.(11.62) clearly includes a universal set of DV operations in GKP encoding:
(   3  2 )
√ iπ q̂
+ √q̂π −2 √q̂π
4 2

iq̂ π ¯ iq̂1 q̂2 π
Z̄ = e , CZ = e , T̄ = e , (11.77)

plus the Hadamard gate which in GKP encoding corresponds to the Fourier-transform, that hence is obtained
with post-selection:

H̄ = F . (11.78)

111
Therefore, any post-selected BQP computation can be simulated with a post-selected instance of IQP circuits,
i.e. Post-CVIQP = PostBQP. In other words, for every computation in Post-BQP we can find a CV-IQP
circuit that, with post selection corresponding to a non zero probability (and consistent with the definition
of the Post-BQP class), simulates that computation, despite the finite input squeezing and finite resolution.
However, it has later been shown in Ref. [Douce et al., 2019] that the circuit family is hard-to-sample
even without input GKP states. The trick is to show that generation of GKP states can be subsumed in the
gates of the circuit itself. The technicalities of that work are out of the scope of this lecture.

11.5 Quantum annealing


So far, in the Continuous-Variable quantum computing Chapter but in general in this course notes, we
had in mind a “device-independent formalism", that could be applied to any physical system modeled by
the quantized harmonic oscillator (for this Chapter), or by a two-level quantum system (for the qubit
part). In this section instead, we are going to focus on a specific implementation of a Continuous-Variable
Quantum annealer, that is implemented with microwave cavities (our quantized harmonic oscillator), coupled
to superconducting quantum circuits allowing to modify the quantum state of the field in the cavity. The
model that we present here is drawn from Ref. [Puri et al., 2017]. The full Section is taken from Ref. [Vikstål,
2018].
The heart of our continuous variable quantum annealer device is the two-photon pumped Kerr-Nonlinear
Resonator (KNR). It consists of a superconducting resonator made nonlinear with a Kerr-type nonlinearity,
that causes the energy levels of the resonator to be non-equidistant. In this Section we begin with a brief
overview of circuit QED and superconductivity. After that, we will introduce the two-photon pumped KNR
in more detail. This device will allow encoding a combinatorial optimization problem in a network of two-
photon pumped KNR’s where a single Ising spin is mapped to two quasi-orthogonal coherent states with
opposite phases. The latter constitute a two-fold degenerate eigenspace for the two-photon pumped KNR
in the rotating frame. Furthermore, we will demonstrate how the AQC Hamiltonian Eq. (7.1) is realized by
adiabatically controlling the frequency and amplitude of the pumps, and how N linearly coupled KNRs that
are initialized to vacuum |0i at t = 0, will evolve to the ground state of an Ising Hamiltonian at t = τ .

11.5.1 Circuit QED


Superconductivity is a quantum phenomenon that appears when certain metals are cooled down below a
critical temperature. There, the metals become superconducting and a current can flow without electrical
resistance [Tinkham, 2004]. Circuit quantum electrodynamics, or simply circuit QED, is based on super-
conducting electrical circuits. Similar to cavity QED, which studies atoms coupled to a single mode of the
electromagnetic field inside a cavity, circuit QED studies on-chip implementation of QED using microwave
photons. In circuit QED a superconducting coplanar waveguide between two capacitors plays the role of the
cavity and the atom is replaced with an artificial one. Fig. 11.10 to the left illustrates a simple cavity that
uses two mirrors facing others each to confine photons. The illustration to the right depicts the corresponding
circuit QED implementation.
While photons do not interact with each other in vacuum an effective photon-photon interaction can be
engineered by the use of non-linear materials. One such effective interaction is called the Kerr effect, which
happens when a material has a refractive index that is proportional to the intensity of the electromagnetic
field. A resonator can be made nonlinear with the use of a Josephson junction that introduces a Kerr-type
nonlinearity. A Josephson junction consists of two superconducting metals separated by a weak link, that
is made of a thin insulating barrier. For our purpose the Josephson junction can simply be viewed as a
dissipationless nonlinear inductor.

112
Figure 11.10: Left: Sketch of a cavity that confines photons between two mirrors. Right: Sketch of a
superconducting resonator where microwave photons are confined between two capacitors that act as semi-
transparent mirrors. Inspiration taken from Ida-Maria Svensson PhD Thesis (Chalmers, 2018).

By connecting two Josephson junctions in a superconducting loop we get what is called a Superconducting
QUantum Interference Device (SQUID). SQUIDs are very sensitive to magnetic fields and are therefore
mainly used as magnetometers for various purposes. However the SQUID can also be used as a two-photon
pump if a time-varying magnetic flux is threaded through the superconducting loop at twice the resonator
frequency. The SQUID will be an important component in the physical implementation of the continuous
variable quantum annealer.

11.5.2 Two-photon pumped Kerr-nonlinear resonator


We now consider the proposed physical implementation of a continuous variable quantum annealer that was
developed by Puri et al [Puri et al., 2017]. As already mentioned, the main component of the quantum an-
nealer consists of a two-photon pumped KNR. A two-photon pumped KNR can be engineered by embedding
a SQUID in the middle of a half-wavelength (λ/2) resonator, see Fig. 11.11. By modulating the magnetic
flux Φext through the SQUID at twice the resonator frequency the SQUID can be used as a parametric two-
photon pump. The Hamiltonian of a two-photon pumped KNR can be derived by expanding the Josephson
potential of the SQUID up to fourth order and doing the rotating wave approximation, see Appendix B.1.
It results in6
Ĥ(t) = ωr ↠â − Kâ2† â2 + G â†2 e−2iωr t + â2 e2iωr t


where ωr denotes the mode frequency of the resonator, K denotes the amplitude of the Kerr-nonlinearity,
G denotes the amplitude of the two-photon pump, and the two-photon pump frequency is set to twice the
resonator frequency. The time-dependence of the Hamiltonian can be removed by transforming to a frame

rotating at the resonator frequency. Indeed, by doing the following unitary transformation Û(t) = eiωr tâ â ,
6 From here forth we will use natural units where ~ = c = 1.

𝜆/4 Φext 𝜆/4

Figure 11.11: Schematic illustration of a two-photon pumped KNR: it consists of a SQUID sandwiched
between two quarter wavelength (λ/4) resonators. The SQUID consists of two Josephson junctions connected
in a superconducting loop. The Josephson junctions are drawn as square boxes with a cross in the figure.
The loop is furthermore penetrated by a magnetic flux Φext and the ends of the resonators are capacitively
coupled to a transmission line.

113
we get
G2
  
G G
Ĥ = −Kâ2† â2 + G â†2 + â2 = −K â†2 − â2 −

+ . (11.79)
K K K
p
From the last expression it is evident that the two coherent states |±αi with α = G/K are are two
degenerate eigenstates of this Hamiltonian with eigenenergy

G2
Ĥ |±αi = |±αi .
K
Following Puri et al. [Puri et al., 2017] we now take advantage of this well-defined two state subspace
and choose to encode
p the plogical spins |0̄i and |1̄i onto these coherent states, i.e. we do the mapping
{|0̄i , |1̄i} → {|− G/Ki , | G/Ki}, where the bar is used to distinguish the logical spin states from the
vacuum and single-photon Fock state. For sufficiently large |α| the states can be considered orthogonal,
indeed following Eq. (11.5) we have that
2 2
|h1̄|0̄i| = e−4|α| .
√ 2
For instance if |α| = 3, then |h1̄|0̄i| ≈ 10−6 . Remarkably, these “code-words" are quite stable even in the
presence of noise induced by single-photon losses from the resonator (which we will consider as the main loss
mechanism). Lousely speaking, this is due to the fact that the coherent states are eigenstates of the loss
operator â.
In order to corroborate this fact, we start by showing that single-photon loss does not lead to spin-flip
2
error, which is when a qubit flips e.g. |0i → |1i. Consider the quantity | h1̄|â|0̄i| that describes the overlap
between the two logical spin states |0̄i and |1̄i after a single-photon has been lost from the resonator. Using
Eq. (11.4) and Eq. (11.5) we have that
2 2 2 2 2
| h1̄|â|0̄i| = |α| |h1̄|0̄i| = |α| e−4|α| ≈ 10−5 ,

for |α| = 3. Therefore we propose to use a two-photon pump amplitude set to G = 3K so that |α| =
p √
3K/K = 3, as such we will use G = 3K throughout this section.
More formally (beyond the scope of the lecture but given here as a complement), we compute the semi-
classical steady-state solution of the KNR. In the presence of single-photon loss the collapse operator in the

Lindblad-master equation takes the form Ĉ = γâ and the time-evolution of the density matrix is then
given by
d%̂(t) h i
= −i Ĥ, %̂ + γ 2â%̂↠− %̂↠â − ↠â%̂ .

dt
This furthermore assumes that the system is in thermal equilibrium with a zero-temperature environment,
so that the number of thermal photons in the system is negligible. This is a good approximation since the
typical microwave cavity is at mK. The described process is schematically sketched in Fig. 11.12. While it is
possible to obtain the steady-state solution analytically [Bartolo et al., 2016, Meaney et al., 2014], it can be
found quite easily (see Appendix B.3) from the solution of the semi-classical equations of motion that the
expected position of the states |±αi in presence of single-photon loss are
1/4 !!
1 16G2 − γ 2

i γ
α= exp − arctan p .
2 K2 2 16G2 − γ 2

p the single-photon loss rate, namely if 4G  γ


Thus, if the two-photon pump amplitude is much larger than
or equivalently 12K  γ, since G = 3K, then α ' G/K and the mapping is confined to the spin
subspace spanned by |0̄i and |1̄i. Furthermore K/30 > γ is already achievable today in superconducting

114
γ
.. |n〉
G .
|3〉

|2〉

K |1〉

|0〉
G γ

Figure 11.12: Left: Schematic illustration of a cavity with a Kerr-nonlinear element that introduces an
effective photon-photon interaction of strength K. The cavity is subject to two-photon pumping with
amplitude G and single-photon loss at a rate γ. Right: The aforementioned effects sketched on the Fock
states |ni.

resonators [Kirchmair et al., 2013]. When this condition is satisfied the steady-state density matrix of the
system, d%̂/dt = 0, takes the form
1
%̂ss ≈ (|αi hα| + |−αi h−α|) ,
2
which is an equal weighted statistical mixture of the two coherent states |±αi or equivalently of the two
logical spin states {|0̄i , |1̄i}. This can furthermore be confirmed by numerical integration of the master
equation Eq. (11.5.2). The numerical results show that the mapping remains robust in presence of single-
photon loss as long as 12K  γ. Figure 11.13 shows the density plot of the steady-state Wigner function
for γ = 0.12K and γ = 6K, respectively. When γ = 0.12K the fidelity is F = 99.99% with respect to the
ideal steady state Eq. (11.5.2). When γ = 6K, instead, outside the regime of low photon loss, the fidelity
drops to F = 85.13%.

11.5.3 Two- & one-photon pumped Kerr-nonlinear resonator


We now turn our attention to the realization of the Ising Hamiltonian Eq. (8.1). We will demonstrate that
a sufficiently weak single-photon pump can act like an effective magnetic field [Puri et al., 2017]. The single-
photon pump can be physically implemented by capacitively coupling one end of the two-photon pumped
KNR to a transmission line through which the single-photons are added to the resonator, see Figure 11.14.
By considering the addition of a weak single-photon pump with a frequency that is in resonance with the
resonator we get in a frame rotating at the resonator frequency that the the Hamiltonian is

Ĥ1 = −Kâ†2 â2 + G â†2 + â2 + F ↠+ â ,


 
(11.80)

where F denotes the amplitude of the single-photon pump. With this additional pump it can be once again
found (see Appendix B.4) from the solution to the semi-classical equations of motion (neglecting losses) that
the expected position of the coherent states |±αi of the resonator are
r
G F
±α ' ± + ,
K 4G

115
(a) (b)

Figure 11.13: Steady state of a two-photon driven KNR in presence of single-photon loss. Figure (a) is the
steady state phase space density plot of the Wigner function with a photon loss rate γ = 0.12K and figure
(b) with γ = 6K. The two-photon drive amplitude was set to G = 3K in both figures. The white circles
indicate the expected position of the coherent states. On the axes, Im(α) and Re(α) are the momentum and
position quadrature respectively.

Single-photon pump

𝜆/4 Φext 𝜆/4

Figure 11.14: The two- and one-photon pumped KNR. One end of the two-photon pumped KNR is ca-
pacitively coupled to a transmission line through which microwave pulses of amplitude amplitude F and
frequency ωr drive the resonator.

i.e. theypare slightly displaced by a factor F/4G, but if 4G  F or equivalently 12K  F , then
±α ' ± G/K, and the states are once again confined to the spin subspace spanned by |0̄i and |1̄i.
Most importantly, this single-photon pump lifts the degeneracy between the states |0̄i and |1̄i by an amount

∆E = hα|Ĥ1 |αi − h−α|Ĥ1 |−αi = 4F α.

In the spin subspace spanned by our two logical spins |0̄i and |1̄i the Hamiltonian of Eq. (11.80) can therefore
be expressed as
¯ ¯
IˆĤ1 Iˆ = (|0̄i h0̄| + |1̄i h1̄|) −Kâ†2 â2 + G â†2 + â2 + F ↠+ â (|0̄i h0̄| + |1̄i h1̄|)
 

using the definition of a coherent state â |0̄/1̄i = ∓α |0̄/1̄i and the quasi-orthogonality condition h1̄|0̄i ≈ 0,
we get
¯ ¯ ¯ ¯z
IˆĤ1 Iˆ = (−Kα4 + 2Gα2 )Iˆ + 2F ασ̂
¯ z + const
= 2F ασ̂
¯ z = |1̄i h1̄| − |0̄i h0̄| is the Pauli z-matrix. This is the Ising Hamiltonian for a single Ising spin in a
where σ̂
magnetic field of strength |2F α|. Therefore the application of a weak single-photon drive induces an effective
magnetic field.

116
Now, recall from Sec.7.4 that we require an initial Hamiltonian that does not commute with the final
Hamiltonian and furthermore should have a simple non-degenerate ground state. We consider the case when
the resonators are initialized to vacuum, since it is relative simple to prepare the resonators in such a state.
We choose an initial Hamiltonian in the rotating frame of the following form
Ĥ0 = −δ↠â − Kâ†2 â2 (11.81)
where δ is a finite positive detuning to separate the vacuum state |0i from the single-photon Fock state |1i by
an energy gap δ. At a first glance Eq. (11.81) might fill us with fear and trepidation since the Hamiltonian
isn’t bounded from below. Fear not! One can show that this is a consequence of the approximations used in
the derivation, namely that we truncated the expansion of the Josephson cosine potential to fourth order,
see Appendix B.1, and that the microscopic Hamiltonian from which it was derived from Eq. (B.5) does
not exhibit this property. We also have to remember that photon loss also naturally leads to lower photon
number states. Therefore as pointed out by Nigg et al. (2016) [Nigg et al., 2017] we have to think of the zero
photon state in the rotating frame as the state with highest energy instead of the one with lowest energy.
We also adopt the terminology by Nigg et al. and refer to the ground state, the state with highest energy,
as the roof state.
The time-dependent Hamiltonian for the QA algorithm is obtained by adiabatically varying the two- and
single-photon pump amplitudes and frequencies (see Appendix B.2)
 
t t
Ĥ(t) = 1 − Ĥ0 + Ĥ1
τ τ
  (11.82)
t † †2 2
 t †2 2 †2 2
 †

= 1− −δâ â − Kâ â + −Kâ â + G â + â + F â + â ,
τ τ
yielding Eq. (7.1) for the single Ising spin in a magnetic field problem. Notice that the Kerr term in the
Hamiltonian above is actually time independent so that one only needs to vary the frequency and amplitude
of the pumps.

11.5.4 Coupled two-photon pumped Kerr-nonlinear resonators


After having demonstrated that a weak single-photon pump induces an effective magnetic field we will now
explain how the coupling between spins is realized.
The coupling between two resonators can in the simplest form be expressed as
g(â1 â†2 + â†1 â2 ),
where g describes single-photon exchange rate between two capacitively coupled resonators. We will begin by
considering the case of two capacitively coupled two-photon pumped KNRs as shown in Fig 11.15 and assume
that the parameters K and G are identical for both resonators. We furthermore neglect the single-photon

𝜆/4 Φext 𝜆/4 𝜆/4 Φext 𝜆/4

Figure 11.15: Two capacitively coupled KNRs.

pump for the moment. The Hamiltonian for two coupled KNRs in the rotating frame is given by
2      
−Kâ†2 †2 † †
X
2 2
Ĥ1 = i âi + G âi + âi + g â1 â2 + â1 â2 .
i=1

117
A steady state analysis shows that for sufficiently small coupling strengths g  6K the two-photon pumped
KNR is approximately kept in the subspace spanned by |0̄, 0̄i, |0̄, 1̄i, |1̄, 0̄i and |1̄, 1̄i (see Appendix B.5).
Similar to the previous section it can be shown that in the spin subspace spanned by the logical spins
{|0̄, 0̄i , |0̄, 1̄i , |1̄, 0̄i , |1̄, 1̄i} this Hamiltonian takes the form

¯ z σ̂
Ĥ1 = 2gα2 σ̂ ¯z
1 2 + const.

This is the Hamiltonian for two magnetically coupled spins with g > 0 (g < 0) corresponding to the
ferromagnetic (antiferromagnetic) coupling between the spins in the rotating frame. The initial Hamiltonian
for two coupled KNRs is chosen to be:
2    
−δâ†i âi − Kâ†2 † †
X
2
Ĥ0 = i âi + g â1 â2 + â1 â2 .
i=1

Furthermore, in order to ensure that the vacuum state |0, 0i is the roof state of the initial Hamiltonian Ĥ0
the detuning has to be greater than the single-photon exchange rate δ > g, which can be seen by looking at
the eigenstate following the roof state

|1, 0i + |0, 1i |1, 0i + |0, 1i


   
Ĥ0 √ = (g − δ) √ .
2 2
The two-spin Ising problem is then realized with the time-dependent Hamiltonian
 
t t
Ĥ(t) = 1 − Ĥ0 + Ĥ1 .
τ τ

What is interesting about the two-spin Ising problem is that the ground state exhibits frustration. This
implies that the ground state for the two-spin problem is degenerate. At t = τ the two degenerate roof states
for ferromagnetic and antiferromagnetic coupling are {|0̄, 0̄i , |1̄, 1̄i} and {|0̄, 1̄i , |1̄, 0̄i}, respectively. It might
seem like this can pose a problem since the gap closes between the two “highest” energy states, and therefore
violates the adiabatic theorem. However, the theorem is not violated for the following reason. Recall from
i that an observable A is conserved if its Hermitian operator Â
your introductory course in quantum mechanics
h
commutes with the Hamiltonian, i.e. Â, Ĥ = 0. It can be shown that the parity operator which is defined
 P  h i
by P̂2 = exp iπ i=1 â†i âi commutes with both the initial and final Hamiltonian, so that P̂2 , Ĥ(t) = 0.
2

This means that parity is conserved and that transitions only occur between even-even and odd-odd parity
eigenstates. So since the zero-photon Fock state |0i has even parity and the single-photon Fock states |1i
has odd parity, degeneracy is not a problem.
We now possess the tools to write down the final Hamiltonian Ĥ1 for N linearly coupled KNRs, which
reads
N       
−Kâ†2 †2 †
gij â†i âj + â†j âi ,
X X
2 2
Ĥ1 = i âi + G âi + âi + Fi âi + âi + (11.83)
i=1 1≤i<j≤N

where gij is the single-photon coupling strength between resonator i and j. In the computational basis the
final Hamiltonian becomes
N
¯ z σ̂
¯z ¯ z + const,
X X
Ĥ1 = 2α2 gij σ̂ i j + 2α Fi σ̂i (11.84)
1≤i<j≤N i=1

118
which corresponds to the Ising Hamiltonian for N spins. The corresponding initial Hamiltonian can just as
readily be written down
N    
−δâ†i âi − Kâ†2 gij â†i âj + âi â†j .
X X
2
Ĥ0 = i â i +
i=1 1≤i<j≤N

We now compare the couplings and the single-photon pumping strengths of the KNR Ising Hamiltonian
Eq. (11.84) with the couplings and magnetic field strengths of the Ising Hamiltonian Eq. (8.2). This yields
to the following correspondence for the couplings:
− 2α2 gij ↔ Jij . (11.85)
The minus sign is introduced because gij > 0 (gij < 0) corresponds to ferromagnetic (antiferromagnetic)
coupling while Jij > 0 (Jij < 0) corresponds to antiferromagnetic (ferromagnetic) coupling. Similar for the
single-photon pumps we make the following correspondence:
2αFi ↔ hi . (11.86)
Lastly, the final component for this continuous variable quantum annealing architecture is the readout of
the state of the computational states. This can be implemented through balanced homodyne detection that
is enabled via capacitively coupled transmission lines [Puri et al., 2017], corresponding to the measurement
R
of a suitably chosen field quadrature, namely q̂ = dq|qihq|.

11.5.5 Simulation of relevant combinatorial optimization problems


Subset sum problem
To demonstrate the capabilities of the two-photon pumped KNR architecture described in chapter 11.5, we
consider a small instance of the subset sum problem that was introduced in section 8.2.1. We can consider
the subset sum problem defined by the set n = {−2, 1, 2} and m = 3. This problem requires three coupled
KNRs and the Ising spin configuration that satisfy this problem is s1 = −1 and s2 = s3 = +1. By dividing
Eq. (11.85) with −2α2 and replacing Jij with ni nj /2 we get that the coupling between the KNR’s for the
subset sum problem should be chosen as
ni nj
gij = −A 2 , gii = 0.

Here A is a positive scale factor which will leave the Ising problem invariant and is convenient to introduce
in order to satisfy the physical constraints on the coupling strengths. Next we divide Eq. (11.86) by 2α and
replace hi with Eq. (8.5), to find that the single-photon pump amplitudes should be chosen as
 
N
1 1 X
Fi = A nj − m ni ,
2α 2 j=1

where A is the same scale factor. To make sure that the conditions 6K  gij and 12K  Fi are satisfied,
we chose A to be
K K
A= = .
2 max|hi | 10
It should be noted that the scale factor may vary depending on the given problem and the set of numbers in
the problem. After having defined all the couplings and single-photon pump strengths we have diagonalized
the time-dependent QA Hamiltonian
 
t t
Ĥ(t) = 1 − Ĥ0 + Ĥ1 , (11.87)
τ τ

119
where the initial and final Hamiltonian are
3       
−Kâ†2 †2 †
gij â†i âj + â†j âi ,
X X
2 2
Ĥ1 = i âi + G âi + âi + Fi âi + âi +
i=1 1≤i<j≤3
3    
−δâ†i âi − Kâ†2 gij â†i âj + âi â†j .
X X
2
Ĥ0 = i âi +
i=1 1≤i<j≤3

In the spin subspace spanned by the computational basis the final Hamiltonian takes the form (dropping the
constant)    
3 3
1 X
¯ z σ̂
¯z
X 1 X
¯z  ,
Ĥ1 = A − ni nj σ̂ i j +
 nj − m ni σ̂ i
2 i=1
2 j=1
1≤i<j≤3

realizing the Ising Hamiltonian for the subset sum problem.


Pontus Vikstål has performed numerical simulations of this problem [Vikstål, 2018]. From diagonalizing
the instantaneous QA Hamiltonian Eq. (11.87) he found that the minimum energy gap was ∆min = 0.046K
and that the adiabatic condition was τ  79/K which requires an evolution time that is approximately
τ = 7900/K. Next he initialized each individual resonator to the zero-photon Fock state |0i, and numer-
ically solved the Schrödinger equation with Ĥ(t) given by Eq. (11.87). The success probability was then
computed as the probability of occupation of the correct roof state at t = τ , that is, h1̄, 0̄, 0̄|%̂(τ )|1̄, 0̄, 0̄i,
which corresponds to the solution s1 = −1 and s2 = s3 = +1. The calculated success probability was found
to be 99.99% which shows that the evolution is to a very good approximation restricted to the computational
subspace and that the KNR annealer correctly computes the solution of the subset sum problem. Fig. 11.16
shows the Wigner function at the end of the evolution for each individual resonator.

(a) First KNR (b) Second KNR (c) Third KNR

Figure 11.16: The Wigner function for each individual KNR at t = τ .

Number partitioning problem


To map the number partitioning problem onto a network of two-photon pumped KNRs we begin by comparing
the KNR Ising Hamiltonian Eq. (11.84) with the number partitioning problem Ising Hamiltonian Eq. (8.8).
The comparison shows that the single-photon exchange rate coupling between i:th and j:th resonator should
be chosen as
ni nj
gij = −A 2 , gii = 0.
α
Since this problem does not require any additional single-photon pumps, we can safely include α2 that is in
the denominator inside the scale factor A, so that gij = −Ani nj . We considered the number partitioning
problem defined by the set n = {3, 5, 8}. This problem also requires three resonators coupled together. The

120
fair partitioning to this problem is S1 = {3, 5} and S2 = {8} for which the Ising spin configurations that
satisfy this problem are s1 = s2 = +1 and s3 = −1 and the configuration with all spins flipped. We chose
A = 0.0015K, to satisfy 6K  gij . The initial and final Hamiltonian is given by
3     
−Kâ†2 †2
gij â†i âj + â†j âi ,
X X
2 2
Ĥ1 = i âi + G âi + âi +
i=1 1≤i<j≤3
3    
−δâ†i âi − Kâ†2 gij â†i âj + âi â†j .
X X
2
Ĥ0 = i âi +
i=1 1≤i<j≤3

In the spin subspace spanned by the coherent eigenstates of the two-photon pumped KNR in the rotating
frame the final Hamiltonian takes the form

¯ z σ̂
¯z
X
Ĥ1 = −A ni nj σ̂i j,
1≤i<j≤3

realizing the Ising Hamiltonian for the number partitioning problem. From diagonalizing the time-dependent
QA Hamiltonian we found that the minimum energy gap was ∆min = 0.3K and that the adiabatic condition
was τ  3.92/K, which requires an evolution time that is approximately τ = 392/K. Each resonator
was then initialized to the vacuum |0i and we numerically solved the Schrödinger equation. From this the
success probability of reaching the desired state |0̄, 0̄, 1̄i or |1̄, 1̄, 0̄i was found to be 99.97%. Furthermore a
calculation
r of the fidelity showed that the system reaches the entangled state N (|0̄, 0̄, 1̄i + |1̄, 1̄, 0̄i), where

2
N = 1/ 2 1 + exp(−6|α| ) is a normalization constant, with 100% fidelity.

Discussion
The numerical results that we have obtained show that in the ideal case with no losses the success probability
was 99.98% for the subset sum problem and 99.97% for the number partitioning problem. The 0.02%−0.03%
error mostly comes from deviations from the computational subspace. A higher success probability could
have been achieved by choosing a smaller scale factor since then the conditions 6K  gij and 12K  Fi
would have been better satisfied. However, choosing a smaller scale factor decreases the minimum gap energy
which leads to an increase in the evolution time of the algorithm. For example choosing A = 0.00025K for
the number partitioning problem we get a success probability that is 100%, but the evolution time increases
approximately by a factor of 6.

11.5.6 Remarks on scalability & the model


The results that we have presented confirm that the KNR architecture is a promising way of implementing
quantum annealing. An important question that we didn’t address though, is the scalability of this model.
The two-photon pumped KNR architecture could be scaled up using pairwise couplings between resonators
only [Puri et al., 2017]. However, a combinatorial optimization requires in general long-range interactions
between spins. Indeed, the coupling terms that appear in the problems are ∝ ni nj , and does so that the
we require an all-to-all connectivity. It is usually difficult to implement experimentally such long-range
interactions. One possible solution would be to embed the Ising spins in a so-called Chimera graph, as
it is done in the D-wave system [Choi, 2008, Choi, 2011]. Another possibility is to embed the Ising spins
in a LHZ scheme which maps the Ising problem on a graph with local interactions only [Lechner et al.,
2015]. In the LHZ scheme N logical spins (the spins that define the optimization problem) are mapped to
M = N (N − 1)/2 physical spins (the spins available the lab). This was furthermore investigated by Puri et

121
al. (2016) [Puri et al., 2017] where they showed how the LHZ scheme for two-photon pumped KNRs could
be physically realized. A third connectivity solution was presented by Nigg et al. (2016) [Nigg et al., 2017]
where they showed that by leveraging the phenomenon of flux quantization for a network of Kerr-parametric
oscillators they could achieve an all-to-all connected architecture.

122
Appendix A

Quantization of the electromagnetic field


in a cavity

This Appendix is taken from Ref. [Vikstål, 2018].

A.1 Quantizing the electromagnetic field


In this section we will consider the quantization of a single-mode electromagnetic field following Ref. [Gerry
et al., 2005, Meystre and Sargent, 2007]. Note that the dimensions in this Appendix are different than
those used in the main text, resulting in a different definition of the field quadratures. To quantize the
electromagnetic field we will begin by considering a closed cavity of volume V with mirrors of perfect reflection
located at z = 0 and z = L. We imagine that we have a monochromatic, single-mode electromagnetic field
that is assumed to be polarized along the x-direction.

Figure A.1: Cavity with two perfectly reflecting mirrors located at z = 0 and z = L. The electric field is
assumed to be polarized along the x-direction

123
In the absence of sources and charges the Maxwell equations read
∂B
∇×E =− , (A.1)
∂t
∂E
∇ × B = µ0 0 , (A.2)
∂t
∇ · E = 0, (A.3)
∇ · B = 0. (A.4)

Using Maxwell’s equations in the absence of sources and charges and the given boundary conditions, the
electric field has the form  2ω 2 1/2
E(z, t) = q(t) sin(kz)ex , (A.5)
V 0
where V is the volume of the cavity, q(t) is a function of time and ex is the unit-vector along the x-direction.
From the boundary conditions the allowed frequencies are found to be
cπn
ωn = , n = 1, 2, 3, . . .
L
with kn = ωn /c as the corresponding wave number. In Eq. (A.5) we have assumed a specific frequency ω,
i.e. a specific standing wave which is called a mode of the field. We find the corresponding magnetic field
by substituting Eq. (A.5) into (A.2),

µ0 0  2ω 2 1/2
B(z, t) = q̇(t) cos(kz)ey . (A.6)
k V 0
We now identify q(t) as a canonical coordinate, and p(t) ≡ q̇(t) as the momenta canonically conjugate to
q(t). The energy stored in the field of the single-mode is

1 1 1
Z  
H= |E|2 + µ0 |B|2 dV = (p2 + ω 2 q 2 ).
2 V 0 2

We see that this is nothing but the energy of a harmonic oscillator with unit mass1 . To quantize the
electromagnetic field we promote q and p to operators

q → q̂ and p → p̂,

and impose that they obey the canonical commutation relation

[q̂, p̂] = i~.

Thus the electric and magnetic field are also promoted to operators
 2ω 2 1/2
Êx (z, t) = q̂ sin(kz), (A.7)
V 0
µ0 0  2ω 2 1/2
B̂y (z, t) = p̂ cos(kz). (A.8)
k V 0
The subscript x and y denotes the constituent components of the fields. The Hamiltonian now reads
1 2
Ĥ = (p̂ + ω 2 q̂ 2 ). (A.9)
2
1 The amplitude of the electric field in Eq. (A.5) was cleverly chosen to yield the energy of a harmonic oscillator of unit mass.

124
Next we define the very useful non-Hermitian ladder operators

â = (2~ω)−1/2 (ω q̂ + ip̂), (A.10)


↠= (2~ω)−1/2 (ω q̂ − ip̂). (A.11)

which obey the boson commutation relation [â, ↠] = 1. Inverting Eq. (A.10) and Eq. (A.11) and substituting
it into Eq. (A.9) we obtain a Hamiltonian entirely written in terms of the ladder operators as in Eq.(11.1.1).
In the main text, we haven’t discussed the time-dependence of the operators. What we’ve done so far
is assumed to hold at some time t, for example t = 0. In the Schrödinger picture the states are time-
dependent and the operators are time-independent. On the contrary, in the Heisenberg picture the operators
are time-dependent and the states are time-independent. In the Heisenberg picture the time evolution of the
annihilation operator is given by
dâ i
= [Ĥ, â]
dt ~
i  1
= [~ω ↠â + ), â] (A.12)
~ 2
= iω(↠ââ − â↠â)
= iω[â, ↠]â = −iωâ.

Solving this equation yields


â(t) = âe−iωt , (A.13)
where â(0) ≡ â. Taking the Hermitian adjoint of Eq. (A.13) we also find that

↠(t) = ↠eiωt .

After substituting Eq. (A.10) into Eq. (A.7) and Eq. (A.11) into Eq. (A.8), the electric and magnetic field
with the inclusion of the time-dependence become, respectively
 ~ω 1/2
Êx (z, t) = (âe−iωt + ↠eiωt ) sin(kz), (A.14)
V 0
µ0 0  ~ω 3 1/2 −iωt
B̂y (z, t) = (âe − ↠eiωt ) cos(kz). (A.15)
ik V 0
It is convenient to introduce the two hermitian operators
r
ω 1
X̂ = q̂ = (â + ↠),
2~ 2
(A.16)
1 1
X̂π/2 = √ p̂ = (â − ↠),
2~ω 2i
which satisfy the commutation relation
i
[X̂, X̂π/2 ] =
.
2
From here it is easy to show that the electric field operator Eq. (A.14) can be written in terms of the
dimensionless quantities X̂ and X̂π/2 as
 ~ω 1/2
Êx (z, t) = 2 (X̂ cos(ωt) + X̂π/2 sin(ωt)) sin(kz). (A.17)
0 V

This expression shows that X̂ and X̂π/2 are associated with the electric field amplitude, where the second
term is offset by π/2 compared to the cos(ωt) term.

125
Now, let’s understand why coherent states are the most classical states. To construct a state with close
resemblance to the classical electromagnetic field, one can observe that by replacing â and ↠with a complex
variable in Eq. (A.14) and Eq. (A.15) it would produce a “classical field”, i.e. a field that oscillates. This is
achieved in view of the definition of the coherent state as an eigenstate to the annihilation operator, as in
Eq.(11.4). The expectation value of the electric field given by Eq. (A.14) becomes then
 ~ω 1/2
hÊx (z, t)iα = hα| Êx (z, t) |αi = hα| (âe−iωt + ↠eiωt ) |αi sin(kz)
V 0
 ~ω 1/2
= (αe−iωt + α∗ eiωt ) sin(kz).
V 0
Writing α in polar coordinates α = |α|eiϕ we get
 ~ω 1/2
hÊx (z, t)iα = 2|α| cos(ωt − ϕ) sin(kz),
V 0
and we see that the field oscillates very much like the classical electric field. Likewise, for the quadrature
operator Eq. (A.16), we have
1 1
hX̂iα = hα| (â + ↠) |αi = (α + α∗ ) = Re α = |α| cos(ϕ)
2 2
and similarly hX̂π/2 iα = Im α = |α| sin(ϕ), so the mean of the quadratures are related to the real and
imaginary part of α.
Let us now derive Eq.(11.1.1) of the main text. Since the Fock states form a complete set, we will use
them to express α

X
|αi = cn |ni , (A.18)
n=0

where cn = hn|αi denotes a complex number which is to be determined. Inserting Eq. (A.18) into Eq. (11.4)
and using Eq. (11.1) and Eq. (11.2) we obtain
∞ ∞
X √ X
cn n |n − 1i = cn α |ni .
n=1 n=0

Since the Fock-states form an orthogonal basis, we can multiply with an arbitrary state hm| from left and
use the orthogonality condition hm|ni = δmn to obtain

cm+1 m + 1 = αcm .

By the substitution m → n − 1
α α2 αn
cn = √ cn−1 = p cn−2 = . . . = √ c0 ,
n n(n − 1) n!

we obtain a recursion formula. Hence Eq. (A.18) can be expressed as



X αn
|αi = c0 √ |ni .
n=0 n!
2
We use the normalization condition to determine the coefficient |c0 | ,
∞ X ∞ ∞
2
X α∗m αn X |α|2 2
|hα|αi| = 1 = |c0 |2 √ √ hm|ni = |c0 |2 = |c0 |2 e|α| .
m=0 n=0 m! n! n=0
n!

126
Xπ⁄2
∆X=1/2

Im 𝛼 ∆Xπ⁄2=1/2

|𝛼|

𝜙
X
Re 𝛼

Figure A.2: Phase-space diagram of a coherent state |αi. The displacement from the origin is equal to |α|
and the angle ϕ is the phase, measured from the X-axis. The quantum uncertainty is displayed as a grey
circle with a diameter of 1/2.

2
Therefore |c0 | = e−|α| /2 and our final expression for the coherent state |αi expressed in terms of Fock states
is Eq.(11.1.1) in the main text.
A neat way to illustrate a coherent state is in phase space. In Fig. A.2 the phase space diagram of a
coherent state |αi is illustrated. The coherent state can be viewed as a displacement of the vacuum state
with a distance |α| from the origin and an angle ϕ measured from the X̂-axis. The grey circle area represents
the uncertainty of the coherent state and has a constant diameter of 1/2.

127
Appendix B

Superconducting quantum circuits

B.1 Circuit Lagrangian

ϕ2 Φext ϕ1 EJ,CJ EJ CJ

Figure B.1: Left: Schematic illustration of a SQUID. Right: Circuit symbol for a Josephson junction with
Josephson energy EJ and capacitance CJ .

In this section of the appendix we aim to derive the two-photon pumped Kerr Hamiltonian starting from
the circuit Lagrangian of a SQUID and follow the derivation presented by Nigg et al. [Nigg et al., 2017].
The SQUID consists of two Josephson junctions connected in a superconducting loop with a magnetic flux
Φext (t) penetrating the loop. Fig B.1 is meant to illustrate the SQUID and the corresponding circuit symbol
of the Josephson junction. The circuit Lagrangian for a SQUID is given by
1  
L= CJ φ̇21 + φ̇22 + EJ (cos(2πφ1 /Φ0 ) + cos(2πφ2 /Φ0 )) (B.1)
2
where φi is the phase difference across the junction and Φ0 = h/(2e) is the magnetic flux quantum. For
convenience we chose units such that the flux quantum is Φ0 = h/(2e) = 2π. The parameters CJ and EJ
in Eq. (B.1) denote the capacitance and Josephson energy of each Josephson junction. Due to the fluxoid
quantization condition, φ1 and φ2 are not independent variables. If a flux Φext is penetrating the loop, then
the flux quantization requires that
φ2 − φ1 = Φext , (B.2)
where we have set the geometrical phase term to zero and neglected the induced flux term which is reasonable
for small loops. We introduce the new variable
1
ϑ= (φ1 + φ2 ).
2

128
which together with Eq. (B.2) implies
1
φ1 = ϑ − Φext , (B.3)
2
1
φ2 = ϑ + Φext . (B.4)
2
Inserting Eq. (B.3) and Eq. (B.4) into Eq. (B.1) and simplifying we get
 
1 Φext
L = C ϑ̇2 + 2EJ cos cos(ϑ)
2 2

with C = 2CJ . We have also neglected the 41 CJ Φ̇2ext term which is independent of coordinates and does
not affect the dynamics of the system. The Hamiltonian function is obtained as usual through the Legendre
transformation X
H= Qi ϑ̇i − L,
i

∂L
with Qi = = C ϑ̇i being the generalized momenta. From the Legendre transformation we obtain the
∂ ϑ̇i
Hamiltonian
Q2
 
Φext
H = EC 2 − 2EJ cos cos(ϑ). (B.5)
2e 2

where EC ≡ e2 /C is the charging energy. Next, we consider a monochromatic magnetic flux modulation
threaded through the loop that is of the form
Φext (t)
= Φdc + δΦ(t)
2
where dc is the static magnetic flux and δΦ(t) is a time-dependent perturbation that we choose of the form

δΦ(t) = δΦac cos(Ωt),

where δΦac is the ac flux. The external flux cosine term in Eq. (B.5) can now be written as
 
Φext (t)
cos = cos(Φdc ) cos(δΦac cos(Ωt)) − sin(Φdc ) sin(δΦac cos(Ωt)).
2
Using the following mathematical tricks

X
cos(z cos(ϑ)) = J0 (z) + 2 (−1)n J2n (z) cos(2nϑ)
n=1

X
sin(z cos(ϑ)) = 2 (−1)n J2n−1 (z) cos [(2n − 1)ϑ] ,
n=1

where Jn (z) is the Bessel function of the first kind, we get


  !
Φext (t) X
n
cos = cos(Φdc ) J0 (δΦac ) + 2 (−1) J2n (δΦac ) cos(2nΩt)
2 n=1

!
X
n
− sin(Φdc ) 2 (−1) J2n−1 (δΦac ) cos [(2n − 1)Ωt] .
n=1

129
If the ac-flux is sufficiently small, |δΦac |  1, then we can keep only the two leading terms of the sum, that’s
J0 and J1 and use the approximations J0 (z) ' 1 and J1 (z) ' z/2 yielding
 
Φext (t)
cos ' cos(Φdc ) + sin(Φdc )δΦac cos (Ωt) .
2
For convenience we now separate the Hamiltonian into a time independent and time dependent part as
H0 = H1 + H2 (t), where

Q2
H1 = EC − 2EJ cos(Φdc ) cos(ϑ)
2e2
H2 (t) = −2EJ sin(Φdc )δΦac cos (Ωt) cos(ϑ).

We now expand the cos(ϑ) to fourth order, keeping only the leading nonlinearity. Furthermore since we
focus on the weak ac modulation |δΦac |  1, we can neglect the nonlinear part of the time dependent term,
then we have
Q2 EJ
H1 ' EC + EJ cos(Φdc )ϑ2 − cos(Φdc )ϑ4 ,
2e2 12
H2 (t) ' EJ sin(Φdc )δΦac cos (Ωt) ϑ2 ,

where we have dropped the coordinate independent terms that does not contribute to the dynamics. As the
next step we promote ϑ and Q to operators with the commutation relation
h i
ϑ̂, Q̂ = i.

By introducing the creation and annihilation operator ↠and â that obey the bosonic commutation relation
â, ↠= 1, we have that
 

r r
1 † Z
â − ↠,
 
ϑ̂ = â + â and Q̂ = −i
2Z 2
where s
2e2 EJ cos(Φdc )
Z= .
EC
Thus the Hamiltonian in terms of the creation and annihilation operator ↠and â can be written as
EC 4
Ĥ1 = ωr ↠â − 2
â + ↠,
96e
EJ 2
Ĥ2 (t) = sin(Φdc )δΦac cos (Ωt) â + ↠,
2Z
where we have defined the resonator frequency
r
2EC EJ cos(Φdc )
ωr = ,
e2
and have furthermore dropped the ωr /2 which simply corresponds to a constant energy shift. Next we
move to the interaction picture. To switch into the interaction picture we divide the Schrödinger picture
Hamiltonian into two parts:
ĤS = ĤS,0 + ĤS,1
where
ĤS,0 = ωr ↠â.

130
and
EC 4 EJ 2
ĤS,1 = − â + ↠+ sin(Φdc )δΦac cos (Ωt) â + ↠.
96e2 2Z
The Hamiltonian in the interaction picture is defined as Ĥint = eiĤS,0 t ĤS e−iĤS,0 t , and we thus get
† 4 †
† eiωr â ât
â + ↠EC e−iωr â ât
Ĥint = ωr â â −
96e2
EJ † 2 †
+ sin(Φdc )δΦac cos (Ωt) eiωr â ât â + ↠e−iωr â ât . (B.6)
2Z
Next normal ordering the terms yields
4
: â + ↠: = â†4 + 6â†2 + 4â†3 â + 6â†2 â2 + 4↠â3 + 12↠â + â4 + 6â2 + 3,
2
: â + ↠: = â†2 + 2↠â + â2 + 1.

Under the unitary transformation Û(t) = e−iωr tâ â
each â and ↠transform according to
† †
eiωr tâ â âe−iωr tâ â
= âeiωr t ,
† †
eiωr tâ â ↠e−iωr tâ â
= ↠e−iωr t .

Now we use the rotating wave approximation (RWA) which corresponds to neglecting all rotating terms
proportional to exp(iωr t). In the second term of Eq. (B.6) the only non-rotating terms are 6â†2 â2 and
12↠â. The latter leads to a renormalization of the oscillator frequency by an amount ∆ωr = −EC /(8e2 )
which we absorb into ωr and make an implicit redefinition of the resonator frequency. For the third and last
term in Eq. (B.6) we specifically consider a two-photon pump frequency that is close to twice the resonator
frequency, i.e. Ω ' 2ωr . By writing cos (Ωt) in Euler form we find that the only non-rotating terms are â†2
and â2 . Putting it all together we have
EC †2 2 EJ
Ĥint = ωr ↠â − sin(Φdc )δΦac â†2 + â2 .

2
â â +
16e 4Z
Switching back to the Schrödinger picture we get

Ĥ(t) = ωr ↠â − Kâ†2 â2 + G â†2 e−2iωr t + â2 e2iωr t



(B.7)

which is the two-photon pumped Kerr-nonlinear resonator Hamiltonian, where we have defined the Kerr-
nonlinear amplitude as
EC
K≡ ,
16e2
and the two-photon pump amplitude as
EJ
G≡ sin(Φdc )δΦac .
4Z

B.2 Transformation to the rotating frame


In this part of the appendix we will demonstrate how to derive the time-dependent AQC Hamiltonian
for the two- and one-photon pumped KNR. Let’s consider the two- and one-photon parametrically driven
Kerr-nonlinear resonator Hamiltonian that is given by
   
Ĥ(t) = ωr ↠â − Kâ†2 â2 + G(t) e−iωp (t)t â†2 + eiωp (t)t â2 + F (t) e−iωp (t)t/2 ↠+ eiωp (t)t/2 â ,

131
where ωr is the frequency of the resonator, ωp (t) is the two-photon pump frequency and the one-photon
pump frequency is chosen half times the two-photon pump frequency. We transform to a frame rotating at

the single-photon pump frequency by performing the following unitary transformation Û(t) = eiωp (t)tâ â/2 .
Under a unitary transformation the state vector |ψ(t)i transforms according to

|ψ̃(t)i = Û(t) |ψ(t)i → |ψ(t)i = Û † (t) |ψ̃(t)i . (B.8)

Substituting the transformed state vector into the Schrödinger equation we obtain

∂ |ψ̃(t)i ∂ Û(t) ∂ |ψi (t)


i =i |ψ(t)i + Û(t)i
∂t ∂t ∂t
∂ Û(t)
=i |ψ(t)i + Û(t)Ĥ(t) |ψ(t)i ,
∂t

by using Eq. (B.8) we replace |ψ(t)i with Û † (t) |ψ̃(t)i and get
!
∂ |ψ̃(t)i ∂ Û(t) † † ˜
i = i Û (t) + Û(t)Ĥ(t)Û (t) |ψ̃(t)i = Ĥ(t) |ψ̃(t)i ,
∂t ∂t

where
˜ ∂ Û(t) †
Ĥ(t) ≡ i Û (t) + Û(t)Ĥ(t)Û † (t) (B.9)
∂t
is the transformed Hamiltonian. We proceed by evaluating the first term in this expression

∂ Û(t) † 1 1
i Û (t) = − (ωp (t) + ω̇p (t)t) ↠â Û(t)Û † (t) = − (ωp (t) + ω̇p (t)t) ↠â,
∂t 2 | {z } 2
1

where the dot ω̇p (t) denotes the time-derivative. We then calculate the second term of Eq. (B.9) by invoking
the Baker-Hausdorff lemma,
1
eA ae−A = a + [A, a] + [A, [A, a]] + . . .
2!
where A is a Hermitian operator. Applying this formula we get that each operator â and ↠transform
according to

(−iωp (t)t/2)2
 
Û † (t)âÛ(t) = â 1 − iωp (t)t/2 + + . . . = âe−iωp (t)t/2 ,
2!
(iωp (t)t/2)2
 
Û † (t)↠Û(t) = ↠1 + iωp (t)t/2 + + . . . = ↠eiωp (t)t/2 ,
2!

and thus the Hamiltonian in the rotating frame is given by (we can now drop the tilde)
 
1
Ĥ(t) = ωr − (ωp (t) + ω̇p (t)t) ↠â − Kâ†2 â2 + G(t) â†2 + â2 + F (t) ↠+ â .
 
2

By cleverly choosing the pump frequency


 
t 1
ωp (t) = 2ωr + 2δ 1 − ⇒ ω̇p (t) = −δ
2τ τ

and the drive strengths


t t
G(t) = G and F (t) = F,
τ τ

132
the Hamiltonian in the rotating frame now reads
 
t t  t
δ↠â − Kâ†2 â2 + G â†2 + â2 + F ↠+ â

Ĥ(t) = − 1 −
τ τ τ
 
t t
−δ↠â − Kâ†2 â2 + −Kâ†2 â2 + G â†2 + â2 + F ↠+ â
  
= 1− (B.10)
τ τ
 
t t
= 1− Ĥ0 + Ĥ1 ,
τ τ

where we have defined

Ĥ0 ≡ −δ↠â − Kâ†2 â2 and Ĥ1 ≡ −Kâ†2 â2 + G â†2 + â2 + F ↠+ â .
  

B.3 Steady state & stability


In this section of the appendix we will demonstrate one way to obtain the steady state fixed points1 for the
two-photon pumped KNR in presence of single-photon loss.
The Lindblad master equation, which is an operator valued equation, can be turned into a differential
equation (Fokker-Planck like equation) by first writing the density operator %̂ describing the state of the
system, in terms of the coherent state projectors |αi hα|. The coherent state representation of the density
matrix is defined by [Glauber, 1963]
Z
%̂(t) = d2 αP (α, α∗ , t) |αi hα| , (B.11)

where P (α, α∗ , t) is some weight function that is normalized such that Tr [%̂(t)] = 1. The integral of Eq. (B.11)
is formally known as the Glauber Sudarshan P-representation of the state %̂(t). To derive an equation for
P (α, α∗ , t) we substitute the P-representation into the Lindblad master equation Eq. (11.5.2) and use the
properties

â |αi hα| = α |αi hα| ,


|αi hα| ↠= α∗ |αi hα| ,
 

|αi hα| â = + α∗ |αi hα| ,
∂α
 

↠|αi hα| = + α |αi hα| ,
∂α∗

which can be derived from the definition of the coherent state. Next we perform integration by parts and
assume that the boundary conditions at infinity are zero. This introduces a minus sign in front of each
differential operator. The above properties show that we have the following correspondences

â%̂(t) ↔ αP (α, α∗ , t),


 
† ∗ ∂
â %̂(t) ↔ α − P (α, α∗ , t),
∂α
  (B.12)

%̂(t)â ↔ α − P (α, α∗ , t),
∂α∗
%̂(t)↠↔ α∗ P (α, α∗ , t).
1A fixed point is when a solution does not change in time.

133
Substituting Eq. (B.11) into the master equation Eq. (11.5.2) and using the identities given by Eq. (B.12)
we get

∂P (α, α∗ , t)  ∂ γ ∂ γ
= − (i2Kα2 α∗ − i2Gα∗ − α) − ∗
(−i2Kα∗2 α + i2Gα − α∗ ),
∂t ∂α 2 ∂α 2
∂2 ∂ 2
∗2

+ (iKα 2
− iG) + (−iKα + iG) P (α, α∗ , t). (B.13)
∂α2 ∂α∗2
This equation is on the form of a Fokker-Planck equation and can moreover be written as a stochastic
differential equation [Walls and Milburn, 2007]
dα  γ  p
= i2 Kα2 α∗ − Gα∗ + i α + i2(Kα2 − G)η(t),
dt 4 (B.14)
dα∗ 
∗2 γ ∗ p 2 ∗
= i2 −Kα α + Gα + i α + i2(Kα − G)η (t).
dt 4
It can be shown that this stochastic differential equation is indeed equivalent to the Fokker-Planck equation.
Here η(t), η ∗ (t) are stochastic forces (Langevin terms) with zero mean. The semi-classical or mean value
equation is obtained by taking the average, the Langevin terms thus disappear since hη(t)i = hη(t)∗ i =
0 [Walls and Milburn, 2007]. The semi-classical equation of motion thus are
dα  γ 
= i2 Kα2 α∗ − Gα∗ + i α ,
dt 4 (B.15)
dα∗  γ 
= i2 −Kα α + Gα + i α∗ .
∗2
dt 4
The steady-state to the semi-classical equation of motion can now readily be obtained by setting dα/dt = 0,
and solving this equation. The equation Kα2 α∗ − Gα∗ + i γ4 α = 0 has three solutions which are given by
α1s = 0 and α2,3
s
= ±reiϑ , where
1/4 !
1 16G2 − γ 2

1 γ
r= and ϑ = − arctan p .
2 K2 2 16G2 − γ 2
s
p
which reduces to α2,3 ' G/K in the case of 4G  γ.
To investigate the stability of these solutions we linearise Eq. (B.15) around the steady states

αi (t) = αis + δαi (t), (i = 1, 2, 3),

where αis is the i:th steady-state solution to Eq. (B.15) and δαi is a small perturbation. The linearized
equations written in matrix form are
2
    
d  δαi (t)  2K|αis | + iγ/4 K(αis )2 − G δαi (t)
' i2   ,
dt δα∗ (t) 2
−K(αs∗ )2 + G −2K|αs | + iγ/4 δα∗ (t)
i i i i

and the eigenvalues to the matrix equation are


r !
γ 2 s 4

s∗ 2 s 2

λ± = − ± 2 Im 3K |αi | + G K (αi ) − G + (αi ) K .
2

Stability of the fixed points require all eigenvalues to have a positive real part less than or equal to zero [Walls
and Milburn, 2007]. We begin by examining the first steady state solution which is the origin α1s = 0, it has
eigenvalues λ± = −γ/2 ± 2G, so if 4G > γ, then the origin is an unstable solution. The eigenvalues to the
s
second and third steady state solution α2,3 = ±reiϑ is most easily analyzed in the case when 4G  γ, where
both steady state solutions has the degenerate eigenvalue −γ/2, and are thus stable solutions.

134
B.4 Effect of single-photon pump
In this section we will study the effect that is obtained when a single-photon pump is added to the two-
photon pumped KNR. For the moment we will neglect the effect of single-photon loss, since it has already
been treated in appendix B.3. The Hamiltonian for the two- and one-photon KNR written in a frame rotating
at the resonator frequency is given by

Ĥ = −Kâ†2 â2 + G â†2 + â2 + F ↠+ â .


 

In the Heisenberg picture the time evolution of the annihilation operator is given by
dâ h i
= i Ĥ, â = i 2K↠â2 − 2G↠− F .

dt
To obtain the semi-classical equations of motion we take the average hâi = α, which is obtained by simply
replacing â with the complex variable α,

= i 2Kα∗ α2 − 2Gα∗ − F .

dt
The steady state to this p equation has three solutions which are of the form α1 = (−2ε, 0) p and α2,3 =
(±α + ε, 0) where α = G/K and ε = F/4G. Hence if 4G  F so that ε → 0, then α2,3 ' ± G/K. To
investigate the stability of the steady state solutions we do a small perturbation around the fixed point. The
linearized equation on motion are
 
d(δα)  2
 
 dt  2K|αis | K(αis )2 − G δα
 d(δα∗ )  ' i2 (B.16)
    
s∗ 2 s 2 δα ∗
−K(αi ) + G −2K|αi |
dt
which has the eigenvalues q 
4 2
λ± = ±2 Im 4K 2 |αis | − |G − K(αis )2 | . (B.17)
s
We begin by examining  the first steady state solution α1 = −2ε. It has eigenvalues of
pthe eigenvalues for
the form λ± ' ±2 Im KF 2 /(2G) − G2 . If the expression inside the square root is negative then the α1s

will have an eigenvalue with a positive real part. This happens if F < 2Kα3 . The eigenvalue of the two
other solutions α2s and α3s both has λ± = 0 as eigenvalue and are therefore stable.

B.5 Coupling between two Kerr-nonlinear resonators


In this section of the appendix we will obtain the fixed points for two linearly coupled KNR’s. The Hamil-
tonian for two linearly coupled two-photon pumped KNRs in the rotating frame is given by
2      
−Kâ†2 †2 † †
X
2 2
Ĥ = i âi + G âi + âi + g â1 â2 + â1 â2 .
i=1

The Heisenberg equations of motion tell us how the the operators evolve in time
dâ1 h i  
= i Ĥ, â1 = i 2Kâ†1 â21 − 2Gâ†1 − gâ2 ,
dt
dâ2 h i  
= i Ĥ, â2 = i 2Kâ†2 â22 − 2Gâ†2 − gâ1 .
dt

135
Once again the mean field or semi classical equations are obtained by replacing â1 → α and â2 → β, where
α and β are two complex variables. We thus get that the semi-classical equations of motion for two coupled
KNRs are

= i 2Kα∗ α2 − 2Gα∗ − gβ ,

dt

= i 2Kβ ∗ β 2 − 2Gβ ∗ − gα .

dt
The solutions to the steady state equations are

{α1s , β1s } = {0, 0},


r r
s s 2G + g 2G + g
{α2,3 , β2,3 } = {± ,± }
2K 2K
r r
s s 2G + g 2G + g
{α4,5 , β4,5 } = {∓ ,± }.
2K 2K
s
p
which reduces to α2,3 ' G/K when 2G  g, and the states are kept in the subspace spanned by |0̄, 0̄i,
|0̄, 1̄i, |1̄, 0̄i and |1̄, 1̄i. Following the same kind of stability analysing procedure that we did in section B.3
s
we find likewise that the origin is an unstable solution while α2,3 are stable solutions.

B.6 Error estimation


In this part of the appendix we aim at providing an order of magnitude of the error that is inherent in
the truncation of the Hilbert space dimension corresponding to n = 16 photons, as we have used in the
numerical simulations of Chapter 5. In order to do so, we evaluate the error that this truncation entails on
the normalization condition for a coherent state. From the normalization condition of a coherent we have
that
∞ 2n
2 X |α|
hα|αi = 1 = e−|α| ,
n=0
n!
which can be written as
∞ 2n
2 X |α|
1 − e−|α| = 0.
n=0
n!
If we truncate the Fock space at the number x then the numerical error ε is given by
x 2n
2 X |α|
ε = 1 − e−|α| ,
n=0
n!

2
and the error is bounded between 0 ≤ ε ≤ 1 − e−|α| . To have an error that’s bounded between zero and
2
one we have to divide by 1 − e−|α| , such that
x
!
2n
−|α|2
X |α|  2

ε= 1−e / 1 − e−2|α| .
n=0
n!

Now if x = 16 and |α| = 3 we find that the numerical error is approximately ε ≈ 10−8 , which is a
considerably small numerical error.

136
B.7 Generation of cat states using a two-photon pumped KNR
The two-photon pumped KNR can be used as a mean to generate Schrödinger cat states. Cat states are a
superposition of two-coherent states with opposite phase, defined by

|αi ± |−αi
|Cα± i = q ,
2
2(1 ± e−2|α| )

where the term in the denominator is a normalization factor. |Cα+ i is called an even-cat state because
when written in the Fock basis it only contain Fock states with even numbers and |Cα− i is called an odd-
cat state because when written in the Fock-basis it only contains Fock states with odd numbers. Cat
states are of interest because they have the potential to be used as logical states for universal quantum
computation [Puri et al., 2017], and hold potential for error-correction against single-photon losses (see
experimental work pfrom Yale’s group). To generate a cat state it can first be noted that since the two-coherent
states |±αi = |± G/Ki are two-degenerate eigenstates of the two-photon pumped KNR Eq. (11.79) p so is
any superposition of theses states. It can be easily checked that the even and odd cat state with α = G/K
are also two degenerate eigenstates of the two-photon pumped KNR Eq. (11.79) with eigenenergy G2 /K.
Second, to generate a cat state the detuning and single-photon pump is set to zero in the time-dependent
Hamiltonian Eq. (11.82). The vacuum and single-photon Fock state are then degenerate eigenstates of the
initial Hamiltonian. Under adiabatic evolution the zero-photon Fock state will evolve into the even cat-state
since parity is conserved, and the single-photon Fock state will evolve into the odd-cat state for the same
reason.
Numerical simulations show that, for G = 3K and τ = 158 as given by the adiabatic condition, if the
resonator is initialized to |0i (|1i) it will evolve into the even cat state (odd cat state) with 100% fidelity,
within the numerical error. Figure B.2 (a) and (b) show the Wigner function of the even and odd cat. The
Wigner negativity that appears between the two Gaussian peaks are interference fringes and are sometimes
referred to as the cat’s whiskers. These interference fringes are very characteristic for a cat state. When the
amplitude of |α| is
psmall the states are often referred to as “kitten” states. For example an even cat (“kitten”)
state with |α| = 1/2 is shown in Fig. B.2 (c).

137
(a) Even cat state (b) Odd cat state

(c) Kitten state

Figure B.2: Cat states.

138
Bibliography

[Aaronson, a] Aaronson, S. The complexity zoo.

[Aaronson, b] Aaronson, S. Postbqp postscripts: A confession of mathematical errors.

[Aaronson, 2005] Aaronson, S. (2005). Quantum computing, postselection, and probabilistic polynomial-
time. Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences,
461:3473.

[Aaronson, 2015] Aaronson, S. (2015). Read the fine print. Nat. Phys., 11:291.

[Aaronson, 2018] Aaronson, S. (2018). Lecture Notes for Intro to Quantum Information Science.

[Aaronson and Arkhipov, 2013] Aaronson, S. and Arkhipov, A. (2013). The computational complexity of
linear optics. Theory of Computing, 9:143.

[Albash and Lidar, 2018] Albash, T. and Lidar, D. A. (2018). Adiabatic quantum computation. Reviews of
Modern Physics, 90(1):015002.

[Allcock and Zhang, 2019] Allcock, J. and Zhang, S. (2019). Quantum machine learning. Natl. Sci. Rev.,
6:26.

[Amin et al., 2018] Amin, M. H., Andriyash, E., Rolfe, J., Kulchytskyy, B., and Melko, R. (2018). Quantum
Boltzmann Machine. Phys. Rev. X, 8:021050.

[Arute et al., 2019] Arute, F., Arya, K., Babbush, R., Bacon, D., Bardin, J. C., Barends, R., Biswas, R.,
Boixo, S., Brandao, F. G. S. L., Buell, D. A., Burkett, B., Chen, Y., Chen, Z., Chiaro, B., Collins, R.,
Courtney, W., Dunsworth, A., Farhi, E., Foxen, B., Fowler, A., Gidney, C., Giustina, M., Graff, R.,
Guerin, K., Habegger, S., Harrigan, M. P., Hartmann, M. J., Ho, A., Hoffmann, M., Huang, T., Humble,
T. S., Isakov, S. V., Jeffrey, E., Jiang, Z., Kafri, D., Kechedzhi, K., Kelly, J., Klimov, P. V., Knysh, S.,
Korotkov, A., Kostritsa, F., Landhuis, D., Lindmark, M., Lucero, E., Lyakh, D., Mandrà, S., McClean,
J. R., McEwen, M., Megrant, A., Mi, X., Michielsen, K., Mohseni, M., Mutus, J., Naaman, O., Neeley,
M., Neill, C., Niu, M. Y., Ostby, E., Petukhov, A., Platt, J. C., Quintana, C., Rieffel, E. G., Roushan, P.,
Rubin, N. C., Sank, D., Satzinger, K. J., Smelyanskiy, V., Sung, K. J., Trevithick, M. D., Vainsencher,
A., Villalonga, B., White, T., Yao, Z. J., Yeh, P., Zalcman, A., Neven, H., and Martinis, J. M. (2019).
Quantum supremacy using a programmable superconducting processor. Nature, 574(7779):505–510.

[Barak et al., 2015] Barak, B., Moitra, A., O’Donnell, R., Raghavendra, P., Regev, O., Steurer, D., Trevisan,
L., Vijayaraghavan, A., Witmer, D., and Wright, J. (2015). Beating the random assignment on constraint
satisfaction problems of bounded degree.

[Bartlett et al., 2002] Bartlett, S. D., Sanders, B. C., Braunstein, S. L., and Nemoto, K. (2002). Efficient
classical simulation of continuous variable quantum information processes. Phys. Rev. Lett., 88:9.

139
[Bartolo et al., 2016] Bartolo, N., Minganti, F., Casteels, W., and Ciuti, C. (2016). Exact steady state of
a kerr resonator with one-and two-photon driving and dissipation: Controllable wigner-function multi-
modality and dissipative phase transitions. Physical Review A, 94(3):033841.

[Bouland et al., 2019] Bouland, A., Fefferman, B., Nirkhe, C., and Vazirani, U. (2019). On the complexity
and verification of quantum random circuit sampling. Nature Physics, 15(2):159–163.

[Brandao et al., 2018] Brandao, F. G. S. L., Broughton, M., Farhi, E., Gutmann, S., and Neven, H. (2018).
For Fixed Control Parameters the Quantum Approximate Optimization Algorithm’s Objective Function
Value Concentrates for Typical Instances.

[Bremner et al., 2010] Bremner, M. J., Josza, R., and Shepherd, D. (2010). Classical simulation of commut-
ing quantum computations implies collapse of the polynomial hierarchy. Proc. R. Soc. A, 459:459.

[Bremner et al., 2016] Bremner, M. J., Montanaro, A., and Shepherd, D. J. (2016). Average-case com-
plexity versus approximate simulation of commuting quantum computations. Physical review letters,
117(8):080501.

[Briegel et al., 2009] Briegel, H. J., Browne, D. E., Dür, W., Raussendorf, R., and Van den Nest, M. (2009).
Measurement-based quantum computation. Nat. Phys., 5:19.

[Campagne-Ibarcq et al., 2020] Campagne-Ibarcq, P., Eickbusch, A., Touzard, S., Zalys-Geller, E., Frattini,
N., Sivak, V., Reinhold, P., Puri, S., Shankar, S., Schoelkopf, R., et al. (2020). Quantum error correction
of a qubit encoded in grid states of an oscillator. Nature, 584(7821):368–372.

[Chabaud et al., 2017] Chabaud, U., Douce, T., Markham, D., van Loock, P., Kashefi, E., and Ferrini, G.
(2017). Continuous-variable sampling from photon-added or photon-subtracted squeezed states. Physical
Review A, 96:062307.

[Chakhmakhchyan and Cerf, 2017] Chakhmakhchyan, L. and Cerf, N. J. (2017). Boson sampling with gaus-
sian measurements. Physical Review A, 96(3):032326.

[Choi, 2008] Choi, V. (2008). Minor-embedding in adiabatic quantum computation: I. the parameter setting
problem. Quantum Information Processing, 7(5):193–209.

[Choi, 2011] Choi, V. (2011). Minor-embedding in adiabatic quantum computation: Ii. minor-universal
graph design. Quantum Information Processing, 10(3):343–353.

[Cong et al., 2019] Cong, I., Choi, S., and Lukin, M. D. (2019). Quantum convolutional neural networks.
Nat. Phys., 15:1273.

[Cybenko, 1989] Cybenko, G. (1989). Approximation by superpositions of a sigmoidal function. Math.


Control. Signals, Syst., 2:303.

[Dieks, 1982] Dieks, D. (1982). Communication by EPR devices. Physics Letters A, 92:271.

[Douce et al., 2017] Douce, T., Markham, D., Kashefi, E., Diamanti, E., Coudreau, T., Milman, P., van
Loock, P., and Ferrini, G. (2017). Continuous-variable instantaneous quantum computing is hard to
sample. Phys. Rev. Lett., 118:070503.

[Douce et al., 2019] Douce, T., Markham, D., Kashefi, E., van Loock, P., and Ferrini, G. (2019). Probabilistic
fault-tolerant universal quantum computation and sampling problems in continuous variables. Physical
Review A, 99:012344.

140
[Farhi et al., 2014] Farhi, E., Goldstone, J., and Gutmann, S. (2014). A quantum approximate optimization
algorithm. arXiv preprint arXiv:1411.4028.

[Farhi et al., 2001] Farhi, E., Goldstone, J., Gutmann, S., Lapan, J., Lundgren, A., and Preda, D. (2001).
A Quantum Adiabatic Evolution Algorithm Applied to Random Instances of an NP-Complete Problem.
Science, 292:472.

[Farhi et al., 2017] Farhi, E., Goldstone, J., Gutmann, S., and Neven, H. (2017). Quantum algorithms for
fixed qubit architectures. arXiv preprint arXiv:1703.06199.

[Farhi and Harrow, 2019] Farhi, E. and Harrow, A. W. (2019). Quantum supremacy through the quantum
approximate optimization algorithm.

[Farhi and Neven, 2018] Farhi, E. and Neven, H. (2018). Classification with Quantum Neural Networks on
Near Term Processors.

[Flühmann et al., 2018] Flühmann, C., Negnevitsky, V., Marinelli, M., and Home, J. P. (2018). Sequential
modular position and momentum measurements of a trapped ion mechanical oscillator. Phys. Rev. X,
8:021001.

[Fowler et al., 2012] Fowler, A. G., Mariantoni, M., Martinis, J. M., and Cleland, A. N. (2012). Surface
codes: Towards practical large-scale quantum computation. Physical Review A, 86:032324.

[García-Álvarez et al., 2020] García-Álvarez, L., Calcluth, C., Ferraro, A., and Ferrini, G. (2020). Efficient
simulatability of continuous-variable circuits with large wigner negativity. Phys. Rev. Research, 2:043322.

[Gerry et al., 2005] Gerry, C., Knight, P., and Knight, P. L. (2005). Introductory quantum optics. Cambridge
university press.

[Glancy and Knill, 2006] Glancy, S. and Knill, E. (2006). Error analysis for encoding a qubit in an oscillator.
Phys. Rev. A, 73:012325.

[Glauber, 1963] Glauber, R. J. (1963). Coherent and incoherent states of the radiation field. Physical Review,
131(6):2766.

[Gottesman et al., 2001] Gottesman, D., Kitaev, A., and Preskill, J. (2001). Encoding a qubit in an oscillator.
Phys. Rev. A, 64:012310.

[Gu et al., 2009] Gu, M., Weedbrook, C., Menicucci, N. C., Ralph, T. C., and van Loock, P. (2009). Quantum
computing with continuous-variable clusters. Phys. Rev. A, 79:062318.

[Hadfield et al., 2019] Hadfield, S., Wang, Z., O’Gorman, B., Rieffel, E., Venturelli, D., and Biswas, R.
(2019). From the Quantum Approximate Optimization Algorithm to a Quantum Alternating Operator
Ansatz. Algorithms, 12(2):34.

[Halperin et al., 2004] Halperin, E., Livnat, D., and Zwick, U. (2004). Max cut in cubic graphs. Journal of
Algorithms, 53(2):169–185.

[Hamilton et al., 2017] Hamilton, C. S., Kruse, R., Sansoni, L., Barkhofen, S., Silberhorn, C., and Jex, I.
(2017). Gaussian boson sampling. Physical review letters, 119(17):170501.

[Harrow et al., 2009] Harrow, A. W., Hassidim, A., and Lloyd, S. (2009). Quantum Algorithm for Linear
Systems of Equations. Phys. Rev. Lett., 103:150502.

141
[Harrow and Montanaro, 2017] Harrow, A. W. and Montanaro, A. (2017). Quantum computational
supremacy. Nature, 549(7671):203.

[Hauke et al., 2020] Hauke, P., Katzgraber, H. G., Lechner, W., Nishimori, H., and Oliver, W. D. (2020).
Perspectives of quantum annealing: Methods and implementations. Reports on Progress in Physics,
83(5):054401.

[Hillmann et al., 2020] Hillmann, T., Quijandría, F., Johansson, G., Ferraro, A., Gasparinetti, S., and Fer-
rini, G. (2020). Universal gate set for continuous-variable quantum computation with microwave circuits.
Phys. Rev. Lett., 125:160501.

[Horodecki et al., 2006] Horodecki, P., Bruß, D., and Leuchs, G. (2006). Lectures on quantum information.

[Huh et al., 2015] Huh, J., Guerreschi, G. G., Peropadre, B., McClean, J. R., and Aspuru-Guzik, A. (2015).
Boson sampling for molecular vibronic spectra. Nature Photonics, 9(9):615–620.

[Jiang et al., 2017] Jiang, Z., Rieffel, E. G., and Wang, Z. (2017). Near-optimal quantum circuit for Grover’s
unstructured search using a transverse field. Physical Review A, 95(6):062317.

[Kandala et al., 2017] Kandala, A., Mezzacapo, A., Temme, K., Takita, M., Brink, M., Chow, J. M., and
Gambetta, J. M. (2017). Hardware-efficient variational quantum eigensolver for small molecules and
quantum magnets. Nature, 549:242.

[Kirchmair et al., 2013] Kirchmair, G., Vlastakis, B., Leghtas, Z., Nigg, S. E., Paik, H., Ginossar, E., Mir-
rahimi, M., Frunzio, L., Girvin, S. M., and Schoelkopf, R. J. (2013). Observation of quantum state collapse
and revival due to the single-photon kerr effect. Nature, 495(7440):205.

[Kockum, 2014] Kockum, A. F. (2014). Quantum optics with artificial atoms. PhD thesis, Chalmers Uni-
versity of Technology.

[Kockum and Nori, 2019] Kockum, A. F. and Nori, F. (2019). Quantum Bits with Josephson Junctions. In
Tafuri, F., editor, Fundamentals and Frontiers of the Josephson Effect, pages 703–741. Springer.

[Kuperberg, 2015] Kuperberg, G. (2015). How hard is it to approximate the jones polynomial? Theory of
Computing, 11:183.

[Lechner et al., 2015] Lechner, W., Hauke, P., and Zoller, P. (2015). A quantum annealing architecture with
all-to-all connectivity from local interactions. Science advances, 1(9):e1500838.

[Leonhardt, 1997] Leonhardt, U. (1997). Measuring the Quantum State of Light. Cambridge University
Press, New York, NY, USA, 1st edition.

[Leonhardt and Paul, 1993] Leonhardt, U. and Paul, H. (1993). Realistic optical homodyne measurements
and quasiprobability distributions. Phys. Rev. A, 48:4598.

[Li et al., 2020] Li, Y., Chen, M., Chen, Y., Lu, H., Gan, L., Lu, C., Pan, J., Fu, H., and Yang, G.
(2020). Benchmarking 50-photon gaussian boson sampling on the sunway taihulight. arXiv preprint
arXiv:2009.01177.

[Lin et al., 2017] Lin, H. W., Tegmark, M., and Rolnick, D. (2017). Why Does Deep and Cheap Learning
Work So Well? J. Stat. Phys., 168:1223.

[Lloyd, 2018] Lloyd, S. (2018). Quantum approximate optimization is computationally universal.

142
[Lloyd and Braunstein, 1999] Lloyd, S. and Braunstein, S. L. (1999). Quantum computation over continuous
variables. Phys. Rev. Lett., 82:1784.

[Lloyd et al., 2014] Lloyd, S., Mohseni, M., and Rebentrost, P. (2014). Quantum principal component anal-
ysis. Nat. Phys., 10:631.

[Lucas, 2014] Lucas, A. (2014). Ising formulations of many np problems. Frontiers in Physics, 2:5.

[Lund et al., 2017] Lund, A., Bremner, M. J., and Ralph, T. (2017). Quantum sampling problems, boson-
sampling and quantum supremacy. npj Quantum Information, 3(1):15.

[Lund et al., 2014] Lund, A. P., Rahimi-Keshari, S., Rudolph, T., OÕBrien, J. L., and Ralph, T. C. (2014).
Boson sampling from a gaussian state. Phys. Rev. Lett., 113:100502.

[Mari and Eisert, 2012] Mari, A. and Eisert, J. (2012). Positive wigner functions render classical simulation
of quantum computation efficient. Phys. Rev. Lett., 109:230503.

[Meaney et al., 2014] Meaney, C. H., Nha, H., Duty, T., and Milburn, G. J. (2014). Quantum and classical
nonlinear dynamics in a microwave cavity. EPJ Quantum Technology, 1(1):7.

[Menicucci, 2014] Menicucci, N. C. (2014). Fault-tolerant measurement-based quantum computing with


continuous-variable cluster states. Phys. Rev. Lett., 112:120504.

[Menicucci et al., 2011] Menicucci, N. C., Flammia, S. T., and van Loock, P. (2011). Graphical calculus for
gaussian pure states. Physical Review A, 83(4):042335.

[Meystre and Sargent, 2007] Meystre, P. and Sargent, M. (2007). Elements of quantum optics. Springer
Science & Business Media.

[Moll et al., 2018] Moll, N., Barkoutsos, P., Bishop, L. S., Chow, J. M., Cross, A., Egger, D. J., Filipp, S.,
Fuhrer, A., Gambetta, J. M., Ganzhorn, M., Kandala, A., Mezzacapo, A., Müller, P., Riess, W., Salis, G.,
Smolin, J., Tavernelli, I., and Temme, K. (2018). Quantum optimization using variational algorithms on
near-term quantum devices. Quantum Sci. Technol., 3:030503.

[Morales et al., 2019] Morales, M. E. S., Biamonte, J., and Zimboras, Z. (2019). On the universality of the
quantum approximate optimization algorithm.

[Moylett et al., 2019] Moylett, A. E., García-Patrón, R., Renema, J. J., and Turner, P. S. (2019). Clas-
sically simulating near-term partially-distinguishable and lossy boson sampling. Quantum Science and
Technology, 5(1):015001.

[Neville et al., 2017] Neville, A., Sparrow, C., Clifford, R., Johnston, E., Birchall, P. M., Montanaro, A.,
and Laing, A. (2017). Classical boson sampling algorithms with superior performance to near-term ex-
periments. Nature Physics, 13(12):1153–1157.

[Nielsen, 2015] Nielsen, M. (2015). Neural Networks and Deep Learning. Determination Press.

[Nielsen and Chuang, 2000] Nielsen, M. A. and Chuang, I. L. (2000). Quantum Computation and Quantum
Information. Cambridge University Press.

[Nielsen and Chuang, 2011] Nielsen, M. A. and Chuang, I. L. (2011). Quantum Computation and Quantum
Information: 10th Anniversary Edition. Cambridge University Press, New York, NY, USA, 10th edition.

143
[Nigg et al., 2017] Nigg, S. E., Lörch, N., and Tiwari, R. P. (2017). Robust quantum optimizer with full
connectivity. Science advances, 3(4):e1602273.

[Niu et al., 2019] Niu, M. Y., Lu, S., and Chuang, I. L. (2019). Optimizing QAOA: Success Probability and
Runtime Dependence on Circuit Depth.

[Ofek et al., 2016] Ofek, N., Petrenko, A., Heeres, R., Reinhold, P., Leghtas, Z., Vlastakis, B., Liu, Y.,
Frunzio, L., Girvin, S., Jiang, L., et al. (2016). Extending the lifetime of a quantum bit with error
correction in superconducting circuits. Nature, 536(7617):441–445.

[Paris et al., 2003] Paris, M. G. A., Cola, M., and Bonifacio, R. (2003). Quantum-state engineering assisted
by entanglement. Phys. Rev. A, 67:042104.

[Pednault et al., 2019] Pednault, E., Gunnels, J. A., Nannicini, G., Horesh, L., and Wisnieff, R. (2019).
Leveraging secondary storage to simulate deep 54-qubit sycamore circuits.

[Puri et al., 2017] Puri, S., Andersen, C. K., Grimsmo, A. L., and Blais, A. (2017). Quantum annealing
with all-to-all connected nonlinear oscillators. Nature communications, 8:15785.

[Qi et al., 2020] Qi, H., Brod, D. J., Quesada, N., and García-Patrón, R. (2020). Regimes of classical
simulability for noisy gaussian boson sampling. Phys. Rev. Lett., 124:100502.

[Rahimi-Keshari et al., 2016] Rahimi-Keshari, S., Ralph, T. C., and Caves, C. M. (2016). Sufficient condi-
tions for efficient classical simulation of quantum opticss. Phys. Rev. X, 6:021039.

[Raussendorf and Briegel, 2001] Raussendorf, R. and Briegel, H. J. (2001). A One-Way Quantum Computer.
Phys. Rev. Lett., 86:5188.

[Raussendorf et al., 2003] Raussendorf, R., Browne, D. E., and Briegel, H. J. (2003). Measurement-based
quantum computation on cluster states. Phys. Rev. A, 68:022312.

[Rebentrost et al., 2014] Rebentrost, P., Mohseni, M., and Lloyd, S. (2014). Quantum Support Vector
Machine for Big Data Classification. Phys. Rev. Lett., 113:130503.

[Rodríguez-Laguna and Santalla, 2018] Rodríguez-Laguna, J. and Santalla, S. N. (2018). Building an adia-
batic quantum computer simulation in the classroom. American Journal of Physics, 86(5):360–367.

[Scheel, 2004] Scheel, S. (2004). Permanents in linear optical networks. arXiv preprint quant-ph/0406127.

[Shor, 1995] Shor, P. W. (1995). Scheme for reducing decoherence in quantum computer memory. Physical
Review A, 52:R2493.

[Spring et al., 2013] Spring, J. B., Metcalf, B. J., Humphreys, P. C., Kolthammer, W. S., Jin, X.-M., Bar-
bieri, M., Datta, A., Thomas-Peter, N., Langford, N. K., Kundys, D., Gates, J. C., Smith, B. J., Smith,
P. G. R., and Walmsley, I. A. (2013). Boson sampling on a photonic chip. Science, 339:798.

[Stollenwerk et al., 2019] Stollenwerk, T., Lobe, E., and Jung, M. (2019). Flight gate assignment with a
quantum annealer. In International Workshop on Quantum Technology and Optimization Problems, pages
99–110. Springer.

[Tang, 2018] Tang, E. (2018). Quantum-inspired classical algorithms for principal component analysis and
supervised clustering.

144
[Tanikic and Despotovic, 2012] Tanikic, D. and Despotovic, V. (2012). Artificial Intelligence Techniques for
Modelling of Temperature in the Metal Cutting Process. In Metall. - Adv. Mater. Process. InTech.

[Terhal and DiVincenzo, 2002] Terhal, B. M. and DiVincenzo, D. P. (2002). Adaptive quantum computation,
constant depth quantum circuits and arthur-merlin games. arXiv preprint quant-ph/0205133.

[Tinkham, 2004] Tinkham, M. (2004). Introduction to superconductivity. Courier Corporation.

[Ukai et al., 2010] Ukai, R., Yoshikawa, J.-i., Iwata, N., van Loock, P., and Furusawa, A. (2010). Uni-
versal linear bogoliubov transformations through one-way quantum computation. Physical review A,
81(3):032315.

[Vikstål, 2018] Vikstål, P. (2018). Continuous-variable quantum annealing with superconducting circuits.
Master Thesis, Chalmers.

[Vikstål et al., 2020] Vikstål, P., Grönkvist, M., Svensson, M., Andersson, M., Johansson, G., and Ferrini,
G. (2020). Applying the quantum approximate optimization algorithm to the tail-assignment problem.
Physical Review Applied, 14(3):034009.

[Walls and Milburn, 2007] Walls, D. F. and Milburn, G. J. (2007). Quantum optics. Springer Science &
Business Media.

[Walther et al., 2005] Walther, P., Resch, K. J., Rudolph, T., Schenck, E., Weinfurter, H., Vedral, V.,
Aspelmeyer, M., and Zeilinger, A. (2005). Experimental one-way quantum computing. Nature, 434:169.

[Wang et al., 2019] Wang, H., Qin, J., Ding, X., Chen, M.-C., Chen, S., You, X., He, Y.-M., Jiang, X., You,
L., Wang, Z., et al. (2019). Boson sampling with 20 input photons and a 60-mode interferometer in a 10
power 14-dimensional hilbert space. Physical review letters, 123(25):250503.

[Watrous, 2009] Watrous, J. (2009). Encyclopedia of Complexity and Systems Science, chapter Quantum
Computational Complexity, pages 7174–7201. Springer New York, New York, NY.

[Wendin, 2017] Wendin, G. (2017). Quantum information processing with superconducting circuits: a re-
view. Reports Prog. Phys., 80:106001.

[Willsch et al., 2020] Willsch, M., Willsch, D., Jin, F., De Raedt, H., and Michielsen, K. (2020). Bench-
marking the quantum approximate optimization algorithm. Quantum Information Processing, 19:197.

[Wootters and Zurek, 1982] Wootters, W. K. and Zurek, W. H. (1982). A single quantum cannot be cloned.
Nature, 299:802.

[Wurtz and Love, 2020] Wurtz, J. and Love, P. J. (2020). Bounds on maxcut qaoa performance for p> 1.
arXiv preprint arXiv:2010.11209.

[Zhong et al., 2020] Zhong, H.-S., Wang, H., Deng, Y.-H., Chen, M.-C., Peng, L.-C., Luo, Y.-H., Qin, J.,
Wu, D., Ding, X., Hu, Y., Hu, P., Yang, X.-Y., Zhang, W.-J., Li, H., Li, Y., Jiang, X., Gan, L., Yang,
G., You, L., Wang, Z., Li, L., Liu, N.-L., Lu, C.-Y., and Pan, J.-W. (2020). Quantum computational
advantage using photons. Science.

145

You might also like