A Partial Derandomization of Phaselift Using Spherical Designs
A Partial Derandomization of Phaselift Using Spherical Designs
A Partial Derandomization of Phaselift Using Spherical Designs
DOI 10.1007/s00041-014-9361-2
D. Gross
Freiburg Center for Data Analysis Modeling, Eckerstr. 1, 79104 Freiburg, Germany
F. Krahmer
Institute for Numerical and Applied Mathematics, University of Göttingen, Lotzestraße 16-18, 37083
Göttingen, Germany
J Fourier Anal Appl
1 Introduction
In this work we are interested in the problem of recovering a complex signal (vector)
x ∈ Cd from an intensity measurement y0 = x22 and amplitude measurements
yi = |ai , x|2 i = 1, . . . , m,
This “lifts” the problem to matrix space of dimension d 2 , where it becomes linear and
can be explicitly solved to find the unique solution.
As we will show in Theorem 2, it is, without making additional assumptions on
the 2-design, not possible to use as measurements a random subset of this 2-design
J Fourier Anal Appl
which is of size o(d 2 ). In other words, for the measurement scenario described in [9],
the quadratic scaling in d is basically unavoidable.
To contrast these two extreme approaches, Ref. [8] works with a number of mea-
surements close to the absolute minimum, but there are no tractable reconstruction
schemes provided, the question of numerical stability is not considered, and it is
unclear whether non-generic measurements—i.e., vectors with additional structural
properties—can be employed. On the other hand, the number of measurements in [9]
is much larger, while the measurements are highly structured and there is an explicit
reconstruction method. A number of recent works including this paper aim to balance
between these two approaches, working with a number of measurements only slightly
larger while having at least some of the desired properties mentioned above.
Reference [3] introduces a reconstruction method called polarization that works
for O(d log d) measurements and can handle structured measurement vectors, includ-
ing the masked illumination setup that appears in diffraction imaging [10], where
the measurements are generated by the discrete Fourier transform preceded by a ran-
dom diagonal matrix. For Gaussian measurements, the polarization approach has also
shown to be stable with respect to measurement noise [3]. While simulations seem
to suggest stability also for the derandomized masked illumination setup, a proof of
stability is—to our knowledge—not available yet.
An alternative approach, which we will also follow in this paper, is the PhaseLift
algorithm, which is based on the lifted formulation (1). The algorithm was introduced
in [22] and reconstruction guarantees have been provided in [18,23]. The central obser-
vation is that the matrix x x ∗ , while unknown, is certainly of rank one. This connects the
phase retrievel problem with the young but already extensive field of low-rank matrix
recovery [21,35,54,63]. Over the past years, this research program has rigorously iden-
tified many instances in which low-rank matrices can be efficiently reconstructed from
few linear measurements. The existing results on low-rank matrix recovery were not
directly applicable to phase retrieval, because the measurement matrices ai ai∗ failed to
be sufficiently incoherent in the sense of [21,35] (the incoherence parameter captures
the well-posedness of a low-rank recovery problem). For the case of Gaussian mea-
surement vectors ai , Candès, Strohmer, Voroninski and Li were able to circumvent this
problem, providing problem-specific stable recovery guarantees [18,23] for a number
of measurements of optimal order O(d). For recovery, they use a convex relaxation
of the rank minimization problem, which makes the reconstruction algorithmically
tractable.
It should be noted, however, that because of the significantly increased problem
dimensions, PhaseLift is not as efficient as many phase retrieval algorithms developed
over the last decades in the physics literature (such as [30]) and the optimization
literature (for example [13]). Recently there have been attempts to provide recovery
guarantees for alternating minimization algorithms [61], which are somewhat closer to
the algorithms used in practice, but this direction of research is only at its beginnings.
While the above mentioned recovery guarantees for PhaseLift address the issues
of tractable reconstruction and stability with respect to noise, these results leave open
the question of whether measurement systems with additional structure and less ran-
domness still allow for guaranteed recovery. There are both practical and theoretical
motivations for pursuing such generalizations: A practitioner may be constrained in the
J Fourier Anal Appl
1 The definition of a t-design varies between authors. In particular, what is called a t-design here (and
in most of the physics literature), would sometimes be referred to as a 2t or even a (2t + 1)-design. See
Sect. 3.3 for our precise definition.
J Fourier Anal Appl
As the number of measurements needed for phase retrieval is larger than the sig-
nal space dimension, one cannot expect these results to exactly carry over to the
phase retrieval setting. Nevertheless, the question remains whether there is a larger,
but preferably not too large, set such that measurements drawn from it uniformly at
random allow for phase retrieval reconstruction guarantees. In some sense, the sam-
pling scenario we seek can be interpreted as an interpolation between the maximally
random setup of Gaussian measurements with an optimal order of measurements and
the construction in [9], which is completely deterministic, but suboptimal in terms of
the embedding dimension. While in this paper, we will focus on the phase retrieval
problem, we remark that such an interpolating approach between measurements drawn
from a basis and maximally random measurements may also be of interest in other sit-
uations where reconstructions from bases are known, but lead to somewhat suboptimal
embedding dimensions.
The concept of t-designs, as defined in Sect. 3.3, provides such an interpolation.
The intuition behind that definition is that with growing t, more and more moments of
the random vector corresponding to a random selection from the t-design agree with
the Haar measure on the unit sphere. In that sense, as t scales up further, t-designs
give better and better approximations to Haar-random vectors.
The utility of this concept as a general-purpose de-randomization tool for Hilbert-
space valued random construtions has been appreciated for example in quantum infor-
mation theory [4,55]. It has been compared [4] to the notion of t-wise independence,
which plays a role for example in the analysis of discrete randomized algorithms [56],
seems to have been long appreciated in coding theory. The smallest t-design in Cd
consists of O(d 2t ) elements. Thus, whenever that lower bound is met, drawing a sin-
gle element from a design requires 2t log d bits, as opposed to 2d bits for a complex
Bernoulli vector—an exponential gap.
From a practical point of view, the usefulness of these concepts hinges on the
availability of constructions for designs. Explicit constructions for any order t and any
dimension d are known [7,40,47,69]—however, they are typically “inefficient” in the
sense that they require a vector set of exponential size. For example, the construction
in [40] uses O(t)d vectors which is exponential in the dimension d.
Tighter analytic expressions for exact designs are notoriously difficult to find.
Designs of degree 2 are widely known [45,46,67,75]. A concrete example is used
for the converse bound in Sect. 7 (as well as for the converse bounds for low-rank
matrix recovery from Fourier-type bases in [35]). For degree 3, both real [70] and
complex [49] designs are known.2 For higher t, there are numerical methods based on
the notion of the frame potential [45,49,64] , non-constructive existence proofs [69],
and constructions in sporadic dimensions (c.f. [5] and references therein).
Importantly, almost-tight randomized constructions for approximate designs for
arbitrary degrees and dimensions are known [4,15,40]. The simplest results [40] show
that collections of Haar-random vectors form approximate t-designs. This indeed can
reduce randomness: One only needs to expend a considerable amount of randomness
once to generate a design—for subsequent applications it is sufficient to sample small
2 While stated only for dimensions that are a power of 2, the results can be used for construtions in arbitrary
dimensions [49].
J Fourier Anal Appl
subsets from it.3 Going further, there have been recent deep results on designs obtained
from certain structured ensembles [15]. We do not describe the details here, as they
are geared toward quantum problems and may have to be substantially modified to
be applicable to phase retrieval. The only connection to phase retrieval to date is the
estimation of pure quantum states [41,59].
Finally we point out that the notion of the frame potential above is no coincidence.
In [6] a frame-theoretic approach to designs is provided, underlining their close con-
nection.
In this paper, we show that spherical designs can indeed be used to partially derandom-
ize recovery guarantees for underdetermined estimation problems; we generalize the
recovery guarantee in [23] to measurements drawn uniformly at random from complex
projective designs, at the cost of a slightly higher number of measurements.
Theorem 1 (Main Theorem) Let x ∈ Cd be the unknown signal. Suppose that x22 is
known and that m measurement vectors a1 , . . . , am have been sampled independently
and uniformly at random from a t-design Dt ⊂ Cd (t ≥ 3). Then, with probability at
least 1 − e−ω , PhaseLift (the convex optimization problem (24) below) recovers x up
to a global phase, provided that the sampling rate exceeds
As the discussion of the previous subsection suggests, the bounds on the sampling rate
decrease as the order of the design increases. For fixed t, and up to poly-log factors,
it is proportional to O(d 1+2/t ). This is sub-quadratic for the regime t ≥ 3 where our
arguments apply. If the degree is allowed to grow logarithmialy with the dimension
(as t = 2 log d), we recover an optimal, linear scaling up to a polylog overhead,
m = O(d log3 d).
In light of the highly structured, analytical and exact designs known for degree 2
and 3, it is of great interest to ask whether a linear scaling can already be achieved for
some small, fixed t. As shown by the following theorem, however, for t = 2 not even
a subquadratic scaling is possible if no additional assumptions are made, irrespective
of the reconstruction algorithm used.
Theorem 2 (Converse bound) Let d be a prime power larger than 2. Then there exists
a 2-design D2 ⊂ Cd and orthogonal, normalized vectors x, z ∈ Cd which have the
following property:
Suppose that m measurement vectors y1 , . . . , ym are sampled independently and
uniformly at random from D2 . Then, for any ω ≥ 0, the number of measurements must
obey
3 The situation is comparable to the use of random graphs as randomness expanders [43].
J Fourier Anal Appl
ω
m≥ d(d + 1),
4
or the event
It is worthwhile to put this statement in perspective with other advances in the field.
Throughout our work, we have only demanded that the set of all possible measurement
vectors forms a t-design and have not made any further assumptions. Theorem 2 has
to be interpreted in this regard: The 2-design property alone does not allow for a sub-
quadratic scaling when a “reasonably small” probability of failure is required in the
recovery process.
Note that this does not exclude the possibility that certain realizations of 2-designs
can perform better, if additional structural properties can be exploited. A good exam-
ple for such a measurement process is the multi-illumination setup provided in [17].
In [37] the authors of this paper verified that the set of all measurement vectors
used in the framework of [17] does constitute a 2-design (Lemma 6). Additional
structural properties—most notably a certain correlated Fourier basis structure in the
individual measurements—allowed
for establishing recovery guarantees already for
m = O d log4 d measurements [17] and m = O(d log2 d) [37], respectively—
which both clearly are sub-quadratic sampling rates.
1.3 Outlook
There are a number of problems left open by our analysis. First, recall that our results
achieve linear scaling up to logarithmic factors only when samples are drawn from
a set of superpolynomial size. Thus it would be very interesting to find out whether
there are polynomial size sets such that sampling from them already achieves such
a scaling, in particular, if t-designs for some fixed t can be used. The case of t = 3
seems particularly important in that regard, since the converse bound (Theorem 2)
shows that a design order of at least 3 is necessary. Also, highly structured 3-designs
are known to exist (see above).
Another important follow-up problem concerns approximate t-designs. While our
main result is phrased for exact t-designs, certain scenarios will only exhibit approxi-
mate design properties. We expect that our proofs can be generalized to such a setup,
but also leave this problem for future work. Lastly, the reconstruction quality for noisy
measurements is also an important issue yet to be investigated.
2 Numerical Experiments
200
180
140
120
100
80
60
40
20
5 10 15 20 25 30 35 40
Dimension
Fig. 1 Phase diagram for PhaseLift from (projected) stabilizer states, which form an exact 3-design in
dimensions 2n and a weighted one else. The x-axis indicates the problem’s dimension, while the y-axis
denotes the number of independent design measurements performed. The frequency of a successful recovery
over 30 independent runs of the experiment appears color-coded from black (zero) to white (one). To guide
the eye, we have furthermore included a red line indicating m = 4d − 4
better than our main theorem suggests. To be concrete, we use stabilizer states—a
highly structured vector set which is very prominent in quantum information theory
[31,32]. Stabilizer states exist in any dimension, though their properties are some-
what better-behaved in prime power dimensions. In this case, there exists O(d log d )
stabilizer state vectors. Due to their rich combinatorial structure, these vectors can be
constructed efficiently. For dimensions d = 2n that are a power of two, it is known
[49] that the set of stabilizer states forms a 3-design. This statement is false for other
prime power dimensions (d
= 2n for some n), where they only form an exact 2-design.
However, weighted 3-designs can be constructed for arbitrary dimensions d by pro-
jecting down stabilizer states from the next largest power-of-2-dimension 2n obeying
2n−1 < d < 2n [49]. For further clarification of the concept of exact and weighted
t-designs we defer the reader to [68] and references therein.
We have used these vectors in our numerical simulations, the results of which are
depicted in Fig. 1. For each dimension d between 1 and 40 (x-axis) and for each
number of measurements m ranging from 1 to 200 (y-axis), we ran a total of 30
independent experiments. Each such experiment consisted in choosing a Haar-random
(normalized Gaussian) complex vector x as test signal. Then, we drew m projected
stabilizer states uniformly at random and calculated their squared overlap with the
test signal. We then ran PhaseLift on this data and declared the recovery a “success”
if the Forbenius distance between the reconstructed matrix X̃ and the true projection
X = x x ∗ was smaller than 10−3 . Figure 1 depicts the empirical success probability:
Black corresponds to only failures, white to only successes.
J Fourier Anal Appl
We obtain the picture of a relatively sharp phase transition along a line that scales
linearly in the problem dimension. In fact, the transition seems to occur in the vicin-
ity of the line m = 4d − 4 – drawn in red in Fig. 1. This seems to agree with the
conjecture that 4d − 4 measurements are required for injectivity (see e.g. [41]). How-
ever, there are a few differences in the problem setup: Firstly, the conjecture only asks
whether there is a unique solution, while the numerical simulations study whether the
PhaseLift algorithm can find it. Secondly, the conjecture concerns unique solutions
for all possible inputs, while numerically, we estimate the success probability. And
thirdly, the conjecture states that generic measurements work, while our simulations
use a specific random procedure (drawn uniformly from a 3-design) to generate them.
In this work we require three different objects of linear algebra: vectors, matrices and
operators acting on matrices.
We will work with vectors in a d-dimensional complex Hilbert space V d equipped
with an inner product ·, ·. We refer to the associated induced norm by
z2 = z, z ∀z ∈ V d .
We will denote such vectors by Latin characters. For z ∈ V d , we define the dual vector
z ∗ ∈ (V d )∗ via
z ∗ y = z, y ∀y ∈ V d .
(Z , Y ) = tr(Z Y ), (3)
the space H d becomes a Hilbert space. In addition to that, we will require the 3
different Schatten-norms
where the second one is induced by the scalar product (3). These three norms are
related via the inequalities
√ √
Z 2 ≤ Z 1 ≤ dZ 2 and Z ∞ ≤ Z 2 ≤ dZ ∞ ∀Z ∈ H d .
J Fourier Anal Appl
X = x x ∗ and Ai = ai ai∗ .
MZ 2
Mop = sup
Z ∈H d Z 2
induced by the Frobenius norm on H d . It turns out that only very few matrix-valued
operators will appear below. These are: the identity map
I : Hd → Hd
Z → Z ∀Z ∈ H d
and (scalar multiples of) projectors onto some matrix Y ∈ H d . The latter corresponds
to
Y : H d → H d
Z → Y (Y, Z ) = Y tr(Y Z ) ∀Z ∈ H d .
The operator
1 : Z → 1 tr(1Z ) = 1 tr(Z ) ∀Z ∈ H d ,
is a very important example for this subclass of operators. Note that it is not a normal-
ized projection, but d1 1 is. Indeed, for Z ∈ H d arbitrary
2
d −1 1 Z = d −2 1 tr(11 Z ) = d −2 tr(1)1 tr(Z ) = d −1 1 Z . (4)
0 ≤ 1 ≤ dI, (5)
by using (4).
The properties of t-designs are most naturally stated in the framework of (t-fold)
tensor product spaces. This motivates recapitulating some basic concepts of multi-
linear algebra that are going to greatly simplify our analysis later on. The concepts
presented here are standard and can be found in any textbook on multilinear algebra.
Our presentation has been influenced in particular by [51,74].
Let V1 , . . . , Vk be (finite dimensional, complex) vector spaces, and let V1∗ , . . . , Vk∗
be their dual spaces. A function
f : V1 × · · · × Vk → C
and we call the elementary elements z 1 ⊗ · · · ⊗ z k the tensor product of the vectors
z 1 , . . . , z k ∈ V d . Such an element can alternatively be defined more concretely via the
Kronecker product of the individual vectors. However, such a construction requires
an explicit choice of basis in V d which is not the case in (6).
d ∗maps V → V (d ×d-matrices)
With this notation, the space of linear d d corresponds
to the tensor product M := V ⊗ V
d d which is spanned by y ⊗ z ∗ : y, z ∈ V d —
the set of all rank-1 matrices. For this generating set of M d , we define the trace to be
the natural bilinear map
∗
tr : V d ⊗ V d → C
y ⊗ z ∗ → z ∗ y = z, y
for all y, z ∈ V d . The familiar notion of trace is obtained by extending this definition
linearly to M d .
∗ ⊗k
Using M d = V d ⊗ V d allows us to define the (matrix) tensor product M d
to be the space of all multilinear functions
∗ ∗
f : Vd × Vd × ··· × Vd × Vd → C
k times
J Fourier Anal Appl
1
PSymk (z 1 ⊗ · · · ⊗ z k ) := z π(1) ⊗ · · · ⊗ z π(k) , (7)
k!
π ∈Sk
⊗k
where Sk denotes the group of permutations of k elements. This map projects V d
⊗k
onto the totally symmetric subspace Symk of V d whose dimension [51] is
d +k−1
dim Sym = k
. (8)
k
The idea of (real) spherical designs originates in coding theory [25] and has been
extended to more general spaces in [42,52,62]. We refer the interested reader to Lev-
enshtein [52] for a unified treatment of designs in general metric spaces and from now
on focus on designs in the complex vector space V d .
Roughly speaking, a complex projective t-design is a finite subset of the complex
unit sphere in V d with the property that the discrete average of any polynomial of
degree t or less equals its uniform average. Many equivalent definitions—see e.g.
[42,46,62]—capture this essence. However, there is a more explicit definition of a
t-design that is much more suitable for our purpose:
J Fourier Anal Appl
1
N
(wi wi∗ )⊗t = dim(Symt )−1 PSymt , (9)
N
i=1
where PSymt denotes the projector onto the totally symmetric subspace (7) of (V d )⊗t
and consequently dim(Symt ) = d+t−1t .
Note that the defining property (9) is invariant under global phase changes wi →
eiφ wi , thus it matches the symmetry of the phase retrieval problem. The definition
above is equivalent to demanding
1
N
∗ ⊗t
(wi wi ) = dw (ww ∗ )⊗t ,
N w
i=1
where the right hand side is integrated with respect to the Haar measure. This form
makes the statement that t-designs mimic the first 2t moments of Haar measure more
explicit.
P. Seymor and T. Zaslavsky proved in [69] that t-designs on V d exist for every
t, d ≥ 1, provided that N is large enough (N ≥ N (d, t)), but they do not give an
explicit construction. A necessary criterion—cf. [42,46]—for the t-design property is
that the number of vectors N obeys
d + t/2 − 1 d + t/2−1
N≥ = O(d 2t ). (10)
t/2 t/2
However, the proof in [69] is non-constructive and known constructions are “innefi-
cient” in the sense that the number of vectors required greatly exceeds (10). Hayashi
et al. [40] proposed a construction requiring O(t)d vectors. For real spherical designs
other “inefficient” constructions have been proposed [7,47] (N = t O(d ) ) which can
2
for all 1 ≤ k ≤ t. This knowledge about the first t moments of the sampling procedure
is the key ingredient for our partial derandomization of Gaussian PhaseLift [23].
This approach makes heavy use of operator-valued large deviation bounds. They have
been established first in the field of quantum information by Ahlswede and Winter
[2]. Later the first author of this paper and his coworkers successfully applied these
methods to the problem of low rank matrix recovery [35,38]. By now these methods
are widely used and we borrow them in their most recent (and convenient) form from
Tropp [71,72].
Theorem 5 (Smallest Eigenvalue Bernstein Inequality, [71]) Let S = k Mk be a
sum of iid random matrices Mk which obey E [M K ] = 0 and λmin (M
k ) ≥ −R almost
surely for some fixed R. With the variance parameter σ 2 (S) = k E Mk2 ∞ the
following chain of inequalities holds for all t ≥ 0.
t 2 /2 d exp(−3t 2 /8σ 2 ) t ≤ σ 2 /R
Pr [λmin (S) ≤ −t] ≤ d exp − ≤
σ 2 + Rt/3 d exp(−3t/8R) t ≥ σ 2 /R.
The defining property (9) of t-designs is phrased in terms of tensor spaces. To work with
these notions practically, we need tools for efficiently computing contractions between
high-order tensors. The concept of wiring diagrams provides such a method—see [51]
for an introduction and also [24,73] (however, they use a slightly different notation).
Here, we give a brief description that should suffice for our calculations.
Roughly, the calculus of wiring diagrams associates with every tensor a box, and
with every index of that tensor a line emanating from the box. Two connected lines
represent contracted indices. (More precisely, we place contravariant indices of a
tensor on top of the associated box and covariant ones at the bottom. However, one
should be able to digest our calculations without reference to this detail). A matrix
A : V d → V d can be seen as a two-indexed tensor Ai j . It will thus be represented
by a node A with the upper line corresponding to the index i and the lower one to j.
J Fourier Anal Appl
Two matrices A, B are multiplied by contracting B’s “contravariant” index with A’s
“covariant” one:
(AB)i j = Ai k B k j
k
Pictographically, we write
A
AB =
B
tr(A) = A .
A⊗B = A B.
tr2 (A ⊗ B) = A B .
The last ingredient we need are the transpositions σ(i, j) on (V d )⊗t which act by
interchanging the ith and the jth tensor factor. For example
σ(1,2) (x ⊗ y ⊗ · · · ) = y ⊗ x ⊗ · · · ,
with x, y ∈ V d arbitrary. Transpositions suffice, because they generate the full group
⊗2
of permutations. For V d we only have
but for higher tensor systems more permutations can occur. Consequently, permuta-
tions act by interchanging different input and output lines and the wiring diagram
representation allows one to keep track of this pictorially. In fact, only the input and
J Fourier Anal Appl
2
σ(1,2) = = =1
1
PSym2 (X ⊗ Y )
= (X ⊗ Y + Y ⊗ X ) ,
2
1 1
PSym2 = σπ(1),π(2) = 1 + σ(1,2) ,
2 2
π ∈S2
and the concepts from above allow us to translate this into the following wiring dia-
gram:
1
PSym2 = + .
2
⊗2
(Note that this operator acts on the full tensor space V d , hence in the wiring
diagram it is represented by a two-indexed box.) Applying the graphical calculus
yields
⎛ ⎞ ⎛ ⎞
A B A B A B A B A
1⎜
⎜
⎟ 1⎜
⎟= ⎜
⎟
⎟
tr2 PSym2 A ⊗ B = PSym2 = + +
2⎝ ⎠ 2⎝ B ⎠
1
= (tr(B)A + BA) ,
2
1 1
PSym3 = σπ(1),π(2),π(3) = σ1,2,3 +σ2,1,3 +σ3,2,1 +σ1,3,2 +σ2,3,1 +σ3,1,2 ,
6 6
π ∈S3
A B C
PSym3
⎛ ⎞
A B C A B C A B C A B C A B C A B C
1⎜
⎜
⎟
⎟
= + + + + +
6⎝ ⎠
⎛ ⎞
A B C A C A B A C A A
1⎜
⎜
⎟
= + B + C + B + B + C⎟
6⎝ ⎠
C B
1
= (A tr(B)tr(C) + BA tr(C) + CAtr(B) + A tr(BC) + CBA + BCA)
6
4 Problem Setup
A : H d → Rm ,
m
Z → tr(Ai Z )ei , (15)
i=1
R : Hd → Hd,
m
m
Z → m −1 (d + 1)d Ai Z = m −1 (d + 1)d Ai tr(Ai Z ), (16)
i=1 i=1
(d + 1)d ∗
R= A A.
m
The scaling is going to greatly simplify our analysis, because it guarantees that R is
“near-isotropic”, as the following result shows.
(d + 1)d
m
E[R]Z = E[Ai tr(Ai Z )]
m
i=1
= (d + 1)d tr 2 E[A⊗21 ]1 ⊗ Z (18)
= 2 tr 2 PSym2 1 ⊗ Z (19)
= Z + 1(tr Z ) = I + 1 Z .
J Fourier Anal Appl
Here, (18) follows from the fact that the ai ’s are chosen iid from a t-design, (19) uses
−1
the fact that dim(Sym2 ) = d+12 together with Definition 3, and the final line is an
application of Lemma 6.
Let now x ∈ V d be the signal we want to recover. As in [23] we consider the space
T := x z ∗ + zx ∗ : z ∈ V d ⊂ H d (20)
(which is the tangent space of the manifold of all hermitian matrices at the point
X = x x ∗ ). This space is of crucial importance for our analysis. The orthogonal
projection onto this space can be given explicitly:
PT : H d → T,
Z → X Z + Z X − X Z X (21)
= X Z + Z X − (X, Z )X. (22)
We denote the projection onto its orthogonal complement with respect to the Frobenius
inner product by PT⊥ . Then for any matrix Z ∈ H d the decomposition
Z = PT Z + PT⊥ Z =: Z T + Z T⊥
P T 1 P T = X (23)
holds. We will frequently use this fact. For a proof, consider Z ∈ H d arbitrary and
insert the relevant definitions:
PT 1 PT Z = PT 1 tr(1PT Z ) = (X 1 + 1X − X 1X ) tr (X Z + Z X − X Z X )
= X tr(X Z ) = X Z .
Following [8,18,23] the measurements (13) and (14) can be translated into matrix
form by applying the following “lifts”:
X := x x ∗ , and Ai := ai ai∗ .
Hence, the phase retrieval problem becomes a matrix recovery problem. The solution
to this is guaranteed to have rank 1 and encodes (up to a global phase) the unknown
vector x via X = x x ∗ . Relaxing the rank minimization problem (which would output
the correct solution) to a trace norm minimization yields the now-familiar convex
optimization problem
minarg X X 1
subject to Ai , X = yi i = 1, . . . m,
†
X = X ,
tr(X ) = 1,
X ≥ 0. (24)
While this convex program is formally equivalent to the previously studied general-
purpose matrix recovery algorithms [21,35,63], there are two important differences:
• The measurement matrices Ai are rank-1 projectors: Ai = ai ai∗ .
• The unknown signal is known to be proportional to a rank-1 projector (X = x x ∗ )
as well.
While the second fact is clearly of advantage for us, the first one makes the problem
considerably harder: In the language of [35], it means that the “incoherence parameter”
μ = d maxi=1,...,m Ai ∞ = dai 22 = d is as large as it can get! Higher values of μ
correspond to more ill-posed problems and as a result, a direct application of previous
low-rank matrix recovery results fails. It is this problem that Refs. [18,23] first showed
how to circumvent for the case of Gaussian measurements. Below, we will adapt these
ideas to the case of measurements drawn from designs, which necessitates following
more closely the approach of [35].
4.3 Well-Posedness/Injectivity
m 1
m −1 A(Z )22 = m −1 (tr(Z Ai ))2 = tr(Z m −1 Ai tr(Ai Z )) = tr(Z RZ )
(d + 1)d
i=1 i
1 1
= tr(Z (R − E[R])Z ) + tr(Z (I + 1 )Z )
(d + 1)d (d + 1)d
1 1
= tr(Z PT (R − E[R])PT Z ) + (tr(Z 2 ) + (tr Z )2 )
(d + 1)d (d + 1)d
≥ 0.5d −2 tr(Z PT (R − E[R])PT Z ) + tr(Z 2 )
m
(d + 1)d
PT (R − E[R])PT = (Mi − E[Mi ]) with Mi = P T Ai P T .
m
i=1
Note that these summands have mean zero by construction. Furthermore observe that
the auxiliary result (23) implies
2 1 1 1 1
− I ≤ − I − X ≤ − PT IPT − PT 1 PT
m m m m m
= −PT E[Mi ]PT ≤ PT (Mi − E[Mi ])PT
(d + 1)2 d 2
0 ≤ E[Mi2 ] = 2
P T E Ai P T Ai P T
m
(d + 1)2 d 2
= 2
PT E tr(Ai PT Ai ) Ai PT .
m
J Fourier Anal Appl
where we have used the basic definition of PT and 0 ≤ tr(Ai X ) = |ai , x|2 ≤ 1.
Consequently, for Z ∈ T arbitrary
2(d + 1)2 d 2
PT E[Mi2 ]PT Z ≤ PT E [Ai tr(Ai X ) tr(Ai Z )]
m2
2(d + 1)2 d 2
= 2
PT tr 2,3 E[Ai⊗3 ]1 ⊗ X ⊗ Z
m
12(d + 1)2 d 2
= PT tr 2,3 PSym3 1 ⊗ X ⊗ Z
m (d + 2)(d + 1)d
2
2d
≤ PT (1 tr(Z ) + X tr(Z ) + Z + 1 tr(X Z ) + Z X + X Z )
m2
2d
= (X tr(X Z ) + X tr(X Z ) + Z + X tr(X Z ) + PT Z + X tr(X Z ))
m2
2d 12d
= (4 X + 2I ) Z ≤ 2 I Z .
m2 m
−1
Here we have applied dim Sym3 = d+2 3 and Lemma 7 in lines 3 and 4, respectively.
Furthermore we used Z ∈ T —hence PT Z = Z and tr(Z ) = tr(X Z ) – as well as the
basic definition (22) of PT to simplify the terms occurring in the fourth line. Putting
everything together yields
12d
E[(Mi − E[Mi ])2 ] ≤ E[Mi2 ] ≤ I
m2
for all 0 ≤ δ ≤ 1 ≤ 6d = σ 2 /R. This gives the desired bound on the event
for all matrices Z ∈ T simultaneously. This is the general statement at the beginning
of the proof and setting δ = 1/2 yields Proposition 9.
J Fourier Anal Appl
Proposition 10 Let A be as above with vectors sampled from a t-design (t ≥ 1). Then
the statement
m −1 A(Z )22 ≤ Z 22 (27)
Note that equation (27) can be improved. Indeed, a standard application of the
Operator Bernstein inequality (Theorem 4) gives
for all matrices Z ∈ T with probability of failure smaller than d 2 exp (−Cm/d) for
some 0 < C ≤ 1. However, we actually do not require this tighter bound.
In this section, we will follow [21,35] to prove that the convex program (24) indeed
recovers the sought for signal x, provided that a certain geometric object—an approx-
imate dual certificate—exists.
Definition 11 (Approximate dual certificate) Assume that the sampling process cor-
responds to (13) and (14). Then we call Y ∈ H d an approximate dual certificate,
provided that Y ∈ span (1, A1 , . . . , Am ) and
1 1
YT − X 2 ≤ as well as YT⊥ ∞ ≤ . (28)
4d 2
X̃ 1 = X + 1 ≥ X 1 + tr(T ) + ⊥
T 1 .
J Fourier Anal Appl
tr(T ) + ⊥
T 1 > 0 (29)
is true for every feasible . In order to show this we combine feasibility of with
inequalities (25) and (27) to obtain
(d + 1)d
m
Y = RX − tr(X )1 = Ai tr(Ai X ) − tr(X )1 ∈ span (1, A1 , . . . , Am ) .
m
i=1
(31)
In expectation, E[Y ] = X , which is the “perfect dual certificate” in the sense that the
norm bounds in (28) vanish. The hope would be to use the Operator Bernstein inequal-
ity to show that with high probability, Y will be sufficiently close to its expectation.
It has been shown that a slight refinement of the ansatz (31) indeed achieves this goal
Ref. [35,50]. However, the Bernstein bounds depend on the worst-case operator norm
of the summands. In our case, they can be as large as d 2 |ai , x|2 , which can reach d 2 .
This is far larger than in previous low-rank matrix recovery problems. Reference [23]
relied on the fact that large overlaps |ai , x|2 O(d −1 ) are “rare” for Gaussian ai .
The key observation here is that the t-design property provides one with useful
information about the first t moments of the random variable |x, ai |2 . This knowl-
edge allows us to explicitly bound the probability of “dangerously large overlaps” or
“coherent measurement vectors” occurring.
Pr |a, x|2 ≥ 5td −γ ≤ 4−t d −t (1−γ ) . (32)
Pr |a, x|2 ≥ (δ + 1)td −γ ≤ δ −t d −t (1−γ ) ,
which is valid for any δ ≥ 1. Setting δ = 4 then yields (32). The t-design property
provides us with useful information about the first t moments of the non-negative
random variable ξ = |a, x|2 . Indeed, with A = aa ∗ it holds for every k ≤ t that
E ξ k = E tr(AX )k
= tr E A⊗k X ⊗k
d + k − 1 −1
= tr PSymk X ⊗k
k
d + k − 1 −1 ⊗k
= tr X
k
≤ d −k k!,
because X ⊗k is invariant under PSymk . One way of seeing this4 is to note that
range(X ⊗k ) = span(x ⊗k ) and the latter is already contained in Symk . Therefore
the k-th moment τk of ξ is bounded by
1/k
τk = E[ξ k ] ≤ (d −k k!)1/k ≤ k/d.
μ = E[ξ ] = d −1 .
Pr [|ξ − μ| ≥ sτt ] ≤ s −t ,
4 Alternatively one could also rearrange tensor systems: X ⊗k = (x x ∗ )⊗k x ⊗k (x ∗ )⊗k and use
PSymk x ⊗k = x ⊗k .
J Fourier Anal Appl
Pr |a, x|2 ≥ (δ + 1)td −γ = Pr ξ − μ ≥ (δ + 1)td −γ − d −1
≤ Pr ξ − μ ≥ δtd −γ
≤ Pr |ξ − μ| ≥ δd 1−γ τt
≤ δ −t d −t (1−γ ) ,
for some unique ζ > 0 and z ∈ V d with z2 = 1. For this z we introduce the event
G ic := |z, ai |2 ≥ 5td −γ
(d + 1)d
m
R Z := Rx,y = 1 E i 1G i Ai , (34)
m
i=1
where 1 Ei and 1G i denote the indicator functions associated with the events E i and
G i , respectively.
The following result shows that due to Lemma 13 this truncated operator is in
expectation close to the original R.
(d + 1)d
m
Raux := 1 E i Ai
m
i=1
J Fourier Anal Appl
and observe
(d + 1)d
m
≤ E 1 Eic Ai op
m
i=1
2d 2 2d 2
m m
≤ E 1 Eic = Pr[E ic ]
m m
i=1 i=1
−t −t (1−γ ) 1−2t 2−t (1−γ )
≤ 2d × 4 d
2
=2 d .
Similarly,
m
(d + 1)d 2d 2
m
E [Raux − R Z ]op ≤ E 1G ic Ai ≤ E[1G ic ]
m m
i=1 op i=1
2d 2
m
≤ Pr[G ic ] ≤ 21−2t d 2−t (1−γ )
m
i=1
and inserting these bounds into (36) yields the desired statement.
We now establish a technical result which will allow us to find a suitable approximate
dual certificate using the “golfing scheme” construction [35,50].
Proposition 16 Fix Z ∈ T arbitrary, let R Z be as in (34). Assume that that the design
order t is at least 3 and the truncation rate γ satisfies
γ ≤ 1 − 2/t.
√
Then for 1/4 ≤ b ≤ 1 and c ≥ 2b with probability at least 1 − d exp(− 640td
9mb
2−γ )
one has
Z = ζ (zx ∗ + x z ∗ )
J Fourier Anal Appl
which follows from γ ≤ 1 − 2/t, t ≥ 3 and b ≥ 1/4. To obtain (38) we use a similar
reasoning:
where we have used the fact that PT projects onto a subspace of at most rank-2
matrices in the third line and (39) in the fourth. This motivates to define the event
E := { (R Z − E[R Z ]) Z ∞ ≤ 3b/4}
which guarantees both (37) and (38) due to the assumption on c and Z 2 = 1. So
everything boils down to bounding the probability of E c . We decompose
m
(d + 1)d
(R Z − E[R Z ]) Z = (Mi − E[Mi ]) with Mi = 1 Ei 1G i Ai tr(Ai Z ).
m
i=1
We will estimate this sum using the Operator Bernstein inequality (Theorem 4). Thus
we need an a priori bound for the summands
(d + 1)d 2d 2
Mi ∞ = 1 Ei 1G i Ai ∞ | tr(Ai Z )| ≤ 1 Ei 1G i 2|x, ai ||z, ai |
m m
4d 2 20 2−γ
≤ 5td −γ = td =: R,
m m
and therefore
(d + 1)2 d 2 (d + 1)2 d 2
E Mi2 = 2
E 1 Ei 1G i tr(Ai Z )2 Ai2 ≤ 2
E tr(Ai Z )2 Ai2
m m
(d + 1)2 d 2 6(d + 1)d
⊗3
= tr 2,3 E[Ai ]1 ⊗ Z ⊗ Z = 2 tr 2,3 PSym3 1 ⊗ Z ⊗ Z
m 2 m (d + 2)
(d + 1)d
= 2 1 tr(Z )2 + Z tr(Z ) + Z + 1 tr(Z 2 ) + 2Z 2
m (d + 2)
8d 8d
≤ 2 Z 22 1 = 2 1.
m m
√
Here we have used tr(Z ) ≤ 2Z 2 , Z ≤ Z 2 1 and Z 2 = 1. From this we can
conclude
8d
E[(Mi − E[Mi ])2 ≤ m max E[Mi2 ]∞ ≤ =: σ 2 .
∞ i=1,...,m m
i
Observing that
σ2 8 γ −1 2 3
≤ d ≤ ≤ b,
R 20t 15 4
Theorem 4 yields
3 × 3mb
Pr E c
= Pr [ (R Z − E[R Z ]) Z ∞ > 3b/4] ≤ d exp − ,
8 × 4 × 20td 2−γ
as desired.
With this ingredient we can now construct a suitable approximate dual certificate
Y , closely following [50].
γ ≤ 1 − 2/t
then with probability larger than 1 − 0.5e−ω , there exists an approximate dual cer-
tificate Y as in Definition 11. Here, C is a universal constant (which can in principle
be recovered explicitly from the proof).
J Fourier Anal Appl
The recursive construction yields the following expressions (c.f. [50, Lemma 14]):
Y := Yr = R Qr −1 Q r −1 − tr(Q r −1 )1 + Yr −1
r
= R Q i−1 Q i−1 − tr(Q i−1 )1 and
i=1
Q i = X − PT Yi = PT Q i−1 + tr(Q i−1 )1 − R Q i−1 Q i−1
i
= PT I + 1 − R Q i−1 Q i−1 = · · · = PT I + 1 − R Q j−1 X.
j=1
5 The use of pseudo-code allows for a compact presentation of this randomized procedure. However, the
reader should keep in mind that the construction is purely part of a proof and should not be confused with
the recovery algorithm (which is given in Eq. (24)).
J Fourier Anal Appl
We now set
r = log2 d + 2. (41)
Then, in case of success, the validity of properties (37) and (38) for c = 1/2 and
b = 1/4 in each step (Q i → Q i+1 and Yi → Yi+1 , respectively) guarantee
r
1 1
YT − X 2 = Q r 2 ≤ X 2 = 2−log2 d−2 X 2 ≤ ,
2 4d
j=1
r
⊥
YT⊥ ∞ ≤ PT R Q i−1 Q i−1 − tr(Q i−1 )1
∞
i=1
r
1 1 1−i
r
≤ Q i−1 2 ≤ 2 Q 0 ∞
4 4
i=1 i=1
∞
1 1
≤ X ∞ 2−i = .
4 2
i=0
6 It was pointed out to us by A. Hansen that in some previous papers [35,50] which involve a similar
construction to the one presented here, it was tacitly assumed that the ξi are independent. This will of
course not be true in general. Fortunately, a more careful argument shows that all conclusions remain valid
[1]. Our treatment here is similar to the one presented in [1].
J Fourier Anal Appl
p ≤ min p(ξl−1 , . . . , ξ1 ).
ξl−1 ,...,ξ1
m := C1 td 2−γ log d
and Z = Q gives a probability of success of at least 9/10 for any Q (in particular,
independently of the ξl−1 , . . . , ξ1 ). Thus, choosing p = 9/10 and m i = m for all i,
we can then iterate the estimate (44) to arrive at
l
l
l−1
Pr ξi < r ≤ Pr ξl + ξi < r ≤ · · · ≤ Pr ξi <r , (45)
i=1 i=1 i=1
where the ξi are independent Bernoulli variables with parameter 9/10. A standard
Chernoff bound (e.g. [39, Sect. Concentration: Theorem 2.1]) gives
l
ξi ≤ l(9/10 − t) ≤ e−2lt .
2
Pr
i=1
implies
2 2
9 r 8
2l − ≥ 20ωr ≥ 12ωr ≥ ω + log 2,
10 l 10
where we have used ω ≥ 1 ≥ log 2 in the last inequality. Together with (42), (45) and
(46) this gives the desired bound
l l
1
Pr algorithm fails = Pr ξi < r ≤ Pr ξi < r ≤ e−ω−log(2) = e−ω ,
2
i=1 i=1
J Fourier Anal Appl
l
m i = lm ≤ Cωtd 2−γ log2 d,
i=1
Finally we are ready to put all pieces together and show or main result—Theorem
1.
Proof of the Main Theorem In Sect. 5 (Proposition 12) we have shown that the
algorithm (24) recovers the sought for signal x, provided that (25) holds and a suitable
approximate dual certificate Y exists. Proposition 17—with a maximal truncation rate
of γ = (1 − 2/t)—implies that the probability that no such Y can be constructed is
smaller than 0.5e−ω , provided that the total sampling rate m obeys
for a sufficiently large absolute constant C. Provided that this constant is large enough,
Proposition 9 implies that the probability of (25) failing is also bounded by 0.5e−ω .
Theorem 1 now follows from the union bound over these two probabilities of failure.
7 Converse Bound
In this paper, we require designs of order at least three. Here we prove that this criterion
is fundamental in the sense that sampling from 2-designs in general cannot guarantee
a sub-quadratic sampling rate. In order to do so, we will use a particular sort of 2-
design, called a maximal set of mutually unbiased bases (MUBs) [45,46,67,75]. Two
orthonormal bases {u i }i=1
d
and {vi }i=1
d
are called mutually unbiased if their overlap is
uniformly minimal. Concretely, this means that
1
|u i , v j |2 = ∀i, j = 1, . . . , d
d
must hold for all i, j = 1, . . . , d. Note that this is just a generalization of the inco-
herence property between standard and Fourier basis. In prime power dimensions, a
maximal set of (d + 1) such MUBs is known to exist (and can be constructed) [44].
Such a set is maximal in the sense that it is not possible to find more than (d + 1)
MUBs in any Hilbert space. Among other interesting properties—cf. [26] for a detailed
survey—maximal sets of MUBs are known to form 2-designs [45,75].
The defining properties of a maximal set of MUBs allow us to derive the converse
bound—Theorem 2.
p
p ≤ − log(1 − p) ≤ ≤ 2p
1− p
for any p ∈ [0, 1/2] implies that (48) is a necessary criterion for (49) and we are done.
8 Conclusion
In this paper we have derived a partly derandomized version of Gaussian PhaseLift [18,
23]. Instead of Gaussian random measurements, our method guarantees recovery for
sampling iid from certain finite vector configurations, dubbed t-designs. The required
sampling rate depends on the design order t:
m = O td 1+2/t log2 d . (50)
J Fourier Anal Appl
For small t this rate is worse than the Gaussian analogue—but still non-trivial. How-
ever, as soon as t exceeds 2 log d, we obtain linear scaling up to a polylogarithmic
overhead.
In any case, we feel that the main purpose of this paper is not to present yet another
efficient solution heuristics, but to show that the phase retrieval problem can be deran-
domized using t-designs. These finite vector sets lie in the vast intermediate region
between random Fourier vectors and Gaussian random vectors (the Fourier basis is a
1-design, whereas normalized Gaussian random vectors correspond to an ∞-design).
Therefore the design order t allows us to gradually transcend between these two
extremal cases.
Acknowledgments DG and RK are grateful to the organizers and participants of the Workshop on Phase-
less Reconstruction, held as part of the 2013 February Fourier Talks at the University of Maryland, where
they were introduced to the details of the problem. This extends, in particular, to Thomas Strohmer. The work
of DG and RK is supported by the Excellence Initiative of the German Federal and State Governments (Grant
ZUK 43), by scholarship funds from the State Graduate Funding Program of Baden-Württemberg, and by
the US Army Research Office under contracts W911NF-14-1-0098 and W911NF-14-1-0133 (Quantum
Characterization, Verification, and Validation), and the DFG. FK acknowledges support from the German
Federal Ministry of Education and Reseach (BMBF) through the cooperative research project ZeMat.
Appendix
Here we briefly state an elementary proof of Lemma 6. In the main text we proved this
result using wiring diagrams. The purpose of this is to underline the relative simplicity
of wiring diagram calculations. Indeed, the elementary proof below is considerably
more cumbersome than its pictorial counterpart.
d
d
d
1=1⊗1= bi bi∗ ⊗ b j b∗j and σ(1,2) = bi b∗j ⊗ b j bi∗ .
i=1 j=1 i, j=1
d
tr 2 (A) = 1 ⊗ bi∗ A (1 ⊗ bi ) .
i=1
d
d
tr 2 σ(1,2) A ⊗ B = 1 ⊗ bk∗ bi b∗j ⊗ b j bi∗ A ⊗ B (1 ⊗ bk )
k=1 i, j=1
d
d
= bi b∗j Abk∗ b j bi∗ Bbk = bi , Bb j bi b∗j A
i, j,k=1 i, j=1
d ⎛ ⎞
d
= bi bi∗ B⎝ b j b∗j ⎠ A = 1B1A = B A,
i=1 j=1
and the desired resultfollows. Here we have used the basis representation of the
identity, namely 1 = i=1d
bi bi∗ .
References
1. Adcock, B., Hansen, A.C.: Generalized Sampling and Infinite-Dimensional Compressed Sensing,
Technical Report NA2011/02, DAMTP. University of Cambridge, Cambridge (2011)
2. Ahlswede, R., Winter, A.: Strong converse for identification via quantum channels. IEEE Trans. Inform.
Theory 48, 569–579 (2002)
3. Alexeev, B., Bandeira, A.S., Fickus, M., Mixon, D.G.: Phase retrieval with polarization. SIAM J.
Imaging Sci. 7, 35–66 (2014)
4. Ambainis, A., Emerson, J.: Quantum t-designs: t-wise independence in the quantum world. In: Pro-
ceedings of the 22nd Annual IEEE Conference on Computational Complexity, pp. 129–140 ( 2007)
5. Bachoc, C., Venkov, B.: Modular forms, lattices and spherical designs. In: Euclidean Lattices, Spherical
Designs and Modular Forms, pp. 87–111. On the works of Boris Venkov. L’Enseignement Mathéma-
tique, Genève (2001)
6. Bachoc, C., Ehler, M.: Tight p-fusion frames. Appl. Comput. Harmon. Anal. 35, 1–15 (2013)
7. Bajnok, B.: Construction of spherical t-designs. Geom. Dedic. 43, 167–179 (1992)
8. Balan, R., Casazza, P., Edidin, D.: On signal reconstruction without phase. Appl. Comput. Harmon.
Anal. 20, 345–356 (2006)
9. Balan, R., Bodmann, B.G., Casazza, P.G., Edidin, D.: Painless reconstruction from magnitudes of
frame coefficients. J. Fourier Anal. Appl. 15, 488–501 (2009)
10. Bandeira, A.S., Chen, Y., Mixon, D.G.: Phase retrieval from power spectra of masked signals. Inf.
Inference (2014). doi:10.1093/imaiai/iau002
11. Baraniuk, R.G., Davenport, M., DeVore, R.A., Wakin, M.: A simple proof of the restricted isometry
property for random matrices. Constr. approx. 28, 253–263 (2008)
12. Barvinok, A.: A Course in Convexity. American Mathematical Society (AMS), Providence (2002)
13. Bauschke, H.H., Combettes, P.L., Luke, D.R.: Hybrid projection–reflection method for phase retrieval.
JOSA A 20, 1025–1034 (2003)
14. Bhatia, R.: Matrix Analysis. Springer, New York (1996)
15. Brandao, F.G., Harrow, A.W., Horodecki, M.: Local random quantum circuits are approximate
polynomial-designs. arXiv:1208.0692 (2012)
16. Bruck, Y.M., Sodin, L.: On the ambiguity of the image reconstruction problem. Opt. Commun. 30,
304–308 (1979)
17. Candes, E., Li, X., Soltanolkotabi, M.: Phase retrieval from masked fourier transforms. Appl. Comput.
Harmon. Anal. arXiv:1310.3240 (to appear)
18. Candès, E., Li, X.: Solving quadratic equations via PhaseLift when there are about as many equations
as unknowns. Found. Comput. Math., 1–10 (2013)
19. Candès, E., Tao, T.: Decoding by linear programming. IEEE Trans. Inform. Theory 51, 4203–4215
(2005)
J Fourier Anal Appl
20. Candès, E.J., Romberg, J., Tao, T.: Stable signal recovery from incomplete and inaccurate measure-
ments. Commun. Pure Appl. Math. 59, 1207–1223 (2006)
21. Candès, E.J., Tao, T.: The power of convex relaxation: near-optimal matrix completion. IEEE Trans.
Inform. Theory 56, 2053–2080 (2010)
22. Candes, E.J., Eldar, Y.C., Strohmer, T., Voroninski, V.: Phase retrieval via matrix completion. SIAM
J. Imaging Sci. 6, 199–225 (2013)
23. Candès, E.J., Strohmer, T., Voroninski, V.: Phaselift: exact and stable signal recovery from magnitude
measurements via convex programming. Commun. Pure Appl. Math. 66, 1241–1274 (2013)
24. Cvitanović, P.: Group Theory. Birdtracks, Lie’s, and Exceptional Groups. Princeton University Press,
Princeton (2008)
25. Delsarte, P., Goethals, J., Seidel, J.: Spherical codes and designs. Geom. Dedic. 6, 363–388 (1977)
26. Durt, T., Englert, B.-G., Bengtsson, I., Życzkowski, K.: On mutually unbiased bases. Int. J. Quantum
Inf. 8, 535–640 (2010)
27. Ehler, M., Fornasier, M., Sigl, J.: Quasi-linear compressed sensing. Multiscale Model. Simul. 12,
725–754 (2014)
28. Eldar, Y.C., Mendelson, S.: Phase retrieval: stability and recovery guarantees. Appl. Comput. Harmon.
Anal. 36, 473–494 (2014)
29. Feller, W.: An Introduction to Probability Theory and its Applications. I. Wiley, New York (1968)
30. Fienup, J.R.: Phase retrieval algorithms: a comparison. Appl. Opt. 21, 2758–2769 (1982)
31. Gottesman, D.: Stabilizer codes and quantum error correction. Ph.D. dissertation, California Institute
of Technology (1997)
32. Gottesman, D.: The Heisenberg representation of quantum computers. In: Proceedings of the XXII
International Colloquium on Group Theoretical Methods in Physics, pp. 32–43. International Press,
Cambridge (1999)
33. Grant, M., Boyd, S.: Graph implementations for nonsmooth convex programs. In: Blondel, V., Boyd, S.,
Kimura, H. (eds.) Recent Advances in Learning and Control, Ser. Lecture Notes in Control and Infor-
mation Sciences, Springer-Verlag Limited, London, pp. 95–110. http://stanford.edu/~boyd/graph_dcp.
html (2008). Accessed 13 Oct 2014
34. Grant, M., Boyd, S.: CVX: Matlab software for disciplined convex programming, version 2.1. http://
cvxr.com/cvx (2014). Accessed 13 Oct 2014
35. Gross, D.: Recovering low-rank matrices from few coefficients in any basis. IEEE Trans. Inform.
Theory 57, 1548–1566 (2011)
36. Gross, D., Audenaert, K., Eisert, J.: Evenly distributed unitaries: on the structure of unitary designs. J.
Math. Phys. 48, pp. 052 104, 22 (2007)
37. Gross, D., Krahmer, F., Kueng, R.: Improved recovery guarantees for phase retrieval from coded
diffraction patterns. arXiv:1402.6286 (2014)
38. Gross, D., Liu, Y.-K., Flammia, S.T., Becker, S., Eisert, J.: State tomography via compressed sensing.
Phys. Rev. Lett. 105, 150401 (2010)
39. Habib, M., McDiarmid, C., Ramírez Alfonsín, J., Reed, B. (eds.): Probabilistic Methods for Algorithmic
Discrete Mathematics. Springer, Berlin (1998)
40. Hayashi, A., Hashimoto, T., Horibe, M.: Reexamination of optimal quantum state estimation of pure
states. Phys. Rev. A 72, 032301 (2005)
41. Heinosaari, T., Mazzarella, L., Wolf, M.M.: Quantum tomography under prior information. Commun.
Math. Phys. 318, 355–374 (2013)
42. Hoggar, S.: t-Designs in projective spaces. Eur. J. Comb. 3, 233–254 (1982)
43. Hoory, S., Linial, N., Wigderson, A.: Expander graphs and their applications. Bull. Am. Math. Soc.
43, 439–561 (2006)
44. Klappenecker, A., Rötteler, M.: Constructions of mutually unbiased bases. In: Finite Fields and Appli-
cations. 7th International Conference, Fq 7 , pp. 137–144. Springer, Berlin (2004)
45. Klappenecker, A., Rotteler, M.: Mutually unbiased bases are complex projective 2-designs. In: 2005
IEEE International Symposium on Information Theory (ISIT), vols. 1, 2, pp. 1740–1744 (2005)
46. König, H.: Cubature formulas on spheres. In: Advances in Multivariate Approximation. Proceedings
of the 3rd International Conference on Multivariate Approximation Theory, pp. 201–211. Wiley-VCH,
Berlin (1999)
47. Korevaar, J., Meyers, J.: Chebyshev-type quadrature on multidimensional domains. J. Approx. Theory
79, 144–164 (1994)
J Fourier Anal Appl
48. Krahmer, F., Rauhut, H.: Structured random measurements in signal processing. arXiv:1401.1106
(2014)
49. Kueng, R., Gross, D.: Stabilizer states are complex projective 3-designs in qubit dimensions.
(in preparation)
50. Kueng, R., Gross, D.: RIPless compressed sensing from anisotropic measurements. Lin. Alg. Appl.
441, 110–123 (2014)
51. Landsberg, J.M.: Tensors: Geometry and Applications. American Mathematical Society (AMS),
Providence (2012)
52. Levenshtein, V.I.: Universal bounds for codes and designs. In: Handbook of Coding Theory. Part 1:
Algebraic Coding, vol. 1, pp. 499–648. Elsevier, Amsterdam (1998)
53. Li, X., Voroninski, V.: Sparse signal recovery from quadratic measurements via convex programming.
SIAM J. Math Anal. 45, 3019–3033 (2013)
54. Liu, Y.-K.: Universal low-rank matrix recovery from pauli measurements. Adv. Neural Inf. Process.
Syst. 24, 1638–1646 (2011)
55. Low, R.A.: Large deviation bounds for k-designs. Proc. R. Soc. Lond. Ser. A 465, 3289–3308 (2009)
56. Luby, M., Wigderson, A.: Pairwise independence and derandomization. Print version of Foundations
and Trends in Theoretical Computer Science, Vol. 1, No. 4 (2005), print version of foundations and
trends in theoretical computer science vol. 1, no. 4 (2005) ed. Boston, MA (2006)
57. Millane, R.: Phase retrieval in crystallography and optics. JOSA A 7, 394–411 (1990)
58. Mixon, D.: Short, Fat Matrices, Blog, 2013. http://dustingmixon.wordpress.com/. Accessed 13 Oct
2014
59. Mondragon, D., Voroninski, V.: Determination of all pure quantum states from a minimal number of
observables. arXiv:1306.1214 (2013)
60. Nebe, G., Rains, E., Sloane, N.: The invariants of the Clifford groups. Des. Codes Cryptogr. 24, 99–121
(2001)
61. Netrapalli, P., Jain, P., Sanghavi, S.: Phase retrieval using alternating minimization. In: Advances in
Neural Information Processing Systems, pp. 2796–2804 (2013)
62. Neumaier, A.: Combinatorial Configurations in Terms of Distances. Department of Mathematics Mem-
orandum, Eindhoven (1981)
63. Recht, B., Fazel, M., Parrilo, P.A.: Guaranteed minimum-rank solutions of linear matrix equations via
nuclear norm minimization. SIAM Rev. 52, 471–501 (2010)
64. Renes, J.M., Blume-Kohout, R., Scott, A., Caves, C.M.: Symmetric informationally complete quantum
measurements. J. Math. Phys. 45, 2171–2180 (2004)
65. Rudelson, M., Vershynin, R.: On sparse reconstruction from Fourier and Gaussian measurements.
Commun. Pure Appl. Math. 61, 1025–1045 (2008)
66. Sanderson, B.: Immersions and embeddings of projective spaces. Proc. Lond. Math. Soc. 3(14),
137–153 (1964)
67. Schwinger, J.: Unitary operator bases. Proc. Natl. Acad. Sci. USA 46, 570–579 (1960)
68. Scott, A.: Tight informationally complete quantum measurements. J. Phys. A 39, 13507–13530 (2006)
69. Seymour, P., Zaslavsky, T.: Averaging sets: a generalization of mean values and spherical designs. Adv.
Math. 52, 213–240 (1984)
70. Sidelnikov, V.: Spherical 7-designs in 2n -dimensional Euclidean space. J. Algebr. Comb. 10, 279–288
(1999)
71. Tropp, J. A.: User-friendly tools for random matrices: an introduction, notes. http://users.cms.caltech.
edu/~jtropp/notes/Tro12-User-Friendly-Tools-NIPS.pdf (2012). Accessed 13 Oct 2014
72. Tropp, J.A.: User-friendly tail bounds for sums of random matrices. Found. Comput. Math. 12, 389–434
(2012)
73. Turaev, V.: Quantum Invariants of Knots and 3-Manifolds. Walter de Gruyter, Berlin (1994)
74. Watrous, J.: Theory of quantum information, lecture notes. https://cs.uwaterloo.ca/~watrous/
LectureNotes.html (2011). Accessed 13 Oct 2014
75. Zauner, G.: Quantendesigns: Grundzüge einer nichtkommutativen Designtheorie. Ph.D. dissertation,
University of Vienna (1999)