Quantum States and Ensembles
Quantum Information
Chapter 2
John Preskill
California Institute of Technology
Foundations I: States and Ensembles
for all vectors |ϕ⟩, |ψ⟩ (where here A|ψ⟩ has been denoted as |Aψ⟩). A
is self-adjoint if A = A†, or in other words, if ⟨ϕ|A|ψ⟩ = ⟨ψ|A|ϕ⟩* for all
vectors |ϕ⟩ and |ψ⟩. If A and B are self-adjoint, then so is A + B (because
(A + B)† = A† + B†), but (AB)† = B†A†, so that AB is self-adjoint
only if A and B commute.
2.1 Axioms of quantum mechanics
$$ E_n E_m = \delta_{n,m} E_n, \qquad E_n^\dagger = E_n. \tag{2.5} $$
where a, b are complex numbers that satisfy |a|² + |b|² = 1, and the overall
phase is physically irrelevant. A qubit is a quantum system described by
a two-dimensional Hilbert space, whose state can take any value of the
form eq.(2.15).
We can perform a measurement that projects the qubit onto the basis
{|0⟩, |1⟩}. Then we will obtain the outcome |0⟩ with probability |a|², and
the outcome |1⟩ with probability |b|². Furthermore, except in the cases
a = 0 and b = 0, the measurement irrevocably disturbs the state. If the
value of the qubit is initially unknown, then there is no way to determine
a and b with that single measurement, or any other conceivable measurement.
However, after the measurement, the qubit has been prepared in a
known state – either |0⟩ or |1⟩ – that differs (in general) from its previous
state.
In this respect, a qubit differs from a classical bit; we can measure a
classical bit without disturbing it, and we can decipher all of the infor-
mation that it encodes. But suppose we have a classical bit that really
does have a definite value (either 0 or 1), but where that value is initially
unknown to us. Based on the information available to us we can only say
that there is a probability p0 that the bit has the value 0, and a probability
p1 that the bit has the value 1, where p0 + p1 = 1. When we measure
the bit, we acquire additional information; afterwards we know the value
with 100% confidence.
An important question is: what is the essential difference between a
qubit and a probabilistic classical bit? In fact they are not the same, for
several reasons that we will explore. To summarize the difference in brief:
there is only one way to look at a bit, but there is more than one way to
look at a qubit.
2.2.1 Spin-1/2
First of all, the coefficients a and b in eq.(2.15) encode more than just the
probabilities of the outcomes of a measurement in the {|0⟩, |1⟩} basis. In
particular, the relative phase of a and b also has physical significance.
The properties of a qubit are easier to grasp if we appeal to a geometrical
interpretation of its state. For a physicist, it is natural to interpret
eq.(2.15) as the spin state of an object with spin-1/2 (like an electron).
Then |0⟩ and |1⟩ are the spin up (|↑⟩) and spin down (|↓⟩) states along
a particular axis such as the z-axis. The two real numbers characterizing
the qubit (the complex numbers a and b, modulo the normalization and
overall phase) describe the orientation of the spin in three-dimensional
space (the polar angle θ and the azimuthal angle ϕ).
We will not go deeply here into the theory of symmetry in quantum
mechanics, but we will briefly recall some elements of the theory that
will prove useful to us. A symmetry is a transformation that acts on
a state of a system, yet leaves all observable properties of the system
unchanged. In quantum mechanics, observations are measurements of
self-adjoint operators. If A is measured in the state |ψ⟩, then the outcome
|a⟩ (an eigenvector of A) occurs with probability |⟨a|ψ⟩|². A symmetry
should leave these probabilities unchanged, when we “rotate” both the
system and the apparatus.
A symmetry, then, is a mapping of vectors in Hilbert space
$$ |\psi\rangle \mapsto |\psi'\rangle, \tag{2.16} $$
that preserves the absolute values of inner products
$$ |\langle\varphi|\psi\rangle| = |\langle\varphi'|\psi'\rangle|, \tag{2.17} $$
for all |ϕ⟩ and |ψ⟩. According to a famous theorem due to Wigner, a
mapping with this property can always be chosen (by adopting suitable
phase conventions) to be either unitary or antiunitary. The antiunitary
alternative, while important for discrete symmetries, can be excluded for
continuous symmetries. Then the symmetry acts as
$$ |\psi\rangle \mapsto |\psi'\rangle = U|\psi\rangle, \tag{2.18} $$
where U is unitary (and in particular, linear).
Symmetries form a group: a symmetry transformation can be inverted,
and the product of two symmetries is a symmetry. For each symmetry op-
eration R acting on our physical system, there is a corresponding unitary
transformation U (R). Multiplication of these unitary operators must re-
spect the group multiplication law of the symmetries – applying R1 ◦ R2
should be equivalent to first applying R2 and subsequently R1 . Thus we
demand
U (R1 )U (R2 ) = Phase(R1 , R2 ) · U (R1 ◦ R2 ) (2.19)
A phase depending on R1 and R2 is permitted in eq.(2.19) because quan-
tum states are rays; we need only demand that U (R1 ◦ R2 ) act the same
way as U (R1 )U (R2 ) on rays, not on vectors. We say that U (R) provides
a unitary representation, up to a phase, of the symmetry group.
So far, our concept of symmetry has no connection with dynamics.
Usually, we demand of a symmetry that it respect the dynamical evolu-
tion of the system. This means that it should not matter whether we
first transform the system and then evolve it, or first evolve it and then
transform it. In other words, the diagram
                  dynamics
    Initial    ------------->    Final
       |                           |
 transformation              transformation
       v                           v
                  dynamics
  New Initial  ------------->  New Final
[Q, H] = 0 ; (2.23)
The most general 2×2 unitary matrix with determinant 1 can be expressed
in this form. Thus, we are entitled to think of a qubit as a spin-1/2 object,
and an arbitrary unitary transformation acting on the qubit’s state (aside
from a possible physically irrelevant rotation of the overall phase) is a
rotation of the spin.
A peculiar property of the representation U (n̂, θ) is that it is double-
valued. In particular a rotation by 2π about any axis is represented non-
trivially:
U (n̂, θ = 2π) = −I. (2.32)
Our representation of the rotation group is really a representation “up to
a sign”
U (R1 )U (R2 ) = ±U (R1 ◦ R2 ). (2.33)
But as already noted, this is acceptable, because the group multiplication
is respected on rays, though not on vectors. These double-valued repre-
sentations of the rotation group are called spinor representations. (The
existence of spinors follows from a topological property of the group —
that it is not simply connected.)
While it is true that a rotation by 2π has no detectable effect on a
spin-1/2 object, it would be wrong to conclude that the spinor property
has no observable consequences. Suppose I have a machine that acts on
a pair of spins. If the first spin is up, it does nothing, but if the first spin
is down, it rotates the second spin by 2π. Now let the machine act when
the first spin is in a superposition of up and down. Then
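$$ \frac{1}{\sqrt{2}}\left(|\uparrow\rangle_1 + |\downarrow\rangle_1\right)|\uparrow\rangle_2 \mapsto \frac{1}{\sqrt{2}}\left(|\uparrow\rangle_1 - |\downarrow\rangle_1\right)|\uparrow\rangle_2. \tag{2.34} $$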
While there is no detectable effect on the second spin, the state of the
first has flipped to an orthogonal state, which is very much observable.
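The sign flip in eq.(2.34) can be checked numerically. The following is a minimal sketch in plain Python; it assumes the 2π rotation is taken about the z-axis, so that the rotation acts on the second spin as diag(e^{−iθ/2}, e^{iθ/2}), and the helper names rot_z and apply_machine are illustrative only.

```python
import cmath
import math

def rot_z(theta):
    """Spin-1/2 rotation by theta about z: diag(e^{-i theta/2}, e^{+i theta/2})."""
    return [[cmath.exp(-1j * theta / 2), 0j],
            [0j, cmath.exp(1j * theta / 2)]]

def apply_machine(state, theta=2 * math.pi):
    """state[a][b] is the amplitude for spin 1 = a, spin 2 = b (0 = up, 1 = down).

    If spin 1 is up, do nothing; if spin 1 is down, rotate spin 2 by theta about z.
    """
    u = rot_z(theta)
    untouched = [complex(c) for c in state[0]]
    rotated = [sum(u[b][c] * state[1][c] for c in range(2)) for b in range(2)]
    return [untouched, rotated]

s = 1 / math.sqrt(2)
initial = [[s, 0.0], [s, 0.0]]   # (|up> + |down>)/sqrt(2) on spin 1, |up> on spin 2
final = apply_machine(initial)

# Overlap <initial|final>: it vanishes because spin 1 is now in the orthogonal state.
overlap = sum(complex(initial[a][b]).conjugate() * final[a][b]
              for a in range(2) for b in range(2))
print(final[1][0], abs(overlap))
```

The amplitude of the "first spin down" branch changes sign while the second spin is left in |↑⟩, so the overlap with the initial state is zero up to rounding.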
In a rotated frame of reference, a rotation R(n̂, θ) becomes a rotation
through the same angle but about a rotated axis. It follows that the three
components of angular momentum transform under rotations as a vector:
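$$ |\psi(\theta,\varphi)\rangle = \begin{pmatrix} e^{-i\varphi/2}\cos\frac{\theta}{2} \\ e^{i\varphi/2}\sin\frac{\theta}{2} \end{pmatrix}, \tag{2.39} $$
(up to an overall phase). We can check directly that this is an eigenstate
of
$$ \hat n\cdot\vec\sigma = \begin{pmatrix} \cos\theta & e^{-i\varphi}\sin\theta \\ e^{i\varphi}\sin\theta & -\cos\theta \end{pmatrix}, \tag{2.40} $$
with eigenvalue one. We now see that eq.(2.15) with a = e^{−iϕ/2} cos(θ/2),
b = e^{iϕ/2} sin(θ/2), can be interpreted as a spin pointing in the (θ, ϕ) direction.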
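As a quick numerical check of the eigenvalue claim, the sketch below builds the spinor of eq.(2.39) and the matrix of eq.(2.40) for one arbitrarily chosen pair of angles and applies the matrix to the spinor. The angle values and the function names are illustrative assumptions.

```python
import cmath
import math

def psi(theta, phi):
    """Spinor of eq. (2.39): (e^{-i phi/2} cos(theta/2), e^{+i phi/2} sin(theta/2))."""
    return [cmath.exp(-1j * phi / 2) * math.cos(theta / 2),
            cmath.exp(1j * phi / 2) * math.sin(theta / 2)]

def n_dot_sigma(theta, phi):
    """Matrix of eq. (2.40) for the unit vector with polar angles (theta, phi)."""
    return [[math.cos(theta), cmath.exp(-1j * phi) * math.sin(theta)],
            [cmath.exp(1j * phi) * math.sin(theta), -math.cos(theta)]]

def matvec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

theta, phi = 1.1, 2.3            # arbitrary test angles
v = psi(theta, phi)
w = matvec(n_dot_sigma(theta, phi), v)

# If v is an eigenvector with eigenvalue one, then w - v vanishes.
residual = max(abs(w[k] - v[k]) for k in range(2))
norm = abs(v[0]) ** 2 + abs(v[1]) ** 2
print(residual, norm)
```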
We noted that we cannot determine a and b with a single measurement.
Furthermore, even with many identical copies of the state, we cannot
completely determine the state by measuring each copy only along the
z-axis. This would enable us to estimate |a| and |b|, but we would learn
nothing about the relative phase of a and b. Equivalently, we would find
the component of the spin along the z-axis
$$ \langle\psi(\theta,\varphi)|\sigma_3|\psi(\theta,\varphi)\rangle = \cos^2\frac{\theta}{2} - \sin^2\frac{\theta}{2} = \cos\theta, \tag{2.41} $$
but we would not learn about the component in the x-y plane. The problem
of determining |ψ⟩ by measuring the spin is equivalent to determining
the unit vector n̂ by measuring its components along various axes. Altogether,
measurements along three different axes are required. E.g., from
⟨σ_3⟩ and ⟨σ_1⟩ we can determine n_3 and n_1, but the sign of n_2 remains
undetermined. Measuring ⟨σ_2⟩ would remove this remaining ambiguity.
If we are permitted to rotate the spin, then measurements along
the z-axis alone will suffice. That is, measuring a spin along the n̂ axis is
equivalent to first applying a rotation that rotates the n̂ axis to the axis
ẑ, and then measuring along ẑ.
In the special case θ = π/2 and ϕ = 0 (the x̂-axis) our spin state is
$$ |\uparrow_x\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow_z\rangle + |\downarrow_z\rangle\right) \tag{2.42} $$
(“spin-up along the x-axis”). The orthogonal state (“spin down along the
x-axis”) is
$$ |\downarrow_x\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow_z\rangle - |\downarrow_z\rangle\right). \tag{2.43} $$
For either of these states, if we measure the spin along the z-axis, we will
obtain |↑_z⟩ with probability 1/2 and |↓_z⟩ with probability 1/2.
Now consider the combination
$$ \frac{1}{\sqrt{2}}\left(|\uparrow_x\rangle + |\downarrow_x\rangle\right). \tag{2.44} $$
This state has the property that, if we measure the spin along the x-axis,
we obtain |↑_x⟩ or |↓_x⟩, each with probability 1/2. Now we may ask, what
if we measure the state in eq.(2.44) along the z-axis?
If these were probabilistic classical bits, the answer would be obvious.
The state in eq.(2.44) is in one of two states, and for each of the two,
the probability is 1/2 for pointing up or down along the z-axis. So of
course we should find up with probability 1/2 when we measure the state
(1/√2)(|↑_x⟩ + |↓_x⟩) along the z-axis.
But not so for qubits! By adding eq.(2.42) and eq.(2.43), we see that
the state in eq.(2.44) is really |↑_z⟩ in disguise. When we measure along
the z-axis, we always find |↑_z⟩, never |↓_z⟩.
We see that for qubits, as opposed to probabilistic classical bits, proba-
bilities can add in unexpected ways. This is, in its simplest guise, the phe-
nomenon called “quantum interference,” an important feature of quantum
information.
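The contrast between the two predictions can be made concrete with a few lines of arithmetic. The sketch below represents the states of eqs.(2.42)-(2.44) as two-component amplitude lists in the z basis and compares the Born-rule probability for the superposition with the probability obtained by averaging over a 50/50 classical mixture of |↑_x⟩ and |↓_x⟩; the helper names are illustrative.

```python
import math

s = 1 / math.sqrt(2)
up_z, down_z = [1.0, 0.0], [0.0, 1.0]

def combine(c1, v1, c2, v2):
    """Linear combination c1*v1 + c2*v2 of two amplitude vectors."""
    return [c1 * v1[k] + c2 * v2[k] for k in range(2)]

def prob(outcome, state):
    """Born rule |<outcome|state>|^2."""
    amp = sum(complex(o).conjugate() * x for o, x in zip(outcome, state))
    return abs(amp) ** 2

up_x = combine(s, up_z, s, down_z)            # eq. (2.42)
down_x = combine(s, up_z, -s, down_z)         # eq. (2.43)
superposition = combine(s, up_x, s, down_x)   # eq. (2.44)

# Quantum: amplitudes add before squaring.
p_quantum = prob(up_z, superposition)
# Classical mixture: probabilities of the two alternatives are averaged.
p_classical = 0.5 * prob(up_z, up_x) + 0.5 * prob(up_z, down_x)
print(p_quantum, p_classical)
```

The superposition gives spin up along z with probability one, while the classical averaging gives one half; the difference is the interference term.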
To summarize the geometrical interpretation of a qubit: we may think
of a qubit as a spin-1/2 object, and its quantum state is characterized
by a unit vector n̂ in three dimensions, the spin’s direction. A unitary
transformation rotates the spin, and a measurement of an observable has
two possible outcomes: the spin is either up or down along a specified
axis.
It should be emphasized that, while this formal equivalence with a spin-1/2
object applies to any two-level quantum system, not every two-level
system transforms as a spinor under spatial rotations!
with eigenvalues e^{iθ} and e^{−iθ}, the states of right and left circular polarization.
That is, these are the eigenstates of the rotation generator
$$ J = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} = \sigma_2, \tag{2.48} $$
are mutually orthogonal and can be obtained by rotating the states |x⟩
and |y⟩ by 45°. Suppose that we have a polarization analyzer that allows
only one of two orthogonal linear photon polarizations to pass through,
absorbing the other. Then an x or y polarized photon has probability 1/2
of getting through a 45° rotated polarizer, and a 45° polarized photon
has probability 1/2 of getting through an x or y analyzer. But an x photon
never passes through a y analyzer.
Suppose that a photon beam is directed at an x analyzer, with a y
analyzer placed further downstream. Then about half of the photons will
pass through the first analyzer, but every one of these will be stopped
by the second analyzer. But now suppose that we place a 45°-rotated
analyzer between the x and y analyzers. Then about half of the photons
pass through each analyzer, and about one in eight will manage to pass all
three without being absorbed. Because of this interference effect, there
is no consistent interpretation in which each photon carries one classical
bit of polarization information. Qubits are different than probabilistic
classical bits.
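The one-in-eight figure follows from multiplying the pass probabilities at successive analyzers. The sketch below assumes that the incoming beam is an equal mixture of x and y polarized photons (one way to model a beam of which half passes the first analyzer) and that a photon with linear polarization at angle β passes an analyzer at angle α with probability cos²(α − β), emerging polarized along α. The function names are illustrative.

```python
import math

def pass_probability(analyzer_angles_deg, input_angle_deg):
    """Probability that a linearly polarized photon passes all analyzers in order."""
    prob = 1.0
    current = math.radians(input_angle_deg)
    for angle_deg in analyzer_angles_deg:
        angle = math.radians(angle_deg)
        prob *= math.cos(angle - current) ** 2   # |<analyzer|photon>|^2
        current = angle                          # transmitted photon is polarized along the analyzer
    return prob

def unpolarized_pass_probability(analyzer_angles_deg):
    """Average over an equal mixture of x (0 degrees) and y (90 degrees) photons."""
    return 0.5 * pass_probability(analyzer_angles_deg, 0.0) \
         + 0.5 * pass_probability(analyzer_angles_deg, 90.0)

p_two = unpolarized_pass_probability([0.0, 90.0])          # x analyzer, then y analyzer
p_three = unpolarized_pass_probability([0.0, 45.0, 90.0])  # 45-degree analyzer inserted between
print(p_two, p_three)
```

With only the x and y analyzers nothing gets through, while inserting the 45° analyzer lets through a fraction 1/2 · 1/2 · 1/2 = 1/8 of the beam.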
A device can be constructed that rotates the linear polarization of a
photon, and so applies the transformation Eq. (2.45) to our qubit; it
functions by “turning on” a Hamiltonian for which the circular polarization
states |L⟩ and |R⟩ are nondegenerate energy eigenstates. This
is not the most general possible unitary transformation. But if we also
have a device that alters the relative phase of the two orthogonal linear
polarization states
$$ |x\rangle \to e^{-i\varphi/2}|x\rangle, \qquad |y\rangle \to e^{i\varphi/2}|y\rangle \tag{2.50} $$
(by turning on a Hamiltonian whose nondegenerate energy eigenstates are
the linear polarization states), then the two devices can be employed to-
gether to apply an arbitrary 2 × 2 unitary transformation (of determinant
1) to the photon polarization state.
axioms appear to be violated. The trouble is that our axioms are intended
to characterize the quantum behavior of a closed system that does not
interact with its surroundings. In practice, closed quantum systems do
not exist; the observations we make are always limited to a small part of
a much larger quantum system.
When we study open systems, that is, when we limit our attention to
just part of a larger system, then (contrary to the axioms):
with probability |b|², we obtain the result |1⟩_A and prepare the state
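$$
\begin{aligned}
\langle M_A\rangle &= {}_{AB}\langle\psi|M_A\otimes I_B|\psi\rangle_{AB} \\
&= \sum_{j,\nu} a^*_{j\nu}\left({}_A\langle j|\otimes{}_B\langle\nu|\right)\left(M_A\otimes I_B\right)\sum_{i,\mu} a_{i\mu}\left(|i\rangle_A\otimes|\mu\rangle_B\right) \\
&= \sum_{i,j,\mu} a^*_{j\mu}a_{i\mu}\langle j|M_A|i\rangle = \mathrm{tr}\left(M_A\rho_A\right),
\end{aligned}
\tag{2.62}
$$
where
$$ \rho_A = \mathrm{tr}_B\left(|\psi\rangle\langle\psi|\right) \equiv \sum_{i,j,\mu} a_{i\mu}a^*_{j\mu}|i\rangle\langle j| \tag{2.63} $$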
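dual vector (or bra) _B⟨μ| as a linear map that takes vectors in H_A ⊗ H_B
to vectors of H_A, defined through its action on a basis:
$$ {}_B\langle\mu|i\nu\rangle_{AB} = \delta_{\mu\nu}|i\rangle_A; \tag{2.64} $$
similarly, the ket |μ⟩_B defines a map from the H_A ⊗ H_B dual basis to the
H_A dual basis, via
$$ {}_{AB}\langle i\nu|\mu\rangle_B = \delta_{\mu\nu}\,{}_A\langle i|. \tag{2.65} $$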
The partial trace operation is a linear map that takes an operator M_AB
on H_A ⊗ H_B to an operator on H_A defined as
$$ \mathrm{tr}_B M_{AB} = \sum_\mu {}_B\langle\mu|M_{AB}|\mu\rangle_B. \tag{2.66} $$
From the definition eq.(2.63), we can immediately infer that ρ_A has the
following properties:
1. ρ_A is self-adjoint: ρ_A = ρ_A†.
2. ρ_A is positive: For any |ϕ⟩, ⟨ϕ|ρ_A|ϕ⟩ = Σ_μ |Σ_i a_{iμ}⟨ϕ|i⟩|² ≥ 0.
3. tr(ρ_A) = 1: We have tr(ρ_A) = Σ_{i,μ} |a_{iμ}|² = 1, since |ψ⟩_AB is
normalized.
It follows that ρ_A can be diagonalized in an orthonormal basis, that the
eigenvalues are all real and nonnegative, and that the eigenvalues sum to
one.
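These statements are easy to test on a small example. The sketch below forms ρ_A from the amplitudes a_{iμ} as in eq.(2.63), checks the three properties for a two-qubit state, and compares the expectation value of an observable computed from the full state with tr(M_A ρ_A) as in eq.(2.62). The particular amplitudes, the choice M_A = σ_1, and the function names are assumptions made for illustration; positivity is tested through the trace and determinant, which suffices for a 2 × 2 Hermitian matrix.

```python
def reduced_density_matrix(a):
    """rho_A[i][j] = sum_mu a[i][mu] * conj(a[j][mu]), as in eq. (2.63)."""
    d_a, d_b = len(a), len(a[0])
    return [[sum(a[i][mu] * complex(a[j][mu]).conjugate() for mu in range(d_b))
             for j in range(d_a)] for i in range(d_a)]

def expectation_direct(a, m_a):
    """<psi| M_A (x) I_B |psi>, summed over amplitudes as in eq. (2.62)."""
    d_a, d_b = len(a), len(a[0])
    return sum(complex(a[j][mu]).conjugate() * m_a[j][i] * a[i][mu]
               for i in range(d_a) for j in range(d_a) for mu in range(d_b))

def expectation_from_rho(rho, m_a):
    """tr(M_A rho_A)."""
    d = len(rho)
    return sum(m_a[j][i] * rho[i][j] for i in range(d) for j in range(d))

# Hypothetical normalized two-qubit state with amplitudes a[i][mu].
a = [[0.5, 0.5j], [0.5, -0.5]]
rho_a = reduced_density_matrix(a)
sigma_1 = [[0, 1], [1, 0]]

trace = (rho_a[0][0] + rho_a[1][1]).real
determinant = (rho_a[0][0] * rho_a[1][1] - rho_a[0][1] * rho_a[1][0]).real
hermitian_defect = abs(rho_a[0][1] - rho_a[1][0].conjugate())
direct = expectation_direct(a, sigma_1)
via_rho = expectation_from_rho(rho_a, sigma_1)
print(trace, determinant, hermitian_defect, direct, via_rho)
```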
If we are looking at a subsystem of a larger quantum system, then, even
if the state of the larger system is a ray, the state of the subsystem need
not be; in general, the state is represented by a density operator. In the
case where the state of the subsystem is a ray, we say that the state is
pure. Otherwise the state is mixed. If the state is a pure state |ψ⟩_A, then
the density matrix ρ_A = |ψ⟩⟨ψ| is the projection onto the one-dimensional
space spanned by |ψ⟩_A. Hence a pure density matrix has the property
ρ² = ρ. A general density matrix, expressed in the basis {|a⟩} in which
it is diagonal, has the form
$$ \rho_A = \sum_a p_a |a\rangle\langle a|, \tag{2.68} $$
where 0 < p_a ≤ 1 and Σ_a p_a = 1. If the state is not pure, there are two
or more terms in this sum, and ρ² ≠ ρ; in fact, tr ρ² = Σ_a p_a² < Σ_a p_a = 1.
2.3 The density operator
density matrices must have the eigenvalues 0 and 1 — they are one-
dimensional projectors, and hence pure states. We have already seen
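that any pure state of a single qubit is of the form |ψ(θ, ϕ)⟩ and can be
envisioned as a spin pointing in the (θ, ϕ) direction. Indeed using the
property
$$ (\hat n\cdot\vec\sigma)^2 = I, \tag{2.71} $$
where n̂ is a unit vector, we can easily verify that the pure-state density
matrix
$$ \rho(\hat n) = \frac{1}{2}\left(I + \hat n\cdot\vec\sigma\right) \tag{2.72} $$
satisfies the property
$$ (\hat n\cdot\vec\sigma)\,\rho(\hat n) = \rho(\hat n)\,(\hat n\cdot\vec\sigma) = \rho(\hat n), \tag{2.73} $$
and, therefore is the projector
$$ \rho(\hat n) = |\psi(\hat n)\rangle\langle\psi(\hat n)|; \tag{2.74} $$
that is, n̂ is the direction along which the spin is pointing up. Alternatively,
from the expression
$$ |\psi(\theta,\varphi)\rangle = \begin{pmatrix} e^{-i\varphi/2}\cos(\theta/2) \\ e^{i\varphi/2}\sin(\theta/2) \end{pmatrix}, \tag{2.75} $$
we may compute directly that
$$
\begin{aligned}
\rho(\theta,\varphi) &= |\psi(\theta,\varphi)\rangle\langle\psi(\theta,\varphi)| \\
&= \begin{pmatrix} \cos^2(\theta/2) & \cos(\theta/2)\sin(\theta/2)e^{-i\varphi} \\ \cos(\theta/2)\sin(\theta/2)e^{i\varphi} & \sin^2(\theta/2) \end{pmatrix} \\
&= \frac{1}{2}I + \frac{1}{2}\begin{pmatrix} \cos\theta & \sin\theta\,e^{-i\varphi} \\ \sin\theta\,e^{i\varphi} & -\cos\theta \end{pmatrix} = \frac{1}{2}\left(I + \hat n\cdot\vec\sigma\right)
\end{aligned}
\tag{2.76}
$$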
where n̂ = (sin θ cos ϕ, sin θ sin ϕ, cos θ). One nice property of the Bloch
parametrization of the pure states is that while |ψ(θ, ϕ)⟩ has an arbitrary
overall phase that has no physical significance, there is no phase ambiguity
in the density matrix ρ(θ, ϕ) = |ψ(θ, ϕ)⟩⟨ψ(θ, ϕ)|; all the parameters in ρ
have a physical meaning.
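From the property
$$ \frac{1}{2}\mathrm{tr}\,\sigma_i\sigma_j = \delta_{ij} \tag{2.77} $$
we see that
$$ \langle\hat n\cdot\vec\sigma\rangle_{\vec P} = \mathrm{tr}\left(\hat n\cdot\vec\sigma\,\rho(\vec P)\right) = \hat n\cdot\vec P. \tag{2.78} $$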
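Eq.(2.78) can be verified numerically for a particular choice of vectors. The sketch below assumes that ρ(P) has the form (1/2)(I + P·σ), i.e. eq.(2.72) with the unit vector replaced by a general polarization vector P of length at most one; the specific vectors and the function names are illustrative.

```python
def bloch_density_matrix(p):
    """rho(P) = (I + P.sigma)/2 for a polarization vector P = (x, y, z) (assumed form)."""
    x, y, z = p
    return [[(1 + z) / 2, (x - 1j * y) / 2],
            [(x + 1j * y) / 2, (1 - z) / 2]]

def n_dot_sigma(n):
    """n.sigma for a vector n = (x, y, z) in terms of the Pauli matrices."""
    x, y, z = n
    return [[z, x - 1j * y], [x + 1j * y, -z]]

def trace_of_product(a, b):
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))

n = (0.0, 0.6, 0.8)      # unit vector: measurement axis
p = (0.3, -0.2, 0.5)     # polarization vector with |P| < 1 (a mixed state)

expectation = trace_of_product(n_dot_sigma(n), bloch_density_matrix(p))
dot = sum(ni * pi for ni, pi in zip(n, p))
print(expectation, dot)
```

Both numbers agree (here n̂ · P = 0.28), and the imaginary part of the trace vanishes, as it must for the expectation value of a self-adjoint operator.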
Here {|i⟩_A} and {|μ⟩_B} are orthonormal bases for H_A and H_B respectively,
but to obtain the second equality in eq.(2.79) we have defined
$$ |\tilde i\rangle_B \equiv \sum_\mu \psi_{i\mu}|\mu\rangle_B. \tag{2.80} $$
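$$
\begin{aligned}
\rho_A &= \mathrm{tr}_B\left(|\psi\rangle\langle\psi|\right) \\
&= \mathrm{tr}_B\Big(\sum_{i,j}|i\rangle\langle j|\otimes|\tilde i\rangle\langle\tilde j|\Big) = \sum_{i,j}\langle\tilde j|\tilde i\rangle\left(|i\rangle\langle j|\right).
\end{aligned}
\tag{2.82}
$$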
Hence, it turns out that the {|ĩ⟩_B} are orthogonal after all. We obtain
orthonormal vectors by rescaling,
$$ |i'\rangle_B = p_i^{-1/2}|\tilde i\rangle_B \tag{2.85} $$
(we may assume p_i ≠ 0, because we will need eq.(2.85) only for i appearing
in the sum eq.(2.81)), and therefore obtain the expansion
$$ |\psi\rangle_{AB} = \sum_i \sqrt{p_i}\,|i\rangle_A\otimes|i'\rangle_B, \tag{2.86} $$
The orthonormal bases {|a⟩_A} and {|μ⟩_B} are related to the Schmidt
bases {|i⟩_A} and {|i'⟩_B} by unitary transformations U_A and U_B, hence
$$ |i\rangle_A = \sum_a |a\rangle_A (U_A)_{ai}, \qquad |i'\rangle_B = \sum_\mu |\mu\rangle_B (U_B)_{\mu i'}. \tag{2.88} $$
and |i'⟩_B’s, and then we pair up the eigenstates of ρ_A and ρ_B with the
same eigenvalue to obtain eq.(2.86). We have chosen the phases of our
basis states so that no phases appear in the coefficients in the sum; the
only remaining freedom is to redefine |i⟩_A and |i'⟩_B by multiplying by
opposite phases (which leaves the expression eq.(2.86) unchanged).
But if ρ_A has degenerate nonzero eigenvalues, then we need more information
than that provided by ρ_A and ρ_B to determine the Schmidt
decomposition; we need to know which |i'⟩_B gets paired with each |i⟩_A.
For example, if both H_A and H_B are d-dimensional and U_ij is any d × d
unitary matrix, then
$$ |\psi\rangle_{AB} = \frac{1}{\sqrt{d}}\sum_{i,j=1}^{d}|i\rangle_A U_{ij}\otimes|j'\rangle_B, \tag{2.91} $$
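we have
$$
\begin{aligned}
|\psi\rangle_{AB} &= \frac{1}{\sqrt{d}}\sum_i |i\rangle_A\otimes|i'\rangle_B = \frac{1}{\sqrt{d}}\sum_{i,a,b}|a\rangle_A U_{ai}\otimes|b'\rangle_B U^\dagger_{ib} \\
&= \frac{1}{\sqrt{d}}\sum_a |a\rangle_A\otimes|a'\rangle_B.
\end{aligned}
\tag{2.93}
$$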
2.4.1 Entanglement
With any bipartite pure state |ψ⟩_AB we may associate a positive integer,
the Schmidt number, which is the number of nonzero eigenvalues in ρ_A
(or ρ_B) and hence the number of terms in the Schmidt decomposition
of |ψ⟩_AB. In terms of this quantity, we can define what it means for a
bipartite pure state to be entangled: |ψ⟩_AB is entangled (or nonseparable)
if its Schmidt number is greater than one; otherwise, it is separable (or
unentangled). Thus, a separable bipartite pure state is a direct product
of pure states in H_A and H_B,
$$ |\psi\rangle_{AB} = |\varphi\rangle_A\otimes|\chi\rangle_B; \tag{2.94} $$
then the reduced density matrices ρ_A = |ϕ⟩⟨ϕ| and ρ_B = |χ⟩⟨χ| are pure.
Any state that cannot be expressed as such a direct product is entangled;
then ρ_A and ρ_B are mixed states.
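For two qubits the Schmidt number can be read off from the two eigenvalues of ρ_A, which follow from its trace and determinant. The sketch below is restricted to that two-qubit case; the tolerance used to decide whether an eigenvalue counts as nonzero and the function names are assumptions made for illustration.

```python
import math

def schmidt_coefficients(a):
    """Eigenvalues of rho_A for a two-qubit pure state with amplitudes a[i][mu]."""
    rho = [[sum(a[i][mu] * complex(a[j][mu]).conjugate() for mu in range(2))
            for j in range(2)] for i in range(2)]
    tr = (rho[0][0] + rho[1][1]).real
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    # Roots of the characteristic polynomial p^2 - tr*p + det = 0.
    return [(tr + disc) / 2, (tr - disc) / 2]

def schmidt_number(a, tol=1e-12):
    """Number of nonzero eigenvalues of rho_A (tol is an illustrative cutoff)."""
    return sum(1 for p in schmidt_coefficients(a) if p > tol)

s = 1 / math.sqrt(2)
product_state = [[s, 0.0], [s, 0.0]]    # (|up> + |down>)/sqrt(2) (x) |up>
entangled_state = [[s, 0.0], [0.0, s]]  # (|up>|up> + |down>|down>)/sqrt(2)

k_product = schmidt_number(product_state)
k_entangled = schmidt_number(entangled_state)
print(k_product, schmidt_coefficients(entangled_state), k_entangled)
```

The product state has a single nonzero eigenvalue (ρ_A is pure), while the second state has two equal eigenvalues 1/2, so ρ_A is mixed and the state is entangled.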
When |ψ⟩_AB is entangled we say that A and B have quantum correlations.
It is not strictly correct to say that subsystems A and B are
uncorrelated if |ψ⟩_AB is separable; after all, the two spins in the separable
state
$$ |\uparrow\rangle_A|\uparrow\rangle_B, \tag{2.95} $$
are surely correlated – they are both pointing in the same direction. But
the correlations between A and B in an entangled state have a different
character than those in a separable state. One crucial difference is that
entanglement cannot be created locally. The only way to entangle A and
B is for the two subsystems to directly interact with one another.
We can prepare the state eq.(2.95) without allowing spins A and B to
ever come into contact with one another. We need only send a (classical!)
message to two preparers (Alice and Bob) telling both of them to prepare
a spin pointing along the z-axis. But the only way to turn the state
eq.(2.95) into an entangled state like
$$ \frac{1}{\sqrt{2}}\left(|\uparrow\rangle_A|\uparrow\rangle_B + |\downarrow\rangle_A|\downarrow\rangle_B\right), \tag{2.96} $$
is to apply a collective unitary transformation to the state. Local unitary
transformations of the form UA ⊗ UB , and local measurements performed
by Alice or Bob, cannot increase the Schmidt number of the two-qubit
state, no matter how much Alice and Bob discuss what they do. To
entangle two qubits, we must bring them together and allow them to
interact.
As we will discuss in Chapter 4, it is also possible to make the distinction
between entangled and separable bipartite mixed states. We will also
discuss various ways in which local operations can modify the form of
entanglement, and some ways that entanglement can be put to use.
states. The boundary of the set consists of all density matrices with
at least one vanishing eigenvalue (since there are nearby matrices with
negative eigenvalues). Such a density matrix need not be pure, for d > 2,
since the number of nonvanishing eigenvalues can exceed one.
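$$
\begin{aligned}
\langle M\rangle &= \lambda\langle M\rangle_1 + (1-\lambda)\langle M\rangle_2 \\
&= \lambda\,\mathrm{tr}(M\rho_1) + (1-\lambda)\,\mathrm{tr}(M\rho_2) \\
&= \mathrm{tr}\left(M\rho(\lambda)\right).
\end{aligned}
\tag{2.100}
$$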
with 100% probability. Any probability distribution for the bit is a convex
sum of these two extremal points. Similarly, if there are d possible states,
there are d extremal distributions, and any probability distribution has
a unique decomposition into extremal ones (the convex set of probability
distributions is a simplex, the convex hull of its d vertices). If 0 occurs
with 21% probability, 1 with 33% probability, and 2 with 46% probability,
there is a unique “preparation procedure” that yields this probability
distribution.
But, when Alice receives Bob’s phone call, she can select a subensemble
of her spins that are all in the pure state |↑_x⟩_A. The information that
Bob sends allows Alice to distill purity from a maximally mixed state.
Another wrinkle on the quantum eraser is sometimes called delayed
choice. This just means that the situation we have described is really
completely symmetric between Alice and Bob, so it can’t make any dif-
ference who measures first. (Indeed, if Alice’s and Bob’s measurements
are spacelike separated events, there is no invariant meaning to which
came first; it depends on the frame of reference of the observer.) Alice
could measure all of her spins today (say along x̂) before Bob has made
his mind up how he will measure his spins. Next week, Bob can decide
to “prepare” Alice’s spins in the states |↑_n̂⟩_A and |↓_n̂⟩_A (that is the
“delayed choice”). He then tells Alice which were the |↑_n̂⟩_A spins, and
she can check her measurement record to verify that
$$ \langle\sigma_1\rangle_{\hat n} = \hat n\cdot\hat x. \tag{2.110} $$
The results are the same, irrespective of whether Bob “prepares” the spins
before or after Alice measures them.
We have claimed that the density matrix ρA provides a complete phys-
ical description of the state of subsystem A, because it characterizes all
possible measurements that can be performed on A. One might object
that the quantum eraser phenomenon demonstrates otherwise. Since the
information received from Bob enables Alice to recover a pure state from
the mixture, how can we hold that everything Alice can know about A is
encoded in ρA ?
I prefer to say that quantum erasure illustrates the principle that “in-
formation is physical.” The state ρA of system A is not the same thing
as ρA accompanied by the information that Alice has received from Bob.
This information (which attaches labels to the subensembles) changes the
physical description. That is, we should include Alice’s “state of knowl-
edge” in our description of her system. An ensemble of spins for which
Alice has no information about whether each spin is up or down is a dif-
ferent physical state than an ensemble in which Alice knows which spins
are up and which are down. This “state of knowledge” need not really
be the state of a human mind; any (inanimate) record that labels the
subensemble will suffice.
We have already seen that a mixed state of any quantum system can
be realized as an ensemble of pure states in an infinite number of different
ways. For a density matrix ρA , consider one such realization:
$$ \rho_A = \sum_i p_i|\varphi_i\rangle\langle\varphi_i|, \qquad \sum_i p_i = 1. \tag{2.111} $$
Here the states {|ϕ_i⟩_A} are all normalized vectors, but we do not assume
that they are mutually orthogonal. Nevertheless, ρ_A can be realized as
an ensemble, in which each pure state |ϕ_i⟩⟨ϕ_i| occurs with probability p_i.
For any such ρ_A, we can construct a “purification” of ρ_A, a bipartite
pure state |Φ_1⟩_AB that yields ρ_A when we perform a partial trace over
H_B. One such purification is of the form
$$ |\Phi_1\rangle_{AB} = \sum_i \sqrt{p_i}\,|\varphi_i\rangle_A\otimes|\alpha_i\rangle_B, \tag{2.112} $$
where again the {|β_μ⟩_B} are orthonormal vectors in H_B. So in the state
|Φ_2⟩_AB, we can realize the ensemble by performing a measurement in H_B
that projects onto the {|β_μ⟩_B} basis.
Now, how are |Φ_1⟩_AB and |Φ_2⟩_AB related? In fact, we can easily show
that
$$ |\Phi_1\rangle_{AB} = \left(I_A\otimes U_B\right)|\Phi_2\rangle_{AB}; \tag{2.117} $$
the two states differ by a unitary change of basis acting in H_B alone, or
$$ |\Phi_1\rangle_{AB} = \sum_\mu \sqrt{q_\mu}\,|\psi_\mu\rangle_A\otimes|\gamma_\mu\rangle_B, \tag{2.118} $$
where
$$ |\gamma_\mu\rangle_B = U_B|\beta_\mu\rangle_B, \tag{2.119} $$
is yet another orthonormal basis for H_B. We see, then, that there is a single
purification |Φ_1⟩_AB of ρ_A, such that we can realize either the {|ϕ_i⟩_A}
ensemble or {|ψ_μ⟩_A} ensemble by choosing to measure the appropriate
observable in system B!
Similarly, we may consider many ensembles that all realize ρA , where
the maximum number of pure states appearing in any of the ensembles
is d. Then we may choose a Hilbert space HB of dimension d, and a
pure state |Φ⟩_AB ∈ H_A ⊗ H_B, such that any one of the ensembles can
be realized by measuring a suitable observable of B. This is the HJW
theorem (for Hughston, Jozsa, and Wootters); it expresses the quantum
eraser phenomenon in its most general form.
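A two-qubit example makes the statement concrete. In the sketch below, a single purification (|↑⟩_A|↑⟩_B + |↓⟩_A|↓⟩_B)/√2 of the maximally mixed state is measured on B in two different orthonormal bases; each choice prepares a different ensemble of pure states of A, and both ensembles reproduce the same ρ_A. The choice of state and bases, and the function names, are assumptions made for illustration.

```python
import math

def conditional_ensemble(a, basis_b):
    """Ensemble of A prepared by projecting B onto each vector of basis_b.

    a[i][mu] are the amplitudes of the bipartite state; returns (probability, state) pairs.
    """
    ensemble = []
    for b in basis_b:
        unnormalized = [sum(complex(b[mu]).conjugate() * a[i][mu] for mu in range(len(b)))
                        for i in range(len(a))]
        p = sum(abs(c) ** 2 for c in unnormalized)
        if p > 1e-15:
            ensemble.append((p, [c / math.sqrt(p) for c in unnormalized]))
    return ensemble

def ensemble_density_matrix(ensemble, d):
    """sum_k p_k |phi_k><phi_k| as a d x d nested list."""
    return [[sum(p * v[i] * v[j].conjugate() for p, v in ensemble) for j in range(d)]
            for i in range(d)]

s = 1 / math.sqrt(2)
phi = [[s, 0.0], [0.0, s]]            # (|up>_A|up>_B + |down>_A|down>_B)/sqrt(2)
z_basis = [[1.0, 0.0], [0.0, 1.0]]    # measure B along z
x_basis = [[s, s], [s, -s]]           # measure B along x

ens_z = conditional_ensemble(phi, z_basis)   # {|up_z>, |down_z>}, each with p = 1/2
ens_x = conditional_ensemble(phi, x_basis)   # {|up_x>, |down_x>}, each with p = 1/2
rho_z = ensemble_density_matrix(ens_z, 2)
rho_x = ensemble_density_matrix(ens_x, 2)
print(rho_z, rho_x)
```

The two ensembles differ state by state, yet both density matrices equal I/2, so nothing Alice can measure on A alone reveals which observable was measured on B.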
In fact, the HJW theorem is an easy corollary to the Schmidt decomposition.
Both |Φ_1⟩_AB and |Φ_2⟩_AB have Schmidt decompositions, and
because both yield the same ρ_A when we take the partial trace over B,
these decompositions must have the form
$$
\begin{aligned}
|\Phi_1\rangle_{AB} &= \sum_k \sqrt{\lambda_k}\,|k\rangle_A\otimes|k_1'\rangle_B, \\
|\Phi_2\rangle_{AB} &= \sum_k \sqrt{\lambda_k}\,|k\rangle_A\otimes|k_2'\rangle_B,
\end{aligned}
\tag{2.120}
$$
where the λ_k’s are the eigenvalues of ρ_A and the |k⟩_A’s are the corresponding
eigenvectors. But since {|k_1'⟩_B} and {|k_2'⟩_B} are both orthonormal
bases for H_B, there is a unitary U_B such that
$$ |k_1'\rangle_B = U_B|k_2'\rangle_B, \tag{2.121} $$
from which eq.(2.117) immediately follows.
In the ensemble of pure states described by Eq. (2.111), we would say
that the pure states |ϕ_i⟩_A are mixed incoherently: an observer in system
A cannot detect the relative phases of these states. Heuristically, the
(Some authors use the name “fidelity” for the square root of this quantity.)
The fidelity is nonnegative, vanishes if ρ and σ have support on mutually
orthogonal subspaces, and attains its maximum value 1 if and only if the
two states are identical. If ρ = |ψ⟩⟨ψ| is a pure state, then the fidelity is
where
$$ \|A\|_1 = \mathrm{tr}\sqrt{A^\dagger A}. \tag{2.125} $$
The L1 norm is also sometimes called the trace norm. (For Hermitian
A, ‖A‖₁ is just the sum of the absolute values of its eigenvalues.) The
fidelity F(ρ, σ) is actually symmetric in its two arguments, although the
2.6 How far apart are two quantum states?
This holds because BAAB and ABBA have the same eigenvalues:
if |ψ⟩ is an eigenstate of ABBA with eigenvalue λ, then BA|ψ⟩ is an
eigenstate of BAAB with eigenvalue λ.
It is useful to know how the fidelity of two density operators is related
to the overlap of their purifications. We say that |Φ⟩_AB is a purification
of the density operator ρA if
If
$$ \rho = \sum_i p_i|i\rangle\langle i| \tag{2.128} $$
(where {|i⟩_A} is an orthonormal basis for system A), then a particular purification
of ρ has the form
$$ |\Phi_\rho\rangle = \sum_i \sqrt{p_i}\,|i\rangle_A\otimes|i\rangle_B \tag{2.129} $$
Noting that
$$ U\otimes I|\tilde\Phi\rangle = \sum_{i,j}|j\rangle\otimes|i\rangle U_{ji} = \sum_{i,j}|j\rangle\otimes|i\rangle U^T_{ij} = I\otimes U^T|\tilde\Phi\rangle, \tag{2.134} $$
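we have
$$ \langle\Phi_\sigma(W)|\Phi_\rho(V)\rangle = \langle\tilde\Phi|\sigma^{\frac{1}{2}}\rho^{\frac{1}{2}}U\otimes I|\tilde\Phi\rangle = \mathrm{tr}\left(\sigma^{\frac{1}{2}}\rho^{\frac{1}{2}}U\right), \tag{2.135} $$
where U = (W†V)^T.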
Now we may use the polar decomposition
$$ A = U'\sqrt{A^\dagger A}, \tag{2.136} $$
which says that tracing out a subsystem cannot decrease the fidelity of
two density operators. Monotonicity means, unsurprisingly, that throwing
away a subsystem does not make two quantum states easier to distinguish.
It follows from Uhlmann’s theorem because any state purifying ρAB also
provides a purification of A; therefore the optimal overlap of the purifi-
cations of ρAB and σ AB is surely achievable by purifications of ρA and
σA.
(For Hermitian A, ‖A‖₂ is the square root of the sum of the squares
of its eigenvalues.) The L1 distance is a particularly natural measure
of state distinguishability, because (as shown in Exercise 2.5) it can be
interpreted as the distance between the corresponding probability dis-
tributions achieved by the optimal measurement for distinguishing the
states. Although the fidelity, L1 distance, and L2 distance are not simply
related to one another in general, there are useful inequalities relating
these different measures.
If {|λ_i|, i = 0, 1, 2, . . . , d−1} denotes the eigenvalues of √(A†A), then
$$ \|A\|_1 = \sum_{i=0}^{d-1}|\lambda_i|; \qquad \|A\|_2 = \sqrt{\sum_{i=0}^{d-1}|\lambda_i|^2}. \tag{2.142} $$
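and note that the absolute value of this difference may be written as
$$ |\sqrt{\rho}-\sqrt{\sigma}| = \sum_i |\lambda_i|\,|i\rangle\langle i| = \left(\sqrt{\rho}-\sqrt{\sigma}\right)U = U\left(\sqrt{\rho}-\sqrt{\sigma}\right), \tag{2.146} $$
where
$$ U = \sum_i \mathrm{sign}(\lambda_i)|i\rangle\langle i|. \tag{2.147} $$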
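Using
$$ \rho-\sigma = \frac{1}{2}\left(\sqrt{\rho}-\sqrt{\sigma}\right)\left(\sqrt{\rho}+\sqrt{\sigma}\right) + \frac{1}{2}\left(\sqrt{\rho}+\sqrt{\sigma}\right)\left(\sqrt{\rho}-\sqrt{\sigma}\right) \tag{2.148} $$
and the cyclicity of the trace, we find
$$
\begin{aligned}
\mathrm{tr}\left((\rho-\sigma)U\right) &= \mathrm{tr}\left(|\sqrt{\rho}-\sqrt{\sigma}|\left(\sqrt{\rho}+\sqrt{\sigma}\right)\right) = \sum_i |\lambda_i|\,\langle i|\sqrt{\rho}+\sqrt{\sigma}|i\rangle \\
&\ge \sum_i |\lambda_i|\,\left|\langle i|\sqrt{\rho}-\sqrt{\sigma}|i\rangle\right| = \sum_i |\lambda_i|^2 = \left\|\sqrt{\rho}-\sqrt{\sigma}\right\|_2^2.
\end{aligned}
\tag{2.149}
$$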
Finally, using
and hence
$$ \sqrt{F(\rho,\sigma)} \ge 1 - \frac{1}{2}\left\|\sqrt{\rho}-\sqrt{\sigma}\right\|_2^2 \ge 1 - \frac{1}{2}\|\rho-\sigma\|_1. \tag{2.153} $$
Eq.(2.153) tells us that if ρ and σ are close to one another in the L1 norm,
then their fidelity is close to one.
The L1 distance also provides an upper bound on fidelity:
$$ F(\rho,\sigma) \le 1 - \frac{1}{4}\|\rho-\sigma\|_1^2. \tag{2.154} $$
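Both bounds can be spot-checked numerically for single-qubit states. The sketch below rests on two assumptions that are not derived in the text: that F denotes the squared form F(ρ, σ) = (tr √(√ρ σ √ρ))², and that for 2 × 2 density matrices this quantity has the standard closed form tr(ρσ) + 2√(det ρ det σ). The trace norm of the traceless Hermitian matrix ρ − σ is obtained from its determinant, since its eigenvalues are ±√(−det). Random states are drawn by sampling Bloch vectors; the seed, sample count, and function names are illustrative.

```python
import math
import random

def qubit_state(r):
    """Density matrix (I + r.sigma)/2 for a Bloch vector r with |r| <= 1."""
    x, y, z = r
    return [[(1 + z) / 2, (x - 1j * y) / 2],
            [(x + 1j * y) / 2, (1 - z) / 2]]

def random_bloch_vector(rng):
    """Rejection-sample a point in the unit ball."""
    while True:
        r = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        if sum(c * c for c in r) <= 1.0:
            return r

def det2(m):
    return (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real

def trace_of_product(a, b):
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2)).real

def fidelity(rho, sigma):
    """Assumed closed form for qubits: tr(rho sigma) + 2 sqrt(det rho * det sigma)."""
    return trace_of_product(rho, sigma) + 2 * math.sqrt(max(det2(rho) * det2(sigma), 0.0))

def l1_distance(rho, sigma):
    """Trace norm of rho - sigma: its eigenvalues are +/- sqrt(-det)."""
    diff = [[rho[i][j] - sigma[i][j] for j in range(2)] for i in range(2)]
    return 2 * math.sqrt(max(-det2(diff), 0.0))

rng = random.Random(0)
violations = 0
for _ in range(1000):
    rho = qubit_state(random_bloch_vector(rng))
    sigma = qubit_state(random_bloch_vector(rng))
    f = max(fidelity(rho, sigma), 0.0)
    d1 = l1_distance(rho, sigma)
    lower_ok = math.sqrt(f) >= 1 - 0.5 * d1 - 1e-9      # outer inequality of eq. (2.153)
    upper_ok = f <= 1 - 0.25 * d1 ** 2 + 1e-9           # eq. (2.154)
    if not (lower_ok and upper_ok):
        violations += 1
print(violations)
```

A passing run is only a consistency check on randomly sampled qubit states, not a proof of the inequalities.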
To derive eq.(2.154) we first show that it holds with equality for pure
states. Any two vectors |ψ⟩ and |ϕ⟩ lie in some two-dimensional subspace,
and by choosing a basis and phase conventions appropriately we may write
$$ |\psi\rangle = \begin{pmatrix} \cos\theta/2 \\ \sin\theta/2 \end{pmatrix}, \qquad |\varphi\rangle = \begin{pmatrix} \sin\theta/2 \\ \cos\theta/2 \end{pmatrix}, \tag{2.155} $$
and
$$ |\langle\varphi|\psi\rangle|^2 = \sin^2\theta. \tag{2.157} $$
Therefore,
2.7 Summary
Axioms. The arena of quantum mechanics is a Hilbert space H. The
fundamental assumptions are:
(1) A state is a ray in H.
(2) An observable is a self-adjoint operator on H.
(3) A measurement is an orthogonal projection.
2.8 Exercises
2.1 Fidelity of measurement
a) What do you think of Alice’s idea? Hint: What does the unitarity
of U tell you about how the states |β'⟩_B and |β̃'⟩_B are
related to one another?
b) Would you feel differently if the states |ϕ⟩_A and |ϕ̃⟩_A were
orthogonal?
doesn’t want to tell him who the winner will be. But after the Series
is over, Alice wants to be able to convince Bob that she knew the
outcome all along. What to do?
Bob suggests that Alice write down her choice (0 if the Yankees will
win, 1 if the Dodgers will win) on a piece of paper, and lock the
paper in a box. She is to give the box to Bob, but she will keep the
key for herself. Then, when she is ready to reveal her choice, she
will send the key to Bob, allowing him to open the box and read
the paper.
Alice rejects this proposal. She doesn’t trust Bob, and she knows
that he is a notorious safe cracker. Who’s to say whether he will be
able to open the box and sneak a look, even if he doesn’t have the
key?
Instead, Alice proposes to certify her honesty in another way, using
quantum information. To commit to a value a ∈ {0, 1} of her bit, she
prepares one of two distinguishable density operators (ρ0 or ρ1 ) of
the bipartite system AB, sends system B to Bob, and keeps system
A for herself. Later, to unveil her bit, Alice sends system A to Bob,
and he performs a measurement to determine whether the state of
AB is ρ0 or ρ1 . This protocol is called quantum bit commitment.
We say that the protocol is binding if, after commitment, Alice is
unable to change the value of her bit. We say that the protocol is
concealing if, after commitment and before unveiling, Bob is unable
to discern the value of the bit. The protocol is secure if it is both
binding and concealing.
Show that if a quantum bit commitment protocol is concealing, then
it is not binding. Thus quantum bit commitment is insecure.
Hint: First argue that without loss of generality, we may assume
that the states $\rho_0$ and $\rho_1$ are both pure. Then apply the HJW
Theorem.
Remark: Note that the conclusion that quantum bit commitment
is insecure still applies even if the shared bipartite state ($\rho_0$ or $\rho_1$) is
prepared during many rounds of quantum communication between
Alice and Bob, where in each round one party performs a quantum
operation on his/her local system and on a shared message system,
and then sends the message system to the other party.
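The mechanism behind the hint can be illustrated numerically in the simplest case, where the two committed states are pure states of AB with exactly the same reduced state on B (a perfectly concealing commitment). The sketch below is an illustration under that assumption, not a solution of the exercise. It stores each pure state as a coefficient matrix `M[a, b]` and recovers a unitary acting on A alone that maps one state to the other, using an orthogonal Procrustes (SVD) construction. The names `reduced_state_B` and `cheating_unitary`, the dimensions, and the random test states are all hypothetical.

```python
import numpy as np

def reduced_state_B(M):
    """Reduced density matrix on B for |Psi> = sum_ab M[a, b] |a>|b>."""
    return M.T @ M.conj()

def cheating_unitary(M0, M1):
    """Unitary U on A minimizing ||U M0 - M1|| (orthogonal Procrustes via SVD)."""
    W, _, Vh = np.linalg.svd(M1 @ M0.conj().T)
    return W @ Vh

rng = np.random.default_rng(0)
dA, dB = 3, 2  # illustrative dimensions

# A random normalized pure state of AB, standing in for a commitment to 0.
M0 = rng.normal(size=(dA, dB)) + 1j * rng.normal(size=(dA, dB))
M0 /= np.linalg.norm(M0)

# A second pure state with the same reduced state on B, standing in for 1.
# It is generated with a random unitary Q on A that is then set aside.
Q, _ = np.linalg.qr(rng.normal(size=(dA, dA)) + 1j * rng.normal(size=(dA, dA)))
M1 = Q @ M0

# Bob, holding only B, sees the same density operator in both cases.
same_for_bob = np.allclose(reduced_state_B(M0), reduced_state_B(M1))

# Alice reconstructs a suitable unitary from the two states alone.
U = cheating_unitary(M0, M1)
alice_can_switch = np.allclose(U @ M0, M1)
print(same_for_bob, alice_can_switch)
```

The Procrustes step is one convenient way to find the unitary whose existence the HJW Theorem guarantees for two purifications of the same density operator. It does not address protocols that are only approximately concealing.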
2.4 Completeness of subsystem correlations
Consider a bipartite system AB. Suppose that many copies of the
(not necessarily pure) state ρAB have been prepared. An observer
Alice with access only to subsystem A can measure the expectation
Remark: It follows from (c) alone that the correlations of the “lo-
cal” observables determine the expectation values of all the observ-
ables. Parts (a) and (b) serve to establish that ρ is completely
determined by the expectation values of a complete set of observ-
ables.
this distance is zero if the two distributions are identical, and attains
its maximum value two if the two distributions have support on
disjoint sets.
a) Show that
$$d(p, q) \le \sum_{i=0}^{d-1} |\lambda_i| = \|\rho - \sigma\|_1 \equiv d(\rho, \sigma), \tag{2.173}$$
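A numerical illustration of the inequality in eq. (2.173) may help fix ideas. The sketch assumes (since the definitions are elided here) that p and q are the outcome distributions of one orthogonal measurement applied to ρ and σ, that d(p, q) is the sum of absolute differences of the probabilities, and that the $\lambda_i$ are the eigenvalues of ρ − σ. All function names and the random test states are illustrative.

```python
import numpy as np

def random_density_matrix(d, rng):
    """Random d x d density matrix (positive semidefinite, unit trace)."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def trace_norm_distance(rho, sigma):
    """Sum of |eigenvalues| of rho - sigma, i.e. the trace norm of rho - sigma."""
    return np.abs(np.linalg.eigvalsh(rho - sigma)).sum()

def outcome_distance(rho, sigma, basis):
    """L1 distance between outcome distributions; basis vectors are the columns."""
    p = np.einsum('ia,ij,ja->a', basis.conj(), rho, basis).real
    q = np.einsum('ia,ij,ja->a', basis.conj(), sigma, basis).real
    return np.abs(p - q).sum()

rng = np.random.default_rng(1)
d = 4
rho = random_density_matrix(d, rng)
sigma = random_density_matrix(d, rng)
d_quantum = trace_norm_distance(rho, sigma)

# A randomly chosen orthonormal measurement basis.
Q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
d_random = outcome_distance(rho, sigma, Q)

# Measuring in the eigenbasis of rho - sigma.
_, eigvecs = np.linalg.eigh(rho - sigma)
d_eigen = outcome_distance(rho, sigma, eigvecs)

print(d_random <= d_quantum, np.isclose(d_eigen, d_quantum))
```

In the eigenbasis of ρ − σ each probability difference is exactly one eigenvalue $\lambda_i$, which is why that choice of measurement saturates the bound.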
A matrix is doubly stochastic if its entries are nonnegative real numbers
such that $\sum_\mu D_{\mu i} = \sum_i D_{\mu i} = 1$. That the columns sum to one
assures that D maps probability vectors to probability vectors (i.e.,
is stochastic). That the rows sum to one assures that D maps the
uniform distribution to itself. Applied repeatedly, D takes any input
distribution closer and closer to the uniform distribution (unless D
is a permutation, with one nonzero entry in each row and column).
Thus we can view majorization as a partial order on probability
vectors such that q ≺ p means that q is more nearly uniform than p
(or equally close to uniform, in the case where D is a permutation).
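The properties listed above can be checked on a small example. The sketch builds a doubly stochastic matrix as the entrywise square of a random real orthogonal matrix (one convenient way to get such a matrix, chosen here for illustration), applies it to a probability vector, and tests majorization using the usual partial-sum criterion, which is assumed to match the definition given in the exercise. The helper `is_majorized_by` and the sample vector `p` are hypothetical.

```python
import numpy as np

def is_majorized_by(q, p, tol=1e-12):
    """True if q ≺ p: partial sums of sorted q never exceed those of sorted p."""
    q_sums = np.cumsum(np.sort(q)[::-1])
    p_sums = np.cumsum(np.sort(p)[::-1])
    return bool(np.all(q_sums <= p_sums + tol)
                and abs(q_sums[-1] - p_sums[-1]) <= tol)

rng = np.random.default_rng(2)
d = 4

# D[mu, i] = |U[mu, i]|^2 for a random real orthogonal matrix U.
U, _ = np.linalg.qr(rng.normal(size=(d, d)))
D = U ** 2

p = np.array([0.6, 0.25, 0.1, 0.05])   # sample input distribution
uniform = np.full(d, 1.0 / d)
q = D @ p

# Euclidean distance to the uniform distribution after k applications of D.
distances = [np.linalg.norm(np.linalg.matrix_power(D, k) @ p - uniform)
             for k in range(6)]

print(D.sum(axis=0), D.sum(axis=1))
print(is_majorized_by(q, p))
```

Here the distance to uniform is measured in the Euclidean norm, under which any doubly stochastic matrix is non-expanding; other measures of closeness could be used instead.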
Show that normalized pure states $\{|\phi_\mu\rangle\}$ exist such that eq.(2.176)
is satisfied if and only if q ≺ p, where p is the vector of eigenvalues
of ρ.
Hint: Recall that, according to the HJW Theorem, if eq.(2.175)
and eq.(2.176) are both satisfied then there is a unitary matrix $V_{\mu i}$
such that
$$\sqrt{q_\mu}\,|\phi_\mu\rangle = \sum_i \sqrt{p_i}\, V_{\mu i} |\alpha_i\rangle. \tag{2.178}$$
You may also use (but need not prove) Horn’s Lemma: if q ≺ p,
then there exists a unitary (in fact, orthogonal) matrix $U_{\mu i}$ such
that $q = Dp$ and $D_{\mu i} = |U_{\mu i}|^2$.
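The relation in eq. (2.178) can be explored numerically. The sketch below assumes that eq. (2.175) is the eigen-decomposition of ρ with eigenvalues $p_i$ and eigenvectors $|\alpha_i\rangle$, and that eq. (2.176) expresses ρ as an ensemble of the $|\phi_\mu\rangle$ with weights $q_\mu$ (neither equation is reproduced here, so this reading is an assumption). Starting from a random unitary V, it builds the ensemble from eq. (2.178) and checks that it reproduces ρ and that q = Dp with $D_{\mu i} = |V_{\mu i}|^2$. All variable names are illustrative.

```python
import numpy as np

def random_unitary(d, rng):
    """Random d x d unitary from the QR decomposition of a Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
    return Q

rng = np.random.default_rng(3)
d = 3

p = np.array([0.7, 0.2, 0.1])       # sample eigenvalues of rho
alpha = random_unitary(d, rng)      # columns are the eigenvectors |alpha_i>
rho = (alpha * p) @ alpha.conj().T  # rho = sum_i p_i |alpha_i><alpha_i|

V = random_unitary(d, rng)          # V[mu, i]

# Column mu holds sqrt(q_mu)|phi_mu> = sum_i sqrt(p_i) V[mu, i] |alpha_i>.
unnormalized = alpha @ (np.sqrt(p)[:, None] * V.T)
q = np.sum(np.abs(unnormalized) ** 2, axis=0)
phi = unnormalized / np.sqrt(q)     # columns are the normalized |phi_mu>

rho_from_ensemble = (phi * q) @ phi.conj().T
D = np.abs(V) ** 2

print(np.allclose(rho_from_ensemble, rho), np.allclose(q, D @ p))
```

Because D is built from a unitary it is doubly stochastic, which is how majorization enters the argument.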
$$|\Psi\rangle \mapsto |\Psi_a\rangle = \frac{\left(I \otimes E^B_a\right)|\Psi\rangle}{\langle\Psi|\, I \otimes E^B_a \,|\Psi\rangle^{1/2}}. \tag{2.180}$$
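$$\begin{array}{ccc}
X \otimes I & I \otimes X & X \otimes X \\
I \otimes Y & Y \otimes I & Y \otimes Y \\
X \otimes Y & Y \otimes X & Z \otimes Z
\end{array} \tag{2.184}$$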
The three observables in each row and in each column are mutu-
ally commuting, and so can be simultaneously diagonalized. In fact
the simultaneous eigenstates of any two operators in a row or col-
umn (the third operator is not independent of the other two) are
a complete basis for the four-dimensional Hilbert space of the two
qubits. Thus we can regard the array eq.(2.184) as a way of present-
ing six different ways to choose a complete set of one-dimensional
projectors for two qubits.
Each of these observables has eigenvalues ±1, so that in a determin-
istic and noncontextual model of measurement (for a fixed value of
the hidden variables), each can be assigned a definite value, either
+1 or −1.
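The algebraic claims about the array in eq. (2.184) are easy to verify with explicit matrices. The sketch below uses the standard Pauli matrices and checks that the operators in each row and column commute, that every entry has eigenvalues ±1, and that the third operator in each line equals plus or minus the product of the other two. The helper names `mutually_commute` and `third_sign` are illustrative.

```python
import numpy as np
from itertools import combinations

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# The 3 x 3 array of two-qubit observables, laid out as in eq. (2.184).
square = [
    [np.kron(X, I2), np.kron(I2, X), np.kron(X, X)],
    [np.kron(I2, Y), np.kron(Y, I2), np.kron(Y, Y)],
    [np.kron(X, Y), np.kron(Y, X), np.kron(Z, Z)],
]
rows = square
cols = [[square[r][c] for r in range(3)] for c in range(3)]

def mutually_commute(ops):
    """True if every pair of operators in ops commutes."""
    return all(np.allclose(a @ b, b @ a) for a, b in combinations(ops, 2))

def third_sign(ops):
    """Return s in {+1, -1} with ops[2] == s * ops[0] @ ops[1], else None."""
    for s in (1, -1):
        if np.allclose(ops[2], s * (ops[0] @ ops[1])):
            return s
    return None

all_commute = all(mutually_commute(line) for line in rows + cols)
row_signs = [third_sign(line) for line in rows]
col_signs = [third_sign(line) for line in cols]
print(all_commute, row_signs, col_signs)
```

The signs returned by `third_sign` make precise the sense in which the third operator in each row or column is not independent of the other two.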
where the $\lambda_i$’s are real and nonnegative. (We’re not assuming here
that the vector has unit norm, so the sum of the $\lambda_i$’s is not
constrained.) Eq.(2.185) is called the Schmidt decomposition of the
vector $|\psi\rangle_{AB}$. Of course, the bases in which the vector has the
Schmidt form depend on which vector $|\psi\rangle_{AB}$ is being decomposed.
A unitary transformation acting on $\mathcal{H}_{AB}$ is called a local unitary
if it is a tensor product $U_A \otimes U_B$, where $U_A$, $U_B$ are unitary
transformations acting on $\mathcal{H}_A$, $\mathcal{H}_B$ respectively. The word “local”
is used because if the two parts A and B of the system are widely
separated from one another, so that Alice can access only part A
and Bob can access only part B, then Alice and Bob can apply this
transformation by each acting locally on her or his part.
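To make the notions of Schmidt form and local unitary concrete, the following sketch uses the singular value decomposition of the coefficient matrix `M[a, b]` of a bipartite vector in fixed standard bases. It is an illustration under stated assumptions rather than a derivation: the function name `schmidt_local_unitaries`, the dimensions, and the random vector are hypothetical, and if eq. (2.185) is written with coefficients $\sqrt{\lambda_i}$ then the $\lambda_i$ would correspond to the squared singular values.

```python
import numpy as np

def schmidt_local_unitaries(M):
    """For coefficients M[a, b] of |a>_A |b>_B, return (U_A, U_B, s) such that
    U_A @ M @ U_B.T is diagonal with the nonnegative singular values s."""
    W, s, Vh = np.linalg.svd(M)      # M = W @ diag(s) @ Vh
    return W.conj().T, Vh.conj(), s

rng = np.random.default_rng(4)
dA, dB = 2, 3  # illustrative dimensions

# A random bipartite vector, deliberately left unnormalized.
M = rng.normal(size=(dA, dB)) + 1j * rng.normal(size=(dA, dB))

U_A, U_B, s = schmidt_local_unitaries(M)

# Apply the local unitary U_A ⊗ U_B to the vector (row-major flattening of M).
psi = M.reshape(-1)
psi_out = np.kron(U_A, U_B) @ psi
coeffs_out = psi_out.reshape(dA, dB)

# In the standard bases the result has only "diagonal" coefficients.
expected = np.zeros((dA, dB))
expected[:len(s), :len(s)] = np.diag(s)
in_schmidt_form = np.allclose(coeffs_out, expected)
print(in_schmidt_form)
```

Acting with $U_A \otimes U_B$ on the flattened vector is the same as the matrix product `U_A @ M @ U_B.T`, so each party only needs their own factor.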
a) Now suppose that Alice and Bob choose standard fixed bases
$\{|i\rangle_A\}$ and $\{|i\rangle_B\}$ for their respective Hilbert spaces, and are
presented with a vector $|\psi_{AB}\rangle$ that is not necessarily in the
Schmidt form when expressed in the standard bases. Show
that there is a local unitary $U_A \otimes U_B$ that Alice and Bob can
apply so that the resulting vector