Quantum Channels: Peter Shor MIT Cambridge, MA

Quantum Channels
Peter Shor
MIT
Cambridge, MA
1
Claude Shannon, 1948
The fundamental problem of communication is that of reproducing
at one point either exactly or approximately a message selected at
another point.
2
John Pierce, 1973
I think that I have never met a physicist who understood
information theory. I wish that physicists would stop talking about
reformulating information theory and would give us a general
expression for the capacity of a channel with quantum effects taken
into account rather than a number of special cases.
3
Shannon
Shannons 1948 paper A mathematical theory of communication
founded the field of information theory. It contained two theorems
that we will discuss the quantum analogs of today: Source Coding
and Channel Coding.
Source Coding
Asymptotically, n symbols from a source X can be compressed to

length nH(X) + O( n).
For a memoryless source, which emits i.i.d. signals where signal xi
has probability pi , the entropy H is:
X
H(X) = pi log pi
i
This lecture will deal only with memoryless sources and channels.
4
Channel Coding
A noisy channel N has capacity
max I(X; N (X)),

p(X)
where p(X) is maximized over all probability distributions on the

channel input and I(X; N (X)) is the mutual information between
the input and the output.
I(X; Y ) = H(Y ) H(Y |X)

= H(X) + H(Y ) H(X, Y ).
X
H(X) = pi log pi
i
5
Entropy of a quantum state
Classical Case
Given n photons, each in state | li or | i, with probability 21 . Any
two of these states are completely distinguishable. The entropy is n
bits.
Quantum Case
Given n photons, each in state | li or | .% i, with probability 12 . If
the angle between the polarizations is small, any two of these states
are barely distinguishable. Intuitively, the entropy should be much
less than n bits.
6
By thermodynamic arguments, von Neumann deduced the entropy
of a quantum system with density matrix is
HvN () = Tr( log )
Recall was positive semideifinite, so log is defined.

If is diagonal with eigenvalues i , then log is diagonal with
eigenvalues i log i .
Thus, HvN () = HShan (i ) so the von Neumann entropy is the
Shannon entropy of the eigenvalues.
P
(Recall Tr = 1 = i i .)
You can ask: is this the right definition for information theory?
7
Schumacher Compression
(Quantum source coding theorem)
Given a memoryless source producing pure states v1 , v2 , v3 , . . .
with probabilities p1 , p2 , p3 , . . ..
We want to send them to a receiver using as few qubits as possible.

Theorem (Schumacher, 1994):
You can send n symbols using
nHvN () + o(n)
P
qubits, with fidelity approaching 1 as n , where = i p i v i v i
is the density matrix of the source.
8
Fidelity
Classical source coding works with high probability: the probability
that the received sequeuce is exactly the signal goes to 1 as the
block length n goes to .
This is too strong a criterion for quantum source coding. We ask
that the fidelity between the signal sent and the received state
goes to 1 as the block length n goes to .
The fidelity between a pure state sent | vi and a received density
matrix is hv | | vi = v v.
If the fidelity goes to 1, any measurement on the received signal
will have almost the same probability distribution of outcomes as
the same measurement on vv , the state sent.
9
Proof of Classical Source Coding Theorem
Assume we have a source X emitting symbols s1 , s2 , . . . with
probabilities p1 , p2 , . . .. Consider a sequence of n symbols from this
source.
Then a typical sequence has close to the right number (npi ) of each
symbol si .
Theorem: Almost all the time, the source emits a typical sequence.
There are 2nHShan (X)+o(n) typical sequences.
10
Typical Subspaces
Have states v1 , v2 , . . ., vk with probabilities p1 , p2 , . . ., pk .
Look at eigenvectors of density matrix .
Assign to each of eigenvector a probability equal to the
corresponding eigenvalue.
Any two eigenvectors are orthogonal.
Let the eigenvectors be v1 , v2 , . . ., vd with probabilities p1 , p2 , . . .,
pd .
Suppose we have n of these states.
The typical subspace S is the subspace generated by typical
sequences of eigenvectors.
S has dimension 2HvN ()n+o(n) .
11
How to do Schumacher compression.
Have states v1 , v2 , . . ., vk with probabilities p1 , p2 , . . ., pk . These
give density matrix . Let S be the typical subspace of n .
To compress:
Measure whether output of source lies in S.
If yes, get the state projected onto S. Can send using
log dim S nHvN () qubits.
If no, this is a low probability event; send anything.
12
Why does Schumacher Compression work?
Recall that the density matrix determines the outcomes of any
experiment.
Using the eigenvectors v1 , v2 , . . . vd with probabilities p1 , p2 , . . . pd
gives same probability of the outcomes as using states v1 , v2 , . . . vk
with probabilities p1 , p2 , . . . pk , since these two sources have the
same density matrix.
We know from the classical theory of typical sequences that the
probability of a no outcome is very small using vi and pi . Thus, the
probability of a no outcome is also very small with vi and pi .
This implies that the original state is almost surely very close to
the typical subspace S. Sending the state projected into S gives the
right outcomes with high fidelity.
13
Accessible Information
Suppose that we have a source that outputs signal i with
probability pi . How much Shannon information can we extract
about the sequence of is?
Let X be the random variable telling which signal i was sent.
Optimize over all possible measurements M on the signals (with
outcomes M1 , M2 , . . .).
Iacc = max I(X, M )

M
14
Example 1: Two states in ensemble
v1 = v2 =

1 cos()
v1 = v2 =
0 sin()
Then
1 1 + cos2 sin cos
=
2 sin cos 1 cos2
and HvN = H( 21 + cos

2 ). The optimal measurement is
and Iacc = 1 H( 12 + sin

2 ).
15
We see that Iacc < HvN ().
A plot of HvN and Iacc for the ensemble of two pure quantum states
with equal probabilities that differ by an angle of , 0 /2.
The top curve is the von Neumann entropy HvN = H( 12 + cos2 ) and
the bottom the accessible information Iacc = 1 H( 21 + sin2 ).
16
POVM Measurements
(Positive Operator Valued Measurements).
We are given a set of positive semidefinite matrices Ei satisfying
P
i Ei = I.
The probability of the ith outcome is
pi = Tr(Ei )
For von Neumann measurements, Ei = Si

To obtain the maximum information, we can assume that Ei s are
pure states. Then Ei = vi vi for some vector vi .
17
Example 2: Three signal states differing by 60 .
vi : (prob 13 )

1 1/2 1/2
v1 = v2 = v3 =
0 3/2 3/2
Optimal Measurement:
POVM corresponding to vectors wi vi .
Ei = 23 wi wi
wi : (prob 31 )
Each outcome rules out one state, leaving the other two equally
likely
Iacc = log 3 1 = .585; HvN = 1
Again, we have Iacc HvN .
18
Holevo Bound
Suppose we have a source emitting i with probability pi .
X X
= HvN ( pi i ) pi HvN (i )
i i
Theorem (Holevo, 1973)

Iacc
If all the i commute, the situation is essentially classical, and we
get Iacc = . Otherwise Iacc < .
19
Is this the most information we can send using the three states of
example 2?
Answer: No!
Use just two of the states, each with probabilities 1/2

1 1/2
v1 = v2 =
0 3/2
The measurement from example 1 now gives .6454 bits of

information about the random variable identifying the state which
was sent.
20
Is this the most information we can send using the three states of
example 2?
Answer: No!
Use three codewords v1 v1 , v2 v2 , v3 v3 .
The optimal measurement for these three states gives 1.369 bits,
which is larger than 2 .6454 = 1.298.
What about still longer codewords?
21
Theorem (Holevo, Schumacher-Westmoreland)
The classical-information capacity obtainable using codewords
composed of signal states i , where i has marginal probability pi ,
is
X X
({i }; {pi }) = HvN ( p i i ) pi HvN (i )
i i
We will give sketch of the proof of this formula in the special case
of pure states i .
Does this give the capacity of a quantum channel N ?
Possible capacity formula:
Maximize ({N (i )}; {pi }) over all output states N () of the
channel.
22
Theorem (pure state capacity)
We are given pure quantum states v1 , v2 , . . ., vk for use as signals.
Let = i pi vi vi . There are codes such that we send state vi with
P
probability pi having asymptotic capacity = HvN ()

How do we prove this?
random coding
typical subspace
pretty good measurement
also called square root measurement
23
Random Coding
We choose codewords
ui = vi1 vi2 . . . vin
where vi is picked with probability pi for each signal.

Then ui will be close to the typical subspace of n .
To decode, we
project into the typical subspace
apply the pretty good measurement
24
Pretty good measurement
1
We have N vectors u i S, which occur with equal probability N.
Given one of these, we want to distinguish between them.
i
P
Let = i ui u
Measure using the POVM with elements
Ei = 1/2 u i 1/2
i u
This is a POVM since
i 1/2 = I
X X
Ei = 1/2 u
i u
i i
ui 1/2 u
The probability of error if the state ui is sent is 1 ( i )2 .
This can be shown to be small for most ui from a random code if
N < dim S o(dim S).
25
Description of arbitrary memoryless quantum channel N : N must
be trace-preserving completely positive operator.
Positive: takes positive semi-definite matrices to positive
semi-definite matrices.
Completely postive: is positive even when tensored with the
identity channel. (E.g., the transpose operation is positive but not
completely positive).
A trace preserving completely positive operator can always be
expressed as
Ai Ai
X
N () =
i
where
Ai Ai = I
X
26
Unentangled Inputs, Separate Measurements
Maximize over probability distributions on inputs to the channel

i , pi :
Iacc ({N (i )}; {pi })
27
Unentangled Inputs, Joint Measurements

i , pi :
({N (i )}; {pi })
28
Entangled Inputs, Joint Measurements

i , pi where i is in the tensor product space of n inputs:
1
lim ({N n (i )}; {pi })
n n
29
Open Question
Is channel capacity additive?
Is max (N1 N2 ) = max (N1 ) + max (N2 )?
If it is, then gives the classical-information capacity of a quantum
channel.
This turns out to be the same question as additivity of
entanglement of formation considered in the previous lecture.
30
What things might increase the capacity of a quantum channel
which dont affect the capacity of a classical channel?
Entanglement between different channel uses? Unknown. This
is the big open additivity question.
A classical back channel from the receiver to the sender? This
helps, but seems to make exact calculation of the capacity
impossible.
Prior entanglement shared between the sender and the receiver.
This helps and makes the formulas really nice.
31
Recall superdense coding lets you send two bits per qubit over a
noiseless quantum channel if the sender and receiver share
entanglement.
By Holevos theorem, the bound

without prior shared entanglement is one bit per qubit. Thus, for
the noiseless quantum channel (the simplest case possible)
entanglement between sender and receiver helps.
32
Suppose that we have a quantum channel N . From superdense
coding, if N is a noiseless quantum channel, the sender could
communicate twice as much classical information to a receiver if
they share EPR pairs than if they dont. How does this generalize
to noisy channels? We call this quantity the entanglement-assisted
capacity and denote it by CE .
By superdense coding and teleportation, the entanglement-assisted
quantum capacity is exactly half of the entanglement-assisted
classical capacity.
1
QE = CE
2
33
Formula for entanglement-assisted capacity
Theorem (Bennett, Shor, Smolin, Thapliyal)
max HvN (N ()) + HvN () HvN ((N I))( ))

is a pure state on the tensor product of the input space of the

channel and a quantum space that the sender keeps, with
TrB = .
When the channel is classical, this formula turns into the entropy of
the input plus the entropy of the output less the entropy of the joint
system, or the second expression for classical mutual information.
34
Generalization
Suppose that the sender and the receiver have a limited amount of
entanglement (E ebits) they share. How much can capacity can
they obtain from a quantum channel?
If the sender is not allowed to use entanglement between different
channel uses, the answer is:
max i ) + H(N (i )) H((N

H( I)i )
i )E
i :H(
Here H means average over the entropy, and i means average over
the state; i is the pure entangled state (shared between sender
and receiver) whose partial traces are i . This formula interpolates
between the Holevo-Schumacher-Westmoreland capacity and the
entanglement-assisted capacity.
35
How to prove the formula for CE
(the lower bound)
max HvN (N ()) + HvN () HvN ((N I))( ))

Suppose is d1 Id, a multiple of the identity. Then we do the same

operations as for standard superdense coding, with the
generalizations of the Pauli matrices.
Use Holevo formula for :
X X
i i
The first term of gives the first two terms of CE ; the second term
of gives the last term of CE .
36
The Holevo formula for :
X X
i i
Recall in superdense coding Alice and Bob share a maximally

entangled state. Alice applies a random Pauli matrix to her half
and sends it through the channel.
The first term is the entropy of the average state of both halves of
the maximally entangled state after this operation. But on average,
the random Pauli matrix turns both halves into maximally mixed
states, turning the first term of into
HvN (N ()) + HvN ()
where is d1 I.
37
Proof sketch if 6= Id.
If is a projection matrix, things work the same way as in the
identity case.
If is not a projection matrix, then take tensor product of n uses
of the channel, N n , and the projection matrix = T , where T is
a typical subspace for n .
It turns out we need to show that
1
lim HvN (N n (T )) = HvN (N ()).
n n
This is intuitively true, but not that easy to prove rigorously.
38
What things might increase the entanglement-assisted capacity of a
quantum channel which dont affect the capacity of a classical
channel?
Entanglement between different channel uses? Does not help!
A classical back channel from the receiver to the sender? Does
not help!
Both of the above simultaneously? Does not help!
Proofs via quantum reverse Shannon theorem (next slide).
39
Quantum Reverse Shannon Theorem:
In the presense of entanglement, a noiseless qubit channel can
simulate n uses of any quantum channel with entanglement-assisted
capacity CE by sending nC + o(n) qubits.
This conjecture would show that asymptotically, in the presense of
free entanglement, quantum channels are characterized by one
parameter, CE . The analogous theorem is true for classical
channels in the presense of a correlated source of random bits.
With Charlie Bennett, Igor Devetak, and Andreas Winter, we have
a proof of this theorem for channels (1) transmitting signals
generated by some stochastic source, or (2) transmitting tensor
product states.
Does not appear to be quite true for general inputs, unless we allow
for other forms of shared entanglement than EPR pairs.
40
Quantum capacity
The quantum capacity is defined as limn log d/n, where n is the
number of channel uses in a protocol, and the d is the dimension of
the largest Hilbert space which can be transmitted through the
channel such that the fidelity of transmission of the (average/lowest
fidelity) state in it is 1 , for some fixed .
The quantum capacity of a channel can be shown to be
1
lim max H(N n ()) H((N n I))( ))
n n
where is a state whose reduced state on the input space is .
41

Quantum Channels: Peter Shor MIT Cambridge, MA

Uploaded by

Copyright:

Available Formats

Quantum Channels: Peter Shor MIT Cambridge, MA

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Quantum Channels: Peter Shor MIT Cambridge, MA

Uploaded by

Copyright:

Available Formats

Quantum Channels

max I(X; N (X)),

where p(X) is maximized over all probability distributions on the

I(X; Y ) = H(Y ) H(Y |X)

HvN () = Tr( log )

Recall was positive semideifinite, so log is defined.

We want to send them to a receiver using as few qubits as possible.

Iacc = max I(X, M )

and HvN = H( 21 + cos

and Iacc = 1 H( 12 + sin

The probability of the ith outcome is

For von Neumann measurements, Ei = Si

Theorem (Holevo, 1973)

The measurement from example 1 now gives .6454 bits of

What about still longer codewords?

probability pi having asymptotic capacity = HvN ()

ui = vi1 vi2 . . . vin

where vi is picked with probability pi for each signal.

This is a POVM since

Maximize over probability distributions on inputs to the channel

Maximize over probability distributions on inputs to the channel

Maximize over probability distributions on inputs to the channel

By Holevos theorem, the bound

max HvN (N ()) + HvN () HvN ((N I))( ))

is a pure state on the tensor product of the input space of the

max i ) + H(N (i )) H((N

max HvN (N ()) + HvN () HvN ((N I))( ))

Suppose is d1 Id, a multiple of the identity. Then we do the same

Recall in superdense coding Alice and Bob share a maximally

HvN (N ()) + HvN ()

This is intuitively true, but not that easy to prove rigorously.

where is a state whose reduced state on the input space is .

You might also like