Lecture 20
In quantum mechanics, we usually imagine the information about the system to be stored in the
wavefunction. But it can also be encoded in something slightly more general, namely the density
matrix.
Additional reading if you wish: Cohen-Tannoudji, Vol. I, Complement E_III
Density matrices are formally operators on the Hilbert space, and they allow us to combine the
ket-vectors |ψ⟩ into combinations that express our uncertainty. We will see that there is, however,
a big difference between the quantum probability that we know from the wavefunctions and the
classical probability that already exists in classical physics and whose incorporation into quantum
mechanics leads to density matrices.
Take a pure state |ψ⟩ = ∑_i c_i |i⟩ and define the operator ρ = |ψ⟩⟨ψ|. Its trace equals

Tr ρ = ∑_i ρ_ii = ∑_i |c_i|² = ⟨ψ|ψ⟩

which is nothing else than the (squared) norm of the pure state. We usually use normalized pure
states, which means that

Tr ρ = 1
is one of the usual conditions that we require from density matrices. You should also notice that
the special density matrix we just defined is a Hermitean operator:

ρ† = ρ,   i.e.   ρ_ij = ρ*_ji
Also, the eigenvalues of ρ are, in this particular case, equal to 0 and 1. Three properties will always
be required from density matrices. Density matrices must be

- Hermitean operators, ρ† = ρ
- operators whose trace equals one, Tr ρ = 1
- operators without negative eigenvalues or, equivalently, without negative expectation values:
  ⟨φ|ρ|φ⟩ ≥ 0 for every |φ⟩
The last condition is not surprising: ⟨φ|ρ|φ⟩ is interpreted as the probability that the system
described by the density matrix ρ is found in the state |φ⟩, and this should be non-negative. Notice
that for our simple choice ρ = |ψ⟩⟨ψ|, the expectation value is the usual expression for the probability

⟨φ|ρ|φ⟩ = |⟨φ|ψ⟩|²
Finally, you should see that the density matrix defined from a pure state |ψ⟩ as

ρ = |ψ⟩⟨ψ|

does not change if we redefine |ψ⟩ by a phase

|ψ⟩ → |ψ⟩ e^{iα}
because the phase cancels with the complex conjugate phase. That's a nice feature of ρ because we
know that the choice of the phase is not physical. Otherwise, ρ knows about everything that |ψ⟩
knows about. For example, the expectation value of an operator L̂ in the state |ψ⟩ can be written
as

⟨L̂⟩ = Tr(ρ L̂)

Let us also compute the time derivative of ρ:

iħ ∂ρ/∂t = iħ ∂/∂t (|ψ⟩⟨ψ|) = ... = H|ψ⟩⟨ψ| − |ψ⟩⟨ψ|H = [H, ρ]

Here, "..." represents the Leibniz rule. In the next step, we used the Schrödinger equations for |ψ⟩ and
its conjugate. The equation will once again be a general evolution equation for all density matrices:

iħ ∂ρ/∂t = [H, ρ]
Note that right now, we may forget about the states |ψ⟩ and ⟨ψ| completely. The state of the
system may be described by a density matrix ρ. We know how it evolves with time. We know how
to compute the probability that we measure a particular value of a quantity. We know how to evaluate
the expectation value of an observable in ρ. Great.
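Everything claimed so far about pure-state density matrices can be checked numerically. Below is a minimal sketch, assuming NumPy is available; the particular state ψ and the observable L are arbitrary illustrative choices, not anything from the lecture:

```python
import numpy as np

# An arbitrary normalized pure state |psi> in a 2-dimensional Hilbert space.
psi = np.array([3 + 4j, 2 - 1j], dtype=complex)
psi /= np.linalg.norm(psi)

# Density matrix rho = |psi><psi| (outer product of the ket with the bra).
rho = np.outer(psi, psi.conj())

# The three defining properties: Hermitean, unit trace, non-negative eigenvalues.
assert np.allclose(rho, rho.conj().T)
assert np.isclose(np.trace(rho).real, 1.0)
assert np.all(np.linalg.eigvalsh(rho) > -1e-12)   # here the eigenvalues are 0 and 1

# Expectation value: Tr(rho L) reproduces <psi|L|psi>.
L = np.array([[1.0, 2j], [-2j, -1.0]])            # some Hermitean operator
assert np.isclose(np.trace(rho @ L), psi.conj() @ L @ psi)

# Redefining |psi> by an overall phase leaves rho unchanged.
rho2 = np.outer(psi * np.exp(0.7j), (psi * np.exp(0.7j)).conj())
assert np.allclose(rho, rho2)
```

The phase invariance in the last check is exactly the statement above: the e^{iα} from the ket cancels against the e^{−iα} from the bra.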
Mixed states
Of course, density matrices would not deserve an extra lecture if they were always equal to ρ = |ψ⟩⟨ψ|.
While these density matrices for any |ψ⟩ are known as pure states, we really want to consider
more general matrices called mixed states because they can be written as the following sum
(mixture). Every Hermitean operator (and density matrices are Hermitean operators) can be
diagonalized as

ρ = ∑_i p_i |ψ_i⟩⟨ψ_i|

where p_i are the eigenvalues and |ψ_i⟩⟨ψ_i| are the projection operators on the eigenstates. Note
that in the previous section, there was only one nonzero term, with p_i = 1. This allows us to interpret
the eigenvalues p_i as probabilities of being in the state |ψ_i⟩. We always require

∀ i:   0 ≤ p_i ≤ 1
While the formula ρ = ∑_i p_i |ψ_i⟩⟨ψ_i| was constructed with |ψ_i⟩ being the eigenstates, you may also define
ρ using this formula for arbitrary states |ψ_i⟩ that don't even have to be orthogonal to each other. Let
us summarize the discussion by the following definition. A density matrix ρ is a Hermitean operator
on the Hilbert space whose trace equals one and whose eigenvalues are non-negative. It always
satisfies the Schrödinger equation we derived previously; note that it is a linear equation in ρ, and
therefore if you write ρ as a combination of several terms, the equation will simply be the sum of
the individual equations. Finally, the expectation values and the probabilities are always expressed
using the traces (or expectation values) that we discussed previously.
Because Tr ρ = 1, the eigenvalues satisfy

∑_i p_i = 1

The eigenvalues of ρ² are p_i², and therefore

Tr ρ² = ∑_i p_i² ≤ 1.

It's because p_i² < p_i for all 0 < p_i < 1. The inequality is only saturated if all p_i are equal to either
zero or one, and because the trace must be one, there must be exactly one eigenvalue equal to one. We will
always require that the trace of ρ² is smaller than or equal to one; it follows from the stronger condition
about the eigenvalues.
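A quick numerical illustration of the purity inequality Tr ρ² ≤ 1 (a NumPy sketch; the helper `purity` and the eigenvalue lists are hypothetical examples, not from the lecture):

```python
import numpy as np

def purity(p):
    """Tr(rho^2) = sum of squared eigenvalues of a diagonalized rho."""
    p = np.asarray(p, dtype=float)
    assert np.isclose(p.sum(), 1.0) and np.all(p >= 0)
    return np.sum(p**2)

assert np.isclose(purity([1.0, 0.0]), 1.0)    # pure state: inequality saturated
assert np.isclose(purity([0.5, 0.5]), 0.5)    # mixed state: strictly below one
assert purity([0.7, 0.2, 0.1]) < 1.0          # generic mixture
```

Tr ρ² is therefore a convenient one-number diagnostic that separates pure states (value 1) from mixed ones (value below 1).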
In the extreme case of complete ignorance about a system with an N-dimensional Hilbert space,

ρ = (1/N) · 1

the density matrix is a multiple of the identity operator. The factor 1/N is needed to keep Tr ρ = 1.
Also, one of the interesting formulas using density matrices is the entropy. In quantum mechanics,
we define the entropy as

S = −k_B Tr (ρ ln ρ).

Sometimes, the Boltzmann constant k_B is dropped. If you have problems calculating the logarithm
of an operator, it becomes much easier if you decompose ρ = ∑_i p_i |ψ_i⟩⟨ψ_i| into the eigenvectors, because
then

S = −k_B ∑_i p_i ln p_i
For example, if you have the density matrix that describes complete ignorance, p_i = 1/N, and

S = −k_B · N · (1/N) ln (1/N) = k_B ln N.
In this simple case, the entropy is equal to k_B times the logarithm of the number of states that
contribute to the density matrix. (If the states contributed unequally, the result would still be
approximately correct, but not exactly.) For pure states, N = 1 and S = k_B ln 1 = 0. In classical
physics, you know that the entropy was the logarithm of the volume in the phase space. In quantum
mechanics, the volume of phase space is replaced by the number of quantum microstates that are
indistinguishable and in which the system may be found. The entropy defined with this logarithmic
formula satisfies the usual extensivity property: S(ρ_1 ⊗ ρ_2) = S(ρ_1) + S(ρ_2).
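The entropy formula and its extensivity can be verified in a few lines (a NumPy sketch; k_B is set to 1 for simplicity, and the eigenvalue distributions are illustrative choices):

```python
import numpy as np

def entropy(p, kB=1.0):
    """von Neumann entropy -kB * sum p_i ln p_i of eigenvalues p (0 ln 0 := 0)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                       # drop zero eigenvalues: 0 ln 0 = 0
    return -kB * np.sum(p * np.log(p))

N = 4
assert np.isclose(entropy(np.full(N, 1 / N)), np.log(N))   # complete ignorance
assert np.isclose(entropy([1.0, 0.0]), 0.0)                # pure state

# Extensivity: the eigenvalues of rho1 (x) rho2 are the products p_i * q_j.
p, q = np.array([0.7, 0.3]), np.array([0.5, 0.5])
assert np.isclose(entropy(np.outer(p, q).ravel()), entropy(p) + entropy(q))
```

The extensivity check works because ln(p_i q_j) = ln p_i + ln q_j, which is exactly the algebra behind S(ρ_1 ⊗ ρ_2) = S(ρ_1) + S(ρ_2).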
Consider now the mixed state

ρ = (1/2) (|a⟩⟨a| + |b⟩⟨b|)

where the factor 1/2 was again chosen to have Tr ρ = 1. You may say that this combination is very
similar to our old-fashioned linear superposition of the |a⟩ and |b⟩ states, namely to

|ψ⟩ = (1/√2) (|a⟩ + |b⟩)
because it also gives you a 50 percent probability for |a⟩ and 50 percent probability for |b⟩. If you
said so, you would join thousands of people who make a serious and elementary error. Why are these
two states different? Let us explicitly write down the matrices. The density matrix of ignorance is
simply

ρ = ( 1/2   0  ;  0   1/2 ) = (1/2) · 1_{2×2}

while the density matrix from |ψ⟩ is

ρ = ( 1/2  1/2 ; 1/2  1/2 )
Both of them have diagonal entries equal to 1/2 (and therefore the trace equal to one), but the
matrices are not equal. Incidentally, the diagonal entries are also called the populations of the
individual states. The off-diagonal entries differ. We can describe the differences in many other
ways. The eigenvalues of the first matrix are 1/2, 1/2 while the eigenvalues of the second matrix are
0, 1. The trace of ρ² equals 1/2 in the first case while it equals 1 in the second, pure case. Also, the expectation value
of S_x (the x-component of the spin, roughly speaking) vanishes in the first one because Tr σ_x = 0,
but not in the last one:
⟨S_x⟩ = Tr(S_x ρ) = Tr[ (ħ/2) ( 0 1 ; 1 0 ) ( 1/2 1/2 ; 1/2 1/2 ) ] = Tr[ (ħ/2) ( 1/2 1/2 ; 1/2 1/2 ) ] = ħ/2
Note that for the ignorant density matrix ρ = (1/2) · 1, the expectation values of all components of
the spin vanish. That's impossible for any pure (normalized) state |ψ⟩. You may still wonder, why
are the two density matrices really so different? What's wrong with the state |ψ⟩ above? You may
notice that we randomly made all the phases real. In fact, if you define a new state

|ψ⟩ = (1/√2) (|a⟩ + e^{iα} |b⟩)

with a general relative phase e^{iα}, it will also look like 50 percent for |a⟩ and 50 percent for |b⟩. (The
overall phase does not change anything, and therefore without loss of generality, we could take the
coefficient of |a⟩ to be real and positive.) If you calculate the density matrix for this more general
ket-vector, you obtain
ρ = |ψ⟩⟨ψ| = ( 1/2  e^{−iα}/2 ; e^{+iα}/2  1/2 )

If you compute the average (∫ dα/2π) over α between 0 and 2π, the off-diagonal elements disappear.
We use the word "coherence" to emphasize that ρ has nonzero off-diagonal entries that know
about the relative phase e^{iα} between |a⟩ and |b⟩. Similarly, the word "decoherence" refers to a
physical process that forgets about this relative phase and consequently puts the off-diagonal
entries equal to zero. We will discuss how decoherence is relevant for large physical systems and the
interpretation of quantum mechanics at the end of the course.
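The phase-averaging statement can be checked directly: averaging ρ(α) over the relative phase α between 0 and 2π wipes out the off-diagonal "coherences" and leaves only the populations. A NumPy sketch (the basis vectors for |a⟩, |b⟩ are the obvious choice):

```python
import numpy as np

a = np.array([1.0, 0.0], dtype=complex)   # |a>
b = np.array([0.0, 1.0], dtype=complex)   # |b>

def rho_of_alpha(alpha):
    """Density matrix of |psi> = (|a> + e^{i alpha} |b>) / sqrt(2)."""
    psi = (a + np.exp(1j * alpha) * b) / np.sqrt(2)
    return np.outer(psi, psi.conj())

# Average over the relative phase alpha uniformly in [0, 2*pi).
alphas = np.linspace(0, 2 * np.pi, 1000, endpoint=False)
rho_avg = np.mean([rho_of_alpha(al) for al in alphas], axis=0)

# The off-diagonal entries disappear; the populations 1/2 survive.
assert np.allclose(rho_avg, np.eye(2) / 2)
```

The averaged matrix is exactly the density matrix of ignorance from above, which is one way to picture decoherence: losing track of the relative phase turns the pure superposition into a mixture.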
Partial traces
One of the most useful features of density matrices is the freedom to compute partial traces. What
do we mean? Imagine two particles. Their wavefunction is

ψ(r⃗_1, r⃗_2)

where r⃗_1, r⃗_2 are the positions of the two particles. Here |ψ|² dV_1 dV_2 gives you the infinitesimal
probability that the first particle is found in the small volume dV_1 and the other particle is found
in dV_2 around the points r⃗_1, r⃗_2. Imagine that you are only interested in the properties of the first
particle. Can you construct a complex wavefunction for the first particle only? No. The full
wavefunction is a complex function of 6 coordinates, and there is no natural or otherwise meaningful
way to extract a function of 3 variables out of it.
However, this impossible procedure becomes possible in the density matrix formalism. If you
describe the two objects by a density matrix which is a combination of expressions like

ρ(r⃗_1′, r⃗_2′; r⃗_1, r⃗_2) = ψ(r⃗_1′, r⃗_2′) ψ*(r⃗_1, r⃗_2),

then you can define a density matrix ρ_1 for the first particle only. Most generally, we write

ρ_1 = Tr_2 ρ.

In the particular example with the coordinates, this matrix would be a function

ρ_1(r⃗_1′, r⃗_1) = ∫ d³r_2 ρ(r⃗_1′, r⃗_2; r⃗_1, r⃗_2)

It only depends on the coordinates of the first particle. For example, the diagonal elements of this
smaller density matrix

ρ_1(r⃗_1, r⃗_1) = ∫ d³r_2 ρ(r⃗_1, r⃗_2; r⃗_1, r⃗_2)
are nothing else than the total probability that particle 1 is found at r⃗_1 while particle 2 is
found anywhere (which is why we integrate the probability distribution over r⃗_2). However, the
reduced density matrix also has off-diagonal entries that can be calculated from the formula above.
The discrete version of the partial tracing is

(ρ_1)_{ab} = ∑_m ρ_{am;bm}

where m runs over an orthonormal basis of the Hilbert space of the particle/system 2.
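The discrete formula (ρ_1)_{ab} = ∑_m ρ_{am;bm} is essentially one line of code once the density matrix is reshaped so that the two subsystem indices become explicit. A hedged NumPy sketch (the function name `partial_trace_2` and the test matrices are my own illustrative choices):

```python
import numpy as np

def partial_trace_2(rho, d1, d2):
    """Trace out subsystem 2: (rho1)_{ab} = sum_m rho_{am;bm}."""
    r = rho.reshape(d1, d2, d1, d2)   # composite indices split into (a, m, b, m')
    return np.einsum('ambm->ab', r)   # set m = m' and sum over it

# Sanity check on a product state rho = rho_a (x) rho_b: Tr_2 gives back rho_a,
# because Tr rho_b = 1.
rho_a = np.array([[0.75, 0.1j], [-0.1j, 0.25]])
rho_b = np.array([[0.5, 0.0], [0.0, 0.5]])
rho = np.kron(rho_a, rho_b)
assert np.allclose(partial_trace_2(rho, 2, 2), rho_a)
```

The `reshape` works because `np.kron` orders the composite basis as |a⟩|m⟩ ↦ index a·d2 + m, which is the same "mirror image" bookkeeping convention used for the bra-vectors below.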
Examples of partial tracing: non-entangled particles
First, consider the example in which the two subsystems 1, 2 are in a pure state and not entangled.
This sentence means that

|ψ⟩ = |ψ_1⟩ ⊗ |ψ_2⟩.

The full wavefunction is just made of the (tensor) product of two independent wavefunctions |ψ_1⟩
and |ψ_2⟩ that are assumed to be normalized. Then the full density matrix is

ρ = |ψ⟩⟨ψ| = |ψ_1⟩ |ψ_2⟩ ⟨ψ_2| ⟨ψ_1|

where we chose the convention where the ordering of the data describing bra-vectors is the mirror
image of those for ket-vectors. We can choose a basis of the second Hilbert space where |ψ_2⟩ is one
of the basis vectors, and the partial tracing formula simply gives us

ρ_1 = |ψ_1⟩⟨ψ_1|.

If the particles are independent and not entangled, the partial tracing gives us the usual density
matrix for a pure state.
Examples of partial tracing: entangled particles
Now consider instead an entangled pure state of the two subsystems, for example

|ψ⟩ = (1/√2) (|a⟩|A⟩ + |b⟩|B⟩),

whose density matrix

ρ = |ψ⟩⟨ψ| = (1/2) (|a⟩|A⟩ + |b⟩|B⟩) (⟨A|⟨a| + ⟨B|⟨b|)

is made of four terms, but only the first and the last term are diagonal with respect to the
capital letters. This means that

ρ_1 = Tr_2 ρ = ⟨A|ρ|A⟩ + ⟨B|ρ|B⟩ = (1/2) (|a⟩⟨a| + |b⟩⟨b|)
We already know this matrix. It is the density matrix of complete ignorance for a two-dimensional
Hilbert space. Its eigenvalues are 1/2, 1/2. It is not a pure state; it is a mixed state indeed. The
lesson is that by tracing over some degrees of freedom in entangled (pure) states, we obtain mixed
states for the remaining degrees of freedom. If your particle 1 is correlated (entangled) with another
particle, the coherence disappears. For example, if you perform a double-slit experiment but you
only allow x-polarized photons in slit a and y-polarized photons in slit b, there will be no
interference between the two beams; the polarizations x, y play the same role as A, B above.
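Tracing the entangled state over the second system can be checked the same way (a NumPy sketch; the basis ordering |aA⟩, |aB⟩, |bA⟩, |bB⟩ is an assumption of this example):

```python
import numpy as np

# |psi> = (|a>|A> + |b>|B>) / sqrt(2) in the basis {|aA>, |aB>, |bA>, |bB>}.
psi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho = np.outer(psi, psi.conj())

# Trace out system 2: split indices into (a, M, b, M') and sum over M = M'.
rho1 = np.einsum('ambm->ab', rho.reshape(2, 2, 2, 2))

# The result is the completely ignorant mixed state (|a><a| + |b><b|)/2 ...
assert np.allclose(rho1, np.eye(2) / 2)
# ... which is mixed, Tr(rho1^2) = 1/2 < 1, although rho itself was pure.
assert np.isclose(np.trace(rho1 @ rho1).real, 0.5)
assert np.isclose(np.trace(rho @ rho).real, 1.0)
```

The purity check makes the lesson quantitative: the full state is pure (Tr ρ² = 1), yet the reduced state of subsystem 1 is maximally mixed.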
All features of a subsystem are encoded in the partial trace
It is always enough to work with the simpler density matrix ρ_1 that only describes the degrees of
freedom of subsystem 1 if we only want to determine the properties of this subsystem. By the
previous sentence, we mean

Tr (ρ L̂) = Tr (ρ_1 L̂)

for any operator L̂ that only acts on the observables describing system 1. As previously,

ρ_1 = Tr_2 ρ.

The probability that a quantity L̂ will turn out to have some values in the range M may be written
as the expectation value of the projection operator P_M, and therefore everything we can predict is
encoded in the expectation values. Why is the identity above true?
Tr (ρ L̂) = ∑_a ∑_B ⟨a|⟨B| ρ L̂ |B⟩|a⟩ = ...

where the operator L̂ only acts on the states |a⟩ of the first system. Therefore, it looks like a constant
with respect to the second system, and you can put it in front of or after the B-vectors to get

... = ∑_B ⟨B| ρ |B⟩ L̂ = ρ_1 L̂

which you may finally substitute above to obtain the identity of the two expectation values,
Tr(ρ L̂) = ∑_a ⟨a| ρ_1 L̂ |a⟩ = Tr(ρ_1 L̂). If you
are uncertain about these compact manipulations with operators without indices, you should try to
rewrite the same derivation either with indices that parameterize matrix elements or continuous
indices r⃗ that are integrated over. The latter is relevant for the continuous degrees of freedom.
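Finally, the identity Tr(ρ (L̂ ⊗ 1)) = Tr(ρ_1 L̂) can be tested numerically on a random entangled state (a NumPy sketch; the dimensions, the random seed, and the particular observable are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# A random pure (generally entangled) state of two 3-dimensional systems.
psi = rng.normal(size=9) + 1j * rng.normal(size=9)
psi /= np.linalg.norm(psi)
rho = np.outer(psi, psi.conj())

# A random Hermitean observable acting on system 1 only: L (x) 1.
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
L = M + M.conj().T
L_full = np.kron(L, np.eye(3))

# Reduced density matrix of system 1 via the discrete partial trace.
rho1 = np.einsum('ambm->ab', rho.reshape(3, 3, 3, 3))

# Tr(rho (L (x) 1)) = Tr(rho1 L): all predictions about subsystem 1
# are encoded in rho1.
assert np.isclose(np.trace(rho @ L_full), np.trace(rho1 @ L))
```

This is the index-based version of the derivation recommended in the last sentence: the sum over the B-labels in the full trace is exactly what `einsum` performs when building ρ_1.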