Physics 6210/spring 2007/lecture 1
Lecture 1
Relevant sections in text: §1.1
What is a theory?
can be confusing. For example, quantum mechanics is a “theory”, but the use of quantum
mechanics to model the hydrogen atom as, say, a non-relativistic electron moving in a fixed
Coulomb field is also a “theory” in some sense — a theory of the hydrogen atom. Clearly
these two notions of “theory” have somewhat different logical standings in the sense that
one can build a variety of “theories” of the hydrogen atom (by adding, e.g., spin-orbit
coupling, special relativity, finite-size nucleus, etc. ) within the framework of the theory of
quantum mechanics. Given this slightly confusing state of affairs, I will try (but may fail)
to call quantum mechanics a “theory”, and I will call “models” the various theories built
from quantum mechanics (of, e.g., the hydrogen atom).
What are some successful physical theories? There are many, of course. Some ex-
amples are: Newton’s theory of matter and its interactions, valid at large length scales,
weak gravitational fields, and small velocities (“classical mechanics”); Maxwell’s theory
of the electromagnetic field and its interaction with charged “sources”; Einstein’s theory
of gravitation; and, of course, the theory of matter and its interactions at small length
scales, which we call quantum mechanics, along with its descendant, quantum field the-
ory. One confusing feature of all this is that theories really come to us in overlapping
hierarchies. For example, using the classical Maxwell theory of electrodynamics we can
create a “theory” of atoms as bound states of charges. Such theories are ultimately
incorrect (being classical). A “correct” theory of electromagnetic phenomena in general,
and atoms in particular, arises via quantum electrodynamics in which the theory of quan-
tum mechanics (better: quantum field theory) is melded with Maxwell’s theory and then
used to build a theory of interacting charges, atoms, etc. Thus we can discuss the physical
“theory” called “quantum mechanics” and using the framework defined by this theory we
can build “theories” of various physical phenomena (e.g., crystalline solids). The theory
of the phenomena can be wrong without quantum mechanics being wrong, or perhaps one
is unable to build a satisfactory theory of the phenomena owing to a failure of the parent
theory (quantum mechanics). Similar comments apply to Einstein’s theories of relativity
and the various physical theories that are formulated in the context of Einstein’s relativity.
Here again, we are drawing a conceptual distinction between the idea of a “theory” and a
particular model built within the confines of that theory.
Observables
Measurable aspects of the experimentally accessible world are the observables. Any
theory must provide a means of assigning a mathematical representation to the observables.
By specifying the observables, we are going a long way toward specifying the kinds of
physical situations that our theory is meant to cover. Quantum mechanics postulates a
universal ground rule for observables: they must be self-adjoint operators on a Hilbert
space. The way in which we implement this rule may vary from physical model to physical
model.
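As a concrete preview of the ground rule above, here is a minimal sketch in Python. It assumes the standard representation of the spin 1/2 observables as (h̄/2) times the Pauli matrices, in units where h̄ = 1; that representation is not derived in this lecture, and the helper names `is_self_adjoint` and `eigenvalues_2x2` are purely illustrative. The sketch checks that each matrix equals its own conjugate transpose and that its eigenvalues are real.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1

# Assumed standard representation: S = (hbar/2) * (Pauli matrix).
SX = [[0, HBAR / 2], [HBAR / 2, 0]]
SY = [[0, -1j * HBAR / 2], [1j * HBAR / 2, 0]]
SZ = [[HBAR / 2, 0], [0, -HBAR / 2]]


def is_self_adjoint(m, tol=1e-12):
    """Return True if the square matrix m equals its conjugate transpose."""
    n = len(m)
    return all(
        abs(complex(m[i][j]) - complex(m[j][i]).conjugate()) <= tol
        for i in range(n)
        for j in range(n)
    )


def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 self-adjoint matrix, smallest first.

    Solves lambda^2 - trace*lambda + det = 0; for a self-adjoint
    matrix the trace and determinant are real.
    """
    trace = complex(m[0][0] + m[1][1]).real
    det = complex(m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    half_gap = math.sqrt(max(trace * trace / 4 - det, 0.0))
    return (trace / 2 - half_gap, trace / 2 + half_gap)


for name, op in (("Sx", SX), ("Sy", SY), ("Sz", SZ)):
    print(name, is_self_adjoint(op), eigenvalues_2x2(op))
```

Each of the three matrices is self-adjoint with eigenvalues -h̄/2 and +h̄/2, which are the only values a measurement of that spin component can return.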
For example, using the theory of Newtonian mechanics we can build a model of a “par-
ticle” in which some important observables are position, momentum, angular momentum,
energy, etc. In fact, our usual model of a (Newtonian) “point particle” supposes that all
observables can be viewed as functions of position and momentum, which we can call the
basic observables. Mathematically, the basic observables for a single particle in Newtonian
theory are represented by 6 numbers, (x, y, z, px, py, pz), that is, the observables are functions
on the six-dimensional phase space. Normally, these 6 numbers are actually viewed,
mathematically speaking, as a pair of vectors. The behavior of a “particle” is documented
by monitoring the behavior of its observables and we build our theory using the mathemat-
ical representation (e.g., vectors) for these quantities. Other quantities, like mass, electric
charge, and time are in some sense “observable” too, but in Newtonian mechanics these
quantities appear as parameters in various equations, not as quantities which one measures
to document the behavior of the system. In other words, while we may consider the way
in which the position of a particle changes, we normally don’t include in our model of a
“particle” a time-varying mass or electric charge. Of course, more sophisticated models
of matter may attempt to give a better, or more “fundamental” description of mass and
electric charge in which these quantities become observables in the sense described above.
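The Newtonian picture just described can be sketched in a few lines of Python. The class name `ParticleState` and the sample numbers are hypothetical, chosen only for illustration. The point is structural: the state is a point in phase space, observables are functions of that point, and the mass enters only as a parameter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParticleState:
    """A point in the six-dimensional phase space: position and momentum."""
    x: float
    y: float
    z: float
    px: float
    py: float
    pz: float


def kinetic_energy(s, mass):
    """Observable p^2 / (2m); the mass is a parameter, not part of the state."""
    return (s.px ** 2 + s.py ** 2 + s.pz ** 2) / (2.0 * mass)


def angular_momentum(s):
    """Observable L = r x p, returned as a tuple (Lx, Ly, Lz)."""
    return (
        s.y * s.pz - s.z * s.py,
        s.z * s.px - s.x * s.pz,
        s.x * s.py - s.y * s.px,
    )


# Illustrative values only: a particle on the x axis moving in the y direction.
state = ParticleState(x=1.0, y=0.0, z=0.0, px=0.0, py=2.0, pz=0.0)
print(kinetic_energy(state, mass=2.0))
print(angular_momentum(state))
```

Once the six numbers are given, every observable has a definite value; this is the feature that quantum mechanics will give up.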
As another example, consider the electromagnetic field as it is described in Maxwell’s
theory. The observables include, of course, the electric and magnetic field strengths, as well
as the polarization of waves, energy density, momentum density, etc. All electromagnetic
observables (in Maxwell’s theory) are built from electric and magnetic field strengths,
which are the basic observables of the theory. Mathematically, they are represented as
vector fields. Another measurable quantity is the speed of light c. Like mass and charge
in Newtonian mechanics, this quantity appears as a parameter in Maxwell’s theory and
is not something that can change with the configuration of the system. We don’t use the
term “observable” for the speed of light in the language we are developing.
As my final example, consider the theory of bulk matter known as “statistical mechanics”.
This theory has a number of similarities with quantum mechanics. For now, let us
note that some of the observables are things like free energy, entropy, critical exponents,
etc. All of these quantities can be computed from the partition function, which is, in some
sense, the basic observable. Of course, statistical mechanics is normally built from classical
and/or quantum mechanics in which case the partition function itself is built from more
basic observables. By the way, temperature is, of course, a measurable quantity. But in
the canonical ensemble of statistical mechanics it plays the role of a parameter, just like
mass and electric charge in classical electrodynamics.
So, we see that different physical phenomena require different kinds of observables, and
different theories use different mathematical representations for the observables. One of
our two main goals this semester is to get a solid understanding of how quantum mechanics
represents observables.
Let us remark that in all of our examples (indeed, in most theories) the quantity
called “time” enters as an adjustable parameter. It is normally not modeled as an observable
in the same sense as we model, say, the energy of a particle. In quantum mechanics, too,
time is not an observable in the sense described above.
States
state is specified by giving a probability distribution on the phase space (which is a single
function), rather than a point in phase space as is done in Newtonian mechanics. From the
probability distribution one can compute all observables of statistical mechanics. We see
that given the state of the system, all observables are determined, but the way we specify
the state can be rather different.
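To illustrate the difference between the two ways of specifying a state, here is a small Python sketch. It assumes a one-dimensional system and a phase space reduced to a finite set of points (a simplification made only for illustration); the oscillator energy function and the numbers are hypothetical. A Newtonian state is a distribution concentrated on one point, while a statistical state spreads the probability over several points.

```python
def expectation(observable, distribution):
    """Average of observable(x, p) over a discrete probability distribution.

    `distribution` maps phase-space points (x, p) to probabilities.
    """
    total = sum(distribution.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(prob * observable(x, p) for (x, p), prob in distribution.items())


def position(x, p):
    return x


def energy(x, p, mass=1.0, k=1.0):
    """Energy of a hypothetical harmonic oscillator (illustrative parameters)."""
    return p * p / (2.0 * mass) + 0.5 * k * x * x


# Newtonian state: a single phase-space point, held with probability one.
sharp_state = {(1.0, 0.0): 1.0}
# Statistical state: two points, each held with probability one half.
mixed_state = {(1.0, 0.0): 0.5, (-1.0, 0.0): 0.5}

print(expectation(position, sharp_state), expectation(energy, sharp_state))
print(expectation(position, mixed_state), expectation(energy, mixed_state))
```

Both states have the same mean energy, but only the first assigns a definite position; the second determines the statistics of position rather than a single value.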
To get a better handle on the distinction between states and observables, you can think
as follows. The state of a system reflects the way it has been “prepared”, which normally
is a reflection of the particular initial conditions used. Given a particular preparation
procedure (performed by various measurements and/or filtering processes) the system will
behave in a particular – indeed, unique – way, as reflected in the behavior of its observables.
A physical model for a system principally involves an identification of the observables
needed to describe the system. This is done once and for all. The states of the system
represent various ways the system can be “started off” and can be adjusted by experimental
procedures.
Dynamics
The measured values of observables of a system will usually change in time. Normally, a
theory will contain a means of describing time evolution of the system, that is, a “dynamical
law”, or a “law of motion”. Assuming that we use a time-independent mathematical model
for the observables, we can view dynamics as a continuous change (in time) of the state
of the system according to some system of equations. This way of formulating dynamics
is what is often called the Schrödinger picture of dynamics, and a famous example of the
dynamical law is provided by the Schrödinger equation. In statistical mechanics, the state
of the system is determined by a probability distribution on phase space. This distribution
evolves in time according to the Liouville equation.
In classical mechanics and electrodynamics, the state of the system is known once one
specifies the values of the basic observables. For example, if you give the positions and
velocities of a Newtonian particle at an instant of time, these quantities will be uniquely
determined for all time by a dynamical law (i.e., Newton’s second law). In these theories
one can therefore think of dynamics as a time evolution in the value of the observables
according to some system of equations (F = ma, Maxwell equations). One important
aspect of dynamics that is usually incorporated in any theory is a very basic notion of
causality. In the Schrödinger picture, given the state of the system at one time, the
dynamical law should determine the state uniquely at any other time. Granted this, you
see that the state at a given time (along with the dynamical law - which is part of the
specification of the theory) will determine the outcomes of all measurements at any time.
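The notion of causality described above can be illustrated with a short Python sketch for a Newtonian particle in one dimension. Everything specific here is an assumption made for illustration: the force law (a unit-strength spring), the velocity Verlet integration scheme, and the step count. The state (x, p) at one time, together with the dynamical law, fixes the state at any later time.

```python
import math


def evolve(x, p, force, mass, t, steps=10000):
    """Evolve (x, p) through time t under dp/dt = force(x), dx/dt = p/mass.

    Uses the velocity Verlet scheme, so the result is a numerical
    approximation to the exact solution of Newton's second law.
    """
    dt = t / steps
    for _ in range(steps):
        p_half = p + 0.5 * dt * force(x)
        x = x + dt * p_half / mass
        p = p_half + 0.5 * dt * force(x)
    return x, p


def spring_force(x, k=1.0):
    """Hypothetical force law F = -k x, used only as an example."""
    return -k * x


# Initial state: displaced by one unit, at rest.
x1, p1 = evolve(1.0, 0.0, spring_force, mass=1.0, t=1.0)
print(x1, math.cos(1.0))   # numerical result next to the exact position
print(p1, -math.sin(1.0))  # numerical result next to the exact momentum
```

Evolving for time t in one go, or in two stages of t/2 each, leads to the same final state (up to rounding), which is the uniqueness property the text calls causality.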
To summarize: A theory requires (1) A mathematical representation of observables;
(2) A mathematical representation of states and a prescription for determining the values
of the observables — the physical output of the theory — from any given state; (3) A
specification of a dynamical law, which tells us how to extract physical output as a function
of time. Our goal this semester will be to see how quantum mechanics takes care of (1),
(2), and (3) and uses them to build models of a number of physical systems.
A Word of Caution
One prejudice that arises from, say, classical mechanics that must be dispelled is as
follows. In classical mechanics, knowing the state is the same as fixing the values of all
observables at one time. So, if we know the state of a Newtonian particle at one time,
we know the values of its coordinates and momenta and every other observable. Other
theories may be set up differently and this kind of result need not apply. For example,
in quantum mechanics (and in classical statistical mechanics), the state of the system will
provide probability distributions for all observables. One may completely determine/specify
the state by assigning values to some of the observables (“the energy of a simple harmonic
oscillator is 5 ergs with probability one”), but this may leave some statistical uncertainty in
other observables. As we shall see, for example, (roughly speaking) specifying the position
of a particle will completely determine the state of the particle in quantum mechanics. This
state will allow for a large statistical uncertainty (a very “broad” probability distribution)
for momentum. Likewise, specifying the energy of a particle will, in general, imply a
statistical uncertainty in the values of position and momentum.
Stern-Gerlach experiment
We now describe an experiment conducted by Stern and Gerlach in the early 1920s.
It gives us a valuable demonstration of the kind of phenomenon that needs quantum
mechanics to explain it. It also provides an example of what is probably the simplest
possible quantum mechanical model. As you probably know, this experiment involves
the property of particles known (perhaps misleadingly) as their spin, which is an intrinsic
angular momentum possessed by the particles. Note, though, that at the time of the experiment,
neither intrinsic spin nor quantum mechanics was very well understood! Our goal in
studying this important experiment is to introduce the basic rules of quantum mechanics
in what is probably the simplest possible mathematical setting.
field direction along the axis of interest, then passing the beam of particles through and
seeing which way the particles are deflected, corresponding to spin “up” or “down” along
that direction. Let us therefore try to model the behavior of the spin vector S in such an
experiment, ignoring all the other “degrees of freedom” that the atoms might have. Thus
the atom is modeled as a “spin 1/2 particle”. Let us call an SG apparatus that measures
the spin along an axis characterized by a unit vector n “SGn”. Thus the apparatus SGn
measures S · n. The empirical fact is that if you measure S · n you always get ±h̄/2. Let us
pass a beam of spin 1/2 particles through SGn and keep, say, only the particles that deflect
according to S · n having the value +h̄/2. If we pass this filtered beam through another such
SGn/filter device we see that 100% of the beam passes through. We say that we have
“determined the spin along n with certainty” for all the particles in the filtered beam. We
model this situation by saying that all the particles in the (filtered) beam are in the state
|S · n, +⟩. We can say that we have “prepared” many particles all in the same state by
passing a beam of particles through an SG apparatus and only keeping those deflected up
or down.
Suppose we pass a beam through the apparatus SGz and only keep one spin projection.
We now have many electrons prepared in the state |Sz, +⟩. Let us try to pin down the
value of Sx that these electrons possess. Pass the beam (all particles in the state |Sz, +⟩)
through another Stern-Gerlach apparatus SGx. Particles are now deflected according to
the projection of their magnetic moments (or spin vectors) along the x direction. What
you find in this experiment is that the beam splits in half. This is perfectly reasonable; we
have already decided that the spin has just two possible projections along any
given axis. Since there is nothing special about the x or z directions, we should get similar
behavior for both. In the SGz filtered beam we did not “prepare” Sx in any special way,
so it is not too surprising that we get the beam to split in half.
Let us continue our investigation as follows. We have passed our beam through SGz
and kept the “spin up” particles. We then pass these spin up particles through SGx; let
us focus on the beam that gave +h̄/2 for the Sx measurement. Thus roughly half of
the beam that entered the SGx apparatus is kept, and we now have 1/4 of the original
particles left in our prepared beam. After this filtering process we can, if we like, verify
that a repeated filtering process with the apparatus SGx keeps the whole beam intact; evidently
the state of the particles could be represented by |Sx, +⟩.*
Now we have a beam of electrons that have been measured to have the following
properties: (1) Sz is +h̄/2, (2) Sx is +h̄/2.† Given (1) and (2) above, it is reasonable to
believe that the electrons all have definite values for Sz and Sx since we have filtered out
the only other possibilities. This point of view is not tenable. Suppose you go back to
check on the value of Sz. Take the beam that came out of SGz with value +h̄/2 and then
SGx with the value +h̄/2 and pass it through SGz again. You may expect that all of the
beam is found to have a value +h̄/2 for Sz, but instead you will find that the beam splits in
two! This is despite the fact that we supposedly filtered out the spin down components
along z.
* Of course, one naturally prefers to write the state as something like |Sz, +; Sx, +⟩, but we
shall see that this is not appropriate.
† We could now go and measure Sy in this doubly filtered beam; you will find that half the
beam has spin up along y, half has spin down (exercise). But let us not even bother with
this.
So, if you measure Sz and get, say, +h̄/2, and then you measure it again, you will get +h̄/2
with probability one (assuming no other interactions have taken place). If you measure
Sz and get, say, +h̄/2, then measure Sx and then measure Sz, the final measurement will
be ±h̄/2 with a 50-50 probability. This should get your attention: the values that you
can get for the observable Sz in two measurements depend upon whether or not you have
determined the value of Sx in between the Sz measurements.
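The sequence of filtering experiments can be reproduced numerically with a short Python sketch. It relies on two ingredients that this lecture has not yet developed and which are therefore assumptions here: the standard two-component vectors for the spin up/down states along z and x, and the rule that the probability of an outcome is the squared magnitude of the inner product between the outcome state and the current state. The function names are illustrative only.

```python
import math

R = 1.0 / math.sqrt(2.0)

# Assumed standard two-component representation of the spin 1/2 states.
Z_UP, Z_DOWN = (1.0, 0.0), (0.0, 1.0)
X_UP, X_DOWN = (R, R), (R, -R)


def probability(outcome, state):
    """Probability |<outcome|state>|^2 of finding `outcome` given `state`."""
    amplitude = sum(complex(a).conjugate() * b for a, b in zip(outcome, state))
    return abs(amplitude) ** 2


def filter_chain(initial, filters):
    """Fraction of a beam in state `initial` that survives each filter in turn.

    After passing a filter, the surviving particles are in that filter's state.
    """
    fraction, state = 1.0, initial
    for f in filters:
        fraction *= probability(f, state)
        state = f
    return fraction


# Beam already filtered by SGz (spin up), then sent through further filters.
print(filter_chain(Z_UP, [Z_UP]))          # SGz again: the whole beam passes
print(filter_chain(Z_UP, [X_UP, Z_UP]))    # SGx (+), then SGz (+)
print(filter_chain(Z_UP, [X_UP, Z_DOWN]))  # SGx (+), then SGz (-)
```

The last two fractions are equal (each about one quarter of the SGz-filtered beam), so after the intermediate Sx filter the final Sz measurement splits the beam evenly, as described above.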
Given this state of affairs, it is hard to make sense of the classical picture in which one
imagines the electron to have given, definite values of all its observables, e.g., Sx and Sz.
One sometimes says that the measurement of Sx has somehow “disturbed” the value of
Sz. This point of view is not incorrect, but is not a perfect description of what is going
on. For example, as we shall see, the quantum mechanical prediction is unambiguously
independent of the way in which we make the measurements. Nowhere do we really need
to know how the SG devices worked. Moreover, the “disturbance” in Sz due to the Sx
measurement is not a function of how carefully we make the Sx measurement; that is,
one cannot blame the strange behavior on some “experimental error”. The
measurements can, ideally, be perfect and we still get the same result. The fact of the
matter is that one shouldn’t think of observables (such as Sz and Sx) as having given,
fixed values that “exist” in the object of interest. This may be philosophically a bit sticky
(and psychologically a bit disturbing), but it seems to be quite all right as a description of
how nature actually works.
If all this seems perfectly reasonable to you, then you probably don’t understand it
too well. Our macroscopic experience with matter just doesn’t give any hint that this is
the way nature works.
Electrons (and other elementary particles) are not like tiny baseballs following classical
trajectories with tiny spin angular momentum arrows attached to them, and there is no
reason (experimentally) to believe that they are. It is a purely classical prejudice that
a particle has definite values for all observables that we can measure. Try to think this
way: what is a particle? It has mass, (total) spin, charge, and other intrinsic, “real”
properties that do not change with the state of the particle. Based upon experiment, one
may want to assign other observable properties such as position, energy, orbital angular
momentum, spin component along an axis to the particle. But according to experiment,
these properties change with the state of the particle and cannot be viewed as “existing”
in the particle independently of the measuring process (which changes the state). As it
turns out, according to the quantum mechanical explanation of this sort of phenomenon,
all you are guaranteed to be able to “assign” to a particle is probability distributions for
its various observables. Our next task is to build up the quantum mechanical model of the
spin 1/2 system using the rules of quantum mechanics.