The Special Theory of Relativity
SPECIAL RELATIVITY
Lecture notes for III semester BSc.Physics course
Sreerag S Kumar
Chapter One
Introduction
In 1905, Albert Einstein (a Swiss patent clerk at the time) published a
paper titled “On the Electrodynamics of Moving Bodies”. In it, Einstein
reconciled the then-existing inconsistencies between Newtonian mechanics
and Maxwell’s electrodynamics by introducing some radical new ideas; ideas
that have since changed the way we understand the reality that we inhabit.
Together with the contributions of many other mathematicians and physicists
(Hermann Minkowski, H A Lorentz and Henri Poincaré, to name a few),
the ideas in this paper have come to be known as the special theory of relativity.
Special relativity introduces to us a new way of thinking about space,
time, matter and energy. The equations of Maxwell’s electrodynamics had
hidden within them the secrets of the true structure of reality all along. The
formulation of special relativity has helped recast Maxwell’s electrodynamics
into a much more elegant and powerful theory.
For the purpose of this lecture series, we will only be looking at the
mechanics end of special relativity.
For the more interested reader, I will include a list of books and lectures for
further reading and reference.
“Henceforth, space by itself, and time by itself, are doomed to fade away
into mere shadows, and only a kind of union of the two will preserve an
independent reality”
-Hermann Minkowski
Addressing the 80th Assembly of German Natural Scientists and
Physicians, (Sep 21,1908)
Chapter Two
Galilean Relativity
The principle of Galilean relativity is not something new to us. We experience
it on a daily basis and even though we’ve never bothered to name it
explicitly, we understand it to a very good extent. What is this principle,
then, and why is it so important?
The principle of Galilean relativity or Galilean invariance is that
“The laws of physics are the same in all inertial frames of reference”
we can “translate” or compare the observations made in one frame to one
made in another. This is where the Galilean transformations come in.
If we are given two inertial frames O and O', such that O' is moving
with a velocity v along the positive direction of their common x-axis as
observed from O, the following are the Galilean transformation equations
which connect the two frames.
$$
\begin{aligned}
x' &= x - vt \\
y' &= y \\
z' &= z \\
t' &= t
\end{aligned}
$$
And the inverse transformation equations are,
$$
\begin{aligned}
x &= x' + vt' \\
y &= y' \\
z &= z' \\
t &= t'
\end{aligned}
$$
Differentiating once with respect to time, we can see how velocities transform
between inertial frames.
$$
\begin{aligned}
u'_x &= u_x - v_x \\
u'_y &= u_y - v_y \\
u'_z &= u_z - v_z
\end{aligned}
$$
Or written in a compact vector notation,
$$\vec{u}' = \vec{u} - \vec{v}$$
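The Galilean transformations above can be tried out numerically. The following is a minimal sketch, not part of the original notes: the names `Event`, `galilean`, `galilean_inverse` and `galilean_velocity` are made up for illustration, and the train and ball speeds are assumed values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """An event labelled by a time and three spatial coordinates (hypothetical helper)."""
    t: float
    x: float
    y: float = 0.0
    z: float = 0.0


def galilean(event, v):
    """Coordinates of `event` in O', which moves with speed v along +x of O."""
    return Event(t=event.t, x=event.x - v * event.t, y=event.y, z=event.z)


def galilean_inverse(event_prime, v):
    """Coordinates in O of an event whose coordinates in O' are given."""
    return Event(t=event_prime.t, x=event_prime.x + v * event_prime.t,
                 y=event_prime.y, z=event_prime.z)


def galilean_velocity(u, v):
    """Velocity in O' of a body with velocity u in O (component-wise u - v)."""
    return tuple(ui - vi for ui, vi in zip(u, v))


# Assumed example: a train (frame O') moves at 20 m/s relative to the ground (frame O).
v_train = 20.0
ground_event = Event(t=2.0, x=50.0)
train_event = galilean(ground_event, v_train)
print(train_event)                                 # same t, shifted x
print(galilean_inverse(train_event, v_train))      # back to the ground coordinates

# A ball moving at 25 m/s along x in the ground frame, seen from the train.
print(galilean_velocity((25.0, 0.0, 0.0), (v_train, 0.0, 0.0)))
```

Note that the time label is untouched by the transformation; this is exactly the assumption that special relativity later forces us to give up.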
Chapter Three
A case of inconsistency: Maxwell’s
electrodynamics and Galilean transformations
Maxwell described and codified all electromagnetic phenomena into the form
of his four eponymous equations.
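$$
\begin{aligned}
\nabla \cdot \mathbf{E} &= 4\pi\rho \\
\nabla \cdot \mathbf{B} &= 0 \\
\nabla \times \mathbf{E} &= -\frac{1}{c}\frac{\partial \mathbf{B}}{\partial t} \\
\nabla \times \mathbf{B} &= \frac{1}{c}\left(4\pi\mathbf{J} + \frac{\partial \mathbf{E}}{\partial t}\right)
\end{aligned}
$$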
Now let’s consider the following situation. Suppose you are in a moving
train and inside the train there is a charged body moving along the length
of the train with a constant velocity. Given this situation, one can
solve Maxwell’s equations to find the magnetic field produced by the moving
charged body, using the velocity that you observe for the body. Now, a
person outside on the tracks can also solve Maxwell’s equations to find the
magnetic field produced by the body. But what velocity does that person
use? The Galilean transformation equations would suggest that the sum of the
velocity of the train and the velocity of the body relative to the train be used.
This would lead to a different answer for the field produced. The difference
would be quite negligible at low speeds but is significant when we deal with
velocities near the speed of light. So what’s wrong in this situation? Was it
the fault of Maxwell’s equations that the two observers got different answers
for the same phenomenon, or was it the fault of us using the Galilean
transformation equations? The answer lies in what we actually measure. We do
not measure fields, we measure forces. Electric and magnetic fields work
together to provide the electromagnetic Lorentz force, which is the same for
all observers. There is a velocity-dependent trade-off between the two fields
when they transform between frames. So the fault here is the use of Galilean
transformations. It is the Galilean transformations which are wrong and not
Maxwell’s equations. Thus we need a new set of transformations which correctly
transforms Maxwell’s equations between frames in relative motion. This new set
of transformations was first provided by H A Lorentz and later by Einstein,
who explained their physical significance. We will derive these Lorentz
transformation equations in chapter 5.
Chapter Four
The hunt for Ether and the most famous “failed”
experiment
Maxwell also helped to describe light as an electromagnetic phenomenon.
According to electrodynamics, light is an electromagnetic wave consisting of
time-varying electric and magnetic fields which travel with a speed of
$c = 1/\sqrt{\mu_0 \epsilon_0} = 299{,}792{,}458$ m/s in free space. This also caused confusion as to
what the medium of propagation of light was. Since all the waves known
at the time had media, it was reasonable to assume that light too had a
medium through which it propagated. This led the scientific minds of the
time to put forward the ether hypothesis. The ether was theorized to be a
medium permeating all of space, having a very high rigidity modulus so as
to allow a wave of such high velocity to travel through it, but at the
same time providing no mechanical hindrance to bodies passing through it.
During the late nineteenth century and early twentieth century, the majority of
the physics community was excited about detecting the presence of the ether.
The most famous of the experiments devised to detect the presence of the
ether was the Michelson-Morley interferometer experiment. The idea of the
experiment was that since the ether filled the entirety of space, the earth, at
some point along its orbit, must be in relative motion to it. Since the ether
was the medium through which light waves propagated, measuring the
difference in the speed of light in different directions at different points along
the earth’s orbit should reveal the relative motion of the earth with respect to the
ether.
The apparatus consists of a light source, a half-silvered mirror and two
mirrors kept at a length L in perpendicular directions from the half-silvered
mirror. Light from the source is directed towards the half-silvered mirror,
where it splits and progresses towards the two mirrors. The light reflects from
the mirrors, recombines and travels into a detector. Since the light reflected
from the two arms takes different times to travel through the ether, the two
beams were expected to interfere and produce a shift in the fringe pattern.
Only that they didn’t. When they did the experiment, the calculated
fringe shift was 0.04 but the observed fringe shift was 0.018 or less, well
within experimental error. The Michelson-Morley experiment
failed to detect any relative motion with respect to the ether and to this day
remains one of the most famous, albeit failed, experiments in physics.
Figure 1: The Michelson-Morley interferometer experiment
The light leaves the source and arrives at the beam splitter at time $T_0$.
Both the mirrors are at a distance $L$ at this moment but are moving with
a velocity $v$. The light takes time $T_1$ to reach the mirror and thus travels
a distance $cT_1$. The mirror has moved a distance $vT_1$ during this time and
hence $cT_1 = L + vT_1$, or
$$T_1 = \frac{L}{c - v}$$
During the journey back, the same rules apply but $v$ becomes $-v$, hence
$cT_2 = L - vT_2$, or
$$T_2 = \frac{L}{c + v}$$
Therefore, the total travel time is
$$T_h = T_1 + T_2 = \frac{L}{c - v} + \frac{L}{c + v} = \frac{2Lc}{c^2 - v^2}$$
For the vertically placed mirror, light travels for a time $T_3$ but the mirror has
traveled a distance $vT_3$ in the horizontal direction and hence the distance
traveled is $\sqrt{L^2 + v^2 T_3^2}$. Which gives,
$$T_3 = \frac{L}{\sqrt{c^2 - v^2}}$$
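Now, writing $T_v = 2T_3$ for the round trip along the vertical arm,
$$T_h = \frac{2Lc}{c^2 - v^2} = \frac{2L}{c}\,\frac{1}{1 - \frac{v^2}{c^2}} \approx \frac{2L}{c}\left(1 + \frac{v^2}{c^2}\right)$$
$$T_v = \frac{2L}{\sqrt{c^2 - v^2}} = \frac{2L}{c}\,\frac{c}{\sqrt{c^2 - v^2}} = \frac{2L}{c}\,\frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} \approx \frac{2L}{c}\left(1 + \frac{v^2}{2c^2}\right)$$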
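To get a feel for how small the expected effect is, the two travel times can be evaluated numerically. The sketch below is only illustrative: the arm length of 11 m and the ether wind speed of 30 km/s (roughly the orbital speed of the earth) are assumed values, not numbers taken from these notes, and the function names are made up.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def travel_times(L, v, c=C):
    """Exact round-trip times along (T_h) and across (T_v) the assumed ether wind."""
    t_h = 2 * L * c / (c**2 - v**2)
    t_v = 2 * L / math.sqrt(c**2 - v**2)
    return t_h, t_v


def approx_travel_times(L, v, c=C):
    """First-order approximations in v^2/c^2 of the same two times."""
    beta2 = (v / c) ** 2
    return (2 * L / c) * (1 + beta2), (2 * L / c) * (1 + beta2 / 2)


# Assumed, illustrative values.
L = 11.0     # arm length in metres
v = 3.0e4    # speed relative to the ether in m/s

t_h, t_v = travel_times(L, v)
a_h, a_v = approx_travel_times(L, v)

print("T_h =", t_h, "s, first-order value:", a_h)
print("T_v =", t_v, "s, first-order value:", a_v)
print("T_h - T_v =", t_h - t_v, "s")
print("L v^2 / c^3 =", L * v**2 / C**3, "s")
```

Subtracting the two first-order expressions gives a time difference of about Lv²/c³, which is what the last two printed lines compare.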
Chapter Five
Events, Clocks, Meter sticks and the Lorentz transformations
In this section, we formally start discussing special relativity. But before we
start, we must be familiar with the two postulates of special relativity.
1. The laws of physics are the same for all observers in inertial
frames of reference
2. The speed of light in vacuum is the same for all observers in
inertial frames of reference
These are the fundamental principles on top of which special relativity rests.
Some might also claim that since it is a law of nature that light travels at
the velocity c in vacuum and the laws of physics are the same in all inertial
frames, the second postulate is redundant.
In physics, or at least in relativity,
the fundamental quantity of importance is an event. An event is described
by a spatial location and a time. The spatial location is determined
by measurements made using meter sticks and the time using clocks. Now, an
event has its own existence independent of reference frames. You can label
an event with measurements made in your frame of reference but another
observer can give the same event a different set of labels depending on the
measurements that he or she may have made in their frame. Newtonian
mechanics held the idea that even though the spatial location of an event can
have different values in different frames, the time of an
event is the same in all frames of reference. But we will see that relativity
forces us to abandon this viewpoint on time. Clocks which were once
synchronized will fall out of synchronization when they move relative to each
other. Loosely translated, synchronous means “same time”. So synchronous
clocks are just clocks which show the same time.
Einstein gave us a recipe for synchronizing clocks using flashes of light.
Let two observers P and P' be separated by some distance. They agree
that when their respective clocks show exactly 12 noon, they will each send
a flash of light towards the other. Now, if we have an observer at their common
midpoint and he or she measures the times at which the
flashes arrive, then, if the clocks were synchronous, the flashes would reach
him or her at the same time. But this is not the case in the frame of a moving
observer. Assume that when the stationary observers’ clocks read 12 noon, the
moving observer is at their common midpoint. By the time the light flashes
reach the midpoint, he or she will have moved some distance and hence the
flashes would reach him or her at slightly different times. This means the
stationary observers’ clocks are not synchronized in the moving observer’s frame.
[Figure: spacetime diagram with axes x and t showing the observers P, O and P']
Now we would like
to find out how synchronization is defined for the moving observer, or what
the moving observer calls synchronous. Before we do that, we redefine our
units of measurement such that the speed of light comes out as unity. We
call such units natural units. One way to do this is to redefine the unit
of length from meters to light seconds. One light second is the distance
traveled by light in one second, which is equal to c × 1 s = 299,792,458 m. Or
we could define meters of time: one meter of time is the time taken by light
to travel a distance of one meter, which is equal to
1 m / c = 3.335641 × 10^{-9} s.
Redefining our units either way, the speed of light becomes unity in the new
units. Also, other velocities that we measure become dimensionless numbers
with magnitudes between 0 and 1.
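The unit conversions described above amount to multiplying or dividing by c. A minimal sketch follows; the function names are made up for illustration and the jet speed in the example is an assumed value.

```python
C = 299_792_458  # speed of light in m/s


def metres_to_light_seconds(length_m):
    """Express a length in light seconds."""
    return length_m / C


def seconds_to_metres_of_time(time_s):
    """Express a time in 'meters of time' (the distance light covers in that time)."""
    return time_s * C


def metres_of_time_to_seconds(time_m):
    """Convert a time given in meters back to seconds."""
    return time_m / C


def speed_in_natural_units(speed_m_per_s):
    """Express a speed as a dimensionless fraction of the speed of light."""
    return speed_m_per_s / C


print(metres_to_light_seconds(299_792_458))   # one light second
print(metres_of_time_to_seconds(1.0))         # one meter of time, in seconds
print(speed_in_natural_units(C))              # light itself
print(speed_in_natural_units(300.0))          # an assumed jet speed of 300 m/s
```

Everyday speeds come out as tiny fractions in these units, which is one way to see why relativistic effects go unnoticed in daily life.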
Coming back, we will now try to understand synchronicity in a moving
frame. Consider an observer O at rest. Three other observers, A, B and
C spaced one unit apart in the rest frame, move with a uniform velocity v
along the positive x direction. When their clocks read time t = 0, O and
A coincide. The observers in the moving frame use the same method as
above to synchronize their clocks. The observers at the ends, A and C, agree
to fire a flash of light towards B when their clocks read time t' = 0. In
the diagram, we can see our problem in a more easily understandable form.
We call diagrams like these spacetime diagrams. Each point in a spacetime
diagram is labeled by an x and a t value. Hence the points in a spacetime
diagram represent events in the real world. Paths of bodies in spacetime
are called their worldlines. In the diagram, we see that our observer at rest, O, is
stationary in his frame. His path on the spacetime diagram lies along his
time axis (x = 0). The observer A travels along the line x = vt (or in his
own frame, x' = 0), B along x = vt + 1 and C along x = vt + 2. Since
light always has the same velocity c for all observers and since we changed
the speed of light to 1, light always moves along lines at 45 degrees
in spacetime diagrams (such as x = t). Now, A emits a flash of light towards
B when the time on his clock reads t' = 0 and it reaches B. At what time, as
seen from the rest frame, should C fire his flash towards B so that it reaches
B at the same instant as the light flash from A? We are looking
for a light path that joins the path of observer C and the event at which A’s
flash reaches B, which we shall label (ta, xa). To do this, we could simply
start drawing 45 degree lines from C towards B. If we do this, we are more
than likely to get it wrong.
So we can be a bit smarter and start from the event itself and draw the line
back towards C. This gives us, with very little effort, the path of the flash
of light emitted by C towards B, joining the events (ta, xa) and (tb, xb). Now
that we have the events themselves, we would like to know their time and
position. Let’s start with (ta, xa). This point is the intersection of two lines,
x = vt + 1 and x = t, and hence must be a solution to both.
$$x_a = v t_a + 1$$
$$x_a = t_a$$
$$\Rightarrow t_a = v t_a + 1$$
which gives
$$t_a = \frac{1}{1 - v}, \qquad x_a = \frac{1}{1 - v}$$
[Figure: spacetime diagram in the rest frame of O with axes x and t, showing the worldlines x = vt (A, x' = 0), x = vt + 1 (B) and x = vt + 2 (C), the light path x = t (x' = t'), the line t = vx (t' = 0), and the events (ta, xa) and (tb, xb)]
What about (tb, xb)? That point too is the intersection of two lines, and
knowing the equations of the lines, we can find the point. One of the lines
is simply the path of C, which is given by x = vt + 2. What about the path of
the flash of light? We can see that it is a straight line with a slope of negative
one. Therefore its equation will be of the form x + t = k. To find the
value of this k, we just substitute the coordinate values of a known point
along this line. Incidentally, (ta, xa) is a point along this line. Therefore, we
have
$$x_a + t_a = k$$
$$\Rightarrow \frac{1}{1 - v} + \frac{1}{1 - v} = k$$
or $k = \frac{2}{1 - v}$. The equation for the path of light is then $x + t = \frac{2}{1 - v}$.
Solving the two simultaneous equations, we find the value of (tb, xb):
$$x_b + t_b = \frac{2}{1 - v}$$
$$x_b = v t_b + 2$$
$$\Rightarrow t_b = \frac{2v}{1 - v^2}, \qquad x_b = \frac{2}{1 - v^2}$$
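The two intersections worked out above can be checked with exact arithmetic. This is a minimal sketch in natural units; the function name `flash_events` and the choice v = 3/5 are assumptions made for illustration.

```python
from fractions import Fraction


def flash_events(v):
    """Return (t_a, x_a) and (t_b, x_b) for the three moving observers A, B, C.

    Event a: the flash from A (x = t) meets the worldline of B (x = v t + 1).
    Event b: the flash that reaches B at event a (x + t = k) leaves the
             worldline of C (x = v t + 2).
    """
    v = Fraction(v)
    t_a = 1 / (1 - v)
    x_a = t_a
    k = x_a + t_a                # constant of the light path x + t = k
    t_b = (k - 2) / (1 + v)      # from (v t + 2) + t = k
    x_b = v * t_b + 2
    return (t_a, x_a), (t_b, x_b)


v = Fraction(3, 5)
(t_a, x_a), (t_b, x_b) = flash_events(v)
print("event a:", t_a, x_a)
print("event b:", t_b, x_b)

# Compare with the closed forms and with the line t = v x.
print(t_b == 2 * v / (1 - v**2), x_b == 2 / (1 - v**2), t_b == v * x_b)
```

Using `Fraction` keeps every step exact, so the comparison with the closed-form expressions is an equality rather than an approximation.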
We are now left with a surprising result. As observed by O, the moving frame
seems to keep time differently. What the moving observers
call t' = 0 is not the same as t = 0. Even though the clocks are synchronous
in the moving frame, they are clearly not so in the rest frame. t' = 0 lies along
a line joining the origin and (tb, xb). Another thing that we can notice is
that, along this line, t = vx. From the diagram, we see that this line is just
a reflection of the line x = vt about the diagonal x = t.
When t' = 0 in the moving frame, t = vx in the rest frame. Also when
x' = 0 in the moving frame, x = vt in the rest frame. From this, one might
assume that x' = x − vt and t' = t − vx. But this could be off by some
factor, say A and B, i.e.,
$$x' = A(x - vt)$$
$$t' = B(t - vx)$$
One could also argue that A and B are functions of only their relative velocity
v since that is the only factor differentiating them. Hence,
$$x' = A(v)(x - vt)$$
$$t' = B(v)(t - vx)$$
Now, according to the second postulate, light travels with the same velocity
in all frames. Which means that when x = t, x' = t'. This would mean that
both A(v) and B(v) are the same. Which simplifies our equations further.
$$x' = A(v)(x - vt)$$
$$t' = A(v)(t - vx)$$
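Since we still have more unknowns than equations, we will need the inverse
transformations also to completely solve the problem. The inverse
transformations, obtained by replacing v with −v (and taking A(−v) = A(v),
since the factor should not depend on the direction of motion), are
$$x = A(v)(x' + vt')$$
$$t = A(v)(t' + vx')$$
Making the proper substitutions,
$$x = A(v)\left[A(v)(x - vt) + vA(v)(t - vx)\right] = A(v)\left[A(v)x - v^2 A(v)x\right]$$
$$A(v)^2 = \frac{1}{1 - v^2}$$
$$\Rightarrow A(v) = \frac{1}{\sqrt{1 - v^2}}$$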
Nowadays we call this factor A(v) the Lorentz factor and use the lower case
Greek letter γ to represent it. Therefore,
$$x' = \gamma(x - vt)$$
$$t' = \gamma(t - vx)$$
At last, we are left with a new set of transformations which correctly
describe coordinate transformations according to the postulates of special
relativity. This new set of transformations is called the Lorentz
transformations, and with these transformations we are more than well equipped to
explore special relativity further.
Though popularized by Einstein, the Lorentz transformation equations
existed in some form or other in physics prior to special relativity, which is
why they don’t carry Einstein’s name. Time
dilation was suggested in the case of electron orbits inside atoms, and Lorentz
contraction was used to explain the null result of the Michelson-Morley
experiment. We derived these equations in natural units, where the velocity
of light is one unit. In SI units, these equations take the form
$$\gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}$$
$$x' = \gamma(x - vt)$$
$$t' = \gamma\left(t - \frac{vx}{c^2}\right)$$
and similarly, the inverse transformations
$$x = \gamma(x' + vt')$$
$$t = \gamma\left(t' + \frac{vx'}{c^2}\right)$$
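The Lorentz transformations and their inverses translate directly into a few lines of code. The following is a minimal sketch with made-up function names; passing `c=1.0` gives the natural-unit form derived earlier, and the sample event and speed are assumed values.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(v, c=C):
    """Lorentz factor for relative speed v."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def lorentz(t, x, v, c=C):
    """Coordinates (t', x') in O' of the event (t, x) in O."""
    g = gamma(v, c)
    return g * (t - v * x / c**2), g * (x - v * t)


def lorentz_inverse(t_p, x_p, v, c=C):
    """Coordinates (t, x) in O of the event (t', x') in O'."""
    g = gamma(v, c)
    return g * (t_p + v * x_p / c**2), g * (x_p + v * t_p)


# Natural units: an event on the light path x = t, seen from a frame with v = 0.6.
t_p, x_p = lorentz(1.0, 1.0, 0.6, c=1.0)
print("gamma:", gamma(0.6, c=1.0))
print("primed coordinates:", t_p, x_p)          # still satisfies x' = t'
print("round trip:", lorentz_inverse(t_p, x_p, 0.6, c=1.0))
```

The event on the light path stays on the light path in the primed frame, which is the condition that was used above to fix A(v) = B(v).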
Chapter Six
The consequences of Lorentz transformations and
the geometry of Spacetime
In this chapter, we will explore some of the mind-boggling consequences the
Lorentz transformation equations force upon us. Lorentz transformations
present to us a new type of reality, one whose geometry is far richer than
the ordinary geometry of Euclid that we’ve been familiar with till now.
[Figure: spacetime diagram with axes x and t, origin O, showing the lines x' = 0, t' = 0, x = t and x = 1]
rest frame are contracted by a factor of γ when the same measurements are
made in a moving frame.
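[Figure: spacetime diagram with axes x and t showing the lines x = vt, x = t and t = vx, and two events A and B at the same position in O (xA = xB), with their times tA, tB in O and their coordinates (t'A, x'A), (t'B, x'B) in O']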
events.
6.2 Time dilation
In O,
$$x_B - x_A = 0, \qquad t_B - t_A = \Delta t$$
In O',
$$x'_B - x'_A = \gamma(x_B - vt_B) - \gamma(x_A - vt_A) = \gamma v(t_A - t_B)$$
But more interestingly,
$$t'_B - t'_A = \gamma(t_B - vx_B) - \gamma(t_A - vx_A) = \gamma(t_B - t_A) = \gamma\Delta t$$
That is, the time between two events as measured by an observer in a moving
frame is γ times the time measured in the frame in which the two events occur
at the same place.
Since there is no way to tell which frame is moving and which one is at
rest, time dilation applies to both frames. It’s entirely our choice to fix
which frame is at rest and which one is in motion. As the speed approaches
that of light, γ grows without bound, so time is dilated without limit. In this
limiting sense, photons and other
particles which travel at the speed of light do not experience the flow of time.
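The time dilation result can be checked on a concrete pair of events. The sketch below works in natural units; the function name, the speed v = 0.8 and the event coordinates are assumed values chosen only for illustration.

```python
import math


def lorentz(t, x, v):
    """Lorentz transformation in natural units (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * (t - v * x), g * (x - v * t)


v = 0.8
g = 1.0 / math.sqrt(1.0 - v * v)

# Two ticks of a clock that sits at x = 3 in O, separated by dt = 2.
t_a, x_a = 0.0, 3.0
t_b, x_b = 2.0, 3.0

tp_a, xp_a = lorentz(t_a, x_a, v)
tp_b, xp_b = lorentz(t_b, x_b, v)

print("dt' =", tp_b - tp_a, " gamma * dt =", g * (t_b - t_a))
print("dx' =", xp_b - xp_a, " gamma * v * (t_A - t_B) =", g * v * (t_a - t_b))

# The Lorentz factor grows without bound as v approaches 1.
for speed in (0.5, 0.9, 0.99, 0.999999):
    print(speed, 1.0 / math.sqrt(1.0 - speed * speed))
```

In O' the two ticks are further apart in time and no longer happen at the same place, in agreement with the two expressions above.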
6.3 Magnetism
Figure 6: Top: The charge is at rest in frame F, so this observer sees a
static electric field. An observer in another frame F' moves with velocity
v relative to F, and sees the charge move with velocity −v with an altered
electric field E due to length contraction and a magnetic field B due to the
motion of the charge. Bottom: Similar setup, with the charge at rest in
frame F'.
interesting, because the magnetic force in one frame is just the electrostatic
force in another. The magnetic field that one experiences is just an artifact
of the frame of reference that the observer is in. A more mathematically
rigorous proof would take us too far off our course and hence is left for the
reader to explore of their own volition.
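[Figure: spacetime diagram with axes x and t, origin O, showing the lines x = vt, x = t and t = vx, and two events A and B that are simultaneous in O (tA = tB) at positions xA and xB, with their coordinates (t'A, x'A) and (t'B, x'B) in O']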
simultaneous in O (tB − tA = 0). Now let’s take a look at the situation from
O'. By Lorentz transformation,
$$t'_B - t'_A = \gamma(t_B - vx_B) - \gamma(t_A - vx_A) = \gamma v(x_A - x_B)$$
Hence, the events are clearly not simultaneous in O'.
Most of the paradoxes involved in relativity are due to misunderstood
notions of simultaneity. Things like the ladder in the barn paradox, the
train-guillotine paradox and many others can be explained using the relativity
of simultaneity. If you understand the relativity of simultaneity, you will realize
that there are no paradoxes involved in these situations and that the account of
each observer is perfectly valid in his or her own frame of reference.
6.5 A new velocity addition rule and the cosmic speed limit
We have seen that observers don’t agree on lengths and durations between
events. Then what about velocity, a quantity that depends on both of
the above quantities? How are velocities transformed between frames by
Lorentz transformations? Consider two frames O and O' in relative motion
with respect to each other. O' is moving with a velocity v along the positive
direction of their common x-axes. We have the Lorentz and inverse Lorentz
transformations
$$x' = \gamma(x - vt)$$
$$t' = \gamma(t - vx)$$
$$x = \gamma(x' + vt')$$
$$t = \gamma(t' + vx')$$
If a body is moving with a velocity u' in O', what is its observed velocity
in O? We know
$$u_x = \frac{dx}{dt} = \frac{\gamma(dx' + v\,dt')}{\gamma(dt' + v\,dx')} = \frac{u'_x + v}{1 + vu'_x}$$
Similarly,
$$u_y = \frac{dy}{dt} = \frac{dy'}{\gamma(dt' + v\,dx')} = \frac{u'_y}{\gamma(1 + vu'_x)}$$
$$u_z = \frac{dz}{dt} = \frac{dz'}{\gamma(dt' + v\,dx')} = \frac{u'_z}{\gamma(1 + vu'_x)}$$
Velocities don’t just add together in relativity as they did in Newtonian
mechanics. But even then, at low speeds the effects are minuscule.
Imagine if the velocity of O' were 0.5 and within that frame a body were
to move with a velocity u' = u'x = 0.5. What will u be in O? Newtonian
mechanics would tell us to just add the two velocities together and we would
get the answer as 1. But relativity tells us to use a new velocity addition
rule and using it, we get the answer
$$u = u_x = \frac{u'_x + v}{1 + vu'_x} = \frac{0.5 + 0.5}{1 + 0.5 \times 0.5} = 0.8$$
Newton says 1 but Einstein says 0.8. What about light? The second postulate
demands that the velocity of light be the same in all frames. Does our new
velocity transformation rule obey the second postulate? Let’s check.
$$u = u_x = \frac{u'_x + v}{1 + vu'_x} = \frac{1 + v}{1 + v \times 1} = 1$$
Thus light, and only light, travels with the same speed 1 in all frames under
this new relativistic transformation law for velocities.
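The velocity addition rule is easy to experiment with. Below is a minimal sketch in natural units for motion along the common x-axis; the function name `add_velocities` is made up, and exact fractions are used so that the results are not blurred by rounding.

```python
from fractions import Fraction


def add_velocities(u_prime, v):
    """Velocity in O of a body moving at u_prime in O', where O' moves at v (c = 1)."""
    return (u_prime + v) / (1 + u_prime * v)


half = Fraction(1, 2)
print(add_velocities(half, half))                # 0.5 "plus" 0.5
print(add_velocities(1, Fraction(3, 5)))         # light seen from a frame with v = 3/5
print(add_velocities(Fraction(9, 10), Fraction(9, 10)))  # two large speeds, still below 1

# At everyday speeds the correction to the Newtonian sum is tiny.
slow = Fraction(1, 10**6)
print(float(add_velocities(slow, slow)), float(slow + slow))
```

However close to 1 the two input speeds are, the combined speed stays below 1; only an input of exactly 1 returns 1.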
Muons are elementary particles with charge −e and a mass about 207
times that of the electron. They are unstable particles which decay
into an electron, an electron antineutrino and a muon neutrino with a mean
lifetime of 2.1969811 × 10^{-6} seconds.
$$\mu^- \to e^- + \bar{\nu}_e + \nu_\mu$$
reaching the surface of the earth within their mean lifetime? There are two
ways that we can explain this. Consider a clock co-moving with the muon.
Since this clock is in relative motion with respect to a clock placed on the
surface of the earth, it should run slower. This clock will measure the mean
lifetime of the muon as $t_0 = 2.197\,\mu s$. But the clock placed on earth will
measure $t = \gamma t_0$.
$$\gamma = \frac{1}{\sqrt{1 - v^2}} = \frac{1}{\sqrt{1 - 0.997^2}} = 12.9196$$
$$t = \gamma t_0 = 12.9196 \times 2.197\,\mu s = 28.384\,\mu s$$
So, in the earth frame, the mean lifetime of the muon has been dilated to
28.384 µs. In this time, the muon can travel a distance of 28.384 µs ×
0.997c ≈ 8483.9 m. This means muons can easily reach the
ground within their mean lifetime.
How about the muon’s frame? What is the muon’s experience during this
journey? In the frame of the muon, the muon suddenly comes to life (by
collision between the cosmic rays and the atmospheric particles) and sees the
earth racing towards it at a speed of 0.997c. This means that in the muon’s
frame, the distance to the ground is length contracted along the direction of
motion. If in the earth’s frame the distance between the muon and the ground
was L0 = 7000 meters, in the muon’s frame it is Lorentz contracted to
L = L0/γ ≈ 541.8 meters. At 0.997c the ground covers this distance in about
1.81 µs, which is less than the mean lifetime of 2.197 µs, so the ground
reaches the muon before it decays. Hence
in both frames, the muon reaches the ground. Even though the observers
may attribute different phenomena to the cause, they still agree about the
final effect.
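The numbers in the muon example can be reproduced in a few lines. This sketch uses the values quoted above (speed 0.997c, mean lifetime 2.197 µs, height 7000 m); the variable names are made up for illustration.

```python
import math

C = 299_792_458.0   # speed of light in m/s
BETA = 0.997        # muon speed as a fraction of c
TAU_0 = 2.197e-6    # mean lifetime in the muon's own frame, in seconds
L_0 = 7000.0        # distance to the ground in the earth frame, in metres

gamma = 1.0 / math.sqrt(1.0 - BETA**2)

# Earth frame: the muon's clock runs slow, so it lives longer and travels further.
tau_earth = gamma * TAU_0
range_earth = tau_earth * BETA * C

# Muon frame: the distance to the ground is contracted.
L_muon = L_0 / gamma
time_for_ground_to_arrive = L_muon / (BETA * C)

print("gamma                      :", gamma)
print("dilated lifetime (s)       :", tau_earth)
print("range in earth frame (m)   :", range_earth)
print("contracted distance (m)    :", L_muon)
print("arrival time, muon frame(s):", time_for_ground_to_arrive)
print("reaches ground (earth view):", range_earth > L_0)
print("reaches ground (muon view) :", time_for_ground_to_arrive < TAU_0)
```

Both observers reach the same verdict, one by dilating the lifetime and the other by contracting the distance.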
So far we have talked about how special relativity and Lorentz trans-
formations show us how quantities previously known to be invariant under
Galilean transformations are no longer invariant under Lorentz transfor-
mations. We saw that space and time aren’t absolute and that observers
disagree about the lengths and duration between events. If space and time
don’t represent absolute invariant quantities in spacetime, then what does?
This is where the spacetime interval comes in. The spacetime interval between
two events A and B is defined as
$$\Delta s^2 = (t_B - t_A)^2 - (x_B - x_A)^2$$
where t is the time in meters. In SI units,
$$\Delta s^2 = c^2(t_B - t_A)^2 - (x_B - x_A)^2$$
Therefore, the spacetime interval has units of length in either system. Is this
quantity invariant? Let’s check. In one frame, the events have the coordinates
(tA, xA) and (tB, xB), and in a frame moving with a velocity v relative
to this frame, the coordinates will be (t'A, x'A) and (t'B, x'B). Calculating the
spacetime interval $\Delta s^2$,
$$\Delta s^2 = (t_B - t_A)^2 - (x_B - x_A)^2$$
and similarly,
$$\Delta s'^2 = (t'_B - t'_A)^2 - (x'_B - x'_A)^2$$
Substituting for the quantities using Lorentz transformations,
$$
\begin{aligned}
\Delta s'^2 &= \left(\gamma(t_B - vx_B) - \gamma(t_A - vx_A)\right)^2 - \left(\gamma(x_B - vt_B) - \gamma(x_A - vt_A)\right)^2 \\
&= \gamma^2\left(t_B - t_A - v(x_B - x_A)\right)^2 - \gamma^2\left(x_B - x_A - v(t_B - t_A)\right)^2 \\
&= \gamma^2\left[(t_B - t_A)^2 - 2v(t_B - t_A)(x_B - x_A) + v^2(x_B - x_A)^2\right. \\
&\qquad \left. - (x_B - x_A)^2 + 2v(t_B - t_A)(x_B - x_A) - v^2(t_B - t_A)^2\right] \\
&= \gamma^2\left[(t_B - t_A)^2(1 - v^2) - (x_B - x_A)^2(1 - v^2)\right] \\
&= (t_B - t_A)^2 - (x_B - x_A)^2 = \Delta s^2
\end{aligned}
$$
so the spacetime interval is the same in both frames. In a frame in which the
two events occur at the same place (xB = xA), this reduces to
$$\Delta s^2 = (t_B - t_A)^2$$
$$\Delta s = t_B - t_A$$
Therefore in this frame, the spacetime interval is just the time measured by
the observer’s clock. The spacetime interval between two events is then just
the time between the events as measured by a clock which was present at
both of those events. Thus the spacetime interval between two events is also
called the proper time between the events.
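The invariance of the interval can also be checked numerically for a few relative velocities. This is a minimal sketch in natural units; the function names and the two sample events are assumptions made for illustration.

```python
import math


def lorentz(t, x, v):
    """Lorentz transformation in natural units (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * (t - v * x), g * (x - v * t)


def interval_squared(event_a, event_b):
    """Spacetime interval (t_B - t_A)^2 - (x_B - x_A)^2 between two events (t, x)."""
    (t_a, x_a), (t_b, x_b) = event_a, event_b
    return (t_b - t_a) ** 2 - (x_b - x_a) ** 2


# Assumed sample events (t, x) in the frame O.
event_a = (1.0, 2.0)
event_b = (4.0, 3.0)

print("in O      :", interval_squared(event_a, event_b))
for v in (-0.9, 0.3, 0.6):
    a_prime = lorentz(*event_a, v)
    b_prime = lorentz(*event_b, v)
    print("v =", v, ":", interval_squared(a_prime, b_prime))
```

The individual time and space separations change from frame to frame, but the printed interval stays the same up to floating-point rounding.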
[Figure: spacetime diagram with axes x and t, origin O, showing the light paths x = −t and x = t and the point t = 5]
$$\Delta s^2 = t^2 - x^2 = \text{constant}$$
Which is the equation for a familiar conic section called the hyperbola. In
spacetime, we call it the invariant hyperbola since it represents the invariant
spacetime interval. For an event with t = 5, x = 0, the invariant hyperbola
would obey the equation
$$t^2 - x^2 = 25$$
and would look like Figure 8. Each of the points on the invariant hyperbola
represents the location of the same event in the spacetime diagrams of different
inertial observers. This also shows us that spacetime has a hyperbolic
geometry.
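One way to see the invariant hyperbola emerge is to take the event t = 5, x = 0 and look at its coordinates in several moving frames. The sketch below does this in natural units; the function name `boost` and the list of velocities are assumed for illustration.

```python
import math


def boost(t, x, v):
    """Coordinates of the event (t, x) in a frame moving with velocity v (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * (t - v * x), g * (x - v * t)


event = (5.0, 0.0)
velocities = (-0.8, -0.4, 0.0, 0.4, 0.8)
points = [boost(*event, v) for v in velocities]

for v, (t_p, x_p) in zip(velocities, points):
    print("v =", v, " t' =", t_p, " x' =", x_p, " t'^2 - x'^2 =", t_p**2 - x_p**2)
```

Every observer assigns the event different coordinates, yet all of the resulting points satisfy t² − x² = 25, so plotting them traces out the upper branch of the hyperbola.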