3.6.3 Pauli’s Generalized Exclusion Principle and the Deuteron
3.6.4 Pion-Nucleon Scattering and Resonances
3.7 The semi-simplicity of (complexified) L(SU(n + 1))
5. Spacetime Symmetry
5.1 The Lorentz Group
5.2 The Lorentz Group and SL(2, C)
5.3 The Lie Algebra L(SO(3, 1))
5.4 Spinors and Invariant Tensors of SL(2, C)
5.4.1 Lorentz and SL(2, C) indices
5.4.2 The Lie algebra of SL(2, C)
5.5 The Poincaré Group
5.5.1 The Poincaré Algebra
5.5.2 Representations of the Poincaré Algebra
5.5.3 Massive Representations of the Poincaré Group: k^µ = (m, 0, 0, 0)
5.5.4 Massless Representations of the Poincaré Group: k^µ = (E, E, 0, 0)
6.3.1 The Yang-Mills Action
6.3.2 The Yang-Mills Equations
6.3.3 The Bianchi Identity
6.4 Yang-Mills Energy-Momentum
6.5 The QCD Lagrangian
1. Introduction to Symmetry and Particles
i) Spacetime symmetries: these are described by the Poincaré group. This is only an
approximate symmetry, because it is broken in the presence of gravity. Gravity is the
weakest of all the interactions involving particles, and we will not consider it here.
ii) Internal symmetries of particles. These relate processes involving different types
of particles. For example, isospin relates u and d quarks. Conservation laws can be
found for particular types of interaction which constrain the possible outcomes. These
symmetries are also approximate; isospin is not exact because there is a (small) mass
difference between mu and md . Electromagnetic effects also break the symmetry.
iii) Gauge symmetries. These lead to specific types of dynamical theories describing
types of particles, and give rise to conserved charges. Gauge symmetries if present,
appear to be exact.
The quark labels u, d, s, c, t, b stand for up, down, strange, charmed, top and bottom.
The quarks carry a fractional electric charge. Each quark has three colour states. Quarks
are not seen as free particles, so their masses are ill-defined (the masses above are “effective”
masses, deduced from the masses of composite particles containing quarks).
The leptons are also spin 1/2 fermions and can be arranged into three families
The leptons carry integral electric charge. The muon µ and tauon τ are heavy unstable
versions of the electron e. Each flavour of charged lepton is paired with a neutral particle
ν, called a neutrino. The neutrinos are stable, and have a very small mass (which is taken
to vanish in the standard model).
All these particles have antiparticles with the same mass and opposite electric charge
(conventionally, for many particles, the antiparticles carry a bar above the symbol, e.g.
the antiparticle of u is ū). The antiparticles of the charged leptons are often denoted by
a change of − to +, so the positron e+ is the antiparticle of the electron e− etc. The
antineutrinos ν̄ differ from the neutrinos ν by a change in helicity (to be defined later...).
Hadrons are made from bound states of quarks (which are colour neutral singlets).
i) The baryons are formed from bound states of three quarks qqq; antibaryons are formed
from bound states of three antiquarks q̄ q̄ q̄
For example, the nucleons are given by
p = uud : 938 MeV
n = udd : 940 MeV
ii) Mesons are formed from bound states of a quark and an antiquark q q̄.
Other particles contain heavier quarks, such as the strange particles K+ = us̄
with mass 494 MeV and Λ = uds with mass 1115 MeV, and charmonium ψ = cc̄ with mass
3.1 GeV.
The gauge particles mediate forces between the hadrons and leptons. They are bosons,
with integral spin.
Particle      Mass (GeV)   Interaction
γ (photon)    0            Electromagnetic
W+            80           Weak
W−            80           Weak
Z0            91           Weak
g (gluon)     0            Strong
The gluons are responsible for interquark forces which bind quarks together in nucleons.
It is conjectured that a spin 2 gauge boson called the graviton is the mediating particle
for gravitational forces, though detecting this is extremely difficult, due to the weakness of
gravitational forces compared to other interactions.
1.2 Interactions
There are three types of interaction which are of importance in particle physics: the strong,
electromagnetic and weak interactions.
• Responsible for binding of quarks to form hadrons (electromagnetic effects are much
• Responsible for binding forces between nucleons p and n, and hence for all nuclear
i) The strong interaction preserves quark flavours, although q q̄ pairs can be produced
and destroyed provided q, q̄ are the same flavour.
An example of this is:
[Quark-flow diagram: π+ (u d̄) + p (uud) → K+ (u s̄) + Σ+ (uus); a d d̄ pair annihilates and an s s̄ pair is created.]
The Σ+ and K + particles decay, but not via the strong interaction, because of con-
servation of strange quarks.
ii) Basic strong forces are “flavour blind”. For example, the interquark forces in the
q q̄ bound states ψ = cc̄ (charmonium) and Υ = bb̄ (bottomonium) are
well-approximated by the potential
\[ V \sim \frac{\alpha}{r} + \beta r \tag{1.1} \]
and the differences in energy levels for these mesons are approximately the same.
The binding energy differences can be attributed to the mass difference of the b and
c quarks.
The electromagnetic interactions are weaker than the strong interactions. They occur in
the interactions between electrically charged particles, such as charged leptons, mediated
by photons.
The simplest electromagnetic process consists of the absorption or emission of a photon
by an electron:
This process cannot occur for a free electron, as it would violate conservation of
4-momentum; rather, it involves electrons in atoms, and the total 4-momentum of the
atom and photon is conserved.
Other examples of electromagnetic interactions are electron scattering mediated by
photon exchange
and there are also smaller contributions to this process from multi-photon exchanges.
Electron-positron interactions are also mediated by electromagnetic interactions
[Feynman diagrams: two contributions to e+ e− → e+ e− scattering.]
The dynamic theory governing electromagnetic interactions is Quantum Electrody-
namics (QED), which is very well tested experimentally.
The weak interaction is considerably weaker than both the strong and electromagnetic
interactions. It is mediated by the charged and neutral vector bosons W± and Z0,
which are very massive and produce only short-range interactions. Weak interactions
occur between all quarks and leptons; however, they are in general negligible when
strong or electromagnetic interactions are present. Only in the absence of strong and
electromagnetic interactions is the weak interaction noticeable.
Unlike the strong and electromagnetic interactions, weak interactions can involve neu-
trinos. Weak interactions, unlike strong interactions, can also produce flavour change in
quarks and neutrinos.
The gauge bosons W ± carry electric charge and they can change the flavour of quarks.
Examples of W -boson mediated weak interactions are n −→ p + e− + ν̄e :
[Quark-flow diagram: n (udd) → p (uud) + e− + ν̄e, with one d quark converting to a u quark.]
and µ− −→ e− + ν̄e + νµ :
and νµ + n → µ− + p
[Quark-flow diagram: νµ + n (udd) → µ− + p (uud), with one d quark converting to a u quark.]
e− ↔ νe , µ− ↔ νµ
u ↔ d, c↔s (1.2)
[Feynman diagram: νµ + e− → νµ + e− scattering.]
with the notable exception of weak neutron decay, which has an average lifetime
of 10^3 s.
Definition 1. There are three lepton numbers. The electron, muon and tauon numbers
are given by
electron e− and anti-neutrino ν̄e could be created. Lepton numbers are conserved in all
There are also various quantum numbers associated with baryons.
These quark quantum numbers, together with N(u) − N(ū) and N(d) − N(d̄), are
conserved in strong and electromagnetic interactions, because in these interactions quarks
and antiquarks are only created or annihilated in pairs. The quark quantum numbers are
not conserved in weak interactions, because it is possible for quark flavours to change.
\[ B = \frac{1}{3}\big(N(q) - N(\bar q)\big) \tag{1.5} \]
where N (q) and N (q̄) are the total number of quarks and antiquarks. Baryons therefore
have B = 1 and antibaryons have B = −1; mesons have B = 0. B is conserved in all
Note that one can write
\[ B = \frac{1}{3}\big(N(u) - N(\bar u) + N(d) - N(\bar d) + C + T - S - \tilde B\big) \tag{1.6} \]
Definition 4. The quantum number Q is the total electric charge. Q is conserved in all
In the absence of charged leptons, such as in strong interaction processes, one can write
\[ Q = \frac{2}{3}\big(N(u) - N(\bar u) + C + T\big) - \frac{1}{3}\big(N(d) - N(\bar d) - S - \tilde B\big) \tag{1.7} \]
Hence, for strong interactions, the four quark quantum numbers S, C, B̃, T together
with Q and B are sufficient to determine N(u) − N(ū) and N(d) − N(d̄).
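As a quick sanity check of (1.6) and (1.7), the following minimal sketch computes B and Q from the net quark content of a few hadrons quoted earlier. The sign conventions S = −(N(s) − N(s̄)), C = N(c) − N(c̄), B̃ = −(N(b) − N(b̄)), T = N(t) − N(t̄) are the usual ones and are assumed here, since the corresponding definitions are not visible in this part of the text; the dictionary `HADRONS` and the function name `quantum_numbers` are purely illustrative.

```python
from fractions import Fraction

# Net quark numbers N(q) - N(qbar) per flavour (illustrative table;
# quark content as quoted in the text, e.g. K+ = u sbar).
HADRONS = {
    "p": {"u": 2, "d": 1},
    "n": {"u": 1, "d": 2},
    "K+": {"u": 1, "s": -1},
    "Lambda": {"u": 1, "d": 1, "s": 1},
}

def quantum_numbers(net):
    """Return (B, Q) from net quark numbers using (1.6) and (1.7)."""
    u, d = net.get("u", 0), net.get("d", 0)
    S = -net.get("s", 0)    # strangeness (assumed sign convention)
    C = net.get("c", 0)     # charm
    Bt = -net.get("b", 0)   # bottomness B-tilde (assumed sign convention)
    T = net.get("t", 0)     # topness
    B = Fraction(u + d + C + T - S - Bt, 3)                           # (1.6)
    Q = Fraction(2, 3) * (u + C + T) - Fraction(1, 3) * (d - S - Bt)  # (1.7)
    return B, Q

for name, net in HADRONS.items():
    B, Q = quantum_numbers(net)
    print(f"{name}: B = {B}, Q = {Q}")
```

Exact rational arithmetic (`Fraction`) is used so that the fractional quark charges combine to integers without rounding.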
2. Elementary Theory of Lie Groups and Lie Algebras
\[ x_\beta \circ x_\alpha^{-1} : x_\alpha(U_\alpha \cap U_\beta) \to x_\beta(U_\alpha \cap U_\beta) \tag{2.1} \]
\[ y_A \circ f \circ x_\alpha^{-1} : x_\alpha(U_\alpha) \to y_A(W_A) \tag{2.3} \]
is smooth as a function Rm → Rn .
x ◦ γ : (a, b) → Rn (2.4)
i) There exists e ∈ G such that g • e = e • g = g for all g ∈ G. e is called an identity
iii) For all g1 , g2 , g3 ∈ G; g1 •(g2 •g3 ) = (g1 •g2 )•g3 , so group multiplication is associative.
It is elementary to see that the identity e is unique, and g has a unique inverse g −1 .
Definition 10. A Lie group G is a smooth differentiable manifold which is also a group,
where the group multiplication • has the following properties
Henceforth, we shall drop the • for group multiplication and just write g1 • g2 = g1 g2 .
Many of the most physically interesting Lie groups are matrix Lie groups in various
dimensions. These are subgroups of GL(n, R) (or GL(n, C)), the n × n real (or complex)
invertible matrices. Group multiplication and inversion are standard matrix multiplication
and inversion.
Suppose that G is a matrix Lie group of dimension k. Let the local co-ordinates
be xi for i = 1, . . . , k. Then g ∈ G is described by its matrix components g AB (xi ) for
A, B = 1, . . . , n. The g AB are smooth functions of the co-ordinates xi . Examples of matrix
Lie groups are (here F = R or F = C):
i) GL(n, F), the invertible n × n matrices over F. The co-ordinates of GL(n, F) are the
n2 real (or complex) components of the matrices.
vi) SU (n) = {M ∈ GL(n, C) : M M † = In and det M = 1}. SU (2) and SU (3) play a
particularly important role in the standard model of particle physics.
There are other examples, some of which we will examine in more detail later. It can
be shown that any closed subgroup H of GL(n, F) (i.e. any subgroup which contains all
its accumulation points) is a Lie group.
Some of these groups are related to each other by group isomorphism; a particularly
simple example is SO(2) ≅ U(1). Elements of U(1) consist of unit-modulus complex
numbers $e^{i\theta}$ for θ ∈ R under multiplication, whereas SO(2) consists of matrices
\[ R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \tag{2.5} \]
which satisfy R(θ+φ) = R(θ)R(φ). The map T : U(1) → SO(2) given by $T(e^{i\theta}) = R(\theta)$
is a group isomorphism.
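The homomorphism property of the map T can be checked numerically. The sketch below is only an illustration for two sample angles (the function names `R` and `T` mirror the notation above; the specific angles are arbitrary choices).

```python
import numpy as np

def R(theta):
    """Rotation matrix R(theta) as in (2.5)."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

def T(z):
    """Map a unit-modulus complex number z = e^{i theta} to R(theta)."""
    return R(np.angle(z))

theta, phi = 0.7, 2.1
z1, z2 = np.exp(1j * theta), np.exp(1j * phi)

# T(z1 z2) = T(z1) T(z2): multiplication in U(1) maps to matrix multiplication.
homomorphism_ok = np.allclose(T(z1 * z2), T(z1) @ T(z2))
# R(theta) is orthogonal with unit determinant, i.e. it lies in SO(2).
orthogonal_ok = np.allclose(R(theta).T @ R(theta), np.eye(2))
det_ok = np.isclose(np.linalg.det(R(theta)), 1.0)
print(homomorphism_ok, orthogonal_ok, det_ok)
```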
We shall say that two points in G are connected if they can be linked with a continuous
curve. This defines an equivalence relation on G, and hence partitions G into equivalence
classes of connected points; the equivalence class of g ∈ G is called the connected component
of g. The equivalence class of points of O(n) connected to I is SO(n), which is connected.
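\[ \dot\gamma_p : f \to \frac{d}{dt}\big(f \circ \gamma(t)\big)\Big|_{t=t_0} \tag{2.7} \]
The components of the tangent vector are
\[ \dot\gamma_p^m = \frac{d}{dt}\big((x \circ \gamma)^m\big)\Big|_{t=t_0} = \dot\gamma_p(x^m) \]
Note that one can write (using the chain rule)
\[
\begin{aligned}
\dot\gamma_p(f) &= \frac{d}{dt}\big(f \circ \gamma(t)\big)\Big|_{t=t_0} \\
&= \frac{d}{dt}\big(f \circ x^{-1} \circ x \circ \gamma(t)\big)\Big|_{t=t_0} \\
&= \sum_{i=1}^{n} \frac{\partial}{\partial x^i}(f \circ x^{-1})\Big|_{x(p)} \Big(\frac{d}{dt}(x \circ \gamma)^i\Big)_{t=t_0} \\
&= \sum_{i=1}^{n} \frac{\partial}{\partial x^i}(f \circ x^{-1})\Big|_{x(p)}\, \dot\gamma_p^i
\end{aligned}
\tag{2.9}
\]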
Proposition 1. The set of all tangent vectors at p forms a n-dimensional vector space
(where n = dim M ), denoted by Tp (M ).
Suppose that p lies in the chart U with local co-ordinates x. Suppose also that V ,
W ∈ Tp (M ) are tangent vectors at p corresponding to the curves γ, σ, where without
loss of generality we can take γ : (a, b) → M , σ : (a, b) → M with a < t0 < b and
γ(t0 ) = σ(t0 ) = p
Take a, b ∈ R. Consider the curve ρ̂ in Rn defined by
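\[
\begin{aligned}
\dot\rho_p(f) &= \sum_{i=1}^{n} \frac{\partial}{\partial x^i}(f \circ x^{-1})\Big|_{x(p)} \Big(\frac{d}{dt}(x \circ \rho)^i\Big)_{t=t_0} \\
&= \sum_{i=1}^{n} \frac{\partial}{\partial x^i}(f \circ x^{-1})\Big|_{x(p)} \Big(\frac{d}{dt}\hat\rho^i(t)\Big)_{t=t_0} \\
&= a \sum_{i=1}^{n} \frac{\partial}{\partial x^i}(f \circ x^{-1})\Big|_{x(p)} \Big(\frac{d}{dt}(x \circ \gamma)^i\Big)_{t=t_0}
 + b \sum_{i=1}^{n} \frac{\partial}{\partial x^i}(f \circ x^{-1})\Big|_{x(p)} \Big(\frac{d}{dt}(x \circ \sigma)^i\Big)_{t=t_0} \\
&= a\,\dot\gamma_p(f) + b\,\dot\sigma_p(f)
\end{aligned}
\tag{2.11}
\]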
Using (2.9) it is straightforward to compute the tangent vectors to the curves ρ(i) at
\[ \dot\rho_{(i)p}(f) = \frac{\partial}{\partial x^i}(f \circ x^{-1})\Big|_{x(p)} \tag{2.13} \]
and hence, if γ is a curve passing through p then (2.9) implies that
\[ \dot\gamma_p(f) = \sum_{i=1}^{n} \dot\rho_{(i)p}(f)\, \dot\gamma_p^i \tag{2.14} \]
and hence it follows that $\dot\gamma_p = \sum_{i=1}^{n} \dot\gamma_p^i\, \dot\rho_{(i)p}$.
Hence the tangent vectors to the curves ρ(i) at p span Tp(M).
Given the expression (2.13), it is conventional to write the tangent vectors to the curves
ρ(i) at p as
\[ \dot\rho_{(i)p} = \Big(\frac{\partial}{\partial x^i}\Big)_p \tag{2.15} \]
Lemma 1. Suppose that M1 , M2 are smooth manifolds of dimension n1 , n2 respectively.
Let M = M1 × M2 be the Cartesian product manifold and suppose p = (p1 , p2 ) ∈ M . Then
$T_p(M) = T_{p_1}(M_1) \oplus T_{p_2}(M_2)$.
Suppose Vp ∈ Tp (M ). Then V is the tangent vector to a smooth curve γ(t), with
γ(t0 ) = p. Write γ(t) = (γ1 (t), γ2 (t)); γi (t) is then a smooth curve in Mi and γi (t0 ) = pi
for i = 1, 2.
Let f be a smooth function f : M → R. Suppose that xa are local co-ordinates on M1
for a = 1, . . . , n1 and y m are local co-ordinates on M2 for m = 1, . . . , n2 corresponding to
charts U1 ⊂ M1 and U2 ⊂ M2 .
Then one has n1 + n2 local co-ordinates z α on M where if q = (q1 , q2 ) ∈ U1 × U2 ,
Note that f1 (q1 ) = f (q1 , q2 ) is a smooth function of q1 when q2 is fixed, and f2 (q2 ) =
f (q1 , q2 ) is a smooth function of q2 when q1 is fixed.
Then using the chain rule
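\[
\begin{aligned}
V_p f &= \sum_{\alpha=1}^{n_1+n_2} \frac{\partial}{\partial z^\alpha}(f \circ z^{-1})\Big|_{z(p)} \frac{d}{dt}\big((z \circ \gamma)^\alpha(t)\big)\Big|_{t=t_0} \\
&= \sum_{a=1}^{n_1} \frac{\partial}{\partial x^a}(f_1 \circ x^{-1})\Big|_{(x(p_1),\,y(q_2))} \frac{d}{dt}\big((x \circ \gamma_1)^a(t)\big)\Big|_{t=t_0} \\
&\quad + \sum_{j=1}^{n_2} \frac{\partial}{\partial y^j}(f_2 \circ y^{-1})\Big|_{(x(p_1),\,y(q_2))} \frac{d}{dt}\big((y \circ \gamma_2)^j(t)\big)\Big|_{t=t_0} \\
&= \big(V(1)_p + V(2)_p\big) f
\end{aligned}
\tag{2.17}
\]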
where V (1)p is the tangent vector to γ1 at p, and V (2)p is the tangent vector to γ2 at
p. Hence Vp = V (1)p + V (2)p . Conversely, given two smooth curves γ1 (t), γ2 (t) in M1 , M2
passing through p1 and p2 at t = t0 , with associated tangent vectors V (1)p and V (2)p , one
can construct the smooth curve γ(t) = (γ1 (t), γ2 (t)) in M passing through p = (p1 , p2 ) at
t = t0 . Then (2.17) shows that V (1)p + V (2)p can be written as Vp ∈ Tp (M ).
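\[ V_p = V_p^i \Big(\frac{\partial}{\partial x^i}\Big)_p \tag{2.21} \]
It is conventional to write
\[ V = V^i \Big(\frac{\partial}{\partial x^i}\Big) \tag{2.22} \]
where $V^i = (V \circ x^{-1})(x^i)$ are functions $\mathbb{R}^n \to \mathbb{R}$ and $\big(\frac{\partial}{\partial x^i}\big)$ is a locally defined vector
field satisfying
\[ \Big(\frac{\partial}{\partial x^i}\Big) x^j = \delta_i^j \tag{2.23} \]
It follows that T(M) is n-dimensional with a local basis given by the $\big(\frac{\partial}{\partial x^i}\big)$. The vector
field is called smooth if the functions $V^i$ are smooth functions on $\mathbb{R}^n$.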
Suppose now that f is a smooth function on M and that V , W are vector fields on M .
Then note that V f can be regarded as a function M → R defined by
(V f )(p) = Vp f (2.24)
Hence one can act on V f with Wp at some p ∈ M to find
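\[
\begin{aligned}
W_p(V f) &= W_p^i \Big(\frac{\partial}{\partial x^i}\Big)_p (V f) \\
&= W_p^i \frac{\partial}{\partial x^i}\Big(V^j \frac{\partial}{\partial x^j}(f \circ x^{-1})\Big)\Big|_{x(p)} \\
&= W_p^i \frac{\partial V^j}{\partial x^i}\Big|_{x(p)} \Big(\frac{\partial}{\partial x^j}(f \circ x^{-1})\Big)\Big|_{x(p)}
 + W_p^i V_p^j \Big(\frac{\partial^2}{\partial x^i \partial x^j}(f \circ x^{-1})\Big)\Big|_{x(p)}
\end{aligned}
\]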
The fact that there are second order derivatives acting on f means that we cannot
write Wp (V f ) = Zp f for some vector field Z.
However, these second order derivatives can be removed by taking the difference
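\[ W_p(V f) - V_p(W f) = \Big(W_p^i \frac{\partial V^j}{\partial x^i}\Big|_{x(p)} - V_p^i \frac{\partial W^j}{\partial x^i}\Big|_{x(p)}\Big) \Big(\frac{\partial}{\partial x^j}(f \circ x^{-1})\Big)\Big|_{x(p)} \tag{2.26} \]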
which can be written as Zp f where Z is a vector field called the commutator or alter-
natively the Lie bracket of W and V which we denote by [W, V ] with components
\[ [W, V]^j = W^i \frac{\partial V^j}{\partial x^i} - V^i \frac{\partial W^j}{\partial x^i} \tag{2.27} \]
Prove that the Lie bracket satisfies
iii) The Jacobi identity: [[X, Y ], Z] + [[Z, X], Y ] + [[Y, Z], X] = 0 for all X, Y , Z ∈ T (M ).
\[ \frac{d}{dt}\big(\sigma^i(t)\big) = V^i_{\sigma(t)} \tag{2.28} \]
where in a slight abuse of notation, σ i (t) = (x ◦ σ)i (t) for some local co-ordinates x.
Such a curve is guaranteed to exist and to be unique (at least locally, given an initial
condition), by the standard existence and uniqueness theorems for ODEs.
2.6 Push-Forwards of Vector Fields
Suppose that M , N are two smooth manifolds and f : M → N is a smooth map. Then
there is an induced map
f∗ : T (M ) → T (N ) (2.29)
which maps the tangent vector of a curve γ passing through a point p ∈ M to the
tangent vector of the curve f ◦ γ passing through f (p) ∈ N .
In particular, for each smooth function h on N , and if γ is a curve passing through
p ∈ M with γ(0) = p, and if Vp ∈ Tp (M ) is the tangent vector of γ at p then f∗ Vp ∈ Tf (p) (N )
is given by
\[ (f_* V_p) h = \frac{d}{dt}\big(h \circ (f \circ \gamma)\big)\Big|_{t=0} = V_p(h \circ f) \tag{2.30} \]
Hence it is clear that the push-forward map f∗ is linear on the space of tangent vectors.
Note that if M , N and Q are manifolds, and f : M → N , g : N → Q are smooth
functions then if h : Q → R is smooth and p ∈ M ,
(g ◦ f )∗ Vp (h) = Vp (h ◦ (g ◦ f ))
= Vp ((h ◦ g) ◦ f )
= (f∗ Vp )(h ◦ g)
= g∗ (f∗ Vp ) (h) (2.31)
and hence
(g ◦ f )∗ = g∗ ◦ f∗ (2.32)
La g = ag (2.33)
La defined in this fashion is a differentiable invertible map from G onto G. Hence, one
can construct the push-forward La∗ of vector fields on G with respect to La .
Given v ∈ Te (G) one can construct a unique left-invariant vector field X(v) ∈ T (G)
with the property that X(v)e = v using the push-forward by
X(v)|g = Lg∗ v (2.35)
To see that X(v) is left-invariant, note that
Proposition 2. The set of left-invariant vector fields is closed under the Lie bracket, i.e.
if X, Y ∈ T (G) are left-invariant then so is [X, Y ].
Suppose that f : G → R is a smooth function. Then
La∗ [X, Y ]g f = [X, Y ]g (f ◦ La )
= Xg (Y (f ◦ La )) − Yg (X(f ◦ La )) (2.38)
\[
\begin{aligned}
\big(Y(f \circ L_a)\big)(g) &= Y_g(f \circ L_a) \\
&= (L_{a*} Y_g) f \\
&= Y_{ag}(f) \\
&= (Y f)(ag) \\
&= \big((Y f) \circ L_a\big)(g)
\end{aligned}
\tag{2.41}
\]
so Y (f ◦ La ) = (Y f ) ◦ La
2.8 Lie Algebras
Definition 13. Suppose that G is a Lie group. Then the Lie algebra L(G) associated
with G is Te (G), the tangent space of G at the origin, together with a Lie bracket [ , ] :
L(G) × L(G) → L(G) which is defined by
for v, w ∈ Te (G), L∗ v and L∗ w denote the smooth vector fields on G obtained by pushing
forward v and w by left-multiplication (i.e. L∗ v|g = Lg∗ v), and [L∗ v, L∗ w] is the standard
vector field commutator. As the Lie bracket on L(G) is obtained from the commutator of
vector fields, it follows that the Lie bracket is
ii) Linear: [αv1 + βv2 , w] = α[v1 , w] + β[v2 , w] for α, β constants and v1 , v2 , w ∈ L(G),
iii) and satisfies the Jacobi identity: [[v, w], z] + [[z, v], w] + [[w, z], v] = 0 for all v, w,
z ∈ L(G).
where (ii) follows because the push forward map is linear on the space of vector fields,
and (iii) follows because as a consequence of Proposition 2, Lg∗ [v, w] = [L∗ v, L∗ w]g .
More generically, one can also define a Lie algebra to be a vector space g equipped
with a map [ , ] : g × g → g satisfying (i), (ii), (iii) above.
Definition 14. Suppose that {Ti : i = 1, . . . , n} is a basis for L(G). Then the Ti are called
generators of the Lie algebra. As [Ti , Tj ] ∈ L(G) it follows that there are constants $c_{ij}{}^{k}$
such that
\[ [T_i, T_j] = c_{ij}{}^{k}\, T_k \]
The constants $c_{ij}{}^{k}$ are called the structure constants of the Lie algebra.
The structure constants are constrained by the antisymmetry of the Lie bracket to be
antisymmetric in the first two indices:
\[ c_{ij}{}^{k} = -c_{ji}{}^{k} \]
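To make the definition of structure constants concrete, here is a minimal numerical sketch. It assumes, purely for illustration, the basis $(T_i)_{jk} = -\epsilon_{ijk}$ of 3 × 3 antisymmetric matrices (a basis of L(SO(3)), for which the structure constants are $\epsilon_{ijk}$); the helper name `structure_constants` is hypothetical. The constants are obtained by expanding each commutator in the basis, and the antisymmetry and Jacobi constraints are then checked.

```python
import numpy as np

def structure_constants(basis):
    """Expand each [T_i, T_j] in the given basis; c[i, j, k] = c_ij^k."""
    n = len(basis)
    # Columns of A are the flattened basis matrices.
    A = np.stack([T.ravel() for T in basis], axis=1)
    c = np.zeros((n, n, n))
    for i in range(n):
        for j in range(n):
            comm = basis[i] @ basis[j] - basis[j] @ basis[i]
            c[i, j] = np.linalg.lstsq(A, comm.ravel(), rcond=None)[0]
    return c

# Levi-Civita symbol eps[i, j, k].
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0
    eps[j, i, k] = -1.0

basis = [-eps[i] for i in range(3)]   # (T_i)_{jk} = -eps_{ijk}
c = structure_constants(basis)

antisymmetric = np.allclose(c, -c.transpose(1, 0, 2))
# Jacobi identity in terms of structure constants:
# c_ij^m c_mk^n + (cyclic permutations of i, j, k) = 0.
J = np.einsum("ijm,mkn->ijkn", c, c)
jacobi = np.allclose(J + J.transpose(1, 2, 0, 3) + J.transpose(2, 0, 1, 3), 0.0)
print(antisymmetric, jacobi, np.allclose(c, eps))
```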
2.9 Matrix Lie Algebras
The Lie algebras of matrix Lie groups are of particular interest. Suppose that G is a matrix
Lie group, and V ∈ T (G) is a smooth vector field. Let f be a smooth function of the matrix
components g AB . Then if h ∈ G,
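\[
\begin{aligned}
V_h f &= V_h^m \frac{\partial f}{\partial x^m} \\
&= V_h^m \frac{\partial g^{AB}}{\partial x^m} \frac{\partial f}{\partial g^{AB}} \\
&= V_h^{AB} \frac{\partial f}{\partial g^{AB}}
\end{aligned}
\tag{2.48}
\]
where
\[ V_h^{AB} = V_h^m \frac{\partial g^{AB}}{\partial x^m} \tag{2.49} \]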
defines a tangent matrix associated with the components Vhm of V at h. Each vector
field has a corresponding tangent matrix, and it will often be most convenient to deal with
these matrices instead of more abstract vector fields as differential operators.
In particular, if γ(t) is some curve in G with tangent vector V then
\[ V f = \frac{d}{dt}\big(f \circ \gamma(t)\big) = \frac{dg^{AB}}{dt} \frac{\partial f}{\partial g^{AB}} \tag{2.50} \]
hence the tangent vector to the curve corresponds to the matrix $\frac{dg^{AB}}{dt}$. We will
frequently denote the identity element of a matrix Lie group by e = I.
Examples of matrix Lie algebras are
• a) GL(n, R): the co-ordinates of GL(n, R) are the $n^2$ components of the matrices, so
GL(n, R) is $n^2$-dimensional. There is no restriction on tangent matrices to curves in
GL(n, R); the space of tangent vectors is $M_{n\times n}(\mathbb{R})$, the set of n × n real matrices.
• b) GL(n, C): the co-ordinates of GL(n, C) are the $n^2$ complex components of the matrices, so
GL(n, C) is $2n^2$-dimensional when viewed as a real manifold. There is no restriction
on tangent matrices to curves in GL(n, C); the space of tangent vectors is $M_{n\times n}(\mathbb{C})$,
the set of n × n complex matrices.
• c) SL(n, R): Suppose that M(t) is a curve in SL(n, R) with M(0) = I. To compute
the restrictions on the tangent vectors to the curve note that det M(t) = 1;
differentiating with respect to t gives
\[ \mathrm{Tr}\Big(M^{-1}(t) \frac{dM(t)}{dt}\Big) = 0 \tag{2.52} \]
and so if we denote the tangent vector at the identity by $m = \frac{dM(t)}{dt}\big|_{t=0}$ then
Tr m = 0. The tangent vectors correspond to traceless matrices. Hence SL(n, R) is
$n^2 - 1$ dimensional.
• d) O(n): suppose that M(t) is a curve in O(n) with M(0) = I. To compute the
restrictions on the tangent vectors to the curve note that $M(t) M(t)^T = I$;
differentiating with respect to t gives
\[ \frac{dM(t)}{dt} M(t)^T + M(t) \frac{dM(t)^T}{dt} = 0 \tag{2.54} \]
and hence if $m = \frac{dM(t)}{dt}\big|_{t=0}$ then $m + m^T = 0$. The tangents to the curve at the identity
correspond to antisymmetric matrices. There are $\frac{1}{2}n(n-1)$ linearly independent
antisymmetric matrices, hence O(n) is $\frac{1}{2}n(n-1)$-dimensional.
Note that the Lie algebra of SO(2) is 1-dimensional and is spanned by
\[ T_1 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \tag{2.55} \]
Show that the tangent vectors of U (n) at I consist of antihermitian matrices, and the
tangent vectors of SU (n) at I are traceless antihermitian matrices.
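The constraints on tangent matrices for O(n) and SL(n, R) derived above can be illustrated numerically. The sketch below differentiates two sample curves through the identity by central finite differences; the particular curves `rot_curve` and `sl2_curve` and the helper `tangent_at_identity` are illustrative choices, not taken from the text.

```python
import numpy as np

def tangent_at_identity(curve, h=1e-6):
    """Central finite-difference approximation to dM/dt at t = 0."""
    return (curve(h) - curve(-h)) / (2 * h)

def rot_curve(t):
    """A curve in O(3) with M(0) = I: a rotation about z followed by one about x."""
    cz, sz = np.cos(t), np.sin(t)
    cx, sx = np.cos(2 * t), np.sin(2 * t)
    Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    return Rz @ Rx

def sl2_curve(t):
    """A curve in SL(2, R) with M(0) = I and det M(t) = 1 for t > -1."""
    return np.array([[1.0 + t, t], [0.0, 1.0 / (1.0 + t)]])

m_o = tangent_at_identity(rot_curve)
m_sl = tangent_at_identity(sl2_curve)

print(np.allclose(m_o + m_o.T, 0.0, atol=1e-6))   # antisymmetric tangent for O(3)
print(abs(np.trace(m_sl)) < 1e-6)                 # traceless tangent for SL(2, R)
```

The same finite-difference check could be applied to curves in U(n) or SU(n) as a numerical companion to the exercise above.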
Proposition 3. Suppose that G is a matrix Lie group and V is a smooth vector field on
G and a ∈ G is fixed. If V̂ denotes the tangent matrix associated with V , then the tangent
matrix associated with the push-forward La∗ V is aV̂ .
Suppose h ∈ G, and f : G → R is a smooth function on G. Consider the tangent
vector La∗ Vh defined at ah
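\[ (L_{a*} V_h) f = V_h(f \circ L_a) = V_h \tilde f \tag{2.56} \]
with $\tilde f = f \circ L_a$, so that
\[
\begin{aligned}
(L_{a*} V_h) f &= V_h^{AB} \frac{\partial \tilde f}{\partial g^{AB}}\Big|_h \\
&= V_h^{AB} \frac{\partial f}{\partial g^{PQ}}\Big|_{ah} \frac{\partial}{\partial g^{AB}}\big((ag)^{PQ}\big) \\
&= V_h^{AB} \frac{\partial f}{\partial g^{PB}}\Big|_{ah}\, a^{PA} \\
&= (a\hat V)^{AB} \frac{\partial f}{\partial g^{AB}}\Big|_{ah}
\end{aligned}
\tag{2.57}
\]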
Proposition 4. Suppose that G is a matrix Lie group and that v, w ∈ Te (G) and V , W
are the left-invariant vector fields defined by Vg = Lg∗ v, Wg = Lg∗ w. Then the matrix
associated with [V, W ]e is the matrix commutator [v̂, ŵ], where v̂ and ŵ are the matrices
associated with v and w.
By definition, the matrix associated with [V, W ] is
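\[
\begin{aligned}
[V, W]^{AB} &= [V, W]^m \frac{\partial g^{AB}}{\partial x^m} \\
&= \Big(V^p \frac{\partial W^m}{\partial x^p} - W^p \frac{\partial V^m}{\partial x^p}\Big) \frac{\partial g^{AB}}{\partial x^m} \\
&= V^p \frac{\partial}{\partial x^p}\Big(W^m \frac{\partial g^{AB}}{\partial x^m}\Big) - W^p \frac{\partial}{\partial x^p}\Big(V^m \frac{\partial g^{AB}}{\partial x^m}\Big) \\
&= V^p \frac{\partial \hat W^{AB}}{\partial x^p} - W^p \frac{\partial \hat V^{AB}}{\partial x^p}
\end{aligned}
\tag{2.58}
\]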
where V̂ and Ŵ denote the matrices associated with V and W . But from the previous
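proposition $\hat V_g^{AB} = g^{AC} \hat v^{CB}$ and $\hat W_g^{AB} = g^{AC} \hat w^{CB}$ so
\[
\begin{aligned}
[V, W]_e^{AB} &= V^p \frac{\partial g^{AC}}{\partial x^p}\Big|_e \hat w^{CB} - W^p \frac{\partial g^{AC}}{\partial x^p}\Big|_e \hat v^{CB} \\
&= \hat v^{AC} \hat w^{CB} - \hat w^{AC} \hat v^{CB} \\
&= [\hat v, \hat w]^{AB}
\end{aligned}
\tag{2.59}
\]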
as required.
We have therefore shown that if G is a matrix Lie group then the elements of L(G) can be
associated with matrices, and the Lie bracket is then simply standard matrix commutation
by Proposition 4 (which, as can be checked directly, satisfies all three of the Lie bracket
properties for Lie algebras). In the literature, it is often conventional to denote the Lie
algebra of SO(n) by so(n), that of SU(n) by su(n), that of U(n) by u(n), etc. We will
however continue to use the notation L(G) for the Lie algebra of Lie group G.
Observe that the image [L(G), L(G)] under the Lie bracket need not be the whole of
L(G). This is clear for SO(2), as the Lie bracket vanishes identically in that case. Recall
that the Lie bracket on R viewed as a Lie group under addition vanishes identically as
well. If G is a connected 1-dimensional Lie group then G must either be isomorphic to R
or SO(2) (equivalently U (1)).
Proposition 5. Suppose that V is a left-invariant vector field. Let σ(t) be the integral
curve of V which passes through e when t = 0.
Then σ(t) is a 1-parameter subgroup of G.
Let x denote some local co-ordinates.
Consider the curves χ1 (t) = σ(s)σ(t) and χ2 (t) = σ(s + t) for fixed s.
These satisfy the same initial conditions χ1 (0) = χ2 (0) = σ(s).
By definition, χ2 satisfies the ODE
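\[
\begin{aligned}
\frac{d}{dt}\big((x \circ \chi_2(t))^n\big) &= \frac{d}{d(s+t)}\big((x \circ \sigma(s+t))^n\big) \\
&= V_{\sigma(s+t)}(x^n) \\
&= V_{\chi_2(t)}(x^n)
\end{aligned}
\tag{2.60}
\]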
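\[
\begin{aligned}
\frac{d}{dt}\big((x \circ \chi_1(t))^n\big) &= \frac{d}{dt}\big(x \circ L_{\sigma(s)} \circ \sigma(t)\big)^n \\
&= \frac{d}{dt}\big((x \circ L_{\sigma(s)} \circ x^{-1}) \circ (x \circ \sigma)(t)\big)^n \\
&= \frac{\partial}{\partial x^m}\big(x \circ L_{\sigma(s)} \circ x^{-1}\big)^n\Big|_{x \circ \sigma(t)} \frac{d}{dt}\big((x \circ \sigma(t))^m\big)
\end{aligned}
\]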
where we have used the chain rule. But by definition of σ(t),
\[ \frac{d}{dt}\big((x \circ \sigma(t))^m\big) = V^m_{\sigma(t)} \]
Hence, substituting this into the above:
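\[
\begin{aligned}
\frac{d}{dt}\big((x \circ \chi_1(t))^n\big) &= \frac{\partial}{\partial x^m}\big(x \circ L_{\sigma(s)} \circ x^{-1}\big)^n\Big|_{x \circ \sigma(t)} V^m_{\sigma(t)} \\
&= V_{\sigma(t)}\big((x \circ L_{\sigma(s)})^n\big) \\
&= (L_{\sigma(s)*} V_{\sigma(t)})(x^n) \quad \text{(by definition of push-forward)} \\
&= V_{\chi_1(t)}(x^n) \quad \text{(as } V \text{ is left-invariant)}
\end{aligned}
\tag{2.63}
\]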
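\[
\begin{aligned}
V_{\sigma(t)} f &= \frac{d}{dt}(f \circ \sigma)(t) \\
&= \lim_{h \to 0} \frac{f(\sigma(t+h)) - f(\sigma(t))}{h} \\
&= \lim_{h \to 0} \frac{f(\sigma(t)\sigma(h)) - f(\sigma(t))}{h} \\
&= \frac{d}{dt'}\big(f \circ L_{\sigma(t)} \circ \sigma(t')\big)\Big|_{t'=0} \\
&= (L_{\sigma(t)*} v) f
\end{aligned}
\tag{2.64}
\]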
so Vσ(t) = Lσ(t)∗ v.
From this we obtain the corollary
Corollary 1. Suppose that σ(t), µ(t) are two 1-parameter subgroups of G with tangent
vectors V , W respectively, with Ve = We = u. Then σ(t) = µ(t) for all t.
Note that
\[ \frac{d}{dt}(x \circ \sigma(t))^n = V_{\sigma(t)} x^n = (L_{\sigma(t)*} u) x^n \tag{2.65} \]
and also
\[ \frac{d}{dt}(x \circ \mu(t))^n = W_{\mu(t)} x^n = (L_{\mu(t)*} u) x^n \tag{2.66} \]
So x ◦ σ and x ◦ µ satisfy the same ODE and with the same initial conditions, hence
σ(t) = µ(t).
2.11 Exponentiation
Definition 16. Suppose v ∈ Te (G). Then we define the exponential map exp : Te (G) → G by
\[ \exp(v) = \sigma_v(1) \]
where σv (t) denotes the 1-parameter subgroup generated by X(v), and X(v) is the
left-invariant vector field obtained via the push-forward $X(v)_g = L_{g*} v$.
Note that exp(0) = e.
\[ \frac{d}{dt}(x \circ \sigma(at))^n\Big|_{t=0} = a \frac{d}{d(at)}(x \circ \sigma(at))^n\Big|_{at=0} = a v^n \]
So σv (at) and σav (t) have the same tangent vector av at the origin. Therefore σv (at) =
σav (t).
as required.
2.12 Exponentiation on matrix Lie groups
Suppose that G is a matrix Lie group, and v ∈ Te (G) is some tangent matrix. The
exponential exp(tv) produces a curve in G with $\frac{d}{dt}(\exp(tv))|_{t=0} = v$ satisfying
$\exp((t_1 + t_2)v) = \exp(t_1 v)\exp(t_2 v)$.
It is then straightforward to show that
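\[
\begin{aligned}
\frac{d}{dt}(\exp(tv))\Big|_{t=t_0} &= \lim_{t \to 0} t^{-1}\big(\exp((t_0 + t)v) - \exp(t_0 v)\big) \\
&= \lim_{t \to 0} t^{-1}\big(\exp(tv) - I\big)\exp(t_0 v) \\
&= v \exp(t_0 v)
\end{aligned}
\tag{2.71}
\]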
Similarly, one also finds $\frac{d}{dt}(\exp(tv))|_{t=t_0} = \exp(t_0 v)v$, so v commutes with exp(tv).
It is clear that $\frac{d}{dt}\exp(tv) = v\exp(tv)$ implies that exp(tv) is infinitely differentiable (as
expected as the integral curve is smooth by construction). Then by elementary analysis,
one can compute the power series expansion for exp(tv) as
\[ \exp(tv) = \sum_{n=0}^{\infty} \frac{t^n v^n}{n!} \tag{2.72} \]
with a remainder term which converges to 0 (with respect to the supremum norm on
matrices, for example). Hence, for matrix Lie groups, the Lie group exponential operator
corresponds to the usual operation of matrix exponentiation.
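The statements above can be illustrated with a short numerical sketch: a truncated version of the power series (2.72), a check of the 1-parameter subgroup property, and a finite-difference check that the tangent at t = 0 is v. The helper name `exp_series`, the truncation at 30 terms, and the sample matrix are illustrative assumptions; a production implementation would normally use a library routine instead.

```python
import numpy as np

def exp_series(v, terms=30):
    """Truncated power series sum_{n < terms} v^n / n! (adequate for small matrices)."""
    result = np.eye(v.shape[0])
    term = np.eye(v.shape[0])
    for n in range(1, terms):
        term = term @ v / n          # builds v^n / n! iteratively
        result = result + term
    return result

v = np.array([[0.0, 1.0, 0.0],
              [-1.0, 0.0, 2.0],
              [0.0, -2.0, 0.0]])
t1, t2 = 0.3, 0.9

# 1-parameter subgroup property: exp((t1 + t2) v) = exp(t1 v) exp(t2 v).
subgroup_ok = np.allclose(exp_series((t1 + t2) * v),
                          exp_series(t1 * v) @ exp_series(t2 * v))

# Tangent at the identity: d/dt exp(t v) at t = 0 equals v.
h = 1e-6
tangent = (exp_series(h * v) - exp_series(-h * v)) / (2 * h)
tangent_ok = np.allclose(tangent, v, atol=1e-6)

# v commutes with exp(t v).
commutes_ok = np.allclose(v @ exp_series(t1 * v), exp_series(t1 * v) @ v)
print(subgroup_ok, tangent_ok, commutes_ok)
```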
Comment: Suppose that G1 and G2 are Lie groups. Then G = G1 × G2 is a Lie group,
and by Lemma 1, L(G) = L(G1) ⊕ L(G2).
Conversely, suppose Lie groups G, G1 , G2 are such that L(G) = L(G1) ⊕ L(G2). Then
by exponentiation, it follows that, at least in a local neighbourhood of e, G has the local
geometric structure of G1 × G2 . However, as it is not in general possible to reconstruct the
whole group in this fashion, one cannot say that G = G1 × G2 globally (typically there will
be some periodic identification somewhere in the Cartesian product group).
In general, one cannot reconstruct the entire Lie group by exponentiating elements of
the Lie algebra. Consider for example, SO(2) and O(2). Both L(O(2)) and L(SO(2)) are
generated by
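\[ T_1 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \tag{2.73} \]
Exponentiating gives
\[ e^{\theta T_1} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \]
which always has determinant +1. So SO(2) = exp(L(SO(2))) but O(2) ≠ exp(L(O(2))).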
However, there do exist neighbourhoods B0 of 0 ∈ L(G) and B1 of I ∈ G such that the
map exp : B0 → B1 is invertible. (The inverse is called log by convention).
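The SO(2) versus O(2) example can be checked numerically. The sketch below exponentiates θT1 for several angles (using a truncated series, with the helper name `exp_theta_T1` chosen for illustration) and confirms that the determinant is always +1, so a reflection such as diag(1, −1), which has determinant −1, is never reached.

```python
import numpy as np

T1 = np.array([[0.0, 1.0], [-1.0, 0.0]])

def exp_theta_T1(theta, terms=40):
    """Truncated power series for exp(theta * T1)."""
    result = np.eye(2)
    term = np.eye(2)
    for n in range(1, terms):
        term = term @ (theta * T1) / n
        result = result + term
    return result

thetas = np.linspace(0.0, 2.0 * np.pi, 9)
dets = [np.linalg.det(exp_theta_T1(th)) for th in thetas]

# Closed form: exp(theta T1) = cos(theta) I + sin(theta) T1, since T1^2 = -I.
closed_form_ok = all(
    np.allclose(exp_theta_T1(th), np.cos(th) * np.eye(2) + np.sin(th) * T1)
    for th in thetas
)

reflection = np.diag([1.0, -1.0])   # an element of O(2) not in SO(2)
print(np.allclose(dets, 1.0), closed_form_ok, np.linalg.det(reflection))
```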
Suppose that G is a matrix Lie group, and let V be a left-invariant vector field on G, and
suppose that the associated tangent matrix to V at the identity is v̂.
Then if x are some local co-ordinates on G, we know that
\[ g(x)\hat v = V^m_{g(x)} \frac{\partial g(x)}{\partial x^m} \tag{2.75} \]
From this formula, it is clear that if h ∈ G is a constant matrix then
\[ V^m_{g(x)} = V^m_{hg(x)} \tag{2.76} \]
We wish to construct an analogous integral over a matrix Lie group G. Suppose that
x, y are co-ordinates on G and define
\[ d^n x = J^{-1} d^n y \tag{2.80} \]
where J is the Jacobian $J = \det\big(\frac{\partial y^i}{\partial x^j}\big)$.
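\[ \mu_i|_{g(x)} = \mu^j_{i,g(x)} \frac{\partial}{\partial x^j} \tag{2.81} \]
Then we have
\[ \mu^j_{i,g(x)} \frac{\partial}{\partial x^j} = \mu^j_{i,g(x)} \frac{\partial y^k}{\partial x^j} \frac{\partial}{\partial y^k} = \mu^j_{i,g(y)} \frac{\partial}{\partial y^j} \tag{2.82} \]
\[ \mu^j_{i,g(y)} = \mu^k_{i,g(x)} \frac{\partial y^j}{\partial x^k} \tag{2.83} \]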
and hence
Definition 17. The Haar measure is defined by
dn x det(µji,g(x) ) (2.85)
It can be shown that the Haar measure (up to multiplication by a non-zero constant)
is the unique measure with this property.
Example: SL(2, R)
Consider g ∈ SL(2, R),
\[ g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \tag{2.89} \]
for a, b, c, d ∈ R constrained by ad − bc = 1. Note that
\[ g^{-1} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \tag{2.90} \]
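\[
g^{-1}\frac{\partial g}{\partial x^1} = \begin{pmatrix} c & d \\ -\frac{c^2}{d} & -c \end{pmatrix}, \quad
g^{-1}\frac{\partial g}{\partial x^2} = \begin{pmatrix} 0 & 0 \\ \frac{1}{d} & 0 \end{pmatrix}, \quad
g^{-1}\frac{\partial g}{\partial x^3} = \begin{pmatrix} -a & -b \\ \frac{ac}{d} & a \end{pmatrix}
\]
\[
v_1 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad
v_2 = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad
v_3 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}
\tag{2.92}
\]
\[
\begin{aligned}
v_1 &= -b\, g^{-1}\frac{\partial g}{\partial x^1} + c\, g^{-1}\frac{\partial g}{\partial x^2} - d\, g^{-1}\frac{\partial g}{\partial x^3} \\
v_2 &= d\, g^{-1}\frac{\partial g}{\partial x^2} \\
v_3 &= a\, g^{-1}\frac{\partial g}{\partial x^1} + c\, g^{-1}\frac{\partial g}{\partial x^3}
\end{aligned}
\tag{2.93}
\]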
It follows that the left-invariant vector fields obtained from pushing-forward the vector
fields associated with v1 , v2 , v3 at the identity with L∗ are
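\[
\begin{aligned}
\mu_1 &= -b \frac{\partial}{\partial x^1} + c \frac{\partial}{\partial x^2} - d \frac{\partial}{\partial x^3} \\
\mu_2 &= d \frac{\partial}{\partial x^2} \\
\mu_3 &= a \frac{\partial}{\partial x^1} + c \frac{\partial}{\partial x^3}
\end{aligned}
\tag{2.94}
\]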
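The SL(2, R) computation above can be verified numerically. The sketch below assumes the local co-ordinates are (x¹, x², x³) = (b, c, d) with a = (1 + bc)/d; this choice is inferred from the form of the matrices $g^{-1}\partial g/\partial x^i$ rather than stated explicitly in the visible text. The helper names `g_of` and `maurer_cartan` are illustrative, and the derivatives are approximated by central finite differences.

```python
import numpy as np

def g_of(x):
    """Group element with co-ordinates x = (b, c, d) and a fixed by ad - bc = 1."""
    b, c, d = x
    a = (1.0 + b * c) / d
    return np.array([[a, b], [c, d]])

def maurer_cartan(x, h=1e-6):
    """Finite-difference approximations to g^{-1} dg/dx^i for i = 1, 2, 3."""
    g_inv = np.linalg.inv(g_of(x))
    out = []
    for i in range(3):
        e = np.zeros(3)
        e[i] = h
        dg = (g_of(x + e) - g_of(x - e)) / (2 * h)
        out.append(g_inv @ dg)
    return out

x = np.array([0.3, -0.5, 1.7])        # arbitrary sample point with d != 0
b, c, d = x
a = (1.0 + b * c) / d
M1, M2, M3 = maurer_cartan(x)

# The linear combinations in (2.93) should reproduce the basis (2.92).
v1 = -b * M1 + c * M2 - d * M3
v2 = d * M2
v3 = a * M1 + c * M3
print(np.round(v1, 6), np.round(v2, 6), np.round(v3, 6), sep="\n")
```

The coefficients multiplying each $g^{-1}\partial g/\partial x^i$ are exactly the components of the left-invariant vector fields µ1, µ2, µ3 in (2.94).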
Definition 20. If G is a matrix Lie group which is a subgroup of GL(n, R) or GL(n, C)
then the group elements themselves act directly on n-component vectors. The fundamental
representation is then defined by D(g) = g.
Definition 21. If G is a matrix Lie group then the adjoint representation Ad : G →
GL(L(G)) is defined by
Di (g) denotes D(g) restricted to Wi .
Proposition 10. Schur’s First Lemma: Suppose that D1 and D2 are two irreducible rep-
resentations of G acting on V1 and V2 respectively and there exists a linear transformation
A : V1 → V2 such that
AD1 (g) = D2 (g)A (2.102)
for all g ∈ G. Then either D1 and D2 are equivalent representations, or A = 0.
Proof First note that
Ker A = {ψ ∈ V1 : Aψ = 0} (2.103)
is an invariant subspace of D1 , because if ψ ∈ Ker A then
Definition 28. Suppose that D1 and D2 are representations of the Lie group G over vector
spaces V1 and V2 respectively. Let V = V1 ⊗ V2 be the standard tensor product vector space
of V1 and V2 consisting of elements v1 ⊗ v2 (v1 ∈ V1 and v2 ∈ V2 ) in the vector space dual
to the space of bilinear forms on V1 × V2 . If v1 ⊗ v2 ∈ V then v1 ⊗ v2 acts linearly on
bilinear forms Ω via v1 ⊗ v2 Ω = Ω(v1 , v2 ). V is equipped with pointwise addition and scalar
multiplication which satisfy (v1 + w1 ) ⊗ (v2 + w2 ) = v1 ⊗ v2 + v1 ⊗ w2 + w1 ⊗ v2 + w1 ⊗ w2
and α(v1 ⊗ v2 ) = (αv1 ) ⊗ v2 = v1 ⊗ (αv2 ).
Then the tensor product representation D is defined as a linear map on V satisfying
so D(g1 g2 ) = D(g1 )D(g2 ). Hence, this together with D(e) = 1 implies that D(g) is
invertible. So D(g) is a representation.
Note that if D1 is irreducible on V1 and D2 is irreducible on V2 then D = D1 ⊗ D2 is
not generally irreducible on V = V1 ⊗ V2 . Indeed, we shall be particularly interested in
decomposing D into irreducible components in several explicit examples.
Definition 30. The trivial representation of L(G) on V is given by d(X) = 0 for all
X ∈ L(G)
Definition 31. If G is a matrix Lie group and hence L(G) is a matrix Lie algebra, then
the tangent vectors can be regarded as matrices acting directly on n-component vectors.
Then we define the fundamental representation of L(G) on V by d(X) = X
There is a particularly natural representation associated with any Lie algebra.
Definition 32. Let L(G) be a Lie algebra. Then the adjoint representation is a represen-
tation of L(G) over the vector space L(G), ad : L(G) → M (L(G)) defined by
so ad is indeed a representation.
\[ \exp(v)\exp(w) = \exp\Big(v + w + \frac{1}{2}[v, w] + \frac{1}{12}[v, [v, w]] + \frac{1}{12}[[v, w], w] + \dots\Big) \tag{2.116} \]
where ... indicates terms of higher order in v and w. For simplicity we shall consider
only matrix Lie groups, in which case the Lie algebra elements are square matrices.
To obtain the first few terms in this formula, consider $e^{tv} e^{tw}$ as a function of t and set
\[ e^{Z(t)} = e^{tv} e^{tw} \tag{2.117} \]
with
\[ Z(t) = tP + t^2 Q + O(t^3) \tag{2.118} \]
where we determine the matrices P and Q by expanding out (2.117) in powers of t:
\[ I + t(v + w) + \frac{1}{2}t^2(w^2 + v^2 + 2vw) + O(t^3) = I + tP + t^2\Big(Q + \frac{1}{2}P^2\Big) + O(t^3) \tag{2.119} \]
from which we find P = v + w and $Q = \frac{1}{2}[v, w]$.
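A numerical sketch can make the order-by-order structure of the BCH formula visible. Below, $e^{tv}e^{tw}$ is compared with the exponential of the BCH series (2.116) truncated at first, second and third order in t; the error should shrink as more terms are kept. The sample matrices, the value t = 0.05, and the helper names (`expm_series`, `bch_truncated`, `bch_error`) are illustrative assumptions.

```python
import numpy as np

def expm_series(m, terms=40):
    """Truncated power series for the matrix exponential."""
    result = np.eye(m.shape[0])
    term = np.eye(m.shape[0])
    for n in range(1, terms):
        term = term @ m / n
        result = result + term
    return result

def comm(a, b):
    return a @ b - b @ a

def bch_truncated(v, w, t, order):
    """Z(t) from (2.116), keeping terms up to the given order in t (1, 2 or 3)."""
    z = t * (v + w)
    if order >= 2:
        z = z + 0.5 * t**2 * comm(v, w)
    if order >= 3:
        z = z + t**3 / 12.0 * (comm(v, comm(v, w)) + comm(comm(v, w), w))
    return z

def bch_error(v, w, t, order):
    exact = expm_series(t * v) @ expm_series(t * w)
    return np.linalg.norm(exact - expm_series(bch_truncated(v, w, t, order)))

v = np.array([[0.0, 1.0], [0.0, 0.0]])
w = np.array([[0.0, 0.0], [1.0, 0.0]])
errors = [bch_error(v, w, 0.05, k) for k in (1, 2, 3)]
print(errors)   # each extra order should reduce the error
```

Dropping the factor 1/2 in front of [v, w] would leave an error of order t² at "second order", which is a simple way to confirm the coefficient Q numerically.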
Proposition 13. All higher order terms in the power series expansion of Z(t) in the BCH
formula depend only on sums of compositions of commutators on v and w.
Suppose that Z(y) is an arbitrary square matrix. Consider
    f_1(x, y) = \frac{\partial}{\partial y} \left( e^{xZ(y)} \right)    (2.120)
    f_2(x, y) = \int_{1-x}^{1} e^{(x-1+t)Z(y)} \frac{\partial Z(y)}{\partial y} e^{(1-t)Z(y)} \, dt    (2.121)
These both satisfy
    \frac{d^n g}{dt^n} = e^{tv} (\mathrm{ad}\, v)^n w \, e^{-tv}    (2.126)
hence the power series expansion of g(t) is given by
    e^{tv} w e^{-tv} = w + \sum_{n=1}^{\infty} \frac{t^n}{n!} (\mathrm{ad}\, v)^n w    (2.127)
Applying this expression to both sides of (2.124) and performing the y-integral, one finds
    v + w + \sum_{n=1}^{\infty} \frac{t^n}{n!} (\mathrm{ad}\, v)^n w = \frac{dZ}{dt} + \sum_{n=1}^{\infty} \frac{1}{(n+1)!} (\mathrm{ad}\, Z(t))^n \frac{dZ}{dt}    (2.128)
Then by expanding out Z(t) = \sum_{n=1}^{\infty} t^n Z_n (as we know that Z(0) = 0), it follows by
induction using the above equation that the Z_n can be written as sums of compositions of
commutators on v and w.
Exercise: Suppose that [v, w] = 0. Show that e^v e^w = e^{v+w}.
Proposition 14. Suppose that D is a representation of the matrix Lie group G acting on
V . Then there is a representation d of L(G) also on V defined via
    d(v) = \frac{d}{dt} D(\exp(tv)) \Big|_{t=0}    (2.129)
for v ∈ L(G).
It is convenient to expand out up to O(t^3) by
    1 + (t_1 + t_2) d(v) + (t_1 + t_2)^2 h(v) + O(t_i^3) = (1 + t_1 d(v) + t_1^2 h(v))(1 + t_2 d(v) + t_2^2 h(v)) + O(t_i^3)
and so on equating the t_1 t_2 coefficient we find h(v) = \frac{1}{2} d(v)^2.
Next consider for v, w ∈ L(G)
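    D(e^{-tv} e^{-tw} e^{tv} e^{tw}) = \left(1 - t d(v) + \frac{1}{2} t^2 d(v)^2\right)\left(1 - t d(w) + \frac{1}{2} t^2 d(w)^2\right)
        \times \left(1 + t d(v) + \frac{1}{2} t^2 d(v)^2\right)\left(1 + t d(w) + \frac{1}{2} t^2 d(w)^2\right) + O(t^3)
    = 1 + t^2 (d(v) d(w) - d(w) d(v)) + O(t^3)    (2.132)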
    e^{-tv} e^{-tw} e^{tv} e^{tw} = e^{-t(v+w) + \frac{1}{2} t^2 [v,w] + O(t^3)} \, e^{t(v+w) + \frac{1}{2} t^2 [v,w] + O(t^3)}
    = e^{t^2 [v,w] + O(t^3)}    (2.133)
and so
    D(e^{-tv} e^{-tw} e^{tv} e^{tw}) = D(e^{t^2 [v,w] + O(t^3)}) = 1 + t^2 d([v, w]) + O(t^3)    (2.134)
Comparing (2.132) with (2.134) we find that
But by the BCH formula
    D(e^{t\alpha v} e^{t\beta w}) = D(e^{t(\alpha v + \beta w) + O(t^2)}) = 1 + t \, d(\alpha v + \beta w) + O(t^2)    (2.137)
Hence, comparing the O(t) terms in (2.136) and (2.137) it follows that d(αv + βw) =
αd(v) + βd(w).
    \frac{d}{dt} \left( \mathrm{Ad}(e^{tv}) \right) \Big|_{t=0} = \mathrm{ad}\, v    (2.139)
as required.
We have seen that a representation D of the matrix Lie group G acting on V gives rise
to a representation d of the Lie algebra L(G) on V . A partial converse is true.
Definition 33. Suppose that G is a matrix Lie group. Let d denote a representation of
L(G) on V . Then a representation D is induced locally on G via
Proposition 16. The map D given in (2.140) which is locally induced by the representation
d of L(G) on V defines a representation.
Clearly, D(g) defines a linear transformation on V .
As I = e^0 it follows that D(e) = e^{d(0)} = e^0 = 1 where d(0) = 0 follows from the
linearity of d.
Also, suppose that g1, g2 have g1 = e^{v1}, g2 = e^{v2}. Then by the BCH formula we have
    g_1 g_2 = e^{v_1 + v_2 + \frac{1}{2}[v_1, v_2] + \dots}    (2.141)
where . . . denotes a sum of higher order nested commutators by Proposition 13. Hence
    D(g_1 g_2) = e^{d(v_1 + v_2 + \frac{1}{2}[v_1, v_2] + \dots)}
    = e^{d(v_1) + d(v_2) + \frac{1}{2} d([v_1, v_2]) + d(\dots)}    (by the linearity of d)
    = e^{d(v_1) + d(v_2) + \frac{1}{2}[d(v_1), d(v_2)] + \dots}    (using (2.113))
    = e^{d(v_1)} e^{d(v_2)}
    = D(g_1) D(g_2)    (2.142)
So D is at least locally a representation. Note that we have made use of the fact that all
higher order terms in the BCH expansion can be written as sums of commutators, together
with the property (2.113) of representations of L(G) in proceeding from the second to the
third line of the above equation.
Definition 36. A representation d of L(G) is called totally reducible if there exists a direct
sum decomposition of V into subspaces Wi, V = W1 ⊕ W2 ⊕ ... ⊕ Wk where the Wi are
invariant subspaces with respect to d and d restricted to Wi is irreducible.
Proposition 18. Suppose that G is a matrix Lie group. If D is a representation of G on V
with invariant subspace W , then W is an invariant subspace of the induced representation
d of L(G) on V
Conversely, suppose d is a representation of L(G) on V with invariant subspace W ;
then W is an invariant subspace of the (locally) induced representation D of G on V .
Suppose that D is a representation of G on V with invariant subspace W with respect
to D. Let d be the induced representation of L(G) on V . If w ∈ W and X ∈ L(G) then
    d(X) w = \frac{d}{dt} \left( D(e^{tX}) \right) \Big|_{t=0} w = \frac{d}{dt} \left( D(e^{tX}) w \right) \Big|_{t=0}    (2.145)
As D(e^{tX})w ∈ W for all t ∈ R it follows that d(X)w ∈ W.
Conversely, suppose that d is a representation of L(G) on V , and W is an invariant
subspace of V with respect to d. Let D be the locally defined representation of G induced
by d. Then if g ∈ G is given by g = e^X for some X ∈ L(G) then if w ∈ W,
    D(g) w = e^{d(X)} w = \sum_{n=0}^{\infty} \frac{1}{n!} d(X)^n w    (2.146)
If w1, w2 ∈ L(G) and α, β are scalars and v1 ⊗ v2 ∈ V then
    d(w_1) d(w_2) \, v_1 \otimes v_2 = d(w_1) \big( d_1(w_2) v_1 \otimes v_2 + v_1 \otimes d_2(w_2) v_2 \big)
    = d_1(w_1) d_1(w_2) v_1 \otimes v_2 + d_1(w_2) v_1 \otimes d_2(w_1) v_2
      + d_1(w_1) v_1 \otimes d_2(w_2) v_2 + v_1 \otimes d_2(w_1) d_2(w_2) v_2    (2.149)
where the sum of the second and third terms in this expression is symmetric in w1 and w2.
d(w1 )d(w2 )v1 ⊗ v2 − d(w2 )d(w1 )v1 ⊗ v2 = d1 (w1 )d1 (w2 )v1 ⊗ v2 − d1 (w2 )d1 (w1 )v1 ⊗ v2
+ v1 ⊗ d2 (w1 )d2 (w2 )v2 − v1 ⊗ d2 (w2 )d2 (w1 )v2
= d([w1 , w2 ])v1 ⊗ v2 (2.150)
as required.
Proposition 20. Suppose that D1 and D2 are representations of matrix Lie group G on
V1 and V2 with induced representations of L(G) on V1 and V2 denoted by d1 and d2 . Let
D = D1 ⊗ D2 denote the representation of G on the tensor product V1 ⊗ V2. Then the
corresponding induced representation of L(G) on V1 ⊗ V2 is d = d1 ⊗ 1 + 1 ⊗ d2.
Proof. Suppose w ∈ L(G), then expanding out in powers of t:
    D(e^{tw})(v_1 \otimes v_2) = D_1(e^{tw}) v_1 \otimes D_2(e^{tw}) v_2
    = (v_1 + t d_1(w) v_1 + O(t^2)) \otimes (v_2 + t d_2(w) v_2 + O(t^2))
    = v_1 \otimes v_2 + t \big( d_1(w) v_1 \otimes v_2 + v_1 \otimes d_2(w) v_2 \big) + O(t^2)
and hence from the O(t) term we find the induced representation d = d1 ⊗ 1 + 1 ⊗ d2
as required.
As κab is symmetric, one can always choose an adapted basis for L(G) in which κab is
a diagonal matrix, and by rescaling the Lie algebra generators, the diagonal entries can be
set to +1, 0 or −1.
Definition 39. The Killing form is non-degenerate if κab has no zero diagonal entries in
the adapted basis. The Lie algebra L(G) is then called semi-simple. If all the diagonal
entries are −1 then L(G) is said to be a compact Lie algebra.
Lemma 5. Suppose that L(G) is semi-simple. Define c_{abc} = c_{ab}{}^{d} κ_{dc} (i.e. lower the last
index of the structure constants with the Killing form). Then c_{abc} is totally antisymmetric
in a, b, c.
Note that
Definition 40. Suppose that L(G) is a Lie algebra with non-degenerate Killing form, and
d is a representation of L(G) on V . The Casimir operator C of L(G) is defined by
    C = -(\kappa^{-1})^{ab} d(T_a) d(T_b)    (2.159)
(with summation over the repeated indices a, b).
Proposition 21. The Casimir operator commutes with d(X) for all X ∈ L(G)
It suffices to show that [C, d(Ta )] = 0 for all Ta .
Note that
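    [d(T_a), C] = -(\kappa^{-1})^{bc} [d(T_a), d(T_b) d(T_c)]
    = -(\kappa^{-1})^{bc} \big( [d(T_a), d(T_b)] d(T_c) + d(T_b) [d(T_a), d(T_c)] \big)
    = -(\kappa^{-1})^{bc} \big( d([T_a, T_b]) d(T_c) + d(T_b) d([T_a, T_c]) \big)
    = -(\kappa^{-1})^{bc} \big( c_{ab}{}^{\ell} d(T_\ell) d(T_c) + c_{ac}{}^{\ell} d(T_b) d(T_\ell) \big)
    = -c_a{}^{c\ell} d(T_\ell) d(T_c) - c_a{}^{c\ell} d(T_c) d(T_\ell)
    = 0    (2.160)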
If L(G) is compact, then working in the adapted basis, C takes a particularly simple
form
    C = \sum_a d(T_a) d(T_a)    (2.162)
3. SU(2) and Isospin
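    T_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \quad
    T_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad
    T_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}    (3.1)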
    \mathrm{Tr}\left( M(t)^{-1} \frac{dM(t)}{dt} \right) = 0, \qquad \frac{dM(t)}{dt} M(t)^\dagger + M(t) \frac{dM(t)^\dagger}{dt} = 0    (3.4)
Setting t = 0 we find
    \mathrm{Tr}\, m = 0, \qquad m + m^\dagger = 0    (3.5)
where m = dM(t)/dt |_{t=0}. Hence L(SU(n)) consists of the traceless antihermitian matrices.
Exercise: Verify that if X and Y are traceless antihermitian square matrices then so is
[X, Y ].
It is convenient to make use of the Pauli matrices σa defined by
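    \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
    \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad
    \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}    (3.6)
which satisfy σ_a σ_b = δ_{ab} I + i ε_{abc} σ_c.
Then a basis of traceless antihermitian 2 × 2 matrices is given by taking T_a = −(i/2) σ_a.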
It follows that
Comparing (3.2) and (3.7) it is clear that SO(3) and SU (2) have the same Lie algebra.
We might therefore expect SO(3) and SU (2) to be similar, at least near to the identity.
We shall see that this is true.
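As a quick sanity check of the claim that the two Lie algebras coincide, the following sketch verifies that both the 3 × 3 matrices of (3.1) and the matrices T_a = −(i/2) σ_a satisfy [T_a, T_b] = ε_{abc} T_c. The function names are illustrative assumptions, and the commutation relation is taken to be the standard form of the relations being compared.

```python
# Check [T_a, T_b] = eps_abc T_c for the bases of L(SO(3)) and L(SU(2)).
# Pure Python; all helper names are illustrative.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

def levi_civita(a, b, c):
    # Totally antisymmetric symbol for indices in {0, 1, 2}.
    return (a - b) * (b - c) * (c - a) // 2

# Pauli matrices and the basis T_a = -(i/2) sigma_a of L(SU(2)).
sigma = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
su2_basis = [[[-0.5j * x for x in row] for row in s] for s in sigma]

# Basis of L(SO(3)) as in (3.1).
so3_basis = [
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
]

def satisfies_algebra(basis, tol=1e-12):
    # True if [T_a, T_b] = sum_c eps_abc T_c holds entry by entry.
    n = len(basis[0])
    for a in range(3):
        for b in range(3):
            lhs = commutator(basis[a], basis[b])
            for i in range(n):
                for j in range(n):
                    rhs = sum(levi_civita(a, b, c) * basis[c][i][j]
                              for c in range(3))
                    if abs(lhs[i][j] - rhs) > tol:
                        return False
    return True

print(satisfies_algebra(su2_basis), satisfies_algebra(so3_basis))
```

Both bases have the same structure constants, which is all that is needed for the two Lie algebras to be isomorphic.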
Exercise: Using this basis, show that the Lie algebra L(SU (2)) is of compact type.
3.2 Relationship between SO(3) and SU (2)
Proposition 22. The manifold SU (2) can be identified with S 3 .
    U = \begin{pmatrix} \alpha & \mu \\ \beta & \nu \end{pmatrix} \in SU(2)    (3.8)
for α, β, µ, ν ∈ C.
Then U U† = I implies that (α, β)^T must be orthogonal to (µ, ν)^T with respect to the standard
inner product on C^2. As the orthogonal complement to (α, β)^T in C^2 is a 1-dimensional
complex vector space spanned by (−β̄, ᾱ)^T it follows that µ = −σβ̄, ν = σᾱ for σ ∈ C.
We also require that (α, β)^T and (µ, ν)^T have unit norm, which fixes
    R(U)_{mn} = \frac{1}{2} \mathrm{Tr}\left( \sigma_m U \sigma_n U^\dagger \right)
By writing U = y0 I + i ym σm for y0, ym ∈ R satisfying y0^2 + yp yp = 1, it is straightforward
to show that
(here we have written Rmn = R(U )mn ).
It is clear that if yp = 0 for p = 1, 2, 3 so that U = ±I, then R = I, so R ∈ SO(3).
More generally, suppose that yp yp ≠ 0. Then one can set y0 = cos α, yp = sin α zp for
0 < α < 2π and α ≠ π. Then the constraint y0^2 + yp yp = 1 implies that zp zp = 1, i.e. z is a
unit vector in R^3. The expression (3.13) can be rewritten as
    R_{mn} z_n = z_m    (3.15)
where p is summed over; which together with y0^2 + yp yp = u0^2 + up up = 1 implies that
y0^2 = u0^2 and yp yp = up up (sum over p). Substituting back into (3.18) we find ym^2 = um^2 for
each m = 1, 2, 3.
Suppose first that y0 ≠ 0, then u0 ≠ 0, and it follows that y0 = ±u0 and yp = ±up for
each p = 1, 2, 3 (with the same sign throughout).
Next, suppose y0 = 0. Then u0 = 0 also, and ym yn = um un for each m, n = 1, 2, 3.
Contracting with yn we get (yn yn)ym = (yn un)um for m = 1, 2, 3. As yn yn = 1, this implies
that ym = λum for m = 1, 2, 3 where λ is constant. Hence, (1 − λ^2) um un = 0. Contracting
over m and n then gives 1 − λ^2 = 0, so λ = ±1. Therefore yp = ±up for p = 1, 2, 3 (with
the same sign throughout).
Hence, we have shown that each R ∈ SO(3) corresponds to two elements U and
−U ∈ SU(2). These two elements correspond to antipodal points ±y ∈ S^3. This establishes
the correspondence.
It remains to check that R(U1 U2 ) = R(U1 )R(U2 ) for U1 , U2 ∈ SU (2). Note that on
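    U_1 = y_0 I_2 + i y_n \sigma_n, \qquad U_2 = w_0 I_2 + i w_n \sigma_n    (3.20)
    U_1 U_2 = u_0 I_2 + i u_n \sigma_n    (3.21)
    R(U_1)_{mp} R(U_2)_{pn} = \big( (y_0^2 - y_\ell y_\ell) \delta_{mp} + 2 \epsilon_{mpq} y_0 y_q + 2 y_m y_p \big) \big( (w_0^2 - w_r w_r) \delta_{pn} + 2 \epsilon_{pnr} w_0 w_r + 2 w_p w_n \big)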
On expanding out these two expressions in terms of y and w it becomes an exercise in
algebra to show that R(U1 U2 )mn = R(U1 )mp R(U2 )pn as required.
Exercise: Verify the identity R(U1 U2 )mn = R(U1 )mp R(U2 )pn .
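The identity in the exercise can also be checked numerically before attempting the algebra. The sketch below takes R(U)_{mn} = (1/2) Tr(σ_m U σ_n U†), with the factor 1/2 chosen so that R(I) = I, and parametrises U = y0 I + i y_m σ_m by a normalised 4-vector. The function names and the sample points are illustrative assumptions.

```python
import math

# Pauli matrices.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(u):
    return [[u[j][i].conjugate() for j in range(len(u))]
            for i in range(len(u[0]))]

def su2_element(y0, y1, y2, y3):
    # U = y0 I + i y_m sigma_m, after normalising (y0, y) to a point on S^3.
    norm = math.sqrt(y0 * y0 + y1 * y1 + y2 * y2 + y3 * y3)
    y0, y1, y2, y3 = (c / norm for c in (y0, y1, y2, y3))
    return [[complex(y0, y3), complex(y2, y1)],
            [complex(-y2, y1), complex(y0, -y3)]]

def rotation(u):
    # R(U)_mn = (1/2) Tr(sigma_m U sigma_n U^dagger)
    ud = dagger(u)
    r = []
    for m in range(3):
        row = []
        for n in range(3):
            prod = matmul(matmul(SIGMA[m], u), matmul(SIGMA[n], ud))
            row.append(0.5 * (prod[0][0] + prod[1][1]).real)
        r.append(row)
    return r

def max_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

u1 = su2_element(0.3, -1.2, 0.5, 2.0)
u2 = su2_element(1.0, 0.4, -0.7, 0.1)
minus_u1 = [[-x for x in row] for row in u1]
r1 = rotation(u1)
r1_transpose = [[r1[j][i] for j in range(3)] for i in range(3)]
identity3 = [[float(i == j) for j in range(3)] for i in range(3)]

homomorphism_error = max_diff(rotation(matmul(u1, u2)),
                              matmul(rotation(u1), rotation(u2)))
double_cover_error = max_diff(rotation(u1), rotation(minus_u1))
orthogonality_error = max_diff(matmul(r1, r1_transpose), identity3)
print(homomorphism_error, double_cover_error, orthogonality_error)
```

All three errors should be at the level of floating-point rounding: R is a homomorphism, R(U) = R(−U), and R(U) is orthogonal.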
It is conventional to write SU(2) = S^3 and SO(3) = S^3/Z2, where S^3/Z2 is the 3-sphere
with antipodal points identified. SU(2) is called the double cover of SO(3); and
there is an isomorphism SO(3) ≅ SU(2)/Z2.
It can be shown (using topological methods outside the scope of this course) that
SU(2) and SO(3) are not homeomorphic. This is because SU(2) and SO(3) have different
fundamental groups π1. In particular, as SU(2) ≅ S^3, and S^3 is simply connected, it
follows that π1(SU(2)) is trivial. However, it can be shown that π1(SO(3)) = Z2.
    J_3 = i \, d(T_3), \qquad J_\pm = \frac{i}{\sqrt{2}} \big( d(T_1) \pm i \, d(T_2) \big)    (3.24)
As V is a complex vector space, there exists an eigenstate |φ⟩ of J3 with some eigenvalue
λ, and using (3.25) it follows that
    J_3 J_\pm |\phi\rangle = (\lambda \pm 1) J_\pm |\phi\rangle    (3.26)
and so by simple induction
Definition 41. j is called the highest weight of the representation. In the context of particle
physics, it is called the spin.
Note that by acting on |j, j⟩ with J− other distinct eigenstates of J3 are obtained. As
we are interested in finite dimensional representations, it follows that (J−)^N |j, j⟩ = 0 for
some positive integer N (otherwise one could just keep going and the representation would
be infinite dimensional). Let k + 1 be the smallest positive integer for which this happens,
and set |ψk⟩ = (J−)^k |j, j⟩, so, by construction, J− |ψk⟩ = 0.
Define for ` = 0, . . . , k
    J_+ |\psi_\ell\rangle = \big( j - (\ell - 1) + j - (\ell - 2) + \dots + j - 1 + j \big) |\psi_{\ell-1}\rangle
    = \ell \left( j - \frac{1}{2}(\ell - 1) \right) |\psi_{\ell-1}\rangle    (3.30)
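In order to constrain j recall that J− |ψk⟩ = 0, so
    0 = J_+ J_- |\psi_k\rangle
    = \big( [J_+, J_-] + J_- J_+ \big) |\psi_k\rangle
    = \big( J_3 + J_- J_+ \big) |\psi_k\rangle
    = (j - k) |\psi_k\rangle + J_- \, k \left( j - \frac{1}{2}(k - 1) \right) |\psi_{k-1}\rangle
    = \left( j - k + k \left( j - \frac{1}{2}(k - 1) \right) \right) |\psi_k\rangle
    = \frac{1}{2} (k + 1)(2j - k) |\psi_k\rangle    (3.31)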
Proposition 24. V = span{(J−)^k |j, j⟩, (J−)^{k−1} |j, j⟩, ..., |j, j⟩} and the highest weight state is unique.
Consider the vector space V′ spanned by |ψi⟩ for i = 0, ..., k. This is an invariant
subspace of V with respect to the representation d. As the representation is irreducible on
V it follows that V = V′. In particular, J3 is diagonalizable on V and each eigenspace is
To prove uniqueness suppose that |φ⟩ ∈ V satisfies J+ |φ⟩ = 0. As |φ⟩ ∈ V it follows
that we can write
    |\phi\rangle = \sum_{i=0}^{2j} a_i (J_-)^i |j, j\rangle    (3.32)
for constants ai. Applying (J+)^{2j} to both sides of this equation implies a_{2j} = 0. Then,
applying (J+)^{2j−1} implies a_{2j−1} = 0. Continuing in this way, we obtain a_1 = a_2 = · · · =
a_{2j−1} = a_{2j} = 0, and so |φ⟩ = a_0 |j, j⟩. So the highest weight state in an irreducible
representation of L(SU(2)) is unique (up to scaling).
Hence, we have shown that j is half (non-negative) integer, and the representation is
2j + 1-dimensional. The irreducible representations are therefore completely characterized
by the value of the weight j.
It is possible to go further, and prove the following theorem (the proof given is that
presented in [Samelson]):
    J_3^\dagger = J_3, \qquad J_\pm^\dagger = J_\mp    (3.33)
Show using (3.30) that ⟨ψℓ | ψℓ⟩ = (ℓ/2)(2j − ℓ + 1) ⟨ψℓ−1 | ψℓ−1⟩, and hence (taking ⟨ψ0 | ψ0⟩ = 1) that
    N_\ell \equiv \langle \psi_\ell | \psi_\ell \rangle = \frac{\ell! \, (2j)!}{2^\ell \, (2j - \ell)!}    (3.34)
the first label denotes the highest weight value j, the m label denotes the eigenstate
of J3, J3 |j, m⟩ = m |j, m⟩. These satisfy (check!)
    J_- |j, m\rangle = \frac{1}{\sqrt{2}} \sqrt{(j + m)(j - m + 1)} \, |j, m - 1\rangle
    J_+ |j, m - 1\rangle = \frac{1}{\sqrt{2}} \sqrt{(j + m)(j - m + 1)} \, |j, m\rangle    (3.36)
    C |j, m\rangle = -\frac{1}{2} j(j + 1) |j, m\rangle    (3.37)
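The "(check!)" above can be done mechanically. The sketch below builds the matrices of J3 and J± in the basis |j, j⟩, ..., |j, −j⟩ from (3.36) and verifies [J3, J±] = ±J±, [J+, J−] = J3, and J3^2 + J+J− + J−J+ = j(j + 1). The function names are illustrative, and the last identity is the standard total-spin relation rewritten for the normalisation J± = (J1 ± iJ2)/√2, which is an assumption about the conventions in use.

```python
import math

def spin_matrices(twice_j):
    """Matrices of J3, J+, J- in the basis |j, j>, |j, j-1>, ..., |j, -j>."""
    j = twice_j / 2
    dim = twice_j + 1
    ms = [j - k for k in range(dim)]
    j3 = [[ms[r] if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    jp = [[0.0] * dim for _ in range(dim)]
    jm = [[0.0] * dim for _ in range(dim)]
    for k in range(dim - 1):
        m = ms[k]
        coeff = math.sqrt((j + m) * (j - m + 1) / 2)
        jm[k + 1][k] = coeff  # J-|j, m> = coeff |j, m-1>
        jp[k][k + 1] = coeff  # J+|j, m-1> = coeff |j, m>
    return j3, jp, jm

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(ca, a, cb, b):
    # Entrywise linear combination ca * a + cb * b.
    return [[ca * x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def comm(a, b):
    return combine(1, matmul(a, b), -1, matmul(b, a))

def check_spin(twice_j):
    j = twice_j / 2
    dim = twice_j + 1
    j3, jp, jm = spin_matrices(twice_j)
    zero = [[0.0] * dim for _ in range(dim)]
    algebra_ok = (close(comm(j3, jp), jp)
                  and close(comm(j3, jm), combine(-1, jm, 0, zero))
                  and close(comm(jp, jm), j3))
    total = combine(1, matmul(j3, j3), 1,
                    combine(1, matmul(jp, jm), 1, matmul(jm, jp)))
    expected = [[j * (j + 1) if r == c else 0.0 for c in range(dim)]
                for r in range(dim)]
    return algebra_ok and close(total, expected)

print([check_spin(n) for n in range(0, 6)])
```

Passing twice_j rather than j keeps the argument an integer, so half-integer spins need no special handling.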
3.3.1 Examples of Low Dimensional Irreducible Representations
It is useful to consider several low-dimensional representations.
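i) j = 1/2. A normalized basis of states is |1/2, 1/2⟩ and |1/2, −1/2⟩ with
    J_+ |\tfrac{1}{2}, \tfrac{1}{2}\rangle = 0
    J_+ |\tfrac{1}{2}, -\tfrac{1}{2}\rangle = \frac{1}{\sqrt{2}} |\tfrac{1}{2}, \tfrac{1}{2}\rangle    (3.38)
    J_- |\tfrac{1}{2}, \tfrac{1}{2}\rangle = \frac{1}{\sqrt{2}} |\tfrac{1}{2}, -\tfrac{1}{2}\rangle
    J_- |\tfrac{1}{2}, -\tfrac{1}{2}\rangle = 0    (3.39)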
ii) j = 1. A normalized basis of states is |1, 1⟩, |1, 0⟩ and |1, −1⟩ with
    J_+ |1, 1\rangle = 0
    J_+ |1, 0\rangle = |1, 1\rangle
    J_+ |1, -1\rangle = |1, 0\rangle    (3.40)
    J_- |1, 1\rangle = |1, 0\rangle
    J_- |1, 0\rangle = |1, -1\rangle
    J_- |1, -1\rangle = 0    (3.41)
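iii) j = 3/2. A normalized basis of states is |3/2, 3/2⟩, |3/2, 1/2⟩, |3/2, −1/2⟩ and |3/2, −3/2⟩ with
    J_+ |\tfrac{3}{2}, \tfrac{3}{2}\rangle = 0
    J_+ |\tfrac{3}{2}, \tfrac{1}{2}\rangle = \sqrt{\tfrac{3}{2}} \, |\tfrac{3}{2}, \tfrac{3}{2}\rangle
    J_+ |\tfrac{3}{2}, -\tfrac{1}{2}\rangle = \sqrt{2} \, |\tfrac{3}{2}, \tfrac{1}{2}\rangle
    J_+ |\tfrac{3}{2}, -\tfrac{3}{2}\rangle = \sqrt{\tfrac{3}{2}} \, |\tfrac{3}{2}, -\tfrac{1}{2}\rangle    (3.42)
    J_- |\tfrac{3}{2}, \tfrac{3}{2}\rangle = \sqrt{\tfrac{3}{2}} \, |\tfrac{3}{2}, \tfrac{1}{2}\rangle
    J_- |\tfrac{3}{2}, \tfrac{1}{2}\rangle = \sqrt{2} \, |\tfrac{3}{2}, -\tfrac{1}{2}\rangle
    J_- |\tfrac{3}{2}, -\tfrac{1}{2}\rangle = \sqrt{\tfrac{3}{2}} \, |\tfrac{3}{2}, -\tfrac{3}{2}\rangle
    J_- |\tfrac{3}{2}, -\tfrac{3}{2}\rangle = 0    (3.43)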
    J_3 = J_3(1) \otimes 1 + 1 \otimes J_3(2), \qquad J_\pm = J_\pm(1) \otimes 1 + 1 \otimes J_\pm(2)    (3.46)
Exercise: Check that J± and J3 satisfy J3† = J3, J±† = J∓ and
    [J_3, J_\pm] = \pm J_\pm, \qquad [J_+, J_-] = J_3    (3.47)
In order to construct the decomposition of V into irreducible representations, first note
that if |ψ(1)⟩ ∈ V(1) is a state of weight m1 with respect to J3(1), and |ψ(2)⟩ ∈ V(2) is
a state of weight m2 with respect to J3(2) then |ψ(1)⟩ ⊗ |ψ(2)⟩ ∈ V is a state of weight
m1 + m2 with respect to J3, i.e. weights add in the tensor product representation.
Using this, it is possible to compute the degeneracy of certain weight states in the
tensor product representation. In particular, the maximum possible weight must be j1 + j2
which corresponds to |j1, j1⟩ ⊗ |j2, j2⟩.
Consider the weight j1 + j2 − k for k > 0. In general, j1 + j2 − k can be written as a
sum of two integers m1 + m2 , for m1 ∈ {−j1 , ..., j1 } and m2 ∈ {−j2 , ..., j2 } in k + 1 ways:
    j_1 + j_2 - k = (j_1 - k) + j_2
    = (j_1 - k + 1) + (j_2 - 1)
    \vdots
    = (j_1 - 1) + (j_2 - k + 1)
    = j_1 + (j_2 - k)    (3.48)
provided that j1 − k ≥ −j1 and j2 − k ≥ −j2 , or equivalently
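    \dim \bigoplus_{j = |j_1 - j_2|}^{j_1 + j_2} V_j = \sum_{j = |j_1 - j_2|}^{j_1 + j_2} (2j + 1)
    = \sum_{n=0}^{j_1 + j_2 - |j_1 - j_2|} \big( 2(|j_1 - j_2| + n) + 1 \big)
    = (1 + 2j_1)(1 + 2j_2)
    = \dim V    (3.50)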
Hence we have decomposed V = V_{|j1−j2|} ⊕ · · · ⊕ V_{j1+j2} into irreducible subspaces Vj
where Vj has highest weight j and is of dimension 2j + 1.
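The counting argument can be turned into a small algorithm: list the weights m1 + m2 of the tensor product with multiplicity, then repeatedly strip off the irreducible representation whose highest weight is the largest remaining weight. The sketch below is one way to do this; the function names are illustrative assumptions.

```python
from collections import Counter
from fractions import Fraction

def weights(j):
    # Weights -j, -j+1, ..., j of the spin-j irreducible representation.
    return [-j + k for k in range(int(2 * j) + 1)]

def tensor_weights(j1, j2):
    # Weights add in the tensor product; count each sum with multiplicity.
    return Counter(m1 + m2 for m1 in weights(j1) for m2 in weights(j2))

def decompose(j1, j2):
    # Peel off irreducible pieces, starting from the largest remaining weight.
    remaining = tensor_weights(j1, j2)
    spins = []
    while remaining:
        top = max(remaining)
        spins.append(top)
        for m in weights(top):
            remaining[m] -= 1
            if remaining[m] == 0:
                del remaining[m]
    return sorted(spins)

j1, j2 = Fraction(1), Fraction(1, 2)
spins = decompose(j1, j2)
total_dimension = sum(2 * j + 1 for j in spins)
print(spins, total_dimension)
```

For j1 = 1, j2 = 1/2 this returns the spins 1/2 and 3/2, with dimensions summing to (2 j1 + 1)(2 j2 + 1) = 6.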
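    |1, 0\rangle = \frac{1}{\sqrt{2}} \Big( |\tfrac{1}{2}, -\tfrac{1}{2}\rangle \otimes |\tfrac{1}{2}, \tfrac{1}{2}\rangle + |\tfrac{1}{2}, \tfrac{1}{2}\rangle \otimes |\tfrac{1}{2}, -\tfrac{1}{2}\rangle \Big)    (3.52)
and applying J− once more
    |1, -1\rangle = |\tfrac{1}{2}, -\tfrac{1}{2}\rangle \otimes |\tfrac{1}{2}, -\tfrac{1}{2}\rangle    (3.53)
The remaining possible spin is j = 0. This has only one state, which must have the form
    |0, 0\rangle = c_0 |\tfrac{1}{2}, \tfrac{1}{2}\rangle \otimes |\tfrac{1}{2}, -\tfrac{1}{2}\rangle + c_1 |\tfrac{1}{2}, -\tfrac{1}{2}\rangle \otimes |\tfrac{1}{2}, \tfrac{1}{2}\rangle    (3.54)
for constants c0, c1 to be fixed. Applying J+ to both sides we get
    (c_0 + c_1) |\tfrac{1}{2}, \tfrac{1}{2}\rangle \otimes |\tfrac{1}{2}, \tfrac{1}{2}\rangle = 0    (3.55)
so c1 = −c0. Then on making the appropriate normalization we find
    |0, 0\rangle = \frac{1}{\sqrt{2}} \Big( |\tfrac{1}{2}, -\tfrac{1}{2}\rangle \otimes |\tfrac{1}{2}, \tfrac{1}{2}\rangle - |\tfrac{1}{2}, \tfrac{1}{2}\rangle \otimes |\tfrac{1}{2}, -\tfrac{1}{2}\rangle \Big)    (3.56)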
Next, take the spin 1 ⊗ spin 1/2 composite system. As j1 = 1, j2 = 1/2 there are two
possible values for the composite spin, j = 1/2 or j = 3/2, and the tensor product space is
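For the j = 3/2 states, the state with greatest weight is
    |\tfrac{3}{2}, \tfrac{3}{2}\rangle = |1, 1\rangle \otimes |\tfrac{1}{2}, \tfrac{1}{2}\rangle    (3.57)
Applying J− to both sides gives
    |\tfrac{3}{2}, \tfrac{1}{2}\rangle = \sqrt{\tfrac{2}{3}} \, |1, 0\rangle \otimes |\tfrac{1}{2}, \tfrac{1}{2}\rangle + \frac{1}{\sqrt{3}} \, |1, 1\rangle \otimes |\tfrac{1}{2}, -\tfrac{1}{2}\rangle    (3.58)
    |\tfrac{3}{2}, -\tfrac{3}{2}\rangle = |1, -1\rangle \otimes |\tfrac{1}{2}, -\tfrac{1}{2}\rangle    (3.60)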
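For the j = 1/2 states, the state with maximal weight can be written as a linear combination
    |\tfrac{1}{2}, \tfrac{1}{2}\rangle = c_0 |1, 1\rangle \otimes |\tfrac{1}{2}, -\tfrac{1}{2}\rangle + c_1 |1, 0\rangle \otimes |\tfrac{1}{2}, \tfrac{1}{2}\rangle    (3.61)
for some constants c0, c1 to be determined. Then
    0 = J_+ |\tfrac{1}{2}, \tfrac{1}{2}\rangle = \Big( c_1 + \frac{1}{\sqrt{2}} c_0 \Big) |1, 1\rangle \otimes |\tfrac{1}{2}, \tfrac{1}{2}\rangle    (3.62)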
[Figure: weight diagram on a line, with tick marks labelled −6 to 6.]
ii) The weights are all half-integers, and are distributed with unit distance between each
weight. The highest weight is j for 2j ∈ N, and the lowest weight is −j, and there
are no “holes” in the weight diagram: −j, −j + 1, ..., j − 1, j are all weights.
iii) Each weight has multiplicity 1.
One can also plot the weight diagram of a generic (not necessarily irreducible) representation.
For example, the weight diagram of the eight-dimensional tensor product representation
(j1 = 1/2) ⊗ (j2 = 1/2) ⊗ (j3 = 1/2) is
[Figure: weight diagram on a line, with tick marks labelled −3 to 3.]
This has a highest weight j = 3/2 and a lowest weight −3/2, both with multiplicity 1, and
weights ±1/2 each with multiplicity 3. In general, a non-irreducible representation will have
a highest weight, but it need not be of multiplicity 1. For a generic weight diagram
i) The diagram (together with weight multiplicities) is reflection symmetric about the
iii) Each weight need not be of multiplicity 1. However, as one proceeds from a particular
weight towards the origin (in unit steps from either the left or the right), the weight
multiplicities do not decrease.
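The multiplicities quoted for the (1/2) ⊗ (1/2) ⊗ (1/2) example follow directly from the fact that weights add in a tensor product. A minimal sketch (the variable names are illustrative):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

half = Fraction(1, 2)
single_spin_weights = [half, -half]

# Each basis state of the triple tensor product has weight m1 + m2 + m3.
multiplicities = Counter(sum(ws) for ws in product(single_spin_weights, repeat=3))

for weight in sorted(multiplicities, reverse=True):
    print(weight, multiplicities[weight])
```

The result is symmetric under m → −m, and the multiplicities do not decrease as one moves from ±3/2 towards the origin, in line with the properties listed above.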
    I_3 = \frac{1}{2} \big( N(u) - N(\bar{u}) - (N(d) - N(\bar{d})) \big)    (3.67)
Isospin symmetry arises in the quark model because of the very similar properties of
the u and d quarks.
i) Nucleons have isospin I = 1/2; the proton has I3 = 1/2, and the neutron has I3 = −1/2:
    n = |\tfrac{1}{2}, -\tfrac{1}{2}\rangle
    p = |\tfrac{1}{2}, \tfrac{1}{2}\rangle    (3.68)
ii) The pions have I = 1 with
    \pi^- = |1, -1\rangle
    \pi^0 = |1, 0\rangle
    \pi^+ = |1, 1\rangle    (3.69)
    \Sigma^- = |1, -1\rangle
    \Sigma^0 = |1, 0\rangle
    \Sigma^+ = |1, 1\rangle    (3.70)
    \Lambda^0 = |0, 0\rangle    (3.71)
iv) The strange mesons lie in two multiplets of I = 1/2
    K^0 = |\tfrac{1}{2}, -\tfrac{1}{2}\rangle
    K^+ = |\tfrac{1}{2}, \tfrac{1}{2}\rangle    (3.72)
    K^- = |\tfrac{1}{2}, -\tfrac{1}{2}\rangle
    \bar{K}^0 = |\tfrac{1}{2}, \tfrac{1}{2}\rangle    (3.73)
K^± are antiparticles with the same mass, but are in different isospin multiplets
because of their differing quark content.
v) The light quarks have I = 1/2
    d = |\tfrac{1}{2}, -\tfrac{1}{2}\rangle
    u = |\tfrac{1}{2}, \tfrac{1}{2}\rangle    (3.74)
    |1, 1\rangle = pp, \qquad |1, 0\rangle = \frac{1}{\sqrt{2}} (np + pn), \qquad |1, -1\rangle = nn    (3.75)
which are symmetric under exchange of isospin degrees of freedom, and the remaining
state is
    |0, 0\rangle = \frac{1}{\sqrt{2}} (np - pn)    (3.76)
which is anti-symmetric under exchange of isospin degrees of freedom.
The deuteron d is a pn bound state, which has no pp or nn partners. There is therefore
only one possibility; d = |0, 0⟩.
In general, the total wavefunction for a NN state can be written as a product of space,
spin, and isospin functions
ψ = ψ(space)ψ(spin)ψ(isospin) (3.77)
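    |\tfrac{3}{2}, \tfrac{3}{2}\rangle = \pi^+ p
    |\tfrac{3}{2}, \tfrac{1}{2}\rangle = \sqrt{\tfrac{2}{3}} \, \pi^0 p + \frac{1}{\sqrt{3}} \, \pi^+ n
    |\tfrac{3}{2}, -\tfrac{1}{2}\rangle = \frac{1}{\sqrt{3}} \, \pi^- p + \sqrt{\tfrac{2}{3}} \, \pi^0 n
    |\tfrac{3}{2}, -\tfrac{3}{2}\rangle = \pi^- n    (3.78)
    |\tfrac{1}{2}, \tfrac{1}{2}\rangle = \frac{1}{\sqrt{3}} \, \pi^0 p - \sqrt{\tfrac{2}{3}} \, \pi^+ n
    |\tfrac{1}{2}, -\tfrac{1}{2}\rangle = \sqrt{\tfrac{2}{3}} \, \pi^- p - \frac{1}{\sqrt{3}} \, \pi^0 n    (3.79)
These equations can be inverted to give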
These equations can be inverted to give
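    \pi^+ p = |\tfrac{3}{2}, \tfrac{3}{2}\rangle
    \pi^0 p = \sqrt{\tfrac{2}{3}} \, |\tfrac{3}{2}, \tfrac{1}{2}\rangle + \frac{1}{\sqrt{3}} \, |\tfrac{1}{2}, \tfrac{1}{2}\rangle
    \pi^+ n = \frac{1}{\sqrt{3}} \, |\tfrac{3}{2}, \tfrac{1}{2}\rangle - \sqrt{\tfrac{2}{3}} \, |\tfrac{1}{2}, \tfrac{1}{2}\rangle
    \pi^- p = \frac{1}{\sqrt{3}} \, |\tfrac{3}{2}, -\tfrac{1}{2}\rangle + \sqrt{\tfrac{2}{3}} \, |\tfrac{1}{2}, -\tfrac{1}{2}\rangle
    \pi^0 n = \sqrt{\tfrac{2}{3}} \, |\tfrac{3}{2}, -\tfrac{1}{2}\rangle - \frac{1}{\sqrt{3}} \, |\tfrac{1}{2}, -\tfrac{1}{2}\rangle
    \pi^- n = |\tfrac{3}{2}, -\tfrac{3}{2}\rangle    (3.80)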
Consider πN scattering. The scattering is described by an S-matrix, which, in processes
dominated by strong interactions, is taken to be isospin invariant: [Ij, S] = 0 and so by
Schur’s lemma
    \langle I', m' | S | I, m \rangle = \phi(I) \, \delta_{II'} \delta_{mm'}    (3.81)
The cross sections of the scattering processes are given by
    \sigma(\mathrm{in} \to \mathrm{out}) = K \, |\langle \mathrm{in} | S | \mathrm{out} \rangle|^2    (3.82)
for constant K. Hence
    \sigma(\pi^+ p \to \pi^+ p) = K \, |\phi(\tfrac{3}{2})|^2    (3.83)
    \sigma(\pi^0 n \to \pi^- p) = \frac{2}{9} K \, |\phi(\tfrac{3}{2}) - \phi(\tfrac{1}{2})|^2    (3.84)
    \sigma(\pi^- p \to \pi^- p) = \frac{1}{9} K \, |\phi(\tfrac{3}{2}) + 2\phi(\tfrac{1}{2})|^2    (3.85)
For all three of these processes, a marked resonance is measured at approximately 1236
MeV. The ratio of the cross-sections of these resonances is
    \sigma(\pi^+ p \to \pi^+ p) : \sigma(\pi^0 n \to \pi^- p) : \sigma(\pi^- p \to \pi^- p) = 1 : \frac{2}{9} : \frac{1}{9}    (3.86)
which is consistent with the supposition that the resonance corresponds to a particle of
isospin 3/2 (and so |φ(3/2)| ≫ |φ(1/2)|). This particle is the ∆ particle, which lies in an
isospin I = 3/2 multiplet with states ∆−, ∆0, ∆+, ∆++ having weights I3 = −3/2, −1/2, 1/2, 3/2
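The three cross sections follow mechanically from the expansion coefficients in (3.80) together with (3.81). The sketch below encodes those coefficients and evaluates the ratios when φ(1/2) is neglected; the dictionary layout, the function names, and the sample values φ(3/2) = 1, φ(1/2) = 0 are illustrative assumptions.

```python
import math

R13, R23 = math.sqrt(1 / 3), math.sqrt(2 / 3)

# Coefficients of the pion-nucleon product states in the |I, I3> basis, as in (3.80).
PRODUCT_STATES = {
    "pi+ p": {(1.5, 1.5): 1.0},
    "pi0 p": {(1.5, 0.5): R23, (0.5, 0.5): R13},
    "pi+ n": {(1.5, 0.5): R13, (0.5, 0.5): -R23},
    "pi- p": {(1.5, -0.5): R13, (0.5, -0.5): R23},
    "pi0 n": {(1.5, -0.5): R23, (0.5, -0.5): -R13},
    "pi- n": {(1.5, -1.5): 1.0},
}

def amplitude(initial, final, phi):
    # Matrix element of S, using <I', m'| S |I, m> = phi(I) delta_II' delta_mm'.
    a, b = PRODUCT_STATES[initial], PRODUCT_STATES[final]
    return sum(coeff * b[key] * phi[key[0]] for key, coeff in a.items() if key in b)

def cross_section(initial, final, phi, k=1.0):
    return k * abs(amplitude(initial, final, phi)) ** 2

# At the resonance the I = 3/2 channel dominates; here phi(1/2) is set to zero.
phi_resonance = {1.5: 1.0, 0.5: 0.0}
ratios = [
    cross_section("pi+ p", "pi+ p", phi_resonance),
    cross_section("pi0 n", "pi- p", phi_resonance),
    cross_section("pi- p", "pi- p", phi_resonance),
]
print(ratios)  # approximately [1, 2/9, 1/9]
```

Changing phi_resonance to non-zero φ(1/2) reproduces the general combinations in (3.84) and (3.85).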
To proceed, define the (n + 1) × (n + 1) square matrices Ei,j by (Ei,j)pq = δip δjq. Recall that
L(SU(n+1)) consists of the traceless antihermitian matrices, and is therefore spanned over
R by i(Ei,i − En+1,n+1) for i = 1, ..., n and Ei,j − Ej,i, i(Ei,j + Ej,i) for i < j. Hence, on
complexification, the Lie algebra is spanned by traceless diagonal matrices together with
the Ei,j for i ≠ j.
Suppose that h = diag(a1, a2, ..., an+1), with Σ_i ai = 0, is a traceless diagonal matrix.
Then if i ≠ j, observe that [h, Ei,j] = (ai − aj)Ei,j.
(ad Ep,q ad Ei,j )Er,s = δjr δiq Ep,s − δjr δps Ei,q − δis δqr Ep,j + δis δpj Er,q (3.88)
and hence the component of (ad Ep,q ad Ei,j )Er,s in the E`,t direction is
δjr δiq δp` δst − δjr δps δi` δqt − δis δqr δp` δjt + δis δpj δr` δqt (3.89)
δjr δiq δpr − δjr δps δir δqs − δis δqr δpr δjs + δis δpj δqs = δjr δiq δpr + δis δpj δqs (3.90)
(as i ≠ j and p ≠ q). So the contribution to the trace from these terms is
δjr δiq δpr + δis δpj δqs = 2nδiq δpj (3.91)
We also compute the component of (ad Ep,q ad Ei,j )Er,s in the direction Ek,k −
En+1,n+1 for k = 1, . . . , n. Observe that the component of δiq Ep,s − δps Eiq in this
direction is δiq δps (δpk − δik ) (if i 6= q or p 6= s then the diagonal components of
δiq Ep,s −δps Eiq vanish). Hence the component of (ad Ep,q ad Ei,j )Er,s in the direction
Ek,k − En+1,n+1 for k = 1, . . . , n is
δjr δiq δps (δpk − δik ) + δis δpj δrq (δrk − δpk ) (3.92)
It follows that the component of (ad Ep,q ad Ei,j )(Ek,k − En+1,n+1 ) along (Ek,k −
En+1,n+1 ) is
− δj,n+1 δiq δp,n+1 (δpk − δik ) − δi,n+1 δpj δq,n+1 δpk (3.93)
δjk δiq δpk + δik δpj δqk + δj,n+1 δiq δp,n+1 δik + δi,n+1 δpj δq,n+1 δpk (3.94)
and so
(ad Ep,q ad h)Ei,j = ad Ep,q (ai − aj )Ei,j = (ai − aj )(δqi Ep,j − δp,j Ei,q ) (3.97)
as p ≠ q. Hence
κ(Ep,q , h) = 0 (3.99)
iii) κ(h, g) where h = diag(a1, a2, ..., an+1) and g = diag(b1, b2, ..., bn+1) and Σ_i ai =
Σ_i bi = 0.
The only contribution to the trace Tr(ad h ad g) is from the terms
for i ≠ j. But
    (ad h ad g)Ei,j = (ai − aj)(bi − bj)Ei,j    (3.101)
so taking the sum over i and j (i ≠ j) we obtain
    \kappa(h, g) = 2(n + 1) \sum_i a_i b_i    (3.102)
Hence κ is negative definite over the span over R of i(Er,r − En+1,n+1) for r = 1, ..., n;
and this span is orthogonal to the span of the Ei,j − Ej,i and i(Ei,j + Ej,i) (i ≠ j).
Furthermore, κ is diagonal and negative definite over the span over R of the Ei,j − Ej,i and
i(Ei,j + Ej,i). κ is therefore non-degenerate.
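The step from (3.101) to (3.102) uses tracelessness: summing (ai − aj)(bi − bj) over i ≠ j gives 2(n + 1) Σ ai bi − 2(Σ ai)(Σ bi), and the last term vanishes. A small exact check of this, with illustrative function names and sample diagonal entries:

```python
from fractions import Fraction

def killing_form_diagonal(a, b):
    # Trace of ad h ad g over the E_{i,j} (i != j), using (3.101).
    size = len(a)
    return sum((a[i] - a[j]) * (b[i] - b[j])
               for i in range(size) for j in range(size) if i != j)

def closed_form(a, b):
    # Right-hand side of (3.102); len(a) plays the role of n + 1.
    return 2 * len(a) * sum(x * y for x, y in zip(a, b))

# Sample traceless diagonal entries (illustrative values).
a = [Fraction(1, 2), Fraction(-3), Fraction(5, 2)]
b = [Fraction(2), Fraction(1, 3), Fraction(-7, 3)]

print(killing_form_diagonal(a, b), closed_form(a, b))
```

If either list fails to sum to zero, the two functions disagree by exactly 2 (Σ ai)(Σ bi), which makes the role of the traceless condition explicit.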
4. SU(3) and the Quark Model
The Lie algebra of SU (3) consists of the traceless antihermitian 3 × 3 complex matrices.
It is convenient to define the following matrices
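    h_1 = \begin{pmatrix} \frac{1}{2} & 0 & 0 \\ 0 & -\frac{1}{2} & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad
    h_2 = \begin{pmatrix} \frac{1}{2\sqrt{3}} & 0 & 0 \\ 0 & \frac{1}{2\sqrt{3}} & 0 \\ 0 & 0 & -\frac{1}{\sqrt{3}} \end{pmatrix}
    e^1_+ = \begin{pmatrix} 0 & \frac{1}{\sqrt{2}} & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad
    e^1_- = \begin{pmatrix} 0 & 0 & 0 \\ \frac{1}{\sqrt{2}} & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}
    e^2_+ = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & \frac{1}{\sqrt{2}} \\ 0 & 0 & 0 \end{pmatrix}, \qquad
    e^2_- = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & \frac{1}{\sqrt{2}} & 0 \end{pmatrix}
    e^3_+ = \begin{pmatrix} 0 & 0 & \frac{1}{\sqrt{2}} \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad
    e^3_- = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ \frac{1}{\sqrt{2}} & 0 & 0 \end{pmatrix}    (4.1)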
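    [H_1, H_2] = 0
    [H_1, E^1_\pm] = \pm E^1_\pm, \qquad [H_1, E^2_\pm] = \mp \frac{1}{2} E^2_\pm, \qquad [H_1, E^3_\pm] = \pm \frac{1}{2} E^3_\pm
    [H_2, E^1_\pm] = 0, \qquad [H_2, E^2_\pm] = \pm \frac{\sqrt{3}}{2} E^2_\pm, \qquad [H_2, E^3_\pm] = \pm \frac{\sqrt{3}}{2} E^3_\pm    (4.3)
    [E^1_+, E^1_-] = H_1
    [E^2_+, E^2_-] = \frac{\sqrt{3}}{2} H_2 - \frac{1}{2} H_1
    [E^3_+, E^3_-] = \frac{\sqrt{3}}{2} H_2 + \frac{1}{2} H_1    (4.4)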
The remaining commutation relations are
    [E^1_+, E^2_+] = \frac{1}{\sqrt{2}} E^3_+, \qquad [E^1_-, E^2_-] = -\frac{1}{\sqrt{2}} E^3_-
    [E^1_+, E^3_-] = -\frac{1}{\sqrt{2}} E^2_-, \qquad [E^1_-, E^3_+] = \frac{1}{\sqrt{2}} E^2_+
    [E^2_+, E^3_-] = \frac{1}{\sqrt{2}} E^1_-, \qquad [E^2_-, E^3_+] = -\frac{1}{\sqrt{2}} E^1_+    (4.5)
    [E^1_+, E^2_-] = [E^1_-, E^2_+] = [E^1_+, E^3_+] = [E^1_-, E^3_-] = [E^2_+, E^3_+] = [E^2_-, E^3_-] = 0    (4.6)
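and
    \Big[ \frac{\sqrt{3}}{2} H_2 - \frac{1}{2} H_1, E^2_\pm \Big] = \pm E^2_\pm, \qquad [E^2_+, E^2_-] = \frac{\sqrt{3}}{2} H_2 - \frac{1}{2} H_1    (4.8)
and
    \Big[ \frac{\sqrt{3}}{2} H_2 + \frac{1}{2} H_1, E^3_\pm \Big] = \pm E^3_\pm, \qquad [E^3_+, E^3_-] = \frac{\sqrt{3}}{2} H_2 + \frac{1}{2} H_1    (4.9)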
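A representative sample of these commutation relations can be checked with explicit 3 × 3 matrices. The sketch below assumes h1 = diag(1/2, −1/2, 0), h2 = diag(1/(2√3), 1/(2√3), −1/√3), and raising and lowering matrices with a single entry 1/√2 in positions (1,2), (2,3), (1,3) and their transposes; this choice is consistent with (4.3)–(4.9) but should be treated as an assumption. All helper names are illustrative.

```python
import math

S2 = 1 / math.sqrt(2)
R3 = math.sqrt(3)

def unit(i, j, value):
    # 3x3 matrix with `value` in row i, column j (1-based) and zeros elsewhere.
    return [[value if (r, c) == (i - 1, j - 1) else 0.0 for c in range(3)]
            for r in range(3)]

def diag(a, b, c):
    return [[a, 0.0, 0.0], [0.0, b, 0.0], [0.0, 0.0, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def combine(ca, a, cb, b):
    return [[ca * x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def comm(a, b):
    return combine(1, matmul(a, b), -1, matmul(b, a))

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

ZERO = diag(0.0, 0.0, 0.0)
H1 = diag(0.5, -0.5, 0.0)
H2 = diag(1 / (2 * R3), 1 / (2 * R3), -1 / R3)
E_PLUS = {1: unit(1, 2, S2), 2: unit(2, 3, S2), 3: unit(1, 3, S2)}
E_MINUS = {1: unit(2, 1, S2), 2: unit(3, 2, S2), 3: unit(3, 1, S2)}

checks = {
    "[H1, E1+] = E1+": close(comm(H1, E_PLUS[1]), E_PLUS[1]),
    "[H1, E2+] = -(1/2) E2+": close(comm(H1, E_PLUS[2]), combine(-0.5, E_PLUS[2], 0, ZERO)),
    "[H2, E2+] = (sqrt3/2) E2+": close(comm(H2, E_PLUS[2]), combine(R3 / 2, E_PLUS[2], 0, ZERO)),
    "[E1+, E1-] = H1": close(comm(E_PLUS[1], E_MINUS[1]), H1),
    "[E2+, E2-] = (sqrt3/2) H2 - (1/2) H1": close(comm(E_PLUS[2], E_MINUS[2]), combine(R3 / 2, H2, -0.5, H1)),
    "[E3+, E3-] = (sqrt3/2) H2 + (1/2) H1": close(comm(E_PLUS[3], E_MINUS[3]), combine(R3 / 2, H2, 0.5, H1)),
    "[E1+, E2+] = (1/sqrt2) E3+": close(comm(E_PLUS[1], E_PLUS[2]), combine(S2, E_PLUS[3], 0, ZERO)),
    "[E1-, E2-] = -(1/sqrt2) E3-": close(comm(E_MINUS[1], E_MINUS[2]), combine(-S2, E_MINUS[3], 0, ZERO)),
    "[E2+, E3+] = 0": close(comm(E_PLUS[2], E_PLUS[3]), ZERO),
}
for name, ok in checks.items():
    print(name, ok)
```

The remaining relations in (4.3)–(4.6) can be added to the `checks` dictionary in the same way.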
In particular, there are three pairs of raising and lowering operators E^m_±.
For simplicity, consider a representation d of L(SU(3)) obtained from a unitary representation
D of SU(3) such that d is an anti-hermitian representation, so that H1 and H2
are hermitian, and hence diagonalizable with real eigenvalues. Hence, H1 and (√3/2)H2 ± (1/2)H1
can be simultaneously diagonalized, and the eigenvalues are real. (In fact the same can be
shown without assuming unitarity!)
Suppose then that |φi is an eigenstate of H1 with eigenvalue p and also an eigenstate
of H2 with eigenvalue q. It is convenient to order the eigenvalues as points in R2 with
position vectors (p, q) where p is the eigenvalue of H1 and q of H2 . (p, q) is then referred
to as a weight.
From the commutation relations we have the following properties
for m1, m2, m3 ∈ Z. It follows that 2√3 q ∈ Z. It is particularly useful to plot the
sets of eigenvalues (p, q) as points in the plane. The resulting plot is known as the weight
diagram. As the representation is assumed to be irreducible, there can only be finitely
many points on the weight diagram, though it is possible that a particular weight may
correspond to more than one state. Moreover, as 2p ∈ Z, 2√3 q ∈ Z, the weights are
constrained to lie on the points of a lattice. From the effect of the raising and lowering
operators on the eigenvalues, it is straightforward to see that this lattice is formed by the
tessellation of the plane by equilateral triangles of side 1. This is illustrated in Figure 1,
where the effect of the raising and lowering operators is given (in this diagram (0, 0) is a
weight, though this need not be the case generically).
[Figure 1: the weight lattice, showing the effect of the raising and lowering operators.]
The weight diagram has three axes of symmetry. To see this, recall that if m is a weight
of a state in an irreducible representation of L(SU(2)) then so is −m. In the context of
the three L(SU(2)) algebras contained in L(SU(3)) this means that from the properties
of the algebra in (4.7), if (p, q) is a weight then so is (−p, q), i.e. the diagram is reflection
symmetric about the line θ = π/2 passing through the origin. Also, due to the symmetry of
the L(SU(2)) algebra in (4.8), the weight diagram is reflection symmetric about the line
θ = π/6 passing through the origin: so if (p, q) is a weight then so is ((1/2)(p + √3 q), (1/2)(√3 p − q)).
And due to the symmetry of the L(SU(2)) algebra in (4.9) the weight diagram is reflection
symmetric about the line θ = 5π/6 passing through the origin: so if (p, q) is a weight then so
is ((1/2)(p − √3 q), (1/2)(−√3 p − q)).
Using this symmetry, it suffices to know the structure of the weight diagram in the
sector of the plane π/6 ≤ θ ≤ π/2. The remainder is fixed by the reflection symmetry.
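The three reflections are easy to apply by machine, which gives a quick way to generate all images of a given weight. The sketch below implements them and computes the orbit of a starting weight; the function names, the rounding tolerance, and the two sample weights are illustrative assumptions.

```python
import math

R3 = math.sqrt(3)

def reflect_pi_2(p, q):
    # Reflection about the line theta = pi/2.
    return (-p, q)

def reflect_pi_6(p, q):
    # Reflection about the line theta = pi/6.
    return ((p + R3 * q) / 2, (R3 * p - q) / 2)

def reflect_5pi_6(p, q):
    # Reflection about the line theta = 5 pi/6.
    return ((p - R3 * q) / 2, (-R3 * p - q) / 2)

REFLECTIONS = (reflect_pi_2, reflect_pi_6, reflect_5pi_6)

def orbit(weight, digits=9):
    # All images of `weight` under repeated reflections (rounded to merge duplicates).
    seen = {tuple(round(x, digits) for x in weight)}
    frontier = [weight]
    while frontier:
        p, q = frontier.pop()
        for reflect in REFLECTIONS:
            image = reflect(p, q)
            key = tuple(round(x, digits) for x in image)
            if key not in seen:
                seen.add(key)
                frontier.append(image)
    return seen

# A weight on the line theta = pi/6, and a weight on none of the three axes.
on_axis = orbit((0.5, 1 / (2 * R3)))
generic = orbit((1.0, 0.0))
print(len(on_axis), len(generic))
```

A weight lying on one of the axes of symmetry has three images, while a weight in general position has six, so knowing the weights in the sector π/6 ≤ θ ≤ π/2 determines the rest.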
Motivated by the treatment of SU (2) we make the definition:
Definition 42. |ψ⟩ is called a highest weight state if |ψ⟩ is an eigenstate of both H1 and
H2, and E^m_+ |ψ⟩ = 0 for m = 1, 2, 3.
Note that there must be a highest weight state, for otherwise one could construct
infinitely many eigenstates by repeated application of the raising operators E^m_+. Given
a highest weight state, let V′ be the vector space spanned by |ψ⟩ and states obtained by
acting with all possible products of lowering operators E^m_− on |ψ⟩. As there are only finitely
many points on the weight diagram, there can only be finitely many such terms. Then, by
making use of the commutation relations, it is clear that V′ is an invariant subspace of V.
As the representation is irreducible on V, this implies that V′ = V, i.e. V is spanned by
|ψ⟩ and a finite set of states obtained by acting with lowering operators on |ψ⟩. Suppose
that (p, q) is the weight of |ψ⟩. Then V is spanned by a basis of eigenstates of H1 and H2
with weights confined to the sector given by π ≤ θ ≤ 5π/3 relative to (p, q); all points on the
weight diagram must therefore lie in this sector.
Next suppose that |ψ1⟩ and |ψ2⟩ are two linearly independent highest weight states
(both with weight (p, q)). Let V1 and V2 be the vector spaces spanned by the states
obtained by acting with all possible products of lowering operators E^m_− on |ψ1⟩ and |ψ2⟩
respectively; one therefore obtains bases for V1 and V2 consisting of eigenstates of H1 and
H2. By the reasoning given previously, as V1 and V2 are invariant subspaces of V and the
representation is irreducible on V, it must be the case that V1 = V2 = V. In particular, we
find that |ψ2⟩ ∈ V1. However, the only basis element of V1 which has weight (p, q) is |ψ1⟩,
hence we must have |ψ2⟩ = c |ψ1⟩ for some constant c, in contradiction to the assumption
that |ψ1⟩ and |ψ2⟩ are linearly independent.
Having established the existence of a unique highest weight state |ψ⟩, we can proceed
to obtain the generic form for the weight diagram. Recall that the highest weight j of an
irreducible representation of L(SU(2)) is always non-negative. By acting on |ψ⟩ with the
lowering operators E^m_−, one obtains three irreducible representations of L(SU(2)). Non-negativity
of the highest weight corresponding to the L(SU(2)) irreducible representation
generated by E^1_− implies that the highest weight must lie in the half-plane to the right of
Non-negativity of the highest weight corresponding to the L(SU(2)) irreducible representation
generated by E^2_− implies that the highest weight must lie in the half-plane above
Finally, non-negativity of the highest weight corresponding to the L(SU(2)) irreducible
representation generated by E^3_− implies that the highest weight must lie in the half-plane
As the highest weight must lie in all three of these regions, it must lie in the sector
π/6 ≤ θ ≤ π/2 relative to (0, 0), or at the origin:
Lemma 7. If the highest weight is (0, 0), then there is only one state in the representation,
which is called the singlet.
Let |ψ⟩ be the highest weight state with weight (0, 0). Suppose that E^m_− |ψ⟩ ≠ 0 for
some m. Then by the reflection symmetry of the weight diagram, it follows that E^m_+ |ψ⟩ ≠ 0,
in contradiction to the fact that E^i_+ |ψ⟩ = 0 for i = 1, 2, 3, as |ψ⟩ is the highest weight
state. Hence E^m_± |ψ⟩ = 0 for m = 1, 2, 3. Also H1 |ψ⟩ = H2 |ψ⟩ = 0. It follows that the
1-dimensional subspace V′ spanned by |ψ⟩ is an invariant subspace of V, and therefore
V = V′ as the representation is irreducible.
There are then three possible locations for the highest weight state |ψi.
line orthogonal to the axis of reflection θ = π/6, about which they are symmetric, and there
are no states outside this line, as these points cannot be reached by applying lowering
operators. Then, by using the reflection symmetry, it follows that the outermost states
form an equilateral triangle with horizontal base. Each lattice point inside the triangle
corresponds to (at least) one state which has this weight, because each lattice point in the
triangle lies at some possible weight within the L(SU(2)) representation given in (4.7),
and from the properties of L(SU(2)) representations, we know that this has a state with
this weight (i.e. as the L(SU(2)) weight diagram has no “holes” in it, neither does the
L(SU(3)) weight diagram).
This case is illustrated by
$$[E_-^1, E_-^2] = -\frac{1}{\sqrt{2}}\, E_-^3 \tag{4.11}$$
This implies that products of lowering operators involving $E_-^3$ can be rewritten as
linear combinations of products of operators involving only $E_-^1$ and $E_-^2$ (in some order).
In particular, we find
$$\begin{aligned}
(E_-^1)(E_-^2)^n|\psi\rangle &= [E_-^1, E_-^2](E_-^2)^{n-1}|\psi\rangle + E_-^2 E_-^1 (E_-^2)^{n-1}|\psi\rangle \\
&= -\frac{1}{\sqrt{2}}\, E_-^3 (E_-^2)^{n-1}|\psi\rangle + E_-^2 E_-^1 (E_-^2)^{n-1}|\psi\rangle \\
&= -\frac{n}{\sqrt{2}}\, E_-^3 (E_-^2)^{n-1}|\psi\rangle
\end{aligned} \tag{4.12}$$
by simple induction, where we have used the fact that $E_-^1|\psi\rangle = 0$ and $[E_-^2, E_-^3] = 0$.
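The commutation relation (4.11) can be checked numerically in the three-dimensional fundamental representation. The sketch below is an illustration only: the explicit matrices are an assumption (lowering operators taken as elementary matrices in the basis (u, d, s), scaled by 1/√2 so that the weights come out as in the text), and the helper names are hypothetical.

```python
from math import sqrt, isclose

def unit(i, j):
    """3x3 matrix with a 1 in row i, column j (0-based) and zeros elsewhere."""
    return [[1.0 if (r, c) == (i, j) else 0.0 for c in range(3)] for r in range(3)]

def scale(a, m):
    return [[a * x for x in row] for row in m]

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)] for r in range(3)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(3)] for r in range(3)]

def same(a, b, tol=1e-12):
    return all(isclose(a[r][c], b[r][c], abs_tol=tol) for r in range(3) for c in range(3))

# Assumed lowering operators in the basis (u, d, s):
# E1_minus maps u -> d, E2_minus maps d -> s, E3_minus maps u -> s.
S = 1 / sqrt(2)
E1_minus = scale(S, unit(1, 0))
E2_minus = scale(S, unit(2, 1))
E3_minus = scale(S, unit(2, 0))
ZERO = [[0.0] * 3 for _ in range(3)]

# (4.11): [E1-, E2-] = -(1/sqrt 2) E3-
relation_4_11 = same(commutator(E1_minus, E2_minus), scale(-S, E3_minus))
# The two vanishing commutators used in the induction argument.
e3_commutes = (same(commutator(E2_minus, E3_minus), ZERO)
               and same(commutator(E1_minus, E3_minus), ZERO))
print(relation_4_11, e3_commutes)
```

Under these assumptions both checks succeed, which is the input needed for the induction step in (4.12).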
A generic state of some fixed weight in the representation can be written as a linear
combination of products of $E_-^2$ and $E_-^1$ lowering operators acting on $|\psi\rangle$ of the form
$$\Pi(E_-^1, E_-^2)\,|\psi\rangle \tag{4.13}$$
$$(E_-^2)^{m-\ell}(E_-^3)^{\ell}\,|\psi\rangle \tag{4.14}$$
where we have used the commutation relations $[E_-^2, E_-^3] = [E_-^1, E_-^3] = 0$.
Hence, it follows that all weights in the diagram can have at most multiplicity 1.
However, from the property of the L(SU (2)) representations, as the weights in the outer
layers have multiplicity 1, it follows that all weights in the interior have multiplicity at
least 1.
Hence, all the weights must be multiplicity 1.
Suppose that the highest weight lies on the line θ = π/6. In this case, by applying powers of
$E_-^1$ the states of the L(SU(2)) representation given in (4.7) are generated. These form a
horizontal line orthogonal to the axis of reflection θ = π/2, about which they are symmetric,
and there are no states outside this line, as these points cannot be reached by applying
lowering operators. Then, by using the reflection symmetry, it follows that the outermost
states form an inverted equilateral triangle with horizontal upper edge. Each lattice point
inside the triangle corresponds to (at least) one state which has this weight, because each
lattice point in the triangle lies at some possible weight within the L(SU (2)) representation
given in (4.7), and from the properties of L(SU (2)) representations, we know that this has
a state with this weight (i.e. as the L(SU (2)) weight diagram has no “holes” in it, neither
does the L(SU (3)) weight diagram).
This case is illustrated by
Proposition 27. Each weight in this triangle corresponds to a unique state.
Note that all of the states on the horizontal top edge of the triangle correspond to
unique states, because these weights correspond to states which can only be obtained by
acting on $|\psi\rangle$ with powers of $E_-^1$. It therefore follows by the reflection symmetry that all
Now consider a state of some fixed weight in the representation; this can be written as
a linear combination of terms of the form
$$\Pi(E_-^1, E_-^2)\,|\psi\rangle \tag{4.16}$$
go. Then either one finds that the state vanishes (due to an $E_-^2$ acting directly on $|\psi\rangle$), or
one can eliminate all of the $E_-^1$ terms and is left with a term proportional to
$$(E_-^1)^{m-\ell}(E_-^3)^{\ell}\,|\psi\rangle \tag{4.17}$$
4.1.3 Hexagonal Weight Diagrams
Suppose that the highest weight lies in the sector π/6 < θ < π/2. In this case, by applying
powers of $E_-^1$ the states of the L(SU(2)) representation given in (4.7) are generated. These
form a horizontal line extending to the left of the maximal weight which is orthogonal
to the line θ = π/2, about which they are symmetric. There are no states above, as these
points cannot be reached by applying lowering operators. Also, by applying powers of $E_-^2$
the states of the L(SU(2)) representation given in (4.8) are generated. These form a line
extending to the right of the maximal weight which is orthogonal to the axis of reflection
θ = π/6, about which they are symmetric, and there are no states to the right of this line,
as these points cannot be reached by applying lowering operators.
Then, by using the reflection symmetry, it follows that the outermost states form a
hexagon. Each lattice point inside the hexagon corresponds to (at least) one state which
has this weight, because each lattice point in the hexagon lies at some possible weight
within the L(SU (2)) representation given in (4.7), and from the properties of L(SU (2))
representations, we know that this has a state with this weight (i.e. as the L(SU (2)) weight
diagram has no “holes” in it, neither does the L(SU (3)) weight diagram).
This case is illustrated by
The multiplicities of the states for these weight diagrams are more complicated than
for the triangular diagrams. In particular, the weights on the two edges of the hexagon
leading off from the highest weight have multiplicity 1, because these states can only be
constructed as $(E_-^1)^n|\psi\rangle$ or $(E_-^2)^m|\psi\rangle$. So by symmetry, all of the states on the outer layer
of the hexagon have multiplicity 1. However, if one proceeds to the next layer, then the
multiplicity of all the states increases by 1. This happens until the first triangular layer is
reached, at which point all following layers have the same multiplicity as the first triangular
layer.
Suppose that the top horizontal edge leading off the maximal weight is of length m,
and that the other outer edge is of length n, with m ≥ n. This situation is illustrated
[Figure: hexagonal weight diagram with top horizontal edge of length m and adjacent outer edge of length n, showing the nested layers.]
The outer n layers are hexagonal, whereas the (n + 1)-th layer is triangular, and all following layers are also triangular. As
one goes inwards through the outer n + 1 layers the multiplicity of the states in the layers
increases from 1 in the first outer layer to n + 1 in the n + 1-th layer. Then all the states
in the following triangular layers have multiplicity n + 1 as well.
We will prove this in several steps.
Proposition 28. States with weights on the k-th hexagonal layer for k = 1, . . . , n or the
k = n + 1-th layer (the first triangular layer) have multiplicity not exceeding k.
In order to prove this, consider first a state on the upper horizontal edge of the k-th
layer for k ≤ n + 1. The length of this edge is m − k + 1. A general state on this edge is
obtained via
$$\Pi(E_-^2, E_-^1)\,|\psi\rangle \tag{4.18}$$
where $\Pi(E_-^2, E_-^1)$ contains (in some order) k − 1 powers of $E_-^2$ and ℓ powers of $E_-^1$ for
ℓ = k − 1, . . . , m.
Now use the commutation relation (4.11) to commute the powers of $E_-^2$ to the right as
far as they will go. Then the state can be written as a linear combination of the k vectors
$$|v_i\rangle = (E_-^3)^{i-1}(E_-^1)^{\ell-i+1}(E_-^2)^{k-i}\,|\psi\rangle \tag{4.19}$$
for i = 1, . . . , k.
It follows that this state has multiplicity ≤ k.
Next consider a state again on the k-th level, but now on the edge leading off to the
right of the horizontal edge which we considered above; this edge is parallel to the outer
edge of length n. Take k ≤ n + 1, so the edge has length n − k + 1. A state on this edge is
obtained via
$$\hat\Pi(E_-^1, E_-^2)\,|\psi\rangle \tag{4.20}$$
where $\hat\Pi(E_-^1, E_-^2)$ contains (in some order) k − 1 powers of $E_-^1$ and ℓ powers of $E_-^2$ where
ℓ = k − 1, . . . , n. Now use the commutation relation (4.11) to commute the powers of $E_-^1$
to the right as far as they will go. Then the state can be written as a linear combination
of the k vectors
$$|w_i\rangle = (E_-^3)^{i-1}(E_-^2)^{\ell-i+1}(E_-^1)^{k-i}\,|\psi\rangle \tag{4.21}$$
for i = 1, . . . , k.
So these states also have multiplicity ≤ k. By using the reflection symmetry, it follows
that all the states on the k-th hexagonal layer have multiplicity at most k.
We also have the
Proposition 29. The states with weights in the triangular layers have multiplicity not
exceeding n + 1.
Consider a state on the k-th row of the weight diagram for m + 1 ≥ k ≥ n + 1 which
lies inside the triangular layers. Such a state can also be written as
$$\Pi(E_-^2, E_-^1)\,|\psi\rangle \tag{4.22}$$
where $\Pi(E_-^2, E_-^1)$ contains (in some order) k − 1 powers of $E_-^2$ and ℓ powers of $E_-^1$ for
ℓ = k − 1, . . . , m, and hence by the reasoning above, it can be rewritten as a linear
combination of the k vectors $|v_i\rangle$ in (4.19). However, for i < k − n, $|v_i\rangle = 0$ as $(E_-^2)^{k-i}|\psi\rangle = 0$.
The only possible non-vanishing vectors are the n + 1 vectors $|v_{k-n}\rangle, |v_{k-n+1}\rangle, \ldots, |v_k\rangle$.
Hence these states have multiplicity ≤ n + 1.
Next note the lemma
3 3 1 i
E+ |wi,k i = (i − 1) q + p + + 1 − k |wi−1,k−1 i
2 √2 2
1 3 1 i 1 k
− √ (k − i)2 q − p + + − |wi,k−1 i
2 2 2 2 2 2
2 1 1
E+ |wi,k i = E− √ (i − 1) |wi−1,k−1 i
3 1 1
+ (k − i) q − p − (k − i − 1) |wi,k−1 i (4.23)
2 2 2
(with obvious simplifications in the cases when i = 1 or i = k)
Note that $S_1 = \{|\psi\rangle\}$ is linearly independent. Suppose that $S_{k-1}$ is linearly independent
(k ≥ 2). Consider $S_k$. Suppose
$$\sum_{i=1}^{k} c_i\,|w_{i,k}\rangle = 0 \tag{4.24}$$
which are at the top right hand corner of the k-th hexagonal (or outermost triangular for
k = n + 1) layer. We have shown therefore that these weights have multiplicity both less
than or equal to, and greater than or equal to k. Hence these weights have multiplicity
k. Next consider the states on the level k edges which are obtained by acting with the
L(SU(2)) lowering operators $E_-^1$ and $E_-^2$ on the “corner weight” states. Observe the
Lemma 9. Let d be a representation of L(SU(2)) on V such that a particular L(SU(2))
weight m > 0 has multiplicity p. Then all weights m′ such that |m′| ≤ m have multiplicity
at least p,
whose proof is left as an exercise.
By this lemma, all the states on the k-th layer obtained in this fashion have multi-
plicity k also. Then the reflection symmetry implies that all states on the k-th layer have
multiplicity k.
In particular, the states on the outer triangular layer have multiplicity n + 1. We
have shown that the states on the triangular layers must have multiplicity not greater than
n + 1, but by the lemma above together with the reflection symmetry, they must also have
multiplicity ≥ n + 1. Hence the triangular layer weights have multiplicity n + 1, and the
proof is complete.
This was rather long-winded. There exist general formulae constraining multiplicities
of weights in more general Lie group representations, but we will not discuss these here.
Using the multiplicity properties of the weight diagram, it is possible to compute the
dimension of the representation. We consider first the hexagonal weight diagram for m ≥ n.
Then there are $1 + \cdots + (m-n) + (m-n+1) = \tfrac12 (m-n+1)(m-n+2)$ weights in the
interior triangle. Each of these weights has multiplicity (n + 1), which gives
$\tfrac12 (n+1)(m-n+1)(m-n+2)$ linearly independent states corresponding to weights in the triangle.
Consider next the k-th hexagonal layer for k = 1, . . . , n. This has 3((m + 1 − (k − 1)) +
(n + 1 − (k − 1)) − 2) = 3(m + n + 2 − 2k) weights in it, and each weight has multiplicity
k, which gives 3k(m + n + 2 − 2k) linearly independent states in the k-th hexagonal layer.
The total number of linearly independent states is then given by
$$\tfrac12 (n+1)(m-n+1)(m-n+2) + \sum_{k=1}^{n} 3k(m+n+2-2k) = \tfrac12 (m+1)(n+1)(m+n+2) \tag{4.27}$$
This formula also applies in the case for m ≤ n and also for the triangular weight diagrams
by taking m = 0 or n = 0. The lowest dimensional representations are therefore 1,3,6,8,10...
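The counting argument behind (4.27) is easy to check by machine. The following is a minimal sketch, with hypothetical function names, that adds up the states layer by layer exactly as described above and compares the result with the closed form.

```python
def dimension_closed_form(m, n):
    """Right-hand side of (4.27); the product is always even, so // is exact."""
    return (m + 1) * (n + 1) * (m + n + 2) // 2

def dimension_by_layers(m, n):
    """Left-hand side of (4.27): interior triangle plus the n hexagonal layers."""
    if m < n:
        m, n = n, m  # the count is symmetric under exchanging m and n
    side = m - n
    triangle_weights = (side + 1) * (side + 2) // 2
    total = (n + 1) * triangle_weights          # each interior weight has multiplicity n + 1
    for k in range(1, n + 1):                   # k-th hexagonal layer, multiplicity k
        total += 3 * k * (m + n + 2 - 2 * k)
    return total

# Lowest dimensions obtained from small values of (m, n).
lowest = sorted({dimension_closed_form(m, n) for m in range(4) for n in range(4)})[:5]
print(lowest)
```

The two functions agree for all small (m, n), and the five smallest values reproduce the list 1, 3, 6, 8, 10 quoted in the text.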
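$$\bar d(v)\,u = (d(v))^*\,u \tag{4.28}$$
$$\bar d([v, w]) = \bar d(v)\,\bar d(w) - \bar d(w)\,\bar d(v) \tag{4.30}$$
$$[\bar d(T_a), \bar d(T_b)] = c_{ab}{}^{c}\,\bar d(T_c) \tag{4.32}$$
$$\begin{aligned}
i\bar H_1 &= (iH_1)^* \\
i\bar H_2 &= (iH_2)^* \\
i(\bar E_-^m + \bar E_+^m) &= \big(i(E_+^m + E_-^m)\big)^* \\
\bar E_+^m - \bar E_-^m &= (E_+^m - E_-^m)^*
\end{aligned} \tag{4.33}$$
which implies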
Then $\bar H_1$, $\bar H_2$ and $\bar E_\pm^m$ satisfy the same commutation relations as the unbarred operators,
and also behave in the same way under the hermitian conjugate. One can therefore
plot the weight diagram associated with the conjugate representation $\bar d$, the weights being
the (real) eigenvalues of $\bar H_1$ and $\bar H_2$. But as $\bar H_1 = -(H_1)^*$ and $\bar H_2 = -(H_2)^*$ it follows
that if (p, q) is a weight of the representation d, then (−p, −q) is a weight of the representation
$\bar d$. So the weight diagram of $\bar d$ is obtained from that of d by inverting all the points
(p, q) → −(p, q). Note that this means that the equilateral triangular weight diagrams △
and ▽ of equal length sides are conjugate to each other.
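u = (1, 0, 0)^T    weight $(\tfrac12, \tfrac{1}{2\sqrt3})$
d = (0, 1, 0)^T    weight $(-\tfrac12, \tfrac{1}{2\sqrt3})$
s = (0, 0, 1)^T    weight $(0, -\tfrac{1}{\sqrt3})$
The state of highest weight is (1, 0, 0)^T which has weight $(\tfrac12, \tfrac{1}{2\sqrt3})$. The weight diagram is
[Weight diagram: d and u on the upper row.]
ū = (1, 0, 0)^T    weight $(-\tfrac12, -\tfrac{1}{2\sqrt3})$
d̄ = (0, 1, 0)^T    weight $(\tfrac12, -\tfrac{1}{2\sqrt3})$
s̄ = (0, 0, 1)^T    weight $(0, \tfrac{1}{\sqrt3})$
The state of highest weight is (0, 0, 1)^T which has weight $(0, \tfrac{1}{\sqrt3})$. The weight diagram is
[Weight diagram: ū and d̄ on the lower row.]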
4.2.3 Eight-Dimensional Representations
Consider the adjoint representation defined on the complexified Lie algebra L(SU (3)), i.e.
ad(v)w = [v, w]. Then the weights of the states can be computed by evaluating the
commutators with h1 and h2 :
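$|\phi_1\rangle \otimes |\phi_2\rangle$. Then
$$\begin{aligned}
d(h_1)\,|\phi\rangle &= (d_1(h_1)\,|\phi_1\rangle) \otimes |\phi_2\rangle + |\phi_1\rangle \otimes (d_2(h_1)\,|\phi_2\rangle) \\
&= (p_1\,|\phi_1\rangle) \otimes |\phi_2\rangle + |\phi_1\rangle \otimes (p_2\,|\phi_2\rangle) \\
&= (p_1 + p_2)\,|\phi\rangle
\end{aligned} \tag{4.37}$$
and similarly
$$d(h_2)\,|\phi\rangle = (q_1 + q_2)\,|\phi\rangle \tag{4.38}$$
So the weight of $|\phi\rangle$ is $(p_1 + p_2, q_1 + q_2)$; the weights add in the tensor product representation.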
Using this, one can plot a weight diagram consisting of the weights of all the eigenstates
in the tensor product basis of V. The points in the weight diagram are obtained by adding
the pairs of weights from the weight diagrams of d1 and d2 respectively, keeping careful
track of the multiplicities (as the same point in the tensor product weight diagram may be
obtained from adding weights from different states in V1 ⊗ V2).
Once the tensor product weight diagram is constructed, pick a highest weight, which
corresponds to a state which is annihilated by the tensor product operators E+ m for m =
4.3.1 3 ⊗ 3 decomposition.
Consider the 3 ⊗ 3 tensor product. Adding the weights together one obtains the following
table of quark content and associated weights
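Quark content and weights for 3 ⊗ 3
Quark Content       Weight
u ⊗ u               $(1, \tfrac{1}{\sqrt3})$
d ⊗ d               $(-1, \tfrac{1}{\sqrt3})$
s ⊗ s               $(0, -\tfrac{2}{\sqrt3})$
u ⊗ d, d ⊗ u        $(0, \tfrac{1}{\sqrt3})$
u ⊗ s, s ⊗ u        $(\tfrac12, -\tfrac{1}{2\sqrt3})$
d ⊗ s, s ⊗ d        $(-\tfrac12, -\tfrac{1}{2\sqrt3})$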
The raising and lowering operators are $E_\pm^m = e_\pm^m \otimes 1 + 1 \otimes e_\pm^m$. The highest weight state
is u ⊗ u with weight $(1, \tfrac{1}{\sqrt3})$. Applying lowering operators to u ⊗ u it is clear that a
six-dimensional irreducible representation is obtained. The (unit-normalized) states and
weights are given by
Hence 3 ⊗ 3 = 6 ⊕ 3̄. The states in the 6 are symmetric, whereas those in the 3̄ are antisymmetric.
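Because weights add in a tensor product, the weight content of 3 ⊗ 3 can be tabulated mechanically and compared with that of 6 ⊕ 3̄. The sketch below is illustrative only; it stores a weight (p, q) as the integer pair (2p, 2√3 q) to avoid floating point, a bookkeeping convention introduced here and not taken from the notes.

```python
from collections import Counter
from itertools import product, combinations_with_replacement

# Quark weights (p, q), stored as (2p, 2*sqrt(3)*q) so that all entries are integers.
QUARK = {"u": (1, 1), "d": (-1, 1), "s": (0, -2)}

def add(w1, w2):
    return (w1[0] + w2[0], w1[1] + w2[1])

# All nine product states of 3 x 3, with multiplicities.
product_weights = Counter(add(QUARK[a], QUARK[b]) for a, b in product(QUARK, repeat=2))

# The 6: one weight for each unordered (symmetric) pair of quarks.
six = Counter(add(QUARK[a], QUARK[b]) for a, b in combinations_with_replacement(QUARK, 2))

# The 3-bar: the weights of the 3 with the sign reversed.
three_bar = Counter((-x, -y) for x, y in QUARK.values())

print(product_weights == six + three_bar)
```

Agreement of the weight multisets is a consistency check on the decomposition rather than a proof of it; the proof is the highest-weight construction described in the text.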
4.3.2 3 ⊗ 3̄ decomposition
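Quark content and weights for 3 ⊗ 3̄
Quark Content               Weight
u ⊗ s̄                       $(\tfrac12, \tfrac{\sqrt3}{2})$
u ⊗ d̄                       (1, 0)
d ⊗ s̄                       $(-\tfrac12, \tfrac{\sqrt3}{2})$
u ⊗ ū, d ⊗ d̄, s ⊗ s̄         (0, 0)
d ⊗ ū                       (−1, 0)
s ⊗ ū                       $(-\tfrac12, -\tfrac{\sqrt3}{2})$
s ⊗ d̄                       $(\tfrac12, -\tfrac{\sqrt3}{2})$
[Weight diagram: d s̄ and u s̄ on the top row; d ū, the triple point (d d̄, u ū, s s̄) and u d̄ on the middle row; s ū and s d̄ on the bottom row.]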
The raising and lowering operators are $E_\pm^m = e_\pm^m \otimes 1 + 1 \otimes \bar e_\pm^m$. All weights have
multiplicity 1, except for (0, 0) which has multiplicity 3. The highest weight state is u ⊗ s̄
with weight $(\tfrac12, \tfrac{\sqrt3}{2})$. Acting on this state with all possible lowering operators one obtains
an 8 with the following states and weights
Removing these weights from the weight diagram, one is left with a singlet 1 with
weight (0, 0), corresponding to the state
$$\frac{1}{\sqrt3}\,(u \otimes \bar u + s \otimes \bar s + d \otimes \bar d) \tag{4.39}$$
which is the unique linear combination (up to an overall scale) of u ⊗ ū, s ⊗ s̄ and d ⊗ d̄ which
is annihilated by the raising operators $E_+^m$. Hence we have the decomposition 3 ⊗ 3̄ = 8 ⊕ 1.
4.3.3 3 ⊗ 3 ⊗ 3 decomposition.
For this tensor product the quark content/weight table is as follows:
[Weight diagram for 3 ⊗ 3 ⊗ 3 not reproduced.]
The state of highest weight is u ⊗ u ⊗ u with weight $(\tfrac32, \tfrac{\sqrt3}{2})$. By applying lowering operators to this state, one
obtains a triangular 10-dimensional irreducible representation denoted by 10, which has
normalized states and weights:
Removing the (non-vanishing) span of these states from the tensor product space, one
is left with a 17-dimensional vector space. The new weight diagram is
[Weight diagram of the remaining 17-dimensional space not reproduced.]
Note that the highest weight is now $(\tfrac12, \tfrac{\sqrt3}{2})$. This weight has multiplicity 2. It
should be noted that the subspace consisting of linear combinations of d ⊗ u ⊗ u, u ⊗ d ⊗ u
and u ⊗ u ⊗ d which is annihilated by all raising operators $E_+^m$ is two-dimensional and
is spanned by the two orthogonal states $\tfrac{1}{\sqrt6}(d \otimes u \otimes u + u \otimes d \otimes u - 2\,u \otimes u \otimes d)$ and
$\tfrac{1}{\sqrt2}(d \otimes u \otimes u - u \otimes d \otimes u)$. By acting on these two states with all possible lowering
operators, one obtains two 8 representations whose states are mutually orthogonal.
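States and weights for another 8 in 3 ⊗ 3 ⊗ 3
State                                                             Weight
$\tfrac{1}{\sqrt2}(d \otimes u \otimes u - u \otimes d \otimes u)$      $(\tfrac12, \tfrac{\sqrt3}{2})$
$\tfrac{1}{\sqrt2}(s \otimes u \otimes u - u \otimes s \otimes u)$      (1, 0)
$\tfrac{1}{\sqrt2}(d \otimes u \otimes d - u \otimes d \otimes d)$      $(-\tfrac12, \tfrac{\sqrt3}{2})$
$\tfrac12(s \otimes d \otimes u + s \otimes u \otimes d - d \otimes s \otimes u - u \otimes s \otimes d)$,
$\tfrac12(s \otimes u \otimes d + d \otimes u \otimes s - u \otimes s \otimes d - u \otimes d \otimes s)$      (0, 0)
$\tfrac{1}{\sqrt2}(s \otimes d \otimes d - d \otimes s \otimes d)$      (−1, 0)
$\tfrac{1}{\sqrt2}(s \otimes u \otimes s - u \otimes s \otimes s)$      $(\tfrac12, -\tfrac{\sqrt3}{2})$
$\tfrac{1}{\sqrt2}(s \otimes d \otimes s - d \otimes s \otimes s)$      $(-\tfrac12, -\tfrac{\sqrt3}{2})$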
Removing these weights from the weight diagram, we are left with a singlet 1 with
weight (0, 0). The state corresponding to this singlet is
$$\frac{1}{\sqrt6}\,(s \otimes d \otimes u - s \otimes u \otimes d + d \otimes u \otimes s - d \otimes s \otimes u + u \otimes s \otimes d - u \otimes d \otimes s) \tag{4.40}$$
It is possible to arrange the baryons and the mesons into SU (3) multiplets; i.e. the states
lie in Hilbert spaces which are tensor products of vector spaces equipped with irreducible
representations of L(SU (3)). To see examples of this, it is convenient to group hadrons into
multiplets with the same baryon number and spin. We plot the hypercharge Y = S + B
where S is the strangeness and B is the baryon number against the isospin eigenvalue I3
for these particles.
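[Figure: (I3, Y) diagram of a meson octet, with masses in MeV: K (495) at Y = ±1, π⁻, π⁰, π⁺ (137) and η (549) at Y = 0.]
[Figure: (I3, Y) diagram of a meson octet, with masses in MeV: K* (892) at Y = ±1, ρ⁻, ρ⁰, ρ⁺ (770) and ω at Y = 0.]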
The baryon decuplet has B = 1 and J = 3/2 with (I3, Y) diagram
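[Figure: (I3, Y) diagram of the baryon decuplet, with masses in MeV: Δ⁻, Δ⁰, Δ⁺, Δ⁺⁺ at Y = 1; Σ*⁻, Σ*⁰, Σ*⁺ (1385) at Y = 0; Ξ*⁻, Ξ*⁰ at Y = −1; remaining entries not recoverable.]
[Figure: (I3, Y) diagram of a baryon multiplet, with masses in MeV: n, p (939); Σ⁻, Σ⁰, Σ⁺ (1193) and Λ (1116) at Y = 0; Ξ at Y = −1.]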
In this model, the fundamental states in the 3 are quarks, with basis states u (up),
d (down) and s (strange). The basis labels u, d, s are referred to as the flavours of the
quarks. The 3̄ states are called antiquarks with basis ū, d̄, s̄. Baryons are composed of
bound states of three quarks qqq, mesons are composed of bound states of pairs of quarks
and antiquarks q q̄. The quarks have J = 1/2 and B = 1/3 whereas the antiquarks have J = 1/2
and B = −1/3 which is consistent with the values of B and J for the baryons and mesons.
The quark and antiquark flavours can be plotted on the (I3 , Y ) plane:
[Figure: the quarks d and u at Y = 1/3, I3 = ∓1/2, together with s, and the antiquarks ū, d̄ and s̄, plotted on the (I3, Y) plane.]
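The (I3, Y) assignments of hadrons follow by adding the quantum numbers of their constituents, using Y = S + B. The sketch below illustrates this with exact fractions; the quark contents used in the examples (uud for the proton, u s̄ for the K⁺, uuu for the Δ⁺⁺) are the standard assignments, and the data layout and function names are illustrative choices rather than anything defined in the notes.

```python
from fractions import Fraction as F

# (I3, strangeness S, baryon number B) for each quark flavour.
QUARKS = {
    "u": (F(1, 2), 0, F(1, 3)),
    "d": (F(-1, 2), 0, F(1, 3)),
    "s": (F(0), -1, F(1, 3)),
}

def antiquark(flavour):
    """All additive quantum numbers change sign for the antiquark."""
    i3, s, b = QUARKS[flavour]
    return (-i3, -s, -b)

def i3_and_hypercharge(constituents):
    """Return (I3, Y) with Y = S + B for a list of (I3, S, B) triples."""
    i3 = sum(c[0] for c in constituents)
    s = sum(c[1] for c in constituents)
    b = sum(c[2] for c in constituents)
    return (i3, s + b)

proton = i3_and_hypercharge([QUARKS["u"], QUARKS["u"], QUARKS["d"]])
kaon_plus = i3_and_hypercharge([QUARKS["u"], antiquark("s")])
delta_pp = i3_and_hypercharge([QUARKS["u"]] * 3)
print(proton, kaon_plus, delta_pp)
```

A single u or d quark comes out at Y = 1/3 and the s quark at Y = −2/3, matching the quark diagram.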
We have shown that mesons and baryons can be constructed from q q̄ and qqq states
respectively. But why do qq particles not exist? This problem is resolved using the notion
of colour. Consider the ∆++ particle in the baryon decuplet. This is a u ⊗ u ⊗ u state with
J = 3/2. The members of the decuplet are the spin 3/2 baryons of lowest mass, so we assume
that the quarks have vanishing orbital angular momentum. Then the spin J = 3/2 is obtained
by having all the quarks in the spin up state, i.e. u↑ ⊗ u↑ ⊗ u↑. However, this violates the
Pauli exclusion principle. To get round this problem, it is conjectured that quarks possess
additional labels other than flavour. In particular, quarks have additional charges called
colour charges: there are three colour basis states associated with quarks called r (red), g
(green) and b (blue). The quark state wave-functions contain colour factors which lie in
a 3 representation of SU (3) which describes their colour; the colour of antiquark states
corresponds to a 3̄ representation of SU (3) (colour). This colour SU (3) is independent of
the flavour SU (3).
These colour charges are also required to remove certain discrepancies (of powers of
3) between experimentally observed processes such as the decay π 0 → 2γ and the cross
section ratio between the processes e+ e− → hadrons and e+ e− → µ+ µ− and theoretical
predictions. However, although colour plays an important role in these processes, it seems
that one cannot measure colour directly experimentally: all known mesons and baryons
are SU (3) colour singlets (so colour is confined). This principle excludes the possibility
of having qq particles, as there is no singlet state in the SU (3) (colour) tensor product
decomposition 3 ⊗ 3, though there is in 3 ⊗ 3 ⊗ 3 and 3 ⊗ 3̄. Other products of 3 and 3̄
can also be ruled out in this fashion.
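A quick necessary condition for a colour singlet is that the weight (0, 0) occurs in the tensor product at all. The sketch below counts the product states of zero weight; it uses the same integer bookkeeping for weights as an illustrative convention, (p, q) stored as (2p, 2√3 q), and the function name is hypothetical.

```python
from itertools import product

# Colour (or flavour) weights of the 3, stored as (2p, 2*sqrt(3)*q).
TRIPLET = {"1": (1, 1), "2": (-1, 1), "3": (0, -2)}
# The 3-bar has the opposite weights.
ANTITRIPLET = {name + "bar": (-x, -y) for name, (x, y) in TRIPLET.items()}

def zero_weight_multiplicity(*factors):
    """Number of product basis states whose total weight is (0, 0)."""
    count = 0
    for combo in product(*(f.values() for f in factors)):
        if sum(w[0] for w in combo) == 0 and sum(w[1] for w in combo) == 0:
            count += 1
    return count

qq = zero_weight_multiplicity(TRIPLET, TRIPLET)
q_qbar = zero_weight_multiplicity(TRIPLET, ANTITRIPLET)
qqq = zero_weight_multiplicity(TRIPLET, TRIPLET, TRIPLET)
print(qq, q_qbar, qqq)
```

Since 3 ⊗ 3 contains no state of weight (0, 0), it cannot contain a singlet. The presence of zero-weight states in 3 ⊗ 3̄ and 3 ⊗ 3 ⊗ 3 is only a necessary condition; that a singlet actually occurs there follows from the decompositions worked out earlier.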
Nevertheless, the decomposition of 3 ⊗ 3 is useful because it is known that in addition
to the u, d and s quark states, there are also c (charmed), t (top) and b (bottom) quark
flavours. However, the c, t and b quarks are heavier than the u, d and s quarks, and are
unstable: they decay into the lighter quarks. The SU(3) symmetry cannot be meaningfully
extended to a naive SU(6) symmetry because of the large mass differences which break the
symmetry. In this context, meson states formed from a heavy antiquark and a light quark
can only be reliably put into 3 multiplets, whereas baryons made from one heavy and two
light quarks lie in 3 ⊗ 3 = 6 ⊕ 3̄ multiplets.
5. Spacetime Symmetry
for all x.
This condition can be rewritten in matrix notation as
$$\Lambda^T \eta\, \Lambda = \eta \tag{5.4}$$
so Λ−1 is also a Lorentz transformation. Hence, the set of Lorentz transformations forms
a group, under matrix multiplication.
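The defining condition (5.4) and the closure under multiplication can be illustrated numerically. The sketch below assumes the metric η = diag(1, −1, −1, −1) used in these notes; the particular boost and rotation, and all function names, are illustrative choices.

```python
from math import cosh, sinh, cos, sin, isclose

ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def is_lorentz(lam, tol=1e-9):
    """Check Lambda^T eta Lambda = eta entry by entry."""
    lhs = matmul(matmul(transpose(lam), ETA), lam)
    return all(isclose(lhs[r][c], ETA[r][c], abs_tol=tol) for r in range(4) for c in range(4))

def boost_x(z):
    return [[cosh(z), sinh(z), 0, 0], [sinh(z), cosh(z), 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def rotation_z(phi):
    return [[1, 0, 0, 0], [0, cos(phi), sin(phi), 0], [0, -sin(phi), cos(phi), 0], [0, 0, 0, 1]]

# A product of Lorentz transformations is again a Lorentz transformation.
lam = matmul(rotation_z(0.3), boost_x(1.2))
print(is_lorentz(lam))

# A rescaling of time is not a Lorentz transformation.
not_lorentz = [[2, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
print(is_lorentz(not_lorentz))
```

For the same matrix, the top-left entry squared equals one plus the squared length of the rest of the first column, which is the relation between λ and α derived next.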
Write a generic Lorentz transformation as
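$$\Lambda = \begin{pmatrix} \lambda & \beta^T \\ \alpha & R \end{pmatrix} \tag{5.6}$$
$$\lambda^2 = 1 + \alpha\cdot\alpha \tag{5.7}$$
$$\lambda\beta = R^T\alpha \tag{5.8}$$
$$R^T R - \beta\beta^T = I_3 \tag{5.9}$$
$$\beta = \pm\frac{1}{\sqrt{1+\alpha\cdot\alpha}}\, R^T\alpha \tag{5.11}$$
$$R^T R - \frac{1}{1+\alpha\cdot\alpha}\, R^T\alpha\alpha^T R = I_3 \tag{5.12}$$
$$\hat R = \Big(1 - \frac{\alpha\alpha^T}{\sqrt{1+\alpha\cdot\alpha}\,\big(1+\sqrt{1+\alpha\cdot\alpha}\big)}\Big) R \tag{5.13}$$
or equivalently
$$R = \Big(1 + \frac{\alpha\alpha^T}{1+\sqrt{1+\alpha\cdot\alpha}}\Big) \hat R \tag{5.14}$$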
Then (5.12) implies
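$$\hat R^T \hat R = I_3 \tag{5.15}$$
i.e. $\hat R \in O(3)$. Moreover, it is straightforward to check directly that
$$\det \hat R = \frac{1}{\sqrt{1+\alpha\cdot\alpha}}\, \det R \tag{5.16}$$
where we have used the formula $\det(I_3 + K\alpha\alpha^T) = 1 + K\alpha\cdot\alpha$ for any K. Also, one can
write
$$\Lambda = \begin{pmatrix} \lambda & 0 \\ \alpha & I_3 \end{pmatrix} \begin{pmatrix} 1 & \lambda^{-2}\alpha^T \\ 0 & I_3 - \lambda^{-2}\alpha\alpha^T \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & R \end{pmatrix} \tag{5.17}$$
and hence
$$\det \Lambda = \frac{\lambda}{1+\alpha\cdot\alpha}\, \det R = \frac{\lambda}{\sqrt{1+\alpha\cdot\alpha}}\, \det \hat R \tag{5.18}$$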
O(3) has two connected components, the connected component of I3 (which is SO(3))
whose elements have determinant det R̂ = +1, and the connected component of −I3 , whose
elements have determinant det R̂ = −1.
There are therefore four connected components of the Lorentz group, according as
$\Lambda^0{}_0 \le -1$ or $\Lambda^0{}_0 \ge 1$ and det Λ = +1 or det Λ = −1. It is not possible to construct a
smooth curve in the Lorentz group passing from one of these components to the other.
The set of Lorentz transformations with det Λ = +1 forms a subgroup of the Lorentz group.
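Note that (5.13) implies that
$$\hat R^T \alpha = \frac{1}{\sqrt{1+\alpha\cdot\alpha}}\, R^T\alpha \tag{5.19}$$
and hence
$$\beta\cdot\beta = \frac{1}{1+\alpha\cdot\alpha}\, \alpha^T R R^T \alpha = \alpha\cdot\alpha \tag{5.20}$$
So, if Λ and Λ′ are two Lorentz transformations
$$\Lambda = \begin{pmatrix} \lambda & \beta^T \\ \alpha & R \end{pmatrix}, \qquad \Lambda' = \begin{pmatrix} \lambda' & \beta'^T \\ \alpha' & R' \end{pmatrix} \tag{5.21}$$
Hence the set of Lorentz transformations with $\Lambda^0{}_0 \ge 1$ also forms a subgroup of the
Lorentz group.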
The subgroup of Lorentz transformations with det Λ = +1 and $\Lambda^0{}_0 \ge 1$ is called the
proper orthochronous Lorentz group, which we denote by SO(3, 1)↑.
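We note the useful lemma
Lemma 10. Suppose that Λ ∈ SO(3, 1)↑. Then there exist S1, S2 ∈ SO(3) and z ∈ R such that
$$\Lambda = \begin{pmatrix} 1 & 0^T \\ 0 & S_1 \end{pmatrix} \begin{pmatrix} \cosh z & \sinh z & 0 & 0 \\ \sinh z & \cosh z & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0^T \\ 0 & S_2 \end{pmatrix} \tag{5.23}$$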
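From the analysis of the Lorentz group so far, we have shown that if Λ ∈ SO(3, 1)↑
then there exists α ∈ R³ and $\hat R$ ∈ SO(3) such that
$$\Lambda = \begin{pmatrix} \sqrt{1+\alpha\cdot\alpha} & \alpha^T \hat R \\ \alpha & \Big(1 + \frac{1}{1+\sqrt{1+\alpha\cdot\alpha}}\,\alpha\alpha^T\Big)\hat R \end{pmatrix}
= \begin{pmatrix} \sqrt{1+\alpha\cdot\alpha} & \alpha^T \\ \alpha & 1 + \frac{1}{1+\sqrt{1+\alpha\cdot\alpha}}\,\alpha\alpha^T \end{pmatrix} \begin{pmatrix} 1 & 0^T \\ 0 & \hat R \end{pmatrix} \tag{5.24}$$
The result follows on substituting this into Λ and setting $S_2 = (S_1)^T \hat R$.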
5.2 The Lorentz Group and SL(2, C)
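Consider the spacetime co-ordinates $x^\mu$. Define the matrices
$$\sigma^0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{5.26}$$
Then given real spacetime co-ordinates $x^\mu$, define the 2 × 2 complex hermitian matrix
$$\tilde x = x_\mu \sigma^\mu = \begin{pmatrix} x^0 - x^3 & -x^1 + i x^2 \\ -x^1 - i x^2 & x^0 + x^3 \end{pmatrix} \tag{5.27}$$
Observe that any hermitian 2 × 2 matrix can be written as $\tilde x$ for some real $x^\mu$.
Note that
$$\det \tilde x = (x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2 = \eta_{\mu\nu} x^\mu x^\nu \tag{5.28}$$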
ηµν xµ xν is invariant under the action of SO(3, 1). det x̃ is invariant under the action of
SL(2, C), the complex 2 × 2 matrices with unit determinant.
Proposition 30. There exists an isomorphism π : SL(2, C)/Z2 → SO(3, 1)↑ where SL(2, C)/Z2
consists of elements ±N ∈ SL(2, C) with +N identified with −N .
Given N ∈ SL(2, C) consider the 2 × 2 complex matrix $N \tilde x N^\dagger$. The components of
this matrix are linear in the spacetime co-ordinates $x^\mu$. As $\tilde x$ is hermitian, it follows that
$N \tilde x N^\dagger$ is also hermitian. Hence there exist $\Lambda^\mu{}_\nu \in R$ (independent of x) for µ, ν = 0, . . . , 3
such that
$$N \tilde x N^\dagger = \widetilde{(\Lambda x)} \tag{5.29}$$
Taking the determinant of both sides we find $\det \tilde x = \det \widetilde{(\Lambda x)}$ for all x, and therefore Λ is
a Lorentz transformation. We set
$$\Lambda = \pi(N) \tag{5.30}$$
Note that
$$\mathrm{Tr}\, \widetilde{(\Lambda x)} = 2 \Lambda^0{}_\mu x^\mu = \mathrm{Tr}\,(N^\dagger N \tilde x) \tag{5.31}$$
Setting $x^0 = 1$, $x^1 = x^2 = x^3 = 0$ we find $\Lambda^0{}_0 = \tfrac12 \mathrm{Tr}\,(N^\dagger N) > 0$.
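The map π can be made concrete by reading off Λ column by column from $N \tilde x N^\dagger$. The following sketch assumes the conventions of (5.27), with $x^\mu$ the contravariant components; the function names and the sample matrices N are illustrative.

```python
import cmath
from math import cosh, sinh, cos, sin

def tilde(x):
    """Hermitian matrix of (5.27) for x = (x0, x1, x2, x3)."""
    x0, x1, x2, x3 = x
    return [[x0 - x3, -x1 + 1j * x2], [-x1 - 1j * x2, x0 + x3]]

def untilde(h):
    """Invert tilde: recover (x0, x1, x2, x3) from a hermitian 2x2 matrix."""
    a, b, d = complex(h[0][0]), complex(h[0][1]), complex(h[1][1])
    return [(a + d).real / 2, -b.real, b.imag, (d - a).real / 2]

def matmul2(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)] for r in range(2)]

def dagger(a):
    return [[complex(a[c][r]).conjugate() for c in range(2)] for r in range(2)]

def pi_map(n):
    """4x4 matrix Lambda defined by N x~ N^dagger = (Lambda x)~."""
    columns = []
    for nu in range(4):
        e = [1.0 if mu == nu else 0.0 for mu in range(4)]
        columns.append(untilde(matmul2(matmul2(n, tilde(e)), dagger(n))))
    return [[columns[nu][mu] for nu in range(4)] for mu in range(4)]

def boost_sl2c(z):
    """exp(-(z/2) sigma^1) = cosh(z/2) - sinh(z/2) sigma^1."""
    c, s = cosh(z / 2), sinh(z / 2)
    return [[c, -s], [-s, c]]

def rotation3_sl2c(phi):
    """exp((i phi/2) sigma^3)."""
    return [[cmath.exp(0.5j * phi), 0], [0, cmath.exp(-0.5j * phi)]]

def close(a, b, tol=1e-9):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

N = matmul2(rotation3_sl2c(0.8), boost_sl2c(0.7))
minus_N = [[-x for x in row] for row in N]
print(close(pi_map(N), pi_map(minus_N)))  # N and -N give the same Lambda
```

With these conventions, `pi_map(boost_sl2c(z))` is the boost along the first axis with entries cosh z and sinh z, and `pi_map(rotation3_sl2c(phi))` is a rotation about the third axis.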
If N1, N2 ∈ SL(2, C) then
Hence $\widetilde{(\pi(N_1 N_2)x)} = \widetilde{(\pi(N_1)\pi(N_2)x)}$ for all x, which implies π(N1 N2) = π(N1)π(N2).
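Next we will establish that π is onto SO(3, 1)↑. First recall that any R ∈ SO(3, 1)↑ of
the form
$$R = \begin{pmatrix} 1 & 0^T \\ 0 & \hat R \end{pmatrix} \tag{5.34}$$
can be written as a product of rotations around the spatial co-ordinate axes
$$R_1(\phi_1) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \cos\phi_1 & \sin\phi_1 \\ 0 & 0 & -\sin\phi_1 & \cos\phi_1 \end{pmatrix} \tag{5.35}$$
$$R_2(\phi_2) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\phi_2 & 0 & -\sin\phi_2 \\ 0 & 0 & 1 & 0 \\ 0 & \sin\phi_2 & 0 & \cos\phi_2 \end{pmatrix} \tag{5.36}$$
$$R_3(\phi_3) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\phi_3 & \sin\phi_3 & 0 \\ 0 & -\sin\phi_3 & \cos\phi_3 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{5.37}$$
By a direct computation we find $\pi(e^{\frac{i\phi_j}{2}\sigma^j}) = R_j$ for j = 1, 2, 3; and
$$\pi(e^{-\frac{z}{2}\sigma^1}) = \begin{pmatrix} \cosh z & \sinh z & 0 & 0 \\ \sinh z & \cosh z & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{5.38}$$
where $e^{\frac{i\phi_j}{2}\sigma^j}$ ∈ SL(2, C) for j = 1, 2, 3 and $e^{-\frac{z}{2}\sigma^1}$ ∈ SL(2, C).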
Hence, if Λ ∈ SO(3, 1)↑ , it follows that one can write Λ = Λ1 .Λ2 . . . Λk where Λi are
elementary rotation or boost transformations in SO(3, 1)↑ , and from the above reasoning,
Λi = π(Ni ) for some Ni ∈ SL(2, C). Therefore, Λ = π(N1 .N2 . . . Nk ), so π is onto SO(3, 1)↑ .
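Next, suppose that π(N) = π(M) for N, M ∈ SL(2, C). Then
$$N \tilde x N^\dagger = M \tilde x M^\dagger \tag{5.39}$$
for all x. Set $Q = M^{-1} N$, so that
$$Q \tilde x Q^\dagger = \tilde x \tag{5.40}$$
Setting $x^0 = 1$, $x^1 = x^2 = x^3 = 0$, we obtain $Q Q^\dagger = I_2$, so Q ∈ SU(2). Hence
$$Q \sigma^i = \sigma^i Q \tag{5.41}$$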
2 − 1 map.
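Lastly, we must prove that if N ∈ SL(2, C) then π(N) ∈ SO(3, 1)↑. We have already
shown that π(N) is orthochronous. Suppose that det(π(N)) = −1. Consider
$$\hat\Lambda = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \pi(N) \tag{5.42}$$
Then det $\hat\Lambda$ = +1, so $\hat\Lambda$ ∈ SO(3, 1)↑. Hence, there exists some N′ ∈ SL(2, C) such that
$\hat\Lambda$ = π(N′), so
$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \pi(N) = \pi(N') \tag{5.43}$$
Setting $Y = N' N^{-1}$, we obtain
$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} = \pi(Y) \tag{5.44}$$
$$Y x_\mu \sigma^\mu Y^\dagger = x_0 I_2 + x_1 \sigma^1 + x_2 \sigma^2 - x_3 \sigma^3 \tag{5.45}$$
$$Y \sigma^1 = \sigma^1 Y, \qquad Y \sigma^2 = \sigma^2 Y, \qquad Y \sigma^3 = -\sigma^3 Y \tag{5.46}$$
This is not possible, because $[Y, \sigma^1] = [Y, \sigma^2] = 0$ implies that $Y = \alpha I_2$ for some α ∈ C.
Hence π(N) ∈ SO(3, 1)↑.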
Although π : SL(2, C) → SO(3, 1)↑ is not 1-1, we have shown that the restriction of π
to SL(2, C)/Z2 , in which N is identified with −N is 1-1.
Differentiate this constraint and set t = 0 to obtain
$$m^T \eta + \eta\, m = 0 \tag{5.48}$$
where $m = \frac{d\Lambda(t)}{dt}\big|_{t=0}$. The generic solution to this constraint is
$$m^\mu{}_\nu = \begin{pmatrix} 0 & \chi^T \\ \chi & S \end{pmatrix} \tag{5.49}$$
for any χ ∈ R³, where S is a real 3 × 3 antisymmetric matrix, $S = -S^T$. There are three real
degrees of freedom in χ and three real degrees of freedom in the antisymmetric matrix S.
Hence the Lie algebra is six-dimensional.
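Define the 4 × 4 matrices $M^{\mu\nu}$ for µ, ν = 0, 1, 2, 3 by
$$(M^{\mu\nu})^\alpha{}_\beta = i(\eta^{\mu\alpha} \delta^\nu{}_\beta - \eta^{\nu\alpha} \delta^\mu{}_\beta) \tag{5.50}$$
Note that $M^{\mu\nu} = -M^{\nu\mu}$, so there are only six linearly independent matrices defined here.
By direct computation, we find
$$\begin{aligned}
M^{01} &= \begin{pmatrix} 0 & i & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, &
M^{02} &= \begin{pmatrix} 0 & 0 & i & 0 \\ 0 & 0 & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \\
M^{03} &= \begin{pmatrix} 0 & 0 & 0 & i \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ i & 0 & 0 & 0 \end{pmatrix}, &
M^{12} &= \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & -i & 0 \\ 0 & i & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \\
M^{13} &= \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -i \\ 0 & 0 & 0 & 0 \\ 0 & i & 0 & 0 \end{pmatrix}, &
M^{23} &= \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \end{pmatrix}
\end{aligned} \tag{5.51}$$
$$[M^{\mu\nu}, M^{\rho\sigma}] = i\big(M^{\mu\sigma}\eta^{\nu\rho} + M^{\nu\rho}\eta^{\mu\sigma} - M^{\mu\rho}\eta^{\nu\sigma} - M^{\nu\sigma}\eta^{\mu\rho}\big)$$
which defines the complexified Lie algebra of the Lorentz group.
$$J_i = \tfrac12 \epsilon_{ijk} M_{jk}, \qquad K_i = M_{0i} \tag{5.53}$$
$$\begin{aligned}
[J_i, J_j] &= i\epsilon_{ijk} J_k \\
[K_i, K_j] &= -i\epsilon_{ijk} J_k \\
[J_i, K_j] &= i\epsilon_{ijk} K_k
\end{aligned} \tag{5.54}$$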
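So, setting
$$A_i = \tfrac12 (J_i - iK_i), \qquad B_i = \tfrac12 (J_i + iK_i) \tag{5.55}$$
we obtain the commutation relations
$$\begin{aligned}
[A_i, A_j] &= i\epsilon_{ijk} A_k \\
[B_i, B_j] &= i\epsilon_{ijk} B_k \\
[A_i, B_j] &= 0
\end{aligned} \tag{5.56}$$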
Hence the complexified Lorentz algebra L(SO(3, 1)) can be written as the direct sum of
two commuting complexified L(SU (2)) algebras. It follows that one can classify irreducible
representations of the Lorentz algebra by spins (A, B) for 2A, 2B ∈ N.
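The commutation relations (5.54) and (5.56) can be confirmed numerically from the explicit matrices (5.50). The sketch below assumes $J_i = \tfrac12 \epsilon_{ijk} M^{jk}$ and takes $K_i = M^{0i}$; the index placement on K is an assumption, and flipping its sign exchanges the roles of A and B but leaves every relation checked here unchanged. Function names are illustrative.

```python
ETA = [1, -1, -1, -1]  # diagonal entries of eta^{mu nu}
IDX = (1, 2, 3)

def M(mu, nu):
    """(M^{mu nu})^a_b = i (eta^{mu a} delta^nu_b - eta^{nu a} delta^mu_b), as in (5.50)."""
    return [[1j * (ETA[mu] * (mu == a) * (nu == b) - ETA[nu] * (nu == a) * (mu == b))
             for b in range(4)] for a in range(4)]

def matmul(x, y):
    return [[sum(x[r][k] * y[k][c] for k in range(4)) for c in range(4)] for r in range(4)]

def comm(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[r][c] - yx[r][c] for c in range(4)] for r in range(4)]

def lin(terms):
    """Linear combination sum_i c_i X_i for a list of (c_i, X_i) pairs."""
    return [[sum(c * x[r][k] for c, x in terms) for k in range(4)] for r in range(4)]

def eps(i, j, k):
    """Levi-Civita symbol for indices in {1, 2, 3}."""
    return (i - j) * (j - k) * (k - i) // 2

J = {i: lin([(0.5 * eps(i, j, k), M(j, k)) for j in IDX for k in IDX]) for i in IDX}
K = {i: M(0, i) for i in IDX}
A = {i: lin([(0.5, J[i]), (-0.5j, K[i])]) for i in IDX}
B = {i: lin([(0.5, J[i]), (0.5j, K[i])]) for i in IDX}

def same(x, y, tol=1e-12):
    return all(abs(x[r][c] - y[r][c]) < tol for r in range(4) for c in range(4))

def eps_sum(i, j, gens, factor=1j):
    """factor * sum_k eps_ijk gens[k]."""
    return lin([(factor * eps(i, j, k), gens[k]) for k in IDX])

def algebra_holds():
    for i in IDX:
        for j in IDX:
            checks = [
                same(comm(J[i], J[j]), eps_sum(i, j, J)),
                same(comm(K[i], K[j]), eps_sum(i, j, J, -1j)),
                same(comm(J[i], K[j]), eps_sum(i, j, K)),
                same(comm(A[i], A[j]), eps_sum(i, j, A)),
                same(comm(B[i], B[j]), eps_sum(i, j, B)),
                same(comm(A[i], B[j]), lin([])),  # lin([]) is the zero matrix
            ]
            if not all(checks):
                return False
    return True

print(algebra_holds())
```

The vanishing of $[A_i, B_j]$ is what allows the two L(SU(2)) factors to be treated independently when labelling representations by (A, B).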
$$\psi_\alpha \to \psi'_\alpha = N_\alpha{}^\beta \psi_\beta \tag{5.57}$$
Definition 45. The right handed Weyl spinors are elements of a 2-dimensional complex
vector space $\bar V$ on which the complex conjugate of the fundamental representation of
SL(2, C) acts as $D^*(N)\bar\chi = N^* \bar\chi$ where N ∈ SL(2, C) and $N^*$ is the complex conjugate
of N. In terms of components, if $\bar\chi \in \bar V$ has components $\bar\chi_{\dot\alpha}$ for $\dot\alpha$ = 1, 2, then under the
action of SL(2, C), $\bar\chi$ transforms as
and observe that $\epsilon^{\alpha\beta}\epsilon_{\beta\gamma} = \delta^\alpha{}_\gamma$. One defines $\epsilon_{\dot\alpha\dot\beta}$ and $\epsilon^{\dot\alpha\dot\beta}$ similarly.
Note that $\epsilon_{\alpha\beta}$ is invariant under SL(2, C), as
or in matrix notation $N \epsilon N^T = \epsilon$.
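If we define the contravariant representation $D_{CV}$ on V via
$$\psi_{\alpha_1 \ldots \alpha_n \dot\beta_1 \ldots \dot\beta_m} \to \psi'_{\alpha_1 \ldots \alpha_n \dot\beta_1 \ldots \dot\beta_m} = N_{\alpha_1}{}^{\mu_1} \cdots N_{\alpha_n}{}^{\mu_n}\, N^*{}_{\dot\beta_1}{}^{\dot\nu_1} \cdots N^*{}_{\dot\beta_m}{}^{\dot\nu_m}\, \psi_{\mu_1 \ldots \mu_n \dot\nu_1 \ldots \dot\nu_m} \tag{5.66}$$
$$N x_\mu \sigma^\mu N^\dagger = \eta_{\nu\rho} \Lambda^\rho{}_\gamma x^\gamma \sigma^\nu \tag{5.67}$$
So, denoting the components of $\sigma^\mu$ by $\sigma^\mu_{\alpha\dot\beta}$, one finds
so that $\bar\sigma^0 = \sigma^0$, $\bar\sigma^i = -\sigma^i$ for i = 1, 2, 3.
Exercise: Prove the following useful identities
i) $\sigma^\mu \bar\sigma^\nu + \sigma^\nu \bar\sigma^\mu = 2\eta^{\mu\nu} I_2$
ii) $\mathrm{Tr}\, \sigma^\mu \bar\sigma^\nu = 2\eta^{\mu\nu}$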
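Before attempting the proof, the two identities can be sanity-checked numerically. The sketch below uses the Pauli matrices of (5.26) together with $\bar\sigma^0 = \sigma^0$, $\bar\sigma^i = -\sigma^i$ as stated above; it is a numerical check, not a substitute for the exercise, and the function names are illustrative.

```python
SIGMA = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
# sigma-bar^0 = sigma^0, sigma-bar^i = -sigma^i
SIGMA_BAR = [SIGMA[0]] + [[[-x for x in row] for row in s] for s in SIGMA[1:]]
ETA = [1, -1, -1, -1]

def eta(mu, nu):
    return ETA[mu] if mu == nu else 0

def matmul2(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)] for r in range(2)]

def anticommutator_identity_holds(tol=1e-12):
    """Identity i): sigma^mu sbar^nu + sigma^nu sbar^mu = 2 eta^{mu nu} I_2."""
    for mu in range(4):
        for nu in range(4):
            p = matmul2(SIGMA[mu], SIGMA_BAR[nu])
            q = matmul2(SIGMA[nu], SIGMA_BAR[mu])
            for r in range(2):
                for c in range(2):
                    target = 2 * eta(mu, nu) * (r == c)
                    if abs(p[r][c] + q[r][c] - target) > tol:
                        return False
    return True

def trace_identity_holds(tol=1e-12):
    """Identity ii): Tr(sigma^mu sbar^nu) = 2 eta^{mu nu}."""
    for mu in range(4):
        for nu in range(4):
            p = matmul2(SIGMA[mu], SIGMA_BAR[nu])
            if abs(p[0][0] + p[1][1] - 2 * eta(mu, nu)) > tol:
                return False
    return True

print(anticommutator_identity_holds(), trace_identity_holds())
```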
5.4.2 The Lie algebra of SL(2, C)
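The Lie algebra of SL(2, C) consists of traceless complex 2 × 2 matrices, which has six real
dimensions. This is to be expected, as L(SO(3, 1)) is six-dimensional. It is convenient to
define the matrices
$$\begin{aligned}
(\sigma^{\mu\nu})_\alpha{}^\beta &= \frac{i}{4}\, (\sigma^\mu \bar\sigma^\nu - \sigma^\nu \bar\sigma^\mu)_\alpha{}^\beta \\
(\bar\sigma^{\mu\nu})_{\dot\alpha}{}^{\dot\beta} &= \frac{i}{4}\, (\bar\sigma^\mu \sigma^\nu - \bar\sigma^\nu \sigma^\mu)_{\dot\alpha}{}^{\dot\beta}
\end{aligned} \tag{5.77}$$
so that $\sigma^{0i} = -\tfrac{i}{2}\sigma^i$, $\sigma^{jk} = \tfrac12 \epsilon_{jk\ell}\sigma^\ell$, $\bar\sigma^{0i} = \tfrac{i}{2}\sigma^i$, $\bar\sigma^{jk} = \tfrac12 \epsilon_{jk\ell}\sigma^\ell$. It is clear that the $\sigma^{\mu\nu}$ span
the 2 × 2 traceless matrices over R (as do the $\bar\sigma^{\mu\nu}$), hence they are generators of the Lie
algebra of SL(2, C). By a direct computation we obtain the commutation relations:
$$[\sigma^{\mu\nu}, \sigma^{\rho\sigma}] = i\big(\eta^{\nu\rho}\sigma^{\mu\sigma} + \eta^{\mu\sigma}\sigma^{\nu\rho} - \eta^{\nu\sigma}\sigma^{\mu\rho} - \eta^{\mu\rho}\sigma^{\nu\sigma}\big)$$
which is the same commutation relation as for the Lie algebra L(SO(3, 1)). Similarly, we find
$$[\bar\sigma^{\mu\nu}, \bar\sigma^{\rho\sigma}] = i\big(\eta^{\nu\rho}\bar\sigma^{\mu\sigma} + \eta^{\mu\sigma}\bar\sigma^{\nu\rho} - \eta^{\nu\sigma}\bar\sigma^{\mu\rho} - \eta^{\mu\rho}\bar\sigma^{\nu\sigma}\big)$$
Hence the $\sigma^{\mu\nu}$ and $\bar\sigma^{\mu\nu}$ correspond to representations of L(SO(3, 1)).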
The action of SL(2, C) on left-handed Weyl spinors is given by
ψα → eωµν σ α β ψβ (5.80)
Just as for the Lorentz algebra, one can define $J_i = \tfrac12 \epsilon_{ijk}\sigma_{jk} = \tfrac12 \sigma^i$ and $K_i = \sigma_{0i} = \tfrac{i}{2}\sigma^i$.
Hence $A_i = \tfrac12 \sigma^i$, $B_i = 0$. Therefore the fundamental representation corresponds to a spin-1/2
L(SU(2)) representation generated by A, and a L(SU(2)) B-singlet. This representation
is denoted by (1/2, 0).
The action of SL(2, C) on right-handed Weyl spinors is given by
µν β̇
χ̄α̇ → eωµν σ̄ α̇ χ̄β̇ (5.81)
$$x^\mu \to \Lambda^\mu{}_\nu x^\nu + b^\mu \tag{5.82}$$
so the Poincaré group is closed under this multiplication. The identity is (I4, 0) and the
inverse of (Λ, b) is (Λ⁻¹, −Λ⁻¹b).
One can construct a group isomorphism between the Poincaré group and the subgroup
of GL(5, R) of matrices of the form
$$\begin{pmatrix} \Lambda & b \\ 0 & 1 \end{pmatrix}$$
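The 5 × 5 embedding turns composition of Poincaré transformations into matrix multiplication. The sketch below assumes the composition rule (Λ1, b1)(Λ2, b2) = (Λ1Λ2, Λ1b2 + b1), which is the rule consistent with (5.82) and with the inverse quoted above; boosts along the first axis are used as sample Lorentz transformations, and the function names are illustrative.

```python
from math import cosh, sinh, isclose

def boost_x(z):
    return [[cosh(z), sinh(z), 0.0, 0.0], [sinh(z), cosh(z), 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def embed(lam, b):
    """5x5 matrix with Lambda in the top-left block and b in the last column."""
    rows = [list(lam[r]) + [b[r]] for r in range(4)]
    rows.append([0.0, 0.0, 0.0, 0.0, 1.0])
    return rows

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

def compose(g1, g2):
    """(L1, b1)(L2, b2) = (L1 L2, L1 b2 + b1): apply g2 first, then g1."""
    (l1, b1), (l2, b2) = g1, g2
    lam = matmul(l1, l2)
    b = [sum(l1[m][n] * b2[n] for n in range(4)) + b1[m] for m in range(4)]
    return lam, b

def inverse_boost_element(z, b):
    """Inverse (L^-1, -L^-1 b) for L = boost_x(z), whose inverse is boost_x(-z)."""
    lam_inv = boost_x(-z)
    return lam_inv, [-sum(lam_inv[m][n] * b[n] for n in range(4)) for m in range(4)]

def close(a, b, tol=1e-9):
    return all(isclose(x, y, abs_tol=tol) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

g1 = (boost_x(0.4), [1.0, 2.0, 3.0, 4.0])
g2 = (boost_x(-1.1), [0.5, 0.0, -2.0, 1.0])
homomorphism_ok = close(embed(*compose(g1, g2)), matmul(embed(*g1), embed(*g2)))
print(homomorphism_ok)
```

Multiplying `embed(*g1)` by the embedding of `inverse_boost_element(0.4, g1[1])` gives the 5 × 5 identity, in agreement with the inverse (Λ⁻¹, −Λ⁻¹b).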
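in the Poincaré group passing through the identity when t = 0, so Λ(0) = I4, b(0) = 0.
Differentiating with respect to t and setting t = 0 we note that the generic element of the
Poincaré Lie algebra is of the form
$$\begin{pmatrix} m & v \\ 0 & 0 \end{pmatrix}$$
where m ∈ L(SO(3, 1)) and v ∈ R⁴ is unconstrained. Hence a basis for the Lie algebra is
given by the 5 × 5 matrices $M^{\mu\nu}$ and $P^\nu$ for µ, ν = 0, 1, 2, 3 where
$$(M^{\rho\sigma})^\mu{}_\nu = i(\eta^{\rho\mu}\delta^\sigma{}_\nu - \eta^{\sigma\mu}\delta^\rho{}_\nu), \qquad (M^{\rho\sigma})^4{}_\nu = (M^{\rho\sigma})^\mu{}_4 = (M^{\rho\sigma})^4{}_4 = 0 \tag{5.88}$$
$$(P^\nu)^\mu{}_4 = i\eta^{\mu\nu}, \qquad (P^\nu)^\mu{}_\lambda = (P^\nu)^4{}_\lambda = (P^\nu)^4{}_4 = 0 \tag{5.89}$$
(labelling the matrix indices by µ, ν = 0, 1, 2, 3, with the additional index denoted “4”). The $M^{\rho\sigma}$
generate the Lorentz sub-algebra
$$\begin{pmatrix} m & 0 \\ 0 & 0 \end{pmatrix}$$
for m ∈ L(SO(3, 1)); they satisfy the usual Lorentz algebra commutation relations
$$[M^{\mu\nu}, M^{\rho\sigma}] = i\big(M^{\mu\sigma}\eta^{\nu\rho} + M^{\nu\rho}\eta^{\mu\sigma} - M^{\mu\rho}\eta^{\nu\sigma} - M^{\nu\sigma}\eta^{\mu\rho}\big)$$
The $P^\nu$ generate the translations
$$\begin{pmatrix} 0 & iv \\ 0 & 0 \end{pmatrix}$$
$$\begin{aligned}
[M^{\mu\nu}, M^{\rho\sigma}] &= i\big(M^{\mu\sigma}\eta^{\nu\rho} + M^{\nu\rho}\eta^{\mu\sigma} - M^{\mu\rho}\eta^{\nu\sigma} - M^{\nu\sigma}\eta^{\mu\rho}\big) \\
[P^\mu, M^{\rho\sigma}] &= i\eta^{\rho\mu} P^\sigma - i\eta^{\sigma\mu} P^\rho \\
[P^\mu, P^\nu] &= 0
\end{aligned} \tag{5.95}$$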
Proposition 31. The Pauli-Lubanski vector satisfies the following commutation relations:
1) [Wµ , d(Pν )] = 0
We will use the identities
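$$\begin{aligned}
[W_\mu, d(P_\nu)] &= \tfrac{1}{2}\epsilon_{\mu\rho\sigma\theta}\,[d(M^{\rho\sigma})d(P^\theta), d(P_\nu)] \\
&= \tfrac{1}{2}\epsilon_{\mu\rho\sigma\theta}\left(d(M^{\rho\sigma})[d(P^\theta), d(P_\nu)] + [d(M^{\rho\sigma}), d(P_\nu)]\,d(P^\theta)\right) \\
&= \tfrac{1}{2}\epsilon_{\mu\rho\sigma\theta}\left(d(M^{\rho\sigma})\,d([P^\theta, P_\nu]) + d([M^{\rho\sigma}, P_\nu])\,d(P^\theta)\right) \\
&= \tfrac{1}{2}\epsilon_{\mu\rho\sigma\theta}\left(-i\delta^\rho{}_\nu\, d(P^\sigma) + i\delta^\sigma{}_\nu\, d(P^\rho)\right)d(P^\theta) \\
&= 0
\end{aligned} \tag{5.99}$$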
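$$\begin{aligned}
[W_\mu, d(M_{\rho\sigma})] &= \tfrac{1}{2}\epsilon_{\mu\lambda\chi\theta}\,[d(M^{\lambda\chi})d(P^\theta), d(M_{\rho\sigma})] \\
&= \tfrac{1}{2}\epsilon_{\mu\lambda\chi\theta}\left(d(M^{\lambda\chi})[d(P^\theta), d(M_{\rho\sigma})] + [d(M^{\lambda\chi}), d(M_{\rho\sigma})]\,d(P^\theta)\right) \\
&= \tfrac{1}{2}\epsilon_{\mu\lambda\chi\theta}\left(d(M^{\lambda\chi})\,d([P^\theta, M_{\rho\sigma}]) + d([M^{\lambda\chi}, M_{\rho\sigma}])\,d(P^\theta)\right) \\
&= \tfrac{1}{2}\epsilon_{\mu\lambda\chi\theta}\Big(d(M^{\lambda\chi})\left(i\delta^\theta{}_\rho\, d(P_\sigma) - i\delta^\theta{}_\sigma\, d(P_\rho)\right) \\
&\qquad\qquad + i\left(d(M^\lambda{}_\sigma)\delta^\chi{}_\rho - d(M^\chi{}_\sigma)\delta^\lambda{}_\rho - d(M^\lambda{}_\rho)\delta^\chi{}_\sigma + d(M^\chi{}_\rho)\delta^\lambda{}_\sigma\right)d(P^\theta)\Big) \\
&= \tfrac{i}{2}\epsilon_{\mu\lambda\chi\theta}\Big(d(M^{\lambda\chi})\delta^\theta{}_\rho\, d(P_\sigma) - d(M^{\lambda\chi})\delta^\theta{}_\sigma\, d(P_\rho) \\
&\qquad\qquad + 2d(M^\lambda{}_\sigma)\delta^\chi{}_\rho\, d(P^\theta) - 2d(M^\lambda{}_\rho)\delta^\chi{}_\sigma\, d(P^\theta)\Big) \\
&= \tfrac{i}{2}\left(\eta_{\sigma\tau}\delta^\theta{}_\rho - \eta_{\rho\tau}\delta^\theta{}_\sigma\right)\epsilon_{\mu\lambda\chi\theta}\left(d(M^{\lambda\chi})d(P^\tau) - 2d(M^{\lambda\tau})d(P^\chi)\right) \\
&= \tfrac{3i}{2}\left(\eta_{\sigma\tau}\delta^\theta{}_\rho - \eta_{\rho\tau}\delta^\theta{}_\sigma\right)\epsilon_{\mu\lambda\chi\theta}\, d(M^{[\lambda\chi})d(P^{\tau]})
\end{aligned} \tag{5.100}$$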
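$$\epsilon^{\lambda\chi\tau\gamma}W_\gamma = \tfrac{1}{2}\epsilon^{\lambda\chi\tau\gamma}\epsilon_{\gamma\nu_1\nu_2\nu_3}\, d(M^{\nu_1\nu_2})d(P^{\nu_3}) = 3\,d(M^{[\lambda\chi})d(P^{\tau]}) \tag{5.101}$$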
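$$\begin{aligned}
[W_\mu, d(M_{\rho\sigma})] &= \tfrac{i}{2}\left(\eta_{\sigma\tau}\delta^\theta{}_\rho - \eta_{\rho\tau}\delta^\theta{}_\sigma\right)\epsilon_{\mu\lambda\chi\theta}\,\epsilon^{\lambda\chi\tau\gamma}W_\gamma \\
&= \tfrac{i}{2}\left(\eta_{\sigma\tau}\delta^\theta{}_\rho - \eta_{\rho\tau}\delta^\theta{}_\sigma\right)(-2)\left(\delta^\tau{}_\mu\eta_{\theta\gamma} - \delta^\tau{}_\theta\eta_{\mu\gamma}\right)W^\gamma \\
&= i\eta_{\mu\rho}W_\sigma - i\eta_{\mu\sigma}W_\rho
\end{aligned} \tag{5.102}$$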
as required.
(3) follows straightforwardly from (2):
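$$\begin{aligned}
[W_\mu, W_\nu] &= \tfrac{1}{2}\epsilon_{\nu\rho\sigma\theta}\,[W_\mu, d(M^{\rho\sigma})d(P^\theta)] \\
&= \tfrac{1}{2}\epsilon_{\nu\rho\sigma\theta}\left([W_\mu, d(M^{\rho\sigma})]\,d(P^\theta) + d(M^{\rho\sigma})[W_\mu, d(P^\theta)]\right) \\
&= \tfrac{1}{2}\epsilon_{\nu\rho\sigma\theta}\,[W_\mu, d(M^{\rho\sigma})]\,d(P^\theta) \\
&= \tfrac{1}{2}\epsilon_{\nu\rho\sigma\theta}\left(i\delta^\rho{}_\mu W^\sigma - i\delta^\sigma{}_\mu W^\rho\right)d(P^\theta) \\
&= -i\epsilon_{\mu\nu\rho\sigma}\,W^\rho\, d(P^\sigma)
\end{aligned} \tag{5.103}$$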
From this we find the
1) [Wµ W µ , d(Pν )] = 0
2) [Wµ W µ , d(Mρσ )] = 0
(1) follows because [Wµ W µ , d(Pν )] = Wµ [W µ , d(Pν )] + [Wµ , d(Pν )]W µ = 0
(2) holds because
as required.
Hence we have shown that Wµ W µ is a Casimir operator. d(Pµ )d(P µ ) is another Casimir
(1) follows because
[d(Pµ )d(P µ ), d(Pν )] = d(Pµ )[d(P µ ), d(Pν )] + [d(Pµ ), d(Pν )]d(P µ ) = 0 (5.105)
as required.
We shall show that irreducible representations are classified by the values of the two
Casimir operators Wµ W µ and d(Pµ )d(P µ ).
In particular, suppose that D is a unitary representation of the Poincaré group acting
on V . Such representations arise naturally in the context of quantum field theory when
V is taken to be a Hilbert space, and it is assumed that Poincaré transformations do not
affect transition probabilities. We will assume that this is the case.
Note that iMµν and iP µ form a basis for the (real) Poincaré algebra. Hence one can
locally write the Poincaré transformation as
$$e^{-\frac{i}{2}\left(b_\mu P^\mu + \omega_{\mu\nu}M^{\mu\nu}\right)}$$
and M
$$V = \bigoplus_q V_q \tag{5.110}$$
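$$h^\lambda(t) = e^{\frac{it}{2}\left(b_\mu d(P^\mu) + \omega_{\mu\nu}d(M^{\mu\nu})\right)}\, d(P^\lambda)\, e^{-\frac{it}{2}\left(b_\mu d(P^\mu) + \omega_{\mu\nu}d(M^{\mu\nu})\right)}$$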
Differentiating with respect to t we find
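$$\begin{aligned}
\frac{dh^\lambda}{dt} &= e^{\frac{it}{2}\left(b_\mu d(P^\mu) + \omega_{\mu\nu}d(M^{\mu\nu})\right)}\left[d(P^\lambda), -\tfrac{i}{2}\left(b_\mu d(P^\mu) + \omega_{\mu\nu}d(M^{\mu\nu})\right)\right]e^{-\frac{it}{2}\left(b_\mu d(P^\mu) + \omega_{\mu\nu}d(M^{\mu\nu})\right)} \\
&= \omega^\lambda{}_\chi\, e^{\frac{it}{2}\left(b_\mu d(P^\mu) + \omega_{\mu\nu}d(M^{\mu\nu})\right)}\, d(P^\chi)\, e^{-\frac{it}{2}\left(b_\mu d(P^\mu) + \omega_{\mu\nu}d(M^{\mu\nu})\right)} \\
&= \omega^\lambda{}_\chi\, h^\chi
\end{aligned} \tag{5.112}$$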
Definition 49. The stability subgroup Hq (or “little group”) which is associated with Vq
is the subgroup of the Poincaré group defined by
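$$H_q = \left\{ e^{-\frac{i}{2}\left(b_\mu P^\mu + \omega_{\mu\nu}M^{\mu\nu}\right)} : e^{-\frac{i}{2}\left(b_\mu d(P^\mu) + \omega_{\mu\nu}d(M^{\mu\nu})\right)}|\psi\rangle \in V_q \ \text{for}\ |\psi\rangle \in V_q \right\} \tag{5.115}$$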
It can be shown that Hq is a Lie subgroup of the Poincaré group. Suppose then that
$-\frac{i}{2}(b_\mu P^\mu + \omega_{\mu\nu}M^{\mu\nu}) \in L(H_q)$. It follows that $e^{-\frac{it}{2}\omega_{\mu\nu}M^{\mu\nu}}q = q$ for $t \in \mathbb{R}$. Expanding out
in powers of t we see that this constraint is equivalent to
ωµν q ν = 0 (5.116)
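As a brief worked illustration of (5.116), consider the two representative momenta $k^\mu = (m, 0, 0, 0)$ and $k^\mu = (E, E, 0, 0)$ named in the section headings. The computation below is a sketch that uses only the antisymmetry of $\omega_{\mu\nu}$; the translation parameters $b_\mu$ are not constrained by (5.116).

```latex
% Timelike case: k^\nu = (m, 0, 0, 0)
\omega_{\mu\nu} k^\nu = m\,\omega_{\mu 0} = 0
\;\Longrightarrow\; \omega_{i0} = 0 ,
% so only the three rotation parameters \omega_{ij} remain.

% Null case: k^\nu = (E, E, 0, 0)
\omega_{\mu\nu} k^\nu = E\,(\omega_{\mu 0} + \omega_{\mu 1}) = 0
\;\Longrightarrow\; \omega_{01} = 0 , \quad \omega_{20} = -\omega_{21} , \quad \omega_{30} = -\omega_{31} ,
% leaving three free parameters: \omega_{23}, \omega_{20} and \omega_{30}.
```

In the timelike case the surviving Lorentz parameters generate spatial rotations. In the null case they generate one rotation about the direction of motion together with two combined boost-rotations.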
where $|\psi_q\rangle \in V_q$, and these transformations can be used to obtain all elements of $V_q$ from
those in $V_k$.
It is then straightforward to show that the action of the representation of the whole
Poincaré group on $\{V_q : q^2 = m^2\}$ is determined by the action of $D(H_k)$ acting on $V_k$.
To see this explicitly, suppose that $|\psi_{k,M}\rangle$ for $M = 1, \ldots, \ell_k$ is a basis of $V_k$. Then one
can define
$$|\psi_{q,M}\rangle = D(\Lambda_0(q), 0)\,|\psi_{k,M}\rangle \tag{5.121}$$
and the $|\psi_{q,M}\rangle$ then form a basis for $V_q$. The representation of $H_k$ on $V_k$ is determined by
the coefficients $D(h)_M{}^N$, where $h \in H_k$, in the expansion
$$D(h)\,|\psi_{k,M}\rangle = D(h)_M{}^N\,|\psi_{k,N}\rangle \tag{5.122}$$
Suppose that (Λ, b) is a generic Poincaré transformation; then one can write
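$$\begin{aligned}
D(\Lambda, b)\,|\psi_{q,M}\rangle &= D(\Lambda_0(\Lambda q), 0)\, D\!\left(\Lambda_0(\Lambda q)^{-1}\Lambda\Lambda_0(q),\ \Lambda_0(\Lambda q)^{-1}b\right)_M{}^N\,|\psi_{k,N}\rangle \\
&= D\!\left(\Lambda_0(\Lambda q)^{-1}\Lambda\Lambda_0(q),\ \Lambda_0(\Lambda q)^{-1}b\right)_M{}^N\,|\psi_{\Lambda q,N}\rangle
\end{aligned} \tag{5.124}$$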
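The Lorentz part of the element appearing in (5.124), $\Lambda_0(\Lambda q)^{-1}\Lambda\Lambda_0(q)$, should fix the reference momentum k. The sketch below checks this numerically in the timelike case. It assumes the signature (+, −, −, −) and takes $\Lambda_0(q)$ to be the pure boost sending k = (m, 0, 0, 0) to q; that is one possible choice made here for illustration, and the function names are hypothetical.

```python
import math

ETA = (1.0, -1.0, -1.0, -1.0)  # assumed signature (+,-,-,-)


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def apply(a, v):
    return [sum(a[i][k] * v[k] for k in range(4)) for i in range(4)]


def lorentz_inverse(lam):
    """Inverse of a Lorentz matrix via Lambda^{-1} = eta Lambda^T eta."""
    return [[ETA[i] * lam[j][i] * ETA[j] for j in range(4)] for i in range(4)]


def standard_boost(q, m):
    """Assumed choice of Lambda_0(q): the pure boost taking (m,0,0,0) to q, with q^2 = m^2."""
    e, p = q[0], q[1:]
    lam = [[0.0] * 4 for _ in range(4)]
    lam[0][0] = e / m
    for i in range(3):
        lam[0][i + 1] = lam[i + 1][0] = p[i] / m
        for j in range(3):
            lam[i + 1][j + 1] = (i == j) + p[i] * p[j] / (m * (e + m))
    return lam


def rotation_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]


def boost_x(rapidity):
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return [[ch, sh, 0, 0], [sh, ch, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def wigner_element(lam, q, m):
    """Lambda_0(Lambda q)^{-1} Lambda Lambda_0(q), the Lorentz part appearing in (5.124)."""
    lam_q = apply(lam, q)
    return matmul(lorentz_inverse(standard_boost(lam_q, m)), matmul(lam, standard_boost(q, m)))


m = 1.0
p = (0.3, -0.4, 1.2)  # arbitrary spatial momentum, for illustration
q = [math.sqrt(m * m + sum(x * x for x in p)), *p]
lam = matmul(boost_x(0.7), rotation_z(0.5))  # an arbitrary Lorentz transformation
w = wigner_element(lam, q, m)
k = [m, 0.0, 0.0, 0.0]

# w should leave k invariant, i.e. w lies in the stability subgroup of k.
print(max(abs(a - b) for a, b in zip(apply(w, k), k)) < 1e-9)
```

Because w fixes k = (m, 0, 0, 0) and preserves the metric, its spatial 3 × 3 block is a rotation, which is the sense in which the representation of the full group is induced from that of $H_k$.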
We will examine the action of Hk on Vk in the timelike and null cases separately.
$$V = \bigoplus_q V_q \tag{5.125}$$
is not in general finite-dimensional, we shall assume that Vk (and hence the Vq ) are
finite dimensional.
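$$\begin{aligned}
W_0|\psi\rangle &= \tfrac{1}{2}\epsilon_{ij\ell}\, d(M^{ij})d(P^\ell)|\psi\rangle = d(M^{23})d(P^1)|\psi\rangle = E\,d(J_1)|\psi\rangle \\
W_1|\psi\rangle &= -E\,d(M^{23})|\psi\rangle = -E\,d(J_1)|\psi\rangle \\
W_2|\psi\rangle &= \tfrac{1}{2}\epsilon_{2\mu\nu\lambda}\, d(M^{\mu\nu})d(P^\lambda)|\psi\rangle \\
&= E\left(d(M^{13}) - d(M^{03})\right)|\psi\rangle \\
&= E\left(-d(J_2) + d(K_3)\right)|\psi\rangle \\
W_3|\psi\rangle &= \tfrac{1}{2}\epsilon_{3\mu\nu\lambda}\, d(M^{\mu\nu})d(P^\lambda)|\psi\rangle \\
&= E\left(-d(M^{12}) + d(M^{02})\right)|\psi\rangle \\
&= E\left(-d(J_3) - d(K_2)\right)|\psi\rangle
\end{aligned} \tag{5.127}$$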
which implies
We require that $e^{2\pi\sigma i} = \pm 1$ (for a projective representation) and so $2\sigma \in \mathbb{Z}$. Neutrinos
have helicity $\pm\frac{1}{2}$, photons have helicity ±1 and gravitons have helicity ±2.
Lie groups and Lie algebras play an important role in the various dynamical theories which
govern the behaviour of particles - the gauge theories. Though we will not examine the
quantization of these theories, we shall present the relationship between Lie algebras and
gauge theories.
Before examining non-Abelian gauge theories, we briefly recap some properties of the
simplest gauge theory, which is the U (1) gauge theory of electromagnetism.
6.1 Electromagnetism
The gauge theory of electromagnetism contains a field strength
fµν = ∂µ aν − ∂ν aµ (6.1)
∂ µ fµν = jν (6.3)
$$\epsilon^{\lambda\mu\nu\rho}\,\partial_\mu f_{\nu\rho} = 0 \tag{6.4}$$
Equation (6.4) holds automatically due to the existence of the vector potential. Conversely,
if fµν satisfies (6.4) then it can be shown that a vector potential aµ exists (though only
locally) such that fµν = ∂µ aν − ∂ν aµ .
Using the vector potential, one defines a covariant derivative Dµ by
Dµ ψ = ∂µ ψ + iaµ ψ (6.5)
iγ µ Dµ Ψ − mΨ = 0 (6.8)
is gauge invariant. Ψ is a 4-component Dirac spinor constructed from left and right handed
Weyl spinors ψα , χ̄α̇ via
Ψ= (6.9)
{γ µ , γ ν } ≡ γ µ γ ν + γ ν γ µ = 2η µν I4 . (6.11)
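The Clifford algebra relation (6.11) can be checked for a concrete set of matrices. The sketch below uses the chiral (Weyl) representation with signature (+, −, −, −); both are assumptions made for the illustration, since the representation used in the text is not visible in this excerpt. It also checks the hermiticity identity $\gamma^{\mu\dagger} = \gamma^0\gamma^\mu\gamma^0$, which the text uses later for the fermionic equations of motion.

```python
# Chiral (Weyl) representation of the gamma matrices; this choice is an assumption.
ETA = (1, -1, -1, -1)  # assumed signature (+,-,-,-)

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def neg(m):
    return [[-x for x in row] for row in m]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]


# gamma^0 = [[0, 1], [1, 0]], gamma^i = [[0, sigma^i], [-sigma^i, 0]]
GAMMA = [block(Z2, I2, I2, Z2)] + [block(Z2, s, neg(s), Z2) for s in PAULI]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]


def equal(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


def clifford_holds():
    """Check {gamma^mu, gamma^nu} = 2 eta^{mu nu} I_4 for all mu, nu."""
    for mu in range(4):
        for nu in range(4):
            ab, ba = matmul(GAMMA[mu], GAMMA[nu]), matmul(GAMMA[nu], GAMMA[mu])
            anti = [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]
            target = [[2 * ETA[mu] * (mu == nu) * (i == j) for j in range(4)] for i in range(4)]
            if not equal(anti, target):
                return False
    return True


def hermiticity_holds():
    """Check (gamma^mu)^dagger = gamma^0 gamma^mu gamma^0 for all mu."""
    g0 = GAMMA[0]
    return all(equal(dagger(g), matmul(g0, matmul(g, g0))) for g in GAMMA)


print(clifford_holds(), hermiticity_holds())
```

Any other representation related to this one by a unitary change of basis satisfies the same two identities.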
and hence
[Dµ , Dν ]φ = i(∂µ aν − ∂ν aµ )φ = ifµν φ (6.14)
where g(x) ∈ G.
Dµ Φ = ∂µ Φ + Aµ Φ (6.16)
which implies
$$\partial_\mu(g\Phi) + A'_\mu\, g\Phi = g\left(\partial_\mu\Phi + A_\mu\Phi\right) \tag{6.18}$$
and hence
$$(\partial_\mu g)\Phi + A'_\mu\, g\Phi = gA_\mu\Phi \tag{6.19}$$
Lemma 11. If g(t) is a curve in the matrix Lie group G then $\frac{dg}{dt}\,g(t)^{-1} \in L(G)$.
Suppose that $g(t) = g_0$ when $t = t_0$. Set $h(t) = g(t + t_0)g_0^{-1}$. Then h(t) is a smooth
curve in G with h(0) = I, and
$$\frac{dh}{dt}\Big|_{t=0} = \frac{dg}{dt}\Big|_{t=t_0}\, g_0^{-1} = \frac{dg}{dt}\,g^{-1}\Big|_{t=t_0} \tag{6.21}$$
But $\frac{dh}{dt}\big|_{t=0} \in L(G)$ by definition, and hence $\frac{dg}{dt}\,g^{-1}\big|_{t=t_0} \in L(G)$ for all $t_0$.
Hence we have shown that $(\partial_\mu g)g^{-1} \in L(G)$, and from our previous analysis of the
adjoint representation, we know that $gA_\mu g^{-1} \in L(G)$; so $A'_\mu \in L(G)$ as required.
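Lemma 11 can be illustrated numerically for G = SU(2), where L(G) consists of the traceless anti-Hermitian 2 × 2 matrices. The sketch below is only an illustration under stated assumptions: the curve `curve(t)` is a hypothetical example, and the derivative is estimated by a central finite difference, so the checks hold up to a small numerical tolerance.

```python
import math

PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def su2_exp(theta, axis):
    """exp(i theta sigma_axis) = cos(theta) I + i sin(theta) sigma_axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * (i == j) + 1j * s * PAULI[axis][i][j] for j in range(2)] for i in range(2)]


def curve(t):
    """A hypothetical smooth curve in SU(2), chosen only for illustration."""
    return matmul(su2_exp(t, 0), su2_exp(t * t, 2))


def velocity_times_inverse(t, h=1e-5):
    """Central-difference estimate of (dg/dt) g^{-1}, using g^{-1} = g^dagger in SU(2)."""
    gp, gm = curve(t + h), curve(t - h)
    dg = [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]
    return matmul(dg, dagger(curve(t)))


x = velocity_times_inverse(0.8)
xd = dagger(x)
anti_hermitian_defect = max(abs(x[i][j] + xd[i][j]) for i in range(2) for j in range(2))
trace_defect = abs(x[0][0] + x[1][1])

# Both defects should vanish up to finite-difference error.
print(anti_hermitian_defect < 1e-6, trace_defect < 1e-6)
```

Replacing the parameter t by a spacetime coordinate gives the statement used in the text, namely that $(\partial_\mu g)g^{-1}$ is Lie algebra valued.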
Dµ θ = ∂µ θ + d(Aµ )θ (6.22)
(Caveat: in this proof t is simply a parameter along a curve in G, not the spacetime
co-ordinate $x^0$!)
To prove (i), set $g = e^h$ for h ∈ L(G), so that $D(g) = e^{d(h)}$. Then (i) is equivalent to
$$f(t) = e^{-t\,d(h)}\, d\!\left(e^{th}\,v\,e^{-th}\right) e^{t\,d(h)} \tag{6.24}$$
for t ∈ R. Then
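$$\begin{aligned}
\frac{df}{dt} &= e^{-t\,d(h)}\left([d(e^{th}ve^{-th}), d(h)] + d(e^{th}[h, v]e^{-th})\right)e^{t\,d(h)} \\
&= e^{-t\,d(h)}\left(d([e^{th}ve^{-th}, h]) + d(e^{th}[h, v]e^{-th})\right)e^{t\,d(h)} \\
&= e^{-t\,d(h)}\left(d(e^{th}[v, h]e^{-th}) + d(e^{th}[h, v]e^{-th})\right)e^{t\,d(h)} \\
&= 0
\end{aligned} \tag{6.25}$$
and f(0) = d(v). Hence, $f(1) = e^{-d(h)}\,d(e^h v e^{-h})\,e^{d(h)} = f(0) = d(v)$ as required.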
To prove (ii), suppose $g(t) = g_0$ at $t = t_0$. Set $h(t) = g(t + t_0)g_0^{-1}$, so that h(t) is a
smooth curve in G with h(0) = I. Then
$$\frac{dh}{dt}\Big|_{t=0} = h_1 \in L(G) \tag{6.28}$$
$$d\!\left(\frac{dh}{dt}\Big|_{t=0}\right) = d(h_1) \tag{6.29}$$
But by definition
$$d(h_1) = \frac{d}{dt}D(e^{th_1})\Big|_{t=0} = \frac{d}{dt}D(h(t))\Big|_{t=0} \tag{6.30}$$
$$\frac{dD(g)}{dt}\,D(g^{-1})\Big|_{t=t_0} = d(h_1) = d\!\left(\frac{dg}{dt}\,g^{-1}\Big|_{t=t_0}\right) \tag{6.31}$$
as required.
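$$\begin{aligned}
&= D(g)\left(\partial_\mu\theta + d(A_\mu)\theta\right) + \left(d(gA_\mu g^{-1})D(g) - D(g)d(A_\mu)\right)\theta + \left(\partial_\mu(D(g)) - d(\partial_\mu g\,g^{-1})D(g)\right)\theta \\
&= D(g)D_\mu\theta + \left(d(gA_\mu g^{-1})D(g) - D(g)d(A_\mu)\right)\theta + \left(\partial_\mu(D(g)) - d(\partial_\mu g\,g^{-1})D(g)\right)\theta
\end{aligned} \tag{6.33}$$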
as required.
Given this property of transformations of covariant derivatives, one can define the
adjoint covariant derivative
Definition 52. Suppose that θ ∈ L(G) transforms under the adjoint representation Ad of
G. Then the covariant derivative associated with the adjoint representation is
To summarize, we have shown that if Φ transforms under the action of the fundamental
representation as $\Phi \to \Phi' = g\Phi$, then in order for the fundamental covariant derivative to
transform in the same way, one must impose the transformation
$$A_\mu \to A'_\mu = gA_\mu g^{-1} - (\partial_\mu g)g^{-1}$$
on the gauge potential. We then have shown that if Φ transforms under the action of a
generic representation D, $\Phi \to \Phi' = D(g)\Phi$, then the same transformation rule
$A_\mu \to A'_\mu = gA_\mu g^{-1} - (\partial_\mu g)g^{-1}$ is sufficient to ensure that the generic covariant derivative
$D_\mu\Phi = \partial_\mu\Phi + d(A_\mu)\Phi$ also transforms in the same way as Φ. Caveat: a covariant derivative is
always defined with respect to a particular representation.
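$$\begin{aligned}
[D_\mu, D_\nu]\Phi &= (\partial_\mu + A_\mu)\left(\partial_\nu\Phi + A_\nu\Phi\right) - (\partial_\nu + A_\nu)\left(\partial_\mu\Phi + A_\mu\Phi\right) \\
&= \left(\partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu, A_\nu]\right)\Phi
\end{aligned} \tag{6.37}$$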
Note that as $A_\mu \in L(G)$ it follows that $F_{\mu\nu} \in L(G)$. Note also that by construction
$[D_\mu, D_\nu]\Phi$ transforms like Φ under a gauge transformation. Hence if $F'_{\mu\nu}$ is the transformed
gauge field strength, then $F'_{\mu\nu}\Phi' = gF_{\mu\nu}\Phi$. As this must hold for all Φ, we find
$$F'_{\mu\nu} = gF_{\mu\nu}g^{-1} \tag{6.39}$$
Dµ Dν F µν = 0 (6.40)
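$$\begin{aligned}
D_\mu D_\nu F^{\mu\nu} &= D_\mu\left(\partial_\nu F^{\mu\nu} + [A_\nu, F^{\mu\nu}]\right) \\
&= \partial_\mu\partial_\nu F^{\mu\nu} + [A_\mu, \partial_\nu F^{\mu\nu}] + \partial_\mu[A_\nu, F^{\mu\nu}] + [A_\mu, [A_\nu, F^{\mu\nu}]] \\
&= [A_\mu, D_\nu F^{\mu\nu}] + [\partial_\mu A_\nu, F^{\mu\nu}] + [A_\nu, \partial_\mu F^{\mu\nu}] \\
&= [A_\mu, D_\nu F^{\mu\nu}] + \tfrac{1}{2}[F_{\mu\nu} - [A_\mu, A_\nu], F^{\mu\nu}] + [A_\nu, D_\mu F^{\mu\nu} - [A_\mu, F^{\mu\nu}]] \\
&= [A_\mu, D_\nu F^{\mu\nu}] + [A_\nu, D_\mu F^{\mu\nu}] - \tfrac{1}{2}[[A_\mu, A_\nu], F^{\mu\nu}] - [A_\nu, [A_\mu, F^{\mu\nu}]] \\
&= -\tfrac{1}{2}[[A_\mu, A_\nu], F^{\mu\nu}] - [A_\nu, [A_\mu, F^{\mu\nu}]] \\
&= 0 \quad \text{(using the Jacobi identity)}
\end{aligned} \tag{6.41}$$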
as required.
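$$\begin{aligned}
\frac{d}{dt}\kappa\!\left(e^{tZ}Xe^{-tZ}, e^{tZ}Ye^{-tZ}\right) &= \kappa\!\left(-e^{tZ}[X, Z]e^{-tZ}, e^{tZ}Ye^{-tZ}\right) + \kappa\!\left(e^{tZ}Xe^{-tZ}, -e^{tZ}[Y, Z]e^{-tZ}\right) \\
&= \kappa\!\left(-[e^{tZ}Xe^{-tZ}, Z], e^{tZ}Ye^{-tZ}\right) + \kappa\!\left(e^{tZ}Xe^{-tZ}, -[e^{tZ}Ye^{-tZ}, Z]\right) \\
&= -\kappa\!\left(e^{tZ}Xe^{-tZ}, [Z, e^{tZ}Ye^{-tZ}]\right) + \kappa\!\left(e^{tZ}Xe^{-tZ}, -[e^{tZ}Ye^{-tZ}, Z]\right) \\
&= 0
\end{aligned} \tag{6.44}$$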
as required.
Note that the coupling constant e plays an important role in the dynamics. If one
attempts to rescale A so that Aµ = eµ , it is possible to eliminate the explicit factor of e
from the Yang-Mills Lagrangian, and write
$$\frac{1}{4e^2}\,\kappa(F_{\mu\nu}, F^{\mu\nu}) = \frac{1}{4}\,\kappa(\hat F_{\mu\nu}, \hat F^{\mu\nu}) \tag{6.47}$$
where here $\hat F_{\mu\nu} = \partial_\mu\hat A_\nu - \partial_\nu\hat A_\mu + e[\hat A_\mu, \hat A_\nu]$. Although the explicit e-dependence of the Yang-Mills
Lagrangian appears to have been removed, observe that the gauge field strength now
has an e-dependent term, which arises from the commutator which is quadratic in A. So the
dependence on e in the non-abelian theory cannot be removed by rescaling. If, nevertheless,
one performs this rescaling (and then drops the hat on all terms), then the generic covariant
derivative is modified via $D_\mu\Phi = \partial_\mu\Phi + e\,d(A_\mu)\Phi$, and the gauge potential transformation
rule is also modified: $A_\mu \to A'_\mu = gA_\mu g^{-1} - e^{-1}(\partial_\mu g)g^{-1}$. Whether e appears as an overall
factor in the Yang-Mills Lagrangian, or within the covariant derivative and gauge field
strength, depends on convention. Until stated otherwise, we shall however retain the $\frac{1}{4e^2}$
outside the Lagrangian, and work with the un-rescaled gauge fields.
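The rescaling can be made explicit in two lines. The following is a short worked sketch, assuming the un-rescaled field strength $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu, A_\nu]$ and using only the bilinearity of κ.

```latex
% Substituting A_\mu = e \hat{A}_\mu into the un-rescaled field strength:
F_{\mu\nu} = e\,\partial_\mu \hat{A}_\nu - e\,\partial_\nu \hat{A}_\mu + e^2 [\hat{A}_\mu, \hat{A}_\nu]
           = e\,\hat{F}_{\mu\nu} ,
% and, by bilinearity of \kappa,
\frac{1}{4e^2}\,\kappa(F_{\mu\nu}, F^{\mu\nu})
  = \frac{1}{4e^2}\, e^2\, \kappa(\hat{F}_{\mu\nu}, \hat{F}^{\mu\nu})
  = \frac{1}{4}\,\kappa(\hat{F}_{\mu\nu}, \hat{F}^{\mu\nu}) .
% The same substitution in the transformation rule for A_\mu gives
\hat{A}_\mu \to g \hat{A}_\mu g^{-1} - e^{-1} (\partial_\mu g) g^{-1} .
```

In the abelian case the commutator vanishes, so the rescaled field strength carries no e-dependence at all; this is the sense in which the non-abelian theory differs.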
If the representation D of G is unitary, then one can couple additional gauge invariant
scalar terms to the Yang-Mills Lagrangian:
$$\mathcal{L} = \frac{1}{4e^2}\,\kappa(F_{\mu\nu}, F^{\mu\nu}) + \frac{1}{2}(D_\mu\Phi)^\dagger D^\mu\Phi - V(\Phi^\dagger\Phi) \tag{6.48}$$
where Φ → D(g)Φ under a gauge transformation.
Fµν → Fµν + (∂µ δAν + [Aµ , δAν ]) − (∂ν δAµ + [Aν , δAµ ]) (6.49)
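Hence, the first order variation of the Yang-Mills action $S = \frac{1}{4e^2}\int \kappa(F_{\mu\nu}, F^{\mu\nu})\,d^4x$ is given by
$$\begin{aligned}
\delta S &= \frac{1}{2e^2}\int \kappa(\delta F_{\mu\nu}, F^{\mu\nu})\,d^4x \\
&= \frac{1}{e^2}\int \kappa\!\left(\partial_\mu\delta A_\nu + [A_\mu, \delta A_\nu], F^{\mu\nu}\right)d^4x \\
&= -\frac{1}{e^2}\int \kappa\!\left(\delta A_\nu, \partial_\mu F^{\mu\nu} + [A_\mu, F^{\mu\nu}]\right)d^4x + \text{surface terms}
\end{aligned} \tag{6.50}$$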
where we have made use of the associativity of the Killing form. Neglecting the surface
terms (assuming that the solutions are sufficiently smooth and fall off sufficiently fast at
infinity), and requiring that δS = 0 for all variations $\delta A_\nu$ we obtain the Yang-Mills field equations
∂µ F µν + [Aµ , F µν ] = 0 (6.51)
or equivalently
Dµ F µν = 0 (6.52)
where here Dµ is the adjoint covariant derivative.
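Consider the contraction $\epsilon^{\lambda\mu\nu\rho}D_\mu F_{\nu\rho}$. As the terms $([A_\mu, \partial_\rho A_\nu] + [A_\rho, \partial_\mu A_\nu])$ are symmetric
in µ, ρ and the terms $[A_\mu, \partial_\nu A_\rho] + [A_\nu, \partial_\mu A_\rho]$ are symmetric in µ, ν, these terms give
no contribution to $\epsilon^{\lambda\mu\nu\rho}D_\mu F_{\nu\rho}$. Also,
because the partial derivatives commute with each other. Hence we find $\epsilon^{\lambda\mu\nu\rho}D_\mu F_{\nu\rho} = 0$,
or equivalently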
Dµ Fνρ + Dν Fρµ + Dρ Fµν = 0 (6.56)
This equation is called the Bianchi identity.
This can be used to prove a Jacobi identity for all covariant derivatives. Suppose that
Dµ is the covariant derivative associated with the representation D, and Φ ∈ V transforms
under the representation D as Φ → Φ0 = D(g)Φ. Let d be the induced representation of
Then note that
= ∂ν ∂λ Φ + d(Aλ )∂ν Φ + d(Aν )∂λ Φ + d(∂ν Aλ ) + d(Aν )d(Aλ ) Φ (6.57)
and hence
using the Bianchi identity on F . As this must hold for all Φ we obtain the Jacobi
identity for covariant derivatives:
This differs from the canonical energy-momentum tensor by a total derivative. Observe
that Tµν is gauge invariant by construction.
Proposition 35. If the Yang-Mills field strength F satisfies the Yang-Mills field equations,
the energy-momentum tensor satisfies
∂ µ Tµν = 0 (6.63)
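$$\begin{aligned}
\partial^\mu T_{\mu\nu} &= \partial^\mu\!\left(\kappa(F_{\mu\lambda}, F_\nu{}^\lambda)\right) - \tfrac{1}{4}\partial_\nu\!\left(\kappa(F_{\rho\sigma}, F^{\rho\sigma})\right) \\
&= \kappa(\partial^\mu F_{\mu\lambda}, F_\nu{}^\lambda) + \kappa(F_{\mu\lambda}, \partial^\mu F_\nu{}^\lambda) - \tfrac{1}{2}\kappa(\partial_\nu F_{\rho\sigma}, F^{\rho\sigma}) \\
&= \kappa\!\left(D^\mu F_{\mu\lambda} - [A^\mu, F_{\mu\lambda}], F_\nu{}^\lambda\right) + \kappa\!\left(F_{\mu\lambda}, D^\mu F_\nu{}^\lambda - [A^\mu, F_\nu{}^\lambda]\right) \\
&\quad - \tfrac{1}{2}\kappa\!\left(D_\nu F_{\rho\sigma} - [A_\nu, F_{\rho\sigma}], F^{\rho\sigma}\right)
\end{aligned} \tag{6.64}$$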
where here Dµ is the adjoint covariant derivative. Note that by the Bianchi identity
and so
$$\kappa(F_{\mu\lambda}, D^\mu F_\nu{}^\lambda) = \tfrac{1}{2}\kappa(F_{\mu\lambda}, D_\nu F^{\mu\lambda}) \tag{6.66}$$
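$$\mathcal{L}_{QCD} = -\frac{1}{4}\sum_{a=1}^{8} F^a_{\mu\nu}F^{a\mu\nu} + i\sum_{f=1}^{6}\bar\Psi^A_f\,\gamma^\mu\left(\delta_A{}^B\partial_\mu + eA^a_\mu (T_a)_A{}^B\right)\Psi_{Bf} - \sum_{f=1}^{6} m_f\,\bar\Psi^A_f\Psi_{Af} \tag{6.72}$$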
Here we have decomposed the non-abelian gluon field strength F and potential A into
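$$F_{\mu\nu} = F^a_{\mu\nu}T_a, \qquad A_\mu = A^a_\mu T_a \tag{6.73}$$
$$F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + e[A_\mu, A_\nu]^a = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + e\,c_{bc}{}^a A^b_\mu A^c_\nu \tag{6.74}$$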
The indices f = 1, . . . , 6 are flavour indices, whereas the indices A, B = 1, 2, 3 are
$SU(3)_{colour}$ indices. The $SU(3)_{colour}$ structure constants are $c_{ab}{}^c$. $m_f$ is the mass of the
f-flavour quark. The SU(3) gauge coupling is e; observe that the gauge potentials have been
re-scaled as mentioned previously in order to remove the gauge coupling factor from the
Yang-Mills action, although this means that e now appears in the fundamental covariant
derivative via $\partial_\mu + eA_\mu$, and also in the gauge transformations via
where g ∈ G.
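The structure constants $c_{ab}{}^c$ entering (6.74) can be computed directly from a basis of L(SU(3)). The sketch below assumes the basis $T_a = -\frac{i}{2}\lambda_a$ built from the Gell-Mann matrices $\lambda_a$; this normalisation is an assumption made for the illustration (the text only requires the $T_a$ to be anti-hermitian), and the function names are hypothetical. With this choice $\mathrm{Tr}(T_cT_d) = -\frac{1}{2}\delta_{cd}$, so the constants can be extracted with a trace.

```python
import math

S3 = 1 / math.sqrt(3)
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[S3, 0, 0], [0, S3, 0], [0, 0, -2 * S3]],
]

# Assumed normalisation (not fixed by the text): T_a = -(i/2) lambda_a.
T = [[[-0.5j * x for x in row] for row in lam] for lam in GELL_MANN]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]


def trace(a):
    return sum(a[i][i] for i in range(3))


def is_antihermitian(m, tol=1e-12):
    return all(abs(m[i][j] + m[j][i].conjugate()) < tol for i in range(3) for j in range(3))


def structure_constant(a, b, c):
    """c_{ab}^c defined by [T_a, T_b] = c_{ab}^c T_c, using Tr(T_c T_d) = -delta_{cd}/2."""
    return (-2 * trace(matmul(commutator(T[a], T[b]), T[c]))).real


# Indices are zero-based here, so (0, 1, 2) corresponds to c_{12}^3 in the text's labelling.
print(all(is_antihermitian(t) for t in T))
print(abs(structure_constant(0, 1, 2) - 1.0) < 1e-12)
```

With a different normalisation of the $T_a$ the constants rescale accordingly, so the numerical values above are tied to the assumed basis rather than to the text.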
The $\Psi_{Af}$ are fermionic fields associated to the quarks; they are 4-component Dirac
spinors. We set $\Psi^A_f = \delta^{AB}\Psi_{Bf}$, and $\bar\Psi^A_f \equiv (\Psi^A_f)^\dagger\gamma^0$.
Gauge Invariance
The Yang-Mills term $-\frac{1}{4}\sum_{a=1}^{8}F^a_{\mu\nu}F^{a\mu\nu} = \frac{1}{4}\kappa(F_{\mu\nu}, F^{\mu\nu})$ in the QCD action is
automatically gauge invariant by construction.
It remains to consider the terms involving fermions: the fermions transform under the
fundamental representation of SU (3)colour via
$$\bar\Psi^A_f \to \bar\Psi_{Qf}\,\delta^{AP}(g^*)_P{}^Q \tag{6.78}$$
Hence we see that under these gauge transformations
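$$\begin{aligned}
\bar\Psi^A_f\Psi_{Af} &\to \bar\Psi_{Qf}\,\delta^{AP}(g^*)_P{}^Q\, g_A{}^C\,\Psi_{Cf} \\
&= \bar\Psi_{Qf}\,\delta^{QC}\,\Psi_{Cf} \\
&= \bar\Psi^A_f\Psi_{Af}
\end{aligned} \tag{6.79}$$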
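$$\begin{aligned}
\bar\Psi^A_f\left(\delta_A{}^B\partial_\mu + e(A_\mu)_A{}^B\right)\Psi_{Bf} &\to \bar\Psi_{Qf}\,\delta^{AP}(g^*)_P{}^Q\, g_A{}^C\left(\delta_C{}^B\partial_\mu + e(A_\mu)_C{}^B\right)\Psi_{Bf} \\
&= \bar\Psi^A_f\left(\delta_A{}^B\partial_\mu + e(A_\mu)_A{}^B\right)\Psi_{Bf}
\end{aligned} \tag{6.80}$$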
$$\partial_\mu F^{a\mu\nu} + e\,c_{bc}{}^a A^b_\mu F^{c\mu\nu} + ie\sum_{f=1}^{6}\bar\Psi^A_f\,\gamma^\nu (T_a)_A{}^B\,\Psi_{Bf} = 0 \tag{6.82}$$
$$J^\nu = -ie\sum_{a=1}^{8}\sum_{f=1}^{6}\bar\Psi^A_f\,\gamma^\nu (T_a)_A{}^B\,\Psi_{Bf}\,T_a \tag{6.83}$$
Dµ F µν = J ν (6.84)
There are also fermionic equations of motion; these may be obtained by varying $\bar\Psi^A_f$
and $\Psi_{Bf}$ in the action. These can be varied independently; from the variation of $\bar\Psi^A_f$ (not
varying $\Psi_{Bf}$) we obtain the equation:
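$$\begin{aligned}
\delta S &= \int d^4x\left(i\bar\Psi^A_f\,\gamma^\mu\left(\delta_A{}^B\partial_\mu + eA^a_\mu (T_a)_A{}^B\right) - m_f\bar\Psi^B_f\right)\delta\Psi_{Bf} \\
&= \int d^4x\left(-i\partial_\mu\bar\Psi^A_f\,\gamma^\mu\delta_A{}^B + ie\bar\Psi^A_f\,\gamma^\mu A^a_\mu (T_a)_A{}^B - m_f\bar\Psi^B_f\right)\delta\Psi_{Bf}
\end{aligned} \tag{6.86}$$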
Hence we obtain
$$-i\partial_\mu\bar\Psi^A_f\,\gamma^\mu\delta_A{}^B + ie\bar\Psi^A_f\,\gamma^\mu A^a_\mu (T_a)_A{}^B - m_f\bar\Psi^B_f = 0 \tag{6.87}$$
Taking the hermitian transpose of the above equation, we obtain
$$i\gamma^0\gamma^\mu\partial_\mu\Psi^B_f - ie\gamma^0\gamma^\mu A^a_\mu\left((T_a)^*\right)_A{}^B\Psi^A_f - m_f\gamma^0\Psi^B_f = 0 \tag{6.88}$$
where we have made use of the identity $\gamma^{\mu\dagger} = \gamma^0\gamma^\mu\gamma^0$. As $T_a$ is anti-hermitian, we
therefore recover (6.85).