Geometrical and Topological Foundations of Theoretical Physics: From Gauge Theories To String Program

IJMMS 2004:34, 1777–1836

PII. S0161171204304400
Hindawi Publishing Corp.
Received 20 April 2003 and in revised form 27 October 2003
We study the role of geometrical and topological concepts in recent developments of theoretical
physics, notably in non-Abelian gauge theories and superstring theory, and we show the
significance of these concepts for a deeper understanding of the dynamical laws of physics.
This work aims to demonstrate that the global topological properties of the manifold that
models spacetime play a major role in quantum field theory and that, therefore, several physical
quantum effects arise from the nonlocal metrical and topological structure of this manifold.
We argue mathematically for the need to build new structures of space with different topology.
This means, in particular, that the hidden symmetries of fundamental physics can be related to
the phenomenon of topological change of certain classes of (presumably) nonsmooth manifolds.
2000 Mathematics Subject Classification: 14-xx, 55-xx, 81-xx, 83-xx.
1. Introduction. We analyze the role of geometrical and topological concepts in the
developments of theoretical physics, especially in gauge theory and string theory, and
we show the significance of these concepts for a better understanding of the
dynamics of physics. We claim that physical phenomena very likely emerge from the
geometrical and topological structure of spacetime. The attempts to solve one of the
central problems in twentieth-century theoretical physics, that is, how to combine gravity
and the other forces into a unitary theoretical explanation of the physical world,
essentially depend on the possibility of building a new geometrical framework conceptually
richer than Riemannian geometry. In fact, this geometrical framework already plays
a fundamental role in non-Abelian gauge theories and in superstring theory, thanks to
which a great variety of new mathematical structures has emerged. A very interesting
hypothesis is that the global topological properties of the manifold that models spacetime
play a major role in quantum field theory and that, consequently, several physical
quantum effects arise from the nonlocal metrical and topological structure of this
manifold. Thus the unification of general relativity and quantum theory requires some
fundamental breakthrough in our understanding of the relationship between spacetime
and quantum process. In particular, the superstring theories suggest that the
usual structure of spacetime must be abandoned at the quantum scale.
Non-Abelian gauge theories satisfy the basic physical requirements pertaining
to the symmetries of particle physics because they are geometric in character. They
profoundly elucidate the fundamental role played by bundles, connections, and curvature
in explaining the essential laws of nature. Kaluza-Klein theories and, more remarkably,
superstring theory showed that spacetime symmetries and internal (quantum) symmetries
might be unified through the introduction of new structures of space with a
different topology. This essentially means, in our view, that hidden symmetries of
fundamental physics can be related to the phenomenon of topological change of a certain
class of (presumably) nonsmooth manifolds.
2. The geometrization of theoretical physics: from Cartan's theory of gravitation
to geometric quantum theories. This expository article, which summarizes the main
subject of a book in progress on the same topic, is aimed at analyzing some of the
most important mathematical developments and the conceptual significance of the
geometrization of theoretical physics, from the work of Cartan and Weyl to the recent
non-Abelian gauge theories. The starting point of our reflections is the question of how to
characterize the properties of space (topological and algebraic invariants, group
structures, symmetries and symmetry breaking) at the level of quantum physics. More
generally, we will try to highlight some striking aspects of the mathematical developments
inspired by the attempts to solve one of the central problems in twentieth-century
theoretical physics: how to combine general relativity and quantum field theory into
a unitary theoretical description of the physical world. Another point, which is in all
likelihood intimately connected to the above, is the question of how to determine the
topological (global) structure of the universe, as well as the physical conditions for its
early formation. Finally, we seek to outline some theoretical remarks raised by the
recent developments in theoretical physics that bear on the above questions.
Moreover, these two questions lead to the fundamental issue of the nature of space
and spacetime: is it a purely formal structure, or does it include a generative principle
for physical phenomena? What relation is there among the physical properties of
microscopic and macroscopic matter, the kind of extended (or pointless) objects they
yield, and the features of the space into which they are embedded? Generally, an answer to
these fundamental questions and an explanation of basic aporias such as
continuous/discrete, local/global, deterministic/nondeterministic, and linear/nonlinear depend
on a satisfactory geometric theory whose concepts are somewhat different from the
ones underlying the progress of physics at the beginning of the twentieth century (general
relativity and quantum mechanics). In particular, it seems necessary to build a geometry
conceptually richer than Riemannian geometry. This has been partly achieved in the
last two decades, and we can now see the possibility of unifying the theory of gravitation
with quantum mechanics. The enriched geometry plays a basic role in non-Abelian
gauge theories and in superstring theory, for which a great variety of new mathematical
structures has emerged.
This more general post-Riemannian geometry is based upon two very interesting
ideas that we would now like to stress:
(1) space has ten or eleven dimensions (according to whether we deal with superstring
theory or supergravity) rather than four, an assumption made more plausible
by internal mathematical reasons as well as experimental physical evidence;
(2) the structure of spacetime at the quantum level is not that of a differentiable
manifold of class $C^\infty$, but apparently the equivalent of an arbitrary topological space
constructed from a complex (infinite-dimensional) Riemann surface and on which
some fundamental mathematical objects are defined.
The latter hypothesis implies in particular that the global topological properties of the
(Lorentzian or Riemannian) manifold $M$ play a major role in quantum field theory and
that, consequently, several (physical) quantum effects arise from the nonlocal metrical
and topological structure of $M$. It seems reasonable to think that general relativity and
quantum theory are intrinsically incompatible and that, rather than merely developing
technique, what is required is some fundamental breakthrough in our understanding
of the relationship between spacetime structure and quantum process.
Concerning the first idea, one may add that eleven-dimensional spacetime
recommends itself as the habitat of the maximal supergravity theory. Remarkably, there
is also a phenomenological argument for eleven as the minimal number of spacetime
dimensions, which was pointed out by Witten. If the familiar $SU(3) \times SU(2) \times U(1)$
gauge symmetry in four dimensions is to originate in isometries of a
compact manifold in $N$ hidden dimensions, then these extra dimensions must be at
least seven in number. This follows from the observation that no manifold of dimension
three or smaller can have more than six isometries (an $n$-dimensional manifold admits at
most $n(n+1)/2$ independent Killing vectors), and thus the eight-parameter group
$SU(3)$ can most economically appear as the isometry group of the four-dimensional
manifold $CP(2)$. Similarly, an $SU(2) \times U(1)$ gauge symmetry in four dimensions is most
economically obtained from the isometries of the three-dimensional manifold $S^2 \times S^1$.
Thus the gauge group of low-energy physics is obtainable from the isometries of the
seven-dimensional manifold $CP(2) \times S^2 \times S^1$. This is not an Einstein manifold, however,
and as such, it is not relevant as the compactification of eleven-dimensional supergravity.
Fortunately, coset spaces of type $SU(3) \times SU(2) \times U(1) / SU(2) \times U(1) \times U(1)$ other
than $CP(2) \times S^2 \times S^1$ exist ($S^5 \times S^2$, for instance) and some of these are Einstein
manifolds. Alas, because there is no supersymmetric Yang-Mills theory in eleven dimensions, the
fermion spectrum is necessarily nonchiral.
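The dimension counting behind this argument can be checked with a few lines of code. The following is a minimal sketch, assuming only the standard bound that an $n$-dimensional manifold has at most $n(n+1)/2$ independent Killing vectors; the function and variable names are illustrative, and the sketch treats each gauge factor separately (as in the product $CP(2) \times S^2 \times S^1$) rather than reproducing the general argument.

```python
# Illustrative check of the isometry counting; all names here are hypothetical.

def max_isometries(n: int) -> int:
    """Upper bound n(n+1)/2 on the isometry-group dimension of an n-manifold."""
    return n * (n + 1) // 2


def min_dimension_for(group_dim: int) -> int:
    """Smallest n for which the bound n(n+1)/2 allows a group of this dimension."""
    n = 1
    while max_isometries(n) < group_dim:
        n += 1
    return n


# Dimensions (number of parameters) of the gauge-group factors.
GROUP_DIMS = {"SU(3)": 8, "SU(2)": 3, "U(1)": 1}

# Lower bound on the dimension of the factor manifold carrying each symmetry.
lower_bounds = {name: min_dimension_for(d) for name, d in GROUP_DIMS.items()}
hidden = sum(lower_bounds.values())

print(lower_bounds)          # per-factor lower bounds on the hidden dimensions
print(hidden, 4 + hidden)    # hidden dimensions and total spacetime dimension
```

The bounds are saturated by $CP(2)$, $S^2$, and $S^1$ respectively, which is why the product manifold quoted in the text has exactly seven dimensions.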
One of the most remarkable discoveries of the last decades is that bosons and
fermions can be placed in the same multiplet of a supergroup whose infinitesimal
parameters contain anticommuting elements [59]. Such a theory predicts a boson-fermion
mass degeneracy that is not observed in nature, and thus the supersymmetry must be
broken. The Goldstone fermions associated with spontaneous breaking have the wrong
properties to be neutrinos, and hence the symmetry needs to be implemented as a local
gauge invariance with the Higgs-Kibble mechanism in action. On the other hand, the
construction of a successful quantum theory of gravity seems to depend largely on our
capacity to answer the following question: is there some profound breakdown
of the spacetime continuum and of quantum concepts at the Planck length ($\sim 10^{-33}$ cm)?
Some of the current geometrical and physical works pursue this idea. In particular, the
superstring theories suggest that the usual structure of spacetime (to which
we have been accustomed since general relativity) must be abandoned at the Planck
length.
The main issue can be put in the following terms. How much of the mathematical and
conceptual structure of classical general relativity do we expect to retain? In particular,
one thinks of the underlying smooth $C^\infty$ manifold, the metric tensor, the local field
equations $G_{\mu\nu} = 8\pi G\,T_{\mu\nu}$, the global topological properties of the manifold, and global
metric features such as lightcone structure and the existence of event horizons. It is
very difficult to judge how many of these classical concepts should be present, in some
form or another, in a quantized theory. On the other hand, how much of the technical
and conceptual structure of conventional quantum field theory do we expect to retain?
For example, does the usual idea of a local quantum field $\phi(x)$ make any sense at all,
or should we decide from the outset that, at the Planck scale, spacetime structure is not
that of a smooth manifold and therefore the local properties of fields become very
unconventional? Similarly, what remains of the local commutativity of quantum fields
in a theory where the lightcone is determined by the metric tensor, which is itself a
dynamical variable? To sum up all that precedes, two questions then naturally arise.
(i) Which kind of mathematical framework could underlie quantum field theory?
If it is a topological space, then the latter is not a smooth manifold.
(ii) If something peculiar happens to spacetime topology at Planck distances, the
possibility arises that spaces should be considered not to be in any sense
differentiable manifolds.
Attempts to solve the above problems have given rise to a fertile program of
geometrization which in turn has increasingly influenced theoretical physics, especially
quantum field theory, gauge theory, and string theory, as well as branches of mathematics,
notably differential and algebraic geometry and topology. This idea of geometrization,
which is already present in Riemann's work on abstract manifolds (endowed with
metrics) and Riemann surfaces, and in Poincaré's work on the topology of differentiable
manifolds and the geometrical theory of differential equations, has in our time
taken on new directions and greater importance. In particular, in fundamental work
that appeared in 1918–1930, E. Cartan and H. Weyl introduced concepts and theories
which subsequently gave rise to many recent and important contributions to the
comprehension of the relation between geometry and theoretical physics. These ideas
included projective, affine, and conformal connections and fibre spaces. The global
theory of connections over a differentiable fibre space was created by Ch. Ehresmann
around 1940–1950. (On this subject, see [8].)
In this perspective, one may recall that in 1923 Cartan proposed to modify the
Einstein theory of gravitation by allowing spacetime to have torsion and relating it to the
density of intrinsic angular momentum of a continuous medium (see [26, 53]). The
idea of connecting torsion to spin saw new developments around 1960, mainly
thanks to the work of D. W. Sciama and T. W. B. Kibble. There was considerable interest
in this problem from 1966 to 1976. All available evidence from experiments in
macrophysics attests to the validity of Einstein's general theory of relativity as a description
of the gravitational interaction. The need to propose alternative or more general gravitational
theories stems from a dichotomy in theoretical physics. Strong, electromagnetic, and weak
interactions find their successful description within the framework of relativistic quantum
field theory in flat Minkowski spacetime. These quantum fields reside in spacetime
but are separate from it. Gravitation, according to Einstein, deforms Minkowski space
and inheres in the dynamic Riemannian geometry of spacetime. One branch of
fundamental physics is highly successful in a flat and rigid spacetime, but gravitation
requires a nonflat and dynamic spacetime. This state of affairs seems, at least from a
theoretical point of view, to be unsatisfactory. Stated differently, there is no compelling
logical or experimental need to modify Einstein's theory, but one can advance good
heuristic arguments in favor of the Cartan idea.
(i) The geometrical independence of the metric $g$ and the linear connection $\Gamma$ leads to the
idea of treating these quantities as independent variables in the sense of a principle of
least action. If $g$ and $\Gamma$ are assumed to be compatible, then the freedom in the choice
of $\Gamma$ reduces to that of the torsion tensor $Q$.
(ii) According to relativistic quantum theory, the Poincaré group (or inhomogeneous
Lorentz group) is physically more significant than the Lorentz group itself. The
Poincaré group has two fundamental invariants: mass and spin. The first of them is
related to translations and to energy momentum. In Einstein's theory, the density of
energy momentum is the source of curvature, whereas spin has no such direct dynamical
significance. In a sense, Einstein-Cartan theory restores, to some extent, the symmetry
between mass and spin. It also introduces an unexpected duality: via Noether's
theorem, energy momentum is generated by translations, whereas Einstein's equation
relates it to curvature, which is responsible for rotations of vectors undergoing parallel
transport. Conversely, spin is generated by rotations, but torsion induces translations
in the tangent space to a manifold (Cartan displacement). This duality can be traced
to the fact that the Einstein-Cartan Lagrangian is linear in curvature.
(iii) There is an interesting analogy between the description of magnetic moments
in electrodynamics and spin in the theory of gravitation. In a phenomenological
description of electromagnetism, the external magnetic field produced by a ferromagnet
may be obtained in at least three ways: by considering a surface current equivalent to
the actual distribution of microscopic currents and magnetic moments, by replacing
the latter by a volume distribution of Ampère currents, or, finally, by introducing a
smooth field of the magnetization vector. In the Einstein theory, there are analogues
for the first two descriptions, whereas the Einstein-Cartan theory provides the third.
The Einstein-Cartan theory assumes, as a model of spacetime, a four-dimensional
manifold with a linear connection $\Gamma$ compatible with a metric tensor $g$. The gravitational
part of the Lagrangian, $\sqrt{-g}\,R$, is formed from the curvature tensor of $\Gamma$. The left-hand
sides of the field equations are obtained by varying this Lagrangian with respect to $g$ and
$Q$. Variation with respect to $g$ may be replaced by that relative to the field of frames.
The sources of the gravitational field are described by expressions resulting either from
phenomenology or by varying an action integral obtained by applying the principle of
minimal gravitational coupling to a special-relativistic Lagrangian. The Einstein-Cartan
equations are (in units with $c = 1$)
    $R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R = 8\pi G\, t_{\mu\nu}$,    (2.1)
    $Q^{\rho}{}_{\mu\nu} + \delta^{\rho}_{\mu} Q^{\sigma}{}_{\nu\sigma} - \delta^{\rho}_{\nu} Q^{\sigma}{}_{\mu\sigma} = 8\pi G\, s^{\rho}{}_{\mu\nu}$,    (2.2)
where $t_{\mu\nu}$ is the energy-momentum tensor and $s^{\rho}{}_{\mu\nu}$ the spin density.
The Cartan equation (2.2) is trivial in the sense that if the spin density vanishes,
$s^{\rho}{}_{\mu\nu} = 0$, then so does torsion, $Q^{\rho}{}_{\mu\nu} = 0$. Quite independently of this, torsion is
topologically trivial: any linear connection can be deformed into a connection without torsion.
The Einstein-Cartan theory may be physically relevant only when the density of energy is
of the same order of magnitude as the spin density squared. For matter consisting of
particles of mass $m$ and spin $\hbar/2$, this will occur at densities of order $m^2 c^4 / (G \hbar^2)$.
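To give a feeling for the scale involved, the short script below evaluates this threshold density numerically. It is only an order-of-magnitude sketch: it assumes the estimate $\rho \sim m^2 c^4 / (G \hbar^2)$ with all numerical factors of order one dropped, uses rounded values of the physical constants, and the names `critical_density` and `to_g_per_cm3` are illustrative.

```python
# Order-of-magnitude estimate of the density at which torsion could matter.
# Assumption: rho ~ m^2 c^4 / (G hbar^2), numerical factors of order one dropped.

G = 6.674e-11          # Newton's constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8            # speed of light, m/s (rounded)
HBAR = 1.0546e-34      # reduced Planck constant, J s (rounded)

M_NEUTRON = 1.675e-27  # kg
M_ELECTRON = 9.109e-31 # kg
NUCLEAR_DENSITY = 2.3e14  # g/cm^3, typical nuclear matter, for comparison


def critical_density(mass_kg: float) -> float:
    """Threshold mass density in kg/m^3 for particles of the given mass."""
    return mass_kg**2 * C**4 / (G * HBAR**2)


def to_g_per_cm3(rho_kg_m3: float) -> float:
    """Convert kg/m^3 to g/cm^3 (1 kg/m^3 = 1e-3 g/cm^3)."""
    return rho_kg_m3 * 1e-3


for name, mass in (("neutron", M_NEUTRON), ("electron", M_ELECTRON)):
    rho = to_g_per_cm3(critical_density(mass))
    print(f"{name}: {rho:.1e} g/cm^3, {rho / NUCLEAR_DENSITY:.1e} x nuclear density")
```

Under these assumptions the threshold lies dozens of orders of magnitude above nuclear density, which is why no macroscopic experiment distinguishes the Einstein-Cartan theory from general relativity.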
3. Introduction to Kaluza-Klein theories. In another direction, the 1920s saw
very interesting attempts, first by Theodor Kaluza and then by Oskar Klein, to unify
the relativistic theory of gravitation with Maxwell's theory by introducing a new
geometrical framework within which electromagnetism could be coupled with gravity (at
least theoretically). The Kaluza-Klein theories are purely geometrical in character and
were worked out in order to encompass two apparently inconsistent physical theories
in a unitary theoretical explanation. Actually, even before Einstein's general
relativity, the physicist Gunnar Nordström in 1914 proceeded to unify his theory of
gravitation (in which gravity was described by a scalar field coupled with the trace of
the energy-momentum tensor) with Maxwell's theory in a most imaginative way.
Inspired by Minkowski's four-dimensional spacetime continuum, Nordström added yet
another space dimension, thus obtaining a flat five-dimensional world. There he
introduced an Abelian five-vector gauge field for which he wrote down the Maxwell equations
including a conserved five-current. He then identified the fifth component of the
five-vector potential with scalar gravity, whereas he identified the first four components of
the five-vector potential with the Maxwell four-potential. With these interpretations he
then noticed that in the cylindrical case (when all dynamical variables become
independent of the fifth coordinate) the equations of his five-dimensional Maxwell theory
reduced to those of the four-dimensional Maxwell-Nordström electromagnetic-gravitational
theory. It is then fair to say that higher-dimensional unification starts with
Nordström, who assumed scalar gravity in our four-dimensional world to be a remnant
of an Abelian gauge theory in a five-dimensional flat spacetime.
The next step was taken by the mathematician Theodor Kaluza in 1919 in the wake of
Einstein's general relativity. Kaluza proposed that one pass to an Einstein-type theory of
gravity in five dimensions, from which ordinary four-dimensional Einstein gravity and
Maxwell electromagnetism are to be obtained upon imposing a cylindrical constraint.
More precisely, what this amounts to is starting with a five-dimensional manifold $M^5$
which is the product $M^4 \times S^1$ of a four-dimensional spacetime $M^4$ with a circle $S^1$.
The metric $\gamma_{mn}(x, y)$ on the five-manifold $M^5$ ($m, n = 0, 1, 2, 3, 5$) is a function of both
the coordinates $x^\mu$ ($\mu = 0, 1, 2, 3$) on $M^4$ and $y \equiv x^5$, the coordinate of the circle $S^1$.
It is convenient to replace the fifteen field variables $\gamma_{mn}(x, y)$ by fifteen new field
variables $g_{\mu\nu}(x, y)$, $A_\mu(x, y)$, $\phi(x, y)$, according to the field redefinitions
    $\gamma_{\mu\nu} = \phi^{-1/3}\left(g_{\mu\nu} + \kappa^2 \phi A_\mu A_\nu\right)$,  $\gamma_{\mu 5} = \kappa\,\phi^{2/3} A_\mu$,  $\gamma_{55} = \phi^{2/3}$,    (3.1)
where $\kappa^2 = 16\pi G$ fixes the normalization of $A_\mu$ in terms of the four-dimensional
gravitational constant $G$.
All field quantities, old and new, are periodic functions of the coordinate $y$ on the
circle. If $y = \rho\theta$, where $\theta$ is the usual angular coordinate and $\rho$ the radius of the circle,
then the period is $2\pi\rho$. Thus any field quantity $F(x, y)$ ($F$ being any of the $g_{\mu\nu}$'s, $A_\mu$'s, or
$\phi$) admits a Fourier expansion
    $F(x, y) = \sum_{n=-\infty}^{+\infty} F^{(n)}(x)\, e^{iny/\rho}$.    (3.2)
Kaluza assumed the five-dimensional dynamics to be governed by a gravitational
Einstein-Hilbert action
    $I_5 = -\frac{1}{16\pi G_5} \int |\gamma_5|^{1/2} R_5 \, d^5x$,    (3.3)
with $\gamma_5 = \det(\gamma_{mn})$, $R_5$ the five-dimensional curvature scalar, and $G_5$ a five-dimensional
counterpart of the gravitational constant. Using the Fourier expansions, the
$y$-dependence becomes explicit so that the $y$-integration can be carried out. A
four-dimensional action involving an infinity of fields, the Fourier components $A^{(n)}_\mu$, $g^{(n)}_{\mu\nu}$,
$\phi^{(n)}$, then emerges. At this point Kaluza imposed a cylindricity condition: he
truncated the action by dropping all harmonics with $n \neq 0$, retaining only the zero modes:
    $g_{\mu\nu}(x, y) = g_{\mu\nu}(x)$,  $A_\mu(x, y) = A_\mu(x)$,  $\phi(x, y) = \phi(x)$.    (3.4)
The five-dimensional line element then takes the form
    $ds^2 = \phi^{-1/3}\left[ ds_4^2 + \phi \left( dy + \kappa A_\mu dx^\mu \right)^2 \right]$,    (3.5)
where
    $ds_4^2 = g_{\mu\nu}(x)\, dx^\mu dx^\nu$    (3.6)
is the four-dimensional line element corresponding to the metric $g_{\mu\nu}(x)$. The line
element (3.5) is invariant under the transformations
    $x^\mu \to x^\mu$,  $y \to y + \epsilon(x^\mu)$,  $A_\mu \to A_\mu - \kappa^{-1} \partial_\mu \epsilon$,    (3.7)
which we recognize as Abelian gauge transformations à la Weyl (see Section 5). Here
these transformations assume a geometrical meaning as shifts in the fifth coordinate
by an amount $\epsilon(x^\mu)$, which depends on ordinary four-spacetime. The Abelian gauge
symmetry in four dimensions originates in the isometries of the small circle in the fifth
dimension.
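When the $y$-integration is carried out with the cylindric truncation enforced, the
action (3.3), invariant under five-dimensional general coordinate transformations, reduces
to a four-dimensional action invariant under both four-dimensional general coordinate
transformations and Abelian gauge transformations. This four-dimensional action is,
up to a surface term,
    $I_4 = -\int d^4x \, |g|^{1/2} \left[ \frac{R_4}{16\pi G} + \frac{1}{4} \phi F_{\mu\nu} F^{\mu\nu} + \frac{1}{96\pi G} \frac{\partial_\mu \phi \, \partial^\mu \phi}{\phi^2} \right]$,    (3.8)
where
    $G = \frac{G_5}{2\pi\rho}$,  $g = \det(g_{\mu\nu})$,  $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$,    (3.9)
and $R_4$ is the scalar curvature calculated from the four-metric $g_{\mu\nu}$ (our metric convention
calls for a minus (plus) sign for a time (spacelike) dimension).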
This action involves a graviton ($g_{\mu\nu}$), an Abelian gauge boson ($A_\mu$), and a scalar
$\phi$. Kaluza arbitrarily set $\phi$ = constant, in which case $I_4$ turns into the
four-dimensional Einstein-Maxwell action. To be sure, one has to have $\phi > 0$ in order
to have the proper relative sign of the Einstein and Maxwell terms, so that energy is
positive. This in turn means that the fifth dimension must be spacelike; in fact, the
extra dimensions must all be spacelike. In addition to the invariances under general
coordinate transformations and gauge transformations, the action (3.8) also exhibits
an invariance under global scale transformations:
    $A_\mu \to \lambda A_\mu$,  $\phi \to \lambda^{-2} \phi$.    (3.10)
The field equations of the original five-dimensional theory have a solution in which the
five-dimensional spacetime is the direct product of a circle with flat four-dimensional
Minkowski spacetime. Then
    $g_{\mu\nu} = \eta_{\mu\nu}$,  $A_\mu = 0$,  $\phi = 1$    (3.11)
($\eta_{\mu\nu}$ is the four-dimensional Minkowski metric $\mathrm{diag}(-1, +1, +1, +1)$). This solution serves as
a natural vacuum, and it spontaneously breaks the scale invariance (3.10). The massless
$\phi$-field is the Nambu-Goldstone boson associated with this spontaneous symmetry
breaking. So the zero-mode spectrum includes spin 2 and spin 1 gauge fields and a spin
0 Nambu-Goldstone boson. In the full quantum theory the spin 0 boson is expected to
acquire a mass. Of course, the full classical theory contains not only the zero modes,
but also the $n \neq 0$ harmonics (equation (3.2)). The action (3.3) determines their spins,
masses, and couplings. They all have spin less than or equal to 2, and they are all
massive. The $n$th harmonics have mass
    $m_n = \frac{|n|}{\rho}$,    (3.12)
where $\rho$, as before, is the radius of the small circle in the fifth dimension. The couplings
of these harmonics with the gauge field $A_\mu$ are determined from the action (3.3), and
these harmonics do carry electric charge
    $e_n = n \, \frac{\sqrt{16\pi G}}{\rho}$.    (3.13)
Remarkably, electric charge is quantized because the fifth dimension is compact. We
see that the elementary charge is
    $e = \frac{4\sqrt{\pi G}}{\rho}$    (3.14)
and the corresponding fine-structure constant is
    $\alpha = \frac{e^2}{4\pi} = \frac{4G}{\rho^2}$.    (3.15)
If $\alpha$ is to correspond to the U(1) subgroup of a grand-unification group, then $\alpha \approx 1/100$,
so that the circumference of the small circle is $l = 2\pi\rho \approx 100\sqrt{G} \approx 10^{-31}$ cm. The
circle must be very small indeed; a size of about 100 Planck lengths could hardly have
been detected as yet. Nevertheless, this is large enough to call into question
grand unification in four dimensions: the scales at which the grand-unification group is to
reveal itself unbroken are close to the scales at which the extra dimensions would
become manifest. To make all this applicable in a world with strong and electroweak
interactions, one of course has to introduce more than one extra dimension.
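The numbers quoted in this estimate can be reproduced with a short calculation. The sketch below works in natural units ($\hbar = c = 1$, so that $\sqrt{G}$ is the Planck length and $1/\sqrt{G}$ the Planck mass) and assumes the relations $\alpha = 4G/\rho^2$ and $m_n = |n|/\rho$; the function names and the rounded value of the Planck length are illustrative choices.

```python
import math

# Natural units: hbar = c = 1, sqrt(G) = Planck length, 1/sqrt(G) = Planck mass.
# Assumed relations: alpha = 4 G / rho^2 and m_n = |n| / rho.

PLANCK_LENGTH_CM = 1.616e-33  # rounded value


def radius_in_planck_lengths(alpha: float) -> float:
    """Radius rho of the fifth dimension, in Planck lengths, from alpha = 4G/rho^2."""
    return math.sqrt(4.0 / alpha)


def circumference_in_planck_lengths(alpha: float) -> float:
    """Circumference l = 2*pi*rho of the small circle, in Planck lengths."""
    return 2.0 * math.pi * radius_in_planck_lengths(alpha)


def harmonic_mass_in_planck_masses(n: int, alpha: float) -> float:
    """Mass |n|/rho of the n-th Fourier harmonic, in Planck masses."""
    return abs(n) / radius_in_planck_lengths(alpha)


alpha = 1.0 / 100.0
rho = radius_in_planck_lengths(alpha)
circ = circumference_in_planck_lengths(alpha)
print(f"rho = {rho:.1f} Planck lengths")
print(f"l   = {circ:.1f} Planck lengths = {circ * PLANCK_LENGTH_CM:.1e} cm")
print(f"m_1 = {harmonic_mass_in_planck_masses(1, alpha):.3f} Planck masses")
```

With $\alpha \approx 1/100$ the circumference comes out at roughly a hundred Planck lengths, and the lightest charged harmonic has a mass of a few percent of the Planck mass, far beyond the reach of any accelerator.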
Kaluza's work remained largely unknown until Oskar Klein, in 1926, rediscovered
Kaluza's theory. (Einstein delayed the publication of Kaluza's paper for two years.) Klein
noted the quantization of the electric charge and hoped Kaluza's theory would underlie
quantum mechanics (see Section 9). The relativistic generalization of Schrödinger's
equation was carried out independently by many authors: Schrödinger, Klein, Gordon,
Fock, and others. This equation, now commonly known as the Klein-Gordon equation,
was arrived at by both Klein and Fock starting from Kaluza's theory: a zero-mass wave
equation in five dimensions yields four-dimensional Klein-Gordon equations for the
individual harmonics. It must be noted that this early work was viewed as a mathematical
trick devoid of any physical significance. Nevertheless, this mathematical idea proved
very fruitful for the further development of the theory, especially in supergravity and
string theories. Oskar Klein came closest to the modern point of view: he discussed
the higher harmonics and the size of the small circle. Later Einstein and Bergmann also
adopted such a point of view. A purely mathematical approach (a projective interpretation
of the fifth coordinate) was developed by Veblen, Pauli, Jordan, and others. Jordan
appears to have been the first to realize the importance of including the scalar field
$\phi$ in the new five-dimensional theory.
Remarkably, the most recent work on superstrings incorporates both the ideas of
Nordström and the subsequent ideas of Kaluza and Klein (see Section 11). However,
there was no real reason to extend the Kaluza-Klein idea beyond five dimensions
until the emergence of the non-Abelian gauge field theories invented by Yang and Mills in
1954 (see Section 5). In 1963, DeWitt suggested that a unification of Yang-Mills
theories and gravitation could be achieved in a higher-dimensional Kaluza-Klein
framework. Trautman was independently aware of this possibility, as were others. A detailed
discussion of the Kaluza-Klein unification of gravity and Yang-Mills theories,
including the correct form of the $(4+N)$-dimensional metric, first appeared in the work of
Kerner. The first complete derivation of the four-dimensional gravitational plus
Yang-Mills plus scalar theory from a $(4+N)$-dimensional Einstein-Hilbert action was finally
given by Cho and Freund in 1975. The weakness of this higher-dimensional work was
the absence of any good reason as to why any dimension would compactify, let alone
the right number, so as to leave the ordinary four-dimensional large world. While the
five-dimensional theory at least admitted the compactified fifth dimension along with
Minkowski space as a solution to the five-dimensional equations of motion, even this
was not true of the higher-dimensional theories. The essential reason for this is that the
higher-dimensional manifolds that give rise to Yang-Mills theories have curvature. If a
$(4+N)$-dimensional Einstein theory is to compactify into the direct product of
four-dimensional spacetime $M^4$ and a compact internal space with isometries, the metric
$\gamma_{mn}(x, y)$ can be written as follows in the zero-mode approximation:
    $\gamma_{mn}(x, y) = \begin{pmatrix} g_{\mu\nu}(x) + \gamma^{ij}(y)\, \xi^{\alpha}_{i}(y)\, \xi^{\beta}_{j}(y)\, A^{\alpha}_{\mu}(x) A^{\beta}_{\nu}(x) & \xi^{\alpha}_{j}(y)\, A^{\alpha}_{\mu}(x) \\ \xi^{\beta}_{i}(y)\, A^{\beta}_{\nu}(x) & \gamma_{ij}(y) \end{pmatrix}$.    (3.16)
The metric $\gamma_{ij}(y)$ is that of the corresponding $N$-dimensional symmetric space, and
the Killing vectors $\xi^{\alpha}_{i}(y)$ have upper indices running over the dimension of the
symmetry group. If four-space is to be flat (and, actually, it cannot be flat!), the Ricci tensor
$R_{\mu\nu} = 0$ for the spacetime indices, and therefore $R_{4+N} = 0$. But then $R_{ij}$ must
vanish for the internal indices as well, and this cannot be the case if the internal space is
curved.
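The chain of implications in this argument can be written out explicitly. The following is a short sketch, assuming the pure $(4+N)$-dimensional vacuum Einstein equations (no cosmological term and no matter fields) and the vacuum configuration with vanishing gauge fields, so that the spacetime block of the full metric $\gamma_{mn}$ is just $g_{\mu\nu}$; here $R$ denotes the scalar curvature of the full $(4+N)$-dimensional metric.

```latex
\begin{align}
  R_{mn} - \tfrac{1}{2}\,\gamma_{mn}\,R &= 0
    && \text{(vacuum equations in $4+N$ dimensions)} \\
  R_{\mu\nu} = 0 \;\Longrightarrow\; -\tfrac{1}{2}\,g_{\mu\nu}\,R &= 0
    \;\Longrightarrow\; R = 0
    && \text{(spacetime components, flat four-space)} \\
  R_{ij} = \tfrac{1}{2}\,\gamma_{ij}\,R &= 0
    && \text{(internal components)}
\end{align}
```

The internal space would therefore have to be Ricci-flat, which is incompatible with the curved symmetric spaces whose isometries generate non-Abelian gauge groups.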
Cremmer and Scherk began to address this problem by pointing out that inclusion of
additional Yang-Mills and scalar matter fields in the higher-dimensional theory would
allow classical solutions in which spacetime is the direct product of Minkowski space
and a compact internal space of constant curvature. This spontaneous compactification
was achieved, however, by going beyond the pure Kaluza-Klein framework and
including extra fields in just such a way as to induce the desired compactification. The
program of seeking solutions to the combined Einstein-Yang-Mills equations in $4+N$
dimensions was generalized to a larger class of internal spaces by Luciani, Salam, Duff,
and others. All this work on classical, higher-dimensional Kaluza-Klein theories
provided a springboard for the study of both Kaluza-Klein supergravity and the quantum
dynamics of Kaluza-Klein theories.
Roughly, supergravity is an attempt to unify matter and force as different components
of the same agency. It is a supersymmetric theory in which, because
the numbers of Bose and Fermi degrees of freedom have to be equal,
Bose fields beyond gravity appear in eleven dimensions. In fact,
supersymmetry dictates that the missing Bose degrees of freedom be supplied in the
form of a massless antisymmetric tensor field with three indices, $A_{mnp}$, which indeed
has $\binom{11-2}{3} = 84 = 128 - 44$ degrees of freedom (128 for the gravitino minus 44 for the
graviton). Moreover, in eleven dimensions
there exist no matter and no Yang-Mills supermultiplets, so that besides gravity one
has only its supersymmetric partners, the $A_{mnp}$ and gravitino fields, as matter. The source
of gravity is thus fixed by supersymmetry. Furthermore, it is supersymmetry that
determines the dimension of spacetime in eleven-dimensional supergravity. Force and
matter uniquely determine each other; they are but different components of the same
supermultiplet. In ten dimensions a similar argument can be made, but there we
encounter Yang-Mills supermultiplets whose gauge group is fixed, though not uniquely,
by the requirement of anomaly cancellation. For superstring theories similar
considerations apply. To find the possible vacuum of the eleven-dimensional theory, we look for
a solution of the classical equations in which the eleven-dimensional world manifold
is of the form $M^{11} = M^{d} \times M^{11-d}$, where $M^{d}$ is the spacetime and $M^{11-d}$ the small
compact manifold. In the vacuum we require the spacetime $M^{d}$ to be maximally
symmetric. This then fixes the metric of $M^{11}$ to be that of a (possibly warped) product, in
which a function $h(y)$ of the internal coordinates multiplies the maximally symmetric metric
of $M^{d}$. The antisymmetric tensor potential has its own gauge invariance under
    $A_{mnp}(x, y) \to A_{mnp}(x, y) + \partial_m \Lambda_{np}(x, y) + \partial_n \Lambda_{pm}(x, y) + \partial_p \Lambda_{mn}(x, y)$    (3.17)
with $\Lambda_{mn}$ antisymmetric. The corresponding gauge-invariant quantities are the field strengths
$F_{mnpq}$ given by the curl of $A_{mnp}$:
    $F_{mnpq} = 4\, \partial_{[m} A_{npq]}$.    (3.18)
If $F$ or its dual $F^{*}$ is to have a nonvanishing vacuum expectation value on $d$-dimensional
spacetime without destroying the maximal symmetry, then either $d$ or $11-d$ must equal
the number of indices of $F$, that is, $d = 4$ or $d = 7$. In the case $d = 4$, once one has fixed
the maximally symmetric form of $F$, the simplest solutions are obtained by setting
$h(y) = 1$ and $F_{ijkl} = 0$. Then the field equations and Bianchi identities of
eleven-dimensional supergravity require $F(x, y) = F$ = constant, and the energy-momentum
tensor of the $A$-field is equivalent to two cosmological terms, one on $M^4$ and one on
$M^7$, with cosmological constants of opposite signs. Provided $M^4$ contains the time
dimension, the cosmological constant on $M^7$ will have the sign appropriate to a compact
manifold, so that spontaneous compactification really does occur. $M^4$ is the maximally
symmetric noncompact anti-de Sitter space and $M^7$ a compact Einstein space. The scale
is set by the expectation value $F$ of the field strengths. The rest of four-dimensional
physics is determined by the shape of the small seven-dimensional manifold $M^7$. If,
for instance, $M^7$ is the seven-sphere $S^7$, then the gauge group in four dimensions is
$SO(8)$, the isometry group of $S^7$, and one finds eight supersymmetries. Just as gauge
symmetries in four dimensions are related to the Killing vectors of the small manifold,
so the supersymmetries are related to the Killing spinors. All the solutions with one
or more surviving supersymmetries are stable with respect to classical perturbations.
Some of the solutions without any surviving supersymmetry are stable, and others
are unstable. Several problems stand in the way of producing a realistic theory. First
of all, the four-dimensional anti-de Sitter space has far too large a cosmological constant,
which has to be eliminated somehow. Of course, Higgs mechanisms in four dimensions
further affect the cosmological constant, and it is the end product that has to be very
small or zero. Another serious problem is the lack of chiral fermions, at least as long
as bound states and solitons are ignored. It seems that these problems can be solved
in the context of higher-dimensional superstring theory, which recently demonstrated
the possibility of a finite quantum theory of gravity, whereas the eleven-dimensional
supergravity theory, while trivially finite at one loop, is questionable in this regard.
4. The role of topological concepts in physics. We return to the geometrization of
mathematics and theoretical physics. It must be stressed that the geometrization movement
appears today to be more influenced by twentieth-century concepts and methods than by
those of ordinary geometry, or even by the new non-Euclidean geometries developed in the
19th century. A new family of geometrical and topological invariants (Betti numbers,
Euler-Poincaré characteristic, Whitney-Stiefel characteristic classes, Pontrjagin and
Chern characteristic forms) is at the heart of twentieth-century mathematical progress,
as well as at the foundation of recent physical theories, especially non-Abelian gauge
theories. The introduction of these concepts and the development of a host of new notions
and techniques in geometry and in algebraic and differential topology in the 1940s
(homology, cohomology, and homotopy (Whitney, Lefschetz, Hopf); fibre spaces and
characteristic classes (Ehresmann, Pontrjagin, Steenrod, Thom, Chern, Milnor); categories
(Eilenberg, MacLane)) mark the passage from the local to the global study of
mathematical objects.
One of the great mathematical advances of the twentieth century was the introduction of
characteristic classes by Whitney and Stiefel in 1935, and of characteristic forms by
Pontrjagin (over a real fibre space) and by Simons and Chern (complex). Intuitively, these
objects are geometrical and topological invariants that can be classified in different
families even though they are all mutually related. Thus we say that we can topologically
transform a surface (or a manifold) into another if they have the same characteristic
invariants (and if the dimension of the space is compatible with the type of
transformation). Two manifolds satisfying these conditions are said to be topologically
equivalent. These invariants and their corresponding algebraic structures can be
technically very complicated. The two invariants mentioned above globally characterize
the two mathematical objects of fibre spaces and connections, which in turn involve
several basic algebraic and geometric notions such as homology (homology group, intrinsic
homology, singular homology, algebraic homology, functors, etc.), cohomology classes,
homotopy, and so forth. Two examples of homology and cohomology that are very important
in contemporary physics are bordism and cobordism as developed by R. Thom, and ordinary
homology $H_*(X)$. In fact, several notions of classical field theory can be expressed by
cohomology. The more recent quantum field theories, reinterpreted in the mathematical
framework of gauge theory, show a remarkable presence of cohomological ideas, seen in
some cases as a generalization of characteristic classes such as those of Euler-Poincaré.
A very interesting example in our time is that of a non-Abelian cohomology space of
Riemannian surfaces with boundary.
We first give some basic notions and definitions on principal bundles, connection,
curvature, and characteristic classes (we follow closely Steenrod [48] and Husemoller
[27]). Among other things, a principal bundle has a structure group $G$, which is a Lie
group, and a base $B$, which is a topological space. Notice that on the product $B \times G$
there is a natural right action of $G$ by right multiplication on the second factor. This
is a free action, and the quotient is identified with $B$.
Definition 4.1. A (right) principal $G$-bundle consists of a triple $(P, B, \pi)$, where
$\pi : P \to B$ is a map, together with a continuous, free right action $P \times G \to P$ with
respect to which $\pi$ is invariant and such that $\pi$ induces a homeomorphism between the
quotient space of this action and $B$. Furthermore, there is an open covering $\{U_\alpha\}$ of
$B$ over which all the above data are isomorphic to the product data. That is to say, for
each $\alpha$, there exists a commutative diagram
$$\varphi_\alpha : \pi^{-1}(U_\alpha) \to U_\alpha \times G, \qquad p_1 \circ \varphi_\alpha = \pi, \tag{4.1}$$
where $\varphi_\alpha$ is a homeomorphism which is equivariant with respect to the right $G$-action
and $p_1 : U_\alpha \times G \to U_\alpha$ is the projection onto the first factor.
The space $P$ is called the total space of the principal bundle, $\pi$ is called the
projection, and $B$ is called the base. The maps $\varphi_\alpha$ are called local trivializations.
Lastly, $G$ is called the structure group of the bundle. If $P$ and $B$ are smooth manifolds,
if the action of $G$ on $P$ is a smooth action, and if $\pi$ is a smooth submersion, then the
principal bundle is said to be a smooth principal bundle. In this case it follows
automatically that one can choose the local trivializations so that the $\varphi_\alpha$ are
diffeomorphisms. An isomorphism of $G$-bundles with the same base is a homeomorphism
between their total spaces, which is $G$-equivariant and which commutes with the
projections to the base. A map between $G$-bundles over possibly different bases is a
$G$-equivariant map between the total spaces. Such a map must be an isomorphism on each
fibre and it induces a map between the base spaces. In order to consider a general example
of a principal bundle, let $M$ be a smooth manifold of dimension $n$. Let $E$ be the frame
space for the tangent bundle of $M$. A point of $E$ consists of a point $p \in M$ and a basis
$\{v_1, \dots, v_n\}$ for the tangent space $TM_p$ to $M$ at $p$. The topology, and indeed the
smooth structure, of $E$ is induced in the obvious way from that of $M$. There is an obvious
projection of $E$ to $M$ and an obvious action of $GL(n, \mathbb{R})$ on $E$. The action of
$A = (a_{ij}) \in GL(n, \mathbb{R})$ on the point $(x, \{v_1, \dots, v_n\})$ gives the point
$(x, \{w_1, \dots, w_n\})$, where
$$w_j = \sum_{i=1}^{n} a_{ij}\, v_i. \tag{4.2}$$
That is to say, the matrix $A$ acts on the basis to produce a new basis for the same space;
the expression for the new basis in terms of the old basis is given by the columns of
the matrix $A$. This defines a right action of $GL(n, \mathbb{R})$ on $E$.
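As a purely illustrative check, the following short Python sketch verifies numerically that the rule $w_j = \sum_i a_{ij} v_i$ composes as a right action, that is, acting first by $A$ and then by $B$ agrees with acting once by the product $AB$ (and not by $BA$). The function names `act` and `matmul` and the sample matrices are our own assumptions, chosen only for the example.

```python
def act(frame, A):
    """Right action of a matrix A on a frame (list of basis vectors):
    the j-th new vector is w_j = sum_i A[i][j] * v_i."""
    n = len(frame)
    dim = len(frame[0])
    return [
        [sum(A[i][j] * frame[i][c] for i in range(n)) for c in range(dim)]
        for j in range(n)
    ]


def matmul(A, B):
    """Ordinary matrix product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical sample data: the standard frame of R^2 and two invertible matrices.
frame = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0], [0.0, 1.0]]
B = [[0.0, -1.0], [1.0, 0.0]]

lhs = act(act(frame, A), B)      # act by A, then by B
rhs = act(frame, matmul(A, B))   # act once by the product AB
print(lhs == rhs)
```

Because the new basis is read off from the columns of the matrix, successive changes of basis multiply on the right, which is what makes the frame space a right $GL(n, \mathbb{R})$-space.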
Let $\pi : P \to B$ be a smooth principal $G$-bundle over an $n$-dimensional manifold. A
connection for this bundle is an infinitesimal version of an equivariant family of cross
sections. It is an $n$-dimensional distribution $H$ (i.e., a smooth family of $n$-dimensional
linear subspaces of the tangent bundle $TP$ of $P$) which is horizontal in the sense that
the restriction of $D\pi$ to each plane in the distribution is an isomorphism onto the
corresponding tangent plane to $B$, and which is invariant under the $G$-action. This
distribution induces, at each point $p \in P$, a splitting of the tangent space $TP_p$ into
the horizontal space $H_p$ and the tangent space to the fibre $\pi^{-1}(\pi(p))$. Suppose that $H$
is a connection for $P \to B$. Let $\gamma : [0, 1] \to B$ be a smooth path and $e \in \pi^{-1}(\gamma(0))$.
Then there is a unique path $\tilde{\gamma} : [0, 1] \to P$ such that $\tilde{\gamma}(0) = e$,
$\pi \circ \tilde{\gamma} = \gamma$, and $\tilde{\gamma}'(t)$ is contained in the horizontal space
$H_{\tilde{\gamma}(t)}$. Given a smooth curve in the base $\gamma : [0, 1] \to B$ from $b_0$ to $b_1$, a
connection determines an isomorphism between the fibres $\pi^{-1}(b_0) \to \pi^{-1}(b_1)$, which is
equivariant with respect to the $G$-actions on these fibres. Therefore a connection gives
a way to connect distinct fibres, although one needs a path in the base between the
image points in the base. Let $\mathfrak{g}$ be the Lie algebra of $G$. Then there is a unique one-form
$\omega_{MC} \in \Omega^1(G; \mathfrak{g})$ which is invariant under left multiplication by $G$ and whose
value at the identity element $e$ of $G$ is the identity linear map from $TG_e$ to $\mathfrak{g}$. This
form is called the Maurer-Cartan form. It is often denoted by $g^{-1}dg$. Its value on a
tangent vector $\tau \in TG_g$ is equal to $g^{-1}\tau \in TG_e = \mathfrak{g}$.
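Lemma 4.2. A connection on a smooth principal bundle $\pi : P \to B$ is equivalent to a
differential one-form $\omega \in \Omega^1(P; \mathfrak{g})$, where $\mathfrak{g}$ is the Lie algebra of $G$, with the
following properties.
(i) Under right multiplication by $G$, the form $\omega$ transforms via the adjoint
representation of $G$ on $\mathfrak{g}$; that is,
$$\omega_{pg}(\tau \cdot g) = g^{-1}\,\omega_p(\tau)\,g \tag{4.3}$$
for any $p \in P$, any $\tau \in TP_p$, and any $g \in G$.
(ii) For any $p \in P$, consider the embedding $R_p : G \to P$ given by $R_p(g) = pg$. Then
the pullback $R_p^{*}(\omega)$ is the Maurer-Cartan form $\omega_{MC}$ of $G$.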
Suppose that $A$ is a connection on a principal bundle $\pi : P \to B$, and suppose that
$W \to B$ is a vector bundle associated to this principal bundle and a linear action of $G$
on a vector space $V$. We can use the connection to differentiate sections of $W$, producing
one-forms with values in $W$. This covariant differentiation is a linear operator
$$\nabla_A : \Omega^0(B; W) \to \Omega^1(B; W). \tag{4.4}$$
The curvature arises as the obstruction to integrating the horizontal distribution of
a connection over two-dimensional submanifolds of the base. Let $P \to B$ be a smooth
principal $G$-bundle and let $\mathrm{ad}P$ be the vector bundle associated to $P$ and the adjoint
action of $G$ on its Lie algebra $\mathfrak{g}$. Suppose that $A$ is a connection on $P$, and
$H_A \subset TP$ is its horizontal distribution. We can integrate $H_A$ along paths in $B$ to give
a lifting of paths from $B$ to $P$. If we try to perform the same construction over
higher-dimensional subspaces of $B$, then it is not always possible to lift: there is an
obstruction, which is the curvature of the connection. We fix a point $b \in B$ and two
linearly independent tangent vectors $\tau_1, \tau_2$ at $b$. Consider a local coordinate system
$(x^1, \dots, x^n)$ centered at the point $b \in B$ with the property that
$\partial/\partial x^i(b) = \tau_i$ for $i = 1, 2$. We consider a rectangle $[0, \sqrt{t}] \times [0, \sqrt{t}]$ in the
$(x^1, x^2)$-plane. We lift the four sides of this rectangle in counterclockwise fashion,
beginning with the side on the $x^1$-axis. We do this so that the initial point lifts to a
point $p \in P$ and so that each side begins where the previous side ends. There is no
guarantee that the end of the last side will be equal to $p$, but it will be of the form
$pg$ for some unique $g = g(t) \in G$. If $t$ is sufficiently close to zero, then $g(t)$ will
be close to the identity in $G$, and hence we can form $\log(g(t)) \in \mathfrak{g}$. We consider the
element
$$K(t) = \frac{\log(g(t))}{t}. \tag{4.5}$$
Lemma 4.3. The element in $\mathfrak{g}$ given by
$$K = \lim_{t \to 0} K(t) \tag{4.6}$$
depends only on $p$, $\tau_1$, $\tau_2$. Furthermore, the point
$$[p, K] \in \mathrm{ad}P \tag{4.7}$$
depends only on $\tau_1, \tau_2$, and is bilinear and skew-symmetric in these variables. It is
given by evaluating a two-form on $B$ with values in $\mathrm{ad}P$, denoted by $F_A$, on
$(\tau_1, \tau_2)$. This two-form $F_A$ is called the curvature of $A$.
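The limiting procedure of Lemma 4.3 can be illustrated numerically in the simplest Abelian case. The sketch below is only an illustration under stated assumptions: it takes $G = U(1)$ over the plane, a hypothetical connection one-form $A = (x + x^3)\,dy$ (so that $dA = (1 + 3x^2)\,dx \wedge dy$ and the curvature at the origin equals 1), a rectangle of side $\sqrt{t}$, and the sign convention $g = \exp(i \oint A)$ for the holonomy. The names `A`, `segment_integral`, `holonomy`, and `K` are ours.

```python
import cmath
import math


def A(x, y):
    """Hypothetical U(1) connection components (A_x, A_y); dA = (1 + 3x^2) dx^dy."""
    return (0.0, x + x ** 3)


def segment_integral(p, q, steps=200):
    """Midpoint-rule approximation of the line integral of A from p to q."""
    (x0, y0), (x1, y1) = p, q
    dx, dy = (x1 - x0) / steps, (y1 - y0) / steps
    total = 0.0
    for k in range(steps):
        xm = x0 + (k + 0.5) * dx
        ym = y0 + (k + 0.5) * dy
        ax, ay = A(xm, ym)
        total += ax * dx + ay * dy
    return total


def holonomy(t):
    """Group element g(t) obtained by going counterclockwise around the
    square [0, sqrt(t)] x [0, sqrt(t)], starting along the x-axis."""
    s = math.sqrt(t)
    corners = [(0.0, 0.0), (s, 0.0), (s, s), (0.0, s), (0.0, 0.0)]
    theta = sum(segment_integral(corners[i], corners[i + 1]) for i in range(4))
    return cmath.exp(1j * theta)


def K(t):
    """Imaginary part of log(g(t)) / t, which should tend to the curvature at 0."""
    return (cmath.log(holonomy(t)) / t).imag


for t in (1e-1, 1e-2, 1e-4):
    print(t, K(t))
```

For this choice of $A$ the loop integral equals $t + t^2$, so $K(t) = 1 + t$ and the printed values approach the curvature at the origin as $t \to 0$.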
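We can use the curvature to define cohomology classes in $B$ which measure the
nontriviality of the bundle. These are called characteristic classes. The first result we
need in order to define characteristic classes from the curvature is the so-called Bianchi
identity.
Lemma 4.4 (Bianchi identity). The curvature $F_A$ of a connection $A$ satisfies $d_A F_A = 0$.
Suppose that
$$\varphi : \underbrace{\mathfrak{g} \otimes \cdots \otimes \mathfrak{g}}_{k \text{ times}} \to \mathbb{R} \tag{4.8}$$
is a linear map which is symmetric and invariant under the simultaneous adjoint action
of $G$ on $\mathfrak{g}$; that is,
$$\varphi(F_1, \dots, F_k) = \varphi(g^{-1}F_1 g, \dots, g^{-1}F_k g) \tag{4.9}$$
for all $g \in G$. Then we can form
$$\varphi(F_A, \dots, F_A) \in \Omega^{2k}(B; \mathbb{R}). \tag{4.10}$$
Lemma 4.5. The form $\varphi(F_A, \dots, F_A)$ is closed. If another connection $A'$ for $P$ is
chosen, then the difference
$$\varphi(F_A, \dots, F_A) - \varphi(F_{A'}, \dots, F_{A'})$$
is exact.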
For the special orthogonal group $SO(n)$, a basis for the invariant polynomials on the
Lie algebra is given by the even coefficients of the characteristic polynomial, together
with the Pfaffian if $n$ is even. Thus, we get one characteristic class in each degree $4i$,
and if $n = 2k$, we also get one characteristic class in degree $2k$. If we normalize
properly, then these classes are, respectively, the $i$th Pontrjagin class and the Euler
class. There is a similar result for complex-valued symmetric, multilinear functions on
the Lie algebra. Applying this to the unitary group, we see that a basis for the
complex-valued invariant polynomials is given by the coefficients of the characteristic
polynomial. Thus, in this case, we have one characteristic class in each degree $2i$.
Correctly normalized, these are the Chern classes.
We further recall some fundamental geometric-differential facts regarding the notions
of bordism and cobordism. For each topological space $X$, the commutative group
$\Omega_*(X)$ can be defined as follows (Thom [51]). Continuous mappings $f : Y \to X$ from
oriented and compact manifolds with boundary $Y$ in $X$ are called chains. The sum and
difference of chains are defined by disjoint union and change of orientation,
respectively. The boundary $\partial f$ of $f$ is its restriction to the boundary $\partial Y$ of $Y$. It is
well known (thanks to a fundamental theorem of algebraic geometry) that the boundary of a
boundary is empty: $\partial \circ \partial = 0$. A cycle is a chain whose source has no boundary. The
equivalence classes of cycles form the group $\Omega_*(X)$. The Thom ring $\Omega_*$ is the bordism
of a point. With the product $Y \times X$, one can see that $\Omega_*(X)$ is a module over $\Omega_*$, so
that $\Omega_*$ acts on $\Omega_*(X)$. To every continuous mapping $X_1 \to X_2$ is associated a linear
transformation of $\Omega_*(X_1)$ into $\Omega_*(X_2)$; we then have a functor. The cobordism
cohomology $\Omega^*(X)$ is obtained by taking the homotopy classes of continuous mappings of
$X$ into a given space, in fact, a nested sequence of topological spaces, the Thom
spectrum. The cohomology (the theory $\Omega^*$) is richer than the homology ($\Omega_*$), dealing
with rings having all sorts of operations. Cobordism is an equivalence relation on the
set of submanifolds, say $N$ and $N'$ of $M$, which means that the cobordism $Z$ transforms
$N$ into $N'$. $M$ designates a compact oriented manifold with $H(M) = 0$. We say then that
two submanifolds $N$, $N'$ are cobordant in $M$ if there is a compact $Z \subset M \times [0, 1]$ so
that $\partial Z = N \times 0 \cup N' \times 1$ (see [35]).
Even these short considerations suffice to highlight some basic characteristics of
cohomology that make it a good basis for building a richer and more sophisticated theory
of the spatial continuum and of spacetime, with enormous theoretical implications for
physics. Some of these characteristics are listed below (Bennequin [6]).
(1) Homology is constructed by quotienting a part of the data (cutting and gluing). It
stabilizes forms.
(2) It shows the close relationship that can exist between figures and numbers,
especially coefficients. We can reconstruct a new ring from combinations of chains
with rational ($\mathbb{Q}$) or complex ($\mathbb{C}$) coefficients. This lets us localize and complete.
(3) The most remarkable property is probably the universality. There are many
cohomologies that all give the same results. More exactly, different definitions lead to
isomorphic (or related, at the very least) theories. This means that axiomatic
constructions are permitted (Atiyah (1968)).
(4) Cohomology realizes forms; in a certain sense, it defines forms. In any case, it
ensures certain stability and genericity. Several notions from classical field theory can
be expressed cohomologically. Furthermore, the more recent quantum field theories,
reinterpreted in the common mathematical framework of gauge theory, highlight the
basic role played by cohomology and characteristic classes (see the work of Atiyah and
Bott [4], Manin [34], Uhlenbeck [54], and Taubes [50]). These concepts are also used in
the attempts to give a consistent mathematical formulation and an intelligible physical
interpretation of other quantum gauge theories such as the quantum electrodynamics
of Dirac, Feynman, and Schwinger.
5. The birth and development of gauge theory. A review of the origin and development
of gauge theory is in order (for more details, see [37, 67]). Two major geometrical
advances of Weyl must be mentioned. In 1918-1919 he outlined what he called a purely
infinitesimal geometry (for the history of this theory, see [44, 49, 57]), which should
know a transfer principle for length measurements between infinitely close points only,
and which should admit a conformal structure. The allusion is of course to the
Levi-Civita parallel displacement principle in a Riemannian manifold embedded in a
sufficiently high-dimensional Euclidean space, locally given by
$$\xi'^{\,i} = \xi^{i} - \Gamma^{i}_{jk}\,\xi^{j}\,dx^{k}, \tag{5.1}$$
with the $dx^{k}$ to be interpreted as the coordinate representation of a displacement vector
between two infinitesimally close points, so that the direction vector $\xi^{i}$ is
transferred to $\xi'^{\,i}$. According to Weyl, one has to separate logically the concept of
parallel displacement from metrics and to introduce what he called an affine connection
on a (differentiable) manifold as a linear torsion-free connection. Thus, Weyl proposes a
generalization of Riemannian geometry which seemed to be the most natural mathematical
framework for the construction of a unified theory of gravitational and electromagnetic
forces. This is a generalized Riemannian metric: a Weylian metric on a differentiable
manifold $M$ is given by
(i) a conformal structure on $M$, that is, a class of (semi-)Riemannian metrics $[g]$,
in local coordinates given by $g_{ij}(x)$ or $\tilde{g}_{ij}(x) = \lambda(x)\,g_{ij}(x)$, with
multiplication by $\lambda(x) > 0$ (real-valued) representing what Weyl considered to be a
gauge transformation of the representative of $[g]$,
(ii) a length connection on $M$, that is, a class of differential forms $\varphi$, in local
coordinates represented by $\varphi_i\,dx^{i}$ or $\tilde{\varphi}_i\,dx^{i} = \varphi_i\,dx^{i} - d\log\lambda$
(representing the gauge transformation of the representative of $\varphi$).
This new infinitesimal geometry contains in fact the first formulation of a gauge
theory. The idea of gauge was introduced by Weyl in a very influential paper of 1918 [60].
(See also the interesting paper on the same subject published thereafter by Pauli [38].)
The background of this thinking at that time can be retraced through the prefaces of
the various editions of his landmark book Raum, Zeit, Materie (first edition, 1918).
(Hermann Weyl was evidently inspired by the work of Einstein on gravity (1915-1916),
but also by the work of Felix Klein, who introduced the general mathematical concept
of group of transformations in his famous Erlangen Program in 1872, by D. Hilbert,
and mostly by Levi-Civita and Elie Cartan, who introduced, respectively, the concepts
of parallel transport and of connection, which turned out to play a more and more
important role in the mutual relations of mathematics and physics. He was also
influenced by the German physicist Gustav Mie, who tried, in a series of articles
published in 1912-1913, to explain the basic phenomena of matter on a purely
electromagnetic basis, in particular the existence, mass, and stability of electrons.
Besides, Mie attempted to formulate a theory of the electron that does not involve
divergent field quantities inside the electron.) Weyl showed that while Einstein's
gravity theory depended on a quadratic differential form
$$ds^2 = g_{ik}\,dx^{i}\,dx^{k}, \tag{5.2}$$
electromagnetism depended on a linear differential form
$$d\varphi = \varphi_i\,dx^{i}, \qquad 1 \le i, k \le 4, \tag{5.3}$$
(which in today's notation is $A_\mu\,dx^\mu$) defined up to the gauge transformation
$$ds^2 \to \lambda\,ds^2, \qquad d\varphi \to d\varphi + d\log\lambda. \tag{5.4}$$
Thus the idea of a nonintegrable scalar factor
$$\exp\left(\int d\varphi\right) \tag{5.5}$$
was born. Weyl argued that the addition of a gradient $d(\log\lambda)$ to $d\varphi = \varphi_\mu\,dx^\mu$
does not change the physical content of the theory, thus concluding that
$$F_{\mu\nu} = \partial_\mu \varphi_\nu - \partial_\nu \varphi_\mu \tag{5.6}$$
has invariant significance. He naturally then identified $F_{\mu\nu}$ with the electromagnetic
field and put
$$\varphi_\mu = (\text{constant})\,A_\mu, \tag{5.7}$$
where $A_\mu$ is the electromagnetic potential. Thus electromagnetism is conceptually
incorporated into Weyl's theory and, in particular, into the geometric idea of a
nonintegrable scalar factor (see [9]).
Here, an important aspect of the relationship between Einstein's general relativity
and Weyl's gauge theory is worth noting. In general relativity, one uses a kind of
mathematical relationship, known as a connection, which specifies that if the spacetime
orientation of a frame at $x$ is given, then the relative orientation of a frame at
$x + dx$ can be calculated. Since the frames are in a gravitational field, the connection
itself is determined by the strength of the field. In fact, the connection can replace
the gravitational field entirely, so that all motion can be described in terms of the
connection alone. This replacement of the field by a mathematical connection leads to the
well-known geometrical picture of general relativity. The familiar curvature of spacetime
can be calculated directly from the connection. Now Weyl went a step beyond general
relativity and asked the following question: if the effects of a gravitational field can
be described by a connection which gives the relative orientation between local frames in
spacetime, can other forces of nature such as electromagnetism also be associated with
similar connections? Generalizing the concept that all physical magnitudes are relative,
Weyl proposed that the absolute magnitude or norm of a physical vector also should
not be an absolute quantity but should depend on its location in spacetime. A new
connection would then be necessary in order to relate the lengths of vectors at different
positions. This connection is associated with the idea of scale or gauge invariance.
It is important to note that the true significance of Weyl's proposal lies in the local
property of gauge symmetry and not in the particular choice of the norm or gauge as
a physical variable. Actually, the assumption of locality is an enormously powerful
condition that determines not only the general structure but many of the specific
features of gauge theory.
Thus, after Einstein had developed his theory of general relativity, in which a
dynamical role was given to geometry, Weyl conjectured that perhaps the scale length,
indeed the scale of all dimensional quantities, would vary from point to point in space
and in time. His motivation was to unify gravity and electromagnetism, to find a
geometrical origin for electrodynamics (see the good presentations of Moriyasu [36] and
Gross [25]). He assumed that a translation in spacetime $dx^\mu$ would be accompanied
by a change of scale or gauge, $1 \to 1 + S_\mu\,dx^\mu$. The gauge function $S_\mu(x)$ would
determine the relative scale of lengths, so that a certain function would transform as
$f(x) \to f(x) + [\partial_\mu + S_\mu]\,f\,dx^\mu$. The hope was to identify the connection $S_\mu$ with
the vector potential of electrodynamics, thus unifying this theory with gravity. This
did not work, but the failure was only temporary. In fact, in 1927, after the development
of quantum mechanics, Fock and London noticed that the combination $p_\mu - (e/c)A_\mu$,
when $p_\mu$ is replaced with $-i\hbar\,\partial/\partial x^\mu$, looked very much like Weyl's change of
scale, but with a complex coefficient for the connection. Two years later Weyl completed
the discussion, showing how electrodynamics was invariant under the gauge transformation
of the gauge field and of the wave function of a charged particle,
$$A_\mu \to A_\mu + \partial_\mu \alpha, \qquad \psi \to e^{ie\alpha/\hbar c}\,\psi. \tag{5.8}$$
The concept of gauge invariance, and therefore the principle of local gauge symmetry,
was born. Accompanying the translation of a charged particle, there is a phase change.
The fact that the physics, at least at the Planck scale, remains unchanged with respect
to a gauge transformation lies at the heart of the different forms of matter.
The most remarkable thing mathematically is that all the objections to Weyl's
theory disappear if we interpret it, as will be done later, as based on the geometry of
a circle bundle over a Lorentzian manifold. Then the form above (see (5.3)), subject
to the gauge transformation, can be interpreted as defining a connection in the circle
bundle, and thus the metric remains unaltered. More generally, the characteristic
features of gauge theories can be described in terms of the topological and
differential-geometrical concept of fibre bundles and the connections in them. The
connection is an intrinsic local structure that can be imposed on the bundle; it gives an
elementary but fundamental example of a gauge field. Since gauge fields, including in
particular the electromagnetic field, are fibre bundles, all gauge fields are thus based
on topology and geometry. Starting in the 1970s, 20 years after the discovery by Yang and
Mills of a non-Abelian gauge theory for the strong force (nuclear interactions) in which
the local gauge group was the SU(2) isotopic-spin group, physicists were able to express
the concept of a gauge field in such a way that it could be recognized as an instance of
more abstract structures known to mathematicians as connections in fibre bundles. The
discovery of this equivalence has made it possible to understand why and how powerful
mathematical concepts and structures are necessary and suitable for the description
and explanation of physical reality.
In a very important paper, Wu and Yang introduced the fundamental concept of a
nonintegrable (that is, path-dependent) phase factor as the basis of a description of
electromagnetism [66]. Further, this concept is made to correspond to the definition of
a gauge field; to extend it to global problems, they analyzed, in relation with Dirac's
original result, the field produced by a magnetic monopole. The monopole discussion
leads to the recognition that in general the phase factor (and indeed the vector
potential $A_\mu$) can only be properly defined in each of many overlapping regions of
spacetime. In the overlap of any two regions, there exists a gauge transformation
relating the phase factors defined for the two regions. The concept of monopole leads to
the definition of global gauges and global gauge transformations. A surprising result is
that the monopole types are quite different for $SU_2$ and $SO_3$ gauge fields and for
electromagnetism. The mathematics of these results is fibre bundle theory. Furthermore,
gauge fields, including in particular the electromagnetic field, are fibre bundles, and
all gauge fields are thus based on geometry. So maybe all of the fundamental interactions
of the physical world could be based on these geometrical and topological objects.
The exact formulation of the concept of a nonintegrable phase factor depends on
the definition of global gauge transformations, that is, on the choice of the overlapping
regions of $R$ (where $R$ is a region of spacetime, precisely, all spacetime minus the
origin $r = 0$) and of the potential $A_\mu$ in each region. Through a certain kind of
operation, called distortion, one arrives at a large number of possibilities, each with a
particular choice of overlapping regions and with a particular choice of gauge
transformation from the original $(A_\mu)_a$ or $(A_\mu)_b$ to the new $A_\mu$ in each region. Each
of such possibilities will be called a gauge (or global gauge). This definition is a
natural generalization of the usual concept, extended to deal with the intricacies of the
field of a magnetic monopole. Notice that the gauge transformation factor in the overlap
between $R_a$ and $R_b$ does not refer to any specific $A_\mu$. (The gauge transformation in
the overlap of the two regions is
$S = S_{ab} = \exp(i\alpha) = \exp(2ige\phi/\hbar c)$, with $\phi$ the azimuthal angle.) Thus two
different gauges may share the same characterizations (a) and (b). In the case of the
monopole field, one can attach to the gauge any $(A_\mu)_a$ and $(A_\mu)_b$ provided they are
gauge-transformed into each other in the region of overlap. Thus a gauge is a concept not
tied to any specific vector potential. Wu and Yang called the process of distortion
leading from one gauge to another a global gauge transformation. It is also a concept not
tied to any specific vector potential. The collection of gauges that can be globally
gauge-transformed into each other will be said to belong to the same gauge type. The
phase factor $\exp\left(\frac{ie}{\hbar c}\oint A_\mu\,dx^\mu\right)$ (which is nonintegrable, i.e.,
path-dependent) around a loop starts and ends at the same point in the same region. Thus
it does not change under any global gauge transformation, so that we have the following
theorem for Abelian gauge fields.
Theorem 5.1. The phase factor around any loop is invariant under a global gauge
transformation.
The next two theorems follow trivially from this by taking an infinitesimal loop.
Theorem 5.2. The field strength $f_{\mu\nu}$ is invariant under a global gauge
transformation.
Theorem 5.3. Between two gauge fields defined on the same gauge there exists a
continuous interpolating gauge field defined on the same gauge.
Theorem 5.4. Consider a gauge and define any gauge field on it. The total magnetic
flux through a sphere around the origin $r = 0$ is independent of the gauge field and
depends on the gauge only:
$$\iint f_{\mu\nu}\,dx^\mu\,dx^\nu = -\frac{i\hbar c}{e}\oint \frac{\partial}{\partial x^\mu}\left(\ln S_{ab}\right)dx^\mu, \tag{5.9}$$
where $S$ is the gauge transformation defined by (5.8) for the gauge in question, and
the integral is taken around any loop around the origin $r = 0$ in the overlap between
$R_a$ and $R_b$, such as the equator on a sphere $r = 1$.
As in the case of electromagnetism, in the non-Abelian gauge fields both the concept
of a gauge and the concept of a global gauge transformation are not tied to any specific
gauge potentials. The nonintegrable phase factor for a given path is now an element of
the gauge group (see [41]). Since these phase factors do not in general commute with
each other, Theorems 5.1 and 5.2 for the Abelian case need to be modified as follows.
Theorem 5.5. Under a global gauge transformation, the phase factor around any
loop remains in the same class. The class does not depend on which point is taken as the
starting point around the loop.
Theorem 5.6. The field strength $f_{\mu\nu}$ is covariant under a global gauge
transformation.
Theorem 5.5 defines the class of a loop. This concept is a generalization of the phase
factor for electromagnetism around a loop with the magnetic flux as the exponent. It
is a gauge-invariant concept.
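The content of Theorem 5.5 can be illustrated by a small discretized sketch, which is ours and not part of the original argument. We assume a closed loop cut into four segments, each carrying a phase factor $U_k \in SU(2)$, and a gauge transformation acting by $U_k \to g_k U_k g_{k+1}^{-1}$ with $g_4 = g_0$. The product around the loop is then conjugated by $g_0$, so it stays in the same conjugacy class; the sketch checks this through the trace, which is a class function. The parametrization `su2` and all numerical angles are hypothetical choices.

```python
import cmath
import math


def su2(a, b, c):
    """An SU(2) matrix built from three angles (an illustrative parametrization)."""
    return [
        [math.cos(a) * cmath.exp(1j * b), math.sin(a) * cmath.exp(1j * c)],
        [-math.sin(a) * cmath.exp(-1j * c), math.cos(a) * cmath.exp(-1j * b)],
    ]


def matmul(A, B):
    """Product of two 2x2 complex matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(A):
    """Conjugate transpose; for a unitary matrix this is the inverse."""
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]


def loop_product(links):
    """Ordered product of the phase factors around the closed loop."""
    W = [[1, 0], [0, 1]]
    for U in links:
        W = matmul(W, U)
    return W


def trace(A):
    return A[0][0] + A[1][1]


# Hypothetical phase factors on the four segments and gauge elements at the four corners.
links = [su2(0.3, 0.1, 0.7), su2(1.1, -0.4, 0.2), su2(0.6, 0.9, -0.5), su2(0.2, 0.3, 1.3)]
gauge = [su2(0.5, 0.2, 0.1), su2(0.9, -0.7, 0.4), su2(1.4, 0.6, -0.3), su2(0.1, 1.0, 0.8)]

n = len(links)
transformed = [
    matmul(matmul(gauge[k], links[k]), dagger(gauge[(k + 1) % n]))
    for k in range(n)
]

W = loop_product(links)
W_new = loop_product(transformed)
print(abs(trace(W_new) - trace(W)))
```

In the Abelian case the conjugation is trivial and the loop factor itself is unchanged, which is the statement of Theorem 5.1.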
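Rigorously, the mathematical structure of gauge theory is that of a vector bundle
$E$ with structure group $G$ over a compact Riemannian manifold $M$. We assume that
$G \subset O(m)$ and that $E$ carries an inner product compatible with $G$. Let $\mathcal{C}_E$ be the
space of $G$-connections on $E$, and let $\mathcal{G}_E$ be the space of $G$-automorphisms of $E$. Then
$\mathcal{G}_E$ acts on $\mathcal{C}_E$ as before, and we have a quotient space
$\mathcal{B}_E \equiv \mathcal{C}_E/\mathcal{G}_E$. To each connection $\nabla \in \mathcal{C}_E$ there is associated a
curvature 2-form $R^\nabla$, and at each point $x$, we can take its norm
$$\|R^\nabla\|^2 \equiv \sum_{i<j} \|R^\nabla_{e_i, e_j}\|^2, \tag{5.10}$$
where $e_1, \dots, e_n$ is an orthonormal basis of $T_xM$ and the norm of $R^\nabla_{e_i, e_j}$ is the
usual one on $\mathrm{Hom}(E, E)$, namely, $\langle A, B\rangle \equiv \mathrm{trace}(A^{t} \circ B)$. Given any
$g \in \mathcal{G}_E$, we recall that $R^{g(\nabla)} = g \circ R^\nabla \circ g^{-1}$, so
$$\|R^{g(\nabla)}\| = \|R^\nabla\| \quad \text{on } M. \tag{5.11}$$
This says that the pointwise norm of the curvature is gauge-invariant.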
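The algebraic fact behind this gauge invariance is that the inner product $\mathrm{trace}(A^{t} \circ B)$ is unchanged when both arguments are conjugated by an orthogonal matrix. The following sketch checks this on one sample; it is only an illustration, and the rotation angle, the skew-symmetric matrix standing in for a curvature component, and the helper names are our own assumptions.

```python
import math


def matmul(A, B):
    """Product of two square real matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(A):
    return [list(row) for row in zip(*A)]


def inner(A, B):
    """<A, B> = trace(A^t B), which equals the sum of entrywise products."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


theta = 0.7  # hypothetical rotation angle
g = [
    [math.cos(theta), -math.sin(theta), 0.0],
    [math.sin(theta), math.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
]  # an element of O(3), so its inverse is its transpose

# A skew-symmetric matrix standing in for one curvature component.
R = [
    [0.0, 1.0, 2.0],
    [-1.0, 0.0, 3.0],
    [-2.0, -3.0, 0.0],
]

R_g = matmul(matmul(g, R), transpose(g))  # g R g^{-1}
print(inner(R, R), inner(R_g, R_g))
```

The conjugated matrix differs from the original one entry by entry, yet the two norms agree up to rounding, as (5.11) requires.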
Definition 5.7. The Yang-Mills functional is the mapping $YM : \mathcal{C}_E \to \mathbb{R}^{+}$, defined
on the space $\mathcal{C}_E$ of $G$-connections on $E$, given by
$$YM(\nabla) = \frac{1}{2}\int_M \|R^\nabla\|^2. \tag{5.12}$$
(Note that, by gauge invariance of the density (5.11), this functional descends to a
functional $YM : \mathcal{B}_E \to \mathbb{R}^{+}$ on the quotient by gauge transformations.)
An important fact is that if $M$ is four-dimensional, then $YM$ is conformally
invariant; that is, if we replace the metric $ds^2$ on $M$ by a new metric
$d\tilde{s}^2 = f\,ds^2$, for some positive function $f$ on $M$, then the Yang-Mills functional is
unchanged. We think of $YM$ as an action integral and seek its stationary points.
Definition 5.8. A connection $\nabla \in \mathcal{C}_E$ is called a Yang-Mills connection, and its
curvature $R^\nabla$ is called a Yang-Mills field, if $\mathrm{grad}_\nabla(YM) = 0$.
Lemma 5.9. The following are equivalent:
(1) $\nabla$ is Yang-Mills,
(2) $\delta^\nabla R^\nabla = 0$,
(3) $\Delta^\nabla(R^\nabla) = 0$, where $\Delta^\nabla = d^\nabla\delta^\nabla + \delta^\nabla d^\nabla$.
The equations $\delta^\nabla R^\nabla = 0$ are called the Yang-Mills equations. The equivalent
condition (3) states that the curvature $R^\nabla$ is harmonic (with respect to its own
Laplacian). It is naturally appealing to a geometer to study connections with harmonic
curvature. The subject has even more allure when one learns that the classical fields
corresponding to basic forces of nature (the electromagnetic, weak, and strong
interactions) can, and should, be formulated in these terms. A basic case is
electromagnetism, where $G = U(1)$ and $E$ is a complex line bundle over a four-dimensional
Lorentzian manifold. In relativistic terms, the six components of the curvature 2-form
$R^\nabla$ of a connection on $E$ represent the six components of the electromagnetic tensor.
The equations
$$dR^\nabla = 0, \qquad \delta R^\nabla = 0 \tag{5.13}$$
are exactly Maxwell's field equations.
6. The geometric nature of gauge invariance: gauge theory and quantum mechanics.
General relativity was discovered around 1916 by Einstein. Its complete mathematical
formulation was due to his friend the mathematician Marcel Grossmann. Soon
after, this theory was recognized as a major scientific event. Hermann Weyl of course
agreed with that. However, he was still looking for a more complete formulation of
Einstein's theory. Thus, under the influence of general relativity, he set out to search
for a unification of gravitation and electromagnetism. In other words, he asked
himself whether one could not find an extended geometry which would likewise make it
possible to accommodate the only other force field known at the time, the
electromagnetic field $F_{\mu\nu}$ with its potential $A_\mu$, into the geometrical structure of
spacetime. His main idea was that of a local gauge invariance, which I will try to
explain.
Let me start with a historical note. In a letter to Einstein, 1 March 1918, Weyl wrote:
"These days, I believe to have succeeded in deriving electricity and gravitation from a
common source. One obtains a wholly determined principle of action which within the
electricity-free field leads to your expression for gravitation. On the other hand, within
a gravitation-free field, one gets an expression which, at first sight, is in agreement
with Maxwell's theory." The response of Einstein came soon: "I received your paper. It is
a stroke of genius (Genie-Streich) of the highest rank. Besides, I was not able so far to
settle my objection concerning the measure standard . . . ." (We translated both
quotations from German.)
The importance of gauge invariance (Eichinvarianz) can be measured by what the
theoretical physicist Abdus Salam wrote at the Nobel conference of 1975: "One of the most
revolutionary events in the history of science of the last century is the idea of gauge
unification of the electromagnetic force with the weak nuclear forces" (Salam [42]); or by
what the outstanding theoretical physicist T. W. B. Kibble wrote in 1982: "Revolutions
are hard to recognize till they are past. This is surely true of the changes that have
occurred in elementary particle physics over the last two decades. The development of
gauge theories may well come to be seen as constituting one of the most fundamental
revolutions of this century, rivaling the development of quantum mechanics itself. Yet
so far its significance is not widely understood outside the ranks of specialists" (Kibble
We now return to Weyl. As we said, Weyl's work aimed at extending the physical
significance of general relativity and, consequently, at proposing a generalization of
Riemannian geometry. According to Weyl, this generalization becomes possible once one
accepts that the length of a vector, and not only its direction, must depend on the path
along which the vector is transported. In other words, length ceases to be an
action-at-distance concept. Mathematically, the idea of local gauge invariance amounts to
introducing a nonintegrable scale factor, that is, a function which accounts for the fact
that the invariance of the length of vectors under transport, which holds in Riemannian
geometry, is lost. So Weyl proposes a procedure for recalibrating the displacement of a
vector at each point of spacetime, in order to leave the length as well as the direction
of this vector locally unchanged. Furthermore, Weyl had the ingenious idea of associating
the scale vector with the electromagnetic potential, and its curl with the strength of the
electromagnetic field.
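The idea of Weyl runs as follows. The parallel transport of two vectors $V^{\mu}$
and $W^{\mu}$ from $x^{\mu}$ to $x^{\mu} + dx^{\mu}$ and, consequently, around a closed
contour is generalized. The angle between the two vectors is still kept fixed under
parallel transport, but the assumption of the invariance of the length of both vectors is
dropped. The length of a vector (in contrast to the angle between two of them) ceases to
be an action-at-distance concept. How should one change the expression
$$\delta_{\parallel}V^{\mu} = -\left\{{}^{\,\mu}_{\nu\rho}\right\} V^{\nu}\,dx^{\rho} \qquad (6.1)$$
(which expresses, through the Christoffel symbols of the metric, the change of the
components of a vector $V^{\mu}(x)$, if displaced or transported parallelly on a Riemannian
manifold $M_{r}$ of r dimensions from the point with coordinates $x^{\mu}$ to the one with
coordinates $x^{\mu} + dx^{\mu}$)? One would like to uphold the bilinearity of
$\delta_{\parallel}V^{\mu}$ in $V^{\nu}$ and $dx^{\rho}$, thereby arriving at
$$\delta_{\parallel}V^{\mu} = -\Gamma^{\mu}{}_{\nu\rho}\,V^{\nu}\,dx^{\rho} \qquad (6.2)$$
with the so far unknown connection coefficients $\Gamma^{\mu}{}_{\nu\rho}$. On the basis of
the last expression, we can define once more a covariant differentiation, which we still
denote by $\nabla$. Then the change of a vector around the contour turns out to be (with
the usual sign and orientation conventions for the enclosed surface element $\Delta S^{\rho\sigma}$)
$$\Delta V^{\mu} = -\tfrac{1}{2}\,R^{\mu}{}_{\nu\rho\sigma}\,V^{\nu}\,\Delta S^{\rho\sigma}, \qquad (6.3)$$
this time involving the curvature tensor of the connection $\Gamma$, and we get
$$\ldots \qquad (6.4)$$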
We thus see that in [61] Weyl enlarged the Riemannian spacetime of general relativity
by an independent vector field of geometric origin (in modern terms, a one-form). This
additional geometric object is intimately linked with the geometrical structure of
spacetime. In addition, the Weyl vector is the compensating potential for allowing invariance
with respect to local recalibration of lengths, that is, with respect to conformal changes
of the metric. One can furthermore generalize the Weyl geometry to the metric-affine
geometry, which is based on a (symmetric) metric and an independent (nonsymmetric)
linear connection. In Weyl geometry, one geometrical object, the metric tensor, stands
for the gravitational potential, as in general relativity, whereas the other one, the
linear connection, was surmised to represent the electromagnetic potential known from
Maxwell's theory. Together with a suitable (gravitational and electromagnetic) field
Lagrangian, which turns out to be quadratic in the curvature of the underlying Weyl
spacetime, this builds up Weyl's unified theory of 1918. The idea of gauge invariance, or the
so-called principle of recalibration, which applied first to the length of vectors in
spacetime, was transmuted into the concept of local gauge invariance of the phase of a wave
function in 1929, and represents, in this last form, one of the underlying principles of all
modern gauge theories, such as the Weinberg-Salam theory of electroweak interactions.
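The two forms of the gauge principle mentioned above can be sketched in a few lines. The relations below are a minimal illustration, not Weyl's own notation: the sign conventions, the symbols $l_0$, $\gamma$, $\lambda$, $\alpha$, and the coupling constant $e$ are assumptions made for the sketch.

```latex
% Weyl 1918 (sketch): the length l of a vector transported along a path \gamma
% is rescaled by a path-dependent, nonintegrable factor.
\[
  dl = l\,A_\mu\,dx^\mu
  \quad\Longrightarrow\quad
  l(\gamma) = l_0 \exp\Big(\int_\gamma A_\mu\,dx^\mu\Big).
\]
% Local recalibration of lengths (conformal change of the metric) and the
% compensating change of the Weyl vector:
\[
  g_{\mu\nu} \to e^{2\lambda(x)}\,g_{\mu\nu}, \qquad
  A_\mu \to A_\mu + \partial_\mu\lambda .
\]
% Around a closed contour bounding a surface S, Stokes' theorem shows that the
% length comes back changed unless F_{\mu\nu} vanishes:
\[
  \frac{l_{\mathrm{final}}}{l_{\mathrm{initial}}}
  = \exp\Big(\oint A_\mu\,dx^\mu\Big)
  = \exp\Big(\tfrac{1}{2}\int_S F_{\mu\nu}\,dx^\mu\wedge dx^\nu\Big),
  \qquad F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu .
\]
% 1929 form (sketch): the real scale factor is replaced by a phase of the
% wave function, with e an assumed coupling constant.
\[
  \psi \to e^{i\alpha(x)}\,\psi, \qquad
  A_\mu \to A_\mu + \tfrac{1}{e}\,\partial_\mu\alpha, \qquad
  D_\mu\psi = (\partial_\mu - i e A_\mu)\,\psi \;\to\; e^{i\alpha(x)}\,D_\mu\psi .
\]
```

In both versions $F_{\mu\nu}$ is unchanged by the transformation, which is why it can be identified with a physical field strength while $A_\mu$ itself is only defined up to a gradient.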
The other fundamental contribution of Weyl is related to his gauge theory but
concerns quantum mechanics. In an article in 1927 [62], then in his book Gruppentheorie
und Quantenmechanik (1928), Weyl proposes developing the mathematical foundations
of this newly discovered physical theory by showing its close relationship to group
representation theory. (For a very illuminating overview of Weyl's contribution to the
theory of Lie groups, see [10].) In Weyl's new mathematical approach, the basic question
at that time was to explain the properties of particles (protons and electrons) by
the properties of the quantum laws: do these laws satisfy the basic symmetries known
at that time (right/left, past/future, positive/negative electric charge)? Mathematically,
that was equivalent to knowing the structure of certain classes of (continuous) groups
and their algebras. These three kinds of symmetry were introduced (under other names)
into quantum physics in the 1930s by Weyl himself and by E. Wigner, but no one thought
then of unifying the three kinds. In 1930, Dirac had predicted the existence of a particle
(the positron) with a charge opposite to that of an electron, and Weyl then generalized
this to a universal essential equivalence between positive and negative electricity. This
idea was reformulated in 1937 as the charge-conjugation invariance of electrical charge.
However, in 1957, Lee and Yang found that left-right symmetry (or conservation of parity P),
which physicists had always found useful to accept, was not entirely satisfied by the
laws of nature, particularly in weak interactions, which are responsible for radioactive
(beta) disintegration. Since it could be verified theoretically and experimentally
that the Weyl-Pauli theory gave a correct description of the neutrino emitted in this
radioactivity, the conclusion was that the neutrino described by that theory violated
left-right symmetry. This asymmetry seemed to be a consequence of "duplication": massless
particles (neutrinos) emitted in a beta disintegration existed in only one form (left),
while the corresponding antiparticles (antineutrinos) could then only exist in the opposite
form. Mathematically, this duplication could appear as the existence of two valid solutions
for an equation. Some theoretical physicists interpret this phenomenon to speculate
that the world did not have to be symmetrical with respect to every operation which
left the laws of nature invariant: the loss of symmetry could be ascribed to the
asymmetry of the whole universe. Such an explanation raises several questions. It is just
as reasonable to believe that the loss of symmetry, as a characteristic of a transitory
phase in which the laws of nature apply, could be explained by a richer, more general
mathematical symmetry. Recent research in this field seems to be oriented toward this
second outlook.
7. Quantum electrodynamics, gauge theory, and the concept of symmetry. It is
now important to emphasize some facts about quantum electrodynamics, that is,
the theory that results from combining electron matter fields with electromagnetic
fields; its formulation was begun in the 1930s by P. Dirac and was essentially completed in
about 1949 by S. Tomonaga, J. Schwinger, R. P. Feynman, and F. J. Dyson. (The original
papers have been republished in [46].) We recall first that it is based on a local gauge
symmetry. Another theory, Einstein's general theory of relativity, is also based on a local
gauge symmetry, one which pertains not to a field distributed through space and time but
to the structure of spacetime itself. Indeed every point in spacetime can be labeled by
four numbers, which give its position in the three spatial dimensions and its sequence
in the one time dimension. These numbers are the coordinates of the event, and the
procedure for assigning such numbers to each point in spacetime is a coordinate system.
The choice of such a coordinate system is clearly a matter of convention. The
freedom to move the origin of a coordinate system constitutes a symmetry of nature.
Actually there are three related symmetries: all laws of nature remain invariant when the
coordinate system is transformed by translation, by rotation, or by mirror reflection. It
is important to note, however, that these symmetries are only global ones. Each symmetry
transformation can be defined as a formula for finding the new coordinates of a point
from the old coordinates. Those formulas must be applied simultaneously in the same
way to all the points.
In quantum electrodynamics, the symmetry operation consists of a local phase
change in the electron field, each such phase shift being accompanied by an interaction
with the electromagnetic field. Imagine an electron undergoing two consecutive
phase shifts: the emission of a photon and then the absorption of one. If the sequence
of the phase shifts were reversed, the final result would be the same. It follows that an
unlimited series of phase shifts can be made, and the final result will simply be their
algebraic sum, no matter what their sequence is. By contrast, in the Yang-Mills
theory, where the symmetry operation is a local rotation of the isotopic-spin arrow,
the result of multiple transformations may be rather different. Suppose a hadron
undergoes a gauge transformation A followed by a second transformation B; at the
end of this sequence the isotopic-spin arrow is found in the orientation that
corresponds to a proton. Now suppose the same transformations were applied to the same
hadron but in the reverse sequence: B followed by A. In general the final state will not
be the same; the particle may be a neutron instead of a proton. Therefore, the net effect
of the two transformations depends explicitly on the sequence in which they are applied.
Because of this distinction, quantum electrodynamics is called an Abelian theory
and the Yang-Mills theory is called a non-Abelian one. Abelian groups are made up of
transformations that, when applied one after another, have the commutative property;
non-Abelian groups are not commutative (see [16]). (The terms are borrowed from the
mathematical theory of groups and honor the Norwegian mathematician N. H. Abel.)
Like the Yang-Mills theory, the general theory of relativity is non-Abelian. Even the
electromagnetic interaction has been incorporated into a larger theory that is non-Abelian.
For now, at least, it seems all the forces of nature are governed by non-Abelian gauge
theories.
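The difference between the Abelian and the non-Abelian case can be checked on a small worked example. The sketch below assumes the standard Pauli-matrix representation of isotopic-spin rotations; the particular quarter-turn rotations $U_A$ and $U_B$ are illustrative choices, not taken from the text.

```latex
% Requires amsmath for pmatrix.
% U(1) phase shifts commute, whatever their order:
\[
  e^{i\alpha}\,e^{i\beta} = e^{i(\alpha+\beta)} = e^{i\beta}\,e^{i\alpha}.
\]
% SU(2): the Pauli matrices do not commute.
\[
  \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},\quad
  \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix},\quad
  \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},\qquad
  [\sigma_a,\sigma_b] = 2i\,\epsilon_{abc}\,\sigma_c .
\]
% Two quarter-turn rotations of the isotopic-spin arrow about axes 1 and 2:
\[
  U_A = e^{-i\pi\sigma_1/4} = \tfrac{1}{\sqrt{2}}\,(1 - i\sigma_1),\qquad
  U_B = e^{-i\pi\sigma_2/4} = \tfrac{1}{\sqrt{2}}\,(1 - i\sigma_2).
\]
% The two orderings give different results:
\[
  U_A U_B = \tfrac{1}{2}\,(1 - i\sigma_1 - i\sigma_2 - i\sigma_3),\qquad
  U_B U_A = \tfrac{1}{2}\,(1 - i\sigma_1 - i\sigma_2 + i\sigma_3),
\]
\[
  U_A U_B - U_B U_A = -\,i\,\sigma_3 \neq 0 .
\]
```

The nonvanishing difference is exactly the commutator $-\tfrac{1}{2}[\sigma_1,\sigma_2]$, so the order of the two rotations matters, whereas the U(1) phases simply add.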
This important and surprising result (i.e., the asymmetry of certain fundamental
laws of physics) spurred a vast new investigation, still active today, into spontaneous
symmetry breaking. The central question now seems to be the connection between the
symmetry breaking occurring in the behavior of certain elementary particles at a certain
level of size and temperature, and the geometrical structure of space at that same
level. More precisely, it has been hypothesized that a symmetry breaking occurs when
there is a change (or degeneration) in the space structure, or, mathematically speaking,
a jump from a group to a "poorer" group of the field or the interaction concerned.
However, nothing prevents us from believing that if there is a richer group containing
the two others as subgroups, the difficulty may be removed (see below for further
considerations on this point).
Mathematically the phenomenon of symmetry breaking can be formulated as follows.
Let V be a vector bundle with structure group G; it might happen that under
some conditions the structure group of V can be reduced to a subgroup $G_0$. This
phenomenon of gauge symmetry breaking plays a central role in particle physics, more
precisely, in the Weinberg-Salam-Glashow model of weak interactions. Suppose that at
some low mass scale m, the gauge group G is effectively reduced to a subgroup $G_0$. Even
if the representations R and $\tilde{R}$ are inequivalent as representations of G, they may
be equivalent as representations of $G_0$. In this case, the fermions that were kept massless
by the inequivalence of R and $\tilde{R}$ will be able to gain masses of order m. This is
precisely what seems to happen in nature. At a mass scale of order $10^{-17} M_{\mathrm{Pl}}$
(where $M_{\mathrm{Pl}}$ is the Planck mass), the gauge group
$SU(3) \times SU(2) \times U(1)$ is reduced to $SU(3) \times U(1)$. At this point, some of
the gauge fields become massive. At the same time, the representations R and $\tilde{R}$ are
isomorphic as representations of $SU(3) \times U(1)$, so the light fermions can and do gain
mass. Many facts of this symmetry-breaking process are not yet understood, for example, why
the mass scale associated with symmetry breaking is so tiny compared to the natural mass
scale $M_{\mathrm{Pl}}$. It is, however, pretty clear that the idealization in which the
masses of the particles are all zero is the situation in which the gauge group
$SU(3) \times SU(2) \times U(1)$ is not broken to a subgroup.
Consider further the basic decomposition $S = S_{+} \oplus S_{-}$ of the spinor
representation S into spinors $S_{\pm}$ of positive and negative chirality; the distinction
between $S_{+}$ and $S_{-}$ is a matter of convention. Under a change of the orientation of
spacetime, called a parity transformation by physicists, $S_{+}$ and $S_{-}$ are exchanged.
The representations R and $\tilde{R}$ are therefore exchanged by parity. If we assume that
the laws of nature are invariant under parity, then R and $\tilde{R}$ must be isomorphic.
The explanation of the lightness of the fermions therefore rests on parity violation.
Indeed, in the 1950s it was discovered that the weak interactions violate parity. On the
other hand, parity is conserved by strong and electromagnetic interactions; this is the
statement that R and $\tilde{R}$ are isomorphic as representations of $SU(3) \times U(1)$.
In order to overcome this contradiction, one has lately contemplated the possibility of
extending the observed gauge group G to a larger group $\tilde{G}$, such as SU(5), which
contains $SU(3) \times SU(2) \times U(1)$, SO(10), or the exceptional group $E_6$.
In physical terms, however, the problem can be put in a quite different manner. It
is well known that one of the serious difficulties of the Yang-Mills theory is that when
isotopic-spin symmetry becomes exact, the result is that protons and neutrons are
indistinguishable; this situation obviously contradicts observation. Even more troubling
is the prediction of electrically charged photons. The photon is necessarily massless
because it must have an infinite range. Imposing a mass on the quanta of the charged fields
does not make the fields disappear, but it does confine them to a finite range. If the mass
is large enough, the range can be made as small as wished. As the long-range effects are
removed, the existence of the fields can be reconciled with experimental observations.
The modified Yang-Mills theory was easier to understand, but the theory still had to be
given a quantum-mechanical interpretation.
The problem of infinities turned out to be more severe than it had been in quantum
electrodynamics, and the standard recipe for renormalization would not solve it. In this
respect, the fundamental idea of the Higgs mechanism was to include in the modified
Yang-Mills gauge theory an extra field, one having the peculiar property that it does
not vanish in the vacuum. One usually thinks of a vacuum as a space with nothing in
it, but in physics the vacuum is defined more precisely as the state in which all fields
have their lowest possible energy. For most fields the energy is minimized when the
value of the field is zero everywhere, or in other words when the field is turned off.
An electron field, for example, has its minimum energy when there are no electrons.
The effect of the Higgs field is to provide a frame of reference in which the orientation
of the isotopic-spin arrow can be determined. The Higgs field can be represented as an
arrow superposed on the other isotopic-spin indicators in the imaginary internal space
of a hadron. What distinguishes the arrow of the Higgs field is that it has a fixed length,
established by the vacuum value of the field. The orientation of the other isotopic-spin
arrows can then be measured with respect to the axis defined by the Higgs field. In this
way a proton can be distinguished from a neutron. It might seem that the introduction
of the Higgs field would spoil the gauge symmetry of the theory and thereby lead again
to insoluble infinities. In fact, however, the gauge symmetry is not destroyed but merely
concealed. The symmetry specifies that all the laws of physics must remain invariant
when the isotopic-spin arrow is rotated in an arbitrary way from place to place. This
implies that the absolute orientation of the arrow cannot be determined since any
experiment for measuring the orientation would have to detect some variation in a physical
quantity when the arrow was rotated. With the inclusion of the Higgs field, the absolute
orientation of the arrow still cannot be determined because the arrow representing the
Higgs field also rotates during a gauge transformation. All that can be measured is
the angle between the arrow of the Higgs field and the other isotopic-spin arrows, or
in other words their relative orientation. The Higgs mechanism is an example of the
process called spontaneous symmetry breaking, which was already well established in
other areas of physics.
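A standard textbook illustration of a field that "does not vanish in the vacuum" is sketched below. The quartic potential, the self-coupling $\lambda$, the vacuum value $v$, the gauge coupling $g$, the gauge fields $W^a_\mu$, and the normalizations are assumptions made for this sketch; the text above does not specify them.

```latex
% Requires amsmath for pmatrix.
% Assumed potential for an isotopic-spin doublet \phi; it depends only on the
% length of the Higgs arrow, not on its orientation.
\[
  V(\phi) = \lambda\,\big(\phi^{\dagger}\phi - \tfrac{1}{2}v^{2}\big)^{2},
  \qquad \lambda > 0 .
\]
% The energy is lowest not at \phi = 0 but on the sphere |\phi| = v/\sqrt{2};
% one may choose the representative
\[
  \langle\phi\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v \end{pmatrix}.
\]
% A local isotopic-spin rotation U(x) in SU(2) rotates the arrow but leaves
% its length, and hence the potential, unchanged:
\[
  \phi \to U(x)\,\phi, \qquad V(U\phi) = V(\phi).
\]
% With D_\mu = \partial_\mu - \tfrac{i}{2}\,g\,\sigma^{a}W^{a}_{\mu}, the kinetic
% term evaluated on the vacuum value is a mass term for the gauge quanta:
\[
  \big|D_\mu\langle\phi\rangle\big|^{2}
  = \frac{g^{2}v^{2}}{8}\,W^{a}_{\mu}W^{a\,\mu}
  = \tfrac{1}{2}\,m_W^{2}\,W^{a}_{\mu}W^{a\,\mu},
  \qquad m_W = \tfrac{1}{2}\,g\,v .
\]
```

The potential is gauge invariant while any particular minimum is not, which is the sense in which the symmetry is hidden rather than destroyed.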
The concept was first put forward by W. Heisenberg in his description of ferromagnetic
materials (in 1928). Heisenberg pointed out that the theory describing a ferromagnet
has perfect geometric symmetry in that it gives no special distinction to any
one direction in space. When the material becomes magnetized, however, there is one
axis (the direction of magnetization) that can be distinguished from all other axes. The
theory is symmetrical but the object it describes is not. Similarly, the Yang-Mills theory
retains its gauge symmetry with respect to rotations of the isotopic-spin arrow, but the
objects described (protons and neutrons) do not express the symmetry. Philosophically,
this fact leads to making the distinction between the "ontological" or "objectal"
level and the "operational" or "theoretical" level of physical entities; moreover, the first
level cannot be reduced to the latter one.
Despite all these difficulties, interest in the Yang-Mills theory persisted. It had begun
as a model of the strong interactions, but by the time it had been renormalized, interest in
it centered on applications to weak interactions. In 1967, S. Weinberg, A. Salam, and
J. C. Ward proposed a model of the weak interactions based on a version of the Yang-Mills
theory in which the gauge quanta take on mass through the Higgs mechanism. The
Weinberg-Salam-Ward model actually embraces both the weak force and electromagnetism
(Salam [43]). The conjecture on which the model is ultimately founded is a postulate of local
invariance with respect to isotopic spin; in order to preserve that invariance, four
photonlike fields are introduced, rather than the three of the original Yang-Mills theory.
The fourth photon could be identified with some primordial form of electromagnetism. It
corresponds to a separate force, which had to be added to the theory without explanation.
For this reason the model should not be called a unified field theory.
If one were to search for a nonlinear generalization of Maxwell's equations to explain
elementary particles, there are various symmetry properties one would require (see
(i) external symmetries under the Lorentz and Poincaré groups and under the conformal
group if one is taking the rest-mass to be zero,
(ii) internal symmetries under groups like SU(2) or SU(3) to account for the known
features of elementary particles,
(iii) covariance or the ability to be coupled with gravitation by working on a curved
spacetime.
Gauge theories satisfy these basic requirements because they are geometric in
character. In fact, on the mathematical side, gauge theory is a well-established branch of
differential geometry, known as the theory of fibre bundles with connection. (On this topic,
see [14, 18, 48].) It has much in common with Riemannian geometry, which provided
Einstein with the basis for his theory of general relativity. If the current expectations
of Yang-Mills theory are eventually fulfilled, it will in some measure justify Einstein's
point of view that the basic laws of physics should all be in geometrical form (Atiyah
[3] and Wheeler [63]).
8. Topological aspects of gauge theory, and invariants of four-manifold topology
and quantum field theory. We need once more to emphasize this fundamental fact.
The mathematical basis of gauge field theory lies in vector bundles and the connections
in them. In fact, one of the most striking developments in mathematical physics over
the past quarter century has been the discovery of the fundamental role played by
bundles, connections, and curvature in expressing and eventually explaining the basic
laws of nature. (See [11, 12, 32].) The so-called Yang-Mills theory does reflect in the
deepest way the intimate relationship between geometrical concepts and physical ideas.
The key feature of Yang-Mills theory is the invariance of the physical properties of
particles under a group, but in this case an infinite-dimensional group (whereas
Maxwell's equations in vacuum for the electromagnetic field are invariant under the
finite-dimensional Lorentz group of linear isometries of Minkowski space $\mathbb{R}^{4}$).
Consider the classical electromagnetic field in terms of an exterior 2-form $F$ on
$\mathbb{R}^{4}$. Notice that since $dF = 0$, we may express $F$ as
$$F = dA, \qquad (8.1)$$
where $A$ is a 1-form on $\mathbb{R}^{4}$ called the electromagnetic potential. The form $A$
is defined only up to an exact form, that is, we may replace $A$ by $A + df$, where f is any
smooth function on $\mathbb{R}^{4}$. Such a replacement is called a change of gauge or a gauge
transformation. So, the invariance of the physical laws of particle interactions with respect
to the group of gauge transformations lies at the heart of matter. It is called the principle
of local invariance.
In an attempt to describe strong interactions at the classical level, C. N. Yang and R. Mills
proposed in 1954 that the Lagrangian of the interaction should involve a potential with
values in the Lie algebra of the non-Abelian group SU(2), which describes the degrees
of freedom of isotopic spin, the first quantum number to be understood in relation
to strong interactions. Moreover, this Lagrangian should be invariant under the group
of local internal symmetries, again called gauge transformations. One of the striking
features of the Yang-Mills proposal was that the potential A (an su(2)-valued 1-form)
was required to transform like a connection. Specifically, a gauge transformation is here
defined as a map $g : \mathbb{R}^{4} \to SU(2)$, and the transformed potential is given by
$$A' = g^{-1}Ag + g^{-1}dg. \qquad (8.2)$$
Furthermore, the proposed (Lagrangian) density was just $\|F\|^{2}$, where
$F = dA + \tfrac{1}{2}[A, A]$ was the curvature of the connection A. This connection lives on
the trivialized principal SU(2)-bundle over $\mathbb{R}^{4}$. The gauge transformation simply
amounts to a principal bundle automorphism.
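The invariance statements of this paragraph can be verified by a short computation. The derivation below is a sketch under stated assumptions: $F$ denotes the curvature, the gauge map is written $g$, the transformation law is taken in the form $A' = g^{-1}Ag + g^{-1}dg$, and $\|\cdot\|$ is a norm built from an Ad-invariant inner product on the Lie algebra (for instance minus the trace); other sign and ordering conventions are equally common.

```latex
% Abelian case: F = dA is unchanged by A -> A + df because d^2 = 0.
\[
  F' = d(A + df) = dA + d^{2}f = dA = F .
\]
% Non-Abelian case: write F = dA + \tfrac{1}{2}[A,A] = dA + A \wedge A and use
% d(g^{-1}) = -\,g^{-1}\,dg\,g^{-1}.
\[
  dA' = g^{-1}\,dA\,g \;-\; g^{-1}dg\,g^{-1}\wedge A\,g \;-\; g^{-1}A\wedge dg
        \;-\; g^{-1}dg\wedge g^{-1}dg ,
\]
\[
  A'\wedge A' = g^{-1}\,(A\wedge A)\,g \;+\; g^{-1}A\wedge dg
        \;+\; g^{-1}dg\,g^{-1}\wedge A\,g \;+\; g^{-1}dg\wedge g^{-1}dg .
\]
% Adding the two lines, all terms containing dg cancel in pairs:
\[
  F' = dA' + A'\wedge A' = g^{-1}\,(dA + A\wedge A)\,g = g^{-1}F\,g .
\]
% Hence the density is gauge invariant:
\[
  \|F'\|^{2} = \|g^{-1}F\,g\|^{2} = \|F\|^{2} .
\]
```

Unlike the Abelian case, the curvature itself is not invariant but only conjugated by $g$; it is the density $\|F\|^{2}$ that is unchanged.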
There is at present no doubt that some mathematical concepts of fibre bundle theory
have become an established part of mathematical physics because fibre bundles
provide a natural and very deep framework for discussing the concepts of relativity
and invariance, describing gravitation and other gauge fields, and giving a geometrical
interpretation to quantization and the canonical formalism of particles and fields. Fibre
bundles provide the language which is needed for dealing with local problems of
differential geometry and field theory. They are necessary to understand and solve global,
topological problems, such as those arising in connection with magnetic monopoles
and instantons. For example, in an attempt to understand the properties of Donaldson
invariants of four-manifolds, E. Witten presented a new approach to using physics to
illuminate Donaldson theory (Witten [64] and Donaldson [19]). (For a very illuminating
survey of the Seiberg-Witten equations and their relation to topological invariants of
four-manifolds, see [19]. Donaldson has stressed the importance of these equations in
the following words: "Since 1982 the use of gauge theory, in the shape of the Yang-Mills
instanton equations, has permeated research in 4-manifold topology. (. . .) A body
of techniques has built up through the efforts of many mathematicians, producing
results which have uncovered some of the mysteries of 4-manifold theory, and leading to
substantial internal conundrums within the field itself. In the last three months of 1994
a remarkable thing happened: this research was turned on its head by the introduction
of a new kind of differential-geometric equations by Seiberg and Witten: in the space
of a few weeks, long-standing problems were solved, new and unexpected results were
found, along with simpler new proofs of existing ones, and new vistas for research
opened up" [19, page 45].) Witten suggested that, instead of computing the Donaldson
invariants by counting SU(2) instanton solutions, one can obtain the same invariants
by counting the solutions of the dual equations, which involve U(1) gauge fields and
monopoles. From a physical point of view, the dual description via monopoles and
Abelian gauge fields should be simpler than the microscopic SU(2) description since
in the renormalization group sense it arises by integrating out the irrelevant degrees
of freedom.
The new monopole equations and the topological invariants of four-manifolds
introduced by Witten involve two entities, a U(1) connection and a spinor field. Thus a
main prerequisite for their study is a knowledge of spinors on four-manifolds. More
precisely, the most relevant notion is that of a $\mathrm{Spin}^{c}$ structure. Recall that if
X is an oriented, closed Riemannian four-manifold, a spin structure on X is a lift of the
structure group of the tangent bundle from SO(4) to its double cover Spin(4). The exceptional
isomorphism $\mathrm{Spin}(4) = SU(2) \times SU(2)$ means that this can be given a more
concrete description in terms of vector bundles. Giving a spin structure is the same as
giving a pair of complex 2-plane bundles $S^{+}, S^{-} \to X$, each with structure group
SU(2) and related to the tangent bundle by a structure map
$c : TX \to \mathrm{Hom}(S^{+}, S^{-})$. Now the map
$e \wedge f \mapsto c(e)^{*}c(f) - c(f)^{*}c(e)$ induces a map $\rho$ from the self-dual
2-forms $\Lambda^{+}$ to $\mathrm{Hom}(S^{+}, S^{+})$, which corresponds to the standard
isomorphism between the Lie algebras of SU(2) and SO(3).
The map c is the symbol of the Dirac operator $D : \Gamma(S^{+}) \to \Gamma(S^{-})$, and
one of the most fruitful calculations in differential geometry leads to the
Lichnerowicz-Weitzenböck formula for the Dirac operator:
$$D^{*}D\psi = \nabla^{*}\nabla\psi + \tfrac{1}{4}R\,\psi. \qquad (8.3)$$
Here, $\nabla$ is the covariant derivative on spinors, induced by the Levi-Civita connection,
and R is the scalar curvature, which acts in (8.3) by scalar multiplication at each point. If
we have an additional auxiliary bundle $E \to X$, with a Hermitian metric and connection,
we may consider spinors with values in E, that is, sections of $S^{\pm} \otimes E$. The Dirac
operator on these coupled spinors satisfies
$$D^{*}D\psi = \nabla^{*}\nabla\psi + \tfrac{1}{4}R\,\psi + F^{+}(\psi), \qquad (8.4)$$
where $F^{+}$ is the self-dual part $\tfrac{1}{2}(F + {\ast}F)$ of the curvature $F$ of E. Here,
the self-dual forms act on spinors in the way described above. Now a spin structure may not
exist globally (the Stiefel-Whitney class $w_{2}(X) \in H^{2}(X; \mathbb{Z}/2)$ is the
obstruction) but a variant, a $\mathrm{Spin}^{c}$ structure, always does. A
$\mathrm{Spin}^{c}$ structure is given by a pair of vector bundles $W^{\pm} \to X$ with an
isomorphism, say $\Lambda^{2}W^{+} = \Lambda^{2}W^{-} = L$, such that locally
$W^{\pm} = S^{\pm} \otimes L^{1/2}$, where $L^{1/2}$ is a local square root of L:
$L^{1/2} \otimes L^{1/2} = L$. An old result of Hirzebruch and Hopf
assures the existence of $\mathrm{Spin}^{c}$ structures on any oriented, closed four-manifold;
up to an action of the finite group $H^{1}(X; \mathbb{Z}/2)$, they are classified by the lifts
of $w_{2}(X)$ to $H^{2}(X; \mathbb{Z})$, the first Chern class of the line bundle L. A
connection A on L gives a Dirac operator $D_{A} : \Gamma(W^{+}) \to \Gamma(W^{-})$, which is
locally just the same as the Dirac operator on $L^{1/2}$-valued spinors. In particular we get
the Lichnerowicz formula
$$D_{A}^{*}D_{A}\psi = \nabla_{A}^{*}\nabla_{A}\psi + \tfrac{1}{4}R\,\psi + \tfrac{1}{2}F_{A}^{+}(\psi), \qquad (8.5)$$
where the factor of 1/2 comes from the square root of L. Note that
$\mathrm{Hom}(W^{+}, W^{+}) = \mathrm{Hom}(S^{+}, S^{+})$.
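Now, the Seiberg-Witten equations for a four-manifold X with $\mathrm{Spin}^{c}$
structure $W^{\pm}$ are equations for a pair $(A, \psi)$, where
(1) A is a unitary connection on $L = \Lambda^{2}W^{\pm}$,
(2) $\psi$ is a section of $W^{+}$.
If $\psi$ and $\varphi$ are in $W^{+}$, we write $\psi\varphi^{*}$ for the endomorphism
$\theta \mapsto \langle\theta, \varphi\rangle\,\psi$ of $W^{+}$. The trace-free part of this
endomorphism lies in the image of the map $\rho$ by which self-dual 2-forms act on spinors,
and we write $\tau(\psi, \varphi)$ for the corresponding element of
$\Lambda^{+} \otimes \mathbb{C}$. So, $\tau$ is a sesquilinear map
$\tau : W^{+} \times W^{+} \to \Lambda^{+} \otimes \mathbb{C}$.
The Seiberg-Witten equations are
$$D_{A}\psi = 0, \qquad F_{A}^{+} = \tau(\psi, \psi). \qquad (8.6)$$
The sign of the quadratic form $\tau(\psi, \psi)$ is crucial and underpins the whole theory.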
Witten showed that (Witten [64], Seiberg and Witten (1999)), in general, the number
of solutions of a system of equations weighted by the sign of the determinant of the
operator analogous to T (an elliptic operator
$T : \Lambda^{1} \oplus W^{+} \to \Lambda^{0} \oplus \Lambda^{2,+} \oplus W^{-}$
is defined by $T = s \oplus t$) is always a topological invariant if a suitable compactness
holds. If one has a gauge-invariant system of equations, and one wishes to count gauge
orbits of solutions up to gauge transformations, then one requires (i) compactness, (ii)
free action of the gauge group on the space of solutions. By contrast with Donaldson
theory, according to which for SU(2) instantons, compactness fails precisely because
an instanton can shrink to zero size, the monopole equations are scale-invariant but
they have no nonconstant $L^{2}$ solutions on flat $\mathbb{R}^{4}$.
The general problem behind the above result is that of finding topological invariants
defined by solutions of partial differential equations. In differential topology one is
familiar with many contexts in which the solutions of an equation f(x) = y are, at the
level of homology, unchanged by continuous variations of parameters. For example,
f might be a map $f : P \to Q$ between oriented manifolds; then the homology class in
$H_{*}(P)$ of $f^{-1}(y)$, for generic y in Q, is a homotopy invariant of f, namely the
Poincaré dual of the pullback of the fundamental cohomology class of Q. Or f might be a
section of an oriented vector bundle $V \to P$, and y = 0, so the solutions are the zero set
of the section which, assuming transversality, gives a submanifold representing the Poincaré
dual of the Euler class of V. Now if we have a family of partial differential equations,
depending on continuous parameters, we may hope to find similar invariants from the
homology class of the solution space. This can be developed abstractly in the framework
of differential topology in certain manifolds. The key points one needs to establish in
order to find invariants analogous to the finite-dimensional case are the following.
(1) The maps involved should be Fredholm maps, which in practice means that the
linearization of the equations about a solution should be represented by linear elliptic
differential equations, say over a compact manifold. The index of the linearized
equation gives the expected dimension of the solution space.
(2) One needs to establish the compactness of the space of solutions, or some weaker
analog of this.
(3) One needs to establish orientability, analogous to the finite-dimensional case;
otherwise one only gets invariants modulo 2. This can be set up in terms of the index
theory of families of operators. In the cases arising from gauge theory, the equations
are invariant under the action of the gauge group of bundle automorphisms, and one
studies spaces of solutions modulo this action.
(4) One must not encounter reducible solutions in generic one-parameter families of
Now one can show that the essential features of the Seiberg-Witten equations listed above
define differential-topological invariants of the underlying four-manifold. Indeed, the
theory is significantly simpler than for the Donaldson instanton equations (Donaldson
and Kronheimer [20]). To check the Fredholm property we can ignore the quadratic term
$\tau(\psi, \psi)$ since this does not affect the symbol (leading term) of the linearization.
At the level of the symbol, the linearization is given by the sum of the linearization of the
U(1) instanton equation, which modulo gauge is represented by the operator
$d^{*} \oplus d^{+}$ acting on ordinary forms, and the Dirac operator $D_{A}$. Regarding
compactness, unlike the instanton case, the Seiberg-Witten moduli spaces are compact, without
qualification. This follows from a priori estimates on the solutions. These can be obtained
from energy estimates using integration by parts as in the previous section, or, more
directly, by the maximum principle applied to second-order equations. The remaining issues are
reducibles and orientations. If a nontrivial gauge transformation $g \in \mathrm{Aut}(L)$
fixes a pair $(A, \psi)$, then $\psi$ must be zero and $g \in U(1)$ a constant scalar. Thus,
the only reducible Seiberg-Witten solutions are the self-dual U(1) connections, and these do
not occur in generic r-dimensional families of metrics on X, so long as $b^{+}(X) > r$. Thus
if $b^{+} > 1$, reducibles do not interfere with the definition of invariants. Considering
orientations, an orientation of the moduli space is furnished by an orientation of the
determinant line of the relevant index bundle over the space $C^{*}$ of all irreducible pairs
$(A, \psi)$ modulo gauge transformation.
The most straightforward application of the Seiberg-Witten invariants is to distinguish
differentiable four-manifolds within the same homeomorphism type. Myriads of
examples could be given, the simplest being to show that connected sums $X_{p,q}$, say,
of p copies of $\mathbb{CP}^2$ and q copies of $\overline{\mathbb{CP}}^2$, q > 1, for which the Seiberg-Witten
invariants vanish, are not diffeomorphic to Kähler surfaces (or any other manifolds with
nonzero Seiberg-Witten invariants). The Seiberg-Witten equations have led to astounding
advances in four-manifold theory (see, e.g., [33]). To some extent they may well
have brought the study of the gauge theory invariants to a fairly complete form, resolving
many of the main problems that drove research in this area in the last ten years.
Perhaps the most exciting challenge is to come to grips with the quantum field theory
ideas which led to these new advances (in parallel with other important developments
such as mirror symmetry, three-manifold invariants, and conformal field theory) and to
understand in a rigorous way the intricate structures discovered by Seiberg and Witten.
At the same time there are notable questions which are left open at present. One
is the question of whether all simply connected manifolds are of simple type. A more
wide-ranging problem is to understand the structure of the invariants of families of
four-manifolds, and the relation between the instanton and Seiberg-Witten theories,
for manifolds with $b^+ = 1$. By considering an r-dimensional family of equations of either
kind, one should get invariants which are, roughly speaking, cohomology classes
in $H^*(B\mathrm{Diff}(X))$, where $B\mathrm{Diff}(X)$ is the classifying space of the diffeomorphism group
of a four-manifold X. Then the same issues which complicate the story for ordinary
invariants when $b^+ = 1$ should arise, for any X, once $r \ge b^+ - 1$. In another direction
one may consider four-manifolds which are not smooth. The instanton theory can be
extended to the class of quasiconformal four-manifolds (where the coordinate change
maps are only quasiconformal, not necessarily smooth).
In order to see the relation of these results to quantum field theory, one must recall
the analysis of N = 2 supersymmetric Yang-Mills theory. To begin, we work on flat
$\mathbb{R}^4$. It has long been known that this theory has a family of quantum vacuum states
parametrized by a complex variable u, which corresponds to the four-dimensional class
in Donaldson theory. For $u \to \infty$, the gauge group is spontaneously broken down to the
maximal torus, the effective coupling is small, and everything can be computed using
asymptotic freedom. For small u, the effective coupling is strong. Classically, at u = 0,
the full SU(2) gauge symmetry is restored. But the classical approximation is not valid
near u = 0. Quantum mechanically, the u plane turns out to parametrize a family of
elliptic curves, in fact, the modular curve of the group $\Gamma(2)$. The family can be described
by the equation
$y^2 = (x^2 - \Lambda^4)(x - u)$, (8.7)
where $\Lambda$ is the analog of a parameter that often goes by the same name in the theory of
strong interactions. (The fact that $\Lambda \neq 0$ means that the quantum theory does not have
the conformal invariance of the classical theory.) The curve (8.7) is smooth for generic
u, but degenerates to a rational curve for $u = \Lambda^2$, $-\Lambda^2$, or $\infty$. Near each degeneration, the
theory becomes weakly coupled, and everything is calculable, if the right variables are
used. At $u = \infty$, the weak coupling is (by asymptotic freedom) in terms of the original
field variables. Near $u = \pm\Lambda^2$, a magnetic monopole becomes massless; the light degrees
of freedom are the monopole, dyon and a dual photon, or U(1) gauge boson. In terms of
the dyon and dual photon, the theory is weakly coupled and controllable near $u = \pm\Lambda^2$.
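The degeneration points of the curve can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the family has the form $y^2 = (x^2 - \Lambda^4)(x - u)$; the function names are illustrative and not taken from the text. The curve is singular exactly when two roots of the cubic in x collide, that is, when the discriminant vanishes.

```python
def cubic_roots(u, lam):
    """Roots in x of (x^2 - lam^4)(x - u), read off from the factored form."""
    return (lam ** 2, -(lam ** 2), u)


def discriminant(u, lam):
    """Product of squared root differences; it vanishes iff two roots coincide."""
    r1, r2, r3 = cubic_roots(u, lam)
    return ((r1 - r2) * (r1 - r3) * (r2 - r3)) ** 2


def is_singular(u, lam, tol=1e-12):
    """True when the curve at parameter u is degenerate (assumed tolerance tol)."""
    return abs(discriminant(u, lam)) < tol
```

For finite u the discriminant equals $4\Lambda^4(u^2 - \Lambda^4)^2$, so with $\Lambda \neq 0$ the only finite degenerations are at $u = \pm\Lambda^2$; the degeneration at infinity is not visible in this affine check.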
LeBrun [33] obtained some very important results concerning Einstein metrics on
a generalized hyperbolic 4-space $\mathcal{H}^4 = SO(4, 1)/SO(4)$ or complex-hyperbolic 2-space
$\mathbb{C}\mathcal{H}_2 = SU(2, 1)/U(2)$. He showed the following.
Theorem 8.1. Let $M^4$ be a smooth compact quotient of complex-hyperbolic 2-space
$\mathbb{C}\mathcal{H}_2 = SU(2, 1)/U(2)$, and let $g_0$ be its standard complex-hyperbolic metric. Then every
Einstein metric g on M is of the form $g = \lambda \varphi^* g_0$, where $\varphi : M \to M$ is a diffeomorphism
and $\lambda > 0$ is a constant.
This theorem is proved by estimating the scalar curvature of Riemannian metrics by
means of the Seiberg-Witten invariants of smooth four-manifolds.
Theorem 8.2. Infinitely many compact smooth simply connected four-manifolds with
$2\chi > 3|\tau|$ do not admit Einstein metrics.
In fact, it is possible to describe a sequence of smooth manifolds homeomorphic to
$k\mathbb{CP}^2 \# l\overline{\mathbb{CP}}^2$, where l : k is roughly 4 : 1, which do not admit Einstein metrics.
Regarding the Seiberg-Witten techniques, one needs first to recall the following facts.
If (X, J) is a compact complex surface (that is, a complex manifold of real dimension
four), then there is a process called blowing up which produces a new complex surface
by replacing some given point $x \in X$ with a complex projective line $\mathbb{CP}^1$. The resulting
surface is diffeomorphic to a connected sum $X \# \overline{\mathbb{CP}}^2$, where $\overline{\mathbb{CP}}^2$ is the complex
projective plane with the nonstandard orientation. This process can then be iterated,
and in particular one may blow up any given collection of k distinct points of X so as
to produce new complex surfaces diffeomorphic to $X \# k\overline{\mathbb{CP}}^2$ for any positive integer
k. Conversely, any compact complex surface (M, J) can be expressed as $X \# k\overline{\mathbb{CP}}^2$,
$k \ge 0$, an iterated blowup of some complex surface X which is not itself the blowup of
anything else. One says that X is a minimal model for M. A compact complex surface
(M, J) is said to be of general type if its minimal model X satisfies
$(2\chi + 3\tau)(X) > 0$ (8.8)
and X is neither $\mathbb{CP}^2$ nor a $\mathbb{CP}^1$-bundle over a complex curve. For example, the degree-m
hypersurface
$\{[u : v : w : z] \in \mathbb{CP}^3 \mid u^m + v^m + w^m + z^m = 0\}$ (8.9)
in complex projective three-space is of general type if m > 4; these examples are all
simply connected and are their own minimal models. Now, starting from these facts,
we have the following result.
Theorem 8.3. Let (M, J) be a compact complex surface of general type, and let X be
its minimal model. Then any Riemannian metric g on M satisfies
$\int_M s_g^2 \, d\mu_g \ge 32\pi^2 (2\chi + 3\tau)(X)$ (8.10)
with equality if and only if M = X and g is Kähler-Einstein with respect to some complex
structure on M.
Proof. The complex structure J is a priori completely unrelated to the metric g
under discussion, but its deformation class is enough to allow one to define twisted
spinor bundles $V_\pm$, where L is a Hermitian line bundle with $c_1(L) = c_1(M, J)$.
Now assume for simplicity that $b^+(M) > 1$. For any g, it then turns out that the
Seiberg-Witten equations
$D_\theta \Phi = 0, \quad F_\theta^+ = i\sigma(\Phi)$ (8.11)
must be satisfied by some smooth connection $\theta$ on L and some smooth section $\Phi$ of $V_+$.
Here, $D_\theta$ is the Dirac operator coupled with $\theta$, the purely imaginary 2-form $F_\theta^+$ is the
self-dual part of the curvature of $\theta$, and the real-quadratic map $\sigma : V_+ \to \Lambda^+$ induced
by the isomorphism $\Lambda^+ \otimes \mathbb{C} = \odot^2 V_+$ satisfies $|\sigma(\Phi)|^2 = |\Phi|^4/8$. This can be made more
explicit by choosing some Hermitian local trivialization of L, so that the connection $\theta$
is represented by a purely imaginary 1-form $\vartheta$; in Penrose's spinorial abstract-index
notation, the Seiberg-Witten equations then take an explicit component form
[displayed equations and index convention not legible in the source].
The number of solutions, modulo gauge equivalence and counted with appropriate
multiplicities, can be shown to be independent of
g; and because the equations can be solved explicitly when the metric happens to be
Kähler, it is not difficult to show that this invariant is 1. It follows that there must be
at least one solution for every metric g on M.
One sees thus that Seiberg-Witten theory gives us differential-topological invariants
which allow one to estimate the scalar curvature of a metric in relation to its volume.
The entropy method instead allows one to deduce Ricci-curvature estimates from
homotopy-theoretic assumptions.
9. The structure of fibre bundles and the topological significance of physical theories.
We now return to the concept of fibre bundles or fibre spaces. That notion, being
global in character, arose in topology. At first it was an attempt to find new examples
of manifolds. Fiber spaces are locally, but not globally, product spaces. The presence of
such a distinction is a sophisticated mathematical fact. The development of fibre spaces
had to wait until invariants were found to distinguish the fiberings or even to show that
globally there are nontrivial ones. The first such invariants were the characteristic classes
introduced by H. Whitney and by E. Stiefel in 1935. Topology, however, forgets the
algebraic structure, and in applications vector bundles, with the linear structure intact,
are more useful.
A vector bundle $\pi : E \to M$ over a manifold M is, roughly speaking, a family of vector
spaces parametrized by M such that it is locally a product. The vector space $E_x = \pi^{-1}(x)$
corresponding to $x \in M$ is called the fiber at x. Examples are the tangent bundle
$TM$ and all tensor bundles associated to it. A more trivial bundle is the product bundle
$M \times V$, where V is a fixed vector space and (x, V), $x \in M$, is the fiber at x. A vector
bundle is called real or complex according to whether the fiber is a real or complex
vector space. Its dimension is the dimension of the fibers. It is important that the linear
structure on the fibers has a meaning so that the general linear group GL(n, R) plays a
fundamental role in matching the fibers; it is called the structure group. A real (resp.,
complex) vector bundle is called Riemannian (resp., Hermitian) if the fibers are provided
with inner products. In this case the structure group is reduced to O(n) (resp., U(n)),
with n being the dimension of the fibers; the bundle is then called an O(n)-bundle
(resp., U(n)-bundle). Similarly, we have the notion of an SU(n)-bundle. A section of the
bundle E is an attachment, in a continuous and smooth manner, to every point $x \in M$,
of a point of the fiber $E_x$. In other words, it is a continuous mapping $\sigma : M \to E$ such
that the composition $\pi \circ \sigma$ is the identity. This notion is a natural generalization of a
vector-valued function and of a tangent vector field. In order to differentiate $\sigma$, we need
a so-called connection in E. The latter allows the definition of the covariant derivative
$D_X \sigma$ (X being a vector field in M), which is a new section of E. Covariant differentiation
is generally not commutative; that is, $D_X D_Y \neq D_Y D_X$ for two vector fields X, Y in M.
The measure of the noncommutativity gives the curvature of the connection; this is an
analytic version of the geometric concept of nonholonomy introduced by Elie Cartan.
According to him, it is important to regard the curvature as a matrix-valued exterior
quadratic differential form. Its trace is a closed 2-form. More generally, the sum of all its
principal minors of order k is a closed 2k-form. It is called a characteristic form. By the
de Rham theory the characteristic form of degree 2k determines a cohomology class
of dimension 2k, to be called a characteristic class. Whereas the characteristic forms
depend on the connection, the characteristic class depends only on the bundle. They are
the simplest invariants of the bundle. It must be an act of nature that the nontriviality of
a vector bundle is recognized through the need for a covariant differentiation and that
its noncommutativity accounts for the first global invariants. This introduction of the
characteristic classes emphasizes their local character, and the characteristic forms
contain more information than the classes. When M is a compact oriented manifold, a
characteristic class of the top dimension (i.e., of dimension equal to that of M) gives
by integration a characteristic number. When it is an integer, it is called a topological
quantum number.
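As an illustration of how integrating a characteristic form yields an integer, the following sketch computes the first Chern number of a U(1)-bundle over the two-sphere. It is a minimal example under an assumption that is not in the text: the curvature is taken to be the rotationally symmetric form $F = (n/2)\sin\theta \, d\theta \wedge d\varphi$ (the field of a monopole of charge n), and the function name is hypothetical.

```python
import math


def first_chern_number(n, steps=2000):
    """Integrate F / (2*pi) over S^2 for the assumed F = (n/2) sin(theta) dtheta ^ dphi."""
    dtheta = math.pi / steps
    # Midpoint rule for the integral of sin(theta) over [0, pi] (exact value 2).
    theta_integral = sum(math.sin((i + 0.5) * dtheta) for i in range(steps)) * dtheta
    # The integrand does not depend on phi, so the phi integral contributes 2*pi.
    total_flux = (n / 2.0) * theta_integral * (2.0 * math.pi)
    return total_flux / (2.0 * math.pi)
```

The value returned is n up to discretization error: the characteristic number depends only on the bundle, not on the particular connection chosen to compute it.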
These differential-geometric notions have been found to be the likely mathematical
basis of a unified field theory. Weyl's gauge theory deals with a circle bundle or a U(1)-
bundle, that is, a complex Hermitian bundle of dimension one. In studying the isotopic
spin, Yang and Mills used what is essentially a connection in an SU(2)-bundle. It is
the first instance of a non-Abelian gauge theory. From the connection the action can
be defined. A connection in an SU(2)-bundle at which the action takes the minimum
is called an instanton. (On this new theory, see [20, 24].) Its curvature has a simple
expression and is called self-dual. An instanton is thus a self-dual solution of the Yang-
Mills equation. When the space $\mathbb{R}^4$ is compactified into the four-dimensional sphere
$S^4$, the SU(2)-bundles are determined up to an isomorphism by a topological quantum
number k, which is an integer. It has been proved that over $S^4$ the moduli (or parameter)
space for the set of connections with self-dual curvature on the SU(2)-bundle with given
k > 0 is a smooth manifold of dimension $8k - 3$ (Atiyah et al. [5]). In physical terms this
is the dimension of the space of instantons with topological quantum number k > 0.
Instantons can claim a relation to Einstein through the following result. The group
SO(4) is locally isomorphic to $SU(2) \times SU(2)$, so that a Riemannian metric on a four-
dimensional manifold M gives rise through projection to connections in the SU(2)-
bundles. M is an Einstein manifold if and only if these connections are self-dual or
anti-self-dual.
The notion of fibre bundle generalizes that of a Cartesian product on a manifold. Two
examples from physics and geometry will clarify the need for such a generalization (for
a more detailed presentation, see [21, 39]).
(i) In Aristotelian physics both space and time are absolute, every event being defined
by an instant of time and a location in space. This is equivalent to saying that spacetime
E is a Cartesian product $T \times S$, where T is the time axis and S is the three-dimensional
space.
(ii) In Galilean physics time remains absolute, but space is relative. This can be
described by saying that there is a projection $\pi : E \to T$, that is, a surjective (onto) map
that associates to any event $p \in E$ the corresponding instant of time $t = \pi(p) \in T$.
The set (line) T is called the base space and the set $\pi^{-1}(t)$ of all events simultaneous
with p is called the fiber over t. Each fiber is isomorphic to the Euclidean three-
dimensional space $\mathbb{R}^3$, which is therefore called the typical fiber. The total space E of
this bundle may be trivialized, that is, represented as the Cartesian product $T \times \mathbb{R}^3$.
Any such trivialization (map) $h : E \to T \times \mathbb{R}^3$ is of the form $h(p) = (\pi(p), r(p))$, where
r(p) = (x(p), y(p), z(p)) are the space coordinates of the event p relative to an inertial
observer. One can say that Galilean spacetime E is the total space of a fibre bundle
which is trivial, that is, isomorphic to the product bundle $T \times \mathbb{R}^3$, without a natural
isomorphism between these bundles.
Table 9.1. Analogies between electromagnetism and gravitation (columns: Electromagnetism,
Gravitation; the entries are not legible in the source).
(iii) Consider now the two-dimensional sphere $S^2$ with a preferred orientation. Define
a dyad as a pair of unit orthogonal vectors tangent to $S^2$ at a point. Let P be the set of
all dyads whose orientation agrees with that of $S^2$. One can make P into the total space
of a bundle in such a way that $\pi : P \to S^2$ is the map sending a dyad into the point at
which its vectors are attached to $S^2$. If $e = (e_1, e_2)$ is a dyad at $x \in S^2$, then so is the
pair $(e_1', e_2')$, where
$e_1' = e_1 \cos\varphi + e_2 \sin\varphi, \quad e_2' = -e_1 \sin\varphi + e_2 \cos\varphi$, (9.1)
and all dyads at x may be obtained in this manner from $(e_1, e_2)$. Therefore, SO(2) is the
typical fiber of the bundle $\pi : P \to S^2$. Equation (9.1) defines an action of the (structure)
group SO(2) on P. The bundle $\pi : P \to S^2$ is a simple example of a principal bundle.
Moreover, this bundle is nontrivial in the following sense: there is no diffeomorphism
$k : S^2 \times SO(2) \to P$ such that $\pi \circ k(x, a) = x$. Indeed, if such a k existed, then $s : S^2 \to P$
defined by $s(x) = k(x, a_0)$, would determine a smooth field of unit vectors on $S^2$. By
the "no combing of $S^2$" theorem of Brouwer, such a field does not exist. In general,
if $\pi : E \to M$ is a bundle and N is an open subset of M, then a smooth map $\sigma : N \to E$,
such that $\pi \circ \sigma = \mathrm{id}_N$, is called a (local) section of $\pi$. If N = M, then $\sigma$ is a global section.
For a principal bundle, the existence of a global section is equivalent to its triviality.
Incidentally, the bundle of dyads occurs in the description of a magnetic pole of unit
strength. The nontrivial nature of the bundle $\pi : P \to S^2$ shows up in the occurrence of
a string singularity in the expression for the vector potential of the magnetic pole.
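The SO(2) action on dyads in equation (9.1) can be made concrete with a short sketch. It assumes the rotation has the standard form $e_1' = e_1\cos\varphi + e_2\sin\varphi$, $e_2' = -e_1\sin\varphi + e_2\cos\varphi$, represents tangent vectors as 3-tuples in the ambient space, and uses helper names that are purely illustrative.

```python
import math


def dot(u, v):
    """Euclidean inner product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    """Cross product, used to check that the orientation of a dyad is preserved."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def rotate_dyad(e1, e2, phi):
    """Apply the SO(2) element with angle phi to the dyad (e1, e2)."""
    c, s = math.cos(phi), math.sin(phi)
    f1 = tuple(c * a + s * b for a, b in zip(e1, e2))
    f2 = tuple(-s * a + c * b for a, b in zip(e1, e2))
    return f1, f2
```

Rotating by one angle and then by another gives the same dyad as rotating once by the sum of the angles, which is the statement that (9.1) defines a group action; the rotated pair stays orthonormal and keeps its orientation, so it remains in the same fiber of P.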
The last remark leads to what is probably the most important domain of applications
of fibre bundles in theoretical physics: infinitesimal connections on principal bundles
provide good geometrical models of classical gauge fields. This has been known among
mathematicians and physicists for some time but, for the sake of completeness, we
recall some of the arguments in favor of this view. In a notation that is standard in
physics, one can consider the analogies between electromagnetism and gravitation (see
Table 9.1).
The issue raised in the discussion on the significance of the electromagnetic potentials
becomes clear when electromagnetism is interpreted as an (infinitesimal) connection
in the space of phases. Namely, the experiments proposed by Aharonov and Bohm
[1] have a very simple analog in elementary differential geometry: the surface of a cone
is locally flat, but a vector undergoing parallel transport along a loop enclosing the
vertex does not return to its original position. Similarly, the phase of a wave function
of a charged particle undergoes parallel transport determined by the potential. The
region with the magnetic field is analogous to the vertex of the cone. Electromagnetic
potentials should not be slighted, but considered for what they are: the coefficients of
a connection.
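The cone analogy can be quantified with a small sketch. It assumes a cone parametrized by its half-opening angle $\alpha$ (between the axis and a generator), for which the unrolled surface is a planar sector of angle $2\pi\sin\alpha$, and it uses the standard expression $q\Phi/\hbar$ for the Aharonov-Bohm phase; neither formula nor the function names appear in the text, so they should be read as illustrative assumptions.

```python
import math


def cone_holonomy(alpha):
    """Rotation angle of a vector parallel-transported once around the vertex.

    The cone unrolls to a flat sector of angle 2*pi*sin(alpha); the holonomy
    is the missing (deficit) angle 2*pi*(1 - sin(alpha)).
    """
    return 2.0 * math.pi * (1.0 - math.sin(alpha))


def aharonov_bohm_phase(charge, flux, hbar):
    """Phase acquired by a charge transported around a tube of enclosed flux."""
    return charge * flux / hbar
```

In both cases the loop runs entirely through a region where the local curvature (or field strength) vanishes, yet the transported object returns rotated by an amount fixed by what the loop encloses.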
A heuristic approach to the notion of a connection on a principal bundle shows
how this concept is related to the physicist's view of gauge potentials (see [52]). Let
$\pi : P \to M$ be a principal bundle with structure group G. The result of the action of $a \in G$
on $p \in P$ is another point $pa \in P$, lying in the same fiber as p, $\pi(pa) = \pi(p)$. A local
section $s : N \to P$ defines a diffeomorphism $k : N \times G \to \pi^{-1}(N)$ by $k(x, a) = s(x)a = p$.
With the section s fixed for the moment, we may identify s(x) with $(x, \epsilon)$ and s(x)a
with $(x, a) = (x, \epsilon)a$, where $\epsilon$ is the unit element of G. An infinitesimal connection on P
defines parallel displacement of elements of P. If $dx = (dx^\mu)$ is a small displacement at
$x = \pi(p) \in N$, then the parallel transport of $(x, \epsilon)$ along dx results in $(x + dx, \epsilon - A)$,
where $A = A_\mu dx^\mu$ is a 1-form on N, with values in the Lie algebra of G. Parallel
transport should commute with the action of G, so that (x, a) displaced along dx
becomes $(x + dx, a - Aa)$. If $s' : N' \to P$ is another section, then there is a map
$U : N \cap N' \to G$ such that
$s'(x) = s(x)U(x)$ (9.2)
for $x \in N \cap N'$. The section $s'$ leads to the diffeomorphism $k' : N' \times G \to \pi^{-1}(N')$,
$k'(x, a) = s'(x)a = s(x)U(x)a$, and
$k'(x, a) = k(x, Ua), \quad k'(x + dx, a) = k(x + dx, (U + dU)a)$. (9.3)
Relative to $k'$, parallel transport is described by a 1-form $A' = A'_\mu dx^\mu$. By parallel
transport, the point $k'(x, \epsilon)$ becomes $k'(x + dx, \epsilon - A')$, which is the same as
$k(x + dx, (U + dU)(\epsilon - A'))$. On the other hand, $k'(x, \epsilon) = k(x, U)$ is parallel to
$k(x + dx, U - AU)$. Since parallel displacement in P should not depend on the choice of
section (gauge), $(U + dU)(\epsilon - A') = U - AU$. To first order in small quantities, this leads
to the transformation law
$A' = U^{-1}(dU + AU)$ (9.4)
of the potential under gauge transformations of the second kind. It follows from (9.4)
that the 1-form with values in the Lie algebra of G,
$\omega = a^{-1}(da + Aa)$, (9.5)
is independent of the section. The form $\omega$ has a simple geometric interpretation:
$\epsilon + \omega$ is the element of G that moves the point (x, a) into the point $(x, a)(\epsilon + \omega) =
(x, a + da + Aa)$ parallel to $(x + dx, a + da)$. The section-independent 1-form $\omega$ on P is
called the connection form; it is the gauge-independent counterpart of the potential A.
Relation (9.4) contains, as special cases, the transformation laws of the coefficients of
a linear connection (Christoffel symbols, Ricci rotation coefficients), of the electromagnetic
potentials, and of non-Abelian gauge potentials of Yang-Mills type. The advantage
of the connection form, defined on P, over the potential A, defined on $N \subset M$, results
from the following considerations: the connection form is defined independently of
any section, whereas A refers to a (local) section of the bundle. As a consequence, for a
nontrivial bundle, the potentials are defined only locally, whereas the connection form
is defined globally, all over P.
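The algebra behind the transformation law can be checked numerically. The sketch below assumes the law has the form $A' = U^{-1}(dU + AU)$ and the connection form is $\omega = a^{-1}(da + Aa)$, treats A, dU, and da as plain 2 x 2 matrices standing for the values of the forms on one fixed displacement, and uses hypothetical helper names. It lets one verify two facts: two successive gauge changes U and V agree with the single change UV, and $\omega$ computed in either gauge is the same when the fiber coordinates are related by $a = Ua'$.

```python
def mul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(X, Y):
    """Entrywise sum of two 2x2 matrices."""
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]


def inv(X):
    """Inverse of an invertible 2x2 matrix."""
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det],
            [-X[1][0] / det, X[0][0] / det]]


def max_diff(X, Y):
    """Largest absolute entrywise difference, used to compare matrices."""
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))


def gauge_transform(A, U, dU):
    """Assumed transformation law: A' = U^{-1} (dU + A U)."""
    return mul(inv(U), add(dU, mul(A, U)))


def connection_form(A, a, da):
    """Assumed connection form: omega = a^{-1} (da + A a)."""
    return mul(inv(a), add(da, mul(A, a)))
```

For the composite change one uses W = UV with dW = dU V + U dV (the Leibniz rule), and for the fiber coordinate a = U a' with da = dU a' + U da'. Both identities hold exactly for these formulas, which is the matrix form of the statement that $\omega$ does not depend on the section.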
An interesting application of the bundle approach to gauge fields is the construction
of Riemannian geometries of Kaluza-Klein type. If there is a connection form $\omega$ on P,
$g = g_{\mu\nu} dx^\mu dx^\nu$ is a metric tensor on M, and h is a bi-invariant metric on G, then one
can define a metric tensor $\gamma$ on P by the formula
$\gamma(u, v) = g(T\pi(u), T\pi(v)) + h(\omega(u), \omega(v))$, (9.6)
where u and v are vectors tangent to P, and $T\pi : TP \to TM$ is the projection of such
vectors on M, induced by $\pi$. The metric $\gamma$ is invariant under the action of G on P.
For G = SO(2), it coincides with the metric considered in five-dimensional, unified
theories of gravitation and electromagnetism.
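For the Abelian case the construction can be written out in components. The sketch below is a minimal illustration under stated assumptions: the structure group is one-dimensional with fiber coordinate $\theta$, the connection form is $\omega = d\theta + A_\mu dx^\mu$, the bi-invariant metric h is normalized to 1, and the metric on P has the form $\gamma(u, v) = g(T\pi(u), T\pi(v)) + h(\omega(u), \omega(v))$. The function names are hypothetical.

```python
def kaluza_klein_metric(g, A):
    """Components of gamma on P from a base metric g (n x n) and potential A (length n).

    With omega = dtheta + A_mu dx^mu and h = 1:
    gamma_{mu nu} = g_{mu nu} + A_mu A_nu, gamma_{mu theta} = A_mu, gamma_{theta theta} = 1.
    """
    n = len(g)
    gamma = [[g[i][j] + A[i] * A[j] for j in range(n)] + [A[i]] for i in range(n)]
    gamma.append(list(A) + [1.0])
    return gamma


def inner(gamma, u, v):
    """Evaluate gamma(u, v) for tangent vectors u, v given by their components."""
    return sum(gamma[i][j] * u[i] * v[j]
               for i in range(len(u)) for j in range(len(v)))


def horizontal_lift(v, A):
    """Lift a base vector v to P so that omega vanishes on the lifted vector."""
    return list(v) + [-sum(a * x for a, x in zip(A, v))]
```

With these definitions a horizontal lift has the same length as the base vector it projects to, the vertical direction has unit length, and the two are orthogonal, which is the component form of the splitting expressed by the formula for $\gamma$.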
Relativistic theories of gravitation, such as Einstein's theory of general relativity,
may also be considered as gauge theories. The bundle P consists in this case of
orthonormal linear frames (tetrads, vierbeins) of the spacetime manifold M and G is the
Lorentz group. Alternatively, one can take P to be the bundle of orthonormal affine
frames, in which case G is the inhomogeneous Lorentz group. There are, however,
important differences between Einstein's theory and gauge theories such as
electrodynamics or the Yang-Mills theory. First of all, the bundle of frames is soldered to
the base M, whereas in other gauge theories the bundle is rather loosely connected
to M. The soldering results in the appearance, in theories of gravitation, of torsion, in
addition to curvature, which occurs in gauge theory. (Torsion is zero in Riemannian
geometry, but being zero is different from not existing at all.) Moreover, the form of
Einstein's equations of gravitation is different from the generic form of the field
equations assumed in gauge theories. The latter are derived from Lagrangians quadratic in
curvature, whereas the former are based on a linear Lagrangian. The possibility of
constructing such a linear Lagrangian is also related to the existence of the soldering form
on P.
In the past, there was much research and discussion on whether and in what sense
gravitation is a gauge theory. Recently, this problem has been considered in connection
with the program of constructing a supersymmetric theory of gravitation. In classical
relativity, the following questions have been raised and given diverse answers by
different authors.
(1) What is the gauge group of gravitation?
(2) What are the corresponding gauge potentials; what is the status of the metric?
(3) Can the form of the field equations be derived from arguments of gauge invariance?
Utiyama [55] was the first to say that gravitation may be looked upon as a gauge theory;
he identified its potentials with the coefficients of the Riemannian connection on
spacetime. Using gauge arguments, Sciama argued in favor of an asymmetric connection
as the basis of gravitation and showed that spin may be the source of torsion.
Independently, on the ground of heuristic considerations invoking a gauge group with
translations (in addition to Lorentz transformations), Kibble derived the full set of field
equations of gravitation with spin and torsion; the Sciama-Kibble theory was later
recognized as being essentially equivalent to Cartan's theory of 1923 (see [13]). Chen Ning
Yang pointed out that Einstein's theory is different from other gauge theories in being
based on a Lagrangian that is linear, rather than quadratic, in curvature. He proposed
considering a theory of gravitation based on Riemannian geometry and a Lagrangian
of the form $\Omega^{\mu}{}_{\nu} \wedge {*}\Omega^{\nu}{}_{\mu}$ (where ${*}\Omega^{\nu}{}_{\mu}$ is the dual of the curvature form
$\Omega^{\mu}{}_{\nu} = d\omega^{\mu}{}_{\nu} + \omega^{\mu}{}_{\rho} \wedge \omega^{\rho}{}_{\nu}$, and the Lagrangian density is conformally
invariant). The source-free equations of this theory, $D{*}\Omega^{\mu}{}_{\nu} = 0$, appear to be too
weak; for example, they admit as a solution the de Sitter universe
with an arbitrary radius of curvature. There is a modification of Yang's theory
based on a metric connection with torsion and two sets of field equations, as in the
Einstein-Cartan theory. It is clear, from the diversity of results and views, that there is
no unique gauge theory of gravitation. This is due to the fact that gravitation is a rich
theory from the geometrical point of view: it contains several invariants which may be
used to build the kinetic part of the gravitational Lagrangian. The correspondence
principle of relativistic gravity to the Newtonian theory suggests (but probably does not
require) a Lagrangian linear in curvature, whereas the analogy with electrodynamics
leads to the idea of a quadratic Lagrangian.
According to Regge [40], there is no difficulty in writing the modern (gauge) form of
electromagnetism (with the compact group SO(2) or U(1)) on a Riemannian manifold
and it is possible to write, à la Cartan, general relativity as an SO(3, 1) gauge theory.
Besides, it may be useful to recall that Cartan was largely responsible for the introduction
of the concept of torsion in physics. Torsion remains a very interesting idea. We
need to use it, even by just declaring it to vanish, if we want to write general relativity
as a gauge theory in which all fields, and not only the spin connection, appear as gauge
potentials. The interesting feature of general relativity is that the associated curvature of
the vierbein, that is, torsion, vanishes as a consequence of the variational principle of
Hilbert, Einstein, and Cartan. And in fact the Lagrangian density is not invariant under
all gauge transformations of the Poincaré group but only under those of the Lorentz
subgroup. Although nature has prepared the gauge potentials for the full group, it
ends up by requiring invariance under a subgroup only. A world with torsion would
appear inescapable if we had around us a sufficient density of high-spin particles acting
as sources, but this density seems at the moment well below the limit of observability.
Regarding the kind of space in which torsion is supposed to appear, one can remark
that it would no longer be a Riemannian manifold or, rather, none of the Riemannian
structures existing on the manifold would be directly related to physics and the theory
would not be a geometrical theory in the sense envisaged by Einstein. One could yet
consider general relativity as a GL(4, R) theory with the Christoffel connection playing
the role of a Yang-Mills potential. If the torsion vanishes, it follows that the Christoffel
symbol is symmetric in the two lower indices, whose roles are however quite different.
The first index is a GL(4, R) gauge index; the second labels instead the differentials on
spacetime. We may relate them because of the accidental and marvelous fact that the
Jacobian group of derivatives on a differentiable manifold is isomorphic to GL(4, R)
and that we use the same indexing for differentials and vectors in GL(4, R). Once the
symmetry is established, the theory becomes almost by definition geometrical. If there
is no symmetry but we can control torsion by introducing suitable norms and bounds,
then we may still speak of an almost-geometrical theory whose exact mathematical
definition is still lacking. (About the work of Christoffel, see [22].)
A gauge theory is any physical theory of a dynamical variable which, at the classical
level, may be identified with a connection on a principal bundle. The structure group
G of the bundle P is the group of gauge transformations of the first kind; the group
of gauge transformations of the second kind may be identified with a subgroup $\mathcal{G}$ of the
group Aut P of all automorphisms of P. In this sense, gravitation is a gauge theory: the
basic gauge field is a linear connection $\omega$. In addition to $\omega$, there is a metric tensor g
which plays the role of a Higgs field. The most important difference between gravitation
and other gauge theories is due to the soldering of the bundle of frames LM to the
base manifold M. The bundle LM is constructed in a natural and unique way from
M, whereas a noncontractible M may be the base of inequivalent bundles with the
same structure group. For example, $LS^2$ reduced to SO(2) is isomorphic to SO(3),
but there is a denumerable set of inequivalent SO(2) bundles over $S^2$, corresponding
to the different elements of $\pi_1(SO(2)) = \mathbb{Z}$. The soldering form $\theta$ leads to torsion
which has no analog in nongravitational theories. Moreover, it affects the group $\mathcal{G}$,
which now consists of the automorphisms of LM preserving $\theta$. This group contains no
vertical automorphism other than the identity; it is isomorphic to the group Diff M of all
diffeomorphisms of M. In a gauge theory of Yang-Mills type over Minkowski spacetime,
the group $\mathcal{G}$ is isomorphic to the semidirect product of the Poincaré group by the group
$\mathcal{G}_0$ of vertical automorphisms of P. In other words, in the theory of gravitation, the
group $\mathcal{G}_0$ of pure gauge transformations reduces to the identity; all elements of $\mathcal{G}$
correspond to diffeomorphisms of M. What is the structure group G of the gravitational
principal bundle? Since spacetime M is four-dimensional, if P = LM, then G = GL(4, R).
But one can equally well take for P the bundle AM of affine frames; in this case, G is the
affine group. There is a simple correspondence between affine and linear connections,
which makes it really immaterial whether one works with LM or AM. If one assumes,
as one usually does, that $\omega$ and g are compatible, then the structure group of LM
or AM can be restricted to the Lorentz or the Poincaré group, respectively. It is also
possible to take, as the underlying bundle for a theory of gravitation, another bundle
attached in a natural manner to spacetime, such as the bundle of projective frames or
the first extension of LM. The corresponding structure groups are natural extensions
of GL(4, R), O(1, 3), or the Poincaré group.
Table 10.1
Gauge field terminology              Bundle terminology
Gauge (or global gauge)              Principal fibre bundle
Gauge type                           Principal fibre bundle
Gauge potential $b_\mu^k$            Connection on a principal fibre bundle
$S_{ba}$                             Transition function
Phase factor $\Phi_{QP}$             Parallel displacement
Field strength $f_{\mu\nu}^k$        Curvature
Source $J_\mu^K$                     ?
Electromagnetism                     Connection in a $U_1(1)$ bundle
Isotopic spin gauge field            Connection in an $SU_2$ bundle
Dirac's monopole quantization        Classification of $U_1(1)$ bundles according to first Chern class
Electromagnetism without monopole    Connection on a trivial $U_1(1)$ bundle
Electromagnetism with monopole       Connection on a nontrivial $U_1(1)$ bundle
The importance of gauge theories in modern theoretical physics is well known. Yang
and Mills's new gauge theory was meant especially to serve as a model for the study of
strong interactions, including the quantum effects on them. The main feature of this gauge
theory is the use of a non-Abelian Lie group, the simplest of the noncommutative con-
tinuous groups, as its invariance group. This mathematical property of the symmetry
group gives a very rich structure to the theory, whose field equations are more general
than Maxwell's. This already illustrates the fundamental role of both geometrical and
internal symmetries in physical problems which can be handled by gauge theories. In
Weyl's theory, in addition to the position variables of spacetime, there is already an
internal space parameter on which the phase group acts. The field identified with the
particle's wave function can therefore be seen as associating to each point of spacetime
a point of the internal space, or an angle (of rotation) in the case of electromagnetism.
A gauge requires that the coordinates of spacetime be combined with the parameters
of the internal space. Weyl's theory satisfies the principle of local invariance: that is,
the field equations are invariant under a gauge shift that may vary from point to point.
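As a minimal illustration of the local invariance just described, the Abelian (electromagnetic) case can be written out explicitly. The symbols $\psi$ (wave function), $A_\mu$ (potential), $\alpha(x)$ (local phase), $q$ (charge), and $D_\mu$ (covariant derivative) are notation assumed here, not fixed by the text, and the signs follow one common convention among several:

```latex
\begin{align*}
\psi(x) &\longmapsto e^{i q \alpha(x)}\,\psi(x),
\qquad
A_\mu(x) \longmapsto A_\mu(x) + \partial_\mu \alpha(x), \\
D_\mu \psi &:= (\partial_\mu - i q A_\mu)\,\psi
\quad\Longrightarrow\quad
D_\mu \psi \longmapsto e^{i q \alpha(x)}\, D_\mu \psi, \\
F_{\mu\nu} &:= \partial_\mu A_\nu - \partial_\nu A_\mu
\quad\Longrightarrow\quad
F_{\mu\nu} \longmapsto F_{\mu\nu}.
\end{align*}
```

The term $i q (\partial_\mu \alpha)\psi$ produced by differentiating the phase is cancelled by the shift of $A_\mu$, so any equation built from $D_\mu\psi$ and $F_{\mu\nu}$ keeps its form under the point-dependent phase rotation.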
10. Some open mathematical problems in gauge theory. In the last thirty years,
elementary particle physics turned to modern mathematics. To emphasize the de-
velopments of the past decades, we reproduce the Wu and Yang dictionary [66] (see
Table 10.1).
So, theoretical physics is more and more concerned with the following topics: Riemann
surfaces and their moduli spaces, the topology of compact Lie groups, Calabi-
Yau spaces (Ricci-flat Kähler manifolds), the representation theory of affine algebras, knot
theory, and so forth. If one looks carefully at some of the basic problems in theoreti-
cal physics, which heavily involve mathematics, one is reinforced in the idea that the
quantization of gauge theories and string theory require analysis and geometry of
special infinite-dimensional manifolds. Many problems can be formulated as the miss-
ing infinite-dimensional analogues of finite-dimensional results.
Some examples of infinite-dimensional geometries. (i) For gauge theories,
the geometric object is $\mathcal{A}/\mathcal{G}$. Here, $\mathcal{A}$ is the set of connections of a principal
G-bundle P over a compact Riemannian three-manifold M, and $\mathcal{G}$ is the group of gauge
transformations, that is, the automorphisms of the G-bundle; it acts on $\mathcal{A}$. G is a compact
Lie group, and $\mathcal{A}/\mathcal{G}$ is the orbit space. Since the tangent space $T(\mathcal{A}, A)$ of $\mathcal{A}$ at A is
the space of equivariant 1-forms on P with values in the Lie algebra of G, there is a natural
inner product on $T(\mathcal{A}, A)$ invariant under $\mathcal{G}$. Therefore, $\mathcal{A}/\mathcal{G}$ has a Riemannian structure.
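The invariance of this inner product can be sketched in a few lines. The formula below assumes, for illustration only, that G is a compact matrix group such as SU(N), so that $-\operatorname{tr}$ is a positive definite, conjugation-invariant form on its Lie algebra, and it writes tangent vectors $a, b$ as Lie-algebra-valued 1-forms with components $a_i, b_i$ on M (indices raised with the Riemannian metric of M). These choices are not spelled out in the text:

```latex
\begin{align*}
\langle a, b\rangle_A
&= -\int_M \operatorname{tr}\bigl(a_i(x)\, b^i(x)\bigr)\, d\mathrm{vol}_M ,
\qquad
(g\cdot a)_i = g^{-1} a_i\, g \quad (g \in \mathcal{G}), \\
\langle g\cdot a,\; g\cdot b\rangle_{g\cdot A}
&= -\int_M \operatorname{tr}\bigl(g^{-1} a_i\, g\; g^{-1} b^i\, g\bigr)\, d\mathrm{vol}_M
= \langle a, b\rangle_A .
\end{align*}
```

The last equality is just cyclicity of the trace applied pointwise on M; the inhomogeneous term $g^{-1}dg$ in the transformation of A itself drops out of the transformation of tangent vectors.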
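(ii) For the so-called $\sigma$-model, the natural geometric object is L(M), the set of free
loops on M, that is, the smooth maps of $S^1$ into M, M a Riemannian manifold, usually
compact. However, M might be $\mathbb{R}^n$ or Minkowski space. The tangent space of
L(M) at $\gamma$, $T(L(M), \gamma)$, is the set of smooth vector fields along $\gamma$ (sections of $\gamma^*(TM)$).
This tangent space has an inner product
$$\langle V_1, V_2\rangle = \int_{S^1} \bigl\langle V_1(t), V_2(t)\bigr\rangle\, dt \qquad (10.1)$$
for $V_1, V_2 \in T(L(M), \gamma)$. Note that the inner product is not invariant under the action of
$\mathrm{Diff}\,S^1$, the diffeomorphisms of $S^1$, on L(M). Here, $\mathrm{Diff}\,S^1 \times L(M) \to L(M)$ with
$(\varphi, \gamma) \mapsto \varphi\cdot\gamma$, where $(\varphi\cdot\gamma)(t) = \gamma(\varphi^{-1}(t))$.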
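(iii) In quantum mechanics, one studies the Schrödinger operator $-\Delta/2 + V$ on $L^2(\mathbb{R}^n)$,
where $\Delta$ is the Laplacian and V is multiplication by a potential function. In quantum field
theory, the operators should act on $L^2$ of certain function spaces or mapping spaces: $\mathcal{A}$
in (i) and L(M) in (ii). One can emphasize that an alternative to the canonical formalism,
which studies $-\Delta/2 + V$ directly, is to use the Feynman-Kac formula, which expresses the heat
kernel $K_t(x, y)$ of $e^{-t(-\Delta/2 + V)}$ as a path integral over paths $\gamma$ from x to y:
$$K_t(x, y) = \int e^{-\int_0^t V(\gamma(s))\,ds}\; e^{-\frac{1}{2}\int_0^t |\dot\gamma(s)|^2\,ds}\; \mathcal{D}\gamma. \qquad (10.2)$$
Here, $e^{-\frac{1}{2}\int_0^t |\dot\gamma(s)|^2\,ds}\,\mathcal{D}\gamma$ means the Wiener measure on this path space. The path integral
approach for operators on $L^2(L(M))$ requires paths in L(M), that is, maps
$X : S^1 \times [0, T] \to M$. So the measure space analogous to the space of paths is
$$\bigl\{X : S^1 \times [0, T] \to M;\; X(\cdot, 0) = \gamma_0(\cdot),\; X(\cdot, T) = \gamma_1(\cdot)\bigr\}. \qquad (10.3)$$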
(iv) For gauge theories, the situation is a little more complicated. Note that a path
$t \mapsto f_t(x)$ of functions on M is a function f(t, x) on $[0, T] \times M$. A connection
$A = (A_0, A_1, A_2, A_3)$ on $[0, T] \times M$ can be transformed by a gauge transformation on
$[0, T] \times M$ so that $A_0 = A(d/dt)$ is 0 (the temporal gauge; integrate the differential equation
$dU/dt = U(t, x)A_0(t, x)$). Connections on $[0, T] \times M$ become paths of connections on M.
Although there are some technical complications, one is led very quickly for path inte-
gral purposes to $\mathcal{A}/\mathcal{G}$ based on a four-dimensional manifold, usually $M \times \mathbb{R}$ (interpreted
as paths on $\mathcal{A}/\mathcal{G}$ based on M). The last geometric objects we consider are homoge-
neous spaces of $\mathrm{Diff}\,S^1$, the orientation-preserving diffeomorphisms of $S^1$. $\mathrm{Diff}\,S^1$ en-
ters string theory because the theory, involving as it does maps of $S^1$, should be in-
variant under reparameterizations of $S^1$. It is supposed to play a role similar to that of gauge
transformations in gauge theories and of Diff(M) for metrics on M, that is, gravity.
(v) The space $\mathrm{Diff}\,S^1/S^1$ can be made into a Kähler manifold: the Lie algebra of
$\mathrm{Diff}\,S^1$ is $\mathrm{Vect}(S^1)$. The tangent space of $\mathrm{Diff}\,S^1/S^1$ at the identity coset is the set of
vector fields whose 0th Fourier coefficient is 0. Thus
J =
makes $\mathrm{Diff}\,S^1/S^1$ into an almost complex manifold with an invariant almost complex
structure. It is easy to see that J is integrable and one assumes the Newlander-Nirenberg
theorem will hold. There is a family of Kähler metrics given by the cocycles (of the Lie
algebra of vector fields on $S^1$ after complexification) with either $a = 0$, $b \neq 0$ or $a \neq 0$,
$b/a \neq n^2$. Other interesting homogeneous spaces are $\mathrm{Diff}\,S^1/K_n$, where $K_n$ is the
subgroup with Lie algebra generated by $L_0$, $L_n$, and $L_{-n}$. The case n = 0 is (v) above and
the case n = 1 gives $K_1 = \mathrm{Sl}(2, \mathbb{R}) \subset \mathrm{Diff}\,S^1$. (For good introductions to the theory of
Kählerian manifolds, see [30, 58].)
Mathematical note on almost complex structures
and Kähler manifolds
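Definition 10.1. Let M be a Hausdorff space. Let $\{U_\alpha\}_{\alpha \in A}$ be an open cover of M
and suppose that for each $U_\alpha$ there is a homeomorphism $\psi_\alpha$ from $U_\alpha$ onto an open set
$D_\alpha$ of $\mathbb{C}^n$ satisfying the following property: if $U_\alpha \cap U_\beta \neq \emptyset$, then the map
$f_{\beta\alpha} = \psi_\beta \circ \psi_\alpha^{-1}$ from the open set $\psi_\alpha(U_\alpha \cap U_\beta)$ of $\mathbb{C}^n$ onto the open set
$\psi_\beta(U_\alpha \cap U_\beta)$ of $\mathbb{C}^n$ and the map $f_{\alpha\beta} = \psi_\alpha \circ \psi_\beta^{-1}$ from $\psi_\beta(U_\alpha \cap U_\beta)$ onto
$\psi_\alpha(U_\alpha \cap U_\beta)$ are both holomorphic. If M has an open cover $\{U_\alpha\}_{\alpha \in A}$ and a set of
maps $\{\psi_\alpha\}_{\alpha \in A}$ with this property, then M is called a
complex manifold of complex dimension n, and $\{(U_\alpha, \psi_\alpha)\}_{\alpha \in A}$ is called a holomorphic
coordinate neighborhood system of M.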
If we identify $\mathbb{C}^n$ with $\mathbb{R}^{2n}$, then a holomorphic map of an open set of $\mathbb{C}^n$ to an open set
of $\mathbb{C}^n$, considered as a map between open sets in $\mathbb{R}^{2n}$, is analytic (because the real part
and imaginary part of a holomorphic function are analytic). Hence a complex
manifold of complex dimension n is a 2n-dimensional (real) analytic manifold. Let M be
a complex manifold of complex dimension n and let $\{(U_\alpha, \psi_\alpha)\}_{\alpha \in A}$ be a holomorphic
coordinate neighborhood system. Let U be an open set of M, $\psi$ a homeomorphism
from U onto an open set D of $\mathbb{C}^n$, and suppose they satisfy the following property: if
$U \cap U_\alpha \neq \emptyset$ ($\alpha \in A$), then the map $\psi \circ \psi_\alpha^{-1}$ from $\psi_\alpha(U \cap U_\alpha)$ to $\psi(U \cap U_\alpha)$ and its
inverse $\psi_\alpha \circ \psi^{-1}$ are both
holomorphic. If this is the case, $(U, \psi)$ is called a holomorphic coordinate neighborhood
of M. For $q \in U$, set $\psi(q) = (z^1(q), \ldots, z^n(q))$. Then $z^k$ ($k = 1, \ldots, n$) is a complex-valued
function defined on U, and we call $(z^1, \ldots, z^n)$ the complex local coordinate system on
$(U, \psi)$.
Let f be a complex-valued function defined on an open set E of a complex mani-
fold M. For each point p of E, we can choose a holomorphic coordinate neighborhood
$(U, \psi)$ such that $p \in U \subset E$. If the function $f \circ \psi^{-1}$ defined on the open set $\psi(U)$ of $\mathbb{C}^n$
is holomorphic, then f is said to be holomorphic in a neighborhood of p. This definition
does not depend on the choice of the holomorphic coordinate neighborhood $(U, \psi)$.
Let f be holomorphic at all points of E, and let a complex local coordinate system in
a neighborhood of p be $(z^1, \ldots, z^n)$. Then we can write $f(q) = f(z^1(q), \ldots, z^n(q))$, and
the right-hand member is a holomorphic function of n variables.
An n-dimensional complex manifold M is a 2n-dimensional manifold, so that at each
point p of M, the tangent space $T_p(M)$ and its dual $T_p^*(M)$ are defined. Let $(z^1, \ldots, z^n)$ be
a complex local coordinate system, and let $x^k$ and $y^k$ be the real and imaginary parts of
$z^k$, respectively. Then $\{(\partial/\partial x^1)_p, (\partial/\partial y^1)_p, \ldots, (\partial/\partial x^n)_p, (\partial/\partial y^n)_p\}$ is a basis of $T_p(M)$
and $\{(dx^1)_p, (dy^1)_p, \ldots, (dx^n)_p, (dy^n)_p\}$ is the basis of $T_p^*(M)$ dual to the former. Let
M and $M'$ be complex manifolds of complex dimensions n and m, respectively, and let
$\varphi$ be a continuous map from M to $M'$. If for each point p of M and each holomorphic
function f on $M'$ defined on a neighborhood of $\varphi(p)$, $f \circ \varphi$ is also holomorphic in a
neighborhood of p, then $\varphi$ is called a holomorphic map from M to $M'$. Holomorphic
maps are naturally differentiable. If $\varphi$ is a one-to-one holomorphic map from M onto $M'$
and if the inverse map $\varphi^{-1}$ is also a holomorphic map from $M'$ to M, then $\varphi$ is called
a holomorphic isomorphism (or holomorphism) from M to $M'$.
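Let $(z^1, \ldots, z^n)$ be a complex local coordinate system on a neighborhood U of a point
p of M. Define a linear transformation $J_p$ of $T_p(M)$ by
$$J_p\Bigl(\frac{\partial}{\partial x^k}\Bigr)_p = \Bigl(\frac{\partial}{\partial y^k}\Bigr)_p, \qquad J_p\Bigl(\frac{\partial}{\partial y^k}\Bigr)_p = -\Bigl(\frac{\partial}{\partial x^k}\Bigr)_p \qquad (k = 1, \ldots, n). \qquad (10.5)$$
We prove that the definition of $J_p$ does not depend on the choice of the complex local
coordinate system $(z^1, \ldots, z^n)$. To see this, extend $J_p$ to a linear transformation of the
complex vector space $T_p^{\mathbb{C}}(M)$: set $J_p(u + iv) = J_p u + i J_p v$ ($u, v \in T_p(M)$). Then, by
(10.5) we have
$$J_p\Bigl(\frac{\partial}{\partial z^k}\Bigr)_p = i\Bigl(\frac{\partial}{\partial z^k}\Bigr)_p, \qquad J_p\Bigl(\frac{\partial}{\partial \bar z^k}\Bigr)_p = -i\Bigl(\frac{\partial}{\partial \bar z^k}\Bigr)_p \qquad (k = 1, \ldots, n). \qquad (10.6)$$
Hence, if an element a of $T_p^{\mathbb{C}}(M)$ is a linear combination of $(\partial/\partial z^k)_p$ ($k = 1, \ldots, n$) only,
then we have $J_p a = ia$, and if a is a linear combination of $(\partial/\partial \bar z^k)_p$ ($k = 1, \ldots, n$) only,
then we have $J_p a = -ia$. Now, if $(w^1, \ldots, w^n)$ is also a complex local coordinate system
on the neighborhood U of p and if $w^k = u^k + i v^k$, then we can define a new linear
transformation $I_p$ of $T_p(M)$ in the same manner as above. Since the $w^k$ are holomorphic
functions of the $z^j$, each $(\partial/\partial w^k)_p$ is a linear combination of the $(\partial/\partial z^j)_p$ only, and each
$(\partial/\partial \bar w^k)_p$ of the $(\partial/\partial \bar z^j)_p$ only, so $I_p$ and $J_p$ have the same eigenspaces for the eigenvalues
$i$ and $-i$. Hence $J_p = I_p$,
and this shows that the definition of $J_p$ does not depend on the choice of the complex
local coordinate system in the neighborhood of p. From (10.5), it is clear that
$$J_p^2 = -1, \qquad (10.7)$$
where 1 denotes the identity transformation of $T_p(M)$. The correspondence J, which
assigns to each point p of M the linear transformation $J_p$ of $T_p(M)$, is called the almost
complex structure attached to M, which is defined more abstractly as follows.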
Definition 10.2. An almost complex structure on a real differentiable manifold M
is a tensor field J which is, at every point p of M, an endomorphism of the tangent
space $T_p(M)$ such that $J^2 = -1$.
Now, let M and $M'$ be almost complex manifolds with almost complex structures J
and $J'$, respectively. A differentiable mapping $f : M \to M'$ is said to be almost complex if
$J' \circ f_* = f_* \circ J$. When M and $M'$ are complex manifolds, f is almost complex if and only
if it is holomorphic.
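As a concrete check of the relations numbered (10.5) to (10.7), one can take the simplest case $M = \mathbb{C}$ with coordinate $z = x + iy$, where J is defined by $J\,\partial/\partial x = \partial/\partial y$ and $J\,\partial/\partial y = -\partial/\partial x$. The matrix below is written in the basis $(\partial/\partial x, \partial/\partial y)$, a choice made here purely for illustration:

```latex
\begin{align*}
J &= \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},
\qquad
J^2 = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} = -1, \\
\frac{\partial}{\partial z}
&= \frac{1}{2}\Bigl(\frac{\partial}{\partial x} - i\,\frac{\partial}{\partial y}\Bigr),
\qquad
J\,\frac{\partial}{\partial z}
= \frac{1}{2}\Bigl(\frac{\partial}{\partial y} + i\,\frac{\partial}{\partial x}\Bigr)
= i\,\frac{\partial}{\partial z}, \\
\frac{\partial}{\partial \bar z}
&= \frac{1}{2}\Bigl(\frac{\partial}{\partial x} + i\,\frac{\partial}{\partial y}\Bigr),
\qquad
J\,\frac{\partial}{\partial \bar z}
= \frac{1}{2}\Bigl(\frac{\partial}{\partial y} - i\,\frac{\partial}{\partial x}\Bigr)
= -i\,\frac{\partial}{\partial \bar z}.
\end{align*}
```

So J is a rotation by a quarter turn in each coordinate plane, and its complexification is diagonal with eigenvalues $\pm i$ on the holomorphic and antiholomorphic directions.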
Definition 10.3. A Hermitian metric on an almost complex manifold M is a Rie-
mannian metric g invariant by the almost complex structure J, that is, g(JX, JY) =
g(X, Y) for any vector fields X and Y.
A Hermitian metric thus defines a Hermitian inner product on each tangent space
$T_p(M)$ with respect to the complex structure defined by J. An almost complex mani-
fold (resp., a complex manifold) with a Hermitian metric is called an almost Hermitian
manifold (resp., a Hermitian manifold).
Proposition 10.4. Let M be an almost Hermitian manifold with almost complex
structure J and metric g. Let $\Phi$ be the fundamental 2-form, N the torsion of J, and
$\nabla_X$ the covariant differentiation of the Riemannian connection defined by g. Then, for any
vector fields X, Y, and Z on M,
$$4g\bigl((\nabla_X J)Y, Z\bigr) = 6\,d\Phi(X, JY, JZ) - 6\,d\Phi(X, Y, Z) + g\bigl(N(Y, Z), JX\bigr). \qquad (10.8)$$
We now state an important theorem.
Theorem 10.5. For an almost Hermitian manifold M with almost complex structure
J and metric g, the following conditions are equivalent:
(i) the Riemannian connection defined by g is almost complex;
(ii) the almost complex structure has no torsion and the fundamental 2-form $\Phi$ is
closed.
A Hermitian metric on an almost complex manifold is called a Kähler metric if the
fundamental 2-form is closed. An almost complex manifold (resp., a complex manifold)
with a Kähler metric is called an almost Kähler manifold (resp., a Kähler manifold).
An almost Hermitian manifold with $d\Phi = 0$ and $N = 0$ used to be called a pseudo-
Kähler manifold. Since an almost complex manifold with $N = 0$ is a complex manifold,
a pseudo-Kähler manifold is necessarily a Kähler manifold.
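The simplest instance of these definitions is flat $\mathbb{C}^n$, sketched below. The convention $\Phi(X, Y) = g(X, JY)$ for the fundamental 2-form is an assumption made here (the text does not fix it, and some authors use the opposite sign), and $x^k, y^k$ are the real and imaginary parts of the standard coordinates:

```latex
\begin{align*}
g &= \sum_{k=1}^{n} \bigl(dx^k \otimes dx^k + dy^k \otimes dy^k\bigr),
\qquad
J\,\frac{\partial}{\partial x^k} = \frac{\partial}{\partial y^k},
\quad
J\,\frac{\partial}{\partial y^k} = -\frac{\partial}{\partial x^k}, \\
g(JX, JY) &= g(X, Y)
\quad \text{($J$ maps the orthonormal basis to itself up to sign)}, \\
\Phi\Bigl(\frac{\partial}{\partial x^k}, \frac{\partial}{\partial y^l}\Bigr)
&= g\Bigl(\frac{\partial}{\partial x^k},\, J\frac{\partial}{\partial y^l}\Bigr)
= -\delta_{kl},
\qquad
\Phi\Bigl(\frac{\partial}{\partial x^k}, \frac{\partial}{\partial x^l}\Bigr)
= \Phi\Bigl(\frac{\partial}{\partial y^k}, \frac{\partial}{\partial y^l}\Bigr) = 0, \\
d\Phi &= 0
\quad \text{(the components of $\Phi$ are constant)},
\qquad
N = 0
\quad \text{(the components of $J$ are constant)}.
\end{align*}
```

Thus g is a Hermitian metric whose fundamental 2-form is closed, so $\mathbb{C}^n$ with its standard metric is a Kähler manifold; being flat, it also has vanishing Ricci tensor.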
Proposition 10.6. The curvature R and the Ricci tensor S of a Kähler manifold
possess the following properties:
(i) $R(X, Y) \circ J = J \circ R(X, Y)$ and $R(JX, JY) = R(X, Y)$ for all vector fields X and Y;
(ii) $S(JX, JY) = S(X, Y)$ and $S(X, Y) = \frac{1}{2}\,\mathrm{trace}\bigl(J \circ R(X, JY)\bigr)$ for all vector fields
X and Y.
Theorem 10.7. For a Kähler manifold M of complex dimension n, the restricted linear
holonomy group is contained in SU(n) if and only if the Ricci tensor vanishes identically.
Lemma 10.8. For an almost complex linear connection with curvature tensor R on
a 2n-dimensional almost complex manifold M, the restricted linear holonomy group is
contained in (the real representation of) SL(n; C) if and only if
$$\mathrm{trace}\,R(X, Y) = 0, \qquad \mathrm{trace}\bigl(J \circ R(X, Y)\bigr) = 0 \qquad (10.9)$$
for all vector fields X and Y, where J denotes the almost complex structure.
Theorem 10.9. An almost Hermitian manifold M is a Kähler manifold if and only if
the bundle U(M) of unitary frames admits a torsion-free connection (which is necessarily
unique).
On each almost complex manifold M, one can construct the bundle C(M) of complex
linear frames and study connections in C(M) and their torsion. Let M be an almost
complex manifold of dimension 2n with almost complex structure J and let $J_0$ be the
canonical complex structure over the vector space $\mathbb{R}^{2n}$. Then a complex linear frame at
a point x of M is a nonsingular linear mapping $u : \mathbb{R}^{2n} \to T_x(M)$ such that $u \circ J_0 = J \circ u$.
One easily shows that J defines the structure of a complex vector space in $T_x(M)$
and $u : \mathbb{R}^{2n} \to T_x(M)$ is a complex linear frame at x if and only if it is a nonsingular
complex linear mapping of $\mathbb{C}^n = \mathbb{R}^{2n}$ onto $T_x(M)$. The set of complex linear frames
forms a principal fibre bundle over M with group GL(n; C); it is called the bundle of
complex linear frames and is denoted by C(M). Since C(M) is a subbundle
of the bundle L(M) of linear frames, each almost complex structure gives rise to a
reduction of the structure group GL(2n, R) of L(M) to GL(n; C). Then one gets the
following results.
Proposition 10.10. Given a 2n-dimensional manifold M, there is a natural one-to-
one correspondence between the almost complex structures and the reductions of the
structure group of L(M) to GL(n; C).
Proposition 10.11. Given a 2n-dimensional manifold M, there is a natural one-to-
one correspondence between the almost complex structures of M and the cross-sections
of the associated bundle L(M)/GL(n; C) over M.
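The reduction of the structure group in these two propositions rests on a simple matrix fact, sketched here. The ordering of coordinates on $\mathbb{R}^{2n}$ as $(x^1, \ldots, x^n, y^1, \ldots, y^n)$ and the block form of $J_0$ are conventions assumed for this illustration; $A, B, P, Q, R, S$ are real $n \times n$ blocks and $I_n$ is the identity:

```latex
\begin{align*}
J_0 &= \begin{pmatrix} 0 & -I_n \\ I_n & 0 \end{pmatrix},
\qquad
A + iB \;\longmapsto\; \begin{pmatrix} A & -B \\ B & A \end{pmatrix}
\quad (A + iB \in GL(n;\mathbb{C})), \\
\begin{pmatrix} P & Q \\ R & S \end{pmatrix} J_0
&= \begin{pmatrix} Q & -P \\ S & -R \end{pmatrix},
\qquad
J_0 \begin{pmatrix} P & Q \\ R & S \end{pmatrix}
= \begin{pmatrix} -R & -S \\ P & Q \end{pmatrix}, \\
\begin{pmatrix} P & Q \\ R & S \end{pmatrix} J_0
&= J_0 \begin{pmatrix} P & Q \\ R & S \end{pmatrix}
\quad\Longleftrightarrow\quad
S = P, \;\; R = -Q .
\end{align*}
```

Hence the real matrices commuting with $J_0$ are exactly the images of complex $n \times n$ matrices, and the invertible ones form the real representation of GL(n; C) inside GL(2n, R); a frame u is complex linear precisely when changes of frame stay in this subgroup.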
We know that, given a Riemannian manifold M with metric tensor g, a linear connec-
tion $\Gamma$ of M is a metric connection, that is, $\Gamma$ comes from a connection in the bundle
O(M) of orthonormal frames, if and only if g is parallel with respect to $\Gamma$.
Proposition 10.12. For a linear connection $\Gamma$ on an almost complex manifold M, the
following conditions are equivalent:
(i) $\Gamma$ is a connection in the bundle C(M) of complex linear frames;
(ii) the almost complex structure J is parallel with respect to $\Gamma$.
Theorem 10.13. Every almost complex manifold M admits an almost complex affine
connection such that its torsion T is given by N = 8T, where N is the torsion of the almost
complex structure J of M.
Corollary 10.14. An almost complex manifold M admits a torsion-free almost com-
plex affine connection if and only if the almost complex structure has no torsion.
11. A new era in the relationship between geometry and physics: topology as a
guiding principle. Mathematical and conceptual issues. Beginning in the 1970s, it was
recognized that, mathematically, gauge theory is essentially one branch of differential
geometry that uses the new concept of fibre spaces with connections. This notion is
absolutely central in the understanding of the relation between mathematical structures
and physical theories, and it directly links geometry and physics to the point that it can
be said that the two are coextensive.
Consider the mathematical concept of a space with a connection and its curvature.
Let $f : N \to M$ be a map between spaces M, N, where M, say, represents a model of
spacetime, and at each point p of M, there is localized a physical system with the space
of internal states $f^{-1}(p)$. A connection on such a geometrical object is a rule permitting
the transport of the system along the curves in M. In other words, if we know part
of the world line and the initial internal state of a system in M, then, thanks to the
corresponding displacement determined by the connection, we can know the future
states of the system. According to recent physical theories, a gravitational field is a
connection in the space of internal degrees of freedom of a gyroscope; the connection
allows us to follow the evolution of the gyroscope in spacetime. An electromagnetic
field is also a connection in the space of internal degrees of freedom of a quantum
electron; the connection allows us to follow the evolution of the electron in spacetime.
A Yang-Mills field is likewise a connection in the space of internal degrees of freedom of a
quark.
This geometrical image seems now to be the most universal mathematical model
of an ideal universe with a small number of basic interactions. The state of matter in
spacetime, at each point and each moment, is described by a section of an appropriate
fibre space $N \to M$. A field is described by a connection on this fibre space. Matter acts
on the connection by imposing restrictions on its curvature, and the connection acts
on matter by forcing it to propagate by parallel displacement along world lines. The
famous equations of Einstein, Maxwell and Dirac, and Yang and Mills are exactly the
embodiment of this idea. The geometrical concept of connection has thus become an
essential element of physics.
One can see that to each physical entity corresponds a geometrical or global dif-
ferential concept. For example, field strength is identified with the curvature of the
connection; the action integral is but a global measure of curvature. Certain topolog-
ical and algebraic invariants in the theory of characteristic classes have been seen to
be most appropriate to describe the charge of the particle in the sense of Yang and
Mills. More generally, we can establish a direct correspondence from the concepts of
gauge field theory to those of the differential geometry and topology of fibre spaces.
But how can we understand precisely the nature of such a correspondence? Inspired
by an idea already proposed by Weyl in another manner [61], we support the thesis
that, essentially, physics is but geometry in act. This implies not only that geometry
yields abstract mathematical concepts like manifolds, groups, curvature, connections,
and bundles, but also that it is, in a way, ontologically (or, if you wish, physically) rooted
in reality, because it is an integral part of the properties of physical entities and the
features of phenomena.
One could go so far as to postulate that there must be a geometrical structure, con-
tinuous or discrete according to the theory and the class of phenomena considered,
underlying any given physical family of phenomena, or maybe a topological structure
which would encompass at the same time the continuous and discrete characters of
space and of nature in a more general mathematical scheme. To convince oneself of
this, it suffices to remember that some principles of geometrical symmetry (or, equiv-
alently, some groups) can be transformed into dynamical principles that are in turn
responsible for changes in the phenomena. Should we then affirm that "in the beginning
was the symmetry", or "the group"? However, this concept is not just abstract, and
mathematical properties related to it have simultaneously an explanatory power and a
capacity to generate a world of forces, interactions, and energy, so that the math-
ematical understanding of this world cannot be separated from the understanding of
reality itself. Indeed, at a deeper level, one is increasingly led to believe that symmetry
may, in a hidden sense, determine almost everything. Moreover, in view of all this, it
is not unreasonable to look on topology, like symmetry, as some kind of underlying or
unifying principle which helps us to understand natural phenomena at the microscopic
as well as the macroscopic levels.
In this regard, we note here that a connection, which is a well-defined geometrical
object, is more primitive than the curvature. Therefore, we should consider the gauge
potential to be more primitive than the gauge field. In fact, in electromagnetism we
can show experimentally that the field can be identically zero but physical effects can
still be detected; this is because the parallel transport need not be trivial if the region
of space is not simply connected. The vanishing of curvature only gives information
about the parallel transport around very small closed paths. Physically, the parallel
transport is generally described in terms of a nonintegrable phase factor. The property
of nonintegrability refers locally to the existence of a nonvanishing field, whereas large-
scale nonintegrability is of a topological nature and may arise even if the field is zero.
Classically, the concept of potential was introduced as a mathematical device to simplify
the field equations, and the arbitrary nature of the gauge characteristic in the choice
of potential indicated that the potential did not really have a physical meaning. But,
geometrically, one can in fact show that such an interpretation is not satisfactory. The
connection is a geometrical object and so the potential should be considered as having a
physical nature. It is the choice of gauge describing the potential which has no physical
meaning, and this corresponds to the fact that the geometrical fibre space where the
connection sits has no (natural) horizontal sections.
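The situation described above, zero field but nontrivial parallel transport, can be written down for the standard idealized setting of a charged particle moving outside an infinitely long solenoid (the Aharonov-Bohm configuration). The symbols $q$ (charge), $\hbar$, $C$ (a closed path encircling the solenoid once), $S$ (any surface bounded by $C$), and $\Phi_B$ (enclosed magnetic flux) are notation assumed here, in SI units:

```latex
\begin{align*}
F_{\mu\nu} &= \partial_\mu A_\nu - \partial_\nu A_\mu = 0
\quad \text{everywhere outside the solenoid}, \\
\exp\Bigl(\frac{i q}{\hbar} \oint_C \mathbf{A}\cdot d\boldsymbol{\ell}\Bigr)
&= \exp\Bigl(\frac{i q\,\Phi_B}{\hbar}\Bigr),
\qquad
\Phi_B = \int_S \mathbf{B}\cdot d\mathbf{S}.
\end{align*}
```

The phase factor differs from 1 unless $q\Phi_B/\hbar$ is an integer multiple of $2\pi$, even though the curvature vanishes along the whole path: the region outside the solenoid is not simply connected, so $C$ cannot be shrunk to a point without crossing the flux.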
A more general problem concerns the relation between purely mathematical geom-
etry and physical geometry. According to an idea going back to Riemann and Clifford
and next developed by T. Levi-Civita, E. Cartan, H. Weyl, and A. Einstein, physical con-
cepts cannot be dissociated from geometrical ones, and conversely. Some remarks about
the general relativity theory can help to understand what we mean by that. In this the-
ory, the gravitational field is seen as the effect of a geometric distortion, a curvature
or warping of spacetime. As is well known, freely falling bodies are not
treated as subject to gravitational forces, but are instead regarded as following the
straightest possible path (a geodesic) in an underlying curved spacetime. In Newton's
theory of gravitation, the earth's orbit curves around the sun because the sun's gravity
forces it to depart from its natural straight line motion. In Einstein's theory, there are
no gravitational forces as such. The sun produces a warping of spacetime in its vicin-
ity and the earth travels freely along a geodesic in this curved spacetime. Gravity is
treated as a geometrical effect precisely because it is universal; it affects all test objects
in the same way. Thus, even light will follow a curved path in a gravitational field. On
a large scale, the distribution of galaxies throughout the universe will depend on the
geometry of space. The fact that there might be a systematic curvature of space on a
cosmological scale raises the interesting question of the topology of the universe. So
long as space is considered to be flat, it must be either infinite in extent or else pos-
sess some sort of boundary. But if space is curved, there are other possibilities. Think
of the situation with a two-dimensional sheet. A curved sheet could be closed into a
sphere, for example, or a torus. It is possible to envisage a three-dimensional version
of a closed spherical surface, called a hypersphere. If the universe had the topology
of a hypersphere, it would possess a finite volume, but there would be no boundary or
edge to space. It is not known what topology space actually possesses, but the issue is
crucial to the superstring theory. (On this very interesting subject, see [23, 31].)
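The claim that a hypersphere has finite volume but no boundary can be made quantitative. The sketch below uses the round metric on a three-sphere of radius $R$ in hyperspherical angles $(\chi, \theta, \phi)$; the coordinate names and the radius $R$ are notation introduced here for illustration:

```latex
\begin{align*}
ds^2 &= R^2 \bigl[ d\chi^2 + \sin^2\!\chi \,\bigl( d\theta^2 + \sin^2\!\theta \, d\phi^2 \bigr) \bigr],
\qquad 0 \le \chi, \theta \le \pi, \;\; 0 \le \phi < 2\pi, \\
\mathrm{Vol}(S^3)
&= R^3 \int_0^{\pi} \sin^2\!\chi \, d\chi \int_0^{\pi} \sin\theta \, d\theta \int_0^{2\pi} d\phi
= R^3 \cdot \frac{\pi}{2} \cdot 2 \cdot 2\pi
= 2\pi^2 R^3 .
\end{align*}
```

The volume is finite for every finite $R$, yet the space is a closed manifold: a traveller moving along a geodesic returns to the starting point after a distance $2\pi R$ without ever meeting an edge.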
One of the basic assumptions in modern cosmology, the cosmological principle, is
that on large-scale average, our universe is spatially homogeneous and isotropic. The
apparent isotropy on large scales is normally explained as a consequence of spatial
homogeneity, which in turn is understood as a natural result of an inflationary period
of the early universe. An alternative approach to explaining the apparent homogeneity
is to assume an expanding universe with small and finite space sections with a nontrivial
topology, the small universe model. From the theoretical point of view, it is possible
to have quantum creation of the universe with a multiply connected topology. From
the observational side, this model has been used to explain the observed periodicity
in the distribution of quasars and galaxies.
It is also worthwhile noting that to the generation of new space dimensions and struc-
tures correspond changes in the physical state of phenomena. For example, we know
that the qualitative properties of a certain physical (dynamical) system are sensitive to
the dimension of the space, and that the geometrical and topological structure of the
space puts constraints on the evolution of the system (see [7, 47]). We mention only
one outstanding example. In 1984, the British physicist Michael Berry showed that the
adiabatic evolution of energy eigenfunctions, with respect to a time-dependent quan-
tum Hamiltonian H(t), contains a phase of deeply geometrical origin in addition to the
familiar dynamical phase
$$\frac{1}{\hbar}\int E(t)\,dt. \qquad (11.1)$$
The additional phase approaches a finite, nonzero limit as the Hamiltonian is taken
more and more slowly around a closed path in its parameter space. The geometric
phase $\gamma(C)$ (where C is a closed circuit on a sphere) measures the anholonomy of a
physical (classical or quantum) system. Anholonomy is a geometrical phenomenon in
which nonintegrability causes some variables to fail to return to their original values
when others, which drive them, are altered around a cycle. The simplest anholonomy
is in the parallel transport of vectors, two examples being the change in the direction
of swing of a Foucault pendulum after one rotation of the earth, and the change in
the direction of linear polarization of light along a twisting ray or coiled optical fibre
whose direction is altered in a cycle. Adiabaticity is slow change and therefore denotes
phenomena at the border between dynamics and statics. Adiabatic change provides
the simplest way to make quantum parallel transport happen. The variables which are
cycled are parameters in the Hamiltonian of a system. If the cycling is slow, the adiabatic
theorem guarantees that the system returns to its original state. But it usually acquires
a nontrivial phase, a manifestation of anholonomy.
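For reference, the geometric phase discussed above has a compact standard expression. In the sketch below, $|n(\mathbf{R})\rangle$ denotes a normalized, nondegenerate eigenstate of $H(\mathbf{R})$ with energy $E_n$, $\mathbf{R}(t)$ the slowly varying parameters traversing the closed circuit $C$ in time $T$, and $\Omega(C)$ the solid angle subtended by $C$; all of these symbols are notation assumed here. The last line is the standard example of a spin-1/2 in a magnetic field whose direction is carried around $C$ on the sphere of directions:

```latex
\begin{align*}
|\psi(T)\rangle
&= \exp\Bigl(-\frac{i}{\hbar}\int_0^T E_n\bigl(\mathbf{R}(t)\bigr)\,dt\Bigr)\,
   \exp\bigl(i\gamma_n(C)\bigr)\,|\psi(0)\rangle, \\
\gamma_n(C)
&= i \oint_C \bigl\langle n(\mathbf{R}) \big| \nabla_{\mathbf{R}}\, n(\mathbf{R}) \bigr\rangle \cdot d\mathbf{R}, \\
\gamma_{\pm}(C)
&= \mp \tfrac{1}{2}\,\Omega(C)
\quad \text{(spin-$\tfrac{1}{2}$, states aligned or anti-aligned with the field)}.
\end{align*}
```

The first exponential is the dynamical phase, which depends on how long the circuit takes; the second depends only on the geometry of $C$ in parameter space, which is why it is read as the holonomy of a connection.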
Moreover, some mathematical ideas can provide a deep and powerful connection be-
tween, on the one hand, the geometrical symmetries of space, and on the other, the
dynamical behavior of material bodies. In fact, forbidding spontaneous
changes in motion amounts to a statement of the laws of conservation of momentum
and angular momentum. The translation symmetry of space leads directly to momen-
tum conservation for particles, whereas the rotational symmetry implies angular mo-
mentum conservation. In addition to this, the conservation of energy can be shown to
follow from the translation symmetry of time. Thus, the most fundamental and compre-
hensive laws of physics are seen to follow from the basic fact that empty space and time
are featureless. This illustrates well the power of symmetry in ordering the natural world.
An interesting question now arises. Do all the forces of nature necessarily respect the
geometrical symmetries of space and time? Certainly, Maxwell's electromagnetic the-
ory, as well as Einstein's general relativity theory, incorporates all the symmetries we
have just mentioned. What about the discrete (quantic) geometrical symmetries? How
can the laws of physics be tested for them?
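The link between symmetries and conservation laws stated above can be seen in a two-line computation for a single degree of freedom. The Lagrangian $L(q, \dot q, t)$, the momentum $p$, and the energy $E$ are notation assumed here, and the Euler-Lagrange equation $\dot p = \partial L/\partial q$ is taken as given:

```latex
\begin{align*}
p &:= \frac{\partial L}{\partial \dot q},
\qquad
\frac{\partial L}{\partial q} = 0
\;\; \text{(translation symmetry in $q$)}
\quad\Longrightarrow\quad
\frac{dp}{dt} = \frac{\partial L}{\partial q} = 0, \\
E &:= \dot q\, p - L,
\qquad
\frac{dE}{dt}
= \dot q \Bigl( \frac{dp}{dt} - \frac{\partial L}{\partial q} \Bigr) - \frac{\partial L}{\partial t}
= -\frac{\partial L}{\partial t}, \\
\frac{\partial L}{\partial t} &= 0
\;\; \text{(translation symmetry in time)}
\quad\Longrightarrow\quad
\frac{dE}{dt} = 0 .
\end{align*}
```

The same argument with $q$ an angle gives conservation of angular momentum from rotational symmetry; the general statement is Noether's theorem.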
A last remark concerns the possibility of discovering a deeper, yet unknown level of
theory and experience, where the discrete and the continuous characters of the laws
of physics are but special cases of each other in the framework of a new uni-
tary mathematical theory. The theory of supergravity, developed mathematically in the
1970s, which generalizes a theory of gravitation conceived by Weyl in 1923 and another
by Kaluza and Klein about the same time, as well as the more recent superstring theory,
gives some hope (only in theory, actually) of unifying the laws of physics (see [56]). In
fact, at the base of this last theory, there is a new symmetry called supersymmetry that
acts even on a global level. It links the two large classes of elementary particles, the
fermions (such as the electron, the proton, and the neutron) and the bosons (such as
the photon), which, as is well known, have very different properties. Since supersym-
metry extends from the global to the local level, it leads to a theory which includes
gravity and which suggests the possibility of unifying it with the other forces. In this
new perspective, it would be very interesting to study particularly the relation between
the topological structure of certain (local and global) groups acting on a certain family
of nonsmooth (quasiconformal or symplectic) manifolds and the corresponding kinds
of physical symmetries and symmetry breaking. In fact, the study of the gauge theory
invariants seems intimately related to the problem of constructing diffeomorphisms
between four-manifolds, or finding embedded surfaces of a given genus, which would
complement the obstructions and invariants which have been found.
12. Further remarks on the Kaluza-Klein program. Probably the best geometrical
and physical (but hardly unified) theory resting on some global, topological ideas is
the one due to Kaluza and Klein. Its underlying geometry is that of a five-dimensional
Riemannian space with a one-parameter group of isometries. It turns out that the
Kaluza-Klein space is the total space of a circle bundle and that the electromagnetic
potentials play a double role: they define a connection form over the bundle and, to-
gether with the metric of spacetime, determine the five-dimensional Riemannian ge-
ometry. Gauge theories such as those based on the group SU(n) have a similar geometry.
Since the recent views of the role of gauge fields in strong and weak interactions are
more and more confirmed, one is reinforced in the guess that the theory of fibre bun-
dles with connection should provide the framework for a geometrical understanding
of all fundamental physical forces. This unification seems to be considerably different
from Einstein's own attempt but may be close in spirit to his program of geometrizing
physics.
More specifically, in the 1920s, Kaluza and Klein proposed to further unify the con-
cepts of internal and spacetime symmetries by reducing the former to the latter through
the introduction of some extra dimension of space. The main point can be reviewed as
follows. Assume that spacetime contains a fifth (spacelike) dimension, which has the
topology of a circle, that is, we write
$$x^M = (x^\mu, x^5), \qquad \mu = 0, 1, 2, 3, \qquad (12.1)$$
and make the identification
$$x^5 \equiv x^5 + 2\pi R. \qquad (12.2)$$
Any sensible wave function will have to be periodic in $x^5$ and thus of the form
$$\psi(x^\mu, x^5) = \sum_{n} \psi_n(x^\mu)\, e^{i n x^5/R}. \qquad (12.3)$$
Consider now the particular coordinate transformation
$$x^\mu \to x^\mu, \qquad x^5 \to x^5 + l\,\epsilon(x^\mu), \qquad (12.4)$$
where, for dimensional reasons, we have introduced a length $l$. Using (12.3), this will
induce
$$\psi_n(x) \to e^{i n l\,\epsilon(x)/R}\, \psi_n(x), \qquad (12.5)$$
which looks like the gauge transformation
$$\psi(x) \to e^{i q \epsilon(x)}\, \psi(x), \qquad A_\mu \to A_\mu + \partial_\mu \epsilon \qquad (12.6)$$
for a field carrying charge
$$q = \frac{n\, l}{R}. \qquad (12.7)$$
Furthermore, Kaluza and Klein showed that the $g_{\mu 5}$ components of the five-dimensional
metric transform like the gauge field in (12.6) and that the five-dimensional gravitational
action generates, up to normalization factors, the four-dimensional gravity-plus-gauge action
$$\int d^4x\, \sqrt{-g}\, \bigl[ R(g) + F_{\mu\nu}F^{\mu\nu} + \cdots \bigr],$$
provided $l$ is identified with the so-called Planck length, $l_P \approx 1.6 \times 10^{-33}$ cm. Besides its
conceptual beauty, Kaluza-Klein theory has two interesting consequences:
(i) electric charge is automatically quantized, thanks to quantization of momentum
on a circle,
(ii) electromagnetic and gravitational interactions get unified at energies $M \sim 1/R$
since, using (12.7) for $n = 1$ and $l = l_P$, $G_N M^2 \sim q^2$ (in units with $\hbar = c = 1$).
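The quantization of momentum on a circle invoked in (i) can be spelled out as follows. The sketch assumes a plane-wave dependence on the fifth coordinate and, for the last line, a field that is massless in five dimensions; $p_5$, $m_n$, and the explicit factors of $\hbar$ and $c$ are notation introduced here rather than taken from the text:

```latex
\begin{align*}
e^{\,i p_5 (x^5 + 2\pi R)/\hbar} &= e^{\,i p_5 x^5/\hbar}
\quad\Longrightarrow\quad
p_5 = \frac{n\hbar}{R}, \qquad n \in \mathbb{Z}, \\
E^2 &= \mathbf{p}^2 c^2 + p_5^2 c^2
\quad\Longrightarrow\quad
m_n = \frac{|n|\,\hbar}{R\,c}.
\end{align*}
```

Single-valuedness of the wave function on the circle forces $p_5$ into integer multiples of $\hbar/R$, and since the four-dimensional charge is proportional to $p_5$, it is quantized in the same units. Under the masslessness assumption, the charged modes also appear in four dimensions with masses $m_n$, which become very large when $R$ is of the order of the Planck length.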
Later on, the Kaluza-Klein idea was widely generalized, for example, to generate
larger (non-Abelian) gauge groups from even higher-dimensional spaces endowed with
suitable isometries. Kaluza-Klein theory leads to a unified classical theory but is based,
in an essential way, on quantum mechanics: the quantization of momentum gives the
quantization of electric charge. This means that there is no way to ignore quantum
mechanics within the Kaluza-Klein theory. But are the two consistent with each other?
Unfortunately, when we go from the semiclassical approximation to full-fledged quan-
tum field theory, the problem of ultraviolet infinities immediately shows up. How do we
handle that? In D = 4, gauge theories can be dealt with through the process of renormal-
ization; however, no such recipe is known for gravity. As we move to D > 4, both gauge
and gravity become nonrenormalizable. In Kaluza-Klein theory, in particular, both di-
verge in a similar way in the ultraviolet, another expected consequence of Kaluza-Klein
unification. We thus face a kind of paradoxical situation. On the one hand, quantum
mechanics is essential to the success of the Kaluza-Klein idea. At the same time, quan-
tum field theory gives meaningless infinities and spoils the nice semiclassical results.
If the beautiful Kaluza-Klein idea is to be saved, we need a better quantum theory than
quantum field theory. Now such a theory already exists; it is called superstring theory.
13. Superstring theory, physics, and spacetime. It seems more and more justified
to believe that superstring theory achieves remarkable progress in the search for a theory of all
fundamental interactions in nature, going all the way from gravity, which is responsible
for keeping the planets in orbit around the Sun, through electromagnetism, which keeps
electrons in orbit around nuclei, to the strong interactions of the nuclear forces,
which are responsible for many forms of radioactive decay. (See [17, 45] and especially
[65], which we follow here closely.)
One of the most important features of string theories is the unification of gauge couplings.
There are two reasons why this is a particularly compelling feature
to study. On the one hand, the unification of gauge couplings, like the appearance of
gravity or of gauge symmetry in the first place, is a feature intrinsic to string theory. On
the other hand, viewing the situation from an experimental perspective, the unification
of gauge couplings is arguably the highest-energy phenomenon that any extrapolation
from low-energy data can uncover; in this sense, it sits at what is believed to be the
frontier between our low-energy SU(3) × SU(2) × U(1) world and whatever may lie beyond.
Thus, the unification of gauge couplings provides a fertile meeting ground where
string theory can be tested against the results of low-energy experimentation.
Superstring theory relies crucially on the two ideas of supersymmetry and a spacetime
structure of eleven dimensions. Supersymmetry requires that for each known particle
having integer spin (0, 1, 2, and so on, measured in quantum units) there is a particle
with the same mass but half-integer spin (1/2, 3/2, 5/2, and so on), and vice versa.
Supersymmetry transforms the coordinates of space and time such that the laws of
physics are the same for all observers. Einstein's general theory of relativity derives
from this condition, and so supersymmetry implies gravity. In fact, supersymmetry
predicts supergravity, in which a particle with a spin of 2 (the graviton) transmits
gravitational interactions and has as a partner a gravitino, with a spin of 3/2.
Superstring theory is based on the very fundamental notion of T-duality, which relates
two kinds of particles that arise when a string loops around a compact dimension.
One kind (call them "vibrating" particles) is analogous to those predicted by Kaluza
and Klein and comes from vibrations of the loop of the string (see [2, 29]). Such particles
are more energetic if the circle is small. In addition, the string can wind many
times around the circle, like a rubber band on a wrist; its energy becomes higher the
more times it wraps around and the larger the circle. Moreover, each energy level represents
a new particle (call them "winding" particles). T-duality states that the winding
particles for a circle of radius R are the same as the vibrating particles for a circle
of radius 1/R, and vice versa. So, to a physicist, the two sets of particles are indistinguishable:
a fat, compact dimension may yield apparently the same particles as a thin one.
This duality has a profound implication. For decades, physicists have been struggling
to understand nature at the extremely small scales near the Planck length of $10^{-33}$
centimeters. We have always supposed that laws of nature, as we know them, break
down at smaller distances. What T-duality suggests, however, is that at these scales,
the universe looks just the same as it does at large scales. One may even imagine that
if the universe were to shrink to less than the Planck length, it would transform into a
dual universe that grows bigger as the original one collapses.
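The exchange of winding and vibrating particles under R to 1/R can be illustrated with a short computation. The sketch below is only a toy check under stated assumptions: it works in units where the string length scale is 1, it uses the standard closed-string contributions (n/R)^2 from momentum and (wR)^2 from winding, and it omits the oscillator terms. The function names are hypothetical.

```python
from fractions import Fraction


def level(n, w, radius):
    """Momentum plus winding contribution (n/R)^2 + (w R)^2, in string units.

    n counts momentum ("vibrating") quanta and w counts windings.
    Oscillator contributions are deliberately left out of this sketch.
    """
    return (Fraction(n) / radius) ** 2 + (Fraction(w) * radius) ** 2


def spectrum(radius, max_quantum=3):
    """Sorted list of levels for all |n|, |w| <= max_quantum."""
    quanta = range(-max_quantum, max_quantum + 1)
    return sorted(level(n, w, radius) for n in quanta for w in quanta)


R = Fraction(3, 2)  # an arbitrary "fat" circle; 1/R is the "thin" one

# Exchanging n and w while sending R -> 1/R leaves every level unchanged.
assert level(2, 1, R) == level(1, 2, 1 / R)

# Hence the two circles give identical lists of levels.
assert spectrum(R) == spectrum(1 / R)

print("lowest nonzero levels:", [str(x) for x in spectrum(R) if x > 0][:4])
```

Exact rational arithmetic is used so that the two spectra can be compared with `==` rather than with a floating-point tolerance.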
Supersymmetry is a conjectured symmetry between fermions and bosons. It is an inherently
quantum mechanical symmetry since the very concept of fermions is quantum
mechanical. Bosonic quantities can be described by ordinary (commuting) numbers or
by operators obeying commutation relations. Fermionic quantities involve anticommut-
ing numbers or operators. Supersymmetry is an updating of special relativity to include
fermionic as well as bosonic symmetries of spacetime. In developing relativity, Einstein
assumed that the spacetime coordinates were bosonic; fermions had not yet been dis-
covered. In supersymmetry the structure of spacetime is enriched by the presence of
fermionic as well as bosonic coordinates. If this is true, supersymmetry explains why
fermions exist in nature. Supersymmetry demands their existence. From experiments,
we have some hints that nature may be supersymmetric. In string theory, elementary
particles are understood as vibrating strings, and the structure of spacetime is coded in
the laws by which the strings propagate. A vibrating string is described by an auxiliary
two-dimensional field theory, whose Lagrangian is roughly
$$ I = \int d\sigma\, d\tau \left[ \left( \frac{\partial X}{\partial \tau} \right)^{2} - \left( \frac{\partial X}{\partial \sigma} \right)^{2} \right]. \tag{13.1} $$
Here, $X(\sigma, \tau)$ is the position of the string at proper time $\tau$, at a coordinate $\sigma$ along
the string. In string theory the auxiliary two-dimensional field theory plays a more
fundamental role than spacetime, and spacetime exists only to the extent that it can
be reconstructed from the two-dimensional field theory. String theory also leads in a
strikingly elegant way to models of particle physics with the qualitative properties of
the real world (such as the existence of quarks with electric charge and the structure of
weak interactions). String theory, if correct, entails a radical change in our concepts of
spacetime. That is what one would expect of a theory that reconciles general relativity
with quantum mechanics.
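As a small illustration of the two-dimensional field theory, assume the free-string action takes the standard form of the integral over sigma and tau of (dX/dtau)^2 - (dX/dsigma)^2. Its equation of motion is then the wave equation d^2X/dtau^2 = d^2X/dsigma^2, solved by any sum of a right-moving and a left-moving profile. The sketch below checks this with finite differences; the sample profile, the step size, and the function names are hypothetical choices made only for this example.

```python
import math


def x_wave(sigma, tau):
    """Sample string profile: a right-mover f(sigma - tau) plus a left-mover g(sigma + tau)."""
    return math.sin(sigma - tau) + 0.5 * math.cos(2.0 * (sigma + tau))


def second_derivative(func, point, step=1e-3):
    """Central finite-difference estimate of the second derivative of func at point."""
    return (func(point + step) - 2.0 * func(point) + func(point - step)) / step**2


def wave_equation_residual(sigma, tau):
    """d^2X/dtau^2 - d^2X/dsigma^2 at one worldsheet point; zero for an exact solution."""
    d2_tau = second_derivative(lambda t: x_wave(sigma, t), tau)
    d2_sigma = second_derivative(lambda s: x_wave(s, tau), sigma)
    return d2_tau - d2_sigma


# The residual vanishes up to rounding error at a few sample points.
for sigma, tau in [(0.0, 0.0), (0.7, 1.3), (2.0, -0.4)]:
    assert abs(wave_equation_residual(sigma, tau)) < 1e-6
```

A profile that is not of the form f(sigma - tau) + g(sigma + tau), such as sin(sigma) cos(2 tau), gives a residual of order one in the same check.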
Here duality enters again. Duality symmetries of the two-dimensional
field theory put a basic restriction on the validity of classical notions of spacetime. The
basic duality is
$$ \frac{\partial X}{\partial \tau} \leftrightarrow \frac{\partial X}{\partial \sigma}, $$
and is just analogous to the more familiar electromagnetic duality $E \leftrightarrow B$. In each case
the duality exchanges a regime where familiar ideas in physics are adequate with one
where they are not. In the case of electric-magnetic duality, the "easy" region is weak
coupling and the "hard" region is strong coupling. In the case of the two-dimensional
string theory dualities, the "easy" region is that of large distances and the "hard"
region is that in which some distances become very small.
There are at least five consistent relativistic string theories. These theories involve
ten spacetime dimensions, some of which can be compactified or rolled up into
unobservably small manifolds. Each theory consequently has various classical solutions
and quantum states, and thus might be manifested in nature in different ways. This is
notably related to the fact that the strong-coupling behavior of supersymmetric
string theories and field theories is governed by a web of dualities relating different
theories. When one description breaks down because a coupling parameter becomes
large, another description takes over. For instance, in uncompactified ten-dimensional
Minkowski space, the strong-coupling limit of the type I superstring is the weakly coupled
heterotic SO(32) superstring; the strong-coupling limit of the type IIA superstring
is related to eleven-dimensional supergravity; the strong-coupling limit of the type IIB
superstring theory is equivalent to the same theory at weak coupling; and the
strong-coupling limit of the $E_{8} \times E_{8}$ heterotic string involves eleven-dimensional supergravity
again. Thus, after we compactify some dimensions, we learn that the different theories
are all one. That is, they are different manifestations of one underlying and still
mysterious theory.
The duality symmetry mentioned above also has a number of nonlinear analogs, such
as mirror symmetry, which is a relationship between two spacetimes that would be
quite distinct in ordinary physics but turn out to be equivalent in string theory. The
equivalence is possible because in string theory one does not really have a classical
spacetime, but only the corresponding two-dimensional field theory. Two apparently
different spacetimes X and Y might correspond to equivalent two-dimensional field
theories. The mirror symmetry can be related to the phenomenon of topology change.
Here, one considers how space changes as a parameter (which might be the time)
is varied. One starts with a spatial manifold X so large that string theory effects are
unimportant. As time goes on, X shrinks and string effects become large; the classical
idea of spacetime breaks down. At still later times, the distances are large again and
classical ideas are again valid, but one is on an entirely different spatial manifold Y.
Acknowledgments. We would like to warmly thank Jean-Pierre Bourguignon (IHES,
Bures-sur-Yvette, and École Polytechnique, Palaiseau), Francis Bailly (CNRS, Laboratoire
de Physique des Solides de Bellevue), Marc Lachièze-Rey (CEA Saclay, Astrophysics Program),
and Joseph Kouneiher (University of Paris-VII, Physics Department) for their
helpful comments and criticisms on an early version of the paper. The author was a
Fellow for the year 1997-1998 of the Institute for Advanced Study (Princeton), to whom
he is indebted for partial support and for charming hospitality. During the last years,
the author was also supported by the John Simon Guggenheim Memorial Foundation,
the Social Science and Humanities Research Council of Canada, and the Singer-Polignac
Foundation, to whom he would like to express his deep gratitude. Finally, the author
warmly acknowledges the suggestions, comments, and criticism of Professors Piet Hut,
Chiara Nappi, and Hugo Garcia Compean. In addition, he learned a great deal from
attending seminars, especially of Edward Witten and Daniel Fried.
