Mathematics for Physics and Physicists

Walter Appel

Translated by Emmanuel Kowalski

Copyright
Originally published in French under the title Mathématiques pour la physique... et les physiciens!, © 2001 by H&K Éditions, Paris.
Contents

A book's apology
Index of notation

1   Reminders: convergence of sequences and series
    1.1  The problem of limits in physics
    1.2  Sequences
    1.3  Series
    Exercises
    Solutions

    2.2.b  Borel subsets
    2.2.c  Lebesgue measure
    2.2.d  The Lebesgue σ-algebra
    2.2.e  Negligible sets
    2.2.f  Lebesgue measure on Rⁿ
    2.2.g  Definition of the Lebesgue integral
    2.2.h  Functions zero almost everywhere, space L¹
    2.2.i  And today?
    Exercises
    Solutions

3   Integral calculus
    3.1  Integrability in practice
         3.1.a  Standard functions
         3.1.b  Comparison theorems
    3.2  Exchanging integrals and limits or series
    3.3  Integrals with parameters
         3.3.a  Continuity of functions defined by integrals
         3.3.b  Differentiating under the integral sign
         3.3.c  Case of parameters appearing in the integration range
    3.4  Double and multiple integrals
    3.5  Change of variables
    Exercises
    Solutions

4   Complex Analysis I
    4.1  Holomorphic functions
         4.1.a  Definitions
         4.1.b  Examples
         4.1.c  The operators ∂/∂z and ∂/∂z̄
    4.2  Cauchy's theorem
         4.2.a  Path integration
         4.2.b  Integrals along a circle
         4.2.c  Winding number
         4.2.d  Various forms of Cauchy's theorem
         4.2.e  Application
    4.3  Properties of holomorphic functions
         4.3.a  The Cauchy formula and applications
         4.3.b  Maximum modulus principle
         4.3.c  Other theorems
         4.3.d  Classification of zero sets of holomorphic functions
    4.4  Singularities of a function
         4.4.a  Classification of singularities
         4.4.b  Meromorphic functions
    4.5  Laurent series
         4.5.a  Introduction and definition
         4.5.b  Examples of Laurent series
         4.5.c  The Residue theorem
         4.5.d  Practical computations of residues

5   Complex Analysis II
    5.1  Complex logarithm; multivalued functions
         5.1.a  The complex logarithms
         5.1.b  The square root function
         5.1.c  Multivalued functions, Riemann surfaces
    5.2  Harmonic functions
         5.2.a  Definitions
         5.2.b  Properties
         5.2.c  A trick to find f knowing u
    5.3  Analytic continuation
    5.4  Singularities at infinity
    5.5  The saddle point method
         5.5.a  The general saddle point method
         5.5.b  The real saddle point method
    Exercises
    Solutions

6   Conformal maps
    6.1  Conformal maps
         6.1.a  Preliminaries
         6.1.b  The Riemann mapping theorem
         6.1.c  Examples of conformal maps
         6.1.d  The Schwarz-Christoffel transformation
    6.2  Applications to potential theory
         6.2.a  Application to electrostatics
         6.2.b  Application to hydrodynamics
         6.2.c  Potential theory, lightning rods, and percolation
    6.3  Dirichlet problem and Poisson kernel
    Exercises
    Solutions

7   Distributions I
    7.1  Physical approach
         7.1.a  The problem of distribution of charge
         7.1.b  The problem of momentum and forces during an elastic shock
    7.2  Definitions and examples of distributions
         7.2.a  Regular distributions
         7.2.b  Singular distributions
         7.2.c  Support of a distribution

8   Distributions II
    8.1  Cauchy principal value
         8.1.a  Definition
         8.1.b  Application to the computation of certain integrals
         8.1.c  Feynman's notation
         8.1.d  Kramers-Kronig relations
         8.1.e  A few equations in the sense of distributions
    8.2  Topology in D′
         8.2.a  Weak convergence in D′
         8.2.b  Sequences of functions converging to δ
         8.2.c  Convergence in D′ and convergence in the sense of functions
         8.2.d  Regularization of a distribution
         8.2.e  Continuity of convolution
    8.3  Convolution algebras
    8.4  Solving a differential equation with initial conditions
         8.4.a  First order equations
         8.4.b  The case of the harmonic oscillator
         8.4.c  Other equations of physical origin
    Exercises
    Problem
    Solutions

    11.1.d  Higher-dimensional Fourier transforms
    11.1.e  Inversion formula
    11.2  The Dirac comb
          11.2.a  Definition and properties
          11.2.b  Fourier transform of a periodic function
          11.2.c  Poisson summation formula
          11.2.d  Application to the computation of series
    11.3  The Gibbs phenomenon
    11.4  Application to physical optics
          11.4.a  Link between diaphragm and diffraction figure
          11.4.b  Diaphragm made of infinitely many infinitely narrow slits
          11.4.c  Finite number of infinitely narrow slits
          11.4.d  Finitely many slits with finite width
          11.4.e  Circular lens
    11.5  Limitations of Fourier analysis and wavelets
    Exercises
    Problem
    Solutions

    13.5.a  Definition
    13.5.b  Properties
    13.5.c  Intercorrelation
    13.6  Finite power functions
          13.6.a  Definitions
          13.6.b  Autocorrelation
    13.7  Application to optics: the Wiener-Khintchine theorem
    Exercises
    Solutions

15  Green functions
    15.1  Generalities about Green functions
    15.2  A pedagogical example: the harmonic oscillator
          15.2.a  Using the Laplace transform
          15.2.b  Using the Fourier transform
    15.3  Electromagnetism and the d'Alembertian operator
          15.3.a  Computation of the advanced and retarded Green functions
          15.3.b  Retarded potentials
          15.3.c  Covariant expression of advanced and retarded Green functions
          15.3.d  Radiation
    15.4  The heat equation
          15.4.a  One-dimensional case
          15.4.b  Three-dimensional case

17  Differential forms
    17.1  Exterior algebra
          17.1.a  1-forms
          17.1.b  Exterior 2-forms
          17.1.c  Exterior k-forms
          17.1.d  Exterior product
    17.2  Differential forms on a vector space
          17.2.a  Definition
          17.2.b  Exterior derivative
    17.3  Integration of differential forms
    17.4  Poincaré's theorem
    17.5  Relations with vector calculus: gradient, divergence, curl
          17.5.a  Differential forms in dimension 3
          17.5.b  Existence of the scalar electrostatic potential
          17.5.c  Existence of the vector potential
          17.5.d  Magnetic monopoles

20  Random variables
    20.1  Random variables and probability distributions
    20.2  Distribution function and probability density
          20.2.a  Discrete random variables
          20.2.b  (Absolutely) continuous random variables
    20.3  Expectation and variance
          20.3.a  Case of a discrete r.v.
          20.3.b  Case of a continuous r.v.
    20.4  An example: the Poisson distribution
          20.4.a  Particles in a confined gas
          20.4.b  Radioactive decay
    20.5  Moments of a random variable
    20.6  Random vectors
          20.6.a  Pair of random variables
          20.6.b  Independent random variables
          20.6.c  Random vectors
    20.7  Image measures
          20.7.a  Case of a single random variable
          20.7.b  Case of a random vector
    20.8  Expectation and characteristic function
          20.8.a  Expectation of a function of random variables
          20.8.b  Moments, variance
          20.8.c  Characteristic function
          20.8.d  Generating function
    20.9  Sum and product of random variables
          20.9.a  Sum of random variables
          20.9.b  Product of random variables
          20.9.c  Example: Poisson distribution
    20.10 Bienaymé-Tchebychev inequality
          20.10.a  Statement

Appendices
    C  Matrices
       C.1  Duality
       C.2  Application to matrix representation
            C.2.a  Matrix representing a family of vectors
            C.2.b  Matrix of a linear map
            C.2.c  Change of basis
            C.2.d  Change of basis formula
            C.2.e  Case of an orthonormal basis
    D  A few proofs

Tables
    Fourier transforms
    Laplace transforms
    Probability laws
Further reading
References
Portraits
Sidebars
Index
Thanks
Many are those whom I wish to thank for this book: Jean-François Colombeau, who was in charge of the course of Mathematics for the Magistère des sciences de la matière of the ENS Lyon before me; Michel Peyrard, who then asked me to replace him; Véronique Terras and Jean Farago, who were in charge of the exercise sessions during the three years when I taught this course.
Many friends and colleagues were kind enough to read and re-read one or more chapters: Julien Barré, Maxime Clusel, Thierry Dauxois, Kirone Mallick, Julien Michel, Antoine Naert, Catherine Pépin, Magali Ribaut, Erwan Saint-Loubert Bié; and especially Paul Pichaureau, TeX guru, who gave many (very) sharp and (always) pertinent comments concerning typography and presentation as well as contents. I also wish to thank the Éditions H&K, and in particular Sébastien Desreux, for his hard work and stimulating demands; and Bérangère Condomines for her attentive reading.
I am also indebted to Jean-François Quint for many long and fascinating mathematical discussions.
This edition owes a lot to Emmanuel Kowalski, friend of many years, Professor at the University of Bordeaux, who led me to refine certain statements, correct others, and more generally helped to clarify some delicate points.
For sundry diverse reasons, no less important, I want to thank Craig Thompson, author of Good-bye Chunky Rice [89] and Blankets [90], who created the illustrations for Chapter 1; Jean-Paul Marchal, master typographer and maker of popular images in Épinal; Angel Alastuey, who taught me so much in physics and mathematics; Frédéric, Laetitia, Samuel, Koupaïa and Alanis Vivien for their support and friendship; Claude Garcia, for his joie de vivre, his fideos, his fine bottles, his knowledge of topology, his advice on Life, the Universe... and the Rest! without forgetting Anita, Alice and Hugo for their affection.
I must not forget the many readers who, since the first French edition, have communicated their remarks and corrections; in particular Jean-Julien Fleck, Marc Rezzouk, Françoise Cornu, Céline Chevalier and professors Jean Cousteix, Andreas de Vries, and François Thirioux; without forgetting all the students of the Magistère des sciences de la matière of ENS Lyon, promotions 1993 to 1999, who followed my classes and exercise sessions and who contributed, through their remarks and questions, to the elaboration and maturation of this book.
Et, bien sûr, les plus importants, Anne-Julia, Solveig et Anton avec tout mon amour.
Although the text has been very conscientiously written, read, and proofread, errors, omissions, or imprecisions may still remain. The author welcomes any remark, criticism, or correction that a reader may wish to
communicate, care of the publisher, for instance, by means of the email address Errare.humanum.est@h-k.fr.
A list of errata will be kept updated on the web site
http://press.princeton.edu
A book's apology
Why should a physicist study mathematics? There is a fairly fashionable current of thought that holds that the use of advanced mathematics is of little real use in physics, and that sometimes goes as far as to say that knowing mathematics is already by itself harmful to true physics. However, I am and remain convinced that mathematics is still a precious source of insight, not only for students of physics, but also for researchers.
Many only see mathematics as a tool, and of course it is in part a tool, but they should be reminded that, as Galileo said, the book of Nature is written in the mathematician's language.¹ Since Galileo and Newton, the greatest physicists have shown by example that knowing mathematics provides the means to understand precise physical notions, to use them more easily, to establish them on a sure foundation, and, even more importantly, to discover new ones.² In addition to ensuring a certain rigor in reasoning, which is indispensable in any scientific study, mathematics belongs to the natural lexicon of physicists. Even if the rules of proportion and the fundamentals of calculus are sufficient for most purposes, it is clear that a richer vocabulary can lead to much deeper insights. Imagine if Shakespeare or Joyce had only had a few hundred words to choose from!
It is therefore discouraging to sometimes hear physicists dismiss certain theories because "this is only mathematics." In fact, the two disciplines are so closely related that the most prestigious mathematical award, the Fields medal, was given to the physicist Edward Witten in 1990, rewarding him for the remarkable mathematical discoveries that his ideas led to.
How should you read this book? Or rather how should you not read some parts?
1. Philosophy is written in this vast book which is forever open before our eyes (I mean, the Universe), but it cannot be read until you have learned the tongue and understood the characters in which it is written. It is written in the language of mathematics, and its characters are triangles, circles, and other geometric figures, without which it is not humanly possible to understand a single word [...].
2. I will only mention Newton (gravitation, differential and integral calculus), Gauss (optics, magnetism, all the mathematics of his time, and quite a bit that was only understood much later), Hamilton (mechanics, differential equations, algebra), Heaviside (symbolic calculus, signal theory), Gibbs (thermodynamics, vector analysis) and of course Einstein. One could write a much longer list. If Richard Feynman presents a very physical description of his art in his excellent physics course [35], appealing to remarkably little formalism, it is nevertheless a fact that he was a master of the elaborate mathematics involved, as his research works show.
Since the reader may want to learn first certain specific topics among those present, here
is a short description of the contents of this text:
The first chapter serves as a reminder of some simple facts concerning the fundamental notion of convergence. There shouldn't be much in the way of new
mathematics there for anyone who has followed a rigorous calculus course,
but there are many counter-examples to be aware of. Most importantly, a long
section describes the traps and difficulties inherent in the process of exchanging
two limits in the setting of physical models. It is not always obvious where,
in physical reasoning, one has to exchange two mathematical limits, and many
apparent paradoxes follow from this fact.
The real beginning concerns the theory of integration, which is briefly presented
in the form of the Lebesgue integral based on measure theory (Chapter 2). For
many readers, this may be omitted in a first reading. Chapter 3 discusses the
basic results and techniques of integral calculus.
Chapters 4 to 6 present the theory of functions of a complex variable, with a
number of applications:
the residue method, which is an amazing tool for integral calculus;
some physical notions, such as causality, are very closely related to analyticity of functions on the complex plane (Section 13.4);
harmonic functions (such that ∆f = 0) in two dimensions are linked to
the real parts of holomorphic (analytic) functions (Chapter 5);
conformal maps (those which preserve angles) can be used to simplify
boundary conditions in problems of hydrodynamics or electromagnetism
(Chapter 6);
Chapters 7 and 8 concern the theory of distributions (generalized functions)
and their applications to physics. These form a relatively self-contained subset of
the book.
Chapters 9 to 12 deal with Hilbert spaces, Fourier series, Fourier and Laplace
transforms, which have too many physical applications to attempt a list. Chapter 13 develops some of those applications, and this chapter also requires complex
analysis.
Chapter 14 is a short (probably too short) introduction to the Dirac notation
used in quantum mechanics: kets |ψ⟩ and bras ⟨ψ|. The notions of generalized
eigenbasis and self-adjoint operators on Hilbert space are also discussed.
Several precise physical problems are considered and solved in Chapter 15 by the
method of Green functions. This method is usually omitted from textbooks on
electromagnetism (where a solution is taken straight out of a magician's hat) or
of field theory (where it is assumed that the method is known). I hope to fill
a gap for students by presenting the necessary (and fairly simple) computations
from beginning to end, using physicists notation.
Chapters 16 and 17 about tensor calculus and differential forms are also somewhat independent from the rest of the text. Those two chapters are only brief
introductions to their respective topics.
Chapter 18 has the modest goal of relating some notions of topology and group
theory to the idea of spin in quantum mechanics.
Probability theory, discussed in Chapters 19 to 21, is almost entirely absent
from the standard physics curriculum, although the basic vocabulary and results of probability seem necessary to any physicist interested in theory (stochastic
equations, Brownian motion, quantum mechanics and statistical mechanics all
require probability theory) or experiments (Gaussian white noise, measurement
errors, standard deviation of a data set...)
Finally, a few appendices contain further reminders of elementary mathematical
notions and the proofs of some interesting results, the length of which made their
insertion in the main text problematic.
Many physical applications, using mathematical tools with the usual notation of physics, are included in the text. They can be found by looking in the
index at the items under Physical applications.
Translator's foreword
I am a mathematician and have now forgotten most of the little physics I learned in school
(although I've probably picked up a little bit again by translating this book). I would like to
mention here two more reasons to learn mathematics, and why this type of book is therefore
very important.
First, physicists benefit from knowing mathematics (in addition to the reasons Walter
mentioned) because, provided they immerse themselves in mathematics sufficiently to become
fluent in its language, they will gain access to new intuitions. Intuitions are very different from
any set of techniques, or tools, or methods, but they are just as indispensable for a researcher,
and they are the hardest to come by.3 A mathematician's intuitions are very different from
those of a physicist, and to have both available is an enormous advantage.
The second argument is different, and may be subjective: physics is hard, much harder
in some sense than mathematics. A very simple and fundamental physical problem may
be all but impossible to solve because of the complexity (apparent or real) of Nature. But
mathematicians know that a simple, well-formulated, natural mathematical problem (in some
sense that is impossible to quantify!) has a simple solution. This solution may require inventing
entirely new concepts, and may have to wait for a few hundred years before the idea comes to
a brilliant mind, but it is there. What this means is that if you manage to put the physical
problem in a very natural mathematical form, the guiding principles of mathematics may lead
you to the solution. Dirac was certainly a physicist with a strong sense of such possibilities;
this led him to discover antiparticles, for instance.
3
Often, nothing will let you understand the intuition behind some important idea except,
essentially, rediscovering by yourself the most crucial part of it.
Notation

Symbol        Meaning
\             without (example: A \ B)
⊐             has for Laplace transform
⟨a, x⟩        duality bracket: ⟨a, x⟩ = a(x)
(·|·)         scalar product
|·⟩, ⟨·|      ket, bra
⌊X⌋           integral part
Ш             Dirac comb
1             constant function x ↦ 1
∗             convolution product
⊗             tensor product
∧             exterior product
∂             boundary (example: ∂Ω, boundary of Ω)
∆             laplacian
□             d'Alembertian
∼             equivalence of sequences or functions
≃             isomorphic to (example: E ≃ F)
≈             approximate equality (in physics)
 ≝            equality defining the left-hand side
[a, b]        closed interval: {x ∈ R ; a ≤ x ≤ b}
]a, b[        open interval: {x ∈ R ; a < x < b}
[[1, n]]      {1, . . . , n}
Index of notation

Latin letters

Symbol                    Meaning
B                         Borel σ-algebra
B(a ; r), B̄(a ; r)        open (closed) ball centered at a with radius r
Bil(E × F, G)             space of bilinear maps from E × F to G
∁A                        complement of A in Ω
C̄                         C ∪ {∞}
C(I, R)                   vector space of continuous real-valued functions on I
C(a ; r)                  circle centered at a with radius r
Cnt                       any constant
D                         vector space of test functions
D′                        vector space of distributions
D′+, D′−                  vector space of distributions with bounded support on the left (right)
diag(λ₁, . . . , λₙ)       square matrix with coefficients m_ij = λ_i δ_ij
e                         Neper's constant (e: electric charge)
(e_α)                     basis of a vector space
E(X)                      expectation of the r.v. X
F[f], f̃                   Fourier transform of f
f̂                         Laplace transform of f
σf                        transpose of f: σf(x) = f(−x)
f^(n)                     n-th derivative of f
GLₙ(K)                    group of invertible matrices of order n over K
GL(E)                     group of automorphisms of the vector space E
H                         Heaviside distribution or function
K, K′                     R or C
ℓ²                        space of sequences (uₙ)ₙ∈N such that Σ|uₙ|² < +∞
L¹                        space of integrable functions
L²                        space of square-integrable functions
L(E, F)                   space of linear maps from E to F
L                         Lebesgue σ-algebra
Mₙ(K)                     algebra of square matrices of order n over K
P(A)                      set of subsets of A
P(A)                      probability of the event A
R̄                         R ∪ {−∞, +∞}
Re(z), Im(z)              real or imaginary part of z
S                         space of Schwartz functions
S′                        space of tempered distributions
S, Sₙ                     group of permutations (of n elements)
tA                        transpose of a matrix: (tA)_ij = A_ji
u, v, w, x                vectors in E
pv                        principal value
X, Y, Z                   random variables
Greek letters

Abbreviations
(n.)v.s.    (normed) vector space
r.v.        random variable
i.r.v.      independent random variables
Chapter 1

Reminders: convergence of sequences and series
This first chapter, which is quite elementary, is essentially a survey of the notion
of convergence of sequences and series. Readers who are very comfortable with this
concept may start reading the next chapter.
However, although the mathematical objects we discuss are well known in principle, they have some unexpected properties. We will see in particular that the order
of summation may be crucial to the evaluation of the series, so that changing the
order of summation may well change its sum.
We start this chapter by discussing two physical problems in which a limit process
is hidden. Each leads to an apparent paradox, which can only be resolved when the
underlying limit is explicitly brought to light.
1.1  The problem of limits in physics

1.1.a
On the other hand, to accelerate, it must use (or guzzle) α gallons of gas in order to increase its kinetic energy by an amount of 1 joule. This assumption, although it is imperfect, is physically acceptable because each gallon of gas yields the same amount of energy.
So, when the driver decides to increase its speed to reach v′ = 80 mph, the quantity of gas required to do so is α times the difference of kinetic energy, namely, it is

    α (E′_c − E_c) = (α/2) m (v′² − v²) = (α/2) m (6 400 − 3 600) = 1 400 αm.

With αm = 1/10 000 gallon · mile⁻² · h², say, this amounts to 0.14 gallon. Jolly good.
Now, let us watch the same scene of the truck accelerating, as observed by a highway patrolman, initially driving as fast as the truck, w = v = 60 mph, but with a motorcycle which is unable to go faster.
The patrolman, having his college physics classes at his fingertips, argues as follows: in my own Galilean reference frame, the relative speed of the truck was previously ṽ = 0 and is now ṽ′ = 20 mph. To do this, the amount of gas it has guzzled is α times the difference in kinetic energies:

    α (Ẽ′_c − Ẽ_c) = (α/2) m [(ṽ′)² − (ṽ)²] = (α/2) m (400 − 0) = 200 αm,

or around 0.02 gallon.
There is here a clear problem, and one of the two observers must be wrong.
Indeed, the Galilean relativity principle states that all Galilean reference frames
are equivalent, and computing kinetic energy in the patrolmans reference
frame is perfectly legitimate.
How is this paradox resolved?
We will come to the solution, but first here is another problem. The reader,
before going on to read the solutions, is earnestly invited to think and try to
solve the problem by herself.
Second paradox
Consider a highly elastic rubber ball in free fall as we first see it. At some
point, it hits the ground, and we assume that this is an elastic shock.
Most high-school level books will describe the following argument: assume that, at the instant t = 0 when the ball hits the ground, the speed of the ball is v₁ = −10 m s⁻¹. Since the shock is elastic, there is conservation of total energy before and after. Hence the speed of the ball after the rebound is v₂ = −v₁, or simply +10 m s⁻¹ going up.
This looks convincing enough. But it is not so impressive if seen from the point of view of an observer who is also moving down at constant speed v_obs = v₁ = −10 m s⁻¹. For this observer, the speed of the ball before the shock is ṽ₁ = v₁ − v_obs = 0 m s⁻¹, so it has zero kinetic energy. However, after the rebound its speed is ṽ₂ = v₂ − v_obs = +20 m s⁻¹, so its kinetic energy is no longer zero: where did it come from?
To see what is going on, consider a shock involving N point particles, with masses mᵢ and speeds vᵢ before and v′ᵢ after the shock. In a given reference frame, the change of kinetic energy during the shock is

    ΔE_c = (1/2) Σᵢ mᵢ (v′ᵢ² − vᵢ²).
In a second reference frame, with relative speed w with respect to the first, the difference is equal to

    ΔẼ_c = (1/2) Σᵢ mᵢ [(v′ᵢ − w)² − (vᵢ − w)²]
         = (1/2) Σᵢ mᵢ (v′ᵢ² − vᵢ²) − w Σᵢ mᵢ (v′ᵢ − vᵢ)
         = ΔE_c − w · ΔP

(we use a tilde as label for any physical quantity expressed in the new reference frame), so that ΔẼ_c = ΔE_c as long as the total momentum is preserved during the shock, in other words if ΔP = 0.
In the case of the truck and the patrolman, we did not really take the momentum into account. In fact, the truck can accelerate because it pushes back the whole earth behind it!
So, let us take up the computation with a terrestrial mass M, which is large but not infinite. We will take the limit [M → ∞] at the very end of the computation, and more precisely, we will let [M/m → ∞].
At the beginning of the experiment, in the terrestrial reference frame, the speed of the truck is v. At the end of the experiment, the speed is v′.
Earth, on the other hand, has original speed V = 0 and, if one remains in the same Galilean reference frame, final speed V′ = −(m/M)(v′ − v) (because the total momentum of the truck and Earth is conserved¹).
The energy balance can now be redone, taking the Earth into account, both in the terrestrial frame and in the frame of an observer moving at constant speed w:

                      Final speed           E_c init.          E_c final                                     ΔE_c
  Truck               v′                    (1/2) m v²         (1/2) m v′²                                   (m/2)(v′² − v²)
  Earth               −(m/M)(v′ − v)        0                  (m²/2M)(v′ − v)²                              (m²/2M)(v′ − v)²
  Truck (frame at w)  v′ − w                (1/2) m (v − w)²   (1/2) m (v′ − w)²                             (m/2)(v′² − v²) − m(v′ − v) w
  Earth (frame at w)  −(m/M)(v′ − v) − w    (1/2) M w²         (m²/2M)(v′ − v)² + m(v′ − v) w + (1/2) M w²   (m²/2M)(v′ − v)² + m(v′ − v) w

The terms proportional to w carried by the truck and by the Earth cancel, so that in both frames the total change of kinetic energy is

    ΔẼ_c = ΔE_c = (m/2)(v′² − v²) + (m²/2M)(v′ − v)².

Both observers therefore agree on the quantity of gas consumed, and the extra term (m²/2M)(v′ − v)², carried by the Earth, becomes negligible in the limit M/m → ∞.
1
Note that the terrestrial reference frame is then not Galilean, since the Earth started
moving under the truck's impulsion.
(Illustration © Craig Thompson 2001)
The second paradox is resolved in the same manner: the Earth's rebound
energy must be taken into account after the shock with the ball.
The interested reader will find another paradox, relating to optics, in Exercise 1.3 on page 43.
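A quick numerical check of this bookkeeping is easy to set up. The following sketch (plain Python; the masses, speeds, and observer velocity are illustrative values, with a deliberately "small Earth" so that the recoil term is visible in floating point, not the real terrestrial mass) computes the total kinetic-energy change of the truck plus Earth system in two reference frames and shows that the w-dependent terms cancel once the Earth's recoil is included.

```python
# Kinetic-energy balance of the truck + Earth system in two Galilean frames.
m, M = 1.0e4, 1.0e8       # truck and "Earth" masses (illustrative values)
v, vp = 60.0, 80.0        # truck speed before / after acceleration
w = 60.0                  # speed of the second (patrolman's) frame

V, Vp = 0.0, -(m / M) * (vp - v)   # Earth's speed before / after (momentum conservation)

def delta_Ec(frame_speed):
    """Total kinetic-energy change measured in a frame moving at frame_speed."""
    truck = 0.5 * m * ((vp - frame_speed) ** 2 - (v - frame_speed) ** 2)
    earth = 0.5 * M * ((Vp - frame_speed) ** 2 - (V - frame_speed) ** 2)
    return truck + earth

print(delta_Ec(0.0))   # terrestrial frame
print(delta_Ec(w))     # patrolman's frame: same total, up to rounding
```

Removing the `earth` term reproduces the paradox: the truck-only balance then depends on the frame.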
1.1.b

(Illustration © Craig Thompson 2001)
does not move during the experiment. Since Romeo travels the distance L relative to the boat, it is easy to deduce that the boat must cover, in the opposite direction, the distance

    ℓ = m L / (m + M).

In the second case, let x(t) denote the position of the boat and y(t) that of Romeo, relative to the Earth, not to the boat. The equation of movement for the center of gravity of the system is

    M x″ + m y″ = −η x′.

We now integrate on both sides between t = 0 (before Romeo starts moving) and t = +∞. Because of the friction, we know that as [t → +∞], the speed of the boat goes to 0 (hence also the speed of Romeo, since he will have been long immobile with respect to the boat). Hence we have

    ∫₀^{+∞} (M x″ + m y″) dt = 0 = −η [x(+∞) − x(0)],

and since η > 0, the boat comes back exactly to its starting position: x(+∞) = x(0). Hence

    lim_{η→0, η>0} ℓ(η) ≠ ℓ(0).
Fig. 1.3: Potential wall V(x) = V₀ H(x).

If the friction force involves additional (nonlinear) terms, the result is completely different. Hence, if you try to perform this experiment in practice, it will probably not be conclusive, and the boat is not likely to come back to the same exact spot at the end.
a second variable, and the true problem is that we have a double limit which does not commute:

    lim_{x→0} lim_{y→0} f(x, y) ≠ lim_{y→0} lim_{x→0} f(x, y).
The problem considered is that of a quantum particle arriving at a potential wall. We look at a one-dimensional setting, with a potential of the type V(x) = V₀ H(x), where H is the Heaviside function, that is, H(x) = 0 if x < 0 and H(x) = 1 if x > 0. The graph of this potential is represented in Figure 1.3.
A particle arrives from x = −∞ in an energy state E > V₀; part of it is transmitted beyond the potential wall, and part of it is reflected back. We are interested in the reflection coefficient of this wave.
The incoming wave may be expressed, for negative values of x, as the sum of a progressive wave moving in the direction of increasing x and a reflected wave. For positive values of x, we have a transmitted wave in the direction of increasing x, but no component in the other direction. According to the Schrödinger equation, the wave function can therefore be written in the form ψ(x, t) = φ(x) f(t), where

    φ(x) = e^{ikx} + B e^{−ikx}    if x < 0,   with  k ≝ √(2mE) / ħ,
    φ(x) = A e^{ik′x}              if x > 0,   with  k′ ≝ √(2m(E − V₀)) / ħ.
The function f(t) is only a time-related phase factor and plays no role in what follows. The reflection coefficient of the wave is given by the ratio of the currents associated with the reflected and incident waves, and is given by R = 1 − (k′/k) |A|² (see [20, 58]).
Now what is the value of A? To find it, it suffices to write the equations expressing the continuity of φ and φ′ at x = 0. Since φ(0⁺) = φ(0⁻), we have 1 + B = A. And since φ′(0⁺) = φ′(0⁻), we have k(1 − B) = k′A, and we deduce that A = 2k/(k + k′). The reflection coefficient is therefore equal to

    R = 1 − (k′/k) |A|² = [ (k − k′)/(k + k′) ]² = [ (√E − √(E − V₀)) / (√E + √(E − V₀)) ]².        (1.1)
Here comes the surprise: this expression (1.1) is independent of ħ. In particular, the limit as [ħ → 0] (which defines the classical limit) yields a nonzero reflection coefficient, although we know that in classical mechanics a particle with energy E does not reflect against a potential wall with value V₀ < E!² So, displaying explicitly the dependency of R on ħ, we have:

    lim_{ħ→0, ħ≠0} R(ħ) ≠ 0 = R(0).
In fact, we have gone a bit too fast. We must take into account the physical aspects of this story: the classical limit is certainly not the same as brutally writing ħ → 0. Since Planck's constant is, as the name indicates, just a constant, this makes no sense. To take the limit ħ → 0 means that one arranges for the quantum dimensions of the system to be much smaller than all other dimensions. Here the quantum dimension is determined by the de Broglie wavelength of the particle, that is, λ = ħ/p. What are the other lengths in this problem? Well, there are none! At least, the way we phrased it: because in fact, expressing the potential by means of the Heaviside function is rather cavalier. In reality, the potential must be continuous. We can replace it by an infinitely differentiable potential such as V(x) = V₀/(1 + e^{−x/a}), which increases, roughly speaking, on an interval of size a > 0 (see Figure 1.4). In the limit where a → 0, the discontinuous Heaviside potential reappears.
Computing the reflection coefficient with this potential is done similarly, but of course the computations are more involved. We refer the reader to [58, chapter 25]. At the end of the day, the reflection coefficient is found to depend not only on ħ, but also on a, and is given by

    R(ħ, a) = [ sinh(πa(k − k′)) / sinh(πa(k + k′)) ]²

(ħ appears in the definitions k = √(2mE)/ħ and k′ = √(2m(E − V₀))/ħ). We then see clearly that for fixed nonzero a, the de Broglie wavelength of the particle may become infinitely small compared to a, and this defines the correct classical limit. Mathematically, we have, for any fixed a ≠ 0,

    lim_{ħ→0, ħ≠0} R(ħ, a) = 0        (classical limit),

whereas for fixed ħ ≠ 0, letting a → 0 brings back the nonzero value (1.1): the limits ħ → 0 and a → 0 cannot be exchanged.
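The non-commutation of the two limits is easy to see numerically. The following sketch (Python with NumPy; the values of E, V₀, m, a, and ħ are arbitrary numerical units, and the formula for R is the smooth-step expression quoted above) evaluates R for a fixed smoothing length a as ħ decreases, and for fixed ħ as a decreases.

```python
import numpy as np

E, V0, m = 2.0, 1.0, 1.0          # arbitrary units, with E > V0

def R(hbar, a):
    """Reflection coefficient for the smooth step V0 / (1 + exp(-x/a))."""
    k  = np.sqrt(2 * m * E) / hbar
    kp = np.sqrt(2 * m * (E - V0)) / hbar
    return (np.sinh(np.pi * a * (k - kp)) / np.sinh(np.pi * a * (k + kp))) ** 2

# Sharp-wall value, equation (1.1).
R_sharp = ((np.sqrt(E) - np.sqrt(E - V0)) / (np.sqrt(E) + np.sqrt(E - V0))) ** 2

for hbar in [1.0, 0.3, 0.1, 0.03]:
    print(f"a = 0.5, hbar = {hbar:5.2f}:  R = {R(hbar, 0.5):.3e}")   # tends to 0
for a in [1.0, 0.1, 0.01, 0.001]:
    print(f"hbar = 1, a = {a:6.3f}:  R = {R(1.0, a):.3e}")           # tends to R_sharp
print("sharp-wall value (1.1):", float(R_sharp))
```

The first loop illustrates the classical limit (R → 0 at fixed a), the second the return of the sharp-wall value (1.1) at fixed ħ; taking the two limits in different orders gives different answers.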
[Figure: an infinite ladder of identical cells, each made of an inductor L and capacitors 2C, between terminals A and B]

(the capacitors of two successive cells in series are equivalent to one capacitor with capacitance C). We want to know the total impedance of this circuit.
First, consider instead a circuit made with a finite sequence of n elementary cells, and let Zₙ denote its impedance. Kirchhoff's laws imply the following recurrence relation between the values of Zₙ:

    Z_{n+1} = 1/(2iCω) + [ iLω (1/(2iCω) + Z_n) ] / [ iLω + 1/(2iCω) + Z_n ],        (1.2)

where ω is the angular frequency. In particular, note that if Zₙ is purely imaginary, then so is Z_{n+1}. Since Z₁ is purely imaginary, it follows that

    Z_n ∈ iR    for all n ∈ N.
We don't know if the sequence (Zₙ)ₙ∈N converges. But one thing is certain: if this sequence (Zₙ)ₙ∈N converges to some limit, this must be purely imaginary (the only possible real limit is zero).
Now, we compute the impedance Z of the infinite circuit, noting that this circuit between A and B is strictly equivalent to the same circuit with one additional cell attached in front, so that

    Z = 1/(2iCω) + [ iLω (1/(2iCω) + Z) ] / [ iLω + 1/(2iCω) + Z ],        (1.3)

whence

    Z² = L/C − 1/(4C²ω²).
We must therefore distinguish two cases:
• If ω < ω_c = 1/(2√(LC)), we have Z² < 0 and hence Z is purely imaginary, of the form

    Z = ± i √( 1/(4C²ω²) − L/C ).

Remark 1.2 Mathematically, there is nothing more that can be said, and in particular there
• If ω > ω_c = 1/(2√(LC)), then Z² > 0 and Z is therefore real:

    Z = ± √( L/C − 1/(4C²ω²) ).
Remark 1.3 Here also the sign of Z can be determined by physical arguments. The real part of an impedance (the resistive part) is always non-negative in the case of a passive component, since it accounts for the dissipation of energy by the Joule effect. Only active components (such as operational amplifiers) can have negative resistance. Thus, the physically acceptable solution of equation (1.3) is

    Z = + √( L/C − 1/(4C²ω²) ).
In this last case, there seems to be a paradox, since Z cannot be the limit as n → +∞ of (Zₙ)ₙ∈N. Let's look at this more closely.
From the mathematical point of view, Equation (1.3) expresses nothing but the fact that Z is a fixed point for the induction relation (1.2). In other words, this is the equation we would have obtained from (1.2), by continuity, if we had known that the sequence (Zₙ)ₙ∈N converges to a limit Z. However, there is no reason for the sequence (Zₙ)ₙ∈N to converge.
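One can check this numerically. The sketch below (plain Python with complex arithmetic; the component values L, C, the starting impedance Z₁, and the two test frequencies are arbitrary illustrations) iterates the recurrence (1.2). The iterates stay purely imaginary, so above the cutoff ω_c they can never approach the real fixed point of (1.3).

```python
import cmath

L, C = 1.0, 1.0
omega_c = 1.0 / (2.0 * (L * C) ** 0.5)

def next_Z(Z, omega):
    """One step of the recurrence (1.2)."""
    a = 1.0 / (2j * C * omega)        # impedance of the 2C capacitor
    b = 1j * L * omega                # impedance of the inductor
    return a + b * (a + Z) / (a + b + Z)

for omega in (0.8 * omega_c, 1.5 * omega_c):
    Z = 1.0 / (2j * C * omega)        # illustrative starting impedance Z_1
    for _ in range(60):
        Z = next_Z(Z, omega)
    Z_fixed = cmath.sqrt(L / C - 1.0 / (4 * C**2 * omega**2))
    print(f"omega/omega_c = {omega/omega_c:.1f}:  Z_60 = {Z:.4f},  fixed point = {Z_fixed:.4f}")
```

Above ω_c the printed iterate is still purely imaginary while the fixed point of (1.3) is real, which is the mathematical content of the apparent paradox.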
Remark 1.4 From the physical point of view, the behavior of this infinite chain is rather surprising. How does resistive behavior arise from purely inductive or capacitative components? Where does energy dissipate? And where does it go?
In fact, there is no dissipation of energy in the sense of the Joule effect, but energy does
disappear from the point of view of an operator holding points A and B. More precisely,
one can show that there is a flow of energy propagating from cell to cell. So at the beginning
of the circuit, it looks like there is an energy sink with fixed power consumption. Still, no
energy disappears: an infinite chain can consume energy without accumulating it anywhere.3
In the regime considered, this infinite chain corresponds to a waveguide.
We conclude this first section with a list of other physical situations where the problem of noncommuting limits arises:
• taking the classical (nonquantum) limit, as we have seen, is by no means a trivial matter; in addition, it may be in conflict with a nonrelativistic limit (see, e.g., [6]), or with a low temperature limit;
• in plasma thermodynamics, the limit of infinite volume (V → ∞) and the nonrelativistic limit (c → ∞) are incompatible with the thermodynamic limit, since a characteristic time of return to equilibrium is V^{1/3}/c;
• in the classical theory of the electron, it is often said that such a classical electron, with zero naked mass, rotates too fast at the equator (200 times the speed of light) for its magnetic moment and renormalized mass to conform to experimental data. A more careful calculation by Lorentz himself⁴ gave about 10 times the speed of light at the equator. But in fact, the limit [m → 0⁺] requires care, and if done correctly, it imposes a limit [v/c → 1⁻] to maintain a constant renormalized mass [7];
1.2  Sequences

1.2.a

Let E be a normed vector space and (u_n)_{n∈N} a sequence of elements of E, and let ℓ ∈ E. The sequence (u_n)_{n∈N} converges to ℓ if, for any ε > 0, there exists an index starting from which u_n is at most at distance ε from ℓ:

    ∀ε > 0   ∃N ∈ N   ∀n ∈ N     n ≥ N  ⇒  ‖u_n − ℓ‖ < ε.

This is denoted lim_{n→∞} u_n = ℓ, or u_n → ℓ as n → ∞.
A sequence (u_n)_{n∈N} of real numbers tends to +∞ (resp. to −∞) if, for any M ∈ R, there exists an index N, starting from which all elements of the sequence are larger than M:

    ∀M ∈ R   ∃N ∈ N   ∀n ∈ N     n ≥ N  ⇒  u_n > M     (resp. u_n < M).

Similarly, a sequence (z_n)_{n∈N} of complex numbers tends to infinity if

    ∀M ∈ R   ∃N ∈ N   ∀n ∈ N     n ≥ N  ⇒  |z_n| > M.
Remark 1.8 There is only one direction to infinity in C. We will see a geometric interpretation of this fact in Section 5.4 on page 146.
Remark 1.9 The strict inequalities ‖u_n − ℓ‖ < ε (or |z_n| > M) in the definitions above (which,

As an example: how should one prove that the sequence (u_n)_{n∈N}, with

    u_n = Σ_{p=1}^{n} (−1)^{p+1} / p⁴,

converges?

or, equivalently, if

    ∀ε > 0   ∃N ∈ N   ∀p, k ∈ N     p ≥ N  ⇒  ‖u_{p+k} − u_p‖ < ε.
u1 = 3.1
u2 = 3.14
u3 = 3.141
u4 = 3.1415
u5 = 3.14159
(you can guess the rest ...). This is a sequence of rationals, which is a Cauchy sequence (the
distance between u_p and u_{p+k} is at most 10⁻ᵖ). However, it does not converge in Q, since its
limit (in R!) is π, which is a notoriously irrational number.
The space Q is not nice in the sense that it leaves a lot of room for Cauchy sequences to
exist without converging in Q. The mathematical terminology is that Q is not complete.
Proof.
First case: It is first very simple to show that a Cauchy sequence (un )nN of real numbers is bounded. Hence, according to the Bolzano-Weierstrass theorem (Theorem A.41,
page 581), it has a convergent subsequence. But any Cauchy sequence which has a
convergent subsequence is itself convergent (its limit being that of the subsequence),
see Exercise 1.6 on page 43. Hence any Cauchy sequence in R is convergent.
Second case: Considering C as a normed real vector space of dimension 2, we can
suppose that the base field is R.
Consider a basis B = (b 1 , . . . , b d ) of the vector space E . Then we deduce that E is
complete from the case of the real numbers and the following two facts: (1) a sequence
(x_n)_{n∈N} of vectors, with coordinates (x_n^1, . . . , x_n^d) in B, converges in E if and only if
each coordinate sequence (x_n^k)_{n∈N} converges in R; and (2), if a sequence is a Cauchy sequence, then
each coordinate is a Cauchy sequence.
norm ‖f‖²_{L²} ≝ ∫_R |f|², is complete (see Chapter 9). This infinite-dimensional space is used very frequently in quantum mechanics.
Here is an important example of the use of the Cauchy criterion: the fixed
point theorem.
THEOREM 1.18 (Banach fixed point theorem) Let E be a complete normed vec-
and this proves that the sequence (u_n)_{n∈N} is a Cauchy sequence. Since the space E is complete, this sequence has a limit a ∈ E. Since U is closed, we have a ∈ U.
Now from the continuity of f and the relation u_{n+1} = f(u_n), we deduce that a = f(a). So this a is a fixed point of f. If b is an arbitrary fixed point, the inequality ‖a − b‖ = ‖f(a) − f(b)‖ ≤ ρ ‖a − b‖ (where ρ < 1 is the contraction constant) proves that ‖a − b‖ = 0 and thus a = b, showing that the fixed point is unique.
Remark 1.19 Here is one reason why Banach's theorem is very important. Suppose we have a normed vector space E and a map g : E → E, and we would like to solve an equation g(x) = b. This amounts to finding the fixed points of f(x) = g(x) + x − b, and we can hope that f may be a contraction, at least locally. This happens, for instance, in the case of the Newton method, if the function used is nice enough, and if a suitable (rough) approximation of a zero is known.
This is an extremely fruitful idea: one can prove this way the Cauchy-Lipschitz theorem concerning existence and uniqueness of solutions to a large class of differential equations; one can also study the existence of certain fractal sets (the von Koch snowflake, for instance), certain stability problems in dynamical systems, etc.
Not only does it follow from Banach's theorem that certain equations have solutions (and even better, unique solutions!), but the proof provides an effective way to find this solution by successive approximations: it suffices to fix u_0 arbitrarily, and to define the sequence (u_n)_{n∈N} by means of the recurrence formula u_{n+1} = f(u_n); then we know that this sequence converges to the fixed point a of f. Moreover, the convergence of the sequence of approximations is exponentially fast: the distance from the approximate solution u_n to the (unknown) solution a decays as fast as ρⁿ. An example (the Picard iteration) is given in detail in Problem 1 on page 46.
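Here is a minimal sketch of the method of successive approximations (plain Python). The map f(x) = cos x on [0, 1] is a hypothetical example, not taken from the text; it is chosen because it is a contraction there (its derivative is bounded by sin 1 < 1), so the iterates converge geometrically to the unique fixed point.

```python
import math

def fixed_point(f, x0, tol=1e-12, max_iter=1000):
    """Successive approximations u_{n+1} = f(u_n), as in the Banach fixed point theorem."""
    x = x0
    for n in range(max_iter):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next, n + 1
        x = x_next
    raise RuntimeError("no convergence within max_iter")

a, n_steps = fixed_point(math.cos, 0.5)
print(f"fixed point of cos: {a:.12f}  (reached after {n_steps} iterations)")
print("check: f(a) - a =", math.cos(a) - a)
```

The same driver works for any contraction f, which is exactly how the Picard iteration of Problem 1 proceeds.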
    x₁₁   x₁₂   x₁₃   · · ·   →   A₁
    x₂₁   x₂₂   x₂₃   · · ·   →   A₂
    x₃₁   x₃₂   x₃₃   · · ·   →   A₃
     ↓     ↓     ↓
    B₁    B₂    B₃
The question is now whether the sequences (An )nN and (Bk )kN themselves
converge, and if that is the case, whether their limits are equal. In general, it
turns out that the answer is No. However, under certain conditions, if one
sequence (say (An )) converges, then so does the other, and the limits are the
same.
DEFINITION 1.20 A double sequence (x_{n,k})_{n,k} converges uniformly with respect to k to a sequence (B_k)_{k∈N} as n → ∞ if

    ∀ε > 0   ∃N ∈ N   ∀n ∈ N   ∀k ∈ N     n ≥ N  ⇒  |x_{n,k} − B_k| < ε.

In other words, there is convergence with respect to n for fixed k, but in such a way that the speed of convergence is independent of k; or one might say that all values of k are similarly behaved.
ditions hold:
1.2.e

Let f be a function defined on K with values in K′ (a normed vector space), let a ∈ K, and let ℓ ∈ K′. Then f has the limit ℓ, or tends to ℓ, at the point a if we have

    ∀ε > 0   ∃δ > 0   ∀z ∈ K     |z − a| < δ  ⇒  |f(z) − ℓ| < ε.
There are also limits at infinity and infinite limits, defined similarly. One may moreover define the limit of f at the point a with the value z = a itself excluded:

    ∀ε > 0   ∃δ > 0   ∀z ∈ K     (z ≠ a and |z − a| < δ)  ⇒  |f(z) − ℓ| < ε.

This is denoted

    ℓ = lim_{z→a, z≠a} f(z).
This definition has the advantage of being practically identical to the definition of convergence at infinity. It is often better adapted to the physical
description of a problem, as seen in Examples 1.1.b and 1.1.c on page 5 and
the following pages. A complication is that it reduces the applicability of the
theorem of composition of limits.
THEOREM 1.25 (Sequential characterization of limits) Let f : K → K′ be a

1.2.f  Sequences of functions
DEFINITION 1.26 (Simple convergence) Let (f_n)_{n∈N} be a sequence of functions, all defined on the same set X, which may be arbitrary. Then the sequence (f_n)_{n∈N} converges simply (or: pointwise on X) to a function f defined on X if, for any x in X, the sequence (f_n(x))_{n∈N} converges to f(x). This is denoted

    f_n → f    (cv.s.)

DEFINITION 1.27 (Uniform convergence) Let (f_n)_{n∈N} be a sequence of functions, all defined on the same set X, which may be arbitrary. Then the sequence (f_n)_{n∈N} converges uniformly to the function f if

    ∀ε > 0   ∃N ∈ N   ∀n ∈ N     n ≥ N  ⇒  ‖f_n − f‖_∞ < ε.

This is denoted

    f_n → f    (cv.u.)
Consider the sequence of functions f_n : [0, 1] → R defined by

    f_n(x) = n x        if x ∈ [0, 1/n],
    f_n(x) = 2 − n x    if x ∈ [1/n, 2/n],
    f_n(x) = 0          if x ∈ [2/n, 1].

The reader will have no trouble proving that (f_n)_{n≥1} converges pointwise to the zero function. However, the convergence is not uniform, since we have ‖f_n − f‖_∞ = 1 for all n ≥ 1.
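The claim is easy to confirm numerically. The sketch below (Python with NumPy; the grid size and the sample point x = 0.05 are arbitrary choices) evaluates f_n on a fine grid of [0, 1]: the value at each fixed x tends to 0, while the supremum stays equal to 1.

```python
import numpy as np

def f_n(x, n):
    """Triangle function: n*x on [0, 1/n], 2 - n*x on [1/n, 2/n], 0 on [2/n, 1]."""
    x = np.asarray(x, dtype=float)
    return np.where(x <= 1/n, n * x, np.where(x <= 2/n, 2 - n * x, 0.0))

grid = np.linspace(0.0, 1.0, 100_001)
for n in (10, 100, 1000):
    at_fixed_x = float(f_n(0.05, n))        # pointwise behavior: tends to 0
    sup_norm   = float(f_n(grid, n).max())  # uniform behavior: stays equal to 1
    print(f"n = {n:5d}:  f_n(0.05) = {at_fixed_x:.3f},  sup |f_n| = {sup_norm:.3f}")
```

The "bump" of height 1 simply moves toward 0 as n grows, which is exactly why pointwise convergence does not force the sup norm to shrink.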
Example 1.30 The sequence of functions f_n : R → R defined for n ≥ 1 by

    f_n : x ↦ sin(x + x/n)

converges uniformly to f : x ↦ sin x on the interval [0, 2π], and in particular it converges pointwise on this interval. However, although the sequence converges pointwise to the sine function on R, the convergence is not uniform on all of R. Indeed, for n ≥ 1, we have

    f_n(nπ/2) = sin(nπ/2 + π/2)    and    f(nπ/2) = sin(nπ/2),

and those two values differ by 1 in absolute value. However, one can check that the convergence is uniform on any bounded segment in R.

Exercise 1.1 Let g(x) = e^{−x²} and f_n(x) = g(x − n). Does the sequence (f_n)_{n∈N} converge
Remark 1.31 In the case where the functions fn are defined on a subset of R with finite
measure (for instance, a finite segment), a theorem of Egorov shows that pointwise convergence
implies uniform convergence except on a set of arbitrarily small measure (for the definitions,
see Chapter 2).
Remark 1.32 There are other ways of defining the convergence of a sequence of functions. In
particular, when some norm is defined on a function space containing the functions fn , it is
possible to discuss convergence in the sense of this norm. Uniform convergence corresponds
to the case of the ‖·‖_∞ norm. In Chapter 9, we will also discuss the notion of convergence in
quadratic mean, or convergence in L² norm, and convergence in mean or convergence in L¹ norm.
In pre-Hilbert spaces, there also exists a weak convergence, or convergence in the sense of scalar
product (which is not defined by a norm if the space is infinite-dimensional).
In other words:
vector space, it is possible to weaken the assumptions by asking, instead of (i), that the sequence
( fn (x0 )) converges at a single point x0 I . Assumption (ii) remains identical, and the conclusion is the same: ( fn )nN converges uniformly to a differentiable function with derivative equal
to g.
Counterexample 1.37 The sequence of functions given by

    f_n : x ↦ (8/π) Σ_{k=1}^{n} sin²(kx) / (4k² − 1)
converges uniformly to the function f : x 7 |sin x| (see Exercise 9.3 on page 270), but the
sequence of derivatives does not converge uniformly. The previous theorem does not apply,
and indeed, although each fn is differentiable at 0, the limit f is not.
Remark 1.38 It happens naturally in some physical situations that a limit of a sequence of
functions is not differentiable. In particular, in statistical thermodynamics, the state functions
of a finite system are smooth. However, as the number of particles grows to infinity, discontinuities in the state functions or their derivatives may appear, leading to phase transitions.
10
See Definition 4.52 on page 106; the simplest example is D = ]a, b ] and x0 = a.
22
lim
fn (x) dx =
f (x) dx.
Example 1.40 This theorem is very useful, for instance, when dealing with a power series
expansion which is known to converge uniformly
open disc of convergence (see TheoP on the
n
rem 1.66 on page 34). So, if we have f (x) =
n=0 a n x for |x| < R, then for any x such that
|x| < R, we deduce that
Zx
X
an
x n+1 .
f (s) ds =
n
+1
0
n=0
Series
23
1.3
Series
1.3.a
ues in a normed vector space. Let (Sn )nN denote the sequence of partial
sums
n
def P
ak .
Sn =
k=0
The series
a n converges, and its sum is equal to A if the sequence
(Sn )nN converges to A. This is denoted
n=0
The series
in R.
an = A.
kan k converges
P
In particular, a series
P an of real or complex numbers converges absolutely if the series |an | converges in R.
or in other words:
lim
p,q
un = 0.
pnq
Conversely, any series which satisfies the Cauchy criterion and takes values in R,
C, any finite-dimensional normed vector space, or more generally, any complete normed
vector space, converges.
12
This criterion was stated by Bernhard Bolzano (see page 581) in 1817. But Bolzano was
isolated in Prague and little read. Cauchy presented this criterion, without proof and as an
obvious fact, in his analysis course in 1821.
24
kun k ,
n
n=0
n=0
nothing can be deduced from this, because the right-hand side does not tend to zero.
But we can use the Cauchy critetion: for all p, q N, we have of course
X
X
q
q
u
kun k ,
n
n= p
n= p
P
and since
kun k satisfies the Cauchy criterion, so does
P un . Since un lies in a
complete space by assumption, this means that the series un is indeed convergent.
In the theory of Fourier series, we will have to deal with formulas of the type
Z1
+
X
f (t)2 dt =
|cn |2 .
0
n=
To give a precise meaning to the right-hand side, we must clarify the meaning
of the convergence of a series indexed by integers in Z instead of N.
P
DEFINITION 1.47 A doubly
series nZZ a n , with an is a normed
P infinite P
vector space, converges if
an and
an are both convergent, the index
ranging over N in each case. Then we denote
+
X
n=
def
an =
P
an +
n=0
nZ a n .
X
n=1
an
P
In other words, a series of complex numbers nZ an converges to if and
only if, for any > 0, there exists N > 0 such that
X
j
for any i N and j N ,
an < .
n=i
dent. In particular,
P if the limit of
infinite series nZ an converges.
n=k
Series
25
P
For instance, take an = 1/n for n 6= 0 and a0 = 0. Then we have kn=kP
an = 0
for all k (and so this sequence does converge as k tends to infinity),
but
the
series
nZ a n
P
P
diverges according to the definition, because each of the series an and an (over n 0) is
divergent.
1.3.c
ai j =
i=1 j=1
q X
p
X
ai j ,
j=1 i=1
since each sum is finite. On the other hand, even if all series involved are
convergent, it is not always the case that
X
X
ai j
X
X
and
i=1 j=1
ai j ,
j=1 i=1
1 1
0
0
0 1 1
0
(a i j ) = 0 0
1 1
..
..
..
..
.
.
.
.
0 ...
0 . . .
0 . . .
..
.
0
0
0
..
.
where we have (note that i is the row index and j is the column index):
X
ai j = 0
but
i=1 j=1
1 1
0
0
0
0 2 2
0
0
(a i j ) = 0 0
3 3
0
..
..
..
.. ..
.
.
.
.
.
in which case
X
X
i=1 j=1
ai j =
X
i=1
a i j = 1.
j=1 i=1
0=0
but
putting
X
X
j=1 i=1
0 ...
0 . . .
0 . . .
..
.
ai j =
X
j=1
1 = +.
26
an with an in a normed vector space is conditionnally convergent if it is convergent but not absolutely convergent.
Series
27
is larger than ; call this sum S1 . Now add to S1 all consecutive values of n until
the resulting sum S1 + 1 + is smaller than ; call this sum S2 . Then start again
adding from the remaining values of n until getting a value larger than , called S3 ,
and continue in this manner until the end of time.
Now notice that:
Since at each step we add at least one value of or one of , it is clear that
all values of will be used sooner or later, as well as all values of , that is,
when all is said and done, all values of an will have been involved in one of the
sums Sn .
Since, at each step, the distance | Sn | is at most equal to the absolute value
of the last value of or considered, the distance from Sn to tends to 0 as n
tends to infinity.
From this we deduce that
P the sequence (Sn ) is a sequence of partial sums of a
rearrangement of the series an , and that it converges to . Hence this proves that by
simply changing the order of the terms, one may cause the series to converge to an arbitrary sum.
Let now a, b R with a < b (the case a = b being the one already considered).
If a and b are both finite, we can play the same game of summation as before,
but this time, at each step, we either sum values of n until we reach a value
larger than b , or we sum values of n until the value is less than a.
Example 1.53 Consider the sequence (an )nN with general term an = (1)n+1
P/n. It follows
from the theory of power series (Taylor expansion of log(1 + x)) that the series an converges
and has sum equal to log 2. If we sum the same values an by taking one positive term followed
by two negative terms, then the resulting series converges to 12 log 2. Indeed, if (Sn )nN and
(S n )nN denote the sequence of partial sums of the original and modified series, respectively,
then for n N we have
1 1 1
1
1
S2n = 1 + + +
2 3 4
2n 1 2n
and
1 1 1 1 1
1
1
1
S3n
= + + +
2
4
3
6
8
2n
1
4n
2
4n
|{z}
| {z }
|
{z
}
=
1 1 1 1
1
1
1
+ + +
= S2n .
2 4 6 8
4n 2 4n
2
As an exercise, the reader can check that if one takes instead two positive terms followed by
one negative terms, the resulting series converges with a value equal to 32 log 2.
The following result shows that, on the other hand, one can rearrange at
will the order of the terms of an absolutely convergent series.
THEOREM 1.54 A series of complex numbers is commutatively convergent if and only
if it is absolutely convergent.
28
k
k=1
for n N . For each such n N , there exists N such that the set {(1), . . . , (N )}
contains {1, . . . , n} (it suffices that N be larger than the maximum of the images of 1,
. . . , N by the inverse permutation 1 ). Then for any m N , we have
X
X
X
X
m
n
+
a
|ak | +
|ak |,
(k)
k
k=1
k=1
k>n
k>n
since the set {(1), . . . , (m)} contains {1, . . . , n}, and possibly additional values which
are all larger than n. The absolute convergence makes its P
appearance now: the last sum
on the right is the remainder for the convergent series |ak |, and for n N it is
therefore itself smaller than . Since, given , we can take n = N and find the value
N from it, such that
X
m
2
a
(k)
k=1
for m N , and so we have proved that the rearranged series converges with sum equal
to .
If the terms of the series are complex numbers, it suffices to apply the result for real
series to the series of real and imaginary parts.
13
Peter Gustav Lejeune-Dirichlet showed in 1837 that a convergent series with non-negative
terms is commutatively convergent. In 1854, Bernhard Riemann wrote three papers in order to
obtain a position at the university of Gttingen. In one of them, he describes commutatively
convergent series in the general case. However, another paper was selected, concerning the
foundations of geometry.
Series
29
N N
n N
n
P
n N =
fk (x) F (x)
< .
k=1
P
The function F is called the pointwise, or simple, limit of the series
fn ,
P cv.s.
and this is denoted
fn F .
> 0 N N
x X
n N
n
P
fk (x) F (x)
n N =
< .
k=1
cv.u.
This is denoted
fn F . This amounts to
n
P
lim
fk F
=0
where
kgk = sup
g(x)
.
n
k=1
xX
THEOREM 1.59 Any absolutely convergent series with values in a complete normed
vector space is uniformly convergent, and hence pointwise convergent.
30
fn .
n=0
1.4
Power series, analytic functions
Quite often, physicists encounter series expansions of some function.
These expansions may have different origins:
the superposition of many phenomena (as in the Fabry-Perot interferometer);
perturbative expansions, when exact computations are too difficult to
perform (e.g., hydrodynamics, semiclassical expansions, weakly relativistic expansions, series in astronomy, quantum electrodynamics, etc.);
sometimes the exact evalution of a function which expresses some physical quantity is impossible; a numerical evaluation may then be performed using Taylor series expansions, Fourier series, infinite product
expansions, or asymptotic expansions.
We first recall various forms of the Taylor formula. The general idea is
that there is an approximate expression
(x a)2
(x a)k (k)
f (a) + +
f (a)
2!
k!
for a function f which is at least k times differentiable
on an interval J , with
values in some normed vector space E , kk , and for a given point a J ,
where x lies is some neighborhood of a.
The question is to make precise the meaning of the symbol !
Define Rk (x) to be the difference between f (x) and the sum on the righthand side of the above expression; in other words, we have
f (x) f (a) + (x a) f (a) +
f (x) =
k
X
(x a)n
n=0
n!
31
2
2
f (a) +
without considering the issue of convergence. He was also interested in the physical and mathematical aspects of vibrating strings.
1.4.a
Taylor formulas
k
X
(x a)n
n=0
n!
f (n) (a) +
(x a)k+1 (k+1)
f
a + (x a) .
(k + 1)!
Remark 1.63 This formula is only valid for real-valued functions. However, the following
corollary is also true for functions with complex values, or functions with values in a normed
vector space.
32
f (x)
k
X
(x a)n
n!
n=0
1.4.b
f (n) (a) = o (x a)k .
xa
k
X
(1)n 2n+1
x
+ Rk (x),
2n + 1
n=0
for the Taylor formula of order 2n + 1 (notice that only odd powers of x
appear, because the inverse tangent function is odd).
If we represent graphically those polynomials with k = 0, 1, 4, 36 (i.e., of
order 1, 5, 9, and 18, respectively), with the graph of the function itself for
comparison, we obtain the following:
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
order 9
order 37
order 1
order 5
1
x
33
for a fixed real number x [1, 1] (for instance 0.8), the values at x of
the Taylor polynomial of increasing degree get closer and closer to the
value of the function as k increases;
on the other hand, for a real number x such that |x| > 1, disaster
strikes: the larger k is, the further away to arctan x is the value of the
Taylor polynomial!
The first observation is simply a consequence of the Taylor-Young formula.
The other two deserve more attention. It seems that the sequence (Tk )kN of
the Taylor polynomials converges on [1, 1] and diverges outside.14 However,
the function arctan is perfectly well-defined, and very regular, at the point x = 1; it
does not seem that anything special should happen there. In fact, it is possible
to write down the Taylor expansion centered at a = 1 instead of a = 0 (this
is a somewhat tedious computation15), and (using approximations of the same
order as before), we obtain the following graphs:
2
To be honest, it is difficult to ascertain from the graphs above if the interval to consider is
[1, 1] or ]1, 1[, for instance. The general theory of series shows that (Tn (x))nN converges
quickly if |x| < 1 and very slowly if |x| = 1.
15
The n-th coefficient of the polynomial is (1)n+1 sin(n/4)2n/2 /n and the constant term
is /4.
16
This notion of integral on a path is defined by the formula (4.2) page 94.
17
This is a consequence of the residue theorem 4.81 on page 115.
34
at a converges on the open disc centered at a with radius equal to the distance
from a to the closest singulirity (hence the radius isp|1 0| = 1 in the first
case of Taylor expansions at a = 0, and is |1 i| = 2 in the second case).
X
n=0
an (z z0 )n ,
where (an )nN is a given sequence of real or complex numbers, which are
sometimes called the coefficients of the power series.
P
THEOREM 1.66 (Radius of convergence) Let
an (z z0 )n be a power series
+
centered at z0 . The radius of convergence is the element in R defined by
def
R = sup t R+ ; (an t n )nN is bounded .
The power series converges absolutely and uniformly on any compact subset in the disc
def
B(z0 ; R) = z C ; |z z0 | < R , in the complex plane C, and it diverges for
any z C such that |z| > r. For |z| = r, the series may be convergent, conditionally
convergent, or divergent at z.
Note that absolute convergence here refers to absolute convergence as a
series of functions, which is stronger than absolute convergence for every z
involved: in other words, for any compact set D B(z0 ; R), we have
n=0 zD
n
n=1 z /n converges for any z C such that
|z| < 1 and diverges if |z| > 1 (the radius of convergence is R = 1). Moreover, this series is
divergent at z = 1, but conditionnally convergent at z = 1 (by the alternate series test), and
more generally, it is conditionnally convergent at z = e i for any
/ 2Z (this can be shown
using the Abel transformation, also known as summation by parts).
z V
f (z) =
X
n=0
an (z z0 )n .
35
P
P
function with no pole in N, the power series
F (n) an z n and an z n have
the same radius of convergence. From this and Theorem 1.60, it follows in
particular that a power series can be differentiated term by term inside the
disc of convergence:
THEOREM 1.69 (Derivative of a power series) Let J be an open subset of R,
n=0
an (x x0 )n .
Let R > 0 be the radius of convergence of this power series. Then f is infinitely differentiable on the open interval ]x0 R, x0 + R[, and each derivative has a power series
expansion on this interval, which is obtained by repeated term by term differentiation,
that is, we have
f (x) =
X
n=1
nan (x x0 )n1
f (k) (x) =
and
X
n=k
n!
a (x x0 )nk
(n k)! n
for any k N. Hence the n-th coefficient of the power series f (x) can be expressed as
an =
f (n) (x0 )
.
n!
f (n) (x0 )/n! (x x0 )n is the Taylor series of f at x0 . On
any compact subset inside the open interval of convergence, it is the uniform limit of the
sequence of Taylor polynomials.
Remark 1.71 In Chapter 4, this result will be extended to power series of one complex variable
1.4.d
Analytic functions
Consider a function that may be expended into a power series in a neighborhood V of a point z0 , so that for z V , we have
P
f (z) =
an (z z0 )n .
n=0
36
Note that the radius of convergence of the power series may (and often
does!) vary with the point z0 .
P
THEOREM 1.73 Let
an z n be a power series with positive radius of convergence
R > 0, and let f denote the sum of this power series on B(0 ; R). Then the function
f is analytic on B(0 ; R).
Example 1.74 The function f : x 7 1/(1 x) has the power series expansion
f (x) =
xn
n=0
around 0, with radius of convergence equal to 1. Hence, for any x0 ]1, 1[, there exists a
power series expansion centered at x0 (obviously with different coefficients). This can be made
explicit: let h B(0 ; |1 x0 |), then with x = x0 + h, we have
f (x) = f (x0 + h) =
X
1
1
1
(x x0 )n
=
=
.
1 (x0 + h)
1 x0 1 h/(1 x0 )
(1 x0 )n+1
n=0
open subset U of R. Under what conditions is f analytic?18 There are two obvious necessary
conditions:
f is infinitely differentiable on U ;
for any x0 U , there exists an open disc B(x0 , r ) such that the series
x0 )n converges for any x B(x0 , r ).
1
n!
However, those two conditions are not sufficient. The following classical counter-example
shows this: let
def
f (x) = exp 1/x 2
if x 6= 0,
f (0) = 0.
It may be shown19 that f is indeed of C and that each derivative of f at 0 vanishes, which
ensures (!) the convergence of the Taylor series everywhere. But since the function vanishes
only at x = 0, it is clear that the Taylor series does not converge to f on any open subset,
hence f is not analytic.
It is therefore important not to use the terminology analytic where infinitely differentiable is intended. This is a confusion that it still quite frequent in scientific literature.
The Taylor formulas may be used to prove that a function is analytic. If the sequence
(Rn )nN of remainders for a function f converges uniformly to 0 on a neighborhood of a R,
then the function f is analytic on this neighborhood. To show this, one may use the integral
expression of the remainder terms in the Taylor formula. A slightly different but useful
approach is to prove that both the function under consideration and its Taylor series (which
must be shown to have positive radius of convergence) satisfy the same differential equation,
with the corresponding initial conditions; then f is analytic because of the unicity of solutions
to a Cauchy problem.
Also, it is useful to remember that if f and g have power series expansions centered at z0 ,
then so do f + g and f g. And if f (z0 ) 6= 0, the function 1/ f also has a power series expansion
centered at z0 .
18
The same question, for a function of a complex variable, turns out to have a completely
different, and much simpler, answer: if f is differentiable in the complex sense on the
open set of definition, then it is always analytic. See Chapter 4.
19
By induction, proving that f (n) (x) is for x 6= 0 of the form x 7 Q n (x) f (x), for some
rational function Q n .
37
1.5
A quick look at asymptotic and divergent series
1.5.a
Asymptotic series
X
an
F (z)
.
z
zn
n=0
(1.4)
The definition means that the expansion (1.4) is a good approximation for
large values of z. Indeed, if we only consider the first twenty terms of the
series, for instance, we see that the sum of those approximates f (z) to order
1/z 20 at least when [z ].
However, it frequently turns out that for fixed z, the behavior of the series
in (1.4) is quite bad as N . In particular, the series may be divergent.
This phenomenon was pointed out and studied in detail by Henri Poincar
in the case of asymptotic series used in astronomy, at the beginning of the
twentieth century [70].
P
How can a divergent asymptotic series still be used? Since
an /z n is
asymptotic to F , if there is some R such that F is continuous for |z| R,
then we see that there exist constants C1 , C2 ,. . . such that
N
X
a
n
F (z)
CN
for N N and |z| R.
n
z |z| N +1
n=0
For fixed z, we can look for the value of N such that the right-hand side of
this inequality is minimal, and truncate the asymptotic series at this point.
Of course, we do not obtain F (z) with infinite precision. But in many cases the
actual precision increases with |z|, as described in the next section, and may
be pretty good.
It is also possible to speak of asymptotic expansion as z 0, which
corresponds to the existence of a sequence (an )nN such that
N
X
1
n
lim N f (z)
an z = 0,
z0 z
n=0
38
which is denoted
f (z)
z0
an z n .
(1.5)
n=0
may be two different functions with the same asymptotic expansion! For instance, e x
2
and e x both have asymptotic expansions with an = 0 for all n as x +.
X
F (e 2 ) = F () =
fn n .
n=0
Since the value of is fixed by Nature, with a value given by experiments, only
the truncated series can give a physical result if the series is divergent. The
truncation must be performed around the 137-th term, which means that we
still expect a very precise result certainly more precise, by far, than anything
the most precise experiment will ever give! However, if F (e 2 ) is not analytic
at e = 0, the question is raised whether the asymptotic expansion considered
gives access to F uniquely or not.
Studying asymptotic series is in itself a difficult task. Their implications
in physics (notably field theory) are at the heart of current research [61].
39
Leonhard Euler (17071783), a Swiss mathematician, an exceptional teacher, obtained a position at the Academy of Sciences of
Saint Petersburg thanks to Nicolas and Daniel Bernoulli when
he was only twenty. He also spent some years in Berlin, but
came back to Russia toward the end of his life, and died there at
seventy-six (while drinking tea). His works are uncountable! We
owe him the notations e and i and he imposed the use of that
was introduced by Jones in 1706. Other notations due to Euler
are sin, cos, tang, cot, sec, and cosec. He also introduced the use
of complex exponents, showed that e i x = cos x + i sin x, and was
particularly fond of the formula e i + 1 = 0. He defined the
function , which extends the factorial function from integers
to C \ (N), and used the Riemann zeta function for real values
of the variable. No stone of the mathematical garden of his time
was left unturned by Euler; let us only add the Euler angles in
mechanics and the Euler equation in fluid mechanics.
The number of terms necessary to approximate log 2 within 106 , for instance, can be
estimated quite precisely for both series. Using the Leibniz test for alternating sums, the
remainder of the first series is seen to satisfy
|Rn | un+1 =
1
,
n+1
and this is the right order of magnitude (a pretty good estimate is in fact Rn 1/2n). If we
want |Rn | to be less than 106 , it suffices to take n = 106 terms. This is a very slow convergence.
The remainder of the second series, on the other hand, can be estimated by the remainder of
a geometric series:
X
X
1
1
1
Rn =
= n.
n
k
n
2
2
2
k=n+1
k=n+1
Hence twenty terms or so are enough to approximate log 2 within 106 using this expansion
(since 220 106 ).
40
Fig. 1.5 The precise value of f (x)Pis always found between two successive values of the
partial sums of the serie
fk (x). Hence, it is inside the gray strip.
rapidly, [...] but they will see the second as divergent.
Astronomers, on the contrary, will see the first as divergent [...] and the
second as convergent. [70]
41
0,056
0,054
0,052
0,05
0,048
0,046
0,044
10
20
30
40
50
P
Fig. 1.6 The first 50 partial sums of the series (1)k1 (k 1)! x k for x = 1/20.
Notice that, starting from k = 44, the series diverges rapidly. The best precision
is obtained for n = 20, and gives f (x) with an error roughly of size 2 108 .
they are of the same sign. It follows (see the proof of the alternating series
test) that
(1.7)
f2n (x) < f (x) < f2n+1 (x),
although, in contrast with the case of alternating series with terms converging
to 0, the general term here (1)n n! x n+1 diverges. Hence it is not possible to
deduce from (1.7) an arbitrarily precise approximation of f (x). However, if x is
small, we can still get a very good approximation, as we now explain.
of x. There exists an index N0 such that the distance
Fix a positive value
f2n+1 (x) f2n (x) is smallest (the ratio between consecutive terms is equal to
nx, so this value of n is in fact N0 = 1/x). This means that, if we look
at the first N0 values, the series seems to converge, before it starts blowing
up. It is interesting to remark that the convergence
of the first
N0 terms
is exponentially fast, since the minimal distance f N +1 (x) f N (x) is roughly
given by
p
N ! x N N ! N N 2/x e 1/x
(using the Stirling formula, see Exercise 5.4 on page 154.) Thus, if we wish
to know the
p value of f (x) for a small value of x, and if a precision of the
order of 2/x e 1/x suffices, it is possible to use the divergent asymptotic
series (1.6), by computing and summing the terms up to the smallest term (see
Figure 1.5). For instance, we obtain for x = 1/50 a precision roughly equal
to 6 1020 , which is perfectly sufficient for most physical applications! (see
Figure 1.6.)
For a given value of x, on the other hand, the asymptotic series does not
allow any improvement on the precision.21 But the convergence is so fast that
21
For instance, in quantum field theory, the asymptotic series in terms of has a limited
42
divergent series. The interested reader may read the classic book of mile Borel [13], the first
part of which at least is very readable. Concerning asymptotic expansions, see [72].
precision since is fixed (equal to 1/137 approximately) and cannot be made to tend to zero.
This suggests that quantum field theory, in its current perturbative form, will one day be
replaced by another theory. Of course, as long as a precision to 10100 is enough...
22
Niels Abel wrote in 1826 that divergent series are the Devils invention, and it is shameful
to base any proof of any kind on them. By using them, one can get from them whatever
result is sought: they have done much evil and caused many paradoxes (letter to his professor
Holmbo).
23
Only the first is less precise, because it is too small and Stokes used an asymptotic expansion at +.
Exercises
43
EXERCISES
Physical paradoxes
Exercise 1.2 (Electrical energy) Consider an electric circuit consisting of two identical capac-
itors in series, with capacitance C and resistance R. Suppose that for t 0, the circuit is open,
one of the capacitors carries the charge Q , and the other has no charge. At t = 0, the circuit
is closed, and is left to evolve freely. What is the state of equilibrium for this circuit? What
is the energy of the system at t = 0? What is the energy as t +? Show that the missing
energy depends only on R. What happened to this energy?
Now assume that R = 0. What is the energy of the system at any arbitrary t? What is the
limit of this energy as t +? Do you have any comments?
Exercise 1.3 (A paradox in optics) We know that two distinct sources of monochromatic
light do not create a clear interference picture in an experiment with Young slits. As the
distance between the sources increases, we first see a contrast decrease in the interference picture. This is called a defect of spatial coherence.
Hence, a famous experiment gives a measurement of the angular distance between two
components of a double star by the observation of the first disappearance of the interference
fringes when slowly moving two Young slits apart.
This experiment works very well with monochromatic light. However, if we define two
monochromatic sources S1 and S2 mathematically, each emits a signal proportional to e 2i t ,
and there should be no problem of spatial coherence.
Perform the computation properly. A computation in optics always starts with amplitudes
(possibly, one may show that the crossed terms cancel out in average, and do the computations
with intensity only). Here, the cross terms are fine, and never disappear. In other words, this
shows that two different monochromatic light sources are always perfectly coherent.
But experiment shows the opposite: a defect of spatial coherence. How can this be explained?
Exercise 1.4 In the rubber ball paradox of page 2, give an interpretation of the variation of
kinetic energy of the ball, in the moving reference frame, in terms of the work of the force
during the rebound. The shock may be modeled by a very large force lasting a very short
amount of time, or one can use the formalism of distributions (see Chapter 7).
sequence of rational numbers such that Q [0, 1] = {xn ; n N}. Show that the sequence
(xn )nN diverges.
Exercise 1.6 In an arbitrary normed vector space, show that a Cauchy sequence which has
P =
n
P
i=1
def
i X i 7 kP k = max |i | .
1in
Exercise 1.8 (Fixed point) Let a, b R be real numbers with a < b , and let f : [a, b ]
[a, b ] be a continuous function with a fixed point . Assume that there exists a real number
44
for all n N.
un+1 = f (un )
2n
3
if 0 x 1/2n,
2n x
2
fn (x) = n 2n3 (x 1/2n) if 1/2n x 1/n,
0
if 1/n x 1,
each fn is increasing, show that f is also increasing. Show that the same stability holds for the
properties fn is convex and fn is k-Lipschitz. Show that, on the other hand, it is possible
that each fn is continuous, but f is not (take fn (x) = sin2n x).
n (x) =
Z x
0
1 e 1/nt
dt
for n N .
Show that n is infinitely differentiable, and that the sequence (n )nN converges uniformly on [1, 1]. What is its limit?
Let > 0 be given. Show that for any p N, there exists a map p from [1, 1] into R,
infinitely differentiable, such that
( p)
i) (k)
p (0) = 0 for k 6= p, and p (0) = 1.
ii) for k p 1 and x [1, 1], (k)
p (x) .
Now let (an )nN be an arbitrary sequence of real numbers. Construct an infinitely differentiable map f from [1, 1] to R such that f (n) (0) = an for all n N.
P
fn defined on
R, which converges pointwise to a sum F (x), the convergence being uniform on any finite
interval of R, and which, moreover, satisfies:
Exercise 1.12 (Slightly surprising exercise) Construct a series of functions
n N
lim fn (x) = +
x+
but
lim F (x) = .
x+
Exercises
45
f (z) =
n=0
cn (z a)n .
X
1
f (a + r e i )2 d.
|cn |2 r 2n =
2
0
n=0
iii) Prove that if R = +, in which case the sum f (z) of the power series is said to
be an entire function, and if moreover f is bounded on C, then f is constant (this is
Liouvilles theorem, which is due to Cauchy).
iv) Is the sine function a counter-example to the previous result?
Exercise 1.14 Let f be a function of C class defined on an open set R. Show that f
is analytic if and only if, for any x0 , there are a neighborhood V of x0 and positive real
numbers M and t such that
( p)
f (x)
p
x V p N
p! M t .
Exercise 1.15 Let f : R2 R be a function of two real variables. This exercise gives
examples showing that the limits
and
lim
(x, y)(0,0)
f (x, y)
are independent: each may exist without the other two existing, and they may exist without
being equal.
xy
if x 2 + y 2 6= 0,
i) Let f (x, y) = x 2 + y 2
0
if x = y = 0.
Show that the limits lim lim f (x, y) and lim lim f (x, y) both exist, but that the
x0 y0
limit24
lim
(x, y)(0,0)
y0 x0
24
(x, y)(a,b)
(x, y)6=(a,b)
if and only if
> 0 > 0
f (x, y) =
(x, y) 6= (a, b ) and
(x a, y b )
< = | f (x, y) | .
46
y + x sin(1/ y) if y 6= 0,
0
if y = 0.
Show that both limits lim f (x, y) and lim lim f (x, y) exist, but on the other
y0 x0
(x, y)(0,0)
x y + y sin 1
if x =
6 0,
2
2
iii) Let f (x, y) = x + y
x
0
if x = 0.
lim
(x, y)(0,0)
exist.
2
2
x y
iv) Let f (x, y) = x 2 + y 2
if x 2 + y 2 6= 0,
if x = y = 0.
Show that the limits lim lim f (x, y) and lim lim f (x, y) both exist, but are different.
x0 y0
y0 x0
PROBLEM
Problem 1 (Solving differential equations) The goal of this problem is to illustrate, in a
special case, the Cauchy-Lipschitz theorem that ensures the existence and unicity of the solution
to a differential equation with a given initial condition.
In this problem, I is an interval [0, a] with a > 0, and we are interested in the nonlinear
differential equation
ty
y =
(E)
1 + y2
with the initial condition
y(0) = 1.
(CI)
The system of two equations (E) + (CI) is called the Cauchy problem. In what follows, E
denotes the space
C (I , R) of real-valued continuous functions defined on I , with the norm
k f k = sup f (t).
tI
i) Let ( fn )nN be a Cauchy sequence in E , kk .
(a) Show that for any x I the sequence fn (x) nN converges in R. For x I , we
def
Show that the functions f E which are solutions of the Cauchy problem (E) + (CI)
are exactly the fixed points of .
Solutions of exercises
x
is 1-Lipschitz on R, i.e.,
1 + x2
y
x
1 + y 2 1 + x 2 | y x| .
47
(1.8)
v) Show that there exists a unique solution to the Cauchy problem. Give an explicit
iterative method to solve the system numerically (Picard iterations).
Remark 1.79 In general, all this detailed work need not be done: the Cauchy-Lipschitz theorem
states that for any continous function (x, y) which is locally Lipschitz with respect to the
second variable, the Cauchy problem
y = (t, y),
has a unique maximal solution (i.e., a solution defined on a maximal interval).
SOLUTIONS
Solution of exercise 1.2. The energy of the circuit at the beginning of the experiment is
the energy contained in the charged capacitor, namely E = Q 2 /2C . At equilibrium, when
[t ], no current flows, and the charge of each capacitor is Q /2 (it is possible to write
down the necessary differential equations and solve them to check this). Thus the final energy
is E = 2( Q /2)2 /C = E /2. The energy which is dissipated by the Joule effect (computed by
R +
the integral 0 Ri 2 (t) dt, where t 7 i(t) is the current flowing through the circuit at time t)
is of course equal to E E , and does not depend on R.
However, if R = 0, one observes oscillations of charge in each capacitor. The total energy
of the system is conserved (it is not possible to compute it from relations in a quasi-stationary
regime; one must take magnetic fields into account!). In particular, as [t +], the initial
energy is recovered. The explanation for this apparent contradiction is similar to what happened for Romeo and Juliet: the time to reach equilibrium is of order 2/RC and tends to
infinity as [R 0]. This is a typical situation where the limits [R 0] and [t +] do
not commute.
Finally, if we carry the computations even farther, it is possible to take into account the
electromagnetic radiation due to the variations of the electric and magnetic fields. There is
again some loss of energy, and for [t +], the final energy E E = E /2 is recovered.
Solution of exercise 1.3. Light sources are never purely monochromatic; otherwise there
would indeed be no spatial coherence problem. What happens is that light is emitted in wave
packets, and the spectrum of the source necessarily has a certain width > 0 (in a typical
example, this is order of magnitude = 1014 s1 , corresponding to a coherence length of a
few microns for a standard light-bulb; the coherence length of a small He-Ne laser is around
thirty centimeters, and that of a monomode laser can be several miles). All computations must
be done first with 6= 0 before taking a limit 0. Thus, surprisingly, spatial coherence
is also a matter of temporal coherence. This is often hidden, with the motto being since the
sources are not coherent, I must work by summing intensities instead of amplitudes.
In fact, when considering an interference figure, one must always sum amplitudes, and then
(this may be a memory from your optics course, or an occasion to read Born and Wolf [14])
perform a time average over a period t, which may be very small, but not too much (depending
on the receptor; the eyes are pretty bad in this respect, an electronic receptor is better, but none
can have t = 0).
48
n (0) = 1,
n (0) = 0,
n(k) (0) = 0
k 2.
one vanishing at 0 (i.e., the integral from 0 to x of the previous one). It is easy to see that
the successive derivatives of this function satisfy the require condition, and the last property
follows from the construction.
Now let (an )nN be an arbitrary sequence of real numbers. For all n N, one can apply the
previous construction to find n such that
1
.
sup (n1)
(x) n
n
2 max(1, |an |)
x[1,1]
P
It is then immediate that the series
an n converges uniformly to a function f having all
desired properties.
Of course, the function f thus constructed is by no means unique: one may add a term
(n 1), where R, without changing the values of the derivatives at 0.
fn (x) =
x 4n1
x 4n+1
+
(4n 1)! (4n + 1)!
Z 2
1 if k = n,
e i (kn) d = kn =
0 otherwise,
0
the stated formula follows.
Solutions of exercises
49
2
ii) Similarly, expand f (a + r e i ) as a product of two series and integrate term by term.
Most contributions cancel out using the formula above, and only the terms |cn |2 r 2n
remain.
iii) If f is bounded on C, we have |cn r n | k f k . Letting r +, it follows that cn = 0
for n 1, which means that f is constant.
iv) The function sin is not bounded on C! Indeed, we have for instance lim sin(i x) =
x+
+. So there is no trouble.
Solution of problem 1
i)
yI
fn (x)
nN
is a Cauchy sequence in R, so it
(b) Let > 0 be fixed. There exists N such that
f p fq
for all p > q > N .
Let x I . We then have
f p (x) fq (x)
for any p > q > N ,
and since this holds for all p, we may fix q and let p . We obtain
f (x) fq (x)
for all q > N .
Finally, this being true for any > 0, it follows that the sequence ( fn )nN
converges uniformly to f .
Remark: At this point,
we havent proved that there is convergence in the normed
vector space E , kk . It remains to show that the limit f is in E , that is, that f is
continuous. This follows from Theorem 1.33, but we recall the proof.
Using the triangle inequality, we deduce from this that for all y I such that
|x y| , we have
f ( y) f (x) f ( y) f N ( y) + f N ( y) f N (x) + f N (x) f (x) 3.
This proves the continuity of f at x, and since x is arbitrary, this proves that f
is continuous on I , and hence is an element of E .
(d) For any Cauchy sequence ( fn )nN in E , theprevious questions show that ( fn )nN
converges in E . Hence the space E , kk is complete.
f (t) = ( f ) (t) =
t f (t)
1 + f (t)
2 .
50
The functions f and ( f ) have the same derivative on I , and moreover satisfy
( f )(0) = 1
and
f (0) = 1.
It follows that ( f ) = f .
1 x2
we see that g (x) 1 for all x R. The
(1 + x 2 )2
mean-value theorem then proves that g is 1-Lipschitz, as stated.
Z
!
t
u
f
(u)
u
g(u)
( f ) (g)
= sup
2
2 du
tI 0
1 + f (u)
1 + g(u)
Z t
f (u)
g(u)
sup
u
2
2 du
tI 0
1 + f (u)
1 + g(u)
Z a
f (u)
g(u)
u
2
2 du
1 + f (u)
1 + g(u)
0
2
0
0
p
This is true for any f , g E , and hence is (a 2 /2)-Lipschitz; if 0 a < 2, this
map is a contraction.
v) According to the fixed-point theorem, the previous results show that has a unique
fixed point in E .
According to Question ii), this means that there existspa unique solution of the
Cauchy Problem (E ) + (C I ) on an interval [0, a] for a < 2.
To approximate the solution numerically, it is possible to select an arbitrary function
f0 (for instance, simply f0 = 0), and construct the sequence ( fn )nN defined by fn+1 =
( fn ) for n 0. This requires computing (numerically) some integrals, which is a fairly
straightforward matter (numerical integration is usually numerically stable: errors do
not accumulate in general25). The speed of convergence of the sequence ( fn )nN to the
solution f of the Cauchy problem is exponential: with I = [0, 1], the distance (from
the norm on E ) between fn and f is divided by 2 (at least) after each iterative step. It
is therefore possible to expect a good numerical approximation after few iterations (the
precision after ten steps is of the order of k f0 f k /1000 since 210 = 1024).
25
Chapter
2.1
The integral according to Mr. Riemann
2.1.a
Riemann sums
52
a = x0 x1 x2
xk xk+1
xn1 xn = b
Fig. 2.1 The interval [a, b ] is here partitioned uniformly, with n subintervals of constant
length (b a)/n.
n
X
k=1
and it may be expected (or hoped) that as n goes to infinity, this value will
converge to some limit. Riemanns result is that this happens if the approximations used for f improve steadily as n increases. This means that the values
taken by f on an interval [x, x + ] must be very close to f (x) when is
small. This is a continuity assumption for f . Here is the precise result.
DEFINITION 2.1 (Subdivision) A subdivision of the interval [a, b] is any tu-
n = (x i ) i=0,...,n , (k )k=1,...,n ,
where (x i ) i=0,...,n is a subdivision of [a, b] and k xk1 , xk for all k.
DEFINITION 2.2 (Riemann sums) Let n = (x i ) i , (k )k be a marked subdivision of [a, b] and let f : [a, b] K be a function with values in K = R
53
n
X
(xk xk1 ) f (k ).
k=1
uous function. For any sequence (n )nN of marked subdivisions of [a, b], such that
the step of the subdivisions tends to 0 as n tends to infinity, the sequence of associated
Riemann sums converges. Moreover, the limit of this sequence is independent of the chosen sequence of marked subdivisions. The (Riemann) integral1 of the function f is
defined to be this common limit:
Zb
Zb
f (x) dx =
f = lim S ( f , n )
a
i=1
R
The notation d for differentiation, dy/dx, (which originally represented an s,
the initial of summa), are due to Gottfried Leibniz (16461716), who also invented the name
function and popularized the use of the equal sign = and of a dot to represent multipliRb
cation. Joseph Fourier (17681830) invented the notation a .
2
Recall that f : [a, b ] K is piecewise continuous if there exist an integer n and real
numbers
a = a0 < a1 < < an = b
such that, for all i [[0, n 1]], the function f restricted to a i , a i+1 is continuous and has
finite limits as x tends to a i from above, and to a i+1 from below.
1
54
and
I+ ( f ) = inf
Z
2.2
The integral according to Mr. Lebesgue
To solve the problems related to Riemanns definition of integrable function, the idea is to provide an alternate definition for which a much larger
class of functions will be integrable.3 In particular, this space of integrable
functions, denoted L1 , will turn out to be complete.
3
From the point of view of trying to obtain a complete space with the norm
Rb
k f k1 = a f (x) dx,
55
f
xk+1
xk
Ak(n)
Fig. 2.2 Construction of the sets Ak(n) = f 1
value of Ak(n) k/2n .
2.2.a
xk , xk+1
, where xk = k/2n , with the
many intervals, say of size 21n . For each subinterval of the type 2kn , k+1
,
n
2
consider the set of real numbers x such that f (x) belongs to this interval (see
Figure 2.2):
k k+1
(n)
1
Ak = f
,
.
2n 2n
Denote by Ak(n) the size of this set (temporarily; this crucial notion
56
k
Ak(n) n .
2
k=0
It is quite easy to check that the sequence Sn ( f ) n is increasing (see Figure 2.4).
R
Hence it converges in R+ = R+ {+}, and the limit is denoted f .
The function f is said to be integrable (in the sense of Lebesgue) if the limit
is finite.5
This is the principle. As can be seen, the issue of evaluating the size of
the sets Ak(n) has been left open. This very important point is the subject of
the next sections.
Remark 2.4 Before going farther, we briefly explain what is gained from this new method:
the theorems of the Lebesgue integral calculus are infinitely simpler and more powerful
(in particular, concerning the inversion of limits and integrals, or differentiation under
the integral sign), even when applied to very regular functions.
5
The exact definition below will be slightly different, but equivalent; it will not involve a
particular choice of intervals (bounded by numbers of the form k/2n ) to subdivide R+ .
6
Much larger than the set of Riemann-integrable functions, in particular, because even very
irregular functions become integrable. It is true that some Riemann-integrable functions are
not Lebesgue integrable, and that this causes trouble, but they are typically functions which
create all the difficulties of the Riemann theory.
57
(T1) X T and T ;
Hence a -algebra T is stable by complement and countable union. Moreover, since the complement of the union of a family of sets is the intersection
of their complements, T is also stable by countable intersections.
Example 2.6 P(X ) isa -algebra (the largest
-algebra on X ); {, X } is also a -algebra (the
An algebra is a set of subsets stable under complement and finite union. The prefix
indicates the countable operations which are permitted.
58
DEFINITION 2.9 (Borel sets) The Borel -algebra is the -algebra on R generated by the open subsets of R. It is denoted B(R). Its elements are called
Borel sets.
It is easy to check that the Borel -algebra is also the -algebra generated
by intervals of R of the type ], a], where a ranges over all real numbers.8
In particular, the following sets are Borel subsets of R:
1. all open sets;
2. all closed sets (since their complement is open);
3. all countable unions and intersections of open and closed sets.
However, there are many others! The Borel -algebra is much more complex
and resists any simple-minded exhaustive description.
8
This is of great importance in probability theory, where it is used to reduce the study of a
random variable to the study of its distribution function.
2.2.c
59
Lebesgue measure
(MES2) for any countable family (An )nN of pairwise disjoint measurable
sets in T , we have
X
[
Ai =
(Ai ).
i=0
n=0
Remark 2.12 Since a measure takes values in R+ = R+ {+}, it is necessary to specify the
x + = +
for all x R+ .
The following theorem, due to Lebesgue, shows that there exists a measure
on the Borel -algebra which extends the length of an interval.
THEOREM 2.13 (The Lebesgue measure on Borel sets) The pair (R, B) is a
measurable set and there exists on (R, B) a unique measure such that
[a, b] = |b a|
for all a, b R.
Note that very similar definitions will appear in Chapter 19, concerning probability spaces.
60
Consider the length map which associates its length b a to any finite
interval [a, b]. We try to extend this definition to an arbitrary subset of R.
Let A P(R) be an arbitrary subset of R.
Let E (A) S
be the set of all sequences A = (An )nN of intervals such that A is
contained in
i=0 A i . For any A = (An )nN E (A), let
def
(A ) =
i=0
(Ai ),
inf (A ).
A E (A)
61
on the Lebesgue -algebra, and a measurable set will be an element of L . Note that in practice
(for a physicist) there is really no difference between the Lebesgue and Borel -algebras.
2.2.e
Negligible sets
if there exists a Borel set N containing N which is of Lebesgue measure zero. But N itself is
not necessarily a Borel set.
Example 2.18 The set Q of rational numbers is negligible since, using -additivity, we have
(Q) =
X
rQ
X
{r } =
0 = 0.
rQ
62
H : x 7
is differentiable almost everywhere.
The Dirichlet function
D : x 7
is zero almost everywhere.
1
0
0 if x < 0,
1 if x 0,
if x Q,
if x R \ Q,
mathematician) has to deal with are measurable, and it is usually pointless to bother checking
this. Why is that the case? The reason is that it is difficult to exhibit a nonmeasurable function
(or even a single nonmeasurable set); it is necessary to invoke at some point the Axiom of Choice
(see Sidebar 2), which is not physically likely to happen. The famous Banach-Tarski paradox
shows that nonmeasurable sets may have very weird properties.11
2.2.f
Lebesgue measure on R n
7
It seems that this result flies in the face of the principle of conservation of volume during a
translation or a rotation. But that is the point: the pieces Bi (or at least some of them) are not
measurable, and so the notion of volume of Bi makes no sense; see [74].
12
In the sense of the -algebra generated by open sets.
63
2.2.g
f =
n
P
i=1
i A .
i
64
Example 2.25 Let D be the Dirichlet function, which is 1 for rational numbers and 0 for
[0,1]
f+
65
In practice, the definitions above are justified by the fact that any measurable function f is the pointwise limit of a sequence ( fn )nN of simple
functions. If, moreover, f is bounded, it is a uniform limit of simple functions.13
As it should, the Lebesgue integral enjoys the same formal properties as
the Riemann integral. In particular, the following properties hold:
PROPOSITION 2.28 Let (X , T , ) be a measure space, and let f , g be measurable
i) if f and g are integrable with respect to , then for any complex numbers ,
C, the function f + g is integrable and
R
R
R
( f + g) d = f d + g d;
ii) R
if f and g are integrable and real-valued, and if f g, then
g d;
R
R
iii) if f is integrable, then f d | f | d;
f d
iv) R
if g is non-negative
and integrable and if | f | g, then f is integrable and
R
| f | d g d.
Finally, under some suitable assumptions,14 for any two measure spaces
(X , T , ) and (Y , T , ), the product measure space (X Y , T T , )
is defined in such a way that Fubinis formula holds: for any integrable
function f on X Y , we have
ZZ
Z Z
f d( ) =
f (x, y) d( y) d(x)
X Y
Z X Z Y
=
f (x, y) d(x) d( y)
Y
(an implicit fact here is that the integral with respect to one variable is a
measurable function of the other variable, so that all integrals which appear
make sense). In other words, the integral may be computed in any order (first
over X then Y and conversely). Moreover, a measurable function f defined
on X Y is integrable if either
Z Z
Z Z
f (x, y) d( y) d(x) or
f (x, y) d(x) d( y)
X
13
This should be compared with the Riemann definition of integral, where step functions
are used. It turns out that there are relatively few functions which are uniform limits of step
functions (because step functions, constant on intervals, are much more rigid than simple
functions can be, defined as they are using arbitrary measurable sets).
14
Always satisfied in applications.
66
is finite.
All this applies in particular to integrals on R2 with respect to the Lebesgue
measure , and by induction on Rd for d 1. The Fubini formula
becomes
ZZ
Z Z
f (x, y) dx dy =
f (x, y) dy dx
R2
R
R
Z Z
=
f (x, y) dx dy.
R
15
This is not simply an abuse of notation. Mathematically, what is done is take the set of
integrable functions, and take its quotient modulo the equivalence relation being equal almost
everywhere. This means that an element in L1 (X ) is in fact an equivalence class of functions.
This notion will come back during the course.
67
16
68
EXERCISES
Exercise 2.1 Let d N . Prove or disprove (with a counterexample) the following state-
ments:
1. closed sets in R;
2. intervals of the type ]a, b ];
3. intervals of the type ], b ].
Show that B(Rd ) is generated by
1. closed sets in Rd ;
2. bricks, that is, sets of the type {x Rd | a i < xi b i , i = 1, . . . , d };
3. closed half-spaces {x Rd | xi b }, where 1 i d and b R.
i) () = 0;
ii) is increasing: if A , B T and A B, then we have (A) (B);
iii) let (An )nN be an increasing sequence of measurable sets in T , and let A =
Then we have (An ) (A);
n
An .
An
(An )
(-subadditivity of the measure).
n
Exercise 2.5 In this exercise, we denote by O the set of open subsets in R for the usual
topology, and (as usual) by B and L the Borel -algebra and the Lebesgue -algebra. We have
O B L P(R), and we will show that all inclusions are strict.
i) Show that O $ B.
ii) Let C be the Cantor triadic set, defined by
\
P
def
C=
Fp ,
where
F p = x R ; (kn )nN E p , x = kn /3n ,
p1
n1
n
o
E p = (kn )nN {0, 1, 2}N ; n p, kn 6= 1 .
def
Exercises
69
b) Show that C is in one-to-one correspondance with {0, 1}N , and in one-to-one correspondence with R.
c) Show that L is in one-to-one correspondance with P(R).
d) One can show that there exists an injective map from B to R. Deduce that the
second inclusion is strict.
iii) Use Sidebar 2 on the following page to conclude that the last inclusion is also strict.
Exercise 2.6 Let f be a map from R to R. Is it true that the two statements below are
equivalent?
Exercise 2.8 (Egorovs theorem) Let (X , T , ) be a finite measure space, that is, a measure
space such that (X ) < + (for instance, X = [a, b ] with the Lebesgue measure). Let
( fn )nN be a sequence of complex-valued measurable functions on X converging pointwise to a
function f .
For n N and k N , let
\
1
def
.
x X ; f p (x) f (x)
En(k) =
k
pn
For fixed k, show that the sequence En(k) n1 is increasing (for inclusion) and that the measure
ofEn(k) tends to (X ) as n tends to infinity. Deduce the following theorem:
THEOREM 2.32 (Egorov) Let (X , T , ) be a finite measure space. Let ( fn )nN be a sequence of
complex-valued measurable functions on X converging pointwise to a function f . Then, for any > 0,
there exists a measurable set A with measure (A ) < such that ( fn )nN converges uniformly to f on
the complement X \ A .
This theorem shows that, on a finite measure space, pointwise convergence is not so
different from uniform convergence: one can find a set of measure arbitrarily close to the
measure of X such that the convergence is uniform on this set.
70
x y if and only if y x;
if x y and y z, then x z.
The Axiom of Choice states for any non-empty set A, there exists a map f : P(A)
A, called a choice function, such that f (X ) X for any non-empty subset X A. In
other words, the choice function selects, in an arbitrary way, an element in any non-empty
subset of X . It is interesting to know that a more restricted form of the Axiom of Choice
(the Axiom of Countable Choice) is compatible with the assertion that any subset of R is
Lebesgue-measurable. (But is not sufficient to prove this statement). The interested reader
may look in books such as [51] for a very intuitive approach, or [45] which is very clear.
c
This means that for any x R, there is a unique y E with x y.
Solutions of exercises
71
SOLUTIONS
Solution of exercise 2.1
i) Wrong. Take, for instance, d = 1 and the unbounded open set
[
1
U =
k, k + k
2
k1
P
which has Lebesgue measure (U ) = k1 2k = 1.
ii) Wrong. For d = 1, consider for example the Borel set (R\Q)[0, 1], which has measure
1 but does not contain any non-empty open set.
iii) True. It is easy to describe for any > 0 a subset of Rd containing R p {0}d p with
measure less than .
def
En = x X ; f (x) > 1/n .
The sequence (En )nN is an increasing sequence of measurable sets, and its union is
x X ; f (x) > 0 ,
which has positive measure since it contains A. Using the -additivity, there exists n0 such
that En0 has positive measure. Then the integral of f is at least equal to (En0 )/n0 > 0.
to construct a strictly increasing sequence (nk )kN such that X En(k)
< /2k for all k N,
k
and then conclude.
Chapter
Integral calculus
3.1
Integrability in practice
A physicist is much more likely to encounter concrete functions than
abstract functions for which integrability might depend on subtle theoretical
arguments. Hence it is important to remember the usual tricks that can be
used to prove that a function is integrable.
The standard method involves two steps:
first, find some general criteria proving that some standard functions
are integrable;
second, prove and use comparison theorems, which show how to reduce
the integrability of a complicated function to the integrability of one
of the standard ones.
3.1.a
Standard functions
The situation we consider is that of functions f defined on a half-open interval [a, b[, where b may be either a real number or +, and where the
functions involved are assumed to be integrable on any interval [a, c ] where
a < c is a real number (for instance, f may be a continuous function on
[a, b[). This is a fairly general situation, and integrability on R may be reduced to integrability on [0, +[ and ], 0].
Integral calculus
74
or
3.1.b
( = 1
Comparison theorems
Once we know some integrable functions, the following simple result gives
information concerning the integrability of many more functions.
PROPOSITION 3.5 (Comparison theorem) Let g : [a, b[ C be an integrable
function, and let f : [a, b[ C be a measurable function. Then
Exercise 3.1 Study the integrability of the following functions (in increasing order of diffi-
culty):
i) t 7 cos(t) log(tan t) on ]0, /2[;
PROPOSITION 3.6 (Asymptotic comparison) Let g be a non-negative measurable function on [a, b[, and let f be a measurable function on [a, b[. Assume that
g is not integrable and that f and g are both integrable on any interval [a, c ] with
c < b. Then asymptotic comparison relations between f and g extend to their integrals
as follows:
Rx
Rx
Rx
Rx
f g = a f a g
and
f = o(g) = a f = o
g .
a
b
xb
xb
75
If g and h are integrable on [a, b[, then asymptotic comparison relations between
f and h extend to the remainders as follows:
R b
Rb
Rb
Rb
f h = x f x h
and
f = o(h) = x f = o
h .
x
b
xb
+
1
xb
e x t
dt log x.
p
1 + t 2 x0+
3.2
Exchanging integrals and limits or series
Even when dealing with concrete functions, Lebesgues theory is extremely useful because of the power of its general statements about exchanging
limits and integrals. The main theorem is Lebesgues dominated convergence
theorem.
THEOREM 3.7 (Lebesgues dominated convergence theorem) Let (X , T , )
be a measure space, for instance, R or Rd with the Borel -algebra and the Lebesgue
measure. Let ( fn )nN be a sequence of complex-valued measurable functions on X . Assume that ( fn )nN converges pointwise almost everywhere to a function f : X R,
and moreover assume that there exists a -integrable function g : X R+ such that
| fn | g almost everywhere on X , for all n N . Then
i) f is -integrable;
ii) for any measurable subset A X , we have
Z
Z
fn d =
f d.
lim
n
ey
2 /n2
n e x cos x
dx. By the change of vari1 + n2 x 2
cos( y/n)
dy.
1 + y2
Integral calculus
76
Another useful result does not require the domination assumption, but
instead is restricted to increasing sequences. This condition is more restrictive,
but often very easy to check, in particular for series of non-negative terms.
THEOREM 3.9 (Beppo Levis monotone convergence theorem) Let (X , T , )
be a measure space, for instance, R or Rd with the Borel -algebra and the Lebesgue
measure. Let ( fn )n be a sequence
of non-negative measurable functions on X . Assume that the sequences fn (x) n are increasing for all x X . Then the function
f : X R {+}, defined by
for all x X ,
fn d =
f d
This result is often used as a first step to prove that a function is integrable.
Here finally is the version of the dominated convergence theorem adapted
to a series of functions:
THEOREM 3.10 (Term by term integration of a series) Let (X , T , ) be a mea-
sure P
space, for instance, R or Rd with the Borel -algebra and the Lebesgue measure.
Let
fn be a series of complex-valued measurable functions on X such that
Z
X
| fn | d < +.
n=0
Z
X
fn d =
fn d.
n=0
n=0
77
3.3
Integrals with parameters
3.3.a
A natural question is to determine what properties of regularity the integral I (x) will inherit from properties of f .
The two main results concern the continuity or derivability of f . Here
again, the Lebesgue theory gives very convenient answers:
Assume that for almost all t X , the function x 7 f (x, t) is continuous at x0 , and
that there exists a -integrable function g : X R such that
f (x, t) g(t) for almost all t X ,
R
for all x in a neighborhood of x0 . Then x 7 X f (x, t) dt is continuous at x0 .
which, being valid for an arbitrary sequence (xn )nN , is the statement we wanted to
prove.
Counterexample 3.12 It is of course natural to expect that the conclusion of Theorem 3.11
should be true in most cases. Here is an example where, in the absence of the domination
property, the continuity of a function defined by an integral with parameters does not hold.
Let f be the function on R R defined by
1 2
exp
t
if x 6= 0,
2
x
f (x, t) =
0
if x = 0.
Integral calculus
78
For any fixed nonzero x, the function t 7 f (x, t) is a gaussian centered at 1/x, so its
integral is easy to compute; it is in fact equal to 1. But for x = 0, the function t 7 f (0, t) is
identically zero, so its integral is also zero. Hence we have
I (0) = 0
x R ,
but
I (x) = 1.
3.3.b
[a, b]. Assume that there exists a neighborhood V of x0 such that for x V , the
function x 7 f (x, t) is differentiable at x0 for almost all t X , and t 7 f (x, t)
is integrable. Assume, moreover, that the derivatives are dominated by an integrable
function g, that is, there exists an integrable function g : X R such that
f
(x, t) g(t)
for all x V and almost all t X .
x
R
Then the function x 7 I (x) = f (x, t) dt is differentiable at x0 , and moreover we
have
Z
Z
f
d
f (x, t) dt
=
(x0 , t) dt.
dx
X
X x
x=x0
Those theorems are very useful in particular when dealing with the theory
of the Fourier transform.
Counterexample 3.14 Here again, the assumption of domination cannot be dispensed with.
cos(x t)
dt;
1 + t2
one can then show (see Exercise 3.10 on page 84) that
F (x) = e |x|
2
for all x R. This function is continuous on R, but is not differentiable at 0.
3.3.c
THEOREM 3.15 With X = R and the same assumption as in Theorem 3.13, let
moreover v : V R be a function differentiable at x0 . Let
Z v(x)
I (x) =
f (x, t) dt.
0
79
v(x)
0
f
(x , t) dt
x 0
f (x0 + h, )
Cnt h 2
f (x0 , v(x0 ))
h v (x) f x0 , v(x0 )
f (x0 , )
v(x0 )
v(x0 + h)
h v (x0 )
3.4
Double and multiple integrals
For completeness and ease of reference, we recall the theorem of Fubini
mentioned at the end of the previous chapter.
THEOREM 3.16 (Fubini-Lebesgue) Let (X , T , ) and (X , T , ) be measure
spaces, each of which is the union of a sequence of sets with finite measure, for instance,
X = Y = R with the Lebesgue measure. Let f : X Y R be a mesurable
function on X Y for the product -algebra. Then:
i) f is integrable on X Y with respect to the product measure if and only
if one of the following integrals is finite:
Z Z
Z Z
f (x, y) dy dx
f (x, y) dx dy.
or
Integral calculus
80
X Y
f d( )(x, y).
ZZ
f =
X Y
Z Z
Y
Z Z
f (x, y) d(x) d( y) =
f (x, y) d( y) d(x).
X
each of which is the union of a sequence of sets with finite measure, for instance,
X = Y = R with the Lebesgue measure. Let f : X Y R be a non-negative
measurable function. Then the integral of f on X Y with respect to the product
measure , which is either a non-negative real number or +, is given by the
Fubini formula
Z Z
ZZ
f (x, y) d( )(x, y) =
f (x, y) d(x) d( y)
X Y
Y
X
Z Z
=
f (x, y) d( y) d(x).
X
Change of variables
81
3.5
Change of variables
The last important point of integral calculus is the change of variable formula in multiple integrals. Suppose we wish to perform a change of variable
in 3 dimensions
(x, y, z) 7 (u, v, w) = (x, y, z),
(and analogously in n dimensions); then the jacobian of this change of variable
is defined as follows:
DEFINITION 3.19 Let and be two open sets in R3 . Let be a bijection
u u u
x y z
v
def D(u, v, w) def v
.
=
Jac (x, y, z) =
D(x, y, z) x y z
w w w
x y z
and moreover
Z
Z
f (u1 , . . . , un ) du1 . . . dun =
V
( f ) J is integrable on U
f (x1 , . . . , xn ) J (x1 , . . . , xn ) dx1 . . . dxn .
Here integration and integrability are with respect to the Lebesgue measure on Rn , and
mean that each component of f is integrable.
Integral calculus
82
r
=
x 2 + y2,
x = r cos ,
or
y
y = r sin ,
p
.
= 2 arctan
x + x 2 + y2
(Tricky question: why shouldnt we write = arctan( y/x) as is often seen?
Why do most people write it anyway?)
The jacobian of this change of variable is
x y
D(x, y) r r cos
sin
=
=
D(r, ) x y r sin r cos = r,
and we recover the well-known formula:
ZZ
Z Z
f (x, y) dx dy =
f (r, ) r dr d.
D(r,)
D(x, y)
directly from the formula is rather involved, but a classical result of differential calculus avoids
this computation: we have
D(r, ) D(x, y)
= 1,
D(x, y) D(r, )
and this result may be generalized to arbitrary locally bijective differentiable maps in any
dimension.
Remark 3.22 In one dimension, the change of variable formula is
f (u) du =
f (x) | (x)| dx.
It is important not to forget the absolute value in the change of variable, for instance,
Z
Z
Z
f (x) dx =
f ( y) |1| dy =
f ( y) dy.
R
Sometimes absolute values are omitted, and one writes, for instance, dy = dx. In this case,
the range of integration is considered to be oriented, and the orientation is reversed when the
change of variable is performed if it corresponds to a decreasing function : x 7 y = (x).
Thus we have
Z +
Z
Z +
f (x) dx =
f ( y) (dy) =
f ( y) dy,
with the minus sign cancelling the effect of interchanging the bounds of integration.
Exercises
83
EXERCISES
Exercise 3.3 Show that
|sin t|
2
dt
log x.
x+
t
Exercise 3.4 Check that the jacobian for the change of variables from cartesian coordinates
lim
1
0
n /2 x
dx
1 + n2 x 2
and
lim
1/
|sin x| n
dx.
1 + x2
Exercise 3.6 Let f () be the distribution function of energy emitted by a black body, as a
function of the frequency. Recall that, by definition, the total amount of energy with frequency
between 0 and 1 is given by the integral
Z 1
f () d.
0
Max Planck showed at the end of the nineteenth century that the distribution function f ()
is given by the formula (which now bears his name)
f () =
A 3
,
} 1
e 2h
where A is a constant, = 1/kb T is the inverse temperature, and h} is the Planck constant.
def
What is the distribution function g() of the energy as function of the wavelength = c /?
For which frequency max is f () maximal? What is the wavelength for which g() is maximal?
Why is it that max 6= c /max ?
Exercise 3.7 Show that the function x 7 arctan(x) arctan(x 1) is integrable on R (with
respect to Lebesgue measure, of course) and show that
Z +
arctan(x) arctan(x 1) dx = .
Exercise 3.8 (Counterexample to Fubinis theorem) For (x, y) [0, 1]2 , define
f (x, y) =
R1
sgn( y x)
.
max(x, y)2
R 1 R 1
Deduce from this the value of 0 0 f (x, y) dx dy.
R 1 R 1
ii) Similarly, compute 0 0 f (x, y) dy dx.
i) Compute
Integral calculus
84
that we have
f (x, y, z) r
r0
p
def
2
2
for some real number , where r = x + y + z 2 . Assuming that f is integrable on any
bounded region in R3 {0}, discuss the integrability of f at the origin, depending on the
value of . Generalize this to functions on Rd .
Exercise 3.10 (Fourier transform of a Lorentzian function) Let
F (x) =
cos u x
du
1 + u2
for all x R.
iii) Deduce from the previous question a differential equation satisfied by g. (Start by
comparing
2
x
2
x
and
.
x2 x2 + t2
t2 x2 + t2
Solve the differential equation, and deduce the value of F .)
Solutions of exercises
85
SOLUTIONS
Solution of exercise 3.4. Let (r, , ) denote spherical coordinates (where is latitude and
longitude). We have
x = r cos cos ,
and therefore the jacobian is
cos cos
J = r sin sin
r cos sin
y = r cos sin ,
cos sin
r sin sin
r cos cos
z = r sin ,
sin
r cos
0
= r 2 sin .
Solution of exercise 3.6. The function g() has the property that the amount of energy
radiated by frequencies in the interval [0 , 1 ], corresponding to wavelengths 0 and 1 , is
Z 0
equal to
g() d. The change of variable = c / leads to
1
Z 0
Z 1
Z 1
d
c
g() d =
g(c /) d =
g(c /) 2 d,
d
1
0
0
g(c /)
.
2
Note that there is no reason for the functions 7 f () and 7 2 f () to reach a maximum
at the same point. In fact, rather than the functions f and g themselves, it is the differential
forms (or measures) f () d and g() d which have physical meaning.
f () = c
which are consequences of the mean value theorem. To compute the value of the integral, let
F denote a primitive of arctan. Then we have
ZM
arctan(x) arctan(x 1) dx = F (M ) F (M 1) F (M ) + F (M 1)
M
= arctan( M ) arctan( M
),
where M [M 1, M ] and M
[M 1, M ] are given by the mean value theorem
again. The result stated follows immediately.
It follows that g(x) has a limit as x tends to zero on the right, namely,
g(0+ ) = F (0) = .
Similarly, for x < 0 we have g(x) = F (x) and hence g(0 ) = . Since the definition
gives g(0) = 0, we see that the function g is not continuous at 0.
86
Integral calculus
iii) A simple computation shows that the two derivatives mentioned in the statement of the
exercise are equal. From this, we deduce that
Z +
Z +
2
x
2
x
2
g(x)
=
cos
t
dt
=
cos t 2
dt
2
2
2
2
2
2
x
x
x +t
t
x +t
Z +
x
=
cos t
dt,
x2 + t2
using two integrations by parts. It follows that g is a solution of the differential equation
g (x) g(x) = 0.
The general solution of this equation is of the form
g(x) = e x + e x ,
where (+ , + ) are real constants (which give the value of g on R+ ), and ( , )
similarly on R .
Since g is bounded, the coefficient + of e t on R+ is zero, and the coefficient of
t
e on R also. Since g is odd and g(0+ ) = , we finally find that
g(x) = sgn(x) e |x| .
Since F (x) = sgn(x) g(x) and F (0) = , we finally find that
x R
F (x) = e |x| .
Chapter
Complex Analysis I
This chapter deals with the theory of complex-valued functions of one complex
variable. It introduces the notion of a holomorphic function, which is a function
f : C defined and differentiable on an open subset of C. We will see that
the (weak) assumption of differentiability implies, in strong contrast with the real
case, the (much stronger) consequence that f is infinitely differentiable. We will
then study functions with singularities at isolated points and which are holomorphic
except at these points; we will see that their study has important applications to
the computation of many integrals and sums, notably for the computation of the
Fourier transforms that occur in physics. In Chapter 5, we will see how techniques
of conformal analysis provide elegant solutions for certain problems of physics in
two dimensions, in particular, problems of electrostatics and of incompressible fluid
mechanics, but also in the theory of diffusion and in particle physics (see also
Chapter 15).
4.1
Holomorphic functions
Whereas differentiability in R is a relatively weak constraint,1 we will see
during the course of this chapter that differentiability in terms of a complex
variable implies by contrast many properties and rigidifies the situation, in
a sense that will soon be made precise.
1
A function f , defined on an open interval I of R and with values in R or C may be
differentiable at all points of I without, for instance, the derivative being continuous. For
instance, if we set f (0) = 0 and f (x) = x 2 sin(1/x) for all nonzero x, then f is differentiable
at all points of R, f (0) = 0, but f is not continuous at 0 (there is no limit of f at 0).
Complex Analysis I
88
4.1.a
Definitions
z0 C if
f (z) f (z0 )
z z0
exists, i.e., if there exists a complex number, denoted f (z0 ) such that
f (z) f (z0 ) f (z0 ) (z z0 ) = o(z z0 ) [z z0 ].
f (z0 ) = lim
def
zz0
One may ask whether this definition is not simply equivalent to differentiability in R2 , after identifying the set of complex numbers and the real plane
(see sidebar 3 on page 134 for a reminder on basic facts about differentiable
functions on R2 ). The following theorem, due to Cauchy, shows that this is
not the case and that C-differentiability is stronger.
THEOREM 4.2 (Cauchy-Riemann equations) Let f : C be a complexvalued function of one complex variable; denote by fe the complex-valued function
on R2 naturally associated to f , that is, the function
fe :
R2 C,
def
(x, y) 7 fe(x, y) = f (x + i y).
While we are speaking of this, we can recall that the study of the complex plane owes
much to the works of Rafaele Bombelli (15261573) (yes, so early!), who used i to solve algebraic
equations (he called it di meno, that is, [root] of minus [one]), and of Jean-Robert Argand
(17681822) who gave the interpretation of C as a geometric plane. The complex plane is
sometimes called the Argand plane. It is also Argand who introduced
the modulus for a
p
complex number. The notation i, replacing the older notation 1, is due to Euler (see
page 39).
Holomorphic functions
89
fe is R2 -differentiable
Cauchy-Riemann equation.
fe
fe
(x0 , y0 ) = i (x0 , y0 ).
x
y
Proof
Assume that f is differentiable in the complex sense at a point z0 . We can define
a linear form d fe(z0 ) : R2 C in the following manner:
def
d fe(z0 ) (k, l ) = f (z0 ) k + il .
it follows that
d fe =
f
= f (z0 )
x
fe
fe
dx +
dy
x
y
and
f
= i f (z0 ),
y
fe
fe
fe
dx +
d y (k, l ) =
dx + i dy (k, l )
d fe(x0 , y0 ) (k, l ) =
x
y
x
=
fe
(k + il ),
x
and thus that f is differentiable and that its derivative is equal to f (z0 ) =
fe
(z ).
x 0
Complex Analysis I
90
following equations:
Cauchy-Riemann equations.
Q
P
=
x (x0 , y0 )
y (x0 , y0 )
(v) f is holomorphic on .
(Solution on page 129)
4.1.b Examples
Example 4.6 The functions z 7
z 2,
Holomorphic functions
91
this, we compute
f (z) f (0)
z
= .
z 0
z
If now we let z follow a path included in R approaching 0, this quotient is
equal to 1 thoughout the path, whereas if we let z follow a path included in
the set of purely imaginary numbers, we find a constant value equal to 1;
this shows that the quantity of interest has no limit as z 0, and thus that f is
not differentiable at 0. If we add a constant, we see that it is not differentiable
at z0 for any z0 .
Example 4.8 The function z 7 |z| is nowhere holomorphic, and similarly for the functions
4.1.c
dy(k, l) = l.
def
For any function f : C, R2 -differentiable (and therefore not necessarily differentiable in the complex sense), we have (check sidebar 3 on page 134):
df =
f
f
dx +
dy.
x
y
For those readers interested in theoretical physics, this treatment as independent variables
can be found also in field theory, for instance in the case of Grassmann variables.
Complex Analysis I
92
and dy =
1
(dz
2i
df =
def 1
=
z
2
i
x
y
def 1
=
z
2
and
+i
x
y
f
f
dz +
dz .
z
z
holomorphic, at z0 , we have
f
(z ) = f (z0 )
z 0
and
f
(z ) = 0.
z 0
This last relation is none other than the Cauchy-Riemann condition. One may write
also the two preceding equations as
f
f
f
(z ) =
(z ) = i
(z ).
z 0
x 0
y 0
Example 4.12 The function f : z 7 Re(z) =
point.
z+z
2
f
z
1
2
at every
Remark 4.13 We could also change our point of view and look at functions that can be
expressed in terms of z only:
DEFINITION 4.14 A function f : C is antiholomorphic if it satisfies
f
=0
z
Cauchys theorem
93
Remark 4.15 Be careful not to argue that f / z = 0 at all points, so the function
is constant, which is absurd. The function z 7 z does satisfy the condition, and
therefore is antiholomorphic, but it is certainly not constant! The quantity f / z is
equal to f (z) only for a holomorphic function. For an arbitrary function, it does
not have such a clear meaning.
4.2
Cauchys theorem
In this section, we will describe various versions of a fundamental result of
complex analysis, which says that, under very general conditions, the integral
of a holomorphic function around a closed path is equal to zero. We will start by
making precise what a path integral is, and then present a simple version of
Cauchys theorem (with proof), followed by a more general version (proof
omitted).
4.2.a
Path integration
We now need a few definitions: argument, path, and winding number of a path
around a point.
DEFINITION 4.16 (Argument) Let be an open subset of C = C \ {0} and
which satisfies (x) = 0 for any real number x > 0. It is called the principal
determination of the argument and is denoted Arg.
Remark 4.17 One should be aware that a continuous determination of the argument
does not necessarily exist for a given open subset. In particular, one cannot find one
for the open subset C itself, since starting from argument 0 on the positive real axis
and making a complete circle around the origin, one comes back to this half-axis with
argument equal to 2, so that continuity of the argument is not possible.
plane is a continuous map from [0, 1] into (one may also allow other
Complex Analysis I
94
sets of definitions and take an arbitrary real segment [a, b] with a < b). If
: [0, 1] is a curve, we will denote by e its image in the plane; for
simplicity, we will also often write simply without risk of confusion. A
curve is closed if (0) = (1). It is simple if the image doesnt intersect
itself, which means that (s) = (t) only if s = t or, in the case of a closed
curve, if s = 0, t = 1 (or the opposite). A path is a curve which is piecewise
continuously differentiable.
DEFINITION 4.19 (Equivalent paths) Two paths
: [a, b] C
: [a , b ] C
and
are called equivalent if their images coincide and the corresponding curves
have the same orientation: ([a, b]) = ([a , b ]), with (a) = (a ),
(b) = (b). In other words, there exists a re-parameterization of the path
, preserving the orientation, which transforms it into , which means a
bijection u : [a, b] [a , b ] which is continuous, piecewise differentiable
and strictly increasing, and such that (x) = u(x) for all x [a, b].
Example 4.20 Let :
[0, 2] C
and
[0, /2] C
t 7 e
t 7 e 2i cos t .
These define equivalent paths, with image equal to the unit circle.
it
One must be aware that this formula holds only for a real-valued function.
Cauchys theorem
95
One checks (using the change of variable formula) that this definition
does not depend on the parameterization of the path , that is, if is a path
equivalent to , the right-hand integral is simultaneously defined and gives
the same result with or . We then have the following theorem:
THEOREM 4.22 Let f be a holomorphic function on and : [a, b] C a path
f (b) f (a) =
f
(z) dz.
z
(4.3)
L =
b
a
(t) dt.
then
Z
f (z) dz sup f (z) L .
z
7 a + R e i ,
Complex Analysis I
96
The picture on the right shows the various values of the function Ind for
a curve in the complex plane. One sees that the meaning of the value of the
function Ind is very simple: for any point not on , it gives the number
of times the curve turns around z, this number being counted algebraically
(taking orientation into account). Being in the unbounded connected component of U means simply being entirely outside the loop.
1
2
0
1
2
0
Exercise 4.3 Show by a direct computation that if is a circle with center a and radius r ,
1
0
if |z a| < r ,
if |z a| > r .
F (z) dz = 0.
Cauchys theorem
Proof. Indeed,
(b ) = (a).
97
F (z) dz = F (b ) F (a) = 0 since is a closed path so
THEOREM
4.30
z n+1
,
(n + 1)
Proof. The proof, elementary but clever, is well worthy of attention. Because of its
length, we have put it in Appendix D.
Remark 4.32 We have added some flexibility to the statement by allowing f not to be holomorphic at one point (at most) in the triangle. In fact, this extra generality is illusory. Indeed,
we will see further on that if a function is holomorphic in an open subset minus a single
point, but is continuous on the whole open set, then in fact it is holomorphic everywhere on
the open set.
Complex Analysis I
98
THEOREM 4.33 (Cauchys theorem for convex sets) Let be an open convex
Proof. Fix a point a and, for all z , define, using the convexity property,
Z
def
F (z) =
f ( ) d ,
[a,z]
def
[a, z] = (1 )a + z ; [0, 1] .
Finally, there exists an even more general version of the theorem, which is
optimal in some sense. It requires the notion of a simply connected set (see Definition A.10 on page 575 in Appendix A). The proof of this is unfortunately
more delicate. However, the curious reader will be able to find it in any good
book devoted to complex analysis [43, 59].
THEOREM 4.34 (Cauchys theorem for simply connected sets) Let be a simply connected open subset of the plane and let f H (). Then for any closed
inside ,
Z
f (z) dz = 0.
Remark 4.35 In fact, one can show that if f H (), where is an open simply connected
set (as for instance a convex set), then f admits a primitive on . The vanishing of the integral
then follows as before.
In the case of an open set with holes, one can use a generalization of
Cauchys theorem. It is easy to see that such an open set can be dissected in a
manner similar to what is indicated in Figure 4.1. One takes the convention
that the boundary of the holes is given the negative (clockwise) orientation,
and the exterior boundary of the open set is given the positive orientation.
The integral of a holomorphic function along the boundary of the open set,
with this orientation, is zero.
99
Fig. 4.1 One can dissect an open set having one or more holes to make it simply
connected. The integral on the boundary of the new open set thus defined
is zero since is simply connected, which shows that the integral on (with
the orientation chosen as described on the left) is also zero.
4.2.e
Application
One consequence of Cauchys theorem is the following: one can move the
integration contour of an holomorphic function continuously without changing the value of the integral, under the condition at least that the function
be holomorphic at all the points which are swept by the path during its
deformation.
THEOREM 4.36 Let be an open subset of C, f HR(), and aRpath in . If
f (z) dz =
f (z) dz.
Proof. The proof is somewhat delicate, but one can give a very intuitive graphical
description of the idea. Indeed, one moves from the contour in Figure 4.2 (a) to the
contour of Figure 4.2 (b) by adding the contour of Figure 4.2 (c). But the latter does
not contain any hole (since is deformed continuously to ); therefore the integral
of f on the last contour is zero.
4.3
Properties of holomorphic functions
4.3.a
Complex Analysis I
100
(a)
(b)
(c)
Z
Z
f ( )
d
0=
d +
f (z)
Z
f ( )
=
d + 2i Ind (z) f (z)
z
by definition of the winding number of a path around a point.
Remark 4.38 This result remains valid for a simply connected set.
Let us apply this theorem to the case, already mentioned, where the path
of integration is a circle centered at z0 with radius r: = C (0 ; r). One can
then parameterize by z = z0 + re i and we have
Z
Z
1
f (z)
1
f (z0 + re i ) i
f (z0 ) =
dz =
rie d
2i z z0
2i
r e i
Z
1
=
f (z0 + re i ) d,
2
which allows us to write:
THEOREM 4.39 (Mean value property) If is an open subset of C and if f is
holomorphic on , then for any z0 and any r R such that B(z0 ; r) , we
1
2
101
f (z0 + re i ) d.
f H () f is analytic on .
Recall that f is analytic on an open subset if and only if, at any point
of , f admits locally a power series expansion converging in an open disc
with non-zero radius. The proof will use Cauchys Formula:
P
Proof. Let us prove the implication . If the power series f (z) =
n=0 c n (z
a)n converges on an open ball
r > 0 around the point a, then according
P with radius
to Theorem 1.69, the series ncn (z a)n1 converges on the same open ball. Without
loss of generality, we can work with a = 0. Denote
X
def
ncn z n1 ,
z B(0 ; r ).
g(z) =
n=0
X
f (z) f (w)
z wn
g(z) =
cn
nw n1 .
zw
zw
n=1
n1
X
z wn
nw n1 = (z w)
kw k1 z nk1 .
zw
k=1
X
f (z) f (w)
g(z) |z w|
n2 |cn | n2 .
zw
n=1
Complex Analysis I
102
But this last series is convergent (Theorem 1.69). Letting z tend to w, we find that f is
differentiable at w and that g(w) = f (w).
So it follows that f is holomorphic
Pon B(a ; r ), and its derivative admits a power
series expansion of the type f (z) = n1 ncn (z a)n1 . By induction one can even
show that the successive derivatives of f are given by the formula in Theorem 1.69 on
page 35, which is therefore valid for complex variables as well as for real variables.
Let us prove the implication . We assume then that f H ().
Let a , and let us show that f admits a power series expansion around a.
Since is open, one can find a real number R > 0 such that B(a ; R) . Let be
a circle centered at a and with radius r < R. We have (Theorem 4.37)
Z
1
f ( )
f (z) =
d
z B(a ; r ).
2i z
But, for any , we have
z a |z a|
< 1;
a
r
a
z
X
(z a)n
1
=
n+1
(
a)
z
n=0
converges uniformly with respect to on , and this for any z B(a ; r ), which allows us
to exchange the series and the integral and to write
Z X
Z
(z a)n
1 X
f ( )
1
n
f (z) =
f
(
)
d
=
(z
a)
d
2i n=0 ( a)n+1
2i n=0
( a)n+1
X
=
cn (z a)n ,
n=0
with
Z
1
f ( )
d .
2i
( a)n+1
So we have shown that f admits a power series expansion around a. Moreover we now
(see the general theory of power series) that in fact we have cn = f (n) (a)/n!, and this
gives the formula of the theorem.
In conclusion, f is indeed analytic on and the formula is proved.
def
cn =
tiable on .
We can now pause to review our trail so far. We have first defined a
holomorphic function on as a function differentiable at any point of .
Having a single derivative at any point turns out to be sufficient to prove
that, in fact, f has derivatives of all order, which is already a very strong
result. Even stronger is that f is analytic on , that is, that it can be expanded
in power series around every point. And the proof shows something even
beyond this: is we take a point z , and any open ball centered at z entirely
103
ez =
X
zn
n!
n=0
with infinite radius of convergence. Around a point z0 C, the function has (another)
expansion
ez =
X
e z0 (z z0 )n
n=0
n!
The following theorem is due to Cauchy, but bears the name of Liouville7:
THEOREM 4.45 (Liouville) Any bounded entire function is constant.
This theorem tells us that any entire function which is not constant must
necessarily tend to infinity somewhere. This somewhere is of course at
infinity in the complex plane, but one must not think that it means that
f (z) tends to infinity when |z| tends to infinity. For instance, if we consider
the function z 7 exp(z), we see that if we let z tend to infinity on the left
6
There can of course be convergence even outside this ball; this brings the possibility of
analytic continuation (see Section 5.3 on page 144).
7
Not because it was thought that Cauchy already had far too many theorems to his name
but because, Liouville having quoted it during a lecture, a German mathematician attributed
the result to him by mistake; as sometimes happens, the mistake took hold.
Complex Analysis I
104
side of the plane (that is, by letting the real part of z go to ), then exp(z)
tends to 0. It is only if the real part of z tends to + that |e z | tends to
infinity.
Proof of Liouvilles theorem. We start by recalling the following result:
LEMMA 4.46 If f is an entire function such that f (z) 0 on C, then f is constant.
4.3.b
What does this result mean? At any point in the open set of definition,
the modulus of the holomorphic function cannot have a local maximum. It
can have a minimum: it suffices to consider z 7 z which a local (in fact,
global) minimum at z = 0. But we can never find a maximum.
105
We will see later that this theorem also applies to real harmonic functions
(such as the electrostatic potential in a vacuum), which will have important
physical consequences (see Chapter 5; it is in fact there that we will give,
page 141, a graphical interpretation of the maximum modulus principle).
f (z) dz = 0
dx dy.
x
y
K
K
8
George Green (17931841), English mathematician, was born and died in Nottingham,
where he wasnt sheriff, but rather miller (the mill is still functional) and self-taught. He
entered Cambridge at 30, where he got his doctorate four years later. He studied potential
theory, proved the Green-Riemann formula, and developed an English school of mathematical
physics, followed by Thompson (discoverer of the electron), Stokes (see page 472), Rayleigh,
and Maxwell.
9
The boundary of K , denoted K , must be of class C 1 and, for any point z K , there
must exist a neighborhood Vz of z homeomorphic to the unit open ball B(0 ; 1), such that
Vz K corresponds by the homeomorphism with the upper half-plane. This avoids a number
of pathologies.
106
Complex Analysis I
mitting continuous partial derivatives with respect to each of them, and let K be a
sufficiently regular compact subset of C. Then we have
Z
ZZ
F
F (z, z ) dz = 2i
dA ,
z
K
K
where dA denotes the area integration element in the complex plane. One can also
write F as a function of x and y, which gives us (with a slight abuse of notation)
Z
ZZ
F
F (x, y) dz = 2i
dx dy.
z
K
K
Proof. It is enough to write F = P + i Q andRto apply the formula from Theorem 4.50 to the real and to the imaginary parts of K F dz.
The Green-Riemann formula should be compared with the Stokes formula10 for differential forms, given page 473.
point if there exists a sequence (zn )nN of elements of Z \{c } such that zn c .
(It is important to remark that the definition imposes that zn 6= c for all
n N; without this condition, any point would be an accumulation point, a
constant sequence equal to c converging always to c .)
THEOREM 4.53 (Classification of zero sets) Let be a domain of C (recall that
this means a connected open set) and let f H (). Define the zero set of f by
def
Z( f ) = a ; f (a) = 0 .
10
107
for all z .
Interpretation
Suppose we have two holomorphic functions f1 and f2 , both defined on the
same domain , and that we know that f1 (z) = f2 (z) for any z belonging
to a certain set Z. Then if Z has at least one accumulation point in , the
functions f1 and f2 are necessarily equal on the whole set .
Thus, if we have a function holomorphic on C and if we know its values for instance on R+ , then it is theoretically entirely determined. In
particular, if f is zero on R+ , then it is zero everywhere.
Note that is not simply open, but it must also be connected. Indeed, if
were the union of two disjoint open subsets, the rigidity of the function
would not be able to bridge the gap between the two open subsets, and the
behavior of f on one of the two open subsets would be entirely uncorrelated
with its behavior on the other.
def
Counterexample 4.55 Assume we know a holomorphic function f at the points zn = 1/n, for
N ,
all n
and that in fact f (zn ) = n. In this case, f is not uniquely determined. In fact,
both functions f1 (z) = 1/z and f2 (z) = exp(2i/z)/z satisfy f1 (zn ) = f2 (zn ) = f (z) for all
n N. What gives? Here, because f (z) when z tends to 0, f is not holomorphic at 0;
the accumulation point, which is precisely 0, does not belong to and the theorem cannot be
applied to conclude.
COROLLARY 4.55.1 Let be a connected open subset and assume that the open ball
B = B(z0 ; r) with r > 0 is contained in . If f H () and if f is zero on B,
then f 0 on .
COROLLARY 4.55.2 Let C be a connected open subset, and let f and g be
holomorphic on and such that f g 0 on . Then either f 0 on or g 0
on .
Proof. Let us suppose that f 6 0; let then z0 be a point of such that f (z0 ) 6= 0.
Since f is continuous, there exists a positive real number r > 0 such that f (z) 6= 0
def
for all points in B = B(z0 ; r ). Hence g 0 on B and Corollary 4.55.1 allows us to
conclude that g 0 on .
Complex Analysis I
108
COROLLARY 4.55.3 Let f and g be entire functions. If f (x) = g(x) for all
x R, then f g.
These theorems also show that functional relations which hold on R remain valid on C. For instance, we have
x R
cos2 x + sin2 x = 1.
The function z 7 cos2 z + sin2 z 1 being holomorpic (it is defined using the
squares of power series with infinite radius of convergence, and is therefore
holomorphic on C) and being identically zero on R, it must be zero on C;
we therefore have
z C
cos2 z + sin2 z = 1.
4.4
Singularities of a function
Certain functions are only holomorphic on an open subset minus one or
a few points, for instance the function z 7 1/z, which belongs to the set
H C \ {0} . It is in general those functions which are useful in physics.
The points at which f is not holomorphic are called singularities and have a
physical significance in many problems.11
4.4.a
Classification of singularities
Thus, in linear response theory, the response function will be, depending on conventions,
analytic (say) in the complex upper half-plane, but will have poles in the lower half-plane.
Those poles correspond to the energies of the various modes. In particle physics, they will be
characteristic of the mass of an excitation (particle) and its half-life.
Singularities of a function
109
g(z) =
f (z) f (z0 )
z z0
z \ {z0 }.
f (z) f (z0 ) if z 6= z0 ,
def
g(z) =
z z0
f (z0 )
if z = z0 ,
then we obtain a function which is holomorphic on .
b) for any r > 0 such that B(a ; r) , the image of B(a ; r) \ {a} by f is
dense in C.
Proof. Assume there exists r > 0 such that the image of B(a ; r ) is not dense in C.
Then, for some C and some > 0, we have
| f (z) | >
for all z B(a ; r ) \ {a} .
def
Consider now the function g, defined on B(a ; r )\{a} by g(z) = 1/ f (z) . Then g
is holomorphic on B(a ; r ) \ {a} and, moreover, |g(z)| < 1/ for all z B(a ; r ) \ {a}.
According to the removable singularity theorem, we can continue the function g to a
function g : B(a ; r ) C. This new function g is then nonzero everywhere, except
possibly at a. For all z 6= a, we have
1
f (z) = +
.
()
g (z)
If g (a) were not 0, the right-hand side of () would be holomorphic on B(a ; r ) and
f would have an artificial
singularity at a, which we assumed was not the case. Hence
g (a) = 0 and lim f (z) = +.
za
Complex Analysis I
110
lim f (z) = +,
za
function f takes any value in C except 0 and oscillates so fast, in fact, that it reaches any
of these complex values infinitely often. This can be seen as follows. Let be an arbitrary
nonzero complex number. There exists a complex number w such that e w = . If we now put
def
zn = 1/(w + 2in), then zn 0 and f (zn ) = exp(1/zn ) = for all n N.
n
4.4.b
Meromorphic functions
DEFINITION 4.63 A subset S of C is locally finite if, for any closed disk D,
the subset D S is finite. If is an open set in C and if S , S is locally
finite in if and only if for any closed disque D contained in , the set
D S is finite.
Example 4.64 Let f be a function holomorphic on an open set and not identically zero.
Example 4.67 The function z 7 1/z is meromorphic on C. The set of its singularities is the
singleton {0}.
Laurent series
111
F (z) =
1
f (z)
for all z U \ Z( f )
is a meromorphic function.
Proof. The function F is holomorphic on \ Z ( f ) and, since Z ( f ) is locally
finite, according to Theorem
4.53 on page 106, it suffices to check that for any point
a Z ( f ), we have lim F (z) = +, which is immediate by continuity.
za
Example 4.69 If P and Q are coprime polynomials with complex coefficients (i.e., if they have
Branch points appear when a function cannot be defined on a simply connected open set, as,
for instance, the complex logarithm or other multivalued functions (see Section 5.1.a
on page 135).
Singularities at infinity are defined by making the substitution w = 1/z. One obtains a new
function F : w 7 F (w) = f (1/w); then f has a singularity at infinity if F has a
singularity at 0, those singularities being by definition of the same kind (see page 146).
4.5
Laurent series
4.5.a
We have just seen that a function holomorphic on \ z0 could have singularities of two different kinds at z0 : an essential singularity or a pole. We
can shed some light on this distinction in the following manner. If f can be
written in the form of the following series (which is not a power series, since
coefficients with negative indices occur):
X
f (z) =
an (z z0 )n ,
nZ
then
112
Complex Analysis I
The next theorem stipulates that f usually admits such an expansion.
0
z
We have then, for any point z in the annulus, by Cauchys formula applied to f
and :
Z
Z
Z
1
f ( )
1
f ( )
1
f ( )
f (z) =
d =
d
d
2i z
2i 1 z
2i 2 z
12
Pierre Laurent, (18131853), a former student at the cole Polytechnique, was a hydraulic
engineer and mathematician.
Laurent series
113
in the limit where both segments coincide. It is enough then to write that, for any
1 , we have | z| < 1 and hence
X
1
1
zn
1
=
=
,
n+1
z
1 z/
n=0
1
X
1
1
1
zn
=
=
.
z
z
1 /z
n+1
n=
This proves the first formula stated. Moreover, the value of an is given by the integral
on 1 or 2 depending on whether n is positive or negative. The integral on any
intermediate curve gives of course the same result.
P
an z n converges
for a certain z0 C, then
it
converges
for
any
z
C
with
|z|
<
|z
|.
We
have,
for Laurent
0
P
n
series, a similar result: if +
n= a n z converges for z1 and z2 with |z1 | < |z2 |, then it converges
for any z C such that |z1 | < |z| < |z2 |. Similarly, for a given Laurent series, there exist
two real numbers 1 and 2 (with 2 possibly +) such that the Laurent series converges
uniformly and absolutely in the interior of the open annulus 1 < |z| < 2 and diverges
outside. This result is elementary and does not require the theory of holomorphic functions.
Remark 4.74 We know that a power series has a radius of convergence, that is, if
Remark 4.75 The proof of the theorem also shows that f can be written as f = f1 + f2 , where
f1 is holomorphic in the disc |z| < 2 and f2 is holomorphic in the domain |z| > 1 . If one
asks, in addition, that f1 (z) 0, then this decomposition is unique.
z0
4.5.b
In the case where a function f has only one singularity, for instance,
1
def
,
f (z) =
z1
the Laurent series may be defined on the (infinite) annulus
def
f (z) =
1
1 1
=
z 1
z 1
1
z
X
1
n
z
n=1
(4.6)
for |z| > 1. Note that this series has infinitely many negative indices, but
beware, this does not mean that there is an essential singularity! Indeed, this
infinite series comes from having performed the expansion in an annulus
around 0, while the pole is at 1. A Laurent series expansion around z = 1
gives of course
1
f (z) =
z1
with no other term. Remark that this Laurent series expansion (4.6) is valid
only in the annulus |z| > 1, outside of which the series diverges rather
trivially.
Complex Analysis I
114
This same function f can be expanded on the annulus 0 < |z| < 1 by
X
f (z) =
zn.
n=0
When a function has more than one singularity, the annulus must remain
between the poles; for example, the function g(z) = 1/(z 1)(z + 2) admits
a Laurent series expansion on the annulus 1 < |z| < 2. Note that it also
admits a (different) Laurent series expansion on the annulus 2 < |z|, and a
third one on 0 < |z| < 1 (this last expansion without any nonzero coefficient
with negative index, since the function g is very cosily holomorphic at 0).
nected if, for any holomorphic function f : C, there exists a holomorphic function F : C such that F = f .
We will admit the following results:
if, for any holomorphic function f : C and for any closed path inside , we
have
Z
f (z) dz = 0.
if it is simply connected.
Consequently, we will drop the appellation holomorphically simply connected from now on. The notion of simple connectedness acquires a richer
meaning because of the preceding lemma however.
DEFINITION 4.79 (Residue) Let f be a function that can be expanded in
Laurent series in an annulus 1 < |z z0 | < 2 :
X
f (z) =
an (z z0 )n
for 1 < |z z0 | < 2 .
nZ
N
X
n=1
an
,
(z z0 )n
Laurent series
115
1 < |z z0 | < 2 .
(4.7)
z
which is the required formula.
Another formulation of this theorem, which will be of more use in practice, is as follows:
THEOREM 4.81 (Residue theorem) Let C be a simply connected
open set
i=1
116
Complex Analysis I
(4.8)
How can one know if f has a simple pole at z0 and not a higher order
pole? If one has no special intuition in the matter, one can try to use the
preceding formula: if it works, this means the pole was simple; if it explodes,
the pole was of higher order!13
Consider, for example, the case where f is a function that has a pole of
order 2 at z0 . Then we have
a2
a1
f (z) =
+
+ g(z)
with g H ().
(4.9)
2
(z z0 )
(z z0 )
Formula (4.8) then gives indeed an infinite value in this situation. How, then,
can one obtain a1 without being annoyed by the more divergent part? If
we multiply equation (4.9) by (z z0 )2 , we get
(z z0 )2 f (z) = a2 + a1 (z z0 ) + (z z0 )2 g(z).
which is meromorphic on the complex plane and has two poles at the points +ia and ia.
These poles are simple and we have
(z ia)
Res ( f ; ia) = lim (z ia) f (z) = lim
zi a
zi a (z ia)(z + ia)
1
1
= lim
=
.
zi a (z + ia)
2ia
13
I agree that this is a dangerous method, but as long as it is restricted to mathematics and
is not applied to chemistry, things will be fine.
117
4.6
Applications to the computation of horrifying
integrals or ghastly sums
The goal of this section is to show how the knowledge of Cauchys theorems and residues can help computing some integrals which are otherwise
quite difficult to evaluate explicitly. In certain cases one can also evaluate the
sum of a series using similar techniques.
The idea, when trying to compute an integral on R, is to start with only
a piece of the integral, on an interval [R, R], then close this path in the
complex plane, most often by a semicircle, checking that the added piece does
not contribute to the integral (at least in a suitable limit), and compute this
new path integral by the Residue theorem. Before presenting examples, we
must establish conditions under which the extra piece of the integral will
indeed be negligible.
4.6.a
Jordans lemmas
Jordans lemmas are little theorems which are constantly useful when trying
to compute integrals using the method of residues.
The first of Jordans lemmas is useful for the computation of arbitrary
integrals.
THEOREM 4.84 (First Jordan lemma) Let f : C C be a continuous function
in the sector
def
S = r e i ; r > 0 and 0 1 2 ,
r +
(r )
(r )
2
1
Complex Analysis I
118
S = r e i ; r 0 and 0 1 2 ,
such that z
lim f (z) = 0. Then
zS
lim
r +
f (z) e i z dz = 0.
(r )
Proof. Consider the case (the most difficult) where the sector is 0 . Remarking that
i re i
e
= e r sin ,
we can bound the integral under consideration by
Z
Z
iz
f (r e i ) r e r sin d
f
(z)
e
dz
(r)
0
Z /2
f (r e i ) + f (r e i () ) r e r sin d.
Z /2
Z /2
r e r sin
r e 2r/ d = (1 e r ) ,
2
2
0
0
which shows that for any r R, we have
Z
iz
f (z) e dz .
(r)
119
lim
R+ (R)
F (z) dz =
F (x) dx.
Moreover, when R goes to infinity, the contour (R) ends up containing all
the poles of F in the upper half-plane, i.e., for R large enough, we have
Z
X
F (z) dz = 2i
Res (F ; a),
(R)
a pole in
upper half-plane
F (x) dx = 2i
Res (F ; a).
(4.11)
a pole in
upper half-plane
Remark 4.86 We might, of course, have used a different contour in the argument to close the
segment [R, R] in the lower half-plane. Then we would have had to compute the residues
in the lower half-plane. But then one would have had to be careful that with the orientation
chosen the winding number of the contour around each pole would be equal to 1 and not
1, since the contour would have been oriented negatively.
Complex Analysis I
120
4.6.c
Fourier integrals
which occur very often in practice, since they correspond to the value of the
Fourier transform of f evaluated at k. One can use the same type of argument
and contour as the one used in Section 4.6.b, using this time the second of
Jordans lemmas, but one must be careful of the sign of k! Indeed, the integral
on the upper semicircle
def
C + = R e i ; [0, ] =
(4.12)
R
of f (z) e ikz only tends to 0 when r tends to infinity when the real part of ikz
is negative; if z C + , we must then have k positive. If k is negative, we must
rather perform the integration along the lower semicircle
C = R e i ; [0, ] =
def
J =
(4.13)
cos(k x)
dx,
x 2 + a2
where a is a strictly positive real number. We start by writing the cosine in exponential form.
One way to do it is to use cos(k x) = (e i kx + e i kx )/2, but here it is simpler to notice that
Z
e i kx
J =
dx,
2
2
x + a
the imaginary part being zero. For this last integral, we must now distinguish the two cases
k > 0 and k < 0.
121
ka
e .
a
When k < 0, we must take the contour (4.13) and we get
Z
i kz
e i kz
e
J = lim
dz
=
2i
Res
;
ia
.
2
2
R
z 2 + a2
(R) z + a
J =
J = e ka .
a
To conclude, we have, independently of the sign of k (and even if k = 0)
Z
cos k x
dx = e |k|a .
2 + a2
x
a
4.6.d
where R(x, y) is a rational function which has no pole on the unit circle
(x 2 + y 2 = 1). Putting, naturally, z = e i t , we obtain the relations
e i t e i t
1
1
1
1
=
z
, cos t =
z+
.
dz = ie i t dt, sin t =
2i
2i
z
2
z
K =
R
C
z 1/z z + 1/z
,
2i
2
dz
,
iz
where C is the unit circle with the positive orientation. An immediate application of the Residue theorem shows that
K = 2i
a pole B(0 ; 1)
Res
1
1
1 1
1
R
z , z+
;a
iz
2i
z 2
z
(4.14)
Complex Analysis I
122
I=
dt
,
a + sin t
a R, a > 1.
1
iz a +
Res
a ple B(0 ; 1)
def
1
2i
1
z
; a .
Define now f (z) = 2/(z 2 + 2iaz 1), and it only remains to find the poles of f . Thesepare the
solutions of the equation
z 2 + 2iaz 1 = 0, and so they are the points = ia + i a 2 1
p
and = ia i a 2 1. They are located as shown in the figure below
p
2
Only the pole contributes to the integral,
p it is a simple pole with residue i/ a 1
and the integral is therefore equal to I = 2/ a 2 1.
f (n),
nZ
S=
X
nZ
f (n) =
nZ
Res f (z) cotan(z) ; z = n .
which we can deform (adding contributions which cancel each other in the
integral) into
14
Which is not a contour properly speaking since it consists of multiple pieces, but this is
without importance, the integral on this contour being defined as the sum of the integrals
on each circle.
123
In the limit where N tends to infinity, and if the function f decays to 0, the
contour becomes simply
Finally, this last integral can be computed by the method of residues, by using
Jordans Lemmas and closing each of the lines by semicircles. There only
remains to evaluate the residues of the function f outside the real axis.
Example 4.89 Suppose we want to evaluate the sum
S=
X
nZ
1
,
n2 + a 2
1
cotan(ia) = coth(a),
2ia
2a
coth(a).
(4.15)
a
2
One can note that, if a tends to 0, we have S 1/a , which is the expected behavior (the term
n = 0 of the sum is 1/a 2 ). In the same manner, we have
S=
X
n=1
1
=
coth(a) 2 .
n2 + a 2
2a
2a
The method must obviously be tailored to each case, as the next example
shows:
Example 4.90 We now try to evaluate the sum
T =
X
(1)n
n=1
n4
Instead of using the function cotan, it is interesting to use the function z 7 1/ sin z, which
has poles at every integer and for which the residue at n N is precisely equal to (1)n /.
Complex Analysis I
124
Unfortunately, the function z 7 1/z 4 has a pole at 0. This pole must then be treated separately.
We start by noticing that
T =
T
2
T =
with
X
(1)n
,
4
n= n
n6=0
and we deduce that the sum T is given by 1/2i times the integral on the following contour
of the function z 7 1/z 4 sin(z)
which gives
T = Res
1
z 4 sin z
; z=0 ,
the minus sign coming from the fact that the last path is taken with negative orientation. There
only remains to evaluate this residue, using, for instance, a Taylor expansion of 1/ sin z:
z2
z5
sin z = z
+
+ ,
6
120
1
1
2 z 2 4 z 4
1
=
1
+
+
z 4 sin z
z5
6
120
2
2
1
z
144 z 4
= 5 1+
+
+ ,
z
6
720
which shows that the residue of 1/z 4 sin z at 0 is
X
(1)n
n=1
n4
144
720
and that
74
.
720
Exercises
125
EXERCISES
Exercise 4.4 Show that if f is holomorphic on an open subset C, then
2
f
| f |2 = 4 .
z
(i.e., functions such that the partial derivatives f / x and f / y exist and are continuous).
Show that the following chain rules hold:
f g
f g
( f g) =
+
,
z
z z
z z
f g
f g
( f g) =
+
.
z
z z
z z
and
Exercise 4.6 Let U be a connected open set and let f : U C be a function such that f 2
Path integration
Exercise 4.7 Consider, in C, the segments T1 = [i, 6i], T2 = [i, 2 + 6i] and T3 =
2
[2
R + 5i, 6i]. Define
R now a function f : C C by f (x + i y) = x 3i y. Compute explicitly
f
(z)
dz
and
f
(z)
dz.
Conclude.
T
T T
1
compute
p
3 with equation
(x 1)2 + y 2 = 4,
dz/z 2 .
a R, the integral
R +
e x dx =
Z
e x cos(a x) dx.
(1 + x) sin 2x
dx
x 2 + 2x + 2
and
dx
(1 + x 2 )n
with n N
Complex Analysis I
126
I =
e i x dx
e x e 2i x dx,
which is the Fourier transform of the gaussian x 7 e x at the point . One may use
Exercise 4.9.
(Another method will be given in Exercise 10.1 on page 295, using the Fourier transform.)
Exercise 4.14 Let A be a finite subset of C containing no positive real number. Let
2i X
Res
1 e 2i aA
f (z) e L(z) ; a ,
Exercise 4.15 Let f be a function holomorphic in the open disc D(0; R) with R > 1.
def
Denote by the unit circle: = e i ; [0, 2[ .
Computing in two different ways the integrals
Z
Z
f (z)
f (z)
I=
2 + z + 1/ z
dz,
J =
2 z + 1/ z
dz,
z
z
show that
2 2
Z 2
2
f e i sin2 /2 d = 2 f (0) f (0).
0
Exercises
Exercise 4.16 Compute the value of
127
dx
.
x6 + 1
On may use the classical contour or, more cleverly, a sector with angle /3.
Exercise 4.17 Compute
log x
dx.
1 + x3
cos
x
2
x2 1
+
2
sin x dx =
dx.
cos x 2 dx =
1
2
.
2
Hint: Use the following contour: from 0 to R on the real axis, then an eighth of a circle
up to R e i /4 , and then back to the starting point along the imaginary axis.
Exercise 4.20 Show that
+
0
cosh a x
,
dx =
cosh x
2 cos 2 a
e az
, and consider the contour integral on the
cosh z
rectangle with vertices at [R, R, R + i, R + i].
Hint: Look for the poles of F (z) =
Exercise 4.21 Show that if R is a rational function without poles on the half-axis of positive
is defined by
ei
e i
dx
=
.
x (1 + x)
sin
Complex Analysis I
128
lim+
dx
.
(x 2 a 2 i)
S=
nN
1
n4 + a 4
and
T =
nN
n2
n4 + a 4
with a R, a 0.
Exercise 4.25 (Ramanujans formula) Using the function z 7 cotan z, deduce, by means
coth coth 2
coth n
197
+
+ +
+ =
.
7
7
7
1
2
n
56 700
Note: This beautiful formula is due to the great indian mathematician Ramanujan.15 But
Ramanujan did not use the calculus of residues.
PROBLEM
Problem 2 Application to finite temperature field theory In quantum statistical me-
chanics, some quantities at equilibrium (such as the density of a gas, its pressure, its average
energy) are computed using sums over discrete frequencies (called Matsubara frequencies).
We consider here a simple case where these sums are easily computable using the method of
residues.
For instance, if one considers a gas of free (i.e., noninteracting) electrons (and positrons16),
denote by the chemical potential of the electrons; the positrons have chemical potential
equal to . The temperature being equal to T , put = 1/kb T , where kb is the Boltzmann
constant.
The free charge density, in momentum space, is given [55, 60] by
X
4(il + )
def
,
N ( p) =
2
2 2
(i
l + ) E p
l
15
Srinivasa Ramanujan (18871920) was probably the most romantic character in the mathematical pantheon. An Indian mathematician of genius, but lacking in formal instruction, he
was discovered by G. H. Hardy, to whom he had mailed page after page covered with formulas
each more incredible than the last. The reader may read Ramanujan, an Indian Mathematician
by Hardy, in the book [46]. One will also find there the famous anecdote of the number
1729.
16
In relativistic quantum mechanics, one cannot have a description of matter which is purely
electronic, for instance; positrons necessarily occur, as was discovered by Dirac [49].
Solutions of exercises
with
l =
(2l + 1)
129
def
E p = (m 2 c 4 + p 2 c 2 )1/2 .
and
1. Put this sum in the form of a sum of residues of a well-chosen function f (z). Make a
drawing of the complex plane with the corresponding poles.
2. Does the chosen function admit any other poles?
3. By considering the decay of the function f (z) at infinity, show that the sum of the
residues vanishes.
4. Show then that we have
N ( p) = tanh
E p
2
+ tanh
E p +
2
1
e (E p ) + 1
N Fpos ( p) =
and
1
.
e (E p +) + 1
SOLUTIONS
Solution of exercise 4.1 on page 90. Put f = P + i Q .
We start by remarking that property (ii) implies all the others. Let us show the equivalence
of (ii) and (iii). Suppose, for example, that P = Cnt . Then, at any point,
P
P
=
= 0,
x
y
which, by the Cauchy equations, implies that
Q
Q
=
=0
y
x
and hence that Q = Cnt . One deduces that (ii) and (iii) are equivalent. Moreover, (ii) and (iii),
together, imply property (i). Thus we have shown the equivalence of (i), (ii), and (iii).
Let us show that (iv) implies (iii) this will prove also that (iv) implies (i) and (ii). Assume
then property (iv). We can also assume that f is not the zero function (which is an obvious
case). Then f has no zero at all, and so the sum P 2 + Q 2 = Cnt is never zero. We have
0=
| f |2
P
Q
= 2P
+ 2Q
x
x
x
Cauchy
= 2P
Q
Q
+ 2Q
y
x
and
| f |2
P
Q
Q
Q
= 2P
+ 2Q
= 2P
+ 2Q
.
y
y
y
x
y
The two preceeding equations can therefore be written in the form
Q
P
Q / x
= 0.
P Q
Q / y
0=
Complex Analysis I
130
There only remains to show that (v) implies (ii). Assume that both f and f are holomorphic;
then the Cauchy equations allow us to write
( f H ) =
Q
P
=
x
y
( f H ) =
and
P
( Q )
=
,
x
y
Solution of exercise 4.6. Disregarding the trivial case where f = 0, write then f = f 3 / f 2
at any point where f does not vanish. But since f 2 is holomorphic, its zeros are isolated, and
therefore those of f also, since they are the same. Denote by Z the set of these zeros. Then
f is holomorphic on \ Z and bounded in the neighborhood of each point in Z , which are
therefore artificial singularities. Hence, if we extend by continuity the function g = f 3 / f 2 (by
putting g(z) = 0 for any x Z ), the resulting function g is holomorphic by Theorem 4.56 on
page 108, and moreover g = f by construction.
2
Solution
of exercise 4.8. The
function z 7 (1/z ) being holomorphic on the open set
p
i
1
1 i 3 1
dz
=
= +p .
2
z
z
3
3
3
T
Solution of exercise 4.9. Show that the integrals on the vertical segments tend to 0 when
[R ]. Show moreover that
ZR
Z
2
2
e (x+i a/2) dx = e a
R
e x cos a x dx
p a2
e .
Solution of exercise 4.10. The Laurent series expansion of the sine function coincides with
its Taylor series expansion and is given by
z3
z5
+
;
hence
3!
5!
and the residue is the coefficient of 1/z, namely 1.
sin z = z
sin z
1
z
z3
= +
,
2
z
z
3!
5!
(1 + z) e 2i z
z 2 + 2z + 2
and integrate by closing the contour from above (with justification!). The only residue
inside
R
the contour is the one in z = 1 + i. The desired integral is the imaginary part of f (z) dz =
ie 22i , which is ( cos 2)/e 2 .
For the second integral, close the contour from above, using Jordans second lemma, with
f (z) =
1
1
=
.
(1 + z 2 )n
(z i)2 (z + i)n
Solutions of exercises
131
Then the desired integral is equal to 2i Res ( f ; i) (or, if the contour is closed from below, to
2i Res ( f ; i)). But, according to formula (4.10), page 116, since i is a pole of order n,
dn1
1
1
.
Res ( f ; i) =
(n 1)! dz n1 (z + i)n z=i
An easy induction shows that
dn1
1
(n) (n 1) (2n + 2)
(1)n1 (2n 2)!
=
=
,
dz n1 (z + i)n
(z + i)2n1
(z + i)2n1 (n 1)!
Solution of exercise
4.12. The Fresnel integral is not defined in the Lebesgue sense since
2
the modulus of e i x is constant and nonzero, and therefore not integrable. However, the
integral is defined as an improper integral, which means that the integral I (R, R ) admits a
limit as R and R (separately) tend to infinity. Consider now the contour
R
R
def
C (R, R ) =
R
The integral on the upper eighth arc R is, by a change of variable, equal to
Z
Z
d
2
e i z dz =
ei p ,
R
C
where C is the upper quarter-circle with radius R 2 , and this last integral does tend to 0 by
Jordans second lemma. The second eighth of the circle is treated in the same manner. Since
the integral on C (R, R ) vanishes (the function is holomorphic), the Fresnel integral is equal
to an integral on the line segment with angle /4:
Z R
Z R
i x2
i (e i /4 x)2 i /4
lim
e
dx
e
e
dx
= 0,
R,R +
hence
e i x dx = e i /4
1+ip
2
.
e x dx = p
2
Solution of exercise 4.13. Denote by F () the integral under consideration. It can be put
in the form
F () = e
e (x+i ) dx.
Denote by the line with imaginary part , hence parallel to the real axis. Then, using
2
Cauchys theorem for the function z 7 e z , which is holomorphic on C, show as in
exercise 4.9, that
Z
Z
2
e z dz =
which gives F () = e .
e x dx = 1,
Complex Analysis I
132
Solution
of exercise 4.16. Consider the contour (R) made from the line segments [0, R],
0, R e i /3 and the part of the circle C (0 ; R) with argument varying from 0 and /3. Let
g be the function g : z 7 z 6 + 1. Then the function f = 1/g has only one pole inside this
contour, at the point = exp(i/6). Check then (using the first Jordan lemma and a few
algberaic computations) that
Z
Z
+ dx
.
f (z) dz 1 e i /3
R+
x6 + 1
(R)
0
z
z
= lim
z g(z) g( )
g(z)
1
e 5/6
=
=
.
g ( )
6
Z
Thus we get
because g( ) = 6 1 = 0
i e 5/6
dx
=
= .
+1
3 1 e i /3
3
x6
log z
,
1 + z3
where the logarithm function is defined on C \ R , for instance. Integrate on the contour
going from to R (for > 0), followed by an arc with angle 2/3, then by a line segment
toward the origin, but finished by another arc of the circle with radius . Show that the
integrals on the circles tend to 0 when R tends to infinity and to 0. The only residue inside
the contour is from the pole at z = e i /3 and is equal to 9 e i /6 . Compute the integral on
the second side in terms of the desired integral and in terms of
Z +
2
1
dx = p
1 + x3
3 3
0
(which can be computed using the partial fraction expansion of 1/(1 + x 3 ), for instance).
One finds then
Z +
log x
2
dx = 2 .
3
1
+
x
27
0
Solution of exercise 4.18. Show that the function is integrable using a Taylor expansion at
1 and 1. Integrate on a contour which consists of the line segment from R to R, avoiding
the poles at 1 and 1 by half-circles above the real axis with small radius , and close the
contour by a half-circle of radius R. Show that the integral on the large circle of
f (z) =
e i z/2
z2 1
tends to 0 and compute the integrals on the small circles (which do not vanish; their sum is
equal to in the limit where [ 0]).
Z +
Hence one finds that
f (x) dx = and the desired integral has value /2.
Solutions of exercises
133
X
n=1
1
.
(n2 ia 2 )
Use then the result given by equation (4.15), page 123, and take the real and imaginary parts,
respectively.
cotan(z) coth(z)
.
z7
This function has poles at n and in, for any n Z. The residue at n 6= 0 is simply coth(n)/n7 ,
and the residue at in 6= 0 is cotan(in)/(in)7 = coth(n)/n7 , which gives us that the integral
of f on the square with center at 0 and edges parallel to the axis at distance R = (n + 21 )
(which does not pass through any pole) is
In = 2i
k=n
X
2 coth k
+ 2i Res ( f ; 0).
k7
k=n
k6=0
Since (by an adaptation to this situation of Jordans second lemma) In tends to 0 when n tends
to infinity, we deduce that
S=
n
X
coth k
k=1
k7
1
= Res ( f ; 0).
4
It suffices now to expand coth z cotan z to order 6 around 0 (by hand for the most
courageous, with a computer for the others):
coth z cotan z =
1
73 2
197 6
z
z + O(z 10 )
2
z
45
14 175
which indicates that the residue of f at 0 is 197 /14 175, and the formula stated follows.
Complex Analysis I
134
R2 R2 ,
(x, y) 7 f (x, y),
fx fx
x
y
,
mat d f(x0 , y0 ) =
f
y fy
x
y
f x (x0 + h, y0 + k)
f y (x0 + h, y0 + k)
fx fx
!
f x (x0 , y0 )
x
y h
+
+ o(h, k)
f
f y (x0 , y0 )
y fy k
x
y
fx
fx
!
h+
k
f x (x0 , y0 )
x
y
+ o(h, k).
+
f
f y (x0 , y0 )
y h + f y k
x
y
!
The map which associates d f(x0 , y0 ) to (x0 , y0 ) is called the differential of f and is denoted d f .
It is customary to write dx : (h, k) 7 h and dy : (h, k) 7 k, and consequently the
differential d f can also be written, in the complex notation f = f x + i f y , as
df =
f
f
dx +
dy.
x
y
()
(Notice that we began with a vector function ( f x , f y ) and ended with a scalar function
f x + i f y ; therefore the matrix of d f is a simple line 1 2 and no more a 2 2 square
matrix. It is now the matrix of a linear form.)
If f is differentiable, then its partial derivatives are well-defined. The converse is not true,
but still, it the partial derivatives of f are defined and continuous on an open set R2 ,
then f is indeed differentiable, and its differential d f satisfies the equation ().
Chapter
Complex Analysis II
5.1
Complex logarithm; multivalued functions
5.1.a
Complex Analysis II
136
2
Fig. 5.1 It is not possible to integrate the function z 7 1/z on the complex plane, since
the result depends on the chosen path. Thus, the integrals on the solid contour
1 and the dashed contour 2 differ by 2i.
C \ R is simply connected.
L(z) =
1
d ,
(z)
(5.2)
I.e., a closed path can be contracted continuously to a point, see page 575.
137
Example 5.6 Let (z) be the complex logarithm defined by the cut R+ . The reader will easily
check that, if z = e i with ]0, 2[ this time and R+ , then (z) = log + i.
5.1.b
The logarithm is not the only function that poses problems when one tries to
extend it from the (half) real line to the complex plane. The square root2
function is another.
Indeed, if z = e i , one may be tempted to define the square root of z by
p def p i/2
z = e . But a problem arises: the argument is only defined up to
2, and so /2pis only defined up to a multiple of ; thus, there is ambiguity
on the sign of z.
It is therefore necessary to specify which determination of the argument
will be used to define the square root function. To this end, we introduce
again a cut and choose a continuous determination of the argument (which
always exists on a simply connected open set).
In the following, each time the complex square root function is mentioned,
it will be necessary to indicate, first which is the cut, and second, which is the
determination of the argument considered.
Example 5.7 If the principal determination of the argument = Arg z ], [ is chosen,
p def p
one can defined z 7 z = e i /2 and the restriction of this square root function to the
real axis is the usual real square root.
By chosing iR as the cut, one can define a square root function on the whole real axis.
Complex Analysis II
138
Georg Friedrich Bernhard Riemann (18261866), German mathematician of genius, brought extraordinary results in almost all
fields: differential geometry (riemannian manifolds, riemannian
geometry), complex analysis (Riemann surfaces, Riemanns mapping theorem, the Riemann zeta function in relation to number
theory, the nontrivial zeros of which are famously conjectured
to lie on the line of real part 21 ), integration (Riemann integral),
Fourier analysis (Riemann-Lebesgue theorem), differential calculus (Green-Riemann theorem), series,... He also contributed to
numerous physical problems (heat and light propagation, magnetism, hydrodynamics, acoustics, etc.).
cutting...
(The third figure is what is called a helicoid in geometry. Here, only the
topology of the surface concerns us. The helicoid can be found in spiral
staircases. Close to the axis of the staircase, the steps are very steep; this is why
masons use an axis with nonzero radius.)
This surface is called the Riemann surface of the logarithm function.
Each of the points on this surface can be mapped to a point on the complex
plane, simply by projecting vertically. Two points of the helicoid precisely
vertical to each other are therefore often identified to the same complex number.
They are distinguished only by being on a different floor. Note that walking
on the helicoid and turning once around the axis (i.e., the origin), means that
the floor changes, as in a parking garage.
PROPOSITION 5.8 H is simply connected.
Harmonic functions
139
gluing
When the square-root function has a given value on one of the sheets, it
has the opposite value on the equivalent point of the other sheet. However, it
should be noticed that this surface is not simply connected.
5.2
Harmonic functions
5.2.a
Definitions
2u 2u
+
=0
x 2 y2
at any point of .
Complex Analysis II
140
We will see, in this section, that a number of theorems proved for holomorphic functions remain true for harmonic functions, notably the mean value
theorem, and the maximum principle. This will have important consequences
in physics, where there is an abundance of harmonic functions.
First, what is the link between holomorphic functions and harmonic functions?
If f H () is a holomorphic function, it can be expressed in the form
f = u + iv, where u and v are real-valued functions; it is then easy to show
that u and v are harmonic. Indeed, f being C , the functions u and v are
also C , so
u =
u u Cauchy v v Schwarz
2u 2u
+
=
+
=
= 0
x 2 y2
x x y y
x y y x
from the Cauchy-Riemann equations on the one hand, and Schwarzs theorem
concerning mixed derivatives of functions of class C 2 on the other hand. The
argument for v is similar. To summarize:
THEOREM 5.10 The real and the imaginary part of a holomorphic function are
harmonic.
2 f
f =
i
+i
f =4
f = 4 f .
+ 2 =
2
x
y
x
y
z z
x
y
DEFINITION 5.11 A complex-valued function f defined on an open subset
5.2.b
i.e.,
f = 0.
Properties
Proof. If u and v are harmonic, then u + iv is certainly also harmonic since the
operators and are C-linear.
Conversely, if f is harmonic, it suffices to take the real part of the relation f = 0
to obtain u = 0, and similarly for the imaginary part.
Harmonic functions
141
THEOREM 5.13 Any real-valued harmonic funcion is locally the real part (resp. the
The astute reader will undoubtedly have already guessed what prevents
the extension of this theorem to any open subset : if has holes, the
function v may change value when following its extensions around the hole.
The same phenomenon will be seen in the section about analytic continuation,
and it has already been visible in Cauchys theorem. On the other hand, if u
is harmonic on a simply connected open subset, then u is indeed the real part
of a function holomorphic on the whole of .
COROLLARY 5.15 If u : C is harmonic, then it is of class C .
Proof. It is enough to show that for any z , there is some open ball B containing
z such that u restricted to B is C . But on B there is a holomorphic function f such
that u = Re( f ) on B; in particular, f is C on B, hence so is its real part in the sense
of R2 -differentiability.
let u be harmonic on . For all z0 and for all r R such that B(z0 ; r) ,
we have
Z
1
u(z + r e i ) d,
u(z0 ) =
2 0
or, in real notation (on R2 ),
Z
1
u(x0 , y0 ) =
u(x0 + r cos , y0 + r sin ) d.
2
THEOREM 5.17 (Maximum principle) Let be a domain in C, u : R
142
Complex Analysis II
y
y
or
u
v
x = y
for any z .
v
u
= .
y
x
There is a classical method: integrating the second equation, we write
Zx
u
v(x, 0) =
(x , 0) dx ,
y
0
then
v(x, y) = v(x, 0) +
3
y
0
u
(x, y ) dy
x
A question for physicists: why is it that a potato chip has a hyperbolic geometry? Another
question: why is it also true of the leaves of certain plants, and more spectacularly, for many
varieties of iris flowers?
Harmonic functions
143
f (z) + f (z)
2
with z = x + i y
namely,
f (x + i y) + f (x + i y)
()
2
We are considering the function u(x, y) as a function of two real variables.
Let us extend it to a function of two complex variables e
u (z, z ). For instance,
if u(x, y) = 3x + 5x y, we define e
u (z, z ) = 3z + 5z z . Now put
z z
def
g(z) = 2e
.
()
u ,
2 2i
Then, using formulas () and (), we get
z z 2 n z
z
z
z o
g(z) = 2e
u ,
=
f
+i
+f
i
2 2i
2
2
2i
2
2i
u(x, y) =
= f (z) + f (0),
(5.3)
so that f (z) is, up to a purely imaginary constant, g(z) g(0)/2, and hence
f (z) = g(z) u(0, 0) + i Cnt .
To establish formula (5.3), we used the fact that the function z 7 f (z) can
also be expressed as a certain function fe of the variable z , and more precisely
as f (z ). For instance, if f (z) = 3z + iz 2 , then f (z) = 3z iz 2 = f (z ).
Now, isnt all this a little bit fishy? We have considered in the course of
this reasoning, that the functions (u, f ,...) behaved as operators performing
certain operations on their variables (raise it to a power, multiply by a constant, take the cosine...), and that they could just as well perform them on a
complex variable if they could do so on a real one, or on z if they knew how
to operate on z. This is not at all the usual way to see a function! However, the
expansions of f and u in power series give a rigorous justification of all those
manipulations. They also give the limit of the method: it is necessary that the
functions have power series expansions valid on the whole set considered.
Thus, this technique works if is an open ball (or a connected and simply
connected open subset, by analytic continuation), but not on a domain which
is not simply connected, an annulus, for example.4
4
Complex Analysis II
144
e
u (z, z ) = 3z 2 3z 2 + 5zz + 2 and 2e
u (z/2, z/2i) = 3z 2 + 5z 2 /2i + 4 = f (z) + f (0), which gives
in particular that 2Re f (0) = 4, hence f (0) = 2+iCnt . Hence we have f (z) = 3z 2 +5z 2 /2i+2
(up to a purely imaginary constant) and one checks that the real part of f is indeed equal to
u.
Example 5.19 Consider the function u(x, y) = e x sin y. It is easy to see that this function
is harmonic. Then we have g(z) = 2e z/2 sin(z/2i) = 2ie z/2 sinh(z/2). Since g(0) = 0, we
have f (z) = g(z) + i Cnt , hence
h
z i
.
e x sin y = Re 2ie z/2 sinh
2
5.3
Analytic continuation
Assume a function f is known on an open subset by a power series
expansion that converges on . We would like to know it is possible to extend
the definition of f to a larger set.
Take a convenient example: we define the function f on the open ball
= B(0 ; 1) by
X
f (x) =
zn.
(5.4)
n=0
We know that this power series is indeed convergent on , but divergent for
|z| > 1. However, if we start from the point z0 = 12 + 2i3 , for instance, we can
find a new power series which converges to f on a certain neighborhood
of z0 :
X
f (z) =
an (z z0 )n
on a neighborhood of z0 .
(5.5)
n=0
Analytic continuation
z0
145
Fig. 5.3 Continuation along a path with successive disks, and second continuation
along a path .
146
Complex Analysis II
the chosen
For instance, in the case of the function f defined by
P path takes.
n+1 z n /n, which is equal to z 7 log(1 + z) on = B (0 ; 1),
f (z) =
(1)
n=1
there exists a singularity at the point z = 1. Two continuations, defined by
paths running above or under the point z = 1, will differ by 2i.
On the other hand, the following uniqueness result holds:
THEOREM 5.20 Let f be aSfunction holomorphic on a domain 0 . Let 1 , . . . , n
n
5.4
Singularities at infinity
It is possible to compactify the complex plane by adding a single point at
infinity.
def
1
Res ( f ; ) = Res 2 F (z) ; 0 .
z
5
One can understand why a plane with all directions to infinity identified is a sphere by
taking a round tablecloth and grasping in one hand the whole circumference (the points of
which are infinitely far from the center for a physicist). One obtains a pouch, which is akin
to a sphere.
The open sets in C are the usual open sets in C and the complements of compact subsets
in C. The topology obtained is the usual topology for the sphere, in which the point is like
any other point.
Singularities at infinity
147
N
z
z
Fig. 5.4 The Riemann sphere, obtained by stereographic projection. If z is a point in
the complex plane C, the line (N z) meets the sphere centered at 0 with radius 1
in a single point z , the image of z on the Riemann sphere. The equator of
the sphere is the image of the circle centered at the origin with radius 1. The
point N corresponds to the point at infinity (and to going to infinity in all
directions).
The entire functions which have a pole at infinity are (only) the polynomials.
Remark 5.23 The definition of the residue at infinity is justified by the fact that it is not really
the function f which matters for the residue but rather the differential form f (z) dz (which
is the natural object occuring in the theory of path integration; see Chapter 17). By putting
w = 1/z, it follows that
1
1
1
f (z) dz = 2 f
dw = 2 F (w) dw.
w
w
w
where the sum ranges over the singularities, including possibly the point at infinity.
Proof. It suffices to treat the case where the point at infinity is in the interior of K .
Assume that all the finite singularities (in C) are contained in the ball B(0 ; R). The
set {|z| > R} is then a neighborhood of , which we may assume is contained
in K
(increasing R if need be). We then define K = K \ z C ; |z| > R , which has a
boundary given by
K = C (0 ; R)
(with positive orientation). Applying the residue theorem to K , we find
Z
Z
X
f (z) dz +
f (z) dz = 2i
Res ( f ; zk ).
|z|=R
finite
singularities
Complex Analysis II
148
Thus we obtain
Z
f (z) dz =
|z|=R
|w|=1/R
1
1
f
dw = 2i Res ( f ; ).
w2
w
(Note that the change of variable w = 1/z transforms a positively oriented contour to
a negatively oriented one.)
The stated formula follows.
Example 5.25 The function f : z 7 1/z has a pole at infinity, with residue equal to 1.
If we let = C (5 ; 1), a contour which does not contain any singularity in C, we have
Z
f (z) dz = 0.
5.5
The saddle point method
The saddle point method is a computational recipe developed by Debye7
to compute integrals of the type
Z
def
I () = e f (z) dz.
With respect to the closed set K which is the outside of on the picture. In fact, the
notions of the parts inside and outside a curve do not make sense on the Riemann sphere.
Which side is the outside of the equator on Earth? The northern hemisphere or the southern
hemisphere?
7
Petrus Josephus Wilhelmus Debye (18841966), a Dutch physicist, born in Maastricht, studied the specific heat of solids (Debye model), ionic solutions (Debye potential, see problem 5
p. 325, Debye-Hckel method), among other things. He received the Nobel prize in chemistry
in 1936.
8
There exist generalizations of this method for functions of vector variables, even for fields,
which are used in quantum field theory.
5.5.a
149
(5.6)
Complex Analysis II
150
Then we can write
Denoting
d2 f
1
2
f (z) = f (z ) + (z z )
+ o(z z )2 .
2
dz 2 z=z
d2 f
= e i
and
z z = r e i ,
dz 2 z=z
we have, as a first approximation,
Re f (z) f (z ) = 12 r 2 cos(2 + ) + o(r 2 ),
Im f (z) f (z ) = 12 r 2 sin(2 + ) + o(r 2 ).
(5.7)
Hence we see that, at the saddle point z , the angle between the path
and the real axis should be such that sin(2 + ) = 0, which leaves two
possible directions, orthogonal to each other (solid lines in Figure 5.5). The
twodirections characterized
by cos(2 + ) = 0 are those where the real part
of f (z) f (z ) is constant. Between those lines, the real part is either
positive or negative, depending on the sign of cos(2 + ), which shows that
we are indeed dealing with a saddle point, or a mountain pass (see Figure 5.5).
The two orthogonal directions where the imaginary part of f (z) f (z )
vanishes (to second order) are precisely the directions where the real part of
f (z) f (z ) varies most quickly, either increasing or decreasing.
Consequently, we must choose, at the saddle point, the direction characterized by sin(2 + ) = 0 and such that, along this direction, Re( f ) has a
local minimum at z , and not a local maximum.
We have seen that, because of the properties of holomorphic functions, it
is usually possible to find a path along which the oscillations of e f have disappeared and where, moreover, f will have a local minimum as pronounced
as possible. This justifies the other common name of the method: the method
of steepest descent.
There only remains to evaluate the integral along the path . We introduce a new variable t parameterizing the tangent line to at z , so that we
can write
t2
f (z) = f (z ) + + o(z z )2 ,
2
and, from (5.7), we have
d2 f
t = (z z )
dz 2
(the sign of the square root being chosen in such a way that its real part is
positive). We now rewrite formula (5.6) in terms of this parameter t, which
gives, noticing that dz = dz(t)/dt dt,
Z +
2 dz(t)
I e f (z )
e t
dt.
dt
151
Re( f )
d
a
c
b
Fig. 5.5 The two solid curves are perpendicular and locally determined by the condition
sin(2 + ) = 0. These are not only the curves such that Im( f ) is constant,
but also those where Re( f ) varies most rapidly. The two dotted lines separate
the regions a, b , c and d . In regions a and c , Re f (z) > Re f (z ), whereas in
regions b and d , Re f (z) < Re f (z ). The chosen path therefore goes through
regions a and c .
Since the function d f /dz varies slowly in general, compared to the gaussian
term, we can consider that dz/dt = (d2 f /dz 2 )1/2 is constant, and evaluate
the remaining gaussian integral, which gives
I e f (z )
2
2
d f /dz 2 z=z
)1/2
(5.8)
This formula can be rigorously justified, case by case, by computing the nongaussian contributions to the integral and comparing them with the main
term we just computed. The computation must be done by expanding f to
higher order. One is then led to compute gaussian moments (which is easy) to
derive an asymptotic expansion of I to higher order. A condition to ensure
that the higher orders do not interfere is that the ratio between
p
1/ p
2
1/2
d f
d f
and
dx p x=x
dx p x=x
tends to 0 when tends to infinity. This condition is necessary, but not
sufficient.
Complex Analysis II
152
5.5.b
e 2 M (xx ) dx.
I e
We recognize here a gaussian integral, so the result is
I
e m
2
= e m
M
d2 f
2
.
2
/dx x=x
The first factor is simply the value of the exponential function at its maximum, and the second factor
p expresses the characteristic width of this maximum, proportional to 1/ M . This result should be compared to the result
of (5.8).
9
Exercises
153
Remark 5.27 Sometimes, an integral of the same type as before arises, but with an imaginary
exponential, that is, of the type
Z
I = e i f (x) dx,
R
which is zero if k 6= 0.
We must therefore look for extremums of f to find important contributions, which are
called for obvious reasons stationary phase points; assuming there is a unique such point x , a
computation very similar to the previous one for the real saddle point method gives the result
1/2
2i
.
I e i f (x )
d2 f /dx 2 | x=x
This case is also called the stationary phase method.
Remark 5.28 The saddle point method is not just a trick for physicists. It is very often used
in pure mathematics, analytic number theory in particular; for instance, it occurs in the proof
of the irrationality of one of the numbers (5), (7), . . . , (21) by Tanguy Rivoal [73]! (Recall
that
X 1
(s) =
ns
n1
EXERCISES
Exercise 5.1 Define
P (x, y) =
x(1 + x) + y 2
.
(1 + x)2 + y 2
Exercise 5.3 Show that, in two dimensions, a cavity in a conductor, devoid of charge, has
potential identically zero.
154
Complex Analysis II
Exercise 5.4 (Stirling formula) Using the Euler function defined on the complex halfplane Re(z) > 0 by the integral representation
Z +
def
(z) =
e x x z1 dx,
0
and using the fact that (n + 1) = n! for all n N, show, by the complex saddle point method,
that
p
(n + 1) = n! nn e n 2n.
This formula, established by Abraham de Moivre, is called the Stirling formula.
SOLUTIONS
Solution of exercise 5.3. Denote by the domain defined by the cavity; is simply con-
nected. The electrosatic potential satisfies (z) = 0 at any point z . This potential is also
a harmonic function, hence achieves its minimum and its maximum on the boundary .
Consequently, we have (z) = 0 for any z .
Solution of exercise 5.4. The first thing to do is to find a suitable contour. To start with,
write
( + 1) =
e x log x dx,
then look for the points where d f /dz = 0, with f (z) = z log z. We find z = , and
d2 f /dz 2 |z = 1/, which is a positive real number. The two lines for which, locally, the
imaginary part of f is constant are the real axis and the line with equation z = + i y. It is
the first which corresponds to a local minimum of f at z = ; therefore the real axis is the
most suitable contour.
Hence we evaluate the integral according to the formula (5.8), which gives
p
( + 1) e log 2
[ ].
One can check explicitly that the non-gaussian contributions (those given by the terms of order
larger than 2 in the expansion of f ) are inverse powers of . Hence
p
( + 1) = e 2 1 + O (1/) ,
and in particular, for integral values of ,
p
(n + 1) = n! nn e n 2n.
Notice that this approximation is very quickly quite precise; for n = 5, the exact value is
5! = 120 and the Stirling formula gives approximately 118, which is an error of about 2%.
Even more impressive, the relative error from applying the Stirling formula for n = 100 is
about 0, 08%. . . (note that this error is nevertheless very large in absolute value: it is roughly of
size 10155 , whereas 100! is about 10158 ).
Remark 5.29 If one seeks the asymptotic behavior of (z) for large z not necessarily real, the
computation is more or less the same, except that the path is not the real axis any more.
However, the result of the computation is identical (the result is valid uniformly in any region
where the argument ], ] of z satisfies + < < for some > 0).
Chapter
Conformal maps
6.1
Conformal maps
6.1.a
Preliminaries
Consider a change of variable f : (x, y) 7 (u, v) = u(x, y), v(x, y) in the
plane R2 , identified with C. This change of variable really only deserves the
name if f is locally bijective (i.e., one-to-one); this is the case if the jacobian
of the map is nonzero (then so is the jacobian of the inverse map):
u u
x x
D (u, v) x y
D (x, y) u v
=
6= 0
and
D (u, v) = y y 6= 0.
D (x, y) v v
x y
u v
THEOREM 6.1 In a complex change of variable
z = x + i y 7 w = f (z) = u + iv,
Conformal maps
156
u
x
2
v
u
+i
and hence, by the Cauchy-Riemann
x
x
v
x
2
u v
v u
= J f (z).
x y x y
Proof that the definitions are equivalent. We will denote in general w = f (z).
Consider, in the complex plane, two line segments 1 and 2 contained inside the set
where f is defined, and intersecting at a point z0 in . Denote by 1 and 2 their
images by f .
We want to show that if the angle between 1 and 2 is equal to , then the same
holds for their images, which means that the angle between the tangent lines to 1 and
2 at w0 = f (z0 ) is also equal to .
Consider a point z 1 close to z0 . Its image w = f (z) satisfies
lim
zz0
and hence
w w0
= f (z0 ),
z z0
zz0
which shows that the angle between the curve 1 and the real axis is equal to the angle
between the original segment 1 and the real axis, plus the angle = Arg f (z0 ) (which
is well defined because f (z) 6= 0).
Similarly, the angle between the image curve 2 and the real axis is equal to that
between the segment 2 and the real axis, plus the same .
Therefore, the angle between the two image curves is the same as that between the
two line segments, namely, .
Another way to see this is as follows: the tangent vectors of the curves are transformed according to the rule v = d f z0 v. But the differential of f (when f is seen as
a map from R2 to R2 ) is of the form
P
P
x
y cos sin
,
(6.1)
d f z0 =
Q Q = f (z0 ) sin
cos
x
y
where is the argument of f (z0 ). This is the matrix of a rotation composed with a
homothety, that is, a similitude.
Conformal maps
157
1
1
2
6.1.b
there exists a bijection f : U V such that f and its inverse f 1 are both
continuous. Such a map is called a homeomorphism of U into V .
Remark 6.7 In fact, in this case the continuity of f 1 is an unnecessary condition, because it
turns out to be implied by the continuity of f , the fact that f is bijective, and that U and V
are open. However, this is quite a delicate result.
Example 6.8 The complex plane C is homeomorphic to the (interior of the) unit disc, through
the homeomorphism
f : z 7
See also Exercise 6.2 on page 174.
z
.
1 + |z|
DEFINITION 6.9 Two open sets U and V are conformally equivalent if there
In general, two open sets which are homeomorphic are not conformally
equivalent. For instance, two annuli r1 < |z| < r2 and r1 < |z| < r2 are
conformally equivalent if and only if r2 /r1 = r2 /r1 .
So the Riemann mapping theorem is a very strong result, since any connected and simply connected open set, except C, can be transformed, via a conformal map, into the unit disc.
To illustrate, consider in Figure 6.1 on the next page some open subset
U of the complex plane. Riemanns theorem guarantees the existence of a
conformal map w = f (z) such that B(0 ; 1) = f (U ). The boundary U
of U is mapped to the unit disc C (0 ; 1). (This seems natural, but to make
Conformal maps
158
z
U
w = f (z)
D
Q
0
Fig. 6.1 The open subset U being homeomorphic to the unit disc, it is conformally
equivalent to it. There exists therefore a conformal map w = f (z) such that the
image of U by f is D. One can ask that f (P ) = 0 and f ( Q ) = Q .
z z0
.
z z0
The upper half-plane is then transformed bijectively into the interior B(0 ; 1)
of the unit circle. The point z0 is mapped to 0. The real points x tending
to either + or are sent to the same limit 1. As z runs over the real
1
Nicola Joukovski (18471921), a Russian physicist, studied hydrodynamics and aerodynamics. He discovered a conformal map z 7 21 (z + a 2 /z), which transforms the circles going
through a and a into a profile similar to an airplane wing.
Conformal maps
159
axis in increasing order, w goes around the unit circle in the counterclockwise
direction, starting and ending at the point w = 1.
z
w = f (z)
P (z0 )
B
P (0)
C
C
F
z
w = f (z)
1
F E D
1
A B
a
1
w = f (z) =
z+
,
2
z
where a is an arbitrary real number. This transformation is not conformal at
the points 1 and 1 (the derivative vanishes), which explains that the right
angles between the half-circle and the real axis are transformed into flat
angles (equal to ).
z
w = f (z)
A B
0
C
Conformal maps
160
Half-circle
The unit upper half-circle is mapped into the upper half-plane by putting
1+z 2
w = f (z) =
.
1z
The point 1 (C in the figure) is mapped to 0 while the point 1 (A) is mapped
to infinity on the real axis (either + or , depending on whether one
approaches A on the real axis or on the boundary of the half-circle).
z
w = f (z)
1
1
C
1
1
w = f (z) =
z e + e .
2
z
Other examples
The angular sector {r e i ; 0 < r < 1, 0 < < 2 } (a quarter-disc) is mapped
into the upper half-disc by putting w = z 2 .
Similarly, the angular sector of angle /k (with k R, k 12 ) is mapped
into the upper half-plane by putting w = z k .
The upper half-disc in mapped to a semi-infinite strip (infinite to the left)
by putting w = log z with the logarithm defined by a cut in the lower halfplane.
D
z
w = f (z)
1
B
0
A
The same transformation maps the upper half-plane into an infinite strip.
Conformal maps
161
The German mathematician Hermann Schwarz (18431921) succeeded his teacher Weierstrass at the University of Berlin in 1892.
He worked on conformal mappings, potential theory (proving
the existence of a solution to the Dirichlet problem in 1870), and
partial differential equations.
dw
= a(z x1 )1 /1 (z x2 )2 /1 (z xn )n /1
(6.2)
dz
transforms the real axis into a polygonal curve with successive angles (1 , . . . , n ) and
transforms the upper half-plane into the domain bounded by this curve (see Figure 6.2
on the following page).
Formula (6.2) provides only a local expression of the map. By integrating
along a complex path, we obtain the equivalent formula
Z Y
n
i
w = f (z) = a
(z x i ) 1 dz + w0 ,
(z) i=1
Conformal maps
162
w = f (z)
A
A
C D
x1 x2
x3 x4
x5
D
2
B
Fig. 6.2 The Schwarz-Christoffel transform maps the open upper half-plane into the
domain bounded by the polygonal curve with successive angles 1 , . . . , n .
the polygonal line is closed and we obtain a closed polygon. An infinite polygon (when
the condition does not hold) is a limiting case of closed polygons.
If the problem to solve is to find a conformal mapping transforming the upper halfplane into a polygonal domain (either closed or open), it is possible to choose arbitrarily
three of the real numbers (x1 , . . . , xn ). The others are then imposed by the various
quantities in the picture.
It is often convenient to move one of the xi (for instance, x1 or xn ) to infinity, which
simplifies the computations.
Example 6.13 We describe how to use the Schwarz-Christoffel transform to map conformally
w = f (z)
1
b
a
dw
1
1
= a(z + 1) 2 (z 1) 2 = p 2
=p
,
dz
z 1
1 z2
where a is a constant to be determined and = ia. We therefore put
w = f (z) = arcsin(z) + B,
and to have f (1) = b and f (1) = b , we must take B = 0 and = 2b/. Hence the
solution is
w
2b
w=
arcsin(z),
i.e.,
z = sin
.
2b
163
Elwin Christoffel (18291900), born in Montjoie Aachen (present-day Monshau) in Germany, from a family of clothes merchants, studied in Berlin from 1850, where he had in particular
Dirichlet as a teacher (see page 171). He defended his doctoral on
the flow of electricity in homogeneous bodies, then taught there
before ending his career in the new University of Strasbourg.
Shy, irritable, asocial, Christoffel was nevertheless a brilliant
mathematician who studied Riemannian geometry, introducing
the Christoffel symbols, as well as conformal mappings.
6.2
Applications to potential theory
(This section may be omitted for a first reading; it is preferable to study it with some
knowledge of Green functions, which we will present later.)
Conformal transformations are useful to solve a number of problems of
the type Poisson equation with boundary condition, for instance, the problem of the diffusion of heat in a solid, in a stationary regime, with the
temperature fixed on the boundary of the solid; or the problem of the electrostatic field created by a charge distribution in the presence of conductors.
The main idea is to transform the boundary of the problem, which may
be complicated, into a simpler boundary, by a conformal map. In particular,
the Dirichlet problem for a half-plane or for a circle (see page 170) can be
solved, and this makes it possible to solve it, at least in principle, for more
complicated boundaries.
THEOREM 6.14 Let R2 and R2 be two domains of R2 C and f a
bijective conformal map from into . Let : R be a real-valued function,
2
2
def
with laplacian denoted by (x, y) = x2 + y2 . Moreover, denote (u, v) = f (x, y)
and consider the image of by the change of variable f :
def
(u, v) = f 1 (u, v)
def
2
u2
2
v2
the
Proof. This is an elementary but tiresome computation. It suffices to use the chain
Conformal maps
164
rule formulas
x
y
=
+
u
u x
u y
and
x
y
=
+
v
v x
v y
mal transformation.
6.2.a
165
Application to electrostatics
(u,v) G (u, v ; u , v ) = (u u , v v ),
(C )
G (u, v ; u , v ) = 0
(u, v) f ( ), (u , v ) f (),
where the image of the boundary is the real axis f ( ) = R. In other
words, we are looking for the electrostatic response of a simpler system (a
half-plane) in the presence of an electric charge.
To solve this last problem, one can use the method of virtual images (which
is justified, for instance, by means of the Fourier or Laplace transform): the
potential created by a charge at w in the neighborhood of a conductor in
the lower half-plane is the sum of the free potentials3 created, first, by the
particle at w and, second, by another particle, with the opposite charge,
placed symmetrically of the first with respect to the real axis (namely, at w ).
2
The physicists proof being that the quantity 1
(x , y ) G (x, y ; x , y ) dx dy is the
0
potential created at (x, y) by the elementary charge (x , y ) dx dy .
3
I.e., of the Green functions tending to 0 at infinity.
Conformal maps
166
z
w = f (z)
b
(a)
1
(b)
Fig. 6.3 (a) The domain left free by the conductor is the set of points z = x + i y
with y > 0 and b < x < b . The Dirichlet conditions are that the potential
is constant, for instance, zero, on . (b) The image f () of the domain is
the upper half-plane.
However, we will see in the chapter concerning distributions that the Coulomb
potential4 in two dimensions is simply
q
(w) =
log w w .
20
G is thus equal to
G (w ;
w )
w w
q
.
=
log
2
w w
The Green function of the initial problem is obtained by putting, in the last
formula, w = sin(z/2b), which gives
z0
z
sin
sin
q
2b
2b ,
G (z ; z0 ) =
log
z
z
2
sin
sin 0
2b
2b
or, by expanding in real form sin(a + ib) = sin a cosh b + i cos a sinh b,
G (x, y ; x0 , y0 ) =
q
4
x
y
x
y0 2
x
y
x
y0 2
sin
cosh
sin 0 cosh
+ cos
sinh
cos 0 sinh
2b
2b
2b
2b
2b
2b
2b
2b
log x
y
x0
y0 2
x
y
x0
y0 2 .
+
cos
sin
cosh
sin
cosh
sinh
+
cos
sinh
2b
2b
2b
2b
2b
2b
2b
2b
Which, as the reader will notice, is not properly speaking something very
intuitive. However, it is remarkable how this computation has been done so
easily and without difficulty. This is the strength of conformal mappings.
4
The Coulomb potential in any dimension is a potential satisfying the Laplace equation
= /0 ; Theorem 7.54 on page 208 shows that the three-dimensional Coulomb potential
is given by r 7 1/40 r , whereas in two dimensions Theorem 7.55 on page 208 shows that it
is r 7 (log r )/20 .
167
vx =
and
vy =
x
y
This potential is harmonic since div v = 0 is equivalent to = 0. Solving
the flow problem amounts to finding with the right boundary conditions
(normal velocity equal to zero on all impenetrable surfaces). This is called the
Neumann problem.5
Remark 6.18 The velocity is obtained from a global potential if the domain under
DEFINITION 6.19 The complex velocity of the fluid is the complex quantity
given by
def
V = v x + iv y .
Since is harmonic, we know (Theorem 5.13) that there exists a harmonic
def
function : R which is conjugate to , that is, such that = + i is
holomorphic on . Then the complex derivative of is the conjugate of the
complex velocity:
(z) =
=
=
+i
z
x
x
x
Cauchy-Riemann
i
= v x iv y = V (z).
x
y
Carl Gottfried Neumann (18321925), German mathematician and physicist, was the son
of the physicist Franz Ernst Neumann, who introduced the notion of potential in electromagnetism.
Conformal maps
168
w = f (z)
Fig. 6.4 On the left: the wind over a wall. The white region is D. On the right: the
wind over the ground, without a wall. The white region is D .
a 2
z 7 z 2
z 7 z + a 2
z 7
a 2
We can now deduce from this the flow in D with the boundary conditions6:
lim v(z) = (v , 0).
Re(z)
In the general case, one can proceed as follows. Search for the flow in the form
(z) = v z + g(z),
where g is a holomorphic function which describes a perturbation from the homogeneous flow
0 (z) = v z, such that lim|z| g (z) = 0 so that the limit conditions are respected.
169
After the conformal mapping w = f (z), the flow still obeys the same
boundary conditions, since f (z) 1 as |z| tends to infinity in the upper
half-plane. The desired flow in the upper half-plane without a wall is easy to
find: it a a uniform flow (the wind over the plain...), which can be described
by the complex potential
(w) = v w.
Coming back to the variable w:
p
p
(z) = (w) =
z 2 + a 2 = v z 2 + a 2 .
Two special points appear: the velocity of the flow is infinite on the top of
the wall, at z = ia (this is why the wind whistles when going over a pointed
obstacle: locally supersonic velocities are reached); it is zero at the foot of
the wall (around z = 0). These two points are of course the two points
where the transformation is no longer conformal7; it is therefore possible that
singularities appear in this manner, which did not appear in the flow on the
upper half-plane.
6.2.c
From the previous example we can understand the principle of the lightning
rod. If we denote by ( r) the electrostatic potential, then close to a pointed
object (like the wall above, or an acute angle in the definition of the domain),
we can expect that will diverge. Identifying C and R2 , it is the gradient
of that will diverge, namely, the electric field. But we know that in the air
an electric field of intensity larger than some critical value, the breakdown
threshold, will cause a spontaneous electric discharge.8 Saint Elmos fire, well
known to mariners, has the same origin.
But this is not yet the most remarkable thing. Consider a problem of
random walk of a particle in an open domain C, assuming that, when the
particle meets the boundary of , it will stop. (It is possible to make numerical
simulations of this problem.) What will happen after many particles have
performed this process? One may think that the particles will accumulate on
the boundary , which will therefore grow, more or less regularly as time
goes. In fact, this is not what happens. If the boundary has cusps,
they will attract more particles and, hence, will grow faster than the rest of
the boundary. If, at a given time and due to the effect of chance, a cusp is
created (by accumulation of some particles) on a smooth part of , then
this cusp will, in turn, attract more particles. In a word: the fluctuations of
the boundary become unstable.
7
Indeed, at the foot of the wall, we have f (z) = 0, whereas around the top, f (z) diverges.
Of course, the rough explanation given only concerns lightning rods in a two-dimensional
world. In three dimensions, another argument must be found, because the one given uses
specific properties of C and not of R2 .
8
170
Conformal maps
p(z, t) = 0
t
at any point of . We will therefore have an equation of the same type if we
perform a conformal transformation (independent of time) on the domain .
Once more, the cusps will attract more particles and we will observe a growth
phenomenon.9 (Here it is the flow of particles that diverges.)
6.3
Dirichlet problem and Poisson kernel
The Dirichlet problem is the following: the values of a harmonic function
on the boundary of a domain in the plane are assumed to be known,
and we want to find the values of the function at any point in .
For instance, one may be interested in the case of a metallic disc,10 which is
a heat conductor, in a stationary state, for which we impose a temperature T0
on one half-circle, and a temperature T1 on the other half-circle. The question
is now to determine the temperature at any point inside the disc.
More precisely, if is a domain (hence, a connected open set) of the
complex plane C, if denotes its boundary and if u0 : C is a
continuous function, we are looking for a u : C of C 2 class, such that
u
= u0 ,
u = 0 in .
171
Peter Gustav Lejeune-Dirichlet (18051859), German mathematician, was professor successively in Breslau, Berlin, and Gttingen, where he succeeded his teacher Gauss (see page 557). He
was also the brother-in-law of Jacobi and a friend of Fourier. He
studied trigonometric series and gave a condition for the convergence of a Fourier series at a given point (see Theorem 9.46 on
page 268). His other works concern the theory of functions in
analysis and number theory.
To prove this theorem, we start by stating a technical lemma which tells us,
in substance, anything that can be proved for the central point of the open
disc B(0 ; 1) can also be proved for an arbitrary point of the disc.
def
za
. Then a
1az
B = B(0 ; |a|1 ))
i) a is bijective on B ;
ii) a
B(0 ; 1)
H (B ), where
and satisfies
iii) a1 = a ;
iv) a (a) = 0.
In other words, a transforms the disc into itself, but moves the point a to the center in
a holomorphic manner.
Proof of Theorem 6.21. We use the mean value property for harmonic functions:
Z 2
Z
u a ( )
1
1
u(a) = u a (0) =
u a (e i ) d =
d .
2 0
2i B
and
u(a) =
1
2i
u( )
1
a ( ) d =
a ( )
2i
u(e i )
1 |a|2
|a e i |
d.
Conformal maps
172
defined by
P (r, ) =
1 r2
1 1 r2
1
=
.
2 1 + r 2 2r cos
2 r e i 2
Remark 6.24 In the case of holomorphic functions, the Cauchy formula not only reproduces
holomorphic functions, but also creates them: if f is a given function on the circle C =
B(0 ; 1), assumed to be merely continuous, then
Z
1
f ( )
def
F (z) =
d
2i C z
Z
1 2
1 |z|2
f (e i )
if z B(0 ; 1),
d
z e i 2
u(z) = 2 0
f (z)
if z B(0 ; 1).
173
1
0,5
0
0,5
1
0,5
1
0,5
0
0,5
0,5
1 1
Fig. 6.5 The temperature is equal to 1 on the lower half-circle, and to +1 on the upper
half-circle; it is a harmonic function on the disc. Here it is represented on the
vertical axis, and is deduced from the preceding theorem.
chapter 8 on distributions, and in particular section 8.2.b on page 231, the reader is
invited to try to formalize the proof of this theorem.
The Dirichlet problem has many applications in physics. Indeed, there are
many equations of harmonic type: electromagnetism, hydrodynamics, heat
theory, and so on. The example already mentioned find the temperature at
any point in a disc where the temperature is imposed to be T0 on the upper
half-circle and T1 on the lower half-circle is illustrated in Figure 6.5. After
integration, one finds (in polar coordinates)
2
2r sin
T (r, ) = arctan
.
1 r2
In Exercises 6.7 and 6.8, two other examples of solutions of the Dirichlet
problem are given, for a half-plane and for a strip, respectively.
Conformal maps
174
EXERCISES
Exercise 6.1 Let U : R2 R2 be a twice differentiable function. Show that if f : C C is
Exercise 6.2 Show that any two annuli are homeomorphic (one can use, for instance, a
purely radial function).
Complex flows
Exercise 6.3 Determine the flow of the wind in a two-dimensional domain delimited by two
half-lines meeting at an angle (for instance, one of the half-lines represents the horizontal
ground, and the other a mountainside with uniform slope). Use the following boundary
conditions: the wind must have velocity with modulus v and flow parallel to the boundaries,
infinitely far away from the intersection of the lines.
Assume that the flow is irrotational, incompressible, and stationary. In particular, what is
the velocity of the wind at the point where the slope changes? Consider the various possibilities
depending on the value of [0, ].
Exercise 6.4 Let k be a real number. Interpret the complex flows given by
1 (z) = k log(z a)
and
Exercise 6.5 We wish to solve the problem of the flow of a Newtonian fluid in three dimen-
sions around a cylinder with infinite length and circular section. If only the solutions which
are invariant by translation along the axis are sought, the problem is reduced to the study of
the flow of a fluid in two dimensions around a disc.
We assume that the velocity of the fluid, far from the obstacle, is uniform. We look for
the solution as a complex potential written in the form
= V0 z + G (z),
where the term V0 z is the solution unperturbed by the disc and where G (z) is a perturbation,
such that
G (z) 0.
|z|+
Check that the solution obtained is indeed compatible with the boundary conditions (the
velocity of the fluid is tangent to the disc), and find the stagnation points of the fluid.
Exercises
175
Dirichlet problem
Exercise 6.6 We wish to know the temperature of a sheet in the shape of an infinite half-
T0 if x < 1,
T (x) = T1 if |x| < 1,
T2 if x > 1.
D
1
1
2
1
We want to find u of
to D and such that
C2
u = 0
in D,
u| D = u0 given.
Find, using the Cauchy formula, an integral formula for u(z) when Imz > 0.
Hint: Show, before concluding, that if z = x + i y with y > 0 and if f is holomorphic
on D, then
Z
y f ()
1 +
d.
f (z) =
( x)2 + y 2
Exercise 6.8 Dirichlet problem for a strip: we are seeking the solution of the Dirichlet
u = 0
in B,
u(x) = u0 (x) x R,
u(x + i) = u1 (x) x R.
def
176
Conformal maps
SOLUTIONS
Solution of exercise 6.2. Writing as usual z = r e i and denoting by r1 < r < r2 the first
annulus and by r1 < r < r2 the second one, consider the map
r r1
r 7 2
(r r1 ) + r1 ,
r2 r1
7 .
This application is indeed an homeomorphism from the first annulus onto the second (the
transformation of the radius is affine). One can check that this is not a holomorphic map,
except if the two annuli are homothetic to each other, in which case the formula given is the
corresponding homothety.
Solution of exercise 6.3. We can transform the domain D into the upper half-plane by
means of the following conformal map:
z 7 w = z / .
In the domain D = {z C ; Im(z) > 0}, the flow of the wind is given by a complex potential
(w) = vw; one checks indeed that the velocity field that derives from this is uniform and
parallel to the ground (since v R). The function being holomorphic, its real part Re,
which is the real velocity potential, is harmonic: (Re) = 0. Coming back to the initial
domain, we put
(z) = w(z) = z / = v z / ,
which therefore satisfies (Re = 0 (because of Corollary 6.15) and has the required boundary conditions. The velocity of he wind is given by the gradient of Re.
At z = 0, the mapping z 7 w is not conformal. If < , the derivative
/1
dw
=
z
dz
tends to 0 (and so does the velocity of the wind), which shows that at the bottom of a Vshaped valley one is protected from the transversal wind (but not from the longitudinal wind,
as people from Lyons know, who are victim of the North-South wind, whereas the East-West
wind is almost absent). On the other hand, if > , this derivative tends to infinity, which
shows that on the crest of a mountain, the transverse wind is important.
Solution of exercise 6.4. 1 corresponds to the flow around a source or a sink (depending
on the sign of k), and 2 corresponds to a whirlpool, as one can see by picturing the velocity
field.
Solution of exercise 6.5. Rotating the entire system if necessary, we may assume that the
velocity at infinity is parallel to the real axis. Therefore we take V0 R and look for a flow
symmetric with respect to the real axis.
We know a conformal transformation that maps the half-plane minus a half-disc into the
upper half-plane; it is given by z = z + a 2 /z, where a is the radius of the disc that is removed.
The circle C (0 ; a) is mapped to the segment [2a, 2a].
There only remains to find a free flow in the new domain: it is given trivially by
(w) = V0 w.
The solution of the problem is therefore
a2
(z) = V0 z +
.
z
Solutions of exercises
177
Solution of exercise 6.6. Notice that the function which is proposed is the imaginary part
of
z 7 L(z + 1) + L(z 1) + ,
where L is the holomorphic logarithm defined on C with a cut along the lower half-plane. We
look for the constants , , and by noticing that on the boundary of the sheet, T = T0 for
1 = 2 = , T = T1 for 1 = 0 and 2 = , and T = T2 for 1 = 2 = 0. We obtain then
T T2
T0 T1
1 + 1
2 + T2
T T1
y
T T2
y
= 0
arctan
+ 1
arctan
+ T2 .
x +1
x 1
T =
Chapter
Distributions I
7.1
Physical approach
7.1.a
Distributions I
180
r
R
()
Can we reconcile the two expressions () and () for the point-like charge
and the density of charge respectively? In other words, can we write both
equations in a uniform way?
To express the point-like charge with the continuous viewpoint, we
would need a function r 0 describing a point-like unit charge located at r 0 .
This function would thus be zero on R3 \ { r 0 }; moreover, when integrated
over a volume V , we would obtain
ZZZ
1 if r 0 V ,
3
r0 ( r ) d r =
0 if r 0
/V.
V
Generalizing, for any continuous function f : R3 R, we would need to
have
ZZZ
f ( r ) r 0 ( r ) d3 r = f ( r 0 ).
(7.1)
V
But, according to the theory of integration, since r 0 is zero almost everywhere, the integral (7.1) is necessarily zero also. A Dirac function cannot
exist.
One may, following Dirac, use a sequence of functions with constant integral equal to 1, positive, and that concentrate around 0, as for instance:
n if |x| 1/2n,
n (x) =
0 if |x| > 1/2n,
or its equivalent in three dimensions
3n3 /4
Dn ( r ) =
0
if k rk 1/2n,
if k rk > 1/2n,
Every time the temptation arises to use the -function, the sequence of
functions (n )nN (resp. (Dn )nN ) can be used instead, and at the very end of
the computation one must take the limit [n ]. Thus, r 0 is replaced in (7.1)
by Dn ( r r 0 ), and the formula is written
ZZZ
ZZZ
3
f ( r ) r 0 ( r ) d r = lim
f ( r ) Dn ( r r 0 ) d3 r = f ( r 0 ).
V
Physical approach
181
The reader can check that this formula is valid for any function f continuous
at r 0 (see for instance a fairly complete discussion of this point of view in the
book by Cohen-Tannoudji et al. [20, appendix II]). However, this procedure
is fairly heavy. This is what motivated Laurent Schwartz (and, independently,
on the other side of the Iron Curtain, Isral Gelfand [38]) to create the
theory of distributions (resp. of generalized functions), which gives a rigorous meaning to the -function and justifies the preceding technique (see
Theorem 8.18 on page 232).
In the same manner we will want to describe not only the distribution of
point-like charges, but also a charge supported on a surface or a curve, so that,
if we denote by ( r ) the function giving the distribution of charges, we can
compute the total charge contained within the volume V with the formula
ZZZ
Q (V ) =
( r) d3 r
V
Z
ZZ
ZZZ
X
2
=
qi +
d +
d s+
vol. ( r) d3 r ,
charges i in
the volume V
L V
S V
where vol. is the volumic density of charge, the surface distribution on the
surface S , the linear distribution on the curve L , and q i the point-like
charges.
Newtons law stipulates that, during the movement, the force f exerted on
the ball is such that f = m
v ; it is therefore proportional to the derivative
of the function graphed above. Now if we wish to model in a simple way a
hard collision (where the squashing of the ball is not taken into account,
for instance, as in a game of ptanque), the graph of the speed becomes
182
Distributions I
v
t
and the force exerted should still be proportional to the derivative of this
function; in a word, f should be zero for any t 6= 0 and satisfy
Z
1 +
f (t) dt = v(+) v() = 2v0 .
m
Once again, neither the integral nor the previous derivative can be treated in
the usual sense of functions.
7.2
Definitions and examples of distributions
We will now present a mathematical tool which, applied to a continuous function f , gives its value at 0, a relation which will be written
( f ) = f (0)
or
, f = f (0).
is the vector space of functions from Rn into C, which are of class C and
have bounded support (i.e., they vanish outside some ball; in the case n = 1,
they vanish outside a bounded interval).
A test function is any function D(Rn ).
1
183
particular because they are not analytic on Rn can you see why?)
As an exercise, the reader will show that for a, b R with a < b , the function
1
exp
if x ]a, b [ ,
(x) =
(x a)(x b )
0
if x ], a] [b , +[
is a test function on R.
defined on D(Rn ). The distributions form a vector space called the topological dual of D(Rn ), also called the space of distributions and denoted D (R
Rn)
or D .
For a distribution T D and a test function D, the value of T at
will usually be denoted not T () but instead T , . Thus, for any T D
and any D, T , is a complex number.
Remark 7.4 This definition, simple in appearance, deserves some comments.
Why so many constraints in the definition of the functions of D? The reason is simple:
the topological dual of D gets bigger, or richer, if D is small. The restriction of the
space test D to a very restricted subset of functions produces a space of distribution which is
very large.2 (This is therefore the opposite phenomenon from what might have been expected from
the experience in finite dimensions.3)
The C regularity is necessary but the condition of bounded support can be relaxed, and
there is a space of functions larger than D which is still small enough that most interesting
distributions belong to its dual. It is the Schwartz space S , which will be defined on page 289.
A distribution T is, by definition, a continuous linear functional on D; this means that for
any sequence (n )nN of test functions that converges to a test function D, the sequence
of complex numbers (T , n )nN must converge to the value T , . To make this precise we
must first specify precisely what is meant by a sequence of test functions (n )nN converging
to .
2
For those still unconvinced:
we may define a linear functional on D, which to any test
R
function associates (t) dt. However, if we had taken as D a space which is too large, for
instance, the space L1loc of locally integrable functions (see Definition
7.8 on the next page), then
R
this functional would not be well-defined since the integral f (t) dt may have been divergent,
R
for instance, when f (t) 1. The functional 7 (t) dt is thus well-defined on the vector
space D of test functions, but not on the space of locally integrable functions. Hence it belongs
to D but not to (L1loc ) .
3
Recall that in finite dimensions the dual space is of the same dimension as the starting
space. Thus, the bigger the vector space, the bigger its dual.
Distributions I
184
n ( p)
for any p N.
in D,
if (n )nN converges in D to ,
7.2.a
Regular distributions
185
Such a distribution is called the regular distribution associated to the locally integrable function f .
R
Proof. It suffices to show that the map 7 f (x) (x) dx is indeed linear (which
is obvious) and continuous.
Let (n )nN be a sequence of test functions converging to D. There exists a
bounded closed ball B such that B contains the supports
of all the n . Since the
R
function f is locally integrable, we can define M = B | f | and this M is a finite real
number.
R
R
We then have f (n ) M |n | M V kn k , where V is the
finite volume of the ball B. But kn k tends to 0 by the definition 7.5 and,
therefore, f , n tends to 0. By linearity, f , n tends to f , .
Remark 7.11 If the functions f and g are equal almost everywhere, the distributions f and g
will be equal, that is, f , = g, for any D. The converse is also true: if two regular
distributions are equal as functionals, then the corresponding functions are equal almost everywhere. The reader is invited to construct a proof of this fact using the following property:
any integrable function with compact support can be approximated in the mean by functions
in D. So the notion of distribution is in fact an extension of the notion of classes of locally
integrable functions equal almost everywhere.
Example 7.12 Denote by 1 the constant function 1 : x 7 1 (defined on R). The regular
1 : D C,
7 1, =
(x) dx.
, = (0)
D.
For a R, we define similarly the Dirac distribution centered at a, denoted a , by its action on any test function:
def
a , = (a)
D.
Frequently (this will be justified page 189), a will instead be denoted (x a).
The preceding definition is easily generalized to the case of many dimensions:
Distributions I
186
def
r 0 , = ( r 0 )
D.
As before, it will often be denoted ( r r 0 ).
X, =
that is,
X=
+
X
n=
or
+
X
(n),
n=
def
X(x) =
+
X
n=
(x n).
P
Notice that the distribution X is well defined, because the sum n (n)
is in fact finite ( being, by definition, with bounded support).
The continuity is left as an (easy) exercise for the reader.
The distribution is represented graphically by a vertical arrow, as here:
4
This bears repeating again at the risk of being boring: a distribution is not a function, but
a linear form on a space of functions.
5
This is a Cyrillic alphabet letter, corresponding to the sound sh.
187
a
x
a x
a 12 a + 12
The distribution
The X distribution
(x)
dx +
x
(x)
dx
x
0+ ].
(7.2)
Distributions I
188
(x)
(x)
1
def
pv , = pv
dx = lim+
dx.
x
x
0
|x|> x
7.3
Elementary properties. Operations
7.3.a
Operations on distributions
def
fa (x) = f (x a).
In other words, the graph (in Rn+1 ) of the function f has been translated by
the vector a:
189
fa
a
The regular distribution associated to this function, which is also denoted
fa , therefore satisfies
Z
Z
fa , = f (x a) (x) dx = f ( y) ( y + a) dy = f , a .
T (x a), is defined by
def
Ta , = T , a
def
T (x a), (x) = T (x), (x + a) .
The reader
can easily check that the number T , a exists and that the map
Remark 7.22 As in Remark 7.15, the notation T (x a) is simply synonymous with Ta ; similarly, with an abuse of language, (x a) sometimes designates (for physicists more than for
mathematicians) the function t 7 (t a) and not the value of at x a.
Example 7.23 Let a R. The translate by a of the Cauchy principal value is
pv
1
: 7 lim+
0
x a
(x)
dx +
x a
a+
(x)
dx .
x a
190
Distributions I
,
T , = T ,
def
def
T (x), (x) = T (x), (x)
for any D(Rn ).
f (a x), (x) = f (a x) (x) dx = f (x)
a |a|
x E
1 D
=
f (x),
.
|a|
a
This leads to the following generalization:
def 1 D
T (a x), (x) =
T (x),
|a|
a
1 D
T (a x), ( x) = n T ( x),
on Rn .
|a|
a
There is, unfortunately, no general definition of the product of two distributions.6 Already, if f and g are two locally integrable functions (hence
defining regular distributions), their product is not necessarily locally integrable.7 However, if is a function of class C , then for any D, the
product is a test function (still being C and with bounded support); if
f is locally integrable, then f is also locally integrable. From the point of
view of distributions, we can write
Z
Z
f , =
(x) f (x) (x) dx = f (x) (x) (x) dx = f , .
6
In particular, it is easy to notice that taking the square of the Dirac distribution raises
arduous questions. However, it turns out to be possible to define a product in a larger space
than the space of distributions, the space of generalized distributions of Colombeau [22, 23].
p
7
Simply take f (x) = g(x) = 1/ x.
191
and in particular
X
nZ
x (x) = 0.
(n) (x n).
, = , = (0) (0) = (0) , = (0) , .
Since any test function has bounded support, the last formula follows by simple linearity.
Remark 7.29 If is merely continuous, the product T cannot be defined for an arbitrary
distribution T , but the product can in fact still be defined. Similarly, if is of C 1 class
(in a neighborhood of 0), the product can be defined (see the next section).
The passage to the second integral is done by integrating by parts; the boundary terms vanish because has bounded support (this is one reason for the
very restrictive choice of the space of test functions). This justifies the
following generalization:
Distributions I
192
def
T , = T ,
(7.3)
Example 7.31 The function x 7 log |x| is locally integrable and therefore defines a regular
distribution. However, its derivative (in the usual sense of functions) x 7 1/x is not locally
integrable; therefore it does not define a (regular) distribution.
However, if the derivation is performed directly in the sense of distributions, log |x| is a
distribution (it is shown in Exercise 8.4 on page 241 that it is the Cauchy principal value pv 1x ).
def
, = T,
xi
xi
for D(Rn ) and 1 i n. Partial derivatives of higher order are obtained
by successive applications of these rules: for instance,
3
T
3
3
, = (1) T , 2
.
x12 x2
x1 x2
In particular, on R3 , we have
T
, = T ,
,
, = T ,
,
x
x
y
y
T
, = T ,
.
z
z
Similarly, the laplacian being a differential operator of order 2, we define:
DEFINITION 7.32 The laplacian of a distribution on Rn is given by
def
T , = T ,
k N
f (k) D (R).
193
7.4
Dirac and its derivatives
7.4.a
fined by
0
H (x) = 1/2
if x < 0,
if x = 0,
if x > 0.
8
Mathematicians had been aware for less than a century of the fact that a continuous function could be very far from differentiable (since there exist, for instance, functions continuous
on all of R which are nowhere differentiable, as shown by Peano, Weierstrass, and others).
9
Enamored of rigor from a recent date only and, therefore, as inflexible as any new convert.
The reader interested in the battle between Oliver Heaviside and the rigorists of Cambridge
can read with profit the article [48].
Distributions I
194
+
H , = H, =
(x) dx = 0 = (0),
0
THEOREM 7.36 The derivative of the Heaviside distribution is the Dirac distribution
H =
(3) def
, = (0).
The common notation (3) ( r ) = (x) ( y) (z) will be explained in Section 7.6.b. Most authors use instead of (3) since, in general, there is no
possible confusion. Thus a point-like charge in electrostatics is represented by
a distribution ( r ) = q ( r a). Note that, when it acts on a space variable,
the three-dimensional Dirac distribution has the dimension of the inverse of a
volume:
1
[] =
,
[L]3
which implies that ( r ) has the dimension Q /L3 of a volum density of
charge.
Surface Dirac distribution
To describe a surface carrying a uniform charge, we use the surface Dirac
distribution:
195
1
,
[L]
R3
def 1 D
1
= 3
d2 r
S (a r), = 3 S ,
a
a
|a|
|a|
S
ZZ
1
1
= 32
( x) d2 x =
S , .
|a|
|a|
S
(definition)
196
Distributions I
+/k
k/2
k/2
/k
Fig. 7.1 The distribution can be seen as two Dirac peaks with opposite signs infinitely
close to each other that is, a dipole.
1
.
[L]2
def
, = (0)
for any test function D.
Note that if is a test function, then
, = (0) = lim
k
2
k2
k0
k
1
k
k
= lim
x
x +
, ,
k0 k
2
2
x+
x
.
(x) = lim
k0 k
2
2
This formula is interpreted as follows. The distribution represents a positive
charge 1/k and a negative charge 1/k, situated at a distance equal to k, in
the limit where [k 0] (see Figure 7.1). Hence the distribution represents
a dipole, aligned on the horizontal axis (O x) with dipole moment 1, hence
oriented toward negative values of x.
197
Remark 7.40 This expression requires a notion of convergence in the space D , which will be
def
x , = x (0) = (0),
x
y , = y (0) = (0),
y
z , = z (0) = (0).
z
We denote by the vector ( x , y , z ).
Lets use these results to compute the electrostatic potential created by an elecrostatic
dipole. The potential created at a point s by a regular distribution of charges
is
ZZZ
1
( r )
V ( s) =
d3 r ,
40
k
r
sk
3
R
which can be written
1
1
.
V ( s) =
( r ),
40
k r sk
Now we extend this relation to other types of distributions, such as point-like,
curvilinear, surface, or dipolar distributions.
PROPOSITION 7.42 The potential created at a point s by the distribution of
charges ( r ) is
1
V ( s) =
40
( r ),
1
k r sk
(7.4)
Remark 7.43 The function s 7 1/ k r sk is not of C class, nor with bounded support.
As far as the support is concerned, we can work around the problem by imposing a
sufficiently fast decay at infinity of the distribution of charges. For the problem of the
singularity of 1/ k r sk at s = r, we note that the function still remains integrable since the
volume element is r 2 sin dr d d; this is enough for the expression (7.4) to make sense in
the case of most distributions of physical interest.
Distributions I
198
where we have denoted by P the dipole moment of the source. Then the
potential is given by
1
V ( s) = T ( r),
k r sk
1
= P x x P y y Pz z ,
k r sk
1
+ Py
+ Pz
= Px
,
x k r sk
y k r sk
z k r sk r=0
namely,
V ( s) = P x
sx
3
+ Py
sy
3
+ Pz
sz
3
Ps
k sk
k sk
k sk
k sk3
which is consistent with the result expected [50, p. 138].
THEOREM 7.44 In the usual space R3 , an electrostatic dipole with dipole moment
(a x) =
1
(x)
a |a|
The goal of this section is to give a meaning to the distribution f (x) ,
where f is a sufficiently regular function. This amounts to a change of
variable in a distribution and is therefore a generalization of the notions of
translation, dilation, and transposition.
Consider first the special case of a regular distribution and a change of
variable f : R R which is differentiable and bijective.
Let g be a locally integrable function. The function g f being still locally
integrable, it defines a distribution and we have
Z
Z
dy
g f , = g f (x) (x) dx = g( y) f 1 ( y) 1 ,
f f ( y)
since, in the substitution y = f (x), the absolute value of the jacobian is equal
to
dy 1
= f (x) = f f ( y) .
dx
By analogy, for a distribution T , we are led to define the distribution T f
as follows:
T
def
1
, f
T f , =
.
f f 1
199
Thus, the distribution f (x) satisfies
f 1 (0)
1
, f
f (x) , =
= 1 ,
f f 1
f f (0)
1
( y y0 ),
f (x) =
f ( y0 )
x | f (x)=0
To summarize:
THEOREM 7.45 Let f be a differentiable function which has only isolated zeros.
f ( y)
yZ ( f )
A classical application of this result to special relativity is proposed in Exercise 8.9 on page 242.
7.4.e
Distributions I
200
Moreover, the densities of charge and current form, from the point of
view of special relativity, a four-vector. In order to distinguish them, we will
denote three-dimensional vectors in the form j and four-vectors in the form
j = (, j /c ).
During a change of Galilean reference frame, characterized by a velocity
v (or, in nondimensional form, = v/c ), these quantities are transformed
according to the rule
( , j /c ) = (
) (, j/c ),
i.e., j = (
) j with
def
j = (, j /c ), (7.5)
where (
) is the matrix characterizing a Lorentz transformation. To simplify,
we consider the case of a Lorentz transformation along the axis (O x), characterized by the velocity = e x . Denote by x = (c t, x) = (x 0 , x 2 , x 2 , x 3 )
the coordinates in the original reference frame R and by x = (c t , x ) =
(x 0 , x 1 , x 2 , x 3 ) the coordinates in the reference frame R with velocity
relative to R. The Lorentz transformation of the coordinates is then given by
ct
ct
x
1
x
def
=
,
with = p
,
y
1
1 2
z
z
1
which we will write10 as x = (
) x. Expanding, we obtain
t = t x/c = (t v x/c 2 ),
t = (t + v x /c 2 ),
x = c t + x = (x v t),
x = (x + v t ),
and
y = y,
y= y,
z = z,
z = z.
j x ( x , t) = j x ( x, t) c ( x, t) .
(7.6)
201
Then j(x) = q (3) ( x), 0 . Looking back at equation (7.6), and denoting
by L 1 ( x ) = (x + v t ) the spacial part of the inverse transformation
)1 , we obtain
(
( x , t), 1c j ( x , t ) = q (3) L 1 ( x) ,
(3) L 1 ( x ) .
Notice then that
(3) L 1 ( x ) = (3) (x + v t ), y , z = (x + v t ) ( y ) (z )
= (x + v t ) ( y ) (z )
(change of variable)
= (3) ( x + vt)
j ( x , t) = q v (3) ( x + vt).
So we see that the Dirac distribution has some kind of invariance property
with respect to Lorentz transformations: it does not acquire a factor ,
contrary to what equation (7.6) could suggest. This is a happy fact, since the
total charge in space is given by
ZZZ
( x , t) d3 x = q,
Q =
R3
and not q!
This result generalizes of course to the case of a particle with an arbitrary
motion compatible with special relativity.
7.5
Derivation of a discontinuous function
7.5.a
function H
y
derivation
y
no!
distribution H
202
Distributions I
The same will happen for any function with an isolated discontinuity at a
point: its derivative will exist in the sense of distributions, which differs from
the usual derivative (if it exists) by a Dirac peak with height equal to the jump
of the function.
The following notation will be used:
f is the function being studied, or its associated distribution;
f will be the regular distribution associated to the usual derivative
of f in the sense of functions:
Z +
f (x) (x) dx ;
f , =
f , = f , ;
this is also a distribution.
Example 7.46 With this notation, the derivative in the sense of distributions of the Heaviside
the jump of f at a i : i = f (a +
i ) f (a i ). One can then write f as the sum
of a continuous function g and multiples of Heaviside functions (see Figure 7.2
on the next page):
X (0)
i H (x a i ).
f (x) = g(x) +
i
(7.7)
203
(0)
a
Fig. 7.2 Example of a function with a discontinuity at a with jump equal to (0) .
To lighten the notation, we will henceforth write equation (7.7) more compactly as follows:
f = f + (0) ;
all the discontinuities of f are implicitly taken into account. Similarly, if we
denote by (1) any discontinuity of the derivative of the function f , (2)
any discontinuity of its second derivative, and so on, then we will have the
following result, with similar conventions:
THEOREM 7.48 (Successive derivatives of a piecewise C function) Let f be
m N.
= X.
E = 0,
sgn(x), (x) =
(x) (x) dx.
0
Therefore, the derivative of f in the sense of distributions is f = f = sgn.
The function x 7 sgn(x) has a discontinuity at 0 with jump equal to 2.
Moreover, its usual derivative is zero almost everywhere (it is undefined at 0
and zero everywhere else), which shows that
{ sgn } = 0
and
f = sgn = 2.
We deduce that the next derivatives of f are given by
f (k) = 2 (k2)
for any k 2.
Distributions I
204
7.5.b
f be a function with a discontinuity along the surface S . Denoting by (0) the jump
function S , equal to the value of f just outside minus the value just inside, we have
f
f
+ (0) cos i S .
=
(7.8)
xi
xi
n
i
xi
S
Proof. Assume that f is of class C 1 on R3 \ S and that its first derivatives admit
a limit on each side of S . We now evaluate the action of f / x on an arbitrary test
function D(R3 ):
ZZZ
f
, = f ,
=
f
dx dy dz
x
x
x
ZZ Z
=
f (x, y, z)
dx dy dz
x
for a smooth surface. Fix now y and z and denote by x the real number such that
(x , y , z ) S (generalizing to the case where more than one real number satisfies
this property is straightforward). In addition, put h(x) = f (x, y , z ). Then we have
Z
Z
h
f (x, y , z )
dx = h(x)
dx = h,
=
, ,
x
x
x
x
and we can now use Theorem 7.47, which gives
h
h
, =
+ (0) (x , y , z ) (x x ),
x
x
Z
h
=
dx + (0) (x , y , z ) (x , y , z ),
x
where we have denoted by (0) (x , y , z ) the jump of f across the surface S at the
point (x , y , z ).
205
dx dy
x
d2 s
y
x
To conclude, we write
ZZ
(0) cos x d2 s = (0) cos x S , .
f
f
, = (0) cos x S , +
dx dy dz
x
x
for any D(R3 ). The same argument performed with the partial derivatives with
respect to y or z leads to the stated formula (7.8).
grad f d3 v =
V
ZZ
f n d2 s
Green-Ostrogradski formula.
11
More precisely, we should take a test function equal to 1 on V , but with bounded
support. If V is bounded, such a function always exists.
Distributions I
206
or equivalently
ZZZ
ZZ
div f d3 v =
( f n) d2 s
V
Green-Ostrogradski formula.
(1)
f = (0) n + n S + { f } ,
P
where n is the normal derivative of : n = i xi cos i = n.
(7.9)
Remark 7.52 The action of the distribution n on any test function is therefore given by
.
n , = S ,
n
f
3
(f f )d v =
f
d2 s.
(7.10)
n
V
S
207
(0)
2
d s,
f , =
f d3 v,
n, =
f
n
V
S
ZZZ
ZZ
(1)
f 2
d s,
{ f }, =
f d3 v.
n S , =
V
S n
7.5.d
f = (0) n + n S + { f }
(1)
= (0) n + n S + { f } ,
but { f } = 0
ZZZ
ZZ
1
1
d2 s +
2 ( r) d3 r .
=
n
S
V
Each term in the last expression has a limit as [ 0], respectively, 0 (since
the derivatives of are bounded and the surface of integration has area of
order of magnitude 2 ) and
ZZZ
1
lim
2 ( r) d3 r = 4 (0)
0+
V
by continuity of (the factor 4 arises as the surface area of S divided
by 2 ).
Distributions I
208
ZZZ
1
1
, =
( r) d3 r
r
r
3
R
ZZZ
= lim+
f ( r ) ( r) d3 r = lim+ f , ,
0
R3
since the integrand on the left is an integrable function (because the volume
element is given by r 2 sin dr d d). Putting everything together, we find
that
ZZZ
1
1
, =
( r) d3 r = 4 (0).
r
R3 r
In other words, we have proved the following theorem:
THEOREM 7.54 The laplacian of the radial function f : r 7
= 4 .
r
1
r
1
k rk
is given by
log |r| = 2 .
1
r n2
= (n 2)Sn ,
(2)n/2
.
(n/2)
(7.11)
Convolution
209
7.6
Convolution
7.6.a
DEFINITION 7.57 (Tensor product of functions) Let f and g be two functions defined on R. The direct product of f and g (also called the tensor
product) is the function h : R2 R defined by h(x, y) = f (x) g( y) for any
x, y R. It will be denoted h = f g.
This definition is generalized in the obvious manner to the product of
f : R p C by g : Rn C.
H =
or
12
7.6.b
Lets now see how to generalize the tensor product to distributions. For this,
as usual, we consider the distribution associated to the tensor product of two
locally integrable functions. If f : R p C and g : Rn C are locally
integrable, and if we write h : R p+n C as their tensor product, then for any
test function D(R p+n ) we have
ZZ
h, = f (x) g( y), (x, y) =
f (x) g( y) (x, y) dx dy
Z
Z
D
E
= f (x)
g( y) (x, y) dy dx = f (x), g( y), (x, y) .
Take now a distribution S (x) on R p and a distribution T ( y) on Rn . The
function
def
Distributions I
210
def D
E
S (x) T ( y), (x, y) = S (x), T ( y), (x, y) .
DEFINITION 7.59 (Tensor product of distributions) Let S and T be two distributions. The direct product, or tensor product, of the distributions S
and T is the distribution S (x) T ( y) defined on the space of test functions on
R p Rn by
def D
E
S (x) T ( y), (x, y) = S (x), T ( y), (x, y) .
H ( y)
y
(x)
(x) H ( y)
Convolution
211
(x) H ( y), (x, y) =
(0, y) dy.
0
Caution is that sometimes the constant function is omitted from the notation of a tensor product f 1. For instance, in two dimensions, the distribution denoted (x) is in reality equal to (x) 1( y) and acts therefore by the
formula
Z +
(x), (x, y) = (x) 1( y), (x, y) =
(0, y) dy.
Example 7.63 (Special relativity) In the setting of Section 7.4.e, the density of charge (x) =
since
= q 1 (3) ,
1 (3) (c t, x) = 1(c t) (3) ( x) = (3) ( x).
7.6.c
h(x) =
h= f g
or less rigorously h(x) = f (x) g(x). Note that the convolution of two
functions does not always exist.
Example 7.65 The convolution of the Heaviside function with itself is given by
H H (x) =
H (t) H (x t) dt = x H (x).
Exercise 7.4 Show that the convolution product is commutative, that is, h = f g = g f
whenever one of the two convolutions is defined.
Exercise 7.5 Let a, b R+ be real numbers such that a 6= b . Compute the convolution of
x 7 e |ax| with x 7 e |b x| .
Distributions I
212
(0, x)
Dx
x
(x, 0)
Convolution
213
(x) =
1 if |x| <
0 if |x| >
1
2
1
,
2
x
21
1
2
It will be seen12 that the sequence of functions n(nx) nN converges (in
a sense that will be made precise in Definition 8.12 on page 230) to the Dirac
distribution . Thus, it can be expected that, in the limit [n ], since
the convolution does not change anything (the precision of measurement is
infinite), we have f = f ; thus we expect that will be a unit element for
the convolution product.
To show this precisely, we first have to extend the convolution to the
setting of distributions.
12
Distributions I
214
7.6.e
Convolution of distributions
= f (x) g( y), (x + y) ,
since the jacobian in this change of variable is equal to | J | = 1.
DEFINITION 7.69 (Convolution of distributions) Let S and T be two distri-
S T , = S (x) T ( y), (x + y) .
The convolution of distributions does not always exist! The general conditions for its existence are difficult to write down. However, we may note that,
as in the case of functions, the following result holds:
THEOREM 7.70 The convolution of distributions with support bounded on the left
b
a
Convolution
215
locally integrable and therefore defines a regular distribution, Moreover, the reader can check
that (1 ) = 0, hence (1 ) H = 0. But, on the other hand, ( H ) = , and hence
1 ( H ) = 1, which shows that
(1 ) H 6= 1 ( H ).
7.6.f
Applications
Remark 7.74 The convolution product is the continuous equivalent of the Cauchy
product
P
P of
absolutelyP
convergent series. Recall that the Cauchy product of the series
the series wn such that
n
P
wn =
ak b nk .
an and
b n is
k=0
X
X
X
(a b )n =
an
bn .
n=0
n=0
n=0
Distributions I
216
T = T = T.
for any m N.
Proof
Let T D . Then, for any D, we have
T , = T (x) ( y), (x + y)
D
= T (x), ( y), (x + y) = T (x), (x) = T , x ,
Let a R. Then
D
T a , = T (x), ( y a), (x + y) = T (x), (x + a) = T (x a), (x) ,
which shows that T a = Ta .
T , = T (x), ( y), (x + y) = T (x), (x) = T , .
It follows that T = T = T .
In addition, if T = R S , then T = T = R S = R S .
Similarly, T = T = R S , which proves the following theorem:
THEOREM 7.76 To compute the derivative of a convolution, it suffices to take the
derivative one of the two factors and take the convolution with the other; in other
words, for any R, S D , we have
[R S ] = R S = R S .
when R S exists. In the same manner, in three dimensions, for any S , T D (R3 )
such that S T exists, we have
(S T ) = S T = S T .
7.6.g
V ( r) =
( r ),
=
40
k r rk
40
k rk
217
(if this expression makes sense), where k r k denotes, with a slight abuse of
notation, the function r 7 k rk. It is then easy to compute, in the sense of
distributions, the laplacian of the potential. Indeed, it suffices to compute the
laplacian of one of the two factors in this convolution. Since we know that
(1/r) = 4, we obtain
1
1
1
1
1
V =
(4)
=
=
40
k rk
40
k rk
40
= .
0
V = .
0
7.7
Physical interpretation of convolution operators
We are going to see now that many physical systems, in particular measuring equipment can be represented by convolution operators. Suppose we
have given a physical system which, when excited by an input signal E depending on time, produces an output signal, denoted S = O (E ). We make
the assumptions that the operator O that transforms E into S is
linear;
continuous;
translationy
S (t)
ytranslation
E (t a) S (t a)
Distributions I
218
THEOREM 7.78 If the three conditions stated above hold, then the operator O is a
for any E .
S = O (E ) = E R = R E
hand, we consider an optical system where the source is an exterior object and where the
output is measured on a photo-detector, the variable corresponding to the measurement (the
coordinates, in centimeters, on the photographic plate) and the variable corresponding to the
source (for instance, the angular coordinates of a celestial object) are diffent. One must then
perform a change of variable to be able to write a convolution relation between the input and
output signals.
O () = R = R.
Note that often, in physics, the operator linking S (t) and E (t) is a differential operator.13 One can then write D S = E where D is a distribution
which is a combination of derivatives of the unit .14 The distribution R,
being the output corresponding to a Dirac peak, therefore satisfies
R D (t) = D R (t) = (t).
|{z}
E (t)
doable in optics (use a star), but rather delicate in electricity. It is sometimes simpler to send
a signal close to a Heaviside function. The response to such an excitation, which is called the
step response, is then S = H R. If we differentiate this, we obtain S = H R = R = R,
which shows the following result:
THEOREM 7.81 The impulse response is the derivative of the step response.
13
Consider, for instance, a particle with mass m and position x(t) on the real axis, subject
to a force depending only of time F (t). Take, for example, the excitation E (t) = m1 F (t) and
the response S (t) = x(t). The evolution equation of the system is S (t) = E (t).
14
In the previous example, one can take D = , since S = S .
219
A star
Two stars
A window
15
Distributions I
220
7.8
Discrete convolution
It should be noted that it is quite possible to define a discrete convolution
to model the fuzziness of a digitalized picture. Such a picture is given no
longer by a continuous function, but by a sequence of values n 7 xn with n
[[1, N ]] (in one dimension) or (m, n) 7 xmn with (m, n) [[1, N ]] [[1, M ]]
(in two dimensions). Each of these discrete numerical values defines a pixel.
In the one-dimensional case, we are given values n for n [[1, N ]] (and
we put n = 0 outside of this interval), and the convolution is defined by
y =x
with yn =
+
X
k xnk ,
k=
with ymn =
+
X
+
X
k xmk,n .
k= =
The exercises for this chapter are found at the end of Chapter 8, page 241.
Discrete convolution
221
Laurent Schwartz, born in Paris in 1915, son of the first Jewish surgeon in the
Paris hospitals, grandson of a rabbi, was a student at the cole Normale Suprieure.
During World War II, he went to Toulouse and then to Clermont-Ferrand, before
fleeing to Grenoble in 1943. In 1944, he barely escaped a German raid. After the war,
he taught at the University of Nancy, where the main part of the Bourbaki group
was located. He came back to Paris in 1953 and obtained a position at the cole
Polytechnique in 1959.
He worked in particular in functional analysis (under the influcence of
Dieudonn) and, in 1944, created the theory of distributions. Because of these
tremendously important works, he was awarded the Fields Medal in 1950. He showed
how to use this theory in a physical context.
Finally, one cannot omit, side by side with the mathematician, the political militant, the pacifist, who raised his voice against the crimes of the French state during
the Algerian war and fought for the independence of Algeria [82].
Laurent Schwartz passed away on July 4, 2002.
Chapter
Distributions II
In this chapter, we will first discuss in detail a particular distribution which is very
useful in physics: the Cauchy principal value distribution. Notably, we will derive
the famous formula
1
1
= pv i,
x i
x
which appears in optics, statistical mechanics, and quantum mechanics, as well as
in field theory. We will also treat the topology on the space of distributions and
introduce the notion of convolution algebra, which will lead us to the notion of
Green function. Finally, we will show how to solve in one stroke a differential
equation with the consideration of initial conditions for the solution.
8.1
Cauchy principal value
8.1.a
1
x
Definition
|x| x
224
Distributions II
C + ()
x0
Fig. 8.1 The contour + .
1
(x)
(x)
(x)
def
pv , = pv
dx = lim+
dx +
dx .
0
x
x
x
1
xx0
is defined
C + z x0
225
x0
C ()
Fig. 8.2 The contour .
[ 0+ ].
Since the integral on the left-hand side is equal to 2i times the sum of the
residues located in the upper half-plane, plus the residue at x0 , which is equal
to 2i f (x0 ), the final formula is the same as (8.1).
Distributions II
226
keeping the original contour intact. Suppose the function considered has a
pole on the real axis. Then one can write the symbolic equivalences
and
which means that the pole has been moved in the first case to x0 i and in
the second case to x0 + i, and then the limit [ 0+ ] is taken at the end of
the computations.
To see clearly what this means, consider the equalities
Z
Z
Z
f (z)
f (z)
f (z)
dz = lim+
dz,
dz = lim+
0
0
+ z x0
+ z x0 + i
z x0 + i
where is the undeformed contour. Indeed, the first equality is a consequence
of continuity under the integral sign, and the second equality comes from
Cauchys theorem, which, for any given > 0, justifies deforming + into ,
since no pole obstructs this operation. Thus we get, performing implicitly the
operation lim+ ,
Z
f (z)
dz =
z x0
f (x)
dx = pv
x x0 + i
f (x)
dx i f (x0 ),
x x0
1
1
, f = pv
, f i (x x0 ), f .
x x0 + i
x x0
Symbolically, we write:
1
= pv
x x0 + i
1
x x0
i (x x0 ),
227
Z
F ( )
1
F (z)
d =
0
2i C z
if Im(z) > 0,
if Im(z) < 0.
2i
2
x x
hence
Z +
F (x )
1
F (x) =
pv
dx .
i
x x
This equation concerning F becomes much more interesting by taking successively its real and imaginary parts; the following theorem then follows:
THEOREM 8.2 (Kramers-Kronig relations) Let F : C C be a meromorphic
function on C, holomorphic in the upper half-plane and going to 0 sufficiently fast at
infinity in this upper half-plane. Then we have
Z +
Im F (x )
1
Re F (x) = pv
dx
x
228
and
Distributions II
1
Im F (x) = pv
Re F (x )
dx
x x
These are called dispersion relations and are very useful in optics (see, e.g., the
book by Born and Wolf [14, Chapter 10]) and in statistical physics. Physicists
call these formulas the Kramers-Kronig relations,1 while mathematicians say
that Re(F ) is the Hilbert transform of Im(F ).
Remark 8.3 If F is meromorphic on C and holomorphic in the lower half-plane (all its poles
have positive imaginary parts), then it satisfies relations which are dual to those just proved
(also called Kramers-Kronig relations, which does not simplify matters):
Z +
Z +
Im F (x )
Re F (x )
1
1
dx and Im F (x) = pv
dx .
Re F (x) = pv
x x
x x
We will see, in Chapter 13, that the Fourier transform of causal functions t 7 f (t) (those
that vanish for negative values of the variable t), when it exists, satisfies the Kramers-Kronig
relations of Theorem 8.2.
Remark 8.4 What happens if F is holomorphic on both the upper and the lower half-plane?
Since it is assumed that F has no pole on the real axis, it is then an entire function. The
assumption that the integral on a circle tends to zero as the radius gets large then leads (by the
mean value property) to the vanishing of the function F , which is the only way to reconcile
the previous formulas with those of Theorem 8.2.
Remark 8.5 In electromagnetism, the electric induction D and the electric field E are linked,
for a monochromatic wave, by
D( x, ) = () E( x, ),
where () is the dielectric constant of the material, depending on the pulsation of the
waves. This relation can be rewritten (via a Fourier tranform) in an integral relation between
D( x, t) and E( x, t):
Z +
D( x, t) = E( x, t) +
G () E( x, t ) d,
where G () is the Fourier transform of () 1. In fact, since the electric field is the physical
field and the field D is derived from it, the previous relation must be causal, that is, G () = 0
for any < 0. One of the main consequences is that the function 7 () is analytic in the
lower half-plane of the complex plane (see Chapter 12 on the Laplace transform). From the
Kramers-Kronig relations, interesting information concerning the function 7 () can be
deduced. Thus, if the plasma frequency is defined by p 2 = lim 2 [1 ()], we obtain
2
Im () d.
0
The reader is invited to read, for instance, the book by Jackson [50, Chapter 7.10] for a more
detailed presentation of this sum rule.
p 2 =
Dispersion relations first appeared in physics in the study of the dielectric constant of
materials by R. de L. Kronig in 1926 and, independently, in the theory of scattering of light
by atoms by H. A. Kramers in 1927.
229
sin x
dx = .
x
2
1
x
Proof. Indeed, for any test function D(R), we have by definition of the product
of a distribution (pv 1x ) by a C function:
Z
Z +
1
1
x (x)
x pv , = pv , x = lim+
dx =
(x) dx = 1, .
0
x
x
x
|x|>
x T (x) = 1,
are the distributions given by T (x) = pv 1x + , with C.
Proof. If T satisfies x T (x) = 1, then S = T pv(1/x) satisfies x S (x) = 0 and
thus, according to Theorem 7.28 on page 191, S is a multiple of .
linear combinations of , , . . . , n1 .
T (x) =
i 1
X mX
iI k=0
i,k (k) (x x i ),
Distributions II
230
Example 8.10 The equation sin x T (x) = 0 has for solutions the distributions of the form
nZ
n (x n).
The equation (cos x 1) T (x) = 0 has for solutions the distributions of the form
P
n (x n) + n (x n) .
nZ
x2
1
+ (x a) + (x + a),
a2
1
1
1
1
pv 2
=
pv
pv
.
x a2
2a
x +a
x a
8.2
Topology in D
8.2.a
Weak convergence in D
Tk T .
Exercise 8.2 For n N, let Tn be the regular distribution associated to the locally integrable
function x 7 sin(nx). Show that the sequence (Tn )nN converges weakly to 0 in D .
in
D (R)
have
(m)
Tk
T (m) .
Topology in D
231
there exists a real number A > 0 such that, for any x R and any
k N, we have
|x| A = fk (x) 0 ,
and
cv.u.
fk (x) 0
1
on a < |x| < .
a
Example 8.16 An example of a Dirac sequence is the sequence ( fn )nN defined by fn (x) =
n(nx) for any x R and any n N. (Check this on a drawing.)
Example 8.17 The sequence of functions (gn )nN defined by
n
def
2 2
gn (x) = p e n x
for x R
The functions fn are all differentiable, but their limit f is a sawtooth function, which is not
differentiable at the points of the form n with n Z.
Distributions II
232
sin2 (nx)
for n = 1 to 5.
n2 x 2
THEOREM 8.18 Any Dirac sequence converges weakly to the Dirac distribution
in D (R).
sin2 (nx)
for x R
n2 x 2
is a Dirac sequence (see Figure 8.4) and thus converges also to .
n (x) =
Remark 8.20 Without the positivity assumption for x close to 0, there is nothing to prevent
contributions proportional to to arise in the limit! Thus, consider the sequence of functions
(kn )nN given by the graph
n2
1
n
n1
n2
D
It is easy to show that kn . Let ( fn )nN be a Dirac sequence; then the sequence
( fn + kn )nN , which still satisfies , but not , converges to + and not to .
Topology in D
233
10
8
6
4
2
0
0,5
0,5
sin 2nx
, for n = 1, . . . , 5.
x
sin 2nx
.
x
All the functions bn share a common envelope x 7 1/x and the sequence
of real numbers (bn (x))nN does not converge if x
/ Z (see Figure 8.5). We do
have bn (0) , but, for instance,
1
(1)k /2 if n = 2k + 1,
bn
=
0
if n = 2k.
4
In particular, this sequence (bn )nN is not a Dirac sequence. Yet, despite this
we have the following result:
PROPOSITION 8.21 The sequence (bn )nN converges to in the sense of distribu-
tions.
Distributions II
234
We can always write (x) = (0) + x (x), where if of C class. Then we have
ZM
ZM
sin 2nx
dx +
(x) sin(2nx) dx,
b n , = (0)
x
M
M
and the first integral, via the substitution y = nx, tends to the improper integral
Z +
sin x
dx = 1,
x
continuous function on R, then fn g exists for all n N and the sequence ( fn g)nN
converges uniformly to g.
If g is not continuous but the limits on the
left or the right of g exist at the point x,
then fn g(x) tends to 12 g(x ) + g(x + ) .
8.2.c
Topology in D
235
f (x) = T (x) = T (t), (x t) .
Let T D and D (so is now not only of C class, but also with
bounded support). Then T exists and is a regular distribution associated to a
function g of C class, with derivatives given by the formula
g (k) (x) = T (t), (k) (x t) .
The following theorem shows that the operation of convolution with a fixed
distribution is continuous for the weak topology on D :
THEOREM 8.25 (Continuity of convolution) Let T D . If (Sk )kN converges
weakly in D to T , and if moreover all the Sk are supported in a fixed closed subset
such that all the Sk T exist, then the sequence (Sk T ) converges in D to S T .
Consider now a Dirac sequence (n )nN , with elements belonging to D.
It converges weakly to in D (this is Theorem 8.18); moreover, we know
(Theorem 8.24) that if T is a distribution, we have n T C (R). Since
n T T , we deduce:
n
In other words, any distribution is the weak limit of a sequence of functions of C class with bounded support.
Exercise 8.3 Find a sequence (n )nN in D which converges weakly to X in D .
Distributions II
236
8.3
Convolution algebras
DEFINITION 8.27 A convolution algebra is any vector space of distributions
containing the Dirac distribution and for which the convolution product
of an arbitrary number of factors is defined.
There are two well-known and very important examples.
THEOREM 8.28 (E
E , D + and D ) The space E of distributions with bounded sup-
where A and B are distributions (B is often called the source) and the
unknown X is a distribution.3 The problem is to find a solution and to
determine if it is unique. So we are, quite naturally, led to the search for a
distribution denoted A 1 , which is called the convolution inverse of A or,
sometimes, Green function of A, which will be such that
A A 1 = A 1 A = .
Remark 8.29 This inverse, if it exists, is unique in the algebra under consideration. Indeed, suppose
there exist, in a same algebra, two distributions A and A such that
Then
and therefore
A A = A A =
and
A A = A A = .
A = A = A (A A ) = (A A) A = A = A
A = A .
For instance, notice that any linear differential equation with constant coefficients
d
dn
a0 + a1 + + an n X (t) = B(t)
dt
dt
Convolution algebras
237
AX =B
tion in a given algebra A . In fact, a distribution A may very well have a certain convolution
inverse in one algebra and another inverse in a different algebra. Take, for instance, the case of
a harmonic oscillator characterized by a convolution equation of the type
( + 2 ) X (t) = B(t),
where B(t) is a distribution characterizing the input signal. The reader is invited to check that
( + 2 )
1
H (t) sin t = .
Since the distribution 1 H (t) sin t is an element of D+ , it follows that, for any B D+ , that
is, for any input which started at a finite instant t R and was zero previously, the equation
has a solution in D+ which is unique and is given by
X (t) = B(t)
1
H (t) sin t
causal solution.
1
( + 2 ) H (t) sin t = ,
1
H (t) sin t
anti-causal solution.
We also deduce that this convolution operator does not have an inverse in E .
Distributions II
238
8.4
Solving a differential equation
with initial conditions
Most students in physics remember very well that the evolution of a system
is given by a differential equation, but forget that the real problem is to solve
this equation for given initial conditions. It is true that, in the linear case,
these initial conditions are often used to parameterize families of solutions;
however, this is not a general rule.4
The Cauchy problem (an equation or system of differential equations,
with initial conditions) is often very difficult to solve, as every one of you
knows. Sometimes, it is very hard to show that a solution exists (without even
speaking of computing it!), or to show that it is unique.5
In the cases below, we have autonomous linear differential equations with
constant coefficients. The theoretical issues are therefore much simpler, and we
are just presenting a convenient computational method. Some cases which are
more complicated (and more useful in physics) will be presented on page 348.
8.4.a
(e)
239
(E)
There only remains to find the unique solution to this equation (in the sense
of distributions) which lies in the convolution algebra D+ of distributions
with support bounded on the left.
is
THEOREM 8.32 The convolution inverse of [ + ] in D+
G (t) = H (t) e t .
(The inverse in D is t 7 H (t) e t .)
Proof. Indeed, we have G (t) = e t +(t), which shows that [ +]G = .
or
U = u0 H (t) e t .
It may seem that we used something like a hammer to crush a very small
fly. Quite. But this systematic procedure becomes more interesting when
considering a differential equation of higher order, or more complicated initial
conditions, for instance on surfaces (see Section 15.4, page 422).
8.4.b
(e )
Distributions II
240
in D+ is
1
H (t) sin t.
G (t) =
1
H (t) sin t.
Remark 8.34 There exist methods to rediscover this result, the proof of which is immediate:
such convolution inverses, or Green functions, of course do not come out of thin air. These
methods are explained in Section 15.2 on page 409.
To summarize, the same problem can be seen from two different angles:
1. the first, classical, shows a differential equation with given initial conditions at the point 0;
2. the second shows a differential equations valid in the sense of distributions, without conditions at 0, for which a solution is sought in a given
convolution algebra (for instance, that of distributions with support
bounded on the left).
The right-hand side of (E ) shows what distribution is necessary to move
the system at rest (for t < 0) instantaneously to the initial state u(0), u (0) .
We will see in the following chapters, devoted to the Fourier and Laplace
transforms, how to compute the convolution inverses (also called the Green
functions) of a differential operator.
Here we merely state an interesting result:
THEOREM 8.35 (Heat equation) The convolution inverse of the heat operator
2
c
2
t
x
H (t) c x 2 /4t
G (x, t) = p
e
,
c t
Exercises
that is,
or
241
c (t) (x) G (x, t) = (2) (x, t),
c
2G
G
2 = (x) (t).
t
x
EXERCISES
Examples of distributions
Exercise 8.4 (Cauchy principal value) Show that the map defined on D by
1
pv : D 7 lim
0
x
|x|>
(x)
dx
x
which defines the Cauchy principal value is the derivative, in the sense of distributions, of
the regular distribution associated to the locally integrable function x 7 log |x|.
Exercise 8.5 (Beyond principal value) Show that the map
7 lim+
0
Z
(0)
(x)
dx
+ (0) log
2
x
Exercise 8.6 (Finite part of 1/ x 2 ) We denote by fp(1/x 2 ) the opposite of the distribution
1 def
1
.
fp 2 = pv
x
x
Show that
Z +
Z
1
(x) (0)
fp 2 , =
log |x| (x) dx = lim+
dx,
0
x
x2
|x|>
and that, if 1 denotes as usual the constant function equal to 1, or the associated distribution,
we have
1
x 2 fp 2 = 1 .
x
Generalize these results to define distributions related to the (not locally integrable) functions
x 7 1/x n for any n N.
Exercise 8.7 Show that we have, for all a R,
(a x) =
whereas, if a > 0,
1
(x),
|a|
H (a x) = H (x).
Distributions II
242
What is the interpretation of the functions and H ? These integrals have the nice property
of being invariant under Lorentz transformations. Show that, if we put
p
def
E ( p) = p 2 + m 2 ,
then we have
(p2 m 2 ) H (p 0 ) d4 p =
d3 p
,
2E ( p)
which is the usual way of writing that for any sufficiently smooth function (indicate precisely
the necessary regularity) we have
Z
Z
d3 p
(p2 m 2 ) H (p 0 ) f (p) d4 p = f E ( p), p
.
2E ( p)
Convergence in D
Exercise 8.11 Let n N and R. We consider the function
fn : R \ Z R,
1 sin2 nx/2
.
2n sin2 x/2
R +1
Show that fn can be extended by continuity to R. Compute 1 fn (t) dt and show that the
result is independent of n.
Show that the sequence ( fn )nN tends to X in the sense of distributions.
x 7
x 7
1 x D
(x)
a
a
[a 0+ ].
Exercises
243
Exercise 8.13 Let be a real number. Show that, in the sense of distributions, we have
X
nZ
sin |n| =
sin
.
1 cos
E (x, t) =
1
0
if c t > |x| ,
if c t < |x| ,
2
1 2
2.
2
2
c t
x
Exercise 8.15 In the plane R2 , we define the function E (x, t) = H (x) H (t), and its asso-
2
E
x t
Exercise 8.16 (The wages of fear) We model a truck and its suspension by a spring with
strength k carrying a mass m at one of its extremities. The wheels at the bottom part of the
spring are forced to move on an irregular road, the height of which with respect to a fixed
origin is described by a function of the distance E (x). The horizontal speed of the truck is
assumed to be constant equal to v, so that at any time t, the wheels are at height E (v t). Denote
by R(t) the height of the truck, compared to its height at rest (it is the response of the system).
i) Show that
d2
k
2
+
R(t) = 2 E (v t),
where 2 = .
dt 2
m
ii) In order to generalize the problem, we assume that E is a distribution. What is then
the equation linking the response R(t) of the truck to a given road profile E (x) D+ ,
where D+ is the space of distributions with support bounded on the left? Comment
on the choice of D+ .
iii) Compute the response of the truck to the profile of a road with the following irregularities:
a very narrow bump E (x) = (x);
a flat road E (x) = H (x);
Compute the amplitude of the movements of the truck. How does it depend on the
speed?
iv) In the movie The Wages of Fear by Henri-Georges Clouzot (1953, original title Le salaire
de la peur), a truck full of nitroglycerin must cross a very rough segment of road. One
of the characters, who knows the place, explains that the truck can only cross safely at
very low speed, or on the contrary at high speed, but not at moderate speed. Explain.
Distributions II
244
PROBLEM
Problem 3 (Kirchhoff integrals) The wave equation in three dimensions, for a scalar quan-
tity , is
def
1 2
c2 t2
( r, t) = 0,
where we denote r = (x, y, z). Monochromatic waves with pulsation are given by
( r , t) = ( r) e i t .
1. Show that the complex amplitude then satisfies the equation called the Helmoltz
equation:
def
+ k 2 ( r) = 0,
with k = .
c
Let S be an arbitrary closed surface. We are looking for a distribution ( r ) such that
( r) inside S ,
( r ) =
0
outside S ,
and which is a solution in the sense of distributions of the Helmoltz equation.
2. Show that differentiating in the sense of distribution leads to
h
i
d
d
+ k 2 ( r) ( r ) =
S ( r) +
S ( r ),
dn
dn
where n is the interior normal to S (the choices of orientation followed here are those
in the book Principles of Optics by Born and Wolf [14]).
3. Show that a Green function of the operator [ + k 2 ], that is, its convolution inverse,
is given by
1 e i kr
def
G ( r) =
,
where r = k r k .
4 r
4. Deduce that
ZZ
d e i kq
e i kq d
1
d2 x
with q = k r xk .
( r ) =
4
dn q
q
dn
S
5. Interpretation: the complex amplitude inside a closed surface S , with given limit conditions, is the same as that produced by a certain surface density of fictive sources6
located on S . This result is the basis of what is called the method of images in electromagnetism and it is, in the situation considered here, the founding principle of the
scalar theory of diffraction in optics.
A more complete version of Kirchhoffs Theorem, for an arbitrary dependence on time,
is presented in the book of Born and Wolf [14, p. 420].
Possibly very difficult to compute explicitly! Note that this computation requires knowing
both the amplitude and its normal derivative with respect to the surface. In some cases only
we may hope to exploit this exact formula, where symmetries yield additional information.
Solutions of exercises
245
SOLUTIONS
Solution of exercise 7.2 on page 198
x E
1 D
1
(x),
=
(0)
|a|
a
a |a|
1
, .
=
a |a|
R
Solution of exercise 7.4 on page 211. Put h(x) = f (t) g(x t) dt and make the substitution y = x t, with jacobian |dy/dt| = 1, which yields
Z
Z
h(x) = f (x y) g( y) dy = g( y) f (x y) dy = [g f ](x).
(a x), (x) =
Solution of exercise 7.5 on page 211. A simple but boring computation shows that
e |ax| e |b x| =
2
[a e b|x| b e a|x| ].
a2 b 2
Solution of exercise 8.1 on page 229. Notice that sin x = Im(e i x ) and that z 7 e i z is
holomorphic on C and decays in the upper half-plane (at least in the sense of Jordans second
lemma). Using then the Kramers-Kronig formula from Theorem 8.2, we obtain
Z
Z
1 + sin x
sin x
dx =
dx
x
2 x
0
Z Z +
1
sin x
= lim+
+
dx
(continuous function)
2 0
x
Z +
1
Ime i x
= pv
dx = Re(e i 0 ) = .
2
x
2
2
log |x| , = log |x| , (x) =
= lim
0+
log |x| (x) dx,
since x 7 log |x| is locally integrable (the integral is well defined around 0). Integrating each
of the two integrals by parts, we get
Z
(x)
log |x| , = lim+ log() () log() () +
dx
0
x
|x|>
Z +
(x)
= pv
dx,
x
which shows that log |x| = pv 1x .
Solution of exercise 8.5. First, we show that the stated limit does exist. Let D. Integrating by parts twice, we get
Z +
Z +
(x)
()
dx
=
Distributions II
246
and therefore
Z +
(0)
(x)
dx
+ (0) log =
2
x
Z +
() (0)
The second term tends to 0, and the third to (0) as [ 0+ ]. The first term is integrable
when = 0. Hence the expression given has a limit, which we denote by T , .
The linearity is obvious, and hence we have to prove the continuity, and it is sufficient to
show the continuity at 0 (the zero function of D).
Let (n )nN be a sequence of functions in D, converging to 0 in D. All the functions n
have support included in a fixed interval [A , A], where we can assume for convenience and
without loss of generality that A > e. Then
ZA
T , n =
n (x) log x dx + n (0),
0
and hence
T , n k k (A log A A) + k k .
n
n
Choose > 0. There exists an integer N such that for n N we have kn k /(A log A
A) and kn k , so that |T , n | 2 for any n N . This proves that T , n tends to 0
as [n ].
Since we have proved the continuity of T , this linear functional is indeed a distribution.
(1)n1 dn
(1)n1 dn1
1
1
=
log |x| =
pv .
n
n
x
(n 1)! dx
(n 1)! dx n1
x
1
1
(a x), =
(x), (x/a) =
(0) =
, .
|a|
|a|
|a|
For any a > 0, the functions x 7 H (x) and x 7 H (a x) are equal. Hence the associated
regular distributions are also equal. If a proof by substitution really is required, simply compute,
for any test function
Z
Z +
1 +
H (a x), =
H , (x/a) =
(x/a) dx =
(x) dx = H , .
|a|
|a| 0
0
For a < 0, the reader can check that H (a x) = 1 H (x).
Similarly, by letting it act on any test function , one shows that x = 2 . Note that this
relation can also be recovered directly by differentiating the relation x = , which yields
+ x = and thus x = 2 .
If f is a locally integrable function of C class, we can define the product f . By the
same method as before, we find that
f (x) (x) = f (0) f (0) .
The assumptions can be weakened by asking only that f be continuous and differentiable
at 0. Without a more complete theory of distributions, the relation above can be taken as a
definition for f (x) (x) in this case.
Solutions of exercises
247
1 sin nx/2 2
, for n = 2, . . . , 7. The
2n sin x/2
limit in the sense of distributions of this sequence is X.
1
1
(r R) =
.
2R
2R S
Solution of exercise 8.11.R By writing the difference of two consecutive terms and integrating by parts, it follows that
1
f (t) dt
1 n
fn (t) dt = 1
n N.
Moreover, fn takes only positive values and converges uniformly to 0 on any interval of the type
]+, 1], with > 0, which shows that
x
f (x) ,
n
2 n
P+
xk
and since fn (x) = k= 2 fn (x) and each term tends to k , it follows that fn X.
n
Solution of exercise 8.13. Expand the sine as sum of two complex exponentials, rearrange
the terms, and compute the sums of geometric series. The convergence must be justified.
Distributions II
248
whence
2E
(x, t) = (x) (t).
x t
i) The force exerted on the truck by the spring is k E (v t) R(t) , so by Newtons law
we can write
d2 R
(t) + k R(t) E (v t) = 0,
m
dt
hence the required formula.
iv) The nitroglycerin will explode if the amplitudes of the vibrations of the truck are too
important. The previous formula indicates a resonance at the speed v = . A very
slow or very fast speed will avoid this resonance.
Solution of problem 3.
Using Theorem 7.51 on page 206, the formula asked in Question 2 follows. To show that ( + k 2 ) G = , use the property
(1/r ) = 4
and the formula for differentiating a product: it follows that
d
(1) e i kr
d
S ( r) +
, S ( r)
,
( r) =
dn
dn
4 r
and then that
( r) =
d
( x) +
dn S
d
dn
S ( x),
(1) e i kk r xk
4 k r xk
Chapter
Hilbert spaces,
Fourier series
9.1
Insufficiency of vector spaces
To indicate how vector spaces and the usual notion of basis (which is called
an algebraic basis) are insufficient, here is a little problem: let E = RN , the
space of sequences of real numbers. Clearly E is an infinite-dimensional vector
space. So, the general theory of linear algebra teaches that1:
THEOREM 9.1 Any vector space has at least one algebraic basis.
Though the theorem that states this does not give a way of constructing such a basis. Even
worse, the proof uses the axiom of choice or rather an equivalent version, called Zorns lemma
LEMME (Zorn) Let Z be a non-empty ordered set in which any totally ordered subset admits an upper
250
...
(9.1)
iI
i ei ,
i=0
X
k=0
uk vk ,
namely,
kuk2 =
X
k=0
|uk |2 .
However, this will not do, since the norm of the vector (1, 1, 1, . . . , 1, . . . )
is infinite. Without additional assumptions, speaking of norms brings other
difficulties, and we are back at the beginning. Well, then, one may wish
to keep the finiteness requirement for the sums, but to restrict the vector
space under consideration by looking only at the space E 0 made of sequences
with bounded support (i.e., sequences where all but finitely many elements
are zero). This time, the family (9.1) is indeed an algebraic basis of E0 , but
Pre-Hilbert spaces
251
x (1) = (1, 0, 0, 0, 0, . . . ),
x (2) = 1, 12 , 0, 0, 0, . . . ,
x (3) = 1, 12 , 14 , 0, 0, . . . ,
x (4) = 1, 12 , 14 , 18 , 0, . . . ,
1
x (n) = 1, 12 , . . . , 2n1
, 0, . . . ,
...
of elements of E0 .
If we put
x=
1
2k kN
1
= 1, 12 , 14 , 18 , . . . , 2n1
,... ,
X
(n)
1
1
x x
2 =
=
.
k
4
3 4n1
k=n
Unfortunately, the element x does not belong to E 0 (it does not have bounded
support). One says that E 0 is not complete.
So we can see that the difficulties pile up when we try to extend the notion
of basis to an infinite family of vectors. To stay in a well-behaved setting, we
are led, following Hilbert and Schmidt, to introduce the notion of a Hilbert
space. Before doing this, we will review the general techniques linked to the
scalar product and to projection onto subspaces of finite dimension.
9.2
Pre-Hilbert spaces
DEFINITION 9.3 Let E be a real or complex vector space. A complex scalar
252
Property (b) expresses the linearity with respect to the second variable.
The scalar product is thus semilinear (or antilinear) with respect to the first,
which means that
, C, x, x , y, E
x + x y = (x| y) + x y .
The hermitian product is also called definite (property d) and positive (property c).
DEFINITION 9.4 (Norm) Let E be a pre-Hilbert space. The norm (or hermitian norm) of an element x E is defined using the scalar product by
def p
kxk = (x|x).
(Theorem 9.6 on the facing page shows that this is indeed a norm.)
The hermitian product has the following essential property:
2
( x |y)
(assuming that (x| y) 6= 0, which is not a restriction since
|( x | y )|
otherwise the inequality is immediate). Then we have = 1 and, for any real number
, the inequalities
def
Proof. Put =
Another importance
inequality, called the Minkowski inequality, shows that
p
the map x 7 (x|x) is indeed a norm (in the sense defined in Appendix A):
2
One should not confuse Hermann Schwarz (18431921) and Laurent Schwartz (1915
2002). For biographical indications, see, respectively, pages 161 and 222.
Pre-Hilbert spaces
253
have
x, y E
kx + yk kxk + k yk ,
with equality if and only if x and y are linearly dependent, and their proportionality
coefficient is positive or zero.
Proof. Use the Cauchy-Schwarz inequality in the expansion of kx + yk2 .
x, y E we have
This is called the the parallelogram identity (or law) because of the following
analogy with the euclidean norm in the plane: the sum of the squares of the
lengths of the diagonals of a parallelogram is equal to the sum of the squares
of the lengths of the four sides.
x+y
y
x
xy
ii) Two subsets A and B are orthogonal if, for any a A and any b B,
we have a b. This is denoted A B.
iii) If A is an arbitrary subset of E , the orthogonal of A, denoted A , is
the vector space of elements in E which are orthogonal to all elements
of A:
A = x E ; y A , x y .
254
9.2.a
i=1
i=1
n
P
i=1
n
P
i=1
9.2.b
(ei |x) ei ,
xi yi =
i.e.,
n
P
i=1
(x|ei ) (ei | y) ,
x i = (ei |x) ,
kxk2 =
n
P
i=1
|x i |2 .
xV =
p
X
i=1
(ei |x) ei .
To show that such a basis exists is easy: just take an arbitrary basis, then apply the GramSchmidt algorithm to make it orthonormal.
Pre-Hilbert spaces
255
xV =
p
X
i=1
(ei |x) ei .
e2
e1
(e2 x) e2
V
xv
(e1 x) e1
space of E .
256
Remark 9.13 If V is a subspace of infinite dimension, the previous result may be false. For
instance, consider the pre-Hilbert space E of continuous functions on [0, 1], with the natural
R1
scalar product ( f |g) = 0 f g, and the subspace V of functions f such that f (0) = 0. If
g is a function in V , then it is orthogonal to x 7 x g(x) (which belongs to V ); therefore
2
R1
x g(x) dx = 0 and g = 0. Therefore V = {0}, although {0} is not a complementary
0
subspace of V .
xp =
p
X
n=0
(en |x) en
for any p N,
p
2 X
(x|en ) 2 kxk2 .
and hence x p =
n=0
THEOREM 9.14 (Bessel inequality) Let E be a pre-Hilbert space. Let (en )nN be
2
(en |x) converges
n=0
9.3
Hilbert spaces
Recall that a normed vector space E is complete if any Cauchy sequence
in E is convergent in E .
Complete spaces are important in mathematics, because the notion of
convergence is much more natural in such a space. These are therefore the
spaces that a physicist requires intuitively.4
4
Since the formalization of quantum mechanics by Jnos (John) von Neumann (19031967)
in 1932, many physics courses start with the following axiom: The space of states of a quantum
system is a Hilbert space.
Hilbert spaces
257
David Hilbert (18621943), born in Knigsberg in Germany, obtained his first academic position in this town at the age of 22. He
left for Gttingen eleven years later and he ended his career there in
1930. A many-faceted genius, he was interested in the foundations of
mathematics (in particular the problem of axiomatization), but also
in mechanics, general relativity, and number theory, and invented
the notion of completeness, opening the way for the study of what
would later be called Hilbert spaces. His name remains attached to a
list of 23 problems he proposed in 1900 for the attention of mathematicians present and future. Most have been resolved, and some
have been shown to be indecidable.5
9.3.a
Hilbert basis
The spaces we are going to consider from now on will be of infinite dimension
in general. It is harder to work with spaces of infinite dimension since many
convenient properties of finite-dimensional vector spaces (for instance, that
E E ) are not true in general. The interest of Hilbert spaces is to keep
many of these properties; in particular, the notion of basis can be generalized
to infinite-dimensional Hilbert spaces.
DEFINITION 9.16 (Total system) Let H be a Hilbert space. A family (en )nI
(x|ei ) = 0, i I
(x = 0).
nN
V p = Vect{e0 , e1 , . . . , e p }
5
for any p N.
Such is the case of the continuum hypothesis, a result due to P. Cohen in 1963.
258
the orthonormal system (en )nN is the sequence of sums S p (x) given by
def
S p (x) =
p
X
n=0
(en |x) en ,
for any p N.
P
P 2
Proof. From Bessels inequality it follows that the series
|(en |x)|2 =
|cn | has
bounded partial sums; since it is a series with positive terms, it is therefore convergent,
and consequently it is a Cauchy sequence. Hence the quantity
X
2
q
X
q
S p (x) Sq (x)
2 =
c
e
|cn |2
n n
=
n= p+1
n= p+1
is the Cauchy remainder of a convergent series of real numbers, and it tends to 0 when
[p, q ]; this shows that the sequence Sn (x) nN is a Cauchy sequence in H which
is, by assumption, complete. So it converges in H .
This is not sufficient to show that the sequence Sn (x) nN converges to x;
as far as we know, it may very well converge to an entirely different vector.
This is the reason for the introduction of the Hilbert basis.
Hilbert spaces
259
i=0
2
(ei |x) ;
S
Proof. Denote Tn = Vect{e0 , . . . , en } and T = nN Tn .
a) b): by assumption, T is dense in H , hence T = H . Let > 0. There exists
y T such that kx yk . Moreover, since y T , there exists N N such
that y T N , so that, since the sequence (Tn )nN is increasing for inclusion, y Tn
for any n N . In addition the projection of x on Tn achieves the shortest distance
(Pythagoras theorem), hence kx xn k kx yk for
P any n N .
b) c): an easy computation shows that (xn | yn ) = ni=0 (x|e i ) (e i | y) and letting n
go to infinity gives the result.
c) d): immediate.
6
In other words, for a Hilbert basis (e i )iI , any vector in H can be approached, with
arbitrarily high precision, by a finite linear combination of the vectors e i .
7
In certain books, countability is part of the definition of a Hilbert basis.
260
d) e): immediate.
e) a): let x H and denote y = lim xn (from Theorem 9.18, we know that the
n
nN
nN
n=0
P
n=0
(en |x) en ,
cn (x) cn ( y) =
n=0
(x|en ) (en | y) ,
kxk2 =
P
cn (x)2 .
n=0
COROLLARY 9.23 Two elements of a Hilbert space which have the same Fourier
Proof. Let x, y H have the same Fourier coefficients with respect to an orthonormal Hilbert basis (en )nN . We then have cn (x y) = cn (x) cn ( y) = 0 for any n N;
hence, according to Property e), it follows that x y = 0.
PROPOSITION 9.24 In a Hilbert space, all Hilbert basis have the same cardinality.
COROLLARY 9.25 Every separable Hilbert space is isometric to the space 2 , to the
space L2 (R), and to the space L2 [0, a] (see the next pages for definitions).
L2 ,
This is important in quantum mechanics. Indeed, since in general the space considered is
which is separable, it follows that the family {| x} xR is not a Hilbert basis.
Hilbert spaces
x| y =
nN
cn (x) cn ( y) =
nN
(x|en ) (en | y) =
261
P
nN
x| en en | y,
Two Hilbert spaces will be important in the sequel, the spaces denoted
2 and L2 ; both are extensively used in quantum mechanics as well as in
signal theory. The two following sections give the definitions and the essential
properties of those two spaces. It will be seen that it is possible to translate
from one to the other, and this is precisely the objective of the classical Fourier
series expansion.
the series
(a|b) =
n=0
is a Hilbert space.
a n bn ,
Proof. Because of its length and difficulty, the proof is given in Appendix D,
page 602.
en = ( 0,
. . , 0} , 1, 0, 0, . . . ).
| .{z
n1 times
(In this is recognized the basis which we were looking for at the beginning
of this chapter!) Since this Hilbert basis is countable, the space 2 is separable.
Remark 9.29 Because N and Z have same cardinality, 2 can also be defined as the space of
262
The problem is that if f and g are equal almost everywhere, even without
being equal, then k f gk = 0, so this is not really a norm. The way out
of this difficulty is to identify functions which coincide almost everywhere,
namely, to say9 that if f and g are equal almost everywhere, then they represent
the same object.
DEFINITION 9.30 Let a R+ . The L2 [0, a] space is the space of measur-
able functions defined on [0, a] with values in C which are square integrable,
defined up to equality almost everywhere, with the hermitian scalar product
and the norm given by
Za
Z a
1/2
def
def
f (x)2 dx
f (x) g(x) dx,
and
k f k2 =
.
( f |g) L2 =
0
1
en (x) = p e 2inx/a
a
9
Mathematicians speak of taking the quotient by the subspace of functions which are zero
almost everywhere.
Hilbert spaces
9.3.d
263
The L2 (R
R) space
A fundamental result due to Riesz and Fischer, which is not obvious at all,
shows that L2 (R) is complete. It is therefore a Hilbert basis, and it can be
shown to admit a countable Hilbert basis. Thus the Hilbert spaces 2 , L2 [0, a],
and L2 (R) are isometric.
Remark 9.33 Although a Hilbert basis has many advantages, it will also be very interesting to
consider on the space L2 (R) a basis which is not a Hilbert basis, but rather has elements
which are distributions, such as the (noncountable) family x xR , which quantum physicists
denote |x xR , or the family e 2i p x pR , denoted (not without some risk of confusion!)
|p pR . Instead of representing a function f L2 by a discrete sum, it is represented by a
continuous sum, that is, an integral. Such a point of view lacks rigor if a basis in the sense
of distributions is not defined (note, for instance, that x 7 e 2i p x is not even in L2 (R)!).
However, the representation of a function by Fourier transformation
is simply (in another
language) the decomposition of a function in the basis x 7 e 2i p x pR .
Exercise 9.1 Define H to be the space of functions on R, up to equality almost everywhere,
such that ( f | f ) < +, where
Z
2
def
( f |g) =
f (t) g(t) e t dt.
R
Then H with this norm is a Hilbert space. Show that by orthogonalizing the family
(X 0 , . . . , X n , . . .) (using the method due to Erhard Schmidt), one obtains a family of polynomials (H0 , . . . , Hn , . . .), called the Hermite polynomials.
Show (see, for instance, [20, complement B.v]) that
Hn (t) = (1)n e t
dn t 2
e .
dt n
2
Meyer [37, 54, 65], which generalize the Fourier series expansion on L2 [0, a] (see Section 11.5,
page 321).
264
9.4
Fourier series expansion
It is also possible (and sometimes more convenient) to define the space
L2 [0, a] as the space of functions f : R C which are periodic with period
a and square integrable on [0, a] (since any function defined on [0, a] may be
extended to R by periodicity10).
1
Recall that
en (x) = p e 2inx/a
a
9.4.a
DEFINITION 9.35 Let f L2 [0, a]. For n Z, the n-th Fourier coefficient
a
0
cn ( f) = cn ( f ) ;
ii) n Z,
cn ( f) = cn ( f) = cn ( f ) ;
iii) n Z,
cn ( f ) = e 2ina/ cn ( f ).
2in
c (f)
a n
n Z.
265
its Fourier coefficients and by ( fn )nN the sequence of Fourier partial sums of f . Then
the sequence ( fn )nN converges in mean-square to f :
Za
f f n
2 =
f (t) fn (t)2 dt 0.
2
n
holds only in the sense of the L2 -norm, and not pointwise) simply that
+
+
X
1 X
f (t) = p
cn e 2int/a =
cn en (t)
(9.3)
a n=
n=
f =
+
P
n=
c n en
P
1 +
f (t) = p
c e 2i nt/a .
a n= n
true
false!
The first equality is proved in the general case of Hilbert spaces. But the Hilbert space which
we use is the space L2 [0, a], where functions are only defined up to almost-everywhere equality,
and where convergence means convergence in mean-square. The space of a-periodic functions,
with the norm defined as the supremum of the absolute values of a function, is not complete.
One cannot hope for a pointwise convergence result purely in the setting of Hilbert spaces.
(This type of convergence will be considered with other tools in Section 9.4.d on page 267.)
266
+
a
1 X
t
t
f (t) = p0 + p
an cos 2n
+ bn sin 2n
,
2 a
a n=1
a
a
(9.4)
Za
2
t
an = p
f (t) cos 2n
dt = cn + cn ,
a 0
a
Za
2
t
def
bn = p
f (t) sin 2n
dt = i(cn cn ),
a 0
a
def
with
+
Z a
|a0 |2 1 X
2
2
f (t)2 dt.
+
|an | + |bn | =
4
2 n=
0
and
n6=0
Remark 9.42 The Fourier series expansion gives a concrete isometry between the space L 2 [0, a]
and the space 2 . To each square-integrable function, we can associate a sequence (cn )nZ 2 ,
namely, the sequence of its Fourier coefficients.
Conversely, given an arbitrary sequence
P
(cn )nZ 2 , the trigonometric series
cn en converges (in the sense of mean-square convergence) to a function which is square integrable (this is the Riesz-Fischer Theorem). Frdric
Riesz said that the Fourier series expansion was a perpetual round-trip ticket between two
spaces of infinite dimension [52].
This business of convergence in L2 norm is all well and good, but it does not
even imply pointwise convergence (and, a fortiori, does not imply uniform
convergence11). Isnt it possible to do better? Can we write (9.3) or (9.4)
rigorously?
We start the discussion of this question by extending the notion of Fourier
series to functions which are not necessarily in L2 [0, a], but are in the larger
space L1 [0, a] of Lebesgue-integrable functions on [0, a].12 Indeed, for f
11
To take an example independent of Fourier series, the sequence (x 7 sinn x)nN converges
to 0 in mean-square, but it does not converge pointwise or uniformly.
12
Notice that, on the contrary, if f L2 [0, a], then f L1 [0, a] since, by the CauchySchwarz inequality we have
Z a
2
Za
Za
2
f (t) dt = | f | , 1 2
f (t)2 dt
1 dt =
f
2 a.
0
267
are defined.
a
0
f (t) e 2int/a dt 0.
n
268
limits on the left and on the right f (t0 ) and f (t0+ ) and the limits on the left and
on the right f (t0 ) and f (t0+ ) exist, then the sequence of partial sums fn (t0 ) nN
converges to the regularized value of f at t0 , that is, we have
i
1h +
lim fn (t0 ) =
f (t0 ) + f (t0 ) = f (t0 ).
n
2
Notice that not only the function itself, but also its derivative, must have a
certain regularity.
It is possible to weaken the assumptions of this theorem, and to ask only
that f be of bounded variation [37]. In practice, Dirichlets theorem is sufficient. Sidebar 4 (page 276) gives an illustration of this theorem.
9.4.e
269
Fourth question: what link is there between the regularity of f and the
speed of decay of the Fourier coefficients to 0?
(a) We know already that for f L1 [0, a], we have cn 0 (RiemannLebesgue lemma).
P
2
(b) Moreover, if f L2 [0, a], then +
n= |c n | < + (by Hilbert space
theory).
P
(c) Theorem 9.47 shows that if f C 1 [0, a], then +
n= |c n | < +, which
is an even stronger result.
Remark 9.48 It is not sufficient for this last result that f be of C k class on [0, a]: it
is indeed required that the graph of f in the neighborhood of 0 glues nicely with
that in the neighborhood of a.
Remark 9.49 It has been proved (the Carleson-Hunt theorem) that the Fourier series of a
function f which is in L p for some p > 1 converges almost everywhere to f . It is also known
(due to Kolmogorov) that the Fourier series of an integrable function may diverge everywhere.
The settting of L2 functions is the only natural one for the study of Fourier series.
To look for simple (or uniform) convergence is not entirely natural, then. This is why other
expansions, such as those provided by wavelets, are sometimes required.
270
Remark 9.50 Although many physics courses introduce Fourier analysis as a wonderful instru-
ment for the study of sound (music particularly), it should be noted that the expansion in
harmonics of a sound with given pitch (from which Ohm postulated that the timber of the
sound could be obtained) is not sufficient, at least if only the amplitudes |cn | of the harmonics
are considered. The phases of the various harmonics play an important part in the timber
of an instrument. A sound recording played in reverse does not give a recognizable sound,
although the power spectrum is identical [25].
9.4.f
On the figures of Sidebar 4 on page 276, we can see that the partial Fourier
sums, although converging pointwise to the square wave function, do not converge uniformly. Indeed, there always exists a crest close to the discontinuity.
It is possible to show that
this crest get always narrower;
|sin x| =
for all x R.
8X
sin2 nx
n=1 4n2 1
Exercise 9.4 (Fejer sums) The goal of this exercise is to indicate a way to improve the rep-
resentation of a function by its Fourier series when the conditions of Dirichlets theorem are
not valid.
Let f : R C be 2-periodic and integrable on [0, 2]. Define
Z
n
X
1
Sn f (x) =
ck e i kx ,
where ck =
f (x) e i kx dx.
2
k=n
Exercises
271
f (x u) Dn (u) du
and
n f (x) =
f (x u) n (u) du,
1
1 sin n + 2 u
1
sin nu/2 2
and
n (u) =
,
Dn (u) =
2
sin u/2
2n sin u/2
which are respectively called the Dirichlet kernel and the Fejr kernel.
ii) Show that the sequence (n )nN , restricted to [, ], is a Dirac sequence.
iii) Deduce from this Fejrs theorem:13 if f is continuous, the sequence (n f )nN converges
uniformly to f ; if f merely admits right and left limits at any point, then the sequence (n f )nN
converges to the regularized function associated to f .
Remark: the positivity of the Fejr kernel explains the weaker assumptions of Fejrs
Theorem, compared with Theorem 9.47 (where assumptions concerning the derivative
are required).
Exercise 9.5 (Poisson summation formula) Let f be a function of C 2 class on R, such that
n=
holds.
P
Hint: Define F (x) = nZ f (x + n) and show that F is defined, 1-periodic, and of C 1
class. Compute the Fourier coefficients of F and compare them with the values stated.
This identity will be proved in a more general situation, using the Fourier transforms of
distributions (Theorem 11.22 on page 309).
PROBLEM
Problem 4 (Isoperimetric inequality)
13
Lipt Fejr (18801959), Hungarian mathematician, was professor at the University of
Budapest. He worked in particular on Fourier series, a subject with a bad reputation at the
time because it did not conform to the standards of rigor that Cauchy and Weierstrass had
imposed. For an interesting historical survey of Fejrs theorem and other works of this period,
see [54], which also explores the theory of wavelets, one of the modern developments of
harmonic analysis.
272
Let be a simple closed path of C 1 class in the complex plane, L its length, and A
the area of the region it encloses. Show, using the previous question, that
L2 4A ,
with equality if and only if is a circle.
Hint: Use the complex Green formula (which states that
Z
ZZ
F
dS
F (z, z ) dz = 2i
z
S
S
for any function F continuous with respect to z and z and admitting continous
partial derivatives with respect to x and y) to show that if is parameterized by the
map f : [0, L] C, with f = u + iv and | f |2 = u 2 + v 2 = 1, then the area is equal
to
ZL
A =
u(s) v (s) ds.
0
SOLUTIONS
Solution of exercise 9.2. Using Fubinis theorem, we compute
cn (h) =
1
42
1
=
42
Z
Z
f (x t) g(t) dt e i nx dx
0
2
0
f (x t) e i n(xt) dx
g(t) e i nt dt
= cn ( f ) cn (g).
Solution of exercise 9.3. The function x 7 |sin x| is even and 2-priodic; computing its
Fourier coefficients (only those involving cosines being nonzero) yields
2 X
2
1
1
|sin x| = +
cos 2nx
n=1 2n + 1 2n 1
after a few lines of calculations. Expanding cos 2 = 1 2 sin2 and noticing that
X
1
1
= 1,
2n + 1 2n 1
n=1
it follows that
8 X
sin2 nx
2X
sin2 nx
=
.
n=1 (2n + 1)(2n 1)
n=1 4n2 1
Note that Dirichlets theorem ensures the pointwise convergence of the series (to the value
|sin x|) at any point x R. This convergence is moreover uniform.
|sin x| =
Solutions of exercises
273
n
1 X
e i ku
2 k=n
and to compute this geometric sum. Similarly, we have
Dn (u) =
1
n1
n1
1X
1X
1 Im exp i n + 2 u
n (u) =
D (u) =
,
n k=0 k
n k=0 2
sin u/2
0,5
0,5
0,5
0,5
0,5
0,5
0,5
0
0,5
274
One can show [24] that for f L1 [0, 2], the sequence (n f )nN converges to f in L1 ,
that is, we have
Z 2
n f (t) f (t) dt = 0.
lim
n
f (x) e 2i p x dx =
XZ
nZ
XZ
nZ
n+1
f (x) e 2i p x dx
f (x + n) e 2i p(x+n) dx =
0
Z 1X
f (x + n) e 2i p x dx =
0 nZ
XZ
nZ
1
0
f (x + n) e 2i p x dx
F (x) e 2i p x dx = c p (F ),
P
where F (x) =
nZ f (x + p), which is well-defined because of the assumption f (x) =
2
o (1/x ), and where the interversion of the sum and integral are justified by the normal
x
pZ
pZ
pZ
Solution of problem 4.
Z
2
+
1 L
L2 X
1
=
u(x) dx + 2
L 0
4 n= n2 L
n6=0
Z
2
L
2i nx/L
u (x) e
dx
0
by integration by parts. But, in the last sum, we have n2 1, hence the inequality
Z
2
Z
2
+
+
X
X
1 L
1 L
2i nx/L
2i nx/L
u
(x)
e
dx
u
(x)
e
dx
2
n
L
L
0
0
n=
n=
n6=0
|u (x)|2 dx.
Solutions of exercises
275
The path being parameterized by the functions u(s) and v(s), the integration element
dz can be written
dz = u (s) + iv (s) ds.
Take the function F (z, z ) = z to apply the Green formula. The area of the surface
S enclosed by the path = S is
Z
ZL
ZZ
F
dS = z dz =
(u iv)(u + iv ) ds
2iA = 2i
z
O
S
ZL
ZL
ZL
=
(uu + vv ) ds i
vu ds + i
uv ds.
0
The integrand in the first term is the derivative of 21 (u 2 + v 2 ); since u(0) = u(L) and
v(0) = v(L), this first integral is therefore zero.
The second integral can be integrated by parts, which yields
ZL
ZL
vu ds =
uv ds
0
v /a)2
(translating the path on the real axis if need be, which does not change the area); it
follows that
Z
Z
1 a 2 L2 L 2
1 L 2
A
u
(s)
ds
+
v
(s)
ds
.
2 42 0
a2 0
It suffices now to fix a so that the coefficients in front of the integrals are equal, which
means a 2 = 2/L; with this value, we get
Z
L 1 2
L2
A
(u + v 2 ) ds =
.
4 0
4
Equality holds if au = v /a, which means v = 2u/L. Together with the relation
L
L
u 2 + v 2 = 1, this implies u(s) = 2
cos(2s/L + ) and v(s) = 2
sin(2s/L + ),
with R, which is, indeed, the equation of a circle with circumference L.
Note that this result, which is rather intuitive, is by no means easy to prove!
276
0.5
0.5
1
0.5
0.5
0.5
0.5
0.5
1
0.5
Sn (t) =
0.5
1
0.5
n1
4X
1
sin(2k + 1)t.
k=0 2k + 1
Chapter
10
Fourier transform
of functions
This chapter introduces an integral transform called the Fourier transform, which
generalizes the Fourier analysis of periodic functions to the case of functions defined
on the whole real axis R.
We start with the definition of the Fourier transform for functions which are
Lebesgue-integrable (elements of L1 ). One problem of the Fourier transform thus
defined is that it does not leave the space L1 stable. It will then be extended
to functions which are square integrable (elements of L2 ), which have a physical
interpretation in terms of energy. The space L2 being stable, any square integrable
function has a Fourier transform which is also square integrable.
10.1
Fourier transform of a function in L1
In this section, we start by defining the Fourier transform of an integrable
function and deriving its main properties.
Although the Fourier transform is a generalization of the notion of a
Fourier series, it is not the L2 or Hilbert space setting which is the simplest.
Therefore we start with functions in L1 (R),1 before generalizing to square
integrable functions.
1
The reason is that, in contrast with functions defined on a finite interval, L2 (R) is not
contained in L1 (R).
278
10.1.a Definition
DEFINITION 10.1 Let f be a function, real- or complex-valued, depending on
one real variable. The Fourier transform (or spectrum) of f , if it exists, is the
complex-valued function defined for the real variable by
Z
def
fe() =
f (x) e 2i x dx
for all R.
(10.1)
fe = F [ f ]
fe() = F f (x) .
or
Remark 10.2 The integral that appears does not always exist. For instance, if we consider
f : x 7 x 2 , the integral
x 2 e 2i x dx
does not exist, for any value of , and f does not have a Fourier transform (in the sense of
functions; it will be seen in the next chapter how to define its Fourier transform in the sense
of distributions).
It is also possible to have to deal with non-integrable functions for which the integral
defining fe() converges as an improper integral. As an example, consider g : x 7 (sin x)/x.
Then g is not integrable but, on the other hand, the limit
ZM
sin x 2i x
e
dx
lim
M +
x
M
exists for all values of . More details about this (not a complete description of what can
happen! this is very difficult to obtain and state) will be given in Section 10.3, which discusses
the extension of the Fourier transform to the space of square integrable functions.
At least if f is integrable on R, it is clear that fe is defined on R. Indeed, recall that for a
function f : R R, the following equivalence holds:
f is Lebesgue-integrable | f | is Lebesgue-integrable,
which shows that if f is integrable, then so is x 7 f (x) e 2i x for any real value of .
Remark 10.4 A number of different conventions regarding the Fourier transform are in use in
various fields of physics and mathematics; in the table on page 612, the most common ones
are presented with the corresponding formula for the inverse Fourier transform.
279
10.1.b Examples
Example 10.5 Consider the rectangle function (already considered in Chapter 7):
def
(x) =
0
1
Z 1
sin
2
def
e =
()
e 2i t dt = sinc() =
12
1
if 6= 0,
if = 0.
sin (b a) e i (a+b) if 6= 0,
e[a,b ] () = F [a,b ] () =
b a
if = 0.
2
Example 10.6 Consider the gaussian x 7 e x . Then one can show (see Exercice 10.1 on
page 295 or, using the method of residues, Exercise 4.13 on page 126) that
2
2
F e x = e .
f (x) =
2a
.
a 2 + 42 x 2
2a
= e a|| .
F
a 2 + 42 x 2
The table on page 614 lists some of the most useful Fourier transforms.
The answer is no. The reason is similar to what happened in the definition
of the space L2 in the previous chapter: if f and g are two functions in L1
which are equal almost everywhere,2 then k f gk = 0, although f 6= g;
this means that the map f 7 k f k is not a norm. To resolve this difficulty,
2
For instance, one can take f = Q , the characteristic function of the set Q, also called the
Dirichlet function, and g the zero function.
280
those functions which are equal almost everywhere are identified. The set of
all functions which are equal to f almost everywhere is the equivalence class
of f. The set of equivalence classes of integrable functions is denoted L1 (R)
or more simply L1 . Thus, when speaking of a function f L1 , what is
meant is in fact any function in the same equivalence class. For instance, the
zero function and the Dirichlet function belong to the same equivalence class,
denoted 0. In order to not complicate matters, there are both said to be equal
to the zero function.
R
In the space L1 , the map f 7 | f | is indeed a norm.
form, defined on L1 (R) and taking values in L (R), is a continuous linear operator, that is
i) (linearity) for any f, g L1 (R) and any , C, we have
F [ f + g] = F [ f ] + F [g] ;
281
Proof of the two preceding theorems. To show the continuity of the function fe,
we apply the theorem of continuity of an integral depending on a parameter (Theorem 3.11 on page 77), using the domination relation
f (x) e 2i x f (x).
x R
Moreover, for any R, we have
Z
Z
fe() = f (x) e 2i x dx f (x) dx = k f k ,
1
k fek = sup fe() k f k1 .
R
Since the Fourier transform is obviously linear, this inequality is sufficient to establish
its continuity (see Theorem A.51 on page 583 concerning continuity of a linear map).
then there will be convergence in the sense of the L norm, that is, uniform
convergence of the sequence ( fen )nN :
L1
cv.u.
fn f = fen fe on R .
Example 10.5 on page 279 shows that the Fourier transform of a function
in L1 is not necessarily integrable itself. So the Fourier transform operator
does not preserve the space L1 , which is somewhat inconvenient, as one often
prefers an honest endomorphism of a vector space to an arbitrary linear map.
(It will be seen in Section 10.3 that it is possible to extend the Fourier transform to the space L2 and that L2 is then stable by the Fourier transform thus
defined.)
On the other hand, one can see that each of the Fourier transforms previously computed has the property that it tends to 0 at infinity. This property
is confirmed in general by the following theorem, which we will not prove
(see, e.g., [76]). It is the analogue of the Riemann-Lebesgue Lemma 9.44 for
Fourier series.
282
function and denote by fe its Fourier transform. Then the continuous function fe tends
to 0 at infinity:
lim fe() = 0.
10.1.e Inversion
The question we now ask is: can one inverse the Fourier transform? The
answer is partially yes; moreover, the inverse transform is none other than
the conjugate Fourier transform.
We just saw that, even if f L1 (R), its Fourier transform fe is not itself
always integrable, and so the conjugate Fourier transform of fe may not be
defined. However, in the special case where fe L1 , we have the following
fundamental result.
THEOREM 10.13 (Inversion in L1 (R
R)) Let f L1 (R) be an integrable function
and also
Proof. We prove the inversion formula only at points of continuity. The proof starts
with the following lemma.
LEMMA 10.14 Let f and g be two integrable functions. Then fe g and f e
g are integrable
and we have
f (t) e
g (t) dt =
283
for any reals x, t R and any n, Lebesgues dominated convergence theorem proves
that
Z
Z
(10.2)
fe(x) hn (x) e 2i x t dx fe(x) e 2i x t dx = F fe (t)
n
But, as seen in Example 8.17 on page 231, the sequence of Fourier transforms of hn
n
2 2
e
hn : x 7 p e n x ,
Putting equations (10.2), (10.3), and (10.4) together, we obtain, for any point where f is
continuous, that
Z
fe(x) e 2i x t dx = F fe (t) = f (t),
which is the second result stated.
COROLLARY 10.15 If f is integrable and is not equal almost everywhere to a continuous function, then its Fourier transform is not integrable.
Proof. If the Fourier transform fe is integrable, then f coincides almost everywhere
with the inverse Fourier transform of fe, which is continuous.
Remark 10.16 With the conventions we have chosen for the Fourier transform, we have, when
this makes sense, F 1 = F . One could use indifferently F 1 or F . The reader should be
aware that, in other fields, the convention used for the definition of the Fourier transform is
different from the one we use (for instance, in quantum mechanics, in optics, or when dealing
with functions of more than one variable), and the notation is not equivalent then. A table
summarizes the main conventions used and the corresponding inversion formulas (page 612).
Remark 10.17 Theorem 10.13 may be proved more simply using Lemma 10.14 and the fact
that the Fourier transform of the constant function 1 is the Dirac . But this property will
only be established in the next chapter. For this reason we use instead a sequence of functions
(hn )nN converging to 1, while the sequence of their Fourier transforms converges to .
To apply the inversion formula of Theorem 10.13, one needs some information concerning not only f , but also fe (which must be integrable); this may
be inconvenient. In the next proposition, information relative to f suffices
to obtain the inversion formula.
284
all integrable. Then fe is also integrable; the Fourier inversion formula holds, and we
have
f = F [ fe] .
Example 10.19 Consider again the example of the lorentzian function. We have
f (x) =
2
.
1 + 42 x 2
Then
192 4 x 2 16 2 x 2
,
(1 + 42 x 2 )3
which are both integrable. We deduce from this that F e || = 2/(1 + 42 x 2 ). Moreover,
since all the functions considered are real-valued, it follows by taking the complex conjugate
and exchanging the variables x and that also
f (x) =
16 2 x
(1 + 42 x 2 )2
and
F e |x| =
f (x) =
2
.
1 + 42 2
Hence the inversion formula is also a tool that can be used to compute
new Fourier transforms easily!
may still be defined. (It is not quite an improper integral, because the bounds
of integration are symmetrical.) For example, the Fourier transform of the
rectangle function is a cardinal sine. The latter is not Lebesgue-integrable,
but the integral
ZR
sin 2i x
def
e
I (x, R) =
d
R
does have a limit as [R ]. In such cases, the following theorem may be
used.
285
Since the rectangle function has only two discontinuities and since its derivative (in the
sense of functions) is always zero and therefore integrable, it follows that
0 if |x| > 21 ,
Z +R
lim
sinc() e 2i x d = F sinc() (x) = 1 if |x| < 21 ,
R+
R
1
if |x| = 21 .
2
10.2
Properties of the Fourier transform
10.2.a
have
F f (x) = fe(),
F f (x a) = e 2ia fe(),
F f (x) = F f (x) = fe(),
F f (x) e 2i0 x = fe( 0 ).
These results should be compared to the formulas for similar operations on the
Fourier coefficients of the periodic function; see Theorem 9.36 on page 264.
COROLLARY 10.23 In the following table, the properties of f indicated in the first
column are transformed into the properties of fe in the second column, and conversely:
286
fe()
even
odd
hermitian
antihermitian
10.2.b Dilation
If a dilation is performed (also called a change of scale), that is, if we put
g(x) = f (a x)
x R,
for some
a R ,
1 e
f
.
F f (a x) =
|a|
a
10.2.c Derivation
There is a very important relation between the Fourier transform and the
operation of differentiation:
THEOREM 10.25 (Fourier transform of the derivative) Let f L1 be a func-
tion which decays sufficiently fast at infinity so that x 7 x k f (x) belongs also to L1
for k = 0, . . . , n. Then fe can be differentiated n times and we have
F (2ix)k f (x) = fe(k) ()
for k = 1, . . . , n.
287
d e
F 2ix f (x) =
f ().
d
and
R+
f (x) e
2i x
iR
(2i) f (x) e
2i x
dx .
As f and f are integrable, f tends to zero at infinity (see Exercise 10.7 on page 296).
The previous formula then shows, by letting R tend to infinity, that
Z
Z
f (x) e 2i x dx = (2i) f (x) e 2i x dx,
The formula of Theorem 10.25 on the preceding page also implies a very
important result:
COROLLARY 10.25.2 Let f and g be measurable functions, and let p, q N.
Assume that the derivatives of f , of order 1 to p, exist and are integrable, and that
x 7 x k g(x) is also integrable for k = 0, . . . , q. Then, for any R, we have
Z
fe() |2| p f ( p) (x) dx
and
(q)
e
g ()
|2x|q g(x) dx,
288
These inequalities give very strong bounds: if f is five times differentiable, for
instance, and f (5) is integrable, then fe decays at least as fast as 1/ 5 .
So there exists a link between the regularity of f and the rate of decay of
fe at infinity, and similarly between the rate of decay of f and the regularity
of fe.
lim x k f (x) = 0
for any k N.
Example 10.27 The functions x 7 e x and x 7 x 5 (log x) e |x| are rapidly decaying.
transform is of C class.
10.3
Fourier transform of a function in L2
The are some functions which are not necessarily integrable, but whose
square is. Such is the sine cardinal function:
def
x 7 sinc(x) =
sin x
x
289
10.3.a
The space S
of C class which are rapidly decaying along with all their derivatives.
Indeed, f has bounded support, so, according to Corollary 10.25.1, fe is infinitely differentiable. Moreover, f can be differentiated p times and f ( p) is integrable for any p N, so
that Corollary 10.25.2 implies that fe decays at least as fast as 1/ p at infinity, for any p N.
Moreover (still by Theorem 10.25), the derivatives of fe are also Fourier transforms of functions
in D, so the same reasoning implies that fe S .
THEOREM 10.32 The Fourier transform defines a continuous linear operator from
4
The Schwartz space will also be useful to define the Fourier transform in the sense of
distributions.
290
The next step is to show that, for two functions f and g in S , we have
Z
Z
f (x) g(x) dx =
fe() e
g () d.
to show that the Fourier transform defined on S can be extended to a continuous linear operator on L2 . This step, essentially technical, is explained in
Sidebar 5 on page 294.
We end up with an operator Fourier transform, defined for any function
f L2 (R) which satisfies the identities of the next theorem:
5
291
try on L2 : for any two square integrable functions f and g, the Fourier transforms fe
and e
g also in L2 and we have
Z
Z
f (x) g(x) dx =
fe() e
g () d.
fe()2 d
k f k2 = k fek2 .
F F [ f] =F F [ f] = f
almost everywhere.
F 1 [] = F [].
It is natural to wonder if there is a relation between the Fourier transform
thus defined on L2 (R) (after much wrangling and sleight of hand) and the
original Fourier transform on L1 (R).
PROPOSITION 10.38 The Fourier transform on L1 and that on L2 coincide on the
subspace L1 L2 .
In other words, if f is square integrable on the one hand, but is also integrable
itself on the other hand, then its Fourier transform defined in the section is
indeed given by
Z
F [ f ] : 7
f (x) e 2i x dx.
But then, how does one express the Fourier transform of a square integrable
function which is not integrable? One can show that, in practice, its Fourier
transform in L2 can be defined by the following limit:
ZR
e
f () = lim
f (x) e 2i x dx
for almost all R
R+ R
292
10.4
Fourier transform and convolution
10.4.a Convolution formula
Recall first the definition of the convolution of two functions:
DEFINITION 10.39 Let f and g be two locally integrable functions. Their
Fubinis theorem (page 79) provides the proof of the following fundamental theorem, also known as the Faltung theorem (Faltung being the German
word for convolution):
THEOREM 10.40 Let f and g be two functions such that their Fourier transforms
f g() = fe() e
g ().
f g() = fe e
g ().
Proof. We consider the special case where f and g are integrable. Then the Fourier
transforms of f and g are defined and f g is indeed integrable. Define (x, t) =
f (t) g(x t) e 2i x . Then
Z Z
Z Z
F [ f g] () =
f (t) g(x t) dt e 2i x dx =
(x, t) dt dx.
The second formula is in general more delicate than the first; if we assume that fe
and e
g are both integrable, it follows from the first together with the Fourier inversion
formula.
293
Example 10.41 We want to compute without a problem the Fourier transform of the function
defined by
1 + x
(x) = 1 x
for 1 x 0,
for 0 x 1,
for |x| 1.
(x)
1
x
1
First we show that (x) = (x) and then we deduce that
sin 2
F (x) =
.
f g() =
ii) if f , g L1 and their Fourier transforms are also in L1 , then
e
f e
g () for any R;
294
It is easy to check that (x) does not depend on the chosen sequence (xn )nN
converging to x: indeed, if ( y n )nN also converges to x, we have
lim (xn ) lim ( yn ) = lim (xn yn ) = 0
by continuity of .
The map b thus constructed is linear and satisfies
(x)
b
=
lim (xn )
= lim
(xn )
|||||| lim kxn k = |||||| kxk ,
n
Exercises
295
EXERCISES
Exercise 10.1 (Fourier transform of the gaussian) Using the link between the Fourier trans-
form and derivation, find a differential equation satisfied by the Fourier transform of the
function x 7 exp(x 2 ). Solve this differential equation and deduce from this the Fourier
transform of this function. Recall that
Z
2
e x dx = 1.
Deduce from this the Fourier transform of the centered gaussian with standard deviation :
1
x2
f (x) = p
exp 2 .
2
2
Exercise 10.2 Using the convolution theorem for functions f , g L 2 and the preceding
exercise, compute the convolution product of a centered gaussian f with standard deviation
and a centered gaussian f with standard deviation .
Exercise 10.3 Let C such that Re() > 0, and let n N . Compute the Fourier
x 7 f (x) =
In particular, show that fe() is zero for < 0.
1
.
(2i x )n+1
Exercise 10.4 Compute the following integrals, using the known properties of convolution
and the Parseval-Plancheral formula, and by finding a relation with the functions and
= :
Z +
Z +
Z +
sin x 2
sin x 3
sin x
dx,
dx,
dx.
x
x
x
Exercise 10.5 (sine and cosine Fourier transforms) Let f : R R be an odd integrable func-
tion. Define
Show that
def
fes () = 2
f (t) = 2
+
0
fes () sin(2 t) d.
The function fes thus defined is called the sine Fourier transform of f , and f is the
inverse sine Fourier transform of fes . These transforms can be extended to arbitrary (integrable)
functions, the values of which are only of interest for t > 0 or > 0.
Similarly, show that if f is even and if the cosine Fourier transform of f is defined by
Z +
def
e
fc () = 2
f (t) cos(2 t) dt,
0
then we have
f (t) = 2
fec () cos(2 t) d.
296
t > 0 > 0
2t
cos 2 t
d =
e
.
2 + 2
2
Exercise 10.7 (proof of Theorem 10.25) Let f L 1 (R) be a function which is differentiable
SOLUTIONS
Solution of exercise 10.1. Define f (x) = e x 2 , which is an integrable function, and let fe
be its Fourier transform. From
f (x) + 2x f (x) = 0,
x R
2 fe() +
d e
f () = 0,
d
2
which has solutions given by fe() = A e for some constant A to be determined.
R +
Since fe(0) = f (x) dx = 1, it follows that A = 1.
By a simple change of variable, it also follows that
x2
1
F.T.
2 2 2
p exp 2 e 2 ,
2
2
f
f , and since f and f are both square integrable, their Fourier transforms are also square
integrable. Using the Cauchy-Schwarz inequality, it follows that fe f
f is integrable; there its
conjugate Fourier transform is defined and
2 2 2
2 2 2
f f (t) = F fe fe (t) = F e 2 e 2 (t)
2
2 2
= F e 2( + ) (t) = fp 2 + 2 (t).
The squares of the standard deviations (i.e., the variances) add up when taking the convolution
of gaussians.
fe() =
e 2i x
dx.
(2i x )n+1
Solutions of exercises
297
We compute this using the method of residues. The only pole of the integrand in C is at
x = i/2.
If < 0, we close the contour in the lower half-plane to apply Jordans second lemma,
and no residue appears; therefore the function fe vanishes identically for < 0.
For > 0, we close the contour in the upper half-plane and, by applying the residue
formula, we get fe() = (1)n+1 n e /n!.
To conclude, the Fourier transform of f can be written
n
e x
is given by
2
.
2 + 42 2
The inverse cosine Fourier transform leads to x 7 e |x| , which is the original input function,
after it has been made even. The formula follows.
Solution of exercise 10.6. Assume that the Fourier transform of y also exists. Denote by
Y () =
and hence
B()
.
1 A()
So, under the assumption that it exists, the solution is given by the inverse Fourier transform
of the function 7 Y () thus defined.
f (t) dt = lim
x+
f (t) dt.
x
0
which shows that f admits a limit at +, and similarly at . If those limits were nonzero,
f would certainly not be integrable on R.
Chapter
11
Fourier transform
of distributions
11.1
Definition and properties
The next objective is to define the Fourier transform for distributions.
This is interesting for at least two reasons:
1. it makes it possible to define the Fourier transform of distributions
such as or X;
2. possibly, it provides an extension of the Fourier transform to a larger
class of functions; in particular, functions which are neither in L1 (R),
nor in L2 (R), but which are of constant use in physics, such as the
Heaviside function.
In order to define the Fourier transform of a distribution, we begin, as
is our custom, by restricting to the special case of a regular distribution.
Consider therefore a locally integrable function. But then, unfortunately,
we realize that being locally integrable is not, for a function, a sufficient
condition for the Fourier transform to be defined.
Restricting even more, consider a function f L1 . Being integrable,
it is also locally integrable and has an associated regular distribution, also
300
which is the same, using Fubinis theorem to exchange the order of integration
(this is allowed because f is integrable and is also), as
Z
Z
Z
fe, = f (x)
e 2i x t (t) dt dx = f (x) (x)
e dx = f ,
e .
def
F [T ] , = T , F [] .
Sometimes the notation Te is used instead of F [T ]:
T , = T,
e .
Remark 11.2 It is known that if has compact support
and is
nonzero, F [] does
11.1.a
Tempered distributions
f such that lim x k f (x) = 0 for all k N, and that the Schwartz space (of
x
functions which are infinitely differentiable and rapidly decaying along with
all their derivatives) is denoted S .
The topological dual of S (i.e., the space of continuous linear forms on
S ) is denoted S . A tempered distribution is an element of S .
Hence, to say that T is a tempered distribution implies that T is continuous, that is, that
if n 0 in S ,
n
then T , n 0 in C.
n
words, we have S D .
301
is also linear, of course, and continuous which the reader is invited to check.
Therefore, it defines a tempered distribution, also denoted f .
DEFINITION 11.5 (Slowly increasing function) A function f : R C is
Exercise 11.1 Let T be a tempered distribution. Show that for any k N, x k T and T (k)
Exercise 11.2 Show that the distribution exp(x 4 ) X(x) is not tempered.
Remark 11.8 Does the Fourier transform of tempered distributions defined in this manner
coincide with the Fourier transform of functions, in the case of a regular distribution?
1
A slowly increasing function, multiplied by a rapidly decaying function, is therefore integrable; the terminology is coherent!
302
Yes indeed. Let f be an integrable function, and denote this time by T ( f ) the associated
regular distribution (to make the distinction clearly). The Fourier transform of g (in the sense
of functions) is a bounded function fe, which therefore defines a regular tempered distribution.
The associated distribution, denoted T ( fe), satisfies therefore for any S
Z
Z
e dx = T ( f ),
e = F T (f ) , .
T ( f ), = fe(x) (x) dx = f (x) (x)
This implies that T ( fe) is indeed equal to the Fourier transform, in the sense of distributions,
of T ( f ).
Similar reasoning holds for f L2 (R) (Exercise!) and shows that the Fourier transform
defined on S coincides with that defined on L2 (R).
F T = (2i) F [T ] ,
F T (x a) = e 2ia F [T ] ,
F T (n) = (2i)n F [T ] ,
F e 2ia x T = Te ( a).
Proof. Take the first property, for instance. Before starting, remember that the
letters x and , usually used to denote the variable for a function and its Fourier
transform,
respectively,
are in
fact interchangeable.
So, it is perfectly legitimate to write
F [T ] (), () as well as F [T ] (x), (x) .
This being said, let T be a tempered distribution and S . Then we have
F [T ] , = T , F
= T , F 2ix (x)
= F [T ] , 2i x (x) = 2i xF [T ] (x), (x) ,
which is the stated formula (with the variable x replacing ).
The other statements are proved in an identical manner and are left as warm-up
exercises.
bounded support. Then the convolution product S T makes sense and is a tempered distribution, and therefore it has a Fourier transform which is given by
F [S T ] = Se Te.
303
f () = T (t), e 2i t
(the right-hand side is well defined for all because T has bounded support).
11.1.c Examples
The first important example of the Fourier transform of a tempered distribution is that of a constant function.
THEOREM 11.12 (F
F [1] = ) The Fourier transform of the constant function 1 is
F.T.
1 ().
Proof. Let S be a Schwartz function. We have, by definition and the Fourier
inversion formula,
Z
def
e
1, = 1,
e = ()
e d.
Furthermore, as
e is in S , the inversion formula for the Fourier transform yields
Z
()
e d = F []
e (0) = (0) = , .
This theorem indicates that for a constant function, the signal contains
a single frequency, namely the zero frequency = 0. Similarly, we have:
THEOREM 11.13 The Fourier transforms of the trigonometric functions are given by
for all 0 R.
F e 2i0 x = ( 0 ),
i
1h
( 0 ) + ( + 0 ) ,
F [cos(20 x)] =
2
i
1h
F [sin(20 x)] =
( 0 ) ( + 0 )
2i
This result is quite intuitive: for the functions considered, only the frequency
0 occurs, but it may appear with a sign 1 (see Section 13.4, page 365).
THEOREM 11.14 (F
F [] = 1) The Fourier transform of the Dirac distribution is the
constant function 1 : x 7 1.
F.T.
(x) 1 .
304
e def
, = , e =
e(0).
F = 2i,
F (x a) = e 2ia .
F (m) = (2i)m ,
1
1
pv ,
2i
with C.
Similarly,
i
1
1
F H (x) =
pv
+ .
2
1
F pv
= i sgn ,
x
F [sgn x] =
1
1
pv .
i
305
tion in R3 , which admits a Fourier transform. Then the Fourier transform of its
laplacian T ( x) is 4 2 Te ( ).
Consider now the following partial differential equation (coming from particle physicis, plasma physics, or ionic solutions, and explained in Problem 5
on page 325):
m2 f ( x) = 4 ( x).
Taking the Fourier transform of this relation, it follows that
(4 2 m2 ) fe( ) = 4.
fe( ) =
m2
4
.
+ 4 2
(11.1)
But is that really true? Not entirely! Since we are looking for distribution solutions, we
must remember that there are many distribution solutions to the equation
(m 2 + 4 2 ) T ( ) = 0:
306
There only remains to compute the inverse Fourier transform of this function to obtain a solution to the equation we started with. For this, the method
of residues is still the most efficient. This computation is explained in detail
in the physics problem on page 325. In addition, the whole of Chapter 15 is
devoted to the same subject in various physical cases. The result is
f ( r) =
e mk rk
.
k rk
S
F [T ] , = T , F [] = T (),
e() .
Then F is the inverse of F , that is to say,
1/r if r R,
fR ( r) =
1/R if r R.
This regularized Coulomb potential converges to f : r 7 1/r in the sens of distributions (the
verification of this is immediate). Since the Fourier transform is a continuous operator, the
Fourier transform of fR tends to fe as R tends to infinity. After a few lines of computation
(isolating the constant part 1/R), the Fourier transform of fR is computed explicitly and gives
(2)3
4
4
( k) + 2
sin(kR).
R
k
R k3
The first and third terms tend to 0 in the sense of distributions as [R +], as the reader
can easily show.
fR ( k) =
307
11.2
The Dirac comb
11.2.a
+
X
n=
(x n).
transform in the sense of distributions, which is also equal to the Dirac comb:
F.T.
X(x) X().
This formula can be stated differently; since X is defined as a sum of
Dirac distribiutions, we have
+
+
X
X
F [X] () = F
n () =
e 2in .
n=
n=
e 2inx =
n=
+
X
n=
(x n).
(11.2)
f (t) =
1 12
1
2
308
As a periodic function, f can be expanded in Fourier series. Since f is even, only the
cosine coefficients an occur in this expansion, and after integrating by parts twice, we
get
Z 1/2
an = 4
x 2 cos(2nx) dx
0
2
x sin(2nx) 1/2
x
sin(2nx) dx +
(the boundary terms vanish)
n
2n
0
0
Z 1/2
1/2
4
cos(2nx)
8x
=
dx +
cos(2nx)
, (and the integral also)
2
n 0
2n
(2n)
0
= 4
1/2
that is,
(1)n
.
2 n 2
Moreover, differentiating twice the function f , in the sense of distributions, we obtain
first
an =
1
f (t) =
1 21
1
2
and then
f (t) = 2 2
+
X
n=
x n 12 = 2 2 X x 21 .
n=0
2 2X(x 21 ) = 4
+
X
X(x) = 1 + 2
+
X
(1)n e 2i nx
n=1
X(x 12 ) = 1 + 2
and therefore
an cos(2nt) yields
+
X
(1)n e 2i nx
n=1
(1)n e 2i nx e i n = 1 + 2
n=1
+
X
n=1
e 2i nx =
+
X
e 2i nx
n=
309
+
X
cn e 2inx ,
n=
(it suffices, for instance, that f decays faster than 1/|x| at infinity). The
f ), or equivalently
Fourier transform of f (x) T1 X (x/T ) is fe() X(T
X n
fe
(T n),
T
nZ
n=
310
X
1
1
def
,
S=
a 2 + 42 n2 b 2 + 42 n2
n=
with a, b R+ and a 6= b . First, note that if we define
X
1
1
def
S () =
e 2i n
a 2 + 42 n2 b 2 + 42 n2
n=
for all R,
S () =
n=
with
fe(n) e 2i n ,
1
1
=e
g () e
h().
42 2 + a 2 42 2 + b 2
To exploit the Poisson summation formula, it is necessary to compute the inverse Fourier
transform of f and hence those of g and h. We know that
2a
F.T.
,
e a|x| 2
a + 42 2
and therefore we deduce that
e a|x|
e b|x|
g(x) =
and
h(x) =
.
2a
2b
From this we can compute f : indeed, from fe = e
g he it follows that f = g h; this convolution
product is only an elementary (tedious) computation (see Exercise 7.5 on page 211) and yields
a|x|
1
e
e b|x|
1 a|x| b|x| 1
e
e
=
.
f (x) =
4ab
2 b 2 a2
a
b
fe() =
Hence we have
S = S (0) =
+
X
n=
fe(n) =
f (n)
n=
b2
1
1 X
1
1 X
e a|n| 2
e b|n| .
2
2
a 2a n=
b a 2b n=
Each series is the sum of two geometric series (one for positive n and one for negative n), and
we derive after some more work that
X
X
1 + e a
1 + e b
e a|n| =
and
e b|n| =
.
a
1e
1 e b
n=
n=
Hence, after rearranging the terms somewhat, we get
1
1
1
1
1
1
1
S= 2
b a 2 a 1 e a
b 1 e b
2ab b + a
1
coth a coth b
=
.
2(b 2 a 2 )
a
b
4
I did not make it up for the mere pleasure of computing a series; it appeared during a
calculation in a finite-temperature quantum field theory problem.
311
Josiah Willard Gibbs (18391903), American physicist, was professor of mathematical physics at Yale University. Gibbs revolutionized the study of thermodynamics in 1873 by a geometric
approach and then, in 1876, by an article concerning the equilibrium properties of mixtures. He had the idea of using diagrams
with temperatureentropy coordinates, where the work during a
cyclic transformation is given by the area of the cycle. It took
a long time for chemists to understand the true breadth of this
paper of 1876, which was written in a mathematical spirit. Gibbs
also worked in pure mathematics, in particular in vector analysis. Finally, his works in statistical mechanics helped provide its
mathematical basis.
11.3
The Gibbs phenomenon
Although the Gibbs phenomenon may be explained purely with the tools
of Fourier series,5 it is easier to take advantage of the Fourier transform of
distributions to clarify things.
Let f be a function, with Fourier transform fe() and denote
Z
f (x) =
fe() e 2i x d
sin 2 x
f = F 1 (/2 ) fe() = f (x)
.
x
Now recall that, in the sense of distributions, we have
sin 2 x
lim
= (x),
+
x
5
312
in accordance with the result stated on page 233, which proves that, by forgetting frequencies at a higher and higher level, we will finally recover the
original signal:
lim f = f
where Si is the sine integral, defined (see [2, 42]) by the formula
Zx
sin t
def
Si(x) =
dt.
t
0
If we plot the graph of H1 (x) = 12 + 1 Si(2x), we can see oscillations, with
amplitude around 8% of the unit height, on each side of the discontinuity.6
1
0,8
0,6
0,4
0,2
When the cutoff frequency is increased, formula (11.4) shows that the size
of the oscillations remains constant, but that they are concentrated in the
neighborhood of the point of discontinuity (here, we have = 2, 3, 4, 5 and
the graph has been enlarged in the horizontal direction to make it more
readable):
6
Notice in passing that a low-pass filter in frequencies is not physically possible, since it is
not causal.
313
0,5
What happens now if we consider a function f which is 1-periodic, piecewise continuous, and piecewise of C 1 class? Then f has a Fourier series
expansion
X
cn e 2inx ,
f (x) =
nZ
Then the function f , obtained by removing from the spectrum the frequencies || , is none other than
X
f (x) =
cn e 2inx ,
|n|
that is, the partial sum the Fourier series of f of order E [ ] (integral part
of ).
Hence the sequence of partial sums of the Fourier series will exhibit the
same aspect of oscillations with constant amplitude, concentrated closer and
closer to the discontinuities.7
It should be noted that this phenomenon renders Fourier series rather
unreliable in numerical computations, but that it can be avoided by using,
instead of the partial sums of the Fourier series, the Cesro means of those
series, which are called Fejr sums (see Exercise 9.4 on page 270).
In the preceding example with the Heaviside function, the oscillations get narrower in
a continuous manner. Here, since the function, being 1-periodic, has really infinitely many
discontinuities, there appears a superposition of oscillating figures which has the property,
rather unintuitive, of remaining fixed on the intervals [n, n + 1[ and of changing
rapidly when going through the points = n.
314
11.4
Application to physical optics
11.4.a
However, it may happen that one wishes to model an infinitely narrow slit. In
order to still be able to observe a figure, it is required to increase the intensity of
the light arriving in inverse proportion to the width of the slit. This intensity
coefficient is incorporated to the the transmittance, which gives
1 x
= (x).
fslit = lim
0
With this convention, the transmittance function thus becomes a transmittance distribution.
y
F
8
315
The amplitude which is observed in the direction is, with the Fraunhofer
approximation and linearizing sin , equal to
Z
nt
C f (x) e 2ix/ dx,
up to a multiplicative constant. On a screen at the focal point of a lens with
focal length R, the amplitude is therefore equal to
Z
nt
A( y) = C f (x) e 2i y x/R dx.
Changing the scale for the screen and putting = y/R, which has the
dimension of the inverse of length, we obtain therefore
Z
A() = Cnt f (x) e 2ix dx = fe().
Hence, up to a multiplicative constant (which we will ignore in our computations from now on), the amplitude of the diffraction figure at infinity of a diaphragm
is given by the Fourier transform of its transmittance distribution.
Remark 11.23 From the mathematical point of view, interference and diffraction phenomena
316
20
10
-1
-0,5
0,5
1,5
Fig. 11.1 Amplitude of the diffraction figure given by 2N + 1 = 21 infinitely narrow slits.
N
X
n=N
(x na).
The diffraction figure is the Fourier transform of f2 , which may be computed by summing directly the Fourier transforms of each Dirac component:
A2 () =
N
X
e 2ina,
n=N
-1,5
-1
-0,5
0,5
317
1,5
are linked by
aX
x
F.T.
a 2 X(a),
a
and hence the (2N + 1) slits are modelized, for instance, with the distribution
x
x
f1 (x) = a X
.
a
(2N + 1)a
(We took care that the boundaries of the rectangle function do not coincide with any of the Dirac spikes of the comb, which would have led to an
ambiguity.)
The Fourier transform of a rectangle function with width (2N + 1)a is
sin (2N + 1)a
x
F.T.
(2N + 1) a
,
(2N + 1)a
(2N + 1)a
and therefore
with
F.T.
f2 (x) A2 ()
i
sin
(2N
+
1)a
A2 () = a 2 X(a) (2N + 1) a
(2N + 1)a
+
X
sin (2N + 1)a
n
2
= (2N + 1)a
,
a
(2N + 1)a
n=
h
that is,
F.T.
f2 (x) A2 () = (2N
+ 1)a 2
+
X
sin (2N + 1)(a n)
.
(2N + 1)(a n)
n=
318
namely, a fairly narrow spike surrounded by small oscillations. One can then
easily imagine that the whole diffraction figure is made of a series of spikes,
spaced at a distance 1/a from each other; this is what has been observed in
Figure 11.1.
Notice that the width of the spikes is proportional to 1/(2N + 1), so the
more slits there are in the diaphragm, the narrower the diffraction bands are. The
number of slits may be read off directly from the figure, since it corresponds
to the number of local extrema between two main spikes (the latter being
included in the counting). When, as in the example given here, there is an
odd number of slits, the oscillations from neighboring sine cardinals add up;
this implies in particular that the small diffraction bands between the main
spikes are fairly visible. On the other hand, for an even number of slits, the
oscillations compensate for each other; the secondary bands are then much
less obvious, as can be seen in Figure 11.3, to be compared with Figure 11.1.
Note in passing that we obtained two different formulas for A2 , which
are of course equivalent; this equivalence is, however, difficult to prove using
classical analytic means.
N
X
x na
f3 (x) =
n=N
N
x
X
=
(x na)
n=N
that is,
x
x
x
f3 (x) = aX
.
a
(2N + 1)a
319
20
10
-0.5
0.5
Fig. 11.3 Diffraction figure of a system with 2N infinitely narrow slits (here 2N =
20). Note that the secondary extremums are much more attenuated than on
Figure 11.1, where 2N + 1 slits occurred.
(Despite the appearance of the last line, this is a regular distribution.) The
transmittance f3 is therefore simply the convolution of the transmittance distribution f2 with a rectangle function with width . The diffraction figure
is therefore given by the product of the Fourier transforms of these two distributions. The Fourier transform of (x/) is a sine cardinal function with
width 1/. In general, in such systems, the width of the slits is very small
compared to the spacing, that is, we have a. This implies that the width
of this sine cardinal function (1/) is much larger than the spacing between
the bands. The amplitude of the diffraction figure
A3 () = A2 ()
sin
is shown in Figure 11.4. One sees clearly there how the intensity of the spikes
decreases. Since the ratio a/ is equal to 4 in this example, the fourth band
from the center is located exactly at the first minimum of the sine cardinal,
and hence vanishes completely.
Remark 11.24 In this last example, three different lengths appear in the problem: the width
of the slits, the spacing a between slits, and the total length (2N + 1)a of the system. These
three lengths are related by
a (2N + 1)a,
and reappear, in the Fourier world, in the form of the characteristic width 1/ of the disappearance of the bands (the largest characteristic length in the diffraction figure), the distance
1/a between the principal bands of diffraction, and the typical width 1/(2N + 1)a of the fine
bands (the shortest), with
1
1
1
.
a
(2N + 1)a
Simply looking at the diffraction figure is therefore sufficient in principle to determine
the relative scales of the system of slits in use.
320
Fig. 11.4 Diffraction figure with 21 slits, with width 4 times smaller than their spacing:
= a/4.
z C
1
J0 (z) =
2
def
e i z cos d
p
2 + 2 ,
321
2
0
Fig. 11.5 The function x 7 2 J1 (x)/x. The first zero is located at a point x 1.220,
the second at x 2.233. The distance between successive zeros decreases.
11.5
Limitations of Fourier analysis and wavelets
Fourier analysis can be used to extract from a signal t 7 f (t) its composing frequencies. However, it is not suitable for the analysis of a musical or
vocal signal, for instance.
The reason is quite simple. Assume that we are trying to determine the
various frequencies that occur in the performance of a Schubert lied. Then
we have to integrate the sound signal from t = to t = +, which
is already somewhat difficult (the recording must have a beginning and an
ending if we dont want to spend a fortune in magnetic tape). If, moreover,
knowing fe we wish to reconstruct f , we must again integrate over ], +[.
Such an integration is usually numerical, and apart from the approximate
knowledge of fe, the integration range must be limited to a finite interval,
say [0 , 0 ]. But, even if fe decays rapidly, forgetting the tail of fe may
spectacularly affect the reconstitution of the original signal at these instants
when the latter changes very suddenly, which is the case during the attack
322
Fig. 11.6 The diffraction figure of a circular lens. Other larger rings appear in practice,
but their intensity is much smaller.
of a note, or when consonants are sung or spoken. Thus, to recreate the attack
of a note (a crucial moment to recognize the timber of an instrument), it is
necessary to compute the function fe() with high precision for large values
of , and hence to know precisely the attack of all the notes in the partition.
When the original signal is recreated, their musical characteristics and their
temporal localization may be altered.
This is why some researchers (for instance, the geophysicist Jean Morlet,
Alex Grossman in Marseilles, Yves Meyer at the cole Polytechnique, Pierre
Gilles Lemari-Rieusset in Orsay and then vry, and many others) have developed a different approach, based on a time-frequency analysis.
We know that the modulus of the Fourier spectrum fe() of a signal
provides excellent information on the frequency aspects, but little information
on the temporal aspect.9 However, to encode a musical signal, the customary
way is to proceed as follows:
x
A given signal f (t) and the same translated in time f (t a) have the same power spectrum,
since only the phase changes between the two spectra.
323
The musical signal is therefore encoded in matters of both time and frequency.
Yves Meyer has constructed a wavelet , that is, a function of C class,
all moments of which are zero, that produces an orthonormal basis (n p )n,p
of L2 , where
def
n p (t) = 2n/2 (2n t p)
n, p Z.
The larger n Z is, the narrower the wavelet n p is; when n , on
the other hand, the wavelet is extremely wide. Moreover, the parameter p
characterizes the average position of the wavelet. The decomposition in the
wavelet basis is given by
f =
+
X
+
X
f n p n p
n= p=
f L2 (R),
10
324
EXERCISES
Exercise 11.3 Compute (x/a) sin x using the Fourier transform.
Exercise 11.4 Compute the Fourier transform, in the sense of distributions, of x 7 |x|.
Exercise 11.5 Compute the Fourier transform of H (x) x n for any integer n. (One may use
S=
X
nN
a2
1
1
,
2
2
2
+ 4 n b 42 n2
with a, b
The result of Exercise 8.13 on page 243 may be freely used.
R+ .
J0 (x) =
1
2
e i x cos() d
for any x R.
i) Show that this definition is independent of . Deduce from this that J0 takes real
values.
ii) In the plane R2 , let f (x, y) be a radial function or distribution, that is, such that there
exists : R+ C (or a distribution with support in R+ ) with f (x, y) = (r ), where
p
def
r = x 2 + y 2 . Assume moreover that f has a Fourier transform. Show that its Fourier
def p
transform is also radial: fe(u, v) = () with = u 2 + v 2 .
Prove the formula giving as a function of , using the Bessel function J0 .
H. T.
H. T.
iii) Show that for continuous functions and , if , then also (i.e.,
the Hankel transform is its own inverse).
iv) Show that
d2 1 d H. T.
+
42 2 ().
dr 2
r dr
H. T.
H. T.
Physical optics
Exercise 11.8 Describe the diffraction figure formed by a system made of infinitely many
regularly spaced slits of finite width (the spacing between slits is a).
Exercise 11.9 Compute, in two dimensions, the diffraction figure of a diaphragm made of
Exercises
325
PROBLEM
Problem 5 (Debye electrostatic screen)
The system is considered in nonrelativistic terms, and it is assumed that the interactions
between particles are restricted to Coulomb interactions. The particles are also assumed
to be classical (not ruled by quantum mechanics) and point-like. The system may be
described by the following hamiltonian:
H =
X p2
1 X ei e j
i
+
2m
2 i, j ri j
i
i
i6= j
with
def
ri j =
r j r i
.
The indices i and j ranges successively over all particles in the system.
Now an exterior charge q0 , located at r = 0, is added to the system. We wish to know
the repartition of the other charges where thermodynamic equilibrium is reached.
Solution of the problem
i) Let ( r ) denote the distribution of density of particles of type , at equilibrium,
in the presence of q0 . Express the distribution q( r ) in terms of the ( r).
ii) Denoting by ( r) the electrostatic potential, write the two equations relating
( r) and ( r) the Poisson equation for electrostatics and the Boltzmann
equation for thermodynamics.
iii) Linearize the Boltzmann equation and deduce an equation for ( r).
iv) Passing to Fourier transforms, compute the algebraic expression of (
e k), the
Fourier transform of ( r).
v) Using complex integration and the method of residues, find the formula for the
potential ( r), which is called the Debye potential [26].
vi) Show that, at short distances, this potential is indeed of Coulomb type.
vii) Compute the total charge of the system and comment.
326
SOLUTIONS
Solution of exercise 11.2 on page 301. It must be shown that T (x) = exp(x 4 ) X(x) can-
not act on certain functions of the Schwartz space. It is easy to check that the gaussian
g : x 7 exp(x 2 ) decays rapidly as do all its derivatives, and therefore belongs to S , whereas
the evaluation of T , g leads to an infinite result.
Strictly speaking, we must exclude also the possibility that there exists another definition of
T , , for a Schwartz function, which makes it a tempered distribution and coincides with
R
T (x) (x) dx for with bounded support; but that is easy by taking a sequence n of test
functions with compact support that converges in S to g(x). Then on the one hand
T , n T , g
1
1
a 1
1
1
sin a 1
+
= 2 sin
+
2i
2
2
2 2i
2
2
(using the formula f (x) (x) = f (0) (x)), which, in inverse Fourier transform, gives
2 sin(a/2) sin x.
Of course a direct computation leads to the same result.
Solution of exercise 11.4. Notice first that f : x 7 |x| is the product of x 7 x and x 7
sgn x. The Fourier transform of x 7 x is, by the differentiation theorem, equal to i /2.
F.T.
Moreover, we know that sgn pv 1/i. Using the convolution theorem yields
F.T.
x sgn x
i 1
1
1 d
1
1
1
pv =
pv = 2 fp 2 .
2 i
22 d
Solution of exercise 11.5. The Fourier transforme of the constant function 1 is . The
formula F [(2ix)n f (x)] = f (n) (x) gives
F.T.
x n
(n)
.
(2i)n
and
H (x) x n
F.T.
H (x)
1
1
+
pv
.
2
2i
(n)
1
dn
1
pv
.
n
n+1
n
2 (2i)
(2i) d
Finally, the distribution fp(1/x n+1 ) is linked to the n-th derivative of pv(1/) by a multiplicative
constant:
dn
1
1
n
pv
=
(1)
n!
fp
,
d n
n+1
from which we derive
(n)
n!
1
F.T.
H (x) x n
+
fp
.
2 (2i)n
(2i)n+1
n+1
Solutions of exercises
327
Solution of exercise 11.6. Using the technique described in the text on page 310, we first
compute that
2a
2b
e a|x| + 2
sin b |x| ,
a2 + b 2
a + b2
and using the formula of exercise 8.13 on page 243 we get
e a|x| sin b |x| =
S=
1
1
sin b
coth a +
.
2
2
2
+b )
2b (a + b ) 1 cos b
2a(a 2
R2
+
(r )
() = 2
or
Z
e 2i r cos d r dr,
r (r ) J0 (2r ) dr.
0
H. T.
iii) This is simply the translation of the Fourier inversion formula in this context.
iv) In the formula
d2 1 d
+
dr 2
r dr
we recognize the expression for the laplacian of f in polar coordinates, since there is
no dependency on . The Fourier transform of f is equal to 42 2 fe(u, v) with
= (u, v) and 2 = 2 , hence the formula follows.
1 () 2 () d.
Solution of exercise 11.9. As seen on page 320, the amplitude of the diffraction figure is
given radially by the Hankel transform of the radial transmittance function, which is given here
by
0 if r > ,
f (x, y) = f (r ) =
1 if r < ,
p
2
2
where r = x + y .
The amplitude is therefore
p
A(, ) = A() =
J1 (2)
with
= 2 + 2 .
328
a
2
a2
Fig. 11.7 The diffraction figure created by two circular holes. The spacing between the
two lenses is here taken to be 4 times their diameter.
The transmittance function of two holes is
h
a i
a
+ x +
.
g(x, y) = f (x, y) x
2
2
The diffraction figure due to the two holes is obtained by multiplying the diffraction amplitude
due to a single hole by the Fourier transform of the distribution (x a/2) + (x + a/2),
which is equal to 2 cos(a).
We obtain then the result sketched in Figure 11.7.
Solution of problem 5.
()
where the constant value of is justifed by the fact that, for r , we have ( r) 0 in
the suitable gauge and ( r) .
In the limit of weak couplings (or low densities), we can linearize equation () and obtain
( r) = 1 e ( r) ,
which, inserted into (), gives
( 2 ) ( r) = 4q0 ( r)
def
with 2 = 4
linearized Poisson-Boltzmann
P
taking into account that the system is neutral ( e = 0).
e2 ,
Solutions of exercises
329
We will now try to solve this equation. First, assuming that ( r) has a Fourier transform,
we obtain the conjugate equation
( k 2 2 ) (
e k) = 4q0
and
( r) =
We deduce that11
1
(2)3
e i k r (
e k) d3 k.
4q0
.
k 2 + 2
There only remains to compute the inverse Fourier transform
Z
4q0 i k r d3 r
.
( r) =
e
(2)3
k 2 + 2
For this purpose, we use spherical coordinates on the space of vectors k, with the polar
axis oriented in the direction of r , and with the notation k r = kr cos , where k > 0 and
r > 0. Then
Z 2
Z Z +
4q0 2
dk
( r) =
d
d
k sin e i kr cos
,
2 + 2
3
k
(2)
0
0
0
or
Z1
Z +
82 q0 2 i kr cos dk
( r) =
d(cos )
k e
k 2 + 2
(2)3
1
0
Z +
Z
q0 k 2
1 i kr
2q0 + k sin kr
i kr
=
e
e
dk
=
dk.
k 2 + 2 ikr
r 0
k 2 + 2
0
(
e k) =
We must still evaluate this integral relative to the variable k. For this, we put
Z +
Z +
1
k e i kr
k sin kr
def
dk
=
Im
dk
,
I ( r) =
2
2
k 2 + 2
2
k +
0
and we use the method of residues by extending the variable k to the complex plane. The poles
def
of the meromorphic function f (z) = z e i zr /(z 2 + 2 ) are located at z1 = i and z2 = i.
Since r > 0, we can integrate on the contour described in the following figure:
z2
z1
and, after applying Jordans second lemma 4.85, as justified by the fact that k/(k 2 + 2 ) tends
to 0 when [k ], and after taking the limit where the radius in the contour goes to infinity,
we derive
Z +
k e i kr
ik e r
dk = 2i Res ( f ; z1 ) = 2i
= i e r ,
2
2
2i
k +
11
Being careful of the fact that, in the general case, the equation x T (x) = a has solutions
given T (x) = a pv(1/x) + b , with b C. Here, (
e k) is defined up to the addition of the
Fourier transform of an harmonic functions.
330
hence
I =
and
r
e
2
( r) =
q0 r
e
r
Debye potential.
Using the Boltzmann equation, each of the functions ( r) can be computed now. Note
that the Debye electrostatic screen occurs at a length scale of 1 , which tends to infinity in the
limit of weak couplings (small densities). For distances r , the potential is ( r) q0 /r ,
and we recover the Coulomb potential.
The total charge present around q0 is
ZZZ X
Q=
e ( r ) d3 r
ZZZ X
e 1 e ( r) d3 r
ZZZ X
= q0
e2 ( r) d3 r
by initial neutrality,
ZZZ X
e2
e r 3
d r = q0 .
r
Thus is exactly compensates the additional charge q0 . In other words, the system has managed
to become neutral again.12 The Debye screen is said to be a total screen.
12
This may seem surprising; whichever charge is added to the system, it remains neutral.
This is a strange property of an infinite system, which may bring in charges from infinity, and
make them disappear there. This type of paradox also happens in the famous infinite hotels,
all rooms of which are occupied, but in which, nevertheless, a new customer can always be
accommodated: it suffices that the person in Room 1 move to Room 2, the person in Room 2
move to Room 3, and more generally, the person in Room n move to Room n +1. All previous
occupants have a new room, and Room 1 becomes available for the new customer.
Chapter
12
12.1
Definition and integrability
In this chapter, we are interested in an integral transformation operating
on functions f which vanish for negative values of the variable: f (t) = 0
for all t < 0 (no continuity at 0 is imposed). An example is the function
t 7 H (t) cos t.
In the literature about the Laplace transform, the factor H (t) is frequently
omitted; we will follow this custom, except where some ambiguity may result,
and we will therefore speak of the function t 7 cos t, tacitly assuming that
this definition is restricted to positive values of t.
DEFINITION 12.1 A causal function is a function t 7 f (t) defined on R
332
12.1.a Definition
DEFINITION 12.2 Let f (t) be a real- or complex-valued locally integrable
f (t) e p t dt.
Whereas the function fb itself is called the Laplace tranform, the operation
f 7 fb should be called the Laplace transformation.
Its properties are of the same kind as those of the unilateral transform, with
variants that the reader may easily derive by herself.
The bilateral Laplace transform of H (t) f (t) is simply the Laplace transform of the function f (t). In the remainder of this chapter, only the unilateral
transform will be considered.
Remark 12.4 If f is a function zero for t < 0 which has a Fourier transform, then the relation
fb(i) = fe
with R
2
holds. The Laplace transform is an extension of the notion of Fourier transform. More
precisely, fb(x + i) is, up to the change of variable = 2, the Fourier transform of t 7
f (t) e x t evaluated at /2.
variable which is locally integrable, with real or complex values, such that
i) f is causal ( f (t) = 0 for all t < 0);
ii) f (t) does not increase faster than any exponential function, that is,
there exist constants M > 0 and s R such that
f (t) M e st
for all t R.
333
The following notation are used: if f (t) is an original, its Laplace transform is denoted fb(p) or F (p). The symbol is also in widespread use, used
as follows: f (t) F (p). (In certain russian books, the notation f (t) F (p)
also appears.)
12.1.b Integrability
In the remainder of this chapter, we write p = x + i.
Let f be a locally integrable function. We are looking for the domain
where its Laplace transform fb(p) is defined. Note first that integrability of
t 7 f (t) e p t is equivalent to integrability of t 7 f (t) e x t .
Moreover, if this function is integrable for some x0 , then it is also integrable for any x > x0 , since
f (t) e x t = f (t) e x0 t e (x0 x)t f (t) e x0 t .
It follows that the set of complex numbers p, where t 7 f (t) e p t is integrable, either is empty or is a (right) half-plane in the complex plane, or
indeed is the entire complex plane C.
lower bound of all real numbers x for which the function above is integrable:
n
o
def
= inf x R ; t 7 f (t)e x t is integrable .
PROPOSITION 12.7 Let f (t) be an original and let be its convergence abscissa.
334
iii) for x = , it may, or may not, be Lebesgue-integrable. If it is not Lebesgueintegrable, the improper integral
Z +
ZR
p
t
f (t) e
dt = lim
f (t) e p t dt
R+ R
R
may still converge for certain values of p = +i, which provides an extension
of the Laplace transform of f to those values of p.
Example 12.8 Consider the Heaviside function. Its Laplace transform is given by
b (p) =
H
H (t) e p t dt =
1 p t +
e
,
0
p
so that the convergence abscissa of H is equal to 0, and for all p C such that Re(p) > 0, we
b (p) = 1/p. The domain where H
b is defined is thus an open half-plane.
have H
b
Notice that, if the function H is extended to the imaginary axis (except the origin) by
defining, for any nonzero , H (i) = 1/i, we obtain a function which is close to the
Fourier transform of H , the latter being equal to
e () = pv 1 + ,
H
2i
2
which is the same as pv(1/2i) on R . This resemblance will be explained in Section 12.4.e,
page 345.
Example 12.9 Let f (t) = 1/(1 + t 2 ). Then we have
F (p) =
ept
dt,
1 + t2
which is defined for all p C such that Re(p) 0. The domain of definition of F is therefore
a closed half-plane.
Example 12.10 Let a C be an arbitrary complex number, and consider the Laplace transform
of the function t 7 H (t) e at . We have
Z +
Z +
H (t) e at
e at e p t dt =
e ( pa)t dt,
0
so that the convergence abscissa is = Re(a), and for all p C such that Re(p) > Re(a), the
Laplace transform of H (t) e at is 1/(p a).
THEOREM 12.11 In the open half-plane on the right of the abcissa of convergence,
the Laplace transform fb is holomorphic and hence also analytic.
Proof. Let x R with x > . Then, for any m N, the function t 7 t m f (t)e p t is
integrable: since x > , the real number y = (x + )/2 satisfies < y < x; moreover,
t 7 f (t) e yt is integrable, and we may write
t m f (t) e x t = f (t)e yt t m e (xa)t/2 ,
where the first factor is integrable while the second is bounded. The derivative of F
can be computed using the theorem of differentiation under the integral sign, and is
given by
Z
dF
= F (p) =
(t) f (t) e p t dt.
dp
0
335
i
<0
x
(a)
(b)
=0
>0
x
(c)
(d)
Fig. 12.1 Convergence abcissas for unilateral Laplace transforms. (a) Function with
bounded support on the right. (b) Function rapidly decaying on the right.
(c) Tempered function. (d) Rapidly increasing function. In the non-gray open
set, the Laplace transform is holomorphic.
Since F is differentiable in the complex sense at any point of the open half-plane
{p C ; Re p > }, it is an analytic function.
Notice that the convergence abcissa for t 7 t f (t) is the same as that for f . Also,
an obvious induction shows that
Z
F (n) (p) =
(t n ) f (t) e p t dt.
0
ii) If f (t) decays rapidly on the right, then < 0 and fb(p) is holomorphic in a half-plane containing the imaginary axis (in particular, the Fourier
transform of f exists).
iii) If f (t) is tempered (in the sense of distributions) and does not tend to 0 at
infinity, then = 0; its Fourier transform does not necessarily exist in the sense
of functions, but it exists in the sens of distributions.
iv) Finally, if f (t) increases rapidly at infinity, then 0 < +; then fb (if
it exists) is holomorphic on a half-plane not containing the imaginary axis and
the Fourier transform of f does not exist.
Example 12.13 The convergence abcissa for the rectangle function is and we have
b
(p)
=
1/2
e p t dt =
1 e p/2
,
p
336
b is bounded in a
which is analytic on C; the singularity at 0 is indeed an artificial one, since
b has the power series expansion
neighborhood of 0;
X
(1)n+1
b
pn.
(p)
=
(n + 1)! 2n+1
n=0
Any nonzero polynomial function or rational function has convergence abcissa equal to 0.
2
Exercise 12.1 Show that the function g : t 7 e t does not have a Laplace transform (that
is, = +).
12.2
Inversion
It is possible to find an inversion formula for the Laplace transform using
our knowledge of the Fourier inversion formula. Indeed, if F (p) is the Laplace
transform of the function f (t), we have, for any x > ,
Z +
F (x + 2i) =
H (t) f (t) e x t e 2it dt,
1
This is not always the case. It may have branch points, as in the case of the Laplace
transform of t 7 H (t)/(t + 1).
Inversion
1
y
337
0
1
1
0
0
x
Fig. 12.2 Representation of the Laplace transform of the function x 7 cos x. The zaxis is the modulus of the function p 7 p/(p 2 + 1). This function may be
analytically continued to the complex plane without i and i.
which shows that, for fixed x, the function 7 F (x + 2i) is the Fourier
transform of H (t) f (t) e x t . The Fourier inversion formula leads, when t is a
point where H (t) f (t) is continuous, to
Z +
x
t
H (t) e
f (t) =
F (x + 2i) e 2it d
or
H (t) f (t) =
F (x + 2i) e x t e 2it d =
1
2i
F (p) e p t d p,
Dx
D x = {x + i ; R}
(Bromwich contour).
Remark 12.15 Note that the resulting formula gives the original as a function of the image,
338
Let f be an original
and F its Laplace transform. Let be the convergence abcissa. For any point where f
is continuous, we have
Z x0 +i
1
F (p) e p t d p
for any x0 > .
f (t) =
2i x0 i
Remark 12.17 If we start with an arbitrary function, this formula leads us back to H (t) f (t).
Can you show this using the fact that F is analytic in the half-plane x > a and decays on
vertical lines?
12.3
Elementary properties and examples
of Laplace transforms
Just as in the case of the Fourier transform, there are relations linking
the Laplace transform and operations like translation, convolution, differentiation, and integration.
12.3.a
Translation
There is a translation theorem similar to what holds for the Fourier transform,
but which requires some care here: since we are looking at the unilateral
Laplace transform, we will write explicitly the function H (t) f (t) instead of
f (t). The, writing p = x + i as usual and denoting by the convergence
abcissa for f , we have
if
then
for < x,
for Re(a) < x.
if
then
Similarly
for < x,
for < x.
339
12.3.b Convolution
For two causal functions f and g, we define the convolution product, if it exists,
in the same manner as was done in 7.64 on page 211, and the definition
simplifies here to
Z +
Zt
f g(t) =
f (s) g(t s) ds =
f (s) g(t s) ds.
0
equal to and .
If
f (t) F (p)
and
g(t) G (p)
then [H f H g](t) F (p) G (p)
for < x,
for < x,
for max(, ) < x.
Similarly, we have:
if
and
then
f (t) F (p)
g(t) G (p)
Z x0 +i
1
f (t) g(t)
F (s) G (p s) ds
2i x0 i
for < x,
for < x,
< x0 ,
for
+ < x0 .
One shows (see Exercise 12.6) that lim f (t) e p t = 0 so that the boundary
t+
term is equal to f (0) or, more precisely, to lim+ f (t), which is also denoted
f (0+ ).
t0
340
which is continuous, differentiable, and such that its derivative is also an original.
Then, if f (t) F (p), the Laplace transform of the derivative of f in the sense of
functions is given by
d
H (t)
f (t) p F (p) f (0+ ).
dt
Similarly, if f can be differentiated n times and if its n-th derivative if an original,
we have
dn
H (t) n f (t) p n F (p) p n1 f (0+ ) p n2 f (0+ ) f (n1) (0+ ).
dt
Example 12.20 Let f (t) = 1 + e t , which is an original with convergence abscissa = 0, and
1
1
+
.
p
p +1
The function f is differentiable, with derivative g : t 7 e t , which is an original with
convergence abscissa = 1. Then we have
p
1
p F (p) f (0+ ) = 1 +
2=
= G (p).
p +1
p +1
1 + e t
is an original with the same convergence abscissa. If f (t) F (p), then we have
dn
(t)n f (t) n F (p).
dp
Similarly, there is the following integration result:
THEOREM 12.22 (Laplace transform and integration)
Let f be an original,
t
0
f (t) F (p)
F (p)
f (u) du
p
for < x,
for max(, 0) < x.
Remark 12.23 This may be deduced from the convolution theorem since
f (u) du = f H (t).
341
Conversely, we have
THEOREM 12.24 Let f be an original.
If
then
f (t) F (p)
Z
f (t)
F (z) dz
t
p
for < x,
for sup(, 0) < x.
In this last relation, the path of integration joins the point p at infinity, in
any direction in the half-plane of convergence. The result of the integration
is independent of the chosen path since F is holomorphic in this half-plane.
It is still necessary that F (p) decay faster than 1/ p at infinity (this is true
for nice functions, but will not hold in general for the Laplace transform of
distributions).
12.3.d Examples
We have already shown that
H (t)
1
p
H (t) e a t
(n 1)!
(p a)n
H (t) e a t
Using a simple fraction expansion, this provides a way to find the original
for the Laplace transform of any rational function. Note (although we do not
prove it) that the formula remains valid if we take a value of n which is not
an integer; then the factor (n 1)! in the formula must be replaced by (n),
where is the Euler function (see page 154).
In particular, we have:
H (t) e it
1
p i
and
H (t) e it
1
,
p + i
H (t) cos(t) 2
H (t) sin(t) 2
,
2
p +
p + 2
0 < Re(p).
0 < Re(p).
Similarly,
H (t) cosh(t)
p
p 2 2
H (t) sinh(t)
p 2 2
0 < Re(p).
342
Remark 12.25 Be careful not to use the linearity to state something like:
i t
Re
1
p i
since taking the real part of some expression is certainly not a C-linear operation.
12.4
Laplace transform of distributions
12.4.a Definition
Since we are only discussing the unilateral Laplace transform (or, what
amounts to the same thing, the Laplace transform of causal functions, zero
for t < 0), we will only generalize the Laplace transformation to distributions
with support bounded on the left, that is, distributions in the space D+ .
Recall that there is a convolution algebra structure defined on this space
namely, the product T S exists for all T , S D+ .
DEFINITION 12.26 Let T be a distribution with support bounded on the left.
If there exists an R such that, for all x > , the distribution e x t T (t) is
tempered, then the Laplace transform of T is defined by the formula
def
pered and with support bounded on the left, the Laplace transform of exists and is
equal to the constant function 1, with convergence abscissa = .
Proof. We have e t = for
12.4.b Properties
Let T be a distribution with support bounded on the left with Laplace transform Tb. Then the function Tb is holomorphic at any point p C such that
Re(p) > .
In addition, the following result may be shown:
343
PROPOSITION 12.28 (Sufficent condition for a Laplace transform) Any function F (p) which is holomorphic and bounded in modulus by a polynomial function
of p in a half-plane Re(p) > is the Laplace transform of a distribution in D+ .
distribution with Laplace transform F (p), and let T denote its derivative in the sense
of distributions. Then T has a Laplace transform given by
T (t) p Tb (p).
Remark 12.31 This result is compatible with Theorem 12.19: let f be a continuous function
which is the statement in the differentiation theorem 12.30 in the sense of distributions.
Also, the following results remain valid for the Laplace transform of distributions:
have Laplace transform T
b (p) with convergence
THEOREM 12.32 Let T (t) D+
T (t ) Tb (p) e p
T (t) e a Tb (p + a)
HT
Tb (p)
p
for < x,
Remark 12.33 The last equation is the translation of the integration theorem, since H f is a
344
12.4.c Examples
Using the differentiation and translation theorems for the Laplace transform,
we find:
THEOREM 12.34 The Dirac distribution and its derivatives have convergence
abscissas = and
(t) 1,
(t) p,
(n) (t) p n ,
(t ) e p .
Putting
def
X+ (t) =
n=0
(t n),
cos(t) X+ (t)
1
1 ep
and
e t X+ (t)
1
.
1 e 1 p
X
X
H (t) f (t) X(t) =
f (n) (t n)
f (n) e n p .
n=0
ep,
Putting
z =
P
f
(n)
zn.
n=0
n=0
series
z 7
f (n) z n .
n=0
345
n e in t
nI
t mn 1
.
(mn 1)!
(mn 1) ( )
X
n
n
n
m ,
fe() = e
g () +
+ fp
2(2i)mn 1 (mn 1)!
2i( n ) n
nI
that is,
fe() = fp F (2i) +
or also
n
m
2(2i) n 1 (mn
nI
1)!
(mn 1) ( n ),
X (i)mn 1 n
fe
= fp F (i) +
(mn 1) ( n ),
2
2(m
1)!
n
nI
(12.1)
where fp is the finite part (for multiple poles), or the Cauchy principal value
(for simple poles).
346
Exercise 12.3 Compute this way the Fourier transform of the Heaviside distribution and of
12.5
Physical applications, the Cauchy problem
The Laplace transform may be used to solve problems linked to the evolution of a system, when the initial condition is known. Suppose we have a
linear system of differential equations describing the evolution of a physical
system for which the equations of motion are supposed to be known. We
are looking for the solutions t 7 f (t), where the value of the function and
of a certain number of its derivatives at t = 0 are imposed. The Cauchy
problem is the data consisting of the differential equation (or the system of
differential equations) together with the initial conditions.
Many quarrels between scientists have been based on this concept. It is to be seen already
in Newtons equations and appears, in particular, later with Laplaces intuition of the deterministic character of classical mechanics (the famous I did not need this assumption). These
disagreements continued into the last century: Einstein, with his God does not play dice, and
the ever-lively discussions between partisans of various interpretations of quantum mechanics.
For a theatrical approach to this subject, the reader is invited to go see (or to read) the play
Arcadia by Tom Stoppard [88].
347
p
.
+1
The differential equation (with the given initial conditions) can therefore be
written simply in the form
cos t
p2
p 2 X (p) + 1 + X (p) =
2p
,
+1
p2
2p
1
2
.
(p 2 + 1)2
p +1
There only remains to find the original of X (p). The second term on the
right-hand side of the preceeding relation is the image of the sine function:
sin t 1/(p 2 + 1). Moreover, using Theorem 12.21, we obtain
d
1
2p
t sin t
= 2
,
2
dp p + 1
(p + 1)2
which gives the solution X (p) t sin t sin t = (t 1) sin t.
It is easy to check that x(t) = (t 1) sin t is indeed a solution of the stated
Cauchy problem.
The reader is referred to the table on page 614 for a list of the principal
Laplace transforms.
3
One can see the advantage of working with Laplace transforms, which are functions,
instead of Fourier transforms, which are distributions.
348
A( r , 0) = A0 ( r ) for all r R3 ,
( r , 0) = A
0 ( r ) for all r R3 .
A
e ( k, t) its
We denote by A ( r, p) the Laplace transform of A( r, t), and by A
Fourier transform with respect to the space variable. We also denote by
f( k, p) the double transform (Laplace transform for the variable t and
A
Fourier transform for the variable r ).
We have, by the differentiation theorem 12.19 :
A( r , t) A ( r , p),
A( r , t) pA
A ( r, p) A( r , 0),
t
2
( r , 0),
A( r , t) p 2A ( r , p) p A( r , 0) A
t2
and the evolution of the potential is therefore described, in the Laplace world,
by the equation
2
p
1
p
A ( r , p) = 2 A0 ( r ) + A
( r).
2
c
c
c 0
Take then the Fourier transform for the space variable, which gives
2
p
pe
1e
2 f
+ k A ( k, p) = 2 A
0 ( k) + A0 ( k),
2
c
c
c
p
1
e
0 ( k),
e ( k) +
A
A
2 2 0
2
+k c
p + k2 c 2
from which we can now take the inverse Laplace transform to derive, for any
t 0, that
e
e ( k, t) = A
e 0 ( k) H (t) cos(k c t) + A
0 ( k) H (t) sin(k c t) .
A
kc
Finally we need only compute the inverse Fourier of t 7 (sin k c t)/k c and
t 7 cos(k c t).
or
f( k, p) =
A
p2
349
THEOREM 12.36 The inverse Fourier transforms of those functions are given by
F 1
and
Z
sin(k c t)
1
sin(k c t) i k r 3
d k
=
e
3
kc
(2)
kc
i
1 h
=
(r c t) (r + c t)
4c r
1
(r c t) if t > 0,
= 4c r
0
if t = 0,
F 1 [cos(k c t)]
1
(r
4r
( r )
c t) if t > 0,
if t = 0.
The expression for the evolution of the field is then obtained by this last
inverse Fourier transform using the the convolution theorem. Denoting by
B( r ; c t) the sphere centered at the point r with radius c t (i.e., the boundary
of the corresponding ball), we derive for t > 0 the formula
ZZ
ZZ
1
1
A0 ( r ) 2
( r ) d2 r .
A( r , t) =
d
r
+
A
4 B( r;c t) n | r r |
4c 2 t B( r;c t) 0
where / n is the exterior normal derivative (in the direction the vector
exterior to the surface).
Notice that the value of the field at point r and time t is expressed in
terms of the initial values of the field and its derivative at points r such that
| r r | = c t (this is what appears in the integration domain B( r ; c t)).
Note also that if we let t tend to 0 in this formula, the second term tends
to 0, whereas the first one tends to A0 ( r) if A0 is sufficiently regular (it is
enough that it be of C 1 class here).
Exercise 12.4 As an exercise, the reader may show that the previous formulas may be recov-
+
d2 S ,
4 S
n q
cq n t
q n
where q is the distance from r to the point of surface which is the integration variable, and
/ n is the exterior normal derivative. The quantities between curly brackets are evaluated at
the delayed time t q/c .
To conclude, take S = B( r ; c t) and show that the last formula recovers the framed
equation above.
350
A
d r +
4 B( r;c t ) n | r r |
4c 2 t B( r;c t ) 0
In this last expression, the first term, produced by the sources, is given by a
volume integral; it is therefore clear that the current j( r , t) may be a distribution instead of a function.
Note that to be physically pertinent, we must impose that the inital conditions (the potential at t = 0 and its first derivative) must be compatible with
the initial conditions concerning the charge and current distribution.
Exercises
351
EXERCISES
Exercise 12.5 Let f (t) be an original which is periodic with period a. Show that its Laplace
f (t) e p t dt.
Exercise 12.6 We prove here an intermediate result used in the proof of Theorem 12.19.
i) Let h : R R be a differentiable function such that h and h are integrable. Show that
lim x+ h(x) = 0.
ii) Deduce that if f is an original with convergence abscissa , such that f is differentiable,
and if f is also an original with convergence abscissa , then for any complex number
p such that Re(p) > max(, ), we have
lim f (t) e p t = 0.
t+
t 0,
J0 (s) J0 (t s) ds = sin t.
Exercise 12.8 Prove the following results, and draw the original functions in each case (k is
1 k p
e ,
p
a)
H (t k)
b)
(t k) H (t k)
1 k p
e ,
p2
c)
H (t) H (t k)
1 e k p
,
p
d)
n=0
H (t) + 2
e)
H (t nk)
(1)n H (t 2nk)
n=1
f)
(1)n H (t nk)
n=0
g)
t H (t) + 2
n=1
1
,
p(1 e k p )
tanh(k p)
,
p
1
p(1 + e k p )
1
tanh(k p).
p2
Draw the graph of f , then find the solution t 7 x(t) to the given Cauchy problem.
352
SOLUTIONS
def
Solution of exercise 12.5. Put =
f
[0,a]
1
p2 + 1
We want to find the convolution H (t) J0 (t) with itself, which has image
1
1
1
H (t) J0(t) H (t) J0 (t) p
p
= 2
,
2
2
p +1
p +1
p +1
the original of which is H (t) sin t.
Note that the formula stated is still valid for t 0 by the parity of J0 .
1
k
1
k
1
2k
4k
Solutions of exercises
353
The formulas are obtained by simple application of the various properties of the Laplace
transform and summations.
5
X
n=0
(1)n 1 + e tn (t n 1) H (t n).
Chapter
13
For instance, a mechanical system with springs and masses, or an electric circuit built out
of linear components such as resistors, capacitors or inductors, or in the study of the free
electromagnetic field, ruled by the Maxwell equations.
356
u(t)
v(t)
L
1
u=
+
H + v = D v,
R
RC
v(t)
v
= 0.
u(t)
u0
When the input signal is a superposition of sinusoidal signals (which is, generally speaking, always the case), the linearity of the system may be used to
find the output signal: it suffices to multiply each sinusoidal component of the
input signal by the value of the transfer function for the given pulsation and
to sum (or integrate) over the whole set of frequencies:
X
X
if u(t) =
an e in t ,
then v(t) =
an Z(n ) e in t ,
n
and if
u(t) =
a() e it d,
then
v(t) =
a() Z() e it d.
The only subtle point in this argument is to justify that the output signal
is monochromatic if the input signal is. This step is provided by the following
theorem:
THEOREM 13.1 Any exponential signal (either real t 7 e p t or complex t 7
357
Indeed, assume that the input signal u and the output signal v are related
by a convolution equation v(t) = T (t) u(t), where T is either a function or
a tempered distribution. If u is of the form u(t) = e 2it , then v(t) = T (t)
e 2it is infinitely differentiable (because the exponential is) and Theorem 8.24
yields
v(t) = T (), e 2i(t) = T (), e 2i e 2it ,
which is indeed a complex exponential with the same frequency as the input signal. The
The quantity Z() may also be seen as the eigenvalue associated to the eigenvector
t 7 e 2it of the convolution operator T .
The same reasoning applies for a real-valued exponential. If the input
signal is of the type u(t) = e p t (where p is a real number), the output signal
can be written
v(t) = T (), e p(t) = T (), e p e p t ,
still according to Theorem 8.24. The transfer function p 7 T (), e p
is then of course the Laplace transform of the impulse response T (see Chapter 12).
Remark 13.2 (Case of a resonance) This discussion must be slightly nuanced: if Z () is infi-
nite, which means the Fourier transform of the distribution T is not regular at , one cannot
conclude that the output signal will be sinusoidal.
A simple example is given by the equation of the driven harmonic oscillator:
v + 2 v = e 2i t = u(t).
This equation can be written u = [ + 2 ] v, the solution of which, in D+ , is (see
Theorem 8.33 on page 239)
1
H (t) sin t.
The Fourier transform T (computed more easily by taking the Laplace transform evaluated at
2i, and applying the formula (12.1), page 345) is then2
1
1
1
Z () = Te() = fp 2
+
+
.
2
2
4
4i
2
4i
2
v = T u
with T (t) =
The transfer function Z () is computed, ab initio, at the beginning of Chapter 15. This
will explain why imaginary Dirac distributions appear, in relation with the choice of a solution
of the equation u = [ + 2 ] v; for a solution in D , the Dirac spikes would have the
opposite sign. A solution which does not involve any Dirac distribution exists also, but it does
not belong to any convolution algebra.
358
In fact, this reflects the fact that the solutions of the differential equation are not sinusoidal; instead, they are of the form
1
t eit,
v(t) =
2i
at the resonance frequency. More details are given in Chapter 15.
13.2
Fourier transform of vector fields:
longitudinal and transverse fields
The Fourier transform can be extended to an n-dimensional space.
DEFINITION 13.3 The Fourier transform of a function x 7 f ( x) with
variable x in Rn is given by
fe( ) =
ZZ
Rn
f ( x) e 2i x dn x.
All the properties of the Fourier transform extend easily to this case, mutatis
mutandis; for instance, some care is required for scaling:
1
.
F f (a x) = n fe
|a|
a
One may also define, coordinate-wise, the three-dimensional Fourier transform of a vector field x 7 A( x). Using this, it is possible to define transverse
fields and longitudinal fields.
Let x 7 A( x) be a vector field, defined on R3 and with values in R3 .
This field may be decomposed into three components A = (A x , A y , Az ), each
of which is a scalar field, that is, a function from R3 to R. Denote by A x the
Fourier transform of A x , A y that of A y , and A z that of Az . Putting together
these three scalar fields into a vector field k 7 A ( k), we obtain the Fourier
transform of the original vector field:
ZZZ
A ( k) =
A( x) e 2i k x d3 x.
R3
359
A ( k) = A ( k) A// ( k).
The longitudinal component of the field A is given by the inverse Fourier
transform of the field k 7 A// ( k):
A ( k) k
def
1
A// ( x) = F
k
.
| k|2
Its transverse component is then
def
A ( x) = F 1 A ( k) = A( x) A// ( x).
A( x) = 0
at any x R3 ,
13.3
Heisenberg uncertainty relations
In nonrelativistic quantum mechanics, the state of a spinless particle is
represented by a complex-valued function which is square integrable and
of C 2 class. More precisely, for any t R, the function (, t) : r 7 ( r , t)
is in L2 (R3 ). This function represents the probability amplitude of the particle
at the given point, that is, the probability density associated with the presence
360
and similarly Y and Z are the position operators along the O y- and O zdef
axes, where R = (X , Y , Z). The operator X is defined by its action on wave
functions given by
X =
with (, t) : r 7 x ( r , t),
def
(t) (t) =
( r , t) ( r, t) d3 r,
(recall the chosen convention for the hermitian product in quantum mechanics: it is linear on the right and semilinear on the left), and the position operator
acts as follows:
X (t) = (t)
with ( r, t) = x ( r , t).
3
Only physical wave functions belong to this space. Some useful functions, such as
monochromatic waves r 7 e i k r do not; they are the so-called generalized kets, which are
discussed in Chapter
14, in particular, in Section 14.2.d.
4
In fact (t) is a vector in an abstract Hilbert space
H . However, there exists an isometry
between H and H = L2 (R3 ), that is, the vector (t) can be represented by the function
r 7 ( r , t): this is the position representation.
But, since the Fourier transform is itself
an isometry of L2 (R3 ) into itself, the vector (t) may also be represented by the function
e p, t), which is the Fourier transform of the previous function. This is called the
p 7 (
momentum representation of the wave vector.
361
x (t) = (t) X (t) = (t) X (t) =
def
ZZZ
2
x ( r , t) d3 r.
In addition, to every wave function ( r , t) is associated its Fourier transform with respect to the space variable r , namely,
ZZZ
d3 r
def
( p, t) =
( r , t) e i p r/ h} p
.
(2 h})3
This function represents the probability amplitude for finding the particle
with momentum p. Of course, this function is also normalized to have norm
1, since for every t R we have
ZZZ
ZZZ
2 3
( p, t) d p =
( r , t)2 d3 r = 1
(t)(t) =
or
P x | = |
with (, t) : r 7 i h}
( r, t).
x
def
p x (t) = (t) P x (t) = i h}
( r , t)
( r , t) d3 r .
x
with
(, t) : p 7 p x ( p, t).
362
the wave function ( r , t) is the quantity x defined by the following equivalent expressions:
D
2 E
2
def
= x x2 = X | X | X 2
(x)2 = x x
ZZZ
2
2
=
x x ( r , t) d3 r = kX x k2 .
( p x )2 = p x2 p x 2 = P | P | P 2
2
ZZZ
2
=
h} ( r , t) d3 r p x 2
x
ZZZ
2
=
p x2 ( p, t) d3 p p x 2 = kP x p x k2
Note that p x2 may also be expressed as the average value of the operator
2
P x2 = h}2 x 2 in the state |, since an easy integration by parts reveals that
ZZZ
2 def
2
2
2
P x = P x = h}
( r , t)
( r, t) d3 r
x2
2
ZZZ
2
=
h} ( r, t) d3 r.
x
Remark 13.15 The average values and the uncertainties should be compared with the expectation value and standard deviation of a random variable (see Chapter 20).
We are going to show now that there exists a relation between x and
p x . To start with, we have the following lemma:
LEMMA 13.16 Let be a wave function.
a ( r , t) = ( r a, t),
363
def
k ( r , t) = ( r , t) e i k r / h},
then neither x nor p x is changed.
r, t) which is centered
iii) Consequently, it is possible to define a wave function (
in position and momentum (i.e., such that the average position and the average
momentum are zero) and has the same uncertainties as the original function
( r , t) with respect to position and momentum.
Proof. The average position of the translated wave function a is
ZZZ
ZZZ
2
2
x a =
x a ( r, t) d3 r =
(x a x ) ( r , t) d3 r = x a x .
2
2
2
2
(x) a = x a x a =
x 2 ( r a, t) d3 r x a x
ZZZ
2
=
(x a x )2 ( r, t) d3 r x2 2a x x + a 2x
= x 2 2a x x + a 2x x2 2a x x + a 2x
= x 2 x2 = (x)2 .
The momentum uncertainty of | is of course unchanged.
The proof is identical for the translates in momentum space.
Finally, the centered wave function is, for a given t, equal to
def
r, t) =
(
r r , t e i p r / h} .
3
2
dr
dr
x
(
r
)
(
r
)
d
r
x
(
r)
(
r)
x
x
1
2 (x)2 (p x )2 .
h}
()
364
But, using an integration by parts, since x is real and the boundary terms cancel (see
Remark 13.19 concerning this point), we can write
ZZZ
ZZZ
ZZZ
( r )2 d3 r
x ( r )
( r) d3 r =
x ( r )
( r ) d3 r,
x
x
or
2 Re
ZZZ
x ( r )
( r) d3 r
x
ZZZ
( r )2 d3 r = 1.
1
3
3
Re
(
r
)
d
r
(
r
)
d
r
x
(
r
)
x
(
r)
= 4.
x
x
If the usual definition of the Fourier transform is kept (with the factor 2
and without h}), the following equivalent theorem is obtained:
THEOREM 13.18 (Uncertainty relation) Let f L2 (R) be a function such that
R
x 2 | f (x)|2 dx and | f (x)|2 dx both exist. Define
Z
2
2 def 1
x =
x 2 f (x) dx,
2
k f k2
Z
Z
2
2
2 def 1
42
2
e
f (x) dx.
=
f () d =
2
2
k f k2
k f k2
1
Then we have x 2 2 16
2 with equality if and only if f is a gaussian.
the second moment
Remark 13.19 A point that was left slightly fuzzy above requires explanation: the cancellation
of the boundary terms during the integration by parts. Denote by H the space of square
integrable functions , such that X and P x are also square integrable. This space, with the
norm N defined by
is a Hilbert space.
If H is of C class with bounded support with respect to the variable x, the proof
above (and in particular the integration by parts with the boundary terms vanishing) is correct.
A classical result of analysis (proved using convolution with a Dirac sequence) is that the
space of C functions with bounded support is dense in H . By continuity, the preceding
result then holds for all H .
Remark 13.20 Another approach of the uncertainty relations is proposed in Exercise 14.6 on
page 404.
Analytic signals
365
13.4
Analytic signals
It is customary, in electricity or optics, for instance, to work with complex
sinusoidal signals of the type e it , for which only the real part has physical
meaning. This step from a real signal f(R) (t) = cos(t) to a complex signal
f (t) = e it may be generalized in the case of a nonmonochromatic signal.
First of all, notice that the real signal may be recovered in two different
ways: as the real part of the associated complex signal, or as the average of the
signal and its complex conjugate.
We can generalize this. We consider a real signal f(R) (t) and search for
the associated complex signal. In the case f(R) (t) = cos(20 t) (with 0 > 0),
the spectrum of f(R) contains both frequencies 0 and 0 . However, this
information is redundant, since the intensity for the frequency 0 is the same
as that intensity for the frequency 0 , since f(R) is real-valued. The complex
signal f (t) = e 2it associated to the real signal f(R) is obtained as follows:
take the spectrum of f(R) , keep only the positive frequencies, multiply by two,
then reconstitute the signal.
In a general way, for a real-valued signal f(R) (t) for which the Fourier transform fe(R) () is defined, its spectrum is hermitian, that is, we have
fe(R) () = fe(R) ()
for all R;
The inverse Fourier transform of this truncated spectrum will give the analytic
signal associated to the function f(R) . Since the inverse Fourier transform of
2H () is
i
F 1 2H () = F 1 [1 + sgn ] = (t) + pv
t
(see Theorem 11.16 on page 304), we deduce that the inverse Fourier transform
of 2H () fe(R) () is
Z
i
i
f(R) (t )
f(R) = f(R) (t) pv
dt .
F 1 2H () fe(R) () = + pv
t
t t
We make the following definition:
DEFINITION 13.21 An analytic signal is a function with causal Fourier trans-
form, that is, a function f such that the Fourier transform vanishes for negative frequencies.
366
DEFINITION 13.22 Let f(R) be a real signal with sufficiently regular Fourier
t t
DEFINITION 13.23 The analytic signal associated to the real signal f(R) is the
function
i
t
7 f (t) = f(R) (t) + i f(I) (t) = f(R) (t) pv
def
f (t )
dt ,
t t
The analytic signal associated with f(R) is analytic in the sense of Definition 13.21: its spectrum is identically zero for frequencies < 0. Since fe()
is causal, it has a Laplace transform F (p) defined at least on the right-hand
half-plane of the complex plane. Since the signal f is the inverse Fourier
transform of fe, it is given by
f (x) = F (2i x).
the spectrum of f is causal, then f may be analytically continued to the complex upper
half-plane.
Conversely, if f may be analytically continued to this half-plane, and if the integral
of f along a half-circle in the upper half-plane tends to 0 as the radius tends to infinity,
then the spectrum of f is causal.
(See [79] for more details.)
5
In the sense that the product H () fe(R) () exists; this means that fe(R) cannot involve a
singular distribution at the origin 0.
6
The upper half-plane because of the similitude x 7 2i x: the image of the upper
half-plane is the right-hand half-plane.
Analytic signals
367
An example of the use of analytic signals is given by a light signal, characterized by its real amplitude A(R) (t). We have
Z +
A(R) (t) =
A(R) () e 2it d,
where A(R) () is the Fourier transform of A(R) (t). However, since A(R) (t) is real,
it is also possible to write
Z +
A(R) (t) =
a() cos () + 2i t d,
0
where for > 0 we denote 2A(R) (t)() = a() e i() with a and real.
The imaginary signal associated to A(R) is of course given by
Z +
A(I) (t) =
a() sin () + 2i t d,
0
A(t) =
The quantity a() = 2A(R) () has the following interpretation: its square
W () = a 2 () is the power spectrum, or spectral density of the complex signal A (see Definition 13.26 on the next page). This complex spectral density
contains the energy carried by the real signal, and an equal amount of energy
carried by the imaginary signal. The total light intensity carried by the real
light signal is then equal to
Z
Z +
1 +
E =
W () d =
|A(R) ()|2 d,
2 0
namely, half of the total complex intensity (the other half is of course carried
by the imaginary signal).
368
13.5
Autocorrelation of a finite energy function
13.5.a Definition
Let f L2 (R) be a square integrable function. We know it has a Fourier
transform fe().
2
so we can interpret fe() as an energy density per frequency interval d.
2
function 7 fe() .
In optics and quantum mechanics, for instance, the autocorrelation function of f is often of interest:
DEFINITION 13.27 Let f be a square integrable function. The autocorrela-
13.5.b Properties
Since the autocorrelation function is the convolution of f with the conjugate
of its transpose (x) = f (x) f (x), it is possible to use the known properties
of the Fourier transform to derive some interesting properties. In particular,
THEOREM 13.28 The Fourier transform of the autocorrelation function of f is equal
to the spectral density of f :
2
F (t) = F f (x) f (x) = fe() .
369
maximal at the origin: |(x)| R (0) for any x R. (0) is real, positive, and
equal to the energy of f : (0) = | f |2 .
(x) =
(x)
.
(0)
13.5.c Intercorrelation
DEFINITION 13.32 The intercorrelation function of two square integrable
f g (x) =
f g (x) =
f g (x)
f (0) g (0)
370
13.6
Finite power functions
13.6.a
Definitions
Up to now, the functions which have been considered are mostly square inte2
grable, that is, functions
R in 2L (R). In many physical situations (in optics, for
instance), the integral | f | represents an energy. Sometimes, however, the
objects of interest are not functions with finite energy but those with finite
power, such as f (t) = cos t.
DEFINITION 13.34 A function f : R C is a finite power function if the
limit
ZT
1
lim
T + 2T T
exists and is finite. This limit is then
signal f .
f (t)2 dt
THEOREM 13.36 Let f (t) be a finite power signal and W () its spectral density.
Then we have
1
lim
T + 2T
f (t)2 dt =
W () d.
13.6.b Autocorrelation
For a finite power function, the previous definition 13.27 of the autocorrelation is not valid anymore, and is replaced by the following:
DEFINITION 13.37 The autocorrelation function of a finite power signal f
is given by
1
(x) = lim
T + 2T
def
T
T
f (t) f (t x) dt.
S1 (t)
S (t)
371
S2 (t)
(S )
Example 13.39 Let f (t) = e 2i 0 t . The autocorrelation function is (x) = e 2i 0 x and the
spectral density is W () = ( 0 ). Using the definition for a finite energy function, we
would have found instead that fe() = ( 0 ) and the spectral density would have to be
something like 2 ( 0 ). But, as explained in Chapter 7, the square of a Dirac distribution
does not make sense.
13.7
Application to optics:
the Wiener-Khintchine theorem
We consider an optical interference experiment with Young slits which are
lighted by a point-like light source (see Figure 13.1).
We will show that, in the case where the source is not quite monochromatic
but has a certain spectral width , the interference figure will get blurry at
points of the screen corresponding to differences in the optical path between
the two light rays which are too large. The threshold difference between
the optical paths such that the diffraction rays remain visible is called the
coherence length, and we will show that it is given by = c /.
Consider a nonmonochromatic source. We assume that the source S
of the figure emits a real signal t 7 S(R) (t), and that an analytic
complex
signal t 7 S (t) is associated with it. Hence, we have S(R) (t) = Re S (t) . The
Fourier transform of S(R) (t) is denoted S(R) ().
Moreover, we assume that the signal emitted has finite power and is characterized by a spectral power density W () for > 0. The signal is non-
372
I P = S P2 (t) ,
where we put
ZT
2 def
S (t) = lim 1
S P (t)2 dt.
P
T + 2T T
The signal reaching P is the sum of the two signals coming from the two slits.
The signal coming from one slit is thus the same as the signal coming from
the other, but with a time shift = L/c , where L is the path difference
between the two rays (and hence depends on the point P ):
S P (t) = S1 (t) + S2 (t) = S1 (t) + S1 (t ).
(The function S1 (t) is directly related to the signal S (t) emitted by the source.)
From this we derive
I P = S P S P = (S1 + S2 ) (S1 + S2 )
= S1 S1 + S2 S2 + 2Re S1 S2
= 2I + 2Re S1 S2 ,
S1 S2 = S1 (t) S1 (t ) = (),
def
() =
7
()
()
=
.
(0)
I
373
min
max
(0)
Fig. 13.2 The autocorrelation function for an almost monochromatic source, varying
with the parameter in the complex plane. The value at the origin (0) is
real and positive. The modulus of (t) varies slowly compared to its phase.
We may then estimate the visibility factor of the diffraction figure, using
the Rayleigh criterion [14]:
V =
Imax Imin
.
Imax + Imin
(13.2)
Here, Imax represents a local maximum of the light intensity, not at the point P ,
but in a close neighborhood of P ; that is, the observation point will be moved
slightly and hence the value of the parameter will change until an intensity
which is a local maximum is found, which is denoted Imax . Similarly, the
local minimum in the vicinity of P is found and is denoted Imin .
Estimating this parameter is only possible if the values max and min
corresponding to the maximal and minimal intensities close to a given point
are such that the difference max min is very small compared to the duration
of a wave train. If this is the case, the autocorrelation function will have
the shape in Figure 13.2, that is, its phase will vary very fast compared to its
modulus; in particular, the modulus of will remain almost constant between
max and min , whereas the phase will change by an amount equal to .8 For
= 0, we have of course = I , and then |()| will decrease with time.
Thus, by moving slightly P , varies little and || remains essentially
constant. Then Re() varies between || and + ||, and the visibility factor
8
This is what the physics says, not the mathematics! This phenomenon may be easily understood.
Suppose is made to vary so that it remains small during the duration of a wave train. Then,
taking a value of equal to half a period of the signal (which is almost monochromatic), the
signals S1 and S2 are simply shifted by : they are in opposition and () is real, negative,
and with absolute value almost equal to (0). If we take values of much larger, the signals
become more and more decorrelated because it may be that S1 comes from one wave train
whereas S2 comes
from another. A simple statistical model of a wave train gives an exponential
decay of ().
374
()
I
in the Young interference experiment is equal to the modulus of the normalized Fourier
transform of the spectral density of the source evaluated for the value corresponding to
the time difference between the two rays:
R +
2i d
0 W () e
V = R +
.
W () d
0
Exercises
375
EXERCISES
Exercise
13.1 (Electromagnetic
field in Coulomb gauge) We consider an electromagnetic
field E( r , t), B( r , t) compatible with a charge and current distribution ( r, t), j ( r, t) ,
that is, such that the fields satisfy the differential equations (Maxwell equations):
B
,
t
1 E
curl B = 2
+ 0 j.
c t
div E = /0 ,
curl E =
div B = 0,
i) Show that the field B is transverse.
ii) Show that the longitudinal component of the electric field is given by the instantaneous
Coulomb potential associated with the charge distribution. What may be said from the
point of view of relativity theory?
iii) Is the separation between transverse and longitudinal components preserved by a
Lorentz transformation?
Exercise 13.2 (van Cittert-Zernike theorem) Consider a light source which is translation in-
variant along the O y-axis, characterized by a light intensity I (x) on the O x-axis, such that the
light passes through an interferential apparatus as described below. The distance between the
source and the slits is f and the distance to the screen is D.
y
a
We assume that the source may be modeled by a set a point-like sources, emitting wave
trains independently of each other, with the same frequency . Since the sources are not
coherent with respect to each other, we admit that the intensity observed at a point of the
interference figure is the sum of intensities coming from each source.9 Show:
THEOREM 13.42 (van Cittert-Zernike) The visibility factor of the source I (x) is equal to the modulus
of the Fourier transform of the normalized spatial intensity of the source for the spatial frequency k =
a/ f , where a is the distance between the interference slits and f is the distance between the source and
the system:
a
I (x)
def
|| = Ie
with
I (x) = R +
.
f
I (s) ds
The reader will find in [14] a generalization of this result, due to van Cittert (1934) and
Zernike (1938).
9
We saw in Exercise 1.3 on page 43 that it is in fact necessary to have a nonzero spectral
width before the sources can be considered incoherent.
376
Solution of exercise 13.1. Applying the Fourier transform, the Maxwell equations become
k E = i e/0 ,
k B = 0,
B
B
,
t
1 E
E
ik B = 2
+ 0 ej.
c t
ik E =
k E ( k)
k
= i 2 e( k)/0 ,
k2
k
and hence, taking the inverse Fourier transform, we have
Z
( r ) [ r r ] 3
1
d r = E Coulomb ( r , t).
E // ( r , t) =
40
k r r k3
E// ( k) =
Thus, the longitudinal field propagates instantaneously through space, which contradicts special relativity. This indicates that the decomposition in transverse and longitudinal components
is not physical.
In fact, it is not preserved by a Lorentz transformation, since the Coulomb field is not.
Solution of exercise 13.2. The intensity at a point P with coordinate y on the screen is
given by the integral of intensities produced by I (x), for x R. An elementary interference
computation shows that, for small angles, we have
Z +
2a x
y
J (x) = K
I (x) 1 + cos
+
dx
f
D
Z +
2a x
y
= K I0 1 +
I (x) cos
+
dx ,
f
D
R +
with I0 = I (s) ds, and where I (x) = I (x)/I0 is the normalized intensity. Since I and I
are real-valued functions, we may write
Z +
2ia x
y
J ( y) = K I0 1 + Re
I (x) exp
+
dx
f
D
a
e 2i a y/D
.
= K I0 1 + Re Ie
f
If we decompose Ie(a/ f ) = || e i in terms of a phase and modulus, we obtain
2a y
J ( y) = K I0 1 + || cos
.
D
Chapter
14
Bras, kets,
and all that sort of thing
14.1
Reminders about finite dimension
In this section, we consider a vector space E , real or complex, with finite dimension n, equipped with a scalar product (|) and with a given orthonormal
basis (e1 , . . . , en ). Some elementary properties concerning the scalar product,
orthonormal bases, and the adjoint of an endomorphism will be recalled;
these results should be compared with those holding in infinite dimensions,
which are quite different.
space E . More precisely, for any linear form on E , there exists a unique a E
such that
x E
(x) = (a|x) .
Proof
P
Existence. Using the orthonormal basis (e1 , . . . , en ), define a = nj=1 (e i ) e i .
Pn
Pn
For any x = i=1 xi e i E , we have (x) = i=1 xi (e i ), hence (x) = (a|x).
Uniqueness. Assume a and a E satisfy
(a|x) = (a |x)
x E .
378
14.1.b Adjoint
DEFINITION 14.2 (Adjoint) Let u be an endomorphism of E . The adjoint
of u, if
it
exists, is the
unique v L (E ) such that, for any x, y E , we have
v(x) y = x u( y) . It is denoted u , and so by definition we have
x, y E
u (x) y = x u( y) .
Proof of uniqueness. Let v, w both be adjoints of u. Then we have
x, y E
x u( y) = v(x) y = w(x) y ,
which shows that v(x) w(x) E = {0} for any x E and hence v = w.
(V X )Y = t X (U Y )
t V = U .
(u + v) = u + g ,
(u ) = u,
det u = det u,
IdE = Id E ,
rank(u ) = rank(u),
(u v) = v u ,
tr (u ) = tr (u),
Sp(u) Sp(u ).
379
adjoint endomorphism (i.e., hermitian in the complex case, or symmetric in the real
case). Then u can be diagonalized in an orthonormal basis.
Also the reader should be able to prove the following result as an exercise:
endomorphisms of E , then they can be diagonalized simultaneously in the same orthonormal basis.
14.2
Kets and bras
In the remainder of this chapter, H is a separable Hilbert space (i.e., admitting a countable Hilbert basis).
14.2.a
Kets | H
380
to work at the same time with two or even three Hilbert spaces:
an abstract Hilbert space H, the vectors of which are denoted |;
the space L2 , in which the vectors are denoted (meaning a function x 7 (x)); this
space will be denoted L2 (R
R, d x) for added precision;
the space L2 again, but in which the vectors are denoted : p 7 (p); this space will
be denoted L2 (R
R, d p).
The square integrable functions correspond to Schrdingers formalism of wave mechanics, in
position representation in the case of the space L2 (R, dx), and in momentum representation
in the case of the space L2 (R, dp). The function p 7 (p) is the Fourier transform of the
function x 7 (x). We know that the Fourier transform gives an isometry between L2 (R, dx)
and L2 (R, dp).
When working at the same time with the abstract space H and the space L2 (R), the
distinction can be be made explicit, for instance, by writing a function in L2 (R) and | the
corresponding abstract vector. However, we will not make this distinction here.
In the case of a particle confined in an interval [0, a] (because of an infinite potential
sink), the space used is L2 [0, a] for the position representation and 2 for the momentum
representation; going from one to the other is then done by means of Fourier series.
Moreover, one should notice that all that is done here in the space L2 (R) may be easily
generalized to L2 (R3 ) or L2 (Rn ).
|
2 =
| , | H .
H
14.2.b Bras | H
: L2 (R) C
Z +
2
7
(x) e x dx
1
Dirac did not simply make a pun when inventing the terminology bra and ket: this
convenient notation became quickly indispensable for most physicists. Note that Dirac also
coined the words fermion and boson, and introduced the terminology of c -number, the function, and the useful notation h} = h/2.
381
: | 7 | , | H
| = | , | H .
In other words:
linear isomorphism of the Hilbert space H with its topological dual H . In other words,
for any continuous linear form |, there exists a unique vector | such that
| = | , | H .
| H
Hence there is a one-to-one correspondance between bras and kets:
To each bra | H one can associate a ket | H.
H H ,
| 7 | ,
=
|
H .
382
Remark 14.15 The semilinearity of the map above means that | = | for any C
and H. This is a result of the convention chosen; if we associate to a ket | the linear
map
| 7 | , | H ,
the result map H H is a linear isomorphism.
Counterexample 14.16 Consider again the Hilbert space H = L 2 (R). Let x0 R. Then the
2
0,
kn k2H =
e nx dx =
n n
+
whereas
(n ) = 1 for all n N. So there does not exist a constant R such that
() kk for all H, which means that is not continuous.
H
We will denote also with the generic notation | the generalized bras, namely
linear forms which are not necessarily in H . Such a form may not be continuous, or it may not be defined on the whole of H (as seen in the case of ,
in counterexample 14.16); still it will be asked that it be defined at least on a
dense subspace of H.
One of the most important example is the following:
DEFINITION 14.17 (The bra x0 |) Let x0 R. We denote by x0 | the discontinuous linear form defined on the dense subspace of continuous functions
by
x0 | = (x0 ).
The bra x0 | is thus another name for the Dirac distribution (x x0 ).
Since functions in L2 are defined up to equality almost everywhere, x0 is not well defined
on the whole of H. We therefore restrict it to the dense subspace of continuous functions.
3
In fact, the dual of L2 (R) is isomorphic to L2 (R), but via this isomorphism one can
identify any continuous linear form on L2 (R) to a function in L2 (R).
383
T =
| =
DEFINITION 14.20 (Ket | p and bra p|) Let p be a real number. The func-
tion
e i p x/ h}
2 h}
is not square integrable. It still defines a noncontinuous linear form on the
dense space L1 L2 :
p : x 7 p
p : L1 (R) L2 (R) C
Z +
7
p (x) (x) dx =
e(p),
where e is the Fourier transform of , with the conventions usual in quantum mechanics. This generalized bra is denoted p| = p |. The ket that
corresponds to the function p is denoted | p = | p
/ H.
384
p, p R, we have p p = (p p ).
valid in the sense of tempered distributions that is, which makes sense only
when applied to a function in the Schwartz space.
14.2.e Id =
P
n
| n n |
Let (n )nN be an orthonormal Hilbert basis of H. We know that this property is characterized by the fact that
x H
x=
X
n=0
(n |x) n .
| =
n | |n ,
|n n | .
n=0
| =
n=0
P
Or still, formally, the operator
n=0 |n n | is the identity operator. This is
what physicists call a closure relation.
For easy remembering, this will be stated as
385
and only if
Id =
X
n=0
|n n | .
X
X
2
kk = | = |
|n n | | =
| n n |
n=0
n=0
X
n | 2 ,
=
n=0
which can be compared with the formula d) in Theorem 9.21 on page 259.
The family (T ) is a total system (or a complete system) if, for all S ,
we have
T , = 0 for any = 0 .
4
The choice of index is not without meaning. A generalized basis can be useful to diagonalize a self-adjoint operator which has, in addition to eigenvalues, a continuous spectrum.
386
As in the case of the usual Hilbert basis (see Theorem 9.21), a generalized
basis is a total system. However, the normalization of such a family (the
elements of which are not a priori in L2 ) is delicate.
Note that a generalized basis is not in general a Hilbert basis of L2 , for
instance, because any Hilbert basis is countable.
THEOREM 14.24 (The basis { x| ; x R }) Denote by x| = x| the generalized
bra associated to the distribution x . Then the family of distributions x| ; x R
is a generalized basis of L2 (R).
Proof. Indeed,
for
S
very definition of the L2 norm, we have
2 any R +
, by the
R +
2
2
kk2 = (x) dx = x| dx.
THEOREM 14.25 (The basis { p| ; p R }) Denote by p| = p | the generalized bra defined by the regular distributions p associated to the function
e ix/ h}
p : x 7 p
.
2 h}
Then the family p| ; p R is a generalized basis of L2 (R).
Finally, it can
be shown that
the closure relation
is still valid for two generalized bases x| ; Rx R and p| ; p R . Indeed, if , L2 (R),
the relation | = | p p| d p holds: this is (once more) the ParsevalPlancherel identity! Similarly, we have
Z +
Z +
| =
(x) (x) dx =
x| x| dx
| x x| dx.
Id =
|x x| dx
and
Id =
| p p| d p.
Remark 14.27 This shows that it is possible to write, formally, identities like
| =
|x x| dx,
that is,
() =
x () (x) dx.
Linear operators
387
This formula has a sense if evaluated at a point y; indeed, this yields the correct formula
Z +
( y) =
(x) x ( y) dy.
The great strength of Diracs notation is to lead, very intuitively, to many correct formulas. For
instance, writing
Z
+
| =
| p p| dp
14.3
Linear operators
We will only describe here the simplest properties of linear operators acting
on Hilbert spaces. This is a very complex and subtle theory, which requires by
itself a full volume. The interested reader can read, for instance, the books [4,
95] or [71], which deals with quantum mechanics in Hilbert spaces. We will
define bounded operators, closed operators, eigenvalues and eigenvectors, and
adjoints of operators.
14.3.a
Operators
DEFINITION 14.28
(Linear operator) A linear operator on a Hilbert space H
is a pair A , D A , where
388
linear operator. For instance, the maximal domain on which the position operator, denoted
X , is defined is the subspace
D1 = L2 (R) ; x 7 x (x) L2 (R) .
However, this space may not be the most convenient to use. It may be simpler to restrict X to
the Schwartz space D2 = S of C class functions with all derivatives rapidly decaying. This
space is dense in H = L2 (R), and the operator X is defined on S . Moreover, this space has
the advantage of being stable with respect to X .
We must insist: the operator (X , D1 ) is not the same as the operator (X , D2 ), even if they
are both usually denoted by the same name operator X . The first is an extension of the
second (to a strictly larger space).
An important example of an operator is the momentum operator of quantum mechanics. Consider the space H = L2 [0, a] of square integrable functions on the interval [0, a]. The wave functions of a particle confined to this
interval all satisfy (0) = (a) = 0 (this is a physical constraint). We wish to
define an operator (the momentum operator P ) by
P = i h} .
by
D P
P = i h} .
Remark 14.33 (Boundary conditions for the operator P ) One may wonder what the boundary
conditions (0) = (a) really mean; as a matter of fact, , like every function of the L2 space,
is defined only up to equality almost everywhere, and therefore, it would seem that a boundary
condition is meaningless. But the fact that L2 [a, b ] implies that L1 [a, b ] and, in
particular, that is continuous.6 This explains why the boundary conditions make sense.
The momentum operator for a particle that can range over R is defined
similarly:
5
This does not imply that the function is differentiable; but it must be continuous and
its derivative exists almost everywhere.
6
Even absolutly continuous, but this is not of concern here.
Linear operators
389
by
D P .
P = i h}
The limit conditions () = (+) = 0 are automatically satisfied when
, H.
14.3.b Adjoint
A , D A be a linear operator on H. The
domain of A , denoted D A , is the set of vectors H for which there
exists a unique vector H satisfying
DEFINITION 14.35 (Adjoint) Let
D A
| = | A .
A = | A .
D A D A
PROPOSITION 14.36 (Existence of the adjoint) A has an adjoint if and only if
D A is dense in H.
Proof. Assume first that D A is dense. In the case = 0, the vector = 0 satisfies
the equation | = | A for all D A . So we must show that this vector is
unique. More generally, for a given arbitrary , let and satisfying the required
property. Then | = 0 for all D A . By density, we may find a sequence
(n )nN , with values in D A , which converges to . Taking the limit in the scalar
product (which is a continuous map), we obtain k k2 = | = 0
and hence = .
Conversely, assume that D A is not dense. Then the orthogonal of D A in not equal
to {0} (it is a corollary of Theorem 9.21 ). Choose any 6= 0 in this orthogonal space.
Then | = 0 for any D A . Let H. If is a candidate vector such that the
relation | = | A holds for all D A , it is obvious that the vector +
also satisfies this property. So there is no uniqueness and therefore D A = .
Remark 14.37 If D A is dense, it may still be the case that D A is equal to {0}.
In the rest of this chapter, we only consider linear operators with dense domain
in H. So the adjoint is always defined.
PROPOSITION 14.38 (Properties of the adjoint) Let A and B be linear opera-
tors.
i) A is a linear operator;
ii) (A) = A for all C;
390
iii) if A B, then B A ;
D A
6=0
kAk
.
kk
Just as in any other complete normed vector space, the following holds (it
is just a reminder of Theorem A.51 on page 583 and of sidebar 5 on page 294):
PROPOSITION 14.40 A is bounded if and only if A is continuous.
(A) = (, A) ; D A
of H H.
The operator A is closed if and only if its graph is closed,7 that is, if
for any sequence (n )nN with values in D A which converges to a limit , if
moreover the sequence (An )n converges to a vector , then we have D A
and A = .
7
(, )( , ) = | + | .
Linear operators
391
The reader will find the following elementary properties easy to check:
PROPOSITION 14.42 Any bounded operator is closed.
THEOREM 14.43 For any operator A, its adjoint A is closed.
DEFINITION 14.44 (Closable operator, closure) An operator A is closable if
there exists a closed extension of A; the smallest such closed extension is called
the closure of A and is denoted A.
PROPOSITION 14.45 A is closable if and only if the domain of A is dense. In
A , D A be an operator. An eigenvalue of A is a
complex number such that there exists a nonzero vector D A satisfying
the relation
A = .
Such a vector is an eigenvector. The set of eigenvectors of A is the discrete
spectrum of A, and is denoted d ( A).
DEFINITION 14.46 Let
P = i h} .
P = .
This implies
= (i/ h}) , and hence (x) = e i x/ h} , where is a constant. However,
functions of this type are square integrable only if = 0; since the zero function is not an
eigenvector, it follows that P has no eigenvalue: d (P ) = .
Example 14.48 (Discrete spectrum of the position operator X) Consider now the position op-
It follows that for almost all x R we have (x ) (x) = 0. Hence the function is zero
almost everywhere, in other words,8 = 0. Hence X has no eigenvalue: d (X ) = .
8
Since, we recall, the space L2 is constructed by identifying together functions which are
equal almost everywhere. Any function which is zero almost everywhere is therefore identified
with the zero function.
392
The set of generalized eigenvalues of A is called the pure continuous spectrum of A, and is denoted c (A). It is disjoint from the discrete spectrum:
d (A) c (A) = .
Example 14.50 (Continuous spectrum of the operator X) We saw that the operator X of the
Hilbert space L2 (R) has no eigenvalue. On the other hand, a function with graph which is a
very narrow spike around a real number should satisfy X . Hence we will check
that any real number is a generalized eigenvalue, the idea being to construct such a sequence
of spikier and spikier functions. For reasons of normalization, this is not quite the same as a
Dirac sequence.
0
Let x0 R.
p We consider the sequence (n )nN of functions defined by n (x) = n (x x0 )
and 0n (x) = n (nx); recall that the rectangle function satisfies (x) = 1 for |x| 1
and (x) = 0 otherwise.
It is easy to check that the functions n are of norm 1. For all n N, we have
Z +
Z 1/2n
1
(X x0 )n
=
x 2 n (x)2 dx =
nx 2 dx =
.
12
n2
1/2n
Remark 14.51 In the previous example, the sequence (n )nN converges neither in L 2 (R) nor
x (x x0 ) = x0 (x x0 ),
in other words X T = x0 T in the sense of distributions. This is the (generalized) eigenvector
associated to the generalized eigenvalue!
Defining properly what is a (tempered) distribution which is an eigenvector T of an
operator A requires a proper definition of what is meant by A T . This is in fact a delicate
problem, which we will only solve for a certain type of operators (see Definition 14.68).
Example 14.52 (Continuous spectrum of the operator P ) Consider the operator P on the space
L2 (R). We have already shown that it has no eigenvalue. On the other hand, we can now show
that any p R is a generalized eigenvalue. The idea is, once more, to find a function which
looks like the function p (which suffers itself from the essential defect of not being square
integrable). For this purpose, we will simply average the functions , for values of close
to p. Such an average is, up to unimportant factors, the same as the Fourier transform of
a function similar to the rectangle function: hence it is a function that decays like 1/x at
infinity, and is therefore square integrable.
To be more precise, define
r
Z p+
h} sin(x/ h})
1
def
()
(x) d =
p (x) = p
ix
2 p
393
for x R and > 0. It is easily checked that this function is of norm 1 in L2 (R), and that
2
(P p) ()
= Cnt 2 ,
p
so that taking = 1/n with n going to infinity shows that p is indeed a generalized eigenvalue.
Later on (see Theorem 14.82) we will see that the same operator defined on L2 [0, a] has
no generalized eigenvalue.
To finish this section, we remark that, in addition to the continuous spectrum and discrete spectrum, there exists a third type.
DEFINITION 14.53 Let A be an operator with dense domain D A . The residual spectrum of A, denoted r ( A), is the set of complex numbers such
that
/ d (A)
and
d (A ).
The spectrum of A is the union of the three types of spectra:
14.4
Hermitian operators; self-adjoint operators
The goal of this short section is to prove the following result, which is a
generalization of Theorem 14.7: any self-adjoint operator admits an orthogonal basis
of eigenvectors, where the basis may be a generalized basis (i.e., its elements may
be distributions). This fact is absolutely fundamental for quantum mechanics,
where an observable is defined, according to various textbooks, as being a
hermitian operator which has an orthogonal basis of eigenvectors or, better,
as a self-adjoint operator. In contrast with the case of finite dimension, the
notions of hermitian operator and of self-adjoint operator no longer coincide,
and this causes some difficulties.
Finally, we assume in this section that H = L2 (R).
Example 14.54 (A classical problem of quantum mechanics) Consider a particle in an infinite
potential sink, confined to the interval [0, a]. The operator P satisfies | P = P | for
all functions , D P . However, P has no eigenvalue, and no generalized eigenvalue. Hence
P is not an observable.
394
14.4.a
Definitions
A| = | A .
iii) A A .
D A = D A
and
D A
A = A.
In a finite-dimensional space, it is equivalent for an operator to be hermitian or self-adjoint. Here, however, there is a crucial distinction: the domain
of the adjoint of a hermitian operator A may well be strictly larger than that
of A. In other words, A may be acting in the same way as A on the domain
of A, but it may also act on a larger space. Disregarding this distinction
may lead to unwelcome surprises (an apparent paradox of quantum mechanics is explained in Exercise 14.5). It is often easy to check that an operator is
hermitian, but much more delicate to see whether it is, or is not, self-adjoint.
If A is defined on the whole of H, then things simplify enormously; since
A A , it follows that the domain of A is also equal to H and so if A is
hermitian, it is in fact self-adjoint. The situation is even simpler than that:
THEOREM 14.58 (Hellinger-Toeplitz) Let A be a hermitian operator defined on H.
395
n+ = dim Ker(A i)
and
i) A is self-adjoint;
ii) A is closed and n+ = n = 0;
iii) Im(A i) = H.
bounded.
The operator X on the space L2 (R) is self-adjoint and unbounded.
Proof
Consider the operator X on H = L2 [0, a]. Its domain is the whole of H.
Moreover, for any , H, we have
Z +
Z +
X | =
x (x) (x) dx =
(x) x (x) dx = | X ,
X
from which we derive in fact that |||X ||| a. (As an exercise, the reader is invited to
prove that |||X ||| = a.)
Consider now the operator X on the space L2 (R). The domain of X is
D X = H ; x 7 x (x) H .
By the same computation as before, it follows easily that X is hermitian. Lets prove
that it is even self-adjoint.
Let D X and write = X . Then, for any D X , we have X | =
| , by definition of X . Expressing the two scalar products as integrals, this gives,
for any D X ,
Z +
(x) x (x) (x) dx = 0.
396
not self-adjoint.
()
()
(See Remark 14.33 on page 388 for the meaning of these boundary conditions.) The
subspace D P is dense. Moreover, for any , D P , we have
Za
i h} (x) (x) dx
P | =
0
Za
= i h}
(x) (x) dx + i h} (a) (a) (0) (0) = | P ,
0
since the boundary terms cancel out from condition (). This establishes that P is
symmetric.
However the domain of P is strictly larger than that of P . Indeed, any function
which satisfies only the condition () is such that P | = | P for any D P .
Using techniques similar to that in the proof of Theorem 14.62, it is possible to prove
the converse inclusion, namely, that
D P = ; , L2 [0, a] .
On D P , the operator P is defined by P = i h}. So the adjoint of P acts in the
same manner as P on the domain of P , but it has a larger domain.
14.4.b Eigenvectors
The first important result concerning hermitian operators is in perfect analogy with one in finite dimensions: eigenvectors corresponding to distinct
eigenvalues are orthogonal.
THEOREM 14.65 Let A be a hermitian operator. Then:
397
Remark 14.66 Concerning point iv), there is nothing a priori that ensures that such a basis of
eigenvectors exists.
T , = | =
(x) (x) dx.
A = | A ,
A T = T A .
S is stable under A;
A T | = T | A .
A T = T ,
in other words if T | A = T | for any S .
398
Remark 14.71 There exists a much more general version of this theorem, which applies to any
self-adjoint operator. This generalized spectral theorem states that there always exists a spectral
family, but this notion is beyond the scope of this book. It is discussed in many books, for
instance [71, III.5, 6], [95, p. 424], and [38, vol. 4, ch.1, 4].
2
PROPOSITION
14.72 The
momentum operator P on the space L (R) is of class S .
Proof. We already know, by Theorem 14.25, that this family is a generalized basis.
The orthonormality is the statement that
p, p R
p| p = (p p ).
We now check that, denoting by p the regular distribution associated to the function
p (x) = (2 h})1/2 e i x/ h} , we have P p = p p .
Indeed, let S . Then we have
P p = p P
by Definition 14.68
f
= p i h} = i h} (p)
= p (p)
e
since F [ ] (p) = (i p/ h}) (p)
e
= p p = p p
since p is real.
()
Remark 14.73 The distribution p is, up to normalization, the limit of the p defined in
Example 14.52.
2
PROPOSITION
14.74
The operator X on L (R) is of class S . Moreover, the
X (x x0 ) = (x x0 ) X = (X )(x0 ) = x0 (x0 ) = x0 (x x0 ) .
a i,k = i | A k .
The question is: is it true or not that the knowledge of the infinite matrix
M = (a ik ) i,k is sufficient to reconstruct entirely the operator A?
399
P
P
P
=
xk k
A =
a ik xk i .
(14.1)
i=0
k=0
k=0
i | A |k = i | Ak
and derive
A | =
X
i=0 k=0
| i i | A |k k | ,
The difficulty is that these formulas are not always correct, since the righthand side of (14.1) is not always correctly defined for all D A . It is easy
to see that, if A is bounded, then by continuity everything works out. More
generally:
THEOREM 14.76 (Matrix representation) If A be a self-adjoint operator, then it
with M M and M .
This is where the trap lies for those who manipulate the notation blindly. The definition
def
| A = A |, which is often found, permits two readings of the expression | A |:
Those are not quite equivalent always because of the issue of the exact domain of an operator.
Note that, for a self-adjoint operator, there is no ambiguity; for a hermitian operator, mistakes
can arise because of this [41].
P
P
10
2
For instance, if
i=0 |a ik | < + for all k N, the operator A : k 7
i=0 a ik i
defines, by density of Vect{ i ; i N}, a closed operator, which is hermitian if aki = a ik .
400
k=0
|k k | .
n=0
that is,
D A
n |n n | ,
A | =
and
X
n=0
n n | |n ,
X
2
2n n | converges.
D A if and only if
With this the formal calculus of the operator A can be defined:
P
F (A) =
F () |n n | ,
n=0
with domain
n
o
X
D F (A) = H ;
F ()2 |n | |2 < + .
n=0
All the n are in D F (A) , hence D F (A) is also dense. In fact we have [95, 5.8]:
THEOREM 14.79 The operator F (A) is self-adjoint.
X
def
B=
n |n n |
n=0
is a self-adjoint operator.
This technique can provide the extension of an operator which is merely
hermitian, but admits an orthonormal basis of eigenvectors, to a self-adjoint
operator defined therefore on a larger domain, and which is the only one
for which many computations make sense.
401
h}2
D
H0 (x) =
+V (x) (x),
2m
where V is a given function. In many cases, H0 is essentially self-adjoint,
that is, the closure H0 is self-adjoint; this closure of H0 is then given by
H0 = H0 . The self-adjoint extension H = H0 , is the hamiltonian operator
of the system. It is this operator which is observable. In practice, it is
obtained by finding an orthonormal basis (n )nN for H0 and applying the
method above.
Example 14.80 (Hamiltonian for the harmonic oscillator) Consider a harmonic oscillator with
1
h}.
H0 |n = En |n
with En = n +
2
The functions n are of the type exp(m2 x 2 /2 h}) hn (x), where hn is a Hermite polynomial
of degree n [20, Ch. V C.2]. The operator H0 is not self-adjoint.
Let us now define the operator H by
P
P
H=
En |n n | ,
i.e.,
H | =
En n | |n
H0 | =
n=0
n=0
2
for any function | such as the series En2 n | converges.
The operator H is self-adjoint; it is an extension of H0 , which is called the hamiltonian
of the harmonic oscillator.
P
D P = { H ; H}.
p p = (p p ),
(orthonormal basis)
Z +
Id =
| p p| d p.
(closure relation)
Proof. See Example 14.52 for the continuous spectrum and Theorem 14.72 for the
basis of eigenvectors.
402
The operator P on L2 [0, a] is hermitian (symmetric), but not self-adjoint. Its domain
is
D P = H ; H and (0) = (a) = 0 .
of the quantity (p p0 )2 , which was denoted (p p0 )2 in the section on uncertainty
relations. It can be checked that the quantity (p p0 )2 is smallest for p0 = p,
hence
e
2 =
(p p )2 (p)2 .
(p)
0
By the uncertainty relations (Theorem 13.17), we know that p h}/2x h}/2a,
which finally implies
2
2
e
h} .
4a
2
2
Consequently, it is not possible to make
p 7 (p p0 ) (p)
e
=
(P p0 )
as
small as possible, so that p0 is not a generalized eigenvalue.
One easily checks that the domain of P is D P = { H ; H}, and that any
complex is an eigenvalue of P , with associated eigenvector : x 7 e i x/ h} . This
proves that r (P ) = C.
x x = (x x ),
Z +
Id =
|x x| dx.
(orthonormal basis)
(closure relation)
Exercises
403
Proof. See Theorem 14.62 to show that X is self-adjoint. See Remark 14.50 for the
spectrum.
Proof. See Theorem 14.62 to show that X is self-adjoint. The computation of the
spectrum is left as an exercise; the proof in Remark 14.50 can be used as a clue.
EXERCISES
Exercise 14.2 Check that any eigenvector is also an eigendistribution.
Exercise 14.3 Let A be a linerar operator defined on the subspace D A = W . For any
complex number , the operator A = A Id is also defined on W . One and only one of
the following four cases applies for (see, e.g., [94]):
Show that those definitions agree with those in Definitions 14.46 and 14.49.
Note that in some books (for instance [4]), the definitions somewhat different, and the
discrete and continuous spectrum, for instance, are not necessarily disjoint. (This may actually
be desirable to account for natural situations, such as some hyperbolic billiards arising in
arithmetic quantum chaos, where there is a seemingly chaotic discrete spectrum embedded
in a very regular continuous spectrum; see, e.g., [78].)
Exercise 14.4 (Continuous spectrum of X) Consider the Hilbert space L 2 [a, b ]. Show that
X has no eigenvalue, but that any [a, b ] is a generalized eigenvalue, that is, d (X ) =
and c (X ) = [a, b ].
Exercise 14.5 (A paradox of quantum mechanics, following F. Gieres [41])
Consider a particle, in an infinite potential sink, constrained to remain in the interval [a, a].
Physical considerations impose that the wave functions satisfy (a) = (a) = 0. The
hamiltonian is the operator on L2 [a, a] defined by H = h}2 /2m. Assume H is a
self-adjoint operator.
11
404
Exercise 14.6 (Uncertainty principle) Let A and B be two self-adjoint operators on H (i.e.,
observables). Let H be a normed vector (i.e., a physical state) such that
DA DB ,
A D B ,
B D A .
and
The average value of A and the uncertainty on A in the state are defined by
def
a = | A
Show that
and
a = kA ak .
a b 21 [A , B] ,
def
where [A , B] = A B B A.
SOLUTIONS
Solution of exercise 14.5.
i) Solving the eigenvalue equation leads to the discrete spectrum given by:
En =
h}2 2 n2
8ma 2
(n 1)
n x
2mEn x
1
1
n (x) = p cos
= p cos
a
h}
a
2a
p
2mEn x
1
1
n x
n (x) = p sin
= p sin
a
h}
a
2a
if n is odd,
if n is even.
15 h}4
.
8m 2 a 4
2
Moreover, since
fourth
the
derivative of is zero, it is tempting to write H = 0 and,
2
consequently, H = 0; which is quite annoying since H is self-adjoint, and one
would expect to have identical results.
The point is that one must be more careful with the definition of the operator H 2 .
Indeed, the domain of H is
Solutions of exercises
405
hence H
/ D H and therefore it is not permitted to compute H 2 as the composition
H (H ). (It may be checked that the elements of D H 2 must satisfy (a) = 0.) On
the other hand, using the orthonormal Hilbert basis (n )nN , we can write
H=
En |n n |
En2 |n n | .
n=1
and, using the techniques of Section 14.4.d, Definition 14.78, it is possible to define a
new operator, denoted H 2 , by
def
H2 =
n=1
H 2 n = H 2 n .
2 def X
15 h}4
En2 | n n | =
H =
,
8m 2 a 4
n=1
2
i.e., that H = H | H as it should be.
Solution of exercise 14.6. First of all, notice that a and b are real numbers since A and B
are hermitian operators (Theorem 14.65). Also, notice that
(A B B A) = | A B | B A = A| B B| A = 2Im A| B .
Since a and b are real, we can expand
(A a)(B b ) = A| B ab b a + ab kk2 = A| B ab ,
and obtain
(A B B A) = 2 Im (A a)(B b ) .
a b (A a)(B b ) = Im (A a)(B b )
1
Im A| B = [A , B] .
2
12
This is of course linked to the Fourier series of , which converges uniformly but cannot
be differentiated twice termwise.
Chapter
15
Green functions
This chapter, properly speaking, is not a course on Green functions, and it does not
really introduce any new object or concept. Rather, through some simple physical
examples, it explains how the various techniques discussed previously (Fourier and
Laplace transforms, conformal maps, convolution, differentiation in the sense of
distributions) can be used to solve easily certain physical problems related to linear
differential equations.
The first problem concerns the propagation of electromagnetic waves in the vacuum. There the Green function of the dAlembertian is recovered from scratch, as
well as the retarded potentials formula for an arbitrary source distribution.
The second problem is the resolution of the heat equation, using either the Fourier
transform, or the Laplace transform.
Finally, it will be seen how Green functions occur naturally in quantum mechanics.
15.1
Generalities about Green functions
Consider a linear system with input signal I and output (or response)
signal R. This is described by an equation of the type
(R) = I .
What we called input signal and output signal could be many things, such
as:
electrical signals in a circuit (for instance, power as input and response
of a component as output);
408
Green functions
charges and currents as input and electromagnetic fields as output;
heat sources as input and temperature as output;
e]
e Ge = 1
e
G = F [1/ D
D
Ge = 1/ D
or
or
or
D G = =
=
=
b
b Gb = 1
b
G 1/ D.
D
Gb = 1/ D
Despite the simplicity of this outline, difficulties sometimes obstruct the way;
we will see how to solve them.
1
2
Still, an example which is not translation invariant is treated in Section 6.2.a on page 165.
It suffices to put D = ().
409
15.2
A pedagogical example: the harmonic oscillator
We treat here in detail a simple example, which contains the essential difficulties concerning the computation of Green functions, without extraneous
complications (temporal and spacial variables together).
A Green function of the harmonic oscillator is a function t 7 G (t) that
is a solution, in the sense of distributions, of the equation
G (t) + 02 G (t) = (t),
(15.1)
(E )
can be solved, where f (the exterior excitation) is a continuous function, and where we put
0 = 1. This is done in two steps:
Resolution of the homogeneous equation. We know that the space of solutions of the
associated homogeneous equation (namely y + y = 0) is a two-dimensional vector space, a
basis of which is, for instance, (u, v) = (cos, sin).
Search for a particular solution. We use the method of variation of the constants,
generalized for order 2 equations, as follows: look for a solution of the type y0 = u + v,
where and are functions to be determined. Denoting by Wu,v the Wronskian matrix
associated to the basis (u, v), which is defined by
u v
Wu,v =
,
u v
those functions are given by
0
cos t
1
=
W
=
u,v
f
sin t
that is,
sin t
cos t
,
f (t)
= sin(t) f (t)
and
= cos(t) f (t),
so that (up to a constant, which can be absorbed in the solution of the homogeneous equation)
we have
Zt
Zt
and
(t) =
cos(s) f (s) ds.
(t) = sin(s) f (s) ds
0
Using the trigonometric formula sin(t s) = sin t cos s cos t sin s, we deduce
Zt
y0 (t) =
f (s) sin(t s) ds.
0
410
Green functions
the other hand, it may be a little disappointing: we found a single expression for a Green
function. More precisely, we found the unique inverse image of + in D+ , and not the
one in D (see Theorem 8.33 on page 239). This is quite normal: by definition of the unilateral
Laplace transform, only causal input functions are considered (hence they are in D + ).
(15.2)
in the Fourier world. A first trial for a solution, rather naive, leads to
1
Ge() = 2
.
(naive solution)
0 42 2
+
,
(15.3)
2
2
0 42 2
Thus, the general solution of the equation are given by
Zt
y(t) =
f (s) sin(t s) ds + cos t + sin t .
|
{z
}
0
free part
We recognize, for t > 0, the convolution of the function G : t 7 H (t) sin t and the truncated
excitation function H (t) f (t). The function G is the causal Green function associated to the
equation of the harmonic oscillator (see Theorem 8.33 on page 239).
411
in the limit 0.
As seen in Section 8.1.c on page 225, we may add to 0 a positive or
negative imaginary part i, which amounts to closing the contour by small
half-circles either above or under the poles. Replacing each occurence of 0 by
0 + i, or in other words, putting
1
1
1
e
G () =
,
4 0 + 0 + i 0 i
,
G () = lim+
0 4 0 + 0 i
0 i
or, equivalently, put = = 1/40 :
1
1
1
pv
+ i ( + 0 ) pv
i ( 0 ) .
Ge(ret) () =
4 0
+ 0
0
In the inverse Fourier transform formula
Z
G (ret) (t) = Ge(ret) () e 2it d,
Green functions
412
1
Since the inverse Fourier transform of 2i
pv 1 + 2 is H (t), it follows that
1
1
1
T.F.1
+ ( + 0 ) H (t) e i0 t
pv
2i
+ 0
2
and
1
1
1
T.F.1
+ ( 0 ) H (t) e i0 t .
pv
2i
0
2
Hence the inverse Fourier transform of Ge(ret) is
1
1
H (t) i e i0 t ie i0 t =
H (t) sin(0 t).
G (ret) (t) =
20
0
Remark 15.2 It is no more difficult, if the formula is forgotten and no table of Fourier
transforms is handy, to perform the integration along the contour indicated using the residue
theorem. The sign of t must then be watched carefully, since it dictates whether one should
close the integration contour from above or from below in the complex plane.
In the case discussed here, for t < 0, the imaginary part of must be negative in order
that the exponential function e 2i t decay fast enough to apply Jordans second lemma. In
this case, the integrand has no pole inside the contour, and the residue theorem leads to the
conclusion that G (ret) (t) = 0. If, on the other hand, we have t > 0, then we integrate from
above, and the residue at each pole must be computed; this is left as an exercise for the reader,
who will be able to check that this leads to the same result as previously stated.
In the following, the integration will be executed systematically using the method of
residues.
1
4 0
or, equivalently
Ge(ad) () =
1
1
,
+ 0 + i 0 + i
1
1
1
pv
i ( + 0 ) pv
+ i ( 0 ) ,
4 0
+ 0
0
1
H (t) sin(0 t)
0
and
G (ad) (t) =
1
H (t) sin(0 t).
0
413
1
sin 0 |t| .
2
t R
t
t0
Remark 15.3 Other complex-valued Green functions can also be defined. Those are of limited
interest as far as classical mechanics is concerned, by are very useful in quantum field theory
(see the Feynman propagator on page 431).
Green functions
414
15.3
Electromagnetism and the dAlembertian operator
DEFINITION 15.4 The dAlembertian operator or simply dAlembertian is
1 2
.
c 2 t2
(Certain authors use a different sign convention.)
def
15.3.a
/0
=
.
(15.4)
A
0 j
Hence we start by finding the Green function associated to the dAlembertian,
that is, a the distribution G ( r , t) satisfying the same differential equation
with a source ( x) (t) which is a Dirac distribution, in both space and time
aspects:
G ( r , t) = ( r ) (t)
(15.5)
(Green function of the dAlembertian).
def
def
Put A( r , t) = ( r, t), A( r, t) and j( r, t) = ( r , t), j ( r , t) . The general
solution of (15.4) is given by
ZZZ Z +
A( r , t) = [G j]( r , t) =
G ( r r , t t ) j( r , t ) dt d3 r
R3
415
z2
2
k
G ( k, z) = 1.
c2
c2
,
z2
c 2 k2
(15.6)
However, this integral is not well defined, because the integrand has a pole.
To see what happens, we start by simplifying the expression by means of an
integration in spherical coordinates where the angular components can be
easily dealt with. So we use spherical coordinates on the k-space, putting
d3 k = k 2 sin dk d d, with the polar axis in the direction of r . We have
5
To avoid sign mistakes or missing 2 factors, etc, use the following method: start from
the relation G = , multiply by e i ( k r z t) , then integrate. Then a further integration by
parts leads to the equation above.
6
We know that there exist distributions (free solutions) satisfying
2
z
2
k
T ( k, z) = 0,
c2
Green functions
416
therefore
ZZZZ
1
c2
e i(kr cos z t) k 2 sin dk d d dz
2
2
2
(2)4
c k z
Z +
Z +
c2
k
i
z
t
=
sin(kr)
e
dz dk.
2 2
2
43 r 0
c k z
G ( r , t) =
For the evaluation of the integral in parentheses, there is the problem of the
pole at z = ck. To solve this difficulty, we move to the complex plane
(that is, we consider that z C), and avoid the poles by deforming the
contour of integration. Different choices are possible, leading to different
Green functions.
Remark 15.5 It is important to realize that, with the method followed up to now,
there is an ambiguity in (15.6). The ways of deforming the contour lead to radically
different Green functions,7 since they are defined by different integrals. These functions
differ by a free solution (such that f = 0). The solution is the same as in the
preceding section concerning the harmonic oscillator: take the principal value of the fraction at
issue, or this principal value with an added Dirac distribution i at each pole.
We will make two choices, leading to functions which (for reasons that
will be clear at the end of the computation) will be named G (ad) ( r , t) and
G (ret) ( r , t):
given by the path
G (ad) ( r , t)
ck
ck
and
ck
ck
First, we compute the retarded Green function. For negative values of the
time t, the complex exponential e i z t only decays at infinity if the imaginary
part of z is positive; hence, to be able to apply the second Jordan lemma, we
must close the contour in the upper half-plane (see page 120):
R +
ck
7
ck
Radically different for a physicist. Some may be causal while some others are not, for
instance.
417
Since the poles are located outside this contour, the integral is equal to
zero:
G (ret) ( r , t) 0
for t < 0.
For positive values of t, we have to close the contour in the lower half-plane,
and both poles are then inside the contour. Since the path is taken with the
clockwise (negative) orientation, we find that for t > 0 we have
i z t
Z
e
e i z t
dz
=
2i
Res
;
z
=
ck
2 2
2
c 2k 2 z 2
(ret) c k z
i i ckt
=
e
e i ckt ,
ck
and hence
Z +
c2
ik i ckt
i ckt dk
sin(kr)
e
e
43 r 0
ck
Z +
c
=
e ik(r c t) e ik(r c t) e ik(r +c t) + e ik(r +c t) dk
2
8 r 0
Z +
c
=
e ik(r c t) e ik(r +c t) dk
2
8 r
c
=
(r c t) (r + c t) ,
4r
R ik x
since e dk = 2 (x) (see the table page 612). Because t is now strictly
positive and r 0, the distribution (r + c t) is identically zero on the whole
space. In conclusion, we have obtained
G (ret) ( r , t) =
G (ret) ( r , t) =
c
(r c t)
4r
for t > 0.
Note that this formula is in fact also valid for t < 0, since the remaining
Dirac distribution is then identically zero.
Similarly, the advanced Green function is zero for positive times and is
given for all t by
c
(r + c t) for t < 0,
4r
G (ad) ( r , t) 0
for t > 0.
G (ad) ( r , t) =
Physical interpretation
The retarded Green function is represented by a spherical shell, emitted at
t = 0 and with increasing radius r = c t. The fact that this pulse is so localized
explains why a light flash (from a camera, for instance) is only seen for a time
Green functions
418
equal to the length of time it was emitted8 but with a delay proportional to
the distance. Moreover, the amplitude of the Green function varies like 1/r,
which is expected for a potential.
(ret)
(ret)
( r , t),
A (ret) ( r , t) = G (ret) 0 j ( r , t),
( r , t) = G
0
or
(ret)
c
( r , t) =
40
ZZZ Z
R3
( r , t )
k r r k c (t t ) dt
kr r k
d3 r .
0
A(ret) ( r , t) =
d3 r .
4
k r r k
R3
Remark 15.6 Only a special solution has been described here. Obviously, the potential in
space is the sum of this special solution and a free solution, such that A = 0. If boundary
conditions are imposed, the free solution may be quite difficult to find. In two dimensions, at
least, techniques of conformal transformation may be helpful.
Lower-dimensional cases
We consider the analogue of the previous problem in one and two dimensions,
starting with the latter. First, it must be made clear what is meant by twodimensional analogue. Here, it will be the study of a system described by a
potential field satisfying the dAlembert equation in dimension 2. Note that the
ensuing theory of electromagnetism is extremely different from the theory of
standard electromagnetism in dimension 3. Especially notable is that the
8
419
Coulomb potential is not in 1/r but rather logarithmic,10 and the magnetic
field is not a vector but a simple scalar.11
What about the Green function of the dAlembertian? It is certainly possible to argue using a method identical with the preceding one, but it is simpler
to remark that it is possible to pass from the Green function in dimension 3
to the Green function in dimension 2 by integrating over one of the space
variables:
THEOREM 15.7 The Green functions of the dAlembertian in two and three dimen-
G(2) (x, y, t) =
G(3) (x, y, z , t) dz ,
which may be seen as the convolution of G(3) with a linear Dirac source:
G(2) (x, y, t) = G(3) (x , y , z , t ) (x x) ( y y) (t t) 1(z ).
Proof. Recall that G(3) (x, y, z, t) = (x) ( y) (z) (t). Passing to the Fourier
transform (and putting c = 1 in the proof), we obtain the relation
(z 2 k x2 k 2y kz2 ) G (k x , k y , kz , z) = 1,
def
(15.7)
R +
which holds for all k x , k y , kz and z. We now put T (x, y, t) = G(3) (x, y, z , t) dz and
compute (2) T . Passing again to the Fourier transform, we obtain with T = F [T ]
that
(z 2 k x2 k 2y ) T (k x , k y , z) = (z 2 k x2 k 2y ) G(3) (k x , k y , 0, z),
which is equal to 1 according to equation (15.7) for kz = 0. Hence we get
(z 2 k x2 k 2y ) T (k x , k y , z) = 1,
or
(2) T (x, y, t) = (x) ( y) (t),
which is the equation defining the Green function of the dAlembertian in two dimensions, and therefore G(2) = T (up to a free solution).
r 2 + z 2 c t dz ,
p
4
r 2 + z 2
where
10
def
r=
x 2 + y 2.
420
Green functions
Once more, this function is identically zero for negative values of t. For
positive values of t, we use the rule (explained in Theorem 7.45, page 199)
X 1
def
(z y)
f (z) =
with Z = y ; f ( y) = 0 ,
f ( y)
yZ
which is valid if the zeros of f are simple and isolated (i.e., if f ( y) does not
vanish at the same time as f ( y)). Applying this formula gives in particular
Z +
X g( y)
.
f (z ) g(z ) dz =
f ( y)
yZ
p
In the case considered we have f (z ) = r 2 + z 2 c t and
p
z
z1 = c 2 t 2 r 2 ,
f (z ) = p
Z = {z1 , z2 } with
z2 = z1 .
r 2 + z 2
After calculations, we find
c
p
2
2
2
2
(ret)
G(2)
(x, y, t) = 2 c t (x + y )
if c t > (x 2 + y 2 ),
(15.8)
otherwise.
The reader can check (using whichever method she prefers) that the Green
function of the dAlembertian in one dimension is given by
c /2 if c t > x,
G(1) (x, t) =
0
if c t < x.
12
As was envisioned by the pastor and academic Edwin Abbott (18391926) in his extraordinary novel Flatland [1].
13
Physiologically at least.
421
(ret)
G(2)
(r, t)
t1
t2
c t1
c t2
Fig. 15.1 The retarded Green function of the dAlembertian in two dimensions,
for
p
increasing values t2 > t1 of the variable t. In the abscissa, r = x 2 + y 2 .
15.3.d Radiation
What is the physical interpretation of the difference G (ad) G (ret) ?
We know that the vector potential is given, quite generally, by
ZZZZ
A(r) = free solution +
G (r r ) j(r ) d4 r ,
but there are two possible choices (at least) for the Green function, since either
the retarded or the advanced one may be used (or any linear combination
14
In fact, this expression is somewhat ambiguous since, for x = 0, the Dirac distribution
is singular and yet it is multiplied by a discontinuous function. The same ambiguity was,
however, also present in the first expression, since it was not specified what G was at t = 0...
Green functions
422
If the sources j are localized in space and time, then letting t in (15.10),
the integral in the right-hand side vanishes, since the Green function is retarded. The field A(in) can thus be interpreted as an incoming field, coming
from t = and evolving freely in time. One may also say that it is the
field from the past, asymptotically free, which we would like to see evolve since
t = without perturbing it.
Similarly, the field A(out) is the field which escapes, asymptotically freely
for t , that is, it is the field which, if left to evolve freely, becomes
(arbitrarily far in the future) equal to the total field for t +.
The difference between those two fields is therefore what is called the field
radiated by the current distribution j between t = and t = +. It is
equal to
ZZZZ
(ray)
(out)
(in)
A (r) = A
(r) A (r) =
G (r r ) j(r r ) d4 r ,
def
where
Hence the field A(ray) (r) gives, at any point and any time, the total field radiated by the distribution of charges. It is in particular interesting to notice
that, contrary to what is often stated, the unfortunate advanced Green function G (ad) is not entirely lacking in physical sense.
15.4
The heat equation
DEFINITION 15.9 The heat operator in dimension d is the differential oper-
ator
2
def
d
with d = 2 + + 2 ,
t
x1
xd
where c > 0 and > 0 are positive constants.
def
D=c
15
Recall that what is meant by free is a solution of the homogeneous Maxwell equations
(without source).
15.4.a
423
One-dimensional case
2
c 2 T (x, t) = (x, t),
(15.12)
t
x
7 2i
and
t
Hence the equation satisfied by G is
so that17
2
7 42 k 2 .
x2
42 k 2 + 2i c G (k, ) = 1,
G (k, ) =
(42 k 2
1
.
+ 2i c )
There only remains to take the inverse Fourier transforms. We start with the
variable . We have
Z
e 2it
Ge(k, t) =
d.
(42 k 2 + 2i c )
16
A kernel is a function K that can be used to express the solution f of a functional problem,
depending on a function g, in the form
Z
f (x) = K (x ; x ) g(x ) dx
(for instance, the Poisson, Dirichlet, and Fejr kernels already encountered). The Green functions are therefore also called kernels.
17
Up to addition of a distribution T satisfying 42 k 2 + 2i c T (k, ) = 0.
Green functions
424
for t < 0.
This is a rather good thing, since it shows that the heat kernel is causal.
Remark 15.10 It is the presence of unique purely imaginary pole, with strictly positive imag-
inary part, which accounts for the fact that there is no advanced Green function here. In
other words, this is due to the fact that we have a partial derivative of order 1 with respect to t
and the equation has real coefficients.
In the case of quantum mechanics, the Schrdinger equation is formally identical to the
heat equation, but it has complex coefficients. The pole is then real, and both advanced and
retarded Green functions can be introduced.
For positive values of the time variable t, we close the contour in the upper
half-plane, apply Jordans second lemma, and thus derive
e 2it
2ik 2
e
G (k, t) = 2i Res
; =
(42 k 2 + 2i c )
c
=
e 4
2 k 2 t/c
1 x 2 /a 2
2 2
F 1 e a k =
e
|a|
p
(which we apply with a = 2 t/c); the inverse Fourier transform in the
space variable is easily performed to obtain
1
c x 2
exp
G (x, t) = p
2 c t
4t
for t > 0.
(15.13)
Notice that, whereas temperature is zero everywhere for t < 0, the heat
source, although it is localized at x = 0 and t = 0, leads to strictly positive
temperatures everywhere for all t > 0. In other words, heat has propagated at
infinite speed. (Since this is certainly not literally the case, the heat equation
is not completely correct. However, given that the phenomena of thermal
diffusion are very slow, it may be considered that the microscopic processes
are indeed infinitely fast, and the heat equaton is perfectly satisfactory for
most thermal phenomena.)
425
Remark 15.11 The typical distance traveled by heat is x 2 = (4/c )t. This is also what
Einstein had postulated to explain the phenomenon of diffusion. Indeed, a particle under diffusion in a gas or liquid does not travel in a straight line, but rather follows a very roundabout
path, known as Brownian motion, which is characterized in particular by x 2 t. The
diffusion equation is in fact formally identical with the heat equation.
The gaussian we obtained in (15.13), with variance x 2 t, can also be interpreted
mathematically as a consequence of a probabilistic result, the central limit theorem, discussed
in Chapter 21.
The problem with an arbitraty heat source (15.12) is now easily solved, since
by linearity we can write
Z Z +
T (x, t) = [G ](x, t) =
G (x , t ) (x x , t t ) dt dx . (15.14)
Note, however, that this general solution again satisfies T (x, 0) = 0 for all
x R. We must still add to it a solution satisfying the initial conditions at
t = 0 and the free equation of heat propagation.
Initial conditions
We now want to take into account the initial conditions in the problem of
heat propagation. For this, we impose that the temperature at time t = 0 be
given by T (x, 0) = T0 (x) for all x R, where T0 is an arbitrary function.
We need a solution to the free problem (i.e., without heat source), which
means such that
T (x, t) = H (t) u(x, t).
The discontinuity of the solution at t = 0 provides a way to incorporate the
initial condition T (x, 0) = T0 (x). Indeed, in the sense of distributions we
have
T
u
=
+ T0 (x) (t).
t
t
The distribution T (x, t) then satisfies the equation
2
c
+ 2 T (x, t) = c T0 (x) (t),
t
x
the solution of which is given by
Z +
T (x, t) = c T0 (x) (t) G (x, t) = c
G (x x , t) T0(x ) dx .
Hence the solution of the problem that incorporates both initial condition
and heat source is given by
h
i
T (x, t) = T (x, t) + T (x, t) = (x, t) + c T0 (x) (t) G (x, t).
However, it is easier to take the initial conditions into account when using
the Laplace transform, as in the next example.
Green functions
426
T
T =0
t
and
T ( r , 0) = T0 ( r ).
f( k, p) = c Te ( k) Ge( k, p),
T
0
Ge( k, p) =
1
1
1
=
.
2
2
c p + 42 k 2 /c
c p + 4 k
(15.15)
In order to obtain the function G ( r , t), we come back to the expression (15.15)
and use the table of Laplace transforms on page 614:
H (t) 42 k 2 t/c
e
Ge( k, t) =
c
The reader should not, however, get the impression that this is vastly preferable in three
dimensions: we could just as well use the method of the one-dimensional case, or we could
have treated the latter with the Laplace transform.
Quantum mechanics
427
Remark 15.12 We may add a right-hand side to the heat equation: D T ( r, t) = ( r, t), where
is the distribution describing the heat source; it allows us to derive a more general solution
(where the convolution is both on time and space variables)
h
i
T ( r, t) = + c T0 ( r) (t) G ( r, t).
Remark 15.13 Here also the typical distance of heat propagation is x 2 t. This result
(still a consequence of the central limit theorem!) is independent of the dimension of the
system.
15.5
Quantum mechanics
This time, we consider the evolution equation for a wave function:
DEFINITION 15.14 The Schrdinger operator is the differential operator
i h}
h}2
+
V ( r ),
t 2m
2 h} t/2m
= e i H t/ h}
for t > 0,
Green functions
428
m 3/2
2
e im| r| /2 h}t
2i h}t
for t > 0.
(15.17)
h}2
+
=0
t
2m
with initial conditions ( r , 0) = 0 ( r ), we get21 the relation for the propagation of the wave function:
Z
( r , t) = G ( r r , t) 0 ( r ) d3 r
for t > 0.
(15.18)
The function G is called the Schrdinger kernel. Equation (15.18) is therefore
perfectly equivalent to the Schrdinger equation written in the form
i h}
= H
t
def
with H =
h}2
.
2m
(15.19)
Either form may be used as a postulate for the purpose of the formalization
of quantum mechanics.
It should be noted that the general form of the Green function (15.17)
may be obtained through the very rich physical tool of path integration. The
reader is invited to look at the book of Feynman and Hibbs [34] to learn
more about this.
Note also that, using Diracs notation, we can write the evolution equation
of the particle as
d
i h} (t) = H (t) ,
dt
(denoting by | the wave vector of the particle), which can be solved
formally by putting
(t) = e i H t/ h} |0 .
21
Once more, those initial conditions are taken into accound by adding a right-hand side
given by 0 ( r ) (t).
Klein-Gordon equation
429
R
Using the closure relation 1 = | r r | d3 r (see Theorem 14.26) we obtain
for all t that
Z
( r , t) = r (t) = r e i H t/ h} r r 0 d3 r
Z
i H t/ h}
r 0 ( r ) d3 r .
=
r e
G ( r r , t) = H (t) r e i H t/ h} r .
2
G ( p, t) = p e i H t/ h} p = e i p h}t/2m ,
which shows that the two expressions are consistent (and illustrates the simplicity of the Dirac notation).
15.6
Klein-Gordon equation
In this section, the space and time variables will be expressed in a covariant
manner, that is, we will put x = (c t, x) = (x0 , x) and p = (p0 , p). We choose
a Minkowski metric with signature (+, , , ), which means that we put
p2 = p02 p 2 ,
x2 = x02 x 2 ,
and
2
.
t2
Remark 15.15 Note that the sign of the dAlembertian is opposite to that used in
Section 15.3; here we conform with relativistic customs, where usually = with
a metric (+, , , ).
( + m2 ) (x) = j(x),
which is called the Klein-Gordon equation for the field (it originates in
field theory). In this equation, m represents the mass of a particle, j is an
arbitrary source and is the field which is the unknown. As usual, we take
the constants h} and c to be equal to 1.22
22
Green functions
430
G (p) =
p2 m2
1
.
p02 p 2 m2
As in the case of electromagnetism, there will arise a problem with this denominator when performing the inverse Fourier transform. Indeed, we will
try to write an expression like
ZZZZ
1
1
G (x) =
e i px d3 p d p0 .
2
4
(2)
p0 p 2 m 2
During
the process of integrating over the variable p0 , two poles at p0 =
p
p 2 + m2 occur. They must be bypassed by seeing p0 as a variable in the
complex plane and integrating along a deformed contour (see Section 15.2).
For instance, we may go around both poles in small half-circles in the
lower half-plane, which provides the advanced Green function:
G (ad) (t,
x)
p 2 + m2
p 2 + m2
p0 C
p0 C
x)
Hence we put
def
G (ad) (p) =
(p0
p 2 + m2
1
p 2 m2
i)2
p 2 + m2
Klein-Gordon equation
431
and
1
.
(p0
p 2 m2
Taking inverse Fourier transforms, we obtain
ZZZZ
1
e i px
(ad)/(ret)
G
(x) =
d4 p.
(2)4
(p0 i)2 p 2 m2
def
G (ret) (p) =
+ i)2
By the residue theorem, the function G (ret) is zero for t < 0 and G (ad) is
zero for t > 0. Moreover, these Green functions being defined by Lorentz
invariant objects, it follows that G (ret) is zero outside the future light cone (it
is said to be causal in the relativistic sense) and G (ret) outside of the past light
cone. Also, note that both functions are real-valued (despite the presence of
the term i) and that G (ret) (x) = G (ad) (x).
Whereas, in the case $m^2 = 0$ of electromagnetism,²³ we had a simple explicit formula for the Green functions, we would here get Bessel functions after integrating over the variable $\vec p$. The full calculation is not particularly enlightening, and we leave it aside. Note, however, that the Green functions are no longer supported on the light cone as with (15.9) (the case $m^2 = 0$); there is also propagation of signals at speeds less than the speed of light. This is of course consistent with having a nonzero mass!
To close this very brief section (referring the reader to [49] for a wider survey of the properties of the Green functions of the Klein-Gordon equation), we mention that it is possible to introduce a very different Green function, following Stueckelberg and Feynman. Let
$$\widetilde G_F(p) \overset{\text{def}}{=} \frac{1}{p^2 - m^2 + i\epsilon}.$$
Writing $p^2 - m^2 + i\epsilon = p_0^2 - \omega^2 + i\epsilon$ with $\omega = \sqrt{\vec p^{\,2} + m^2}$, the poles in the variable $p_0$ are now located near $+\omega$, slightly below the real axis, and near $-\omega$, slightly above it; the integration contour (the real axis) therefore passes above the positive-energy pole and below the negative-energy one.

[Figure: the contour used for the Feynman propagator $G_F(t, \vec x)$ in the complex $p_0$-plane.]
This last Green function is called the Feynman propagator²⁴; it is complex-valued and does satisfy $(\square + m^2)\, G_F(x) = \delta(x)$, as well as the relation $G_F(-x) = G_F(x)$. Moreover, it can be shown (see [49]) that this propagator is not zero outside the light cone (i.e., in particular, that $G_F(0, \vec r\,)$ is not identically zero for values of $\vec r \neq 0$, contrary to what happens with the advanced and retarded functions).
Remark 15.16 Why does this function not appear in classical physics? Simply because only
G (ad) and G (ret) are real-valued and have a physical meaning that is easy to grasp. The Feynman propagator G F is complex-valued; however, it makes sense in quantum mechanics, where
functions are intrinsically complex-valued.
EXERCISES
Exercise 15.1 Show that the Green function of the dAlembertian is of pulse type in odd
dimensions at least equal to 3, but that otherwise it is not localized.
Exercise 15.2 (Damped harmonic oscillator) A damped harmonic oscillator is ruled by the
ii) Compute the Green function in D+ . One may represent graphically the poles of the
meromorphic function that appears during the calculations.
iii) Find the solution of the equation with an excitation f (some assumptions on f may
be required).
24
Richard Phillips Feynman (1918–1988), American physicist, was one of the inventors of
quantum electrodynamics (the quantum relativistic theory of electromagnetism), which is considered by many as one of the most beautiful current physical theories. With a penetrating
mind and a physical sense particularly acute and clear, he simplified the perturbative methods
and invented the diagrams that were named after him, which are extremely useful both for
making computations and for understanding them. His physics lecture notes are indispensable
reading and his autobiography is funny and illuminating in many respects.
Chapter
16
Tensors
In this chapter, we will give few details concerning the practical handling of tensors,
or the usual calculation tricks, juggling with indices, contractions, recognition of
tensors, etc. Many excellent books [5, 27, 35, 57, 62, 92] discuss those rules (and often
do much more than that!)
Here, the focus will be on the mathematical construction of tensors, in order to
complement the computational viewpoint of the physicist. This chapter is therefore not directed to physicists trying to learn how to tensor, but rather to those
who, having started to manipulate tensors, would like to understand what is hidden
behind.
Also, since this is a purely introductory chapter, we only discuss tensors in a flat
space. There will be no mention of parallel transport, fiber bundles, connections,
or Christoffel symbols. Readers wishing to learn more should consult any of the
classical textbooks of differential geometry (see Bibliography, page 617).
16.1
Tensors in affine space
Let $K$ denote either the real field $\mathbb{R}$ or the complex field $\mathbb{C}$. We identify $E = K^n$, a vector space of dimension $n$, with an affine space $\mathscr{E}$ of the same dimension. We consider $E$ with a basis $B \overset{\text{def}}{=} (e_\mu) = (e_1, \ldots, e_n)$.
16.1.a Vectors
Remark 16.2 Obviously, for a given vector $u$, the $\mu$-th coordinate depends not only on the vector $e_\mu$, but also on all the other vectors of the chosen basis.
Remark 16.3 From the point of view of a physicist, what quantities may be modeled by a vector? If working in $\mathbb{R}^3$, a vector is not simply a triplet of real numbers; rather, it possesses an additional property, related to changes of reference frame: if an observer performs a rotation, the basis vectors turn also, and all triplets that are called vectors are transformed in a uniform manner. A triplet (temperature, pressure, density), for instance, does not transform in the same way, since each component is invariant under rotations (they are scalars).
If $B' = (e'_\nu)$ is another basis of $E$, each of its vectors can be expressed in terms of the basis $B$:
$$e'_\nu = \sum_{\mu=1}^{n} L^\mu_{\ \nu}\; e_\mu. \tag{16.1}$$
The left index in $L^\mu_{\ \nu}$ is the row index, and the right index is the column index. For reasons which will become clearer later on, the first index is indicated as a superscript and the second as a subscript. Symbolically, we write¹
$$B' = L\, B.$$
Let $\Lambda \overset{\text{def}}{=} L^{-1}$ be the inverse matrix of $L$, i.e., the unique matrix with coefficients $(\Lambda^\mu_{\ \nu})$ that satisfies
$$\sum_{\rho=1}^{n} \Lambda^\mu_{\ \rho}\, L^\rho_{\ \nu} = \sum_{\rho=1}^{n} L^\mu_{\ \rho}\, \Lambda^\rho_{\ \nu} = \delta^\mu_\nu$$
for all $\mu, \nu \in [[1, n]]$. The vectors of $B$ can be expressed as functions of those of $B'$ by means of the matrix $\Lambda$:
$$e_\mu = \sum_{\nu=1}^{n} \Lambda^\nu_{\ \mu}\; e'_\nu.$$

¹ Note that this is not a matrix relation. However, one can formally build a line vector with the $e_\mu$'s and write $(e'_1, \ldots, e'_n) = (e_1, \ldots, e_n)\, L$.
By writing the decomposition of a vector in the two bases, we can find the transformation law for the coordinates of a vector:
$$u = \sum_\mu u^\mu e_\mu = \sum_\nu u'^\nu e'_\nu = \sum_{\mu,\nu} u'^\nu \bigl(L^\mu_{\ \nu}\, e_\mu\bigr) \qquad\text{(use (16.1), then rename the indices)},$$
and, identifying the coefficients of $e_\mu$ in both expressions,
$$u^\mu = \sum_{\nu=1}^{n} L^\mu_{\ \nu}\, u'^\nu \qquad\text{and}\qquad u'^\mu = \sum_{\nu=1}^{n} \Lambda^\mu_{\ \nu}\, u^\nu. \tag{16.2}$$
Remark 16.4 These last formulas are also a criterion for being a vector: an n-uple, the com-
ponents of which are transformed according to (16.2) during a change of basis (for example,
when an observer performs a rotation) will be a vector. In what follows, saying that we want
to prove that a certain quantity is a vector will simply mean that we wish to establish the
relations (16.2).
With the Einstein summation convention,² these relations take the condensed form
$$e'_\nu = L^\mu_{\ \nu}\, e_\mu, \qquad e_\mu = \Lambda^\nu_{\ \mu}\, e'_\nu, \tag{16.1'}$$
$$u'^\mu = \Lambda^\mu_{\ \nu}\, u^\nu, \qquad u^\mu = L^\mu_{\ \nu}\, u'^\nu. \tag{16.2'}$$
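A short numerical sketch (not from the book) of formulas (16.1') and (16.2'): the columns of $L$ hold the new basis vectors expressed in the old basis, and the coordinates of a fixed vector change with $\Lambda = L^{-1}$.

```python
import numpy as np

# A random invertible L sends the basis (e_mu) to (e'_nu) via e'_nu = L^mu_nu e_mu
# (columns of L are the new basis vectors in the old basis).  The coordinates of a
# fixed vector u then change with Lambda = L^{-1}, as in (16.2').
rng = np.random.default_rng(0)
n = 4
L = rng.normal(size=(n, n))                 # invertible almost surely
Lam = np.linalg.inv(L)                      # the matrix called Lambda in the text

u = rng.normal(size=n)                      # contravariant coordinates u^mu in the old basis
u_new = Lam @ u                             # u'^mu = Lambda^mu_nu u^nu

# The vector itself is unchanged: sum_nu u'^nu e'_nu has old-basis column L @ u_new.
assert np.allclose(L @ u_new, u)
print("u (old coordinates):", u)
print("u'(new coordinates):", u_new)
```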
Remark 16.5 One should be aware also that many books on relativity theory use the same letter $\Lambda$ to denote both matrices $\Lambda$ and $L$. Two solutions are used to avoid conflicts:
In the book of S. Weinberg [92], for instance, they are distinguished by the position of the indices: one arrangement of the indices on $\Lambda$ stands for what we denote $L^\mu_{\ \nu}$, and the other for our $\Lambda^\mu_{\ \nu}$.
² Albert Einstein (1879–1955), German, then Swiss, then American physicist and mathematician (being Jewish, he fled Nazi Germany), student of Hermann Minkowski, father of special relativity (1905) and then of general relativity (around 1915), used and popularized tensor calculus, in particular by introducing Riemannian geometry into physics.
In the book of Ch. Misner, K. Thorne, and J. A. Wheeler [67], for instance, a primed superscript indicates that $\Lambda$ is understood, and a primed subscript that $L$ is understood.
Each notation has its advantages; the one we chose is closer to classical linear algebra. Here is a short summary of the various conventions in use, where $\bullet$ is used to indicate any tensor object:

[Table: correspondence between the notations of this book ($L$, $\Lambda$), of Weinberg [92], and of Misner, Thorne, and Wheeler [67] for the change-of-basis matrices.]
The maps
$$dx^i : \mathbb{R}^n \longrightarrow \mathbb{R}, \qquad (h^1, \ldots, h^n) \longmapsto h^i,$$
defined for $i = 1, \ldots, n$, are linear forms. Similarly, to the basis $B = (e_\mu)$ of $E$ we associate the dual family of linear forms
$$\varepsilon^\mu : E \longrightarrow K, \qquad u \longmapsto \langle \varepsilon^\mu, u\rangle, \qquad \mu \in [[1, n]],$$
characterized by
$$\langle \varepsilon^\mu, e_\nu\rangle = \begin{cases} 1 & \text{if } \mu = \nu, \\ 0 & \text{if } \mu \neq \nu. \end{cases}$$
Since any linear form can be expressed uniquely in terms of the vectors of
the dual basis (
) , we make the following definition:
DEFINITION 16.11 (Covariant coordinates of a linear form) Let be a linear form, that is, an element in E . The covariant coordinates of the linear
form in the basis (
) are the coefficients appearing in its expression
in this basis:
=
, u e
u e
with =
, e
with u =
, u .
, u =
, u e = u
, e = u = u .
Similarly
, e = , e = = .
If the basis of E is changed, it is clear that the dual basis will also change.
Therefore, let e = L e , where L is an invertible matrix. We have then:
Tensors
438
e = L e
and
for all .
Proof. We have indeed
, e = , L e = L = L = ,
which proves that the family ( ) is the dual basis of (e ) .
It is easy now to find the change of coordinates formula for a linear form $\alpha = \alpha_\mu\, \varepsilon^\mu$, using the same technique as for vectors. We leave the two-line computation as an exercise for the reader, and state the result:
PROPOSITION 16.14 The covariant coordinates of a linear form are changed in the same manner as the basis vectors during a change of basis:
$$\alpha'_\nu = L^\mu_{\ \nu}\, \alpha_\mu.$$
Remark 16.15 The fact that the coordinates of a linear form are transformed in the same manner as the vectors of the basis is the reason they are called covariant. Conversely, the coordinates of a vector are changed in the inverse manner compared to the basis vectors (i.e., with the inverse matrix $\Lambda$), and this explains why they are called contravariant coordinates.
Remark 16.16 Some quantities that may look like vectors actually transform like linear forms. Thus, for a differentiable function $f : E \to \mathbb{R}$, the quantity $\nabla f$ transforms as a linear form. The notation $\nabla f$ for the gradient is in fact slightly awkward. It is mathematically much more efficient to consider instead the linear form, depending on the point $x$, given by
$$df_x = \sum_\mu \frac{\partial f}{\partial x^\mu}(x)\; dx^\mu$$
(this is an example of a differential form; see the next chapter). This linear form associates to any vector $h \in E$ the number $df_x(h)$, which is equal to the scalar product $\nabla f \cdot h$. Hence, to define the vector $\nabla f$ from the linear form $df_x$, we must introduce this notion of scalar product (what is called a metric). We emphasize this again: because a metric needs to be introduced to define it, the gradient is not a vector.
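A numerical illustration of Remark 16.16 (not from the book): under a linear change of coordinates $x = L\,x'$, the partial derivatives of a scalar function pick up the matrix $L$ (they transform covariantly), unlike the components of a displacement. The test function and matrices are arbitrary.

```python
import numpy as np

# Under e'_nu = L^mu_nu e_mu, a point's coordinates satisfy x = L x'.  The partial
# derivatives of a scalar function then transform with L (like the basis vectors),
# i.e. df'_nu = L^mu_nu df_mu.  f is an arbitrary test function for the illustration.
rng = np.random.default_rng(1)
L = rng.normal(size=(3, 3))

f = lambda x: np.sin(x[0]) + x[1] * x[2] ** 2

def grad(h, x, d=1e-6):                       # central finite differences
    return np.array([(h(x + d * e) - h(x - d * e)) / (2 * d) for e in np.eye(3)])

xp = rng.normal(size=3)                       # coordinates in the new basis
x = L @ xp                                    # the same point in the old basis
g = lambda xp: f(L @ xp)                      # f expressed in the new coordinates

print(np.allclose(grad(g, xp), L.T @ grad(f, x), atol=1e-5))   # True
```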
For a linear map $\Phi : E \to E$, the matrix coefficients in the basis $B$ are
$$\Phi^\mu_{\ \nu} = \langle \varepsilon^\mu, \Phi(e_\nu)\rangle,$$
and for any vector $x$, we have $\Phi(x) = \Phi^\mu_{\ \nu}\, x^\nu\, e_\mu$. Hence, under a change of basis, the coordinates of $\Phi$ (i.e., its matrix coefficients) are transformed as follows:
$$\Phi'^\mu_{\ \nu} = \langle \varepsilon'^\mu, \Phi(e'_\nu)\rangle = \langle \Lambda^\mu_{\ \rho}\,\varepsilon^\rho, \Phi(L^\sigma_{\ \nu}\, e_\sigma)\rangle = \Lambda^\mu_{\ \rho}\, L^\sigma_{\ \nu}\, \Phi^\rho_{\ \sigma}.$$
PROPOSITION 16.17 The coordinates of a linear map are transformed, with respect to the superscript, as the contravariant coordinates of a vector, and with respect to the subscript, as the covariant coordinates of a linear form.
Thus, if a map $\Phi$ is represented by the matrix $\Phi^\mu_{\ \nu}$ in the basis $B$ and by $\Phi'^\mu_{\ \nu}$ in the basis $B'$, we have
$$\Phi'^\mu_{\ \nu} = \Lambda^\mu_{\ \rho}\, L^\sigma_{\ \nu}\, \Phi^\rho_{\ \sigma}.$$
This formula should be compared with formula (c.2) on page 596.
Exercise 16.1 Show that the two expressions
$$\Phi(x) = \Phi^\mu_{\ \nu}\, x^\nu\, e_\mu \qquad\text{and}\qquad \Phi(x) = \Phi'^\mu_{\ \nu}\, x'^\nu\, e'_\mu$$
are consistent. (Solution page 462.)
[Example: explicit matrices $L = (L^\mu_{\ \nu})$ and $\Lambda = (\Lambda^\mu_{\ \nu})$ for a particular change of basis.]
16.2
Tensor product of vector spaces: tensors
16.2.a
Note: This section may be skipped on a first reading; it contains the statement
and proof of the existence theorem for the tensor space of two vector spaces.
THEOREM 16.18 Let $E$ and $F$ be two finite-dimensional vector spaces. There exists a vector space $E \otimes F$ such that, for any vector space $G$, the space of linear maps from $E \otimes F$ to $G$ is isomorphic to the space of bilinear maps from $E \times F$ to $G$, that is,
$$\mathscr{L}(E \otimes F, G) \simeq \mathrm{Bil}(E \times F, G).$$
Equivalently, there is a canonical bilinear map $\varphi : E \times F \to E \otimes F$ such that, for any vector space $G$ and any bilinear map $f$ from $E \times F$ to $G$, there exists a unique linear map $\tilde f$ from $E \otimes F$ into $G$ such that $f = \tilde f \circ \varphi$, which is summarized by the following diagram:

[Diagram: $E \times F \xrightarrow{\ \varphi\ } E \otimes F$, with $f : E \times F \to G$ factoring uniquely as $\tilde f : E \otimes F \to G$.]
Exercise 16.2 For $A = (a_{ij}) \in M_n(\mathbb{C})$ and $B \in M_p(\mathbb{C})$, define the block matrix
$$A \otimes B \overset{\text{def}}{=} \begin{pmatrix} a_{11} B & \cdots & a_{1n} B \\ \vdots & & \vdots \\ a_{n1} B & \cdots & a_{nn} B \end{pmatrix} \in M_{np}(\mathbb{C}).$$
Show that the map $(A, B) \mapsto A \otimes B$ gives an isomorphism $M_n(\mathbb{C}) \otimes M_p(\mathbb{C}) \simeq M_{np}(\mathbb{C})$. In particular, deduce that $(E_{ij} \otimes E_{kl})_{i,j,k,l}$ is a basis of $M_n(\mathbb{C}) \otimes M_p(\mathbb{C})$. Show that there exist matrices in $M_{np}(\mathbb{C})$ which are not of the form $A \otimes B$ with $A \in M_n(\mathbb{C})$ and $B \in M_p(\mathbb{C})$.
(Solution page 462).
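For readers who want to experiment with Exercise 16.2, NumPy's `kron` implements exactly the block matrix $A \otimes B$; the rank argument in the comments is one possible route (not necessarily the book's) to exhibit a matrix that is not a pure tensor product.

```python
import numpy as np

# np.kron builds the block matrix (a_ij * B).  Since rank(A (x) B) = rank(A) * rank(B),
# a 2x2 (x) 2x2 product can only have rank 0, 1, 2 or 4; a rank-3 matrix of M_4(C)
# therefore cannot be a single tensor product of two 2x2 matrices.
A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
print(np.kron(A, B))                  # the 4x4 block matrix

M = np.diag([1, 1, 1, 0])             # rank 3, hence not of the form A (x) B
print(np.linalg.matrix_rank(M))       # 3
```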
441
L (E F , K) Bil(E F , K).
For E and F finite-dimensional vector spaces, we have
E F Bil(E F , K),
that is, the tensor product E F is isomorphic to the space of bilinear forms on E F
(see Section 16.2.b).
If (
) and (
) are bases of E and F , respectively, the family
(
)
and in general, with a slight abuse of notation, we identify those two spaces,
meaning that we do not hesitate to write an equal sign between the two.
Remark 16.22 Although it is perfectly possible to deal with the general case where E
are F are distinct spaces, we will henceforth assume that E = F , which is the most
common situation. Generalizing the formulas below causes no problem.
or 02 -tensor is any element in E
E , in other words any bilinear form on E E .
0
2
Coordinates of a $\binom{0}{2}$-tensor

THEOREM 16.24 Let $T \in E^* \otimes E^*$ be a $\binom{0}{2}$-tensor. Then $T$ may be expressed in the basis $(\varepsilon^\mu \otimes \varepsilon^\nu)_{\mu\nu}$ of $E^* \otimes E^*$ by
$$T = T_{\mu\nu}\; \varepsilon^\mu \otimes \varepsilon^\nu, \qquad\text{with}\qquad T_{\mu\nu} \overset{\text{def}}{=} T(e_\mu, e_\nu).$$
Change of coordinates of a $\binom{0}{2}$-tensor

The coordinates of a $\binom{0}{2}$-tensor are transformed according to the following theorem during a change of basis:

THEOREM 16.25 Let $T \in E^* \otimes E^*$ be a tensor given by $T = T_{\mu\nu}\, \varepsilon^\mu \otimes \varepsilon^\nu$. During a change of basis $B' = LB$, its coordinates become
$$T'_{\mu\nu} = L^\rho_{\ \mu}\, L^\sigma_{\ \nu}\, T_{\rho\sigma}.$$
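In matrix language, Theorem 16.25 is the congruence $T' = {}^t\!L\, T\, L$; the sketch below (not from the book) checks this against an explicit `einsum` over random data.

```python
import numpy as np

# T'_{mu nu} = L^rho_mu L^sigma_nu T_{rho sigma}  is the congruence  T' = L^T T L.
rng = np.random.default_rng(2)
n = 3
L = rng.normal(size=(n, n))
T = rng.normal(size=(n, n))                      # coordinates T_{rho sigma} in the basis B

T_new = np.einsum('rm,sn,rs->mn', L, L, T)       # spell out the index gymnastics
print(np.allclose(T_new, L.T @ T @ L))           # True
```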
Since a 2 -tensor is (via the isomorphism in Theorem 16.20) nothing but a
bilinear form on E E , it is a machine that takes two vectors in E and gives
back a number:
T(u, ) : v 7 T(u, v)
is indeed a linear form on E .
This means that the tensor T E E may also be seen as a machine
that turns a vector into a linear form:
T(, ) : E E ,
u 7 T(u, )
443
which does indeed take a vector (denoted u) as input and turns it to the
linear form T(u, ).
In mathematical terminology, showing that a bilinear form on E E
0
(a 2 -tensor), is the same thing as a linear map from E (the space of
vectors) to E (the space of linear forms) is denoted
Bil(E E , R) E E L (E , E ).
This property is clearly visible in the notation used for the coordinates of the
objects involved: if we write T = T , then we have
T(u, ) = T (u, )
, u
= T
= T u ,
(= T
, u
, )
which is a linear combination of the linear forms . The reader can check
carefully that those notations precisely describe the isomorphisms given above,
and that all this works with no effort whatsoever.
A particularly important instance of bilinear forms is the metric, and it
deserves its own section (see Section 16.3).
u v : E E K,
(
, ) 7 (u) (v).
2
-tensor
0
2
-tensor:
0
T = T e e ,
with
def
T = T(
, ).
444
Tensors
Change of coordinates of a
2
-tensor
0
u : E E K,
(
, v) 7
, u
, v .
1
-tensor:
1
def
= , (e )
445
A student of Jacobi and then of Dirichlet in Berlin, Leopold Kronecker (1823–1891) worked in finance for ten years starting when
he was 21. Thus enriched, he retired from business and dedicated
himself to mathematics. His interests ranged from Galois theory
(e.g., giving a simple proof that the general equation of degree at
least 5 cannot be solved with radicals), to elliptic functions, to
polynomial algebra. God created integers, all the rest is the work
of Man is his most famous quote. His constructive and finitist
viewpoints, opposed to those of Cantor, for instance, made him
pass for a reactionary. Ironically, with the advent of computers
and algorithmic questions, those ideas can now be seen as among
the most modern and lively.
e (e ) = , (e ) e (e ) = , (e ) e
, e
= , (e ) e = , (e ) e = (e ),
this is the case, and by linearity it follows that = e .
or 11 -tensor is any element of
E E or, equivalently, any linear map from E to itself.
1
1
Notice, morevoer, that the coordinates of the identity map are the same in
any basis:
THEOREM 16.32 The coordinates of the identity map in any basis (e ) are given
by the Kronecker symbol
1 if = ,
def
=
0 if 6= .
Tensors
446
Change of coordinates of a 11 -tensor
1
The coordinates of a 1 -tensor are neither more nor less than the coefficients of the matrix of the associated linear map. Therefore, Proposition 16.17
1
describes the law that governs the change of coordinates of a 1 -tensor.
T = L T .
Remark 16.35 In other words, each contravariant index of T is transformed using the matrix
and each covariant index is transformed using the matrix L. Compare this formula with the
change of basis formula (equation (c.2), page 596) in linear algebra.
p
q
q times
z
}|
{ z
}|
{
E E E E E E ,
p
q
the space E E
; in other words, it is a multilinear form on the space
E E E E.
Coordinates of a qp -tensor
p
THEOREM 16.38 Let T be a
-tensor. Then T may be expressed in terms of the
q
1
basis e1 e p q 1 p of the space E p E q :
T=
1 p
T1
q
1 q
e1 e p 1 q
1
T1
q
447
def
= T 1 , . . . , p , e1 , . . . , eq .
Change of coordinates of a
p
-tensor
q
1 p
1
tensor. During a basis change B = LB the coordinates T1
q
according to the rule
T 11q p =
For this reason, a
contravariant.
1
1
p
-tensor
q
p
p
1
1
q
q
p
q
are transformed
T11qp .
16.3
The metric, or:
how to raise and lower indices
16.3.a
g:
E 2 R,
(u, v) 7 g(u, v),
such that g(u, v) = g(v, u) for any u, v E , and such that g(u, u) > 0 for any
nonzero vector u.
A pseudo-metric on a finite-dimensional real vector space E is a symmetric
definite bilinear form (i.e., the only vector u for which we have g(u, v) = 0 for
all v E is the zero vector); however, g(u, u) itself may be of arbitrary sign
(and even zero, for a so-called light-like vector in special relativity).
Remark 16.41 A metric (resp. pseudo-metric) g can be seen, with the isomorphisms described
0
2
-tensor.
448
Tensors
g = g dx dx .
Since g is symmetric, it follows that g = g for all , [[1, n]].
Example 16.43 The most important example of a metric is the euclidean metric which, in the
canonical basis (e ) of E , is given by coordinates
1 if = ,
g =
0 if 6= .
g(e , e ) =
for any , .
There can only be an orthonormal basis if the space is carrying a true metric,
and not a pseudo-metric (since the coordinates above describe the euclidean
metric in B).
DEFINITION 16.45 (Minkowski pseudo-metric) The Minkowski space is the space $\mathbb{R}^4$ with the pseudo-metric defined by
$$g_{\mu\nu} = \eta_{\mu\nu} \overset{\text{def}}{=} \begin{cases} 1 & \text{if } \mu = \nu = 0, \\ -1 & \text{if } \mu = \nu = 1, 2, \text{ or } 3, \\ 0 & \text{if } \mu \neq \nu. \end{cases} \tag{16.3}$$
The space $\mathbb{R}^4$ with this pseudo-metric is denoted $M_{1,3}$. This metric was introduced by H. Minkowski as the natural setting for the description of space-time in Einsteinian relativity.
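A small sketch (not from the book) of the Minkowski pseudo-metric in coordinates: lowering an index and computing the invariant $u_\mu u^\mu$; the numerical values are arbitrary.

```python
import numpy as np

# g_{mu nu} of (16.3) and the lowering of an index, u_mu = g_{mu nu} u^nu.
eta = np.diag([1.0, -1.0, -1.0, -1.0])

u_up = np.array([2.0, 1.0, 0.5, -0.3])        # contravariant coordinates u^mu
u_down = eta @ u_up                           # covariant coordinates u_mu

print(u_down)                                 # [ 2.  -1.  -0.5  0.3]
print(u_down @ u_up)                          # u_mu u^mu = 2^2 - (1 + 0.25 + 0.09) = 2.66
print(np.allclose(np.linalg.inv(eta), eta))   # here g^{mu nu} has the same matrix as g_{mu nu}
```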
DEFINITION 16.46 (Scalar product) The scalar product of the vectors u and
v in E is the quantity
The metric
449
def
u v = g(u, v).
g = e e .
Proof. Indeed, we have e e = g(e , e ) = g (e , e )
= g , e
, e = g = g .
DEFINITION 16.48 Let E be a real vector space with a metric g, and let B =
(e ) be a basis of E . The covariant coordinates of a vector u E are the
coordinates, in the basis (
) dual to B, of the linear form e
u E defined
by e
u, v = u v for all v E .
3
Tensors
450
u2
u2
e2
u 1 u1
e1
Fig. 16.1 Contravariant and covariant coordinates of a vector of R2 with the usual
euclidean scalar product. The vectors e1 and e2 have norm 1.
whereas
u = u e .
u =
, u = dx , u
and
u = u e = g(u, e ).
Proof. The first equality is none other than the definition 16.1 of the contravariant
coordinates of a vector. To prove
u e = u , e = u , e = u = u .
axis spanned by the basis vector (requiring a metric to make precise the notion of orthogonality), whereas contravariant coordinates correspond to projections on the axis parallel to all
others (and the metric is not necessary). See figure 16.1.
Exercise 16.4 Show that, in an orthonormal basis, the covariant and contravariant coordinates are identical.
u = g u .
The metric
451
b v
, v =
for all v E .
This result comes from the definiteness of the metric and the fact that E
b
and E are isomorphic in finite dimensions. The explicit construction of
is described in Proposition 16.54.
b.
= , e
and
=
,
(16.4)
We have seen how the metric tensor can be used to lower the contravariant
index of a vector, thus obtaining its covariant coordinates. Similarly, it can be
used to lower the contravariant index of a linear form:
THEOREM 16.55 Let be a linear form. Its covariant and contravariant coordi-
= g .
b = e , and moreover for any vector v E , we have
Proof. By definition,
b v. In particular, = , e =
b e = e e = g , and this,
, v =
together with the relation g = g , concludes the proof.
def
b
g = g e e
b
b , b ) =
b b
g(
, ) = g(
, E .
Tensors
452
The dual first appears in the coordinates of the basis vectors of E and E ,
which are given by the following proposition:
PROPOSITION 16.57 The covariant and contravariant coordinates of e and
denoted [e ] , with square brackets are given by
[e ] = g ,
] = g ,
[
[e ] = ,
[
] = .
since the -th coordinate of e is , e . Finally, the last equality follows from
(
) = , e .
THEOREM 16.58 The matrices g and g are inverse to each other, that is, we
have
g g =
and
g g = .
Proof. First we prove the second formula. According to Proposition 16.57, we have
g = (
) , hence g g = (
) g = (
) by Theorem 16.55. Then the desired
conclusion follows from Proposition 16.57 again.
The first equality is implied by the second.
and
= g .
Proof. This is clear from to the formulas related to lowering indices, since g and
g are inverse to each other.
Of course these results are also valid (with identical proofs!) for
p+1
tensors, which are transformed into q1 -tensors by b
g (i.e., by g ).
p
q
Remark 16.60 This is where the abuse of notation consisting in putting together all
covariant indices (and all contravariant indices) becomes dangerous. Indeed, consider,
for instance, a tensor with coordinates
T ,
Operations on tensors
453
in expanded notation, and suppose now that, to simplify notation, we denote the coordinates
simply
T ,
T or T
?
Only the original noncondensed form can give the answer. If the metric tensor is used, it is
important to be careful to write all indices properly in the correct position.
16.4
Operations on tensors
Theorem 16.59 may be generalized, noting that the tensors g and b
g are
p
p+1
p1
used to transform q -tensors into q1 -tensors or q+1 -tensors, respectively.
For instance, suppose we have a tensor denoted T in physics. We may
lower the index using g , defining
def
T =g T .
This is the viewpoint of index-manipulations. What is the intrinsic mathematical meaning behind this operation?
The tensor T, written
T = T e e ,
is the object which associates the real number
T(
, , v) = T v
to a triple (
, , v) E E E . As to the tensor where one index is
lowered, it corresponds intrinsically to
e = T e : (
T
, u, v) 7 T u v .
(16.5)
Tensors
454
2. transforms it into (
, e
u, v), where e
u is defined by the relation (16.5),
u, v):
3. outputs the real number T(
, e
e : E E E R,
T
(
, u, v) 7 T(
, e
u, v).
The same operation may be performed to raise an index. This time, the
tensor b
g is used in order to define a vector, starting with a linear form (still
because of the metric duality).
To summarize:
THEOREM 16.61 (Raising and lowering indices) The metric tensors g and b
g can
be used to transform a
p
-tensor
q
into a
p+1
-tensor
q1
or a
p1
-tensor,
q+1
respectively.
Another natural operation is the contraction of a tensor. For instance, for a tensor with coordinates $T^{\mu\nu}_{\ \ \rho}$, one may define
$$\widetilde T^{\,\mu} \overset{\text{def}}{=} T^{\mu\nu}_{\ \ \nu}.$$
Notice that, taking an arbitrary basis $(e_\mu)$ and its dual basis $(\varepsilon^\mu)$, we have
$$T(\varepsilon^\mu, \varepsilon^\nu, e_\nu) = T^{\mu\rho}_{\ \ \sigma}\, [\varepsilon^\nu]_\rho\, [e_\nu]^\sigma = T^{\mu\nu}_{\ \ \nu} = \widetilde T^{\,\mu}.$$
Hence it suffices to define the tensor $\widetilde T$ by
$$\widetilde T : E^* \longrightarrow \mathbb{R}, \qquad \omega \longmapsto T(\omega, \varepsilon^\nu, e_\nu).$$
We have thus defined a contraction which, for any choice of one covariant and one contravariant coordinate, transforms a $\binom{p}{q}$-tensor into a $\binom{p-1}{q-1}$-tensor. This operation does not require the use of the metric, and is defined independently of the choice of a basis.
metric, and is defined independently of the choice of a basis.
Change of coordinates
455
16.5
Change of coordinates
One of the most natural question that a physicist may ask is how does a
change of coordinates transform a quantity q?
First, notice that a change of coordinates may be different things:
a simple isometry of the ambient space (rotation, translation, symmetry,...);
a change of Galilean reference frame (which is then an isometry in
Minkowski space-time with its pseudo-metric);
a more complex abstract transformation (going from cartesian to polar
coordinates, arbitrary coordinates in general relativity,...).
16.5.a
Curvilinear coordinates
These functions define a C 1 change of coordinates if (and only if) the map
: Rn Rn
(x 1 , . . . , x n ) 7 (u 1 , . . . , u n )
Tensors
456
Unfortunately, physicists often use the same word for those two very different
concepts! Or, which amounts to the same thing here, they omit to state
whether active or passive transformations are involved.
In what follows, we consider passive transformations: the space is invariant,
as well as the quantities measured; on the other hand, the coordinate system
varies, and hence the coordinates of the quantities measured also change.
Let Y be a point in R, and let e1 (Y), . . . , en (Y) denote the basis vectors
in Rn attached to the point Y. This family does not depend on Y, since X
represents the canonical coordinates on Rn .
x2 = 2
x2 = 1
x2 = 3
x1 = 3
x1 = 2
e2
Y
x1 = 1
e1
x1 = 0
We now want to find the new vectors, attached at the point Y = (Y) and tan-
gent to the coordinate lines, which represent the vectors e1 (Y), . . . , en (Y)
after the change of coordinates. They will then be the new basis vectors. We
denote them e1 (Y ), . . . , en (Y ) .
u2 = 3
u2 = 2
u2 = 1
u2 = 4
u1 = 2
u1 = 1
e1
e2
e2
Y
e1
u1 = 0
Change of coordinates
457
How are they defined? Suppose that we have two nearby points $M$ and $M'$ in space. Then we can write
$$\overrightarrow{MM'} = \delta x^\mu\, e_\mu,$$
where $\delta x^\mu$ is the variation of the $\mu$-th coordinate $x^\mu$ when going from $M$ to $M'$. This same vector, which is an intrinsic quantity, may also be written
$$\overrightarrow{MM'} = \delta u^\mu\, e'_\mu, \qquad\text{with}\qquad \delta u^\mu = \frac{\partial u^\mu}{\partial x^\nu}\, \delta x^\nu.$$
Hence
$$\delta x^\mu\, e_\mu = \frac{\partial u^\nu}{\partial x^\mu}\, \delta x^\mu\, e'_\nu$$
(since $\mu$ and $\nu$ are mute indices), and hence, since the $\delta x^\mu$ are independent and arbitrary,
$$e_\mu = \frac{\partial u^\nu}{\partial x^\mu}\; e'_\nu \qquad\text{or conversely}\qquad e'_\mu = \frac{\partial x^\nu}{\partial u^\mu}\; e_\nu,$$
or again, expanding the notation,
$$\begin{pmatrix} e_1 \\ \vdots \\ e_n \end{pmatrix} = \begin{pmatrix} \dfrac{\partial u^1}{\partial x^1} & \cdots & \dfrac{\partial u^n}{\partial x^1} \\ \vdots & & \vdots \\ \dfrac{\partial u^1}{\partial x^n} & \cdots & \dfrac{\partial u^n}{\partial x^n} \end{pmatrix} \begin{pmatrix} e'_1 \\ \vdots \\ e'_n \end{pmatrix}
\qquad\text{and}\qquad
\begin{pmatrix} e'_1 \\ \vdots \\ e'_n \end{pmatrix} = \begin{pmatrix} \dfrac{\partial x^1}{\partial u^1} & \cdots & \dfrac{\partial x^n}{\partial u^1} \\ \vdots & & \vdots \\ \dfrac{\partial x^1}{\partial u^n} & \cdots & \dfrac{\partial x^n}{\partial u^n} \end{pmatrix} \begin{pmatrix} e_1 \\ \vdots \\ e_n \end{pmatrix}.$$
Note that the matrices
$$\Lambda^\mu_{\ \nu} = \frac{\partial u^\mu}{\partial x^\nu} \qquad\text{and}\qquad L^\mu_{\ \nu} = \frac{\partial x^\mu}{\partial u^\nu}$$
are inverse to each other, since
$$\frac{\partial u^\mu}{\partial x^\rho}\,\frac{\partial x^\rho}{\partial u^\nu} = \delta^\mu_\nu.$$
We recover the notation used in an affine space⁵; the only difference is that, now, the matrices $L$ and $\Lambda$ depend on the point $M$ where the transformation is performed. To summarize:
THEOREM 16.63 (Transformation of basis vectors) During a change of coordinates, the basis vectors transform according to
$$e'_\mu = \frac{\partial x^\nu}{\partial u^\mu}\; e_\nu \qquad\text{and}\qquad e_\mu = \frac{\partial u^\nu}{\partial x^\mu}\; e'_\nu.$$
Note that these formulas are balanced from the point of view of indices, obeying the following rule: a superscripted index of a term in the denominator of a fraction is equivalent to a subscripted index.
The components of a vector, on the other hand, transform according to
$$v'^\mu = \frac{\partial u^\mu}{\partial x^\nu}\; v^\nu \qquad\text{and}\qquad v^\mu = \frac{\partial x^\mu}{\partial u^\nu}\; v'^\nu,$$
that is, following the opposite rule as that used for basis vectors.⁶
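As a concrete illustration (not from the book), polar coordinates in the plane give explicit matrices $L^\mu_{\ \nu} = \partial x^\mu/\partial u^\nu$ and $\Lambda^\mu_{\ \nu} = \partial u^\mu/\partial x^\nu$; the sketch below checks that they are inverse to each other at a point and converts the components of a small displacement.

```python
import numpy as np

# Polar coordinates (u^1, u^2) = (r, theta): x = r cos(theta), y = r sin(theta).
r, th = 2.0, 0.7                                   # an arbitrary point
L = np.array([[np.cos(th), -r * np.sin(th)],       # dx/dr  dx/dtheta
              [np.sin(th),  r * np.cos(th)]])      # dy/dr  dy/dtheta
Lam = np.linalg.inv(L)                             # du/dx at the same point

# Known closed form: dr/dx = cos, dr/dy = sin, dtheta/dx = -sin/r, dtheta/dy = cos/r.
print(np.allclose(Lam, np.array([[np.cos(th),      np.sin(th)],
                                 [-np.sin(th) / r, np.cos(th) / r]])))   # True

dx = np.array([1e-3, -2e-3])                       # small Cartesian displacement
du = Lam @ dx                                      # its components (dr, dtheta)
print(np.allclose(L @ du, dx))                     # back to Cartesian components
```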
5
x
dx .
x
6
One should of course indicate every time at which point the vectors are tangent, which is
also where the partial derivatives are evaluated. In this case, the preceding formulas should be
Change of coordinates
459
Remark 16.65 Here is a way to memorize those formulas. Let x be the new coordinates. We
x
e,
x
e =
x
e;
x
x
e
x
(and then one should check that the index is balanced as it should).
Similarly, to recover the formula giving v in terms of the v s,
we write the formula with a prime sign in the numerator, since the index is superscripted:
v =
x
v ;
x
x
v .
x
The reader is invited to check that the converse formulas (giving v and e in terms of the
and e s) may be obtained with similar techniques.
v s
v (Y) =
x
(Y ) v (Y ),
u
Tensors
460
and
u
.
x
x
.
u
and
dinates
...,
1
p
T(Y) = T1 ,...,
e1 (Y ) en (Y ) 1 (Y) q (Y).
q
T 11,...,qp =
u 1
u p x 1
x q 1 ,..., p
T1 ,...,q .
x 1
x p u 1
u q
and
def
T, = T ,
Change of coordinates
461
16.5.f Conclusion
Le Calcul Tensoriel sait mieux la physique que le Physicien lui-mme.
Tensor calculus knows physics better than the physicist himself does.
Paul Langevin (quoted in [10]).
Delachet [27]; the excellent treatise on modern geometry [30] in three volumes; the bible
concerning the mathematics of general relativity (including all non-euclidean geometry) by
Wheeler7 et al. [67]; or finally, one of the best physics books ever, that of Steven Weinberg [92].
On voit [...] se substituer lhomo faber lhomo mathematicus.
Par exemple loutil tensoriel est un merveilleux oprateur de gnralit ;
le manier, lesprit acquiert des capacits nouvelles de gnralisation.
We see [...] homo mathematicus substituting himself to homo faber.
For instance, tensor calculus is a wonderful operator in generalities; handling it,
the mind develops new capacities for generalization.
Gaston Bachelard, Le nouvel esprit scientifique [10]
7
Tensors
462
SOLUTIONS
Solution of exercise 16.1 on page 439. Using the formulas for change of basis, we get
(x) e = L (x) L e
= L L (x) e
= (x) e = (x) e ,
Solution of exercise 16.2 on page 440. The first questions are only a matter of writing
things down. In the case n = p = 2, we see, for instance, that
0 1
0
1
2 3 2 3 1 1 0
0 0
0
2
0
2
2
0 0
4
6
but
0
0
0
0
cannot be a tensor product of two matrices.
0
1
0
0
0
0
2
0
0
0
3
7
1
,
3
Chapter
17
Differential forms
In all this chapter, E denotes a real vector space of finite dimension n (hence
we have E Rn , but E does not, a priori, have a euclidean structure, for
instance).
17.1
Exterior algebra
17.1.a 1-forms
DEFINITION 17.1 A linear form or exterior 1-form on E is a linear map
from E to R:
: E R.
The dual of E is the vector space E of linear forms on E .
Let B = ( b 1 , . . . , b n ) be a basis of E . Any vector x E can be expressed
in a unique way as combination of vectors of the basis B:
x = x 1 b1 + + x n bn.
i
b i ( x) = b i , x = x i
or
b , b j = ij
i, j [[1, n]].
Differential forms
464
Proof.
i ), we have obviously (by linearity)
P Let E . Then, denoting i = (b
= i i b i , which P
shows that the family (b 1 , . . . , b n ) generates E . It is also a
free family (indeed, if i b i = 0, applying this linear form to the vector b j yields
j = 0), and hence is a basis.
with i = ( b i ).
for all x, y E .
2 ( x, y) = 2 ( y, x)
det : E E R,
x 1
( x, y) 7 det( x, y) = 2
x
B
is an exterior 2-form on R2 .
y 1
,
y2
denoted 2 (E).
( x, y) 7 x 1 y 2 x 2 y 1
dx j dx i dx i dx j = dx i dx j dx j dx i for all i, j [[1, n]].
It is sufficient to keep the elements of this type with i < j; there are n(n1)/2
of them.
THEOREM 17.6 A basis of 2 (E ) is given by the family of the exterior 2-forms
def
dx i dx j = dx i dx j dx j dx i ,
Exterior algebra
465
def
Those two families are bases, respectively, for the space 2 (E ), which is the
0
0
space of antisymmetric 2 -tensors, and for the space of symmetric 2 -tensors.
PROPOSITION 17.7 Any
tensor.
0
-tensor
2
0
-tensors.
k
into itself; in other words, permutes the integers from 1 to n. A transposition is any permutation which exchanges two elements exactly and leaves the
others invariants. The set of permutations of [[1, n]] is denoted Sn .
The composition of applications defines the product of permutations.
Any permutation may be written as a product of finitely many transpositions. The signature of a permutation $\sigma \in \mathfrak{S}_n$ is the integer $\epsilon(\sigma)$ given by: $\epsilon(\sigma) = 1$ if $\sigma$ can be written as a product of an even number of transpositions, and $\epsilon(\sigma) = -1$ if $\sigma$ can be written as a product of an odd number of transpositions.¹ For any permutations $\sigma$ and $\tau$, we have
$$\epsilon(\sigma\tau) = \epsilon(\sigma)\,\epsilon(\tau).$$
DEFINITION 17.9 An exterior k-form, or more simply a k-form, is any an-
1
Implicit in this definition is the fact, which we admit, that the parity of the number
of transpositions is independent of the choice of a formula expressing as a product of
transpositions.
Differential forms
466
Example 17.10 The most famous example is that of an exterior k-form on a space of dimension
k, namely, the determinant map expressed using the canonical basis: if ( e 1 , . . . , e k ) is the
canonical basis of Rk , and if i Rk for i = 1, . . . , k, then denoting i = i1 e 1 + + ik e k ,
this is a map
det :
(Rk )k R,
11
( 1 , . . . , k ) 7 detcan ( 1 , . . . , k ) = ...
k1
...
...
1k
.. ,
.
kk
an exterior 2-form.
k (E).
We will now find a basis of k (E ).
To generalize the corresponding result of the preceding section for 2 (E ),
we introduce the following definition:
DEFINITION 17.13 Let k N and let i1 , . . . , ik [[1, n]]. Then the exterior
k-form dx i1 dx ik is defined by the formula
X
def
dx i1 dx ik =
() dx (i1 ) dx (ik ) .
Sk
Example 17.14 Consider the space $\mathbb{R}^4$ and denote by $x, y, z, t$ the coordinates in this space. Then we have
$$dx \wedge dy \wedge dz = dx \otimes dy \otimes dz - dx \otimes dz \otimes dy + dy \otimes dz \otimes dx - dy \otimes dx \otimes dz + dz \otimes dx \otimes dy - dz \otimes dy \otimes dx.$$
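The antisymmetrized sum of Definition 17.13 can be evaluated directly; the small sketch below (not from the book) checks that $dx \wedge dy \wedge dz$ applied to three vectors of $\mathbb{R}^4$ is the $3\times 3$ determinant of the corresponding components. The helper names are illustrative only.

```python
import numpy as np
from itertools import permutations

def sign(p):
    """Parity of a permutation, via its number of inversions."""
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def wedge(idx, vectors):
    """Evaluate dx^{i1} ^ ... ^ dx^{ik} (signed sum over permutations) on k vectors."""
    k = len(idx)
    return sum(sign(p) * np.prod([vectors[a][idx[p[a]]] for a in range(k)])
               for p in permutations(range(k)))

v = [np.array([1.0, 2.0, 0.0, 1.0]),
     np.array([0.0, 1.0, 3.0, -1.0]),
     np.array([2.0, 0.0, 1.0, 4.0])]

print(wedge((0, 1, 2), v))                                  # dx ^ dy ^ dz on (v1, v2, v3)
print(np.linalg.det(np.array([u[[0, 1, 2]] for u in v])))   # the same 3x3 determinant
```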
i
dx 1 dx ik ; 1 i1 < < ik n
n
is a basis of k (Rn ), the dimension of which is equal to the binomial coefficient k .
0
Hence any exterior k-form (i.e., any antisymmetric k -tensor) can be written in the
form
P
k =
i1 ,...,ik dx i1 dx ik .
i1 <<ik
Exterior algebra
467
that is, when no ambiguity arises, repeated indices (appearing once as superscript and once as subscript) are assumed to be summed
in strictly increasing
n
order; so if there are k indices, the sum is over k terms instead of nk terms.
For the special case k = n, we deduce the following result:
THEOREM 17.16 The space n (Rn ) is of dimension 1.
COROLLARY 17.16.1 Any exterior n-form on Rn is a multiple of the form deter-
of and is defined by
= ( ).
def
)( x,
( x) ( x)
,
y) =
( y) ( y)
Differential forms
468
such that
1 ( x 1 ) . . . p ( x 1 )
def
.. .
1 p ( x 1 , . . . , x p ) = ...
.
1 ( x p ) . . . p ( x p )
Exterior product of two arbitrary exterior forms
We now introduce the general definition of the exterior product of two exterior forms.
DEFINITION 17.22 (Exterior product) Let k, N , k k (E ) and
( x (k+1) , . . . , x (k+) ).
Example 17.25 Let and be exterior 1-forms. Then using Definition 17.22 we get
( x, y) = ( x) ( y) ( x) ( y),
we have
469
(Rn ) =
def
n
M
k (Rn ).
k=0
17.2
Differential forms on a vector space
17.2.a
Definition
A differential form of
degree k on R n of C p -class, or differential k-form, is a map of C p -class
: Rn k (Rn ),
x 7 ( x).
Differential forms
470
+ + fn1,n dx n1 dx n
1i< jn
f i j dx i dx j = f i j dx i dx j .
where the exterior form is evaluated will be omitted from the notation, and we will
often write instead of ( x), when no ambiguity exists. Thus, for a 2-form ,
( 1 , 2 ) will be the real number obtained by applying the exterior 2-form ( x) to
the vectors 1 and 2 in E .
d x =
x
x
x
dx +
dy +
dz,
x
y
z
x
y
y
z
z
x
i1 ,...,ik
x
dx .
471
It follows that
dd =
2 i1 ,...,ik
x x
dx dx dx i1 dx ik .
But since the form is of C 2 -class, Schwarzs theorem on the commutativity of second
partial derivatives implies that
2 i1 ,...,ik
x x
2 i1 ,...,ik
and
x x
dx dx = dx dx ,
17.3
Integration of differential forms
The general theory of integration of differential forms is outside the scope
of this book. Hence the results we present are only intended to give an
idea of the flavor of this subject. The interested reader can turn for more
details to a book of differential geometry such as Cartan [17], Nakahara [68],
or Arnold [9]. Before discussing the general case, we mention two special
situations: the integration of n-forms on Rn and of 1-forms on any Rn .
DEFINITION 17.33 (Integral of a differential n-form) Let be a sufficiently
regular domain in Rn (so is an n-dimensional volume) and let be a
differential n-form. Then there exists a function f such that we can write
( x) = f ( x) dx 1 dx n ,
and the integral of on is defined to be
Z
Z
def
=
f ( x) dx 1 dx n .
Differential forms
472
: [0, 1] Rn ,
t 7 (t) = x 1 (t), . . . , x n (t) .
f dx + g dy =
f (s)
dx
ds +
ds
g (s)
dy
ds.
ds
This recovers the notion already seen (in complex analysis) of contour integration.
In the general case, how do we integrate a differential k-form? Most importantly, on what domain should one perform such an integral? Take a 2-form,
for instance. At every point of this integration domain, we need two vectors.
This is easily given if this domain is a smooth surface, for the two vectors can
then be a basis of the tangent plane. A differential 2-form must be integrated along
a 2-dimensional surface.
As an example, we consider a 2-form 2 in the space R3 . We denote it
= f x dy dz + f y dz dx + f z dx dy.
(Note that we have changed conventions somewhat, using dz dx instead of
dx dz; this only amounts to changing the sign of f y .) Let S be an oriented
473
where n is the normal vector to the surface. The first integral involves the
differential form , whereas the integral on the second line is an integral of
functions.
This is easy to memorize: it suffices to replace dx dy by dz and, similarly
dy dz by dx and dz dx by dy.
To generalize this result, it is necessary to study the behavior of a differential form during a change of coordinates. This is done in sidebar 6. One shows
that a differential k-form may be integrated on a surface of dimension k,
that is, there appears a duality2 between
differential 1-forms and paths (curves);
differential 2-forms and surfaces;
differential 3-forms and volumes (of dimension 3) or three-dimensional
surfaces in Rn ;
etc.
The following result is then particularly interesting:
THEOREM 17.36 (Stokes) Let be a smooth (k + 1)-dimensional domain with k-
This formula is associated with many names, including Newton, Leibniz, Gauss, Green,
Ostrogradski, Stokes, and Poincaré, but it is in general called the Stokes formula.
Example 17.37 The boundary of a path : [0, 1] Rn is made of only the points b = (1)
and a = (0) . Let us consider the case n = 1 and let be a differential 0-form, that is, a
2
The following is meant by duality: the meeting of a k-form and a k-surface gives a
number, just as the meeting of a linear form and a vector gives a number.
Differential forms
474
0 dt
17.4 Poincaré's theorem
DEFINITION 17.38 (Exact, closed differential forms) A differential form is
closed if d = 0.
A differential k-form is exact if there exists a differential (k 1)-form
A such that = dA.
THEOREM 17.39 Any exact differential form is closed.
Proof. Indeed, if is exact, there exists a form A such that = dA, and we then
have d = ddA = 0.
Note that an exact differential 1-form is none other than a differential (of
a function). Indeed, if is an exact differential 1-form, there exists a 0-form,
that is, a function f , such that = d f .
THEOREM 17.40 (Schwarz) Let be a differential 1-form. If we write =
fj
fi
=
j
x
xi
Knowing whether the converse to Theorem 17.39 holds, that is, whether
any closed form is exact, turns out to be of paramount importance. The
answer depends in a crucial way on the shape and structure of the domain of
definition of the differential form.
DEFINITION 17.41 (Contractible open set) A contractible open set in Rn is
[0, 1]
C (x, ) = (1 ) x.
In one important special case, a more general condition than contractibility is sufficient to ensure that a closed form is exact. Indeed, for differential
1-forms, we have
THEOREM 17.45 Let Rn be a simply connected and connected open set.
Remark 17.46 Be careful that this theorem is only valid for differential 1-forms. In
general, it is not sufficient for a closed differential form to be defined on a simply
connected set in order for it to be exact. We will see below (see page 480) a simple
consequence of this for the field created by a magnetic monopole.
Notice that any contractible subset is simply connected (it suffices to look
at the image of a path under the contraction C to see that it also contracts
to a single point), but the converse is not true.
Differential forms
476
Consider, for instance, the 1-form
$$\omega = \frac{(x - y)\, dx + (x + y)\, dy}{x^2 + y^2}$$
in $\mathbb{R}^2$. It is easy to check that $d\omega = 0$ on $\mathbb{R}^2 \setminus \{(0,0)\}$, using Schwarz's theorem, for instance. However, $\omega$ is not exact. One may indeed integrate $\omega$ to get
$$f(x, y) = \frac{1}{2}\log(x^2 + y^2) - \arctan\frac{x}{y} + \text{Cst},$$
17.5
Relations with vector calculus:
gradient, divergence, curl
17.5.a
dx dy+
dydz+
dzdx,
x
y
y
z
z
x
Stokess formula gives
d =
S
477
or, putting $f = (f_x, f_y, f_z)$ and using the fact that the components of $d\omega$ are the components of the curl $\operatorname{\mathbf{curl}} f$,
$$\iint_S \operatorname{\mathbf{curl}} f \cdot d^2\vec\sigma = \oint_{\partial S} f \cdot d\vec\ell, \tag{17.1}$$
or, in other words still, the flux of the curl of $f$ through a surface is equal to the circulation of $f$ along the boundary of this surface (Green-Ostrogradski formula).
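A numerical check of (17.1) (not from the book): for an arbitrarily chosen field $f$ and the unit disk in the plane $z = 0$, the flux of $\operatorname{curl} f$ and the circulation of $f$ along the boundary circle agree.

```python
import numpy as np

# f = (y^2 - x, x*y + 3*x, 0); curl f has only a z-component, dQ/dx - dP/dy = 3 - y.
Nr, Nt = 400, 400
r = (np.arange(Nr) + 0.5) / Nr                               # midpoint rule in r and theta
t = (np.arange(Nt) + 0.5) * 2 * np.pi / Nt
R, T = np.meshgrid(r, t, indexing="ij")
Y = R * np.sin(T)

flux = np.sum((3.0 - Y) * R) * (1.0 / Nr) * (2 * np.pi / Nt)  # integrand * r dr dtheta

theta = (np.arange(100_000) + 0.5) * 2 * np.pi / 100_000      # boundary circle
x, y = np.cos(theta), np.sin(theta)
P, Q = y**2 - x, x * y + 3 * x
circulation = np.sum(P * (-np.sin(theta)) + Q * np.cos(theta)) * (2 * np.pi / 100_000)

print(flux, circulation)        # both close to 3*pi ~ 9.4248
```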
Similarly, if is a differential 2-form given by
def
= f x dy dz + f y dz dx + f z dx dy,
its exterior derivative is then
fx f y fz
d =
+
+
dx dy dz = (div f ) dx dy dz,
x
y
z
Note that the relations just stated between exterior derivatives and the
gradient, divergence and curl operators show that the identity d2 0 may
simply be expressed, in the language of the man in the street, as
div curl 0
and
curl grad 0.
dx dy+
dydz+
dzdx,
x
y
y
z
z
x
478
Differential forms
(17.2 )
Poincaré's theorem (the second version, Theorem 17.45) ensures that there
exists a differential 0-form, that is, a function : R R, such that
= d =
dx +
dy +
dz.
x
y
z
This is one of the homogeneous Maxwell equations, giving the global structure of fields in
the setting of electrostatics. Hence all magnetic phenomena have been neglected.
4
This amounts to mixing up the logical statements (P Q ) and ( Q P )!
5
And who, moreover, would be quite incapable of finding a counterexample: it is true that
E comes from a potential, but for a very different reason.
479
on .
(17.3)
= f1 dx 2 dx 3 + f2 dx 3 dx 1 + f3 dx 1 dx 2 .
Then the exterior derivative of is simply
f1
f2
f3
d =
+
+
dx 1 dx 2 dx 3 ,
x1 x2 x3
and the condition (17.3) can be expressed again simply as
d = 0.
Poincaré's theorem (Theorem 17.44 now) ensures that if $\Omega$ is contractible, there
exists a differential 1-form A = A1 dx 1 + A2 dx 2 + A3 dx 3 such that
= dA
A2 A1
A3 A2
1
2
=
dx dx +
dx 2 dx 3
x2
x2
x3
x1
A1 A3
+
dx 3 dx 1 ,
x3
x1
which, putting A = (A1 , A2 , A3 ) and comparing with the definition of ,
shows that
f = A = curl A.
THEOREM 17.49 (Existence of the vector potential) Let R3 be a contrac-
tible open subset and let f : R3 be a vector field with zero divergence on :
div f 0. Then there exists a vector potential A : R3 such that f = A.
In the space R3 , the magnetic field satisfies the relation div B 0, and
this independently of the distribution of charges. It follows that the magnetic
field, on R3 which is contractible, comes from a vector potential A.
Remark 17.50 In the general setting of electromagnetism, one first proves the existence of the
vector potential A; then, from the equation curl E = B/ t, it follows that curl(E +
A/ t) = 0 and hence the existence of the scalar potential such that
E =
A
t
Differential forms
480
(17.4)
The relation div B 0 is satisfied now only on R3 \{0}, which is not contractible.
It is possible to show then that there does not exist a vector potential A defined
on R3 \ {0} such that we have B = A at every point of R3 \ {0}. But
since a local description of the magnetic field using a vector potential remains
possible, it is possible to show that one may defined two potentials A1 and
A2 defined on a neighborhood of the upper half-space (resp. lower half-space)
such that the relations A i = B for i = 1, 2 hold on the respective domains
where the vectors are defined. For instance, A1 and A2 may be defined on
the following grayed domains:
0
A1
0
A2
17.6
Electromagnetism in the language
of differential forms
Physicists see the vector potential usually as a vector, but it should rather
be seen as a differential form. Indeed, if A itself does not have an intrinsic
meaning,H the integral of A along a path has one: for a closed path , the
integral A d is equal to the flux of the magnetic field through a surface
S bounded by (see formula (17.1)).
481
This is why, in special relativity, the components of the potential fourvector (vector potential + scalar potential) are denoted in contravariant notation:
A = A dx .
The Faraday tensor (which is a differential 2-form or equivalently a field
0
of 2 -tensors) is defined as the exterior derivative of A:
F = dA, namely, F = 12 F dx dx = 12 A A dx dx ,
F12 = B z ,
F13 = B y ,
F03 = E z ,
F23 = B x ,
+
dt dy dz
y
z
t
E x Ez B y
+
dt dz dx
+
z
x
t
E y E x Bz
+
dt dx dy.
+
x
y
t
If we reorder the terms, we obtain the Maxwell equations without source:
without source.
div B = 0
Differential forms
482
4
3-forms is of dimension 3 = 4 also. Using the Hodge operator, an isomorphism between both spaces is constructed:
dx ik+1 dx in
(n k)! i1 ,...,ik ,ik+1 ,...,in
with complete summation on repeated indices. For any k [[0, n]], the Hodge
operator is an isomorphism between the space of k-forms and the space of
(n k)-forms.
Example 17.52 Consider the case n = 3. Then dx =
Similarly
(dx dy) =
1
(31)!
dy dz dz dy = dy dz.
1
dz = dz.
(3 2)!
J = dx dy dz j x dt dy dz j y dt dz dx jz dt dx dy.
The equation
d F = J
(17.5)
is then equivalent to
E
=j
t
div E =
curl B
Maxwell equations
with source.
d J =
+
+
+
dt dx dy dz 0,
t
x
y
z
483
0
Ex
Ey
Ez
E
0 B z
By
,
F = (F ) = x
Bz
0 B x
E y
E z B y
Bx
0
0 B x B y B z
0
E z E y
F = ( F ) = B x
.
B E
0
Ex
y
z
Bz
E y E x
0
1
= F ,
2
0
Ex
Ey
0
B ,
F = E x
E y
B
0
which shows that the electric field has two components, whereas the magnetic field is a scalar
field (or rather a pseudo-scalar, which changes sign during a change of basis that reverses the
orientation).
484
Differential forms
PROBLEM
Problem 6 (Proca lagrangian and Cavendish experiment) We want to change the Maxwell
equations in order to describe a photon with nonzero mass. For this purpose, we start
from the lagrangian description of the electromagnetic field, and add a mass term to the
lagrangian. Then we look for detectable effects of the resulting equations.
2 2 2
A =
A A ,
2
2
is added to the classical lagrangian to obtain the Proca lagrangian
2
1
def
A A .
LP = F F J A /0 +
4
2
Write the equations of movement associated with this lagrangian. Show that, in
Lorenz gauge, they are given by
( + 2 )A = J /0 .
3. Express the relation between the frequency of an electromagnetic wave and its wave
number. Show that has the dimension of mass (still with h} = c = 1). Using the
de Broglie relations, show that the waves thus described are massive.
4. Consider the field created by a single, immobile, point-like particle situated at the
origin. Show that only the component A0 = is nonzero. Show that this scalar
potential is of the Yukawa potential type and is equal to
q
e r
( r) =
.
40
r
What is the typical decay distance for this potential?
II. In 1772, Cavendish realized an experiment to check the 1/r decay of the Coulomb
potential [50].
Consider two hollow spheres, perfectly conducting, concentric with respective radii R1
and R2 , with R1 < R2 . At the beginning of the experiment, both spheres are electrically
neutral, and the exterior sphere is given a potential V . Then the spheres are joined
(with a thin metallic wire for instance).
6
That is, the Maxwell equations for the electromagnetic field, and the Newton equation
with Lorentz force for particles.
485
1. Show that, if the electrostatic potential created by a point-like charge follows the
Coulomb law in 1/r , the inner sphere remains electrically neutral.
2. We assume now that the electromagnetic field is described by the Proca lagrangian
and that the electric potential created by a point-like charge follows the Yukawa
potential. Show that this potential satisfies the differential equation
( +2 ) = /0 .
3. Show that the electric field has a discontinuity /0 when passing through a surface
density of charge .
4. Compute the potential at any point of space after the experiment is ended (both
spheres are at the same potential V ). Deduce what is the charge of the inner sphere.
(One can assume that is small compared to 1/R1 and 1/R2 and simplify the
expressions accordingly.)
5. Knowing that Cavendish, in his time, could not measure any charge on the inner
sphere, and that he was able to measure charges as small as 109 C, deduce an upper
bound for the mass of a photon. Take V = 10 000 V and R1 R2 = 30 cm.
Any comments?
SOLUTION
Solution of problem 6.
I. 1. In Lorenz gauge, one finds F = J , which is indeed the covariant form of the
Maxwell equations.
2. Immediate computation.
3. Notice first that, in analogy with the Klein-Gordon equation, the additional term
deserves to be called a mass term. If we consider a monochromatic wave with
frequency and wave vector k, the equation of movement in vacuum (+2 ) A =
0 gives 2 + k 2 + 2 = 0. The de Broglie relations link frequency and energy on
the one hand, and momentum and wave vector on the other hand. We obtain then
E 2 = p 2 + 2 , which may be interpreted as the Einstein relation between energy,
momentum, and mass, taking as the mass. In fact, it is rather h}/c which has
the dimension of a mass.
4. We have J = (q, 0); the components Ai for i = 1, 2, 3 therefore are solutions of
the equation ( + 2 ) Ji 0 and are zero up to a free term.
The component = A0 , on the other hand, satisfies
d
+ 2 = q ( r )/0 ,
dt
which is independent of time. Looking for a stationary solution, we write therefore
( +2 ) = q ( r)/0 ,
which (see Problem 5, page 325) can be solved using the Fourier transform and a
residue computation, and yields the formula stated. The typical decay distance is
therefore 1 .
II. 1. When the exterior sphere is charged with the potential V , the potential at any point
exterior to the two spheres is (using for instance Gausss theorem on the electric
field)
R
( r) V 2
for all r > R2 .
r
Differential forms
486
The inner potential, on the other hand, is uniform and equal to V (same method).
In particular, the inner sphere is already at the same potential V . The electric wire
is also itself entirely at the potential V . Hence there is no electric field, no current
is formed, and the inner sphere remains neutral.
2. If electromagnetism follows the Proca lagrangian, then the Poisson law is not valid
anymore, and consequently, Gausss theorem is not either. The potential created by
a point-like charge being
( r ) =
1
e r ,
40 r
it satisfies the differential equation (2 ) = /0 (which is indeed the differential equation of Question I.2 in stationary regime); by convolution, the potential
created by a distribution of charge therefore satisfies (2 ) = /0 (recall
that ( ) = () ).
div E = = /0 2 .
Consider a charged surface element, surrounded by a surface S .
S
Then we have
ZZ
E dS =
ZZZ
d3 r =
ZZZ
2
0
d3 r .
R1 sinh(r )
.
r sinh(R1 )
487
A e R1 + B e R1 = R1 V ,
A e R2 + B e R2 = R2 V .
(r ) =
V
R sinh (R2 r ) + R2 sinh (r R1 )
r sinh(R) 1
r r=R1
r r=R1
R2
= V coth R1 + coth R
= .
R1 sinh R
0
Thus, for small , we have
=
2 V 0 R2
(R + R2 ).
6 R1 1
2 2
V R1 R2 0 (R1 + R2 ).
3
5. With the data in the text, since $Q \lesssim 10^{-9}$ C, we obtain $\mu^{-1} \gtrsim 6$ m, or, adding correctly the proper factors $\hbar$ and $c$,
$$\mu \lesssim 3 \times 10^{-43}\ \text{kg},$$
which is a very remarkable result. (Compare, for instance, with the electron mass $m_e = 9.1 \times 10^{-31}$ kg.)
Differential forms
488
f : S Rn ,
x 7 f ( x).
(Rk )k R,
( v1 , . . . , vk ) 7 f ( v 1 , . . . , vk ) = ( f v1 , . . . , f vk ).
def
def
S
Rk
Rn
Chapter
18
Groups
and group representations
18.1
Groups
DEFINITION 18.1 A group (G , ) is a set G with a product law defined on
G G , such that
490
0G + g = g + 0G = g;
Rx
Ry
Ry
Rx
z
y
x
491
18.2
Linear representations of groups
A group may be seen as an abstract set of objects, together with a table
giving the result of the product for every pair of elements (a multiplication
table). From this point of view, two groups (G , ) and (G , ) may well have
the same abstract multiplication table: this means there exists a bijective map
: G G which preserves the product laws, that is, such that for any g1 , g2
in G , the image by of the product g1 g2 G is the product of the respective
images of each argument:
(g1 g2 ) = (g1 ) (g2 ).
DEFINITION 18.8 A map that preserves in this manner the group structure
Mx =
1
0
x
,
1
with the product given by the product of matrices, is a group. Since M x M y = M x+ y for any
x, y R, it follows that the (obviously bijective) map : x 7 M x is a group isomorphism
between (R, +) and (M , ).
By far the most useful representations are those which map a group (G , )
to the group of automorphisms of a vector space or, equivalently, to the group
GLn (K) of invertible square matrices of size n:
1
To speak of continuity, we must have a topology on each group, which here is simply the
usual topology of R and of U C.
492
18.3
Vectors and the group S O(3)
In this section, physical space is identified with the vector space E = R3 .
DEFINITION 18.14 The special orthogonal group, or group of rotations, of $E$ is the set $SO(E)$ of linear maps $R : E \to E$ such that:
$R$ preserves the scalar product: $\bigl(R(a)\,\big|\,R(b)\bigr) = (a|b)$ for all $a, b \in E$;
$R$ preserves orientation in space: the image of any basis which is positively oriented is also positively oriented.
The rotation group SO(E ) will be identified without further comment with
the group S O(3) of matrices representing rotations in a fixed orthonormal
basis.
Denoting by R the matrix representing a rotation R in the (orthonormal)
canonical basis of E , we have
$$(a|b) = (Ra\,|\,Rb) = \bigl(a\,\big|\,{}^t\!R\, R\, b\bigr) \qquad\text{for all } a, b \in E$$
(using the invariance of the scalar product under $R$), which shows that $R$ is an invertible matrix and that
$${}^t\!R = R^{-1}. \tag{18.1}$$
A matrix for which (18.1) holds is called an orthogonal matrix.
Thus linear maps preserving the scalar product correspond to orthogonal
matrices, and they form a group, called the orthogonal group, denoted O(E ).
The group of orthogonal matrices is itself denoted O(3). Note that the matrix
I3 , although it is orthogonal, is not a rotation matrix; indeed, it reverses
orientation, and its determinant is equal to 1.
The group O(3) is therefore larger than SO(3). More precisely, we have:
THEOREM 18.15 The group O(3) is not connected; it is the union of two connected
3
X
j=1
R(
) i j e j .
2
Note : the matrix R(
) is not the matrix for the change of basis L defined Equation (16.1)
on page 434, but its transpose. Here, we have, symbolically,
e
e
1
1
) e2 ,
e2 = R(
e3
e3
which means that one can read the coordinates of the new vectors in the old basis in the rows
of the matrix (rather than in the columns). The matrix linking the coordinates of a (fixed)
point in both bases is R(
)1 = t R(
) :
x
x
t
) y .
y = R(
z
z
$$R_z(\theta) \overset{\text{def}}{=} R(\theta\, e_z) = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
Similarly, the matrices of rotations around the axes $Ox$ and $Oy$ are defined by
$$R_x(\theta) \overset{\text{def}}{=} R(\theta\, e_x) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix}
\qquad\text{and}\qquad
R_y(\theta) \overset{\text{def}}{=} R(\theta\, e_y) = \begin{pmatrix} \cos\theta & 0 & -\sin\theta \\ 0 & 1 & 0 \\ \sin\theta & 0 & \cos\theta \end{pmatrix}.$$
Remark 18.16 When speaking of space rotations, there are two different points of view, for
a physicist. The first point of view (called active) is one where the axes of the frame (the
observer) are immobile, whereas the physical system is rotated. In the second point of view,
called passive, the same system is observed by two different observers, the axes of whose
respective frames differ from each other by a rotation. In all the rest of the text, we only consider
passive transformations.
For a rotation $R(\vec\theta\,)$ parameterized by $\vec\theta = \theta\, \vec n$, we can express the vector $\vec\theta$ in terms of the canonical basis:
$$\vec\theta = \theta_1\, e_1 + \theta_2\, e_2 + \theta_3\, e_3.$$
How does one deduce the representation for the rotation R(
)? It is tempting
to try the product R x ( ) R y () R z (); however, performing the product in
this order seems an arbitrary decision, and the example of the domino on
page 490 should convince the reader that this is unlikely to be the correct
solution. Indeed, in neither the first nor the second line is the
p final state of
the domino the same as its state after a rotation of angle / 2 around the
first diagonal in the O x y-plane.
A proper solution is in fact quite involved. It is first required to use
infinitesimal transformations, that is, to consider rotations with a very small
angle . Then it is necessary to explain how non-infinitesimal rotations may
be reconstructed from infinitesimal transformations, by means of a process
called exponentiation (which is of course related to the usual exponential
function on R).
Consider then a rotation with axis $Oz$ and infinitesimal angle $\varepsilon$: it is given, to first order in $\varepsilon$, by
$$R(\varepsilon\, e_z) = \begin{pmatrix} 1 & \varepsilon & 0 \\ -\varepsilon & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = I_3 + \varepsilon\, J_z,
\qquad\text{where}\qquad
J_z \overset{\text{def}}{=} \left.\frac{dR(\theta\, e_z)}{d\theta}\right|_{\theta=0} = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$
Similarly, define
$$J_x \overset{\text{def}}{=} \left.\frac{dR(\theta\, e_x)}{d\theta}\right|_{\theta=0} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}
\qquad\text{and}\qquad
J_y \overset{\text{def}}{=} \left.\frac{dR(\theta\, e_y)}{d\theta}\right|_{\theta=0} = \begin{pmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}.$$
Exponentiation then gives, for a finite angle,
$$R(\theta\, e_z) = \exp(\theta\, J_z) = I_3 + \theta\, J_z + \frac{\theta^2}{2!}\, J_z^2 + \cdots + \frac{\theta^n}{n!}\, J_z^n + \cdots$$
(This series is absolutely convergent, and therefore convergent.) This is a general fact, and the following theorem holds:

THEOREM 18.18 Let $\bigl(R(\theta)\bigr)_{\theta\in\mathbb{R}}$ be a one-parameter group of matrices. Then its infinitesimal generator, given by $J = R'(0)$, satisfies $R(\theta) = e^{\theta J}$ for any $\theta \in \mathbb{R}$.
Proof. Since $R(\theta + h) = R(h)\, R(\theta)$ for all $\theta$ and $h$, we have
$$R'(\theta) = \lim_{h\to 0} \frac{R(h) - \mathrm{Id}}{h}\; R(\theta) = J\, R(\theta),$$
which shows that $R(\theta)$ is the solution of the differential equation with constant coefficients $R' = JR$ such that $R(0) = I$, that is, $R(\theta) = e^{\theta J}$.
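As a quick illustration (not from the book), one can exponentiate $J_z$ numerically and compare with the finite rotation matrix $R_z(\theta)$ given earlier; the angle value is an arbitrary choice.

```python
import numpy as np
from scipy.linalg import expm

# exp(theta * J_z) reproduces R_z(theta) with the sign conventions used above.
Jz = np.array([[0.0, 1.0, 0.0],
               [-1.0, 0.0, 0.0],
               [0.0, 0.0, 0.0]])
theta = 0.6
Rz = np.array([[np.cos(theta), np.sin(theta), 0.0],
               [-np.sin(theta), np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
print(np.allclose(expm(theta * Jz), Rz))     # True
```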
axis directed by the vector n). Then, denoting as before J = ( J1 , J2 , J3 ), the matrix
) is given by
representing R(
R(
) = exp(
J) = exp(1 J1 + 2 J2 + 3 J3 ).
However, using the exponentials, the matrices Ji can provide any rotation
matrix. Even better: it turns out that, in practice, it is only necessary to know
the commutation relations between the matrices Ji . The reader will easily check
that they are given by
$$[J_1, J_2] = -J_3, \qquad [J_2, J_3] = -J_1, \qquad\text{and}\qquad [J_3, J_1] = -J_2$$
(recall that $[A, B] = AB - BA$ for any two square matrices $A$ and $B$), which can be summarized neatly by the single formula
$$[J_i, J_j] = -\epsilon_{ijk}\, J_k, \tag{18.2}$$
where summation over the repeated index (i.e., over $k$) is implicit, and where the tensor $\epsilon_{ijk}$ is the totally antisymmetric Levi-Civita tensor. The commutation relations (18.2) define what is called the Lie algebra of $SO(3)$. The coefficients $\epsilon_{ijk}$ are the structure constants of the Lie algebra of $SO(3)$.
Remark 18.20 The minus sign in the formula is not important; if we had been dealing with
active transformations instead of passive ones, the matrices Jk would have carried a minus sign,
and the commutation relations also.
Knowing the Lie algebra of the group (i.e., knowing the structure constants)
is enough to recover the local structure of the group. (However, we will see
that there exist groups which are globally different but have the same Lie
algebra, for instance, SO(3) and SU(2), or more simply SO(3) and O(3); but
this is not a coincidence, since SU(2) is a covering of SO(3).)
Remark 18.21 Since SO(3) is a group, performing a rotation with parameter followed by a
rotation with parameter is another rotation, with parameter . However, there is no simple
relation between , and .
497
which shows that velocity vectors transform according to R. In other words velocity vectors
are indeed objects that transform according to a vector representation.
18.4
The group S U(2) and spinors
Consider now the complex vector space $E = \mathbb{C}^2$, with its canonical Hilbert structure, that is, with the hermitian scalar product
$$(x|y) = {}^t\bar{x}\; y = \bar x_1 y_1 + \bar x_2 y_2, \qquad\text{where}\quad x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \quad\text{and}\quad y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}.$$
DEFINITION 18.24 The special unitary group of E , denoted S U(E ), is the
group of linear maps from E to itself which preserve the scalar product and
have determinant one.
M = t M.
A matrix such that M M = I2 only also satisfies
+ 6 0 mod 2.
498
is denoted S U(2).
M M = I2
and
det M = 1
U (
) = cos
I2 + i sin
n ,
2
2
b2
where
def
= (1 , 2 , 3 ),
the matrices i being none other than the Pauli matrices, well known to
physicists:
0 1
0 i
1 0
1 =
2 =
3 =
.
1 0
i 0
0 1
Those matrices have trace zero (they are traceless in the physicists language)
and are hermitian, which means that we have
i = i
for i = 1, 2, 3.
The Pauli matrices play for SU(2) the same role that the Jk play for SO(3), as
we will see. First, notice that the reasoning above can be reversed:
PROPOSITION 18.28 Let = n be a vector in R3 . The matrix
is hermitian,
cos 2 + in z sin 2
(in x + n y ) sin 2
(in x n y ) sin 2
cos 2 in z sin 2
499
Commutation relations:
Anticommutation relations:
i i i j
,
2 2
= i jk
ik
;
2
i j + j i = 2 i j I2 ,
i j = i j I2 + i i jk k ,
tr ( i ) = 0,
i = i .
The groups SO(3) and SU(2) are quite similar. Indeed, for any vector
R3 , we construct two matrices
R(
) = exp(
J) SO(3)
and
U (
) = exp(i
/2) SU(2),
The structure constants of SO(3) and SU(2) are identical. Since these structure
constants are enough to reconstruct (locally) the internal product law of the
group, there should exist a strong link between the groups. This is made
precise by the following theorem:
THEOREM 18.30 The group SO(3) is a representation of SU(2): there exists a
homomorphism
R : SU(2) SO(3),
U 7 R U .
Moreover, for any U SU(2), the matrices U and U have same image: R U =
RU .
Proof. We need a way of constructing a matrix RU SO(3) for any matrix U
SU(2), such that the group structure is preserved. In other words, we need that
RU R V = RU V
the vector space of hermitian matrices with trace zero. This may be written
z xi y
M =
; x, y, z R = { x ; x R3 }.
x+i y z
500
x x =
1
tr M ( x) M ( x )
2
det M ( x) = x 2 .
and
xi i with xi = 21 tr (M i ).
M 7 U M U 1 .
tr M = tr (U M U 1 ) = tr M = 0.
Hence there exists a unique element x R3 such that M = x . We then define
x = RU ( x), and in this manner we have defined a map RU : R3 R3 . The
uniqueness of x implies that RU is linear.
LEMMA 18.32 R U is a rotation.
so
1
2
P
j
tr (U j U 1 i ) x j ,
1
tr (i U j U 1 ).
2
A direct computation shows finally that we have det RU > 0; hence RU is a
rotation.
(RU )i j =
=n
rotation
vector
( defined up to 2)
M (
) =
matrix in M
= 21 tr (M
)
501
U = exp (i
/2)
matrix in SU(2)
( defined up to 4)
On the other hand, suppose we travel along this path twice. Then the resulting path
may be contracted to a single point, following the sequence of movements below (in
the first figure, the path taken twice is also drawn twice for legibility):
502
/2
( r , t) 7 ( r , t) =
=e
,
( r , t)
( r , t)
(18.3)
e i(+2) n /2 = e i n /2 ,
e i(+4) n /2 = +e i n /2 .
Hence, a 4-rotation is required in order that the wave function remain invariant. How is it, then, that this particular behavior of the electron (it is not
503
cover of SO(3)) is used to show that two states of spin exist in the fundamental spinor representation. In special relativity, the group of isometries of space is not SO(3) anymore, but
rather the group SO(3, 1), that leaves Minkowski space (R4 with the Minkowski metric) invariant. One can show that there exists a simply connected double cover of this group, isomorphic
to SL(2, C) and generated by two independent representations of SU(2) (one sometimes find
the notation SU(2) SU(2) to recall this independance). This is constructed using suitable
linear combinations of generators of rotations and Lorentz boosts as infinitesimal generators.
This group then affords four degrees of freedom for a particle with spin 21 , instead of the two
expected; this allows us to predict that not only the particle is thus described, but so is the
associated antiparticle.
The reader interested by the mathematics underlying the representations of the Dirac equation may begin by looking at the excellent book [68].
18.5
Spin and Riemann sphere
While we are dealing with pretty pictures, I will not resist the temptation
to present an application to quantum mechanics of the representation of the
complex plane as the Riemann sphere (see page 146); more precisely, this
concerns the representation of the spin.
We know that the wave function of an electron (or indeed of any other
particle with spin 12 ) may be represented as the tensor product of a vector | f
corresponding to a space dependency and a vector | corresponding to the
spin:
| = | f | .
This is another way of expressing that the wave function has two components,
as in (18.3) (the first, , corresponding to the state | or spin-up, and the
second, , corresponding to the state | or spin-down).
Let us concentrate only on the spin for now. Then the wave function |
is a linear combination of the basis vectors | and | . This, however, is
related to the choice of one particular direction for the measure of the spin,
namely, the direction z. We might as well have chosen another direction, for
504
Because the physical meaning of a wave function is not changed when the
vector is multiplied by an arbitrary nonzero complex number, it follows that
only the ratio / is necessary to characterize the state |.
Then we have the following very nice result:
THEOREM 18.35 Let z = /; then we have z C = C {}. The stereo-
graphic projection z of z on the Riemann sphere gives the direction characterizing the
vector |.
z
1
1
i
z = /
Exercises
505
exp (i3 /2) k exp (i3 /2) when k = 1, 2, 3. Interpret this result.
( r), we associate the function R such that R( r) = R 1 ( r) (so that the axes have been
transformed by the rotation R). For a rotation R z () around the O z-axis, show that we have
Rz ()(x, y, z) = (x y , y + x , z) and then that
dRz ()
= i L z Rz ()
with L z = i x
+y
.
d
y
x
Deduce that Rz () = exp(iL z ) and generalize this to the other axes. Show that
[i L i , i L j ] = i jk i Lk .
506
Initial positon.
Exercises
507
Sidebar 7 (Cont.)
... continues...
This magic trick (which may very well be done at home, using a pencil eraser, some tacks and
a yard or so of ordinary string) is a consequence of the double connectedness of SO(3).
Another illustration of this fact is Feynmanns hand trick [36].
Chapter
19
Introduction to probability
theory
Statistics is that science which proves
that 99.99% of human beings have
more legs than the average number.
510
19.1
Introduction
Dice are the embodiment of randomness. The Latin name, alea, is indeed
the source of the French word alatoire for random. The Arabic word
az-zahr, which also means dice, has produced Spanish azar, French hasard
(randomness), as well as English hazard. The way dice fall, through the
Latin cadere (to fall), has brought the Old French word chaance and then the
word chance.
According to a dictionary, chance is The unknown and unpredictable element
in happenings that seems to have no assignable cause.
This is what happens with the throw of a die. Even if we hold that the
laws that dictate the fall and rebounds of the die are perfectly deterministic,
it is a fact that this system is chaotic and thus, in practice, has unpredictable
behavior: the final outcome of the throw has no apparent reason [86].
It is quite remarkable that such a common concept as chance should at
the same time be so difficult to fathom. Here is a mathematical example:
Some real numbers have the following property,2 which we call Property P :
E.g., the famous case of water memory, which is described in detail in [15].
A number for which Property P holds is called a normal number.
Introduction
511
not have this property, or if it does, we are not able to prove it.3 The conclusion
is that it is in fact very difficult to make a random choice of a real number;
most of our choices are not random at all.4
This simple example should be enough to suggest how difficult it may be to
express rigorously what is a random choice. To overcome those difficulties,
it has been necessary to formalize and axiomatize probability theory, leaving
aside questions of interpretation to concentrate solely on the mathematical
content.
In this sense, the forefathers of probability are certainly Blaise Pascal (see
page 606) and Pierre de Fermat (16011665), during a famous exchange of letters during the summer 1654. The physicist Christiaan Huyghens (16291695)
published this correspondance in the first treatise of probability theory, De Ratiociniis in ludo ale, where in particular the notion of mathematical expectation
is introduced.
However, the works of Jacques Bernoulli (see page 555) and Abraham de
Moivre,5 with the first theorems concerning laws of large numbers, are the
real starting point of modern probability theory. Laplace and Gauss studied
the theory of errors (with important applications to measurements such as
that of the terrestrial meridian).
In the ninetheenth century, Pafnouti Tchebychev6 (18211894) and his
students Andrei Markov (18561922) and Alexandre Liapounov (18571918)
created a formalism suitable for the study of sums of random variables,
which marks the beginning of the important Russian school of probability
theory. The complete axiomatization of probability is due to Andrei Kolmogorov (see below), building on previous work of Sergei Bernstein (1880
1968), Richard von Mises (18831953) and mile Borel (18711956).
3
To be precise, we know a few normal numbers. For instance, the number obtained by first
writing successively all positive integers, then concatenating the resulting sequences of digits
(Champernowne, 1933):
0, 1234567891011121314151617 . . .
or the number obtained by doing the same with the sequence of prime numbers (Erds and
Copeland, 1945) :
0, 23571113171923293137 . . .
It is quite obvious that those numbers are artificial; in particular, they depend on the fact
that we work in base 10. So one may increase the difficulty by asking for an absolutely normal
number, one which is normal (in an obvious sense) in any basis. It seems that today not a
single absolutely normal number is known, whereas, again, almost all numbers must have this
property.
4
Consider also the great difficulty inherent in the writing of a random number generator
for a computer.
5
Abraham de Moivre (16671754) left France at eighteen during the repression against
Protestants. He discovered mathematics through a chance encounter with Newtons Principia [69]. Besides studies of complex numbers, he gave the mathematical definition of independent events, and proved the Stirling formula.
6
The mathematical great-great-great-great-grandfather of the translator.
512
19.2
Basic definitions
In this section, we introduce the basic vocabulary of probability theory
and the axioms stated by Kolmogorov [19] in ber die analytischen Methoden in der Wahrscheinlichkeitrechnung (Analytic Methods of Probability Theory,
1931), and also in the historical book Grundbegriffe der Wahrscheinlichkeitrechnung
(Foundations of Probability Theory, 1933).
The probability of an event that can range only over a discrete set (such as
the result of throwing a die) can be captured very intuitively. However, when
the space of possible events is continuous (such as the points in space, or an
instant in time), a precise mathematical description of the probability that
an event occurs is required.
Kolmogorov suggested that the probability of an event be the expression
of the measure of a set.
DEFINITION 19.1 A probability space is a pair (, ), where
= , {P }, {F }, {P , F } .
Example 19.3 When rolling once a die with six facets, the space of elementary events can be
= {1, 2, 3, 4, 5, 6}
and
= P().
The event the result of the throw is even is the element {2, 4, 6} .
Example 19.4 Take = R. Then both {, R} and , [0, 1] , R [0, 1] , R are -algebras.
Remark 19.5 Note that it is sometimes useful to take for a set which is not an abstract
mathematical object, but rather a group of persons, a collection of objects, etc.
Basic definitions
513
ment is in A.
i) Am An = if m 6= n,
S
ii) nN An = .
Any elementary event {} P() is then contained in one and only one of
the elements of the complete class.
DEFINITION 19.8 Let C P() be any set of subsets of . The intersection
514
nN
Comparing with Definition 2.11 on page 59, we see that a probability space
is simply a measure space where the total measure of the space is supposed to
be equal to 1.
The reader is then invited to check the following properties:
PROPOSITION 19.11 Let P be a probability measure. Then the following properties
hold:
nN
nN
1 if a A ,
A 7
0 if a
/A
defines
(in
an
obvious
way)
a
probability
measure
on
,
P
()
.
nN n xn
Basic definitions
515
Example 19.14
Let f : Rn R+ be an integrable function (with respect to Lebesgue measure)
R
such that f d = 1. For any Borel subset A B(Rn ) (recall that the Borel -algebra is the
-algebra on Rn generated by open sets, or equivalently by rectangles [a1 , b 1 ] [an , b n ]),
let
Z
f d.
P(A) =
The map P thus defined is a probability measure on Rn , B(Rn ) .
Example 19.15 Let be a positive real number, and let m R. The Gauss distribution or
normal distribution with mean m and standard deviation is the probability measure on R
(with respect to the Borel -algebra) defined by
Z
(t m)2
1
dt
for any A B(R).
P(A) = p
exp
2 2
2 A
is negligible.
Example 19.17 Let = [0, 1], with the Borel -algebra = B [0, 1]
516
A
B
C
Fig. 19.1 The Poincar formula for n = 3. In dark gray, A B C ; in light gray, A B,
A C and B C .
19.3
Poincare formula
Consider two events A and B. Obviously (just using the definition as the
measure of the corresponding events) we have the formula
P(A B) = P(A) + P(B) P(A B).
but as clearly seen in Figure 19.1, this is not the correct result for P(A B C ),
because the subset A B C has been counted three times, then subtracted
three times. Hence the correct relation is in fact
P(A B C ) = P(A) + P(B) + P(C ) P(A B)
P(B C ) P(C A) + P(A B C ).
With this done, the general result is easy to guess and prove.
THEOREM 19.18 (Poincare formula) Let n N and let A1 , . . . , An be arbitrary
k=1
P(Ai1 Aik ).
Proof. A proof using random variables, as defined in the next chapter, is given in
the Appendix D. An elementary proof by induction is also easy.
Conditional probability
517
19.4
Conditional probability
Exercise 19.1 Youre taking a plane to Seattle. A colleague tells you: There is a one in ten
thousand chance that there is a bomb in a plane. But there is only one in a hundred million
chance that there are two bombs. So, for more security, just bring your own bomb.
What should you think of this argument?
DEFINITION 19.19 Let A and B be two events such that P(B) 6= 0. The
P(A B)
.
P(B)
at least one (yours) is therefore simply one in ten thousand. Bringing one in your luggage is a
useless precaution.
P(B|A) =
P(A|B) P(B)
.
P(A)
518
Thomas Bayes (17021761), English Presbyterian minister and theologian, studied mathematics, and in particular probability theory,
during his free time. He published little of his work during his
lifetime and remained unknown to most of his peers. His Essay
towards Solving a Problem in the Doctrine of Chances was published
posthumously in 1764.
Example 19.22 On the planet Zork live two ethnic groups: Ents and Vogons. According to a
recent statistical analysis of the distribution of wealth on Zork, it is true that
Is it possible to deduce that wealth is inequitably distributed between the two groups?
Solution: Certainly not, in the absence of other data such as the proportion
of Vogons in the population. It may very well be the case that the Vogons
represent 80% of the total population, and that 80% of the total population is
poor. If that is the case, wealth is equitably distributed between the two groups
(not necessarily within each group). Again, this kind of confusion occurs
continually, and is sometimes encouraged. (Many examples can be found by
reading almost any newspaper, adapting the two words poor and Vogons
to various circumstances.)
i) Am An = if m 6= n,
S
ii) P
A
= 1.
n
nN
THEOREM 19.24 (Bayes formula) Let (An )nN be an almost complete system of
events, and let B be an arbitrary event B . Assume that P(An ) > 0 for any
n N. Then we have
P
i) P(B) = nN P(B|An )P(An );
ii) if, moreover, P(B) > 0, then
p N
P(B|A p ) P(A p )
P(A p |B) = P
.
P(B|An ) P(An )
nN
Independent events
519
Remark 19.25 The Collins case is a striking example of the use (or misuse) of probability and
conditional probability. In Los Angeles in 1964, a blond woman and a black man with a beard
were arrested for a robbery. Despite the absence of convincing evidence, the prosecution argued
successfully that they must be guilty because the chance that a random couple corresponded (as
they did) to the witnesss description was estimated to be one in twelve million. The California
Supreme Court reversed the judgment on appeal, since it was shown that the probability that
at least two couples in the Los Angeles area correspond to the description knowing that at least
one couple does (namely, the actual thieves) was close to 42 %, and therefore far from negligible
(certainly too large to decide guilt beyond reasonable doubt!). This shows that there was a
high probability that arresting the couple was a mistake.
19.5
Independent events
It is now time to define the notion of independent events. Intuitively, to
say that two events A and B are independent means that the probability that
A is realized is equal to the probability that it is realized knowing that B is
realized, or indeed to the probability that A is realized knowing that not-B
is realized.
DEFINITION 19.26 Let A and B be two events in . Then A and B are
i=1
k=1
520
Example 19.28 Consider the experiment of rolling two dice, and the events:
It is easy to check that A, B, and C are pairwise independent. Indeed, we have (assuming all
36 outcomes of the experiment have probability 1/36):
P(A) = P(B) = P(C ) =
1
2
1
4
(This is also clear intuitively, since A and B together imply C , for instance.)
Chapter
20
Random variables
Falstaff. Prithee, no more prattling; go. Ill hold.
This is the third time; I hope good luck lies in odd numbers.
Away I go. They say there is divinity in odd numbers,
either in nativity, chance, or death. Away!
William Shakespeare
The Merry Wives of Windsor, Act V, Scene i.
20.1
Random variables and probability distributions
DEFINITION 20.1 Let (, ) be a probability space. A random variable is a
function X , defined on and with values in R, which is measurable when R
Random variables
522
is equipped with the Borel -algebra, that is, such that we have X 1 (A)
for any Borel set A B(R).
To simplify notation, we will abbreviate r.v. instead of writing random
variable.
A random function is, in other words, simply a measurable function
and, for a physicist not overly concerned with quoting the axiom of choice,
this means any function that she may think about.
Here is an example. Take as probability space
def
P X : B R+ ,
B 7 P{X B},
where
def
{X B} = X 1 (B) = ; X () B .
In particular, we have
P X ]a, b] = P{a < X b} = P X 1 ]a, b]
= P ; a < X () b .
(unknown)
P
?
523
X R
P X (known)
?
In the example of the Rutgers students, assume moreover that the sample
space is given the uniform probability measure (each student is given the same
probability, p = 1/N , where N is the total number of students). What is the
probability distribution of the random variable age, which we see as integervalued? This distribution is a probability measure on the set N, an instance
of what is often called a discrete distribution:
P X {n} = P {X = n} = P {students ; age (student) = n}
number of students aged n
.
N
It can very well be the case that two random variables have the same
distribution without being closely related to each other. For instance,
=
THEOREM 20.3 Let X and Y be two random variables defined on the same sample
space which are equal almost everywhere. Then X and Y have the same distribution:
PX = PY .
X : N,
n
X
7
E i = number of tails.
i=1
.
k
2
Random variables
524
This distribution
is called the binomial distribution, and it is often denoted
B n, 12 . Note that it is easily checked that
n
1 X
n
1
P X (N) = n
= n (1 + 1)n = 1,
2 k=0 k
2
as it should be. Moreover, it is easily seen that if X has a binomial distribution, so does (n X ) (corresponding to the number of heads in n throws).
Of course, X and (n X ) are not equal almost everywhere!
What is clearly visible from this example is that distributions of random
variables only provide some information on the full sample space, by no means
all. This is the price to pay for the simplification that results from their use.
Whatever is not encoded in the random variable (all the factors leading to a
student of a certain age and not another, the exact trajectory in space of the
falling die...) is completely forgotten.
DEFINITION 20.4 (Bernoulli distribution, binomial distribution) Let n N
The binomial distribution B(n, p) corresponds to the sum of n independent random variables, each of which has a Bernoulli distribution with the
same parameter B(1, p).
20.2
Distribution function and probability density
DEFINITION 20.5 The distribution function of an r.v. X is the function
defined for x 0 by
def
F (x) = P {X x} = P X ], x] .
P {a < X b} = F (b) F (a).
525
Since the -algebra of Borel subsets of the real line is generated by the
intervals ], a], the folowing result follows:
PROPOSITION 20.6 Let (, ) be a probability
space. A map X : R is a
random variable if and only if X 1 ], a] for all a R.
Moreover, the probability
distribution P X is uniquely determined by the values
P X 1 (], a]) = F (a) for a R, that is, by the distribution function.
of F (b ), F (a), F (b ), and F (a ).
F (x + ) = lim+ F (x + ) = F (x) ;
0
def
F () = lim F (x)
and
F () = 0
and
F (+) = 1 ;
x+
then
vi) F is continuous if and only if P{X = x} = 0 for all x R. Such a
probability distribution or random variable is called diffuse.
Proof. Only i) and ii) require some care. First, write F (x + ) F (x) = P{x <
X x + }, then notice that, as tends to 0 (following an arbitrary sequence), the set
{x < X x + } tends to the empty set which has measure zero. This proves i).
Moreover, writing F (x) F (x ) = P{x < X x}, one notices similarly that
the set {x < X x} tends to {X = x}.
Random variables
526
20.2.a
1
5/6
4/6
3/6
2/6
1/6
1
tion is continuous.
Remark 20.11 In fact, we should not speak simply of continuity, but rather of a similar but
stronger property called absolute continuity. The reader is invited to look into a probability
or analysis book such as [83] for more details.
Since, for distribution functions, continuity and absolute continuity are in fact equivalent,
we will continue writing continuous instead of absolutely continuous for simplicity.
For a discrete r.v., the probability density f does not exist as a function, but
it may be represented with Dirac distributions, since we want to differentiate
a discontinuous function.
527
Example 20.13 Consider again the weighted die above. The probability density of the r.v. X
representing the result of throwing this die is
1
1
1
1
1
1
f (x) = (x 1) + (x 2) + (x 3) + (x 4) + (x 5) + (x 6).
4
6
6
6
6
12
Remark 20.14 In many cases, a random variable has a distribution function which can be
expressed as the sum of a continuous distribution function and a step function (which is itself
a distribution function). The probability density is then the sum of an ordinary function and
at most countably many Dirac distributions.
20.3
Expectation and variance
20.3.a Case of a discrete r.v.
DEFINITION 20.15 Let X be a discrete random variable. Assume that
X () = { pk ; k I } D ,
kI
however, this is always zero, by the definition of the expectation. Another idea
is therefore needed, and since we want to add all the deviations, irrespective
of sign, we can try to use
X
xk E(x) pk .
E X E(X ) =
k
Random variables
528
2 P
2
Var(X ) = 2 (X ) = E X E(X )
=
xk E(X ) pk
k
= E(X 2 ) E(X )2
when the integral exists. The variance Var(X) and the standard deviation
(X) are given by
Z +
Z +
2
2
Var(X ) = 2 (X ) =
t E(X ) f (t) dt =
t 2 f (t) dt E(x)
529
X is discrete, and a
distribution is used to represent its probability density, then the above definition provides the
same definition of the expectation of X as Definition 20.15.
2
Remark 20.19 Putting f (x) = (x) , the reader can also recognize in the definition of
expectation and standard deviation the quantities denoted x and x in quantum mechanics
(see Section 13.3 on page 359).
Remark 20.20 The expectation and the variance are special cases of what are called moments of
random variables.
The first formula uses the probability density;2 we can consider that we
integrate the function x 7 x with respect to the measure d = f (x) dx.
The second formule uses an integral over the sample space , which
is equipped with a -algebra and a probability measure, which suffices
to defined an integral in the same manner that the Lebesgue integral is
constructed.3 The function considered is this time the random variable
X , and it is integrated with respect to the measure dP().
Mathematicians usually avoid using the density f (which requires the use
of the theory of distributions as soon as F is discontinuous) and appeal instead to another measure denoted dF = f (x) dx and to the so-called Stieltjes
integrals (see sidebar 8 on page 552). However, since density probabilities
are probably more intuitive to physicists, we will use those instead, it being
understood that it may be a distribution, and must be considered accordingly,
even if it is written under an integral sign.
Exercise 20.2 A random variable X is distributed according to a Cauchy distribution if its
probability density is of the type
f (x) =
a
1
a 2 + (x m)2
for some real number m and some positive real number a. Compute the expectation and
standard deviation of a random variable with this distribution (answer in the table, page 616).
2
And, as already noticed, may be generalized to the case of discontinuous F by considering
f as a distribution and writing f , x 7 x instead of the integral.
3
This is indeed what is called the Lebesgue integral with respect to an arbitrary measure on
an arbitrary measure space.
Random variables
530
20.4
An example: the Poisson distribution
20.4.a
If N goes to infinity, we will see in the next chapter that the binomial distribution can be approached by a gaussian distribution; however, this is only
correct if N p 1. For fixed N (which is the case in physics), if we consider
a very small volume V , it is possible to make n = N p = N V /V very
small compared to 1. Then we can write
N
Nn
and
(1 p) N n = exp(N n) log(1 p) e N p = e n ,
n
n!
which gives
P(n)
N n n n nn n
p e = e .
n!
n!
531
n
for all n N.
n!
A random variable X is a Poisson random variable if P{X = n} = P (n) for
all n N.
P (n) = e
Random variables
532
20.5
Moments of a random variable
DEFINITION 20.23 The k-th moment (with k N) of a random variable X
k =
x f (x) dx =
X k () dP()
if it exists.
For k 1, the k-th moment does not necessarily exist. The 0-th moment
always exists, but it is always equal to 1 and is not very interesting. The
first moment is simply the expectation of X . One can define variants of the
moment for a centered random variable, that is, one which expectation equal
to zero.
DEFINITION 20.24 A centered random variable is a random variable with
m is given by
def
k =
(x
m)k
f (x) dx =
X () m)k dP().
Remark 20.26 For k 1, the centered k-th moment exists if and only if the original (noncentered) k-th moment exists.
Remark 20.27 The centered second moment is none other than the variance:
2 = Var(X ) = 2 (X ).
Exercise 20.3 Show that 2 =
2 21 .
Show that the kinetic energy of a distribution of mass is the sum of the kinetic energy
in the barycentric referential and the kinetic energy of the center of gravity (also called the
barycenter), carrying the total mass of the system.
533
1
.
x1/2 = inf x R ; F (x)
2
of a population, for instance, is the salary such that half of the population earn less, and half
earns more. It is therefore different from the average salary.
Take for instance the case of six individuals, named Angelo, Bianca, Corin, Dolabella,
Emilia, and Fang, with salaries as follows:
Angelo
Dolabella
5000
9000
Bianca
Emilia
9000
5000
Corin
Fang
9000
11000
The average salary is $8000. However, $8000 is not the median salary, since only two
persons earn less, and four earn more. In this example, the median salary is $9000.
If three other persons are added to the sample, Grumio, Hamlet and Imogen, earning
$6000, $7000 and $8500, respectively, then the median salary becomes $8500.
Example 20.31 (Half-life and lifetime) Consider a radioactive particle that may disintegrate at
any moment. Start the experiment at time t = 0 (the particle has not yet disintegrated) and
assume that the probability that it does disintegrate during an infinitesimal time interval dt is
equal to dt (independently of t). As in Section 20.4.b, one can show that the probability that
the particle be intact at time t is given by
F (t) = e t .
The median of the random variable time before disintegration is called the half-life (the
probability of disintegration during this length of time is 12 ). To compute it, we must solve the
equation
1
F (t1/2 ) = exp{ t1/2 } = ,
2
and therefore we have t1/2 = (log 2)/. Note that the expectation of this r.v., which is called
the (average) lifetime of the particle, is equal to 1/. Indeed, the probability density for
disintegration at time t is given by
f (t) = F (t) = e t ,
and the expectation is therefore
Z
Z
t1/2
1
= t f (t) dt = t e t dt = =
> t1/2 .
log 2
It is such that F () = 1/e.
Random variables
534
20.6
Random vectors
20.6.a
F X Y (, y) = F X Y (x, ) = 0
and
F X Y (+, +) = 1.
called joint probability density) of the pair (X , Y ) is the function (or sometimes the distribution)
def
f X Y (x, y) =
We then have
F X Y (x, y) =
2 FX Y
(x, y).
x y
Z
f X Y (s, t) dt ds.
Random vectors
535
f X Y (x, y) dx.
k, =
x k y f X Y (x, y) dx dy,
R2
def
def
1,0 and m Y =
0,1 the expectations of X and
if it exists. If we denote m X =
Y , the corresponding centered moment is given by
ZZ
def
k, =
(x m X )k ( y m Y ) f X Y (x, y) dx dy.
R2
Exercise 20.4 Let X and Y be two random variables with moments of order 1 and 2. Show
that
Var(X + Y ) = Var(X ) + Var(Y ) + 2 Cov(X , Y ).
Note that the centered moment of order (1, 1) is related with the noncentered moment by means of the formula 11 =
11 10 01 .
Finally, we have the correlation of random variables:
DEFINITION 20.39 The correlation of the pair (X , Y ) is the quantity
def
r=p
11
Cov(X , Y )
=p
20 02
Var(X ) Var(Y )
Random variables
536
This has the following properties (following from the Cauchy-Schwarz inequality):
THEOREM 20.40 Let (X , Y ) be a pair of random variables with correlation r,
assumed to exist. Then we have
i) |r| 1;
ii) |r| = 1 if and only if the random variables X and Y are related by a linear
transformation, more precisely if
X mX
Y mY
=r
Y
X
almost surely.
The correlation is useful in particular when trying to quantify a link between two statistical series of numbers. Suppose that for N students, we know
their size Tn and their weight Pn , 1 n N , and we want to show that there
exists a statistical link between the size and the weight. Compute the average
weight P and the average size T . The standard deviations of those random
variables are given by
2 (T ) =
N
1 X
2
Tn2 T
N n=1
and
2 (P ) =
N
1 X
2
Pn2 P .
N n=1
N
X
1
(T T )(Pi P ).
(T ) (P ) n=1 i
Random vectors
537
Name
Name
Niobe
Feronia
Klytia
Galatea
Eurydike
Freia
8.5
10.3
10.3
10.1
10.0
9.0
5
7
6
7
8
5
Frigga
Diana
Eurynome
Sappho
Terpsichore
Alkmene
9.6
9.1
9.3
9.3
9.7
9.4
6
5
8
6
11
7
The correlation for the first eight asteroids is r 0.785, which is quite large.
Worse, the correlation for the last five is r 0.932. Of course, this is merely
an accident, due to the small sample used. There is in fact no correlation
between the length of the name and the magnitude of asteroids. With the
twelve given, the correlation becomes 0.4 and with fifty asteroids6 it is only
r 0.097.
Remark 20.42 The covariance of (X , Y ) may be seen as a kind of scalar product of X E(X )
with Y E(Y ) and the standard deviation as the associated norm. The correlation is then
the cosine of the angle between the two functions in L2 space:
r=
(X |Y )
.
kX k2 kY k2
So there is no surprise in the fact that this number is between 1 and 1. Moreover, |r | = 1 if
and only if the functions X E(X ) and Y E(Y ) are linearly dependent, as we found above.
if the -algebras generated by X and Y are independent, that is, if, for all A T (X )
and B T (Y ), we have P(A B) = P(A) P(B).
6
Random variables
538
is,
Let
X () = cos and Y = sin . The r.v. X and Y are not independent (intuitively, because there
is a relation between them, namely X 2 + Y 2 = 1), but on the other hand the covariance is
equal to
Z
d
Cov(X , Y ) = cos sin
= 0,
2
x Rn we have
F X (, x2 , . . . , xn ) = = F X (x1 , . . . , , . . . , xn ) =
= F X (x1 , . . . , ) = 0
and
F X Y (+, . . . , +) = 1.
Image measures
539
f X (x1 , . . . , xn ) =
We then have
F X (x1 , . . . , xn ) =
x1
n
F (x , . . . , xn ).
x1 . . . xn X 1
...
xn
20.7
Image measures
20.7.a Case of a single random variable
Exercise 20.5 The probability density of the velocity of an atom of a perfect gas is known
(it is a maxwellian7). What is the probability density of the kinetic energy mv 2 /2 of this atom?
To treat this kind of problem, we must consider the situation where, given
a random variable X , a second random variable Y = (X ) is defined. This is
called a change of random variable. The task is to determine the distribution
function of Y , and its probability density, knowing those of X .
When the change of random variable is bijective, there is a unique y = (x)
associated to any x R, and it is then easy to see that if f denotes the
probability density of X and g denotes that of Y , then
f (x)
g( y) =
,
(x)
1
x=
( y)
Random variables
540
y = (x)
has at most countably many solutions, and let S y denote the set of these solutions.
Assume, moreover, that (x) 6= 0 for x S y . Then the probability density g of Y is
related to the probability density f of X by
X f (x)
.
g( y) =
(x)
xS y
kI
kI
where the last factor is the inverse of the jacobian matrix of the change of variable.
20.8
Expectation and characteristic function
20.8.a
541
iii) we have
E{X Y }2 E{X 2 } E{Y 2 },
by the Cauchy-Schwarz inequality, with equality if and only if X and Y are
linearly dependent;
k = E{X k }
and
k = E X E{X } .
Moreover, the k-th moment (centered or not) exists if and only if the function
x 7 x k f (x) is Lebesgue-integrable, where f is the probability density of X ,
that is, if
Z
k
x f (x) dx < +.
As a special case, the variance is given by
2 = E (X E{X })2 } = E{X 2 } E{X }2 .
Random variables
542
Notice that this quantity is always defined, for any random variable X , because
|e iX | = 1 and is of total measure equal to 1 (or equivalently, because the
density probability f , when it exists as a function, is Lebesgue-integrable).
According to the general theory of the Fourier transform, we have therefore:
PROPOSITION 20.57 Let X be a random variable and let X be its characteristic
function. Then
i) X (0) = 1 and X () 1 for all R;
ii) if the k-th order moment of X exists, X can be differentiated k times and we
have
(k)
k = E{X k } = (i)k X (0);
iii) since X is real-valued, X is hermitian, i.e., we have X () = X ().
R +
Proof. Note that X (0) = f X (x) dx = 1 or, to phrase it differently, we have
X (0) = E{1} = 1. The last two properties are direct translations of known results of
Fourier transforms.
In other words, two random variables have the same characteristic function if
and only if they have the same distribution function. (This theorem follows
from an inversion formula which is the analogue of the Fourier inversion
formula and is due to P. Lvy).
Before going to the next section, note also the following important fact:
PROPOSITION 20.59 The characteristic function of the sum of two independent
random variables is the product of the characteristic functions of the summands:
X +Y () = X () Y ().
8
Be careful with the various conventions for the Fourier transform! It is usual in probability
to omit the factor 2 which often occurs in analysis.
543
G X (s) = E(s X )
when it exists.
THEOREM 20.61 Let X : N be an integer-valued discrete r.v., and let pn =
In particular
E(X ) = G X (1)
n=0
(n)
pn
sn
and
and hence
G (0)
P(X = n) = X
.
n!
2
Var(X ) = G X (1) + G X (1) G X (1) .
20.9
Sum and product of random variables
20.9.a
Let us assume that every day I eat an amount of chocolate which is between
0 and 100 g, and let us model this situation by a random variable X1 for the
amount eaten on the first day, another random variable X2 for the amount
eaten on the second day, and so on, each random variable being assumed
to have a uniform distribution on [0, 100]. We would like to known the
distribution of the total amount of chocolate eaten in two days.
544
Random variables
f X1 +X2 = f1 f2 ,
or equivalently
f X1 +X2 (x) =
f1 (t) f2 (x t) dt,
which is only correct if the random variables involved are continuous, the probability
densities then being integrable functions.
545
f X1 +X2 =
f X1
f X2 = 1, as it should be.
Let now f2 (x1 , x2 ) denote the joint probability density of the pair of random
variables (X1 , X2 ), and make the change of variables Y1 = X1 and Y2 =
X1 + X2 . The jacobian of this transformation is given by
D( y1 , y2 ) 1 0
= 1,
| J| =
=
D(x1 , x2 ) 1 1
joint probability density of the pair (X1 , X2 ). Then the probability density of Y is
given by
Z +
g( y) =
f2 (t, y t) dt.
Example 20.64 The sum of two independent gaussian random variables X1 and X2 , with means
variables, each of which has finite variance Var(X i ) = i2 . Then the variance of the
sum exists and is the sum of the variances of the X i : we have
Var(X1 + + Xn ) = Var(X1 ) + + Var(Xn ).
Proof. Without loss of generality, we may assume that each random variable is centered (otherwise, replace Xi by Xi E(Xi ) without changing its variance).
Random variables
546
given by
y 1
dt,
f2 t,
t |t|
g( y) =
g( y) =
f2 (t, t y) |t| dt
Bienaym-Tchebychev inequality
547
Let X1 and X2 be two independent r.v. with Poisson distributions with parameters
1 and 2 , respectively. Then the r.v. X = X1 + X2 has a Poisson distribution with
parameter = 1 + 2 .
20.10
Bienaym
e-Tchebychev inequality
20.10.a Statement
Given a random variable X , its standard deviation mesaures the average
(quadratic) distance to the mean, that is, the probability that the actual value
of X be far from the expectation of X . The Bienaym-Tchebychev inequality
is a way to quantify this idea, giving an upper bound for the probability that
X differs from its expectation by a given amount.
THEOREM 20.69 (Bienayme-Tchebychev inequality) Let (, , P) be a probability space and let X be a real-valued random variable on such that E(X 2 ) < +.
Let m = E(X ) be the expectation of X and let 2 be its variance. Then, for any
> 0, we have
2
1
P |X m| 2 ,
or equivalently
P |X m| 2 .
Random variables
548
In the other direction, knowing the actual value of X for some experience
, this inequality gives an estimate of the error in assuming that this value is
equal to the average value m.
Example 20.70 Let X be a random variable with expectation m and standard deviation . The
probability that the result X () of an experiment differ from m by more than 4 is at most
6, 25 %.
be a non-negative random variable such that E(X ) < + (i.e., in the terminology of
analysis, we consider a measure space and a non-negative integrable function). For any
t R+ , we have
1Z
P {X t}
X () dP().
t
Proof. This is clear by looking at the following picture:
X ()
t
X () dP()
t P({X t})
The dark region is a rectangle with height t and width P {X t} . Of course, it
may well be something else than a rectangle (for instance many rectangles, depending
on the structure
of the event {X t}), but in any case the dark region
R has measure
t P {X t} . It is clear that this measure is smaller than or equal to X () dP(),
since X 0.
9
Mathematicians are machines for turning coffee into theorems, Paul Erds (or Alfred
Rnyi).
Bienaym-Tchebychev inequality
549
2
Applying this lemma to X E(X ) / 2 and t = 2 , we derive the
Bienaym-Tchebychev inequality.
.
+ 103
103
10
Georges Leclerc, comte de Buffon (17071788), famous naturalist, author of the monumental Histoire naturelle, was also interested in the theory of probability.
Random variables
550
2N
103
2 p
N p(1 p),
2 103
hence
N=
4 2
p(1 p)
4 106
20.11
Independance, correlation, causality
It is very important to distinguish between independance, correlation, and
causality. In Example 20.47, we saw that two dependent events can be uncorrelated.
Two events are causally linked if one is the cause of the other. For instance,
statistics show that road accidents are more frequent on Saturdays than other
days of the week (thus they are correlated). There is causality here: usually
more alcohol is consumed on Saturday, leading to a higher rate of accidents.
However, events may be correlated without causality. An example is given
by Henri Broch [15]: in villages in Alsace, statistics show that the number of
births per year is highly correlated with the number of storks (the correlation
is close to 1). Should we conclude that storks carry babies home (which would
mean causality)?11
David Ruelle, in his excellent book [77], mentions events which are causally
linked but uncorrelated, which is more surprising. Consider, on the one hand,
the position of the planet Venus in the sky, and on the other hand the weather
11
There is in fact a hidden causal explanation: the more families in a village, the more
houses, hence the more chimneys, and the more room for storks. This causality is not because
of the two events discussed, but because of a third, linked independently with both.
551
one month after the date of observation. We know very well that weather [86]
is a highly chaotic system, and hence is very sensitive to initial conditions. The
position of Venus on a given day is therefore highly likely to influence the
weather one month later. There is therefore strong causality (and dependence)
between the position of Venus and the weather. However, no correlation can
be observed. Indeed, other factors (the day of the year, the weather in previous
months, the solar activity, the position of other planets, etc.) are equally or
more crucial in the evolution of the weather than the position of Venus.
Looking only at Venus and averaging over all other variables, no trace of the
influence of this fair planet will be noticeable. Causality without correlation.
552
Random variables
i Bi =
i (B i )
Let P = x R ; {x} 6= 0 (the set of discontinuities of ). One can
define a measure p by p (X ) = (P X ), and moreover c = p is
also a Borel measure.
Chapter
21
Convergence of
random variables:
central limit theorem
Guildenstern (Flips a coin): The law of averages, if I have got this
right, means that if six monkeys were thrown up in the air for
long enough they would land on their tails about as often as they
would land on their
Rosencrantz: Heads. (He picks up the coin.)
Tom Stoppard, Rosencrantz & Guildenstern are dead [87]
21.1
Various types of convergence
We are going to define three types of convergence of a sequence (Xn )nN
of random variables to a limit X .
The first two are parallel to well-known concepts in the theory of integration.
Let (X , T , ) be a measure space (for instance, X = R, T = B, the Borel
-algebra, and the Lebesgue measure). A sequence ( fn )nN of measurable
functions on X converges almost everywhere to a measurable function f
if there exists a measurable subset
N X such that (N ) = 0 and, for
all x
/ N , the sequence fn (x) nN converges to f (x). In other words,
554
In other words, with a margin of error > 0, the subset of those points where
fn is too far from f shrinks down to nothingness as n becomes larger and
larger.
Similarly, we have the probabilistic analogue:
DEFINITION 21.2 (Convergence in probability) Let (Xn )nN be a sequence of
555
c.d.
P(Xn x) P(X x)
n
c.p.
c.d.
21.2
The law of large numbers
Consider a game of dce. I roll the die, and you must announce the result
in advance (with money at stake). If I only roll the die once, there is nothing
interesting you can really say (except if my die was false, but I would not say
so in advance). However, if I roll the die ten times, and you must guess the
sum of the results of the ten throws, there is much more you can say. More
precisely, you may resonably think that the sum will be close to 35, maybe
33 or 38, but most certainly not 6 (there is only one chance in 610 , that is,
one in sixty million to obtain a 1 ten times in a row; you may as well play
the lottery). And as I roll the die more and more times, you will be able to
bet with increasing confidence that the sum will be roughly n 3.5. This, in
effect, is what the weak law of large numbers makes precise1:
Let (, , P) be a probability
space and let (Xn )nN be a sequence of square integrable independent r.v. with the
same expectation m (i.e., for all n, we have E(Xn ) = m) and the same variance 2 .
Define
X + + Xn
Yn = 1
.
n
THEOREM 21.5 (Weak law of large numbers)
This theorem was proved (rigorously) for the first time by the Swiss mathematician James
(Jakob) Bernoulli (16541705) (who also discovered the famous Bernoulli numbers), in his
treatise Ars Conjectandi, published in 1713. His nephew Daniel Bernoulli (17001782) also
studied probability theory, as well as hydrodynamics. A generalization of Bernoullis result was
given by the marquis de Laplace, in his Thorie analytique des probabilits.
556
Then the sequence (Yn )nN converges in probability to the constant random variable
equal to m (i.e, Y () = m for all ). In other words, we have
> 0
P |Yn m| 0.
n
Proof. The simplest proof uses the Bienaym-Tchebychev inequality: notice that
E(Yn ) = m and then write
2 (Yn )
P |Yn m|
.
2
Since the random variables X1 , . . . , Xn are independent, the Bienaym identity 20.65
yields the formula
2 (n Yn ) = 2 (X1 + + Xn ) = n 2 ,
2 (Yn ) =
hence
from which we derive
n 2
2
=
,
2
n
n
2
P |Yn m| 2 0.
n n
Remark 21.6 It is possible to show that this result is still valid without assuming that the vari-
ables are square integrable (but they are always integrable). The proof is much more involved,
however.
dependent identically distributed r.v. Assume that the Xn are integrable, and let m
denote their common expectation. Then the sequence of Cesro means converges almost
surely to m: we have
X1 + + Xn c.a.s.
m.
n
21.3
Central limit theorem
Let us come back to our game of dice. We know, according to the law of
large numbers, that we must bet that the sum is close to the average. It would
be interesting to know how the sequence (Yn )nN of Cesro means converges
to the constant random variable m. And more precisely, what should we expect
if we bet on a value other than the average?
2
He showed in particular that it is possible to cut a cake into 65, 537 equal pieces with
straightedge, compass, and knife. A very useful trick if you plan a large wedding.
557
Translated lierally from the German, where a more suitable translation would have been
fundamental limit theorem (of probability theory).
558
Paul Lvy (18861971), son and grandson of mathematicians, studied at the cole Polytechnique, then at the cole des Mines, where
he became professor in 1913 . Hadamard, one of his teachers,
asked him to collect the works of Ren Gteaux, a young mathematician killed at the beginning of the First World War. After
doing this, Lvy developed the ideas of Gteaux on differential
calculus and functional analysis. At the same time as Khintchine, he studied random variables, and in particular the problem of the convergence of sequences of random variables. He
also improved and expanded considerably the results of Wiener
concerning Brownian motion. He was also interested in partial
differential equations and the use of the Laplace transform.
THEOREM 21.9 (Central limit theorem) Let (Xn )nN be a sequence of indepen-
dent, identically distributed random variables with finite expectation m and finite
variance 2 . Let
p
def X + + Xn
Yn = 1
and
Zn = n(Yn m).
n
Then the sequence (Zn )nN converges in distribution to the centered normal distribution N (0, 2 ). In other words, we have
Zx
Sn n
2
P
e t /2 dt.
p < x N (x) =
n
n
for all x R.
def
X (0) = iE(X )
and
X (0) = E(X 2 ),
2
u
u
u2
u
X p
=1+ip m
E(X 2 ) + o
.
n
n
2n
n
559
Remark 21.10 The condition that the variance Var(Xn ) be finite is necessary. Exercise 21.15
on page 562 gives a counterexample to the central limit theorem when this assumption is
not satisfied. It should be mentioned that non-Gaussian processes are also very much in current
fashion in probability theory.
Application
Consider a piece of copper, of macroscopic size (electrical wiring, for instance).
Because of thermal fluctuations,
the average velocity of a conducting electron
p
in this copper wire is 3kb T /m, that is, of the order of 105 m s1 , or tens
of thousands of kilometers per second! Yet, the copper wire, on a table, does not
move. This is due to the fact that the velocities of the electrons (and of nuclei,
although they are much slower) are randomly distributed and cancel each
other. How does this work?
Let N be the number of conducting electrons in the wire. We assume that
the wire also contains exactly N atoms,4 and we will denote by m and M ,
respectively, the masses of the electrons and atoms. You should first check
that the velocity of atoms is negligible in this problem.
We denote by Xn the random variable giving the velocity (at a given time)
of the n-th electron. What is called the average velocity is not the expectation
of Xn , since (in the absence of an electric field), the system is isotropic and
E(Xn ) = 0. In fact, the average velocity is best measured by the standard
deviation of the velocity (Xn ).
What is then average velocity (the standard deviation of the velocity) of
the whole wire? Since the number N of atoms and electrons is very large (of
the order of 1023 ), we can write
N
n
X
m
1 m
1 X
Y =
X p
p
Xn ,
N (m + M ) n=1 n
N M
N n=1
and the term in parentheses is close to a centered gaussian with variance 2
by the central limit theorem.
The standard deviation of Y is therefore very close to
1 m
(Y ) = p
.
N M
p
Notice the occurrence of the famous ratio 1/ N , which is characteristic of
the fluctuations of a system with N particles. In the present case, this reduces
the average velocity of the set of electrons to 1.5 107 m s1 .
4
Depending on the metal used, it may not be N , but rather N /2 or N /3; however, the
principle of the calculation remains the same.
560
Exercise 21.1 All 52 cards in a pack of cards are dealt to four players. What is the probability
that each player gets an ace? Does the probability change a lot when playing algebraic whist?
(Recall that algebraic whist is played with a pack of 57 cards.)
Hint: Compute first, in general, the number of ways of subdividing n objects into k
subsets containing r1 , . . . , rk objects, respectively.
Exercise 21.2 Take a pack of n cards, and deal them face up on a table, noting the order the
cards come in. Then shuffle the pack thoroughly and deal them again. What is the probability
that at least one card occurs at the same position both times? Does this probability have a
limit as n goes to infinity?
It is amusing to know that this result was used by con-men on Mississippi steamboats, who
exploited the passengers intuitive (wrong) assumption that the probability would be much
smaller than it actually is.
Conditional probabilities
Exercise 21.3 Mrs. Fitzmaurice and Mrs. Fitzsimmons discuss their respective children
during a plane trip. Mrs. Fitzmaurice says: I have two children, let me show you their
pictures. She looks in her bag, but finds only one picture: This is my little Alicia, playing
the piano last summer.
What is the probability that Alicia has a brother?
Exercise 21.4 A religious sect practices population control by means of the following rules:
each family has children (made in the dark) until a boy is born. Then, and only then, they
abstain for the rest of their days. If the (small) possibility of twins is neglected, and knowing
that without any control, about 51% of babies are boys, what will be the percentage of males
in the community?
Same question if each family also limits to twelve the total number of children it can have.
Exercise 21.5 A new diagnostic test for a lethal disease has been devised. Its reliability is
impressive, says the company producing it: clinical trials show that the test is positive for
95% of subjects who have the disease, and is negative for 95% of subjects who do not have it.
Knowing that this disease affects roughly 1% of the population, what probability is there that
a citizen taking the test and getting a positive result will soon end up in eternal rest in the
family plot?
Exercise 21.6 Another amusing exercise. The other day, my good friend J.-M. told me that
he knows someone to whom the most amazing thing happened. This other person was just
thinking of one of his former relations, who he had not thought about for 10 years, and the
very next day he learned the person had just died, at the same time he was thinking of him!
(Or maybe within ten minutes.)
What should one think of this extraordinary (rigorously true) fact? Does this kind of
thing happen often, or do we have here an example of ESP (Extra-Sensory Perception, for the
uninitiated)?
Exercise 21.7 In 1761, Thomas Bayes, Protestant theologian, leaves for ever this vale of tears.
He arrives at the Gates of Heaven. But there is not much room left, and anyway God is a strict
Presbyterian and Bayes, unluckily for him, was a Nonconformist minister. However, St. Peter
Exercises
561
gives him a chance: Bayes is shown three identical doors, two of which lead to Hell and only
one to Heaven. He has to choose one. With no information available, Bayes selects one at
random. But before he can open it, St. Peter who is good and likes mathematicians stops
him. Wait, here is a clue, he says, and opens one of the two remaining doors, which Bayes
can see leads to Hell.
What should Bayes do? Stick to his door, or change his mind and select the other unopened
door?5
(100 g) are needed. But she only has two eggs from two different brands. The first package
says the eggs inside are large (between 50 and 60 g, with mass uniformly distributed), and the
second says the eggs are medium (between 40 and 45 g with uniform distribution). What is
the probability density of the total mass of the two eggs? What is the average and standard
deviation of this mass? What is the probability that the mass is greater than 100 g?
Covariance
Exercise 21.10 Let (Xn )nN be a sequence of random variables, pairwise independent, with
3. Let
Sn =
n
X
Yi
n
i=1
for n N . Show that the sequence (Sn )nN converges in probability to the determinist
(constant) random variable p 2 .
This version of the classical Monty Hall paradox was suggested by Claude Garcia.
Recall that a Bernoulli distribution with parameter p is the distribution of a random
variable which takes the value 1 with probability p and 0 with probability 1 p.
6
562
distribution with parameter and Y has a Poisson distribution with parameter , where
, > 0. Let pn = P(X = n) and qn = P(Y = n) for n N.
i) Show that Z = X + Y has a Poisson distribution with parameter + .
Convergence
Exercise 21.13 A player flips two coins B1 and B2 repeatedly according to the following
rule: if flipping whichever coin is in his hand yields tails, switch to the other coin for the next
toss, otherwise continue with the same. Assume that tails is the outcome of flipping B1 with
probability p1 = 0.35, and of flipping B2 with probability p2 = 0.4. Let Xk be the random
variable equal to 1 if the player is flipping coin B1 at the k-th step, and equal to 0 otherwise.
Let
h (k)
h1 (k) = P(Xk = 1),
h2 (k) = P(Xk = 0),
and
H (k) = 1
.
h2 (k)
i) Show that H (k + 1) = M H (k), where M is a certain 2 2 matrix. Deduce that
H (k) = M k H (0), where H (0) is the vector corresponding to the start of the game.
ii) Diagonalize the matrix M . Show that as k tends to infinity, H (k) converges to a vector
which is independent of the initial conditions. Deduce that the sequence (Xk )kN
converges almost surely to a random variable, and write down the distribution of this
random variable.
Exercise 21.14 Let (Un ) be a sequence of identically distributed independent random vari-
Exercise 21.15 (Counterexample to the central limit theorem) Show that if X1 and X2 are
Cauchy random variables with parameters a and b , respectively (see Exercise 21.11), then the
random variable X1 + X2 is a Cauchy random variable with parameter a + b (compute the
characteristic function of a Cauchy random variable). What conclusion can you draw regarding
the central limit theorem?
PROBLEMS
Problem 7 (Chain reaction) Let X and N be two square integrable random variables taking
values in N. Let (Xk )k1 be a sequence of independent random variables with the same distribution as X . Assume further that N is independent of the sequence (Xk )k1 . Let GX and G N
denote the generating functions of X and N . For n 1, let
    S_n = Σ_{k=1}^n X_k,     S(ω) = Σ_{k=1}^{N(ω)} X_k(ω),     G(t) = Σ_{k=0}^∞ p_k t^k.
Finally, let Zk be the random variable which is the total number of particles at time k,
and let xn be the probability that no particle survives at time n (end of the reaction):
    xn = P(Zn = 0)   for all n ∈ N.
iv) Find G when p0 = 1. We now assume that p0 < 1. Show that G is strictly increasing
on [0, 1].
v) Find G if p0 + p1 = 1. Assuming now that p0 + p1 < 1, show that G is convex.
vi) In the following cases, find the number of solutions of the equation G(x) = x in [0, 1]:
a) p0 = 0;
vii) Find a recurrence relation expressing the generating function G_{n+1} of Z_{n+1} in terms of
the generating function G_n of Z_n and of G.
viii) Still under the condition 0 < p0 < 1, find a recurrence relation between xn and x_{n+1}.
Show that the sequence (xk)k∈N is increasing, and deduce that it converges to a limit ℓ
such that 0 < ℓ ≤ 1.
ix) Show that E(Zn) = E(Z1)^n.
x) Show from the preceding results that:
(a) if E(Z1) ≤ 1, then the probability that the reaction stops before time n converges
to 1 as n tends to infinity;
(b) if E(Z1) > 1, then this probability converges to a limit ℓ with 0 < ℓ < 1. What should
we expect from the reaction? What are your comments?
Problem 8 (Random walk in the plane) Consider a plane named P, a point named O which
serves as origin for some cartesian coordinate system on P, and a drunk mathematician named
Pierre who, having profited extensively from his pub's extensive selection of single malt whiskey,
leaves the aforementioned pub (located at the origin O) at time t = 0 to walk home. Each
second, he makes a single step of fixed unit length, in a random direction (obstacles, constables,
and other mishaps are considered negligible).
Let Xn and Yn be the r.v. corresponding to the x- and y-coordinates of Pierre after the
n-th step, and let θn denote the angle (measured from the x-axis) giving the direction followed
during the n-th step. Thus we have
    X_n = Σ_{k=1}^n cos θ_k   and   Y_n = Σ_{k=1}^n sin θ_k.
We assume that the θk are independent random variables uniformly distributed between 0 and 2π.
The objective is to study the r.v.
    R_n = √(X_n² + Y_n²).
i) Show that the r.v. Xn and Yn have expectation zero, and compute their variance.
ii) Are Xn and Yn correlated? Are they independent?
iii) What is the expectation of Rn²?
iv) Explain why it is reasonable to assume that, for large n, the joint distribution of (Xn, Yn)
may be approximated by the joint distribution of a pair of independent centered normal
distributions. (Use the fact that uncorrelated gaussian variables are in fact independent.)
What is the (approximate) probability density of the pair (Xn, Yn)?
v) Deduce the (approximate) distribution function, and then the probability density,
of Rn.
vi) Finally, find an approximation for the expectation E(Rn).
SOLUTIONS
Solution of exercise 21.3. The probability that Alicia has a brother is 2/3, and not 1/2.
Indeed, among families with two children, about 25% have two boys, 25% have two girls,
and therefore 50% have one of each. Since we know that there is at least one girl, the probability
of having two girls is 1/4 divided by 1/2 + 1/4 = 3/4, which is 1/3; the probability of a brother is therefore 2/3.
If Mrs. Fitzmaurice had distinguished between her children (saying, for instance, "Here
is my older child, at the piano"), the probability would have been 1/2. (Among those families
whose first child is a girl, there are 50% where the second is a girl.)
Solution of exercise 21.2. Let Ai denote the event "the i-th card is at the same
place both times." It is easy to see that P(Ai) = 1/n, that P(Ai ∩ Aj) = 1/(n(n − 1)) for i ≠ j,
and that more generally
    P(A_{i1} ∩ ··· ∩ A_{ik}) = (n − k)!/n!
when all indices are distinct. Using the Poincaré formula, the probability of the event B: "at
least one of the cards is at the same place both times" is
    P(B) = P( ∪_{i=1}^n A_i ),
that is,
    P(B) = Σ_{k=1}^n (−1)^{k+1} C_n^k (n − k)!/n! = 1 − Σ_{k=0}^n (−1)^k / k!.
This probability converges to 1 − 1/e (the sum is the beginning of the power series expansion of
e^{−1}), or roughly 63%.
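A short Monte Carlo sketch of my own confirms the limit 1 − 1/e ≈ 0.632 for a 52-card deck.

import random, math

def at_least_one_match(n):
    deck = list(range(n))
    random.shuffle(deck)
    return any(card == pos for pos, card in enumerate(deck))

n, trials = 52, 100_000
hits = sum(at_least_one_match(n) for _ in range(trials))
print(hits / trials, 1 - math.exp(-1))   # both ~ 0.632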
Solution of exercise 21.4. The probability hasn't changed: there are still 51% of baby boys
in the community.
This is easy to check: notice that every family has exactly one boy; those with two children
have exactly one boy and one girl, those with three have one boy and two girls, etc. The families
with n children have one boy and n − 1 girls.
Let P(n) be the probability that a family has n children. We have of course P(n) = p q^{n−1},
with p = 0.51 the probability that a baby is a boy, and with q = 1 − p.
The proportion of boys is then given by
    t = ( Σ_{n≥1} P(n) ) / ( Σ_{n≥1} n P(n) ) = 1 / (1/p) = p = 0.51,
since each family has exactly one boy and, on average, 1/p children.
Solution of exercise 21.5. Let P denote the event "the result of the test is positive," and let
P̄ be the opposite event. Let M denote the event "the person taking the test has the disease,"
and let M̄ be its opposite.
From the statement of the problem, we know that P(P|M) = 0.95 and that P(P̄|M̄) = 0.95.
Since (M, M̄) is a partition of the sample space (either a person has the disease, or not), by
Bayes's formula, we have
    P(M|P) = P(P|M) P(M) / [ P(P|M) P(M) + P(P|M̄) P(M̄) ].
With P(P|M̄) = 1 − P(P̄|M̄) = 0.05, the probability of having the disease when the test is
positive is only around 16%.
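The computation is easy to redo numerically; in the sketch below the prevalence P(M) is my
own assumption (the exercise statement is not reproduced here), chosen as 1%, which is
consistent with the 16% figure quoted above.

p_pos_given_sick = 0.95
p_neg_given_healthy = 0.95
p_sick = 0.01                       # assumed prevalence

p_pos_given_healthy = 1 - p_neg_given_healthy
p_pos = p_pos_given_sick * p_sick + p_pos_given_healthy * (1 - p_sick)
p_sick_given_pos = p_pos_given_sick * p_sick / p_pos
print(p_sick_given_pos)             # ~ 0.161, i.e. about 16%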
Solution of exercise 21.6. Before making computations, two comments are in order. Saying
that the story is true does not mean that premonition really occurred, but that this is how the
story was told. Moreover, you are entitled to disbelieve it: it is not because it is printed, with "True
Story" explicitly stated, even in bold face, that it is really true. For this type of debate, this is
an important remark.
What is the probability, for one given person whom I think about once in ten years, that I
will do so within ten minutes of his death? Let's spread out death on an interval of 80 years,
or roughly 42 million minutes, so that we can guess that the probability is roughly one out of
4.2 million.
I estimate that I know, more or less, roughly a thousand persons (including those I think
about only once every ten or twenty years); there may therefore be about one million people
who know someone I know.⁷ Hence it is not only possible, but in fact rather unsurprising,
that someone could tell me such a story!
⁷ This may be of the wrong order of magnitude, since my acquaintances each have, among
the people they know (those they only think about once in ten years, including people
known only by name, actors, and so on), rather more than a thousand individuals. On the
other hand, the sets are certainly not disjoint. Any finer estimate would be interesting.
This does not mean that ESP does not exist; only that there is no need here to invoke a
paranormal phenomenon in order to explain what happened.
Solution of exercise 21.7. Let us assume that door A leads to Heaven, and doors B and
C to Hell. If Bayes had chosen A, Saint Peter will open B or C for him; changing his mind
means Bayes will go to Hell.
If Bayes had chosen door B, Saint Peter, necessarily, must open door C for him (since he
opens a door leading to Hell), so if Bayes changes his mind, he will go to Heaven. The same is
obviously true if Bayes had chosen door C .
Therefore, independently of his original choice, if Bayes changes his mind after Saint Peter
has shown him one door, this change of mind will switch his destination from the one first
chosen. This means the probability that Bayes will go to Heaven is now 2/3 and the probability
of going to Hell is now 1/3.
Conclusion: well, this rather depends on what Bayes really wants does he want to spend
the rest of his (eternal) life hearing angels play the harp, or is he more interested in playing
roulette with the devils?
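A simulation of the three-door game (a sketch of my own, using the rules exactly as described
in the exercise) gives the same 2/3 versus 1/3 split between switching and not switching.

import random

def play(switch):
    doors = [0, 1, 2]
    heaven = random.choice(doors)
    choice = random.choice(doors)
    # St. Peter opens a door that is neither the chosen one nor the good one
    opened = random.choice([d for d in doors if d != choice and d != heaven])
    if switch:
        choice = next(d for d in doors if d != choice and d != opened)
    return choice == heaven

trials = 100_000
print(sum(play(True) for _ in range(trials)) / trials)    # ~ 2/3
print(sum(play(False) for _ in range(trials)) / trials)   # ~ 1/3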
inequality. The goal is that the probability of the grade being too far from m be less than 0.2.
Consider the random variable X1 + ··· + XN. It has expectation N m, and SN has expectation
m. The variance of X1 + ··· + XN, on the other hand, is equal to N σ², as can be seen by looking
at the centered random variables X̃i = Xi − m, with zero expectation and standard deviation σ.
The variance of the sum X̃1 + X̃2 is simply E((X̃1 + X̃2)²) = E(X̃1²) + E(X̃2²) = 2σ²,
and similarly by induction for the sum of N random variables (since every crossed term has
expectation equal to zero). Hence the standard deviation of X1 + ··· + XN is √N σ and that
of SN = (X1 + ··· + XN)/N is σ/√N.
We are looking for a value of N such that
    P( |SN − m| ≥ 1/2 ) ≤ 0.2.
The Bienaymé-Tchebychev inequality states that we always have
    P( |SN − m| ≥ λ σ/√N ) ≤ 1/λ².
So we need 1/λ² = 0.2, or λ = √5, and it then suffices that
    √5 σ/√N = 1/2,
or N = 4 · 5 · σ² = 20 σ², to be certain that the desired inequality holds.
For a value σ = 2, one finds N = 80!!! This is rather unrealistic, considering the average
budget of a university. If the graders manage to grade very consistently, so that their individual
standard deviation is σ = 1/2, it will be enough to take N = 5. This is still a lot but will
provoke fewer cardiac incidents among university administrators.
However, notice that those estimates are not optimal, because the Bienaym-Tchebychev inequality can be refined significantly if the distribution of the random variables Xi is known (it
is only in itself a worst-case scenario). We know that 80 graders will achieve the desired goal,
independently of the distribution (given the expectation and standard deviation), but it is very
likely that a smaller number suffices.
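To see how pessimistic the bound is, one can simulate; the gaussian model for the individual
graders below is my own assumption (the exercise only fixes the mean m and the standard
deviation σ).

import numpy as np

rng = np.random.default_rng(0)
m, sigma = 10.0, 2.0

def p_far(N, trials=200_000):
    grades = rng.normal(m, sigma, size=(trials, N))
    return np.mean(np.abs(grades.mean(axis=1) - m) >= 0.5)

print(20 * sigma**2)          # Tchebychev guarantee: N = 80 graders
for N in (20, 30, 80):
    print(N, p_far(N))        # with gaussian graders, N around 30 already suffices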
Solution of exercise 21.9. The probability densities of the masses of the first and second
egg, respectively, are
    f1(x) = 1/10 if x ∈ [50, 60],  0 if x ∉ [50, 60],
and
    f2(x) = 1/5 if x ∈ [40, 45],  0 if x ∉ [40, 45].
The expectation for the mass of the first egg is of course 55 g, and the expectation for the mass
of the second is 42.5 g. Thus the expectation for the total mass is 97.5 g.
The probability density of the mass of the two eggs together is the convolution
    f(x) = f1 ∗ f2 (x) = ∫_{−∞}^{+∞} f1(t) f2(x − t) dt
(because, obviously, we can assume that the two masses are independent). Its graph is a trapezoid
supported on [90, 105]: it rises linearly between 90 and 95, is constant equal to 0.1 between 95
and 100, and decreases linearly to 0 at 105.
Since σ1² = 25/3 and σ2² = 25/12 by immediate computations, we obtain σ² = 125/12, so the
standard deviation is σ = (5/2)√(5/3) ≈ 3.2 g. From the graph, the probability that the mass is at least 100
g is 1/4.
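A Monte Carlo sketch of my own, using the two uniform intervals given in the exercise,
reproduces the mean, the standard deviation, and the probability of exceeding 100 g.

import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
mass = rng.uniform(50, 60, n) + rng.uniform(40, 45, n)
print(mass.mean())              # ~ 97.5 g
print(mass.std())               # ~ 3.2 g, i.e. sqrt(125/12)
print(np.mean(mass >= 100))     # ~ 0.25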
Solution of exercise 21.11. Let X1 and X2 be two normal centered random variables, with
variances σ1² and σ2², respectively. By Proposition 20.67, the probability density of the ratio
X1/X2 is given by
    g(y) = 1/(2π σ1 σ2) ∫_{−∞}^{+∞} exp(−t² y²/(2σ1²)) exp(−t²/(2σ2²)) |t| dt
         = 1/(2π σ1 σ2) ∫_{−∞}^{+∞} exp( −(t²/2)(y²/σ1² + 1/σ2²) ) |t| dt
         = 1/(π σ1 σ2) · 1/(y²/σ1² + 1/σ2²) = (σ1/σ2) / ( π (y² + σ1²/σ2²) ),
which is the probability density of a Cauchy random variable with parameter σ1/σ2. (To obtain
the last line, notice that the integral to compute is of the type ∫ u′(t) e^{u(t)} dt.)
The characteristic function of a Cauchy random variable with parameter a is
    ∫_{−∞}^{+∞} [ a/(π(a² + x²)) ] e^{iux} dx = e^{−a|u|}.
Since the characteristic function of the sum of independent random variables is the product
of the characteristic functions of the arguments, we get that the characteristic function of
X1 + ··· + Xn is
    exp( −(a1 + ··· + an) |u| ),
which does not converge pointwise to the characteristic function of a normal random variable.
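One can also see this empirically (a sketch of mine, not from the book): the average of n
standard Cauchy variables has the same spread whatever n, instead of concentrating like 1/√n.

import numpy as np

rng = np.random.default_rng(2)
for n in (1, 10, 1000):
    means = rng.standard_cauchy((20_000, n)).mean(axis=1)
    # the interquartile range stays ~2: the average of n Cauchy variables is again a standard Cauchy
    q25, q75 = np.percentile(means, [25, 75])
    print(n, q75 - q25)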
Solution of problem 7. The generating function of Sn = X1 + ··· + Xn is
    G_{S_n} = ∏_{k=1}^n G_{X_k} = (G_X)^n.
Since P(S = n) = Σ_{k=0}^∞ P(S = n and N = k) and P(S = n and N = k) = P(Sk = n) P(N = k)
(because N is independent of the Xk), we derive
    G_S(t) = Σ_{n=0}^∞ P(S = n) t^n = Σ_{k=0}^∞ Σ_{n=0}^∞ P(Sk = n) P(N = k) t^n
           = Σ_{k=0}^∞ P(N = k) Σ_{n=0}^∞ P(Sk = n) t^n = Σ_{k=0}^∞ P(N = k) G_k(t)
           = Σ_{k=0}^∞ P(N = k) (G_X(t))^k = G_N( G_X(t) ).
iii) E(S) = G_S′(1) = G_N′(G_X(1)) G_X′(1) = E(N) E(X). An easy computation yields
    Var(S) = G_S″(1) + G_S′(1) − G_S′(1)² = E²(X) V(N) + E(N) V(X).
In the three cases a), b), and c), the equation G(x) = x has two, one, and two solutions,
respectively.
vii) Using question ii), we find that G_{Z2} = G ∘ G and then, by induction, that
    G_{Z_{n+1}} = G_{Z_n} ∘ G = G ∘ ··· ∘ G   (n + 1 times).
viii) We have x_{n+1} = G ∘ ··· ∘ G(0) = G(xn). In all cases above, there exists an interval
[0, ℓ] on which G(x) ≥ x, and this interval is stable by G, so that the sequence (xn)n∈N
is increasing and bounded from above by ℓ. Hence, it converges.
ix) We have E(Z_{n+1}) = G′_{Z_{n+1}}(1) = G′_{Z_n}(G(1)) G′(1) = E(Zn) E(Z1) since G(1) = 1, and
hence, by induction again, we get E(Zn) = E(Z1)^n.
x)
x)
(b) We assume now that E(Z1) > 1. The graph of G is of type c), so that xn → ℓ ∈ ]0, 1[.
The probability of extinction is not 1, but it is positive (because
the particles may all disappear during the first steps). Moreover, E(Zn) → +∞:
the reaction is explosive.
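As an illustration of the fixed-point result, here is a sketch of mine (the offspring law is my own
example, not the book's): with p0 = 0.3, p1 = 0.3, p2 = 0.4 we have E(Z1) = 1.1 > 1, and the
smallest solution of G(x) = x is ℓ = 0.75; a simulation of the reaction recovers this extinction
probability.

import random

p = (0.3, 0.3, 0.4)        # assumed offspring law; G(x) = 0.3 + 0.3x + 0.4x^2
# G(x) = x has the solutions x = 0.75 and x = 1

def offspring():
    return random.choices((0, 1, 2), weights=p)[0]

def goes_extinct(max_gen=100, cap=5_000):
    z = 1
    for _ in range(max_gen):
        if z == 0:
            return True
        if z > cap:        # the reaction has clearly exploded
            return False
        z = sum(offspring() for _ in range(z))
    return False

trials = 20_000
print(sum(goes_extinct() for _ in range(trials)) / trials)   # ~ 0.75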
Solution of problem 8
i) We compute the expectation of Xn:
    E(Xn) = E( Σ_{k=1}^n cos θk ) = Σ_{k=1}^n E(cos θk) = Σ_{k=1}^n (1/2π) ∫_0^{2π} cos θ dθ = 0,
and similarly E(Yn) = 0. Since the θk are independent, the crossed terms E(cos θk cos θl)
vanish for k ≠ l, and
    σ²(Xn) = Σ_{k=1}^n E(cos² θk) = n (1/2π) ∫_0^{2π} cos² θ dθ = n/2;
in the same way, σ²(Yn) = n/2.
ii) To determine whether the variables Xn and Yn are correlated or not, we compute the
covariance. We have
    Cov(Xn, Yn) = E[ (Xn − E(Xn))(Yn − E(Yn)) ] = E(Xn Yn) = E( Σ_{k=1}^n Σ_{l=1}^n cos θk sin θl ),
which vanishes term by term, since E(cos θ sin θ) = (1/2π) ∫_0^{2π} cos θ sin θ dθ = 0 and the
crossed terms factor by independence.
Of course, the same is true for Yn, which can be approximated by a centered normal variable
which also has probability density fn.
Since Xn and Yn are uncorrelated, it is natural to assume that the approximating gaussian
variables are also uncorrelated, and therefore independent, so that the joint probability density
is
    f_{Xn,Yn}(x, y) ≈ (1/πn) exp( −(x² + y²)/n ).
v) The distribution function Hn(r) of the r.v. Rn is given by
    Hn(r) = P(Rn ≤ r) = ∬_{B(0;r)} f_{Xn,Yn}(x, y) dx dy,
where B(0; r) is the closed disc of radius r centered at the origin. Using the approximation above
for the density and polar coordinates, we obtain
    Hn(r) ≈ (1/πn) ∫_0^{2π} ∫_0^r e^{−ρ²/n} ρ dρ dφ = (2/n) ∫_0^r e^{−ρ²/n} ρ dρ
          = ∫_0^{r²} e^{−u/n} du/n = 1 − e^{−r²/n}.
The probability density of Rn is therefore
    hn(r) = Hn′(r) = (2r/n) e^{−r²/n}.
vi) Finally,
    E(Rn) ≈ ∫_0^{+∞} r hn(r) dr = ∫_0^{+∞} (2r²/n) e^{−r²/n} dr = √(πn)/2.
What a beautiful argument! Unfortunately, it comes somewhat after the fact, and it
has limits. Indeed, a one-dimensional random walk is also characterized by a distance
to the origin E(Rn) proportional to √n, whereas the probability of getting closer to the origin is
the same as the probability of going further from it.
The conclusion to draw from this tale: beware of nice heuristic reasonings; sometimes,
they are but webs of lies and deceit.
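A short simulation of Pierre's walk (a sketch of my own) confirms both the density found
above and the estimate E(Rn) ≈ √(πn)/2, as well as E(Rn²) = n.

import numpy as np

rng = np.random.default_rng(3)
n, walks = 400, 50_000
theta = rng.uniform(0, 2 * np.pi, size=(walks, n))
R = np.hypot(np.cos(theta).sum(axis=1), np.sin(theta).sum(axis=1))
print(R.mean(), np.sqrt(np.pi * n) / 2)     # both ~ 17.7
print((R**2).mean(), n)                     # E(Rn^2) = n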
Appendix
Reminders concerning
topology and normed
vector spaces
A.1
Topology, topological spaces
Since topology is probably not well known to many physicists, we recall a
few elementary definitions. At the very least, they are very useful for stating
precisely properties of continuity and convergence in sets which are more
complicated than R.
DEFINITION A.1 (Open sets, neighborhoods) Let E be an arbitrary set. A topology on E is a
set O of subsets of E, called open sets, such that
i) ∅ ∈ O and E ∈ O;
ii) for any family (Ai)i∈I of open sets, the union ∪_{i∈I} Ai ∈ O is open (any union of open
sets is open);
iii) for any n ∈ N and any finite family (A1, . . . , An) ∈ O^n of open sets,
the intersection ∩_{i=1}^n Ai ∈ O is open (any finite intersection of open sets is
open).
Using the notion of open set, the continuity of a map can be defined.
DEFINITION A.3 (Continuity of a map) Let (E , O) and (F , O ) be two topo-
it is not the union of two nonempty disjoint open sets: it is not possible to
find A and B such that
    X = A ∪ B   with   A ∈ O,  B ∈ O,  A ≠ ∅,  B ≠ ∅,  A ∩ B = ∅.
Be careful not to mix up connected and convex. The first is a purely topological
notion, the second is geometric.
DEFINITION A.8 (Convexity) A subset X in a vector space or in an affine
space is convex if, for any points A, B in X, the line segment [A, B] is
contained in X. (The segment [A, B] is defined as the set of points λA +
(1 − λ)B, for λ ∈ [0, 1], and this definition requires a vector space, or affine
space, structure.)
Example A.9 All vector spaces are convex. The book shows here examples (in the plane) of a convex set (a)
and a nonconvex set (b); note that both are connected.
DEFINITION A.10 (Simply connected set) A topological space (E , O) is simply connected if any closed loop inside E may be deformed continuously to
a trivial constant loop (i.e., all loops are homotopic to zero).
A subset X E is simply connected if the topological space (X , O X ) is
simply connected.
In the special case E = R2 (or E = C), a subset X R2 is simply connected
if there is no hole in X .
Example A.11 Still in the plane, the set in (a) is simply connected, whereas the set in (b) is not.
Again both are connected.
Example A.12 Let X be the complex plane minus the origin 0; then X is not simply connected.
Indeed, any loop turning once completely around the origin cannot be retracted to a single point
(since 0 is not in X !).
On the other hand, the complex plane minus a half-line, for instance the half-line of
negative real numbers (i.e., C \ R−), is simply connected. The book shows illustrations of
how a certain loop is deformed to a point.
how a certain loop is deformed to a point.
Step 1
Step 2
Step 3
Step 4
576
Example A.13 A torus (the surface of a doughnut) is not simply connected. In the picture
below one can see a closed loop which can not be contracted to a single point.
Example A.14 The unit circle U = {z C ; |z| = 1} in the complex plane is not simply
connected.
An important point is that connectedness and simple connectedness are preserved by continuous maps.
THEOREM A.15 The image of a connected set (resp. simply connected set) by a continuous map
is connected (resp. simply connected).
A covering of a set K is any family of open sets (Ui)i∈I such that K is the union of the sets Ui:
    ∪_{i∈I} Ui = K.
A.2
Normed vector spaces
A.2.a Norms, seminorms
In this section, K is either R or C, and E is a K-vector space, not necessarily
of finite dimension.
DEFINITION A.17 (Norm) Let E be a K-vector space. A norm on E is a map N : E → R+ such that
N2: N(x) = 0 ⟺ x = 0;
N3: N(λ x) = |λ| N(x) for all λ ∈ K;
N4: N(x + y) ≤ N(x) + N(y)   (triangular inequality).
Examples of norms on the space of continuous functions on [a, b] are
    ‖f‖₂ = √( ∫ f(x)² dx )   and   ‖f‖∞ = sup_{[a,b]} |f|.
DEFINITION A.20 (Distance) Let (E, ‖·‖) be a normed vector space. The
distance associated to the norm ‖·‖ is the map
    d : E × E → R+,   (x, y) ↦ d(x, y) = ‖x − y‖.
The distance from a point x ∈ E to a subset A ⊂ E is
    d(x; A) = inf_{y∈A} ‖x − y‖.
Let a ∈ E and r ∈ R+. The open ball centered at a with radius r (for the norm ‖·‖) is the
set defined by
    B(a; r) = { x ∈ E ; ‖x − a‖ < r }.
The closed ball centered at a with radius r is the set defined by
    B̄(a; r) = { x ∈ E ; ‖x − a‖ ≤ r }.
A subset A ⊂ E is open if, for every x ∈ A, there exists r > 0 such that B(x; r) ⊂ A. A point a is
in the interior of A if there exists r > 0 such that the open ball centered at a with
radius r is contained in A:
    ∃ r > 0   B(a; r) ⊂ A.
A point a is adherent to A if every open ball centered at a meets A:
    ∀ r > 0   B(a; r) ∩ A ≠ ∅.
A set B is closed if and only if B̄ = B.
dense is the space Mn (K) of all matrices. The set of diagonalizable matrices is dense in Mn (C).
For any A ∈ Mn(C), we have det(exp A) = exp(tr A). This property is obvious in the case of a
diagonalizable matrix: indeed, assume that A = P diag(λ1, . . . , λn) P^{−1} (with P ∈ GLn(C)); then we have
    exp A = P diag( exp(λ1), . . . , exp(λn) ) P^{−1}
and
    det(exp A) = ∏_{i=1}^n exp(λi) = exp( Σ_{i=1}^n λi ) = exp(tr A).
Since the set of diagonalizable matrices is dense in Mn(C), and since the determinant, the trace
and the exponential function for matrices are continuous (the last being not quite obvious), it
follows that the identity is also valid for an arbitrary A ∈ Mn(C).
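The identity is easy to test numerically; here is a sketch of mine which mirrors the density
argument above: a random matrix is almost surely diagonalizable, so its exponential can be
computed by diagonalization.

import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4))

# exp A through the (almost surely valid) diagonalization A = P diag(lambda) P^-1
lam, P = np.linalg.eig(A)
expA = (P @ np.diag(np.exp(lam)) @ np.linalg.inv(P)).real

print(np.linalg.det(expA))       # det(exp A)
print(np.exp(np.trace(A)))       # exp(tr A): the two agree up to rounding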
Example A.34 Let f : [0, 1] → R be a continuous function such that
    f( (x + y)/2 ) = ( f(x) + f(y) )/2
for all (x, y) ∈ [0, 1]². We want to prove that f is affine, i.e., there exist a, b ∈ R such that
f(x) = a x + b for all x ∈ [0, 1]. For this purpose, let b = f(0) and a = f(1) − f(0), and let
g : x ↦ f(x) − (a x + b). The function g is then continuous, and satisfies g(0) = g(1) = 0.
Moreover, g satisfies the same relation
    g( (x + y)/2 ) = ( g(x) + g(y) )/2
Edmund Georg Hermann Landau (1877–1938), German mathematician, was famous in particular for his many works in analytic
number theory (in particular concerning the Riemann zeta function) and his treatise on number theory. From his youngest days,
he was also interested in mathematical problems and games, and
had published two books of chess problems even before finishing
his thesis. He taught at the University of Berlin from 1899 to 1909,
then at the University of Göttingen (replacing Minkowski) until
he was forced to resign by the Nazis in 1934 (the SS mathematician Teichmüller organized a boycott of his classes).
as before. From this we deduce immediately that g(1/2) = 0, then, repeating the process, that
    g(1/4) = g(2/4) = g(3/4) = 0,
and by an obvious induction we obtain
    g(k/2^n) = 0   for all n ∈ N and all k ∈ [[0, 2^n]].
Since the set E = { k 2^{−n} ; n ∈ N, k ∈ [[0, 2^n]] } is dense in [0, 1], and since g is continuous,
we can conclude that g = 0.
THEOREM A.35 (Properties of open sets) The union of any family of open sets is
an open set.
The intersection of finitely many open sets is an open set.
Example A.36 The union ∪_{n∈N*} ]1/n, 1[ = ]0, 1[ is open in R.
Counterexample A.37 The intersection of infinitely many open sets has no reason to be open:
for instance, ∩_{n∈N*} ]0, 1 + 1/n[ = ]0, 1] is not open.
THEOREM A.38 (Properties of closed sets) The union of finitely many closed
sets is a closed set.
The intersection of any family of closed sets is a closed set.
Counterexample A.39 As before, the union of infinitely many closed sets has no reason to be
closed: for instance, ∪_{n∈N*} [0, 1 − 1/n] = [0, 1[ is not closed.
The sequence (un) is negligible compared to (αn)n∈N if, for any positive
real number ε > 0, there exists an index N ∈ N, depending on ε, such that
‖un‖ ≤ ε |αn| for all n ≥ N. This is denoted un = o(αn), pronounced "un is
little-Oh of αn."
Two real- or complex-valued sequences u = (un)n∈N and v = (vn)n∈N are
equivalent, denoted un ∼ vn, if un − vn = o(un) or, equivalently (and it
shows that the relation is an equivalence relation), if un − vn = o(vn).
The notation above is due to E. Landau.
Any subset of a finite-dimensional normed vector space which is bounded and closed is compact, and in particular, any sequence
with values in such a subset has a convergent subsequence.
Karl Theodor Wilhelm Weierstrass (1815–1897), German mathematician (from Westphalia), a famous teacher, was the first to give a convincing construction of the set R of real numbers (such a construction was lacking in Bolzano's work). He also constructed an example
of a continuous function on R which is nowhere differentiable (unaware of Bolzano's prior work). The famous theorem stating that
any continuous function on an interval [a, b] can be uniformly approximated by polynomials is also due to him. Finally, wishing to
provide rigorous and unambiguous definitions of the concepts of
analysis, he introduced the definition of continuity based on epsilons and deltas, which are the cause of such happy moments and
memories in the lives of students everywhere.
THEOREM A.45 If the norms N and N are equivalent, then any sequence that
converges for N also converges for N and conversely, and moreover, the limits are
equal.
Example A.46 Let E = Kn. We can easily compare the norms N1, N2, and N∞ defined by
    N1(x) = Σ_{i=1}^n |xi|,   N2(x) = ( Σ_{i=1}^n |xi|² )^{1/2},   N∞(x) = max_{1≤i≤n} |xi|.
Indeed, we have
    N∞ ≤ N1 ≤ n N∞   and   N∞ ≤ N2 ≤ √n N∞,
hence (1/√n) N2 ≤ N1 ≤ n N2, so the three norms are pairwise equivalent.
By contrast, the norms ‖·‖1, ‖·‖2, and ‖·‖∞ on a space of functions, the last being the norm of
uniform convergence, are pairwise non-equivalent. To see this, let φ be the function defined
by φ(x) = 1 − |x| if x ∈ [−1, 1] and φ(x) = 0 otherwise.
¹ The property of being bounded; of course, a bound for a set may depend on the norm.
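A numerical sketch of my own, with a discretized version of the tent function φ, shows why no
constant can dominate ‖·‖∞ by ‖·‖1 (or ‖·‖2) uniformly: squeezing φ horizontally keeps the sup
norm equal to 1 while the 1- and 2-norms tend to zero.

import numpy as np

x = np.linspace(-1, 1, 200_001)
dx = x[1] - x[0]

for n in (1, 10, 100, 1000):
    f = np.maximum(0, 1 - np.abs(n * x))   # the tent phi squeezed by a factor n
    norm1 = f.sum() * dx                   # ~ 1/n
    norm2 = np.sqrt((f**2).sum() * dx)     # ~ sqrt(2/(3n))
    norm_inf = f.max()                     # always 1
    print(n, norm1, norm2, norm_inf)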
The vector spaces R, C, and more generally all finite-dimensional vector spaces are
complete, that is, any Cauchy sequence in a finite-dimensional vector space is convergent.
For a continuous linear map f, one sets |||f||| = sup_{‖x‖=1} ‖f(x)‖, so that ‖f(x)‖ ≤ |||f||| ‖x‖ for x ∈ E.
The map f ↦ |||f||| is a norm on the vector space of continuous linear maps from
E to F.
EXERCISE
Exercise A.1 Let (E, ‖·‖) be a complete normed vector space. Show that the space L(E) of
continuous linear maps from E to itself, equipped with the norm |||·|||, is also complete.
SOLUTION
Solution of exercise A.1. Let (un)n∈N be a Cauchy sequence in L(E). For all x ∈ E, we
have
    ‖u_p(x) − u_q(x)‖ ≤ ‖x‖ · |||u_p − u_q|||,
and this proves that the sequence (un(x))n is a Cauchy sequence in E. By assumption, it
converges; we then let
    u(x) = lim_{n→∞} un(x)   for all x ∈ E.
This defines the map u : E → E, which a moment's thought shows to be linear. Obviously, we
want to prove that u is continuous, and is the limit of (un)n∈N.
i) The map u is in L(E). By the Cauchy property, there exists N ∈ N such that |||u_p − u_q||| ≤ 1 for all p, q ≥ N.
Let x ∈ E be such that ‖x‖ = 1. For all p, q ≥ N, we have ‖u_p(x) − u_q(x)‖ ≤ 1. Letting q go to +∞, we obtain
‖u_p(x) − u(x)‖ ≤ 1 for all p ≥ N. Hence
    ‖u(x)‖ ≤ ‖u(x) − u_N(x)‖ + ‖u_N(x)‖ ≤ 1 + |||u_N|||.
This proves that
    sup_{‖x‖=1} ‖u(x)‖ ≤ 1 + |||u_N|||,
so that u is continuous.
ii) The sequence converges to u. Let ε > 0. There exists N ∈ N such that |||u_p − u_q||| ≤ ε for all
p, q ≥ N; for all x with ‖x‖ = 1 and all p, q ≥ N we then have ‖u_p(x) − u_q(x)‖ ≤ ε, and letting
q go to +∞ we get ‖u_p(x) − u(x)‖ ≤ ε (we use the fact that the norm on E is itself a continuous
map). Since this holds for all x ∈ E such that ‖x‖ = 1, we have proved that
    |||u_p − u||| ≤ ε   for all p ≥ N.
And of course, since ε > 0 was arbitrary, this is precisely the definition of the fact that
(un)n∈N converges to u in L(E).
Appendix
Elementary reminders of
differential calculus
B.1
Differential of a real-valued function
B.1.a
    f(a + h) = f(a) + Σ_{i=1}^n (∂f/∂xi)(a) hi + o(h)   as h → 0,    (b.1)
and one writes
    d f_a = Σ_{i=1}^n (∂f/∂xi)(a) dxi,
and the linear map is the same as the differential defined above using
partial derivatives.
In terms of the gradient, this can also be written
    d f = ∇f · dx   or   d f = ⟨∇f, dx⟩.
B.2
Differential of map with values in R p
It is very easy to generalize the previous results for vector-valued functions
of a vector variable. Let thus f : Rn R p be a map with vector values.
All previous notions and results apply to each coordinate of f . We therefore
denote
    f(a) = ( f1(a), . . . , fp(a) )   (as a column vector)   for all a ∈ Rn.
Then, if each coordinate of f has continuous partial derivatives at a ∈ Rn,
we can write down (b.1) for each component:
    f_i(a + h) = f_i(a) + Σ_{j=1}^n (∂f_i/∂x_j)(a) h_j + o(h),   i = 1, . . . , p.
In matrix form, this reads
    f(a + h) − f(a) = ( ∂f_i/∂x_j (a) )_{1≤i≤p, 1≤j≤n} · h + o(h),    (b.2)
where the p × n matrix of partial derivatives is the Jacobian matrix of f at a.
Hence the following definition:
DEFINITION B.2 (Differential) Let f : Rn R p be a map such that each
coordinate fi has partial derivatives at the point a. The differential of f at a is the linear
map d f_a : Rn → Rp such that
    d f_a(h) = ( Σ_{j=1}^n (∂f_i/∂x_j)(a) h_j )_{1≤i≤p}.
The differential of f is then the map a ↦ d f_a, and, in compact form,
    d f = ( ∂f_i/∂x_j ) dx.
B.3
Lagrange multipliers
Using linear forms, it is easy to establish the validity of the method of
Lagrange multipliers.
We start with an elementary lemma of linear algebra.
LEMMA B.3 Let φ and φ1, . . . , φk be linear forms defined on a vector space E of
finite dimension. If
    ∩_{i=1}^k Ker φi ⊂ Ker φ,
then φ is a linear combination of φ1, . . . , φk.
Recall first that, at a free extremum a of a differentiable function f, one has
    d f_a = 0,   that is,   (∂f/∂xi)(a) = 0   for all i ∈ [[1, n]].
Now let us make the problem more complicated: we are looking for extrema of f on a (fairly regular) subset of Rn, such as a curve in Rn, or
a surface, or some higher-dimensional analogue (what mathematicians call a
differentiable subvariety of E). Assume that this set is of dimension n − k and
is defined by k equations (represented by differentiable functions), namely,
    S :   C^(1)(x1, . . . , xn) = 0,  . . . ,  C^(k)(x1, . . . , xn) = 0.    (b.3)
In order that the set S defined by those equations be indeed an (n − k)-dimensional subvariety, it is necessary that the differentials of the equations
C^(1), . . . , C^(k) be linearly independent linear forms at every point in S:
    dC^(1)_x, . . . , dC^(k)_x   are linearly independent for all x ∈ S.
We are now interested in finding the points on a subvariety S where f
has an extremum.
THEOREM B.4 (Lagrange multipliers) Let f be a real-valued differentiable function on E = Rn, and let S be a differentiable subvariety of E defined by equations (b.3). Then, in order for a ∈ S to be an extremum for f restricted to S, it is
necessary that there exist real constants λ1, . . . , λk such that
    d f_a = λ1 dC^(1)_a + ··· + λk dC^(k)_a.
Proof. Assume that a ∈ S is an extremum of f restricted to S, and let U be the
tangent space to S at a. This tangent space is an (n − k)-dimensional vector space,
where a ∈ U is the origin.
If a is an extremum of f on S, we necessarily have d f_a · h = 0 for any vector h
tangent to S (if we move infinitesimally on S from a, the values of f cannot increase
or decrease in any direction). Thus we have U ⊂ Ker d f_a.
On the other hand, U is defined by the intersection of the tangent planes to the
subvarieties with equations C^(i)(x) = 0 for 1 ≤ i ≤ k (because S is the intersection
of those). Hence we have U = ∩_{i=1}^k Ker dC^(i)_a, and therefore
    ∩_{i=1}^k Ker dC^(i)_a ⊂ Ker d f_a,
so that the conclusion follows from Lemma B.3.
In practice, the necessary condition is expressed as the system
    ∂f/∂x1(a) − Σ_{i=1}^k λi ∂C^(i)/∂x1(a) = 0,
       ⋮
    ∂f/∂xn(a) − Σ_{i=1}^k λi ∂C^(i)/∂xn(a) = 0,
together with
    C^(1)(a) = 0,  . . . ,  C^(k)(a) = 0.    (b.4)
A convenient way of remembering it is to introduce the auxiliary function
    F(x) = f(x) − Σ_{i=1}^k λi C^(i)(x).
Look then for points a ∈ Rn giving free extrema of F, that is, where the
n variables x1, . . . , xn are independent (they do not necessarily lie on S). The
condition that the differential of F vanishes gives the first set of equations
in (b.4), and of course the solutions depend on the additional parameters
λi. The values of those are found at the end, using the constraints of the
problem, namely, the second set of equations in (b.4).
Example B.5 Let S be a surface in R3 with equation C(x, y, z) = 0. Let r0 ∈ R3 be a point
not in S, and consider the problem of finding the points of S closest to r0, that is, those
minimizing the function f(x) = ‖x − r0‖², with the constraint C(x) = 0. The auxiliary
function is F(x) = ‖x − r0‖² − λ C(x), with differential (at a point a ∈ R3) given by
    dF_a · h = 2 (a − r0) · h − λ grad C(a) · h   for all h ∈ R3.
The free extrema of F are the points a such that (a − r0) is collinear with the gradient of C
at a. Using the condition C(a) = 0, the values of λ are found. Since grad C(a) is a vector
perpendicular to the surface S, the geometric interpretation of the necessary condition is that
the points of S closest to r0 are those for which (a − r0) is perpendicular to the surface at the
point a.
Of course, this is a necessary condition, but not necessarily a sufficient condition (the
distance may be maximal at the points found in this manner, or there may be a saddle point).
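For a concrete instance of Example B.5, here is a small numerical sketch of mine: take the unit
sphere C(x) = ‖x‖² − 1 and a point r0 outside it. The Lagrange condition predicts the closest
point r0/‖r0‖, and a brute-force search over the sphere agrees.

import numpy as np

r0 = np.array([2.0, 1.0, 2.0])              # |r0| = 3

# Lagrange condition: a - r0 parallel to grad C(a) = 2a, with |a| = 1  =>  a = r0/|r0|
a_lagrange = r0 / np.linalg.norm(r0)

# brute-force check on random points of the sphere
rng = np.random.default_rng(5)
pts = rng.standard_normal((200_000, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
a_brute = pts[np.argmin(((pts - r0) ** 2).sum(axis=1))]

print(a_lagrange)     # [0.667, 0.333, 0.667]
print(a_brute)        # close to the same point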
Remark B.6 Introducing the constraint, multiplied by an additional parameter, in the auxiliary
function in order to satisfy this constraint is a trick used in electromagnetism in order to fix
the gauge. In the Lagrangian formulation of electromagnetism, the goal is indeed to minimize
the action given by the time integral of the Lagrangian under the constraint that the potential
four-vector has a fixed gauge. A gauge-fixing term, for instance (∂μ Aμ)² in the Lorenz gauge, is
added to the Lagrangian density (see [21, 49]).
SOLUTION
Solution of exercise B.1 on page 588. By linear algebra, there exist additional linear forms
φ_{k+1}, . . . , φ_n such that (φ1, . . . , φn) is a basis of E*, since the given forms are linearly independent. Let (e1, . . . , en) be the dual basis.
By the definition of a basis, there are constants λi, 1 ≤ i ≤ n, such that φ = Σ_{i=1}^n λi φi.
Indeed, we have λi = φ(ei), using the formula φm(el) = δml defining the dual basis. Let j ∈ {k + 1, . . . , n}. The same formula,
together with e_j ∈ ∩_{1≤i≤k} Ker φi ⊂ Ker φ (by assumption), gives λj = 0 for j > k. This implies
φ ∈ Vect(φ1, . . . , φk), as desired.
Appendix
Matrices
In this short appendix, we recall the formalism of duality (duality bracket or pairing)
and show how it can be used easily to recover the formulas of change of basis, and to
better understand their origin; this can be considered as a warm-up for the chapter
on the formalism of tensors, which generalizes these results.
C.1
Duality
Let E be a finite-dimensional vector space over the field K = R or C, of
dimension n.
DEFINITION C.1 A linear form on E is a linear map from E to K. The dual space E* is the set
of all linear forms on E, and the duality bracket is ⟨φ, x⟩ = φ(x).
If (e1, . . . , en) is a basis of E, its dual basis (e*1, . . . , e*n) is defined by
    ⟨e*i, ej⟩ = δij   for all i, j ∈ [[1, n]].
For x = Σ_{i=1}^n x^i ei, we have x^i = ⟨e*i, x⟩ for all i ∈ [[1, n]]. The elements
of the dual basis are also called coordinate forms.
C.2
Application to matrix representation
We will now explain how the formalism of linear forms can be used to clarify the use of matrices to represent vectors, and how to use them to remember
easily the formulas of change of basis.
C.2.a
In other words, the vector x_j is expressed in the basis E by the linear combination
    x_j = Σ_{i=1}^n a_{ij} e_i.
The matrix M = (a_{ij})_{1≤i≤n, 1≤j≤q} therefore has, as its j-th column, the coordinates of
the vector x_j seen in E, for all i ∈ [[1, n]] and j ∈ [[1, q]]. We can express the definition
M = ( ⟨e*_i, x_j⟩ )_{ij} symbolically in the form
    M = mat_E(x1, . . . , xq) = ⟨e*| x⟩.
    A_{ij} = c*_i( f(b_j) ) = ⟨c*_i, f(b_j)⟩.
This is denoted also
    A = ⟨c*| f |b⟩.    (∗)
C.2.c Change of basis
The change of basis matrix from B to B′ is
    P = Pass(B, B′) = ⟨b*| b′⟩.
Notice that the inverse of the matrix P, which is simply the change of basis
matrix from B′ to B, is
    ( ⟨b*| b′⟩ )^{−1} = ⟨b′*| b⟩.
The relation between the coordinates for x in each basis is expressed, in abbreviated form, by
    X = P X′   or, with the previous notation,   ⟨b*| x⟩ = ⟨b*| b′⟩ ⟨b′*| x⟩.
It suffices to insert the identity operator, written in the form
    Id = Σ_{k=1}^n |b′_k⟩ ⟨b′*_k|,   that is,   Id = |b′⟩⟨b′*|,    (c.1)
to recover the formula.
Let now f : E → F be a linear map. In addition to the bases B and B′ of
E, let C and C′ be bases of F, and let
    M = mat_{B,C}(f)   and   M′ = mat_{B′,C′}(f).
With
    P = Pass(B, B′) = ⟨b*| b′⟩   and   Q = Pass(C, C′) = ⟨c*| c′⟩,
the change of basis formula relating M and M′ can be recovered by writing
    M′ = ⟨c′*| f |b′⟩ = ⟨c′*| c⟩ ⟨c*| f |b⟩ ⟨b*| b′⟩ = Q^{−1} M P.    (c.2)
One should remember that (c.1) provides a way to recover without mistake the
change of basis formula for matrices, in the form
    ⟨c′*| f |b′⟩ = ⟨c′*| c⟩ ⟨c*| f |b⟩ ⟨b*| b′⟩.
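Formula (c.2) is easy to check numerically; the sketch below is my own: take a random matrix
M of f in the old bases and random invertible change of basis matrices P and Q. The trace of an
endomorphism (see Exercise C.1 below) is likewise seen to be basis-independent.

import numpy as np

rng = np.random.default_rng(6)
M = rng.standard_normal((3, 3))          # matrix of f in the bases B, C
P = rng.standard_normal((3, 3))          # Pass(B, B')
Q = rng.standard_normal((3, 3))          # Pass(C, C')

M_prime = np.linalg.inv(Q) @ M @ P       # formula (c.2)

# coordinates transform as X = P X', so f(x) reads the same in both pairs of bases
X_prime = rng.standard_normal(3)
print(np.allclose(Q @ (M_prime @ X_prime), M @ (P @ X_prime)))   # True

# for an endomorphism (same basis on both sides) the trace is invariant
E_prime = np.linalg.inv(P) @ M @ P
print(np.trace(M), np.trace(E_prime))    # equal up to rounding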
Exercise C.1 The trace of an endomorphism f ∈ L(E) is defined by the formula
    tr(f) = Σ_{i=1}^n ⟨e*_i, f(e_i)⟩
in a basis E = (e1, . . . , en). Show that this definition is independent of the chosen basis.
C.2.e
Appendix
A few proofs
This appendix collects a few proofs which were too long to insert in the main
text, but are interesting for various reasons; they may be intrinsically beautiful, or
illustrate important mathematical techniques which physicists may not have other
opportunities to discover.
Let γ be a closed path and U the complement of the image of γ in the complex plane. The winding number
of γ around the point z,
    Ind_γ(z) = (1/2iπ) ∫_γ dξ/(ξ − z)   for all z ∈ U,    (d.1)
is an integer.
Here ψ(t) = exp( ∫_0^t γ′(s)/(γ(s) − z) ds ). Now, we can compute easily that ψ′(t)/ψ(t) = γ′(t)/(γ(t) − z) for all t ∈ [0, 1], except
possibly at the finitely many points S ⊂ [0, 1] where the curve is not differentiable. Hence
we find
    d/dt [ ψ(t)/(γ(t) − z) ] = ψ′(t)/(γ(t) − z) − ψ(t) γ′(t)/(γ(t) − z)²
                             = ψ(t) γ′(t)/(γ(t) − z)² − ψ(t) γ′(t)/(γ(t) − z)² = 0
on [0, 1] \ S. Because the set S is finite, it follows that ψ(t)/(γ(t) − z) is constant. In
addition, we have obviously ψ(0) = 1, and hence
    ψ(t) = (γ(t) − z)/(γ(0) − z).
Let Δ be a triangle in the plane (by this is meant a filled triangle, with boundary ∂Δ consisting
of three line segments), entirely contained in an open set Ω. Let p ∈ Ω, and let
f ∈ H(Ω \ {p}) be a function continuous on Ω and holomorphic on Ω except
possibly at the point p. Then
    ∫_{∂Δ} f(z) dz = 0.
Let [a, b], [b, c], and [c, a] denote the segments which together form the boundary of
the filled triangle Δ. We can assume that a, b, c are not on a line, since otherwise the result is
obvious. We now consider three cases: p ∉ Δ, p is a vertex, p is inside the triangle
but not a vertex.
p is not in Δ. Let c′ be the middle of [a, b], and a′ and b′ the middles, respectively, of
[b, c] and [c, a]. We can subdivide Δ into four smaller similar triangles Δi, i = 1, . . . , 4,
as in the picture, and we have
    J := ∫_{∂Δ} f(z) dz = Σ_{i=1}^4 ∫_{∂Δi} f(z) dz
(the integrals on [a′, b′], [b′, c′], and [c′, a′] which occur on the right-hand side do so
in opposite pairs because of orientation). It follows that for one index i ∈ {1, 2, 3, 4} at
least, we have |∫_{∂Δi} f(z) dz| ≥ |J/4|. Let Δ^(1) denote one triangle (any will do) for which
this inequality holds. Repeating the reasoning above with this triangle, and so on, we
obtain a sequence (Δ^(n))n∈N of triangles such that Δ^(1) ⊃ Δ^(2) ⊃ ··· ⊃ Δ^(n) ⊃ ··· and
    | ∫_{∂Δ^(n)} f(z) dz | ≥ |J| / 4^n
at each step. Let L be the length of the boundary ∂Δ. Then it is clear that the length
of ∂Δ^(n) is equal to L/2^n for n ≥ 1.
Finally, because the sequence of triangles is a decreasing (for inclusion) sequence of
compact sets with diameter tending to 0, there exists a unique intersection point:
    ∩_{i≥1} Δ^(i) = {z0}.
Since f is holomorphic at z0, for any ε > 0 there exists r > 0 such that
    | f(z) − f(z0) − f′(z0)(z − z0) | ≤ ε |z − z0|
for all z ∈ C such that |z − z0| < r. Since the diameter of Δ^(n) tends to zero, we have
|z − z0| < r for all z ∈ Δ^(n) if n is large enough. Using Proposition 4.30, we have
    ∫_{∂Δ^(n)} f(z0) dz = f(z0) ∫_{∂Δ^(n)} dz = 0   and   ∫_{∂Δ^(n)} f′(z0)(z − z0) dz = 0.
Hence
    | ∫_{∂Δ^(n)} f(z) dz | = | ∫_{∂Δ^(n)} [ f(z) − f(z0) − f′(z0)(z − z0) ] dz | ≤ ε (L/2^n)(L/2^n) = ε L²/4^n,
and therefore
    |J| ≤ 4^n | ∫_{∂Δ^(n)} f(z) dz | ≤ ε L².
This holds for all ε > 0, and consequently we have J = 0 in the case where p ∉ Δ.
p is a vertex, for instance p = a. Let x ∈ [a, b] and y ∈ [a, c] be two arbitrarily chosen
points. According to the previous case, we have
    ∫_{∂[x, y, b]} f(z) dz = 0   and   ∫_{∂[y, b, c]} f(z) dz = 0,
and hence
    ∫_{∂Δ} f(z) dz = ∫_{[a,x]} f(z) dz + ∫_{[x,y]} f(z) dz + ∫_{[y,a]} f(z) dz,
which tends to 0 as x → a and y → a, since f is bounded and the length of the
path of integration tends to 0.
p is inside the triangle, but is not a vertex. It suffices to use the previous result applied to
the three triangles [a, b, p], [b, p, c], and [a, p, c].
(x)
dx +
x
+
+
(x)
dx
x
0+ ].
(d.2)
<|x|<M
x (0) + (x)
dx
x
<|x|<M
Z
=
(0) + (x) dx,
(x)
dx =
x
<|x|<M
This is valid for any ]0, 1]. Letting then [ 0+ ] for fixed n, we find that
Z
n (x)
dx 4
n N .
lim+
0 |x|> x
This now proves that the map (d.2) is continuous.
THEOREM 8.18 on page 232 Any Dirac sequence converges weakly to the Dirac
distribution δ in D′(R).
Let (fn)n∈N be a Dirac sequence and let A > 0 be such that fn is non-negative on [−A, A]
for all n.
Let φ ∈ D be given. We need to show that ⟨fn, φ⟩ → φ(0). First, let ψ = φ − φ(0), so
that ψ is a function of class C∞ with ψ(0) = 0. Let K be the support of φ.
Let ε > 0. Since ψ is continuous, there exists a real number δ > 0 such that
    |ψ(x)| ≤ ε
for all real numbers x with |x| < δ. We can of course assume that δ < A. Because of the third
condition in the definition 8.15 on page 231 of a Dirac sequence, there exists an integer N
such that
    | ∫_{|x|>δ} fn(x) ψ(x) dx | ≤ ε
for all n ≥ N. Because fn is non-negative on [−δ, δ], we can also write
    | ∫_{|x|≤δ} fn(x) ψ(x) dx | ≤ ε ∫_{|x|≤δ} fn(x) dx.
Again from the definition, the last integral tends to 1, so that if n is large enough we have
    | ∫_{|x|≤δ} fn(x) ψ(x) dx | ≤ 2ε,   hence   | ∫ fn(x) ψ(x) dx | ≤ 3ε.
Now, there only remains to notice that ∫_K fn(x) dx → 1 (since K is a bounded set), and
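The statement can be tested numerically on a concrete Dirac sequence; in this sketch of mine
the gaussian fn and the test function φ are my own choices.

import numpy as np

x = np.linspace(-10, 10, 400_001)
dx = x[1] - x[0]
phi = np.cos(x) * np.exp(-x**2 / 8)                  # a smooth, rapidly decaying test function

for n in (1, 5, 25, 125):
    f_n = n / np.sqrt(np.pi) * np.exp(-(n * x) ** 2)  # gaussian Dirac sequence, integral 1
    print(n, (f_n * phi).sum() * dx)                  # tends to phi(0) = 1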
THEOREM 9.28 on page 261 The space ℓ², with the hermitian product
    (a|b) = Σ_{n=0}^∞ ā_n b_n,
is a Hilbert space.
Let (u^(p))p∈N be a Cauchy sequence in ℓ². For each fixed n, the sequence (u_n^(p))p is a Cauchy
sequence of complex numbers, so we may define
    u_n = lim_{p→∞} u_n^(p).
Obviously, the sequence (un)n∈N of complex numbers should be the limit of (u^(p))p.
The sequence u is in ℓ². The Cauchy sequence (u^(p))p∈N, like any other Cauchy sequence, is bounded in ℓ²: there exists A > 0 such that
    ‖u^(p)‖₂² = Σ_{n=0}^∞ |u_n^(p)|² ≤ A²   for all p.
In particular, for any fixed N ∈ N,
    Σ_{n=0}^N |u_n^(p)|² ≤ A²,
and letting p → ∞,
    Σ_{n=0}^N |u_n|² = Σ_{n=0}^N | lim_p u_n^(p) |² ≤ A²
(since the sum over n is finite, exchanging limit and sum is obviously permitted). Now this
holds for all N ∈ N, and it follows that u ∈ ℓ² and in fact ‖u‖₂ ≤ A.
The sequence (u^(p)) tends to u in ℓ², that is, we have ‖u^(p) − u‖₂ → 0 as p → ∞.
Let ε > 0. There exists an integer N ∈ N such that ‖u^(p) − u^(q)‖₂ ≤ ε if p, q ≥ N. Let
p ≥ N. Then for all q ≥ p and all k ∈ N, we have
    Σ_{n=0}^k |u_n^(p) − u_n^(q)|² ≤ ‖u^(p) − u^(q)‖₂² ≤ ε².
Letting q → ∞ and then k → ∞, we obtain
    Σ_{n=0}^∞ |u_n^(p) − u_n|² ≤ ε²,
so that ‖u^(p) − u‖₂ ≤ ε for all p ≥ N, which concludes the proof.
We establish the following two inverse Fourier transform formulas:
    F^{-1}[ sin(kct)/(kc) ](r) = (1/4πc r) [ δ(r − ct) − δ(r + ct) ],
    F^{-1}[ cos(kct) ](r) = −(1/4πr) δ′(r − ct)   if t > 0,   and   δ(r)   if t = 0.
We prove the second formula. It is obviously correct for t = 0, and if t > 0, we compute
    F^{-1}[cos kct](r) = 1/(2π)³ ∭ cos(kct) e^{i k·r} d³k
      = 1/(2π)³ ∫_0^{2π} dφ ∫_0^π dθ ∫_0^{+∞} dk cos(kct) e^{ikr cos θ} k² sin θ
      = 1/(2π)² ∫_0^{+∞} cos(kct) (e^{ikr} − e^{−ikr})/(ikr) k² dk
      = 1/(8π²r) ∫_0^{+∞} [ e^{ik(ct+r)} − e^{−ik(ct+r)} − e^{ik(ct−r)} + e^{−ik(ct−r)} ] (−ik) dk
      = 1/(8π²r) ∫_{−∞}^{+∞} [ e^{ik(ct+r)} − e^{ik(ct−r)} ] (−ik) dk   (by parity)
      = 1/(4πr) [ δ′(ct − r) − δ′(ct + r) ] = −(1/4πr) [ δ′(r − ct) + δ′(r + ct) ],
by the properties of Fourier transforms of derivatives. Since t > 0, the variable r is positive
and the distribution δ′(r + ct) is identically zero, leading to the result stated.
The first formula is obtained in the same manner.
such that, for any vector space G and any bilinear map f from E × F to G, there
exists a unique linear map f̃ from E ⊗ F into G such that f = f̃ ∘ φ, which is
summarized by the following commutative diagram:
    (diagram: φ : E × F → E ⊗ F,   f : E × F → G,   f̃ : E ⊗ F → G,   with f = f̃ ∘ φ)
where [(x, y)] is just some symbol associated with the pair (x, y) and λ(x, y) is in K, with the
additional condition that λ(x, y) = 0 except for finitely many pairs (x, y).
The set M has an obvious structure of a vector space (infinite-dimensional), with basis the
symbols [(x, y)] for (x, y) ∈ E × F: indeed we add up elements of M term by term, and
multiply by scalars in the same way:
    Σ_{(x,y)∈E×F} λ(x,y) [(x, y)] + Σ_{(x,y)∈E×F} μ(x,y) [(x, y)] = Σ_{(x,y)∈E×F} ( λ(x,y) + μ(x,y) ) [(x, y)],
    α Σ_{(x,y)∈E×F} λ(x,y) [(x, y)] = Σ_{(x,y)∈E×F} α λ(x,y) [(x, y)].
Now, in order to transform bilinear maps on E × F into linear maps, we will use M and
identify in M certain elements (for instance, we want the basis elements [(x, 2y)] and [(2x, y)]
to be equal, and to be equal to 2[(x, y)]).
For this purpose, let N be the subspace of M generated by all elements of the following
types, which are obviously in M:
    [(x + x′, y)] − [(x, y)] − [(x′, y)];
    [(x, y + y′)] − [(x, y)] − [(x, y′)];
    [(a x, y)] − a [(x, y)];
    [(x, a y)] − a [(x, y)];
where x, x′ ∈ E, y, y′ ∈ F and a ∈ K are arbitrary.
Consider now T = M/N, the quotient space, and let π be the canonical projection map
M → M/N, that is, the map which associates to z ∈ M the equivalence class of z, namely, the
set of elements of the form z + n with n ∈ N. This map is surjective and linear.
Now, E × F can be identified with the canonical basis of M, and we can therefore
construct a map
    φ : E × F → T = M/N
by sending (x, y) to the class of the basis vector [(x, y)].
We now claim that this map φ and T, viewed as E ⊗ F, satisfy the properties stated for the tensor
product.
Indeed, let us first show that φ is bilinear. Let x, x′ ∈ E and y ∈ F. Then φ(x + x′, y) is the
class of [(x + x′, y)]. But notice that the element [(x + x′, y)] − [(x, y)] − [(x′, y)] is in N by
definition, which means that the class of [(x + x′, y)] is the class of [(x, y)] + [(x′, y)]. This
means that φ(x + x′, y) = φ(x, y) + φ(x′, y). Exactly in the same manner, the other elements
we have put in N ensure that all properties required for the bilinearity of φ are valid.
We now write E ⊗ F = T, and we simply write φ(x, y) = x ⊗ y.
Consider a K-vector space G. If we have a linear map T = E ⊗ F → G, then the composite
    E × F → E ⊗ F → G
is a bilinear map E × F → G. Conversely, let B : E × F → G be a bilinear map. We construct
a map E ⊗ F → G as follows. First, let B̄ : M → G be the map defined by
    Σ_{(x,y)∈E×F} λ(x,y) [(x, y)]  ↦  Σ_{(x,y)∈E×F} λ(x,y) B(x, y).
This map is linear and vanishes on the generators of N; for instance,
    B̄( [(x + x′, y)] − [(x, y)] − [(x′, y)] ) = B(x + x′, y) − B(x, y) − B(x′, y) = 0.
This means that we can unambiguously use B̄ to induce a map T → G by x ⊗ y ↦ B(x, y),
because if we change the representative of x ⊗ y in M, the value of B̄ does not change.
Thus we have constructed a linear map E ⊗ F → G.
It is now easy (and an excellent exercise) to check that the applications just described,
    L(E ⊗ F, G) → Bil(E × F, G)   and   Bil(E × F, G) → L(E ⊗ F, G),
are reciprocal to each other.
Suppose now that (H, φ) and (H′, φ′) both have the universal property. By the universal property
of H, there exists a unique linear map α : H → H′ such that φ′ = α ∘ φ. Similarly, exchanging the roles of H and H′, there exists a unique
β : H′ → H such that φ = β ∘ φ′. Now notice that we have both IdH ∘ φ = φ (a triviality)
and (β ∘ α) ∘ φ = φ. The universal property of H again implies that β ∘ α = IdH. Similarly,
we find α ∘ β = IdH′, and hence α is an isomorphism from H to H′.
This shows that the tensor product is unique up to isomorphism, and in fact up to unique
isomorphism.
Following Pascal,¹ consider the random variables 1_{Ai} which are characteristic functions of
the events Ai, that is,
    1_{Ai} : Ω → R,   ω ↦ 1 if ω ∈ Ai,   ω ↦ 0 if ω ∉ Ai.
We have 1_{Ai^c} = 1 − 1_{Ai}, and
    P(A1 ∪ ··· ∪ An) = E[ 1 − ∏_{i=1}^n (1 − 1_{Ai}) ].
Now expand the product on the right-hand side, noting that 1_{Ai1} 1_{Ai2} ··· 1_{Aik} = 1_{Ai1 ∩ ··· ∩ Aik}; we get²
    ∏_{i=1}^n (1 − 1_{Ai}) = 1 + Σ_{k=1}^n (−1)^k Σ_{i1<···<ik} 1_{Ai1} 1_{Ai2} ··· 1_{Aik}.
It only remains to compute the expectation of both sides to derive the stated formula.
¹ Blaise Pascal (1623–1662) studied mathematics when he was very young, and investigated
the laws of hydrostatics and hydrodynamics (having his brother-in-law climb to the top of
the Puy de Dôme with a barometer in order to prove that atmospheric pressure decreases with
altitude). He found many results in geometry, arithmetic, and infinitesimal calculus, and was
one of the very first to study probability theory. In addition, he invented and built the first
mechanical calculator. In 1654, Pascal abandoned mathematics and science for religion. In his
philosophy there remain, however, some traces of his scientific mind.
² This may be proved by induction, for instance, in the same manner that one shows that
    ∏_{i=1}^n (1 − xi) = 1 + Σ_{k=1}^n (−1)^k Σ_{i1<···<ik} x_{i1} x_{i2} ··· x_{ik}.
Tables
Fourier Transforms
The convention used here is the following:
Z +
Z
def
2i
x
e
f () =
f (x) e
dx
and
f (x) =
f (x)
f (a x)
f (n) (x)
(2ix)n f (x)
(x)
x
b a [a,b]
e |x|
a2
2a
+ 42 2
e x
1
x
xk
fe() e 2i x d.
1 e
f
|a|
a
(2i)n fe()
fe(n) ()
sin
= sinc
sin |a|
sin (b a) i(a+b )
e
2
2
+ 42 2
e a||
r
2 2 /
e
2i
1
(k)
(2i)k
(b > a)
( > 0)
(a > 0)
( > 0)
f (x)
H (x)
H (x)
sgn(x)
1
pv
x
sin(20 x)
cos(20 x)
(x)
fe()
1
1
+
pv
2 2i
1
1
pv
2 2i
1
1
pv
i
i sgn
1
( 0 ) ( + 0 )
2i
1
( 0 ) + ( + 0 )
2
1
(x x0 )
e 2i x0
X(x)
X()
1
X(/a)
|a|
X(a x)
F ( k)
f ( r)
k 2 F ( k)
1/r
4/k 2
e r /r
1
2
r + a2
4/(2 + k 2 )
( > 0)
e ak /22 k
(a > 0)
1/r 2
1/22 k
1
( p n)( q n)
r
e r
( r)
1
4
k2
( p k)( q k)
p q 2
k2
3/2
2
e k /4
(2)3
p, q R3
( > 0)
f (x) e 2i x dx
fe( p) =
f ( r ) e i k r d3 r
f ( x) e i p x/ h}
x
(2 h})3/2
d3
dx
f (x) e i p x/ h} p
2 h}
ZZZ
ZZZ
fe(p) =
F ( k) =
fe() e 2i x d
Z
1
f (x) = p
fe() e i x d
2
Z
1
f (x) =
fe() e i x d
2
ZZZ
1
f ( r) =
F ( k) e i kr d3 k
(2)3
Z
dp
f (x) = fe(p) e i p x/ h} p
2 h}
ZZZ
d3 p
f ( x) =
fe( p) e i p x/ h}
(2 h})3/2
f (x) =
Inverse transform
(2 h})3/2
p
2 h}
2 h}
(2 h})3/2
2
(2)3
1
p
2
F []
F [1]
f g=
f g=
fe e
g
fe e
g
fe e
g
f g=
f g=
fe e
g
Z
1
fe e
g
(2)3
Z
Z
f g = fe e
g
1
f g=
2
Parseval-Plancherel
Various definitions used for the Fourier transform. The formulas for conventions different from the ones used in this book can be recovered by elementary changes
of variables. The conventions named QM are used in quantum mechanics. Note that the simplest definition in terms of memorizing the formulas is the
first one. The definition for quantum mechanics is the one in the book by Cohen-Tannoudji et al. [20].
(3D)
QM
(1D)
QM
3D
Z
1
fe() = p
f (x) e i x dx
2
Z
e
f () = f (x) e i x dx
fe() =
Fourier transform
Laplace transforms
In some problems (in particular, to solve differential equations while taking initial conditions at t = 0 into account), it is useful to go through the Laplace transform, which turns
differential operations into algebraic operations that can easily be inverted. The table
of Laplace transforms is therefore usually read from right to left: one looks for an original
t ↦ f(t) for the function p ↦ F(p) that has been calculated.
    f(t)                         F(p) = ∫_0^∞ e^{−pt} f(t) dt
    f(t) = (1/2πi) ∫_{c−i∞}^{c+i∞} e^{tz} F(z) dz        (inversion)
    f′(t)                        p F(p) − f(0+)
    ∫_0^t f(s) ds                F(p)/p
    (−1)^n t^n f(t)              F^(n)(p)                 (n ∈ N)
    f(t)/t                       ∫_p^∞ F(z) dz            (★)
    e^{at} f(t)                  F(p − a)                 (a ∈ C)
(★) The integral ∫_p^∞ F(z) dz is taken along any path joining p to infinity on the right-hand side
(Re(z) → +∞) which is entirely contained in the half-plane of integrability where F is
defined. Because F is analytic, this integral is independent of the chosen path.
f (t)
F (p)
(t)
(n) (t)
pn
X(t)
1
1 ep
1/p
t
p
1/p 2
p
1/ p
e at
1/(p a)
(a C)
t n e at /n!
(p a)(n+1)
(n N,
1/ t
cos t
p2
p
+ 2
sin t
p 2 + 2
cosh t
p
p 2 2
sinh t
p 2 2
1
(1 cos t)
2
1
p(p 2 + 2 )
1
(t sin t)
3
1
p 2 (p 2 + 2 )
1
(sin t t cos t)
23
1
(p 2 + 2 )2
t
cos t
2
1
(sin t + t cos t)
2
(p 2
p
+ 2 )2
p2
(p 2 + 2 )2
(n N)
a C)
( R or C)
( C)
f (t)
F (p)
t cos t
p 2 2
(p 2 + 2 )2
J0 (t)
J1 (t)
J1 (t)
t
log t
p
1
t k 2
Jk 1 (t)
2
(k) 2
1
p2
1 p
p
+ 2
p2
+ 2
p 2 + 2 p
log p
1
(p 2 + a 2 )k
( 0, 577...)
(k > 0)
[a, b]
R
R+
R
R+
Uniform law
Normal law
Exponential law
Cauchy law
2 law
a
1
2
a + (x m)2
x n 1
1
2
e x/2
2(n/2) 2
1 e x/
not defined
2n
not def.
(1 2i)n/2
e im|a|
1
1 i
2 2 /2
2
m
e im e
e ib e ia
i(b a)
(b a)2
12
a+b
2
b a [a,b ]
1
(x m)2
p exp
2 2
2
Char. function ()
Continuous laws
n p(1 p)
Variance 2
Poisson law
np
n2 1
12
p(1 p)
n+1
2
p
1
P(X = k) =
n
P(X = 1) = p
P(X =0)= 1 p
n k
P(X = k) =
p (1 p)nk
k
n
P(X = n) = e
n!
Expectation E (X )
[[0, n]]
Variance
Expectation
Law
Density f (x)
{0, 1}
Set of values
[[1, n]]
Uniform law
Name
Set of values
Name
Discrete laws
Further reading
General books
Methods of theoretical physics
P. M. Morse and H. Feshbach (McGraw-Hill, 1953)
A great classic, very complete.
Integration
Lebesgue measure and integration, an introduction
F. Burk (Wiley Interscience, 1998)
A very clear book, with many examples guiding the reader step by step. Also a
very good historical introduction.
Complex analysis
Function theory of one complex variable
R. E. Greene and S. G. Krantz (Wiley Interscience, 1997)
Very pedagogical.
Bibliography
618
Tensors, geometry
Riemannian geometry
S. Gallot, D. Hulin and J. Lafontaine (Springer-Verlag, 1990)
Leçons sur la théorie des spineurs
É. Cartan (Hermann, 1966)
An English version is published by Dover with the title The Theory of Spinors, 1981.
Groups
Symmetries in quantum mechanics: from angular momentum to supersymmetry
M. Chaichian and R. Hagedorn (Institute of Physics, 1998)
A very interesting book, with increasing difficulty, giving a vast overview of symmetries, and explaining clearly the relation between symmetries and dynamics.
Bibliography
619
Lie groups and algebra, with applications to physics, geometry and mechanics
D. H. Sattinger and O. L. Weaver (Springer-Verlag, 1986, Applied Mathematical Sciences 61)
Probabilities, statistics
An introduction to probability theory
W. Feller (Wiley, 1968 (vol. I) and 1971 (vol. II))
The first volume, with little formalism, is a remarkable book directed toward
concrete applications, analyzed in great detail and depth.
Stochastik
A. Engel (Ernst Klett Verlag, 2000)
This book has a fascinating approach to probability theory, full of applications
and with little formalism. Moreover, this is one of the rare books dealing with
probabilities and statistics simultaneously. (A French translation of the original
German edition is available: Les certitudes du hasard, ALÉAS éditeur, Lyon, 1990.)
References
[1] Edwin A. Abbott. Flatland. Dover, 1992.
[2] Milton Abramowitz and Irene A. Stegun, editors. Handbook of mathematical functions. Dover, 1972.
[3] Sir George Airy. On the intensity of light in the neighbourhood of a
caustic. Camb. Phil. Trans., 6, 379402, 1838.
[4] Naum Ilitch Akhiezer and Isral Markovitch Glazman. Theory of linear
operators in Hilbert space. Dover, 1993. (two volumes bound as one).
[5] André Angot. Compléments de mathématiques à l'usage des ingénieurs de
l'électrotechnique et des télécommunications. Masson, 1982.
[6] Walter Appel and Angel Alastuey. Thermal screening of Darwin interactions in a weakly relativistic plasma. Phys. Rev. E, 59(4), 45424551,
1999.
[7] Walter Appel and Michael K.-H. Kiessling. Mass and spin renormalization in Lorentz electrodynamics. Annals of Physics, 289(1), 2483, April
2001. xxx.lanl.gov/math-ph/00090003.
[8] George B. Arfken and Hans J. Weber. Mathematical methods for physicists.
Academic Press, fourth edition, 1995.
[9] Vladimir I. Arnold. Mathematical methods of classical mechanics. SpringerVerlag, 1989.
[10] Gaston Bachelard. Le nouvel esprit scientifique. Quadrige / P.U.F., 1987.
[11] Gernot Bauer and Detlef Dürr. The Maxwell-Lorentz system of a rigid
charge distribution. Preprint Universität München, 1999.
[12] Thomas Bayes. An essay towards solving a problem in the doctrine of
chances. Phil. Trans., 53, 370418, 1764.
[13] Émile Borel. Leçons sur les séries divergentes. Gauthier-Villars, Paris, second
edition, 1928.
[14] Max Born and Emil Wolf. Principles of optics. Cambridge University
Press, seventh edition, 1999.
[30] Boris Dubrovin, Serge Novikov, and Anatoli Fomenko. Modern geometry Methods and applications, volume 1. Springer, 1982. Old-fashioned
Russian notation with indices everywhere, but the text is very clear; the
first two volumes are especially recommended for physicists.
[31] Phil P. G. Dyke. An introduction to Laplace transforms and Fourier series.
Springer, 2001.
[32] Freeman J. Dyson. Divergences of perturbation theory in quantum
electrodynamics. Phys. Rev., 85, 631632, 1952.
[33] William Feller. An introduction to probability theory. Wiley, 1968, 1971.
2 volumes.
[34] Richard P. Feynman and A. R. Hibbs. Quantum mechanics and integrals.
McGraw-Hill, 1965.
[35] Richard P. Feynman, Robert B. Leighton, and Matthew Sands. The
Feynman lectures on physics, volume 2. Addison Wesley, 1977.
[36] Richard P. Feynman, Robert B. Leighton, and Matthew Sands. The
Feynman lectures on physics, volume 3. Addison Wesley, 1977.
[37] Claude Gasquet and Patrick Witomski. Fourier analysis and applications:
filtering, numerical computation, wavelets. Springer, 1998.
[38] Isral M. Gelfand and G. E. Shilov. Generalized functions. Academic
Press, 1964.
[39] Josiah W. Gibbs. Letters to the editor. Nature, 200201, December 1898.
[40] Josiah W. Gibbs. Letters to the editor. Nature, 606, April 1899.
[41] François Gieres. Mathematical surprises and Dirac's formalism in quantum mechanics. Rep. Prog. Phys, 63, 1893–1931, 2000.
[42] I. S. Gradstein and I. M. Ryshik. Summen-, Produkt- und Integral-Tafeln.
Deutscher Verlag der Wissenschaften, 1963.
[43] Robert E. Greene and Steven G. Krantz. Function theory of one complex
variable. Wiley Interscience, 1997. Very pedagogical.
[44] Étienne Guyon, Jean-Pierre Hulin, and Luc Petit. Physical hydrodynamics. Oxford University Press, 2001.
[45] Paul Halmos. Naive set theory. Springer-Verlag, 1998. A very clear introduction.
[46] Godfrey H. Hardy. A mathematicians apology. Cambridge University
Press, 1992. Mathematics as seen by a profound and very appealing
mathematician. With an account of Ramanujan.
[58] Lev Landau and Evgueni Lifchitz. Quantum mechanics. ButterworthHeinemann, 1996.
[59] Serge Lang. Complex analysis. Springer-Verlag, 1993.
[60] Michel Le Bellac. Quantum and statistical field theory. Oxford University
Press, 1992.
[61] Jean-Claude Le Guillou and Jean Zinn-Justin, editors. Large-order behaviour of perturbation theory. Current physics sources and comments 7.
North-Holland, 1990.
Portraits
Airy Sir George . . . 42
Alembert Jean le Rond d' . . . 415
Bayes Thomas . . . 518
Bienaymé Jules . . . 547
Bolzano Bernhard . . . 581
Borel Émile . . . 58
Cauchy Augustin . . . 88
Christoffel Elwin . . . 163
Dirac Paul . . . 186
Dirichlet Gustav . . . 171
Euler Leonhard . . . 39
Fourier Joseph . . . 268
Fubini Guido . . . 80
Gauss Carl . . . 557
Gibbs Josiah . . . 311
Heaviside Oliver . . . 193
Hilbert David . . . 257
Jordan Camille . . . 119
Kolmogorov Andrei Nikolaievitch . . . 515
Kronecker Leopold . . . 445
Landau Edmund . . . 580
Laplace, marquis de . . . 333
Lebesgue Henri . . . 61
Lévy Paul . . . 558
Liouville Joseph . . . 104
Minkowski Hermann . . . 448
Poincaré Henri . . . 475
Poisson Denis . . . 172
Riemann Bernhard . . . 138
Schmidt Erhard . . . 258
Schwartz Laurent . . . 222
Schwarz Hermann . . . 161
Stokes George . . . 472
Taylor Brook . . . 31
Tchebychev Pafnouti . . . 548
Weierstrass Karl . . . 582
Sidebars
. . . . . . . . . . . . . . . 60, 70, 134, 276, 294, 488, 506, 552
Index
The boldface page numbers refer to the
main definitions, the italic ones to exercises. The family names in Small Capitals
refer to short biographies.
Symbols
(Hodge operator) . . . . . . . . . . . . . . . . . 482
(adjoint) . . . . . . . . . . . . . . . . . . . . . . . . . 378
(dual) . . . . . . . . . . . . . . . . . . . . . . . . 436, 463
(adjoint) . . . . . . . . . . . . . . . . . . . . . 389, 497
(convolution product) . . . . . . . . . . . . . 214
c.d. . . . . . . . . . . . . . . . . . . . . . . . . . . . 554
c.p. . . . . . . . . . . . . . . . . . . . . . . . . . . . 554
c.a.s. . . . . . . . . . . . . . . . . . . . . . . . . . . 554
cv.s. . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
cv.u. . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
(rectangle function) . . . . . . . . . . 213, 279
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333
X . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414
A (closure) . . . . . . . . . . . . . . . . . . . . . . . . 391
| . . . . . . . . . . . . . . . . . . . . . . . . . . . 260, 380
| . . . . . . . . . . . . . . . . . . . . . . . . . . . 260, 379
x| . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382
|x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 384
p| . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
|p . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
σ-algebra . . . . . . . . . . . . . . . . . . . . . . . . . . 57
generated . . . . . . . . . . . . . . . . . . 58, 513
by an r.v. . . . . . . . . . . . . . . . . . . . 537
independent s . . . . . . . . . . . . 519, 537
(un vn ) . . . . . . . . . . . . . . . . . . . . . 580
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210, 439
(exterior product) . . . . . . . . 464, 467, 468
Laurent Pierre . . . . . . . . . . . . . . . . . . . . . 112
A
a.e. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
Abelian group . . . . . . . . . . . . . . . . . . . . . 489
Absolute convergence . . . . . . . . . . . . . . . . 29
Accumulation point . . . . . . . . . . . . . . . . 106
Adjoint . . . . . . . . . . . . . . . . . . . . . . . 378, 389
Airy George (sir) . . . . . . . . . . . . . . . . . . . . 42
Airy integral . . . . . . . . . . . . . . . . . . . . . . . . 42
d'Alembert Jean le Rond . . . . . . . . . . 415
d'Alembertian . . . . . . . . . . . . . . . . . . 414
Algebra
σ-algebra . . . . . . . . . . . . . . . . . . . . . 57
convolution . . . . . . . . . . . . . . . . 236
exterior . . . . . . . . . . . . 469, 463–471
Lie . . . . . . . . . . . . . . . . . . . . . . . . . 496
Almost
complete system . . . . . . . . . . . . . . . 518
everywhere . . . . . . . . . . . . . . . . . . . . . 61
surely . . . . . . . . . . . . . . . . . . . . . . . . 515
Analytic
continuation . . . . . . . . . . . . . . 144, 366
function . . . . . . . . . . . . . . . . . . . 35, 101
signal . . . . . . . . . . . . . . . . . . . . 365, 366
Antihermitian function . . . . . . . . . . . . . 285
Antiholomorphic function . . . . . . . 92, 157
Argument . . . . . . . . . . . . . . . . . . . . . . . . . . 93
Asymptotic expansion . . . . . . . . . . . . . . . . 37
Autocorrelation . . . . . . . . . . . . . . . . 368, 370
Average power of a function . . . . . . . . . 370
Axiom of choice . . . . . . . . . . . . . . . . . . . . 70
B
B(R) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
B(n, p) . . . . . . . . . . . . . . . . . . . . . . . . . . . 524
Ball (open/closed ) . . . . . . . . . . . . . . . . 578
Banach fixed point theorem . . . . . . . . . . 15
Basis
algebraic . . . . . . . . . . . . . . . . . . . 250
dual . . . . . . . . . . . . . . . . . . . 436, 463
generalized . . . . . . . . . . . . . . . . . 385
Hilbert . . . . . . . . . . . . . . . . . . . . . 258
Bayes Thomas . . . . . . . . . . . . . . . . . . . . . 518
Bayes formula . . . . . . . . . . . . . . . . . . 517, 518
Bernoulli Jacques . . . . . . . . . . . . . . . . . 555
Bernoulli distribution . . . . . . . . . . . . . . . 524
Bertrand function . . . . . . . . . . . . . . . . . . . 74
Bessel
function . . . . . . . . . . . . . . . . . . 320, 324
inequality . . . . . . . . . . . . . . . . . . . . 256
Bienaymé Jules . . . . . . . . . . . . . . . . . . . . . 547
Bienaymé identity . . . . . . . . . . . . . . . . . . 545
Bienaymé-Tchebychev inequality . . . . . . 547
Bilateral Laplace transform . . . . . . . . . . 332
Binomial
distribution . . . . . . . . . . . . . . . . . . . 524
law . . . . . . . . . . . . . . . . . . . . . . . . . . . 616
Bolzano Bernhard . . . . . . . . . . . . . . . . . 581
Borel Émile . . . . . . . . . . . . . . . . . . . . . . . . 58
Borel set . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Bounded (continuous) operator . . . . . . . 390
Bra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380
| . . . . . . . . . . . . . . . . . . . . . . 260, 380
x| . . . . . . . . . . . . . . . . . . . . . . . . . . . 382
p| . . . . . . . . . . . . . . . . . . . . . . . . . . 383
generalized . . . . . . . . . . . . . . . . . 383
Branch point . . . . . . . . . . . . . . . 111, 136, 139
Bromwich contour . . . . . . . . . . . . . . . . . 337
Brownian motion . . . . . . . . . . . . . . 170, 425
Buffon George Leclerc, comte de . . . . 549
C
cv.s. . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
cv.u. . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
C ([a, b ]) . . . . . . . . . . . . . . . . . . . . . . . . . . 577
Casorati-Weierstrass theorem . . . . . . . . . 109
Cauchy
criterion . . . . . . . . . . . . . . . . . . . . 13, 23
distribution . . . . . . . . . . . . . . . 529, 562
formula . . . . . . . . . . . . . . . . 45, 99, 101
law . . . . . . . . . . . . . . . . . . . . . . . . . . . 616
-Lipschitz theorem . . . . . . . . . . . . 46
principal value pv(1/x) . . . 188, 189, 224
Fourier transform of the . . . 304
problem . . . . . . . . . . . . . . . . . . 238, 346
product . . . . . . . . . . . . . . . . . . . . . . 215
-Riemann equations . . . . 88, 90, 92
-Schwarz inequality . . . . . . . . . . 252
sequence . . . . . . . . . . . . . . . . . . . . . . . 13
theorem . . . . . . . . . . 93, 97, 98, 98, 598
Cauchy Augustin-Louis . . . . . . . . . . . . . . 88
Causal
function . . . . . . . . . . . . . . . . . . 331, 366
system . . . . . . . . . . . . . . . . . . . . . . . . 219
Causality . . . . . . . . . . . . . . . . . . . . . . . . . . 550
Cavendish experiment . . . . . . . . . . . . . . 484
Centered random variable . . . . . . . . . . . 532
Central limit theorem . . . . . . . . . . . . . . . 558
Characteristic function A . . . . . . . . . . . 63
Characteristic function X . . . . . . . . . . 541
Charge (density of ) . . . . . . . . . . . . . . . 199
Choice (Axiom of ) . . . . . . . . . . . . . . . . 70
Christoffel Elwin . . . . . . . . . . . . . . . . . 163
Circular lens . . . . . . . . . . . . . . . . . . . . . . . 320
Class S . . . . . . . . . . . . . . . . . . . . . . . . . . . 397
Class (complete of events) . . . . . . . . . 513
Classical limit . . . . . . . . . . . . . . . . . . . . . . . . 7
Clever trick . . . . . . . . . . . . . . . . . . . . . . . . 142
Closable operator . . . . . . . . . . . . . . . . . . . 391
Closed . . . . . . . . . . . . . . . . . . . . . . . . . . . . 578
differential form . . . . . . . . . . . . . . 474
operator . . . . . . . . . . . . . . . . . . . . . . 390
Closure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 579
of an operator . . . . . . . . . . . . . . . . 391
of a subset . . . . . . . . . . . . . . . . . . . . 579
Coefficients (Fourier ) . . . . . . . . . 258, 264
Coherence
function . . . . . . . . . . . . . . . . . . . . . . 369
spatial . . . . . . . . . . . . . . . . . . . . . . . 43
temporal . . . . . . . . . . . . . . . . . . . 371
Comb . . . . . . . . . . . . . . . . . . see Dirac, comb
Commutative group . . . . . . . . . . . . . . . . 489
Commutatively convergent . . . . . . . . . . . 26
Compact . . . . . . . . . . . . . . . . . . . . . . . . . . 576
Complement . . . . . . . . . . . . . . . . . . . . . . . 512
Complete
measure space . . . . . . . . . . . . . . . . 61
class of events . . . . . . . . . . . . . . . . . 513
normed vector space . . . . . . . . . . . . 14
Complex logarithm . . . . . . . . . . . . . 135, 136
Complex velocity (of a fluid) . . . . . . . . 167
Conditional probability . . . . . . . . . . . . . 517
Conditionally convergent series . . . . . . . 26
Conformal map . . . . . . . . . . . . . . . . . . . . 156
Conformally equivalent . . . . . . . . . . . . . 157
Connected
doubly . . . . . . . . . . . . . . . . . . . . . 501
simply . . . . . . . . . . . . . . . . . . 114, 575
space . . . . . . . . . . . . . . . . . . . . . . . . . 574
Constants (structure ) . . . . . . . . . . . . . 496
Continuation (analytic ) . . . . . . . 144, 366
Continuity
of convolution . . . . . . . . . . . . . . . . 235
of differentiation . . . . . . . . . . . . . . 230
of a linear functional . . . . . . . . . . 184
of a map . . . . . . . . . . . . . . . . . . . . 574
under the ∫ sign . . . . . . . . . . . . . . . 77
Contour
Bromwich . . . . . . . . . . . . . . . . . . 337
integration . . . . . . . . . . . . . . . . . . . 94
Contractible open set . . . . . . . . . . . . . . . 474
Contraction . . . . . . . . . . . . . . . . . . . . . . . . . 15
Contraction of indices . . . . . . . . . . . . . . 454
Contravariant coordinates . . . . . . . . . . . 434
Convergence
absolute . . . . . . . . . . . . . . . . . . 23, 29
almost everywhere . . . . . . . . . . . 554
almost sure . . . . . . . . . . . . . . . . . 554
commutative . . . . . . . . . . . . . . . . . 26
conditional . . . . . . . . . . . . . . . . . . 26
in D . . . . . . . . . . . . . . . . . . . . . . . . . 183
in distribution . . . . . . . . . . . . . . . . 554
dominated theorem . . . . . . . . . . . 75
in measure . . . . . . . . . . . . . . . . . . . . 554
normal . . . . . . . . . . . . . . . . . . . . . . 23
pointwise
of a Fourier series . . . . . . . . . . . 268
of a sequence of functions . . . . 19
of a series of functions . . . . . . . 29
in probability . . . . . . . . . . . . . . . . . 554
radius of . . . . . . . . . . . . . . . . . . . . 34
in S . . . . . . . . . . . . . . . . . . . . . . . . 290
of a sequence . . . . . . . . . . . . . . . . . . 12
of a series . . . . . . . . . . . . . . . . . . . . . 23
simple . . . . . . . . . . . see pointwise
to . . . . . . . . . . . . . . . . . . . . . . . . 13
uniform
of a double sequence . . . . . . . . . 16
of a Fourier series . . . . . . . . . . . 269
of a sequence of functions . . . . 19
of a series of functions . . . . . . . 29
weak . . . . . . . . . . . . . . . . . . . . . . . 230
Convexity . . . . . . . . . . . . . . . . . . . . . . . . . 575
Convolution
algebra . . . . . . . . . . . . . . . . . . . . . . . 236
of causal functions . . . . . . . . . . . . 339
continuity of . . . . . . . . . . . . . . . 235
discrete . . . . . . . . . . . . . . . . . . . . . 220
of distributions . . . . . . . . . . . . . . . . 214
and Fourier transform . . . . . . . . . 292
of functions . . . . . . . . . . . . . . . 211, 270
inverse . . . . . . . . . . . . . . . . . . . . . . . 236
regularization by . . . . . . . . . . . . 235
Coordinate forms . . . . . . . . . . . . . . . . . . 436
Coordinates
change . . . . . . . . . . . . . . . . . . . . . . . 455
contravariant . . . . . . . . . . . 434, 451
covariant . . . . . . . . . . . . . . . 437, 449
curvilinear . . . . . . . . . . . . . . . . . . 455
Correlation . . . . . . . . . . . . . . . . . . . . . . . . 535
Correlation function . . . . . . . . . . . . . . . . 368
Coulomb potential . . . . . . . . . . . . . . . . . 166
Fourier transform of the . . . . . 306
Laplacian of the . . . . . . . . . . . . . 208
Covariance . . . . . . . . . . . . . . . . . . . . . . . . 535
Covariant coordinates . . . . . . . . . . . . . . . 437
Criterion
Cauchy . . . . . . . . . . . . . . . . . . . 13, 23
Rayleigh . . . . . . . . . . . . . . . . . . . . 321
Current (density of ) . . . . . . . . . . . . . . 199
Curve . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
Curvilinear coordinates . . . . . . . . . . . . . 455
Cut . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
Cyclic group . . . . . . . . . . . see Permutations
D
D . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
is dense in D . . . . . . . . . . . . . . . . 235
D+ , D . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
D . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
d (exterior derivative) . . . . . . . . . . . . . . . 470
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185, 186
dx . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91, 134
dx i ∧ dx j . . . . . . . . . . . . . . . . . . . . . 464
dz, dz̄ . . . . . . . . . . . . . . . . . . . . . . . . . 91
∂/∂z, ∂/∂z̄ . . . . . . . . . . . . . . . . . . . . . 92
d'Alembertian . . . . . . . . . . . . . . . . . . . 414
d'Alembert Jean le Rond . . . . . . . . . . . 415
Debye Petrus . . . . . . . . . . . . . . . . . . . . . . 148
Debye screen, potential . . . . . . . . . . . . . . 325
Deceit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 571
Degree of a representation . . . . . . . . . . . 492
Dense subset . . . . . . . . . . . . . . . . . . . 574, 579
Density
of charge and current . . . . . . . . . . 199
of D in D . . . . . . . . . . . . . . . . . . . 235
probability . . . . . . . . . . . . . . . . . 526
joint . . . . . . . . . . . . . . . . 534, 539
of S in L2 . . . . . . . . . . . . . . . . . . . 290
spectral . . . . . . . . . . . . . . . . . 368, 370
of a subset . . . . . . . . . . . . . . . . 574, 579
Derivation (continuity of ) . . . . . . . . . 230
Derivative
of a discontinuous function . . . . 201
of a distribution . . . . . . . . . . . . . . . 192
exterior . . . . . . . . . . . . . . . . . . . . 470
C1-diffeomorphism . . . . . . . . . . . . . . . 455
Differentiability under the ∫ sign . . . . . 78
Differential
form . . . . . . . . . . . . . . 469, 463–483
integral of a . . . . . . . . . . . . . . 471
of a function . . . . . . . . . 134, 586, 587
Diffraction . . . . . . . . . . . . . . . . . . . . . . . . . 314
Diffuse
measure . . . . . . . . . . . . . . . . . . . . . . 60
probability . . . . . . . . . . . . . . . . . . . 525
Dilation of a distribution . . . . . . . . . . . 190
Dini Ulisse . . . . . . . . . . . . . . . . . . . . . . . . . 22
Dini's theorems . . . . . . . . . . . . . . . . . . . . . 22
Dirac
comb X . . . . . . . . . . . . . . . . . 186, 307
Fourier transform of . . . . . . 307
curvilinear distribution . . . . . . 195
distribution . . . . . . . . . 185, 186, 194
Fourier transform of . . . . . . 303
Laplace transform of . . . . . . 342
distribution . . . . . . . . . . . . . . . . 196
measure . . . . . . . . . . . . . . . . . . 514, 552
sequence . . . . . . . . . . . . . . . . . . . . . . 231
surface distribution . . . . . . . . . . 195
Dirac Paul . . . . . . . . . . . . . . . . . . . . . . . . 186
Direct product . . . . . . . . . . . . . . . . . 209, 210
Dirichlet
function . . . . . . . . . . . . . . . . . . . 62, 279
problem . . . . . . . . . . . . . . . . . . . . . . 170
for a disc . . . . . . . . . . . . . . . . . . 172
on a half-plane . . . . . . . . . . . . . 175
for a strip . . . . . . . . . . . . . . . . . . 175
theorem . . . . . . . . . . . . . . . . . . . . . . 268
Dirichlet Gustav Lejeune . . . . . . . . . . . 171
Dispersion relation . . . . . . . . . . . . . . . . . 227
Distance . . . . . . . . . . . . . . . . . . . . . . . . . . . 578
Distribution . . . . . . . . . . . . . . . . . . . . . . . 183
Bernoulli . . . . . . . . . . . . . . . . . . . 524
binomial . . . . . . . . . . . . . . . . . . . 524
Cauchy . . . . . . . . . . . . . . . . 529, 562
convergence in . . . . . . . . . . . . . . 554
curvilinear Dirac . . . . . . . . . . . . 195
(Dirac derivative) . . . . . . . . . . . 196
derivative of a . . . . . . . . . . . . . . . 192
dilation of a . . . . . . . . . . . . . . . . 190
Dirac . . . . . . . . . . . . 185, 186, 194
Fourier transform of . . . . . . 303
Laplace transform of . . . . . . 342
function . . . . . . . . . . . . . . . . . . . . . . 524
joint . . . . . . . . . . . . . . . . . . . . 534
Gaussian . . . . . . . . . . . . . . . . . . . 557
Heaviside . . . . . . . . . . . . . . . . . . . 193
marginal . . . . . . . . . . . . . . . . . . . 534
normal . . . . . . . . . . . . . . . . . 515, 557
Poisson . . . . . . . . . . . . 530, 530, 547
probability . . . . . . . . . . . . . . . . . 522
regular . . . . . . . . . . . . . . . . . . . . . 184
regularization of a . . . . . . . . . . . 234
sgn (sign) . . . . . . . . . . . . . . . . . 203
singular . . . . . . . . . . . . . . . . . . . . 185
support of a . . . . . . . . . . . . . . . . 187
surface Dirac . . . . . . . . . . . . . . . . 195
tempered . . . . . . . . . . . . . . . . . . . 300
translate of a . . . . . . . . . . . . . . . . 189
transpose of a . . . . . . . . . . . . . . . 190
Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Dominated
convergence theorem . . . . . . . . . . . 75
sequence by another . . . . . . . . . 580
Doubly connected . . . . . . . . . . . . . . . . . . 501
Dual
basis . . . . . . . . . . . . . . . . . . . . . 436, 463
Hodge . . . . . . . . . . . . . . . . . . . . . 482
of a vector space . . . . . . . . . . 436, 463
Duality (metric ) . . . . . . . . . . . . . . . . . . 449
E
E . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 465
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 483
E . . . . . . . . . . . . . . . . . . . . . . . . . . . . 436, 463
Egorov theorem . . . . . . . . . . . . . . . . . . . . . 69
Eigenvalue . . . . . . . . . . . . . . . . . . . . . 391, 403
generalized . . . . . . . . . . . . . 392, 403
Eigenvector . . . . . . . . . . . . . . . . . . . . . . . . 391
Einstein Albert . . . . . . . . . . . . . . . . . . . . 435
Einstein convention . . . . . . . . . . . . . . . . 435
Electromagnetic field
dynamics of the without sources . . 348
Green function of the . . . . . . . . 414
Electromagnetism . . 228, 325, 348, 414–422
Electrostatics . . . . . . . . . . . . . . . . . . . 165, 194
Element (matrix ) . . . . . . . . . . . . . . . . . 398
Endomorphism
hermitian . . . . . . . . . . . . . . . . . . 379
normal . . . . . . . . . . . . . . . . . . . . . 379
self-adjoint . . . . . . . . . . . . . . . . . 379
symmetric . . . . . . . . . . . . . . . . . . 379
Energy (of a signal) . . . . . . . . . . . . . . . . . 368
Entire function . . . . . . . . . . . . . . . . . . . . 103
Equation
heat . . . . . . . . . . . . . . . . . . . 240, 422
Klein-Gordon . . . . . . . . . . . . . . . 429
Maxwell . . . . . . . . . . . . . . . . 481, 482
Poisson . . . . . . . . . . . . . . . . . 164, 217
Schrödinger . . . . . . . . . . . . . . . . . 427
Equivalent
conformally open sets . . . . . . . . 157
norms . . . . . . . . . . . . . . . . . . . . . . . . 581
paths . . . . . . . . . . . . . . . . . . . . . . . . . . 94
sequences . . . . . . . . . . . . . . . . . . . . . 580
Essential singularity . . . . . . . . . . . . . . . . . 110
Euler Leonhard . . . . . . . . . . . . . . . . . . . . 39
Euler function . . . . . . . . . . . . . . . . . . . . . 154
Event . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 512
incompatible s . . . . . . . . . . . . . . . 513
realized . . . . . . . . . . . . . . . . . . . . . 513
Exact differential form . . . . . . . . . . . . . . 474
Expansion
asymptotic series . . . . . . . . . . . . . 37
Fourier series . . . . . . . . . . . . . . . 265
perturbative . . . . . . . . . . . . . . . . . . 38
power series . . . . . . . . . . . . . . . 34, 35
Taylor . . . . . . . . . . . . . . . . . . . . . . . 36
Expectation . . . . . . . . . . . . . . . . . . . . 527, 528
Extension
of a continuous operator . . . . . . . 294
of an operator on H . . . . . . . . . . . 388
Exterior
1-form . . . . . . . . . . . . . . . . . . . . . . . 463
2-form . . . . . . . . . . . . . . . . . . . . . . . 464
k-form . . . . . . . . . . . . . . . . . . . . . . . 465
algebra . . . . . . . . . . . 469, 463–471
derivative . . . . . . . . . . . . . . . . . . . . . 470
product . . . . . . . . . . . . . . . . . . 467, 468
F
F [ f ] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278
Faithful representation . . . . . . . . . . . . . . 492
Faltung theorem . . . . . . . . . . . . . . . . . . . . . 292
Family (free/generating ) . . . . . . . . . . . 250
Faraday tensor . . . . . . . . . . . . . . . . . . . . . 481
Fejér sums . . . . . . . . . . . . . . . . . . . 270, 313
Feynman Richard . . . . . . . . . . . . . . . . . . 432
Feynman propagator . . . . . . . . . . . . . . . . 432
Field
electromagnetic
dynamics of the without sources . . 348
Green function of the . . . . . 414
transverse/longitudinal . . . . . . 358
Finite part fp(1/x k ) . . . . . . . . . 241, 324, 345
Finite power functions . . . . . . . . . . . . . . 370
Fischer (Riesz- theorem) . . . . . . . . . . . . 67
Fixed point . . . . . . . . . . . . . . . . . . . . . . . . . 15
Form
coordinate . . . . . . . . . . . . . . . . . . 436
differential . . . . . . . . . 469, 463–483
closed . . . . . . . . . . . . . . . . . . . 474
exact . . . . . . . . . . . . . . . . . . . . 474
integral of a . . . . . . . . . . . . . . 471
exterior 1-form . . . . . . . . . . . . . . . . 463
exterior 2-form . . . . . . . . . . . . . . . . 464
exterior k-form . . . . . . . . . . . . . . . . 465
linear . . . . . . . . . . . . . . . . . . 436, 463
volume . . . . . . . . . . . . . . . . . . . . . 467
Formula
Bayes . . . . . . . . . . . . . . . . . . . 517, 518
Cauchy . . . . . . . . . . . . . . . 45, 99, 101
Green . . . . . . . . . . . . . . . . . . . . . . 206
Green-Ostrogradski . . . . . . 205, 477
Gutzmer . . . . . . . . . . . . . . . . . . . . 45
Poincaré . . . . . . . . . . . . . . . . 516, 606
Poisson summation . . . . . . 271, 309
Stirling . . . . . . . . . . . . . . . . . . . . . 154
Stokes . . . . . . . . . . . . . . . . . . . . . . 473
Taylor . . . . . . . . . . . . . . . . . . . . . . . 31
Taylor-Lagrange . . . . . . . . . . . . 31, 32
Taylor-Young . . . . . . . . . . . . . . . . . 32
Four-vector . . . . . . . . . . . . . . . . . . . . . . . . 200
Fourier
coefficients . . . . . . . . . . . . . . . 258, 264
partial series . . . . . . . . . . . . . . . . 258
series . . . . . . . . . . . . . . . . 258, 264, 265
transform
of X . . . . . . . . . . . . . . . . . . . . . . 307
computation by residues . . . . . 120
conjugate transform . . . . . . . 278
of . . . . . . . . . . . . . . . . . . . . . . . 303
of a distribution . . . . . . . . . . . . 300
of a function . . . . . . . . . . . . . . . 278
of the gaussian . . . . . . . . . 126, 295
of H . . . . . . . . . . . . . . . . . . . . . . 304
inverse of the 282, 284, 291, 306
of the laplacian . . . . . . . . . . . . . 305
of the lorentzian 1/(1 + t 2 ) . . . 84
of pv(1/x) . . . . . . . . . . . . . . . . . 304
in Rn . . . . . . . . . . . . . . . . . . . . . . 358
sine and cosine . . . . . . . . . . . 295
Fourier Joseph . . . . . . . . . . . . . . . . . . . . 268
fp(1/x k ) . . . . . . . . . . . . . . . . . . . . . . . 324, 345
Fraunhofer approximation . . . . . . . . . . . 314
Free family . . . . . . . . . . . . . . . . . . . . . . . . 250
Frequencies (Matsubara ) . . . . . . . . . . . 128
Fubini Guido . . . . . . . . . . . . . . . . . . . . . . . 80
Fubini-Lebesgue theorem . . . . . . . . . . . . . 79
Fubini-Tonelli theorem . . . . . . . . . . . . . . . 80
Function
analytic . . . . . . . . . . . . . . . . . . 35, 101
antihermitian . . . . . . . . . . . . . . . 285
antiholomorphic . . . . . . . . . 92, 157
autocorrelation . . . . . . . . . . 368, 370
Bertrand (t log t) . . . . . . . . . . . 74
Bessel . . . . . . . . . . . . . . . . . . 320, 324
causal . . . . . . . . . . . . . . . . . . 331, 366
characteristic . . . . . . . . . . . . . 63, 541
coherence . . . . . . . . . . . . . . . . . . 369
Dirac function . . . . . . . . . . 185, 186
Dirichlet . . . . . . . . . . . . . . . . 62, 279
distribution . . . . . . . . . . . . . . . . . 524
joint . . . . . . . . . . . . . . . . 534, 538
entire . . . . . . . . . . . . . . . . . . . . . . 103
Euler . . . . . . . . . . . . . . . . . . . . . . . 154
finite power . . . . . . . . . . . . . . . . . 370
Green . . . . . . 165, 236, 408, 407–432
of the d'Alembertian . . . . 414, 417
of the harmonic oscillator . . . 409
of the heat equation . . . . 423, 424
harmonic . . . . . . . . . . . 139, 139–144
Heaviside . . . . . . . . . . . . . . . . 62, 193
hermitian . . . . . . . . . . . . . . . . . . 285
holomorphic . . . . . . . . . . . . . . . . 90
integrable . . . . . . . . . . . . . . . . . . . . 64
intercorrelation . . . . . . . . . . . . . 369
locally integrable . . . . . . . . . . . . 184
measurable . . . . . . . . . . . . . . . . . . 62
meromorphic . . . . . . . . . . . . . . . 110
multivalued . . . . . . . . . . . . . 135–139
rapidly decaying . . . . . . . . . . . . . 288
rectangle (x) . . . . . . . . . . 213, 279
self-coherence . . . . . . . . . . . . . . . 369
simple . . . . . . . . . . . . . . . . . . . . . . 63
slowly increasing . . . . . . . . . . . . 301
test . . . . . . . . . . . . . . . . . . . . . . . . 182
transfer . . . . . . . . . . . . . . . . . . . . 356
Functional . . . . . . . . . . . . . . . . . . . . . . . . . 182
continuity of a linear . . . . . . . . 184
G
g . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447
b
g . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
g . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
g . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447
Γ(z) (Euler function) . . . . . . . . . . . . . . 154
Gauge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 590
Gauss Carl . . . . . . . . . . . . . . . . . . . . . . . . 557
Gaussian . . . . . . . . . . . . . . . . . . . . . . . . . . 279
distribution . . . . . . . . . . . . . . . 515, 557
Fourier transform . . . . . . . . . 126, 295
Gelfand triple . . . . . . . . . . . . . . . . . . . . . . 383
Generated (σ-algebra ) . . . . . . . . . . 58, 513
Generators (infinitesimal ) . . . . . . . . . 495
Gibbs Josiah . . . . . . . . . . . . . . . . . . . . . . . 311
Gibbs phenomenon . . . . . . . . . . . . . . . . . 311
Graph of a linear operator . . . . . . . . . . 390
Green
-Ostrogradski formula . . . 205, 477
formula . . . . . . . . . . . . . . . . . . . . . . 206
function . . . . . 165, 236, 408, 407–432
of the d'Alembertian . . . . 414, 417
of the harmonic oscillator . . . 409
of the heat equation . . . . 423, 424
theorem . . . . . . . . . . . . . . . . . . . . . . 106
Green George . . . . . . . . . . . . . . . . . . . . . 105
Green-Riemann theorem . . . . . . . . . . . . 105
Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . 489
abelian (commutative) . . . . . . . 489
linear representation . . . . . . . . . . . 492
one-parameter . . . . . . . . . . . . . . . 495
of rotations . . . . . . . . . . . . . . . . . . . 492
Gutzmer formula . . . . . . . . . . . . . . . . . . . . 45
H
H (x) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
Half-life . . . . . . . . . . . . . . . . . . . . . . . . . . . 533
Hankel transform . . . . . . . . . . . . . . . . . . 324
Harmonic function . . . . . . . 139, 139–144
Harmonic oscillator . . . . . . . . . . . . 239, 409
Heat
equation . . . . . . . . . . . . . . . . . 240, 422
kernel . . . . . . . . . . . . . . . . 240, 423, 424
operator . . . . . . . . . . . . . . . . . . . . . . 240
Heaviside
distribution . . . . . . . . . . . . . . . . . . . 193
Fourier transform of H . . . . . . . . 304
function . . . . . . . . . . . . . . . . . . . 62, 193
Laplace transform of H . . . . . . . . 334
Heaviside Oliver . . . . . . . . . . . . . . . . . . . 193
Heisenberg uncertainty relations . 363, 404
Hermite polynomials . . . . . . . . . . . . . . . 263
Hermitian
endomorphism . . . . . . . . . . . . . . . . 379
function . . . . . . . . . . . . . . . . . . . . . . 285
operator . . . . . . . . . . . . . . . . . . . . . . 394
product . . . . . . . . . . . . . . . . . . . . . . 251
Hilbert
basis . . . . . . . . . . . . . . . . . . . . . . . . . 258
infinite hotels . . . . . . . . . . . . . . 12, 330
space . . . . . . . . . . . . . . . . . . . . . . . . . 257
transform . . . . . . . . . . . . . . . . . 227, 366
Hilbert David . . . . . . . . . . . . . . . . . . . . . 257
Hodge operator . . . . . . . . . . . . . . . . . . . . 482
Holomorphic function . . . . . . . . . . . . . . . 90
Holomorphically simply connected . . . 114
Homeomorphic open sets . . . . . . . . . . . 157
Homeomorphism . . . . . . . . . . . . . . . . . . 157
Homotopy . . . . . . . . . . . . . . . . . . . . . . . . . 575
I
I.r.v. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 537
Identity
Bienaymé . . . . . . . . . . . . . . . . . 545
parallelogram . . . . . . . . . . . . . . . 253
Parseval . . . . . . . . . . . . . . . . 259, 265
Parseval-Plancherel . . . . . . . . . . . 291
Poisson summation formula . . . . 271
Taylor-Lagrange . . . . . . . . . . . . . . 31
Image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
Impulse response . . . . . . . . . . . . . . . . . . . 218
Independent
σ-algebras . . . . . . . . . . . . . . . . . . . . 519
events . . . . . . . . . . . . . . . . . . . . . . . . 519
random variables . . . . . . . . . . . . . . 537
Indices
contracted . . . . . . . . . . . . . . . . . . 454
contravariant . . . . . . . . . . . 434, 451
covariant . . . . . . . . . . . . . . . 437, 449
Inequality
Bessel . . . . . . . . . . . . . . . . . . . . . . 256
Bienaymé-Tchebychev . . . . . . . . 547
Cauchy-Schwarz . . . . . . . . . . . . . 252
Heisenberg . . . . . . . . . . . . . 363, 404
Minkowski . . . . . . . . . . . . . . . . . 253
Taylor-Lagrange . . . . . . . . . . . . . . 32
Infinite hotels . . . . . . . . . . . . . . . . . . . 12, 330
Infinitesimal generators . . . . . . . . . . . . . 495
Integrable function . . . . . . . . . . . . . . . 64, 64
Integral
of a differential form . . . . . . . . . . 471
Lebesgue . . . . . . . . . . . . . . . . . . . . 64
Riemann . . . . . . . . . . . . . . . . . 51, 53
Riemann-Stieltjes . . . . . . . . 529, 552
Intercoherence function . . . . . . . . . . . . . 369
Intercorrelation . . . . . . . . . . . . . . . . . . . . 369
Interior of a subset . . . . . . . . . . . . . . . . . 578
Inverse (convolution ) . . . . . . . . . . . . . 236
Inversion
of the Fourier transform282, 284, 306
of the Laplace transform . . . . . . . 338
J
J0 (x), J1 (x) . . . . . . . . . . . . . . . . . . . . . . . . 320
Jacobian . . . . . . . . . . . . . . . . . . . . . . . . 81, 588
Joint
distribution function . . . . . . . . . . 534
probability density . . . . . . . . 534, 539
Jordan Camille . . . . . . . . . . . . . . . . . . . . 119
Jordan's lemmas . . . . . . . . . . . . . . . . 117, 118
Joukovski Nicola . . . . . . . . . . . . . . . . . . 158
Joukovski mapping . . . . . . . . . . . . . . . . . 158
K
Kernel
Dirichlet . . . . . . . . . . . . . . . . . . . 270
Fejér . . . . . . . . . . . . . . . . . . . . . . . 270
heat . . . . . . . . . . . . . . . 240, 423, 424
Poisson . . . . . . . . . . . . . . . . . . . . . 172
Schrödinger . . . . . . . . . . . . . . . . 428
Ket . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
| . . . . . . . . . . . . . . . . . . . . . . 260, 379
|x . . . . . . . . . . . . . . . . . . . . . . . . . . . 384
|p . . . . . . . . . . . . . . . . . . . . . . . . . . 383
Khintchine (Wiener- theorem) . . . . . 371
Kirchhoff integral . . . . . . . . . . . . . . 244, 349
Klein-Gordon equation . . . . . . . . . . . . . 429
Kolmogorov Andrei Nikolaievitch . . . 515
Kramers-Kronig relations . . . . . . . . . . . . 227
Kronecker Leopold . . . . . . . . . . . . . . . . 445
Kronecker symbol . . . . . . . . . . . . . . . . . . 445
L
L . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280
2 (E ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 464
k (E ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 466
L2 (R) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
L2 [0, a] . . . . . . . . . . . . . . . . . . . . . . . 262, 264
L1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280
L1loc (Rn ) . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
ℓ2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
Lagrange
multipliers . . . . . . . . . . . . . . . . . . . . 588
Taylor- formula . . . . . . . . . . . . . . 32
Lagrangian (Proca ) . . . . . . . . . . . . . . . . 484
Landau Edmund . . . . . . . . . . . . . . . . . . 580
Landau notation . . . . . . . . . . . . . . . . . . . 580
Laplace, Pierre Simon de . . . . . . . . . . . 333
Laplace transform . . . . . . . . 332, 331–350
of distributions . . . . . . . . . . . . . . . 342
inversion of the . . . . . . . . . . . . . 338
Laplacian . . . . . . . . . . . . . . . . . . . . . . . . . . 139
Fourier transform of the . . . . . 305
Large numbers
strong law of . . . . . . . . . . . . . . . 556
weak law of . . . . . . . . . . . . . . . . . 555
Laurent series . . . . . . . . . . . . . . . . . . 111, 112
Law
binomial . . . . . . . . . . . . . . . . . . . 616
Cauchy . . . . . . . . . . . . . . . . . . . . . 616
Poisson . . . . . . . . . . . . . . . . . . . . . 616
strong of large numbers . . . . . . 556
weak of large numbers . . . . . . . 555
Lebesgue
integral . . . . . . . . . . . . . . . . . . . . . . . . 64
measure . . . . . . . . . . . . . . . . . . . . . . . 59
Riemann- lemma . . . . . . . . 267, 282
theorem . . . . . . . . . . . . . . . . . . . . . . . 75
Lebesgue Henri . . . . . . . . . . . . . . . . . . . . . 61
Lemma
Jordan . . . . . . . . . . . . . . . . . . . . 117, 118
Riemann-Lebesgue . . . . . . . . . 267, 282
Zorn . . . . . . . . . . . . . . . . . . . . . . . . . 249
Length of a path . . . . . . . . . . . . . . . . . . . . 95
Levi (Beppo- theorem) . . . . . . . . . . . . . 76
Levi-Civita tensor . . . . . . . . . . . . . . 483, 496
Lévy Paul . . . . . . . . . . . . . . . . . . . . . . . 558
Lie algebra . . . . . . . . . . . . . . . . . . . . . . . . . 496
Lifetime . . . . . . . . . . . . . . . . . . . . . . . . . . . 533
Lightning rod . . . . . . . . . . . . . . . . . . . . . . 169
Limit
classical . . . . . . . . . . . . . . . . . . . . . . 7
in D . . . . . . . . . . . . . . . . . . . . . . . . . 183
pointwise of a series . . . . . . . . . . 29
uniform of a series . . . . . . . . . . . 29
Linear representation . . . . . . . . . . . . . . . 492
Liouville Joseph . . . . . . . . . . . . . . . . . . . 104
Liouville theorem . . . . . . . . . . . . . . . 45, 103
Lipschitz (Cauchy- theorem) . . . . . . . . 46
Locally
finite . . . . . . . . . . . . . . . . . . . . . . . . . 110
integrable function . . . . . . . . . . . . 184
Longitudinal fields . . . . . . . . . . . . . . . . . 358
Lorentzian . . . . . . . . . . . . . . . . . . . . . 279, 284
M
M1,3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 448
Magic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 506
Markov inequality . . . . . . . . . . . . . . . . . . 548
Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . 593
element . . . . . . . . . . . . . . . . . . . . . . 398
jacobian . . . . . . . . . . . . . . . . . 81, 588
orthogonal . . . . . . . . . . . . . . . . . 493
Pauli . . . . . . . . . . . . . . . . . . . 498, 505
representation . . . . . . . . . . . . . . . . . 594
rotation . . . . . . . . . . . . . . . . . . . . 492
unitary . . . . . . . . . . . . . . . . . . . . . 497
Wronskian . . . . . . . . . . . . . . . . . . 409
Matsubara frequencies . . . . . . . . . . . . . . 128
Maximum modulus principle . . . . . . . . 104
Maximum principle . . . . . . . . . . . . . . . . 141
Maxwell equations . . . . . . . . . . . . . . 481, 482
Mean value
theorem . . . . . . . . . . . . . . . . . . . . . . 141
Mean value property . . . . . . . . . . . . . . . . 100
Measurable
function . . . . . . . . . . . . . . . . . . . . . . . 62
space . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
diffuse . . . . . . . . . . . . . . . . . . . . . . 60
Dirac . . . . . . . . . . . . . . . . . . . . . . 552
Lebesgue . . . . . . . . . . . . . . . . . 59, 60
Lebesgue exterior . . . . . . . . . . . . . 60
Median . . . . . . . . . . . . . . . . . . . . . . . . . . . 533
Meromorphic function . . . . . . . . . . . . . . 110
Metric . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447
duality . . . . . . . . . . . . . . . . . . . . . . . 449
Minkowski . . . . . . . . . . . . . . . . . 448
Minkowski
inequality . . . . . . . . . . . . . . . . . . . . 253
pseudo-metric . . . . . . . . . . . . . . . . . 448
space . . . . . . . . . . . . . . . . . . . . . . . . . 448
Minkowski Hermann . . . . . . . . . . . . . . . 448
de Moivre Abraham . . . . . . . . . . . . . . . . 511
Moment . . . . . . . . . . . . . . . . . . . . . . . . . . . 532
of a pair of r.v. . . . . . . . . . . . . . . . 535
Momentum
average . . . . . . . . . . . . . . . . . . . . . 361
operator . 361, 388, 389, 391, 401, 402
representation . . . . . . . . . . . . . 360, 361
uncertainty . . . . . . . . . . . . . . . . . . . 361
Monty Hall paradox . . . . . . . . . . . . . . . . 560
Morera's theorem . . . . . . . . . . . . . . . . . . 105
Multipliers (Lagrange ) . . . . . . . . . . . . 588
Multivalued function . . . . . . . . . . . . . . . 137
N
N (m, σ2 ) . . . . . . . . . . . . . . . . . . . . . . . 557
Negligible
sequence compared to another 580
set . . . . . . . . . . . . . . . . . . . . . . . . 61, 515
Neighborhood . . . . . . . . . . . . . . . . . . . . . 573
Neumann problem . . . . . . . . . . . . . . . . . 167
Nonmonochromatic signal . . . . . . . . . . 372
Norm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 577
equivalent s . . . . . . . . . . . . . . . . . 581
hermitian . . . . . . . . . . . . . . . . . . 252
Normal distribution . . . . . . . . . . . . 515, 557
Normal endomorphism . . . . . . . . . . . . . 379
Normed vector space . . . . . . . . . . . . . . . . 577
O
O(3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 493
O(n ), o(n ) . . . . . . . . . . . . . . . . . . . . . . . 580
Observable . . . . . . . . . . . . . . . . . . . . . . . . 393
Open set . . . . . . . . . . . . . . . . . . . . . . 573, 578
conformally equivalent s . . . . . . 157
contractible . . . . . . . . . . . . . . . . . 474
star-shaped . . . . . . . . . . . . . . . . . . 475
Open/closed ball . . . . . . . . . . . . . . . . . . . 578
Operator
bounded (continuous) . . . . . . . 390
closable . . . . . . . . . . . . . . . . . . . . 391
closed . . . . . . . . . . . . . . . . . . . . . . 390
heat . . . . . . . . . . . . . . . . . . . . . . . . 422
hermitian (symmetric) . . . . . . . 394
Hodge . . . . . . . . . . . . . . . . . . . . . 482
momentum . . . . 361, 388, 389, 391,
401, 402
position . . . . . . . . 360, 391, 402, 403
Schrödinger . . . . . . . . . . . . . . . . . 427
self-adjoint . . . . . . . . . . . . . . . . . 394
Optics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
Order of a pole . . . . . . . . . . . . . . . . . . . . 112
Original . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
Orthogonal
matrix . . . . . . . . . . . . . . . . . . . . . . . 493
projection . . . . . . . . . . . . . . . . 255, 399
system . . . . . . . . . . . . . . . . . . . . . . . 253
Orthogonality . . . . . . . . . . . . . . . . . . . . . 253
Orthonormal system . . . . . . . . . . . . . . . . 253
Ostrogradski (Green- formula) . 205, 477
P
P (momentum operator) . . . . . . . . 388, 389
P . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57, 512
(rectangle function) . . . . . . . . . . 213, 279
Parseval
-Plancherel identity . . . . . . . . . . 291
identity . . . . . . . . . . . . . . . . . . 259, 265
Partial Fourier series . . . . . . . . . . . . . . . . 258
Pascal Blaise . . . . . . . . . . . . . . . . . . . . . . 606
Path . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
integral on a . . . . . . . . . . . . . . . . . 94
length of a . . . . . . . . . . . . . . . . . . 95
winding number of a . . . . . . . . . 96
Pauli matrices . . . . . . . . . . . . . . . . . . . . . . 498
Percolation . . . . . . . . . . . . . . . . . . . . . . . . 169
Permutation . . . . . . . . . . . . . . . . . . . . 26, 465
Permutations . . . . . . . see Symmetric, group
Pétanque . . . . . . . . . . . . . . . . . . . . . . . . . 181
fp(1/x k ) . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
Phenomenon
Gibbs . . . . . . . . . . . . . . . . . . . . . . 311
Stokes . . . . . . . . . . . . . . . . . . . . . . . 38
Physical applications
electricity . . . . . . . . . . . . . . . . . . 43, 355
electromagnetism . 199, 228, 348, 358,
375, 414–422, 479, 480, 484
electrostatics . . . . . . 153, 165, 194, 216,
325, 477
harmonic oscillator . . . . 239, 409–413
hydrodynamics . . . . . . . . . . 5, 167, 174
mechanics . . . . . . . . . . . . . . . . . . . . 15
optics . . . . . . . . . . . . . . 38, 43, 227, 244,
314–321, 324, 365, 371, 375
quantum mechanics . . 7, 38, 128, 260, 359,
403, 377–405, 427–432, 497, 503
radioactivity . . . . . . . . . . . . . . . . . . 533
relativity . . . . . . . . . . . . . . 200, 211, 242
resonance . . . . . . . . . . . . . . . . . . . . . 357
thermodynamics . . 173, 175, 422427
Physical optics . . . . . . . . . . . 314–321, 366
coherence . . . . . . . . . . . . . . . . . . 43, 371
Kirchhoff integral . . . . . . . . . 244, 349
Picard iteration . . . . . . . . . . . . . . . . . . . . . . 46
Piecewise continuous function . . . . . . . . 53
Plasma frequency . . . . . . . . . . . . . . . . . . . 228
Poincaré
formula . . . . . . . . . . . . . . . . . . 516, 606
theorem . . . . . . . . . . . . . . . . . . 474, 475
Poincaré Henri . . . . . . . . . . . . . . . 37, 475
Point
accumulation . . . . . . . . . . . . . . . 106
branch . . . . . . . . . . . . . 111, 136, 139
fixed theorem . . . . . . . . . . . . . 15, 43
saddle . . . . . . . . . . . . . . . . . . . . . . 149
stopping . . . . . . . . . . . . . . . . . . . . 167
Pointwise . . . . . . . . . . . . . . . . . . . . see Simple
Poisson
distribution . . . . . . . . . . . 530, 530, 547
equation . . . . . . . . . . . . . . 164, 216, 217
kernel . . . . . . . . . . . . . . . . . . . . . . . . 172
law . . . . . . . . . . . . . . . . . . . . . . . . . . . 616
summation formula . . . . . . . 271, 309
Poisson Denis . . . . . . . . . . . . . . . . . . . . . 172
Pole . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
at infinity . . . . . . . . . . . . . . . . . . . . 146
order of a . . . . . . . . . . . . . . . . . . 112
simple . . . . . . . . . . . . . . . . . . . . . . 112
Position
average . . . . . . . . . . . . . . . . . . . . . 361
operator . . . . . . . . . . 360, 391, 402, 403
representation . . . . . . . . . . . . . . . . . 360
uncertainty . . . . . . . . . . . . . . . . . . . 361
Positive (negative) part . . . . . . . . . . . . . . . 64
Potential
Coulomb . . . . . . . . . . . . . . . . . . . 166
Fourier transform of the . . . 306
Laplacian of the . . . . . . . . . . 208
Debye . . . . . . . . . . . . . . . . . . . . . . 325
Yukawa . . . . . . . . . . . . . . . . . . . . . 484
Power series expansion . . . . . . . . . . . . . . . 34
Pre-Hilbert space . . . . . . . . . . . . . . . . . . . 251
Principal value pv(1/x) . . . . . . . . . 188, 224
Fourier transform of the . . . . . 304
Principle
maximum . . . . . . . . . . . . . . . . . . 141
maximum modulus . . . . . . . . . . 104
uncertainty . . . . . . . . . . . . . 363, 404
Probability
conditional . . . . . . . . . . . . . . . . . 517
density . . . . . . . . . . . . . . . . . . . . . . . 526
joint . . . . . . . . . . . . . . . . 534, 539
diffuse . . . . . . . . . . . . . . . . . . . . . 525
distribution . . . . . . . . . . . . . . . . . . . 522
measure . . . . . . . . . . . . . . . . . . . . . . 514
Problem
Cauchy . . . . . . . . . . . . . . . . 238, 346
Dirichlet . . . . . . . . . . . . . . . . . . . 170
for a disc . . . . . . . . . . . . . . . . . . 172
on a half-plane . . . . . . . . . . . . . 175
for a strip . . . . . . . . . . . . . . . . . . 175
Neumann . . . . . . . . . . . . . . . . . . . 167
Proca Lagrangian . . . . . . . . . . . . . . . . . . . 484
Product
Cauchy . . . . . . . . . . . . . . . . . . . . . 215
convolution
of causal functions . . . . . . . . . . 339
of distributions . . . . . . . . . . . . . 214
and Fourier transform . . . . . . . 292
of functions . . . . . . . . . . . . . . . . 270
of functions . . . . . . . . . . . . . . . . 211
direct . . . . . . . . . see Tensor, product
exterior . . . . . . . . . . . . . . . . . 467, 468
hermitian . . . . . . . . . . . . . . . . . . . 251
scalar . . . . . . . . . . . . . . . . . . 251, 448
tensor . . . . . . . . . see Tensor product
Projection (orthogonal ) . . . . . . . 255, 399
Propagator . . . . . . . . . . . see Green function
Feynman . . . . . . . . . . . . . . . . . . . 432
Pseudo-metric . . . . . . . . . . . . . . . . . . . . . . 447
Punctured neighborhood . . . . . . . . . . . . . 18
pv(1/x) . . . . . . . . . . . . . . . . . . . . . . . 188, 224
R
R+ . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Radius of convergence . . . . . . . . . . . . . . . 34
Ramanujan Srinivasa . . . . . . . . . . . . . . . 128
Random variable . . . . . . . . . 521, 521–550
centered . . . . . . . . . . . . . . . . . . . . 532
discrete . . . . . . . . . . . . . . . . . . . . . 526
independent s . . . . . . . . . . . . . . . 537
product of s . . . . . . . . . . . . . . . . . 546
ratio of s . . . . . . . . . . . . . . . . . . . . 546
reduced . . . . . . . . . . . . . . . . . . . . 532
sum of s . . . . . . . . . . . . . . . . 544, 545
uncorrelated s . . . . . . . . . . . . . . . 536
Random vector . . . . . . . . . . . . . . . . . . . . . 538
Rapidly decaying function . . . . . . . . . . . 288
Rayleigh criterion . . . . . . . . . . . . . . . . . . 321
Realization of an event . . . . . . . . . . . . . . 513
Rectangle function (x) . . . . . . . . 213, 279
Regular distribution . . . . . . . . . . . . . . . . 184
Regularization of a distribution . . 234, 235
Relation
closure . . . . . . . . . . . . . . . . . . . . . 384
Kramers-Kronig dispersion . . . 227
uncertainty . . . . . . . . . 363, 364, 404
Removable singularity . . . . . . . . . . . . . . . 109
Representation
by a matrix . . . . . . . . . . . . . . . . . . . 594
faithful . . . . . . . . . . . . . . . . . . . . . 492
group . . . . . . . . . . . . . . . . . . . . . . 492
momentum . . . . . . . . . . . . . 360, 361
position . . . . . . . . . . . . . . . . . . . . 360
Riesz theorem . . . . . . . . . . . . . . . 381
trivial . . . . . . . . . . . . . . . . . . . . . . 492
Residue . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
applications . . . . . . . . . . . . . . . 117–124
at infinity . . . . . . . . . . . . . . . . . . . . 146
and Fourier transform . . . . . . . . . 120
practical computation . . . . . . . . . . 116
theorem . . . . . . . . . . . . . . . . . . . . . . 115
Resolvent set . . . . . . . . . . . . . . . . . 393, 403
Resonance . . . . . . . . . . . . . . . . . . . . . . . . . 357
Response (impulse ) . . . . . . . . . . . . . . . 218
Riemann
integral . . . . . . . . . . . . . . . . . . . . . . . . 53
-Lebesgue lemma . . . . . . . . 267, 282
mapping theorem . . . . . . . . . . . . . 157
Riemann-Stieltjes integral . . 529, 552
sphere . . . . . . . . . . . . . . . . . . . . 146, 504
surface . . . . . . . . . . . . . . . . . . . . . . . 137
Riemann Bernhard . . . . . . . . . . . . . . . . . 138
Riesz representation theorem . . . . . . . . 381
Riesz-Fischer theorem . . . . . . . . . . . . . . . . 67
r.v. . . . . . . . . . . . . . . . . . see Random variable
S
SO(3) . . . . . . . . . . . . . . . 492, 492–505
S . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289, 300
S (operator of class ) . . . . . . . . . . . . . 397
S . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
SU(2) . . . . . . . . . . . . . . . 497, 492–505
S . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Sn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 465
σ(X ) . . . . . . . . . . . . . . . . . . . . . . . . . 528
Saddle point . . . . . . . . . . . . . . . . . . . . . . . 149
Scalar
product . . . . . . . . . . . . . . . . . . 251, 448
pseudo-product . . . . . . . . . . . . . . . 448
Schmidt Erhard . . . . . . . . . . . . . . . . . . . 258
Schrödinger
equation . . . . . . . . . . . . . . . . . . . . . . 427
kernel . . . . . . . . . . . . . . . . . . . . . . . . 428
operator . . . . . . . . . . . . . . . . . . . . . . 427
Schwartz Laurent . . . . . . . . . . . . . . . . . . 220
Schwartz space S . . . . . . . . . . . . . . 289, 300
Schwarz Hermann . . . . . . . . . . . . . . . . . 161
Schwarz-Christoffel transformation . . . 161
Screen (total ) . . . . . . . . . . . . . . . . . . . . . 330
Self-adjoint
endomorphism . . . . . . . . . . . . . . . . 379
operator . . . . . . . . . . . . . . . . . . . . . . 394
Self-coherence function . . . . . . . . . . . . . 369
Seminorm . . . . . . . . . . . . . . . . . . . . . . . . . 577
Separable (Hilbert) space . . . . . . . . . . . . 259
Sequence (Dirac ) . . . . . . . . . . . . . . . . . 231
Series
computation of by residues . . . 122
conditionally convergent . . . . . . . . . 26
Fourier . . . . . . . . . . . . . 258, 264, 265
partial . . . . . . . . . . . . . . . . . . . 258
Laurent . . . . . . . . . . . . . . . . . 111, 112
power . . . . . . . . . . . . . . . . . . . . . . . 34
Taylor . . . . . . . . . . . . . . . . . . . . . . . 35
Set
measurable . . . . . . . . . . . . . . . . . . 59
negligible . . . . . . . . . . . . . . . . . . . . 61
resolvent . . . . . . . . . . . . . . . 393, 403
of subsets of a set . . . . . . . . . . . . . . 57
sgn (sign distribution) . . . . . . . . . . . . 203
Si(x) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
Signal
analytic . . . . . . . . . . . . . . . . 365, 366
finite energy . . . . . . . . . . . . . . . . 368
imaginary . . . . . . . . . . . . . . . . . . 366
nonmonochromatic . . . . . . . . . 372
Signature . . . . . . . . . . . . . . . . . . . . . . . . . . 465
Simple
connectedness . . . . . . . . . . . . . . . . . 575
convergence . . . . . . . . . . . . . . . . . 19, 29
curve . . . . . . . . . . . . . . . . . . . . . . . . . . 93
function . . . . . . . . . . . . . . . . . . . . . . . 63
holomorphic connectedness . . 114
pole . . . . . . . . . . . . . . . . . . . . . . . . . . 112
Simply connected . . . . . . . . . . . . . . . 114, 575
Simultaneous realization . . . . . . . . . . . . 513
Sine cardinal sinc(x) . . . . . . . . 279, 285, 288
Sine integral Si(x) . . . . . . . . . . . . . . . . . . 312
Singular distribution . . . . . . . . . . . . . . . . 185
Singularity
essential . . . . . . . . . . . . . . . . . . . . 110
at infinity . . . . . . . . . . . . . . . . . 111, 146
removable . . . . . . . . . . . . . . . . . . 109
Slowly increasing function . . . . . . . . . . . 301
Space
connected . . . . . . . . . . . . . . . . . . 574
of distributions D . . . . . . . . . . . . 183
Hilbert . . . . . . . . . . . . . . . . . . . . . 257
measurable . . . . . . . . . . . . . . . . . . 59
measure . . . . . . . . . . . . . . . . . . . . . 59
Minkowski . . . . . . . . . . . . . . . . . 448
pre-Hilbert . . . . . . . . . . . . . . . . . . 251
probability . . . . . . . . . . . . . . 512, 514
sample . . . . . . . . . . . . . . . . . . . . . 512
Schwartz S . . . . . . . . . . . . 289, 300
separable . . . . . . . . . . . . . . . . . . . 259
test D . . . . . . . . . . . . . . . . . . . . . . 182
topological . . . . . . . . . . . . . . . . . 573
Special relativity . . . . . . . . . . . . 200, 211, 242
Spectrum
continuous . . . . . . . . . . . . . 392, 403
discrete . . . . . . . . . . . . . . . . . 391, 403
of a function . . . . . . . . . . . . . . . . . 278
power . . . . . . . . . . . . . . . . . . . . . . 370
residual . . . . . . . . . . . . . . . . 393, 403
Sphere (Riemann ) . . . . . . . . . . . . . . . . 504
Spinor . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503
Standard deviation . . . . . . . . . . . . . 528, 528
Star-shaped open set . . . . . . . . . . . . . . . . 475
Step (of a subdivision) . . . . . . . . . . . . . . . 52
Stieltjes (Riemann- integral) . . . 529, 552
Stirling formula . . . . . . . . . . . . . . . . . . . . 154
Stokes
formula . . . . . . . . . . . . . . . . . . . . . . 473
phenomenon . . . . . . . . . . . . . . . . . . 38
Stokes George . . . . . . . . . . . . . . . . . . . . . 472
Stopping point . . . . . . . . . . . . . . . . . . . . . 167
Structure constants . . . . . . . . . . . . . . . . . 496
Sub-vector space generated . . . . . . . . . . . 250
Subdivision . . . . . . . . . . . . . . . . . . . . . . . . . 52
Support of a distribution . . . . . . . . . . . . 187
Surface (Riemann ) . . . . . . . . . . . . . . . . 137
Symmetric
endomorphism . . . . . . . . . . . . . . . . 379
group . . . . . . . . . . . . . see Cyclic group
operator . . . . . . . . . . . . . . . . . . . . . . 394
System
causal . . . . . . . . . . . . . . . . . . . . . . 219
complete of events . . . . . . . . . . . 513
orthonormal . . . . . . . . . . . . . . . . 253
total . . . . . . . . . . . . . . . 257, 259, 385
T
T (C ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 513
Taylor
-Lagrange formula . . . . . . . . . . . . 31
-Lagrange inequality . . . . . . . . . . 32
-Young formula . . . . . . . . . . . . . . 32
expansion . . . . . . . . . . . . . . . . . . . . . 36
formula with integral remainder . 31
polynomial . . . . . . . . . . . . . . . . . . . . 31
remainder . . . . . . . . . . . . . . . . . . . . . 31
series . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Taylor Brook . . . . . . . . . . . . . . . . . . . . . . . 31
Tchebychev Pafnouti . . . . . . . . . . . . . . . 548
Tchebychev inequality . . . . . . . . . . . . . . . 547
Tempered distribution . . . . . . . . . . . . . . 300
Tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446
Faraday . . . . . . . . . . . . . . . . . . . . . 481
Levi-Civita . . . . . . . . . . . . . . 483, 496
product . . . . . . . . . . . . . . . . . . 439–447
of distributions . . . . . . . . . . . . . 210
of a linear form and a vector . 444
of linear forms . . . . . . . . . . . . . 441
of functions . . . . . . . . . . . . . . . . 209
of vector spaces . . . . . . . . . . . . . 439
of a vector and a linear form . 444
of vectors . . . . . . . . . . . . . . . . . . 443
Test function/space . . . . . . . . . . . . . . . . . 182
Theorem
Banach . . . . . . . . . . . . . . . . . . . . . . 15
Beppo Levi . . . . . . . . . . . . . . . . . . 76
Casorati-Weierstrass . . . . . . . . . . 109
Cauchy . . . . . . . . . . . . 97, 98, 98, 598
Cauchy-Lipschitz . . . . . . . . . . . . . 46
central limit . . . . . . . . . . . . 425, 558
dominated convergence . . . . . . . 75
Dini . . . . . . . . . . . . . . . . . . . . . . . . 22
Dirichlet . . . . . . . . . . . . . . . . . . . 268
Egorov . . . . . . . . . . . . . . . . . . . . . . 69
Faltung . . . . . . . . . . . . . . . . . . . . . 292
fixed point . . . . . . . . . . . . . . . . . . 15
Fubini-Lebesgue . . . . . . . . . . . . . . 79
Fubini-Tonelli . . . . . . . . . . . . . . . . 80
generalized spectral . . . . . . . . . . 398
Green . . . . . . . . . . . . . . . . . . . . . . 106
Green-Riemann . . . . . . . . . . . . . 105
Hellinger-Toeplitz . . . . . . . . . . . 394
Lebesgue . . . . . . . . . . . . . . . . . . . . 75
Liouville . . . . . . . . . . . . . . . . 45, 103
mean value . . . . . . . . . . . . . . . . . 141
Morera . . . . . . . . . . . . . . . . . . . . . 105
orthogonal projection . . . . . . . 255
Poincaré . . . . . . . . . . . . . . . . 474, 475
representation . . . . . . . . . . . 377, 381
residue . . . . . . . . . . . . . . . . . . . . . 115
Riemann mapping . . . . . . . . . . . 157
Riesz . . . . . . . . . . . . . . . . . . . . . . . 381
Riesz-Fischer . . . . . . . . . . . . . . . . . 67
Schwarz-Christoffel . . . . . . . . . . 161
spectral . . . . . . . . . . . . . . . . . . . . . 379
Stokes . . . . . . . . . . . . . . . . . . . . . . 473
van Cittert-Zernike . . . . . . . . . . 375
Wiener-Khintchine . . . . . . . . . . 371
Total screen . . . . . . . . . . . . . . . . . . . . . . . . 330
Total system . . . . . . . . . . . . . . . 257, 259, 385
Transfer function . . . . . . . . . . . . . . . . . . . 356
Transform
z . . . . . . . . . . . . . . . . . . . . . . . . . . 344
Fourier . . . . . see Fourier, transform
Hankel . . . . . . . . . . . . . . . . . . . . . 324
Hilbert . . . . . . . . . . . . . . . . . 227, 366
Laplace . . . . . . . . . . . . . 332, 331–350
of distributions . . . . . . . . . . . . . 342
inversion of the . . . . . . . . . . 338
Transformation
conformal . . . . . . . . . . . . . . . . . . 156
Schwarz-Christoffel . . . . . . . . . . 161
Translate of a distribution . . . . . . . . . . . 189
Transpose of a distribution . . . . . . . . . . 190
Transverse fields . . . . . . . . . . . . . . . . . . . . 358
Trivial representation . . . . . . . . . . . . . . . 492
U
Uncertainty
position/momentum . . . . . . . . 361
relation . . . . . . . . . . . . . . . . . . 363, 404
Uncorrelated r. v. . . . . . . . . . . . . . . . . . . . 536
Uniform convergence . . . . . . . . . . . . . . . . 29
Unitary matrix . . . . . . . . . . . . . . . . . . . . . 497
V
van Cittert-Zernike theorem . . . . . . . . . 375
Variance . . . . . . . . . . . . . . . . . . . 528, 528, 535
Vector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497
random . . . . . . . . . . . . . . . . . . . . 538
Vector space
complete normed . . . . . . . . . . . . . 14
normed . . . . . . . . . . . . . . . . . . . . 577
sub- generated . . . . . . . . . . . . . . . 250
W
Wages of fear (the ) . . . . . . . . . . . . . . . . 243
Wavelets . . . . . . . . . . . . . . . . . . . . . . . 263, 321
Weak convergence . . . . . . . . . . . . . . . . . . 230
Weierstrass Karl . . . . . . . . . . . . . . . . . . 582
Wiener-Khintchine theorem . . . . . . . . . 371
Winding number of a path . . . . . . . 96, 597
Wronskian . . . . . . . . . . . . . . . . . . . . . . . . . 409
X
X (position operator) . . . . . . . . . . . . . . . 391
Y
Young (Taylor- formula) . . . . . . . . . . . . 32
Yukawa potential . . . . . . . . . . . . . . . . . . . 484
Z
z-transform . . . . . . . . . . . . . . . . . . . . . . . . 344
Zernike (van Cittert- theorem) . . . . . 375
Zero of a holomorphic function . . . . 106
Zorn's lemma . . . . . . . . . . . . . . . . . . . . . 249