1 Analysis of Complex Functions

Let’s jump right in and begin with some definitions:

Definition 1.1 The complex plane $\mathbb{C}$ is (a representation in $\mathbb{R}^2$ of) the set $\{z : |z| < \infty\}$, and the extended complex plane is $\mathbb{C} \cup \{\infty\}$, where $\{\infty\}$ is the single point at infinity corresponding to the origin under the transformation $z \to z^{-1}$.

Definition 1.2 An argument of $z$ is any one of the numbers $\theta + 2\pi m$, where $z = re^{i\theta}$ and $m$ is an integer. The principal value of $\arg(z)$ satisfies $-\pi < \arg(z) \le \pi$.

Definition 1.3 A region D ⊂ C is a (non-empty) connected open subset of C.

Definition 1.4 $\Gamma \subset D$ is a simple closed curve (s.c.c.) if $\Gamma$ is a continuous, one-to-one image of $S^1 = \{z : |z| = 1\}$.

Definition 1.5 A function of a complex variable z is a mapping from D to C

Fundamental to a discussion of complex functions is the issue of differentiation. We say that a function $f : D \to \mathbb{C}$ is differentiable at the point $z_0 \in D$ if
\[ f'(z_0) \equiv \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0} \tag{1} \]
exists and is unique, that is, takes the same value however $z$ approaches $z_0$. We then say that a single-valued function $f(z)$ is (complex) analytic,
regular or holomorphic in $D$ if $f'(z)$ exists (and is continuous) at each point in $D$. Finally,
f (z) is entire if it is holomorphic everywhere in C, or meromorphic if its only singularities
in C are poles (see later).

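Theorem 1.1 If $w = u + iv = f(x + iy)$ is differentiable at $z_0 = x_0 + iy_0$, then $u$ and $v$ are differentiable as functions of two real variables $u = u(x, y)$, $v = v(x, y)$ and:
\[ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} , \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x} . \tag{2} \]
These are known as the Cauchy-Riemann equations.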
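
The Cauchy-Riemann equations can be checked numerically for any function known to be holomorphic. The following is a minimal sketch, assuming the sample function $f(z) = z^2 e^z$ and a central-difference step $h$; both choices, and the helper name `partials`, are illustrative and not part of the notes.

```python
import cmath

def f(z):
    # Sample holomorphic function (an assumption made for illustration).
    return cmath.exp(z) * z**2

def partials(func, x, y, h=1e-5):
    """Central-difference estimates of u_x, u_y, v_x, v_y at the point (x, y)."""
    fx = (func(complex(x + h, y)) - func(complex(x - h, y))) / (2 * h)
    fy = (func(complex(x, y + h)) - func(complex(x, y - h))) / (2 * h)
    return fx.real, fy.real, fx.imag, fy.imag

ux, uy, vx, vy = partials(f, 0.3, -0.7)
# Both residuals should vanish up to discretisation and rounding error.
cr_residuals = (ux - vy, uy + vx)
print(cr_residuals)
```

Replacing `f` by a non-holomorphic function such as `lambda z: z.conjugate()` should make the first residual clearly non-zero.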
PHY581 Methods of Theoretical Physics I, Fall 2005 2

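Corollary 1.1 If $f$ is at least twice differentiable in $D$, then the real and imaginary parts are both harmonic functions:
\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 , \qquad \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} = 0 . \tag{3} \]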
Thus, u and v cannot have maxima or minima in D and so any stationary points must
be saddle points. Hence, their biggest and smallest values are attained only on the
boundary ∂D of D.
As with functions of a real variable, we may consider power series expansions of complex functions. For now we’ll look at holomorphic functions, but later we’ll extend our discussion to include functions with singularities. As with real functions, if $f(z) = \sum_{n=0}^{\infty} a_n z^n$, then the derivative of this function is given by $f'(z) = \sum_{n=1}^{\infty} n a_n z^{n-1}$. Each power series has a radius of convergence, $R$, such that $\sum_{n=0}^{\infty} a_n z^n$ converges for $|z| < R$ and diverges for $|z| > R$.
We may use power series expansions to define complex versions of some of our
favourite real functions:

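\[
\begin{aligned}
\exp(z) &= \sum_{n=0}^{\infty} \frac{z^n}{n!} = 1 + z + \frac{z^2}{2!} + \cdots \\
\sin(z) &= z - \frac{z^3}{3!} + \frac{z^5}{5!} - \cdots \\
\cos(z) &= 1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \cdots .
\end{aligned} \tag{4}
\]
In all these cases the radius of convergence is $R = \infty$. Note that, with this definition of $\exp(z)$, the usual result $\exp(z_1 + z_2) = \exp(z_1)\exp(z_2)$ holds.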
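
As a quick numerical illustration of the series definition, the sketch below sums the first 60 terms of the exponential series and checks the addition law. The truncation length and the sample points `z1`, `z2` are arbitrary choices made here, not values from the notes.

```python
import cmath

def exp_series(z, terms=60):
    """Partial sum of sum_{n>=0} z**n / n!, built term by term."""
    total, term = 0j, 1 + 0j
    for n in range(terms):
        total += term
        term *= z / (n + 1)  # next term: z**(n+1) / (n+1)!
    return total

z1, z2 = 0.5 + 1.2j, -1.0 + 0.3j
lhs = exp_series(z1 + z2)
rhs = exp_series(z1) * exp_series(z2)
# Both differences should be at the level of rounding error.
print(abs(lhs - rhs), abs(exp_series(z1) - cmath.exp(z1)))
```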
As an introduction to more complicated functions, consider trying to define the
complex logarithm, log(z). We define w to be a logarithm of z if

z = exp(w) (5)

If we write $w = u + iv$, then
\[ z = e^{u}[\cos(v) + i\sin(v)] \tag{6} \]
and therefore, in particular, $|z| = e^{u}$. Formally, we can write
\[ w = \log(z) = \log(|z|) + i\arg(z) + i(2\pi k) , \qquad k \in \mathbb{Z} , \tag{7} \]
where we have defined −π < arg(z) ≤ π. With arg(z) restricted to this region, each
choice for the integer k specifies a branch of the logarithm function. The principal value
of log(z) is obtained by setting k = 0.
Note that given some domain $D$ (with $0 \notin D$), once we specify the value of $\log(z_0)$ for some $z_0 \in D$, $\log(z)$ is uniquely defined on $D$. (This is equivalent to specifying the branch.)
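However, it is impossible to define $\log(z)$ in this way in any domain which contains a simple closed curve which encircles the origin: $\arg(z)$ changes by $2\pi$ on traversing such a curve once, so $\log(z)$ cannot return to its starting value. (Note also that $\log(0)$ is undefined.)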
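
The branch structure can be made concrete with a short numerical sketch. It assumes Python's `cmath.log`, which returns the principal value with $-\pi < \arg(z) \le \pi$; the helper `log_branch` and the step count are illustrative choices. The second half follows $\log(z)$ continuously once around the unit circle and records by how much it fails to return to its starting value.

```python
import cmath
import math

def log_branch(z, k=0):
    """k-th branch: log|z| + i*(Arg z + 2*pi*k), with Arg z in (-pi, pi]."""
    return cmath.log(z) + 2j * math.pi * k

z = -1 + 1j
branches = [log_branch(z, k) for k in (-1, 0, 1)]
# Every branch is a logarithm of z, i.e. exp(w) reproduces z.
errors = [abs(cmath.exp(w) - z) for w in branches]

# Continue log(z) step by step along |z| = 1, starting and ending at z = 1.
steps = 1000
w = cmath.log(1)
prev = 1 + 0j
for j in range(1, steps + 1):
    cur = cmath.exp(2j * math.pi * j / steps)
    w += cmath.log(cur / prev)  # small increment, safely on the principal branch
    prev = cur
jump = w - cmath.log(1)
print(errors, jump)
```

The value of `jump` is close to $2\pi i$, the difference between adjacent branches.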
For any holomorphic function, f (z), which is many valued (i.e. has several branches),
a point z0 which behaves in the same way as the origin for log(z) is referred to as a branch
point. In such a situation f (z0 ) may or may not be defined.
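There exists a power series expansion for the logarithm, given by
\[ \log(1 + z) = \sum_{n=1}^{\infty} (-1)^{n+1} \frac{z^n}{n} = z - \frac{z^2}{2} + \frac{z^3}{3} - \cdots , \tag{8} \]
with radius of convergence $R = 1$. In addition, the series converges at all points on the circle $|z| = 1$, except for $z = -1$ (where $\log(1 + z)$ itself is singular).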
Another example of a function with branches is provided by the function $w = z^a$, with $a \in \mathbb{C}$. This can be seen by
\[ w = z^a = \exp[a\log(z)] = \exp[a(\log(r) + i\theta)] . \tag{9} \]
Here we have set $z = re^{i\theta}$. It is the choice of range for $\theta$, for example $-\pi < \theta \le \pi$, that specifies the branch. Note that the choice becomes unnecessary if $a \in \mathbb{Z}$, because in this case the choice of a different branch merely changes $z^a$ by a factor $e^{2\pi i m} = 1$, with $m$ an integer. This fits with our elementary definition of $z^n$.

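Example 1.1
\[ i^i = \exp[i\log(i)] = \exp\left[i\left(\frac{i\pi}{2} + i2\pi k\right)\right] = \exp\left[-\left(2k + \frac{1}{2}\right)\pi\right] . \tag{10} \]
This takes infinitely many distinct real values, one for each integer $k$.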
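
The values in equation (10) can be tabulated directly. This is a small sketch, assuming Python's principal-branch `cmath.log`; the function name `i_to_the_i` and the range of $k$ are illustrative.

```python
import cmath
import math

def i_to_the_i(k):
    """Value of i**i on the k-th branch: exp(i * (log(i) + 2*pi*i*k))."""
    return cmath.exp(1j * (cmath.log(1j) + 2j * math.pi * k))

values = {k: i_to_the_i(k) for k in range(-2, 3)}
principal = 1j ** 1j  # Python's power operator uses the principal branch (k = 0)
for k, v in values.items():
    print(k, v.real, math.exp(-(2 * k + 0.5) * math.pi))
```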

[Figure: the z-plane and the w-plane.]

Example 1.2 $w = z^{p/q}$, where $p/q$ is a rational number in lowest terms. In this case we obtain a function with $q$ branches, having $z = 0$ as a branch point.

Consider the special case $w = \sqrt{z}$. Write $z = re^{i\theta}$, $0 \le \theta < 2\pi$ (note a different, more convenient choice of range here). Then $w = \rho e^{i\phi}$, with $0 \le \phi < \pi$, where $\rho = +\sqrt{r}$ and $\phi = \theta/2$.
Geometrically, the whole complex plane with the positive real line removed is mapped to the upper half plane (see figure).

As we have defined $f : w^2 = z$, it defines a branch of $\sqrt{z}$ with $z = 0$ as a branch point. We must have a cut along the positive real line because we cannot possibly have continuity there, since $f(4) = 2$, but $f(4e^{i\theta}) \to -2$ as $\theta \to 2\pi$ from below.
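
The discontinuity across the cut is easy to see numerically. The sketch below implements the branch with $0 \le \theta < 2\pi$ described above; the helper name `sqrt_branch` and the offset $10^{-9}$ are assumptions made for illustration.

```python
import cmath
import math

def sqrt_branch(z):
    """Branch of sqrt(z) with theta in [0, 2*pi), so that phi = theta/2 lies in [0, pi)."""
    theta = cmath.phase(z) % (2 * math.pi)  # shift (-pi, pi] to [0, 2*pi)
    return math.sqrt(abs(z)) * cmath.exp(1j * theta / 2)

above = sqrt_branch(4 * cmath.exp(1e-9j))                      # just above the cut
below = sqrt_branch(4 * cmath.exp(1j * (2 * math.pi - 1e-9)))  # just below the cut
print(above, below)
```

The two points are almost identical in the $z$-plane, yet their images sit near $+2$ and $-2$ respectively.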

1.1 Integration and Cauchy’s Theorem

We shall consider curves $C$ in the $z$-plane, parameterized by a single real variable $t$; that is to say,
\[ C = \{\gamma(t) = \phi(t) + i\psi(t) \mid t \in [0, 1]\} , \tag{11} \]
where $\phi$ and $\psi$ are continuously differentiable functions of $t$. A particularly useful example is when $C$ is a union of line segments, or in other words, a polygon.
It is useful to be able to define the length of such a curve. In general, $\gamma(t)$ is called rectifiable if the supremum of the lengths of inscribed polygonal paths exists, and if so, then this quantity is referred to as the length of the curve. It is clear that a union of line elements is rectifiable and that the length is equal to the sum of the lengths of the included line segments.

Proposition 1.1 Subject to the conditions introduced so far:
\[ \int_C f(z)\,dz = \int_0^1 (u + iv)(\phi' + i\psi')\,dt \tag{12} \]

Theorem 1.2 (Cauchy’s Theorem) Let $D$ be a simply-connected domain in $\mathbb{C}$, and let $C$ be a piecewise differentiable simple closed curve entirely contained in the interior of $D$. Then, if $f'(z)$ exists for each point $z \in D$,
\[ \oint_C f(z)\,dz = 0 . \tag{13} \]
Note that simple-connectivity of $D$ is used to ensure the existence of a domain $D_1 \subset D$ whose boundary is $C$.
From the point of view of physicists, this is one of the most important theorems
of complex analysis. The proof is somewhat involved, and I will not go into it here.
Rather, I’ll just give a simple demonstration that the theorem is true under the stronger
assumption that $f(z)$ is holomorphic in $D$ (i.e. $f'(z)$ exists and is continuous in $D$).

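Proof 1.3
\[ \oint_C f(z)\,dz = \oint_C (u\,dx - v\,dy) + i\oint_C (v\,dx + u\,dy) \tag{14} \]
If $C$ bounds the domain $D_1$, we may apply Green’s theorem:
\[ \oint_C f(z)\,dz = -\iint_{D_1} \left(\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\right) dx\,dy + i\iint_{D_1} \left(\frac{\partial u}{\partial x} - \frac{\partial v}{\partial y}\right) dx\,dy = 0 , \tag{15} \]
by the Cauchy-Riemann equations.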
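
Cauchy's theorem can be illustrated numerically by discretising a circular contour. This is a sketch under stated assumptions: the contour is the unit circle, the integral is approximated by the trapezoid rule in the angle, and the two sample integrands are chosen here for illustration only.

```python
import cmath
import math

def contour_integral(f, center=0j, radius=1.0, n=2000):
    """Approximate the integral of f around |z - center| = radius (trapezoid rule in t)."""
    total = 0j
    for j in range(n):
        t = 2 * math.pi * j / n
        z = center + radius * cmath.exp(1j * t)
        dz = 1j * radius * cmath.exp(1j * t) * (2 * math.pi / n)
        total += f(z) * dz
    return total

holomorphic = contour_integral(lambda z: cmath.exp(z) * (z**2 + 3))  # holomorphic inside
with_pole = contour_integral(lambda z: 1 / z)                        # pole at the origin
print(holomorphic, with_pole)
```

The first integral vanishes to rounding error, while the second returns approximately $2\pi i$ because $1/z$ is not holomorphic inside the contour.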

Of equal importance to physicists is a result that allows us to calculate the values of

integrals. As we’ll see, Cauchy’s theorem is required to prove this.

Theorem 1.4 (Cauchy’s Integral Formula) Let $f'(z)$ exist inside the open disc $B(z_0, r)$. Then for each point $a$ with $|a - z_0| < r$, we have
\[ f(a) = \frac{1}{2\pi i}\oint_{\partial B} \frac{f(z)}{z - a}\,dz . \tag{16} \]
(here $\partial B = \{z \mid |z - z_0| = r\}$).

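Proof 1.5 Since $f(z)/(z - a)$ is holomorphic between $\partial B$ and a small circle $\partial B(a, \epsilon)$ centred on $a$, Cauchy’s theorem lets us shrink the contour:
\[
\begin{aligned}
\oint_{\partial B} \frac{f(z)}{z - a}\,dz &= \oint_{\partial B(a,\epsilon)} \frac{f(z)}{z - a}\,dz \\
&= \lim_{\epsilon \to 0} \left[ \oint_{\partial B(a,\epsilon)} \frac{f(z) - f(a)}{z - a}\,dz + \oint_{\partial B(a,\epsilon)} \frac{f(a)}{z - a}\,dz \right] .
\end{aligned} \tag{17}
\]
The first term on the right hand side can be made arbitrarily small because the integrand is bounded and the length of the simple closed curve $\to 0$. As for the second term, parameterize the circle by
\[ \gamma(t) = a + \epsilon e^{it} , \qquad t \in [0, 2\pi] \tag{18} \]
and the final answer then becomes
\[ \oint_{\partial B(a,\epsilon)} \frac{f(a)}{z - a}\,dz = \int_0^{2\pi} \frac{f(a)}{\epsilon e^{it}}\, i\epsilon e^{it}\,dt = 2\pi i f(a) , \tag{19} \]
as required.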
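
The integral formula can also be tested numerically. The sketch below assumes $f = \cos$, the unit circle about $z_0 = 0$ and an interior point $a$ chosen arbitrarily; the helper `cauchy_formula` is an illustrative name.

```python
import cmath
import math

def cauchy_formula(f, a, z0=0j, r=1.0, n=4000):
    """(1 / (2 pi i)) * integral of f(z)/(z - a) over |z - z0| = r, by the trapezoid rule."""
    total = 0j
    for j in range(n):
        t = 2 * math.pi * j / n
        z = z0 + r * cmath.exp(1j * t)
        dz = 1j * (z - z0) * (2 * math.pi / n)  # dz = i (z - z0) dt on the circle
        total += f(z) / (z - a) * dz
    return total / (2j * math.pi)

a = 0.3 - 0.4j
approx = cauchy_formula(cmath.cos, a)
print(approx, cmath.cos(a))
```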

This must seem like a lot of heavy-handed mathematical machinery, but I’d like to
present one more important result before we get down to some examples of how these
theorems are applied.

Theorem 1.6 (Power Series Expansions) Let $f : U \to \mathbb{C}$ ($U$ open) be complex differentiable, and let $B(z_0, \rho) \subseteq U$. Then there exists a unique power series
\[ f(z) = \sum_{n=0}^{\infty} c_n (z - z_0)^n , \tag{20} \]
with positive radius of convergence ($\ge \rho$), in some neighbourhood of $z_0$. Furthermore
\[ c_n = \frac{1}{2\pi i}\oint_{|z - z_0| = r} \frac{f(z)}{(z - z_0)^{n+1}}\,dz , \tag{21} \]
for $0 < r < \rho$.

This power series, if it exists, must be unique, since we can use Cauchy’s integral formula to show that
\[ c_n = \frac{1}{n!} f^{(n)}(z_0) . \tag{22} \]
Thus, we have formally shown the existence of a power series that we mentioned earlier.

1.2 Singularities and Laurent Series

Thus far we have been concerned with functions that are well-defined everywhere. We
now turn to the various ways in which they can misbehave.

Definition 1.6 A function $f(z)$ is singular at $z_0$ if there is no neighbourhood of $z_0$ in which it is holomorphic.

$f(z)$ has a singularity at $z = \infty$ if $g(\eta)$, defined by $g(\eta) \equiv f(z)$ and $\eta \equiv z^{-1}$, has a singularity at $\eta = 0$.

Definition 1.7 A singularity at $z_0$ is isolated if $f$ is holomorphic on $B(z_0, \epsilon) \setminus \{z_0\}$ for some $\epsilon > 0$.

There are three types of isolated singularities:

1. $z_0$ is removable if, with a suitable definition of $f(z_0)$, we obtain a function holomorphic on all of $B(z_0, \epsilon)$.

2. $z_0$ is called a pole of $f$ if, for some $m \ge 1$, the function $g(z) = (z - z_0)^m f(z)$ has a removable singularity at $z_0$. The smallest value of $m$ for which this condition holds is called the order of the pole at $z_0$.

3. Otherwise, $z_0$ is called an essential singularity.

For an example of a non-isolated singularity see $f(z) = \csc(z^{-1})$ at $z = 0$.

Definition 1.8 A function f (z) is meromorphic on the domain D if all its singularities
in D are isolated and non-essential.

Lemma 1.1 If $f(z)$ is meromorphic, then locally $f(z)$ is expressible as a quotient $g(z)/h(z)$, where $g$ and $h$ are holomorphic. Conversely, if $g$ and $h$ are holomorphic on a domain $D$, and $h \not\equiv 0$, then $g(z)/h(z)$ is meromorphic (perhaps with removable singularities).

Proof 1.7 ($\Rightarrow$) is clear from the definition: just take $h(z) = (z - z_0)^m$.

($\Leftarrow$): If $h(z_0) \ne 0$, then $g(z)/h(z)$ is holomorphic near $z = z_0$. If $h(z_0) = 0$, then, since $h$ is holomorphic, we may use a power series expansion about $z_0$ to express $h(z) = (z - z_0)^m \tilde{h}(z)$, with $\tilde{h}(z_0) \ne 0$.
Furthermore $g(z) = (z - z_0)^k \tilde{g}(z)$, with $\tilde{g}(z_0) \ne 0$, for some $k \ge 0$ [it may be that $g$ and $h$ vanish at the same point $z_0$, in which case $g/h$ has a removable singularity at $z_0$ whenever $k \ge m$]. We can now write
\[ \frac{g(z)}{h(z)} = (z - z_0)^{k - m}\, \frac{\tilde{g}(z)}{\tilde{h}(z)} . \tag{23} \]

Note how the following definition parallels that of a pole and its order:

Definition 1.9 Let $f : U \to \mathbb{C}$ be a holomorphic function defined on some open subset $U \subset \mathbb{C}$, and let $f(z_0) = 0$. We say that $z_0$ is a zero of $f$, and define its order, $k$, by the condition
\[ f(z_0) = f'(z_0) = \cdots = f^{(k-1)}(z_0) = 0 , \qquad f^{(k)}(z_0) \ne 0 . \tag{24} \]
If no such finite value $k$ exists (i.e. $f^{(i)}(z_0) = 0\ \forall\ 0 \le i < \infty$), we say that $z_0$ is a zero of infinite order. In this case, the assumption of holomorphy implies $f(z) = 0\ \forall\ z$ in some neighborhood of $z_0$.

1.2.1 Laurent Series

We’ll now generalize our power series discussion to functions with singularities. Recall the expansion we derived earlier for holomorphic functions, and compare to the following.
If $f(z)$ is holomorphic in the annulus $R_1 < |z - z_0| < R_2$ (note, $f$ is not required to be holomorphic at $z_0$) then we may express
\[ f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n \tag{25} \]
and this series is referred to as a Laurent series for $f$. This can be decomposed as
\[ f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n + \sum_{n=-\infty}^{-1} a_n (z - z_0)^n \tag{26} \]
where the first term is referred to as the subsidiary part and the second term as the principal part. Both parts are unique, and, as earlier,
\[ a_n = \frac{1}{2\pi i}\oint_C \frac{f(z)}{(z - z_0)^{n+1}}\,dz \tag{27} \]
with $C$ a simple closed curve encircling $z_0$ and lying in the domain of $f$.
The Laurent series converges absolutely, and uniformly in any closed subset of the annulus.

1.3 Calculus of Residues and Contour Integration

After all the waiting, we’ll get down to many applications of what we’ve learned so far.

Definition 1.10 (Residue) The coefficient $a_{-1}$ of $(z - z_0)^{-1}$ in the Laurent expansion is called the residue of $f(z)$ at $z = z_0$.

Note that Cauchy’s integral formula gives
\[ a_{-1} = \frac{1}{2\pi i}\oint_C f(z)\,dz \tag{28} \]
where $C$ is a small circle, of radius $\epsilon$ say, about $z_0$.

This leads us to the most important result of complex analysis (as far as we’ll be
concerned in this course)

Theorem 1.8 (Residue theorem (Cauchy)) If $f(z)$ is meromorphic in some domain $D$, which contains the subdomain $D_1$, bounded by the simple closed curve $C$ (with no singularities on $C$), then
\[ \oint_C f(z)\,dz = 2\pi i \left( \sum_{j=1}^{N} R_j \right) , \tag{29} \]
where $R_j$ is the residue at the isolated singularity at $z = z_j$, $j = 1, 2, \ldots, N$, in $D_1$.

Proof 1.9 Easy!!!!

The problem with this result is actually to calculate the residues Rj . Two hints:

1. Use the Laurent expansion about zj if this is easily calculated.

2. Assume that $z_j = 0$ is a pole of order $m$ for $f(z)$. Then $R_j$ is given by
\[ R_j = \frac{1}{(m - 1)!} \lim_{z \to 0} \frac{d^{m-1}}{dz^{m-1}} \left[ z^m f(z) \right] \tag{30} \]
Note that for a simple pole ($m = 1$) $R_j = \lim_{z \to 0} (z f(z))$.

Proof 1.10 Assume that
\[ f(z) = \sum_{n=-m}^{\infty} c_n z^n , \tag{31} \]
multiply by $z^m$, and differentiate term by term.
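
The two hints can be compared on a simple test case. The sketch below assumes the illustrative function $f(z) = e^z / z^3$, which has a pole of order $m = 3$ at the origin; formula (30) gives $R = \frac{1}{2!}\,\frac{d^2}{dz^2} e^z \big|_{z=0} = \frac{1}{2}$, and the code recovers the same number from the contour integral (28). The helper name and the contour radius are assumptions.

```python
import cmath
import math

def residue_numeric(f, z0, radius=0.5, n=2000):
    """(1 / (2 pi i)) * integral of f over a small circle about z0 (trapezoid rule)."""
    total = 0j
    for j in range(n):
        w = radius * cmath.exp(2j * math.pi * j / n)  # w = z - z0 on the circle
        total += f(z0 + w) * 1j * w * (2 * math.pi / n)
    return total / (2j * math.pi)

res = residue_numeric(lambda z: cmath.exp(z) / z**3, 0j)
print(res)  # compare with the value 1/2 from formula (30)
```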

Now, the main point of many of the recent results is that they allow us to analyti-
cally evaluate many definite integrals which would seem almost impossible without the
techniques we’ll learn here. Although these methods will apply to complex integrals,
we’ll see that they provide an excellent method for the evaluation of real integrals.
In all the examples that follow, the procedure is the same. We have a definite real
integral to evaluate, and we do this by first making the integral complex and including
the range of integration (e.g. [−a, a], where a frequently tends to infinity), inside some
suitable simple closed curve C in C. With minor adaptation, we suppose that the complex
integrand has no singular points on C. We then apply the residue theorem and arrange
that all contributions to the integral, other than that on the real axis, are vanishingly small.
Warning: Watch out for branching. Recall that this occurs, for example, when $f(x)$ contains a factor $x^{\alpha}$, where $\alpha$ is non-integral, or $\log(x)$.
We shall consider various different cases:

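1. Integrals of the form
\[ \int_0^{2\pi} R(\cos(\theta), \sin(\theta))\,d\theta \tag{32} \]
Method: write
\[ \cos(\theta) = \left( \frac{z + z^{-1}}{2} \right) , \qquad \sin(\theta) = \left( \frac{z - z^{-1}}{2i} \right) \tag{33} \]
(i.e. $z = e^{i\theta}$, $dz = iz\,d\theta$), and take $C$ to be defined by $|z| = 1$.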

2. $f(z)$ meromorphic with a finite number of poles in the region $\Im(z) > 0$ and no poles on the real axis. Assume that $|z^2 f(z)| \le M$ whenever $\Im(z) \ge 0$ and $|z| > R$, say.

Label the poles as $a_1, \ldots, a_k$. Then
\[ \int_{-\infty}^{\infty} f(x)\,dx = 2\pi i \sum_{j=1}^{k} \mathrm{Res}_f(a_j) \tag{34} \]
To see this, integrate around a contour consisting of an upper semicircle of radius $R$ together with the real interval $[-R, R]$. As $R \to \infty$, we obtain the above result.

3. The same general method applies to
\[ \int_{-\infty}^{\infty} f(x) e^{imx}\,dx , \tag{35} \]
subject to the weaker assumption that $\lim_{z \to \infty} f(z) = 0$ for $\Im(z) \ge 0$, $m \in \mathbb{R}$.


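Lemma 1.2 (Jordan’s Lemma) Let $\Gamma$ be an upper semicircle, of radius $R$, centered at 0. Let $f(z)$ have a finite number of poles or removable singularities in the upper half plane ($\Im(z) > 0$), and let $\lim_{z \to \infty} f(z) = 0$. Then
\[ \int_\Gamma e^{imz} f(z)\,dz \to 0 \quad \text{as} \quad R \to \infty , \tag{36} \]
for $m > 0$.

Proof 1.11 Choose $R$ so large that $|f(z)| < \epsilon\ \forall\ z$ on $\Gamma$. Now
\[ |e^{imz}| = |\exp[imR(\cos\theta + i\sin\theta)]| = \exp(-mR\sin\theta) . \tag{37} \]
Therefore, using $\sin\theta \ge 2\theta/\pi$ for $0 \le \theta \le \pi/2$ in the fourth line,
\[
\begin{aligned}
\left| \int_\Gamma f(z) e^{imz}\,dz \right| &= \left| \int_0^{\pi} f(Re^{i\theta}) \exp(imRe^{i\theta})\, iRe^{i\theta}\,d\theta \right| \\
&< \epsilon \int_0^{\pi} e^{-mR\sin\theta} R\,d\theta \\
&= 2R\epsilon \int_0^{\pi/2} e^{-mR\sin\theta}\,d\theta \\
&\le 2R\epsilon \int_0^{\pi/2} e^{-2mR\theta/\pi}\,d\theta \\
&= \frac{\pi\epsilon}{m} (1 - e^{-mR}) \\
&< \frac{\pi\epsilon}{m} ,
\end{aligned} \tag{38}
\]
as required.

Note that we can distinguish between real and imaginary parts, and thereby replace $e^{imz}$ by $\cos(mz)$ or $\sin(mz)$.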

4. We can adapt this general technique to allow for a finite number of singular points on the real axis. Diagrammatically, we replace our original contour by

[Figure: contour indented around the singular points on the real axis.]

In this form, the following lemma can be useful


Lemma 1.3 Suppose that $f(z)$ has a simple pole at $z = 0$, and we integrate round a circular arc of radius $r$ between the angles $\theta_1$ and $\theta_2$. Then
\[ \int_{re^{i\theta_1} \to re^{i\theta_2}} f(z)\,dz = (\text{residue at } 0) \times i(\theta_2 - \theta_1) + \epsilon_r , \tag{39} \]
where $\epsilon_r \to 0$, as $r \to 0$.

Proof 1.12 Expand f (z) as (residue at 0)/z + (regular function), and use Jor-
dan’s lemma.

5. Branching Problems: Let $f(z)$ be rational, and $|z^2 f(z)| \le M$ outside a large semicircle. Assume that $f(z)$ has at worst a simple pole at $z = 0$, and that there are no other poles on the real axis. If $0 < \alpha < 1$, we can determine
\[ \int_0^{\infty} x^{\alpha} f(x)\,dx \tag{40} \]
as follows:

Cut $\mathbb{C}$ along the non-negative real axis. Then $z = re^{i\theta}$ is uniquely defined with $0 < r$ and $0 < \theta < 2\pi$. Write the logarithm over this domain as $z \to \log(r) + i\theta$, and define $z^{\alpha} = e^{\alpha \log z}$.

With these conventions we have fixed the branches of $\log(z)$ and $z^{\alpha}$ with which we work in this problem.

Choose $\epsilon > 0$ so small and $R > 0$ so large that the contour encloses all poles except the one that may exist at $z = 0$.

[Figure: contour around the cut, with labels $\delta$, $r$, $\epsilon$.]


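\[
\begin{aligned}
\lim_{\epsilon \to 0} \int_{\alpha_{\epsilon,r}} z^{\alpha} f(z)\,dz &= \int_r^{R} x^{\alpha} f(x)\,dx \\
\lim_{\epsilon \to 0} \int_{\gamma_{\epsilon,r}} z^{\alpha} f(z)\,dz &= -e^{2\pi i \alpha} \int_r^{R} x^{\alpha} f(x)\,dx ,
\end{aligned} \tag{41}
\]
because on $\gamma_{\epsilon,r}$ we are about to cross over to another branch of the integrand.
\[
\begin{aligned}
\lim_{R \to \infty} \lim_{\epsilon \to 0} \int_{\alpha \cup \beta \cup \delta \cup \gamma} z^{\alpha} f(z)\,dz &= (1 - e^{2\pi i \alpha}) \int_0^{\infty} x^{\alpha} f(x)\,dx \\
&= 2\pi i \left( \sum_j \mathrm{Res}_j (z^{\alpha} f(z)) \right) .
\end{aligned} \tag{42}
\]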

6. With $\log(z)$ rather than $z^{\alpha}$ in the integrand, it may be more convenient to replace the given contour by

[Figure: modified contour.]

7. For some integrands it may be useful to use a large rectangle, for example if a trigonometric or hyperbolic function appears in the denominator.

[Figure: rectangular contour with corners $-\pi$, $\pi$, $\pi + in$, $-\pi + in$.]
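
As a concrete instance of case 1 in the list above, consider $\int_0^{2\pi} d\theta / (2 + \cos\theta)$; this integrand is an example chosen here, not one from the notes. With $z = e^{i\theta}$ the integrand becomes $2 / [\,i(z^2 + 4z + 1)]$, whose only pole inside $|z| = 1$ is at $z_+ = -2 + \sqrt{3}$. The sketch compares the residue result with a direct numerical quadrature.

```python
import math

def direct(n=2000):
    """Trapezoid rule for the real integral over one full period."""
    h = 2 * math.pi / n
    return sum(h / (2 + math.cos(j * h)) for j in range(n))

# Residue route: the integrand is 2 / (i (z - z_plus)(z - z_minus)) on |z| = 1.
z_plus, z_minus = -2 + math.sqrt(3), -2 - math.sqrt(3)
residue = 2 / (1j * (z_plus - z_minus))  # only z_plus lies inside |z| = 1
via_residues = (2j * math.pi * residue).real
print(direct(), via_residues, 2 * math.pi / math.sqrt(3))
```

Both routes agree with the closed form $2\pi/\sqrt{3}$.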

1.4 Worked Examples: Contour Integration

1.4.1 Example 1
\[ \int_0^{\infty} \frac{dx}{(x^2 + 1)^2 (x^2 + 4)} \]

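As with all our examples we choose a contour and extend the integral around the contour. Here consider
\[ \oint_\Gamma \frac{dz}{(z^2 + 1)^2 (z^2 + 4)} , \tag{43} \]
where $\Gamma$ is defined to be the contour consisting of that part of the real axis from $-R$ to $R$ ($R > 0$), and the semicircle $C$, radius $R$, center 0 in the upper half-plane.

[Figure: semicircular contour $\Gamma$ in the upper half-plane, with endpoints $-R$ and $R$ on the real axis.]

\[ \oint_\Gamma \frac{dz}{(z^2 + 1)^2 (z^2 + 4)} = \int_{-R}^{+R} \frac{dx}{(x^2 + 1)^2 (x^2 + 4)} + \int_C \frac{dz}{(z^2 + 1)^2 (z^2 + 4)} \]

Now let $R \to \infty$. The second term in the above goes to zero in this limit, since the integrand is $O(R^{-6})$ on $C$, whose length is $\pi R$. We now evaluate the left hand side using the calculus of residues.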
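The integrand has a pole of order 2 at $z = i$, and a pole of order 1 at $z = 2i$, which lie within the contour $\Gamma$. Now
\[
\begin{aligned}
\mathrm{Res}(z = i) &= \lim_{z \to i} \frac{d}{dz} \left[ \frac{1}{(z + i)^2 (z^2 + 4)} \right] \\
&= \lim_{z \to i} \left[ \frac{-2}{(z + i)^3 (z^2 + 4)} - \frac{2z}{(z + i)^2 (z^2 + 4)^2} \right] \\
&= \frac{-2}{3(2i)^3} - \frac{2i}{9(2i)^2} \\
&= -\frac{i}{36} .
\end{aligned} \tag{44}
\]
\[ \mathrm{Res}(z = 2i) = \lim_{z \to 2i} \left[ \frac{1}{(z^2 + 1)^2 (z + 2i)} \right] = -\frac{i}{36} . \tag{45} \]
Therefore
\[ \oint_\Gamma \frac{dz}{(z^2 + 1)^2 (z^2 + 4)} = 2\pi i \left( -\frac{i}{36} - \frac{i}{36} \right) = \frac{\pi}{9} . \tag{46} \]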

So we obtain
\[ \int_{-\infty}^{\infty} \frac{dx}{(x^2 + 1)^2 (x^2 + 4)} = \frac{\pi}{9} . \tag{47} \]
Finally, since the integrand is an even function of $x$, this implies
\[ \int_0^{\infty} \frac{dx}{(x^2 + 1)^2 (x^2 + 4)} = \frac{\pi}{18} . \tag{48} \]
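
The result (48) can be cross-checked by direct quadrature. The sketch below assumes the substitution $x = \tan\theta$, which maps $[0, \infty)$ onto $[0, \pi/2)$, followed by the midpoint rule; the substitution, the number of panels and the function names are choices made here for illustration.

```python
import math

def integrand(x):
    return 1.0 / ((x * x + 1) ** 2 * (x * x + 4))

def half_line_integral(n=20000):
    """Midpoint rule after x = tan(theta), which maps [0, inf) to [0, pi/2)."""
    h = (math.pi / 2) / n
    total = 0.0
    for j in range(n):
        theta = (j + 0.5) * h
        x = math.tan(theta)
        total += integrand(x) * (1 + x * x) * h  # dx = sec^2(theta) d(theta)
    return total

numeric = half_line_integral()
exact = math.pi / 18
print(numeric, exact)
```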

1.4.2 Example 2

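\[ \int_{-\infty}^{\infty} \frac{\cos(x)\,dx}{x^2 + a^2} , \qquad a > 0 \]
This requires a little more cunning. Consider
\[ \oint_\Gamma \frac{e^{iz}\,dz}{z^2 + a^2} \]
with $\Gamma$ the same contour as in example 1. We then obtain:
\[ \oint_\Gamma \frac{e^{iz}\,dz}{z^2 + a^2} = \int_{-R}^{R} \frac{\cos(x)\,dx}{x^2 + a^2} + \int_{-R}^{R} \frac{i\sin(x)\,dx}{x^2 + a^2} + \int_C \frac{e^{iz}\,dz}{z^2 + a^2} \]
By Jordan’s lemma, the final term goes to zero as $R \to \infty$, thus
\[ \oint_\Gamma \frac{e^{iz}\,dz}{z^2 + a^2} = \int_{-\infty}^{\infty} \frac{\cos(x)\,dx}{x^2 + a^2} + i \int_{-\infty}^{\infty} \frac{\sin(x)\,dx}{x^2 + a^2} \]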
The integrand on the left hand side has simple poles at $z = \pm ia$. Since $a > 0$, only the pole at $z = +ia$ is enclosed by the contour chosen above. So
\[ \mathrm{Res}(z = ia) = \lim_{z \to ia} \left[ \frac{e^{iz}}{z + ia} \right] = \frac{e^{-a}}{2ia} . \tag{49} \]
Therefore
\[ \int_{-\infty}^{\infty} \frac{\cos(x)\,dx}{x^2 + a^2} + i \int_{-\infty}^{\infty} \frac{\sin(x)\,dx}{x^2 + a^2} = 2\pi i \left( \frac{e^{-a}}{2ia} \right) = \frac{\pi}{a} e^{-a} . \tag{50} \]
Finally, taking the real parts of both sides we obtain
\[ \int_{-\infty}^{\infty} \frac{\cos(x)\,dx}{x^2 + a^2} = \frac{\pi}{a} e^{-a} . \tag{51} \]
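
1.4.3 Example 3
\[ \int_{-\infty}^{\infty} \frac{x - \sin(x)}{x^3}\,dx . \tag{52} \]
We’ll begin by integrating by parts twice, to put the integral into a form that is simpler to handle by complex methods.
\[
\begin{aligned}
\int_{-\infty}^{\infty} \frac{x - \sin(x)}{x^3}\,dx &= \left[ -\frac{(x - \sin(x))}{2x^2} \right]_{-\infty}^{\infty} + \frac{1}{2} \int_{-\infty}^{\infty} \frac{1 - \cos(x)}{x^2}\,dx \\
&= \left[ -\frac{(1 - \cos(x))}{2x} \right]_{-\infty}^{\infty} + \frac{1}{2} \int_{-\infty}^{\infty} \frac{\sin(x)}{x}\,dx \\
&= \frac{1}{2} \int_{-\infty}^{\infty} \frac{\sin(x)}{x}\,dx ,
\end{aligned} \tag{53}
\]
since both terms in the square brackets vanish.
Now to use our complex variable machinery. Consider
\[ \oint_\Gamma \frac{e^{iz}}{z}\,dz , \tag{54} \]
where $\Gamma$ is the contour shown.

[Figure: contour $\Gamma$ with the points $-R$, $-r$, $r$, $R$ marked on the real axis, a small semicircle $C'$ of radius $r$ about the origin and a large semicircle $C$ of radius $R$.]
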
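Now, this integral is zero, by Cauchy’s theorem. We may write it as
\[ \oint_\Gamma \frac{e^{iz}}{z}\,dz = \int_C \frac{e^{iz}}{z}\,dz + \int_{-R}^{-r} \frac{e^{ix}}{x}\,dx + \int_{C'} \frac{e^{iz}}{z}\,dz + \int_r^{R} \frac{e^{ix}}{x}\,dx . \tag{55} \]
Now, as $R \to \infty$ the integral around the large semicircle, $C$, becomes zero. Thus,
\[ \int_{-\infty}^{-r} \frac{e^{ix}}{x}\,dx + \int_r^{\infty} \frac{e^{ix}}{x}\,dx = -\int_{C'} \frac{e^{iz}}{z}\,dz . \tag{56} \]
Now, since $(e^{iz} - 1)/z$ has a removable singularity at the origin, its integral over the small semicircle $C'$ vanishes as $r \to 0$, and in that limit we have
\[
\begin{aligned}
\int_{-\infty}^{-r} \frac{e^{ix}}{x}\,dx + \int_r^{\infty} \frac{e^{ix}}{x}\,dx &= -\int_{C'} \frac{1}{z}\,dz \\
&= -\int_{\pi}^{0} \frac{1}{re^{i\theta}} (ire^{i\theta})\,d\theta \\
&= \pi i .
\end{aligned} \tag{57}
\]
Letting $r \to 0$, we have
\[ \int_{-\infty}^{\infty} \frac{e^{ix}}{x}\,dx = \pi i . \tag{58} \]
Taking imaginary parts we finally obtain
\[ \int_{-\infty}^{\infty} \frac{\sin(x)}{x}\,dx = \pi , \tag{59} \]
so that by our initial integration by parts:
\[ \int_{-\infty}^{\infty} \frac{x - \sin(x)}{x^3}\,dx = \frac{\pi}{2} . \tag{60} \]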

2 Exact and Approximate Evaluation of Sums and

Definition 2.1 An asymptotic sequence is a set of functions $\{\phi_n(z)\}$ such that
\[ \phi_{n+1}(z) = o(\phi_n(z)) , \quad \text{as } z \to z_0 . \tag{61} \]
(Usually we take $\phi_n = z^{-n}$, and $z_0 = \infty$.)

Definition 2.2 If $\{\phi_n(z)\}$ is an asymptotic sequence, then the asymptotic expansion for a function $f(z)$ is
\[ f(z) \sim \sum_{r=0}^{\infty} a_r \phi_r(z) , \tag{62} \]
provided that, for each $n$,
\[ f(z) - \sum_{r=0}^{n-1} a_r \phi_r(z) = O(\phi_n) \ \text{or} \ o(\phi_{n-1}) , \tag{63} \]
as $z \to z_0$ (i.e. the remainder after $n$ terms is smaller than the last included term, or the same order as the first neglected term).
Some important properties of asymptotic expansions are as follows (here consider $f \sim \sum_{r=0}^{\infty} a_r z^{-r}$):

1. Asymptotic expansions depend on the sector (i.e. $\arg(z)$). For example,
\[ e^{-z} + \frac{1}{z} \sim \frac{1}{z} , \qquad |\arg(z)| < \frac{\pi}{2} \tag{64} \]
(no more terms, since $e^{-z}$ is smaller than any power of $z$). But,
\[ e^{-z} + \frac{1}{z} \not\sim \frac{1}{z} , \qquad \frac{\pi}{2} < |\arg(z)| < \frac{3\pi}{2} . \tag{65} \]
If the asymptotic expansion of $f(z)$ is different in different sectors, we say it exhibits Stokes’ phenomenon.

Theorem 2.1 If $f(z)$ is single-valued and holomorphic for $|z| \ge a$, and
\[ f(z) \sim \sum_{r=0}^{\infty} a_r z^{-r} , \tag{66} \]
is valid for all $\arg(z)$ (i.e., doesn’t exhibit Stokes’ phenomenon), then the series is in fact convergent; i.e.
\[ f(z) = \sum_{r=0}^{\infty} a_r z^{-r} . \tag{67} \]

Proof 2.2 $f$ is single-valued, holomorphic for $|z| \ge a$, therefore $f(z) = \sum_{n=-\infty}^{\infty} c_n z^n$, with
\[ c_n = \frac{1}{2\pi i} \oint_C \frac{f(z)}{z^{n+1}}\,dz . \tag{68} \]
Choose $C$ to be a large circle, radius $R$. Then
\[ |c_n| \le \frac{1}{2\pi} \int_0^{2\pi} \left| f(Re^{i\theta}) \right| \frac{1}{R^n}\,d\theta . \tag{69} \]
Now, since $f(z) \sim \sum_{r=0}^{\infty} a_r z^{-r}$, $f \to a_0$ as $|z| \to \infty$. Therefore, we can find $M$ such that $|f| < M$ for large enough $|z|$. This implies that
\[ |c_n| < \frac{M}{R^n} , \qquad (n > 0) . \tag{70} \]
But $R$ can be as large as we like, so $c_n = 0$ for $n > 0$. Also, $a_n = c_{-n}$, since asymptotic expansions are unique (see next property).

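2. For a given range of $\arg(z)$, the asymptotic expansion of $f(z)$ is unique.

To see this, let $f(z) \sim \sum_{n=0}^{\infty} a_n z^{-n}$ as $z \to \infty$ in a given sector. Then $f \sim a_0$ as $z \to \infty$, and
\[ (f - a_0) z \to a_1 , \tag{71} \]
as $z \to \infty$. Similarly
\[ \left( f - \sum_{r=0}^{n-1} a_r z^{-r} \right) z^n \to a_n , \tag{72} \]
as $z \to \infty$. Thus, the coefficients $\{a_n\}$ are uniquely defined. (Note that the converse does not hold: different functions, such as $1/z$ and $e^{-z} + 1/z$ for $|\arg(z)| < \pi/2$, can share the same asymptotic expansion.)

3. Asymptotic expansions can be added and multiplied as if they were convergent series.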

Let’s now see how we might calculate asymptotic expansions for several different
classes of functions.

2.1 Watson’s Lemma and Laplace’s Method

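Lemma 2.1 (Watson) Let
\[ F(z) = \int_0^{\infty} e^{-zt} \phi(t)\,dt , \qquad \Re(z) \ge \delta > 0 , \tag{73} \]
with $\phi = \sum_{n=0}^{\infty} b_n t^n$ for $|t| < R$. Then
\[ F(z) \sim \sum_{n=0}^{\infty} \frac{b_n\, n!}{z^{n+1}} . \tag{74} \]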

It is important to note here that the right hand side is merely the left hand side
expanded and integrated term by term. However, it is the fact that the result is an
asymptotic expansion that is nontrivial. This is because the summation need not con-
verge uniformly in t for all t in the range of integration. Thus, it is not clear that we
can interchange the order of integration and summation.
Under more restrictive circumstances we could just integrate by parts to show this.
However, Watson’s lemma works in more general situations, and a more subtle proof is
required. I won’t give the proof here, although if we have time I may come back and
supply it later.
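
A numerical sketch may help to show what the lemma asserts. It assumes the sample function $\phi(t) = 1/(1 + t)$, whose Taylor series $\sum (-1)^n t^n$ converges only for $|t| < 1$ although the integration runs over all $t \ge 0$; the truncation of the integral at $t = 6$, the use of Simpson's rule and the value $z = 10$ are choices made here. For this particular $\phi$ the error after $N$ terms is bounded by the first neglected term, $N!/z^{N+1}$.

```python
import math

def laplace_transform(z, upper=6.0, n=20000):
    """Simpson's rule for F(z) = integral_0^upper exp(-z t) / (1 + t) dt (n even)."""
    h = upper / n
    g = lambda t: math.exp(-z * t) / (1 + t)
    total = g(0.0) + g(upper)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * g(j * h)
    return total * h / 3

def watson_partial_sum(z, terms):
    """First `terms` terms of sum b_n n! / z**(n+1), with b_n = (-1)**n."""
    return sum((-1) ** n * math.factorial(n) / z ** (n + 1) for n in range(terms))

z = 10.0
exact = laplace_transform(z)
errors = [abs(exact - watson_partial_sum(z, N)) for N in range(1, 8)]
for N, err in enumerate(errors, start=1):
    print(N, err, math.factorial(N) / z ** (N + 1))
```

Taking many more terms at fixed $z$ eventually makes the error grow again, which is the characteristic behaviour of a divergent asymptotic series.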

Laplace’s method is a way to calculate asymptotic expansions for functions of the form
\[ F(x) = \int_a^b e^{xh(u)} g(u)\,du , \tag{75} \]
as $x \to +\infty$ ($x$ real).
The rough argument is that the largest contribution comes from the biggest value
of h(u), say h(u0 ), which is exponentially larger than any other contribution. We’ll see
how this works in 2 distinct situations. In both these, Watson’s lemma is crucial to
obtaining the final result.

1. $h'(u_0) = 0$ (a calculus-type maximum)

Begin by taking Taylor series of $h$ and $g$ about $u_0$:
\[
\begin{aligned}
F(x) &= \int_a^b \exp\left\{ x \left[ h(u_0) + \frac{1}{2} (u - u_0)^2 h''(u_0) + \cdots \right] \right\} \left[ g(u_0) + (u - u_0) g'(u_0) + \cdots \right] du \\
&\sim e^{xh(u_0)} \int_{-\infty}^{\infty} \exp\left[ \frac{1}{2} x \tau^2 h''(u_0) \right] \left[ g(u_0) + \cdots \right] d\tau ,
\end{aligned} \tag{76}
\]
where $\tau = u - u_0$ and we can extend the range of integration to $(-\infty, \infty)$ since any extra contributions are negligible (the dominant contribution comes from $\tau = 0$).

Now integrate term by term using Watson’s lemma, to obtain
\[ F(x) \sim e^{xh(u_0)} \left[ g(u_0) \sqrt{\frac{2\pi}{-x h''(u_0)}} + O(x^{-3/2}) \right] . \tag{77} \]

2. $h'(u_0) \ne 0$

In this case we have $u_0 = b$ (or $a$). Now a Taylor expansion about $u_0$ yields (written here for $u_0 = b$, with $\tau = u - u_0$)
\[
\begin{aligned}
F(x) &\sim e^{xh(u_0)} g(u_0) \int_{-\infty}^{0} e^{x\tau h'(u_0)}\,d\tau \\
&\sim e^{xh(u_0)} \left[ \frac{g(u_0)}{x h'(u_0)} + O(x^{-2}) \right] .
\end{aligned} \tag{78}
\]

Let’s see immediately how this works by applying what we’ve just learned to an
example that is well-known (the result at least) to some of you.
Consider expanding Γ(x + 1) as x → ∞, for x real. If you know what the Γ-function
is, you’ll know that the answer we hope to get is known as Stirling’s formula, and is
very useful in all types of situations in physics. If you haven’t heard of the Γ-function,
then this will still be a good example of how to use Laplace’s method.
The Γ-function has an integral expression given by
\[ \Gamma(x + 1) = \int_0^{\infty} e^{-t} t^x\,dt . \tag{79} \]
Although this appears to already be in the correct form to apply Laplace’s method to, we must transform it because the largest value of the exponential occurs at $t = 0$, where $t^x$ vanishes. Therefore, we’ll write
\[
\begin{aligned}
\Gamma(x + 1) &= \int_0^{\infty} e^{-t + x\log(t)}\,dt \\
&= x^{x+1} \int_0^{\infty} e^{x(-u + \log(u))}\,du ,
\end{aligned} \tag{80}
\]
where we have made the change of variables $t = xu$, not because it is essential, but because it makes things neater: the position of the maximum stays at a fixed point and doesn’t go to infinity as we take the asymptotic limit.

Now, $h(u) = -u + \log(u)$ has a maximum at $u = 1$. In this example we will only be interested in getting the leading term of the expansion. Therefore, we Taylor expand everything about $u = 1$ as far as the first non-constant term. This gives
\[
\begin{aligned}
\Gamma(x + 1) &= x^{x+1} \int_0^{\infty} \exp\left\{ x \left[ h(1) + \frac{1}{2} (u - 1)^2 h''(1) + \cdots \right] \right\} du \\
&\sim x^{x+1} e^{-x} \int_{-\infty}^{\infty} e^{-xs^2/2}\,ds ,
\end{aligned} \tag{81}
\]
with $s = u - 1$. The $\sim$ here comes from Watson’s lemma. We could have expanded to higher order but have chosen not to. We can extend the limit of integration since any contribution from the range $(-\infty, -1)$ is subdominant. Thus, to leading order we have
\[ \Gamma(x + 1) \sim \left( \frac{2\pi}{x} \right)^{1/2} x^{x+1} e^{-x} , \tag{82} \]
which you may recognize as the leading term in Stirling’s formula.
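
The quality of the leading term (82) can be gauged against Python's `math.gamma`. In this short sketch the sample values of $x$ and the helper name are arbitrary; the leading term is evaluated through logarithms to avoid overflow in $x^{x+1}$.

```python
import math

def stirling_leading(x):
    """Leading term (2*pi/x)**0.5 * x**(x+1) * exp(-x), evaluated via logarithms."""
    log_value = 0.5 * math.log(2 * math.pi / x) + (x + 1) * math.log(x) - x
    return math.exp(log_value)

ratios = {x: math.gamma(x + 1) / stirling_leading(x) for x in (5.0, 20.0, 50.0, 100.0)}
for x, ratio in ratios.items():
    print(x, ratio)
```

The ratios approach 1 from above as $x$ grows, with a relative error that shrinks roughly like $1/x$.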

2.2 Riemann-Lebesgue Lemma and Method of Stationary Phase

Lemma 2.2 (Riemann-Lebesgue) Let $q(t)$ be piecewise continuous on the compact interval $[a, b]$. Then, for real $x$
\[ I(x) \equiv \int_a^b e^{ixt} q(t)\,dt = o(1) , \quad \text{as } x \to \infty . \tag{83} \]

Proof 2.3 Assume w.l.o.g. that $q(t)$ is continuous on $[a, b]$, so that for any given $\epsilon > 0$ the interval $[a, b]$ can be divided into $n$ subintervals in each of which $q(t)$ varies by less than $\epsilon$. Then $\exists\,\{t_k\}$ such that $a = t_0 < t_1 < \ldots < t_n = b$, with $|q(t) - q(t_k)| < \epsilon$ for $t \in [t_{k-1}, t_k]$. Also, $q(t)$ is bounded in $[a, b]$, so $\exists\,Q$ such that $|q(t)| < Q\ \forall\ t \in [a, b]$. Then
\[ I(x) = \sum_{i=1}^{n} q(t_i) \int_{t_{i-1}}^{t_i} e^{ixt}\,dt + \sum_{i=1}^{n} \int_{t_{i-1}}^{t_i} [q(t) - q(t_i)] e^{ixt}\,dt . \tag{84} \]
Now,
\[ \left| \int_{t_{i-1}}^{t_i} e^{ixt}\,dt \right| = \left| \frac{e^{ixt_i} - e^{ixt_{i-1}}}{ix} \right| \le \frac{2}{x} , \tag{85} \]
and
\[ \left| \int_{t_{i-1}}^{t_i} [q(t) - q(t_i)] e^{ixt}\,dt \right| \le \epsilon (t_i - t_{i-1}) . \tag{86} \]
Putting these together, we obtain
\[ |I(x)| \le \frac{2Qn}{x} + \epsilon (b - a) , \tag{87} \]
which can be made as small as you like by first choosing $\epsilon$ small enough and then choosing $x$ large enough.
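
The decay asserted by the lemma can be observed numerically. The sketch assumes the sample choice $q(t) = t$ on $[0, 1]$, for which integration by parts gives the bound $|I(x)| \le 1/x + 2/x^2$; the quadrature rule and panel count are choices made here.

```python
import cmath

def oscillatory_integral(x, n=20000):
    """Simpson's rule for I(x) = integral_0^1 exp(i x t) * t dt (n even)."""
    h = 1.0 / n
    g = lambda t: cmath.exp(1j * x * t) * t
    total = g(0.0) + g(1.0)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * g(j * h)
    return total * h / 3

magnitudes = {x: abs(oscillatory_integral(x)) for x in (10.0, 100.0, 1000.0)}
for x, mag in magnitudes.items():
    print(x, mag, 1 / x + 2 / x**2)
```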

The method of stationary phase is a way to calculate asymptotic expansions for functions of the form
\[ I(x) = \int_a^b e^{ixh(u)} g(u)\,du \tag{88} \]
(with $h$ twice differentiable and $g$ once differentiable) as $x \to +\infty$ ($x$ real).

The rough argument is that the largest contribution comes from the place where the integrand oscillates least, since where rapid oscillations occur, one expects cancellations to occur. More formally, making the substitution $h(u) = t$, and using the Riemann-Lebesgue lemma, we see that the above expression is $o(1)$ unless there’s a place where $h' = 0$.

3 Solution of Ordinary Differential Equations

Let $I$ be an interval of the real line. $C^n(I)$ is the set of functions $f(x)$ defined on $I$ such that
\[ \frac{d^n f}{dx^n} \equiv D^n f \equiv f^{(n)} \tag{89} \]
exists and is continuous. If $f, g \in C^n(I)$, then so are $f + g$ and $\alpha f$. Thus, $C^n(I)$ is a vector space.

Definition 3.1 Suppose $a_i(x)$, $0 \le i \le n$, are defined and bounded on $I$ ($a_n \not\equiv 0$). Then $L : C^n(I) \to C^0(I)$, $f \to Lf$, with
\[ Lf(x) = \sum_{i=0}^{n} a_i(x) (D^i f)(x) \tag{90} \]
is a linear differential operator (LDO) of order $n$. If $a_n(x) \ne 0$ on $I$, then $L$ is normal.

Definition 3.2 Let $L$ be a LDO of order $n$ on $I$ and let $f(x)$ be $n$-times differentiable on $I$. An equation
\[ Ly = f \tag{91} \]
is a linear differential equation (LDE) of order $n$ on $I$. If $f \equiv 0$ on $I$, the LDE is homogeneous.

We refer to solutions of the homogeneous equation as complementary functions (CFs), and the set of CFs as the Kernel of the operator $L$. Specific solutions of the non-homogeneous equation we then refer to as particular integrals (PIs).

3.1 Normal LDEs of Order 1

The general form is
\[ a_1(x) y'(x) + a_0(x) y(x) = f(x) \tag{92} \]
on $I$. Since $a_1(x) \ne 0$ on $I$, we divide by $a_1(x)$ and rewrite as
\[ y'(x) + p(x) y(x) = r(x) . \tag{93} \]
We first consider the homogeneous equation
\[ y'(x) + p(x) y(x) = 0 . \tag{94} \]
Define the integrating factor as $e^P$, where
\[ P(x) = \int^x p(u)\,du , \tag{95} \]
which exists, since we assume that $p(x)$ is bounded. Then, $(ye^P)' = (y' + py)e^P$. Thus, if $y(x)$ satisfies the homogeneous equation, then $(ye^P)' = 0$, which implies
\[ y(x) = Ce^{-P(x)} . \tag{96} \]
Now consider the non-homogeneous equation. Clearly $(ye^P)' = re^P$. Therefore the general solution is
\[ y(x) = Ce^{-P(x)} + e^{-P(x)} \int^x r(u) e^{P(u)}\,du . \tag{97} \]

Example 3.1 Show that
\[ (x^2 + 1) y'(x) - (1 - x)^2 y(x) = xe^{-x} \tag{98} \]
has solution
\[ y(x) = \frac{Ce^{x} - \frac{1}{2}\left( x + \frac{1}{2} \right) e^{-x}}{x^2 + 1} . \tag{99} \]
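
The solution quoted in Example 3.1 can be checked by substituting it back into equation (98). The sketch below does this numerically, assuming a central-difference estimate of $y'$; the sample points, the values of $C$ and the step size are arbitrary choices made here.

```python
import math

def y(x, C=1.0):
    """Candidate solution [C e^x - (x/2 + 1/4) e^{-x}] / (x^2 + 1)."""
    return (C * math.exp(x) - 0.5 * (x + 0.5) * math.exp(-x)) / (x * x + 1)

def residual(x, C=1.0, h=1e-5):
    """(x^2 + 1) y' - (1 - x)^2 y - x e^{-x}, with y' from a central difference."""
    dy = (y(x + h, C) - y(x - h, C)) / (2 * h)
    return (x * x + 1) * dy - (1 - x) ** 2 * y(x, C) - x * math.exp(-x)

residuals = [residual(x, C) for x in (-1.0, 0.0, 0.7, 2.0) for C in (-2.0, 0.0, 3.0)]
print(max(abs(r) for r in residuals))
```

The residual is small for every value of $C$, as it should be for a general solution.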

3.2 Normal Second Order LDEs with Constant Coefficients

The general form is
\[ L[y] = y'' + p(x) y' + q(x) y = r(x) . \tag{100} \]
Once again, there are two sub-problems to solving this equation: determining the Kernel and the particular integrals. In general both are difficult.

Example 3.2
\[ L[y] \equiv y'' + 3y' + 2y = x^2 e^x . \tag{101} \]
Consider first the CFs. Try $y = e^{cx}$.
\[ \Rightarrow\ (c^2 + 3c + 2) e^{cx} = 0 \quad \Rightarrow\ c = -1 \ \text{or} \ -2 . \tag{102} \]
Therefore, independent solutions are $y_1 = e^{-x}$ and $y_2 = e^{-2x}$, and the CF is a linear combination of these.
The fastest way to find a PI is to guess one! Guess $y = (ax^2 + bx + d)e^x$. Then routine algebra gives $a = 1/6$, $b = -5/18$, $d = 19/108$. The general solution is therefore
\[ y(x) = c_1 e^{-x} + c_2 e^{-2x} + \frac{1}{108} (18x^2 - 30x + 19) e^x . \tag{103} \]
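
The "routine algebra" of Example 3.2 amounts to a small triangular linear system, which can be solved in exact arithmetic. The sketch below uses Python's `fractions` module; the layout of the system in the comments is a reconstruction made here, obtained by substituting the guess into equation (101).

```python
from fractions import Fraction

# Substituting y = q(x) e^x with q = a x^2 + b x + d into y'' + 3y' + 2y = x^2 e^x
# and cancelling e^x gives q'' + 5 q' + 6 q = x^2, i.e.
#   x^2 : 6a           = 1
#   x^1 : 10a + 6b     = 0
#   x^0 : 2a + 5b + 6d = 0
a = Fraction(1, 6)
b = -10 * a / 6
d = -(2 * a + 5 * b) / 6
print(a, b, d)
```

Multiplying the three coefficients by 108 reproduces the polynomial $18x^2 - 30x + 19$ in equation (103).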

Example 3.3
L[y] \equiv y'' + 2y' + y = \cos(x) . (104)

Consider first the CFs. Trying $y = e^{cx}$ yields c = −1 (repeated), so the roots do not
supply two independent solutions. To find the other linearly independent CF, try $y = u(x) e^{-x}$.
This then gives
L[y] = u'' e^{-x} = 0 , (105)

so u = ax + b. Therefore, two linearly independent solutions are $y_1 = e^{-x}$ and $y_2 = x e^{-x}$.
Now, guess a PI: y = c cos(x) + d sin(x). This gives c = 0 and d = 1/2. Therefore,
the general solution is
y(x) = (c_1 x + c_2) e^{-x} + \frac{1}{2} \sin(x) . (106)

Why did this trick work in the latter example? This is a particular case of reduction of
order: if one solution of an nth-order LDE is known, the equation can be converted to
one of order n − 1. Let’s verify this explicitly when n = 2.

L[y] = y'' + p(x) y' + q(x) y = 0 . (107)

Suppose y = v(x) is a solution. Then try y(x) = u(x)v(x). Then

L[uv] = u'' v + 2u' v' + u L[v] + p u' v . (108)

But L[v] = 0, so by writing w = u' and dividing by v we obtain the first order equation

w' + \left( \frac{2v'}{v} + p \right) w = 0 (109)

for w.

3.3 Green’s Functions

This primarily concerns finding PIs for second order equations, although the concept
can be generalized to higher order systems. The technique assumes that one can find
the general CF. There are two standard cases in which to do this. First, consider

y'' + A y' + B y = 0 , (110)

where A and B are constants. Let $n_1$ and $n_2$ be the roots of $n^2 + An + B = 0$. Then

y(x) = \begin{cases} C_1 e^{n_1 x} + C_2 e^{n_2 x} , & n_1 \neq n_2 \\ (C_1 + C_2 x) e^{n_1 x} , & n_1 = n_2 \end{cases} (111)

as we have seen.
Second, consider

y'' + \frac{A}{x} y' + \frac{B}{x^2} y = 0 . (112)

Now $n_1$ and $n_2$ are roots of n(n − 1) + An + B = 0. Then

y(x) = \begin{cases} C_1 x^{n_1} + C_2 x^{n_2} , & n_1 \neq n_2 \\ (C_1 + C_2 \ln x) x^{n_1} , & n_1 = n_2 \end{cases} (113)

where the second solution here can be found by reduction of order.


3.3.1 Initial value Problems

In this section, wlog, we will use I = [0, ∞), and for appropriateness of notation, will
use t (time) instead of x as our variable.
The general problem is

M[y(t)] = f(t) , y(0) = y0 . (114)

If this solution exists it can be shown to be unique. By assumption we can solve the
homogeneous problem (with the same initial condition). We therefore consider the
standard problem
M[y(t)] = f(t) , y(0) = 0 , (115)

since a solution of the homogeneous problem with the given initial condition, added
to a solution of this standard problem, solves the original problem.
Let us write the standard problem as

L[y] ≡ ÿ(t) + p(t)ẏ(t) + q(t)y(t) = f (t) , (116)

with y(0) = ẏ(0) = 0.

A heuristic approach is as follows. Suppose we can solve (116) when f (t) = δ(t − s),
with s fixed. Let the associated solution be G(t, s), i.e.

L[G] = δ(t − s) , (117)

with $G(0, s) = G_t(0, s) = 0$. Now consider

y(t) = \int_0^{\infty} G(t, s) f(s)\, ds . (118)

Clearly y(0) = 0, and $y_t(0) = 0$. Also,

L[y] = \int_0^{\infty} L[G(t, s)] f(s)\, ds
     = \int_0^{\infty} \delta(t - s) f(s)\, ds
     = f(t) . (119)

Thus, (118) is the solution of (116).


Now, what does (117) mean? Clearly, if $t \neq s$ we can assume that G is a smooth
function of t. In addition, assume that G and $G_t$ are bounded as t → s. Now integrate
(117) from $t = s - \epsilon$ to $s + \epsilon$, $\epsilon > 0$:

\int_{s-\epsilon}^{s+\epsilon} (G_{tt} + p G_t + q G)\, dt = \int_{s-\epsilon}^{s+\epsilon} \delta(t - s)\, dt = 1 . (120)

Thus

[G_t]_{s-\epsilon}^{s+\epsilon} + \int_{s-\epsilon}^{s+\epsilon} (p G_t + q G)\, dt = 1 . (121)

By assumption the integrand is bounded, and so the integral is $O(\epsilon)$ as $\epsilon \to 0$, so in this
limit we get

[G_t]_{s^-}^{s^+} = 1 . (122)

Thus, $G_t$ is not continuous, but has a jump of 1 at t = s. Let’s be a little more formal
about all this.

Definition 3.3 The Green’s function for the initial value problem posed earlier is a
function G(t, s) satisfying

1. for t ≥ 0, s ≥ 0, $t \neq s$, G is smooth, and L[G] = 0, for fixed s.

2. $G(0, s) = G_t(0, s) = 0$, for s > 0.

3. G is $C^0$ at t = s, but $[G_t]_{s^-}^{s^+} = 1$.

Definition 3.4 If $y_1(x)$ and $y_2(x)$ are linearly independent solutions to a second order
LDE, then the Wronskian is

W[y_1, y_2] \equiv y_1(x) y_2'(x) - y_2(x) y_1'(x) , (123)

and can be shown to be nonzero on I.

Lemma 3.1 G exists and is unique.

Proof 3.1 By explicit construction. Let $y_1$, $y_2$ be two independent solutions of L[y] = 0,
so that the Wronskian is nonzero. Let

G(t, s) = \begin{cases} 0 , & 0 \le t < s < \infty \\ c_1 y_1(t) + c_2 y_2(t) , & 0 < s \le t < \infty \end{cases} (124)

Clearly the first two conditions are satisfied. We need to impose continuity at t = s and
a jump of 1 in $G_t$. These conditions read

c_1 y_1(s) + c_2 y_2(s) = 0
c_1 \dot{y}_1(s) + c_2 \dot{y}_2(s) = 1 . (125)

Since the Wronskian $W(s) \equiv W[y_1, y_2](s)$ is nonzero, there exists a unique solution

c_1 = -\frac{y_2(s)}{W(s)}
c_2 = \frac{y_1(s)}{W(s)} . (126)

So, given these definitions, the solution of the initial value problem is

y(t) = \int_0^{\infty} G(t, s) f(s)\, ds = \int_0^{t} G(t, s) f(s)\, ds , (127)

where the second form uses G(t, s) = 0 for t < s.

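Example 3.4
\ddot{y} + \omega^2 y = e^{-t} , (128)
with t > 0, $y(0) = \dot{y}(0) = 0$. The Green’s function is $\sin[\omega(t - s)]/\omega$ for t > s (show this).
Therefore, the solution is

y(t) = \int_0^{t} G(t, s) e^{-s}\, ds
     = \frac{1}{\omega} \int_0^{t} e^{-s} \sin(\omega(t - s))\, ds
     = \frac{1}{1 + \omega^2} \left\{ e^{-t} + \frac{\sin(\omega t)}{\omega} - \cos(\omega t) \right\} . (129)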
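
The Green's function integral in Example 3.4 is easy to evaluate by quadrature and compare with the closed form. The sketch below is illustrative only: the Simpson rule, the function names, and the sample values t = 2, ω = 3 are assumptions made for the check.

```python
import math

def green(t, s, omega):
    """Causal Green's function of y'' + omega^2 y: sin(omega (t - s)) / omega for t > s, else 0."""
    return math.sin(omega * (t - s)) / omega if t > s else 0.0

def y_quadrature(t, omega, n=2000):
    """Composite Simpson approximation of int_0^t G(t, s) e^{-s} ds; n must be even."""
    h = t / n
    total = 0.0
    for i in range(n + 1):
        s = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * green(t, s, omega) * math.exp(-s)
    return total * h / 3.0

def y_closed(t, omega):
    """Closed form quoted in the example."""
    return (math.exp(-t) + math.sin(omega * t) / omega - math.cos(omega * t)) / (1.0 + omega ** 2)

print(y_quadrature(2.0, 3.0), y_closed(2.0, 3.0))
```

The same `y_quadrature` pattern works for any forcing f(s) in place of e^{−s}, which is the practical point of having G(t, s) in hand.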
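Example 3.5
t^2 \ddot{y} - (t^2 + 2t) \dot{y} + (t + 2) y = f(t) , (130)
with $y(0) = \dot{y}(0) = 0$. We have

L[y] \equiv \ddot{y} - \left(1 + \frac{2}{t}\right) \dot{y} + \left(\frac{1}{t} + \frac{2}{t^2}\right) y = \hat{f} \equiv \frac{f}{t^2} . (131)

By inspection, one solution of L[y] = 0 is y = t. Using reduction of order, the second
solution is $t e^{t}$. Set

G(t, s) = \begin{cases} 0 , & t < s \\ c_1 t + c_2 t e^{t-s} , & t > s \end{cases} (132)

Continuity at t = s implies $c_1 + c_2 = 0$. A jump of 1 in $G_t$ implies $c_2 = 1/s = -c_1$. Therefore

y(t) = \int_0^{t} \frac{1}{s} \left[ t e^{t-s} - t \right] \hat{f}(s)\, ds
     = t \int_0^{t} \left[ \frac{e^{t-s} - 1}{s^3} \right] f(s)\, ds . (133)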

3.3.2 Two-Point Boundary Value Problems

Now set I = [a, b]. Consider an n-th order system. The kernel has dimension n. In
a 2-point bvp, we impose m > 0 conditions at x = a, and n − m > 0 conditions at
x = b, to fix a complementary function. Such a problem may have 0, 1 or infinitely
many solutions.

Example 3.6 Consider y'' + y = 0, for which the complementary functions are sin x and
cos x. Consider the following possibilities for boundary conditions (bcs).

1. y(0) = 1, y'(π) = 0. This has one solution: y(x) = cos x.

2. y(0) = y(π) = 0. This has an infinite number of solutions: y(x) = λ sin x, for
arbitrary λ.

3. y(0) = 0, y'(π) = 0. This has no non-trivial solutions.

Example 3.7 Consider y'' − y = 0 and consider the following possibilities for boundary
conditions (bcs).

1. y(0) = 1, y bounded as x → ∞. This has one solution: $y(x) = e^{-x}$.

2. y(0) = 0, y bounded as x → ∞. This has no non-trivial solutions.

Definition 3.5 Suppose the problem M[y] = 0 with boundary values at x = a and x = b
has no non-trivial solutions. Then a, b are conjugate points.

Definition 3.6 A boundary condition C[y, a] is homogeneous if, whenever it is satisfied

by y, it is also satisfied by λy, with λ an arbitrary constant.

Homogeneous bcs usually come in the form

c_1 y(a) + c_2 y'(a) = 0 , (134)

for example.

Definition 3.7 Consider a 2-point bvp

L[y] \equiv y'' + p(x) y' + q(x) y = f(x) , (135)

x ∈ [a, b], with bcs $C_1(y, y', a)$, $C_2(y, y', b)$. The Green’s function G(x, ξ) satisfies

1. G(x, ξ) is smooth and L[G] = 0 for a ≤ x, ξ ≤ b, $x \neq \xi$.

2. Considered as a function of x, G satisfies the bcs.

3. G is $C^0$ at x = ξ, but $G_x$ has a jump $[G_x(x, \xi)]_{x=\xi^-}^{x=\xi^+} = 1$.

I will state, but not prove, that if the bcs are homogeneous, and a, b are conjugate, then
G exists and is unique.
It can then be shown that the solution of the problem (135) is

y(x) = \int_a^{b} G(x, \xi) f(\xi)\, d\xi . (136)

Example 3.8
y''(x) + y(x) = f(x) , (137)

on [0, π], with y(0) = 0, y'(π) = 0. It is easy to see that the homogeneous equation has

y_1(x) = \sin x  satisfying y(0) = 0

y_2(x) = \cos x  satisfying y'(π) = 0 . (138)

The Wronskian is then W = −1, so that the Green’s function is

G(x, \xi) = \begin{cases} -\cos\xi \sin x , & 0 \le x \le \xi \le \pi \\ -\sin\xi \cos x , & 0 \le \xi \le x \le \pi \end{cases} (139)

and the final solution to the problem is

y(x) = -\cos x \int_0^{x} \sin\xi\, f(\xi)\, d\xi - \sin x \int_x^{\pi} \cos\xi\, f(\xi)\, d\xi . (140)
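
To see the two-point Green's function solution of Example 3.8 in action, one can evaluate the two integrals numerically for a simple forcing. The sketch below assumes the test case f = 1, for which y = 1 − cos x satisfies y'' + y = 1 with y(0) = 0 and y'(π) = 0; the helper names and the Simpson rule are illustrative choices, not part of the notes.

```python
import math

def simpson(func, a, b, n=1000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def solve_bvp(x, f):
    """Green's function solution of y'' + y = f with y(0) = 0, y'(pi) = 0."""
    left = simpson(lambda xi: math.sin(xi) * f(xi), 0.0, x)
    right = simpson(lambda xi: math.cos(xi) * f(xi), x, math.pi)
    return -math.cos(x) * left - math.sin(x) * right

# Assumed test forcing f = 1, compared against y = 1 - cos(x).
for x in (0.0, 1.0, 2.5, math.pi):
    print(x, solve_bvp(x, lambda xi: 1.0), 1.0 - math.cos(x))
```

Swapping the lambda for another forcing gives the corresponding solution with the same boundary conditions, since these are built into G.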

4 Transform Calculus
4.1 The Fourier Transform
I’ll assume that you know something about Fourier series. Suppose g(y) is continuous
on −π to π, and that g(±π) = 0. Then

g(y) = \sum_{n=-\infty}^{\infty} C_n e^{iny} , (141)

where

C_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(y) e^{-iny}\, dy . (142)

Now consider changing the interval to [−L/2, L/2]. Set y = ωx, ω = 2π/L, g(y) = f(x),
$c_n = L C_n$. Then we have

f(x) = \frac{1}{L} \sum_{n=-\infty}^{\infty} c_n e^{i\omega n x} , (143)

with

c_n = \int_{-L/2}^{L/2} f(x) e^{-i\omega n x}\, dx . (144)

In the sum, k ≡ ωn changes by ∆k = 2π/L at each term, so

f(x) = \frac{1}{2\pi} \sum_{n=-\infty}^{\infty} c_n e^{ikx} \Delta k . (145)

We now take the limit as L → ∞. The sum now samples points increasingly close
together and in the limit becomes an integral. Denoting $c_n$ by $\tilde{f}(k)$ we obtain the pair

f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \tilde{f}(k) e^{ikx}\, dk

\tilde{f}(k) = \int_{-\infty}^{\infty} f(x) e^{-ikx}\, dx . (146)

Definition 4.1 Suppose $\int_{-\infty}^{\infty} |f(x)|\, dx < \infty$. Then $\tilde{f}(k)$ defined by the above is the
Fourier Transform, and the expression for f(x) is the inversion formula.

Note that I will use these definitions consistently; however, physicists often switch the
signs in the exponents, and make this more symmetric by having a $1/\sqrt{2\pi}$ in front of
each integral. These are just issues of convention.

4.1.1 Fundamental Relations

The so-called shifting relations are extremely useful

Lemma 4.1 (Shifting Relations) Suppose $\tilde{f}(k)$ exists. Let $g(x) = f(x - x_0)$, and
$h(x) = e^{i\lambda x} f(x)$, with λ and $x_0$ constant. Then

\tilde{g}(k) = e^{-ikx_0} \tilde{f}(k)

\tilde{h}(k) = \tilde{f}(k - \lambda) . (147)

Proof 4.1 Do it in class. Easy!!


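Lemma 4.2 Suppose $\tilde{f}(k)$ exists. Let g(x) = f(ax), with $a \neq 0$ real. Then

\tilde{g}(k) = \frac{1}{|a|} \tilde{f}\left(\frac{k}{a}\right) . (148)

Proof 4.2

\tilde{g}(k) = \int_{-\infty}^{\infty} f(ax) e^{-ikx}\, dx
            = \mathrm{sign}(a) \int_{-\infty}^{\infty} f(y) e^{-iky/a}\, \frac{dy}{a}
            = \frac{1}{|a|} \tilde{f}\left(\frac{k}{a}\right) . (149)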
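
The scaling lemma, including the |a| for negative a, can be checked numerically. The sketch below is a hedged illustration: it assumes the Gaussian f(x) = e^{−x²/2}, whose transform in this document's convention is √(2π) e^{−k²/2}, and uses a plain trapezoid rule on a truncated interval. All names and the values a = −2, k = 1.5 are illustrative.

```python
import math

def fourier_transform(func, k, half_width=10.0, n=4000):
    """Trapezoid approximation of int func(x) e^{-ikx} dx over [-half_width, half_width]."""
    h = 2.0 * half_width / n
    total = 0j
    for i in range(n + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * func(x) * complex(math.cos(k * x), -math.sin(k * x))
    return total * h

def f(x):
    return math.exp(-0.5 * x * x)

def f_tilde(k):
    """Known transform of the Gaussian above in this convention."""
    return math.sqrt(2.0 * math.pi) * math.exp(-0.5 * k * k)

a, k = -2.0, 1.5
numeric = fourier_transform(lambda x: f(a * x), k)   # transform of g(x) = f(a x)
predicted = f_tilde(k / a) / abs(a)                  # right-hand side of the lemma
print(numeric, predicted)
```

The same `fourier_transform` helper can be reused to test the shifting relations by passing f(x − x0) or e^{iλx} f(x) instead.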

Lemma 4.3 Suppose g(x) = f'(x), $\tilde{h}(k) = d\tilde{f}/dk = \tilde{f}'(k)$. Then, assuming all the
integrals converge,

\tilde{g}(k) = ik \tilde{f}(k)
h(x) = -ix f(x) . (150)

Proof 4.3 Easy!!

Simple extensions of these results show a general trend that the faster f(x) falls off
as x → ±∞, the smoother $\tilde{f}(k)$ is, and vice-versa.
An important point is that Fourier transforms can be used to solve differential
equations with constant coefficients:

y''(x) + p y'(x) + q y(x) = f(x) . (151)

Because of linearity and the derivative rule (150),

(ik)^2 \tilde{y} + (ikp) \tilde{y} + q \tilde{y} = \tilde{f} , (152)

and so

y(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{\tilde{f}(k) e^{ikx}}{q + ikp - k^2}\, dk . (153)

This is clearly related to the Green’s function approach, and we shall return to it later.

4.1.2 A Digression on Distributions

The delta function, δ(x), is an example of a generalized function, or distribution:
something which may fail to satisfy either smoothness, boundedness, or asymptotic properties
required of a given class of functions, but which can still be manipulated like a function.
For example, neither the delta function nor the function f(x) = 1 belongs to the class of
functions for which Fourier transforms are normally defined, but the results

\tilde{\delta}(k) = 1
\tilde{1} = 2\pi \delta(k) , (154)

are familiar. For these purposes, both these functions must be regarded as distributions.

Definition 4.2 Let F be a class of “good” functions on (−∞, ∞); for example, $C^\infty$
with exponential decay at ±∞. Then, g(x) is a distribution with respect to F if $\langle f, g \rangle$,
defined by

\langle f, g \rangle \equiv \int_{-\infty}^{\infty} f(x) g(x)\, dx (155)

is finite ∀ f ∈ F, i.e., for all test functions.

Note that a different definition of “good” would lead to a different class of
distributions; but $C^\infty$ is usually required because this implies that the derivative of a distribution
(defined below) is also a distribution.
With the definition above, the space of distributions (which is dual to the space of
test functions) has many of the nice properties of a space of functions (e.g. linearity).
Each distribution is defined by its action on test functions. For example, δ(x) is the
distribution defined by

\langle \delta, f \rangle = f(0) \quad \forall f \in F . (156)

The derivative of a distribution g is defined by

\langle g', f \rangle = -\langle g, f' \rangle \quad \forall f \in F . (157)

(i.e. by integration by parts).

The Fourier Transform of a distribution g is defined similarly:

\langle \tilde{g}, f \rangle = \langle g, \tilde{f} \rangle \quad \forall f \in F . (158)

It is straightforward to show that most of the properties of Fourier transforms hold also
for the Fourier transforms of distributions:

1. The Fourier transform of the delta function. By the above definition $\langle \tilde{\delta}, f \rangle = \langle \delta, \tilde{f} \rangle$.
The RHS is

\int_{-\infty}^{\infty} \delta(k) \tilde{f}(k)\, dk = \tilde{f}(0) = \int_{-\infty}^{\infty} f(x)\, dx \equiv \int_{-\infty}^{\infty} f(k)\, dk , (159)

and the LHS is

\int_{-\infty}^{\infty} \tilde{\delta}(k) f(k)\, dk . (160)

Comparing these, which must be equal for all test functions f(x), gives the result
$\tilde{\delta}(k) = 1$.

2. The Fourier transform of a constant. By the above definition $\langle \tilde{1}, f \rangle = \langle 1, \tilde{f} \rangle$. The
RHS is

\int_{-\infty}^{\infty} 1 \cdot \tilde{f}(k)\, dk = \int_{-\infty}^{\infty} \tilde{f}(k)\, dk = 2\pi f(0) , (161)

and the LHS is

\int_{-\infty}^{\infty} \tilde{1}\, f(k)\, dk . (162)

These must be equal for all test functions, so $\tilde{1} = 2\pi\delta(k)$. This result is consistent
with the Fourier inversion theorem, but the conditions of the theorem do not hold.

3. The Fourier transform of H(x) (the Heaviside function). A naive approach gives
the wrong answer. One could argue that since H'(x) = δ(x), and, for any f, the transform
of f' is $ik\tilde{f}(k)$, then since $\tilde{\delta}(k) = 1$, it follows that $ik\tilde{H}(k) = 1$, which is correct.
However, it does not follow that $\tilde{H}(k) = 1/ik$ because when distributions are
allowed, the full solution of the equation $1 = ik\tilde{H}(k)$ should be

\tilde{H}(k) = \frac{1}{ik} + A\delta(k) , (163)

where A is a constant which is not determined by this method. Since H(x) +
H(−x) = 1, the real part of the Fourier transform of H must be the Fourier
transform of 1/2. Therefore A = π.

4.1.3 Convolution Integrals

Unfortunately, there exists no simple formula relating $\widetilde{fg}$ to $\tilde{f}$ and $\tilde{g}$. Indeed, $\widetilde{fg}$ need
not exist. There is however another kind of multiplication which is physically very
important and for which Fourier transforms are easy to evaluate.

Definition 4.3 Suppose $\int_{-\infty}^{\infty} |f|^2\, dx < \infty$ and $\int_{-\infty}^{\infty} |g|^2\, dx < \infty$. The convolution of
f(x) and g(x) is

(f * g)(x) = \int_{-\infty}^{\infty} f(y) g(x - y)\, dy
           = \int_{-\infty}^{\infty} g(u) f(x - u)\, du
           = (g * f)(x) . (164)

Note that (f ∗ δ) = f for all f. Thus, δ is the identity for (∗) considered as multiplication.

Theorem 4.4 (The Convolution Theorem) Suppose $\tilde{f}$, $\tilde{g}$, and f ∗ g exist. Then
$\widetilde{f * g}$ exists and

\widetilde{f * g} = \tilde{f} \tilde{g} . (165)

Proof 4.5 See homework.

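A useful result, obtained from the inversion theorem to be seen soon, is

\delta(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{ikx}\, dk . (166)

Theorem 4.6 (Rayleigh (1899) - Plancherel (1910)) Suppose complex f(x) is such
that $\tilde{f}(k)$ and $\int_{-\infty}^{\infty} |f|^2\, dx$ both exist. Then

\int_{-\infty}^{\infty} |f(x)|^2\, dx = \frac{1}{2\pi} \int_{-\infty}^{\infty} |\tilde{f}(k)|^2\, dk . (167)

Proof 4.7

RHS = \frac{1}{2\pi} \int_{-\infty}^{\infty} \tilde{f}(k) \bar{\tilde{f}}(k)\, dk
    = \frac{1}{2\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (f(x) e^{-ikx}\, dx)(\bar{f}(y) e^{iky}\, dy)\, dk
    = \frac{1}{2\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x) \bar{f}(y) e^{ik(y-x)}\, dk\, dx\, dy
    = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x) \bar{f}(y) \delta(y - x)\, dx\, dy
    = \int_{-\infty}^{\infty} f(x) \bar{f}(x)\, dx
    = \int_{-\infty}^{\infty} |f(x)|^2\, dx . (168)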
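
A quick numerical sanity check of the Rayleigh-Plancherel relation, including its 1/2π factor, is sketched below. The example pair is an assumption chosen for convenience and is not from the notes: f(x) = e^{−|x|} has transform 2/(1 + k²) in this convention. Both integrals are truncated to finite ranges, so agreement is only expected up to the truncation error.

```python
import math

def simpson(func, a, b, n):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

# Left side: int |f|^2 dx = 2 int_0^inf e^{-2x} dx (truncated at x = 40).
lhs = 2.0 * simpson(lambda x: math.exp(-2.0 * x), 0.0, 40.0, 40000)
# Right side: (1/2pi) int |f~(k)|^2 dk with f~(k) = 2/(1+k^2) (truncated at |k| = 200).
rhs = simpson(lambda k: (2.0 / (1.0 + k * k)) ** 2, -200.0, 200.0, 100000) / (2.0 * math.pi)
print(lhs, rhs)
```

Dropping the 1/2π, or using the symmetric 1/√(2π) convention on only one side, makes the two numbers disagree, which is a handy way to catch convention mix-ups.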

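Theorem 4.8 (Parseval’s Theorem) If f, g are real, and $\tilde{f}$, $\tilde{g}$ and $\int_{-\infty}^{\infty} f g\, dx$ all
exist, then

\int_{-\infty}^{\infty} f(x) g(x)\, dx = \frac{1}{2\pi} \int_{-\infty}^{\infty} \tilde{f}(k) \tilde{g}(-k)\, dk . (169)

Theorem 4.9 Suppose f(x) is continuous and $\tilde{f}$ exists. Then

f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \tilde{f}(k) e^{ikx}\, dk . (170)

Proof 4.10 Let x be fixed, and let g(t) = f(t + x). Then, taking Fourier transforms
with respect to t we have

\frac{1}{2\pi} \int_{-\infty}^{\infty} \tilde{f}(k) e^{ikx}\, dk = \frac{1}{2\pi} \int_{-\infty}^{\infty} \tilde{g}(k)\, dk
    = \frac{1}{2\pi} \langle 1, \tilde{g} \rangle
    = \frac{1}{2\pi} \langle \tilde{1}, g \rangle
    = \frac{1}{2\pi} \langle 2\pi\delta, g \rangle
    = g(0)
    = f(x) . (171)

Note that if f is discontinuous at, say, $x = x_0$, then $\tilde{f}(k)$ is continuous and the
inversion integral is also continuous. It can be rigorously shown that

\frac{1}{2\pi} \int_{-\infty}^{\infty} \tilde{f}(k) e^{ikx_0}\, dk = \frac{1}{2} [f(x_0 + 0) + f(x_0 - 0)] . (172)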

4.2 The Laplace Transform

In this section I will use t as the independent variable and p as the transform variable.
Define the Laplace Transform by

F(p) = \int_0^{\infty} e^{-pt} f(t)\, dt \equiv L.f(t) . (173)

The Laplace transform traditionally treats only t ≥ 0. It is therefore conventional to
regard f(t) = 0 for t < 0. In the Laplace transform, p may be complex, defined at first
for $\Re(p) > \gamma$, where γ is as required for convergence of the transform. However, it is
important to be aware that no such γ may exist. For example, this is true for $f(t) = e^{t^2}$.

Here are some examples, for which the Laplace integrals are easy to compute.

f(t) = 1 ,  F(p) = \frac{1}{p} (174)
f(t) = e^{at} ,  F(p) = \frac{1}{p - a}  (\Re(p) > \Re(a)) (175)
f(t) = \cos(\omega t) ,  F(p) = \frac{p}{p^2 + \omega^2} (176)
f(t) = \sin(\omega t) ,  F(p) = \frac{\omega}{p^2 + \omega^2} (177)
f(t) = \sinh(\omega t) ,  F(p) = \frac{\omega}{p^2 - \omega^2} (178)
f(t) = \cosh(\omega t) ,  F(p) = \frac{p}{p^2 - \omega^2} , (179)
f(t) = \delta(t - a) ,  F(p) = e^{-ap} , (180)
f(t) = \Theta(t - a) ,  F(p) = \frac{e^{-ap}}{p} , (181)
f(t) = t^n ,  F(p) = \frac{n!}{p^{n+1}} . (182)
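
Several of the standard transforms listed here can be confirmed by direct quadrature of the defining integral for real p. The following sketch is illustrative only: the truncation time, step count and the sample values p = 2, ω = 3, a = 0.5 are assumptions, and the closed forms compared against are the usual textbook results 1/p, 1/(p − a), p/(p² + ω²), ω/(p² + ω²) and n!/p^{n+1}.

```python
import math

def laplace(f, p, t_max=60.0, n=60000):
    """Composite Simpson approximation of int_0^t_max e^{-pt} f(t) dt (real p; n even)."""
    h = t_max / n
    total = f(0.0) + math.exp(-p * t_max) * f(t_max)
    for i in range(1, n):
        t = i * h
        total += (4 if i % 2 else 2) * math.exp(-p * t) * f(t)
    return total * h / 3.0

p, omega, a = 2.0, 3.0, 0.5
checks = {
    "1": (laplace(lambda t: 1.0, p), 1.0 / p),
    "exp(a t)": (laplace(lambda t: math.exp(a * t), p), 1.0 / (p - a)),
    "cos(omega t)": (laplace(lambda t: math.cos(omega * t), p), p / (p * p + omega * omega)),
    "sin(omega t)": (laplace(lambda t: math.sin(omega * t), p), omega / (p * p + omega * omega)),
    "t^3": (laplace(lambda t: t ** 3, p), math.factorial(3) / p ** 4),
}
for name, (numeric, exact) in checks.items():
    print(name, numeric, exact)
```

The hyperbolic entries can be tested the same way, provided p is taken larger than ω so that the integral converges.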

4.2.1 Properties of Laplace Transforms

The change of scale property (λ > 0)

L.f(\lambda t) = \frac{1}{\lambda} F\left(\frac{p}{\lambda}\right) . (183)

The shift theorems

L.e^{\lambda t} f(t) = F(p - \lambda) . (184)

If g(t) = f(t − a) for t > a and is zero otherwise, then

L.g(t) = e^{-ap} F(p) . (185)

A very important result is that the Laplace transform of a derivative is given by

L.\frac{df}{dt} = p\, L.f(t) - f(0) , (186)

and, similarly, we obtain

L.\frac{d^2 f}{dt^2} = p^2 L.f(t) - p f(0) - f'(0) . (187)

The Laplace transform of an integral:

L.\int_0^{t} f(u)\, du = \frac{1}{p} L.f(t) . (188)

Theorem 4.11 (Initial Value Theorem)

\lim_{t \to 0} f(t) = \lim_{p \to \infty} p F(p) (189)

(provided both limits exist).

Proof 4.12

p F(p) = L.\frac{df}{dt} + f(0)
       = f(0) + \int_0^{\infty} \frac{df}{dt} e^{-pt}\, dt . (190)

As p → ∞ the right hand side becomes f(0) as required, provided f(t) is bounded near
t = 0.

There exists a similar Final Value Theorem

\lim_{t \to \infty} f(t) = \lim_{p \to 0} p F(p) . (191)

Convolutions are also important for Laplace transforms. Recall that for Laplace
transforms we assume that functions vanish for t < 0. Therefore, in a convolution
integral $h(t) = \int_{-\infty}^{\infty} f(t - u) g(u)\, du$ the integrand is nonzero only for t > u > 0. Thus
we have

h(t) = \int_0^{t} f(t - u) g(u)\, du ,  t > 0 , (192)
h(t) = 0 ,  t < 0 . (193)

Theorem 4.13 If h = f ∗ g, then H(p) = F (p)G(p).

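Proof 4.14

F(p) G(p) = \int_0^{\infty} e^{-ps} f(s)\, ds \int_0^{\infty} e^{-pu} g(u)\, du
    = \int_{-\infty}^{\infty} e^{-ps} f(s) \theta(s)\, ds \int_{-\infty}^{\infty} e^{-pu} g(u) \theta(u)\, du
    = \int_{-\infty}^{\infty} du\, g(u) \theta(u) \int_{-\infty}^{\infty} ds\, e^{-p(s+u)} f(s) \theta(s)
    = \int_{-\infty}^{\infty} du\, g(u) \theta(u) \int_{-\infty}^{\infty} dt\, e^{-pt} f(t - u) \theta(t - u)
    = \int_{-\infty}^{\infty} dt\, e^{-pt} \int_{-\infty}^{\infty} du\, g(u) f(t - u) \theta(t - u) \theta(u)
    = \int_{-\infty}^{\infty} dt\, e^{-pt} \theta(t) \int_0^{t} du\, f(t - u) g(u)
    = \int_0^{\infty} dt\, e^{-pt} \left[ \int_0^{t} du\, f(t - u) g(u) \right]
    = \int_0^{\infty} dt\, e^{-pt} h(t)
    = H(p) . (194)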

To illustrate the usefulness of the Laplace transform, we’ll tackle an example of a
differential equation with non-constant coefficients. The Bessel function $J_0(t)$ obeys

\frac{d}{dt}\left( t \frac{dJ_0}{dt} \right) + t J_0 = 0 , (195)

with the boundary conditions $J_0(0) = 1$, $J_0'(0) = 0$. First rewrite the equation as

t J_0'' + J_0' + t J_0 = 0 . (196)

Now Laplace transform and use the result $L.t^n f(t) = (-d/dp)^n F(p)$ to get

-\frac{d}{dp}(L.J_0'') + L.J_0' - \frac{d}{dp}(L.J_0) = 0 . (197)

Next I’ll write $K(p) \equiv L.J_0$ and use our results on the Laplace transforms of derivatives
to get

(p^2 + 1) K'(p) + p K = 0 . (198)

This is now a simple first order equation that we can solve to give

K(p) = \frac{c}{p} \left( 1 + \frac{1}{p^2} \right)^{-1/2} , (199)

where c is a constant. We can now use the initial value theorem to fix c via
$1 = J_0(0) = \lim_{p \to \infty} pK(p) = c$. So finally,

K(p) = \frac{1}{p} \left( 1 + \frac{1}{p^2} \right)^{-1/2} . (200)

This can be rewritten as (exercise!)

K(p) = \sum_{n=0}^{\infty} \frac{\alpha_n}{p^{2n+1}} , (201)

where

\alpha_n = \frac{(-1)^n (2n)!}{2^{2n} (n!)^2} . (202)

Finally, we can use that the Laplace transform of $t^n$ is $n!/p^{n+1}$, to invert and get

J_0(t) = \sum_{n=0}^{\infty} \frac{\alpha_n t^{2n}}{(2n)!}
       = \sum_{n=0}^{\infty} \frac{\left( -\frac{1}{4} t^2 \right)^n}{(n!)^2} . (203)
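
The power series obtained for J_0 can be cross-checked against an independent representation. The sketch below sums the series Σ (−t²/4)^n/(n!)² and compares it with the standard integral representation J_0(t) = (1/π) ∫_0^π cos(t sin θ) dθ, which is brought in here as an outside fact and is not derived in these notes. Function names, the number of terms and the quadrature resolution are illustrative assumptions.

```python
import math

def j0_series(t, terms=40):
    """Partial sum of sum_n (-t^2/4)^n / (n!)^2."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= -(t * t / 4.0) / ((n + 1) ** 2)   # ratio of consecutive terms
    return total

def j0_integral(t, n=2000):
    """(1/pi) int_0^pi cos(t sin(theta)) d theta by the midpoint rule."""
    h = math.pi / n
    return sum(math.cos(t * math.sin((i + 0.5) * h)) for i in range(n)) * h / math.pi

for t in (0.0, 1.0, 2.5, 5.0):
    print(t, j0_series(t), j0_integral(t))
```

For moderate t the series converges quickly; for large t the alternating terms grow before they decay, so the integral form is numerically safer there.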

In this example, we used a Laplace transform that we know to invert another Laplace
transform. However, for general problems we’ll need a general inversion formula, anal-
ogous to the one that we used for Fourier transforms. Remember that for Fourier
transforms the transform variable was real, therefore the inversion formula was a real
integral. However, as I mentioned, with Laplace transforms, the transform variable is
complex in general. Therefore, we’ll end up needing a complex (contour) integral to
invert and recover f (t).
To get to the appropriate inversion formula, we’ll postulate a form for the integral,
and then show how it can be an inversion. Remember that I said we’d need to require
$\Re(p) > \gamma$ for some γ in order for the Laplace transform to converge. With this in mind,
let’s examine integrals of the form

\int_{\gamma - i\infty}^{\gamma + i\infty} dp\, e^{pt} F(p) , (204)

where F(p) is the Laplace transform of a function f(t). We would like this integral to
yield f(t). Substituting in for the actual Laplace transform we get

\int_{\gamma - i\infty}^{\gamma + i\infty} dp\, e^{pt} \int_{-\infty}^{\infty} du\, e^{-pu} f(u) \theta(u) . (205)

Set p = γ + ik. We then get

i e^{\gamma t} \int_{-\infty}^{\infty} dk\, e^{ikt} \int_{-\infty}^{\infty} du\, e^{-iku} [e^{-\gamma u} f(u) \theta(u)] . (206)

But, by the Fourier transform and inversion formula, this is

2\pi i\, e^{\gamma t} [e^{-\gamma t} f(t) \theta(t)] . (207)

So, we can finally rearrange things to get the Bromwich Inversion Formula for the
Laplace transform:

f(t) \theta(t) = \frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} dp\, e^{pt} F(p) . (208)

Now, notice that, in deriving this, we have not specified what γ is. For a given F(p),
γ is not known a priori. In fact, γ must be chosen so that the right hand side of the
inversion integral is zero for t < 0 (to match the left hand side).
To do this, start with γ > 0 and close the contour using a semicircle in $\Re(p) > \gamma > 0$
to form a closed contour C. Now, for t < 0, the factor of $e^{pt}$ ensures that the contribution
from the integral over the semicircle at infinity vanishes. Thus, for the integral around
C to yield zero, F(p) must have no singularities inside C. Therefore, all singularities of
F(p) must lie to the left of the line $\Re(p) = \gamma$. This fixes γ.
For t > 0, we close the contour in $\Re(p) < \gamma$. This gives

f(t) = \frac{1}{2\pi i} \oint_C e^{pt} F(p)\, dp ,  t > 0 . (209)

If the only singularities of F(p) are isolated poles, the inversion integral can be
performed by the calculus of residues:

f(t) = \sum_j \mathrm{Res}_{p = p_j} [e^{pt} F(p)]
     = \sum_j e^{p_j t}\, \mathrm{Res}_{p = p_j} F(p) , (210)

where the second form holds for simple poles at $p = p_j$. Suppose the pole of F(p) with largest real part is $p = p_j$. Then
$f(t) \sim e^{p_j t}$ as t → ∞, and therefore we require $\gamma > \Re(p_j)$.
Let me give some examples of how to use Laplace transforms to solve ordinary
differential equations, in particular initial value problems. Consider

ẍ + 2ẋ + x = e−t , (211)

with initial values x(0) = 1, ẋ(0) = 0. Write L.x(t) ≡ X(p). Laplace transforming the
equation, and using our results about the Laplace transforms of derivatives, gives
[p^2 X(p) - p x(0) - \dot{x}(0)] + 2[p X(p) - x(0)] + X(p) = \frac{1}{p + 1} , (212)

which, after a little rearranging, implies that

X(p) = \frac{1}{p + 1} + \frac{1}{(p + 1)^2} + \frac{1}{(p + 1)^3} . (213)

But now, using our example from earlier, we can invert each of these term by term to get

x(t) = e^{-t} + t e^{-t} + \frac{t^2}{2} e^{-t} . (214)
Here’s another example:

\ddot{y} + \dot{y} - 2y = 0 , (215)

subject to y(0) = 1 and y → 0 as t → ∞. Write L.y(t) ≡ Y(p). Transforming the
equation we get

(p^2 + p - 2) Y(p) = p + \dot{y}(0) + 1 . (216)

This is now easily inverted to give

y(t) = \frac{\dot{y}(0) + 2}{3} e^{t} + \frac{1 - \dot{y}(0)}{3} e^{-2t} . (217)

Now, requiring y → 0 as t → ∞ implies that $\dot{y}(0) + 2 = 0$, and therefore the solution is

y(t) = e^{-2t} . (218)

What’s interesting about this example is that we use the first boundary condition just
after transforming, to dispose of one of the terms generated by Laplace transforming a
derivative, but use the second boundary term only after inverting the transform.
The next example deals with a pair of coupled first order differential equations.

\dot{x} + x + 2y = e^{2t} ,
2\dot{x} + \dot{y} - x = 0 , (219)

with x(0) = y(0) = 0. Laplace transforming gives

(p + 1) X(p) + 2 Y(p) = \frac{1}{p - 2} ,
(2p - 1) X(p) + p Y(p) = 0 . (220)

These can be trivially solved to give

X(p) = \frac{p}{(p - 1)(p - 2)^2} ,
Y(p) = \frac{1 - 2p}{(p - 1)(p - 2)^2} . (221)

We could invert these by partial fractions and using earlier results. However, instead we’ll
use this as our first Bromwich inversion formula example. Clearly we will need γ > 2.
The function $X(p) e^{pt}$ has a simple pole at p = 1 and a double pole at p = 2, with

Res(p = 1) = \lim_{p \to 1} (p - 1)(X(p) e^{pt}) = e^{t} ,

Res(p = 2) = \lim_{p \to 2} \frac{d}{dp} \left[ \frac{p e^{pt}}{p - 1} \right] = (2t - 1) e^{2t} , (222)

which yields
x(t) = (2t - 1) e^{2t} + e^{t} . (223)

Similarly we obtain
y(t) = (1 - 3t) e^{2t} - e^{t} . (224)
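
It is worth confirming that the inverted pair actually satisfies the coupled system and the initial conditions. The sketch below does this numerically with central differences; the helper names, step size and sample times are illustrative assumptions rather than anything prescribed by the notes.

```python
import math

def x(t):
    return (2.0 * t - 1.0) * math.exp(2.0 * t) + math.exp(t)

def y(t):
    return (1.0 - 3.0 * t) * math.exp(2.0 * t) - math.exp(t)

def ddt(func, t, h=1e-5):
    """Central-difference derivative."""
    return (func(t + h) - func(t - h)) / (2.0 * h)

def residuals(t):
    """Residuals of x' + x + 2y = e^{2t} and 2x' + y' - x = 0."""
    r1 = ddt(x, t) + x(t) + 2.0 * y(t) - math.exp(2.0 * t)
    r2 = 2.0 * ddt(x, t) + ddt(y, t) - x(t)
    return r1, r2

worst = max(abs(r) for t in (0.0, 0.5, 1.0) for r in residuals(t))
print(x(0.0), y(0.0), worst)
```

A check like this catches sign slips in the residue at the double pole, which is where most of the algebra in this example lives.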

Sometimes, a convolution trick is useful. Consider

\ddot{x} + \omega^2 x = f(t) , (225)

with $x(0) = \dot{x}(0) = 0$. Laplace transforming we get

X(p) = \frac{F(p)}{p^2 + \omega^2} . (226)

Now, as we learned earlier, $G(p) \equiv 1/(p^2 + \omega^2)$ is the Laplace transform of
$g(t) = (\sin(\omega t)/\omega)\theta(t)$. Thus, we can write

X(p) = G(p) F(p) , (227)

and use the convolution theorem to tell us that x(t) = (g ∗ f)(t). This reads

x(t) = \int_0^{t} dt'\, \frac{\sin \omega(t - t')}{\omega} f(t') , (228)

for t > 0. Therefore, g(t) is the Green’s function of the problem.
As a final example for Laplace transforms, consider the diffusion equation

\frac{\partial^2 u}{\partial x^2} = \frac{1}{k} \frac{\partial u}{\partial t} , (229)

in x ≥ 0, t ≥ 0, subject to u(0, t) = f(t), given, u(x, 0) = 0, and u(x, t) → 0, as x → ∞.
This problem could describe, for example, a prescribed heating (f(t)) applied to the
x = 0 end of a semi-infinite rod, initially unheated.
Perform the Laplace transform with respect to t:

U(x, p) = \int_0^{\infty} dt\, e^{-pt} u(x, t) . (230)

Note that evaluating this at x = 0 gives U(0, p) = L.f(t) ≡ F(p). Now, the diffusion
equation, using u(x, 0) = 0, gives

\frac{\partial^2 U}{\partial x^2} = \frac{p}{k} U , (231)

which is easily solved to give

U(x, p) = A(p) \exp(-\sqrt{p/k}\, x) + B(p) \exp(\sqrt{p/k}\, x) . (232)

Now, since the x-dependence of U and u will be the same, we require U(x, p) → 0, as
x → ∞, which gives B(p) = 0. So, we have $U(x, p) = A(p) \exp(-\sqrt{p/k}\, x)$. At x = 0, we
have U(0, p) = A(p), and so we write

U(x, p) = U(0, p) \exp(-\sqrt{p/k}\, x)
        = F(p) G(p) , (233)

with $G(p) = \exp(-\sqrt{p/k}\, x)$. Therefore, the convolution theorem tells us that u(x, t) =
(g ∗ f)(x, t), and it remains to find g(x, t). But from an earlier result we know this:

g(x, t) = \sqrt{\frac{x^2}{4\pi k}}\, t^{-3/2} \exp\left( -\frac{x^2}{4kt} \right) . (234)

Therefore, the complete solution to the diffusion equation problem is

u(x, t) = \sqrt{\frac{x^2}{4\pi k}} \int_0^{t} dt'\, (t - t')^{-3/2} \exp\left( -\frac{x^2}{4k(t - t')} \right) f(t') . (235)
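
The convolution solution of the diffusion problem can be tested on the simplest boundary data. The sketch below assumes f(t') = 1 (the end of the rod held at unit temperature from t = 0), for which the integral is known to reduce to the complementary error function erfc(x / (2√(kt))); that reference result is an outside fact used only for comparison. The midpoint rule, the function name and the sample values x = 1, t = 2, k = 1 are illustrative.

```python
import math

def u_convolution(x, t, k, n=20000):
    """Midpoint-rule approximation of the convolution integral for f(t') = 1, x > 0."""
    h = t / n
    prefactor = x / math.sqrt(4.0 * math.pi * k)   # sqrt(x^2 / (4 pi k)) for x > 0
    total = 0.0
    for i in range(n):
        tau = (i + 0.5) * h                        # tau = t - t'
        total += tau ** -1.5 * math.exp(-x * x / (4.0 * k * tau))
    return prefactor * total * h

x, t, k = 1.0, 2.0, 1.0
numeric = u_convolution(x, t, k)
reference = math.erfc(x / (2.0 * math.sqrt(k * t)))
print(numeric, reference)
```

The integrand vanishes faster than any power as τ → 0, so the apparent singularity of (t − t')^{−3/2} at the upper limit causes no numerical trouble.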

I’ve given you a lot of examples using the Laplace transform. During this time you’ve
had some time to get used to the Fourier transform. I’d now like to go back to the Fourier
transform for one example that is of particular physical significance.
Consider the problem of finding the Green’s function that satisfies

\left( -\frac{d^2}{dx^2} - q^2 \right) G(x, x') = \delta(x - x') , (236)

where q is a fixed, real, positive number, and −∞ < x, x' < ∞. This Green’s function
describes one-dimensional scattering in quantum mechanics. Set x' = 0 (w.l.o.g.), and
then Fourier transform. We obtain that

(k^2 - q^2) \tilde{G}(k) = 1 , (237)

with the function we’re looking for given by the inversion formula

G(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} dk\, e^{ikx} \tilde{G}(k) . (238)

Now, we would like to solve for $\tilde{G}(k)$. However, $\tilde{G}(k) = 1/(k^2 - q^2)$ will not do, because
it puts poles on the real k-axis, and this gives problems for the inversion integral.
To proceed, we will apply Feynman’s Rule. This technique is extremely important
in quantum mechanics and quantum field theory. Replace $q^2$ by $q^2 \pm i\epsilon$. This enables
one to define two independent Green’s functions $G_\pm(x)$ by taking the limit $\epsilon \to 0$ at an
appropriate later stage. Consider

\tilde{G}_+(k) = \frac{1}{k^2 - (q^2 + i\epsilon)}
    = \frac{1}{[k - (q^2 + i\epsilon)^{1/2}][k + (q^2 + i\epsilon)^{1/2}]}
    = \frac{1}{k - q - i\epsilon'} \cdot \frac{1}{k + q + i\epsilon'} , (239)

where, to first order in ε, $\epsilon' \equiv \epsilon/(2q)$, and $\lim_{\epsilon \to 0}$ and $\lim_{\epsilon' \to 0}$ are equivalent. So, we have

G_+(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} dk\, e^{ikx}\, \frac{1}{k - q - i\epsilon'} \cdot \frac{1}{k + q + i\epsilon'} . (240)

Now, for x > 0, we can evaluate this integral by closing the contour in the upper half
k-plane ($\Im(k) > 0$), and for x < 0, we can evaluate this integral by closing the contour
in the lower half k-plane ($\Im(k) < 0$). We use the residue theorem, and then take the
limit $\epsilon \to 0$ at the very end. The result is

G_+(x) = \frac{i}{2q} \left[ \theta(x) e^{iqx} + \theta(-x) e^{-iqx} \right] . (241)
One can calculate G− (x) similarly. Given our technique, you should check that these
Green’s functions obey the differential equation.

5 Sturm-Liouville Theory
Let’s begin with an example to get the feel of the kind of problems we’ll tackle with
these techniques.
Consider a uniform string with fixed ends. The displacement of this string obeys the
wave equation
\frac{1}{c^2} \frac{\partial^2 y}{\partial t^2} = \frac{\partial^2 y}{\partial x^2} , (242)

with boundary conditions y = 0 at x = 0 and at x = l for all time. To start, we separate
variables, making the ansatz y(x, t) = X(x)T(t). This yields

-\frac{1}{c^2} \frac{\ddot{T}}{T} = -\frac{X''}{X} = \text{const} = \lambda , (243)

say. Vibrations correspond to $T \sim e^{i\omega t}$, so $\lambda = \omega^2/c^2$. We seek solutions of

X'' = -\lambda X , (244)

subject to X = 0 at x = 0 and at x = l. The solutions are of the form

X_n(x) = B_n \sin\left( \frac{n\pi x}{l} \right) , (245)
with n = 1, 2, 3, . . ..
Note that there exist solutions only for λ ∈ S, a discretely distributed set of real
eigenvalues. S is referred to as the spectrum of eigenvalues. The corresponding eigen-
functions Xn are orthogonal on [0, l]. The equation of motion for X(x) is typical of a
wide class of eigenvalue problems which arise from some partial differential equations of

5.1 General Remarks

We will study (at first) second order linear differential equations for x ∈ I = [a, b].
Restrict attention to linear differential operators of the form $L = -a_2 D^2 - a_1 D + a_0$ for
given functions $a_i(x)$ with $a_2(x) > 0$ on I. Actually, it will prove sufficient to restrict
our attention to L of the self-adjoint form:

L = -D p D + q , (246)

with p(x) > 0, q(x) given functions on I.

The Sturm-Liouville Problem is specified by the differential equation

L y(x) = \lambda w(x) y(x) , (247)

to be solved for y(x) for x ∈ I, subject to the boundary conditions (to be specified),
with w(x) > 0 on I, and λ an eigenvalue parameter. The solution has some general
properties:

1. ∃ nontrivial solutions which obey the boundary conditions in use iff λ ∈ S, the
spectrum of the problem. S is a monotonic set of discretely distributed real
eigenvalues $\lambda_n$, so that $\lambda_0 < \lambda_1 < \lambda_2 < \cdots$, with $\lambda_n \to \infty$ like $n^2$ as n → ∞.

2. The eigenfunctions $y_n$ corresponding to the $\lambda_n \in S$ are unique to within
normalization. Also, $y_n$ and $y_m$ are orthogonal in a sense that we shall see soon, if
$n \neq m$.

3. The $y_n$ provide a basis in the infinite dimensional vector space of functions on I,
which obey the boundary conditions in use, and suitable smoothness properties.

5.2 Orthogonality and Boundary Conditions

Suppose $y_1$ and $y_2$ obey the Sturm-Liouville equation. Then

-(p y_1')' + q y_1 = \lambda_1 w y_1 , (248)

-(p y_2')' + q y_2 = \lambda_2 w y_2 , (249)

and suitable boundary conditions, for distinct values $\lambda_1$ and $\lambda_2$ of the eigenvalue
parameter. Form the object

\int_a^{b} dx\, [y_2 \times (248) - y_1 \times (249)] . (250)

This yields

(\lambda_1 - \lambda_2) \int_a^{b} dx\, w(x) y_1(x) y_2(x) = \int_a^{b} dx\, [-(p y_1')' y_2 + (p y_2')' y_1]
    = [-(p y_1') y_2 + (p y_2') y_1]_a^{b} . (251)

Now, appropriate boundary conditions are those that make this vanish. For then, since
$\lambda_1 \neq \lambda_2$, we have

\int_a^{b} dx\, w(x) y_1(x) y_2(x) = 0 . (252)

This is the sense in which $y_1$ and $y_2$ are orthogonal with respect to the weight function w(x).
A good example is given by Legendre’s equation and polynomials. This arises from
the Laplace equation in spherical polar coordinates (r, θ, φ). Writing x ≡ cos θ, the Legendre
equation is

-\frac{d}{dx} \left[ (1 - x^2) \frac{d}{dx} P(x) \right] = \lambda P(x) , (253)

with I = (−1, 1). Note that this is a Sturm-Liouville problem with w(x) = 1, and
$p(x) = (1 - x^2)$. The suitable boundary conditions are automatically imposed if P(x) is
finite at x = ±1, since p(x) → 0 at the endpoints. The solutions are a set of polynomials,
the first few of which are

P_0 = 1 ,  \lambda_0 = 0 , (254)
P_1 = x ,  \lambda_1 = 2 , (255)
P_2 = x^2 - \frac{1}{3} ,  \lambda_2 = 6 . (256)
More generally, there exists a unique $P_n = x^n + \cdots$ with $\lambda_n = n(n + 1)$. It can also be
checked that the $P_n$ are orthogonal on I.
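
Both claims about the first few Legendre polynomials, the eigenvalue equation and the orthogonality on (−1, 1), can be checked exactly with rational arithmetic. The sketch below represents polynomials as coefficient lists [c0, c1, c2, ...]; this representation and all helper names are assumptions made for the illustration, and the polynomials use the monic normalization quoted above (so P2 = x² − 1/3).

```python
from fractions import Fraction

def derivative(poly):
    """Differentiate a polynomial given as coefficients [c0, c1, c2, ...]."""
    return [n * c for n, c in enumerate(poly)][1:] or [Fraction(0)]

def multiply(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def strip(poly):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    poly = list(poly)
    while poly and poly[-1] == 0:
        poly.pop()
    return poly

def legendre_operator(poly):
    """Apply -d/dx[(1 - x^2) d/dx] to a polynomial."""
    inner = multiply([Fraction(1), Fraction(0), Fraction(-1)], derivative(poly))
    return [-c for c in derivative(inner)]

def inner_product(p, q):
    """Exact integral of p(x) q(x) over [-1, 1] with weight w = 1."""
    return sum(c * (1 - Fraction(-1) ** (n + 1)) / (n + 1)
               for n, c in enumerate(multiply(p, q)))

P = [[Fraction(1)], [Fraction(0), Fraction(1)], [Fraction(-1, 3), Fraction(0), Fraction(1)]]
eigenvalues = [0, 2, 6]
eigen_ok = [strip(legendre_operator(Pn)) == strip([lam * c for c in Pn])
            for Pn, lam in zip(P, eigenvalues)]
overlaps = [inner_product(P[0], P[1]), inner_product(P[0], P[2]), inner_product(P[1], P[2])]
print(eigen_ok, overlaps)
```

Because everything is done in exact fractions, a result of zero for the overlaps is a genuine identity rather than a small floating-point number.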

5.3 Real Eigenvalues

Let us allow the possibility that the $\lambda_n$ and $y_n$ are complex. Then, we may write two
Sturm-Liouville equations as

-(p y')' + q y = \lambda w y , (257)

-(p (y^*)')' + q y^* = \lambda^* w y^* . (258)

We now form the object

\int_a^{b} dx\, [y^* \times (257) - y \times (258)] . (259)

This yields

(\lambda - \lambda^*) \int_a^{b} dx\, w(x) y(x) y^*(x) = \int_a^{b} dx\, [-(p y')' y^* + (p (y^*)')' y]
    = [-(p y') y^* + (p (y^*)') y]_a^{b}
    = 0 , (260)

for the suitable boundary conditions. Now, since w(x) > 0 on I,
$\int_a^{b} dx\, w |y|^2$ is strictly positive for any nontrivial y. Therefore,

\lambda^* = \lambda , (261)

i.e., λ is real.

5.4 Formal Vector Space View

Let us regard suitably behaved functions f (x), g(x), which obey the boundary conditions
of our Sturm-Liouville problem as elements of an infinite dimensional vector space V,
spanned by the yn (x). Thus, we can write

$$ f(x) = \sum_{n=0}^{\infty} c_n y_n(x) . \tag{262} $$

Define a scalar (inner) product on V by

$$ (f, g) \equiv \int_a^b dx \, w(x) f(x) g(x) , \tag{263} $$

and note that “suitably well-behaved” requires that $\|f\| = (f, f)^{1/2}$ exists. Then, $(y_n, y_m) = 0$ for $n \neq m$. Further, choose the scale of $y_n$ to achieve orthonormality:

$$ (y_n, y_m) = \delta_{nm} . \tag{264} $$


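Taking the inner product of the expansion (262) with $y_m$ then gives
$$ \begin{aligned} (y_m, f) &= \sum_{n=0}^{\infty} c_n (y_n, y_m) , \\ &= \sum_{n=0}^{\infty} c_n \delta_{nm} , \\ &= c_m . \end{aligned} \tag{265} $$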

This is to be compared with the formulae for Fourier series. Thus, if we assume that the $y_n$ are known, then for a given $f$ we can obtain the $c_n$ via
$$ c_n = \int_a^b d\xi \, w(\xi) y_n(\xi) f(\xi) . \tag{266} $$
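As a concrete instance of (266), the following sketch computes expansion coefficients numerically for one simple Sturm-Liouville problem. Everything specific here is an assumption chosen for illustration: the problem $-y'' = \lambda y$ on $[0, \pi]$ with $y(0) = y(\pi) = 0$ and $w = 1$, whose orthonormal eigenfunctions are $\sqrt{2/\pi}\,\sin(nx)$ for $n = 1, 2, \ldots$ (so the index starts at 1 rather than 0), the test function $f(x) = x(\pi - x)$, and the helper names.

```python
import math

def simpson(g, a, b, m=2000):
    """Composite Simpson rule with an even number m of subintervals."""
    h = (b - a) / m
    total = g(a) + g(b)
    for k in range(1, m):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

a, b = 0.0, math.pi

def w(x):
    """Weight function of the assumed problem."""
    return 1.0

def y(n, x):
    """Orthonormal eigenfunctions of -y'' = lambda y with y(0) = y(pi) = 0."""
    return math.sqrt(2 / math.pi) * math.sin(n * x)

def f(x):
    """Test function obeying the same boundary conditions."""
    return x * (math.pi - x)

def coefficient(n):
    """c_n = integral over [a, b] of w * y_n * f, as in (266)."""
    return simpson(lambda xi: w(xi) * y(n, xi) * f(xi), a, b)

N = 25  # number of terms kept in the truncated expansion
c = {n: coefficient(n) for n in range(1, N + 1)}

def f_series(x):
    """Truncated expansion (262) rebuilt from the computed coefficients."""
    return sum(c[n] * y(n, x) for n in range(1, N + 1))

x0 = math.pi / 2
print(c[1], c[2], f_series(x0), f(x0))
```

For this particular choice the even coefficients vanish by symmetry about $x = \pi/2$, and the truncated sum reproduces $f$ closely because the coefficients fall off like $1/n^3$.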

An important and useful result follows if we substitute this into the expansion for
f (x). We obtain
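$$ \begin{aligned} f(x) &= \sum_{n=0}^{\infty} \left[ \int_a^b d\xi \, w(\xi) y_n(\xi) f(\xi) \right] y_n(x) , \\ &= \int_a^b d\xi \left[ \sum_{n=0}^{\infty} w(\xi) y_n(\xi) y_n(x) \right] f(\xi) . \end{aligned} \tag{267} $$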

Since this is true ∀ f ∈ V, we may therefore infer

$$ \sum_{n=0}^{\infty} w(\xi) y_n(\xi) y_n(x) = \delta(x - \xi) . \tag{268} $$

This is a formal completeness relation for the Sturm-Liouville problem. Note that we
can check this if we assume that δ(x − ξ) obeys the boundary conditions for a < ξ < b.
We can then expand the delta function as

$$ \delta(x - \xi) = \sum_{n=0}^{\infty} c_n(\xi) y_n(x) . \tag{269} $$

Then, our expression for cn gives

$$ \begin{aligned} c_n(\xi) &= \int_a^b dx \, w(x) y_n(x) \delta(x - \xi) , \\ &= w(\xi) y_n(\xi) . \end{aligned} \tag{270} $$
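A finite-dimensional analogue makes the completeness relation easy to test. The sketch below is an assumption-laden illustration rather than anything from the notes: it uses the sine modes sampled on a uniform grid of `M` interior points of $[0, \pi]$ with $w = 1$, normalised by $\sqrt{2/(M+1)}$ so that the grid spacing is absorbed. In this setting the Dirac delta of (268) becomes a Kronecker delta, and the sum over modes is finite.

```python
import math

M = 8  # number of interior grid points (hypothetical choice)

def y(n, j):
    """n-th discrete sine mode at grid point x_j = j*pi/(M+1), for n, j = 1..M."""
    return math.sqrt(2.0 / (M + 1)) * math.sin(n * j * math.pi / (M + 1))

def orthonormality(n, m):
    """Discrete version of (y_n, y_m): sum over grid points."""
    return sum(y(n, j) * y(m, j) for j in range(1, M + 1))

def completeness(i, j):
    """Discrete version of the left hand side of (268): sum over modes."""
    return sum(y(n, i) * y(n, j) for n in range(1, M + 1))

def max_deviation(kernel):
    """Largest deviation of kernel(a, b) from the Kronecker delta."""
    worst = 0.0
    for a in range(1, M + 1):
        for b in range(1, M + 1):
            target = 1.0 if a == b else 0.0
            worst = max(worst, abs(kernel(a, b) - target))
    return worst

print(max_deviation(orthonormality), max_deviation(completeness))
```

Orthonormality sums over grid points at fixed mode labels, while completeness sums over mode labels at fixed grid points. For a square orthogonal matrix the two statements are equivalent, which is the discrete counterpart of passing from (264) to (268).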

5.5 Inhomogeneous Equations and Green’s Functions

Suppose we have solved the Sturm-Liouville problem

Lyn (x) = λn w(x)yn (x) , (271)


where the yn obey suitable boundary conditions on I = [a, b], and w(x) > 0 on I. By
solved I mean that the λn and yn are determined, and (yn , ym ) = δnm has been arranged.
We would like to solve the problem

[L − λw(x)]y(x) = f (x) , (272)

for $y(x)$, for $x \in I$, subject to the same boundary conditions. Here $\lambda$ is a fixed real number, and $f(x)$ is a given function, naturally assumed to obey the same boundary conditions. To proceed, write

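$$ \begin{aligned} f(x) &= w(x) h(x) , \\ h(x) &= \sum_{n=0}^{\infty} h_n y_n(x) , \end{aligned} \tag{273} $$

where we can calculate the coefficients $h_n$ when $f$ (and hence $h$) is given, and the $y_n$ are known. We now posit the expansion

$$ y(x) = \sum_{n=0}^{\infty} a_n y_n(x) , \tag{274} $$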

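and seek the unknowns $a_n$ to complete the specification of the solution. Substituting into the problem (272), the left hand side becomes

$$ \begin{aligned} \sum_{n=0}^{\infty} a_n (L - \lambda w) y_n(x) &= \sum_{n=0}^{\infty} a_n (\lambda_n w - \lambda w) y_n(x) , \\ &= w(x) \sum_{n=0}^{\infty} a_n (\lambda_n - \lambda) y_n(x) . \end{aligned} $$

Now the right hand side is

$$ w(x) \sum_{n=0}^{\infty} h_n y_n(x) . \tag{275} $$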

Finally, equating these, multiplying both sides by ym (x), and integrating over a < x < b,
we obtain
am (λm − λ) = hm . (276)

Now, if $\lambda \notin S$, we have
$$ a_n = \frac{h_n}{\lambda_n - \lambda} . \tag{277} $$

We now have the solution in terms of quantities calculated for the homogeneous
equation. It is instructive to write this in another way. The solution is

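$$ \begin{aligned} y(x) &= \sum_{n=0}^{\infty} \frac{h_n}{\lambda_n - \lambda} \, y_n(x) , \\ &= \sum_{n=0}^{\infty} \frac{1}{\lambda_n - \lambda} \, y_n(x) \left[ \int_a^b d\xi \, w(\xi) y_n(\xi) h(\xi) \right] , \\ &= \int_a^b d\xi \left[ \sum_{n=0}^{\infty} \frac{y_n(x) y_n(\xi)}{\lambda_n - \lambda} \right] f(\xi) . \end{aligned} \tag{278} $$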

In this form it is easy to identify the Green's function as

$$ G(x, \xi) = \sum_{n=0}^{\infty} \frac{y_n(x) y_n(\xi)}{\lambda_n - \lambda} . \tag{279} $$

We can check that this Green’s function behaves as it should, by acting on it with
the left hand side of (272):

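$$ \begin{aligned} [L - \lambda w(x)] G(x, \xi) &= \sum_{n=0}^{\infty} \frac{y_n(\xi)}{\lambda_n - \lambda} (L - \lambda w) y_n(x) , \\ &= \sum_{n=0}^{\infty} \frac{y_n(\xi)}{\lambda_n - \lambda} (\lambda_n - \lambda) w(x) y_n(x) , \\ &= \sum_{n=0}^{\infty} y_n(\xi) w(x) y_n(x) , \end{aligned} $$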

which, using the completeness relation (268) together with $w(x)\delta(x - \xi) = w(\xi)\delta(x - \xi)$, is

[L − λw(x)]G(x, ξ) = δ(x − ξ) , (280)

as expected.
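The eigenfunction sum (279) can be compared against a known closed form in a simple case. The sketch below assumes, purely for illustration, the problem $L = -d^2/dx^2$ on $[0, \pi]$ with $y(0) = y(\pi) = 0$ and $w = 1$, so that $y_n = \sqrt{2/\pi}\,\sin(nx)$ and $\lambda_n = n^2$ for $n = 1, 2, \ldots$, and takes $\lambda = 0$. The closed-form Green's function used for comparison is a standard result for this operator and is not derived in these notes. The function names are hypothetical.

```python
import math

LAMBDA = 0.0    # fixed parameter in (272); must not equal any eigenvalue n**2
N_TERMS = 2000  # truncation of the infinite sums

def y(n, x):
    """Orthonormal eigenfunctions of the assumed problem."""
    return math.sqrt(2 / math.pi) * math.sin(n * x)

def green_series(x, xi, lam=LAMBDA, n_terms=N_TERMS):
    """Truncated eigenfunction sum (279) with lambda_n = n**2."""
    return sum(y(n, x) * y(n, xi) / (n * n - lam) for n in range(1, n_terms + 1))

def green_closed(x, xi):
    """Closed form for lam = 0: min(x, xi) * (pi - max(x, xi)) / pi."""
    lo, hi = min(x, xi), max(x, xi)
    return lo * (math.pi - hi) / math.pi

def solve_constant_source(x, lam=LAMBDA, n_terms=N_TERMS):
    """y(x) = sum_n h_n / (lambda_n - lam) * y_n(x) for f = 1, following (277)."""
    total = 0.0
    for n in range(1, n_terms + 1):
        # h_n = integral of y_n over [0, pi], evaluated analytically for f = h = 1.
        h_n = math.sqrt(2 / math.pi) * (1 - math.cos(n * math.pi)) / n
        total += h_n / (n * n - lam) * y(n, x)
    return total

x0, xi0 = 1.0, 2.0
print(green_series(x0, xi0), green_closed(x0, xi0))
# For lam = 0 and f = 1 the exact solution of -y'' = 1 is x * (pi - x) / 2.
print(solve_constant_source(x0), x0 * (math.pi - x0) / 2)
```

The Green's function series converges only like $1/n^2$, so many terms are needed for a few digits, whereas the solution for a smooth source converges faster because the $h_n$ themselves decay.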

5.6 Self-Adjointness
It has probably not escaped your notice that the Sturm-Liouville problem and its solu-
tions are related to things that you’ve seen in quantum mechanics. Here, we shall see
precisely what this relation is. In particular, we shall compare the terms self-adjoint (in
Sturm-Liouville theory) and Hermitian (in quantum mechanics).
I will begin by considering the case of $w(x) = 1$. Consider a typical Sturm-Liouville problem

$$ Ly = \lambda y , \qquad L = -DpD + q , \tag{281} $$

with boundary conditions y(a) = y(b) = 0. In the vector space V of (possibly complex)
functions f , g, obeying the boundary conditions, we will use the inner product
$$ (f, g) = \int_a^b dx \, f(x)^* g(x) . \tag{282} $$

We are interested in operators $A$ such that $Af \in V$, $\forall f \in V$. Define the Hermitian conjugate (or Hermitian adjoint) operator, $A^\dagger$, by

$$ (A^\dagger f, g) = (f, Ag) , \tag{283} $$

$\forall\, f, g \in V$. We say that the operator $A$ is hermitian if $A^\dagger = A$, or

$$ (Af, g) = (f, Ag) . \tag{284} $$

With this definition, note that our self-adjoint operator L is hermitian (for the special
case w = 1). You can check this trivially using the definition of the inner product. Do
this as a brief exercise.
Now turn to the case of $w \neq 1$. In this course, and in Sturm-Liouville theory in general, we refer to the operator $L$ as self-adjoint, even when $w \neq 1$. We also use $Ly = \lambda w y$ and $(f, g) = \int_a^b dx \, w f g$, from which real $\lambda_n$ and orthogonal $y_n$ follow. However, with this definition, $(Lf, g) \neq (f, Lg)$, because of $w$. To understand the relationship to hermiticity, we define
$$ M \equiv -w^{-1/2} D p D w^{1/2} + q , \tag{285} $$

which is hermitian. To see this:

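$$ \begin{aligned} (Mf, g) &= -\int_a^b dx \, w \, w^{-1/2} \left[ D p D w^{1/2} f \right] g + \cdots , \\ &= \int_a^b dx \, (p D w^{1/2} f) \, D(w^{1/2} g) + \cdots , \\ &= (f, Mg) . \end{aligned} \tag{286} $$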

Therefore, care is needed with terminology, even though either definition leads to real
eigenvalues and orthogonal eigenfunctions.
