ODE Notes

ORDINARY DIFFERENTIAL EQUATIONS
SPRING AND SUMMER TERMS 2002

FIGUEROA-OFARRILL
JOSE
hjmf@maths.ed.ac.uki
Contents
1. Overview and basic concepts
1.1. Initial value problems
1.2. Systems and phase portraits
1.3. Linear systems with constant coefficients
1.4. Autonomous equations
Problems
2. Ordinary differential equations of higher order
2.1. Second order equations
2.2. Some several variable calculus
2.3. Energy
2.4. Hamiltonian vector fields
2.5. Gradient vector fields
Problems
3. Linear vector fields
3.1. Some linear algebra
3.2. Real eigenvalues
3.3. Complex eigenvalues
3.4. The exponential of a matrix
3.5. The case n = 2
3.6. Inhomogeneous systems
3.7. Higher order equations
Problems
4. Stability of nonlinear systems
4.1. Topology of Rn
4.2. Existence and uniqueness
4.3. Linearisation about equilibrium points
4.4. Stability
4.5. Liapunov stability
4.6. Stability and gradient fields
4.7. Limit cycles and the PoincareBendixson theorem
Problems
5. Rudiments of the theory of distributions
5.1. Test functions
5.2. Distributions
1
3
3
3
4
6
6
9
9
9
10
10
11
11
13
13
14
15
16
16
18
19
20
27
27
28
30
30
31
33
34
34
41
41
44
2001/2002
5.3. Distributional derivatives and ODEs

5.4. Greens functions
Problems
6. The Laplace transform
6.1. Definition and basic properties
6.2. Application: solving linear ODEs
6.3. The Laplace transform of a distribution
Problems
7. ODEs with analytic coefficients
7.1. Analytic functions
7.2. ODEs with analytic coefficients
Problems
46
48
50
54
54
58
59
61
64
64
66
68
M341 ODE
1. Overview and basic concepts

1.1. Initial value problems. The simplest ODE is
dx
= ax ,
(1)
dt
where a R is a constant and x : R R is an unknown function taking
t 7 x(t). This equation can be written equivalently as
x0 = ax
or
x0 (t) = ax(t) .
All solutions of (1) are of the form

x(t) = Ceat ,
where C R is some constant. This is easily proven: if x(t) is any
solution of (1) then consider C(t) = x(t)eat . Taking the derivative,
one shows that C 0 = 0, whence it is a constant.
We fix this constant by specifying the value of x at some given point
t0 : if x(t0 ) = x0 , then C = x0 eat0 , so that
x(t) = x0 ea(tt0 ) .
Without loss of generality we can take t0 = 0, for if u(t) solves (1) with
u(0) = u0 , then v(t) = u(t t0 ) obeys (1) with v(t0 ) = u0 .
Therefore we often recast ODEs as initial value problems:
x0 = ax
and
x(0) = C .
We have just proven that such initial value problems have a unique
solution.
1.2. Systems and phase portraits. We will spend most of this
course studying systems of ODEs, e.g.,
x01 = a1 x1
x02 = a2 x2 ,
(2)
where ai , a2 R are constants and x1 , x2 : R R are the unknown

functions of a real variable t. This particular system is decoupled, and
the most general solution is of the form
x1 (t) = C1 ea1 t
x2 (t) = C2 ea2 t ,
for some constants C1 , C2 R which can be determined from initial
conditions: C1 = x1 (0) and C2 = x2 (0).
Let us describe this system geometrically. The functions x1 (t) and
x2 (t) specify a curve x(t) = (x1 (t), x2 (t)) in the (x1 , x2 )-plane R2 . The
derivative x0 (t) is called the tangent vector to the curve at t (or at
x(t)). We can rewrite (2) as
x0 = Ax = (a1 x1 , a2 x2 ) ,
2001/2002
where Ax is a vector based at x. The map A : R2 R2 sending x 7 Ax

is a vector field in the plane R2 . In other words, it is an assignment
to every point x in the plane, of a vector Ax based at x. We picture
this by drawing at the point x R2 , a directed line segment from x to
x + Ax.
Solving the system (2) consists in finding curves x : R R2 in the
plane, whose tangent vector x0 (t) agree with the vector field Ax(t) for
all t. Initial conditions are of the form
x(0) = u ,
where u R2 is a point in the plane. This just means that at t = 0,
the curve passes through u R2 .
Existence and uniqueness of solutions to (2) means that through
every point u R2 there passes one and only one solution curve.
Notice that for this to be the case, the trivial solution x(t) = (0, 0)
for all t is also considered a curve.
The family of all solution curves as subsets of R2 is called the phase
portrait of the ODE.
1.3. Linear systems with constant coefficients. Generalising the
above, consider the linear system
dx1
= a11 x1 + a12 x2 + + a1n xn
dt
dx2
= a21 x1 + a22 x2 + + a2n xn
dt
(3)
..
.
dxn
= an1 x1 + an2 x2 + + ann xn .
dt
Here aij R are n2 constants, whereas xi : R R is an unknown
function of a real variable t.
At the most primitive level, solving this system consists of finding n
differentiable functions xi which obey (3).
More conceptually, a solution of (3) is a curve in Rn .
The Cartesian space Rn is the space of ordered n-tuples of real numbers. An element of Rn is a point x = (x1 , x2 , . . . , xn ), and the number
xi is called the i-th coordinate of the point. Points can be added
coordinatewise
x + y = (x1 , . . . , xn ) + (y1 , . . . , yn ) = (x1 + y1 , . . . , xn + yn ) ,
and can also be multiplied coordinatewise by scalars R
x = (x1 , . . . , xn ) .
The distance between points x and y is
p
kx yk = (x1 y1 )2 + + (xn yn )2 ,
M341 ODE
and the length of a point x is the distance to the origin:

q
kxk = x21 + + x2n .
The origin, of course, is the point
0 = (0, . . . , 0) Rn .
A vector based at x Rn is an ordered pair of points x, y in Rn

and denoted xy.
~ We visualise it as an arrow from x to y. A vector
~ based at the origin can be identified with the point x Rn . To a
0x
vector xy
~ based at x is associated the vector y x based at the origin.
We say that xy
~ and y x are translates of each other.
We will reserve the name vector for vectors based at the origin. Their
translates are useful in visualisation, but for computations we always
use vectors based at the origin, as only those can be added and multiplied by scalars.
A candidate solution of (3) is a curve in Rn ,
x(t) = (x1 (t), x2 (t), . . . , xn (t)) ,
by which we mean a map x : R Rn sending t 7 x(t). This map is
called differentiable if and only if each of the coordinate functions xi
is differentiable. The derivative is defined by
dx
= x0 (t) = (x01 (t), x02 (t), . . . , x0n (t)) ,
dt
and it is also a map from R to Rn .
The translate v(t) of x0 (t) which is based at x(t) can be interpreted
geometrically as the tangent vector to the curve at t (or at x(t)). Its
length kx0 (t)k is interpreted physically as the speed of the particle
whose motion is described by the curve x(t).
We can use matrices to write (3) more succinctly. Let
a11 a12 . . . a1n

a21 a22 . . . a2n
A = [aij ] =
..
.
an1 an2 . . . ann
For each x R we define a vector Ax Rn whose i-th coordinate is
n
ai1 x1 + + ain xn ,
so that a matrix A is interpreted as a map A :
x 7 Ax. In this notation we write (3) simply as
x0 = Ax .
Rn
Rn
sending
(4)
The map A : Rn Rn is a vector field on Rn : to each point x Rn

it assigns the vector based at x which is a translate of Ax. A solution
of (4) is a curve x : R Rn whose tangent vector at any given t is the
vector Ax(t) (translated to x(t)).
2001/2002
The vector equation (4) can be supplemented by an initial condition

x(0) = x0 Rn , and we will see that such an initial value problem has
a unique solution. We have already proved this in the case n = 1, and
the general case is not any harder.
1.4. Autonomous equations. We need not restrict ourselves to linear equations, i.e., equations where the vector field A : Rn Rn is a
linear map. A large part of this course will concern itself with the more
general equation
x0 = f (x)
where f : U Rn is a continuous map from a subset U Rn to Rn .
Such equations are called autonomous because f does not depend
explicitly on t. On the other hand an equation like
x0 = f (x, t)
is called non-autonomous and are harder to interpret geometrically.
We will focus on autonomous systems for the most part. We will not
prove in this course any results on existence and uniqueness of nonlinear
initial value problems. There are such results but they are harder to
state. Some of them are discussed in Foundations of Analysis.
Problems
(These problems are taken from Hirsch & Smale, Chapter 1.)
Problem 1.1. For each of the matrices A which follow, sketch the
phase portrait of the corresponding differential equation x0 = Ax, using
Maple if so desired.

1

2 0
0
2
0
(c)
(a)
(b) 2
0 2
0 2
0 2
1

2
0 1
0 1
2
(d)
(e)
(f)
2
0
1
0
1 0

1
1 1
1
0
0
(i)
(g)
(h) 2 1
1 1
3 0
0 2
Problem 1.2. Consider the one-parameter family of differential equations
x01 = 2 x1 ,
x02 = a x2 ;
aR.
(a) Find all solutions (x1 (t), x2 (t)) .

(b) Sketch the phase portrait for the following values of a: 1, 0, 1, 2, 3.
Problem 1.3. For each of the following matrices A sketch the vector
field x 7 Ax in R3 . (Any missing entries are treated as 0.) Maple is
M341 ODE
of indirect use here. Although it will not plot the vector field itself, it
will plot the solutions.
1
1
1
(a) 1
(b) 2
(c) 2
1
0
2
0
0 1
1
0
(d) 1
(e) 1
(f) 1 1
1
0
2
1 1
Problem 1.4. For A as in (a), (b), (c) of Problem 1.3, solve the initial
value problem
x0 = Ax ,
x(0) = (k1 , k2 , k3 ) .
Problem 1.5. Let A be as in (e), Problem 1.3. Find constants a, b,

and c such that the curve (a cos t, b sin t, cet/2 ) is a solution of x0 = Ax
with x(0) = (1, 0, 3).
Problem 1.6. Find two different matrices A and B such that the
curve
x(t) = (et , 2e2t , 4e2t )
satisfies both the differential equations
x0 = Ax
and
x0 = Bx .
Problem 1.7. Let A be an n n diagonal matrix. Show that the

differential equation
x0 = Ax
has a unique solution for every initial condition.
Problem 1.8. Let A be an n n diagonal matrix. Find conditions on
A guaranteeing that
lim x(t) = 0
for all solutions of x0 = Ax.

Problem 1.9. Let A = [aij ] be an n n matrix. Denote by A the
matrix [aij ].
(a) What is the relation between the vector fields x 7 Ax and x 7
Ax?
(b) What is the geometric relation between the solution curves of
x0 = Ax and x0 = Ax?
Problem 1.10.
(a) Let u(t) and v(t) be solutions to x0 = Ax. Show that the curve
w(t) = u(t) + v(t) is a solution for all real numbers , .
(b) Let A =

1
2001/2002

. Find solutions u(t), v(t) of x0 = Ax such
2
that every solution can be expressed in the form u(t) + v(t) for
suitable constants , .
M341 ODE
2. Ordinary differential equations of higher order

2.1. Second order equations. The study of differential equations
derives from the study of Newtons second law (with unit mass):
F (x) = x00 ,
(5)
where x : R Rn (typically n 3) is the position of some object as a

function of time and F is a force field F : Rn Rn .
Newtons equation is a second order differential equation, the order
of an ODE being defined as the order of the highest derivative of x
which appears in the equation.
Second order equations (in fact, equations of any order > 1) are
equivalent to first-order equations by the following trick: define y :
R Rn by y = x0 . Then x00 = y 0, and Newtons equation becomes the
system
y 0 = F (x)
and
x0 = y .
(6)
Clearly if x is any solution of (5), then (x, y) with y = x0 is a solution

of (6); and conversely if (x, y) is any solution of (6), then x solves (5).
We say that (6) and (5) are equivalent.
2.2. Some several variable calculus. We summarise some results
from Several Variable Calculus.
Let x = (x1 , . . . , xn ) and y = (y1, . . . , yn ) be vectors in Rn . Their
inner product (or dot product) is denoted hx, yi and is defined as
hx, yi =
n
X
xi yi = x1 y1 + x2 y2 + + xn yn .
i=1
The norm of x is kxk where kxk2 = hx, xi. If x, y : R Rn are differentiable functions, then we have the following version of the Leibniz
rule:
hx, yi0 = hx0 , yi + hx, y 0 i .
(7)
Let f : Rn R be a differentiable function. The gradient of f ,

denoted grad f , is the vector field grad f : Rn Rn sending x to
(f /x1 , . . . , f /xn ). The level sets of f are the sets of the form
f 1 (c) for c R:
f 1 (c) = {x Rn | f (x) = c} .
A point x Rn is called a regular point if grad f (x) 6= 0. The
gradient is perpendicular to the level set at all regular points.
Now consider the composition of two differentiable maps:
f
x
R
Rn
R ,
10
2001/2002
which sends t to f (x(t)). The chain rule becomes

X f
d
dxi
(x(t))
f (x(t)) =
(t) = hgrad f (x(t)), x0 (t)i .
dt
x
dt
i
i=1
n
(8)
2.3. Energy. A force field F : Rn Rn is called conservative if

there exists some function V : Rn R, called a potential, such that
F = grad V .
Newtons equation (5) for a conservative force field is
x00 = grad V (x) .
(9)
Define the energy of a curve x : R Rn by

E(x(t)) = 12 kx0 (t)k2 + V (x(t)) .
The energy is constant along solution curves x : R Rn of Newtons

equation. Indeed, using the Leibniz rule (7) and the chain rule (8), we
have
d
E(x(t)) = hx00 , x0 i + hgrad V (x), x0 i = 0,
dt
where we have used Newtons equation (9). This is called conservation
of energy.
2.4. Hamiltonian vector fields. Newtons equation (9) is equivalent
to the following system:
(x, y)0 = (x0 , y 0) = (y, grad V (x)) ,
whose solutions can be interpreted as curves in Rn Rn = R2n associated
to the vector field defined by
(x, y) 7 (y, grad V (x)) .
(10)
Such a vector field is called hamiltonian. More precisely, a hamiltonian vector field in R2n is one of the form
(x, y) 7 (grady H, gradx H) ,
where
gradx H =
H
H
,...,
x1
xn

and
grady H =
H
H
,...,
y1
yn

,
for some function H : R2n R called the hamiltonian (function).

The vector field (10) defined by Newtons equation is hamiltonian
with hamiltonian function
H(x, y) = 12 kyk2 + V (x) .
Notice that the hamiltonian is constant along any solution curve and
equal to the energy. This means that solution curves lie in the level
sets of the hamiltonian.
M341 ODE
11
2.5. Gradient vector fields. A vector field F : Rn Rn in Rn is

called a gradient vector field if F = grad U for some function U :
Rn R.
Solution curves of the equation x0 = grad U(x) are orthogonal to the
level sets of the function U. This contrasts with hamiltonian systems,
in which the solution curves lie in the level sets.
Problems
(Some of the problems are taken from Hirsch & Smale, Chapter 2.)
Problem 2.1. Which of the following force fields F : R2 R2 are
conservative? If the field is conservative, find V : R2 R such that
F = grad V .
(a) F (x, y) = (x2 , 2y 2 )
(b) F (x, y) = (x2 y 2, 2xy)
(c) F (x, y) = (x, 0)
Problem 2.2. An ODE of order n is said to be in standard form if
it can be written as
x(n) = f (x, x0 , x00 , . . . , x(n1) ) ,
where x : R R. Prove that such an equation is equivalent to a
first-order system
X 0 = F (X) ,
for some function X : R Rn and some vector field F :
How are x and X related? How are f and F related?
Rn
Rn .
Problem 2.4. Which of the following vector fields F : R2

hamiltonian? Which ones are gradient?
(a) F (x, y) = (y, x x3 )
(b) F (x, y) = (y 2 x2 , xy, )
(c) F (x, y) = (x, y)
For those which are, find the function which makes them so.
R2
are
Problem 2.3. Prove the Leibniz rule (7).
Problem 2.5. Suppose that a vector field in R2 is both gradient and

hamiltonian. Show that the hamiltonian function H is harmonic; that
is,
2H 2H
+
=0.
x2
y 2
Problem 2.6. Let F : R2 R2 be the vector field defined by
F : (x, y) 7 (ax + by, cx + dy) .
For which values of the real parameters a, b, c, d is it hamiltonian, gradient or both?
12
2001/2002
Problem 2.7. A force field F : Rn Rn is called central if F (x) =

h(x)x, for some function h : Rn R. Let F be central and conservative,
so that F = grad V , for some function V : Rn R. Show that
V (x) = f (kxk), for some f : R R; in other words, the potential is
constant on spheres SR = {x Rn | kxk = R}.
Hint: Show that V is constant along any curve on the the sphere SR ,
and use that any two points on the sphere can be joined by curve. (Can
you prove this last assertion?)
Problem 2.8. Sketch the phase portraits of the following hamiltonian
vector fields in R2 , by first finding a hamiltonian function and sketching
its level sets.
(a) (x, y) 7 (y, 2(1 x))
(b) (x, y) 7 (2y x, y 2x)
(c) (x, y) 7 (2y 3x, 3y 2x)
(d) (x, y) 7 (y x, y x)
Problem 2.9. Sketch the phase portraits of each of the following gradient vector fields F : R2 R2 , by first finding a function U such that
F = grad U and then sketching its level sets.
(a) (x, y) 7 (2(x 1), y)
(b) (x, y) 7 (2x y, 2y x)
(c) (x, y) 7 (2x 3y, 2y 3x)
(d) (x, y) 7 (x y, y x)
M341 ODE
13
3. Linear vector fields

3.1. Some linear algebra. We collect some concepts which were covered in Linear Algebra and Differential Equations.
Rn is an example of an n-dimensional real vector space, and moreover every n-dimensional real vector space is isomorphic to Rn , the
isomorphism being given by a choice of basis.
A map A : Rn Rn is linear if
(i) A(x + y) = Ax + Ay, for all x, y Rn ; and
(i) A( x) = Ax, for all x Rn and R.
Any linear map A : Rn Rn , (x1 , . . . , xn ) 7 ((Ax)1 , . . . , (Ax)n ) is
characterised by n2 real numbers [aij ] defined by
Axi = ai1 x1 + ai2 x2 + + ain xn .
Writing points in Rn as column vectors (i.e., as n 1 matrices), we can
write the n2 numbers [aij ] as the entries in an n n matrix, and in this
way the action of A is given simply by matrix multiplication:

x1
a11 . . . a1n
x1
a11 x1 + + a1n xn
..
.
A ... 7 ... . . . ... ... =
.
xn
an1 . . . ann
xn
an1 x1 + + ann xn
Henceforth we will not distinguish between a linear transformation and
its associated matrix. Notice that nn matrices can be added and multiplied by real numbers, so they too form a vector space isomorphic to
Rn2 . In addition, matrices can be multiplied, and matrix multiplication
and composition of linear transformations correspond.
I assume familiarity with the notions of trace and determinant of
a matrix.
A subset E Rn is a (vector) subspace if
(i) x + y E for every x, y E, and
(ii) x E for every x E and R.
The kernel of a linear transformation A : Rn Rn is the subspace
defined by
ker A = {x Rn | Ax = 0} Rn .
Similarly, the image is the subspace defined by
im A = {y Rn | y = Ax, x Rn } Rn .
A linear transformation A : Rn Rn is invertible if there exists a

linear transformation A1 : Rn Rn obeying AA1 = A1 A = I,
where I : Rn Rn , Ix = x, is the identity. The following statements
are equivalent:
(a) A is invertible
(b) ker A = 0
(c) det A 6= 0
14
2001/2002
(d) im A = Rn
Let A : Rn Rn be a linear transformation. A nonzero vector x Rn
is called a (real) eigenvector if Ax = x for some real number , which
is called a (real) eigenvalue. The condition that be a real eigenvalue
of A means that the linear transformation I A : Rn Rn is not
invertible. Its kernel is called the -eigenspace of A: it consists of all
eigenvectors of A with eigenvalue together with the 0 vector. The
real eigenvalues of A are precisely the real roots of the characteristic
polynomial of A:
pA() = det(I A) .
A complex root of pA () is called a complex eigenvalue of A.
An n n matrix A = [aij ] is diagonal if aij = 0 for i 6= j, and
it is called diagonalisable (over R) if there exists an invertible n n
(real) matrix S, such that SAS 1 = D, with D diagonal. A sufficient
(but not necessary) condition for A to be diagonalisable, is that its
characteristic polynomial should factorise as
pA () = ( 1 )( 2 ) ( n ) ,
where the i are real and distinct. In other words, this means that
pA () should have n distinct real roots. We will summarise this condition as A has real, distinct eigenvalues. Another sufficient condition
for diagonalisability

is that A be symmetric: aij = aji.
a b
Let A =
be a 2 2 matrix. Its trace and determinant are,
c d
respectively, tr A = a + d and det A = ad bc. The characteristic
polynomial is
pA () = 2 (tr A) + det A .
Therefore the condition for A to have real, distinct roots is the positivity of the discriminant
(tr A)2 4 det A > 0 .
3.2. Real eigenvalues. Let A : Rn Rn , x 7 Ax, be a linear vector
field and consider the associated differential equation: x0 = Ax, where
x : R Rn .
If A is diagonal, A = diag{1 , 2 , . . . , n}, we know from Problem
1.7 that this equation has a unique solution for each choice of initial
value x(0). In fact, the solutions are
xi (t) = xi (0)ei t .
The solution depends continuously on the initial conditions (see Problem 3.5).
Suppose that A is diagonalisable, then there exists some constant
invertible matrix S such that D := SAS 1 is diagonal. Consider the
equation x0 = Ax = S 1 DSx. Define y : R Rn by y = Sx. Then
M341 ODE
15
y 0 = Dy and the previous paragraph applies. In particular, this system

has a unique solution for each choice of initial value y(0). Finally,
x(t) = S 1 y(t) is the desired solution.
In particular, if A has real, distinct eigenvalues then the equation
x0 = Ax has a unique solution for each choice of initial conditions x(0).
Moreover the solution is easy to construct explicitly: all we need to do
is to construct the matrix S, and this is done as follows:
1. find the eigenvectors {v1 , v2 , . . . , vn };
2. construct a matrix V whose columns are the eigenvectors:
| |
|
V = v1 v2 vn ;
| |
|
3. and invert to obtain the matrix S = V 1 .
For many problems we will not need the explicit form of S, and only
a knowledge of the eigenvalues of A will suffice.
3.3. Complex eigenvalues. It may happen that a real matrix A has
complex eigenvalues; although they always come in complex conjugate
pairs (see Problem 3.8). In this case it is convenient to think of A as a
linear transformation in a complex vector space.
The canonical example of a complex vector space is Cn , the space of
ordered n-tuples of complex numbers: (z1 , z2 , . . . , zn ).
Let A = [aij ] be a n n matrix. It defines a complex linear transformation Cn Cn as follows:

a11 . . . a1n
z1
a11 z1 + + a1n zn
z1
..
.
A ... 7 ... . . . ... ... =
.
zn
an1 . . . ann
zn
an1 z1 + + ann zn
If A is real, then it preserves the real subspace Rn Cn consisting of
real n-tuples: zi = zi .
A sufficient (but not necessary) condition for a complex n n matrix A to be diagonalisable is that its characteristic polynomial should
factorise into distinct linear factors:
pA () = ( 1 )( 2 ) ( n ) ,
where the i C are distinct.
Suppose that A is diagonalisable, but with eigenvalues which might
be complex. This means that there is an invertible n n complex
matrix S such that SAS 1 = D = diag{1 , . . . , n}. The i are in
general complex, but if A is real, they are either real or come in complex
conjugate pairs (see Problem 3.8).
The equation x0 = Ax, where x : R Rn Cn , can be easily solved
by introducing y : R Cn by x(t) = S 1 y(t), where
yi (t) = yi(0)ei t .
16
2001/2002
3.4. The exponential of a matrix. Let A be an n n matrix. The

exponential of A is the matrix defined by the following series:
eA :=
X
1 j
A = I + A + 12 A2 +
j!
j=0
This series converges absolutely for all A.

Proposition 3.1. Let A, B and C be n n matrices. Then
(a) if B = CAC 1 , then eB = CeA C 1 ;
(b) if AB = BA, then eA+B = eA eB ;
1
(c) eA = eA ;

a b
(d) if n = 2 and A =
then
b
a

A
a cos b sin b
.
e =e
sin b
cos b
If x Rn is an eigenvector of A with eigenvalue , then x is also an
eigenvector of eA with eigenvalue e .
Let L(Rn ) denote the vector space of n n real matrices. Consider
the map R L(Rn ) which to t R assigns the matrix etA . This can
2
be thought of as a map R Rn , so it makes sense to speak of the
derivative of this map. We have the following
d tA
e = AetA = etA A .
dt
Theorem 3.2. Let x 7 Ax be a linear vector field on Rn . The solution
of the initial value problem
x0 = Ax
x(0) = K Rn
is
x(t) = etA K ,
and there are no other solutions.
Although the solution of a linear ordinary differential equation is
given very explicitly in terms of the matrix exponential, exponentiating
a matrixespecially if it is of suffficiently large rankis not practical
in many situations. A more convenient way to solve a linear equation
is to change basis to bring the matrix to a normal form which can be
easily exponentiated, and then change basis back.
3.5. The case n = 2. Recall the following result from linear algebra.
M341 ODE
17
Theorem 3.3. Given any 2 2 real matrix A, there is an invertible

real matrix S such that B := SAS 1 takes one (and only one) of the
following forms

0
a b
0
(I)
(II)
(III)
,
0
b
a
1
where in (I) and b 6= 0 in (II).
Case (I) corresponds to matrices which are diagonalisable over R.
Case (II) corresponds (since b 6= 0) to matrices which are not diagonalisable over R but are diagonalisable over C. Finally, case (III)
corresponds to matrices which are not diagonalisable.
The exponential of A is then given by
eA = S 1 eB S ,
where eB is given by the following matrices:

e
0
cos b sin b
a
(I)
(II) e
sin b
cos b
0 e
(III) e
1 0
1 1
From these it is easy to reconstruct the phase portrait of any linear

vector field x 7 Ax on R2 .
There are seventeen different types, which can be identified by the
trace and determinant of A:
1. Case (I)
(a) Foci [ 2 = 4, A diagonal]
(i) = < 0: sink [ < 0, > 0]
(ii) 0 < = : source [ > 0, > 0]
(iii) 0 = = : degenerate [ = 0, = 0]
(b) Nodes [ 2 > 4]
(i) < = 0: degenerate [ < 0, = 0]
(ii) < < 0: sink [ < 0, > 0]
(iii) 0 < < : source [ > 0, > 0]
(iv) 0 = < : degenerate [ > 0, = 0]
(c) Saddles [ < 0]
(i) < 0 <
2. Case (II) [ 2 < 4]
(a) Centres [ = 0]
(i) a = 0, b < 0: periodic, counterclockwise
(ii) a = 0, 0 < b: periodic, clockwise
(b) Spirals
(i) a < 0, b < 0: sink, counterclockwise [ < 0]
(ii) a < 0, 0 < b: sink, clockwise [ < 0]
(iii) 0 < a, b < 0: source, counterclockwise [ > 0]
(iv) 0 < a, 0 < b: source, clockwise [ > 0]
3. Case (III) [ 2 = 4, A not diagonal]
18
2001/2002
(a) Improper Nodes

(i) = 0: degenerate [ = 0]
(ii) < 0: sink [ < 0]
(iii) 0 < : source [ > 0]
In a sink, the flows approach the origin, whereas in a source they
move away from it.
3.6. Inhomogeneous systems. Consider the following inhomogeneous nonautonomous linear differential equation for x : R Rn ,
x0 = Ax + B ,
(11)
where B : R R is a continuous map. The method of variation of

parameters consists of seeking a solution of the form
n
x(t) = etA f (t) ,
(12)
where f : R Rn is some differentiable curve. (This represents no loss

of generality since etA is invertible for all t.) Differentiating (12) using
the Leibniz rule yields
x0 (t) = AetA f (t) + etA f 0 (t) = Ax(t) + etA f 0 (t) .
Since x is assumed to be a solution of (11), we see that
f 0 (t) = etA B(t) ,
which can be integrated to yield
Z t
f (t) =
esAB(s) ds + K ,
0
for some K R . Solving for x(t) we finally have

Z t

tA
sA
x(t) = e
e B(s) ds + K ,
n
for some K Rn . Moreover every solution is of this form. Indeed, if y :

R Rn is another solution of (11), then x y solves the homogeneous
equation (x y)0 = A(x y). By Theorem 3.2, it has the form x y =
etA K0 for some constant K0 Rn , whence y has the same form as x
with possibly a different constant K Rn .
Theorem 3.4. Let u(t) be a particular solution of the inhomogeneous
linear differential equation
x0 = Ax + B .
(11)
Then every solution of (11) has the form u(t) + v(t) where v(t) is a
solution of the homogeneous equation
x0 = Ax .
(4)
Conversely, the sum of a solution of (11) and a solution of (4) is a

solution of (11).
M341 ODE
19
3.7. Higher order equations. Consider the nth order linear differential equation
s(n) + a1 s(n1) + + an1 s0 + an s = 0 ,
(13)
where s : R R. This equation is equivalent to the first order equation

x0 = Ax, where x : R Rn sends t 7 (s, s0 , . . . , s(n1) ), and where
0
1
0 0
0
0
1
.
.
.
..
.. 0
A = ..
(14)
.
0
0
1
an an1
a1
Proposition 3.5. The characteristic polynomial of the matrix A in
(14) is
pA () = n + a1 n1 + + an .
This results says that we can read off the eigenvalues of the matrix
A directly from the differential equation.
Notice that if s(t) and q(t) solve (13) then so do s(t) + q(t) and ks(t)
where k R. In other words, the solutions of (13) form a vector space.
This is an n-dimensional vector space, since the n initial conditions
s(0), s0 (0),..., s(n1) (0) uniquely determine the solution.
At a conceptual level this can be understood as follows. Let F denote the (infinite-dimensional) vector space of smooth (i.e., infinitely
differentiable) functions s : R R. Let L : F F denote the linear
map
s 7 s(n) + a1 s(n1) + + an1 s0 + an s .
A function s(t) solves equation (13) if and only if it belongs to the
kernel of L, which is a subspace of F.
Higher order inhomogeneous equations
s(n) + a1 s(n1) + + an1 s0 + an s = b ,
for b : R
system
(15)
R can also be solved by solving the associated first order

x0 = Ax + B ,
where A is given by (14) and B : R Rn is given by
0
0
B(t) =
... .
b(t)
(11)
20
2001/2002
Problems
(Some of the problems are taken from Hirsch & Smale, Chapters 3,4
and 5.)
Problem 3.1. Solve the following initial value problems:
(a) x0 = x
(b) x01 = 2x1 + x2
y 0 = x + 2y
x02 = x1 + x2
x(0) = 0 y(0) = 3
x1 (1) = 1 x2 (1) = 1
(c) x0 = Ax
(d) x0 = Ax
x(0) = (0, b, b)
2
0
0
0
A = 0 1
0
2 3
x(0) = (0, 3)

0
3
A=
1 2
Problem 3.2. Find a 22 matrix A such that one solution to x0 = Ax

is
x(t) = (e2t et , e2t + 2et ) .
Problem 3.3. Show that the only solution to
x01 = x1
x02 = x1 + x2
x1 (0) = a x2 (0) = b
is
x1 (t) = a et
x2 (t) = (a t + b)et .
(Hint: If (y1(t), y2 (t)) is any other solution, consider the functions
et y1 (t) and et y2 (t).)
Problem 3.4. Let a linear transformation A : Rn Rn have real,
distinct eigenvalues. What condition on the eigenvalues is equivalent
to limt kx(t)k = for every solution x(t) to x0 = Ax?
Problem 3.5. Suppose that the n n matrix A has real, distinct
eigenvalues. Let t 7 (t, x0 ) be the solution to x0 = Ax with initial
value (0, x0 ) = x0 . Show that for each fixed t,
lim (t, y0 ) = (t, x0 ) .
y0 x0
This means the solutions depend continuously on the initial conditions.

(Hint: Suppose that A is diagonal.)
Problem 3.6. Consider the second order differential equation
x00 + bx0 + cx = 0
b and c constant.
(16)
M341 ODE
21
(a) By examining the equivalent first order system

x0 = y
y 0 = cx by ,
show that if b2 4c > 0, then (16) has a unique solution x(t) for
every initial condition of the form x(0) = u and x0 (0) = v.
(b) If b2 4c > 0, what assumption about b and c ensures that
limt x(t) = 0 for every solution x(t)?
(c) Sketch the graphs of the three solutions of
x00 3x0 + 2x = 0
for the initial conditions
x(0) = 1
and
x0 (0) = 1, 0, 1 .
Problem 3.7. Let a 2 2 matrix A have real, distinct eigenvalues

and . Suppose that an eigenvector of eigenvalue is (1, 0) and
an eigenvector of eigenvalue is (1, 1). Sketch the phase portraits of
x0 = Ax for the following cases:
(a) 0 < <
(b) 0 < <
(c) < < 0
(d) < 0 <
(e) = 0 > 0
Problem 3.8. Let A be a nn real matrix. Prove that its eigenvalues

are either real or come in complex conjugate pairs.
Problem 3.9. Solve the following initial value problems and sketch
their phase portraits:
(a) x0 = y
(b) x01 = 2x2
y0 = x
x02 = 2x1
x(0) = 1 y(0) = 1
x1 (c) = 0 x2 (0) = 2
(c) x0 = y
y 0 = x
x(0) = 1 y(0) = 1
(d) x0 = Ax
x(0) = (3, 9)

1 2
A=
2
1

a b
Problem 3.10. Let A =
and let x(t) be a solution of x0 =
b
a
Ax, not identically zero. Show that the curve x(t) is of the following
form:
(a) a circle if a = 0;
(b) a spiral inward toward (0, 0) if a < 0 and b 6= 0;
(c) a spiral outward away from (0, 0) if a > 0 and b 6= 0.
What effect has the sign of b on the spirals in (b) and (c)? What is the
phase portrait if b = 0?
22
2001/2002
Problem 3.11. Sketch the portraits of:

(a) x0 = 2x
(b) x0 = x + z
y 0 = 2z
y 0 = 3y
z 0 = 2y
z 0 = x z
Which solutions tend to 0 as t ?

a b
Problem 3.12. Let A =
. Prove that the solutions of x0 =
b
a
Ax depend continuously on initial values. (See Problem 3.5.)
Problem 3.13. Solve the initial value problem
x0 = 4y
y0 = x
x(0) = 0
Problem 3.14. Solve
matrices A:
(a) 0
0
y(0) = 7 .
the equation x0 = Ax, for each of the following
0
1
0 2
1
0
0 0
15
(b) 1 0 17
0 1
7
Problem 3.15. Let A(t) and B(t) be two n n matrices depending

differentiably on t. Prove the following version of the Leibniz rule:
(A B)0 = A0 B + A B 0 .
Deduce from this that if A(t) is invertible for all t, then
0
A1 = A1 A0 A1 .
Problem
3.16. Compute the exponentials of the following matrices
(i = 1):

5 6
2 1
2 1
(a)
(b)
(c)
3 4
1
2
1 1

0 1
0 1 2
2 0 0
(d)
1 0
(e) 0 0 3
(f) 0 3 0
0 0 0
0 1 3
0 0

(g) 1 0
i
0
i+1
0
(h)
(i)
0 1
0 i
2
1+i
1 0 0 0
1 0 0 0
(j)
1 0 0 0
1 0 0 0
M341 ODE
23
Problem 3.17. For each matrix T in Problem 3.16 find the eigenvalues of eT .
Problem 3.18. Find an example of two linear transformations A, B
on R2 such that
eA+B 6= eA eB .
Problem 3.19. If AB = BA, prove that eA eB = eB eA and eA B =
BeA .
Problem 3.20. Let a linear transformation A : Rn Rn leave invariant a subspace E Rn (that is, Ax E for all x E). Show that eA
also leaves E invariant.
Problem
3.21. Show that there is no real 2 2 matrix S such that

1
0
S
e =
.
0 4
Problem 3.22. Find the general solution to each of the following systems:
(
(
x0 = 2x y
x0 = 2x y
(b)
(a)
y 0 = 2y
y 0 = x + 2y
(
0
0
x =y
x = 2x
(c)
(d)
y 0 = x 2y
y0 = x
z 0 = y 2z
x = y + z
(e)
y0 = z
z 0 = 0
Problem 3.23. In (a), (b) and (c) of Problem 3.22, find the solutions
satisfying each of the following initial conditions:
(a) x(0) = 1, y(0) = 2;
(b) x(0) = 0, y(0) = 2;
(c) x(0) = 0, y(0) = 0.
Problem 3.24. Let A : Rn Rn be a linear transformation that
leaves a subspace E Rn invariant. Let x : R Rn be a solution of
x0 = Ax. If x(t0 ) E for some t0 R, show that x(t) E for all t R.
Problem 3.25. Prove that if the linear transformation A : Rn Rn
has a real eigenvalue < 0, then the equation x0 = Ax has at least one
nontrivial solution x(t) such that limt x(t) = 0.
Problem 3.26. Let A : R2 R2 be a linear transformation and suppose that x0 = Ax has a nontrivial periodic solution, u(t): this means
that u(t + p) = u(t) for some p > 0. Prove that every solution is
periodic with the same period.
24
2001/2002
Problem 3.27. If u : R Rn is a nontrivial solution of x0 = Ax, then

show that
d
1
kuk =
hu, Aui .
dt
kuk
Problem 3.28. Classify and sketch the phase portraits of planar differential equations x0 = Ax, with A : R2 R2 linear, where A has zero
as an eigenvalue.
Problem 3.29. For each of the following matrices A consider the corresponding differential equation x0 = Ax. Decide whether the origin is
a sink, source, saddle or none of these. Identify in each case those vectors u such that limt x(t) = 0, where x(t) is the solution of x0 = Ax
with x(0) = u:

1
0
1 2
2 1
(a)
(b)
(c)
2 2
0 2
1 1

1 2
1 2
(d)
(e)
1 1
2
4
Problem 3.30. Which values (if any) of the parameter k in the following matrices makes the origin a sink for the corresponding differential
equation x0 = Ax?

3 0
a k
(b)
(a)
k
2
k 4
2
k 1
0 1 0
(c)
0 k
0 0
(d) 1
1
0 k
Problem 3.31. Let t : R2 R2 be the flow corresponding to the
equation x0 = Ax. (That is, t 7 t (x) is the solution passing through
x at t = 0.) Fix > 0, and show that is a linear map R2 R2 .
Then show that preserves area if and only if tr A = 0, and that in
this case the origin is neither a sink nor a source.
(Hint: A linear transformation is area-preserving if and only if its
determinant is 1.)
Problem 3.32. Describe in words the phase portraits of x0 = Ax for
the following matrices A:

2 0
2 0
1 0
2 0
(a)
(b)
(c)
(d)
.
0 2
0 1
0 2
1 2
Problem 3.33. Let T be an invertible linear transformation on Rn , n
odd. Show that x0 = T x has a nonperiodic solution.

a b
Problem 3.34. Let Let A =
have nonreal eigenvalues. Show
c d
that b 6= 0 and that the nontrivial solution curves to x0 = Ax are spirals
M341 ODE
25
or ellipses that are oriented clockwise if b > 0 and counterclockwise if

b < 0.
Problem 3.35. Find all solutions of the following equations:
(a) x0 4x cos t = 0
(b) x0 4x t = 0
(c) x0 = y
y0 = 2 x
(d) x0 = y
(e) x0 = x + y + z
y 0 = 4x + sin 2t
y 0 = 2y + t
z 0 = 2z + sin t
Problem 3.36. Which of the following functions satisfy an equation

of the form s00 + as0 + bs = 0 (a, b constant)?
(b) t2 t
(a) tet
(d)
cos 2t + 2 sin 3t
(e) e
cos 2t
(c)
cos 2t + 3 sin 2t
(f) et + 4
(g) 3t 9
Problem 3.37. Find solutions of the following equations having the
specified initial values:
(a) s00 + 4s = 0; s(0) = 1, s0 (0) = 0.
(b) s00 3s0 6s = 0; s(1) = 0, s0 (1) = 1.
Problem 3.38. For each of the following equations find a basis for
the solutions; that is, find two solutions s1 (t) and s2 (t) such that every
solution has the form s1 (t) + s2 (t) for suitable constants , :
(a) s00 + 3s = 0
(b) s00 3s = 0
(c) s00 s0 6s = 0
(d) s00 + s0 + s = 0
Problem 3.39. Suppose that the roots of the quadratic equation 2 +

a + b = 0 have negative real parts. Prove that every solution of the
s00 + as0 + bs = 0
satisfies
lim s(t) = 0 .
State and prove a generalisation of this result for nth order differential
equations
s(n) + a1 s(n1) + + an s = 0 ,
where the polynomial
n + a1 n1 + + an = 0
has n distinct roots with negative real parts.
26
2001/2002
Problem 3.40. Under what conditions on the constants a, b is there

a nontrivial solution to s00 +as0 +bs = 0 such that the equation s(t) = 0
has
(a) no solution;
(b) a positive finite number of solutions;
(c) infinitely many solutions?
Problem 3.41. For each of the following equations sketch the phase
portrait of the corresponding first order system. Then sketch the graph
of several solutions s(t) for different initial conditions:
(a) s00 + s = 0
(b) s00 s = 0
(d) s00 + 2s0 = 0
(e) s00 s0 + s = 0
(c) s00 + s0 + s = 0
Problem 3.42. Which equations s00 + as0 + bs = 0 have a nontrivial

periodic solution? What is the period?
Problem 3.43. Find all solutions to
s000 s00 + 4s0 4s = 0 .
Problem 3.44. Find all pairs of functions x(t), y(t) that satisfy the
system of differential equations
x0 = y
y 00 = x y + y 0 .
Problem 3.45. Find a real-valued function s(t) such that
s00 + 4s = cos 2t
s(0) = 0,
s0 (0) = 1.
Problem 3.46. Let q(t) be a polynomial of degree m. Show that any

equation
s(n) + a1 s(n1) + + an s = q
has a solution which is a polynomial of degree m.
M341 ODE
27
4. Stability of nonlinear systems

Having studied linear vector fields in the previous section, we now
start the study of the general equation x0 = f (x). Explicit solutions
for such equations usually cannot be found, so it behoves us to ask
different questions which can be answered. Two approaches suggest
themselves: approximating the solutions (either analytically or numerically) or studying those qualitative properties of the equation which
do not require knowing the solution explicitly. This is what we do in
this section.
4.1. Topology of Rn . We first recall some basic definitions concerning
subsets of Rn .
Let x Rn be any point and > 0 a real number. The (open)
-ball about x, denoted B (x), is the set
B (x) = {y Rn | ky xk < }
consisting of points in Rn which are a distance less than from x.

Similarly, one defines the closed -ball about x, denoted B (x), to
be the set
B (x) = {y Rn | ky xk } .
A subset U of Rn is said to be open if U contains some -ball about

every point. In other words, if for every x U, there is some (which
may depend on x) such that B (x) U. A subset of Rn is closed if its
complement is open. By convention, the empty set and Rn itself are
both open and closed. They are the only two subsets of Rn with this
property. Notice that a subset need not be either open or closed.
By a neighbourhood of a point x Rn we will mean any open
subset of Rn containing the point x.
The closure of a subset A Rn is defined as the intersection of all
Equivalently x A if every
closed sets containing A. It is denoted A.
neighbourhood of x intersects A.
Let A be a closed subset of Rn and let U A. We say that U is
dense in A if U = A.
A subset of Rn is bounded if it is contained in some closed ball
about the origin. A subset which is both closed and bounded is called
compact. A continuous function defined on a compact set always
attains its maximum and minimum inside the set.
A subset U of Rn is connected if there are continuous curves in U
joining any two points in U. More precisely, U is connected if given
x, y U there is a continuous curve c : [0, 1] U such that c(0) = x
and c(1) = y.
A subset U of Rn is simply connected if any closed continuous
curve in U is continuously deformable to a constant curve.
We will need the following plausible result from the topology of the
plane R2 .
28
2001/2002
Theorem 4.1 (Jordan Curve Theorem). A closed curve in R2 which

does not intersect itself separates R2 into two connected regions, a
bounded one (the interior of the curve) and an unbounded one (the
exterior of the curve).
4.2. Existence and uniqueness. Let U Rn be an open subset
and let f : U Rn be a continuous vector field. A solution of the
x0 = f (x)
is a curve u : I U, where I
satisfying
u0 (t) = f (u(t))
(17)
is some interval in the real line,

for all t I.
The interval I need not be finite and need not be either open nor
closed: [a, b], [a, b), (a, b], (a, b), (, b], (, b), (a, ) and [a, )
are all possible.
Let U Rn be an open set. A vector field f : U Rn defined on U
is said to be C 1 , if it is continuously differentiable; that is, all the n2
partial derivatives are continuous functions U R.
Theorem 4.2. Let U Rn be open, let f : U Rn be a C 1 vector
field and let x0 U. Then there exists a > 0 and a unique solution
x : (a, a) U
of (17) with x(0) = x0 .
There are two significant differences from the linear case: we may
not be able to take U to be all of Rn and we may not be able to extend
the solution from (a, a) to the whole real line.
To illustrate this second point, consider the vector field f : R R
given by f (x) = 1 + x2 . The solution of x0 = f (x) is
x(t) = tan(t c) ,
where c is some constant. Clearly this solution cannot be extended
beyond |t c| < /2. Such vector fields are said to be incomplete.
The differentiability condition on the vector field is necessary. For
example, consider the vector field f : R R given by f (x) = 3x2/3 .
Then both x(t) = 0 and x(t) = t3 solve x0 = f (x) with x(0) = 0. Thus
there is no unique solution. This does not violate the theorem because
f 0 (x) = 2x1/3 is not continuously differentiable at x = 0.
The proof of this theorem is given in Foundations of Analysis, but
we can sketch the idea. Briefly, suppose x(t) solves the initial value
problem
x0 = f (x)
x(0) = x0 .
(18)
M341 ODE
29
Then let us integrate the equation x0 (s) = f (x(s)) from s = 0 to t:

Z t
x(t) = x0 +
f (x(s)) ds .
(19)
0
This integral equation is equivalent to the initial value problem (18), as

can be seen by evaluating at t = 0 and by differentiating with respect
to t both sides of the equation. Therefore it is enough to show that
equation (19) has a solution. Let P denote the operator which takes a
function u : I U to the function P u : I U, defined by
Z t
(P u)(t) = x0 +
f (u(s)) ds ,
0
where I is some interval in the real line. Then u solves (19) if and only
if it is a fixed point of the operator P . This suggests the following
iterative scheme (called Picards iteration method). One defines a
sequence x1 , x2 , . . . of functions where
Z t
x1 (t) = x0 +
f (x0 ) ds = x0 + f (x0 )t
0
Z t
x2 (t) = x0 +
f (x1 (s)) ds
0
..
.
xk+1 (t) = x0 +
f (xk (s)) ds .
0
It can be shown that for f a C 1 vector field, the sequence of functions

(xi ) converges uniformly in a < t < a, for some a, to a unique fixed
point x of P .
In fact, for f (x) = Ax a linear vector field, it is easy to see that

tk k
xk (t) = I + tA + + A x0 ,
k!
so that (xk ) converges to etA x0 as expected.
Notice however that despite the constructive nature of the proof,
it is impractical for most f to use Picards method to find the solution.
Moreover even when the iterates xk approximate the solution, they
will stillmiss important qualitative behaviour. For example, if A =

0 1
, then x0 = Ax and x(0) = x0 has unique solution
1 0

cos t sin t
tA
x(t) = e x0 =
x0 ,
sin t cos t
which is periodic with period 2. However no term in the sequence of
functions (xk ) obtained from Picards method is ever periodic.
30
2001/2002
4.3. Linearisation about equilibrium points. Let U Rn be an

open set and let f : U Rn be a C 1 vector field defined on U. Suppose
that f (
x) = 0 for some x U. Then the constant curve x(t) = x
solves the differential equation (17) and by Theorem 4.2 it is the unique
solution passing through x. Therefore if x(t) = x at some time t then
it will always be (and has always been) at x. Such a point x is called
an equilibrium point of the differential equation. It is also called a
zero or a critical point of the vector field f .
We will be mostly interested in isolated equilibrium points, which
are those equilibrium points x
such that some -ball about x
contains
no other equilibrium points.
For example, a linear vector field f (x) = Ax has a zero at the origin.
This is an isolated critical point provided that A is nondegenerate; that
is, det A 6= 0. In fact, in this case, it is the only critical point. (See
Problem 4.2.)
A general vector field may have more than one critical point. For
example, the simple pendulum is described by the vector field f (x, y) =
(y, sin x) which has an infinite number of zeros (n, 0) for n Z.
When faced with a general vector field, our strategy will be to try
and piece together the phase portrait of the differential equation from
the phase portraits near each of the critical points. The basic idea here
is to linearise the equation at each of the critical points.
Let x U be a critical point of a C 1 vector field f : U Rn
defined on some open subset U of Rn . By changing coordinates in
Rn to y = x x if necessary, we can assume that x = 0. Since f
is continuously differentiable, the derivative matrix Df exists and in
particular
kf (x) Df (0)xk
lim
=0,
x0
kxk
where we have used that f (0) = 0. We see that f (x) is approximated
by the linear vector field Ax = Df (0)x at 0. We call A = Df (0) the
linear part (or the linearisation) of f around 0. More generally, if
x is any critical point, then
kf (x) Df (
x)(x x)k
lim
=0,
x
x
kx xk
hence Df (
x) linearises f near x.
Notice that a vector field f may have an isolated zero x, but its
linearisation Df (
x) does not have an isolated zero (cf. Problem 4.3).
4.4. Stability. Roughly speaking an equilibrium point is stable if trajectories which start near the equilibrium point remain nearby. A more
precise definition is the following.
Definition 4.3. Let f : U Rn be a C 1 vector field defined on an
open subset U of Rn . Let x U be a zero of the vector field. Then
M341 ODE
31
x is stable if for every neighbourhood W U of x, there is a neighbourhood W1 W such that every solution to (17) with x(0) W1 is
defined and in W for all t > 0. If W1 can be chosen so that in addition
limt x(t) = x, then x is asymptotically stable.
If x is not stable, it is unstable. This means that there exists one
neighbourhood W of x such that for every neighbourhood W1 W of
x, there is at least one solution x(t) starting at x(0) W1 which does
not lie entirely in W .
An equivalent - definition of (asymptotic) stability is given in Problem 4.4.
Stable equilibria which are not asymptotically stable are sometimes
called neutrally stable.
One should note that limt x(t) = x on its own does not imply
stability. (There are counterexamples, but they are quite involved.)
Let x 7 Ax be a linear vector field on Rn . Then the origin is called
a (linear) sink if all the eigenvalues of A have negative real parts.
More generally, a zero x of a C 1 vector field f : U Rn is called a
(nonlinear) sink if all the eigenvalues of the linearisation Df (
x) have
negative real parts.
A linear sink is asymptotically stable, whereas a centre is stable
but not asymptotically stable. Saddles and sources, for example, are
unstable.
The following theorems tell us to what extent we can trust the stability properties of the linearisation of a nonlinear vector field.
Theorem 4.4. Let f : U Rn be a C 1 vector field defined on an open
subset of Rn and let x U be a sink. Then there is a neighbourhood
W U of x such that if x(0) W then x(t) is defined and in W for
all t > 0 and such that limt x(t) = x.
Theorem 4.5. Let U Rn be open and f : U Rn be a C 1 vector
field. Suppose that x is a stable equilibrium point of the equation (17).
Then no eigenvalue of Df (
x) has positive real part.
Morally speaking, these two theorems say that if the linearised systems is unstable or asymptotically stable, then so will be the nonlinear
system in a small enough neighbourhood of the equilibrium point. If
the linearised system is stable but not asymptotically stable, then we
cannot say anything about the nonlinear system. (See Problem 4.12.)
4.5. Liapunov stability. In those cases where linearisation about an
equilibrium point sheds no light on its stability properties (because the
linearisation is neutrally stable, say) a method due to Liapunov can
help. Throughout this section we will let f : U Rn be a C 1 vector
field defined on an open subset U of Rn , and we will let x U be such
that f (
x) = 0.
32
2001/2002
Let E : W R be a differentiable function defined in a neighbourhood W U of x. We denote E : W R the function defined by
E(x)
= DE(x)f (x) .
Here the right-hand side is simply the operator DE(x) applied to the
vector f (x). If we let t (x) denote the solution to (17) passing through
x when t = 0, then

d
E(x)
= E(t (x))
dt
t=0
by the chain rule.
Definition 4.6. A Liapunov function for x is a continuous function
E : W R defined on a neighbourhood W U of x, differentiable on
W x, such that
(a) E(
x) = 0 and E(x) > 0 if x 6= x;
(b) E 0 in W x.
If in addition, E satisfies
(c) E < 0 in W x,
then it is said to be a strict Liapunov function for x
.
We can now state the stability theorem of Liapunov.
Theorem 4.7 (Liapunov Stability Theorem). Let x be an equilibrium
point for (17). If there exists a (strict) Liapunov function for x then x
is (asymptotically) stable.
Proof. Let > 0 be so small that the closed -ball about x
lies entirely
in W . Let be the minimum value of E on the boundary of this x) of radius centred at x. From (a) we know that
ball, the sphere S (
> 0. Let
n
o
W1 = x B (
x) | E(x) < .
x) since, by (b),
Then no solution starting inside W1 can meet the S (
E is non-increasing on solution curves. Therefore x is stable.
Now assume that (c) also holds so that E is strictly decreasing on
solution curves in W x. Let x(t) be a solution starting in W1 x and
and consider E(x(t)). Showing that limt E(x(t)) = 0 is equivalent
to showing that limt x(t) = x. Since E(x(t)) is strictly decreasing
and bounded below by 0, L := limt E(x(t)) exists. We claim that
L = 0.
Assume for a contradiction that L > 0 instead. Then by the same
argument as in the first part, we deduce that there is some smaller
sphere S ( < r) such that E(x) < L for all points x inside S .
Since E is continuous, it attains a maximum M in the spherical shell
A,R bounded by S and SR . Because E is negative definite, M is
negative. Now consider any solution curve starting inside Sr at time 0,
M341 ODE
33
but outside S . By stability it remains inside SR , and since E is strictly

decreasing and limt E(x(t)) = L, it remains outside the sphere S .
Therefore at any later time t,
Z t
E(x(s))ds
E(x(t)) = E(x(0)) +
0
E(x(0)) M t ;
but no matter how small M is, for t large enough the right hand side
will eventually be negative, contradicting the positive-definiteness of
E.
One can picture this theorem in the following way. Near x
, a Liapunov function has level sets which look roughly like ellipsoids containing x. One can interpret the condition that E is decreasing geometrically, as saying that at any point on the level set of E, the vector field
f (x) points to the inside of the ellipsoid. If E is merely non-increasing,
then the vector field may also point tangential to the ellipsoid; but in
either case, once inside such an ellipsoid, a solution curve can never
leave again.
There is also a similar result (which we state without proof) concerning the instability of a critical point.
Theorem 4.8 (Liapunov Instability Theorem). Let E : W R be
a continuous function defined on a neighbourhood W U of x and
differentiable in W x. Then if
(a) E > 0 in W x, and
(b) every closed ball centred at x and contained in W contains a point
x where E(x) > 0,
then x is an unstable equilibrium point.
4.6. Stability and gradient fields. Let U Rn be open and let
V : U R be a C 2 function (twice continuously differentiable). Let
f : U Rn be the associated gradient vector field f (x) = grad V (x),
as discussed in Section 2.5.
It follows from the chain rule that
V (x) = kf (x)k2 0 .
Moreover V (x) = 0 if and only if grad V (x) = 0, so that x is an equilibrium point of the gradient system x0 = grad V (x). This, together
with the observations in Section 2.5, allows us to characterise the gradient flows geometrically.
Theorem 4.9. Consider the gradient dynamical system
x0 = grad V (x) .
At regular points, where grad V (x) 6= 0, the solution curves cross level
surfaces orthogonally. Nonregular points are equilibria of the system.
Isolated minima are asymptotically stable.
34
2001/2002
4.7. Limit cycles and the Poincar

eBendixson theorem. This
section deals exclusively with planar vector fields.
We saw in Problem 3.26 that if a linear vector field x 7 Ax in the
plane admits a nontrivial periodic solution, then all solutions are periodic with the same period. The situation is very different for nonlinear
planar vector fields. These may admit nontrivial periodic solutions
which are limits of other nonperiodic trajectories. These solutions are
known as limit cycles.
Let t 7 t (u) denote the solution curve to x0 = f (x) which passes
by u at t = 0. We define the future limit point set of u to be all
those x U such that there is a increasing infinite sequence (ti )
such that in this limit ti u x. Similarly we define the past limit
point set of u by taking instead decreasing sequences (ti ) .
Definition 4.10. By a closed orbit of a dynamical system we mean
the image of a nontrivial periodic solution. That is, one which is not an
equilibrium point and for which p (x) = x for some x and p 6= 0.
It follows that np (y) = y for all y and n Z.
A limit cycle for the planar system x0 = f (x) is a closed orbit with
the property that it is contained either in the past or future limit point
set of some point not contained in .
We mention several results without proofs concerning the existence
of limit cycles in planar nonlinear differential equations.
Theorem 4.11. A limit cycle encloses an equilibrium point.
Let f : U R2 be a C 1 vector field defined on an open subset U of
the plane.
Theorem 4.12 (Bendixson Negative Criterion). If W U is simplyconnected and the divergence of f does not change sign in W , the there
is no limit cycle in W .
Theorem 4.13 (PoincareBendixson Theorem). Let K U be a compact subset which contains no equilibrium points. If t 7 x(t) is a solution curve which starts in K and remains in K for all t > 0, then it is
itself a limit cycle or spirals towards one. In either case the subset K
contains a limit cycle.
Problems
(Some of the problems are taken from Hirsch & Smale, Chapters 8
and 9.)
Problem 4.1. Write the first few terms of the Picard iteration scheme
for each of the following initial value problems. Where possible, use
any method to find explicit solutions and discuss the domain of the
solution.
(a) x0 = x + 2, x(0) = 2
M341 ODE
(b)
(c)
(d)
(e)
x0
x0
x0
x0
35
= x4/3 , x(0) = 0
= x4/3 , x(0) = 1
= sin x, x(0) = 0
= 1/(2x), x(1) = 1
Problem 4.2. Consider the linear equation x0 = Ax. Show that if

det A = 0 then there are an infinite number of critical points, none of
which is isolated.
Problem 4.3. Write down an example of a C 1 vector field f having
an isolated zero, say x, but whose linearisation Df (
x) about x does
not.
Problem 4.4. Let U Rn be open, f : U Rn be a C 1 vector field
and x U a zero of f . Prove that x is stable if and only if given any
> 0 such that the open -ball about x
is contained in U, there exists
a 0 < such that for any solution of x0 = f (x),
kx(0) xk < = kx(t) xk < ,
for all t 0.
Problem 4.5. Show by example that if f is a nonlinear C 1 vector field
and f (0) = 0, it is possible that limt x(t) = 0 for all solutions to
x0 = f (x), without the eigenvalues of Df (0) having negative real parts.
Problem 4.6. (a) Let x be a stable equilibrium of a differential equation corresponding to a C 1 vector field on an open subset U Rn .
Show that for every neighbourhood W of x in U, there is a neighbourhood W 0 of x in W such that every solution curve x(t) with
x(0) W 0 is defined an in W 0 for all t > 0.
(b) If x is asymptotically stable, the neighbourhood W 0 in (a) can
be chosen with the additional property that limt x(t) = x if
x(0) W 0 .
(Hint: Consider the set of all points of W whose trajectories for
t 0 enter the set W1 in Definition 4.3.)
Problem 4.7. For which of the following linear operators A on Rn is
0 Rn a stable equilibrium of x0 = Ax?

0 1
1
2
(a) A = O
(b)
(c)
0 0
2 2
0 1
0 1
0 0
1
0
1
0
0
0
(d)
(e)
0 1
1
0
0 1
1 0
0
0 1 0
36
2001/2002
Problem 4.8. Each of the following linear autonomous systems has

the origin as an isolated critical point. Determine the nature and stability properties of that critical point.
(
(
x0 = 2x
x0 = x 2y
(a) 0
(b) 0
y = 3y
y = 4x 5y
(
(
x0 = 3x + 4y
x0 = 5x + 2y
(c) 0
(d) 0
y = 2x + 3y
y = 17x 5y
(
(
x0 = 4x y
x0 = 4x 2y
(e) 0
(f) 0
y = x 2y
y = 5x + 2y
Problem 4.9. Consider the autonomous system
(
x0 = ax + by + e
y 0 = cx + dy + f ,
where ad bc 6= 0 and e, f are constants.
(a) Show that it has an isolated critical point, say (x0 , y0 ).
(b) Introduce new variables x = x x0 and y = y y0 , and show that
in terms of these variables the system is of the form
(
x0 = a
x + b
y
0
y .
y = cx + d
Find a
, b, c, d in terms of a, b, c, d. Deduce therefore that this system has a critical point at the origin.
(c) Find the critical point of the system
(
x0 = 2x 2y + 10
y 0 = 11x 8y + 49,
and determine its nature and stability properties.
Problem 4.10. Consider the linear dynamical system x0 = Ax in the
plane R2 . Discuss the stability of the origin as an equilibrium point if:
(a) A is symmetric (AT = A) and positive definite;
(b) A is symmetric and negative definite;
(c) A is skew-symmetric (AT = A);
(d) A = B + C, where B is symmetric negative definite and C is
skew-symmetric.
Problem 4.11. Show that the dynamical system in
tions in polar coordinates are
(
r2 sin(1/r) r > 0
0 = 1
r0 =
0
r=0
R2
whose equa-
M341 ODE
37
has a stable equilibrium at the origin.

(Hint: Every neighbourhood of the origin contains a solution curve
encircling the origin.)
Problem 4.12. This problem illustrates that if the linearised system
has a centre, then the nonlinear stability is undecided. Consider the
two autonomous systems
(
(
x0 = y + x(x2 + y 2 )
x0 = y x(x2 + y 2 )
(a) 0
(b)
.
y = x + y(x2 + y 2 )
y 0 = x y(x2 + y 2 )
Both systems have a critical point at the origin. Linearise the system
and show that that both systems have a centre at the critical point.
Solve the nonlinear systems by changing coordinates to polar coordinates (r, ), and show that the two systems have opposite stability
properties. Which system is stable and which is unstable?
Problem 4.13. Sketch the phase portrait of the equation x00 = 2x3
and show that it has an unstable critical point at the origin.
Problem 4.14. Consider the following second order equation:
x00 + 2x0 + 2 x = 0 ,
where 0 and > 0 are constants.
(a) Write down the equivalent autonomous system and show that it
has a unique critical point at the origin of the phase plane.
(b) For each of the following cases, describe the nature and stability
properties of the critical point:
(i) = 0; (ii) 0 < < ; (iii) = ; and (iv) > .
Problem 4.15. Consider the following second order equation:
x00 + x + x3 = 0 .
Find the trajectories, critical points and their nature and stability properties.
Problem 4.16. For each of the following functions V (u), sketch the
phase portrait of the gradient flow u0 = grad V (u). Identify the
equilibria and classify them as to stability or instability. Sketch the
level surfaces of V on the same diagram.
(a) x2 + 2y 2
(b) x2 y 2 2x + 4y + 5
(c) y sin x
(d) 2x2 2xy + 5y 2 + 4x + 4y + 4
(e) x2 + y 2 z
(f) x2 (x 1) + y 2 (y 2) + z 2
Problem 4.17. Find the type of critical point at the origin of the
following system:
(
x0 = x + y x(x2 + y 2 )
.
y 0 = x + y y(x2 + y 2 )
38
2001/2002
Derive an equation for E = x2 + y 2 and show that E = 0 for E = 1.

Solve the system and sketch the trajectories spiralling out and in to
this limit cycle.
Problem 4.18. Show that the following system has a periodic solution:
(
2
2
x0 = 3x y xex +y
.
2
2
y 0 = x + 3y yex +y
Problem 4.19. Consider the equation
x00 + x x2 1 = 0 ,
where 0 < < 1. Sketch the phase portrait of the system for 0 < <
and for 14 < < 1 and compare the nature of the trajectories.
1
4
Problem 4.20. Consider a linear system

x0 = Ax ,
with A an n n constant invertible matrix.
This system has a unique critical point at the origin.

(a) If A is skew-symmetric, show that the energy function E = kxk2
is conserved along trajectories.
(b) If there exists M such that MA + AT M is negative definite, show
that the energy function E = hx, Mxi is decreasing (E < 0) along
trajectories. If in addition M is positive-definite, use Liapunov
stability to deduce that the origin is a stable critical point, at
least when n = 2.
(c) Let A = S 1 NS, so that relative to the new variables y = Sx,
the system becomes
y 0 = Ny .
Suppose that there exists a positive-definite matrix M satisfying
MA + AT M is negative definite (resp. negative semi-definite).
Show that there exists a positive-definite matrix P such that P N +
N T P is negative definite (resp. negative semi-definite).
(d) For each of the stable critical points of 2 2 autonomous systems
find a suitable Liapunov function which detects the precise type
of stability: neutral or asymptotic.
Hint: Use part (c) to argue that it is enough to look at the normal
forms discussed in the lecture:

0
0

,
0
1

where in this case < 0, < 0 and 0. Then use (b) to
find Liapunov functions for the normal forms.
Problem 4.21. Consider the second order equation
x00 + x0 + x = 0 ,
where 0.
M341 ODE
39
(a) Write the corresponding linear system and show that it has an
isolated critical point at the origin.
(b) Show that the function E(x, y) = x2 + y 2 is a Liapunov function.
Deduce that the origin is stable.
(c) Identify the type of critical point and its stability property for
= 0, 0 < < 2, = 2 and > 2. In particular, show that the
origin is asymptotically stable for > 0.
This problem shows that a given Liapunov function may fail to detect
asymptotic stability. (It can be shown that there exists one which
does.) Moral: Liapunov functions are not unique and knowing that
one exists is not the same thing as finding one!
Problem 4.22. For each of the following systems, show that the origin
is an isolated critical point, find a suitable Liapunov function, and prove
that the origin is asymptotically stable:
(
x0 = 3x3 y
(a) 0
y = x5 2y 3
(
x0 = y 2 + xy 2 x3
(c) 0
y = xy + x2 y y 3
(
x0 = 2x + xy 3
(b) 0
y = x2 y 2 y 3
(
x0 = x3 y + x2 y 3 x5
(d) 0
y = 2x4 6x3 y 2 2y 5
(Hint: Try Liapunov functions of the form a x2p + b y 2q , for suitable

a, b, p, q.)
Problem 4.23. Consider the van der Pol equation:
x00 + (x2 1)x0 + x = 0 .
Write the equivalent linear system and show that it has an isolated
critical point at the origin. Investigate the stability properties of this
critical point for > 0, = 0 and < 0.
Problem 4.24. Consider the following autonomous system:
(
x0 = y + x f (x, y)
y 0 = x + y f (x, y)
where f is a function on the phase plane which is continuous and continuously differentiable on some disk D about the origin.
(a) Show that the origin is an isolated critical point.
(b) By constructing a Liapunov function or otherwise, show that the
origin is asymptotically stable if f is negative definite on D.
40
2001/2002
Problem 4.25. Discuss the stability of the limit cycles and critical
points of the following systems. (Here r2 = x2 + y 2.)
(
(
x0 = x + y + x(r2 3r + 1)
x0 = y + x sin(1/r)
(a) 0
(b)
y = x + y + y(r2 3r + 1)
y 0 = x + y sin(1/r)
(
(
x0 = x y + x(r3 r 1)
x0 = y + x(sin r)/r
(c) 0
(d)
y = x + y + y(r3 r 1)
y 0 = x + y(sin r)/r
Problem 4.26. Does any of the following differential equations have
limit cycles? Justify your answer.
(a) x00 + x0 + (x0 )5 3x3 = 0
(b) x00 (x2 + 1)x0 + x5 = 0
(c) x00 (x0 )2 1 x2 = 0
Problem 4.27. Prove that the following systems have a limit cycle,
by studying the behaviour of the suggested Liapunov function and
applying the PoincareBendixson theorem.
(
x0 = 2x y 2x3 3xy 2
(a)
.
y 0 = 2x + 4y 4y 3 2x2 y
(Hint: Try E(x, y) = 2x2 + y 2.)
(
x0 = 8x 2y 4x3 2xy 2
(b)
y 0 = x + 4y 2y 3 3x2 y
(Hint: Try E(x, y) = x2 + 2y 2.)
M341 ODE
41
5. Rudiments of the theory of distributions

It is not unusual to encounter differential equations with inhomogeneous terms which are not continuous functions: infinitesimal sources,
instantaneous forces,... There is a nifty formalism which allows us to
treat these peculiar differential equations on the same footing as the
regular ones. This formalism is called the theory of distributions. The
introduction to this topic is a little more technical than the rest of the
course, so please be patient.
5.1. Test functions.
Definition 5.1. A test function : R R is a smooth function with
compact support. We shall let D denote the space of test functions.
Smooth means that is continuous and infinitely differentiable, so
that all its derivatives 0 , 00 , (3) , . . . are continuous. One often says
that is of class C .
A function has compact support if there is a real number R such
that (t) = 0 for all |t| > R. More precisely, if : R R is a function,
we can define its support supp to be the closure of the set of points
on which does not vanish:
supp = {t | (t) 6= 0} .
Alternatively, supp is the complement of the largest open set on which
vanishes. In other words, a function will always vanish outside its
support, but may also vanish at some points at the boundary of its
support.
A useful class of test functions are the so-called bump functions. To
construct them, consider first the function h : R R defined by
(
e1/t , t > 0
h(t) :=
0,
t0
This function is clearly smooth everywhere except possibly at t = 0; but
LHopitals rule removes that doubt: h is everywhere smooth. Clearly,
supp h = [0, ). Define the function
(t) := h(t)h(1 t) .
It is clearly smooth, being the product of two smooth functions. Its
support is now [0, 1], so that it is a test function. Notice that (t) 0.
If you plot it, you find that it is a small bump between 0 and 1. We
can rescale to construct a small bump between a and b:

ta
[a,b] (t) :=
.
ba
Clearly [a,b] is a test function with support [a, b]. Somewhat loosely
we will call a nowhere negative test function which is not identically
42
2001/2002
zero, a bump function. The integral of a bump function is always

positive. For instance, one has
Z
[a,b] (t)dt 0.00703 (b a) .
R
Proposition 5.2. The space D of test functions has the following easily proven properties:
1. D is a real vector space; so that if 1 , 2 D and c1 , c2 R, then
c1 1 + c2 2 D.
2. If f is smooth and D, then f D.
3. If D, then 0 D. Hence all the derivatives of a test function
are test functions.
We are only considering real-valued functions of a real variable, but
mutatis mutandis everything we say also holds for complex-valued functions of a real variable.
Definition 5.3. A function f : R R is called (absolutely) integrable if
Z
|f (t)|dt < .
R
We say that f : R R is locally integrable if

Z b
|f (t)|dt <
a
for any finite interval [a, b]. A special class of locally integrable functions are the (piecewise) continuous functions.
Test functions can be used to probe other functions.
Proposition 5.4. If f is locally integrable and is a test function,
then the following integral is finite:
Z
Z
f := f (t)(t)dt .
R
Proof. Since D, it vanishes outside some closed interval [R, R].

Moreover since it is continuous, it is bounded: |(t)| M for some M.
Therefore
Z
Z R
f (t)(t)dt =
f (t)(t)dt ;
R
whence we can estimate

Z R
Z R
Z

f (t)(t)dt
|f (t)||(t)|dt M

R
|f (t)|dt < ,
where in the last inequality we have used that f is locally integrable.
M341 ODE
43
As the next result shows, test functions are pretty good probes. In
fact, they can distinguish continuous functions.
Theorem 5.5. Let f, g : R R be continuous functions such that
Z
Z
f = g D .
Then f = g.
Proof. We prove the logically equivalent
if f 6= g, then there
R statement:
R
exists some test function for which f 6= g .
If f 6= g there is some point t0 for which f (t0 ) 6= g(t0 ). Without loss
of generality, let us assume that f (t0 ) > g(t0). By continuity this is
also true in a neighbourhood of that point. That is, there exist > 0
and > 0, such that
f (t) g(t)
for |t t0 | .
Let be a bump function supported in [t0 , t0 + ]. Then, since

bumps have nonzero area,
Z
Z
Z t0 +
Z t0 +
(f (t) g(t)) (t)dt
(t)dt > 0 ,
f g =
t0
whence
f 6=
t0
g .
Remark 5.6. Continuity of f and g is essential; otherwise we could

change f , say, in just one point without affecting the integral.
The next topic is somewhat technical. It is essential for the logical
development of the subject, but we will not stress it. In particular, it
will not be examinable.
Definition 5.7. A sequence of functions fm : R R is said to converge uniformly to zero, if given any > 0 there exists N (depending
on ) so that |fm (t)| < for all m > N and for all t. In other words,
the crucial property is that N depends on but not on t.
We now define a topology on the space of test functions; i.e., a
notion of when two test functions are close to each other. Because
D is a vector space, two test functions are close to each other if their
difference is close to zero. It is therefore enough to say when a sequence
of test functions approaches zero.
Definition 5.8. A sequence of test functions m for m = 1, 2, . . . ,
is said to converge to zero (written m 0) if the following two
conditions are satisfied:
1. all the test functions m vanish outside some common finite interval [R, R] (in other words, R does not depend on m); and
(k)
2. for a fixed k, the k-th derivatives m converge uniformly to zero.
44
2001/2002
We can also talk about test functions converging to a function m

in the same way. One can show that is itself a test function. In
other words, the space of test functions is complete.
5.2. Distributions. Finally we can define a distribution.
Definition 5.9. A distribution is a continuous linear functional on
the space D of test functions. The space of distributions is denoted D0 .
In other words, a distribution T is a linear map which, acting on a
test function , gives a real number denoted by hT, i.
Linearity means that
hT, c1 1 + c2 2 i = c1 hT, 1 i + c2 hT, 2 i ,
for all 1 , 2 D and c1 , c2 R.
Continuity simply means that if a sequence m of test functions
converges to zero, then so does the sequence of real numbers hT, m i:
m 0 = hT, m i 0 .
Proposition 5.10. Every locally integrable function f defines a distribution Tf defined by
Z
hTf , i = f
for all D.
Proof. The map Tf is clearly linear. To show that it is also continuous, suppose that m 0. By the first convergence property in
Definition 5.8, there exists a finite R such that all m vanish outside of
[R, R]. Let C be the result of integrating |f (t)| on [R, R], and let
Mm be such that |m (t)| Mm . Then
Z R
Z R

f (t)m (t)dt
|f (t)||m (t)|dt Mm C .
| hTf , m i | =
R
But now notice that m 0 = Mm 0 = hTf , m i 0.

A distribution of the form Tf for some locally integrable function f
is called regular.
A famous example of a regular distribution is the one associated to
the Heaviside step function. Let H be the following function:
(
1, t0
H(t) =
0, t<0.
It is clearly locally integrable, although it is not continuous. It gives
rise to a regular distribution TH defined by
Z
Z
hTH , i = H(t)(t)dt =
(t)dt .
R
M341 ODE
45
Remark 5.11. The value at t = 0 of the Heaviside step function is

a matter of convention. It is an intrinsic ambiguity in its definition.
Notice however that the regular distribution TH does not depend on
this choice. This simply reiterates the fact that we should think of the
Heaviside step function not as a function, but as a distribution.
If all distributions were regular, there would be little point to this
theory. In fact, the raison detre of the theory of distributions is to
provide a rigorous framework for the following distribution.
Definition 5.12. The Dirac -distribution is the distribution defined by
h, i := (0)
D .
(20)
This distribution
cannot be regular: indeed, if there were a function
R
(t) such that = (0) it would have to satisfy that (t) = 0 for
all t 6= 0; but then such a function could not possibly have a nonzero
integral with any test function. Nevertheless it is not uncommon to
refer to this distribution as the Dirac -function.
Distributions which are not regular are called singular.
Distributions obey properties which are analogous to those obeyed
by the test functions. In fact, dually to Proposition 5.2 we have the
following result.
Proposition 5.13. The space D0 of distributions enjoys the following
properties:
1. D0 is a real vector space. Indeed, if T1 , T2 D0 and c1 , c2 R,
then c1 T1 + c2 T2 defined by
hc1 T1 + c2 T2 , i = c1 hT1 , i + c2 hT2 , i
is a distribution.
2. If f is smooth and T D0 , then f T , defined by
hf T, i = hT, f i
is a distribution.
3. If T D0 , then T 0 defined by
hT 0 , i = hT, 0 i
(21)
is a distribution.
Notice that any test function, being locally integrable, gives rise
to a (regular) distribution. This means that we have a linear map
D D0 which is one-to-one by Theorem 5.5. On the other hand, the
existence of singular distributions means that this map is not injective.
Nevertheless one can approximate (in a sense to be made precise below)
singular distributions by regular ones.
46
2001/2002
The space of distributions inherits a notion of convergence from the

space of test functions.
Definition 5.14. We say that a sequence of distributions Tn converges weakly to a distribution T , written Tn T , if
hTn , i hT, i
D .
We will often say that Tn converges to T in the (weak) distributional

sense to mean this type of convergence. It is in this sense that singular
distributions can be approximated by regular distributions. In fact,
Problem 5.1 explicitly shows how to construct sequences of regular
distributions which converge weakly to .
5.3. Distributional derivatives and ODEs. Let f be a locally integrable function. It need not be continuous, and certainly not differentiable. So there is little point in thinking of its derivative f 0 as a
function: it may not even be well defined at points. Nevertheless, we
can make sense of f 0 as a distribution: Tf0 .
If f is differentiable, then Tf0 = Tf 0 (see Problem 5.2). On the other
hand, if f is not differentiable, then Tf0 need not be a regular distribution at all. Let us illustrate this with an important example.
The derivative H 0 of the step function is not a function: it would not
be defined for t = 0. Nevertheless, the distributional derivative TH0 is
well-defined:

Z

0
0
0
hTH , i = hTH , i =
(t)dt = (t) = (0) ,
0
where we have used the fact that has compact support. Comparing
with equation (20), we see that is the (distributional) derivative of
the step function:
TH0 = .
(22)
Let us give some examples of distributional derivatives. We write

them as if they were functions, but it is important to keep in mind
that they are not. Hence, when we write f 0 what we really mean is the
distribution Tf0 and so on.
1. Let f (t) = t H(t). Then f 0 = H and f 00 = .
2. Let f (t) = |t|. Then f 0 (t) = H(t) H(t) and f 00 = 2.
3. Let f be n-times continuously differentiable (i.e., of class C n ).
Then
f
(n)
n
X
=
(1)i f (i) (0) (ni) .
i=0
(23)
M341 ODE
47
4. In particular we have that
0
m (n)
t = (1)m m!
(1)m n! (nm)
(nm)!
m>n
m=n
m<n.
Our present interest in distributions is to be able to solve a larger

class of differential equations. We will focus solely on linear ODEs.
Therefore we start by showing how differential operators act on distributions.
Since the derivative of a distribution is a distribution, we can take
any (finite) number of derivatives of a distribution and still get a distribution. Iterating the identity (21) we obtain for any distribution T
and any test function :
hT 00 , i = hT 0 , 0 i = hT, 00 i ,
or more generally
(k)

T , = (1)k T, (k)
T D0
D .
(24)
Let L be the following linear differential operator of order n:

n
X
n
n1
L = an D + an1 D
+ + a1 D + a0 =
ai Di ,
i=0
where the ai are smooth functions with an 6 0 and D = d/dt.

It follows from Proposition 5.13 that L acts on a distribution and
yields another distribution. Indeed, if T D0 , then
hL T, i = hT, L i
D ,
where L , the formal adjoint of L, is defined by

n
X
(1)i (ai )(i) .

L :=
i=0
A typical linear inhomogeneous ODE can be written in the form

L x(t) = f (t) ,
where L is a linear differential operator and f is the inhomogeneous
term. By analogy, we define a (linear) distributional ODE to be an
equation of the form:
LT = ,
(25)
where is a fixed distribution and the equation is understood as an

equation in D0 whose solution is a distribution T . In other words, a
distribution T solves the above distributional ODE if
hL T, i = hT, L i = h, i
D .
48
2001/2002
The distributional ODE (25) can have two different types of solutions:
Classical solutions. These are regular distributions T = Tx ,
where in addition x is sufficiently differentiable so that L x makes
sense as a function. In this case, = Tf has to be a regular
distribution corresponding to a continuous function f .
Weak solutions. These are either regular distributions T = Tx ,
where x is not sufficiently differentiable for L x to make sense as
a function; or simply singular distributions.
Suppose that the differential operator L is in standard form, so that
an 1. Then it is possible to show that if the inhomogeneous term
in equation (25) is the regular distribution = Tf corresponding to a
continuous function f , then all solutions are regular, with T = Tx for
x(t) a sufficiently differentiable function obeying L x = f as functions.
However, a simple first order equation like
t2 T 0 = 0 ,
which as functions would only have as solution a constant function, has
a three-parameter family of distributional solutions:
T = c1 + c2 TH + c3 ,
where ci are constants and H is the Heaviside step function.
5.4. Greens functions. Solving a linear differential equation is not
unlike inverting a matrix, albeit an infinite-dimensional one. Indeed, a
linear differential operator L is simply a linear transformation in some
infinite-dimensional vector space: the vector space of distributions in
the case of an equation of the form (25). If this equation were a linear
equation in a finite-dimensional vector space, the solution would be
obtained by inverting the operator L, now realised as a matrix. In this
section we take the first steps towards making this analogy precise. We
will introduce the analogue of the inverse for L (the Greens function
of L) and the analogue of matrix multiplication (convolution).
Let L be an n-th order linear differential operator in standard form:
L = Dn + an1 Dn1 + + a1 D + a0 ,
where ai are smooth functions.
Definition 5.15. By a fundamental solution for L we mean a distribution T satisfying
LT = .
Fundamental solutions are not unique, since one can add to T anything in the kernel of L. For example, we saw in (22) that (the regular
distribution defined by) the Heaviside step function is a fundamental
M341 ODE
49
solution of L = D. In this case it is a weak solution. This is not the only

fundamental solution, since we can always add a constant distribution:
Z
hTc , i = c (t) dt
D ,
R
for some c R.
One way to resolve this ambiguity is to impose boundary conditions.
An important class of boundary conditions is the following.
Definition 5.16. A (causal) Greens function for the operator L
is a fundamental solution G which in addition obeys
hG, i = 0 D such that supp (, 0) .
In other words, the Greens function G is zero on any test function
(t) vanishing for non-negative values of t. With a slight abuse of
notation, and thinking of G as a function, we can say that G(t) = 0
for t < 0.
As an example, consider the Greens function for the differential
operator L = Dk , which is given by
tk1
G(t) =
H(t) .
(k 1)!
Let us consider the Greens function for the linear operator L above,
where the ai are real constants. By definition the Greens function G(t)
obeys L G = and G(t) = 0 for t < 0. This last condition suggests
that we try G(t) = x(t)H(t), where x(t) is a function to be determined
and H is the Heaviside step function. Computing L G we find, after
quite a little bit of algebra,
"
#
n1 n`1
X
X k + `
ak+`+1 x(k) (0) (`) ,
L G = (L x) H +
`
`=0
k=0
where we have used equation (23). The distributional equation L G =

is therefore equivalent to the following initial value problem for the
function x:
Lx = 0
x(0) = x0 (0) = = x(n2) (0) = 0
x(n1) (0) = 1 .
We know from our treatment of linear vector fields that this initial
value problem has a unique solution. Therefore the Greens function
for L exists and is unique.
The Greens function is the analogue of an inverse of the differential
operator. In fact, it is only a right-inverse: it is a common feature
of infinite-dimensional vector spaces that linear transformations may
have left- or right-inverses but not both. This statement is lent further
credibility by the fact there is a product relative to which the Dirac
is the identity. This is the analogue of matrix multiplication: the convolution product. Convolutions are treated in more detail in Problems
50
2001/2002
5.10 and 5.13. Here we simply show how to solve an inhomogeneous

equation, given the Greens function.
Theorem 5.17. Let G be the Greens function for the operator L. Let
T be the distribution defined by
Z
T (t) := G(t s)f (s)ds .
R
Then T obeys the inhomogeneous equation

LT (t) = f (t) .
Proof. This follows from the fact that G is a fundamental solution:
LG(t) = (t) ,
whence taking L inside the integral
Z
Z
LT (t) = LG(t s)f (s)ds = (t s)f (s)ds = f (t) .
R
The expression of T is an example of a convolution product. More

details are given in Problem 5.10, which describes the convolution product of test functions. It is possible to extend the convolution product
to more general functions and indeed to certain types of distributions.
In the notation of Problem 5.10, T = G ? f .
Notice that because G(t) = 0 for t < 0, the integral in the expression
for T is only until t:
Z t
(G ? f )(t) =
G(t s) f (s) ds .
This definition embodies the principle of causality. If the above equation describes the response of a physical system to an external input
f (t), then one expects that the response of the system at any given
time should not depend on the future behaviour of the input.
Problems
Problem 5.1. Let f : R R be an absolutely integrable function of
unit area; that is,
Z
Z
|f (t)|dt <
and
f (t)dt = 1 .
R
Consider the sequence of functions fn defined by fn (t) = nf (nt), where

n is a positive integer. Show that Tfn in the distributional sense;
that is,
hTfn , i (0) D .
M341 ODE
(Hint: First show that
51
hTfn , i (0) =
f (t) [(t/n) (0)] dt ,
estimate the integral and show that it goes to zero for large n.)
Problem 5.2. Let f be a continuous function whose derivative f 0 is
also continuous (i.e., f is of class C 1 ). Let Tf denote the corresponding
regular distribution. Prove that Tf0 = Tf 0 .
Problem 5.3. Let a R and define the shifted step function Ha (t) by
(
1, ta
Ha (t) =
0, t<a,
and let THa be the corresponding regular distribution. Prove that
TH0 a = a , where a is defined by
ha , i = (a) D .
Problem 5.4. Let f be a smooth function and T be a distribution.
Then f T is a distribution, as was shown in lecture. Prove that
(f T )0 = f 0 T + f T 0 .
Problem 5.5. Let f be a smooth function. Prove that

n
X
(n)
j n
(1)
f (j) (0) (nj) .
f =
j
j=0
As a corollary, prove that
n<m
0 ,
m (n)
m
n=m
t = (1) m! ,
(1)m n! (nm) , n > m ;

(nm)!
and that
t (n)
n
X
n nj (j)
=
.
j
j=0
Problem 5.6. Prove that (the regular distribution corresponding to)

the function
tk1
H(t)
(k 1)!
is a fundamental solution for the operator Dk .
Problem 5.7. Let f :
function:
be the following piecewise continuous
|t| < 1
t ,
f (t) = 1 ,
t1
1 , t 1 .
52
2001/2002
Compute its second distributional derivative.

Problem 5.8. Prove that
|t|0 = H(t) H(t) ,
and that

D2 k 2 ek|t| = 2k .
Use this result to find the causal Greens function for the linear operator
L = D2 k 2 and find a solution x(t) of the inhomogeneous equation
Lx(t) = f (t), where
(
t for 0 < t < 1,
f (t) =
0 otherwise.
Problem 5.9. Find the Greens function for the linear second order
differential operator
L = D2 + aD + b ,
where a, b R. Distinguish between the cases a2 < 4b, a2 = 4b and
a2 > 4b and write the Greens function explicitly in this case. Use this
to solve the inhomogeneous initial value problem
x00 (t) + x0 (t) + x(t) = f (t)
x(t), x0 (t) 0 as t ,
where f (t) is the piecewise continuous function f (t) = H(t) H(t 1).
Sketch (or ask Maple to sketch) x(t) as a function of t. From the sketch
or otherwise, is x(t) smooth?
Problem 5.10. Let , and be test functions. Define their convolution ? by
Z
( ? )(t) := (t s) (s) ds .
R
(a) Show that if has support [a, b] and has support [c, d], then
? has support [a + c, b + d].
(b) Show that ( ? )0 = 0 ? = ? 0 .
(c) Conclude that ? is a test function.
(d) Show that the convolution product is commutative: ? = ? ,
and associative
( ? ) ? = ? ( ? ) .
The following problems get a little deeper into the notion of a distribution. They are not examinable, but some of you might find them
interesting.
Problem 5.11. Let : D D be a continuous linear map; that is,
1. (c1 1 + c2 2 ) = c1 (1 ) + c2 (2 ) for ci R and i D; and
2. if m 0 then (m ) 0 in D.
M341 ODE
53
(a) Prove that the adjoint map defined on linear functionals by

h T, i = hT, ()i
D ,
(26)
maps D0 to D0 .
Let a, b R with a 6= 0 and define the following operations on functions:
(a ) (t) = (ax) and
(b ) (t) = (t b) .
(b) Prove that a and b map test functions to test functions, and
that they are linear and continuous.
Let a : D0 D0 and b : D0 D0 be their adjoints, defined by (26).
(c) If f is a locally integrable function and Tf the corresponding regular distribution, show that
a Tf = Ta f
and b Tf = Tb f ,
where
(a f ) (t) =
1
f (t/a) and
|a|
(b f ) (t) = f (t + b) .
Problem 5.12. Let T D0 . Prove that its derivative obeys

1
T 0 = lim [h T T ] ,
h0 h
where h is defined in Problem 3 and where the limit is taken in the

distributional sense:

1
1
lim [h T T ] , = lim hh T T, i D .
h0 h
h0 h
Problem 5.13. Let f : R R be a bump function of unit area (see
Problem 5.1). Prove (without using Problem 5.1, since we have not
defined the convolution of a distribution and a test function) that the
sequence of test functions n := fn ? converges to as test functions:
n .
(Hint: You can use the results of Problem 5.10 about convolutions.)
(Remark : Together with Problem 5.1, which illustrates that one can
approximate distributions by functions, this problem illustrates that
one can define the convolution of a distribution and a test function,
and that is the identity under convolution.
54
2001/2002
6. The Laplace transform

This part of the course introduces a powerful method of solving differential equations which is conceptually different from the methods
discussed until now: the Laplace transform.
The Laplace transform is an example of an integral transform. Integral transforms are linear maps sending a function f (t) to a function
F (s) defined by a formula of the form
Z
F (s) = K(s, t) f (t) dt ,
for some function K(s, t) called the kernel and where the integral is
over some specified subset of the real line.
Integral transforms are useful in the solution of linear ordinary differential equations because these get transformed into algebraic equations
which are easier to solve. One then inverts the transform to recover the
solution to the differential equation. This circle (or square!) of ideas
can be represented diagrammatically as follows:
differential equation for f
solution: f
6inverse
transform
transform
algebraic equation for F
solution: F
The two most important integral transforms are the Fourier transform (cf. PDE) and the Laplace transform. Whereas the Fourier transform is useful in boundary value problems, the Laplace transform is
useful in solving initial value problems. As this is the main topic of
this course, we will concentrate solely on the Laplace transform.
6.1. Definition and basic properties.
Definition 6.1. Let f (t) be a function. Its Laplace transform is
defined by
Z
L {f } (s) :=
f (t) est dt ,
(27)
0
provided that the integral exists. One often uses the shorthand F (s)
for the Laplace transform L {f } (s) of f (t).
Remark 6.2. The following should be kept in mind:
M341 ODE
55
1. Not every function has a Laplace transform: for example, f (t) =

2
et does not, since the integral in equation (27) does not exist for
any value of s.
2. For some f (t) the Laplace transform does not exist for all values
of s. For example, the Laplace transform F (s) of f (t) = ect only
exists for s > c.
3. Although t is a real variable, the logical consistency of the theory requires s to be a complex variable. Therefore the Laplace
transform L {ect } (s) actually exists provided that Re(s) > c.
A large class of functions which have a Laplace transform (at least
for some values of s) are those of exponential order.
Definition 6.3. A function f (t) is said to be of exponential order if
there exist real constants M and such that
|f (t)| M et
t .
Proposition 6.4. If f (t) is of exponential order, then the Laplace

transform F (s) of f (t) exists provided that Re(s) > .
Proof. We estimate the integral
Z
Z
Z

st
st

f (t) e dt
|f (t)| |e | dt

0
M et e Re(s)t dt .
Provided that Re(s) > , this integral exists and

M
|F (s)|
.
Re(s)
Notice that in the above proof, in the limit Re(s) , F (s) 0.
In fact, this can be proven in more generality: so that if a function
F (s) does not approach 0 in the limit Re(s) , it cannot be the
Laplace transform of any function f (t); although as we will see it can
be the Laplace transform of a singular distribution!
A fundamental property of integral transforms is linearity:
L {f + g} (s) = L {f } (s) + L {g} (s) = F (s) + G(s) ,
which makes sense for all s for which both F (s) and G(s) exist.
Let f (t) be differentiable and let us compute the Laplace transform
of its derivative f 0 (t). By definition,
Z
0
L {f } (s) =
f 0 (t) est dt ,
0
which can be integrated by parts to obtain

Z

0
L {f } (s) =
s f (t) est dt + f (t) est
0
= s L {f } (s) f (0) + lim f (t) est .

t
56
2001/2002
Provided the last term is zero,

L {f 0 } (s) = s L {f } (s) f (0) .
We can iterate this expression in order to find the Laplace transform
of higher derivatives of f (t):
L {f 00 } (s) = s L {f 0 } (s) f 0 (0)
= s2 L {f } (s) s f (0) f 0 (0) ,
provided that f (t) exp(st) and f 0 (t) exp(st) both go to zero in the
limit t .
The entries in Table 1 below can be obtained by straightforward
integration. The last two lines will deserve more attention below.
Function
Transform
Conditions
f (t)
F (s)
convergence
eat
1
sa
Re(s) > Re(a)
cos t
s
s2 + 2
R and Re(s) > 0
sin t
s2 + 2
R and Re(s) > 0
cosh t
s
s2 2
R and Re(s) > ||
sinh t
s2 2
R and Re(s) > ||
tn
n!
sn+1
n = 0, 1, . . . and Re(s) > 0
eat f (t)
F (s a)
convergence
tn f (t)
(1)n F (n) (s)
same as for F (s)
f (t)
es F (s)
> 0 and same as for F (s)
n1
sn F (s)
f (n) (t)
sn1k f (k) (0)
k=0
t
f (t ) g( ) d
F (s) G(s)
lim f (k) (t)est = 0
same as for F (s) and G(s)
Table 1. Some Laplace transforms

In this table, f is the function defined by
(
f (t ) for t ,
f (t) :=
0
otherwise.
M341 ODE
57
Definition 6.5. The convolution of two functions f (t) and g(t) is

defined as
Z t
(f ? g)(t) =
f (t u) g(u) du .
0
Proposition 6.6. The convolution obeys the following properties:

Commutativity: f ? g = g ? f .
Bilinearity: f ? (a g + b h) = a f ? g + b f ? h, where a and b are
constants. Commutativity implies linearity on the first factor too.
Associativity: f ? (g ? h) = (f ? g) ? h.
Remark 6.7. The above definition of convolution agrees with that
introduced in Problem 5.10 provided that the functions involved are
such that they vanish for negative values of the independent variable.
Indeed, if f (t) and g(t) are functions for which f (t < 0) = 0 and
g(t < 0) = 0, then
Z t
Z
f (t u) g(u) du =
f (t u) g(u) du .
The most important property is probably the following.

Theorem 6.8. Let F (s) and G(s) be the Laplace transforms of f (t)
and g(t), respectively. Then
L {f ? g} (s) = F (s) G(s) .
Proof. We simply compute:
Z
L {f ? g} (s) =
est (f ? g)(t) dt
0

Z t
Z
st
=
e
f (t u) g(u) du dt .
0
We can exchange the order of integration by noticing that

Z t= Z u=t
Z u= Z t=
k(t, u) du dt =
k(t, u) du dt ,
t=0
u=0
u=0
t=u
for any function k(t, u) for which the integrals exist. Therefore,

Z Z
st
L {f ? g} (s) =
e f (t u) dt g(u) du
0
u

Z Z
s(u+v)
=
e
f (v) dv g(u) du
(v = t u)
0
0
Z
= L {f } (s)
esu g(u) du
0
= L {f } (s) L {g} (s) .
58
2001/2002
6.2. Application: solving linear ODEs. As explained in the introduction to this section, the usefulness of the Laplace transform is based
on its ability to turn initial value problems into algebraic equations. We
now discuss this method in more detail.
Let L be a linear differential operator in standard form:
n
X
n
n1
L = D + an1 D
+ . . . a1 D + a0 =
ai Di ,
i=0
where ai are real constants (an = 1) and where D is the derivative

operator: (Df )(t) = f 0 (t). We will consider the initial value problem
Lx = f
and x(0) = c0 , x0 (0) = c1 , . . . , x(n1) (0) = cn1 .
We can solve this using the Laplace transform in three easy steps:
1. We take the Laplace transform of the equation:
L {Lx} (s) = L {f } (s) = F (s) .
Using linearity and the expression for the Laplace transform for
x(i) in Table 1 one finds
!
n
X
L {Lx} (s) =
ai si X(s) P (s) ,
i=0
where X(s) = L {x} (s) and P (s) is the polynomial

P (s) =
n1
X
pi si
i=0
where pi =
ni1
X
ai+j+1 cj ,
(28)
j=0
which encodes all the information about the initial conditions.

2. We solve the transformed equation for X(s):
!
n
X
ai si X(s) = F (s) + P (s) ,
i=0
to obtain
X(s) = T (s) (F (s) + P (s)) ,
where
T (s) =
n
X
i=0
!1
ai si
1
.
sn + an1 sn1 + + a1 s + a0
3. We invert the transform to recover x(t).

If the Laplace transform F (s) of f (t) is a rational function (i.e.,
a ratio of polynomials), then so is X(s). The inverse transform is
easy to find in principle by a partial fraction decomposition and then
looking up in the tables. The resulting function x(t) will then be a
sum of terms each of which is made out of polynomials, exponential
and trigonometric functions.
M341 ODE
59
The transformed solution found above is a sum of two terms: the

term T (s)P (s) depends on the initial conditions, whereas the term
T (s)F (s) depends on the inhomogeneous term in the ODE: f (t). In
both terms, the function T (s) only depends on L and not on the initial
conditions or on the inhomogeneous term in the equation. This function is known as the transfer function of L. As we will show it is the
Laplace transform of a function g(t). This allows us to invert the term
T (s)F (s) using convolution.1
We desire a function g(t) whose Laplace transform is the transfer
function of the differential operator L. Let g(t) solve the homogeneous
equation Lg = 0. Solving for its Laplace transform, we find, as above,
L {g} (s) = T (s)P (s) ,
where P (s) is the polynomial defined in (28), but with cj = g (j) (0).
Clearly L {g} (s) = T (s) if and only if P (s) = 1, which is equivalent to the initial conditions g(0) = g 0(0) = = g (n2) (0) = 0 and
g (n1) (0) = 1. In other words, g(t) is essentially the Greens function
for the operator L. More precisely, the Greens function is given by
G(t) = H(t)g(t) ,
since it must vanish for negative t. Since the Laplace transform is
only defined for non-negative t, g(t) and G(t) share the same Laplace
transform, namely the transfer function T (s).
6.3. The Laplace transform of a distribution. What is the inverse
Laplace transform of a polynomial P (s)? We remarked above that for
F (s) to be the Laplace transform of a function f (t) it is necessary that
F (s) 0 as Re(s) . Since this is not the case for P (s), we know
that P (s) 6= L {p(t)} (s) for any function p(t). It will turn out, however,
that P (s) is the Laplace transform of a singular distribution. Just like
not every function has a Laplace transform, not every distribution will
have one either. This requires a little bit of formalism, which we will
simply sketch.
Suppose that Tf is a regular distribution. If f has a Laplace transform, it is natural to demand that the Laplace transform of Tf should
agree with that of f :
Z
L {Tf } (s) := L {f } (s) =
f (t) est dt ,
0
st
which looks like hTf , e i except for the minor detail of the lower limit
of integration and the fact that est is not a test function, since it does
not have compact support. However, for Re(s) > 0, the function est
1
The same cannot be said about the term T (s)P (s), since P (s), being a polynomial, cannot be the Laplace of any function; although it can be shown to be the
Laplace transform of a singular distribution.
60
2001/2002
obeys the next best thing: it decays very fast. The following definition
makes this notion precise.
Definition 6.9. We say that a function f is of fast decay if for all
non-negative integers k, p there exists some positive real number Mk,p
such that
(1 + t2 )p |f (k) (t)| Mk,p
t R .
The space of such functions is denoted by S.

Proposition 6.10. The space S is a vector space and contains the
space D of test functions as a subspace.
Proof. That S is a vector space is left as an exercise (see Problem 6.9).
Let us simply see that D S. If D is a test function, there is
some R > 0 such that (k) (t) = 0 for |t| > R. Since is smooth, it
is bounded and so are all its derivatives. Let Mk be the maximum of
|(k) (t)| for |t| R. Then,
(1 + t2 )p |f (k) (t)| (1 + R2 )p Mk
t R .
There is a natural notion of convergence in the space S.

Definition 6.11. We say that a sequence fn S of functions in S
converges to zero (written fn 0), if for any given p and k,
(1 + t2 )p |fn(k) (t)| 0
uniformly in t.
This notion of convergence agrees with the one for test functions. In
other words, a sequence of test functions converging to zero in D also
converge to zero in S. (See Problem 6.9.)
Recall that in Section 5.2 we defined a distribution to be a continuous linear functional on D. Some distributions will extend to linear
functionals on S.
Definition 6.12. A tempered distribution is a continuous linear
functional on S. The space of tempered distributions is denoted S0 .
This means that T S0 associates a number hT, f i with every f S,
in such a way that if f, g S and a, b are constants, then:
hT, a f + b gi = a hT, f i + b hT, gi .
Continuity means that if fn 0 then the numbers hT, fn i 0 as well.
Proposition 6.13. The space of tempered distributions is a vector
space. In fact, it is a vector subspace of D0 : S0 D0 .
Tempered distributions inherit the notion of weak convergence of
distributions in Definition 5.14.
Not all distributions have a Laplace transform. Let T be a distribution with the property that T (t) = 0 for t < 0. In other words,
M341 ODE
61
T D0 is such that for all test functions with supp (, 0),

one has hT, i = 0. (Compare with the definition of Greens function
in 5.4.) Assume moreover that for some R, et T is tempered,
that is, it belongs to S0 . Then one defines the (distributional) Laplace
transform of T by

L {T } (s) := T, est ,
which then exist for Re(s) > .
Let us compute the Laplace transforms of a few singular distributions:
First, we start with T = :

st
st
L {} (s) = , e
=e =1.
t=0
Notice that this is consistent with the fact that is the identity under
convolution (cf. Problem 5.13). Indeed, let f be a function with Laplace
transform F (s). Then,
L { ? f } (s) = L {} (s)F (s) = F (s) = L {f } (s) .
We generalise to T = (k) :

(k) st
(k)
dk st
k
k st
L
(s) = , e
= (1) , k e
= s e = sk .
dt
t=0
These two results allows us to invert the Laplace transform of any
polynomial.
Finally consider for > 0, defined by h , i = ( ). Its Laplace
transform is given by

st
st
L { } (s) = , e
=e
= es .
t=
Problems
Problem 6.1. Provide all the missing proofs of the results in Table 1.
Problem 6.2. Suppose that f (t) is a periodic function with period T ;
that is, f (t + T ) = f (t) for all t. Prove that the Laplace transform
F (s) of f (t) is given by
Z T
1
F (s) =
f (t) est dt ,
1 esT 0
which converges for Re(s) > 0.
Use this to compute the Laplace transform of the function f (t) which
is 1 for t between 0 and 1, 2 and 3, 4 and 5, etcetera and 0 otherwise.
62
2001/2002
Problem 6.3. Solve the following integral equation:

Z t
f (t) = 1
(t ) f ( ) d .
0
(Hint: Take the Laplace transform of the equation, solve for the Laplace
transform F (s) of f (t), and finally invert the transform.)
Problem 6.4. Show that the differential equation:
f 00 (t) + 2 f (t) = u(t),
subject to the initial conditions f (0) = f 0 (0) = 0, has
Z
1 t
f (t) =
u( ) sin (t ) d
0
as its solution.
(Hint: Take the Laplace transform of both sides of the equation, solve
for the Laplace transform of f and invert.)
Problem 6.5. Suppose that we consider the Laplace transform of tz ,
where z is a complex number with Re(z) > 0. This is given in terms
of the Euler function, defined by
Z
(z) :=
tz1 et dt .
0
Prove that
(z + 1)
,
sz+1
provided Re(z) > 0. (This is required for convergence of the integral.)
Prove that
L {tz } (s) =
(z + 1) = z (z) ,
for Re(z) > 0. Compute ( 12 ).
Problem 6.6. Compute the Laplace transforms of the following functions:
1
(a) f (t) = 3 cos 2t 8e2t
(b) f (t) =
t
(
1 , for t < 1, and
(c) f (t) =
(d) f (t) = (sin t)2
0 , for t 1.
0 , for t < 1,
(e) f (t) = 1 , for 1 t 2, and
0 , for t > 2.
Make sure to specify as part of your answer the values of s for which
the Laplace transform is valid.
M341 ODE
63
Problem 6.7. Find the inverse Laplace transform of the following

functions:
1
4
(a) F (s) = 2
(b) F (s) =
s +4
(s 1)2
s
1
(c) F (s) = 2
(d) F (s) = 3
s + 4s + 4
s + 3s2 + 2s
s+3
(e) F (s) = 2
s + 4s + 7
Problem 6.8. Use the Laplace transform to solve the following initial
value problems:
d2 f (t)
df (t)
(a)
5
+ 6f (t) = 0 ,
f (0) = 1, f 0 (0) = 1 ,
2
dt
dt
d2 f (t) df (t)
(b)
2f (t) = et sin 2t ,
f (0) = f 0 (0) = 0 ,
dt2
dt
0 , for 0 t < 3,
d2 f (t)
df (t)
+ 2f (t) = 1 , for 3 t 6, and
(c)
3
dt2
dt
0 , for t > 6,
f (0) = f 0 (0) = 0 .
Problem 6.9. Prove that the space S of functions of fast decay is a
vector space. Moreover show that vector addition and scalar multiplication are continuous operations; that is, show that if fn , gn 0 are
sequences of functions of fast decay converging to zero, then fn +gn 0
and cfn 0 for all scalars c. (This makes S into a topological vector
space.)
Prove as well that convergence in D and in S are compatible. (This
makes D is closed subspace of S.)
64
2001/2002
7. ODEs with analytic coefficients

In this short section we introduce the method of power series solutions to linear ordinary differential equations with analytic coefficients.
We start with a brief discussion of analytic functions and their power
series solutions and then present the method of power series to solve a
linear ordinary differential equation with analytic coefficients.
7.1. Analytic functions. Power series find their most natural expression in the context of complex analysis, and you are referred to the
module CAn for details. Here we will simply recall some of their basic
properties.
Definition 7.1. An infinite series of the form
X
an (t t0 )n = a0 + a1 (t t0 ) + a2 (t t0 )2 + ,
n=0
where an R, defines a power series in the real variable t about the

point t0 R.
By shifting the variable if necessary, we can always assume that
t0 = 0. This leads to a simpler form of the power series:
an tn = a0 + a1 t + a2 t2 + .
(29)
n=0
We say that the power series (29) converges at the point t R if

the limit
lim
N
X
an tn
exists.
n=0
Otherwise we say that the power series diverges at t.

Clearly every power series of the form (29) converges at t = 0. Happily many power series also converge at other points. In fact, power
series come in three varieties:
1. those which only converge for t = 0;
2. those which converge for all t R; and
3. those which converge for all |t| < R and diverge for |t| > R.
The real number R in the last case is called the radius of convergence of the power series. It is customary to assign the value R = 0
and R = respectively to the first and second cases above. This
way every power series has a radius of convergence which can be any
number 0 R . The interval (R, R) is known as the interval
of convergence.
Remark 7.2. Convergence is undecided at either endpoint of the interval of convergence!
M341 ODE
65
We now discuss a method to find

Pthe radius of convergence of a
power series. Consider the series n=0 un with constant coefficients.
The Ratio Test tells us that if the limit

un+1

L := lim
n
un
exists, then the series converges if L < 1 and diverges if L > 1. Applying this to the power series (29), we see that if the limit

an+1 tn+1
an+1
= lim
|t|
L := lim
n
an tn n an
exists, then the power series converges for L < 1, which is equivalent
to saying that the radius of convergence is given by

an+1
1
.
R = lim
n
an
Even if the Ratio Test fails, it is possible to show that the radius of
convergence of a power series always exists.
If the power series (29) converges for all |t| < R, then for those values
of t it defines a function f as follows:
X
an tn .
f (t) =
n=0
This function is automatically continuous and infinitely differentiable

in the interval |t| < R. In fact, the series can be differentiated termwise,
so that
an =
1 (n)
f (0)
n!
A convergent power series can also be integrated termwise, provided

the limits of integration are within the interval of convergence:
Z t2
X

an
f (t)dt =
tn+1
tn+1
if R < t1 < t2 < R.
2
1
n+1
t1
n=0
Convergent power series can be added and multiplied, the resulting
power series having as radius of convergence at least as large as the
smallest of the radii of convergence of the original power series. The
formula for the product deserves special attention:
!
!
X
X
X
n
n
an t
bn t
=
cn tn ,
n=0
n=0
n=0
where
cn =
n
X
`=0
a` bn` =
n
X
an` b` .
`=0
These formulae define the Cauchy product of the power series.
66
2001/2002
Not every function can be approximated by a power series, those

which can are called analytic. More precisely, the function f (t) is
said to be analytic at t0 if the expression
X
an (t t0 )n
(30)
f (t) =
n=0
is valid in some interval around t0 . If this is the case,

1
an = f (n) (t0 ) ,
n!
and the series (30) is the Taylor series of f (t) at t0 . In the special
case t0 = 0, the series (30) is the Maclaurin series of f (t).
Analyticity in the real line is not nearly as dramatic as in the complex
plane; but it is still an important concept. Clearly there are plenty
of analytic functions: polynomials, exponentials, sine and cosine are
analytic at all points.
Proposition 7.3. Analytic functions obey the following properties:
1. Let f (t) and g(t) be analytic at t0 . Then so are
f (t)g(t)
af (t) + bg(t)
f (t)/g(t)
where a, b R and, in the last case, provided that g(t0 ) 6= 0.

2. Let f (t) be analytic at t0 , and let f 1 be a continuous inverse to
f . Then if f 0 (t0 ) 6= 0, f 1 (t) is analytic at f (t0 ).
3. Let g(t) be analytic at t0 and let f (t) be analytic at g(t0 ). Then
f (g(t)) is analytic at t0 .
7.2. ODEs with analytic coefficients. Consider a linear differential
operator in standard form:
L = Dn + an1 (t)Dn1 + a1 (t)D + a0 (t) .
If the ai (t) are analytic at t0 , we say that t0 is an ordinary point for
the differential operator L. Theres nothing ordinary about ordinary
points. In fact one has the following important result.
Theorem 7.4. Let t0 be an ordinary point for the differential operator
L, and let ci for i = 0, 1, . . . , n 1 be arbitrary constants. Then there
exists a unique function x(t) analytic at t0 which solves the initial value
problem:
Lx = 0
subject to
x(i) (t0 ) = ci
in a certain neighbourhood of t0 . Furthermore if the ai (t) have Taylor

series valid in |t t0 | < R, R > 0, then so is the Taylor series for x(t).
Proof. (Sketch) Without loss of generality, and translating the independent variable if necessary, we will assume that t0 = 0. Also to simplify
the discussion we will only treat the case of a second order equation
x00 + p(t)x0 + q(t) = 0 .
(31)
M341 ODE
67
Since analytic functions have power series approximations at the points

of analyticity, we have that
p(t) =
pn tn
and
q(t) =
n=0
qn tn .
n=0
Let R denote the smallest of the radii of convergence of these two series.
If an analytic solution exists, it can be approximated by a power series
x(t) =
cn tn ,
n=0
which converges in some interval around t = 0 and hence can be differentiated termwise. Doing so, we find
X
x (t) =
(n + 1)cn+1 tn
0
X
and x (t) =
(n + 1)(n + 2)cn+2 tn .
00
n=0
n=0
Inserting these power series into the differential equation (31), and using the Cauchy product formulae, we can derive a recurrence relation
for the coefficients cn :
X
1
((` + 1)c`+1 pn` + c` qn` ) ,
(n + 1)(n + 2) `=0
n
cn+2 =
valid for n 0, which allows us to solve for all the cn (n 2) in terms

of c0 and c1 . Notice that x(0) = c0 and x0 (0) = c1 .
The main technical aspect of the proof (which we omit) is to prove
that the resulting power series converges for |t| < R. This is done
by majorising the series with a geometric series which converges for
|t| < R and using the comparison test.
There are many famous equations Lx = 0 with analytic solutions at
t = 0. Among the best known are the following second order equations:
(Airys equation) L = D2 + t;
(Hermites equation) L = D2 2tD + 2p;
(Chebyshevs equation) L = (1 t2 )D2 tD + p2 ; and
(Legendres equation) L = (1 t2 )D2 2tD + p(p + 1).
In the last two equations, we can divide through by (1 t2 ) to put the
differential operators in standard form.
In the last three cases, p is a real constant which, when a non-negative
integer, truncates one of the power series solutions to a polynomial.
These polynomials are known, not surprisingly, as Hermite polynomials, Chebyshev polynomials and Legendre polynomials. On
the other hand, the Airy functions, which solve Airys equation, are
not polynomials.
68
2001/2002
Problems
Problem 7.1. Let f (t) and g(t) be analytic at t = 0, with power series
expansions
f (t) =
an t
and g(t) =
n=0
bn tn .
n=0
Supposing that g(0) = b0 =

6 0, then their quotient h(t) := f (t)/g(t) is
analytic at t = 0, whence it will have a power series expansion
h(t) =
cn tn .
n=0
Write a recurrence relation for the coefficients cn and solve for the first
three coefficients c0 , c1 , c2 in terms of the coefficients {an , bn }.
Problem 7.2. Let f (t) be analytic at t = 0, with power series expansion
X
f (t) =
an tn .
n=0
Let f 1 be the inverse function (assumed continuous), so that f 1 (f (t)) =

t. Assuming that f 0 (0) = a1 6= 0, then f 1 has a power series expansion
about f (0) = a0 given by
f
(t) =
bn (t a0 )n .
n=0
Solve for the first four coefficients b0 , . . . , b3 in terms of the coefficients

{an }.
Problem 7.3. For each of the following equations, verify that t = 0 is
an ordinary point, write down a recurrence relation for the coefficients
of the power series solutions, and solve for the first few terms. Can you
recognise the power series in any of the cases?
(a) x00 + x0 tx = 0
(b) (1 + t2 )x00 + 2tx0 2x = 0
(c) x00 + tx0 + x = 0
Problem 7.4. Consider Airys equation:
x00 + tx = 0 .
(a) Is t = 0 an ordinary point? Why? What is the radius of convergence of analytic solutions of Airys equation?
(b) Derive a recurrence relation for the coefficients of the power series
solutions around t = 0, and compute the first 6 coefficients.
M341 ODE
69
Problem 7.5. Consider Chebyshevs equation:

(1 t2 )x00 tx0 + p2 x = 0 ,
(32)
where p is a constant.
(a) Is t = 0 an ordinary point? Why? What is the radius of convergence of analytic solutions of equation (32)?
(b) Derive a recurrence relation for the coefficients of power series
solutions of (32) at t = 0.
(c) Show that when p is a non-negative integer, (32) admits a polynomial solution. More concretely show that when p = 0, 2, 4, . . .
(32) admits a solution which is an even polynomial of degree p;
and that when p = 1, 3, 5, . . . it admits a solution which is an odd
polynomial of degree p.
(d) Define the Chebyshev polynomials Tp , as the polynomial found
above, normalised so that Tp (0) = 1 for p even and Tp0 (0) = 1 for
p odd. Calculate the first six Chebyshev polynomials: T0 , T1 , ...,
T5 using the recurrence relation found above.
(Note: The standard Chebyshev polynomials in the literature are
normalised in a different way.)
(e) Show that Chebyshev polynomials are orthogonal with respect to
the following inner product on the space of polynomials:
Z 1
1
hf, gi :=
f (t)g(t) dt ,
1 t2
1
using the following method:
(i) Show that the operator T, defined on a polynomial function
f (t) as Tf (t) := tf 0 (t) (1 t2 )f 00 (t), is self-adjoint relative
to the above inner product.
(ii) Show that Tp is an eigenfunction of T with eigenvalue p2 :
TTp = p2 Hp .
(iii) Deduce that if p 6= q, then hTp , Tq i = 0.

ODE Notes

Uploaded by

Copyright:

Available Formats

ODE Notes

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

ODE Notes

Uploaded by

Copyright:

Available Formats

ORDINARY DIFFERENTIAL EQUATIONS

SPRING AND SUMMER TERMS 2002

5.3. Distributional derivatives and ODEs

1. Overview and basic concepts

All solutions of (1) are of the form

where ai , a2 R are constants and x1 , x2 : R R are the unknown

where Ax is a vector based at x. The map A : R2 R2 sending x 7 Ax

and the length of a point x is the distance to the origin:

A vector based at x Rn is an ordered pair of points x, y in Rn

a11 a12 . . . a1n

The map A : Rn Rn is a vector field on Rn : to each point x Rn

The vector equation (4) can be supplemented by an initial condition

(a) Find all solutions (x1 (t), x2 (t)) .

Problem 1.5. Let A be as in (e), Problem 1.3. Find constants a, b,

Problem 1.7. Let A be an n n diagonal matrix. Show that the

for all solutions of x0 = Ax.

2. Ordinary differential equations of higher order

where x : R Rn (typically n 3) is the position of some object as a

Clearly if x is any solution of (5), then (x, y) with y = x0 is a solution

Let f : Rn R be a differentiable function. The gradient of f ,

which sends t to f (x(t)). The chain rule becomes

2.3. Energy. A force field F : Rn Rn is called conservative if

Define the energy of a curve x : R Rn by

The energy is constant along solution curves x : R Rn of Newtons

for some function H : R2n R called the hamiltonian (function).

2.5. Gradient vector fields. A vector field F : Rn Rn in Rn is

Problem 2.4. Which of the following vector fields F : R2

Problem 2.3. Prove the Leibniz rule (7).

Problem 2.5. Suppose that a vector field in R2 is both gradient and

Problem 2.7. A force field F : Rn Rn is called central if F (x) =

3. Linear vector fields

A linear transformation A : Rn Rn is invertible if there exists a

y 0 = Dy and the previous paragraph applies. In particular, this system

3.4. The exponential of a matrix. Let A be an n n matrix. The

This series converges absolutely for all A.

Theorem 3.3. Given any 2 2 real matrix A, there is an invertible

From these it is easy to reconstruct the phase portrait of any linear

(a) Improper Nodes

where B : R R is a continuous map. The method of variation of

x(t) = etA f (t) ,

where f : R Rn is some differentiable curve. (This represents no loss

for some K R . Solving for x(t) we finally have

for some K Rn . Moreover every solution is of this form. Indeed, if y :

Conversely, the sum of a solution of (11) and a solution of (4) is a

where s : R R. This equation is equivalent to the first order equation

R can also be solved by solving the associated first order

where A is given by (14) and B : R Rn is given by

(b) x01 = 2x1 + x2

Problem 3.2. Find a 22 matrix A such that one solution to x0 = Ax

This means the solutions depend continuously on the initial conditions.

(a) By examining the equivalent first order system

Problem 3.7. Let a 2 2 matrix A have real, distinct eigenvalues

(b) 0 < <

(c) < < 0

(d) < 0 <

Problem 3.8. Let A be a nn real matrix. Prove that its eigenvalues

(b) x01 = 2x2

Problem 3.11. Sketch the portraits of:

Which solutions tend to 0 as t ?