Waterloo Undergrad ODE 1 Notes


Course Notes for AMATH 251

Introduction to Differential Equations (Advanced)

K.G. Lamb1
Department of Applied Mathematics
University of Waterloo

September 3, 2019

1 © K.G. Lamb, September 3, 2019


Contents

1 Introduction 1
1.1 Mathematical Modelling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Falling objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.2 Plotting solutions, scaling and non-dimensional variables: . . . . . . . . . . . 5
1.2 Population Growth: the Logistic Equation . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3 Classification of Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.4 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

2 First-Order Ordinary Differential Equations 11


2.1 Separable DEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.1.1 Autonomous First-order DEs . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.2 Numerical Methods for ODEs: Construction of approximate solutions numerically . 15
2.2.1 Euler’s Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.2.2 The second-order Runge-Kutta or mid-point method . . . . . . . . . . . . . . 19
2.2.3 A Fourth-order Runge-Kutta Method . . . . . . . . . . . . . . . . . . . . . . 22
2.3 Picard’s Method: finding approximate solutions . . . . . . . . . . . . . . . . . . . . . 23
2.4 Existence-Uniqueness of Solutions of First-Order Differential Equations . . . . . . . 24
2.5 Linear First-Order Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . 26
2.5.1 Illustrative example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.5.2 Standard form for linear first-order DEs . . . . . . . . . . . . . . . . . . . . . 28
2.5.3 Solving linear first-order equations . . . . . . . . . . . . . . . . . . . . . . . . 28
2.5.4 Existence-Uniqueness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.5.5 Radioactive Decay . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.5.6 Solving linear first-order equations using an integrating factor . . . . . . . . . 35
2.5.7 Applications of first-order linear ODEs . . . . . . . . . . . . . . . . . . . . . . 38
2.6 Nonlinear First-order Differential Equations . . . . . . . . . . . . . . . . . . . . . . . 41
2.6.1 The clepsydra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.6.2 Clairaut Equations and Singular Solutions . . . . . . . . . . . . . . . . . . . . 43
2.6.3 A geometry problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
2.7 Exact First-Order Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . 46

2.7.1 Test For Exactness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.7.2 Inexact differential equations . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.8 Solution by Substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
2.9 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

3 Non-dimensionalization and Dimensional Analysis 63


3.1 Non-dimensionalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.2 Dimensional Analysis: The Buckingham-π Theorem . . . . . . . . . . . . . . . . . . 68
3.2.1 Proof of the Buckingham-π Theorem . . . . . . . . . . . . . . . . . . . . . . . 71
3.3 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

4 Objects Falling/Rising Through a Fluid 77


4.1 Drag Force on an Object Moving Through a Newtonian Fluid . . . . . . . . . . . . . 77
4.1.1 The Parachute Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
4.2 Buoyancy Forces and Added Mass . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.2.1 Buoyancy Force . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.2.2 Acceleration-reaction or added mass . . . . . . . . . . . . . . . . . . . . . . . 82

5 Second-Order Linear Ordinary Differential Equations 85


5.1 Some general theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
5.1.1 Finding a second homogeneous solution . . . . . . . . . . . . . . . . . . . . . 89
5.2 Constant Coefficient Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.2.1 Homogeneous Constant Coefficient DEs . . . . . . . . . . . . . . . . . . . . . 90
5.2.2 The Unforced Linear Oscillator . . . . . . . . . . . . . . . . . . . . . . . . . . 95
5.2.3 Inhomogeneous Equations: Method of Undetermined Coefficients for Con-
stant Coefficient ODEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
5.3 Resonance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.3.1 The simple harmonic oscillator . . . . . . . . . . . . . . . . . . . . . . . . . . 108
5.3.2 Solution of the equation for the forced, under-damped simple harmonic os-
cillator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
5.4 Undamped Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
5.5 Method of Variation of Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
5.6 Cauchy-Euler Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
5.7 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117

6 Systems of Equations 119


6.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
6.1.1 Matrix Formulation: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
6.2 Elimination Method for Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
6.3 Introduction to the Phase Plane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
6.3.1 Classification of Critical Points . . . . . . . . . . . . . . . . . . . . . . . . . . 126

6.4 Competing Species . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
6.5 Linearization near a critical point . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
6.5.1 General linearization procedure . . . . . . . . . . . . . . . . . . . . . . . . . . 131
6.6 Nonlinear Oscillations: The Volterra prey-predator model . . . . . . . . . . . . . . . 132
6.7 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134

7 Laplace Transforms 137


7.1 Existence of the Laplace Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
7.2 Properties of the Laplace Transform . . . . . . . . . . . . . . . . . . . . . . . . . 142
7.3 Inverse Laplace Transforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
7.3.1 Linearity of the Inverse Transform . . . . . . . . . . . . . . . . . . . . . . . . 145
7.3.2 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
7.4 Solving constant coefficient differential equations. I . . . . . . . . . . . . . . . . . . . 148
7.5 The Heaviside Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
7.5.1 Laplace Transform of the Heaviside function . . . . . . . . . . . . . . . . . . 153
7.6 Convolutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
7.6.1 The Convolution Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
7.7 Periodic Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
7.8 Impulses and the Dirac Delta Function . . . . . . . . . . . . . . . . . . . . . . . . . . 163
7.8.1 Laplace Transform of the Delta function . . . . . . . . . . . . . . . . . . . . . 165
7.8.2 An undamped oscillator subjected to an impulsive force . . . . . . . . . . . . 166
7.9 Laplace Transform Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172

8 APPENDIX A: Differentiating and Integrating Complex-Valued Functions of a


Real Variable 173
8.1 From first principles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
8.2 Power series approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175

9 APPENDIX B: Equality of mixed partial derivatives 177

Chapter 1

Introduction

This course provides an introduction to ordinary differential equations (ODEs) and mathematical
modelling based on ODEs. It includes material on non-dimensionalization of ODEs, which in
general simplifies problems by reducing the number of parameters, and dimensional analysis. An
introduction to Laplace Transforms provides additional tools for solving linear constant-coefficient
ODEs.
The first documented mathematical model was developed by Claudius Ptolemy (c. 100–178
CE) to describe and predict the motions of the sun, moon and planets about the Earth (which at
that time was taken to be the centre of motion) [3]. His model was described in his Mathematike
Syntaxis, a work in 13 books, better known as the Almagest after Islamic scientists, who translated
the book from Greek to Arabic, began calling it al-magisti. Models based on differential equations
had to wait until the 1660s and 1670s for the invention of Calculus by Sir Isaac Newton (1642–1727)
and Gottfried Leibniz (1646–1716).

1.1 Mathematical Modelling


Mathematical modelling requires

(i) A problem statement: a detailed and thorough description of the problem.

(ii) Use of relevant guiding principles to relate various quantities resulting in equations that need
to be solved. These comprise the mathematical model.

(iii) Finding a solution of the equations.

(iv) Interpretation of the solution.

A wide range of mathematical models based on differential equations has since been developed,
covering a vast range of fields including the physical sciences, economics, biology, and the health
and social sciences. Since Newton was one of the inventors of Calculus, let's start with an example
from Newtonian Mechanics.

PROBLEM: We wish to develop a model for the motion of an object moving perpendicular to
the Earth’s surface for three situations:

1. At distances far from the Earth where the atmosphere is negligible but the force of gravity
varies;
2. For an object moving near the Earth but with negligible air resistance (drag);
3. For an object moving near the Earth such that air resistance affects its motion.

We are going to make several simplifying assumptions. For example, we will assume that the object
is much denser than the air so that only gravitational and drag forces need to be considered (e.g.,
we ignore the buoyancy force which is crucial to model a bubble of air rising through water). We
ignore relativistic effects, the gravitational forces exerted by other planets etc.

1.1.1 Falling objects


To proceed we need two laws from Newtonian Mechanics and one empirical law which provides an
approximate model for air drag. The two laws from Newtonian Mechanics are:

1. Newton's Universal Law of Gravitation: Two bodies of masses M1 and M2 exert a mutually
attractive gravitational force on one another of magnitude GM1M2/r², where G is a constant
and r is the distance between their centres of mass.
2. Newton's 2nd Law: For a body of mass m moving with velocity ~v the time rate of change of
its momentum m~v is equal to the net force F~ acting on the body. Mathematically

d(m~v)/dt = F~   (1.1)

or, in one dimension,

d(mv)/dt = F.   (1.2)

If m is constant then

m dv/dt = F.   (1.3)

Model (a): Varying gravity, no drag.

(i) Let m be the mass of the object a height s > 0 above the Earth’s surface (Figure 1.1).
(ii) Let M be the mass of the Earth, which we assume to be spherical with radius R and centre
of mass at the geometric centre.
(iii) The gravitational force on the object, whose centre of mass is a distance R + s from the centre
of the earth, is

Fgr = −GmM/(R + s)²   (1.4)

(iv) From Newton's 2nd law

m dv/dt = Fgr = −GmM/(R + s)²,   (1.5)

or, using v = ds/dt,

m d²s/dt² = −GmM/(R + s)².   (1.6)

Figure 1.1: Schematic of the coordinate system for an object moving perpendicular to the Earth's surface. s
is the height of the centre of mass of an object of mass m above the Earth's surface. The Earth is assumed
to be spherical with radius R, mass M and centre of gravity at the geometric centre.

Equation (1.6), a second-order nonlinear differential equation, is our mathematical model. It is an


approximate model as many simplifications have been made.
Note that here, and below, v > 0 if s is increasing, i.e., if the object is moving upward (away
from the Earth’s surface) and v < 0 if it is moving downward.

Model (b): Close to the Earth’s surface, no drag.

The gravitational force is

Fgr = −GmM/(R + s)² = −(GmM/R²) · 1/(1 + s/R)² ≈ −GmM/R²   (1.7)

if s/R ≪ 1. This is what is meant by 'close to the Earth's surface'. Now R ≈ 6400 km so for s ≤ 64
km the error in approximating R + s by R is less than 1%. With this approximation

m dv/dt = Fgr = −GmM/R².   (1.8)

Definition: The gravitational acceleration at the Earth's surface is

g = MG/R².   (1.9)
Thus our model is

m dv/dt = −mg,   (1.10)

or

dv/dt = −g,   (1.11)

which is a linear first-order differential equation for v(t). In terms of s we have

d²s/dt² = −g,   (1.12)

a linear second-order differential equation for s(t). This is the constant gravitational acceleration,
no drag model. (Aside: the gravitational acceleration measured at the Earth's surface includes a
component associated with the Earth’s rotation because it is measured in an accelerating reference
frame. The gravitational acceleration varies with latitude in part because of latitudinal variations
of the effects of rotation and also because the distance from the centre of the Earth to the Earth’s
surface varies).
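As a quick sanity check on (1.9), standard reference values of G, M and R (these constants are assumptions, not values given in the notes) reproduce the familiar gravitational acceleration:

```python
# Sanity check of g = MG/R^2 (equation 1.9) using standard reference
# values for the constants (assumed, not taken from the notes).
G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
M = 5.972e24    # mass of the Earth, kg
R = 6.371e6     # mean radius of the Earth, m

g = G * M / R**2
print(f"g = {g:.3f} m/s^2")   # close to the familiar 9.81 m/s^2
```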

Model (c): Constant gravitational acceleration with air resistance.

Now our model equation has the form


m dv/dt = Fgr + Fd = −mg + Fd   (1.13)
where Fd is the drag force. The drag force is very complicated. It depends on the object’s velocity
and on its shape — in particular on how fluid flows over the object’s surface. Fd always acts to
oppose the motion of the object. The force exerted by a fluid flowing around an object can have
a component perpendicular to the direction of motion so an object can accelerate sideways (or
tangentially). There are different mechanisms that give rise to tangential forces one of which gives
the lift that allows airplanes to fly.
We consider two Empirical Laws

(i) At 'low' speeds Fd ∝ v:

Fd = −γv,   γ > 0 a constant.   (1.14)

(ii) At 'high' speeds |Fd| ∝ v²:

Fd = −βv|v|,   β > 0 a constant.   (1.15)

More about what ‘low’ and ‘high’ mean later. The coefficients γ and β are dimensional drag
coefficients. More commonly a dimensionless drag coefficient is used which we will discuss later.
Note that in both cases Fd > 0 if v < 0 and Fd < 0 if v > 0 so that the drag force always acts to
oppose the motion of the object by decreasing its speed (|v|) and hence its kinetic energy.
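The sign property just noted is easy to check numerically. A minimal sketch, with arbitrary illustrative values of the coefficients γ and β (the notes do not fix them):

```python
# Drag laws (1.14) and (1.15): verify the drag force always opposes
# the motion, i.e. Fd * v <= 0 for every velocity v.
def drag_linear(v, gamma=0.5):     # Fd = -gamma * v       (law 1.14)
    return -gamma * v

def drag_quadratic(v, beta=0.1):   # Fd = -beta * v * |v|  (law 1.15)
    return -beta * v * abs(v)

for v in [-10.0, -1.0, 0.0, 1.0, 10.0]:
    assert drag_linear(v) * v <= 0      # opposes motion
    assert drag_quadratic(v) * v <= 0   # opposes motion
```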
Our models are

dv/dt = −g − (γ/m)v   (linear drag law)   (1.16)

and

dv/dt = −g − (β/m)v|v|   (quadratic drag law)   (1.17)

The first model equation is a first-order linear DE. The second is a first-order nonlinear DE.

Example Solutions:

1. Constant acceleration, no drag

We have

d²s/dt² = −g   (1.18)

which can be integrated twice to give

s(t) = −(1/2)gt² + v0t + s0   (1.19)

where v0 = ds/dt(0) and s0 = s(0) are the initial velocity and height of the object.

2. Constant gravitational acceleration, linear drag:

dv/dt = −g − (γ/m)v = −(γ/m)(v + gm/γ).   (1.20)

Some insight before solving:

(i) dv/dt = 0 when v = −gm/γ. Since this is a constant, v = −gm/γ is a solution of the DE.

(ii) If v > −gm/γ, dv/dt < 0 so v decreases.

(iii) If v < −gm/γ, dv/dt > 0 so v increases.

(iv) The previous two points show that if initially dv/dt ≠ 0 then v → −gm/γ as t increases.

The DE is separable and easily solved:

dv/(v + mg/γ) = −(γ/m) dt   (1.21)

so

ln |v + mg/γ| = −(γ/m)t + C   (1.22)

or

v + mg/γ = Ae^{−γt/m}.   (1.23)

Why could we drop the absolute value on the left hand side? This is an important question
whose answer you should understand. Thus, the solution of the DE is

v = −mg/γ + Ae^{−γt/m}.   (1.24)

Interpretation:

• As t → ∞, v → −mg/γ. This is called the terminal velocity (often the terminal velocity is
taken as the absolute value of this, technically the terminal speed). The terminal velocity
increases in magnitude as the drag coefficient gets smaller (i.e., drag is reduced for fixed
velocity), as the object's mass increases and as the gravitational acceleration increases, all
of which make sense. This is part of the interpretation of the solution — does your solution
make sense (if not, the model may be a poor one).

• The velocity relaxes to the terminal velocity exponentially with time scale m/γ. The larger
the mass or the weaker the drag coefficient, the longer it takes the object's velocity to relax
to the terminal velocity.
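A numerical sanity check on this interpretation: the sketch below (with illustrative values of m, γ and A, which are assumptions rather than values from the text) verifies that the solution (1.24) satisfies the DE (1.16) and relaxes to the terminal velocity −mg/γ.

```python
import math

# Solution (1.24): v(t) = -mg/gamma + A*exp(-gamma*t/m).
m, g, gamma = 2.0, 9.81, 0.5   # illustrative values (assumed)
A = 5.0                        # constant fixed by the initial condition

def v(t):
    return -m * g / gamma + A * math.exp(-gamma * t / m)

# Check the DE dv/dt = -g - (gamma/m) v with a centred finite difference.
t, h = 1.0, 1e-6
dvdt = (v(t + h) - v(t - h)) / (2 * h)
assert abs(dvdt - (-g - gamma / m * v(t))) < 1e-6

# After many time scales m/gamma the velocity is essentially -mg/gamma.
assert abs(v(20 * m / gamma) - (-m * g / gamma)) < 1e-3
```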

1.1.2 Plotting solutions, scaling and non-dimensional variables:


For a specific problem, with given values of m, g, and γ the solutions are easily plotted. In general,
however, if these values are given as undetermined parameters it is convenient to scale the variables
and put them in terms of dimensionless variables. Here, for example, we can use ṽ = v/(mg/γ) as a
function of t̃ = (γ/m)t. In terms of ṽ and t̃ the solution is

ṽ = Ãe^{−t̃} − 1   (1.25)

Figure 1.2: Solutions of the linear drag law for an object moving perpendicular to the Earth's surface
plotted using non-dimensional variables t̃ and ṽ.


which involves only a single parameter à = mg . Sample solutions for different values of à are
shown in Figure 1.2. The variables ṽ and t̃ are dimensionless: mg γ has dimensions of velocity so
dividing v by this quantity results in a dimensionless variable. We could have non-dimensionalized
the problem from the beginning and this is often worth doing because it simplifies the equation.
Consider again the DE

dv/dt = −g − (γ/m)v.   (1.26)

Setting v = Uṽ and t = T t̃ gives

dv/dt = d(Uṽ)/dt = U dṽ/dt = U (dṽ/dt̃)(dt̃/dt) = (U/T) dṽ/dt̃.   (1.27)

The differential equation becomes

(U/T) dṽ/dt̃ = −g − (γ/m)Uṽ,   (1.28)

or

dṽ/dt̃ = −(T/U)g − (Tγ/m)ṽ.   (1.29)

Setting T = m/γ makes the coefficient of ṽ on the right hand side equal to −1. Next choose U so that
(T/U)g = 1. This gives U = Tg = mg/γ. The result is the dimensionless differential equation

dṽ/dt̃ = −1 − ṽ,   (1.30)
You should convince yourself that the dimensionless variables ṽ and t̃ are the same as those used
to plot the solutions in Figure 1.2 and that the general solution of the dimensionless DE is

ṽ = −1 + Ãe^{−t̃}.   (1.31)

We will learn more about non-dimensionalizing equations and dimensional analysis in Chapter 3.
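The scaling can be verified directly: with T = m/γ and U = mg/γ, the dimensional solution (1.24), divided by U and evaluated at t = T t̃, collapses onto the dimensionless solution (1.31). A sketch with illustrative (assumed) parameter values:

```python
import math

m, g, gamma = 2.0, 9.81, 0.5      # illustrative values (assumed)
T = m / gamma                     # time scale
U = m * g / gamma                 # velocity scale
A = 5.0                           # constant in the dimensional solution (1.24)
A_tilde = A / U                   # corresponding constant in (1.31)

def v(t):                         # dimensional solution (1.24)
    return -m * g / gamma + A * math.exp(-gamma * t / m)

def v_tilde(t_tilde):             # dimensionless solution (1.31)
    return -1 + A_tilde * math.exp(-t_tilde)

# v(t)/U should equal v_tilde(t/T) for all t.
for t in [0.0, 0.5, 1.0, 5.0]:
    assert abs(v(t) / U - v_tilde(t / T)) < 1e-12
```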

Comments on the solutions:

• For the constant gravitational force without drag we solved a second-order DE for s(t). The
solution has two arbitrary constants v0 and s0 which can be determined from initial conditions.

• For the constant gravitational force, linear drag problem we had a first order DE for the
velocity v. The solution has one arbitrary constant which can be determined from the initial
conditions.

Example: Find the height s(t) of a ball thrown vertically upward from height s0 = 1 m with
velocity v0 = 10 m s−1. Ignore air resistance and assume a constant gravitational acceleration.

Solution: From the above solution

s(t) = −(1/2)gt² + v0t + s0.   (1.32)

Thus s(0) = s0 = 1 m and v(t) = −gt + v0 =⇒ v(0) = v0 = 10 m s−1. Thus the solution is

s(t) = −(1/2)gt² + 10t + 1   (1.33)

where s is in meters, t in seconds and g ≈ 9.81 m s−2.

Note that the maximum value of s is s0 + v0²/(2g) ≈ 6 m, so s is always much smaller than R and
hence the approximation of a constant gravitational acceleration is OK.
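The maximum height in this example occurs when v(t) = −gt + v0 = 0, i.e. at t* = v0/g, giving s_max = s0 + v0²/(2g). A quick numerical check:

```python
g, v0, s0 = 9.81, 10.0, 1.0   # values from the example

def s(t):                     # solution (1.33)
    return -0.5 * g * t**2 + v0 * t + s0

t_star = v0 / g                       # time of maximum height (where v = 0)
s_max = s0 + v0**2 / (2 * g)          # predicted maximum height

assert abs(s(t_star) - s_max) < 1e-12
print(f"t* = {t_star:.2f} s, s_max = {s_max:.2f} m")   # about 1.02 s and 6.10 m
```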
In general, the solution of an nth order ODE involves n arbitrary constants that are determined
by n conditions (e.g., initial or boundary conditions). This is not always the case: for some nth
order ODEs not all solutions involve n constants.

1.2 Population Growth: the Logistic Equation


Let P(t) represent the population of some species at time t. Populations take integer values and
jump at discrete times. We assume that the population is so large that we can make the continuum
approximation, treating the population as varying continuously. The population could be
measured in thousands, so values like 1.234 make sense. The continuous function P(t) is expected
to be a 'good fit' to the discretely varying population.
The growth rate per individual, (1/P) dP/dt, is the difference between the average birth rate and the
average death rate. We assume the following:

1. the average birth rate (number of births per unit population) is constant, say β;

2. the average death rate, due to the effects of crowding and increased competition for food, is
proportional to the size of the population. Let δ be its constant of proportionality.

This gives a simple model for population growth, called the logistic equation:

(1/P) dP/dt = β − δP,

or

dP/dt = P(β − δP).   (1.34)

The population increases if P < β/δ and decreases if P > β/δ. In both cases the population approaches
the steady state (equilibrium) solution P = β/δ as t → ∞.
This equation is separable and easily solved. The solution will depend on the two parameters
β and δ.
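Since (1.34) is separable, it has the standard logistic closed-form solution P(t) = (β/δ)/(1 + Ce^{−βt}), with C fixed by P(0). The sketch below (illustrative, assumed values of β and δ) checks this against the DE and the approach to the equilibrium β/δ:

```python
import math

beta, delta = 1.0, 0.01        # illustrative rates (assumed)
K = beta / delta               # equilibrium population beta/delta
P0 = 10.0                      # initial population
C = K / P0 - 1                 # fixes P(0) = P0

def P(t):                      # closed-form logistic solution
    return K / (1 + C * math.exp(-beta * t))

# Check the logistic DE dP/dt = P*(beta - delta*P) by a centred difference.
t, h = 2.0, 1e-6
dPdt = (P(t + h) - P(t - h)) / (2 * h)
assert abs(dPdt - P(t) * (beta - delta * P(t))) < 1e-4

# The population approaches the steady state beta/delta.
assert abs(P(30.0) - K) < 1e-6
```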

1.3 Classification of Differential Equations


Equations of the form

F(x, y, y′, y″, . . . , y^(n)) = 0   (1.35)

to be solved for the function y(x) are called ordinary differential equations, or ODEs. Here y′ = dy/dx
is the first derivative of y, y″ = d²y/dx² is the second derivative and y^(n) denotes the nth derivative.
That is, F is a function of x, y(x) and the first n derivatives of y(x).

Definition: The order of the ODE is the order of the highest derivative.

Definition: An nth order ODE is linear if it has the form

a_n(x) d^n y/dx^n + a_{n−1}(x) d^{n−1}y/dx^{n−1} + a_{n−2}(x) d^{n−2}y/dx^{n−2} + · · · + a_1(x) dy/dx + a_0(x)y = g(x)   (1.36)

where the a_k(x) and g(x) are known functions. If an ODE is not linear it is said to be nonlinear.

Definition: An nth order linear ODE is homogeneous linear if g(x) = 0. Otherwise it is called
an inhomogeneous, or forced ODE.

Definition: In general an nth order ODE requires n conditions to uniquely determine a solution.
The problem of solving (1.35) subject to the n conditions

y(x0) = y0
y′(x0) = y0′
⋮   (1.37)
y^(n−1)(x0) = y0^(n−1)

where the y0^(j) are given constants, is called an initial value problem, or IVP. Note that all of
the conditions are given at the same value of x. Problems arise in which conditions are specified at
different locations. This situation is more complicated and solutions may not exist. For nonlinear
ODEs there may be problems for which it is inappropriate to specify n initial conditions. We will
see some examples below.

Examples

(i) y′(x) = e^x is a first-order linear ODE for y(x);

(ii) x²y″ − 2y′ + 3xy = sin(x) is a second-order linear ODE for y(x);

(iii) x²(y′)² + 2y = 6 is a first-order nonlinear ODE for y(x);

(iv) t² d³x/dt³ + tx = 0 is a third-order linear ODE for x(t);

(v) v(u) d⁴v/du⁴ + v³(u) = 6u is a fourth-order nonlinear ODE for v(u);

(vi) d²x/dt² + tan(x) = 2t is a second-order nonlinear ODE for x(t).

Definition: Partial differential equations are equations that involve partial derivatives of multi-variable functions, e.g., an equation for the function z(x, y) such as

zxy + zy² − xzx = 0.   (1.38)
The short form for this type of equation is PDE. Famous examples include

1. The one-dimensional wave equation


ηtt − c²ηxx = 0.   (1.39)

Here t is time and x is a spatial coordinate. It is used to model waves in a violin string,
although a better model would include the effects of damping. There are many other wave equations.
2. Laplace’s equation which in R3 has the form
∇2 u = uxx + uyy + uzz = 0. (1.40)
This models the shape of a soap film.
3. The heat equation which in one spatial dimension has the form
wt = κwxx (1.41)
where t is time and x is the spatial coordinate. This, for example, models the flow of heat
along a one-dimensional rod. The heat equation modelling the flow of heat in R3 has the
form
wt = κ∇2 w. (1.42)

ODEs and PDEs are different types of differential equations, collectively denoted DEs. PDEs are generally
much more difficult. We will often use the term DEs in this course, by which we will mean ODEs.

1.4 Problems
1. Nondimensionalize the ODE for a falling object subjected to a quadratic drag law. Show that
it can be written in the form

dṽ/dt̃ = −1 − |ṽ|ṽ.   (1.43)

2. Nondimensionalize the logistic equation. Show that it can be written in the form

dP̃/dt̃ = P̃(1 − P̃).   (1.44)

Chapter 2

First-Order Ordinary Differential


Equations

We begin our study of ODEs by considering the simplest possible cases: first-order ODEs, which
have the general form

F(x, y, y′) = 0.   (2.1)

One of the primary goals in early studies of differential equations was the classification of types
of equations that can be solved. There are several different types of first-order ODEs that can be
solved, many of which often arise in applications. Hence they are important types of ODEs. In
this chapter we consider a few of them.

2.1 Separable DEs


One class of ODEs that can be solved is separable first-order differential equations, which most of
you have seen in first-year calculus. To quickly recap, a first-order differential equation of the form

y′ = f(x, y)   (2.2)

is said to be separable if the function f(x, y) (sometimes called a forcing function) has the form
f(x, y) = h(x)g(y), i.e., it is the product of a function of x with a function of y. The equation can
then be written as

dy/g(y) = h(x) dx   (2.3)

and integrated to give

∫^y dy/g(y) = ∫^x h(x) dx + C.   (2.4)

Here the integrals represent any anti-derivative of the corresponding functions and C is an arbitrary
constant. The equation is said to be solved although in practice it may be impossible to find
analytical expressions for the anti-derivatives. This technique for solving separable equations was
found by Leibniz in 1691.

Example: Find the solution of the initial value problem (IVP)

x³y′ = x/y,
y(1) = 2.   (2.5)

Solution: Writing the DE in the form y′ = f(x, y) gives y′ = 1/(x²y), which is separable. The DE
can be written as

yy′ = 1/x².   (2.6)

Integrating we have

∫^y y dy = ∫^x dx/x² + C   (2.7)

or

(1/2)y² = −1/x + C.   (2.8)

This is called the general solution. The value of C is determined from the condition y(1) = 2. We
get C = 3 so we have

(1/2)y² = −1/x + 3.   (2.9)

We are not quite done because this contains two solution curves:

y(x) = ±√(6 − 2/x).   (2.10)

The initial condition is satisfied by only one of these, namely

y(x) = √(6 − 2/x).   (2.11)

Because we only deal with real-valued functions in this course the desired solution exists only when
6 − 2/x ≥ 0. This includes two ranges of values for x: x < 0 and x ≥ 1/3. The initial condition
y(1) = 2 tells us that the solution curve must pass through x = 1. So our solution curve includes
only values of x with x ≥ 1/3.
The condition y(1) = 2 is called an initial condition and the problem consisting of the DE plus
the initial condition is called an initial value problem. The solution of an initial value problem
does not contain any arbitrary constants. In general an initial value problem for an nth-order ODE
requires n initial conditions.
Note: In the preceding example you can write (2.6) immediately without first putting the
equation in the form y′ = f(x, y). Any equation of the form a(x)b(y)y′ = c(x)d(y) is separable.
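A quick numerical sketch confirms the solution of the worked IVP: y(x) = √(6 − 2/x) satisfies both the initial condition and the DE x³y′ = x/y.

```python
import math

def y(x):                          # solution (2.11) of the IVP (2.5)
    return math.sqrt(6 - 2 / x)

assert abs(y(1.0) - 2.0) < 1e-12   # initial condition y(1) = 2

# Check x^3 * y' = x / y at several points using a centred difference.
h = 1e-6
for x in [0.5, 1.0, 2.0, 5.0]:     # all within the range x >= 1/3
    dydx = (y(x + h) - y(x - h)) / (2 * h)
    assert abs(x**3 * dydx - x / y(x)) < 1e-5
```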

Examples: Sample solutions of four separable DEs are plotted in figure 2.1. For each DE several
points (x0 , y0 ) on the x-y plane are indicated along with the solution passing through them.

1. Figure 2.1(a) shows solutions y(x) of y′ = y. The solutions have the form y = ae^x where a
is an arbitrary constant (exercise). Any point (x0, y0) in R² has one and only one solution
passing through it, namely the solution with a = y0e^{−x0}. The solutions cover the whole plane.

Figure 2.1: Solutions of first-order DEs. Solid circles represent points in R²; solutions passing through
these points are plotted. (a) y′ = y. (b) y′ = −y/(x − 1); the grey line x = 1 is a line along which no
solutions exist. (c) y′ = −y/(2√x); no solutions pass through the grey region x < 0. (d) y′ = x/y.

2. Figure 2.1(b) shows solutions y(x) of y′ = −y/(x − 1), which have the form y(x) = a/(x − 1)
where a is an arbitrary constant (exercise). Any point (x0, y0) not lying on the line x = 1 has
a single solution passing through it, with a = y0(x0 − 1). In this case the solutions cover the
whole plane minus the line x = 1.
The exclusion of the line x = 1 arises because of the choice to seek solutions of the DE of
the form y(x). Alternatively we can write the DE as dx/dy = −(x − 1)/y, which leads to
solutions x(y) = 1 + a/y. Now the solutions x(y) cover the xy-plane minus the line y = 0.
Thus, the part of the plane covered by the solutions can depend on what type of solutions
you want to include. A general form of the DE is y dx + (x − 1) dy = 0, which does not favour
y(x) or x(y). Solutions of this DE can be written as y(x − 1) = a, and for a = 0 there are two
curves: x = 1 and y = 0. The solutions y(x − 1) = a include the solutions of the form y(x)
and x(y), and the solutions now cover the whole plane. Note there are two solutions passing
through (x0, y0) = (1, 0) and a single solution through all other points.

3. Figure 2.1(c) shows solutions of y′ = −y/(2√x), which have the form y = ae^{−√x} where a is an
arbitrary constant (exercise). Because we consider real-valued functions only in this course,
solutions only exist for x > 0. In this example the solutions cover the half-plane x > 0, with
any point (x0, y0) with x0 > 0 having one and only one solution passing through it. Points
with x = 0 have been excluded since the right hand side of the DE is singular at x = 0. The
solutions can, however, be extended to x = 0 by taking the limit x → 0+ since lim_{x→0+} y = a.
What happens to the derivative as x → 0+?

4. Figure 2.1(d) shows solutions of y′ = x/y, which have the form y² − x² = a where a is an
arbitrary constant (exercise). The solutions are hyperbolae which cover the whole plane as
a varies between ±∞. For each (x0, y0) in the upper or lower half plane (i.e., with y0 > 0
or y0 < 0) there is a unique solution y(x) passing through the point. For points with y0 = 0
there are two solutions y(x) passing through the point. For example, through (x0, y0) = (1, 0)
the two solutions are y(x) = ±√(x² − 1). These two curves join to form a single curve in the
x-y plane which has the functional form x(y) = √(1 + y²). The special nature of points with
y = 0 when solutions y(x) are sought is indicated by the singularity in the right-hand side of
the differential equation y′ = x/y.

In the above we found solutions of the form y(x) in cases 1, 2, and 3. These are called explicit
solutions: we have given the solution y as a function of x. In the last case the solution y² − x² = a
has the form f(x, y) = 0. This defines y(x) implicitly.
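For case 4, the explicit branch through (1, 0) can be checked numerically against y′ = x/y away from the singular line y = 0 (a sketch, using the upper branch y = √(x² − 1)):

```python
import math

def y_plus(x):                 # upper branch of y^2 - x^2 = -1, defined for x >= 1
    return math.sqrt(x**2 - 1)

# Away from y = 0 the branch satisfies y' = x / y.
h = 1e-6
for x in [1.5, 2.0, 3.0]:
    dydx = (y_plus(x + h) - y_plus(x - h)) / (2 * h)
    assert abs(dydx - x / y_plus(x)) < 1e-5

# Both branches pass through (1, 0), where y' = x/y is singular.
assert y_plus(1.0) == 0.0
```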

2.1.1 Autonomous First-order DEs


A special case of a separable first-order DE is the case when the forcing function is independent of
the independent variable. Such equations often occur when the independent variable is time, e.g.,

dx/dt = f(x).   (2.12)

The reason for this is that the behaviour of many systems does not depend on the start time.
Examples we have seen so far include all our models for the motion of an object subject to
gravitational and drag forces, and the logistic model. Systems driven by time-varying forces will not
be autonomous (e.g., a mixing tank problem for which the inflow varies with time). The solution
of (2.12) is given implicitly by

∫^x ds/f(s) = t + c   (2.13)
which you may or may not be able to put in closed form. Much about the behaviour can be
determined from the equation, for example:

• Any point x∗ where f(x∗) = 0 gives a constant solution x(t) = x∗; such a point is called a fixed point;

• x(t) is increasing with time in regions where f (x) > 0;

• x(t) is decreasing with time in regions where f (x) < 0.
Plotting f(x) shows the regions where x is increasing and decreasing. It is easy to show that if the
solution x(t) exists on an interval (a, b) then x(t) is either increasing or decreasing monotonically.
If f(x) is a continuous function, x(t) cannot take the same value at two different times unless x(t)
is a constant. Proof of this is left as a problem.
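The sign analysis above is easy to automate. The following Python sketch (not from the notes; the logistic right-hand side f(x) = x(1 − x) is chosen here purely as an illustration) locates the fixed points of x′ = f(x) by scanning for sign changes of f and bisecting each one:

```python
# Sketch only: classify an autonomous DE x' = f(x) by the sign of f.
# The logistic choice f(x) = x*(1 - x) is a hypothetical example, not from the text.

def fixed_points(f, lo, hi, n=10000, tol=1e-10):
    """Return approximate roots of f on [lo, hi] found via sign changes."""
    xs = [lo + (hi - lo) * i / n for i in range(n + 1)]
    roots = []
    for a, b in zip(xs, xs[1:]):
        fa, fb = f(a), f(b)
        if fa == 0.0:
            roots.append(a)                 # grid point lands exactly on a root
        elif fa * fb < 0.0:                 # sign change: bisect it down to tol
            while b - a > tol:
                m = 0.5 * (a + b)
                if f(a) * f(m) <= 0.0:
                    b = m
                else:
                    a = m
            roots.append(0.5 * (a + b))
    return roots

f = lambda x: x * (1.0 - x)
print(fixed_points(f, -0.5, 1.5))   # [0.0, 1.0]; x(t) increases where f > 0, i.e. on (0, 1)
```

Between the fixed points f is one-signed, which is exactly the monotonicity statement above.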
2.2 Numerical Methods for ODEs: Construction of approximate
solutions numerically
An example of a separable ODE for which a solution y(x) can't be explicitly found is

    y′ = (y⁴ + 3y³ − sin(y)) tanh(x² + sin(x)).       (2.14)
The vast majority of differential equations cannot be solved analytically. Such DEs often
arise in mathematical models. In this case ‘numerical solutions’ must be sought. Such solutions
are approximate solutions and there are a number of sophisticated methods to compute numerical
solutions of sufficient accuracy. A good introductory reference for numerical methods for ODEs is
the text by Leveque [4].

2.2.1 Euler’s Method


The basis for many numerical methods is Euler’s method, which as the name suggests was used by
Euler himself to find approximate solutions to problems of interest. This method uses the fact that
usually the desired solution is ‘close’ to other solutions in some sense.
Consider the initial value problem (IVP)

    y′(x) = f(x, y),    y(x0) = y0.                   (2.15)

Euler proposed the following method to find approximate solutions:

1. From the differential equation and initial condition we know the value of y and its derivative
at x = x0 :

    y(x0) = y0,
    y′(x0) = f(x0, y0).                               (2.16)

2. We can approximate the function y(x) in the vicinity of (x0 , y0 ) by its tangent line. This of
course assumes the tangent line exists but this is guaranteed if f (x, y) is continuous. Thus,
letting x1 = x0 + h we can approximate y(x1 ) = y(x0 + h), for small h, as

    y(x1) = y(x0 + h) ≈ y0 + y′(x0)h = y0 + f(x0, y0)h ≡ y1.    (2.17)

This is illustrated in Figure 2.2(a,c). For the value h = 2 used in panel (a) the approximated
value is not very good. It is much improved for h = 1 but still not very good (panel (c)), however
for a smaller value of h, say h = 0.4, the estimated value would give a good approximation of y(x1 ).
It should be clear that in general the point (x1 , y1 ) is not on the solution curve y(x) but if h is
sufficiently small it will be close to it. We then continue by estimating y(x2 ) = y(x0 + 2h) starting
from (x1 , y1 ):
    y(x2) = y(x1 + h) ≈ y1 + y′(x1)h ≈ y1 + f(x1, y1)h = y2,    (2.18)
etc.
Suppose we want to estimate y(z) given initial values at x0 . One approach is the following:

1. Subdivide the interval [x0, z] into N intervals [xj−1, xj], where xj = x0 + jh for j = 0, 1, . . . , N
   and h = (z − x0)/N. Note xN = z.


Figure 2.2: Euler’s and RK-2 method for estimating y(x0 + h) starting from (x0 , y0 ). The black curves
indicate the solution of the DE passing through (x0 , y0 ). (a,c) Euler’s method. Starting at the point (x0 , y0 )
the value of y(x) at x1 = x0 +h is approximated by following the tangent line at (x0 , y0 ) to (x1 , y1 ). The grey
curves show the solution curve on which the estimated point lies. (a) Result using h = 2. (c) Result using
h = 1. (b,d) RK-2 method. Taking half a step gives the intermediate point (x0 + h/2, y∗). The solution curve
passing through this point is indicated with the dashed grey curve. The slope of the tangent line (upper
blue curve) to the solution curve through the intermediate point is then used as the slope of the straight line
approximation from (x0 , y0 ) to (x1 , y1 ) (lower blue line). The solid grey curve indicates the solution curve
passing through the estimated point (x1 , y1 ). (b) Result using h = 2. (d) Result using h = 1.

2. Calculate yj = yj−1 + f(xj−1, yj−1)h for j = 1, 2, . . . , N to find yN.

3. The yj are estimates of y(xj ). In particular yN is an estimate of y(z). The hope is that as
N → ∞, yN → y(z), i.e., the numerical approximation converges to y(z) as the step size goes
to zero (and the number of steps goes to infinity). At the same time the intermediate values
approach the solution curve y(x).

This is called Euler’s method.
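The three steps above can be sketched in a few lines of Python (a minimal illustration, not a robust solver; the function below is the notes' running example y′ = xy, y(0) = 1):

```python
# Sketch of Euler's method as described above; not a library-quality solver.

def euler(f, x0, y0, z, N):
    """Approximate y(z) for y' = f(x, y), y(x0) = y0, using N Euler steps."""
    h = (z - x0) / N
    x, y = x0, y0
    for _ in range(N):
        y = y + f(x, y) * h    # follow the tangent line at (x, y)
        x = x + h
    return y

# Running example y' = x*y, y(0) = 1 (exact value y(1) = e**0.5):
for N in (1, 2, 4, 8):
    print(N, euler(lambda x, y: x * y, 0.0, 1.0, 1.0, N))
```

The printed values reproduce the 1-, 2-, 4- and 8-step estimates worked out by hand in the example that follows (1, 5/4, 1.4194···, 1.5240···).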

Example: Use Euler’s method to find approximations to the solution of

    y′ = xy,    y(0) = 1    (2.19)

Figure 2.3: Approximate solutions of y′ = xy, y(0) = 1, obtained using Euler's method. (a) Solutions of
the DE for different initial conditions. The solution of the IVP y(0) = 1 is y = e^(x²/2), giving y(1) = √e. (b)
Approximating y(1) using two steps. (c) Approximating y(1) using 4 steps. (d) Approximating y(1) using
8 steps. The blue curves in panels (b)–(d) are the solution curves passing through the approximated values
at x = nh.

at x = 1 taking 1, 2 and 4 steps.

Solution:
Exercise: Show that the general solution of the DE is y = A e^(x²/2) and that the solution satisfying
the initial condition y(0) = 1 is y(x) = e^(x²/2). Hence y(1) = √e = 1.64872127···. Figure 2.3(a)
shows a number of solutions for different initial conditions including the solution of the IVP. Now
we find the required approximate values of y(1) using Euler's method.

1. 1 step: Taking one step we have h = 1. The approximate solution is

y(1) ≈ y(0) + f (0, 1)h = 1 + 0 · h = 1. (2.20)

This gives a very poor approximation of y(1) = e^(1/2) ≈ 1.6487.

2. 2 steps: Now h = 1/2 = 0.5. We have

       y(0.5) ≈ 1 + f(0, 1) · (1/2) = 1,
       y(1) ≈ 1 + f(1/2, 1) · (1/2)                   (2.21)
            = 1 + 1/4
            = 5/4.

   This approximate solution is illustrated in Figure 2.3(b). The solution curves passing through
   the points (x1, y1) and (x2, y2) are included. After one step we are at (x1, y1) = (0.5, 1.0),
   which is on the solution curve y = e^(−1/8) e^(x²/2) ≈ 0.88 e^(x²/2).

3. 4 steps: Now h = 1/4 = 0.25. We have

       y(0.25) ≈ 1 + f(0, 1) · (1/4) = 1,             (2.22)

       y(0.5) ≈ 1 + f(1/4, 1) · (1/4)
              = 1 + (1/4) · (1/4)                     (2.23)
              = 17/16,

       y(0.75) ≈ 17/16 + f(1/2, 17/16) · (1/4)
               = 17/16 + (1/2) · (17/16) · (1/4)
               = (17/16)(1 + 1/8)                     (2.24)
               = (17/16) · (9/8)
               = 153/128 = (17/16) · (18/16),

   and

       y(1) ≈ 153/128 + f(3/4, 153/128) · (1/4)
            = (153/128)(1 + 3/16)
            = (153/128) · (19/16)                     (2.25)
            = (17/16) · (18/16) · (19/16)
            = 1.4194···

Figure 2.4: Approximate solutions of y′ = xy, y(0) = 1, obtained using the RK-2 method. (a) Solutions of
the DE for different initial conditions. The solution of the IVP y(0) = 1 is in bold. (b) Approximating y(1)
using two steps. (c) Approximating y(1) using 4 steps. (d) Approximating y(1) using 8 steps.

Exercise: Show that for 8 steps with h = 0.125 the approximate solution is
y(1) ≈ (65 · 66 · · · 70 · 71)/64^7 ≈ 1.5240···.
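As a quick check of this exercise, note that for y′ = xy each Euler step multiplies the current value by (1 + xj h), so the 8-step estimate is a simple product. A short Python sketch (my own illustration, not part of the notes):

```python
from math import prod

# For y' = x*y one Euler step is y_{j+1} = y_j * (1 + x_j * h).  With h = 1/8 and
# x_j = j/8, the 8-step estimate of y(1) is the product of (64 + j)/64 for j = 0..7
# (the j = 0 factor is 1, leaving 65 * 66 * ... * 71 over 64**7).
estimate = prod((64 + j) / 64 for j in range(8))
print(estimate)                        # 1.5240...
print(prod(range(65, 72)) / 64**7)     # the same value, up to rounding
```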

2.2.2 The second-order Runge-Kutta or mid-point method

If f(x, y) is smooth, Euler's method will converge as the step size goes to zero; that is, as the number
of steps increases the approximate solution goes to the correct value. Its convergence can be slow,
requiring a very large number of steps to get an answer of sufficient accuracy. It can be shown
rigorously that the error is proportional to the step size h as h → 0 (see problems at the end of
the chapter). Each step taken in Euler's method introduces an error of O(h²), i.e.,

    y(xn + h) = yn + f(xn, yn)h + O(h²) = yn+1 + O(h²).    (2.26)

After N = 1/h steps the global error is O(h).

Luckily there are much better methods, so Euler's method is not used in practice. It is, however,
the basis for many better methods. A method for which the error goes to zero like h^p as h → 0 is
said to be of order p.
From Figure 2.3 we can see that it would be much better to estimate the slope at a mid-way point
and use it instead of the slope of the tangent line at the starting point. A simple improvement
on Euler's method is the second-order Runge-Kutta, or RK-2, method which does just this. It
proceeds as follows. First estimate y at the mid-point x0 + h/2 using Euler's method and use the DE
to estimate the slope at the mid-point. For the second step go back to the initial point x0 and use
the estimated slope at the mid-point to go from x0 to x0 + h. Thus first find

    y∗ = y0 + f(x0, y0) h/2                           (2.27)

and then estimate y(x0 + h) using the slope at (x0 + h/2, y∗) via

    y1 = y0 + f(x0 + h/2, y∗) h.                      (2.28)
This is illustrated in Figures 2.2(b,d). The second-order Runge-Kutta method is often written
algorithmically as

    k1 = h f(xn, yn)
    k2 = h f(xn + h/2, yn + k1/2)                     (2.29)
    yn+1 = yn + k2.
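A direct Python transcription of (2.29) might look as follows (a sketch for illustration only; `rk2` is a name chosen here, not one used in the notes):

```python
# Sketch of the RK-2 (midpoint) update (2.29).

def rk2(f, x0, y0, z, N):
    """Midpoint (RK-2) estimate of y(z) for y' = f(x, y), y(x0) = y0."""
    h = (z - x0) / N
    x, y = x0, y0
    for _ in range(N):
        k1 = h * f(x, y)                    # Euler predictor toward the midpoint
        k2 = h * f(x + h / 2, y + k1 / 2)   # slope sampled at the midpoint
        x, y = x + h, y + k2                # full step using the midpoint slope
    return y

print(rk2(lambda x, y: x * y, 0.0, 1.0, 1.0, 2))   # 819/512 = 1.599609375
```

Two steps already reproduce the hand computation in the example that follows.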

Example: Using RK-2 find approximations to the solution of


    y′ = xy,    y(0) = 1    (2.30)
at x = 1 taking 1 and 2 steps. This is the same example used above to illustrate Euler’s method.
Solution:

1. 1 step: Taking one step we have h = 1. First find

       y∗ = y0 + f(0, 1) · h/2
          = 1 + 0 · (1/2) = 1.                        (2.31)

   Then

       y(1) ≈ y0 + f(x0 + h/2, y∗) h
            = 1 + f(1/2, 1) h
            = 1 + (1/2) h                             (2.32)
            = 3/2
            = 1.5.

2. 2 steps: Now h = 1/2 = 0.5. For the first step we have

       y∗ = y0 + f(0, 1) · h/2
          = 1 + 0 · (1/4) = 1.                        (2.33)

   Then

       y(1/2) ≈ y0 + f(x0 + h/2, y∗) h
              = 1 + f(1/4, 1) · (1/2)                 (2.34)
              = 1 + 1/8
              = 9/8.

   For the second step, starting from (x1, y1) = (1/2, 9/8), first find

       y∗ = y1 + f(x1, y1) · h/2
          = 9/8 + (1/2) · (9/8) · (1/4)               (2.35)
          = 81/64.

   Then

       y(1) ≈ y1 + f(x1 + h/2, y∗) h
            = 9/8 + f(3/4, 81/64) · (1/2)
            = 9/8 + (3/4) · (81/64) · (1/2)           (2.36)
            = 819/512
            ≈ 1.599609375.

Note that this approximation is better than the one obtained using Euler’s method with 8
steps!

The approximate solution obtained taking 2 steps is illustrated in Figure 2.4(b). We will stop
here as these calculations quickly get tedious. They are best left for a computer. Figure 2.4 also
shows solutions obtained using 1, 2, 4 and 8 points. Using 8 points it is hard to distinguish the
approximate points from the solution curve. Comparing Figures 2.3 and 2.4, the RK-2 method is
vastly superior, at least for this example.
Table 2.1 compares our computed estimates for y(1) using Euler’s method and the RK-2 method.
Recall the correct answer is y(1) = e^(1/2) ≈ 1.6487. The solution using RK-2 with one step is better

than the solution obtained using Euler’s method with 4 steps and the solution using RK-2 with
two steps is better than that obtained using Euler’s method with 8 steps.
RK-2 is an example of a Runge-Kutta method. This family of methods was developed by
Carl Runge (1856–1927) and Wilhelm Kutta (1867–1944), two German mathematicians. RK-2 is
a second-order method. Much better methods are available; a fourth- or fifth-order Runge-Kutta
method is often used in practice (RK4-5 combines fourth- and fifth-order methods to give an estimate
of the error which is used to adjust the step size). Much higher-order methods also exist. You can
learn more about the wonderful world of numerically solving ODEs and PDEs in AMATH 342 and
442.

2.2.3 A Fourth-order Runge-Kutta Method

Here we briefly outline a fourth-order Runge-Kutta method. This provides a very good general-
purpose ODE solver; however, for highly accurate computations there are better methods. The
fourth-order RK4 method is as follows:
    k1 = h f(xn, yn)
    k2 = h f(xn + h/2, yn + k1/2)
    k3 = h f(xn + h/2, yn + k2/2)                     (2.37)
    k4 = h f(xn + h, yn + k3)
    yn+1 = yn + (k1 + 2k2 + 2k3 + k4)/6.

Note that k1/h is the slope at xn, k2/h and k3/h are estimates of the slope at the midpoint xn + h/2
obtained using different estimates of the value of y at the midpoint, and k4/h is an estimate of the
slope at xn + h. Now
    y(xn + h) = y(xn) + ∫_{xn}^{xn+h} y′(x) dx.       (2.38)

Using Simpson's rule

    ∫_x^{x+2h} g(s) ds ≈ (h/3)(g(x) + 4g(x + h) + g(x + 2h)),    (2.39)

Table 2.1: Approximate values for y(1) of the solution of y′(x) = xy with initial condition y(0) = 1 obtained
using Euler's method and the second-order RK-2 method. The exact solution is y(1) = √e = 1.64872127 . . . .

    N     y(1)          y(1)
          (Euler)       (RK-2)

    1     1             1.5
    2     1.25          1.5996···
    4     1.4194···     1.6342···
    8     1.5240···     1.6448···

which is exact for all polynomials of degree n ≤ 3, to evaluate the integral on the right after splitting
the interval into two sub-intervals of length h/2 gives

    y(xn + h) ≈ y(xn) + (h/6)(y′(xn) + 4 y′(xn + h/2) + y′(xn + h)).    (2.40)

The fourth-order Runge-Kutta method approximates this with

    h y′(xn) ≈ k1
    4h y′(xn + h/2) ≈ 2k2 + 2k3                       (2.41)
    h y′(xn + h) ≈ k4

where the kj have been selected to make the method fourth-order, that is

    yn+1 = y(xn + h) + O(h⁵),                         (2.42)

so that after N = 1/h steps of size h the error is O(h⁴). There are many higher-order Runge-Kutta
methods. There are many subtleties that we have not addressed. One is that higher order does
not necessarily mean higher accuracy. A discussion of this and many other aspects of finding
approximate solutions numerically is left for a course in numerical methods.
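For completeness, here is a Python sketch of the RK4 step (2.37) together with a crude convergence check on the running example y′ = xy: halving h should reduce the global error by roughly 2⁴ = 16. (The helper names are my own, not the notes'.)

```python
from math import exp

# Sketch of the classical RK4 method (2.37) plus a rough order check.

def rk4(f, x0, y0, z, N):
    """Classical fourth-order Runge-Kutta estimate of y(z) for y' = f(x, y)."""
    h = (z - x0) / N
    x, y = x0, y0
    for _ in range(N):
        k1 = h * f(x, y)
        k2 = h * f(x + h / 2, y + k1 / 2)
        k3 = h * f(x + h / 2, y + k2 / 2)
        k4 = h * f(x + h, y + k3)
        y = y + (k1 + 2 * k2 + 2 * k3 + k4) / 6   # Simpson-like weighted average
        x = x + h
    return y

f = lambda x, y: x * y                 # exact solution: y(1) = e**0.5
err4 = abs(rk4(f, 0.0, 1.0, 1.0, 4) - exp(0.5))
err8 = abs(rk4(f, 0.0, 1.0, 1.0, 8) - exp(0.5))
print(err4 / err8)   # should be near 2**4 = 16 for a fourth-order method
```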

2.3 Picard’s Method: finding approximate solutions


This method of finding approximate solutions was formalized by Émile Picard (1856–1941) in the
1880s and 1890s, but the method was discovered 50 years earlier by Joseph Liouville (1809–1882).
Consider

    y′ = f(x, y),    y(x0) = Y0.                      (2.43)

Integrating from x0 to x gives

    ∫_{x0}^x y′(t) dt = ∫_{x0}^x f(t, y(t)) dt        (2.44)

so

    y(x) = y(x0) + ∫_{x0}^x f(t, y(t)) dt.            (2.45)

Define the operator T via

    T[w](x) = y(x0) + ∫_{x0}^x f(t, w(t)) dt.         (2.46)

T maps functions on R to functions on R: given the function w(x), T[w](x) is a new function. We
want to find a function y such that T[y] = y, i.e., a fixed point of T. The idea is to iterate to get a
sequence of functions y0(x), y1(x), y2(x), . . . , where

    yn+1(x) = Y0 + ∫_{x0}^x f(t, yn(t)) dt = T[yn]    (2.47)

with the hope that yn converges to a function y(x) as n → ∞. If it does then taking the limit as
n → ∞ of yn+1 = T [yn ] implies that y = T [y], i.e., we have the solution of the DE (note all of the

23
functions yn (x) satisfy the initial condition yn (x0 ) = y(x0 ) = Y0 ). Often the initial guess is the
initial condition, i.e., take y0 (x) = Y0 , but this is not necessary.

Example: Consider y′ = y with initial condition y(0) = 1. We know the solution is y = e^x. Since
f(x, y) = y in this case, the operator is

    T[w] = 1 + ∫_0^x w(t) dt.                         (2.48)

Iterating, starting from y0 = 1, we have

    y1 = T[y0] = y0 + ∫_0^x y0 dt = 1 + x,

    y2 = T[y1] = y0 + ∫_0^x y1(t) dt
       = 1 + ∫_0^x (1 + t) dt                         (2.49)
       = 1 + x + x²/2,

    y3 = T[y2] = y0 + ∫_0^x y2(t) dt
       = 1 + ∫_0^x (1 + t + t²/2) dt
       = 1 + x + x²/2 + x³/3!,

etc. By induction it is straightforward to show that

    yn = 1 + x + x²/2 + x³/3! + ··· + xⁿ/n!.          (2.50)
Therefore

    lim_{n→∞} yn(x) = Σ_{n=0}^∞ xⁿ/n! = e^x           (2.51)
which is the solution of the DE. Picard’s method can be used to prove the existence-uniqueness
theorem for solutions of first-order differential equations.
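For y′ = y the iterates are polynomials, so Picard's iteration can be carried out exactly on coefficient lists. A Python sketch (my illustration of (2.47) for this one example, not a general implementation):

```python
from fractions import Fraction

# Sketch for the example y' = y, y(0) = 1 only: each iterate is a polynomial,
# stored as a coefficient list [c0, c1, ...], and T integrates term by term.

def picard_step(coeffs):
    """Apply T[w](x) = 1 + integral_0^x w(t) dt to polynomial coefficients."""
    return [Fraction(1)] + [c / Fraction(k + 1) for k, c in enumerate(coeffs)]

y = [Fraction(1)]              # y_0(x) = 1
for _ in range(4):
    y = picard_step(y)
print([str(c) for c in y])     # ['1', '1', '1/2', '1/6', '1/24'] -- partial sum of e^x
```

The coefficients are exactly 1/k!, the Taylor coefficients of e^x, mirroring (2.50).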

2.4 Existence-Uniqueness of Solutions of First-Order Differential Equations

Consider the general first-order initial value problem

    dy/dx = f(x, y),    y(x0) = y0.                   (2.52)

THEOREM I (Picard's Theorem): If f and ∂f/∂y are continuous functions in an open rectangle

    R = {(x, y) : a < x < b, c < y < d}               (2.53)

that contains (x0, y0), then the initial value problem (2.52) has a unique solution φ(x) in some
interval x0 − δ < x < x0 + δ where δ > 0.

There are stronger versions of this theorem. We mention one. First a definition.
Definition: A function f(x, y) is said to satisfy a Lipschitz condition in the variable y on an open
rectangle R if there is a constant K such that

    |f(x, y2) − f(x, y1)| < K|y2 − y1|                (2.54)

for all x, y1 and y2 in R.

The Lipschitz condition bounds the difference in slopes of solution curves of y 0 = f (x, y) and
says that as y2 → y1 for fixed x the difference in slopes f (x, y2 ) and f (x, y1 ) goes to zero at least
linearly in the difference in y values. There are many functions f (x, y) that satisfy the Lipschitz
condition for which fy is not continuous.
THEOREM II: If f(x, y) is a continuous function in an open rectangle

    R = {(x, y) : a < x < b, c < y < d}               (2.55)

that contains (x0, y0) and f(x, y) satisfies a Lipschitz condition in y on R, then the initial value
problem (2.52) has a unique solution φ(x) in some interval x0 − δ < x < x0 + δ where δ > 0.

Note: These theorems

• can use closed rectangles instead of open rectangles;

• imply there is one and only one solution passing through (x0 , y0 );

• give no information on how large δ is. All we know is that it is greater than zero.

Example: What does Theorem I tell us about solutions of

    3 dy/dx = x² − xy³,
    y(1) = 6?                                         (2.56)

Putting the DE in the form y′ = f(x, y) we have f = (1/3)x² − (1/3)xy³. Both f and
∂f/∂y = −xy² are continuous on R². The hypothesis of the theorem is satisfied =⇒ there is a
unique solution on an interval (1 − δ, 1 + δ) for some δ > 0.

Example: What does Theorem I tell us about solutions of

    dy/dx = −y/(x − 1),
    y(1) = 1?                                         (2.57)

This DE was considered above, solutions of which are plotted in Figure 2.1(b). Here
f(x, y) = −y/(x − 1) is not continuous on an open neighbourhood containing the initial point
(x0, y0) = (1, 1), so the theorem does not tell us anything. In this case there are no solutions of
the form y(x) passing through the initial point. Rewriting the differential equation as
dx/dy = −(x − 1)/y, both f(x, y) = −(x − 1)/y and its derivative with respect to the dependent
variable x, ∂f/∂x = −1/y, are continuous at (x0, y0) = (1, 1), and the theorem tells us that a
solution x(y) exists. It is the straight line x = 1.

Here is a weaker theorem stated by Giuseppe Peano (1858–1932). He gave a proof of his theorem
in 1886, but his proof was found wanting and the theorem was not properly proved until many years later.
THEOREM III: (Peano’s Theorem): If f is continuous in an open rectangle
    R = {(x, y) : a < x < b, c < y < d}               (2.58)

that contains (x0 , y0 ) then the initial value problem (2.52) has a solution φ(x) in some interval
x0 − δ < x < x0 + δ where δ > 0.

Comment: Uniqueness has been lost. There may be more than one solution. See problems for
an example.

2.5 Linear First-Order Differential Equations


A second, very important, class of differential equations that can always be solved are linear first-
order DEs. By ’can always be solved’ we mean that the solution can be written in terms of integrals.
In many cases it will be impossible to find analytic expressions for the integrals.
Linear first-order (and higher-order) differential equations arise in many problems, for example
the projectile problem above for the case of motion of an object near the Earth’s surface subject
to linear drag and the radioactive decay problem discussed below. There is a great deal of theory
for linear differential equations and much of this course will be focussed on first- and second-order
equations of this type.
Some of you may have learned how to solve linear first-order DEs by finding an integrating
factor. Here I use a different approach that can be generalized to higher-order linear DEs. In the
following section the integrating factor method is described and we will see that both methods are
essentially the same.

2.5.1 Illustrative example


We begin by considering a very simple inhomogeneous linear first-order DE. Finding its solution will
illustrate most of the key points of the general problem. Consider

    y′ + y = 1.                                       (2.59)

We wish to find the general solution, that is, all solutions. First note that yp(x) = 1 is a solution.
This method of finding a solution is referred to as 'by inspection'. If you can solve a DE 'by
inspection' do so. The subscript 'p' indicates that we have a particular solution. A particular
solution is any solution of the inhomogeneous DE that does not involve any undetermined constants.
There are infinitely many particular solutions; yp = 1 is the simplest.
Next let y(x) = yp + v(x) = 1 + v(x) and substitute into the DE. This gives

    (1 + v)′ + (1 + v) = 1                            (2.60)

or

    v′ + v = 0.                                       (2.61)
Note that the DE for v is the homogeneous version of the original DE for y. That is, it is the DE
for y with the forcing term removed. Now the solution of (2.61) can be obtained using the method
of separation of variables. The solution is

    v = ae^(−x)                                       (2.62)

where a is an arbitrary constant. All solutions of (2.61) are of this form. With this solution for v
we have
y = 1 + ae−x . (2.63)
This is the general solution of (2.59). That is all solutions have this form. There is one undetermined
constant, a, whose value can be determined if an initial condition y(x0 ) = y0 is provided.
The function yh = e−x is a non-zero solution of the homogeneous equation. The general solution
has the form of a particular solution yp , which is the function 1 in this case, plus an arbitrary
constant times the homogeneous solution yh . This is a general feature of linear first-order DEs as
we will show below.
Before moving on, note that because yh = e−x is never zero we could write the solution as
    y(x) = e^(−x)(e^x + c) = yh(e^x + c).             (2.64)
With this in mind let’s solve (2.59) in a slightly different manner. First find a non-zero solution of
the homogeneous problem
    y′ + y = 0.                                       (2.65)
One such solution is yh = e^(−x). Next let y = yh(x)v(x). Substituting into (2.59) we have

    y′ + y = (yh v)′ + yh v = (yh′ + yh)v + yh v′ = yh v′ = 1    (2.66)

or

    v′ = 1/yh = e^x.                                  (2.67)

Integrating gives the general solution v = e^x + a where a is an arbitrary constant. Thus

    y = yh v = e^(−x)(e^x + a) = 1 + ae^(−x).         (2.68)
This recovers our previous solution. Here we chose e^(−x) as the non-zero homogeneous solution. We
could have chosen ce^(−x) for any non-zero constant c. For example we could have chosen yh = 2e^(−x),
or yh = −101.1287e^(−x), etc. We would end up with the same result. You should convince yourselves
of this. Note also that yh = 0 is a solution of the homogeneous problem, but this choice for yh
does not work. Why?
The latter approach is what we will use in general because it is generally much harder to
find a particular solution than to find the homogeneous solution. Normally you need to find the
homogeneous solution first.

2.5.2 Standard form for linear first-order DEs
The general linear first-order differential equation has the form
    a1(x) dy/dx + a0(x)y = b(x)                       (2.69)

where a1(x) is non-zero.

Case a: a0 = 0. Then dy/dx = b(x)/a1(x) and we can integrate to get

    y = ∫^x b(s)/a1(s) ds.                            (2.70)

So there is nothing to discuss in this case.

Case b: a0 6= 0. In this case we divide by a1 (x) and write the DE in the standard form

    dy/dx + P(x)y = Q(x).                             (2.71)
This is the case we are interested in.

2.5.3 Solving linear first-order equations


Now we consider the general problem. It will simplify our study of linear DEs, particularly when
we consider second-order DEs, to write the equation in terms of a differential operator. So we begin
by introducing this concept.

Some formalism: Differential Operators

Definition: An operator maps a function to another function. That is, if L is an operator and
f is a function then g = L[f ] is a new function. f must be in a space of functions that the operator
L can act on.

Example: A simple example of an operator is the differential operator, which we will call D. If
f(x) is a function then D[f] is the derivative of f, that is D[f] = df/dx. Here the space of functions
the operator D can act on is the space of functions that are differentiable on a domain of interest.

Definition: An operator L is said to be a linear operator if L[af + g] = aL[f ] + L[g] for all
functions f and g in the domain of the operator and all constants a.

Example: The differential operator D[] is a linear operator:

    D[af + g] = d(af + g)/dx = a df/dx + dg/dx = aD[f] + D[g].    (2.72)
dx dx dx

We work with linear first-order DEs in the standard form

    y′ + P(x)y = Q(x).                                (2.73)

Define the linear differential operator L[], which maps differentiable functions to functions, via

    L[f] ≡ df/dx + P(x)f ≡ D[f] + P(x)f.              (2.74)
Thus, given a differentiable function f , L returns the function f 0 + P f . L is a linear operator, that
is
L[af (x) + g(x)] = aL[f ] + L[g] (2.75)
for any two differentiable functions f and g and any constant a. The proof is left as an exercise.
Our goal is to find the solution of
L[y] = Q(x). (2.76)
First we consider the general structure of the solution. Afterward we will consider one method for
finding the general solution.

General structure of the solution

Let yp be any solution of L[y] = Q. Let yg be any other solution. Both yg and yp are solutions of
the DE so
L[yg ] = L[yp ] = Q(x) (2.77)
and, by linearity,
L[yg − yp ] = L[yg ] − L[yp ] = 0. (2.78)
That is, the difference between any two solutions of (2.76) is a solution of the homogeneous problem

L[y] = 0. (2.79)

Let yh be any non-zero solution (y = 0 is always a solution of the homogeneous problem but it is
not a useful one). Then by linearity so is Cyh for any constant C. By linearity of the operator L
we have
L[yp + Cyh ] = L[yp ] + CL[yh ] = Q + C · 0 = Q. (2.80)
Thus yp + Cyh is a solution of (2.76) for all values of C. At this point this procedure does not
tell us that all solutions of (2.76) have this form for our chosen yp and yh because, for example,
we have not shown that there could not be another solution yh2 of the homogeneous problem that
is not a constant multiple of yh . We will soon see that there is no such solution and indeed, our
previous solution shows that all solutions of the homogeneous problem are multiples of a selected
non-zero solution, that is, if yh2 is a solution of L[y] = 0 then yh2 = Cyh for some constant C. The
existence-uniqueness theorem discussed below also shows that this is the case.
In summary, all solutions of L[y] = Q have the form

y = yp + Cyh . (2.81)

for some C, where yp is any solution of L[y] = Q and yh is any non-zero solution of L[y] = 0.

Finding the solution

Using the formalism introduced above we now find the general solution.
First we find a non-zero solution of the homogeneous DE

    L[y] = y′ + P y = 0.                              (2.82)

Writing this as

    y′ = −P y                                         (2.83)

we have

    yh = e^(−∫^x P(s) ds)                             (2.84)

as a homogeneous solution. By construction all solutions of the homogeneous problem have this
form, so any solution of the homogeneous problem is a constant multiple of e^(−∫^x P(s) ds).
As mentioned above, any anti-derivative of P can be used. Changing it (i.e., by varying the lower
limit of integration) simply results in multiplying yh by a constant.
This gives us one piece of the solution. Next we need to find a particular solution yp . To do
this we are going to set yp (x) = v(x)yh (x), find an equation for v and find its solution. This idea
will be used again when we consider second-order differential equations. We have
    L[v yh] = (v yh)′ + P v yh
            = v′ yh + v yh′ + P v yh
            = v′ yh + v (yh′ + P yh)                  (2.85)
            = v′ yh

since yh is a solution of the homogeneous problem, i.e., yh′ + P yh = 0. Thus v is a solution of the
differential equation

    v′ = Q(x)/yh(x).                                  (2.86)
The right-hand side is a known function so we can integrate:

    v = ∫^x Q(s)/yh(s) ds.                            (2.87)

The solution is then

    y(x) = v(x)yh(x) + C yh(x) = yh(x) ∫^x Q(s)/yh(s) ds + C yh(x).    (2.88)

In the above I am taking v given by (2.87), and the corresponding integral in (2.88), to be a
particular anti-derivative of Q/yh. Any one will do. Thus v yh is a particular solution. You could
also view it as being the general anti-derivative, which involves an undetermined constant, in which
case v yh would be the general solution; however this hides the structure of the solution, and I expect
all general solutions to involve an arbitrary constant.
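Formula (2.88) can also be realized numerically when the antiderivatives have no closed form. The sketch below (my own illustration, with simple left-endpoint quadrature standing in for the integrals) marches v′ = Q/yh and yh = e^(−∫P) together and checks the result on y′ + y = 1, y(0) = 0, whose exact solution is 1 − e^(−x):

```python
from math import exp

# Sketch of (2.88) with crude left-endpoint quadrature; `solve_linear`
# and the test problem below are my illustrations, not the notes'.

def solve_linear(P, Q, x0, y0, z, N=20000):
    """Approximate y(z) for y' + P(x) y = Q(x), y(x0) = y0, via y = v * yh."""
    h = (z - x0) / N
    x, I, v = x0, 0.0, y0          # I accumulates ∫P, so yh = exp(-I); v(x0) = y0
    for _ in range(N):
        v += Q(x) * exp(I) * h     # v' = Q / yh = Q * exp(I), eq. (2.86)
        I += P(x) * h
        x += h
    return exp(-I) * v             # y = yh * v, with the constant fixed by y0

# Check on y' + y = 1, y(0) = 0, exact solution 1 - exp(-x):
print(abs(solve_linear(lambda x: 1.0, lambda x: 1.0, 0.0, 0.0, 1.0) - (1 - exp(-1))))
```

The printed discrepancy is the quadrature error, which shrinks like 1/N.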

EXAMPLE: Find the solution of

    cos(x) y′ − sin(x) y = cos(x)                     (2.89)

SOLUTION:

Step 1: Put in standard form by dividing by the coefficient of y′ to get

    y′ − tan(x) y = 1.                                (2.90)

Step 2: Solve the homogeneous problem

    yh′ = tan(x) yh.                                  (2.91)

This gives

    yh = e^(∫^x tan(s) ds) = e^(−ln|cos(x)|) = 1/|cos(x)|.    (2.92)

We can take

    yh = 1/cos(x)                                     (2.93)

because if cos(x) happens to be negative we can multiply yh = 1/|cos(x)| = −1/cos(x) by −1 to
get a new homogeneous solution. Note that yh is undefined at points where cos(x) = 0, i.e., where
the function changes sign, so our solution will in general only be valid for a range of x values for
which cos(x) is only positive or only negative.

Step 3. Set yp = v(x)yh(x) and substitute into (2.90) to get

    v′ yh + v yh′ − tan(x) v yh = v′ yh + v (yh′ − tan(x) yh) = v′ yh = 1.    (2.94)

Thus

    v′ = 1/yh = cos(x)                                (2.95)

so

    v = sin(x).                                       (2.96)
Thus a particular solution is yp = v(x)yh (x) = tan(x). The general solution is

    y = yp + C yh = tan(x) + C/cos(x)                 (2.97)

which can be verified by substituting this solution into the original equation (this is always a
good idea). Note that the solution is generally singular at x = π/2 + nπ for all integers n (for
appropriate choices of C some of the singularities can be removed — you should convince
yourself of this).

Comment: Other choices for the homogeneous and particular solutions could be made.
For example, taking yh = −2/cos(x) leads to v′ = −(1/2)cos(x). For v one could then choose
v = −(1/2)sin(x) + 10.76. Then y = v yh + C yh = tan(x) − 21.52/cos(x) − 2C/cos(x). Letting
B = −21.52 − 2C we can write this as y = tan(x) + B/cos(x), which is the same as our
previous solution since C and B are both arbitrary.

EXERCISE: Find the solution of
    x y′ + 2x² y = x²                                 (2.98)
following this procedure.

Problems often have forcing functions that can be separated into the sum of simpler functions.
In that case it is often convenient to find a particular solution corresponding to each piece and add
them together. Consider, for example the problem

    L[y] ≡ y′ + P(x)y = Q(x) = a1 Q1(x) + a2 Q2(x) + a3 Q3(x)    (2.99)

where the aj are constants. Suppose you find particular solutions ypj for each Qj(x). That is,
L[ypj] = Qj for j = 1, 2, 3. Then yp = Σj aj ypj(x) is a particular solution of L[y] = Q since

    L[Σj aj ypj(x)] = Σj aj L[ypj] = Σj aj Qj = Q.    (2.100)

EXAMPLE: Find the solution of

    cos(x) y′ − sin(x) y = −2 cos(x) + 5 sin(x).      (2.101)

Solution: Putting the equation in standard form we have

    L[y] ≡ y′ − tan(x)y = −2 + 5 tan(x).              (2.102)

From the previous example we know that L[tan(x)] = 1, so

    L[−2 tan(x)] = −2.                                (2.103)

Next we need to find a particular solution of L[y] = tan(x). From the work done in the previous
example we need to solve

    v′ = tan(x)/yh = sin(x)                           (2.104)

so v = −cos(x) and the particular solution is v yh = −1. Thus yp2 = −1 is a solution of L[y] =
tan(x) (this can also be seen by inspection). Then

    yp = −2 tan(x) − 5                                (2.105)

is a particular solution of the DE. The general solution is this plus a multiple of the homogeneous
solution:

    y = −2 tan(x) − 5 + C/cos(x).                     (2.106)

2.5.4 Existence-Uniqueness
Linear first-order differential equations have the general form

    y′ + P(x)y = Q(x)                                 (2.107)

where P and Q are known functions. Equations of this type are a special case of y′ = f(x, y) with
f(x, y) = Q(x) − P(x)y. There is a stronger version of the existence-uniqueness theorem for linear
equations.
Consider the Lipschitz condition. We have |f(x, y2) − f(x, y1)| = |P(x)||y2 − y1|, so if P(x) is
bounded, f(x, y) satisfies a Lipschitz condition in y with K = max|P(x)| (or sup|P|) for a ≤ x ≤ b
and for all y1 and y2. In particular, the Lipschitz condition is satisfied for all y, not just for values
in a finite interval (c, d) as above. This strengthens the existence-uniqueness theorem:

THEOREM IV: Suppose f (x, y) is a continuous function that satisfies a Lipschitz condition in y
on an infinite strip a ≤ x ≤ b and −∞ < y < ∞ If (x0 , y0 ) is any point on the strip then the IVP

y′ = f(x, y), y(x0) = y0 (2.108)

has a unique solution y(x) on the interval a ≤ x ≤ b.

Comment:

• With this stronger condition on f (x, y) (i.e., Lipschitz on −∞ < y < ∞), the solution exists
on the whole interval [a, b].

• Consider the linear equation (2.107): if P (x) and Q(x) are continuous on [a, b] then
f (x, y) = −P y + Q satisfies the conditions of the theorem hence a unique solution y(x) exists
on [a, b].

For a proof see [7].

2.5.5 Radioactive Decay


Before discussing some theoretical aspects of linear first-order differential equations we consider
another physical example to provide further motivation.

Example: Radioactive Decay. A rock contains a radioactive isotope RA1 which decays into a second
radioactive isotope RA2 . The latter subsequently decays into stable atoms:

RA1 → RA2 → stable atoms. (2.109)

Let Mi (t) be the mass of RAi , i = 1, 2 and assume the following:

(i) There is no source of RA1 .

(ii) RA1 decays into RA2 at a rate αM1 where α > 0 is constant.

(iii) RA2 decays into stable atoms at a rate kM2 where k > 0 is constant.
The latter two assumptions are based on empirical evidence. The problem: find M1 (t) and M2 (t).

We need some equations to solve: our mathematical model. For this we use conservation of
mass for each isotope.

1. First consider RA1 . Conservation of mass says that:


dM1/dt = rate of creation of RA1 − rate of decay of RA1 = 0 − αM1 (2.110)

or

dM1/dt + αM1 = 0. (2.111)
This has the solution
M1(t) = Re^{−αt} (2.112)
where R = M1 (0) is the initial mass of RA1 .

2. Next consider RA2. We use conservation of mass again, noting that M1 decreases at the rate
αM1, so RA2 is created at the rate αM1 = αRe^{−αt}. Thus

dM2/dt = rate of creation of RA2 − rate of decay of RA2 = αM1 − kM2 = αRe^{−αt} − kM2. (2.113)

Thus, M2 satisfies the DE


dM2/dt + kM2 = αRe^{−αt}. (2.114)
Note the similarity of equations (2.111) and (2.114). The form of the left hand side is the
same:
dMi/dt + ai Mi (2.115)
where ai is a constant. The right hand sides differ and this has an important effect on the
solution.

To solve (2.114) we multiply the equation by e^{kt}, giving

e^{kt} dM2/dt + k e^{kt} M2 = αRe^{(k−α)t}. (2.116)

Noting that k e^{kt} = d/dt(e^{kt}) (which is precisely why we multiplied by e^{kt}), this can be rewritten as

e^{kt} dM2/dt + (d/dt e^{kt}) M2 = αRe^{(k−α)t} (2.117)

or

d/dt (e^{kt} M2) = αRe^{(k−α)t}, (2.118)
the point being that the left hand side is an exact derivative. We can now integrate both sides
giving

e^{kt} M2 = { αR/(k−α) e^{(k−α)t} + C,  if k ≠ α;
            { αRt + C,                   if k = α,     (2.119)

where C is an undetermined constant. The solution is then

M2(t) = { αR/(k−α) e^{−αt} + C e^{−kt},  if k ≠ α;
        { αRt e^{−αt} + C e^{−αt},        if k = α.    (2.120)
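As a quick numerical sanity check of (2.120), one can integrate (2.114) directly and compare with the closed form. The Python sketch below is an editorial illustration (not part of the original notes); the values of α, k and R are arbitrary.

```python
import math

# Illustrative parameters (not from the notes): decay rates and initial mass of RA1.
alpha, k, R = 0.5, 0.2, 1.0

def M2_exact(t, M2_0=0.0):
    # Solution (2.120) for k != alpha, with C fixed by M2(0) = M2_0.
    C = M2_0 - alpha * R / (k - alpha)
    return alpha * R / (k - alpha) * math.exp(-alpha * t) + C * math.exp(-k * t)

def M2_numeric(t, M2_0=0.0, n=100_000):
    # Forward-Euler integration of dM2/dt = alpha*R*exp(-alpha*t) - k*M2.
    dt = t / n
    M2, s = M2_0, 0.0
    for _ in range(n):
        M2 += dt * (alpha * R * math.exp(-alpha * s) - k * M2)
        s += dt
    return M2

print(M2_exact(5.0), M2_numeric(5.0))   # the two values should agree closely
```

The agreement is limited only by the Euler step size.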

Discussion: Equation (2.114) is a linear constant coefficient first-order differential equation. We


solved the DE by multiplying it by the function e^{kt} which put the left hand side in the form of an
exact derivative. It was then straightforward to solve the equation. This procedure can be used to
find a closed form solution of any first-order linear DE with non-constant coefficients as we discuss
next.

2.5.6 Solving linear first-order equations using an integrating factor


First, recall the procedure discussed above. We found a non-zero homogeneous solution
yh of L[y] = 0. Then we set the general solution y equal to v(x)yh (x) for an unknown function v.
Substituting into the inhomogeneous problem L[y] = Q led to

v′(x) = Q(x)/yh(x), (2.121)

which we integrated to find v, giving a particular solution after multiplying by yh. We can
write this equation as

d/dx (y/yh) = Q(x)/yh(x) (2.122)

or, setting µ(x) = 1/yh(x), as

d/dx (µ(x)y(x)) = µ(x)Q(x). (2.123)

This is the equation in the form of an exact derivative. The right-hand side is known so we can
integrate immediately to find µ(x)y(x), from which we get the solution y(x).
Let us start from scratch and rederive (2.123) in another way. Start with the linear first-order
DE in standard form:
L[y] ≡ y′ + Py = Q. (2.124)

Multiplying by an as yet unknown function µ(x) we have

µ(x) dy/dx + µ(x)P(x)y = µ(x)Q(x) (2.125)

and we choose µ so that

µ(x)P(x) = dµ/dx. (2.126)
If we can do this the equation becomes
µ(x) dy/dx + (dµ/dx) y = µ(x)Q(x) (2.127)

or

d/dx (µy) = µ(x)Q(x), (2.128)

which is (2.123). The left hand side is an exact derivative and we can now integrate to get

µy = ∫^x µ(s)Q(s) ds + C (2.129)

or

y = (1/µ) [ ∫^x µ(s)Q(s) ds + C ]. (2.130)

Thus, if we can find a function µ(x) satisfying (2.126) then in principle we have found the solution.
It turns out that finding µ(x) is easy. The equation for µ(x) is

dµ/dx = µP(x), (2.131)
which is a separable linear DE. Rewriting as
(1/µ) dµ/dx = P(x) (2.132)

and integrating gives

ln |µ| = ∫^x P(s) ds + C (2.133)

or

µ = A e^{∫^x P(s) ds}. (2.134)
Because the multiplication of (2.71) by µ(x) makes the left hand side an exact derivative, the
function µ(x) is called an integrating factor.
In the solution

y = (1/µ) [ ∫^x µ(s)Q(s) ds + C ], (2.135)
multiplying µ(x) by a constant just changes C, which is an arbitrary constant, so we can take A = 1
(or any other non-zero number). Thus, the general solution of
a1 dy/dx + a0 y = b (2.136)

is

y = e^{−∫^x P(s) ds} [ ∫^x e^{∫^s P(t) dt} Q(s) ds + C ], (2.137)

where

P(x) = a0(x)/a1(x), Q(x) = b(x)/a1(x) (2.138)
and C is an arbitrary constant. While this gives a closed form solution in many cases it may be
impossible to find the integrals analytically.

Comments:
• This was the method first used to find the general solution of a first-order linear DE. It was
found by Leibniz in 1694 using his ‘infinitesimal’ notation.

• The lower limit of integration has not been specified because it does not matter. Any
antiderivative can be used. Changing the lower limit of ∫^x P(s) ds merely changes the value of
the integral by an additive constant. The exponential is then changed by a multiplicative
constant (i.e., the value of A in (2.134) changes), and you should convince yourself that this
multiplicative constant cancels in the first term. In the second term, involving C, it doesn’t
matter because C is arbitrary.

• YOU SHOULD NOT MEMORIZE THIS FORMULA! You should understand the
process and follow the steps used to derive (2.137) for a given problem (or the method
discussed below).

• You should know that the solution of (2.131) is (2.134). There is no need to go through the
intermediate steps every time.

• The solution (2.137) has the form

y = yp (x) + Cyh (x) (2.139)

where

yp = e^{−∫^x P(s) ds} ∫^x e^{∫^s P(t) dt} Q(s) ds (2.140)

and

yh = e^{−∫^x P(s) ds}. (2.141)
We call the first a particular solution. It is one solution of the DE, the solution obtained from
(2.137) by setting C = 0 (other particular solutions can be obtained by choosing other values
for C). The second function, yh , is a solution when Q = 0. It is a solution of the homogeneous
problem
y′ + Py = 0. (2.142)

EXERCISE: Consider (2.131). Seek a solution of the form µ = eS(x) . Substitute into the DE for
µ and find a DE for S(x). Solve the DE (i.e., express S in terms of P ). This gives a solution
that is always positive. How can you use this solution to get the general solutions (2.134) with an
arbitrary constant A of either sign?

EXAMPLE: Find the solution of


xy′ + 2x²y = x². (2.143)

Solution:

Step 1: First put in standard form by dividing the equation by the coefficient of y 0 giving

y′ + 2xy = x. (2.144)

Step 2: Multiply by µ:

µy′ + 2xµy = xµ. (2.145)
Step 3: Choose µ so that the coefficient of y is µ′:

µ′ = 2xµ. (2.146)

The solution of this is

µ = A e^{∫^x 2x dx} = A e^{x²}. (2.147)

We can take A = 1 and use µ = e^{x²}. Note that using another antiderivative of 2x, e.g.,
x² + 10, gives µ = A e^{10} e^{x²}, and the factor e^{10} can be absorbed into the arbitrary constant A.
Any antiderivative of the function P will do.

Step 4: The DE is now


d/dx (e^{x²} y) = x e^{x²}. (2.148)

Integrating both sides gives

e^{x²} y = (1/2) e^{x²} + C (2.149)

so

y = 1/2 + C e^{−x²}. (2.150)
Step 5: Check your solution:
xy′ + 2x²y = x(−2xC e^{−x²}) + 2x²(1/2 + C e^{−x²})
           = −2x²C e^{−x²} + x² + 2x²C e^{−x²}     (2.151)
           = x²

so we indeed have a solution of the DE.
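The integrating-factor recipe can also be carried out by quadrature when the integrals are not done analytically. The Python sketch below (an editorial illustration, not from the notes) applies the recipe of (2.137) to y′ + 2xy = x with trapezoid-rule integrals and compares with the closed form y = 1/2 + C e^{−x²} found above.

```python
import math

# Numerical version of the general recipe (2.137) for y' + 2x y = x,
# checked against the closed form y = 1/2 + C e^{-x^2} found above.
P = lambda s: 2 * s
Q = lambda s: s

def solve_by_integrating_factor(x, y0, n=20_000):
    # Trapezoid-rule quadrature from 0 to x; mu(0) = 1, so the bracketed
    # constant in (2.137) is just y(0) = y0.
    h = x / n
    int_P = 0.0                      # running integral of P from 0 to s
    int_muQ = 0.0                    # running integral of mu*Q from 0 to s
    f_prev = math.exp(int_P) * Q(0.0)
    for i in range(1, n + 1):
        s = i * h
        int_P += 0.5 * h * (P(s - h) + P(s))
        f = math.exp(int_P) * Q(s)
        int_muQ += 0.5 * h * (f_prev + f)
        f_prev = f
    return (int_muQ + y0) / math.exp(int_P)

C = 2.0                              # arbitrary constant: y(0) = 1/2 + C
for xv in (0.5, 1.0, 2.0):
    print(xv, solve_by_integrating_factor(xv, 0.5 + C), 0.5 + C * math.exp(-xv * xv))
```

The two columns agree to the accuracy of the quadrature, illustrating that the formula works even when the integrals are done numerically.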

EXERCISE: Find the solution using µ = −2e^{x²} as the integrating factor and show that you get the
same result.

2.5.7 Applications of first-order linear ODEs


Newton’s Law of Cooling

Consider an object with temperature θ(t) surrounded by fluid with an ambient (constant) temper-
ature Ta . Newton’s Law of Cooling states that the rate of change of the object’s temperature is
proportional to the temperature difference θ − Ta :
dθ/dt = −k(θ − Ta) (2.152)
where k is a positive constant. This is a simple model that captures gross averaged features of
cooling.
Figure 2.5: Newtonian cooling

Example: At one stage in the process of brewing beer the wort has to be cooled from boiling temper-
ature to 25◦ C, ideally within about 45 minutes. Assume the pot of wort is put in a large sink of cold
water and ice which is continually replenished so that the temperature of the water is maintained
at a temperature of Tw = 5◦ C. The wort is continually stirred to maintain a uniform temperature.
The temperature of the wort decreases from 100.0◦ C to 80.0◦ C in 5.00 minutes. How long does it
take to reach a temperature of 25◦C, assuming Newton’s Law of Cooling applies?
What if Tw = 10◦C and k is unchanged? Now how long does it take the wort to cool to 25◦C?

Solution: Let θ(t) be the temperature of the wort at time t where t is in minutes. Our model
equation is

dθ/dt + kθ = kTw. (2.153)
A homogeneous solution is θh(t) = e^{−kt}. By inspection a particular solution is θp(t) = Tw. Thus
the general solution of the DE is

θ(t) = Tw + Ae^{−kt} (2.154)

where A is unknown. From the initial condition θ(0) = 100◦C we have

θ(t) = Tw + (100 − Tw)e^{−kt}. (2.155)

1. For the first case Tw = 5◦ C and we are told that θ1 = 80◦ C at time t1 = 5.00 min. Thus

θ1 = Tw + (100 − Tw)e^{−kt1}
=⇒ e^{kt1} = (100 − Tw)/(θ1 − Tw) (2.156)
=⇒ k = (1/t1) ln[(100 − Tw)/(θ1 − Tw)].

This determines k:

k = (1/5) ln(95.0/75.0) ≈ 0.0472778 min^{−1} ≈ 0.0473 min^{−1}. (2.157)
5 75.0
The wort reaches the final temperature Tf = 25◦ C at the time t satisfying

Tw + (100 − Tw)e^{−kt} = Tf (2.158)

i.e., when

e^{kt} = (100 − Tw)/(Tf − Tw) (2.159)

or when

t = (1/k) ln[(100 − Tw)/(Tf − Tw)] = (1/k) ln(95.0/20.0) ≈ 33 min. (2.160)

2. If Tw = 10◦ C and k is unchanged the time to cool to 25◦ C is


t = (1/k) ln[(100 − Tw)/(Tf − Tw)] = (1/k) ln(90.0/15.0) ≈ 38 min. (2.161)
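The arithmetic above is easy to check in a few lines of Python (an editorial illustration, not part of the notes):

```python
import math

# Check of the wort-cooling arithmetic above.
theta0, theta1, t1, Tf = 100.0, 80.0, 5.00, 25.0

def cooling_time(Tw, k):
    # Time for theta(t) = Tw + (theta0 - Tw) e^{-kt} to reach Tf, as in (2.160).
    return math.log((theta0 - Tw) / (Tf - Tw)) / k

k = math.log((theta0 - 5.0) / (theta1 - 5.0)) / t1   # k from the 5-minute data, (2.157)
print(round(k, 4), "per minute")                     # 0.0473
print(round(cooling_time(5.0, k)), "minutes")        # 33
print(round(cooling_time(10.0, k)), "minutes")       # 38
```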

Comments:

1. It is best to work in terms of the variables Tw, Tf, k, etc., until the final stage
rather than substituting their values early on. This makes it possible to answer the second
question easily rather than working out the whole solution again. It also shows how the
solution depends on the various parameters. For example, doubling k halves the cooling time.

2. How many significant figures should be kept in the value for k and in the value for the
cooling time? Increasing θ1 by 0.05◦C changes k to (1/5) ln(95.0/75.05) = 0.0471445 · · · instead of
0.0472778 · · · . Decreasing θ1 by 0.05◦C changes k to 0.0474111 · · · . Thus if θ1 is known to
within an error of 0.05◦C then k = 0.047 ± 1.4 × 10^{−4} min^{−1}, which leads to an error in the
final time of about 0.09 min, or roughly 5.6 s. How accurate is your thermometer and clock? To what
precision do you want the cooling time to be known?

Mixing Tank Problem

Consider a tank of volume V . A liquid flows in at a rate Vf . This is called the volume flux. It has
units of volume per unit time. The inflowing liquid has a dissolved chemical with concentration cin
(mass per unit volume). The concentration of the chemical in the tank is c(t). The liquid inside
the tank is well stirred so that the concentration is uniform throughout the tank. The well mixed
liquid flows out of the tank with the same rate as the inflow so the volume of fluid in the tank is
constant in time. Determine how the chemical concentration in the tank c(t) varies with time.

We use the principle of conservation of mass: the rate of change of the mass of chemical in the
tank is equal to the rate mass of chemical enters the tank minus the rate mass of chemical leaves
the tank. The mass of chemical in the tank is M (t) = V c(t). The rate mass enters is equal to the
product of the incoming volume flux times the concentration in the inflow, i.e., Vf cin (units of mass
per unit time). The rate mass leaves is Vf c(t). Thus, conservation of mass gives

d/dt (V c(t)) = Vf cin − Vf c(t) (2.162)
Figure 2.6: Mixing tank of volume V . c(t) is the chemical concentration in the tank, Vf is the volume flux
into and out of the tank and cin is the concentration of the chemical in the inflowing liquid.

or
dc/dt + (Vf/V) c(t) = (Vf/V) cin. (2.163)

A homogeneous solution is

ch = e^{−(Vf/V)t} (2.164)

and by inspection a particular solution is the constant solution

cp = cin. (2.165)

Thus the general solution is

c = cp + A ch = cin + A e^{−(Vf/V)t} (2.166)

or, in terms of the initial concentration c(0),

c(t) = cin + (c(0) − cin) e^{−(Vf/V)t}. (2.167)

Note that according to the solution, after a long time the concentration c(t) approaches that of
the inflow, which makes physical sense. This long-time limit is the particular solution.
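A short Python sketch (illustrative numbers, not from the notes) evaluates (2.167) and shows the approach to cin on the e-folding time scale V/Vf:

```python
import math

# Evaluate the mixing-tank solution (2.167); V, Vf, c_in, c0 are illustrative.
V, Vf, c_in, c0 = 100.0, 5.0, 2.0, 0.0   # tank volume, volume flux, concentrations

def c(t):
    return c_in + (c0 - c_in) * math.exp(-Vf / V * t)

# The e-folding time is V/Vf = 20: over each such interval the deficit
# c_in - c shrinks by a factor of e, so c(t) -> c_in for large t.
print(c(0.0), c(20.0), c(200.0))
```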

2.6 Nonlinear First-order Differential Equations


2.6.1 The clepsydra
A clepsydra, or water clock, measures time by the regulated flow of a liquid into or out of a vessel.
Consider the following problem (Figure 2.7). A rotationally symmetric container of the form
z = f (r) or r = r(z), where z is the vertical coordinate and r is the radius of the container, is
filled with water of depth h(t) where t is time. Water flows out through a hole at the bottom of
the container. We wish to find an equation for h(t) and solve for some special shapes.
Daniel Bernoulli (1700–1782) studied simple fluid flows and derived Bernoulli’s Law (see Bernoulli’s
‘Hydrodynamica’ published in 1738). Ignoring viscosity and assuming steady laminar flow this
states that
p + ρgz + (1/2)ρq² = constant (2.168)
Figure 2.7: A clepsydra. Fluid in the container is exposed to atmospheric pressure pa at the fluid surface
and at the exit. The container is radially symmetric with radius r a function of the height h above the
bottom.

where p is the pressure, ρ the fluid density, g the gravitational acceleration, z the vertical coordinate,
q the speed of the fluid. If the water level in the container is dropping slowly enough we can treat
the flow in the container as steady. Relating the flow at the surface (at z = h(t)) to the flow exiting
the hole at z = 0 we have
p1 + ρg z1 + (1/2)ρ q1² = p2 + ρg z2 + (1/2)ρ q2² (2.169)
where the subscript 1 refers to the value at point 1 at the surface (z1 = h) and subscript 2 refers
to point 2 at the hole at the bottom of the container (z2 = 0). Assuming q1 = h0 (t) is negligibly
small (which requires that the surface area at the top is much larger than the cross-sectional area
of the hole) and that the pressure at the surface (p1 ) and at the exit point (p2 ) are both equal to
the atmospheric pressure we get

q2 = √(2gh). (2.170)

This is known as Torricelli’s Law after Evangelista Torricelli (1608–1647). In reality viscous effects
reduce the outflow speed somewhat, particularly if the hole is very tiny:

q2 = γ√(2gh) (2.171)

where 0 < γ < 1 depends on the fluid viscosity and the shape and length of the hole.
Now the rate at which fluid exits the container must be equal to the rate at which the volume
of fluid in the container is decreasing (conservation of mass, assuming the fluid density is constant).
The volume flux through the hole is aq2 = aγ√(2gh) where a is the cross-sectional area of the hole.
The downward volume flux at the surface is A dh/dt (note it is negative) where A = πr²(h) is the area
of the surface. Conservation of mass gives our model equation

πr² dh/dt = −aγ√(2gh) (both sides are negative) (2.172)

or

dh/dt = −(aγ√(2g)/π) h^{1/2}/r²(h). (2.173)
This is, in general, a first-order nonlinear differential equation for h(t) (for what special container
shapes is the equation linear?). It is separable so in principle it is easy to solve.
Note that fluid viscosity is temperature dependent, hence the time taken for a vessel to empty
will depend on the fluid temperature. We have ignored the effects of surface tension which become
very important when the water level is low or the container is very narrow (e.g., a syringe for which
neglecting q1 in Bernoulli’s equation would also not be valid).

Examples:

1. If the bowl is a parabola, i.e., h = αr2 for some constant α > 0, we have
dh/dt = −c h^{−1/2} (2.174)

where c = (aγα/π)√(2g). Thus

h^{1/2} dh = −c dt (2.175)

so

h(t) = (h0^{3/2} − (3/2)ct)^{2/3} (2.176)

where h0 = h(0). The bowl is empty when h = 0, i.e., after time t = (2/3) h0^{3/2}/c.
2. For what special shape does the water surface drop at a constant rate, i.e., dh/dt is constant?
This requires r² ∝ √h. Setting h = kr⁴ the differential equation becomes

dh/dt = −aγ√(2gk)/π. (2.177)
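Example 1 above can be sanity checked numerically: the closed form (2.176) should satisfy dh/dt = −c h^{−1/2} and vanish at the emptying time. A Python sketch (illustrative constants, not part of the notes):

```python
import math

# Check Example 1: h(t) = (h0^{3/2} - (3/2) c t)^{2/3} from (2.176) should
# satisfy dh/dt = -c h^{-1/2}; c and h0 are illustrative constants.
c, h0 = 0.1, 1.0

def h(t):
    return max(h0 ** 1.5 - 1.5 * c * t, 0.0) ** (2.0 / 3.0)

t = 2.0                                        # well before the bowl empties
dh_dt = (h(t + 1e-6) - h(t - 1e-6)) / 2e-6     # centred finite difference
print(dh_dt, -c / math.sqrt(h(t)))             # these should agree closely

t_empty = (2.0 / 3.0) * h0 ** 1.5 / c          # predicted emptying time
print(t_empty, h(t_empty))                     # h is (essentially) zero here
```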

2.6.2 Clairaut Equations and Singular Solutions


Alexis Clairaut (1713–1765) wrote a book in 1731 (at the age of 13!) on curves, one of his main
interests [3]. In particular, he considered envelopes of a set of tangent lines.
To derive Clairaut’s equation (see [8]) consider a family of straight lines

y = mx + b. (2.178)

This is a two parameter family which includes all straight lines on the plane. Suppose that we take
b to be a function of the slope m via b = f (m). Then we have a one parameter family of straight
lines:
y = mx + f (m). (2.179)
Now y′ = m so we can write this as a first-order differential equation

y = xy′ + f(y′) (2.180)
Figure 2.8: Solutions of the Clairaut equation y = xy′ + (y′)².

where in general f(y′) is a nonlinear function. This is known as Clairaut’s equation. Note that
in (2.180), y′ need not be constant so this is a generalization of y = mx + f(m). There will in
general be solutions that are not lines in the plane.
Differentiating both sides gives

y′ = y′ + xy″ + f′(y′) y″. (2.181)

Letting p = y′ we have

(x + f′(p)) p′ = 0 (2.182)

so either p′ = 0 or f′(p) = −x. The first case says p = y′ is constant, i.e., y is linear in x. The
general solution is p = m and y = mx + b where b and m are unrelated. This is not what we started
with. This discrepancy has arisen because we have found the solution of (2.182), not of (2.180).
That is, differentiating (2.180) has resulted in a loss of information. Substituting y = mx + b into
(2.180) gives b = f(m), so we have recovered y = mx + f(m).
The second solution given by f 0 (p) = −x is new. This will give the envelope to a family of
straight lines as illustrated in the following example.

Example: Consider the problem with f(y′) = (y′)².

(i) The first solution has p = y′ = a, where a is a constant, hence y = ax + b where a and b are
constants.

(ii) The second solution has f′(p) = 2p = −x =⇒ 2y′ = −x =⇒ y = −(1/4)x² + d, where d is a
constant.

Now our solutions need to satisfy the original DE. What we have found so far are solutions of
the derivative of the original DE so we need to substitute our solutions into the original DE, which
in this example is

y = xy′ + (y′)². (2.183)

(i) Substituting y = ax + b into (2.183) gives

ax + b = ax + a² (2.184)

hence we must have b = a².

(ii) Substituting y = −x²/4 + d into the DE gives

−(1/4)x² + d = x(−(1/2)x) + (−(1/2)x)² = −(1/4)x² (2.185)

so d = 0.

The solutions of the Clairaut Equation (2.183) are


y = ax + a² (2.186)

and

y = −(1/4)x². (2.187)

The first has a single arbitrary constant; the second solution has no arbitrary constants. Thus not
all solutions of a first-order differential equation involve an arbitrary constant. The solutions are
shown in Figure 2.8. Here the green lines are the straight line solutions with positive slope and the
red lines are those with negative slope. The black curve is the second solution y = −(1/4)x², which is
the envelope of the family of straight line solutions.
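A quick check with plain Python (an editorial illustration, not from the notes) confirms that both families satisfy the DE and that each line y = ax + a² touches the parabola at exactly one point, x = −2a:

```python
# Check the two Clairaut solutions of y = x y' + (y')^2 and their tangency.
a = 0.7                                  # arbitrary slope in the line family

def line(x):     return a * x + a * a    # y = ax + a^2, with y' = a
def envelope(x): return -x * x / 4.0     # y = -x^2/4, with y' = -x/2

# Both satisfy y = x y' + (y')^2:
for x in (-3.0, 0.0, 2.5):
    assert abs(line(x) - (x * a + a * a)) < 1e-12
    assert abs(envelope(x) - (x * (-x / 2.0) + (x / 2.0) ** 2)) < 1e-12

# Tangency: line and parabola meet where x^2/4 + a x + a^2 = (x/2 + a)^2 = 0,
# a double root at x = -2a, so each line touches the envelope exactly once.
x_t = -2.0 * a
assert abs(line(x_t) - envelope(x_t)) < 1e-12
print("line with slope", a, "touches the envelope at x =", x_t)
```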

Exercise: Consider the IVP


y = xy′ + (y′)², y(0) = y0. (2.188)

How many real valued solutions are there if (i) y0 < 0, (ii) y0 = 0, or (iii) y0 > 0?

What does the Existence-Uniqueness theorem say about solutions of this equation? To apply
the theorem we must write the equation in the form y′ = f(x, y). For our example this gives

y′ = (−x ± √(x² + 4y))/2. (2.189)

Then

fy(x, y) = ±1/√(x² + 4y) (2.190)

which is not continuous along the curve x² + 4y = 0, i.e., along the solution curve y = −x²/4.

Exercise: For initial points on the curve y = −x2 /4 how many solutions are there (i.e., solutions
of the initial value problem y(x0 ) = y0 = −x20 /4)?
To read more about the Clairaut equation and a number of applications see [8]. [3] provides a
short biography on Clairaut and briefly discusses some of his other work.
Figure 2.9: Equal area triangles for A = 1 and c = 1.

2.6.3 A geometry problem


Find the family of curves for which the area of the triangle bounded by the segment of the tangent
line between a point (x, y) on the curve and the x-axis, the perpendicular from the point to the
x-axis and the segment of the x-axis between these two lines is equal to a given constant A.
Solution: Let (x0, y0) be a point on the curve y = f(x). The equation for the tangent line is
y = y0 + f′(x0)(x − x0). The x intercept is x = x0 − y0/f′(x0). The area of the triangle is

A = (1/2)|y0| |x0 − (x0 − y0/f′(x0))| = ±(1/2) f(x0)²/f′(x0). (2.191)

The ± sign arises because the area A must be constant and y0 may be positive or negative and
the x-intercept of the tangent line could lie to the left or right of x0. Thus the curve is given by
solutions of the nonlinear DE

f′ = ±f²/(2A) (2.192)
which has solutions
f = ±2A/(x − c) (2.193)
where c is arbitrary. The solution curves for A = 1 and c = 1 are shown in Figure 2.9. Changing
the value of c simply translates the curves left or right.
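One can verify numerically that the curves (2.193) do give tangent triangles of constant area A (Python, an editorial illustration, not from the notes):

```python
# Check that f(x) = 2A/(x - c) gives tangent triangles of area A
# (the same holds for the branch -2A/(x - c)).
A, c = 1.0, 1.0

def f(x):  return 2.0 * A / (x - c)
def fp(x): return -2.0 * A / (x - c) ** 2      # f'(x)

for x0 in (2.0, 3.5, -1.0):
    y0 = f(x0)
    x_int = x0 - y0 / fp(x0)                   # x-intercept of the tangent line
    area = 0.5 * abs(y0) * abs(x0 - x_int)     # triangle area as in (2.191)
    assert abs(area - A) < 1e-12
print("every tangent triangle has area", A)
```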

2.7 Exact First-Order Differential Equations


We have seen how the general solution of a linear first-order DE can be found. Many first-order
differential equations, however, are nonlinear and other solution methods are required. There are
no general methods to solve nonlinear DEs but there are methods that can be used to solve certain
types of nonlinear DEs (e.g., separable equations). Here we consider equations that are exact or
that can be made exact.
Implicit functions are given by equations of the form

F(x, y) = C (2.194)

where C is a constant. This equation defines curves, the contours of F, on the x-y plane. We
can view this as defining a function y(x) or x(y) (there may be several such functions satisfying
(2.194)). Taking the former, we can differentiate with respect to x to get

∂F/∂x + (∂F/∂y)(dy/dx) = 0. (2.195)

‘Multiplying’ by dx leads to the total differential

dF = (∂F/∂x) dx + (∂F/∂y) dy = 0, (2.196)
which for simplicity we will usually write as
dF = Fx dx + Fy dy = 0. (2.197)
This states that the infinitesimal vector (dx, dy), which is tangent to the solution curve, is perpen-
dicular to the gradient of F .
Definition: A differential form M (x, y) dx + N (x, y) dy is said to be exact in a rectangle R if there
is a function F (x, y) such that
M = Fx and N = Fy (2.198)
for all (x, y) ∈ R. Then
dF ≡ Fx dx + Fy dy = M (x, y) dx + N (x, y) dy. (2.199)
The solution of dF = 0 is F = constant.

Example: The differential form


(2x + y³/x²) dx − (3y²/x) dy = 0 (2.200)

is exact since

2x + y³/x² = ∂/∂x (x² − y³/x),
−3y²/x = ∂/∂y (x² − y³/x). (2.201)

Thus

(2x + y³/x²) dx − (3y²/x) dy = d(x² − y³/x) = 0 =⇒ x² − y³/x = C. (2.202)
x
This is the implicit solution of the DE.
2.7.1 Test For Exactness
Given the equation
M (x, y) dx + N (x, y) dy = 0 (2.203)
how do we know if it is exact?

Theorem IV: Let f (x, y) be a scalar field such that the partial derivatives fx , fy , and fyx exist on
an open rectangle R. Then if fyx is continuous on R the derivative fxy exists and fxy = fyx on R.

You should see the proof of this in MATH 237 or a similar course.

Theorem V (Euler 1734): Suppose the first partial derivatives of M (x, y) and N (x, y) are
continuous on a rectangle R. Then the differential form

M(x, y) dx + N(x, y) dy (2.204)

is exact on R iff

∂M/∂y = ∂N/∂x (2.205)

for all (x, y) ∈ R.

Note: With Fx = M and Fy = N this means Fxy = Fyx .

Example: Find the solution of


dy/dx = −(3x² + 6xy²)/(6x²y + 4y³). (2.206)

Solution: We can write this as

(3x² + 6xy²) dx + (6x²y + 4y³) dy = 0. (2.207)

Here M = 3x² + 6xy² and N = 6x²y + 4y³, and

My = 12xy, Nx = 12xy. (2.208)

Since My = Nx and these are continuous everywhere, the equation is exact.

Fx = M = 3x² + 6xy² =⇒ F = x³ + 3x²y² + g(y). (2.209)

Next Fy = N gives

Fy = 6x²y + g′(y) = N = 6x²y + 4y³ =⇒ g′(y) = 4y³ =⇒ g(y) = y⁴. (2.210)
Thus

F = x³ + 3x²y² + y⁴ (2.211)

and the solutions of the DE are given implicitly by

x³ + 3x²y² + y⁴ = C. (2.212)
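A numerical spot check of this example (Python, not part of the notes) confirms Fx = M, Fy = N and the exactness condition My = Nx using centred finite differences:

```python
# Spot check of the exact example: F = x^3 + 3 x^2 y^2 + y^4 should satisfy
# F_x = M = 3x^2 + 6xy^2, F_y = N = 6x^2 y + 4y^3, and M_y = N_x.
F = lambda x, y: x**3 + 3 * x**2 * y**2 + y**4
M = lambda x, y: 3 * x**2 + 6 * x * y**2
N = lambda x, y: 6 * x**2 * y + 4 * y**3

d = 1e-6
for (x, y) in [(1.0, 2.0), (-0.5, 0.3)]:
    Fx = (F(x + d, y) - F(x - d, y)) / (2 * d)   # centred differences
    Fy = (F(x, y + d) - F(x, y - d)) / (2 * d)
    My = (M(x, y + d) - M(x, y - d)) / (2 * d)
    Nx = (N(x + d, y) - N(x - d, y)) / (2 * d)
    assert abs(Fx - M(x, y)) < 1e-4 and abs(Fy - N(x, y)) < 1e-4
    assert abs(My - Nx) < 1e-4                   # exactness test (2.205)
print("F_x = M, F_y = N and M_y = N_x all hold")
```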

Sketch of a Proof of Theorem V: We need to show there exists a function F (x, y) such that
M = Fx and N = Fy iff My = Nx .

1. If there is a function F such that M = Fx and N = Fy we know that F , Fx , Fy , Fxy and Fyx
exist and are continuous since the first partial derivatives of M and N are continuous. By
Theorem IV Fxy = Fyx =⇒ My = Nx .

2. Suppose My = Nx . If F exists such that Fx = M and Fy = N then


Fx = M =⇒ F(x, y) = ∫_{x0}^{x} M(t, y) dt + g(y) (2.213)

so

Fy = N = ∫_{x0}^{x} My(t, y) dt + g′(y) (2.214)

or

g′(y) = N(x, y) − ∫_{x0}^{x} My(t, y) dt. (2.215)

This is only possible if the right hand side is a function of y only, i.e., if

∂/∂x [ N(x, y) − ∫_{x0}^{x} My(t, y) dt ] = 0 =⇒ Nx(x, y) − My(x, y) = 0. (2.216)

That is, if Nx = My the right hand side of (2.215) is a function of y only so g(y) exists. By
construction Nx = My =⇒ F (x, y) exists.

2.7.2 Inexact differential equations


The equation
(2x + y³/x²) dx − (3y²/x) dy = 0 (2.217)

is exact. Multiplying it by x² gives

(2x³ + y³) dx − 3y²x dy = 0 (2.218)

which is not exact since

∂/∂y (2x³ + y³) = 3y² ≠ −3y² = ∂/∂x (−3y²x). (2.219)
Clearly we can make this inexact DE exact by multiplying by the integrating factor µ(x, y) = 1/x².
This is a generalization of the method discussed above for finding solutions of linear first-order
DEs. We are now extending it to nonlinear DEs.
In general it can be very difficult to find an integrating factor. If
µM dx + µN dy = 0 (2.220)
is exact then
∂/∂y (µM) = ∂/∂x (µN) (2.221)
so
N µx − M µy + (Nx − My )µ = 0. (2.222)
This is a first-order partial differential equation to solve for µ. Usually one guesses at a simple form
for µ, for example that µ is a function of x or y only.

Example: Find the solution of


(3x²/y² + 1 + 2xy) dx + (2x/y + 3x²) dy = 0. (2.223)

Solution: Here M = 3x²/y² + 1 + 2xy and N = 2x/y + 3x², so

My = −6x²/y³ + 2x,
Nx = 2/y + 6x. (2.224)

Since My ≠ Nx, the equation is not exact. Multiplying by µ to make it exact leads to

∂/∂y [µ(3x²/y² + 1 + 2xy)] = ∂/∂x [µ(2x/y + 3x²)] (2.225)

or

(3x²/y² + 1 + 2xy) µy − (2x/y + 3x²) µx + (−6x²/y³ + 2x − 2/y − 6x) µ = 0 (2.226)

or

(3x²/y² + 1 + 2xy) µy − (2x/y + 3x²) µx − 2(3x²/y³ + 1/y + 2x) µ = 0 (2.227)

or

(3x²/y² + 1 + 2xy) µy − (2x/y + 3x²) µx − (2/y)(3x²/y² + 1 + 2xy) µ = 0. (2.228)

Now, since any solution of this equation will do, you need to notice that if you assume µ is
independent of x this reduces to

µy = (2/y) µ (2.229)

which has the solution µ = y². Using this as the integrating factor puts the DE in the form

(3x² + y² + 2xy³) dx + (2xy + 3x²y²) dy = 0 (2.230)

which is exact. The solution is then easily found to be x³ + y²x + y³x² = C (exercise).
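The claimed integrating factor and implicit solution can be spot checked numerically (Python, an editorial illustration, not from the notes):

```python
# Spot check: mu = y^2 turns (2.223) into the exact form (2.230), whose
# coefficients M and N satisfy M_y = N_x, and F = x^3 + x y^2 + x^2 y^3 works.
M = lambda x, y: 3 * x**2 + y**2 + 2 * x * y**3
N = lambda x, y: 2 * x * y + 3 * x**2 * y**2
My = lambda x, y: 2 * y + 6 * x * y**2          # partials computed by hand
Nx = lambda x, y: 2 * y + 6 * x * y**2

assert My(1.0, 2.0) == Nx(1.0, 2.0)
assert My(0.3, -1.5) == Nx(0.3, -1.5)

F = lambda x, y: x**3 + x * y**2 + x**2 * y**3  # implicit solution F = C
d, x, y = 1e-6, 1.2, 0.7
assert abs((F(x + d, y) - F(x - d, y)) / (2 * d) - M(x, y)) < 1e-4
assert abs((F(x, y + d) - F(x, y - d)) / (2 * d) - N(x, y)) < 1e-4
print("mu = y**2 makes the equation exact")
```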

50
2.8 Solution by Substitution
There are several types of first-order DEs that can be solved by a suitable variable transforma-
tion. We consider two: Bernoulli equations and homogeneous equations. Then we look at the
Brachistochrone problem which is also solved via a substitution.

Bernoulli Equations

Bernoulli equations have the form


dy/dx + P(x)y = Q(x)y^α. (2.231)

If α = 0 or 1 the equation is linear and easy to solve, so here we are only interested in α ≠ 0, 1.
Let

v = y^{1−α}. (2.232)

Then

dv/dx = (1 − α)y^{−α} dy/dx
      = (1 − α)(v/y)[−P(x)y + Q(x)y^α]
      = (1 − α)v[−P(x) + Q(x)(1/v)]     (2.233)
      = −(1 − α)P(x)v + (1 − α)Q,

so v is a solution of the linear first-order DE

dv/dx + P̃(x)v = Q̃(x) (2.234)
where
P̃ = (1 − α)P and Q̃ = (1 − α)Q. (2.235)

Example: Find the solution of


y′ + xy = xy^{1/3}. (2.236)

Solution: This is a Bernoulli equation with α = 1/3. Let

v = y^{1−α} = y^{2/3}. (2.237)

Then

v′ = (2/3) y^{−1/3} y′
   = (2/3)(v/y)(−xy + xy^{1/3})     (2.238)
   = −(2/3)xv + (2/3)x,
or
v′ + (2/3)xv = (2/3)x. (2.239)

A homogeneous solution is

vh = e^{−x²/3}. (2.240)

To find a particular solution let v = u(x)vh(x) to get

u′ = (2/3)x/vh(x) = (2/3)x e^{x²/3} (2.241)

which has

u = e^{x²/3} (2.242)

as a solution. Then a particular solution for v is vp = u vh = 1, as could have been seen by inspection.
If you can spot this directly when looking at the DE for v(x) you do not need to go through the
above steps to find a particular solution. The general solution is

v = 1 + A e^{−x²/3} (2.243)

giving

y = v^{3/2} = (1 + A e^{−x²/3})^{3/2}. (2.244)
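The result can be checked by integrating the original equation y′ = −xy + xy^{1/3} numerically and comparing with (2.244). A Python sketch (not part of the notes; A = 1 is an illustrative choice):

```python
import math

# Integrate y' = -x y + x y^{1/3} with Euler's method and compare with the
# closed form y = (1 + A e^{-x^2/3})^{3/2}; A = 1 is an illustrative choice.
A = 1.0
y_exact = lambda x: (1.0 + A * math.exp(-x * x / 3.0)) ** 1.5

def y_numeric(x_end, n=200_000):
    x, y = 0.0, y_exact(0.0)         # start on the exact solution at x = 0
    h = x_end / n
    for _ in range(n):
        y += h * (-x * y + x * y ** (1.0 / 3.0))
        x += h
    return y

print(y_numeric(2.0), y_exact(2.0))  # the two values should agree closely
```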

Homogeneous equations

A first-order differential equation of the form


dy/dx = f(y/x) (2.245)

is called a homogeneous equation. Leibniz discovered how to solve problems of this type in the
early 1690s.
Let y = xz. Then

dy/dx = x dz/dx + z = f(z). (2.246)

This is a separable equation for z:

dz/(f(z) − z) = dx/x. (2.247)

Example: Solve (x + y) dx − (x − y) dy = 0.

Solution: Write the DE as


dy/dx = (x + y)/(x − y) = (1 + y/x)/(1 − y/x). (2.248)

The equation is homogeneous. Letting y = xz leads to

dz/((1 + z)/(1 − z) − z) = dx/x, (2.249)
or
((1 − z)/(1 + z²)) dz = dx/x. (2.250)

Integrating:

∫ dz/(1 + z²) − ∫ z dz/(1 + z²) = ln x + c (2.251)

or

tan⁻¹(z) − (1/2) ln(1 + z²) = ln x + c. (2.252)

Using z = y/x we have

tan⁻¹(y/x) − ln √(1 + (y/x)²) = ln x + c. (2.253)
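To check the implicit solution (2.253), one can follow a solution of the DE numerically and verify that the left-hand side minus ln x stays constant (Python sketch, an editorial illustration, not from the notes):

```python
import math

# G(x, y) = arctan(y/x) - (1/2) ln(1 + (y/x)^2) - ln(x) should be constant
# along solutions of dy/dx = (x + y)/(x - y), by (2.253).
G = lambda x, y: math.atan(y / x) - 0.5 * math.log(1 + (y / x) ** 2) - math.log(x)
f = lambda x, y: (x + y) / (x - y)

x, y = 1.0, 0.0                      # start on the curve through (1, 0)
G0 = G(x, y)
h = 1e-4
for _ in range(5000):                # midpoint (RK2) steps along the solution
    k1 = f(x, y)
    k2 = f(x + h / 2, y + h * k1 / 2)
    y += h * k2
    x += h
print(G(x, y) - G0)                  # should be very close to zero
```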

The Brachistochrone

Consider a bead sliding down a frictionless wire joining points A and B. What is the shape
of the curve which minimizes the travel time? This problem is known as the Brachistochrone
problem (from the Greek brachistos, shortest + chronos, time). The Brachistochrone problem is
an historically interesting and important problem. It ultimately led to the development of the
Calculus of Variations. The problem was posed by Johann Bernoulli in the June 1696 issue of the
Acta eruditorum as a challenge to other mathematicians of the time. Solutions by Newton, Leibniz,
Jakob Bernoulli and by Johann Bernoulli himself were published the following year [3, 7].
To setup the problem we first consider a related problem in optics. Suppose a ray of light
travels through a medium comprised of layers of different material in which the light has different
velocities (Figure 2.10). It was experimentally observed by Snell that at the interface between two
layers (e.g., at the point xj−1 in the figure) the ratio of the sines of the angles the ray makes with
the normal on the two sides of the interface is constant, i.e., sin(αj−1 )/ sin(αj ) is constant. That is,
as the angle αj−1 varies, the angle αj changes to keep this ratio constant. A rational explanation
of this is that the ray of light takes the path between two points which minimizes the travel time.
This is known as Fermat’s principle of least time.
Consider a material comprised of N layers. Layer j has thickness dj and light travels in it at
speed vj . Light travels from x = x0 at the top of the upper layer to point xN at the bottom of the
N th layer. What path minimizes the travel time?
The travel time ∆T between x_{j−1} and x_{j+1} is

∆T = sqrt(d_j^2 + (x_j − x_{j−1})^2)/v_j + sqrt(d_{j+1}^2 + (x_{j+1} − x_j)^2)/v_{j+1}.   (2.254)
For fixed xj−1 and xj+1 what value of xj minimizes the travel time ∆T ? It is given by the value of
xj for which

d(∆T)/dx_j = (x_j − x_{j−1}) / [v_j sqrt(d_j^2 + (x_j − x_{j−1})^2)] − (x_{j+1} − x_j) / [v_{j+1} sqrt(d_{j+1}^2 + (x_{j+1} − x_j)^2)] = 0.   (2.255)

Figure 2.10:

which we can write as


sin(α_j)/v_j = sin(α_{j+1})/v_{j+1}.   (2.256)

Thus, we must have

sin(α_j)/v_j = constant for all j,   (2.257)

for if

sin(α_j)/v_j ≠ sin(α_{j+1})/v_{j+1}   (2.258)
for some j we can move xj until these are equal and the travel time is reduced.
Taking the limit as the layer thicknesses go to zero (the number of layers going to infinity), so that the properties of the medium, including the light speed v(y), vary continuously with height y, the path taken by the ray of light becomes a smooth curve along which

sin(α)/v(y) = constant   (2.259)

where α is the angle between the vertical and the direction the ray of light is travelling.
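Returning to the discrete layered medium, the minimization is easy to verify numerically. The sketch below (our own code; the layer speeds and thicknesses are arbitrary illustrative values) minimizes the two-layer travel time (2.254) by a golden-section search and confirms that sin(α_j)/v_j comes out equal in both layers at the optimum:

```python
import math

def travel_time(x1, v1=1.0, v2=2.0, d1=1.0, d2=1.0, x2=2.0):
    # total travel time (2.254) for a ray crossing the interface at x1
    return math.hypot(d1, x1)/v1 + math.hypot(d2, x2 - x1)/v2

# golden-section search for the crossing point minimizing the travel time
lo, hi = 0.0, 2.0
phi = (math.sqrt(5.0) - 1.0)/2.0
for _ in range(200):
    a = hi - phi*(hi - lo)
    b = lo + phi*(hi - lo)
    if travel_time(a) < travel_time(b):
        hi = b
    else:
        lo = a
x1 = 0.5*(lo + hi)

# Fermat's principle recovers Snell's law: sin(alpha_1)/v_1 = sin(alpha_2)/v_2
sin1 = x1/math.hypot(1.0, x1)
sin2 = (2.0 - x1)/math.hypot(1.0, 2.0 - x1)
assert abs(sin1/1.0 - sin2/2.0) < 1e-6
```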
Johann Bernoulli’s solution of the Brachistochrone problem was based on this principle. Con-
sider a frictionless wire in the shape of a curve y(x) joining the point A at the origin to a point B
below and to the right (see Figure 2.11). A bead moves along the curve starting from rest at the

origin. Conservation of energy gives (1/2)v^2 + gy = 0, so the bead moves with speed v = sqrt(−2gy). Now

sin(α) = ∆x/sqrt(∆x^2 + ∆y^2) = 1/sqrt(1 + (∆y/∆x)^2) = 1/sqrt(1 + (y′)^2).   (2.260)

Hence, the curve is given by

sin(α)/v(y) = 1/[sqrt(−2gy) sqrt(1 + (y′)^2)] = constant   (2.261)

Figure 2.11:

or
1/[sqrt(−y) sqrt(1 + (y′)^2)] = c̃.   (2.262)

Rearranging we have

1 + (y′)^2 = −c/y,   (2.263)

where c is a new constant, positive as y < 0. Solving for y′ we have

y′ = −sqrt(−c/y − 1),   (2.264)

where the negative square root is used because y′ < 0 (at least at the beginning; y′ can change
sign). This is a separable equation:
−sqrt(−y/(c + y)) dy = dx.   (2.265)

To solve this we make the substitution


tan(θ/2) = sqrt(−y/(c + y)).   (2.266)

Then
−y/(c + y) = tan^2(θ/2),   (2.267)

from which we find y = −c sin^2(θ/2). Then dy = −c sin(θ/2) cos(θ/2) dθ and the DE becomes

dx = −tan(θ/2) (−c sin(θ/2) cos(θ/2)) dθ = c sin^2(θ/2) dθ
   = (c/2)(1 − cos θ) dθ,   (2.268)

Figure 2.12: A wheel of radius r rolling beneath the x-axis with angular velocity ω. The center of the wheel moves with speed V.

so
x = (c/2)(θ − sin θ) + A.   (2.269)
The initial point is (x, y) = (0, 0). From y = −c sin^2(θ/2) we have θ = 0 at the initial point. Hence A = 0.
The final solution is
x = (c/2)(θ − sin θ),
y = −(c/2)(1 − cos θ).   (2.270)
This curve is a cycloid: the path followed by a point on the circumference of a wheel rolling
along a flat surface, in this case the surface lies above the wheel. Here c/2 is the radius of the
wheel.
Cycloid: Consider a wheel of radius r which lies below and tangent to the x-axis. It rolls along
the x-axis with constant angular velocity ω. A point on the circumference of the wheel is initially
at the origin. If the wheel spins counter-clockwise about its centre, which is initially at (x_c, y_c) =
(0, −r), while its centre remains stationary, a point on the circumference has position (x, y) =
(−r sin(ωt), −r(1 − cos(ωt))). The period of rotation is τ = 2π/ω. If the wheel rolls along the x-axis
it will move a distance equal to the circumference in one period, so the centre of the wheel moves
with speed V = 2πr/τ = ωr. Taking into account the motion of the wheel centre, the point has
position (x_p(t), y_p(t)) given by

x_p(t) = ωrt − r sin(ωt),
y_p(t) = −r(1 − cos(ωt)).   (2.271)

In terms of the counter-clockwise angle of rotation θ = ωt the path followed by the point is

x_p(θ) = r(θ − sin(θ)),
y_p(θ) = −r(1 − cos(θ)).   (2.272)

This is (2.270) with r = c/2.
The cycloid has many interesting properties. The time taken for a bead released from rest
at any point on the curve to reach the bottom under the influence of gravity is independent of
the release point. Hence, the cycloid is also an isochrone. This result was proved by Huygens in
1659 using infinitesimals and a geometric argument. Jacob Bernoulli proved this result in 1690
by setting up an appropriate differential equation and solving it. Huygens used this idea in his
invention of a pendulum clock realising that if the pendulum moved along a cycloid it would keep
time more accurately as the period of oscillation did not depend on the amplitude of the oscillation
as it does for a simple pendulum. This is useful because friction and damping associated with the
motion of the pendulum through air will decrease the amplitude of oscillation over time.
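The isochrone property can be checked numerically. The sketch below (our own check; the values of r, g and the release points are arbitrary) evaluates the descent time T = ∫ ds/v along the cycloid (2.270), with v = sqrt(2g(y0 − y)) from energy conservation, by a midpoint rule; the result is π sqrt(r/g) independent of the release point:

```python
import math

def descent_time(theta0, r=1.0, g=9.81, n=200000):
    # time to slide from parameter theta0 to the bottom (theta = pi) of the
    # cycloid x = r(t - sin t), y = -r(1 - cos t); arclength ds = 2r sin(t/2) dt
    # and speed v = sqrt(2 g (y0 - y)) from energy conservation
    y0 = -r*(1.0 - math.cos(theta0))
    h = (math.pi - theta0)/n
    total = 0.0
    for i in range(n):
        th = theta0 + (i + 0.5)*h     # midpoint rule: avoids the singular
        ds = 2.0*r*math.sin(th/2.0)   # integrand at th = theta0
        v = math.sqrt(2.0*g*(y0 + r*(1.0 - math.cos(th))))
        total += ds/v*h
    return total

t_top = descent_time(1e-6)   # released essentially at the top
t_mid = descent_time(2.0)    # released well down the curve
t_iso = math.pi*math.sqrt(1.0/9.81)
assert abs(t_top - t_iso) < 0.01
assert abs(t_mid - t_iso) < 0.01
```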

2.9 Problems
1. Find the solutions of the following separable differential equations.
(a) dy/dx = y^2 x^3, y(0) = −1;
(b) dx/dt = (1 − x) sin t;
(c) dp/dt = (p − 1)(p − 2);
(d) dq/ds = tan(s) q^3, q(π) = 1.

On what intervals are the solutions defined?

2. For the following autonomous equations plot the forcing functions and find the fixed points.
Are the fixed points stable, i.e., if the solution is perturbed away from a fixed point does it
return to the fixed point? Is it globally stable, i.e., does the solution go to the fixed point no
matter where it starts from? Answer this without using the solution of the equation.

(a) dv/dt = −1 − v.   (2.273)
(b) dv/dt = −1 − |v|v.   (2.274)
(c) dp/dt = −p(p − 1)(p − 5).   (2.275)
3. Consider the autonomous differential equation
dy/dx = f(y)   (2.276)

where f(y) is a continuous function in any interval of interest. Suppose y(x) is a solution of
the differential equation on some interval (a, b). Prove that the solution y(x) is a monotonic
function on (a, b). If the solution exists on (−∞, ∞) show that y(x) goes to a constant or
to ±∞ as x → ±∞.

4. Consider the IVP


y′ = y,   y(0) = 1.   (2.277)
Estimate the value of y(1) using Euler’s method with N steps of size h = 1/N .

(a) Show that the estimate is y_N = (1 + h)^N.
(b) Show that the error E = e − y_N in this estimate satisfies

lim_{h→0} E/h = e/2 ≈ 1.359   (2.278)
and hence the error E goes to zero linearly in h as h → 0. That is, Euler’s method gives
a first-order approximation for y(1).
5. Consider the IVP
y′ = y,   y(0) = 1.   (2.279)

Estimate the value of y(1) using the RK-2 method with N steps of size h = 1/N.

(a) Show that the estimate is y_N = (1 + h + h^2/2)^N.
(b) Show that the error E = e − y_N in this estimate satisfies

lim_{h→0} E/h^2 = e/6 ≈ 0.453   (2.280)

and hence the error E goes to zero quadratically in h as h → 0. That is, the RK-2
method gives a second-order approximation for y(1).
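The claimed error behaviour in Problems 4 and 5 can be observed directly. The sketch below (our own code, not part of the notes) runs both methods for y′ = y and checks the limiting ratios E/h → e/2 and E/h² → e/6 at a fixed small h:

```python
import math

def euler_y1(n):
    # Euler's method for y' = y, y(0) = 1, N steps to x = 1
    h, y = 1.0/n, 1.0
    for _ in range(n):
        y += h*y                 # y_{k+1} = (1 + h) y_k
    return y

def rk2_y1(n):
    # midpoint (RK-2) method for the same IVP
    h, y = 1.0/n, 1.0
    for _ in range(n):
        k1 = y
        k2 = y + 0.5*h*k1        # stage value at the midpoint
        y += h*k2                # y_{k+1} = (1 + h + h^2/2) y_k
    return y

n = 1000
h = 1.0/n
assert abs(euler_y1(n) - (1 + h)**n) < 1e-9                # estimate is (1+h)^N
assert abs((math.e - euler_y1(n))/h - math.e/2) < 0.01     # E/h -> e/2
assert abs((math.e - rk2_y1(n))/h**2 - math.e/6) < 0.01    # E/h^2 -> e/6
```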
6. Consider the IVP
y′ = x + y,   y(0) = 1.   (2.281)

Using Picard's method find the nth iterate y_n(x) using induction. What does y_n(x) converge
to? Show that this is the solution of the IVP. Do this using the following starting functions:

(a) y0 (x) = 1; (c) y0 (x) = 1 + x;


(b) y0 (x) = ex ; (d) y0 (x) = cos(x).

7. Find the exact solution of the IVP

y′ = y^2,   y(0) = 1.   (2.282)

Starting with y_0(x) = 1 use Picard's method to calculate y_1(x), y_2(x), and y_3(x). Compare
your results with the exact solution. Maple would be helpful in going further.
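A sketch of the computation in Problem 7: since each Picard iterate here is a polynomial, the iteration can be carried out with exact rational coefficient arithmetic (the helper `picard_step` below is our own construction, not from the notes). The iterates reproduce the geometric series of the exact solution y = 1/(1 − x) one power of x at a time:

```python
from fractions import Fraction

def picard_step(coeffs):
    # one Picard iterate for y' = y^2, y(0) = 1:
    # y_{n+1}(x) = 1 + integral_0^x y_n(s)^2 ds, on polynomial coefficients
    m = len(coeffs)
    sq = [Fraction(0)]*(2*m - 1)
    for i, a in enumerate(coeffs):
        for j, b in enumerate(coeffs):
            sq[i + j] += a*b
    return [Fraction(1)] + [c/(k + 1) for k, c in enumerate(sq)]

y = [Fraction(1)]                 # y_0(x) = 1
for _ in range(3):
    y = picard_step(y)            # y_1, y_2, y_3

# exact solution 1/(1 - x) = 1 + x + x^2 + ...: the first four coefficients
# of y_3 already match the geometric series
assert y[:4] == [Fraction(1)]*4
x = Fraction(1, 10)
approx = sum(c*x**k for k, c in enumerate(y))
exact = 1/(1 - x)
assert abs(approx - exact) < Fraction(1, 1000)
```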
8. Consider the system of first-order DEs

x0 = y (2.283)
0
y = −x (2.284)

with initial conditions x(0) = 0 and y(0) = 1. This can be written in vector form as

d~v/dt = f~(t, ~v),   ~v(0) = (0, 1)^T,   (2.285)

where ~v = (x, y)^T and f~ = (y, −x)^T (T denoting the transpose). Picard's method can be applied
to this first-order vector equation. Doing so, find x(t) and y(t).
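Instead of iterating by hand, the vector Picard iteration can also be carried out numerically (a sketch of our own, with an arbitrarily chosen grid): replacing the integral by a cumulative trapezoidal sum, the iterates converge to (x, y) = (sin t, cos t), which is also what the analytical iteration produces:

```python
import math

# Picard iteration for x' = y, y' = -x, x(0) = 0, y(0) = 1, carried out
# numerically with cumulative trapezoidal integration on a grid
n = 2000
h = 1.0/n
x = [0.0]*(n + 1)
y = [1.0]*(n + 1)
for _ in range(25):
    fx = y[:]                     # first component of f(t, v) is y
    fy = [-xi for xi in x]        # second component is -x
    nx, ny = [0.0], [1.0]
    for i in range(n):
        nx.append(nx[-1] + 0.5*h*(fx[i] + fx[i + 1]))
        ny.append(ny[-1] + 0.5*h*(fy[i] + fy[i + 1]))
    x, y = nx, ny

assert abs(x[-1] - math.sin(1.0)) < 1e-5
assert abs(y[-1] - math.cos(1.0)) < 1e-5
```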

9. Show that f(x, y) = x^2 |y| satisfies a Lipschitz condition on the rectangle |x| < 1 and |y| < 1
but that f_y fails to exist at many points in the rectangle.

10. Show that f(x, y) = xy^2

(a) satisfies a Lipschitz condition on any rectangle a ≤ x ≤ b and c ≤ y ≤ d.


(b) does not satisfy a Lipschitz condition in y on any rectangle a ≤ x ≤ b and −∞ < y < ∞.

11. Show that f (x, y) = xy

(a) satisfies a Lipschitz condition on any rectangle a ≤ x ≤ b and c ≤ y ≤ d.


(b) satisfies a Lipschitz condition in y on any rectangle a ≤ x ≤ b and −∞ < y < ∞.
(c) does not satisfy a Lipschitz condition in y on the entire plane.

12. Consider the problem


y′ = |y|,   y(x_0) = y_0.   (2.286)

(a) For what points (x0 , y0 ) does Theorem I imply a unique solution y(x) exists on a non-zero
interval containing x0 ?
(b) For what points (x0 , y0 ) does Theorem II imply a unique solution y(x) exists on a non-
zero interval containing x0 ?
(c) What is the solution of the DE?

13. Consider the problem


dv/dt = −1 − v|v|,   v(0) = v_0.   (2.287)
For what points (0, v0 ) does Theorem I imply a unique solution v(t) exists on a non-zero
interval |t| < δ for some δ > 0?
14. Show that f(x, y) = y^{1/2}

(a) does not satisfy a Lipschitz condition in y on the rectangle |x| ≤ 1 and 0 ≤ y ≤ 1;
(b) does satisfy a Lipschitz condition in y on any rectangle |x| ≤ 1 and c ≤ y ≤ d with
0 < c < d.

15. Consider the problem


y′ = y^{1/2},   y(x_0) = y_0,   (2.288)

with y_0 ≥ 0.

(a) For what points (x0 , y0 ) does Theorem I imply a unique solution y(x) exists on a non-zero
interval containing x0 ?
(b) For what points (x0 , y0 ) does Theorem II imply a unique solution y(x) exists on a non-
zero interval containing x0 ?
(c) For what points (x0 , y0 ) does Theorem III imply that at least one solution y(x) exists
on a non-zero interval containing x0 ?
(d) Find all the solutions. For what points is the solution unique? For what points is it
non-unique?

16. Consider the problem
y′ = |y|^{1/2},   y(x_0) = 0.   (2.289)
There are four solutions on −∞ < x < ∞. What are they?
17. Solve the following DEs by finding an integrating factor
(a) y′ + 3x^2 y = x e^{x^2 − x^3};
(b) y′ + 3y = e^{−3x} sin x cos x;
(c) y′ + t^2 y = t e^{t^2 − t^3/3};
(d) ds/dt + 2s = t sin(t).

18. Solve the following DEs by finding a homogeneous solution yh (x) and finding a particular
solution yp (x) by setting yp = v(x)yh (x) and solving a DE for v(x).

(a) x y′ − y = x^3 cos(x);
(b) 2y′ + y = 1/(1 + e^x);
(c) 3 dx/dt + 2x = t^2 e^{2t};
(d) 3 dx/dt + 2x = t^2 e^{−2t/3}.

19. Constant coefficient linear first-order DEs arise in many problems. They have the general
form y 0 + αy = f (x) where α is a constant. Show that by a suitable scaling of x this equation
can always be put in the form y 0 + y = g(x) (where x is different from in the first equation).
Sometimes, e.g., if the independent variable is time, you might want to scale only by a positive
scale in which case there are two possible forms: y 0 + y = g(x) or y 0 − y = g(x). Find the
general solution of the following DEs. In each case n is a positive integer.

(a) y′ + y = e^{−x};
(b) y′ + y = e^{kx} with k ≠ −1;
(c) y′ + y = x^n. Find a particular solution y_p by guessing it is a polynomial in x;
(d) y′ + y = x^n e^{−x}. Find a particular solution y_p by setting y_p = f(x)e^{−x}, finding a DE for f(x) and then finding a particular solution for f;
(e) y′ + y = x^n e^{kx}, k ≠ −1. Follow the procedure from the previous problem;
(f) y′ + y = e^{−x} + 2e^x + x^2 − x^3.

Note the difference in the results when the exponential terms in the forcing functions are
solutions of the homogeneous equation and when they are not. More on this when we study
second-order DEs.
20. Solve the following DEs.
(a) y′ − (1/tan(x)) y = sin(2x);
(b) y′ + xy = x;
(c) t^2 dx/dt − 3x = 2t^3 e^{3t − 3/t};
(d) t dv/dt − (1 + t)v = t^3.

21. Consider the DE


y 0 + y = cos(x). (2.290)
(a) Solve this by guessing that there is a particular solution of the form y = a cos(x)+b sin(x)
and substituting into the differential equation.
(b) Find the particular solution by using the fact that cos(x) = Re(e^{ix}). That is, solve
y′ + y = e^{ix} and take the real part of the solution. Explain why this gives the solution
of the DE, invoking linearity of the operator L = d/dx + 1.
(c) In a similar manner find the solution of the following:

i. y′ + y = sin(x);
ii. y′ + y = cos(2x);
iii. y′ + 2y = sin(3x).
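A numerical illustration of part (b) (our own sketch, not from the notes): the particular solution of y′ + y = e^{ix} is e^{ix}/(1 + i), and its real part indeed satisfies y′ + y = cos(x):

```python
import cmath, math

# particular solution of y' + y = e^{ix} is y = e^{ix}/(1 + i); its real
# part, (cos x + sin x)/2, should solve y' + y = cos(x)
def yp(x):
    return (cmath.exp(1j*x)/(1 + 1j)).real

for x in (0.0, 0.7, 2.0):
    d = (yp(x + 1e-6) - yp(x - 1e-6))/2e-6   # central-difference y'
    assert abs(d + yp(x) - math.cos(x)) < 1e-8
```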

22. Use the solution of the previous problem to find the general solution of (see end of section on
the alternative approach)
y′ + y = sin(x) + e^x − e^{−x} − π cos(2x) + (3/2) sin(2x).   (2.291)

23. Without solving the DEs, for what initial points does the existence-uniqueness theory say
that the following IVPs have a unique solution. On what domains do the solutions exist
according to the theorem?

(a) y′ + y/(x − 1) = x/(x − 2), y(x_0) = y_0;
(b) sin(t) dx/dt + tx = t^2, x(t_0) = x_0.

24. Suppose the functions x_1(t) = e^{t^2} + t e^{t^3} and x_2(t) = (1/5) e^{t^2} + t e^{t^3} are solutions of the same first-
order linear DE. What is a homogeneous solution for the DE? What is the general solution?
What is the differential equation?

25. Find the one-parameter family of straight lines and their envelope for each of the following
Clairaut equations:
(a) y = x y′ + [1 + (y′)^2];
(b) y = x y′ − (y′)^3;
(c) y = x y′ − e^{y′};
(d) y = x y′ − (y′)^{2/3}.

26. Which of the following equations are exact? Solve those that are.
 
(a) (x − 1/y) dy + y dx = 0;
(b) (sin x tan y + 1) dx + (x + y^3) dy = 0;
(c) cos x cos^2 y dx + 2 sin x sin y cos y dy = 0;
(d) (cos(xy) − xy sin(xy)) e^y dx + (−x^2 sin(xy) + x cos(xy)) e^y dy = 0;
(e) (2s + s^2) e^{x+s} ds + s^2 e^{x+s} dx = 0;
(f) (3x^2 y + 2xy^2 + 4x^3 y^3 + y) dx + (x^3 + 2x^2 y + 3x^4 y^2 + 1 + x) dy = 0;
(g) (x^3 + 2x^4 t + 3t^2) dt + (3x^2 t + 4x^3 t^2) dx = 0;
(h) dx = y/(1 − x^2 y^2) dx + x/(1 − x^2 y^2) dy;
(i) (1/y) sin(x/y) dx − (x/y^2) sin(x/y) dy = 0.

27. Suppose that


M (x, y) dx + N (x, y) dy = 0 (2.292)
is exact. Find all possible functions µ(x, y) for which

µM (x, y) dx + µN (x, y) dy = 0 (2.293)

is also exact (Note: this is a challenge and requires ideas from multi-variable calculus. Will
not be on quiz/midterm/final exam).

28. Consider the Brachistochrone problem. If the initial point is the origin (0, 0) and the final
point has coordinates (xb , yb ) where yb < 0, determine the values of θ and c. Is there always
a solution? Under what conditions does the path go below yb before terminating at (xb , yb )?

29. Solve the following problems

(a) t dx/dt = x + (t^2 − x^2)^{1/2};
(b) x′ = x/t + t x^3, x(1/2) = sqrt(2/15);
(c) t x′ + x = t sec(tx).

Chapter 3

Non-dimensionalization and
Dimensional Analysis

In this course we will use non-dimensionalization to simplify problems: introducing scaled dimensionless
variables reduces the number of parameters that appear in a problem. In more advanced courses, e.g.,
on perturbation theory, a further goal is to scale variables so that the dimensionless variables and
their derivatives have typical sizes of about 1, in order to introduce small parameters that indicate
which terms may be neglected.
Dimensional analysis is a useful method for deducing functional relationships between variables
and simplifying problems.
In an equation modelling a physical process all of the terms separated by a ’+’ sign must
have the same units. Otherwise we’d have something silly like ’oranges + avocados = houses’, or
2 kg + 4 m = 10 m s−1 . For example, in a mixing tank the equation for the mass of salt A(t) in
the tank is, based on the principle of conservation of mass,
dA/dt = rate mass enters − rate mass exits
      = cin Vfin − cout Vfout.   (3.1)

Here

• dA/dt has units of kg s−1
• cin has units of kg m−3
• Vfin has units of m3 s−1

so cin Vfin has units of kg s−1, the same as those of dA/dt. Similarly for cout Vfout.
Checking the dimensions of terms in your equation can be helpful. It is always a good idea and
often reveals simple mistakes.
We will use square brackets to denote the dimensions of a quantity. All scientific units can be
written in terms of primary dimensions such as mass [M], length [L], time [T], electric current [I],
temperature [θ] etc. The units for these quantities depend on the measurement system being used.
The SI units for these quantities are kilograms (kg), meters (m), seconds (s), amperes (A) and
Kelvin (K), although ◦ C may be used for the latter. Other systems include the British foot-pound
system and the cgs system (centimetres, grams, seconds).

Other secondary dimensions can be written in terms of primary dimensions. For example,

• force F = mass × acceleration has dimensions [F] = [M][L][T]−2

• pressure P = force per unit area has dimensions [P] = [F][L]−2 = [M][L]−1 [T]−2

• energy E has dimensions [E] = [M][L]2 [T]−2

etc.
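This bookkeeping can be automated by storing each dimension as a tuple of exponents of the primary dimensions; a small sketch (the helper names are our own) follows:

```python
# represent a dimension as exponents over ([M], [L], [T]); multiplying
# quantities adds the exponents, raising to a power scales them
M, L, T = (1, 0, 0), (0, 1, 0), (0, 0, 1)

def mul(a, b):
    return tuple(p + q for p, q in zip(a, b))

def power(a, n):
    return tuple(n*p for p in a)

accel = mul(L, power(T, -2))            # [L][T]^-2
force = mul(M, accel)                   # [F] = [M][L][T]^-2
pressure = mul(force, power(L, -2))     # [P] = [F][L]^-2 = [M][L]^-1[T]^-2
energy = mul(force, L)                  # [E] = [M][L]^2[T]^-2
assert force == (1, 1, -2)
assert pressure == (1, -1, -2)
assert energy == (1, 2, -2)
```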
Use of dimensionless parameters can reduce the number of parameters in a problem, simplify the
presentation and analysis of results and improve understanding. For experimentalists or numerical
modellers the use of dimensionless parameters can significantly reduce the number of parameters
that need to be varied to explore the behaviour of a system in parameter space.

3.1 Non-dimensionalization
Problems are non-dimensionalized by scaling variables by a characteristic scale. We illustrate the
process via a few examples.

Example: Consider the mixing tank problem above. We assume the volume fluxes into and out of
the tank are equal and constant, i.e., Vfin = Vfout = Vf . Let V be the constant volume of the tank
and c(t) = A(t)/V be the concentration of salt in the tank. The fluid in the tank is well mixed and
cout = c. The governing equation is then
dc/dt = (Vf/V) cin − (Vf/V) c   (3.2)

and the IVP is

dc/dt + (Vf/V) c = (Vf/V) cin,
c(0) = c0.   (3.3)

The problem involves two variables c and t and three parameters V , Vf and cin . We introduce
a characteristic concentration Cc and time scale Tc and non-dimensional concentration c̃ and time
t̃ via
c = Cc c̃,   t = Tc t̃,   (3.4)

where Cc and Tc are currently unknown. That is, we haven't decided what to use yet. We will
choose them to make the problem look relatively simple. Under our scaling, time derivatives scale
according to

d/dt = (dt̃/dt) d/dt̃ = (1/Tc) d/dt̃.   (3.5)

Note that the factor 1/Tc carries the dimensions of the time derivative operator. Using this,

dc/dt = (1/Tc) dc/dt̃ = (1/Tc) d(Cc c̃)/dt̃ = (Cc/Tc) dc̃/dt̃.   (3.6)

Note here that on the right-hand side Cc/Tc has dimensions of concentration over time, the dimensions
of the left-hand side. The term dc̃/dt̃ is dimensionless.
Substituting this, along with c = Cc c̃, into the IVP gives

(Cc/Tc) dc̃/dt̃ + (Vf/V) Cc c̃ = (Vf/V) cin,
Cc c̃(0) = c0,   (3.7)

or

dc̃/dt̃ + (Vf Tc/V) c̃ = (Tc/Cc)(Vf/V) cin,
c̃(0) = c0/Cc,   (3.8)
where now all the terms are dimensionless. We now choose the scales Cc and Tc to put the
problem in a simple form. There are many ways to do this. The DE includes two ’messy’
coefficients. We have two scalings to play around with so we can make both of them equal to one.
That is, choose
Tc = V/Vf   (3.9)

so that the coefficient of c̃ in the DE is equal to one, and then choose Cc to set the right-hand side
equal to one. That is, set

(Tc/Cc)(Vf/V) cin = cin/Cc = 1   =⇒   Cc = cin   (3.10)

(note we have used Tc = V/Vf). We can then define
(note we have used Tc = V /Vf ). We can then define

c̃0 = c0/Cc   (3.11)
to get the final dimensionless form of the problem:
dc̃/dt̃ + c̃ = 1,
c̃(0) = c̃0.   (3.12)

This involves only one parameter, c̃0 . The solution of this IVP problem is

c̃(t̃) = 1 + (c̃0 − 1)e−t̃ . (3.13)

To get the solution of the dimensional problem you can now use c = Cc c̃ and t = Tc t̃ to get
 
c(t) = Cc (1 + (c̃0 − 1) e^{−t̃})
     = Cc + (c0 − Cc) e^{−t/Tc}   (3.14)
     = cin + (c0 − cin) e^{−(Vf/V) t}.
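A quick consistency check of the scaling (our own sketch; the parameter values are arbitrary): integrating the dimensional IVP numerically and comparing against both the dimensional solution (3.14) and the dimensionless solution evaluated at t̃ = t/Tc:

```python
import math

# hypothetical parameter values chosen for illustration only
V, Vf, cin, c0 = 2.0, 0.5, 3.0, 1.0
Tc, Cc = V/Vf, cin                 # the scales chosen in (3.9)-(3.10)

def c_exact(t):
    # dimensional solution (3.14)
    return cin + (c0 - cin)*math.exp(-(Vf/V)*t)

# integrate dc/dt = (Vf/V)(cin - c) with small Euler steps up to t = 5
c, t, h = c0, 0.0, 1e-4
for _ in range(50000):
    c += h*(Vf/V)*(cin - c)
    t += h
assert abs(c - c_exact(t)) < 1e-3

# dimensionless form: c/Cc should equal 1 + (c0/Cc - 1) e^{-t/Tc}
ctil = 1.0 + (c0/Cc - 1.0)*math.exp(-t/Tc)
assert abs(c/Cc - ctil) < 1e-3
```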


Figure 3.1: Height vs time for various initial heights s0, initial velocities v0 and different gravitational accelerations g.

Example: The height of an object thrown up from height s0 > 0 with velocity v0 > 0 at time t is
governed by

d^2 s/dt^2 = −g,
s(0) = s0,   (3.15)
ds/dt(0) = v0,

which has the solution

s(t) = −(1/2) g t^2 + v0 t + s0.   (3.16)
This problem involves the two variables s and t, and three parameters g, v0 , and s0 . To plot
solutions we need to vary g (for different planets), v0 and s0 . Sample trajectories are shown in
Figure 3.1.
We can make the problem dimensionless. Here we introduce characteristic length and time
scales Lc and Tc along with dimensionless variables s̃ and t̃ given by s = Lc s̃ and t = Tc t̃. Using
(3.5) we have
d^2/dt^2 = (d/dt)(d/dt) = [(1/Tc) d/dt̃][(1/Tc) d/dt̃] = (1/Tc^2) d^2/dt̃^2.   (3.17)

Thus

ds/dt = (1/Tc) d(Lc s̃)/dt̃ = (Lc/Tc) ds̃/dt̃,   (3.18)

and

d^2 s/dt^2 = (1/Tc^2) d^2(Lc s̃)/dt̃^2 = (Lc/Tc^2) d^2 s̃/dt̃^2.   (3.19)

In terms of the dimensionless variables, the IVP is

(Lc/Tc^2) d^2 s̃/dt̃^2 = −g,
Lc s̃(0) = s0,   (3.20)
(Lc/Tc) ds̃/dt̃(0) = v0,

or

d^2 s̃/dt̃^2 = −g Tc^2/Lc,
s̃(0) = s0/Lc,   (3.21)
ds̃/dt̃(0) = v0 Tc/Lc,
where now all the terms are dimensionless.
There are many ways to choose the scalings Tc and Lc . If you are dropping an object from a
height s0 above the ground (at s = 0) then choosing Lc = s0 is very sensible as then s̃ varies from
an initial value of 1 to a final value of 0 when it hits the ground. If you fire a cannonball vertically
upward from the ground at s = 0 (so s0 = 0) then it will rise to a maximum height of smax = v02 /2g
in which case using Lc = v02 /2g is a good choice because then s̃ increases from its initial value of
0 to a maximum value of 1 before decreasing back to 0 as the cannonball returns to the ground.
There are likewise many possible choices for Tc . As a general rule, the best choices for the scales
will depend on your problem.
Here I will assume s0 ≠ 0 and v0 ≠ 0 and take Lc = s0 and Tc = v0/g. The latter is the time
taken for the object to reach its maximum height. With these choices

g Tc^2/Lc = v0 Tc/Lc = v0^2/(g s0),   (3.22)
and the problem becomes

d^2 s̃/dt̃^2 = −2λ,
s̃(0) = 1,   (3.23)
ds̃/dt̃(0) = 2λ,

where

λ = (1/2) v0^2/(g s0).   (3.24)

The solution in dimensionless terms is

s̃(t̃) = −λ t̃^2 + 2λ t̃ + 1.   (3.25)

This involves three quantities (s̃, t̃ and λ) instead of five, and to plot solutions we just need to vary
the value of a single parameter, λ, instead of three. There is now a 1-parameter family of solutions.
All dimensionless solutions have the properties that

• the initial height is s̃(0) = 1.
• the initial vertical velocity is ds̃/dt̃(0) = 2λ.
• ds̃/dt̃(t̃) = 0 when t̃ = 1.
• the maximum height is s̃(1) = 1 + λ.
• the object reaches the ground (s̃(t̃) = 0) at time t̃f = 1 + sqrt((1 + λ)/λ).
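These properties are easy to verify directly from the quadratic solution (3.25); a sketch of our own, with arbitrary values of λ:

```python
import math

def s_tilde(t, lam):
    # dimensionless height (3.25)
    return -lam*t*t + 2.0*lam*t + 1.0

for lam in (0.25, 1.0, 4.0):
    assert s_tilde(0.0, lam) == 1.0                      # initial height
    assert s_tilde(1.0, lam) == 1.0 + lam                # maximum at t~ = 1
    tf = 1.0 + math.sqrt((1.0 + lam)/lam)                # time to hit ground
    assert abs(s_tilde(tf, lam)) < 1e-9
    # initial velocity 2*lam (central difference; exact for a quadratic)
    v0 = (s_tilde(1e-7, lam) - s_tilde(-1e-7, lam))/2e-7
    assert abs(v0 - 2.0*lam) < 1e-6
```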

Some sample trajectories are plotted in Figure 3.2.

Figure 3.2: Nondimensional height vs time (y = s̃ and τ = t̃) for several values of λ.

3.2 Dimensional Analysis: The Buckingham-π Theorem


The Buckingham-π Theorem is used to determine the number of dimensionless parameters in a
problem and can simplify relationships among variables by reducing the number of variables. It
is named after E. Buckingham, who proved a general version of the theorem in 1914; however, the
theorem was first proved in 1878 by J. Bertrand, and the technique was made widely known by
Rayleigh in the 1890s.
Suppose there are n physical variables and parameters Q1 , Q2 , ..., Qn and the solution of a
mathematical model gives one in terms of the others:

Qn = f (Q1 , Q2 , . . . , Qn−1 ). (3.26)

Suppose that there are r independent basic physical dimensions [M], [L], [T], etc., and that k
independent dimensionless quantities πj for j = 1, 2, . . . , k can be constructed via products of powers
of the Qj. We will see that there are at least n − r of them. Then

πk = h(π1 , π2 , . . . , πk−1 ). (3.27)

To see how the Buckingham-π Theorem is used we consider a simple illustrative example: the
simple nonlinear pendulum.

Example: Consider a simple pendulum of mass m in the absence of air damping. Assume the mass
is attached to a frictionless pivot via an inextensible massless wire of length l. Assume Newtonian
mechanics applies (e.g., ignore relativistic effects). The pendulum is released from rest at an angle
α. How does the period τ of oscillation depend on the parameters α, m, l and the gravitational
acceleration g?
The first step is to determine the dimensions of the various parameters. Here we have

• [τ ] = [T]
• [g] = [L][T]−2
• [l] = [L]
• [α] = 1 (the angle is dimensionless)
• [m] = [M]

There are n = 5 independent variables. These variables involve r = 3 dimensions: [L], [T], and
[M]. In this case there are k = n − r = 5 − 3 = 2 independent dimensionless combinations of the
parameters. Since α is dimensionless it can be taken as one of the dimensionless parameters. To
find a second let
π = τ a mb g c l d (3.28)
(note: since the fifth parameter α is dimensionless there is no point in including it. Multiplying π
by any power of α will yield another dimensionless parameter). The dimension of π is
[π] = [τ ]a [m]b [g]c [l]d ,
c
= [T ]a [M ]b [L][T ]−2 [L]d , (3.29)
a−2c b c+d
= [T ] [M ] [L] .

Table 3.1: The international system of units: fundamental (or primary, or basic) dimensions. Sometimes
angle in radians is included as a dimension, but not in the SI.

Physical Quantity    Unit Symbol (MKS)    Unit Name
mass                 kg                   kilogram
length               m                    meter
time                 s                    second
electric current     A                    ampere
luminosity           cd                   candela
temperature          K                    kelvin
amount               mol                  mole

This is dimensionless iff the exponent of each of the primary dimensions is zero, that is iff

a − 2c = b = c + d = 0, (3.30)

which gives c = a/2, b = 0 and d = −c = −a/2. Thus, all dimensionless combinations of m, l, g
and τ have the form

π = (τ sqrt(g/l))^a.   (3.31)
We can choose a to have any non-zero value. For simplicity we take a = 1. Our two dimensionless
parameters are π1 = τ sqrt(g/l) and π2 = α. Any other dimensionless parameter must have the form
π1^a π2^b and hence is not independent of π1 and π2. According to the theorem

π1 = h(π2)   (3.32)

for some function h, or

τ sqrt(g/l) = h(α),   (3.33)

which can be written as

τ = sqrt(l/g) h(α).   (3.34)

Without even writing down any differential equations, let alone solving them, we have learnt
several important things:

1. the period of oscillation does not depend on the mass m;

2. the period of oscillation is proportional to sqrt(l);

3. the period of oscillation is inversely proportional to sqrt(g);

4. τ sqrt(g/l) depends only on the initial angle α.

Thus we know that doubling the length of the pendulum increases the period of oscillation by
a factor of sqrt(2), and increasing the gravitational acceleration by a factor of two decreases the period
of oscillation by a factor of sqrt(2). If one wanted to conduct a series of experiments to determine how
the period τ depends on g, m, l and α, instead of doing a large set of experiments by varying all
four parameters one only needs to vary α. This sort of reduction in the number of parameters that
need to be varied is very powerful for both laboratory experiments and numerical simulations.
For small angles of oscillation, i.e., in the limit α → 0, the period of oscillation becomes
τ_lin = sqrt(l/g) h(0).   (3.35)

Mathematically the governing equation (a second-order ODE) is a nonlinear DE for arbitrary initial
angles, but in the limit α → 0 the equation becomes a linear DE; hence we refer to the period of
oscillation in this limit as the linear period τ_lin. As we will see later, the period in this limit is
2π sqrt(l/g), i.e., h(0) = 2π.
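The function h(α) has a known closed form, τ = 4 sqrt(l/g) K(sin(α/2)) with K the complete elliptic integral of the first kind (a standard result, not derived in the notes up to this point). The sketch below (our own check) evaluates K via the arithmetic–geometric mean, confirms h(0) = 2π, and cross-checks against direct quadrature of the period integral:

```python
import math

def agm(a, b):
    # arithmetic-geometric mean; converges quadratically
    for _ in range(40):
        a, b = 0.5*(a + b), math.sqrt(a*b)
    return a

def h(alpha):
    # h(alpha) = tau*sqrt(g/l) = 4 K(sin(alpha/2)), using
    # K(k) = pi / (2 agm(1, sqrt(1 - k^2)))
    k = math.sin(0.5*alpha)
    return 2.0*math.pi/agm(1.0, math.sqrt(1.0 - k*k))

assert abs(h(1e-9) - 2.0*math.pi) < 1e-12    # linear limit: h(0) = 2 pi
assert h(1.0) > h(0.5) > 2.0*math.pi         # period grows with amplitude

# cross-check h(1.0) against midpoint quadrature of
# 4 * integral_0^{pi/2} dphi / sqrt(1 - k^2 sin^2 phi)
k = math.sin(0.5)
n, step, quad = 20000, (0.5*math.pi)/20000, 0.0
for i in range(n):
    phi = (i + 0.5)*step
    quad += step/math.sqrt(1.0 - (k*math.sin(phi))**2)
assert abs(4.0*quad - h(1.0)) < 1e-8
```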

Comment: There are an infinite number of choices for the dimensionless parameters. For example,

π1 = (τ sqrt(g/l))^{−2},
π2 = (τ sqrt(g/l))^3 α^2.   (3.36)

A direct application of the theorem gives

(τ sqrt(g/l))^{−2} = H((τ sqrt(g/l))^3 α^2)   (3.37)

for some function H. Inverting we have

α^2 = H^{−1}((τ sqrt(g/l))^{−2}) / (τ sqrt(g/l))^3   (3.38)

where H^{−1} is the inverse of H. Thus α = f(τ sqrt(g/l)) for some function f. Inverting this leads to (3.34).

3.2.1 Proof of the Buckingham-π Theorem


The following is based on the discussion in [2]. See also [6], a very useful applied math text
covering a broad range of subjects with which all applied mathematicians should be familiar.

Assumptions

Dimensional analysis is built upon the following assumptions:

(i) A quantity u is determined in terms of n measurable quantities (variables or parameters)


(W1 , W2 , . . . , Wn ) by a functional relationship of the form

u = f (W1 , W2 , . . . , Wn ) (3.39)

(ii) The quantities (u, W1 , W2 , . . . , Wn ) involve m fundamental dimensions which are labelled L1 ,
L2 , . . . , Lm .

(iii) Let Z represent any of (u, W1 , W2 , . . . , Wn ). Then the dimensions of Z, denoted by [Z], is a
product of powers of the fundamental dimensions via

[Z] = L1^{α1} L2^{α2} · · · Lm^{αm}.   (3.40)

The dimension vector of Z is the column vector

α⃗ = (α1, α2, . . . , αm)^T.   (3.41)

A quantity Z is dimensionless if and only if all its dimension exponents are zero. Let
 
b⃗_i = (b_{1i}, b_{2i}, . . . , b_{mi})^T   (3.42)

be the dimension vector of W_i, i = 1, 2, 3, . . . , n, and let

B = [ b_{11} b_{12} · · · b_{1n} ]
    [ b_{21} b_{22} · · · b_{2n} ]
    [   ·      ·    · · ·   ·   ]   (3.43)
    [ b_{m1} b_{m2} · · · b_{mn} ]

be the m × n dimensional matrix of the problem.


Example: Before our final assumption consider the period of a simple pendulum τ which is
released from rest at an initial angle α. The period is assumed to depend on the pendulum
mass m, the length of the pendulum l, the gravitational acceleration g and the initial angle
α. This gives τ = f (m, l, g, α). The quantities u = τ , W1 = m, W2 = l, W3 = g and W4 = α
involve three dimensions: L1 = [M], L2 = [L], and L3 = [T]. Hence n = 4 and m = 3. Now
[m] = [M] = L1, [l] = [L] = L2, [g] = [L][T]^{−2} = L2 L3^{−2} and [α] = 1. Thus, for this problem

B = [ 1 0  0 0 ]
    [ 0 1  1 0 ]   (3.44)
    [ 0 0 −2 0 ]

(iv) For any set of fundamental dimensions one can choose a system of units for measuring the
value of any quantity Z, for example the SI system, the cgs system or the British foot-pound
system. Changing from one system to another involves a positive scaling of each fundamental
unit, e.g., 1 m = 100 cm, 1 inch = 2.54 cm, or 1 kg ≈ 2.2 lbs. Secondary quantities are
scaled accordingly. This results in a corresponding scaling of each quantity Z. The final
assumption is that the relationship between u and the variables (W1, W2, . . . , Wn) is invariant
under any scaling of the fundamental units. For example, the relationship between kinetic
energy K and the mass m and velocity ~v of an object is K = (1/2) m ~v · ~v in all measurement
systems, and Newton's second law of motion F~ = m~a is independent of the measurement
system.
The invariance of the functional relationship means the following. Suppose under a scaling
of the dimension Lj we have Lj → Lj* = e^ε Lj, and under this scaling (u, W1, W2, . . . , Wn) →
(u*, W1*, W2*, . . . , Wn*). Then u = f(W1, W2, . . . , Wn) becomes u* = f(W1*, W2*, . . . , Wn*).

Illustration of the Buckingham-π theorem

Before proving the theorem we re-consider our simple pendulum which has period

τ = f (m, l, g, α). (3.45)

This relationship is assumed to be independent of scalings of the fundamental dimensions. Suppose
we scale the mass dimension by e^ε so L1 → L1* = e^ε L1. Under this scaling m → m* = e^ε m, but as
the other variables do not involve the dimension L1 = [M], τ → τ* = τ, g → g* = g and l → l* = l
(α, being dimensionless, does not change). Thus, τ = f(m, l, g, α) becomes τ* = f(m*, l*, g*, α)
according to assumption (iv). This means that

τ = f(e^ε m, l, g, α).   (3.46)

This is true for all ε, and the only way (3.45) and (3.46) can both hold is if τ does not depend on
m, i.e., τ = f(l, g, α) = f(W2, W3, W4).
We now tackle the next dimension [L] = L2. Scale L2 by e^ε. Then τ* = τ is unchanged (it does
not involve dimensions of length), l* = e^ε l, and g* = e^ε g, so τ* = f(l*, g*, α) gives τ = f(e^ε l, e^ε g, α).
The appearance of e^ε in two terms makes things slightly more complicated. To proceed we eliminate
dimension L2 from all but one variable by choosing two new independent variables. Let us remove
it from the third variable, W3, by defining new variables X2 = W2 and X3 = W3/W2 = g/l. Here
X3 has been chosen so that [X3] = [T]^{−2} = L3^{−2} does not include the dimension L2. In terms of
the new variables τ = f(l, g, α) = f̃(X2, X3, α) = f̃(l, g/l, α). If we now scale l by e^ε we obtain

τ = f̃(e^ε X2, X3, α)   (3.47)

which again is true for all ε. Hence f̃ must be independent of X2, i.e.,

τ = f̃(X3, α) = f̃(g/l, α).   (3.48)

For the final step we note that [τ] = [T] and [g/l] = [T]^{−2}. Since τ = f̃(g/l, α) it follows that
v = τ sqrt(g/l) = f̃(g/l, α) sqrt(g/l) ≡ h(g/l, α) = h(X3, α) is dimensionless. Now [v] = 1 and [X3] = [T]^{−2}.
We now scale time by L3* = e^ε L3. Under this scaling v → v* = v and X3 → X3* = e^{−2ε} X3. By the
assumption v* = h(X3*, α), or v = h(e^{−2ε} X3, α). The only way this can be true is if h is independent
of X3, i.e., v = h(α), or τ sqrt(g/l) = h(α), recovering our previous result.

Statement of the Buckingham-π theorem

Consider
u = f (W1 , W2 , . . . , Wn ) (3.49)
and let B be the dimension matrix for W1 , W2 , . . . , Wn . Let
 
~a = (a1 , a2 , . . . , am )T     (3.50)
be the dimension vector of u. As in the simple pendulum example we want to introduce a new
dimensionless dependent variable v by multiplying u by appropriate powers of the Wj . Thus, choose
(y1 , y2 , . . . , yn ) so that
π = u W1^{y1} W2^{y2} · · · Wn^{yn}     (3.51)
is dimensionless. Recalling that the dimensions of Wi are [Wi ] = L1^{b1i} L2^{b2i} · · · Lm^{bmi} it follows that
the dimensions of the expression on the right-hand side are

L1^{a1 +y1 b11 +y2 b12 +···+yn b1n} L2^{a2 +y1 b21 +y2 b22 +···+yn b2n} · · · Lm^{am +y1 bm1 +y2 bm2 +···+yn bmn} .     (3.52)

This is dimensionless if
aj + y1 bj1 + y2 bj2 + · · · + yn bjn = 0 (3.53)
for all j, i.e., if ~y is a solution of
B~y = −~a. (3.54)
Note that B is often not invertible, e.g., it is not invertible if m ≠ n (for example if there are more
variables Wk than dimensions Li as in our simple pendulum example). There are in general many
solutions. Indeed, if π is dimensionless and π1 is any other dimensionless variable (e.g., τ √(g/l) and
α in our simple pendulum example) then π^a π1^b is an alternative dimensionless replacement for u
for any nonzero a and b. It is also possible that there are no solutions, which means the problem is
poorly formulated. For example u may include a dimension that is not a dimension of any of the
variables Wi . A scaling of this dimension leads to e^{εa} u = f (W1 , W2 , . . . , Wn ) = u, where a ≠ 0 is
the exponent of that dimension in u, which cannot be true for all ε. For example, one may speculate
that the period of revolution of the moon around the Earth, τr , depends on the mass of the Earth
ME , the mass of the moon Mm and the distance Rm of the moon from the centre of the Earth,
giving τr = f (ME , Mm , Rm ). ME , Mm and Rm do not involve dimensions of time so the period τr
cannot be a function of these three variables only. Something is missing (in this case Newton’s
Universal Gravitational Constant). Henceforth we assume that (3.54) has a solution.
Next, we want to construct dimensionless variables by taking appropriate combinations of the
Wj . Suppose
πi = W1^{x1i} W2^{x2i} · · · Wn^{xni}     (3.55)
is dimensionless. Then
B ~xi = ~0,     (3.56)
which follows from (3.51) and (3.54) after replacing ~a by the zero vector. There are k = n − r(B)
linearly independent solutions of (3.56). Here r(B) is the rank of the dimension matrix B (i.e.,
the number of linearly independent columns, or equivalently, the number of linearly independent
rows). Let ~xi , i = 1, 2, . . . , k, be any such set and let πi given by (3.55) be the corresponding k
dimensionless variables. Then the Buckingham-π theorem states that
dimensionless variables. Then the Buckingham-π theorem states that

π = g(π1 , π2 , . . . , πk ). (3.57)

Example: Consider again the simple pendulum. The dimension matrix (3.44) has rank r = 3.
Hence there is a dimensionless dependent variable π, which we can take as τ √(g/l), and a single
independent dimensionless variable π1 = α. The theorem says that π = h(α) for some function h.
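The bookkeeping behind the theorem is easy to check mechanically: a product of variables is dimensionless iff its dimension exponents sum to zero in every fundamental dimension, which is exactly condition (3.56). A small sketch in plain Python (the exponent table and the helper `combine` are mine, written for illustration):

```python
# Dimension vectors in the basis (M, L, T); values taken from the pendulum example.
dims = {
    "tau":   (0, 0, 1),    # period: [T]
    "m":     (1, 0, 0),    # mass: [M]
    "l":     (0, 1, 0),    # length: [L]
    "g":     (0, 1, -2),   # gravity: [L][T]^-2
    "alpha": (0, 0, 0),    # amplitude: dimensionless
}

def combine(powers):
    """Dimension vector of the product prod_i v_i**p_i (illustrative helper)."""
    total = [0.0, 0.0, 0.0]
    for name, p in powers.items():
        for k in range(3):
            total[k] += p * dims[name][k]
    return tuple(total)

# pi = tau * g**(1/2) * l**(-1/2) is dimensionless: every exponent cancels.
pi_dims = combine({"tau": 1, "g": 0.5, "l": -0.5})
print(pi_dims)
```

The same helper confirms, for instance, that g/l alone has dimensions [T]^{−2}, which is why the extra factor of τ is needed.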

Proof of the Buckingham-π theorem

The proof of the Buckingham-π theorem follows the procedure illustrated in the simple pendulum
example. Suppose dimension L1 is scaled by e^ε . Under this scaling u = f (W1 , W2 , . . . , Wn ) becomes
e^{εa1} u = f (e^{εb11} W1 , e^{εb12} W2 , . . . , e^{εb1n} Wn ). There are three cases to consider.

CASE I: If b11 = b12 = · · · = b1n = a1 = 0 it follows that L1 is not a fundamental dimension of the
problem, i.e., the problem is independent of this dimension. We can assume this is not the case.

CASE II: If b11 = b12 = · · · = b1n = 0 and a1 ≠ 0 we have e^{εa1} u = f (W1 , W2 , . . . , Wn ) for all ε,
hence it follows that u ≡ 0, a situation that is not of interest.

CASE III: At least one of b11 , b12 , . . . , b1n is non-zero. Wlog assume b11 ≠ 0. Define new
quantities
Xi = Wi W1^{−b1i /b11} ,   i = 2, 3, . . . , n     (3.58)
and a new unknown
v = u W1^{−a1 /b11} .     (3.59)
The formula u = f (W1 , W2 , . . . , Wn ) is equivalent to v = F (W1 , X2 , X3 , . . . , Xn ). By construction v
and the Xi are independent of dimension L1 . Scaling L1 by e^ε gives v = F (e^{εb11} W1 , X2 , X3 , . . . , Xn ).
Since this is true for all ε it follows that F is independent of W1 . That is, v = G(X2 , X3 , . . . , Xn )
where v and the quantities X2 , X3 , . . . , Xn are independent of the dimension L1 .
−b /b
Note that under the change of variables Wj → Wu W1 1i 11 the dimension matrix is changed
by subtracting (b1i /b11 )b~1 from the ith column which makes the entries in row 1 zero apart from
b11 :

 
B = [ b11  b12  · · ·  b1n ]
    [ b21  b22  · · ·  b2n ]
    [  ..   ..   ..     .. ]
    [ bm1  bm2  · · ·  bmn ]

  = [ ~b1  ~b2  · · ·  ~bn ]

  → [ ~b1   ~b2 − (b12 /b11 ) ~b1   · · ·   ~bn − (b1n /b11 ) ~b1 ]

    [ b11   0                       · · ·   0                     ]
  = [ b21   b22 − (b12 /b11 ) b21   · · ·   b2n − (b1n /b11 ) b21 ]     (3.60)
    [  ..    ..                      ..      ..                   ]
    [ bm1   bm2 − (b12 /b11 ) bm1   · · ·   bmn − (b1n /b11 ) bm1 ]

    [ b11   0     · · ·  0    ]
  = [ b21   b̃22   · · ·  b̃2n  ]
    [  ..    ..    ..     ..  ]
    [ bm1   b̃m2   · · ·  b̃mn  ]

We continue with the remaining m − 1 dimensions. In turn we eliminate each dimension from
all but one variable and reduce the number of independent variables. As discussed above, after the
m dimensions have been eliminated we are left with k + 1 = n + 1 − r(B) dimensionless quantities
π and π1 , π2 , . . . , πk where
π = g(π1 , π2 , . . . , πk ). (3.61)

or
u = W1^{−y1} W2^{−y2} · · · Wn^{−yn} g(π1 , π2 , . . . , πk ).     (3.62)

Comments:

1. This proof makes no assumptions about the continuity of the functions f and g.

2. The assumed relationship u = f (W1 , W2 , . . . , Wn ) may be incorrect. This can manifest itself
   in several ways. The non-dimensionalization procedure may fail (e.g., the orbital period of
   the moon example). Alternatively, the nondimensionalization procedure may work but the
   resulting π = g(π1 , π2 , . . . , πk ) may be incorrect because u depends on parameters or variables
   not included in the analysis. This may, however, still give a useful result in certain limiting
   cases. For example, if the period of the idealized simple pendulum is assumed to depend on
   l, g and m only, the analysis goes through and results in τ = k √(l/g) for some constant k
   instead of the correct1 result τ = f (α) √(l/g) where f is an undetermined function to be found
   via experiments or mathematical analysis. The approximation f (α) = k, a constant, is only
   valid in the small amplitude limit (which may be what you are interested in and hence OK).

3. The scaling Li∗ = e^ε Li , which induces the transformations u → u∗ = e^{εai} u and Wj →
   Wj∗ = e^{εbij} Wj for j = 1, 2, . . . , n, defines a one-parameter (ε) Lie group of scaling transforma-
   tions of the n + 1 quantities (u, W1 , W2 , . . . , Wn ), with ε = 0 corresponding to the identity
   transformation. Assumption (iv) behind dimensional analysis states that the relationship
   u = f (W1 , W2 , . . . , Wn ) is invariant under this Lie group transformation. The Buckingham-π
   theorem exploits these symmetries to reduce the dimensionality of the problem. There
   is a whole field of study devoted to Lie group transformations which has given rise to many
   useful results in the study of differential equations.

3.3 Problems
1. Introduce suitable non-dimensional population P̃ and time t̃ to write the logistic equation in
the form
dP̃ /dt̃ = P̃ (1 − P̃ ).     (3.63)
What is the general solution P̃ (t̃)? Use your dimensionless solution to find the solution P (t)
of the dimensional problem.

1
Even this is an approximation as we have ignored lots of other effects, e.g., relativistic effects, assuming them to
be negligible.

Chapter 4

Objects Falling/Rising Through a Fluid

We are now going to reconsider objects moving through a fluid. We begin with a discussion of drag
forces and consider an example problem of a falling parachutist. We then consider the inclusion of
buoyancy and ‘added mass’ (or ‘acceleration-reaction’) forces which are important if the object’s
density is not much larger than the fluid density.

4.1 Drag Force on an Object Moving Through a Newtonian Fluid


Consider an object moving through a Newtonian fluid (e.g., air or water; the situation is much more
complicated for non-Newtonian fluids such as paint, ketchup or egg white). We assume that
the density of the object is much larger than the density of the surrounding fluid so buoyancy and
added mass forces, discussed below, can be ignored. The governing equation is
m (dv/dt) = −mg + Fd (v)     (4.1)
where the drag force Fd depends on the object’s velocity and shape (orientation being part of the
latter). We assume the object is rotationally symmetric about its vertical axis and is not spinning
so the drag force acts to oppose the object’s motion. That is, there is no tangential component
of the force (as there is, for example, on an airplane wing or a spinning soccer ball). The drag
force depends on the fluid density ρf , the size of the object given by, for example, a cross-sectional
diameter D, the object’s velocity V and the fluid viscosity µ. The dimensions are

• [Fd ] = [M ][L][T ]−2 ;

• [ρf ] = [M ][L]−3 ;

• [D] = [L];

• [V ] = [L][T ]−1 ;

• [µ] = [M ][L]−1 [T ]−1 ;

There are five variables involving three dimensions, hence two independent dimensionless variables.
These can be taken as
π1 = Fd / ( (1/2) ρf D^2 V^2 ),     (4.2)
Re = V D / (µ/ρf ).     (4.3)

The factor 1/2 in the first is traditional. The latter dimensionless parameter is called the Reynolds
number after Osborne Reynolds (1842-1912), who was led to define this fundamentally important
dimensionless parameter based on his studies of the conditions in which flow through a pipe tran-
sitions from laminar to turbulent flow. The transition depends on Re.
The drag force (technically, we are just considering the magnitude of the drag force) also depends
on the shape of the object which is dimensionless. Thus, according to the Buckingham-π Theorem
π1 = f (Re, shape), or
Fd = (1/2) ρf D^2 V^2 Cd (Re, shape)     (4.4)
where Cd is called the drag coefficient. It depends on the Reynolds number and on the shape of
the object. Since D2 has dimensions of area this is often written in the form
Fd = (1/2) Cd ρf A V^2     (4.5)
where of course the function Cd (Re, shape) will in general be different as A is not necessarily equal
to D2 .
The drag force can be expressed in other ways. For example

Fd = (1/2) Cd (ρf DV /µ) (µAV /D) = (1/2) Cd Re (µAV /D) = (1/2) C̃d (µAV /D)     (4.6)

where C̃d = Re Cd (Re, shape). In parameter regimes where the drag force varies linearly with V
this is more appropriate as C̃d is constant. When the drag force varies quadratically with V (4.5)
is more appropriate because in this regime Cd is constant.

The drag on a solid sphere of radius a at low Re = U a/(µ/ρ) (less than about 0.1–1) can be shown
to be (Stokes’ Law)
Fd ≈ 6πµaU.     (4.7)
We can write this in terms of the cross-sectional area A = πa^2 as
Fd = 6πµaU = (1/2) ρf A U^2 · 24 (µ/ρ)/(2aU ) = (1/2) ρf A U^2 (24/Re)     (4.8)
where Re = 2aU/(µ/ρ) is based on the sphere diameter. Thus, the drag coefficient in this case is
Cd = 24/Re, i.e., it varies inversely with the Reynolds number. In this regime the drag force varies
linearly with speed. For a solid sphere the drag coefficient is roughly constant for Re between about
103 and 105 . In this regime a quadratic drag law is appropriate.
Examples:

1. Consider a pollen grain in air. Here ν = µ/ρ ≈ 1.5 × 10−5 m2 s−1 , D ≈ 100 µm or 10−4 m,
and Uterm ≈ 0.1 m s−1 giving
Re ≈ (10^{−4} × 10^{−1}) / (1.5 × 10^{−5}) ≈ 0.7,     (4.9)
so drag on a pollen grain varies approximately linearly with U .
2. Pitched baseball in air: ν = µ/ρ ≈ 1.5 × 10^{−5} m2 s−1 , D ≈ 0.07 m. Then Re ≈ 5 × 10^3 U . A
   quadratic drag law is valid for Re between about 10^3 and 10^5 , or for U between about 0.2 and
   20 m s−1 (professional pitchers throw at 25–40 m s−1 depending on the type of pitch).
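The two Reynolds-number estimates above are easy to reproduce. A minimal sketch (the helper `reynolds` is illustrative, not from the notes):

```python
# Kinematic viscosity of air used in the text.
nu = 1.5e-5   # m^2/s

def reynolds(U, D, nu=nu):
    # Re = U D / (mu/rho) = U D / nu, based on a characteristic diameter D
    return U * D / nu

Re_pollen = reynolds(U=0.1, D=1e-4)     # falling pollen grain, ~0.7
Re_baseball = reynolds(U=30.0, D=0.07)  # pitched baseball at 30 m/s, ~1.4e5
print(Re_pollen, Re_baseball)
```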

4.1.1 The Parachute Problem


Consider a parachutist falling through the air. A quadratic drag law is generally appropriate for
this situation. The drag force is
Fd = (1/2) Cd ρa S V^2     (4.10)
where Cd ≈ 0.75 for a parachute, S is the surface area of the parachute and ρa is the air density.
The governing equation is
m (dv/dt) = −mg + (1/2) Cd ρa S v^2 .     (4.11)
Note that by writing this down we are assuming the parachutist is always falling, that is v < 0,
and the drag force acts upward, in the opposite direction to the gravitational force. The terminal
speed, obtained by setting dv/dt = 0, is

Uterm = √( 2mg / (Cd ρa S) ).     (4.12)
Scale the DE:
v = Vc u,   t = Tc τ     (4.13)
where u is the dimensionless velocity and τ is the dimensionless time. The dimensionless governing
equation is
(Vc /Tc ) du/dτ = −g + ( Cd ρa S Vc^2 /(2m) ) u^2 ,     (4.14)
or
du/dτ = −(g Tc /Vc ) + (g Tc /Vc ) ( Cd ρa S Vc^2 /(2mg) ) u^2 .     (4.15)
To put this equation in a simple form we set
g Tc /Vc = 1,
Cd ρa S Vc^2 /(2mg) = 1,     (4.16)
which gives
Vc = Uterm   and   Tc = Uterm /g.     (4.17)
g
Since we are using the terminal speed as the velocity scaling, as τ → ∞ we expect that u → −1, a
useful check on our solution after we have found it. The nondimensional DE is
du/dτ = −1 + u^2 .     (4.18)

This is separable:
du/(1 − u^2 ) = −dτ
=⇒ (1/2) ln |(1 + u)/(1 − u)| = −τ + c
=⇒ |(1 + u)/(1 − u)| = A e^{−2τ} ,   A = e^{2c}
=⇒ (1 + u)/(1 − u) = B e^{−2τ}     (4.19)
where B = ±A, the sign depending on the sign of the left-hand side. Solving for u we obtain

u(τ ) = ( B e^{−2τ} − 1 ) / ( B e^{−2τ} + 1 ).     (4.20)

If we are given an initial condition u(0) = u0 we find


B = (1 + u0 )/(1 − u0 ).     (4.21)
Note that since u0 < 0 the denominator is non-zero. The solution can be written as
u(τ ) = ( (1 + u0 ) e^{−2τ} − (1 − u0 ) ) / ( (1 + u0 ) e^{−2τ} + (1 − u0 ) ).     (4.22)
As τ → ∞, u → −1 as expected.
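As a check, the closed-form solution (4.22) can be tested against the ODE du/dτ = −1 + u² by finite differences. A sketch (the test points and tolerances are arbitrary choices of mine):

```python
import math

def u(tau, u0):
    # Closed-form solution (4.22) of du/dtau = -1 + u**2 with u(0) = u0 < 0.
    B = (1 + u0) / (1 - u0)
    return (B * math.exp(-2 * tau) - 1) / (B * math.exp(-2 * tau) + 1)

u0 = -3.83
# the initial condition is recovered exactly
assert abs(u(0.0, u0) - u0) < 1e-12
# check the ODE with a centred finite difference at a few points
for tau in (0.1, 0.5, 1.0):
    h = 1e-6
    dudtau = (u(tau + h, u0) - u(tau - h, u0)) / (2 * h)
    assert abs(dudtau - (-1 + u(tau, u0) ** 2)) < 1e-6
print(u(20.0, u0))  # approaches -1 for large tau, as expected
```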

Example: A sky diver is falling at 40 m s−1 when she opens her parachute.

1. What is her terminal velocity?

2. Nondimensionalize the problem. What are the velocity and time scales? What are the initial
conditions?

3. How long does it take her to be falling at twice her terminal velocity?

4. How far has she fallen during this time?

Take ρa = 1.2 kg m−3 , S = 20 m2 , m = 100 kg and Cd = 0.75.

Solution:

1. The terminal speed is

Uterm = √( 2mg / (Cd ρa S) )
      = √( (2 × 100 kg × 9.81 m s−2 ) / (0.75 × 1.2 kg m−3 × 20 m2 ) )     (4.23)
      ≈ 10.44 m s−1 .

2. The velocity and time scales are
Uterm ≈ 10.44 m s−1 ,   Tc = Uterm /g ≈ 1.06 s.     (4.24)

The initial dimensional velocity is v0 = −40 m s−1 . The initial dimensionless velocity is
u(0) = v(0)/Uterm ≈ −40/10.44 ≈ −3.83.     (4.25)

3. We need to find the time at which v = 2Uterm , or u = −2. From


u(τ ) = ( (1 + u0 ) e^{−2τ} − (1 − u0 ) ) / ( (1 + u0 ) e^{−2τ} + (1 − u0 ) )     (4.26)
we find
e^{2τ} = ( (1 + u0 )/(1 − u0 ) ) ( (1 − u)/(1 + u) ).     (4.27)
Using u0 = −3.83 and setting u = −2 we have
e^{2τ} = ( −2.83/4.83 ) ( 3/(−1) ) ≈ 1.76     (4.28)
so
τ = (1/2) ln(1.76) ≈ 0.28.     (4.29)
The corresponding dimensional time is t = Tc τ ≈ 1.06 × 0.28 s ≈ 0.30 s.
4. The (nondimensional) distance travelled in time τ is
h(τ ) = − ∫0^τ u(τ ′ ) dτ ′ .     (4.30)

It is convenient to use u(τ ) in the form

u = ( ((1 + u0 )/(1 − u0 )) e^{−2τ} − 1 ) / ( ((1 + u0 )/(1 − u0 )) e^{−2τ} + 1 ).     (4.31)

For our problem (1 + u0 )/(1 − u0 ) = −2.83/4.83 ≈ −0.586 < 0. Set it equal to −e−2α . Note
that α > 0.
u = ( −e^{−2α−2τ} − 1 ) / ( −e^{−2α−2τ} + 1 )
  = −( e^{−(τ +α)} + e^{τ +α} ) / ( −e^{−(τ +α)} + e^{τ +α} )
  = −( e^{τ +α} + e^{−(τ +α)} ) / ( e^{τ +α} − e^{−(τ +α)} )     (4.32)
  = −coth(τ + α),
where
coth(x) = cosh(x)/ sinh(x),   cosh(x) = (e^x + e^{−x} )/2,   sinh(x) = (e^x − e^{−x} )/2.     (4.33)

Now
∫ coth(x) dx = ln | sinh(x)|     (4.34)
so, using (4.30), the nondimensional distance fallen is
h(τ ) = − ∫0^τ u(τ ′ ) dτ ′ = ln | sinh(τ + α)| − ln | sinh(α)| = ln( sinh(τ + α)/ sinh(α) ).     (4.35)

For our problem α = −0.5 ln(0.586) ≈ 0.267 and τ = 0.28, so the distance travelled is

h(0.28) ≈ ln( sinh(0.55)/ sinh(0.27) ) ≈ 0.75.     (4.36)

In dimensional terms we need to scale by the length scale Lc = Uterm Tc = 10.44 × 1.06 ≈ 11.07
m. The dimensional distance travelled is about 0.75 × 11.07 ≈ 8.3 m.
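The whole worked example can be reproduced end to end. A sketch (the variable names are mine; the formula numbers refer to the text, and small differences from the quoted values come only from rounding):

```python
import math

# Parameters from the example
m, g = 100.0, 9.81
Cd, rho_a, S = 0.75, 1.2, 20.0

U_term = math.sqrt(2 * m * g / (Cd * rho_a * S))   # terminal speed, (4.12)
T_c = U_term / g                                   # time scale, (4.17)
u0 = -40.0 / U_term                                # dimensionless initial velocity

# time to slow to twice the terminal speed, u = -2, from (4.27)
tau = 0.5 * math.log(((1 + u0) / (1 - u0)) * ((1 - (-2)) / (1 + (-2))))

# distance fallen, (4.35), with alpha defined by (1+u0)/(1-u0) = -exp(-2*alpha)
alpha = -0.5 * math.log(-(1 + u0) / (1 - u0))
h = math.log(math.sinh(tau + alpha) / math.sinh(alpha))

print(U_term, tau, h * U_term * T_c)   # ~10.44 m/s, ~0.28, ~8.4 m
```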

4.2 Buoyancy Forces and Added Mass


Other forces act on objects moving through a fluid.

4.2.1 Buoyancy Force


Consider an object of density ρo and volume Vo immersed in a fluid of density ρf . The object
is held in place by a rigid wire and both the fluid and the object are at rest. Pressure in the fluid
increases with depth due to the influence of gravity, hence the pressure exerted on the object by the
surrounding fluid will be larger at the bottom of the object than at the top of it. The net result
is an upward force exerted on the object by the surrounding fluid. This is called the buoyancy
force (see Figure 4.1). If you imagine replacing the object with fluid identical to the surrounding
fluid, the fluid occupying the space taken by the object will be in equilibrium with the surrounding
fluid. The buoyancy force acting on it is exactly balanced by the force of gravity −ρf Vo g k̂ acting on
it. Here k̂ is the vertical unit vector in the direction opposing gravity and g is the gravitational
acceleration. Thus the buoyancy force acting on the object is

F~b = ρf Vo g k̂.     (4.37)

Including this force, the equation of motion for a vertically moving object is
dv
m = −mg + ρf Vo g + Fd = (ρf − ρo )Vo g + Fd . (4.38)
dt

4.2.2 Acceleration-reaction or added mass


Something is missing in (4.38). Consider an air bubble of volume Va and density ρa ≈ 1.2 kg m−3
in water of density ρw = 1000 kg m−3 released from rest. Its initial velocity is zero so initially the
drag force is zero. The above equation predicts an initial acceleration of
dv/dt = ( (ρw − ρa )/ρa ) g ≈ 1000g.     (4.39)

Figure 4.1: Buoyancy force on an object at rest. The surrounding fluid exerts a surface stress (force per
unit area) ~t on the surface of the object in the inward, −n̂, direction of magnitude p(z) where p is the fluid
pressure and z the vertical coordinate. p increases with depth. The fluid pressure is unchanged if the object
is replaced by more of the same fluid, hence the net result of the surface forces is a buoyancy force that
balances the force of gravity that would act on the replacement fluid. The force of gravity acting on the
object is F~g = −ρo Vo g k̂ where ρo is the density of the object, Vo its volume, g the gravitational acceleration
and k̂ is the unit vector in the vertical z direction. The buoyancy force is F~b = ρf Vo g k̂ where ρf is the
density of the surrounding fluid. If the object is less dense than the surrounding fluid the buoyancy force is
larger than the gravitational force and the object accelerates upward.

This is an enormous acceleration and is far from correct. As the density of the air goes to zero it
predicts an acceleration that goes to infinity.
What is missing is the fact that for the air bubble to rise the surrounding fluid has to move
and, more particularly, as the air bubble accelerates the surrounding fluid must accelerate too. So
work is being done on the fluid, not just on the air bubble. From an energetics point of view, if
the air bubble rises a height h, a parcel of fluid of the same volume has in effect been moved down
a distance h. The potential energy of the system has been decreased by (ρw − ρa )Va gh. Equation
(4.38) says, ignoring the drag force, that this decrease in potential energy has been converted to
kinetic energy of the air bubble. In actual fact, most of the potential energy is converted to kinetic
energy of the surrounding water.
So, we need to take the acceleration of the surrounding fluid into account when an object
accelerates. In terms of the object, the acceleration of the surrounding fluid is associated with a
force exerted by the object on the fluid. There is an equal and opposite force exerted by the fluid
on the object. This is called the acceleration-reaction.
For a spherical object it can be shown that the force exerted on the object by a surrounding
inviscid fluid is
Fa = −(1/2) Mf (dv/dt).     (4.40)
This assumes irrotational, incompressible flow (in particular the vorticity ∇ × ~u = ~0). Here Mf =
ρf Vo is the mass of the displaced fluid. The negative sign indicates that this force opposes the
acceleration of the object. It makes it harder to speed up an object but also makes it harder to
slow it down. Thus, including this force, the equation of motion for a vertically moving spherical

object is
( m + (1/2) Mf ) dv/dt = (Mf − m) g + Fd ,     (4.41)
or
( ρo + (1/2) ρf ) dv/dt = (ρf − ρo ) g + Fd /Vo .     (4.42)
In other words, the object accelerates as if its mass has been increased by half the mass of the
displaced fluid. Hence the term ‘added mass’. For more general objects the acceleration-reaction
can have a tangential component and is described by a second-order tensor. This model for the
acceleration-reaction term does not include modifications due to the presence of a boundary layer
over the surface of the object, which reduces the fluid’s kinetic energy, or the effects of flow
separation. It is a very good model for estimating the object’s initial acceleration from rest but once the
object starts moving some additional terms may be required. The motion of an object through a
fluid can be very complicated!
Assuming a spherical air bubble, we now get an initial acceleration of
dv/dt = ( (ρw − ρa ) / (ρa + (1/2) ρw ) ) g ≈ 2g.     (4.43)

Much more reasonable!
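The two initial-acceleration estimates, (4.39) without added mass and (4.43) with it, can be compared directly (a quick sketch with the densities used above; without the added-mass term the prediction is hundreds of g):

```python
rho_w, rho_a, g = 1000.0, 1.2, 9.81   # water and air densities, gravity

a_no_added = (rho_w - rho_a) / rho_a * g               # (4.39): enormous, unphysical
a_added = (rho_w - rho_a) / (rho_a + 0.5 * rho_w) * g  # (4.43): about 2g
print(a_no_added / g, a_added / g)
```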


Both the buoyancy and added mass forces are negligible if the density of the object is much larger
than the density of the surrounding fluid, as is the case for a parachutist.

Chapter 5

Second-Order Linear Ordinary Differential Equations

5.1 Some general theory


The general linear second-order equation has the form
L[y] ≡ y 00 + P (x)y 0 + Q(x)y = R(x). (5.1)
The operator L maps twice differentiable functions to a new function. As for the first-order case L
is a linear operator (exercise). We consider only real-valued functions P (x) and Q(x). For now we
assume R(x) is real as well and we seek real-valued solutions y(x). Later on we will see that it can
at times be useful to consider complex-valued forcing functions R(x), in which case the solution
y(x) is complex-valued too.
Theorem I. Existence-Uniqueness of Solutions: Let P (x), Q(x) and R(x) be continuous
functions on a closed interval [a, b]. If x0 ∈ [a, b] and y0 and y00 are any real numbers then (5.1)
has a unique solution y(x) on [a, b] satisfying the initial conditions y(x0 ) = y0 and y 0 (x0 ) = y00 .
For a proof see [7]. We can write this theorem in terms similar to the corresponding theorem
for first-order linear DEs. Let y1 (x) = y(x) and y2 (x) = y 0 (x) and define the vector
 
~y = (y1 , y2 )T .     (5.2)
Then
d~y /dx = (y1′ , y2′ )T = (y2 , y1′′ )T = (y2 , R − P y2 − Qy1 )T
        = (0, R)T + [ 0, 1 ; −Q, −P ] (y1 , y2 )T     (5.3)
        = F~ (x, ~y ).
The initial conditions are
~y (x0 ) = (y0 , y0′ )T .     (5.4)

The second-order DE has been replaced by a system of two first-order DEs (note we can also do
this for a general second-order DE of the form y 00 = F (x, y, y 0 )). The system has the same form
as the general first-order DE y 0 = f (x, y) with y replaced by a vector. In a similar manner an nth
order DE can be replaced by a first-order DE for an n-vector. The existence-uniqueness proof for
first-order DEs can be extended to the case where y is a vector. This is also the basis of many
numerical methods used to find approximate numerical solutions of differential equations.
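The reduction to a first-order system is exactly what a numerical solver consumes. A minimal sketch, stepping y′′ + y = 0, y(0) = 1, y′(0) = 0 (exact solution y = cos x) with Euler's method from Section 2.2.1 (the step size and interval are arbitrary choices of mine):

```python
import math

# Rewrite y'' = F(x, y, y') as a system for (y1, y2) = (y, y').
def F(x, y1, y2):
    # (y1', y2') = (y2, R - Q*y1 - P*y2) with P = 0, Q = 1, R = 0
    return (y2, -y1)

y1, y2 = 1.0, 0.0        # initial conditions y(0) = 1, y'(0) = 0
x, h, n = 0.0, 1e-3, 1000
for _ in range(n):
    f1, f2 = F(x, y1, y2)
    y1, y2 = y1 + h * f1, y2 + h * f2   # one Euler step for the vector (y1, y2)
    x += h

print(y1, math.cos(1.0))  # Euler approximation at x = 1 vs exact value
```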

As we have seen, the general solution of a first-order linear DE can be expressed as the sum
of a particular solution and a multiple of any non-zero homogeneous solution. This idea can be
extended to higher-order DEs: the general solution of an nth -order linear DE has the form of the
sum of a particular solution plus a linear combination of n linearly-independent solutions of the
homogeneous problem. Here we consider second-order DEs so we will need two linearly-independent
solutions. This concept is defined below.
If yp (x) is a particular solution of (5.1) and y(x) is any other solution of (5.1) then the difference
yh = y − yp is a solution of the homogeneous equation
L[y] = 0 (5.5)
since by linearity
L[y − yp ] = L[y] − L[yp ] = R − R = 0. (5.6)
Thus, the problem of finding the general solution of (5.1) reduces to finding any particular solution
of (5.1) plus the general solution of the homogeneous equation.
Theorem II: If y1 (x) and y2 (x) are solutions of the homogeneous equation (5.5) then so is c1 y1 (x)+
c2 y2 (x) for any constants c1 and c2 .
Proof: Follows trivially from linearity of the operator L.

Definition: If two functions f (x) and g(x) are defined on an interval [a, b] and have the property
that one is a constant multiple of the other then they are said to be linearly dependent on [a, b].
Otherwise they are linearly independent.

Example:

• y1 = x2 and y2 = 0 are linearly dependent on any interval since y2 = 0 · y1 .


• y1 = x2 and y2 = x3 are linearly independent on any interval since there is no constant k
such that y1 = ky2 or y2 = ky1 .
• y1 = x2 and y2 = πx2 are linearly dependent on any interval since y2 = πy1 .
• cos(x) and sin(x) are linearly independent on any interval since neither is a constant multiple
of the other on any interval.

Theorem III: Let y1 (x) and y2 (x) be linearly independent solutions of the homogeneous equation
(5.5) on [a, b]. Then
c1 y1 (x) + c2 y2 (x) (5.7)

is the general solution of (5.5) on [a, b]. That is any solution of (5.5) is a linear combination of y1
and y2 .

Before proving this theorem a definition and a couple of lemmas.

Definition: The Wronskian W (y1 , y2 ) of two functions y1 (x) and y2 (x) is the new function

W (y1 , y2 )(x) = y1 y20 − y2 y10 . (5.8)

The Wronskian is linear in each of its arguments (exercise). It is an example of a bilinear operator.

Lemma 1: If y1 (x) and y2 (x) are any two solutions of (5.5) on [a, b] then their Wronskian is either
identically zero or never zero on [a, b].
Proof:
dW/dx = d/dx ( y1 y2′ − y2 y1′ )
      = y1′ y2′ + y1 y2′′ − y2′ y1′ − y2 y1′′
      = y1 y2′′ − y2 y1′′     (5.9)
      = y1 (−P y2′ − Qy2 ) − y2 (−P y1′ − Qy1 )
      = −P (y1 y2′ − y2 y1′ ) − Q(y1 y2 − y2 y1 )
      = −P W.

The general solution of this first-order DE for W (x) is


W (x) = c e^{− ∫x P (t) dt} .     (5.10)

W (x) is identically zero if c = 0 or never zero if c 6= 0.

Lemma 2: If y1 (x) and y2 (x) are any two solutions of (5.5) on [a, b] then they are linearly dependent
on [a, b] iff their Wronskian is identically zero on [a, b].
Proof:

(a) Suppose y1 and y2 are linearly dependent. If y1 = 0 or y2 = 0 then W (y1 , y2 ) = 0. If
    neither is identically zero on [a, b] then y1 (x) = ky2 (x) for some constant k and W (y1 , y2 ) =
    y1 y2′ − y1′ y2 = ky2 y2′ − ky2′ y2 = 0. Thus if y1 and y2 are linearly dependent W (y1 , y2 ) = 0.

(b) Suppose W (y1 , y2 ) = 0 on [a, b]. If y1 is identically zero on [a, b] then y1 and y2 are linearly
    dependent. Otherwise, since y1 is continuous (it is twice differentiable on [a, b]) there is a
    subinterval [c, d] ⊂ [a, b] on which y1 ≠ 0. Thus we can divide W (y1 , y2 ) = 0 by y1^2 on [c, d]
    to get
    W/y1^2 = ( y1 y2′ − y1′ y2 )/y1^2 = d/dx ( y2 /y1 ) = 0.     (5.11)
    Thus y2 /y1 is constant on [c, d], i.e., y2 = ky1 on [c, d]. We also have y2′ = ky1′ on [c, d] so by
    the existence-uniqueness Theorem I y2 = ky1 on [a, b]. That is, if W (y1 , y2 ) = 0 then y1 and
    y2 are linearly dependent.

We now prove Theorem III.
Proof of Theorem III: Let y(x) be any solution of (5.5). We need to show that we can find
constants c1 and c2 such that y = c1 y1 + c2 y2 . By Theorem I we pick a point x0 and values
y0 = y(x0 ) and y00 = y 0 (x0 ). We next find c1 and c2 such that
c1 y1 (x0 ) + c2 y2 (x0 ) = y(x0 ) = y0
(5.12)
c1 y10 (x0 ) + c2 y20 (x0 ) = y 0 (x0 ) = y00
If we can do this then by uniqueness (Theorem I) y = c1 y1 + c2 y2 . Equation (5.12) has a solution
iff
det [ y1 (x0 ), y2 (x0 ) ; y1′ (x0 ), y2′ (x0 ) ] = y1 (x0 )y2′ (x0 ) − y1′ (x0 )y2 (x0 ) ≠ 0.     (5.13)
In other words, iff W (y1 , y2 )(x0 ) ≠ 0. Since y1 and y2 are linearly independent by supposition,
W (y1 , y2 )(x0 ) ≠ 0 so we can solve system (5.12) for c1 and c2 . Thus we can find c1 and c2 such
that y and c1 y1 + c2 y2 and their derivatives are equal at x0 . By the existence-uniqueness theorem
y = c1 y1 + c2 y2 on [a, b].

Example: Show that y = c1 sin(x) + c2 cos(x) is the general solution of y 00 + y = 0 on any interval
and find the solution of the initial value problem y(0) = 2 and y 0 (0) = 3.
Solution: Let y1 = sin(x) and y2 = cos(x). Both are solutions of y ′′ + y = 0. Their Wronskian is
W (y1 , y2 ) = y1 y2′ − y1′ y2 = sin(x) · (− sin(x)) − cos(x) · cos(x) = −1,     (5.14)
which is non-zero. Thus sin(x) and cos(x) are linearly independent (of course this should be obvious
to you but this illustrates that W ≠ 0 everywhere). Since P (x) = 0 and Q(x) = 1 are continuous,
Theorem III says that c1 y1 + c2 y2 = c1 sin(x) + c2 cos(x) is the general solution of y ′′ + y = 0 on any
interval [a, b]. The constants are determined via the initial conditions:
c1 y1 (0) + c2 y2 (0) = c1 sin(0) + c2 cos(0) = c2 = 2
(5.15)
c1 y10 (0) + c2 y20 (0) = c1 cos(0) − c2 sin(0) = c1 = 3.
Thus, the solution of the IVP is y = 3 sin(x) + 2 cos(x).
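The solution of the IVP can be verified numerically: check the ODE with a central second difference, check the initial conditions, and check that W(sin, cos) = −1. A sketch (the step size h is an arbitrary choice of mine):

```python
import math

def y(x):
    # solution of the IVP from the example
    return 3 * math.sin(x) + 2 * math.cos(x)

def wronskian(x):
    # W(y1, y2) = y1*y2' - y1'*y2 with y1 = sin, y2 = cos
    return math.sin(x) * (-math.sin(x)) - math.cos(x) * math.cos(x)

h = 1e-5
for x in (0.0, 0.7, 2.3):
    ypp = (y(x + h) - 2 * y(x) + y(x - h)) / h**2   # central second difference
    assert abs(ypp + y(x)) < 1e-4                   # y'' + y = 0

# y(0), a central estimate of y'(0), and the (constant) Wronskian
print(y(0.0), (y(h) - y(-h)) / (2 * h), wronskian(1.0))
```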

We finish this section with a very useful result.


Theorem IV: If f (x) is a complex-valued solution of L[y] = 0 (with P (x) and Q(x) real) then so
are y1 = Re( f (x) ) and y2 = Im( f (x) ).
The proof is left as an exercise.

Example: y(x) = e^{ix} is a solution of y ′′ + y = 0 since y ′′ = (i)^2 e^{ix} = −e^{ix} = −y. Then
Re( e^{ix} ) = cos(x) and Im( e^{ix} ) = sin(x) are also solutions.

5.1.1 Finding a second homogeneous solution
There is a technique for finding solutions of the inhomogeneous equation (5.1) once you have two
linearly independent homogeneous solutions, which is a generalization of the technique for first-order
linear DEs. So the real problem is finding a pair of homogeneous solutions. If you can find one,
say y1 (x), finding a second turns out to be straightforward.
Let y2 (x) = v(x)y1 (x) be a second linearly independent solution. Then

L[y2 ] = L[vy1 ]
      = (vy1 )′′ + P (vy1 )′ + Qvy1
      = y1 v ′′ + 2y1′ v ′ + y1′′ v + P (v ′ y1 + vy1′ ) + Qvy1     (5.16)
      = y1 v ′′ + (2y1′ + P y1 )v ′ + vL[y1 ]
      = y1 v ′′ + (2y1′ + P y1 )v ′
      = 0.

This is a linear first-order homogeneous DE for v ′ . The solution is

v ′ = e^{− ∫x ( 2y1′ /y1 + P ) dt} = e^{−2 ln |y1 |} e^{− ∫x P (t) dt} = e^{− ∫x P (t) dt} / y1^2 .     (5.17)

Integrating again,
v = ∫x ( e^{− ∫s P (t) dt} / y1 (s)^2 ) ds.     (5.18)
Any integral will do. Adding a constant merely adds a multiple of y1 to y2 = vy1 . I do not
recommend memorizing this formula. Understand the process and apply it from first principles.

Example: By inspection y1 (x) = x is a solution of

x2 y 00 + xy 0 − y = 0. (5.19)

To find a second solution let y2 = v(x)y1 = xv(x). Then y20 = v + xv 0 and y200 = 2v 0 + xv 00 .
Substituting into the DE gives

x2 y200 + xy20 − y2 = x2 (2v 0 + xv 00 ) + x(v + xv 0 ) − xv = x2 (2v 0 + xv 00 ) + x2 v 0 (5.20)

so the DE gives
x^3 v ′′ + 3x^2 v ′ = 0     (5.21)
or
v ′′ = −(3/x) v ′ =⇒ v ′ = 1/x^3 =⇒ v = −1/(2x^2 ).     (5.22)
We can multiply v by any constant to get another solution of the equation for v. It is convenient
to multiply by −2 to get v(x) = 1/x^2 . Then y2 = xv = x · x^{−2} = x^{−1} is a second linearly
independent solution.
The general solution of the DE is

y = c1 x + c2 x^{−1} .     (5.23)
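Both homogeneous solutions found above can be checked by evaluating the residual x²y′′ + xy′ − y with central differences. A sketch (the helper `residual` and its step size are mine):

```python
def residual(f, x, h=1e-4):
    # residual of x^2 y'' + x y' - y, with y', y'' from central differences
    fp = (f(x + h) - f(x - h)) / (2 * h)
    fpp = (f(x + h) - 2 * f(x) + f(x - h)) / h**2
    return x**2 * fpp + x * fp - f(x)

for x in (0.5, 1.0, 3.0):
    assert abs(residual(lambda t: t, x)) < 1e-4         # y1 = x
    assert abs(residual(lambda t: 1.0 / t, x)) < 1e-4   # y2 = 1/x

# a non-solution gives a nonzero residual: for y = x^2 it equals 3x^2
print(residual(lambda t: t * t, 1.0))
```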

Note: We did not have to divide the DE by x2 to put the DE in the general form y 00 +P y 0 +Qy = 0
to apply this technique. To use formula (5.18) you do have to be careful to divide the DE by the
coefficient of y 00 , x2 in this case, to determine P (x).

5.2 Constant Coefficient Equations


We now consider the special case of constant coefficient equations using t as the independent
variable, i.e.,

    L[y] ≡ y'' + P(t)y' + Q(t)y = 0 (5.24)

where differentiation is with respect to t and P(t) and Q(t) are constant. Such problems arise in
many applications. We have already seen several examples in our study of first-order DEs.
To find the general solution of a constant coefficient linear DE of the form

    L[y] ≡ ay'' + by' + cy = f(t) (5.25)

where a, b and c are constants with a ≠ 0 we must find any particular solution yp and two linearly
independent solutions of the homogeneous problem

    L[y] = ay'' + by' + cy = 0. (5.26)

Because techniques for finding particular solutions of the inhomogeneous problem make use of
solutions of the homogeneous equation we consider the homogeneous problem first. If a = 0 the
equation is a first-order DE which we know how to solve, so in the following we assume that a ≠ 0.

5.2.1 Homogeneous Constant Coefficient DEs


Since a ≠ 0 we can divide through by a to get

    y'' + (b/a) y' + (c/a) y = 0. (5.27)

The behaviour of the solutions depends fundamentally on the signs of b/a and c/a.

Case I: b = 0 and c/a = ω0² > 0


Wlog take ω0 > 0. This is the equation for an undamped linear mass-spring system or an
LC-circuit. The equation

    y'' + ω0² y = 0 (5.28)

has the general solution

    y(t) = a cos(ω0 t) + b sin(ω0 t). (5.29)
We will often consider equations in dimensionless form. To this end scale y via y = Lc ỹ where
Lc is a characteristic scale for y that carries the dimensions, and ỹ is dimensionless. Let τ = ω0 t
be the dimensionless time. Then
    dy/dt = Lc (dτ/dt) (dỹ/dτ) = Lc ω0 dỹ/dτ (5.30)

and

    d²y/dt² = Lc ω0² d²ỹ/dτ². (5.31)
The differential equation for ỹ is

    ỹ'' + ỹ = 0 (5.32)

where here it is understood that the primes denote differentiation with respect to τ. Note that
because the equation is linear and homogeneous Lc does not appear: the scaling does not affect the
form of the differential equation. It will affect the initial conditions for the dimensionless problem.
The general solution of the dimensionless equation is

    ỹ(τ) = ã cos(τ) + b̃ sin(τ). (5.33)

Using y = Lc ỹ and τ = ω0 t,

    y(t) = Lc ỹ(τ(t)) = Lc ã cos(ω0 t) + Lc b̃ sin(ω0 t)
         = a cos(ω0 t) + b sin(ω0 t). (5.34)

The solution can be put in an alternative form using sin(α + β) = cos(α) sin(β) + sin(α) cos(β).
To this end write

    ỹ(τ) = ã cos(τ) + b̃ sin(τ)
         = √(ã² + b̃²) [ (ã/√(ã² + b̃²)) cos(τ) + (b̃/√(ã² + b̃²)) sin(τ) ]. (5.35)

Defining an angle φ via

    cos(φ) = b̃/√(ã² + b̃²);  sin(φ) = ã/√(ã² + b̃²), (5.36)

and setting

    Ã = √(ã² + b̃²) (5.37)

we can write the solution in the form

    ỹ(τ) = Ã sin(τ + φ). (5.38)

This form of the solution also involves two undetermined constants: the phase φ and amplitude Ã.
In dimensional terms the solution is

    y(t) = A sin(ω0 t + φ) (5.39)

where A = Lc Ã.

Exercise: Show that the solution can be written in the forms à sin(τ ± φ) or à cos(τ ± φ). How is
φ defined in each of these four cases?

Case II: b = 0 and c/a = −r² < 0


Wlog take r > 0. Setting τ = rt the non-dimensionalized equation is

    y'' − y = 0. (5.40)

Two linearly independent solutions are y1 = e^{−τ} and y2 = e^{τ}. The general solution

    y = c1 e^{−τ} + c2 e^{τ} (5.41)

is the sum of an exponentially decaying term and an exponentially growing term. If c2 = 0 the
solution decays to zero as τ → +∞. If c2 ≠ 0 then the solution is unbounded, going to ±∞ as
τ → +∞ depending on the sign of c2.

Case I revisited.
The equation

    ỹ'' + ỹ = 0 (5.42)

can also be solved using exponentials. Let ỹ = e^{γτ}. Substituting into the governing equation leads
to

    γ² + 1 = 0 (5.43)

or

    γ = ±i. (5.44)

Since γ is imaginary y = e^{γτ} is a complex-valued function. We are interested in real-valued solutions.
Consider the general homogeneous second-order differential operator L[y] = y'' + P(t)y' +
Q(t)y = 0 where P(t) and Q(t) are real-valued functions. Consider any solution y(t). Using the
linearity of the operator and the fact that the coefficients of the DE are real-valued, we have

    L[y] = L[yr + iyi] = L[yr] + iL[yi] (5.45)

where yr and yi are the real and imaginary parts of y and L[yr] and L[yi] are the real and imaginary
parts of L[y]. If L[y] = 0 then both L[yr] = 0 and L[yi] = 0. In our case, the real and imaginary
parts of e^{γτ} are cos(τ) and sin(τ) so

    L[e^{γτ}] = L[cos(τ)] + iL[sin(τ)] = 0 (5.46)

gives

    L[cos(τ)] = L[sin(τ)] = 0. (5.47)

Both cos(τ) and sin(τ) are solutions of L[y] = 0. Note that using the solution e^{−iτ} gives the same
two real-valued solutions. The use of complex-valued solutions in this manner is very useful as we
will shortly see.

The General Case


We now consider the linear constant coefficient second-order DE

    L[y] = ay'' + by' + cy = 0 (5.48)

in its general form. Here a, b and c are constant with a ≠ 0. It will prove convenient to work in
terms of the differential operator

    D ≡ d/dt. (5.49)

In terms of D the differential equation operator L is

    L = aD² + bD + c. (5.50)

We look for solutions of (5.48) of the form

    y = e^{λt}. (5.51)

Substituting e^{λt} into the differential equation gives

    L[e^{λt}] = (aλ² + bλ + c) e^{λt} = 0. (5.52)

Definition: P(λ) = aλ² + bλ + c is called the characteristic polynomial of the linear differential
equation (5.48).

Our trial function e^{λt} is a solution of the differential equation if

    P(λ) ≡ aλ² + bλ + c = 0, (5.53)

i.e., if λ is a root of the characteristic polynomial.


You can think of the operator L as a polynomial in the differential operator D, in particular
L = P(D) (see (5.50)). For functions of the form e^{λt} each derivative brings down a factor of λ so
P(D)[e^{λt}] = P(λ)e^{λt}. So setting P(D)[e^{λt}] = 0 gives

    P(λ) = aλ² + bλ + c = 0. (5.54)

Solving gives

    λ = ( −b ± √(b² − 4ac) ) / (2a) = −b/(2a) ± √( (b/(2a))² − c/a ). (5.55)
There are three cases to consider.

CASE A: b² − 4ac > 0. For this case there are two real distinct roots:

    λ1 = ( −b − √(b² − 4ac) ) / (2a),
    λ2 = ( −b + √(b² − 4ac) ) / (2a). (5.56)
It is easy to show that e^{λ1 t} and e^{λ2 t} are linearly independent (exercise) so the general solution is

    y = c1 e^{λ1 t} + c2 e^{λ2 t}. (5.57)

Note that if ac < 0 one root is negative and one is positive: λ1 < 0 < λ2. If ac > 0 the two roots
have the same sign, namely the sign of −b. If ac = 0 then c = 0, since we assume a ≠ 0, in which
case one root is equal to zero. In this case one solution is a constant, which is indeed a solution of
the equation ay'' + by' = 0.
We can write the characteristic polynomial as P(λ) = a(λ − λ1)(λ − λ2) and the DE as

    P(D)[y] = a(D − λ1)(D − λ2)[y] = a(d/dt − λ1)(d/dt − λ2)y = 0. (5.58)
Defining the operators
Li ≡ D − λi for i = 1, 2 (5.59)

we can write the DE as

    L[y] = a L1 ◦ L2 [y] = a L1[ L2[y] ] = 0. (5.60)

Because the λi are constants the first-order operators commute:

    L1 ◦ L2 = L2 ◦ L1. (5.61)

Note that y1 = e^{λ1 t} is in the null space of L1 and y2 = e^{λ2 t} is in the null space of L2. That is,
L1[e^{λ1 t}] = 0 and L2[e^{λ2 t}] = 0.

CASE B: b² − 4ac < 0. In this case there are two complex conjugate roots of the form α ± iβ.
The simplest way to obtain two real-valued linearly independent solutions is to take the real and
imaginary parts of e^{λ1 t} (exercise):

    y1 = Re{ e^{αt + iβt} } = e^{αt} cos(βt),
    y2 = Im{ e^{αt + iβt} } = e^{αt} sin(βt), (5.62)

where

    α = −b/(2a)  and  β = √(4ac − b²) / (2a). (5.63)
An alternative approach which does not involve complex-valued functions is the following. First
write the DE as

    ( D² + (b/a)D + b²/(4a²) + c/a − b²/(4a²) ) [y] = 0, (5.64)

or

    ( D + b/(2a) )² [y] + ( c/a − b²/(4a²) ) y = 0. (5.65)

The operator D + b/(2a) annihilates e^{−bt/(2a)}, that is

    ( D + b/(2a) ) [ e^{−bt/(2a)} ] = 0, (5.66)

so let y = e^{−bt/(2a)} f(t) = e^{αt} f(t) where α = −b/(2a) as in (5.63). Now

    ( D + b/(2a) ) [ e^{−bt/(2a)} f(t) ] = −(b/(2a)) e^{−bt/(2a)} f(t) + e^{−bt/(2a)} f'(t) + (b/(2a)) e^{−bt/(2a)} f(t)
                                         = e^{−bt/(2a)} f'(t), (5.67)

so

    ( D + b/(2a) )² [ e^{−bt/(2a)} f(t) ] = e^{−bt/(2a)} f''(t), (5.68)

hence

    L[ e^{−bt/(2a)} f(t) ] = e^{−bt/(2a)} f''(t) + ( c/a − b²/(4a²) ) e^{−bt/(2a)} f(t) = 0 (5.69)

which simplifies to

    f''(t) + ( (4ac − b²)/(4a²) ) f(t) = 0. (5.70)
Since b² < 4ac this equation has the form f'' + β² f = 0. As we've seen above the general solution
is

    f = a cos(βt) + b sin(βt) (5.71)
where β is as in (5.63). The solutions e^{−bt/(2a)} f(t) are the same as those given in (5.62).

CASE C: b² − 4ac = 0. For this case there is only one root,

    r = −b/(2a), (5.72)

which is real. Thus only one solution of the form e^{λt} exists. We need to find a second linearly
independent solution of the homogeneous DE. There are several ways to find a second solution. The
simplest is to set y = e^{−bt/(2a)} f(t) as we did above. This gives equation (5.70) without the second
term since b² − 4ac = 0, that is

    f'' = 0 ⟹ f(t) = c1 + c2 t (5.73)

so

    y = e^{−bt/(2a)} f(t) = c1 e^{−bt/(2a)} + c2 t e^{−bt/(2a)} (5.74)

is the general solution.

SUMMARY: In summary, two linearly independent solutions of the homogeneous DE ay'' +
by' + cy = 0 have the forms shown in table 5.1.

Table 5.1: Forms of solutions of ay'' + by' + cy = 0.

    b² − 4ac        y1                    y2

    > 0             e^{λ1 t}              e^{λ2 t}
    = 0             e^{rt}                t e^{rt}
    < 0             e^{αt} cos(βt)        e^{αt} sin(βt)
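Table 5.1 translates directly into a small dispatcher. The sketch below (illustrative, not from the notes; the function names are my own) returns a real solution basis and checks it against the DE by central differences.

```python
import math

# Illustrative sketch (not from the notes): build the real solution basis of
# a y'' + b y' + c y = 0 following Table 5.1.

def homogeneous_basis(a, b, c):
    disc = b * b - 4 * a * c
    if disc > 0:                                   # two distinct real roots
        lam1 = (-b - math.sqrt(disc)) / (2 * a)
        lam2 = (-b + math.sqrt(disc)) / (2 * a)
        return (lambda t: math.exp(lam1 * t),
                lambda t: math.exp(lam2 * t))
    if disc == 0:                                  # double root r = -b/(2a)
        r = -b / (2 * a)
        return (lambda t: math.exp(r * t),
                lambda t: t * math.exp(r * t))
    alpha = -b / (2 * a)                           # complex conjugate pair
    beta = math.sqrt(-disc) / (2 * a)
    return (lambda t: math.exp(alpha * t) * math.cos(beta * t),
            lambda t: math.exp(alpha * t) * math.sin(beta * t))

def residual(a, b, c, y, t=0.7, h=1e-5):
    """a y'' + b y' + c y at t, with derivatives by central differences."""
    d2 = (y(t + h) - 2 * y(t) + y(t - h)) / h**2
    d1 = (y(t + h) - y(t - h)) / (2 * h)
    return a * d2 + b * d1 + c * y(t)
```

For example, `homogeneous_basis(1, 3, 2)` returns e^{−t} and e^{−2t}, and the residuals vanish up to finite-difference error.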

5.2.2 The Unforced Linear Oscillator


Consider a mass m attached to a wall via a spring. It is free to slide over a horizontal smooth
surface. The mass is subjected to two forces in the horizontal: the force exerted by the spring (Fs)
and a drag force (Fd). Let y be the position of the mass relative to its position when the spring is
at its equilibrium length (i.e., neither stretched nor compressed). The equation of motion is

    m d²y/dt² = Fs + Fd. (5.75)

For small motion the force exerted by the spring is linearly proportional to the amount the
spring has been stretched from its equilibrium position (a ‘linear’ spring), so Fs = −ky where k is a
positive constant, called the spring constant. We assume the drag force, due to friction between the
mass and the surface and possibly air drag, varies linearly with the mass’ velocity. Thus Fd = −β dy/dt
for some positive drag coefficient β which is constant. Thus the governing equation is

    m d²y/dt² + β dy/dt + ky = 0. (5.76)

Equations of this type arise in many problems.
The unforced (no external forces) damped mass-spring system is governed by an equation of
the form

    ay'' + by' + cy = 0 (5.77)

in which all the coefficients are positive. When b = 0 and c/a > 0 we have seen that solutions
oscillate with frequency ω0 = √(c/a). Many systems in nature undergo oscillations and some of these
can be modelled with a linear ODE of this form. These systems are typically damped (b/a > 0),
however in unstable situations oscillations may grow in time (e.g., wind blowing over the surface of a
lake generating surface waves) in which case b/a < 0. This requires an energy source because as the
oscillations grow in amplitude their energy increases and this energy must come from somewhere.
The behaviour of solutions of (5.77) is easiest to explore using an appropriately non-dimensionalized
equation. Thus we non-dimensionalize by setting y = Lc x and τ = ω0 t. The differential equation
becomes

    y'' + (b/a) y' + ω0² y = Lc ω0² x'' + (b/a) Lc ω0 x' + Lc ω0² x = 0 (5.78)

where x = x(τ). Dividing through by Lc ω0² gives (using dots to denote differentiation with respect
to τ)

    ẍ + 2δ ẋ + x = 0 (5.79)

where

    δ = b/(2aω0) = b/(2√(ac)) (5.80)

is the dimensionless damping coefficient. The behaviour of the solutions of equation (5.79) depends
on the single parameter δ. Because the equation is linear and homogeneous the scaling Lc does not
play a role; however, it does affect the initial conditions.
Setting x = e^{λτ} gives

    λ = −δ ± √(δ² − 1). (5.81)

There are three cases: δ² < 1, δ² = 1 and δ² > 1. The factor of 2 in (5.79) was introduced merely
for the convenience of having δ² = 1 being the critical value at which the roots change from being
complex to real.

Energetics

Multiplying (5.79) by ẋ gives

    d/dτ ( ½ x² + ½ ẋ² ) ≡ dE/dτ = −2δ ẋ². (5.82)

We call

    E = ½ x² + ½ ẋ² (5.83)

the (nondimensional) energy. For a linear mass-spring system it is the sum of kinetic and potential
energy. If δ > 0 the energy can only decay. This is the case of a linearly damped system. If δ < 0
then the energy grows with time. This implies an energy source. A model of this type arises in the
study of some unstable systems.
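As an illustrative check (not from the notes), one can evaluate E along the exact underdamped solution and watch it decay; the parameter values below are arbitrary.

```python
import math

# Illustrative check (not from the notes): evaluate the energy
# E = x^2/2 + xdot^2/2 along the exact underdamped solution
# x = e^{-delta*tau} cos(omega*tau) with omega = sqrt(1 - delta^2).

def energy(tau, delta=0.1):
    om = math.sqrt(1.0 - delta**2)
    x = math.exp(-delta * tau) * math.cos(om * tau)
    xdot = math.exp(-delta * tau) * (-delta * math.cos(om * tau)
                                     - om * math.sin(om * tau))
    return 0.5 * x**2 + 0.5 * xdot**2
```

Consistent with dE/dτ = −2δẋ² ≤ 0, the energy decays roughly like e^{−2δτ}: energy(30.0) is far smaller than energy(0.0).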

[Figure 5.1 shows x versus τ in four panels: (a) undamped oscillator, δ = 0; (b) underdamped
oscillator, δ = 0.1, 0.3; (c) critically damped oscillator, δ = 1.0; (d) overdamped oscillator, δ = 1.1.]

Figure 5.1: Solutions of ẍ + 2δ ẋ + x = 0 for various values of δ and different initial conditions.

The linear damped oscillator: δ > 0

The three cases of interest are δ < 1, δ = 1 and δ > 1. Figure 5.1 shows several solutions of
ẍ + 2δ ẋ + x = 0 for various values of δ.

(i) Case 1: δ > 1. This is the overdamped oscillator. For this case there are two real negative
roots and the general solution is

    x = c1 e^{λ1 τ} + c2 e^{λ2 τ} = c1 e^{−(δ + √(δ²−1))τ} + c2 e^{−(δ − √(δ²−1))τ}. (5.84)

x(τ) can pass through zero at most once (e.g., if the initial conditions have x(0) very small
compared to ẋ(0)). This is called the overdamped case. It is desirable for a swinging door on
a hinge to be overdamped so it doesn’t flap back and forth. See Figure 5.1(d).

(ii) Case 2: δ < 1. This is the underdamped (or, for δ = 0, undamped) oscillator. For this case there
are two complex conjugate roots. The general solution has the form

    x = A e^{−δτ} cos( √(1 − δ²) τ + φ ). (5.85)

The solution oscillates with frequency ω = √(1 − δ²) and decays with a decay time scale of 1/δ.
For δ ≪ 1 the frequency is close to 1 and the system undergoes a large number of oscillations
before decaying appreciably. For δ close to 1 the oscillation frequency is very small and the
decay scale is close to one. The solution decays by a large factor over one oscillation. This case
is called the underdamped case. Natural systems which undergo unforced oscillations generally
have some damping and hence fall into this category. See Figure 5.1(a,b) and 5.2(b).

[Figure 5.2 shows x versus τ in four panels: (a) exponential growth/decay, δ = 0, s = −1; (b)
underdamped oscillator, δ = 0.02, s = 1; (c) growing oscillator, δ = −0.02, s = 1; (d) exponential
growth, δ = −1.01, s = 1.]

Figure 5.2: Solutions of ẍ + 2δ ẋ + sx = 0 for various values of δ and different initial conditions. (a) s = −1.
(b)–(d) s = 1.

(iii) Case 3: δ = 1. This transitional case is called critically damped. The general solution is

    x = c1 e^{−τ} + c2 τ e^{−τ}. (5.86)

Solutions are qualitatively similar to the overdamped case. See Figure 5.1(c).

The unstable oscillator: δ < 0

In this case the energy of the system grows in time. If −1 < δ < 0 the solution has the form

    x = A e^{−δτ} cos( √(1 − δ²) τ + φ ) (5.87)

where now −δ > 0 so the amplitude of the oscillations grows in time. In many systems that have
natural modes of oscillations (e.g., waves), an unstable situation can result in growing oscillations.
An example is the generation of surface water waves by wind blowing over the water surface. If the
wind speed is large enough the flow may be unstable and perturbations in the interface displacement
(triggered by air pressure fluctuations associated with eddies) can grow in time. The difference in
the velocities of the air and water means there is a source of kinetic energy to feed the growing
waves. Of course the waves will not grow indefinitely. There is a finite amount of energy available
to feed the waves and when waves become large enough they break and dissipate energy. The
system rapidly becomes nonlinear and very complicated. This is still an active area of research. See
Figure 5.2(c) for sample solutions.

The equation ẍ + 2δ ẋ − x = 0

Changing the sign of the last term fundamentally changes the behaviour of the solutions: when
δ = 0 the solutions grow or decay exponentially rather than oscillating (Figure 5.2(a)). Solutions
for δ = −1.01 (panel (d)) all grow exponentially in time.

5.2.3 Inhomogeneous Equations: Method of Undetermined Coefficients for Constant Coefficient ODEs
We now consider particular solutions of

    ay'' + by' + cy = g(t). (5.88)

I. Powers of t

For

    L[y] ≡ ay'' + by' + cy = t^n (5.89)

there is a particular solution of the form

    yp = a0 + a1 t + a2 t² + · · · + an t^n. (5.90)

Note that since y has higher powers of t than y' and y'' do, there can be no terms in yp with powers
higher than n.

Example: Find a particular solution for

    y'' + 2y' − 6y = t². (5.91)

Solution: Since derivatives of powers of t yield powers of t we guess that the solution is a polynomial
in t. Since differentiation reduces the powers, the term −6y on the left-hand side will have the highest
power. Thus we should try

    yp = a0 + a1 t + a2 t². (5.92)

Then

    yp' = a1 + 2a2 t (5.93)

and

    yp'' = 2a2, (5.94)

so

    yp'' + 2yp' − 6yp = 2a2 + 2(a1 + 2a2 t) − 6(a0 + a1 t + a2 t²)
                      = (2a2 + 2a1 − 6a0) + (4a2 − 6a1)t − 6a2 t². (5.95)

Setting this equal to t² gives

    2a2 + 2a1 − 6a0 = 0
    4a2 − 6a1 = 0 (5.96)
    −6a2 = 1.

Thus

    a2 = −1/6;  a1 = (2/3)a2 = −1/9;  a0 = (a1 + a2)/3 = −5/54. (5.97)

A particular solution is

    yp = −5/54 − (1/9)t − (1/6)t². (5.98)

Comments:

(i) If c = 0 the problem reduces to a first-order problem for y'. Look for a particular solution
with yp' = a0 + a1 t + a2 t² + · · · + an t^n. Find the ai and integrate to get yp in the form
yp = b0 t + b1 t² + · · · + bn t^{n+1}.

(ii) If b = c = 0 then you can integrate twice to get yp = t^{n+2} / ( a(n + 1)(n + 2) ).
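A quick check of the example above (illustrative, not from the notes): substituting (5.98) back into (5.91) with exact derivatives should reproduce t².

```python
def lhs(t):
    """y'' + 2y' - 6y for yp = -5/54 - t/9 - t**2/6, with exact derivatives."""
    y = -5/54 - t/9 - t**2/6
    dy = -1/9 - t/3
    d2y = -1/3
    return d2y + 2*dy - 6*y
```

The constant and linear terms cancel, so lhs(t) equals t² up to rounding, confirming (5.98).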

II. Exponentials e^{qt}

For

    L[y] ≡ ay'' + by' + cy = e^{qt} (5.99)

start by looking for a solution of the form yp = Ae^{qt} because all derivatives of yp will then be
proportional to e^{qt}. Substituting into the DE gives

    P(q)Ae^{qt} = e^{qt} (5.100)

where P(q) is the characteristic polynomial introduced above. If P(q) ≠ 0, i.e., if e^{qt} is not a
solution of the homogeneous equation, then we have A = 1/P(q) so the solution is

    yp = e^{qt}/P(q) = e^{qt}/(aq² + bq + c). (5.101)

If P(q) = 0, i.e., if e^{qt} is a solution of the homogeneous equation (which is why it helps to find
the homogeneous solution first) then you need to do something else. If this is the case set

    yp = f(t)e^{qt}. (5.102)

Note that when substituting this into the left-hand side of the differential equation all terms of L[yp]
will include a factor e^{qt} which we can then cancel. We have

    d/dt [ f(t)e^{qt} ] = (f' + qf)e^{qt} (5.103)

and

    d²/dt² [ f(t)e^{qt} ] = (f'' + 2qf' + q²f)e^{qt} (5.104)

so the differential equation gives

    L[yp] = L[f(t)e^{qt}] = [ a(f'' + 2qf' + q²f) + b(f' + qf) + cf ] e^{qt} = e^{qt} (5.105)

or

    af'' + (2aq + b)f' + (aq² + bq + c)f = 1. (5.106)

Now the coefficient of f is P(q) = 0 so this reduces to

    af'' + (2aq + b)f' = 1 (5.107)

which is a first-order DE for f'. It is an example of the first problem we considered: a second-order
constant coefficient equation with a power of t as the forcing term. In this case the power is 0.

Case (i): If 2aq + b ≠ 0 take f' = 1/(2aq + b). Then f = t/(2aq + b).

Case (ii): If 2aq + b = 0 we have f'' = 1/a so take f = t²/(2a).

In both cases we just need one particular solution and have taken the simplest.

Note: From aq² + bq + c = 0 we have q = ( −b ± √(b² − 4ac) )/(2a). If b² − 4ac = 0 then q = −b/(2a), i.e., 2aq + b = 0.
Thus case (ii) is the case when q is a double root of P(q) = 0.

Summary: A particular solution of

    L[y] = e^{qt} (5.108)

has the form

1. yp = Ae^{qt} if L[e^{qt}] ≠ 0. That is, if q is not a root of P(q) = 0.

2. yp = Ate^{qt} if L[e^{qt}] = 0 and q is a single root of P(q) = 0. That is, if te^{qt} is not a solution of
the homogeneous equation L[y] = 0.

3. yp = At²e^{qt} if L[e^{qt}] = 0 and q is a double root of P(q) = 0. That is, if both e^{qt} and te^{qt} are
solutions of the homogeneous equation L[y] = 0.

Comment: In equation (5.106) for f note that the coefficient of f is P(q), the coefficient of f' is
P'(q) and the coefficient of f'' is (1/2)P''(q). Remembering this can speed things up in some of the
following.
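The comment above suggests a compact recipe. Here is an illustrative sketch (not from the notes; the function name is my own) that selects the power of t and the coefficient using P(q), P'(q) and P''(q)/2.

```python
def particular_exponential(a, b, c, q):
    """Return (s, A) so that y_p = A * t**s * exp(q*t) solves
    a y'' + b y' + c y = exp(q*t). Uses P(q), P'(q) = 2aq + b and
    P''(q)/2 = a, as in the summary; q may be real or complex."""
    P = a * q * q + b * q + c
    if P != 0:
        return 0, 1 / P                 # q not a root of P
    Pprime = 2 * a * q + b
    if Pprime != 0:
        return 1, 1 / Pprime            # q a single root
    return 2, 1 / (2 * a)               # q a double root
```

For example, for y'' + 3y' + 2y = e^{−t} the value q = −1 is a single root with P'(−1) = 1, giving yp = t e^{−t}.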

III. Exponentials times powers: g(t) = t^m e^{qt}

More generally consider

    L[y] ≡ ay'' + by' + cy = t^m e^{qt}. (5.109)

For this case set

    yp = f(t)e^{qt} (5.110)

to get (see above)

    af'' + (2aq + b)f' + (aq² + bq + c)f = t^m. (5.111)

This is a linear, constant-coefficient, second-order DE for f with forcing t^m. This is case I considered
above.

Example: Find the general solution of

    y'' + 3y' + 2y = t² e^{−t}. (5.112)

Solution: First find the homogeneous solution. Substituting y = e^{λt} gives λ² + 3λ + 2 = 0 or
(λ + 2)(λ + 1) = 0. Thus e^{−2t} and e^{−t} are two linearly independent homogeneous solutions.
Next look for a particular solution yp = f(t)e^{−t}. Then

    yp' = (f' − f)e^{−t}
    yp'' = (f'' − 2f' + f)e^{−t} (5.113)

Then L[yp] is

    yp'' + 3yp' + 2yp = [ f'' − 2f' + f + 3(f' − f) + 2f ] e^{−t}
                      = ( f'' + f' ) e^{−t}. (5.114)

From the differential equation we have

    f'' + f' = t². (5.115)

Solve by setting f' = a0 + a1 t + a2 t². Substituting gives

    (a1 + 2a2 t) + (a0 + a1 t + a2 t²) = a1 + a0 + (2a2 + a1)t + a2 t² = t². (5.116)

Hence

    a2 = 1
    a1 = −2a2 = −2 (5.117)
    a0 = −a1 = 2

so

    f' = 2 − 2t + t² ⟹ f = 2t − t² + (1/3)t³ (5.118)

and the particular solution is

    yp = ( 2t − t² + (1/3)t³ ) e^{−t}. (5.119)

Hence the general solution is

    y = ( 2t − t² + (1/3)t³ ) e^{−t} + c1 e^{−2t} + c2 e^{−t}. (5.120)
Comments:

1. Because the exponential e^{−t} is a solution of the homogeneous equation the coefficient of f in
(5.115) was zero. The coefficient of f' was non-zero because λ = −1 was a single root of the
characteristic equation P(λ) = 0.

2. An alternative derivation of (5.115) is the following. We have P(q) = q² + 3q + 2, P'(q) = 2q + 3
and P''(q) = 2. Then P(−1) = 0, P'(−1) = 1 and (1/2)P''(−1) = 1. So the differential equation
for f(t) is (1/2)P''(−1)f'' + P'(−1)f' + P(−1)f = t² or f'' + f' = t².

3. When integrating the expression for f' to get f we did not need to include a constant of
integration as any solution will do. Our goal is to find a particular solution. If we did we
would have had f = a + 2t − t² + (1/3)t³ which would add a term ae^{−t} to the final solution.
We already have a c2 e^{−t} term from the homogeneous solution so the new term can just be
absorbed into the homogeneous solution. Remember, if yp is a particular solution so is yp
plus any combination of the homogeneous solutions.
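As an illustrative check (not from the notes), the particular solution (5.119) can be substituted back into (5.112) using exact derivatives of f(t)e^{−t}:

```python
import math

def residual(t):
    """y'' + 3y' + 2y - t^2 e^{-t} for yp = (2t - t^2 + t^3/3) e^{-t}."""
    f = 2*t - t**2 + t**3/3
    fp = 2 - 2*t + t**2
    fpp = -2 + 2*t
    y = f * math.exp(-t)
    dy = (fp - f) * math.exp(-t)
    d2y = (fpp - 2*fp + f) * math.exp(-t)
    return d2y + 3*dy + 2*y - t**2 * math.exp(-t)
```

The residual vanishes to rounding for every t, since d2y + 3dy + 2y collapses to (f'' + f')e^{−t} = t² e^{−t}.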

Example: Find the general solution of

    y'' + 4y' + 4y = 2t² e^{−t} − 6t⁶ e^{−2t}. (5.121)

Solution:

Step 1. First find the homogeneous solution. Substituting y = e^{λt} gives λ² + 4λ + 4 = 0 or
(λ + 2)² = 0. Thus e^{−2t} and te^{−2t} are two linearly independent homogeneous solutions.

Step 2. For the particular solutions we split the problem into two pieces. First find a
particular solution yp1 of L[y] = t² e^{−t} and then a particular solution yp2 of L[y] = t⁶ e^{−2t}.
Then by linearity yp = 2yp1 − 6yp2 will be a particular solution of the problem.

Step 2(a): Let yp1 = f(t)e^{−t}. Then

    yp1' = (f' − f)e^{−t}
    yp1'' = (f'' − 2f' + f)e^{−t} (5.122)

Then L[yp1] is

    yp1'' + 4yp1' + 4yp1 = [ f'' − 2f' + f + 4(f' − f) + 4f ] e^{−t}
                         = ( f'' + 2f' + f ) e^{−t}. (5.123)

The coefficient of f is non-zero this time because e^{−t} is not a solution of the homogeneous
equation. From the differential equation we have

    f'' + 2f' + f = t². (5.124)

Solve by setting f = a0 + a1 t + a2 t². Substituting gives

    2a2 + 2(a1 + 2a2 t) + (a0 + a1 t + a2 t²) = 2a2 + 2a1 + a0 + (4a2 + a1)t + a2 t² = t². (5.125)

Hence

    a2 = 1
    a1 = −4a2 = −4 (5.126)
    a0 = −2a2 − 2a1 = −2 + 8 = 6

so

    f = 6 − 4t + t² (5.127)

and the particular solution is

    yp1 = ( 6 − 4t + t² ) e^{−t}. (5.128)

Step 2(b): Next find yp2. Let yp2 = f(t)e^{−2t}. We have

    yp2' = (f' − 2f)e^{−2t}
    yp2'' = (f'' − 4f' + 4f)e^{−2t} (5.129)

Then L[yp2] is

    yp2'' + 4yp2' + 4yp2 = [ f'' − 4f' + 4f + 4(f' − 2f) + 4f ] e^{−2t}
                         = f'' e^{−2t}. (5.130)

From the differential equation we have

    f'' = t⁶ (5.131)

which is easily integrated to get

    f = t⁸/56. (5.132)

Note that the coefficients of f' and f in (5.131) are zero because λ = −2 is a double root
of the characteristic equation P(λ) = 0, so the exponential part of the forcing function,
e^{−2t}, is a solution of the homogeneous equation, as is te^{−2t}. The coefficients of f and f',
P(λ) and P'(λ), are zero.

Step 3. The particular solution of the differential equation is 2yp1 − 6yp2, hence the general
solution of the differential equation is

    y = ( 12 − 8t + 2t² ) e^{−t} − (3/28) t⁸ e^{−2t} + a e^{−2t} + b te^{−2t}. (5.133)

Comment: You could also proceed by finding particular solutions yp1 and yp2 of L[y] = 2t² e^{−t}
and L[y] = −6t⁶ e^{−2t} directly, which would lead to f'' + 2f' + f = 2t² in step 2(a) and f'' = −6t⁶
in step 2(b), with the final solution being yp = yp1 + yp2.
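An illustrative check (not from the notes) of the particular solution in (5.133), using the operator identities derived in steps 2(a) and 2(b); the function name is my own.

```python
import math

def residual(t):
    """y'' + 4y' + 4y - (2 t^2 e^{-t} - 6 t^6 e^{-2t}) for
    yp = (12 - 8t + 2t^2) e^{-t} - (3/28) t^8 e^{-2t}."""
    # e^{-t} piece: L[f e^{-t}] = (f'' + 2f' + f) e^{-t}, with f = 12 - 8t + 2t^2
    f, fp, fpp = 12 - 8*t + 2*t**2, -8 + 4*t, 4.0
    # e^{-2t} piece: L[g e^{-2t}] = g'' e^{-2t}, with g = -(3/28) t^8
    gpp = -(3/28) * 56 * t**6      # g'' = -6 t^6
    lhs = (fpp + 2*fp + f) * math.exp(-t) + gpp * math.exp(-2*t)
    rhs = 2*t**2 * math.exp(-t) - 6*t**6 * math.exp(-2*t)
    return lhs - rhs
```

Here f'' + 2f' + f reduces to exactly 2t² and g'' to −6t⁶, so the residual vanishes to rounding for every t.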

IV. Sinusoidal forcing functions g(t) = e^{αt} cos(βt) or e^{αt} sin(βt) with α and β real.

For this we note that

    e^{αt} cos(βt) = Re{ e^{qt} }
    e^{αt} sin(βt) = Im{ e^{qt} } (5.134)

where

    q = α + iβ. (5.135)

To proceed we are going to find solutions of

    L[y] ≡ ay'' + by' + cy = e^{qt}. (5.136)

The solution will be complex-valued since the forcing term e^{qt} is. Setting y = yr + iyi where yr and
yi are the real and imaginary parts of y we have

    L[y] = L[yr] + iL[yi] = e^{qt} = e^{αt} cos(βt) + ie^{αt} sin(βt). (5.137)

Thus the real part, yr, is a particular solution of L[y] = e^{αt} cos(βt) and the imaginary part, yi, is
a particular solution of L[y] = e^{αt} sin(βt).
The procedure for finding solutions is identical to the case when q is real. One just has to take
the real or imaginary part at the end.

Example: Find the general solution of

    y'' + 4y' + 13y = (1/2) e^{−t} cos(2t) + 3t e^{−2t} sin(3t) + 2t². (5.138)

Solution:

Step 1. First find the homogeneous solution. Substituting y = e^{λt} gives λ² + 4λ + 13 = 0 so

    λ = ( −4 ± √(16 − 4 · 13) )/2 = −2 ± √(4 − 13) = −2 ± 3i. (5.139)

Thus two linearly independent homogeneous solutions are

    y1 = e^{−2t} cos(3t),
    y2 = e^{−2t} sin(3t). (5.140)

Note that the second forcing function in (5.138) involves y2.

Step 2. For the particular solutions we split the problem into three pieces. We need to find
particular solutions: (i) yp1 of L[y] = e^{−t} cos(2t); (ii) yp2 of L[y] = t e^{−2t} sin(3t); and (iii) yp3 of
L[y] = t². Then by linearity yp = (1/2)yp1 + 3yp2 + 2yp3 will be a particular solution of (5.138).

Step 2(a): Since e^{−t} cos(2t) = Re{ e^{(−1+2i)t} } we first solve

    L[y] = e^{qt} (5.141)

where q = −1 + 2i and then take the real part of the solution. Since e^{−t} cos(2t) is not a
solution of the homogeneous problem we simply let yp1 = Ae^{qt} which leads to

    P(q)Ae^{qt} = e^{qt} (5.142)

where P(q) is the characteristic polynomial. P(q) ≠ 0 in this case since e^{−t} cos(2t) is
not a solution of the homogeneous problem, so

    A = 1/P(q) = 1/(q² + 4q + 13) = 1/( (−1 + 2i)² + 4(−1 + 2i) + 13 ) = 1/(6 + 4i) = (3 − 2i)/26. (5.143)

Next take the real part to get the particular solution yp1:

    Re{ Ae^{qt} } = Re{ ((3 − 2i)/26) e^{−t} ( cos(2t) + i sin(2t) ) }
                 = (e^{−t}/26) ( 3 cos(2t) + 2 sin(2t) ). (5.144)

Note the forcing involved cos(2t) and the solution involves both cos(2t) and sin(2t).
Step 2(b): Next find yp2. We now consider

    L[y] = te^{qt} (5.145)

where now q = −2 + 3i and take the imaginary part of the solution because the forcing
involves sin(3t). Now e^{qt} is a solution of the homogeneous problem this time so we need
to set y = f(t)e^{qt}. Doing so leads to

    f'' + (2q + 4)f' = t. (5.146)

Thus we try:

    f' = a0 + a1 t (5.147)

which gives

    a1 + (2q + 4)(a0 + a1 t) = t (5.148)

so

    a1 = 1/(2q + 4) = 1/(6i) = −i/6,
    a0 = −a1/(2q + 4) = −a1/(6i) = 1/36. (5.149)

Thus

    f' = 1/36 − (i/6)t ⟹ f = (1/36)t − (i/12)t². (5.150)

Hence

    yp2 = ( (1/36)t − (i/12)t² ) e^{qt}
        = ( (1/36)t − (i/12)t² ) e^{−2t} ( cos(3t) + i sin(3t) ) (5.151)
        = ( (1/36)t cos(3t) + (1/12)t² sin(3t) ) e^{−2t} + i( (1/36)t sin(3t) − (1/12)t² cos(3t) ) e^{−2t}

which is a complex-valued particular solution of L[y] = te^{qt}. Taking the imaginary part
we get

    yp2 = (1/36) t sin(3t) e^{−2t} − (1/12) t² cos(3t) e^{−2t} (5.152)

as the desired particular solution of L[y] = t e^{−2t} sin(3t).
Step 2(c): Next find yp3. We consider

    L[y] = t². (5.153)

For this we set yp3 = a0 + a1 t + a2 t². Substituting into L[y] = t² gives

    2a2 + 4(a1 + 2a2 t) + 13(a0 + a1 t + a2 t²)
        = 2a2 + 4a1 + 13a0 + (8a2 + 13a1)t + 13a2 t² (5.154)
        = t²

so

    a2 = 1/13,
    a1 = −(8/13)a2 = −8/169, (5.155)
    a0 = −(2a2 + 4a1)/13 = 6/13³.

Thus

    yp3 = 6/13³ − (8/169)t + (1/13)t². (5.156)
Step 3. Putting all the parts together, the particular solution of the differential equation is
(1/2)yp1 + 3yp2 + 2yp3, hence the general solution of the differential equation is

    y = (e^{−t}/52) ( 3 cos(2t) + 2 sin(2t) ) + (1/12) t sin(3t) e^{−2t} − (1/4) t² cos(3t) e^{−2t}
        + 12/13³ − (16/169)t + (2/13)t² + c1 e^{−2t} cos(3t) + c2 e^{−2t} sin(3t). (5.157)
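An illustrative numerical check (not from the notes) of the particular part of (5.157), with derivatives taken by central differences; the function names are my own.

```python
import math

def yp(t):
    """Particular solution assembled in Step 3 (with a1 = -8/169, a0 = 6/13**3)."""
    return ((math.exp(-t) / 52) * (3*math.cos(2*t) + 2*math.sin(2*t))
            + (t/12) * math.sin(3*t) * math.exp(-2*t)
            - (t**2/4) * math.cos(3*t) * math.exp(-2*t)
            + 12/13**3 - (16/169)*t + (2/13)*t**2)

def forcing(t):
    return 0.5*math.exp(-t)*math.cos(2*t) + 3*t*math.exp(-2*t)*math.sin(3*t) + 2*t**2

def residual(t, h=1e-4):
    """y'' + 4y' + 13y - g(t), derivatives by central differences."""
    d2 = (yp(t + h) - 2*yp(t) + yp(t - h)) / h**2
    d1 = (yp(t + h) - yp(t - h)) / (2*h)
    return d2 + 4*d1 + 13*yp(t) - forcing(t)
```

The residual is zero up to finite-difference error at any t, which also confirms the coefficients −8/169 and 6/13³ in (5.155).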

5.3 Resonance
We now consider the effects of sinusoidal forcing on a linearly damped oscillator in detail.
Many systems have unforced modes of oscillation that are subjected to weak damping, musical
instruments being obvious examples. Other examples include

• wine glasses

• waves in a bathtub, lake, bay, swimming pool

• mass-spring systems

• RLC circuits

• pendulums

etc., etc., etc.


If such a system is forced periodically with a period close to one of its natural modes of oscillation
then the system can respond with a large amplitude oscillation. This phenomenon is called resonance.
Examples include the proverbial opera singer shattering a wine glass when she sings at the right
pitch. This occurs when sound waves of frequency close to the resonance frequency of the wine
glass reflect off it. Wires hum in a high wind when eddies are shed off the wire at a frequency close
to a natural frequency of standing waves in the wire. This process was blamed for the failure of
the Tacoma Narrows Bridge in 1940 under high wind conditions (lots of videos on the web). Many
young children (and maybe university students?) have fun generating large waves in a bathtub
by moving back and forth at just the right frequency. A much larger scale example of this is the
huge tides in the Bay of Fundy and Ungava Bay in northern Quebec which are generated by small
amplitude tidal waves propagating along the coast. These force waves in the bays with frequency
close to that of a standing wave in the bay. There is lots of information on this phenomenon in the
Bay of Fundy available on the web.

5.3.1 The simple harmonic oscillator


The simplest model of resonant-type phenomena is the linear, weakly-damped simple harmonic
oscillator subject to sinusoidal forcing, e.g., the linear mass-spring system with weak damping or
an RLC circuit with low resistance.
So, we now study in detail the solutions of

    mx'' + γx' + kx = F cos(Ωt). (5.158)

As it stands there are five parameters: the mass m, the damping coefficient γ, the spring
constant k, the forcing amplitude F and the forcing frequency Ω. All except the forcing amplitude
are positive (the first three must be; taking Ω > 0 is a choice). As we know we can reduce the
number of parameters by non-dimensionalizing the problem. First divide by m and set

    ω0 = √(k/m) (5.159)

to write the equation as

    x'' + (γ/m) x' + ω0² x = (F/m) cos(Ωt). (5.160)
Then introduce a nondimensional time via

    τ = ω0 t (5.161)

to obtain

    ω0² d²x/dτ² + (γ/m) ω0 dx/dτ + ω0² x = (F/m) cos( (Ω/ω0) τ ). (5.162)

Dividing by ω0² and defining Ω̃ = Ω/ω0 gives

    d²x/dτ² + ( γ/(ω0 m) ) dx/dτ + x = ( F/(mω0²) ) cos( Ω̃τ ). (5.163)

Next define

    2δ = γ/(ω0 m) (5.164)

and let

    x = ( F/(mω0²) ) x̃ (5.165)

to get our final equation

    d²x̃/dτ² + 2δ dx̃/dτ + x̃ = cos( Ω̃τ ). (5.166)
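The reduction above can be packaged in a few lines (illustrative, not from the notes; the function name is my own):

```python
import math

def nondimensional_parameters(m, gamma, k, Omega):
    """Reduce the five dimensional parameters of m x'' + gamma x' + k x = F cos(Omega t)
    to the two parameters (delta, Omega_tilde) of the dimensionless equation (5.166);
    F only sets the length scale F/(m omega0^2) and drops out."""
    omega0 = math.sqrt(k / m)
    delta = gamma / (2 * m * omega0)       # from 2*delta = gamma/(omega0*m)
    return delta, Omega / omega0
```

For example, m = 2, γ = 0.4, k = 8, Ω = 3 gives ω0 = 2, δ = 0.05 and Ω̃ = 1.5.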

There are only two parameters: the nondimensional damping coefficient δ and the non-dimensional
forcing frequency Ω̃ which is the ratio of the forcing frequency to the natural frequency of the
undamped system. Note that because the governing differential equation is linear we were able to
scale x to eliminate the forcing amplitude parameter. Note also that to completely understand the
behaviour of solutions of the linear damped oscillator we need only vary these two parameters, not
five different parameters as in the original dimensional equation. This is an example of the power
of nondimensionalization.

5.3.2 Solution of the equation for the forced, under-damped simple harmonic
oscillator
In the following we will drop the tildes and consider the equation

d2 x dx 
2
+ 2δ + x = cos Ωτ . (5.167)
dτ dτ

We are going to focus on the under-damped case for which 0 < δ < 1 because for the under-
damped case the unforced system has oscillatory solutions that decay in time rather than solutions
that decay in time without oscillating. Resonance is a phenomenon that only occurs when the
unforced system has oscillatory behaviour. The undamped case δ = 0 is considered below.
The homogeneous solution of the DE is
xh = c1 e^(−δτ) cos(√(1 − δ²) τ) + c2 e^(−δτ) sin(√(1 − δ²) τ). (5.168)

A particular solution can be found by first finding a solution of

d²x/dτ² + 2δ dx/dτ + x = e^(iΩτ) (5.169)

and then taking the real part. Setting xp = B e^(iΩτ) and substituting into the governing equation
gives

P(iΩ)B = (−Ω² + 2δΩi + 1)B = 1 (5.170)
where P(λ) = λ² + 2δλ + 1 is the characteristic polynomial for the differential equation. P(iΩ) is
a complex number which we can write as

P(iΩ) = 1 − Ω² + 2δΩi = r e^(iα), (5.171)

where

r = |P(iΩ)| = √((1 − Ω²)² + 4δ²Ω²),
r cos(α) = 1 − Ω², (5.172)
r sin(α) = 2δΩ.

Note that for δ > 0 we have r sin(α) > 0, so α ∈ (0, π) and

α = arctan(2δΩ/(1 − Ω²)) with α ∈ (0, π). (5.173)

This is not the usual range for the arctan function. Using this form, α is the phase lag of the
response relative to the forcing.

From (5.170) we have

B = (1/r) e^(−iα) = A e^(−iα), (5.174)

where

A = 1/√((1 − Ω²)² + 4δ²Ω²). (5.175)

Hence, the solution of (5.169) is

xp = A e^(i(Ωτ − α)). (5.176)

The real-valued part, which gives the solution of (5.167), is

xp = A cos(Ωτ − α). (5.177)

This is the form of the solution we will use. You will see the solution written in other ways, e.g., as

xp = A sin(Ωτ + θ) (5.178)

where A is unchanged and α = π/2 − θ. One advantage of this form is that the angle θ lies between
−π/2 and π/2, so θ = arctan((1 − Ω²)/(2δΩ)) uses the standard range for arctan.

Interpretation of the Solution

The general solution of (5.167) is


x = xh + xp (5.179)
where the homogeneous solution xh is given by (5.168) and the particular solution is given by either
(5.177) or (5.178). The two undetermined constants c1 and c2 in the homogeneous solution are
determined by prescribed initial conditions.

1. If the damping coefficient is non-zero then the homogeneous solution decays with time leaving
the particular solution xp which is independent of the initial conditions. After a sufficiently
long time the system has no memory of its initial state.

2. Consider the phase lag α.

(a) Since δΩ ≥ 0, sin α > 0, hence α ∈ (0, π). Since A is positive the response A cos(Ωτ − α)
lags the forcing cos(Ωτ) by the angle α (see Figure 5.3). Using cos α = A(1 − Ω²) we see
that α ∈ (0, π/2) if Ω < 1 and α ∈ (π/2, π) if Ω > 1. Similarly θ ∈ (−π/2, π/2) and θ is
easily found from θ = arctan((1 − Ω²)/(2δΩ)). To find α use α = π/2 − θ, or

α = π/2 − arctan((1 − Ω²)/(2δΩ)) (5.180)

(here the arctan function has range (−π/2, π/2)). The phase lag α is plotted as a function
of Ω for different values of the damping coefficient in Figure 5.4. Note that when Ω = 1
(forcing frequency equals the natural undamped frequency of oscillation) the phase lag is
α = π/2.

(b) As δ → 0, sin α → 0 so α → 0 or α → π. If Ω < 1, α ∈ (0, π/2), hence we must have α → 0:
in the limit δ → 0 the response is in phase with the forcing. If Ω > 1, α → π
and the response is 180° out of phase. These limiting behaviours can be seen in Figure
5.4, where α(Ω) → 0 for Ω < 1 and α(Ω) → π for Ω > 1 as δ → 0.


Figure 5.3: Comparisons of cos(Ωt) (solid) with cos(Ωt − α) for α = π/4 (dots) and α = 3π/4 (dashed).


Figure 5.4: Phase lag as a function of frequency for δ = 0.05 and from 0.1 to 0.8 in increments of 0.1.

3. Now consider the amplitude of the response.

(a) As Ω → 0 the amplitude goes to 1. As Ω → ∞,

A = 1/√((1 − Ω²)² + 4δ²Ω²) ∼ 1/Ω², (5.181)

since the terms under the square root sign are dominated by Ω⁴ for large Ω. Thus,
A → 0 like 1/Ω² as Ω → ∞.
(b) As δ → 0 the amplitude goes to 1/|1 − Ω²| for all values of Ω ≠ 1. We will consider this case
below.
(c) At frequencies for which the amplitude of the response has a local maximum (as a
function of the frequency) we have

∂A/∂Ω = −(1/2) [−4Ω(1 − Ω²) + 8δ²Ω] / [(1 − Ω²)² + 4δ²Ω²]^(3/2) = 0. (5.182)

This gives Ω = 0 or Ω² = 1 − 2δ². The second gives a local maximum provided δ < 1/√2.
In this case the local maximum is

Amax = 1/√((2δ²)² + 4δ²(1 − 2δ²)) = 1/(2δ√(1 − δ²)), (5.183)

which goes to infinity as δ → 0 while the frequency of maximum response goes to 1. The
maximum amplitude of the response goes to 1 as δ → 1/√2 as the frequency of maximum
response goes to 0. Figure 5.5 shows the amplitude of the response as a function of Ω
for different values of the damping coefficient δ. As the damping shrinks, the region with
a large response near Ω = √(1 − 2δ²) gets narrower. It is the large response in a narrow
frequency range centred near Ω = √(1 − 2δ²) that is referred to as resonance: the ampli-
tude of the response is much larger than the forcing amplitude (non-dimensionalized to
1).
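The peak formulas can be checked by a brute-force scan of A(Ω) (a sketch using a few sample values of δ, all below 1/√2):

```python
import math

def A(W, d):
    """Response amplitude (5.175) as a function of forcing frequency W."""
    return 1 / math.sqrt((1 - W**2)**2 + 4*d**2*W**2)

results = []
for d in (0.05, 0.2, 0.5):                       # sample damping values
    grid = [i * 1e-4 for i in range(1, 20001)]   # scan 0 < W <= 2
    W_peak = max(grid, key=lambda W: A(W, d))
    results.append((W_peak, math.sqrt(1 - 2*d**2),        # scanned vs formula
                    A(W_peak, d), 1/(2*d*math.sqrt(1 - d**2))))
for row in results:
    print(row)   # each scanned peak location/height matches the formulas
```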


Figure 5.5: Amplitude of the response as a function of frequency for δ = 0.05 and from 0.1 to 0.8 in
increments of 0.1. The dotted curve passes through the peaks of the amplitude response curves.

5.4 Undamped Case


If there is no damping the DE reduces to

ẍ + x = cos(Ωτ ). (5.184)

The general solution is

x = A cos τ + B sin τ + (1/(1 − Ω²)) cos(Ωτ) if Ω ≠ 1 (5.185)

or

x = A cos τ + B sin τ + (τ/2) sin τ if Ω = 1. (5.186)
For (δ, Ω) = (0, 1) there is unbounded growth. When Ω = 1 the amplitude of the solution for
the under-damped problem with δ > 0 goes to infinity as δ → 0.
Consider the initial value problem x(0) = ẋ(0) = 0 for Ω ≠ 1. From (5.185) we have

x = (1/(1 − Ω²)) [cos(Ωτ) − cos τ] (5.187)

if Ω ≠ 1. When Ω = 1 the solution (5.186) satisfying x(0) = ẋ(0) = 0 is

x = (τ/2) sin τ. (5.188)


Figure 5.6: Solutions of the IVP ẍ + x = cos(Ωτ) for initial conditions x(0) = ẋ(0) = 0. The dotted lines are
the slowly varying amplitude (envelope) of the fast oscillations. The solid grey line is τ/2. (a) Ω = 0.9. (b)
Ω = 0.95.

The terms in brackets in the solution for Ω ≠ 1 have the form cos(α) − cos(β). Using

cos(α) = cos((α + β)/2 + (α − β)/2)
       = cos((α + β)/2) cos((α − β)/2) − sin((α + β)/2) sin((α − β)/2) (5.189)

and

cos(β) = cos((α + β)/2 − (α − β)/2)
       = cos((α + β)/2) cos((α − β)/2) + sin((α + β)/2) sin((α − β)/2) (5.190)

gives

cos(α) − cos(β) = 2 sin((α + β)/2) sin((β − α)/2). (5.191)

Thus

x = (2/(1 − Ω²)) sin((1 − Ω)τ/2) sin((1 + Ω)τ/2). (5.192)

Now consider Ω close to 1. To do this let Ω = 1 − ε where |ε| ≪ 1. Then

x = (2/(2ε − ε²)) sin(ετ/2) sin((1 − ε/2)τ). (5.193)

The last factor gives sinusoidal oscillations with frequency 1 − ε/2 ≈ 1. The first sine term
gives sinusoidal oscillations with frequency ε/2 ≪ 1. The above expression can be interpreted
as sinusoidal oscillations with frequency close to 1 multiplied by a slowly varying amplitude
2 sin(ετ/2)/(2ε − ε²). For times which are not too large, i.e., ετ ≲ 0.1, sin(ετ/2) ≈ ετ/2 and

x ≈ (τ/2) sin τ. (5.194)
This is precisely the solution for Ω = 1.
Two solutions of the undamped case, for Ω = 0.9 and Ω = 0.95, are plotted in Figure 5.6. The
slowly varying amplitude (also called the envelope), given by 2/(1 − Ω²) sin((1 − Ω)τ/2), is plotted
with the dotted lines. Note the initial near-linear growth (given by τ/2) of the amplitude
of the fast oscillations, indicated by the grey curve. As Ω → 1 the envelope gets longer and
larger in amplitude, and the number of fast oscillations for which the approximation τ/2 is valid
goes to infinity.
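The equivalence of the two forms (5.187) and (5.192), and the fact that (5.187) solves the IVP, are easy to confirm numerically (Ω = 0.95 as in the figure):

```python
import math

W = 0.95   # forcing frequency close to 1, as in Figure 5.6(b)

def x_ivp(t):
    """Closed form (5.187) for the IVP x(0) = x'(0) = 0."""
    return (math.cos(W*t) - math.cos(t)) / (1 - W**2)

def x_beats(t):
    """Product form (5.192): slowly varying envelope times fast oscillation."""
    return 2/(1 - W**2) * math.sin((1 - W)*t/2) * math.sin((1 + W)*t/2)

def ode_residual(t):
    """x'' + x - cos(W t) for (5.187), using its exact second derivative."""
    xdd = (-W**2*math.cos(W*t) + math.cos(t)) / (1 - W**2)
    return xdd + x_ivp(t) - math.cos(W*t)

pts = (0.5, 3.7, 12.0, 40.0)
same = max(abs(x_ivp(t) - x_beats(t)) for t in pts)
res = max(abs(ode_residual(t)) for t in pts)
print(same, res)   # both rounding-error small
```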
Problems of this type are often cast in terms of multiple time scales by defining a 'slow' time
scale τ̃ = ετ and writing the solution as

x = A(τ̃) sin((1 − ε/2)τ), (5.195)

where A is a slowly-varying amplitude as it varies with the slow time scale τ̃. The method of
multiple time-scales, or more generally multiple scales, is a powerful mathematical technique
used to find approximate solutions using perturbation theory and asymptotic analysis.

5.5 Method of Variation of Parameters


The method of variation of parameters, due to Lagrange in 1774, is a method that can be used to
find a particular solution of

L[y] ≡ y″ + P(x)y′ + Q(x)y = R(x) (5.196)

when two linearly independent solutions y1 and y2 of the homogeneous equation are known. Al-
though as yet we don’t know how to find two linearly independent homogeneous solutions in general
(in fact this is not possible), we do for constant coefficient DEs. We can use this method to find
particular solutions for more general forcing functions than those considered above.
The idea is to find a particular solution in the form

yp = v1 (x)y1 (x) + v2 (x)y2 (x) (5.197)

and find v1 and v2. Note this is an extension of the method we used to find a particular solution of
a linear first-order equation. There, given a homogeneous solution yh, we found a particular solution
yp in the form v(x)yh(x), where v was determined by a first-order equation for v′. This time we will
find a coupled system of first-order equations for v1′ and v2′. Note also that any function can be
expressed in this form, and in doing so v1 can be any function: given v1 and yp we can solve for v2.
Similarly we could choose v2 to be anything and that determines v1. So there is a lot of freedom in
the choice of the functions v1 and v2. Finally, note that adding a constant to v1 or v2 adds a constant
multiple of one of the homogeneous solutions to yp, hence we have another particular solution. This
means we only need to find the vj up to a constant.
Differentiating (5.197) we have

yp′ = v1′y1 + v2′y2 + v1y1′ + v2y2′. (5.198)

We now make the choice that

v1′y1 + v2′y2 = 0. (5.199)

With this constraint on v1 and v2, given either v1 or v2 the other function is determined up to a
constant in a way that does not involve a particular solution yp. With this choice

yp′ = v1y1′ + v2y2′ (5.200)

and

yp″ = v1y1″ + v2y2″ + v1′y1′ + v2′y2′. (5.201)

Since y1 and y2 are solutions of the homogeneous equation,

v1y1″ + v2y2″ = −v1(Py1′ + Qy1) − v2(Py2′ + Qy2)
             = −P(v1y1′ + v2y2′) − Q(v1y1 + v2y2) (5.202)
             = −Pyp′ − Qyp.

Hence

yp″ = v1′y1′ + v2′y2′ − Pyp′ − Qyp. (5.203)

Hence yp is a solution of L[y] = R provided

v1′y1′ + v2′y2′ = R. (5.204)

Equations (5.199) and (5.204) provide a system of two linear equations for v1′ and v2′:

v1′y1 + v2′y2 = 0,
(5.205)
v1′y1′ + v2′y2′ = R.
This system has a solution iff

W(y1, y2) = y1y2′ − y1′y2 ≠ 0, (5.206)

which is always true as y1 and y2 are linearly independent solutions of the homogeneous DE. Hence
we can solve for v1′ and v2′, giving

v1′ = −y2(x)R(x)/W(y1, y2)(x),
(5.207)
v2′ = y1(x)R(x)/W(y1, y2)(x),

so

v1 = −∫^x [y2(t)R(t)/W(y1, y2)(t)] dt,
(5.208)
v2 = ∫^x [y1(t)R(t)/W(y1, y2)(t)] dt.

As mentioned above, any anti-derivative will do as adding constants to v1 or v2 adds constant


multiples of the homogeneous solutions which is just another particular solution.
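The formulas (5.208) translate directly into a small numerical routine. The sketch below uses y1 = cos x, y2 = sin x (so W = 1) and the hypothetical test forcing R(x) = x, evaluating v1 and v2 by trapezoid-rule quadrature and checking yp against the ODE with a finite difference:

```python
import math

def trap(g, x, n=2000):
    """Trapezoid rule for the antiderivative: integral of g from 0 to x."""
    h = x / n
    s = 0.5*(g(0.0) + g(x)) + sum(g(i*h) for i in range(1, n))
    return s*h

y1, y2 = math.cos, math.sin    # homogeneous solutions of y'' + y = 0
R = lambda t: t                # hypothetical test forcing; W(y1, y2) = 1 here

def yp(x):
    v1 = -trap(lambda t: y2(t)*R(t), x)    # first line of (5.208)
    v2 =  trap(lambda t: y1(t)*R(t), x)    # second line of (5.208)
    return v1*y1(x) + v2*y2(x)             # eq. (5.197)

h = 1e-3
res = max(abs((yp(x+h) - 2*yp(x) + yp(x-h))/h**2 + yp(x) - R(x))
          for x in (0.5, 1.0, 2.0))
print(res)   # small: yp solves y'' + y = x up to quadrature error
```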

Example: Find the general solution of y″ + y = tan x.


Solution: Two linearly independent homogeneous solutions are y1 = cos x and y2 = sin x. The
Wronskian of these is W(y1, y2) = y1y2′ − y1′y2 = cos x · cos x − (−sin x) · sin x = 1. Hence

v1 = −∫^x (y2 tan x / W) dx = −∫^x sin x tan x dx
   = −∫^x (sin²x / cos x) dx
   = −∫^x (sec x − cos x) dx (5.209)
   = sin x − ln|sec x + tan x|

and

v2 = ∫^x (y1 tan x / W) dx = ∫^x cos x tan x dx
   = ∫^x sin x dx (5.210)
   = −cos x.

Hence a particular solution is

yp = (sin x − ln|sec x + tan x|) cos x − cos x sin x = −cos x ln|sec x + tan x| (5.211)

and the general solution is

y = c1 cos x + c2 sin x − cos x ln | sec x + tan x|. (5.212)
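A finite-difference spot check of the particular solution (5.211):

```python
import math

def yp(x):
    """Particular solution (5.211): -cos(x) ln|sec x + tan x|."""
    return -math.cos(x) * math.log(abs(1/math.cos(x) + math.tan(x)))

h = 1e-4
res = max(abs((yp(x+h) - 2*yp(x) + yp(x-h))/h**2 + yp(x) - math.tan(x))
          for x in (0.3, 0.9, 1.2))
print(res)   # small: yp solves y'' + y = tan x on (-pi/2, pi/2)
```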

5.6 Cauchy-Euler Equations


A linear second-order DE for y(t) of the form

L[y] = at²y″ + bty′ + cy = g(t) (5.213)

where a, b, and c are constant, is called a Cauchy-Euler equation or an equidimensional equation
(the latter because t²y″, ty′ and y all have the same dimensions, as do the three constants).

For the homogeneous problem we try solutions of the form yh = t^r, since then y, ty′ and t²y″
will all be proportional to t^r:

yh′ = r t^(r−1),
yh″ = r(r − 1) t^(r−2), (5.214)

so

L[yh] = [ar(r − 1) + br + c] t^r = 0, (5.215)

giving

ar² + (b − a)r + c = 0. (5.216)
There are three cases:

1. If there are two distinct real roots r1 and r2, then t^(r1) and t^(r2) are two linearly independent
solutions.

2. If there are two complex conjugate roots α ± iβ, take the real and imaginary parts of t^(α+iβ) =
t^α e^(iβ ln|t|) to get t^α cos(β ln|t|) and t^α sin(β ln|t|) as two linearly independent solutions.

3. If there is a single double root r, two linearly independent solutions are t^r and t^r ln|t|.

The latter can be derived by setting y = t^r f(t). Substituting in the homogeneous equation
gives

[ar² + (b − a)r + c] t^r f + [2ar + b] t^(r+1) f′ + a t^(r+2) f″ = 0. (5.217)

The coefficient of f is zero because r is a root of (5.216). Because r is a double root we also have
2ar + b = a. Thus we have

a t^(r+1) f′ + a t^(r+2) f″ = 0 =⇒ f″ + (1/t) f′ = 0, (5.218)

which has f′ = 1/t as a solution, hence f = ln|t|.
As you may guess from the form of the solutions, the Cauchy-Euler problem can be solved
by changing the independent variable t to s, where |t| = e^s. This produces a constant coefficient
differential equation for y(s). See problems below.
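As a concrete check of the double-root case, take a = 1, b = −1, c = 1 (an illustrative choice): then ar² + (b − a)r + c = r² − 2r + 1 has the double root r = 1, and both t and t ln t should solve t²y″ − ty′ + y = 0.

```python
import math

# t^2 y'' - t y' + y = 0 has the double root r = 1 (a=1, b=-1, c=1),
# so y = t and y = t*ln(t) are two linearly independent solutions.
# Each case lists (y, y', y'') computed by hand.
cases = [
    (lambda t: t,             lambda t: 1.0,               lambda t: 0.0),
    (lambda t: t*math.log(t), lambda t: math.log(t) + 1.0, lambda t: 1.0/t),
]
res = max(abs(t**2*ddy(t) - t*dy(t) + y(t))
          for (y, dy, ddy) in cases for t in (0.5, 2.0, 7.0))
print(res)   # pure rounding error
```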

5.7 Problems
1. Suppose P (x) and Q(x) are continuous on an interval [a, b]. A solution of

y″ + P(x)y′ + Q(x)y = 0, (5.219)

on the interval [a, b] that is tangent to the x-axis at any point on this interval must be
identically zero. Why?

2. The pairs of functions (sin x, cos x) and (sin x, sin x − cos x) are distinct pairs of linearly
independent solutions of y″ + y = 0. Thus, pairs of linearly independent solutions of the
homogeneous equation y″ + P(x)y′ + Q(x)y = 0 are not uniquely determined by the equation.

(a) Show that

P(x) = (y1″y2 − y1y2″)/W(y1, y2) (5.220)

and

Q(x) = (y1′y2″ − y2′y1″)/W(y1, y2) (5.221)

(where W(y1, y2) = y1y2′ − y1′y2), so that the equation is uniquely determined by any given
pair of linearly independent solutions.
(b) Use this result to reconstruct the equation y 00 + y = 0 from each of the two pairs of
linearly independent solutions mentioned above.
(c) Use this result to find the equation satisfied by y1 = xe^x and y2 = x.
3. Bessel’s equation
x²y″ + xy′ + (x² − p²)y = 0 (5.222)

with appropriate initial conditions defines new functions called Bessel functions. Verify that
x^(−1/2) sin x is one solution on the interval x > 0 for the case p = 1/2. Use this to find a second
solution. Then find a particular solution of

x²y″ + xy′ + (x² − 1/4)y = 3x^(3/2) sin x; x > 0. (5.223)
4. Consider the general nth-order constant coefficient linear DE

L[y] = P(D)[y] = (a0 + a1D + a2D² + · · · + anD^n)[y] = 0, (5.224)

where D = d/dx. Show that setting y = f(x)e^(qx) gives the following equation for f:

P(D + q)[f] = 0. (5.225)

Show that this can be written as

P(q)f + P′(q)D[f] + (1/2)P″(q)D²[f] + · · · + (1/n!)P^(n)(q)D^n[f] = 0 (5.226)

and that (1/n!)P^(n)(q) = an.
5. Consider the Cauchy-Euler equation
L[y] = at²y″ + bty′ + cy = g(t) (5.227)

where a, b, and c are constants with a ≠ 0. Convert this to a constant-coefficient differential
equation by the substitution |t| = e^s and finding a differential equation for ỹ(s) = y(t(s)).
Convert the homogeneous solutions of your DE for ỹ(s) to homogeneous solutions of y(t) to
recover the solutions of the Cauchy-Euler equation given in section 5.6.
6. Find the general solutions of
(a) y″ + 9y = 9 sec²(3t);

(b) y″ − 2y′ + y = e^t/(1 + t²);

(c) y″ + 4y′ + 4y = t^(−2)e^(−2t);

(d) x²y″ − 3xy′ + 4y = x² ln x for x > 0;

(e) t²y″ − 2ty′ + 2y = t³/(1 + t²).
.

Chapter 6

Systems of Equations

6.1 Example
Consider a double mixing tank (Figure 6.1). Water enters tank 1 and leaves tank 2 at a
constant rate uf. The salt concentration in the inflow, cin, is also constant. There is an exchange
between the two tanks, with water flowing from tank 2 to tank 1 at a rate ue. The volumes of
water in the tanks are constant, so water flows from tank 1 to tank 2 at a rate uf + ue. Let x1(t)
and x2(t) be the masses of salt in the two tanks.
Conservation of salt gives the following two equations:

dx1/dt = uf cin + ue c2(t) − (uf + ue) c1(t),
(6.1)
dx2/dt = (uf + ue) c1(t) − ue c2(t) − uf c2(t),

or

dx1/dt + ((uf + ue)/V1) x1 − (ue/V2) x2 = uf cin,
(6.2)
dx2/dt + ((uf + ue)/V2) x2 − ((uf + ue)/V1) x1 = 0.

This is a system of equations of the form

dx1/dt + a1x1 − a2x2 = s, (6.3)

dx2/dt + a3x2 − a1x1 = 0, (6.4)

where the aj and source term s are positive constants. Note that a2 < a3.
From (6.4) we have

x1 = (1/a1) dx2/dt + (a3/a1) x2. (6.5)

Substituting into (6.3) gives

d²x2/dt² + (a3 + a1) dx2/dt + a1(a3 − a2) x2 = a1s. (6.6)

This is a second-order, linear, constant coefficient differential equation.

Figure 6.1: Double mixing tanks. Vi, xi and ci are the volume, salt mass and salt concentration in tank
i = 1, 2. uf is the inflow volume flux, ue is the exchange volume flux and cin is the salt concentration in the
inflow.

1. To find the homogeneous solution set x2h = e^(λt). Substituting into the DE gives

λ² + (a1 + a3)λ + a1(a3 − a2) = 0, (6.7)

or

λ = −(a1 + a3)/2 ± (1/2)√((a1 + a3)² + 4a1(a2 − a3)). (6.8)

Note that

(a) (a1 + a3)² + 4a1(a2 − a3) = (a1 − a3)² + 4a1a2 > 0 since the aj are all positive.

(b) (a1 + a3)² + 4a1(a2 − a3) < (a1 + a3)² since a2 < a3.

So the roots are real, distinct, and negative. Call them −λ1 and −λ2 where

0 < λ1 = (a1 + a3)/2 − (1/2)√((a1 + a3)² + 4a1(a2 − a3))
       < (a1 + a3)/2 + (1/2)√((a1 + a3)² + 4a1(a2 − a3)) = λ2. (6.9)
The homogeneous solution is
x2h = a e^(−λ1 t) + b e^(−λ2 t). (6.10)
2. A particular solution, by inspection, is

x2p = s/(a3 − a2) = uf cin/(uf/V2) = V2 cin. (6.11)

The solution is

x2 = a e^(−λ1 t) + b e^(−λ2 t) + V2 cin, (6.12)

where a and b are arbitrary constants. Now that x2 is known we can find x1 using (6.5). This gives

x1 = a ((a3 − λ1)/a1) e^(−λ1 t) + b ((a3 − λ2)/a1) e^(−λ2 t) + V1 cin. (6.13)

As t → ∞, x1 → V1 cin and x2 → V2 cin, i.e., the concentration in each tank goes to the inflow
concentration as expected.
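A short forward-Euler integration (with hypothetical values of uf, ue, V1, V2 and cin) confirms the limiting behaviour:

```python
# Hypothetical parameter values for the double tank system (6.2); the salt
# masses should approach V1*cin and V2*cin, as derived above.
uf, ue, V1, V2, cin = 2.0, 1.0, 10.0, 5.0, 3.0
a1, a2, a3 = (uf + ue)/V1, ue/V2, (uf + ue)/V2
s = uf * cin

x1, x2, h = 0.0, 0.0, 0.001          # start with fresh water in both tanks
for _ in range(200000):              # forward Euler out to t = 200
    x1, x2 = x1 + h*(s - a1*x1 + a2*x2), x2 + h*(a1*x1 - a3*x2)
print(x1, x2, V1*cin, V2*cin)        # x1 -> V1*cin, x2 -> V2*cin
```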

6.1.1 Matrix Formulation:
As an alternative derivation we can write the system (6.3)–(6.4) in matrix form as

d~x/dt − A~x = ~s, (6.14)

where

~x = (x1, x2)ᵀ,

A = ( −a1   a2 )
    (  a1  −a3 ),   (6.15)

~s = (s, 0)ᵀ.

We can look for solutions of the homogeneous problem ~x′ = A~x in the form ~xh = ~v e^(λt), where ~v is a
constant vector. Substituting into the differential equation gives

A~v = λ~v (6.16)

so solutions of this form exist if ~v is an eigenvector of the matrix A with eigenvalue λ. The
eigenvalues of A are given by

det(λI − A) = (λ + a1)(λ + a3) − a1a2 = 0. (6.17)

This is identical to (6.7) above, so the eigenvalues are the values −λ1 and −λ2 found above. The
corresponding eigenvectors can be taken as

~v1 = ( a2/(a1 − λ1), 1 )ᵀ = ( (a3 − λ1)/a1, 1 )ᵀ, (6.18)

and

~v2 = ( a2/(a1 − λ2), 1 )ᵀ = ( (a3 − λ2)/a1, 1 )ᵀ. (6.19)
A particular solution of (6.14) is the constant vector

~xp = −A⁻¹~s = ( a3 s/(a1(a3 − a2)), s/(a3 − a2) )ᵀ = ( V1 cin, V2 cin )ᵀ. (6.20)

The general solution is

~x = a~v1 e^(−λ1 t) + b~v2 e^(−λ2 t) + ~xp
   = a ( (a3 − λ1)/a1, 1 )ᵀ e^(−λ1 t) + b ( (a3 − λ2)/a1, 1 )ᵀ e^(−λ2 t) + ( V1 cin, V2 cin )ᵀ, (6.21)

which is identical to our previous solution (6.12)–(6.13).
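The eigenvalue and eigenvector claims can be spot-checked with sample positive values of the aj satisfying a2 < a3 (the numbers below are illustrative only):

```python
import math

a1, a2, a3 = 0.3, 0.2, 0.6        # sample positive values with a2 < a3
disc = math.sqrt((a1 + a3)**2 + 4*a1*(a2 - a3))
lam1 = (a1 + a3)/2 - disc/2       # eigenvalues of A are -lam1 and -lam2
lam2 = (a1 + a3)/2 + disc/2

A = [[-a1, a2], [a1, -a3]]
res = 0.0
for lam in (-lam1, -lam2):
    v = [(a3 + lam)/a1, 1.0]      # eigenvector ((a3 - lam_i)/a1, 1) from (6.18)-(6.19)
    Av = [A[0][0]*v[0] + A[0][1]*v[1], A[1][0]*v[0] + A[1][1]*v[1]]
    res = max(res, max(abs(Av[i] - lam*v[i]) for i in range(2)))
print(res)   # rounding-error small: A v = lambda v for both pairs
```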

6.2 Elimination Method for Systems
For this method, one writes the equations in terms of the differential operator D ≡ d/dt and treats
the system of equations as you would a system of linear algebraic equations.
Notation: The operator (D + a)(D + b) means apply (D + b) first and then apply (D + a) to the
result. If a and b are constant we have

(D + a)(D + b)[x] = (D + a)[x′ + bx]
                  = D[x′ + bx] + a(x′ + bx)
                  = x″ + bx′ + ax′ + abx (6.22)
                  = x″ + (a + b)x′ + abx.

When a and b are constant the operators D+a and D+b commute: (D+a)(D+b) = (D+b)(D+a).
Differential operators with variable coefficients do not commute. For example, (D + 2t)(D + 5) ≠
(D + 5)(D + 2t):

(D + 2t)(D + 5)[x] = (D + 2t)[x′ + 5x]
                   = D[x′ + 5x] + 2t(x′ + 5x)
                   = x″ + 5x′ + 2tx′ + 10tx (6.23)
                   = x″ + (5 + 2t)x′ + 10tx,

while

(D + 5)(D + 2t)[x] = (D + 5)[x′ + 2tx]
                   = D[x′ + 2tx] + 5(x′ + 2tx)
                   = x″ + 2x + 2tx′ + 5x′ + 10tx (6.24)
                   = x″ + (5 + 2t)x′ + (2 + 10t)x.

Example: Find the general solution of

x′ + 2x − y′ = 4t, (6.25)
y′ − 6x + y = 0. (6.26)

Solution: In terms of the differential operator D we write the system as

(D + 2)[x] − D[y] = 4t, (6.27)
−6x + (D + 1)[y] = 0. (6.28)

One can solve this by finding x first and then y, or vice versa.
Method 1: We can eliminate y to obtain a single equation involving x via (D + 1)[(6.27)] + D[(6.28)],
which gives

((D + 1)(D + 2)[x] − (D + 1)D[y]) + (−6D[x] + D(D + 1)[y]) = (D + 1)[4t] − D[0] (6.29)

or

(D² − 3D + 2)[x] = 4 + 4t. (6.30)

The homogeneous problem is

(D − 1)(D − 2)[x] = 0, (6.31)

which has the solutions xh = ae^t + be^(2t). For a particular solution set xp = a0 + a1t to get

x = 5 + 2t + ae^t + be^(2t). (6.32)

Next we need to find y. From (6.25) we have

y′ = x′ + 2x − 4t = 12 + 3ae^t + 4be^(2t). (6.33)

Integrating gives

y = c + 12t + 3ae^t + 2be^(2t). (6.34)

This involves three arbitrary constants. The system is comprised of two coupled first-order DEs,
so the general solution should only involve two. At this point equation (6.25) is satisfied; however,
both equations in the system must be satisfied, and the second is satisfied only for a single value of
c. So to determine the value of c we must now use equation (6.26). Substituting our expressions
for x and y we have

y′ − 6x + y = (12 + 3ae^t + 4be^(2t)) − 6(5 + 2t + ae^t + be^(2t)) + (c + 12t + 3ae^t + 2be^(2t))
            = (12 − 30 + c) + (−12t + 12t) + (3a − 6a + 3a)e^t + (4b − 6b + 2b)e^(2t) (6.35)
            = c − 18.

This should be equal to zero so c = 18. Thus, the solution of the system is

x = 5 + 2t + ae^t + be^(2t),
(6.36)
y = 18 + 12t + 3ae^t + 2be^(2t).

Method 2: Alternatively, you can eliminate x to get a single second-order DE for y, namely

(D − 1)(D − 2)[y] = 24t. (6.37)

Note the operator on the left is identical to the one we had before operating on x. This is no
coincidence. The homogeneous solution is yh = ue^t + ve^(2t). Next find a particular solution in the
form yp = a0 + a1t. The result is the general solution

y = 18 + 12t + ue^t + ve^(2t). (6.38)

Next use (6.28) to immediately get

x = (y′ + y)/6 = 5 + 2t + (u/3)e^t + (v/2)e^(2t). (6.39)

Note that letting u = 3a and v = 2b puts the solution in the same form as obtained in the
first solution. Since these constants are arbitrary, either solution is fine.
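It is easy to confirm numerically that the solution (6.36) satisfies both original equations for arbitrary a and b:

```python
import math

a, b = 1.3, -0.7   # arbitrary constants in the general solution (6.36)
x  = lambda t: 5 + 2*t + a*math.exp(t) + b*math.exp(2*t)
dx = lambda t: 2 + a*math.exp(t) + 2*b*math.exp(2*t)
y  = lambda t: 18 + 12*t + 3*a*math.exp(t) + 2*b*math.exp(2*t)
dy = lambda t: 12 + 3*a*math.exp(t) + 4*b*math.exp(2*t)

r1 = max(abs(dx(t) + 2*x(t) - dy(t) - 4*t) for t in (0.0, 0.5, 1.5))  # eq. (6.25)
r2 = max(abs(dy(t) - 6*x(t) + y(t)) for t in (0.0, 0.5, 1.5))         # eq. (6.26)
print(r1, r2)   # both rounding-error small
```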

Comment: The general linear constant coefficient system has the form

L1 [x] + L2 [y] = f1
(6.40)
L3 [x] + L4 [y] = f2 .

Applying L4 to the first equation and L2 to the second and subtracting gives (L4 L1 − L2 L3 )[x] =
L4 [f1 ] − L2 [f2 ] using the fact that L2 L4 = L4 L2 because the operators have constant coefficients.
Alternatively eliminate x by applying L3 to the first and L1 to the second and subtracting the first
from the second to get (L1 L4 − L3 L2 )[y] = L1 [f2 ] − L3 [f1 ]. The operators acting on x and y in
these two second-order DEs are the same if the operators have constant coefficients.
If L1 L4 − L2 L3 is identically zero the two equations are not linearly independent and in general
have no solution. For example consider

x′ + y′ = 2,
(6.41)
x′ + y′ = 4.

6.3 Introduction to the Phase Plane


The general system of two first-order equations has the form
dx/dt = f(x, y, t), (6.42)
dy/dt = g(x, y, t). (6.43)
Definition: If f and g are independent of t the system is said to be autonomous.

An example of an autonomous system is the double mixing tank problem:

dx1/dt = −a1x1 + a2x2 + s, (6.44)
dx2/dt = a1x1 − a3x2. (6.45)

Another is the unforced linear damped oscillator which, with y(t) = x′(t), can be written as the
system

dx/dt = y, (6.46)
dy/dt = −2δy − x. (6.47)

For the forced linear oscillator the system becomes

dx/dt = y, (6.48)
dy/dt = −2δy − x + cos(Ωt). (6.49)

This is a non-autonomous system.

Figure 6.2: Direction fields and sample phase-plane curves for system (6.51) for s = 0.25.

Autonomous systems have the property that if (x(t), y(t)) is a solution so is (x(t + t0 ), y(t + t0 ))
for any time shift t0 . That is, it doesn’t matter when the start time is. The solution of a non-
autonomous system for initial conditions (x(t0 ), y(t0 )) = (x0 , y0 ) depends on the start time t0 .
If f and g are independent of t and x(t) can be inverted to find t(x), we can form the function
ỹ(x) = y(t(x)). As an abuse of notation we normally dispense with the tilde and denote the new
function as y(x), even though y(t) and y(x) are really different functions. From the equations for
x(t) and y(t) we have

dy/dx = g(x, y)/f(x, y) ≡ h(x, y), (6.50)

which does not involve t. This is called the phase-plane equation. Solutions give the curves on
the x–y plane along which (x(t), y(t)) travels, called trajectories. They do not provide information
on how the point (x(t), y(t)) moves along the curve as t increases. The solution of the system
(x(t), y(t)) does.

Example: Consider

dx/dt = y,
(6.51)
dy/dt = sx.

The phase-plane equation is

dy/dx = sx/y =⇒ y dy = sx dx =⇒ y² = sx² + c. (6.52)

Case A: s > 0. For s > 0 the solutions are hyperbolas. The direction of motion along the
curve can be inferred from the original system: dx/dt = y =⇒ x is increasing if y > 0 and
decreasing if y < 0. See Figure 6.2.

Figure 6.3: Direction fields and sample phase-plane curves for system (6.51) for s = −0.25.

Case B: s < 0. For s < 0 the solutions are ellipses centred at (x, y) = (0, 0). As for Case
A, x is increasing if y > 0 and decreasing if y < 0 so the point (x(t), y(t)) moves clockwise
around the ellipse. See Figure 6.3.
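Along any trajectory of (6.51), the phase-plane result says y² − sx² is constant. A small Runge-Kutta integration (s = −0.25, as in Case B) keeps this quantity fixed up to integration error:

```python
s = -0.25   # Case B: the phase-plane curves are ellipses

def rhs(x, y):
    return y, s*x          # system (6.51)

def rk4_step(x, y, h):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(x, y)
    k2 = rhs(x + h/2*k1[0], y + h/2*k1[1])
    k3 = rhs(x + h/2*k2[0], y + h/2*k2[1])
    k4 = rhs(x + h*k3[0], y + h*k3[1])
    return (x + h/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
            y + h/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1]))

x, y = 1.0, 0.0
c0 = y**2 - s*x**2                 # constant of the motion from (6.52)
for _ in range(10000):             # integrate to t = 10
    x, y = rk4_step(x, y, 0.001)
drift = abs(y**2 - s*x**2 - c0)
print(drift)   # tiny: the trajectory stays on the ellipse y^2 = s x^2 + c
```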

Comments:

1. A trajectory (x(t), y(t)) of an autonomous system cannot back track along the phase-plane
curve, since the velocity (dx/dt, dy/dt) at a point (x, y) does not depend on time.

2. A trajectory (x(t), y(t)) of an autonomous system cannot cross itself, for the same reason.

Definition: A critical point, or fixed point, is a point (x0 , y0 ) where f (x0 , y0 ) = g(x0 , y0 ) = 0.
Note that (x(t), y(t)) = (x0 , y0 ) is a solution of the differential equation, called an equilibrium
solution.

6.3.1 Classification of Critical Points


1. A critical point is said to be stable if all trajectories that start in a neighbourhood of the
critical point stay nearby for all future times. More precisely, a critical point ~x0 = (x0, y0) is
stable if for every ε > 0 there exists a δ > 0 such that if ||~x(0) − ~x0|| < δ then ||~x(t) − ~x0|| < ε
for all t ≥ 0.

2. A critical point that is not stable is said to be unstable.

3. A critical point is said to be asymptotically stable if trajectories that start sufficiently


close to it go to the fixed point as t → ∞. That is, if there exists a δ > 0 such that if
||~x(0) − ~x0 || < δ then limt→∞ ~x(t) = ~x0 .

Figure 6.4: Inward (stable) spiral: direction fields and sample trajectory for system (6.53) for δ = 0.1.

4. A critical point is called a saddle point if some trajectories go to the fixed point while others
go away.

5. A critical point is called a centre if trajectories in its neighbourhood are closed curves going
around the critical point.

Example: System (6.51) has a single critical point at (0, 0). When s > 0 the critical point is unstable.
This is an example of a saddle point. Initial points on the line y = −√s x (y = −x/2 for the value
s = 0.25 used in the figure) go to the fixed point. All others eventually go off to infinity, although
they can approach the critical point very closely before doing so (see Figure 6.2). When s < 0 the
critical point is stable; however, it is not asymptotically stable. In this case the critical point is a
centre (see Figure 6.3).

The linear damped oscillator can be written as a system for x(t) and y(t) = x′(t):

dx/dt = y,
(6.53)
dy/dt = −2δy − x.

Figure 6.4 shows a sample trajectory for δ = 0.1. This is called an inward spiral. When δ < 0
energy is fed into the system and the trajectory spirals out (Figure 6.5).
For linear systems the equations for the critical points, f (x, y) = 0 and g(x, y) = 0, are the
equations for two straight lines. If they are non-parallel there is one critical point. If the lines
are distinct and parallel there are none. If they coincide there are an infinite number. Non-linear
systems can have a much richer behaviour. We consider such a case next.

Figure 6.5: Outward (unstable) spiral: direction fields and sample trajectory for system (6.53) for δ = −0.1.

6.4 Competing Species


Here we consider a simple closed system with two species competing for the same food supply, e.g.,
two species of fish. Let x and y be the populations of the two species. We are assuming that the
population is large so that it makes sense to treat the population as a continuous function of time.
This is a convenient approximation. Consider a single species, x, on its own. The growth rate per
individual, (1/x) dx/dt, is the difference between the average birth rate and the average death rate.
We assume the following:

1. the average birth rate (number of births per unit population) is constant, say β;

2. the average death rate, due to the effects of crowding and increased competition for food, is
proportional to the size of the population. Let δ be its constant of proportionality.

This gives a simple model for population growth, called the logistic equation:

(1/x) dx/dt = β − δx,

or

dx/dt = x(β − δx). (6.54)

As we have already seen, the population increases if x < β/δ and decreases if x > β/δ. In both cases
the population approaches the steady state (equilibrium) solution x = β/δ as t → ∞.
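A forward-Euler sketch (hypothetical rates β = 1, δ = 0.5) shows the approach to β/δ from either side:

```python
beta, delta = 1.0, 0.5            # hypothetical rates; equilibrium beta/delta = 2
finals = []
for x0 in (0.1, 5.0):             # start below and above the equilibrium
    x, h = x0, 0.001
    for _ in range(50000):        # forward Euler out to t = 50
        x += h * x * (beta - delta*x)
    finals.append(x)
print(finals)   # both values close to 2.0
```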
Now consider our two species with populations x and y. In the absence of the other species,
each population is modeled with a logistic equation:
dx/dt = x(β1 − δ1x), (6.55)
dy/dt = y(β2 − δ2y), (6.56)
where the βj and δj are positive constants. We now add the effects of competition by taking into
account the fact that the death rate now depends on the population of both species: the death

rate for x is now δ1 x + α1 y and the death rate for y is now δ2 y + α2 x. This leads to the coupled
nonlinear system
dx/dt = x(β1 − δ1x − α1y),
(6.57)
dy/dt = y(β2 − δ2y − α2x).

This is our model for the competition of two species.
To understand the behaviour of the system it is useful to determine the critical points and
establish their nature. The critical points are solutions of

x(β1 − δ1x − α1y) = 0,
(6.58)
y(β2 − δ2y − α2x) = 0.

There are in general four critical points:

(x1, y1) = (0, 0),
(x2, y2) = (0, β2/δ2),
(x3, y3) = (β1/δ1, 0), (6.59)
(x4, y4) = ((β1δ2 − β2α1)/(δ1δ2 − α1α2), (β2δ1 − β1α2)/(δ1δ2 − α1α2)).

The second and third solutions are the equilibrium solutions for each species in the absence of the
other. The fourth solution comes from solving the system

δ1x + α1y = β1,
(6.60)
α2x + δ2y = β2.

This system may have (i) no solution (e.g., δ1 = α2, α1 = δ2 and β1 ≠ β2, which makes the two
lines parallel and distinct); (ii) solutions along a line if the two equations give the same straight
line on the x–y plane; or (iii) solutions with x < 0 or y < 0, which must be rejected as populations
can't be negative. If x4 and y4 are positive it represents a state in which both species can co-exist.
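As a check, solving the two nullcline equations δ1x + α1y = β1 and α2x + δ2y = β2 by Cramer's rule with the parameter values of the worked example (6.62) recovers its co-existence point:

```python
# Parameter values from the worked example (6.62).
b1, d1, A1 = 1.0, 1.0, 1.0        # beta1, delta1, alpha1
b2, d2, A2 = 0.75, 1.0, 0.5       # beta2, delta2, alpha2

# Cramer's rule for delta1 x + alpha1 y = beta1, alpha2 x + delta2 y = beta2.
det = d1*d2 - A1*A2
x4 = (b1*d2 - b2*A1) / det
y4 = (b2*d1 - b1*A2) / det
print(x4, y4)   # 0.5 0.5
```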
Note that x, y and t can always be scaled to put the system in the form

dx/dt = x(1 − x − y),
(6.61)
dy/dt = y(β − δy − αx).

This is left as an exercise.
Figure 6.6 shows the direction fields and some sample population trajectories for the case

dx/dt = x(1 − x − y),
(6.62)
dy/dt = y(0.75 − y − 0.5x).

The critical points are (0, 0), (0, 0.75), (1, 0) and (0.5, 0.5). Visually we can see that (0.5, 0.5) is a
stable critical point. The other critical points are unstable. The critical points (0, 0.75) and (1, 0)
are saddle points.

Figure 6.6: Direction fields and sample trajectories for the competing species model (6.62).

6.5 Linearization near a critical point


To study the behaviour more closely in the vicinity of the stable point (0.5, 0.5) in our competing
species system (6.62) set x = 0.5 + x̃ and y = 0.5 + ỹ. When (x, y) is close to the critical point
x̃ and ỹ are small. Since 1 − x − y = 1 − (0.5 + x̃) − (0.5 + ỹ) = −x̃ − ỹ and 0.75 − y − 0.5x =
0.75 − (0.5 + ỹ) − 0.5(0.5 + x̃) = −ỹ − 0.5x̃ we have the system
$$\frac{d\tilde{x}}{dt} = -(0.5 + \tilde{x})(\tilde{x} + \tilde{y}), \qquad \frac{d\tilde{y}}{dt} = -(0.5 + \tilde{y})(\tilde{y} + 0.5\tilde{x}), \tag{6.63}$$
or
$$\frac{d\tilde{x}}{dt} = -\frac{1}{2}(\tilde{x} + \tilde{y}) - \tilde{x}^2 - \tilde{x}\tilde{y}, \qquad \frac{d\tilde{y}}{dt} = -\frac{1}{2}\Big(\tilde{y} + \frac{1}{2}\tilde{x}\Big) - \tilde{y}^2 - \frac{1}{2}\tilde{y}\tilde{x}. \tag{6.64}$$
dt 2 2 2
So far we have just translated the dependent variables to shift the critical point to (x̃, ỹ) = (0, 0).
Now we assume x̃ and ỹ are small so that the quadratic terms x̃2 , x̃ỹ and ỹ 2 are negligible to a
first approximation. This gives the approximate linearized system
$$\frac{d\tilde{x}}{dt} = -\frac{1}{2}\tilde{x} - \frac{1}{2}\tilde{y}, \qquad \frac{d\tilde{y}}{dt} = -\frac{1}{4}\tilde{x} - \frac{1}{2}\tilde{y}, \tag{6.65}$$
which can be written as
$$\frac{d\tilde{\mathbf{x}}}{dt} = A\tilde{\mathbf{x}} \tag{6.66}$$
where
$$A = \begin{pmatrix} -\frac{1}{2} & -\frac{1}{2} \\[2pt] -\frac{1}{4} & -\frac{1}{2} \end{pmatrix}. \tag{6.67}$$

The matrix has two real negative eigenvalues, $-\frac{1}{2} \pm \frac{\sqrt{2}}{4}$, so the general solution has the form
$$\tilde{\mathbf{x}} = a\vec{v}_1\, e^{(-\frac{1}{2} - \frac{\sqrt{2}}{4})t} + b\vec{v}_2\, e^{(-\frac{1}{2} + \frac{\sqrt{2}}{4})t} \tag{6.68}$$
where ~v1 and ~v2 are eigenvectors for the corresponding eigenvalues. All solutions of the linearized
system go to zero as expected. This shows that the fixed point is stable: in this case the model
predicts the two populations can co-exist. This is not always the case. For other values of the
model parameters the critical point for which both populations are non-zero can be unstable. In
that case the model predicts that one population goes extinct.
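These eigenvalues are easy to confirm numerically. A short NumPy check (illustrative; not part of the original notes):

```python
import numpy as np

# Coefficient matrix (6.67) of the linearized competing-species system.
A = np.array([[-0.5, -0.5],
              [-0.25, -0.5]])
eigenvalues = np.linalg.eigvals(A)
print(np.sort(eigenvalues))   # approx [-0.8536, -0.1464], i.e. -1/2 -+ sqrt(2)/4
```

Both eigenvalues are real and negative, consistent with a stable node.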
Writing the linearized system as
$$\Big(D + \frac{1}{2}\Big)\tilde{x} + \frac{1}{2}\tilde{y} = 0, \qquad \frac{1}{4}\tilde{x} + \Big(D + \frac{1}{2}\Big)\tilde{y} = 0, \tag{6.69}$$
and using the elimination method, a single equation for x̃ is
$$\Big(D + \frac{1}{2}\Big)^2[\tilde{x}] - \frac{1}{8}\tilde{x} = 0, \tag{6.70}$$
or
$$\Big(D^2 + D + \frac{1}{8}\Big)[\tilde{x}] = 0. \tag{6.71}$$
This is an over-damped oscillator equation again.

6.5.1 General linearization procedure


In general, to study the behaviour more closely in the vicinity of a critical point (x0 , y0 ) of the
system
$$\frac{dx}{dt} = f(x, y), \qquad \frac{dy}{dt} = g(x, y), \tag{6.72}$$
we expand f and g in Taylor series about the critical point (assuming the functions are differentiable):
$$\begin{aligned}
f(x, y) &= f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) \\
&\quad + \Big[\tfrac{1}{2} f_{xx}(x_0, y_0)(x - x_0)^2 + f_{xy}(x_0, y_0)(x - x_0)(y - y_0) + \tfrac{1}{2} f_{yy}(x_0, y_0)(y - y_0)^2\Big] + \cdots, \\
g(x, y) &= g(x_0, y_0) + g_x(x_0, y_0)(x - x_0) + g_y(x_0, y_0)(y - y_0) \\
&\quad + \Big[\tfrac{1}{2} g_{xx}(x_0, y_0)(x - x_0)^2 + g_{xy}(x_0, y_0)(x - x_0)(y - y_0) + \tfrac{1}{2} g_{yy}(x_0, y_0)(y - y_0)^2\Big] + \cdots.
\end{aligned} \tag{6.73}$$
Since (x0, y0) is a critical point, f(x0, y0) = g(x0, y0) = 0. Assuming x − x0 and y − y0 are small we
drop the quadratic and higher-order terms to get the approximate linear system
$$\frac{dx}{dt} = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0), \qquad \frac{dy}{dt} = g(x_0, y_0) + g_x(x_0, y_0)(x - x_0) + g_y(x_0, y_0)(y - y_0). \tag{6.74}$$

In matrix form, with $\tilde{x} = x - x_0$ and $\tilde{y} = y - y_0$ as above,
$$\frac{d\tilde{\mathbf{x}}}{dt} = A\tilde{\mathbf{x}} \tag{6.75}$$
where
$$A = \begin{pmatrix} f_x(\vec{x}_0) & f_y(\vec{x}_0) \\ g_x(\vec{x}_0) & g_y(\vec{x}_0) \end{pmatrix}. \tag{6.76}$$

The eigenvalues of A determine the behaviour of the linear system, which is only an approximation to the behaviour of the nonlinear system in a neighbourhood of the critical point. We can separate the behaviours of the linear system into the following cases:
Major Cases:

A. Two distinct real roots of the same sign =⇒ node;

B. Two distinct real roots of opposite sign =⇒ saddle point;

C. Two complex conjugate roots, not purely imaginary =⇒ spiral;

Minor Cases:

D. Real equal roots =⇒ node;

E. Purely imaginary roots =⇒ centre.

Near a critical point the nonlinear system behaves in the same way as the corresponding linear system provided the linear behaviour is one of the major cases. The behaviour may not be the same globally: for example, for an unstable node the trajectories could go to a stable node rather than off to infinity.
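The case analysis above can be sketched as a small helper that classifies a critical point from the eigenvalues of the Jacobian matrix. The function name is hypothetical, written here for illustration:

```python
import numpy as np

def classify_critical_point(jacobian):
    """Classify a critical point from the eigenvalues of the Jacobian.

    jacobian: 2x2 array [[f_x, f_y], [g_x, g_y]] evaluated at (x0, y0).
    (Illustrative helper; not from the notes.)
    """
    lam = np.linalg.eigvals(np.asarray(jacobian, dtype=complex))
    re, im = lam.real, lam.imag
    if np.allclose(im, 0):                       # real eigenvalues
        if np.isclose(re[0], re[1]):
            return "node (equal roots)"          # minor case D
        if re[0] * re[1] < 0:
            return "saddle point"                # major case B
        return "node"                            # major case A
    if np.allclose(re, 0):                       # purely imaginary
        return "centre"                          # minor case E
    return "spiral"                              # major case C

# Jacobian at (0.5, 0.5) for the competing-species example (6.62):
print(classify_critical_point([[-0.5, -0.5], [-0.25, -0.5]]))   # node
```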

6.6 Nonlinear Oscillations: The Volterra prey-predator model


Vito Volterra (1860–1940) developed a simple model for the populations of two interacting species:
a prey (e.g., rabbits) with unlimited food supply, and a predator (e.g., lynxes). Let x be the
population of the prey. In the absence of any predators the population of the prey grows with
constant birthrate. When predators are present the birth rate is assumed to be unchanged; however, the prey now have a death rate that is proportional to the population of the predators. Hence the population
of the prey is governed by
$$\frac{dx}{dt} = ax - bxy,$$
where a and b are positive constants.
In the absence of prey the predator population declines because it has
no food. Its death rate (number of deaths per unit population per unit time) is assumed to be
constant. When there are prey available its population grows at a rate that is proportional to the
prey’s population. This gives
$$\frac{dy}{dt} = -cy + dxy,$$

[Figure 6.7: Sample orbits for the Volterra model. a = b = 2 and c = d = 1.]

where c and d are positive constants. The combination of the two equations gives a coupled
nonlinear system of first-order differential equations:
$$\frac{dx}{dt} = ax - bxy, \qquad \frac{dy}{dt} = -cy + dxy. \tag{6.77}$$
This system cannot be solved in terms of elementary functions.
The Volterra model has two critical points. One is (x1 , y1 ) = (0, 0). Linearizing about this
point we drop the quadratic terms to get the approximate linear system
 
$$\frac{d\vec{x}}{dt} = \begin{pmatrix} a & 0 \\ 0 & -c \end{pmatrix}\vec{x}. \tag{6.78}$$

The eigenvalues are a > 0 and −c < 0 hence (0, 0) is a saddle point.
To study the behaviour near the second critical point $(x_2, y_2) = \big(\frac{c}{d}, \frac{a}{b}\big)$ we set $\tilde{x} = x - x_2$ and $\tilde{y} = y - y_2$. Letting f(x, y) = ax − bxy and g(x, y) = −cy + dxy we have
$$\begin{aligned}
f_x &= a - by &&\Longrightarrow & f_x(x_2, y_2) &= a - b\frac{a}{b} = 0, \\
f_y &= -bx &&\Longrightarrow & f_y(x_2, y_2) &= -b\frac{c}{d}, \\
g_x &= dy &&\Longrightarrow & g_x(x_2, y_2) &= d\frac{a}{b}, \\
g_y &= -c + dx &&\Longrightarrow & g_y(x_2, y_2) &= -c + d\frac{c}{d} = 0,
\end{aligned} \tag{6.79}$$
so
$$\frac{d\tilde{\mathbf{x}}}{dt} = \begin{pmatrix} f_x(x_2, y_2) & f_y(x_2, y_2) \\ g_x(x_2, y_2) & g_y(x_2, y_2) \end{pmatrix}\tilde{\mathbf{x}} = \begin{pmatrix} 0 & -\frac{bc}{d} \\[2pt] \frac{da}{b} & 0 \end{pmatrix}\tilde{\mathbf{x}}. \tag{6.80}$$

[Figure 6.8: Time series of the prey population x and predator population y for the Volterra model. a = b = 2 and c = d = 1.]

The eigenvalues are purely imaginary: $\lambda = \pm\sqrt{\frac{bc}{d}\cdot\frac{da}{b}}\, i = \pm\sqrt{ac}\, i$. Hence the orbits are ellipses (exercise).
One can show that for the fully nonlinear system the orbits are also closed curves; however, they are not ellipses. They approach ellipses as the orbit approaches the fixed point. Figure 6.7 shows
some example orbits for the case with a = b = 2 and c = d = 1. Time series for a sample solution
are shown in Figure 6.8. Notice that the period for the example with larger amplitude fluctuations
is slightly longer than the period of the smaller amplitude oscillations and the shape of the orbit
on the phase plane is far from elliptical. The fact that the period of oscillation depends on the
amplitude is a common feature of nonlinear oscillations (e.g., the nonlinear pendulum or surface
water waves).
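The closed orbits can be verified numerically. Along solutions of (6.77) the quantity C(x, y) = dx − c ln x + by − a ln y is conserved (a standard invariant of the Volterra model, though not derived in these notes), so a good integrator should keep it nearly constant. A short Python sketch using the RK4 method from Chapter 2:

```python
import numpy as np

# Volterra prey-predator system with a = b = 2, c = d = 1 (Figure 6.7).
a, b, c, d = 2.0, 2.0, 1.0, 1.0

def rhs(z):
    x, y = z
    return np.array([a*x - b*x*y, -c*y + d*x*y])

def rk4_step(z, h):
    k1 = rhs(z)
    k2 = rhs(z + 0.5*h*k1)
    k3 = rhs(z + 0.5*h*k2)
    k4 = rhs(z + h*k3)
    return z + (h/6.0)*(k1 + 2*k2 + 2*k3 + k4)

def C(z):
    # Conserved quantity whose level curves are the closed orbits.
    x, y = z
    return d*x - c*np.log(x) + b*y - a*np.log(y)

z = np.array([2.0, 1.0])      # initial prey/predator populations
C0 = C(z)
for _ in range(5000):         # integrate to t = 50 with h = 0.01
    z = rk4_step(z, 0.01)
print(abs(C(z) - C0))         # drift is tiny: the orbit is closed
```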

6.7 Problems
1. Solve the phase plane equations for the following systems.

(a) $\dfrac{dx}{dt} = y$, $\quad\dfrac{dy}{dt} = 2sy$.

(b) $\dfrac{dx}{dt} = 2y + 2x$, $\quad\dfrac{dy}{dt} = -x$.

(c) $\dfrac{dx}{dt} = 6(y - 2x)^2 + 2$, $\quad\dfrac{dy}{dt} = 3(y - 2x)^2 - 4$.

2. Describe the nature of the critical point (0, 0) of each of the following systems and sketch the
trajectories. Find x(t) and y(t).

(a) $\dfrac{dx}{dt} = 4x - 3y$, $\quad\dfrac{dy}{dt} = 5x - 4y$.

(b) $\dfrac{dx}{dt} = 2y + 2x$, $\quad\dfrac{dy}{dt} = -x$.

(c) $\dfrac{dx}{dt} = x + y$, $\quad\dfrac{dy}{dt} = -x - 3y$.

(d) $\dfrac{dx}{dt} = 2x + \frac{1}{2}y$, $\quad\dfrac{dy}{dt} = -\frac{5}{2}x + y$.

(e) $\dfrac{dx}{dt} = x - 2y$, $\quad\dfrac{dy}{dt} = 2x - 3y$.

(f) $\dfrac{dx}{dt} = 4x - 3y$, $\quad\dfrac{dy}{dt} = 8x + 5y$.

3. Consider the Volterra system (6.77). Eliminate y from the system and obtain a nonlinear
second-order equation for x(t).
4. For the Volterra system show that $\frac{d^2y}{dt^2} \ge 0$ whenever $\frac{dx}{dt} > 0$. What is the meaning of this result in terms of the time series shown in Figure 6.8?
5. Consider the Volterra system linearized about the critical point $\big(\frac{c}{d}, \frac{a}{b}\big)$. Show that the orbits of the linearized system are ellipses.

Chapter 7

Laplace Transforms

Definition: Let f(t) be a function on [0, ∞). The Laplace Transform of f(t) is a function F(s) = L[f](s) defined by
$$F(s) = \mathcal{L}[f](s) = \int_0^\infty e^{-st} f(t)\, dt. \tag{7.1}$$
The domain of F (s) is all values of s for which the integral exists.
The Laplace Transform is another example of an operator that maps a function f(t) to a new function F(s). Another important transform is the Fourier Transform defined by
$$\mathcal{F}[f(x)](k) = \int_{-\infty}^{\infty} e^{ikx} f(x)\, dx. \tag{7.2}$$

The definition of the Fourier Transform has different forms which are simple scalings of one another,
e.g., replace k by 2πk, divide the above by 2π, etc. You can learn about the Fourier Transform in
a future course.
Note: The Laplace Transform is another example of a linear operator. If L[f ](s) and L[g](s) exist
for s > s0 then L[f + cg](s) = L[f ](s) + cL[g](s) exists for s > s0 for any constant c. The proof
follows from linearity of the integral operator and is left as an exercise.

Example: The Laplace Transform of the constant function f (t) = 1 is


$$\mathcal{L}[1](s) = \int_0^\infty e^{-st}\, dt = \Big[-\frac{e^{-st}}{s}\Big]_0^\infty = \frac{1}{s} \quad \text{for } s > 0. \tag{7.3}$$
The transform exists for complex $s = s_r + is_i$ with $\Re\{s\} = s_r > 0$ because we then have $\lim_{t\to\infty} e^{-st} = \lim_{t\to\infty} e^{-s_r t} e^{-is_i t} = 0$.

Example: The Laplace Transform of $f(t) = e^{at}$ for a constant a is
$$\mathcal{L}[e^{at}](s) = \int_0^\infty e^{-st} e^{at}\, dt = \int_0^\infty e^{-(s-a)t}\, dt. \tag{7.4}$$
The integral exists iff s > a, in which case
$$\mathcal{L}[e^{at}](s) = \int_0^\infty e^{-(s-a)t}\, dt = \Big[-\frac{e^{-(s-a)t}}{s - a}\Big]_0^\infty = \frac{1}{s - a}. \tag{7.5}$$
Note that this result can be extended to complex values of a. The transform exists if $s > \Re\{a\}$ since then $\lim_{t\to\infty} e^{-(s-a)t} = \lim_{t\to\infty} e^{-(s-a_r)t} e^{ia_i t} = 0$. This result will prove very useful in the next example.
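The transform pair above is easy to spot-check numerically: truncate the integral at a large upper limit and compare with 1/(s − a). The sample values s = 2, a = 1 below are illustrative choices, not from the notes:

```python
import numpy as np

# Numerical check of L[e^{at}](s) = 1/(s - a) for sample values s = 2, a = 1.
s, a = 2.0, 1.0
t = np.linspace(0, 60, 600001)          # truncate [0, inf) at t = 60
integrand = np.exp(-s*t) * np.exp(a*t)
# Composite trapezoid rule for the truncated Laplace integral.
F = np.sum(0.5*(integrand[1:] + integrand[:-1]) * np.diff(t))
print(F, 1.0/(s - a))                   # both approximately 1.0
```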

Example: Find L[sin bt](s) and L[cos bt](s) where b is a nonzero constant.
Solution: First the hard way.
$$\begin{aligned}
\mathcal{L}[\sin bt](s) &= \int_0^\infty e^{-st}\sin bt\, dt \\
&= \int_0^\infty e^{-st}\frac{d}{dt}\Big(\!-\frac{\cos bt}{b}\Big)\, dt \\
&= \Big[-e^{-st}\frac{\cos bt}{b}\Big]_{t=0}^{t=\infty} - \frac{s}{b}\int_0^\infty e^{-st}\cos bt\, dt \\
&= \frac{1}{b} - \frac{s}{b}\int_0^\infty e^{-st}\frac{d}{dt}\Big(\frac{\sin bt}{b}\Big)\, dt \quad \text{(need } s > 0 \text{ here)} \\
&= \frac{1}{b} - \frac{s}{b}\Big(\Big[e^{-st}\frac{\sin bt}{b}\Big]_0^\infty + \frac{s}{b}\int_0^\infty e^{-st}\sin bt\, dt\Big) \\
&= \frac{1}{b} - \frac{s^2}{b^2}\int_0^\infty e^{-st}\sin bt\, dt.
\end{aligned} \tag{7.6}$$
So
$$\Big(1 + \frac{s^2}{b^2}\Big)\int_0^\infty e^{-st}\sin bt\, dt = \frac{1}{b} \tag{7.7}$$
and hence
$$\mathcal{L}[\sin bt](s) = \frac{b}{b^2 + s^2} \quad \text{for } s > 0. \tag{7.8}$$
Similarly one can show that
$$\mathcal{L}[\cos bt](s) = \frac{s}{b^2 + s^2} \quad \text{for } s > 0. \tag{7.9}$$
This is left as an exercise.

Now the easy way, which is a two-for-one deal. Let f (t) = fr (t) + ifi (t) be a complex valued
function with real and imaginary parts $f_r$ and $f_i$. Since the Laplace Transform is a linear operator
we have L[f ](s) = L[fr + ifi ](s) = L[fr ](s) + iL[fi ](s). In particular, L[eibt ](s) = L[cos bt](s) +
iL[sin bt](s). So we begin by finding the Laplace Transform of eibt . Then the real part is the Laplace
Transform of cos bt and the imaginary part is the Laplace Transform of sin bt.

From our second example
$$\mathcal{L}[e^{ibt}](s) = \frac{1}{s - ib} = \frac{s + ib}{s^2 + b^2}. \tag{7.10}$$
Taking the real part gives
$$\mathcal{L}[\cos bt](s) = \frac{s}{s^2 + b^2}. \tag{7.11}$$
Taking the imaginary part gives
$$\mathcal{L}[\sin bt](s) = \frac{b}{s^2 + b^2}. \tag{7.12}$$
Simple!
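The two-for-one trick is also easy to check numerically: the real and imaginary parts of the truncated integral of $e^{-st}e^{ibt}$ should approach $s/(s^2+b^2)$ and $b/(s^2+b^2)$. The sample values s = 1, b = 2 are illustrative:

```python
import numpy as np

# L[e^{ibt}](s) = (s + ib)/(s^2 + b^2); real part -> L[cos bt], imag -> L[sin bt].
s, b = 1.0, 2.0
t = np.linspace(0, 60, 600001)
f = np.exp(-s*t) * np.exp(1j*b*t)
# Composite trapezoid rule on the truncated integral.
F = np.sum(0.5*(f[1:] + f[:-1]) * np.diff(t))
print(F.real, s/(s**2 + b**2))    # L[cos 2t](1) = 1/5
print(F.imag, b/(s**2 + b**2))    # L[sin 2t](1) = 2/5
```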

Example: Use the linearity of the Laplace Transform to find L[−π + 6eλt + 9 sin(Ωt)].
Solution: Using linearity of the operator gives
$$\begin{aligned}
\mathcal{L}[-\pi + 6e^{\lambda t} + 9\sin(\Omega t)] &= -\pi\mathcal{L}[1] + 6\mathcal{L}[e^{\lambda t}] + 9\mathcal{L}[\sin(\Omega t)] \\
&= -\frac{\pi}{s} + \frac{6}{s - \lambda} + 9\frac{\Omega}{\Omega^2 + s^2},
\end{aligned} \tag{7.13}$$
which is defined for s > max{0, λ} (0 for the constant and sine functions, λ for the exponential function).

7.1 Existence of the Laplace Transform


Definition: A function f(t) has a jump discontinuity at $t_0 \in (a, b)$ if $f^- = \lim_{t\to t_0^-} f(t)$ and $f^+ = \lim_{t\to t_0^+} f(t)$ exist, are finite, and $f^+ \ne f^-$.

Definition: A function f(t) is piecewise continuous on a finite interval [a, b] if f(t) is continuous at every point in [a, b] except possibly at a finite number of points at which f has a jump discontinuity.
Definition: A function f(t) is piecewise continuous on [0, ∞) if f(t) is piecewise continuous on [0, T] for all T > 0. There may be a countable infinity of discrete discontinuities on [0, ∞) but no cluster point; that is, there can be no sequence of points of discontinuity $\{t_n\}$ with a limit point $t_{\lim} = \lim_{n\to\infty} t_n$.

Examples:

1. The function
$$f(t) = \begin{cases} \cos(t) & 0 \le t < 2\pi; \\ 2 & 2\pi \le t < 8; \\ (t - 7)^2 & t \ge 8 \end{cases} \tag{7.14}$$
is piecewise continuous on [0, 20].

2. The function
$$f(t) = \begin{cases} 1 & 2n \le t < 2n + 1; \\ -1 & 2n + 1 \le t < 2(n + 1), \end{cases} \quad \text{for } n = 0, 1, 2, \ldots \tag{7.15}$$
is piecewise continuous on [0, ∞).

3. $f(t) = t^{-1}$ is not piecewise continuous on [0, 2] because $\lim_{t\to 0} f(t)$ does not exist.

Definition: A function f(t) is of exponential order α if there exist constants T and M such that
$$|f(t)| \le M e^{\alpha t} \tag{7.16}$$
for all t ≥ T.

Example: $f(t) = at^n$ is of exponential order α for all α > 0. To demonstrate we use the fact that
$$\lim_{t\to\infty} \frac{e^{\alpha t}}{|at^n|} = \infty \tag{7.17}$$
for all α > 0, as can be deduced by applying L'Hôpital's rule n times. Hence for any M > 0 there is a T such that $e^{\alpha t}/|at^n| > M^{-1}$ for t > T, so $|at^n| \le M e^{\alpha t}$ for t > T. This proves the result.

Theorem: Existence of Laplace Transforms. If f(t) is piecewise continuous on [0, ∞) and of exponential order α then L[f](s) exists for s > α.

Proof: Since f(t) is of exponential order α there exist constants T and M such that $|f(t)| < M e^{\alpha t}$ for t > T. We can split the Laplace Transform integral into two pieces:
$$\mathcal{L}[f](s) = \int_0^T e^{-st} f(t)\, dt + \int_T^\infty e^{-st} f(t)\, dt. \tag{7.18}$$
The first integral exists because it is over a finite interval on which the integrand is piecewise continuous. Hence we just need to show that the second integral exists. On [T, ∞) we have
$$|e^{-st} f(t)| = e^{-st}|f| \le M e^{(\alpha - s)t}, \tag{7.19}$$
so
$$\Big|\int_T^\infty e^{-st} f(t)\, dt\Big| \le \int_T^\infty \big|e^{-st} f(t)\big|\, dt \le M\int_T^\infty e^{(\alpha - s)t}\, dt = M\,\frac{e^{(\alpha - s)T}}{s - \alpha} \quad \text{for } s > \alpha. \tag{7.20}$$
Hence
$$\int_T^\infty e^{-st} f(t)\, dt \tag{7.21}$$
exists and so does the Laplace Transform.
Inverse Laplace Transforms can be computed by calculating an integral in the complex plane. This is beyond the scope of this course; for our needs we will use Laplace Transform tables.
The Laplace transform is very useful for solving linear constant coefficient differential equations
involving piecewise continuous forcing functions.

Example: Find the Laplace Transform of the piecewise continuous function



$$f(t) = \begin{cases} \cos(2t) & 0 < t < \pi; \\ 1 & \pi < t < 4; \\ 0 & t > 4. \end{cases} \tag{7.22}$$

Solution: Using the definition we have


$$\mathcal{L}[f(t)] = \int_0^\infty e^{-st} f(t)\, dt = \int_0^\pi e^{-st}\cos(2t)\, dt + \int_\pi^4 e^{-st}\, dt. \tag{7.23}$$
Now
$$\begin{aligned}
\int_0^\pi e^{-st}\cos(2t)\, dt &= \Re\Big\{\int_0^\pi e^{-st} e^{2it}\, dt\Big\} \\
&= \Re\Big\{\int_0^\pi e^{-(s-2i)t}\, dt\Big\} \\
&= \Re\Big\{\Big[-\frac{e^{-(s-2i)t}}{s - 2i}\Big]_0^\pi\Big\} \\
&= \Re\Big\{-\frac{e^{-(s-2i)\pi} - 1}{s - 2i}\Big\} \\
&= \Re\Big\{(1 - e^{-s\pi})\frac{s + 2i}{s^2 + 4}\Big\} \\
&= \frac{s(1 - e^{-s\pi})}{s^2 + 4},
\end{aligned} \tag{7.24}$$
and
$$\int_\pi^4 e^{-st}\, dt = \Big[-\frac{e^{-st}}{s}\Big]_\pi^4 = \frac{e^{-s\pi} - e^{-4s}}{s}. \tag{7.25}$$
Hence
$$\mathcal{L}[f(t)] = \frac{s(1 - e^{-s\pi})}{s^2 + 4} + \frac{e^{-s\pi} - e^{-4s}}{s}. \tag{7.26}$$
Since all the integrals are over finite intervals the Laplace Transform is defined for all s. For s = 0
note that
$$\lim_{s\to 0} \frac{e^{-s\pi} - e^{-4s}}{s} = \lim_{s\to 0}\big(4e^{-4s} - \pi e^{-s\pi}\big) = 4 - \pi, \tag{7.27}$$
which is the result one obtains by computing the integral directly for s = 0.
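The result (7.26) can be spot-checked numerically at an illustrative value of s (here s = 1) by integrating the definition directly; this check is not part of the notes:

```python
import numpy as np

# Check L[f](s) at s = 1 for the piecewise function (7.22).
s = 1.0
t = np.linspace(0, 4, 400001)                   # f = 0 for t > 4
f = np.where(t < np.pi, np.cos(2*t), 1.0)       # cos(2t) then 1
integrand = np.exp(-s*t) * f
# Composite trapezoid rule on [0, 4].
numeric = np.sum(0.5*(integrand[1:] + integrand[:-1]) * np.diff(t))

closed = s*(1 - np.exp(-s*np.pi))/(s**2 + 4) + (np.exp(-s*np.pi) - np.exp(-4*s))/s
print(numeric, closed)    # the two values agree closely
```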

7.2 Properties of the Laplace Transform
1. Translation in s.
Theorem: If L[f](s) = F(s) exists for s > α then $\mathcal{L}[e^{at}f](s) = F(s - a)$ exists for s > α + a.
Proof:
$$\mathcal{L}[e^{at}f](s) = \int_0^\infty e^{-st} e^{at} f(t)\, dt = \int_0^\infty e^{-(s-a)t} f(t)\, dt = F(s - a). \tag{7.28}$$

Examples:
(a) $\mathcal{L}[1](s) = \frac{1}{s} \implies \mathcal{L}[e^{at}](s) = \frac{1}{s-a}$, as we have already seen.
(b) $\mathcal{L}[\sin\beta t](s) = \frac{\beta}{s^2 + \beta^2} \implies \mathcal{L}[e^{at}\sin\beta t](s) = \frac{\beta}{(s-a)^2 + \beta^2}$.
.

2. Laplace Transform of a Derivative


Theorem: If f(t) is continuous on [0, ∞) and f′(t) is piecewise continuous on [0, ∞), with both of exponential order α, then for s > α
$$\mathcal{L}[f'](s) = s\mathcal{L}[f](s) - f(0). \tag{7.29}$$

Proof:
$$\begin{aligned}
\mathcal{L}[f'](s) &= \int_0^\infty e^{-st} f'(t)\, dt \\
&= \big[e^{-st} f\big]_0^\infty - \int_0^\infty \frac{d}{dt}\big(e^{-st}\big) f(t)\, dt \quad \text{(need continuity of } f \text{ here)} \\
&= -f(0) + s\int_0^\infty e^{-st} f(t)\, dt \quad \text{(need } s > \alpha \text{ here)} \\
&= sF(s) - f(0) \quad \text{(need } s > \alpha \text{ again)}
\end{aligned} \tag{7.30}$$

where F (s) is the Laplace Transform of f (t).

Examples:

(a) Using the Laplace Transform of sin(βt) we can find the Laplace Transform of cos(βt) via
$$\begin{aligned}
\mathcal{L}[\cos(\beta t)](s) &= \mathcal{L}\Big[\frac{1}{\beta}\frac{d}{dt}\sin(\beta t)\Big](s) = \frac{1}{\beta}\mathcal{L}\Big[\frac{d}{dt}\sin(\beta t)\Big](s) \\
&= \frac{1}{\beta}\big(s\mathcal{L}[\sin(\beta t)](s) - \sin(0)\big) \\
&= \frac{s}{\beta}\,\frac{\beta}{s^2 + \beta^2} = \frac{s}{s^2 + \beta^2}.
\end{aligned} \tag{7.31}$$

(b) $\mathcal{L}[1](s) = \mathcal{L}\big[\frac{dt}{dt}\big](s) = s\mathcal{L}[t](s) - 0$. Hence $\mathcal{L}[t](s) = \frac{1}{s}\mathcal{L}[1](s) = \frac{1}{s^2}$. By induction one can show that
$$\mathcal{L}[t^n](s) = \frac{n!}{s^{n+1}}. \tag{7.32}$$
The proof of this is left as an exercise.
(c) Prove the following identity for continuous functions f(t), assuming the transforms exist:
$$\mathcal{L}\Big[\int_0^t f(\tau)\, d\tau\Big](s) = \frac{1}{s}\mathcal{L}[f](s). \tag{7.33}$$
Proof:
$$\begin{aligned}
\mathcal{L}[f](s) &= \mathcal{L}\Big[\frac{d}{dt}\int_0^t f(\tau)\, d\tau\Big](s) \\
&= s\mathcal{L}\Big[\int_0^t f(\tau)\, d\tau\Big](s) - \int_0^t f(\tau)\, d\tau\,\Big|_{t=0} \\
&= s\mathcal{L}\Big[\int_0^t f(\tau)\, d\tau\Big](s). 
\end{aligned} \tag{7.34}$$

3. Higher-order derivatives
Theorem: Let f, f′, f′′, ..., f⁽ⁿ⁻¹⁾ (where f⁽ᵏ⁾ denotes the k-th derivative of f) be continuous on [0, ∞) and let f⁽ⁿ⁾(t) be piecewise continuous on [0, ∞). Let all of these functions be of exponential order α. Then for s > α
$$\mathcal{L}[f^{(n)}](s) = s^n\mathcal{L}[f](s) - s^{n-1}f(0) - s^{n-2}f'(0) - s^{n-3}f''(0) - \cdots - f^{(n-1)}(0). \tag{7.35}$$

Proof: We prove by induction. We have already shown it is true for n = 1. Assume it is true for n = k. Then by the previous result
$$\mathcal{L}[f^{(k+1)}](s) = s\mathcal{L}[f^{(k)}](s) - f^{(k)}(0). \tag{7.36}$$
By assumption
$$\mathcal{L}[f^{(k)}](s) = s^k\mathcal{L}[f](s) - s^{k-1}f(0) - s^{k-2}f'(0) - \cdots - f^{(k-1)}(0), \tag{7.37}$$
therefore
$$\mathcal{L}[f^{(k+1)}](s) = s^{k+1}\mathcal{L}[f](s) - s^k f(0) - s^{k-1}f'(0) - \cdots - sf^{(k-1)}(0) - f^{(k)}(0), \tag{7.38}$$
hence (7.35) is true for n = k + 1 if it is true for n = k. By induction (7.35) is true for all n.
4. Derivatives of Laplace Transforms
Theorem: Let F(s) = L[f](s) and assume f(t) is piecewise continuous on [0, ∞) and of exponential order α. Then for s > α
$$\mathcal{L}[t^n f](s) = (-1)^n \frac{d^n}{ds^n} F(s). \tag{7.39}$$
Proof:
$$\frac{dF}{ds}(s) = \frac{d}{ds}\int_0^\infty e^{-st} f(t)\, dt = \int_0^\infty \frac{d}{ds}\big(e^{-st}\big) f(t)\, dt = -\int_0^\infty e^{-st}\, t f(t)\, dt = -\mathcal{L}[tf](s). \tag{7.40}$$
The result follows by induction.
Exercise: Use this to find the Laplace Transform of $t^n$ using the Laplace Transform of f(t) = 1.

Examples:

(a) Find the Laplace Transform of $f(t) = t^2 e^{2t}$.
Solution:
$$\mathcal{L}[t^2 e^{2t}] = (-1)^2\frac{d^2}{ds^2}\mathcal{L}[e^{2t}] = \frac{d^2}{ds^2}\Big(\frac{1}{s - 2}\Big) = \frac{2}{(s - 2)^3}. \tag{7.41}$$
(b) Find the Laplace Transform of f(t) = t cos(3t).
Solution:
$$\mathcal{L}[t\cos(3t)] = (-1)^1\frac{d}{ds}\mathcal{L}[\cos(3t)] = -\frac{d}{ds}\Big(\frac{s}{s^2 + 9}\Big) = -\Big(\frac{1}{s^2 + 9} - \frac{2s^2}{(s^2 + 9)^2}\Big) = \frac{s^2 - 9}{(s^2 + 9)^2}. \tag{7.42}$$
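Both of these transforms can be confirmed symbolically with SymPy's `laplace_transform`; this is an independent check, not part of the original notes:

```python
import sympy as sp

t, s = sp.symbols('t s', positive=True)

# Example (a): L[t^2 e^{2t}] = 2/(s - 2)^3
F1 = sp.laplace_transform(t**2 * sp.exp(2*t), t, s, noconds=True)
# Example (b): L[t cos(3t)] = (s^2 - 9)/(s^2 + 9)^2
F2 = sp.laplace_transform(t * sp.cos(3*t), t, s, noconds=True)

print(sp.simplify(F1))
print(sp.simplify(F2))
```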

7.3 Inverse Laplace Transforms
Definition: Given a function F(s), if there is a function f(t) that is continuous on [0, ∞) and satisfies
$$\mathcal{L}[f] = F, \tag{7.43}$$
then $f(t) = \mathcal{L}^{-1}[F]$ is called the inverse Laplace Transform of F(s).
If every function f(t) satisfying (7.43) is discontinuous then one can choose any one of them to be the inverse transform. Two piecewise continuous functions satisfying (7.43) can only differ at points of discontinuity, and this is not physically significant.

Example:
$$f_1(t) = \begin{cases} 0, & 0 \le t \le 1; \\ 1, & t > 1, \end{cases} \tag{7.44}$$
and
$$f_2(t) = \begin{cases} 0, & 0 \le t < 1; \\ 1, & t \ge 1, \end{cases} \tag{7.45}$$
have different values at t = 1. They have the same Laplace Transform:
$$\mathcal{L}[f_1] = \mathcal{L}[f_2] = \int_1^\infty e^{-st}\, dt = e^{-s}\int_0^\infty e^{-s\tilde{t}}\, d\tilde{t} = \frac{e^{-s}}{s}. \tag{7.46}$$
We can use
$$\mathcal{L}^{-1}\Big[\frac{e^{-s}}{s}\Big] = f_1 \quad \text{or} \quad \mathcal{L}^{-1}\Big[\frac{e^{-s}}{s}\Big] = f_2. \tag{7.47}$$

7.3.1 Linearity of the Inverse Transform


Assume $\mathcal{L}^{-1}[F_1]$ and $\mathcal{L}^{-1}[F_2]$ exist and are continuous on [0, ∞). Then
$$\mathcal{L}^{-1}[F_1 + cF_2] = \mathcal{L}^{-1}[F_1] + c\mathcal{L}^{-1}[F_2] \tag{7.48}$$
for any constant c. To prove this use
$$\mathcal{L}\Big[\mathcal{L}^{-1}[F_1] + c\mathcal{L}^{-1}[F_2]\Big] = \mathcal{L}\Big[\mathcal{L}^{-1}[F_1]\Big] + c\mathcal{L}\Big[\mathcal{L}^{-1}[F_2]\Big] = F_1 + cF_2. \tag{7.49}$$
The latter step follows because the two inverse transforms are continuous functions. Hence $\mathcal{L}^{-1}[F_1] + c\mathcal{L}^{-1}[F_2]$ is a continuous function whose Laplace Transform is $F_1 + cF_2$, which proves (7.48).

7.3.2 Examples
1. Find
$$\mathcal{L}^{-1}\Big[-\frac{5}{s^3} + \frac{12}{s^2 + 5}\Big]. \tag{7.50}$$
Solution: We know that
$$\mathcal{L}[\sin bt] = \frac{b}{s^2 + b^2} \tag{7.51}$$
and
$$\mathcal{L}[t^2] = \frac{2}{s^3}. \tag{7.52}$$
Using this,
$$\begin{aligned}
\mathcal{L}^{-1}\Big[-\frac{5}{s^3} + \frac{12}{s^2 + 5}\Big] &= \mathcal{L}^{-1}\Big[-\frac{5}{2}\frac{2}{s^3} + \frac{12}{\sqrt{5}}\frac{\sqrt{5}}{s^2 + 5}\Big] \\
&= -\frac{5}{2}\mathcal{L}^{-1}\Big[\frac{2}{s^3}\Big] + \frac{12}{\sqrt{5}}\mathcal{L}^{-1}\Big[\frac{\sqrt{5}}{s^2 + 5}\Big] \\
&= -\frac{5}{2}t^2 + \frac{12}{\sqrt{5}}\sin\sqrt{5}t.
\end{aligned} \tag{7.53}$$

2. Find
$$\mathcal{L}^{-1}\Big[\frac{3}{2s^2 + 8s + 10}\Big]. \tag{7.54}$$
Solution: To solve this problem we need to put F(s) in a form, or combination of forms, that looks like those that appear in the Laplace Transform Table (see Table 7.1). Denominators that are quadratic polynomials in s need to be written in the form (s + a)² + b² or factored as (s − λ1)(s − λ2). In this case we have
$$\frac{3}{2s^2 + 8s + 10} = \frac{3}{2}\,\frac{1}{s^2 + 4s + 5} = \frac{3}{2}\,\frac{1}{(s + 2)^2 + 1}. \tag{7.55}$$
Recall the shift theorem:
$$\mathcal{L}[e^{at} f(t)](s) = \mathcal{L}[f](s - a). \tag{7.56}$$
We know that
$$\mathcal{L}[\sin t] = \frac{1}{s^2 + 1}, \tag{7.57}$$
hence from the shift theorem with a = −2 we have
$$\mathcal{L}[e^{-2t}\sin t] = \frac{1}{(s + 2)^2 + 1}. \tag{7.58}$$
Thus
$$\mathcal{L}^{-1}\Big[\frac{3}{2s^2 + 8s + 10}\Big] = \frac{3}{2}e^{-2t}\sin t. \tag{7.59}$$

3. Find
$$\mathcal{L}^{-1}\Big[\frac{3}{2s^2 + 8s + 6}\Big]. \tag{7.60}$$
Solution: In the previous example the denominator was always positive: $2s^2 + 8s + 10 = 2\big((s + 2)^2 + 1\big) = 0$ has no real roots. Here $2s^2 + 8s + 6 = 2\big((s + 2)^2 - 1\big) = 0$ has two real roots and we can factor: $2s^2 + 8s + 6 = 2(s + 1)(s + 3)$. So for this problem we proceed differently:
$$\frac{3}{2s^2 + 8s + 6} = \frac{3}{2}\,\frac{1}{(s + 1)(s + 3)} = \frac{3}{2}\Big(\frac{1/2}{s + 1} - \frac{1/2}{s + 3}\Big), \tag{7.61}$$
so
$$\mathcal{L}^{-1}\Big[\frac{3}{2s^2 + 8s + 6}\Big] = \frac{3}{4}\mathcal{L}^{-1}\Big[\frac{1}{s + 1}\Big] - \frac{3}{4}\mathcal{L}^{-1}\Big[\frac{1}{s + 3}\Big] = \frac{3}{4}e^{-t} - \frac{3}{4}e^{-3t}. \tag{7.62}$$
The previous problem could have been solved in a similar manner using complex roots (exercise).

4. Find
$$y(t) = \mathcal{L}^{-1}\Big[\frac{3s^3 + 8s^2 + 6s + 46}{(s + 2)^2(s^2 - 2s + 6)}\Big]. \tag{7.63}$$
Solution: Here $s^2 - 2s + 6 = (s - 1)^2 + 5$, so we can't factor the second factor in the denominator of Y(s) unless we use complex roots, which is an option. Write
$$Y(s) = \frac{as + b}{(s + 2)^2} + \frac{cs + d}{s^2 - 2s + 6} = \frac{(a + c)s^3 + (-2a + b + 4c + d)s^2 + (6a - 2b + 4c + 4d)s + 6b + 4d}{(s + 2)^2(s^2 - 2s + 6)}, \tag{7.64}$$
which gives the system of equations
$$a + c = 3, \qquad -2a + b + 4c + d = 8, \qquad 6a - 2b + 4c + 4d = 6, \qquad 6b + 4d = 46, \tag{7.65}$$
from which we obtain
$$(a, b, c, d) = (2, 7, 1, 1). \tag{7.66}$$
So
$$Y(s) = \frac{2s + 7}{(s + 2)^2} + \frac{s + 1}{(s - 1)^2 + 5} = \frac{2}{s + 2} + \frac{3}{(s + 2)^2} + \frac{s - 1}{(s - 1)^2 + 5} + \frac{2}{(s - 1)^2 + 5}. \tag{7.67}$$

Next find the inverses:
$$\begin{aligned}
\mathcal{L}^{-1}\Big[\frac{2}{s + 2}\Big] &= 2e^{-2t}, \\
\mathcal{L}^{-1}\Big[\frac{3}{(s + 2)^2}\Big] &= 3\mathcal{L}^{-1}\Big[\frac{1}{(s + 2)^2}\Big] = 3te^{-2t}, \\
\mathcal{L}^{-1}\Big[\frac{s - 1}{(s - 1)^2 + 5}\Big] &= e^t\cos(\sqrt{5}t), \\
\mathcal{L}^{-1}\Big[\frac{2}{(s - 1)^2 + 5}\Big] &= \frac{2}{\sqrt{5}}e^t\sin(\sqrt{5}t).
\end{aligned} \tag{7.68}$$
Hence
$$y(t) = \mathcal{L}^{-1}\Big[\frac{3s^3 + 8s^2 + 6s + 46}{(s + 2)^2(s^2 - 2s + 6)}\Big] = 2e^{-2t} + 3te^{-2t} + e^t\cos(\sqrt{5}t) + \frac{2}{\sqrt{5}}e^t\sin(\sqrt{5}t). \tag{7.69}$$
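The partial-fraction step (7.64)–(7.67) is easy to verify symbolically; a quick SymPy check (not part of the original notes):

```python
import sympy as sp

s = sp.symbols('s')
# Original rational function (7.63) and its claimed decomposition (7.67).
Y = (3*s**3 + 8*s**2 + 6*s + 46) / ((s + 2)**2 * (s**2 - 2*s + 6))
decomp = (2*s + 7)/(s + 2)**2 + (s + 1)/(s**2 - 2*s + 6)
print(sp.cancel(Y - decomp))   # 0: the decomposition is exact
```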

7.4 Solving constant coefficient differential equations. I


Differential equations can be solved using Laplace Transforms. Taking the Laplace Transform of the general constant-coefficient second-order DE initial value problem
$$L[y] \equiv ay'' + by' + cy = f(t), \qquad y(0) = y_0, \qquad y'(0) = y'_0, \tag{7.70}$$
gives
$$a\big(s^2\mathcal{L}[y] - sy(0) - y'(0)\big) + b\big(s\mathcal{L}[y] - y(0)\big) + c\mathcal{L}[y] = \mathcal{L}[f], \tag{7.71}$$
or
$$(as^2 + bs + c)Y(s) = (as + b)y_0 + ay'_0 + F(s), \tag{7.72}$$
where Y(s) = L[y] and F(s) = L[f](s) is the Laplace Transform of the forcing function. Thus
$$Y(s) = \frac{(as + b)y_0 + ay'_0}{P(s)} + \frac{F(s)}{P(s)}, \tag{7.73}$$
where $P(s) = as^2 + bs + c$ is the characteristic polynomial of the differential equation.

Definition: The function
$$G(s) = \frac{1}{P(s)} \tag{7.74}$$
is called the Transfer Function. It encapsulates the differential operator L. From (7.73) we see that Y(s) is the sum of the Transfer Function multiplied by the function $I(s) = (as + b)y_0 + ay'_0$, which encapsulates the initial conditions, and the Transfer Function multiplied by the Laplace Transform of the forcing function.
Write Y(s) as
$$Y(s) = Y_h(s) + Y_p(s), \tag{7.75}$$
where
$$Y_h(s) = \frac{(as + b)y_0 + ay'_0}{P(s)} \tag{7.76}$$
and
$$Y_p(s) = \frac{F(s)}{P(s)}. \tag{7.77}$$
If the forcing function f(t) is zero then Y(s) = Yh(s); hence $y_h(t) = \mathcal{L}^{-1}[Y_h]$ is a solution of the homogeneous problem that satisfies the initial conditions. If the initial conditions are zero, i.e., $y_0 = y'_0 = 0$, then Yh(s) = 0 and $y_p = \mathcal{L}^{-1}[Y_p]$ is a particular solution for which $y_p(0) = y'_p(0) = 0$, i.e., yp is the particular solution satisfying homogeneous initial conditions.
We now illustrate with some examples.

Example 1: Consider
$$y'' - 2y' - y = 0, \qquad y(0) = 1, \qquad y'(0) = -2. \tag{7.78}$$

Let Y(s) = L[y]. Taking the Laplace Transform of the ODE we have
$$\mathcal{L}[y'' - 2y' - y] = \mathcal{L}[0] = 0. \tag{7.79}$$

Using the rules for the Laplace Transform of a derivative gives

$$\begin{aligned}
\mathcal{L}[y'' - 2y' - y] &= \mathcal{L}[y''] - 2\mathcal{L}[y'] - \mathcal{L}[y] \\
&= \big(s^2\mathcal{L}[y] - sy(0) - y'(0)\big) - 2\big(s\mathcal{L}[y] - y(0)\big) - \mathcal{L}[y] = 0.
\end{aligned} \tag{7.80}$$

So
$$(s^2 - 2s - 1)Y(s) = sy(0) + y'(0) - 2y(0) = s - 4, \tag{7.81}$$
or
$$Y(s) = \frac{s - 4}{s^2 - 2s - 1}. \tag{7.82}$$
Note that the denominator is the characteristic polynomial for the DE. This is always the case.
The numerator comes from the initial conditions. The denominator can be factored:

$$s^2 - 2s - 1 = (s - \lambda_1)(s - \lambda_2), \tag{7.83}$$
where
$$\lambda_1 = 1 - \sqrt{2} < 0 \quad \text{and} \quad \lambda_2 = 1 + \sqrt{2} > 0. \tag{7.84}$$
Writing
$$Y(s) = \frac{s - 4}{s^2 - 2s - 1} = \frac{a}{s - \lambda_1} + \frac{b}{s - \lambda_2}, \tag{7.85}$$
we find $a = (2 + 3\sqrt{2})/4$ and $b = 1 - a = (2 - 3\sqrt{2})/4$. Taking the inverse,
$$y(t) = \mathcal{L}^{-1}\Big[\frac{a}{s - \lambda_1} + \frac{b}{s - \lambda_2}\Big] = ae^{\lambda_1 t} + be^{\lambda_2 t} = \frac{2 + 3\sqrt{2}}{4}e^{\lambda_1 t} + \frac{2 - 3\sqrt{2}}{4}e^{\lambda_2 t}. \tag{7.86}$$

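A quick symbolic check that $y(t) = \frac{2 + 3\sqrt{2}}{4}e^{(1-\sqrt{2})t} + \frac{2 - 3\sqrt{2}}{4}e^{(1+\sqrt{2})t}$ solves the initial value problem (SymPy sketch, not part of the original notes):

```python
import sympy as sp

t = sp.symbols('t')
r2 = sp.sqrt(2)
l1, l2 = 1 - r2, 1 + r2
y = (2 + 3*r2)/4 * sp.exp(l1*t) + (2 - 3*r2)/4 * sp.exp(l2*t)

# Residual of y'' - 2y' - y = 0 and the initial conditions y(0)=1, y'(0)=-2.
print(sp.simplify(sp.diff(y, t, 2) - 2*sp.diff(y, t) - y))    # 0
print(sp.simplify(y.subs(t, 0)), sp.simplify(sp.diff(y, t).subs(t, 0)))
```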
Example 2: For the next example consider
$$y'' + 4y' + 4y = e^{2t} + \cos(t), \qquad y(0) = 1, \qquad y'(0) = -1. \tag{7.87}$$

Taking the Laplace Transform of the equation gives
$$s^2 Y(s) - sy(0) - y'(0) + 4\big(sY(s) - y(0)\big) + 4Y(s) = \frac{1}{s - 2} + \frac{s}{s^2 + 1}, \tag{7.88}$$
so
$$\begin{aligned}
Y(s) &= \frac{sy(0) + y'(0) + 4y(0)}{s^2 + 4s + 4} + \frac{1}{(s - 2)(s^2 + 4s + 4)} + \frac{s}{(s^2 + 1)(s^2 + 4s + 4)} \\
&= \frac{s + 3}{(s + 2)^2} + \frac{1}{(s - 2)(s + 2)^2} + \frac{s}{(s^2 + 1)(s + 2)^2}. 
\end{aligned} \tag{7.89}$$

Note that P (s) = s2 + 4s + 4 = (s + 2)2 is the characteristic polynomial for the differential equation.
We write the equation for Y (s) as

Y (s) = Yh (s) + Yp1 (s) + Yp2 (s) (7.90)

where Yh (s) is the piece that comes from the initial conditions and Yp1 and Yp2 are the terms
involving the transform of the two forcing functions. We find the inverse transform of each piece
separately.

1. Considering the first piece we have
$$Y_h(s) = \frac{s + 3}{(s + 2)^2} = \frac{s + 2 + 1}{(s + 2)^2} = \frac{1}{s + 2} + \frac{1}{(s + 2)^2}, \tag{7.91}$$
which has inverse
$$y_h(t) = \mathcal{L}^{-1}\big[Y_h(s)\big] = \mathcal{L}^{-1}\Big[\frac{1}{s + 2}\Big] + \mathcal{L}^{-1}\Big[\frac{1}{(s + 2)^2}\Big] = e^{-2t} + te^{-2t}. \tag{7.92}$$
For the latter we have used $\mathcal{L}[t] = 1/s^2$ along with the shift theorem: $\mathcal{L}[te^{-2t}] = 1/(s + 2)^2$. Note that this solution, which does not involve the forcing functions, is a linear combination of $e^{-2t}$ and $te^{-2t}$, which are two linearly independent solutions of the homogeneous equation. This solution satisfies the initial conditions: $y_h(0) = 1$ and $y_h'(t) = -2e^{-2t} + e^{-2t} - 2te^{-2t}$ is equal to −1 at t = 0.

2. For the second piece we need to use partial fractions to separate it into pieces whose inverse transforms we can find. Setting
$$Y_{p1}(s) = \frac{1}{(s - 2)(s + 2)^2} = \frac{a}{s - 2} + \frac{bs + c}{(s + 2)^2} = \frac{as^2 + 4as + 4a + bs^2 + (c - 2b)s - 2c}{(s - 2)(s + 2)^2}, \tag{7.93}$$
gives the system
$$a + b = 0, \qquad 4a - 2b + c = 0, \qquad 4a - 2c = 1, \tag{7.94}$$
from the coefficients of s², s and the constant part of the numerator. The solution of this system is
$$(a, b, c) = \Big(\frac{1}{16}, -\frac{1}{16}, -\frac{6}{16}\Big). \tag{7.95}$$
Hence
$$Y_{p1}(s) = \frac{1}{(s - 2)(s + 2)^2} = \frac{1}{16}\frac{1}{s - 2} - \frac{1}{16}\frac{s + 6}{(s + 2)^2} = \frac{1}{16}\frac{1}{s - 2} - \frac{1}{16}\Big(\frac{1}{s + 2} + \frac{4}{(s + 2)^2}\Big). \tag{7.96}$$
The inverse transform is
$$y_{p1}(t) = \mathcal{L}^{-1}\Big[\frac{1}{(s - 2)(s + 2)^2}\Big] = \frac{1}{16}e^{2t} - \frac{1}{16}\big(e^{-2t} + 4te^{-2t}\big) = \frac{1}{16}e^{2t} - \frac{1}{16}e^{-2t} - \frac{1}{4}te^{-2t}. \tag{7.97}$$
This is a particular solution of the differential equation for the forcing term $e^{2t}$. The first term of $y_{p1}(t)$, $e^{2t}/16$, is itself a particular solution, as
$$(D^2 + 4D + 4)\Big(\frac{1}{16}e^{2t}\Big) = \frac{1}{16}(4 + 8 + 4)e^{2t} = e^{2t}. \tag{7.98}$$
The second and third terms are a linear combination of two linearly independent homogeneous solutions. This combination is such that the particular solution $y_{p1}$ satisfies homogeneous initial conditions: $y_{p1}(0) = y_{p1}'(0) = 0$. Checking this is a useful check on your solution (exercise).

3. Finally, we consider the third piece. Using partial fractions again we have
$$\begin{aligned}
Y_{p2}(s) = \frac{s}{(s^2 + 1)(s + 2)^2} &= \frac{1}{25}\frac{3s + 4}{s^2 + 1} - \frac{1}{25}\frac{3s + 16}{(s + 2)^2} \\
&= \frac{1}{25}\Big(\frac{3s}{s^2 + 1} + \frac{4}{s^2 + 1}\Big) - \frac{1}{25}\Big(\frac{3}{s + 2} + \frac{10}{(s + 2)^2}\Big).
\end{aligned} \tag{7.99}$$
151
[Figure 7.1: The Heaviside function.]

The inverse transform is
$$y_{p2}(t) = \mathcal{L}^{-1}\big[Y_{p2}\big] = \frac{1}{25}\big(3\cos(t) + 4\sin(t)\big) - \frac{1}{25}\big(3e^{-2t} + 10te^{-2t}\big). \tag{7.100}$$
The part involving cos t and sin t is a particular solution for the forcing cos t. The remaining two pieces are a homogeneous solution that makes the particular solution $y_{p2}(t)$ one that satisfies homogeneous initial conditions, i.e., $y_{p2}(0) = y_{p2}'(0) = 0$. This is a useful check of your solution (exercise).

Putting the three pieces together:
$$\begin{aligned}
y &= \big(e^{-2t} + te^{-2t}\big) + \Big(\frac{1}{16}e^{2t} - \frac{1}{16}e^{-2t} - \frac{1}{4}te^{-2t}\Big) \\
&\quad + \frac{1}{25}\big(3\cos(t) + 4\sin(t)\big) - \frac{1}{25}\big(3e^{-2t} + 10te^{-2t}\big) \\
&= \frac{327}{400}e^{-2t} + \frac{35}{100}te^{-2t} + \frac{1}{16}e^{2t} + \frac{3}{25}\cos t + \frac{4}{25}\sin t.
\end{aligned} \tag{7.101}$$
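The final answer can be checked by substituting it back into the differential equation; a SymPy sketch (not part of the original notes):

```python
import sympy as sp

t = sp.symbols('t')
y = (sp.Rational(327, 400)*sp.exp(-2*t) + sp.Rational(35, 100)*t*sp.exp(-2*t)
     + sp.exp(2*t)/16 + sp.Rational(3, 25)*sp.cos(t) + sp.Rational(4, 25)*sp.sin(t))

# Residual of y'' + 4y' + 4y = e^{2t} + cos(t); should vanish identically.
residual = sp.diff(y, t, 2) + 4*sp.diff(y, t) + 4*y - sp.exp(2*t) - sp.cos(t)
print(sp.simplify(residual))                      # 0
print(y.subs(t, 0), sp.diff(y, t).subs(t, 0))     # initial conditions 1, -1
```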

7.5 The Heaviside Function


Definition: The Heaviside function H(t) (or unit step function) is defined by
$$H(t) = \begin{cases} 0 & t < 0; \\ 1 & t \ge 0, \end{cases} \tag{7.102}$$

(see Figure 7.1).


Comments:

1. Sometimes H(t) is defined as 0 for t ≤ 0 and 1 for t > 0.

2. While H(t) is a fairly traditional notation for the Heaviside function other terminology is
used, e.g., u(t).

Any piecewise continuous function can be expressed in terms of the Heaviside function; values may differ at points of discontinuity. The basic idea is that
$$f(t) = f_1(t) + \big(f_2(t) - f_1(t)\big)H(t - t_0) \tag{7.103}$$
is equal to $f_1(t)$ for $t < t_0$ and to $f_1 + (f_2 - f_1) = f_2$ for $t > t_0$. Thus the function f(t) given by (7.103) jumps from $f_1(t)$ to $f_2(t)$ at $t = t_0$. We can have any number of jumps, as the next example illustrates.

Example: Consider the function
$$y(t) = \begin{cases} \cos t & 0 \le t < \pi; \\ 2 & \pi \le t \le 2\pi; \\ \sin t & 2\pi < t \le 4\pi; \\ t - 10 & t > 4\pi. \end{cases} \tag{7.104}$$
We can write this as
$$\tilde{y}(t) = \cos t + \big(2 - \cos t\big)H(t - \pi) + \big(\sin t - 2\big)H(t - 2\pi) + \big(t - 10 - \sin t\big)H(t - 4\pi). \tag{7.105}$$
For t < π all the Heaviside functions are zero, so ỹ(t) = cos t for t < π. For π < t < 2π the first step function is one and the other two are zero, so ỹ(t) = cos t + 2 − cos t = 2, etc. I've called this function ỹ to highlight the fact that it may not be equal to y(t) at the points of discontinuity, and in fact this is the case: y(2π) = 2 but ỹ(2π) = sin(2π) = 0.
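The construction (7.105) is easy to test numerically. A small Python sketch (the helper names `H` and `y_tilde` are illustrative, not from the notes):

```python
import numpy as np

def H(t):
    # Heaviside step: 0 for t < 0, 1 for t >= 0.
    return np.where(t >= 0, 1.0, 0.0)

def y_tilde(t):
    # Rebuild (7.105): cos t, then 2, then sin t, then t - 10.
    return (np.cos(t) + (2 - np.cos(t))*H(t - np.pi)
            + (np.sin(t) - 2)*H(t - 2*np.pi)
            + (t - 10 - np.sin(t))*H(t - 4*np.pi))

# One sample point inside each of the four pieces.
vals = y_tilde(np.array([1.0, 4.0, 8.0, 14.0]))
print(vals)   # cos(1), then 2, then sin(8), then 14 - 10 = 4
```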

7.5.1 Laplace Transform of the Heaviside function


For $t_0 \ge 0$:
$$\mathcal{L}\big[H(t - t_0)\big] = \int_0^\infty e^{-st}H(t - t_0)\, dt = \int_{t_0}^\infty e^{-st}\, dt = \frac{e^{-st_0}}{s} \tag{7.106}$$
and
$$\mathcal{L}^{-1}\Big[\frac{e^{-st_0}}{s}\Big] = H(t - t_0). \tag{7.107}$$

Translation in t

Let F(s) = L[f](s) exist for s > α. If a > 0 is a constant then
$$\mathcal{L}\big[f(t - a)H(t - a)\big](s) = e^{-as}F(s) \tag{7.108}$$
exists for s > α, and
$$\mathcal{L}^{-1}\big[e^{-as}F(s)\big] = f(t - a)H(t - a). \tag{7.109}$$
Proof: By definition
$$\begin{aligned}
\mathcal{L}\big[f(t - a)H(t - a)\big](s) &= \int_0^\infty e^{-st} f(t - a)H(t - a)\, dt \\
&= \int_a^\infty e^{-st} f(t - a)\, dt \quad \text{(since } H(t - a) = 0 \text{ for } t < a \text{ and } 1 \text{ for } t > a\text{)} \\
&= \int_0^\infty e^{-s(\tilde{t} + a)} f(\tilde{t})\, d\tilde{t} \quad (\tilde{t} = t - a) \\
&= e^{-sa}\int_0^\infty e^{-s\tilde{t}} f(\tilde{t})\, d\tilde{t} = e^{-sa}F(s).
\end{aligned} \tag{7.110}$$

Exercise: What is different if a < 0?

Corollary:
$$\mathcal{L}\big[f(t)H(t - a)\big](s) = e^{-as}\mathcal{L}\big[f(t + a)\big]. \tag{7.111}$$

Examples:

1. Example 1: Find the Laplace Transform of
$$f(t) = \begin{cases} \sin t & 0 \le t < \frac{\pi}{4}; \\ \sin t + \cos\big(t - \frac{\pi}{4}\big) & t \ge \frac{\pi}{4}. \end{cases} \tag{7.112}$$
Solution: Write f(t) as
$$f(t) = \sin t + H\Big(t - \frac{\pi}{4}\Big)\cos\Big(t - \frac{\pi}{4}\Big). \tag{7.113}$$
Then
$$\mathcal{L}[f] = \mathcal{L}[\sin t] + \mathcal{L}\Big[H\Big(t - \frac{\pi}{4}\Big)\cos\Big(t - \frac{\pi}{4}\Big)\Big] = \frac{1}{s^2 + 1} + e^{-\frac{\pi}{4}s}\mathcal{L}[\cos t] = \frac{1}{s^2 + 1} + e^{-\frac{\pi}{4}s}\frac{s}{s^2 + 1}. \tag{7.114}$$
2. Example 2: Find the Laplace Transform of
$$f(t) = \begin{cases} \sin t & 0 \le t < \frac{\pi}{4}; \\ \sin t + \sin^2 t & t \ge \frac{\pi}{4}. \end{cases} \tag{7.115}$$
Solution: Write f(t) as
$$f(t) = \sin t + H\Big(t - \frac{\pi}{4}\Big)\Big(\frac{1}{2} - \frac{\cos(2t)}{2}\Big), \tag{7.116}$$
so
$$\mathcal{L}[f] = \mathcal{L}[\sin t] + \frac{1}{2}\mathcal{L}\Big[H\Big(t - \frac{\pi}{4}\Big)\Big] - \frac{1}{2}\mathcal{L}\Big[H\Big(t - \frac{\pi}{4}\Big)\cos 2t\Big] = \frac{1}{s^2 + 1} + \frac{1}{2}\frac{e^{-\frac{\pi}{4}s}}{s} - \frac{1}{2}\mathcal{L}\Big[H\Big(t - \frac{\pi}{4}\Big)\cos 2t\Big]. \tag{7.117}$$
We must do some work on the last term to put it in the form H(t − a)f(t − a) so we can use the shift theorem. To do this we use
$$\cos 2t = \cos\Big(2\Big(t - \frac{\pi}{4}\Big) + \frac{\pi}{2}\Big) = -\sin\Big(2\Big(t - \frac{\pi}{4}\Big)\Big). \tag{7.118}$$
Then
$$\mathcal{L}\Big[H\Big(t - \frac{\pi}{4}\Big)\cos 2t\Big] = -\mathcal{L}\Big[H\Big(t - \frac{\pi}{4}\Big)\sin\Big(2\Big(t - \frac{\pi}{4}\Big)\Big)\Big] = -e^{-\frac{\pi}{4}s}\mathcal{L}[\sin 2t] = -e^{-\frac{\pi}{4}s}\frac{2}{s^2 + 4}. \tag{7.119}$$
Thus
$$\mathcal{L}[f](s) = \frac{1}{s^2 + 1} + \frac{1}{2}\frac{e^{-\frac{\pi}{4}s}}{s} + \frac{e^{-\frac{\pi}{4}s}}{s^2 + 4}. \tag{7.120}$$

3. Example 3: Find $\mathcal{L}[t^2 H(t - 1)]$.
Solution:
$$\begin{aligned}
\mathcal{L}[t^2 H(t - 1)] &= \mathcal{L}[(t - 1 + 1)^2 H(t - 1)] = \mathcal{L}\big[\big((t - 1)^2 + 2(t - 1) + 1\big)H(t - 1)\big] \\
&= \mathcal{L}[(t - 1)^2 H(t - 1)] + 2\mathcal{L}[(t - 1)H(t - 1)] + \mathcal{L}[H(t - 1)] \\
&= e^{-s}\Big(\frac{2}{s^3} + \frac{2}{s^2} + \frac{1}{s}\Big).
\end{aligned} \tag{7.121}$$
Alternatively, using the corollary (7.111),
$$\mathcal{L}[t^2 H(t - 1)] = e^{-s}\mathcal{L}\big[(t + 1)^2\big] = e^{-s}\mathcal{L}\big[t^2 + 2t + 1\big] = e^{-s}\Big(\frac{2}{s^3} + \frac{2}{s^2} + \frac{1}{s}\Big). \tag{7.122}$$

4. Example 4: Find
$$f(t) = \mathcal{L}^{-1}\Big[\frac{e^{-2s}}{s^2}\Big]. \tag{7.123}$$
Solution: We use
$$\mathcal{L}\big[f(t - a)H(t - a)\big](s) = e^{-as}\mathcal{L}[f](s) = e^{-as}F(s). \tag{7.124}$$
Here we identify $e^{-2s}$ with $e^{-as}$, hence a = 2, and $\frac{1}{s^2}$ with F(s). Using
$$\mathcal{L}^{-1}\Big[\frac{1}{s^2}\Big] = t, \tag{7.125}$$
we have
$$\frac{e^{-2s}}{s^2} = e^{-2s}\mathcal{L}[t] = \mathcal{L}\big[H(t - 2)(t - 2)\big], \tag{7.126}$$
so
$$\mathcal{L}^{-1}\Big[\frac{e^{-2s}}{s^2}\Big] = (t - 2)H(t - 2). \tag{7.127}$$
5. Example 5: The current I in an LC circuit is governed by the initial value problem

I″ + 4I = g(t),
I(0) = I′(0) = 0,    (7.128)

where

g(t) = {  1,   0 < t < 1;
         −1,   1 < t < 2;    (7.129)
          0,   t > 2.

Determine the current I(t).

Solution: Let J(s) = L[I](s). Taking the Laplace Transform of the equation we have

s2 J(s) − sI(0) − I 0 (0) + 4J(s) = L[g](s). (7.130)

or, after using the initial conditions,


J(s) = L[g](s)/(s² + 4).    (7.131)
In terms of the Heaviside function

L[g](s) = L[1 − 2H(t − 1) + H(t − 2)] = 1/s − 2e^{−s}/s + e^{−2s}/s,    (7.132)
so
J(s) = F (s) − 2e−s F (s) + e−2s F (s) (7.133)
where
F(s) = 1/( s(s² + 4) ) = 1/(4s) − (1/4) · s/(s² + 4).    (7.134)
The inverse Laplace Transform of F(s) is

f(t) = L^{−1}[F(s)] = (1/4) L^{−1}[1/s] − (1/4) L^{−1}[ s/(s² + 4) ]
     = 1/4 − (1/4) cos(2t)    (7.135)
     = ( 1 − cos(2t) )/4.

Now we use

L^{−1}[ e^{−as} F(s) ] = f(t − a)H(t − a)    (7.136)
where F(s) = L[f]. Thus

I(t) = L^{−1}[J(s)] = L^{−1}[F(s)] − 2 L^{−1}[ e^{−s} F(s) ] + L^{−1}[ e^{−2s} F(s) ]
     = f(t) − 2 f(t − 1)H(t − 1) + f(t − 2)H(t − 2)
     = ( 1 − cos 2t )/4 − 2 ( 1 − cos 2(t − 1) )/4 · H(t − 1)    (7.137)
       + ( 1 − cos 2(t − 2) )/4 · H(t − 2).

The forcing function and the solution and its first two derivatives are shown in Figure 7.2. The solution and its first derivative are continuous. The second derivative has a jump discontinuity. This can be deduced from the equation:

I″ + 4I = 1 − 2H(t − 1) + H(t − 2).    (7.138)

The right hand side is piecewise continuous with jumps at t = 1 and t = 2, hence so is the left hand side. Differentiation reduces differentiability: if y is a C¹ function (a continuous function with a continuous first derivative) with a discontinuous second derivative, then the derivative y′ is continuous with a discontinuous first derivative. So among the terms on the left-hand side the term with the highest derivative, I″, must have the same level of continuity/differentiability as the right-hand side. In this case I″ must be piecewise continuous with jump discontinuities at t = 1 and t = 2. Then its integral, I′, is a continuous function with jumps in its derivative at t = 1 and t = 2, and I is a C¹ function with jumps in its curvature at t = 1 and t = 2.
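The Laplace-Transform solution of Example 5 can be cross-checked by integrating the ODE directly in time. The sketch below (the step size, end time, and use of the classical RK4 scheme are my choices, not part of the notes) time-steps I″ + 4I = g(t) as a first-order system and compares the result with the closed form (7.137).

```python
import math

# Hedged sketch: cross-check of Example 5 by direct time integration.
# Step size, end time, and the classical RK4 scheme are my choices.
def g(t):
    # the forcing (7.129)
    return 1.0 if t < 1.0 else (-1.0 if t < 2.0 else 0.0)

def rhs(t, y):
    # first-order system for (I, I'): I' = Ip, Ip' = g(t) - 4I
    I, Ip = y
    return (Ip, g(t) - 4.0 * I)

def rk4(t_end, h=1e-3):
    t, y = 0.0, (0.0, 0.0)  # initial conditions I(0) = I'(0) = 0
    for _ in range(round(t_end / h)):
        k1 = rhs(t, y)
        k2 = rhs(t + h/2, (y[0] + h/2*k1[0], y[1] + h/2*k1[1]))
        k3 = rhs(t + h/2, (y[0] + h/2*k2[0], y[1] + h/2*k2[1]))
        k4 = rhs(t + h, (y[0] + h*k3[0], y[1] + h*k3[1]))
        y = (y[0] + h/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
             y[1] + h/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1]))
        t += h
    return y[0]

def I_closed(t):
    # the Laplace-Transform solution (7.137)
    f = lambda u: (1.0 - math.cos(2.0 * u)) / 4.0
    return f(t) - (2.0 * f(t - 1.0) if t > 1.0 else 0.0) \
                + (f(t - 2.0) if t > 2.0 else 0.0)

numeric, exact = rk4(3.0), I_closed(3.0)
print(numeric, exact)
```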

7.6 Convolutions
Definition: Let f (t) and g(t) be piecewise continuous functions on [0, ∞). The convolution of
f (t) and g(t) is a new function denoted by f ∗ g defined as
f ∗ g(t) = ∫_0^t f(t − v) g(v) dv.    (7.139)

Thus, ∗ is an operator which takes in two functions and returns a new function.
Properties: The convolution operator has the following properties:

(i) f ∗ g(t) = g ∗ f (t);

(ii) f ∗ (g + ch) = f ∗ g + cf ∗ h.

(iii) f ∗ (g ∗ h) = (f ∗ g) ∗ h;

where f, g and h are piecewise continuous functions on [0, ∞) and c is a constant. The convolution operator is a bilinear operator.

Proof:

Figure 7.2: Solution of the circuit problem (example 5). (a) The forcing function g. (b) The solution I(t). (c) The first derivative I′(t). (d) The second derivative I″(t).

(i) By definition

f ∗ g(t) = ∫_0^t f(t − v) g(v) dv.    (7.140)

Make the substitution s = t − v. Then dv = −ds and s = t when v = 0 and s = 0 when v = t, so

f ∗ g(t) = ∫_t^0 f(s) g(t − s) (−ds) = ∫_0^t g(t − s) f(s) ds = g ∗ f(t).    (7.141)

(ii) Follows from linearity of integration.

(iii) By definition

(f ∗ g) ∗ h(t) = ∫_{v=0}^t (f ∗ g)(t − v) h(v) dv = ∫_{v=0}^t [ ∫_{s=0}^{t−v} f(t − v − s) g(s) ds ] h(v) dv    (7.142)

and

f ∗ (g ∗ h)(t) = ∫_{s=0}^t f(t − s) (g ∗ h)(s) ds = ∫_{s=0}^t f(t − s) [ ∫_{v=0}^s g(s − v) h(v) dv ] ds.    (7.143)

The latter integral is over the triangle 0 ≤ v ≤ s, 0 ≤ s ≤ t in the s-v plane. This is also the triangle 0 ≤ v ≤ t, v ≤ s ≤ t, so

f ∗ (g ∗ h)(t) = ∫_{v=0}^t ∫_{s=v}^t f(t − s) g(s − v) h(v) ds dv.    (7.144)

Making the substitution q = s − v to replace s by q we have

f ∗ (g ∗ h)(t) = ∫_{v=0}^t ∫_{q=0}^{t−v} f(t − q − v) g(q) h(v) dq dv.    (7.145)

Replacing q by s (i.e., changing the name of the dummy variable of integration) gives (7.142), proving the result.

Examples:

1. Find the convolution of f (t) = cos t and g(t) = 1.


(f ∗ g)(t) = ∫_0^t cos(t − v) dv = [ −sin(t − v) ]_0^t = −( sin(0) − sin(t) ) = sin(t).    (7.146)

Note: with the identifications f(t) = cos t and g(t) = 1, the factor g(v) does not appear explicitly in the integral since g(v) = 1. Alternatively,
(f ∗ g)(t) = (g ∗ f)(t) = ∫_0^t cos(v) dv = [ sin(v) ]_0^t = sin(t).    (7.147)

2. Find the convolution of f (t) = cos t and g(t) = sin t.


(f ∗ g)(t) = ∫_0^t cos(t − v) sin(v) dv = ∫_0^t (1/2)[ sin(t − v + v) − sin(t − v − v) ] dv
           = (1/2) ∫_0^t sin(t) dv − (1/2) ∫_0^t sin(t − 2v) dv
           = (1/2) t sin t − (1/2) [ cos(t − 2v)/2 ]_0^t    (7.148)
           = (t sin t)/2 − (1/2) · ( cos(−t) − cos(t) )/2
           = (t sin t)/2.
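Both convolution examples are easy to confirm numerically. The sketch below (the panel count and the evaluation point t = 1.5 are arbitrary choices) evaluates the convolution integral with the trapezoid rule.

```python
import math

# Hedged sketch: trapezoid-rule evaluation of the convolution integral.
# Panel count and the evaluation point t = 1.5 are arbitrary choices.
def convolve(f, g, t, N=20_000):
    h = t / N
    total = 0.5 * (f(t) * g(0.0) + f(0.0) * g(t))
    for k in range(1, N):
        v = k * h
        total += f(t - v) * g(v)
    return h * total

t = 1.5
ex1 = convolve(math.cos, lambda v: 1.0, t)   # Example 1: should be sin t
ex2 = convolve(math.cos, math.sin, t)        # Example 2: should be t sin(t)/2
print(ex1, math.sin(t))
print(ex2, t * math.sin(t) / 2.0)
```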

7.6.1 The Convolution Theorem
Theorem: Let f (t) and g(t) be piecewise continuous on [0, ∞) and be of exponential order α. Let
F (s) = L[f ] and G(s) = L[g]. Then

L[f ∗ g] = F (s)G(s), (7.149)

which means
L−1 [F (s)G(s)] = f ∗ g(t). (7.150)

Proof:

L[f ∗ g] = ∫_0^∞ e^{−st} [ ∫_0^t f(t − v) g(v) dv ] dt
         = ∫_0^∞ e^{−st} [ ∫_0^∞ H(t − v) f(t − v) g(v) dv ] dt     (H(t − v) = 0 for v > t)
         = ∫_0^∞ g(v) [ ∫_0^∞ e^{−st} H(t − v) f(t − v) dt ] dv     (change order of integration)
         = ∫_0^∞ g(v) [ ∫_v^∞ e^{−st} f(t − v) dt ] dv              (H(t − v) = 0 for t < v)
         = ∫_0^∞ g(v) [ ∫_0^∞ e^{−s(t̃+v)} f(t̃) dt̃ ] dv            (t̃ = t − v)    (7.151)
         = ∫_0^∞ g(v) e^{−sv} [ ∫_0^∞ e^{−st̃} f(t̃) dt̃ ] dv
         = ∫_0^∞ g(v) e^{−sv} F(s) dv
         = F(s) ∫_0^∞ g(v) e^{−sv} dv
         = F(s) G(s).

This result can be useful for finding inverse Laplace Transforms.

Examples:

1. Find

L^{−1}[ a / ( s²(s² + a²) ) ].    (7.152)

Solution:

L^{−1}[ a / ( s²(s² + a²) ) ] = L^{−1}[ (1/s²) · a/(s² + a²) ].    (7.153)

Using

L^{−1}[1/s²] = t    (7.154)

and

L^{−1}[ a/(s² + a²) ] = sin(at)    (7.155)
and the convolution theorem we have
L^{−1}[ a / ( s²(s² + a²) ) ] = ( sin(at) ∗ t )(t)
   = ∫_0^t sin( a(t − v) ) v dv
   = ∫_0^t (d/dv)[ cos(a(t − v))/a ] v dv
   = [ v cos(a(t − v))/a ]_{v=0}^{v=t} − (1/a) ∫_0^t cos( a(t − v) ) dv    (7.156)
   = t/a + (1/a) [ sin(a(t − v))/a ]_{v=0}^{v=t}
   = ( at − sin(at) )/a².

2. Find the solution of


y″ + 4y = f(t),
y(0) = 3,    (7.157)
y′(0) = −1,

in terms of f(t).

Solution: Taking the Laplace Transform of the differential equation gives

(s2 + 4)Y (s) = 3s − 1 + F (s) (7.158)

where F(s) = L[f]. Hence

Y(s) = 3 s/(s² + 4) − (1/2) · 2/(s² + 4) + F(s)G(s)    (7.159)

where

G(s) = 1/(s² + 4)    (7.160)
is the Transfer Function for the DE. Let
g(t) = L^{−1}[G(s)] = (1/2) sin(2t).    (7.161)
Using the convolution theorem on the last term we have
y(t) = 3 cos(2t) − (1/2) sin(2t) + f ∗ g(t)
     = 3 cos(2t) − (1/2) sin(2t) + (1/2) ∫_0^t f(t − v) sin(2v) dv.    (7.162)
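Formula (7.162) can be tested with a forcing simple enough to solve exactly. Taking f(t) = 1 (my choice, not from the notes), the problem y″ + 4y = 1, y(0) = 3, y′(0) = −1 has the exact solution y = 1/4 + (11/4) cos 2t − (1/2) sin 2t, which the convolution formula must reproduce; the sketch below checks this at one sample point.

```python
import math

# Hedged sketch: check of (7.162) for the particular forcing f(t) = 1
# (my choice).  Exact solution: y = 1/4 + (11/4) cos 2t - (1/2) sin 2t.
def y_formula(t, N=20_000):
    # y(t) = 3 cos 2t - (1/2) sin 2t + (1/2) int_0^t 1 * sin(2v) dv,
    # with the integral evaluated by the trapezoid rule
    h = t / N
    integral = 0.5 * (math.sin(0.0) + math.sin(2.0 * t))
    for k in range(1, N):
        integral += math.sin(2.0 * k * h)
    integral *= h
    return 3.0 * math.cos(2.0*t) - 0.5 * math.sin(2.0*t) + 0.5 * integral

def y_exact(t):
    return 0.25 + 2.75 * math.cos(2.0*t) - 0.5 * math.sin(2.0*t)

print(y_formula(2.0), y_exact(2.0))
```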

7.7 Periodic Functions
Theorem: If f (t) is piecewise continuous and periodic with period τ then

L[f](s) = Fτ(s) / ( 1 − e^{−sτ} )    (7.163)

where

Fτ(s) = ∫_0^τ e^{−st} f(t) dt.    (7.164)

Proof:
L[f](s) = ∫_0^∞ e^{−st} f(t) dt
        = ∫_0^τ e^{−st} f(t) dt + ∫_τ^∞ e^{−st} f(t) dt
        = ∫_0^τ e^{−st} f(t) dt + ∫_0^∞ e^{−s(τ+t̃)} f(τ + t̃) dt̃    (change of variables: t = τ + t̃)    (7.165)
        = ∫_0^τ e^{−st} f(t) dt + e^{−sτ} ∫_0^∞ e^{−st̃} f(t̃) dt̃    (using periodicity of f)
        = ∫_0^τ e^{−st} f(t) dt + e^{−sτ} L[f](s),

which gives the result.

Example: Find the Laplace Transform of the periodic function


f(t) = { 1,   0 ≤ t < 1;
         0,   1 ≤ t < 2;    (7.166)
with f (t + 2) = f (t).
Solution: The function is periodic with period 2 so the Laplace Transform is

L[f](s) = F₂(s) / ( 1 − e^{−2s} )    (7.167)

where

F₂(s) = ∫_0^2 e^{−st} f(t) dt = ∫_0^1 e^{−st} dt = ( 1 − e^{−s} )/s.    (7.168)

Hence

L[f](s) = ( 1 − e^{−s} ) / ( s(1 − e^{−2s}) ) = 1 / ( s(1 + e^{−s}) ).    (7.169)
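A quick numerical check of (7.169): the sketch below (the truncation point T, chosen so that e^{−sT} is negligible, and the grid size are arbitrary numerical choices) integrates e^{−st} f(t) for the square wave directly and compares with the closed form at s = 1.

```python
import math

# Hedged sketch: numerical check of (7.169) at s = 1.  The truncation point
# T (e^{-sT} is negligible) and grid size are arbitrary choices.
def f(t):
    # the period-2 square wave: 1 on [0, 1), 0 on [1, 2)
    return 1.0 if (t % 2.0) < 1.0 else 0.0

s = 1.0
T, N = 40.0, 400_000
h = T / N
numeric = h * (0.5 * f(0.0)
               + sum(math.exp(-s * k * h) * f(k * h) for k in range(1, N)))
closed_form = 1.0 / (s * (1.0 + math.exp(-s)))
print(numeric, closed_form)
```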

7.8 Impulses and the Dirac Delta Function
To deal with short, very strong impulsive forces (e.g., a hammer blow), the delta function introduced by Dirac is often used.
Consider Newton’s Second Law applied to an object of mass m subject to a time varying force
F (t):
(d/dt)(mv) = F(t).    (7.170)

Integrating from t₁ to t₂ gives

∆(mv) = ∫_{t₁}^{t₂} F(t) dt    (7.171)

where ∆(mv) is the change in momentum between times t₁ and t₂. The integral

∫_{t₁}^{t₂} F(t) dt    (7.172)

is called the impulse.


Consider a hammer blow at time t. The force exerted by the hammer is enormous but it acts
over a very short period of time (Figure 7.3). The area under the force-time curve is the impulse
imparted by the hammer. A set of functions with the same impulse acting over a progressively
narrower base must get higher in order to preserve the area.
Let Fn (t) be a family of piecewise continuous functions with impulse 1 which are non-zero only
on [t0 , tn ] for n = 1, 2, 3, . . . with tn → t0 as n → ∞. That is
∫_{−∞}^∞ Fn(t) dt = ∫_{t₀}^{tₙ} Fn(t) dt = 1   ∀n.    (7.173)

As n → ∞, max{Fn} → ∞. Otherwise, if the Fn(t) were bounded by, say, M, then ∫_{t₀}^{tₙ} |Fn| dt ≤ M(tₙ − t₀) → 0 as n → ∞ because tₙ → t₀.
There are many possible such sets of functions, for example:
Fn(t) = { n,   0 ≤ t ≤ 1/n;
          0,   otherwise.    (7.174)

Figure 7.3: An impulse force (non-dimensionalized) F acting for a short duration of time.

Figure 7.4: Sample functions Fn (t) given by (7.174). (a) n = 1. (b) n = 2. (c) n = 4.

Figure 7.5: Sample functions Fn (t) given by (7.175). (a) n = 1. (b) n = 2. (c) n = 4.

or

Fn(t) = {  n²t,            0 ≤ t ≤ 1/n;
          −n²(t − 2/n),    1/n < t ≤ 2/n;    (7.175)
           0,              otherwise.

See figures 7.4 and 7.5.

Claim: All of these families of functions have the property that


lim_{n→∞} ∫_{−∞}^∞ Fn(t) f(t) dt = f(0)    (7.176)

for all continuous functions f (t).

Proof: For simplicity we will assume that Fn (t) ≥ 0. This is not required.
The functions Fn have compact support [0, tn ] so
∫_{−∞}^∞ Fn(t) f(t) dt = ∫_0^{tₙ} Fn(t) f(t) dt.    (7.177)

164
Since f(t) is continuous, for all ε > 0 there exists a δ > 0 such that |f(t) − f(0)| < ε for t < δ. Also, since tₙ → t₀ as n → ∞, for all δ > 0 there exists an M such that 0 < tₙ − t₀ < δ for all n > M. Thus, if n > M then tₙ < δ and

∫_0^{tₙ} Fn(t) f(t) dt < ∫_0^{tₙ} Fn(t) ( f(0) + ε ) dt = ( f(0) + ε ) ∫_0^{tₙ} Fn(t) dt = f(0) + ε,    (7.178)

since the integral of Fn is 1. Similarly

∫_0^{tₙ} Fn(t) f(t) dt > f(0) − ε.    (7.179)

Letting ε → 0 proves the result.

Definition: The Dirac delta function δ(t) is a distribution or generalized function defined by
∫_{−∞}^∞ f(t) δ(t) dt = f(0)    (7.180)

for all continuous functions f (t).

Comments:

1. The Dirac delta function is not a function. You can think of δ(t) as being zero everywhere
except at t = 0 where it is infinite in such a way that the integral of δ(t) is one. You can also
think of it as the limit of the families of functions Fn (t) discussed above. Neither of these are
precise. Its action on continuous functions when integrated against them is what defines it.
To learn more about generalized functions see, for example, the book by Lighthill [5].

2. With a change of variables we see that


∫_{−∞}^∞ f(t) δ(t − a) dt = f(a).    (7.181)

7.8.1 Laplace Transform of the Delta function


For a > 0 we have

L[δ(t − a)](s) = ∫_0^∞ e^{−st} δ(t − a) dt = e^{−as}.    (7.182)

This is not true for a < 0. Why? For a = 0 the transform is defined as

L[δ(t)](s) = lim_{a→0⁺} L[δ(t − a)](s) = lim_{a→0⁺} e^{−as} = 1.    (7.183)

Similarly

∫_{−∞}^t δ(t − a) dt = { 0,   t < a;
                         1,   t > a;    (7.184)

and we define

∫_{−∞}^a δ(t − a) dt = lim_{t→a⁺} ∫_{−∞}^t δ(t − a) dt = 1.    (7.185)

Hence

∫_{−∞}^t δ(t − a) dt = H(t − a)    (7.186)

and we can define

(d/dt) H(t − a) = δ(t − a).    (7.187)
Another way to see this result is the following. Consider H(t − a) to be

H(t − a) = lim_{n→∞} fn(t − a)    (7.188)

where

fn(t) = { 0,    t < 0;
          nt,   0 ≤ t < 1/n;    (7.189)
          1,    t ≥ 1/n.

This is not exact, as lim_{n→∞} fn(0) = 0 whereas we defined H(0) to be 1. Other than that they are equivalent.
The functions fn(t) are continuous with a piecewise continuous derivative:

(dfn/dt)(t) = { 0,   t < 0;
                n,   0 < t < 1/n;    (7.190)
                0,   t > 1/n.

This is the first family of functions Fn considered above (see (7.174)) except here the derivative of fn is not defined at t = 0 and t = 1/n. For any continuous function f(t)

lim_{n→∞} ∫_{−∞}^∞ (dfn/dt)(t) f(t) dt = lim_{n→∞} n ∫_0^{1/n} f(t) dt = f(0)    (7.191)

so we can say that


δ(t) = lim_{n→∞} dfn/dt.    (7.192)
Assuming we can reverse the order of differentiation and the limit,

δ(t) = lim_{n→∞} dfn/dt = (d/dt)( lim_{n→∞} fn(t) ) = (d/dt) H(t).    (7.193)
Obviously this is very formal and lacks some rigour. A more careful analysis leads to the same
results.

7.8.2 An undamped oscillator subjected to an impulsive force


Consider an undamped oscillator. The system is initially at rest in its equilibrium position. An
impulsive unit force acts at t = 0:
ẍ + x = δ(t)
x(0) = 0, (7.194)
ẋ(0) = 0.
Find x(t) for t ≥ 0.
We will find the solution in two ways.

166
1. Solution 1: First we find the solution xn (t) when the forcing is given by
Fn(t) = { n,   0 ≤ t < 1/n;
          0,   t ≥ 1/n,    (7.195)

and then take the limit as n → ∞. This is the family of functions (7.174) discussed above.
Solving
ẍn + xn = Fn (t),
xn (0) = 0, (7.196)
ẋn (0) = 0.
gives
Xn(s) = L[Fn](s) / (s² + 1)    (7.197)
where Xn = L[xn ] and
L[Fn](s) = ∫_0^{1/n} e^{−st} n dt = (n/s)( 1 − e^{−s/n} ).    (7.198)
Thus

Xn(s) = n / ( s(s² + 1) ) − e^{−s/n} · n / ( s(s² + 1) ).    (7.199)
We now need to find the inverse transforms. First

L^{−1}[ 1/( s(s² + 1) ) ] = L^{−1}[ 1/s − s/(s² + 1) ] = 1 − cos t.    (7.200)
Thus,

xn(t) = n(1 − cos t) − n H(t − 1/n)( 1 − cos(t − 1/n) )

      = { n(1 − cos t),                 0 ≤ t < 1/n;    (7.201)
          n( cos(t − 1/n) − cos t ),    t ≥ 1/n.

We can take the limit as n → ∞ to find x(t). Using

cos(t − 1/n) = cos t cos(1/n) + sin t sin(1/n)
             = ( 1 − 1/(2n²) + O(n⁻⁴) ) cos t + ( 1/n − 1/(6n³) + O(n⁻⁵) ) sin t    (7.202)
we have

n( cos(t − 1/n) − cos t ) = ( −1/(2n) + O(n⁻³) ) cos t + ( 1 + O(n⁻²) ) sin t    (7.203)
                          → sin t   as n → ∞.

Hence
x(t) = lim xn (t) = sin t. (7.204)
n→∞

Figure 7.6: Solutions of the impulsively forced oscillator: exact and approximate for n = 2. The dotted curves are n(1 − cos t), the solution of the approximate problem for 0 ≤ t ≤ 1/n = 1/2. The dashed curves are n( cos(t − 1/n) − cos t ), the approximate solution for t > 1/n = 1/2. The solid curve is the solution sin t when the forcing is modelled by a Dirac delta function. The vertical grey line is t = 1/n = 1/2. The right panels show the corresponding derivatives. Bottom panels show a smaller range of t values and the red shows the solution xn(t) and ẋn(t).

Figure 7.7: Similar to figure 7.6 but only x(t) = sin t (solid black) and the approximate solution xn (t)
(dashed red) for n = 2 are shown.

2. Solution 2: Taking the Laplace Transform of the original problem in terms of the delta
function we have
(s2 + 1)X(s) = L[δ(t)] = 1 (7.205)
so

x(t) = L^{−1}[ 1/(s² + 1) ] = sin t.    (7.206)

Comparing the two solutions, the power of using the delta function is compelling. Use of the δ function is only appropriate if the duration of the forcing is so short that approximating it as a delta function is a good approximation. If that is the case, its use greatly simplifies finding the solution.

The approximate solutions xn (t) and the exact solution x(t) = sin(t) are illustrated in Figures
7.6–7.8.

Comments:

1. The solution x(t) = sin(t) satisfies the initial condition x(0) = 0 but it does not satisfy the
second initial condition ẋ(0) = 0!

2. The functions xn (t) and their first derivatives are continuous at t = 0 and 1/n:


ẋn(t) = { 0,                             t < 0;
          n sin t,                       0 ≤ t < 1/n;    (7.207)
          n( sin t − sin(t − 1/n) ),     t ≥ 1/n,

as can easily be verified. Looking at the equation we solved to find xn, the forcing function Fn(t) had jump discontinuities, so, as discussed in an earlier example, ẍn has jump discontinuities while xn and ẋn are continuous.

3. For large n, on the small interval 0 ≤ t ≤ 1/n we can approximate ẋn(t) = n sin t by ẋn(t) ≈ nt, so ẋn(t) increases from 0 at t = 0 to 1 at t = 1/n. In the limit as n → ∞, the derivative undergoes a jump of 1 at t = 0. This is why the limiting solution x(t) = sin t has initial derivative ẋ(0) = 1.

4. You can think of this in the following way. The equation for the limiting problem with forcing given by the delta function is ẍ + x = δ(t). The right hand side has a delta function singularity, hence so must the left hand side. This singularity must appear in the term with the highest derivative, i.e., ẍ has a delta function singularity at t = 0. Integrating, ẋ jumps by one at t = 0. This is a reflection of δ(t) = (d/dt)H(t).

5. An impulsive forcing results in a jump in ẋ at t = 0 (instantaneous acceleration) with no change in x.
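The convergence xn(t) → sin t claimed in (7.204), and discussed in the comments above, can also be observed numerically. The sketch below (the values of n and the sample point t = 2 are arbitrary choices) evaluates (7.201) directly.

```python
import math

# Hedged sketch: x_n(t) from (7.201) approaches sin t as n grows.
# The values of n and the sample point t = 2 are arbitrary choices.
def x_n(t, n):
    if t < 1.0 / n:
        return n * (1.0 - math.cos(t))
    return n * (math.cos(t - 1.0 / n) - math.cos(t))

t = 2.0
for n in (10, 100, 1000):
    print(n, x_n(t, n), abs(x_n(t, n) - math.sin(t)))
```

The error at fixed t shrinks like 1/n, consistent with the expansion (7.203).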


Figure 7.8: Same as figure 7.7 but for n = 10.

7.9 Laplace Transform Table

Table 7.1: Laplace Transforms

f(t)                      F(s)

1                         1/s
e^{at}                    1/(s − a)
tⁿ                        n!/s^{n+1}
cos bt                    s/(s² + b²)
sin bt                    b/(s² + b²)
e^{at} f(t)               L[f](s − a)
dⁿf/dtⁿ                   sⁿ L[f](s) − Σ_{j=0}^{n−1} s^{n−1−j} (dʲf/dtʲ)(0)
tⁿ f(t)                   (−1)ⁿ (dⁿ/dsⁿ) L[f](s)
f(t − a)H(t − a)          e^{−as} F(s)
f ∗ g(t)                  F(s)G(s)
Chapter 8

APPENDIX A: Differentiating and Integrating Complex-Valued Functions of a Real Variable

There are a few instances in this course where it will be convenient to integrate and differentiate
complex-valued functions of a real variable, in particular the function f (t) = eλt where the constant
λ is a complex number while t is a real variable.

8.1 From first principles


Suppose f (t) is a complex-valued function with real and imaginary parts fr (t) and fi (t):

f (t) = fr (t) + ifi (t). (8.1)

The derivative of f (t) is defined in the usual way:

(df/dt)(t) = lim_{h→0} [ f(t + h) − f(t) ] / h
           = lim_{h→0} [ ( fr(t + h) − fr(t) ) + i( fi(t + h) − fi(t) ) ] / h    (8.2)
           = lim_{h→0} [ fr(t + h) − fr(t) ] / h + i lim_{h→0} [ fi(t + h) − fi(t) ] / h
           = (dfr/dt)(t) + i (dfi/dt)(t).

To integrate, construct the Riemann sum and take the limit:

∫_a^b f(t) dt = lim_{N→∞} Σ_{n=1}^N f(tₙ) ∆    (8.3)

where tₙ is a point in ( a + (b − a)(n − 1)/N, a + (b − a)n/N ) and ∆ = (b − a)/N. Then
∫_a^b f(t) dt = lim_{N→∞} Σ_{n=1}^N f(tₙ) ∆
             = lim_{N→∞} Σ_{n=1}^N ( fr(tₙ) + i fi(tₙ) ) ∆
             = lim_{N→∞} { Σ_{n=1}^N fr(tₙ) ∆ + i Σ_{n=1}^N fi(tₙ) ∆ }    (8.4)
             = ∫_a^b fr(t) dt + i ∫_a^b fi(t) dt.

So to differentiate or integrate a complex-valued function of a real variable, simply differentiate and integrate the real and imaginary parts.
Now consider f(t) = e^{λt} where λ is complex with real and imaginary parts λr and λi. Then

e^{λt} = e^{λr t} e^{iλi t} = e^{λr t} ( cos(λi t) + i sin(λi t) ).    (8.5)

The real part is fr(t) = e^{λr t} cos(λi t), which has the derivative

fr′(t) = λr e^{λr t} cos(λi t) − λi e^{λr t} sin(λi t),    (8.6)

while the imaginary part is fi(t) = e^{λr t} sin(λi t), which has the derivative

fi′(t) = λr e^{λr t} sin(λi t) + λi e^{λr t} cos(λi t).    (8.7)

Hence

f′(t) = fr′(t) + i fi′(t)
      = ( λr cos(λi t) − λi sin(λi t) ) e^{λr t} + i ( λr sin(λi t) + λi cos(λi t) ) e^{λr t}
      = [ ( λr cos(λi t) − λi sin(λi t) ) + i ( λr sin(λi t) + λi cos(λi t) ) ] e^{λr t}    (8.8)
      = ( λr + iλi )( cos(λi t) + i sin(λi t) ) e^{λr t}
      = λ e^{iλi t} e^{λr t}
      = λ e^{λt}.

We have shown that

(d/dt) e^{λt} = λ e^{λt}    (8.9)

for all complex numbers λ. In a future complex analysis course you will learn that this is true for complex-valued t as well.
Now for integration. To integrate e^{λt} you can integrate the real and imaginary parts:

∫^t f(t′) dt′ = ∫^t fr(t′) dt′ + i ∫^t fi(t′) dt′.    (8.10)

For f(t) = e^{λt} this gives

∫^t e^{λt′} dt′ = ∫^t e^{λr t′} cos(λi t′) dt′ + i ∫^t e^{λr t′} sin(λi t′) dt′.    (8.11)
You can now integrate the real and imaginary pieces twice by parts to find the integrals of e^{λr t} cos(λi t) and e^{λr t} sin(λi t). Alternatively, let

F(t) = ∫^t f(t′) dt′.    (8.12)

The function F(t) has real and imaginary parts Fr(t) and Fi(t) given by

Fr(t) = ∫^t fr(t′) dt′,
Fi(t) = ∫^t fi(t′) dt′    (8.13)

as shown above. Hence the derivative of F(t) is F′(t) = Fr′(t) + i Fi′(t) = fr(t) + i fi(t), as one should expect. From this we can deduce that

∫^t e^{λt′} dt′ = e^{λt}/λ    (8.14)

because

(d/dt)( e^{λt}/λ ) = (1/λ)(d/dt) e^{λt} = e^{λt}.    (8.15)
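The identities (8.9) and (8.14) are easy to verify numerically for a particular complex λ. The sketch below (the value of λ, the sample point, and the difference/quadrature parameters are my choices) uses a central difference for the derivative and the trapezoid rule for the definite-integral version ∫_0^t e^{λt′} dt′ = ( e^{λt} − 1 )/λ.

```python
import cmath

# Hedged sketch: numerical check of (8.9) and of the definite-integral
# version of (8.14).  lambda, the sample point, and the numerical
# parameters are arbitrary choices.
lam = complex(-0.3, 2.0)
t, h = 1.25, 1e-6
f = lambda u: cmath.exp(lam * u)

# central difference vs lambda * e^{lambda t}
deriv_fd = (f(t + h) - f(t - h)) / (2.0 * h)
deriv_exact = lam * f(t)

# trapezoid rule for int_0^t e^{lambda u} du vs (e^{lambda t} - 1)/lambda
N = 20_000
step = t / N
integral_num = step * (0.5 * (f(0.0) + f(t))
                       + sum(f(k * step) for k in range(1, N)))
integral_exact = (f(t) - 1.0) / lam

print(abs(deriv_fd - deriv_exact), abs(integral_num - integral_exact))
```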

8.2 Power series approach


There is an alternative approach using power series. For real-valued x you have seen that the exponential function e^x has the power-series expansion

e^x = Σ_{n=0}^∞ xⁿ/n!    (8.16)

which converges for all x. The function e^x can be extended to the complex plane using this power series:

e^z = 1 + z + z²/2 + z³/3! + · · · = Σ_{n=0}^∞ zⁿ/n!    (8.17)

for complex-valued z. This works because for any integers M > N,

| Σ_{n=N}^M zⁿ/n! | ≤ Σ_{n=N}^M |z|ⁿ/n!    (8.18)

which goes to zero as N → ∞ because x = |z| is real and the power series for e^x converges absolutely.
For the function f (t) = eλt with t real we have

e^{λt} = 1 + λt + λ²t²/2 + λ³t³/3! + · · · ,    (8.19)
and differentiating term-by-term we have

(d/dt) e^{λt} = λ + λ²·2t/2 + λ³·3t²/3! + · · ·
             = λ( 1 + λt + λ²t²/2 + · · · )    (8.20)
             = λ e^{λt}.

Similarly we can integrate term-by-term:

∫_0^t e^{λt′} dt′ = t + λt²/2 + λ²t³/3! + λ³t⁴/4! + · · ·
                 = (1/λ)( λt + λ²t²/2 + λ³t³/3! + · · · )    (8.21)
                 = (1/λ)( e^{λt} − 1 ).
Since an indefinite integral is defined up to an arbitrary constant we can take
∫^t e^{λt′} dt′ = e^{λt}/λ.    (8.22)

We could have used a definite integral to integrate:

∫_{t₀}^t e^{λt′} dt′ = (t − t₀) + λt²/2 − λt₀²/2 + λ²t³/3! − λ²t₀³/3! + · · ·
                    = (1/λ)( λt + λ²t²/2 + λ³t³/3! + · · · )
                      − (1/λ)( λt₀ + λ²t₀²/2 + λ³t₀³/3! + · · · )    (8.23)
                    = (1/λ)( e^{λt} − 1 ) − (1/λ)( e^{λt₀} − 1 )
                    = e^{λt}/λ − e^{λt₀}/λ.

Chapter 9

APPENDIX B: Equality of mixed partial derivatives

Here we prove the following theorem:


Theorem IV: Let f (x, y) be a scalar field such that the partial derivatives fx , fy , and fyx exist on
an open rectangle R. Then if fyx is continuous on R the derivative fxy exists and fxy = fyx on R.

The proof is based on [1] where you can find a more concise treatment.
We start with a weaker version in which we first assume that both fyx and fxy exist and are
continuous on R and show that they are then equal.
Theorem IVa: Let f (x, y) be a scalar field such that the partial derivatives fx , fy , and fyx exist
on an open rectangle R. Then if fyx and fxy are both continuous on R then fxy = fyx on R.

Proof: We show that fxy (a, b) = fyx (a, b) at an arbitrary point (a, b) in R. Since R is open we can
choose h > 0 and k > 0 such that the points (a + h, b), (a, b + k) and (a + h, b + k) are also in R.
These points are the corners of a rectangle Rhk in R. Consider

∆(h, k) = f (a + h, b + k) − f (a + h, b) − f (a, b + k) + f (a, b) (9.1)


which we can write as

∆(h, k) = [f (a + h, b + k) − f (a + h, b)] − [f (a, b + k) − f (a, b)] (9.2)

or
∆(h, k) = [f (a + h, b + k) − f (a, b + k)] − [f (a + h, b) − f (a, b)] (9.3)
We introduce two new functions:

G(x) = f (x, b + k) − f (x, b) (9.4)

and
H(y) = f (a + h, y) − f (a, y). (9.5)
G(x) is the change in f between the top and bottom of the rectangle Rhk (see figure 9.1) and H(y)
is the change between the right and left sides of Rhk . In terms of these functions

∆(h, k) = G(a + h) − G(a) (9.6)

Figure 9.1: The open rectangle R (grey) containing the point (a, b) and the rectangle Rhk . The function
G(x) is the difference of f between the top and bottom of the rectangle along a vertical line at x.

and
∆(h, k) = H(b + k) − H(b). (9.7)
Since fx is continuous on R we know that G′(x) is continuous on [a, a + h]. Hence by the mean value theorem, from (9.6) we have

∆(h, k) = h G′(x₁)    (9.8)

for some x₁ in [a, a + h]. Note that x₁ depends not only on h, but also on k, as our function G defined in (9.4) does (they also depend on a and b, but we are treating these as fixed constants so ignore this dependence). Using (9.4) we can expand this as

∆(h, k) = h[ fx(x₁, b + k) − fx(x₁, b) ].    (9.9)

Now since fxy (x1 , y) is continuous on y ∈ [b, b + k] the mean value theorem also gives us

fx (x1 , b + k) − fx (x1 , b) = kfxy (x1 , y1 ) (9.10)

for some y1 in [b, b + k]. Thus we have

∆(h, k) = hkfxy (x1 , y1 ). (9.11)

Note that y1 depends on h and k too.


In a similar manner

∆(h, k) = k H′(y₂) = k[ fy(a + h, y₂) − fy(a, y₂) ] = hk fyx(x₂, y₂)    (9.12)

for some x2 ∈ [a, a + h] and y2 ∈ [b, b + k]. We now let (h, k) → (0, 0). Both (x1 , y1 ) and (x2 , y2 ) go
to (a, b) and from (9.11) and (9.12) we get

fxy (a, b) = fyx (a, b). (9.13)

This step uses continuity of fxy and fyx .

We now turn to the stronger theorem which does not assume the existence of fxy .
Theorem IV: Let f (x, y) be a scalar field such that the partial derivatives fx , fy , and fyx exist on
an open rectangle R. Then if fyx is continuous on R the derivative fxy exists and fxy = fyx on R.

Proof: Since fyx is continuous (9.12) still holds. Equation (9.8) holds as well but because we don’t
know that fxy exists we no longer have (9.11). We need to prove that fxy exists and is continuous.
Then the rest of the proof follows. We prove fxy (a, b) exists which proves it exists in R as (a, b) is
an arbitrary point in R. From the definition of derivatives

fxy(a, b) = lim_{k→0} [ fx(a, b + k) − fx(a, b) ] / k    (9.14)
if the limit exists. Hence, we need to prove that this limit exists and that it is equal to fyx (a, b).
Now

fx(a, b) = lim_{h→0} [ f(a + h, b) − f(a, b) ] / h    (9.15)

and

fx(a, b + k) = lim_{h→0} [ f(a + h, b + k) − f(a, b + k) ] / h    (9.16)

so

[ fx(a, b + k) − fx(a, b) ] / k = lim_{h→0} { [f(a + h, b + k) − f(a, b + k)] − [f(a + h, b) − f(a, b)] } / (hk)
                               = lim_{h→0} ∆(h, k) / (hk).    (9.17)
This limit exists because we know that fx exists.
Now recall (9.12), which still holds. Writing it as

fyx(x₂, y₂) = ∆(h, k) / (hk)    (9.18)

and using it in (9.17) gives

[ fx(a, b + k) − fx(a, b) ] / k = lim_{h→0} fyx(x₂, y₂).    (9.19)

Taking the limit as k → 0 gives

lim_{k→0} [ fx(a, b + k) − fx(a, b) ] / k = lim_{k→0} ( lim_{h→0} fyx(x₂, y₂) ).    (9.20)

If the limit on the right hand side exists and is equal to fyx(a, b) then the limit on the left hand side exists and, from (9.14), we have fxy(a, b) = fyx(a, b) and we are done.
From (9.19) we know that

lim_{h→0} fyx(x₂, y₂)    (9.21)


Figure 9.2: Bδ , the ball of radius δ centred at the point (a, b). The point (a + h, b + k) is inside the square
with sides of length δ centred at (a, b).

exists. The limiting value depends on k. Call it V(k).


We need to show that lim_{k→0} V(k) = fyx(a, b). For this we use the assumed continuity of fyx on R. Let Bδ be a disk of radius δ centred on (a, b). For any ε > 0 there exists a δ > 0 such that

|fyx(x, y) − fyx(a, b)| < ε/2   for (x, y) ∈ Bδ.    (9.22)

Now assume |h| < δ/2 and |k| < δ/2, in which case the whole rectangle Rhk is in Bδ (Figure 9.2). In particular (x₂, y₂) is in Bδ, hence

|fyx(x₂, y₂) − fyx(a, b)| < ε/2.    (9.23)

Now let h → 0, giving

|V(k) − fyx(a, b)| ≤ ε/2 < ε    (9.24)

provided |k| ≤ δ/2. Hence we have shown that for any ε > 0 there is a δ > 0 such that |V(k) − fyx(a, b)| < ε if |k| < δ/2, which means that lim_{k→0} V(k) exists and is equal to fyx(a, b). This completes the proof.

Bibliography

[1] Tom M. Apostol. Calculus Volume II. Wiley, 1969.

[2] G. W. Bluman and S. Kumei. Symmetries and Differential Equations, volume 81 of Applied
Mathematical Sciences. Springer-Verlag, 1989.

[3] Victor Katz. A History of Mathematics: An introduction. 3rd Edition. Addison-Wesley, 2009.

[4] Randall J. LeVeque. Finite Difference Methods for Ordinary and Partial Differential Equations.
SIAM, 2007.

[5] M. J. Lighthill. An introduction to Fourier Analysis and Generalized Functions. Cambridge


University Press, 1958.

[6] C. C. Lin and L. A. Segel. Mathematics Applied to Deterministic Problems in the Natural
Sciences. Macmillan Publishing Co., 1974.

[7] George F. Simmons. Differential Equations with Applications and Historical Notes. International
Series in Pure and Applied Mathematics. McGraw-Hill, 1972.

[8] Morris Tenenbaum and Harry Pollard. Ordinary Differential Equations. Harper & Row, 1963.
