ODEs, PDEs, and that Applied Math Jazz...

Figure 1: Numerical Microscopy: The Trabeculated Ventricle of an 84hpf Zebrafish Embryonic Heart
Mathematical model and simulation of an embryonic zebrafish heart at 84 hours post fertilization.
Blood cells can be seen flowing into the ventricle during the diastole phase. Blood flow plays a critical
role in heart development, as hemodynamic forces act as signaling mechanisms to regulate a host of developmental processes at the cellular and genetic level.

Nicholas A. Battista
Modified: August 8th, 2016
Created: June 3, 2015

1 Introduction
Welcome! In this short five-day course, we will cover a variety of topics pertaining to analytical methods in applied mathematics. Most of these topics have been chosen in part because this material seems to have a way of falling through the cracks of most undergraduate mathematics curricula, and because it is traditionally not covered during the first-year sequence in methods of applied mathematics.
The tentative list of topics we will cover is below:
1. A review of undergraduate ODEs
• Separable Equations
• First-Order: Integrating Factor
• Homogeneous vs. Nonhomogeneous Equations
• Solving for Complementary Solutions
– Linear constant coefficient ODEs
• Solving for Particular Solution:
– Undetermined Coefficients
– Variation of Parameters
• Bonus Methods:
– Transform Methods (e.g., “Euler” type, Riccati type, etc.)
– Laplace Transforms
– Non-dimensionalization
– Introduction to Dynamical Systems, e.g., systems of ODEs
2. General Sturm-Liouville Theory
• Orthogonal Functions
• Fourier Series
• Orthogonal Polynomials as a Complete Basis
3. Green's Functions
4. Linear PDEs
• Classification of PDEs
• Derivations of classical linear PDEs
• Method of Characteristics
• Separation of Variables
• Fourier Transforms / Laplace Transforms
• Similarity Solutions
• Method of Images
5. Physics!
• Newton’s Laws of Motion
• Conservation Laws
• Lagrangian Formulation of Mechanics

• Hamiltonian Formulation of Mechanics
6. Big Picture Numerical Linear Algebra
• Linear systems, the bread and butter.
• Nonlinear systems, root-finding, and iterative methods - oh my!
• Eigen-fun: a story of eigenvalues, eigenvectors, and love.
Please note that these notes may not be complete at this time. Please check back periodically as
more sections will be added...when my code is compiling.

2 Review of Undergrad. ODEs
Undergraduate Ordinary Differential Equations, or Calculus 4, is mostly a blitz of various analytical
methods for solving initial value problems for linear (and some special case non-linear) differential
equations.
In this section we will review the following methods:

1. Separable Equations
2. First-Order: Integrating Factor
3. Homogeneous vs. Nonhomogeneous Equations
4. Solving for Complementary Solutions
• Linear constant coefficient ODEs
5. Solving for Particular Solution:
• Undetermined Coefficients
• Variation of Parameters
6. Bonus Methods and Conceptual Fun:
• Transform Methods (e.g., “Euler” type, Riccati type, etc.)
• Laplace Transforms
• Non-Dimensionalization
• Introduction to Dynamical Systems, e.g., systems of ODEs
– Eigenvalues and Systems of Linear Homogeneous ODE
– The Matrix Exponential
– Non-homogeneous Linear Systems
– Non-linear Systems of ODEs

Since undergraduate ODEs does not involve much theory, we will try to motivate each method when possible; otherwise, sit back and enjoy the following stampede of methods!

2.1 Separable Equations


Perhaps the most natural method for solving ODEs comes from ideas of calculus, where we simply want to integrate anything and everything possible. The differential equations we are able to solve via pure integration are those in which the dependent and independent variables can be separated entirely, i.e., if we have the following ODE to solve,
$$\frac{dy}{dx} = f(x, y) = g(x)h(y),$$
with $y(a) = y_a$ as initial data.
We can then separate the equation,
$$\frac{dy}{h(y)} = g(x)\,dx.$$
Now, integrating both sides gives a functional relationship between $y$ and $x$. Unfortunately, sometimes this relationship is nonlinear or transcendental, making it difficult to write an explicit solution $y = y(x)$; however, we can still use the initial data to solve for the integration constant.

For example, consider
$$\frac{dy}{dt} = -y(1 - y),$$
with $y(0) = 2$. We can then go through the separation method to find

$$\frac{dy}{y(1 - y)} = -dt$$
$$\int \frac{1}{y(1 - y)}\,dy = -\int dt$$
$$\log(y) - \log(1 - y) = -t + C$$
$$\frac{y}{1 - y} = Ae^{-t},$$

and hence, doing a little algebra, we find that
$$y(t) = \frac{Ae^{-t}}{1 + Ae^{-t}}.$$
Now, using the initial value information we can get $A$,
$$y(0) = 2 \quad\Rightarrow\quad 2 = \frac{A}{1 + A} \quad\Rightarrow\quad A = -2.$$
Therefore our solution is $y(t) = \frac{-2e^{-t}}{1 - 2e^{-t}}$. Note that this solution blows up in finite time, namely when $e^{-t} = \frac{1}{2}$, i.e., $t = \ln 2$.
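As a quick sanity check, the worked solution above is easy to verify symbolically. The sketch below is a minimal example, assuming SymPy is available, that compares `dsolve`'s answer with the hand-derived formula.

```python
# A minimal sketch (assuming SymPy is available) that checks the separable-equation
# example dy/dt = -y(1 - y), y(0) = 2, against the hand-computed solution.
import sympy as sp

t = sp.symbols('t')
y = sp.Function('y')

ode = sp.Eq(y(t).diff(t), -y(t) * (1 - y(t)))
sol = sp.dsolve(ode, y(t), ics={y(0): 2})

# Hand-computed solution from the text: y(t) = -2 e^{-t} / (1 - 2 e^{-t})
y_hand = -2 * sp.exp(-t) / (1 - 2 * sp.exp(-t))

# The difference should simplify to zero if the two solutions agree.
print(sp.simplify(sol.rhs - y_hand))   # expected: 0
```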


2.2 First-Order Integrating Factor


The world would be very nice, but boring, if all first-order differential equations were separable. Note:
Not all separable equations are useless → some describe many processes in population biology or black
hole thermodynamics!

To make an equation not separable, we just need to tack on one extra term; however, for now we will restrict ourselves to linear first-order equations of the form
$$\frac{dy}{dx} + p(x)y(x) = q(x), \qquad (1)$$
where we assume $p(x)$ and $q(x)$ are continuous. To solve the above equation we consider the left hand side of (1) to be the result of a product rule. However, to make it such we need to multiply all the terms by a fudge factor given in exponential form, also known as the integrating factor,
$$e^{\mu(x)} = e^{\int p(x)\,dx},$$
thereby giving $\frac{d}{dx}e^{\mu(x)} = p(x)e^{\mu(x)}$, since $\frac{d\mu}{dx} = p(x)$. Doing so, we see the product rule gives
$$\frac{d}{dx}\Big[e^{\int p(x)dx}\,y(x)\Big] = e^{\int p(x)dx}\left(\frac{dy}{dx} + p(x)y(x)\right) = e^{\int p(x)dx}\,q(x).$$
Hence, to solve (1), we need to integrate both sides and then easily solve for $y(x)$, e.g.,

$$\frac{d}{dx}\Big[e^{\int p(x)dx}\,y(x)\Big] = e^{\int p(x)dx}\,q(x)$$
$$\int\frac{d}{dx}\Big[e^{\int p(x)dx}\,y(x)\Big]dx = \int e^{\int p(x)dx}\,q(x)\,dx$$
$$e^{\mu(x)}y(x) = \int e^{\mu(x)}q(x)\,dx,$$
and therefore we find that
$$y(x) = e^{-\mu(x)}\int e^{\mu(x)}q(x)\,dx.$$

2.2.1 Example: Integrating Factor


Let’s do an example! Consider
$$\frac{dy}{dt} + \frac{1}{3}t^2 y(t) = t^2.$$
Using the integrating factor approach, we first identify $e^{\mu(t)}$ as
$$e^{\mu(t)} = e^{\int\frac{1}{3}t^2\,dt} = e^{t^3/9}.$$

Then, using the methodology developed above, we see
$$y(t) = e^{-t^3/9}\int e^{t^3/9}\,t^2\,dt,$$
and hence
$$y(t) = 3 + Ce^{-t^3/9},$$
where C can be solved using initial data.
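As with the separable example, this result is easy to double check symbolically; the following is a minimal sketch, assuming SymPy is available.

```python
# A minimal sketch (assuming SymPy is available) checking the integrating-factor
# example dy/dt + (1/3) t^2 y = t^2 against the solution y(t) = 3 + C exp(-t^3/9).
import sympy as sp

t = sp.symbols('t')
y = sp.Function('y')

ode = sp.Eq(y(t).diff(t) + sp.Rational(1, 3) * t**2 * y(t), t**2)
print(sp.dsolve(ode, y(t)))   # expected: Eq(y(t), C1*exp(-t**3/9) + 3)
```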

2.3 Homogeneous vs. Non-homogeneous Equations


Before we begin our trajectory into higher order linear differential equations, we are going to draw a clear distinction between homogeneous and non-homogeneous equations. In a few words, a homogeneous equation is one in which a bunch of dependent-variable terms is set equal to zero, i.e.,
$$\hat{L}y(x) = c_n(x)\frac{d^ny}{dx^n} + c_{n-1}(x)\frac{d^{n-1}y}{dx^{n-1}} + \ldots + c_2(x)\frac{d^2y}{dx^2} + c_1(x)\frac{dy}{dx} + c_0(x)y = 0.$$
On the other hand, a non-homogeneous equation looks like the above, except with a non-zero right hand side, e.g.,
$$\hat{L}y(x) = c_n(x)\frac{d^ny}{dx^n} + c_{n-1}(x)\frac{d^{n-1}y}{dx^{n-1}} + \ldots + c_2(x)\frac{d^2y}{dx^2} + c_1(x)\frac{dy}{dx} + c_0(x)y = q(x).$$
The reason we make this distinction clear is that if we consider a solution to the non-homogeneous problem, call it $y_P(x)$, we have
$$\hat{L}y_P(x) = q(x),$$
while a solution of the homogeneous problem, call it $y_C(x)$, satisfies
$$\hat{L}y_C(x) = 0.$$

Therefore, if we consider a combined solution,
$$y(x) = y_C(x) + y_P(x),$$
we have
$$\hat{L}y(x) = \hat{L}y_C(x) + \hat{L}y_P(x) = q(x).$$

Hence, if we are trying to solve a non-homogeneous problem, we see it is necessary to solve the homogeneous problem as well. We call the solution to the homogeneous problem the complementary solution, while the solution to the non-homogeneous problem is called the particular solution.

Furthermore, we note that any scalar multiple of the complementary solution is also a solution of the homogeneous problem, i.e.,
$$\hat{L}[\alpha y_C(x)] = 0,$$
where $\alpha \in \mathbb{C}$.

2.4 Linear Constant Coefficient ODEs


We will discuss equations of the form
$$\hat{L}y(x) = c_n\frac{d^ny}{dx^n} + c_{n-1}\frac{d^{n-1}y}{dx^{n-1}} + \ldots + c_2\frac{d^2y}{dx^2} + c_1\frac{dy}{dx} + c_0y = 0,$$
where $\{c_j\}_{j=0}^{n} \subset \mathbb{R}$. For equations of this form, we will show that our solution technique assumes an ansatz for a solution and, by doing so, turns this differential problem into an algebraic polynomial root-finding problem. We also note that since this is an $n$th order homogeneous equation, we need to find $n$ linearly independent solutions for our complementary solution.
To illuminate this idea further, we will show the possibilities for second-order linear constant coefficient ODEs.
We now consider the following ODE,
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = 0, \qquad (2)$$
with initial data of the form $y(0) = y_0$ and $y'(0) = y_0'$. We assume solutions of the following form,
$$y = e^{rx},$$
where $r \in \mathbb{C}$. Continuing along this path, we see
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = a\frac{d^2}{dx^2}\left(e^{rx}\right) + b\frac{d}{dx}\left(e^{rx}\right) + c\left(e^{rx}\right) = ar^2e^{rx} + bre^{rx} + ce^{rx} = e^{rx}\left[ar^2 + br + c\right] = 0,$$
and hence we now need to solve
$$ar^2 + br + c = 0,$$
and then we will have the numeric values of $r$ to build our complementary solution. We call this polynomial the characteristic equation.
Moreover, we note at this juncture that we need $n$ linearly independent solutions for our complementary solution, and in the above we had a second-order linear ODE and obtained a second-order polynomial to find the roots of. In this second-order example, there are three possible cases the roots can fall into. Naturally this goes back to properties of quadratic equations and when there are real, complex, or double roots.
We will highlight each case with an example.

Case 1 : $b^2 - 4ac > 0$ (real roots)

Consider the following example
$$y'' - y' - 2y = 0.$$
Substituting our ansatz, $y = e^{rx}$, we get the following algebraic equation to solve,
$$r^2 - r - 2 = 0.$$
Solving this equation, we find that $r = \{-1, 2\}$ and hence we get two linearly independent solutions that solve the homogeneous problem above,
$$\{e^{-x}, e^{2x}\},$$
and so our complementary solution is
$$y_C(x) = \alpha_1 e^{-x} + \alpha_2 e^{2x}.$$

Case 2 : $b^2 - 4ac < 0$ (complex roots)

Consider the following example
$$y'' + 2y' + 2y = 0.$$
Again, substituting our ansatz, $y = e^{rx}$, we get the following characteristic equation,
$$r^2 + 2r + 2 = 0.$$
Solving this equation, we find $r = \{-1 - i, -1 + i\}$, and so our linearly independent solutions are $\{e^{(-1-i)x}, e^{(-1+i)x}\}$. We can easily write the complementary solution as
$$y_C(x) = \alpha_1 e^{(-1-i)x} + \alpha_2 e^{(-1+i)x},$$
but we will massage our solutions into a more friendly form! After much algebra, we find we can write the above as
$$y_C(x) = e^{-x}\left[\alpha_1\cos(x) + \alpha_2\sin(x)\right].$$
In general, if we find our complex roots to be $\{a - bi, a + bi\}$, we can write our complementary solution as
$$y_C(x) = e^{ax}\left[\alpha_1\cos(bx) + \alpha_2\sin(bx)\right].$$

Case 3 : $b^2 - 4ac = 0$ (root with multiplicity $> 1$)

Consider the following example
$$y'' - 6y' + 9y = 0.$$
Using our ansatz, $y = e^{rx}$, we obtain the following characteristic equation,
$$(r - 3)^2 = 0.$$
Hence, solving this equation we find that $r = 3$ is a double root, i.e., a root with multiplicity $> 1$. Therefore, if we tried to write our complementary solution in a similar way to Case 1, we would have
$$y_C(x) = \alpha_1 e^{3x} + \alpha_2 e^{3x} = \alpha e^{3x},$$
and thereby only have ONE linearly independent solution, when we need two! The method we will employ to find this second solution is a special (easy) case of Reduction of Order. We assume the second linearly independent solution $y_2(x)$ has the form
$$y_2(x) = v(x)y_1(x) = v(x)e^{3x},$$
where $y_1(x) = e^{3x}$ is the single linearly independent solution we do have!

Plugging this into the ODE we get

$$y_2'' - 6y_2' + 9y_2 = \left[v''y_1 + 2v'y_1' + vy_1''\right] - 6\left[v'y_1 + vy_1'\right] + 9vy_1$$
$$= \left[v''y_1 + 2v'y_1' - 6v'y_1\right] + v(x)\left[y_1'' - 6y_1' + 9y_1\right]$$
$$= v''e^{3x} + (3)2v'e^{3x} - 6v'e^{3x}$$
$$= e^{3x}v'' = 0,$$

and therefore we need to solve $v'' = 0$, finding $v(x) = \beta x + \gamma$. Hence our second linearly independent solution is
$$y_2(x) = (\beta x + \gamma)e^{3x},$$
and our complementary solution is
$$y_C(x) = \alpha e^{3x} + (\beta x + \gamma)e^{3x} = (\tilde{\alpha} + \beta x)e^{3x},$$
where $\tilde{\alpha} = \alpha + \gamma$ is just another constant to be found using the initial data after all, so it's nothing too fancy!
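The three cases above can be summarized computationally: given coefficients $(a, b, c)$, the roots of the characteristic polynomial immediately tell you which form the complementary solution takes. The sketch below, assuming NumPy is available, is one way to organize that logic; the helper name `classify_roots` is just for illustration.

```python
# A small sketch (NumPy assumed available) that classifies the complementary solution
# of a y'' + b y' + c y = 0 from the discriminant of its characteristic polynomial.
# The function name `classify_roots` is illustrative, not from the notes.
import numpy as np

def classify_roots(a, b, c):
    disc = b**2 - 4*a*c
    r = np.roots([a, b, c])          # roots of a r^2 + b r + c = 0
    if disc > 0:
        return f"real roots {r}: y_C = A e^(r1 x) + B e^(r2 x)"
    elif disc < 0:
        alpha, beta = r[0].real, abs(r[0].imag)
        return f"complex roots {r}: y_C = e^({alpha} x)[A cos({beta} x) + B sin({beta} x)]"
    else:
        return f"double root {r[0]}: y_C = (A + B x) e^({r[0]} x)"

print(classify_roots(1, -1, -2))   # Case 1: roots -1, 2
print(classify_roots(1,  2,  2))   # Case 2: roots -1 ± i
print(classify_roots(1, -6,  9))   # Case 3: double root 3
```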

2.4.1 Solving for the Particular Solution


In this section we are concerned with finding the particular solution, $y_P(x)$, for non-homogeneous equations. We will assume that we have already found the complementary solution, i.e., the solution to the homogeneous equation, $y_C(x)$. We are concerned with the particular solutions of linear constant-coefficient equations, i.e.,
$$c_n\frac{d^ny}{dx^n} + c_{n-1}\frac{d^{n-1}y}{dx^{n-1}} + \ldots + c_2\frac{d^2y}{dx^2} + c_1\frac{dy}{dx} + c_0y = q(x). \qquad (3)$$
As in the previous section, we illustrate this concept with 2nd order linear constant coefficient equations,
$$c_2y'' + c_1y' + c_0y = q(x).$$
We will describe two methods for finding the particular solution,
1. Undetermined Coefficients

2. Variation of Parameters

• Undetermined Coefficients:
This method of finding a particular solution is similar to that of “guess and check!” This method will only work for special cases of $q(x)$, namely if $q(x)$ belongs to the following families of functions $\{a_nx^n + \ldots + a_1x + a_0,\ e^{ax},\ \sin(ax),\ \cos(ax)\}$, or any products of those.

– Example 1: Consider the following example,
$$y'' - y' - 2y = 6x^2.$$
Recall we found the complementary solution to be $y_C(x) = \alpha_1 e^{-x} + \alpha_2 e^{2x}$. Now, since $q(x)$ is a polynomial, we will assume a polynomial form for $y_P(x)$, i.e., we will let $y_P(x) = c_1x^2 + c_2x + c_3$.
Substituting our particular solution into the ODE gives
$$y_P'' - y_P' - 2y_P = \frac{d^2}{dx^2}(c_1x^2 + c_2x + c_3) - \frac{d}{dx}(c_1x^2 + c_2x + c_3) - 2(c_1x^2 + c_2x + c_3)$$
$$= (2c_1) - (2c_1x + c_2) - 2(c_1x^2 + c_2x + c_3)$$
$$= [-2c_1]x^2 - 2[c_1 + c_2]x + [2c_1 - c_2 - 2c_3].$$
Setting this equal to $6x^2$, we see that we have the following system of equations to solve:
$$-2c_1 = 6$$
$$-2c_1 - 2c_2 = 0$$
$$2c_1 - c_2 - 2c_3 = 0$$
Solving this system, we find that $c_1 = -3$, $c_2 = 3$, and $c_3 = -\frac{9}{2}$, thereby making our particular solution $y_P(x) = -3x^2 + 3x - \frac{9}{2}$. Hence our full solution to this non-homogeneous equation is
$$y(x) = y_C(x) + y_P(x) = \alpha_1 e^{-x} + \alpha_2 e^{2x} + \left[-3x^2 + 3x - \frac{9}{2}\right],$$
up to using the initial data to solve for $\{\alpha_1, \alpha_2\}$. Note that even though $q(x)$ did not have a linear term or constant term, the particular solution did have those components. Hence we have to make sure to include all the polynomial terms of the same order as $q(x)$ and below.

We can do a similar process if q(x) takes on the other functional forms. Below is a table for
what particular solution to “guess” for each q(x).

q(x)                                      y_P(x)
e^(ax)                                    c_1 e^(ax)
sin(bx)                                   c_1 sin(bx) + c_2 cos(bx)
cos(bx)                                   c_1 sin(bx) + c_2 cos(bx)
β sin(bx) + γ cos(bx)                     c_1 sin(bx) + c_2 cos(bx)
a_n x^n + ... + a_1 x + a_0               c_n x^n + ... + c_1 x + c_0

– Example: Resonance
One other subtle point to make is in the case when $q(x) = e^{ax}$. If our complementary solutions are exponential functions, it is possible that they may already include a term of $e^{ax}$, i.e., $y_C(x) = \alpha_1 e^{ax} + \alpha_2 e^{bx}$. In this case, we will have resonant solutions.
To say a quick word on resonance, which properly deserves a chapter of its own: it is the tendency of a mechanical system to oscillate with greater amplitude when pushed at a preferential frequency, called the resonant frequency. Furthermore, this allows systems to maximize their output oscillatory amplitude, even while the driving forces may be relatively small. Essentially the system is able to store mechanical energy and efficiently transfer such energy between its potential and kinetic forms at such frequencies.
Our particular solution will then take on a term like $y_P(x) = c_1 x e^{ax}$, analogous to what we did in the case of repeated roots using Reduction of Order.

• Variation of Parameters:
The next method for finding particular solutions assumes we already know what the complementary solutions are. Like the last case, we will assume we are trying to solve a 2nd order linear ODE. Note that this method is not restricted to linear constant-coefficient ODEs, unlike undetermined coefficients. This method will also work for higher order equations analogously (the algebra just gets a bit more undesirable).
Variation of parameters is a method that exploits a more general reduction of order technique. We begin with the set of fundamental solutions $\{y_1(x), y_2(x)\}$, i.e., where $y_C(x) = \alpha_1 y_1(x) + \alpha_2 y_2(x)$. We now assume the particular solution takes a reduction-of-order form,
$$y_P(x) = u_1(x)y_1(x) + u_2(x)y_2(x), \qquad (4)$$
where $\{u_1(x), u_2(x)\}$ are unknown functions that we now set out to find! To motivate this concept, we will try to find the particular solution to
$$y'' + p(x)y' + r(x)y = q(x),$$
with already-found fundamental solutions $y_1(x)$ and $y_2(x)$.

We begin with the following constraint,
$$u_1'y_1 + u_2'y_2 = 0.$$
The reasons for this choice of constraint are not immediately obvious, but upon doing so we see that the above constraint makes the derivative of $y_P(x)$ take a more manageable form,
$$y_P'(x) = u_1'y_1 + u_1y_1' + u_2'y_2 + u_2y_2' = u_1y_1' + u_2y_2'.$$
Hence, since, as you've probably suspected, we're going to substitute our ansatz for $y_P(x)$ into the ODE, this constraint makes a lot of terms drop out:

$$y_P'' + p(x)y_P' + r(x)y_P = \left[u_1'y_1' + u_1y_1'' + u_2'y_2' + u_2y_2''\right] + p(x)\left[u_1y_1' + u_2y_2'\right] + r(x)\left[u_1(x)y_1(x) + u_2(x)y_2(x)\right]$$
$$= \left[u_1'y_1' + u_2'y_2'\right] + u_1\left[y_1'' + p(x)y_1' + r(x)y_1(x)\right] + u_2\left[y_2'' + p(x)y_2' + r(x)y_2(x)\right]$$
$$= u_1'y_1' + u_2'y_2' = q(x),$$

since $y_1(x)$ and $y_2(x)$ are solutions to the homogeneous equation. Therefore we have the following system of two equations,
$$u_1'y_1 + u_2'y_2 = 0$$
$$u_1'y_1' + u_2'y_2' = q(x),$$
and two unknowns, $u_1'(x)$ and $u_2'(x)$. Recall that we already know $y_1(x)$ and $y_2(x)$, and we assume they are at least $C^2$ since they are solutions of a 2nd order ODE after all!
Solving this system we find
$$u_1'(x) = \frac{-y_2(x)q(x)}{y_1(x)y_2'(x) - y_1'(x)y_2(x)} \qquad\text{and}\qquad u_2'(x) = \frac{y_1(x)q(x)}{y_1(x)y_2'(x) - y_1'(x)y_2(x)}.$$
Hence we can now integrate the above two formulas for $u_1'(x)$ and $u_2'(x)$ to find our particular solution,
$$y_P(x) = u_1(x)y_1(x) + u_2(x)y_2(x) = -y_1(x)\int\left(\frac{y_2(x)q(x)}{y_1(x)y_2'(x) - y_1'(x)y_2(x)}\right)dx + y_2(x)\int\left(\frac{y_1(x)q(x)}{y_1(x)y_2'(x) - y_1'(x)y_2(x)}\right)dx.$$

– Example:
We will show variation of parameters is indeed a true method for finding the particular solution by performing the same example as in the undetermined coefficients section. Recall we wish to solve
$$y'' - y' - 2y = 6x^2.$$
Recall we found the complementary solution to be $y_C(x) = \alpha_1 e^{-x} + \alpha_2 e^{2x}$, and thereby our fundamental solutions are $y_1(x) = e^{-x}$ and $y_2(x) = e^{2x}$.
Using our formulation above we have

$$y_P(x) = -y_1(x)\int\left(\frac{y_2(x)q(x)}{y_1y_2' - y_1'y_2}\right)dx + y_2(x)\int\left(\frac{y_1(x)q(x)}{y_1y_2' - y_1'y_2}\right)dx$$
$$= -e^{-x}\int\left(\frac{e^{2x}(6x^2)}{e^{-x}(2e^{2x}) + e^{-x}e^{2x}}\right)dx + e^{2x}\int\left(\frac{e^{-x}(6x^2)}{e^{-x}(2e^{2x}) + e^{-x}e^{2x}}\right)dx$$
$$= -6e^{-x}\int\left(\frac{e^{2x}x^2}{3e^x}\right)dx + 6e^{2x}\int\left(\frac{e^{-x}x^2}{3e^x}\right)dx$$
$$= -2e^{-x}\int e^xx^2\,dx + 2e^{2x}\int e^{-2x}x^2\,dx$$
$$= -2e^{-x}\left[(x^2 - 2x + 2)e^x\right] + 2e^{2x}\left[-\left(\frac{x^2}{2} + \frac{x}{2} + \frac{1}{4}\right)e^{-2x}\right]$$
$$= (-2x^2 + 4x - 4) + \left(-x^2 - x - \frac{1}{2}\right)$$
$$= -3x^2 + 3x - \frac{9}{2}.$$
Therefore we see that our variation of parameters methodology is consistent with what undetermined coefficients gave us (not that we are surprised; they should agree).
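Both methods can also be cross-checked symbolically. The minimal sketch below, assuming SymPy is available, asks `dsolve` for the general solution of the same equation, which should display the complementary part plus the particular polynomial $-3x^2 + 3x - \frac{9}{2}$.

```python
# A minimal sketch (assuming SymPy is available) verifying the particular solution
# of y'' - y' - 2y = 6x^2 found by undetermined coefficients / variation of parameters.
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

ode = sp.Eq(y(x).diff(x, 2) - y(x).diff(x) - 2*y(x), 6*x**2)
general = sp.dsolve(ode, y(x))
print(sp.expand(general.rhs))
# expected: C1*exp(-x) + C2*exp(2*x) - 3*x**2 + 3*x - 9/2
```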

2.5 Substitution Methods for *some* Nonlinear ODEs
In this section we will discuss a solution technique for *some* first-order non-linear ODEs, namely the Bernoulli, Riccati, and Clairaut equations. We begin with Bernoulli.

2.5.1 Bernoulli Equation


The Bernoulli equation is given by
$$\frac{dy}{dx} + P(x)y = g(x)y^n, \qquad (5)$$
and hence is a non-linear first-order ODE for $n \ne 0, 1$ (assuming $n \in \mathbb{R}$). If $y \ne 0$ on our interval of interest, we can rewrite this as
$$y^{-n}\frac{dy}{dx} + P(x)y^{1-n} = g(x).$$
The substitution we use for Bernoulli is
$$\phi = y^{1-n}, \qquad (6)$$
for $n \ne 0, 1$. We then obtain the following differential relationship,
$$\frac{d\phi}{dx} = (1-n)y^{-n}\frac{dy}{dx}.$$
Substituting this into the ODE gives
$$\frac{d\phi}{dx} + (1-n)P(x)\phi(x) = (1-n)g(x).$$
The above equation is now simply a first-order linear equation that can be solved with an integrating factor!

• Example: $\frac{dy}{dx} = y(xy^3 - 1)$

Rewriting this as $\frac{dy}{dx} + y = xy^4$, we see it is a Bernoulli equation with $n = 4$, so we make the substitution $\phi = y^{-3} \Rightarrow \frac{d\phi}{dx} = -3y^{-4}\frac{dy}{dx}$. Therefore we now have the following first-order linear ODE to solve after making the substitution,
$$\frac{d\phi}{dx} - 3\phi = -3x.$$
We now can simply hit this with an integrating factor to pop out a solution! Proceeding, we define our integrating factor to be $v = e^{\int -3\,dx} = e^{-3x}$. Doing the integrating factor dance, we multiply all terms in the equation by $v = e^{-3x}$ and then notice the left hand side is indeed just the result of a product rule between $\phi$ and $e^{-3x}$, i.e.,
$$\frac{d}{dx}\left[e^{-3x}\phi(x)\right] = -3xe^{-3x}.$$
Integrating both sides gives
$$e^{-3x}\phi(x) = \left(x + \frac{1}{3}\right)e^{-3x} + C,$$
where $C$ is the integration constant. Now we simply solve for $\phi(x)$ and substitute our Bernoulli transformation (6) to get
$$\phi(x) = y^{-3} = x + \frac{1}{3} + Ce^{3x}.$$
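A quick symbolic check of the Bernoulli result is sketched below, assuming SymPy is available: any $y$ satisfying $y^{-3} = x + \frac{1}{3} + Ce^{3x}$ should make the residual of the original equation vanish.

```python
# A minimal sketch (assuming SymPy is available) checking the Bernoulli example:
# dy/dx = y(x y^3 - 1) should be solved by any y with y^(-3) = x + 1/3 + C e^(3x).
import sympy as sp

x, C = sp.symbols('x C')
y = (x + sp.Rational(1, 3) + C * sp.exp(3 * x)) ** sp.Rational(-1, 3)

residual = sp.diff(y, x) - y * (x * y**3 - 1)
print(sp.simplify(residual))   # expected: 0
```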

2.5.2 Riccati Equations

The Riccati Equation takes the following non-linear form,
$$\frac{dy}{dx} = Q(x) + P(x)y + R(x)y^2. \qquad (7)$$
The Riccati Equation may look innocent enough, i.e., similar to a Bernoulli equation but with one additional term, namely $Q(x)$; however, it turns out there is no way to find an analytical solution unless a particular solution is known. For our considerations, we will assume we already know $y_P(x)$ (somehow), and then define the new function
$$\phi = \frac{1}{y - y_P(x)}. \qquad (8)$$
Hence we see that
$$y(x) = y_P + \frac{1}{\phi} \quad\Rightarrow\quad \frac{dy}{dx} = y_P' - \frac{1}{\phi^2}\frac{d\phi}{dx}.$$
Substituting these into (7) gives
$$y_P' - \frac{1}{\phi^2}\frac{d\phi}{dx} = Q(x) + P(x)\left(y_P + \frac{1}{\phi}\right) + R(x)\left(y_P^2 + \frac{2y_P}{\phi} + \frac{1}{\phi^2}\right).$$
Simplifying the above, we obtain
$$-\frac{1}{\phi^2}\frac{d\phi}{dx} = P(x)\frac{1}{\phi} + 2R(x)\frac{y_P}{\phi} + \frac{R(x)}{\phi^2},$$
since $y_P(x)$ is a solution to (7), i.e., $y_P' = Q(x) + P(x)y_P + R(x)y_P^2$.
We can simplify further (multiplying through by $-\phi^2$) to get
$$\frac{d\phi}{dx} = -\left[P(x) + 2R(x)y_P(x)\right]\phi - R(x).$$
This equation is now a first-order linear ODE for $\phi$ and can be solved using an integrating factor!

• Example: $y' = e^{2x} + (1 + 2e^x)y + y^2$ with *already known* particular solution $y_P(x) = -e^{x}$. (We do not know how we know that, nor is that our concern at this juncture; one can check directly that it satisfies the equation.)
First we identify all the pieces,
$$P(x) = 1 + 2e^x, \qquad Q(x) = e^{2x}, \qquad R(x) = 1.$$
Next we use the transformation (8) to give us
$$\frac{d\phi}{dx} = -\left[P(x) + 2R(x)y_P(x)\right]\phi - R(x) = -\left[(1 + 2e^x) - 2(1)e^x\right]\phi - 1 = -\phi - 1.$$
Hence we have the following linear first-order ODE to solve, to which we can apply an integrating factor,
$$\frac{d\phi}{dx} + \phi = -1.$$
Identifying the integrating factor as $v(x) = e^{\int 1\,dx} = e^x$ gives
$$\frac{d}{dx}\left[e^x\phi(x)\right] = -e^x,$$
and thereupon integrating we obtain
$$e^x\phi(x) = -e^x + C,$$
and hence
$$\phi(x) = -1 + Ce^{-x}.$$
Now we need to substitute for $\phi(x)$ to get our solution $y(x)$,
$$y(x) = y_P(x) + \frac{1}{\phi} = -e^{x} + \frac{1}{Ce^{-x} - 1}.$$
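The Riccati result can also be checked symbolically; the sketch below assumes SymPy is available and simply substitutes the one-parameter family of solutions back into the equation.

```python
# A minimal sketch (assuming SymPy is available) checking the Riccati example:
# y' = e^(2x) + (1 + 2 e^x) y + y^2 with particular solution y_P = -e^x, and the
# one-parameter family y = -e^x + 1/(C e^(-x) - 1) obtained above.
import sympy as sp

x, C = sp.symbols('x C')
y = -sp.exp(x) + 1 / (C * sp.exp(-x) - 1)

residual = sp.diff(y, x) - (sp.exp(2*x) + (1 + 2*sp.exp(x)) * y + y**2)
print(sp.simplify(residual))   # expected: 0
```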

2.5.3 Clairaut’s Equation

Clairaut’s Equation is a non-linear ODE where the non-linearity is in the derivative, $\frac{dy}{dx}$. The equation takes the form
$$y(x) = x\frac{dy}{dx} + f\left(\frac{dy}{dx}\right). \qquad (9)$$
Differentiating this equation once with respect to $x$ yields
$$\frac{dy}{dx} = \frac{dy}{dx} + x\frac{d^2y}{dx^2} + f'\left(\frac{dy}{dx}\right)\frac{d^2y}{dx^2},$$
and therefore we are left with either
$$\frac{d^2y}{dx^2} = 0 \qquad\text{or}\qquad x + f'\left(\frac{dy}{dx}\right) = 0.$$

• Case 1: $\frac{d^2y}{dx^2} = 0$
In this case we find that $\frac{dy}{dx} = C$, where $C \in \mathbb{R}$. Therefore (9) becomes
$$y(x) = Cx + f(C).$$
This gives a family of straight lines as a solution, based on the value of $C$. This is known as the general solution of Clairaut’s Equation.

• Case 2: $x + f'\left(\frac{dy}{dx}\right) = 0$
We see this implies that $f'\left(\frac{dy}{dx}\right) = -x$. This leads to singular solutions. We will illustrate this via an example. Consider $f(y') = y'^2$; then we have the case where
$$f'(y') = 2y' = -x \quad\Rightarrow\quad y' = -\frac{x}{2},$$
and hence we get
$$y_s(x) = \left(-\frac{x}{2}\right)x + \left(-\frac{x}{2}\right)^2 = -\frac{1}{4}x^2.$$
The general solution is $y_G(x) = Cx + C^2$.
From uniqueness theory, we know that no two solutions may intersect at the same point or else uniqueness is violated. We see in this trivial case for Clairaut’s Equation that $y_G(x) = y_s(x)$ when $(x, y) = (-2C, -C^2)$.
Furthermore, we can even see that the two curves are tangent to each other at this point, i.e.,
$$y_G'(-2C) = C = y_s'(-2C).$$
Therefore the singular solution, $y_s(x)$, is tangent to every member of the family of curves, $y_G(x)$, for any value of $C \in \mathbb{R}$.
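These tangency claims are easy to verify symbolically. The minimal sketch below, assuming SymPy is available, checks that both the singular solution and the family of lines satisfy the Clairaut equation, and that they agree in value and slope at $x = -2C$.

```python
# A minimal sketch (assuming SymPy is available) for the Clairaut example y = x y' + (y')^2:
# it checks that the singular solution y_s = -x^2/4 and the general lines y_G = C x + C^2
# both satisfy the equation, and that they touch tangentially at x = -2C.
import sympy as sp

x, C = sp.symbols('x C')
ys = -x**2 / 4
yG = C * x + C**2

clairaut = lambda y: sp.simplify(y - (x * sp.diff(y, x) + sp.diff(y, x)**2))
print(clairaut(ys), clairaut(yG))                                     # expected: 0 0
print(sp.simplify((ys - yG).subs(x, -2*C)))                           # same value at x = -2C: 0
print(sp.simplify((sp.diff(ys, x) - sp.diff(yG, x)).subs(x, -2*C)))   # same slope at x = -2C: 0
```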

2.6 Laplace Transforms


We will now discuss how to solve ODE initial value problems using Laplace Transforms. This methodology is completely different from the methods previously described. In essence, we will transform our ODE from a differential equation to an algebraic one using integral transforms. The transform itself will take functions of time, that is, functions defined on $t \in [0, \infty)$, and transform them into functions of complex frequency. Note: when using Laplace Transforms we will require all initial data to be at $t = 0$; if it is not, you will have to transform the ODE appropriately into $t \in [0, \infty)$ before attempting to use Laplace Transforms.

We define the Laplace Transform to be
$$\mathcal{L}\{f(t)\} = F(s) = \int_0^\infty e^{-st}f(t)\,dt. \qquad (10)$$
Hence, if you are able to integrate, you will be able to find the Laplace Transform of many functions. A table of some of them is listed below.

f(t)               L{f(t)}
c (c ∈ C)          c/s
t^n (n ∈ N)        n!/s^(n+1)
e^(at)             1/(s-a)
sin(at)            a/(s^2+a^2)
cos(at)            s/(s^2+a^2)
sinh(at)           a/(s^2-a^2)
cosh(at)           s/(s^2-a^2)

Likewise, we can “undo” the Laplace Transform by taking its inverse. The Inverse Laplace Transform is defined by a line integral,
$$\mathcal{L}^{-1}\{F(s)\} = f(t) = \frac{1}{2\pi i}\lim_{T\to\infty}\int_{\gamma-iT}^{\gamma+iT}e^{st}F(s)\,ds. \qquad (11)$$
This integral gets a bit hairy to do, but it can be found using some beautiful complex analysis.

2.6.1 Some Properties of Laplace Transforms


1. Linearity:

Furthermore we note that both $\mathcal{L}$ and $\mathcal{L}^{-1}$ are linear, i.e.,
$$\mathcal{L}\{\alpha f(t) + \beta g(t)\} = \alpha\mathcal{L}\{f(t)\} + \beta\mathcal{L}\{g(t)\},$$
and similarly for $\mathcal{L}^{-1}$.

2. Convergence:

Moreover, we know that the Laplace Transform of a function $f(t)$ will exist if $f(t)$ is piecewise continuous on $[0, \infty)$ and of at most exponential order for $t > T$. If these are satisfied, then
$$\lim_{s\to\infty}\mathcal{L}\{f(t)\} = 0.$$
We will sketch a quick proof below. Since $f(t)$ is piecewise continuous on $[0, T]$, we know that it is bounded on that interval, which implies $|f(t)| \le M_1e^{0t}$. Note for $t > T$ we have $|f(t)| \le M_2e^{\alpha t}$.

Therefore, let $M = \max\{M_1, M_2\}$ and $a = \max\{0, \alpha\}$. Then we have for this function $f(t)$,
$$\mathcal{L}\{f(t)\} = \int_0^\infty e^{-st}f(t)\,dt \le \int_0^\infty e^{-st}|f(t)|\,dt \le M\int_0^\infty e^{-st}e^{at}\,dt = M\left[\frac{e^{-(s-a)t}}{-(s-a)}\right]_0^\infty = \frac{M}{s-a}$$
for $s > a$, and hence $\lim_{s\to\infty}\mathcal{L}\{f(t)\} = 0$.


3. Translational Property:

If a ∈ R, then
L {eat f (t)} = F (s − a).

The proof is clear by the definition of the Laplace Transform, i.e.,


$$\mathcal{L}\{e^{at}f(t)\} = \int_0^\infty e^{-st}e^{at}f(t)\,dt = \int_0^\infty e^{-(s-a)t}f(t)\,dt = F(s-a).$$

4. Translation Property with Heaviside Function:

If a ∈ R+ , then
L {f (t − a)H(t − a)} = e−as F (s),

where $H(t-a)$ is the Heaviside function, defined as
$$H(t-a) = \begin{cases} 0, & 0 \le t < a \\ 1, & t \ge a. \end{cases}$$

5. Derivatives of Transforms:

For n ∈ N, we have
$$\mathcal{L}\{t^nf(t)\} = (-1)^n\frac{d^n}{ds^n}F(s).$$
6. Convolutions:

Let f (t) and g(t) be piecewise continuous functions on [0, ∞) and with order at most exponential,
then
L {f (t) ∗ g(t)} = L {f (t)}L {g(t)} = F (s)G(s).

Likewise, for the inverse Laplace Transform, we have
$$\mathcal{L}^{-1}\{F(s)G(s)\} = f * g.$$

Note: be careful of which way you are transforming products!!


7. Laplace Transform of Delta Function:

The Dirac delta function, $\delta(x - a)$, is (informally) characterized by
$$\delta(x - a) = 0 \ \text{ for } x \ne a, \qquad \int_{-\infty}^{\infty}\delta(x - a)\,dx = 1.$$
The Laplace Transform of $\delta(t - a)$, for $a \ge 0$, is
$$\mathcal{L}\{\delta(t - a)\} = e^{-sa}.$$

2.6.2 Laplace Transforms for Solving ODEs


Before we dive right into solving ODEs using Laplace Transforms, we will make a few remarks.
Laplace Transforms are ideally suited for initial value problems for linear ODEs with constant coef-
ficients. They also assume that your initial value begins at $t = 0$; if not, you will have to translate your equation accordingly into the interval $[0, \infty)$. Furthermore, we will see that taking the Laplace
Transform of our ODE will “transform” our problem from a differential one to a more algebraically
flavored one.

The only real piece we are missing is how to take Laplace Transforms of Derivatives. We proceed
immediately to remedy this.

Laplace Transforms of Derivatives:

We first state the result. If $f(t) \in C^n$ on $[0, \infty)$ and is of at most exponential order, then
$$\mathcal{L}\left\{\frac{d^nf}{dt^n}\right\} = s^nF(s) - \sum_{k=0}^{n-1}s^kf^{(n-1-k)}(0) = s^nF(s) - s^{n-1}f(0) - s^{n-2}f'(0) - \ldots - f^{(n-1)}(0),$$
where $F(s) = \mathcal{L}\{f(t)\}$.

We will show the proof for the first derivative. The others follow analogously...or by induction.
Consider L {y 0 (t)}, and proceed to the integration,
$$\mathcal{L}\{y'(t)\} = \int_0^\infty e^{-st}y'(t)\,dt = \left[e^{-st}y(t)\right]_0^\infty + s\int_0^\infty e^{-st}y(t)\,dt = -y(0) + s\mathcal{L}\{y(t)\} = sY(s) - y(0),$$
and hence
$$\mathcal{L}\{y'(t)\} = sY(s) - y(0).$$

Note that in calculating L {y 0 (t)} (or any higher derivatives for that matter), we need to use the
initial data! This is where the initial data will creep into using Laplace Transforms for solutions
to linear ODEs!

Let’s finally move onto using Laplace Transforms for ODEs!

Using Laplace Transforms to Solve ODEs:

There are 4 steps in solving ODEs using Laplace Transforms. They are listed below:
1. Identify your ODE
2. Transform all the terms in the ODE using L
3. Solve for Y (s)
4. Transform back using L −1 to get y(t)
Let’s do an example!

Example: Consider
$$\frac{dy}{dt} - 3y = e^{2t}, \qquad\text{with } y(0) = 1.$$
We begin by transforming all the terms using $\mathcal{L}$ and exploiting its linearity property,
$$\mathcal{L}\left\{\frac{dy}{dt} - 3y\right\} = \mathcal{L}\left\{\frac{dy}{dt}\right\} - 3\mathcal{L}\{y\} = \mathcal{L}\left\{e^{2t}\right\},$$
and obtain
$$\left[sY(s) - y(0)\right] - 3Y(s) = \frac{1}{s-2},$$
or, substituting the initial data,
$$\left[sY(s) - 1\right] - 3Y(s) = \frac{1}{s-2}.$$
At this point, our differential problem has been cast into a more algebraic problem, where we must solve for $Y(s)$,
$$Y(s) = \frac{s-1}{(s-2)(s-3)} = \frac{-1}{s-2} + \frac{2}{s-3}.$$
We have used the method of partial fractions above. Hence we now have
$$y(t) = \mathcal{L}^{-1}\{Y(s)\} = \mathcal{L}^{-1}\left\{\frac{-1}{s-2}\right\} + \mathcal{L}^{-1}\left\{\frac{2}{s-3}\right\}.$$
Transforming back (using tables), this yields our solution,
$$y(t) = -e^{2t} + 2e^{3t}.$$
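Both the transform-side algebra and the final answer can be reproduced symbolically; the sketch below is a minimal example assuming SymPy is available.

```python
# A minimal sketch (assuming SymPy is available) reproducing the Laplace-transform
# example dy/dt - 3y = e^(2t), y(0) = 1, and comparing with y(t) = -e^(2t) + 2 e^(3t).
import sympy as sp

t, s = sp.symbols('t s', positive=True)
y = sp.Function('y')

# Solve directly with dsolve as a cross-check of the transform calculation.
sol = sp.dsolve(sp.Eq(y(t).diff(t) - 3*y(t), sp.exp(2*t)), y(t), ics={y(0): 1})
print(sol)   # expected: Eq(y(t), -exp(2*t) + 2*exp(3*t))

# The transform-side algebra: Y(s) = (s - 1)/((s - 2)(s - 3)), inverted term by term.
Y = (s - 1) / ((s - 2) * (s - 3))
print(sp.apart(Y, s))                           # -1/(s - 2) + 2/(s - 3)
print(sp.inverse_laplace_transform(Y, s, t))    # -exp(2*t) + 2*exp(3*t), times a Heaviside factor
```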

One last remark: by using Laplace Transforms we solved for our complementary and particular solutions simultaneously! Furthermore, there are other transform methods, namely Fourier Transforms, which can be used to solve ODEs (and PDEs) as well. Typically, if you are trying to solve a differential equation (whether it is an ODE or a PDE), you can use Laplace Transforms to transform independent variables living in $[0, \infty)$, and you can use Fourier Transforms to transform variables living in $(-\infty, \infty)$.

2.7 Non-dimensionalization - now you see me, now you don’t!


Okay, maybe non-dimensionalization is not technically a disappearing act, but there is a bit of magic
that happens once one goes through the entire procedure and arrives at the same equation, but free
of all units. This procedure is useful for simplifying equations, parameterizing problems, as well as
setting up an equation for proper numerical simulation, i.e., if a physical quantity's scale is too large or too small, non-dimensionalizing the equation before using numerical methods will free the equation to work in any scaling that you wish.

Furthermore, non-dimensionalization can also shed insight on the characteristic properties of a system. Quantities like resonant frequencies, half-lives, or bifurcation points can usually be recovered through the non-dimensionalization process. Very popular scaling quantities in fluid dynamics, e.g., the Reynolds number, Prandtl number, and Rayleigh number, are the results of non-dimensionalizing the Navier-Stokes equations with or without advective-diffusive coupled equations, and give clear regimes in which different physical fluid behavior can be observed.

Moreover, after performing this procedure your equation will usually take a bit more friendly form,
seemingly free of as many parameters, since parameters will have been collected together into non-
dimensional quantities. This will become more clear as we do examples.

We will now describe the steps in performing non-dimensionalization. The steps are as follows

1. First identify all the independent and dependent variables within the equation (or system of
equations)

2. Replace each variable with a scaled quantity relative to a characteristic unit


3. Divide through by the coefficient of the highest order term, whether it be derivative or polynomial
in nature

4. Choose the characteristic units for each variable so as to make the coefficients of as many terms as possible equal to one.

5. Rewrite the equation, or system of equations, using the new dimensionless quantities.

Let’s do an example!

2.8 Oscillating Spring


Consider the following system describing a mechanically oscillating spring with a mass attached,
$$m\frac{d^2x}{dt^2} + b\frac{dx}{dt} + kx = F_0(t),$$
where $x(t)$ is the displacement of the block of mass $m$ at time $t$, $b$ is the damping coefficient, $k$ is the spring stiffness, and $F_0(t)$ may be some periodic forcing term. We begin by making the dimensionless variables for displacement and time as follows,
$$\chi = \frac{x}{x_C} \qquad\text{and}\qquad \tau = \frac{t}{t_C},$$
where $\chi$ and $\tau$ are the non-dimensional displacement and time, respectively, and $x_C$ and $t_C$ are the characteristic displacement and time quantities (or scalings), respectively. Before we can do the substitutions, we need to first relate the differential operators $\frac{d}{dt}$ and $\frac{d^2}{dt^2}$ to the non-dimensionalization.

First we consider that $t = t_C\tau$ and hence see that $dt = t_C\,d\tau$, since $t_C$ is the characteristic time, which is constant. Hence we get that
$$\frac{d\tau}{dt} = \frac{1}{t_C} \quad\Rightarrow\quad \frac{d}{dt} = \frac{d\tau}{dt}\frac{d}{d\tau} = \frac{1}{t_C}\frac{d}{d\tau},$$
by the chain rule. Similarly, for higher order derivatives we find that
$$\frac{d^n}{dt^n} = \left(\frac{1}{t_C}\right)^n\frac{d^n}{d\tau^n}.$$
Now let's begin the substitutions!
$$m\frac{d^2x}{dt^2} + b\frac{dx}{dt} + kx = F_0(t)$$
$$m\frac{x_C}{t_C^2}\frac{d^2\chi}{d\tau^2} + b\frac{x_C}{t_C}\frac{d\chi}{d\tau} + kx_C\chi = F_0(t_C\tau).$$
Next, divide through by the leading order coefficient, $m\frac{x_C}{t_C^2}$, and we obtain
$$\frac{d^2\chi}{d\tau^2} + \frac{bt_C}{m}\frac{d\chi}{d\tau} + \frac{kt_C^2}{m}\chi = \frac{t_C^2}{mx_C}F_0(t_C\tau).$$

We are now at a crossroads, where we can decide what the characteristic displacement, $x_C$, and time, $t_C$, should be. Since we are trying to make overall coefficients equal to one, there are two obvious choices for $t_C$, as follows,
$$t_C = \frac{m}{b}\ \text{(first-order normalization)} \qquad\text{OR}\qquad t_C = \sqrt{\frac{m}{k}}\ \text{(zeroth-order normalization)}.$$
We will choose the latter, as we will see this will lead to another non-dimensional quantity that is referred to as the damping ratio. It does not matter which choice we make, but one does lead to a more traditionally studied problem. Now our governing system takes the form
$$\frac{d^2\chi}{d\tau^2} + \frac{b}{\sqrt{mk}}\frac{d\chi}{d\tau} + \chi = \frac{1}{kx_C}F_0(t_C\tau).$$
Furthermore, we can define new parameters to lump all the parameters together in front of the linear term and the non-homogeneous term. We note that in the engineering community, for these types of systems, folks define
$$2\zeta = \frac{b}{\sqrt{mk}} \qquad\text{and}\qquad x_C = \frac{1}{k}.$$
The reason for the factor of 2 is purely aesthetic and arbitrary; it will later allow us to parameterize solutions in terms of $\zeta$ solely. $\zeta$ is traditionally referred to as the damping ratio. The $x_C$ definition can be thought of as the displacement per unit force.

Hence the non-dimensional form of the damped harmonic oscillator is
$$\frac{d^2\chi}{d\tau^2} + 2\zeta\frac{d\chi}{d\tau} + \chi = \frac{1}{kx_C}F_0(t_C\tau),$$
where we have made the assumptions that $\zeta = \frac{b}{2\sqrt{mk}}$, $t_C = \sqrt{\frac{m}{k}}$, and $x_C = \frac{1}{k}$.
To solve the governing equation, we need to find both particular and complementary solutions. We will only concern ourselves with the complementary solutions, for illustrating different regimes based on our parameters, specifically the damping ratio, $\zeta$. For the complementary solutions, e.g., solutions to the homogeneous problem, we make the ansatz for solutions of the form
$$\chi = e^{\lambda\tau},$$
since it is a second-order, linear, constant-coefficient ordinary differential equation. Substituting this ansatz into the governing equation, we find
$$e^{\lambda\tau}\left(\lambda^2 + 2\zeta\lambda + 1\right) = 0,$$
and then solving for $\lambda$, we obtain
$$\lambda = -\zeta \pm \sqrt{\zeta^2 - 1}. \qquad (12)$$
We now have the following three possible branches:

1. Overdamped: $\zeta > 1$. We have two distinct eigenvalues of our differential operator, giving us the following transient solution,
$$\chi(\tau) = e^{-\zeta\tau}\left[c_1e^{\tau\sqrt{\zeta^2-1}} + c_2e^{-\tau\sqrt{\zeta^2-1}}\right].$$

2. Critically damped: $\zeta = 1$. We now only have one distinct eigenvalue of our differential operator. Hence at this stage we have one transient solution, $\chi_1(\tau) = c_3e^{-\tau}$. However, we will also have a resonant eigenfunction, $\chi_2(\tau) = c_4\tau e^{-\tau}$. Therefore our transient solution for this critically damped case is
$$\chi(\tau) = (c_3 + \tau c_4)e^{-\tau}.$$

3. Underdamped: $\zeta < 1$. We have two distinct eigenvalues of our differential operator; however, they are both complex with nonzero imaginary part. This is not an issue, as our transient solutions can be written as
$$\chi(\tau) = \tilde{c}_5e^{\left(-\zeta + i\sqrt{1-\zeta^2}\right)\tau} + \tilde{c}_6e^{\left(-\zeta - i\sqrt{1-\zeta^2}\right)\tau},$$
or equivalently, using some algebraic sorcery,
$$\chi(\tau) = e^{-\zeta\tau}\left[c_5\cos\left(\tau\sqrt{1-\zeta^2}\right) + c_6\sin\left(\tau\sqrt{1-\zeta^2}\right)\right].$$

Hence we found there are three regimes for the damped harmonic oscillator system: overdamped, underdamped, and critically damped. Moreover, we were able to discern when such a bifurcation would occur by using non-dimensionalization, which gave rise to a natural non-dimensional quantity, $\zeta$, by which we can classify the transient solution regimes. Recall the quantity $\zeta$ is sometimes referred to as the damping ratio, which is heavily used by the engineering community to model decaying oscillations in a system after undergoing a perturbation.
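The three regimes are easy to see numerically as well. The following minimal sketch, assuming NumPy and SciPy are available, integrates the unforced non-dimensional oscillator for one representative $\zeta$ in each regime.

```python
# A minimal sketch (assuming NumPy and SciPy are available) integrating the
# non-dimensional oscillator chi'' + 2*zeta*chi' + chi = 0 for the three regimes.
import numpy as np
from scipy.integrate import solve_ivp

def oscillator(tau, u, zeta):
    chi, dchi = u
    return [dchi, -2 * zeta * dchi - chi]

tau_span = (0.0, 20.0)
tau_eval = np.linspace(*tau_span, 400)

for zeta in (2.0, 1.0, 0.1):   # overdamped, critically damped, underdamped
    sol = solve_ivp(oscillator, tau_span, [1.0, 0.0], args=(zeta,), t_eval=tau_eval)
    print(f"zeta = {zeta}: chi(20) = {sol.y[0, -1]: .4f}")
```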

2.9 ODE Systems!


In this section we will discuss the big picture treatment of systems of ODEs, of both the linear and non-linear varieties. First, we will begin with an analogy between linear constant coefficient ODEs and systems of constant coefficient linear ODEs, in that we seek exponential-type solutions in both cases.

Next, we will exploit the relationship between higher order ODEs and systems of first-order ODEs to further solidify this analogy. Through the search for eigenvalues of these systems, we opt to build intuition behind matrix exponential solutions, and massage these solutions into a more useful and meaningful form.

We then turn the discussion to finding particular solutions to non-homogeneous linear systems of ODEs with constant coefficients. We will see that the methods we employ for this are completely analogous to our approach in the sections above. Spoilers: we will discuss undetermined coefficients for these types of systems as well as everyone's favorite, variation of parameters.

Finally, we will close the discussion of systems of ODEs with a brief big-picture-esque story of non-linear systems, equilibria and steady states, the Jacobian matrix and its associated eigenvalues. We then will chat about classifying these equilibria as either sinks, sources, saddle points, degenerate cases, centers, or stable/unstable foci, at least locally.

2.9.1 Let’s use exponentials, again?!


We begin this story with our old first-order homogeneous constant coefficient friend,
$$\frac{dx}{dt} = ax,$$
where $a \in \mathbb{R}$, with initial data $x(0) = x_0$. We note that the solution is (almost trivially) given by the exponential function,
$$x(t) = e^{at}x_0.$$
Wouldn't it be nice if we could so easily write the solution to an analogous system of first-order constant-coefficient homogeneous ODEs? Well, we can! See, mathematics is beautiful (*especially when it works out...). Consider the following system of $N$ ODEs,
$$\frac{dx}{dt} = Ax,$$
with initial value $x(0) = x_0$, and where $A \in \mathbb{R}^{N\times N}$.

The solution to the above system can be given as
$$x(t) = e^{At}\cdot x_0.$$
...but what does this mean? How does the exponential function handle matrices? Are we just blowing smoke? Nah, not this time. When in doubt, let's turn to Taylor Series, even though this isn't numerical analysis. Shh, we won't tell.

Recall the Taylor Series for the exponential function of a single variable, $at$, with $at \in \mathbb{R}$:
$$e^{at} = 1 + at + \frac{1}{2!}(at)^2 + \frac{1}{3!}(at)^3 + \ldots + \frac{1}{N!}(at)^N + \ldots.$$
Well, as it turns out, we can use this idea to define the exponential of a matrix! Let's do some Taylor Series!
$$e^{At} = I + At + \frac{1}{2!}(At)^2 + \frac{1}{3!}(At)^3 + \ldots + \frac{1}{N!}(At)^N + \ldots.$$

There we go - we know what the solution is! But is the above definition useful to us, or does it really give us much intuition about the behavior of the solutions of the linear ODE system? Unfortunately, in its current form it does not. However, this is where we put on our linear algebra tool belts and dive back into the world of matrix factorizations.

We note that if a matrix has an eigenvalue decomposition, i.e., each eigenvalue, $\lambda$, of a matrix, $A$, has geometric multiplicity equal to its algebraic multiplicity, then we can factorize $A$ as
$$A = VDV^{-1},$$
where $D$ is a diagonal matrix containing the eigenvalues on the main diagonal and $V$ is a matrix of eigenvectors, all of which are linearly independent (and hence $V^{-1}$ exists). We can use this factorization to better understand what the matrix exponential does for us. Performing this substitution, we obtain

$$e^{At} = e^{VDV^{-1}t} = I + VDV^{-1}t + \frac{(VDV^{-1}t)^2}{2!} + \frac{(VDV^{-1}t)^3}{3!} + \ldots$$
$$= I + VDV^{-1}t + VDV^{-1}VDV^{-1}\frac{t^2}{2!} + VDV^{-1}VDV^{-1}VDV^{-1}\frac{t^3}{3!} + \ldots$$
$$= I + VDV^{-1}t + VD^2V^{-1}\frac{t^2}{2!} + VD^3V^{-1}\frac{t^3}{3!} + \ldots$$
$$= VV^{-1} + VDV^{-1}t + VD^2V^{-1}\frac{t^2}{2!} + VD^3V^{-1}\frac{t^3}{3!} + \ldots$$
$$= V\left[I + Dt + \frac{(Dt)^2}{2!} + \frac{(Dt)^3}{3!} + \ldots\right]V^{-1}$$
$$= Ve^{Dt}V^{-1}.$$

Hence, by diagonalizing the matrix $A$ in its eigenvalue decomposition, we are able to see the influence that the eigenvalues will have on the evolving system and, moreover, see the role the matrix exponential takes in building a solution to a system of linear ODEs. Furthermore, we can massage the above relation further to see
$$Ve^{Dt}V^{-1} = V\begin{pmatrix} e^{\lambda_1 t} & 0 & \cdots & 0 \\ 0 & e^{\lambda_2 t} & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & e^{\lambda_n t} \end{pmatrix}V^{-1}.$$
Moreover, if $A$ is Hermitian (or self-adjoint), the eigendecomposition of $A$ will lead to an orthogonalized decomposition, e.g.,
$$A = QDQ^*,$$
where $D$ is the diagonal matrix of eigenvalues and $Q$ is a unitary (in the real case, orthogonal) matrix composed of the eigenvectors, all of which are orthogonal. This result stems from the fact that Hermitian matrices have real eigenvalues and orthogonal eigenvectors. This then leads to the matrix exponential taking the form
$$e^{At} = Q\begin{pmatrix} e^{\lambda_1 t} & & \\ & \ddots & \\ & & e^{\lambda_n t} \end{pmatrix}Q^*.$$
We note that not every matrix will have an eigenvalue decomposition; however, every square matrix can be written in its Jordan form, e.g.,
$$A = PJP^{-1},$$
where $P$ is a matrix of generalized eigenvectors (needed whenever the algebraic multiplicity of an eigenvalue is greater than its geometric multiplicity). Moreover, $J$ is the Jordan Normal Form, which can be written as a block-diagonal matrix of Jordan blocks, i.e.,
$$J = \begin{pmatrix} J_1 & & \\ & \ddots & \\ & & J_p \end{pmatrix},$$
where each Jordan block, $J_k$, is written
$$J_k = \begin{pmatrix} \lambda_k & 1 & & \\ & \lambda_k & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_k \end{pmatrix},$$
and the total number of rows in the Jordan blocks associated with an eigenvalue $\lambda_k$ is equal to the algebraic multiplicity of that eigenvalue. We can substitute the Jordan decomposition into the matrix exponential and get a similar relation, i.e.,
$$e^{At} = Pe^{Jt}P^{-1}.$$
However, $e^{Jt}$ is no longer a diagonal matrix, but takes an upper-triangular form. The diagonal elements take the usual form $e^{\lambda_k t}$, but the super-diagonal entries pick up polynomial-in-$t$ factors such as $te^{\lambda_k t}$, $\frac{t^2}{2!}e^{\lambda_k t}$, and so on.

Also, if we do not consider an eigenvalue decomposition at all, but instead have found the matrix $A$'s Schur decomposition, which every square matrix has, we can rewrite $A$ as
$$A = QTQ^*,$$
where $T$ is upper-triangular (with the eigenvalues of $A$ along its main diagonal) and $Q$ is a unitary matrix. Hence, upon substituting this into the matrix exponential, we get
$$e^{At} = Qe^{Tt}Q^*.$$
In this case of substituting the Schur decomposition, we see a similar behavior to that of the Jordan decomposition: the elements along the main diagonal of $e^{Tt}$ are simply the eigenvalues being exponentiated, while the entries above the diagonal are more complicated functions of $t$ determined by the strictly upper-triangular part of $T$.

Now that we have a more intuitive understanding of what the matrix exponential is, we set out to encourage understanding of why eigenvalues are the drivers behind a system's solutions!
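Before moving on, the eigendecomposition form of the matrix exponential derived above is easy to check against a library routine; the sketch below assumes NumPy and SciPy are available.

```python
# A minimal sketch (assuming NumPy and SciPy are available) comparing scipy's matrix
# exponential with the eigendecomposition form V e^{Dt} V^{-1} derived above.
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 1.0],
              [0.0, 3.0]])
t = 0.7

lam, V = np.linalg.eig(A)                      # eigenvalues and eigenvectors of A
via_eig = V @ np.diag(np.exp(lam * t)) @ np.linalg.inv(V)
via_expm = expm(A * t)

print(np.allclose(via_eig, via_expm))          # expected: True
```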

2.9.2 Eigen-connections between ODEs and Linear Algebra


That's right, it's time for the marriage between two of your favorite subjects: linear algebra and differential equations. We've been, how do they say, beating around the bush for a while now, but it's time to make their relationship official and a part of pop culture. After all, eigenvalues are sometimes referred to as the “heart of an operator.” We first begin by recognizing the standard eigenvalue equation from linear algebra, where we have some linear operator, $A \in \mathbb{R}^{n\times n}$,
$$Ax = \lambda x.$$
The above relationship says that when our linear operator, $A$, acts on a certain vector, cleverly called the eigenvector, the action can be thought of as a scaling of that eigenvector. In other words, even though most of the time when a square matrix acts upon a vector the result is a vector aligned in a different direction than the original, when a square matrix acts upon an eigenvector the direction does not change; the result is simply a scalar multiple of the original eigenvector.

Cute, but how does this tie back into differential equations? Well, for one thing all those times we
were using the friendly exponential ansatz, y = eαt , to find a complementary solution for constant-
coefficient ODEs, we were actually searching for an eigenvalue of the operator, in this case, α. Let’s
take a closer look.

• Example: Eigenvalues of the damped mass-spring system

Consider the following damped harmonic oscillator system,
$$m\ddot{x} + b\dot{x} + kx = 0,$$
with some initial conditions, $x(0) = x_0$ and $\dot{x}(0) = \dot{x}_0$. Now, like we hinted at above, if $m, b, k \in \mathbb{R}$, then we have a very nice constant coefficient ODE on our hands, which we can solve by assuming solutions of the form $x = e^{\alpha t}$.

Performing that ansatz, we get
$$e^{\alpha t}\left(m\alpha^2 + b\alpha + k\right) = 0,$$
where to find $\alpha$ we simply need to solve the quadratic. However, rather than go into those gory details, we will just expose the above quadratic equation for what it actually is: a characteristic equation for our eigenvalues, $\alpha$! (Not $\alpha$-factorial, I just got a little excited.) Hence the characteristic equation of our differential operator in this case is
$$m\alpha^2 + b\alpha + k = 0.$$

The story does not end there. We will now view this problem through a more traditional linear operator lens, in an effort to encourage more comfort and belief in this marriage of sorts. We also get to explore a new technique we have not used yet: expressing a higher-order ODE as a system of first-order ODEs!

To do this for our damped mass-spring system, we will simply perform the change of variables of letting
$$v = \dot{x} \qquad\text{and (taking } m = 1 \text{ for convenience)}\qquad \dot{v} = \ddot{x} = -bv - kx,$$
which is popular among the physics-y folk! I guess that whole first-derivative-of-a-physical-variable-is-velocity thing; anyways, I digress. Now we see that we can rewrite the ODE in terms of a linear system between $x$ and $v$, e.g.,
$$\frac{d}{dt}\begin{pmatrix} x \\ v \end{pmatrix} = \begin{pmatrix} \dot{x} \\ \dot{v} \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -k & -b \end{pmatrix}\begin{pmatrix} x \\ v \end{pmatrix},$$

with associated initial value vector,
$$x_0 = \begin{pmatrix} x_0 \\ \dot{x}_0 \end{pmatrix}.$$

By our claims about linear ODE systems, we know the solution should be
$$x(t) = Ve^{Dt}V^{-1}\cdot x_0 = V\begin{pmatrix} e^{\lambda_1 t} & 0 \\ 0 & e^{\lambda_2 t} \end{pmatrix}V^{-1}\cdot x_0.$$

However, let's take a closer look at the eigenvalues, $\lambda_1$ and $\lambda_2$, e.g., let's actually find their characteristic equation. Spoiler: it's going to be the same as the characteristic equation we found above for the second-order damped mass-spring equation (with $m = 1$). Doing the standard approach for eigenvalues, we obtain
$$\det(\lambda I - A) = \begin{vmatrix} \lambda & -1 \\ k & \lambda + b \end{vmatrix} = 0,$$
and hence
$$\lambda(\lambda + b) + k = \lambda^2 + b\lambda + k = 0.$$
Therefore we see that we get the same characteristic polynomial whether we approach the problem as a higher-order ODE or as a linear ODE system! We care about eigenvalues! Now that we've motivated how ODEs and Linear Algebra are so closely tied together via eigenvalues, we will shift gears and consider non-homogeneous linear systems of ODEs.
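This equivalence is easy to confirm numerically; the sketch below, assuming NumPy is available, compares the eigenvalues of the first-order system matrix with the roots of the characteristic polynomial for some sample values of $b$ and $k$.

```python
# A minimal sketch (assuming NumPy is available): the eigenvalues of the first-order
# system matrix [[0, 1], [-k, -b]] match the roots of the characteristic polynomial
# alpha^2 + b*alpha + k = 0 (the m = 1 case discussed above).
import numpy as np

b, k = 0.5, 2.0
A = np.array([[0.0, 1.0],
              [-k, -b]])

eigvals = np.linalg.eigvals(A)
poly_roots = np.roots([1.0, b, k])

print(np.sort_complex(eigvals))
print(np.sort_complex(poly_roots))   # same values, as expected
```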

2.9.3 Non-homogeneous Systems of Linear ODEs


So far in our trek into linear ODE systems, we have only considered homogeneous systems, e.g., systems of the form
$$\frac{dx}{dt} = Ax.$$
However, we can very easily consider a system of the form
$$\frac{dx}{dt} = Ax + f(t).$$
In solving these types of non-homogeneous equations, we have a few methods, which are analogous to methods we used previously to find particular solutions of a single ODE, namely undetermined coefficients and variation of parameters for systems of linear ODEs! On that note, in solving non-homogeneous systems, we need to find both a homogeneous (complementary) solution and a particular solution to the system, i.e.,
$$x = x_C + x_P,$$
with $x(0) = x_0$. We will now show how to find particular solutions using both undetermined coefficients and variation of parameters. We will consider the following example,
$$\frac{dx}{dt} = \begin{pmatrix} 1 & 1 \\ 0 & 3 \end{pmatrix}x + \begin{pmatrix} t - 2 \\ 6\sin(2t) \end{pmatrix}.$$

First we find the homogeneous solution. Of course, we can easily write it as $x_C(t) = e^{At}\cdot x_0$; however, we will do the rightful thing and compute $A$'s eigendecomposition. Finding the eigenvalues, i.e., solving $\det(\lambda I - A) = 0$, gives us
$$\lambda = \{1, 3\}.$$
Looking to find the associated eigenvectors, we look at the null space of
$$(\lambda I - A)v = \begin{pmatrix} \lambda - 1 & -1 \\ 0 & \lambda - 3 \end{pmatrix}v = 0 \quad\Rightarrow\quad v = \begin{pmatrix} 1 \\ 0 \end{pmatrix},\ \begin{pmatrix} 1 \\ 2 \end{pmatrix},$$
for eigenvalues $\lambda = 1$ and $\lambda = 3$, respectively. Hence we can write the homogeneous solution as
$$x_C(t) = e^{At}\cdot x_0 = e^{VDV^{-1}t}\cdot x_0 = \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}\begin{pmatrix} e^t & 0 \\ 0 & e^{3t} \end{pmatrix}\begin{pmatrix} 1 & -\frac{1}{2} \\ 0 & \frac{1}{2} \end{pmatrix}\cdot x_0.$$

We now look to find the particular solution of this linear system of ODEs using two different approaches: undetermined coefficients, which will work for special cases of the non-homogeneous term, and variation of parameters, which will work a bit more generally.

• Undetermined Coefficients for Linear Systems of ODEs

To do this we will consider a non-homogeneous problem where $f(t)$ takes the form of one of the functions (or products thereof) that were associated with the method of undetermined coefficients for one-dimensional equations. In the exact same vein as we approached this situation before using undetermined coefficients, we will make an educated guess of what the particular solution should be, e.g., one that resembles the form the non-homogeneous term has taken, plug it into the linear ODE system, and see if we can match coefficients.

The only difference now is that we are guessing a particular solution in the form of a vector; however, this does not require any more theory, possibly just more paper and pencil, as we will see. Each component of our educated-guess-of-particular-solution-vector, or EGOPSV, just kidding, that will not and should not catch on, will essentially be what the associated form of the non-homogeneous term is. Let's take a closer look at our example, before I try any more attempts at hilarity.

Recall the non-homogeneous term in our equation is the vector,


 
$$\begin{pmatrix} t - 2 \\ 6\sin(2t) \end{pmatrix}.$$

Looking at each component, we will guess the first component of our particular solution as $x_{P_1}(t) = a_1t + b_1 + c_1\sin(2t) + d_1\cos(2t)$, while the second component will take the similar form $x_{P_2}(t) = a_2t + b_2 + c_2\sin(2t) + d_2\cos(2t)$, where $x_P = (x_{P_1}(t)\ \ x_{P_2}(t))^T$. Even though each component of the non-homogeneous term has either a linear or sinusoidal form, we need to guess the full linear combination for each component of our particular solution. We are again left to simply figure out the undetermined coefficients (heyyo!). Let's substitute this into our linear system of ODEs in an attempt to find the unknowns $\{a_1, b_1, c_1, d_1, a_2, b_2, c_2, d_2\}$.

First we calculate the derivative of our guess at the particular solution,


 
$$\frac{dx_P}{dt} = \begin{pmatrix} a_1 + 2c_1\cos(2t) - 2d_1\sin(2t) \\ a_2 + 2c_2\cos(2t) - 2d_2\sin(2t) \end{pmatrix}.$$

Next we will do the substituting, in which we expect to extract a system of 8 equations for our
8 unknowns. Let’s check it out!

$$\frac{dx_P}{dt} = Ax_P + f(t)$$
$$= \begin{pmatrix} 1 & 1 \\ 0 & 3 \end{pmatrix}\begin{pmatrix} a_1t + b_1 + c_1\sin(2t) + d_1\cos(2t) \\ a_2t + b_2 + c_2\sin(2t) + d_2\cos(2t) \end{pmatrix} + \begin{pmatrix} t - 2 \\ 6\sin(2t) \end{pmatrix}$$
$$= \begin{pmatrix} a_1t + b_1 + c_1\sin(2t) + d_1\cos(2t) + a_2t + b_2 + c_2\sin(2t) + d_2\cos(2t) \\ 3a_2t + 3b_2 + 3c_2\sin(2t) + 3d_2\cos(2t) \end{pmatrix} + \begin{pmatrix} t - 2 \\ 6\sin(2t) \end{pmatrix}$$
$$= \begin{pmatrix} (a_1 + a_2 + 1)t + (b_1 + b_2 - 2) + (c_1 + c_2)\sin(2t) + (d_1 + d_2)\cos(2t) \\ 3a_2t + 3b_2 + (3c_2 + 6)\sin(2t) + 3d_2\cos(2t) \end{pmatrix},$$

and hence we need to solve

$$\begin{pmatrix} a_1 + 2c_1\cos(2t) - 2d_1\sin(2t) \\ a_2 + 2c_2\cos(2t) - 2d_2\sin(2t) \end{pmatrix} = \begin{pmatrix} (a_1 + a_2 + 1)t + (b_1 + b_2 - 2) + (c_1 + c_2)\sin(2t) + (d_1 + d_2)\cos(2t) \\ 3a_2t + 3b_2 + (3c_2 + 6)\sin(2t) + 3d_2\cos(2t) \end{pmatrix}.$$

Matching coefficients of like terms, we get the following system of equations,
$$a_1 - b_1 - b_2 = -2$$
$$a_1 + a_2 = -1$$
$$2c_1 - d_1 - d_2 = 0$$
$$c_1 + c_2 + 2d_1 = 0$$
$$a_2 - 3b_2 = 0$$
$$a_2 = 0$$
$$3c_2 + 2d_2 = -6$$
$$2c_2 - 3d_2 = 0,$$

or written in beautiful matrix form,
$$\begin{pmatrix}
1 & -1 & 0 & 0 & 0 & -1 & 0 & 0 \\
1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 2 & -1 & 0 & 0 & 0 & -1 \\
0 & 0 & 1 & 2 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 & -3 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 3 & 2 \\
0 & 0 & 0 & 0 & 0 & 0 & 2 & -3
\end{pmatrix}
\begin{pmatrix} a_1 \\ b_1 \\ c_1 \\ d_1 \\ a_2 \\ b_2 \\ c_2 \\ d_2 \end{pmatrix}
=
\begin{pmatrix} -2 \\ -1 \\ 0 \\ 0 \\ 0 \\ 0 \\ -6 \\ 0 \end{pmatrix}.$$

With enough patience and persistence by analytic methods (or by a billionth-of-a-second MATLAB computation) you can find the coefficients to be
$$\begin{pmatrix} a_1 \\ b_1 \\ c_1 \\ d_1 \\ a_2 \\ b_2 \\ c_2 \\ d_2 \end{pmatrix} = \begin{pmatrix} -1 \\ 1 \\ -6/65 \\ 48/65 \\ 0 \\ 0 \\ -18/13 \\ -12/13 \end{pmatrix}.$$

Hence our particular solution is found to be
$$x_P(t) = \begin{pmatrix} 1 - t - \frac{6}{65}\sin(2t) + \frac{48}{65}\cos(2t) \\[4pt] -\frac{18}{13}\sin(2t) - \frac{12}{13}\cos(2t) \end{pmatrix}.$$

Fun stuff. However, if our non-homogeneous term wasn’t as friendly (in the rigorous mathemat-
ical definition, aka “undesirable to do by the above scheme”), we can use the analog of variation
of parameters for systems! Hint: it is virtually the same process!

• Variation of Parameters for Linear Systems of ODEs

If you recall, when we developed the variation of parameters formalism in one dimension, we assumed a particular solution that was a combination of the fundamental solutions, i.e., the linearly independent homogeneous solutions, with each being multiplied by some unknown function of the independent variable, e.g.,
$$y_P(t) = v_1(t)y_1(t) + v_2(t)y_2(t),$$
where $\{v_1(t), v_2(t)\}$ are unknown and $\{y_1(t), y_2(t)\}$ are the fundamental solutions. This assumption is spawned from the reduction of order vein. Furthermore, we also made the assumption that
$$v_1'(t)y_1(t) + v_2'(t)y_2(t) = 0,$$
so that along the avenue to the standard variation of parameters formulae we had enough equations and unknowns. When deriving a formalism for variation of parameters for systems of linear ODEs, we experience a natural extension of this reduction of order idea to systems.

For linear systems of ODEs, we will make the assumption that the particular solution takes the natural form
$$x_P(t) = X(t)v(t),$$
where $X(t)$ is the matrix of fundamental solutions and $v(t)$ is an unknown vector function of the independent variable, $t$. Next we do what feels right: plug it into the ODE to find conditions on the identity of $v(t)$. Substituting this ansatz, we find

$$\frac{dx_P}{dt} - Ax_P = f(t)$$
$$\frac{dX}{dt}v + X\frac{dv}{dt} - AXv = f.$$

Now recall that if $X$ is the fundamental solution matrix, we have
$$\frac{dX}{dt} = AX \quad\Rightarrow\quad \left(\frac{dX}{dt} - AX\right)v(t) = 0,$$
so that term drops out regardless of $v$. Therefore we now just have the relation
$$X\frac{dv}{dt} = f(t).$$
We can easily (in principle) solve the above equation by inverting $X$, which we know to be non-singular since it is the matrix of fundamental solutions, i.e.,
$$\frac{dv}{dt} = X^{-1}f.$$
Moreover, since $v$ is only a function of the independent variable, $t$, we can integrate both sides of this equation to get
$$v(t) = \int X^{-1}(t)\,f(t)\,dt.$$

Hence our particular solution will take the form
$$x_P(t) = X(t)\int X^{-1}(t)\,f(t)\,dt.$$

Going back to our example from earlier,
$$\frac{dx}{dt} = Ax + \begin{pmatrix} t - 2 \\ 6\sin(2t) \end{pmatrix},$$
we've found our fundamental matrix to be
$$X(t) = \begin{pmatrix} e^t & e^{3t} \\ 0 & 2e^{3t} \end{pmatrix},$$
and hence
$$X^{-1}(t) = \begin{pmatrix} e^{-t} & -\frac{1}{2}e^{-t} \\ 0 & \frac{1}{2}e^{-3t} \end{pmatrix}.$$

Therefore, using the variation of parameters formula we've just sketched, we find

$$x_P(t) = X(t)\int X^{-1}(t)\,f(t)\,dt$$
$$= \begin{pmatrix} e^t & e^{3t} \\ 0 & 2e^{3t} \end{pmatrix}\int\begin{pmatrix} e^{-t} & -\frac{1}{2}e^{-t} \\ 0 & \frac{1}{2}e^{-3t} \end{pmatrix}\begin{pmatrix} t - 2 \\ 6\sin(2t) \end{pmatrix}dt$$
$$= \begin{pmatrix} e^t & e^{3t} \\ 0 & 2e^{3t} \end{pmatrix}\int\begin{pmatrix} (t-2)e^{-t} - 3e^{-t}\sin(2t) \\ 3e^{-3t}\sin(2t) \end{pmatrix}dt$$
$$= \begin{pmatrix} e^t & e^{3t} \\ 0 & 2e^{3t} \end{pmatrix}\begin{pmatrix} \frac{1}{5}e^{-t}\left[5(1-t) + 3\sin(2t) + 6\cos(2t)\right] \\[4pt] -\frac{3}{13}e^{-3t}\left[3\sin(2t) + 2\cos(2t)\right] \end{pmatrix}$$
$$= \begin{pmatrix} 1 - t + \frac{3}{5}\sin(2t) + \frac{6}{5}\cos(2t) - \frac{9}{13}\sin(2t) - \frac{6}{13}\cos(2t) \\[4pt] -\frac{18}{13}\sin(2t) - \frac{12}{13}\cos(2t) \end{pmatrix}$$
$$= \begin{pmatrix} 1 - t - \frac{6}{65}\sin(2t) + \frac{48}{65}\cos(2t) \\[4pt] -\frac{18}{13}\sin(2t) - \frac{12}{13}\cos(2t) \end{pmatrix},$$

and hence we find that we arrive at the same solution using variation of parameters as we did
using undetermined coefficients (duh!).
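As one final check, the particular solution obtained by both methods can be substituted back into the system symbolically; the sketch below assumes SymPy is available.

```python
# A minimal sketch (assuming SymPy is available) checking that the particular solution
# found above satisfies dx/dt = A x + f(t) for A = [[1, 1], [0, 3]].
import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[1, 1], [0, 3]])
f = sp.Matrix([t - 2, 6 * sp.sin(2 * t)])

xP = sp.Matrix([
    1 - t - sp.Rational(6, 65) * sp.sin(2*t) + sp.Rational(48, 65) * sp.cos(2*t),
    -sp.Rational(18, 13) * sp.sin(2*t) - sp.Rational(12, 13) * sp.cos(2*t),
])

residual = xP.diff(t) - (A * xP + f)
print(sp.simplify(residual))   # expected: Matrix([[0], [0]])
```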
Now that we’ve spent a lot of time building up the theory of linear systems of ODEs, we again
will shift into higher gears and begin considering non-linear systems of ODEs. For these systems, as
in the one-dimensional case, we will not be able to solve them generally, but instead will discuss tools
one can use to analyze these systems, to build intuition about how the underlying solutions should
behave. Let’s dance.

2.9.4 Dynamical Systems: Non-linear Systems of Equations


We now will change direction and rather than consider linear ODEs, we will begin to analyze systems
of non-linear ODEs. This is a rich area of mathematics, known more formally as dynamical systems.
We will do our best to shed light onto some of the beauty in this field, but don’t have enough time
(nor monetary compensation) to write an in depth text on it, which would require a full textbook. A
dynamical system can be written in the following succinct form,
\[ \frac{dx}{dt} = F(x, t), \]
with associated initial data x(0) = x_0. As we have previously seen for non-linear ODEs, finding solutions to
general non-linear equations is a difficult problem. The case of a system of non-linear ODEs is no
different; however, we can at least set up a formalism to best categorize the different kinds of solutions
the system will exhibit. Luckily, we have built up most of the machinery we need to do this - it’s time
to go back and revisit our good friends, eigenvalues.

“Eigenvalues, you say?” Yes. You may be wondering where eigenvalues come into the picture since
we have a non-linear system. Welp, we need to make a priority phone call before putting on our
eigen-analysis toolbelt; we have to go all the way back to calculus 1 and remember linearization. Note if
you don’t have good memories of calculus 1 and would prefer reminiscing your calculus 2 days, just
think of this as the first few terms of a Taylor Series, e.g.,

 
\[ f(x) = f(x_0) + (x - x_0)f'(x_0) + O\big((x - x_0)^2\big) \approx f(x_0) + (x - x_0)f'(x_0). \]

If we view the above linearization as a contraction mapping, the f'(x) term will tell us whether or
not our map converges or diverges. In a similar vein, we will again care about this term; however, we
must first linearize our non-linear system. Before we get into that business, we must first talk about
equilibria of the system. An equilibrium is a point where the system does not change in time, e.g.,
where dx/dt = 0. Hence to find the equilibrium points of a system, we need to find all the roots of
\[ \frac{dx}{dt} = F(x, t) = 0. \]
Once we find all the equilibrium points, we next care to try to classify each equilibrium. This is
where our eigenvalue analysis will come in, and in doing so give insight into what type of solutions
the system will exhibit. Finding the equilibrium points is not always a trivial process; however, once
we find the equilibrium points we can then jump onto the linearization bandwagon. In linearizing our
non-linear system, we will center our linearizations about equilibria exclusively. Otherwise, we would
be up a creek.

It’s time to linearize! In performing this analysis, we will assume x0 is an equilibrium point of our
dynamical system. We now proceed with computing the Taylor Series, mostly to show the general
form before picking out the linear components,

 
\[
\begin{aligned}
F(x, t) &= F(x_0, t) + \nabla F\big|_{x=x_0}(x - x_0) + (x - x_0)^T\,\nabla^2 F\big|_{x=x_0}(x - x_0) + O\big(|x - x_0|^3\big) \\
&= F(x_0, t) + J(x_0)(x - x_0) + (x - x_0)^T H(x_0)(x - x_0) + O\big(|x - x_0|^3\big) \\
&\approx F(x_0, t) + J(x_0)(x - x_0).
\end{aligned}
\]

Note that the matrix, J, is sometimes referred to as the Jacobian matrix, where it takes the form
\[
J(x) = \nabla F(x) = \begin{pmatrix}
\frac{\partial F_1}{\partial x_1} & \frac{\partial F_1}{\partial x_2} & \cdots & \frac{\partial F_1}{\partial x_N} \\
\frac{\partial F_2}{\partial x_1} & \frac{\partial F_2}{\partial x_2} & & \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial F_N}{\partial x_1} & & \cdots & \frac{\partial F_N}{\partial x_N}
\end{pmatrix}.
\]
The other term, H, is referred to as the Hessian matrix. It can be thought of as the Jacobian of
a Jacobian matrix, e.g., H(f )(x) = J(∇f )(x). More intuitively, it is the matrix of all second-order
partial derivatives. Each component of the Hessian takes the form,

\[ H_{ij} = \frac{\partial^2 f}{\partial x_i\,\partial x_j}. \]

Moreover, the Hessian is symmetric by Clairaut's Theorem, which states that the order of differentiation does
not matter as long as the second partial derivatives are continuous. In general, the Hessian is a scary
and not-so-much-fun thing to compute.

Now that we have our linearization, it is time to discuss what it does for us. What we essentially
have done is say near an equilibrium point, x0 , our dynamical system takes the approximate form,
\[ \frac{dx}{dt} \approx J(x_0)(x - x_0), \]
since F(x0, t) = 0 by definition of an equilibrium. Although this analysis will not work in a global sense,
at least locally near an equilibrium point we will be able to make claims on the types of solutions
allowed. Again, this is where we have to pick up our eigenvalue hammers. We will state their impor-
tance now!

The eigenvalues of the Jacobian matrix will help us classify the equilibrium point, and moreover
the solutions in a local neighborhood of it.
We will now push onwards and attempt to classify the equilibria, and upon doing so shed insight
on how solutions near enough to the equilibria are behaving. We will discuss the following 6 cases,
1. Sink Equilibrium (Stable)

2. Source Equilibrium (Unstable)


3. "Saddle-Point" Equilibrium
4. Degenerate Cases I and II

5. Centers
6. Stable Focus and Unstable Foci.
Each of these cases will be illustrated through the use of a dynamical system in R2 , where we will
assume the eigenvalues of the system’s Jacobian matrix to be λ1 , λ2 .

Sink Equilibrium

In this case we assume two real eigenvalues, such that λ1 , λ2 < 0.

If we imagine finding a "solution" to our linearized problem, e.g., dx/dt = J(x_0)(x − x_0), it is
clear we will have the case of decaying exponentials in our diagonalization. Therefore we would
expect any trajectory within a neighborhood of the equilibrium, x_0, to head to the equilibrium
itself. Hence this case is known as a sink. It is also stable.
Figure 2: Various classifications of equilibrium based on eigenvalues of the Jacobian Matrix. (a) Sink
(stable) Equilibrium (b) Source (unstable) Equilibrium (c) Saddle-Point Equilibrium (d) Degenerate
Case 1 (e) Degenerate Case 2 (f) Center (g) Stable Foci (h) Unstable Foci

This case can be seen in Figure(2)(a).

Source Equilibrium

In this case we assume two real eigenvalues, such that λ1 , λ2 > 0.

Performing the same thought experiment as above, upon envisioning a solution the linearized
problem, we find that any trajectory found within a neighborhood of the equilibrium would
point away from the equilibria. Therefore this equilibria is known as a source, as all trajectories
point away from the equilibrium. It is also deemed unstable.

This case can be seen in Figure(2)(b).

Saddle-Point Equilibrium

In this case we assume two real eigenvalues, such that λ1 > 0, λ2 < 0, hence one is positive and
one is negative.

This case is called a saddle-point equilibrium because some trajectories will be pointed towards
the equilibria, while others will be pointed away. This is similar to the idea of a saddle-point in
multivariate calculus. Just as a saddle-point in multivariate calculus is not strictly considered a
local maximum or minimum, this equilibrium cannot be deemed either stable or unstable.

This case can be seen in Figure(2)(c).

Degenerate Cases

In this case we assume two real eigenvalues, such that λ1 ≠ 0, λ2 = 0, hence one is strictly zero
and the other is real and nonzero. In these types of degenerate cases, we will see bifurcated behavior.

For the case where λ1 > 0, we expect solutions to act in a manner as seen in Figure(2)(d),
while in the case where λ1 < 0, the system has a feel of what is depicted in Figure(2)(e).

This type of behavior may be exhibited when radical structural differences occur in the trajec-
tories when perturbing system parameters.

Centers

In this case we assume two purely imaginary eigenvalues, such that λ1 = bi, λ2 = −bi.
Each trajectory around the equilibria, x0 , is closed and circular in nature. This can be seen in
Figure(2)(f).

We also note that if the Jacobian is skew-symmetric at x = x_0, then the equilibrium will always
be a center, since skew-symmetric matrices have purely imaginary eigenvalues. Let's actually do
a quick proof.

Proof: To show these trajectories are always circular, we begin with the relation that if
A = −A^T (a real skew-symmetric matrix), then the quadratic form x^T A x = 0. To prove this we can easily see that
\[ -x^T A x = x^T(-A)x = x^T A^T x = (x^T A x)^T = x^T A x, \]
since the quadratic form is a scalar quantity and a scalar equals its own transpose. Therefore we have
x^T A x = −x^T A x and hence x^T A x = 0.

Now that we have that relation, we consider the linearization, or actually just a linear
system of ODEs,
\[ \frac{dx}{dt} = Ax. \]
Consider A = −A^T, and multiply the above by x^T:
\[ x^T\cdot\frac{dx}{dt} = x^T A x = 0. \]
Therefore we have
\[ x^T\cdot\frac{dx}{dt} = \frac{1}{2}\frac{d}{dt}(x\cdot x) = \frac{1}{2}\frac{d}{dt}\|x\|^2 = 0. \]
From the above we see that ||x|| is constant in time, and hence all the orbits must be circular.

Stable and Unstable Foci

In this case we assume two complex eigenvalues, such that λ1 = a + bi, λ2 = a − bi, with b 6= 0.

Depending on the sign of a, this system will either spiral towards or away from the equilibrium
point. In both cases it will oscillate about the equilibrium with angular frequency b. Thinking of the solutions
to the linearized problem, we note the stability will be controlled by whether the solutions decay,
i.e., a < 0, or grow uncontrollably, i.e., a > 0.

For the case when a < 0, the trajectories spiral towards the equilibria. We call the equilibrium
a stable focus. Converging towards the equilibrium point is clear since the real part of the eigen-
value is negative. As in the sink case, which also has negative real part, we expect trajectories
to flow towards the equilibria. This can be seen in Figure(2)(g).

For the case when a > 0, the trajectories spiral away from the equilibria. We call the equilib-
rium an unstable focus. It is unstable since the real part of the eigenvalue is positive, and hence
solutions should flow away from the equilibria, analogous to the source case. This can be seen
in Figure(2)(h).

In a neighborhood of the equilibria, we expect the solutions to behave like one of the above cases.
Remember, this is all local analysis; at this juncture we cannot say anything about global solution
behavior...yet
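To make this classification recipe concrete, here is a small Python sketch (using NumPy and SymPy). The system is a hypothetical example - a damped pendulum - chosen just for illustration, not something analyzed in these notes: it computes the Jacobian at an equilibrium, grabs its eigenvalues, and reads off the equilibrium type from the cases above.

import numpy as np
import sympy as sp

# hypothetical example system: a damped pendulum, dx/dt = y, dy/dt = -sin(x) - y/2
x, y = sp.symbols('x y')
F = sp.Matrix([y, -sp.sin(x) - y/2])

J = F.jacobian(sp.Matrix([x, y]))                          # symbolic Jacobian of the field
J0 = np.array(J.subs({x: 0, y: 0}).tolist(), dtype=float)  # linearize about the equilibrium (0, 0)

lam = np.linalg.eigvals(J0)
print("eigenvalues:", lam)                                 # ~ -0.25 +/- 0.97i

if np.all(np.abs(lam.imag) > 1e-12):                       # complex pair: center or focus
    kind = "center" if np.allclose(lam.real, 0.0) else \
           ("stable focus" if np.all(lam.real < 0) else "unstable focus")
elif np.all(lam.real < 0):
    kind = "sink (stable)"
elif np.all(lam.real > 0):
    kind = "source (unstable)"
else:
    kind = "saddle or degenerate"
print(kind)                                                # "stable focus" for this example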

3 Sturm-Liouville Theory and Orthogonal Expansions!
Sturm-Liouville theory regards second-order linear differential equations of the form
 
\[ \frac{d}{dx}\left( p(x)\frac{dy}{dx} \right) + q(x)y = -\lambda w(x)y, \qquad (13) \]
where p(x), q(x), and w(x) are all continuous functions on the particular region of interest, say
[a, b]. We also note that w(x) > 0 and is referred to as the “weight” function as well as that we assume
p(x) ∈ C n with n ≥ 1.

We further note that in this formalism we consider the solution y(x) to obey boundary conditions
on [a, b]. Moreover λ is an eigenvalue of our Sturm-Liouville operator and is chosen to give non-
trivial solutions of (13), which satisfy the boundary conditions.

What makes Sturm-Liouville theory interesting and of particular use in applied mathematics is
the fact that the linear differential operator given by (13) is Hermitian in some function space for
specific boundary conditions. What is special about Hermitian operators is that their eigenfunctions
are orthogonal (sometimes with the assistance of a weight function, i.e., w(x)) and form a complete
function space. Recall how Hermitian operators work in Linear Algebra - they lead to a basis of
orthogonal eigenvectors, each with real associated eigenvalue.

Theorem 3.1 If a Sturm-Liouville problem is regular, i.e., if p(x), p'(x), q(x), and w(x) are all continuous on [a, b] and the boundary conditions can be written as
\[ \alpha_1 y(a) + \alpha_2 y'(a) = 0 \quad \text{with} \quad \alpha_1^2 + \alpha_2^2 > 0 \qquad (14) \]
\[ \beta_1 y(b) + \beta_2 y'(b) = 0 \quad \text{with} \quad \beta_1^2 + \beta_2^2 > 0, \qquad (15) \]
then the eigenvalues of the Sturm-Liouville problem are real and can be ordered such that
\[ \lambda_1 < \lambda_2 < \lambda_3 < \cdots < \lambda_n < \cdots \to \infty. \]
Furthermore, with each eigenvalue, λ_j, there is a unique eigenfunction, φ_j(x), which has exactly (j − 1)
roots in (a, b), and which satisfies the orthogonality property
\[ \int_a^b \phi_j(x)\phi_k(x)w(x)\,dx = \begin{cases} 1, & j = k \\ 0, & j \neq k. \end{cases} \]

3.1 Application: Fourier Series and Orthogonal Series Expansions


In this subsection we will explore where Fourier Series come from, as well as how to write an arbitrary
continuous function, f(x), in terms of a Fourier Series.

3.1.1 Example: Fourier Sine Series


Consider the following BVP,

y'' + λy = 0 (16)
y(0) = 0 (17)
y(1) = 0. (18)

Now we have three cases to consider:

1. λ = 0
2. λ = −α² < 0
3. λ = α² > 0

Case 1: λ = 0: We have the following ODE to solve

y'' = 0

and hence we obtain


y(x) = c1 + c2 x.
Inserting the boundary conditions, we see that c1 = c2 = 0, and hence we only have the trivial
solution in this case.
Case 2: λ = −α² < 0: We have the following ODE to solve

y'' − α²y = 0.

We find the general solution of

y(x) = c1 sinh(αx) + c2 cosh(αx).

Using the boundary conditions we see

y(0) = 0 ⇒ c2 = 0
y(1) = 0 ⇒ c1 sinh(α) = 0 ⇒ c1 = 0.

Hence we only have the trivial solution, once again.

Case 3: λ = α² > 0: We consider the following ODE,

y'' + α²y = 0,

where α² > 0. We find the general solution to be

y(x) = c1 sin(αx) + c2 cos(αx).

Using the boundary conditions we see

y(0) = 0 ⇒ c2 = 0
y(1) = 0 ⇒ c1 sin(α) = 0 ⇒ α = nπ where n ∈ Z+ .

Therefore we find our eigenvalues to be λ_n = α_n² = n²π², for n ∈ Z⁺, with associated eigenfunctions φ_n(x) = sin(nπx).
We will now demonstrate the orthogonality of those eigenfunctions.
We will now demonstrate the orthogonality of those eigenfunctions.
Orthogonality: Consider
\[ \int_0^1 \sin(n\pi x)\sin(m\pi x)\,dx, \]

where n ≠ m. Now we use the following identity to perform the integration,

\[ \sin(a)\sin(b) = \frac{1}{2}\big[\cos(a-b) - \cos(a+b)\big]. \]
\[
\begin{aligned}
\int_0^1 \sin(n\pi x)\sin(m\pi x)\,dx &= \int_0^1 \frac{1}{2}\big[\cos((n-m)\pi x) - \cos((n+m)\pi x)\big]dx \\
&= \frac{1}{2}\left[\frac{1}{(n-m)\pi}\sin((n-m)\pi x) - \frac{1}{(n+m)\pi}\sin((n+m)\pi x)\right]_0^1 \\
&= 0.
\end{aligned}
\]

We note that the case n = m must be considered by itself, to avoid a singularity in the above
integration. Moreover, the case where n = m = 0 is discarded since 0 is not an eigenvalue of
the differential operator. Now we show the case n = m ≠ 0 leads to a non-zero computation.
Now considering n = m ≠ 0,

\[
\begin{aligned}
\int_0^1 \sin^2(n\pi x)\,dx &= \frac{1}{2}\int_0^1 \big[1 - \cos(2n\pi x)\big]dx \\
&= \frac{1}{2}\left[x - \frac{1}{2n\pi}\sin(2n\pi x)\right]_0^1 \\
&= \frac{1}{2}.
\end{aligned}
\]

Hence orthogonality holds (as it should)!
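For the skeptical (or the integration-averse), here is a quick numerical spot-check of this orthogonality with SciPy's quad - just a sketch, not part of the proof.

import numpy as np
from scipy.integrate import quad

phi = lambda n, x: np.sin(n * np.pi * x)   # the eigenfunctions found above

for n, m in [(1, 2), (2, 5), (3, 3)]:
    val, _ = quad(lambda x: phi(n, x) * phi(m, x), 0.0, 1.0)
    print(n, m, round(val, 10))            # 0 when n != m, 1/2 when n == m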

3.1.2 Example: Fourier Series of a function f (x)


Now we will illustrate how to use this idea of Fourier Series in rewriting a function, f (x). First we
begin with the general form of a Fourier Series. Written as an infinite sum, a Fourier Series of a
function f (x) on an interval [−L, L] is written as,
\[ f(x) = \sum_{j=0}^{\infty} a_j\cos\left(\frac{j\pi x}{L}\right) + \sum_{j=1}^{\infty} b_j\sin\left(\frac{j\pi x}{L}\right). \qquad (19) \]

The brunt of the work is now calculating the coefficients {a_n}_{n=0}^∞ and {b_n}_{n=1}^∞. Luckily we can
employ orthogonality properties of the Fourier eigenfunctions.


\[ \int_{-L}^{L} \cos\left(\frac{n\pi x}{L}\right)\cos\left(\frac{m\pi x}{L}\right)dx = \begin{cases} 2L & n = m = 0 \\ L & n = m > 0 \\ 0 & n \neq m \end{cases} \qquad (20) \]
\[ \int_{-L}^{L} \sin\left(\frac{n\pi x}{L}\right)\sin\left(\frac{m\pi x}{L}\right)dx = \begin{cases} L & n = m > 0 \\ 0 & n \neq m \end{cases} \qquad (21) \]
\[ \int_{-L}^{L} \sin\left(\frac{n\pi x}{L}\right)\cos\left(\frac{m\pi x}{L}\right)dx = 0. \qquad (22) \]

To compute the coefficients, we begin with (19) and multiply by either eigenfunction. First let's
consider the contributions by the sin(nπx/L) eigenfunctions in the expansion of f(x). We begin
with

\[ f(x) = \sum_{n=0}^{\infty} a_n\cos\left(\frac{n\pi x}{L}\right) + \sum_{n=1}^{\infty} b_n\sin\left(\frac{n\pi x}{L}\right), \]
and multiply both sides by sin(mπx/L). Next we integrate both sides over the interval of interest,
e.g.,
\[ \int_{-L}^{L} f(x)\sin\left(\frac{m\pi x}{L}\right)dx = \int_{-L}^{L}\left[\sum_{n=0}^{\infty} a_n\cos\left(\frac{n\pi x}{L}\right) + \sum_{n=1}^{\infty} b_n\sin\left(\frac{n\pi x}{L}\right)\right]\sin\left(\frac{m\pi x}{L}\right)dx. \]

The only terms that do not vanish in the above equation are the sinusoidal terms with n = m, i.e.,
\[ \int_{-L}^{L} f(x)\sin\left(\frac{m\pi x}{L}\right)dx = b_m\int_{-L}^{L}\sin^2\left(\frac{m\pi x}{L}\right)dx. \]
Hence since b_m is a constant, we can solve for it to get
\[ b_m = \frac{\int_{-L}^{L} f(x)\sin\left(\frac{m\pi x}{L}\right)dx}{\int_{-L}^{L}\sin^2\left(\frac{m\pi x}{L}\right)dx}. \]
Therefore we now have a formula to find all the coefficients {b_j}_{j=1}^∞. Using a similar approach and
employing orthogonality once again, we can find a formula for the {a_j}_{j=0}^∞ coefficients as well, i.e.,
\[ a_m = \frac{\int_{-L}^{L} f(x)\cos\left(\frac{m\pi x}{L}\right)dx}{\int_{-L}^{L}\cos^2\left(\frac{m\pi x}{L}\right)dx}. \]
After computing all those integrals, we will have our expansion of f (x) on [−L, L] in all its Fourier
Series glory!

3.1.3 Example: Fourier Series of f (x) = x on [−π, π]


We wish to find

\[ f(x) = x = \sum_{j=0}^{\infty} a_j\cos(jx) + \sum_{j=1}^{\infty} b_j\sin(jx). \]

Welp, let’s do the integrals!


\[
\begin{aligned}
b_m &= \frac{\int_{-\pi}^{\pi} x\sin(mx)\,dx}{\int_{-\pi}^{\pi}\sin^2(mx)\,dx} \\
&= \frac{\left[-\frac{x}{m}\cos(mx)\right]_{-\pi}^{\pi} + \frac{1}{m}\int_{-\pi}^{\pi}\cos(mx)\,dx}{\pi} \\
&= \frac{-\frac{2\pi}{m}\cos(m\pi)}{\pi} \\
&= \frac{2}{m}(-1)^{m+1}
\end{aligned}
\]

Similarly, we'll consider the integrations to find {a_n}. Let's start with m > 0,
\[
\begin{aligned}
a_m &= \frac{\int_{-\pi}^{\pi} x\cos(mx)\,dx}{\int_{-\pi}^{\pi}\cos^2(mx)\,dx} \\
&= \frac{\left[\frac{x}{m}\sin(mx)\right]_{-\pi}^{\pi} - \frac{1}{m}\int_{-\pi}^{\pi}\sin(mx)\,dx}{\pi} \\
&= 0,
\end{aligned}
\]
and when m = 0 for a_0,
\[ a_0 = \frac{\int_{-\pi}^{\pi} x\,dx}{\int_{-\pi}^{\pi} 1\,dx} = 0. \]
Therefore all the cosines drop out of our Fourier Expansion of f(x) = x on [−π, π]. Note, we could
have easily seen this by the fact that the function we are rewriting in the Fourier basis, f(x) = x, is odd
on a symmetric interval. We then have
\[ f(x) = x = \sum_{m=1}^{\infty}\frac{2}{m}(-1)^{m+1}\sin(mx). \]
The figure below shows what the Fourier Series expansion looks like on that interval for differing
numbers of terms in the series.
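If you'd like to reproduce that kind of picture (or just sanity-check the coefficients), here is a small Python sketch of the truncated series; the sample points are arbitrary interior points of [−π, π].

import numpy as np

def fourier_x(x, N):
    # truncated Fourier series of f(x) = x on [-pi, pi], using the b_m found above
    m = np.arange(1, N + 1)
    return np.sum((2.0 / m) * (-1.0) ** (m + 1) * np.sin(np.outer(x, m)), axis=1)

x = np.linspace(-2.5, 2.5, 7)              # stay away from the jumps at x = +/- pi
for N in (5, 50, 500):
    print(N, np.max(np.abs(fourier_x(x, N) - x)))   # error shrinks (slowly) as N grows

Note the pointwise convergence is only O(1/N) here, since the 2π-periodic extension of f(x) = x has jumps at odd multiples of π (hello, Gibbs phenomenon).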

3.2 General Orthogonal Series Expansions


As we saw in the above example, we can express a function f(x) in terms of sines and cosines on a
specified interval. We recall we are able to do this because those sines and cosines, i.e., cos(jπx/L)
and sin(jπx/L) on [−L, L], form a complete basis in function space. Moreover, these functions were
eigenfunctions of a Sturm-Liouville differential operator, so we knew we were guaranteed those properties - namely orthogonality and completeness.

Beyond Fourier Series there are many “popular” orthogonal polynomials, which are all eigenfunc-
tions of some Sturm-Liouville differential operator. Just for some buzzwords, the usual candidates
tend to be:

• Bessel Functions

• Chebyshev Polynomials
• Legendre Polynomials
• Laguerre Polynomials
• Hermite Polynomials

More generally, if {φj (x)} are eigenfunctions of a Sturm-Liouville problem, and we wish to expand
some function f (x) on an interval [a, b] in terms of them, i.e.,

\[ f(x) = \sum_{j=0}^{\infty} c_j\tilde{\phi}_j(x), \]
Figure 3: The function f (x) = x being expressed as Fourier Series expansions with differing numbers
of truncated terms.

the brunt of the work still lies in calculating the coefficients {cj }. We note that φ̃j (x) are the
eigenfunctions, φj (x), just translated into the interval [a, b]. We can do the same procedure as we did
to find the Fourier Series coefficients. Let’s party!

\[
\begin{aligned}
f(x) &= \sum_{m=0}^{\infty} c_m\tilde{\phi}_m(x) \\
f(x)\tilde{\phi}_n(x) &= \sum_{m=0}^{\infty} c_m\tilde{\phi}_m(x)\tilde{\phi}_n(x) \\
f(x)\tilde{\phi}_n(x)\tilde{w}(x) &= \sum_{m=0}^{\infty} c_m\tilde{\phi}_m(x)\tilde{\phi}_n(x)\tilde{w}(x) \\
\int_a^b f(x)\tilde{\phi}_n(x)\tilde{w}(x)\,dx &= \int_a^b\sum_{m=0}^{\infty} c_m\tilde{\phi}_m(x)\tilde{\phi}_n(x)\tilde{w}(x)\,dx \\
\int_a^b f(x)\tilde{\phi}_n(x)\tilde{w}(x)\,dx &= \sum_{m=0}^{\infty} c_m\int_a^b\tilde{\phi}_m(x)\tilde{\phi}_n(x)\tilde{w}(x)\,dx \\
\int_a^b f(x)\tilde{\phi}_n(x)\tilde{w}(x)\,dx &= c_n\int_a^b\tilde{\phi}_n^2(x)\tilde{w}(x)\,dx
\end{aligned}
\]

and hence we obtain
\[ c_n = \frac{\int_a^b f(x)\tilde{\phi}_n(x)\tilde{w}(x)\,dx}{\int_a^b \tilde{\phi}_n^2(x)\tilde{w}(x)\,dx}. \]
Note: w̃(x) is the weight function that comes from (13), just translated into the appropriate
interval [a, b].
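As a concrete (hypothetical) illustration of this recipe, here is a short Python sketch that expands f(x) = e^x in Legendre polynomials on [−1, 1] (so w(x) = 1), computing the c_n with numerical quadrature.

import numpy as np
from numpy.polynomial import legendre
from scipy.integrate import quad

f = np.exp
P = lambda n, x: legendre.legval(x, [0.0] * n + [1.0])    # n-th Legendre polynomial

coeffs = []
for n in range(6):
    num, _ = quad(lambda s: f(s) * P(n, s), -1.0, 1.0)    # <f, phi_n> with w(x) = 1
    den, _ = quad(lambda s: P(n, s) ** 2, -1.0, 1.0)      # <phi_n, phi_n> = 2/(2n+1)
    coeffs.append(num / den)

x0 = 0.3
approx = sum(c * P(n, x0) for n, c in enumerate(coeffs))
print(approx, f(x0))   # the truncated expansion already agrees to several digits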

3.3 Using Fourier Series to Solve ODEs


Consider the following 2nd order linear constant-coefficient ODE:

L̂u = αu'' + βu' + γu = g(x),


where {α, β, γ} ∈ R, u = u(x), and either Dirichlet, Neumann, or mixed type boundary
conditions hold on some interval [−L, L]. If the ODE is not defined on such an interval, simply do a
transformation into it. We then must solve for both the complementary solution, uC (x), i.e., the
solution to the homogeneous problem, as well as the particular solution, uP (x), e.g., the solution to
the non-homogeneous problem. Our full solution will be of the form

u(x) = uC (x) + uP (x).

Now in doing the above we will employ the services of Fourier Series to find the particular solution,
uP (x). We note that if g(x) is a fairly simple function, e.g., a polynomial, sine, cosine, or exponential,
or even products of them, finding the particular solution using undetermined coefficients is probably
the road more traveled (and it should be). If g(x) is not a simple function like those, you can try
to use variation of parameters; however, if you do not know the complementary solutions, this could
prove rather difficult. Now if g(x) is periodic with period 2L, this is where Fourier Series can come
in!
We wish to write our solution uP (x) in a Fourier Series,
\[ u_P(x) = \sum_{j=0}^{\infty} a_j\cos\left(\frac{j\pi x}{L}\right) + \sum_{j=1}^{\infty} b_j\sin\left(\frac{j\pi x}{L}\right). \]
However, first we must also expand g(x) in its Fourier Series, e.g.,
\[ g(x) = \sum_{j=0}^{\infty} c_j\cos\left(\frac{j\pi x}{L}\right) + \sum_{j=1}^{\infty} d_j\sin\left(\frac{j\pi x}{L}\right); \]

however, since g(x) is a known function we can find the coefficients {cj } and {dj } explicitly using
the methods above. We now have a linear differential equation of the form,
\[ \hat{L}u = \sum_{j=0}^{\infty} c_j\cos\left(\frac{j\pi x}{L}\right) + \sum_{j=1}^{\infty} d_j\sin\left(\frac{j\pi x}{L}\right). \]

Substituting our solution ansatz (the Fourier Series for u_P(x)) into the equation above, we see we
get

\[
\begin{aligned}
\hat{L}u &= g(x) \\
\hat{L}u &= \sum_{j=0}^{\infty} c_j\cos\left(\frac{j\pi x}{L}\right) + \sum_{j=1}^{\infty} d_j\sin\left(\frac{j\pi x}{L}\right) \\
\hat{L}\left[\sum_{j=0}^{\infty} a_j\cos\left(\frac{j\pi x}{L}\right) + \sum_{j=1}^{\infty} b_j\sin\left(\frac{j\pi x}{L}\right)\right] &= \sum_{j=0}^{\infty} c_j\cos\left(\frac{j\pi x}{L}\right) + \sum_{j=1}^{\infty} d_j\sin\left(\frac{j\pi x}{L}\right) \\
\sum_{j=0}^{\infty} a_j\hat{L}\left[\cos\left(\frac{j\pi x}{L}\right)\right] + \sum_{j=1}^{\infty} b_j\hat{L}\left[\sin\left(\frac{j\pi x}{L}\right)\right] &= \sum_{j=0}^{\infty} c_j\cos\left(\frac{j\pi x}{L}\right) + \sum_{j=1}^{\infty} d_j\sin\left(\frac{j\pi x}{L}\right)
\end{aligned}
\]

What is left is to write a system of equations for all the {aj } and {bj } coefficients in terms of {cj }
and {d_j}. Note that if the damping term is not present in the ODE, i.e., β = 0, there is the possibility
of the presence of resonance, so the corresponding frequency term must include a polynomial type
coefficient, i.e., x, as well.

Furthermore we note that if we have Dirichlet boundary conditions, we will use a Fourier Sine
Series for both g(x) and u_P(x); if Neumann, we will use the Fourier Cosine Series for them.
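To see the mode-by-mode matching in action, here is a Python sketch for the hypothetical choice α = β = 1, γ = 2, L = π, and g(x) = x, whose Fourier coefficients we computed above; each frequency j gives a little 2×2 linear system for (a_j, b_j).

import numpy as np

# hypothetical example: u'' + u' + 2u = g(x) on [-pi, pi], with g(x) = x (2*pi-periodic)
alpha, beta, gamma, L, N = 1.0, 1.0, 2.0, np.pi, 200
a, b = np.zeros(N + 1), np.zeros(N + 1)

for j in range(1, N + 1):
    w = j * np.pi / L
    c_j, d_j = 0.0, 2.0 * (-1.0) ** (j + 1) / j            # Fourier coefficients of g(x) = x
    M = np.array([[gamma - alpha * w**2, beta * w],
                  [-beta * w,            gamma - alpha * w**2]])
    a[j], b[j] = np.linalg.solve(M, [c_j, d_j])             # match mode j on both sides

def u_p(x):
    j = np.arange(1, N + 1)
    return np.sum(a[1:] * np.cos(j * x) + b[1:] * np.sin(j * x))

x0, h = 0.4, 1e-4                                            # residual check at an interior point
upp = (u_p(x0 + h) - 2 * u_p(x0) + u_p(x0 - h)) / h**2
up = (u_p(x0 + h) - u_p(x0 - h)) / (2 * h)
print(upp + up + 2 * u_p(x0), "vs g(x0) =", x0)              # agree to a couple of digits

Since β ≠ 0 here there is no resonance, and every 2×2 system above is invertible: its determinant is (γ − αω²)² + (βω)² > 0.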

3.4 Gram-Schmidt Processes...for orthogonal polynomials?!


If you're ever in a situation where you need to know the first few orthogonal polynomials in a particular
basis family, have no fear! You can derive them using a Gram-Schmidt process. That’s right - just
like in linear algebra. Note that you can derive as many as you wish using this methodology...if you
have enough time and mental arithmetic fortitude.

Before we proceed with Gram-Schmidt for orthogonal polynomials we first recall the Gram-Schmidt
process for orthogonalizing vectors in R^n. Once we have reviewed this, the leap to Gram-Schmidt
processes for infinite dimensional function spaces will feel like a natural extension.
3.4.1 Gram-Schmidt for vectors in Rn
Consider a set of m-vectors in Rn , with m ≤ n, {v1 , v2 , v3 , . . . , vm }. In general when handed a set of
vectors, they will not be orthogonal; however, we will employ the services of Gram-Schmidt to help
us achieve this feat. In order for the Gram-Schmidt process to achieve a set of m-orthogonal vectors
from a collection of vectors, all the vectors must be linearly independent. We note that the span of
the original vector space and that of the orthogonalized vector space are equivalent.
To orthogonalize a collection of vectors, we begin with a single vector, say q1 = v1 . Next we
proceed to find an orthogonal vector to q1 , by taking v2 and subtracting off the projection of v2 onto
q1. To find the next orthogonal vector, we will repeat this process and take v3 and subtract off the
projections of v3 onto q1 and q2. This process is illustrated below.

\[
\begin{aligned}
q_1 &= v_1 \\
q_2 &= v_2 - \mathrm{proj}_{q_1}v_2 \\
q_3 &= v_3 - \mathrm{proj}_{q_1}v_3 - \mathrm{proj}_{q_2}v_3 \\
q_4 &= v_4 - \mathrm{proj}_{q_1}v_4 - \mathrm{proj}_{q_2}v_4 - \mathrm{proj}_{q_3}v_4 \\
&\;\;\vdots \\
q_m &= v_m - \sum_{k=1}^{m-1}\mathrm{proj}_{q_k}v_m.
\end{aligned}
\]

We now recall the strict definition of a vector projection. The projection of a vector v onto a
vector u is defined as
\[ \mathrm{proj}_{u}v = \frac{\langle v, u\rangle}{\langle u, u\rangle}u, \]
where ⟨·, ·⟩ is an inner-product.
Hence we can rewrite the Gram-Schmidt process in terms of inner-products, e.g.,

\[
\begin{aligned}
q_1 &= v_1 \\
q_2 &= v_2 - \frac{\langle v_2, q_1\rangle}{\langle q_1, q_1\rangle}q_1 \\
q_3 &= v_3 - \frac{\langle v_3, q_1\rangle}{\langle q_1, q_1\rangle}q_1 - \frac{\langle v_3, q_2\rangle}{\langle q_2, q_2\rangle}q_2 \\
q_4 &= v_4 - \frac{\langle v_4, q_1\rangle}{\langle q_1, q_1\rangle}q_1 - \frac{\langle v_4, q_2\rangle}{\langle q_2, q_2\rangle}q_2 - \frac{\langle v_4, q_3\rangle}{\langle q_3, q_3\rangle}q_3 \\
&\;\;\vdots \\
q_m &= v_m - \sum_{k=1}^{m-1}\frac{\langle v_m, q_k\rangle}{\langle q_k, q_k\rangle}q_k.
\end{aligned}
\]

The last thing we can do is normalize these newly found orthogonal vectors, {q_1, q_2, ..., q_m},
using the standard normalization procedure, e.g.,
\[ \{\hat{q}_1, \hat{q}_2, \ldots, \hat{q}_m\} = \left\{\frac{q_1}{\|q_1\|}, \frac{q_2}{\|q_2\|}, \ldots, \frac{q_m}{\|q_m\|}\right\}, \]
where each q̂_k is now orthonormalized.
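Here is a minimal NumPy sketch of exactly this procedure on a made-up set of three vectors in R³ (classical Gram-Schmidt, with the normalization step at the end).

import numpy as np

def gram_schmidt(V):
    """Orthonormalize the columns of V (assumed linearly independent), as in the text."""
    Q = np.array(V, dtype=float)
    for k in range(V.shape[1]):
        for j in range(k):
            qj = Q[:, j]
            Q[:, k] -= (qj @ V[:, k]) / (qj @ qj) * qj   # subtract proj_{q_j} v_k
    return Q / np.linalg.norm(Q, axis=0)                 # normalize each column at the end

V = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
Q = gram_schmidt(V)
print(np.round(Q.T @ Q, 10))   # the identity matrix, so the columns are orthonormal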

3.4.2 Gram-Schmidt for Orthogonal Polynomials!
We now shift gears slightly and look to the problem of creating a basis of orthogonal polynomials. We
recall if two polynomials, say p_k and p_j, are orthogonal then we have
\[ \langle p_k, p_j\rangle = \int_a^b p_k(x)p_j(x)w(x)\,dx = \begin{cases} 0 & (j \neq k) \\ c \in \mathbb{R} & (j = k), \end{cases} \]
where w(x) is the weight function associated with the specific orthogonal polynomials' behind-the-scenes
Sturm-Liouville problem. On that note, we point out that in order to easily extend Gram-Schmidt
from vectors in R^n to an infinite dimensional function space, we need to define an inner-product. We do
so accordingly,
\[ \langle p_k, p_j\rangle = \int_a^b p_k(x)p_j(x)w(x)\,dx. \]
For constructing families of orthogonal polynomials, we will use the above inner-product and see
that any family of orthogonal polynomials is going to be generated by changing the weight function, w(x), in the above inner-product. For example, to construct the Legendre polynomial basis we
use w(x) = 1, while for Chebyshev polynomials the weight function is w(x) = 1/√(1−x²). We also need
the correct integration bounds, i.e., for both Legendre and Chebyshev polynomials we need [a, b] = [−1, 1].

To construct the orthogonal basis of polynomials we proceed in a similar manner to the above,
except starting off with p0 = 1 as the zero-th order polynomial. To find p1 (x), the 1st -order orthogonal
polynomial, we subtract off the projection of the 1st -order monomial basis function onto p0 (x) from
x. We then continue in an analogous manner. The procedure can be seen as

\[
\begin{aligned}
p_0(x) &= 1 \\
p_1(x) &= x - \frac{\langle x, p_0\rangle}{\langle p_0, p_0\rangle}p_0(x) \\
p_2(x) &= x^2 - \frac{\langle x^2, p_0\rangle}{\langle p_0, p_0\rangle}p_0(x) - \frac{\langle x^2, p_1\rangle}{\langle p_1, p_1\rangle}p_1(x) \\
p_3(x) &= x^3 - \frac{\langle x^3, p_0\rangle}{\langle p_0, p_0\rangle}p_0(x) - \frac{\langle x^3, p_1\rangle}{\langle p_1, p_1\rangle}p_1(x) - \frac{\langle x^3, p_2\rangle}{\langle p_2, p_2\rangle}p_2(x) \\
&\;\;\vdots \\
p_n(x) &= x^n - \sum_{k=0}^{n-1}\frac{\langle x^n, p_k\rangle}{\langle p_k, p_k\rangle}p_k(x).
\end{aligned}
\]

It is clear that doing this procedure is definitely not optimal, especially if you need to find p_10(x),
or even p_3(x) for that matter. However, where this procedure becomes powerful is in the realm of
numerical quadrature, or numerical integration. In a nutshell, in the Gaussian quadrature formulation
of approximating integrals, one needs to (*gets to) choose both the quadrature points as well as the
weight coefficients. It turns out that one natural method for choosing the quadrature points is to
construct a specific family of orthogonal polynomials and then find the roots of the polynomial of the
desired order, e.g., if we wish to find 2 quadrature nodes, we would find the roots of p_2(x).

Furthermore, because of the accuracy of Gaussian Quadrature compared to Newton-Cotes methods,
where the only degrees of freedom are the weight coefficients, as the quadrature points are known
a priori, one only needs to construct polynomials of modest order. The main difference between
the two numerical integration methods can be seen as having 2N degrees of freedom in Gaussian
Quadrature, while only N degrees of freedom in Newton-Cotes. These degrees of freedom have
immediate consequence in eliminating lower order errors to achieve higher order accuracy.
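As a sanity check on the polynomial version, here is a SymPy sketch that runs the procedure above with w(x) = 1 on [−1, 1] and recovers (unnormalized) Legendre polynomials; the roots of p_2 are ±1/√3, the classic 2-point Gauss-Legendre nodes.

import sympy as sp

x = sp.symbols('x')
a, b, w = -1, 1, 1                     # Legendre family: weight w(x) = 1 on [-1, 1]
inner = lambda f, g: sp.integrate(f * g * w, (x, a, b))

p = [sp.Integer(1)]                    # p_0(x) = 1
for n in range(1, 4):
    q = x**n - sum(inner(x**n, pk) / inner(pk, pk) * pk for pk in p)
    p.append(sp.expand(q))

print(p)                               # [1, x, x**2 - 1/3, x**3 - 3*x/5]
print(sp.solve(p[2], x))               # [-sqrt(3)/3, sqrt(3)/3], the 2-point Gauss nodes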

4 Green’s Function
A Green's function, denoted G(x, s), is associated with a specific linear differential operator, L̂. It is
the solution to
L̂G(x, s) = δ(x − s), (23)
where δ(x−s) is the Dirac-delta function. As applied mathematicians we wish to exploit properties
of the Green's function that make it particularly useful for solving non-homogeneous differential
equations, whether they are of the ordinary or partial varieties, e.g., Green’s functions can be used
for finding solutions to equations of the form

L̂u(x) = f (x), (24)

where u(x) is the solution we seek and L̂ is some differential operator. Specifically we will use
them to solve non-homogeneous boundary value problems (BVPs). We note that BVPs sometimes
do not have solutions, so we will illustrate the main concepts of Green’s Function using “nice” linear
differential operators.

4.1 Big Picture of Green’s Functions


We consider L̂ to be of the Sturm-Liouville family of differential operators, i.e.,
\[ \hat{L} = \frac{d}{dx}\left( p(x)\frac{d}{dx} \right) + q(x), \qquad (25) \]
where we presume p(x) and q(x) are continuous functions on some interval of interest, say [a, b].
Naturally, we have some boundary conditions, written in the form
\[ \hat{D}u = \begin{cases} c_1 u'(a) + c_2 u(a) = 0 \\ c_3 u'(b) + c_4 u(b) = 0. \end{cases} \qquad (26) \]

Note: we call the operator D̂, the boundary conditions operator. (very creatively, in fact).
We consider the following BVP,

L̂u(x) = f (x) w/ D̂u = 0, (27)

where f(x) is continuous on [a, b]. We can deduce there is only one solution to the above BVP,
given by
\[ u(x) = \int_a^b G(x, s)f(s)\,ds. \qquad (28) \]
The name of the game will then be constructing the appropriate Green’s function for the specific
linear differential operator. It is clear once we have the Green’s function, we can change the non-
homogeneous term in the BVP, i.e., f (x), and assuming our integration skills are up to it, find the
solution u(x) to our BVP.

4.2 Properties of Green’s Functions


Green’s functions have the following properties:
• G(x, s) is continuous in x and s

• For x ≠ s, L̂G(x, s) = 0.
• For each fixed s ∈ (a, b), D̂G(x, s) = 0, i.e., G satisfies the boundary conditions in x.
• Symmetry Condition: G(x, s) = G(s, x).
• Jump Discontinuity: G'(s⁺, s) − G'(s⁻, s) = 1/p(s).

Note: Many of our friendly linear differential operators already have had their Green’s functions
found and studied extensively! Look them up and then have an integration party!

4.3 A little motivation...


Consider the following integral with Dirac-delta kernel,
\[ \int_a^b f(s)\delta(x - s)\,ds. \qquad (29) \]
We know by properties of the Dirac-delta function that this integral is equal to
\[ \int_a^b f(s)\delta(x - s)\,ds = f(x). \]
Now recalling (23) and substituting it into the above equation gives
\[ \int_a^b f(s)\delta(x - s)\,ds = \int_a^b \hat{L}G(x, s)f(s)\,ds = f(x), \]

and by (24), we obtain
\[ \hat{L}u(x) = \int_a^b \hat{L}G(x, s)f(s)\,ds. \]
Finally, by uniform convergence (and other analysis theorems we take for granted, but never
should), or more simply by noting that L̂ is a differential operator in x only, it can be
pulled out of the integral, e.g.,
\[ \hat{L}u(x) = \hat{L}\left(\int_a^b G(x, s)f(s)\,ds\right), \]

which strongly suggests that we obtain our solution as
\[ u(x) = \int_a^b G(x, s)f(s)\,ds. \qquad (30) \]

4.4 Example: u'' + 4u = f(x)


Consider the following BVP w/ associated boundary conditions,

u'' + 4u = f(x) (31)
u(0) = 0 (32)
u(π/4) = 0. (33)
We now proceed in finding the Green's Function for the operator L̂ = d²/dx² + 4. We begin by first
solving the following ODE with respect to the Green's function, G,
\[ G'' + 4G = \delta(x - s), \]


where we will have to consider two regions:

1. 0 ≤ x ≤ s
2. s ≤ x ≤ π/4

Now we get from simple ODE theory that

G(x, s) = c1 cos(2x) + c2 sin(2x)

where coefficients c1 and c2 will depend on s.


Looking into the boundary conditions on each region appropriately, we see that for:
1. x ∈ [0, s]: G(0, s) = 0 ⇒ c1(s) = 0
2. x ∈ [s, π/4]: G(π/4, s) = 0 ⇒ c2(s) = 0


Therefore we find that
\[ G(x, s) = \begin{cases} c_2(s)\sin(2x) & x \in [0, s], \\ c_1(s)\cos(2x) & x \in [s, \pi/4]. \end{cases} \]

Now we must find the functional forms of c1 (s) and c2 (s). We will be able to do this by using
the fact we need G(x, s) to be continuous in both x and s, as well as the jump discontinuity in the
derivative of G(x, s).
First we will look at continuity.
Continuity : We need that G(x, s) has the same limit as x → s from below and x → s from above.
This gives us that
\[ c_2(s)\sin(2s) = c_1(s)\cos(2s). \qquad (34) \]
Jump:
\[ \lim_{\epsilon\to 0}\left[\frac{dG}{dx}\bigg|_{x=s+\epsilon} - \frac{dG}{dx}\bigg|_{x=s-\epsilon}\right] = \frac{1}{p(s)}. \]
We note that p(x) = 1 for our toy problem. We find that

−2c1 (s) sin(2s) − 2c2 (s) cos(2s) = 1. (35)

Now solving (34) and (35) for c1(s) and c2(s), we find
\[ c_1(s) = -\frac{1}{2}\sin(2s) \qquad (36) \]
\[ c_2(s) = -\frac{1}{2}\cos(2s), \qquad (37) \]
and hence we find our Green's function to be:
\[ G(x, s) = \begin{cases} -\frac{1}{2}\cos(2s)\sin(2x) & x \in [0, s] \\ -\frac{1}{2}\sin(2s)\cos(2x) & x \in [s, \pi/4]. \end{cases} \qquad (38) \]

4.4.1 Let's make sure we're on the same page: let f(x) = 50e^{−0.3742}.
Hence we are looking for the particular solution of the above BVP with f(x) = 50e^{−0.3742}.

Recall we just need to perform
\[ u(x) = \int_0^{\pi/4} G(x, s)f(s)\,ds. \]

Therefore, keeping track of which branch of G(x, s) applies on each region of integration (for s < x we have x ∈ [s, π/4], while for s > x we have x ∈ [0, s]), we see
\[
\begin{aligned}
u(x) &= \int_0^{\pi/4} G(x, s)f(s)\,ds \\
&= \int_0^x \left(-\frac{1}{2}\sin(2s)\cos(2x)\right)(50e^{-0.3742})\,ds + \int_x^{\pi/4}\left(-\frac{1}{2}\cos(2s)\sin(2x)\right)(50e^{-0.3742})\,ds \\
&= -25e^{-0.3742}\left[\cos(2x)\int_0^x \sin(2s)\,ds + \sin(2x)\int_x^{\pi/4}\cos(2s)\,ds\right] \\
&= -25e^{-0.3742}\left[\cos(2x)\cdot\frac{1-\cos(2x)}{2} + \sin(2x)\cdot\frac{1-\sin(2x)}{2}\right] \\
&= -\frac{25}{2}e^{-0.3742}\left[\cos(2x) + \sin(2x) - \cos^2(2x) - \sin^2(2x)\right] \\
&= \frac{25}{2}e^{-0.3742}\left[1 - \cos(2x) - \sin(2x)\right],
\end{aligned}
\]
and we find that
\[ u(x) = \frac{25}{2}e^{-0.3742}\big(1 - \cos(2x) - \sin(2x)\big), \]
which indeed satisfies both boundary conditions, u(0) = u(π/4) = 0, as well as the ODE, since u'' + 4u = 50e^{−0.3742}.
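A quick numerical check of the whole Green's function pipeline, using SciPy's quad to do the integral for us (assuming SciPy is available); the closed form above and the quadrature agree at a few test points.

import numpy as np
from scipy.integrate import quad

C = 50.0 * np.exp(-0.3742)                        # the constant forcing f(x)

def G(x, s):
    # the Green's function (38) for u'' + 4u with u(0) = u(pi/4) = 0
    return -0.5 * np.cos(2*s) * np.sin(2*x) if x <= s else -0.5 * np.sin(2*s) * np.cos(2*x)

def u_quad(x):
    val, _ = quad(lambda s: G(x, s) * C, 0.0, np.pi/4, points=[x])
    return val

u_exact = lambda x: (C / 4.0) * (1.0 - np.cos(2*x) - np.sin(2*x))

for x in (0.2, np.pi/8, 0.6):
    print(x, u_quad(x), u_exact(x))               # the two columns agree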

5 Linear PDEs
In this section we will illustrate how to solve some linear partial differential equations, PDEs. Namely
we will discuss how to use

1. Classification of PDEs
2. Derivations of classical linear PDEs
3. Method of Characteristics
4. Separation of Variables

5. Fourier Transforms / Laplace Transforms


6. Similarity Solutions
7. Method of Images

to solve various kinds of linear PDEs. Solutions to non-linear PDEs are not very friendly to the
ole paper and pencil formulations that we love, cherish, and admire. First we begin by classifying
equations. To do so, let’s state what we mean by linear PDEs! We will illustrate this using 2nd order
PDEs, as a lot of natural phenomena are successfully modeled mathematically using 2nd order PDE
operators, but that is a story for another day.

We consider a function u = u(x, y) of two independent variables and write the
general form of a 2nd order linear PDE as
\[ A\frac{\partial^2 u}{\partial x^2} + B\frac{\partial^2 u}{\partial x\,\partial y} + C\frac{\partial^2 u}{\partial y^2} + D\frac{\partial u}{\partial x} + E\frac{\partial u}{\partial y} + Fu = G(x, y), \qquad (47) \]
where A, B, C, D, E, F and G are functions of x and y. As in ordinary differential equations, if
G(x, y) = 0 the PDE is said to be homogeneous, otherwise it is said to be non-homogeneous.

Fortunately we are able to classify different types of PDEs, which give rise to solutions with
radically different behavior, using the coefficient functions {A, B, C, . . . , F }. Furthermore in the case
when {A, B, C, D, E, F } are constants, i.e., the PDE has constant coefficients, when we classify a
PDE there is no possibility of changing types spatially (or temporally). We classify them using the
following,
1. Hyperbolic: B² − 4AC > 0
2. Parabolic: B² − 4AC = 0
3. Elliptic: B² − 4AC < 0


In a nutshell, the classification of 2nd order PDEs is very important, but is beyond the scope of
what we wish to discuss here. We will note that the classification helps give rise to certain kinds
of boundary conditions and/or initial conditions for a PDE. Furthermore we will say that hyperbolic
equations are related to wave phenomena, parabolic are related to dissipative processes and heat flow,
and elliptic equations are related to steady-states of a system.

In the case of 1st Order Systems of PDEs, we can classify the system based on properties of the
coefficient matrices, A and B, where our general equation is of the form,

Aut + Bux = c.

Depending on the invertibility of A or B, we may or may not be able to classify the system as
elliptic, hyperbolic, or parabolic. If A is non-singular, and B may or may not be, we consider the
problem of

Eigenvalues : det(B − λA) = 0


Eigenvectors : (B − λA)v = 0.

However, if the opposite situation arises, where B is known to be non-singular, and A may or may
not be, we consider the analogous problem of

Eigenvalues : det(A − λB) = 0


Eigenvectors : (A − λB)v = 0.

We then have the following three cases, depending on what the eigenvalues are found to be:
Elliptic: No real eigenvalues ⇒ eigenvalues are strictly complex with Im(λ) ≠ 0. (# of eqns. must
be even!)
Hyperbolic: N real eigenvalues and yield a full set of linearly independent eigenvectors

– a. N real distinct eigenvalues ⇒ N linearly independent eigenvectors


– b. Repeated eigenvalues may or may not yield a full set of linearly independent eigenvectors
(if full set, only then is the system hyperbolic).
Parabolic: N real eigenvalues, which do not yield a full set of linearly independent eigenvectors.
This can only occur if there are repeated eigenvalues.
We will make just a few more notes on the classification of these 1st order PDE systems (a small computational sketch follows these notes). Note:
1. You cannot classify a system if it has both real and complex eigenvalues
2. Since in general A = A(x, t, u) and B = B(x, t, u) ⇒ λ = λ(x, t, u) and therefore classification
can vary from point to point in the (x, t)-plane.
3. For the scalar equation
a(x, t, u)ut + b(x, t, u)ux = c(x, t, u),
we have
\[ \lambda = \frac{b(x, t, u)}{a(x, t, u)} \quad \text{or} \quad \lambda = \frac{a(x, t, u)}{b(x, t, u)}, \]
and hence only have one real distinct eigenvalue. Therefore a scalar 1st order PDE is always
hyperbolic!
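Here is the promised sketch: classifying a first-order system numerically by solving the generalized eigenvalue problem det(B − λA) = 0 with SciPy. The example system is the 1-D wave equation rewritten as a first-order system in (p, q) = (u_t, u_x) - a standard trick, used here only as an assumed illustration.

import numpy as np
from scipy.linalg import eig

# wave equation u_tt = u_xx as a first-order system in (p, q) = (u_t, u_x):
#   p_t - q_x = 0,  q_t - p_x = 0   =>   A = I,  B = [[0, -1], [-1, 0]]
A = np.eye(2)
B = np.array([[0.0, -1.0],
              [-1.0, 0.0]])

lam, V = eig(B, A)              # solves det(B - lambda*A) = 0
print(lam.real)                 # +1 and -1: real and distinct, so the system is hyperbolic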

Actually before proceeding further, let’s take that dive and derive some of those more popular 2nd
order PDEs that everyone is ranting and raving about, namely the wave equation, heat equation, and
Laplace’s equation.

Let’s start off with the wave equation! (and surf along from there...)

5.1 Quick derivations of the big three - elliptic, hyperbolic, and parabolic PDEs.
In this section we will motivate the phenomena whose behavior is governed by the wave equation (hyperbolic) and the heat equation (parabolic). Furthermore we will also briefly build up intuition
behind Laplace's Equation (elliptic); however, we will not derive it.

5.1.1 Derivation of the Wave Equation:


\[ u_{tt} = a^2 u_{xx} \]
Consider a guitar string of length L, that is tethered at two ends along the x-axis, say at x = 0 and
x = L. When one plucks the string, we assume that all vibrational motion takes place perpendicular
to the x-axis, e.g., transverse vibrations. We will let u = u(x, t) denote the vertical displacement of
the string at any point x ∈ (0, L) for t > 0.

To derive the wave equation, we will need to make a few simplifying assumptions about the material
properties of the string and problem at hand. We will assume
1. The string is flexible and has no preferred shape
2. The string is homogeneous, i.e., the mass per unit length, ρ, is constant

3. The displacements are small compared to the length of the string, e.g., u(x, t) << L
4. From the above, this implies the slope of the string is small at all points
x ∈ (0, L)
5. The string's tension acts tangent to the string and its magnitude is the same at all points on
the string

6. The tension defeats the force of gravity entirely, i.e., ignore it. It’s weak...unless this is the guitar
near the event horizon of a black hole.
7. Actually, let’s assume no external forces are present. No electromagnetic force. No nuclear
forces. No gravity. (This is still better than spherical cows.)

We consider a small interval on the string [xj , xj + ∆x]. The tensions on the end of this interval
are, of course, tangent to the ends of the curved string. We next proceed by computing the net vertical
force on the string, in hopes of employing Newton's second law of motion to give us our model.
Assuming small angular displacements, we compute the force acting on the element ∆s (it’s small, I
swear!) and obtain

\[
\begin{aligned}
T\sin\theta_2 - T\sin\theta_1 &\approx T\tan\theta_2 - T\tan\theta_1 \\
&= T\frac{\partial u}{\partial x}\bigg|_{(x_j+\Delta x,\,t)} - T\frac{\partial u}{\partial x}\bigg|_{(x_j,\,t)} \\
&= T\left(\frac{\partial u}{\partial x}\bigg|_{(x_j+\Delta x,\,t)} - \frac{\partial u}{\partial x}\bigg|_{(x_j,\,t)}\right),
\end{aligned}
\]
since ∂u/∂x|_{(x_j+∆x, t)} = tan θ_2 and ∂u/∂x|_{(x_j, t)} = tan θ_1 from slope relationships. We also note that from
our assumptions above T = |T_1| = |T_2|.
Now that we know all the tensile forces acting on the string, we use the rest of Newton’s second
law of motion to relate it to the string’s acceleration. We simply use the assumption that

ρ∆s ≈ ρ∆x,

to give us a fair approximation to the mass of the string on [x_j, x_j + ∆x]. The acceleration term
is then ρ∆s ∂²u/∂t² ≈ ρ∆x ∂²u/∂t². Hence Newton's second law gives us
\[ \rho\Delta x\frac{\partial^2 u}{\partial t^2} = T\left(\frac{\partial u}{\partial x}\bigg|_{(x_j+\Delta x,\,t)} - \frac{\partial u}{\partial x}\bigg|_{(x_j,\,t)}\right). \]

Next, rearranging we get
\[ \frac{\partial^2 u}{\partial t^2} = \frac{T}{\rho}\,\frac{\frac{\partial u}{\partial x}\big|_{(x_j+\Delta x,\,t)} - \frac{\partial u}{\partial x}\big|_{(x_j,\,t)}}{\Delta x}. \]
Finally, taking the limit as ∆x → 0, we get the wave equation in all its linear glory,
\[ \frac{\partial^2 u}{\partial t^2} = a^2\frac{\partial^2 u}{\partial x^2}, \]
where a² = T/ρ.

Uniqueness of Solutions: We can prove uniqueness of the Wave Equation by using Energy
Methods. We define the energy of the system to be
\[ E(t) = \int_U |w_t|^2 + |\nabla w|^2\,dx. \qquad (48) \]

Energy methods are a powerful tool for getting a lot of information about a PDE for next to no
work. We will see that the main gist of using these methods is to look at the time derivative of the
energy, while assuming there are two distinct solutions, to eventually arrive at a contradiction,
giving us uniqueness for just a few simple integrations by parts, typically. We will use these
methods for both the Heat Equation and Laplace's Equation that are coming up soon. Let's
take a gander at uniqueness for the wave equation!

Theorem 5.2 Let U ⊂ R^n be open and bounded and have its boundary, ∂U, be smooth (or
at least C¹); then there is at most one C² solution to
\[ \begin{cases} u_{tt} - \Delta u = f & \text{in } (0, T] \times U \\ u = g & \text{on } [0, T] \times \partial U \\ u(0, x) = u_0(x), \quad u_t(0, x) = u_1(x). \end{cases} \]

Let’s go ahead and prove this!

Proof 5.1 Let u, ũ be two non-identical solutions to the above non-homogeneous wave equation.
We will define w = u − ũ, giving us the homogeneous equation
\[ \begin{cases} w_{tt} - \Delta w = 0 & \text{in } (0, T] \times U \\ w = 0 & \text{on } [0, T] \times \partial U \\ w(0, x) = w_0(x) = 0, \quad w_t(0, x) = w_1(x) = 0. \end{cases} \]

We then look at the energy of w,
\[ E(t) = \int_U |w_t|^2 + |\nabla w|^2\,dx, \]

and proceed by differentiating it with respect to time,
\[
\begin{aligned}
\frac{dE}{dt} &= 2\,\mathrm{Re}\int_U w_{tt}\bar{w}_t + \nabla w\cdot\nabla\bar{w}_t\,dx \\
&= 2\,\mathrm{Re}\int_U \big(w_{tt}\bar{w}_t - \Delta w\,\bar{w}_t\big)\,dx + 2\,\mathrm{Re}\int_{\partial U}(\partial_\nu w)\,\bar{w}_t\,dS,
\end{aligned}
\]
by integrating by parts. Note that ∂_ν is the normal derivative. Recall our boundary condition,
that w = 0 on ∂U. This implies w_t = 0 on ∂U for all t, and hence we have that
\[ \mathrm{Re}\int_{\partial U}(\partial_\nu w)\,\bar{w}_t\,dS = 0, \]
giving us
\[ \frac{dE}{dt} = 2\,\mathrm{Re}\int_U (w_{tt} - \Delta w)\,\bar{w}_t\,dx. \]

Since we have that w_tt − ∆w = 0, we get that
\[ \frac{dE}{dt} = 0, \]
thereby illustrating that E(t) is constant, and moreover that E(t) = E(0). This essentially gives us what
we are looking for, e.g.,
\[ E(t) = E(0) = \int_U |w_t(0, x)|^2 + |\nabla w(0, x)|^2\,dx = 0, \]
since w_0 = w_1 = 0, and hence we have w_t = 0 and ∇w = 0 for all t. Therefore w is constant, and since
w(0, x) = 0, we get w = 0, implying that u = ũ.

5.1.2 Derivation of the Heat Equation:


\[ \frac{\partial u}{\partial t} = k\frac{\partial^2 u}{\partial x^2} \]
We consider a long, thin wire in which heat transfer occurs. We call u(x, t) the temperature of
the wire at position x at time t. We consider only a thin (circular) rod, say of length L. We will
plop this wire down on the x-axis, and hence the wire lies on x ∈ [0, L]. If you'd like we can say the
wire has cross-sectional area, A; however, we will find it won’t matter in the long run!

Now come our simplifying assumptions!


1. The flow of heat only takes place within the metal rod, i.e., the rod is insulated
2. No heat is being generated within the rod, e.g. no source terms...yet!
3. The rod is homogeneous, e.g., the mass per unit volume ρ is constant

4. The specific heat γ and thermal conductivity K of the material are constants (it gets more
interesting when they aren’t, though)
Now unlike the wave equation, where we simply used Newton’s second law of motion to derive our
model equation, we need to use two empirical laws in heat flow. These laws are very logical and may
not seem like they deserve the title of being an empirical law.
1. The rate of heat flow through a cross-section, A, is proportional to the cross-sectional area and
the partial derivative with respect to x of the temperature:
\[ \frac{dQ}{dt} = -KA\frac{\partial u}{\partial x}. \]
The minus sign is to ensure that heat flows from the higher to lower temperature direction.
2. The quantity of heat, Q, at point x, within an element of mass m is

Q = γmu(x, t) = γρA∆xu,

where m = ρ(A∆x). We can now differentiate the above equation with respect to time to give
us a heat flow rate,
\[ \frac{dQ}{dt} = \gamma\rho A\Delta x\frac{\partial u}{\partial t}. \]
Now when heat is flowing in the positive x direction, we see that heat will build up in the element at a rate of
\[ -KA\frac{\partial u}{\partial x}\bigg|_{(x,t)} - \left(-KA\frac{\partial u}{\partial x}\bigg|_{(x+\Delta x,\,t)}\right) = KA\left(\frac{\partial u}{\partial x}\bigg|_{(x+\Delta x,\,t)} - \frac{\partial u}{\partial x}\bigg|_{(x,t)}\right). \]

Hence equating this with our second empirically derived relation gives
\[ \gamma\rho A\Delta x\frac{\partial u}{\partial t} = KA\left(\frac{\partial u}{\partial x}\bigg|_{(x+\Delta x,\,t)} - \frac{\partial u}{\partial x}\bigg|_{(x,t)}\right), \]
and therefore
\[ \frac{\partial u}{\partial t} = \frac{K}{\gamma\rho}\,\frac{\frac{\partial u}{\partial x}\big|_{(x+\Delta x,\,t)} - \frac{\partial u}{\partial x}\big|_{(x,t)}}{\Delta x}. \]
Finally, taking the limit as ∆x → 0, we obtain the heat equation, e.g.,
\[ \frac{\partial u}{\partial t} = \kappa\frac{\partial^2 u}{\partial x^2}, \]
where κ = K/(γρ).

Uniqueness of Solutions: We can prove uniqueness of the Heat Equation by using Energy
Methods, like we did for the Wave Equation. We define the energy of the system to be
\[ E(t) = \int_U |w|^2\,dx. \qquad (49) \]


Theorem 5.3 There exists at most one solution, u ∈ C¹([0, T]; C²(U)), to the initial value
problem / boundary value problem
\[ \begin{cases} u_t - \Delta u = f \\ u\big|_{\Gamma_T} = g, \end{cases} \]
where U is a bounded domain.
Like we alluded to, we will prove this result using energy methods.

Proof 5.2 First we take u and ũ to be two solutions to the heat equation and
attack this proof head on. We define a character w = u − ũ and then consider
the system
\[ \begin{cases} w_t - \Delta w = 0 \\ w\big|_{\Gamma_T} = 0. \end{cases} \]

All that is left to do is to differentiate (49) and use some mathematical massaging to obtain the
result, e.g.,
\[
\begin{aligned}
\frac{dE}{dt} &= 2\,\mathrm{Re}\int_U w_t\bar{w}\,dx \\
&= 2\,\mathrm{Re}\int_U \Delta w\,\bar{w}\,dx \\
&= -2\,\mathrm{Re}\int_U \nabla w\cdot\nabla\bar{w}\,dx + 2\,\mathrm{Re}\int_{\partial U}(\partial_\nu w)\,\bar{w}\,dS
\end{aligned}
\]

from integrating by parts. We also note that this second integral is identically zero, because along
the boundary ∂U, we have w = 0. Hence we get
\[ \frac{dE}{dt} = -2\,\mathrm{Re}\int_U \nabla w\cdot\nabla\bar{w}\,dx = -2\int_U |\nabla w|^2\,dx \leq 0, \]
and we have E(t) = E(0) = 0 since w = 0 at t = 0. Therefore we see that, indeed, u = ũ for all t.

We note that you can also prove uniqueness of the heat equation using the maximum principle. The
maximum principle on R^n says that if we suppose u ∈ C¹((0, T]; C²(R^n)) solves
\[ \begin{cases} u_t - \Delta u = 0 \\ u(0, x) = g, \end{cases} \]
and satisfies u(t, x) ≤ A e^{a|x|²} for all (t, x) ∈ (0, T] × R^n for some A, a > 0, then we have
\[ \sup_{\mathbb{R}^n\times[0,T]} u = \sup_{\mathbb{R}^n} g(x). \]
In a nutshell this says that the peaks of g(x) at t = 0 are the worst that can happen before
smoothing begins. Imagine it physically. The proof for uniqueness becomes
rather trivial in this case.

Proof 5.3 If ũ is another solution and we define, w± = ±(u − ũ), just applying the maximum
principle to w± we see that w+ ≤ 0 and w− ≤ 0, which implies that u − ũ = 0.

5.1.3 Intuition about Laplace’s Equation:


\[ \Delta u = \nabla^2 u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0. \]
First we note that we will not derive Laplace’s equation in this section. Sorry to disappoint you.
Instead we will try to motivate it as well as give some physical insight to what the equation says, while
also defining some operators. Let's start there. The Laplacian operator, which acts on Euclidean

n-space, can be thought of as the operator of non-mixed second partial derivatives with respect to the
spatial variables. In n dimensions the operator looks like
\[ \Delta = \sum_{j=1}^{n}\frac{\partial^2}{\partial x_j^2} = \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} + \ldots + \frac{\partial^2}{\partial x_n^2}. \qquad (50) \]

When we restrict ourselves to two spatial variables, say x and y, it takes the form in the above
Laplace Equation. It can also be thought of as the divergence of the gradient of a scalar function. It is not
related to our other friend, the Laplace Transform. There are two ways we will build some intuition
about the Laplacian.

In building intuition, let’s start off with something we all know well - basic calculus. Remember
back to your calculus 1 days, where taking a derivative was the answer to virtually everything, and
sometimes you had to take a second derivative to check something. Well that something, the concavity
is where we begin.

Recall if we have a function y = f(x) and f''(x) > 0, then y is concave up, while if f''(x) < 0, then
y is concave down. Analogously, the Laplacian can be thought of as the generalization of this to multivariate
functions.

Likewise, we could also think of the Laplacian as a “local averaging operator”. The Laplacian
basically compares the value of u at some point to all the points within a neighborhood of it. This can
also, perhaps, give better insight into the concavity idea. There are three pathological cases, listed
below.
1. ∆u = 0: u at a point is equal to the average of its neighbors (Laplace's Equation)
2. ∆u > 0: u at a point is less than the average of its neighbors
3. ∆u < 0: u at a point is greater than the average of its neighbors
Hence we have the idea that
\[ \Delta u \;\approx\; \text{``the average of } u \text{ over a neighborhood of } \vec{x}\,\text{''} - u(\vec{x}). \]

Now when we think of say, the heat equation, which can be written as

ut = k∆u,

we can think of it as saying that the rate of change of the temperature with respect to time is proportional to
the difference between the average temperature of u near a point and the value at the point itself. For
example, if there is a very cold spot in our domain, the surrounding domain is warmer, and hence the
Laplacian will be positive and thereby more heat will flow into that region, causing the temperature
to rise in our once colder spot.
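This "local averaging" picture is exactly what the standard 5-point stencil computes on a grid. Here is a tiny NumPy sketch on the made-up grid function u(x, y) = x² + y², for which ∆u = 4 exactly:

import numpy as np

h = 0.01
x = np.arange(0.0, 1.0 + h, h)
X, Y = np.meshgrid(x, x, indexing='ij')
U = X**2 + Y**2

# (sum of the four neighbors - 4*u) / h^2, i.e., neighbors' average minus the point, scaled
lap = (U[:-2, 1:-1] + U[2:, 1:-1] + U[1:-1, :-2] + U[1:-1, 2:] - 4.0 * U[1:-1, 1:-1]) / h**2
print(lap.min(), lap.max())   # both ~4: here u sits below its neighbors' average everywhere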

Uniqueness of Solutions: We can prove uniqueness of the Laplace Equation by, again, using
Energy Methods, like we did for the Heat Equation. We define the energy of the system to be
\[ I(w) = \frac{1}{2}\int_U |\nabla w|^2\,dx - \int_U fw\,dx, \qquad (51) \]

which is a slightly different definition of the energy than the one we used for the previous systems.
We will prove the following theorem.

Theorem 5.4 Let U be a bounded open set with smooth boundary. Let u be continuous on Ū
and satisfy ∆u = f in U with u = g on ∂U; then u is the unique such solution.

Proof 5.4 We will again proceed by contradiction and assume both u and ũ are solutions and
that u ≠ ũ. Then we can define w = u − ũ, and hence we have that ∆w = 0 in U and w = 0 on
∂U. Now looking at the energy of w we have
\[ \int_U |\nabla w|^2\,dx = -\int_U w\,\Delta w\,dx = 0, \]
by integrating by parts and since ∆w = 0 on U. Thus we have that ∇w = 0, implying that w is a
constant. Hence that constant must be zero, since w on the boundary is equal to zero. Therefore
we have that u = ũ.

Similarly to the heat equation case this proof is almost given away for free from a maximum
principle for the Laplace equation. We will state that strong maximum principle now.

Theorem 5.5 Suppose u ∈ C²(U) ∩ C(Ū) and −∆u = 0 in U; then we have

1. max_Ū u = max_∂U u
2. if there exists x_0 ∈ U such that u(x_0) = max_Ū u and U is a connected domain, then u is a
constant.

Furthermore, we can also fairly easily prove existence of solutions to Poisson's Equation (basically
the non-homogeneous cousin of the Laplace Equation), if we assume the energy functional attains a minimum on U.
Let's check it out for fun!

Theorem 5.6 Let u be a function at which the energy function in (51) assumes its minimum.
Then u is a solution of the Poisson equation −∆u = f on U with u = g on ∂U .

Proof 5.5 The proof begins just by expanding out the functional,
\[ I(u + tv) = I(u) + t\left(\int_U \nabla u\cdot\nabla v\,dx - \int_U fv\,dx\right) + \frac{t^2}{2}\int_U |\nabla v|^2\,dx, \]
where it is assumed v is a smooth function with compact support. Next, since we are trying to
minimize I(u + tv), we look at the directional derivative of I along v and set it equal to zero,
\[ \int_U \nabla u\cdot\nabla v\,dx - \int_U fv\,dx = 0, \]
and simply integrating this by parts will give us nearly what we are after,
\[ -\int_U \Delta u\,v\,dx - \int_U fv\,dx = 0, \]
and hence we have that
\[ \int_U\big(-\Delta u - f\big)v\,dx = 0, \]
thereby giving us that −∆u = f, since v is assumed to be a smooth function with compact support.

One last remark before we proceed with solution methodologies for linear PDEs. We state that if
u_1, u_2, ..., u_k are solutions of a homogeneous linear PDE, then their linear combination
\[ u = \sum_{j=1}^{k} c_j u_j = c_1 u_1 + c_2 u_2 + \ldots + c_k u_k, \]
is also a solution, assuming {c_j} ∈ C. This is fondly known as the superposition principle. Splendid.
Time to rock!

5.2 Method of Characteristics: For 1st order linear PDE


The basic mechanics behind the method of characteristics transforms our problem from one of PDE
nature into an ODE challenge. Characteristics themselves can be thought of as curves along which the
PDE reduces to an ODE. We now consider the following Cauchy problem,

a(x, y)ux + b(x, y)uy = c(x, y, u),


with corresponding initial data u|_C = u_0, where C is some curve on which the initial data is defined. We also note that this type of 1st order PDE is called quasi-linear due to the non-homogeneous
term possibly being a nonlinear function of u(x, y).

We will now motivate and build intuition on the method of characteristics.

5.2.1 Conceptual Ideas:


We will list the steps and motivate each behind the method now, upon considering the Cauchy problem
described above.

1. Make a coordinate transformation: (x, y) → (τ, s) so that


(i) C corresponds to the τ -axis (s = 0), i.e., the level curves of s(x, y) = 0.
(ii) We call the characteristics of the PDE, the level curves, τ (x, y) = constant. The PDE will
reduce to an ODE on these curves.
2. Along each level curve, τ (x, y) = constant, the PDE reduces to an ODE.
3. PDE is subject to prescribed u on the parameterized initial data curve

x = f (τ )
C : ⇒ u = u(f (τ ), g(τ )) = u0 (τ ).
y = g(τ )

(i) Transform coordinates so that (x, y) → (τ, s).


(ii) This entails

x = x(τ, s) τ = τ (x, y)
y = y(τ, s) s = s(x, y).

(iii) Furthermore our initial data makes the constraints,

x(τ, 0) = f (τ )
y(τ, 0) = g(τ )
u(τ, 0) = u0 (τ ).

Figure 4: General idea of the Cauchy problem’s initial data.

4. The solution is going to be a surface in z = u(x, y) for the Cauchy problem which contains the
initial data, C : (x(τ, 0), y(τ, 0), u(τ, 0)). This can be seen in Figure(4).

This surface can be thought of as a level surface of the function F(x, y, z) = u(x, y) − z = 0,
with corresponding normal vector n = ∇F = ⟨u_x, u_y, −1⟩.


Notice that the vector < a, b, c > lies in the tangent plane of the solution surface at each point
on it, e.g.,
< a, b, c > ·n =< a, b, c > · < ux , uy , −1 >= aux + buy − c = 0.

Following the vector < a, b, c >, it directs us along a curve on the solution surface.

(i) Such a curve passes through each point of the initial data curve, C_s, giving a one-parameter
family of curves, with the parameter, τ, specifying which one.
(ii) Define τ so that it is constant along each curve, thus giving a one-parameter family of
curves. The curves are then the level curves of τ (x, y) = constant.
(iii) Only s varies along each curve, with s = 0, corresponding to the points on the initial data
curve, Cs .

Figure 5: Parameterized curve, with parameter s, going through initial data curve Cs .

(iv) For a fixed τ, say τ = τ_1, the curve passing through the point (x(τ, 0), y(τ, 0), u(τ, 0)) on
C_s can be considered as a parametric curve with parameter s. This idea is illustrated in
Figure(5).
(v) A tangent vector to this parametric curve is ⟨∂x/∂s, ∂y/∂s, ∂u/∂s⟩.

(vi) The vector < a, b, c > is also tangent to this curve!


(vii) Therefore ⟨a, b, c⟩ and ⟨∂x/∂s, ∂y/∂s, ∂u/∂s⟩ must be parallel to each other! E.g.,
\[ \left\langle\frac{\partial x}{\partial s}, \frac{\partial y}{\partial s}, \frac{\partial u}{\partial s}\right\rangle = k\,\langle a, b, c\rangle, \]
where we can rescale our parameter s so that k = 1, and we get the following system of ODEs,
\[ \begin{cases} \dfrac{\partial x}{\partial s} = a(x, y, u) \\[4pt] \dfrac{\partial y}{\partial s} = b(x, y, u) \\[4pt] \dfrac{\partial u}{\partial s} = c(x, y, u), \end{cases} \]
a one-parameter family of ODEs called the characteristic equations. Therefore we have
\[ \frac{\partial u}{\partial s} = \frac{\partial u}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial u}{\partial y}\frac{\partial y}{\partial s} = au_x + bu_y = c. \]
• Finally, we have parameterized the initial data curve, C , as

x(τ, 0) = f (τ )
y(τ, 0) = g(τ )
u(τ, 0) = u0 (τ )

The system of ODEs with these initial conditions yield a solution,

x = x(τ, s)
y = y(τ, s)
u = u(τ, s)

and upon inverting the transformation, if possible, i.e., solving for τ(x, y) and s(x, y), we
can arrive at our final solution,
\[ u(x, y) = u(\tau(x, y), s(x, y)). \]

5.2.2 Mechanical Steps for the Method of Characteristics


If the above description was not clear on the mechanistic approach, we will succinctly place the steps
for the method of characteristics below. We again consider the Cauchy problem,

a(x, y)ux + b(x, y)uy = c(x, y, u) with u|C = u0 .

Step 1: Parameterize C in terms of τ and s = 0,
\[ C: \begin{cases} x = x(\tau, 0) \\ y = y(\tau, 0) \\ u = u(\tau, 0) = u(x(\tau, 0), y(\tau, 0)) = u_0(\tau). \end{cases} \]

Step 2: Solve the system of ODEs given by:
\[ \frac{dx}{ds} = a(x, y, u), \qquad \frac{dy}{ds} = b(x, y, u), \qquad \frac{du}{ds} = c(x, y, u), \]

all subject to the initial conditions from Step 1. This then yields the following relation,

x = x(τ, s)
y = y(τ, s)
u = u(τ, s)

Step 3: Solve for s and τ in terms of x and y, if possible*. Then

u(x, y) = u(τ (x, y), s(x, y)).

*Unfortunately, the coordinate transformations are not always invertible, or explicit in nature!

We will now present a few useful theorems regarding the method of characteristics. These all will
be in the vein of the existence and uniqueness of solutions.

Theorem 5.7 (Cauchy-Kowaleski Theorem): Consider the Cauchy problem a(x, y, u)u_x + b(x, y, u)u_y = c(x, y, u), subject to a prescribed u on a parametric curve,
\[ C: \begin{cases} x = \bar{x}(\tau) \\ y = \bar{y}(\tau), \end{cases} \]
where we let u = u(\bar{x}(\tau), \bar{y}(\tau)) = u_0(\tau) for initial data. Suppose a, b, c are continuously differentiable in a neighborhood of the initial data, i.e., the space curve (\bar{x}(\tau), \bar{y}(\tau), u_0(\tau)). If
\[ \begin{vmatrix} a & b \\ \bar{x}_\tau & \bar{y}_\tau \end{vmatrix} \neq 0 \]
at all points on C (e.g., C is non-characteristic), then there exists a unique solution to the Cauchy problem in some
neighborhood of the initial data curve, C.

The following theorem can be used to identify when a Cauchy problem will have ∞-many solutions
or no solution, based on the initial data.

Theorem 5.8 If
\[ \begin{vmatrix} a & b \\ \bar{x}_\tau & \bar{y}_\tau \end{vmatrix} = 0 \]
for all points on C (the initial data curve itself is a characteristic), then the Cauchy problem has either ∞-many solutions or no solution in a neighborhood of C.

Throughout all of this analysis, we have employed use of the Implicit Function Theorem, which
tells us when we can invert a coordinate transformation.
Theorem 5.9 (Implicit Function Theorem) Consider N equations with N + M variables,
\[ F_k(x_1, x_2, \ldots, x_M, y_1, \ldots, y_N) = 0 \quad \text{for } k = 1, 2, \ldots, N, \]
or
\[ F(x, y) = 0. \]

We also assume each partial derivative of F is C¹ with respect to each of the N + M variables. Furthermore
we assume that the system of equations is satisfied at some point P_0 = (x_1^0, ..., x_M^0, y_1^0, ..., y_N^0) with
\[ \frac{\partial(F_1, \ldots, F_N)}{\partial(y_1, \ldots, y_N)}\bigg|_{P_0} \neq 0. \]
Then {y_k}_{k=1}^N can be expressed as functions of {x_j}_{j=1}^M in some neighborhood of P_0, i.e.,
\[ y_1 = \phi_1(x_1, \ldots, x_M), \quad \ldots, \quad y_N = \phi_N(x_1, \ldots, x_M), \]
or
\[ \vec{y} = \vec{\phi}(x). \]
Moreover, each {φ_k}_{k=1}^N is C¹ with respect to each {x_j}_{j=1}^M.

The Implicit Function Theorem in a nutshell, basically states that when we are given ξ = ξ(x, y)
and η = η(x, y), we can write x = x(ξ, η) and y = y(ξ, eta) whenever

ξx ξy
6= 0.
ηx ηy

It is time to do an example! Let’s boogey!

5.2.3 Example: x u_x − y u_y = 2xy² with u(x, 1) = 1 + x² on 0 < x < 1


Consider the PDE listed above. We will proceed with each of the mechanical steps we listed previously for solving a PDE using the method of characteristics.

Step 1: Parameterize C with τ and s = 0:

C :  x(τ, 0) = τ,  y(τ, 0) = 1   ⇒   u(x(τ, 0), y(τ, 0)) = u(τ, 1) = 1 + τ².

Step 2: We set up the characteristic ODEs,

∂x/∂s = x
∂y/∂s = −y
∂u/∂s = 2xy² = 2(τe^s)(e^{−2s}) = 2τe^{−s}.

Therefore, solving the above system, we obtain

x = Ke^s   ⇒ (using x(τ, 0) = τ)           x = τe^s
y = Ke^{−s} ⇒ (using y(τ, 0) = 1)           y = e^{−s}
u = −2τe^{−s} + K ⇒ (using u(τ, 0) = 1 + τ²) u(τ, s) = −2τe^{−s} + τ² + 2τ + 1

Step 3: Inverting the transformations, we get

x = τe^s   ⇒ τ = xe^{−s} = xy
y = e^{−s} ⇒ s = −ln(y), y > 0.

Hence we get our final solution as

u(τ, s) = −2τe^{−s} + τ² + 2τ + 1   ⇔   u(x, y) = −2xy² + 2xy + 1 + (xy)² = −2xy² + (xy + 1)².
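As a quick sanity check (not part of the original derivation), here is a short SymPy sketch verifying that the solution above satisfies both the PDE and the initial data; the symbol names are purely illustrative.

import sympy as sp

x, y = sp.symbols('x y')

# Solution obtained via the method of characteristics
u = -2*x*y**2 + (x*y + 1)**2

# PDE residual: x*u_x - y*u_y - 2*x*y^2 should be identically zero
print(sp.simplify(x*sp.diff(u, x) - y*sp.diff(u, y) - 2*x*y**2))   # 0

# Initial data: u(x, 1) should equal 1 + x^2
print(sp.simplify(u.subs(y, 1) - (1 + x**2)))                      # 0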

5.3 General Method of Characteristics for fully Nonlinear problems!


In the above method of characteristics description, we have intentionally talked about the method
of characteristics as used for quasi-linear first-order PDEs. There is a more general version of the
method of characteristics for fully nonlinear problems. We will describe the more general version now,
but without much motivation.

The general method of characteristics in all its glory attempts to solve first-order PDEs of the
form,
F (x, y, u, ux , uy ) = 0,
with corresponding initial data 
x = x(τ ) 
C : y = y(τ ) .
u(x(τ ), y(τ )) = u(τ )

First we use the following notation,



p = ux
⇒ standard notation.
q = uy

We will require that u_x² + u_y² ≠ 0, e.g., p ≠ 0 when q = 0 and vice-versa. We note that this mandates that the solution will not have any local minima or maxima. Our governing PDE now takes the form,

F(x, y, u, p, q) = 0.
Rolling through nonlinear method of characteristics rigmarole, which we do not expand upon here,
we arrive at the following system of ODEs to solve,
dx/ds = ∂F/∂p
dy/ds = ∂F/∂q
du/ds = p ∂F/∂p + q ∂F/∂q
dp/ds = −( ∂F/∂x + p ∂F/∂u )
dq/ds = −( ∂F/∂y + q ∂F/∂u )

with initial conditions x(τ, 0) = x(τ ), y(τ, 0) = y(τ ), u(τ, 0) = u(τ ), p(τ, 0) = p(τ ), q(τ, 0) = q(τ ).
For the initial conditions on p(τ ) and q(τ ), we use the conditions that

1. F (x(τ ), y(τ ), u(τ ), p(τ ), q(τ )) = 0 (The differential equation is satisfied on the initial curve)
2. From u_τ = u_x x_τ + u_y y_τ on C we have that u′(τ) = p(τ) x′(τ) + q(τ) y′(τ).

Using those two bits of information we will be able to pick out the correct initial data for p(τ, 0)
and q(τ, 0). We will leave you with just a few more notes on the general method of characteristics.

1. Characteristics are still given by the level curves of τ (x, y).


2. If F (x, y, u, p, q) is linear in ux and uy , the above system reduces to the “less-general” method
of characteristics described previously
3. If

| x_τ   x_s |     | x_τ   F_p |       | F_p   F_q |
| y_τ   y_s |  =  | y_τ   F_q |  = −  | x_τ   y_τ |  ≠ 0

on C, then we are guaranteed a unique solution in some neighborhood of C.

Let’s do an example for completeness!



5.3.1 General Method of Characteristics Example: u_x u_y = 1 subject to u(x, 1) = 2√x for x > 0
First we write our first-order PDE in the form

F (x, y, u, p, q) = ux uy − 1 = pq − 1,

with p = ux and q = uy . We also note that Fx = Fy = Fu = 0, as well that q = Fp and p = Fq . We


then have the following system of ODEs to solve,
dx/ds = F_p = q
dy/ds = F_q = p
du/ds = pF_p + qF_q = 2pq
dp/ds = 0
dq/ds = 0

with initial conditions x(τ, 0) = τ, y(τ, 0) = 1, u(τ, 0) = 2τ^{1/2}. Recall that on C the initial data is satisfied, hence we have

(i) pq − 1 = 0  ⇒  p = 1/q

(ii) u′(τ) = p x′(τ) + q y′(τ)  ⇒  τ^{−1/2} = p(τ, 0)(1) + q(τ, 0)(0) = p


on C. Solving this system, we find that

p(τ, 0) = τ^{−1/2} and q(τ, 0) = τ^{1/2}.

Now we have all the initial data we require and can proceed to solving the above system. Upon
solving all these ODEs we find
dp/ds = 0 ⇒ p(τ, s) = C ⇒ (using p(τ, 0) = τ^{−1/2})   p(τ, s) = τ^{−1/2}
dq/ds = 0 ⇒ q(τ, s) = C ⇒ (using q(τ, 0) = τ^{1/2})    q(τ, s) = τ^{1/2}
dx/ds = q ⇒ x(τ, s) = τ^{1/2} s + A(τ) ⇒ (using x(τ, 0) = τ)   A(τ) = τ
dy/ds = p ⇒ y(τ, s) = τ^{−1/2} s + B(τ) ⇒ (using y(τ, 0) = 1)  B(τ) = 1
du/ds = 2pq = 2 ⇒ u(τ, s) = 2s + C(τ) ⇒ (using u(τ, 0) = 2τ^{1/2})  C(τ) = 2τ^{1/2}

and hence get

x(τ, s) = τ^{1/2} s + τ
y(τ, s) = τ^{−1/2} s + 1
u(τ, s) = 2s + 2τ^{1/2}.

Now we must try to transform back into (x, y) coordinates from (τ, s). From the above relations, we see we can write

τ y = τ^{1/2} s + τ = x,

and thereby

x − τ y = 0  ⇒  τ = x/y.

Hence we find

s = (x/y)^{1/2} (y − 1),

and obtain as our final solution

u(x, y) = 2 (x/y)^{1/2} (y − 1) + 2 (x/y)^{1/2} = 2√(xy).

We also state that the characteristics for the above problem are the curves τ = x/y = constant.
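For completeness, a similar SymPy sketch (assuming x, y > 0 so the square roots simplify cleanly) confirms that u = 2√(xy) satisfies the fully nonlinear PDE and the initial data:

import sympy as sp

x, y = sp.symbols('x y', positive=True)

u = 2*sp.sqrt(x*y)

# Fully nonlinear PDE: u_x * u_y should equal 1
print(sp.simplify(sp.diff(u, x)*sp.diff(u, y)))   # 1

# Initial data: u(x, 1) should equal 2*sqrt(x)
print(sp.simplify(u.subs(y, 1)))                  # 2*sqrt(x)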

5.4 Separation of Variables: For 2nd order linear PDE (finite domains)
In this section we will discuss a powerful method for finding solutions to linear PDEs on finite domains. This is not the end-all, be-all technique for solving linear PDEs, but it is very useful in areas ranging from mechanics, electromagnetism, and quantum mechanics to economics.

We will illustrate this method through an example.

Example: Heat Equation in 1d on a finite domain


Consider the following linear, constant coefficient, homogeneous PDE (first order in time, second order in space),

ut = κuxx ,

with associated boundary conditions u(0, t) = u(1, t) = 0, and initial value u0 (x) = u(x, 0) =
f (x). Hence we consider the domain x ∈ [0, 1] and t > 0.

We begin by assuming a solution of the form


u(x, t) = X(x)T (t),
that is, u(x, t) can be written as a product of two functions, one function with functional dependence only on x, and the other with functional dependence only on t. Performing that
substitution we see

u_t = κ u_xx   ⇒   T′(t)/T(t) = κ X″(x)/X(x).

From the above we see that since each side of the equation depends only on one independent variable, yet the two sides are equal for all values, the quantity above must be equal to a constant, say −α². Hence we have

T′(t)/(κ T(t)) = X″(x)/X(x) = −α².
We now have an eigenvalue in our problem...err...one more degree of freedom to work with!
Before we get to that, we note that the boundary conditions are still to be obeyed, and are
enforced via the X(x) function, i.e., X(0) = X(1) = 0. Now let’s solve the above problem!

Solving: T 0 (t) = −κα2 T (t)

We can solve this ODE simply by noticing its beautiful separable form,
dT/T = −κα² dt,

and hence by simple integration obtain

T(t) = c₀ e^{−κα²t}.

We note that we did not get any information about the eigenvalue from solving this, nor can we use the initial value yet, since there is much spatial information still to be learned!

Solving: X 00 (x) + α2 X(x) = 0.

In solving this equation we have to look at three cases, that is when α = 0, α2 < 0,
and α2 > 0.

1. α = 0:
X 00 = 0
and hence we have
X(x) = cx + d.
Using the boundary conditions we see that
X(0) = X(1) = 0 ⇒ c = d = 0,
and hence we only have the trivial solution.

2. α2 < 0 :
X 00 − α2 X = 0
Solving this we get in general that

X(x) = c sinh(αx) + d cosh(αx).

Using the boundary conditions we see that

X(0) = d = 0
X(1) = c sinh(α) = 0 ⇒ c = 0

since we’ve assumed that α 6= 0 in this case. Hence we again only have the trivial
solution present.

3. α2 > 0:
X 00 + α2 X = 0
Solving the above ODE, we get

X(x) = c sin(αx) + d cos(αx).

Now using the boundary conditions we find

X(0) = d = 0
X(1) = c sin(α) = 0 ⇒ α = nπ

where n ∈ N. Therefore we find that our eigenvalue, λ = α2 = n2 π 2 .

Now we have both our solutions,

u(x, t) = X(x)T(t) = c sin(nπx) e^{−κn²π²t};

however, we see that any positive integer value of n gives us a different eigenvalue. It makes sense, then, that we can use superposition to sum all of these possible solutions, each corresponding to a different eigenvalue, e.g.,

u(x, t) = Σ_{n=1}^{∞} c_n sin(nπx) e^{−κn²π²t}.

Note that we still have an infinite number of coefficients {c_n} to find! Where, oh where, will we be able to find these? Welp, we haven't used the initial value for anything, yet. Let's start there!

Using the initial value, u(x, 0) = u₀(x) = f(x), we see that we have

f(x) = Σ_{n=1}^{∞} c_n sin(nπx).

The above is simply a Fourier Series for f (x)! We can use the tricks and tools of the trade, i.e.,
orthogonality and completeness of the Fourier basis, to find {cn }.

Using Fourier Series (on our domain the length is L = 1), we note that we get

c_n = 2 ∫₀¹ f(x) sin(nπx) dx.

Therefore the final solution to our heat equation problem is

u(x, t) = Σ_{n=1}^{∞} [ 2 ∫₀¹ f(x̃) sin(nπx̃) dx̃ ] sin(nπx) e^{−κn²π²t}.
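If you want to see the series in action, here is a minimal NumPy sketch of the solution formula; the initial condition f(x) = x(1 − x), the diffusivity κ = 1, and the 50-mode truncation are all assumptions made purely for illustration.

import numpy as np

kappa, N = 1.0, 50                       # assumed diffusivity and number of retained modes
f = lambda x: x*(1 - x)                  # assumed initial condition with f(0) = f(1) = 0

# Fourier sine coefficients c_n = 2 * int_0^1 f(x) sin(n pi x) dx, via a midpoint rule
M = 4000
xm = (np.arange(M) + 0.5)/M
cn = np.array([2*np.sum(f(xm)*np.sin(n*np.pi*xm))/M for n in range(1, N + 1)])

def u(x, t):
    # truncated separation-of-variables series for the heat equation
    n = np.arange(1, N + 1)[:, None]
    return (cn[:, None]*np.sin(n*np.pi*x)*np.exp(-kappa*(n*np.pi)**2*t)).sum(axis=0)

x = np.linspace(0, 1, 11)
print(np.max(np.abs(u(x, 0.0) - f(x))))  # small: the series reproduces the initial data
print(np.max(np.abs(u(x, 0.5))))         # much smaller: the solution decays in time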

Non-homogeneous Heat Equation


Consider the following linear constant coefficient, non-homogeneous PDE (again first order in time, second order in space),

u_t = R + k u_xx,

with associated boundary conditions, one of which is non-homogeneous, e.g., u(0, t) = 0 and u(1, t) = u₀ for t > 0, and initial value u(x, 0) = f(x) for 0 < x < 1. Hence we consider the domain x ∈ [0, 1] and t > 0.

For problems of this type, we will begin by making a change of variables of the dependent
variable,
u(x, t) = ψ(x, t) + ξ(x) (52)

Recall the boundary conditions in our new change of variables,

u(0, t) = ψ(0, t) + ξ(0) = 0
u(1, t) = ψ(1, t) + ξ(1) = u₀.

We will be able to put the non-homogeneous term entirely onto the ξ(x) function, provided we have

ξ(0) = 0 and ξ(1) = u₀.

In this case we then have translated homogeneous boundary conditions to ψ(x, t) on [0, 1]. We
note that since ξ(x) has no t dependence, the initial conditions gets placed onto ψ(x, t), i.e.,

u(x, 0) = f (x) = ψ(x, 0) + ξ(x) ⇒ ψ(x, 0) = f (x) − ξ(x).

Differentiating (52) accordingly, we have

∂²u/∂x² = ∂²ψ/∂x² + ξ″(x)   and   ∂u/∂t = ∂ψ/∂t.

Therefore our model PDE becomes

k ∂²ψ/∂x² + kξ″ + R = ∂ψ/∂t.

The above equation will reduce to a homogeneous equation if we make the assumption that

kξ 00 + R = 0.

Solving this (by two simple integrations) gives

ξ(x) = −(R/2k) x² + c₁x + c₂.

Now making ξ(x) obey the boundary conditions we set above gives

c₁ = R/2k + u₀   and   c₂ = 0.

Hence we find

ξ(x) = −(R/2k) x² + ( R/2k + u₀ ) x.

Once we have ξ(x), we can now set out to solve our homogeneous PDE given above, e.g.,

ψt = kψxx ,

with ψ(0, t) = ψ(1, t) = 0 for t > 0 and ψ(x, 0) = f (x) − ξ(x).

It is clear that finding the solution to this PDE is identical to solving the homogeneous heat
equation we previously have, but with differing initial condition. Rolling through the same
separation of variables rigmarole, we find that

ψ(x, t) = Σ_{n=1}^{∞} c_n e^{−kn²π²t} sin(nπx),

where we know the {c_n}'s are given by orthogonality of the Fourier basis,

c_n = ∫₀¹ [f(x) − ξ(x)] sin(nπx) dx / ∫₀¹ sin²(nπx) dx = 2 ∫₀¹ [ f(x) + (R/2k)x² − (R/2k + u₀)x ] sin(nπx) dx.

Therefore the solution to our original non-homogeneous PDE is given by

u(x, t) = ψ(x, t) + ξ(x)

        = Σ_{n=1}^{∞} [ 2 ∫₀¹ ( f(x̃) + (R/2k)x̃² − (R/2k + u₀)x̃ ) sin(nπx̃) dx̃ ] e^{−kn²π²t} sin(nπx) − (R/2k)x² + (R/2k + u₀)x.

We note that the above solution is composed of two pieces, ψ(x, t) and ξ(x). We call ξ(x) the
steady-state solution since we note that as t → ∞, ψ(x, t) → 0. We call ψ(x, t) the transient
solution.
A few more notes on separation of variables:

We will just say a few brief things about this method.


1. This methodology can be used only for linear PDEs on finite domains.
2. Such PDEs do not have to have constant coefficients; as long as the equation is separable and the resulting system of ODEs is of the Sturm-Liouville flavor, you will be able to use this method.

3. The example above walks through the mechanics for the heat equation; solutions to the wave equation and Laplace's equation can be obtained analogously.
4. This methodology will also work for higher-dimensional heat equations, wave equations, and Laplace equations, e.g., situations with more than two independent variables.
5. The change of variables, u(x, t) = ψ(x, t) + ξ(x), will work for non-homogeneous problems involving Laplace's equation or the wave equation, as well.

5.5 Integral Transforms: Linear PDEs on R (or Rⁿ)


In this section we will show how to use the Fourier Transform, Laplace Transform, and similar methods for solving some linear PDEs. In general, for linear PDEs on R we will use the Fourier Transform, while for variables defined on [0, ∞) we will use the Laplace Transform. We will begin with a brief overview of the Fourier Transform.

5.5.1 The Fourier Transform


We begin with the definition of the Fourier Transform,
Z ∞
F {f (x)}(k) = F (k) = f (x)e−ikx dx. (53)
−∞

We note that not every function f(x) will have a Fourier Transform; however, a sufficient condition for a function to have one (i.e., for the above integral to converge) is that f(x) is square-integrable, e.g.,

||f(x)||² = ∫_R |f(x)|² dx < ∞.

If f(x) is also continuous, then the Inverse Fourier Transform exists, and we have the inversion formula

f(x) = F⁻¹{F(k)}(x) = (1/2π) ∫_R F(k) e^{ikx} dk.        (54)
We will now discuss a few properties of the Fourier Transform.

1. Linearity:

Like the Laplace Transform, linearity holds in the Fourier Transform. If it didn’t, it probably
wouldn’t be worth a second glance, after-all.

F {c1 f1 (x) + c2 f2 (x)} = c1 F {f1 (x)} + c2 F {f2 (x)} = c1 F1 (k) + c2 F2 (k),

where {c1 , c2 } ∈ C.
2. Translations:

We also have the following translational property,

F{f(x − a)} = ∫_R f(x − a) e^{−ikx} dx
            = ∫_R f(u) e^{−ik(u+a)} du
            = e^{−ika} ∫_R f(u) e^{−iku} du
            = e^{−ika} F(k).

Hence we have
F{f(x − a)} = e^{−ika} F(k).

3. Scaling:

The following scaling property also holds,

F{f(cx)} = (1/|c|) F(k/c).

4. Convolution:

The Fourier Transform of a convolution of two functions, f1(x) and f2(x), has an elegant (and convenient!) form,

F{f1(x) ∗ f2(x)} = F1(k) F2(k).

5. Modulation:

The Fourier Transform of a modulation (product) of functions is given by

F{f1(x) f2(x)} = (1/2π) F1(k) ∗ F2(k).

6. Parseval’s Theorem:

∫_R |f(x)|² dx = (1/2π) ∫_R |F(k)|² dk,

or equivalently

||f(x)||² = (1/2π) ||F(k)||².

This identity becomes especially useful when doing stability analysis of numerical methods in-
volving finite differencing for the solutions of PDEs.

7. Duality:

Suppose f (x) has Fourier Transform F (k). Then we immediately know the Fourier Transform
of the function F (x)! This is known as the duality property of the Fourier Transform, as seen
below

F {F (x)} = 2πf (−k).

8. Derivatives!:

When we take the Fourier Transform of a derivative, we will simply use our good friend, integration by parts, to help us out. Knowing that our function is square-integrable is what allows us to drop the boundary terms. Let's check it out:

F{df/dx} = ∫_R (df/dx) e^{−ikx} dx
         = [ f(x) e^{−ikx} ]_{−∞}^{∞} + ik ∫_R f(x) e^{−ikx} dx
         = ik ∫_R f(x) e^{−ikx} dx
         = (ik) F(k).

Therefore we find that

F{df/dx} = (ik) F(k).

We can continue analogously for higher order derivatives, and by induction, we get the following relation,

F{dⁿf/dxⁿ} = (ik)ⁿ F(k).

9. Derivatives with respect to non-transformed variables:

Consider taking the Fourier Transform with respect to x of ∂u ∂t . We will see since t is a dummy-
type variable and our integration does not depend on it, it makes sense we will be able to pull
the derivative operator with respect to t out of the integral. Let’s check it out rigorously,

F{∂u/∂t} = ∫_R u_t(x, t) e^{−ikx} dx
         = lim_{Δt→0} ∫_R [ u(x, t + Δt) − u(x, t) ]/Δt · e^{−ikx} dx
         = lim_{Δt→0} (1/Δt) [ ∫_R u(x, t + Δt) e^{−ikx} dx − ∫_R u(x, t) e^{−ikx} dx ]
         = lim_{Δt→0} (1/Δt) [ U(k, t + Δt) − U(k, t) ]
         = ∂U/∂t (k, t).

Naturally this result holds for higher order derivatives, i.e.,

F{∂ⁿu/∂tⁿ} = ∂ⁿ/∂tⁿ U(k, t).

Now let’s proceed to solve a PDE using Fourier Transform!

5.6 Fourier Transform to Solve a PDE


Consider the following hyperbolic equation, e.g., in this case the wave equation,

utt = c2 uxx ,

with u → 0 as |x| → ∞ and for t > 0. We begin by taking the Fourier Transform of both sides of this equation,

∫_R u_tt e^{−ikx} dx = c² ∫_R u_xx e^{−ikx} dx.

From properties of the Fourier Transform we saw in the previous section, we see that this gives us

∂²U/∂t² (k, t) = c² (ik)² U(k, t) = −c²k² U(k, t).

Hence taking the Fourier Transform has transformed our PDE into an ODE. We now just consider k as some fixed parameter, and see that our problem is reduced to a constant-coefficient, homogeneous linear ODE. We can easily solve this ODE using familiar methods from basic ODE theory. Let

U (k, t) = eαt ,

and upon substituting we see we get the solution

U (k, t) = C1 (k)e−ickt + C2 (k)eickt ,


where C1(k) and C2(k) are functions strictly of k, to be determined from initial conditions. We note that when k = 0 the ODE reduces to U_tt = 0, whose bounded (physical) solutions are simply constants in time.

Now to recover the solution we care about in physical space, i.e., u(x, t), we need to use the Inverse
Fourier Transform to transform back appropriately. Doing this we find
u(x, t) = F⁻¹{U(k, t)} = (1/2π) ∫_R U(k, t) e^{ikx} dk
        = (1/2π) ∫_R [ C1(k) e^{−ickt} + C2(k) e^{ickt} ] e^{ikx} dk
        = (1/2π) [ ∫_R C1(k) e^{ik(x−ct)} dk + ∫_R C2(k) e^{ik(x+ct)} dk ]
        = c1(x − ct) + c2(x + ct),

where C1(k) = ∫_R c1(x) e^{−ikx} dx and C2(k) = ∫_R c2(x) e^{−ikx} dx, and hence their Inverse

Fourier Transforms are used above. We note that this solution is called the D’Alembert solution,
which describes wave phenomena, where one wave packet moves to the right, c1 (x − ct), and one wave
packet moves to the left, c2 (x + ct).
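As a numerical illustration (a sketch, not part of the notes), one can mimic the Fourier-transform solution with the FFT on a large periodic box standing in for R: for initial data with u_t(x, 0) = 0, multiplying the transformed initial data by cos(ckt) realizes U(k, t), and the result matches the d'Alembert form. The Gaussian initial condition, box size, and grid resolution below are assumed purely for illustration.

import numpy as np

c, L, N, t = 1.0, 40.0, 1024, 5.0
x = np.linspace(-L/2, L/2, N, endpoint=False)
k = 2*np.pi*np.fft.fftfreq(N, d=L/N)      # angular wavenumbers

f = np.exp(-x**2)                         # assumed initial displacement, u_t(x,0) = 0
U0 = np.fft.fft(f)

# Transformed ODE U_tt = -c^2 k^2 U with U_t(k,0) = 0 gives U(k,t) = U0(k) cos(c k t)
u_spec = np.real(np.fft.ifft(U0*np.cos(c*k*t)))

# d'Alembert solution: u = (f(x - ct) + f(x + ct))/2
u_dal = 0.5*(np.exp(-(x - c*t)**2) + np.exp(-(x + c*t)**2))

print(np.max(np.abs(u_spec - u_dal)))     # ~round-off: the two solutions agree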

5.7 Laplace Transform to Solve a PDE


Consider the following wave equation,

u_tt = c² u_xx,

with boundary conditions u(0, t) = u(1, t) = 0 for t > 0, and initial values u(x, 0) = 0 and u_t(x, 0) = sin(πx) on 0 < x < 1; for simplicity we take c = 1 in what follows. We note that the spatial domain is finite, making us consider another method besides Fourier Transforms. We could, of course, notice the equation is separable, and use separation of variables, but we already had our fun with those, and yearn to use integral transforms!

Since the variable t is defined for t ∈ [0, ∞), we can use Laplace Transforms to transform the equation in this variable. Taking the Laplace Transform of both sides of the equation gives

L{∂²u/∂t²} = L{∂²u/∂x²}

∫₀^∞ u_tt e^{−st} dt = ∫₀^∞ u_xx e^{−st} dt

s² U(x, s) − s u(x, 0) − u_t(x, 0) = d²U/dx²,

and hence we have the following ODE to solve,

U_xx − s² U = −s u(x, 0) − u_t(x, 0) = −sin(πx),

by substituting our initial data. However, before we go further we recognize that the boundary conditions are functions of t and hence must themselves be transformed as well,

L{u(0, t)} = L{0} = 0 = U(0, s)
L{u(1, t)} = L{0} = 0 = U(1, s).

Now we have the appropriate boundary conditions to use when solving the ODE in the transformed space. Proceeding with the solution, we first solve the homogeneous problem, i.e., U_xx − s²U = 0, to get the complementary solution and find

U_C(x, s) = c1 cosh(sx) + c2 sinh(sx),

since s > 0 in the Laplace Transform. However, using the boundary conditions, we find that c1 = c2 = 0. Onwards to solve for the particular solution, we can use the method of undetermined coefficients (or variation of parameters if you're really that bored), starting with an initial guess for the particular solution as

U_P(x, s) = A cos(πx) + B sin(πx).

Substituting this into the ODE, we find that A = 0 and B = 1/(s² + π²). Therefore our solution is:

U(x, s) = U_C(x, s) + U_P(x, s) = sin(πx)/(s² + π²).
Now we must transform back into physical space,

u(x, t) = L⁻¹{U(x, s)}
        = L⁻¹{ sin(πx)/(s² + π²) }
        = (1/π) sin(πx) L⁻¹{ π/(s² + π²) }
        = (1/π) sin(πx) sin(πt).
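A short SymPy check (purely a sanity sketch, with c = 1 as above) confirms that this u satisfies the wave equation, the boundary conditions, and the initial data:

import sympy as sp

x, t = sp.symbols('x t')
u = sp.sin(sp.pi*x)*sp.sin(sp.pi*t)/sp.pi

print(sp.simplify(sp.diff(u, t, 2) - sp.diff(u, x, 2)))   # 0   (wave equation, c = 1)
print(u.subs(x, 0), u.subs(x, 1))                         # 0, 0 (boundary conditions)
print(u.subs(t, 0), sp.diff(u, t).subs(t, 0))             # 0, sin(pi*x) (initial data)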

5.8 Similarity Solutions!


Similarity transformations can give a lot of information about certain PDEs; however, we can also use them to cook up solutions when some (or all) of the independent variables appear only through certain groupings, rather than independently. Consider the following example involving the 2D heat equation.
1. 2D Heat Equation

For example, if we consider the heat equation

u_t = k (u_xx + u_yy),

we can search for solutions that depend on x and y through the grouping,

r(x, y) = √(x² + y²),

e.g.,

u(x, y, t) → u(r(x, y), t) = u(r, t),

and

u_x = u_r r_x   ⇒   u_xx = u_rr r_x² + u_r r_xx = u_rr (x²/r²) + u_r (y²/r³),

and analogously for u_yy we get,

u_yy = u_rr (y²/r²) + u_r (x²/r³),

and hence

u_xx + u_yy = u_rr + (1/r) u_r.

Doing these substitutions leads to the radially symmetric heat equation,

u_t = k ( u_rr + (1/r) u_r ).

2. 1D Wave Equation

Consider
ut + aux = 0.
We will search for solutions of the form u = u(ξ(x, t)) where ξ(x, t) = x − at. Performing the
necessary derivatives, we get

u_t = u_ξ ξ_t = −a u_ξ   and   u_x = u_ξ ξ_x = u_ξ.

Therefore substituting these into the wave equation above we find

ut + aux = 0
(−auξ ) + a(uξ ) = 0
0 = 0 ⇒ all differentiable functions of ξ satisfy the PDE.

Therefore we find our solutions take the form

u(x, t) = f (x − at).

We now make the claim that no solutions are lost by this similarity transformation. To see
this we will find the solution using this idea of similarity transformation! We will make the full
coordinate transformation,
ξ = x − at and η = η(x, t).
Recall from the Implicit Function Theorem that if

| ξ_x   ξ_t |     | 1     −a  |
| η_x   η_t |  =  | η_x   η_t |  = η_t + aη_x ≠ 0,

then we have an invertible transformation. We just need η(x, t) to satisfy the above.
Performing the necessary derivatives we find

ut = uξ ξt + uη ηt = −auξ + uη ηt
ux = uξ ξx + uη ηx = uξ + uη ηx

Substituting these into our governing PDE gives

ut + aux = −auξ + uη ηt + auξ + auη ηx


= uη (ηt + aηx ) = 0.

Hence we require that uη = 0 so that our coordinate transformation remains invertible, by our
determinant above. Solving this ODE, we find u(ξ, η) = C(ξ),
and therefore
u(x, t) = C(x − at).

3. 1D Heat Equation

Consider the 1D heat equation,
ut = kuxx .
We wish to use similarity transformations to try to cook up a solution. In general we will look for solutions depending on a grouping of the independent variables, such as ξ(x, t) = x^α t^β, where α, β ≠ 0. We then assume a solution of the form

u(x, t) = u(ξ(x, t)).

We proceed with performing the necessary differentiations,

u_t = u_ξ ξ_t = u_ξ β x^α t^{β−1} = u_ξ β (ξ/t)
u_x = u_ξ ξ_x = u_ξ α x^{α−1} t^β = u_ξ α (ξ/x)

and hence

u_xx = u_ξξ ξ_x² + u_ξ ξ_xx = u_ξξ α² (ξ²/x²) + u_ξ α(α − 1) x^{α−2} t^β
     = u_ξξ α² (ξ²/x²) + u_ξ α(α − 1) (ξ/x²).

Our governing equation then becomes

u_t = k u_xx   ⇒   u_ξ β (ξ/t) = k [ u_ξξ α² (ξ²/x²) + u_ξ α(α − 1)(ξ/x²) ],

and thereby

u_ξ β ξ (x²/t) = k [ u_ξξ α² ξ² + u_ξ α(α − 1) ξ ].

Now, in the vein of similarity solutions, we want to try to get rid of (or group) all the x's and t's. In the above equation, the quantity x²/t must be some power of ξ if the transformation is to be successful. Now we let

ξ^p = x²/t,

and hence get that

(x^α t^β)^p = x²/t   ⇒   αp = 2 and βp = −1,

and therefore

β = −α/2.

We only have the one condition above and α can be chosen at our discretion. Let's let α = 1 (that seems nice). It also kills a term in our governing equation. We now have β = −1/2 ⇒ ξ(x, t) = x/√t. Our governing equation then becomes (using x²/t = ξ²),

−(1/2) u_ξ ξ³ = k u_ξξ ξ²   ⇒   −(ξ/2k) u_ξ = u_ξξ   ⇒   u_ξξ + (ξ/2k) u_ξ = 0.

We can now solve this equation (it is separable and first order in u_ξ) to get

u_ξ = A e^{−ξ²/4k},

and by integrating once again, we get

u(ξ) − u(0) = ∫₀^ξ A e^{−s²/4k} ds.

If the above doesn't immediately scream error function to you, we will introduce you to that friendly function now.

Definition 5.1 The error function is defined as

erf(z) = (2/√π) ∫₀^z e^{−t²} dt,

with properties

erf(±∞) = ±1 and erf(x) is odd.

We can now write our solution in terms of the error function as

u(ξ) − u(0) = A ∫₀^ξ e^{−s²/4k} ds
            = A ∫₀^{ξ/(2√k)} e^{−t²} (2√k) dt          (letting t = s/(2√k))
            = Ã (2/√π) ∫₀^{ξ/(2√k)} e^{−t²} dt          (with Ã = A√(πk))

and then we get

u(ξ) = u₀ + Ã erf( ξ/(2√k) ),

and finally, upon substituting back into (x, t) coordinates using ξ = x/√t,

u(x, t) = u₀ + Ã erf( x/(2√(kt)) ).
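A quick SymPy sketch (symbols assumed positive so the simplification goes through; the symbol names are illustrative) verifies that this error-function profile really does satisfy the heat equation:

import sympy as sp

x, t, k, u0, A = sp.symbols('x t k u_0 A_tilde', positive=True)

u = u0 + A*sp.erf(x/(2*sp.sqrt(k*t)))

# Residual of the heat equation u_t - k u_xx should vanish identically
print(sp.simplify(sp.diff(u, t) - k*sp.diff(u, x, 2)))   # 0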

5.9 Method of Images


The method of images is a technique used for solving certain differential equations, where the domain of a problem is extended by the addition of a complementary domain, usually with some symmetric relation to the original, in which certain boundary conditions are instantly satisfied, and hence the solution to the original problem is facilitated. It is commonly used in electrostatics, where an array of point charges is located in the vicinity of a conducting or insulating surface, and the corresponding electric field is desired. Besides electrostatics, it is analogous to methods used in highly symmetric optics problems. We note that this method heavily depends on uniqueness of solutions.

We will perform two examples of the method of images to illustrate this idea. This section will also
require some basic electrostatic a priori physics knowledge. Well, either that or a basic understanding
of fundamental solutions of elliptic problems...whichever you’d prefer!

Figure 6: (a) Single point source (charge) above an infinitely long conducting surface. (b) The same
geometry, but with a virtual source point (with opposite charge) located symmetrically across the
conducting plane.

5.9.1 Example: Infinite conducting plate with single point charge


We consider a single point source (charge) in the upper half plane, where it is placed above an infinitely long conducting surface. This is illustrated in Figure(6a). We are interested in what the electric potential, φ, looks like at another point in the plane. The governing problem is that of a Poisson problem, an elliptic PDE,

∆φ = −(q/ε₀) δ(r − r₀),

where r₀ = dẑ is the location of the point source, +q, and φ is the potential associated with such a point source. We note that for any non-homogeneous differential equation, we will have both a complementary solution and a particular solution. Our only boundary condition is that on the conducting surface, say at z = 0, the potential is zero.

This is where the intuition behind the method of images is beautiful. Rather than look for a particular solution, say φ₊, and a complementary solution, φ₋, to the equations

∆φ₊ = −(q/ε₀) δ(r − r₀)   and   ∆φ₋ = 0,

respectively, we will try to cook up a solution that satisfies the boundary condition at z = 0 using knowledge of the fundamental solution of such an elliptic operator, e.g., ∆ in 2D.

We note that the general potential for such an operator is given by φ = k/r, where k is a constant that is determined by the physics of the underlying problem. Here k = ±q/(4πε₀), where ±q depends on which source term you are considering. If we combine two such solutions, we note we can satisfy the boundary condition at z = 0 by placing a point charge of equal and opposite sign at the mirror image of the original charge below the plane. This is seen in Figure(6b).

Note that below the plane is not even in the original domain of the problem, hence when we consider the Poisson-type problem for its source charge, we find

∆φ₋ = −(q_im/ε₀) δ(r − r_im) = 0 in the upper half-plane,

where r_im = −dẑ, since r_im is not in the domain of the problem at hand.

Therefore the solution to this Poisson problem is

φ = φ₊ + φ₋ = (1/4πε₀) [ q/r₊ − q/r₋ ] = (1/4πε₀) [ q/√(x² + (z − d)²) − q/√(x² + (z + d)²) ].
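As a sanity sketch, SymPy confirms the boundary condition on the conducting plane: the charge and image-charge terms cancel identically at z = 0 (the symbol names are purely illustrative):

import sympy as sp

x, z, d, q, eps0 = sp.symbols('x z d q epsilon_0', positive=True)

phi = (q/sp.sqrt(x**2 + (z - d)**2) - q/sp.sqrt(x**2 + (z + d)**2))/(4*sp.pi*eps0)

# Boundary condition on the conducting plane z = 0
print(sp.simplify(phi.subs(z, 0)))   # 0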

5.10 Example: Point Charge and a Grounded Sphere

Figure 7: (a) Single point source of charge q_pt outside a grounded sphere. (b) The same geometry, but with a virtual source point (of differing charge) located inside the grounded sphere at an unknown location, x_im. (c) Illustrating the influence from both the point charge and image point charge on a point outside the sphere.

Consider the problem of a point charge outside of a grounded sphere. Since the sphere is grounded,
we need the potential to be equal to zero on the entire sphere. This is our boundary condition. To
setup the geometry, say the point charge is a distance x0 away from the sphere of radius, R. For
simplicity, we will also center the sphere about the origin. This is shown below in Figure(7)(a).

In order to find the potential everywhere using the method of images, we need to find the location of the virtual point charge such that the boundary condition on the sphere is satisfied. To do this, we make the hypothesis that, to preserve symmetry, this image point should be placed somewhere along the x-axis; moreover, to satisfy φ = 0 along the sphere, intuition suggests this point will need to be inside the sphere.

Rather than try to calculate the potentials of both these points to satisfy the boundary condition
along the entire sphere, we consider trying to satisfy the boundary condition at two points along the
x-axis, namely (R, 0) and (−R, 0), i.e.,

• At (R, 0), φ = 0
• At (−R, 0), φ = 0.

If this doesn’t make sense why we only consider these two points, we have somewhat pulled a fast
one. Well, not really. We are just using what scientists call hindsight, err I mean, foresight when
thinking about a problem. The motivation is as follows. We note that using the fundamental solution
for ∆φ = 0 in radial coordinates in 2 − D for electrostatics problems, we will have two unknowns when
placing the image point - the location along the x-axis and the strength of the charge. The train of
thought then follows that by using as much symmetry as possible, if we are able to write a system
of two equations and two unknowns, we then can get all the information we need for finding the true
solution.

Therefore from those two (more specific) boundary conditions, we get

φ|_{(x,y)=(R,0)}  = 0 = (1/4πε₀) [ q_pt/(x₀ − R) + q_im/(R − x_im) ]

φ|_{(x,y)=(−R,0)} = 0 = (1/4πε₀) [ q_pt/(x₀ + R) + q_im/(x_im + R) ].

Solving the above system of two equations and two unknowns yields

x_im = R²/x₀   and   q_im = −q_pt R/x₀.

Now, to calculate the potential at any point outside the sphere, (r, θ, φ), we find

φ(r, θ, φ) = φ_pt + φ_im = (1/4πε₀) [ q_pt/√(r² + x₀² − 2rx₀ cos θ) + (−q_pt R/x₀)/√(r² + R⁴/x₀² − 2rR² cos θ/x₀) ].

At this junction, you may be wondering whether or not the above solution truly satisfies the boundary condition on the entire sphere. Well, the nice thing about PDEs (and differential equations in general) is that we can always check our solution. (If time permits on tests, of course...)

We first consider an arbitrary point on the sphere, whose distances from the charges q_pt and q_im are

d_pt = √(R² + x₀² − 2Rx₀ cos θ)   and   d_im = √(R² + R⁴/x₀² − 2R(R²/x₀) cos θ),

respectively. Now the potential on the surface of the sphere is given by

φ|_{r=R} = (1/4πε₀) [ q_pt/d_pt + q_im/d_im ]
         = (1/4πε₀) [ q_pt/√(R² + x₀² − 2Rx₀ cos θ) + (−q_pt R/x₀)/√(R² + R⁴/x₀² − 2R(R²/x₀) cos θ) ]
         = (1/4πε₀) [ q_pt/√(R² + x₀² − 2Rx₀ cos θ) − q_pt/√(x₀² + R² − 2Rx₀ cos θ) ]
         = 0.

Hence our solution checks out!

6 Physics!
You’re taking physics? I love physics!, said only 5% of the population, of which 90% were being ironic.
It’s no surprise that physics gets a bad rap somewhere between birth and college graduation; however,
it is truly a beautiful science and arguably without it, modern mathematics would not be where it is
today with it. For much of mathematics conception, math and physics (and other sciences for that
matter) were like peas and carrots. It has only been within the last century or so, where people start-
ing differentiating between calling themselves scientists to mathematicians, physicists, chemists, or
biologists. I digress. Physics has a place, and from it come a lot of interesting and useful mathematics!

In this section we will give a crash course in basic physics ideas, e.g., conservation laws and basic
mechanics, which will then lead us down the rabbit hole of calculus of variations to learn Lagrangian
and Hamiltonian formulations of mechanics. I apologize in advance for attempting to sum up years
worth of courses in a minimal amount of words. There will be (many) things left out, so please do
some light (gravity pun?) reading for further insight and intuition. Buckle up, we’re about to see
math in action!

6.1 Basic Physics Principles


There once lived a man named Isaac Newton. He may have invented calculus (or co-discovered it at the same time as his rival Gottfried Leibniz). What led him to calculus was his fascination with phenomena in the natural world and a desire to quantify the beauty he saw through formulae and equations. Upon doing so, he eventually wrote Philosophiae Naturalis Principia Mathematica, which was to be a large collection of how the universe operated. Whilst his law of gravitation did not hold up past two centuries (although it remains a very good approximation in the non-relativistic regime), his laws of motion remain famous, especially with students being introduced to physics for the very first time.

Newton’s three laws of motion are as follows


1. An object stays at rest (or continues to move at constant velocity) unless an external force acts
upon it.
2. The sum of all external forces acting on an object is equal to the product of the object’s mass
and acceleration.
3. When a body exerts a force on a second body, the second body simultaneously exerts an equal
in magnitude, but opposite in direction, force on the first body.
We will now give a more mathematical flavor of the three laws of motion.

Law 1:

The first law basically states that if the sum of all forces on an object is zero, then the object is either at rest or moving at a constant velocity, e.g., it is not accelerating. More specifically, it says that the vector sum of all forces is zero. This law is better known as the law of inertia.

Mathematically it says the following two statements are equivalent,

Σ_{j=0}^{N} F_j = 0   ⇔   dv/dt = 0,

where F_j is a force acting on a body and v is the velocity of that body. These statements say that if the body is in motion, it will remain in motion, moving in a straight line, until other forces act upon it. Furthermore, this statement holds in what is referred to as a Newtonian reference frame, or inertial reference frame, in which the motion of a body, according to any such observer, remains at constant speed in a straight line.

Law 2:

Newton’s second law of motion is really a statement about the conservation of momentum. In a
nutshell it states that the time rate of change of linear momentum is equal to the sum of forces
acting on the body in an inertial reference frame. This can be described mathematically as

N
dp d(mv) X
= = Fj ,
dt dt j=0

where p is the linear momentum, and is equal to mass times velocity, i.e., p = mv. We note
that this is consistent with the first law, in that if the sum of vector forces acting on a body is
zero, then the momentum of the object remains unchanged. A consequence of any net force is
a change in momentum of a body.

It is common to assume that the mass of a body does not change and is constant, leading to the
infamous
N
X dv d2 x
Fnet = Fj = m = m 2 = ma;
j=0
dt dt

however, if mass does change, we need to consider

F_net + u (dm/dt) = m (dv/dt),

where u is the velocity of the escaping (or incoming) mass of the body. The extra term, u dm/dt, is sometimes referred to as the advection of momentum. We note that if the body's mass is changing we do not simply treat the original statement as a product rule, i.e.,

WRONG ⇒ F_net = d(mv)/dt = m dv/dt + v dm/dt ⇐ WRONG.
dt dt dt
The reason for this is that it does not respect Galilean invariance, or that the laws of motion
must remain the same regardless of what inertial system you are in, e.g., our statement above
would imply that F = 0 in one frame of reference while F ≠ 0 in another. For example, say in one reference frame you are moving at constant velocity U, while we consider another non-moving (stationary) reference frame. We can relate the physics between the two using Galilean transformations, i.e.,

x′ = x − Ut        (position)
v′ = dx′/dt = dx/dt − U
a′ = d²x′/dt² = d²x/dt² = a,

where all coordinates with primes indicate the moving inertial reference frame. Hence in both
frames of reference we have the same acceleration, and hence conservation of momentum is pre-
served.

Moreover, if we wish to consider Newton’s second law of motion for relativistic systems, we need
to change our definition of linear momentum, since at speeds “near-ish” to the speed of light,
linear momentum is not simply the product of mass and velocity.

Newton’s second law of motion is very powerful, and is the law that most people refer to the
most when considering physical phenomena. For example, the main equations of fluid dynamics,
called the Navier-Stokes equations, although nonlinear PDEs, are just a statement about the
conservation of momentum for a fluid. At the heart of basic mechanics, Newton's second law is the main driver of solutions.

Law 3:

The crux of Newton’s third law is that no force is only unidirectional. If body A exerts a force on
body B, body B exerts the same force back on body A. All forces must be interactions between
bodies, and no one body can magically exert a force without feeling the equal and opposite force
back upon itself.

Think about punching a wall. You may think you have exerted all the force onto the wall, but
the reason your hand hurts is because the wall has “hit your hand” back with as much force as
you placed against it. To every action, there is an equal and opposite reaction.

We will now briefly discuss some conservation laws, or integrals of the system, in physics.

6.1.1 Conservation Laws


If a system is closed, e.g., isolated, and the system does not interact with its background environment
in any way, certain quantities arise, which are preserved throughout the entire evolution of the system
considered. These quantities are known as integrals of the system, constraints of the motion, or
conserved quantities. In physics, we call them conservation laws, and there are four:

1. Conservation of Momentum
2. Conservation of Energy
3. Conservation of Angular Momentum

4. Conservation of Charge

We will now briefly describe each of these conservation laws. Well not, conservation of charge,
but that should be somewhat self explanatory for the level of physics we are shooting for. Note, in
mathematics the term conservation law usually refers to transport type hyperbolic equations of the
form, here in their scalar-ized glory,
∂t u + ∂x f (u) = 0.
u is called the conserved quantity, while f is called the flux.

Conservation of Momentum

We have already seen a bit about the conservation of momentum coming from Newton’s second
law of motion. Rather than think about forces causing changes in momentum of a body, let’s
consider the momentum in a system as a whole. If one part of the system gains momentum to
go one way in the system, another part of the system must have gained equal momentum, but
to go in the opposite direction.

If we consider our system to have two particles in it, we assume that at a time t₁ and a later time t₂ the total momentum of the system must be the same, e.g.,

[ m₁v₁ + m₂v₂ ]_{t=t₁} = [ m₁v₁ + m₂v₂ ]_{t=t₂}.

However, the above assumes that if there have been any collisions between the balls, that only
elastic collisions have occurred. That is to say, no energy has been lost, well converted, to heat,
sound, etc. Spoiler alert - energy cannot be created nor destroyed, merely converted to other
forms of energy. Those type of collisions where energy has been converted to heat (sound, etc.)
are called inelastic collisions. We will discuss this further in the conservation of energy section.
Again, the main statement about the conservation of momentum is best summed up (heyyo!) through Newton's second law,

m dv/dt = Σ_j F_j.

Let’s do some mechanics problems to illustrate the power of this law! In doing so, we will also
try to motivate intuitive ideas on how to approach such mechanics problems.
Example 1: Consider two identical masses coupled by a linear spring that is oriented ver-
tically, i.e., aligned with gravity, and is falling slowly through a viscous fluid. Assume that
the spring is originally extended and that the gravitational field is constant. Furthermore ,
assume that each mass experiences a viscous drag force, which depends linearly on velocity,
and that the masses are initially at rest.
1. Write down the coupled dynamics describing the system’s evolution
2. Non-dimensionalize the system. How many independent parameters are there?
3. Assuming that the drag coefficient is small, construct a perturbative expansion for the
motion. Only compute the first non-trivial correction.
We draw a free body diagram describing the system. This can be seen in Figure(8). Using conservation of momentum, we see each mass will have its own equation, which depends on a multitude of forces, including the gravitational force, fluid drag force, and spring force. Without loss of generality, we will label the equation associated with the top mass with a subscript of 1, and the bottom mass with a subscript of 2.

Note that we do not have any forces in the x-direction, hence we only have to consider motion in the z-direction. Therefore we can easily write that

m z̈₁ = F₁,gravity + F₁,drag + F₁,spring
m z̈₂ = F₂,gravity + F₂,drag + F₂,spring.

Figure 8: (a) The physical system - two coupled masses attached by a linear spring falling through a
viscous fluid. (b) The free body diagram for both masses in the system.

Although we only have a skeletal like equation at the moment, we will be able to define
all the force terms in the above equation. First, we can write out the initial conditions.
Since both masses are initially at rest, we can write ż1 (0) = ż2 (0) = 0. Furthermore, we
can assume that initially the masses are a distance h apart, and hence define z1 (0) = 0 and
z2 (0) = −h.

Now we proceed to define all those force terms. Beginning with the force of gravity, which we assume is constant, the forces can be defined as

F₁,gravity = −mg
F₂,gravity = −mg.

Similarly, since we assume that the drag force is linearly proportional to the velocity of the mass (and opposes the motion), we can write these forces as

F₁,drag = −ν ż₁
F₂,drag = −ν ż₂,

so the drag acts upwards (in the +z direction) on the falling masses. Note that ν > 0 is the drag coefficient.
Lastly, we have to define the spring force. We note for a simple spring, Hooke’s Law
says that the force produced is proportional to the distance the spring is stretched from
equilibrium, where the proportionality constant is called the spring stiffness. Hence a
general spring force could be written as Fspring = −k(x − d), where d is the equilibrium
distance and k is the spring stiffness. Hence if the spring is at a distance d, there will be
no force produced. Therefore we can write our spring force relations as

F₁,spring = −k [(z₁ − z₂) − d]
F₂,spring = +k [(z₁ − z₂) − d],

so that a stretched spring pulls the top mass down and the bottom mass up.

Now we can write our spring system as

m z̈₁ = −mg − ν ż₁ − k [(z₁ − z₂) − d]
m z̈₂ = −mg − ν ż₂ + k [(z₁ − z₂) − d],

with ż₁(0) = ż₂(0) = 0, z₁(0) = 0, z₂(0) = −h, and where k > 0 and ν > 0.
It is clear from the equations that motion of the masses is coupled, as well that the equa-
tions are linear. We will now proceed to non-dimensionalize these equations. To non-
dimensionalize we start by defining characteristic length and time scales, i.e.,
z₁ = z₁⁰ z̃₁
z₂ = z₂⁰ z̃₂
t = t₀ τ.

Using the chain rule, we can figure out the non-dimensionalization of the time derivatives,

d/dt = (dτ/dt) d/dτ = (1/t₀) d/dτ.

Doing these substitutions, and dividing by the coefficient of the leading order term, we get the following system,

d²z̃₁/dτ² + (νt₀/m) dz̃₁/dτ + (kt₀²/m) z̃₁ − (kt₀² z₂⁰)/(m z₁⁰) z̃₂ = (t₀²/z₁⁰) ( −g + kd/m )
d²z̃₂/dτ² + (νt₀/m) dz̃₂/dτ + (kt₀²/m) z̃₂ − (kt₀² z₁⁰)/(m z₂⁰) z̃₁ = (t₀²/z₂⁰) ( −g − kd/m ).

We currently have a bunch of parameters,

{ νt₀/m,  kt₀²/m,  (kt₀² z₂⁰)/(m z₁⁰),  (kt₀² z₁⁰)/(m z₂⁰),  (t₀²/z₁⁰)(g − kd/m),  (t₀²/z₂⁰)(g + kd/m) }.

We can let t₀ = √(m/k), which is the reciprocal of the natural frequency of an undamped spring. Note that we did not include ν in this term, since later in the problem we will assume it's a small parameter. . .so I guess the fluid isn't very viscous; perhaps it's like air. . .but that's not the scope of our efforts here. We now have the following system,

d²z̃₁/dτ² + (ν/√(mk)) dz̃₁/dτ + z̃₁ − (z₂⁰/z₁⁰) z̃₂ = (m/(k z₁⁰)) ( −g + kd/m )
d²z̃₂/dτ² + (ν/√(mk)) dz̃₂/dτ + z̃₂ − (z₁⁰/z₂⁰) z̃₁ = (m/(k z₂⁰)) ( −g − kd/m ).

Now we can make the assumption that both characteristic lengths, z₁⁰ and z₂⁰, are equal, and in particular let z₁⁰ = z₂⁰ = gm/k. Lastly, if we drop the tildes and revert back to time derivatives being designated by a dot, we have

z̈₁ + (ν/√(mk)) ż₁ + z₁ − z₂ = −1 + dk/(gm)
z̈₂ + (ν/√(mk)) ż₂ + z₂ − z₁ = −1 − dk/(gm).

Hence we are left with only two parameters in the system,

{ ν/√(mk),  dk/(mg) }.

The first parameter, ν/√(mk), can be seen as a kind of damping ratio within the system, while the second parameter, dk/(mg), can be seen as the ratio of spring to gravitational forces. Now recall we are going to assume that ν is a small parameter, so we will define these parameters as

ε = ν/√(mk)   and   δ = dk/(mg).
We are now going to seek perturbative solutions for the motion within the system. That is, we are going to seek solutions of the form,

z₁ = z₁⁰ + ε z₁¹ + ε² z₁² + . . .
z₂ = z₂⁰ + ε z₂¹ + ε² z₂² + . . . .

Substituting these perturbative forms into the system's equations, we have

d²/dτ² [ z₁⁰ + ε z₁¹ + ε² z₁² + . . . ] + ε d/dτ [ z₁⁰ + ε z₁¹ + ε² z₁² + . . . ] + (z₁⁰ + ε z₁¹ + . . .) − (z₂⁰ + ε z₂¹ + . . .) − δ + 1 = 0
d²/dτ² [ z₂⁰ + ε z₂¹ + ε² z₂² + . . . ] + ε d/dτ [ z₂⁰ + ε z₂¹ + ε² z₂² + . . . ] + (z₂⁰ + ε z₂¹ + . . .) − (z₁⁰ + ε z₁¹ + . . .) + δ + 1 = 0.

If we look at the O(1) terms we find:

O(1) :  z̈₁⁰ + z₁⁰ − z₂⁰ − δ + 1 = 0
        z̈₂⁰ + z₂⁰ − z₁⁰ + δ + 1 = 0.

From the second equation we can solve for z₁⁰ and then differentiate twice, e.g.,

z₁⁰ = z̈₂⁰ + z₂⁰ + δ + 1   ⇒   z̈₁⁰ = d⁴z₂⁰/dτ⁴ + z̈₂⁰.

Now substituting these into the first equation, we can get an equation all in terms of z₂⁰, i.e.,

( d⁴z₂⁰/dτ⁴ + z̈₂⁰ ) + ( z̈₂⁰ + z₂⁰ + δ + 1 ) − z₂⁰ − δ + 1 = 0,

and hence

d⁴z₂⁰/dτ⁴ + 2 z̈₂⁰ + 2 = 0.

Although we went from a 2nd order equation to a 4th order equation, have no fear! We can simply integrate the above equation twice, to get

z̈₂⁰ + 2z₂⁰ = −t² + At + B,

where A, B ∈ R. Solving the homogeneous problem, we find that

z₂⁰,H = C₁ cos(√2 t) + C₂ sin(√2 t).

We can solve for the particular solution using undetermined coefficients (or variation of parameters, if you fancy more hand writing), by guessing a particular solution of the form z₂⁰,P = d₁t² + d₂t + d₃. Plugging this into the ODE, we find that d₁ = −1/2, d₂ = A/2, and d₃ = (B + 1)/2.

Hence we get that

z₂⁰ = C₁ cos(√2 t) + C₂ sin(√2 t) − t²/2 + (A/2) t + (B + 1)/2.

Using the relation we found before, i.e., z₁⁰ = z̈₂⁰ + z₂⁰ + δ + 1, we find z₁⁰ to be

z₁⁰ = −C₁ cos(√2 t) − C₂ sin(√2 t) − t²/2 + (A/2) t + (B + 1)/2 + δ.

From the initial conditions stating that the masses start at rest, we can find what C₂ and A are. The other constants can be found by stating where the masses were initially; we will not do this here. Using ż₂⁰(0) = 0, we find

ż₂⁰(0) = √2 C₂ + A/2 = 0.

Now recall our favorite relation, z₁⁰ = z̈₂⁰ + z₂⁰ + δ + 1, and that ż₁⁰(0) = 0; together with ż₂⁰(0) = 0 this implies that

d³z₂⁰/dτ³ (0) = 0.

This gives us that

d³z₂⁰/dτ³ (0) = −2√2 C₂ = 0   ⇒   C₂ = 0   ⇒   A = 0.

Hence we find the O(1) solutions to be

z₂⁰ = C₁ cos(√2 t) − t²/2 + (B + 1)/2                        (55)
z₁⁰ = −C₁ cos(√2 t) − t²/2 + (B + 1)/2 + δ.                  (56)
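As a quick check (a sketch, not part of the original computation), SymPy confirms that (55)-(56) satisfy the O(1) equations above and the rest-at-start conditions at leading order:

import sympy as sp

t = sp.symbols('t')
C1, B, delta = sp.symbols('C1 B delta')

z2 = C1*sp.cos(sp.sqrt(2)*t) - t**2/2 + (B + 1)/2            # eq. (55)
z1 = -C1*sp.cos(sp.sqrt(2)*t) - t**2/2 + (B + 1)/2 + delta   # eq. (56)

# O(1) equations from the perturbation expansion
print(sp.simplify(sp.diff(z1, t, 2) + z1 - z2 - delta + 1))  # 0
print(sp.simplify(sp.diff(z2, t, 2) + z2 - z1 + delta + 1))  # 0

# The masses start at rest at leading order
print(sp.diff(z1, t).subs(t, 0), sp.diff(z2, t).subs(t, 0))  # 0 0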
To find the other unknown constants, B and C1 , we can use the initial position values to
uncover their identity. We can now look at the next higher order terms, O(),
√ √
z̈11 + z11 − z21 = −ż10 = −

√ 2C1 sin(
√ 2t) − t
O() : .
z̈21 + z21 − z11 = −ż20 = 2C1 sin( 2t) − t
We can go about and solve these equations in a similar manner to what we just did to
find the leading order term. We note that the initial values are now different since we are
looking at the first perturbative term, e.g.,
z11 (0) = z21 (0) = ż11 (0) = ż21 (0) = 0.

Can we go ahead and do this? Yes, absolutely. Will we? Nah, not right now. Let’s save
that for a rainy day.

Figure 9: (a) The physical system - two masses (m1 and m2 ) attached massless, inextensible string
that passes over a frictionless pulley of radius r, with moment of inertia, I. (b) The free body diagram
for both masses in the system.

Example 2: The Atwood Machine (...and sneak peak at Double Atwood Machine!)

Consider two point masses, m₁ and m₂ with m₁ < m₂, connected by an ideal massless and inextensible string of length l that passes over a frictionless pulley of radius r with a moment of inertia, I, as shown in Figure 9. We can now proceed to look at the force balance equations.

The forces on mass m1 are solely the gravitational force downwards and the tension of the
string, T1 . Furthermore since we assume m1 < m2 , we assume this mass is accelerating
upwards. This gives us the equation:

m1 a = T1 − m1 g. (57)

Similarly for mass m2 , we get


m2 a = m2 g − T2 , (58)

since m₂ is accelerating downwards. Finally, from the pulley we get an equation for the torque of the system, relating the moment of inertia and angular acceleration to the difference between T₁ and T₂,

Iα = (T₂ − T₁) r,                                            (59)

where α is the angular acceleration of the system. Given that the string is not slipping at all, we have that

α = a/r,

so (59) can be rewritten as (I/r²) a = T₂ − T₁.
Therefore, adding (57), (58), and the rewritten form of (59) together, we get

( m₁ + m₂ + I/r² ) a = (m₂ − m₁) g,

and hence we find the acceleration to be

a = g (m₂ − m₁) / ( m₁ + m₂ + I/r² ).

Note: in order to add (59) to the other two equations, we first substituted α = a/r and divided both sides by r to put it into the proper units. Furthermore, we can also easily compute what the tension in the string is near m₁ and m₂,

T₁ = m₁ g ( 2m₂ + I/r² ) / ( m₁ + m₂ + I/r² )

T₂ = m₂ g ( 2m₁ + I/r² ) / ( m₁ + m₂ + I/r² ).
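If you'd rather let a computer do the algebra, here is a small SymPy sketch that solves the three balance equations simultaneously and reproduces the acceleration and tensions above (I_p denotes the pulley's moment of inertia; the symbol names are illustrative):

import sympy as sp

m1, m2, g, Ip, r, a, T1, T2 = sp.symbols('m1 m2 g I_p r a T1 T2', positive=True)

eqs = [sp.Eq(m1*a, T1 - m1*g),          # Newton's 2nd law for m1 (accelerating upward)
       sp.Eq(m2*a, m2*g - T2),          # Newton's 2nd law for m2 (accelerating downward)
       sp.Eq(Ip*a/r, (T2 - T1)*r)]      # torque balance on the pulley, with alpha = a/r

sol = sp.solve(eqs, [a, T1, T2], dict=True)[0]
print(sp.simplify(sol[a]))    # equals g*(m2 - m1)/(m1 + m2 + I_p/r**2)
print(sp.simplify(sol[T1]))   # equals m1*g*(2*m2 + I_p/r**2)/(m1 + m2 + I_p/r**2)
print(sp.simplify(sol[T2]))   # equals m2*g*(2*m1 + I_p/r**2)/(m1 + m2 + I_p/r**2)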

Figure 10: The physical system - the triple Atwood system composed of 4 masses, three inextensible strings, and three frictionless pulleys.

We note that in order to find the acceleration, a, we had to write down a system of three equations using empirical relations. Although these equations were not necessarily a big chore, there are still easier ways to do this problem! Imagine for a moment if we had a double or triple Atwood machine (see Figure 10)! Finding the acceleration in the system would no longer be such a walk in the park; please be our guest and try it! However, using a different formulation of physics, called Lagrangian Mechanics, the problem becomes tolerable! We will venture down that path now.

7 Asymptotics and Perturbation Methods
This section will have a vastly different flavor in comparison to the previous methods. We are no longer concerned with exact solutions to equations, even in the form of an infinite sum, but will set our sights solely on asymptotic or perturbative solutions. You can think of these as very good approximation methods, if you've never heard the lingo before. We'll define these terms properly when the time comes, but for now be comforted by knowing we will search only for "approximate" solutions to equations.

7.1 Integral Asymptotics and Laplace’s Method


First we consider the seemingly innocent integral, I(λ),
Z b
I(λ) = f (t)e−λg(t) dt. (60)
a

Now what I haven’t revealed was properties about f (t), g(t), or λ. Well, we’ll assume f and g are
friendly functions, where friendly for us just means that they are smooth enough to be replaced by
local Taylor series approximations of appropriately desired degree.
Well, what about λ? This where the term asymptotic comes into play. We are only concerning
ourself here with the behavior of I(λ) as λ → ∞. Hence are only on a mission to look at this integral
in a limiting behavior scenario. This is the general idea with asymptotic methods, that is, methods
that describe limiting behavior of a function, or solution to an equation. We will now briefly give two
definitions to solidify this idea.

Definition 7.2 We say two functions f_n and g_n of a natural number parameter n are asymptotically equivalent, e.g., f_n ∼ g_n (as n → ∞), if and only if lim_{n→∞} f_n/g_n = 1. This equivalence only holds in the sense of the limit, though.

For example, consider the functions g(n) = n² − 2n and h(n) = n² + 3n + 7. It is clear that g(n) ≠ h(n) for all n ∈ N; however, it is also clear that g(n) ∼ h(n) as n → ∞, so it is safe to say that g(n) and h(n) are asymptotically equivalent as n → ∞. It is paramount to note that we had to state that the asymptotic equivalence only holds "as n → ∞". In general, the limiting behavior does not have to be as n → ∞ or even n → 0; it just needs to be at a point of interest for the functions, e.g., where a function may become singular, etc. Using this idea, we will now explore what an asymptotic expansion of a function is.

Definition 7.3 An asymptotic expansion of a function f(x) is an expression of f(x) in terms of a series, such that taking any initial partial sum provides an asymptotic formula for f(x). The successive terms in the series provide an increasingly accurate description of f(x), by giving more detail about the growth of f(x). We note that the partial sums of this expansion do not necessarily converge.

Now let’s get back to the problem at hand, finding the limiting behavior of (60) as λ → ∞. To
achieve this, we will use Laplace’s Method.

7.1.1 Laplace’s Method


Consider (60), such that g(t) assumes a strict minimum in [a, b] at an interior point, call it c. We will
also assume that g 0 (c) = 0 (duh), g 00 (c) > 0 (so the point is a minimum, duh), and f (c) 6= 0. Using
the secret mathematician trick of adding and subtracting zero, we can rewrite (60) as
Z b  
I(λ) = e−λg(c) f (t)e−λ g(t)−g(c) dt.
a

This is the part that is crucial for Laplace’s Method. Since g(c) is a minimum of g(t) on [a, b], for
λ >> 1, the main contribution of I(λ) will come from a small neighborhood of c. This is more obvious
from the definition of I(λ) above, rather than where it was first introduced. Hence for λ >> 1,

Z c+  
I(λ) ≈ e−λg(c) f (t)e−λ g(t)−g(c) dt
c−

Z c+  
≈ e−λg(c) f (c) e−λ g(t)−g(c) dt
c−

Next we approximate g(t) by its Taylor series centered around t = c,

g(t) = g(c) + (t − c) g′(c) + (1/2!) (t − c)² g″(c) + O((t − c)³).

Substituting the above into the integral equation, we find

I(λ) ≈ e^{−λg(c)} f(c) ∫_{c−ε}^{c+ε} e^{−λ[ g(c) + (t−c)g′(c) + (1/2!)(t−c)²g″(c) − g(c) ]} dt

     ≈ e^{−λg(c)} f(c) ∫_{c−ε}^{c+ε} e^{−λ[ g′(c)(t−c) + (1/2!)g″(c)(t−c)² ]} dt

     = e^{−λg(c)} f(c) ∫_{c−ε}^{c+ε} e^{−(λ/2) g″(c)(t−c)²} dt

since g′(c) = 0. Recall we said the main contribution of I(λ) comes from a neighborhood of c. Moreover, on the flip side, we assume the integral's contribution outside that neighborhood is negligible, and therefore we can expand that neighborhood to all of R, i.e., we now have

I(λ) ≈ e^{−λg(c)} f(c) ∫_{−∞}^{∞} e^{−(λ/2) g″(c)(t−c)²} dt.

Next, to integrate the Gaussian-type integral above, we first do the swanky transformation of letting s = t − c, getting

I(λ) ≈ e^{−λg(c)} f(c) ∫_{−∞}^{∞} e^{−(λ/2) g″(c)s²} ds

     = e^{−λg(c)} f(c) √( 2π / (λ g″(c)) ).

Thus to leading order we have that

I(λ) ∼ e^{−λg(c)} f(c) √( 2π / (λ g″(c)) )   as λ → ∞.        (61)
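To see (61) in action, here is a small SciPy sketch; the test functions f(t) = cos(t) and g(t) = cosh(t) on [−1, 1] (interior minimum at c = 0, with g(0) = 1, g′(0) = 0, g″(0) = 1, f(0) = 1) are assumed purely for illustration, and the common factor e^{−λg(0)} is pulled out of both sides to avoid numerical underflow at large λ.

import numpy as np
from scipy.integrate import quad

f = np.cos
g = np.cosh

for lam in [10.0, 100.0, 1000.0]:
    # scaled integral: exp(lam*g(0)) * I(lam) = int f(t) exp(-lam*(g(t) - g(0))) dt
    scaled, _ = quad(lambda t: f(t)*np.exp(-lam*(g(t) - g(0))), -1, 1)
    approx = f(0)*np.sqrt(2*np.pi/lam)    # leading-order Laplace estimate from (61)
    print(lam, scaled/approx)             # ratio -> 1 as lambda -> infinity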

This above is the standard Laplace Method’s result when the minimum is within the interval (a, b).
There are three basic ideas behind this methodology. They are as follows:

1. Since we are concerned with λ → ∞, the main contribution of I(λ) comes from a small neigh-
borhood of the point in which g(t) is at its local minimum, say t = c, and can replace the
integration bounds from [a, b] → [c − , c + ].
2. Within the neighborhood of the minimizer, we can approximate f (t) and g(t) with their Taylor
Series until desired order.

3. Since the main contribution of I(λ) is coming from the interval [c − , c + ], we can extend this
interval to [−∞, ∞], as the regions outside [c − , c + ] only contribute higher order terms to
I(λ) as λ → ∞.
We will now go through a very similar calculation to illustrate what happens when the minimum
of g(t) in on an end point of the integration bounds, but first we will show the trick for integrating
Gaussian-type integrals.

7.1.2 Gaussian Integral


We will show how to explicitly integrate integrals of the form,
G = ∫_{−∞}^{∞} e^{−x²} dx.        (62)

To do this, we will transform this integral to polar coordinates, but first, we will do the magical trick. This is it. Just square (62), i.e.,

G² = ∫_{−∞}^{∞} e^{−x²} dx ∫_{−∞}^{∞} e^{−y²} dy = ∫_{−∞}^{∞} ∫_{−∞}^{∞} e^{−(x²+y²)} dx dy.

Now since G² is in 2D, we can transform it to polar coordinates, by letting r² = x² + y² and r dr dθ = dx dy. We have

G² = ∫₀^{2π} ∫₀^{∞} e^{−r²} r dr dθ.

Therefore on the above we can simply do the substitution u = r², so that du = 2r dr. We now have

G² = ∫₀^{2π} ∫₀^{∞} (1/2) e^{−u} du dθ,

and hence

G² = −π e^{−u} |₀^{∞} = π.

Now solving for G, we get that

G = √π.
We used this trick above to find the leading order behavior of the Laplace Integral, and will use it
again in a minute for a similar computation!

7.1.3 Laplace’s Method Take Two (minimizer on an integration bound)
Consider (60), such that g(t) assumes a strict minimum in [a, b] at an endpoint of the interval. Without
loss of generality, let's just say the minimum is found at t = a. We will also assume that g'(a) = 0
(duh), g''(a) > 0 (so the point is a minimum, duh), and f(a) ≠ 0. Using the same math trick of
adding and subtracting zero, we can rewrite (60) as

I(λ) = e^{-λg(a)} ∫_a^b f(t) e^{-λ(g(t) - g(a))} dt.
We will make the same assumptions as we did above, e.g., that the main contribution to I(λ)
comes from a neighborhood of t = a, that f(t) and g(t) are smooth enough so that we can replace them by
their Taylor series centered at t = a, and that, since the main contribution comes from a neighborhood
of t = a, we will be able to expand this neighborhood to all of [a, ∞) with only higher order asymptotic
terms being affected. The reason for only blowing this interval up to [a, ∞), rather than all of R, is that we do not
consider points t < a. Upon doing these things, we get

I(λ) ≈ e^{-λg(a)} ∫_a^{a+ε} f(t) e^{-λ(g(t) - g(a))} dt

     ≈ e^{-λg(a)} f(a) ∫_a^{a+ε} e^{-λ(g(t) - g(a))} dt

     ≈ e^{-λg(a)} f(a) ∫_a^{a+ε} e^{-λ[(g(a) + (t-a)g'(a) + (1/2!)(t-a)² g''(a)) - g(a)]} dt

     ≈ e^{-λg(a)} f(a) ∫_a^{a+ε} e^{-λ[g'(a)(t-a) + (1/2!) g''(a)(t-a)²]} dt

     = e^{-λg(a)} f(a) ∫_a^{a+ε} e^{-(λ/2) g''(a)(t-a)²} dt

     ≈ e^{-λg(a)} f(a) ∫_a^{∞} e^{-(λ/2) g''(a)(t-a)²} dt.

Now doing the same substitution as before, letting s = t - a, we are left with the following
Gaussian-type integral to integrate,

I(λ) ≈ e^{-λg(a)} f(a) ∫_0^{∞} e^{-(λ/2) g''(a) s²} ds

     = e^{-λg(a)} f(a) √(π / (2λ g''(a))).

Thus when the minimizer is at an endpoint, we have to leading order that

I(λ) ∼ e^{-λg(a)} f(a) √(π / (2λ g''(a)))   as λ → ∞.     (63)
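The endpoint formula (63) can be checked the same way. A minimal sketch (again with a test problem of my own choosing, not from the notes): f(t) = cos t and g(t) = t² on [0, 2], so the minimum sits at the endpoint a = 0 with g''(0) = 2 and f(0) = 1; note the result is exactly half of what (61) would give for an interior minimum.

import numpy as np
from scipy.integrate import quad

f = lambda t: np.cos(t)
g = lambda t: t**2              # minimum at the endpoint a = 0, with g''(0) = 2

def I_quad(lam):
    val, _ = quad(lambda t: f(t) * np.exp(-lam * g(t)), 0.0, 2.0)
    return val

def I_endpoint(lam):
    # Leading-order endpoint formula (63), with g(a) = 0 and f(a) = 1.
    return np.sqrt(np.pi / (2.0 * lam * 2.0))

for lam in (10.0, 100.0, 1000.0):
    print(lam, I_quad(lam), I_endpoint(lam))
# The ratio tends to 1; compare with the interior-minimum check above.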

7.1.4 The Gamma Function and Higher Order Asymptotics
Usually in the asymptotic business, the next thing one is introduced to is the Gamma function. The
reason for this is two-fold: first, it is a generalization of the factorial function, and second, it is used to
derive Stirling's approximation, which describes the asymptotic behavior of the factorial function
itself. We will do all these things, but before we concern ourselves with the asymptotic behavior of
the factorial, let's use the Gamma function to look at higher order asymptotics associated with the
Laplace integral.
The Gamma function is defined as

Γ(x) = ∫_0^{∞} t^{x-1} e^{-t} dt.     (64)

To see that (64) is a continuous generalization of the factorial function, we simply have to
integrate by parts, e.g.,

Γ(x + 1) = ∫_0^{∞} t^x e^{-t} dt

         = [−t^x e^{-t}]_0^{∞} + x ∫_0^{∞} t^{x-1} e^{-t} dt

         = x ∫_0^{∞} t^{x-1} e^{-t} dt

         = x Γ(x).

Thus if x = n ∈ Z+ , we have
Γ(n + 1) = n!.
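As a quick numerical check of this identity (a minimal sketch in Python using only the standard library, not part of the notes):

import math

# Gamma(n + 1) should reproduce n! for positive integers n.
for n in range(1, 8):
    print(n, math.gamma(n + 1), math.factorial(n))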
Now we can use this to find higher-order asymptotic terms in the vein of the Laplace integral in (60).
We will again assume that g(t) assumes a strict minimum over [a, b] at a point c ∈ (a, b), as well as that
g'(c) = 0, g''(c) > 0, and f''(c) ≠ 0. Where this calculation differs from those previously done
is that we assume f(c) = 0. This subtle change will lead us to employ the Gamma function to find the
leading behavior in our asymptotic expansion. We will again make the same assumptions as before,
e.g., the main contribution comes from the neighborhood of t = c, and all that jazz. Now for λ >> 1,

I(λ) = ∫_a^b f(t) e^{-λg(t)} dt

     ≈ e^{-λg(c)} ∫_{c-ε}^{c+ε} f(t) e^{-λ(g(t) - g(c))} dt

     = e^{-λg(c)} ∫_{c-ε}^{c+ε} [f(c) + f'(c)(t-c) + (1/2!) f''(c)(t-c)² + O((t-c)³)] e^{-λ[g'(c)(t-c) + (1/2!) g''(c)(t-c)² + O((t-c)³)]} dt

     ≈ e^{-λg(c)} ∫_{c-ε}^{c+ε} [f'(c)(t-c) + (1/2!) f''(c)(t-c)²] e^{-(λ/2) g''(c)(t-c)²} dt

     = (f''(c)/2) e^{-λg(c)} ∫_{c-ε}^{c+ε} (t-c)² e^{-(λ/2) g''(c)(t-c)²} dt,

where the f(c) term vanishes by assumption and the odd term f'(c)(t-c) integrates to zero over the symmetric interval about t = c.

We do the same transformation s = t - c, and then expanding the integration region to all of R,
we obtain

I(λ) ≈ (f''(c)/2) e^{-λg(c)} ∫_{-∞}^{∞} s² e^{-(λ/2) g''(c) s²} ds

     = f''(c) e^{-λg(c)} ∫_0^{∞} s² e^{-(λ/2) g''(c) s²} ds.

To simplify the above integral, we perform one more transformation. Letting u = (λ/2) g''(c) s² and
hence du = λ g''(c) s ds, we find that

I(λ) ≈ f''(c) e^{-λg(c)} ∫_0^{∞} (2u / (λ g''(c))) e^{-u} (du / √(2λ g''(c) u))

     = f''(c) e^{-λg(c)} (√2 / (λ g''(c))^{3/2}) ∫_0^{∞} u^{1/2} e^{-u} du

     = (√2 f''(c) / (λ g''(c))^{3/2}) e^{-λg(c)} Γ(3/2)

     = (√2 f''(c) / (λ g''(c))^{3/2}) e^{-λg(c)} (1/2) Γ(1/2).

Now we find that Γ(1/2) = √π, i.e.,

Γ(1/2) = ∫_0^{∞} t^{-1/2} e^{-t} dt,   let u = t^{1/2},

       = 2 ∫_0^{∞} e^{-u²} du

       = √π.

Hence we find that

I(λ) ≈ f''(c) e^{-λg(c)} √(π/2) / (λ g''(c))^{3/2}.

Therefore we have to leading order,

I(λ) ∼ f''(c) e^{-λg(c)} √(π / (2 (λ g''(c))³)),   as λ → ∞.
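Here too a quick numerical sanity check is easy in Python. A minimal sketch with an assumed test case (not from the notes): f(t) = g(t) = (t − 1)² on [0, 3], so c = 1, f(c) = 0, f''(c) = 2, g(c) = 0, and g''(c) = 2.

import numpy as np
from scipy.integrate import quad

f = lambda t: (t - 1.0)**2      # f(c) = 0, f''(c) = 2 at c = 1
g = lambda t: (t - 1.0)**2      # g(c) = 0, g''(c) = 2 at c = 1

for lam in (10.0, 100.0, 1000.0):
    exact, _ = quad(lambda t: f(t) * np.exp(-lam * g(t)), 0.0, 3.0)
    # f''(c) * exp(-lam*g(c)) * sqrt(pi / (2 * (lam*g''(c))**3))
    approx = 2.0 * np.sqrt(np.pi / (2.0 * (2.0 * lam)**3))
    print(lam, exact, approx)
# The two columns should agree to leading order as lambda grows.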

7.1.5 Stirling's Approximation

We now set our sights on deciphering the leading order asymptotics of n! as n → ∞. To do this we
will rightfully use Laplace's method, but first we need to massage the Gamma function into a
more welcoming form for Laplace's method. Consider

Γ(x + 1) = ∫_0^{∞} t^x e^{-t} dt.
I hope you remember your exponential and logarithmic identities. We are about to go for a walk
down memory lane. First let's warm up and use the property that t^x = e^{x ln(t)}, so

Γ(x + 1) = ∫_0^{∞} e^{x ln(t)} e^{-t} dt = ∫_0^{∞} e^{-x(t/x - ln(t))} dt.
Next we perform the substitution z = t/x (and dz = dt/x), and get

Γ(x + 1) = x ∫_0^{∞} e^{-x(z - ln(xz))} dz = x ∫_0^{∞} e^{-x(z - ln(x) - ln(z))} dz = x e^{x ln(x)} ∫_0^{∞} e^{-x(z - ln(z))} dz.

We now have exponentially and logarithmically massaged the Gamma integral into the following
form,
Γ(x + 1) = x^{x+1} ∫_0^{∞} e^{-x(z - ln z)} dz.     (65)

We see that (65) is in the form of a Laplace-type integral, with f(z) = 1 and g(z) = z − ln z. With
a little further analysis, it is clear that g(z) has a strict minimum on (0, ∞) at z = 1. We see that
g(1) = 1, g'(1) = 0, and g''(1) = 1. Substituting all of this information into the leading order
term of the Laplace integral, we find that

∫_0^{∞} e^{-x(z - ln z)} dz ∼ e^{-x g(1)} f(1) √(2π / (x g''(1))) = √(2π/x) e^{-x},   as x → ∞.
Hence we find that the Gamma function has leading order behavior,

Γ(x + 1) ∼ x^{x+1} √(2π/x) e^{-x} = √(2π) x^{x + 1/2} e^{-x},   as x → ∞.
To make this a little closer to home, consider x = n ∈ Z+ as n → ∞, that is, let n be a very large
integer. We are now ready to see what the factorial function looks like as n gets very, very large:

n! ∼ √(2π) n^{n + 1/2} e^{-n},   as n → ∞.
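A minimal numerical sketch (not part of the notes) comparing n! with Stirling's formula:

import math

# Stirling: n! ~ sqrt(2*pi) * n**(n + 1/2) * exp(-n)
for n in (5, 10, 20, 50):
    exact = math.factorial(n)
    stirling = math.sqrt(2.0 * math.pi) * n**(n + 0.5) * math.exp(-n)
    print(n, exact, stirling, stirling / exact)
# The ratio approaches 1 from below; the relative error shrinks like 1/(12 n).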

7.1.6 Elementary, the dear Watson’s Lemma


We will introduce a close cousin to the Laplace Method, Watson's Lemma. This is also a method
for finding the leading order behavior of an asymptotic integral; however, it takes a slightly less general
form than Laplace-type integrals. Consider an integral of the form

I(λ) = ∫_A^B f(s) e^{-λ g(s)} ds   as λ → ∞.     (66)
Similarly to how we proceeded for Laplace's Method, we can use a mathematical sleight of hand
to add and subtract 0,

I(λ) = ∫_A^B f(s) e^{-λ g(s)} ds = e^{-λ g(A)} ∫_A^B f(s) e^{-λ(g(s) - g(A))} ds.

Next we look to perform the substitution t = g(s) − g(A); however, this transformation will only
work if we can invert it to get s = s(t). If this is possible, then we have

ds = dt / g'(s(t)).
We have then massaged (66) to

I(λ) = e^{-λ g(A)} ∫_0^{g(B) - g(A)} (f(s(t)) / g'(s(t))) e^{-λ t} dt.
The main contribution to this integral will come from a neighborhood around the minimum
of the exponential term as λ → ∞. Moreover, we are also assuming that |f(s(t))| < K e^{bt} for t > 0,
where b, K ∈ R are independent of t. Hence, because the largest contribution will occur in
a one-sided neighborhood of 0, we only need to consider the bounds [0, ε], where ε > 0, e.g.,

I(λ) ≈ e^{-λ g(A)} ∫_0^{ε} (f(s(t)) / g'(s(t))) e^{-λ t} dt   for λ >> 1.

Furthermore, the fractional term in the integrand can usually be written in the following form,

f(s(t)) / g'(s(t)) = t^α h(t),

where α > −1. We are also going to assume that h(t) has a convergent Taylor series about t = 0.
Upon doing so, for λ → ∞, we have



I(λ) ≈ e^{-λ g(A)} ∫_0^{ε} (f(s(t)) / g'(s(t))) e^{-λ t} dt

     = e^{-λ g(A)} ∫_0^{ε} t^α h(t) e^{-λ t} dt

     = e^{-λ g(A)} Σ_{k=0}^{∞} (h^{(k)}(0) / k!) ∫_0^{ε} t^{k+α} e^{-λ t} dt

     ≈ e^{-λ g(A)} Σ_{k=0}^{∞} (h^{(k)}(0) / k!) ∫_0^{∞} t^{k+α} e^{-λ t} dt   by the same assumptions as in Laplace's Method.

Now substituting w = λt and dw = λ dt, we obtain

I(λ) ≈ e^{-λ g(A)} Σ_{k=0}^{∞} (h^{(k)}(0) / k!) ∫_0^{∞} t^{k+α} e^{-λ t} dt = e^{-λ g(A)} Σ_{k=0}^{∞} (h^{(k)}(0) / (k! λ^{k+1+α})) ∫_0^{∞} w^{k+α} e^{-w} dw.

Recalling the definition of the Gamma function, we then have the asymptotic expansion for I(λ)
as λ → ∞,

I(λ) ∼ e^{-λ g(A)} Σ_{k=0}^{∞} h^{(k)}(0) Γ(k + 1 + α) / (k! λ^{k+1+α}).     (67)

Example: Consider the following integral,

I(λ) = ∫_0^{π/2} e^{-λ tan²θ} dθ   as λ → ∞.

We can cleverly use the substitution t = tan²θ, with dt = 2 tan θ sec²θ dθ, and through some
simplification arrive at

I(λ) = ∫_0^{∞} (1 / (2√t)) (1 / (1 + t)) e^{-λ t} dt.

Now we have α = −1/2 (since 1/(2√t) = (1/2) t^{-1/2}) and can expand (1 + t)^{-1} in its Taylor series about t = 0 for |t| < 1,

h(t) = (1/2)(1 / (1 + t)) = (1/2) Σ_{k=0}^{∞} (−1)^k t^k.

Hence from the Taylor series, we see that

h^{(k)}(0) / k! = (−1)^k / 2.

Plugging all this into Watson's Lemma, (67) (here g(A) = tan²(0) = 0, so the exponential prefactor is 1), we find the asymptotic series as λ → ∞ to be

I(λ) ∼ (1/2) Σ_{k=0}^{∞} (−1)^k Γ(k + 1/2) / λ^{k+1/2}   as λ → ∞.
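A minimal numerical sketch (my own check, not from the notes) comparing the integral with the first couple of terms of this series:

import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

def I_quad(lam):
    val, _ = quad(lambda th: np.exp(-lam * np.tan(th)**2), 0.0, np.pi / 2.0)
    return val

def I_watson(lam, nterms=2):
    # Partial sum of I(lam) ~ (1/2) sum_k (-1)^k Gamma(k + 1/2) / lam^(k + 1/2)
    return 0.5 * sum((-1)**k * gamma(k + 0.5) / lam**(k + 0.5) for k in range(nterms))

for lam in (5.0, 50.0, 500.0):
    print(lam, I_quad(lam), I_watson(lam))
# Agreement improves as lambda grows; the full series is asymptotic, not convergent.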

7.1.7 Where does the Steepest Descent take us?
You may have been lucky enough to have heard of the Method of Steepest Descent for asymptotic
approximations, in connection with integrals of a very similar flavor to the integral form of (60).
Although Steepest Descent is thought of as more of a generalization of Laplace's Method, it can also
be viewed as a methodology to massage your integral of interest into a form that can then call upon
Laplace's Method to find the asymptotics. Let's stop being cryptic, and observe the problem at hand.

Consider the following complex integral,

I(x) = ∫_C h(t) e^{-x ρ(t)} dt,     (68)

where t ∈ C and ρ(t) = φ(t) + iψ(t). One of the main issues we face with this integral is that, well,
first the integration path is not clearly defined for us. . . spoiler, that will be our job, but also that we
are now officially diving into the world of complex integration.

We are implicitly assuming that ρ(t) has non-zero imaginary part. Our worry here is that our
path of integration may introduce regions of high frequency oscillations into the exponential. As we
proceed to pick an integration path, or contour, we will set out to choose one such that Im(ρ(t)) is
constant.

We will proceed to give a glossing overview of the method to build some intuition behind the
madness, as well as draw similarities between this methodology and that of the Laplace integral we
just studied.

• As we just said, we want to deform the integration contour, C, such that Im(ρ(t)) is constant. That is, we will not have to worry about high frequency oscillations obstructing us from finding appropriate asymptotic behavior.

• With that in mind, we go ahead and find the saddle-points of ρ(t). This is similar to what we did in the Laplace Method, i.e., there we wished to find a minimizer; here we set out to find the best we can do in the complex plane, which is a saddle-point, i.e., ρ'(t_C) = 0 and ρ''(t_C) ≠ 0, where t_C ∈ C is the saddle-point.

• After finding the saddle-point, we look for the direction of steepest descent through the saddle-point. There are two routes we can now take.

  1. We can look for lines of constant phase that go through t_C, e.g., ψ(t) = ψ(t_C), where t = u + iv, so ψ(u, v) = ψ(u_C, v_C). (This same idea of constant phase reappears in the method of Stationary Phase. . . we will get there. . .)

  2. Or, we write the second derivative of the exponent at the saddle-point in polar form, r e^{iθ}, and the two candidate directions (one of steepest descent, one of steepest ascent) through the saddle-point are α_1 = (π − θ)/2 and α_2 = (2π − θ)/2.

• Now that we have an idea of the paths of constant phase, or candidate steepest paths, we have to check which lines are steepest ascent and which are steepest descent lines, since, after all, we are performing this analysis at a saddle-point. To do this we simply look at the real part of the exponent along each line:

  – If the real part decreases away from the saddle-point ← descent :)
  – If the real part increases away from the saddle-point ← ascent :(

• We then deform the contour appropriately in the (u, v)-plane.

• Finally our integral should be massaged into a form where we can call upon Laplace’s Method
to perform the actual asymptotics to find the leading behavior of (68).

We will now go ahead and illustrate this method, via an example!

Example: Consider the following integral,

I(x) = ∫_0^1 e^{ixt²} dt   as x → ∞,     (69)

so the exponent is x f(t) with f(t) = it² (that is, ρ(t) = −it² in the notation of (68)); we will work
directly with f below. First things first, let's look for the saddle-points. Letting
t = u + iv, we obtain,

f'(t) = 2it = 2i(u + iv),

and setting f'(t) = 0 we find that u = v = 0. Therefore we have t_C = 0. Now looking at f''(t_C),
we find that

f''(t_C) = 2i = 2 e^{iπ/2}.
Now we can look for the candidate directions through the saddle-point, with θ = π/2:

α_1 = (π − θ)/2 = π/4,
α_2 = (2π − θ)/2 = 3π/4,

 
or similarly, we can set Im f(t) = Im f(t_C), i.e.,

u² − v² = 0  ⇒  u = ±v,

which are the lines of constant phase through the saddle-point. Furthermore, it is clear that these lines in the (u, v)-plane
match the candidate directions found above. However, we need to find which is a line of steepest ascent
and which is the line of steepest descent. To figure this out,
we look at the real part of f(t),

Re f(t) = −2uv.

Therefore we have two cases:

1. On the line u = −v, we have Re f(t) = 2u² > 0 for all u ≠ 0. Therefore this is the ascent line.

2. On the line u = v, we have Re f(t) = −2u² < 0 for all u ≠ 0. Hence this is the descent line.

Hence we know the direction through the saddle-point in which to construct our integration
path. However, we still need to figure out the other pieces of the contour we are choosing, e.g.,
we know the original integral is over t ∈ [0, 1]; however, we are building this into a closed-contour
integral in C. Furthermore, we still need to choose the actual direction in which the contour is
traversed. The pieces we currently have are depicted in Figure 11. The blue lines depict the
lines of steepest descent, while red indicates the lines of steepest ascent. The blue dashed line
illustrates the part of the descent line we will not choose for our contour, since to construct our
closed contour from it we would be forced to cross a line of steepest ascent in quadrant IV.

To close this contour we need to choose one final path such that it includes the point t = 1 and
connects to the line of steepest descent. Now at t = 1 (u = 1, v = 0), we have f(1) = i and hence
Re f(1) = 0; similarly we also have that f(0) = 0 and Re f(0) = 0. We look for paths such
that the imaginary part of f(t) is constant, e.g., we set Im f(t) = Im f(1). Upon doing
so, we find

u² − v² = 1  ⇒  v = ±√(u² − 1).

We then choose the + branch, so that our contour will meet the steepest descent line,
u = v, at ∞. This then constructs our closed contour. However, we still need to choose a
direction around the contour, and we will choose it to be counter-clockwise. This is shown in
Figure 12. Hence our contour is C = C1 + C2 + C3, where C1 is the original path of the
integral whose leading order asymptotic behavior we set out to find.
Hence we now have the following decomposition of the closed-contour integral,

∫_C (·) = ∫_{C1} (·) + ∫_{C2} (·) + ∫_{C3} (·).

The reason for this is that we can simply solve for the integral ∫_{C1} (·) and perform asymptotic
analysis on the other integrals using Laplace's Method. However, before we do that, we can pull
into play one important theorem from Complex Analysis, that is,

Theorem 7.10 (Cauchy's Integral Theorem) Let U be an open subset of C which is simply-
connected, let f : U → C be a holomorphic function, and let γ be a rectifiable path in U
whose start point is equal to its end point, i.e., γ is a closed path. Then

∮_γ f(z) dz = 0.

Using this theorem and solving for ∫_{C1} (·), we are left with

∫_{C1} (·) = −∫_{C2} (·) − ∫_{C3} (·).     (70)

Now we will consider the leading order asymptotic behavior of each of these integrals separately.
Spoiler alert, we will call upon our friend, Watson’s Lemma, to actually perform the asymptotic
analysis.
Path C3: We begin with ∫_{C3} e^{ixt²} dt, and parameterize the path along u = v by t = ρ e^{iπ/4} with
dt = e^{iπ/4} dρ, where ρ runs from ∞ down to 0 along C3. This gives us

∫_{C3} e^{ixt²} dt = ∫_∞^0 e^{ixρ² e^{iπ/2}} e^{iπ/4} dρ = −e^{iπ/4} ∫_0^∞ e^{-xρ²} dρ.

Now we perform the substitution z = ρ², so dρ = dz/(2√z), to get

∫_{C3} e^{ixt²} dt = −e^{iπ/4} ∫_0^∞ (e^{-xz} / (2√z)) dz = −(e^{iπ/4}/2) ∫_0^∞ (e^{-xz} / √z) dz.

Lastly, we can call upon Watson's Lemma, with α = −1/2, to give us the following leading
order asymptotic behavior,

∫_{C3} e^{ixt²} dt ∼ −(e^{iπ/4}/2) Γ(1/2) / x^{1/2}.     (71)
Path C2: Now we consider the integral ∫_{C2} e^{ixt²} dt. The most difficult part about this integral is
choosing an appropriate parameterization. We let it² = i − s, so that t = √(1 + is) and
dt = (i / (2√(1 + is))) ds. The motivation behind this choice is that at t = 1 we have it² = i,
and hence s = 0, while s → ∞ as t runs out along the hyperbola. We then obtain

∫_{C2} e^{ixt²} dt = (i/2) ∫_0^∞ (e^{ix − xs} / √(1 + is)) ds = (i e^{ix}/2) ∫_0^∞ (e^{-xs} / √(1 + is)) ds.

Now near s = 0, we can use a Taylor series to write

1/√(1 + is) ≈ 1 − is/2 + . . . ,

giving us

∫_{C2} e^{ixt²} dt = (i e^{ix}/2) ∫_0^∞ (1 − is/2 + . . .) e^{-xs} ds.

Now using Watson's Lemma on the largest term near s = 0, with α = 0, we get

∫_{C2} e^{ixt²} dt ∼ i e^{ix} Γ(1) / (2x).     (72)

Therefore, using (70), (71), and (72), we find the leading order asymptotic behavior of (69) to
be

I(x) = ∫_0^1 e^{ixt²} dt ∼ (e^{iπ/4}/2) √(π/x) − i e^{ix} / (2x).     (73)
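Since the signs and phases here are easy to get wrong, a quick numerical check is worthwhile. A minimal sketch (my own check, not from the notes) comparing a direct quadrature of (69) with the two-term result (73):

import numpy as np
from scipy.integrate import quad

def I_quad(x):
    re, _ = quad(lambda t: np.cos(x * t**2), 0.0, 1.0, limit=500)
    im, _ = quad(lambda t: np.sin(x * t**2), 0.0, 1.0, limit=500)
    return re + 1j * im

def I_asym(x):
    # Two-term steepest-descent result, eq. (73).
    return (np.exp(1j * np.pi / 4) / 2.0) * np.sqrt(np.pi / x) \
           - 1j * np.exp(1j * x) / (2.0 * x)

for x in (50.0, 200.0, 1000.0):
    print(x, I_quad(x), I_asym(x))
# The two columns agree to several digits, improving as x grows.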

This by no means has been a rigorous derivation or explanation of the Method of Steepest Descent,
but hopefully it gives you a good idea of the main ideas and concepts that motivate the method.

7.1.8 Well, what about Stationary Phase?

In this section we will use the method of Stationary Phase to find the leading order asymptotic behavior
of integrals that look very similar to those of the previous sections, e.g.,

I(x) = ∫_A^B g(t) e^{ix ψ(t)} dt   as x → ∞,     (74)

where A, B ∈ R and g(t), ψ(t) are real-valued. To find the leading order asymptotics, we will consider two cases.

Case 1: If ψ'(t) ≠ 0 for all t ∈ [A, B], we can simply use integration by parts on (74), e.g.,

I(x) = ∫_A^B g(t) e^{ixψ(t)} dt = [ g(t) e^{ixψ(t)} / (ix ψ'(t)) ]_A^B − (1/(ix)) ∫_A^B (g(t)/ψ'(t))' e^{ixψ(t)} dt.

One can see that

[ g(t) e^{ixψ(t)} / (ix ψ'(t)) ]_A^B ∼ O(1/x),

(1/(ix)) ∫_A^B (g(t)/ψ'(t))' e^{ixψ(t)} dt ∼ O(1/x²),

as x → ∞ (the second estimate follows from integrating by parts once more). Therefore the leading order asymptotic behavior of (74) is

I(x) ∼ [ g(t) e^{ixψ(t)} / (ix ψ'(t)) ]_A^B   as x → ∞.

Case 2: If ψ'(t) = 0, ψ''(t) ≠ 0 for some point t ∈ [A, B]. Without loss of generality, say this is at a
point t = c. We first find the Taylor series of ψ(t) centered about c,

ψ(t) = ψ(c) + (t − c)ψ'(c) + (1/2!)(t − c)² ψ''(c) + . . . = ψ(c) + (1/2!)(t − c)² ψ''(c) + (1/3!)(t − c)³ ψ'''(c) + . . . ,

since ψ'(c) = 0. Substituting this into the integral in (74), we get

I(x) = e^{ixψ(c)} ∫_A^B g(t) e^{i(x/2) ψ''(c)(t − c)² + . . .} dt.

You probably guessed it, we will only use the first term in the above exponential, as well as look
at the Taylor series for g(t) about t = c and pick off a few terms, e.g.,

I(x) ≈ e^{ixψ(c)} ∫_A^B [g(c) + (t − c)g'(c) + . . .] e^{i(x/2) ψ''(c)(t − c)²} dt

     ≈ e^{ixψ(c)} g(c) ∫_A^B e^{i(x/2) ψ''(c)(t − c)²} dt.

Now since the main contribution to I(x) as x → ∞ comes from a neighborhood of the stationary
point t = c, we only need to consider a neighborhood about c. Then, like we have done a
few times before, we can extend the integration bounds all the way to (−∞, ∞), i.e.,

I(x) ≈ e^{ixψ(c)} g(c) ∫_A^B e^{i(x/2) ψ''(c)(t − c)²} dt

     ≈ e^{ixψ(c)} g(c) ∫_{c−ε}^{c+ε} e^{i(x/2) ψ''(c)(t − c)²} dt

     ≈ e^{ixψ(c)} g(c) ∫_{−∞}^{∞} e^{i(x/2) ψ''(c)(t − c)²} dt.

Next, to massage this last integral into a form we can integrate, we use the transformation
u = e^{∓iπ/4} √((x/2)|ψ''(c)|) (t − c), and hence du = e^{∓iπ/4} √((x/2)|ψ''(c)|) dt, and obtain

I(x) ∼ ( g(c) e^{ixψ(c) ± iπ/4} / √((x/2)|ψ''(c)|) ) ∫_{−∞}^{∞} e^{−u²} du,

and hence get the leading order asymptotic behavior to be

I(x) ∼ g(c) e^{ixψ(c) ± iπ/4} √( 2π / (x |ψ''(c)|) ).

Note that in the transformation defining u we take the upper signs if ψ''(c) > 0 and the lower signs
if ψ''(c) < 0. Furthermore, note that if ψ'(c) = ψ''(c) = · · · = ψ^{(p−1)}(c) = 0 for some point c ∈ [A, B],
the above process works analogously - just use the lowest-order non-vanishing Taylor term!
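A minimal numerical sketch (my own test case, not from the notes): take g(t) = 1 and ψ(t) = t² on [−1, 1], so c = 0 and ψ''(0) = 2 > 0, and the formula above predicts I(x) ∼ e^{iπ/4} √(π/x).

import numpy as np
from scipy.integrate import quad

def I_quad(x):
    # I(x) = integral of exp(i*x*t^2) over [-1, 1]
    re, _ = quad(lambda t: np.cos(x * t**2), -1.0, 1.0, limit=500)
    im, _ = quad(lambda t: np.sin(x * t**2), -1.0, 1.0, limit=500)
    return re + 1j * im

def I_stationary(x):
    # g(c) = 1, psi(c) = 0, psi''(c) = 2 > 0, so the phase is +pi/4.
    return np.exp(1j * np.pi / 4) * np.sqrt(2.0 * np.pi / (2.0 * x))

for x in (50.0, 200.0, 1000.0):
    print(x, I_quad(x), I_stationary(x))
# The endpoint contributions are O(1/x), so agreement improves as x grows.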

7.2 The Method of Dominant Balance!


In the previous section we discussed how to determine the asymptotic behavior of a real-valued inte-
gral in the limit of some parameter. Here we will discuss the asymptotic behavior of solutions to an
ODE without having to fully solve it. This method is called the Method of Dominant Balance. As
the name suggests, we will make some guesses about which terms of the ODE may be negligible in a
limit of interest and balance the remaining terms that we have deemed to be the dominant ones.

More specifically, this asymptotic method will work on linear ODEs about their irregular singular
points. We will look for local solutions about the irregular singular point, x = x_0, of the form

y ∼ e^{S(x)}   as x → x_0.     (75)

We assume the solution can be approximated by some type of exponential in a neighborhood of the
irregular singular point, which gives us the motivation for (75). Furthermore, taking derivatives of
(75), we have the following differential relations,

y'(x) ∼ S'(x) e^{S(x)},     (76)

y''(x) ∼ (S'(x)² + S''(x)) e^{S(x)}.     (77)

To actually find the approximate behavior of the solution near an irregular singular point, we are
lucky in that there are distinct steps one can follow when performing this method. Before diving into
an example, we will outline them now, so they don't remain a mystery.

1. Assume the asymptotic behavior has the form y(x) ∼ e^{S(x)}.

2. Make educated guesses about which terms in the ODE may be negligible near our irregular singular point, i.e., near x = x_0.

3. Drop these terms and then arrive at a simplified ODE. Solve it! (To do this, replace the asymptotic relation, ∼, with an equals sign, and solve exactly. The solution will most certainly satisfy the asymptotic relation, although it will not be the only function to do so, i.e., it is not unique.)

4. Check that the solution we now get is consistent with all the balancing we did in Step 2. If all the terms are consistent, then this is the dominating factor of the asymptotic behavior; however, if not, revisit the way you decided to drop terms in Step 2, and try balancing other terms instead.

5. Repeat this process to capture the higher order asymptotics, using the previous asymptotic results you obtained!

Let’s put this machinery to work! Consider the following example,

xy 00 + 4y 0 = 2y, (78)

which we wish to find the solution’s behavior near its irregular singular point at x → ∞. We perform
our substitutions from (75), (76), and (77) to get

xS 00 + xS 02 + 4S 0 = 2. (79)
We then make the assumption that, as x → ∞,

|x S''| << x S'².

The intuition behind this choice comes from the fact that, near irregular singular points, the second
derivative S'' tends to be subdominant to the square of the first derivative, (S')². It will be a good
idea to check this assumption at the end, either way. That gives us

x S'² + 4S' − 2 = 0.

As the above form suggests, we call upon our high school friend, the quadratic formula, to solve
it, leading us to

S' = (−2 ± √2 √(2 + x)) / x   as x → ∞.
Now we can pick this apart to see what the true control factor, or largest term, is in our asymptotic
series. As x → ∞, we see that the term that contributes most in the limit will be

S'(x) ∼ ± √2 x^{−1/2}.

Now integrating this term, we obtain the control factor, S(x) ∼ ±2√2 x^{1/2}, and hence

S(x) ∼ ±2√2 x^{1/2} + C(x),

where C(x) will be the next leading order term, i.e., C(x) << 2√2 x^{1/2} as x → ∞. Before
proceeding to find the next term in the asymptotic relation, we should check to make sure our original
assumptions from Step 2 are consistent with what we found for S(x).

We find that

S'(x)² ∼ 2 x^{−1}   and   S''(x) ∼ ∓(1/√2) x^{−3/2},

and hence our original assumption holds, i.e., |x S''| << x S'². Let's find the next term in the
asymptotic expansion! Recall we found S(x) ∼ ±2√2 x^{1/2} + C(x). We simply need to substitute this
into (79), e.g.,

x(∓(1/√2) x^{−3/2} + C''(x)) + x(±√2 x^{−1/2} + C'(x))² + 4(±√2 x^{−1/2} + C'(x)) = 2,

and simplifying we obtain

x C'' + x C'² + (±2√2 x^{1/2} + 4) C' ± 4√2 x^{−1/2} ∓ (1/√2) x^{−1/2} = 0.

Now we proceed to do some massaging on the above equation. Considering large values of x,
i.e., x → ∞, we have that 4 << 2√2 x^{1/2}. Furthermore, since C(x) << S(x) ∼ x^{1/2}, we have that
|C'| << x^{−1/2} and |C''| << x^{−3/2}. These two relations imply that x C'² << 2√2 x^{1/2} C' and
|x C''| << (4√2 − 1/√2) x^{−1/2}.

Therefore we have the following equation to solve,

±2√2 x^{1/2} C' ± (4√2 − 1/√2) x^{−1/2} = 0,

and hence (the ± signs cancel in the division, so both branches give the same result) we get

C' ∼ −(2 − 1/4) x^{−1} = −(7/4) x^{−1}   as x → ∞.
Integrating the above, we obtain

C(x) ∼ −(7/4) ln(x)   as x → ∞.

Checking that our assumptions are correct, we see first that |C| << |S| as x → ∞, and then also
that |C'| << |S'|, |C''| << |S''|, x C'² << x^{1/2} C', and finally that |x C''| << (4√2 − 1/√2) x^{−1/2}.
Check.

Hence the leading order behavior of the solution to (78) is given by

y(x) ∼ e^{S(x)} ∼ e^{±2√2 x^{1/2} − (7/4) ln(x)},

or

y(x) ∼ k x^{−7/4} e^{±2√2 x^{1/2}},

where k is a constant that cannot be determined from local analysis.
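As a rough numerical sanity check (a minimal sketch, not from the notes): integrating (78) forward from a moderate x with generic initial data, the growing branch dominates, so ln y(x) / (2√2 x^{1/2}) should creep toward 1 and x^{7/4} e^{−2√2 x^{1/2}} y(x) should level off to a constant.

import numpy as np
from scipy.integrate import solve_ivp

# x y'' + 4 y' = 2 y, written as a first-order system in (y, y').
def rhs(x, Y):
    y, yp = Y
    return [yp, (2.0 * y - 4.0 * yp) / x]

sol = solve_ivp(rhs, (1.0, 400.0), [1.0, 0.0], rtol=1e-10, atol=1e-12,
                t_eval=[50.0, 100.0, 200.0, 400.0])

for x, y in zip(sol.t, sol.y[0]):
    print(x,
          np.log(y) / (2.0 * np.sqrt(2.0) * np.sqrt(x)),           # -> 1 (slowly)
          x**1.75 * np.exp(-2.0 * np.sqrt(2.0) * np.sqrt(x)) * y)  # -> constant k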

7.3 Asymptotic Convergence vs. Pointwise Convergence

Asymptotic convergence and pointwise convergence are but two beasts with subtle, yet humongous,
differences. Recall we had a definition of asymptotic equivalence earlier, i.e.,

Definition 7.4 We say two functions f_n and g_n of a natural number parameter n are asymptotically
equivalent, e.g., f_n ∼ g_n (as n → ∞), if and only if lim_{n→∞} f_n / g_n = 1.

However, we have an alternative definition of asymptotic equivalence, as follows,


Definition 7.5 Suppose we have a series of functions Σ_{i=0}^{∞} g_i(x). Then f ∼ Σ_{i=0}^{∞} g_i(x) if
f ∼ g_0(x) as x → x_0 and, for each n ≥ 1, f − Σ_{i=0}^{n−1} g_i(x) is O(g_n(x)) as x → x_0. In this sense, the
asymptotic series guarantees that the remainder is subdominant to the last included term in the series.
Furthermore, the function f ∼ Σ g_i; however, the series itself may not be convergent.

The two above definitions have some small differences, which we will highlight later when we run
through a few properties of asymptotic relations. However, these two definitions are also in opposition
to the idea of pointwise convergence, which we define now.
Definition 7.6 We say a function f(x) is pointwise convergent to a series Σ_{i=0}^{∞} g_i(x) for all x ∈ I if,
for each x ∈ I,

lim_{N→∞} | f(x) − Σ_{i=0}^{N} g_i(x) | = 0.

Between these two definitions, we have two subtly opposing ideas. For an asymptotic series, we fix N
and, as x → x_0, the partial sum is guaranteed to approximate the function (with the stated error). For
pointwise convergence, we fix x ∈ I and, as N → ∞, the partial sums are guaranteed to converge to the function.

We can illustrate these differences using a simple Taylor series example. Consider the partial sums
f_N(x) = Σ_{n=0}^{N} x^n/n!. For a fixed x, as N → ∞, it is clear that f_N(x) → e^x, i.e.,

lim_{N→∞} | e^x − Σ_{n=0}^{N} x^n/n! | = 0.

However, for each fixed N, we instead have Σ_{n=0}^{N} x^n/n! ∼ e^x in the asymptotic sense as x → 0, with

e^x − Σ_{n=0}^{N} x^n/n! ∼ x^{N+1}/(N + 1)!   as x → 0.

To illustrate more properties of asymptotic convergence, we will go through a few (fun) examples.
Have fun reading!

1. If f_1 ∼ g_1 and f_2 ∼ g_2 as x → x_0, then

T/F: f_1 + f_2 ∼ g_1 + g_2?

   i. Using Definition 1 of asymptotic equivalence (lim_{x→x_0} f/g = 1), this is false in general, since lim_{x→x_0} (f_1 + f_2)/(g_1 + g_2) ≠ 1 in general. Consider the counterexample f_1 = x + x³, f_2 = −x, g_1 = x + x², and g_2 = −x as x → 0. It is clear that

      lim_{x→0} (f_1 + f_2)/(g_1 + g_2) = lim_{x→0} x³/x² = 0.

   ii. Using Definition 2 of asymptotic equivalence, it is true by linearity!


T/F: f_1 f_2 ∼ g_1 g_2?

   i. Using Definition 1, we see this is true. Let's see why! Recall we have lim_{x→x_0} f_1/g_1 = 1 and lim_{x→x_0} f_2/g_2 = 1. Now we'll do some gentle mathematical massaging,

      lim_{x→x_0} (f_1 f_2)/(g_1 g_2) = lim_{x→x_0} (f_1/g_1)(f_2/g_2) = 1.

   ii. Using Definition 2, this is true as well.


T/F: f_1/f_2 ∼ g_1/g_2?

   i. Using Definition 1, we see this to be true, by similar arguments to the question above. Consider

      lim_{x→x_0} (f_1/f_2) / (g_1/g_2) = lim_{x→x_0} (f_1/g_1)(g_2/f_2) = 1.
T/F: e^{f_1} e^{f_2} ∼ e^{g_1} e^{g_2}?

   i. This is false, essentially because the ratio of the exponentials is controlled by the difference of the exponents, not by their ratio. Consider f_1 = 1/x, f_2 = −1/x, g_1 = 1/x + 1, and g_2 = −1/x as x → 0 (note f_1 ∼ g_1, since their ratio tends to 1 even though their difference does not vanish). Hence we see that f_1 + f_2 = 0 and g_1 + g_2 = 1, and therefore we see easily that

      lim_{x→0} e^{f_1 + f_2} / e^{g_1 + g_2} = e^{−1} ≠ 1.

T/F: log(f_1/f_2) ∼ log(g_1/g_2)?

   i. This, too, is false, and we can show it easily! Let f_1 = e^{x + 3x²}, f_2 = e^x, g_1 = e^x, and g_2 = e^x; then we get log(f_1/f_2) = log f_1 − log f_2 = 3x² and log(g_1/g_2) = log g_1 − log g_2 = 0. Hence, we see trivially that 3x² is not asymptotic to 0 as x → 0.
T/F: log(f_1 f_2) ∼ log(g_1 g_2)?

   i. This is false by a counterexample similar to the one above, e.g., take f_1 = e^{x + 3x²}, f_2 = e^{−x}, g_1 = e^x, and g_2 = e^{−x}, so that log(f_1 f_2) = 3x² while log(g_1 g_2) = 0.
2. If f_1 ∼ f_2 have the same asymptotic series, are they the same function?

No, in general they will not be the same function. We could have f_1 = Σ a_n x^n as x → 0^+ and f_2 = Σ a_n x^n + e^{−1/x} as x → 0^+, and although we have f_1 ∼ f_2 as x → 0^+ (since e^{−1/x} is subdominant to every power of x), they are clearly not the same function.
3. Are asymptotic series unique?

No, they are not. We can have the same asymptotic series as x → x_0 for two different functions. (Each individual function, on the other hand, has a unique asymptotic series as x → x_0.) Consider f_1 = x + 3 and f_2 = x + 4; then we see f_1 ∼ x as x → ∞ and f_2 ∼ x as x → ∞, but clearly f_1 ≠ f_2.
4. Are Laurent series asymptotic series?

Recall a Laurent series is a representation of a complex function as a power series which includes
terms of negative degree, i.e., it is used to express complex functions when a regular Taylor series
expansion cannot be applied because of a singularity. We can write this as

L(x) = Σ_{n=−∞}^{∞} a_n (x − x_0)^n.

We note that a Taylor series can be an asymptotic series as x → x_0, as long as there is a
dominant leading order term. Similarly, a Laurent series can be an asymptotic series as long as
there is a dominant leading order term as x → x_0, i.e., we need the Laurent series to take a
form like

L(x) = Σ_{n=−M}^{∞} a_n (x − x_0)^n.

If no such leading order term can be written, as in the case of an essential singularity, the Laurent
series will not be an asymptotic series. For example, the function

e^{1/x} = 1 + 1/x + 1/(2! x²) + . . .

has an essential singularity at x = 0 and is not an asymptotic series as x → 0 since each
term is subdominant to infinitely many terms.

Figure 11: Pieces of the contour that we know so far must belong, though their explicit directions have
not yet been chosen. The blue lines depict the lines of steepest descent, while red indicates the lines of
steepest ascent. The blue dashed line illustrates the part of the descent line we will not choose for our
contour, since to construct our closed contour from it we would be forced to cross a line of steepest
ascent in quadrant IV.

Figure 12: The steepest descent contour in the (u, v)-plane, e.g., C = C1 + C2 + C3 .
