Figure 1: Numerical Microscopy: The Trabeculated Ventricle of an 84hpf Zebrafish Embryonic Heart
Mathematical model and simulation of an embryonic zebrafish heart at 84 hours post fertilization.
Blood cells can be seen flowing into the ventricle during the diastole phase. Blood flow plays a critical
role in heart development, as hemodynamic forces act as signaling mechanisms to regulate a wide host
of developmental processes at the cellular and genetic level.
Nicholas A. Battista
Modified: August 8th, 2016
Created: June 3, 2015
1 Introduction
Welcome! In this short 5-day course, we will cover a variety of topics pertaining to analytical methods in applied mathematics. Most of these topics have been chosen because this material seems to have a way of falling through the cracks of most undergraduate curricula in mathematics, and because it is traditionally not covered during the first-year sequence in methods of applied mathematics.
The tentative list of topics we will cover is below:
1. A review of undergraduate ODEs
• Separable Equations
• First-Order: Integrating Factor
• Homogeneous vs. Nonhomogeneous Equations
• Solving for Complementary Solutions
– Linear constant coefficient ODEs
• Solving for Particular Solution:
– Undetermined Coefficients
– Variation of Parameters
• Bonus Methods:
– Transform Methods (i.e., “Euler” type, Riccati type, etc.)
– Laplace Transforms
– Non-dimensionalization
– Introduction to Dynamical Systems, e.g., systems of ODEs
2. General Sturm-Liouville Theory
• Orthogonal Functions
• Fourier Series
• Orthogonal Polynomials as a Complete Basis
3. Green's Functions
4. Linear PDEs
• Classification of PDEs
• Derivations of classical linear PDEs
• Method of Characteristics
• Separation of Variables
• Fourier Transforms / Laplace Transforms
• Similarity Solutions
• Method of Images
5. Physics!
• Newton’s Laws of Motion
• Conservation Laws
• Lagrangian Formulation of Mechanics
• Hamiltonian Formulation of Mechanics
6. Big Picture Numerical Linear Algebra
2 Review of Undergrad. ODEs
Undergraduate Ordinary Differential Equations, or Calculus 4, is mostly a blitz of various analytical
methods for solving initial value problems for linear (and some special case non-linear) differential
equations.
In this section we will review the following methods:
1. Separable Equations
2. First-Order: Integrating Factor
3. Homogeneous vs. Nonhomogeneous Equations
4. Solving for Complementary Solutions
• Linear constant coefficient ODEs
5. Solving for Particular Solution:
• Undetermined Coefficients
• Variation of Parameters
6. Bonus Methods and Conceptual Fun:
• Transform Methods (i.e., “Euler” type, Riccati type, etc.)
• Laplace Transforms
• Non-Dimensionalization
• Introduction to Dynamical Systems, e.g., systems of ODEs
– Eigenvalues and Systems of Linear Homogeneous ODE
– The Matrix Exponential
– Non-homogeneous Linear Systems
– Non-linear Systems of ODEs
Since undergraduate ODEs does not involve much theory, we will try to motivate each method where possible; otherwise, sit back and enjoy the following stampede of methods!
For example, consider
dy/dt = -y(1 - y),
with y(0) = 2. We can then go through the separation method to find
dy / [y(1 - y)] = -dt
\int 1/[y(1 - y)] dy = -\int dt
log(y) - log(1 - y) = -t + C
y/(1 - y) = Ae^{-t},
and hence, doing a little algebra, we find that
y(t) = Ae^{-t} / (1 + Ae^{-t}).
Now, using the initial value information, we can get A,
y(0) = 2  ⇒  2 = A/(1 + A)  ⇒  A = -2.
Therefore our solution is y(t) = -2e^{-t} / (1 - 2e^{-t}). Note that this solution blows up in finite time, namely when e^{-t} = 1/2, i.e., t = ln 2.
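A minimal sketch (not part of the original notes, assuming SymPy is available) verifying the separable-equation example and its finite-time blow-up:

import sympy as sp

t = sp.symbols('t')
y = -2*sp.exp(-t) / (1 - 2*sp.exp(-t))             # the solution found above

# check the ODE dy/dt = -y(1 - y) and the initial condition y(0) = 2
print(sp.simplify(sp.diff(y, t) - (-y*(1 - y))))   # 0
print(y.subs(t, 0))                                # 2
# the denominator vanishes when exp(-t) = 1/2, i.e. t = ln 2
print(sp.solve(sp.Eq(1 - 2*sp.exp(-t), 0), t))     # [log(2)]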
To make an equation non-separable, we only need to tack on one extra term; however, we will now restrict ourselves to linear first-order equations of the form
dy/dx + p(x)y(x) = q(x),     (1)
where we assume p(x) and q(x) are continuous. To solve the above equation, we consider the left-hand side of (1) to be the result of a product rule. However, to make it such, we need to multiply all the terms by a fudge factor given in exponential form, also known as the integrating factor,
e^{µ(x)} = e^{\int p(x) dx},
thereby giving d/dx e^{µ(x)} = p(x) e^{µ(x)}, since dµ/dx = p(x). Doing so, we see the product rule gives
d/dx [ e^{\int p(x) dx} y(x) ] = e^{\int p(x) dx} [ dy/dx + p(x)y(x) ] = e^{\int p(x) dx} q(x).
Hence to solve (1), we need to integrate both sides and then easily solve for y(x), e.g.,
d/dx [ e^{\int p(x) dx} y(x) ] = e^{\int p(x) dx} q(x)
\int d/dx [ e^{\int p(x) dx} y(x) ] dx = \int e^{\int p(x) dx} q(x) dx
e^{µ(x)} y(x) = \int e^{µ(x)} q(x) dx.
For instance, applying this procedure to dy/dt + 3t^2 y = t^2, the integrating factor is e^{t^3} and hence
y(t) = 1/3 + Ce^{-t^3},
where C can be solved for using initial data.
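A minimal sketch (assuming SymPy; the illustrative equation dy/dt + 3t^2 y = t^2 is the one quoted above) of the integrating-factor recipe:

import sympy as sp

t, C = sp.symbols('t C')
p = 3*t**2          # p(t)
q = t**2            # q(t)

mu = sp.exp(sp.integrate(p, t))          # integrating factor e^{int p dt} = e^{t^3}
y  = (sp.integrate(mu*q, t) + C) / mu    # y = ( int mu*q dt + C ) / mu
print(sp.simplify(y))                    # C*exp(-t**3) + 1/3
print(sp.simplify(y.diff(t) + p*y - q))  # 0, so the ODE is satisfied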
A linear homogeneous equation takes the form
L̂y(x) = c_n(x) d^n y/dx^n + c_{n-1}(x) d^{n-1}y/dx^{n-1} + ... + c_2(x) d^2y/dx^2 + c_1(x) dy/dx + c_0(x)y = 0.
On the other hand, a non-homogeneous equation looks like the above, except with a non-zero right-hand side, e.g.,
L̂y(x) = c_n(x) d^n y/dx^n + c_{n-1}(x) d^{n-1}y/dx^{n-1} + ... + c_2(x) d^2y/dx^2 + c_1(x) dy/dx + c_0(x)y = q(x).
The reason we make the distinction clear is that if we consider a solution to the non-homogeneous problem, call it y_P(x), so that L̂y_P(x) = q(x), and a solution to the homogeneous problem, the complementary solution y_C(x), we have
L̂y_C(x) = 0.
Therefore if we consider a combined solution, y(x) = y_C(x) + y_P(x), we have
L̂[y_C(x) + y_P(x)] = L̂y_C(x) + L̂y_P(x) = 0 + q(x) = q(x).
Furthermore, we note that any scalar multiple of the complementary solution is also a solution to the homogeneous problem, i.e.,
L̂[α y_C(x)] = 0,
where α ∈ C.
We first consider linear constant coefficient homogeneous equations,
L̂y(x) = c_n d^n y/dx^n + c_{n-1} d^{n-1}y/dx^{n-1} + ... + c_2 d^2y/dx^2 + c_1 dy/dx + c_0 y = 0,
where the {c_j}, j = 0, ..., n, are real constants. For equations of this form we will show that our solution technique assumes an ansatz for a solution, and in doing so turns this differential problem into an algebraic polynomial root-finding problem. We also note that since this is an nth-order homogeneous equation, we need to find n linearly independent solutions for our complementary solution.
To illustrate this idea further, we will show the possibilities for second-order linear constant coefficient ODEs.
We now consider the following ODE,
a d^2y/dx^2 + b dy/dx + cy = 0,     (2)
with initial data of the form y(0) = y_0 and y'(0) = y_0'. We assume solutions of the form y = e^{rx}, giving
a d^2y/dx^2 + b dy/dx + cy = a d^2/dx^2 e^{rx} + b d/dx e^{rx} + c e^{rx}
= ar^2 e^{rx} + br e^{rx} + c e^{rx}
= e^{rx} [ ar^2 + br + c ] = 0.
Since e^{rx} is never zero, we are left with a characteristic polynomial to find the roots of. In this second-order example, there are three possible cases that the roots can fall into. Naturally this goes back to properties of quadratic equations and when there are two real roots, complex roots, or a double root.
We will highlight each case with an example.
• Case 1: Two distinct real roots. Consider y'' - y' - 2y = 0. Substituting our ansatz, y = e^{rx}, we find we get the following algebraic equation to solve,
r^2 - r - 2 = 0.
Solving this equation we find that r = {-1, 2}, and hence we get two linearly independent solutions that solve the homogeneous problem above,
{e^{-x}, e^{2x}},
so the complementary solution is y_C(x) = α_1 e^{-x} + α_2 e^{2x}.
• Case 2: Complex roots. Again, substituting our ansatz, y = e^{rx}, we get the following characteristic equation,
r^2 - 2r + 2 = 0.
Solving this equation we find r = {1 - i, 1 + i}, and so our linearly independent solutions are {e^{(1-i)x}, e^{(1+i)x}}. We can easily write the complementary solution as
y_C(x) = β_1 e^{(1-i)x} + β_2 e^{(1+i)x},
but we will massage our solutions into a more friendly form! After much algebra, we find we can write the above as
y_C(x) = e^{x} [ α_1 cos(x) + α_2 sin(x) ].
In general, if we find our complex roots to be {a - bi, a + bi}, we can write our complementary solution as
y_C(x) = e^{ax} [ α_1 cos(bx) + α_2 sin(bx) ].
• Case 3: A double root. Consider y'' - 6y' + 9y = 0, whose characteristic equation is
(r - 3)^2 = 0.
Hence solving this equation we find that r = 3 is a double root, i.e., a root with multiplicity > 1. Therefore if we tried to write our complementary solution in a similar way to Case 1, we would have
y_C(x) = α_1 e^{3x} + α_2 e^{3x} = α e^{3x},
and thereby only have ONE linearly independent solution, when we need two! The method we will employ to find this second solution is a special (easy) case of Reduction of Order. We assume the second linearly independent solution y_2(x) has the form y_2(x) = v(x)e^{3x}, and therefore we need to solve v'' = 0, finding v(x) = βx + γ. Hence our second linearly independent solution is
y_2(x) = (βx + γ)e^{3x},
and the full complementary solution can be written y_C(x) = α̃ e^{3x} + βx e^{3x}, where α̃ = α + γ is just another constant to be found using the initial data after all, so it's nothing too fancy! These three cases are illustrated by the short check that follows.
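A brief sketch (assuming SymPy; the second and third ODEs are the ones inferred from the characteristic equations above) confirming the three root cases with dsolve:

import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

# distinct real roots: y'' - y' - 2y = 0  ->  C1*exp(-x) + C2*exp(2*x)
print(sp.dsolve(y(x).diff(x, 2) - y(x).diff(x) - 2*y(x), y(x)))
# complex roots: y'' - 2y' + 2y = 0  ->  exp(x)*(C1*cos(x) + C2*sin(x))
print(sp.dsolve(y(x).diff(x, 2) - 2*y(x).diff(x) + 2*y(x), y(x)))
# repeated root: y'' - 6y' + 9y = 0  ->  (C1 + C2*x)*exp(3*x)
print(sp.dsolve(y(x).diff(x, 2) - 6*y(x).diff(x) + 9*y(x), y(x)))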
We now consider the non-homogeneous problem,
c_n(x) d^n y/dx^n + c_{n-1}(x) d^{n-1}y/dx^{n-1} + ... + c_2(x) d^2y/dx^2 + c_1(x) dy/dx + c_0(x)y = q(x).     (3)
Like in the previous section, we illustrate this concept with 2nd order linear constant coefficient
equations,
c2 y 00 + c1 y 0 + c0 y = q(x).
We will describe two methods for finding the particular solution,
1. Undetermined Coefficients
2. Variation of Parameters
• Undetermined Coefficients:
This method of finding a particular solution is similar to “guess and check!” It will only work for special cases of q(x), namely if q(x) belongs to the following families of functions {a_n x^n + ... + a_1 x + a_0, e^{ax}, sin(ax), cos(ax)}, or any products of those.
– Example: Consider
y'' - y' - 2y = 6x^2.
Recall we found the complementary solution to be y_C(x) = α_1 e^{-x} + α_2 e^{2x}. Now since q(x) is a polynomial, we will assume a polynomial form for y_P(x), i.e., we will let y_P(x) = c_1 x^2 + c_2 x + c_3.
Substituting our particular solution into the ODE gives
y'' - y' - 2y = d^2/dx^2 (c_1 x^2 + c_2 x + c_3) - d/dx (c_1 x^2 + c_2 x + c_3) - 2(c_1 x^2 + c_2 x + c_3)
= (2c_1) - (2c_1 x + c_2) - 2(c_1 x^2 + c_2 x + c_3)
= [-2c_1] x^2 - [2c_1 + 2c_2] x + [2c_1 - c_2 - 2c_3] = 6x^2.
Matching coefficients gives
-2c_1 = 6
-2c_1 - 2c_2 = 0
2c_1 - c_2 - 2c_3 = 0.
Solving this system we find that c_1 = -3, c_2 = 3, and c_3 = -9/2, thereby making our particular solution y_P(x) = -3x^2 + 3x - 9/2. Hence our full solution to this non-homogeneous equation is
y(x) = y_C(x) + y_P(x) = α_1 e^{-x} + α_2 e^{2x} + [ -3x^2 + 3x - 9/2 ],
up to using the initial data to solve for {α_1, α_2}. Note that even though q(x) did not have a linear or constant term, the particular solution did have those components. Hence we have to make sure to include all the polynomial terms at the same order as q(x) and below.
We can do a similar process if q(x) takes on the other functional forms. Below is a table for
what particular solution to “guess” for each q(x).
q(x)                                       y_P(x) guess
e^{ax}                                     c_1 e^{ax}
sin(bx)                                    c_1 sin(bx) + c_2 cos(bx)
cos(bx)                                    c_1 sin(bx) + c_2 cos(bx)
β sin(bx) + γ cos(bx)                      c_1 sin(bx) + c_2 cos(bx)
a_n x^n + ... + a_2 x^2 + a_1 x + a_0      c_n x^n + ... + c_2 x^2 + c_1 x + c_0
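A small check (assuming SymPy; not part of the original notes) that the particular solution found above by undetermined coefficients really works:

import sympy as sp

x = sp.symbols('x')
yP = -3*x**2 + 3*x - sp.Rational(9, 2)
print(sp.simplify(yP.diff(x, 2) - yP.diff(x) - 2*yP))   # 6*x**2, as required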
– Example: Resonance
One other subtle point concerns the case when q(x) = e^{ax}. If our complementary solutions are exponential functions, it is possible that they may already include a term e^{ax}, i.e., y_C(x) = α_1 e^{ax} + α_2 e^{bx}. In this case, we will have resonant solutions.
To say a quick word on resonance, which properly deserves a chapter of its own: it is the tendency of a mechanical system to oscillate with greater amplitude when pushed at a preferential frequency, called the resonant frequency. This allows systems to maximize their output oscillatory amplitude, even while the driving forces may be relatively small. Essentially the system is able to store mechanical energy and efficiently transfer that energy between its potential and kinetic forms at such frequencies.
Our particular solution will then take on a term like yP (x) = c1 xeax , analogous to what
we did in the case of multiple roots using Reduction of Order.
• Variation of Parameters:
The next method for finding particular solutions assumes we already know what the comple-
mentary solutions are. Like the last case, we will assume we are trying to solve a 2nd order
linear ODE. Note this method does not only work for linear constant coefficient ODEs, unlike
undetermined coefficients. This method will work for higher order equations analogously (the
algebra just gets a bit more undesirable).
This method, called variation of parameters, exploits a more general reduction-of-order technique. We begin with the set of fundamental solutions {y_1(x), y_2(x)}, i.e., where y_C(x) = α_1 y_1(x) + α_2 y_2(x). We now assume the particular solution takes a reduction-of-order form,
y_P(x) = u_1(x)y_1(x) + u_2(x)y_2(x),
where {u_1(x), u_2(x)} are unknown functions that we now set out to find! To motivate this concept, we will try to find the particular solution to
y'' + p(x)y' + r(x)y = q(x),
subject to the additional constraint
u_1' y_1 + u_2' y_2 = 0.
The reasons for this choice of constraint are not immediately obvious, but upon imposing it we see that the derivative of y_P(x) takes a more manageable form,
y_P'(x) = u_1 y_1' + u_2 y_2'.
Hence since, as you've probably suspected, we're going to substitute our ansatz for y_P(x) into the ODE, this constraint makes a lot of terms drop out.
y_P'' + p(x)y_P' + r(x)y_P = [ u_1' y_1' + u_1 y_1'' ] + [ u_2' y_2' + u_2 y_2'' ] + p(x)[ u_1 y_1' + u_2 y_2' ] + r(x)[ u_1(x)y_1(x) + u_2(x)y_2(x) ]
= [ u_1' y_1' + u_2' y_2' ] + u_1 [ y_1'' + p(x)y_1' + r(x)y_1(x) ] + u_2 [ y_2'' + p(x)y_2' + r(x)y_2(x) ]
= u_1' y_1' + u_2' y_2' = q(x),
since y_1(x) and y_2(x) are solutions to the homogeneous equation. Therefore we have the following system of two equations,
u_1' y_1 + u_2' y_2 = 0
u_1' y_1' + u_2' y_2' = q(x),
in the two unknowns u_1'(x) and u_2'(x). Recall that we already know y_1(x) and y_2(x), and we assume they are at least C^2 since they are solutions of a 2nd-order ODE after all!
Solving this system we find
u_1'(x) = -y_2(x)q(x) / [ y_1(x)y_2'(x) - y_1'(x)y_2(x) ]   and   u_2'(x) = y_1(x)q(x) / [ y_1(x)y_2'(x) - y_1'(x)y_2(x) ].
Hence we can now integrate the above two formulas for u_1'(x) and u_2'(x) to find our particular solution,
y_P(x) = -y_1(x) \int y_2(x)q(x) / [ y_1 y_2' - y_1' y_2 ] dx + y_2(x) \int y_1(x)q(x) / [ y_1 y_2' - y_1' y_2 ] dx.
– Example:
We will show variation of parameters is indeed a true method for finding the particular
solution by performing the same example as in the undetermined coefficients section. Recall
we wish to solve
y 00 − y 0 − 2y = 6x2 .
Recall we found the complementary solution to be yC (x) = α1 e−x + α2 e2x , and thereby
our fundamental solutions are y1 (x) = e−x and y2 (x) = e2x .
Using our formulation above we have
y_P(x) = -y_1(x) \int y_2(x)q(x) / [ y_1 y_2' - y_1' y_2 ] dx + y_2(x) \int y_1(x)q(x) / [ y_1 y_2' - y_1' y_2 ] dx
= -e^{-x} \int e^{2x}(6x^2) / [ e^{-x}(2e^{2x}) + e^{-x}e^{2x} ] dx + e^{2x} \int e^{-x}(6x^2) / [ e^{-x}(2e^{2x}) + e^{-x}e^{2x} ] dx
= -6e^{-x} \int e^{2x}x^2 / (3e^{x}) dx + 6e^{2x} \int e^{-x}x^2 / (3e^{x}) dx
= -2e^{-x} \int e^{x}x^2 dx + 2e^{2x} \int e^{-2x}x^2 dx
= -2e^{-x} [ (x^2 - 2x + 2)e^{x} ] + 2e^{2x} [ ( -x^2/2 - x/2 - 1/4 ) e^{-2x} ]
= (-2x^2 + 4x - 4) + ( -x^2 - x - 1/2 )
= -3x^2 + 3x - 9/2.
Therefore we see that our variation of parameters methodology is consistent with what undetermined coefficients gave us (not that we are surprised; they should agree).
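A sketch (assuming SymPy) of the scalar variation-of-parameters formula with y1 = e^{-x}, y2 = e^{2x}, and q(x) = 6x^2, reproducing the particular solution above:

import sympy as sp

x = sp.symbols('x')
y1, y2, q = sp.exp(-x), sp.exp(2*x), 6*x**2

W  = y1*y2.diff(x) - y1.diff(x)*y2                 # Wronskian y1*y2' - y1'*y2
yP = -y1*sp.integrate(y2*q/W, x) + y2*sp.integrate(y1*q/W, x)
print(sp.expand(yP))                               # -3*x**2 + 3*x - 9/2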
2.5 Substitution Methods for *some* Nonlinear ODEs
In this section we will discuss solution techniques for *some* first-order non-linear ODEs, namely the Bernoulli, Riccati, and Clairaut equations. We begin with the Bernoulli equation,
dy/dx + p(x)y = q(x)y^n,
which the substitution
φ = y^{1-n},     (6)
turns into a linear first-order equation for φ that can be handled with an integrating factor.
For instance, for the equation y' + y = x y^4 (so n = 4 and φ = y^{-3}), the integrating-factor step reads
d/dx [ e^{-3x} φ(x) ] = -3x e^{-3x}.
Integrating both sides gives
e^{-3x} φ(x) = (x + 1/3) e^{-3x} + C,
where C is the integration constant. Now we simply solve for φ(x) and substitute our Bernoulli transformation (6) to get
φ(x) = y^{-3} = x + 1/3 + Ce^{3x}.
2.5.2 Riccati Equations
The Riccati equation takes the following non-linear form,
dy/dx = Q(x) + P(x)y + R(x)y^2.     (7)
The Riccati equation may look innocent enough, i.e., similar to a Bernoulli equation but with one additional term, namely Q(x); however, it turns out there is no general way to find an analytical solution unless a particular solution is known. For our considerations, we will assume we already know y_P(x) (somehow), and then define the new function
φ = 1 / ( y - y_P(x) ).     (8)
Hence we see that
y(x) = y_P + 1/φ   ⇒   dy/dx = y_P' - (1/φ^2) dφ/dx.
Substituting these into (7) gives
y_P' - (1/φ^2) dφ/dx = Q(x) + P(x)[ y_P + 1/φ ] + R(x)[ y_P^2 + 2y_P/φ + 1/φ^2 ].
Simplifying the above, we obtain
-(1/φ^2) dφ/dx = P(x)(1/φ) + 2R(x)(y_P/φ) + R(x)(1/φ^2),
since y_P(x) is a solution to (7), i.e., y_P' = Q(x) + P(x)y_P + R(x)y_P^2.
We can simplify further to get
dφ/dx = -[ P(x) + 2R(x)y_P(x) ]φ - R(x).
This is now a first-order linear ODE for φ and can be solved using an integrating factor!
• Example: y' = e^{2x} + (1 + 2e^{x})y + y^2, with *already known* particular solution y_P(x) = -e^{x}. (We do not know how we know that, nor is that our concern at this juncture.)
First we identify all the pieces,
P(x) = 1 + 2e^{x}
Q(x) = e^{2x}
R(x) = 1,
so that
dφ/dx = -[ P(x) + 2R(x)y_P(x) ]φ - R(x)
= -[ (1 + 2e^{x}) + 2(1)(-e^{x}) ]φ - 1 = -φ - 1.
Hence we have the following linear first-order ODE to solve, to which we can apply an integrating factor,
dφ/dx + φ = -1.
Identifying the integrating factor as v(x) = e^{\int 1 dx} = e^{x} gives
d/dx [ e^{x} φ(x) ] = -e^{x},
and thereupon integrating we obtain
e^{x} φ(x) = -e^{x} + C,
and hence
φ(x) = -1 + Ce^{-x}.
Undoing the substitution (8), the solution is
y(x) = y_P(x) + 1/φ = -e^{x} + 1 / ( Ce^{-x} - 1 ).
2.5.3 Clairaut's Equation
Clairaut's equation takes the form
y = x dy/dx + f(dy/dx).
Differentiating both sides with respect to x gives
dy/dx = dy/dx + x d^2y/dx^2 + f'(dy/dx) d^2y/dx^2,
i.e.,
[ x + f'(dy/dx) ] d^2y/dx^2 = 0,
so one of the two factors must vanish.
• Case 1: d^2y/dx^2 = 0. Then dy/dx = C, and substituting back into Clairaut's equation gives
y(x) = Cx + f(C).
This gives a family of straight lines as solutions, one for each value of C. This is known as the general solution of Clairaut's equation.
• Case 2: x + f'(dy/dx) = 0.
We see this implies that f'(dy/dx) = -x. This leads to singular solutions. We will illustrate this via an example. Consider f(y') = y'^2; then we have the case where
f'(y') = 2y' = -x   ⇒   y' = -x/2,
and hence, substituting back into Clairaut's equation, we get
y_s(x) = (-x/2)x + (-x/2)^2 = -x^2/4.
The general solution for this example is y_G(x) = Cx + C^2.
From uniqueness theory, we know that no two solutions may intersect at the same point or else uniqueness is violated. We see in this simple case of Clairaut's equation that y_G(x) = y_s(x) when (x, y) = (-2C, -C^2).
Furthermore, we can even see that the two curves are tangent to each other at this point, i.e.,
y_G'(-2C) = C = y_s'(-2C).
Therefore the singular solution y_s(x) is tangent to every member of the family of curves y_G(x), for any value of C ∈ R.
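A short check (assuming SymPy) for Clairaut's equation y = x y' + (y')^2: the general solution y_G = Cx + C^2 and the singular solution y_s = -x^2/4 both satisfy it, and they touch at x = -2C:

import sympy as sp

x, C = sp.symbols('x C')
clairaut = lambda y: sp.simplify(y - (x*y.diff(x) + y.diff(x)**2))

print(clairaut(C*x + C**2))        # 0
print(clairaut(-x**2/4))           # 0
print(sp.solve(sp.Eq(C*x + C**2, -x**2/4), x))   # [-2*C]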
Hence if you are able to integrate, you will be able to find the Laplace Transform of many functions.
A table of some of them is listed below
f(t)             L{f(t)}
c (c ∈ C)        c/s
t^n (n ∈ N)      n!/s^{n+1}
e^{at}           1/(s - a)
sin(at)          a/(s^2 + a^2)
cos(at)          s/(s^2 + a^2)
sinh(at)         a/(s^2 - a^2)
cosh(at)         s/(s^2 - a^2)
Likewise, we can “undo” the Laplace Transform by taking its inverse. The Inverse Laplace Transform is defined by a line integral,
L^{-1}{F(s)} = f(t) = (1/(2πi)) \int_{γ-iT}^{γ+iT} e^{st} F(s) ds     (11)
(in the limit T → ∞). This integral gets a bit hairy to do, but it can be found using some beautiful complex analysis.
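A minimal sketch (assuming SymPy) reproducing a few rows of the table above; the exponent n = 3 is just a sample choice:

import sympy as sp

t, s, a = sp.symbols('t s a', positive=True)
n = 3

for f in (t**n, sp.exp(a*t), sp.sin(a*t), sp.cosh(a*t)):
    F = sp.laplace_transform(f, t, s, noconds=True)
    print(f, '->', F)
# e.g. t**3 -> 6/s**4,  exp(a*t) -> 1/(s - a),  sin(a*t) -> a/(a**2 + s**2)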
1. Linearity:
L{αf(t) + βg(t)} = αL{f(t)} + βL{g(t)}.
2. Convergence:
Moreover, we know that the Laplace Transform of a function f(t) will exist if f(t) is piecewise continuous on [0, ∞) and of at most exponential order for t > T. If these conditions are satisfied, then
lim_{s→∞} L{f(t)} = 0.
We will sketch a quick proof. Since f(t) is piecewise continuous on [0, T], we know it is bounded on that interval, which implies |f(t)| ≤ M_1 e^{0t} there; for t > T we have |f(t)| ≤ M_2 e^{αt}.
Therefore let M = max{M_1, M_2} and a = max{0, α}. Then we have, for this function f(t),
|L{f(t)}| = | \int_0^∞ e^{-st} f(t) dt |
≤ \int_0^∞ e^{-st} |f(t)| dt
≤ M \int_0^∞ e^{-st} e^{at} dt
= M [ -e^{-(s-a)t}/(s-a) ]_0^∞
= M/(s - a),
for s > a, which tends to 0 as s → ∞.
3. Translation in s:
If a ∈ R, then
L{e^{at} f(t)} = F(s - a).
4. Translation in t:
If a ∈ R^+, then
L{f(t - a)H(t - a)} = e^{-as} F(s),
where H(t - a) is the Heaviside step function,
H(t - a) = 0 for 0 ≤ t < a,   and   H(t - a) = 1 for t ≥ a.
5. Derivatives of Transforms:
For n ∈ N, we have
L{t^n f(t)} = (-1)^n d^n F(s)/ds^n.
6. Convolutions:
Let f (t) and g(t) be piecewise continuous functions on [0, ∞) and with order at most exponential,
then
L {f (t) ∗ g(t)} = L {f (t)}L {g(t)} = F (s)G(s).
L −1 {F (s)G(s)} = f ∗ g.
The only real piece we are missing is how to take Laplace Transforms of Derivatives. We proceed
immediately to remedy this.
We first state the result. If y(t) ∈ C^n on [0, ∞) and is of at most exponential order, then
L{ d^n y/dt^n } = s^n Y(s) - \sum_{k=0}^{n-1} s^k y^{(n-1-k)}(0) = s^n Y(s) - s^{n-1} y(0) - s^{n-2} y'(0) - ... - y^{(n-1)}(0),
where Y(s) = L{y(t)}.
We will show the proof for the first derivative; the others follow analogously... or by induction. Consider L{y'(t)} and proceed with the integration (by parts),
L{y'(t)} = \int_0^∞ e^{-st} y'(t) dt
= [ e^{-st} y(t) ]_0^∞ + s \int_0^∞ e^{-st} y(t) dt
= -y(0) + s L{y(t)}
= sY(s) - y(0),
and hence
L{y'(t)} = sY(s) - y(0).
Note that in calculating L {y 0 (t)} (or any higher derivatives for that matter), we need to use the
initial data! This is where the initial data will creep into using Laplace Transforms for solutions
to linear ODEs!
There are 4 steps in solving ODEs using Laplace Transforms. They are listed below:
1. Identify your ODE
2. Transform all the terms in the ODE using L
3. Solve for Y (s)
4. Transform back using L −1 to get y(t)
Let’s do an example!
Example: Consider
dy/dt - 3y = e^{2t},   with y(0) = 1.
We begin by transforming all the terms using L and exploiting its linearity property,
L{ dy/dt - 3y } = L{ dy/dt } - 3L{y} = L{ e^{2t} },
and obtain
[ sY(s) - y(0) ] - 3Y(s) = 1/(s - 2),
or, substituting the initial data,
[ sY(s) - 1 ] - 3Y(s) = 1/(s - 2).
At this point, our differential problem has been cast into an algebraic problem, where we must solve for Y(s),
Y(s) = (s - 1) / [ (s - 2)(s - 3) ] = -1/(s - 2) + 2/(s - 3).
We have used the method of partial fractions above. Hence we now have
y(t) = L^{-1}{Y(s)} = -L^{-1}{ 1/(s - 2) } + 2 L^{-1}{ 1/(s - 3) }.
Transforming back (using the table), this yields our solution,
y(t) = -e^{2t} + 2e^{3t}.
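A sketch (assuming SymPy) of the four-step Laplace procedure applied to this example:

import sympy as sp

t, s = sp.symbols('t s', positive=True)
Y = sp.symbols('Y')

# step 2: transform the ODE term by term, using L{y'} = sY - y(0)
eq = sp.Eq((s*Y - 1) - 3*Y, 1/(s - 2))
# step 3: solve the algebraic equation for Y(s)
Ys = sp.solve(eq, Y)[0]
print(sp.apart(Ys, s))                          # -1/(s - 2) + 2/(s - 3)
# step 4: invert to recover y(t)
print(sp.inverse_laplace_transform(Ys, s, t))   # -exp(2*t) + 2*exp(3*t), up to a Heaviside(t) factor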
One last remark: by using Laplace Transforms we solved for our complementary and particular
solutions simultaneously! Furthermore, there are other transform methods, namely Fourier Trans-
forms, which can be used to solve ODEs (and PDEs) as well. Typically if you are trying to solve a
differential equation (whether it is an ODE or PDE), you can use Laplace Transforms to transform
the independent variables in [0, ∞), and you can use Fourier Transforms to transform variables, which
are in (−∞, ∞).
Moreover, after performing this procedure your equation will usually take a somewhat friendlier form, seemingly with fewer parameters, since the parameters will have been collected together into non-dimensional quantities. This will become clearer as we do examples.
We will now describe the steps in performing non-dimensionalization. The steps are as follows
1. First identify all the independent and dependent variables within the equation (or system of
equations)
4. Choose the definition of characteristic units for each variable to make as many coefficients of as
many terms one.
5. Rewrite the equation, or system of equations, using the new dimensionless quantities.
Let’s do an example!
m d^2x/dt^2 + b dx/dt + kx = F_0(t),
where x(t) is the displacement of the block of mass, m, at time, t, b is the damping coefficient, k is the
spring stiffness, and F0 (t) may be some periodic forcing term. We begin by making the dimensionless
variables for displacement and time as follows,
χ = x/x_C   and   τ = t/t_C,
where χ and τ are the non-dimensional displacement and time, respectively, and xC and tC are the
characteristic displacement and time scalings, respectively. Before we can do the substitutions, we first need to relate the differential operators d/dt and d^2/dt^2 to the non-dimensionalization.
First we consider that t = t_C τ, and hence dt = t_C dτ, since t_C is the characteristic time, which is constant. Hence we get that
dτ/dt = 1/t_C   ⇒   d/dt = (dτ/dt) d/dτ = (1/t_C) d/dτ,
by the chain rule. Similarly for higher order derivatives we find that
d^n/dt^n = (1/t_C)^n d^n/dτ^n.
Now let’s begin the substitutions!
m d^2x/dt^2 + b dx/dt + kx = F_0(t)
m (x_C/t_C^2) d^2χ/dτ^2 + b (x_C/t_C) dχ/dτ + k x_C χ = F_0(t_C τ).
Next divide through by the leading-order coefficient, m x_C/t_C^2, and we obtain
d^2χ/dτ^2 + (b t_C/m) dχ/dτ + (k t_C^2/m) χ = ( t_C^2/(m x_C) ) F_0(t_C τ).
We are now at a crossroads, where we can decide what the characteristic displacement, xC , and
time, tC , should be. Since we are trying to make overall coefficients equal to one, there are two obvious
choices for tC , as follows,
t_C = m/b  (first-order normalization)   OR   t_C = \sqrt{m/k}  (zeroth-order normalization).
We will choose the latter, as we will see this will lead to another non-dimensional quantity that
is referred to as the damping ratio. However, it does not matter which choice we make, but one does
lead to a more traditionally studied problem. Now our governing system takes the form
d^2χ/dτ^2 + ( b/\sqrt{mk} ) dχ/dτ + χ = ( 1/(k x_C) ) F_0(t_C τ).
Furthermore we can define new parameters to lump all the parameters together in front of the
linear term and non-homogeneous term. We note that in the engineering community for these type
of systems, folks define
2ζ = b/\sqrt{mk}   and   x_C = 1/k.
The reason for the factor of 2 is purely aesthetic and arbitrary; it will later allow us to parameterize solutions in terms of ζ alone. ζ is traditionally referred to as the damping ratio. The x_C definition can be thought of as the displacement per unit force.
With these definitions, the governing equation becomes
d^2χ/dτ^2 + 2ζ dχ/dτ + χ = ( 1/(k x_C) ) F_0(t_C τ),
where ζ = b/(2\sqrt{mk}), t_C = \sqrt{m/k}, and x_C = 1/k.
To solve the governing equation, we need to find both particular and complementary solutions. We
will only concern ourselves with the complementary solutions for illustrating different regimes based
on our parameters, specifically the damping ratio, ζ. For the complementary solutions, e.g., solutions
to the homogeneous problem, we make the ansatz for solutions of the form
χ = e^{λτ},
2. Critically damped: ζ = 1. We now have only one distinct eigenvalue of our differential operator.
Hence at this stage, we have one transient solution, χ1 (τ ) = c3 e−τ . However, we will also have
a resonant eigenfunction, χ2 (τ ) = c4 τ e−τ . Therefore our transient solution for this critically
damped case is
χ(τ ) = (c3 + τ c4 )e−τ .
3. Underdamped: ζ < 1. We have two distinct eigenvalues of our differential operator; however,
they are both complex with nonzero imaginary part. This is not an issue as our transient
solutions can be written as
χ(τ) = c̃_5 e^{(-ζ + i\sqrt{1-ζ^2})τ} + c̃_6 e^{(-ζ - i\sqrt{1-ζ^2})τ},
which can be massaged into the real form e^{-ζτ}[ c_5 cos(\sqrt{1-ζ^2} τ) + c_6 sin(\sqrt{1-ζ^2} τ) ].
Hence we found there are three regimes for the damped harmonic oscillator system: overdamped, underdamped, and critically damped. Moreover, we were able to discern when such transitions occur by using non-dimensionalization, which gave rise to a natural dimensionless quantity, ζ, by which we can classify the transient solution regimes. Recall that ζ is sometimes referred to as the damping ratio, and it is heavily used by the engineering community to model decaying oscillations in a system after it undergoes a perturbation. A small numerical illustration of the three regimes follows.
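A small numerical illustration (assuming NumPy and SciPy are available; the initial data chosen here are just for demonstration) of the three damping regimes for χ'' + 2ζ χ' + χ = 0 with χ(0) = 1, χ'(0) = 0:

import numpy as np
from scipy.integrate import solve_ivp

def damped_oscillator(zeta):
    rhs = lambda tau, u: [u[1], -2*zeta*u[1] - u[0]]
    return solve_ivp(rhs, (0, 20), [1.0, 0.0], dense_output=True)

tau = np.linspace(0, 20, 400)
for zeta in (2.0, 1.0, 0.2):          # overdamped, critically damped, underdamped
    chi = damped_oscillator(zeta).sol(tau)[0]
    print(f"zeta = {zeta}: crosses zero (oscillates)? {bool(np.any(chi < -1e-6))}")
# only the underdamped case (zeta < 1) crosses through zero and oscillates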
Next we will exploit the relationship between higher-order ODEs and systems of first-order ODEs to further solidify this analogy. Through searching for eigenvalues of these systems, we will build intuition behind matrix exponential solutions, and massage these solutions into a more useful and meaningful form.
We then turn the discussion to finding particular solutions to non-homogeneous linear systems of ODEs with constant coefficients. We will see that the methods we employ for this are completely analogous to our approach in the sections above. Spoilers: we will discuss undetermined coefficients for these types of systems as well as everyone's favorite, variation of parameters.
Finally we will close the discussion of systems of ODEs with a brief big-picture story of non-linear systems, equilibria and steady states, the Jacobian matrix, and its corresponding eigenvalues. We will then chat about classifying these equilibria as sinks, sources, saddle points, degenerate cases, centers, or stable/unstable foci, at least locally.
Consider the scalar linear ODE
dx/dt = ax,
where a ∈ R, with initial data x(0) = x_0. We note that the solution is (almost trivially) given by the exponential function,
x(t) = e^{at} x_0.
Wouldn't it be nice if we could just as easily write the solution to an analogous system of first-order constant-coefficient homogeneous ODEs? Well, we can! See, mathematics is beautiful (*especially when it works out...). Consider the following system of N ODEs,
dx/dt = Ax,
with initial value x(0) = x_0, and where A ∈ R^{N×N}. The solution is
x(t) = e^{At} · x_0.
...but what does this mean? How does the exponential function handle matrices? Are we just blow-
ing smoke? Nah, not this time. When in doubt, let’s turn to Taylor Series, even though this isn’t
numerical analysis. Shh, we won’t tell.
Recall the Taylor Series for the exponential function of a single variable, at, with at ∈ R:
e^{at} = 1 + at + (at)^2/2! + (at)^3/3! + ... + (at)^N/N! + ... .
Well, as it turns out, we can use this idea to define the exponential of a matrix! Let's do some Taylor Series!
e^{At} = I + At + (At)^2/2! + (At)^3/3! + ... + (At)^N/N! + ... .
2! 3! N!
There we go - we know what the solution is! But is the above definition useful to us, or does it really give us much intuition about the behavior of the solutions of the linear ODE system? Unfortunately, in its current form it does not. However, this is where we put on our linear algebra tool belts and dive back into the world of matrix factorizations.
We note that if a matrix has an eigenvalue decomposition, i.e., each eigenvalue, λ, of a matrix, A,
has geometric multiplicity equal to its algebraic multiplicity, then we can factorize A as
A = V DV −1 ,
where D is a diagonal matrix containing the eigenvalues on the main diagonal and V is a matrix of eigenvectors, all of which are linearly independent (and hence V^{-1} exists). We can use this factorization to better understand what the matrix exponential does for us. Performing this substitution we obtain
e^{At} = e^{V D V^{-1} t}
= I + V D V^{-1} t + (V D V^{-1} t)^2/2! + (V D V^{-1} t)^3/3! + ...
= I + V D V^{-1} t + V D V^{-1} V D V^{-1} (t^2/2!) + V D V^{-1} V D V^{-1} V D V^{-1} (t^3/3!) + ...
= I + V D V^{-1} t + V D^2 V^{-1} (t^2/2!) + V D^3 V^{-1} (t^3/3!) + ...
= V V^{-1} + V D V^{-1} t + V D^2 V^{-1} (t^2/2!) + V D^3 V^{-1} (t^3/3!) + ...
= V [ I + Dt + (Dt)^2/2! + (Dt)^3/3! + ... ] V^{-1}
= V e^{Dt} V^{-1}.
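A sketch (assuming NumPy/SciPy) comparing e^{At} computed from the eigendecomposition V e^{Dt} V^{-1} against scipy.linalg.expm; the matrix A = [[1, 1], [0, 3]] is the diagonalizable example used again later in these notes, and t = 0.7 is an arbitrary sample time:

import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 1.0],
              [0.0, 3.0]])
t = 0.7

lam, V = np.linalg.eig(A)                        # eigenvalues and eigenvectors
expm_eig = V @ np.diag(np.exp(lam*t)) @ np.linalg.inv(V)
print(np.allclose(expm_eig, expm(A*t)))          # True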
Hence by diagonalizing the matrix A in its eigenvalue decomposition, we are able to see the
influence that the eigenvalues will have on the evolving system, but moreover, see the role the matrix
exponential takes in building a solution to a system of linear ODEs. Furthermore, we can massage
the above relation further to see
V e^{Dt} V^{-1} = V diag( e^{λ_1 t}, e^{λ_2 t}, ..., e^{λ_{n-1} t}, e^{λ_n t} ) V^{-1},
where diag(·) denotes the diagonal matrix with the indicated entries.
Moreover, if A is Hermitian (self-adjoint), the eigendecomposition of A leads to an orthogonal decomposition, e.g.,
A = Q D Q^*,
where D is the diagonal matrix of eigenvalues and Q is an orthogonal (unitary, in the complex case) matrix composed of the eigenvectors, all of which are mutually orthogonal. This result stems from the fact that Hermitian matrices have real eigenvalues and orthogonal eigenvectors. This then leads to the matrix exponential taking the form
e^{At} = Q diag( e^{λ_1 t}, ..., e^{λ_n t} ) Q^*.
We note that not every matrix has an eigenvalue decomposition; however, we can always write A in its Jordan form, e.g.,
A = P J P^{-1},
where P is a matrix of (generalized) eigenvectors, needed whenever the algebraic multiplicity of an eigenvalue is greater than its geometric multiplicity. Moreover J is the Jordan Normal Form, which can be written as a block-diagonal matrix of Jordan blocks, i.e.,
J = diag( J_1, ..., J_p ),
where each Jordan block J_k has the eigenvalue λ_k along its main diagonal and ones along its super-diagonal,
J_k = [ λ_k   1
              λ_k   ⋱
                    ⋱   1
                        λ_k ],
and the sizes of the Jordan blocks associated with λ_k sum to its algebraic multiplicity. We can substitute the Jordan decomposition into the matrix exponential and get a similar relation, i.e.,
e^{At} = P e^{Jt} P^{-1}.
However, e^{Jt} is no longer a diagonal matrix, but takes an upper-triangular form. The diagonal elements take the usual form
( e^{J_k t} )_{ii} = e^{λ_k t},
but the super-diagonal entries involve terms of the form t^m e^{λ_k t}/m!.
Also, even if we do not consider eigenvectors at all, we can use the matrix A's Schur decomposition, which every square matrix has, to rewrite A as
A = Q T Q^*,
where T is upper-triangular (with the eigenvalues of A along its main diagonal) and Q is a unitary matrix. Hence upon substituting this into the matrix exponential we get
e^{At} = Q e^{Tt} Q^*.
As with the Jordan decomposition, the elements along the main diagonal of e^{Tt} are simply the exponentiated eigenvalues, while the entries above the diagonal are more complicated combinations built from the off-diagonal entries of T.
Now that we have a more intuitive understanding of what the matrix exponential is, we set out to encourage understanding of why eigenvalues are the drivers behind a system's solutions. Recall the eigenvalue relation
Ax = λx.
The above relationship says that when our linear operator A acts on a certain vector, cleverly called an eigenvector, the result can be thought of as a scaling of that eigenvector. In other words, even though most of the time when a square matrix acts upon a vector the result is a vector aligned in a different direction than the original, when a square matrix acts upon an eigenvector the direction does not change; the result is a scalar multiple of the original eigenvector.
Cute, but how does this tie back into differential equations? Well, for one thing all those times we
were using the friendly exponential ansatz, y = eαt , to find a complementary solution for constant-
coefficient ODEs, we were actually searching for an eigenvalue of the operator, in this case, α. Let’s
take a closer look.
mẍ + bẋ + kx = 0,
with some initial conditions, x(0) = x_0 and ẋ(0) = ẋ_0. Now, as we hinted above, if m, b, k ∈ R, then we have a very nice constant coefficient ODE on our hands, which we can solve by assuming solutions of the form x = e^{αt}.
Substituting this ansatz gives (mα^2 + bα + k)e^{αt} = 0, where to find α we simply need to solve the quadratic. However, rather than go into those gory details, we will just expose the above quadratic equation for what it actually is: a characteristic equation for our eigenvalues, α! (Not α-factorial, I just got a little excited.) Hence the characteristic equation of our differential operator in this case is
mα^2 + bα + k = 0.
The story does not end there. We will now view this problem through a more traditional linear
operator feel in an effort to encourage more comfort and belief in this marriage of sorts. We also
get to explore a new technique in which we have not done yet - expressing a higher dimensional
ODE as a system of first-order ODEs!
To do this for our damped mass-spring system (dividing through by m, or simply taking m = 1 for convenience), we perform the change of variables
v = ẋ,   so that   v̇ = ẍ = -bv - kx,
which is popular among the physics-y folk! I guess it's that whole first-derivative-of-position-is-velocity thing; anyways, I digress. Now we see that we can rewrite the ODE as a linear system between x and v, e.g.,
d/dt [ x ]   [ ẋ ]   [  0    1 ] [ x ]
     [ v ] = [ v̇ ] = [ -k   -b ] [ v ].
By our claims about linear ODE systems, we know the solution should be
x(t) = V e^{Dt} V^{-1} · x_0 = V diag( e^{λ_1 t}, e^{λ_2 t} ) V^{-1} · x_0.
However, let's take a closer look at the eigenvalues λ_1 and λ_2, e.g., let's actually find their characteristic equation. Spoiler: it's going to be the same as the characteristic equation we found above for the higher-order damped mass-spring system. Doing the standard computation for eigenvalues, we obtain
det(λI - A) = det [ λ   -1  ]
                  [ k   λ+b ] = 0,
and hence
λ(λ + b) + k = λ^2 + bλ + k = 0.
Therefore we get the same characteristic polynomial whether we approach the problem as a higher-order ODE or as a linear ODE system! We care about eigenvalues! A quick numerical check of this equivalence is sketched below. Now that we've motivated how ODEs and Linear Algebra are so closely tied together via eigenvalues, we will shift gears and consider non-homogeneous linear systems of ODEs.
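A quick numerical check (assuming NumPy; the values b = 0.5 and k = 2 are just sample choices, with m = 1) that the first-order system's eigenvalues match the roots of the scalar characteristic polynomial λ^2 + bλ + k = 0:

import numpy as np

b, k = 0.5, 2.0
A = np.array([[0.0, 1.0],
              [-k,  -b ]])

print(np.sort_complex(np.linalg.eigvals(A)))
print(np.sort_complex(np.roots([1.0, b, k])))    # the same two values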
For a non-homogeneous linear system, dx/dt = Ax + f(t), the full solution again splits as
x = x_C + x_P,
with x(0) = x_0. We will now show how to find particular solutions using both undetermined coefficients and variation of parameters. We will consider the following example,
dx/dt = [ 1  1 ] x + [   t - 2   ]
        [ 0  3 ]     [ 6 sin(2t) ].
First we find the homogeneous solution. Of course, we can easily write it as x_C(t) = e^{At} · x_0; however, we will do the rightful thing and compute A's eigendecomposition. Finding the eigenvalues, i.e., solving det(λI - A) = 0, gives us
λ = {1, 3}.
Looking to find the associated eigenvectors, we look at the null space of
(λI - A)v = [ λ-1   -1  ] v = 0   ⇒   v = [ 1 ],  [ 1 ],
            [  0   λ-3 ]                  [ 0 ]   [ 2 ]
and hence
x_C(t) = e^{At} · x_0 = V e^{Dt} V^{-1} · x_0 = [ 1  1 ] [ e^t    0     ] [ 1  -1/2 ] · x_0.
                                                [ 0  2 ] [ 0    e^{3t} ] [ 0   1/2 ]
We now look to find the particular solution of this linear system ODEs using two different approaches
- undetermined coefficients, which will work for special cases of the non-homogeneous term, and vari-
ation of parameters, which will work a bit more generally.
To do this we will consider a non-homogeneous problem where f (t) takes the form of one of the
functions (or products thereof) that were associated with the method of undetermined coeffi-
cients from one dimensional equations. In the exact same vein as we approached this situation
before using undetermined coefficients, we will make an educated guess of what the particular
should be, e.g., one that resembles the form the non-homogeneous term has taken, plug it into
the linear ODE system, and see if we can match coefficients.
The only difference now is that we are guessing a particular solution in the form of a vector; however, this does not require any more theory, possibly just more paper and pencil, as we will see. Each component of our educated-guess-of-particular-solution-vector, or EGOPSV, just kidding, that will not and should not catch on, will essentially mirror the form of the associated non-homogeneous term. Let's take a closer look at our example, before I try any more attempts at hilarity.
Looking at each component, we will guess the first component of our particular solution to be x_{P1}(t) = a_1 t + b_1 + c_1 sin(2t) + d_1 cos(2t), while the second component takes the similar form x_{P2}(t) = a_2 t + b_2 + c_2 sin(2t) + d_2 cos(2t), where x_P = (x_{P1}(t), x_{P2}(t))^T. Even though each component of the non-homogeneous term has either a linear or a sinusoidal form, we need to guess the full linear combination for each component of our particular solution. We are again left to simply figure out the undetermined coefficients (heyyo!). Let's substitute this into our linear system of ODEs in an attempt to find the unknowns {a_1, b_1, c_1, d_1, a_2, b_2, c_2, d_2}.
Next we will do the substituting, in which we expect to extract a system of 8 equations for our
8 unknowns. Let’s check it out!
dx_P/dt = A x_P + f(t)
becomes, component by component,
[ a_1 + 2c_1 cos(2t) - 2d_1 sin(2t) ]   [ (a_1 t + b_1 + c_1 sin(2t) + d_1 cos(2t)) + (a_2 t + b_2 + c_2 sin(2t) + d_2 cos(2t)) + (t - 2) ]
[ a_2 + 2c_2 cos(2t) - 2d_2 sin(2t) ] = [ 3(a_2 t + b_2 + c_2 sin(2t) + d_2 cos(2t)) + 6 sin(2t)                                          ]
                                      = [ (a_1 + a_2 + 1)t + (b_1 + b_2 - 2) + (c_1 + c_2) sin(2t) + (d_1 + d_2) cos(2t) ]
                                        [ 3a_2 t + 3b_2 + (3c_2 + 6) sin(2t) + 3d_2 cos(2t)                              ].
Matching the coefficients of each function in each component yields the system
a1 − b1 − b2 = −2
a1 + a2 = −1
2c1 − d1 − d2 = 0
c1 + c2 + 2d1 = 0
a2 − 3b2 = 0
a2 = 0
3c2 + 2d2 = −6
2c2 − 3d2 = 0.
With enough patience and persistence by analytic methods (or by a billionth-of-a-second MATLAB computation) you can find the coefficients to be
( a_1, b_1, c_1, d_1, a_2, b_2, c_2, d_2 ) = ( -1, 1, -6/65, 48/65, 0, 0, -18/13, -12/13 ),
and hence
x_P(t) = [ 1 - t - (6/65) sin(2t) + (48/65) cos(2t) ]
         [ -(18/13) sin(2t) - (12/13) cos(2t)       ].
Fun stuff. However, if our non-homogeneous term isn't as friendly (in the rigorous mathematical sense, a.k.a. “undesirable to do by the above scheme”), we can use the analog of variation of parameters for systems! Hint: it is virtually the same process!
Recall that when we developed the variation of parameters formalism in one dimension, we assumed a particular solution that was a combination of the fundamental solutions, i.e., the linearly independent homogeneous solutions, each multiplied by some unknown function of the independent variable, e.g.,
y_P(t) = v_1(t)y_1(t) + v_2(t)y_2(t),
where {v_1(t), v_2(t)} are unknown and {y_1(t), y_2(t)} are the fundamental solutions. This assumption is spawned from the reduction-of-order vein. Furthermore, we also made the assumption that
v_1'(t)y_1(t) + v_2'(t)y_2(t) = 0,
so that along the avenue to the standard variation of parameters formulae we had enough equations and unknowns. However, when deriving a formalism for variation of parameters for systems of linear ODEs, we experience a natural extension of this reduction-of-order idea to systems.
For linear systems of ODEs, we will assume the particular solution takes the natural form
x_P(t) = X(t)v(t),
where X(t) is the matrix of fundamental solutions and v(t) is an unknown vector function of the independent variable t. Next we do what feels right: plug it into the ODE to find conditions on the identity of v(t). Substituting this ansatz, we find
dx_P/dt - Ax_P = f(t)
(dX/dt) v + X (dv/dt) - AXv = f.
Since dX/dt = AX (each column of X solves the homogeneous system), the first and third terms cancel, leaving X dv/dt = f, so that
v(t) = \int X^{-1}(t) f(t) dt.
For our example, the fundamental matrix and its inverse are
X(t) = [ e^t    e^{3t}  ]        X^{-1}(t) = [ e^{-t}   -e^{-t}/2 ]
       [ 0     2e^{3t}  ],                   [ 0         e^{-3t}/2 ].
Therefore using the variation of parameters formula we’ve just sketched, we find
x_P(t) = X(t) \int X^{-1}(t) f(t) dt
= [ e^t    e^{3t}  ] [ (1/5) e^{-t} ( 5(1 - t) + 3 sin(2t) + 6 cos(2t) ) ]
  [ 0     2e^{3t}  ] [ -(3/13) e^{-3t} ( 3 sin(2t) + 2 cos(2t) )         ]
= [ 1 - t + (3/5) sin(2t) + (6/5) cos(2t) - (9/13) sin(2t) - (6/13) cos(2t) ]
  [ -(18/13) sin(2t) - (12/13) cos(2t)                                      ]
= [ 1 - t - (6/65) sin(2t) + (48/65) cos(2t) ]
  [ -(18/13) sin(2t) - (12/13) cos(2t)       ],
and hence we find that we arrive at the same solution using variation of parameters as we did
using undetermined coefficients (duh!).
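A sketch (assuming SymPy) of variation of parameters for the system above, x' = Ax + f(t) with A = [[1, 1], [0, 3]] and f = (t - 2, 6 sin 2t)^T:

import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[1, 1], [0, 3]])
f = sp.Matrix([t - 2, 6*sp.sin(2*t)])

X  = sp.Matrix([[sp.exp(t), sp.exp(3*t)], [0, 2*sp.exp(3*t)]])   # fundamental matrix
v  = (X.inv() * f).integrate(t)                                  # v(t) = int X^{-1} f dt
xP = (X * v).applyfunc(sp.simplify)
print(xP.T)                                                      # matches the result above
print((xP.diff(t) - A*xP - f).applyfunc(sp.simplify).T)          # zero vector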
Now that we’ve spent a lot of time building up the theory of linear systems of ODEs, we again
will shift into higher gears and begin considering non-linear systems of ODEs. For these systems, as
in the one-dimensional case, we will not be able to solve them generally, but instead will discuss tools
one can use to analyze these systems, to build intuition about how the underlying solutions should
behave. Let’s dance.
different; however, we can at least set up a formalism to best categorize the different kinds of solutions
the system will exhibit. Luckily, we have built up most of the machinery we need to do this - it’s time
to go back and revisit our good friends, eigenvalues.
“Eigenvalues, you say?” Yes. You may be wondering where eigenvalues come into the picture since we have a non-linear system. Welp, we need to make a priority phone call before putting on our eigen-analysis toolbelt; we have to go all the way back to Calculus 1 and remember linearization. Note, if you don't have good memories of Calculus 1 and would prefer reminiscing about your Calculus 2 days, just think of this as the first few terms of a Taylor Series, e.g.,
f(x) = f(x_0) + (x - x_0)f'(x_0) + O( (x - x_0)^2 ) ≈ f(x_0) + (x - x_0)f'(x_0).
If we view the above linearization as a contraction mapping, the f'(x) term tells us whether or not our map converges or diverges. In a similar vein, we will again care about this term; however, we must first linearize our non-linear system. Before we get into that business, we must first talk about equilibria of the system. An equilibrium is a point where the system does not change in time, e.g., where dx/dt = 0. Hence to find the equilibrium points of a system, we need to find all the roots of
dx/dt = F(x, t) = 0.
Once we find all the equilibrium points, we next try to classify each equilibrium. This is where our eigenvalue analysis comes in, and in doing so gives insight into what type of solutions the system will exhibit. Finding the equilibrium points is not always a trivial process; however, once we find the equilibrium points we can then jump onto the linearization bandwagon. In linearizing our non-linear system, we will center our linearizations about equilibria exclusively. Otherwise, we would be up a creek.
It's time to linearize! In performing this analysis, we will assume x_0 is an equilibrium point of our dynamical system. We now proceed with computing the Taylor Series, mostly to show the general form before picking out the linear components,
F(x, t) = F(x_0, t) + ∇F|_{x=x_0} (x - x_0) + (1/2)(x - x_0)^T ΔF|_{x=x_0} (x - x_0) + O( |x - x_0|^3 )
= F(x_0, t) + J(x_0)(x - x_0) + (1/2)(x - x_0)^T H(x_0)(x - x_0) + O( |x - x_0|^3 ).
Note that the matrix J is referred to as the Jacobian matrix; it is the N×N matrix of first partial derivatives,
J(x) = ∇F(x) = [ ∂F_i/∂x_j ],   i, j = 1, ..., N.
The other term, H, is referred to as the Hessian matrix. It can be thought of as the Jacobian of
a Jacobian matrix, e.g., H(f )(x) = J(∇f )(x). More intuitively, it is the matrix of all second-order
partial derivatives. Each component of the Hessian takes the form,
∂2f
Hij = .
∂xi ∂xj
Moreover, the Hessian is symmetric by Clairaut's Theorem, which states that the order of differentiation does not matter as long as the second partial derivatives are continuous. In general, the Hessian is a scary and not-so-much-fun thing to compute.
Now that we have our linearization, it is time to discuss what it does for us. What we essentially
have done is say near an equilibrium point, x0 , our dynamical system takes the approximate form,
dx/dt ≈ J(x_0)(x - x_0),
since F(x_0, t) = 0 by definition of an equilibrium. Although this analysis will not work in a global sense,
at least locally near an equilibrium point we will be able to make claims on the types of solutions
allowed. Again, this is where we have to pick up our eigenvalue hammers. We will state their impor-
tance now!
The eigenvalues of the Jacobian matrix will help us classify the equilibrium point, and moreover
the solutions in a local neighborhood of it.
We will now push onwards and attempt to classify the equilibria, and upon doing so shed insight on how solutions near enough to the equilibria behave. We will discuss the following 6 cases,
1. Sink Equilibrium (Stable)
2. Source Equilibrium (Unstable)
3. Saddle-Point Equilibrium
4. Degenerate Cases
5. Centers
6. Stable and Unstable Foci.
Each of these cases will be illustrated through the use of a dynamical system in R2 , where we will
assume the eigenvalues of the system’s Jacobian matrix to be λ1 , λ2 .
Sink Equilibrium
In this case we assume two real eigenvalues with λ_1 < 0 and λ_2 < 0. Envisioning a solution to the linearized problem, any trajectory within a neighborhood of the equilibrium points towards the equilibrium, so all nearby trajectories flow into it. This equilibrium is known as a sink and is deemed stable; see Figure (2)(a).
Figure 2: Various classifications of equilibrium based on eigenvalues of the Jacobian Matrix. (a) Sink
(stable) Equilibrium (b) Source (unstable) Equilibrium (c) Saddle-Point Equilibrium (d) Degenerate
Case 1 (e) Degenerate Case 2 (f) Center (g) Stable Foci (h) Unstable Foci
Source Equilibrium
In this case we assume two real eigenvalues with λ_1 > 0 and λ_2 > 0. Performing the same thought experiment as above, upon envisioning a solution to the linearized problem, we find that any trajectory within a neighborhood of the equilibrium points away from it. Therefore this equilibrium is known as a source, as all trajectories point away from the equilibrium. It is also deemed unstable.
Saddle-Point Equilibrium
In this case we assume two real eigenvalues, such that λ1 > 0, λ2 < 0, hence one is positive and
one is negative.
This case is called a saddle-point equilibrium because some trajectories point towards the equilibrium, while others point away. This is similar to the idea of a saddle point in multivariate calculus: just as a saddle point there is not strictly a local maximum or minimum, this equilibrium cannot be deemed either stable or unstable.
Degenerate Cases
In this case we assume two real eigenvalues such that λ_1 ≠ 0 and λ_2 = 0, hence one is strictly zero and the other is real and non-zero. In these types of degenerate cases, we will see bifurcated behavior.
For the case where λ_1 > 0, we expect solutions to act in the manner seen in Figure (2)(d), while in the case where λ_1 < 0, the system has the feel of what is depicted in Figure (2)(e).
This type of behavior may be exhibited when radical structural differences occur in the trajectories as system parameters are perturbed.
Centers
In this case we assume two purely imaginary eigenvalues, such that λ1 = bi, λ2 = −bi.
Each trajectory around the equilibrium x_0 is closed and circular in nature. This can be seen in Figure (2)(f).
We also note that if the Jacobian is skew-symmetric at x = x_0, then the equilibrium will always be a center, since real skew-symmetric matrices have purely imaginary eigenvalues. Let's actually do a quick proof.
Proof: To show these trajectories are always circular, we begin with the relation that if A = -A^T (real and skew-symmetric), then the quadratic form x^T A x = 0. To see this, note that x^T A x is a scalar, so
x^T A x = (x^T A x)^T = x^T A^T x = -x^T A x   ⇒   x^T A x = 0.
Now that we have that relation, we consider the linearization, or really just a linear system of ODEs,
dx/dt = Ax.
Take A = -A^T and multiply the above by x^T,
x^T · dx/dt = x^T A x = 0.
Therefore we have
x^T · dx/dt = (1/2) d/dt (x · x) = (1/2) d/dt ||x||^2 = 0.
From the above we get that ||x|| is constant, so all the orbits must be circular.
Stable and Unstable Foci
In this case we assume two complex eigenvalues, λ_1 = a + bi and λ_2 = a - bi, with b ≠ 0. Depending on the sign of a, the system will either spiral towards or away from the equilibrium point. In both cases it will oscillate about the equilibrium with angular frequency b. Thinking of the solutions to the linearized problem, we note the stability is controlled by whether the solutions decay, i.e., a < 0, or grow uncontrollably, i.e., a > 0.
For the case when a < 0, the trajectories spiral towards the equilibrium. We call the equilibrium a stable focus. Convergence towards the equilibrium point is clear since the real part of the eigenvalue is negative; as in the sink case, which also has negative real parts, we expect trajectories to flow towards the equilibrium. This can be seen in Figure (2)(g).
For the case when a > 0, the trajectories spiral away from the equilibrium. We call the equilibrium an unstable focus. It is unstable since the real part of the eigenvalue is positive, and hence solutions flow away from the equilibrium, analogous to the source case. This can be seen in Figure (2)(h).
In a neighborhood of an equilibrium, we expect the solutions to behave like one of the above cases. A short computational sketch of this classification follows. Remember, this is all local analysis, and at this juncture we cannot say anything about global solution behavior... yet.
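A small sketch (assuming SymPy; the two-species system used here is hypothetical, chosen only for illustration) classifying equilibria from the eigenvalues of the Jacobian:

import sympy as sp

x, y = sp.symbols('x y')
# hypothetical system:  x' = x(1 - x) - x*y,   y' = y(x - 1/2)
F = sp.Matrix([x*(1 - x) - x*y, y*(x - sp.Rational(1, 2))])
J = F.jacobian([x, y])

for eq in sp.solve(list(F), [x, y], dict=True):
    lams = list(J.subs(eq).eigenvals().keys())
    re = [sp.re(l) for l in lams]
    kind = ('sink / stable focus' if all(r < 0 for r in re) else
            'source / unstable focus' if all(r > 0 for r in re) else
            'saddle / degenerate / center')
    print(eq, lams, kind)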
3 Sturm-Liouville Theory and Orthogonal Expansions!
Sturm-Liouville theory regards second-order linear differential equations of the form
d/dx [ p(x) dy/dx ] + q(x)y = -λ w(x) y,     (13)
where p(x), q(x), and w(x) are all continuous functions on the particular region of interest, say [a, b]. We also note that w(x) > 0 and is referred to as the “weight” function, and that we assume p(x) ∈ C^n with n ≥ 1.
We further note that in this formalism we consider the solution y(x) to obey boundary conditions on [a, b]. Moreover, λ is an eigenvalue of our Sturm-Liouville operator and is chosen to give non-trivial solutions of (13) which satisfy the boundary conditions.
What makes Sturm-Liouville theory interesting and of particular use in applied mathematics is the fact that the linear differential operator given by (13) is Hermitian on an appropriate function space for specific boundary conditions. What is special about Hermitian operators is that their eigenfunctions are orthogonal (sometimes with the assistance of a weight function, i.e., w(x)) and form a complete basis of the function space. Recall how Hermitian operators work in Linear Algebra: they lead to a basis of orthogonal eigenvectors, each with a real associated eigenvalue.
Theorem 3.1 If a Sturm-Liouville problem is regular, i.e., if p(x), p'(x), q(x), and w(x) are all continuous on [a, b] and the boundary conditions can be written as
α_1 y(a) + α_2 y'(a) = 0   and   β_1 y(b) + β_2 y'(b) = 0   (with α_1, α_2 not both zero and β_1, β_2 not both zero),
then the eigenvalues of the Sturm-Liouville problem are real and can be ordered such that
λ_1 < λ_2 < ... < λ_n < ... → ∞.
Furthermore, with each eigenvalue λ_j there is a unique (up to normalization) eigenfunction φ_j(x), which has exactly (j - 1) roots in (a, b), and the eigenfunctions satisfy the orthogonality property
\int_a^b φ_j(x) φ_k(x) w(x) dx = 1 if j = k, and 0 if j ≠ k.
Example: Consider the boundary value problem
y'' + λy = 0,     (16)
y(0) = 0,     (17)
y(1) = 0.     (18)
We consider the three possible cases for λ:
1. λ = 0
2. λ = -α^2 < 0
3. λ = α^2 > 0.
Case 1 (λ = 0): the equation reduces to y'' = 0, so y = c_1 x + c_2. The boundary conditions force c_2 = 0 and then c_1 = 0, giving only the trivial solution.
Case 2 (λ = -α^2 < 0): the equation is y'' - α^2 y = 0, with general solution y = c_1 sinh(αx) + c_2 cosh(αx). Then
y(0) = 0 ⇒ c_2 = 0
y(1) = 0 ⇒ c_1 sinh(α) = 0 ⇒ c_1 = 0,
so again only the trivial solution.
Case 3 (λ = α^2 > 0): the equation is y'' + α^2 y = 0, with general solution y = c_1 sin(αx) + c_2 cos(αx). Then
y(0) = 0 ⇒ c_2 = 0
y(1) = 0 ⇒ c_1 sin(α) = 0 ⇒ α = nπ, where n ∈ Z^+.
Hence the eigenvalues are λ_n = n^2 π^2 with eigenfunctions φ_n(x) = sin(nπx); see the short orthogonality check below.
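A short check (assuming SymPy) of the orthogonality of the eigenfunctions sin(nπx) on [0, 1] found above:

import sympy as sp

x = sp.symbols('x')
inner = lambda n, m: sp.integrate(sp.sin(n*sp.pi*x)*sp.sin(m*sp.pi*x), (x, 0, 1))

print(inner(2, 5))   # 0 for n != m
print(inner(3, 3))   # 1/2 for n == m != 0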
To verify orthogonality of these eigenfunctions, recall the product-to-sum identity
sin(a) sin(b) = (1/2)[ cos(a - b) - cos(a + b) ].
Then, for n ≠ m,
\int_0^1 sin(nπx) sin(mπx) dx = \int_0^1 (1/2)[ cos((n - m)πx) - cos((n + m)πx) ] dx
= (1/2)[ (1/((n - m)π)) sin((n - m)πx) - (1/((n + m)π)) sin((n + m)πx) ]_0^1
= 0.
We note that the case n = m must be considered by itself, to avoid a singularity in the above integration. Moreover, the case n = m = 0 is discarded since 0 is not an eigenvalue of the differential operator. Now we show the case n = m ≠ 0 leads to a non-zero result:
\int_0^1 sin^2(nπx) dx = (1/2) \int_0^1 [ 1 - cos(2nπx) ] dx
= (1/2)[ x - (1/(2nπ)) sin(2nπx) ]_0^1
= 1/2.
More generally, on [-L, L] we have the orthogonality relations
\int_{-L}^{L} cos(nπx/L) cos(mπx/L) dx = 2L (n = m = 0),   L (n = m > 0),   0 (n ≠ m),     (20)
\int_{-L}^{L} sin(nπx/L) sin(mπx/L) dx = L (n = m > 0),   0 (n ≠ m),     (21)
\int_{-L}^{L} sin(nπx/L) cos(mπx/L) dx = 0.     (22)
To compute the coefficients, we begin with (19) and multiply by either eigenfunction. First let's consider the contributions of the sin(nπx/L) eigenfunctions in the expansion of f(x). We begin with
f(x) = \sum_{n=0}^{∞} a_n cos(nπx/L) + \sum_{n=1}^{∞} b_n sin(nπx/L),
and multiply both sides by sin(mπx/L). Next we integrate both sides over the interval of interest, e.g.,
\int_{-L}^{L} f(x) sin(mπx/L) dx = \int_{-L}^{L} [ \sum_{n=0}^{∞} a_n cos(nπx/L) + \sum_{n=1}^{∞} b_n sin(nπx/L) ] sin(mπx/L) dx.
By the orthogonality relations above, the only term that survives on the right-hand side is the sine term with n = m, i.e.,
\int_{-L}^{L} f(x) sin(mπx/L) dx = b_m \int_{-L}^{L} sin^2(mπx/L) dx.
Hence, since b_m is a constant, we can solve for it to get
b_m = \int_{-L}^{L} f(x) sin(mπx/L) dx / \int_{-L}^{L} sin^2(mπx/L) dx.
Therefore we now have a formula to find all the coefficients {b_j}, j = 1, 2, .... Using a similar approach and employing orthogonality once again, we can find a formula for the {a_j}, j = 0, 1, ..., coefficients as well, i.e.,
a_m = \int_{-L}^{L} f(x) cos(mπx/L) dx / \int_{-L}^{L} cos^2(mπx/L) dx.
After computing all those integrals, we will have our expansion of f(x) on [-L, L] in all its Fourier Series glory!
Example: Let's expand f(x) = x on [-π, π] (here L = π). For the sine coefficients,
b_m = \int_{-π}^{π} x sin(mx) dx / \int_{-π}^{π} sin^2(mx) dx
= [ (-x/m) cos(mx) |_{-π}^{π} + (1/m) \int_{-π}^{π} cos(mx) dx ] / π
= ( -(2π/m) cos(mπ) ) / π
= (2/m)(-1)^{m+1}.
Similarly, we consider the integrations to find {a_m}. For m > 0,
a_m = \int_{-π}^{π} x cos(mx) dx / \int_{-π}^{π} cos^2(mx) dx
= [ (x/m) sin(mx) |_{-π}^{π} - (1/m) \int_{-π}^{π} sin(mx) dx ] / π
= 0,
and the m = 0 coefficient also vanishes since x is odd on [-π, π].
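A quick numerical sketch (assuming NumPy; the evaluation point x = 1.0 is an arbitrary sample) of the truncated Fourier series of f(x) = x, mirroring the truncated expansions shown in Figure 3:

import numpy as np

def fourier_partial_sum(x, N):
    # a_m = 0; only the sine terms 2*(-1)**(m+1)/m * sin(m*x) contribute
    return sum(2*(-1)**(m + 1)/m * np.sin(m*x) for m in range(1, N + 1))

x0 = 1.0
for N in (5, 50, 500):
    print(N, fourier_partial_sum(x0, N))   # approaches f(1.0) = 1.0 as N grows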
Beyond Fourier Series there are many “popular” orthogonal polynomials, which are all eigenfunc-
tions of some Sturm-Liouville differential operator. Just for some buzzwords, the usual candidates
tend to be:
• Bessel Functions
• Chebyshev Polynomials
• Legendre Polynomials
• Laguerre Polynomials
• Hermite Polynomials
More generally, if {φj (x)} are eigenfunctions of a Sturm-Liouville problem, and we wish to expand
some function f (x) on an interval [a, b] in terms of them, i.e.,
f(x) = \sum_{j=0}^{∞} c_j φ̃_j(x),
Figure 3: The function f (x) = x being expressed as Fourier Series expansions with differing numbers
of truncated terms.
the brunt of the work still lies in calculating the coefficients {cj }. We note that φ̃j (x) are the
eigenfunctions, φj (x), just translated into the interval [a, b]. We can do the same procedure as we did
to find the Fourier Series coefficients. Let’s party!
f(x) = \sum_{m=0}^{∞} c_m φ̃_m(x)
f(x) φ̃_n(x) = \sum_{m=0}^{∞} c_m φ̃_m(x) φ̃_n(x)
f(x) φ̃_n(x) w̃(x) = \sum_{m=0}^{∞} c_m φ̃_m(x) φ̃_n(x) w̃(x)
\int_a^b f(x) φ̃_n(x) w̃(x) dx = \int_a^b \sum_{m=0}^{∞} c_m φ̃_m(x) φ̃_n(x) w̃(x) dx
\int_a^b f(x) φ̃_n(x) w̃(x) dx = \sum_{m=0}^{∞} c_m \int_a^b φ̃_m(x) φ̃_n(x) w̃(x) dx
\int_a^b f(x) φ̃_n(x) w̃(x) dx = c_n \int_a^b φ̃_n^2(x) w̃(x) dx,
and hence
c_n = \int_a^b f(x) φ̃_n(x) w̃(x) dx / \int_a^b φ̃_n^2(x) w̃(x) dx.
Now in doing the above we will employ the services of Fourier Series to find the particular solution,
uP (x). We note that if g(x) is a fairly simple function, e.g., a polynomial, sine, cosine, or exponential,
or even products of them, finding the particular solution using undetermined coefficients is probably
the road more traveled (and it should be). If g(x) is not a simple function like those, you can try
to use variation of parameters; however, if you do not know the complementary solutions, this could
prove rather difficult. Now if g(x) is periodic with period 2L, this is where Fourier Series can come
in!
We wish to write our solution u_P(x) as a Fourier Series,
u_P(x) = \sum_{j=0}^{∞} a_j cos(jπx/L) + \sum_{j=1}^{∞} b_j sin(jπx/L).
However, first we must also expand g(x) in its Fourier Series, e.g.,
g(x) = \sum_{j=0}^{∞} c_j cos(jπx/L) + \sum_{j=1}^{∞} d_j sin(jπx/L);
however, since g(x) is a known function, we can find the coefficients {c_j} and {d_j} explicitly using the methods above. We now have a linear differential equation of the form
L̂u = \sum_{j=0}^{∞} c_j cos(jπx/L) + \sum_{j=1}^{∞} d_j sin(jπx/L).
Substituting our solution ansatz (the Fourier Series for u_P(x)) into the equation above, we get
L̂u = g(x)
L̂ [ \sum_{j=0}^{∞} a_j cos(jπx/L) + \sum_{j=1}^{∞} b_j sin(jπx/L) ] = \sum_{j=0}^{∞} c_j cos(jπx/L) + \sum_{j=1}^{∞} d_j sin(jπx/L)
\sum_{j=0}^{∞} a_j L̂ cos(jπx/L) + \sum_{j=1}^{∞} b_j L̂ sin(jπx/L) = \sum_{j=0}^{∞} c_j cos(jπx/L) + \sum_{j=1}^{∞} d_j sin(jπx/L).
What is left is to write a system of equations for all the {a_j} and {b_j} coefficients in terms of {c_j} and {d_j}. Note that if the damping term is not present in the ODE, i.e., β = 0, there is the possibility of resonance, so the corresponding frequency term must include a polynomial-type coefficient, i.e., a factor of x, as well.
Furthermore, we note that if we have Dirichlet boundary conditions we will use a Fourier Sine Series for both g(x) and u_P(x); if Neumann, we will use the Fourier Cosine Series for them.
Before we proceed with Gram-Schmidt for orthogonal polynomials, we first recall the Gram-Schmidt process for orthogonalizing vectors in R^n. Once we have reviewed this, the leap to the Gram-Schmidt process for infinite-dimensional function spaces will feel like a natural extension.
3.4.1 Gram-Schmidt for vectors in Rn
Consider a set of m-vectors in Rn , with m ≤ n, {v1 , v2 , v3 , . . . , vm }. In general when handed a set of
vectors, they will not be orthogonal; however, we will employ the services of Gram-Schmidt to help
us achieve this feat. In order for the Gram-Schmidt process to achieve a set of m-orthogonal vectors
from a collection of vectors, all the vectors must be linearly independent. We note that the span of
the original vector space and that of the orthogonalized vector space are equivalent.
To orthogonalize a collection of vectors, we begin with a single vector, say q1 = v1. Next we
proceed to find a vector orthogonal to q1 by taking v2 and subtracting off the projection of v2 onto
q1. To find the next orthogonal vector, we repeat this process and take v3 and subtract off the
projections of v3 onto q1 and q2. This process is illustrated below.
q1 = v1
q2 = v2 − projq1 v2
q3 = v3 − projq1 v3 − projq2 v3
q4 = v4 − projq1 v4 − projq2 v4 − projq3 v4
..
.
q_m = v_m − Σ_{k=1}^{m−1} proj_{q_k} v_m.
We now recall the strict definition of a vector projection. The projection of a vector v onto a
vector u is defined as
proj_u v = ( <v, u> / <u, u> ) u,

where < ·, · > is an inner-product.
Hence we can rewrite the Gram-Schmidt process in terms of inner-products, e.g.,
q1 = v1
q2 = v2 − ( <v2, q1> / <q1, q1> ) q1
q3 = v3 − ( <v3, q1> / <q1, q1> ) q1 − ( <v3, q2> / <q2, q2> ) q2
q4 = v4 − ( <v4, q1> / <q1, q1> ) q1 − ( <v4, q2> / <q2, q2> ) q2 − ( <v4, q3> / <q3, q3> ) q3
...
q_m = v_m − Σ_{k=1}^{m−1} ( <v_m, q_k> / <q_k, q_k> ) q_k.
The last thing we can do is normalize these newly found orthogonal vectors, {q1 , q2 , . . . , qm },
using the standard normalization procedure, e.g.,
{q̂1, q̂2, . . . , q̂m} = { q1/||q1||, q2/||q2||, . . . , qm/||qm|| }.
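The recipe translates directly into a few lines of code. The sketch below (a minimal illustration, assuming NumPy; the sample vectors are arbitrary choices) orthogonalizes the columns of a matrix exactly as written above and then normalizes at the end:

import numpy as np

def gram_schmidt(V):
    # classical Gram-Schmidt on the (linearly independent) columns of V:
    # q_m = v_m - sum_k <v_m, q_k>/<q_k, q_k> q_k, followed by normalization
    Q = np.array(V, dtype=float)
    for m in range(Q.shape[1]):
        for k in range(m):
            Q[:, m] -= (V[:, m] @ Q[:, k]) / (Q[:, k] @ Q[:, k]) * Q[:, k]
    return Q / np.linalg.norm(Q, axis=0)

V = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
Q = gram_schmidt(V)
print(np.round(Q.T @ Q, 10))   # should print the 3x3 identity matrix

Note that the span of the columns of Q equals the span of the columns of V, as promised above.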
47
3.4.2 Gram-Schmidt for Orthogonal Polynomials!
We now shift gears slightly and look to the problem of creating a basis of orthogonal polynomials. We
recall that if two polynomials, say p_k and p_j, are orthogonal then we have

<p_k, p_j> = ∫_a^b p_k(x) p_j(x) w(x) dx = { 0, j ≠ k;   c ∈ R, j = k },
where w(x) is a weight-function associated with the specific orthogonal polynomials' behind-the-scenes
Sturm-Liouville problem. On that note, we point out that in order to easily extend Gram-Schmidt
from vectors in Rⁿ to an infinite-dimensional function space, we need to define an inner-product. We do
so accordingly,

<p_k, p_j> = ∫_a^b p_k(x) p_j(x) w(x) dx.
For constructing families of orthogonal polynomials, we will use the above inner-product and see
that any family of orthogonal polynomials is generated by changing the weight function, w(x), in the
above inner-product. For example, to construct the Legendre polynomial basis we use w(x) = 1, while
for Chebyshev polynomials the weight function is w(x) = 1/√(1 − x²). We also need the correct
integration bounds, i.e., for both Legendre and Chebyshev polynomials we need [a, b] = [−1, 1].
To construct the orthogonal basis of polynomials we proceed in a similar manner to the above,
except starting off with p0 = 1 as the zero-th order polynomial. To find p1 (x), the 1st -order orthogonal
polynomial, we subtract off the projection of the 1st -order monomial basis function onto p0 (x) from
x. We then continue in an analogous manner. The procedure can be seen as
p_0(x) = 1
p_1(x) = x − ( <x, p_0> / <p_0, p_0> ) p_0(x)
p_2(x) = x² − ( <x², p_0> / <p_0, p_0> ) p_0(x) − ( <x², p_1> / <p_1, p_1> ) p_1(x)
p_3(x) = x³ − ( <x³, p_0> / <p_0, p_0> ) p_0(x) − ( <x³, p_1> / <p_1, p_1> ) p_1(x) − ( <x³, p_2> / <p_2, p_2> ) p_2(x)
...
p_n(x) = xⁿ − Σ_{k=0}^{n−1} ( <xⁿ, p_k> / <p_k, p_k> ) p_k(x).
It is clear that doing this procedure by hand is definitely not optimal, especially if you need to find p_10(x),
or even p_3(x) for that matter. However, where this procedure becomes powerful is in the realm of
numerical quadrature, or numerical integration. In a nutshell, in the Gaussian quadrature formulation
of approximating integrals, one needs to (*gets to) choose both the quadrature points as well as the
weight coefficients. It turns out that one natural way of choosing the quadrature nodes is to construct
a specific family of orthogonal polynomials and then find the roots of the polynomial of the desired
order, e.g., if we wish to find 2 quadrature nodes, we would find the roots of p_2(x).
48
Since the nodes and weights can be computed a priori, one only needs to construct polynomials of
modest order. The main difference between the two numerical integration methods can be seen as
having 2N degrees of freedom in Gaussian Quadrature, while only N degrees of freedom in Newton-Cotes.
These extra degrees of freedom have the immediate consequence of eliminating lower order errors to
achieve higher order accuracy.
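As a small illustration of this connection, the following Python sketch (a minimal example using SymPy; the choice of w(x) = 1 on [−1, 1], i.e., the Legendre family, is an assumption made for the demonstration) builds p_0, p_1, p_2 by the Gram-Schmidt recursion above and then extracts the 2-point Gauss-Legendre nodes as the roots of p_2:

import sympy as sp

x = sp.symbols('x')

def inner(f, g, a=-1, b=1, w=1):
    # <f, g> = int_a^b f(x) g(x) w(x) dx
    return sp.integrate(f * g * w, (x, a, b))

p = [sp.Integer(1)]                      # p_0(x) = 1
for n in range(1, 3):
    pn = x**n - sum(inner(x**n, pk) / inner(pk, pk) * pk for pk in p)
    p.append(sp.expand(pn))

print(p)                                 # [1, x, x**2 - 1/3]: Legendre polynomials (up to scaling)
print(sp.solve(sp.Eq(p[2], 0), x))       # [-sqrt(3)/3, sqrt(3)/3]: the 2-point Gauss nodes

These two nodes, ±1/√3, together with equal weights integrate every cubic exactly on [−1, 1], which is the 2N-degrees-of-freedom advantage mentioned above.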
49
4 Green’s Function
A Green’s function, denoted G(x, s), is associated with specific linear differential operator, L̂. It is
the solution to
L̂G(x, s) = δ(x − s), (23)
where δ(x − s) is the Dirac-delta function. As applied mathematicians we wish to exploit properties
of the Green's function that make it particularly useful for solving non-homogeneous differential
equations, whether they are of the ordinary or partial varieties; e.g., Green's functions can be used
for finding solutions to equations of the form

L̂u(x) = f(x),

where u(x) is the solution we seek and L̂ is some differential operator. Specifically we will use
them to solve non-homogeneous boundary value problems (BVPs). We note that BVPs sometimes
do not have solutions, so we will illustrate the main concepts of Green's Functions using "nice" linear
differential operators.
We presume the operator's coefficient functions p(x) and q(x) are continuous on some interval of interest, say [a, b].
Naturally, we have some boundary conditions, written in the form

D̂u :   c1 u′(a) + c2 u(a) = 0,   c3 u′(b) + c4 u(b) = 0.     (26)
Note: we call the operator D̂, the boundary conditions operator. (very creatively, in fact).
We consider the following BVP,

L̂u = f(x),   D̂u = 0,     (27)

where f(x) is continuous on [a, b]. One can deduce there is only one solution to the above BVP, given by

u(x) = ∫_a^b G(x, s) f(s) ds.     (28)
The name of the game will then be constructing the appropriate Green’s function for the specific
linear differential operator. It is clear once we have the Green’s function, we can change the non-
homogeneous term in the BVP, i.e., f (x), and assuming our integration skills are up to it, find the
solution u(x) to our BVP.
The Green's function satisfies the following properties:
• For x ≠ s, L̂G(x, s) = 0.
• D̂G(x, s) = 0, i.e., G satisfies the boundary conditions in x for each fixed s ∈ (a, b).
50
• Symmetry Condition: G(x, s) = G(s, x).
• Jump Discontinuity: G′(s⁺, s) − G′(s⁻, s) = 1/p(s).
Note: Many of our friendly linear differential operators already have had their Green’s functions
found and studied extensively! Look them up and then have an integration party!
51
1. 0 ≤ x ≤ s
2. s ≤ x ≤ π/4
Now we must find the functional forms of c1 (s) and c2 (s). We will be able to do this by using
the fact we need G(x, s) to be continuous in both x and s, as well as the jump discontinuity in the
derivative of G(x, s).
First we will look at continuity.
Continuity : We need that G(x, s) has the same limit as x → s from below and x → s from above.
This gives us that
c2(s) sin(2s) = c1(s) cos(2s).     (34)

Jump: lim_{ε→0} [ dG/dx |_{x=s+ε} − dG/dx |_{x=s−ε} ] = 1/p(s).

We note that p(x) ≡ 1 for our toy problem. Applying the jump condition to G(x, s), we find that

−2 c1(s) sin(2s) − 2 c2(s) cos(2s) = 1.     (35)

Now solving (34) and (35) for c1(s) and c2(s), we find
c1(s) = −(1/2) sin(2s)     (36)
c2(s) = −(1/2) cos(2s),     (37)
and hence we find our Green's function to be:

G(x, s) = { −(1/2) cos(2s) sin(2x),  x ∈ [0, s];   −(1/2) sin(2s) cos(2x),  x ∈ [s, π/4] }.     (38)
4.4.1 Let's make sure we're on the same page: let f(x) = 50 e^{−0.3742}.
Hence we are looking for the particular solution of the above BVP with the constant forcing f(x) = 50 e^{−0.3742}.
52
Therefore we see

u(x) = ∫_0^{π/4} G(x, s) f(s) ds     (39)

     = ∫_0^x G(x, s)|_{x ≥ s} f(s) ds + ∫_x^{π/4} G(x, s)|_{x ≤ s} f(s) ds     (40)

     = ∫_0^x [ −(1/2) sin(2s) cos(2x) ] (50 e^{−0.3742}) ds + ∫_x^{π/4} [ −(1/2) cos(2s) sin(2x) ] (50 e^{−0.3742}) ds     (41)

     = −25 e^{−0.3742} [ cos(2x) ∫_0^x sin(2s) ds + sin(2x) ∫_x^{π/4} cos(2s) ds ]     (42)

     = −25 e^{−0.3742} [ (1/2) cos(2x) (1 − cos(2x)) + (1/2) sin(2x) (1 − sin(2x)) ]     (43)

     = −(25/2) e^{−0.3742} ( cos(2x) + sin(2x) − 1 )     (44)

     = (25/2) e^{−0.3742} ( 1 − sin(2x) − cos(2x) ).     (45)

(Note that this solution vanishes at both x = 0 and x = π/4.)
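A quick numerical sanity check of this calculation is easy to do. The sketch below (a minimal illustration, assuming NumPy and SciPy) evaluates u(x) = ∫_0^{π/4} G(x, s) f(s) ds directly by quadrature, using the Green's function (38), and compares it against the closed form (45):

import numpy as np
from scipy.integrate import quad

F = 50 * np.exp(-0.3742)                  # the constant forcing f(x)

def G(x, s):
    # Green's function from (38)
    if x <= s:
        return -0.5 * np.cos(2 * s) * np.sin(2 * x)
    return -0.5 * np.sin(2 * s) * np.cos(2 * x)

def u(x):
    # split the integral at s = x, where G has a kink
    left, _ = quad(lambda s: G(x, s) * F, 0, x)
    right, _ = quad(lambda s: G(x, s) * F, x, np.pi / 4)
    return left + right

for xv in (0.0, np.pi / 8, np.pi / 4):
    closed = 12.5 * np.exp(-0.3742) * (1 - np.sin(2 * xv) - np.cos(2 * xv))
    print(xv, u(xv), closed)              # the two columns should agree; both vanish at the endpoints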
53
5 Linear PDEs
In this section we will illustrate how to solve some linear partial differential equations, PDEs. Namely
we will discuss how to use
1. Classification of PDEs
2. Derivations of classical linear PDEs
3. Method of Characteristics
4. Separation of Variables
to solve various kinds of linear PDEs. Solutions to non-linear PDEs are not very friendly to the
ole paper and pencil formulations that we love, cherish, and admire. First we begin by classifying
equations. To do so, let’s state what we mean by linear PDEs! We will illustrate this using 2nd order
PDEs, as a lot of natural phenomena is successfully mathematically modeled using 2nd order PDE
operators, but that is a story for another day.
We consider a function u = u(x, y) of two independent variables and write the general form of a 2nd
order linear PDE as

A u_xx + B u_xy + C u_yy + D u_x + E u_y + F u = G,

where the coefficients may be functions of x and y.
Fortunately we are able to classify different types of PDEs, which give rise to solutions with
radically different behavior, using the coefficient functions {A, B, C, . . . , F }. Furthermore in the case
when {A, B, C, D, E, F } are constants, i.e., the PDE has constant coefficients, when we classify a
PDE there is no possibility of changing types spatially (or temporally). We classify them using the
following,
1. Hyperbolic: B² − 4AC > 0
2. Parabolic: B² − 4AC = 0
3. Elliptic: B² − 4AC < 0
In the case of 1st Order Systems of PDEs, we can classify the system based on properties of the
coefficient matrices, A and B, where our general equation is of the form,
Aut + Bux = c.
54
Depending on the invertibility of A or B, we may or may not be able to classify the system as
elliptic, hyperbolic, or parabolic. If A is non-singular (and B may or may not be), we consider the
generalized eigenvalue problem

det( B − λA ) = 0.

However, if the opposite situation arises, where B is known to be non-singular (and A may or may
not be), we consider the analogous problem

det( A − λB ) = 0.
We then have the following three cases, depending on what the eigenvalues are found to be:
Elliptic: No real eigenvalues ⇒ the eigenvalues are strictly complex with Im ≠ 0. (The number of equations must be even!)
Hyperbolic: N real eigenvalues that yield a full set of linearly independent eigenvectors.
Parabolic: N real eigenvalues, but they do not yield a full set of linearly independent eigenvectors.
Actually before proceeding further, let’s take that dive and derive some of those more popular 2nd
order PDEs that everyone is ranting and raving about, namely the wave equation, heat equation, and
Laplace’s equation.
Let’s start off with the wave equation! (and surf along from there...)
55
5.1 Quick derivations of the big three - elliptic, hyperbolic, and parabolic
PDEs.
In this section we will motivate the phenomena whose behavior the wave equation (hyperbolic) and heat
equation (parabolic) govern. Furthermore we will also briefly build up intuition
behind Laplace's Equation (elliptic); however, we will not derive it.
To derive the wave equation, we will need to make a few simplifying assumptions about the material
properties of the string and problem at hand. We will assume
1. The string is flexible and has no preferred shape
2. The string is homogeneous, i.e., the mass per unit length, ρ, is constant
3. The displacements are small compared to the length of the string, e.g., u(x, t) << L
4. From above, we get that this implies the slope of the string’s curvature is small at all points
x ∈ (0, L)
5. The string’s tension acts tangent to the string and magnitude is constant among all points on
the string
6. The tension defeats the force of gravity entirely, i.e., ignore it. It’s weak...unless this is the guitar
near the event horizon of a black hole.
7. Actually, let’s assume no external forces are present. No electromagnetic force. No nuclear
forces. No gravity. (This is still better than spherical cows.)
We consider a small interval on the string [xj , xj + ∆x]. The tensions on the end of this interval
are, of course, tangent to the ends of the curved string. We next proceed by computing the net vertical
force on the string, in hopes of employing Newton’s second law of motion to giving us our model.
Assuming small angular displacements, we compute the net vertical force acting on the element ∆s (it's small, I
swear!) and obtain

F_net ≈ T sin θ2 − T sin θ1 ≈ T ( tan θ2 − tan θ1 ) = T ( ∂u/∂x |_{(xj+∆x, t)} − ∂u/∂x |_{(xj, t)} ),

since ∂u/∂x |_{(xj+∆x, t)} = tan θ2 and ∂u/∂x |_{(xj, t)} = tan θ1 from slope relationships. We also note that, from
our assumptions above, T = |T1| = |T2|.
56
Now that we know all the tensile forces acting on the string, we use the rest of Newton’s second
law of motion to relate it to the string’s acceleration. We simply use the assumption that
ρ∆s ≈ ρ∆x,
to give us a fair approximation to the mass of the string on [xj, xj + ∆x]. The acceleration term
is then ρ∆s ∂²u/∂t² ≈ ρ∆x ∂²u/∂t². Hence Newton's second law gives us

ρ∆x ∂²u/∂t² = T ( ∂u/∂x |_{(xj+∆x, t)} − ∂u/∂x |_{(xj, t)} ).

Dividing both sides by ∆x and taking the limit ∆x → 0, we arrive at the wave equation,

∂²u/∂t² = a² ∂²u/∂x²,

where a² = T/ρ.
Uniqueness of Solutions: We can prove uniqueness of the Wave Equation by using Energy
Methods. We define the energy of the system to be
E(t) = ∫_U ( |w_t|² + |∇w|² ) dx.     (48)
Energy methods are a powerful tool for getting a lot of information about a PDE for next to no
work. We will see that the main gist of using these methods is to look at the time derivative of the
energy, while assuming there are two distinct solutions, to eventually arrive at a contradiction,
giving us uniqueness for just a few simple integrations by parts, typically. We will use these
methods for both the Heat Equation and Laplace's Equation that are coming up soon. Let's
take a gander at uniqueness for the wave equation!
Theorem 5.2 Let U ⊂ Rⁿ be open and bounded with smooth (or at least C¹) boundary ∂U. Then
there is at most one C² solution to

u_tt − ∆u = f  in (0, T] × U
u = g  on [0, T] × ∂U
u(0, x) = u0(x),  u_t(0, x) = u1(x).
Proof 5.1 Let u, ũ be two non-identical solutions to the above non-homogeneous wave equation.
We will define w = u − ũ, giving us the homogeneous equation
wtt − ∆w = 0 in (0, T ] × U
w = 0 on[0, T ] × ∂U
w(0, x) = w0 (x) = 0, wt (0, x) = w1 (x) = 0.
57
We then look at the energy of w,

E(t) = ∫_U ( |w_t|² + |∇w|² ) dx,

and differentiate in time:

dE/dt = 2Re ∫_U ( w_tt w̄_t + ∇w · ∇w̄_t ) dx
      = 2Re ∫_U ( w_tt w̄_t − ∆w w̄_t ) dx + 2Re ∫_{∂U} (∂_ν w) w̄_t dS,

by integrating by parts. Note that ∂_ν is the normal derivative. Recall our boundary condition,
that w = 0 on ∂U. This implies w_t = 0 on ∂U for all t, and hence we have that

Re ∫_{∂U} (∂_ν w) w̄_t dS = 0,

giving us

dE/dt = 2Re ∫_U ( w_tt − ∆w ) w̄_t dx = 0,

since w_tt − ∆w = 0. Therefore E(t) = E(0) = 0, because w_0 = w_1 = 0, and hence w_t = 0 and
∇w = 0 for all t. Therefore w is constant, and since w(0, x) = 0, we get w ≡ 0, implying that u = ũ.
58
4. The specific heat γ and thermal conductivity K of the material are constants (it gets more
interesting when they aren’t, though)
Now unlike the wave equation, where we simply used Newton’s second law of motion to derive our
model equation, we need to use two empirical laws in heat flow. These laws are very logical and may
not seem like they deserve the title of being an empirical law.
1. The rate of heat flow through a cross-section, A, is proportional to the cross-sectional area and
the partial derivative with respect to x of the temperature:

dQ/dt = −KA ∂u/∂x.
The minus sign is to ensure that heat flows from the higher to lower temperature direction.
2. The quantity of heat, Q, at point x, within an element of mass m is

Q = γ m u(x, t) = γ ρ A ∆x u,

where m = ρ(A∆x). We can now differentiate the above equation with respect to time to give
us a heat flow rate,

dQ/dt = γ ρ A ∆x ∂u/∂t.

Now when heat is flowing in the positive x direction, we see that heat will build up in the element at a rate of

−KA ∂u/∂x |_{(x,t)} − ( −KA ∂u/∂x |_{(x+∆x,t)} ) = KA ( ∂u/∂x |_{(x+∆x,t)} − ∂u/∂x |_{(x,t)} ).

Hence equating this with our second empirically derived relation gives

γ ρ A ∆x ∂u/∂t = KA ( ∂u/∂x |_{(x+∆x,t)} − ∂u/∂x |_{(x,t)} ),
and therefore

∂u/∂t = (K/γρ) ( ∂u/∂x |_{(x+∆x,t)} − ∂u/∂x |_{(x,t)} ) / ∆x.

Finally, taking the limit as ∆x → 0, we obtain the heat equation, e.g.,

∂u/∂t = κ ∂²u/∂x²,

where κ = K/(γρ).
Uniqueness of Solutions: We can prove uniqueness of the Heat Equation by using Energy
Methods, like we did for the Wave Equation. We define the energy of the system to be
E(t) = ∫_U |w|² dx.     (49)
Theorem 5.3 There exists at most one solution u ∈ C¹((0, T] : C²(U)) to the initial value /
boundary value problem

u_t − ∆u = f,   u |_{Γ_T} = g,

where U is a bounded domain and Γ_T denotes the parabolic boundary.
59
Like we alluded to, we will prove this result using energy methods.
Proof 5.2 First we suppose u and ũ are two solutions to the heat equation and
attack this proof head on by contradiction. We define a character w = u − ũ and then consider
the system

w_t − ∆w = 0,   w |_{Γ_T} = 0.
All that is left to do is to differentiate (49) and use some mathematical massaging to obtain the
result, e.g.,

dE/dt = 2Re ∫_U w_t w̄ dx
      = 2Re ∫_U ∆w w̄ dx
      = −2Re ∫_U ∇w · ∇w̄ dx + 2Re ∫_{∂U} (∂_ν w) w̄ dS,

from integrating by parts. We also note that this second integral is identically zero, because along
the boundary ∂U we have w = 0. Hence we get

dE/dt = −2Re ∫_U ∇w · ∇w̄ dx = −2 ∫_U |∇w|² dx ≤ 0.

Since E(t) ≥ 0, E(0) = 0 (w = 0 at t = 0), and dE/dt ≤ 0, we must have E(t) = 0 for all t. Therefore
we see that, indeed, u = ũ for all t.
We note that you can also prove uniqueness of the heat equation using the maximum principle. The
maximum principle on Rⁿ says that if we suppose u ∈ C¹((0, T] : C²(Rⁿ)) solves

u_t − ∆u = 0,   u(0, x) = g,

and satisfies u(t, x) ≤ A e^{a|x|²} for all (t, x) ∈ (0, T] × Rⁿ for some A, a > 0, then we have

sup_{(0,T] × Rⁿ} u = sup_{Rⁿ} g.

In a nutshell this says that the peaks of g(x) at t = 0 are the worst that can happen before
smoothing begins to happen. Imagine it physically! The proof for uniqueness becomes
rather trivial in this case.
Proof 5.3 If ũ is another solution and we define, w± = ±(u − ũ), just applying the maximum
principle to w± we see that w+ ≤ 0 and w− ≤ 0, which implies that u − ũ = 0.
60
The Laplacian in n-space can be thought of as the operator of non-mixed second partial derivatives with respect to the
spatial variables. In n dimensions the operator looks like

∆ = Σ_{j=1}^n ∂²/∂x_j² = ∂²/∂x_1² + ∂²/∂x_2² + . . . + ∂²/∂x_n².     (50)
When we restrict ourselves to two spatial variables, say x and y, it takes the form in the above
Laplace Equation. It can also be thought of as the divergence of the gradient of a scalar function. It is not
related to our other friend, the Laplace Transform. There are two ways we will build some intuition
about the Laplacian.
In building intuition, let’s start off with something we all know well - basic calculus. Remember
back to your calculus 1 days, where taking a derivative was the answer to virtually everything, and
sometimes you had to take a second derivative to check something. Well that something, the concavity
is where we begin.
Recall that if we have a function y = f(x) and f″(x) > 0, then y is concave up, while if f″(x) < 0, then
y is concave down. Analogously, the Laplacian can be thought of as the generalization of the second
derivative to multivariate functions.
Likewise, we could also think of the Laplacian as a “local averaging operator”. The Laplacian
basically compares the value of u at some point to all the points within a neighborhood of it. This can
also, perhaps, give better insight into the concavity idea. There are three pathological cases, listed
below.
1. ∆u = 0: u at a point is equal to its neighbors (Laplace’s Equation)
2. ∆u > 0: u at a point is less than the average of its neighbors
3. ∆u < 0: u at a point is greater than the average of its neighbors
Hence we have the idea that ∆u at a point measures, roughly, the difference between the average value of u over a small neighborhood and the value of u at the point itself.
Now when we think of say, the heat equation, which can be written as
ut = k∆u,
we can think of it as the rate of change with respect to time of the temperature is proportional to
the difference between the average temperature of u at a point and the value at the point itself. For
example, if there is a very cold spot in our domain, the surrounding domain is warmer, and hence the
Laplacian will be positive and thereby more heat will flow into that region, causing the temperature
to rise in our once colder spot.
Uniqueness of Solutions: We can prove uniqueness of the Laplace Equation by, again, using
Energy Methods, like we did for the Heat Equation. We define the energy of the system to be
I(w) = (1/2) ∫_U |∇w|² dx − ∫_U f w dx,     (51)
which is a slightly different definition of how we defined the energy for the system previously.
We will prove the following theorem
61
Theorem 5.4 Let U be a bounded open set with smooth boundary. If u is continuous on Ū
and satisfies ∆u = f in U with u = g on ∂U, then it is the unique such solution.
Proof 5.4 We will again proceed by contradiction and assume both u and ũ are solutions and
that u 6= ũ. Then we can define w = u − ũ, and hence we have that ∆w = 0 in U and w = 0 on
∂U . Now looking at the energy of w we have
I(w) = ∫_U |∇w|² dx = −∫_U w ∆w dx = 0,
by integrating by parts and since ∆w = 0 on U . Thus we have that ∇w = 0, implying that w is a
constant. Hence that constant must be zero, since w on the boundary is equal to zero. Therefore
we have that u = ũ.
Similarly to the heat equation case, this proof is almost given away for free by a maximum
principle for the Laplace equation. We state the strong maximum principle now: if u is harmonic in a
connected open set U and attains its maximum (or minimum) at an interior point of U, then u is constant in U.
Furthermore, we can also fairly easily prove existence of solutions to Poisson's Equation (basically
the non-homogeneous cousin of the Laplace Equation), if we assume the energy functional (51) attains
its minimum at some function u.
Let’s check it out for fun!
Theorem 5.6 Let u be a function at which the energy function in (51) assumes its minimum.
Then u is a solution of the Poisson equation −∆u = f on U with u = g on ∂U .
Proof 5.5 The proof begins just by expanding out the functional,

I(u + tv) = I(u) + t [ ∫_U ∇u · ∇v dx − ∫_U f v dx ] + (t²/2) ∫_U |∇v|² dx,

where it is assumed v is a smooth function with compact support. Next, since we are trying to
minimize I(u + tv), we look at the directional derivative of I along v and set it equal to zero,

∫_U ∇u · ∇v dx − ∫_U f v dx = 0,

and simply integrating this by parts will give us nearly what we are after,

−∫_U ∆u v dx − ∫_U f v dx = 0.

Since this holds for every such v, we conclude −∆u = f in U.
62
One last remark before we proceed with solution methodologies for linear PDEs. We state that if
u1, u2, . . . , uk are solutions of a homogeneous linear PDE, then their linear combination

u = Σ_{j=1}^k c_j u_j = c1 u1 + c2 u2 + . . . + ck uk,

is also a solution, assuming {c_j} ∈ C. This is fondly known as the superposition principle. Splendid.
Time to rock!
We will seek a change of coordinates x = x(τ, s), y = y(τ, s), with inverse τ = τ(x, y), s = s(x, y),
and parameterize the initial data curve as

x(τ, 0) = f(τ),   y(τ, 0) = g(τ),   u(τ, 0) = u0(τ).
63
Figure 4: General idea of the Cauchy problem’s initial data.
4. The solution is going to be a surface in z = u(x, y) for the Cauchy problem which contains the
initial data, C : (x(τ, 0), y(τ, 0), u(τ, 0)). This can be seen in Figure(4).
This surface can be thought of as a level curve of the function, F (x, y, z) = u(x, y) − z = 0,
Following the vector < a, b, c >, it directs us along a curve on the solution surface.
(i) Such a curve passes through each point of the initial data curve, Cs , given a one-parameter
family of curves, with the parameter, τ , specifying which one.
(ii) Define τ so that it is constant along each curve, thus giving a one-parameter family of
curves. The curves are then the level curves of τ (x, y) = constant.
(iii) Only s varies along each curve, with s = 0, corresponding to the points on the initial data
curve, Cs .
Figure 5: Parameterized curve, with parameter s, going through initial data curve Cs .
64
(iv) For a fixed τ, say τ = τ1, the curve passing through the point (x(τ1, 0), y(τ1, 0), u(τ1, 0)) on
Cs can be considered as a parametric curve with parameter s. This idea is illustrated in Figure 5.

(v) A tangent vector to the parametric curve is: ⟨ ∂x/∂s, ∂y/∂s, ∂u/∂s ⟩,
and since this tangent vector is parallel to ⟨a, b, c⟩, we can scale our parameter s such that the
proportionality constant is k = 1. We then get the following one-parameter family of ODEs, called
the characteristic equations:

∂x/∂s = a(x, y, u)
∂y/∂s = b(x, y, u)
∂u/∂s = c(x, y, u).

Therefore we have

∂u/∂s = (∂u/∂x)(∂x/∂s) + (∂u/∂y)(∂y/∂s) = a u_x + b u_y = c.
• Finally, we have parameterized the initial data curve, C, as

x(τ, 0) = f(τ),   y(τ, 0) = g(τ),   u(τ, 0) = u0(τ).

Solving the characteristic equations yields

x = x(τ, s),   y = y(τ, s),   u = u(τ, s),

and upon inverting the transformation, if possible, i.e., solving for τ(x, y) and s(x, y), we
can arrive at our final solution,

u(x, y) = u( τ(x, y), s(x, y) ).
65
Step 2: Solve the system of ODEs given by:
dx/ds = a(x, y, u)
dy/ds = b(x, y, u)
du/ds = c(x, y, u),

all subject to the initial conditions from Step 1. This then yields the following relations,

x = x(τ, s),   y = y(τ, s),   u = u(τ, s).
*Unfortunately, the coordinate transformations are not always invertible, or explicit in nature!
We will now present a few useful theorems regarding the method of characteristics. These all will
be in the vein of the existence and uniqueness of solutions.
Theorem 5.7 (Cauchy-Kowalevski Theorem): Consider the Cauchy problem a(x, y, u) u_x + b(x, y, u) u_y = c(x, y, u),
subject to a prescribed u on a parametric curve C: x = x̄(τ), y = ȳ(τ); then we can let u =
u(x̄(τ), ȳ(τ)) = u0(τ) for initial data. Suppose a, b, c are continuously differentiable in a neighborhood
of the initial data, i.e., the space curve (x̄(τ), ȳ(τ), u0(τ)). If

det [ a  b ; x̄_τ  ȳ_τ ] ≠ 0

at all points on C (i.e., C is non-characteristic), then there exists a unique solution to the Cauchy
problem in some neighborhood of the initial data curve C.
The following theorem can be used to identify when a Cauchy problem will have ∞-many solutions
or no solution, based on the initial data.
Theorem 5.8 If det [ a  b ; x̄_τ  ȳ_τ ] = 0 at all points on C (the initial data curve itself is a characteristic),
then the Cauchy problem has either ∞-many solutions or no solution in a neighborhood of C.
Throughout all of this analysis, we have employed use of the Implicit Function Theorem, which
tells us when we can invert a coordinate transformation.
Theorem 5.9 (Implicit Function Theorem) Consider N equations in N + M variables,

F_k(x1, x2, . . . , xM, y1, . . . , yN) = 0   for k = 1, 2, . . . , N,

or

F(x, y) = 0.

We also assume F is C¹ with respect to each of the N + M variables. Furthermore we assume that
the system of equations is satisfied at some point P0 = (x1⁰, . . . , xM⁰, y1⁰, . . . , yN⁰) with

∂(F1, . . . , FN) / ∂(y1, . . . , yN) |_{P0} ≠ 0.

Then the {y_k}_{k=1}^N can be expressed as functions of the {x_j}_{j=1}^M in some neighborhood of P0, i.e.,

y1 = φ1(x1, . . . , xM),  . . . ,  yN = φN(x1, . . . , xM),

or y = φ(x). Moreover each φ_k is C¹ with respect to each x_j.
The Implicit Function Theorem, in a nutshell, states that when we are given ξ = ξ(x, y)
and η = η(x, y), we can write x = x(ξ, η) and y = y(ξ, η) whenever

det [ ξ_x  ξ_y ; η_x  η_y ] ≠ 0.
x = K e^s,  and applying x(τ, 0) = τ gives  x = τ e^s
y = K e^{−s},  and applying y(τ, 0) = 1 gives  y = e^{−s}.
67
Step 3: Inverting the transformations, we get

x = τ e^s ⇒ τ = x e^{−s} = xy
y = e^{−s} ⇒ s = −ln(y), y > 0

u(τ, s) = −2τ e^{−s} + τ² + 2τ + 1  ⇔  u(x, y) = −2xy² + (xy)² + 2xy + 1 = −2xy² + (xy + 1)².
The general method of characteristics in all its glory attempts to solve first-order PDEs of the
form,
F (x, y, u, ux , uy ) = 0,
with corresponding initial data
x = x(τ )
C : y = y(τ ) .
u(x(τ ), y(τ )) = u(τ )
Writing p = u_x and q = u_y, we will require that u_x² + u_y² ≠ 0, e.g., p ≠ 0 when q = 0 and vice-versa. We note that this
mandates that the solution will not have any local minima or maxima. Our governing PDE now takes the form

F(x, y, u, p, q) = 0.
Rolling through nonlinear method of characteristics rigmarole, which we do not expand upon here,
we arrive at the following system of ODEs to solve,
dx/ds = ∂F/∂p
dy/ds = ∂F/∂q
du/ds = p ∂F/∂p + q ∂F/∂q
dp/ds = −( ∂F/∂x + p ∂F/∂u )
dq/ds = −( ∂F/∂y + q ∂F/∂u ),
68
with initial conditions x(τ, 0) = x(τ ), y(τ, 0) = y(τ ), u(τ, 0) = u(τ ), p(τ, 0) = p(τ ), q(τ, 0) = q(τ ).
For the initial conditions on p(τ ) and q(τ ), we use the conditions that
1. F (x(τ ), y(τ ), u(τ ), p(τ ), q(τ )) = 0 (The differential equation is satisfied on the initial curve)
2. From uτ = ux xτ + uy yτ on C we have that u0 (τ ) = p(τ )x0 (τ ) + q(τ )y 0 (τ ).
Using those two bits of information we will be able to pick out the correct initial data for p(τ, 0)
and q(τ, 0). We will leave you with just a few more notes on the general method of characteristics.
As an example, take

F(x, y, u, p, q) = u_x u_y − 1 = pq − 1,

with initial conditions x(τ, 0) = τ, y(τ, 0) = 1, u(τ, 0) = 2τ^{1/2}. Recall that on C the initial data is
satisfied, hence we have

(i) pq − 1 = 0 ⇒ p = 1/q,

(ii) from u_τ = p x_τ + q y_τ on C: τ^{−1/2} = p · 1 + q · 0, so p(τ, 0) = τ^{−1/2} and, by (i), q(τ, 0) = τ^{1/2}.
69
Now we have all the initial data we require and can proceed to solving the above system. Upon
solving all these ODEs we find
dp/ds = 0 ⇒ p(τ, s) = const ⇒ p(τ, s) = τ^{−1/2}   (using p(τ, 0) = τ^{−1/2})
dq/ds = 0 ⇒ q(τ, s) = const ⇒ q(τ, s) = τ^{1/2}   (using q(τ, 0) = τ^{1/2})
dx/ds = q ⇒ x(τ, s) = τ^{1/2} s + A(τ), and x(τ, 0) = τ ⇒ A(τ) = τ
dy/ds = p ⇒ y(τ, s) = τ^{−1/2} s + B(τ), and y(τ, 0) = 1 ⇒ B(τ) = 1
du/ds = 2pq = 2 ⇒ u(τ, s) = 2s + C(τ), and u(τ, 0) = 2τ^{1/2} ⇒ C(τ) = 2τ^{1/2}.

Hence

x(τ, s) = τ^{1/2} s + τ
y(τ, s) = τ^{−1/2} s + 1
u(τ, s) = 2s + 2τ^{1/2}.
Now we must try to transform back into (x, y) coordinates from (τ, s). From the above relations,
we see we can write

τ y = τ^{1/2} s + τ = x,

and thereby

x − τ y = 0 ⇒ τ = x/y.

Hence we find

s = (x/y)^{1/2} (y − 1),

and obtain as our final solution

u(x, y) = 2 (x/y)^{1/2} (y − 1) + 2 (x/y)^{1/2} = 2 (x/y)^{1/2} y = 2 √(xy).

We also state that the characteristics for the above problem are τ = x/y = constant.
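The characteristic system is also trivial to integrate numerically, which is a handy sanity check whenever the hand computation gets long. The sketch below (a minimal illustration, assuming NumPy and SciPy; the particular characteristic τ = 2 is an arbitrary choice) integrates the five characteristic ODEs for the example F = pq − 1 and compares the transported value of u against the closed form u = 2√(xy):

import numpy as np
from scipy.integrate import solve_ivp

# Characteristic system for F = p q - 1:
#   x' = F_p = q, y' = F_q = p, u' = p F_p + q F_q = 2 p q, p' = 0, q' = 0
def rhs(s, z):
    x, y, u, p, q = z
    return [q, p, 2 * p * q, 0.0, 0.0]

tau = 2.0
z0 = [tau, 1.0, 2 * np.sqrt(tau),          # x(tau,0), y(tau,0), u(tau,0)
      tau ** -0.5, tau ** 0.5]             # p(tau,0), q(tau,0)

sol = solve_ivp(rhs, (0.0, 1.5), z0, rtol=1e-10)
x, y, u = sol.y[0, -1], sol.y[1, -1], sol.y[2, -1]
print(u, 2 * np.sqrt(x * y))               # the two values should agree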
5.4 Separation of Variables: For 2nd order linear PDE (finite domains)
In this section we will discuss a powerful method for finding solutions to linear PDE on finite do-
mains. This is not the end-all, be-all technique for solving linear PDEs, but is very useful in areas
from mechanics, electromagnetism, quantum mechanics, to economics.
As a first example, consider the 1D heat equation, u_t = κ u_xx,
70
with associated boundary conditions u(0, t) = u(1, t) = 0, and initial value u0 (x) = u(x, 0) =
f (x). Hence we consider the domain x ∈ [0, 1] and t > 0.
Assuming a separable solution of the form u(x, t) = X(x) T(t), we have

u_t = κ u_xx  ⇒  X(x) T′(t) = κ X″(x) T(t)  ⇒  T′(t) / (κ T(t)) = X″(x) / X(x).

From the above we see that since each side of the equation depends only on one independent
variable, yet they are equal for all values, the quantity above must be equal to a constant, say
−α². Hence we have

T′(t) / (κ T(t)) = X″(x) / X(x) = −α².
We now have an eigenvalue in our problem...err...one more degree of freedom to work with!
Before we get to that, we note that the boundary conditions are still to be obeyed, and are
enforced via the X(x) function, i.e., X(0) = X(1) = 0. Now let’s solve the above problem!
We can solve this ODE simply by noticing its beautiful separable form,
dT/T = −κ α² dt,

and hence by simple integration obtain

T(t) = c0 e^{−κ α² t}.
We note that we did not get any information about the eigenvalue from solving this, nor
can we use the initial value yet, since there is much spatial information still to be learned!
The spatial problem is X″ + α² X = 0 with X(0) = X(1) = 0. In solving this equation we have to
look at three cases, that is when α = 0, α² < 0, and α² > 0.
1. α = 0:
X 00 = 0
and hence we have
X(x) = cx + d.
Using the boundary conditions we see that
X(0) = X(1) = 0 ⇒ c = d = 0,
and hence we only have the trivial solution.
71
2. α² < 0:

X″ − α² X = 0.

Solving this we get, in general, that X(x) = c sinh(αx) + d cosh(αx). Using the boundary conditions,

X(0) = d = 0
X(1) = c sinh(α) = 0 ⇒ c = 0,

since we've assumed that α ≠ 0 in this case. Hence we again only have the trivial
solution present.
3. α² > 0:

X″ + α² X = 0.

Solving the above ODE, we get X(x) = c sin(αx) + d cos(αx). Using the boundary conditions,

X(0) = d = 0
X(1) = c sin(α) = 0 ⇒ α = nπ;

however, we see that any positive integer value of n gives us a different eigenvalue. It makes
sense, in fact, that we can use superposition to sum all of these possible solutions, each corresponding
to a different eigenvalue, e.g.,

u(x, t) = Σ_{n=1}^∞ c_n sin(nπx) e^{−κ n² π² t}.
Note that we still have an infinite amount of coefficients {cj } to find! Where, oh where, will we
be able to find these? Welp, we haven’t used the initial value for anything, yet. Let’s start there!
Using the initial value, u(x, 0) = u0 (x) = f (x), we see that we have
f(x) = Σ_{n=1}^∞ c_n sin(nπx).
The above is simply a Fourier Series for f (x)! We can use the tricks and tools of the trade, i.e.,
orthogonality and completeness of the Fourier basis, to find {cn }.
72
Using Fourier Series orthogonality on [0, 1] (so that L = 1), we note that we get

c_n = 2 ∫_0^1 f(x) sin(nπx) dx.
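Putting the whole recipe together in code is straightforward. The sketch below (a minimal illustration, assuming NumPy and SciPy; the initial condition f(x) = x(1 − x) and the value κ = 0.1 are arbitrary choices) computes the coefficients c_n by quadrature and evaluates the truncated series solution:

import numpy as np
from scipy.integrate import quad

kappa = 0.1
f = lambda x: x * (1 - x)                  # sample initial condition u(x, 0)

def c(n):
    # c_n = 2 * int_0^1 f(x) sin(n pi x) dx
    val, _ = quad(lambda x: f(x) * np.sin(n * np.pi * x), 0, 1)
    return 2 * val

def u(x, t, N=50):
    # truncated separation-of-variables solution of u_t = kappa u_xx, u(0,t) = u(1,t) = 0
    return sum(c(n) * np.sin(n * np.pi * x) * np.exp(-kappa * (n * np.pi) ** 2 * t)
               for n in range(1, N + 1))

x = np.linspace(0, 1, 5)
print(u(x, 0.0))                           # should closely reproduce f(x) at t = 0
print(f(x))
print(u(x, 1.0))                           # by t = 1 the profile has decayed toward zero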
Next, consider a heat equation with a constant source term R,

u_t = R + k u_xx,

with associated boundary conditions, one of which is non-homogeneous, e.g., u(0, t) = 0 and
u(1, t) = u0 for t > 0, and initial value u(x, 0) = f(x) for 0 < x < 1. Hence we consider
the domain x ∈ [0, 1] and t > 0.
For problems of this type, we will begin by making a change of variables of the dependent
variable,

u(x, t) = ψ(x, t) + ξ(x).     (52)

We will be able to put the non-homogeneous terms entirely onto the ξ(x) function, provided we
require ξ(0) = 0 and ξ(1) = u0. In this case ψ(x, t) satisfies translated, homogeneous boundary
conditions on [0, 1]. We note that since ξ(x) has no t dependence, the initial condition gets placed
onto ψ(x, t), i.e., ψ(x, 0) = f(x) − ξ(x). Differentiating (52),

∂²u/∂x² = ∂²ψ/∂x² + ξ″(x)  and  ∂u/∂t = ∂ψ/∂t.

Substituting into the PDE gives ψ_t = k ψ_xx + k ξ″ + R, which will reduce to a homogeneous
equation if we make the assumption that

k ξ″ + R = 0.
73
Solving this (by two simple integrations) gives

ξ(x) = −(R/2k) x² + c1 x + c2.

Now making ξ(x) obey the boundary conditions we set above gives

c1 = R/(2k) + u0  and  c2 = 0.

Hence we find

ξ(x) = −(R/2k) x² + ( R/(2k) + u0 ) x.
Once we have ξ(x), we can now set out to solve our homogeneous PDE given above, e.g.,

ψ_t = k ψ_xx,   ψ(0, t) = ψ(1, t) = 0,   ψ(x, 0) = f(x) − ξ(x).

It is clear that finding the solution to this PDE is identical to solving the homogeneous heat
equation we solved previously, but with a differing initial condition. Rolling through the same
separation of variables rigmarole, we find that

ψ(x, t) = Σ_{n=1}^∞ c_n e^{−k n² π² t} sin(nπx),
where we know the {c_n} are given by orthogonality of the Fourier basis,

c_n = ( ∫_0^1 [ f(x) − ξ(x) ] sin(nπx) dx ) / ( ∫_0^1 sin²(nπx) dx ) = 2 ∫_0^1 [ f(x) + (R/2k) x² − ( R/(2k) + u0 ) x ] sin(nπx) dx.
We note that the above solution is composed of two pieces, ψ(x, t) and ξ(x). We call ξ(x) the
steady-state solution since we note that as t → ∞, ψ(x, t) → 0. We call ψ(x, t) the transient
solution.
A few more notes on separation of variables:
74
3. The example above easily goes through the mechanics and solutions to the wave equation
and Laplace’s Equation can be done analogously.
4. This methodology will work for higher order heat equations, wave equations, Laplace’s
equations, e.g., situations with more than two independent variables.
5. The change of variables, u(x, t) = ψ(x, t) + ξ(x), will work for non-homogeneous problems
involving Laplace’s equation or the wave equation, as well.
Recall the Fourier Transform F(k) = F{f(x)}(k) = ∫_R f(x) e^{−ikx} dx. We note that not every function
f(x) will have a Fourier Transform; however, a sufficient condition for a function to have one (i.e., for
the above integral to converge) is that f(x) is square-integrable, e.g.,

||f(x)||² = ∫_R |f(x)|² dx < ∞.
If f (x) is also continuous as well, then the Inverse Fourier Transform should exist, and we have
the inversion formula

f(x) = F⁻¹{F(k)}(x) = (1/2π) ∫_R F(k) e^{ikx} dk.     (54)
We will now discuss a few properties of the Fourier Transform.
1. Linearity:
Like the Laplace Transform, linearity holds for the Fourier Transform. If it didn't, it probably
wouldn't be worth a second glance, after all:

F{ c1 f1(x) + c2 f2(x) } = c1 F1(k) + c2 F2(k),

where {c1, c2} ∈ C.
2. Translations:
75
We also have the following translational property,

F{f(x − a)} = ∫_R f(x − a) e^{−ikx} dx
            = ∫_R f(u) e^{−ik(u + a)} du
            = e^{−ika} ∫_R f(u) e^{−iku} du
            = e^{−ika} F(k).

Hence we have

F{f(x − a)} = e^{−ika} F(k).
3. Scaling:
For a ≠ 0, rescaling the argument rescales and stretches the transform,

F{ f(ax) } = (1/|a|) F(k/a).
4. Convolution:
The Fourier Transform of a convolution of two functions, f1(x) and f2(x), has an elegant (and
convenient!) form: it is simply the product of the individual transforms,

F{ (f1 ∗ f2)(x) } = F1(k) F2(k).
5. Modulation:
Multiplying by a complex exponential shifts the transform,

F{ e^{iax} f(x) } = F(k − a).
6. Parseval's Theorem:

∫_R |f(x)|² dx = (1/2π) ∫_R |F(k)|² dk,

or equivalently

(1/2π) ||F(k)||² = ||f(x)||².
This identity becomes especially useful when doing stability analysis of numerical methods in-
volving finite differencing for the solutions of PDEs.
76
7. Duality:
Suppose f(x) has Fourier Transform F(k). Then we immediately know the Fourier Transform
of the function F(x)! This is known as the duality property of the Fourier Transform, as seen
below:

F{ F(x) }(k) = 2π f(−k).
8. Derivatives!:
When we take the Fourier Transform of a derivative, we will simply use our good friend, inte-
gration by parts, to help us out. Knowing that our function is square-integrable (so it decays at
±∞) is what will allow us to drop the boundary terms. Let's check it out:

F{ df/dx } = ∫_R (df/dx) e^{−ikx} dx
           = [ f(x) e^{−ikx} ]_{−∞}^{∞} + ik ∫_R f(x) e^{−ikx} dx
           = ik F(k).

We can continue analogously for higher order derivatives, and by induction we get the following
relation,

F{ dⁿf/dxⁿ } = (ik)ⁿ F(k).
Consider taking the Fourier Transform with respect to x of ∂u ∂t . We will see since t is a dummy-
type variable and our integration does not depend on it, it makes sense we will be able to pull
the derivative operator with respect to t out of the integral. Let’s check it out rigorously,
77
F{ ∂u/∂t } = ∫_R u_t(x, t) e^{−ikx} dx
           = ∫_R lim_{∆t→0} [ u(x, t + ∆t) − u(x, t) ] / ∆t · e^{−ikx} dx
           = lim_{∆t→0} (1/∆t) [ ∫_R u(x, t + ∆t) e^{−ikx} dx − ∫_R u(x, t) e^{−ikx} dx ]
           = lim_{∆t→0} (1/∆t) [ U(k, t + ∆t) − U(k, t) ]
           = ∂U(k, t)/∂t.
Consider the wave equation on the whole real line,

u_tt = c² u_xx,

with u → 0 as |x| → ∞ and for t > 0. We begin by taking the Fourier Transform (in x) of both sides
of this equation,

∫_R u_tt e^{−ikx} dx = c² ∫_R u_xx e^{−ikx} dx.
From the properties of the Fourier Transform we saw in the previous section, this gives us

∂²U(k, t)/∂t² = c² (ik)² U(k, t) = −c² k² U(k, t).

Hence taking the Fourier Transform has transformed our PDE into an ODE. We now just consider
k as some fixed parameter, and see our problem is reduced to a constant coefficient, homogeneous
linear ODE. We can easily solve this ODE using familiar methods from basic ODE theory. Let
U(k, t) = e^{αt}, so that α² = −c²k², i.e., α = ±ick, and therefore

U(k, t) = C1(k) e^{−ickt} + C2(k) e^{ickt}.
78
Now to recover the solution we care about in physical space, i.e., u(x, t), we need to use the Inverse
Fourier Transform to transform back appropriately. Doing this we find

F⁻¹{U(k, t)} = (1/2π) ∫_R U(k, t) e^{ikx} dk
             = (1/2π) ∫_R [ C1(k) e^{−ickt} + C2(k) e^{ickt} ] e^{ikx} dk
             = (1/2π) [ ∫_R C1(k) e^{ik(x − ct)} dk + ∫_R C2(k) e^{ik(x + ct)} dk ]
             = c1(x − ct) + c2(x + ct),

where C1(k) = ∫_R c1(x) e^{−ikx} dx and C2(k) = ∫_R c2(x) e^{−ikx} dx, and hence their Inverse
Fourier Transforms are used above. We note that this solution is called the D'Alembert solution,
which describes wave phenomena, where one wave packet moves to the right, c1(x − ct), and one wave
packet moves to the left, c2(x + ct).
Since the variable t is defined for t ∈ [0, ∞), we can use Laplace Transforms to transform the
equation in this variable. Taking the Laplace Transform of both sides of the equation gives
L{ ∂²u/∂t² } = L{ c² ∂²u/∂x² }

∫_0^∞ u_tt e^{−st} dt = c² ∫_0^∞ u_xx e^{−st} dt

s² U(x, s) − s u(x, 0) − u_t(x, 0) = c² d²U/dx²,
and hence we have the following ODE (in x, with s as a parameter) to solve,

c² d²U/dx² − s² U = −s u(x, 0) − u_t(x, 0).

Now we have the appropriate boundary conditions to use when solving the ODE in the transformed
space. Proceeding with the solution, we first solve the homogeneous problem (here with c = 1), i.e.,
U_xx − s² U = 0, to get the complementary solution and find

U_C(x, s) = c1 e^{sx} + c2 e^{−sx},
79
since s > 0 in the Laplace Transform. However, using the boundary conditions, we find that
c1 = c2 = 0. Onwards to solve for the particular solution: we can use the method of undetermined
coefficients (or variation of parameters if you're really that bored), starting with an initial guess for
the particular solution of

U_P(x, s) = A cos(πx) + B sin(πx).

Substituting this into the ODE, we find that A = 0 and B = 1/(s² + π²). Therefore our solution is:

U(x, s) = U_C(x, s) + U_P(x, s) = sin(πx) / (s² + π²).
Now we must transform back into physical space,

u(x, t) = L⁻¹{ U(x, s) }
        = L⁻¹{ sin(πx) / (s² + π²) }
        = (1/π) sin(πx) L⁻¹{ π / (s² + π²) }
        = (1/π) sin(πx) sin(πt).
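It's worth a quick symbolic check that this really is a solution of the original problem. The sketch below (a minimal illustration, assuming SymPy; the wave speed c = 1 and the initial data u(x, 0) = 0, u_t(x, 0) = sin(πx) are the values implied by the transformed ODE above) verifies the PDE, the initial data, and the boundary conditions:

import sympy as sp

x, t = sp.symbols('x t')
u = sp.sin(sp.pi * x) * sp.sin(sp.pi * t) / sp.pi

print(sp.simplify(u.diff(t, 2) - u.diff(x, 2)))   # 0: u_tt = u_xx (c = 1)
print(u.subs(t, 0))                               # 0: u(x, 0) = 0
print(u.diff(t).subs(t, 0))                       # sin(pi*x): u_t(x, 0)
print(u.subs(x, 0), u.subs(x, 1))                 # 0 0: boundary conditions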
80
2. 1D Wave Equation
Consider
ut + aux = 0.
We will search for solutions of the form u = u(ξ(x, t)) where ξ(x, t) = x − at. Performing the
necessary derivatives, we get
ut + aux = 0
(−auξ ) + a(uξ ) = 0
0 = 0 ⇒ all differentiable functions of ξ satisfy the PDE.
u(x, t) = f (x − at).
We now make the claim that no solutions are lost by this similarity transformation. To see
this we will find the solution using this idea of similarity transformation! We will make the full
coordinate transformation,
ξ = x − at and η = η(x, t).
Recall from the Implicit Function Theorem that if

det [ ξ_x  ξ_t ; η_x  η_t ] = det [ 1  −a ; η_x  η_t ] = η_t + a η_x ≠ 0,

then we have an invertible transformation. We just need η(x, t) to satisfy the above.
Performing the necessary derivatives, we find

u_t = u_ξ ξ_t + u_η η_t = −a u_ξ + u_η η_t
u_x = u_ξ ξ_x + u_η η_x = u_ξ + u_η η_x.

Substituting these into u_t + a u_x = 0 gives u_η (η_t + a η_x) = 0, and since the transformation is
invertible (by our determinant above), we require u_η = 0. Solving this ODE, we find u(ξ, η) = C(ξ),
and therefore
u(x, t) = C(x − at).
3. 1D Heat Equation
81
Consider the 1D heat equation,
ut = kuxx .
We wish to use similarity transformations to try to cook up a solution. In general we will look
for solutions involving groupings of the independent variables, such as ξ(x, t) = x^α t^β, where α, β ≠ 0.
We then assume a solution of the form u(x, t) = u( ξ(x, t) ), and hence
u_xx = u_ξξ ξ_x² + u_ξ ξ_xx = u_ξξ α² (ξ²/x²) + u_ξ α(α − 1) x^{α−2} t^β = u_ξξ α² (ξ²/x²) + u_ξ α(α − 1) (ξ/x²),

and similarly u_t = u_ξ β (ξ/t). The heat equation then reads

u_t = k u_xx  ⇒  u_ξ β (ξ/t) = k [ u_ξξ α² (ξ²/x²) + u_ξ α(α − 1) (ξ/x²) ],

and thereby

u_ξ β ξ (x²/t) = k [ u_ξξ α² ξ² + u_ξ α(α − 1) ξ ].
Now in the vein of similarity solutions, we want to try to get rid of (or group) all the x's and t's.
In the above equation, the quantity x²/t must be some power of ξ if the transformation is to be
successful. So we let

ξ^p = x²/t,

and hence get that

(x^α t^β)^p = x²/t  ⇒  2 = αp  and  −1 = βp,

and therefore

β = −α/2.

We only have the one condition above, so α can be chosen at our discretion. Let's let α = 1 (that
seems nice); it also kills a term in our governing equation. We now have β = −1/2, i.e., ξ(x, t) = x/√t.
Our governing equation then becomes

−(1/2) u_ξ ξ³ = k u_ξξ ξ²  ⇒  −(ξ/2k) u_ξ = u_ξξ  ⇒  u_ξξ + (ξ/2k) u_ξ = 0.
82
We can now integrate this equation once with respect to ξ to get

u_ξ = A e^{−ξ²/(4k)}.

If the above doesn't immediately scream error function to you, we will introduce you to that
friendly function now.
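Integrating u_ξ once more produces the error function profile. As a quick symbolic sanity check, the sketch below (a minimal illustration, assuming SymPy and the standard definition erf(z) = (2/√π) ∫_0^z e^{−s²} ds) confirms that the resulting similarity solution u(x, t) = erf( x / (2√(kt)) ) does satisfy the heat equation:

import sympy as sp

x, t, k = sp.symbols('x t k', positive=True)
u = sp.erf(x / (2 * sp.sqrt(k * t)))       # the self-similar profile, u = u(xi) with xi ~ x/sqrt(t)

residual = sp.simplify(u.diff(t) - k * u.diff(x, 2))
print(residual)                            # 0: u_t = k u_xx holds identically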
We will perform two examples of the method of images to illustrate this idea. This section will also
require some basic electrostatic a priori physics knowledge. Well, either that or a basic understanding
of fundamental solutions of elliptic problems...whichever you’d prefer!
83
Figure 6: (a) Single point source (charge) above an infinitely long conducting surface. (b) The same
geometry, but with a virtual source point (with opposite charge) located symmetrically across the
conducting plane.
This is where the intuition behind the method of images is beautiful. Rather than look for a
particular solution, say φ+ , and a complementary solution, φ− , to the equations
∆φ₊ = −(q/ε₀) δ(r − r₀)  and  ∆φ₋ = 0,
respectively, we will try to cook up a solution that satisfies the boundary condition at z = 0 using
knowledge of the fundamental solution of such elliptic operator, e.g., ∆ in 2D.
We note that the general potential for such an operator is given by φ = k/r, where k is a constant
determined by the physics in the underlying problem. Here k = ±q/(4πε₀), where the sign of ±q depends on
which source term you are considering. If we combine these two solutions, we note we can satisfy
the boundary condition at z = 0 if we place a point charge of equal and opposite sign at the mirror
image of the other charge below the plane. This is seen in Figure(6b).
Note that below the plane is not even in the original domain of the problem; hence, when we
consider the Poisson-type problem for its source charge, we place the image charge at r_im = −d ẑ,
which is not in the domain of the problem at hand.
84
Therefore the solution to this Poisson problem is

φ = φ₊ + φ₋ = (1/4πε₀) [ q/r₊ + (−q)/r₋ ] = (1/4πε₀) [ q/√(x² + (z − d)²) − q/√(x² + (z + d)²) ].
Figure 7: (a) Single point source of charge qpt outside a grounded sphere. (b) The same geometry,
but with a virtual source point (with differing) located inside the grounded sphere at an unknown
location, xim . (c) Illustrating the influence from both the point charge and image point charge on a
point outside the sphere.
Consider the problem of a point charge outside of a grounded sphere. Since the sphere is grounded,
we need the potential to be equal to zero on the entire sphere. This is our boundary condition. To
setup the geometry, say the point charge is a distance x0 away from the sphere of radius, R. For
simplicity, we will also center the sphere about the origin. This is shown below in Figure(7)(a).
In order to find the potential everywhere using the method of images, we must need to find the
location of the virtual point charge, such that the boundary condition on the sphere is satisfied. To do
this, we make the hypothesis, that to preserve symmetry, this image point should be placed somewhere
along the x − axis, and moreover, to satisfy that φ = 0 along the sphere, intuition suggests this point
will need to be inside the sphere.
Rather than try to calculate the potentials of both these points to satisfy the boundary condition
along the entire sphere, we consider trying to satisfy the boundary condition at two points along the
x-axis, namely (R, 0) and (−R, 0), i.e.,
• At (R, 0), φ = 0
• At (−R, 0), φ = 0.
If this doesn’t make sense why we only consider these two points, we have somewhat pulled a fast
one. Well, not really. We are just using what scientists call hindsight, err I mean, foresight when
thinking about a problem. The motivation is as follows. We note that using the fundamental solution
for ∆φ = 0 in radial coordinates in 2 − D for electrostatics problems, we will have two unknowns when
placing the image point - the location along the x-axis and the strength of the charge. The train of
thought then follows that by using as much symmetry as possible, if we are able to write a system
of two equations and two unknowns, we then can get all the information we need for finding the true
solution.
85
Therefore from those two (more specific) boundary conditions, we get

φ |_{(x,y)=(R,0)} = 0 = (1/4πε₀) [ q_pt/(x₀ − R) + q_im/(R − x_im) ]
φ |_{(x,y)=(−R,0)} = 0 = (1/4πε₀) [ q_pt/(x₀ + R) + q_im/(x_im + R) ].

Solving the above system of two equations and two unknowns yields

x_im = R²/x₀  and  q_im = −q_pt R/x₀.
Now to calculate the potential at any point outside the sphere, (r, θ, φ), we find

φ(r, θ, φ) = φ_pt + φ_im = (1/4πε₀) [ q_pt / √( r² + x₀² − 2 r x₀ cos θ ) + ( −q_pt R/x₀ ) / √( r² + R⁴/x₀² − 2 r (R²/x₀) cos θ ) ].
At this juncture, you may be wondering whether or not the above solution truly satisfies the
boundary condition on the entire sphere. Well, the nice thing about PDEs (and differential equations
in general) is we can always check our solution. (If time permits on tests, of course...)
We first consider an arbitrary point on the sphere, at distances from the charges q_pt and q_im of

d_pt = √( R² + x₀² − 2 R x₀ cos θ )  and  d_im = √( R² + R⁴/x₀² − 2 R (R²/x₀) cos θ ),
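The check is also a one-liner numerically. The sketch below (a minimal illustration, assuming NumPy; the values q_pt = 1, R = 1, x₀ = 3 and ε₀ = 1 are arbitrary, convenient choices) evaluates the two-charge potential on the sphere r = R for several angles and confirms it vanishes there:

import numpy as np

eps0, q_pt, R, x0 = 1.0, 1.0, 1.0, 3.0
x_im = R**2 / x0
q_im = -q_pt * R / x0

def phi(r, theta):
    # potential of the real charge at x0 plus its image charge at x_im
    d_pt = np.sqrt(r**2 + x0**2 - 2 * r * x0 * np.cos(theta))
    d_im = np.sqrt(r**2 + x_im**2 - 2 * r * x_im * np.cos(theta))
    return (q_pt / d_pt + q_im / d_im) / (4 * np.pi * eps0)

theta = np.linspace(0.0, np.pi, 7)
print(phi(R, theta))     # ~0 for every angle: the grounded-sphere condition holds on r = R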
86
6 Physics!
"You're taking physics? I love physics!", said only 5% of the population, of which 90% were being ironic.
It's no surprise that physics gets a bad rap somewhere between birth and college graduation; however,
it is truly a beautiful science, and arguably without it modern mathematics would not be where it is
today. For much of mathematics' history, math and physics (and the other sciences for that
matter) were like peas and carrots. It has only been within the last century or so that people started
differentiating themselves as mathematicians, physicists, chemists, or biologists rather than simply
scientists. I digress. Physics has a place, and from it comes a lot of interesting and useful mathematics!
In this section we will give a crash course in basic physics ideas, e.g., conservation laws and basic
mechanics, which will then lead us down the rabbit hole of calculus of variations to learn Lagrangian
and Hamiltonian formulations of mechanics. I apologize in advance for attempting to sum up years
worth of courses in a minimal amount of words. There will be (many) things left out, so please do
some light (gravity pun?) reading for further insight and intuition. Buckle up, we’re about to see
math in action!
Law 1:
The first law basically states that if the sum of all forces on an object is zero, then the object is
either at rest or moving at a constant velocity, e.g., it is not accelerating. More specifically, it
says that the vector sum of all forces is zero. This law is better known as the law of inertia:

Σ_j F_j = 0  ⇔  dv/dt = 0,

where F_j is a force acting on a body and v is the velocity of that body. These statements say
that if the body is in motion, it will remain in motion, moving in a straight line, until other
forces act upon it. Furthermore, this is evident in what is referred to as the Newtonian reference
frame, or inertial reference frame, in which the motion of a body according to another observer
remains at constant speed in a straight line.
Law 2:
Newton’s second law of motion is really a statement about the conservation of momentum. In a
nutshell it states that the time rate of change of linear momentum is equal to the sum of forces
acting on the body in an inertial reference frame. This can be described mathematically as
dp/dt = d(mv)/dt = Σ_{j=1}^N F_j,
where p is the linear momentum, and is equal to mass times velocity, i.e., p = mv. We note
that this is consistent with the first law, in that if the sum of vector forces acting on a body is
zero, then the momentum of the object remains unchanged. A consequence of any net force is
a change in momentum of a body.
It is common to assume that the mass of a body does not change and is constant, leading to the
infamous
F_net = Σ_{j=1}^N F_j = m dv/dt = m d²x/dt² = m a;
88
where all coordinates with primes indicate the moving inertial reference frame. Hence in both
frames of reference we have the same acceleration, and hence conservation of momentum is pre-
served.
Moreover, if we wish to consider Newton’s second law of motion for relativistic systems, we need
to change our definition of linear momentum, since at speeds “near-ish” to the speed of light,
linear momentum is not simply the product of mass and velocity.
Newton’s second law of motion is very powerful, and is the law that most people refer to the
most when considering physical phenomena. For example, the main equations of fluid dynamics,
called the Navier-Stokes equations, although nonlinear PDEs, are just a statement about the
conservation of momentum for a fluid. At the heart of basic mechanics, Newton’s second law is
main driver to solutions.
Law 3:
The crux of Newton’s third law is that no force is only unidirectional. If body A exerts a force on
body B, body B exerts the same force back on body A. All forces must be interactions between
bodies, and no one body can magically exert a force without feeling the equal and opposite force
back upon itself.
Think about punching a wall. You may think you have exerted all the force onto the wall, but
the reason your hand hurts is because the wall has “hit your hand” back with as much force as
you placed against it. To every action, there is an equal and opposite reaction.
We will now briefly discuss some conservation laws, or integrals of the system, in physics.
1. Conservation of Momentum
2. Conservation of Energy
3. Conservation of Angular Momentum
4. Conservation of Charge
We will now briefly describe each of these conservation laws. Well, not conservation of charge,
but that one should be somewhat self-explanatory for the level of physics we are shooting for. Note, in
mathematics the term conservation law usually refers to transport type hyperbolic equations of the
form, here in their scalar-ized glory,
∂t u + ∂x f (u) = 0.
u is called the conserved quantity, while f is called the flux.
89
Conservation of Momentum
We have already seen a bit about the conservation of momentum coming from Newton’s second
law of motion. Rather than think about forces causing changes in momentum of a body, let’s
consider the momentum in a system as a whole. If one part of the system gains momentum to
go one way in the system, another part of the system must have gained equal momentum, but
to go in the opposite direction.
If we consider our system to have two particles in it, we assume that at time, t1 , and later time
t2 , the total momentum of the system must be the same, e.g.,
[ m1 v1 + m2 v2 ]_{t=t1} = [ m1 v1 + m2 v2 ]_{t=t2}.
However, the above assumes that if there have been any collisions between the balls, that only
elastic collisions have occurred. That is to say, no energy has been lost, well converted, to heat,
sound, etc. Spoiler alert - energy cannot be created nor destroyed, merely converted to other
forms of energy. Those type of collisions where energy has been converted to heat (sound, etc.)
are called inelastic collisions. We will discuss this further in the conservation of energy section.
Again, the main statement about the conservation of momentum is best summed up (heyyo!)
through Newton’s second law,
m dv/dt = Σ_j F_j.
Let’s do some mechanics problems to illustrate the power of this law! In doing so, we will also
try to motivate intuitive ideas on how to approach such mechanics problems.
Example 1: Consider two identical masses coupled by a linear spring that is oriented ver-
tically, i.e., aligned with gravity, and is falling slowly through a viscous fluid. Assume that
the spring is originally extended and that the gravitational field is constant. Furthermore ,
assume that each mass experiences a viscous drag force, which depends linearly on velocity,
and that the masses are initially at rest.
1. Write down the coupled dynamics describing the system’s evolution
2. Non-dimensionalize the system. How many independent parameters are there?
3. Assuming that the drag coefficient is small, construct a perturbative expansion for the
motion. Only compute the first non-trivial correction.
We draw a free body diagram describing the system. This can be seen in Figure(8). Using
conservation of momentum, we see each mass will have its own equation, which depends
on a multitude of forces, including the gravitational force, fluid drag force, and spring force.
Without loss of generality, we will label the equation associated with the top mass with
a subscript of 1, and the bottom mass with a subscript of 2.
Note that we do not have any forces in the x-direction, hence we only have to consider motion
in the z-direction. Therefore we can easily write that

m z̈1 = F1,gravity + F1,drag + F1,spring
m z̈2 = F2,gravity + F2,drag + F2,spring.
90
Figure 8: (a) The physical system - two coupled masses attached by a linear spring falling through a
viscous fluid. (b) The free body diagram for both masses in the system.
Although we only have a skeletal like equation at the moment, we will be able to define
all the force terms in the above equation. First, we can write out the initial conditions.
Since both masses are initially at rest, we can write ż1 (0) = ż2 (0) = 0. Furthermore, we
can assume that initially the masses are a distance h apart, and hence define z1 (0) = 0 and
z2 (0) = −h.
Now we proceed to define all those force terms. Beginning with the force of gravity, which
we assume is constant, the force can be defined as
F1 = −mg
gravity
F2 = −mg.
gravity
Similarly since we assume that the drag force is linearly proportional to the velocity of the
mass, we can write these forces as
F1,drag = ν ż1
F2,drag = ν ż2,
since the forces are upwards in the +z direction. Note that ν is the drag coefficient.
Lastly, we have to define the spring force. We note for a simple spring, Hooke’s Law
says that the force produced is proportional to the distance the spring is stretched from
equilibrium, where the proportionality constant is called the spring stiffness. Hence a
general spring force could be written as Fspring = −k(x − d), where d is the equilibrium
distance and k is the spring stiffness. Hence if the spring is at a distance d, there will be
no force produced. Therefore we can write our spring force relations as
F1,spring = −k [ (z1 − z2) − d ]
F2,spring = +k [ (z1 − z2) − d ],

so that a stretched spring ((z1 − z2) > d) pulls the top mass down and the bottom mass up.
91
Now we can write our spring-mass system as

m z̈1 = −mg + ν ż1 − k [ (z1 − z2) − d ]
m z̈2 = −mg + ν ż2 + k [ (z1 − z2) − d ],

with ż1(0) = ż2(0) = 0, z1(0) = 0, z2(0) = −h, and where k > 0 and ν > 0.
It is clear from the equations that motion of the masses is coupled, as well that the equa-
tions are linear. We will now proceed to non-dimensionalize these equations. To non-
dimensionalize we start by defining characteristic length and time scales, i.e.,
z1 = z10 z̃1
z2 = z20 z̃2
t = t0 τ.
Using chain rule, we can figure out the non-dimensionalization of the time derivatives,
d/dt = (dτ/dt) d/dτ = (1/t0) d/dτ.
Doing these substitutions, and dividing by the coefficient of the leading order term, we get
the following system,

d²z̃1/dτ² − (ν t0/m) dz̃1/dτ + (k t0²/m) z̃1 − (k t0²/m)(z2⁰/z1⁰) z̃2 = −(t0²/z1⁰) g + (k t0² d)/(m z1⁰)
d²z̃2/dτ² − (ν t0/m) dz̃2/dτ + (k t0²/m) z̃2 − (k t0²/m)(z1⁰/z2⁰) z̃1 = −(t0²/z2⁰) g − (k t0² d)/(m z2⁰).

Choosing the characteristic time scale t0 = √(m/k), these become

d²z̃1/dτ² − (ν/√(mk)) dz̃1/dτ + z̃1 − (z2⁰/z1⁰) z̃2 = −(m g)/(k z1⁰) + d/z1⁰
d²z̃2/dτ² − (ν/√(mk)) dz̃2/dτ + z̃2 − (z1⁰/z2⁰) z̃1 = −(m g)/(k z2⁰) − d/z2⁰.
Now we can make the assumption that both characteristic lengths z1⁰ and z2⁰ are equal,
i.e., z1⁰ = z2⁰, and we then let z1⁰ = z2⁰ = gm/k. Lastly, if we drop the tildes and revert back
to time derivatives being designated by a dot, we have

z̈1 − (ν/√(mk)) ż1 + z1 − z2 = −1 + dk/(gm)
z̈2 − (ν/√(mk)) ż2 + z2 − z1 = −1 − dk/(gm).
92
Hence we are left with only two parameters in the system,

ν/√(mk)  and  dk/(mg).

The first parameter, ν/√(mk), can be seen as a kind of damping ratio within the system, while the
second parameter, dk/(mg), can be seen as the ratio of spring to gravitational forces. Now recall
we are going to assume that ν is a small parameter, so we will define these parameters as

ε = ν/√(mk)  and  δ = dk/(mg).
We are now going to seek perturbative solutions for the motion within the system. That
is, we are going to seek solutions of the form

z_1 = z_1^0 + ε z_1^1 + ε² z_1^2 + ⋯   and   z_2 = z_2^0 + ε z_2^1 + ε² z_2^2 + ⋯,

where the superscripts now index the order in ε. Substituting these perturbative forms into the
system's equations, we have

d²/dτ² [z_1^0 + ε z_1^1 + ε² z_1^2 + ⋯] − ε d/dτ [z_1^0 + ε z_1^1 + ε² z_1^2 + ⋯] + (z_1^0 + ε z_1^1 + ε² z_1^2 + ⋯) − (z_2^0 + ε z_2^1 + ε² z_2^2 + ⋯) − δ + 1 = 0
d²/dτ² [z_2^0 + ε z_2^1 + ε² z_2^2 + ⋯] − ε d/dτ [z_2^0 + ε z_2^1 + ε² z_2^2 + ⋯] + (z_2^0 + ε z_2^1 + ε² z_2^2 + ⋯) − (z_1^0 + ε z_1^1 + ε² z_1^2 + ⋯) + δ + 1 = 0.
Collecting the O(1) terms, the second equation lets us solve for z_1^0, which we then differentiate
twice, e.g.,

z_1^0 = z̈_2^0 + z_2^0 + δ + 1   ⇒   z̈_1^0 = d⁴z_2^0/dτ⁴ + z̈_2^0.
Now substituting these into the first O(1) equation, we get an equation entirely in terms of z_2^0,
i.e.,

d⁴z_2^0/dτ⁴ + z̈_2^0 + (z̈_2^0 + z_2^0 + δ + 1) − z_2^0 − δ + 1 = 0,

and hence

d⁴z_2^0/dτ⁴ + 2 z̈_2^0 + 2 = 0.
Although we went from a 2nd order equation to a 4th order equation, have no fear! We can
simply integrate the above equation twice, to get
z̈_2^0 + 2 z_2^0 = −τ² + Aτ + B,

where A, B ∈ ℝ. Solving the homogeneous problem we find that

z_2^{0,H} = C_1 cos(√2 τ) + C_2 sin(√2 τ).

We can solve for the particular solution using undetermined coefficients (or variation of
parameters, if you fancy more hand-writing) by guessing a particular solution of the form
z_2^{0,P} = d_1 τ² + d_2 τ + d_3. Plugging this into the ODE, we find that d_1 = −1/2, d_2 = A/2, and
d_3 = (B + 1)/2.
Can we go ahead and do this? Yes, absolutely. Will we? Nah, not right now. Let’s save
that for a rainy day.
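As a quick numerical sanity check of the system we just derived, the following minimal Python sketch (not part of the original notes; the values of ε, δ, and the dimensionless initial separation h are illustrative assumptions) integrates the dimensionless equations for a small ε and for ε = 0, and confirms that the two solutions differ by O(ε):

import numpy as np
from scipy.integrate import solve_ivp

eps, delta, h = 0.05, 0.5, 1.0     # assumed illustrative parameter values

def rhs(tau, y, eps):
    # y = [z1, z1', z2, z2'];  z1'' = eps*z1' - z1 + z2 - 1 + delta, etc.
    z1, v1, z2, v2 = y
    return [v1, eps*v1 - z1 + z2 - 1 + delta,
            v2, eps*v2 - z2 + z1 - 1 - delta]

y0   = [0.0, 0.0, -h, 0.0]          # z1(0) = 0, z2(0) = -h, both initially at rest
taus = np.linspace(0.0, 10.0, 400)
full = solve_ivp(rhs, (0.0, 10.0), y0, t_eval=taus, args=(eps,), rtol=1e-9, atol=1e-9)
lead = solve_ivp(rhs, (0.0, 10.0), y0, t_eval=taus, args=(0.0,), rtol=1e-9, atol=1e-9)

print("max |full - (eps = 0) solution| =", np.max(np.abs(full.y - lead.y)))   # O(eps)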
Figure 9: (a) The physical system - two masses (m_1 and m_2) attached by a massless, inextensible
string that passes over a frictionless pulley of radius r, with moment of inertia, I. (b) The free body
diagram for both masses in the system.
Example 2: The Atwood Machine (...and a sneak peek at the Double Atwood Machine!)
Consider two point masses, m_1 and m_2 with m_1 < m_2, connected by an ideal massless
and inextensible string of length l that passes over a frictionless pulley of radius r with
a moment of inertia, I, as shown in Figure 9. We can now proceed to look at the force
balance equations.
The forces on mass m1 are solely the gravitational force downwards and the tension of the
string, T1 . Furthermore since we assume m1 < m2 , we assume this mass is accelerating
upwards. This gives us the equation:
m1 a = T1 − m1 g. (57)
Similarly, the forces on mass m_2 are gravity and the tension T_2; since m_2 is accelerating
downwards, we get

m_2 a = m_2 g − T_2. (58)

Finally, from the pulley we get an equation for the torque of the system, which relates the
moment of inertia and the angular acceleration to the difference between T_1 and T_2,

I α = (T_2 − T_1) r, (59)
where α is the angular acceleration of the system. Given that the string does not slip, we
have that

α = a/r.
Therefore, adding (57), (58), and (59) together, the tensions cancel and we get

(m_1 + m_2 + I/r²) a = (m_2 − m_1) g,

and hence we find the acceleration to be

a = (m_2 − m_1) g / (m_1 + m_2 + I/r²).
Note: in order to add (59) to the other two equations, we first divided both sides by r to
put it into the proper units. Furthermore we can also easily compute what the tension in
the string is near m1 and m2 ,
T_1 = m_1 g (2 m_2 + I/r²) / (m_1 + m_2 + I/r²)
T_2 = m_2 g (2 m_1 + I/r²) / (m_1 + m_2 + I/r²).
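The following small Python sketch (mine, not from the notes) packages these formulas; as a sanity check, T_2 − T_1 equals (I/r²)a, and sending I → 0 recovers the familiar a = (m_2 − m_1)g/(m_1 + m_2):

def atwood(m1, m2, I, r, g=9.81):
    # Return (a, T1, T2) for an Atwood machine with pulley inertia I and radius r.
    denom = m1 + m2 + I / r**2
    a  = (m2 - m1) * g / denom
    T1 = m1 * g * (2*m2 + I / r**2) / denom
    T2 = m2 * g * (2*m1 + I / r**2) / denom
    return a, T1, T2

a, T1, T2 = atwood(m1=1.0, m2=2.0, I=0.01, r=0.1)
print(a, T1, T2, T2 - T1, (0.01 / 0.1**2) * a)   # last two numbers should agree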
Figure 10: The physical system - the triple Atwood system, composed of four masses, three
inextensible strings, and three frictionless pulleys.
We note that in order to find the acceleration, a, we had to write down a system of three
equations using empirical relations. Although these equations were not necessarily a big
chore, there are still easier ways to do this problem! Imagine for a moment if we had a double
or triple Atwood machine (see Figure 10)! Finding the acceleration in the system would no
longer be such a walk in the park; please be our guest and try it! However, using a different
formulation of physics, called Lagrangian Mechanics, the problem becomes tolerable! We
will venture down that path now.
7 Asymptotics and Perturbation Methods
This section will have a vastly different flavor in comparison to the previous methods. We will no
longer be concerned with exact solutions to equations, even in the form of an infinite sum, but will
set our sights solely on asymptotic or perturbative solutions. If you have never heard the lingo before,
you can think of these as very good approximation methods. We will define these terms precisely
when the time comes; for now, be comforted by knowing that we will search solely for "approximate"
solutions to equations.
The prototypical object of study is the Laplace-type integral

I(λ) = ∫_a^b f(t) e^{−λ g(t)} dt. (60)

Now, what I have not yet revealed are the properties of f(t), g(t), or λ. Well, we will assume f and g are
friendly functions, where friendly for us just means that they are smooth enough to be replaced by
local Taylor series approximations of the appropriately desired degree.
Well, what about λ? This is where the term asymptotic comes into play. We are only concerning
ourselves here with the behavior of I(λ) as λ → ∞. Hence we are only on a mission to look at this integral
in a limiting-behavior scenario. This is the general idea with asymptotic methods, that is, methods
that describe the limiting behavior of a function, or of a solution to an equation. We will now briefly give
two definitions to solidify this idea.
Definition 7.2 We say two functions f_n and g_n of a natural number parameter n are asymptotically
equivalent, e.g., f_n ∼ g_n (as n → ∞), if and only if lim_{n→∞} f_n/g_n = 1. This equivalence only holds in
the sense of the limit, though.
For example, consider the functions g(n) = n² − 2n and h(n) = n² + 3n + 7. It is clear that
g(n) ≠ h(n) for all n ∈ ℕ; however, it is also clear that g(n) ∼ h(n) as n → ∞, so it is safe to say that g(n)
and h(n) are asymptotically equivalent as n → ∞. It is paramount to note that we had to state that the
asymptotic equivalence only holds "as n → ∞". In general, the limiting behavior does not have to be as
n → ∞ or even n → 0; it just needs to be at a point of interest for the functions, e.g., where a function
may become singular, etc. Using this idea, we will now explore what an asymptotic expansion of a
function is.
Now let’s get back to the problem at hand, finding the limiting behavior of (60) as λ → ∞. To
achieve this, we will use Laplace’s Method.
Suppose that g(t) attains a strict minimum on [a, b] at an interior point t = c, with g′(c) = 0 and
g″(c) > 0. Adding and subtracting λg(c) in the exponent (i.e., adding zero), we can rewrite (60) as

I(λ) = e^{−λg(c)} ∫_a^b f(t) e^{−λ[g(t) − g(c)]} dt.

This is the part that is crucial for Laplace's Method. Since g(c) is the minimum of g(t) on [a, b], for
λ ≫ 1 the main contribution to I(λ) will come from a small neighborhood of c. This is more obvious
from the rewritten form of I(λ) above than from the form in which it was first introduced. Hence for λ ≫ 1,
I(λ) ≈ e^{−λg(c)} ∫_{c−ε}^{c+ε} f(t) e^{−λ[g(t) − g(c)]} dt
     ≈ e^{−λg(c)} f(c) ∫_{c−ε}^{c+ε} e^{−λ[g(t) − g(c)]} dt
     ≈ e^{−λg(c)} f(c) ∫_{c−ε}^{c+ε} e^{−λ[g(c) + (t−c) g′(c) + (1/2!)(t−c)² g″(c) − g(c)]} dt
     = e^{−λg(c)} f(c) ∫_{c−ε}^{c+ε} e^{−λ[g′(c)(t−c) + (1/2!) g″(c)(t−c)²]} dt
     = e^{−λg(c)} f(c) ∫_{c−ε}^{c+ε} e^{−(λ/2) g″(c)(t−c)²} dt,
since g′(c) = 0. Recall we said the main contribution to I(λ) comes from a neighborhood of c.
On the flip side, we assume the integral's contribution from outside that neighborhood is negligible
(it only affects higher-order terms), and therefore we can expand that neighborhood to all of ℝ, i.e.,
we now have

I(λ) ≈ e^{−λg(c)} f(c) ∫_{−∞}^{∞} e^{−(λ/2) g″(c)(t−c)²} dt.
Next, to integrate the Gaussian-type integral above, we first do the swanky transformation of letting
s = t − c, getting

I(λ) ≈ e^{−λg(c)} f(c) ∫_{−∞}^{∞} e^{−(λ/2) g″(c) s²} ds
     = e^{−λg(c)} f(c) √(2π/(λ g″(c))).
The above is the standard Laplace's Method result when the minimum is within the interval (a, b).
There are three basic ideas behind this methodology. They are as follows:
1. Since we are concerned with λ → ∞, the main contribution to I(λ) comes from a small neighborhood of the point at which g(t) attains its minimum, say t = c, so we can replace the
integration bounds [a, b] → [c − ε, c + ε].
2. Within the neighborhood of the minimizer, we can approximate f(t) and g(t) with their Taylor
series to the desired order.
3. Since the main contribution to I(λ) is coming from the interval [c − ε, c + ε], we can extend this
interval to (−∞, ∞), as the region outside [c − ε, c + ε] only contributes higher order terms to
I(λ) as λ → ∞.
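To see the interior-minimum result in action, here is a small numerical check in Python (an illustrative example of my own; the choices f(t) = 1 + t², g(t) = (t − 0.3)² on [−1, 1] are assumptions, not from the notes):

import numpy as np
from scipy.integrate import quad

f = lambda t: 1 + t**2
g = lambda t: (t - 0.3)**2
c, gpp = 0.3, 2.0                          # interior minimizer of g and g''(c)

for lam in (10.0, 100.0, 1000.0):
    exact, _ = quad(lambda t: f(t)*np.exp(-lam*g(t)), -1, 1, points=[c])
    approx = np.exp(-lam*g(c)) * f(c) * np.sqrt(2*np.pi/(lam*gpp))
    print(lam, exact, approx, exact/approx)   # ratio tends to 1 as lambda grows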
We will now go through a very similar calculation to illustrate what happens when the minimum
of g(t) is at an endpoint of the integration bounds, but first we will show the trick for integrating
Gaussian-type integrals.
Consider the Gaussian integral

G = ∫_{−∞}^{∞} e^{−x²} dx. (62)

To evaluate it, we will transform the integral to polar coordinates, but first, we will do the magical
trick. This is it. Just square (62), i.e.,

G² = (∫_{−∞}^{∞} e^{−x²} dx)(∫_{−∞}^{∞} e^{−y²} dy) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} e^{−(x²+y²)} dx dy.
Transforming to polar coordinates (x = r cos θ, y = r sin θ, dx dy = r dr dθ) gives

G² = ∫_0^{2π} ∫_0^{∞} e^{−r²} r dr dθ.

On the above we can simply do the substitution u = r², so that du = 2r dr. We now have

G² = ∫_0^{2π} ∫_0^{∞} (1/2) e^{−u} du dθ,

and hence

G² = −π e^{−u} |_0^{∞} = π.

Now solving for G, we get that

G = √π.
We used this trick above to find the leading order behavior of the Laplace Integral, and will use it
again in a minute for a similar computation!
7.1.3 Laplace’s Method Take Two (minimizer on an integration bound)
Consider (60), such that g(t) assumes a strict minimum in [a, b] at an endpoint of the interval. Without
loss of generality, let's just say the minimum is found at t = a. We will also assume that g′(a) = 0
(duh), g″(a) > 0 (so the point is a minimum, duh), and f(a) ≠ 0. Using the same math trick of
adding and subtracting zero, we can rewrite (60) as

I(λ) = e^{−λg(a)} ∫_a^b f(t) e^{−λ[g(t) − g(a)]} dt.
We will make the same assumptions as we did above, e.g., that the main contribution to I(λ)
comes from a neighborhood of t = a, that f(t) and g(t) are smooth enough so that we can replace them by
their Taylor series centered at t = a, and that, since the main contribution is coming from a neighborhood
of t = a, we will be able to expand this neighborhood to all of ℝ⁺ with only higher order asymptotic
terms being affected. The reason for only blowing this interval up to [a, ∞) is that we do not
consider points t < a. Upon doing these things, we get
I(λ) ≈ e^{−λg(a)} ∫_a^{a+ε} f(t) e^{−λ[g(t) − g(a)]} dt
     ≈ e^{−λg(a)} f(a) ∫_a^{a+ε} e^{−λ[g(t) − g(a)]} dt
     ≈ e^{−λg(a)} f(a) ∫_a^{a+ε} e^{−λ[g(a) + (t−a) g′(a) + (1/2!)(t−a)² g″(a) − g(a)]} dt
     = e^{−λg(a)} f(a) ∫_a^{a+ε} e^{−λ[g′(a)(t−a) + (1/2!) g″(a)(t−a)²]} dt
     = e^{−λg(a)} f(a) ∫_a^{a+ε} e^{−(λ/2) g″(a)(t−a)²} dt
     ≈ e^{−λg(a)} f(a) ∫_a^{∞} e^{−(λ/2) g″(a)(t−a)²} dt.
Now doing the same substitution as before, letting s = t − a, we are left with the following
Gaussian-type integral to integrate,

I(λ) ≈ e^{−λg(a)} f(a) ∫_0^{∞} e^{−(λ/2) g″(a) s²} ds
     = e^{−λg(a)} f(a) √(π/(2λ g″(a))).
Thus when the minimizer is at an endpoint, we have to leading order that

I(λ) ∼ e^{−λg(a)} f(a) √(π/(2λ g″(a)))   as λ → ∞. (63)
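Here, too, a quick numerical check is easy to run; the sketch below (my own example, with f(t) = cos(t) and g(t) = t² on [0, 1], so the minimum sits at the endpoint a = 0) compares (63) against direct quadrature:

import numpy as np
from scipy.integrate import quad

f, g = np.cos, (lambda t: t**2)
a, gpp = 0.0, 2.0                          # endpoint minimizer and g''(a)

for lam in (10.0, 100.0, 1000.0):
    exact, _ = quad(lambda t: f(t)*np.exp(-lam*g(t)), 0, 1)
    approx = np.exp(-lam*g(a)) * f(a) * np.sqrt(np.pi/(2*lam*gpp))
    print(lam, exact/approx)               # ratio should tend to 1 as lambda grows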
7.1.4 The Gamma Function and Higher Order Asymptotics
Usually in the asymptotic business, the next thing one is introduced to is the Gamma function. The
reason for this is two-fold; first, it is a generalization of the factorial function, and second, it is used to
derive Stirling's approximation, which describes the asymptotic behavior of the factorial function
itself. We will do all these things, but before we concern ourselves with the asymptotic behavior of
the factorial, let's use the Gamma function to look at higher order asymptotics associated with the
Laplace integral.
The Gamma function is defined as

Γ(x) = ∫_0^{∞} t^{x−1} e^{−t} dt. (64)
The way to see that (64) is a continuous generalization of the factorial function is simply to
integrate it by parts, e.g.,

Γ(x + 1) = ∫_0^{∞} t^{x} e^{−t} dt
         = [−t^{x} e^{−t}]_0^{∞} + ∫_0^{∞} x t^{x−1} e^{−t} dt
         = x ∫_0^{∞} t^{x−1} e^{−t} dt
         = x Γ(x).
Thus if x = n ∈ Z+ , we have
Γ(n + 1) = n!.
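This relation is easy to spot-check numerically; the two columns printed by the short sketch below (not from the notes) should agree:

from math import gamma, factorial

for n in range(1, 8):
    print(n, gamma(n + 1), factorial(n))   # Gamma(n+1) and n! should match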
Now we can use this to find higher-order asymptotic terms in the vein of the Laplace integral in (60).
We will again assume that g(t) assumes a strict minimum over [a, b] at a point c ∈ (a, b), as well as that
g′(c) = 0, g″(c) > 0, and f″(c) ≠ 0. Where this calculation differs from those previously done
is that we now assume f(c) = 0. This subtle change will lead us to employ the Gamma function to find the
leading behavior in our asymptotic expansion. We will again make the same assumptions as before,
e.g., the main contribution comes from the neighborhood of t = c, and all that jazz. Now for λ ≫ 1,
I(λ) = ∫_a^b f(t) e^{−λg(t)} dt
      ≈ e^{−λg(c)} ∫_{c−ε}^{c+ε} f(t) e^{−λ[g(t) − g(c)]} dt
      = e^{−λg(c)} ∫_{c−ε}^{c+ε} [f(c) + f′(c)(t−c) + (1/2!) f″(c)(t−c)² + O((t−c)³)] e^{−λ[g′(c)(t−c) + (1/2!) g″(c)(t−c)² + O((t−c)³)]} dt
      ≈ e^{−λg(c)} ∫_{c−ε}^{c+ε} [f′(c)(t−c) + (1/2!) f″(c)(t−c)²] e^{−(λ/2) g″(c)(t−c)²} dt
      = (f″(c)/2) e^{−λg(c)} ∫_{c−ε}^{c+ε} (t−c)² e^{−(λ/2) g″(c)(t−c)²} dt,

since f(c) = 0, g′(c) = 0, and the odd term f′(c)(t−c) integrates to zero against the even Gaussian factor.
We do the same transformation, s = t − c, and then, expanding the integration region to all of ℝ,
we obtain

I(λ) ≈ (f″(c)/2) e^{−λg(c)} ∫_{−∞}^{∞} s² e^{−(λ/2) g″(c) s²} ds
     = f″(c) e^{−λg(c)} ∫_0^{∞} s² e^{−(λ/2) g″(c) s²} ds.
To simplify the above integral, we perform one more transformation. Letting u = (λ/2) g″(c) s², and
hence du = λ g″(c) s ds, we find that

I(λ) ≈ f″(c) e^{−λg(c)} ∫_0^{∞} (2u/(λ g″(c))) e^{−u} du/√(2λ g″(c) u)
     = f″(c) e^{−λg(c)} (√2/(λ g″(c))^{3/2}) ∫_0^{∞} u^{1/2} e^{−u} du
     = (√2 f″(c)/(λ g″(c))^{3/2}) e^{−λg(c)} Γ(3/2)
     = (√2 f″(c)/(λ g″(c))^{3/2}) e^{−λg(c)} (1/2) Γ(1/2),
where, letting u = t^{1/2},

Γ(1/2) = ∫_0^{∞} t^{−1/2} e^{−t} dt = 2 ∫_0^{∞} e^{−u²} du = √π.

Hence, when f(c) = 0 but f″(c) ≠ 0, the leading order behavior is

I(λ) ∼ (f″(c)/2) √(2π)/(λ g″(c))^{3/2} e^{−λg(c)}   as λ → ∞.
We now use Laplace's Method to derive Stirling's approximation. Writing t^x = e^{x ln(t)}, we have

Γ(x + 1) = ∫_0^{∞} e^{x ln(t)} e^{−t} dt = ∫_0^{∞} e^{−x(t/x − ln(t))} dt.
Next we perform the substitution z = t/x (and dz = dt/x), and get

Γ(x + 1) = x ∫_0^{∞} e^{−x(z − ln(xz))} dz = x ∫_0^{∞} e^{−x(z − ln(x) − ln(z))} dz = x e^{x ln(x)} ∫_0^{∞} e^{−x(z − ln(z))} dz.
We have now exponentially and logarithmically massaged the Gamma integral into the following
form,

Γ(x + 1) = x^{x+1} ∫_0^{∞} e^{−x(z − ln z)} dz. (65)
We see that (65) is in the form of a Laplace-type integral, with f(z) = 1 and g(z) = z − ln z. With
a little further analysis, it is clear that g(z) has a strict minimum on (0, ∞) at z = 1. We see that
g(1) = 1, g′(1) = 0, and g″(1) = 1. Substituting all of this information into the leading order
term of the Laplace integral, we find that
∫_0^{∞} e^{−x(z − ln z)} dz ∼ e^{−x g(1)} f(1) √(2π/(x g″(1))) = √(2π/x) e^{−x},   as x → ∞.
Hence we find that the Gamma function has the leading order behavior

Γ(x + 1) ∼ x^{x+1} √(2π/x) e^{−x} = √(2π) x^{x + 1/2} e^{−x},   as x → ∞.

To make this a little closer to home, consider x = n ∈ ℤ⁺ as n → ∞, that is, let n be a very large
integer. We are now ready to see what the factorial function looks like as n gets very, very large:

n! ∼ √(2π) n^{n + 1/2} e^{−n},   as n → ∞.
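Stirling's approximation is another result that is fun to check numerically; in the short sketch below (mine, not from the notes) the printed ratio creeps toward 1 (roughly like 1 + 1/(12n)) as n grows:

import math

for n in (5, 10, 50, 100):
    stirling = math.sqrt(2*math.pi) * n**(n + 0.5) * math.exp(-n)
    print(n, math.factorial(n) / stirling)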
We can package this style of calculation into a general result, known as Watson's Lemma, for
Laplace-type integrals of the form

I(λ) = ∫_A^B f(s) e^{−λ g(s)} ds, (66)

where we take g(s) to be increasing on [A, B], so that its minimum occurs at the endpoint s = A.
Next we look to perform the substitution t = g(s) − g(A); however, this transformation will only
work if we can invert it to get s = s(t). If this is possible, then we have

ds = dt/g′(s(t)).

We have then massaged (66) into

I(λ) = e^{−λ g(A)} ∫_0^{g(B)−g(A)} (f(s(t))/g′(s(t))) e^{−λt} dt.
The main contribution to this integral will come from a neighborhood around the minimum of the
exponential term as λ → ∞. Moreover, we are also assuming that the transformed integrand grows
at most exponentially, i.e., |f(s(t))/g′(s(t))| ≤ K e^{bt} for t > 0, where b, K ∈ ℝ are independent of t.
Hence, because the largest contribution will occur in a one-sided neighborhood about 0, we only need
to consider the bounds [0, ε], where ε > 0, e.g.,

I(λ) ≈ e^{−λg(A)} ∫_0^{ε} (f(s(t))/g′(s(t))) e^{−λt} dt   for λ ≫ 1.
Furthermore, the fractional term in the integrand can usually be written in the following form,

f(s(t))/g′(s(t)) = t^{α} h(t),
where α > −1. We are also going to assume that h(t) has a convergent Taylor series about t = 0.
Upon doing so, for λ → ∞, we have

I(λ) ≈ e^{−λg(A)} ∫_0^{ε} (f(s(t))/g′(s(t))) e^{−λt} dt
     = e^{−λg(A)} ∫_0^{ε} t^{α} h(t) e^{−λt} dt
     = e^{−λg(A)} Σ_{k=0}^{∞} (h^{(k)}(0)/k!) ∫_0^{ε} t^{k+α} e^{−λt} dt
     ≈ e^{−λg(A)} Σ_{k=0}^{∞} (h^{(k)}(0)/k!) ∫_0^{∞} t^{k+α} e^{−λt} dt,   by the same assumptions as in Laplace's Method.
Recalling the definition of the Gamma function, we then have the asymptotic expansion for I(λ)
as λ → ∞,

I(λ) ∼ e^{−λg(A)} Σ_{k=0}^{∞} h^{(k)}(0) Γ(k + 1 + α)/(k! λ^{k+1+α}). (67)
As an example, consider the integral

I(λ) = ∫_0^{π/2} e^{−λ tan²(θ)} dθ.

We can cleverly use the substitution t = tan²(θ), so that dt = 2 tan(θ) sec²(θ) dθ, and through some
simplification arrive at

I(λ) = ∫_0^{∞} (1/(2√t)) (1/(1 + t)) e^{−λt} dt.
Now we have α = −1/2, and we can expand (1 + t)^{−1} in its Taylor series about t = 0, valid for |t| < 1,

h(t) = (1/2)(1/(1 + t)) = (1/2) Σ_{k=0}^{∞} (−1)^k t^k,

so that

h^{(k)}(0)/k! = (1/2)(−1)^k.
Plugging all of this into Watson's Lemma, (67), we find the asymptotic series to be

I(λ) ∼ (1/2) Σ_{k=0}^{∞} (−1)^k Γ(k + 1/2)/λ^{k + 1/2}   as λ → ∞.
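As a sanity check on this series, the sketch below (my own; it substitutes u = √t so that the quadrature avoids the integrable endpoint singularity) compares a few terms of the series against the integral itself:

import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

def watson_series(lam, K=5):
    return 0.5 * sum((-1)**k * gamma(k + 0.5) / lam**(k + 0.5) for k in range(K))

for lam in (5.0, 20.0, 100.0):
    # with t = u**2, the integrand e^{-lam t}/(2 sqrt(t)(1+t)) dt becomes e^{-lam u^2}/(1+u^2) du
    exact, _ = quad(lambda u: np.exp(-lam*u*u)/(1 + u*u), 0, np.inf)
    print(lam, exact, watson_series(lam), abs(exact - watson_series(lam)))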
7.1.7 Where does the Steepest Descent take us?
You may have been lucky enough to have heard of the Method of Steepest Descent for asymptotic
approximations, used in conjunction with integrals of a very similar flavor to the integral form of (60).
Although Steepest Descent is usually thought of as a generalization of Laplace's Method, it can also
be viewed as a methodology for massaging your integral of interest into a form to which Laplace's
Method can then be applied to find the asymptotics. Let's stop being cryptic, and observe the problem
at hand. We now consider integrals of the form

I(x) = ∫_C f(t) e^{x ρ(t)} dt, (68)

where t ∈ ℂ, ρ(t) = φ(t) + iψ(t), and C is a contour in the complex plane. One of the main issues we
face with this integral is that, well, first, the integration path is not clearly defined for us. . .spoiler,
that will be our job, but also that we are now officially diving into the world of complex integration.
We are implicitly assuming that ρ(t) has non-zero imaginary part. Our worry here is that our
path of integration may introduce regions of high frequency oscillations into the exponential. As we
proceed to pick an integration path, or contour, we will set out to choose one such that Im(ρ(t)) is
constant.
We will proceed to give a glossing overview of the method to build some intuition behind the
madness, as well as draw similarities between this methodology and that of the Laplace integral we
just studied.
• As we just said, we want to deform the integration contour, C, such that Im(ρ(t)) is constant.
That is, we will not have to worry about high frequency oscillations obstructing us from finding
appropriate asymptotic behavior.
• With that in mind, we go ahead and find the saddle points of ρ(t). This is similar to what we did
in Laplace's Method, where we wished to find a minimizer; here the best we can do in the complex
plane is to find a saddle-point with non-vanishing second derivative, i.e., ρ′(t_C) = 0 and ρ″(t_C) ≠ 0,
where t_C ∈ ℂ is the saddle-point.
• After finding the saddle-point, we look for the directions of steepest descent through the saddle-point. There are two routes we can now take.
1. We can look for lines of constant phase that go through t_C, e.g., ψ(t) = ψ(t_C), where
t = u + iv, so ψ(u, v) = ψ(u_C, v_C). This is not exactly Steepest Descent, but is more closely
related to the method of Stationary Phase. . .we will get there. . .
2. Or, we write ρ″(t_C) = r e^{iθ}, and the directions of steepest descent through
the saddle-point are α_1 = (π − θ)/2 and α_2 = (3π − θ)/2.
• Now that we have an idea of the paths of constant phase, or steepest descent, we have to check
which lines are steepest ascent lines and which are steepest descent lines, since, after all, we are performing
this analysis at a saddle-point. To do this we simply look at the real part along each line:
– If Re(ρ(t) − ρ(t_C)) < 0 ← descent :)
– If Re(ρ(t) − ρ(t_C)) > 0 ← ascent :(
• We then deform the contour appropriately in the (u, v)-plane.
• Finally our integral should be massaged into a form where we can call upon Laplace’s Method
to perform the actual asymptotics to find the leading behavior of (68).
As a concrete example, consider

I(x) = ∫_0^1 e^{i x t²} dt, (69)

hence we have that f(t) := ρ(t) = it². First things first, let's look for the saddle-points. Letting
t = u + iv, we obtain

f(t) = it² = i(u + iv)² = −2uv + i(u² − v²),

and setting f′(t) = 2it = 0 we find that u = v = 0. Therefore we have t_C = 0. Now looking at f″(t_C),
we find that

f″(t_C) = 2i = 2e^{iπ/2}.
Now we can look for the steepest descent directions through the saddle-point, with θ = π/2:

α_1 = (π − θ)/2 = π/4
α_2 = (3π − θ)/2 = 5π/4,
or, similarly, we can set Im(f(t)) = Im(f(t_C)), i.e.,

u² − v² = 0 ⇒ u = ±v,

which are the lines of constant phase through the saddle-point; it is clear that the line u = v contains
the steepest descent directions found above. However, we still need to determine which of these lines
is a line of steepest ascent and which is the line of steepest descent. To figure this out,
we look at the real part of f(t),

Re(f(t)) = −2uv.
Hence we know the direction through the saddle-point in which to construct our integration
path. However, we still need to figure out the other pieces of the contour we are choosing, e.g.,
we know the original integral is over t ∈ [0, 1]; however, we are building this into a closed-contour
integral in ℂ. Furthermore, we still need to choose the actual direction of traversal along these pieces.
The pieces we currently have are depicted in Figure 11. The blue lines depict the
lines of steepest descent, while red indicates the lines of steepest ascent. The blue dashed line
illustrates the part of the descent line we will not choose for our contour, as to construct our
closed contour with it we would be forced to cross a line of steepest ascent in quadrant IV.
To close this contour we need to choose one final path such that it includes the point t = 1 and
connects to the line of steepest descent. Now at t = 1 (u = 1, v = 0) we have f(1) = i and hence
Re(f(1)) = 0; similarly, we also have that f(0) = 0 and Re(f(0)) = 0. We look for paths such
that the imaginary part of f(t) is constant, e.g., we set Im(f(t)) = Im(f(1)). Upon doing
so, we find

u² − v² = 1 ⇒ v = ±√(u² − 1).

We then choose the + branch, so that our contour will meet the steepest descent line,
u = v, at ∞. This then constructs our closed contour. However, we still need to choose a
direction around the contour, and we will choose it to be counter-clockwise. This is shown in
Figure 12. Hence our contour is C = C_1 + C_2 + C_3, where C_1 is the original path of the
integral whose leading order asymptotic behavior we set out to find.
Hence we now have the following integral to consider,

∮_C (·) = ∫_{C_1} (·) + ∫_{C_2} (·) + ∫_{C_3} (·).
The reason for this is that we can simply solve for the integral ∫_{C_1} (·) and perform asymptotic
analysis on the other integrals using Laplace's Method. However, before we do that, we can pull
into play one important theorem from Complex Analysis, that is,
Theorem 7.10 (Cauchy's Integral Theorem) Let U be an open subset of ℂ which is simply-connected, let f : U → ℂ be a holomorphic function, and let γ be a rectifiable path in U
whose start point is equal to its end point, i.e., γ is a closed path. Then

∮_γ f(z) dz = 0.
Using this theorem, and solving for ∫_{C_1} (·), we are left with

∫_{C_1} (·) = −∫_{C_2} (·) − ∫_{C_3} (·). (70)
Now we will consider the leading order asymptotic behavior of each of these integrals separately.
Spoiler alert, we will call upon our friend, Watson’s Lemma, to actually perform the asymptotic
analysis.
Path C_3: We begin with ∫_{C_3} e^{ixt²} dt, and parameterize the path along u = v by t = ρ e^{iπ/4}
(with ρ running from ∞ down to 0, since C_3 is traversed back toward the origin) and
dt = e^{iπ/4} dρ, giving us

∫_{C_3} e^{ixt²} dt = ∫_{∞}^{0} e^{ix ρ² e^{iπ/2}} e^{iπ/4} dρ = −e^{iπ/4} ∫_0^{∞} e^{−xρ²} dρ.

Now we perform the substitution z = ρ², so that dρ = dz/(2√z), to get

∫_{C_3} e^{ixt²} dt = −e^{iπ/4} ∫_0^{∞} (1/(2√z)) e^{−xz} dz = −(e^{iπ/4}/2) ∫_0^{∞} (e^{−xz}/√z) dz.
Lastly, we can call upon Watson's Lemma, with α = −1/2, to give us the following leading
order asymptotic behavior,

∫_{C_3} e^{ixt²} dt ∼ −(e^{iπ/4}/2) Γ(1/2)/x^{1/2}. (71)
Path C_2: Now we consider the integral ∫_{C_2} e^{ixt²} dt. The most difficult part about this integral is
choosing the appropriate parameterization. We let it² = i − s, so that t = √(1 + is) and
dt = (i/(2√(1 + is))) ds. The motivation behind this choice is that at t = 1 we have it² = i,
and hence s = 0, while s → ∞ as we move outward along C_2. We then obtain

∫_{C_2} e^{ixt²} dt = (i/2) ∫_0^{∞} (e^{ix − xs}/√(1 + is)) ds = (i e^{ix}/2) ∫_0^{∞} (e^{−xs}/√(1 + is)) ds.
Now, near s = 0, we can use a Taylor series to write

1/√(1 + is) ≈ 1 − is/2 + ⋯,

giving us

∫_{C_2} e^{ixt²} dt ≈ (i e^{ix}/2) ∫_0^{∞} (1 − is/2 + ⋯) e^{−xs} ds.
Now, using Watson's Lemma on the leading term near s = 0, with α = 0, we get

∫_{C_2} e^{ixt²} dt ∼ i e^{ix} Γ(1)/(2x). (72)
Therefore, using (70), (71), and (72), we find the leading order asymptotic behavior of (69) to
be

I(x) = ∫_0^1 e^{ixt²} dt ∼ (e^{iπ/4}/2) √(π/x) − i e^{ix}/(2x). (73)
This by no means has been a rigorous derivation or explanation of the Method of Steepest Descent,
but hopefully it gives you a good idea of the main ideas and concepts that motivate the method.
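If you would like to convince yourself of (73) without any contours, the small Python sketch below (my own check; it simply integrates the real and imaginary parts by quadrature) compares the integral with the leading order formula:

import numpy as np
from scipy.integrate import quad

def exact(x):
    re, _ = quad(lambda t: np.cos(x*t*t), 0, 1, limit=400)
    im, _ = quad(lambda t: np.sin(x*t*t), 0, 1, limit=400)
    return re + 1j*im

def leading_order(x):
    return 0.5*np.exp(1j*np.pi/4)*np.sqrt(np.pi/x) - 1j*np.exp(1j*x)/(2*x)

for x in (50.0, 200.0, 1000.0):
    print(x, abs(exact(x) - leading_order(x)))   # the error shrinks faster than 1/x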
We now turn to the closely related Method of Stationary Phase, and consider integrals of the form

I(x) = ∫_A^B g(t) e^{i x ψ(t)} dt, (74)

where A, B ∈ ℝ and g(t), ψ(t) are real-valued. To find the leading order asymptotics, we will consider two cases.
Case 1: If ψ′(t) ≠ 0 for all t ∈ [A, B], we can simply use integration by parts on (74), e.g.,

I(x) = ∫_A^B g(t) e^{ixψ(t)} dt = [ (g(t)/(ixψ′(t))) e^{ixψ(t)} ]_A^B − (1/(ix)) ∫_A^B (d/dt)[g(t)/ψ′(t)] e^{ixψ(t)} dt.
One can see that

[ (g(t)/(ixψ′(t))) e^{ixψ(t)} ]_A^B ∼ O(1/x)
(1/(ix)) ∫_A^B (d/dt)[g(t)/ψ′(t)] e^{ixψ(t)} dt ∼ O(1/x²),

and hence

I(x) ∼ [ (g(t)/(ixψ′(t))) e^{ixψ(t)} ]_A^B   as x → ∞.
Case 2: Suppose ψ′(c) = 0 and ψ″(c) ≠ 0 at some point c ∈ [A, B]. We first find the Taylor series of
ψ(t) centered about c,

ψ(t) = ψ(c) + (t−c)ψ′(c) + (1/2!)(t−c)²ψ″(c) + ⋯ = ψ(c) + (1/2!)(t−c)²ψ″(c) + (1/3!)(t−c)³ψ‴(c) + ⋯,

since ψ′(c) = 0. Substituting this into the integral in (74), we get
I(x) = e^{ixψ(c)} ∫_A^B g(t) e^{i(x/2)ψ″(c)(t−c)² + ⋯} dt.
You probably guessed it, we will keep only the first term in the above exponential, as well as expand
g(t) in its Taylor series about t = c and pick off the first term, e.g.,

I(x) ≈ e^{ixψ(c)} ∫_A^B [ g(c) + (t−c) g′(c) + ⋯ ] e^{i(x/2)ψ″(c)(t−c)²} dt
     ≈ e^{ixψ(c)} g(c) ∫_A^B e^{i(x/2)ψ″(c)(t−c)²} dt.
Now, since the main contribution to I(x) as x → ∞ comes from a neighborhood of the stationary
point t = c, we only need to consider a neighborhood about c. However, like we have done a
few times before, we can then extend the integration bounds all the way to (−∞, ∞), i.e.,

I(x) ≈ e^{ixψ(c)} g(c) ∫_A^B e^{i(x/2)ψ″(c)(t−c)²} dt
     ≈ e^{ixψ(c)} g(c) ∫_{c−ε}^{c+ε} e^{i(x/2)ψ″(c)(t−c)²} dt
     ≈ e^{ixψ(c)} g(c) ∫_{−∞}^{∞} e^{i(x/2)ψ″(c)(t−c)²} dt.
Next, to massage this last integral into a form we can integrate, we use the transformation
u = e^{∓iπ/4} √((x/2)|ψ″(c)|) (t − c), and hence du = e^{∓iπ/4} √((x/2)|ψ″(c)|) dt, and obtain

I(x) ∼ ( g(c) e^{ixψ(c) ± iπ/4} / √((x/2)|ψ″(c)|) ) ∫_{−∞}^{∞} e^{−u²} du,

so that

I(x) ∼ g(c) e^{ixψ(c) ± iπ/4} √( 2π/(x |ψ″(c)|) ).
Note that in the transformation defining u (and in the result) we take the upper signs if ψ″(c) > 0
and the lower signs if ψ″(c) < 0. Furthermore, note that if ψ′(c) = ψ″(c) = ⋯ = ψ^{(p−1)}(c) = 0
at some point c ∈ [A, B], the above process works analogously; just use the first non-vanishing
Taylor coefficient!
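Here is one more quick numerical check (an example of my own choosing: ψ(t) = t², g(t) = 1/(1 + t²) on [−1, 1], so c = 0, ψ″(c) = 2, g(c) = 1) of the stationary-phase formula above:

import numpy as np
from scipy.integrate import quad

def exact(x):
    re, _ = quad(lambda t: np.cos(x*t*t)/(1 + t*t), -1, 1, limit=400)
    im, _ = quad(lambda t: np.sin(x*t*t)/(1 + t*t), -1, 1, limit=400)
    return re + 1j*im

def stationary_phase(x):
    return 1.0 * np.exp(1j*np.pi/4) * np.sqrt(2*np.pi/(2.0*x))

for x in (50.0, 200.0, 1000.0):
    print(x, exact(x)/stationary_phase(x))    # ratio tends to 1 as x grows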
We now turn to local analysis of differential equations; more specifically, this asymptotic method will
work on linear ODEs about their irregular singular points. We will look for local solutions about the
irregular singular point, x = x_0, of the form

y ∼ e^{S(x)}   as x → x_0. (75)

We assume the solution can be approximated by some type of exponential in a neighborhood of the
irregular singular point, which gives us the motivation for (75). Furthermore, taking derivatives of
(75), we have the following differential relations,

y′ ∼ S′(x) e^{S(x)} (76)
y″ ∼ [ S″(x) + S′(x)² ] e^{S(x)}. (77)
To actually find the approximate behavior of the solution near an irregular singular point, we are
lucky in that there are distinct steps one can follow when performing this method. Before diving into
an example, we will outline them now, so they don’t remain a mystery.
1. Substitute the ansatz (75), together with the relations (76) and (77), into the ODE.
2. Assume one of the resulting terms is negligible (typically |S″| ≪ S′²) to obtain a dominant balance.
3. Solve the reduced equation for S′ and integrate to obtain the controlling factor of S(x).
4. Check that the neglected terms are indeed subdominant to the ones you kept.
5. Repeat this process to capture the higher order asymptotics, using the previous asymptotic
results you obtained!
For example, consider the ODE

x y″ + 4 y′ = 2y, (78)

for which we wish to find the solution's behavior near its irregular singular point as x → ∞. We perform
our substitutions from (75), (76), and (77) to get

x S″ + x S′² + 4 S′ = 2. (79)
We then make the assumption that, as x → ∞,

|x S″| ≪ x S′².

The intuition behind this choice comes from the fact that near irregular singular points the second
derivative of S tends to be better behaved than the square of the first derivative. Either way, it will be
a good idea to check this assumption at the end. That gives us

x S′² + 4 S′ − 2 = 0.
As the above form suggests, we call upon our high school friend, the quadratic formula, to solve
it, leading us to

S′ = (−2 ± √(4 + 2x))/x   as x → ∞.
Now we can pick this apart to see what the true control factor, or largest term, is in our asymptotic
series. As x → ∞, we see that the term that contributes most in the limit will be

S′(x) ∼ ±√2 x^{−1/2}.

Now, integrating this term, we obtain the control factor, S(x) ∼ ±2√2 x^{1/2}, and hence

S(x) ∼ ±2√2 x^{1/2} + C(x),
√
where C(x) will be the next leading order term, e.g., C(x) << ±2 2x1/2 as x → ∞. Before
proceeding to find the next term in the asymptotic relation, we should check to make sure our original
assumptions from Step 2 are consistent with what we found for S(x).
We find that
1
S 0 (x)2 ∼ 2x−1 and S 00 (x) ∼ ∓ √ x−3/2 ,
2
and hence our original assumption holds, i.e., |x S″| ≪ x S′². Let's find the next term in the
asymptotic expansion! Recall we found S(x) ∼ ±2√2 x^{1/2} + C(x). We simply need to substitute this
into (79), e.g.,

x [ ∓(1/√2) x^{−3/2} + C″(x) ] + x [ ±√2 x^{−1/2} + C′(x) ]² + 4 [ ±√2 x^{−1/2} + C′(x) ] = 2,
and, simplifying, we obtain

x C″ + x C′² + ( ±2√2 x^{1/2} + 4 ) C′ + ( ±4√2 ∓ 1/√2 ) x^{−1/2} = 0.
Now we proceed to do some massaging on the above equation. Considering large values of x,
i.e., x → ∞, we have that 4 ≪ 2√2 x^{1/2}. Furthermore, since C(x) ≪ S(x) ∼ x^{1/2}, we have that
|C′| ≪ x^{−1/2} and |C″| ≪ x^{−3/2}. These two relations imply that x C′² ≪ 2√2 x^{1/2} C′ and
|x C″| ≪ (4√2 − 1/√2) x^{−1/2}. Hence the dominant balance is

±2√2 x^{1/2} C′ ≈ ∓(4√2 − 1/√2) x^{−1/2},

so that, for either choice of sign,

C′(x) ≈ −(7/4) x^{−1}   and   C(x) ≈ −(7/4) ln(x).

Checking that our assumptions are correct, we see first that |C| ≪ |S| as x → ∞, and then also
that |C′| ≪ |S′|, |C″| ≪ |S″|, x C′² ≪ x^{1/2} C′, and finally that |x C″| ≪ (4√2 − 1/√2) x^{−1/2}.
Check.
Putting the pieces together, we have

y(x) ∼ e^{S(x)} ∼ e^{±2√2 x^{1/2} − (7/4) ln(x)},

or

y(x) ∼ k x^{−7/4} e^{±2√2 x^{1/2}},

where k is a constant that cannot be determined from local analysis.
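For the growing branch this behavior is easy to verify numerically; the sketch below (my own, with arbitrary initial data at x = 1 so that the growing solution dominates) integrates the ODE and checks that the compensated quantity y(x) x^{7/4} e^{−2√2 √x} levels off to a constant:

import numpy as np
from scipy.integrate import solve_ivp

def rhs(x, y):
    # y[0] = y, y[1] = y';  x y'' + 4 y' = 2 y  =>  y'' = (2 y - 4 y')/x
    return [y[1], (2*y[0] - 4*y[1]) / x]

xs  = np.geomspace(1.0, 400.0, 10)
sol = solve_ivp(rhs, (1.0, 400.0), [1.0, 0.0], t_eval=xs, rtol=1e-10, atol=1e-12)

compensated = sol.y[0] * xs**(7/4) * np.exp(-2*np.sqrt(2)*np.sqrt(xs))
print(compensated)        # should approach a constant k for large x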
Asymptotic convergence and pointwise convergence are two beasts with subtle, yet humongous,
differences. Recall we had a definition of asymptotic equivalence earlier, i.e.,

Definition 7.4 We say two functions f_n and g_n of a natural number parameter n are asymptotically
equivalent, e.g., f_n ∼ g_n (as n → ∞), if and only if lim_{n→∞} f_n/g_n = 1.

In the same spirit, we say that a series Σ_{n=0}^{∞} g_n(x) is an asymptotic series for f(x) as x → x_0 if,
for each fixed N, the error f(x) − Σ_{n=0}^{N} g_n(x) is small compared to the last retained term g_N(x)
as x → x_0.
The two above definitions have some small differences, which we will highlight later when we run
through a few properties of asymptotic relations. However, these two definitions are also in opposition
to the idea of pointwise convergence, which we define now.
Definition 7.6 We say a function f(x) is pointwise convergent to a series Σ_{i=0}^{∞} g_i(x) for x ∈ I if,
for each fixed x ∈ I,

lim_{N→∞} [ f(x) − Σ_{i=0}^{N} g_i(x) ] = 0.
Between these definitions we have two subtly opposing ideas. For an asymptotic series, we fix N and
are guaranteed to approach the function as x → x_0. For pointwise convergence, we fix x ∈ I and are
guaranteed to converge to the function as N → ∞.
We can illustrate these differences using a simple Taylor series example. Consider the partial sums
f_N(x) = Σ_{n=0}^{N} x^n/n!. For a fixed x, as N → ∞, it is clear that f_N(x) → e^x, i.e.,

lim_{N→∞} [ e^x − Σ_{n=0}^{N} x^n/n! ] = 0.

However, for each fixed N, we have Σ_{n=0}^{N} x^n/n! ∼ e^x as x → 0; more precisely,

e^x − Σ_{n=0}^{N} x^n/n! ∼ x^{N+1}/(N+1)!   as x → 0.
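The fixed-N behavior is easy to see numerically; in the short sketch below (my own, with N = 3) the last printed column, the ratio of the actual error to x^{N+1}/(N+1)!, tends to 1 as x shrinks:

import math

def partial_exp(x, N):
    return sum(x**n / math.factorial(n) for n in range(N + 1))

N = 3
for x in (0.5, 0.1, 0.01):
    err = math.exp(x) - partial_exp(x, N)
    predicted = x**(N + 1) / math.factorial(N + 1)
    print(x, err, predicted, err / predicted)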
To illustrate more properties of asymptotic convergence, we will go through a few (fun) examples.
Have fun reading!
1. Suppose f_1 ∼ g_1 and f_2 ∼ g_2 as x → x_0.
T/F: f_1 + f_2 ∼ g_1 + g_2?
i. This is false in general; cancellation can ruin the equivalence. For instance, the functions can be
chosen so that f_1 + f_2 = x³ while g_1 + g_2 = x² as x → 0, in which case
lim_{x→0} (f_1 + f_2)/(g_1 + g_2) = lim_{x→0} x³/x² = 0 ≠ 1.
T/F: f_1 f_2 ∼ g_1 g_2?
i. This is true, since
lim_{x→x_0} (f_1 f_2)/(g_1 g_2) = lim_{x→x_0} (f_1/g_1)(f_2/g_2) = ( lim_{x→x_0} f_1/g_1 )( lim_{x→x_0} f_2/g_2 ) = 1.
T/F: e^{f_1} e^{f_2} ∼ e^{g_1} e^{g_2}?
i. This is false, for reasons similar to the first example in this problem, e.g., f_1 + f_2 ≁ g_1 + g_2.
Consider f_1 = 1/x, f_2 = −1/x, g_1 = 1/x + 1, and g_2 = −1/x as x → 0. Hence we see that
f_1 + f_2 = 0 and g_1 + g_2 = 1, and therefore we see easily that
lim_{x→0} e^{f_1 + f_2}/e^{g_1 + g_2} = e^0/e^1 = 1/e ≠ 1.
T/F: log(f_1/f_2) ∼ log(g_1/g_2)?
i. This, too, is false, and we can show it easily! Let f_1 = e^{x + 3x²}, f_2 = e^{x}, g_1 = e^{x}, and
g_2 = e^{x}; then we get log(f_1/f_2) = log f_1 − log f_2 = 3x² and log(g_1/g_2) = log g_1 − log g_2 = 0.
Hence, we see trivially that 3x² ≁ 0 as x → 0.
T/F: log (f1 f2 ) ∼ log (g1 g2 ) ?
i. This is false by counterexample similar to the one above.
2. If f_1 and f_2 have the same asymptotic series (so that f_1 ∼ f_2), are they the same function?
No, in general they will not be the same function. We could have f_1 = Σ a_n x^n as x → 0
and f_2 = Σ a_n x^n + e^{−1/x} as x → 0, and although f_1 ∼ f_2 as x → 0, they are clearly
not the same function.
3. Are asymptotic series unique?
No, they are not unique to a function: we can have the same asymptotic series as x → x_0 for two
different functions (even though each individual function has a unique asymptotic series as x → x_0).
Consider f_1 = x + 3 and f_2 = x + 4; then we see f_1 ∼ x and f_2 ∼ x as x → ∞, but clearly f_1 ≠ f_2.
4. Is a Laurent series an asymptotic series?
Recall that a Laurent series is a representation of a complex function as a power series which includes
terms of negative degree, i.e., it is used to express complex functions when a regular Taylor series
expansion cannot be applied because of a singularity. We can write this as

L(x) = Σ_{n=−∞}^{∞} a_n (x − x_0)^n.

If the singularity at x_0 is a pole, only finitely many negative powers appear, and the series, led by its
most singular term, is an asymptotic series as x → x_0. If no such leading order term can be written,
as in the case of an essential singularity, the Laurent series will not be an asymptotic series. For
example, the function

e^{1/x} = 1 + 1/x + 1/(2! x²) + ⋯

has an essential singularity at x = 0 and is not an asymptotic series as x → 0, since each term is
subdominant to infinitely many other terms.
Figure 11: Pieces of the contour that we know must belong to it so far, but whose directions of traversal
have not yet been chosen. The blue lines depict the lines of steepest descent, while red indicates the lines
of steepest ascent. The blue dashed line illustrates the part of the descent line we will not choose for our
contour, as to construct our closed contour with it we would be forced to cross a line of steepest ascent in
quadrant IV.
Figure 12: The steepest descent contour in the (u, v)-plane, e.g., C = C1 + C2 + C3 .