Notes On Diffy Q's Jiri Lebl PDF
by Jiří Lebl
November 7, 2019
(version 6.0)
Typeset in LaTeX.
This work is dual licensed under the Creative Commons Attribution-Noncommercial-Share Alike 4.0
International License and the Creative Commons Attribution-Share Alike 4.0 International License.
To view a copy of these licenses, visit https://creativecommons.org/licenses/by-nc-sa/4.0/
or https://creativecommons.org/licenses/by-sa/4.0/ or send a letter to Creative Commons
PO Box 1866, Mountain View, CA 94042, USA.
You can use, print, duplicate, share this book as much as you want. You can base your own notes
on it and reuse parts if you keep the license the same. You can assume the license is either the
CC-BY-NC-SA or CC-BY-SA, whichever is compatible with what you wish to do, your derivative
works must use at least one of the licenses. Derivative works must be prominently marked as such.
During the writing of these notes, the author was in part supported by NSF grant DMS-0900885
and DMS-1362337.
The date is the main identifier of version. The major version / edition number is raised only if there
have been substantial changes. Edition number started at 5, that is, version 5.0, as it was not kept
track of before.
The LaTeX source for the book is available for possible modification and customization at github:
https://github.com/jirilebl/diffyqs
Contents
Introduction
0.1 Notes about these notes
0.2 Introduction to differential equations
0.3 Classification of differential equations
Index
Introduction
0.1.1 Organization
The organization of this book to some degree requires chapters be done in order. Later
chapters can be dropped. The dependence of the material covered is roughly:
[Diagram: dependency chart relating the Introduction, Appendix A, and Chapters 1 through 8.]
There are a few references in chapters 4 and 5 to chapter 3 (some linear algebra), but
these references are not essential and can be skimmed over, so chapter 3 can safely be
dropped, while still covering chapters 4 and 5. Chapter 6 does not depend on chapter 4
except that the PDE section 6.5 makes a few references to chapter 4, although it could in
theory be covered separately. The more in-depth appendix A on linear algebra can replace
the short review § 3.2 for a course that combines linear algebra and ODE.
0.1.4 Acknowledgments
Firstly, I would like to acknowledge Rick Laugesen. I used his handwritten class notes
the first time I taught Math 286. My organization of this book through chapter 5, and
the choice of material covered, is heavily influenced by his notes. Many examples and
computations are taken from his notes. I am also heavily indebted to Rick for all the
advice he has given me, not just on teaching Math 286. For spotting errors and other
suggestions, I would also like to acknowledge (in no particular order): John P. D’Angelo,
Sean Raleigh, Jessica Robinson, Michael Angelini, Leonardo Gomes, Jeff Winegar, Ian
Simon, Thomas Wicklund, Eliot Brenner, Sean Robinson, Jannett Susberry, Dana Al-Quadi,
Cesar Alvarez, Cem Bagdatlioglu, Nathan Wong, Alison Shive, Shawn White, Wing Yip
Ho, Joanne Shin, Gladys Cruz, Jonathan Gomez, Janelle Louie, Navid Froutan, Grace
Victorine, Paul Pearson, Jared Teague, Ziad Adwan, Martin Weilandt, Sönmez Şahutoğlu,
Pete Peterson, Thomas Gresham, Prentiss Hyde, Jai Welch, Simon Tse, Andrew Browning,
James Choi, Dusty Grundmeier, John Marriott, Jim Kruidenier, Barry Conrad, Wesley
Snider, Colton Koop, Sarah Morse, Erik Boczko, Asif Shakeel, Chris Peterson, Nicholas
Hu, Paul Seeburger, Jonathan McCormick, David Leep, William Meisel, Shishir Agrawal,
Tom Wan, and probably others I have forgotten. Finally, I would like to acknowledge NSF
grants DMS-0900885 and DMS-1362337.
Yay! We got precisely the right-hand side. But there is more! We claim $x = \cos t + \sin t + e^{-t}$ is also a solution:
\[
\frac{dx}{dt} + x = \underbrace{(-\sin t + \cos t - e^{-t})}_{\frac{dx}{dt}} + \underbrace{(\cos t + \sin t + e^{-t})}_{x} = 2\cos t.
\]
for some constant C. Different constants C will give different solutions, so there are really
infinitely many possible solutions. See Figure 1 for the graph of a few of these solutions.
We will see how we find these solutions a few lectures from now.
\[
P(t) = Ce^{kt},
\]
\[
\frac{dP}{dt} = Cke^{kt} = kP.
\]
And it really is a solution.
[Figure 2: Bacteria growth in the first 60 seconds.]
200. Let us plug these conditions in and see what happens.
\[
100 = P(0) = Ce^{k0} = C, \qquad 200 = P(10) = 100\,e^{k10}.
\]
Therefore, $2 = e^{10k}$ or $\frac{\ln 2}{10} = k \approx 0.069$. So
\[
P(t) = 100\,e^{(\ln 2)t/10} \approx 100\,e^{0.069t}.
\]
At one minute, $t = 60$, the population is $P(60) = 6400$. See Figure 2.
Let us talk about the interpretation of the results. Does our solution mean that there
must be exactly 6400 bacteria on the plate at 60s? No! We made assumptions that might
not be true exactly, just approximately. If our assumptions are reasonable, then there
will be approximately 6400 bacteria. Also, in real life P is a discrete quantity, not a real
number. However, our model has no problem saying that for example at 61 seconds,
$P(61) \approx 6859.35$.
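The numbers in this bacteria example are easy to reproduce. The following is a minimal sketch, assuming only the two measurements used in the text, $P(0) = 100$ and $P(10) = 200$; the function name `population` is a choice made here for illustration.

```python
import math

# Data from the text: P(0) = 100 and P(10) = 200 (time in seconds).
P0 = 100.0
k = math.log(2) / 10          # from 2 = e^{10k}

def population(t):
    """Exponential model P(t) = P0 * e^{k t}."""
    return P0 * math.exp(k * t)

print(f"k     = {k:.4f}")
print(f"P(60) = {population(60):.2f}")
print(f"P(61) = {population(61):.2f}")
```

The model happily returns a non-integer value at 61 seconds, which is exactly the point made above: the formula describes an idealized quantity, not a literal head count.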
Normally, the $k$ in $P' = kP$ is known, and we want to solve the equation for different
initial conditions. What does that mean? Take $k = 1$ for simplicity. Suppose we want to
solve the equation $\frac{dP}{dt} = P$ subject to $P(0) = 1000$ (the initial condition). Then the solution
turns out to be (exercise)
\[
P(t) = 1000\,e^{t}.
\]
We call $P(t) = Ce^{t}$ the general solution, as every solution of the equation can be written
in this form for some constant C. We need an initial condition to find out what C is, in
order to find the particular solution we are looking for. Generally, when we say “particular
solution,” we just mean some solution.
\[
y(x) = Ce^{kx}.
\]
We saw above that this function is a solution, although we used different variable names.
Next,
\[
\frac{dy}{dx} = -ky,
\]
for some constant $k > 0$. The general solution for this equation is
\[
y(x) = Ce^{-kx}.
\]
Exercise 0.2.1: Check that the y given is really a solution to the equation.
Next, take the second order differential equation
\[
\frac{d^2 y}{dx^2} = -k^2 y,
\]
for some constant $k > 0$. The general solution for this equation is
\[
y(x) = C_1 \cos(kx) + C_2 \sin(kx).
\]
Since the equation is a second order differential equation, we have two constants in our
general solution.
Exercise 0.2.2: Check that the y given is really a solution to the equation.
Finally, consider the second order differential equation
\[
\frac{d^2 y}{dx^2} = k^2 y,
\]
for some constant $k > 0$. The general solution for this equation is
\[
y(x) = C_1 e^{kx} + C_2 e^{-kx},
\]
or
\[
y(x) = D_1 \cosh(kx) + D_2 \sinh(kx).
\]
For those that do not know, cosh and sinh are defined by
\[
\cosh x = \frac{e^{x} + e^{-x}}{2}, \qquad \sinh x = \frac{e^{x} - e^{-x}}{2}.
\]
They are called the hyperbolic cosine and hyperbolic sine. These functions are sometimes
easier to work with than exponentials. They have some nice familiar properties such as
$\cosh 0 = 1$, $\sinh 0 = 0$, and $\frac{d}{dx} \cosh x = \sinh x$ (no that is not a typo) and $\frac{d}{dx} \sinh x = \cosh x$.
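The two forms of the general solution describe the same family of functions: from the definitions of cosh and sinh, taking $C_1 = (D_1 + D_2)/2$ and $C_2 = (D_1 - D_2)/2$ turns one into the other. Here is a small numerical sanity check, offered as a sketch rather than a proof; the helper names and the sample values of $k$, $D_1$, $D_2$ are arbitrary choices made for illustration.

```python
import math

def y_exp(x, k, c1, c2):
    """First form: y = C1 e^{kx} + C2 e^{-kx}."""
    return c1 * math.exp(k * x) + c2 * math.exp(-k * x)

def y_hyp(x, k, d1, d2):
    """Second form: y = D1 cosh(kx) + D2 sinh(kx)."""
    return d1 * math.cosh(k * x) + d2 * math.sinh(k * x)

def second_derivative(f, x, h=1e-4):
    """Central finite-difference approximation of f''(x)."""
    return (f(x - h) - 2 * f(x) + f(x + h)) / (h * h)

k, d1, d2 = 1.5, 2.0, -0.5               # arbitrary illustrative values
c1, c2 = (d1 + d2) / 2, (d1 - d2) / 2    # from the definitions of cosh and sinh

for x in (-1.0, 0.0, 0.7, 2.0):
    same = math.isclose(y_exp(x, k, c1, c2), y_hyp(x, k, d1, d2), rel_tol=1e-12)
    ypp = second_derivative(lambda s: y_hyp(s, k, d1, d2), x)
    residual = ypp - k**2 * y_hyp(x, k, d1, d2)
    print(f"x = {x:5.1f}   forms agree: {same}   y'' - k^2 y approx {residual:.2e}")
```

The residual is not exactly zero because the second derivative is only approximated by a finite difference; a check by hand is still what the exercise asks for.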
Exercise 0.2.3: Check that both forms of the y given are really solutions to the equation.
Example 0.2.2: In equations of higher order, you get more constants you must solve for
to get a particular solution. The equation $\frac{d^2 y}{dx^2} = 0$ has the general solution $y = C_1 x + C_2$;
simply integrate twice and don’t forget about the constant of integration. Consider the
initial conditions $y(0) = 2$ and $y'(0) = 3$. We plug in our general solution and solve for the
constants:
\[
2 = y(0) = C_1 \cdot 0 + C_2 = C_2, \qquad 3 = y'(0) = C_1.
\]
In other words, $y = 3x + 2$ is the particular solution we seek.
An interesting note about cosh: The graph of cosh is the exact shape of a hanging chain.
This shape is called a catenary. Contrary to popular belief this is not a parabola. If you
invert the graph of cosh, it is also the ideal arch for supporting its own weight. For example,
the Gateway Arch in Saint Louis is an inverted graph of cosh; if it were just a parabola it
might fall down. The formula used in the design is inscribed inside the arch:
\[
y = -127.7\ \text{ft} \cdot \cosh(x / 127.7\ \text{ft}) + 757.7\ \text{ft}.
\]
0.2.5 Exercises
Exercise 0.2.4: Show that $x = e^{4t}$ is a solution to $x''' - 12x'' + 48x' - 64x = 0$.
Exercise 0.2.5: Show that $x = e^{t}$ is not a solution to $x''' - 12x'' + 48x' - 64x = 0$.
Exercise 0.2.6: Is $y = \sin t$ a solution to $\left( \frac{dy}{dt} \right)^2 = 1 - y^2$? Justify.
Exercise 0.2.7: Let $y'' + 2y' - 8y = 0$. Now try a solution of the form $y = e^{rx}$ for some (unknown)
constant $r$. Is this a solution for some $r$? If so, find all such $r$.
Exercise 0.2.8: Verify that $x = Ce^{2t}$ is a solution to $x' = 2x$. Find $C$ to solve for the initial
condition $x(0) = 100$.
Exercise 0.2.10: Find a solution to $(x')^2 + x^2 = 4$ using your knowledge of derivatives of functions
that you know from basic calculus.
Exercise 0.2.13: The population of city X was ��� thousand �� years ago, and the population of
city X was ��� thousand �� years ago. Assuming constant growth, you can use the exponential
population model (like for the bacteria). What do you estimate the population is now?
Exercise 0.2.14: Suppose that a football coach gets a salary of one million dollars now, and a raise
of ��% every year (so exponential model, like population of bacteria). Let $s$ be the salary in millions
of dollars, and $t$ is time in years.
a) What is $s(0)$ and $s(1)$?
b) Approximately how many years will it take for the salary to be �� million?
c) Approximately how many years will it take for the salary to be �� million?
d) Approximately how many years will it take for the salary to be �� million?
Note: Exercises with numbers 101 and higher have solutions in the back of the book.
Exercise 0.2.103: Let $x y'' - y' = 0$. Try a solution of the form $y = x^{r}$. Is this a solution for some
$r$? If so, find all such $r$.
• Ordinary differential equations or (ODE) are equations where the derivatives are taken
with respect to only one variable. That is, there is only one independent variable.
• Partial differential equations or (PDE) are equations that depend on partial derivatives
of several variables. That is, there are several independent variables.
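\[
\frac{dy}{dt} = ky, \qquad \text{(Exponential growth)}
\]
\[
\frac{dy}{dt} = k(A - y), \qquad \text{(Newton's law of cooling)}
\]
\[
m \frac{d^2 x}{dt^2} + c \frac{dx}{dt} + kx = f(t). \qquad \text{(Mechanical vibrations)}
\]
And of partial differential equations:
\[
\frac{\partial y}{\partial t} + c \frac{\partial y}{\partial x} = 0, \qquad \text{(Transport equation)}
\]
\[
\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}, \qquad \text{(Heat equation)}
\]
\[
\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}. \qquad \text{(Wave equation in 2 dimensions)}
\]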
If there are several equations working together, we have a so-called system of differential
equations. For example,
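\[
y' = x, \qquad x' = y
\]
is a simple system of ordinary differential equations. Maxwell’s equations for electromagnetics,
\[
\nabla \cdot \vec{D} = \rho, \qquad \nabla \cdot \vec{B} = 0,
\]
\[
\nabla \times \vec{E} = -\frac{\partial \vec{B}}{\partial t}, \qquad \nabla \times \vec{H} = \vec{J} + \frac{\partial \vec{D}}{\partial t},
\]
are a system of partial differential equations. The divergence operator $\nabla \cdot$ and the curl
operator $\nabla \times$ can be written out in partial derivatives of the functions involved in the $x$, $y$,
and $z$ variables.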
The next bit of information is the order of the equation (or system). The order is simply
the order of the largest derivative that appears. If the highest derivative that appears is
the first derivative, the equation is of first order. If the highest derivative that appears is
the second derivative, then the equation is of second order. For example, Newton’s law
of cooling above is a first order equation, while the mechanical vibrations equation is a
second order equation. The equation governing transversal vibrations in a beam,
\[
a^4 \frac{\partial^4 y}{\partial x^4} + \frac{\partial^2 y}{\partial t^2} = 0,
\]
is a fourth order partial differential equation. It is fourth order as at least one derivative is
the fourth derivative. It does not matter that the derivative in t is only of second order.
In the first chapter we will start attacking first order ordinary differential equations,
that is, equations of the form $\frac{dy}{dx} = f(x, y)$. In general, lower order equations are easier to
work with and have simpler behavior, which is why we start with them.
We also distinguish how the dependent variables appear in the equation (or system).
In particular, we say an equation is linear if the dependent variable (or variables) and their
derivatives appear linearly, that is only as first powers, they are not multiplied together,
and no other functions of the dependent variables appear. In other words, the equation is
a sum of terms, where each term is some function of the independent variables or some
function of the independent variables multiplied by a dependent variable or its derivative.
Otherwise, the equation is called nonlinear. For example, an ordinary differential equation
is linear if it can be put into the form
\[
a_n(x) \frac{d^n y}{dx^n} + a_{n-1}(x) \frac{d^{n-1} y}{dx^{n-1}} + \cdots + a_1(x) \frac{dy}{dx} + a_0(x) y = b(x). \tag{2}
\]
The functions $a_0, a_1, \ldots, a_n$ are called the coefficients. The equation is allowed to depend
arbitrarily on the independent variable. So
\[
e^{x} \frac{d^2 y}{dx^2} + \sin(x) \frac{dy}{dx} + x^2 y = \frac{1}{x} \tag{3}
\]
is still a linear equation as y and its derivatives only appear linearly.
All the equations and systems given above as examples are linear. It may not be
immediately obvious for Maxwell’s equations unless you write out the divergence and curl
in terms of partial derivatives. Let us see some nonlinear equations. For example Burgers’
equation,
\[
\frac{\partial y}{\partial t} + y \frac{\partial y}{\partial x} = \nu \frac{\partial^2 y}{\partial x^2},
\]
is a nonlinear second order partial differential equation. It is nonlinear because $y$ and $\frac{\partial y}{\partial x}$
are multiplied together. The equation
\[
\frac{dx}{dt} = x^2 \tag{4}
\]
is a nonlinear first order differential equation as there is a second power of the dependent
variable x.
A linear equation may further be called homogeneous, if all terms depend on the
dependent variable. That is, if there is no term that is a function of the independent
variables alone. Otherwise, the equation is called nonhomogeneous or inhomogeneous. For
example, the exponential growth equation, the wave equation, or the transport equation
above are homogeneous. The mechanical vibrations equation above is nonhomogeneous
as long as f (t) is not the zero function. Similarly, if the ambient temperature A is nonzero,
Newton’s law of cooling is nonhomogeneous. A homogeneous linear ODE can be put into
the form
\[
a_n(x) \frac{d^n y}{dx^n} + a_{n-1}(x) \frac{d^{n-1} y}{dx^{n-1}} + \cdots + a_1(x) \frac{dy}{dx} + a_0(x) y = 0.
\]
Compare to (2) and notice there is no function b(x).
If the coefficients of a linear equation are actually constant functions, then the equation
is said to have constant coefficients. The coefficients are the functions multiplying the
dependent variable(s) or one of its derivatives, not the function b(x) standing alone. A
constant coefficient nonhomogeneous ODE is an equation of the form
\[
a_n \frac{d^n y}{dx^n} + a_{n-1} \frac{d^{n-1} y}{dx^{n-1}} + \cdots + a_1 \frac{dy}{dx} + a_0 y = b(x),
\]
where $a_0, a_1, \ldots, a_n$ are all constants, but $b$ may depend on the independent variable $x$. The
mechanical vibrations equation above is a constant coefficient nonhomogeneous second
order ODE. Same nomenclature applies to PDEs, so the transport equation, heat equation
and wave equation are all examples of constant coefficient linear PDEs.
Finally, an equation (or system) is called autonomous if the equation does not depend on
the independent variable. For autonomous ordinary differential equations, the independent
variable is then thought of as time. Autonomous equation means an equation that does
not change with time. For example, Newton’s law of cooling is autonomous, so is equation
(4). On the other hand, mechanical vibrations or (3) are not autonomous.
0.3.1 Exercises
Exercise 0.3.1: Classify the following equations. Are they ODE or PDE? Is it an equation or a
system? What is the order? Is it linear or nonlinear, and if it is linear, is it homogeneous, constant
coefficient? If it is an ODE, is it autonomous?
a) $\sin(t) \frac{d^2 x}{dt^2} + \cos(t) x = t^2$
b) $\frac{\partial u}{\partial x} + 3 \frac{\partial u}{\partial y} = xy$
c) $y'' + 3y + 5x = 0$, $x'' + x - y = 0$
d) $\frac{\partial^2 u}{\partial t^2} + u \frac{\partial^2 u}{\partial s^2} = 0$
e) $x'' + t x^2 = t$
f) $\frac{d^4 x}{dt^4} = 0$
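curl $\nabla \times \vec{u} = \left( \frac{\partial u_3}{\partial y} - \frac{\partial u_2}{\partial z},\ \frac{\partial u_1}{\partial z} - \frac{\partial u_3}{\partial x},\ \frac{\partial u_2}{\partial x} - \frac{\partial u_1}{\partial y} \right)$. Notice that curl of a vector is still a vector. Write
out Maxwell’s equations in terms of partial derivatives and classify the system.
Exercise 0.3.3: Suppose $F$ is a linear function, that is, $F(x, y) = ax + by$ for constants $a$ and $b$.
What is the classification of equations of the form $F(y', y) = 0$?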
Exercise 0.3.4: Write down an explicit example of a third order, linear, nonconstant coefficient,
nonautonomous, nonhomogeneous system of two ODE such that every derivative that could appear,
does appear.
Exercise 0.3.101: Classify the following equations. Are they ODE or PDE? Is it an equation or a
system? What is the order? Is it linear or nonlinear, and if it is linear, is it homogeneous, constant
coefficient? If it is an ODE, is it autonomous?
a) $\frac{\partial^2 v}{\partial x^2} + 3 \frac{\partial^2 v}{\partial y^2} = \sin(x)$
b) $\frac{dx}{dt} + \cos(t) x = t^2 + t + 1$
c) $\frac{d^7 F}{dx^7} = 3F(x)$
d) $y'' + 8y' = 1$
e) $x'' + t y x' = 0$, $y'' + t x y = 0$
f) $\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial s^2} + u^2$
Exercise 0.3.102: Write down the general zeroth order linear ordinary differential equation. Write
down the general solution.
\[
\frac{dy}{dx} = f(x, y),
\]
or just
\[
y' = f(x, y).
\]
In general, there is no simple formula or procedure one can follow to find solutions. In the
next few lectures we will look at special cases where solutions are not difficult to obtain. In
this section, let us assume that f is a function of x alone, that is, the equation is
\[
y' = f(x). \tag{1.1}
\]
that is
\[
y(x) = \int f(x)\,dx + C.
\]
This y(x) is actually the general solution. So to solve (1.1), we find some antiderivative of
f (x) and then we add an arbitrary constant to get the general solution.
Now is a good time to discuss a point about calculus notation and terminology.
Calculus textbooks muddy the waters by talking about the integral as primarily the
so-called indefinite integral. The indefinite integral is really the antiderivative (in fact the
whole one-parameter family of antiderivatives). There really exists only one integral and
that is the definite integral. The only reason for the indefinite integral notation is that we
can always write an antiderivative as a (definite) integral. That is, by the fundamental
theorem of calculus we can always write $\int f(x)\,dx + C$ as
\[
\int_{x_0}^{x} f(t)\,dt + C.
\]
Hence the terminology to integrate when we may really mean to antidifferentiate. Integration
is just one way to compute the antiderivative (and it is a way that always works, see the
following examples). Integration is defined as the area under the graph; it only happens to
also compute antiderivatives. For sake of consistency, we will keep using the indefinite
integral notation when we want an antiderivative, and you should always think of the
definite integral as a way to write it.
Example 1.1.1: Find the general solution of $y' = 3x^2$.
Elementary calculus tells us that the general solution must be $y = x^3 + C$. Let us check
by differentiating: $y' = 3x^2$. We got precisely our equation back.
Normally, we also have an initial condition such as $y(x_0) = y_0$ for some two numbers
$x_0$ and $y_0$ ($x_0$ is usually 0, but not always). We can then write the solution as a definite
integral in a nice way. Suppose our problem is $y' = f(x)$, $y(x_0) = y_0$. Then the solution is
\[
y(x) = \int_{x_0}^{x} f(s)\,ds + y_0. \tag{1.2}
\]
Let us check! We compute $y' = f(x)$, via the fundamental theorem of calculus, and
by Jupiter, $y$ is a solution. Is it the one satisfying the initial condition? Well, $y(x_0) = \int_{x_0}^{x_0} f(x)\,dx + y_0 = y_0$. It is!
Do note that the definite integral and the indefinite integral (antidifferentiation) are
completely different beasts. The definite integral always evaluates to a number. Therefore,
(1.2) is a formula we can plug into the calculator or a computer, and it will be happy to
calculate specific values for us. We will easily be able to plot the solution and work with it
just like with any other function. It is not so crucial to always find a closed form for the
antiderivative.
Example 1.1.2: Solve
\[
y' = e^{-x^2}, \qquad y(0) = 1.
\]
By the preceding discussion, the solution must be
\[
y(x) = \int_0^x e^{-s^2}\,ds + 1.
\]
Here is a good way to make fun of your friends taking second semester calculus. Tell them
to find the closed form solution. Ha ha ha (bad math joke). It is not possible (in closed
form). There is absolutely nothing wrong with writing the solution as a definite integral.
This particular integral is in fact very important in statistics.
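To illustrate the remark that a definite integral is something a computer can simply evaluate, here is a minimal sketch that computes the solution of Example 1.1.2 from formula (1.2). It assumes composite Simpson's rule as the quadrature method (any numerical integrator would do), and the function names are choices made here. The error function `math.erf` from the Python standard library serves only as an independent cross-check.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson's rule for the definite integral of f over [a, b]; n must be even."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def y(x):
    """Solution of y' = e^{-x^2}, y(0) = 1, written as in formula (1.2)."""
    return simpson(lambda s: math.exp(-s * s), 0.0, x) + 1.0

for x in (0.5, 1.0, 2.0):
    # Cross-check: the integral of e^{-s^2} from 0 to x equals (sqrt(pi)/2) * erf(x).
    reference = 1.0 + math.sqrt(math.pi) / 2 * math.erf(x)
    print(f"y({x}) approx {y(x):.8f}   (erf-based reference {reference:.8f})")
```

With a function like `y` in hand, plotting the solution or tabulating values works just as it would for a function given by a closed form.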
\[
y' = f(y).
\]
\[
\frac{dy}{dx} = f(y).
\]
Now we use the inverse function theorem from calculus to switch the roles of $x$ and $y$ to
obtain
\[
\frac{dx}{dy} = \frac{1}{f(y)}.
\]
What we are doing seems like algebra with dx and dy. It is tempting to just do algebra
with dx and dy as if they were numbers. And in this case it does work. Be careful, however,
as this sort of hand-waving calculation can lead to trouble, especially when more than one
independent variable is involved. At this point we can simply integrate,
\[
x(y) = \int \frac{1}{f(y)}\,dy + C.
\]
\[
\frac{dx}{dy} = \frac{1}{ky}.
\]
We integrate to obtain
\[
x(y) = x = \frac{1}{k} \ln |y| + D,
\]
where $D$ is an arbitrary constant. Now we solve for $y$ (actually for $|y|$).
\[
|y| = e^{kx - kD} = e^{-kD} e^{kx}.
\]
If we replace $e^{-kD}$ with an arbitrary constant $C$, we can get rid of the absolute value bars
(which we can do as $D$ was arbitrary). In this way, we also incorporate the solution $y = 0$.
We get the same general solution as we guessed before, $y = Ce^{kx}$.
Example 1.1.4: Find the general solution of $y' = y^2$.
First we note that $y = 0$ is a solution. We can now assume that $y \neq 0$. Write
\[
\frac{dx}{dy} = \frac{1}{y^2}.
\]
We integrate to get
\[
x = \frac{-1}{y} + C.
\]
We solve for $y = \frac{1}{C - x}$. So the general solution is
\[
y = \frac{1}{C - x} \qquad \text{or} \qquad y = 0.
\]
Note the singularities of the solution. If for example $C = 1$, then the solution “blows up”
as we approach $x = 1$. See Figure 1.1. Generally, it is hard to tell from just looking at the
equation itself how the solution is going to behave. The equation $y' = y^2$ is very nice and
defined everywhere, but the solution is only defined on some interval $(-\infty, C)$ or $(C, \infty)$.
Usually when this happens we only consider one of these as the solution. For example if
we impose a condition $y(0) = 1$, then the solution is $y = \frac{1}{1 - x}$, and we would consider this
solution only for $x$ on the interval $(-\infty, 1)$. In the figure, it is the left side of the graph.
Figure 1.1: Plot of $y = \frac{1}{1 - x}$.
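The blow-up in Example 1.1.4 can also be seen numerically. The sketch below integrates $y' = y^2$, $y(0) = 1$ with the classical fourth-order Runge-Kutta method, which is a choice made here for illustration and not a method introduced in the text, and compares the result with the exact solution $y = \frac{1}{1 - x}$ as $x$ approaches 1. The step count is an arbitrary assumption.

```python
def rk4_step(f, x, y, h):
    """One classical fourth-order Runge-Kutta step for y' = f(x, y)."""
    k1 = f(x, y)
    k2 = f(x + h / 2, y + h * k1 / 2)
    k3 = f(x + h / 2, y + h * k2 / 2)
    k4 = f(x + h, y + h * k3)
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def solve(f, x0, y0, x_end, steps):
    """Integrate from x0 to x_end with a fixed number of RK4 steps."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        y = rk4_step(f, x, y, h)
        x += h
    return y

f = lambda x, y: y * y                 # the equation y' = y^2
exact = lambda x: 1.0 / (1.0 - x)      # solution with y(0) = 1, valid for x < 1

for x_end in (0.5, 0.9, 0.99):
    approx = solve(f, 0.0, 1.0, x_end, steps=20000)
    print(f"x = {x_end:4.2f}   numeric approx {approx:10.4f}   exact = {exact(x_end):10.4f}")
```

The values grow without bound as the endpoint nears 1, even though the right-hand side $y^2$ is perfectly well behaved; a fixed-step method cannot be trusted at or past the singularity itself.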
\[
x' = e^{t/2}.
\]
\[
x(t) = 2e^{t/2} + C.
\]
We still need to figure out $C$. We know that when $t = 0$, then $x = 0$. That is, $x(0) = 0$. So
\[
0 = x(0) = 2e^{0/2} + C = 2 + C.
\]
Thus $C = -2$ and
\[
x(t) = 2e^{t/2} - 2.
\]
Now we just plug in to get where the car is at 2 and at 10 seconds. We obtain
\[
x(2) = 2e^{2/2} - 2 \approx 3.44, \qquad x(10) = 2e^{10/2} - 2 \approx 294.83.
\]
Example 1.1.6: Suppose that the car accelerates at a rate of $t^2$ $\mathrm{m/s^2}$. At time $t = 0$ the car is
at the 1 meter mark and is traveling at 10 m/s. Where is the car at time $t = 10$?
Well this is actually a second order problem. If $x$ is the distance traveled, then $x'$ is the
velocity, and $x''$ is the acceleration. The equation with initial conditions is
\[
x'' = t^2, \qquad x(0) = 1, \qquad x'(0) = 10.
\]
Writing $v = x'$ for the velocity, we get the first order problem
\[
v' = t^2, \qquad v(0) = 10.
\]
Exercise 1.1.1: Solve for $v$, and then solve for $x$. Find $x(10)$ to answer the question.
1.1.1 Exercises
Exercise 1.1.2: Solve $\frac{dy}{dx} = x^2 + x$ for $y(1) = 3$.
Exercise 1.1.3: Solve $\frac{dy}{dx} = \sin(5x)$ for $y(0) = 2$.
Exercise 1.1.4: Solve $\frac{dy}{dx} = \frac{1}{x^2 - 1}$ for $y(0) = 0$.
Exercise 1.1.9: A spaceship is traveling at the speed $2t^2 + 1$ km/s ($t$ is time in seconds). It is pointing
directly away from earth and at time $t = 0$ it is ���� kilometers from earth. How far from earth is it
at one minute from time $t = 0$?
Exercise 1.1.11: A dropped ball accelerates downwards at a constant rate 9.8 meters per second
squared. Set up the differential equation for the height above ground $h$ in meters. Then supposing
$h(0) = 100$ meters, how long does it take for the ball to hit the ground?
Exercise 1.1.104: Sid is in a car traveling at speed $10t + 70$ miles per hour away from Las Vegas,
where $t$ is in hours. At $t = 0$, Sid is �� miles away from Vegas. How far from Vegas is Sid � hours
later?
Exercise 1.1.105: Solve $y' = y^n$, $y(0) = 1$, where $n$ is a positive integer. Hint: You have to
consider different cases.
Exercise 1.1.106: The rate of change of the volume of a snowball that is melting is proportional to
the surface area of the snowball. Suppose the snowball is perfectly spherical. Then the volume (in
centimeters cubed) of a ball of radius $r$ centimeters is $\frac{4}{3} \pi r^3$. The surface area is $4 \pi r^2$. Set up the
differential equation for how $r$ is changing. Then, suppose that at time $t = 0$ minutes, the radius is
�� centimeters. After � minutes, the radius is � centimeters. At what time $t$ will the snowball be
completely melted?
Exercise 1.1.107: Find the general solution to $y'''' = 0$. How many distinct constants do you need?
\[
y' = f(x, y).
\]
A lot of the time, we cannot simply solve these kinds of equations explicitly. It would
be nice if we could at least figure out the shape and behavior of the solutions, or find
approximate solutions.
To get an idea of how solutions behave, we draw such lines at lots of points in the plane,
not just the point (2, 1.5). We would ideally want to see the slope at every point, but that is
just not possible. Usually we pick a grid of points fine enough so that it shows the behavior,
but not too fine so that we can still recognize the individual lines. We call this picture the
slope field of the equation. See Figure 1.3 for the slope field of the
equation $y' = xy$. Usually in practice, one does not do this by hand, but has a computer do
the drawing.
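As a rough illustration of what "having a computer do the drawing" involves, the sketch below computes the short line segments of a slope field on a grid. It is a minimal example under stated assumptions: the function name `slope_field`, the grid spacing of 0.5, and the segment length are all choices made here, and the actual drawing is left to whatever plotting tool is at hand.

```python
import math

def slope_field(f, xs, ys, length=0.3):
    """Return segments ((x1, y1), (x2, y2)) centered at each grid point with slope f(x, y)."""
    segments = []
    for y in ys:
        for x in xs:
            m = f(x, y)
            # Horizontal and vertical half-extents of a segment of total length `length`.
            dx = length / (2 * math.sqrt(1 + m * m))
            dy = m * dx
            segments.append(((x - dx, y - dy), (x + dx, y + dy)))
    return segments

f = lambda x, y: x * y                      # the equation y' = xy from the text
grid = [i * 0.5 for i in range(-6, 7)]      # -3, -2.5, ..., 3 on both axes
segments = slope_field(f, grid, grid)

print(len(segments), "segments")            # one segment per grid point
print("slope at (2, 1.5):", f(2, 1.5))
```

Normalizing every segment to the same length keeps steep slopes from dominating the picture, which is the usual convention for slope field plots.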
Suppose we are given a specific initial condition $y(x_0) = y_0$. A solution, that is, the
graph of the solution, would be a curve that follows the slopes we drew. For a few sample
solutions, see Figure 1.4. It is easy to roughly sketch (or at least imagine) possible solutions
in the slope field, just from looking at the slope field itself. You simply sketch a line that
roughly fits the little line segments and goes through your initial condition.
Figure 1.3: Slope field of $y' = xy$.
Figure 1.4: Slope field of $y' = xy$ with a graph of solutions satisfying $y(0) = 0.2$, $y(0) = 0$, and $y(0) = -0.2$.
By looking at the slope field we get a lot of information about the behavior of solutions
without having to solve the equation. For example, in Figure 1.4 we see what the solutions
do when the initial conditions are $y(0) > 0$, $y(0) = 0$ and $y(0) < 0$. A small change in the
initial condition causes quite different behavior. We see this behavior just from the slope
field and imagining what solutions ought to do.
We see a different behavior for the equation $y' = -y$. The slope field and a few solutions
are shown in Figure 1.5. If we think of moving from left to right (perhaps $x$ is
time and time is usually increasing), then we see that no matter what y(0) is, all solutions
tend to zero as x tends to infinity. Again that behavior is clear from simply looking at the
slope field itself.
What do you think is the answer? The answer seems to be yes to both does it not? Well,
pretty much. But there are cases when the answer to either question can be no.
Since generally the equations we encounter in applications come from real life situations,
it seems logical that a solution always exists. It also has to be unique if we believe our
universe is deterministic. If the solution does not exist, or if it is not unique, we have
probably not devised the correct model. Hence, it is good to know when things go wrong
and why.
Example 1.2.1: Attempt to solve:
\[
y' = \frac{1}{x}, \qquad y(0) = 0.
\]
Integrate to find the general solution $y = \ln |x| + C$. The solution does not exist at $x = 0$.
See Figure 1.6. The equation may have been written as the seemingly
harmless $x y' = 1$.
Example 1.2.2: Solve:
\[
y' = 2\sqrt{|y|}, \qquad y(0) = 0.
\]
See Figure 1.7. Note that $y = 0$ is a solution. But another solution is
the function
\[
y(x) = \begin{cases} x^2 & \text{if } x \geq 0, \\ -x^2 & \text{if } x < 0. \end{cases}
\]
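Both functions named in this example can be checked against the equation numerically. The sketch below is only a sanity check under stated assumptions: it approximates the derivative by a central difference at a handful of sample points chosen here, and the function names are hypothetical.

```python
import math

def rhs(y):
    """Right-hand side of y' = 2 sqrt(|y|)."""
    return 2.0 * math.sqrt(abs(y))

def zero_solution(x):
    """The constant solution y = 0."""
    return 0.0

def parabola_solution(x):
    """y = x^2 for x >= 0 and y = -x^2 for x < 0, that is, y = x |x|."""
    return x * abs(x)

def derivative(f, x, h=1e-6):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

for name, sol in (("y = 0", zero_solution), ("y = x|x|", parabola_solution)):
    worst = max(abs(derivative(sol, x) - rhs(sol(x)))
                for x in (-2.0, -0.5, 0.0, 0.5, 2.0))
    print(f"{name:9s} y(0) = {sol(0.0)}   max |y' - 2 sqrt|y|| approx {worst:.1e}")
```

Two different functions pass through the same initial point and satisfy the same equation, so the initial value problem does not determine the solution.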
It is hard to tell by staring at the slope field that the solution is not unique. Is there any
hope? Of course there is. We have the following theorem, known as Picard’s theorem.
Figure 1.6: Slope field of $y' = 1/x$.
Figure 1.7: Slope field of $y' = 2\sqrt{|y|}$ with two solutions satisfying $y(0) = 0$.
Theorem 1.2.1 (Picard’s theorem on existence and uniqueness). If $f(x, y)$ is continuous (as
a function of two variables) and $\frac{\partial f}{\partial y}$ exists and is continuous near some $(x_0, y_0)$, then a solution to
\[
y' = f(x, y), \qquad y(x_0) = y_0,
\]
exists (at least for some small interval of $x$’s) and is unique.
Note that the problems $y' = 1/x$, $y(0) = 0$ and $y' = 2\sqrt{|y|}$, $y(0) = 0$ do not satisfy the
hypothesis of the theorem. Even if we can use the theorem, we ought to be careful about
this existence business. It is quite possible that the solution only exists for a short while.
\[
y' = y^2, \qquad y(0) = A.
\]
We know how to solve this equation. First assume that $A \neq 0$, so $y$ is not equal to zero
at least for some $x$ near 0. So $x' = 1/y^2$, so $x = -1/y + C$, so $y = \frac{1}{C - x}$. If $y(0) = A$, then $C = 1/A$
so
\[
y = \frac{1}{1/A - x}.
\]
If $A = 0$, then $y = 0$ is a solution.
For example, when $A = 1$ the solution “blows up” at $x = 1$. Hence, the solution does
not exist for all $x$ even if the equation is nice everywhere. The equation $y' = y^2$ certainly
looks nice.
For most of this course we will be interested in equations where existence and uniqueness
holds, and in fact holds “globally” unlike for the equation $y' = y^2$.
1.2.3 Exercises
Exercise 1.2.1: Sketch slope field for $y' = e^{x - y}$. How do the solutions behave as $x$ grows? Can you
guess a particular solution by looking at the slope field?
Exercise 1.2.7 (challenging): Take $y' = f(x, y)$, $y(0) = 0$, where $f(x, y) > 1$ for all $x$ and $y$.
If the solution exists for all $x$, can you say what happens to $y(x)$ as $x$ goes to positive infinity?
Explain.
Exercise 1.2.9: Suppose $y' = f(x, y)$. What will the slope field look like, explain and sketch an
example, if you know the following about $f(x, y)$:
Exercise 1.2.10: Find a solution to $y' = |y|$, $y(0) = 0$. Does Picard’s theorem apply?
Exercise 1.2.11: Take an equation $y' = (y - 2x) g(x, y) + 2$ for some function $g(x, y)$. Can you
solve the problem for the initial condition $y(0) = 0$, and if so what is the solution?
Exercise 1.2.12 (challenging): Suppose $y' = f(x, y)$ is such that $f(x, 1) = 0$ for every $x$, $f$ is
continuous and $\frac{\partial f}{\partial y}$ exists and is continuous for every $x$ and $y$.
b) Can graphs of two solutions of the equation for different initial conditions ever intersect?
c) Given $y(0) = 0$, what can you say about the solution. In particular, can $y(x) > 1$ for any $x$?
Can $y(x) = 1$ for any $x$? Why or why not?
Exercise 1.2.101: Sketch the slope field of $y' = y^3$. Can you visually find the solution that satisfies
$y(0) = 0$?
Exercise 1.2.104: Match equations $y' = \sin x$, $y' = \cos y$, $y' = y \cos(x)$ to slope fields. Justify.
[Figure: three slope field panels labeled a), b), c).]
Does $y' = f(y)$, $y(0) = 0$ have a continuously differentiable solution? Does Picard apply? Why, or
why not?
Exercise 1.2.106: Consider an equation of the form $y' = f(x)$ for some continuous function $f$, and
an initial condition $y(x_0) = y_0$. Does a solution exist for all $x$? Why or why not?
$\int f(x)\,dx + C$. Unfortunately this method no longer works for the general form of the
equation $y' = f(x, y)$. Integrating both sides yields
\[
y = \int f(x, y)\,dx + C.
\]
\[
y' = f(x) g(y),
\]
for some functions $f(x)$ and $g(y)$. Let us write the equation in the Leibniz notation
\[
\frac{dy}{dx} = f(x) g(y).
\]
Then we rewrite the equation as
\[
\frac{dy}{g(y)} = f(x)\,dx.
\]
If we can find closed form expressions for these two integrals, we can, perhaps, solve for y.
Example 1.3.1: Take the equation
\[ y' = xy. \]
Note that y = 0 is a solution. We will remember that fact and assume $y \neq 0$ from now on,
so that we can divide by y. Write the equation as $\frac{dy}{dx} = xy$. Then
\[ \int \frac{dy}{y} = \int x\,dx + C. \]
\[ \ln|y| = \frac{x^2}{2} + C, \]
or
\[ |y| = e^{\frac{x^2}{2} + C} = e^{\frac{x^2}{2}} e^{C} = D e^{\frac{x^2}{2}}, \]
where D > 0 is some constant. Because y = 0 is also a solution and because of the absolute
value we can write:
\[ y = D e^{\frac{x^2}{2}}, \]
for any number D (including zero or negative).
We check:
\[ y' = D x e^{\frac{x^2}{2}} = x \left( D e^{\frac{x^2}{2}} \right) = xy. \]
Yay!
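The same check can be done numerically. The following is a minimal sketch, not part of the original notes: it approximates the derivative of $y = De^{x^2/2}$ by a central difference and compares it with $xy$. The names `y` and `residual`, the step size, and the sample values of D are illustrative assumptions.

```python
import math

def y(x, D):
    # Candidate solution y = D * exp(x^2 / 2) from Example 1.3.1.
    return D * math.exp(x**2 / 2)

def residual(x, D, h=1e-6):
    # Central-difference estimate of y' minus the right-hand side x*y.
    dydx = (y(x + h, D) - y(x - h, D)) / (2 * h)
    return dydx - x * y(x, D)

# The residual should be close to zero for any D, including zero and negative D.
for D in (-2.0, 0.0, 0.5, 3.0):
    worst = max(abs(residual(i / 10, D)) for i in range(-20, 21))
    print(f"D = {D:5.1f}   largest |y' - x y| on [-2, 2]: {worst:.2e}")
```

A small residual does not prove anything, but it is a quick way to catch a sign error or a lost factor.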
We should be a little bit more careful with this method. You may be worried that we
integrated in two different variables. We seemingly did a different operation to each side.
Let us work through this method more rigorously. Take
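\[ \frac{dy}{dx} = f(x)g(y). \]
We rewrite the equation as follows. Note that y = y(x) is a function of x and so is $\frac{dy}{dx}$!
\[ \frac{1}{g(y)}\,\frac{dy}{dx} = f(x). \]
We integrate both sides with respect to x:
\[ \int \frac{1}{g(y)}\,\frac{dy}{dx}\,dx = \int f(x)\,dx + C. \]
We use the change of variables formula (substitution) on the left hand side:
\[ \int \frac{1}{g(y)}\,dy = \int f(x)\,dx + C. \]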
We separate variables,
\[ \frac{y^2+1}{y}\,dy = \left( y + \frac{1}{y} \right) dy = x\,dx. \]
We integrate to get
\[ \frac{y^2}{2} + \ln|y| = \frac{x^2}{2} + C, \]
or perhaps the easier looking expression (where D = 2C)
\[ y^2 + 2\ln|y| = x^2 + D. \]
It is not easy to find the solution explicitly as it is hard to solve for y. We, therefore, leave
the solution in this form and call it an implicit solution. It is still easy to check that an implicit
solution satisfies the differential equation. In this case, we differentiate with respect to x,
and remember that y is a function of x, to get
\[ y' \left( 2y + \frac{2}{y} \right) = 2x. \]
Multiply both sides by y and divide by $2(y^2+1)$ and you will get exactly the differential
equation. We leave this computation to the reader.
If you have an implicit solution, and you want to compute values for y, you might
have to be tricky. You might get multiple solutions y for each x, so you have to pick one.
Sometimes you can graph x as a function of y, and then flip your paper. Sometimes you
have to do more.
Computers are also good at some of these tricks. More advanced mathematical software
usually has some way of plotting solutions to implicit equations. For example, for C = 0 if
you plot all the points (x, y) that are solutions to $y^2 + 2\ln|y| = x^2$, you find the two curves
in Figure 1.8 on the following page. This is not quite a graph of a function. For each x there
are two choices of y. To find a function you would have to pick one of these two curves.
You pick the one that satisfies your initial condition if you have one. For example, the top
curve satisfies the condition y(1) = 1. So for each C we really got two solutions. As you can
see, computing values from an implicit solution can be somewhat tricky. But sometimes,
an implicit solution is the best we can do.
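As an illustration of computing values from an implicit solution, here is a minimal sketch that picks the top curve (y > 0) of $y^2 + 2\ln|y| = x^2 + C$ and solves for y by bisection. It relies on the observation that $y^2 + 2\ln y$ is increasing for y > 0, so there is exactly one positive root for each x. The function names `g` and `top_curve` and the choice of bisection are assumptions made for this example, not something the notes prescribe.

```python
import math

def g(y):
    # Left-hand side of the implicit solution, restricted to y > 0.
    return y**2 + 2.0 * math.log(y)

def top_curve(x, C=0.0):
    # Solve g(y) = x^2 + C for the unique y > 0 by bisection.
    target = x**2 + C
    lo, hi = 1.0, 1.0
    while g(lo) > target:   # shrink lo until g(lo) <= target
        lo /= 2.0
    while g(hi) < target:   # grow hi until g(hi) >= target
        hi *= 2.0
    for _ in range(100):    # 100 halvings is far more than double precision needs
        mid = (lo + hi) / 2.0
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

for x in (0.0, 0.5, 1.0, 2.0):
    print(f"x = {x:3.1f}   y = {top_curve(x):.6f}")
```

The bottom curve is the mirror image, y = -top_curve(x), because only |y| and $y^2$ appear in the equation.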
The above equation also has the solution y = 0. So the general solution is
\[ y^2 + 2\ln|y| = x^2 + C, \qquad \text{and} \qquad y = 0. \]
\[ x^2 y' = (1 - x^2)(1 + y^2). \]
Figure 1.8: The implicit solution $y^2 + 2\ln|y| = x^2$ to $y' = \frac{xy}{y^2+1}$.
\[ \frac{y'}{1+y^2} = \frac{1-x^2}{x^2}, \]
\[ \frac{y'}{1+y^2} = \frac{1}{x^2} - 1, \]
\[ \arctan(y) = \frac{-1}{x} - x + C, \]
\[ y = \tan\left( \frac{-1}{x} - x + C \right). \]
Example 1.3.3: Bob made a cup of coffee, and Bob likes to drink coffee only once it reaches
60 degrees Celsius and will not burn him. Initially at time t = 0 minutes, Bob measured the
temperature and the coffee was 89 degrees Celsius. One minute later, Bob measured the
coffee again and it had 85 degrees. The temperature of the room (the ambient temperature)
is 22 degrees. When should Bob start drinking?
Let T be the temperature of the coffee in degrees Celsius, and let A be the ambient
(room) temperature, also in degrees Celsius. Newton’s law of cooling states that the rate at
which the temperature of the coffee is changing is proportional to the difference between
the ambient temperature and the temperature of the coffee. That is,
\[ \frac{dT}{dt} = k(A - T), \]
for some constant k. For our setup A = 22, T(0) = 89, T(1) = 85. We separate variables and
integrate (let C and D denote arbitrary constants):
\[ \frac{1}{T - A}\,\frac{dT}{dt} = -k, \]
\[ \ln(T - A) = -kt + C, \qquad (\text{note that } T - A > 0) \]
\[ T - A = D e^{-kt}, \]
\[ T = A + D e^{-kt}. \]
Figure 1.9: Graphs of the coffee temperature function T(t). On the left, horizontal lines are drawn at
temperatures 60, 85, and 89. Vertical lines are drawn at t = 1 and t = 9.21. Notice that the temperature
of the coffee hits 85 at t = 1, and 60 at t ≈ 9.21. On the right, the graph is over a longer period of time,
with a horizontal line at the ambient temperature 22.
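The numbers in the coffee example can be reproduced in a few lines. This is a minimal sketch under the model $T = A + De^{-kt}$ derived above: T(0) = 89 gives D = 89 - 22 = 67, T(1) = 85 then determines k, and solving T(t) = 60 for t gives the drinking time. The variable names are illustrative.

```python
import math

A = 22.0        # ambient temperature
T0 = 89.0       # temperature at t = 0
T1 = 85.0       # temperature at t = 1
target = 60.0   # temperature at which Bob starts drinking

# T(0) = A + D  gives D.
D = T0 - A
# T(1) = A + D e^{-k}  gives k.
k = -math.log((T1 - A) / D)
# T(t) = target  gives t = -ln((target - A) / D) / k.
t_drink = -math.log((target - A) / D) / k

print(f"D = {D:.0f}, k = {k:.4f}, Bob can drink after about {t_drink:.2f} minutes")
```

The computed time agrees with the value t ≈ 9.21 marked in Figure 1.9.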
Example 1.3.4: Find the general solution to $y' = \frac{-xy^2}{3}$ (including singular solutions).
First note that y = 0 is a solution (a singular solution). Now assume that $y \neq 0$.
\[ \frac{-3}{y^2}\,y' = x, \]
\[ \frac{3}{y} = \frac{x^2}{2} + C, \]
\[ y = \frac{3}{x^2/2 + C} = \frac{6}{x^2 + 2C}. \]
\[ y = \frac{6}{x^2 + 2C}, \qquad \text{and} \qquad y = 0. \]
1.3.4 Exercises
Exercise 1.3.1: Solve $y' = x/y$.
Exercise 1.3.7: Solve $\frac{dy}{dx} = \frac{y^2+1}{x^2+1}$, for y(0) = 1.
Exercise 1.3.8: Find an implicit solution for $\frac{dy}{dx} = \frac{x^2+1}{y^2+1}$, for y(0) = 1.
Exercise 1.3.9: Find an explicit solution for $y' = xe^{-y}$, y(0) = 1.
Exercise 1.3.11: Find an explicit solution for $y' = ye^{-x^2}$, y(0) = 1. It is alright to leave a definite
integral in your answer.
Exercise 1.3.12: Suppose a cup of coffee is at ��� degrees Celsius at time t = 0, it is at �� degrees
at t = 10 minutes, and it is at �� degrees at t = 20 minutes. Compute the ambient temperature.
Exercise 1.3.106: Take Example 1.3.3 with the same numbers: 89 degrees at t = 0, 85 degrees at
t = 1, and ambient temperature of 22 degrees. Suppose these temperatures were measured with
precision of ±0.5 degrees. Given this imprecision, the time it takes the coffee to cool to (exactly) 60
degrees is also only known in a certain range. Find this range. Hint: Think about what kind of
error makes the cooling time longer and what shorter.
b) How many rabbits are on the island in � month, � months, �� months, �� months (round to
the nearest integer).
\[ r(x)y' + r(x)p(x)y = \frac{d}{dx}\Bigl[ r(x)y \Bigr]. \]
This is the left-hand side of (1.3) multiplied by r(x). So if we multiply (1.3) by r(x), we
obtain
\[ \frac{d}{dx}\Bigl[ r(x)y \Bigr] = r(x)f(x). \]
Now we integrate both sides. The right-hand side does not depend on y and the left-hand
side is written as a derivative of a function. Afterwards, we solve for y. The function r(x)
is called the integrating factor and the method is called the integrating factor method.
We are looking for a function r(x), such that if we differentiate it, we get the same
function back multiplied by p(x). That seems like a job for the exponential function! Let
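\[ r(x) = e^{\int p(x)\,dx}. \]
We compute:
\[ y' + p(x)y = f(x), \]
\[ e^{\int p(x)\,dx} y' + e^{\int p(x)\,dx} p(x) y = e^{\int p(x)\,dx} f(x), \]
\[ \frac{d}{dx}\Bigl[ e^{\int p(x)\,dx} y \Bigr] = e^{\int p(x)\,dx} f(x), \]
\[ e^{\int p(x)\,dx} y = \int e^{\int p(x)\,dx} f(x)\,dx + C, \]
\[ y = e^{-\int p(x)\,dx} \left( \int e^{\int p(x)\,dx} f(x)\,dx + C \right). \]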
1.4. LINEAR EQUATIONS AND THE INTEGRATING FACTOR
Of course, to get a closed form formula for y, we need to be able to find a closed form
formula for the integrals appearing above.
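Example 1.4.1: Solve
\[ y' + 2xy = e^{x-x^2}, \qquad y(0) = -1. \]
First note that p(x) = 2x and $f(x) = e^{x-x^2}$. The integrating factor is $r(x) = e^{\int p(x)\,dx} = e^{x^2}$.
We multiply both sides of the equation by r(x) to get
\[ e^{x^2} y' + 2x e^{x^2} y = e^{x-x^2} e^{x^2}, \]
\[ \frac{d}{dx}\Bigl[ e^{x^2} y \Bigr] = e^{x}. \]
We integrate
\[ e^{x^2} y = e^{x} + C, \]
\[ y = e^{x-x^2} + C e^{-x^2}. \]
The initial condition gives $-1 = y(0) = 1 + C$, so $C = -2$ and
\[ y = e^{x-x^2} - 2e^{-x^2}. \]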
Note that we do not care which antiderivative we take when computing $e^{\int p(x)\,dx}$. You
can always add a constant of integration, but those constants will not matter in the end.
Exercise 1.4.1: Try it! Add a constant of integration to the integral in the integrating factor and
show that the solution you get in the end is the same as what we got above.
Advice: Do not try to remember the formula itself, that is way too hard. It is easier to
remember the process and repeat it.
Since we cannot always evaluate the integrals in closed form, it is useful to know how
to write the solution in definite integral form. A definite integral is something that you can
plug into a computer or a calculator. Suppose we are given
\[ y(x) = e^{-\int_{x_0}^{x} p(s)\,ds} \left( \int_{x_0}^{x} e^{\int_{x_0}^{t} p(s)\,ds} f(t)\,dt + y_0 \right). \tag{1.4} \]
You should be careful to properly use dummy variables here. If you now plug such a
formula into a computer or a calculator, it will be happy to give you numerical answers.
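To make “plug such a formula into a computer” concrete, here is a minimal sketch that evaluates formula (1.4) with nested numerical integration. Composite Simpson’s rule is an assumption of this example (any quadrature rule would do), and the names `simpson` and `solve_linear` are illustrative. The test problem is Example 1.4.1, where the closed form $y = e^{x-x^2} - 2e^{-x^2}$ is available for comparison.

```python
import math

def simpson(g, a, b, n=200):
    # Composite Simpson's rule on [a, b] with n (even) subintervals.
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def solve_linear(p, f, x0, y0, x):
    # Evaluate formula (1.4) for y' + p(x) y = f(x), y(x0) = y0.
    def P(t):
        # Inner integral of p(s) ds from x0 to t (s is the dummy variable).
        return simpson(p, x0, t)
    outer = simpson(lambda t: math.exp(P(t)) * f(t), x0, x)
    return math.exp(-P(x)) * (outer + y0)

# Example 1.4.1: y' + 2xy = e^{x - x^2}, y(0) = -1.
p = lambda s: 2 * s
f = lambda t: math.exp(t - t**2)
for x in (0.0, 0.5, 1.0, 2.0):
    numeric = solve_linear(p, f, 0.0, -1.0, x)
    exact = math.exp(x - x**2) - 2 * math.exp(-x**2)
    print(f"x = {x:3.1f}   numeric = {numeric:.8f}   exact = {exact:.8f}")
```

Note how the inner integral uses its own variable s and the outer one uses t, which mirrors the warning about dummy variables.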
Exercise 1.4.3: Write the solution of the following problem as a definite integral, but try to simplify
as far as you can. You will not be able to find the solution in closed form.
\[ y' + y = e^{x^2 - x}, \qquad y(0) = 10. \]
Remark 1.4.1: Before we move on, we should note some interesting properties of linear
equations. First, for the linear initial value problem $y' + p(x)y = f(x)$, $y(x_0) = y_0$, there is
always an explicit formula (1.4) for the solution. Second, it follows from the formula (1.4)
that if p(x) and f(x) are continuous on some interval (a, b), then the solution y(x) exists
and is differentiable on (a, b). Compare with the simple nonlinear example we have seen
previously, $y' = y^2$, and compare to Theorem 1.2.1.
Example 1.4.2: Let us discuss a common simple application of linear equations. This type
of problem is used often in real life. For example, linear equations are used in figuring out
the concentration of chemicals in bodies of water (rivers and lakes).
A 100 liter tank contains 10 kilograms of salt dissolved in 60 liters
of water. Solution of water and salt (brine) with concentration of 0.1
kilograms per liter is flowing in at the rate of 5 liters a minute. The
solution in the tank is well stirred and flows out at a rate of 3 liters
a minute. How much salt is in the tank when the tank is full?
[Figure: tank with inflow 5 L/min at 0.1 kg/L, contents 60 L with 10 kg salt, outflow 3 L/min.]
Let us come up with the equation. Let x denote the kilograms of
salt in the tank, let t denote the time in minutes. For a small change
$\Delta t$ in time, the change in x (denoted $\Delta x$) is approximately
\[ \Delta x \approx (\text{rate in} \times \text{concentration in})\,\Delta t - (\text{rate out} \times \text{concentration out})\,\Delta t. \]
\[ \frac{dx}{dt} = (\text{rate in} \times \text{concentration in}) - (\text{rate out} \times \text{concentration out}). \]
In our example, we have
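\[ \text{rate in} = 5, \]
\[ \text{concentration in} = 0.1, \]
\[ \text{rate out} = 3, \]
\[ \text{concentration out} = \frac{x}{\text{volume}} = \frac{x}{60 + (5 - 3)t}. \]
Our equation is, therefore,
\[ \frac{dx}{dt} = (5 \times 0.1) - \left( 3\,\frac{x}{60 + 2t} \right). \]
Or in the form (1.3)
\[ \frac{dx}{dt} + \frac{3}{60 + 2t}\,x = 0.5. \]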
\[ (60+2t)^{3/2}\,\frac{dx}{dt} + (60+2t)^{3/2}\,\frac{3}{60+2t}\,x = 0.5\,(60+2t)^{3/2}, \]
\[ \frac{d}{dt}\Bigl[ (60+2t)^{3/2} x \Bigr] = 0.5\,(60+2t)^{3/2}, \]
\[ (60+2t)^{3/2} x = \int 0.5\,(60+2t)^{3/2}\,dt + C, \]
\[ x = (60+2t)^{-3/2} \int \frac{(60+2t)^{3/2}}{2}\,dt + C(60+2t)^{-3/2}, \]
\[ x = (60+2t)^{-3/2}\,\frac{1}{10}\,(60+2t)^{5/2} + C(60+2t)^{-3/2}, \]
\[ x = \frac{60+2t}{10} + C(60+2t)^{-3/2}. \]
We need to find C. We know that at t = 0, x = 10. So
\[ 10 = x(0) = \frac{60}{10} + C(60)^{-3/2} = 6 + C(60)^{-3/2}, \]
or
\[ C = 4\,(60^{3/2}) \approx 1859.03. \]
\[ x(20) = \frac{60+40}{10} + C(60+40)^{-3/2} \approx 10 + 1859.03\,(100)^{-3/2} \approx 11.86. \]
Figure 1.10: Graph of the solution x kilograms of salt in the tank at time t.
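The arithmetic for the tank can be double-checked with a short script. This is a minimal sketch, not part of the original notes: it evaluates the closed form $x(t) = \frac{60+2t}{10} + C(60+2t)^{-3/2}$ at the time the tank is full (60 + 2t = 100, so t = 20), and as an independent sanity check it also steps the differential equation $x' = 0.5 - \frac{3x}{60+2t}$ forward with many small forward-difference steps. The step count and the function names are assumptions of this example.

```python
def salt_exact(t, C):
    # Closed-form solution of the tank problem.
    return (60 + 2 * t) / 10 + C * (60 + 2 * t) ** (-1.5)

def salt_numeric(t_end, steps=20000):
    # Crude forward stepping of dx/dt = 0.5 - 3x/(60 + 2t), x(0) = 10.
    h = t_end / steps
    t, x = 0.0, 10.0
    for _ in range(steps):
        x += h * (0.5 - 3 * x / (60 + 2 * t))
        t += h
    return x

C = 4 * 60 ** 1.5                 # from x(0) = 10
t_full = (100 - 60) / (5 - 3)     # volume 60 + 2t reaches 100 liters

print(f"C = {C:.2f}")
print(f"tank is full at t = {t_full:.0f} minutes")
print(f"closed form: x = {salt_exact(t_full, C):.4f} kg")
print(f"numerical:   x = {salt_numeric(t_full):.4f} kg")
```

Both values should agree to a few decimal places with the 11.86 kilograms found above.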
1.4.1 Exercises
In the exercises, feel free to leave answer as a definite integral if a closed form solution
cannot be found. If you can find a closed form solution, you should give that.
Exercise 1.4.9: Suppose there are two lakes located on a stream. Clean water flows into the first
lake, then the water from the first lake flows into the second lake, and then water from the second
lake flows further downstream. The in and out flow from each lake is ��� liters per hour. The first
lake contains ��� thousand liters of water and the second lake contains ��� thousand liters of water.
A truck with ��� kg of toxic substance crashes into the first lake. Assume that the water is being
continually mixed perfectly by the stream.
b) When will the concentration in the first lake be below �.��� kg per liter?
b) In the long term, will the initial conditions make much of a difference? Why or why not?
Exercise 1.4.11: Initially � grams of salt are dissolved in �� liters of water. Brine with concentration
of salt � grams of salt per liter is added at a rate of � liters a minute. The tank is mixed well and is
drained at � liters a minute. How long does the process have to continue until there are �� grams of
salt in the tank?
Exercise 1.4.12: Initially a tank contains �� liters of pure water. Brine of unknown (but constant)
concentration of salt is flowing in at � liter per minute. The water is mixed well and drained at �
liter per minute. In �� minutes there are �� grams of salt in the tank. What is the concentration of
salt in the incoming brine?
Exercise 1.4.103: Suppose a water tank is being pumped out at � L/min. The water tank starts at
�� L of clean water. Water with toxic substance is flowing into the tank at � L/min, with concentration
20t g/L at time t. When the tank is half empty, how many grams of toxic substance are in the tank
(assuming perfect mixing)?
Exercise 1.4.104: Suppose we have bacteria on a plate and suppose that we are slowly adding a toxic
substance such that the rate of growth is slowing down. That is, suppose that $\frac{dP}{dt} = (2 - 0.1\,t)P$. If
P(0) = 1000, find the population at t = 5.
Exercise 1.4.105: A cylindrical water tank has water flowing in at I cubic meters per second. Let
A be the area of the cross section of the tank in meters. Suppose water is flowing from the bottom of
the tank at a rate proportional to the height of the water level. Set up the differential equation for h,
the height of the water, introducing and naming constants that you need. You should also give the
units for your constants.
1.5 Substitution
Note: 1 lecture, can safely be skipped, §1.6 in [EP], not in [BD]
Just as when solving integrals, one method to try is to change variables to end up with
a simpler equation to solve.
1.5.1 Substitution
The equation
\[ y' = (x - y + 1)^2 \]
is neither separable nor linear. What can we do? How about trying to change variables, so
that in the new variables the equation is simpler. We use another variable v, which we
treat as a function of x. Let us try
\[ v = x - y + 1. \]
\[ 1 - v' = v^2. \]
\[ x - y + 2 = (x - y)De^{2x}, \]
\[ x - y + 2 = Dxe^{2x} - yDe^{2x}, \]
\[ -y + yDe^{2x} = Dxe^{2x} - x - 2, \]
\[ y\,(-1 + De^{2x}) = Dxe^{2x} - x - 2, \]
\[ y = \frac{Dxe^{2x} - x - 2}{De^{2x} - 1}. \]
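After this much algebra it is worth checking the answer. The following minimal sketch, which is not part of the original notes, compares a central-difference estimate of y' with $(x - y + 1)^2$ for the formula $y = \frac{Dxe^{2x} - x - 2}{De^{2x} - 1}$. The sample values of D and the interval are assumptions chosen so that the denominator stays away from zero.

```python
import math

def y(x, D):
    # Candidate solution obtained from the substitution v = x - y + 1.
    e = D * math.exp(2 * x)
    return (e * x - x - 2) / (e - 1)

def residual(x, D, h=1e-6):
    # Central-difference estimate of y' minus the right-hand side (x - y + 1)^2.
    dydx = (y(x + h, D) - y(x - h, D)) / (2 * h)
    return dydx - (x - y(x, D) + 1) ** 2

# D = 0 gives the special solution y = x + 2; D = 2 and D = 5 are generic choices.
for D in (0.0, 2.0, 5.0):
    worst = max(abs(residual(i / 10, D)) for i in range(0, 11))
    print(f"D = {D:3.1f}   largest residual on [0, 1]: {worst:.2e}")
```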
Usually you try to substitute in the “most complicated” part of the equation with the
hopes of simplifying it. The above table is just a rule of thumb. You might have to modify
your guesses. If a substitution does not work (it does not make the equation any simpler),
try a different one.
\[ y' + p(x)y = q(x)y^n. \]
This equation looks a lot like a linear equation except for the $y^n$. If n = 0 or n = 1, then the
equation is linear and we can solve it. Otherwise, the substitution $v = y^{1-n}$ transforms the
Bernoulli equation into a linear equation. Note that n need not be an integer.
Example 1.5.1: Solve
\[ xy' + y(x+1) + xy^5 = 0, \qquad y(1) = 1. \]
First, the equation is Bernoulli ($p(x) = (x+1)/x$ and $q(x) = -1$). We substitute
\[ v = y^{1-5} = y^{-4}, \qquad v' = -4y^{-5}y'. \]
\[ xy' + y(x+1) + xy^5 = 0, \]
\[ \frac{-xy^5}{4}\,v' + y(x+1) + xy^5 = 0, \]
There are several things called Bernoulli equations, this is just one of them. The Bernoullis were a
prominent Swiss family of mathematicians. These particular equations are named for Jacob Bernoulli
(1654–1705).
\[ \frac{-x}{4}\,v' + y^{-4}(x+1) + x = 0, \]
\[ \frac{-x}{4}\,v' + v(x+1) + x = 0, \]
and finally
\[ v' - \frac{4(x+1)}{x}\,v = 4. \]
The equation is now linear. We can use the integrating factor method. In particular, we use
formula (1.4). Let us assume that x > 0 so |x| = x. This assumption is OK, as our initial
condition is x = 1. Let us compute the integrating factor. Here p(s) from formula (1.4) is
$\frac{-4(s+1)}{s}$.
\[ e^{\int_1^x p(s)\,ds} = \exp\left( \int_1^x \frac{-4(s+1)}{s}\,ds \right) = e^{-4x - 4\ln(x) + 4} = e^{-4x+4}x^{-4} = \frac{e^{-4x+4}}{x^4}, \]
\[ e^{-\int_1^x p(s)\,ds} = e^{4x + 4\ln(x) - 4} = e^{4x-4}x^4. \]
The integral in this expression is not possible to find in closed form. As we said before, it is
perfectly fine to have a definite integral in our solution. Now “unsubstitute”
\[ y^{-4} = e^{4x-4}x^4 \left( 4\int_1^x \frac{e^{-4t+4}}{t^4}\,dt + 1 \right), \]
\[ y = \frac{e^{-x+1}}{x \left( 4\int_1^x \frac{e^{-4t+4}}{t^4}\,dt + 1 \right)^{1/4}}. \]
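A solution containing a definite integral is still perfectly usable on a computer. As a minimal sketch (not part of the original notes), the code below evaluates $y = e^{-x+1} \big/ \bigl( x\,(4\int_1^x e^{-4t+4}t^{-4}\,dt + 1)^{1/4} \bigr)$ with composite Simpson’s rule and then checks that $xy' + y(x+1) + xy^5$ is close to zero. The quadrature rule, the step sizes, and the function names are assumptions of this example, and only x ≥ 1 is sampled.

```python
import math

def simpson(g, a, b, n=200):
    # Composite Simpson's rule on [a, b] with n (even) subintervals.
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def y(x):
    # Solution of Example 1.5.1 with the integral evaluated numerically.
    integral = simpson(lambda t: math.exp(-4 * t + 4) / t**4, 1.0, x)
    return math.exp(-x + 1) / (x * (4 * integral + 1) ** 0.25)

def residual(x, h=1e-5):
    # Left-hand side x y' + y (x + 1) + x y^5, which should vanish.
    dydx = (y(x + h) - y(x - h)) / (2 * h)
    return x * dydx + y(x) * (x + 1) + x * y(x) ** 5

print(f"y(1) = {y(1.0):.6f}  (initial condition asks for 1)")
for x in (1.0, 1.5, 2.0, 3.0):
    print(f"x = {x:3.1f}   y = {y(x):.6f}   residual = {residual(x):.2e}")
```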
\[ v + xv' = F(v) \qquad \text{or} \qquad xv' = F(v) - v \qquad \text{or} \qquad \frac{v'}{F(v) - v} = \frac{1}{x}. \]
We unsubstitute
\[ \frac{y}{x} = \frac{-1}{\ln|x| + C}, \]
\[ y = \frac{-x}{\ln|x| + C}. \]
We want y(1) = 1, so
\[ 1 = y(1) = \frac{-1}{\ln|1| + C} = \frac{-1}{C}. \]
Thus $C = -1$ and the solution we are looking for is
\[ y = \frac{-x}{\ln|x| - 1}. \]
1.5.4 Exercises
Hint: Answers need not always be in closed form.
Figure 1.11: The slope field and some solutions of $x' = 0.3\,(5 - x)$.
Figure 1.12: The slope field and some solutions of $x' = 0.1\,x\,(5 - x)$.
for some positive k and M. This equation is commonly used to model population if we
know the limiting population M, that is the maximum sustainable population. The logistic
equation leads to less catastrophic predictions on world population than $x' = kx$. In the
real world there is no such thing as negative population, but we will still consider negative
x for the purposes of the math.
See Figure 1.12 on the preceding page for an example, $x' = 0.1x(5 - x)$. There are two
critical points, x = 0 and x = 5. The critical point at x = 5 is stable, while the critical point
at x = 0 is unstable.
It is not necessary to find the exact solutions to talk about the long term behavior of the
solutions. From the slope field above of $x' = 0.1x(5 - x)$, we see that
\[ \lim_{t\to\infty} x(t) = \begin{cases} 5 & \text{if } x(0) > 0, \\ 0 & \text{if } x(0) = 0, \\ \text{DNE or } -\infty & \text{if } x(0) < 0. \end{cases} \]
Here DNE means “does not exist.” From just looking at the slope field we cannot quite
decide what happens if x(0) < 0. It could be that the solution does not exist for t all the
way to $\infty$. Think of the equation $x' = x^2$; we have seen that solutions only exist for some
finite period of time. Same can happen here. In our example equation above it turns out
that the solution does not exist for all time, but to see that we would have to solve the
equation. In any case, the solution does go to $-\infty$, but it may get there rather quickly.
If we are interested only in the long term behavior of the solution, we would be doing
unnecessary work if we solved the equation exactly. We could draw the slope field, but it
is easier to just look at the phase diagram or phase portrait, which is a simple way to visualize
the behavior of autonomous equations. In this case there is one dependent variable x.
We draw the x-axis, we mark all the critical points, and then we draw arrows in between.
Since x is the dependent variable we draw the axis vertically, as it appears in the slope
field diagrams above. If f(x) > 0, we draw an up arrow. If f(x) < 0, we draw a down
arrow. To figure this out, we could just plug in some x between the critical points, f(x)
will have the same sign at all x between two critical points as long as f(x) is continuous. For
example, $f(6) = -0.6 < 0$, so f(x) < 0 for x > 5, and the arrow above x = 5 is a down arrow.
Next, $f(1) = 0.4 > 0$, so f(x) > 0 whenever 0 < x < 5, and the arrow points up. Finally,
$f(-1) = -0.6 < 0$ so f(x) < 0 when x < 0, and the arrow points down.
[Phase diagram: vertical x-axis with the critical points x = 5 and x = 0 marked.]
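The sign test described above is easy to automate. Here is a minimal sketch, not part of the original notes, that takes a list of critical points, samples f slightly below and slightly above each one, and labels the point stable when the arrows on both sides point towards it and unstable otherwise. The function name `classify`, the offset `eps`, and the dictionary output are assumptions of this example; the offset must be smaller than the gap between neighboring critical points.

```python
def f(x):
    # Right-hand side of the autonomous equation x' = 0.1 x (5 - x).
    return 0.1 * x * (5 - x)

def classify(f, critical_points, eps=1e-3):
    # Stable: arrow below points up (f > 0) and arrow above points down (f < 0).
    labels = {}
    for c in critical_points:
        below, above = f(c - eps), f(c + eps)
        if below > 0 and above < 0:
            labels[c] = "stable"
        else:
            labels[c] = "unstable"
    return labels

print(classify(f, [0.0, 5.0]))
print("arrow for x > 5:", "up" if f(6) > 0 else "down")
print("arrow for 0 < x < 5:", "up" if f(1) > 0 else "down")
print("arrow for x < 0:", "up" if f(-1) > 0 else "down")
```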
1.6. AUTONOMOUS EQUATIONS
Armed with the phase diagram, it is easy to sketch the solutions approximately: As
time t moves from left to right, the graph of a solution goes up if the arrow is up, and it
goes down if the arrow is down.
Exercise 1.6.1: Try sketching a few solutions simply from looking at the phase diagram. Check
with the preceding graphs if you are getting the type of curves.
Once we draw the phase diagram, we classify critical points as stable or unstable.
[Figure: phase-diagram arrows around an unstable and a stable critical point.]
Since any mathematical model we cook up will only be an approximation to the real
world, unstable points are generally bad news.
Let us think about the logistic equation with harvesting. Suppose an alien race really
likes to eat humans. They keep a planet with humans on it and harvest the humans at a
rate of h million humans per year. Suppose x is the number of humans in millions on the
planet and t is time in years. Let M be the limiting population when no harvesting is done.
The number k > 0 is a constant depending on how fast humans multiply. Our equation
becomes
\[ \frac{dx}{dt} = kx(M - x) - h. \]
We expand the right-hand side and set it to zero.
\[ kx(M - x) - h = -kx^2 + kMx - h = 0. \]
Solving for the critical points, let us call them A and B, we get
\[ A = \frac{kM + \sqrt{(kM)^2 - 4hk}}{2k}, \qquad B = \frac{kM - \sqrt{(kM)^2 - 4hk}}{2k}. \]
Exercise 1.6.2: Sketch a phase diagram for different possibilities. Note that these possibilities are
A > B, or A = B, or A and B both complex (i.e. no real solutions). Hint: Fix some simple k and M
and then vary h.
For example, let M = 8 and k = 0.1. When h = 1, then A and B are distinct and positive.
The slope field we get is in Figure 1.13 on the next page. As long as the population starts
above B, which is approximately 1.55 million, then the population will not die out. It will
in fact tend towards A ≈ 6.45 million. If ever some catastrophe happens and the population
drops below B, humans will die out, and the fast food restaurant serving them will go out
of business.
Unstable points with one of the arrows pointing towards the critical point are sometimes called semistable.
Figure 1.13: The slope field and some solutions of $x' = 0.1\,x\,(8 - x) - 1$.
Figure 1.14: The slope field and some solutions of $x' = 0.1\,x\,(8 - x) - 1.6$.
When h = 1.6, then A = B = 4. There is only one critical point and it is unstable. When
the population starts above 4 million it will tend towards 4 million. If it ever drops below 4
million, humans will die out on the planet. This scenario is not one that we (as the human
fast food proprietor) want to be in. A small perturbation of the equilibrium state and we
are out of business. There is no room for error. See Figure 1.14.
Finally if we are harvesting at 2 million humans per year, there are no critical points.
The population will always plummet towards zero, no matter how well stocked the planet
starts. See Figure 1.15.
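The three harvesting scenarios can be reproduced directly from the quadratic formula for the critical points of $x' = kx(M - x) - h$. The following minimal sketch is not part of the original notes; the function name `critical_points` and the small tolerance used to detect a double root in floating point are assumptions of this example.

```python
import math

def critical_points(k, M, h, tol=1e-12):
    # Real roots of k x (M - x) - h = 0, largest first.
    disc = (k * M) ** 2 - 4 * h * k
    if abs(disc) < tol:
        # Double root: A = B = M / 2.
        return (k * M / (2 * k),)
    if disc < 0:
        # Both roots complex: no critical points.
        return ()
    root = math.sqrt(disc)
    return ((k * M + root) / (2 * k), (k * M - root) / (2 * k))

k, M = 0.1, 8
for h in (1, 1.6, 2):
    points = critical_points(k, M, h)
    formatted = ", ".join(f"{p:.2f}" for p in points) or "none"
    print(f"h = {h}: critical points: {formatted}")
```

With M = 8 and k = 0.1 this gives two critical points for h = 1, a single one for h = 1.6, and none for h = 2, matching the discussion above.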
1.6.1 Exercises
Exercise 1.6.3: Consider $x' = x^2$.
a) Draw the phase diagram, find the critical points, and mark them stable or unstable.
c) Find $\lim_{t\to\infty} x(t)$ for the solution with the initial condition $x(0) = -1$.
a) Draw the phase diagram for $-4\pi \leq x \leq 4\pi$. On this interval mark the critical points stable
or unstable.
c) Find $\lim_{t\to\infty} x(t)$ for the solution with the initial condition x(0) = 1.
Exercise 1.6.5: Suppose f(x) is positive for 0 < x < 1, it is zero when x = 0 and x = 1, and it is
negative for all other x.
a) Draw the phase diagram for $x' = f(x)$, find the critical points, and mark them stable or
unstable.
c) Find $\lim_{t\to\infty} x(t)$ for the solution with the initial condition x(0) = 0.5.
Exercise 1.6.7: A disease is spreading through the country. Let x be the number of people infected.
Let the constant S be the number of people susceptible to infection. The infection rate $\frac{dx}{dt}$ is
proportional to the product of already infected people, x, and the number of susceptible but uninfected
people, $S - x$.
b) Supposing x(0) > 0, that is, some people are infected at time t = 0, what is $\lim_{t\to\infty} x(t)$.
c) Does the solution to part b) agree with your intuition? Why or why not?
a) Find and classify all critical points. b) Find $\lim_{t\to\infty} x(t)$ given any initial condition.
a) Find the differential equation for x. b) What is the new limiting population?
For b), c), d), find $\lim_{t\to\infty} x(t)$ based on the phase diagram.
If the equation can be solved in closed form, we should do that. But what if we have
an equation that cannot be solved in closed form? What if we want to find the value
of the solution at some particular x? Or perhaps we want to produce a graph of the
solution to inspect the behavior. In this section we will learn about the basics of numerical
approximation of solutions.
The simplest method for approximating a solution is Euler’s method. It works as follows:
Take $x_0$ and compute the slope $k = f(x_0, y_0)$. The slope is the change in y per unit change
in x. Follow the line for an interval of length h on the x-axis. Hence if $y = y_0$ at $x_0$, then
we say that $y_1$ (the approximate value of y at $x_1 = x_0 + h$) is $y_1 = y_0 + hk$. Rinse, repeat!
Let $k = f(x_1, y_1)$, and then compute $x_2 = x_1 + h$, and $y_2 = y_1 + hk$. Now compute $x_3$ and
$y_3$ using $x_2$ and $y_2$, etc. Consider the equation $y' = y^2/3$, y(0) = 1, and h = 1. Then $x_0 = 0$
and $y_0 = 1$. We compute
\[ x_{i+1} = x_i + h, \qquad y_{i+1} = y_i + h f(x_i, y_i). \]
The line segments we get are an approximate graph of the solution. Generally it is not
exactly the solution. See Figure 1.17 on the next page for the plot of the real solution and
the approximation.
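The recursion for Euler’s method translates almost word for word into code. The following is a minimal sketch, not part of the original notes, applied to $y' = y^2/3$, y(0) = 1 with step size h = 1; the function name `euler` and the list-based return value are choices made for this example.

```python
def euler(f, x0, y0, h, n):
    # Take n Euler steps of size h starting from (x0, y0).
    # Returns the lists of x_i and y_i, i = 0, ..., n.
    xs, ys = [x0], [y0]
    for _ in range(n):
        slope = f(xs[-1], ys[-1])          # k = f(x_i, y_i)
        ys.append(ys[-1] + h * slope)      # y_{i+1} = y_i + h k
        xs.append(xs[-1] + h)              # x_{i+1} = x_i + h
    return xs, ys

f = lambda x, y: y**2 / 3

xs, ys = euler(f, 0.0, 1.0, 1.0, 2)
for x, y in zip(xs, ys):
    print(f"x = {x:.1f}   y = {y:.5f}")
```

Connecting the points (x_i, y_i) by line segments gives the approximate graph of the solution.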
We continue with the equation $y' = y^2/3$, y(0) = 1. Let us try to approximate y(2) using
Euler’s method. In Figures 1.16 and 1.17 we have graphically approximated y(2) with step
size 1. With step size 1, we have y(2) ≈ 1.926. The real answer is 3. We are approximately
1.074 off. Let us halve the step size. Computing $y_4$ with h = 0.5, we find that y(2) ≈ 2.209,
so an error of about 0.791. Table 1.1 on page 59 gives the values computed for various
parameters.
Named after the Swiss mathematician Leonhard Paul Euler (1707–1783). The correct pronunciation of
the name sounds more like “oiler.”
Figure 1.16: First two steps of Euler’s method with h = 1 for the equation $y' = \frac{y^2}{3}$ with initial conditions
y(0) = 1.
Figure 1.17: Two steps of Euler’s method (step size 1) and the exact solution for the equation $y' = \frac{y^2}{3}$
with initial conditions y(0) = 1.
Exercise 1.7.1: Solve this equation exactly and show that y(2) = 3.
The difference between the actual solution and the approximate solution is called the
error. We usually talk about just the size of the error and we do not care much about its
sign. The point is, we usually do not know the real solution, so we only have a vague
understanding of the error. If we knew the error exactly . . . what is the point of doing the
approximation?
Notice that except for the first few times, every time we halved the interval the error
approximately halved. This halving of the error is a general feature of Euler’s method as it
is a first order method. There exists an improved Euler method, see the exercises, which is
a second order method. A second order method reduces the error to approximately one
quarter every time we halve the interval. The meaning of “second” order is the squaring in
$1/4 = 1/2 \times 1/2 = (1/2)^2$.

h            Approximate y(2)    Error      Error / Previous error
1            1.92593             1.07407
0.5          2.20861             0.79139    0.73681
0.25         2.47250             0.52751    0.66656
0.125        2.68034             0.31966    0.60599
0.0625       2.82040             0.17960    0.56184
0.03125      2.90412             0.09588    0.53385
0.015625     2.95035             0.04965    0.51779
0.0078125    2.97472             0.02528    0.50913
Table 1.1: Euler’s method approximations of y(2) for $y' = y^2/3$, y(0) = 1.
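The halving experiment for y(2) is tedious by hand but quick on a computer. Here is a minimal sketch, not part of the original notes, that repeats Euler’s method for $y' = y^2/3$, y(0) = 1 with h = 1, 1/2, 1/4, and so on, and prints the error together with its ratio to the previous error. The exact value y(2) = 3 is taken from the text; the function name `euler_final` and the number of halvings are assumptions of this example.

```python
def euler_final(f, x0, y0, x_end, n):
    # Euler's method with n equal steps from x0 to x_end; returns the final y.
    h = (x_end - x0) / n
    x, y = x0, y0
    for _ in range(n):
        y += h * f(x, y)
        x += h
    return y

f = lambda x, y: y**2 / 3
exact = 3.0   # exact value of y(2)

rows = []
previous_error = None
for j in range(8):
    n = 2 * 2**j                      # h = 2/n = 1, 0.5, 0.25, ...
    approx = euler_final(f, 0.0, 1.0, 2.0, n)
    error = abs(exact - approx)
    ratio = error / previous_error if previous_error is not None else None
    rows.append((2.0 / n, approx, error, ratio))
    previous_error = error

print(f"{'h':>10} {'approx y(2)':>12} {'error':>9} {'ratio':>8}")
for h, approx, error, ratio in rows:
    ratio_text = f"{ratio:8.5f}" if ratio is not None else " " * 8
    print(f"{h:>10} {approx:12.5f} {error:9.5f} {ratio_text}")
```

For a first order method the last column should creep towards 1/2 as h shrinks; for a second order method the same experiment would show ratios approaching 1/4.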
To get the error to be within 0.1 of the answer we had to already do 64 steps. To get
it to within 0.01 we would have to halve another three or four times, meaning doing 512
to 1024 steps. That is quite a bit to do by hand. The improved Euler method from the
exercises should quarter the error every time we halve the interval, so we would have to
approximately do half as many “halvings” to get the same error. This reduction can be a
big deal. With 10 halvings (starting at h = 1) we have 1024 steps, whereas with 5 halvings
we only have to do 32 steps, assuming that the error was comparable to start with. A
computer may not care about this difference for a problem this simple, but suppose each
step would take a second to compute (the function may be substantially more difficult to
compute than $y^2/3$). Then the difference is 32 seconds versus about 17 minutes. We are not
being altogether fair, a second order method would probably double the time to do each
step. Even so, it is 1 minute versus 17 minutes. Next, suppose that we have to repeat such
a calculation for different parameters a thousand times. You get the idea.
Note that in practice we do not know how large the error is! How do we know what is
the right step size? Well, essentially we keep halving the interval, and if we are lucky, we
can estimate the error from a few of these calculations and the assumption that the error
goes down by a factor of one half each time (if we are using standard Euler).
Exercise 1.7.2: In the table above, suppose you do not know the error. Take the approximate values
of the function in the last two lines, assume that the error goes down by a factor of 2. Can you
estimate the error in the last time from this? Does it (approximately) agree with the table? Now do
it for the first two rows. Does this agree with the table?
Let us talk a little bit more about the example $y' = y^2/3$, y(0) = 1. Suppose that instead
of the value y(2) we wish to find y(3). The results of this effort are listed in Table 1.2 on the
next page for successive halvings of h. What is going on here? Well, you should solve the
equation exactly and you will notice that the solution does not exist at x = 3. In fact, the
solution goes to infinity when you approach x = 3.
h            Approximate y(3)
1            3.16232
0.5          4.54329
0.25         6.86079
0.125        10.80321
0.0625       17.59893
0.03125      29.46004
0.015625     50.40121
0.0078125    87.75769
Table 1.2: Attempts to use Euler’s method to approximate y(3) for $y' = y^2/3$, y(0) = 1.
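The failure at x = 3 is easy to reproduce. The sketch below is not part of the original notes; it runs Euler’s method for $y' = y^2/3$, y(0) = 1 up to x = 3 with successively halved step sizes. Instead of settling down, the approximations keep growing, which is the numerical symptom of a solution that does not exist at the target point. The function name and the number of halvings are assumptions of this example.

```python
def euler_final(f, x0, y0, x_end, n):
    # Euler's method with n equal steps from x0 to x_end; returns the final y.
    h = (x_end - x0) / n
    x, y = x0, y0
    for _ in range(n):
        y += h * f(x, y)
        x += h
    return y

f = lambda x, y: y**2 / 3

values = []
for j in range(8):
    n = 3 * 2**j                      # h = 3/n = 1, 0.5, 0.25, ...
    values.append((3.0 / n, euler_final(f, 0.0, 1.0, 3.0, n)))

for h, approx in values:
    print(f"h = {h:<10} approximate y(3) = {approx:.5f}")
```

When halving the step size keeps changing the answer by a large amount, that is a hint to go back and study the equation before trusting any of the numbers.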
Another case where things go bad is if the solution oscillates wildly near some point.
The solution may exist at all points, but even a much better numerical method than
Euler would need an insanely small step size to approximate the solution with reasonable
precision. And computers might not be able to easily handle such a small step size.
In real applications we would not use a simple method such as Euler’s. The simplest
method that would probably be used in a real application is the standard Runge–Kutta
method (see exercises). That is a fourth order method, meaning that if we halve the interval,
the error generally goes down by a factor of 16 (it is fourth order as $1/16 = 1/2 \times 1/2 \times 1/2 \times 1/2$).
Choosing the right method to use and the right step size can be very tricky. There are
several competing factors to consider.
• Computational time: Each step takes computer time. Even if the function f is simple
to compute, we do it many times over. Large step size means faster computation, but
perhaps not the right precision.
• Stability: Certain equations may be numerically unstable. What may happen is that
the numbers never seem to stabilize no matter how many times we halve the interval.
We may need a ridiculously small interval size, which may not be practical due to
1.7. NUMERICAL METHODS: EULER’S METHOD
We have seen just the beginnings of the challenges that appear in real applications.
Numerical approximation of solutions to differential equations is an active research area
for engineers and mathematicians. For example, the general purpose method used for the
ODE solver in Matlab and Octave (as of this writing) is a method that appeared in the
literature only in the 1980s.
1.7.1 Exercises
Exercise 1.7.3: Consider $\frac{dx}{dt} = (2t - x)^2$, $x(0) = 2$. Use Euler's method with step size $h = 0.5$ to approximate $x(1)$.
Exercise 1.7.4: Consider $\frac{dx}{dt} = t - x$, $x(0) = 1$.
a) Use Euler's method with step sizes $h = 1, 1/2, 1/4, 1/8$ to approximate $x(1)$.
c) Describe what happens to the errors for each $h$ you used. That is, find the factor by which the error changed each time you halved the interval.
Exercise 1.7.5: Approximate the value of $e$ by looking at the initial value problem $y' = y$ with $y(0) = 1$ and approximating $y(1)$ using Euler's method with a step size of 0.2.
Exercise 1.7.6: Example of numerical instability: Take $y' = -5y$, $y(0) = 1$. We know that the solution should decay to zero as $x$ grows. Using Euler's method, start with $h = 1$ and compute $y_1, y_2, y_3, y_4$ to try to approximate $y(4)$. What happened? Now halve the interval. Keep halving the interval and approximating $y(4)$ until the numbers you are getting start to stabilize (that is, until they start going towards zero). Note: You might want to use a calculator.
The simplest method used in practice is the Runge–Kutta method. Consider $\frac{dy}{dx} = f(x, y)$, $y(x_0) = y_0$, and a step size $h$. Everything is the same as in Euler's method, except the computation of $y_{i+1}$ and $x_{i+1}$.
\[
\begin{aligned}
k_1 &= f(x_i, y_i), \\
k_2 &= f\bigl(x_i + h/2,\; y_i + k_1 (h/2)\bigr), & x_{i+1} &= x_i + h, \\
k_3 &= f\bigl(x_i + h/2,\; y_i + k_2 (h/2)\bigr), & y_{i+1} &= y_i + \frac{k_1 + 2k_2 + 2k_3 + k_4}{6}\, h, \\
k_4 &= f(x_i + h,\; y_i + k_3 h).
\end{aligned}
\]
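The Runge–Kutta step above can be sketched in a few lines of Python. This is only an illustration: the function name `rk4` and the test problem $y' = y$, $y(0) = 1$ are assumptions chosen here, not part of the exercises. Halving the step should shrink the error by a factor of roughly 16.

```python
import math

def rk4(f, x0, y0, h, n):
    """Advance y' = f(x, y) from (x0, y0) by n Runge-Kutta steps of size h."""
    x, y = x0, y0
    for _ in range(n):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + k1 * (h / 2))
        k3 = f(x + h / 2, y + k2 * (h / 2))
        k4 = f(x + h, y + k3 * h)
        y += (k1 + 2 * k2 + 2 * k3 + k4) / 6 * h
        x += h
    return y

# Illustrative test problem (an assumption): y' = y, y(0) = 1, so y(1) = e.
err_h = abs(rk4(lambda x, y: y, 0.0, 1.0, 0.1, 10) - math.e)
err_half = abs(rk4(lambda x, y: y, 0.0, 1.0, 0.05, 20) - math.e)
ratio = err_h / err_half  # roughly 16 is expected for a fourth order method
print(err_h, err_half, ratio)
```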
62 CHAPTER 1. FIRST ORDER EQUATIONS
Exercise 1.7.7: Consider $\frac{dy}{dx} = y x^2$, $y(0) = 1$.
a) Use Runge–Kutta (see above) with step sizes $h = 1$ and $h = 1/2$ to approximate $y(1)$.
b) Use Euler's method with $h = 1$ and $h = 1/2$.
c) Solve exactly, find the exact value of $y(1)$, and compare.
Exercise 1.7.101: Let $x' = \sin(xt)$, and $x(0) = 1$. Approximate $x(1)$ using Euler's method with step sizes �, �.�, �.��. Use a calculator and compute up to � decimal digits.
Exercise 1.7.102: Let $x' = 2t$, and $x(0) = 0$.
a) Approximate $x(4)$ using Euler's method with step sizes �, �, and �.
b) Solve exactly, and compute the errors.
c) Compute the factor by which the errors changed.
Exercise 1.7.103: Let $x' = x e^{xt+1}$, and $x(0) = 0$.
a) Approximate $x(4)$ using Euler's method with step sizes �, �, and �.
b) Guess an exact solution based on part a) and compute the errors.
There is a simple way to improve Euler's method to make it a second order method by doing just one extra step. Consider $\frac{dy}{dx} = f(x, y)$, $y(x_0) = y_0$, and a step size $h$. What we do is to pretend we compute the next step as in Euler, that is, we start with $(x_i, y_i)$, we compute a slope $k_1 = f(x_i, y_i)$, and then look at the point $(x_i + h, y_i + k_1 h)$. Instead of letting our new point be $(x_i + h, y_i + k_1 h)$, we compute the slope at that point, call it $k_2$, and then take the average of $k_1$ and $k_2$, hoping that the average is going to be closer to the actual slope on the interval from $x_i$ to $x_i + h$. And we are correct: if we halve the step, the error should go down by a factor of $2^2 = 4$. To summarize, the setup is the same as for regular Euler, except the computation of $y_{i+1}$ and $x_{i+1}$.
\[
\begin{aligned}
k_1 &= f(x_i, y_i), & x_{i+1} &= x_i + h, \\
k_2 &= f(x_i + h,\; y_i + k_1 h), & y_{i+1} &= y_i + \frac{k_1 + k_2}{2}\, h .
\end{aligned}
\]
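A minimal Python sketch of the improved Euler step follows. The name `improved_euler` and the test problem $y' = y$, $y(0) = 1$ are assumptions made for illustration. The ratio of errors after halving the step should come out near 4.

```python
import math

def improved_euler(f, x0, y0, h, n):
    """Advance y' = f(x, y) by n improved-Euler steps of size h."""
    x, y = x0, y0
    for _ in range(n):
        k1 = f(x, y)                  # slope at the current point
        k2 = f(x + h, y + k1 * h)     # slope at the plain Euler prediction
        y += (k1 + k2) / 2 * h        # step with the average slope
        x += h
    return y

# Illustrative test problem (an assumption): y' = y, y(0) = 1, so y(1) = e.
err_h = abs(improved_euler(lambda x, y: y, 0.0, 1.0, 0.1, 10) - math.e)
err_half = abs(improved_euler(lambda x, y: y, 0.0, 1.0, 0.05, 20) - math.e)
ratio = err_h / err_half  # roughly 4 is expected for a second order method
print(err_h, err_half, ratio)
```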
Exercise 1.7.104: Consider $\frac{dy}{dx} = x + y$, $y(0) = 1$.
a) Use the improved Euler's method (see above) with step sizes $h = 1/4$ and $h = 1/8$ to approximate $y(1)$.
\[ F(x, y) = x^2 + y^2 . \]
\[ dF = \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy . \]
For convenience, we will make use of the notation of $F_x = \frac{\partial F}{\partial x}$ and $F_y = \frac{\partial F}{\partial y}$. In our example,
\[ dF = 2x\,dx + 2y\,dy . \]
Figure 1.18: Solutions to $F(x, y) = x^2 + y^2 = C$ for various $C$.
An interpretation of the setup is that at each point $\vec{v} = (M, N)$ is a vector in the plane, that is, a direction and a magnitude. As $M$ and $N$ are functions of $(x, y)$, we have a vector field. The particular vector field $\vec{v}$ that comes from an exact equation is a so-called conservative vector field, that is, a vector field that comes with a potential function $F(x, y)$, such that
\[ \vec{v} = \left( \frac{\partial F}{\partial x}, \frac{\partial F}{\partial y} \right) . \]
Let $\gamma$ be a path in the plane starting at $(x_1, y_1)$ and ending at $(x_2, y_2)$. If we think of $\vec{v}$ as force, then the work required to move along $\gamma$ is
\[ \int_\gamma \vec{v}(\vec{r}) \cdot d\vec{r} = \int_\gamma M\,dx + N\,dy = F(x_2, y_2) - F(x_1, y_1) . \]
That is, the work done only depends on the endpoints, that is, where we start and where we end. For example, suppose $F$ is gravitational potential. The derivative of $F$ given by $\vec{v}$ is the gravitational force. What we are saying is that the work required to move a heavy box from the ground floor to the roof only depends on the change in potential energy. That is, the work done is the same no matter what path we took; if we took the stairs or the elevator. Although if we took the elevator, the elevator is doing the work for us. The curves $F(x, y) = C$ are those where no work need be done, such as the heavy box sliding along without accelerating or braking on a perfectly flat roof, on a cart with incredibly well oiled wheels.
An exact equation is a conservative vector field, and the implicit solution of this equation
is the potential function.
\[ x + y \frac{dy}{dx} = 0 . \]
It is up to us to find some potential $F$ that works. Many different $F$ will work; adding a constant to $F$ does not change the equation. Once we have a potential function $F$, the equation $F\bigl(x, y(x)\bigr) = C$ gives an implicit solution of the ODE.
Example 1.8.1: Let us find the general solution to $2x + 2y \frac{dy}{dx} = 0$. Forget we knew what $F$ was.
If we know that this is an exact equation, we start looking for a potential function $F$. We have $M = 2x$ and $N = 2y$. If $F$ exists, it must be such that $F_x(x, y) = 2x$. Integrate in the $x$ variable to find
\[ F(x, y) = x^2 + A(y), \tag{1.5} \]
for some function $A(y)$. The function $A$ is the “constant of integration”, though it is only constant as far as $x$ is concerned, and may still depend on $y$. Now differentiate (1.5) in $y$
1.8. EXACT EQUATIONS 65
\[ 2y = F_y(x, y) = A'(y) . \]
\[ F\bigl(x, y(x)\bigr) = C . \]
for $y$ in terms of $x$. In this case, we obtain $y = \pm\sqrt{C^2 - x^2}$ as we did before.
Exercise 1.8.1: Why did we not need to add a constant of integration when integrating $A'(y) = 2y$? Add a constant of integration, say 3, and see what $F$ you get. What is the difference from what we got above, and why does it not matter?
The procedure, once we know that the equation is exact, is:
(i) Integrate $F_x = M$ in $x$ resulting in $F(x, y) = \text{something} + A(y)$.
(ii) Differentiate this F in y, and set that equal to N, so that we may find A(y) by
integration.
The procedure can also be done by first integrating in y and then differentiating in x. Pretty
easy huh? Let’s try this again.
Example 1.8.2: Consider now $2x + y + xy \frac{dy}{dx} = 0$.
OK, so $M = 2x + y$ and $N = xy$. We try to proceed as before. Suppose $F$ exists. Then $F_x(x, y) = 2x + y$. We integrate:
\[ F(x, y) = x^2 + xy + A(y) \]
\[ N = xy = F_y(x, y) = x + A'(y) . \]
But there is no way to satisfy this requirement! The function $xy$ cannot be written as $x$ plus a function of $y$. The equation is not exact; no potential function $F$ exists.
Is there an easier way to check for the existence of $F$, other than failing in trying to find it? Turns out there is. Suppose $M = F_x$ and $N = F_y$. Then as long as the second derivatives are continuous,
\[ \frac{\partial M}{\partial y} = \frac{\partial^2 F}{\partial y \partial x} = \frac{\partial^2 F}{\partial x \partial y} = \frac{\partial N}{\partial x} . \]
Let us state it as a theorem. Usually this is called the Poincaré Lemma.
Theorem 1.8.1 (Poincaré). If $M$ and $N$ are continuously differentiable functions of $(x, y)$, and $\frac{\partial M}{\partial y} = \frac{\partial N}{\partial x}$, then near any point there is a function $F(x, y)$ such that $M = \frac{\partial F}{\partial x}$ and $N = \frac{\partial F}{\partial y}$.
The theorem doesn’t give us a global F defined everywhere. In general, we can only
find the potential locally, near some initial point. By this time, we have come to expect this
from differential equations.
Let us return to the example above where $M = 2x + y$ and $N = xy$. Notice $M_y = 1$ and $N_x = y$, which are clearly not equal. The equation is not exact.
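The test $M_y = N_x$ can also be spot-checked numerically. The sketch below is an illustration only: the helper names `partial` and `looks_exact`, the sample points, and the tolerance are all assumptions. It compares central-difference estimates of the two partial derivatives. Agreement at a few points does not prove exactness, but a mismatch at any point rules it out.

```python
def partial(g, x, y, var, eps=1e-6):
    """Central-difference estimate of dg/dx or dg/dy at (x, y)."""
    if var == "x":
        return (g(x + eps, y) - g(x - eps, y)) / (2 * eps)
    return (g(x, y + eps) - g(x, y - eps)) / (2 * eps)

def looks_exact(M, N, points, tol=1e-6):
    """True if M_y and N_x agree numerically at every sample point."""
    return all(abs(partial(M, x, y, "y") - partial(N, x, y, "x")) < tol
               for x, y in points)

samples = [(0.5, 1.5), (-1.0, 2.0), (2.0, -0.5)]

# Example 1.8.1: M = 2x, N = 2y, so M_y = 0 = N_x.
exact_181 = looks_exact(lambda x, y: 2 * x, lambda x, y: 2 * y, samples)
# Example 1.8.2: M = 2x + y, N = xy, so M_y = 1 but N_x = y.
exact_182 = looks_exact(lambda x, y: 2 * x + y, lambda x, y: x * y, samples)
print(exact_181, exact_182)
```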
Example 1.8.3: Solve
\[ \frac{dy}{dx} = \frac{-2x - y}{x - 1}, \qquad y(0) = 1 . \]
We write the equation as
\[ (2x + y) + (x - 1) \frac{dy}{dx} = 0, \]
so $M = 2x + y$ and $N = x - 1$. Then
\[ M_y = 1 = N_x . \]
\[ F(x, y) = x^2 + xy + A(y) . \]
\[ x - 1 = x + A'(y) . \]
\[ y = \frac{-x^2 - 1}{x - 1} . \]
Example 1.8.4: Solve
\[ -\frac{y}{x^2 + y^2}\,dx + \frac{x}{x^2 + y^2}\,dy = 0, \qquad y(1) = 2 . \]
We leave to the reader to check that $M_y = N_x$.
This vector field $(M, N)$ is not conservative if considered as a vector field of the entire plane minus the origin. The problem is that if the curve $\gamma$ is a circle around the origin, say starting at $(1, 0)$ and ending at $(1, 0)$ going counterclockwise, then if $F$ existed we would expect
\[ 0 = F(1, 0) - F(1, 0) = \int_\gamma F_x\,dx + F_y\,dy = \int_\gamma \frac{-y}{x^2 + y^2}\,dx + \frac{x}{x^2 + y^2}\,dy = 2\pi . \]
That is nonsense! We leave the computation of the path integral to the interested reader, or you can consult your multivariable calculus textbook. So there is no potential function $F$ defined everywhere outside the origin $(0, 0)$.
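For readers who would rather see the value $2\pi$ numerically than compute the path integral by hand, here is a rough sketch. It assumes the unit circle parametrized as $x = \cos t$, $y = \sin t$ and uses a midpoint sum; the function name `circle_work` and the number of sample points are choices made for this illustration.

```python
import math

def circle_work(M, N, n=10000):
    """Midpoint-rule estimate of the integral of M dx + N dy around the unit circle."""
    total = 0.0
    dt = 2 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t) * dt, math.cos(t) * dt
        total += M(x, y) * dx + N(x, y) * dy
    return total

M = lambda x, y: -y / (x**2 + y**2)
N = lambda x, y: x / (x**2 + y**2)
work = circle_work(M, N)
print(work, 2 * math.pi)
```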
If we think back to the theorem, it does not guarantee such a function anyway. It only
guarantees a potential function locally, that is only in some region near the initial point. As
$y(1) = 2$ we start at the point $(1, 2)$. Considering $x > 0$ and integrating $M$ in $x$ or $N$ in $y$, we find
\[ F(x, y) = \arctan\bigl( y/x \bigr) . \]
The implicit solution is $\arctan\bigl( y/x \bigr) = C$. Solving, $y = \tan(C)\,x$. That is, the solution is a straight line. Solving $y(1) = 2$ gives us that $\tan(C) = 2$, and so $y = 2x$ is the desired solution. See Figure 1.19, and note that the solution only exists for $x > 0$.
Figure 1.19: Solution to $-\frac{y}{x^2 + y^2}\,dx + \frac{x}{x^2 + y^2}\,dy = 0$, $y(1) = 2$, with initial point marked.
\[ \frac{dy}{dx} + p(x) y = f(x), \qquad \text{or} \qquad \bigl( p(x) y - f(x) \bigr)\,dx + dy = 0 \]
is always such an equation. Let $r(x) = e^{\int p(x)\,dx}$ be the integrating factor for a linear equation. Multiply the equation by $r(x)$ and write it in the form of $M + N \frac{dy}{dx} = 0$.
\[ r(x) p(x) y - r(x) f(x) + r(x) \frac{dy}{dx} = 0 . \]
Then $M = r(x) p(x) y - r(x) f(x)$, so $M_y = r(x) p(x)$, while $N = r(x)$, so $N_x = r'(x) = r(x) p(x)$. In other words, we have an exact equation. Integrating factors for linear equations are just a special case of integrating factors for exact equations.
But how do we find the integrating factor u? Well, given an equation
\[ M\,dx + N\,dy = 0, \]
Therefore,
\[ (M_y - N_x)\,u = u_x N - u_y M . \]
At first it may seem we replaced one differential equation by another. True, but all hope is
not lost.
A strategy that often works is to look for a $u$ that is a function of $x$ alone, or a function of $y$ alone. If $u$ is a function of $x$ alone, that is $u(x)$, then we write $u'(x)$ instead of $u_x$, and $u_y$ is just zero. Then
\[ \frac{M_y - N_x}{N}\, u = u' . \]
In particular, $\frac{M_y - N_x}{N}$ ought to be a function of $x$ alone (not depend on $y$). If so, then we have a linear equation
\[ u' - \frac{M_y - N_x}{N}\, u = 0 . \]
Letting $P(x) = \frac{M_y - N_x}{N}$, we solve using the standard integrating factor method, to find $u(x) = C e^{\int P(x)\,dx}$. The constant in the solution is not relevant, we need any nonzero solution, so we take $C = 1$. Then $u(x) = e^{\int P(x)\,dx}$ is the integrating factor.
Similarly we could try a function of the form $u(y)$. Then
\[ \frac{M_y - N_x}{M}\, u = -u' . \]
In particular, $\frac{M_y - N_x}{M}$ ought to be a function of $y$ alone. If so, then we have a linear equation
\[ u' + \frac{M_y - N_x}{M}\, u = 0 . \]
Letting $Q(y) = \frac{M_y - N_x}{M}$, we find $u(y) = C e^{-\int Q(y)\,dy}$. We take $C = 1$. So $u(y) = e^{-\int Q(y)\,dy}$ is the integrating factor.
Example 1.8.6: Solve
\[ \frac{x^2 + y^2}{x + 1} + 2y \frac{dy}{dx} = 0 . \]
Let $M = \frac{x^2 + y^2}{x + 1}$ and $N = 2y$. Compute
\[ M_y - N_x = \frac{2y}{x + 1} - 0 = \frac{2y}{x + 1} . \]
As this is not zero, the equation is not exact. We notice
\[ P(x) = \frac{M_y - N_x}{N} = \frac{2y}{x + 1} \cdot \frac{1}{2y} = \frac{1}{x + 1} \]
\[ x^2 + y^2 + 2y(x + 1) \frac{dy}{dx} = 0, \]
which is an exact equation that we solved in Example 1.8.5. The solution was
\[ y = \pm \sqrt{\frac{C - (1/3) x^3}{x + 1}} . \]
Example 1.8.7: Solve
\[ y^2 + (xy + 1) \frac{dy}{dx} = 0 . \]
First compute
\[ M_y - N_x = 2y - y = y . \]
As this is not zero, the equation is not exact. We observe
\[ Q(y) = \frac{M_y - N_x}{M} = \frac{y}{y^2} = \frac{1}{y} \]
is a function of $y$ alone. We compute the integrating factor
\[ e^{-\int Q(y)\,dy} = e^{-\ln y} = \frac{1}{y} . \]
Therefore we look at the exact equation
\[ y + \frac{xy + 1}{y} \frac{dy}{dx} = 0 . \]
The reader should double check that this equation is exact. We follow the procedure for
exact equations
\[ F(x, y) = xy + A(y), \]
and
\[ \frac{xy + 1}{y} = x + \frac{1}{y} = x + A'(y) . \tag{1.6} \]
Consequently $A'(y) = \frac{1}{y}$ or $A(y) = \ln y$. Thus $F(x, y) = xy + \ln y$. It is not possible to solve $F(x, y) = C$ for $y$ in terms of elementary functions, so let us be content with the implicit solution:
\[ xy + \ln y = C . \]
We are looking for the general solution and we divided by $y$ above. We should check what happens when $y = 0$, as the equation itself makes perfect sense in that case. We plug in $y = 0$ to find the equation is satisfied. So $y = 0$ is also a solution.
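As a sanity check on the implicit solution of Example 1.8.7, one can differentiate $xy + \ln y = C$ with respect to $x$, treating $y$ as a function of $x$. The sketch below assumes $y > 0$ so that $\ln y$ is defined.

```latex
\begin{align*}
\frac{d}{dx}\bigl[ x y + \ln y \bigr]
  &= y + x \frac{dy}{dx} + \frac{1}{y} \frac{dy}{dx} = 0, \\
y^2 + (x y + 1) \frac{dy}{dx} &= 0
  \qquad \text{(after multiplying through by } y \text{)} .
\end{align*}
```

The last line is the equation we started with, so the implicit solution is consistent with the ODE.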
1.8.3 Exercises
Exercise 1.8.2: Solve the following exact equations, implicit general solutions will suffice:
a) $(2xy + x^2)\,dx + (x^2 + y^2 + 1)\,dy = 0$   b) $x^5 + y^5 \frac{dy}{dx} = 0$
c) $e^x + y^3 + 3xy^2 \frac{dy}{dx} = 0$   d) $(x + y)\cos(x) + \sin(x) + \sin(x)\,y' = 0$
Exercise 1.8.3: Find the integrating factor for the following equations making them into exact equations:
a) $e^{xy}\,dx + \frac{y}{x} e^{xy}\,dy = 0$   b) $\frac{e^x + y^3}{y^2}\,dx + 3x\,dy = 0$
c) $4(y^2 + x)\,dx + \frac{2x + 2y^2}{y}\,dy = 0$   d) $2\sin(y)\,dx + x\cos(y)\,dy = 0$
Exercise 1.8.4: Suppose you have an equation of the form: $f(x) + g(y) \frac{dy}{dx} = 0$.
a) Show it is exact.
a) Show that $u_y\,dx - u_x\,dy = 0$ is an exact equation. Therefore there exists (at least locally) the so-called harmonic conjugate function $v(x, y)$ such that $v_x = u_y$ and $v_y = -u_x$.
Verify that the following $u$ are harmonic and find the corresponding harmonic conjugates $v$:
b) $u = 2xy$   c) $u = e^x \cos y$   d) $u = x^3 - 3xy^2$
Exercise 1.8.101: Solve the following exact equations, implicit general solutions will suffice:
Exercise 1.8.102: Find the integrating factor for the following equations making them into exact equations:
a) $\frac{1}{y}\,dx + 3y\,dy = 0$   b) $dx - e^{-x-y}\,dy = 0$
c) $\left( \frac{\cos(x)}{y^2} + \frac{1}{y} \right) dx + \frac{x}{y^2}\,dy = 0$   d) $\left( 2y + \frac{y^2}{x} \right) dx + (2y + x)\,dy = 0$
Exercise 1.8.103:
a) Show that every separable equation $y' = f(x) g(y)$ can be written as an exact equation, and verify that it is indeed exact.
b) Using this rewrite $y' = xy$ as an exact equation, solve it and verify that the solution is the same as it was in Example �.�.�.
\[ a(x, t)\,u_x + b(x, t)\,u_t + c(x, t)\,u = g(x, t), \qquad u(x, 0) = f(x), \qquad -\infty < x < \infty, \quad t > 0, \]
where u(x, t) is a function of x and t. The initial condition u(x, 0) ⇤ f (x) is now a function
of x rather than just a number. In these problems it is useful to think of x as position and t
as time. The equation describes the evolution of a function of x as time goes on. Below,
the coefficients a, b, c, and the function g are mostly going to be constant or zero. The
method we describe works with nonconstant coefficients, although the computations may
get difficult quickly.
The method we use is the method of characteristics. The idea is that we find lines along
which the equation is an ODE that we solve. We will see this technique again for second
order PDE when we encounter the wave equation in § 4.8.
Example 1.9.1: Consider the equation
\[ u_t + \alpha u_x = 0, \qquad u(x, 0) = f(x) . \]
\[ \xi = x - \alpha t, \qquad s = t . \]
Let's see what the equation becomes. Remember the chain rule in several variables.
\[
\begin{aligned}
u_t &= u_\xi \xi_t + u_s s_t = -\alpha u_\xi + u_s, \\
u_x &= u_\xi \xi_x + u_s s_x = u_\xi .
\end{aligned}
\]
\[ \underbrace{(-\alpha u_\xi + u_s)}_{u_t} + \alpha \underbrace{(u_\xi)}_{u_x} = 0, \]
or in other words
\[ u_s = 0 . \]
That is trivial to solve. Treating $\xi$ as simply a parameter, we have obtained the ODE $\frac{du}{ds} = 0$.
The solution is a function that does not depend on $s$ (but it does depend on $\xi$). That is, there is some function $A$ such that
Figure 1.21: Example of “transport” in $u_t + u_x = 0$ (that is, $\alpha = 1$) where the initial condition $f(x)$ is a peak at the origin. On the left is a graph of the initial condition $u(x, 0)$. On the right is a graph of the function $u(x, 1)$, that is at time $t = 1$. Notice it is the same graph shifted one unit to the right.
We change coordinates to the characteristic coordinates. Let us call these coordinates $(\xi, s)$. These are coordinates where $a u_x + b u_t$ becomes differentiation in the $s$ variable.
Along the characteristic curves (where $\xi$ is constant), we get a new ODE in the $s$ variable. In the transport equation, we got the simple $\frac{du}{ds} = 0$. In general, we get the linear equation
\[ \frac{du}{ds} + c u = g . \tag{1.7} \]
We think of everything as a function of $\xi$ and $s$, although we are thinking of $\xi$ as a parameter rather than an independent variable. So the equation is an ODE. It is a linear ODE that we can solve using the integrating factor.
To find the characteristics, think of a curve given parametrically $\bigl( x(s), t(s) \bigr)$. We try to have the curve satisfy
\[ \frac{dx}{ds} = a, \qquad \frac{dt}{ds} = b . \]
Why? Because when we think of $x$ and $t$ as functions of $s$ we find, using the chain rule,
\[ \frac{du}{ds} + c u = \underbrace{\left( u_x \frac{dx}{ds} + u_t \frac{dt}{ds} \right)}_{\frac{du}{ds}} + c u = a u_x + b u_t + c u = g . \]
So we get the ODE (1.7), which then describes the value of the solution $u$ of the PDE along this characteristic curve. It is also convenient to make sure that $s = 0$ corresponds to $t = 0$, that is $t(0) = 0$. It will be convenient also for $x(0) = \xi$. See Figure 1.22.
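Figure 1.22: A characteristic curve $\bigl( x(s), t(s) \bigr)$ in the $(x, t)$ plane, on which $\xi$ is constant, starting at $x = \xi$ where $s = 0$.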
\[ \frac{dx}{ds} = 1, \qquad \frac{dt}{ds} = 1 . \]
1.9. FIRST ORDER LINEAR PDE 75
So
\[ x = s + c_1, \qquad t = s + c_2, \]
for some $c_1$ and $c_2$. At $s = 0$ we want $t = 0$, and $x$ should be $\xi$. So we let $c_1 = \xi$ and $c_2 = 0$:
\[ x = s + \xi, \qquad t = s . \]
The ODE is $\frac{du}{ds} + u = x$, and $x = s + \xi$. So, the ODE to solve along the characteristic is
\[ \frac{du}{ds} + u = s + \xi . \]
The general solution of this equation, treating $\xi$ as a parameter, is $u = C e^{-s} + s + \xi - 1$, for some constant $C$. At $s = 0$, our initial condition is that $u$ is $e^{-\xi^2}$, since at $s = 0$ we have $x = \xi$. Given this initial condition, we find $C = e^{-\xi^2} - \xi + 1$. So,
\[
\begin{aligned}
u &= \bigl( e^{-\xi^2} - \xi + 1 \bigr) e^{-s} + s + \xi - 1 \\
  &= e^{-\xi^2 - s} + (1 - \xi) e^{-s} + s + \xi - 1 .
\end{aligned}
\]
See Figure 1.23 on the next page for a plot of u(x, t) as a function of two variables.
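A quick numerical check of this solution is sketched below. It assumes the problem being solved is $u_x + u_t + u = x$ with $u(x, 0) = e^{-x^2}$, which is what the characteristic equations and the ODE above correspond to, and it substitutes $\xi = x - t$, $s = t$. The helper names, sample points, and finite-difference step are illustrative choices.

```python
import math

def u(x, t):
    """Candidate solution, written with xi = x - t and s = t."""
    xi, s = x - t, t
    return math.exp(-xi**2 - s) + (1 - xi) * math.exp(-s) + s + xi - 1

def residual(x, t, eps=1e-5):
    """Finite-difference estimate of u_x + u_t + u - x at (x, t)."""
    u_x = (u(x + eps, t) - u(x - eps, t)) / (2 * eps)
    u_t = (u(x, t + eps) - u(x, t - eps)) / (2 * eps)
    return u_x + u_t + u(x, t) - x

points = [(0.3, 0.2), (-1.0, 1.0), (2.0, 0.5)]
max_residual = max(abs(residual(x, t)) for x, t in points)
initial_error = max(abs(u(x, 0.0) - math.exp(-x**2)) for x, _ in points)
print(max_residual, initial_error)
```

Both printed numbers should be tiny: the first is limited by the finite-difference approximation, the second only by rounding.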
When the coefficients are not constants, the characteristic curves are not going to be
straight lines anymore.
Example 1.9.3: Consider the following variable coefficient equation:
\[ x u_x + u_t + 2u = 0, \qquad u(x, 0) = \cos(x) . \]
\[ \frac{dx}{ds} = x, \qquad \frac{dt}{ds} = 1 . \]
So
\[ x = c_1 e^{s}, \qquad t = s + c_2 . \]
At $s = 0$, we wish to get the line $t = 0$, and $x$ should be $\xi$. So
\[ x = \xi e^{s}, \qquad t = s . \]
Figure 1.23: Plot of $u(x, t)$ as a function of $x$ and $t$.
This is for a fixed $\xi$. At $s = 0$, we should get that $u$ is $\cos(\xi)$, so that is our initial condition. Consequently,
\[ u = e^{-2s} \cos(\xi) = e^{-2t} \cos(x e^{-t}) . \]
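To see the method of characteristics at work numerically, the sketch below follows one characteristic of Example 1.9.3 with small Euler steps and compares the result with the closed formula. It assumes the ODE along the characteristic is $\frac{du}{ds} = -2u$ (equation (1.7) with $c = 2$, $g = 0$). The function name, the starting value $\xi = 0.7$, and the step count are arbitrary illustrative choices.

```python
import math

def follow_characteristic(xi, s_end, n=10000):
    """Euler-step dx/ds = x, dt/ds = 1, du/ds = -2u from s = 0 to s = s_end."""
    h = s_end / n
    x, t, u = xi, 0.0, math.cos(xi)   # start on the line t = 0 at x = xi
    for _ in range(n):
        x, t, u = x + h * x, t + h, u - 2 * h * u
    return x, t, u

x_end, t_end, u_num = follow_characteristic(xi=0.7, s_end=1.0)
u_formula = math.exp(-2 * t_end) * math.cos(x_end * math.exp(-t_end))
print(u_num, u_formula)
```

The two printed values should agree to a few decimal places; the small gap is the first order error of Euler's method.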
We make a few closing remarks. One thing to keep in mind is that we would get into trouble if the coefficient in front of $u_t$, that is the $b$, is ever zero. Let us consider a quick example of what can go wrong:
\[ u_x + u = 0, \qquad u(x, 0) = \sin(x) . \]
This problem has no solution. If we had a solution, it would imply that $u_x(x, 0) = \cos(x)$, but $u_x(x, 0) + u(x, 0) = \cos(x) + \sin(x) \neq 0$. The problem is that the characteristic curve is now the line $t = 0$, and the solution is already provided on that line!
As long as $b$ is nonzero, it is convenient to ensure that $b$ is positive by multiplying by $-1$ if necessary, so that positive $s$ means positive $t$.
Another remark is that if a or b in the equation are variable, the computations can
quickly get out of hand, as the expressions for the characteristic coordinates become messy
and then solving the ODE becomes even messier. In the above examples, b was always 1,
meaning we got $s = t$ in the characteristic coordinates. If $b$ is not constant, your expression
for s will be more complicated.
Finding the characteristic coordinates is really a system of ODE in general if a depends
on t or if b depends on x. In that case, we would need techniques of systems of ODE to
solve, see chapter 3 or chapter 8. In general, if a and b are not linear functions or constants,
finding closed form expressions for the characteristic coordinates may be impossible.
Finally, the method of characteristics applies to nonlinear first order PDE as well. In the
nonlinear case, the characteristics depend not only on the differential equation, but also on
the initial data. This leads to not only more difficult computations, but also the formation
of singularities where the solution breaks down at a certain point in time. An example
application where first order nonlinear PDE come up is traffic flow theory, and you have
probably experienced the formation of singularities: traffic jams. But we digress.
1.9.1 Exercises
Exercise 1.9.1: Solve
Exercise 1.9.5:
c) Explain why you got the same solution, although the characteristic coordinates you found were different.
Exercise 1.9.6: Solve $(1 + x^2) u_t + x^2 u_x + e^x u = 0$, $u(x, 0) = 0$. Hint: Think a little out of the box.
where $p(x) = B(x)/A(x)$, $q(x) = C(x)/A(x)$, and $f(x) = F(x)/A(x)$. The word linear means that the equation contains no powers nor functions of $y$, $y'$, and $y''$.
In the special case when $f(x) = 0$, we have a so-called homogeneous equation
That is, we can add solutions together and multiply them by constants to obtain new
and different solutions. We call the expression $C_1 y_1 + C_2 y_2$ a linear combination of $y_1$ and $y_2$. Let us prove this theorem; the proof is very enlightening and illustrates how linear equations work.
Proof: Let $y = C_1 y_1 + C_2 y_2$. Then
\[
\begin{aligned}
y'' + p y' + q y &= (C_1 y_1 + C_2 y_2)'' + p (C_1 y_1 + C_2 y_2)' + q (C_1 y_1 + C_2 y_2) \\
&= C_1 y_1'' + C_2 y_2'' + C_1 p y_1' + C_2 p y_2' + C_1 q y_1 + C_2 q y_2 \\
&= C_1 (y_1'' + p y_1' + q y_1) + C_2 (y_2'' + p y_2' + q y_2) \\
&= C_1 \cdot 0 + C_2 \cdot 0 = 0 . \qquad \Box
\end{aligned}
\]
The proof becomes even simpler to state if we use the operator notation. An operator is
an object that eats functions and spits out functions (kind of like what a function is, but a
function eats numbers and spits out numbers). Define the operator L by
\[ L y = y'' + p y' + q y . \]
The differential equation now becomes $L y = 0$. The operator (and the equation) $L$ being linear means that $L(C_1 y_1 + C_2 y_2) = C_1 L y_1 + C_2 L y_2$. The proof above becomes
\[ L y = L(C_1 y_1 + C_2 y_2) = C_1 L y_1 + C_2 L y_2 = C_1 \cdot 0 + C_2 \cdot 0 = 0 . \]
For example, the equation $y'' + k^2 y = 0$ with $y(0) = b_0$ and $y'(0) = b_1$ has the solution
\[ y(x) = b_0 \cos(kx) + \frac{b_1}{k} \sin(kx) . \]
\[ y(x) = b_0 \cosh(kx) + \frac{b_1}{k} \sinh(kx) . \]
Using cosh and sinh in this solution allows us to solve for the initial conditions in a cleaner way than if we had used the exponentials.
The initial conditions for a second order ODE consist of two equations. Common sense
tells us that if we have two arbitrary constants and two equations, then we should be able
to solve for the constants and find a solution to the differential equation satisfying the
initial conditions.
Question: Suppose we find two different solutions $y_1$ and $y_2$ to the homogeneous equation (2.2). Can every solution be written (using superposition) in the form $y = C_1 y_1 + C_2 y_2$?
The answer is affirmative, provided that $y_1$ and $y_2$ are different enough in the following sense. We say $y_1$ and $y_2$ are linearly independent if one is not a constant multiple of the other.
Theorem 2.1.3. Let $p$, $q$ be continuous functions. Let $y_1$ and $y_2$ be two linearly independent solutions to the homogeneous equation (2.2). Then every other solution is of the form
\[ y = C_1 y_1 + C_2 y_2 . \]
\[ y = C_1 \cos x + C_2 \sin x \]
In other words, $y_1 v'' + (2y_1' + p(x) y_1) v' = 0$. Using $w = v'$ we have the first order linear equation $y_1 w' + (2y_1' + p(x) y_1) w = 0$. After solving this equation for $w$ (integrating factor), we find $v$ by antidifferentiating $w$. We then form $y_2$ by computing $y_1 v$. For example, suppose we somehow know $y_1 = x$ is a solution to $y'' + x^{-1} y' - x^{-2} y = 0$. The equation for $w$ is then $x w' + 3w = 0$. We find a solution, $w = C x^{-3}$, and we find an antiderivative $v = \frac{-C}{2x^2}$. Hence $y_2 = y_1 v = \frac{-C}{2x}$. Any $C$ works and so $C = -2$ makes $y_2 = 1/x$. Thus, the general solution is $y = C_1 x + C_2\, 1/x$.
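It is worth confirming that the second solution found by reduction of order really solves the equation. A short check, plugging $y_2 = 1/x$ into $y'' + x^{-1} y' - x^{-2} y = 0$:

```latex
\begin{align*}
y_2 &= x^{-1}, \qquad y_2' = -x^{-2}, \qquad y_2'' = 2x^{-3}, \\
y_2'' + x^{-1} y_2' - x^{-2} y_2
  &= 2x^{-3} - x^{-3} - x^{-3} = 0 .
\end{align*}
```

Since $1/x$ is not a constant multiple of $x$, the two solutions are linearly independent.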
Since we have a formula for the solution to the first order linear equation, we can write
a formula for y2 :
\[ y_2(x) = y_1(x) \int \frac{e^{-\int p(x)\,dx}}{\bigl( y_1(x) \bigr)^2}\,dx \]
Although it is much easier to remember that we just need to try y2 (x) ⇤ y1 (x)v(x) and find
v(x) as we did above. Also, the technique works for higher order equations too: you get to
reduce the order for each solution you find. So it is better to remember how to do it rather
than a specific formula.
We will study the solution of nonhomogeneous equations in § 2.5. We will first focus
on finding general solutions to homogeneous equations.
2.1.1 Exercises
Exercise 2.1.2: Show that $y = e^x$ and $y = e^{2x}$ are linearly independent.
Exercise 2.1.4: Prove the superposition principle for nonhomogeneous equations. Suppose that $y_1$ is a solution to $L y_1 = f(x)$ and $y_2$ is a solution to $L y_2 = g(x)$ (same linear operator $L$). Show that $y = y_1 + y_2$ solves $L y = f(x) + g(x)$.
Exercise 2.1.5: For the equation $x^2 y'' - x y' = 0$, find two solutions, show that they are linearly independent and find the general solution. Hint: Try $y = x^r$.
Exercise 2.1.7: Same equation as in Exercise �.�.�. Suppose $(b - a)^2 - 4ac = 0$. Find a formula for the general solution of $a x^2 y'' + b x y' + c y = 0$. Hint: Try $y = x^r \ln x$ for the second solution.
is also a solution.
Exercise 2.1.104: Find the general solution to $x y'' + y' = 0$. Hint: It is a first order ODE in $y'$.
Exercise 2.1.105: Write down an equation (guess) for which we have the solutions $e^x$ and $e^{2x}$. Hint: Try an equation of the form $y'' + A y' + B y = 0$ for constants $A$ and $B$, plug in both $e^x$ and $e^{2x}$ and solve for $A$ and $B$.
84 CHAPTER 2. HIGHER ORDER LINEAR ODES
\[ y'' - 6y' + 8y = 0, \qquad y(0) = -2, \qquad y'(0) = 6 . \]
This is a second order linear homogeneous equation with constant coefficients. Constant
coefficients means that the functions in front of y 00, y 0, and y are constants, they do not
depend on x.
To guess a solution, think of a function that stays essentially the same when we
differentiate it, so that we can take the function and its derivatives, add some multiples of
these together, and end up with zero. Yes, we are talking about the exponential.
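Let us try a solution of the form $y = e^{rx}$. Then $y' = r e^{rx}$ and $y'' = r^2 e^{rx}$. Plug in to get
\[
\begin{aligned}
y'' - 6y' + 8y &= 0, \\
\underbrace{r^2 e^{rx}}_{y''} - 6 \underbrace{r e^{rx}}_{y'} + 8 \underbrace{e^{rx}}_{y} &= 0, \\
r^2 - 6r + 8 &= 0 \qquad \text{(divide through by } e^{rx} \text{)}, \\
(r - 2)(r - 4) &= 0 .
\end{aligned}
\]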
The functions $e^{2x}$ and $e^{4x}$ are linearly independent. If they were not linearly independent, we could write $e^{4x} = C e^{2x}$ for some constant $C$, implying that $e^{2x} = C$ for all $x$, which is clearly not possible. Hence, we can write the general solution as
\[ y = C_1 e^{2x} + C_2 e^{4x} . \]
We need to solve for $C_1$ and $C_2$. To apply the initial conditions, we first find $y' = 2C_1 e^{2x} + 4C_2 e^{4x}$. We plug $x = 0$ into $y$ and $y'$ and solve.
\[
\begin{aligned}
-2 &= y(0) = C_1 + C_2, \\
6 &= y'(0) = 2C_1 + 4C_2 .
\end{aligned}
\]
Making an educated guess with some parameters to solve for is such a central technique in differential
equations, that people sometimes use a fancy name for such a guess: ansatz, German for “initial placement of
a tool at a work piece.” Yes, the Germans have a word for that.
2.2. CONSTANT COEFFICIENT SECOND ORDER LINEAR ODES 85
Either apply some matrix algebra, or just solve these by high school math. For example,
divide the second equation by 2 to obtain $3 = C_1 + 2C_2$, and subtract the two equations to get $5 = C_2$. Then $C_1 = -7$ as $-2 = C_1 + 5$. Hence, the solution we are looking for is
\[ y = -7e^{2x} + 5e^{4x} . \]
Let us generalize this example into a method. Suppose that we have an equation
\[ a y'' + b y' + c y = 0, \tag{2.3} \]
where $a$, $b$, $c$ are constants. Try the solution $y = e^{rx}$ to obtain
\[ a r^2 e^{rx} + b r e^{rx} + c e^{rx} = 0 . \]
Divide by $e^{rx}$ to obtain the so-called characteristic equation of the ODE:
\[ a r^2 + b r + c = 0 . \]
Solve for the $r$ by using the quadratic formula.
\[ r_1, r_2 = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} . \]
So $e^{r_1 x}$ and $e^{r_2 x}$ are solutions. There is still a difficulty if $r_1 = r_2$, but it is not hard to overcome.
Theorem 2.2.1. Suppose that $r_1$ and $r_2$ are the roots of the characteristic equation.
(i) If $r_1$ and $r_2$ are distinct and real (when $b^2 - 4ac > 0$), then (2.3) has the general solution
\[ y = C_1 e^{r_1 x} + C_2 e^{r_2 x} . \]
(ii) If $r_1 = r_2$ (happens when $b^2 - 4ac = 0$), then (2.3) has the general solution
\[ y = (C_1 + C_2 x)\, e^{r_1 x} . \]
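The recipe of this section is mechanical enough to sketch in code. The helper names below are assumptions made for illustration, and the second helper only covers case (i) of the theorem, distinct real roots. The sample data is the equation $y'' - 6y' + 8y = 0$ with $y(0) = -2$, $y'(0) = 6$ from earlier in the section.

```python
import cmath

def characteristic_roots(a, b, c):
    """Roots of a r^2 + b r + c = 0 via the quadratic formula (a != 0)."""
    disc = cmath.sqrt(b * b - 4 * a * c)
    return (-b + disc) / (2 * a), (-b - disc) / (2 * a)

def fit_distinct_real(r1, r2, y0, yp0):
    """Solve C1 + C2 = y0 and r1 C1 + r2 C2 = yp0 for y = C1 e^{r1 x} + C2 e^{r2 x}."""
    c2 = (yp0 - r1 * y0) / (r2 - r1)
    return y0 - c2, c2

# y'' - 6y' + 8y = 0 with y(0) = -2, y'(0) = 6.
r_plus, r_minus = characteristic_roots(1, -6, 8)
c_for_2, c_for_4 = fit_distinct_real(r_minus.real, r_plus.real, -2, 6)
print(r_minus, r_plus, c_for_2, c_for_4)
```

Using `cmath.sqrt` rather than `math.sqrt` means the same root finder also returns the complex pair when $b^2 - 4ac < 0$.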
We also define the exponential $e^{a+ib}$ of a complex number. We do this by writing down the Taylor series and plugging in the complex number. Because most properties of the exponential can be proved by looking at the Taylor series, these properties still hold for the complex exponential. For example the very important property: $e^{x+y} = e^x e^y$. This means that $e^{a+ib} = e^a e^{ib}$. Hence if we can compute $e^{ib}$, we can compute $e^{a+ib}$. For $e^{ib}$ we use the so-called Euler's formula.
Theorem 2.2.2 (Euler's formula).
\[ \cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2} \qquad \text{and} \qquad \sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i} . \]
Exercise 2.2.5: Double angle identities: Start with $e^{i(2\theta)} = \bigl( e^{i\theta} \bigr)^2$. Use Euler on each side and deduce:
\[ \cos(2\theta) = \cos^2\theta - \sin^2\theta \qquad \text{and} \qquad \sin(2\theta) = 2\sin\theta\cos\theta . \]
For a complex number a + ib we call a the real part and b the imaginary part of the
number. Often the following notation is used,
Then
\[ a y'' + b y' + c y = 0 . \]
If the characteristic equation has the roots $\alpha \pm i\beta$ (when $b^2 - 4ac < 0$), then the general solution is
\[ y = C_1 \cos(kx) + C_2 \sin(kx) . \]
We again plug in the initial condition and obtain $10 = y'(0) = 2C_2$, or $C_2 = 5$. The solution we are seeking is
\[ y = 5e^{3x} \sin(2x) . \]
2.2.4 Exercises
Exercise 2.2.6: Find the general solution of $2y'' + 2y' - 4y = 0$.
Exercise 2.2.12: Find the general solution of $y'' = 0$ using the methods of this section.
Exercise 2.2.13: The method of this section applies to equations of other orders than two. We will see higher orders later. Try to solve the first order equation $2y' + 3y = 0$ using the methods of this section.
Exercise 2.2.14: Let us revisit the Cauchy–Euler equations of Exercise �.�.� on page ��. Suppose now that $(b - a)^2 - 4ac < 0$. Find a formula for the general solution of $a x^2 y'' + b x y' + c y = 0$. Hint: Note that $x^r = e^{r \ln x}$.
Exercise 2.2.105: Find the solution to $z''(t) = -2z'(t) - 2z(t)$, $z(0) = 2$, $z'(0) = -2$.
has exactly one solution $y(x)$ defined on the same interval $I$ satisfying the initial conditions
\[ y(a) = b_0, \quad y'(a) = b_1, \quad \ldots, \quad y^{(n-1)}(a) = b_{n-1} . \]
\[ c_1 y_1 + c_2 y_2 + \cdots + c_n y_n = 0 \]
has only the trivial solution $c_1 = c_2 = \cdots = c_n = 0$, where the equation must hold for all $x$. If we can solve the equation with some constants where for example $c_1 \neq 0$, then we can solve
for y1 as a linear combination of the others. If the functions are not linearly independent,
they are linearly dependent.
2.3. HIGHER ORDER LINEAR ODES 91
\[ y''' - 3y'' - y' + 3y = 0 . \tag{2.5} \]
\[ \underbrace{r^3 e^{rx}}_{y'''} - 3 \underbrace{r^2 e^{rx}}_{y''} - \underbrace{r e^{rx}}_{y'} + 3 \underbrace{e^{rx}}_{y} = 0 . \]
\[ r^3 - 3r^2 - r + 3 = 0 . \]
The trick now is to find the roots. There is a formula for the roots of degree 3 and 4
polynomials but it is very complicated. There is no formula for higher degree polynomials.
That does not mean that the roots do not exist. There are always $n$ roots for an $n^{\text{th}}$ degree polynomial. They may be repeated and they may be complex. Computers are pretty good at finding roots approximately for reasonable size polynomials.
A good place to start is to plot the polynomial and check where it is zero. We can also simply try plugging in. We just start plugging in numbers $r = -2, -1, 0, 1, 2, \ldots$ and see if we get a hit (we can also try complex numbers). Even if we do not get a hit, we may get an indication of where the root is. For example, we plug $r = -2$ into our polynomial and get $-15$; we plug in $r = 0$ and get 3. That means there is a root between $r = -2$ and $r = 0$, because the sign changed. If we find one root, say $r_1$, then we know $(r - r_1)$ is a factor of our polynomial. Polynomial long division can then be used.
A good strategy is to begin with $r = 0$, $1$, or $-1$. These are easy to compute. Our polynomial has two such roots, $r_1 = -1$ and $r_2 = 1$. There should be 3 roots and the last root is reasonably easy to find. The constant term in a monic polynomial such as this is the product of the negations of all the roots because $r^3 - 3r^2 - r + 3 = (r - r_1)(r - r_2)(r - r_3)$. So
\[ 3 = (-r_1)(-r_2)(-r_3) = (1)(-1)(-r_3) = r_3 . \]
You should check that $r_3 = 3$ really is a root. Hence $e^{-x}$, $e^{x}$ and $e^{3x}$ are solutions to (2.5). They are linearly independent as can easily be checked, and there are 3 of them, which happens to be exactly the number we need. So the general solution is
\[ y = C_1 e^{-x} + C_2 e^{x} + C_3 e^{3x} . \]
The word monic means that the coefficient of the top degree $r^d$, in our case $r^3$, is 1.
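The "plug in small integers, then divide out the factor" strategy can be sketched as follows. The function names and the candidate range $-5, \ldots, 5$ are assumptions chosen for this illustration. The sketch only finds integer roots, so it is no substitute for a general root finder.

```python
def evaluate(coeffs, r):
    """Horner evaluation; coeffs are listed from the highest degree down."""
    value = 0
    for c in coeffs:
        value = value * r + c
    return value

def deflate(coeffs, root):
    """Divide the polynomial by (r - root) using synthetic division."""
    quotient = [coeffs[0]]
    for c in coeffs[1:-1]:
        quotient.append(c + root * quotient[-1])
    return quotient

def integer_roots(coeffs, candidates=range(-5, 6)):
    """Collect integer roots (with multiplicity) found among the candidates."""
    roots = []
    for r in candidates:
        while len(coeffs) > 1 and evaluate(coeffs, r) == 0:
            roots.append(r)
            coeffs = deflate(coeffs, r)
    return roots

cubic = [1, -3, -1, 3]            # r^3 - 3r^2 - r + 3
value_at_minus_2 = evaluate(cubic, -2)
found = integer_roots(cubic)
print(value_at_minus_2, found)
```

Because each found root is divided out before testing continues, repeated roots are reported as many times as they occur.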
Suppose we were given some initial conditions $y(0) = 1$, $y'(0) = 2$, and $y''(0) = 3$. Then
\[ 1 = y(0) = C_1 + C_2 + C_3, \]
\[ 2 = y'(0) = -C_1 + C_2 + 3C_3, \]
\[ 3 = y''(0) = C_1 + C_2 + 9C_3. \]
It is possible to find the solution by high school algebra, but it would be a pain. The
sensible way to solve a system of equations such as this is to use matrix algebra, see § 3.2
or appendix A. For now we note that the solution is $C_1 = -1/4$, $C_2 = 1$, and $C_3 = 1/4$. The
specific solution to the ODE is
\[ y = -\frac{1}{4} e^{-x} + e^{x} + \frac{1}{4} e^{3x}. \]
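For readers who want to see the matrix algebra carried out, here is a small sketch that solves the three equations for $C_1$, $C_2$, $C_3$ by Gauss-Jordan elimination with exact fractions. The helper `solve_linear` is a hypothetical name written for this illustration; it assumes the system has a unique solution.

```python
from fractions import Fraction


def solve_linear(matrix, rhs):
    """Solve matrix * c = rhs by Gauss-Jordan elimination with exact fractions."""
    n = len(rhs)
    # Augmented matrix [matrix | rhs] in exact rational arithmetic.
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot equals 1.
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


# Rows come from y(0) = 1, y'(0) = 2, y''(0) = 3 for y = C1 e^{-x} + C2 e^{x} + C3 e^{3x}.
coefficients = [[1, 1, 1], [-1, 1, 3], [1, 1, 9]]
C1, C2, C3 = solve_linear(coefficients, [1, 2, 3])
print(C1, C2, C3)
```

Exact fractions are used so that the result can be compared directly with the values quoted in the text.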
Next, suppose that we have real roots, but they are repeated. Let us say we have a root
r repeated k times. In the spirit of the second order solution, and for the same reasons, we
have the solutions
\[ e^{rx}, \; xe^{rx}, \; x^2 e^{rx}, \; \ldots, \; x^{k-1} e^{rx}. \]
We take a linear combination of these solutions to find the general solution.
Example 2.3.4: Solve
\[ y^{(4)} - 3y''' + 3y'' - y' = 0. \]
We note that the characteristic equation is
\[ r^4 - 3r^3 + 3r^2 - r = 0. \]
By inspection we note that $r^4 - 3r^3 + 3r^2 - r = r(r-1)^3$. Hence the roots given with
multiplicity are $r = 0, 1, 1, 1$. Thus the general solution is
\[ y = \underbrace{(C_1 + C_2 x + C_3 x^2)\, e^{x}}_{\text{terms coming from } r=1} + \underbrace{C_4}_{\text{from } r=0} . \]
The case of complex roots is similar to second order equations. Complex roots always
come in pairs $r = \alpha \pm i\beta$. Suppose we have two such complex roots, each repeated $k$ times.
The corresponding solution is
\[ (C_0 + C_1 x + \cdots + C_{k-1} x^{k-1})\, e^{\alpha x} \cos(\beta x) + (D_0 + D_1 x + \cdots + D_{k-1} x^{k-1})\, e^{\alpha x} \sin(\beta x), \]
where $C_0, \ldots, C_{k-1}$, $D_0, \ldots, D_{k-1}$ are arbitrary constants.
Example 2.3.5: Solve
\[ y^{(4)} - 4y''' + 8y'' - 8y' + 4y = 0. \]
The characteristic equation is
\[ r^4 - 4r^3 + 8r^2 - 8r + 4 = 0, \]
\[ (r^2 - 2r + 2)^2 = 0, \]
\[ \bigl((r-1)^2 + 1\bigr)^2 = 0. \]
Hence the roots are $1 \pm i$, both with multiplicity 2. Hence the general solution to the ODE is
\[ y = (C_1 + C_2 x)\, e^{x} \cos x + (C_3 + C_4 x)\, e^{x} \sin x. \]
The way we solved the characteristic equation above is really by guessing or by inspection.
It is not so easy in general. We could also have asked a computer or an advanced calculator
for the roots.
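Once a root has been guessed, a computer can at least confirm it and its multiplicity. The sketch below relies on a standard fact not spelled out in the text: $r$ is a root of multiplicity $k$ exactly when the polynomial and its first $k-1$ derivatives vanish at $r$ but the $k$th derivative does not. The function names and the tolerance are our own choices for this illustration.

```python
def poly_eval(coeffs, r):
    """Evaluate a polynomial (highest-degree coefficient first) by Horner's rule."""
    result = 0
    for c in coeffs:
        result = result * r + c
    return result


def derivative(coeffs):
    """Coefficients of the derivative polynomial."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]


def multiplicity(coeffs, r, tol=1e-9):
    """Count how many successive derivatives (starting with p itself) vanish at r."""
    count = 0
    while len(coeffs) > 1 and abs(poly_eval(coeffs, r)) < tol:
        count += 1
        coeffs = derivative(coeffs)
    return count


# r^4 - 4r^3 + 8r^2 - 8r + 4 from Example 2.3.5
char_poly = [1, -4, 8, -8, 4]
print(multiplicity(char_poly, 1 + 1j))
print(multiplicity(char_poly, 1 - 1j))

# r^4 - 3r^3 + 3r^2 - r from Example 2.3.4
print(multiplicity([1, -3, 3, -1, 0], 1))
print(multiplicity([1, -3, 3, -1, 0], 0))
```

For roots known only approximately the tolerance matters a great deal, since repeated roots are sensitive to rounding.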
2.3.3 Exercises
Exercise 2.3.1: Find the general solution for $y''' - y'' + y' - y = 0$.
Exercise 2.3.4: Suppose the characteristic equation for an ODE is $(r-1)^2 (r-2)^2 = 0$.
Exercise 2.3.5: Suppose that a fourth order equation has a solution $y = 2e^{4x} x \cos x$.
Exercise 2.3.6: Find the general solution for the equation of Exercise �.�.�.
Exercise 2.3.7: Let $f(x) = e^x - \cos x$, $g(x) = e^x + \cos x$, and $h(x) = \cos x$. Are $f(x)$, $g(x)$, and
$h(x)$ linearly independent? If so, show it, if not, find a linear combination that works.
Exercise 2.3.8: Let $f(x) = 0$, $g(x) = \cos x$, and $h(x) = \sin x$. Are $f(x)$, $g(x)$, and $h(x)$ linearly
independent? If so, show it, if not, find a linear combination that works.
Exercise 2.3.9: Are $x$, $x^2$, and $x^4$ linearly independent? If so, show it, if not, find a linear
combination that works.
Exercise 2.3.10: Are $e^x$, $xe^x$, and $x^2 e^x$ linearly independent? If so, show it, if not, find a linear
combination that works.
Exercise 2.3.102: Suppose that the characteristic equation of a third order differential equation has
roots ±2i and �.
Exercise 2.3.104: Are $e^x$, $e^{x+1}$, $e^{2x}$, $\sin(x)$ linearly independent? If so, show it, if not find a linear
combination that works.
Exercise 2.3.105: Are $\sin(x)$, $x$, $x \sin(x)$ linearly independent? If so, show it, if not find a linear
combination that works.
Exercise 2.3.106: Find an equation such that $y = \cos(x)$, $y = \sin(x)$, $y = e^x$ are solutions.
(iv) undamped, if $c = 0$.
This system appears in lots of applications even if it does not at first seem like it. Many
real-world scenarios can be simplified to a mass on a spring. For example, a bungee
jump setup is essentially a mass and spring system (you are the mass). It would be good
if someone did the math before you jump off the bridge, right? Let us give two other
examples.
Here is an example for electrical engineers. Consider the pictured
RLC circuit. There is a resistor with a resistance of $R$ ohms, an inductor
with an inductance of $L$ henries, and a capacitor with a capacitance
of $C$ farads. There is also an electric source (such as a battery) giving
a voltage of $E(t)$ volts at time $t$ (measured in seconds). Let $Q(t)$ be the
charge in coulombs on the capacitor and $I(t)$ be the current in the circuit. The relation
\[ \theta'' + \frac{g}{L}\theta = 0. \]
true for a pendulum. Nevertheless, for reasonably short periods of time and small
swings (that is, only small angles $\theta$), the approximation is reasonably good.
Figure 2.1: The graphs of $\sin\theta$ and $\theta$ (in radians).
In real-world problems it is often necessary to make these types of simplifications.
We must understand both the mathematics and the physics of the situation to see if the
simplification is valid in the context of the questions we are trying to answer.
\[ x'' + \omega_0^2 x = 0. \]
By a trigonometric identity
Exercise 2.4.1: Justify the above identity and verify the equations for $C$ and $\gamma$. Hint: Start with
$\cos(\alpha - \beta) = \cos(\alpha)\cos(\beta) + \sin(\alpha)\sin(\beta)$ and multiply by $C$. Then what should $\alpha$ and $\beta$ be?
While it is generally easier to use the first form with $A$ and $B$ to solve for the initial
conditions, the second form is much more natural. The constants $C$ and $\gamma$ have a nice
physical interpretation. Write the solution as
\[ x(t) = C \cos(\omega_0 t - \gamma). \]
frequency of the resulting oscillation? What is the amplitude? The units are the mks units
(meters-kilograms-seconds).
The setup means that the mass was at half a meter in the positive direction during the
crash and relative to the wall the spring is mounted to, the mass was moving forward (in
the positive direction) at 1 m/s. This gives us the initial conditions.
So the equation with initial conditions is
\[ 2x'' + 8x = 0, \qquad x(0) = 0.5, \qquad x'(0) = 1. \]
We directly compute $\omega_0 = \sqrt{k/m} = \sqrt{4} = 2$. Hence the angular frequency is 2. The usual
frequency in Hertz (cycles per second) is $2/(2\pi) = 1/\pi \approx 0.318$.
The general solution is
\[ x(t) = A \cos(2t) + B \sin(2t). \]
Letting $x(0) = 0.5$ means $A = 0.5$. Then $x'(t) = -2(0.5)\sin(2t) + 2B\cos(2t)$. Letting $x'(0) = 1$
we get $B = 0.5$. Therefore, the amplitude is $C = \sqrt{A^2 + B^2} = \sqrt{0.25 + 0.25} = \sqrt{0.5} \approx 0.707$.
The solution is
\[ x(t) = 0.5 \cos(2t) + 0.5 \sin(2t). \]
A plot of x(t) is shown in Figure 2.2.
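The arithmetic of this example can be reproduced in a few lines. This is only a sketch for the specific numbers $m = 2$, $k = 8$, $x(0) = 0.5$, $x'(0) = 1$; the variable names are ours, and computing the phase shift with `atan2` is an assumption about how to pick the angle $\gamma$ with $\tan\gamma = B/A$ in the correct quadrant.

```python
import math

m, k = 2.0, 8.0        # mass (kg) and spring constant (N/m) from 2x'' + 8x = 0
x0, v0 = 0.5, 1.0      # initial position (m) and initial velocity (m/s)

omega0 = math.sqrt(k / m)                 # angular frequency
frequency_hz = omega0 / (2 * math.pi)     # cycles per second
A = x0                                    # from x(0) = A
B = v0 / omega0                           # from x'(0) = omega0 * B
amplitude = math.hypot(A, B)              # C = sqrt(A^2 + B^2)
phase = math.atan2(B, A)                  # gamma in x(t) = C cos(omega0 t - gamma)


def x(t):
    """Position of the mass at time t in the form A cos + B sin."""
    return A * math.cos(omega0 * t) + B * math.sin(omega0 * t)


print(omega0, frequency_hz, amplitude, phase)
```

Both forms describe the same motion: `x(t)` agrees with `amplitude * math.cos(omega0 * t - phase)` up to rounding.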
In general, for free undamped motion, a
solution of the form
as
\[ x'' + 2px' + \omega_0^2 x = 0, \]
where
\[ \omega_0 = \sqrt{\frac{k}{m}}, \qquad p = \frac{c}{2m}. \]
The characteristic equation is
\[ r^2 + 2pr + \omega_0^2 = 0. \]
Using the quadratic formula we get that the roots are
\[ r = -p \pm \sqrt{p^2 - \omega_0^2}. \]
The form of the solution depends on whether we get complex or real roots. We get real
roots if and only if the following number is nonnegative:
\[ p^2 - \omega_0^2 = \left(\frac{c}{2m}\right)^2 - \frac{k}{m} = \frac{c^2 - 4km}{4m^2}. \]
The sign of $p^2 - \omega_0^2$ is the same as the sign of $c^2 - 4km$. Thus we get real roots if and only if
$c^2 - 4km$ is nonnegative, or in other words if $c^2 \geq 4km$.
Overdamping
When $c^2 - 4km > 0$, the system is overdamped. In this case, there are two distinct
real roots $r_1$ and $r_2$. Both roots are negative: As $\sqrt{p^2 - \omega_0^2}$ is always less than $p$, then
$-p \pm \sqrt{p^2 - \omega_0^2}$ is negative in either case. The solution is
\[ x(t) = C_1 e^{r_1 t} + C_2 e^{r_2 t}. \]
Since $r_1$ and $r_2$ are negative, $x(t) \to 0$ as $t \to \infty$, so the mass tends towards the rest
position as time goes to infinity. For a few sample plots for different initial conditions,
see Figure 2.3.
Figure 2.3: Overdamped motion for several different initial conditions.
No oscillation happens. In fact, the
graph crosses the $x$-axis at most once. To see why, we try to solve $0 = C_1 e^{r_1 t} + C_2 e^{r_2 t}$.
Therefore, $C_1 e^{r_1 t} = -C_2 e^{r_2 t}$ and using laws of exponents we obtain
\[ \frac{-C_1}{C_2} = e^{(r_2 - r_1)t}. \]
This equation has at most one solution $t \geq 0$. For some initial conditions the graph never
crosses the x-axis, as is evident from the sample graphs.
Example 2.4.2: Suppose the mass is released from rest. That is $x(0) = x_0$ and $x'(0) = 0$.
Then
\[ x(t) = \frac{x_0}{r_1 - r_2} \left( r_1 e^{r_2 t} - r_2 e^{r_1 t} \right). \]
It is not hard to see that this satisfies the initial conditions.
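For completeness, here is the short check written out for $x(t) = \frac{x_0}{r_1 - r_2}\left( r_1 e^{r_2 t} - r_2 e^{r_1 t} \right)$. We set $t = 0$, differentiate once, and set $t = 0$ again; the only thing used is that $r_1 \neq r_2$ in the overdamped case.

```latex
\begin{align*}
x(0)  &= \frac{x_0}{r_1 - r_2}\,(r_1 - r_2) = x_0, \\
x'(t) &= \frac{x_0}{r_1 - r_2}\left( r_1 r_2 e^{r_2 t} - r_2 r_1 e^{r_1 t} \right), \\
x'(0) &= \frac{x_0}{r_1 - r_2}\,(r_1 r_2 - r_1 r_2) = 0.
\end{align*}
```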
Critical damping
When $c^2 - 4km = 0$, the system is critically damped. In this case, there is one root of
multiplicity 2 and this root is $-p$. Our solution is
\[ x(t) = C_1 e^{-pt} + C_2 t e^{-pt}. \]
The behavior of a critically damped system is very similar to an overdamped system. After
all a critically damped system is in some sense a limit of overdamped systems. Since
these equations are really only an approximation to the real world, in reality we are never
critically damped; it is a place we can only reach in theory. We are always a little bit
underdamped or a little bit overdamped. It is better not to dwell on critical damping.
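Underdamping
When $c^2 - 4km < 0$, the system is underdamped. In this case, the roots are complex.
\[ r = -p \pm \sqrt{p^2 - \omega_0^2} = -p \pm \sqrt{-1}\sqrt{\omega_0^2 - p^2} = -p \pm i\omega_1, \]
where $\omega_1 = \sqrt{\omega_0^2 - p^2}$. Our solution is
\[ x(t) = e^{-pt} \bigl( A \cos(\omega_1 t) + B \sin(\omega_1 t) \bigr), \]
or, in the amplitude and phase shift form, $x(t) = Ce^{-pt} \cos(\omega_1 t - \gamma)$.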
The phase shift $\gamma$ shifts the oscillation left or right, but within the envelope curves (the
envelope curves do not change if $\gamma$ changes).
Notice that the angular pseudo-frequency becomes smaller when the damping c (and
hence p) becomes larger. This makes sense. When we change the damping just a little bit,
we do not expect the behavior of the solution to change dramatically. If we keep making c
larger, then at some point the solution should start looking like the solution for critical
damping or overdamping, where no oscillation happens. So if $c^2$ approaches $4km$, we
want $\omega_1$ to approach 0.
On the other hand, when $c$ gets smaller, $\omega_1$ approaches $\omega_0$ ($\omega_1$ is always smaller than
$\omega_0$), and the solution looks more and more like the steady periodic motion of the undamped
case. The envelope curves become flatter and flatter as c (and hence p) goes to 0.
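The case analysis of this section can be summarized in a short function. This is a sketch under the assumptions $m > 0$, $k > 0$, $c \geq 0$; the name `classify_damping` and the dictionary layout are hypothetical, chosen only for this illustration. An exact test for $c^2 - 4km = 0$ is only meaningful for exactly representable inputs, which echoes the remark that critical damping is never reached in practice.

```python
import math


def classify_damping(m, c, k):
    """Classify m x'' + c x' + k x = 0 by the sign of c^2 - 4km.

    Returns a label and the quantities that appear in the matching solution form.
    """
    p = c / (2 * m)
    disc = c * c - 4 * k * m
    if c == 0:
        return "undamped", {"omega0": math.sqrt(k / m)}
    if disc > 0:
        s = math.sqrt(disc) / (2 * m)          # sqrt(p^2 - omega0^2)
        return "overdamped", {"r1": -p + s, "r2": -p - s}
    if disc == 0:
        return "critically damped", {"r": -p}
    omega1 = math.sqrt(-disc) / (2 * m)        # sqrt(omega0^2 - p^2)
    return "underdamped", {"p": p, "omega1": omega1}


print(classify_damping(1, 3, 1))
print(classify_damping(1, 2, 1))
print(classify_damping(1, 2, 5))
print(classify_damping(2, 0, 8))
```

In the underdamped case the returned `omega1` is always smaller than $\omega_0 = \sqrt{k/m}$, in line with the discussion above.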
2.4.4 Exercises
Exercise 2.4.2: Consider a mass and spring system with a mass $m = 2$, spring constant $k = 3$, and
damping constant $c = 1$.
c) If the system is not critically damped, find a $c$ that makes the system critically damped.
Exercise 2.4.4: Using the mks units (meters-kilograms-seconds), suppose you have a spring with
spring constant � N/m. You want to use it to weigh items. Assume no friction. You place the mass
on the spring and put it in motion.
a) You count and find that the frequency is �.� Hz (cycles per second). What is the mass?
Exercise 2.4.5: Suppose we add possible friction to Exercise �.�.�. Further, suppose you do not
know the spring constant, but you have two reference weights � kg and � kg to calibrate your setup.
You put each in motion on your spring and measure the frequency. For the � kg weight you measured
�.� Hz, for the � kg weight you measured �.� Hz.
b) Find a formula for the mass in terms of the frequency in Hz. Note that there may be more
than one possible mass for a given frequency.
c) For an unknown object you measured �.� Hz, what is the mass of the object? Suppose that
you know that the mass of the unknown object is more than a kilogram.
We do not call $\omega_1$ a frequency since the solution is not really a periodic function.
Exercise 2.4.6: Suppose you wish to measure the friction a mass of �.� kg experiences as it slides
along a floor (you wish to find $c$). You have a spring with spring constant $k = 5$ N/m. You take the
spring, you attach it to the mass and fix it to a wall. Then you pull on the spring and let the mass
go. You find that the mass oscillates with frequency � Hz. What is the friction?
Exercise 2.4.101: A mass of 2 kilograms is on a spring with spring constant $k$ newtons per meter
with no damping. Suppose the system is at rest and at time $t = 0$ the mass is kicked and starts
traveling at � meters per second. How large does $k$ have to be so that the mass does not go further
than � meters from the rest position?
Exercise 2.4.102: Suppose we have an RLC circuit with a resistor of ��� milliohms (�.� ohms),
inductor of inductance of �� millihenries (�.�� henries), and a capacitor of � farads, with constant
voltage.
Exercise 2.4.103: A ���� kg railcar hits a bumper (a spring) at � m/s, and the spring compresses by
�.� m. Assume no damping.
a) Find $k$.
b) How far does the spring compress when a ����� kg railcar hits the spring at the same speed?
c) If the spring would break if it compresses further than �.� m, what is the maximum mass of a
railcar that can hit it at � m/s?
d) What is the maximum mass of a railcar that can hit the spring without breaking at � m/s?
Exercise 2.4.104: A mass of $m$ kg is on a spring with $k = 3$ N/m and $c = 2$ Ns/m. Find the mass
$m_0$ for which there is critical damping. If $m < m_0$, does the system oscillate or not, that is, is it
underdamped or overdamped?
\[ y'' + 5y' + 6y = 2x + 1. \tag{2.6} \]
We will write $Ly = 2x + 1$ when the exact form of the operator is not important. We
solve (2.6) in the following manner. First, we find the general solution $y_c$ to the associated
homogeneous equation
\[ y'' + 5y' + 6y = 0. \tag{2.7} \]
We call $y_c$ the complementary solution. Next, we find a single particular solution $y_p$ to (2.6) in
some way. Then
\[ y = y_c + y_p \]
is the general solution to (2.6). We have $Ly_c = 0$ and $Ly_p = 2x + 1$. As $L$ is a linear operator
we verify that $y$ is a solution, $Ly = L(y_c + y_p) = Ly_c + Ly_p = 0 + (2x + 1)$. Let us see why
we obtain the general solution.
Let $y_p$ and $\tilde{y}_p$ be two different particular solutions to (2.6). Write the difference as
$w = y_p - \tilde{y}_p$. Then plug $w$ into the left-hand side of the equation to get
\[ w'' + 5w' + 6w = (y_p'' + 5y_p' + 6y_p) - (\tilde{y}_p'' + 5\tilde{y}_p' + 6\tilde{y}_p) = (2x + 1) - (2x + 1) = 0. \]
Using the operator notation the calculation becomes simpler. As $L$ is a linear operator we
write
\[ Lw = L(y_p - \tilde{y}_p) = Ly_p - L\tilde{y}_p = (2x + 1) - (2x + 1) = 0. \]
So $w = y_p - \tilde{y}_p$ is a solution to (2.7), that is $Lw = 0$. Any two solutions of (2.6) differ by a
solution to the homogeneous equation (2.7). The solution $y = y_c + y_p$ includes all solutions
to (2.6), since $y_c$ is the general solution to the associated homogeneous equation.
Theorem 2.5.1. Let $Ly = f(x)$ be a linear ODE (not necessarily constant coefficient). Let $y_c$ be
the complementary solution (the general solution to the associated homogeneous equation $Ly = 0$)
and let $y_p$ be any particular solution to $Ly = f(x)$. Then the general solution to $Ly = f(x)$ is
\[ y = y_c + y_p. \]
The moral of the story is that we can find the particular solution in any old way. If we
find a different particular solution (by a different method, or simply by guessing), then we
still get the same general solution. The formula may look different, and the constants we
have to choose to satisfy the initial conditions may be different, but it is the same solution.
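\[ y_p = Ax + B. \]
Plugging $y_p$ into (2.6) gives $5A + 6(Ax + B) = 2x + 1$, so $A = 1/3$ and $B = -1/9$. With the
complementary solution $y_c = C_1 e^{-2x} + C_2 e^{-3x}$, the general solution is
\[ y = C_1 e^{-2x} + C_2 e^{-3x} + \frac{3x - 1}{9}. \]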
Now suppose we are further given some initial conditions. For example, $y(0) = 0$ and
$y'(0) = 1/3$. First find $y' = -2C_1 e^{-2x} - 3C_2 e^{-3x} + 1/3$. Then
\[ 0 = y(0) = C_1 + C_2 - \frac{1}{9}, \qquad \frac{1}{3} = y'(0) = -2C_1 - 3C_2 + \frac{1}{3}. \]
We solve to get $C_1 = 1/3$ and $C_2 = -2/9$. The particular solution we want is
\[ y(x) = \frac{1}{3} e^{-2x} - \frac{2}{9} e^{-3x} + \frac{3x - 1}{9} = \frac{3e^{-2x} - 2e^{-3x} + 3x - 1}{9}. \]
Exercise 2.5.1: Check that y really solves the equation (2.6) and the given initial conditions.
Note: A common mistake is to solve for constants using the initial conditions with $y_c$
and only add the particular solution $y_p$ after that. That will not work. You need to first
compute $y = y_c + y_p$ and only then solve for the constants using the initial conditions.
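The common mistake is easy to see numerically. The sketch below uses the example $y'' + 5y' + 6y = 2x + 1$ with $y(0) = 0$, $y'(0) = 1/3$, complementary solution $C_1 e^{-2x} + C_2 e^{-3x}$ and $y_p = (3x - 1)/9$. The helper `solve_constants` is a hypothetical name; it solves the two equations $C_1 + C_2 = a$ and $-2C_1 - 3C_2 = b$ by hand elimination.

```python
from fractions import Fraction


def solve_constants(a, b):
    """Solve C1 + C2 = a and -2*C1 - 3*C2 = b for (C1, C2)."""
    # Adding twice the first equation to the second gives -C2 = 2a + b.
    C2 = -(2 * a + b)
    C1 = a - C2
    return C1, C2


y_target, dy_target = Fraction(0), Fraction(1, 3)   # y(0) = 0 and y'(0) = 1/3
yp0, dyp0 = Fraction(-1, 9), Fraction(1, 3)         # y_p(0) and y_p'(0) for y_p = (3x - 1)/9

# Correct: the conditions apply to y = y_c + y_p, so y_c must supply the difference.
C1, C2 = solve_constants(y_target - yp0, dy_target - dyp0)

# Mistake: apply the conditions to y_c alone and add y_p afterwards.
W1, W2 = solve_constants(y_target, dy_target)
wrong_y0 = W1 + W2 + yp0                            # value of the resulting y at x = 0

print(C1, C2)
print(W1, W2, wrong_y0)
```

With the mistaken constants the resulting function has $y(0) = -1/9$ rather than $0$, so it fails the first initial condition.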
A right-hand side consisting of exponentials, sines, and cosines can be handled similarly.
For example,
\[ y'' + 2y' + 2y = \cos(2x). \]
Let us find some $y_p$. We start by guessing the solution includes some multiple of $\cos(2x)$.
We may have to also add a multiple of $\sin(2x)$ to our guess since derivatives of cosine are
sines. We try
\[ y_p = A \cos(2x) + B \sin(2x). \]
or
\[ (-4A + 4B + 2A) \cos(2x) + (-4B - 4A + 2B) \sin(2x) = \cos(2x). \]
The left-hand side must equal the right-hand side. Namely, $-4A + 4B + 2A = 1$ and
$-4B - 4A + 2B = 0$. So $-2A + 4B = 1$ and $2A + B = 0$ and hence $A = -1/10$ and $B = 1/5$. So
\[ y_p = A \cos(2x) + B \sin(2x) = \frac{-\cos(2x) + 2 \sin(2x)}{10}. \]
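As a sanity check on the undetermined coefficients for $y'' + 2y' + 2y = \cos(2x)$, the sketch below solves the two matching equations by Cramer's rule (our choice of method, with a hypothetical helper `solve_2x2`) and then confirms numerically that the resulting $y_p$ leaves no residual in the equation.

```python
import math
from fractions import Fraction


def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*A + a12*B = b1, a21*A + a22*B = b2 by Cramer's rule (integer inputs)."""
    det = a11 * a22 - a12 * a21
    return Fraction(b1 * a22 - a12 * b2, det), Fraction(a11 * b2 - b1 * a21, det)


# Matching coefficients for y_p = A cos(2x) + B sin(2x):
#   cos(2x) terms: -2A + 4B = 1
#   sin(2x) terms: -4A - 2B = 0
A, B = solve_2x2(-2, 4, -4, -2, 1, 0)


def residual(x):
    """y_p'' + 2 y_p' + 2 y_p - cos(2x), which should vanish for every x."""
    a, b = float(A), float(B)
    yp = a * math.cos(2 * x) + b * math.sin(2 * x)
    dyp = -2 * a * math.sin(2 * x) + 2 * b * math.cos(2 * x)
    d2yp = -4 * yp
    return d2yp + 2 * dyp + 2 * yp - math.cos(2 * x)


print(A, B)
print(max(abs(residual(x / 10)) for x in range(-30, 31)))
```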
Similarly, if the right-hand side contains exponentials we try exponentials. If
\[ Ly = e^{3x}, \]
we try $y = Ae^{3x}$ as our guess and try to solve for $A$.
When the right-hand side is a multiple of sines, cosines, exponentials, and polynomials,
we can use the product rule for differentiation to come up with a guess. We need to guess
a form for $y_p$ such that $Ly_p$ is of the same form, and has all the terms needed for the
right-hand side. For example,
\[ Ly = (1 + 3x^2)\, e^{-x} \cos(\pi x). \]
For this equation, we guess
\[ y_p = (A + Bx + Cx^2)\, e^{-x} \cos(\pi x) + (D + Ex + Fx^2)\, e^{-x} \sin(\pi x). \]
We plug in and then hopefully get equations that we can solve for $A$, $B$, $C$, $D$, $E$, and $F$. As
you can see this can make for a very long and tedious calculation very quickly. C’est la vie!
There is one hiccup in all this. It could be that our guess actually solves the associated
homogeneous equation. That is, suppose we have
\[ y'' - 9y = e^{3x}. \]
We would love to guess $y = Ae^{3x}$, but if we plug this into the left-hand side of the equation
we get
\[ y'' - 9y = 9Ae^{3x} - 9Ae^{3x} = 0 \neq e^{3x}. \]
There is no way we can choose $A$ to make the left-hand side be $e^{3x}$. The trick in this case is
to multiply our guess by $x$ to get rid of duplication with the complementary solution. That
is first we compute $y_c$ (solution to $Ly = 0$)
\[ y_c = C_1 e^{-3x} + C_2 e^{3x}, \]
and we note that the $e^{3x}$ term is a duplicate with our desired guess. We modify our guess
to $y = Axe^{3x}$ so that there is no duplication anymore. Let us try: $y' = Ae^{3x} + 3Axe^{3x}$ and
$y'' = 6Ae^{3x} + 9Axe^{3x}$, so
\[ y'' - 9y = 6Ae^{3x} + 9Axe^{3x} - 9Axe^{3x} = 6Ae^{3x}. \]
Thus $6Ae^{3x}$ is supposed to equal $e^{3x}$. Hence, $6A = 1$ and so $A = 1/6$. We can now write the
general solution as
\[ y = y_c + y_p = C_1 e^{-3x} + C_2 e^{3x} + \frac{1}{6} xe^{3x}. \]
It is possible that multiplying by x does not get rid of all duplication. For example,
\[ y'' - 6y' + 9y = e^{3x}. \]
\[ Ly = e^{2x} + \cos x. \]
In this case we find $u$ that solves $Lu = e^{2x}$ and $v$ that solves $Lv = \cos x$ (that is, do each
term separately). Then note that if $y = u + v$, then $Ly = e^{2x} + \cos x$. This is because $L$ is
linear; we have $Ly = L(u + v) = Lu + Lv = e^{2x} + \cos x$.
\[ y'' + y = \tan x. \]
Each new derivative of $\tan x$ looks completely different and cannot be written as a linear
combination of the previous derivatives. If we start differentiating $\tan x$, we get:
This equation calls for a different method. We present the method of variation of
parameters, which handles any equation of the form $Ly = f(x)$, provided we can solve
certain integrals. For simplicity, we restrict ourselves to second order constant coefficient
equations, but the method works for higher order equations just as well (the computations
become more tedious). The method also works for equations with nonconstant coefficients,
provided we can solve the associated homogeneous equation.
Perhaps it is best to explain this method by example. Let us try to solve the equation
\[ Ly = y'' + y = \tan x. \]
We can still impose one more condition at our discretion to simplify computations (we
have two unknown functions, so we should be allowed two conditions). We require that
$(u_1' y_1 + u_2' y_2) = 0$. This makes computing the second derivative easier.
\[ y' = u_1 y_1' + u_2 y_2', \]
\[ y'' = (u_1' y_1' + u_2' y_2') + (u_1 y_1'' + u_2 y_2''). \]
Since $y_1$ and $y_2$ are solutions to $y'' + y = 0$, we find $y_1'' = -y_1$ and $y_2'' = -y_2$. (If the equation
was a more general $y'' + p(x)y' + q(x)y = 0$, we would have $y_i'' = -p(x)y_i' - q(x)y_i$.) So
\[ y'' = (u_1' y_1' + u_2' y_2') - y, \]
and hence
\[ y'' + y = Ly = u_1' y_1' + u_2' y_2'. \]
For $y$ to satisfy $Ly = f(x)$ we must have $f(x) = u_1' y_1' + u_2' y_2'$.
What we need to solve are the two equations (conditions) we imposed on $u_1$ and $u_2$:
\[ u_1' y_1 + u_2' y_2 = 0, \]
\[ u_1' y_1' + u_2' y_2' = f(x). \]
We solve for $u_1'$ and $u_2'$ in terms of $f(x)$, $y_1$ and $y_2$. We always get these formulas for any
$Ly = f(x)$, where $Ly = y'' + p(x)y' + q(x)y$. There is a general formula for the solution we
could just plug into, but instead of memorizing that, it is better, and easier, to just repeat
what we do below. In our case the two equations are
\[ u_1' \cos(x) + u_2' \sin(x) = 0, \]
\[ -u_1' \sin(x) + u_2' \cos(x) = \tan(x). \]
Hence
\[ u_1' \cos(x)\sin(x) + u_2' \sin^2(x) = 0, \]
\[ -u_1' \sin(x)\cos(x) + u_2' \cos^2(x) = \tan(x)\cos(x) = \sin(x). \]
And thus
\[ u_2' \bigl(\sin^2(x) + \cos^2(x)\bigr) = \sin(x), \]
\[ u_2' = \sin(x), \]
\[ u_1' = \frac{-\sin^2(x)}{\cos(x)} = -\tan(x)\sin(x). \]
We integrate $u_1'$ and $u_2'$ to get $u_1$ and $u_2$.
\[ u_1 = \int u_1' \,dx = \int -\tan(x)\sin(x) \,dx = \frac{1}{2} \ln\left| \frac{\sin(x) - 1}{\sin(x) + 1} \right| + \sin(x), \]
\[ u_2 = \int u_2' \,dx = \int \sin(x) \,dx = -\cos(x). \]
\[ y_p = u_1 y_1 + u_2 y_2 = \frac{1}{2} \cos(x) \ln\left| \frac{\sin(x) - 1}{\sin(x) + 1} \right| + \cos(x)\sin(x) - \cos(x)\sin(x) = \frac{1}{2} \cos(x) \ln\left| \frac{\sin(x) - 1}{\sin(x) + 1} \right|. \]
The general solution to $y'' + y = \tan x$ is, therefore,
\[ y = C_1 \cos(x) + C_2 \sin(x) + \frac{1}{2} \cos(x) \ln\left| \frac{\sin(x) - 1}{\sin(x) + 1} \right|. \]
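A result like this is worth checking, and a rough numerical check is cheap. The sketch below takes $y_p = \frac{1}{2}\cos(x)\ln\left|\frac{\sin(x)-1}{\sin(x)+1}\right|$, approximates $y_p''$ by a central difference (an approximation we introduce here, with step size $h$ chosen by hand), and looks at the residual $y_p'' + y_p - \tan x$ at a few points away from $x = \pm\pi/2$.

```python
import math


def y_p(x):
    """Particular solution of y'' + y = tan(x) found by variation of parameters."""
    return 0.5 * math.cos(x) * math.log(abs((math.sin(x) - 1) / (math.sin(x) + 1)))


def second_derivative(f, x, h=1e-4):
    """Central difference approximation of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)


def residual(x):
    """Approximate value of y_p'' + y_p - tan(x); should be close to zero."""
    return second_derivative(y_p, x) + y_p(x) - math.tan(x)


for x in (-1.0, 0.3, 1.0):
    print(x, residual(x))
```

The residuals are not exactly zero because of the finite difference and rounding, so only their smallness is meaningful.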
2.5.4 Exercises
Exercise 2.5.2: Find a particular solution of $y'' - y' - 6y = e^{2x}$.
Exercise 2.5.3: Find a particular solution of $y'' - 4y' + 4y = e^{2x}$.
Exercise 2.5.4: Solve the initial value problem $y'' + 9y = \cos(3x) + \sin(3x)$ for $y(0) = 2$, $y'(0) = 1$.
Exercise 2.5.5: Set up the form of the particular solution but do not solve for the coefficients for
$y^{(4)} - 2y''' + y'' = e^x$.
Exercise 2.5.6: Set up the form of the particular solution but do not solve for the coefficients for
$y^{(4)} - 2y''' + y'' = e^x + x + \sin x$.
Exercise 2.5.7:
c) Are the two solutions you found the same? See also Exercise �.�.��.
Exercise 2.5.9: For an arbitrary constant $c$ find a particular solution to $y'' - y = e^{cx}$. Hint: Make
sure to handle every possible real $c$.
Exercise 2.5.10:
c) Are the two solutions you found the same? What is going on?
Exercise 2.5.102:
Exercise 2.5.105: For an arbitrary constant $c$ find the general solution to $y'' - 2y = \sin(x + c)$.
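\[ x = C_1 \cos(\omega_0 t) + C_2 \sin(\omega_0 t) + \frac{F_0}{m(\omega_0^2 - \omega^2)} \cos(\omega t). \]
\[ x = \frac{20}{16 - \pi^2} \bigl( \cos(\pi t) - \cos(4t) \bigr). \]
\[ 2 \sin\left( \frac{A - B}{2} \right) \sin\left( \frac{A + B}{2} \right) = \cos B - \cos A \]
to get
\[ x = \frac{20}{16 - \pi^2} \left( 2 \sin\left( \frac{4 - \pi}{2} t \right) \sin\left( \frac{4 + \pi}{2} t \right) \right). \]
Figure 2.5: Graph of $\frac{20}{16 - \pi^2} \bigl( \cos(\pi t) - \cos(4t) \bigr)$.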
The function x is a high frequency wave
modulated by a low frequency wave.
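The two forms of $x$ can be compared numerically. In this sketch `x_sum` is $\frac{20}{16-\pi^2}(\cos(\pi t) - \cos(4t))$, `x_product` is the product of sines obtained from the trigonometric identity, and `envelope` is the slowly varying amplitude $\left|\frac{40}{16-\pi^2}\sin\left(\frac{4-\pi}{2}t\right)\right|$. The function names are ours.

```python
import math

SCALE = 20 / (16 - math.pi**2)


def x_sum(t):
    """Solution written as a difference of cosines."""
    return SCALE * (math.cos(math.pi * t) - math.cos(4 * t))


def x_product(t):
    """The same solution written as a product of a slow and a fast sine."""
    slow = math.sin((4 - math.pi) / 2 * t)   # low frequency modulation
    fast = math.sin((4 + math.pi) / 2 * t)   # high frequency wave
    return SCALE * 2 * slow * fast


def envelope(t):
    """Slowly varying amplitude that bounds the fast oscillation."""
    return abs(SCALE * 2 * math.sin((4 - math.pi) / 2 * t))


for t in (0.0, 0.5, 3.3, 17.0):
    print(t, x_sum(t), x_product(t), envelope(t))
```

The slow factor has angular frequency $(4 - \pi)/2 \approx 0.43$, which is why the beats in the graph are so much longer than the individual oscillations.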
Now suppose $\omega_0 = \omega$. Obviously, we cannot try the solution $A \cos(\omega t)$ and then use
the method of undetermined coefficients. We notice that $\cos(\omega t)$ solves the associated
homogeneous equation. Therefore, we try $x_p = At \cos(\omega t) + Bt \sin(\omega t)$. This time we need
the sine term, since the second derivative of $t \cos(\omega t)$ contains sines. We write the equation
\[ x'' + \omega^2 x = \frac{F_0}{m} \cos(\omega t). \]
Plugging $x_p$ into the left-hand side we get
\[ 2B\omega \cos(\omega t) - 2A\omega \sin(\omega t) = \frac{F_0}{m} \cos(\omega t). \]
Hence $A = 0$ and $B = \frac{F_0}{2m\omega}$. Our particular solution is $\frac{F_0}{2m\omega}\, t \sin(\omega t)$ and our general solution
is
\[ x = C_1 \cos(\omega t) + C_2 \sin(\omega t) + \frac{F_0}{2m\omega}\, t \sin(\omega t). \]
The important term is the last one (the particular solution we found). This term grows
without bound as $t \to \infty$. In fact it oscillates between $\frac{F_0 t}{2m\omega}$ and $\frac{-F_0 t}{2m\omega}$. The first two terms
only oscillate between $\pm\sqrt{C_1^2 + C_2^2}$, which becomes smaller and smaller in proportion to
the oscillations of the last term as $t$ gets larger. In Figure 2.6 we see the
graph with $C_1 = C_2 = 0$, $F_0 = 2$, $m = 1$, $\omega = \pi$.
K. Billah and R. Scanlan, Resonance, Tacoma Narrows Bridge Failure, and Undergraduate Physics Textbooks,
American Journal of Physics, 59(2), 1991, 118–124, http://www.ketchum.org/billah/Billah-Scanlan.pdf
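Let us find a particular solution. There can be no conflicts when trying to solve for the
undetermined coefficients by trying $x_p = A \cos(\omega t) + B \sin(\omega t)$. Let us plug in and solve
for $A$ and $B$. We get (the tedious details are left to the reader)
\[ \bigl( (\omega_0^2 - \omega^2)B - 2\omega p A \bigr) \sin(\omega t) + \bigl( (\omega_0^2 - \omega^2)A + 2\omega p B \bigr) \cos(\omega t) = \frac{F_0}{m} \cos(\omega t). \]
We solve for $A$ and $B$:
\[ A = \frac{(\omega_0^2 - \omega^2) F_0}{m(2\omega p)^2 + m(\omega_0^2 - \omega^2)^2}, \]
\[ B = \frac{2\omega p F_0}{m(2\omega p)^2 + m(\omega_0^2 - \omega^2)^2}. \]
We also compute $C = \sqrt{A^2 + B^2}$ to be
\[ C = \frac{F_0}{m \sqrt{(2\omega p)^2 + (\omega_0^2 - \omega^2)^2}}. \]
Thus our particular solution is
\[ x_p = \frac{(\omega_0^2 - \omega^2) F_0}{m(2\omega p)^2 + m(\omega_0^2 - \omega^2)^2} \cos(\omega t) + \frac{2\omega p F_0}{m(2\omega p)^2 + m(\omega_0^2 - \omega^2)^2} \sin(\omega t). \]
Or in the alternative notation we have amplitude $C$ and phase shift $\gamma$ where (if $\omega \neq \omega_0$)
\[ \tan \gamma = \frac{B}{A} = \frac{2\omega p}{\omega_0^2 - \omega^2}. \]
Hence,
\[ x_p = \frac{F_0}{m \sqrt{(2\omega p)^2 + (\omega_0^2 - \omega^2)^2}} \cos(\omega t - \gamma). \]
If $\omega = \omega_0$, then $A = 0$, $B = C = \frac{F_0}{2m\omega p}$, and $\gamma = \pi/2$.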
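The formulas for $A$, $B$, $C$, and $\gamma$ translate directly into code. The function below is a sketch with a hypothetical name, `steady_periodic`, assuming $m > 0$, $c > 0$, $k > 0$; using `atan2` to select $\gamma$ is our own choice, made so that the case $\omega = \omega_0$ (where $A = 0$) needs no special handling.

```python
import math


def steady_periodic(m, c, k, F0, omega):
    """Return (A, B, C, gamma) for the particular solution of
    m x'' + c x' + k x = F0 cos(omega t), where
    x_p = A cos(omega t) + B sin(omega t) = C cos(omega t - gamma)."""
    p = c / (2 * m)
    omega0_sq = k / m
    denom = m * ((2 * omega * p) ** 2 + (omega0_sq - omega**2) ** 2)
    A = (omega0_sq - omega**2) * F0 / denom
    B = 2 * omega * p * F0 / denom
    C = math.hypot(A, B)
    gamma = math.atan2(B, A)
    return A, B, C, gamma


# Forcing exactly at omega = omega0 = 1 with m = 1, c = 0.4, k = 1, F0 = 1.
print(steady_periodic(1.0, 0.4, 1.0, 1.0, 1.0))
```

In the printed case $A = 0$, and $B$ and $C$ both equal $F_0/(2m\omega p) = 2.5$ up to rounding, with $\gamma = \pi/2$.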
For reasons we will explain in a moment, we call $x_c$ the transient solution and denote
it by $x_{tr}$. We call the $x_p$ from above the steady periodic solution and denote it by $x_{sp}$. The
general solution is
\[ x = x_c + x_p = x_{tr} + x_{sp}. \]
The transient solution $x_c = x_{tr}$ goes to zero as $t \to \infty$, as all the terms involve an
exponential with a negative exponent. So for large $t$, the effect of $x_{tr}$ is negligible and we
see essentially only $x_{sp}$. Hence the name transient. Notice that $x_{sp}$ involves no arbitrary
constants, and the initial conditions only affect $x_{tr}$. Thus, the effect of the initial conditions
is negligible after some period of time. We might as well focus on the steady periodic
solution and ignore the transient solution. See Figure 2.7 for a graph
given several different initial conditions.
Figure 2.8: Graph of $C(\omega)$ showing practical resonance with parameters $k = 1$, $m = 1$, $F_0 = 1$. The top
line is with $c = 0.4$, the middle line with $c = 0.8$, and the bottom line with $c = 1.6$.
To find the maximum we need to find the derivative $C'(\omega)$. Computation shows
\[ C'(\omega) = \frac{-2\omega(2p^2 + \omega^2 - \omega_0^2) F_0}{m \bigl( (2\omega p)^2 + (\omega_0^2 - \omega^2)^2 \bigr)^{3/2}}. \]
This is zero either when $\omega = 0$ or when $2p^2 + \omega^2 - \omega_0^2 = 0$. In other words, $C'(\omega) = 0$ when
\[ \omega = \sqrt{\omega_0^2 - 2p^2} \quad \text{or} \quad \omega = 0. \]
If $\omega_0^2 - 2p^2$ is positive, then $\sqrt{\omega_0^2 - 2p^2}$ is the practical resonance frequency (that is the
point where $C(\omega)$ is maximal). This follows by the first derivative test for example as then
$C'(\omega) > 0$ for small $\omega$ in this case. If on the other hand $\omega_0^2 - 2p^2$ is not positive, then $C(\omega)$
achieves its maximum at $\omega = 0$, and there is no practical resonance since we assume $\omega > 0$
in our system. In this case the amplitude gets larger as the forcing frequency gets smaller.
If practical resonance occurs, the frequency is smaller than $\omega_0$. As the damping $c$ (and
hence $p$) becomes smaller, the practical resonance frequency goes to $\omega_0$. So when damping
is very small, $\omega_0$ is a good estimate of the practical resonance frequency. This behavior
agrees with the observation that when $c = 0$, then $\omega_0$ is the resonance frequency.
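To connect the formula with Figure 2.8, the following sketch computes the practical resonance frequency $\sqrt{\omega_0^2 - 2p^2}$ for the three damping values used there ($k = 1$, $m = 1$, $F_0 = 1$, and $c = 0.4$, $0.8$, $1.6$). The function names are hypothetical, and returning `None` when there is no practical resonance is simply a convention chosen for this example.

```python
import math


def amplitude(omega, m, c, k, F0):
    """Amplitude C(omega) of the steady periodic solution."""
    p = c / (2 * m)
    omega0_sq = k / m
    return F0 / (m * math.sqrt((2 * omega * p) ** 2 + (omega0_sq - omega**2) ** 2))


def practical_resonance(m, c, k):
    """Return sqrt(omega0^2 - 2 p^2) if it exists, otherwise None."""
    p = c / (2 * m)
    value = k / m - 2 * p * p
    return math.sqrt(value) if value > 0 else None


for c in (0.4, 0.8, 1.6):
    omega_r = practical_resonance(1.0, c, 1.0)
    peak = None if omega_r is None else amplitude(omega_r, 1.0, c, 1.0, 1.0)
    print(c, omega_r, peak)
```

For $c = 1.6$ we get $\omega_0^2 - 2p^2 = 1 - 1.28 < 0$, so there is no practical resonance, which matches the bottom curve in the figure having no interior peak.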
Another interesting observation to make is that when $\omega \to \infty$, then $C \to 0$. This means
that if the forcing frequency gets too high it does not manage to get the mass moving in
the mass-spring system. This is quite reasonable intuitively. If we wiggle back and forth
really fast while sitting on a swing, we will not get it moving at all, no matter how forceful.
Fast vibrations just cancel each other out before the mass has any chance of responding by
moving one way or the other.
The behavior is more complicated if the forcing function is not an exact cosine wave,
but for example a square wave. A general periodic function will be the sum (superposition)
of many cosine waves of different frequencies. The reader is encouraged to come back to
this section once we have learned about the Fourier series.
2.6.3 Exercises
Exercise 2.6.1: Derive a formula for $x_{sp}$ if the equation is $mx'' + cx' + kx = F_0 \sin(\omega t)$. Assume
$c > 0$.
Exercise 2.6.3: Take $mx'' + cx' + kx = F_0 \cos(\omega t)$. Fix $m > 0$, $k > 0$, and $F_0 > 0$. Consider
the function $C(\omega)$. For what values of $c$ (solve in terms of $m$, $k$, and $F_0$) will there be no practical
resonance (that is, for what values of $c$ is there no maximum of $C(\omega)$ for $\omega > 0$)?
Exercise 2.6.4: Take $mx'' + cx' + kx = F_0 \cos(\omega t)$. Fix $c > 0$, $k > 0$, and $F_0 > 0$. Consider the
function $C(\omega)$. For what values of $m$ (solve in terms of $c$, $k$, and $F_0$) will there be no practical
resonance (that is, for what values of $m$ is there no maximum of $C(\omega)$ for $\omega > 0$)?
Exercise 2.6.5: A water tower in an earthquake acts as a mass-spring system. Assume that the
container on top is full and the water does not move around. The container then acts as the mass
and the support acts as the spring, where the induced vibrations are horizontal. The container with
water has a mass of $m = 10{,}000$ kg. It takes a force of ���� newtons to displace the container �
meter. For simplicity assume no friction. When the earthquake hits the water tower is at rest (it is
not moving). The earthquake induces an external force $F(t) = mA\omega^2 \cos(\omega t)$.
b) If $\omega$ is not the natural frequency, find a formula for the maximal amplitude of the resulting
oscillations of the water container (the maximal deviation from the rest position). The motion
will be a high frequency wave modulated by a low frequency wave, so simply find the constant
in front of the sines.
c) Suppose $A = 1$ and an earthquake with frequency �.� cycles per second comes. What is the
amplitude of the oscillations? Suppose that if the water tower moves more than �.� meter
from the rest position, the tower collapses. Will the tower collapse?
Exercise 2.6.101: A mass of � kg on a spring with $k = 4$ N/m and a damping constant $c = 1$ Ns/m.
Suppose that $F_0 = 2$ N. Using forcing function $F_0 \cos(\omega t)$, find the $\omega$ that causes practical
resonance and find the amplitude.
Exercise 2.6.103: Suppose there is no damping in a mass and spring system with $m = 5$, $k = 20$,
and $F_0 = 5$. Suppose $\omega$ is chosen to be precisely the resonance frequency.
a) Find $\omega$.
b) Find the amplitude of the oscillations at time $t = 100$, given the system is at rest at $t = 0$.
Chapter 3
Systems of ODEs
3.1.1 Systems
Often we do not have just one dependent variable and one equation. And as we will see,
we may end up with systems of several equations and several dependent variables even if
we start with a single equation.
If we have several dependent variables, suppose $y_1, y_2, \ldots, y_n$, then we can have
a differential equation involving all of them and their derivatives with respect to one
independent variable $x$. For example, $y_1'' = f(y_1', y_2', y_1, y_2, x)$. Usually, when we have two
dependent variables we have two equations such as
for some functions $f_1$ and $f_2$. We call the above a system of differential equations. More
precisely, the above is a second order system of ODEs as second order derivatives appear.
The system
is a first order system, where $x_1$, $x_2$, $x_3$ are the dependent variables, and $t$ is the independent
variable.
The terminology for systems is essentially the same as for single equations. For example,
for the above system a solution is a set of three functions $x_1(t)$, $x_2(t)$, $x_3(t)$, such that
We usually also have an initial condition. Just like for single equations we specify $x_1$, $x_2$,
and $x_3$ for some fixed $t$. For example, $x_1(0) = a_1$, $x_2(0) = a_2$, $x_3(0) = a_3$, for some constants
$a_1$, $a_2$, and $a_3$. For the second order system we would also specify the first derivatives at a
point. And if we find a solution with constants in it, where by solving for the constants we
find a solution for any initial condition, we call this solution the general solution. Best to
look at a simple example.
Example 3.1.1: Sometimes a system is easy to solve by solving for one variable and then
for the second variable. Take the first order system
\[ y_1' = y_1, \]
\[ y_2' = y_1 - y_2, \]
with $y_1$, $y_2$ as the dependent variables and $x$ as the independent variable. And consider
initial conditions $y_1(0) = 1$, $y_2(0) = 2$.
We note that $y_1 = C_1 e^x$ is the general solution of the first equation. We then plug this
$y_1$ into the second equation and get the equation $y_2' = C_1 e^x - y_2$, which is a linear first
order equation that is easily solved for $y_2$. By the method of integrating factor we get
\[ e^x y_2 = \frac{C_1}{2} e^{2x} + C_2, \]
or $y_2 = \frac{C_1}{2} e^x + C_2 e^{-x}$. The general solution to the system is, therefore,
\[ y_1 = C_1 e^x, \qquad y_2 = \frac{C_1}{2} e^x + C_2 e^{-x}. \]
We solve for $C_1$ and $C_2$ given the initial conditions. We substitute $x = 0$ and find that
$C_1 = 1$ and $C_2 = 3/2$. Thus the solution is $y_1 = e^x$, and $y_2 = (1/2)e^x + (3/2)e^{-x}$.
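One way to gain confidence in such a solution is to integrate the system numerically and compare. The sketch below uses the classical fourth-order Runge-Kutta method, which is not part of this section and is brought in here only as a convenient black box; the function names and the step count are our own choices.

```python
import math


def rhs(x, y1, y2):
    """Right-hand side of the system y1' = y1, y2' = y1 - y2."""
    return y1, y1 - y2


def rk4(x_end, steps, y1=1.0, y2=2.0):
    """Integrate from x = 0 to x_end, starting at y1(0) = 1, y2(0) = 2."""
    h = x_end / steps
    x = 0.0
    for _ in range(steps):
        k1 = rhs(x, y1, y2)
        k2 = rhs(x + h / 2, y1 + h / 2 * k1[0], y2 + h / 2 * k1[1])
        k3 = rhs(x + h / 2, y1 + h / 2 * k2[0], y2 + h / 2 * k2[1])
        k4 = rhs(x + h, y1 + h * k3[0], y2 + h * k3[1])
        y1 += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        y2 += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        x += h
    return y1, y2


def exact(x):
    """Closed-form solution y1 = e^x, y2 = (1/2) e^x + (3/2) e^{-x}."""
    return math.exp(x), 0.5 * math.exp(x) + 1.5 * math.exp(-x)


numeric = rk4(1.0, 100)
closed_form = exact(1.0)
print(numeric)
print(closed_form)
```

The numerical solver treats both equations at once, which is the point of view we will need for systems that cannot be solved one variable at a time.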
Generally, we will not be so lucky to be able to solve for each variable separately as in the example above, and we will have to solve for all variables at once. Still, we will try to salvage as much as possible from this technique. It will turn out that in a certain sense we will still (try to) solve a bunch of single equations and put their solutions together. Let us not worry right now about how to solve systems.
We will mostly consider linear systems. The example above is an example of a linear first order system. It is linear as none of the dependent variables or their derivatives appear in nonlinear functions or with powers higher than one ($x$, $y$, $x'$ and $y'$, constants, and functions of $t$ can appear, but not $xy$ or $(y')^2$ or $x^3$). Another, more complicated, example of a linear system is
3.1.2 Applications
Let us consider some simple applications of systems and how to set up the equations.
Example 3.1.2: First, we consider salt and brine tanks, but this time water flows from one
to the other and back. We again consider that the tanks are evenly mixed.
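[Figure 3.1: Two brine tanks, each of volume $V$, holding $x_1$ and $x_2$ grams of salt, with flow at rate $r$ from each tank into the other.]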
Suppose we have two tanks, each containing volume V liters of salt brine. The amount
of salt in the first tank is x1 grams, and the amount of salt in the second tank is x2 grams.
The liquid is perfectly mixed and flows at the rate r liters per second out of each tank into
the other. See Figure 3.1.
The rate of change of $x_1$, that is $x_1'$, is the rate of salt coming in minus the rate going out. The rate coming in is the density of the salt in tank 2, that is $\frac{x_2}{V}$, times the rate $r$. The rate coming out is the density of the salt in tank 1, that is $\frac{x_1}{V}$, times the rate $r$. In other words it is
\[
x_1' = \frac{x_2}{V} r - \frac{x_1}{V} r = \frac{r}{V} x_2 - \frac{r}{V} x_1 = \frac{r}{V} (x_2 - x_1).
\]
Similarly we find the rate $x_2'$, where the roles of $x_1$ and $x_2$ are reversed. All in all, the system of ODEs for this problem is
\[
x_1' = \frac{r}{V} (x_2 - x_1), \qquad
x_2' = \frac{r}{V} (x_1 - x_2).
\]
In this system we cannot solve for $x_1$ or $x_2$ separately. We must solve for both $x_1$ and $x_2$ at once, which is intuitively clear since the amount of salt in one tank affects the amount in the other. We can't know $x_1$ before we know $x_2$, and vice versa.
We don't yet know how to find all the solutions, but intuitively we can at least find some solutions. Suppose we know that initially the tanks have the same amount of salt. That is, we have an initial condition such as $x_1(0) = x_2(0) = C$. Then clearly the amount of salt coming in and out of each tank is the same, so the amounts are not changing. In other words, $x_1 = C$ and $x_2 = C$ (the constant functions) is a solution: $x_1' = x_2' = 0$, and $x_2 - x_1 = x_1 - x_2 = 0$, so the equations are satisfied.
Let us think about the setup a little bit more without solving it. Suppose the initial conditions are $x_1(0) = A$ and $x_2(0) = B$, for two different constants $A$ and $B$. Since no salt is coming in or out of this closed system, the total amount of salt is constant. That is, $x_1 + x_2$ is constant, and so it equals $A + B$. Intuitively if $A$ is bigger than $B$, then more salt will flow out of tank one than into it. Eventually, after a long time we would then expect the amount of salt in each tank to equalize. In other words, the solutions of both $x_1$ and $x_2$ should tend towards $\frac{A+B}{2}$. Once you know how to solve systems you will find out that this really is so.
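The intuition about the two tanks can be tried out numerically before we know how to solve the system. The following is a minimal sketch using Euler's method; the function name `simulate_tanks` and all the numbers ($A = 10$, $B = 2$, $r = 1$, $V = 5$, the time span and step count) are made-up values for illustration only.

```python
def simulate_tanks(A, B, r, V, t_end, steps):
    # Euler's method for x1' = (r/V)(x2 - x1), x2' = (r/V)(x1 - x2).
    h = t_end / steps
    x1, x2 = float(A), float(B)
    for _ in range(steps):
        dx1 = (r / V) * (x2 - x1)
        dx2 = (r / V) * (x1 - x2)
        # Update both amounts at once using the old values.
        x1, x2 = x1 + h * dx1, x2 + h * dx2
    return x1, x2

# Hypothetical data: 10 g and 2 g of salt, r = 1 L/s, V = 5 L, 60 seconds.
x1, x2 = simulate_tanks(A=10.0, B=2.0, r=1.0, V=5.0, t_end=60.0, steps=6000)
print(x1, x2, x1 + x2)
```

With these numbers both amounts should end up very close to $(A+B)/2 = 6$, while the total stays at $A + B = 12$ up to rounding.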
Example 3.1.3: Let us look at a second order example. We return to the mass and spring setup, but this time we consider two masses.
[Figure: two carts of masses $m_1$ and $m_2$ joined by a spring with constant $k$, with displacements $x_1$ and $x_2$.]
Consider one spring with constant $k$ and two masses $m_1$ and $m_2$. Think of the masses as carts that ride along a straight track with no friction. Let $x_1$ be the displacement of the first cart and $x_2$ be the displacement of the second cart. That is, we put the two carts somewhere with no tension on the spring, and we mark the position of the first and second cart and call those the zero positions. Then $x_1$ measures how far the first cart is from its zero position, and $x_2$ measures how far the second cart is from its zero position. The force exerted by the spring on the first cart is $k(x_2 - x_1)$, since $x_2 - x_1$ is how far the spring is stretched (or compressed) from the rest position. The force exerted on the second cart is the opposite, thus the same thing with a negative sign. Newton's second law states that force equals mass times acceleration. So the system of equations is
\[
m_1 x_1'' = k(x_2 - x_1), \qquad
m_2 x_2'' = -k(x_2 - x_1).
\]
Again, we cannot solve for the $x_1$ or $x_2$ variable separately. That we must solve for both $x_1$ and $x_2$ at once is intuitively clear, since where the first cart goes depends on exactly where the second cart goes and vice versa.
\[
y^{(n)} = F(y^{(n-1)}, \ldots, y', y, x).
\]
Define new variables $u_1 = y$, $u_2 = y'$, \ldots, $u_n = y^{(n-1)}$. They satisfy the first order system
\[
\begin{aligned}
u_1' &= u_2, \\
u_2' &= u_3, \\
&\;\;\vdots \\
u_{n-1}' &= u_n, \\
u_n' &= F(u_n, u_{n-1}, \ldots, u_2, u_1, x).
\end{aligned}
\]
We solve this system for $u_1, u_2, \ldots, u_n$. Once we have solved for the $u$'s, we can discard $u_2$ through $u_n$ and let $y = u_1$. This $y$ solves the original equation.
Example 3.1.4: Take $x''' = 2x'' + 8x' + x + t$. Letting $u_1 = x$, $u_2 = x'$, $u_3 = x''$, we find the system:
\[
u_1' = u_2, \qquad u_2' = u_3, \qquad u_3' = 2u_3 + 8u_2 + u_1 + t.
\]
A similar process can be followed for a system of higher order differential equations. For example, a system of $k$ differential equations in $k$ unknowns, all of order $n$, can be transformed into a first order system of $n \times k$ equations and $n \times k$ unknowns.
Example 3.1.5: Consider the system from the carts example,
\[
m_1 x_1'' = k(x_2 - x_1), \qquad m_2 x_2'' = -k(x_2 - x_1).
\]
Let $u_1 = x_1$, $u_2 = x_1'$, $u_3 = x_2$, $u_4 = x_2'$. The second order system becomes the first order system
\[
u_1' = u_2, \qquad m_1 u_2' = k(u_3 - u_1), \qquad u_3' = u_4, \qquad m_2 u_4' = -k(u_3 - u_1).
\]
Example 3.1.6: The idea works in reverse as well. Consider the system
\[
x' = 2y - x, \qquad y' = x,
\]
where the independent variable is $t$. We wish to solve for the initial conditions $x(0) = 1$, $y(0) = 0$.
If we differentiate the second equation, we get $y'' = x'$. We know what $x'$ is in terms of $x$ and $y$, and we know that $x = y'$. So,
\[
y'' = x' = 2y - x = 2y - y'.
\]
We now have the equation $y'' + y' - 2y = 0$. We know how to solve this equation and we find that $y = C_1 e^{-2t} + C_2 e^{t}$. Once we have $y$, we use the equation $y' = x$ to get $x$:
\[
x = y' = -2 C_1 e^{-2t} + C_2 e^{t}.
\]
Exercise 3.1.1: Plug in and check that this really is the solution.
It is useful to go back and forth between systems and higher order equations for other
reasons. For example, software for solving ODE numerically (approximation) is generally
for first order systems. So to use it, you have to take whatever ODE you want to solve and
convert it to a first order system. In fact, it is not very hard to adapt computer code for
the Euler or Runge–Kutta method for first order equations to handle first order systems.
We essentially just treat the dependent variable not as a number but as a vector. In many
mathematical computer languages there is almost no distinction in syntax.
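To illustrate treating the dependent variable as a vector, here is a minimal sketch of Euler's method for a first order system, applied to $x' = 2y - x$, $y' = x$ with $x(0) = 1$, $y(0) = 0$. The function names and the step count are illustrative assumptions. The exact values used for comparison come from the general solution above with $C_1 = -1/3$ and $C_2 = 1/3$, which is what the two initial conditions give ($-2C_1 + C_2 = 1$ and $C_1 + C_2 = 0$).

```python
import math

def euler_system(f, t0, x0, t_end, steps):
    # Euler's method where the state x is a list (a vector) instead of a number.
    h = (t_end - t0) / steps
    t, x = t0, list(x0)
    for _ in range(steps):
        dx = f(t, x)
        x = [xi + h * dxi for xi, dxi in zip(x, dx)]
        t += h
    return x

def rhs(t, u):
    # Right-hand side of x' = 2y - x, y' = x, with u = [x, y].
    x, y = u
    return [2 * y - x, x]

approx = euler_system(rhs, 0.0, [1.0, 0.0], 2.0, 20000)

# Exact solution at t = 2 with C1 = -1/3, C2 = 1/3.
exact = [(2 * math.exp(-4.0) + math.exp(2.0)) / 3,
         (math.exp(2.0) - math.exp(-4.0)) / 3]
print(approx, exact)
```

The only change from the scalar method is that the update is done entry by entry; a Runge-Kutta method can be adapted in the same way.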
[Figure 3.2: The direction field for $x' = 2y - x$, $y' = x$.]
[Figure 3.3: The direction field for $x' = 2y - x$, $y' = x$ with the trajectory of the solution starting at $(1, 0)$ for $0 \leq t \leq 2$.]
Theorem 3.1.1 (Picard's theorem on existence and uniqueness for systems). If for every $j = 1, 2, \ldots, n$ and every $k = 1, 2, \ldots, n$ each $F_j$ is continuous and the derivative $\frac{\partial F_j}{\partial x_k}$ exists and is continuous near some $(x_1^0, x_2^0, \ldots, x_n^0, t^0)$, then a solution to (3.1) subject to the initial condition $x_1(t^0) = x_1^0$, $x_2(t^0) = x_2^0$, \ldots, $x_n(t^0) = x_n^0$ exists (at least for some small interval of $t$'s) and is unique.
That is, a unique solution exists for any initial condition given that the system is
reasonable (F j and its partial derivatives in the x variables are continuous). As for single
equations we may not have a solution for all time t, but at least for some short period of
time.
As we can change any $n$th order ODE into a first order system, this theorem also provides the existence and uniqueness of solutions for higher order equations, which we have until now not stated explicitly.
3.1.6 Exercises
Exercise 3.1.2: Find the general solution of $x_1' = x_2 - x_1 + t$, $x_2' = x_2$.
Exercise 3.1.6: Suppose two masses on carts on a frictionless surface are at displacements $x_1$ and $x_2$ as in Example 3.1.3. Suppose that a rocket applies force $F$ in the positive direction on cart $x_1$. Set up the system of equations.
Exercise 3.1.7: Suppose the tanks are as in Example 3.1.2, starting both at volume $V$, but now the rate of flow from tank 1 to tank 2 is $r_1$, and the rate of flow from tank 2 to tank 1 is $r_2$. In particular, the volumes will now be changing. Set up the system of equations.
Exercise 3.1.101: Find the general solution to $y_1' = 3y_1$, $y_2' = y_1 + y_2$, $y_3' = y_1 + y_3$.
Exercise 3.1.105: Suppose two masses on carts on a frictionless surface are at displacements $x_1$ and $x_2$ as in Example 3.1.3. Suppose the initial displacement is $x_1(0) = x_2(0) = 0$, and the initial velocity is $x_1'(0) = x_2'(0) = a$ for some number $a$. Use your intuition to solve the system, and explain your reasoning.
Exercise 3.1.106: Suppose the tanks are as in Example �.�.� on page ��� except that clean water
flows in at the rate s liters per second into tank �, and brine flows out of tank � and into the sewer
also at the rate of s liters per second.
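Matrix addition is also easy. We add matrices element by element. For example,
\[
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}
+
\begin{bmatrix} 1 & 1 & -1 \\ 0 & 2 & 4 \end{bmatrix}
=
\begin{bmatrix} 2 & 3 & 2 \\ 4 & 7 & 10 \end{bmatrix}.
\]
\[
\begin{aligned}
A + 0 &= A = 0 + A, \\
A + B &= B + A, \\
(A + B) + C &= A + (B + C), \\
c(A + B) &= cA + cB, \\
(c + d)A &= cA + dA.
\end{aligned}
\]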
Another useful operation for matrices is the so-called transpose. This operation just swaps rows and columns of a matrix. The transpose of $A$ is denoted by $A^T$. Example:
\[
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}^T
=
\begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}.
\]
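\[
\begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}
\cdot
\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}
= a_1 b_1 + a_2 b_2 + a_3 b_3 .
\]
\[
\begin{aligned}
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}
\begin{bmatrix} 1 & 0 & -1 \\ 1 & 1 & 1 \\ 1 & 0 & 0 \end{bmatrix}
&=
\begin{bmatrix}
1\cdot 1 + 2\cdot 1 + 3\cdot 1 & 1\cdot 0 + 2\cdot 1 + 3\cdot 0 & 1\cdot(-1) + 2\cdot 1 + 3\cdot 0 \\
4\cdot 1 + 5\cdot 1 + 6\cdot 1 & 4\cdot 0 + 5\cdot 1 + 6\cdot 0 & 4\cdot(-1) + 5\cdot 1 + 6\cdot 0
\end{bmatrix} \\
&=
\begin{bmatrix} 6 & 2 & 1 \\ 15 & 5 & 1 \end{bmatrix}.
\end{aligned}
\]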
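The row-times-column rule is easy to spell out in code. The following sketch, with an illustrative function name `matmul` and matrices stored as lists of rows, reproduces the product computed above.

```python
def matmul(A, B):
    # Entry (i, j) of AB is the dot product of row i of A with column j of B.
    inner = len(B)
    if any(len(row) != inner for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    cols = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(len(A))]

A = [[1, 2, 3],
     [4, 5, 6]]
B = [[1, 0, -1],
     [1, 1, 1],
     [1, 0, 0]]
product = matmul(A, B)
print(product)
```

Note that `matmul(B, A)` raises an error here, since the sizes do not match in that order.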
We have the following rules for matrix multiplication. Suppose that $A$, $B$, $C$ are matrices of the correct sizes so that the following make sense. Let $\alpha$ denote a scalar (number).
\[
\begin{aligned}
A(BC) &= (AB)C, \\
A(B + C) &= AB + AC, \\
(B + C)A &= BA + CA, \\
\alpha(AB) &= (\alpha A)B = A(\alpha B), \\
IA &= A = AI.
\end{aligned}
\]
A few warnings are in order.
(i) $AB \neq BA$ in general (it may be true by fluke sometimes). That is, matrices do not commute. For example, take $A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}$.
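Then $\det(A) = 1 + 1 = 2$. Let us see where the (unit) square with vertices $(0, 0)$, $(1, 0)$, $(0, 1)$, and $(1, 1)$ gets sent. Clearly $(0, 0)$ gets sent to $(0, 0)$.
\[
\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \end{bmatrix}, \qquad
\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad
\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \end{bmatrix}.
\]
The image of the square is another square with vertices $(0, 0)$, $(1, -1)$, $(1, 1)$, and $(2, 0)$. The image square has a side of length $\sqrt{2}$ and is therefore of area 2.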
If you think back to high school geometry, you may have seen a formula for computing the area of a parallelogram with vertices $(0, 0)$, $(a, c)$, $(b, d)$ and $(a + b, c + d)$. And it is precisely
\[
\left\lvert \, \det \left( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \right) \right\rvert .
\]
The vertical lines above mean absolute value. The matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ carries the unit square to the given parallelogram.
Let us look at the determinant for larger matrices. We define $A_{ij}$ as the matrix $A$ with the $i$th row and the $j$th column deleted. To compute the determinant of a matrix, pick one row, say the $i$th row, and compute:
\[
\det(A) = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \det(A_{ij}).
\]
The numbers $(-1)^{i+j} \det(A_{ij})$ are called cofactors of the matrix and this way of computing the determinant is called the cofactor expansion. No matter which row you pick, you always get the same number. It is also possible to compute the determinant by expanding along columns (picking a column instead of a row above). It is true that $\det(A) = \det(A^T)$.
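The cofactor expansion translates directly into a short recursive function. This is a minimal sketch with illustrative names (`minor`, `det`), using 0-based indices, which leaves the sign $(-1)^{i+j}$ unchanged. It is fine for small matrices but far too slow for large ones, where row reduction is the usual tool.

```python
def minor(A, i, j):
    # The matrix A with row i and column j deleted (0-based indices).
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A, row=0):
    # Cofactor expansion along the given row; sub-determinants use the first row.
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** (row + j) * A[row][j] * det(minor(A, row, j))
               for j in range(n))

square_map = [[1, 1],
              [-1, 1]]
example = [[2, 1, 1],
           [1, 2, 0],
           [0, 0, 2]]
print(det(square_map))
print([det(example, row=r) for r in range(3)])
```

Expanding `example` along each of its three rows should give the same number every time, as claimed above.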
A common notation for the determinant is a pair of vertical lines:
\[
\begin{vmatrix} a & b \\ c & d \end{vmatrix}
= \det \left( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \right).
\]
I personally find this notation confusing as vertical lines usually mean a positive quantity, while determinants can be negative. Also think about how to write the absolute value of a determinant. I will not use this notation in this book.
Think of the determinants telling you the scaling of a mapping. If $B$ doubles the sizes of geometric objects and $A$ triples them, then $AB$ (which applies $B$ to an object and then $A$) should make size go up by a factor of 6. This is true in general:
\[
\det(AB) = \det(A) \det(B).
\]
This property is one of the most useful, and it is employed often to actually compute determinants. A particularly interesting consequence is to note what it means for existence of inverses. Take $A$ and $B$ to be inverses of each other, that is $AB = I$. Then
\[
\det(A) \det(B) = \det(AB) = \det(I) = 1.
\]
Neither $\det(A)$ nor $\det(B)$ can be zero. Let us state this as a theorem as it will be very important in the context of this course.
Theorem 3.2.1. An $n \times n$ matrix $A$ is invertible if and only if $\det(A) \neq 0$.
In fact, $\det(A^{-1}) \det(A) = 1$ says that $\det(A^{-1}) = \frac{1}{\det(A)}$. So we even know what the determinant of $A^{-1}$ is before we know how to compute $A^{-1}$.
Notice the determinant of the matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ in the denominator of the fraction. The formula only works if the determinant is nonzero, otherwise we are dividing by zero.
To solve the system we put the coefficient matrix (the matrix on the left-hand side of the equation) together with the vector on the right-hand side and get the so-called augmented matrix
\[
\left[ \begin{array}{ccc|c}
2 & 2 & 2 & 2 \\
1 & 1 & 3 & 5 \\
1 & 4 & 1 & 10
\end{array} \right].
\]
We apply the following three elementary operations.
(i) Swap two rows.
Exercise 3.2.1: Check that the solution above really solves the given equations.
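We write this equation in matrix notation as
\[
A \vec{x} = \vec{b},
\]
where $A$ is the matrix $\begin{bmatrix} 2 & 2 & 2 \\ 1 & 1 & 3 \\ 1 & 4 & 1 \end{bmatrix}$ and $\vec{b}$ is the vector $\begin{bmatrix} 2 \\ 5 \\ 10 \end{bmatrix}$. The solution can also be computed via the inverse,
\[
\vec{x} = A^{-1} A \vec{x} = A^{-1} \vec{b}.
\]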
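Row reduction of an augmented matrix can be sketched in a few lines of Python. The function names `rref` and `is_inconsistent` are illustrative, and exact fractions are used so that no rounding occurs. The sketch uses the usual elementary operations (swapping rows, scaling a row by a nonzero number, and adding a multiple of one row to another) and is applied to the augmented matrix of the system above.

```python
from fractions import Fraction

def rref(aug):
    # Row reduce an augmented matrix [A | b]; the last column is the right-hand side.
    M = [[Fraction(x) for x in row] for row in aug]
    rows, cols = len(M), len(M[0])
    pivot_row = 0
    for col in range(cols - 1):
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, rows) if M[r][col] != 0), None)
        if pivot is None:
            continue  # no leading entry here: this column gives a free variable
        M[pivot_row], M[pivot] = M[pivot], M[pivot_row]      # swap two rows
        lead = M[pivot_row][col]
        M[pivot_row] = [x / lead for x in M[pivot_row]]      # scale the row
        for r in range(rows):
            if r != pivot_row and M[r][col] != 0:
                factor = M[r][col]
                # add a multiple of the pivot row to row r
                M[r] = [a - factor * b for a, b in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return M

def is_inconsistent(M):
    # A row [0 ... 0 | c] with c nonzero means the system has no solution.
    return any(all(x == 0 for x in row[:-1]) and row[-1] != 0 for row in M)

aug = [[2, 2, 2, 2],
       [1, 1, 3, 5],
       [1, 4, 1, 10]]
reduced = rref(aug)
# Reading off the last column is valid here because every variable has a leading entry.
solution = [row[-1] for row in reduced]
print(solution)
```

For this system the reduced matrix has a leading entry in every column of $A$, so the last column is the unique solution.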
It is possible that the solution is not unique, or that no solution exists. It is easy to tell if
a solution does not exist. If during the row reduction you come up with a row where all the
entries except the last one are zero (the last entry in a row corresponds to the right-hand
side of the equation), then the system is inconsistent and has no solution. For example, for
a system of 3 equations and 3 unknowns, if you find a row such as $[\, 0 \;\; 0 \;\; 0 \;|\; 1 \,]$ in the augmented matrix, you know the system is inconsistent. That row corresponds to $0 = 1$.
You generally try to use row operations until the following conditions are satisfied. The
first (from the left) nonzero entry in each row is called the leading entry.
(i) The leading entry in any row is strictly to the right of the leading entry of the row
above.
(ii) Any zero rows are below all the nonzero rows.
(iv) All the entries above and below a leading entry are zero.
Such a matrix is said to be in reduced row echelon form. The variables corresponding to
columns with no leading entries are said to be free variables. Free variables mean that we can
pick those variables to be anything we want and then solve for the rest of the unknowns.
Example 3.2.1: The following augmented matrix is in reduced row echelon form.
\[
\left[ \begin{array}{ccc|c}
1 & 2 & 0 & 3 \\
0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0
\end{array} \right]
\]
Suppose the variables are $x_1$, $x_2$, and $x_3$. Then $x_2$ is the free variable, $x_1 = 3 - 2x_2$, and $x_3 = 1$.
On the other hand if during the row reduction process you come up with the matrix
\[
\left[ \begin{array}{ccc|c}
1 & 2 & 13 & 3 \\
0 & 0 & 1 & 1 \\
0 & 0 & 0 & 3
\end{array} \right],
\]
there is no need to go further. The last row corresponds to the equation $0x_1 + 0x_2 + 0x_3 = 3$, which is preposterous. Hence, no solution exists.
3.2.6 Exercises
Exercise 3.2.2: Solve $\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \vec{x} = \begin{bmatrix} 5 \\ 6 \end{bmatrix}$ by using matrix inverse.
h 9 2 6
i
Exercise 3.2.3: Compute determinant of 8 3 6 .
10 2 6
1 2 3 1
Exercise 3.2.4: Compute determinant of 40 5 0 . Hint� Expand along the proper row or column
60 7 0
8 0 10 1
to make the calculations simpler.
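Exercise 3.2.5: Compute inverse of $\begin{bmatrix} 1 & 2 & 3 \\ 1 & 1 & 1 \\ 0 & 1 & 0 \end{bmatrix}$.
Exercise 3.2.6: For which $h$ is $\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & h \end{bmatrix}$ not invertible? Is there only one such $h$? Are there several? Infinitely many?
Exercise 3.2.7: For which $h$ is $\begin{bmatrix} h & 1 & 1 \\ 0 & h & 0 \\ 1 & 1 & h \end{bmatrix}$ not invertible? Find all such $h$.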
h 9 2 6
i h1i
Exercise 3.2.8: Solve 8 3 6 xÆ ⇤ 2 .
10 2 6 3
h5 3 7i h2i
Exercise 3.2.9: Solve 844 xÆ ⇤ 0 .
633 0
3 2 3 0 2
Exercise 3.2.10: Solve 3333 xÆ ⇤ 0 .
0242 4
2343 1
a) Compute $M^{-1}$. b) Compute $N^{-1}$.
where $P(t)$ is a matrix-valued function, and $\vec{x}(t)$ and $\vec{f}(t)$ are vector-valued functions. We will often suppress the dependence on $t$ and only write $\vec{x}' = P \vec{x} + \vec{f}$. A solution of the system is a vector-valued function $\vec{x}$ satisfying the vector equation.
For example, the equations
\[
\begin{aligned}
x_1' &= 2t x_1 + e^{t} x_2 + t^2 , \\
x_2' &= \frac{x_1}{t} - x_2 + e^{t} ,
\end{aligned}
\]
can be written as
\[
\vec{x}' = \begin{bmatrix} 2t & e^{t} \\ 1/t & -1 \end{bmatrix} \vec{x} + \begin{bmatrix} t^2 \\ e^{t} \end{bmatrix} .
\]
We will mostly concentrate on equations that are not just linear, but are in fact constant coefficient equations. That is, the matrix $P$ will be constant; it will not depend on $t$.
When $\vec{f} = \vec{0}$ (the zero vector), then we say the system is homogeneous. For homogeneous linear systems we have the principle of superposition, just like for single homogeneous equations.
Theorem 3.3.1 (Superposition). Let $\vec{x}' = P \vec{x}$ be a linear homogeneous system of ODEs. Suppose that $\vec{x}_1, \vec{x}_2, \ldots, \vec{x}_n$ are $n$ solutions of the equation and $c_1, c_2, \ldots, c_n$ are any constants, then
\[
\vec{x} = c_1 \vec{x}_1 + c_2 \vec{x}_2 + \cdots + c_n \vec{x}_n , \tag{3.2}
\]
is also a solution. Furthermore, if this is a system of $n$ equations ($P$ is $n \times n$), and $\vec{x}_1, \vec{x}_2, \ldots, \vec{x}_n$ are linearly independent, then every solution $\vec{x}$ can be written as (3.2).
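Linear independence for vector-valued functions is the same idea as for normal functions. The vector-valued functions $\vec{x}_1, \vec{x}_2, \ldots, \vec{x}_n$ are linearly independent when the equation $c_1 \vec{x}_1 + c_2 \vec{x}_2 + \cdots + c_n \vec{x}_n = \vec{0}$, required to hold for all $t$, has only the solution $c_1 = c_2 = \cdots = c_n = 0$.
$\vec{x}_2$, and this holds for all $t$. So $c_1 = 1$, $c_2 = -1$, and $c_3 = 1$ above will work.
On the other hand if we change the example just slightly $\vec{x}_1 = \begin{bmatrix} t^2 \\ t \end{bmatrix}$, $\vec{x}_2 = \begin{bmatrix} 0 \\ t \end{bmatrix}$, $\vec{x}_3 = \begin{bmatrix} -t^2 \\ 1 \end{bmatrix}$, then the functions are linearly independent. First write $c_1 \vec{x}_1 + c_2 \vec{x}_2 + c_3 \vec{x}_3 = \vec{0}$ and note that it has to hold for all $t$. We get that
\[
c_1 \vec{x}_1 + c_2 \vec{x}_2 + c_3 \vec{x}_3 =
\begin{bmatrix} c_1 t^2 - c_3 t^2 \\ c_1 t + c_2 t + c_3 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \end{bmatrix} .
\]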
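Theorem 3.3.2. Let $\vec{x}' = P \vec{x} + \vec{f}$ be a linear system of ODEs. Suppose $\vec{x}_p$ is one particular solution. Then every solution can be written as
\[
\vec{x} = \vec{x}_c + \vec{x}_p ,
\]
where $\vec{x}_c$ is a solution to the associated homogeneous equation ($\vec{x}' = P \vec{x}$).
\[
\vec{x}(t_0) = \vec{b}
\]
for some fixed $t_0$ and a constant vector $\vec{b}$. Let $X(t)$ be a fundamental matrix solution of the associated homogeneous equation (i.e. columns of $X(t)$ are solutions). The general solution can be written as
\[
\vec{x}(t) = X(t) \vec{c} + \vec{x}_p(t).
\]
We are seeking a vector $\vec{c}$ such that
\[
\vec{b} = \vec{x}(t_0) = X(t_0) \vec{c} + \vec{x}_p(t_0).
\]
In other words, we are solving for $\vec{c}$ the nonhomogeneous system of linear equations
\[
X(t_0) \vec{c} = \vec{b} - \vec{x}_p(t_0).
\]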
\[
x_1' = x_1 , \qquad x_2' = x_1 - x_2 ,
\]
with initial conditions $x_1(0) = 1$, $x_2(0) = 2$. Let us consider this problem in the language of this section.
The system is homogeneous, so $\vec{f}(t) = \vec{0}$. We write the system and the initial conditions as
\[
\vec{x}' = \begin{bmatrix} 1 & 0 \\ 1 & -1 \end{bmatrix} \vec{x}, \qquad \vec{x}(0) = \begin{bmatrix} 1 \\ 2 \end{bmatrix} .
\]
We found the general solution is $x_1 = c_1 e^{t}$ and $x_2 = \frac{c_1}{2} e^{t} + c_2 e^{-t}$. Letting $c_1 = 1$ and $c_2 = 0$, we obtain the solution $\begin{bmatrix} e^{t} \\ (1/2) e^{t} \end{bmatrix}$. Letting $c_1 = 0$ and $c_2 = 1$, we obtain $\begin{bmatrix} 0 \\ e^{-t} \end{bmatrix}$. These two solutions are linearly independent, as can be seen by setting $t = 0$, and noting that the resulting constant vectors are linearly independent. In matrix notation, a fundamental matrix solution is, therefore,
\[
X(t) = \begin{bmatrix} e^{t} & 0 \\ \frac{1}{2} e^{t} & e^{-t} \end{bmatrix} .
\]
This new solution agrees with our previous solution from § 3.1.
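The step of finding $\vec{c}$ from the initial condition can be checked with a short computation. The sketch below assumes the fundamental matrix of this example, whose columns are $(e^{t}, \frac{1}{2} e^{t})$ and $(0, e^{-t})$, evaluates it at $t = 0$, and solves $X(0) \vec{c} = \vec{b}$ by Cramer's rule. The helper name `solve_2x2` is an illustrative choice.

```python
from fractions import Fraction

def solve_2x2(M, rhs):
    # Solve M c = rhs for a 2x2 matrix M using Cramer's rule.
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    c1 = (rhs[0] * d - b * rhs[1]) / det
    c2 = (a * rhs[1] - c * rhs[0]) / det
    return [c1, c2]

# X(0): columns are the two solutions evaluated at t = 0.
X0 = [[Fraction(1), Fraction(0)],
      [Fraction(1, 2), Fraction(1)]]
b = [Fraction(1), Fraction(2)]   # initial condition x(0) = (1, 2)
c = solve_2x2(X0, b)
print(c)
```

The result should be $c_1 = 1$ and $c_2 = 3/2$, the same constants found in § 3.1.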
3.3.1 Exercises
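Exercise 3.3.1: Write the system $x_1' = 2x_1 - 3t x_2 + \sin t$, $x_2' = e^{t} x_1 + 3x_2 + \cos t$ in the form $\vec{x}' = P(t) \vec{x} + \vec{f}(t)$.
Exercise 3.3.2:
a) Verify that the system $\vec{x}' = \begin{bmatrix} 1 & 3 \\ 3 & 1 \end{bmatrix} \vec{x}$ has the two solutions $\begin{bmatrix} 1 \\ 1 \end{bmatrix} e^{4t}$ and $\begin{bmatrix} 1 \\ -1 \end{bmatrix} e^{-2t}$.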
\[
\vec{x}' = P \vec{x},
\]
where $P$ is a constant square matrix. We wish to adapt the method for the single constant coefficient equation by trying the function $e^{\lambda t}$. However, $\vec{x}$ is a vector. So we try $\vec{x} = \vec{v} e^{\lambda t}$, where $\vec{v}$ is an arbitrary constant vector. We plug this $\vec{x}$ into the equation to get
\[
\underbrace{\lambda \vec{v} e^{\lambda t}}_{\vec{x}'} = \underbrace{P \vec{v} e^{\lambda t}}_{P \vec{x}} .
\]
We divide by $e^{\lambda t}$ and notice that we are looking for a scalar $\lambda$ and a vector $\vec{v}$ that satisfy the equation
\[
\lambda \vec{v} = P \vec{v}.
\]
To solve this equation we need a little bit more linear algebra, which we now review.
\[
\det(A - \lambda I) = 0.
\]
\[
(A - \lambda I) \vec{v} = \vec{0},
\]
and solve for a nontrivial (nonzero) vector $\vec{v}$. If $\lambda$ is an eigenvalue, there will be at least one free variable, and so for each distinct eigenvalue $\lambda$, we can always find an eigenvector.
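Example 3.4.3: Find an eigenvector of $\begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}$ corresponding to the eigenvalue $\lambda = 3$.
We write
\[
(A - \lambda I) \vec{v} =
\left(
\begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}
- 3 \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\right)
\begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}
=
\begin{bmatrix} -1 & 1 & 1 \\ 1 & -1 & 0 \\ 0 & 0 & -1 \end{bmatrix}
\begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}
= \vec{0}.
\]
It is easy to solve this system of linear equations. We write down the augmented matrix
\[
\left[ \begin{array}{ccc|c}
-1 & 1 & 1 & 0 \\
1 & -1 & 0 & 0 \\
0 & 0 & -1 & 0
\end{array} \right],
\]
and perform row operations (exercise: which ones?) until we get:
\[
\left[ \begin{array}{ccc|c}
1 & -1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0
\end{array} \right].
\]
The entries of $\vec{v}$ have to satisfy the equations $v_1 - v_2 = 0$, $v_3 = 0$, and $v_2$ is a free variable. We can pick $v_2$ to be arbitrary (but nonzero), let $v_1 = v_2$, and of course $v_3 = 0$. For example, if we pick $v_2 = 1$, then $\vec{v} = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$. Let us verify that $\vec{v}$ really is an eigenvector corresponding to $\lambda = 3$:
\[
\begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}
\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}
=
\begin{bmatrix} 3 \\ 3 \\ 0 \end{bmatrix}
= 3 \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} .
\]
Yay! It worked.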
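The same verification can be done by machine. The following minimal sketch, with illustrative names `matvec` and `is_eigenvector`, checks whether $A \vec{v} = \lambda \vec{v}$ holds for a nonzero $\vec{v}$. It uses exact integer arithmetic, so it is meant for small hand examples like this one rather than floating point data.

```python
def matvec(A, v):
    # Multiply the matrix A (a list of rows) by the vector v.
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def is_eigenvector(A, v, lam):
    # v must be nonzero and satisfy A v = lam v exactly.
    if all(x == 0 for x in v):
        return False
    return matvec(A, v) == [lam * x for x in v]

A = [[2, 1, 1],
     [1, 2, 0],
     [0, 0, 2]]
print(matvec(A, [1, 1, 0]))
print(is_eigenvector(A, [1, 1, 0], 3))
```

Trying other multiples of the vector, or a vector that does not satisfy $v_1 = v_2$, $v_3 = 0$, is a quick way to experiment with the question of how unique eigenvectors are.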
Exercise 3.4.1 (easy): Are eigenvectors unique? Can you find a different eigenvector for $\lambda = 3$ in the example above? How are the two eigenvectors related?
Exercise 3.4.2: When the matrix is $2 \times 2$ you do not need to do row operations when computing an eigenvector, you can read it off from $A - \lambda I$ (if you have computed the eigenvalues correctly). Can you see why? Explain. Try it for the matrix $\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$.
\[
\vec{x} = c_1 \vec{v}_1 e^{\lambda_1 t} + c_2 \vec{v}_2 e^{\lambda_2 t} + \cdots + c_n \vec{v}_n e^{\lambda_n t} .
\]
\[
\det(P - \lambda I) = 0
\]
is essentially the same as the characteristic equation we got in § 2.2 and § 2.3.
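\[
\bigl( P - (1 - i) I \bigr) \vec{v} = \vec{0},
\qquad
\begin{bmatrix} i & 1 \\ -1 & i \end{bmatrix} \vec{v} = \vec{0}.
\]
The equations $i v_1 + v_2 = 0$ and $-v_1 + i v_2 = 0$ are multiples of each other. So we only need to consider one of them. After picking $v_2 = 1$, for example, we have an eigenvector $\vec{v} = \begin{bmatrix} i \\ 1 \end{bmatrix}$. In similar fashion we find that $\begin{bmatrix} -i \\ 1 \end{bmatrix}$ is an eigenvector corresponding to the eigenvalue $1 + i$.
We could write the solution as
\[
\vec{x} = c_1 \begin{bmatrix} i \\ 1 \end{bmatrix} e^{(1-i)t} + c_2 \begin{bmatrix} -i \\ 1 \end{bmatrix} e^{(1+i)t}
= \begin{bmatrix} c_1 i e^{(1-i)t} - c_2 i e^{(1+i)t} \\ c_1 e^{(1-i)t} + c_2 e^{(1+i)t} \end{bmatrix} .
\]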
We would then need to look for complex values c1 and c2 to solve any initial conditions. It
is perhaps not completely clear that we get a real solution. After solving for c 1 and c 2 , we
could use Euler’s formula and do the whole song and dance we did before, but we will not.
We will apply the formula in a smarter way first to find independent real solutions.
We claim that we did not have to look for a second eigenvector (nor for the second
eigenvalue). All complex eigenvalues come in pairs (because the matrix P is real).
First a small detour. The real part of a complex number $z$ can be computed as $\frac{z + \bar{z}}{2}$, where the bar above $z$ means $\overline{a + ib} = a - ib$. This operation is called the complex conjugate. If $a$ is a real number, then $\bar{a} = a$. Similarly we bar whole vectors or matrices by taking the complex conjugate of every entry. Suppose a matrix $P$ is real. Then $\overline{P} = P$, and so $\overline{P \vec{x}} = \overline{P} \, \overline{\vec{x}} = P \overline{\vec{x}}$. Also the complex conjugate of 0 is still 0, therefore,
\[
\vec{0} = \overline{\vec{0}} = \overline{(P - \lambda I) \vec{v}} = (P - \bar{\lambda} I) \overline{\vec{v}} .
\]
is a solution (complex-valued) of $\vec{x}' = P \vec{x}$. Euler's formula shows that $\overline{e^{a+ib}} = e^{a-ib}$, and so
\[
\vec{x}_2 = \overline{\vec{x}_1} = \overline{\vec{v}} e^{(a-ib)t}
\]
Theorem 3.4.2. Let $P$ be a real-valued constant matrix. If $P$ has a complex eigenvalue $a + ib$ and a corresponding eigenvector $\vec{v}$, then $P$ also has a complex eigenvalue $a - ib$ with a corresponding eigenvector $\overline{\vec{v}}$. Furthermore, $\vec{x}' = P \vec{x}$ has two linearly independent real-valued solutions
For each pair of complex eigenvalues $a + ib$ and $a - ib$, we get two real-valued linearly independent solutions. We then go on to the next eigenvalue, which is either a real eigenvalue or another complex eigenvalue pair. If we have $n$ distinct eigenvalues (real or complex), then we end up with $n$ linearly independent solutions. If we had only two equations ($n = 2$) as in the example above, then once we found two solutions we are finished, and our general solution is
We can now find a real-valued general solution to any homogeneous system where the
matrix has distinct eigenvalues. When we have repeated eigenvalues, matters get a bit
more complicated and we will look at that situation in § 3.7.
3.4.4 Exercises
Exercise 3.4.5 (easy):
h 1 i Let A be a 3 ⇥ 3 matrix with an eigenvalue of � and a corresponding
eigenvector vÆ ⇤ 1 . Find A vÆ.
3
Exercise 3.4.6:
a) Find the general solution of $x_1' = 2x_1$, $x_2' = 3x_2$ using the eigenvalue method (first write the system in the form $\vec{x}' = A \vec{x}$).
b) Solve the system by solving each equation separately and verify you get the same general solution.
Exercise 3.4.7: Find the general solution of $x_1' = 3x_1 + x_2$, $x_2' = 2x_1 + 4x_2$ using the eigenvalue method.
Exercise 3.4.8: Find the general solution of $x_1' = x_1 - 2x_2$, $x_2' = 2x_1 + x_2$ using the eigenvalue method. Do not use complex exponentials in your solution.
Exercise 3.4.9:
h 9 2 6
i
a� Compute eigenvalues and eigenvectors of A ⇤ 8 3 6 .
10 2 6
Exercise 3.4.101:
h 1 03
i
a� Compute eigenvalues and eigenvectors of A ⇤ 101 .
2 02
Exercise 3.4.102:
⇥ 1 1
⇤
a� Compute eigenvalues and eigenvectors of A ⇤ 10 .
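the system
\[
\begin{bmatrix} x \\ y \end{bmatrix}' = P \begin{bmatrix} x \\ y \end{bmatrix}
\qquad \text{or} \qquad
\begin{bmatrix} x \\ y \end{bmatrix}' = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} . \tag{3.3}
\]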
The system is autonomous (compare this section to § 1.6) and so we can draw a vector field
(see the end of § 3.1). We will be able to visually tell what the vector field looks like and how
the solutions behave, once we find the eigenvalues and eigenvectors of the matrix P. For
this section, we assume that P has two eigenvalues and two corresponding eigenvectors.
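Case 1. Suppose that the eigenvalues of $P$ are real and positive. We find two corresponding eigenvectors and plot them in the plane. For example, take the matrix $\begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix}$. The eigenvalues are 1 and 2 and corresponding eigenvectors are $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$. See Figure 3.4.
Suppose the point $(x, y)$ is on the line determined by an eigenvector $\vec{v}$ for an eigenvalue $\lambda$. That is, $\begin{bmatrix} x \\ y \end{bmatrix} = \alpha \vec{v}$ for some scalar $\alpha$. Then
\[
\begin{bmatrix} x \\ y \end{bmatrix}' = P \begin{bmatrix} x \\ y \end{bmatrix} = P (\alpha \vec{v}) = \alpha (P \vec{v}) = \alpha \lambda \vec{v} .
\]
Case 3. Suppose one eigenvalue is positive and one is negative. For example the matrix $\begin{bmatrix} 1 & 1 \\ 0 & -2 \end{bmatrix}$. The eigenvalues are 1 and $-2$ and corresponding eigenvectors are $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -3 \end{bmatrix}$.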
[Figure 3.5: Eigenvectors of $P$ with directions.]
[Figure 3.6: Example source vector field with eigenvectors and solutions.]
[Figure 3.7: Example sink vector field with eigenvectors and solutions.]
[Figure 3.8: Example saddle vector field with eigenvectors and solutions.]
We reverse the arrows on one line (corresponding to the negative eigenvalue) and we
obtain the picture in Figure 3.8. We call this picture a saddle point.
For the next three cases we will assume the eigenvalues are complex. In this case the
eigenvectors are also complex and we cannot just plot them in the plane.
We can take any linear combination of them to get other solutions, which one we take
depends on the initial conditions. Now note that the real part is a parametric equation for
an ellipse. Same with the imaginary part and in fact any linear combination of the two.
This is what happens in general when the eigenvalues are purely imaginary. So when the
eigenvalues are purely imaginary, we get ellipses for the solutions. This type of picture is
sometimes called a center. See Figure 3.9.
[Figure 3.9: Example center vector field.]
[Figure 3.10: Example spiral source vector field.]
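Case 5. Now suppose the complex eigenvalues have a positive real part. That is, suppose the eigenvalues are $a \pm ib$ for some $a > 0$. For example, let $P = \begin{bmatrix} 1 & 1 \\ -4 & 1 \end{bmatrix}$. The eigenvalues turn out to be $1 \pm 2i$ and eigenvectors are $\begin{bmatrix} 1 \\ 2i \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -2i \end{bmatrix}$. We take $1 + 2i$ and its eigenvector $\begin{bmatrix} 1 \\ 2i \end{bmatrix}$ and find the real and imaginary parts of $\vec{v} e^{(1+2i)t}$ are
\[
\operatorname{Re} \begin{bmatrix} 1 \\ 2i \end{bmatrix} e^{(1+2i)t} = e^{t} \begin{bmatrix} \cos(2t) \\ -2 \sin(2t) \end{bmatrix},
\qquad
\operatorname{Im} \begin{bmatrix} 1 \\ 2i \end{bmatrix} e^{(1+2i)t} = e^{t} \begin{bmatrix} \sin(2t) \\ 2 \cos(2t) \end{bmatrix} .
\]
Note the $e^{t}$ in front of the solutions. The solutions grow in magnitude while spinning around the origin. Hence we get a spiral source. See Figure 3.10.
Case 6. Finally suppose the complex eigenvalues have a negative real part. That is, suppose the eigenvalues are $-a \pm ib$ for some $a > 0$. For example, let $P = \begin{bmatrix} -1 & -1 \\ 4 & -1 \end{bmatrix}$. The eigenvalues turn out to be $-1 \pm 2i$ and eigenvectors are $\begin{bmatrix} 1 \\ -2i \end{bmatrix}$ and $\begin{bmatrix} 1 \\ 2i \end{bmatrix}$. We take $-1 - 2i$ and its eigenvector $\begin{bmatrix} 1 \\ 2i \end{bmatrix}$ and find the real and imaginary parts of $\vec{v} e^{(-1-2i)t}$ are
\[
\operatorname{Re} \begin{bmatrix} 1 \\ 2i \end{bmatrix} e^{(-1-2i)t} = e^{-t} \begin{bmatrix} \cos(2t) \\ 2 \sin(2t) \end{bmatrix},
\qquad
\operatorname{Im} \begin{bmatrix} 1 \\ 2i \end{bmatrix} e^{(-1-2i)t} = e^{-t} \begin{bmatrix} -\sin(2t) \\ 2 \cos(2t) \end{bmatrix} .
\]
Note the $e^{-t}$ in front of the solutions. The solutions shrink in magnitude while spinning around the origin. Hence we get a spiral sink. See Figure 3.11.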
[Figure 3.11: Example spiral sink vector field.]

Eigenvalues                          Behavior
real and both positive               source / unstable node
real and both negative               sink / stable node
real and opposite signs              saddle
purely imaginary                     center point / ellipses
complex with positive real part      spiral source
complex with negative real part      spiral sink
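The table can be turned into a small classifier. The sketch below is only an illustration: the function name `classify` is made up, and it relies on the standard fact that the eigenvalues of $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ are the roots of $\lambda^2 - (a + d) \lambda + (ad - bc) = 0$, so the sign of the discriminant tells real from complex eigenvalues. Cases outside the table (a zero eigenvalue) are reported separately, and a repeated eigenvalue is classified by its sign even though the picture may then differ from the ones drawn in this section.

```python
import math

def classify(a, b, c, d):
    # Eigenvalues are roots of lambda^2 - trace*lambda + det = 0.
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4 * det
    if disc >= 0:
        # Real eigenvalues.
        root = math.sqrt(disc)
        lam1, lam2 = (trace + root) / 2, (trace - root) / 2
        if lam1 > 0 and lam2 > 0:
            return "source"
        if lam1 < 0 and lam2 < 0:
            return "sink"
        if lam1 * lam2 < 0:
            return "saddle"
        return "zero eigenvalue (not in the table)"
    # Complex eigenvalues: the real part is trace / 2.
    if trace == 0:
        return "center"
    return "spiral source" if trace > 0 else "spiral sink"

print(classify(1, 1, 0, 2))     # matrix from Case 1
print(classify(1, 1, 0, -2))    # matrix from Case 3
print(classify(1, 1, -4, 1))    # matrix from Case 5
print(classify(-1, -1, 4, -1))  # matrix from Case 6
```

The four calls use the example matrices of Cases 1, 3, 5, and 6 and should report the behavior named in the corresponding case.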
3.5.1 Exercises
Exercise 3.5.1: Take the equation $mx'' + cx' + kx = 0$, with $m > 0$, $c \geq 0$, $k > 0$ for the mass-spring system.
a) Convert this to a system of first order equations.
b) Classify for what $m$, $c$, $k$ do you get which behavior.
c) Can you explain from physical intuition why you do not get all the different kinds of behavior here?
Exercise 3.5.2: What happens in the case when $P = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$? In this case the eigenvalue is repeated and there is only one independent eigenvector. What picture does this look like?
Exercise 3.5.3: What happens in the case when $P = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$? Does this look like any of the pictures we have drawn?
Exercise 3.5.4: Which behaviors are possible if $P$ is diagonal, that is $P = \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix}$? You can assume that $a$ and $b$ are not zero.
Exercise 3.5.5: Take the system from Example 3.1.2, $x_1' = \frac{r}{V}(x_2 - x_1)$, $x_2' = \frac{r}{V}(x_1 - x_2)$. As we said, one of the eigenvalues is zero. What is the other eigenvalue, what does the picture look like, and what happens when $t$ goes to infinity?
Exercise 3.5.101: Describe the behavior of the following systems without solving�
e� x 0 ⇤ x 4y, y 0 ⇤ 4x + y.
[Figure: three masses $m_1$, $m_2$, $m_3$ connected by springs with constants $k_1$, $k_2$, $k_3$, $k_4$.]
This simple system turns up in unexpected places. For example, our world really
consists of many small particles of matter interacting together. When we try the above
system with many more masses, we obtain a good approximation to how an elastic material
behaves. By somehow taking a limit of the number of masses going to infinity, we obtain
the continuous one-dimensional wave equation (that we study in § 4.7). But we digress.
Let us set up the equations for the three mass system. By Hooke’s law, the force acting
on the mass equals the spring compression times the spring constant. By Newton’s second
law, force is mass times acceleration. So if we sum the forces acting on each mass, put the
right sign in front of each term, depending on the direction in which it is acting, and set
this equal to mass times the acceleration, we end up with the desired system of equations.
Exercise 3.6.1: Repeat this setup for � masses �find the matrices M and K�. Do it for � masses.
Can you find a prescription to do it for n masses�
As with a single equation we want to “divide by M.” This means computing the inverse
of M. The masses are all nonzero and M is a diagonal matrix, so computing the inverse is
easy:
\[ M^{-1} = \begin{bmatrix} \frac{1}{m_1} & 0 & 0 \\ 0 & \frac{1}{m_2} & 0 \\ 0 & 0 & \frac{1}{m_3} \end{bmatrix}. \]
This fact follows readily from how we multiply diagonal matrices. As an exercise, you should verify that $MM^{-1} = M^{-1}M = I$.
Let $A = M^{-1}K$. We look at the system $\vec{x}'' = M^{-1}K\vec{x}$, or
\[ \vec{x}'' = A\vec{x}. \]
Many real world systems can be modeled by this equation. For simplicity, we will only talk
about the given masses-and-springs problem. We try a solution of the form
\[ \vec{x} = \vec{v} e^{\alpha t}. \]
We compute that for this guess, $\vec{x}'' = \alpha^2 \vec{v} e^{\alpha t}$. We plug our guess into the equation and get
\[ \alpha^2 \vec{v} e^{\alpha t} = A \vec{v} e^{\alpha t}. \]
\[ \vec{x} = \vec{v}\bigl(\cos(\omega t) + i \sin(\omega t)\bigr). \]
By taking the real and imaginary parts (note that $\vec{v}$ is real), we find that $\vec{v}\cos(\omega t)$ and $\vec{v}\sin(\omega t)$ are linearly independent solutions.
If an eigenvalue is zero, it turns out that both $\vec{v}$ and $\vec{v}\,t$ are solutions, where $\vec{v}$ is an eigenvector corresponding to the eigenvalue 0.
154 CHAPTER �. SYSTEMS OF ODES
Exercise 3.6.2: Show that if $A$ has a zero eigenvalue and $\vec{v}$ is a corresponding eigenvector, then $\vec{x} = \vec{v}(a + bt)$ is a solution of $\vec{x}'' = A\vec{x}$ for arbitrary constants $a$ and $b$.
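Theorem 3.6.1. Let $A$ be a real $n \times n$ matrix with $n$ distinct real negative (or zero) eigenvalues we denote by $-\omega_1^2 > -\omega_2^2 > \cdots > -\omega_n^2$, and corresponding eigenvectors by $\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_n$. If $A$ is invertible (that is, if $\omega_1 > 0$), then
\[ \vec{x}(t) = \sum_{i=1}^{n} \vec{v}_i \bigl( a_i \cos(\omega_i t) + b_i \sin(\omega_i t) \bigr) \]
is the general solution of $\vec{x}'' = A\vec{x}$ for arbitrary constants $a_i$ and $b_i$. If $A$ has a zero eigenvalue, that is $\omega_1 = 0$, and all other eigenvalues are distinct and negative, then the general solution can be written as
\[ \vec{x}(t) = \vec{v}_1 (a_1 + b_1 t) + \sum_{i=2}^{n} \vec{v}_i \bigl( a_i \cos(\omega_i t) + b_i \sin(\omega_i t) \bigr). \]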
We use this solution and the setup from the introduction of this section even when
some of the masses and springs are missing. For example, when there are only 2 masses
and only 2 springs, simply take only the equations for the two masses and set all the spring
constants for the springs that are missing to zero.
3.6.2 Examples
Example 3.6.1: Consider the setup in Figure 3.13, with $m_1 = 2$ kg, $m_2 = 1$ kg, $k_1 = 4$ N/m, and $k_2 = 2$ N/m.
[Figure 3.13: two masses $m_1$ and $m_2$ with springs $k_1$ and $k_2$.]
Figure 3.14: The two modes of the mass-spring system. In the left plot the masses are moving in unison and in the right plot the masses are moving in opposite directions.
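Now we solve:
\[ \begin{bmatrix} 0 \\ 6 \end{bmatrix} = \vec{x}'(0) = \begin{bmatrix} 1 \\ 2 \end{bmatrix} b_1 + \begin{bmatrix} 1 \\ -1 \end{bmatrix} 2b_2 = \begin{bmatrix} b_1 + 2b_2 \\ 2b_1 - 2b_2 \end{bmatrix}. \]
Again solve (exercise) to find $b_1 = 2$, $b_2 = -1$. So our solution is
\[ \vec{x} = \begin{bmatrix} 1 \\ 2 \end{bmatrix} 2\sin(t) + \begin{bmatrix} 1 \\ -1 \end{bmatrix} \bigl( \cos(2t) - \sin(2t) \bigr) = \begin{bmatrix} 2\sin(t) + \cos(2t) - \sin(2t) \\ 4\sin(t) - \cos(2t) + \sin(2t) \end{bmatrix}. \]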
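As a sanity check on this example, the following minimal sketch (standard library only) tests that the solution with $b_1 = 2$, $b_2 = -1$ satisfies $\vec{x}'' = A\vec{x}$. The matrix $A = M^{-1}K$ used here is the one quoted for these parameters in Example 3.6.3; the helper names are purely illustrative.

```python
import math

# A = M^{-1} K for m1 = 2, m2 = 1, k1 = 4, k2 = 2 (assumed from the examples).
A = [[-3.0, 1.0], [2.0, -2.0]]

def x(t):
    """Displacements [x1, x2] with a1 = 0, a2 = 1, b1 = 2, b2 = -1."""
    return [2*math.sin(t) + math.cos(2*t) - math.sin(2*t),
            4*math.sin(t) - math.cos(2*t) + math.sin(2*t)]

def x_dd(t):
    """Second derivative of x, differentiated by hand."""
    return [-2*math.sin(t) - 4*math.cos(2*t) + 4*math.sin(2*t),
            -4*math.sin(t) + 4*math.cos(2*t) - 4*math.sin(2*t)]

def residual(t):
    """Largest component of x'' - A x at time t."""
    xt, xdd = x(t), x_dd(t)
    ax = [A[i][0]*xt[0] + A[i][1]*xt[1] for i in range(2)]
    return max(abs(xdd[i] - ax[i]) for i in range(2))

max_residual = max(residual(0.1*k) for k in range(101))
print("max |x'' - A x| on [0, 10]:", max_residual)
print("x(0) =", x(0))
```

The residual should be at the level of floating point rounding, and $\vec{x}(0)$ shows the initial displacement implied by $a_1 = 0$, $a_2 = 1$.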
The graphs of the two displacements, $x_1$ and $x_2$, of the two carts are in Figure 3.15.
Figure 3.15: Superposition of the two modes given the initial conditions.
Example 3.6.2: We have two toy rail cars. Car 1 of mass 2 kg is traveling at 3 m/s towards the second rail car of mass 1 kg. There is a bumper on the second rail car that engages at the moment the cars hit (it connects the two cars) and does not let go. The bumper acts like a spring of spring constant $k = 2$ N/m. The second car is 10 meters from a wall. See Figure 3.16.
We want to ask several questions. At what time after the cars link does impact with the wall happen? What is the speed of car 2 when it hits the wall?
OK, let us first set the system up. Let $t = 0$ be the time when the two cars link up. Let $x_1$ be the displacement of the first car from the position at $t = 0$, and let $x_2$ be the displacement
�.�. SECOND ORDER SYSTEMS AND APPLICATIONS 157
Figure 3.16: The crash of two rail cars.
of the second car from its original location. Then the time when $x_2(t) = 10$ is exactly the time when impact with the wall occurs. For this $t$, $x_2'(t)$ is the speed at impact. This system acts just like the system of the previous example but without $k_1$. Hence the equation is
\[ \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} \vec{x}'' = \begin{bmatrix} -2 & 2 \\ 2 & -2 \end{bmatrix} \vec{x}, \]
or
\[ \vec{x}'' = \begin{bmatrix} -1 & 1 \\ 2 & -2 \end{bmatrix} \vec{x}. \]
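We compute the eigenvalues of $A$. It is not hard to see that the eigenvalues are $0$ and $-3$ (exercise). Furthermore, eigenvectors are $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -2 \end{bmatrix}$ respectively (exercise). Then $\omega_1 = 0$, $\omega_2 = \sqrt{3}$, and by the second part of the theorem the general solution is
\[ \vec{x} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} (a_1 + b_1 t) + \begin{bmatrix} 1 \\ -2 \end{bmatrix} \bigl( a_2 \cos(\sqrt{3}\,t) + b_2 \sin(\sqrt{3}\,t) \bigr) = \begin{bmatrix} a_1 + b_1 t + a_2 \cos(\sqrt{3}\,t) + b_2 \sin(\sqrt{3}\,t) \\ a_1 + b_1 t - 2a_2 \cos(\sqrt{3}\,t) - 2b_2 \sin(\sqrt{3}\,t) \end{bmatrix}. \]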
We now apply the initial conditions. First the cars start at position 0 so $x_1(0) = 0$ and $x_2(0) = 0$. The first car is traveling at 3 m/s, so $x_1'(0) = 3$ and the second car starts at rest, so $x_2'(0) = 0$. The first condition says
\[ \vec{0} = \vec{x}(0) = \begin{bmatrix} a_1 + a_2 \\ a_1 - 2a_2 \end{bmatrix}. \]
It is not hard to see that $a_1 = a_2 = 0$. We set $a_1 = 0$ and $a_2 = 0$ in $\vec{x}(t)$ and differentiate to get
\[ \vec{x}'(t) = \begin{bmatrix} b_1 + \sqrt{3}\, b_2 \cos(\sqrt{3}\,t) \\ b_1 - 2\sqrt{3}\, b_2 \cos(\sqrt{3}\,t) \end{bmatrix}. \]
So
\[ \begin{bmatrix} 3 \\ 0 \end{bmatrix} = \vec{x}'(0) = \begin{bmatrix} b_1 + \sqrt{3}\, b_2 \\ b_1 - 2\sqrt{3}\, b_2 \end{bmatrix}. \]
Solving these two equations we find $b_1 = 2$ and $b_2 = \frac{1}{\sqrt{3}}$. Hence the position of our cars is (until the impact with the wall)
\[ \vec{x} = \begin{bmatrix} 2t + \frac{1}{\sqrt{3}} \sin(\sqrt{3}\,t) \\ 2t - \frac{2}{\sqrt{3}} \sin(\sqrt{3}\,t) \end{bmatrix}. \]
Note how the presence of the zero eigenvalue resulted in a term containing t. This means
that the cars will be traveling in the positive direction as time grows, which is what we
expect.
What we are really interested in is the second expression, the one for $x_2$. We have $x_2(t) = 2t - \frac{2}{\sqrt{3}} \sin(\sqrt{3}\,t)$. See Figure 3.17 for the plot of $x_2$ versus time.
Just from the graph we can see that time of impact will be a little more than 5 seconds from time zero. For this we have to solve the equation $10 = x_2(t) = 2t - \frac{2}{\sqrt{3}} \sin(\sqrt{3}\,t)$. Using a computer (or even a graphing calculator) we find that $t_{\text{impact}} \approx 5.22$ seconds.
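One way to do the "using a computer" step is plain bisection on $x_2(t) - 10$. The sketch below uses only the standard library; the bracket $[5, 6]$ is an assumption read off from the remark that impact happens a little after 5 seconds, and the function names are illustrative. It also evaluates the speed $x_2'(t) = 2 - 2\cos(\sqrt{3}\,t)$, obtained by differentiating $x_2$, at the computed time.

```python
import math

SQRT3 = math.sqrt(3.0)

def x2(t):
    """Position of car 2 (ignoring the wall)."""
    return 2.0*t - (2.0/SQRT3)*math.sin(SQRT3*t)

def x2_speed(t):
    """Derivative of x2."""
    return 2.0 - 2.0*math.cos(SQRT3*t)

def bisect(f, lo, hi, tol=1e-10):
    """Plain bisection; assumes f(lo) < 0 < f(hi)."""
    while hi - lo > tol:
        mid = 0.5*(lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5*(lo + hi)

t_impact = bisect(lambda t: x2(t) - 10.0, 5.0, 6.0)
speed_at_impact = x2_speed(t_impact)
print(round(t_impact, 2), round(speed_at_impact, 2))
```

Bisection is safe here because $x_2' \geq 0$, so $x_2$ never decreases and the root in the bracket is unique.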
The speed of the second car is $x_2' = 2 - 2\cos(\sqrt{3}\,t)$.
Figure 3.17: Position of the second car in time (ignoring the wall).
[...] first car links up with car 2, but if car 2 hits the wall at any speed greater than zero, Bob will spill his drink. Suppose Bob can move car 2 a few meters towards or away from the wall (he cannot go all the way to the wall, nor can he get out of the way of the first car). Is there a “safe” distance for him to be at? A distance such that the impact with the wall is at zero speed?
The answer is yes. Looking at Figure 3.17, we note the “plateau” between $t = 3$ and $t = 4$. There is a point where the speed is zero. To find it we solve $x_2'(t) = 0$. This is when $\cos(\sqrt{3}\,t) = 1$ or in other words when $t = \frac{2\pi}{\sqrt{3}}, \frac{4\pi}{\sqrt{3}}, \ldots$ and so on. We plug in the first value to obtain $x_2\bigl(\frac{2\pi}{\sqrt{3}}\bigr) = \frac{4\pi}{\sqrt{3}} \approx 7.26$. So a “safe” distance is about 7 and a quarter meters from the wall.
Alternatively Bob could move away from the wall towards the incoming car 1, where another safe distance is $x_2\bigl(\frac{4\pi}{\sqrt{3}}\bigr) = \frac{8\pi}{\sqrt{3}} \approx 14.51$ and so on. We can use all the different $t$ such that $x_2'(t) = 0$. Of course $t = 0$ is also a solution, corresponding to $x_2 = 0$, but that means standing right at the wall.
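The safe distances can be tabulated with a short sketch (standard library only; names are illustrative). It evaluates $x_2$ and the speed $x_2'(t) = 2 - 2\cos(\sqrt{3}\,t)$ at the times $t = 2\pi n/\sqrt{3}$ for the first few $n$.

```python
import math

SQRT3 = math.sqrt(3.0)

def x2(t):
    """Position of car 2 (ignoring the wall)."""
    return 2.0*t - (2.0/SQRT3)*math.sin(SQRT3*t)

def x2_speed(t):
    """Derivative of x2."""
    return 2.0 - 2.0*math.cos(SQRT3*t)

# Times where cos(sqrt(3) t) = 1, excluding t = 0.
safe_stops = []
for n in range(1, 4):
    t = 2.0*math.pi*n/SQRT3
    safe_stops.append((t, x2(t), x2_speed(t)))

for t, distance, speed in safe_stops:
    print(f"t = {t:.4f}  distance = {distance:.4f}  speed = {speed:.2e}")
```

Each listed distance equals $4\pi n/\sqrt{3}$, and the printed speeds are zero up to rounding.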
That is, we are adding periodic forcing to the system in the direction of the vector $\vec{F}$.
As before, this system just requires us to find one particular solution $\vec{x}_p$, add it to the general solution of the associated homogeneous system $\vec{x}_c$, and we will have the general solution to (3.4). Let us suppose that $\omega$ is not one of the natural frequencies of $\vec{x}'' = A\vec{x}$, then we can guess
\[ \vec{x}_p = \vec{c} \cos(\omega t), \]
where $\vec{c}$ is an unknown constant vector. Note that we do not need to use sine since there are only second derivatives. We solve for $\vec{c}$ to find $\vec{x}_p$. This is really just the method of undetermined coefficients for systems. Let us differentiate $\vec{x}_p$ twice to get
\[ \vec{x}_p'' = -\omega^2 \vec{c} \cos(\omega t). \]
Plugging $\vec{x}_p$ and $\vec{x}_p''$ into the equation gives
\[ \overbrace{-\omega^2 \vec{c} \cos(\omega t)}^{\vec{x}_p''} = \overbrace{A\vec{c} \cos(\omega t)}^{A\vec{x}_p} + \vec{F} \cos(\omega t). \]
Cancelling the cosine and rearranging, we obtain
\[ (A + \omega^2 I)\vec{c} = -\vec{F}. \]
So
\[ \vec{c} = (A + \omega^2 I)^{-1} (-\vec{F}). \]
Of course this is possible only if $(A + \omega^2 I) = \bigl(A - (-\omega^2) I\bigr)$ is invertible. That matrix is invertible if and only if $-\omega^2$ is not an eigenvalue of $A$. That is true if and only if $\omega$ is not a natural frequency of the system.
We simplified things a little bit. If we wish to have the forcing term to be in the units of force, say Newtons, then we must write
\[ M\vec{x}'' = K\vec{x} + \vec{G} \cos(\omega t). \]
Then
\[ \vec{x}'' = M^{-1}K\vec{x} + M^{-1}\vec{G} \cos(\omega t) \qquad \text{or} \qquad \vec{x}'' = A\vec{x} + \vec{F} \cos(\omega t), \]
where $\vec{F} = M^{-1}\vec{G}$.
Example 3.6.3: Let us take the example in Figure 3.13 on page 154 with the same parameters as before: $m_1 = 2$, $m_2 = 1$, $k_1 = 4$, and $k_2 = 2$. Now suppose that there is a force $2\cos(3t)$ acting on the second cart.
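The equation is
\[ \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} \vec{x}'' = \begin{bmatrix} -6 & 2 \\ 2 & -2 \end{bmatrix} \vec{x} + \begin{bmatrix} 0 \\ 2 \end{bmatrix} \cos(3t) \qquad \text{or} \qquad \vec{x}'' = \begin{bmatrix} -3 & 1 \\ 2 & -2 \end{bmatrix} \vec{x} + \begin{bmatrix} 0 \\ 2 \end{bmatrix} \cos(3t). \]
Here the upper left entry of $K$ is $-(k_1 + k_2) = -6$, which after dividing the first row by $m_1 = 2$ gives the entry $-3$ of $A$.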
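We solved the associated homogeneous equation before and found the complementary solution to be
\[ \vec{x}_c = \begin{bmatrix} 1 \\ 2 \end{bmatrix} \bigl( a_1 \cos(t) + b_1 \sin(t) \bigr) + \begin{bmatrix} 1 \\ -1 \end{bmatrix} \bigl( a_2 \cos(2t) + b_2 \sin(2t) \bigr). \]
The natural frequencies are 1 and 2. As 3 is not a natural frequency, we try $\vec{c} \cos(3t)$. We invert $(A + 3^2 I)$:
\[ \left( \begin{bmatrix} -3 & 1 \\ 2 & -2 \end{bmatrix} + 3^2 I \right)^{-1} = \begin{bmatrix} 6 & 1 \\ 2 & 7 \end{bmatrix}^{-1} = \begin{bmatrix} \frac{7}{40} & \frac{-1}{40} \\ \frac{-1}{20} & \frac{3}{20} \end{bmatrix}. \]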
2 2 2 7 20 20
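Hence,
\[ \vec{c} = (A + \omega^2 I)^{-1} (-\vec{F}) = \begin{bmatrix} \frac{7}{40} & \frac{-1}{40} \\ \frac{-1}{20} & \frac{3}{20} \end{bmatrix} \begin{bmatrix} 0 \\ -2 \end{bmatrix} = \begin{bmatrix} \frac{1}{20} \\ \frac{-3}{10} \end{bmatrix}. \]
Combining with the general solution of the associated homogeneous problem, we get that the general solution to $\vec{x}'' = A\vec{x} + \vec{F} \cos(\omega t)$ is
\[ \vec{x} = \vec{x}_c + \vec{x}_p = \begin{bmatrix} 1 \\ 2 \end{bmatrix} \bigl( a_1 \cos(t) + b_1 \sin(t) \bigr) + \begin{bmatrix} 1 \\ -1 \end{bmatrix} \bigl( a_2 \cos(2t) + b_2 \sin(2t) \bigr) + \begin{bmatrix} \frac{1}{20} \\ \frac{-3}{10} \end{bmatrix} \cos(3t). \]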
We then solve for the constants a 1 , a 2 , b 1 , and b 2 using any initial conditions we are given.
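The computation of $\vec{c}$ can be reproduced exactly with rational arithmetic. This is a minimal sketch for the $2 \times 2$ case only, using the standard library; the function name `particular_amplitude` is made up for illustration and is not from any library.

```python
from fractions import Fraction as Fr

A = [[Fr(-3), Fr(1)], [Fr(2), Fr(-2)]]
F = [Fr(0), Fr(2)]

def particular_amplitude(A, F, omega):
    """Solve (A + omega^2 I) c = -F using the 2x2 inverse formula."""
    a, b = A[0][0] + omega**2, A[0][1]
    c_, d = A[1][0], A[1][1] + omega**2
    det = a*d - b*c_
    if det == 0:
        # -omega^2 is an eigenvalue of A: omega is a natural frequency.
        raise ValueError("omega is a natural frequency; this guess fails")
    rhs = [-F[0], -F[1]]
    return [(d*rhs[0] - b*rhs[1]) / det, (-c_*rhs[0] + a*rhs[1]) / det]

c = particular_amplitude(A, F, 3)
print(c)
```

Calling the function with `omega` equal to 1 or 2 raises the error, which mirrors the invertibility condition: those are exactly the natural frequencies.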
Note that given force $\vec{f}$, we write the equation as $M\vec{x}'' = K\vec{x} + \vec{f}$ to get the units right. Then we write $\vec{x}'' = M^{-1}K\vec{x} + M^{-1}\vec{f}$. The term $\vec{g} = M^{-1}\vec{f}$ in $\vec{x}'' = A\vec{x} + \vec{g}$ is in units of force per unit mass.
If $\omega$ is a natural frequency of the system, resonance may occur, because we will have to try a particular solution of the form
\[ \vec{x}_p = \vec{c}\, t \sin(\omega t) + \vec{d} \cos(\omega t). \]
That is assuming that the eigenvalues of the coefficient matrix are distinct. Next, note that the amplitude of this solution grows without bound as $t$ grows.
3.6.4 Exercises
Exercise 3.6.3: Find a particular solution to
\[ \vec{x}'' = \begin{bmatrix} -3 & 1 \\ 2 & -2 \end{bmatrix} \vec{x} + \begin{bmatrix} 0 \\ 2 \end{bmatrix} \cos(2t). \]
Exercise 3.6.4 (challenging): Let us take the example in Figure �.�� on page ��� with the same parameters as before: $m_1 = 2$, $k_1 = 4$, and $k_2 = 2$, except for $m_2$, which is unknown. Suppose that there is a force $\cos(5t)$ acting on the first mass. Find an $m_2$ such that there exists a particular solution where the first mass does not move.
Note: This idea is called dynamic damping. In practice there will be a small amount of damping and so any transient solution will disappear and after long enough time, the first mass will always come to a stop.
Exercise 3.6.5: Let us take the Example �.�.� on page ���, but that at time of impact, car � is
moving to the left at the speed of � m/s.
b) Will the second car hit the wall, or will it be moving away from the wall as time goes on?
c) At what speed would the first car have to be traveling for the system to essentially stay in place after linkup?
Exercise 3.6.6: Let us take the example in Figure �.�� on page ��� with parameters $m_1 = m_2 = 1$, $k_1 = k_2 = 1$. Does there exist a set of initial conditions for which the first cart moves but the second cart does not? If so, find those conditions. If not, argue why not.
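Exercise 3.6.101: Find the general solution to $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix} \vec{x}'' = \begin{bmatrix} -3 & 0 & 0 \\ 2 & -4 & 0 \\ 0 & 6 & -3 \end{bmatrix} \vec{x} + \begin{bmatrix} \cos(2t) \\ 0 \\ 0 \end{bmatrix}$.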
Exercise 3.6.102: Suppose there are three carts of equal mass $m$ and connected by two springs of constant $k$ (and no connections to walls). Set up the system and find its general solution.
\[ \vec{x} = c_1 \vec{v}_1 e^{\lambda_1 t} + c_2 \vec{v}_2 e^{\lambda_2 t} + \cdots + c_n \vec{v}_n e^{\lambda_n t}. \]
In other words, the hypothesis of the theorem could be stated as saying that if all the
eigenvalues of P are complete, then there are n linearly independent eigenvectors and thus
we have the given general solution.
If the geometric multiplicity of an eigenvalue is 2 or greater, then the set of linearly independent eigenvectors is not unique up to multiples as it was before. For example, for the diagonal matrix $A = \begin{bmatrix} 3 & 0 \\ 0 & 3 \end{bmatrix}$ we could also pick eigenvectors $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$, or in fact any pair of two linearly independent vectors. The number of linearly independent eigenvectors corresponding to $\lambda$ is the number of free variables we obtain when solving $A\vec{v} = \lambda\vec{v}$. We pick specific values for those free variables to obtain eigenvectors. If you pick different values, you may get different eigenvectors.
We are now stuck, we get no other solutions from standard eigenvectors. But we need two
linearly independent solutions to find the general solution of the equation.
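Let us try (in the spirit of repeated roots of the characteristic equation for a single equation) another solution of the form
\[ \vec{x}_2 = (\vec{v}_2 + \vec{v}_1 t)\, e^{3t}. \]
We differentiate to get
\[ \vec{x}_2' = \vec{v}_1 e^{3t} + 3(\vec{v}_2 + \vec{v}_1 t)\, e^{3t} = (3\vec{v}_2 + \vec{v}_1)\, e^{3t} + 3\vec{v}_1 t e^{3t}. \]
As we are assuming that $\vec{x}_2$ is a solution, $\vec{x}_2'$ must equal $A\vec{x}_2$. So let's compute $A\vec{x}_2$:
\[ A\vec{x}_2 = A(\vec{v}_2 + \vec{v}_1 t)\, e^{3t} = A\vec{v}_2 e^{3t} + A\vec{v}_1 t e^{3t}. \]
By looking at the coefficients of $e^{3t}$ and $te^{3t}$ we see $3\vec{v}_2 + \vec{v}_1 = A\vec{v}_2$ and $3\vec{v}_1 = A\vec{v}_1$. This means that
\[ (A - 3I)\vec{v}_2 = \vec{v}_1, \qquad \text{and} \qquad (A - 3I)\vec{v}_1 = \vec{0}. \]
Therefore, $\vec{x}_2$ is a solution if these two equations are satisfied. The second equation is satisfied if $\vec{v}_1$ is an eigenvector, and we found the eigenvector above, so let $\vec{v}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$. So, if we can find a $\vec{v}_2$ that solves $(A - 3I)\vec{v}_2 = \vec{v}_1$, then we are done. This is just a bunch of linear equations to solve and we are by now very good at that. Let us solve $(A - 3I)\vec{v}_2 = \vec{v}_1$. Write
\[ \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}. \]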
Let us check that we really do have the solution. First $x_1' = c_1 3e^{3t} + c_2 e^{3t} + 3c_2 te^{3t} = 3x_1 + x_2$. Good. Now $x_2' = 3c_2 e^{3t} = 3x_2$. Good.
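The same check can be done numerically. The sketch below assumes the system being verified is $x_1' = 3x_1 + x_2$, $x_2' = 3x_2$ (read off from the check above), takes $\vec{v}_1 = (1, 0)$, and picks $\vec{v}_2 = (0, 1)$, which is one possible solution of $(A - 3I)\vec{v}_2 = \vec{v}_1$ (any $a$ with $b = 1$ works). The constants $c_1 = 2$, $c_2 = -1.5$ are arbitrary test values.

```python
import math

A = [[3.0, 1.0], [0.0, 3.0]]
A_minus_3I = [[0.0, 1.0], [0.0, 0.0]]
v1 = [1.0, 0.0]
v2 = [0.0, 1.0]  # one choice: a = 0, b = 1

def matvec(M, v):
    return [M[i][0]*v[0] + M[i][1]*v[1] for i in range(2)]

# Chain condition (A - 3I) v2 = v1.
chain = matvec(A_minus_3I, v2)

def x(t, c1, c2):
    """x = c1 v1 e^{3t} + c2 (v2 + v1 t) e^{3t}."""
    e = math.exp(3.0*t)
    return [c1*e + c2*t*e, c2*e]

def x_prime(t, c1, c2):
    """Derivative of x, computed by hand."""
    e = math.exp(3.0*t)
    return [3.0*c1*e + c2*e + 3.0*c2*t*e, 3.0*c2*e]

def residual(t, c1, c2):
    lhs, rhs = x_prime(t, c1, c2), matvec(A, x(t, c1, c2))
    return max(abs(lhs[i] - rhs[i]) for i in range(2))

worst = max(residual(0.25*k, 2.0, -1.5) for k in range(9))
print(chain, worst)
```

The residual stays at rounding level for any choice of the two constants, as expected of a general solution.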
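\[ (A - 3I)(A - 3I)\vec{v}_2 = \vec{0}, \qquad \text{or} \qquad (A - 3I)^2 \vec{v}_2 = \vec{0}. \]
\[ \vec{x}_1 = \vec{v}_1 e^{\lambda t}, \]
\[ \vec{x}_2 = (\vec{v}_2 + \vec{v}_1 t)\, e^{\lambda t}. \]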
\[ (A - 2I)\vec{v}_2 = \vec{v}_1, \]
or
\[ \begin{bmatrix} 0 & -5 & 0 \\ 0 & 0 & 0 \\ -1 & 4 & -1 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}, \]
where we used $a$, $b$, $c$ as components of $\vec{v}_2$ for simplicity. The first equation says $-5b = 1$ so $b = -1/5$. The second equation says nothing. The last equation is $-a + 4b - c = -1$, or
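\[ (A - \lambda I)^k \vec{v} = \vec{0}, \qquad \text{but} \qquad (A - \lambda I)^{k-1} \vec{v} \neq \vec{0}. \]
Such vectors are called generalized eigenvectors (then $\vec{v}_1 = (A - \lambda I)^{k-1} \vec{v}$ is an eigenvector). For the eigenvector $\vec{v}_1$ there is a chain of generalized eigenvectors $\vec{v}_2$ through $\vec{v}_k$ such that:
\[ \begin{aligned} (A - \lambda I)\vec{v}_1 &= \vec{0}, \\ (A - \lambda I)\vec{v}_2 &= \vec{v}_1, \\ &\;\;\vdots \\ (A - \lambda I)\vec{v}_k &= \vec{v}_{k-1}. \end{aligned} \]
Really once you find the $\vec{v}_k$ such that $(A - \lambda I)^k \vec{v}_k = \vec{0}$ but $(A - \lambda I)^{k-1} \vec{v}_k \neq \vec{0}$, you find the entire chain since you can compute the rest, $\vec{v}_{k-1} = (A - \lambda I)\vec{v}_k$, $\vec{v}_{k-2} = (A - \lambda I)\vec{v}_{k-1}$, etc.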
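We form the linearly independent solutions
\[ \begin{aligned} \vec{x}_1 &= \vec{v}_1 e^{\lambda t}, \\ \vec{x}_2 &= (\vec{v}_2 + \vec{v}_1 t)\, e^{\lambda t}, \\ &\;\;\vdots \\ \vec{x}_k &= \left( \vec{v}_k + \vec{v}_{k-1} t + \vec{v}_{k-2} \frac{t^2}{2} + \cdots + \vec{v}_2 \frac{t^{k-2}}{(k-2)!} + \vec{v}_1 \frac{t^{k-1}}{(k-1)!} \right) e^{\lambda t}. \end{aligned} \]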
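If on the other hand $A$ has an eigenvalue $\lambda$ of algebraic multiplicity 3 and defect 1, then solve
\[ (A - \lambda I)\vec{v}_1 = \vec{0}, \qquad (A - \lambda I)\vec{v}_2 = \vec{0}, \qquad (A - \lambda I)\vec{v}_3 = \vec{v}_2. \]
Here $\vec{v}_1$ and $\vec{v}_2$ are actual honest eigenvectors, and $\vec{v}_3$ is a generalized eigenvector. So there are two chains. To solve, first find a $\vec{v}_3$ such that $(A - \lambda I)^2 \vec{v}_3 = \vec{0}$, but $(A - \lambda I)\vec{v}_3 \neq \vec{0}$. Then $\vec{v}_2 = (A - \lambda I)\vec{v}_3$ is going to be an eigenvector. Then solve for an eigenvector $\vec{v}_1$ that is linearly independent from $\vec{v}_2$. You get 3 linearly independent solutions
\[ \vec{x}_1 = \vec{v}_1 e^{\lambda t}, \qquad \vec{x}_2 = \vec{v}_2 e^{\lambda t}, \qquad \vec{x}_3 = (\vec{v}_3 + \vec{v}_2 t)\, e^{\lambda t}. \]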
3.7.3 Exercises
⇥5 3
⇤
Exercise 3.7.2: Let A ⇤ 3 1 . Find the general solution of xÆ0 ⇤ A xÆ.
h 5 4 4
i
Exercise 3.7.3: Let A ⇤ 0 3 0 .
2 4 1
c) Find the general solution of $\vec{x}' = A\vec{x}$ in two different ways and verify you get the same answer.
h 0 1 2
i
Exercise 3.7.5: Let A ⇤ 1 2 2 .
4 4 7
Exercise 3.7.8: Suppose that $A$ is a $2 \times 2$ matrix with a repeated eigenvalue $\lambda$. Suppose that there are two linearly independent eigenvectors. Show that $A = \lambda I$.
Exercise 3.7.101: Let $A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$.
Exercise 3.7.104: Let A ⇤ [ ba ac ], where⇥ a,⇤ b, and c are unknowns. Suppose that 5 is a doubled
eigenvalue of defect �, and suppose that 10 is a corresponding eigenvector. Find A and show that
there is only one such matrix A.
�.�. MATRIX EXPONENTIALS 169
3.8.1 Definition
There is another way of finding a fundamental matrix solution of a system. Consider the constant coefficient equation
\[ \vec{x}' = P\vec{x}. \]
If this were just one equation (when $P$ is a number or a $1 \times 1$ matrix), then the solution would be
\[ \vec{x} = e^{Pt}. \]
That doesn't make sense if $P$ is a larger matrix, but essentially the same computation that led to the above works for matrices when we define $e^{Pt}$ properly. First let us write down the Taylor series for $e^{at}$ for some number $a$:
\[ e^{at} = 1 + at + \frac{(at)^2}{2} + \frac{(at)^3}{6} + \cdots = \sum_{k=0}^{\infty} \frac{(at)^k}{k!}. \]
Differentiating this series term by term gives $\frac{d}{dt} e^{at} = a e^{at}$.
Maybe we can try the same trick with matrices. For an $n \times n$ matrix $A$ we define the matrix exponential as
\[ e^A \overset{\text{def}}{=} I + A + \frac{1}{2} A^2 + \frac{1}{6} A^3 + \cdots + \frac{1}{k!} A^k + \cdots \]
Let us not worry about convergence. The series really does always converge. We usually write $Pt$ as $tP$ by convention when $P$ is a matrix. With this small change and by the exact same calculation as above we have that
\[ \frac{d}{dt} \left( e^{tP} \right) = P e^{tP}. \]
Now $P$ and hence $e^{tP}$ is an $n \times n$ matrix. What we are looking for is a vector. In the $1 \times 1$ case we would at this point multiply by an arbitrary constant to get the general solution. In the matrix case we multiply by a column vector $\vec{c}$.
Theorem 3.8.1. Let $P$ be an $n \times n$ matrix. Then the general solution to $\vec{x}' = P\vec{x}$ is
\[ \vec{x} = e^{tP} \vec{c}, \]
where $\vec{c}$ is an arbitrary constant vector.
Let us check:
\[ \frac{d}{dt} \vec{x} = \frac{d}{dt} \left( e^{tP} \vec{c} \right) = P e^{tP} \vec{c} = P\vec{x}. \]
Hence $e^{tP}$ is a fundamental matrix solution of the homogeneous system. So if we can compute the matrix exponential, we have another method of solving constant coefficient homogeneous systems. It also makes it easy to solve for initial conditions. To solve $\vec{x}' = A\vec{x}$, $\vec{x}(0) = \vec{b}$, we take the solution
\[ \vec{x} = e^{tA} \vec{b}. \]
This equation follows because $e^{0A} = I$, so $\vec{x}(0) = e^{0A} \vec{b} = \vec{b}$.
We mention a drawback of matrix exponentials. In general $e^{A+B} \neq e^A e^B$. The trouble is that matrices do not commute, that is, in general $AB \neq BA$. If you try to prove $e^{A+B} = e^A e^B$ using the Taylor series, you will see why the lack of commutativity becomes a problem. However, it is still true that if $AB = BA$, that is, if $A$ and $B$ commute, then $e^{A+B} = e^A e^B$. We will find this fact useful. Let us restate this as a theorem to make a point.
Theorem 3.8.2. If $AB = BA$, then $e^{A+B} = e^A e^B$. Otherwise, $e^{A+B} \neq e^A e^B$ in general.
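A small numerical experiment illustrates the theorem. The sketch below sums the defining series directly for $2 \times 2$ matrices (standard library only); the matrices are illustrative choices and not taken from the text. The first pair does not commute, while the second pair does because one of the matrices is a multiple of the identity.

```python
import math

def matmul(X, Y):
    return [[sum(X[i][k]*Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matadd(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def expm(M, terms=40):
    """Partial sum I + M + M^2/2! + ... of the defining series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in matmul(term, M)]
        result = matadd(result, term)
    return result

def max_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

# AB != BA: the two sides differ.
A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0, 0.0], [1.0, 0.0]]
gap_noncommuting = max_diff(expm(matadd(A, B)), matmul(expm(A), expm(B)))

# CD = DC (D is a multiple of I): the two sides agree.
C = [[0.1, 0.2], [0.2, 0.1]]
D = [[0.3, 0.0], [0.0, 0.3]]
gap_commuting = max_diff(expm(matadd(C, D)), matmul(expm(C), expm(D)))

print(gap_noncommuting, gap_commuting)
```

A truncated series is used only because the matrices are small; it is not a recommended general-purpose way to compute the exponential.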
We found a fundamental matrix solution for the system $\vec{x}' = A\vec{x}$. Note that this matrix has a repeated eigenvalue with a defect; there is only one eigenvector for the eigenvalue 3. So we found a perhaps easier way to handle this case. In fact, if a matrix $A$ is $2 \times 2$ and has an eigenvalue $\lambda$ of multiplicity 2, then either $A = \lambda I$, or $A = \lambda I + B$ where $B^2 = 0$. This is a good exercise.
Exercise 3.8.1: Suppose that $A$ is $2 \times 2$ and $\lambda$ is the only eigenvalue. Show that $(A - \lambda I)^2 = 0$, and therefore that we can write $A = \lambda I + B$, where $B^2 = 0$ (and possibly $B = 0$). Hint: First write down what it means for the eigenvalue to be of multiplicity 2. You will get an equation for the entries. Now compute the square of $B$.
Matrices $B$ such that $B^k = 0$ for some $k$ are called nilpotent. Computation of the matrix exponential for nilpotent matrices is easy by just writing down the first $k$ terms of the Taylor series.
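As a worked illustration of the nilpotent case, take the matrix $A = \begin{bmatrix} 3 & 1 \\ 0 & 3 \end{bmatrix}$, chosen here only as an example (it is the coefficient matrix of the system $x_1' = 3x_1 + x_2$, $x_2' = 3x_2$). Since $3tI$ commutes with everything, Theorem 3.8.2 applies, and the series for $e^{tB}$ stops after two terms:

```latex
\[
  A = 3I + B, \qquad
  B = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad
  B^2 = 0,
\]
\[
  e^{tA} = e^{3tI + tB} = e^{3tI} e^{tB}
         = e^{3t} \left( I + tB \right)
         = \begin{bmatrix} e^{3t} & t e^{3t} \\ 0 & e^{3t} \end{bmatrix}.
\]
```

Multiplying by a constant vector $(c_1, c_2)$ gives $x_1 = c_1 e^{3t} + c_2 t e^{3t}$ and $x_2 = c_2 e^{3t}$, which has the same form as the solution obtained with generalized eigenvectors.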
\[ \begin{aligned} e^{BAB^{-1}} &= I + BAB^{-1} + \frac{1}{2} (BAB^{-1})^2 + \frac{1}{6} (BAB^{-1})^3 + \cdots \\ &= BB^{-1} + BAB^{-1} + \frac{1}{2} BA^2B^{-1} + \frac{1}{6} BA^3B^{-1} + \cdots \\ &= B \left( I + A + \frac{1}{2} A^2 + \frac{1}{6} A^3 + \cdots \right) B^{-1} \\ &= B e^A B^{-1}. \end{aligned} \]
Given a square matrix $A$, we can usually write $A = EDE^{-1}$, where $D$ is diagonal and $E$ invertible. This procedure is called diagonalization. If we can do that, the computation of the exponential becomes easy as $e^D$ is just taking the exponential of the entries on the diagonal. Adding $t$ into the mix, we can then compute the exponential
\[ e^{tA} = E e^{tD} E^{-1}. \]
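\[ D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}. \]
We compute
\[ AE = A [\, \vec{v}_1 \;\; \vec{v}_2 \;\; \cdots \;\; \vec{v}_n \,] = [\, A\vec{v}_1 \;\; A\vec{v}_2 \;\; \cdots \;\; A\vec{v}_n \,] = [\, \lambda_1\vec{v}_1 \;\; \lambda_2\vec{v}_2 \;\; \cdots \;\; \lambda_n\vec{v}_n \,] = ED. \]
The columns of $E$ are linearly independent as these are linearly independent eigenvectors of $A$. Hence $E$ is invertible. Since $AE = ED$, we multiply on the right by $E^{-1}$ and we get
\[ A = EDE^{-1}. \]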
\[ e^{tA} = E e^{tD} E^{-1} = E \begin{bmatrix} e^{\lambda_1 t} & 0 & \cdots & 0 \\ 0 & e^{\lambda_2 t} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & e^{\lambda_n t} \end{bmatrix} E^{-1}. \tag{3.5} \]
The formula (3.5), therefore, gives the formula for computing a fundamental matrix solution $e^{tA}$ for the system $\vec{x}' = A\vec{x}$, in the case where we have $n$ linearly independent eigenvectors.
This computation still works when the eigenvalues and eigenvectors are complex,
though then you have to compute with complex numbers. It is clear from the definition
that if A is real, then e tA is real. So you will only need complex numbers in the computation
and not for the result. You may need to apply Euler’s formula to simplify the result. If
simplified properly, the final matrix will not have any complex numbers in it.
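Example 3.8.1: Compute a fundamental matrix solution using the matrix exponential for the system
\[ \begin{bmatrix} x \\ y \end{bmatrix}' = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}. \]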
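Then compute the particular solution for the initial conditions $x(0) = 4$ and $y(0) = 2$.
Let $A$ be the coefficient matrix $\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$. We first compute (exercise) that the eigenvalues are $3$ and $-1$ and corresponding eigenvectors are $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$. Hence the diagonalization of $A$ is
\[ \underbrace{\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}}_{A} = \underbrace{\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}}_{E} \underbrace{\begin{bmatrix} 3 & 0 \\ 0 & -1 \end{bmatrix}}_{D} \underbrace{\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}^{-1}}_{E^{-1}}. \]
We write
\[ \begin{aligned} e^{tA} = E e^{tD} E^{-1} &= \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} e^{3t} & 0 \\ 0 & e^{-t} \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}^{-1} \\ &= \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} e^{3t} & 0 \\ 0 & e^{-t} \end{bmatrix} \frac{-1}{2} \begin{bmatrix} -1 & -1 \\ -1 & 1 \end{bmatrix} \\ &= \frac{-1}{2} \begin{bmatrix} e^{3t} & e^{-t} \\ e^{3t} & -e^{-t} \end{bmatrix} \begin{bmatrix} -1 & -1 \\ -1 & 1 \end{bmatrix} \\ &= \frac{-1}{2} \begin{bmatrix} -e^{3t} - e^{-t} & -e^{3t} + e^{-t} \\ -e^{3t} + e^{-t} & -e^{3t} - e^{-t} \end{bmatrix} = \begin{bmatrix} \frac{e^{3t} + e^{-t}}{2} & \frac{e^{3t} - e^{-t}}{2} \\ \frac{e^{3t} - e^{-t}}{2} & \frac{e^{3t} + e^{-t}}{2} \end{bmatrix}. \end{aligned} \]
The initial conditions are $x(0) = 4$ and $y(0) = 2$. Hence, by the property that $e^{0A} = I$ we find that the particular solution we are looking for is $e^{tA}\vec{b}$ where $\vec{b}$ is $\begin{bmatrix} 4 \\ 2 \end{bmatrix}$. Then the particular solution we are looking for is
\[ \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \frac{e^{3t} + e^{-t}}{2} & \frac{e^{3t} - e^{-t}}{2} \\ \frac{e^{3t} - e^{-t}}{2} & \frac{e^{3t} + e^{-t}}{2} \end{bmatrix} \begin{bmatrix} 4 \\ 2 \end{bmatrix} = \begin{bmatrix} 2e^{3t} + 2e^{-t} + e^{3t} - e^{-t} \\ 2e^{3t} - 2e^{-t} + e^{3t} + e^{-t} \end{bmatrix} = \begin{bmatrix} 3e^{3t} + e^{-t} \\ 3e^{3t} - e^{-t} \end{bmatrix}. \]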
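The diagonalization recipe of this example can be checked numerically. The sketch below (standard library only, $2 \times 2$ case, illustrative names) forms $E e^{tD} E^{-1}$ from the eigenvalues $3$, $-1$ and eigenvectors $(1, 1)$, $(1, -1)$, and applies it to $\vec{b} = (4, 2)$.

```python
import math

E = [[1.0, 1.0], [1.0, -1.0]]       # columns are the eigenvectors
E_inv = [[0.5, 0.5], [0.5, -0.5]]   # inverse of E
eigenvalues = [3.0, -1.0]

def matmul(X, Y):
    return [[sum(X[i][k]*Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def exp_tA(t):
    """e^{tA} = E e^{tD} E^{-1} with D = diag(3, -1)."""
    exp_tD = [[math.exp(eigenvalues[0]*t), 0.0],
              [0.0, math.exp(eigenvalues[1]*t)]]
    return matmul(matmul(E, exp_tD), E_inv)

def solution_at(t, b):
    """x(t) = e^{tA} b."""
    M = exp_tA(t)
    return [M[0][0]*b[0] + M[0][1]*b[1], M[1][0]*b[0] + M[1][1]*b[1]]

x, y = solution_at(0.5, [4.0, 2.0])
print(x, 3*math.exp(1.5) + math.exp(-0.5))
print(y, 3*math.exp(1.5) - math.exp(-0.5))
```

Each printed pair should agree, matching the closed form $x = 3e^{3t} + e^{-t}$, $y = 3e^{3t} - e^{-t}$ at $t = 0.5$.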
\[ e^{tA} = X(t) \left[ X(0) \right]^{-1}. \]
Clearly, if we plug $t = 0$ into $X(t) \left[ X(0) \right]^{-1}$ we get the identity. We can multiply a fundamental matrix solution on the right by any constant invertible matrix and we still get a fundamental matrix solution. All we are doing is changing what the arbitrary constants are in the general solution $\vec{x}(t) = X(t)\vec{c}$.
3.8.5 Approximations
If you think about it, the computation of any fundamental matrix solution $X$ using the eigenvalue method is just as difficult as the computation of $e^{tA}$. So perhaps we did not gain much by this new tool. However, the Taylor series expansion actually gives us a way to approximate solutions, which the eigenvalue method did not.
The simplest thing we can do is to just compute the series up to a certain number of terms. There are better ways to approximate the exponential. In many cases however, few terms of the Taylor series give a reasonable approximation for the exponential and may suffice for the application. For example, let us compute the first 4 terms of the series for the matrix $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$.
\[ \begin{aligned} e^{tA} \approx I + tA + \frac{t^2}{2} A^2 + \frac{t^3}{6} A^3 &= I + t \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} + t^2 \begin{bmatrix} \frac{5}{2} & 2 \\ 2 & \frac{5}{2} \end{bmatrix} + t^3 \begin{bmatrix} \frac{13}{6} & \frac{7}{3} \\ \frac{7}{3} & \frac{13}{6} \end{bmatrix} \\ &= \begin{bmatrix} 1 + t + \frac{5}{2} t^2 + \frac{13}{6} t^3 & 2t + 2t^2 + \frac{7}{3} t^3 \\ 2t + 2t^2 + \frac{7}{3} t^3 & 1 + t + \frac{5}{2} t^2 + \frac{13}{6} t^3 \end{bmatrix}. \end{aligned} \]
Just like the scalar version of the Taylor series approximation, the approximation will be better for small $t$ and worse for larger $t$. For larger $t$, we will generally have to compute more terms. Let us see how we stack up against the real solution with $t = 0.1$. The approximation is (rounded to 8 decimal places)
\[ e^{0.1\,A} \approx I + 0.1\,A + \frac{0.1^2}{2} A^2 + \frac{0.1^3}{6} A^3 = \begin{bmatrix} 1.12716667 & 0.22233333 \\ 0.22233333 & 1.12716667 \end{bmatrix}. \]
And plugging $t = 0.1$ into the real solution (rounded to 8 decimal places) we get
\[ e^{0.1\,A} = \begin{bmatrix} 1.12734811 & 0.22251069 \\ 0.22251069 & 1.12734811 \end{bmatrix}. \]
Not bad at all! Although if we take the same approximation for $t = 1$ we get
\[ I + A + \frac{1}{2} A^2 + \frac{1}{6} A^3 = \begin{bmatrix} 6.66666667 & 6.33333333 \\ 6.33333333 & 6.66666667 \end{bmatrix}, \]
while the real solution (rounded to 8 decimal places) is
\[ e^{A} = \begin{bmatrix} 10.22670818 & 9.85882874 \\ 9.85882874 & 10.22670818 \end{bmatrix}. \]
So the approximation is not very good once we get up to $t = 1$. To get a good approximation at $t = 1$ (say up to 2 decimal places) we would need to go up to the 11th power (exercise).
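The comparison between the truncated series and the exact exponential is easy to reproduce. The following sketch (standard library only; function names are illustrative) sums the series for $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$ up to a chosen power and compares against the closed form with entries $\frac{e^{3t} \pm e^{-t}}{2}$ obtained by diagonalization.

```python
import math

A = [[1.0, 2.0], [2.0, 1.0]]

def matmul(X, Y):
    return [[sum(X[i][k]*Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def taylor_exp(t, highest_power):
    """I + tA + ... + (tA)^N / N! with N = highest_power."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, highest_power + 1):
        term = [[t*entry/k for entry in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

def exact_exp(t):
    """Closed form of e^{tA} for this particular A."""
    p = (math.exp(3*t) + math.exp(-t)) / 2
    q = (math.exp(3*t) - math.exp(-t)) / 2
    return [[p, q], [q, p]]

for t in (0.1, 1.0):
    approx, exact = taylor_exp(t, 3), exact_exp(t)
    print(t, [round(v, 8) for v in approx[0]], [round(v, 8) for v in exact[0]])
```

Raising `highest_power` in the call shows how many terms are needed before the two rows agree to a given number of decimal places at $t = 1$.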
C. Moler and C.F. Van Loan, Nineteen Dubious Ways to Compute the Exponential of a Matrix, Twenty-Five
Years Later, SIAM Review 45 (1), 2003, 3–49
3.8.6 Exercises
Exercise 3.8.2: Using the matrix exponential, find a fundamental matrix solution for the system
x 0 ⇤ 3x + y, y 0 ⇤ x + 3y.
Exercise 3.8.3: Find $e^{tA}$ for the matrix $A = \begin{bmatrix} 2 & 3 \\ 0 & 2 \end{bmatrix}$.
Exercise 3.8.4: Find a fundamental matrix solution for the system x10 ⇤ 7x1 + 4x 2 + 12x3 ,
h 0
i
x20 ⇤ x1 + 2x 2 + x 3 , x 30 ⇤ 3x1 2x2 5x3 . Then find the solution that satisfies xÆ(0) ⇤ 1 .
2
Exercise 3.8.5: Compute the matrix exponential $e^A$ for $A = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$.
Exercise 3.8.6 (challenging): Suppose $AB = BA$. Show that under this assumption, $e^{A+B} = e^A e^B$.
Exercise 3.8.7: Use Exercise �.�.� to show that $(e^A)^{-1} = e^{-A}$. In particular this means that $e^A$ is invertible even if $A$ is not.
Exercise
⇥ 1 ⇤ ⇥ 0 ⇤ 3.8.8: Let A be a 2 ⇥ 2 matrix with eigenvalues 1, 1, and corresponding eigenvectors
1 , 1 .
Exercise 3.8.12: For any positive integer n, find a formula �or a recipe� for A n for the following
matrices�
3 0 5 2 0 1 2 1
a� b� c� d�
0 9 4 7 0 0 0 2
⇥ 1 2
⇤
Exercise 3.8.101: Compute e tA where A ⇤ 2 1 .
h 1 32
i
Exercise 3.8.102: Compute e tA where A ⇤ 2 1 2 .
1 34
Exercise 3.8.103:
⇥3 1
⇤ ⇥1⇤
a� Compute e tA where A ⇤ 1 1 . b� Solve xÆ 0 ⇤ A xÆ for xÆ(0) ⇤ 2 .
Exercise 3.8.104:
⇥ Compute
⇤ the first � terms �up to the second degree� of the Taylor expansion of
e tA where A ⇤ 22 32 �Write as a single matrix�. Then use it to approximate e 0.1A .
Exercise 3.8.105: For any positive integer n, find a formula �or a recipe� for A n for the following
matrices�
7 4 3 4 0 1
a� b� c�
5 2 6 7 1 0
�.�. NONHOMOGENEOUS SYSTEMS 177
where $A$ is a constant matrix. The first method we look at is the integrating factor method. For simplicity we rewrite the equation as
\[ \vec{x}'(t) + P\vec{x}(t) = \vec{f}(t), \]
where $P = -A$. We multiply both sides of the equation by $e^{tP}$ (being mindful that we are dealing with matrices that may not commute) to obtain
\[ e^{tP}\vec{x}'(t) + e^{tP}P\vec{x}(t) = e^{tP}\vec{f}(t). \]
We notice that $Pe^{tP} = e^{tP}P$. This fact follows by writing down the series definition of $e^{tP}$:
\[ Pe^{tP} = P \left( I + tP + \frac{1}{2} (tP)^2 + \cdots \right) = P + tP^2 + \frac{1}{2} t^2 P^3 + \cdots = \left( I + tP + \frac{1}{2} (tP)^2 + \cdots \right) P = e^{tP}P. \]
So $\frac{d}{dt} \left( e^{tP} \right) = Pe^{tP} = e^{tP}P$. The product rule says
\[ \frac{d}{dt} \left( e^{tP}\vec{x}(t) \right) = e^{tP}\vec{x}'(t) + e^{tP}P\vec{x}(t), \]
and so
\[ \frac{d}{dt} \left( e^{tP}\vec{x}(t) \right) = e^{tP}\vec{f}(t). \]
We can now integrate. That is, we integrate each component of the vector separately
\[ e^{tP}\vec{x}(t) = \int e^{tP}\vec{f}(t) \, dt + \vec{c}. \]
Recall from Exercise 3.8.7 that $(e^{tP})^{-1} = e^{-tP}$. Therefore, we obtain
\[ \vec{x}(t) = e^{-tP} \int e^{tP}\vec{f}(t) \, dt + e^{-tP}\vec{c}. \]
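Perhaps it is better understood as a definite integral. In this case it will be easy to also solve for the initial conditions. Consider the equation with initial conditions
\[ \vec{x}'(t) + P\vec{x}(t) = \vec{f}(t), \qquad \vec{x}(0) = \vec{b}. \]
The solution can then be written as
\[ \vec{x}(t) = e^{-tP} \int_0^t e^{sP}\vec{f}(s) \, ds + e^{-tP}\vec{b}. \tag{3.6} \]
Again, the integration means that each component of the vector $e^{sP}\vec{f}(s)$ is integrated separately. It is not hard to see that (3.6) really does satisfy the initial condition $\vec{x}(0) = \vec{b}$.
\[ \vec{x}(0) = e^{-0P} \int_0^0 e^{sP}\vec{f}(s) \, ds + e^{-0P}\vec{b} = I\vec{b} = \vec{b}. \]
\[ \begin{aligned} x_1' + 5x_1 - 3x_2 &= e^t, \\ x_2' + 3x_1 - x_2 &= 0, \end{aligned} \]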
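Then
\[ \begin{aligned} \vec{x}(t) &= e^{-tP} \int_0^t e^{sP}\vec{f}(s) \, ds + e^{-tP}\vec{b} \\ &= \begin{bmatrix} (1-3t)\,e^{-2t} & 3te^{-2t} \\ -3te^{-2t} & (1+3t)\,e^{-2t} \end{bmatrix} \begin{bmatrix} te^{3t} \\ \frac{(3t-1)\,e^{3t} + 1}{3} \end{bmatrix} + \begin{bmatrix} (1-3t)\,e^{-2t} & 3te^{-2t} \\ -3te^{-2t} & (1+3t)\,e^{-2t} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} \\ &= \begin{bmatrix} te^{-2t} \\ -\frac{e^t}{3} + \left( \frac{1}{3} + t \right) e^{-2t} \end{bmatrix} + \begin{bmatrix} (1-3t)\,e^{-2t} \\ -3te^{-2t} \end{bmatrix} \\ &= \begin{bmatrix} (1-2t)\,e^{-2t} \\ -\frac{e^t}{3} + \left( \frac{1}{3} - 2t \right) e^{-2t} \end{bmatrix}. \end{aligned} \]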
Phew!
Let us check that this really works.
\[ x_1' + 5x_1 - 3x_2 = (4te^{-2t} - 4e^{-2t}) + 5(1-2t)\,e^{-2t} + e^t - (1-6t)\,e^{-2t} = e^t. \]
Similarly (exercise) $x_2' + 3x_1 - x_2 = 0$. The initial conditions are also satisfied (exercise).
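A numerical spot check of both equations is sketched below (standard library only). It takes $x_1 = (1-2t)e^{-2t}$ and $x_2 = -\frac{e^t}{3} + \bigl(\frac{1}{3} - 2t\bigr)e^{-2t}$ as the solution of the example, and approximates the derivatives with central differences, so the residuals are only expected to be small, not exactly zero. The step size `h` is an arbitrary choice.

```python
import math

def x1(t):
    return (1 - 2*t) * math.exp(-2*t)

def x2(t):
    return -math.exp(t)/3 + (1/3 - 2*t) * math.exp(-2*t)

def derivative(f, t, h=1e-5):
    """Central difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2*h)

def residuals(t):
    """Absolute residuals of x1' + 5x1 - 3x2 = e^t and x2' + 3x1 - x2 = 0."""
    r1 = derivative(x1, t) + 5*x1(t) - 3*x2(t) - math.exp(t)
    r2 = derivative(x2, t) + 3*x1(t) - x2(t)
    return abs(r1), abs(r2)

worst = max(max(residuals(0.2*k)) for k in range(11))
print("largest residual on [0, 2]:", worst)
print("x1(0), x2(0) =", x1(0.0), x2(0.0))
```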
For systems, the integrating factor method only works if $P$ does not depend on $t$, that is, $P$ is constant. The problem is that in general
\[ \frac{d}{dt} \left[ e^{\int P(t) \, dt} \right] \neq P(t) \, e^{\int P(t) \, dt}, \]
because matrix multiplication is not commutative.
Eigenvector decomposition
For the next method, note that eigenvectors of a matrix give the directions in which the
matrix acts like a scalar. If we solve the system along these directions, the computations
are simpler as we treat the matrix as a scalar. We then put those solutions together to get
the general solution for the system.
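Take the equation
\[ \vec{x}'(t) = A\vec{x}(t) + \vec{f}(t). \tag{3.7} \]
Assume $A$ has $n$ linearly independent eigenvectors $\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_n$. Write
\[ \vec{x}(t) = \vec{v}_1 \xi_1(t) + \vec{v}_2 \xi_2(t) + \cdots + \vec{v}_n \xi_n(t). \tag{3.8} \]
Similarly, we decompose $\vec{f}$ in terms of the eigenvectors:
\[ \vec{f}(t) = \vec{v}_1 g_1(t) + \vec{v}_2 g_2(t) + \cdots + \vec{v}_n g_n(t). \tag{3.9} \]
That is, we wish to find $g_1$ through $g_n$ that satisfy (3.9). Since all the eigenvectors are independent, the matrix $E = [\, \vec{v}_1 \;\; \vec{v}_2 \;\; \cdots \;\; \vec{v}_n \,]$ is invertible. Write the equation (3.9) as $\vec{f} = E\vec{g}$, where the components of $\vec{g}$ are the functions $g_1$ through $g_n$. Then $\vec{g} = E^{-1}\vec{f}$. Hence it is always possible to find $\vec{g}$ when there are $n$ linearly independent eigenvectors.
We plug (3.8) into (3.7), and note that $A\vec{v}_k = \lambda_k \vec{v}_k$: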
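\[ \begin{aligned} \overbrace{\vec{v}_1 \xi_1' + \vec{v}_2 \xi_2' + \cdots + \vec{v}_n \xi_n'}^{\vec{x}'} &= \overbrace{A \left( \vec{v}_1 \xi_1 + \vec{v}_2 \xi_2 + \cdots + \vec{v}_n \xi_n \right)}^{A\vec{x}} + \overbrace{\vec{v}_1 g_1 + \vec{v}_2 g_2 + \cdots + \vec{v}_n g_n}^{\vec{f}} \\ &= A\vec{v}_1 \xi_1 + A\vec{v}_2 \xi_2 + \cdots + A\vec{v}_n \xi_n + \vec{v}_1 g_1 + \vec{v}_2 g_2 + \cdots + \vec{v}_n g_n \\ &= \vec{v}_1 \lambda_1 \xi_1 + \vec{v}_2 \lambda_2 \xi_2 + \cdots + \vec{v}_n \lambda_n \xi_n + \vec{v}_1 g_1 + \vec{v}_2 g_2 + \cdots + \vec{v}_n g_n \\ &= \vec{v}_1 (\lambda_1 \xi_1 + g_1) + \vec{v}_2 (\lambda_2 \xi_2 + g_2) + \cdots + \vec{v}_n (\lambda_n \xi_n + g_n). \end{aligned} \]
If we identify the coefficients of the vectors $\vec{v}_1$ through $\vec{v}_n$, we get the equations
\[ \begin{aligned} \xi_1' &= \lambda_1 \xi_1 + g_1, \\ \xi_2' &= \lambda_2 \xi_2 + g_2, \\ &\;\;\vdots \\ \xi_n' &= \lambda_n \xi_n + g_n. \end{aligned} \]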
Each one of these equations is independent of the others. They are all linear first order equations and can easily be solved by the standard integrating factor method for single equations. That is, for the $k$th equation we write
\[ \frac{d}{dt} \left[ \xi_k(t) \, e^{-\lambda_k t} \right] = e^{-\lambda_k t} g_k(t). \]
We integrate and solve for $\xi_k$ to get
\[ \xi_k(t) = e^{\lambda_k t} \int e^{-\lambda_k t} g_k(t) \, dt + C_k e^{\lambda_k t}. \]
If we are looking for just any particular solution, we can set $C_k$ to be zero. If we leave these constants in, we get the general solution. Write $\vec{x}(t) = \vec{v}_1 \xi_1(t) + \vec{v}_2 \xi_2(t) + \cdots + \vec{v}_n \xi_n(t)$, and we are done.
As always, it is perhaps better to write these integrals as definite integrals. Suppose that we have an initial condition $\vec{x}(0) = \vec{b}$. Take $\vec{a} = E^{-1}\vec{b}$ to find $\vec{b} = \vec{v}_1 a_1 + \vec{v}_2 a_2 + \cdots + \vec{v}_n a_n$, just like before. Then if we write
\[ \xi_k(t) = e^{\lambda_k t} \int_0^t e^{-\lambda_k s} g_k(s) \, ds + a_k e^{\lambda_k t}, \]
we get the particular solution $\vec{x}(t) = \vec{v}_1 \xi_1(t) + \vec{v}_2 \xi_2(t) + \cdots + \vec{v}_n \xi_n(t)$ satisfying $\vec{x}(0) = \vec{b}$, because $\xi_k(0) = a_k$.
Let us remark that the technique we just outlined is the eigenvalue method applied to nonhomogeneous systems. If a system is homogeneous, that is, if $\vec{f} = \vec{0}$, then the equations we get are $\xi_k' = \lambda_k \xi_k$, and so $\xi_k = C_k e^{\lambda_k t}$ are the solutions and that's precisely what we got in § 3.4.
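Example 3.9.2: Let $A = \begin{bmatrix} 1 & 3 \\ 3 & 1 \end{bmatrix}$. Solve $\vec{x}' = A\vec{x} + \vec{f}$ where $\vec{f}(t) = \begin{bmatrix} 2e^t \\ 2t \end{bmatrix}$ for $\vec{x}(0) = \begin{bmatrix} 3/16 \\ -5/16 \end{bmatrix}$.
The eigenvalues of $A$ are $-2$ and $4$ and corresponding eigenvectors are $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ respectively. This calculation is left as an exercise. We write down the matrix $E$ of the eigenvectors and compute its inverse (using the inverse formula for $2 \times 2$ matrices)
\[ E = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}, \qquad E^{-1} = \frac{1}{2} \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}. \]
We are looking for a solution of the form $\vec{x} = \begin{bmatrix} 1 \\ -1 \end{bmatrix} \xi_1 + \begin{bmatrix} 1 \\ 1 \end{bmatrix} \xi_2$. We first need to write $\vec{f}$ in terms of the eigenvectors. That is we wish to write $\vec{f} = \begin{bmatrix} 2e^t \\ 2t \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \end{bmatrix} g_1 + \begin{bmatrix} 1 \\ 1 \end{bmatrix} g_2$. Thus
\[ \begin{bmatrix} g_1 \\ g_2 \end{bmatrix} = E^{-1} \begin{bmatrix} 2e^t \\ 2t \end{bmatrix} = \frac{1}{2} \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 2e^t \\ 2t \end{bmatrix} = \begin{bmatrix} e^t - t \\ e^t + t \end{bmatrix}. \]
So $g_1 = e^t - t$ and $g_2 = e^t + t$.
We further need to write $\vec{x}(0)$ in terms of the eigenvectors. That is, we wish to write $\vec{x}(0) = \begin{bmatrix} 3/16 \\ -5/16 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \end{bmatrix} a_1 + \begin{bmatrix} 1 \\ 1 \end{bmatrix} a_2$. Hence
\[ \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = E^{-1} \begin{bmatrix} 3/16 \\ -5/16 \end{bmatrix} = \begin{bmatrix} 1/4 \\ -1/16 \end{bmatrix}. \]
So $a_1 = 1/4$ and $a_2 = -1/16$. We plug our $\vec{x}$ into the equation and get
\[ \begin{aligned} \overbrace{\begin{bmatrix} 1 \\ -1 \end{bmatrix} \xi_1' + \begin{bmatrix} 1 \\ 1 \end{bmatrix} \xi_2'}^{\vec{x}'} &= \overbrace{A \begin{bmatrix} 1 \\ -1 \end{bmatrix} \xi_1 + A \begin{bmatrix} 1 \\ 1 \end{bmatrix} \xi_2}^{A\vec{x}} + \overbrace{\begin{bmatrix} 1 \\ -1 \end{bmatrix} g_1 + \begin{bmatrix} 1 \\ 1 \end{bmatrix} g_2}^{\vec{f}} \\ &= \begin{bmatrix} 1 \\ -1 \end{bmatrix} (-2\xi_1) + \begin{bmatrix} 1 \\ 1 \end{bmatrix} 4\xi_2 + \begin{bmatrix} 1 \\ -1 \end{bmatrix} (e^t - t) + \begin{bmatrix} 1 \\ 1 \end{bmatrix} (e^t + t). \end{aligned} \]
We get the two equations
\[ \begin{aligned} \xi_1' &= -2\xi_1 + e^t - t, & \text{where } \xi_1(0) &= a_1 = \frac{1}{4}, \\ \xi_2' &= 4\xi_2 + e^t + t, & \text{where } \xi_2(0) &= a_2 = \frac{-1}{16}. \end{aligned} \]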
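The change of coordinates in this example is just multiplication by $E^{-1}$. The sketch below (standard library only; the helper name `coords` is made up for illustration) computes $\vec{a} = E^{-1}\vec{x}(0)$ exactly with fractions and evaluates $\vec{g}(t) = E^{-1}\vec{f}(t)$ numerically, assuming the eigenvectors $(1, -1)$ and $(1, 1)$ as the columns of $E$.

```python
from fractions import Fraction as Fr
import math

# Inverse of E = [[1, 1], [-1, 1]], whose columns are the eigenvectors.
E_inv = [[Fr(1, 2), Fr(-1, 2)], [Fr(1, 2), Fr(1, 2)]]

def coords(vec):
    """Coordinates of vec in the eigenvector basis, that is E^{-1} vec."""
    return [E_inv[i][0]*vec[0] + E_inv[i][1]*vec[1] for i in range(2)]

# Initial condition x(0) = (3/16, -5/16) in eigenvector coordinates.
a = coords([Fr(3, 16), Fr(-5, 16)])

def g(t):
    """Forcing f(t) = (2 e^t, 2t) in eigenvector coordinates."""
    return coords([2*math.exp(t), 2*t])

print(a)
print(g(1.0), [math.e - 1, math.e + 1])
```

The first line printed gives $a_1 = 1/4$ and $a_2 = -1/16$, and the second compares $\vec{g}(1)$ with $(e - 1, e + 1)$, that is with $g_1 = e^t - t$ and $g_2 = e^t + t$ at $t = 1$.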
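We solve with integrating factor. Computation of the integral is left as an exercise to the student. You will need integration by parts.
\[ \xi_1 = e^{-2t} \int e^{2t} (e^t - t) \, dt + C_1 e^{-2t} = \frac{e^t}{3} - \frac{t}{2} + \frac{1}{4} + C_1 e^{-2t}. \]
$C_1$ is the constant of integration. As $\xi_1(0) = 1/4$, then $1/4 = 1/3 + 1/4 + C_1$ and hence $C_1 = -1/3$. Similarly
\[ \xi_2 = e^{4t} \int e^{-4t} (e^t + t) \, dt + C_2 e^{4t} = -\frac{e^t}{3} - \frac{t}{4} - \frac{1}{16} + C_2 e^{4t}. \]
As $\xi_2(0) = -1/16$ we have $-1/16 = -1/3 - 1/16 + C_2$ and hence $C_2 = 1/3$. The solution is
\[ \vec{x}(t) = \begin{bmatrix} 1 \\ -1 \end{bmatrix} \underbrace{\left( \frac{e^t - e^{-2t}}{3} + \frac{1 - 2t}{4} \right)}_{\xi_1} + \begin{bmatrix} 1 \\ 1 \end{bmatrix} \underbrace{\left( \frac{e^{4t} - e^t}{3} - \frac{4t + 1}{16} \right)}_{\xi_2} = \begin{bmatrix} \frac{e^{4t} - e^{-2t}}{3} + \frac{3 - 12t}{16} \\ \frac{e^{-2t} + e^{4t} - 2e^t}{3} + \frac{4t - 5}{16} \end{bmatrix}. \]
That is, $x_1 = \frac{e^{4t} - e^{-2t}}{3} + \frac{3 - 12t}{16}$ and $x_2 = \frac{e^{-2t} + e^{4t} - 2e^t}{3} + \frac{4t - 5}{16}$.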
Exercise 3.9.1: Check that x1 and x2 solve the problem. Check both that they satisfy the differential
equation and that they satisfy the initial conditions.
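A numerical sanity check can complement the hand verification asked for in Exercise 3.9.1. The sketch below assumes the system $x_1' = x_1 + 3x_2 + 2e^t$, $x_2' = 3x_1 + x_2 + 2t$ from Example 3.9.2 and approximates the derivatives by central differences; the function names and the step size `h` are illustrative choices, not part of the text.

```python
import math

def x1(t):
    return (math.exp(4 * t) - math.exp(-2 * t)) / 3 + (3 - 12 * t) / 16

def x2(t):
    return (math.exp(-2 * t) + math.exp(4 * t) - 2 * math.exp(t)) / 3 + (4 * t - 5) / 16

def residual(t, h=1e-5):
    """Return x' - (A x + f) at time t, with x' from a central difference."""
    dx1 = (x1(t + h) - x1(t - h)) / (2 * h)
    dx2 = (x2(t + h) - x2(t - h)) / (2 * h)
    r1 = dx1 - (x1(t) + 3 * x2(t) + 2 * math.exp(t))
    r2 = dx2 - (3 * x1(t) + x2(t) + 2 * t)
    return r1, r2

if __name__ == "__main__":
    print("x(0) =", x1(0.0), x2(0.0))   # compare with 3/16 and -5/16
    for t in (0.0, 0.5, 1.0):
        print(t, residual(t))            # both residuals should be close to zero
```

A small residual is evidence, not a proof; the exercise still asks for the symbolic check.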
Undetermined coefficients
We also have the method of undetermined coefficients for systems. The only difference
here is that we have to use unknown vectors rather than just numbers. Same caveats apply
to undetermined coefficients for systems as for single equations. This method does not
always work. Furthermore, if the right-hand side is complicated, we have to solve for lots
of variables. Each element of an unknown vector is an unknown number. So in a system of
3 equations, if we have say 4 unknown vectors (this would not be uncommon), then we
already have 12 unknown numbers that we need to solve for. The method can turn into
a lot of tedious work if done by hand. As this method is essentially the same as it is for
single equations, let us just do an example.
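Example 3.9.3: Let $A = \begin{bmatrix} -1 & 0 \\ -2 & 1 \end{bmatrix}$. Find a particular solution of $\vec{x}' = A \vec{x} + \vec{f}$ where $\vec{f}(t) = \begin{bmatrix} e^t \\ t \end{bmatrix}$.
Note that we can solve this system in an easier way (can you see how?), but for the purposes of the example, let us use the eigenvalue method plus undetermined coefficients.
The eigenvalues of $A$ are $-1$ and $1$ and corresponding eigenvectors are $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ respectively. Hence our complementary solution is
\[ \vec{x}_c = \alpha_1 \begin{bmatrix} 1 \\ 1 \end{bmatrix} e^{-t} + \alpha_2 \begin{bmatrix} 0 \\ 1 \end{bmatrix} e^{t} , \]
for some arbitrary constants $\alpha_1$ and $\alpha_2$.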
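We would want to guess a particular solution of
\[ \vec{x} = \vec{a} e^t + \vec{b} t + \vec{c} . \]
However, a term of the form $\vec{a} e^t$ also appears in the complementary solution, so a conflict may arise. To be safe we keep $\vec{a} e^t$ and add a term $\vec{b} t e^t$ as well, and we try
\[ \vec{x} = \vec{a} e^t + \vec{b} t e^t + \vec{c} t + \vec{d} . \]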
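Thus we have 8 unknowns. We write $\vec{a} = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}$, $\vec{b} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$, $\vec{c} = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}$, and $\vec{d} = \begin{bmatrix} d_1 \\ d_2 \end{bmatrix}$. We plug $\vec{x}$ into the equation. First let us compute $\vec{x}'$.
\[ \vec{x}' = \bigl( \vec{a} + \vec{b} \bigr) e^t + \vec{b} t e^t + \vec{c} = \begin{bmatrix} a_1 + b_1 \\ a_2 + b_2 \end{bmatrix} e^t + \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} t e^t + \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} . \]
Now $\vec{x}'$ must equal $A \vec{x} + \vec{f}$, which is
\[
\begin{aligned}
A \vec{x} + \vec{f} & = A \vec{a} e^t + A \vec{b} t e^t + A \vec{c} t + A \vec{d} + \vec{f} \\
& = \begin{bmatrix} -a_1 \\ -2a_1 + a_2 \end{bmatrix} e^t + \begin{bmatrix} -b_1 \\ -2b_1 + b_2 \end{bmatrix} t e^t + \begin{bmatrix} -c_1 \\ -2c_1 + c_2 \end{bmatrix} t + \begin{bmatrix} -d_1 \\ -2d_1 + d_2 \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \end{bmatrix} e^t + \begin{bmatrix} 0 \\ 1 \end{bmatrix} t \\
& = \begin{bmatrix} -a_1 + 1 \\ -2a_1 + a_2 \end{bmatrix} e^t + \begin{bmatrix} -b_1 \\ -2b_1 + b_2 \end{bmatrix} t e^t + \begin{bmatrix} -c_1 \\ -2c_1 + c_2 + 1 \end{bmatrix} t + \begin{bmatrix} -d_1 \\ -2d_1 + d_2 \end{bmatrix} .
\end{aligned}
\]
We identify the coefficients of $e^t$, $t e^t$, $t$ and any constant vectors in $\vec{x}'$ and in $A \vec{x} + \vec{f}$ to find the equations:
\[
\begin{aligned}
a_1 + b_1 & = -a_1 + 1, & 0 & = -c_1, \\
a_2 + b_2 & = -2a_1 + a_2, & 0 & = -2c_1 + c_2 + 1, \\
b_1 & = -b_1, & c_1 & = -d_1, \\
b_2 & = -2b_1 + b_2, & c_2 & = -2d_1 + d_2 .
\end{aligned}
\]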
We could write the $8 \times 9$ augmented matrix and start row reduction, but it is easier to just solve the equations in an ad hoc manner. Immediately we see that $b_1 = 0$, $c_1 = 0$, $d_1 = 0$. Plugging these back in, we get that $c_2 = -1$ and $d_2 = -1$. The remaining equations that tell us something are
\[
\begin{aligned}
a_1 & = -a_1 + 1, \\
a_2 + b_2 & = -2a_1 + a_2 .
\end{aligned}
\]
So $a_1 = 1/2$ and $b_2 = -1$. Finally, $a_2$ can be arbitrary and still satisfy the equations. We are looking for just a single solution so presumably the simplest one is when $a_2 = 0$. Therefore,
\[ \vec{x} = \vec{a} e^t + \vec{b} t e^t + \vec{c} t + \vec{d} = \begin{bmatrix} 1/2 \\ 0 \end{bmatrix} e^t + \begin{bmatrix} 0 \\ -1 \end{bmatrix} t e^t + \begin{bmatrix} 0 \\ -1 \end{bmatrix} t + \begin{bmatrix} 0 \\ -1 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} e^t \\ -t e^t - t - 1 \end{bmatrix} . \]
That is, $x_1 = \frac{1}{2} e^t$, $x_2 = -t e^t - t - 1$. We would add this to the complementary solution to get the general solution of the problem. Notice that both $\vec{a} e^t$ and $\vec{b} t e^t$ were really needed.
Exercise 3.9.2: Check that $x_1$ and $x_2$ solve the problem. Try setting $a_2 = 1$ and check we get a solution as well. What is the difference between the two solutions we obtained (one with $a_2 = 0$ and one with $a_2 = 1$)?
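As a companion to Exercise 3.9.2, the following sketch checks numerically that the particular solution of Example 3.9.3 satisfies $x_1' = -x_1 + e^t$, $x_2' = -2x_1 + x_2 + t$ for any choice of $a_2$. The names `particular` and `residual` and the central-difference step are illustrative assumptions.

```python
import math

def particular(t, a2=0.0):
    """x = a e^t + b t e^t + c t + d with a = (1/2, a2), b = (0, -1), c = d = (0, -1)."""
    x1 = 0.5 * math.exp(t)
    x2 = a2 * math.exp(t) - t * math.exp(t) - t - 1
    return x1, x2

def residual(t, a2=0.0, h=1e-5):
    """Return x' - (A x + f) using a central difference for x'."""
    x1p, x2p = particular(t + h, a2)
    x1m, x2m = particular(t - h, a2)
    dx1 = (x1p - x1m) / (2 * h)
    dx2 = (x2p - x2m) / (2 * h)
    x1, x2 = particular(t, a2)
    return dx1 - (-x1 + math.exp(t)), dx2 - (-2 * x1 + x2 + t)

if __name__ == "__main__":
    for a2 in (0.0, 1.0):
        print(a2, [residual(t, a2) for t in (0.0, 1.0, 2.0)])
```

Comparing the two runs with the complementary solution is one way to approach the last question of the exercise.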
As you can see, other than the handling of conflicts, undetermined coefficients works
exactly the same as it did for single equations. However, the computations can get out of
hand pretty quickly for systems. The equation we considered was pretty simple.
Further, suppose we solved the associated homogeneous equation $\vec{x}' = A(t) \vec{x}$ and found a fundamental matrix solution $X(t)$. The general solution to the associated homogeneous equation is $X(t) \vec{c}$ for a constant vector $\vec{c}$. Just like for variation of parameters for a single equation, we try the solution to the nonhomogeneous equation of the form
\[ \vec{x}_p = X(t) \, \vec{u}(t) , \]
where $\vec{u}(t)$ is a vector-valued function instead of a constant. We substitute $\vec{x}_p$ into (3.10) to obtain
\[ \underbrace{X'(t) \, \vec{u}(t) + X(t) \, \vec{u}'(t)}_{\vec{x}_p'(t)} = \underbrace{A(t) \, X(t) \, \vec{u}(t)}_{A(t) \vec{x}_p(t)} + \vec{f}(t) . \]
But $X(t)$ is a fundamental matrix solution to the homogeneous problem. So $X'(t) = A(t) X(t)$, and
\[ \cancel{X'(t) \, \vec{u}(t)} + X(t) \, \vec{u}'(t) = \cancel{X'(t) \, \vec{u}(t)} + \vec{f}(t) . \]
Hence $X(t) \, \vec{u}'(t) = \vec{f}(t)$. If we compute $[X(t)]^{-1}$, then $\vec{u}'(t) = [X(t)]^{-1} \vec{f}(t)$. We integrate to obtain $\vec{u}$ and we have the particular solution $\vec{x}_p = X(t) \, \vec{u}(t)$. Let us write this as a formula
\[ \vec{x}_p = X(t) \int [X(t)]^{-1} \vec{f}(t) \, dt . \]
\[ \vec{x}_p = \vec{c} \cos(\omega t) , \]
Eigenvector decomposition
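\[
\begin{aligned}
\underbrace{\vec{v}_1 \xi_1'' + \vec{v}_2 \xi_2'' + \cdots + \vec{v}_n \xi_n''}_{\vec{x}''}
& = \underbrace{A \bigl( \vec{v}_1 \xi_1 + \vec{v}_2 \xi_2 + \cdots + \vec{v}_n \xi_n \bigr)}_{A \vec{x}} + \underbrace{\vec{v}_1 g_1 + \vec{v}_2 g_2 + \cdots + \vec{v}_n g_n}_{\vec{f}} \\
& = A \vec{v}_1 \xi_1 + A \vec{v}_2 \xi_2 + \cdots + A \vec{v}_n \xi_n + \vec{v}_1 g_1 + \vec{v}_2 g_2 + \cdots + \vec{v}_n g_n \\
& = \vec{v}_1 \lambda_1 \xi_1 + \vec{v}_2 \lambda_2 \xi_2 + \cdots + \vec{v}_n \lambda_n \xi_n + \vec{v}_1 g_1 + \vec{v}_2 g_2 + \cdots + \vec{v}_n g_n \\
& = \vec{v}_1 ( \lambda_1 \xi_1 + g_1 ) + \vec{v}_2 ( \lambda_2 \xi_2 + g_2 ) + \cdots + \vec{v}_n ( \lambda_n \xi_n + g_n ) .
\end{aligned}
\]
We identify the coefficients of the eigenvectors to get the equations
\[
\begin{aligned}
\xi_1'' & = \lambda_1 \xi_1 + g_1 , \\
\xi_2'' & = \lambda_2 \xi_2 + g_2 , \\
& \;\;\vdots \\
\xi_n'' & = \lambda_n \xi_n + g_n .
\end{aligned}
\]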
Each one of these equations is independent of the others. We solve each equation using the methods of chapter 2. We write $\vec{x}(t) = \vec{v}_1 \xi_1(t) + \vec{v}_2 \xi_2(t) + \cdots + \vec{v}_n \xi_n(t)$, and we are done; we have a particular solution. We find the general solutions for $\xi_1$ through $\xi_n$, and again $\vec{x}(t) = \vec{v}_1 \xi_1(t) + \vec{v}_2 \xi_2(t) + \cdots + \vec{v}_n \xi_n(t)$ is the general solution (and not just a particular solution).
Example 3.9.5: Let us do the example from § 3.6 using this method. The equation is
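\[ \vec{x}'' = \begin{bmatrix} -3 & 1 \\ 2 & -2 \end{bmatrix} \vec{x} + \begin{bmatrix} 0 \\ 2 \end{bmatrix} \cos(3t) . \]
The eigenvalues are $-1$ and $-4$, with eigenvectors $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$. Therefore $E = \begin{bmatrix} 1 & 1 \\ 2 & -1 \end{bmatrix}$ and $E^{-1} = \frac{1}{3} \begin{bmatrix} 1 & 1 \\ 2 & -1 \end{bmatrix}$. Therefore,
\[ \begin{bmatrix} g_1 \\ g_2 \end{bmatrix} = E^{-1} \vec{f}(t) = \frac{1}{3} \begin{bmatrix} 1 & 1 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} 0 \\ 2 \cos(3t) \end{bmatrix} = \begin{bmatrix} \frac{2}{3} \cos(3t) \\ -\frac{2}{3} \cos(3t) \end{bmatrix} . \]
So after the whole song and dance of plugging in, the equations we get are
\[ \xi_1'' = -\xi_1 + \frac{2}{3} \cos(3t) , \qquad \xi_2'' = -4 \xi_2 - \frac{2}{3} \cos(3t) . \]
For each equation we use the method of undetermined coefficients. We try $C_1 \cos(3t)$ for the first equation and $C_2 \cos(3t)$ for the second equation. We plug in to get
\[
\begin{aligned}
-9 C_1 \cos(3t) & = -C_1 \cos(3t) + \frac{2}{3} \cos(3t) , \\
-9 C_2 \cos(3t) & = -4 C_2 \cos(3t) - \frac{2}{3} \cos(3t) .
\end{aligned}
\]
We solve each of these equations separately. We get $-9C_1 = -C_1 + 2/3$ and $-9C_2 = -4C_2 - 2/3$. And hence $C_1 = -1/12$ and $C_2 = 2/15$. So our particular solution is
\[ \vec{x} = \begin{bmatrix} 1 \\ 2 \end{bmatrix} \left( \frac{-1}{12} \cos(3t) \right) + \begin{bmatrix} 1 \\ -1 \end{bmatrix} \left( \frac{2}{15} \cos(3t) \right) = \begin{bmatrix} 1/20 \\ -3/10 \end{bmatrix} \cos(3t) . \]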
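The particular solution of Example 3.9.5 can be checked with exact arithmetic. Since $\vec{x}_p = \vec{c} \cos(3t)$ gives $\vec{x}_p'' = -9 \vec{c} \cos(3t)$, it suffices to verify $-9\vec{c} = A\vec{c} + \vec{f}_0$ where $\vec{f}(t) = \vec{f}_0 \cos(3t)$. The sketch below does this with Python's `fractions` module; the helper `matvec` and the variable names are illustrative.

```python
from fractions import Fraction as F

A = [[F(-3), F(1)], [F(2), F(-2)]]
f0 = [F(0), F(2)]             # f(t) = f0 * cos(3t)
c = [F(1, 20), F(-3, 10)]     # claimed particular solution x_p(t) = c * cos(3t)

def matvec(M, v):
    """Product of a 2x2 matrix with a 2-vector."""
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

lhs = [-9 * ci for ci in c]                           # coefficient of cos(3t) in x_p''
rhs = [Ac + fi for Ac, fi in zip(matvec(A, c), f0)]   # coefficient of cos(3t) in A x_p + f

print(lhs, rhs, lhs == rhs)
```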
3.9.4 Exercises
Exercise 3.9.4: Find a particular solution to $x' = x + 2y + 2t$, $y' = 3x + 2y - 4$,
Exercise 3.9.6: Find the general solution to $x_1'' = -6x_1 + 3x_2 + \cos(t)$, $x_2'' = 2x_1 - 7x_2 + 3\cos(t)$,
Exercise 3.9.7: Find the general solution to $x_1'' = -6x_1 + 3x_2 + \cos(2t)$, $x_2'' = 2x_1 - 7x_2 + 3\cos(2t)$,
Exercise 3.9.103: Solve $x_1' = x_2 + t$, $x_2' = x_1 + t$ with initial conditions $x_1(0) = 1$, $x_2(0) = 2$, using eigenvector decomposition.
Exercise 3.9.104: Solve $x_1'' = -3x_1 + x_2 + t$, $x_2'' = 9x_1 + 5x_2 + \cos(t)$ with initial conditions $x_1(0) = 0$, $x_2(0) = 0$, $x_1'(0) = 0$, $x_2'(0) = 0$, using eigenvector decomposition.
Chapter 4
\[ x'' + \lambda x = 0, \quad x(a) = 0, \quad x(b) = 0 , \]
for some constant $\lambda$, where $x(t)$ is defined for $t$ in the interval $[a, b]$. Previously we specified the value of the solution and its derivative at a single point. Now we specify the value of the solution at two different points. As $x = 0$ is a solution, existence of solutions is not a problem. Uniqueness of solutions is another issue. The general solution to $x'' + \lambda x = 0$ has two arbitrary constants†. It is, therefore, natural (but wrong) to believe that requiring two conditions guarantees a unique solution.
Example 4.1.1: Take $\lambda = 1$, $a = 0$, $b = \pi$. That is,
\[ x'' + x = 0, \quad x(0) = 0, \quad x(\pi) = 0 . \]
\[ x'' + 2x = 0, \quad x(0) = 0, \quad x(\pi) = 0 . \]
† See subsection 0.2.4 on page 13 or Example 2.2.1 on page 85 and Example 2.2.3 on page 88.
Then the general solution is $x = A \cos(\sqrt{2}\, t) + B \sin(\sqrt{2}\, t)$. Letting $x(0) = 0$ still forces $A = 0$. We apply the second condition to find $0 = x(\pi) = B \sin(\sqrt{2}\, \pi)$. As $\sin(\sqrt{2}\, \pi) \neq 0$ we obtain $B = 0$. Therefore $x = 0$ is the unique solution to this problem.
What is going on? We will be interested in finding which constants $\lambda$ allow a nonzero solution, and we will be interested in finding those solutions. This problem is an analogue of finding eigenvalues and eigenvectors of matrices.
\[ x'' + \lambda x = 0, \quad x(0) = 0, \quad x(\pi) = 0 . \]
We have to handle the cases $\lambda > 0$, $\lambda = 0$, $\lambda < 0$ separately. First suppose that $\lambda > 0$. Then the general solution to $x'' + \lambda x = 0$ is
\[ x = A \cos(\sqrt{\lambda}\, t) + B \sin(\sqrt{\lambda}\, t) . \]
The condition $x(0) = 0$ implies that $A = 0$, and then $0 = x(\pi) = B \sin(\sqrt{\lambda}\, \pi)$. If $B$ is zero, then $x$ is not a nonzero solution. So to get a nonzero solution we must have that $\sin(\sqrt{\lambda}\, \pi) = 0$. Hence, $\sqrt{\lambda}\, \pi$ must be an integer multiple of $\pi$. In other words, $\sqrt{\lambda} = k$ for a positive integer $k$. Hence the positive eigenvalues are $k^2$ for all integers $k \geq 1$. Corresponding eigenfunctions can be taken as $x = \sin(kt)$. Just like for eigenvectors, constant multiples of an eigenfunction are also eigenfunctions, so we only need to pick one.
Now suppose that $\lambda = 0$. In this case the equation is $x'' = 0$, and its general solution is $x = At + B$. The condition $x(0) = 0$ implies that $B = 0$, and $x(\pi) = 0$ implies that $A = 0$. This means that $\lambda = 0$ is not an eigenvalue.
Finally, suppose that $\lambda < 0$. In this case we have the general solution
\[ x = A \cosh(\sqrt{-\lambda}\, t) + B \sinh(\sqrt{-\lambda}\, t) . \]
Letting $x(0) = 0$ implies that $A = 0$ (recall $\cosh 0 = 1$ and $\sinh 0 = 0$). So our solution must be $x = B \sinh(\sqrt{-\lambda}\, t)$ and satisfy $x(\pi) = 0$. This is only possible if $B$ is zero. Why? Because $\sinh \xi$ is only zero when $\xi = 0$. You should plot sinh to see this fact. We can also see this from the definition of sinh. We get $0 = \sinh \xi = \frac{e^{\xi} - e^{-\xi}}{2}$. Hence $e^{\xi} = e^{-\xi}$, which implies $\xi = -\xi$ and that is only true if $\xi = 0$. So there are no negative eigenvalues.
In summary, the eigenvalues and corresponding eigenfunctions are
\[ \lambda_k = k^2 \quad \text{with an eigenfunction} \quad x_k = \sin(kt) \quad \text{for all integers } k \geq 1 . \]
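The eigenvalue condition for $x'' + \lambda x = 0$, $x(0) = 0$, $x(\pi) = 0$ with $\lambda > 0$ can be illustrated numerically. The candidate $x(t) = \sin(\sqrt{\lambda}\, t)$ already satisfies $x(0) = 0$, so the sketch below simply evaluates it at $t = \pi$; the function name `endpoint_value` is an illustrative choice.

```python
import math

def endpoint_value(lam):
    """Value at t = pi of x(t) = sin(sqrt(lam) t), which already satisfies x(0) = 0."""
    return math.sin(math.sqrt(lam) * math.pi)

if __name__ == "__main__":
    for lam in (1, 2, 4, 5, 9):
        # Near zero exactly when lam is a perfect square k^2.
        print(lam, endpoint_value(lam))
```

For $\lambda = 2$ the value is far from zero, which matches the earlier conclusion that only the trivial solution exists in that case.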
\[ x'' + \lambda x = 0, \quad x'(0) = 0, \quad x'(\pi) = 0 . \]
Again we have to handle the cases $\lambda > 0$, $\lambda = 0$, $\lambda < 0$ separately. First suppose that $\lambda > 0$. The general solution to $x'' + \lambda x = 0$ is $x = A \cos(\sqrt{\lambda}\, t) + B \sin(\sqrt{\lambda}\, t)$. So
\[ x' = -A \sqrt{\lambda} \sin(\sqrt{\lambda}\, t) + B \sqrt{\lambda} \cos(\sqrt{\lambda}\, t) . \]
We have already seen (with roles of $A$ and $B$ switched) that for this expression to be zero at $t = 0$ and $t = \pi$, we must have $A = B = 0$. Hence there are no negative eigenvalues.
In summary, the eigenvalues and corresponding eigenfunctions are
\[ \lambda_0 = 0 \quad \text{with an eigenfunction} \quad x_0 = 1 . \]
The following problem is the one that leads to the general Fourier series.
Example 4.1.5: Let us compute the eigenvalues and eigenfunctions of
\[ x'' + \lambda x = 0, \quad x(-\pi) = x(\pi), \quad x'(-\pi) = x'(\pi) . \]
We have not specified the values or the derivatives at the endpoints, but rather that they are the same at the beginning and at the end of the interval.
Let us skip $\lambda < 0$. The computations are the same as before, and again we find that there are no negative eigenvalues.
For $\lambda = 0$, the general solution is $x = At + B$. The condition $x(-\pi) = x(\pi)$ implies that $A = 0$ ($-A\pi + B = A\pi + B$ implies $A = 0$). The second condition $x'(-\pi) = x'(\pi)$ says nothing about $B$ and hence $\lambda = 0$ is an eigenvalue with a corresponding eigenfunction $x = 1$.
For $\lambda > 0$ we get that $x = A \cos(\sqrt{\lambda}\, t) + B \sin(\sqrt{\lambda}\, t)$. Now
\[ \underbrace{A \cos(-\sqrt{\lambda}\, \pi) + B \sin(-\sqrt{\lambda}\, \pi)}_{x(-\pi)} = \underbrace{A \cos(\sqrt{\lambda}\, \pi) + B \sin(\sqrt{\lambda}\, \pi)}_{x(\pi)} . \]
The terminology comes from the fact that the integral is a type of inner product. We will
expand on this in the next section. The theorem has a very short, elegant, and illuminating
proof so let us give it here. First, we have the following two equations.
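\[ x_1'' + \lambda_1 x_1 = 0 \qquad \text{and} \qquad x_2'' + \lambda_2 x_2 = 0 . \]
Multiply the first by $x_2$ and the second by $x_1$ and subtract to get
\[ (\lambda_1 - \lambda_2) x_1 x_2 = x_2'' x_1 - x_2 x_1'' . \]
Now integrate both sides of the equation:
\[ (\lambda_1 - \lambda_2) \int_a^b x_1 x_2 \, dt = \int_a^b \bigl( x_2'' x_1 - x_2 x_1'' \bigr) \, dt = \int_a^b \frac{d}{dt} \bigl( x_2' x_1 - x_2 x_1' \bigr) \, dt = \Bigl[ x_2' x_1 - x_2 x_1' \Bigr]_{t=a}^{b} = 0 . \]
The last equality holds because of the boundary conditions. For example, if we consider (4.1) we have $x_1(a) = x_1(b) = x_2(a) = x_2(b) = 0$ and so $x_2' x_1 - x_2 x_1'$ is zero at both $a$ and $b$. As $\lambda_1 \neq \lambda_2$, the theorem follows.
Exercise 4.1.1 (easy): Finish the proof of the theorem (check the last equality in the proof) for the cases (4.2) and (4.3).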
Similarly,
\[ \int_0^{\pi} \cos(mt) \cos(nt) \, dt = 0, \quad \text{when } m \neq n, \qquad \text{and} \qquad \int_0^{\pi} \cos(nt) \, dt = 0 . \]
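These orthogonality integrals are easy to spot-check numerically. The sketch below uses a plain midpoint rule, so the results are approximations rather than exact values; `integrate`, `sin_inner` and `cos_inner` are illustrative helper names, and the sine version corresponds to the eigenfunctions $\sin(kt)$ of the earlier endpoint problem on $[0, \pi]$.

```python
import math

def integrate(g, a, b, n=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def sin_inner(m, n):
    return integrate(lambda t: math.sin(m * t) * math.sin(n * t), 0.0, math.pi)

def cos_inner(m, n):
    return integrate(lambda t: math.cos(m * t) * math.cos(n * t), 0.0, math.pi)

if __name__ == "__main__":
    print(sin_inner(1, 4), cos_inner(2, 3))   # both approximately 0
    print(cos_inner(3, 3))                     # approximately pi/2, not 0
    print(integrate(lambda t: math.cos(2 * t), 0.0, math.pi))  # approximately 0
```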
4.1.5 Application
Let us consider a physical application of an endpoint problem. Suppose we have a tightly stretched quickly spinning elastic string or rope of uniform linear density $\rho$, for example in kg/m. Let us put this problem into the $xy$-plane and both $x$ and $y$ are in meters. The $x$-axis represents the position on the string. The string rotates at angular velocity $\omega$, in radians/s. Imagine that the whole $xy$-plane rotates at angular velocity $\omega$. This way, the string stays in this $xy$-plane and $y$ measures its deflection from the equilibrium position, $y = 0$, on the $x$-axis. Hence the graph of $y$ gives the shape of the string. We consider an ideal string with no volume, just a mathematical curve. We suppose the tension on the string is a constant $T$ in Newtons. Assuming that the deflection is small, we can use Newton's second law (let us skip the derivation) to get the equation
\[ T y'' + \rho \omega^2 y = 0 . \]
To check the units notice that the units of $y''$ are m/m$^2$, as the derivative is in terms of $x$.
Let $L$ be the length of the string (in meters) and the string is fixed at the beginning and end points. Hence, $y(0) = 0$ and $y(L) = 0$. See Figure 4.1.
[Figure 4.1: plot of the string deflection $y$ against $x$, for $x$ from $0$ to $L$.]
We rewrite the equation as $y'' + \frac{\rho \omega^2}{T} y = 0$. The setup is similar to Example 4.1.3 on page 190, except for the interval length being $L$ instead of $\pi$. We are looking for eigenvalues of $y'' + \lambda y = 0$, $y(0) = 0$, $y(L) = 0$ where $\lambda = \frac{\rho \omega^2}{T}$. As before there are no nonpositive eigenvalues. With $\lambda > 0$, the general solution to the equation is $y = A \cos(\sqrt{\lambda}\, x) + B \sin(\sqrt{\lambda}\, x)$. The condition $y(0) = 0$ implies that $A = 0$ as before. The condition $y(L) = 0$ implies that $\sin(\sqrt{\lambda}\, L) = 0$ and hence $\sqrt{\lambda}\, L = k\pi$ for some integer $k > 0$, so
\[ \frac{\rho \omega^2}{T} = \lambda = \frac{k^2 \pi^2}{L^2} . \]
What does this say about the shape of the string? It says that for all parameters $\rho$, $\omega$, $T$ not satisfying the above equation, the string is in the equilibrium position, $y = 0$. When $\frac{\rho \omega^2}{T} = \frac{k^2 \pi^2}{L^2}$, then the string will “pop out” some distance $B$. We cannot compute $B$ with the information we have.
Let us assume that $\rho$ and $T$ are fixed and we are changing $\omega$. For most values of $\omega$ the string is in the equilibrium state. When the angular velocity $\omega$ hits a value $\omega = \frac{k \pi \sqrt{T}}{L \sqrt{\rho}}$, then the string pops out and has the shape of a sine wave crossing the $x$-axis $k - 1$ times between the end points. For example, at $k = 1$, the string does not cross the $x$-axis and the shape looks like in Figure 4.1 on the preceding page. On the other hand, when $k = 3$ the string crosses the $x$-axis 2 times, see Figure 4.2. When $\omega$ changes again, the string returns to the equilibrium position. The higher the angular velocity, the more times it crosses the $x$-axis when it is popped out.
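[Figure 4.2: plot of the string deflection $y$ against $x$, for $x$ from $0$ to $L$, in the case $k = 3$.]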
For another example, if you have a spinning jump rope (then $k = 1$ as it is completely “popped out”) and you pull on the ends to increase the tension, then the velocity also increases for the rope to stay “popped out”.
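The relation $\frac{\rho \omega^2}{T} = \frac{k^2 \pi^2}{L^2}$ can be turned into a small calculator for the angular velocities at which the string pops out. This is only a sketch: the function name `critical_omega` and the sample values of length, tension and density are hypothetical and chosen just to make the numbers easy to read.

```python
import math

def critical_omega(k, length, tension, density):
    """omega_k = (k pi / L) sqrt(T / rho), from rho omega^2 / T = k^2 pi^2 / L^2."""
    return k * math.pi / length * math.sqrt(tension / density)

if __name__ == "__main__":
    L, T, rho = 1.0, 4.0, 0.25   # hypothetical values: meters, Newtons, kg/m
    for k in (1, 2, 3):
        # The k-th shape crosses the x-axis k - 1 times between the end points.
        print(k, critical_omega(k, L, T, rho), "crossings:", k - 1)
    # Quadrupling the tension doubles the required angular velocity.
    print(critical_omega(1, L, 4 * T, rho) / critical_omega(1, L, T, rho))
```

The last line reflects the jump rope remark: with $k$ fixed, a larger tension requires a larger angular velocity.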
4.1.6 Exercises
Hint for the following exercises: Note that when $\lambda > 0$, then $\cos\bigl( \sqrt{\lambda}\, (t - a) \bigr)$ and $\sin\bigl( \sqrt{\lambda}\, (t - a) \bigr)$ are also solutions of the homogeneous equation.
Exercise 4.1.5: Compute all eigenvalues and eigenfunctions of $x'' + \lambda x = 0$, $x(a) = x(b)$, $x'(a) = x'(b)$ (assume $a < b$).
Exercise 4.1.6: We skipped the case of $\lambda < 0$ for the boundary value problem $x'' + \lambda x = 0$, $x(-\pi) = x(\pi)$, $x'(-\pi) = x'(\pi)$. Finish the calculation and show that there are no negative eigenvalues.
Exercise 4.1.101: Consider a spinning string of length 2 and linear density 0.1 and tension 3. Find the smallest angular velocity when the string pops out.
Exercise 4.1.102: Suppose $x'' + \lambda x = 0$ and $x(0) = 1$, $x(1) = 1$. Find all $\lambda$ for which there is more than one solution. Also find the corresponding solutions (only for the eigenvalues).
Exercise 4.1.103: Suppose $x'' + x = 0$ and $x(0) = 0$, $x'(\pi) = 1$. Find all the solution(s) if any exist.
Exercise 4.1.104: Consider $x' + \lambda x = 0$ and $x(0) = 0$, $x(1) = 0$. Why does it not have any eigenvalues? Why does any first order equation with two endpoint conditions such as above have no eigenvalues?
One way to solve (4.6) is to decompose f (t) as a sum of cosines (and sines) and then solve
many problems of the form (4.7). We then use the principle of superposition, to sum up all
the solutions we got to get a solution to (4.6).
Before we proceed, let us talk a little bit more in detail about periodic functions. A function is said to be periodic with period $P$ if $f(t) = f(t + P)$ for all $t$. For brevity we say $f(t)$ is $P$-periodic. Note that a $P$-periodic function is also $2P$-periodic, $3P$-periodic and so on. For example, $\cos(t)$ and $\sin(t)$ are $2\pi$-periodic. So are $\cos(kt)$ and $\sin(kt)$ for all integers $k$. The constant functions are an extreme example. They are periodic for any period (exercise).
Normally we start with a function $f(t)$ defined on some interval $[-L, L]$, and we want to extend $f(t)$ periodically to make it a $2L$-periodic function. We do this extension by defining a new function $F(t)$ such that for $t$ in $[-L, L]$, $F(t) = f(t)$. For $t$ in $[L, 3L]$, we define $F(t) = f(t - 2L)$, for $t$ in $[-3L, -L]$, $F(t) = f(t + 2L)$, and so on. To make that work we needed $f(-L) = f(L)$. We could have also started with $f$ defined only on the half-open interval $(-L, L]$ and then define $f(-L) = f(L)$.
Example 4.2.1: Define $f(t) = 1 - t^2$ on $[-1, 1]$. Now extend $f(t)$ periodically to a 2-periodic function. See Figure 4.3 on the facing page.
You should be careful to distinguish between $f(t)$ and its extension. A common mistake is to assume that a formula for $f(t)$ holds for its extension. It can be confusing when the formula for $f(t)$ is periodic, but with perhaps a different period.
Exercise 4.2.1: Define $f(t) = \cos t$ on $[-\pi/2, \pi/2]$. Take the $\pi$-periodic extension and sketch its graph. How does it compare to the graph of $\cos t$?
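The periodic extension described above is short to express in code. The sketch below assumes $f$ is given on $[-L, L]$ with $f(-L) = f(L)$; the helper name `periodic_extension` is illustrative. It also shows the common mistake mentioned in the text: the formula for $f$ does not hold for the extension outside $[-L, L]$.

```python
def periodic_extension(f, L):
    """Return F, the 2L-periodic extension of f given on [-L, L]."""
    def F(t):
        s = (t + L) % (2 * L) - L   # shift t into the base interval [-L, L)
        return f(s)
    return F

f = lambda t: 1 - t ** 2            # the function of Example 4.2.1
F = periodic_extension(f, 1.0)

if __name__ == "__main__":
    print(F(0.5), F(2.5))           # the extension repeats with period 2
    print(f(2.5))                   # the formula 1 - t^2 itself does not repeat
```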
[Figure 4.3: plot of the 2-periodic extension of $f(t) = 1 - t^2$ for $t$ from $-3$ to $3$.]
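\[ \vec{v} = a_1 \vec{w}_1 + a_2 \vec{w}_2 . \]
Therefore,
\[ a_1 = \frac{\langle \vec{v} , \vec{w}_1 \rangle}{\langle \vec{w}_1 , \vec{w}_1 \rangle} . \]
Similarly
\[ a_2 = \frac{\langle \vec{v} , \vec{w}_2 \rangle}{\langle \vec{w}_2 , \vec{w}_2 \rangle} . \]
You probably remember this formula from vector calculus.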
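The projection formula above can be sketched in a few lines with exact rational arithmetic. The vectors used here are sample values chosen so that $\vec{w}_1$ and $\vec{w}_2$ are orthogonal; the helper names `dot` and `coefficients` are illustrative.

```python
from fractions import Fraction

def dot(u, w):
    """Standard inner product of two vectors given as tuples."""
    return sum(ui * wi for ui, wi in zip(u, w))

def coefficients(v, basis):
    """Coefficients a_j = <v, w_j> / <w_j, w_j> for an orthogonal basis."""
    return [Fraction(dot(v, w), dot(w, w)) for w in basis]

v = (2, 3)
w1, w2 = (1, -1), (1, 1)
assert dot(w1, w2) == 0             # the formula relies on orthogonality

a1, a2 = coefficients(v, (w1, w2))
rebuilt = tuple(a1 * p + a2 * q for p, q in zip(w1, w2))

if __name__ == "__main__":
    print(a1, a2, rebuilt)          # rebuilt should reproduce v
```

The same recipe, with the dot product replaced by an integral, is what produces the Fourier coefficients.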
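Example 4.2.2: Write $\vec{v} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$ as a linear combination of $\vec{w}_1 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$ and $\vec{w}_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$.
First note that $\vec{w}_1$ and $\vec{w}_2$ are orthogonal as $\langle \vec{w}_1 , \vec{w}_2 \rangle = 1(1) + (-1)1 = 0$. Then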
\[ x'' + \lambda x = 0, \quad x(-\pi) = x(\pi), \quad x'(-\pi) = x'(\pi) . \]
We computed that eigenfunctions are $1$, $\cos(kt)$, $\sin(kt)$. That is, we want to find a representation of a $2\pi$-periodic function $f(t)$ as
\[ f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos(nt) + b_n \sin(nt) . \]
This series is called the Fourier series or the trigonometric series for $f(t)$. We write the coefficient of the eigenfunction $1$ as $\frac{a_0}{2}$ for convenience. We could also think of $1 = \cos(0t)$, so that we only need to look at $\cos(kt)$ and $\sin(kt)$.
As for matrices we want to find a projection of $f(t)$ onto the subspaces given by the eigenfunctions. So we want to define an inner product of functions. For example, to find $a_n$ we want to compute $\langle f(t) , \cos(nt) \rangle$. We define the inner product as
\[ \langle f(t) , g(t) \rangle \overset{\text{def}}{=} \int_{-\pi}^{\pi} f(t) \, g(t) \, dt . \]
With this definition of the inner product, we saw in the previous section that the eigenfunctions $\cos(kt)$ (including the constant eigenfunction), and $\sin(kt)$ are orthogonal in the sense that
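\[
\begin{aligned}
\langle \cos(mt) , \cos(nt) \rangle & = 0 & & \text{for } m \neq n, \\
\langle \sin(mt) , \sin(nt) \rangle & = 0 & & \text{for } m \neq n, \\
\langle \sin(mt) , \cos(nt) \rangle & = 0 & & \text{for all } m \text{ and } n .
\end{aligned}
\]
For $n = 1, 2, 3, \ldots$ we have
\[
\begin{aligned}
\langle \cos(nt) , \cos(nt) \rangle & = \int_{-\pi}^{\pi} \cos(nt) \cos(nt) \, dt = \pi , \\
\langle \sin(nt) , \sin(nt) \rangle & = \int_{-\pi}^{\pi} \sin(nt) \sin(nt) \, dt = \pi ,
\end{aligned}
\]
and for the constant eigenfunction $\langle 1 , 1 \rangle = 2\pi$. The coefficients are given by
\[
a_n = \frac{\langle f(t) , \cos(nt) \rangle}{\langle \cos(nt) , \cos(nt) \rangle} = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \cos(nt) \, dt ,
\qquad
b_n = \frac{\langle f(t) , \sin(nt) \rangle}{\langle \sin(nt) , \sin(nt) \rangle} = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \sin(nt) \, dt .
\]
Named after the French mathematician Jean Baptiste Joseph Fourier (1768–1830).
Compare these expressions with the finite-dimensional example. For $a_0$ we get a similar formula
\[ a_0 = 2 \frac{\langle f(t) , 1 \rangle}{\langle 1 , 1 \rangle} = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \, dt . \]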
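Let us check the formulas using the orthogonality properties. Suppose for a moment that
\[ f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos(nt) + b_n \sin(nt) . \]
Then for $m \geq 1$ we have
\[
\begin{aligned}
\langle f(t) , \cos(mt) \rangle & = \Bigl\langle \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos(nt) + b_n \sin(nt) \, , \, \cos(mt) \Bigr\rangle \\
& = \frac{a_0}{2} \langle 1 , \cos(mt) \rangle + \sum_{n=1}^{\infty} a_n \langle \cos(nt) , \cos(mt) \rangle + b_n \langle \sin(nt) , \cos(mt) \rangle \\
& = a_m \langle \cos(mt) , \cos(mt) \rangle .
\end{aligned}
\]
And hence $a_m = \frac{\langle f(t) , \cos(mt) \rangle}{\langle \cos(mt) , \cos(mt) \rangle}$.
Exercise 4.2.2: Carry out the calculation for $a_0$ and $b_m$.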
Example 4.2.3: Take the function
\[ f(t) = t \]
for $t$ in $(-\pi, \pi]$. Extend $f(t)$ periodically and write it as a Fourier series. This function is called the sawtooth.
The plot of the extended periodic function is given in Figure 4.4 on the next page. Let us compute the coefficients. We start with $a_0$,
\[ a_0 = \frac{1}{\pi} \int_{-\pi}^{\pi} t \, dt = 0 . \]
We will often use the result from calculus that says that the integral of an odd function over a symmetric interval is zero. Recall that an odd function is a function $\varphi(t)$ such that $\varphi(-t) = -\varphi(t)$. For example the functions $t$, $\sin t$, or (importantly for us) $t \cos(nt)$ are all odd functions. Thus
\[ a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} t \cos(nt) \, dt = 0 . \]
[Figure 4.4: plot of the periodically extended sawtooth function.]
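Let us move to $b_n$. Another useful fact from calculus is that the integral of an even function over a symmetric interval is twice the integral of the same function over half the interval. Recall an even function is a function $\varphi(t)$ such that $\varphi(-t) = \varphi(t)$. For example $t \sin(nt)$ is even.
\[
\begin{aligned}
b_n & = \frac{1}{\pi} \int_{-\pi}^{\pi} t \sin(nt) \, dt \\
& = \frac{2}{\pi} \int_{0}^{\pi} t \sin(nt) \, dt \\
& = \frac{2}{\pi} \left( \left[ \frac{-t \cos(nt)}{n} \right]_{t=0}^{\pi} + \frac{1}{n} \int_{0}^{\pi} \cos(nt) \, dt \right) \\
& = \frac{2}{\pi} \left( \frac{-\pi \cos(n\pi)}{n} + 0 \right) \\
& = \frac{-2 \cos(n\pi)}{n} = \frac{2 \, (-1)^{n+1}}{n} .
\end{aligned}
\]
We have used the fact that
\[ \cos(n\pi) = (-1)^n = \begin{cases} 1 & \text{if } n \text{ even}, \\ -1 & \text{if } n \text{ odd}. \end{cases} \]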
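The coefficient formulas can be cross-checked numerically for the sawtooth. The sketch below approximates $a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \cos(nt)\, dt$ and $b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \sin(nt)\, dt$ with a midpoint rule and compares $b_n$ with $\frac{2(-1)^{n+1}}{n}$. The helper names and the number of subintervals are illustrative choices, and the computed values are approximations.

```python
import math

def integrate(g, a, b, n=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def fourier_coefficients(f, n):
    """Approximate (a_n, b_n) for a 2*pi-periodic f given on (-pi, pi]."""
    a_n = integrate(lambda t: f(t) * math.cos(n * t), -math.pi, math.pi) / math.pi
    b_n = integrate(lambda t: f(t) * math.sin(n * t), -math.pi, math.pi) / math.pi
    return a_n, b_n

sawtooth = lambda t: t

if __name__ == "__main__":
    for n in range(1, 5):
        a_n, b_n = fourier_coefficients(sawtooth, n)
        print(n, round(a_n, 6), round(b_n, 6), 2 * (-1) ** (n + 1) / n)
```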
The plot of these first three terms of the series, along with a plot of the first 20 terms is
given in Figure 4.5.
Figure 4.5: First 3 (left graph) and 20 (right graph) harmonics of the sawtooth function.
Extend f (t) periodically and write it as a Fourier series. This function or its variants appear
often in applications and the function is called the square wave.
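[Figure 4.6: plot of the extended periodic function (the square wave).]
The plot of the extended periodic function is given in Figure 4.6. Now we compute the coefficients. We start with $a_0$,
\[ a_0 = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \, dt = \frac{1}{\pi} \int_{0}^{\pi} \pi \, dt = \pi . \]
Next,
\[ a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \cos(nt) \, dt = \frac{1}{\pi} \int_{0}^{\pi} \pi \cos(nt) \, dt = 0 . \]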
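And finally
\[
\begin{aligned}
b_n & = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \sin(nt) \, dt \\
& = \frac{1}{\pi} \int_{0}^{\pi} \pi \sin(nt) \, dt \\
& = \left[ \frac{-\cos(nt)}{n} \right]_{t=0}^{\pi} \\
& = \frac{1 - \cos(\pi n)}{n} = \frac{1 - (-1)^n}{n} =
\begin{cases} \frac{2}{n} & \text{if } n \text{ is odd}, \\ 0 & \text{if } n \text{ is even}. \end{cases}
\end{aligned}
\]
The Fourier series is
\[ \frac{\pi}{2} + \sum_{\substack{n=1 \\ n \text{ odd}}}^{\infty} \frac{2}{n} \sin(nt) = \frac{\pi}{2} + \sum_{k=1}^{\infty} \frac{2}{2k-1} \sin\bigl( (2k-1) \, t \bigr) . \]
Let us write out the first 3 harmonics of the series for $f(t)$.
\[ \frac{\pi}{2} + 2 \sin(t) + \frac{2}{3} \sin(3t) + \cdots \]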
The plot of these first three and also of the first 20 terms of the series is given in Figure 4.7
on the facing page.
We have so far skirted the issue of convergence. For example, if $f(t)$ is the square wave function, the equation
\[ f(t) = \frac{\pi}{2} + \sum_{k=1}^{\infty} \frac{2}{2k-1} \sin\bigl( (2k-1) \, t \bigr) \]
is only an equality for such $t$ where $f(t)$ is continuous. That is, we do not get an equality for $t = -\pi, 0, \pi$ and all the other discontinuities of $f(t)$. It is not hard to see that when $t$ is an integer multiple of $\pi$ (which includes all the discontinuities), then
\[ \frac{\pi}{2} + \sum_{k=1}^{\infty} \frac{2}{2k-1} \sin\bigl( (2k-1) \, t \bigr) = \frac{\pi}{2} . \]
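The behavior of the series at and away from the discontinuities can be observed with partial sums. The sketch below evaluates $\frac{\pi}{2} + \sum_{k=1}^{N} \frac{2}{2k-1} \sin((2k-1)t)$ for a finite $N$; the function name and the value of $N$ are illustrative, and the printed values are approximations to the limit.

```python
import math

def square_wave_partial_sum(t, n_terms):
    """pi/2 plus the first n_terms sine terms of the square wave series."""
    return math.pi / 2 + sum(
        2 / (2 * k - 1) * math.sin((2 * k - 1) * t) for k in range(1, n_terms + 1)
    )

if __name__ == "__main__":
    N = 2000
    print(square_wave_partial_sum(0.0, N))           # pi/2 at the discontinuity t = 0
    print(square_wave_partial_sum(math.pi, N))       # close to pi/2 at t = pi
    print(square_wave_partial_sum(math.pi / 2, N))   # close to pi where f is continuous
    print(square_wave_partial_sum(-math.pi / 2, N))  # close to 0 where f is continuous
```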
Figure 4.7: First 3 (left graph) and 20 (right graph) harmonics of the square wave function.
We redefine $f(t)$ on $[-\pi, \pi]$ as
\[ f(t) = \begin{cases} 0 & \text{if } -\pi < t < 0, \\ \pi & \text{if } 0 < t < \pi, \\ \pi/2 & \text{if } t = -\pi, \ t = 0, \text{ or } t = \pi, \end{cases} \]
and extend periodically. The series equals this extended $f(t)$ everywhere, including the discontinuities. We will generally not worry about changing the function values at several (finitely many) points.
We will say more about convergence in the next section. Let us however mention briefly an effect of the discontinuity. Let us zoom in near the discontinuity in the square wave. Further, let us plot the first 100 harmonics, see Figure 4.8 on the next page. While the series is a very good approximation away from the discontinuities, the error (the overshoot) near the discontinuity at $t = \pi$ does not seem to be getting any smaller. This behavior is known as the Gibbs phenomenon. The region where the error is large does get smaller, however, the more terms in the series we take.
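The Gibbs phenomenon can be observed numerically as well. The sketch below measures how far the partial sums of the square wave series rise above the value $\pi$ just to the right of the jump at $t = 0$ (by periodicity and symmetry the jump at $t = \pi$ behaves the same way). The grid parameters and function names are illustrative assumptions, and the maximum is taken over a finite grid, so it is an approximation.

```python
import math

def partial_sum(t, n_terms):
    """pi/2 plus the first n_terms sine terms of the square wave series."""
    return math.pi / 2 + sum(
        2 / (2 * k - 1) * math.sin((2 * k - 1) * t) for k in range(1, n_terms + 1)
    )

def overshoot(n_terms, samples_per_lobe=20, lobes=10):
    """Largest value of partial_sum(t) - pi on a fine grid just right of t = 0."""
    step = math.pi / (2 * n_terms * samples_per_lobe)
    return max(
        partial_sum(j * step, n_terms) for j in range(1, samples_per_lobe * lobes + 1)
    ) - math.pi

if __name__ == "__main__":
    for n_terms in (50, 200):
        # The overshoot stays roughly the same size as more terms are added,
        # while the grid (and hence the affected region) shrinks like 1/n_terms.
        print(n_terms, overshoot(n_terms))
```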
We can think of a periodic function as a “signal” being a superposition of many signals
of pure frequency. For example, we could think of the square wave as a tone of certain base
frequency. This base frequency is called the fundamental frequency. The square wave will
be a superposition of many different pure tones of frequencies that are multiples of the
fundamental frequency. In music, the higher frequencies are called the overtones. All the
frequencies that appear are called the spectrum of the signal. On the other hand a simple
sine wave is only the pure tone (no overtones). The simplest way to make sound using a
computer is the square wave, and the sound is very different from a pure tone. If you ever
played video games from the 1980s or so, then you heard what square waves sound like.
[Figure 4.8: zoomed-in plot of the first 100 harmonics of the square wave near a discontinuity.]
4.2.4 Exercises
Exercise 4.2.3: Suppose $f(t)$ is defined on $[-\pi, \pi]$ as $\sin(5t) + \cos(3t)$. Extend periodically and compute the Fourier series of $f(t)$.
Exercise 4.2.4: Suppose $f(t)$ is defined on $[-\pi, \pi]$ as $|t|$. Extend periodically and compute the Fourier series of $f(t)$.
Exercise 4.2.5: Suppose $f(t)$ is defined on $[-\pi, \pi]$ as $|t|^3$. Extend periodically and compute the Fourier series of $f(t)$.
Exercise 4.2.7: Suppose $f(t)$ is defined on $(-\pi, \pi]$ as $t^3$. Extend periodically and compute the Fourier series of $f(t)$.
Exercise 4.2.8: Suppose $f(t)$ is defined on $[-\pi, \pi]$ as $t^2$. Extend periodically and compute the Fourier series of $f(t)$.
There is another form of the Fourier series using complex exponentials $e^{int}$ for $n = \ldots, -2, -1, 0, 1, 2, \ldots$ instead of $\cos(nt)$ and $\sin(nt)$ for positive $n$. This form may be easier to work with sometimes. It is certainly more compact to write, and there is only one formula for the coefficients. On the downside, the coefficients are complex numbers.
Use Euler's formula $e^{i\theta} = \cos(\theta) + i \sin(\theta)$ to show that there exist complex numbers $c_m$ such that
\[ f(t) = \sum_{m=-\infty}^{\infty} c_m e^{imt} . \]
Note that the sum now ranges over all the integers including negative ones. Do not worry about convergence in this calculation. Hint: It may be better to start from the complex exponential form and write the series as
\[ c_0 + \sum_{m=1}^{\infty} \Bigl( c_m e^{imt} + c_{-m} e^{-imt} \Bigr) . \]
Exercise 4.2.101: Suppose $f(t)$ is defined on $[-\pi, \pi]$ as $f(t) = \sin(t)$. Extend periodically and compute the Fourier series.
Exercise 4.2.102: Suppose $f(t)$ is defined on $(-\pi, \pi]$ as $f(t) = \sin(\pi t)$. Extend periodically and compute the Fourier series.
Exercise 4.2.103: Suppose $f(t)$ is defined on $(-\pi, \pi]$ as $f(t) = \sin^2(t)$. Extend periodically and compute the Fourier series.
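is $2\pi$-periodic. We must also rescale all our sines and cosines. In the series we use $\frac{\pi}{L} t$ as the variable. That is, we want to write
\[ f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos\Bigl( \frac{n\pi}{L} t \Bigr) + b_n \sin\Bigl( \frac{n\pi}{L} t \Bigr) . \]
\[ g(s) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos(ns) + b_n \sin(ns) . \]
We compute $a_n$ and $b_n$ as before. After we write down the integrals, we change variables from $s$ back to $t$, noting also that $ds = \frac{\pi}{L} \, dt$.
\[
\begin{aligned}
a_0 & = \frac{1}{\pi} \int_{-\pi}^{\pi} g(s) \, ds = \frac{1}{L} \int_{-L}^{L} f(t) \, dt , \\
a_n & = \frac{1}{\pi} \int_{-\pi}^{\pi} g(s) \cos(ns) \, ds = \frac{1}{L} \int_{-L}^{L} f(t) \cos\Bigl( \frac{n\pi}{L} t \Bigr) \, dt , \\
b_n & = \frac{1}{\pi} \int_{-\pi}^{\pi} g(s) \sin(ns) \, ds = \frac{1}{L} \int_{-L}^{L} f(t) \sin\Bigl( \frac{n\pi}{L} t \Bigr) \, dt .
\end{aligned}
\]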
The two most common half periods that show up in examples are ⇡ and 1 because of
the simplicity of the formulas. We should stress that we have done no new mathematics,
we have only changed variables. If you understand the Fourier series for 2⇡-periodic
functions, you understand it for 2L-periodic functions. You can think of it as just using
different units for time. All that we are doing is moving some constants around, but all the
mathematics is the same.
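We want to write $f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos(n\pi t) + b_n \sin(n\pi t)$. For $n \geq 1$ we note that $|t| \cos(n\pi t)$ is even and hence
\[
\begin{aligned}
a_n & = \int_{-1}^{1} f(t) \cos(n\pi t) \, dt \\
& = 2 \int_{0}^{1} t \cos(n\pi t) \, dt \\
& = 2 \left[ \frac{t}{n\pi} \sin(n\pi t) \right]_{t=0}^{1} - 2 \int_{0}^{1} \frac{1}{n\pi} \sin(n\pi t) \, dt \\
& = 0 + \frac{1}{n^2 \pi^2} \Bigl[ 2 \cos(n\pi t) \Bigr]_{t=0}^{1} = \frac{2 \bigl( (-1)^n - 1 \bigr)}{n^2 \pi^2} =
\begin{cases} 0 & \text{if } n \text{ is even}, \\ \frac{-4}{n^2 \pi^2} & \text{if } n \text{ is odd}. \end{cases}
\end{aligned}
\]
Next we find $a_0$:
\[ a_0 = \int_{-1}^{1} |t| \, dt = 1 . \]
You should be able to find this integral by thinking about the integral as the area under the graph without doing any computation at all. Finally we can find $b_n$. Here, we notice that $|t| \sin(n\pi t)$ is odd and, therefore,
\[ b_n = \int_{-1}^{1} f(t) \sin(n\pi t) \, dt = 0 . \]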
Let us explicitly write down the first few terms of the series up to the 3rd harmonic.
\[ \frac{1}{2} - \frac{4}{\pi^2} \cos(\pi t) - \frac{4}{9\pi^2} \cos(3\pi t) - \cdots \]
The plot of these few terms and also a plot up to the 20th harmonic is given in Figure 4.10.
You should notice how close the graph is to the real function. You should also notice that
there is no “Gibbs phenomenon” present as there are no discontinuities.
Figure 4.10: Fourier series of $f(t)$ up to the 3rd harmonic (left graph) and up to the 20th harmonic (right graph).
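The $2L$-periodic coefficient formulas can be checked against the closed forms found above for $f(t) = |t|$ on $[-1, 1]$. The sketch below assumes $a_n = \frac{1}{L} \int_{-L}^{L} f(t) \cos(\frac{n\pi}{L} t)\, dt$ and the analogous formula for $b_n$, and approximates the integrals with a midpoint rule; `integrate` and `coefficients_2L` are illustrative names.

```python
import math

def integrate(g, a, b, n=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def coefficients_2L(f, n, L):
    """Approximate (a_n, b_n) for a 2L-periodic f given on [-L, L]."""
    a_n = integrate(lambda t: f(t) * math.cos(n * math.pi * t / L), -L, L) / L
    b_n = integrate(lambda t: f(t) * math.sin(n * math.pi * t / L), -L, L) / L
    return a_n, b_n

if __name__ == "__main__":
    print(coefficients_2L(abs, 0, 1.0))   # a_0 close to 1
    for n in range(1, 5):
        a_n, b_n = coefficients_2L(abs, n, 1.0)
        exact = 2 * ((-1) ** n - 1) / (n ** 2 * math.pi ** 2)
        print(n, round(a_n, 6), round(exact, 6), round(b_n, 6))
```

Setting `L = math.pi` recovers the $2\pi$-periodic formulas, which reflects the remark that only the variable has been rescaled.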
4.3.2 Convergence
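We will need the one sided limits of functions. We will use the following notation
\[ f(c-) = \lim_{t \uparrow c} f(t) , \qquad f(c+) = \lim_{t \downarrow c} f(t) . \]
If you are unfamiliar with this notation, $\lim_{t \uparrow c} f(t)$ means we are taking a limit of $f(t)$ as $t$ approaches $c$ from below (i.e. $t < c$) and $\lim_{t \downarrow c} f(t)$ means we are taking a limit of $f(t)$ as $t$ approaches $c$ from above (i.e. $t > c$). For example, for the square wave function
\[ f(t) = \begin{cases} 0 & \text{if } -\pi < t \leq 0, \\ \pi & \text{if } 0 < t \leq \pi, \end{cases} \tag{4.8} \]
we have $f(0-) = 0$ and $f(0+) = \pi$.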
Let $f(t)$ be a function defined on an interval $[a, b]$. Suppose that we find finitely many points $a = t_0, t_1, t_2, \ldots, t_k = b$ in the interval, such that $f(t)$ is continuous on the intervals $(t_0, t_1), (t_1, t_2), \ldots, (t_{k-1}, t_k)$. Also suppose that all the one sided limits exist, that is, all of $f(t_0+), f(t_1-), f(t_1+), f(t_2-), f(t_2+), \ldots, f(t_k-)$ exist and are finite. Then we say $f(t)$ is piecewise continuous.
If moreover, $f(t)$ is differentiable at all but finitely many points, and $f'(t)$ is piecewise continuous, then $f(t)$ is said to be piecewise smooth.
Example 4.3.2: The square wave function (4.8) is piecewise smooth on $[-\pi, \pi]$ or any other interval. In such a case we simply say that the function is piecewise smooth.
Example 4.3.4: The function $f(t) = \frac{1}{t}$ is not piecewise smooth on $[-1, 1]$ (or any other interval containing zero). In fact, it is not even piecewise continuous.
Example 4.3.5: The function $f(t) = \sqrt[3]{t}$ is not piecewise smooth on $[-1, 1]$ (or any other interval containing zero). $f(t)$ is continuous, but the derivative of $f(t)$ is unbounded near zero and hence not piecewise continuous.
Piecewise smooth functions have an easy answer on the convergence of the Fourier
series.
\[ \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos\Bigl( \frac{n\pi}{L} t \Bigr) + b_n \sin\Bigl( \frac{n\pi}{L} t \Bigr) \]
be the Fourier series for $f(t)$. Then the series converges for all $t$. If $f(t)$ is continuous at $t$, then
\[ f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos\Bigl( \frac{n\pi}{L} t \Bigr) + b_n \sin\Bigl( \frac{n\pi}{L} t \Bigr) . \]
Otherwise,
\[ \frac{f(t-) + f(t+)}{2} = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos\Bigl( \frac{n\pi}{L} t \Bigr) + b_n \sin\Bigl( \frac{n\pi}{L} t \Bigr) . \]
If we happen to have that $f(t) = \frac{f(t-) + f(t+)}{2}$ at all the discontinuities, the Fourier series converges to $f(t)$ everywhere. We can always just redefine $f(t)$ by changing the value at each discontinuity appropriately. Then we can write an equals sign between $f(t)$ and the series without any worry. We mentioned this fact briefly at the end of the last section.
The theorem does not say how fast the series converges. Think back to the discussion
of the Gibbs phenomenon in the last section. The closer you get to the discontinuity, the
more terms you need to take to get an accurate approximation to the function.
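\[ f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos\Bigl( \frac{n\pi}{L} t \Bigr) + b_n \sin\Bigl( \frac{n\pi}{L} t \Bigr) \]
is a piecewise smooth continuous function and the derivative $f'(t)$ is piecewise smooth. Then the derivative can be obtained by differentiating term by term,
\[ f'(t) = \sum_{n=1}^{\infty} \frac{-a_n n\pi}{L} \sin\Bigl( \frac{n\pi}{L} t \Bigr) + \frac{b_n n\pi}{L} \cos\Bigl( \frac{n\pi}{L} t \Bigr) . \]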
It is important that the function is continuous. It can have corners, but no jumps.
Otherwise, the differentiated series will fail to converge. For an exercise, take the series
obtained for the square wave and try to differentiate the series. Similarly, we can also
integrate a Fourier series.
Theorem 4.3.3. Suppose
\[ f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos\Bigl( \frac{n\pi}{L} t \Bigr) + b_n \sin\Bigl( \frac{n\pi}{L} t \Bigr) \]
[Figure 4.11: Plot of $f(t)$.]
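Let us compute the Fourier series coefficients. The actual computation involves several integrations by parts and is left to the student.
\[
\begin{aligned}
a_0 &= \int_{-1}^{1} f(t)\,dt = \int_{-1}^{0} (t+1)\,t\,dt + \int_{0}^{1} (1-t)\,t\,dt = 0, \\
a_n &= \int_{-1}^{1} f(t)\cos(n\pi t)\,dt = \int_{-1}^{0} (t+1)\,t\cos(n\pi t)\,dt + \int_{0}^{1} (1-t)\,t\cos(n\pi t)\,dt = 0, \\
b_n &= \int_{-1}^{1} f(t)\sin(n\pi t)\,dt = \int_{-1}^{0} (t+1)\,t\sin(n\pi t)\,dt + \int_{0}^{1} (1-t)\,t\sin(n\pi t)\,dt \\
&= \frac{4\bigl(1-(-1)^n\bigr)}{\pi^3 n^3} =
\begin{cases}
\frac{8}{\pi^3 n^3} & \text{if $n$ is odd}, \\
0 & \text{if $n$ is even}.
\end{cases}
\end{aligned}
\]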
\[
\frac{8}{\pi^3}\sin(\pi t) + \frac{8}{27\pi^3}\sin(3\pi t),
\]
it is almost indistinguishable from the plot of $f(t)$ in Figure 4.11. In fact, the coefficient $\frac{8}{27\pi^3}$ is already just 0.0096 (approximately). The reason for this behavior is the $n^3$ term in the denominator. The coefficients $b_n$ in this case go to zero as fast as $1/n^3$ goes to zero.
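The closed form for $b_n$ can be checked numerically. The following is a minimal sketch in plain Python, assuming $L = 1$ and the piecewise definition of $f(t)$ used in the integrals above; the helper names `simpson` and `b_coeff` are ours, not part of the text.

```python
import math

def f(t):
    # (t + 1) t on [-1, 0] and (1 - t) t on [0, 1], as in the integrals above.
    return (t + 1) * t if t < 0 else (1 - t) * t

def simpson(g, a, b, m=2000):
    # Composite Simpson's rule with m (even) subintervals.
    h = (b - a) / m
    total = g(a) + g(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def b_coeff(n):
    # b_n = integral over [-1, 1] of f(t) sin(n pi t) dt (here L = 1).
    return simpson(lambda t: f(t) * math.sin(n * math.pi * t), -1.0, 1.0)

for n in range(1, 6):
    closed_form = 8 / (math.pi**3 * n**3) if n % 2 else 0.0
    print(n, b_coeff(n), closed_form)
```

The two printed columns should agree to many digits, and the odd coefficients shrink by a factor of 27 from $n = 1$ to $n = 3$, which is the $1/n^3$ decay discussed above.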
For functions constructed piecewise from polynomials as above, it is generally true that if you have one derivative, the Fourier coefficients will go to zero approximately like $1/n^3$. If you have only a continuous function, then the Fourier coefficients will go to zero as $1/n^2$. If you have discontinuities, then the Fourier coefficients will go to zero approximately as $1/n$. For more general functions the story is somewhat more complicated, but the same idea holds: the more derivatives you have, the faster the coefficients go to zero. Similar reasoning works in reverse. If the coefficients go to zero like $1/n^2$, you always obtain a continuous function. If they go to zero like $1/n^3$, you obtain an everywhere differentiable function.
To justify this behavior, take for example the function defined by the Fourier series
\[
f(t) = \sum_{n=1}^\infty \frac{1}{n^3}\sin(nt).
\]
Differentiating term by term gives
\[
f'(t) = \sum_{n=1}^\infty \frac{1}{n^2}\cos(nt).
\]
Therefore, the coefficients now go down like $1/n^2$, which means that we have a continuous function. The derivative of $f'(t)$ is defined at most points, but there are points where $f'(t)$ is not differentiable. It has corners, but no jumps. If we differentiate again (where we can), we find that the function $f''(t)$ now fails to be continuous (has jumps):
\[
f''(t) = \sum_{n=1}^\infty \frac{-1}{n}\sin(nt).
\]
This function is similar to the sawtooth. If we tried to differentiate the series again, we would obtain
\[
\sum_{n=1}^\infty -\cos(nt),
\]
which does not converge!
Exercise 4.3.2: Use a computer to plot the series we obtained for $f(t)$, $f'(t)$ and $f''(t)$. That is, plot say the first � harmonics of the functions. At what points does $f''(t)$ have the discontinuities?
4.3.5 Exercises
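Exercise 4.3.3: Let
\[
f(t) =
\begin{cases}
0 & \text{if } -1 < t \leq 0, \\
t & \text{if } 0 < t \leq 1,
\end{cases}
\]
extended periodically.
a) Compute the Fourier series for $f(t)$.
b) By plugging in $t = 0$, evaluate $\displaystyle \sum_{n=1}^\infty \frac{(-1)^n}{n^2} = -1 + \frac{1}{4} - \frac{1}{9} + \cdots$.
c) Now evaluate $\displaystyle \sum_{n=1}^\infty \frac{1}{n^2} = 1 + \frac{1}{4} + \frac{1}{9} + \cdots$.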
a) $F(2)$  b) $F(-2)$  c) $F(4)$
d) $F(-4)$  e) $F(3)$  f) $F(-9)$
extended periodically.
Compute $f'(t)$.
c) Using the first � terms of the result from part b), approximate $\pi/4$.
a) $F(0)$  b) $F(-1)$  c) $F(1)$
d) $F(-2)$  e) $F(4)$  f) $F(-8)$
Extend $F_{\text{odd}}(t)$ and $F_{\text{even}}(t)$ to be $2L$-periodic. Then $F_{\text{odd}}(t)$ is called the odd periodic extension of $f(t)$, and $F_{\text{even}}(t)$ is called the even periodic extension of $f(t)$. For the odd extension we generally assume that $f(0) = f(L) = 0$.
Exercise 4.4.2: Check that $F_{\text{odd}}(t)$ is odd and $F_{\text{even}}(t)$ is even. For $F_{\text{odd}}$, assume $f(0) = f(L) = 0$.
Example 4.4.1: Take the function $f(t) = t\,(1-t)$ defined on $[0,1]$. Figure 4.12 shows the plots of the odd and even periodic extensions of $f(t)$.
[Figure 4.12: Odd and even periodic extensions of $f(t)$.]
Similarly, suppose $f(t)$ is an even $2L$-periodic function. For the same exact reasons as above, we find that $b_n = 0$ and
\[
a_n = \frac{2}{L}\int_0^L f(t)\cos\left(\frac{n\pi}{L} t\right) dt.
\]
The formula still works for $n = 0$, in which case it becomes
\[
a_0 = \frac{2}{L}\int_0^L f(t)\,dt.
\]
An interesting consequence is that the coefficients of the Fourier series of an odd (or
even) function can be computed by just integrating over the half interval [0, L]. Therefore,
we can compute the Fourier series of the odd (or even) extension of a function by computing
certain integrals over the interval where the original function is defined.
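Theorem 4.4.1. Let $f(t)$ be a piecewise smooth function defined on $[0,L]$. Then the odd periodic extension of $f(t)$ has the Fourier series
\[
F_{\text{odd}}(t) = \sum_{n=1}^\infty b_n \sin\left(\frac{n\pi}{L} t\right),
\]
where
\[
b_n = \frac{2}{L}\int_0^L f(t)\sin\left(\frac{n\pi}{L} t\right) dt.
\]
The even periodic extension of $f(t)$ has the Fourier series
\[
F_{\text{even}}(t) = \frac{a_0}{2} + \sum_{n=1}^\infty a_n \cos\left(\frac{n\pi}{L} t\right),
\]
where
\[
a_n = \frac{2}{L}\int_0^L f(t)\cos\left(\frac{n\pi}{L} t\right) dt.
\]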
We call the series $\sum_{n=1}^\infty b_n \sin\left(\frac{n\pi}{L} t\right)$ the sine series of $f(t)$ and we call the series $\frac{a_0}{2} + \sum_{n=1}^\infty a_n \cos\left(\frac{n\pi}{L} t\right)$ the cosine series of $f(t)$. We often do not actually care what happens outside of $[0,L]$. In this case, we pick whichever series fits our problem better.
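To illustrate that both series represent the same function on $[0, L]$, here is a small sketch in plain Python. It assumes the sample function $f(t) = t(1-t)$ on $[0,1]$ and computes the coefficients with the half-interval formulas $\frac{2}{L}\int_0^L f(t)\sin(\frac{n\pi}{L}t)\,dt$ and $\frac{2}{L}\int_0^L f(t)\cos(\frac{n\pi}{L}t)\,dt$; the function names and the number of terms are illustrative choices, not from the text.

```python
import math

L = 1.0

def f(t):
    # Sample function, defined only on [0, L].
    return t * (1 - t)

def simpson(g, a, b, m=1000):
    # Composite Simpson's rule with m (even) subintervals.
    h = (b - a) / m
    total = g(a) + g(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def sine_coeff(n):
    return 2 / L * simpson(lambda t: f(t) * math.sin(n * math.pi * t / L), 0.0, L)

def cosine_coeff(n):
    return 2 / L * simpson(lambda t: f(t) * math.cos(n * math.pi * t / L), 0.0, L)

def sine_series(t, terms=60):
    return sum(sine_coeff(n) * math.sin(n * math.pi * t / L)
               for n in range(1, terms + 1))

def cosine_series(t, terms=60):
    return cosine_coeff(0) / 2 + sum(
        cosine_coeff(n) * math.cos(n * math.pi * t / L)
        for n in range(1, terms + 1))

t = 0.3
print(f(t), sine_series(t), cosine_series(t))
```

Inside $[0, L]$ both partial sums approximate $f(t)$; they differ only in how they continue the function outside the interval (as an odd or as an even periodic function).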
It is not necessary to start with the full Fourier series to obtain the sine and cosine series. The sine series is really the eigenfunction expansion of $f(t)$ using eigenfunctions of the eigenvalue problem $x'' + \lambda x = 0$, $x(0) = 0$, $x(L) = 0$. The cosine series is the eigenfunction expansion of $f(t)$ using eigenfunctions of the eigenvalue problem $x'' + \lambda x = 0$, $x'(0) = 0$, $x'(L) = 0$. We could have, therefore, gotten the same formulas by defining the inner product
\[
\langle f(t), g(t) \rangle = \int_0^L f(t)\,g(t)\,dt,
\]
and following the procedure of § 4.2. This point of view is useful, as we commonly use a specific series that arose because our underlying question led to a certain eigenvalue problem. If the eigenvalue problem is not one of the three we covered so far, you can still do an eigenfunction expansion, generalizing the results of this chapter. We will deal with such a generalization in chapter 5.
Example 4.4.2: Find the Fourier series of the even periodic extension of the function $f(t) = t^2$ for $0 \leq t \leq \pi$.
We want to write
\[
f(t) = \frac{a_0}{2} + \sum_{n=1}^\infty a_n \cos(nt),
\]
where
\[
a_0 = \frac{2}{\pi}\int_0^\pi t^2\,dt = \frac{2\pi^2}{3},
\]
and
\[
\begin{aligned}
a_n &= \frac{2}{\pi}\int_0^\pi t^2\cos(nt)\,dt
= \frac{2}{\pi}\Bigl[ t^2\,\frac{1}{n}\sin(nt) \Bigr]_0^\pi - \frac{4}{n\pi}\int_0^\pi t\sin(nt)\,dt \\
&= \frac{4}{n^2\pi}\Bigl[ t\cos(nt) \Bigr]_0^\pi - \frac{4}{n^2\pi}\int_0^\pi \cos(nt)\,dt
= \frac{4(-1)^n}{n^2}.
\end{aligned}
\]
Note that we have "detected" the continuity of the extension since the coefficients decay as $\frac{1}{n^2}$. That is, the even periodic extension of $t^2$ has no jump discontinuities. It does have corners, since the derivative, which is an odd function and a sine series, has jumps; it has a Fourier series whose coefficients decay only as $\frac{1}{n}$.
Explicitly, the first few terms of the series are
\[
\frac{\pi^2}{3} - 4\cos(t) + \cos(2t) - \frac{4}{9}\cos(3t) + \cdots
\]
Exercise 4.4.3:
a) Compute the derivative of the even periodic extension of $f(t)$ above and verify it has jump discontinuities. Use the actual definition of $f(t)$, not its cosine series!
b) Why is it that the derivative of the even periodic extension of $f(t)$ is the odd periodic extension of $f'(t)$?
4.4.3 Application
Fourier series tie in to the boundary value problems we studied earlier. Let us see this connection in an application.
Consider the boundary value problem for $0 < t < L$,
\[
x''(t) + \lambda x(t) = f(t),
\]
for the Dirichlet boundary conditions $x(0) = 0$, $x(L) = 0$. The Fredholm alternative (Theorem 4.1.2) says that as long as $\lambda$ is not an eigenvalue of the underlying homogeneous problem, there exists a unique solution.
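Example 4.4.3: Take the boundary value problem for $0 < t < 1$,
\[
x''(t) + 2x(t) = f(t),
\]
where $f(t) = t$ on $0 < t < 1$, and satisfying the Dirichlet boundary conditions $x(0) = 0$, $x(1) = 0$. We write $f(t)$ as a sine series
\[
f(t) = \sum_{n=1}^\infty c_n \sin(n\pi t).
\]
Compute
\[
c_n = 2\int_0^1 t\sin(n\pi t)\,dt = \frac{2\,(-1)^{n+1}}{n\pi}.
\]
We write $x(t)$ as
\[
x(t) = \sum_{n=1}^\infty b_n \sin(n\pi t).
\]
We plug in to obtain
\[
\begin{aligned}
x''(t) + 2x(t) &= \underbrace{\sum_{n=1}^\infty -b_n n^2\pi^2 \sin(n\pi t)}_{x''} + 2\underbrace{\sum_{n=1}^\infty b_n \sin(n\pi t)}_{x} \\
&= \sum_{n=1}^\infty b_n (2 - n^2\pi^2)\sin(n\pi t) \\
&= f(t) = \sum_{n=1}^\infty \frac{2\,(-1)^{n+1}}{n\pi}\sin(n\pi t).
\end{aligned}
\]
Therefore,
\[
b_n (2 - n^2\pi^2) = \frac{2\,(-1)^{n+1}}{n\pi}
\]
or
\[
b_n = \frac{2\,(-1)^{n+1}}{n\pi\,(2 - n^2\pi^2)}.
\]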
That $2 - n^2\pi^2$ is not zero for any $n$, and that we can solve for $b_n$, is precisely because 2 is not an eigenvalue of the problem. We have thus obtained a Fourier series for the solution
\[
x(t) = \sum_{n=1}^\infty \frac{2\,(-1)^{n+1}}{n\pi\,(2 - n^2\pi^2)}\sin(n\pi t).
\]
See Figure 4.13 for a graph of the solution. Notice that because the eigenfunctions satisfy the boundary conditions, and $x$ is written in terms of the eigenfunctions, $x$ satisfies the boundary conditions.
[Figure 4.13: Plot of the solution of $x'' + 2x = t$, $x(0) = 0$, $x(1) = 0$.]
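As a sanity check on the series, one can compare a partial sum against a closed-form solution. The sketch below assumes the problem $x'' + 2x = t$, $x(0) = 0$, $x(1) = 0$ and the coefficients $b_n = \frac{2(-1)^{n+1}}{n\pi(2 - n^2\pi^2)}$; the closed form $x(t) = \frac{t}{2} - \frac{\sin(\sqrt{2}\,t)}{2\sin\sqrt{2}}$ is our own computation by undetermined coefficients, not a formula from the text.

```python
import math

def x_series(t, terms=200):
    # Partial sum of x(t) = sum of b_n sin(n pi t).
    total = 0.0
    for n in range(1, terms + 1):
        b_n = 2 * (-1) ** (n + 1) / (n * math.pi * (2 - n**2 * math.pi**2))
        total += b_n * math.sin(n * math.pi * t)
    return total

def x_exact(t):
    # Particular solution t/2 plus B sin(sqrt(2) t), with B chosen so x(1) = 0.
    w = math.sqrt(2)
    return t / 2 - math.sin(w * t) / (2 * math.sin(w))

for t in (0.25, 0.5, 0.75):
    print(t, x_series(t), x_exact(t))
```

Because the coefficients decay like $1/n^3$, a few hundred terms already agree with the closed form to several digits.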
Example 4.4.4: Similarly we handle the Neumann conditions. Take the boundary value problem for $0 < t < 1$,
\[
x''(t) + 2x(t) = f(t),
\]
where again $f(t) = t$ on $0 < t < 1$, but now satisfying the Neumann boundary conditions $x'(0) = 0$, $x'(1) = 0$. We write $f(t)$ as a cosine series
\[
f(t) = \frac{c_0}{2} + \sum_{n=1}^\infty c_n \cos(n\pi t),
\]
where
\[
c_0 = 2\int_0^1 t\,dt = 1,
\]
and
\[
c_n = 2\int_0^1 t\cos(n\pi t)\,dt = \frac{2\bigl((-1)^n - 1\bigr)}{\pi^2 n^2} =
\begin{cases}
\frac{-4}{\pi^2 n^2} & \text{if $n$ odd}, \\
0 & \text{if $n$ even}.
\end{cases}
\]
We write $x(t)$ as a cosine series
\[
x(t) = \frac{a_0}{2} + \sum_{n=1}^\infty a_n \cos(n\pi t).
\]
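We plug in to obtain
\[
\begin{aligned}
x''(t) + 2x(t) &= \sum_{n=1}^\infty \Bigl[ -a_n n^2\pi^2 \cos(n\pi t) \Bigr] + a_0 + 2\sum_{n=1}^\infty \Bigl[ a_n \cos(n\pi t) \Bigr] \\
&= a_0 + \sum_{n=1}^\infty a_n (2 - n^2\pi^2)\cos(n\pi t) \\
&= f(t) = \frac{1}{2} + \sum_{\substack{n=1 \\ n \text{ odd}}}^\infty \frac{-4}{\pi^2 n^2}\cos(n\pi t).
\end{aligned}
\]
Comparing coefficients, $a_0 = \frac{1}{2}$, $a_n = 0$ for $n$ even, and for $n$ odd
\[
a_n (2 - n^2\pi^2) = \frac{-4}{\pi^2 n^2},
\]
or
\[
a_n = \frac{-4}{n^2\pi^2\,(2 - n^2\pi^2)}.
\]
The Fourier series for the solution $x(t)$ is
\[
x(t) = \frac{1}{4} + \sum_{\substack{n=1 \\ n \text{ odd}}}^\infty \frac{-4}{n^2\pi^2\,(2 - n^2\pi^2)}\cos(n\pi t).
\]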
4.4.4 Exercises
Exercise 4.4.4: Take $f(t) = (t-1)^2$ defined on $0 \leq t \leq 1$.
Exercise 4.4.5: Find the Fourier series of both the odd and even periodic extension of the function $f(t) = (t-1)^2$ for $0 \leq t \leq 1$. Can you tell which extension is continuous from the Fourier series coefficients?
Exercise 4.4.6: Find the Fourier series of both the odd and even periodic extension of the function $f(t) = t$ for $0 \leq t \leq \pi$.
Exercise 4.4.7: Find the Fourier series of the even periodic extension of the function $f(t) = \sin t$ for $0 \leq t \leq \pi$.
Dirichlet conditions $x(0) = 0$ and $x(\pi) = 1$. Hint: Note that $\frac{t}{\pi}$ satisfies the given Dirichlet conditions.
The general solution of (4.9) consists of the complementary solution $x_c$, which solves the associated homogeneous equation $mx'' + cx' + kx = 0$, and a particular solution of (4.9) we call $x_p$. For $c > 0$, the complementary solution $x_c$ will decay as time goes by. Therefore, we are mostly interested in a particular solution $x_p$ that does not decay and is periodic with the same period as $F(t)$. We call this particular solution the steady periodic solution and we write it as $x_{sp}$ as before. What is new in this section is that we consider an arbitrary forcing function $F(t)$ instead of a simple cosine.
For simplicity, suppose $c = 0$. The problem with $c > 0$ is very similar. The equation
\[
mx'' + kx = 0
\]
has the general solution $x(t) = A\cos(\omega_0 t) + B\sin(\omega_0 t)$, where $\omega_0 = \sqrt{k/m}$. We write the $2L$-periodic forcing function as the Fourier series
\[
F(t) = \frac{c_0}{2} + \sum_{n=1}^\infty c_n \cos\left(\frac{n\pi}{L} t\right) + d_n \sin\left(\frac{n\pi}{L} t\right),
\]
and we try a steady periodic solution of the form
\[
x(t) = \frac{a_0}{2} + \sum_{n=1}^\infty a_n \cos\left(\frac{n\pi}{L} t\right) + b_n \sin\left(\frac{n\pi}{L} t\right),
\]
where $a_n$ and $b_n$ are unknowns. We plug $x$ into the differential equation and solve for $a_n$ and $b_n$ in terms of $c_n$ and $d_n$. This process is perhaps best understood by example.
Example 4.5.1: Suppose that $k = 2$, and $m = 1$. The units are again the mks units (meters-kilograms-seconds). There is a jetpack strapped to the mass, which fires with a force of 1 newton for 1 second and then is off for 1 second, and so on. We want to find the steady periodic solution.
The equation is, therefore,
\[
x'' + 2x = F(t),
\]
where $F(t)$ is the step function
\[
F(t) =
\begin{cases}
0 & \text{if } -1 < t < 0, \\
1 & \text{if } 0 < t < 1,
\end{cases}
\]
extended periodically. We write
\[
F(t) = \frac{c_0}{2} + \sum_{n=1}^\infty c_n \cos(n\pi t) + d_n \sin(n\pi t).
\]
We compute
\[
\begin{aligned}
c_n &= \int_{-1}^{1} F(t)\cos(n\pi t)\,dt = \int_0^1 \cos(n\pi t)\,dt = 0 \quad \text{for } n \geq 1, \\
c_0 &= \int_{-1}^{1} F(t)\,dt = \int_0^1 dt = 1, \\
d_n &= \int_{-1}^{1} F(t)\sin(n\pi t)\,dt
= \int_0^1 \sin(n\pi t)\,dt
= \left[ \frac{-\cos(n\pi t)}{n\pi} \right]_{t=0}^{1} \\
&= \frac{1 - (-1)^n}{\pi n} =
\begin{cases}
\frac{2}{\pi n} & \text{if $n$ odd}, \\
0 & \text{if $n$ even}.
\end{cases}
\end{aligned}
\]
So
\[
F(t) = \frac{1}{2} + \sum_{\substack{n=1 \\ n \text{ odd}}}^\infty \frac{2}{\pi n}\sin(n\pi t).
\]
We want to try
\[
x(t) = \frac{a_0}{2} + \sum_{n=1}^\infty a_n \cos(n\pi t) + b_n \sin(n\pi t).
\]
Once we plug $x$ into the differential equation $x'' + 2x = F(t)$, it is clear that $a_n = 0$ for $n \geq 1$ as there are no corresponding terms in the series for $F(t)$. Similarly $b_n = 0$ for $n$ even.
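Hence we try
\[
x(t) = \frac{a_0}{2} + \sum_{\substack{n=1 \\ n \text{ odd}}}^\infty b_n \sin(n\pi t).
\]
We plug into the differential equation and obtain
\[
x'' + 2x = a_0 + \sum_{\substack{n=1 \\ n \text{ odd}}}^\infty b_n (2 - n^2\pi^2)\sin(n\pi t)
= F(t) = \frac{1}{2} + \sum_{\substack{n=1 \\ n \text{ odd}}}^\infty \frac{2}{\pi n}\sin(n\pi t).
\]
So $a_0 = \frac{1}{2}$ and, for $n$ odd,
\[
b_n = \frac{2}{\pi n\,(2 - n^2\pi^2)}.
\]
The steady periodic solution has the Fourier series
\[
x_{sp}(t) = \frac{1}{4} + \sum_{\substack{n=1 \\ n \text{ odd}}}^\infty \frac{2}{\pi n\,(2 - n^2\pi^2)}\sin(n\pi t).
\]
We know this is the steady periodic solution as it contains no terms of the complementary solution and it is periodic with the same period as $F(t)$ itself. See Figure 4.14 for the plot of this solution.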
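One way to test the claim of periodicity is to start a numerical integration of $x'' + 2x = F(t)$ from the values $x_{sp}(0)$ and $x_{sp}'(0)$ given by the series and check that the state returns to itself after one period. The sketch below is only an illustration: it assumes $F(t)$ equals 1 on $(0,1)$ and 0 on $(1,2)$ with period 2, uses a classical fourth-order Runge-Kutta step (our choice of method, not the text's), and truncates the series at an arbitrary number of terms.

```python
import math

def F(t):
    # Square-wave forcing of period 2: 1 on (0, 1), 0 on (1, 2).
    return 1.0 if (t % 2.0) < 1.0 else 0.0

def x_sp(t, terms=20001):
    # Partial sum of the steady periodic solution (odd n only).
    total = 0.25
    for n in range(1, terms + 1, 2):
        total += 2 / (math.pi * n * (2 - n**2 * math.pi**2)) * math.sin(n * math.pi * t)
    return total

def x_sp_prime(t, terms=20001):
    # Term-by-term derivative of the series above.
    total = 0.0
    for n in range(1, terms + 1, 2):
        total += 2 / (2 - n**2 * math.pi**2) * math.cos(n * math.pi * t)
    return total

def integrate_one_period(x, v, steps=2000):
    # Classical RK4 for x' = v, v' = F - 2x on 0 <= t <= 2.
    # F is frozen at each step's midpoint, so no step straddles a jump of F.
    h = 2.0 / steps
    for i in range(steps):
        force = F((i + 0.5) * h)

        def acc(pos):
            return force - 2 * pos

        k1x, k1v = v, acc(x)
        k2x, k2v = v + h / 2 * k1v, acc(x + h / 2 * k1x)
        k3x, k3v = v + h / 2 * k2v, acc(x + h / 2 * k2x)
        k4x, k4v = v + h * k3v, acc(x + h * k3x)
        x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
    return x, v

x0, v0 = x_sp(0.0), x_sp_prime(0.0)
x2, v2 = integrate_one_period(x0, v0)
print(x0, v0)
print(x2, v2)
```

If the series really is the steady periodic solution, the two printed pairs should agree up to truncation and integration error. Any other choice of initial values would add a multiple of $\cos(\sqrt{2}\,t)$ or $\sin(\sqrt{2}\,t)$, which does not have period 2.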
[Figure 4.14: Plot of the steady periodic solution $x_{sp}$ of Example 4.5.1.]
4.5.2 Resonance
Just as when the forcing function was a simple cosine, we may encounter resonance. Assume $c = 0$ and let us discuss only pure resonance. Let $F(t)$ be $2L$-periodic and consider
\[
mx''(t) + kx(t) = F(t).
\]
When we expand $F(t)$ and find that some of its terms coincide with the complementary solution to $mx'' + kx = 0$, we cannot use those terms in the guess. Just like before, they disappear when we plug them into the left-hand side and we get a contradictory equation (such as $0 = 1$). That is, suppose
\[
x_c = A\cos(\omega_0 t) + B\sin(\omega_0 t),
\]
where $\omega_0 = \frac{N\pi}{L}$ for some positive integer $N$. We have to modify our guess and try
\[
x(t) = \frac{a_0}{2} + t\left( a_N \cos\left(\frac{N\pi}{L} t\right) + b_N \sin\left(\frac{N\pi}{L} t\right) \right) + \sum_{\substack{n=1 \\ n \neq N}}^\infty a_n \cos\left(\frac{n\pi}{L} t\right) + b_n \sin\left(\frac{n\pi}{L} t\right).
\]
In other words, we multiply the offending term by $t$. From then on, we proceed as before.
Of course, the solution is not a Fourier series (it is not even periodic) since it contains these terms multiplied by $t$. Further, the terms $t\left( a_N \cos\left(\frac{N\pi}{L} t\right) + b_N \sin\left(\frac{N\pi}{L} t\right) \right)$ eventually dominate and lead to wild oscillations. As before, this behavior is called pure resonance or just resonance.
Note that there now may be infinitely many resonance frequencies to hit. That is, as we
change the frequency of F (we change L), different terms from the Fourier series of F may
interfere with the complementary solution and cause resonance. However, we should note
that since everything is an approximation and in particular c is never actually zero but
something very close to zero, only the first few resonance frequencies matter in real life.
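Example 4.5.2: We want to solve the equation
\[
2x'' + 18\pi^2 x = F(t), \tag{4.10}
\]
where
\[
F(t) =
\begin{cases}
-1 & \text{if } -1 < t < 0, \\
1 & \text{if } 0 < t < 1,
\end{cases}
\]
extended periodically. We note that
\[
F(t) = \sum_{\substack{n=1 \\ n \text{ odd}}}^\infty \frac{4}{\pi n}\sin(n\pi t).
\]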
Exercise 4.5.1: Compute the Fourier series of F to verify the above equation.
As $\sqrt{\frac{k}{m}} = \sqrt{\frac{18\pi^2}{2}} = 3\pi$, the solution to (4.10) is
We simplify,
’
1
2
2x 00p + 18⇡ x p ⇤ 12a 3 ⇡ sin(3⇡t) + 12b 3 ⇡ cos(3⇡t) + ( 2n 2 ⇡2 b n + 18⇡2 b n ) sin(n⇡t).
n⇤1
n odd
n ,3
This series has to equal to the series for F(t). We equate the coefficients and solve for a3
and b n .
4/(3⇡) 1
a3 ⇤ ⇤ ,
12⇡ 9⇡ 2
b 3 ⇤ 0,
4 2
bn ⇤ ⇤ for n odd and n , 3.
n⇡(18⇡ 2 2n 2 ⇡2 ) ⇡3 n(9 n 2 )
That is,
\[
x_p(t) = \frac{-1}{9\pi^2}\,t\cos(3\pi t) + \sum_{\substack{n=1 \\ n \text{ odd} \\ n \neq 3}}^\infty \frac{2}{\pi^3 n\,(9 - n^2)}\sin(n\pi t).
\]
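The resonant term can be checked on its own: the piece $-\frac{1}{9\pi^2}\,t\cos(3\pi t)$ should produce exactly the third harmonic $\frac{4}{3\pi}\sin(3\pi t)$ of $F(t)$ when plugged into $2x'' + 18\pi^2 x$. The following sketch approximates the second derivative by a central difference; the step size and sample points are arbitrary choices made for illustration.

```python
import math

A3 = -1 / (9 * math.pi**2)

def resonant_term(t):
    # The term of x_p that is multiplied by t.
    return A3 * t * math.cos(3 * math.pi * t)

def lhs(t, h=1e-4):
    # 2 g'' + 18 pi^2 g, with g'' from a central difference.
    g2 = (resonant_term(t + h) - 2 * resonant_term(t) + resonant_term(t - h)) / h**2
    return 2 * g2 + 18 * math.pi**2 * resonant_term(t)

def third_harmonic_of_F(t):
    return 4 / (3 * math.pi) * math.sin(3 * math.pi * t)

for t in (0.1, 0.7, 5.3):
    print(t, lhs(t), third_harmonic_of_F(t))
```

The terms proportional to $t$ cancel, which is why the left-hand side stays bounded even though the resonant term itself grows linearly in $t$.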
When c > 0, you do not have to worry about pure resonance. That is, there are never
any conflicts and you do not need to multiply any terms by t. There is a corresponding
concept of practical resonance and it is very similar to the ideas we already explored in
chapter 2. Basically what happens in practical resonance is that one of the coefficients in
the series for x sp can get very big. Let us not go into details here.
4.5.3 Exercises
Exercise 4.5.2: Let $F(t) = \frac{1}{2} + \sum_{n=1}^\infty \frac{1}{n^2}\cos(n\pi t)$. Find the steady periodic solution to $x'' + 2x =$
Exercise 4.5.5: Let $F(t) = t$ for $-1 < t < 1$ and extended periodically. Find the steady periodic solution to $x'' + x = F(t)$. Express your solution as a series.
Exercise 4.5.6: Let $F(t) = t$ for $-1 < t < 1$ and extended periodically. Find the steady periodic solution to $x'' + \pi^2 x = F(t)$. Express your solution as a series.
Exercise 4.5.101: Let $F(t) = \sin(2\pi t) + 0.1\cos(10\pi t)$. Find the steady periodic solution to $x'' + \sqrt{2}\,x = F(t)$. Express your solution as a Fourier series.
Exercise 4.5.102: Let $F(t) = \sum_{n=1}^\infty e^{-n}\cos(2nt)$. Find the steady periodic solution to $x'' + 3x =$
Exercise 4.5.103: Let $F(t) = |t|$ for $-1 \leq t \leq 1$ extended periodically. Find the steady periodic solution to $x'' + \sqrt{3}\,x = F(t)$. Express your solution as a series.
Exercise 4.5.104: Let $F(t) = |t|$ for $-1 \leq t \leq 1$ extended periodically. Find the steady periodic solution to $x'' + \pi^2 x = F(t)$. Express your solution as a series.
[Figure 4.15: Insulated wire. The wire lies along the $x$-axis from $0$ to $L$, with temperature $u$.]
Let $u(x,t)$ denote the temperature at point $x$ at time $t$. The equation governing this setup is the so-called one-dimensional heat equation:
\[
\frac{\partial u}{\partial t} = k\,\frac{\partial^2 u}{\partial x^2},
\]
where $k > 0$ is a constant (the thermal conductivity of the material). That is, the change in heat at a specific point is proportional to the second derivative of the heat along the wire. This makes sense; if at a fixed $t$ the graph of the heat distribution has a maximum (the graph is concave down), then heat flows away from the maximum. And vice-versa.
We will generally use a more convenient notation for partial derivatives. We will write $u_t$ instead of $\frac{\partial u}{\partial t}$, and we will write $u_{xx}$ instead of $\frac{\partial^2 u}{\partial x^2}$. With this notation the heat equation becomes
\[
u_t = k u_{xx}.
\]
For the heat equation, we must also have some boundary conditions. We assume that the ends of the wire are either exposed and touching some body of constant heat, or the ends are insulated. For example, if the ends of the wire are kept at temperature 0, then the conditions are
\[
u(0,t) = 0 \quad \text{and} \quad u(L,t) = 0.
\]
If, on the other hand, the ends are also insulated, the conditions are
\[
u_x(0,t) = 0 \quad \text{and} \quad u_x(L,t) = 0.
\]
Let us see why that is so. If $u_x$ is positive at some point $x_0$, then at a particular time, $u$ is smaller to the left of $x_0$, and higher to the right of $x_0$. Heat is flowing from high heat to low heat, that is to the left. On the other hand if $u_x$ is negative then heat is again flowing from high heat to low heat, that is to the right. So when $u_x$ is zero, that is a point through which heat is not flowing. In other words, $u_x(0,t) = 0$ means no heat is flowing in or out of the wire at the point $x = 0$.
We always have two conditions along the $x$-axis as there are two derivatives in the $x$ direction. These side conditions are said to be homogeneous (that is, $u$ or a derivative of $u$ is set to zero).
We also need an initial condition: the temperature distribution at time $t = 0$. That is,
\[
u(x,0) = f(x),
\]
for some known function $f(x)$. This initial condition is not a homogeneous side condition.
Exercise 4.6.1: Verify the principle of superposition for the heat equation.
The method of separation of variables is to try to find solutions that are sums or products of functions of one variable. For example, for the heat equation, we try to find solutions of the form
\[
u(x,t) = X(x)T(t).
\]
That the desired solution we are looking for is of this form is too much to hope for. What is perfectly reasonable to ask, however, is to find enough "building-block" solutions of the form $u(x,t) = X(x)T(t)$ using this procedure so that the desired solution to the PDE is somehow constructed from these building blocks by the use of superposition.
Let us try to solve the heat equation
\[
u_t = k u_{xx} \quad \text{with} \quad u(0,t) = 0, \quad u(L,t) = 0, \quad \text{and} \quad u(x,0) = f(x).
\]
We guess $u(x,t) = X(x)T(t)$. We will try to make this guess satisfy the differential equation, $u_t = k u_{xx}$, and the homogeneous side conditions, $u(0,t) = 0$ and $u(L,t) = 0$. Then, as superposition preserves the differential equation and the homogeneous side conditions, we will try to build up a solution from these building blocks to solve the nonhomogeneous initial condition $u(x,0) = f(x)$.
First we plug $u(x,t) = X(x)T(t)$ into the heat equation to obtain
\[
X(x)T'(t) = k X''(x)T(t).
\]
We rewrite as
\[
\frac{T'(t)}{kT(t)} = \frac{X''(x)}{X(x)}.
\]
This equation must hold for all $x$ and all $t$. But the left-hand side does not depend on $x$ and the right-hand side does not depend on $t$. Hence, each side must be a constant. Let us call this constant $-\lambda$ (the minus sign is for convenience later). We obtain the two equations
\[
\frac{T'(t)}{kT(t)} = -\lambda = \frac{X''(x)}{X(x)}.
\]
In other words
\[
\begin{aligned}
X''(x) + \lambda X(x) &= 0, \\
T'(t) + \lambda k T(t) &= 0.
\end{aligned}
\]
The boundary condition $u(0,t) = 0$ implies $X(0)T(t) = 0$. We are looking for a nontrivial solution and so we can assume that $T(t)$ is not identically zero. Hence $X(0) = 0$. Similarly, $u(L,t) = 0$ implies $X(L) = 0$. We are looking for nontrivial solutions $X$ of the eigenvalue problem $X'' + \lambda X = 0$, $X(0) = 0$, $X(L) = 0$. We have previously found that the only eigenvalues are $\lambda_n = \frac{n^2\pi^2}{L^2}$, for integers $n \geq 1$, where eigenfunctions are $\sin\left(\frac{n\pi}{L} x\right)$. Hence, let us pick the solutions
\[
X_n(x) = \sin\left(\frac{n\pi}{L} x\right).
\]
The corresponding $T_n$ must satisfy the equation
\[
T_n'(t) + \frac{n^2\pi^2}{L^2}\,k\,T_n(t) = 0.
\]
This is one of our fundamental equations, and the solution is just an exponential:
\[
T_n(t) = e^{-\frac{n^2\pi^2}{L^2} k t}.
\]
Our building-block solutions are $u_n(x,t) = X_n(x)T_n(t) = \sin\left(\frac{n\pi}{L} x\right) e^{-\frac{n^2\pi^2}{L^2} k t}$. We write $f(x)$ as the sine series
\[
f(x) = \sum_{n=1}^\infty b_n \sin\left(\frac{n\pi}{L} x\right).
\]
That is, we find the Fourier series of the odd periodic extension of $f(x)$. We used the sine series as it corresponds to the eigenvalue problem for $X(x)$ above. Finally, we use superposition to write the solution as
\[
u(x,t) = \sum_{n=1}^\infty b_n u_n(x,t) = \sum_{n=1}^\infty b_n \sin\left(\frac{n\pi}{L} x\right) e^{-\frac{n^2\pi^2}{L^2} k t}.
\]
Why does this solution work? First note that it is a solution to the heat equation by superposition. It satisfies $u(0,t) = 0$ and $u(L,t) = 0$, because $x = 0$ or $x = L$ makes all the sines vanish. Finally, plugging in $t = 0$, we notice that $T_n(0) = 1$ and so
\[
u(x,0) = \sum_{n=1}^\infty b_n u_n(x,0) = \sum_{n=1}^\infty b_n \sin\left(\frac{n\pi}{L} x\right) = f(x).
\]
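The building blocks can be checked numerically. The sketch below assumes a single product solution of the form $\sin\left(\frac{n\pi}{L}x\right)e^{-\frac{n^2\pi^2}{L^2}kt}$ and approximates $u_t - k u_{xx}$ by central differences; the values $L = 1$ and $k = 0.003$ and the sample point are arbitrary choices for illustration.

```python
import math

def u_n(x, t, n, L=1.0, k=0.003):
    # Building block: sin(n pi x / L) * exp(-(n pi / L)^2 k t).
    return math.sin(n * math.pi * x / L) * math.exp(-(n * math.pi / L) ** 2 * k * t)

def heat_residual(x, t, n, L=1.0, k=0.003, h=1e-3):
    # u_t - k u_xx, with both derivatives from central differences.
    u_t = (u_n(x, t + h, n, L, k) - u_n(x, t - h, n, L, k)) / (2 * h)
    u_xx = (u_n(x + h, t, n, L, k) - 2 * u_n(x, t, n, L, k)
            + u_n(x - h, t, n, L, k)) / h**2
    return u_t - k * u_xx

for n in (1, 2, 3):
    # Residual should be near zero; the boundary values should vanish.
    print(n, heat_residual(0.3, 10.0, n), u_n(0.0, 10.0, n), u_n(1.0, 10.0, n))
```

Since the equation and the side conditions are linear and homogeneous, any finite sum of such blocks passes the same check, which is the superposition argument in numerical form.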
Example 4.6.1: Consider an insulated wire of length 1 whose ends are embedded in ice (temperature 0). Let $k = 0.003$. Suppose the initial heat distribution is $u(x,0) = 50\,x\,(1-x)$. See Figure 4.16.
We want to find the temperature function $u(x,t)$. Let us suppose we also want to find when (at what $t$) does the maximum temperature in the wire drop to one half of the initial maximum of 12.5.
We are solving the following PDE problem:
\[
\begin{aligned}
u_t &= 0.003\,u_{xx}, \\
u(0,t) &= u(1,t) = 0, \\
u(x,0) &= 50\,x\,(1-x) \quad \text{for } 0 < x < 1.
\end{aligned}
\]
[Figure 4.16: Plot of the initial heat distribution $u(x,0) = 50\,x\,(1-x)$.]
We write $f(x) = 50\,x\,(1-x)$ for $0 < x < 1$ as a sine series. That is, $f(x) = \sum_{n=1}^\infty b_n \sin(n\pi x)$, where
\[
b_n = 2\int_0^1 50\,x\,(1-x)\sin(n\pi x)\,dx = \frac{200}{\pi^3 n^3} - \frac{200\,(-1)^n}{\pi^3 n^3} =
\begin{cases}
0 & \text{if $n$ even}, \\
\frac{400}{\pi^3 n^3} & \text{if $n$ odd}.
\end{cases}
\]
The solution $u(x,t)$, plotted in Figure 4.17 for $0 \leq t \leq 100$, is given by the series:
\[
u(x,t) = \sum_{\substack{n=1 \\ n \text{ odd}}}^\infty \frac{400}{\pi^3 n^3}\sin(n\pi x)\,e^{-n^2\pi^2\,0.003\,t}.
\]
Finally, let us answer the question about the maximum temperature. It is relatively easy to see that the maximum temperature will always be at $x = 0.5$, in the middle of the wire. The plot of $u(x,t)$ confirms this intuition.
If we plug in $x = 0.5$, we get
\[
u(0.5,t) = \sum_{\substack{n=1 \\ n \text{ odd}}}^\infty \frac{400}{\pi^3 n^3}\sin(n\pi\,0.5)\,e^{-n^2\pi^2\,0.003\,t}.
\]
For $n = 3$ and higher (remember $n$ is only odd), the terms of the series are insignificant compared to the first term. The first term in the series is already a very good approximation of the function. Hence
\[
u(0.5,t) \approx \frac{400}{\pi^3}\,e^{-\pi^2\,0.003\,t}.
\]
The approximation gets better and better as $t$ gets larger as the other terms decay much faster. Let us plot the function $u(0.5,t)$, the temperature at the midpoint of the wire at time $t$, in Figure 4.18. The figure also plots the approximation by the first term.
[Figure 4.17: Plot of the temperature $u(x,t)$ of the wire at position $x$ at time $t$.]
[Figure 4.18: Temperature at the midpoint of the wire (the bottom curve), and the approximation of this temperature by using only the first term in the series (top curve).]
After $t = 5$ or so it would be hard to tell the difference between the first term of the series for $u(x,t)$ and the real solution $u(x,t)$. This behavior is a general feature of solving the heat equation. If you are interested in behavior for large enough $t$, only the first one or two terms may be necessary.
Let us get back to the question of when is the maximum temperature one half of the initial maximum temperature. That is, when is the temperature at the midpoint $12.5/2 = 6.25$. We notice on the graph that if we use the approximation by the first term we will be close enough. We solve
\[
6.25 = \frac{400}{\pi^3}\,e^{-\pi^2\,0.003\,t}.
\]
That is,
\[
t = \frac{\ln\frac{6.25\,\pi^3}{400}}{-\pi^2\,0.003} \approx 24.5.
\]
So the maximum temperature drops to half at about $t = 24.5$.
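The quality of the one-term approximation can be checked by solving for the half-time with more terms of the series. The following sketch evaluates the first-term formula and then runs a bisection on the partial sum of $u(0.5,t)$ with coefficients $\frac{400}{\pi^3 n^3}$ for odd $n$; the number of terms and the bracketing interval $[0, 100]$ are assumptions made for illustration.

```python
import math

K = 0.003

def u_mid(t, terms=51):
    # u(0.5, t) from the series, summing odd n up to `terms`.
    total = 0.0
    for n in range(1, terms + 1, 2):
        coeff = 400 / (math.pi**3 * n**3)
        total += coeff * math.sin(n * math.pi / 2) * math.exp(-n**2 * math.pi**2 * K * t)
    return total

# First-term estimate, from solving 6.25 = (400 / pi^3) exp(-pi^2 K t).
t_first = math.log(6.25 * math.pi**3 / 400) / (-math.pi**2 * K)

def half_time(lo=0.0, hi=100.0, tol=1e-8):
    # Bisection; assumes u_mid decreases from about 12.5 and crosses 6.25 once.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if u_mid(mid) > 6.25:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(t_first, half_time())
```

The two values should differ only in the third decimal place or so, which supports using the first term alone for this question.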
We mention an interesting behavior of the solution to the heat equation. The heat equation "smoothes" out the function $f(x)$ as $t$ grows. For a fixed $t$, the solution is a Fourier series with coefficients $b_n e^{-\frac{n^2\pi^2}{L^2} k t}$. If $t > 0$, then these coefficients go to zero faster than any $\frac{1}{n^p}$ for any power $p$. In other words, the Fourier series has infinitely many derivatives everywhere. Thus even if the function $f(x)$ has jumps and corners, then for a fixed $t > 0$, the solution $u(x,t)$ as a function of $x$ is as smooth as we want it to be.
Example 4.6.2: When the initial condition is already a sine series, then there is no need to
compute anything, you just need to plug in. Consider
Yet again we try a solution of the form $u(x,t) = X(x)T(t)$. By the same procedure as before we plug into the heat equation and arrive at the following two equations
\[
\begin{aligned}
X''(x) + \lambda X(x) &= 0, \\
T'(t) + \lambda k T(t) &= 0.
\end{aligned}
\]
At this point the story changes slightly. The boundary condition $u_x(0,t) = 0$ implies $X'(0)T(t) = 0$. Hence $X'(0) = 0$. Similarly, $u_x(L,t) = 0$ implies $X'(L) = 0$. We are looking for nontrivial solutions $X$ of the eigenvalue problem $X'' + \lambda X = 0$, $X'(0) = 0$, $X'(L) = 0$. We have previously found that the only eigenvalues are $\lambda_n = \frac{n^2\pi^2}{L^2}$, for integers $n \geq 0$, where the eigenfunctions are $\cos\left(\frac{n\pi}{L} x\right)$ (including the constant eigenfunction for $n = 0$). The corresponding building-block solutions are $u_n(x,t) = \cos\left(\frac{n\pi}{L} x\right) e^{-\frac{n^2\pi^2}{L^2} k t}$, and we write $f(x)$ as the cosine series
\[
f(x) = \frac{a_0}{2} + \sum_{n=1}^\infty a_n \cos\left(\frac{n\pi}{L} x\right).
\]
That is, we find the Fourier series of the even periodic extension of $f(x)$.
We use superposition to write the solution as
\[
u(x,t) = \frac{a_0}{2} + \sum_{n=1}^\infty a_n u_n(x,t) = \frac{a_0}{2} + \sum_{n=1}^\infty a_n \cos\left(\frac{n\pi}{L} x\right) e^{-\frac{n^2\pi^2}{L^2} k t}.
\]
Example 4.6.3: Let us try the same equation as before, but for insulated ends. We are solving the following PDE problem
\[
\begin{aligned}
u_t &= 0.003\,u_{xx}, \\
u_x(0,t) &= u_x(1,t) = 0, \\
u(x,0) &= 50\,x\,(1-x) \quad \text{for } 0 < x < 1.
\end{aligned}
\]
For this problem, we must find the cosine series of $u(x,0)$. For $0 < x < 1$ we have
\[
50\,x\,(1-x) = \frac{25}{3} + \sum_{\substack{n=2 \\ n \text{ even}}}^\infty \left( \frac{-200}{\pi^2 n^2} \right)\cos(n\pi x).
\]
The calculation is left to the reader. Hence, the solution to the PDE problem, plotted in Figure 4.19, is given by the series
\[
u(x,t) = \frac{25}{3} + \sum_{\substack{n=2 \\ n \text{ even}}}^\infty \left( \frac{-200}{\pi^2 n^2} \right)\cos(n\pi x)\,e^{-n^2\pi^2\,0.003\,t}.
\]
[Figure 4.19: Plot of the temperature of the insulated wire at position $x$ at time $t$.]
Note in the graph that as time goes on, the temperature evens out across the wire. Eventually, all the terms except the constant die out, and you will be left with a uniform temperature of $\frac{25}{3} \approx 8.33$ along the entire length of the wire.
Let us expand on the last point. The constant term in the series is
\[
\frac{a_0}{2} = \frac{1}{L}\int_0^L f(x)\,dx.
\]
In other words, $\frac{a_0}{2}$ is the average value of $f(x)$, that is the average of the initial temperature. As the wire is insulated everywhere, no heat can get out, no heat can get in. So the temperature tries to distribute evenly over time, and the average temperature must always be the same, in particular it is always $\frac{a_0}{2}$. As time goes to infinity, the temperature goes to the constant $\frac{a_0}{2}$ everywhere.
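The conservation of the average temperature can be observed numerically. The sketch below assumes the insulated-wire series with constant term $\frac{25}{3}$ and coefficients $\frac{-200}{\pi^2 n^2}$ for even $n$, with $k = 0.003$ and $L = 1$; the truncation level and the midpoint-rule resolution are arbitrary choices for illustration.

```python
import math

K = 0.003

def u(x, t, terms=200):
    # 25/3 minus the sum over even n of 200/(pi^2 n^2) cos(n pi x) exp(-n^2 pi^2 K t).
    total = 25 / 3
    for n in range(2, terms + 1, 2):
        coeff = 200 / (math.pi**2 * n**2)
        total -= coeff * math.cos(n * math.pi * x) * math.exp(-n**2 * math.pi**2 * K * t)
    return total

def average(t, m=1000):
    # Midpoint rule for the mean of u(x, t) over 0 <= x <= 1.
    return sum(u((i + 0.5) / m, t) for i in range(m)) / m

for t in (0.0, 5.0, 30.0, 500.0):
    print(t, average(t), u(0.0, t), u(0.5, t))
```

The average column should stay at $\frac{25}{3}$ for every $t$, while the values at the end and at the midpoint of the wire both drift toward that same constant.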
4.6.4 Exercises
Exercise 4.6.2: Consider a wire of length �, with $k = 0.001$ and an initial temperature distribution $u(x,0) = 50x$. Both ends are embedded in ice (temperature 0). Find the solution as a series.
\[
\begin{aligned}
u_t &= u_{xx}, \\
u(0,t) &= u(1,t) = 0, \\
u(x,0) &= 100 \quad \text{for } 0 < x < 1.
\end{aligned}
\]
\[
\begin{aligned}
u_t &= u_{xx}, \\
u_x(0,t) &= u_x(\pi,t) = 0, \\
u(x,0) &= 3\cos(x) + \cos(3x) \quad \text{for } 0 < x < \pi.
\end{aligned}
\]
\[
\begin{aligned}
u_t &= \frac{1}{3}\,u_{xx}, \\
u_x(0,t) &= u_x(\pi,t) = 0, \\
u(x,0) &= \frac{10x}{\pi} \quad \text{for } 0 < x < \pi.
\end{aligned}
\]
Exercise 4.6.6: Find a series solution of
\[
\begin{aligned}
u_t &= u_{xx}, \\
u(0,t) &= 0, \quad u(1,t) = 100, \\
u(x,0) &= \sin(\pi x) \quad \text{for } 0 < x < 1.
\end{aligned}
\]
Hint: Use the fact that $u(x,t) = 100x$ is a solution satisfying $u_t = u_{xx}$, $u(0,t) = 0$, $u(1,t) = 100$. Then use superposition.
Exercise 4.6.7: Find the steady state temperature solution as a function of $x$ alone, by letting $t \to \infty$ in the solution from exercises �.�.� and �.�.�. Verify that it satisfies the equation $u_{xx} = 0$.
Exercise 4.6.9 (challenging): Suppose that one end of the wire is insulated (say at $x = 0$) and the other end is kept at zero temperature. That is, find a series solution of
\[
\begin{aligned}
u_t &= k u_{xx}, \\
u_x(0,t) &= u(L,t) = 0, \\
u(x,0) &= f(x) \quad \text{for } 0 < x < L.
\end{aligned}
\]
Exercise 4.6.10 (challenging): Suppose that the wire is circular and insulated, so there are no ends. You can think of this as simply connecting the two ends and making sure the solution matches up at the ends. That is, find a series solution of
\[
\begin{aligned}
u_t &= k u_{xx}, \\
u(0,t) &= u(L,t), \quad u_x(0,t) = u_x(L,t), \\
u(x,0) &= f(x) \quad \text{for } 0 < x < L.
\end{aligned}
\]
Exercise 4.6.11: Consider a wire insulated on both ends, $L = 1$, $k = 1$, and $u(x,0) = \cos^2(\pi x)$.
c) Initially the temperature variation is 1 (maximum minus the minimum). Find the time when the variation is $1/2$.
\[
\begin{aligned}
u_t &= 3u_{xx}, \\
u(0,t) &= u(\pi,t) = 0, \\
u(x,0) &= 5\sin(x) + 2\sin(5x) \quad \text{for } 0 < x < \pi.
\end{aligned}
\]
\[
\begin{aligned}
u_t &= 0.1\,u_{xx}, \\
u_x(0,t) &= u_x(\pi,t) = 0, \\
u(x,0) &= 1 + 2\cos(x) \quad \text{for } 0 < x < \pi.
\end{aligned}
\]
Exercise 4.6.104: Use separation of variables (Hint: try $u(x,t) = X(x) + T(t)$) to find a nontrivial solution to $u_x + u_t = u$.
Exercise 4.6.105: Suppose that the temperature on the wire is fixed at 0 at the ends, $L = 1$, $k = 1$, and $u(x,0) = 100\sin(2\pi x)$.
c) At what time is the maximum temperature on the wire exactly one half of the initial maximum at $t = 0$.
The equation that governs this setup is the so-called one-dimensional wave equation:
\[
y_{tt} = a^2 y_{xx},
\]
for some constant $a > 0$. The intuition is similar to the heat equation, replacing velocity with acceleration: the acceleration at a specific point is proportional to the second derivative of the shape of the string. In other words when the string is concave down then $y_{xx}$ is negative and the string wants to accelerate downwards, so $y_{tt}$ should be negative. And vice-versa. The wave equation is an example of a hyperbolic PDE.
Assume that the ends of the string are fixed in place as on the guitar:
\[
y(0,t) = 0 \quad \text{and} \quad y(L,t) = 0.
\]
Note that we have two conditions along the $x$-axis as there are two derivatives in the $x$ direction.
There are also two derivatives along the $t$ direction and hence we need two further conditions here. We need to know the initial position and the initial velocity of the string. That is, for some known functions $f(x)$ and $g(x)$, we impose
\[
y(x,0) = f(x) \quad \text{and} \quad y_t(x,0) = g(x).
\]
The equation is linear, so superposition works just as it did for the heat equation. And again we will use separation of variables to find enough building-block solutions to get the overall solution. There is one change however. It will be easier to solve two separate problems and add their solutions.
\[
\begin{aligned}
w_{tt} &= a^2 w_{xx}, \\
w(0,t) &= w(L,t) = 0, \\
w(x,0) &= 0 \quad \text{for } 0 < x < L, \\
w_t(x,0) &= g(x) \quad \text{for } 0 < x < L,
\end{aligned}
\tag{4.11}
\]
and
\[
\begin{aligned}
z_{tt} &= a^2 z_{xx}, \\
z(0,t) &= z(L,t) = 0, \\
z(x,0) &= f(x) \quad \text{for } 0 < x < L, \\
z_t(x,0) &= 0 \quad \text{for } 0 < x < L.
\end{aligned}
\tag{4.12}
\]
The principle of superposition implies that $y = w + z$ solves the wave equation and furthermore $y(x,0) = w(x,0) + z(x,0) = f(x)$ and $y_t(x,0) = w_t(x,0) + z_t(x,0) = g(x)$. Hence, $y$ is a solution to
\[
\begin{aligned}
y_{tt} &= a^2 y_{xx}, \\
y(0,t) &= y(L,t) = 0, \\
y(x,0) &= f(x) \quad \text{for } 0 < x < L, \\
y_t(x,0) &= g(x) \quad \text{for } 0 < x < L.
\end{aligned}
\tag{4.13}
\]
The reason for all this complexity is that superposition only works for homogeneous conditions such as $y(0,t) = y(L,t) = 0$, $y(x,0) = 0$, or $y_t(x,0) = 0$. Therefore, we can use separation of variables to find many building-block solutions solving all the homogeneous conditions. We can then use them to construct a solution satisfying the remaining nonhomogeneous condition.
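Let us start with (4.11). We try a solution of the form $w(x,t) = X(x)T(t)$ again. We
plug into the wave equation to obtain
$$X(x)T''(t) = a^2 X''(x)T(t).$$
Rewriting we get
$$\frac{T''(t)}{a^2 T(t)} = \frac{X''(x)}{X(x)}.$$
Again, the left-hand side depends only on $t$ and the right-hand side depends only on $x$. So
both sides equal a constant, which we denote by $-\lambda$:
$$\frac{T''(t)}{a^2 T(t)} = -\lambda = \frac{X''(x)}{X(x)}.$$
We solve to get two ordinary differential equations:
$$
\begin{aligned}
X''(x) + \lambda X(x) &= 0, \\
T''(t) + \lambda a^2 T(t) &= 0.
\end{aligned}
$$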
The conditions $0 = w(0,t) = X(0)T(t)$ and $w(L,t) = 0$ imply $X(0) = 0$ and $X(L) = 0$.
Therefore, the only nontrivial solutions for the first equation are when $\lambda = \lambda_n = \frac{n^2\pi^2}{L^2}$, and
they are
$$X_n(x) = \sin\left(\frac{n\pi}{L}x\right).$$
The general solution for $T$ for this particular $\lambda_n$ is
$$T_n(t) = A\cos\left(\frac{n\pi a}{L}t\right) + B\sin\left(\frac{n\pi a}{L}t\right).$$
We also have the condition that $w(x,0) = 0$, or $X(x)T(0) = 0$. This implies that $T(0) = 0$,
which in turn forces $A = 0$. It is convenient to pick $B = \frac{L}{n\pi a}$ (you will see why in a moment),
and hence
$$T_n(t) = \frac{L}{n\pi a}\sin\left(\frac{n\pi a}{L}t\right).$$
Our building-block solutions are
$$w_n(x,t) = \frac{L}{n\pi a}\sin\left(\frac{n\pi}{L}x\right)\sin\left(\frac{n\pi a}{L}t\right).$$
We differentiate in $t$:
$$\frac{\partial w_n}{\partial t}(x,t) = \sin\left(\frac{n\pi}{L}x\right)\cos\left(\frac{n\pi a}{L}t\right).$$
Hence,
$$\frac{\partial w_n}{\partial t}(x,0) = \sin\left(\frac{n\pi}{L}x\right).$$
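We expand $g(x)$ in terms of these sines as
$$g(x) = \sum_{n=1}^\infty b_n \sin\left(\frac{n\pi}{L}x\right).$$
By superposition, the solution to (4.11) is then the series
$$w(x,t) = \sum_{n=1}^\infty b_n w_n(x,t) = \sum_{n=1}^\infty b_n \frac{L}{n\pi a}\sin\left(\frac{n\pi}{L}x\right)\sin\left(\frac{n\pi a}{L}t\right).$$
We proceed similarly for (4.12). We again try $z(x,t) = X(x)T(t)$, and separation of
variables yields the same two equations
$$
\begin{aligned}
X''(x) + \lambda X(x) &= 0, \\
T''(t) + \lambda a^2 T(t) &= 0,
\end{aligned}
$$
and the conditions $X(0) = 0$, $X(L) = 0$. So again $\lambda = \lambda_n = \frac{n^2\pi^2}{L^2}$
and
$$X_n(x) = \sin\left(\frac{n\pi}{L}x\right).$$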
This time the condition on $T$ is $T'(0) = 0$. Thus we get that $B = 0$ and we take
$$T_n(t) = \cos\left(\frac{n\pi a}{L}t\right).$$
Our building-block solution is
$$z_n(x,t) = \sin\left(\frac{n\pi}{L}x\right)\cos\left(\frac{n\pi a}{L}t\right).$$
As $z_n(x,0) = \sin\left(\frac{n\pi}{L}x\right)$, we expand $f(x)$ in terms of these sines as
$$f(x) = \sum_{n=1}^\infty c_n \sin\left(\frac{n\pi}{L}x\right).$$
Exercise 4.7.2: Fill in the details in the derivation of the solution of (4.12). Check that the solution
satisfies all the side conditions.
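Putting these two solutions together, let us state the result as a theorem.
Theorem 4.7.1. Take the equation
$$
\begin{aligned}
& y_{tt} = a^2 y_{xx}, \\
& y(0,t) = y(L,t) = 0, \\
& y(x,0) = f(x) && \text{for } 0 < x < L, \\
& y_t(x,0) = g(x) && \text{for } 0 < x < L,
\end{aligned}
\tag{4.14}
$$
where
$$f(x) = \sum_{n=1}^\infty c_n \sin\left(\frac{n\pi}{L}x\right) \quad\text{and}\quad g(x) = \sum_{n=1}^\infty b_n \sin\left(\frac{n\pi}{L}x\right).$$
Then the solution $y(x,t)$ can be written as a sum of the solutions of (4.11) and (4.12):
$$
\begin{aligned}
y(x,t) &= \sum_{n=1}^\infty b_n \frac{L}{n\pi a}\sin\left(\frac{n\pi}{L}x\right)\sin\left(\frac{n\pi a}{L}t\right) + c_n \sin\left(\frac{n\pi}{L}x\right)\cos\left(\frac{n\pi a}{L}t\right) \\
&= \sum_{n=1}^\infty \sin\left(\frac{n\pi}{L}x\right)\left[ b_n \frac{L}{n\pi a}\sin\left(\frac{n\pi a}{L}t\right) + c_n \cos\left(\frac{n\pi a}{L}t\right) \right].
\end{aligned}
$$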
Example 4.7.1: Consider a string of length 2 plucked in the middle, so that it has the initial shape
given in Figure 4.21. That is,
$$
f(x) =
\begin{cases}
0.1\,x & \text{if } 0 \leq x \leq 1, \\
0.1\,(2 - x) & \text{if } 1 < x \leq 2.
\end{cases}
$$
[Figure 4.21: Initial shape of the plucked string; horizontal axis x from 0 to 2, vertical axis y up to 0.1.]
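Let the string start at rest ($g(x) = 0$), and let $a = 1$ for simplicity. In other words, we
wish to solve the problem:
$$
\begin{aligned}
& y_{tt} = y_{xx}, \\
& y(0,t) = y(2,t) = 0, \\
& y(x,0) = f(x) \quad\text{and}\quad y_t(x,0) = 0.
\end{aligned}
$$
We leave it to the reader to compute the sine series of $f(x)$. The series will be
$$f(x) = \sum_{n=1}^\infty \frac{0.8}{n^2\pi^2}\sin\left(\frac{n\pi}{2}\right)\sin\left(\frac{n\pi}{2}x\right).$$
Note that $\sin\left(\frac{n\pi}{2}\right)$ is $0$ for even $n$ and equals $(-1)^{m+1}$ for odd $n = 2m-1$. Therefore, by Theorem 4.7.1,
$$
\begin{aligned}
y(x,t) &= \sum_{n=1}^\infty \frac{0.8}{n^2\pi^2}\sin\left(\frac{n\pi}{2}\right)\sin\left(\frac{n\pi}{2}x\right)\cos\left(\frac{n\pi}{2}t\right) \\
&= \sum_{m=1}^\infty \frac{0.8\,(-1)^{m+1}}{(2m-1)^2\pi^2}\sin\left(\frac{(2m-1)\pi}{2}x\right)\cos\left(\frac{(2m-1)\pi}{2}t\right) \\
&= \frac{0.8}{\pi^2}\sin\left(\frac{\pi}{2}x\right)\cos\left(\frac{\pi}{2}t\right) - \frac{0.8}{9\pi^2}\sin\left(\frac{3\pi}{2}x\right)\cos\left(\frac{3\pi}{2}t\right) \\
&\qquad + \frac{0.8}{25\pi^2}\sin\left(\frac{5\pi}{2}x\right)\cos\left(\frac{5\pi}{2}t\right) - \cdots
\end{aligned}
$$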
See Figure 4.22 on the next page for a plot for 0 < t < 3. Notice that unlike the heat
equation, the solution does not become “smoother,” the “sharp edges” remain. We will see
the reason for this behavior in the next section where we derive the solution to the wave
equation in a different way.
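To get a feel for the series of Example 4.7.1, here is a minimal sketch (not from the original notes) that evaluates a partial sum of $y(x,t)$ and compares it with the initial shape $f(x)$ at $t = 0$. The function names and the truncation at 200 terms are illustrative assumptions.

```python
import math

def plucked_string(x, t, terms=200):
    """Partial sum, over odd n = 2m - 1, of the series for y(x, t) in Example 4.7.1."""
    total = 0.0
    for m in range(1, terms + 1):
        n = 2 * m - 1
        coeff = 0.8 * (-1) ** (m + 1) / (n**2 * math.pi**2)
        total += coeff * math.sin(n * math.pi * x / 2) * math.cos(n * math.pi * t / 2)
    return total

def initial_shape(x):
    """The plucked shape f(x) on 0 <= x <= 2."""
    return 0.1 * x if x <= 1 else 0.1 * (2 - x)

if __name__ == "__main__":
    for x in (0.5, 1.0, 1.5):
        print(x, plucked_string(x, 0.0), initial_shape(x))
    # Half a period later (t = 2) every cosine factor equals -1,
    # so the string is in the mirrored position.
    print(plucked_string(0.5, 2.0))
```

Because the coefficients decay only like $1/n^2$, the partial sums converge slowly near the corner at $x = 1$, which is consistent with the sharp edges persisting in time.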
[Figure 4.22: Plot of the solution y(x,t) over 0 < x < 2 and 0 < t < 3.]
Make sure you understand what a plot such as the one in the figure is telling you.
For each fixed $t$, you can think of the function $y(x,t)$ as just a function of $x$. This function
gives you the shape of the string at time $t$. See Figure 4.23 for plots of
$y$ as a function of $x$ at several different values of $t$. On this plot you can see the sharp
edges remaining much better.
[Figure 4.23: Plots of y as a function of x (from 0 to 2) at several different values of t.]
One thing to take away from all this is how a guitar sounds. Notice that the (angular)
frequencies that come up in the solution are $n\frac{\pi a}{L}$. That is, there is a certain base fundamental
frequency $\frac{\pi a}{L}$, and then we also get all the multiples of this frequency, which in music are
called the overtones. Which overtones appear and with what amplitude is what musicians
call the timbre of the note. Mathematicians usually call this the spectrum. Because all the
frequencies are multiples of one frequency (the fundamental) we get a nice pleasing sound.
The fundamental frequency $\frac{\pi a}{L}$ increases as we decrease the length $L$. That is, if we place a
finger on the fingerboard and then pluck a string we get a higher note. The constant $a$ is
given by
$$a = \sqrt{\frac{T}{\rho}},$$
where $T$ is tension and $\rho$ is the linear density of the string. Tightening the string (turning
the tuning peg on a guitar) increases $a$ and hence produces a higher fundamental frequency
(a higher note). On the other hand, using a heavier string reduces $a$ and produces a lower
fundamental frequency (a lower note). A bass guitar has longer, thicker strings, while a
ukulele has short strings made of lighter material.
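The dependence of pitch on tension, density, and length can be sketched in a few lines. This is only an illustration, not part of the original notes: the function names and the numerical values (tension in newtons, linear density in kg/m, length in metres) are hypothetical.

```python
import math

def wave_speed(tension, linear_density):
    """The constant a = sqrt(T / rho)."""
    return math.sqrt(tension / linear_density)

def angular_frequencies(tension, linear_density, length, count=4):
    """First few angular frequencies n * pi * a / L (fundamental first)."""
    a = wave_speed(tension, linear_density)
    return [n * math.pi * a / length for n in range(1, count + 1)]

if __name__ == "__main__":
    # Hypothetical string: T = 70 N, rho = 0.005 kg/m, L = 0.65 m.
    omegas = angular_frequencies(70.0, 0.005, 0.65)
    # Dividing an angular frequency by 2*pi gives cycles per second.
    print([omega / (2.0 * math.pi) for omega in omegas])
```

In this model, quadrupling the tension or halving the length doubles the fundamental, while quadrupling the linear density halves it.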
Something rather interesting is the near symmetry between space and time. In its
simplest form we see this symmetry in the solutions
$$\sin\left(\frac{n\pi}{L}x\right)\sin\left(\frac{n\pi a}{L}t\right).$$
Except for the $a$, time and space are just the same.
In general, the solution for a fixed $x$ is a Fourier series in $t$, for a fixed $t$ it is a Fourier
series in $x$, and the coefficients are related. If the shape $f(x)$ or the initial velocity have
lots of corners, then the sound wave will have lots of corners. That is because the Fourier
coefficients of the initial shape decay to zero (as $n \to \infty$) at the same rate as the Fourier
coefficients of the wave in time (for some fixed $x$). So if you use a sharp object to pick the
string, you get a sharper sound with lots of high frequency components, while if you use
your thumb, you get a softer sound without so many high overtones. Similarly if you pluck
close to the bridge, you are getting a pluck that looks more like the sawtooth, and you get
an even sharper sound.
In fact, if you look at the formula for the solution, you see that for any fixed x we get an
almost arbitrary Fourier series in t, everything except the constant term. You can essentially
obtain any sound you want by plucking the string in just the right way. Of course we are
considering an ideal string of no stiffness and no air resistance. Those variables clearly
impact the sound as well.
4.7.1 Exercises
Exercise 4.7.3: Solve
$$
\begin{aligned}
& y_{tt} = 9y_{xx}, \\
& y(0,t) = y(1,t) = 0, \\
& y(x,0) = \sin(3\pi x) + \tfrac{1}{4}\sin(6\pi x) && \text{for } 0 < x < 1, \\
& y_t(x,0) = 0 && \text{for } 0 < x < 1.
\end{aligned}
$$
Exercise 4.7.4: Solve
$$
\begin{aligned}
& y_{tt} = 4y_{xx}, \\
& y(0,t) = y(1,t) = 0, \\
& y(x,0) = \sin(3\pi x) + \tfrac{1}{4}\sin(6\pi x) && \text{for } 0 < x < 1, \\
& y_t(x,0) = \sin(9\pi x) && \text{for } 0 < x < 1.
\end{aligned}
$$
Exercise 4.7.5: Derive the solution for a general plucked string of length $L$ and any constant $a$ (in
the equation $y_{tt} = a^2 y_{xx}$), where we raise the string some distance $b$ at the midpoint and let go.
Exercise 4.7.6: Imagine that a stringed musical instrument falls on the floor. Suppose that the
length of the string is 1 and $a = 1$. When the musical instrument hits the ground the string was in
rest position and hence $y(x,0) = 0$. However, the string was moving at some velocity at impact
($t = 0$), say $y_t(x,0) = -1$. Find the solution $y(x,t)$ for the shape of the string at time $t$.
Exercise 4.7.7 (challenging): Suppose that you have a vibrating string and that there is air
resistance proportional to the velocity. That is, you have
$$
\begin{aligned}
& y_{tt} = a^2 y_{xx} - k y_t, \\
& y(0,t) = y(1,t) = 0, \\
& y(x,0) = f(x) && \text{for } 0 < x < 1, \\
& y_t(x,0) = 0 && \text{for } 0 < x < 1.
\end{aligned}
$$
Suppose that $0 < k < 2\pi a$. Derive a series solution to the problem. Any coefficients in the series
should be expressed as integrals of $f(x)$.
Exercise 4.7.8: Suppose you touch the guitar string exactly in the middle to ensure another condition
$y(L/2, t) = 0$ for all time. Which multiples of the fundamental frequency $\frac{\pi a}{L}$ show up in the solution?
Exercise 4.7.104: Let's see what happens when $a = 0$. Find a solution to $y_{tt} = 0$, $y(0,t) =
y(\pi,t) = 0$, $y(x,0) = \sin(2x)$, $y_t(x,0) = \sin(x)$.
$$y_{tt} = a^2 y_{xx} \tag{4.15}$$
Under the change of variables $\xi = x - at$, $\eta = x + at$, the chain rule gives
$$0 = a^2 y_{xx} - y_{tt} = 4a^2 \frac{\partial^2 y}{\partial\xi\,\partial\eta} = 4a^2 y_{\xi\eta}.$$
Therefore, the wave equation (4.15) transforms into $y_{\xi\eta} = 0$. It is easy to find the general
solution to this equation by integrating twice. Keeping $\xi$ constant, we integrate with respect
to $\eta$ first and notice that the constant of integration depends on $\xi$; for each $\xi$ we might get
a different constant of integration. We get $y_\xi = C(\xi)$. Next, we integrate with respect to $\xi$
and notice that the constant of integration depends on $\eta$. Thus, $y = \int C(\xi)\,d\xi + B(\eta)$. The
solution must, therefore, be of the following form for some functions $A(\xi)$ and $B(\eta)$:
$$y = A(\xi) + B(\eta) = A(x - at) + B(x + at).$$
The d'Alembert solution is named after the French mathematician Jean le Rond d'Alembert (1717–1783).
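We claim this $A(x)$ and $B(x)$ give the solution, where $F$ and $G$ denote the odd periodic
extensions of $f$ and $g$. Explicitly, the solution is $y(x,t) =
A(x - at) + B(x + at)$, or in other words:
$$
\begin{aligned}
y(x,t) &= \frac{1}{2}F(x - at) - \frac{1}{2a}\int_0^{x-at} G(s)\,ds + \frac{1}{2}F(x + at) + \frac{1}{2a}\int_0^{x+at} G(s)\,ds \\
&= \frac{F(x - at) + F(x + at)}{2} + \frac{1}{2a}\int_{x-at}^{x+at} G(s)\,ds.
\end{aligned}
\tag{4.17}
$$
At $t = 0$ the two integrals cancel, so $y(x,0) = F(x)$.
So far so good. Assume for simplicity that $F$ is differentiable. We use the first form of (4.17),
as it is easier to differentiate. By the fundamental theorem of calculus we have
$$y_t(x,t) = \frac{-a}{2}F'(x - at) + \frac{1}{2}G(x - at) + \frac{a}{2}F'(x + at) + \frac{1}{2}G(x + at).$$
So
$$y_t(x,0) = \frac{-a}{2}F'(x) + \frac{1}{2}G(x) + \frac{a}{2}F'(x) + \frac{1}{2}G(x) = G(x).$$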
There is nothing special about ⌘, you can integrate with ⇠ first, if you wish.
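Yay! We're smoking now. OK, now the boundary conditions. Note that $F(x)$ and $G(x)$ are
odd. So
$$
y(0,t) = \frac{F(-at) + F(at)}{2} + \frac{1}{2a}\int_{-at}^{at} G(s)\,ds
= \frac{-F(at) + F(at)}{2} + \frac{1}{2a}\int_{-at}^{at} G(s)\,ds = 0 + 0 = 0.
$$
As $F$ is odd and $2L$-periodic,
$$F(L - at) + F(L + at) = F(-L - at) + F(L + at) = -F(L + at) + F(L + at) = 0.$$
For the same reason $G$ is odd about the point $L$, so its integral over $[L - at, L + at]$ vanishes.
Hence
$$y(L,t) = \frac{F(L - at) + F(L + at)}{2} + \frac{1}{2a}\int_{L-at}^{L+at} G(s)\,ds = 0 + 0 = 0.$$
And voilà, it works.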
Example 4.8.1: D'Alembert says that the solution is a superposition of two functions
(waves) moving in opposite directions at "speed" $a$. To get an idea of how it works, let
us work out an example. Consider the simpler setup
$$
\begin{aligned}
& y_{tt} = y_{xx}, \\
& y(0,t) = y(1,t) = 0, \\
& y(x,0) = f(x), \\
& y_t(x,0) = 0.
\end{aligned}
$$
The graph of this impulse is the top left plot in Figure 4.24.
Let $F(x)$ be the odd periodic extension of $f(x)$. Then (4.17) says that the solution is
$$y(x,t) = \frac{F(x - t) + F(x + t)}{2}.$$
It is not hard to compute specific values of $y(x,t)$. For example, to compute $y(0.1, 0.6)$
we notice $x - t = -0.5$ and $x + t = 0.7$. Now $F(-0.5) = -f(0.5) = -20\,(0.55 - 0.5) = -1$
and $F(0.7) = f(0.7) = 0$. Hence $y(0.1, 0.6) = \frac{-1 + 0}{2} = -0.5$. As you can see, the d'Alembert
solution is much easier to actually compute and to plot than the Fourier series solution.
See Figure 4.24 for plots of the solution $y$ for several different $t$.
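The computation of $y(0.1, 0.6)$ can be mirrored in code. The sketch below is not from the original notes. It assumes a triangular impulse `f` of height 1 centered at $x = 0.5$ with support $[0.45, 0.55]$, chosen only to agree with the values quoted in the example ($f(0.5) = 20\,(0.55 - 0.5)$ and $f(0.7) = 0$). The helper names are hypothetical, and the initial velocity is taken to be zero.

```python
def f(x):
    """Assumed triangular impulse on [0, 1]: height 1 at x = 0.5, zero outside [0.45, 0.55]."""
    if 0.45 <= x < 0.5:
        return 20 * (x - 0.45)
    if 0.5 <= x < 0.55:
        return 20 * (0.55 - x)
    return 0.0

def odd_periodic_extension(func, L):
    """Return F, the odd 2L-periodic extension of func given on [0, L]."""
    def F(x):
        x = (x + L) % (2 * L) - L          # reduce x to the interval [-L, L)
        return func(x) if x >= 0 else -func(-x)
    return F

def dalembert(func, L, a, x, t):
    """d'Alembert solution with zero initial velocity: (F(x - at) + F(x + at)) / 2."""
    F = odd_periodic_extension(func, L)
    return 0.5 * (F(x - a * t) + F(x + a * t))

if __name__ == "__main__":
    print(dalembert(f, 1.0, 1.0, 0.1, 0.6))   # compare with the value computed in the example
```

Reducing the argument modulo $2L$ before applying the formula for $f$ is precisely the point of the warning about using the extensions $F$ and $G$ rather than the formulas for $f$ and $g$ outside $0 < x < L$.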
Figure 4.24: Plot of the d'Alembert solution for $t = 0$, $t = 0.2$, $t = 0.4$, and $t = 0.6$.
If you think about it, the exact formulas for $A$ and $B$ are not hard to guess once you realize
what kind of side conditions $y(x,t)$ is supposed to satisfy. Let us find the formula again,
but slightly differently. The best approach is to do it in stages. When $g(x) = 0$ (and hence
$G(x) = 0$) the solution is
$$\frac{F(x - at) + F(x + at)}{2}.$$
On the other hand, when $f(x) = 0$ (and hence $F(x) = 0$), we let
$$H(x) = \int_0^x G(s)\,ds.$$
Exercise 4.8.1: Check that the new formula (4.18) satisfies the side conditions (4.16).
Warning: Make sure you use the odd periodic extensions F(x) and G(x), when you
have formulas for f (x) and g(x). The thing is, those formulas in general hold only for
0 < x < L, and are not usually equal to F(x) and G(x) for other x.
we notice that we have only used the initial conditions in the interval $[x - at, x + at]$. These
two endpoints are called the wavefronts, as that is where the wave front is given an initial
($t = 0$) disturbance at $x$. So if $a = 1$, an observer sitting at $x = 0$ at time $t = 1$ has only seen
the initial conditions for $x$ in the range $[-1, 1]$ and is blissfully unaware of anything else.
This is why for example we do not know that a supernova has occurred in the universe
until we see its light, millions of years from the time when it did in fact happen.
4.8.5 Exercises
Exercise 4.8.2: Using the d'Alembert solution solve $y_{tt} = 4y_{xx}$, $0 < x < \pi$, $t > 0$, $y(0,t) =
y(\pi,t) = 0$, $y(x,0) = \sin x$, and $y_t(x,0) = \sin x$. Hint: Note that $\sin x$ is the odd periodic
extension of $y(x,0)$ and $y_t(x,0)$.
Exercise 4.8.3: Using the d'Alembert solution solve $y_{tt} = 2y_{xx}$, $0 < x < 1$, $t > 0$, $y(0,t) =
y(1,t) = 0$, $y(x,0) = \sin^5(\pi x)$, and $y_t(x,0) = \sin^3(\pi x)$.
Exercise 4.8.4: Take $y_{tt} = 4y_{xx}$, $0 < x < \pi$, $t > 0$, $y(0,t) = y(\pi,t) = 0$, $y(x,0) = x(\pi - x)$,
and $y_t(x,0) = 0$.
a) Solve using the d'Alembert formula. Hint: You can use the sine series for $y(x,0)$.
b) Find the solution as a function of $x$ for a fixed $t = 0.5$, $t = 1$, and $t = 2$. Do not use the sine
series here.
Exercise 4.8.5: Derive the d'Alembert solution for $y_{tt} = a^2 y_{xx}$, $0 < x < \pi$, $t > 0$, $y(0,t) =
y(\pi,t) = 0$, $y(x,0) = f(x)$, and $y_t(x,0) = 0$, using the Fourier series solution of the wave equation,
by applying an appropriate trigonometric identity. Hint: Do it first for a single term of the Fourier
series solution, in particular do it when $y$ is $\sin\left(\frac{n\pi}{L}x\right)\sin\left(\frac{n\pi a}{L}t\right)$.
Exercise 4.8.6: The d'Alembert solution still works if there are no boundary conditions and the
initial condition is defined on the whole real line. Suppose that $y_{tt} = y_{xx}$ (for all $x$ on the real line
and $t \geq 0$), $y(x,0) = f(x)$, and $y_t(x,0) = 0$, where
$$
f(x) =
\begin{cases}
0 & \text{if } x < -1, \\
x + 1 & \text{if } -1 \leq x < 0, \\
-x + 1 & \text{if } 0 \leq x < 1, \\
0 & \text{if } 1 < x.
\end{cases}
$$
Solve using the d'Alembert solution. That is, write down a piecewise definition for the solution.
Then sketch the solution for $t = 0$, $t = 1/2$, $t = 1$, and $t = 2$.
Exercise 4.8.101: Using the d'Alembert solution solve $y_{tt} = 9y_{xx}$, $0 < x < 1$, $t > 0$, $y(0,t) =
y(1,t) = 0$, $y(x,0) = \sin(2\pi x)$, and $y_t(x,0) = \sin(3\pi x)$.
Exercise 4.8.102: Take $y_{tt} = 4y_{xx}$, $0 < x < 1$, $t > 0$, $y(0,t) = y(1,t) = 0$, $y(x,0) = x - x^2$, and
$y_t(x,0) = 0$. Using the d'Alembert solution find the solution at
a) $t = 0.1$, b) $t = 1/2$, c) $t = 1$.
You may have to split your answer up by cases.
Exercise 4.8.103: Take $y_{tt} = 100y_{xx}$, $0 < x < 4$, $t > 0$, $y(0,t) = y(4,t) = 0$, $y(x,0) = F(x)$,
and $y_t(x,0) = 0$. Suppose that $F(0) = 0$, $F(1) = 2$, $F(2) = 3$, $F(3) = 1$. Using the d'Alembert
solution find
a) $y(1,1)$, b) $y(4,3)$, c) $y(3,9)$.
$$u_t = k(u_{xx} + u_{yy}), \tag{4.19}$$
$$\Delta u = u_{xx} + u_{yy} = 0.$$
This equation is called the Laplace equation, and is an example of an elliptic equation.
Solutions to the Laplace equation are called harmonic functions and have many nice
properties and applications far beyond the steady state heat problem.
(The Laplace equation is named after the French mathematician Pierre-Simon, marquis de Laplace (1749–1827).)
Harmonic functions in two variables are no longer just linear (plane graphs). For
example, you can check that the functions $x^2 - y^2$ and $xy$ are harmonic. However, note
that if $u_{xx}$ is positive, so that $u$ is concave up in the $x$ direction, then $u_{yy}$ must be negative and $u$
must be concave down in the $y$ direction. A harmonic function can never have any "hilltop"
or "valley" on the graph. This observation is consistent with our intuitive idea of steady
state heat distribution; the hottest or coldest spot will not be inside.
Commonly the Laplace equation is part of a so-called Dirichlet problem. That is, we
have a region in the $xy$-plane and we specify certain values along the boundaries of the
region. We then try to find a solution $u$ to the Laplace equation defined on this region such
that $u$ agrees with the values we specified on the boundary.
(The Dirichlet problem is named after the German mathematician Johann Peter Gustav Lejeune Dirichlet (1805–1859).)
In this section we consider a rectangular region. For simplicity we specify boundary
values to be zero at three of the four edges and only specify an arbitrary function at one edge.
As we still have the principle of superposition, we can use this simpler solution to derive
the general solution for arbitrary boundary values by solving four different problems, one for
each edge, and adding those solutions together. This setup is left as an exercise.
We wish to solve the following problem. Let $h$ and $w$ be the height and width of our
rectangle, with one corner at the origin and lying in the first quadrant.
$$\Delta u = 0, \tag{4.20}$$
$$u(0,y) = 0 \quad\text{for } 0 < y < h, \tag{4.21}$$
$$u(w,y) = 0 \quad\text{for } 0 < y < h, \tag{4.22}$$
$$u(x,h) = 0 \quad\text{for } 0 < x < w, \tag{4.23}$$
$$u(x,0) = f(x) \quad\text{for } 0 < x < w. \tag{4.24}$$
We try a solution of the form $u(x,y) = X(x)Y(y)$. Plugging into the Laplace equation gives
$$X''Y + XY'' = 0.$$
Rearranging, we get
$$-\frac{X''}{X} = \frac{Y''}{Y}.$$
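The left-hand side only depends on $x$ and the right-hand side only depends on $y$. Therefore,
there is some constant $\lambda$ such that $\lambda = \frac{-X''}{X} = \frac{Y''}{Y}$. And we get two equations
$$
\begin{aligned}
X'' + \lambda X &= 0, \\
Y'' - \lambda Y &= 0.
\end{aligned}
$$
Furthermore, the homogeneous boundary conditions imply that $X(0) = X(w) = 0$ and
$Y(h) = 0$. Taking the equation for $X$, we have already seen that we have a nontrivial solution
if and only if $\lambda = \lambda_n = \frac{n^2\pi^2}{w^2}$, and the solution is a multiple of
$$X_n(x) = \sin\left(\frac{n\pi}{w}x\right).$$
For these given $\lambda_n$, the general solution for $Y$ (one for each $n$) is
$$Y_n(y) = A_n\cosh\left(\frac{n\pi}{w}y\right) + B_n\sinh\left(\frac{n\pi}{w}y\right). \tag{4.25}$$
We only have one condition on $Y_n$ and hence we can pick one of $A_n$ or $B_n$ to be something
convenient. It will be useful to have $Y_n(0) = 1$, so we let $A_n = 1$. Setting $Y_n(h) = 0$ and
solving for $B_n$ we get that
$$B_n = \frac{-\cosh\left(\frac{n\pi h}{w}\right)}{\sinh\left(\frac{n\pi h}{w}\right)}.$$
After we plug $A_n$ and $B_n$ into (4.25) and simplify by using the identity $\sinh(\alpha - \beta) =
\sinh(\alpha)\cosh(\beta) - \cosh(\alpha)\sinh(\beta)$, we find
$$Y_n(y) = \frac{\sinh\left(\frac{n\pi(h - y)}{w}\right)}{\sinh\left(\frac{n\pi h}{w}\right)}.$$
We define $u_n(x,y) = X_n(x)Y_n(y)$. If
$$f(x) = \sum_{n=1}^\infty b_n \sin\left(\frac{n\pi}{w}x\right),$$
then we take
$$u(x,y) = \sum_{n=1}^\infty b_n u_n(x,y) = \sum_{n=1}^\infty b_n \sin\left(\frac{n\pi}{w}x\right)\frac{\sinh\left(\frac{n\pi(h - y)}{w}\right)}{\sinh\left(\frac{n\pi h}{w}\right)}.$$
As $u_n$ satisfies (4.20)–(4.23) and any linear combination (finite or infinite) of $u_n$ also satisfies
(4.20)–(4.23), then $u$ satisfies (4.20)–(4.23). By plugging in $y = 0$, we see $u$ satisfies (4.24) as
well.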
Example 4.9.1: Take $w = h = \pi$ and let $f(x) = \pi$. Let us compute the sine series for the
function $\pi$ (same as the series for the square wave). For $0 < x < \pi$, we have
$$f(x) = \sum_{\substack{n=1 \\ n \text{ odd}}}^\infty \frac{4}{n}\sin(nx).$$
Therefore the solution $u(x,y)$, see Figure 4.25, to the corresponding Dirichlet problem is
given as
$$u(x,y) = \sum_{\substack{n=1 \\ n \text{ odd}}}^\infty \frac{4}{n}\sin(nx)\left(\frac{\sinh\bigl(n(\pi - y)\bigr)}{\sinh(n\pi)}\right).$$
Figure 4.25: Steady state temperature of a square plate, three sides held at zero and one side held at $\pi$.
This scenario corresponds to the steady state temperature on a square plate of width $\pi$
with three sides held at 0 degrees and one side held at $\pi$ degrees. If we have arbitrary boundary
data on all sides, then we solve four problems, each using one piece of nonhomogeneous
data. Then we use the principle of superposition to add up all four solutions to have a
solution to the original problem.
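The series of Example 4.9.1 is easy to evaluate numerically. The following is a minimal sketch, not part of the original notes, with hypothetical function names. It rewrites the ratio of hyperbolic sines using decaying exponentials, since $\sinh(n\pi)$ overflows floating point for large $n$. By the superposition argument above, rotating the problem to each of the four sides and adding gives the constant solution $\pi$, so the value at the center of the plate should come out as $\pi/4$.

```python
import math

def sinh_ratio(n, y):
    """sinh(n (pi - y)) / sinh(n pi), written with decaying exponentials to avoid overflow."""
    return math.exp(-n * y) * (1.0 - math.exp(-2.0 * n * (math.pi - y))) / (1.0 - math.exp(-2.0 * n * math.pi))

def plate_temperature(x, y, terms=100):
    """Partial sum over odd n of (4/n) sin(n x) sinh(n (pi - y)) / sinh(n pi)."""
    total = 0.0
    for n in range(1, 2 * terms, 2):
        total += 4.0 / n * math.sin(n * x) * sinh_ratio(n, y)
    return total

if __name__ == "__main__":
    center = plate_temperature(math.pi / 2, math.pi / 2)
    print(center, math.pi / 4)
```

Away from the edge $y = 0$ the terms decay like $e^{-ny}$, so a handful of terms already gives many digits. Near $y = 0$ the convergence is as slow as that of the square wave series itself.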
A different way to visualize solutions of the Laplace equation is to take a wire and bend
it so that it corresponds to the graph of the temperature above the boundary of your region.
Cut a rubber sheet in the shape of your region—a square in our case—and stretch it fixing
the edges of the sheet to the wire. The rubber sheet is a good approximation of the graph
of the solution to the Laplace equation with the given boundary data.
4.9.1 Exercises
Exercise 4.9.1: Let R be the region described by 0 < x < ⇡ and 0 < y < ⇡. Solve the problem
Exercise 4.9.2: Let $R$ be the region described by $0 < x < 1$ and $0 < y < 1$. Solve the problem
$$
\begin{aligned}
& u_{xx} + u_{yy} = 0, \\
& u(x,0) = \sin(\pi x) - \sin(2\pi x), \quad u(x,1) = 0, \\
& u(0,y) = 0, \quad u(1,y) = 0.
\end{aligned}
$$
Exercise 4.9.3: Let $R$ be the region described by $0 < x < 1$ and $0 < y < 1$. Solve the problem
$$
\begin{aligned}
& u_{xx} + u_{yy} = 0, \\
& u(x,0) = u(x,1) = u(0,y) = u(1,y) = C.
\end{aligned}
$$
Hint: Try a solution of the form $u(x,y) = X(x) + Y(y)$ (different separation of variables).
Exercise 4.9.5: Use the solution of Exercise �.�.� to solve
$$
\begin{aligned}
& u_{xx} + u_{yy} = 0, \\
& u(x,0) = 0, \quad u(x,h) = f(x), \\
& u(0,y) = 0, \quad u(w,y) = 0.
\end{aligned}
$$
The solution should be in series form using the Fourier series coefficients of $f(x)$.
Exercise 4.9.7: Let $R$ be the region described by $0 < x < w$ and $0 < y < h$. Solve the problem
$$
\begin{aligned}
& u_{xx} + u_{yy} = 0, \\
& u(x,0) = 0, \quad u(x,h) = 0, \\
& u(0,y) = f(y), \quad u(w,y) = 0.
\end{aligned}
$$
The solution should be in series form using the Fourier series coefficients of $f(y)$.
Exercise 4.9.8: Let $R$ be the region described by $0 < x < w$ and $0 < y < h$. Solve the problem
$$
\begin{aligned}
& u_{xx} + u_{yy} = 0, \\
& u(x,0) = 0, \quad u(x,h) = 0, \\
& u(0,y) = 0, \quad u(w,y) = f(y).
\end{aligned}
$$
The solution should be in series form using the Fourier series coefficients of $f(y)$.
Exercise 4.9.9: Let $R$ be the region described by $0 < x < 1$ and $0 < y < 1$. Solve the problem
$$
\begin{aligned}
& u_{xx} + u_{yy} = 0, \\
& u(x,0) = \sin(9\pi x), \quad u(x,1) = \sin(2\pi x), \\
& u(0,y) = 0, \quad u(1,y) = 0.
\end{aligned}
$$
Exercise 4.9.10: Let $R$ be the region described by $0 < x < 1$ and $0 < y < 1$. Solve the problem
$$
\begin{aligned}
& u_{xx} + u_{yy} = 0, \\
& u(x,0) = \sin(\pi x), \quad u(x,1) = \sin(\pi x), \\
& u(0,y) = \sin(\pi y), \quad u(1,y) = \sin(\pi y).
\end{aligned}
$$
Exercise 4.9.11 (challenging): Using only your intuition find $u(1/2, 1/2)$, for the problem $\Delta u = 0$,
where $u(0,y) = u(1,y) = 100$ for $0 < y < 1$, and $u(x,0) = u(x,1) = 0$ for $0 < x < 1$. Explain.
Exercise 4.9.101: Let $R$ be the region described by $0 < x < 1$ and $0 < y < 1$. Solve the problem
$$\Delta u = 0, \quad u(x,0) = \sum_{n=1}^\infty \frac{1}{n^2}\sin(n\pi x), \quad u(x,1) = 0, \quad u(0,y) = 0, \quad u(1,y) = 0.$$
Exercise 4.9.102: Let R be the region described by 0 < x < 1 and 0 < y < 2. Solve the problem
With $x = r\cos(\theta)$ and $y = r\sin(\theta)$, the chain rule gives
$$
\begin{aligned}
u_r &= u_x x_r + u_y y_r = \cos(\theta)u_x + \sin(\theta)u_y, \\
u_{rr} &= \cos(\theta)(u_{xx}x_r + u_{xy}y_r) + \sin(\theta)(u_{yx}x_r + u_{yy}y_r) \\
&= \cos^2(\theta)u_{xx} + 2\cos(\theta)\sin(\theta)u_{xy} + \sin^2(\theta)u_{yy}.
\end{aligned}
$$
Similarly for the $\theta$ derivative. Note that we have to use the product rule for the second
derivative.
$$
\begin{aligned}
u_\theta &= u_x x_\theta + u_y y_\theta = -r\sin(\theta)u_x + r\cos(\theta)u_y, \\
u_{\theta\theta} &= -r\cos(\theta)u_x - r\sin(\theta)(u_{xx}x_\theta + u_{xy}y_\theta) - r\sin(\theta)u_y + r\cos(\theta)(u_{yx}x_\theta + u_{yy}y_\theta) \\
&= -r\cos(\theta)u_x - r\sin(\theta)u_y + r^2\sin^2(\theta)u_{xx} - 2r^2\sin(\theta)\cos(\theta)u_{xy} + r^2\cos^2(\theta)u_{yy}.
\end{aligned}
$$
Let us now try to solve for $u_{xx} + u_{yy}$. We start with $\frac{1}{r^2}u_{\theta\theta}$ to get rid of those pesky $r^2$. If
we add $u_{rr}$ and use the fact that $\cos^2(\theta) + \sin^2(\theta) = 1$, we get
$$\frac{1}{r^2}u_{\theta\theta} + u_{rr} = u_{xx} + u_{yy} - \frac{1}{r}\cos(\theta)u_x - \frac{1}{r}\sin(\theta)u_y.$$
We're not quite there yet, but all we are lacking is $\frac{1}{r}u_r$. Adding it we obtain the Laplacian in
polar coordinates:
$$\Delta u = u_{xx} + u_{yy} = \frac{1}{r^2}u_{\theta\theta} + \frac{1}{r}u_r + u_{rr}.$$
Notice that the Laplacian in polar coordinates no longer has constant coefficients.
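The polar formula can be checked numerically on functions whose Cartesian Laplacian is known. The sketch below is not from the original notes, and the function names and step size are illustrative assumptions. It approximates $\frac{1}{r^2}u_{\theta\theta} + \frac{1}{r}u_r + u_{rr}$ with central differences. For the harmonic functions $x^2 - y^2$ and $xy$ the result should be near $0$, and for $x^2 + y^2$ it should be near $4$.

```python
import math

def polar_laplacian(U, r, theta, h=1e-4):
    """Finite-difference approximation of U_theta_theta / r^2 + U_r / r + U_rr at (r, theta)."""
    u_rr = (U(r + h, theta) - 2.0 * U(r, theta) + U(r - h, theta)) / h**2
    u_r = (U(r + h, theta) - U(r - h, theta)) / (2.0 * h)
    u_thth = (U(r, theta + h) - 2.0 * U(r, theta) + U(r, theta - h)) / h**2
    return u_thth / r**2 + u_r / r + u_rr

def from_cartesian(func):
    """Turn a function of (x, y) into a function of (r, theta)."""
    return lambda r, theta: func(r * math.cos(theta), r * math.sin(theta))

if __name__ == "__main__":
    saddle = from_cartesian(lambda x, y: x * x - y * y)     # harmonic
    product = from_cartesian(lambda x, y: x * y)            # harmonic
    bowl = from_cartesian(lambda x, y: x * x + y * y)       # Laplacian is 4
    for U in (saddle, product, bowl):
        print(polar_laplacian(U, 0.7, 1.1))
```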
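$$
\begin{aligned}
\Theta'' + \lambda\Theta &= 0, \\
r^2R'' + rR' - \lambda R &= 0.
\end{aligned}
$$
Let us first focus on $\Theta$. We know that $u(r,\theta)$ ought to be $2\pi$-periodic in $\theta$, that is,
$u(r,\theta) = u(r,\theta + 2\pi)$. Therefore, the solution to $\Theta'' + \lambda\Theta = 0$ must be $2\pi$-periodic. We
have seen such a problem in Example 4.1.5. We conclude that $\lambda = n^2$ for a nonnegative
integer $n = 0, 1, 2, 3, \ldots$. The equation becomes $\Theta'' + n^2\Theta = 0$. When $n = 0$ the equation
is just $\Theta'' = 0$, so we have the general solution $A\theta + B$. As $\Theta$ is periodic, $A = 0$. For
convenience we write this solution as
$$\Theta_0 = \frac{a_0}{2}$$
for some constant $a_0$. For positive $n$, the solution is
$$\Theta_n = a_n\cos(n\theta) + b_n\sin(n\theta),$$
for some constants $a_n$ and $b_n$. Next, we consider the equation for $R$,
$$r^2R'' + rR' - n^2R = 0.$$
This equation appeared in exercises before; we solved it in Exercise 2.1.6 and Exercise 2.1.7
on page 83. The idea is to try a solution $r^s$ and if that does not give us two solutions, also
try a solution of the form $r^s\ln r$. Let us name the solution for $R_n$. When $n = 0$ we obtain
$$R_0 = Ar^0 + Br^0\ln r = A + B\ln r,$$
and for $n \geq 1$ we obtain $R_n = Ar^n + Br^{-n}$. The solution must stay finite at the origin, so $B = 0$,
and we may as well take $A = 1$, that is, $R_0 = 1$ and $R_n = r^n$. Superposition then gives the series
$$u(r,\theta) = \frac{a_0}{2} + \sum_{n=1}^\infty a_n r^n\cos(n\theta) + b_n r^n\sin(n\theta).$$
The boundary data $g(\theta)$ on the circle $r = 1$ must then be
$$g(\theta) = u(1,\theta) = \frac{a_0}{2} + \sum_{n=1}^\infty a_n\cos(n\theta) + b_n\sin(n\theta).$$
Example 4.10.1: Consider the problem
$$
\begin{aligned}
& \Delta u = 0, && 0 \leq r < 1, \quad -\pi < \theta \leq \pi, \\
& u(1,\theta) = \cos(10\,\theta), && -\pi < \theta \leq \pi.
\end{aligned}
$$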
The solution is
$$u(r,\theta) = r^{10}\cos(10\,\theta).$$
See the plot in Figure 4.26. The thing to notice in this example is that the effect of a high
frequency is mostly felt at the boundary. In the middle of the disc, the solution is very
close to zero. That is because $r^{10}$ is rather small when $r$ is close to 0.
Figure 4.26: The solution of the Dirichlet problem in the disc with $\cos(10\,\theta)$ as boundary data.
Example 4.10.2: Let us solve a more difficult problem. Consider a long rod with circular
cross section of radius 1. Suppose we wish to solve the steady state heat problem in
the rod. If the rod is long enough, we simply need to solve the Laplace equation in two
dimensions. Let us put the center of the rod at the origin and we have exactly the region
we are currently studying, a circle of radius 1. For the boundary conditions, suppose in
Cartesian coordinates $x$ and $y$, the temperature on the boundary is 0 when $y < 0$, and it is
$2y$ when $y > 0$.
Let us set the problem up. As $y = r\sin(\theta)$, then on the circle of radius 1, that is, where
$r = 1$, we have $2y = 2\sin(\theta)$. So
$$
\begin{aligned}
& \Delta u = 0, \qquad 0 \leq r < 1, \quad -\pi < \theta \leq \pi, \\
& u(1,\theta) =
\begin{cases}
2\sin(\theta) & \text{if } 0 \leq \theta \leq \pi, \\
0 & \text{if } -\pi < \theta < 0.
\end{cases}
\end{aligned}
$$
We must now compute the Fourier series for the boundary condition. By now the
reader has plentiful experience in computing Fourier series and so we simply state that
$$u(1,\theta) = \frac{2}{\pi} + \sin(\theta) + \sum_{n=1}^\infty \frac{-4}{\pi(4n^2 - 1)}\cos(2n\theta).$$
Exercise 4.10.1: Compute the series for $u(1,\theta)$ and verify that it really is what we have just claimed.
Hint: Be careful, make sure not to divide by zero.
We now simply write the solution (see Figure 4.27) by multiplying by $r^n$ in the right
places.
$$u(r,\theta) = \frac{2}{\pi} + r\sin(\theta) + \sum_{n=1}^\infty \frac{-4r^{2n}}{\pi(4n^2 - 1)}\cos(2n\theta).$$
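As a quick check on the series of Example 4.10.2, the following sketch (not from the original notes, with hypothetical names) evaluates a partial sum. On the boundary it should reproduce $2\sin(\theta)$ on the upper half circle and $0$ on the lower half, and at the origin it should give the constant term $\frac{2}{\pi}$.

```python
import math

def rod_temperature(r, theta, terms=2000):
    """Partial sum of 2/pi + r sin(theta) - sum 4 r^(2n) cos(2 n theta) / (pi (4 n^2 - 1))."""
    total = 2.0 / math.pi + r * math.sin(theta)
    for n in range(1, terms + 1):
        total -= 4.0 * r ** (2 * n) / (math.pi * (4 * n * n - 1)) * math.cos(2 * n * theta)
    return total

if __name__ == "__main__":
    print(rod_temperature(1.0, math.pi / 2))    # boundary point (0, 1), where 2y = 2
    print(rod_temperature(1.0, -math.pi / 2))   # boundary point (0, -1), where the data is 0
    print(rod_temperature(0.0, 0.0), 2.0 / math.pi)
```

Inside the disc the factor $r^{2n}$ makes the series converge much faster than on the boundary, where the coefficients decay only like $1/n^2$.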
Figure 4.27: The solution of the Dirichlet problem with boundary data 0 for $y < 0$ and $2y$ for $y > 0$.
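$$
\begin{aligned}
u(r,\theta) &= \frac{a_0}{2} + \sum_{n=1}^\infty a_n r^n\cos(n\theta) + b_n r^n\sin(n\theta) \\
&= \underbrace{\frac{1}{2\pi}\int_{-\pi}^{\pi} g(\alpha)\,d\alpha}_{\frac{a_0}{2}}
+ \sum_{n=1}^\infty \Biggl( \underbrace{\frac{1}{\pi}\int_{-\pi}^{\pi} g(\alpha)\cos(n\alpha)\,d\alpha}_{a_n} \Biggr) r^n\cos(n\theta) \\
&\qquad + \Biggl( \underbrace{\frac{1}{\pi}\int_{-\pi}^{\pi} g(\alpha)\sin(n\alpha)\,d\alpha}_{b_n} \Biggr) r^n\sin(n\theta) \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} \left( g(\alpha) + 2\sum_{n=1}^\infty g(\alpha)\cos(n\alpha)\,r^n\cos(n\theta) + g(\alpha)\sin(n\alpha)\,r^n\sin(n\theta) \right) d\alpha \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} \underbrace{\left( 1 + 2\sum_{n=1}^\infty r^n\bigl(\cos(n\alpha)\cos(n\theta) + \sin(n\alpha)\sin(n\theta)\bigr) \right)}_{P(r,\theta,\alpha)} g(\alpha)\,d\alpha.
\end{aligned}
$$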
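OK, so we have what we wanted; the expression in the parentheses is the Poisson kernel,
$P(r,\theta,\alpha)$. However, we can do a lot better. It is still given as a series, and we would really
like to have a nice simple expression for it. We must work a little harder. The trick is to
rewrite everything in terms of complex exponentials. Let us work just on the kernel.
$$
\begin{aligned}
P(r,\theta,\alpha) &= 1 + 2\sum_{n=1}^\infty r^n\bigl(\cos(n\alpha)\cos(n\theta) + \sin(n\alpha)\sin(n\theta)\bigr) \\
&= 1 + 2\sum_{n=1}^\infty r^n\cos\bigl(n(\theta - \alpha)\bigr) \\
&= 1 + \sum_{n=1}^\infty r^n\left(e^{in(\theta - \alpha)} + e^{-in(\theta - \alpha)}\right) \\
&= 1 + \sum_{n=1}^\infty \left(re^{i(\theta - \alpha)}\right)^n + \sum_{n=1}^\infty \left(re^{-i(\theta - \alpha)}\right)^n.
\end{aligned}
$$
In the above expression we recognize the geometric series. Recall from calculus that if $z$ is a
complex number where $|z| < 1$, then
$$\sum_{n=1}^\infty z^n = \frac{z}{1 - z}.$$
Note that $n$ starts at 1 and that is why we have the $z$ in the numerator. It is the standard
geometric series multiplied by $z$. We can use $z = re^{i(\theta - \alpha)}$, as lo and behold $|re^{i(\theta - \alpha)}| = r < 1$.
Let us continue with the computation.
$$
\begin{aligned}
P(r,\theta,\alpha) &= 1 + \sum_{n=1}^\infty \left(re^{i(\theta - \alpha)}\right)^n + \sum_{n=1}^\infty \left(re^{-i(\theta - \alpha)}\right)^n \\
&= 1 + \frac{re^{i(\theta - \alpha)}}{1 - re^{i(\theta - \alpha)}} + \frac{re^{-i(\theta - \alpha)}}{1 - re^{-i(\theta - \alpha)}} \\
&= \frac{1 - r^2}{1 - re^{i(\theta - \alpha)} - re^{-i(\theta - \alpha)} + r^2} \\
&= \frac{1 - r^2}{1 - 2r\cos(\theta - \alpha) + r^2}.
\end{aligned}
$$
That's a formula we can live with. The solution to the Dirichlet problem using the Poisson
kernel is
$$u(r,\theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{1 - r^2}{1 - 2r\cos(\theta - \alpha) + r^2}\,g(\alpha)\,d\alpha.$$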
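The closed form of the kernel can be compared against its series, and the integral formula can be tried on boundary data whose solution is already known. This is a minimal sketch, not from the original notes. The function names and the midpoint-rule quadrature with 2000 sample points are illustrative assumptions. With $g(\alpha) = \cos(10\,\alpha)$ the integral should reproduce $r^{10}\cos(10\,\theta)$.

```python
import math

def poisson_kernel(r, theta, alpha):
    """Closed form (1 - r^2) / (1 - 2 r cos(theta - alpha) + r^2)."""
    return (1.0 - r * r) / (1.0 - 2.0 * r * math.cos(theta - alpha) + r * r)

def poisson_kernel_series(r, theta, alpha, terms=200):
    """Truncated series 1 + 2 sum r^n cos(n (theta - alpha))."""
    return 1.0 + 2.0 * sum(r**n * math.cos(n * (theta - alpha)) for n in range(1, terms + 1))

def poisson_solution(g, r, theta, samples=2000):
    """Midpoint-rule approximation of (1 / 2 pi) * integral of P(r, theta, alpha) g(alpha) d alpha."""
    step = 2.0 * math.pi / samples
    total = 0.0
    for j in range(samples):
        alpha = -math.pi + (j + 0.5) * step
        total += poisson_kernel(r, theta, alpha) * g(alpha)
    return total * step / (2.0 * math.pi)

if __name__ == "__main__":
    print(poisson_kernel(0.5, 0.3, 1.2), poisson_kernel_series(0.5, 0.3, 1.2))
    g = lambda alpha: math.cos(10 * alpha)
    print(poisson_solution(g, 0.5, 0.3), 0.5**10 * math.cos(3.0))
```

The midpoint rule is used here because the integrand is smooth and periodic in $\alpha$, a setting in which equally spaced rules converge very quickly for $r$ not too close to 1.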
Sometimes the formula for the Poisson kernel is given together with the constant $\frac{1}{2\pi}$, in
which case we should of course not leave it in front of the integral. Also, often the limits
of the integral are given as 0 to $2\pi$; everything inside is $2\pi$-periodic in $\alpha$, so this does not
change the integral.
Let us not leave the Poisson kernel without explaining
its geometric meaning. Let $s$ be the distance from $(r,\theta)$ to
$(1,\alpha)$. You may recall from calculus that this distance $s$ in
polar coordinates is given precisely by the square root of
$1 - 2r\cos(\theta - \alpha) + r^2$. That is, the Poisson kernel is really
the formula
$$\frac{1 - r^2}{s^2}.$$
One final note we make about the formula is that it is
really a weighted average of the boundary values. First let
us look at what happens at the origin, that is when $r = 0$.
$$
\begin{aligned}
u(0,0) &= \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{1 - 0^2}{1 - 2(0)\cos(\theta - \alpha) + 0^2}\,g(\alpha)\,d\alpha \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} g(\alpha)\,d\alpha.
\end{aligned}
$$
So $u(0,0)$ is precisely the average value of $g(\theta)$ and therefore the average value of $u$ on the
boundary. This is a general feature of harmonic functions: the value at some point $p$ is
equal to the average of the values on a circle centered at $p$.
What the formula says is that the value of the solution at any point in the circle is a
weighted average of the boundary data $g(\theta)$. The kernel is bigger when $(1,\alpha)$ is closer to
$(r,\theta)$. Therefore when computing $u(r,\theta)$ we give more weight to the values $g(\alpha)$ when
$(1,\alpha)$ is closer to $(r,\theta)$ and less weight to the values $g(\alpha)$ when $(1,\alpha)$ is far from $(r,\theta)$.
4.10.4 Exercises
Exercise 4.10.2: Using series solve $\Delta u = 0$, $u(1,\theta) = |\theta|$, for $-\pi < \theta \leq \pi$.
Exercise 4.10.3: Using series solve $\Delta u = 0$, $u(1,\theta) = g(\theta)$ for the following data. Hint: trig
identities.
Exercise 4.10.4: Using the Poisson kernel, give the solution to $\Delta u = 0$, where $u(1,\theta)$ is zero for $\theta$
outside the interval $[-\pi/4, \pi/4]$ and $u(1,\theta)$ is 1 for $\theta$ on the interval $[-\pi/4, \pi/4]$.
Exercise 4.10.5:
a) Draw a graph for the Poisson kernel as a function of $\alpha$ when $r = 1/2$ and $\theta = 0$.
b) Describe what happens to the graph when you make $r$ bigger (as it approaches 1).
c) Knowing that the solution $u(r,\theta)$ is the weighted average of $g(\theta)$ with Poisson kernel as the
weight, explain what your answer to part b) means.
Exercise 4.10.6: Let $g(\theta)$ be the function $xy = \cos\theta\sin\theta$ on the boundary. Use the series
solution to find a solution to the Dirichlet problem $\Delta u = 0$, $u(1,\theta) = g(\theta)$. Now convert the
solution to Cartesian coordinates $x$ and $y$. Is this solution surprising? Hint: use your trig identities.
Exercise 4.10.7: Carry out the computation we needed in the separation of variables and solve
$r^2R'' + rR' - n^2R = 0$, for $n = 0, 1, 2, 3, \ldots$.
Exercise 4.10.8 (challenging): Derive the series solution to the Dirichlet problem if the region is a
circle of radius $\rho$ rather than 1. That is, solve $\Delta u = 0$, $u(\rho,\theta) = g(\theta)$.
$P(x,y)$ (that is, write down the formula for the answer). Write the answer in Cartesian
coordinates.
Notice the answer is again a polynomial in x and y. See also Exercise �.��.�.
Exercise 4.10.101: Using series solve $\Delta u = 0$, $u(1,\theta) = 1 + \sum_{n=1}^\infty \frac{1}{n^2}\sin(n\theta)$.
Exercise 4.10.102: Using the series solution find the solution to $\Delta u = 0$, $u(1,\theta) = 1 - \cos(\theta)$.
Express the solution in Cartesian coordinates (that is, using $x$ and $y$).
Exercise 4.10.103:
a) Try and guess a solution to $\Delta u = -1$, $u(1,\theta) = 0$. Hint: try a solution that only depends on
$r$. Also first, don't worry about the boundary condition.
Exercise 4.10.104 (challenging): Derive the Poisson kernel solution if the region is a circle of
radius $\rho$ rather than 1. That is, solve $\Delta u = 0$, $u(\rho,\theta) = g(\theta)$.
Chapter 5
\[ X''(x) + \lambda X(x) = 0, \]
for some constant h. These conditions come up when the ends are immersed in some
medium.
In the separation of variables computation we encountered an eigenvalue problem and
found the eigenfunctions $X_n(x)$. We then found the eigenfunction decomposition of the initial
temperature $f(x) = u(x,0)$,
\[ f(x) = \sum_{n=1}^\infty c_n X_n(x). \]
Once we had this decomposition and found suitable $T_n(t)$ such that $T_n(0) = 1$ and such
that $T_n(t) X_n(x)$ were solutions to the heat equation, we wrote the solution to the original
problem, including the initial condition, as
\[ u(x,t) = \sum_{n=1}^\infty c_n T_n(t) X_n(x). \]
To study more general problems with this method, we must study more general
eigenvalue problems. First, we study second order linear equations of the form
\[ \frac{d}{dx}\left( p(x) \frac{dy}{dx} \right) - q(x) y + \lambda r(x) y = 0. \tag{5.1} \]
Essentially any second order linear equation of the form $a(x) y'' + b(x) y' + c(x) y + \lambda d(x) y = 0$
can be written as (5.1) after multiplying by a proper factor.
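Example 5.1.1 (Bessel): Put the following equation into the form (5.1):
\[ x^2 y'' + x y' + \left( \lambda x^2 - n^2 \right) y = 0. \]
Multiply both sides by $\frac{1}{x}$ to obtain
\[
\frac{1}{x}\left( x^2 y'' + x y' + \left( \lambda x^2 - n^2 \right) y \right)
= x y'' + y' + \left( \lambda x - \frac{n^2}{x} \right) y
= \frac{d}{dx}\left( x \frac{dy}{dx} \right) - \frac{n^2}{x} y + \lambda x y = 0.
\]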
The Bessel equation turns up for example in the solution of the two-dimensional wave
equation. If you want to see how one solves the equation, you can look at subsection 7.3.3.
The so-called Sturm–Liouville problem is to seek nontrivial solutions to
\[
\begin{aligned}
& \frac{d}{dx}\left( p(x) \frac{dy}{dx} \right) - q(x) y + \lambda r(x) y = 0, \qquad a < x < b, \\
& \alpha_1 y(a) - \alpha_2 y'(a) = 0, \\
& \beta_1 y(b) + \beta_2 y'(b) = 0.
\end{aligned}
\tag{5.2}
\]
In particular, we seek $\lambda$s that allow for nontrivial solutions. The $\lambda$s that admit nontrivial
solutions are called the eigenvalues and the corresponding nontrivial solutions are called
eigenfunctions. The constants $\alpha_1$ and $\alpha_2$ should not be both zero, same for $\beta_1$ and $\beta_2$.
Named after the French mathematicians Jacques Charles François Sturm (1803–1855) and Joseph Liouville
(1809–1882).
Theorem 5.1.1. Suppose $p(x)$, $p'(x)$, $q(x)$ and $r(x)$ are continuous on $[a,b]$ and suppose $p(x) > 0$
and $r(x) > 0$ for all $x$ in $[a,b]$. Then the Sturm–Liouville problem (5.2) has an increasing sequence
of eigenvalues
\[ \lambda_1 < \lambda_2 < \lambda_3 < \cdots \]
such that
\[ \lim_{n \to \infty} \lambda_n = +\infty \]
and such that to each $\lambda_n$ there is (up to a constant multiple) a single eigenfunction $y_n(x)$.
Moreover, if $q(x) \geq 0$ and $\alpha_1, \alpha_2, \beta_1, \beta_2 \geq 0$, then $\lambda_n \geq 0$ for all $n$.
Problems satisfying the hypothesis of the theorem (including the “Moreover”) are called
regular Sturm–Liouville problems, and we will only consider such problems here. That is, a
regular problem is one where $p(x)$, $p'(x)$, $q(x)$ and $r(x)$ are continuous, $p(x) > 0$, $r(x) > 0$,
$q(x) \geq 0$, and $\alpha_1, \alpha_2, \beta_1, \beta_2 \geq 0$, where neither $\alpha_1$ and $\alpha_2$ are both zero, nor $\beta_1$ and $\beta_2$ are
both zero. Note: Be careful about the signs. Also be careful about the inequalities for $r$ and
$p$, they must be strict for all $x$ in the interval $[a,b]$, including the endpoints!
When zero is an eigenvalue, we usually start labeling the eigenvalues at 0 rather than at
1 for convenience. That is, we label the eigenvalues $\lambda_0 < \lambda_1 < \lambda_2 < \cdots$.
Example 5.1.2: The problem $y'' + \lambda y = 0$, $0 < x < L$, $y(0) = 0$, and $y(L) = 0$ is a regular
Sturm–Liouville problem: $p(x) = 1$, $q(x) = 0$, $r(x) = 1$, and we have $p(x) = 1 > 0$ and
$r(x) = 1 > 0$. We also have $a = 0$, $b = L$, $\alpha_1 = \beta_1 = 1$, $\alpha_2 = \beta_2 = 0$. The eigenvalues are
$\lambda_n = \frac{n^2 \pi^2}{L^2}$ and eigenfunctions are $y_n(x) = \sin\left( \frac{n\pi}{L} x \right)$. All eigenvalues are nonnegative, as
Theorem 5.1.1 predicts.
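\[ y'' + \lambda y = 0, \quad y'(0) = 0, \quad y'(1) = 0. \]
Identify the $p$, $q$, $r$, $\alpha_j$, $\beta_j$. Can you use the theorem to make the search for eigenvalues easier? (Hint:
Consider the condition $-y'(0) = 0$.)
\[
\begin{aligned}
& y'' + \lambda y = 0, \qquad 0 < x < 1, \\
& h y(0) - y'(0) = 0, \\
& y'(1) = 0, \qquad h > 0.
\end{aligned}
\]
or
\[ \frac{h}{\sqrt{\lambda}} = \tan \sqrt{\lambda} . \]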
We use a computer to find $\lambda_n$. There are tables available, though using a computer
or a graphing calculator is far more convenient nowadays. The easiest method is to plot the
functions $h/x$ and $\tan x$ and see for which $x$ they intersect. There is an infinite number of
intersections. Denote the first intersection by $\sqrt{\lambda_1}$, the second intersection by $\sqrt{\lambda_2}$, etc. For
example, when $h = 1$, we get $\sqrt{\lambda_1} \approx 0.86$, $\sqrt{\lambda_2} \approx 3.43$, .... That is, $\lambda_1 \approx 0.74$, $\lambda_2 \approx 11.73$,
.... A plot for $h = 1$ is given in Figure 5.1 on the next page. The appropriate eigenfunction
(let $A = 1$ for convenience, then $B = h/\sqrt{\lambda}$) is
\[ y_n(x) = \cos\left( \sqrt{\lambda_n}\, x \right) + \frac{h}{\sqrt{\lambda_n}} \sin\left( \sqrt{\lambda_n}\, x \right). \]
When $h = 1$, approximately,
\[ y_1(x) \approx \cos(0.86\, x) + \frac{1}{0.86} \sin(0.86\, x), \qquad y_2(x) \approx \cos(3.43\, x) + \frac{1}{3.43} \sin(3.43\, x), \qquad \ldots. \]
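The intersections of $h/x$ and $\tan x$ can also be located numerically rather than read off a plot. The following is only a minimal sketch, not the method used in the text: the function name `sqrt_eigenvalues` and the bisection approach are assumptions made for illustration. It rewrites $h/x = \tan x$ as $x \sin x - h \cos x = 0$, which has the same positive roots but no poles, and brackets the $k$-th root in the interval $(k\pi, k\pi + \pi/2)$.

```python
import math

def sqrt_eigenvalues(h, count, tol=1e-12):
    """Return the first `count` positive solutions x of h/x = tan(x).

    Hypothetical helper for illustration: each returned x plays the
    role of sqrt(lambda_n) in the example above.
    """
    def g(x):
        # h/x = tan(x) is equivalent to x*sin(x) - h*cos(x) = 0 for x > 0
        return x * math.sin(x) - h * math.cos(x)

    roots = []
    for k in range(count):
        # g changes sign exactly once on (k*pi, k*pi + pi/2) when h > 0
        lo, hi = k * math.pi, k * math.pi + math.pi / 2
        g_lo = g(lo)
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            g_mid = g(mid)
            if (g_mid > 0) == (g_lo > 0):
                lo, g_lo = mid, g_mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return roots

if __name__ == "__main__":
    for n, x in enumerate(sqrt_eigenvalues(1.0, 3), start=1):
        print(f"sqrt(lambda_{n}) ~ {x:.4f}, lambda_{n} ~ {x * x:.4f}")
```

For $h = 1$ the first two values agree with the approximations 0.86 and 3.43 quoted above.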
5.1.2 Orthogonality
We have seen the notion of orthogonality before. For example, we have shown that $\sin(nx)$
are orthogonal for distinct $n$ on $[0, \pi]$. For general Sturm–Liouville problems we need
a more general setup. Let $r(x)$ be a weight function (any function, though generally we
assume it is positive) on $[a,b]$. Two functions $f(x)$, $g(x)$ are said to be orthogonal with
respect to the weight function $r(x)$ when
\[ \int_a^b f(x)\, g(x)\, r(x) \, dx = 0. \]
Figure 5.1: Plot of $\frac{1}{x}$ and $\tan x$.
and then say $f$ and $g$ are orthogonal whenever $\langle f, g \rangle = 0$. The results and concepts are
again analogous to finite-dimensional linear algebra.
The idea of the given inner product is that those $x$ where $r(x)$ is greater have more
weight. Nontrivial (nonconstant) $r(x)$ arise naturally, for example from a change of variables.
Hence, you could think of a change of variables such that $d\xi = r(x)\, dx$.
Eigenfunctions of a regular Sturm–Liouville problem satisfy an orthogonality property,
just like the eigenfunctions in § 4.1. Its proof is very similar to the analogous Theorem 4.1.1
on page 193.
Let $y_j$ and $y_k$ be two distinct eigenfunctions for two distinct eigenvalues $\lambda_j$ and $\lambda_k$. Then
\[ \int_a^b y_j(x)\, y_k(x)\, r(x) \, dx = 0, \]
that is, $y_j$ and $y_k$ are orthogonal with respect to the weight function $r$.
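The orthogonality statement can be checked numerically on the problem $y'' + \lambda y = 0$, $h y(0) - y'(0) = 0$, $y'(1) = 0$ considered earlier, where $r(x) = 1$ and the interval is $[0,1]$. The sketch below is illustrative only: the helper names (`simpson`, `sqrt_eigenvalue`, `eigenfunction`) are assumptions, the integral is approximated by composite Simpson's rule, and $\sqrt{\lambda_n}$ is found by bisection on $x \sin x - h \cos x = 0$.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson's rule with n (even) subintervals."""
    step = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * step)
    return total * step / 3

def sqrt_eigenvalue(h, k, tol=1e-13):
    """k-th (k = 0, 1, ...) positive root of x*sin(x) - h*cos(x) = 0."""
    g = lambda x: x * math.sin(x) - h * math.cos(x)
    lo, hi = k * math.pi, k * math.pi + math.pi / 2
    g_lo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (g(mid) > 0) == (g_lo > 0):
            lo, g_lo = mid, g(mid)
        else:
            hi = mid
    return 0.5 * (lo + hi)

def eigenfunction(h, k):
    """y(x) = cos(mu x) + (h/mu) sin(mu x), with mu = sqrt(lambda)."""
    mu = sqrt_eigenvalue(h, k)
    return lambda x: math.cos(mu * x) + (h / mu) * math.sin(mu * x)

def inner_product(f, g, a=0.0, b=1.0, r=lambda x: 1.0):
    """Weighted inner product: integral of f(x) g(x) r(x) over [a, b]."""
    return simpson(lambda x: f(x) * g(x) * r(x), a, b)

if __name__ == "__main__":
    y1, y2 = eigenfunction(1.0, 0), eigenfunction(1.0, 1)
    print("<y1, y2> =", inner_product(y1, y2))  # should be close to zero
    print("<y1, y1> =", inner_product(y1, y1))  # strictly positive
```

The first inner product comes out zero up to quadrature and rounding error, while the second is positive, as the theorem suggests.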
Theorem 5.1.3 (Fredholm alternative). Suppose that we have a regular Sturm–Liouville problem.
Then either
\[
\begin{aligned}
& \frac{d}{dx}\left( p(x) \frac{dy}{dx} \right) - q(x) y + \lambda r(x) y = 0, \\
& \alpha_1 y(a) - \alpha_2 y'(a) = 0, \\
& \beta_1 y(b) + \beta_2 y'(b) = 0,
\end{aligned}
\]
This theorem is used in much the same way as we did before in § 4.4. It is used when
solving more general nonhomogeneous boundary value problems. The theorem does not
help us solve the problem, but it tells us when a unique solution exists, so that we know
when to spend time looking for it. To solve the problem we decompose f (x) and y(x) in
terms of eigenfunctions of the homogeneous problem, and then solve for the coefficients of
the series for y(x).
\[ f(x) = \sum_{n=1}^\infty c_n y_n(x), \tag{5.3} \]
where $y_n(x)$ are eigenfunctions. We wish to find out if we can represent any function $f(x)$
in this way, and if so, we wish to calculate $c_n$ (and of course we would want to know if the
sum converges). OK, so imagine we could write $f(x)$ as (5.3). We will assume convergence
and the ability to integrate the series term by term. Because of orthogonality we have
\[
\begin{aligned}
\langle f, y_m \rangle = \int_a^b f(x)\, y_m(x)\, r(x) \, dx
& = \int_a^b \left( \sum_{n=1}^\infty c_n y_n(x) \right) y_m(x)\, r(x) \, dx \\
& = \sum_{n=1}^\infty c_n \int_a^b y_n(x)\, y_m(x)\, r(x) \, dx \\
& = c_m \int_a^b y_m(x)\, y_m(x)\, r(x) \, dx = c_m \langle y_m, y_m \rangle .
\end{aligned}
\]
Hence,
\[ c_m = \frac{\langle f, y_m \rangle}{\langle y_m, y_m \rangle} = \frac{\int_a^b f(x)\, y_m(x)\, r(x) \, dx}{\int_a^b \bigl( y_m(x) \bigr)^2 r(x) \, dx} . \tag{5.4} \]
Note that $y_m$ are known up to a constant multiple, so we could have picked a scalar
multiple of an eigenfunction such that $\langle y_m, y_m \rangle = 1$ (if we had an arbitrary eigenfunction
$\tilde{y}_m$, divide it by $\sqrt{\langle \tilde{y}_m, \tilde{y}_m \rangle}$). When $\langle y_m, y_m \rangle = 1$ we have the simpler form $c_m = \langle f, y_m \rangle$.
The following theorem holds more generally, but the statement given is enough for our
purposes.
Theorem 5.1.4. Suppose $f$ is a piecewise smooth continuous function on $[a,b]$. If $y_1, y_2, \ldots$ are
eigenfunctions of a regular Sturm–Liouville problem, one for each eigenvalue, then there exist real
constants $c_1, c_2, \ldots$ given by (5.4) such that (5.3) converges and holds for $a < x < b$.
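Example 5.1.4: Consider
\[
\begin{aligned}
& y'' + \lambda y = 0, \qquad 0 < x < \pi/2, \\
& y(0) = 0, \qquad y'(\pi/2) = 0.
\end{aligned}
\]
The above is a regular Sturm–Liouville problem, and Theorem 5.1.1 on page 275 says that
if $\lambda$ is an eigenvalue then $\lambda \geq 0$.
Suppose $\lambda = 0$. The general solution is $y(x) = Ax + B$. We plug in the boundary conditions
to get $0 = y(0) = B$, and $0 = y'(\pi/2) = A$. Hence $\lambda = 0$ is not an eigenvalue.
So let us consider $\lambda > 0$, where the general solution is
\[ y(x) = A \cos\left( \sqrt{\lambda}\, x \right) + B \sin\left( \sqrt{\lambda}\, x \right). \]
Plugging in the boundary conditions we get $0 = y(0) = A$ and $0 = y'(\pi/2) = \sqrt{\lambda}\, B \cos\left( \sqrt{\lambda}\, \frac{\pi}{2} \right)$.
Since $A$ is zero, $B$ cannot be zero. Hence $\cos\left( \sqrt{\lambda}\, \frac{\pi}{2} \right) = 0$. This means that $\sqrt{\lambda}\, \frac{\pi}{2}$ is an
odd integral multiple of $\pi/2$, i.e. $(2n-1) \frac{\pi}{2} = \sqrt{\lambda_n}\, \frac{\pi}{2}$. Solving for $\lambda_n$ we get
\[ \lambda_n = (2n-1)^2 . \]
We can take $B = 1$. Our eigenfunctions are
\[ y_n(x) = \sin\bigl( (2n-1) x \bigr) . \]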
where
\[
c_n = \frac{\langle f, y_n \rangle}{\langle y_n, y_n \rangle}
= \frac{\int_0^{\pi/2} f(x) \sin\bigl( (2n-1)x \bigr) \, dx}{\int_0^{\pi/2} \sin^2\bigl( (2n-1)x \bigr) \, dx}
= \frac{4}{\pi} \int_0^{\pi/2} f(x) \sin\bigl( (2n-1)x \bigr) \, dx .
\]
Note that the series converges to an odd $2\pi$-periodic extension of $f(x)$. With the regular
sine series we would expect a function with period $2 \cdot \frac{\pi}{2} = \pi$.
Exercise 5.1.3 (challenging): In the above example, the function is defined on $0 < x < \pi/2$, yet the
series with respect to the eigenfunctions $\sin\bigl( (2n-1)x \bigr)$ converges to an odd $2\pi$-periodic extension
of $f(x)$. Find out how the extension is defined for $\pi/2 < x < \pi$.
Let us compute an example. Consider $f(x) = x$ for $0 < x < \pi/2$. Some calculus later we
find
\[ c_n = \frac{4}{\pi} \int_0^{\pi/2} f(x) \sin\bigl( (2n-1)x \bigr) \, dx = \frac{4 (-1)^{n+1}}{\pi (2n-1)^2} , \]
and so for $x$ in $[0, \pi/2]$,
\[ f(x) = \sum_{n=1}^\infty \frac{4 (-1)^{n+1}}{\pi (2n-1)^2} \sin\bigl( (2n-1)x \bigr) . \]
This is different from the $\pi$-periodic regular sine series which can be computed to be
\[ f(x) = \sum_{n=1}^\infty \frac{(-1)^{n+1}}{n} \sin(2nx) . \]
Both sums converge and are equal to $f(x)$ for $0 < x < \pi/2$, but the eigenfunctions involved come
from different eigenvalue problems.
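The coefficient formula (5.4) for this example is easy to check numerically. The sketch below is an illustration under stated assumptions rather than part of the original computation: the function names are hypothetical, the integrals are approximated with composite Simpson's rule, and the closed form $\frac{4(-1)^{n+1}}{\pi(2n-1)^2}$ is the one stated above for $f(x) = x$.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson's rule with n (even) subintervals."""
    step = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * step)
    return total * step / 3

def coeff_numeric(f, n):
    """c_n = <f, y_n> / <y_n, y_n> with y_n(x) = sin((2n-1)x), r(x) = 1."""
    y = lambda x: math.sin((2 * n - 1) * x)
    numerator = simpson(lambda x: f(x) * y(x), 0.0, math.pi / 2)
    denominator = simpson(lambda x: y(x) ** 2, 0.0, math.pi / 2)
    return numerator / denominator

def coeff_exact(n):
    """Closed form for f(x) = x."""
    return 4 * (-1) ** (n + 1) / (math.pi * (2 * n - 1) ** 2)

def partial_sum(x, terms):
    """Partial sum of the eigenfunction expansion of f(x) = x."""
    return sum(coeff_exact(n) * math.sin((2 * n - 1) * x)
               for n in range(1, terms + 1))

if __name__ == "__main__":
    for n in range(1, 4):
        print(n, coeff_numeric(lambda x: x, n), coeff_exact(n))
    x = math.pi / 4
    print("partial sum at pi/4:", partial_sum(x, 2000), "vs f(x) =", x)
```

Since the coefficients decay like $1/n^2$, the partial sums settle down quickly on $0 < x < \pi/2$.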
5.1.5 Exercises
Exercise 5.1.4: Find eigenvalues and eigenfunctions of
Exercise 5.1.5: Expand the function $f(x) = x$ on $0 \leq x \leq 1$ using eigenfunctions of the system
\[ y'' + \lambda y = 0, \quad y'(0) = 0, \quad y(1) = 0. \]
Exercise 5.1.6: Suppose that you had a Sturm–Liouville problem on the interval $[0,1]$ and came
up with $y_n(x) = \sin(\gamma n x)$, where $\gamma > 0$ is some constant. Decompose $f(x) = x$, $0 < x < 1$ in
terms of these eigenfunctions.
This problem is not a Sturm–Liouville problem, but the idea is the same.
\[ \frac{d}{dx}\left( e^x y' \right) + \lambda e^x y = 0, \quad y(0) = 0, \quad y(1) = 0. \]
Hint: First write the system as a constant coefficient system to find general solutions. Do note that
Theorem 5.1.1 on page 275 guarantees $\lambda \geq 0$.
\[ y'' + \lambda y = 0, \quad y(-1) = 0, \quad y(1) = 0. \]
Exercise 5.1.102: Put the following problems into the standard form for Sturm–Liouville problems,
that is, find $p(x)$, $q(x)$, $r(x)$, $\alpha_1$, $\alpha_2$, $\beta_1$, and $\beta_2$, and decide if the problems are regular or not.
\[ a^4 \frac{\partial^4 y}{\partial x^4} + \frac{\partial^2 y}{\partial t^2} = 0, \]
for some constant $a > 0$; let us not worry about the physics.
Suppose the beam is of length 1 simply supported (hinged) at the ends. The beam is
displaced by some function $f(x)$ at time $t = 0$ and then let go (initial velocity is 0). Then $y$
satisfies:
\[
\begin{aligned}
& a^4 y_{xxxx} + y_{tt} = 0 \qquad (0 < x < 1, \; t > 0), \\
& y(0,t) = y_{xx}(0,t) = 0, \\
& y(1,t) = y_{xx}(1,t) = 0, \\
& y(x,0) = f(x), \qquad y_t(x,0) = 0.
\end{aligned}
\tag{5.5}
\]
\[ \frac{X^{(4)}}{X} = \frac{-T''}{a^4 T} = \lambda . \]
The equations are
\[ T'' + \lambda a^4 T = 0, \qquad X^{(4)} - \lambda X = 0. \]
If you are interested, $a^4 = \frac{EI}{\rho}$, where $E$ is the elastic modulus, $I$ is the second moment of area of the cross
section, and $\rho$ is linear density.
The point is that $X_n T_n$ is a solution that satisfies all the homogeneous conditions (that is,
all conditions except the initial position). And since $T_n(0) = 1$, we have
\[ y(x,0) = \sum_{n=1}^\infty b_n X_n(x) T_n(0) = \sum_{n=1}^\infty b_n X_n(x) = \sum_{n=1}^\infty b_n \sin(n \pi x) = f(x). \]
Hence, the solution to (5.5) with the given initial position $f(x)$ is
\[ y(x,t) = \sum_{\substack{n=1 \\ n \text{ odd}}}^\infty \frac{4}{5 \pi^3 n^3} \sin(n \pi x) \cos(n^2 \pi^2 a^2 t). \]
There are other boundary conditions than just hinged ends. There are three basic
possibilities: hinged, fixed, and free. Let us consider the end at $x = 0$. For the other end, it
is the same idea. If the end is hinged, then
\[ u(0,t) = u_{xx}(0,t) = 0. \]
If the end is fixed, then
\[ u(0,t) = u_x(0,t) = 0. \]
5.2.1 Exercises
Exercise 5.2.2: Suppose you have a beam of length 5 with free ends. Let $y$ be the transverse
deviation of the beam at position $x$ on the beam ($0 < x < 5$). You know that the constants are such
that this satisfies the equation $y_{tt} + 4 y_{xxxx} = 0$. Suppose you know that the initial shape of the
beam is the graph of $x(5-x)$, and the initial velocity is uniformly equal to � (same for each $x$) in the
positive $y$ direction. Set up the equation together with the boundary and initial conditions. Just set
up, do not solve.
Exercise 5.2.3: Suppose you have a beam of length 5 with one end free and one end fixed (the
fixed end is at $x = 5$). Let $u$ be the longitudinal deviation of the beam at position $x$ on the beam
($0 < x < 5$). You know that the constants are such that this satisfies the equation $u_{tt} = 4 u_{xx}$.
Suppose you know that the initial displacement of the beam is $\frac{x-5}{50}$, and the initial velocity is $\frac{-(x-5)}{100}$
in the positive $u$ direction. Set up the equation together with the boundary and initial conditions.
Just set up, do not solve.
Exercise 5.2.4: Suppose the beam is $L$ units long, everything else kept the same as in (5.5). What
is the equation and the series solution?
That is, you have also an initial velocity. Find a series solution. Hint: Use the same idea as we did
for the wave equation.
Exercise 5.2.101: Suppose you have a beam of length 1 with hinged ends. Let $y$ be the transverse
deviation of the beam at position $x$ on the beam ($0 < x < 1$). You know that the constants are such
that this satisfies the equation $y_{tt} + 4 y_{xxxx} = 0$. Suppose you know that the initial shape of the
beam is the graph of $\sin(\pi x)$, and the initial velocity is �. Solve for $y$.
Exercise 5.2.102: Suppose you have a beam of length 10 with two fixed ends. Let $y$ be the transverse
deviation of the beam at position $x$ on the beam ($0 < x < 10$). You know that the constants are such
that this satisfies the equation $y_{tt} + 9 y_{xxxx} = 0$. Suppose you know that the initial shape of the
beam is the graph of $\sin(\pi x)$, and the initial velocity is uniformly equal to $x(10-x)$. Set up the
equation together with the boundary and initial conditions. Just set up, do not solve.
\[
\begin{aligned}
& y_{tt} = a^2 y_{xx}, \\
& y(0,t) = 0, \qquad y(L,t) = 0, \\
& y(x,0) = f(x), \qquad y_t(x,0) = g(x).
\end{aligned}
\tag{5.6}
\]
where $A_n$ and $B_n$ are determined by the initial conditions. The natural frequencies of the
system are the (angular) frequencies $\frac{n \pi a}{L}$ for integers $n \geq 1$.
But these are free vibrations. What if there is an external force acting on the string? Let
us assume, say, air vibrations (noise), for example from a second string. Or perhaps a jet
engine. For simplicity, assume nice pure sound and assume the force is uniform at every
position on the string. Let us say $F(t) = F_0 \cos(\omega t)$ as force per unit mass. Then our wave
equation becomes (remember force is mass times acceleration)
\[ y_{tt} = a^2 y_{xx} + F_0 \cos(\omega t), \tag{5.7} \]
That is, the string is initially at rest. First we find a particular solution $y_p$ of (5.7) that
satisfies $y(0,t) = y(L,t) = 0$. We define the functions $f$ and $g$ as
\[ f(x) = -y_p(x,0), \qquad g(x) = -\frac{\partial y_p}{\partial t}(x,0). \]
We then find solution $y_c$ of (5.6). If we add the two solutions, we find that $y = y_c + y_p$
solves (5.7) with the initial conditions.
Exercise 5.3.1: Check that $y = y_c + y_p$ solves (5.7) and the side conditions (5.8).
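So the big issue here is to find the particular solution $y_p$. We look at the equation and
we make an educated guess
\[ y_p(x,t) = X(x) \cos(\omega t). \]
We plug in to get
\[ -\omega^2 X \cos(\omega t) = a^2 X'' \cos(\omega t) + F_0 \cos(\omega t), \]
or $-\omega^2 X = a^2 X'' + F_0$ after canceling the cosine. We know how to find a general solution to
this equation (it is a nonhomogeneous constant coefficient equation). The general solution
is
\[ X(x) = A \cos\left( \frac{\omega}{a} x \right) + B \sin\left( \frac{\omega}{a} x \right) - \frac{F_0}{\omega^2} . \]
The endpoint conditions imply $X(0) = X(L) = 0$. So
\[ 0 = X(0) = A - \frac{F_0}{\omega^2} , \]
or $A = \frac{F_0}{\omega^2}$, and also
\[ 0 = X(L) = \frac{F_0}{\omega^2} \cos\left( \frac{\omega L}{a} \right) + B \sin\left( \frac{\omega L}{a} \right) - \frac{F_0}{\omega^2} . \]
Assuming $\sin\left( \frac{\omega L}{a} \right) \neq 0$, we can solve for $B$ to get
\[ B = \frac{-F_0 \left( \cos\left( \frac{\omega L}{a} \right) - 1 \right)}{\omega^2 \sin\left( \frac{\omega L}{a} \right)} . \tag{5.9} \]
Therefore,
\[ X(x) = \frac{F_0}{\omega^2} \left( \cos\left( \frac{\omega}{a} x \right) - \frac{\cos\left( \frac{\omega L}{a} \right) - 1}{\sin\left( \frac{\omega L}{a} \right)} \sin\left( \frac{\omega}{a} x \right) - 1 \right) . \]
The particular solution $y_p$ we are looking for is
\[ y_p(x,t) = \frac{F_0}{\omega^2} \left( \cos\left( \frac{\omega}{a} x \right) - \frac{\cos\left( \frac{\omega L}{a} \right) - 1}{\sin\left( \frac{\omega L}{a} \right)} \sin\left( \frac{\omega}{a} x \right) - 1 \right) \cos(\omega t). \]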
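As a sanity check on the formula for $X(x)$, one can verify numerically that it vanishes at both endpoints and satisfies $-\omega^2 X = a^2 X'' + F_0$. The following is a rough sketch, not part of the original derivation: the function names and the sample parameter values are assumptions, and $X''$ is approximated by a central difference, so the residual is only expected to be small, not exactly zero.

```python
import math

def X(x, F0, omega, a, L):
    """Spatial part of the particular solution, valid away from resonance."""
    k = omega / a
    s = math.sin(k * L)
    if abs(s) < 1e-12:
        raise ValueError("sin(omega L / a) = 0: this formula does not apply")
    ratio = (math.cos(k * L) - 1.0) / s
    return F0 / omega**2 * (math.cos(k * x) - ratio * math.sin(k * x) - 1.0)

def ode_residual(x, F0, omega, a, L, step=1e-4):
    """Approximate a^2 X'' + F0 + omega^2 X, which should be close to 0."""
    second = (X(x + step, F0, omega, a, L)
              - 2.0 * X(x, F0, omega, a, L)
              + X(x - step, F0, omega, a, L)) / step**2
    return a**2 * second + F0 + omega**2 * X(x, F0, omega, a, L)

if __name__ == "__main__":
    params = dict(F0=2.0, omega=3.0, a=1.5, L=2.0)  # arbitrary sample values
    print("X(0) =", X(0.0, **params))
    print("X(L) =", X(params["L"], **params))
    print("residual at x = 0.7:", ode_residual(0.7, **params))
```

Choosing $\omega$ close to a multiple of $\frac{\pi a}{L}$ in this sketch makes the computed values of $X$ grow, which is the behavior discussed next.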
Now we get to the point that we skipped. Suppose $\sin\left( \frac{\omega L}{a} \right) = 0$. What this means is
that $\omega$ is equal to one of the natural frequencies of the system, i.e. a multiple of $\frac{\pi a}{L}$. We
notice that if $\omega$ is not equal to a multiple of the base frequency, but is very close, then the
coefficient $B$ in (5.9) seems to become very large. But let us not jump to conclusions just
yet. When $\omega = \frac{n \pi a}{L}$ for $n$ even, then $\cos\left( \frac{\omega L}{a} \right) = 1$ and hence we really get that $B = 0$. So
resonance can occur only when $\omega = \frac{n \pi a}{L}$ for odd $n$.
We could again solve for the resonance solution if we wanted to, but it is, in the right
sense, the limit of the solutions as ! gets close to a resonance frequency. In real life, pure
resonance never occurs anyway.
The above calculation explains why a string begins to vibrate if the identical string is
plucked close by. In the absence of friction this vibration would get louder and louder
as time goes on. On the other hand, you are unlikely to get large vibration if the forcing
frequency is not close to a resonance frequency even if you have a jet engine running close
to the string. That is, the amplitude does not keep increasing unless you tune to just the
right frequency.
Similar resonance phenomena occur when you break a wine glass using human voice
(yes, this is possible, but not easy) if you happen to hit just the right frequency. Remember
a glass has much purer sound, i.e. it is more like a vibraphone, so there are far fewer
resonance frequencies to hit.
When the forcing function is more complicated, you decompose it in terms of the
Fourier series and apply the above result. You may also need to solve the above problem if
the forcing function is a sine rather than a cosine, but if you think about it, the solution is
almost the same.
Example 5.3.1: Let us do the computation for specific values. Suppose $F_0 = 1$ and $\omega = 1$
and $L = 1$ and $a = 1$. Then
\[ y_p(x,t) = \left( \cos(x) - \frac{\cos(1) - 1}{\sin(1)} \sin(x) - 1 \right) \cos(t). \]
Write $B = \frac{\cos(1) - 1}{\sin(1)}$ for simplicity.
Then plug in $t = 0$ to get
\[ f(x) = -\cos x + B \sin x + 1, \]
and after differentiating in $t$ we see that $g(x) = -\frac{\partial y_p}{\partial t}(x,0) = 0$.
Mythbusters, episode 31, Discovery Channel, originally aired May 18th, 2005.
\[
\begin{aligned}
& y_{tt} = y_{xx}, \\
& y(0,t) = 0, \qquad y(1,t) = 0, \\
& y(x,0) = -\cos x + B \sin x + 1, \\
& y_t(x,0) = 0.
\end{aligned}
\]
The formula that we use to define $y(x,0)$ is not odd, hence it is not a simple matter of
plugging in the expression for $y(x,0)$ to the d’Alembert formula directly! You must define
$F$ to be the odd, 2-periodic extension of $y(x,0)$. Then our solution is
\[ y(x,t) = \frac{F(x+t) + F(x-t)}{2} + \left( \cos(x) - \frac{\cos(1) - 1}{\sin(1)} \sin(x) - 1 \right) \cos(t). \tag{5.10} \]
It is not hard to compute specific values for an odd periodic extension of a function and
hence (5.10) is a wonderful solution to the problem. For example, it is very easy to have a
computer do it, unlike a series solution. A plot is given in Figure 5.4.
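To illustrate how little work the computer needs, here is a minimal sketch of (5.10) for the values $F_0 = \omega = L = a = 1$ used in this example. The function names are hypothetical, and the odd 2-periodic extension is implemented by reducing the argument to $[-1, 1)$ and reflecting; that is one reasonable way to do it, not necessarily how the plot in the text was produced.

```python
import math

# B as defined in the example above
B = (math.cos(1.0) - 1.0) / math.sin(1.0)

def initial_shape(x):
    """y(x, 0) for the homogeneous problem, given on 0 <= x <= 1."""
    return -math.cos(x) + B * math.sin(x) + 1.0

def F(z):
    """Odd, 2-periodic extension of initial_shape."""
    z = (z + 1.0) % 2.0 - 1.0          # reduce to the interval [-1, 1)
    return initial_shape(z) if z >= 0.0 else -initial_shape(-z)

def y_particular(x, t):
    """The particular solution y_p(x, t) of the forced equation."""
    return (math.cos(x) - B * math.sin(x) - 1.0) * math.cos(t)

def y(x, t):
    """Formula (5.10): d'Alembert part plus the particular solution."""
    return 0.5 * (F(x + t) + F(x - t)) + y_particular(x, t)

if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 2.0):
        print(t, [round(y(x / 4, t), 4) for x in range(5)])
```

At $t = 0$ every printed value is zero up to rounding, matching the string being initially at rest, and the endpoint values stay zero for all $t$.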
Figure 5.4: Plot of $y(x,t) = \frac{F(x+t) + F(x-t)}{2} + \left( \cos(x) - \frac{\cos(1) - 1}{\sin(1)} \sin(x) - 1 \right) \cos(t)$.
We look for an $h$ such that $\operatorname{Re} h = u$. To find an $h$, whose real part satisfies (5.11), we look
for an $h$ such that
\[ h_t = k h_{xx}, \qquad h(0,t) = A_0 e^{i \omega t} . \tag{5.12} \]
Exercise 5.3.3: Suppose $h$ satisfies (5.12). Use Euler’s formula for the complex exponential to check
that $u = \operatorname{Re} h$ satisfies (5.11).
5.3.3 Exercises
Exercise 5.3.5: Suppose that the forcing function for the vibrating string is $F_0 \sin(\omega t)$. Derive the
particular solution $y_p$.
Exercise 5.3.6: Take the forced vibrating string. Suppose that $L = 1$, $a = 1$. Suppose that the
forcing function is the square wave that is 1 on the interval $0 < x < 1$ and $-1$ on the interval
$-1 < x < 0$. Find the particular solution. Hint: You may want to use result of Exercise �.�.�.
Exercise 5.3.7: The units are cgs (centimeters-grams-seconds). For $k = 0.005$, $\omega = 1.991 \times 10^{-7}$,
$A_0 = 20$. Find the depth at which the temperature variation is half ($\pm 10$ degrees) of what it is on the
surface.
Exercise 5.3.8: Derive the solution for underground temperature oscillation without assuming that
$T_0 = 0$.
Exercise 5.3.101: Take the forced vibrating string. Suppose that $L = 1$, $a = 1$. Suppose that the
forcing function is a sawtooth, that is $|x| - \frac{1}{2}$ on $-1 < x < 1$ extended periodically. Find the
particular solution.
Exercise 5.3.102: The units are cgs (centimeters-grams-seconds). For $k = 0.01$, $\omega = 1.991 \times 10^{-7}$,
$A_0 = 25$. Find the depth at which the summer is again the hottest point.
Chapter 6
We can think of $t$ as time and $f(t)$ as incoming signal. The Laplace transform will convert
the equation from a differential equation in time to an algebraic (no derivatives) equation,
where the new independent variable $s$ is the frequency.
We can think of the Laplace transform as a black box. It eats functions and spits out
functions in a new variable. We write $\mathcal{L}\{f(t)\} = F(s)$ for the Laplace transform of $f(t)$.
It is common to write lower case letters for functions in the time domain and upper case
letters for functions in the frequency domain. We use the same letter to denote that one
function is the Laplace transform of the other. For example $F(s)$ is the Laplace transform of
$f(t)$. Let us define the transform.
\[ \mathcal{L}\{f(t)\} = F(s) \overset{\text{def}}{=} \int_0^\infty e^{-st} f(t) \, dt . \]
† Just like the Laplace equation and the Laplacian, the Laplace transform is also named after Pierre-Simon,
marquis de Laplace (1749–1827).
The limit (the improper integral) only exists if s > 0. So L{1} is only defined for s > 0.
Example 6.1.2: Suppose $f(t) = e^{-at}$, then
\[ \mathcal{L}\{e^{-at}\} = \int_0^\infty e^{-st} e^{-at} \, dt = \int_0^\infty e^{-(s+a)t} \, dt = \left[ \frac{e^{-(s+a)t}}{-(s+a)} \right]_{t=0}^\infty = \frac{1}{s+a} . \]
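The defining integral can be approximated directly, which gives a quick way to sanity-check entries such as $\mathcal{L}\{1\} = \frac{1}{s}$ and $\mathcal{L}\{e^{-at}\} = \frac{1}{s+a}$. The sketch below is only an illustration: `laplace_numeric` is a hypothetical helper, the improper integral is truncated at an assumed cutoff `t_max`, and composite Simpson's rule is used, so it is only meaningful when $e^{-st} f(t)$ is negligible beyond the cutoff.

```python
import math

def laplace_numeric(f, s, t_max=40.0, n=40000):
    """Approximate the integral of exp(-s t) f(t) over [0, t_max].

    Uses composite Simpson's rule with n (even) subintervals; the tail
    beyond t_max is simply dropped, which is an assumption of this sketch.
    """
    step = t_max / n
    integrand = lambda t: math.exp(-s * t) * f(t)
    total = integrand(0.0) + integrand(t_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * step)
    return total * step / 3

if __name__ == "__main__":
    s, a = 2.0, 3.0
    print(laplace_numeric(lambda t: 1.0, s), "vs", 1 / s)
    print(laplace_numeric(lambda t: math.exp(-a * t), s), "vs", 1 / (s + a))
```

Trying the first line with $s \leq 0$ shows the truncated integral growing with `t_max`, in line with the remark that $\mathcal{L}\{1\}$ is only defined for $s > 0$.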
The function is named after the English mathematician, engineer, and physicist Oliver Heaviside
(1850–1925). Only by coincidence is the function “heavy” on “one side.”
Let us find the Laplace transform of $u(t-a)$, where $a \geq 0$ is some constant. That is, the
function that is 0 for $t < a$ and 1 for $t \geq a$.
\[ \mathcal{L}\{u(t-a)\} = \int_0^\infty e^{-st} u(t-a) \, dt = \int_a^\infty e^{-st} \, dt = \left[ \frac{e^{-st}}{-s} \right]_{t=a}^\infty = \frac{e^{-as}}{s} , \]
Since the transform is defined by an integral, we can use the linearity properties of the
integral. For example, suppose $C$ is a constant, then
\[ \mathcal{L}\{C f(t)\} = \int_0^\infty e^{-st} C f(t) \, dt = C \int_0^\infty e^{-st} f(t) \, dt = C \mathcal{L}\{f(t)\} . \]
So we can “pull out” a constant out of the transform. Similarly we have linearity. Since
linearity is very important we state it as a theorem.
Theorem 6.1.1 (Linearity of the Laplace transform). Suppose that $A$, $B$, and $C$ are constants,
then
\[ \mathcal{L}\{A f(t) + B g(t)\} = A \mathcal{L}\{f(t)\} + B \mathcal{L}\{g(t)\} , \]
and in particular
\[ \mathcal{L}\{C f(t)\} = C \mathcal{L}\{f(t)\} . \]
Exercise 6.1.2: Verify the theorem. That is, show that $\mathcal{L}\{A f(t) + B g(t)\} = A \mathcal{L}\{f(t)\} + B \mathcal{L}\{g(t)\}$.
These rules together with Table 6.1 on the preceding page make it easy to find the
Laplace transform of a whole lot of functions already. But be careful. It is a common
mistake to think that the Laplace transform of a product is the product of the transforms.
In general
\[ \mathcal{L}\{f(t) g(t)\} \neq \mathcal{L}\{f(t)\} \mathcal{L}\{g(t)\} . \]
It must also be noted that not all functions have a Laplace transform. For example, the
function $\frac{1}{t}$ does not have a Laplace transform as the integral diverges for all $s$. Similarly,
$\tan t$ or $e^{t^2}$ do not have Laplace transforms.
\[ \lim_{t \to \infty} \frac{f(t)}{e^{ct}} . \]
If the limit exists and is finite (usually zero), then $f(t)$ is of exponential order.
Exercise 6.1.3: Use L’Hopital’s rule from calculus to show that a polynomial is of exponential order.
Hint: Note that a sum of two exponential order functions is also of exponential order. Then show
that $t^n$ is of exponential order for any $n$.
For an exponential order function we have existence and uniqueness of the Laplace
transform.
Theorem 6.1.2 (Existence). Let $f(t)$ be continuous and of exponential order for a certain constant
$c$. Then $F(s) = \mathcal{L}\{f(t)\}$ is defined for all $s > c$.
The existence is not difficult to see. Let $f(t)$ be of exponential order, that is $|f(t)| \leq M e^{ct}$
for all $t > 0$ (for simplicity $t_0 = 0$). Let $s > c$, or in other words $(c - s) < 0$. By the
comparison theorem from calculus, the improper integral defining $\mathcal{L}\{f(t)\}$ exists if the
following integral exists
\[ \int_0^\infty e^{-st} (M e^{ct}) \, dt = M \int_0^\infty e^{(c-s)t} \, dt = M \left[ \frac{e^{(c-s)t}}{c-s} \right]_{t=0}^\infty = \frac{M}{s-c} . \]
The transform also exists for some other functions that are not of exponential order,
but that will not be relevant to us. Before dealing with uniqueness, let us note that for
exponential order functions we obtain that their Laplace transform decays at infinity:
\[ \lim_{s \to \infty} F(s) = 0. \]
Theorem 6.1.3 (Uniqueness). Let $f(t)$ and $g(t)$ be continuous and of exponential order. Suppose
that there exists a constant $C$, such that $F(s) = G(s)$ for all $s > C$. Then $f(t) = g(t)$ for all $t \geq 0$.
Both theorems hold for piecewise continuous functions as well. Recall that piecewise
continuous means that the function is continuous except perhaps at a discrete set of points,
where it has jump discontinuities like the Heaviside function. Uniqueness, however, does
not “see” values at the discontinuities. So we can only conclude that f (t) ⇤ g(t) outside of
discontinuities. For example, the unit step function is sometimes defined using u(0) ⇤ 1/2.
This new step function, however, has the exact same Laplace transform as the one we
defined earlier where u(0) ⇤ 1.
\[ \mathcal{L}^{-1}\{F(s)\} \overset{\text{def}}{=} f(t) . \]
There is an integral formula for the inverse, but it is not as simple as the transform itself: it
requires complex numbers and path integrals. For us it will suffice to compute the inverse
using Table 6.1 on page 295.
Example 6.1.5: Take $F(s) = \frac{1}{s+1}$. Find the inverse Laplace transform.
We look at the table to find
\[ \mathcal{L}^{-1}\left\{ \frac{1}{s+1} \right\} = e^{-t} . \]
As the Laplace transform is linear, the inverse Laplace transform is also linear. That is,
\[ \mathcal{L}^{-1}\{A F(s) + B G(s)\} = A \mathcal{L}^{-1}\{F(s)\} + B \mathcal{L}^{-1}\{G(s)\} . \]
Of course, we also have $\mathcal{L}^{-1}\{A F(s)\} = A \mathcal{L}^{-1}\{F(s)\}$. Let us demonstrate how linearity can
be used.
Example 6.1.6: Take $F(s) = \frac{s^2+s+1}{s^3+s}$. Find the inverse Laplace transform.
First we use the method of partial fractions to write $F$ in a form where we can use Table 6.1
on page 295. We factor the denominator as $s(s^2+1)$ and write
\[ \frac{s^2+s+1}{s^3+s} = \frac{A}{s} + \frac{Bs+C}{s^2+1} . \]
Putting the right-hand side over a common denominator and equating the numerators
we get $A(s^2+1) + s(Bs+C) = s^2+s+1$. Expanding and equating coefficients we obtain
$A + B = 1$, $C = 1$, $A = 1$, and thus $B = 0$. In other words,
\[ F(s) = \frac{s^2+s+1}{s^3+s} = \frac{1}{s} + \frac{1}{s^2+1} . \]
Another useful property is the so-called shifting property or the first shifting property
\[ \mathcal{L}\{e^{-at} f(t)\} = F(s+a), \]
Exercise 6.1.4: Derive the first shifting property from the definition of the Laplace transform.
The shifting property can be used, for example, when the denominator is a more
complicated quadratic that may come up in the method of partial fractions. We complete
the square and write such quadratics as $(s+a)^2 + b$ and then use the shifting property.
Example 6.1.7: Find $\mathcal{L}^{-1}\left\{ \frac{1}{s^2+4s+8} \right\}$.
First we complete the square to make the denominator $(s+2)^2 + 4$. Next we find
\[ \mathcal{L}^{-1}\left\{ \frac{1}{s^2+4} \right\} = \frac{1}{2} \sin(2t) . \]
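Combining the two steps above with the first shifting property (with $a = 2$) suggests the candidate $f(t) = \frac{1}{2} e^{-2t} \sin(2t)$ for the inverse transform of $\frac{1}{s^2+4s+8}$. The sketch below checks that candidate by transforming it forward numerically. It is an illustration under assumptions: `laplace_numeric` is a hypothetical helper that truncates the improper integral at `t_max` and applies composite Simpson's rule.

```python
import math

def laplace_numeric(f, s, t_max=40.0, n=40000):
    """Approximate the Laplace integral of f on [0, t_max] by Simpson's rule."""
    step = t_max / n
    integrand = lambda t: math.exp(-s * t) * f(t)
    total = integrand(0.0) + integrand(t_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * step)
    return total * step / 3

def candidate(t):
    """Candidate inverse transform: (1/2) e^{-2t} sin(2t)."""
    return 0.5 * math.exp(-2.0 * t) * math.sin(2.0 * t)

def target(s):
    """The function of s we started from."""
    return 1.0 / (s * s + 4.0 * s + 8.0)

if __name__ == "__main__":
    for s in (0.5, 1.0, 3.0):
        print(s, laplace_numeric(candidate, s), target(s))
```

Agreement at a few sample values of $s$ is of course not a proof, but it catches sign and factor mistakes in the completing-the-square step.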
In general, we want to be able to apply the Laplace transform to rational functions, that
is functions of the form
\[ \frac{F(s)}{G(s)} \]
where $F(s)$ and $G(s)$ are polynomials. Since normally, for the functions that we are
considering, the Laplace transform goes to zero as $s \to \infty$, it is not hard to see that the
degree of $F(s)$ must be smaller than that of $G(s)$. Such rational functions are called proper
rational functions and we can always apply the method of partial fractions. Of course this
means we need to be able to factor the denominator into linear and quadratic terms, which
involves finding the roots of the denominator.
6.1.4 Exercises
Exercise 6.1.5: Find the Laplace transform of $3 + t^5 + \sin(\pi t)$.
Exercise 6.1.6: Find the Laplace transform of $a + bt + ct^2$ for some constants $a$, $b$, and $c$.
Exercise 6.1.15: Find the Laplace transform of $t \sin(\omega t)$. Hint: Several integrations by parts.
Exercise 6.1.104: Find the Laplace transform of sin(t)e t �Hint� integrate by parts�.
We repeat this procedure for higher derivatives. The results are listed in Table 6.2. The
procedure also works for piecewise smooth functions, that is functions that are piecewise
continuous with a piecewise continuous derivative.
We plug in the initial conditions now (this makes the computations more streamlined) to
obtain
\[ s^2 X(s) - 1 + X(s) = \frac{s}{s^2+4} . \]
We solve for $X(s)$,
\[ X(s) = \frac{s}{(s^2+1)(s^2+4)} + \frac{1}{s^2+1} . \]
This function is useful for putting together functions, or cutting functions off. Most
commonly it is used as $u(t-a)$ for some constant $a$. This just shifts the graph to the right
by $a$. That is, it is a function that is 0 when $t < a$ and 1 when $t \geq a$. Suppose for example
that $f(t)$ is a “signal” and you started receiving the signal $\sin t$ at time $t = \pi$. The function
$f(t)$ should then be defined as
\[ f(t) = \begin{cases} 0 & \text{if } t < \pi, \\ \sin t & \text{if } t \geq \pi. \end{cases} \]
Similarly the step function that is 1 on the interval $[1,2)$ and zero everywhere else can be
written as
\[ u(t-1) - u(t-2). \]
The Heaviside function is useful to define functions defined piecewise. If you want to
define $f(t)$ such that $f(t) = t$ when $t$ is in $[0,1]$, $f(t) = -t+2$ when $t$ is in $[1,2]$, and
$f(t) = 0$ otherwise, then you can use the expression
Hence it is useful to know how the Heaviside function interacts with the Laplace
transform. We have already seen that
\[ \mathcal{L}\{u(t-a)\} = \frac{e^{-as}}{s} . \]
This can be generalized into a shifting property or second shifting property.
\[ \mathcal{L}\{f(t-a)\, u(t-a)\} = e^{-as} \mathcal{L}\{f(t)\} . \tag{6.1} \]
Example 6.2.2: Suppose that the forcing function is not periodic. For example, suppose
that we had a mass-spring system
\[ x''(t) + x(t) = f(t), \quad x(0) = 0, \quad x'(0) = 0, \]
where $f(t) = 1$ if $1 \leq t < 5$ and zero otherwise. We could imagine a mass-spring system,
where a rocket is fired for 4 seconds starting at $t = 1$. Or perhaps an RLC circuit, where
the voltage is raised at a constant rate for 4 seconds starting at $t = 1$, and then held steady
again starting at $t = 5$.
We can write $f(t) = u(t-1) - u(t-5)$. We transform the equation and we plug in the
initial conditions as before to obtain
\[ s^2 X(s) + X(s) = \frac{e^{-s}}{s} - \frac{e^{-5s}}{s} . \]
We solve for $X(s)$ to obtain
\[ X(s) = \frac{e^{-s}}{s(s^2+1)} - \frac{e^{-5s}}{s(s^2+1)} . \]
We leave it as an exercise to the reader to show that
\[ \mathcal{L}^{-1}\left\{ \frac{1}{s(s^2+1)} \right\} = 1 - \cos t . \]
In other words $\mathcal{L}\{1 - \cos t\} = \frac{1}{s(s^2+1)}$. So using (6.1) we find
\[ \mathcal{L}^{-1}\left\{ \frac{e^{-s}}{s(s^2+1)} \right\} = \mathcal{L}^{-1}\left\{ e^{-s} \mathcal{L}\{1 - \cos t\} \right\} = \bigl( 1 - \cos(t-1) \bigr)\, u(t-1) . \]
Similarly
\[ \mathcal{L}^{-1}\left\{ \frac{e^{-5s}}{s(s^2+1)} \right\} = \mathcal{L}^{-1}\left\{ e^{-5s} \mathcal{L}\{1 - \cos t\} \right\} = \bigl( 1 - \cos(t-5) \bigr)\, u(t-5) . \]
Hence, the solution is
The plot of this solution is given in Figure 6.2 on the following page.
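By linearity, the two inverse transforms above combine into $x(t) = \bigl(1 - \cos(t-1)\bigr) u(t-1) - \bigl(1 - \cos(t-5)\bigr) u(t-5)$. One way to gain confidence in such a formula is to integrate the differential equation numerically and compare. The sketch below does this with a classical fourth-order Runge–Kutta scheme. It assumes the system is $x'' + x = f(t)$ with zero initial conditions, as the transformed equation indicates, and all function names are hypothetical.

```python
import math

def forcing(t):
    """f(t) = u(t-1) - u(t-5): the rocket fires for 1 <= t < 5."""
    return 1.0 if 1.0 <= t < 5.0 else 0.0

def heaviside(t):
    """Unit step with u(0) = 1."""
    return 1.0 if t >= 0.0 else 0.0

def x_formula(t):
    """Solution assembled from the inverse transforms via linearity."""
    return ((1.0 - math.cos(t - 1.0)) * heaviside(t - 1.0)
            - (1.0 - math.cos(t - 5.0)) * heaviside(t - 5.0))

def x_numeric(t_end, step=1e-3):
    """Integrate x'' + x = f(t), x(0) = x'(0) = 0 with classical RK4."""
    x, v = 0.0, 0.0
    for i in range(int(round(t_end / step))):
        t = i * step
        acc = lambda tt, xx: forcing(tt) - xx   # x'' = f(t) - x
        k1x, k1v = v, acc(t, x)
        k2x, k2v = v + 0.5 * step * k1v, acc(t + 0.5 * step, x + 0.5 * step * k1x)
        k3x, k3v = v + 0.5 * step * k2v, acc(t + 0.5 * step, x + 0.5 * step * k2x)
        k4x, k4v = v + step * k3v, acc(t + step, x + step * k3x)
        x += step * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += step * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return x

if __name__ == "__main__":
    for t in (0.5, 3.0, 10.0):
        print(t, x_numeric(t), x_formula(t))
```

The two columns agree to within the step size; the accuracy is limited by the jumps in the forcing, not by the formula.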
Lx ⇤ f (t),
where L is a linear constant coefficient differential operator. Then f (t) is usually thought
of as input of the system and x(t) is thought of as the output of the system. For example,
for a mass-spring system the input is the forcing function and output is the behavior of
the mass. We would like to have a convenient way to study the behavior of the system for
different inputs.
Let us suppose that all the initial conditions are zero and take the Laplace transform of
the equation, we obtain the equation
A(s)X(s) ⇤ F(s).
Solving for the ratio X(s)/F(s) we obtain the so-called transfer function H(s) ⇤ 1/A(s).
X(s)
H(s) ⇤ .
F(s)
In other words, X(s) ⇤ H(s)F(s). We obtain an algebraic dependence of the output of the
system based on the input. We can now easily study the steady state behavior of the system
given different inputs by simply multiplying by the transfer function.
Example 6.2.3: Given $x'' + \omega_0^2 x = f(t)$, let us find the transfer function (assuming the initial
conditions are zero).
First, we take the Laplace transform of the equation,
$$s^2 X(s) + \omega_0^2 X(s) = F(s),$$
and solve for the ratio $X(s)/F(s)$:
$$H(s) = \frac{X(s)}{F(s)} = \frac{1}{s^2 + \omega_0^2}.$$
Let us see how to use the transfer function. Suppose we have the constant input $f(t) = 1$.
Hence $F(s) = 1/s$, and
$$X(s) = H(s)F(s) = \frac{1}{s^2 + \omega_0^2} \, \frac{1}{s}.$$
Taking the inverse Laplace transform of $X(s)$ we obtain
$$x(t) = \frac{1 - \cos(\omega_0 t)}{\omega_0^2}.$$
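The inverse transform in the last step of Example 6.2.3 can be checked by partial fractions. The following is a sketch of that computation; it assumes $\omega_0 \neq 0$ and uses only the standard table entries for $1/s$ and $s/(s^2+\omega_0^2)$.

```latex
\begin{align*}
\frac{1}{s(s^2+\omega_0^2)}
  &= \frac{1}{\omega_0^2}\left(\frac{1}{s} - \frac{s}{s^2+\omega_0^2}\right), \\
\mathcal{L}^{-1}\left\{\frac{1}{s(s^2+\omega_0^2)}\right\}
  &= \frac{1}{\omega_0^2}\bigl(1 - \cos(\omega_0 t)\bigr).
\end{align*}
```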
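It is sometimes useful (e.g. for computing the inverse transform) to write this as
$$\int_0^t f(\tau) \, d\tau = \mathcal{L}^{-1}\left\{ \frac{1}{s} F(s) \right\}.$$
Example 6.2.4: To compute $\mathcal{L}^{-1}\left\{ \frac{1}{s(s^2+1)} \right\}$ we could proceed by applying this integration
rule.
$$\mathcal{L}^{-1}\left\{ \frac{1}{s} \, \frac{1}{s^2+1} \right\} = \int_0^t \mathcal{L}^{-1}\left\{ \frac{1}{s^2+1} \right\} d\tau = \int_0^t \sin \tau \, d\tau = 1 - \cos t.$$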
Example 6.2.5: An equation containing an integral of the unknown function is called an
integral equation. For example, take
$$t^2 = \int_0^t e^{\tau} x(\tau) \, d\tau,$$
where we wish to solve for $x(t)$. We apply the Laplace transform and the shifting property
to get
$$\frac{2}{s^3} = \frac{1}{s} \mathcal{L}\{ e^{t} x(t) \} = \frac{1}{s} X(s-1),$$
where $X(s) = \mathcal{L}\{x(t)\}$. Thus
$$X(s-1) = \frac{2}{s^2} \quad \text{or} \quad X(s) = \frac{2}{(s+1)^2}.$$
We use the shifting property again to obtain
$$x(t) = 2 e^{-t} t.$$
6.2.6 Exercises
Exercise 6.2.2: Using the Heaviside function write down the piecewise function that is 0 for $t < 0$,
$t^2$ for $t$ in $[0, 1]$ and $t$ for $t > 1$.
Exercise 6.2.3: Using the Laplace transform solve
$$mx'' + cx' + kx = 0, \quad x(0) = a, \quad x'(0) = b,$$
Exercise 6.2.6: Solve $x'' + x = u(t-1)$ for initial conditions $x(0) = 0$ and $x'(0) = 0$.
Exercise 6.2.7: Show the differentiation of the transform property. Suppose $\mathcal{L}\{f(t)\} = F(s)$, then
show
$$\mathcal{L}\{ -t f(t) \} = F'(s).$$
Hint: Differentiate under the integral sign.
Exercise 6.2.8: Solve $x''' + x = t^3 u(t-1)$ for initial conditions $x(0) = 1$ and $x'(0) = 0$, $x''(0) = 0$.
Exercise 6.2.10: Let us think of the mass-spring system with a rocket from Example 6.2.2. We
noticed that the solution kept oscillating after the rocket stopped running. The amplitude of the
oscillation depends on the time that the rocket was fired (for 4 seconds in the example).
a) Find a formula for the amplitude of the resulting oscillation in terms of the amount of time the
rocket is fired.
b) Is there a nonzero time (if so what is it?) for which the rocket fires and the resulting oscillation
has amplitude 0 (the mass is not moving)?
Exercise 6.2.12: Find the transfer function for $mx'' + cx' + kx = f(t)$ (assuming the initial
conditions are zero).
Exercise 6.2.101: Using the Heaviside function $u(t)$, write down the function
$$f(t) = \begin{cases} 0 & \text{if } t < 1, \\ t - 1 & \text{if } 1 \leq t < 2, \\ 1 & \text{if } 2 \leq t. \end{cases}$$
Exercise 6.2.102: Solve $x'' - x = (t^2 - 1) u(t-1)$ for initial conditions $x(0) = 1$, $x'(0) = 2$ using
the Laplace transform.
Exercise 6.2.103: Find the transfer function for $x' + x = f(t)$ (assuming the initial conditions are
zero).
6.3 Convolution
Note: 1 or 1.5 lectures, §7.2 in [EP], §6.6 in [BD]
For those that have seen convolution defined before, you may have seen it defined as
$(f * g)(t) = \int_{-\infty}^{\infty} f(\tau) g(t-\tau) \, d\tau$. This definition agrees with (6.2) if you define $f(t)$ and $g(t)$ to be zero for $t < 0$. When
discussing the Laplace transform the definition we gave is sufficient. Convolution does occur in many other
applications, however, where you may have to use the more general definition with infinities.
The convolution has many properties that make it behave like a product. Let $c$ be a
constant and $f$, $g$, and $h$ be functions. Then
$$f * g = g * f,$$
$$(cf) * g = f * (cg) = c(f * g),$$
$$(f * g) * h = f * (g * h).$$
The most interesting property for us, and the main result of this section is the following
theorem.
Theorem 6.3.1. Let $f(t)$ and $g(t)$ be of exponential order, then
$$\mathcal{L}\{ (f * g)(t) \} = \mathcal{L}\left\{ \int_0^t f(\tau) g(t-\tau) \, d\tau \right\} = \mathcal{L}\{ f(t) \} \, \mathcal{L}\{ g(t) \}.$$
In other words, the Laplace transform of a convolution is the product of the Laplace
transforms. The simplest way to use this result is in reverse.
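Example 6.3.3: Suppose we have the function of $s$ defined by
$$\frac{1}{(s+1)s^2} = \frac{1}{s+1} \, \frac{1}{s^2}.$$
We recognize the two entries of Table 6.2. That is
$$\mathcal{L}^{-1}\left\{ \frac{1}{s+1} \right\} = e^{-t} \quad \text{and} \quad \mathcal{L}^{-1}\left\{ \frac{1}{s^2} \right\} = t.$$
Therefore,
$$\mathcal{L}^{-1}\left\{ \frac{1}{s+1} \, \frac{1}{s^2} \right\} = \int_0^t \tau e^{-(t-\tau)} \, d\tau = e^{-t} + t - 1.$$
The calculation of the integral involved an integration by parts.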
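For completeness, here is a sketch of the integration by parts behind Example 6.3.3. The factor $e^{-t}$ does not depend on $\tau$, so it can be pulled out of the integral first.

```latex
\begin{align*}
\int_0^t \tau e^{-(t-\tau)} \, d\tau
  &= e^{-t} \int_0^t \tau e^{\tau} \, d\tau
   = e^{-t} \Bigl[ \tau e^{\tau} - e^{\tau} \Bigr]_{\tau=0}^{t} \\
  &= e^{-t} \bigl( t e^{t} - e^{t} + 1 \bigr)
   = t - 1 + e^{-t}.
\end{align*}
```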
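or in other words
$$X(s) = F(s) \frac{1}{s^2 + \omega_0^2}.$$
We know
$$\mathcal{L}^{-1}\left\{ \frac{1}{s^2 + \omega_0^2} \right\} = \frac{\sin(\omega_0 t)}{\omega_0}.$$
Therefore,
$$x(t) = \int_0^t f(\tau) \frac{\sin\bigl(\omega_0 (t-\tau)\bigr)}{\omega_0} \, d\tau,$$
or if we reverse the order
$$x(t) = \int_0^t \frac{\sin(\omega_0 \tau)}{\omega_0} f(t-\tau) \, d\tau.$$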
Let us notice one more feature of this example. We can now see how the Laplace transform
handles resonance. Suppose that $f(t) = \cos(\omega_0 t)$. Then
$$x(t) = \int_0^t \frac{\sin(\omega_0 \tau)}{\omega_0} \cos\bigl(\omega_0 (t-\tau)\bigr) \, d\tau = \frac{1}{\omega_0} \int_0^t \sin(\omega_0 \tau) \cos\bigl(\omega_0 (t-\tau)\bigr) \, d\tau.$$
We have computed the convolution of sine and cosine in Example 6.3.2. Hence
$$x(t) = \left( \frac{1}{\omega_0} \right) \left( \frac{1}{2} t \sin(\omega_0 t) \right) = \frac{1}{2\omega_0} t \sin(\omega_0 t).$$
Note the $t$ in front of the sine. The solution, therefore, grows without bound as $t$ gets large,
meaning we get resonance.
Similarly, we can solve any constant coefficient equation with an arbitrary forcing
function f (t) as a definite integral using convolution. A definite integral, rather than
a closed form solution, is usually enough for most practical purposes. It is not hard to
numerically evaluate a definite integral.
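To illustrate the remark about numerical evaluation, here is a minimal sketch that approximates the convolution integral $x(t) = \int_0^t f(\tau) \sin(\omega_0(t-\tau))/\omega_0 \, d\tau$ with the composite trapezoid rule. The function name `convolution_solution` and the number of subintervals are our own choices for illustration; any standard quadrature rule would do.

```python
import math

def convolution_solution(f, omega0, t, n=2000):
    """Approximate x(t) = int_0^t f(tau) sin(omega0 (t - tau)) / omega0 dtau.

    Uses the composite trapezoid rule with n subintervals of [0, t].
    """
    if t == 0:
        return 0.0
    h = t / n
    total = 0.0
    for i in range(n + 1):
        tau = i * h
        weight = 0.5 if i in (0, n) else 1.0  # endpoints get half weight
        total += weight * f(tau) * math.sin(omega0 * (t - tau)) / omega0
    return total * h

if __name__ == "__main__":
    omega0, t = 2.0, 3.0
    # Resonant input f(t) = cos(omega0 t): compare with t sin(omega0 t) / (2 omega0).
    approx = convolution_solution(lambda tau: math.cos(omega0 * tau), omega0, t)
    exact = t * math.sin(omega0 * t) / (2 * omega0)
    print(approx, exact)
```

The same function can be called with any other forcing function `f`, including one known only through measured values wrapped in an interpolating function.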
Consider an integral equation of the form
$$x(t) = f(t) + \int_0^t g(t-\tau) x(\tau) \, d\tau,$$
where $f(t)$ and $g(t)$ are known functions and $x(t)$ is an unknown we wish to solve for. To
find $x(t)$, we apply the Laplace transform to the equation to obtain
$$X(s) = F(s) + G(s) X(s),$$
where $X(s)$, $F(s)$, and $G(s)$ are the Laplace transforms of $x(t)$, $f(t)$, and $g(t)$ respectively.
We find
$$X(s) = \frac{F(s)}{1 - G(s)}.$$
To find $x(t)$ we now need to find the inverse Laplace transform of $X(s)$.
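Example 6.3.5: Solve
$$x(t) = e^{-t} + \int_0^t \sinh(t-\tau) x(\tau) \, d\tau.$$
We apply Laplace transform to obtain
$$X(s) = \frac{1}{s+1} + \frac{1}{s^2-1} X(s),$$
or
$$X(s) = \frac{\frac{1}{s+1}}{1 - \frac{1}{s^2-1}} = \frac{s-1}{s^2-2} = \frac{s}{s^2-2} - \frac{1}{s^2-2}.$$
It is not hard to apply Table 6.1 on page 295 to find
$$x(t) = \cosh\bigl(\sqrt{2}\, t\bigr) - \frac{1}{\sqrt{2}} \sinh\bigl(\sqrt{2}\, t\bigr).$$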
6.3.4 Exercises
Exercise 6.3.1: Let $f(t) = t^2$ for $t \geq 0$, and $g(t) = u(t-1)$. Compute $f * g$.
Exercise 6.3.2: Let $f(t) = t$ for $t \geq 0$, and $g(t) = \sin t$ for $t \geq 0$. Compute $f * g$.
Exercise 6.3.3: Find the solution to
for an arbitrary function $f(t)$, where $m > 0$, $c > 0$, $k > 0$, and $c^2 - 4km > 0$ (system is
overdamped). Write the solution as a definite integral.
Exercise 6.3.4: Find the solution to
for an arbitrary function $f(t)$, where $m > 0$, $c > 0$, $k > 0$, and $c^2 - 4km < 0$ (system is
underdamped). Write the solution as a definite integral.
Exercise 6.3.5: Find the solution to
for an arbitrary function $f(t)$, where $m > 0$, $c > 0$, $k > 0$, and $c^2 = 4km$ (system is critically
damped). Write the solution as a definite integral.
Exercise 6.3.104: Solve $x''' + x' = f(t)$, $x(0) = 0$, $x'(0) = 0$, $x''(0) = 0$ using convolution. Write
the result as a definite integral.
�.�. DIRAC DELTA AND IMPULSE RESPONSE 313
Figure 6.3: Sample square pulse with $a = 0.5$, $b = 1$ and $M = 2$.
For simplicity we let $a = 0$, and it is convenient to set $M = 1/b$ to have
$$\int_0^\infty \varphi(t) \, dt = 1.$$
That is, to have the pulse have “unit mass.” For such a pulse we compute
$$\mathcal{L}\{ \varphi(t) \} = \mathcal{L}\left\{ \frac{u(t) - u(t-b)}{b} \right\} = \frac{1 - e^{-bs}}{bs}.$$
We generally want $b$ to be very small. That is, we wish to have the pulse be very short and
very tall. By letting $b$ go to zero we arrive at the concept of the Dirac delta function.
The formula should hold if we integrate over any interval that contains 0, not just $(-\infty, \infty)$.
So $\delta(t)$ is a “function” with all its “mass” at the single point $t = 0$. In other words, for any
interval $[c, d]$
$$\int_c^d \delta(t) \, dt = \begin{cases} 1 & \text{if the interval } [c, d] \text{ contains } 0, \text{ i.e. } c \leq 0 \leq d, \\ 0 & \text{otherwise.} \end{cases}$$
Unfortunately there is no such function in the classical sense. You could informally think
that $\delta(t)$ is zero for $t \neq 0$ and somehow infinite at $t = 0$.
A good way to think about $\delta(t)$ is as a limit of short pulses whose integral is 1. For
example, suppose that we have a square pulse $\varphi(t)$ as above with $a = 0$, $M = 1/b$, that is
$\varphi(t) = \frac{u(t) - u(t-b)}{b}$. Compute
$$\int_{-\infty}^{\infty} \varphi(t) f(t) \, dt = \int_{-\infty}^{\infty} \frac{u(t) - u(t-b)}{b} f(t) \, dt = \frac{1}{b} \int_0^b f(t) \, dt.$$
If $f(t)$ is continuous at $t = 0$, then for very small $b$, the function $f(t)$ is approximately equal
to $f(0)$ on the interval $[0, b]$. We approximate the integral
$$\frac{1}{b} \int_0^b f(t) \, dt \approx \frac{1}{b} \int_0^b f(0) \, dt = f(0).$$
Hence,
$$\lim_{b \to 0} \int_{-\infty}^{\infty} \varphi(t) f(t) \, dt = \lim_{b \to 0} \frac{1}{b} \int_0^b f(t) \, dt = f(0).$$
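The limit above is easy to observe numerically. The sketch below computes the average $\frac{1}{b} \int_0^b f(t) \, dt$ for shrinking pulse widths $b$ with the midpoint rule; the helper name `pulse_average` and the test function $f(t) = \cos t$ are assumptions made for illustration only.

```python
import math

def pulse_average(f, b, n=1000):
    """Approximate (1/b) * int_0^b f(t) dt with the midpoint rule on n subintervals.

    This is the integral of f against the unit-mass square pulse of width b.
    """
    h = b / n
    return sum(f((i + 0.5) * h) for i in range(n)) * h / b

if __name__ == "__main__":
    # For f = cos the exact average is sin(b)/b, which tends to f(0) = 1 as b -> 0.
    for b in (1.0, 0.1, 0.01, 0.001):
        print(b, pulse_average(math.cos, b))
```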
Let us therefore accept $\delta(t)$ as an object that is possible to integrate. We often want to
shift $\delta$ to another point, for example $\delta(t-a)$. In that case we have
$$\int_{-\infty}^{\infty} \delta(t-a) f(t) \, dt = f(a).$$
Note that $\delta(a-t)$ is the same object as $\delta(t-a)$. In other words, the convolution of $\delta(t)$
with $f(t)$ is again $f(t)$,
$$(f * \delta)(t) = \int_0^t \delta(t-s) f(s) \, ds = f(t).$$
Named after the English physicist and mathematician Paul Adrien Maurice Dirac (1902–1984).
In particular,
$$\mathcal{L}\{ \delta(t) \} = 1.$$
Remark 6.4.1: Notice that the Laplace transform of $\delta(t-a)$ looks like the Laplace transform
of the derivative of the Heaviside function $u(t-a)$, if we could differentiate the Heaviside
function. First notice
$$\mathcal{L}\{ u(t-a) \} = \frac{e^{-as}}{s}.$$
To obtain what the Laplace transform of the derivative would be we multiply by $s$, to obtain
$e^{-as}$, which is the Laplace transform of $\delta(t-a)$. We see the same thing using integration,
$$\int_0^t \delta(s-a) \, ds = u(t-a).$$
So in a certain sense
$$\text{“} \frac{d}{dt} \bigl[ u(t-a) \bigr] = \delta(t-a). \text{”}$$
This line of reasoning allows us to talk about derivatives of functions with jump discontinuities.
We can think of the derivative of the Heaviside function $u(t-a)$ as being somehow
infinite at $a$, which is precisely our intuitive understanding of the delta function.
Example 6.4.1: Let us compute $\mathcal{L}^{-1}\left\{ \frac{s+1}{s} \right\}$. So far we have always looked at proper rational
functions in the $s$ variable. That is, the numerator was always of lower degree than the
denominator. Not so with $\frac{s+1}{s}$. We write,
$$\mathcal{L}^{-1}\left\{ \frac{s+1}{s} \right\} = \mathcal{L}^{-1}\left\{ 1 + \frac{1}{s} \right\} = \mathcal{L}^{-1}\{1\} + \mathcal{L}^{-1}\left\{ \frac{1}{s} \right\} = \delta(t) + 1.$$
The resulting object is a generalized function and only makes sense when put underneath
an integral.
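$$Lx = \delta(t)$$
We first apply the Laplace transform to the equation. Denote the transform of $x(t)$ by
$X(s)$.
$$s^2 X(s) + \omega_0^2 X(s) = 1, \quad \text{and so} \quad X(s) = \frac{1}{s^2 + \omega_0^2}.$$
Taking the inverse Laplace transform we obtain
$$x(t) = \frac{\sin(\omega_0 t)}{\omega_0}.$$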
Let us notice something about the above example. We showed before that when the
input is $f(t)$, then the solution to $Lx = f(t)$ is given by
$$x(t) = \int_0^t f(\tau) \frac{\sin\bigl(\omega_0 (t-\tau)\bigr)}{\omega_0} \, d\tau.$$
That is, the solution for an arbitrary input is given as convolution with the impulse response.
Let us see why. The key is to notice that for functions $x(t)$ and $f(t)$,
$$(x * f)''(t) = \frac{d^2}{dt^2} \left[ \int_0^t f(\tau) x(t-\tau) \, d\tau \right] = \int_0^t f(\tau) x''(t-\tau) \, d\tau = (x'' * f)(t).$$
We simply differentiate twice under the integral; the details are left as an exercise. If we
convolve the entire equation (6.3) with $f$, the left-hand side becomes
$$(x'' + \omega_0^2 x) * f = (x'' * f) + \omega_0^2 (x * f) = (x * f)'' + \omega_0^2 (x * f).$$
The right-hand side becomes $(\delta * f)(t) = f(t)$. Therefore $y(t) = (x * f)(t)$ is the solution to
$$y'' + \omega_0^2 y = f(t).$$
This procedure works in general for other linear equations $Lx = f(t)$. If you determine
the impulse response, you also know how to obtain the output $x(t)$ for any input $f(t)$ by
simply convolving the impulse response and the input $f(t)$.
You should really think of the integral going over $(-\infty, \infty)$ rather than over $[0, t]$ and simply assume that
$f(t)$ and $x(t)$ are continuous and zero for negative $t$.
We could integrate, but using the Laplace transform is even easier. We apply the
transform in the $x$ variable rather than the $t$ variable. Let us again denote the transform of
$y(x)$ as $Y(s)$.
$$s^4 Y(s) - s^3 y(0) - s^2 y'(0) - s y''(0) - y'''(0) = -e^{-s}.$$
We notice that $y(0) = 0$ and $y''(0) = 0$. Let us call $C_1 = y'(0)$ and $C_2 = y'''(0)$. We solve for
$Y(s)$,
$$Y(s) = \frac{-e^{-s}}{s^4} + \frac{C_1}{s^2} + \frac{C_2}{s^4}.$$
We take the inverse Laplace transform utilizing the second shifting property (6.1) to take
the inverse of the first term.
$$y(x) = \frac{-(x-1)^3}{6} u(x-1) + C_1 x + \frac{C_2}{6} x^3.$$
We still need to apply two of the endpoint conditions. As the conditions are at $x = 2$ we
can simply replace $u(x-1) = 1$ when taking the derivatives. Therefore,
$$0 = y(2) = \frac{-(2-1)^3}{6} + C_1 (2) + \frac{C_2}{6} 2^3 = \frac{-1}{6} + 2 C_1 + \frac{4}{3} C_2,$$
and
$$0 = y''(2) = \frac{-3 \cdot 2 \cdot (2-1)}{6} + \frac{C_2}{6} \, 3 \cdot 2 \cdot 2 = -1 + 2 C_2.$$
Hence $C_2 = \frac{1}{2}$ and solving for $C_1$ using the first equation we obtain $C_1 = \frac{-1}{4}$. Our solution
for the beam deflection is
$$y(x) = \frac{-(x-1)^3}{6} u(x-1) - \frac{x}{4} + \frac{x^3}{12}.$$
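The four endpoint conditions used above, $y(0) = 0$, $y''(0) = 0$, $y(2) = 0$, $y''(2) = 0$, can be checked in exact arithmetic. This is only a sketch: the function names `deflection` and `second_derivative` are ours, and the second derivative was computed by hand from $y(x) = -\frac{(x-1)^3}{6} u(x-1) - \frac{x}{4} + \frac{x^3}{12}$.

```python
from fractions import Fraction

def deflection(x):
    """y(x) = -(x-1)^3/6 * u(x-1) - x/4 + x^3/12, evaluated exactly."""
    x = Fraction(x)
    step = 1 if x >= 1 else 0  # Heaviside function u(x - 1)
    return -(x - 1) ** 3 / 6 * step - x / 4 + x ** 3 / 12

def second_derivative(x):
    """y''(x) = -(x-1) * u(x-1) + x/2, obtained by differentiating y twice by hand."""
    x = Fraction(x)
    step = 1 if x >= 1 else 0
    return -(x - 1) * step + x / 2

if __name__ == "__main__":
    # All four printed values should be zero for a simply supported beam.
    print(deflection(0), second_derivative(0), deflection(2), second_derivative(2))
    # The deflection at the point where the force is applied.
    print(deflection(1))
```

Using `Fraction` avoids rounding, so the endpoint values come out as exact zeros rather than small floating point numbers.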
6.4.5 Exercises
Exercise 6.4.1: Solve (find the impulse response) $x'' + x' + x = \delta(t)$, $x(0) = 0$, $x'(0) = 0$.
Exercise 6.4.2: Solve (find the impulse response) $x'' + 2x' + x = \delta(t)$, $x(0) = 0$, $x'(0) = 0$.
Exercise 6.4.3: A pulse can come later and can be bigger. Solve $x'' + 4x = 4\delta(t-1)$, $x(0) = 0$,
$x'(0) = 0$.
Exercise 6.4.4: Suppose that $f(t)$ and $g(t)$ are differentiable functions and suppose that $f(t) =
g(t) = 0$ for all $t \leq 0$. Show that
Exercise 6.4.5: Suppose that $Lx = \delta(t)$, $x(0) = 0$, $x'(0) = 0$, has the solution $x = e^{-t}$ for $t > 0$.
Find the solution to $Lx = t^2$, $x(0) = 0$, $x'(0) = 0$ for $t > 0$.
Exercise 6.4.6: Compute $\mathcal{L}^{-1}\left\{ \frac{s^2+s+1}{s^2} \right\}$.
Exercise 6.4.7 (challenging): Solve Example 6.4.3 via integrating 4 times in the $x$ variable.
Exercise 6.4.8: Suppose we have a beam of length 1 simply supported at the ends and suppose that
force $F = 1$ is applied at $x = \frac{3}{4}$ in the downward direction. Suppose that $EI = 1$ for simplicity. Find
the beam deflection $y(x)$.
Exercise 6.4.101: Solve (find the impulse response) $x'' = \delta(t)$, $x(0) = 0$, $x'(0) = 0$.
Exercise 6.4.102: Solve (find the impulse response) $x' + ax = \delta(t)$, $x(0) = 0$, $x'(0) = 0$.
Exercise 6.4.103: Suppose that $Lx = \delta(t)$, $x(0) = 0$, $x'(0) = 0$, has the solution $x(t) = \cos(t)$ for
$t > 0$. Find (in closed form) the solution to $Lx = \sin(t)$, $x(0) = 0$, $x'(0) = 0$ for $t > 0$.
Exercise 6.4.104: Compute $\mathcal{L}^{-1}\left\{ \frac{s^2}{s^2+1} \right\}$.
Exercise 6.4.105: Compute $\mathcal{L}^{-1}\left\{ \frac{3 s^2 e^{-s} + 2}{s^2} \right\}$.
There is a corresponding Fourier transform on the real line as well, that looks sort of like the Laplace
transform.
† It’s a river of goo already, we’re not hurting the environment much more.
�.�. SOLVING PDES WITH THE LAPLACE TRANSFORM 321
To transform the derivative in $t$ (the variable being transformed), we use the rules from
§ 6.2:
$$\mathcal{L}\{ y_t(x,t) \} = sY(x) - y(x,0).$$
In our specific case, $y(x,0) = 0$, and so $\mathcal{L}\{ y_t(x,t) \} = sY(x)$. We transform the equation
to find
$$sY(x) = -\alpha Y'(x).$$
This ODE needs an initial condition. The initial condition is the other side condition of the
PDE, the one that depends on $x$. Everything is transformed, so we must also transform
this condition
$$Y(0) = \mathcal{L}\{ y(0,t) \} = \mathcal{L}\{ C \} = \frac{C}{s}.$$
We solve the ODE problem $sY(x) = -\alpha Y'(x)$, $Y(0) = \frac{C}{s}$, to find
$$Y(x) = \frac{C}{s} e^{-\frac{s}{\alpha} x}.$$
We are not done, we have $Y(x)$, but we really want $y(x,t)$. We transform the $s$ variable
back to $t$. Let
$$u(t) = \begin{cases} 0 & \text{if } t < 0, \\ 1 & \text{otherwise} \end{cases}$$
be the Heaviside function. As
$$\mathcal{L}\{ u(t-a) \} = \int_0^\infty u(t-a) e^{-st} \, dt = \int_a^\infty e^{-st} \, dt = \frac{e^{-as}}{s},$$
then
$$y(x,t) = \mathcal{L}^{-1}\left\{ \frac{C}{s} e^{-\frac{s}{\alpha} x} \right\} = C u(t - x/\alpha).$$
In other words,
$$y(x,t) = \begin{cases} 0 & \text{if } t < x/\alpha, \\ C & \text{otherwise.} \end{cases}$$
See Figure 6.6 for a diagram of this solution. The line of slope $1/\alpha$
indicates the wavefront of the toxic substance in the picture as it is leaving the factory.
What the equation does is simply move the initial condition to the right at speed $\alpha$.
Shhh... $y$ is not differentiable, it is not even continuous (nobody ever seems to notice).
How could we plug something that’s not differentiable into the equation? Well, just think
of a differentiable function very very close to $y$. Or, if you recognize the derivative of the
Heaviside function as the delta function, then all is well too:
$$y_t(x,t) = \frac{\partial}{\partial t} \bigl[ C u(t - x/\alpha) \bigr] = C u'(t - x/\alpha) = C \delta(t - x/\alpha)$$
and
$$y_x(x,t) = \frac{\partial}{\partial x} \bigl[ C u(t - x/\alpha) \bigr] = -\frac{C}{\alpha} u'(t - x/\alpha) = -\frac{C}{\alpha} \delta(t - x/\alpha).$$
So $y_t = -\alpha y_x$.
Figure 6.6: The solution in the $(x,t)$ plane. The wavefront is the line through $(0,0)$ of slope $1/\alpha$; $y = C$ above the wavefront and on the $t$-axis, and $y = 0$ below the wavefront and on the $x$-axis.
The Laplace transform is very good with constant coefficient equations. One advantage
of the Laplace transform is that it easily handles nonhomogeneous side conditions. Let us try a more
complicated example.
Example 6.5.2: Consider
$$y_t + y_x + y = 0, \quad y(0,t) = \sin(t), \quad y(x,0) = 0.$$
Again, we transform $t$, and we write $Y(x)$ for the transformed function. As $y(x,0) = 0$,
we find
$$sY(x) + Y'(x) + Y(x) = 0, \quad Y(0) = \frac{1}{s^2+1}.$$
The solution of the transformed equation is
$$Y(x) = \frac{1}{s^2+1} e^{-(s+1)x} = \frac{1}{s^2+1} e^{-xs} e^{-x}.$$
Using the second shifting property (6.1) and linearity of the transform, we obtain the
solution
$$y(x,t) = e^{-x} \sin(t-x) u(t-x).$$
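One way to gain confidence in the formula $y(x,t) = e^{-x} \sin(t-x) u(t-x)$ is to plug it back into $y_t + y_x + y = 0$. The following sketch does this in the region $t > x$, where $u(t-x) = 1$; in the region $t < x$ the function is identically zero and there is nothing to check.

```latex
\begin{align*}
y_t &= e^{-x} \cos(t-x), \\
y_x &= -e^{-x} \sin(t-x) - e^{-x} \cos(t-x), \\
y_t + y_x + y &= e^{-x} \cos(t-x) - e^{-x} \sin(t-x) - e^{-x} \cos(t-x) + e^{-x} \sin(t-x) = 0.
\end{align*}
```

The side conditions also hold: at $x = 0$ and $t > 0$ we get $y(0,t) = \sin(t)$, and at $t = 0$ and $x > 0$ we get $y(x,0) = 0$ because $u(-x) = 0$.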
We can also detect when the problem is ill-posed in the sense that it has no solution. Let
us change the equation to
$$-y_t + y_x = 0, \quad y(0,t) = \sin(t), \quad y(x,0) = 0.$$
Then the problem has no solution. First, let us see this in the language of § 1.9. The
characteristic curves are $t = -x + C$. If $\tau$ is the characteristic coordinate, then we find
the equation $y_\tau = 0$ along the curve, meaning a solution is constant along characteristic
curves. But these curves intersect both the $x$-axis and the $t$-axis. For example, the curve
$t = -x + 1$ intersects at $(1, 0)$ and $(0, 1)$. The solution is constant along the curve so $y(1, 0)$
should equal $y(0, 1)$. But $y(1, 0) = 0$ and $y(0, 1) = \sin(1) \neq 0$. See Figure 6.7.
Figure 6.7: The characteristic curve $t = -x + 1$ joins $(0,1)$, where $y(0,1) = \sin(1)$, to $(1,0)$, where $y(1,0) = 0$; $y$ is constant along this characteristic curve. The side conditions are $y = \sin(t)$ on the $t$-axis and $y = 0$ on the $x$-axis.
The transformed problem is
$$-sY(x) + Y'(x) = 0, \quad Y(0) = \frac{1}{s^2+1},$$
and the solution ought to be
$$Y(x) = \frac{1}{s^2+1} e^{sx}.$$
Importantly, this Laplace transform does not decay to zero at infinity! That is, since $x > 0$
in the region of interest, then
$$\lim_{s \to \infty} \frac{1}{s^2+1} e^{sx} = \infty \neq 0.$$
It almost looks as if we could use the shifting property, but notice that the shift is in the
wrong direction.
Of course, we need not restrict ourselves to first order equations, although the computations
become more involved for higher order equations.
Really we also impose other conditions on the solution so that for example the Laplace
transform exists. For example, we might impose that $y$ is bounded for each fixed time $t$.
Transform the equation in the $t$ variable to find
$$sY(x) = Y''(x).$$
6.5.1 Exercises
Exercise 6.5.1: Solve
Exercise 6.5.5: Find the corresponding ODE problem for $Y(x)$, after transforming the $t$ variable
Hint: Note that $e^{sx}$ does not go to zero as $s \to \infty$ for positive $x$, and $e^{-sx}$ does not go to zero as
$s \to \infty$ for negative $x$.
Exercise 6.5.103: Find the corresponding ODE problem for $Y(x)$, after transforming the $t$ variable
Hint: Note that $e^{sx}$ does not go to zero as $s \to \infty$ for positive $x$, and $e^{-sx}$ does not go to zero as
$s \to \infty$ for negative $x$.
Chapter 7
$$\sum_{k=0}^\infty a_k (x - x_0)^k.$$
7.1.1 Definition
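As we said, a power series is an expression such as
$$\sum_{k=0}^\infty a_k (x - x_0)^k = a_0 + a_1 (x - x_0) + a_2 (x - x_0)^2 + a_3 (x - x_0)^3 + \cdots, \tag{7.1}$$
where $a_0, a_1, a_2, \ldots$ and $x_0$ are constants. Let
$$S_n(x) = \sum_{k=0}^n a_k (x - x_0)^k = a_0 + a_1 (x - x_0) + a_2 (x - x_0)^2 + a_3 (x - x_0)^3 + \cdots + a_n (x - x_0)^n$$
denote the $n$th partial sum. If for some $x$ the limit
$$\lim_{n \to \infty} S_n(x) = \lim_{n \to \infty} \sum_{k=0}^n a_k (x - x_0)^k$$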
328 CHAPTER �. POWER SERIES METHODS
exists, then we say that the series (7.1) converges at $x$. At $x = x_0$, the series always converges
to $a_0$. When (7.1) converges at any other point $x \neq x_0$, we say that (7.1) is a convergent power
series, and we write
$$\sum_{k=0}^\infty a_k (x - x_0)^k = \lim_{n \to \infty} \sum_{k=0}^n a_k (x - x_0)^k.$$
If the series does not converge for any point $x \neq x_0$, we say that the series is divergent.
Example 7.1.1: The series
$$\sum_{k=0}^\infty \frac{1}{k!} x^k = 1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \cdots$$
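A useful test for convergence of a series is the ratio test. Suppose that
$$\sum_{k=0}^\infty c_k$$
is a series and the limit
$$L = \lim_{k \to \infty} \left| \frac{c_{k+1}}{c_k} \right|$$
exists. Then the series converges absolutely if $L < 1$ and diverges if $L > 1$.
We apply this test to the series (7.1). Let $c_k = a_k (x - x_0)^k$ in the test. Compute
$$L = \lim_{k \to \infty} \left| \frac{c_{k+1}}{c_k} \right| = \lim_{k \to \infty} \left| \frac{a_{k+1} (x - x_0)^{k+1}}{a_k (x - x_0)^k} \right| = \lim_{k \to \infty} \left| \frac{a_{k+1}}{a_k} \right| |x - x_0|.$$
Define $A$ by
$$A = \lim_{k \to \infty} \left| \frac{a_{k+1}}{a_k} \right|.$$
Then if $1 > L = A|x - x_0|$ the series (7.1) converges absolutely. If $A = 0$, then the series
always converges. If $A > 0$, then the series converges absolutely if $|x - x_0| < 1/A$, and
diverges if $|x - x_0| > 1/A$. That is, the radius of convergence is $1/A$.
A similar test is the root test. Suppose
$$L = \lim_{k \to \infty} \sqrt[k]{|c_k|}$$
exists. Then $\sum_{k=0}^\infty c_k$ converges absolutely if $L < 1$ and diverges if $L > 1$. We can use the
same calculation as above to find $A$. Let us summarize.
Theorem 7.1.2 (Ratio and root tests for power series). Consider a power series
$$\sum_{k=0}^\infty a_k (x - x_0)^k$$
such that
$$A = \lim_{k \to \infty} \left| \frac{a_{k+1}}{a_k} \right| \quad \text{or} \quad A = \lim_{k \to \infty} \sqrt[k]{|a_k|}$$
exists. If $A = 0$, then the radius of convergence of the series is $\infty$. Otherwise, the radius of
convergence is $1/A$.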
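The theorem suggests a quick numerical experiment: evaluate the ratio $|a_{k+1}/a_k|$ at one large $k$ and take its reciprocal as a guess for the radius of convergence. The sketch below does exactly that. The helper name `radius_estimate` is hypothetical, and a single ratio is only a heuristic, since the theorem is about the limit as $k \to \infty$.

```python
import math
from fractions import Fraction

def radius_estimate(coeff, k):
    """Estimate the radius of convergence 1/A using A ~ |a_{k+1} / a_k| at a single k.

    coeff(k) must return the (nonzero) coefficient a_k as an int or Fraction.
    """
    ratio = abs(Fraction(coeff(k + 1)) / Fraction(coeff(k)))
    return math.inf if ratio == 0 else float(1 / ratio)

if __name__ == "__main__":
    # a_k = 2^(-k): every ratio equals 1/2, so the estimate is 2.
    print(radius_estimate(lambda k: Fraction(1, 2 ** k), 20))
    # a_k = k: the ratio (k+1)/k tends to 1, so the estimate tends to 1.
    print(radius_estimate(lambda k: k, 1000))
    # a_k = 1/k!: the ratio 1/(k+1) tends to 0, so the estimate grows without bound.
    print(radius_estimate(lambda k: Fraction(1, math.factorial(k)), 50))
```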
$$A = \lim_{k \to \infty} \left| \frac{a_{k+1}}{a_k} \right| = \lim_{k \to \infty} \left| \frac{2^{-k-1}}{2^{-k}} \right| = \lim_{k \to \infty} 2^{-1} = 1/2.$$
Therefore the radius of convergence is 2, and the series converges absolutely on the interval
$(-1, 3)$. And we could just as well have used the root test:
$$A = \lim_{k \to \infty} \sqrt[k]{|a_k|} = \lim_{k \to \infty} \sqrt[k]{|2^{-k}|} = \lim_{k \to \infty} 2^{-1} = 1/2.$$
So the radius of convergence is $\infty$: the series converges everywhere. The ratio test would
also work here.
The root or the ratio test does not always apply. That is, the limit of $\left| \frac{a_{k+1}}{a_k} \right|$ or $\sqrt[k]{|a_k|}$ might
not exist. There exist more sophisticated ways of finding the radius of convergence, but
those would be beyond the scope of this chapter. The above two methods cover many of
the series that arise in practice. Often if the root test applies, so does the ratio test, and vice
versa, though the limit might be easier to compute in one way than the other.
For example, sine is an analytic function and its Taylor series around $x_0 = 0$ is given by
$$\sin(x) = \sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)!} x^{2n+1}.$$
In Figure 7.2 we plot sin(x) and the truncations of the series up to degree 5 and 9. You can
see that the approximation is very good for x near 0, but gets worse for x further away
from 0. This is what happens in general. To get a good approximation far away from x0
you need to take more and more terms of the Taylor series.
Figure 7.2: The sine function and its Taylor approximations around $x_0 = 0$ of 5th and 9th degree.
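The behavior described for Figure 7.2 can be reproduced with a few lines of code. The sketch below evaluates truncations of the sine series and prints the error at a point near 0 and at a point further away; the function name `sine_taylor` and the sample points are our own choices.

```python
import math

def sine_taylor(x, degree):
    """Sum the Taylor series of sin around 0, keeping terms up to the given degree."""
    total = 0.0
    n = 0
    while 2 * n + 1 <= degree:
        total += (-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1)
        n += 1
    return total

if __name__ == "__main__":
    for x in (0.5, 4.0):
        for degree in (5, 9):
            error = abs(sine_taylor(x, degree) - math.sin(x))
            print(x, degree, error)
```

Near $x = 0$ both truncations are very accurate, while at $x = 4$ the 5th degree truncation is far off and the 9th degree one is noticeably better, in line with the remark that more terms are needed further away from $x_0$.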
Notice that the term corresponding to $k = 0$ disappeared as it was constant. The radius of
convergence of the differentiated series is the same as that of the original.
Example 7.1.5: Let us show that the exponential $y = e^x$ solves $y' = y$. First write
$$y = e^x = \sum_{k=0}^\infty \frac{1}{k!} x^k.$$
Now differentiate
$$y' = \sum_{k=1}^\infty k \frac{1}{k!} x^{k-1} = \sum_{k=1}^\infty \frac{1}{(k-1)!} x^{k-1}.$$
We reindex the series by simply replacing $k$ with $k + 1$. The series does not change, what
changes is simply how we write it. After reindexing the series starts at $k = 0$ again.
$$\sum_{k=1}^\infty \frac{1}{(k-1)!} x^{k-1} = \sum_{k+1=1}^\infty \frac{1}{\bigl((k+1)-1\bigr)!} x^{(k+1)-1} = \sum_{k=0}^\infty \frac{1}{k!} x^k.$$
That was precisely the power series for $e^x$ that we started with, so we showed that
$\frac{d}{dx} [e^x] = e^x$.
Convergent power series can be added and multiplied together, and multiplied by
constants using the following rules. First, we can add series by adding term by term,
$$\left( \sum_{k=0}^\infty a_k (x - x_0)^k \right) + \left( \sum_{k=0}^\infty b_k (x - x_0)^k \right) = \sum_{k=0}^\infty (a_k + b_k)(x - x_0)^k.$$
$$\frac{1}{1-x} = \sum_{k=0}^\infty x^k = 1 + x + x^2 + \cdots$$
�.�. POWER SERIES 333
This series is called the geometric series. The ratio test tells us that the radius of convergence
is 1. The series diverges for $x \leq -1$ and $x \geq 1$, even though $\frac{1}{1-x}$ is defined for all $x \neq 1$.
We can use the geometric series together with rules for addition and multiplication of
power series to expand rational functions around a point, as long as the denominator is
not zero at $x_0$. Note that as for polynomials, we could equivalently use the Taylor series
expansion (7.2).
Example 7.1.6: Expand $\frac{x}{1 + 2x + x^2}$ as a power series around the origin ($x_0 = 0$) and find the
radius of convergence.
First, write $1 + 2x + x^2 = (1 + x)^2 = \bigl(1 - (-x)\bigr)^2$. Compute
$$\begin{aligned}
\frac{x}{1 + 2x + x^2} &= x \left( \frac{1}{1 - (-x)} \right)^2 \\
&= x \left( \sum_{k=0}^\infty (-1)^k x^k \right)^2 \\
&= x \left( \sum_{k=0}^\infty c_k x^k \right) \\
&= \sum_{k=0}^\infty c_k x^{k+1},
\end{aligned}$$
where to get $c_k$, we use the formula for the product of series. We obtain $c_0 = 1$,
$c_1 = -1 - 1 = -2$, $c_2 = 1 + 1 + 1 = 3$, etc. Therefore
$$\frac{x}{1 + 2x + x^2} = \sum_{k=1}^\infty (-1)^{k+1} k x^k = x - 2x^2 + 3x^3 - 4x^4 + \cdots$$
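The coefficients $c_k$ in Example 7.1.6 can be generated mechanically. The sketch below assumes that "the formula for the product of series" is the usual Cauchy product $c_k = \sum_{j=0}^k a_j b_{k-j}$; the function name `cauchy_product` is ours.

```python
def cauchy_product(a, b):
    """Coefficients c_k = sum_{j=0}^{k} a_j * b_{k-j} of the product of two power series.

    Only as many coefficients are returned as both input lists can determine.
    """
    n = min(len(a), len(b))
    return [sum(a[j] * b[k - j] for j in range(k + 1)) for k in range(n)]

if __name__ == "__main__":
    # Coefficients of 1/(1+x) = sum (-1)^k x^k, truncated after six terms.
    geometric = [(-1) ** k for k in range(6)]
    c = cauchy_product(geometric, geometric)
    print(c)  # coefficients of 1/(1+x)^2; multiplying by x shifts them by one power
```

The list starts $1, -2, 3, \ldots$, matching $c_0$, $c_1$, $c_2$ in the example, and in general $c_k = (-1)^k (k+1)$.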
k⇤1
x3 + x 1 1 ’
1 ’
1 ’
1
k k k
⇤x+ ⇤x+ ( 1) x x ⇤ x+ ( 2)x k .
x 2 1 1+x 1 x
k⇤0 k⇤0 k⇤3
k odd
7.1.6 Exercises
Exercise 7.1.1: Is the power series $\sum_{k=0}^\infty e^k x^k$ convergent? If so, what is the radius of convergence?
Exercise 7.1.2: Is the power series $\sum_{k=0}^\infty k x^k$ convergent? If so, what is the radius of convergence?
Exercise 7.1.3: Is the power series $\sum_{k=0}^\infty k! x^k$ convergent? If so, what is the radius of convergence?
Exercise 7.1.4: Is the power series $\sum_{k=0}^\infty \frac{1}{(2k)!} (x - 10)^k$ convergent? If so, what is the radius of
convergence?
Exercise 7.1.5: Determine the Taylor series for $\sin x$ around the point $x_0 = \pi$.
Exercise 7.1.6: Determine the Taylor series for $\ln x$ around the point $x_0 = 1$, and find the radius of
convergence.
Exercise 7.1.7: Determine the Taylor series and its radius of convergence of $\frac{1}{1 + x}$ around $x_0 = 0$.
Exercise 7.1.8: Determine the Taylor series and its radius of convergence of $\frac{x}{4 - x^2}$ around $x_0 = 0$.
Hint: You will not be able to use the ratio test.
Exercise 7.1.11: Suppose that $f$ is an analytic function such that $f^{(n)}(0) = n$. Find $f(1)$.
Exercise 7.1.101: Is the power series $\sum_{n=1}^\infty (0.1)^n x^n$ convergent? If so, what is the radius of
convergence?
Exercise 7.1.102 (challenging): Is the power series $\sum_{n=1}^\infty \frac{n!}{n^n} x^n$ convergent? If so, what is the radius
of convergence?
Exercise 7.1.103: Using the geometric series, expand $\frac{1}{1 - x}$ around $x_0 = 2$. For what $x$ does the
series converge?
Exercise 7.1.105 (challenging): Imagine $f$ and $g$ are analytic functions such that $f^{(k)}(0) = g^{(k)}(0)$
for all large enough $k$. What can you say about $f(x) - g(x)$?
�.�. SERIES SOLUTIONS OF LINEAR SECOND ORDER ODES 335
Suppose that $p(x)$, $q(x)$, and $r(x)$ are polynomials. We will try a solution of the form
$$y = \sum_{k=0}^\infty a_k (x - x_0)^k$$
and solve for the $a_k$ to try to obtain a solution defined in some interval around $x_0$.
The point $x_0$ is called an ordinary point if $p(x_0) \neq 0$. That is, the functions
$$\frac{q(x)}{p(x)} \quad \text{and} \quad \frac{r(x)}{p(x)}$$
are defined for $x$ near $x_0$. If $p(x_0) = 0$, then we say $x_0$ is a singular point. Handling singular
points is harder than ordinary points and so we now focus only on ordinary points.
Example 7.2.1: Let us start with a very simple example
$$y'' - y = 0.$$
Let us try a power series solution near $x_0 = 0$, which is an ordinary point. Every point is an
ordinary point in fact, as the equation is constant coefficient. We already know we should
obtain exponentials or the hyperbolic sine and cosine, but let us pretend we do not know
this.
We try
$$y = \sum_{k=0}^\infty a_k x^k.$$
If we differentiate, the $k = 0$ term is a constant and hence disappears. We therefore get
$$y' = \sum_{k=1}^\infty k a_k x^{k-1}.$$
We recognize the two series as the hyperbolic sine and cosine. Therefore,
$$y = a_0 \cosh x + a_1 \sinh x.$$
Of course, in general we will not be able to recognize the series that appears, since
usually there will not be any elementary function that matches it. In that case we will be
content with the series.
Example 7.2.2: Let us do a more complex example. Consider Airy’s equation:
$$y'' - xy = 0,$$
near the point $x_0 = 0$. Note that $x_0 = 0$ is an ordinary point.
Named after the English mathematician Sir George Biddell Airy (1801–1892).
We try
$$y = \sum_{k=0}^\infty a_k x^k.$$
$$y'' = \sum_{k=2}^\infty k (k-1) a_k x^{k-2}.$$
In other words, if we write down the series for $y$, it has two parts
$$\begin{aligned}
y &= \left( a_0 + \frac{a_0}{6} x^3 + \frac{a_0}{180} x^6 + \cdots + \frac{a_0}{(2)(3)(5)(6) \cdots (3n-1)(3n)} x^{3n} + \cdots \right) \\
&\quad + \left( a_1 x + \frac{a_1}{12} x^4 + \frac{a_1}{504} x^7 + \cdots + \frac{a_1}{(3)(4)(6)(7) \cdots (3n)(3n+1)} x^{3n+1} + \cdots \right) \\
&= a_0 \left( 1 + \frac{1}{6} x^3 + \frac{1}{180} x^6 + \cdots + \frac{1}{(2)(3)(5)(6) \cdots (3n-1)(3n)} x^{3n} + \cdots \right) \\
&\quad + a_1 \left( x + \frac{1}{12} x^4 + \frac{1}{504} x^7 + \cdots + \frac{1}{(3)(4)(6)(7) \cdots (3n)(3n+1)} x^{3n+1} + \cdots \right).
\end{aligned}$$
We define
$$y_1(x) = 1 + \frac{1}{6} x^3 + \frac{1}{180} x^6 + \cdots + \frac{1}{(2)(3)(5)(6) \cdots (3n-1)(3n)} x^{3n} + \cdots,$$
$$y_2(x) = x + \frac{1}{12} x^4 + \frac{1}{504} x^7 + \cdots + \frac{1}{(3)(4)(6)(7) \cdots (3n)(3n+1)} x^{3n+1} + \cdots,$$
and write the general solution to the equation as $y(x) = a_0 y_1(x) + a_1 y_2(x)$. If we plug
in $x = 0$ into the power series for $y_1$ and $y_2$, we find $y_1(0) = 1$ and $y_2(0) = 0$. Similarly,
$y_1'(0) = 0$ and $y_2'(0) = 1$. Therefore $y = a_0 y_1 + a_1 y_2$ is a solution that satisfies the initial
conditions $y(0) = a_0$ and $y'(0) = a_1$.
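The coefficients of $y_1$ and $y_2$ can be generated by a short program. The sketch below assumes the recurrence obtained by matching powers of $x$ in $y'' = xy$, namely $a_2 = 0$ and $a_m = \frac{a_{m-3}}{m(m-1)}$ for $m \geq 3$; this recurrence is derived here for illustration and the function name `airy_coefficients` is hypothetical.

```python
from fractions import Fraction

def airy_coefficients(a0, a1, count):
    """First `count` power series coefficients of a solution of y'' - x y = 0 at x0 = 0.

    Matching the coefficient of x^(m-2) on both sides of y'' = x y gives
    m (m - 1) a_m = a_(m-3) for m >= 3, together with a_2 = 0.
    """
    a = [Fraction(a0), Fraction(a1), Fraction(0)]
    for m in range(3, count):
        a.append(a[m - 3] / (m * (m - 1)))
    return a[:count]

if __name__ == "__main__":
    print(airy_coefficients(1, 0, 8))  # coefficients of y1
    print(airy_coefficients(0, 1, 8))  # coefficients of y2
```

The first call reproduces the coefficients $1$, $\frac{1}{6}$, $\frac{1}{180}$ of $y_1$ and the second reproduces $1$, $\frac{1}{12}$, $\frac{1}{504}$ of $y_2$.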
The functions $y_1$ and $y_2$ cannot be written in terms of the elementary functions that you
know. See Figure 7.3 for the plot of the solutions $y_1$ and $y_2$. These functions have many
interesting properties. For example, they are oscillatory for negative $x$ (like solutions to
$y'' + y = 0$) and for positive $x$ they grow without bound (like solutions to $y'' - y = 0$).
Sometimes a solution may turn out to be a polynomial.
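Example 7.2.3: Let us find a solution to the so-called Hermite’s equation of order $n$:
$$y'' - 2xy' + 2ny = 0.$$
Named after the French mathematician Charles Hermite (1822–1901).
We try $y = \sum_{k=0}^\infty a_k x^k$ and plug it into the equation:
$$\begin{aligned}
0 &= y'' - 2xy' + 2ny \\
&= \left( \sum_{k=2}^\infty k(k-1) a_k x^{k-2} \right) - 2x \left( \sum_{k=1}^\infty k a_k x^{k-1} \right) + 2n \left( \sum_{k=0}^\infty a_k x^k \right) \\
&= \left( \sum_{k=2}^\infty k(k-1) a_k x^{k-2} \right) - \left( \sum_{k=1}^\infty 2k a_k x^k \right) + \left( \sum_{k=0}^\infty 2n a_k x^k \right) \\
&= \left( 2a_2 + \sum_{k=1}^\infty (k+2)(k+1) a_{k+2} x^k \right) - \left( \sum_{k=1}^\infty 2k a_k x^k \right) + \left( 2n a_0 + \sum_{k=1}^\infty 2n a_k x^k \right) \\
&= 2a_2 + 2n a_0 + \sum_{k=1}^\infty \bigl( (k+2)(k+1) a_{k+2} - 2k a_k + 2n a_k \bigr) x^k.
\end{aligned}$$
As $y'' - 2xy' + 2ny = 0$ we have
$$(k+2)(k+1) a_{k+2} + (-2k + 2n) a_k = 0, \quad \text{or} \quad a_{k+2} = \frac{(2k - 2n)}{(k+2)(k+1)} a_k.$$
This recurrence relation actually includes $a_2 = -n a_0$ (which comes about from $2a_2 + 2n a_0 =
0$). Again $a_0$ and $a_1$ are arbitrary.
$$a_2 = \frac{-2n}{(2)(1)} a_0, \qquad a_3 = \frac{2(1-n)}{(3)(2)} a_1,$$
$$a_4 = \frac{2(2-n)}{(4)(3)} a_2 = \frac{2^2 (2-n)(-n)}{(4)(3)(2)(1)} a_0,$$
$$a_5 = \frac{2(3-n)}{(5)(4)} a_3 = \frac{2^2 (3-n)(1-n)}{(5)(4)(3)(2)} a_1, \quad \ldots$$
Let us separate the even and odd coefficients. We find that
$$a_{2m} = \frac{2^m (-n)(2-n) \cdots (2m-2-n)}{(2m)!} a_0,$$
$$a_{2m+1} = \frac{2^m (1-n)(3-n) \cdots (2m-1-n)}{(2m+1)!} a_1.$$
Let us write down the two series, one with the even powers and one with the odd.
For example, for $n = 4$ the series with the even powers (taking $a_0 = 1$) terminates, and we obtain
$$y_1(x) = 1 + \frac{2(-4)}{2!} x^2 + \frac{2^2 (-4)(2-4)}{4!} x^4 = 1 - 4x^2 + \frac{4}{3} x^4.$$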
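The recurrence $a_{k+2} = \frac{2k - 2n}{(k+2)(k+1)} a_k$ is easy to iterate by machine, which also shows when a series terminates in a polynomial. The following sketch uses exact fractions; the function name `hermite_series_coefficients` is our own.

```python
from fractions import Fraction

def hermite_series_coefficients(n, a0, a1, count):
    """First `count` coefficients of a power series solution of y'' - 2x y' + 2n y = 0.

    Uses the recurrence a_(k+2) = (2k - 2n) / ((k+2)(k+1)) * a_k with a_0, a_1 given.
    """
    a = [Fraction(a0), Fraction(a1)]
    for k in range(count - 2):
        a.append(Fraction(2 * k - 2 * n, (k + 2) * (k + 1)) * a[k])
    return a[:count]

if __name__ == "__main__":
    # n = 4 with a0 = 1, a1 = 0: the even series stops after the x^4 term.
    print(hermite_series_coefficients(4, 1, 0, 8))
```

For $n = 4$ the nonzero coefficients are $1$, $-4$, $\frac{4}{3}$, which is the polynomial $1 - 4x^2 + \frac{4}{3} x^4$ found above. All later even coefficients vanish because the factor $2k - 2n$ is zero at $k = n = 4$.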
7.2.1 Exercises
In the following exercises, when asked to solve an equation using power series methods,
you should find the first few terms of the series, and if possible find a general formula for
the $k$th coefficient.
Exercise 7.2.5: The methods work for other orders than second order. Try the methods of this section
to solve the first order system $y' - xy = 0$ at the point $x_0 = 0$.
Exercise 7.2.8:
b) Use the solution to part a) to find a solution for $xy'' + y = 0$ around the point $x_0 = 1$.
Exercise 7.2.102 (challenging): Power series methods also work for nonhomogeneous equations.
a) Use power series methods to solve $y'' - xy = \frac{1}{1 - x}$ at the point $x_0 = 0$. Hint: Recall the
geometric series.
Exercise 7.2.103: Attempt to solve $x^2 y'' - y = 0$ at $x_0 = 0$ using the power series method of this
section ($x_0$ is a singular point). Can you find at least one solution? Can you find more than one
solution?
7.3.1 Examples
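Example 7.3.1: Let us first look at a simple first order equation
$$2xy' - y = 0.$$
we obtain
$$\begin{aligned}
0 = 2xy' - y &= 2x \left( \sum_{k=1}^\infty k a_k x^{k-1} \right) - \left( \sum_{k=0}^\infty a_k x^k \right) \\
&= -a_0 + \sum_{k=1}^\infty (2k a_k - a_k) x^k.
\end{aligned}$$
$$y = x^r f(x)$$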
�.�. SINGULAR POINTS AND THE METHOD OF FROBENIUS 343
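$$4x^2 y'' - 4x^2 y' + (1 - 2x) y = 0,$$
$$\begin{aligned}
0 &= 4x^2 y'' - 4x^2 y' + (1 - 2x) y \\
&= 4x^2 \left( \sum_{k=0}^\infty (k+r)(k+r-1) a_k x^{k+r-2} \right) - 4x^2 \left( \sum_{k=0}^\infty (k+r) a_k x^{k+r-1} \right) + (1 - 2x) \left( \sum_{k=0}^\infty a_k x^{k+r} \right) \\
&= \left( \sum_{k=0}^\infty 4(k+r)(k+r-1) a_k x^{k+r} \right) - \left( \sum_{k=0}^\infty 4(k+r) a_k x^{k+r+1} \right) + \left( \sum_{k=0}^\infty a_k x^{k+r} \right) - \left( \sum_{k=0}^\infty 2 a_k x^{k+r+1} \right) \\
&= \left( \sum_{k=0}^\infty 4(k+r)(k+r-1) a_k x^{k+r} \right) - \left( \sum_{k=1}^\infty 4(k+r-1) a_{k-1} x^{k+r} \right) + \left( \sum_{k=0}^\infty a_k x^{k+r} \right) - \left( \sum_{k=1}^\infty 2 a_{k-1} x^{k+r} \right) \\
&= 4r(r-1) a_0 x^r + a_0 x^r + \sum_{k=1}^\infty \Bigl( 4(k+r)(k+r-1) a_k - 4(k+r-1) a_{k-1} + a_k - 2 a_{k-1} \Bigr) x^{k+r} \\
&= \bigl( 4r(r-1) + 1 \bigr) a_0 x^r + \sum_{k=1}^\infty \Bigl( \bigl( 4(k+r)(k+r-1) + 1 \bigr) a_k - \bigl( 4(k+r-1) + 2 \bigr) a_{k-1} \Bigr) x^{k+r}.
\end{aligned}$$
$$4r(r-1) + 1 = 0.$$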
344 CHAPTER �. POWER SERIES METHODS
This equation is called the indicial equation. This particular indicial equation has a double
root at $r = 1/2$.
OK, so we know what $r$ has to be. That knowledge we obtained simply by looking at
the coefficient of $x^r$. All other coefficients of $x^{k+r}$ also have to be zero so
\[ \bigl( 4(k+r)(k+r-1) + 1 \bigr) a_k - \bigl( 4(k+r-1) + 2 \bigr) a_{k-1} = 0 . \]
If we plug in $r = 1/2$ and solve for $a_k$, we get
\[ a_k = \frac{4(k + 1/2 - 1) + 2}{4(k + 1/2)(k + 1/2 - 1) + 1} \, a_{k-1} = \frac{1}{k} \, a_{k-1} . \]
Let us set $a_0 = 1$. Then
\[ a_1 = \frac{1}{1} a_0 = 1, \quad a_2 = \frac{1}{2} a_1 = \frac{1}{2}, \quad a_3 = \frac{1}{3} a_2 = \frac{1}{3 \cdot 2}, \quad a_4 = \frac{1}{4} a_3 = \frac{1}{4 \cdot 3 \cdot 2}, \quad \ldots \]
Extrapolating, we notice that
\[ a_k = \frac{1}{k(k-1)(k-2) \cdots 3 \cdot 2} = \frac{1}{k!} . \]
In other words,
\[ y = \sum_{k=0}^\infty a_k x^{k+r} = \sum_{k=0}^\infty \frac{1}{k!} x^{k+1/2} = x^{1/2} \sum_{k=0}^\infty \frac{1}{k!} x^k = x^{1/2} e^x . \]
That was lucky! In general, we will not be able to write the series in terms of elementary
functions.
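The recurrence above is easy to check by machine. The following is a minimal sketch, not part of the original notes: the helper names `frobenius_coefficients` and `partial_sum` are made up for illustration. It builds the coefficients from $a_k = a_{k-1}/k$ with $a_0 = 1$ and compares a partial sum of the series with $x^{1/2} e^x$.

```python
import math
from fractions import Fraction


def frobenius_coefficients(n_terms):
    """Return a_0, ..., a_{n_terms-1} from the recurrence a_k = a_{k-1} / k, a_0 = 1."""
    coeffs = [Fraction(1)]
    for k in range(1, n_terms):
        coeffs.append(coeffs[-1] / k)
    return coeffs


def partial_sum(x, n_terms=30, r=0.5):
    """Evaluate the partial sum of a_k x^(k + r) over k < n_terms, for x > 0."""
    coeffs = frobenius_coefficients(n_terms)
    return sum(float(a) * x ** (k + r) for k, a in enumerate(coeffs))


if __name__ == "__main__":
    # The coefficients should be 1, 1, 1/2, 1/6, 1/24, that is, 1/k!.
    print(frobenius_coefficients(5))
    x = 2.0
    # The two printed numbers should agree to many digits.
    print(partial_sum(x), math.sqrt(x) * math.exp(x))
```

Exact fractions are used for the coefficients so that the pattern $1/k!$ is visible without rounding.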
We have one solution, let us call it $y_1 = x^{1/2} e^x$. But what about a second solution? If we
want a general solution, we need two linearly independent solutions. Picking $a_0$ to be a
different constant only gets us a constant multiple of $y_1$, and we do not have any other $r$ to
try; we only have one solution to the indicial equation. Well, there are powers of $x$ floating
around and we are taking derivatives, perhaps the logarithm (the antiderivative of $x^{-1}$) is
around as well. It turns out we want to try for another solution of the form
\[ y_2 = \sum_{k=0}^\infty b_k x^{k+r} + (\ln x) y_1 , \]
be an ODE. As before, if $p(x_0) = 0$, then $x_0$ is a singular point. If, furthermore, the limits
\[ \lim_{x \to x_0} \, (x - x_0) \frac{q(x)}{p(x)} \qquad \text{and} \qquad \lim_{x \to x_0} \, (x - x_0)^2 \frac{r(x)}{p(x)} \]
both exist and are finite, then we say that $x_0$ is a regular singular point.
Example 7.3.3: Often, and for the rest of this section, $x_0 = 0$. Consider
\[ x^2 y'' + x(1+x) y' + (\pi + x^2) y = 0 . \]
Write
\[
\begin{aligned}
\lim_{x \to 0} \, x \frac{q(x)}{p(x)} &= \lim_{x \to 0} \, x \frac{x(1+x)}{x^2} = \lim_{x \to 0} \, (1+x) = 1 , \\
\lim_{x \to 0} \, x^2 \frac{r(x)}{p(x)} &= \lim_{x \to 0} \, x^2 \frac{(\pi + x^2)}{x^2} = \lim_{x \to 0} \, (\pi + x^2) = \pi .
\end{aligned}
\]
\[ x^2 y'' + (1+x) y' + (\pi + x^2) y = 0 , \]
then
\[ \lim_{x \to 0} \, x \frac{q(x)}{p(x)} = \lim_{x \to 0} \, x \frac{(1+x)}{x^2} = \lim_{x \to 0} \, \frac{1+x}{x} = \text{DNE} . \]
Here DNE stands for does not exist. The point 0 is a singular point, but not a regular singular
point.
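The two limits can be explored numerically. The sketch below is only an illustration under stated assumptions: the helper `scaled_ratios` and the dictionaries `first` and `second` are hypothetical names, and sampling a few small values of $x$ suggests, but does not prove, whether a limit exists.

```python
import math


def scaled_ratios(p, q, r, x):
    """Return x*q(x)/p(x) and x^2*r(x)/p(x); their limits as x -> 0 decide regularity."""
    return x * q(x) / p(x), x * x * r(x) / p(x)


# Coefficients of x^2 y'' + x(1+x) y' + (pi + x^2) y = 0.
first = dict(p=lambda x: x**2, q=lambda x: x * (1 + x), r=lambda x: math.pi + x**2)
# Coefficients of x^2 y'' + (1+x) y' + (pi + x^2) y = 0.
second = dict(p=lambda x: x**2, q=lambda x: 1 + x, r=lambda x: math.pi + x**2)

if __name__ == "__main__":
    for x in (1e-1, 1e-3, 1e-6):
        # For the first equation the pair settles near (1, pi);
        # for the second, the first entry grows like 1/x.
        print(x, scaled_ratios(x=x, **first), scaled_ratios(x=x, **second))
```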
Let us now discuss the general Method of Frobenius. We only consider the method at
the point $x = 0$ for simplicity. The main idea is the following theorem.
Theorem 7.3.1 (Method of Frobenius). Suppose that
has a regular singular point at $x = 0$, then there exists at least one solution of the form
\[ y = x^r \sum_{k=0}^\infty a_k x^k . \]
\[ y = \sum_{k=0}^\infty a_k x^{k+r} . \]
We plug this $y$ into equation (7.3). We collect terms and write everything as a single
series.
(ii) The obtained series must be zero. Setting the first coefficient (usually the coefficient of
$x^r$) in the series to zero we obtain the indicial equation, which is a quadratic polynomial
in $r$.
(iii) If the indicial equation has two real roots $r_1$ and $r_2$ such that $r_1 - r_2$ is not an integer,
then we have two linearly independent Frobenius-type solutions. Using the first root,
we plug in
\[ y_1 = x^{r_1} \sum_{k=0}^\infty a_k x^k , \]
and we solve for all $a_k$ to obtain the first solution. Then using the second root, we
plug in
\[ y_2 = x^{r_2} \sum_{k=0}^\infty b_k x^k , \]
(iv) If the indicial equation has a doubled root $r$, then we find one solution
\[ y_1 = x^r \sum_{k=0}^\infty a_k x^k , \]
\[ y_2 = x^r \sum_{k=0}^\infty b_k x^k + (\ln x) y_1 , \]
(v) If the indicial equation has two real roots such that $r_1 - r_2$ is an integer, then one
solution is
\[ y_1 = x^{r_1} \sum_{k=0}^\infty a_k x^k , \]
Exercise 7.3.1:
b) Suppose $p$ is not an integer. Carry out the computation to obtain the solutions $y_1$ and $y_2$
above.
Bessel functions are convenient constant multiples of $y_1$ and $y_2$. First we must define
the gamma function
\[ \Gamma(x) = \int_0^\infty t^{x-1} e^{-t} \, dt . \]
Notice that $\Gamma(1) = 1$. The gamma function also has a wonderful property
\[ \Gamma(x+1) = x \Gamma(x) . \]
From this property, it follows that $\Gamma(n) = (n-1)!$ when $n$ is an integer. So the gamma
function is a continuous version of the factorial. We compute:
\[
\begin{aligned}
J_p(x) &= \frac{1}{2^p \Gamma(1+p)} \, y_1 = \sum_{k=0}^\infty \frac{(-1)^k}{k! \, \Gamma(k+p+1)} \left( \frac{x}{2} \right)^{2k+p} , \\
J_{-p}(x) &= \frac{1}{2^{-p} \Gamma(1-p)} \, y_2 = \sum_{k=0}^\infty \frac{(-1)^k}{k! \, \Gamma(k-p+1)} \left( \frac{x}{2} \right)^{2k-p} .
\end{aligned}
\]
As these are constant multiples of the solutions we found above, these are both solutions
to Bessel’s equation of order p. The constants are picked for convenience.
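As a sanity check, the gamma function property and the series for $J_p$ can be evaluated with the Python standard library. This is a sketch only: `bessel_j` is a hypothetical helper that sums the first few terms of the series, and it assumes $x > 0$ and that $p$ is not a negative integer (where `math.gamma` is undefined). The comparison uses the standard closed forms $J_{1/2}(x) = \sqrt{2/(\pi x)} \sin x$ and $J_{-1/2}(x) = \sqrt{2/(\pi x)} \cos x$, which are not derived in the text above.

```python
import math


def bessel_j(p, x, n_terms=40):
    """Partial sum of the series for J_p(x); assumes x > 0 and p not a negative integer."""
    total = 0.0
    for k in range(n_terms):
        coefficient = (-1) ** k / (math.factorial(k) * math.gamma(k + p + 1))
        total += coefficient * (x / 2) ** (2 * k + p)
    return total


if __name__ == "__main__":
    # Gamma(n) = (n-1)! for positive integers, and Gamma(x+1) = x Gamma(x) in general.
    print(math.gamma(5), math.factorial(4))
    print(math.gamma(3.5), 2.5 * math.gamma(2.5))
    # Half-integer orders have elementary closed forms to compare against.
    x = 1.5
    print(bessel_j(0.5, x), math.sqrt(2 / (math.pi * x)) * math.sin(x))
    print(bessel_j(-0.5, x), math.sqrt(2 / (math.pi * x)) * math.cos(x))
```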
When $p$ is not an integer, $J_p$ and $J_{-p}$ are linearly independent. When $n$ is an integer we
obtain
\[ J_n(x) = \sum_{k=0}^\infty \frac{(-1)^k}{k! \, (k+n)!} \left( \frac{x}{2} \right)^{2k+n} . \]
In this case
\[ J_n(x) = (-1)^n J_{-n}(x) , \]
and so $J_{-n}$ is not a second linearly independent solution. The other solution is the so-called
Bessel function of second kind. These make sense only for integer orders $n$ and are defined as
limits of linear combinations of $J_p(x)$ and $J_{-p}(x)$, as $p$ approaches $n$ in the following way:
Each linear combination of $J_p(x)$ and $J_{-p}(x)$ is a solution to Bessel's equation of order $p$.
Then as we take the limit as $p$ goes to $n$, we see that $Y_n(x)$ is a solution to Bessel's equation
of order $n$. It also turns out that $Y_n(x)$ and $J_n(x)$ are linearly independent. Therefore when
$n$ is an integer, we have the general solution to Bessel's equation of order $n$:
\[ y = A J_n(x) + B Y_n(x) , \]
for arbitrary constants $A$ and $B$. Note that $Y_n(x)$ goes to negative infinity at $x = 0$. Many
mathematical software packages have these functions $J_n(x)$ and $Y_n(x)$ defined, so they
can be used just like say $\sin(x)$ and $\cos(x)$. In fact, Bessel functions have some similar
properties. For example, $-J_1(x)$ is a derivative of $J_0(x)$, and in general the derivative of
$J_n(x)$ can be written as a linear combination of $J_{n-1}(x)$ and $J_{n+1}(x)$. Furthermore, these
functions oscillate, although they are not periodic. See Figure 7.4 for graphs of Bessel
functions.
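The derivative properties can be tested numerically from the series for integer order. The sketch below is an illustration, not a definitive implementation: `bessel_jn` and `derivative` are hypothetical helpers, the derivative is a plain central difference, and the specific linear combination $J_n' = \frac{1}{2}(J_{n-1} - J_{n+1})$ is the standard identity, which the text does not spell out.

```python
import math


def bessel_jn(n, x, n_terms=40):
    """Partial sum of the series for J_n(x), for an integer n >= 0."""
    return sum(
        (-1) ** k / (math.factorial(k) * math.factorial(k + n)) * (x / 2) ** (2 * k + n)
        for k in range(n_terms)
    )


def derivative(f, x, h=1e-5):
    """Central difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


if __name__ == "__main__":
    x = 3.0
    # J_0'(x) should match -J_1(x).
    print(derivative(lambda t: bessel_jn(0, t), x), -bessel_jn(1, x))
    # J_2'(x) should match (J_1(x) - J_3(x)) / 2.
    print(derivative(lambda t: bessel_jn(2, t), x), (bessel_jn(1, x) - bessel_jn(3, x)) / 2)
```

In practice one would use a library implementation of the Bessel functions; the partial sum is only meant to mirror the formula in the text for moderate values of $x$.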
Figure 7.4: Plot of $J_0(x)$ and $J_1(x)$ in the first graph and $Y_0(x)$ and $Y_1(x)$ in the second graph.
Example 7.3.4: Other equations can sometimes be solved in terms of the Bessel functions.
For example, given a positive constant $\lambda$,
\[ x y'' + y' + \lambda^2 x y = 0 , \]
can be changed to $x^2 y'' + x y' + \lambda^2 x^2 y = 0$. Then changing variables $t = \lambda x$, we obtain via
chain rule the equation in $y$ and $t$:
\[ t^2 y'' + t y' + t^2 y = 0 , \]
which we recognize as Bessel's equation of order 0. Therefore the general solution is
$y(t) = A J_0(t) + B Y_0(t)$, or in terms of $x$:
\[ y = A J_0(\lambda x) + B Y_0(\lambda x) . \]
This equation comes up, for example, when finding the fundamental modes of vibration of
a circular drum, but we digress.
7.3.4 Exercises
Exercise 7.3.3: Find a particular (Frobenius-type) solution of $x^2 y'' + x y' + (1+x) y = 0$.
Exercise 7.3.8: In the following equations classify the point $x = 0$ as ordinary, regular singular,
or singular but not regular singular.
a) $x^2 (1 + x^2) y'' + x y = 0$ b) $x^2 y'' + y' + y = 0$
c) $x y'' + x^3 y' + y = 0$ d) $x y'' + x y' - e^x y = 0$
e) $x^2 y'' + x^2 y' + x^2 y = 0$
Exercise 7.3.101: In the following equations classify the point $x = 0$ as ordinary, regular singular,
or singular but not regular singular.
a) $y'' + y = 0$ b) $x^3 y'' + (1+x) y = 0$
c) $x y'' + x^5 y' + y = 0$ d) $\sin(x) y'' - y = 0$
e) $\cos(x) y'' - \sin(x) y = 0$
Nonlinear systems
in some qualitative idea of what the solution is doing. For example, what happens as time
goes to infinity?
where f (x, y) and g(x, y) are functions of two variables, and the derivatives are taken with
respect to time t. Solutions are functions x(t) and y(t) such that
The way we will analyze the system is very similar to § 1.6, where we studied a single
autonomous equation. The ideas in two dimensions are the same, but the behavior can be
far more complicated.
It may be best to think of the system of equations as the single vector equation
\[ \begin{bmatrix} x \\ y \end{bmatrix}' = \begin{bmatrix} f(x,y) \\ g(x,y) \end{bmatrix} . \qquad (8.1) \]
As in § 3.1 we draw the phase portrait (or phase diagram), where each point $(x,y)$ corresponds
to a specific state of the system. We draw the vector field given at each point $(x,y)$ by the
vector $\begin{bmatrix} f(x,y) \\ g(x,y) \end{bmatrix}$. And as before if we find solutions, we draw the trajectories by plotting all
points $\bigl( x(t), y(t) \bigr)$ for a certain range of $t$.
Example 8.1.1: Consider the second order equation $x'' = -x + x^2$. Write this equation as a
first order nonlinear system
\[ x' = y, \qquad y' = -x + x^2 . \]
The phase portrait with some trajectories is drawn in Figure 8.1 on the facing page.
From the phase portrait it should be clear that even this simple system has fairly
complicated behavior. Some trajectories keep oscillating around the origin, and some go
off towards infinity. We will return to this example often, and analyze it completely in this
(and the next) section.
If we zoom into the diagram near a point where $\begin{bmatrix} f(x,y) \\ g(x,y) \end{bmatrix}$ is not zero, then nearby the
arrows point generally in essentially that same direction and have essentially the same
magnitude. In other words the behavior is not that interesting near such a point. We are of
course assuming that $f(x,y)$ and $g(x,y)$ are continuous.
Let us concentrate on those points in the phase diagram above where the trajectories
seem to start, end, or go around. We see two such points: (0, 0) and (1, 0). The trajectories
seem to go around the point (0, 0), and they seem to either go in or out of the point (1, 0).
[Figure 8.1: phase portrait with some trajectories of $x' = y$, $y' = -x + x^2$.]
These points are precisely those points where the derivatives of both $x$ and $y$ are zero. Let
us define the critical points as the points $(x,y)$ such that
\[ \begin{bmatrix} f(x,y) \\ g(x,y) \end{bmatrix} = \vec{0} . \]
In other words, these are the points where both $f(x,y) = 0$ and $g(x,y) = 0$.
The critical points are where the behavior of the system is in some sense the most
complicated. If $\begin{bmatrix} f(x,y) \\ g(x,y) \end{bmatrix}$ is zero, then nearby, the vector can point in any direction whatsoever.
Also, the trajectories are either going towards, away from, or around these points, so if we
are looking for long-term qualitative behavior of the system, we should look at what is
happening near the critical points.
Critical points are also sometimes called equilibria, since we have so-called equilibrium
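solutions at critical points. If $(x_0, y_0)$ is a critical point, then we have the solutions
\[ x(t) = x_0, \qquad y(t) = y_0 . \]
In Example 8.1.1 on the preceding page, there are two equilibrium solutions:
\[ x(t) = 0, \quad y(t) = 0, \qquad \text{and} \qquad x(t) = 1, \quad y(t) = 0 . \]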
Compare this discussion on equilibria to the discussion in § 1.6. The underlying concept is
exactly the same.
8.1.2 Linearization
In § 3.5 we studied the behavior of a homogeneous linear system of two equations near a
critical point. For a linear system of two variables given by an invertible matrix, the only
critical point is the origin (0, 0). Let us put the understanding we gained in that section to
good use understanding what happens near critical points of nonlinear systems.
In calculus we learned to estimate a function by taking its derivative and linearizing.
We work similarly with nonlinear systems of ODE. Suppose $(x_0, y_0)$ is a critical point. First
change variables to $(u,v)$, so that $(u,v) = (0,0)$ corresponds to $(x_0, y_0)$. That is,
\[ u = x - x_0, \qquad v = y - y_0 . \]
Next we need to find the derivative. In multivariable calculus you may have seen that the
several variables version of the derivative is the Jacobian matrix. The Jacobian matrix of
the vector-valued function $\begin{bmatrix} f(x,y) \\ g(x,y) \end{bmatrix}$ at $(x_0, y_0)$ is
\[
\begin{bmatrix}
\frac{\partial f}{\partial x}(x_0,y_0) & \frac{\partial f}{\partial y}(x_0,y_0) \\
\frac{\partial g}{\partial x}(x_0,y_0) & \frac{\partial g}{\partial y}(x_0,y_0)
\end{bmatrix} .
\]
This matrix gives the best linear approximation as $u$ and $v$ (and therefore $x$ and $y$) vary.
We define the linearization of the equation (8.1) as the linear system
\[
\begin{bmatrix} u \\ v \end{bmatrix}' =
\begin{bmatrix}
\frac{\partial f}{\partial x}(x_0,y_0) & \frac{\partial f}{\partial y}(x_0,y_0) \\
\frac{\partial g}{\partial x}(x_0,y_0) & \frac{\partial g}{\partial y}(x_0,y_0)
\end{bmatrix}
\begin{bmatrix} u \\ v \end{bmatrix} .
\]
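Example 8.1.2: Let us keep with the same equations as Example 8.1.1: $x' = y$, $y' = -x + x^2$.
There are two critical points, $(0,0)$ and $(1,0)$. The Jacobian matrix at any point is
\[
\begin{bmatrix}
\frac{\partial f}{\partial x}(x,y) & \frac{\partial f}{\partial y}(x,y) \\
\frac{\partial g}{\partial x}(x,y) & \frac{\partial g}{\partial y}(x,y)
\end{bmatrix}
=
\begin{bmatrix} 0 & 1 \\ -1 + 2x & 0 \end{bmatrix} .
\]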
The phase diagrams of the two linearizations at the point (0, 0) and (1, 0) are given in
Figure 8.2 on the facing page. Note that the variables are now u and v. Compare Figure 8.2
with Figure 8.1 on the previous page, and look especially at the behavior near the critical
points.
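The linearizations of Example 8.1.2 can be computed with a few lines of code. The following is a sketch under stated assumptions: the names `f`, `g`, `jacobian`, and `eigenvalues_2x2` are chosen here for illustration, the Jacobian entries are typed in by hand from the formula above, and the eigenvalues of a $2 \times 2$ matrix are found with the quadratic formula applied to $\lambda^2 - (\text{trace}) \lambda + \det = 0$.

```python
import cmath


def f(x, y):
    return y


def g(x, y):
    return -x + x**2


def jacobian(x, y):
    """Jacobian matrix of (f, g) at (x, y), entered by hand from the partial derivatives."""
    return [[0.0, 1.0], [-1.0 + 2.0 * x, 0.0]]


def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix via the characteristic polynomial."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


if __name__ == "__main__":
    for point in [(0.0, 0.0), (1.0, 0.0)]:
        # Both f and g vanish at a critical point.
        print(point, f(*point), g(*point), jacobian(*point), eigenvalues_2x2(jacobian(*point)))
```

At $(0,0)$ the matrix of the linearization is $\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$ with eigenvalues $\pm i$, and at $(1,0)$ it is $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ with eigenvalues $\pm 1$.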
Named for the German mathematician Carl Gustav Jacob Jacobi (1804–1851).
Figure 8.2: Phase diagram with some trajectories of linearizations at the critical points $(0,0)$ (left) and
$(1,0)$ (right) of $x' = y$, $y' = -x + x^2$.
8.1.3 Exercises
Exercise 8.1.1: Sketch the phase plane vector field for:
a) $x' = x^2$, $y' = y^2$, b) $x' = (x-y)^2$, $y' = -x$, c) $x' = e^y$, $y' = e^x$.
Exercise 8.1.2: Match systems
1) $x' = x^2$, $y' = y^2$, 2) $x' = xy$, $y' = 1 + y^2$, 3) $x' = \sin(\pi y)$, $y' = x$,
to the vector fields below. Justify.
a) b) c)
Exercise 8.1.3: Find the critical points and linearizations of the following systems.
a) $x' = x^2 - y^2$, $y' = x^2 + y^2 - 1$, b) $x' = -y$, $y' = 3x + yx^2$,
c) $x' = x^2 + y$, $y' = y^2 + x$.
Exercise 8.1.4: For the following systems, verify they have critical point at $(0,0)$, and find the
linearization at $(0,0)$.
a) $x' = x + 2y + x^2 - y^2$, $y' = 2y - x^2$ b) $x' = -y$, $y' = x - y^3$
c) $x' = ax + by + f(x,y)$, $y' = cx + dy + g(x,y)$, where $f(0,0) = 0$, $g(0,0) = 0$, and all first
partial derivatives of $f$ and $g$ are also zero at $(0,0)$, that is, $\frac{\partial f}{\partial x}(0,0) = \frac{\partial f}{\partial y}(0,0) = \frac{\partial g}{\partial x}(0,0) = \frac{\partial g}{\partial y}(0,0) = 0$.
b) Sketch a phase diagram and describe the behavior near the critical point(s).
b) Sketch a phase diagram and describe the behavior near the critical point(s).
Exercise 8.1.101: Find the critical points and linearizations of the following systems.
a) $x' = \sin(\pi y) + (x-1)^2$, $y' = y^2 - y$, b) $x' = x + y + y^2$, $y' = x$,
c) $x' = (x-1)^2 + y$, $y' = x^2 + y$.
1) $x' = y^2$, $y' = -x^2$, 2) $x' = y$, $y' = (x-1)(x+1)$,
3) $x' = y + x^2$, $y' = -x$,
a) b) c)
Exercise 8.1.103: The idea of critical points and linearization works in higher dimensions as well.
You simply make the Jacobian matrix bigger by adding more functions and more variables. For the
following system of 3 equations find the critical points and their linearizations:
\[ x' = x + z^2, \qquad y' = z^2 - y, \qquad z' = z + x^2 . \]
Table 8.1: Behavior of an almost linear system near an isolated critical point.
stable. Informally, a critical point is stable if, when we start close to it and follow a trajectory,
we either go towards, or at least not away from, this critical point.
A stable critical point $(x_0, y_0)$ is called asymptotically stable if given any initial condition
sufficiently close to $(x_0, y_0)$ and any solution $\bigl( x(t), y(t) \bigr)$ satisfying that condition, then
\[ \lim_{t \to \infty} \bigl( x(t), y(t) \bigr) = (x_0, y_0) . \]
That is, the critical point is asymptotically stable if any trajectory for a sufficiently close
initial condition goes towards the critical point $(x_0, y_0)$.
Example 8.2.1: Consider $x' = -y - x^2$, $y' = -x + y^2$. See Figure 8.3 on the facing page for
the phase diagram. Let us find the critical points. These are the points where $-y - x^2 = 0$
and $-x + y^2 = 0$. The first equation means $y = -x^2$, and so $y^2 = x^4$. Plugging into the
second equation we obtain $-x + x^4 = 0$. Factoring we obtain $x(1 - x^3) = 0$. Since we are
looking only for real solutions we get either $x = 0$ or $x = 1$. Solving for the corresponding
$y$ using $y = -x^2$, we get two critical points, one being $(0,0)$ and the other being $(1,-1)$.
Clearly the critical points are isolated.
Let us compute the Jacobian matrix:
\[ \begin{bmatrix} -2x & -1 \\ -1 & 2y \end{bmatrix} . \]
At the point $(0,0)$ we get the matrix $\begin{bmatrix} 0 & -1 \\ -1 & 0 \end{bmatrix}$ and so the two eigenvalues are $1$ and $-1$. As
the matrix is invertible, the system is almost linear at $(0,0)$. As the eigenvalues are real and
of opposite signs, we get a saddle point, which is an unstable equilibrium point.
At the point $(1,-1)$ we get the matrix $\begin{bmatrix} -2 & -1 \\ -1 & -2 \end{bmatrix}$ and computing the eigenvalues we get $-1$,
$-3$. The matrix is invertible, and so the system is almost linear at $(1,-1)$. As we have real
eigenvalues and both negative, the critical point is a sink, and therefore an asymptotically
stable equilibrium point. That is, if we start with any point $(x_i, y_i)$ close to $(1,-1)$ as an
initial condition and plot a trajectory, it approaches $(1,-1)$. In other words,
\[ \lim_{t \to \infty} \bigl( x(t), y(t) \bigr) = (1,-1) . \]
[Figure 8.3: phase diagram of $x' = -y - x^2$, $y' = -x + y^2$.]
As you can see from the diagram, this behavior is true even for some initial points quite far
from $(1,-1)$, but it is definitely not true for all initial points.
Example 8.2.2: Let us look at $x' = y + y^2 e^x$, $y' = x$. First let us find the critical points.
These are the points where $y + y^2 e^x = 0$ and $x = 0$. Simplifying we get $0 = y + y^2 = y(y+1)$.
So the critical points are $(0,0)$ and $(0,-1)$, and hence are isolated. Let us compute the
Jacobian matrix:
\[ \begin{bmatrix} y^2 e^x & 1 + 2y e^x \\ 1 & 0 \end{bmatrix} . \]
At the point $(0,0)$ we get the matrix $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ and so the two eigenvalues are $1$ and $-1$. As
the matrix is invertible, the system is almost linear at $(0,0)$. And, as the eigenvalues are
real and of opposite signs, we get a saddle point, which is an unstable equilibrium point.
At the point $(0,-1)$ we get the matrix $\begin{bmatrix} 1 & -1 \\ 1 & 0 \end{bmatrix}$ whose eigenvalues are $\frac{1}{2} \pm i \frac{\sqrt{3}}{2}$. The matrix
is invertible, and so the system is almost linear at $(0,-1)$. As we have complex eigenvalues
with positive real part, the critical point is a spiral source, and therefore an unstable
equilibrium point.
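The classification used in the last two examples is mechanical once the Jacobian matrix at the critical point is known. Below is a minimal sketch of that procedure; the function names `eigenvalues_2x2` and `classify` and the returned labels are choices made here, and the tolerance `tol` used to decide when a number counts as zero is an assumption.

```python
import cmath


def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix via the characteristic polynomial."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


def classify(m, tol=1e-12):
    """Classify a critical point from the Jacobian matrix m evaluated there."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if abs(det) < tol:
        return "not almost linear"
    l1, l2 = eigenvalues_2x2(m)
    if abs(l1.imag) > tol:
        # Complex eigenvalues: the sign of the real part decides.
        if abs(l1.real) < tol:
            return "center in the linearization (inconclusive)"
        return "spiral source" if l1.real > 0 else "spiral sink"
    # Real eigenvalues, both nonzero because the matrix is invertible.
    if l1.real > 0 and l2.real > 0:
        return "source"
    if l1.real < 0 and l2.real < 0:
        return "sink"
    return "saddle"


if __name__ == "__main__":
    print(classify([[0, -1], [-1, 0]]))    # Example 8.2.1 at (0, 0)
    print(classify([[-2, -1], [-1, -2]]))  # Example 8.2.1 at (1, -1)
    print(classify([[0, 1], [1, 0]]))      # Example 8.2.2 at (0, 0)
    print(classify([[1, -1], [1, 0]]))     # Example 8.2.2 at (0, -1)
```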
See Figure 8.4 on the next page for the phase diagram. Notice the two critical points,
and the behavior of the arrows in the vector field around these points.
[Figure 8.4: phase diagram of $x' = y + y^2 e^x$, $y' = x$.]
The trouble with a center in a nonlinear system is that whether the trajectory goes
towards or away from the critical point is governed by the sign of the real part of the
eigenvalues of the Jacobian matrix, and the Jacobian matrix in a nonlinear system changes
from point to point. Since this real part is zero at the critical point itself, it can have either
sign nearby, meaning the trajectory could be pulled towards or away from the critical point.
Example 8.2.3: An example of such a problematic behavior is the system $x' = y$, $y' = -x + y^3$.
The only critical point is the origin $(0,0)$. The Jacobian matrix is
\[ \begin{bmatrix} 0 & 1 \\ -1 & 3y^2 \end{bmatrix} . \]
At $(0,0)$ the Jacobian matrix is $\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$, which has eigenvalues $\pm i$. So the linearization has a
center.
Using the quadratic equation, the eigenvalues of the Jacobian matrix at any point $(x,y)$
are
\[ \lambda = \frac{3}{2} y^2 \pm i \frac{\sqrt{4 - 9y^4}}{2} . \]
At any point where $y \neq 0$ (so at most points near the origin), the eigenvalues have a positive
real part ($y^2$ can never be negative). This positive real part pulls the trajectory away from
the origin. A sample trajectory for an initial condition near the origin is given in Figure 8.5
on the next page.
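The slow outward drift can be observed numerically. The sketch below is an illustration only: it integrates the system with a hand-written classical Runge-Kutta (RK4) step, and the starting point $(0.1, 0)$, the step size, and the final time are arbitrary choices made here. It records the squared distance $x^2 + y^2$ from the origin along the trajectory.

```python
def rhs(state):
    """Right-hand side of x' = y, y' = -x + y^3."""
    x, y = state
    return (y, -x + y**3)


def rk4_step(f, state, h):
    """One step of the classical fourth-order Runge-Kutta method."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(
        s + h / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def squared_distance_history(start=(0.1, 0.0), t_end=20.0, h=0.01):
    """Squared distance from the origin at each time step along the trajectory."""
    state = start
    history = [state[0] ** 2 + state[1] ** 2]
    for _ in range(round(t_end / h)):
        state = rk4_step(rhs, state, h)
        history.append(state[0] ** 2 + state[1] ** 2)
    return history


if __name__ == "__main__":
    history = squared_distance_history()
    # The final value is larger than the initial one: the trajectory spirals outward.
    print(history[0], history[-1])
```

The growth is slow near the origin because the derivative of $x^2 + y^2$ along a trajectory is $2y^4$, which is tiny when $y$ is small.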
The moral of the example is that further analysis is needed when the linearization has a
center. The analysis will in general be more complicated than in the above example, and
is more likely to involve case-by-case consideration. Such a complication should not be
surprising to you. By now in your mathematical career, you have seen many places where
a simple test is inconclusive (recall, for example, the second derivative test for maxima or
minima) and a more careful, perhaps ad hoc, analysis of the situation is required.
Figure 8.5: An unstable critical point (spiral source) at the origin for $x' = y$, $y' = -x + y^3$, even if the
linearization has a center.
Trajectories satisfy
\[ \frac{1}{2} y^2 + \frac{1}{2} x^2 - \frac{1}{3} x^3 = C . \]
We solve for $y$
\[ y = \pm \sqrt{-x^2 + \frac{2}{3} x^3 + 2C} . \]
Plotting these graphs we get exactly the trajectories in Figure 8.1 on page 353. In
particular we notice that near the origin the trajectories are closed curves: they keep going
around the origin, never spiraling in or out. Therefore we discovered a way to verify that
the critical point at $(0,0)$ is a stable center. The critical point at $(1,0)$ is a saddle as we
already noticed. This example is typical for conservative equations.
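One can check numerically that the quantity $\frac{1}{2} y^2 + \frac{1}{2} x^2 - \frac{1}{3} x^3$ stays constant along a trajectory of $x' = y$, $y' = -x + x^2$. The following sketch makes several assumptions of its own: the integrator is a hand-written classical Runge-Kutta step, and the starting point $(0.5, 0)$, step size, and final time are arbitrary choices.

```python
def rhs(state):
    """Right-hand side of x' = y, y' = -x + x^2."""
    x, y = state
    return (y, -x + x**2)


def energy(state):
    """The conserved quantity (1/2) y^2 + (1/2) x^2 - (1/3) x^3."""
    x, y = state
    return 0.5 * y**2 + 0.5 * x**2 - x**3 / 3


def rk4_step(f, state, h):
    """One step of the classical fourth-order Runge-Kutta method."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(
        s + h / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def trajectory(start=(0.5, 0.0), t_end=20.0, h=0.01):
    """List of states along the numerically computed trajectory."""
    states = [start]
    for _ in range(round(t_end / h)):
        states.append(rk4_step(rhs, states[-1], h))
    return states


if __name__ == "__main__":
    states = trajectory()
    drift = max(abs(energy(s) - energy(states[0])) for s in states)
    # The drift is only numerical error, and x never reaches the saddle at x = 1.
    print(drift, max(s[0] for s in states), min(s[0] for s in states))
```

The starting point lies inside the region bounded by the level curve through the saddle $(1,0)$, so the computed trajectory keeps circling the origin.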
Consider an arbitrary conservative equation $x'' + f(x) = 0$. All critical points occur
when $y = 0$ (the $x$-axis), that is when $x' = 0$. The critical points are those points on the
$x$-axis where $f(x) = 0$. The trajectories are given by
\[ y = \pm \sqrt{-2 \int f(x) \, dx + 2C} . \]
So all trajectories are mirrored across the $x$-axis. In particular, there can be no spiral sources
nor sinks. The Jacobian matrix is
\[ \begin{bmatrix} 0 & 1 \\ -f'(x) & 0 \end{bmatrix} . \]
The critical point is almost linear if $f'(x) \neq 0$ at the critical point. Let $J$ denote the Jacobian
matrix. The eigenvalues of $J$ are solutions to
\[ 0 = \det(J - \lambda I) = \lambda^2 + f'(x) . \]
Therefore $\lambda = \pm \sqrt{-f'(x)}$. In other words, either we get real eigenvalues of opposite signs
(if $f'(x) < 0$), or we get purely imaginary eigenvalues (if $f'(x) > 0$). There are only two
possibilities for critical points, either an unstable saddle point, or a stable center. There are
never any sinks or sources.
8.2.5 Exercises
Exercise 8.2.1: For the systems below, find and classify the critical points, also indicate if the
equilibria are stable, asymptotically stable, or unstable.
a) $x' = -x + 3x^2$, $y' = -y$ b) $x' = x^2 + y^2 - 1$, $y' = x$
c) $x' = y e^x$, $y' = y - x + y^2$
Exercise 8.2.2: Find the implicit equations of the trajectories of the following conservative systems.
Next find their critical points (if any) and classify them.
a) $x'' + x + x^3 = 0$ b) $\theta'' + \sin \theta = 0$
c) $z'' + (z-1)(z+1) = 0$ d) $x'' + x^2 + 1 = 0$
b) For any initial point of the form $(0, y_0)$, find what is the trajectory.
c) Can a trajectory starting at $(x_0, y_0)$ where $x_0 > 0$ spiral into the critical point at $(-1, 0)$?
Why or why not?
Exercise 8.2.5: In the example $x' = y$, $y' = y^3 - x$ show that for any trajectory, the distance from
the origin is an increasing function. Conclude that the origin behaves like a spiral source. Hint:
Consider $f(t) = \bigl( x(t) \bigr)^2 + \bigl( y(t) \bigr)^2$ and show it has positive derivative.
Exercise 8.2.6: Suppose $f$ is always positive. Find the trajectories of $x'' + f(x') = 0$. Are there
any critical points?
Exercise 8.2.7: Suppose that $x' = f(x,y)$, $y' = g(x,y)$. Suppose that $g(x,y) > 1$ for all $x$ and
$y$. Are there any critical points? What can we say about the trajectories as $t$ goes to infinity?
Exercise 8.2.101: For the systems below, find and classify the critical points.
a) $x' = -x + x^2$, $y' = y$ b) $x' = y - y^2 - x$, $y' = -x$ c) $x' = xy$, $y' = x + y - 1$
Exercise 8.2.102: Find the implicit equations of the trajectories of the following conservative systems.
Next find their critical points (if any) and classify them.
a) $x'' + x^2 = 4$ b) $x'' + e^x = 0$ c) $x'' + (x+1) e^x = 0$
Exercise 8.2.103: The conservative system $x'' + x^3 = 0$ is not almost linear. Classify its critical
point(s) nonetheless.
Exercise 8.2.104: Derive an analogous classification of critical points for equations in one dimension,
such as $x' = f(x)$ based on the derivative. A point $x_0$ is critical when $f(x_0) = 0$ and almost linear
if in addition $f'(x_0) \neq 0$. Figure out if the critical point is stable or unstable depending on the sign
of $f'(x_0)$. Explain. Hint: see § 1.6.
8.3.1 Pendulum
The first example we study is the pendulum equation $\theta'' + \frac{g}{L} \sin \theta = 0$. Here, $\theta$ is the angular
displacement, $g$ is the gravitational acceleration, and $L$ is the length of the pendulum. In
this equation we disregard friction, so we are talking about an idealized pendulum.
This equation is a conservative equation, so we can use our
analysis of conservative equations from the previous section. Let us
change the equation to a two-dimensional system in variables $(\theta, \omega)$
by introducing the new variable $\omega$:
\[ \begin{bmatrix} \theta \\ \omega \end{bmatrix}' = \begin{bmatrix} \omega \\ -\frac{g}{L} \sin \theta \end{bmatrix} . \]
The critical points of this system are when $\omega = 0$ and $-\frac{g}{L} \sin \theta = 0$,
or in other words if $\sin \theta = 0$. So the critical points are when $\omega = 0$ and $\theta$ is a multiple
of $\pi$. That is, the points are $\ldots, (-2\pi, 0), (-\pi, 0), (0,0), (\pi, 0), (2\pi, 0), \ldots$. While there are
infinitely many critical points, they are all isolated. Let us compute the Jacobian matrix:
\[
\begin{bmatrix}
\frac{\partial}{\partial \theta} \bigl( \omega \bigr) & \frac{\partial}{\partial \omega} \bigl( \omega \bigr) \\
\frac{\partial}{\partial \theta} \bigl( -\frac{g}{L} \sin \theta \bigr) & \frac{\partial}{\partial \omega} \bigl( -\frac{g}{L} \sin \theta \bigr)
\end{bmatrix}
=
\begin{bmatrix} 0 & 1 \\ -\frac{g}{L} \cos \theta & 0 \end{bmatrix} .
\]
Figure 8.6: Phase plane diagram and some trajectories of the nonlinear pendulum equation.
angles. The horizontal axis is the deflection angle. The vertical axis is the angular velocity
of the pendulum. Suppose we start at $\theta = 0$ (no deflection), and we start with a small
angular velocity $\omega$. Then the trajectory keeps going around the critical point $(0,0)$ in an
approximate circle. This corresponds to short swings of the pendulum back and forth.
When $\theta$ stays small, the trajectories really look like circles and hence are very close to our
linearization.
When we give the pendulum a big enough push, it goes across the top and keeps
spinning about its axis. This behavior corresponds to the wavy curves that do not cross the
horizontal axis in the phase diagram. Let us suppose we look at the top curves, when the
angular velocity $\omega$ is large and positive. Then the pendulum is going around and around
its axis. The velocity is going to be large when the pendulum is near the bottom, and the
velocity is the smallest when the pendulum is close to the top of its loop.
At each critical point, there is an equilibrium solution. Consider the solution $\theta = 0$;
the pendulum is not moving and is hanging straight down. This is a stable place for the
pendulum to be, hence this is a stable equilibrium.
The other type of equilibrium solution is at the unstable point, for example $\theta = \pi$. Here
the pendulum is upside down. Sure you can balance the pendulum this way and it will
stay, but this is an unstable equilibrium. Even the tiniest push will make the pendulum
start swinging wildly.
See Figure 8.7 on the next page for a diagram. The first picture is the stable equilibrium
$\theta = 0$. The second picture corresponds to those “almost circles” in the phase diagram
around $\theta = 0$ when the angular velocity is small. The next picture is the unstable
equilibrium $\theta = \pi$. The last picture corresponds to the wavy lines for large angular
velocities.
The quantity
\[ \frac{1}{2} \omega^2 - \frac{g}{L} \cos \theta \]
is conserved by any solution. This is the energy or the Hamiltonian of the system.
We have a conservative equation and so (exercise) the trajectories are given by
\[ \omega = \pm \sqrt{\frac{2g}{L} \cos \theta + C} , \]
for various values of $C$. Let us look at the initial condition of $(\theta_0, 0)$, that is, we take the
pendulum to angle $\theta_0$, and just let it go (initial angular velocity 0). We plug the initial
conditions into the above and solve for $C$ to obtain
\[ C = -\frac{2g}{L} \cos \theta_0 . \]
Thus the expression for the trajectory is
\[ \omega = \pm \sqrt{\frac{2g}{L}} \sqrt{\cos \theta - \cos \theta_0} . \]
Let us figure out the period. That is, the time it takes for the pendulum to swing back
and forth. We notice that the trajectory about the origin in the phase plane is symmetric
about both the $\theta$ and the $\omega$-axis. That is, in terms of $\theta$, the time it takes from $\theta_0$ to $-\theta_0$ is
the same as it takes from $-\theta_0$ back to $\theta_0$. Furthermore, the time it takes from $-\theta_0$ to $0$ is the
same as to go from $0$ to $\theta_0$. Therefore, let us find how long it takes for the pendulum to go
from angle $0$ to angle $\theta_0$, which is a quarter of the full oscillation and then multiply by 4.
We figure out this time by finding $\frac{dt}{d\theta}$ and integrating from $0$ to $\theta_0$. The period is four
times this integral. Let us stay in the region where $\omega$ is positive. Since $\omega = \frac{d\theta}{dt}$, inverting
we get
\[ \frac{dt}{d\theta} = \sqrt{\frac{L}{2g}} \, \frac{1}{\sqrt{\cos \theta - \cos \theta_0}} . \]
We plot $T$, $T_{\text{linear}}$, and the relative error $\frac{T - T_{\text{linear}}}{T}$ in Figure 8.8. The relative error says how
far our approximation is from the real period percentage-wise. Note that $T_{\text{linear}}$ is simply a
constant, it does not change with the initial angle $\theta_0$. The actual period $T$ gets larger and
larger as $\theta_0$ gets larger. Notice how the relative error is small when $\theta_0$ is small. It is still
only 15% when $\theta_0 = \frac{\pi}{2}$, that is, a 90 degree angle. The error is 3.8% when starting at $\frac{\pi}{4}$, a
45 degree angle. At a 5 degree initial angle, the error is only 0.048%.
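The quoted percentages can be reproduced with a short computation. The sketch below does not evaluate the integral from the text directly, since its integrand blows up at $\theta = \theta_0$. Instead it uses a swapped-in technique, the known closed form of the exact period in terms of the arithmetic-geometric mean (AGM), $T = 2\pi \sqrt{L/g} \, / \operatorname{agm}\bigl( 1, \cos(\theta_0/2) \bigr)$, together with the standard small-angle period $T_{\text{linear}} = 2\pi \sqrt{L/g}$. Both formulas are assumptions brought in from outside this passage, and the function names are made up for illustration.

```python
import math


def agm(a, b, iterations=20):
    """Arithmetic-geometric mean of a and b; converges very quickly."""
    for _ in range(iterations):
        a, b = (a + b) / 2, math.sqrt(a * b)
    return a


def period(theta0, g_over_L=1.0):
    """Exact pendulum period for release angle theta0 (0 <= theta0 < pi), via the AGM formula."""
    return 2 * math.pi / math.sqrt(g_over_L) / agm(1.0, math.cos(theta0 / 2))


def linear_period(g_over_L=1.0):
    """Period of the linearized (small angle) pendulum."""
    return 2 * math.pi / math.sqrt(g_over_L)


def relative_error(theta0):
    """Relative error (T - T_linear) / T; it does not depend on g/L."""
    T = period(theta0)
    return (T - linear_period()) / T


if __name__ == "__main__":
    for degrees in (90, 45, 5):
        print(degrees, relative_error(math.radians(degrees)))
```

Because the ratio of the two periods depends only on $\theta_0$, the relative error is the same for every value of $g/L$.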
Figure 8.8: The plot of $T$ and $T_{\text{linear}}$ with $\frac{g}{L} = 1$ (left), and the plot of the relative error $\frac{T - T_{\text{linear}}}{T}$ (right), for
$\theta_0$ between $0$ and $\pi/2$.
\[ \lim_{\theta_0 \uparrow \pi} T = \infty . \]
That is, the period goes to infinity as the initial angle approaches the unstable equilibrium
point. So if we put the pendulum almost upside down it may take a very long time before
it gets down. This is consistent with the limiting behavior, where the exactly upside down
pendulum never makes an oscillation, so we could think of that as infinite period.
\[ x' = (a - by) x , \qquad y' = (cx - d) y , \]
where $a$, $b$, $c$, $d$ are some parameters that describe the interaction of the foxes and hares†.
In this model, these are all positive numbers.
Let us analyze the idea behind this model. The model is a slightly more complicated
idea based on the exponential population model. First expand,
\[ x' = (a - by) x = ax - byx . \]
The hares are expected to simply grow exponentially in the absence of foxes, that is where
the $ax$ term comes in, the growth in population is proportional to the population itself.
We are assuming the hares always find enough food and have enough space to reproduce.
However, there is another component $-byx$, that is, the population also is decreasing
proportionally to the number of foxes. Together we can write the equation as $(a - by)x$, so
it is like exponential growth or decay but the constant depends on the number of foxes.
The equation for foxes is very similar, expand again
\[ y' = (cx - d) y = cxy - dy . \]
The foxes need food (hares) to reproduce: the more food, the bigger the rate of growth,
hence the $cxy$ term. On the other hand, there are natural deaths in the fox population, and
hence the $-dy$ term.
Named for the American mathematician, chemist, and statistician Alfred James Lotka (1880–1949) and
the Italian mathematician and physicist Vito Volterra (1860–1940).
† This interaction does not end well for the hare.
Without further delay, let us start with an explicit example. Suppose the equations are
\[ x' = (0.4 - 0.01 y) x , \qquad y' = (0.003 x - 0.3) y . \]
See Figure 8.9 for the phase portrait. In this example it makes sense to also plot x and y as
graphs with respect to time. Therefore the second graph in Figure 8.9 is the graph of x and
y on the vertical axis (the prey x is the thinner line with taller peaks), against time on the
horizontal axis. The particular solution graphed was with initial conditions of 20 foxes and
50 hares.
Figure 8.9: The phase portrait (left) and graphs of $x$ and $y$ for a sample solution (right).
Let us analyze what we see on the graphs. We work in the general setting rather than
putting in specific numbers. We start with finding the critical points. Set $(a - by)x = 0$,
and $(cx - d)y = 0$. The first equation is satisfied if either $x = 0$ or $y = a/b$. If $x = 0$, the
second equation implies $y = 0$. If $y = a/b$, the second equation implies $x = d/c$. There are
two equilibria: at $(0,0)$ when there are no animals at all, and at $(d/c, a/b)$. In our specific
example $x = d/c = 100$, and $y = a/b = 40$. This is the point where there are 100 hares and 40
foxes.
We compute the Jacobian matrix:
\[ \begin{bmatrix} a - by & -bx \\ cy & cx - d \end{bmatrix} . \]
At the origin $(0,0)$ we get the matrix $\begin{bmatrix} a & 0 \\ 0 & -d \end{bmatrix}$, so the eigenvalues are $a$ and $-d$, hence real
and of opposite signs. So the critical point at the origin is a saddle. This makes sense. If
you started with some foxes but no hares, then the foxes would go extinct, that is, you
would approach the origin. If you started with no foxes and a few hares, then the hares
would keep multiplying without check, and so you would go away from the origin.
OK, how about the other critical point at $(d/c, a/b)$? Here the Jacobian matrix becomes
\[ \begin{bmatrix} 0 & -\frac{bd}{c} \\ \frac{ac}{b} & 0 \end{bmatrix} . \]
The eigenvalues satisfy $\lambda^2 + ad = 0$. In other words, $\lambda = \pm i \sqrt{ad}$. The eigenvalues being
purely imaginary, we are in the case where we cannot quite decide using only linearization.
We could have a stable center, spiral sink, or a spiral source. That is, the equilibrium could
be asymptotically stable, stable, or unstable. Of course I gave you a picture above that
seems to imply it is a stable center. But never trust a picture only. Perhaps the oscillations
are getting larger and larger, but only very slowly. Of course this would be bad as it would
imply something will go wrong with our population sooner or later. And I only graphed a
very specific example with very specific trajectories.
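The linearization at the two critical points can also be checked numerically. The following is a minimal sketch, not part of the original text. It assumes the illustrative constants a = 0.4, b = 0.01, c = 0.003, d = 0.3 (chosen so that d/c = 100 and a/b = 40), and the helper names `jacobian` and `eigenvalues_2x2` are made up for this illustration.

```python
import cmath

def jacobian(x, y, a, b, c, d):
    """Jacobian of x' = (a - b*y)*x, y' = (c*x - d)*y at the point (x, y)."""
    return ((a - b * y, -b * x),
            (c * y, c * x - d))

def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix computed from its trace and determinant."""
    (p, q), (r, s) = m
    trace = p + s
    det = p * s - q * r
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

# Illustrative constants (an assumption): they give d/c = 100 and a/b = 40.
a, b, c, d = 0.4, 0.01, 0.003, 0.3

origin_eigs = eigenvalues_2x2(jacobian(0.0, 0.0, a, b, c, d))
coexist_eigs = eigenvalues_2x2(jacobian(d / c, a / b, a, b, c, d))

print("origin:", origin_eigs)        # real and of opposite signs: a and -d
print("(d/c, a/b):", coexist_eigs)   # purely imaginary up to rounding
```

Up to rounding error, the second pair has zero real part, which is exactly the borderline case where linearization alone cannot decide stability.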
How can we be sure we are in the stable situation? As we said before, in the case of
purely imaginary eigenvalues, we have to do a bit more work. Previously we found that for
conservative systems, there was a certain quantity that was conserved on the trajectories,
and hence the trajectories had to go in closed loops. We can use a similar technique here.
We just have to figure out what is the conserved quantity. After some trial and error we
find the constant
$$C = \frac{y^a x^d}{e^{cx+by}} = y^a x^d e^{-cx-by}$$
is conserved. Such a quantity is called the constant of motion. Let us check $C$ really is a
constant of motion. How do we check, you say? Well, a constant is something that does
not change with time, so let us compute the derivative with respect to time:
$$C' = a y^{a-1} y' x^d e^{-cx-by} + y^a d x^{d-1} x' e^{-cx-by} + y^a x^d e^{-cx-by} (-c x' - b y').$$
Our equations give us what $x'$ and $y'$ are so let us plug those in:
$$\begin{aligned}
C' &= a y^{a-1} (cx - d) y \, x^d e^{-cx-by} + y^a d x^{d-1} (a - by) x \, e^{-cx-by} + y^a x^d e^{-cx-by} \bigl( -c(a - by)x - b(cx - d)y \bigr) \\
&= y^a x^d e^{-cx-by} \Bigl( a(cx - d) + d(a - by) - c(a - by)x - b(cx - d)y \Bigr) \\
&= 0.
\end{aligned}$$
So along the trajectories $C$ is constant. In fact, the expression $C = \frac{y^a x^d}{e^{cx+by}}$ gives us an implicit
equation for the trajectories. In any case, once we have found this constant of motion, it
must be true that the trajectories are simple curves, that is, the level curves of $\frac{y^a x^d}{e^{cx+by}}$. It
turns out, the critical point at $(d/c, a/b)$ is a maximum for $C$ (left as an exercise). So $(d/c, a/b)$
is a stable equilibrium point, and we do not have to worry about the foxes and hares going
extinct or their populations exploding.
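The conservation of C can also be observed numerically. The sketch below is only an illustration under stated assumptions: it uses the constants a = 0.4, b = 0.01, c = 0.003, d = 0.3 (an assumption consistent with d/c = 100 and a/b = 40), starts at 50 hares and 20 foxes, and integrates with the classical Runge–Kutta method. The step size and function names are arbitrary choices.

```python
import math

def rhs(x, y, a, b, c, d):
    """Right-hand side of x' = (a - b*y)*x, y' = (c*x - d)*y."""
    return (a - b * y) * x, (c * x - d) * y

def rk4_step(x, y, h, a, b, c, d):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1x, k1y = rhs(x, y, a, b, c, d)
    k2x, k2y = rhs(x + h / 2 * k1x, y + h / 2 * k1y, a, b, c, d)
    k3x, k3y = rhs(x + h / 2 * k2x, y + h / 2 * k2y, a, b, c, d)
    k4x, k4y = rhs(x + h * k3x, y + h * k3y, a, b, c, d)
    return (x + h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x),
            y + h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y))

def constant_of_motion(x, y, a, b, c, d):
    """C = y^a x^d e^(-c x - b y)."""
    return y ** a * x ** d * math.exp(-c * x - b * y)

a, b, c, d = 0.4, 0.01, 0.003, 0.3   # illustrative constants (assumed)
x, y = 50.0, 20.0                    # 50 hares and 20 foxes
h = 0.01                             # step size (arbitrary small choice)
values = [constant_of_motion(x, y, a, b, c, d)]
for _ in range(4000):                # integrate up to t = 40
    x, y = rk4_step(x, y, h, a, b, c, d)
    values.append(constant_of_motion(x, y, a, b, c, d))

drift = (max(values) - min(values)) / values[0]
print("relative drift in C:", drift)
```

The printed drift reflects only the numerical error of the method. A small value supports, but of course does not prove, that the trajectory is a closed level curve of C.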
One blemish on this wonderful model is that the number of foxes and hares are discrete
quantities and we are modeling with continuous variables. Our model has no problem
with there being 0.1 fox in the forest for example, while in reality that makes no sense. The
approximation is a reasonable one as long as the number of foxes and hares are large, but
it does not make much sense for small numbers. One must be careful in interpreting any
results from such a model.
8.3.3 Exercises
Exercise 8.3.1: Take the damped nonlinear pendulum equation $\theta'' + \mu\theta' + (g/L)\sin\theta = 0$
for some $\mu > 0$ (that is, there is some friction).
a) Suppose $\mu = 1$ and $g/L = 1$ for simplicity, find and classify the critical points.
b) Do the same for any $\mu > 0$ and any $g$ and $L$, but such that the damping is small, in particular,
$\mu^2 < 4(g/L)$.
c) Explain what your findings mean, and if it agrees with what you expect in reality.
Exercise 8.3.2: Suppose the hares do not grow exponentially, but logistically. In particular consider
$$x' = (0.4 - 0.01y)x - \gamma x^2, \qquad y' = (0.003x - 0.3)y.$$
For the following two values of $\gamma$, find and classify all the critical points in the positive quadrant,
that is, for $x \geq 0$ and $y \geq 0$. Then sketch the phase diagram. Discuss the implication for the long
term behavior of the population.
a) $\gamma = 0.001$, b) $\gamma = 0.01$.
Exercise 8.3.3:
a) Suppose $x$ and $y$ are positive variables. Show $\frac{yx}{e^{x+y}}$ attains a maximum at $(1, 1)$.
b) Suppose $a, b, c, d$ are positive constants, and also suppose $x$ and $y$ are positive variables.
Show $\frac{y^a x^d}{e^{cx+by}}$ attains a maximum at $(d/c, a/b)$.
Exercise 8.3.5 (challenging): Take the pendulum, suppose the initial position is $\theta = 0$.
a) Find the expression for $\omega$ giving the trajectory with initial condition $(0, \omega_0)$. Hint: Figure
out what $C$ should be in terms of $\omega_0$.
b) Find the crucial angular velocity $\omega_1$, such that for any higher initial angular velocity, the
pendulum will keep going around its axis, and for any lower initial angular velocity, the
pendulum will simply swing back and forth. Hint: When the pendulum doesn't go over the
top the expression for $\omega$ will be undefined for some $\theta$s.
c) What do you think happens if the initial condition is $(0, \omega_1)$, that is, the initial angle is 0, and
the initial angular velocity is exactly $\omega_1$?
Exercise 8.3.101: Take the damped nonlinear pendulum equation $\theta'' + \mu\theta' + (g/L)\sin\theta = 0$ for
some $\mu > 0$ (that is, there is friction). Suppose the friction is large, in particular $\mu^2 > 4(g/L)$.
b) Explain what your findings mean, and if it agrees with what you expect in reality.
Exercise 8.3.102: Suppose we have the predator-prey system where the foxes are also killed
at a constant rate $h$ ($h$ foxes killed per unit time): $x' = (a - by)x$, $y' = (cx - d)y - h$.
a) Find the critical points and the Jacobian matrices of the system.
b) Put in the constants $a = 0.4$, $b = 0.01$, $c = 0.003$, $d = 0.3$, $h = 10$. Analyze the critical
points. What do you think it says about the forest?
Exercise 8.3.103 (challenging): Suppose the foxes never die. That is, we have the system
$x' = (a - by)x$, $y' = cxy$. Find the critical points and notice they are not isolated. What will
happen to the population in the forest if it starts at some positive numbers? Hint: Think of the
constant of motion.
8.4 Limit cycles
The Van der Pol oscillator is the equation
$$x'' - \mu(1 - x^2)x' + x = 0,$$
where µ is some positive constant. The Van der Pol oscillator originated with electrical
circuits, but finds applications in diverse fields such as biology, seismology, and other
physical sciences.
For simplicity, let us use $\mu = 1$. A phase diagram is given in the left-hand plot in
Figure 8.10. Notice how the trajectories seem to very quickly settle on a closed curve. On
the right-hand side is the plot of a single solution for $t = 0$ to $t = 30$ with initial conditions
$x(0) = 0.1$ and $x'(0) = 0.1$. The solution quickly tends to a periodic solution.
Figure 8.10: The phase portrait (left) and a graph of a sample solution of the Van der Pol oscillator.
The Van der Pol oscillator is an example of so-called relaxation oscillation. The word
relaxation comes from the sudden jump (the very steep part of the solution). For larger µ
the steep part becomes even more pronounced; for small $\mu$ the limit cycle looks more like a
circle. In fact, setting $\mu = 0$, we get $x'' + x = 0$, which is a linear system with a center and
all trajectories become circles.
A trajectory in the phase portrait that is a closed curve (a curve that is a loop) is called a
closed trajectory. A limit cycle is a closed trajectory such that at least one other trajectory
spirals into it (or spirals out of it). For example, the closed curve in the phase portrait for
the Van der Pol equation is a limit cycle. If all trajectories that start near the limit cycle
spiral into it, the limit cycle is called asymptotically stable. The limit cycle in the Van der Pol
oscillator is asymptotically stable.
(The Van der Pol oscillator is named for the Dutch physicist Balthasar van der Pol (1889–1959).)
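The attraction to the limit cycle can be seen in a short numerical experiment. This is a sketch under stated assumptions rather than a definitive computation: it rewrites the equation as a first-order system with y = x', uses the classical Runge–Kutta method with an arbitrarily chosen step size, and compares the late-time amplitude of one solution started near the origin with one started well outside. The function names are made up for this illustration.

```python
def van_der_pol(x, y, mu):
    """First-order form of x'' - mu*(1 - x^2)*x' + x = 0 with y = x'."""
    return y, mu * (1 - x * x) * y - x

def rk4_step(x, y, h, mu):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1x, k1y = van_der_pol(x, y, mu)
    k2x, k2y = van_der_pol(x + h / 2 * k1x, y + h / 2 * k1y, mu)
    k3x, k3y = van_der_pol(x + h / 2 * k2x, y + h / 2 * k2y, mu)
    k4x, k4y = van_der_pol(x + h * k3x, y + h * k3y, mu)
    return (x + h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x),
            y + h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y))

def late_amplitude(x0, y0, mu=1.0, h=0.01, t_skip=40.0, t_end=60.0):
    """Largest |x| observed for t_skip <= t <= t_end."""
    x, y = x0, y0
    amplitude = 0.0
    for i in range(int(round(t_end / h))):
        x, y = rk4_step(x, y, h, mu)
        if (i + 1) * h >= t_skip:
            amplitude = max(amplitude, abs(x))
    return amplitude

inside = late_amplitude(0.1, 0.1)    # starts close to the origin
outside = late_amplitude(3.0, 0.0)   # starts outside the closed curve
print(inside, outside)               # both settle to an amplitude near 2
```

Both starting points end up with essentially the same amplitude, which is what we expect if nearby trajectories spiral into one and the same closed curve.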
Given a closed trajectory on an autonomous system, any solution that starts on it is
periodic. Such a curve is called a periodic orbit. More precisely, if $(x(t), y(t))$ is a solution
such that for some $t_0$ the point $(x(t_0), y(t_0))$ lies on a periodic orbit, then both $x(t)$ and $y(t)$
are periodic functions (with the same period). That is, there is some number $P$ such that
$x(t) = x(t + P)$ and $y(t) = y(t + P)$.
Consider the system
$$x' = f(x, y), \qquad y' = g(x, y), \tag{8.2}$$
where the functions $f$ and $g$ have continuous derivatives in some region $R$ in the plane.
Theorem 8.4.1 (Poincaré–Bendixson). Suppose $R$ is a closed bounded region (a region in the
plane that includes its boundary and does not have points arbitrarily far from the origin). Suppose
$(x(t), y(t))$ is a solution of (8.2) in $R$ that exists for all $t \geq t_0$. Then either the solution is a periodic
function, or the solution tends towards a periodic solution in $R$.
The main point of the theorem is that if you find one solution that exists for all t large
enough (that is, as t goes to infinity) and stays within a bounded region, then you have
found either a periodic orbit, or a solution that spirals towards a limit cycle or tends to a
critical point. That is, in the long term, the behavior is very close to a periodic function.
Note that a constant solution at a critical point is periodic (with any period). The theorem is
more a qualitative statement rather than something to help us in computations. In practice
it is hard to find analytic solutions and so hard to show rigorously that they exist for all time.
But if we think the solution exists we numerically solve for a large time to approximate the
limit cycle. Another caveat is that the theorem only works in two dimensions. In three
dimensions and higher, there is simply too much room.
The theorem applies to all solutions in the Van der Pol oscillator. Solutions that start at
any point except the origin $(0, 0)$ tend to the periodic solution around the limit cycle,
and the initial condition $(0, 0)$ leads to the constant solution $x = 0$, $y = 0$.
Example 8.4.2: Consider
$$x' = y + (x^2 + y^2 - 1)^2 x, \qquad y' = -x + (x^2 + y^2 - 1)^2 y.$$
A vector field along with solutions with initial conditions (1.02, 0), (0.9, 0), and (0.1, 0) are
drawn in Figure 8.11 on the next page.
Notice that points on the unit circle (distance one from the origin) satisfy $x^2 + y^2 - 1 = 0$.
And $x(t) = \sin(t)$, $y(t) = \cos(t)$ is a solution of the system. Therefore we have a closed
trajectory. For points off the unit circle, the second term in $x'$ pushes the solution further
away from the $y$-axis than the system $x' = y$, $y' = -x$, and $y'$ pushes the solution further
away from the $x$-axis than the linear system $x' = y$, $y' = -x$. In other words for all other
initial conditions the trajectory will spiral out.
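The claim that trajectories off the unit circle move outward can be made precise by tracking the squared distance from the origin. The following short computation is an added sketch. It writes $r^2 = x^2 + y^2$ and takes the rotation part of the system to be $x' = y$, $y' = -x$; the rotation terms cancel, so the conclusion does not depend on the direction of rotation.

```latex
\begin{aligned}
\frac{d}{dt}\bigl(x^2 + y^2\bigr) &= 2x x' + 2y y' \\
&= 2x\bigl(y + (x^2 + y^2 - 1)^2 x\bigr) + 2y\bigl(-x + (x^2 + y^2 - 1)^2 y\bigr) \\
&= 2 (x^2 + y^2 - 1)^2 (x^2 + y^2) = 2 r^2 (r^2 - 1)^2 .
\end{aligned}
```

The right-hand side is zero at the origin and on the unit circle, and positive everywhere else, so the distance from the origin increases along every other trajectory.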
Ivar Otto Bendixson (1861–1935) was a Swedish mathematician.
This means that for initial conditions inside the unit circle, the solution spirals out
towards the periodic solution on the unit circle, and for initial conditions outside the unit
circle the solutions spiral off towards infinity. Therefore the unit circle is a limit cycle, but
not an asymptotically stable one. The Poincaré–Bendixson Theorem applies to the initial
points inside the unit circle, as those solutions stay bounded, but not to those outside, as
those solutions go off to infinity.
Now consider the similar system
$$x' = y + (x^2 + y^2 - 1)x, \qquad y' = -x + (x^2 + y^2 - 1)y.$$
We still obtain a closed trajectory on the unit circle, and points outside the unit circle spiral
out to infinity, but now points inside the unit circle spiral towards the critical point at the
origin. So this system does not have a limit cycle, even though it has a closed trajectory.
Due to the Picard theorem (Theorem 3.1.1 on page 125) we find that no matter where
we are in the plane we can always find a solution a little bit further in time, as long as f and
g have continuous derivatives. So if we find a closed trajectory in an autonomous system,
then for every initial point inside the closed trajectory, the solution will exist for all time
and it will stay bounded (it will stay inside the closed trajectory). So the moment we found
the solution above going around the unit circle, we knew that for every initial point inside
the circle, the solution exists for all time and the Poincaré–Bendixson theorem applies.
Let us next look for conditions when limit cycles (or periodic orbits) do not exist. We
assume the equation (8.2) is defined on a simply connected region, that is, a region with no
holes we can go around. For example the entire plane is a simply connected region, and
so is the inside of the unit disc. However, the entire plane minus a point is not a simply
connected domain as it has a “hole” at the origin.
Theorem 8.4.2 (Bendixson–Dulac). Suppose $R$ is a simply connected region, and the expression
$$\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y}$$
is either always positive or always negative on $R$ (except perhaps a small set such as on isolated
points or curves). Then the system (8.2) has no closed trajectory inside $R$.
The theorem gives us a way of ruling out the existence of a closed trajectory, and hence
a way of ruling out limit cycles. The exception about points or curves means that we can
allow the expression to be zero at a few points, or perhaps on a curve, but not on any larger
set.
Example 8.4.3: Let us look at $x' = y + y^2 e^x$, $y' = x$ in the entire plane (see Example 8.2.2
on page 359). The entire plane is simply connected and so we can apply the theorem. We
compute $\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y} = y^2 e^x + 0$. The function $y^2 e^x$ is always positive except on the line $y = 0$.
Therefore, by the theorem, the system has no closed trajectories.
Example 8.4.4: Consider the system $x' = -y - x^2$, $y' = -x + y^2$. We compute
$\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y} = -2x + 2y = 2(-x + y)$. This expression takes on both signs, so if we are talking about the
whole plane we cannot simply apply the theorem. However, we could apply it on the set
where $-x + y \geq 0$. Via the theorem, there is no closed trajectory in that set. Similarly, there
is no closed trajectory in the set $-x + y \leq 0$. We cannot conclude (yet) that there is no
closed trajectory in the entire plane. Perhaps half of it is in the set where $-x + y \geq 0$ and
the other half is in the set where $-x + y \leq 0$.
The key is to look at the line where $-x + y = 0$, or $x = y$. On this line $x' = -y - x^2 = -x - x^2$
and $y' = -x + y^2 = -x + x^2$. In particular, when $x = y$ then $x' \leq y'$. That means that the
arrows, the vectors $(x', y')$, always point into the set where $-x + y \geq 0$. There is no way we
can start in the set where $-x + y \geq 0$ and go into the set where $-x + y \leq 0$. Once we are in
the set where $-x + y \geq 0$, we stay there. So no closed trajectory can have points in both
sets.
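The inequality on the line $x = y$ is a one-line computation. The sketch below assumes the system of this example is $x' = -y - x^2$, $y' = -x + y^2$, which is the reading consistent with the expressions on the line and with the divergence $2(-x + y)$.

```latex
\frac{d}{dt}\bigl(-x + y\bigr)\Big|_{x = y}
  = y' - x'
  = \bigl(-x + x^2\bigr) - \bigl(-x - x^2\bigr)
  = 2x^2 \geq 0 .
```

So the quantity $-x + y$ cannot decrease as a trajectory crosses the line where it is zero, which is why a trajectory can never pass from the set $-x + y \geq 0$ into the set $-x + y < 0$.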
Example 8.4.5: Consider $x' = -y + (x^2 + y^2 - 1)x$, $y' = x + (x^2 + y^2 - 1)y$, and consider
the region $R$ given by $x^2 + y^2 > \frac{1}{2}$. That is, $R$ is the region outside a circle of radius $\frac{1}{\sqrt{2}}$
centered at the origin. Then there is a closed trajectory in $R$, namely $x = \cos(t)$, $y = \sin(t)$.
Furthermore,
$$\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y} = 4x^2 + 4y^2 - 2,$$
which is always positive on $R$. So what is going on? The Bendixson–Dulac theorem does
not apply since the region $R$ is not simply connected: it has a hole, the circle we cut out!
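For completeness, here is how the expression in this example is obtained. This is a worked step added as a sketch, taking $f = -y + (x^2 + y^2 - 1)x$ and $g = x + (x^2 + y^2 - 1)y$:

```latex
\begin{aligned}
\frac{\partial f}{\partial x} &= (x^2 + y^2 - 1) + 2x^2 = 3x^2 + y^2 - 1, \\
\frac{\partial g}{\partial y} &= (x^2 + y^2 - 1) + 2y^2 = x^2 + 3y^2 - 1, \\
\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y} &= 4x^2 + 4y^2 - 2 = 4\Bigl(x^2 + y^2 - \tfrac{1}{2}\Bigr).
\end{aligned}
```

The last form makes it plain that the expression is positive exactly when $x^2 + y^2 > \frac{1}{2}$, that is, on all of $R$.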
8.4.1 Exercises
Exercise 8.4.1: Show that the following systems have no closed trajectories.
a) $x' = x^3 + y$, $y' = y^3 + x^2$, b) $x' = e^{x-y}$, $y' = e^{x+y}$,
c) $x' = x + 3y^2 - y^3$, $y' = y^3 + x^2$.
Exercise 8.4.2: Formulate a condition for a 2-by-2 linear system $\vec{x}' = A\vec{x}$ to not be a center using
the Bendixson–Dulac theorem. That is, the theorem says something about certain elements of $A$.
Exercise 8.4.3: Explain why the Bendixson–Dulac Theorem does not apply for any conservative
system $x'' + h(x) = 0$.
Exercise 8.4.4: A system such as $x' = x$, $y' = y$ has solutions that exist for all time $t$, yet there are
no closed trajectories. Explain why the Poincaré–Bendixson Theorem does not apply.
Exercise 8.4.5: Differential equations can also be given in different coordinate systems. Suppose
we have the system $r' = 1 - r^2$, $\theta' = 1$ given in polar coordinates. Find all the closed trajectories
and check if they are limit cycles and if so, if they are asymptotically stable or not.
Exercise 8.4.101: Show that the following systems have no closed trajectories.
a) $x' = x + y^2$, $y' = y + x^2$, b) $x' = x\sin^2(y)$, $y' = e^x$,
c) $x' = xy$, $y' = x + x^2$.
Exercise 8.4.102: Suppose an autonomous system in the plane has a solution $x = \cos(t) + e^{-t}$,
$y = \sin(t) + e^{-t}$. What can you say about the system (in particular about limit cycles and periodic
solutions)?
Exercise 8.4.103: Show that the limit cycle of the Van der Pol oscillator (for $\mu > 0$) must not lie
completely in the set where $-1 < x < 1$. Compare with Figure 8.10 on page 373.
Exercise 8.4.104: Suppose we have the system $r' = \sin(r)$, $\theta' = 1$ given in polar coordinates. Find
all the closed trajectories.
8.5 Chaos
Note: 1 lecture, §6.5 in [EP], §9.8 in [BD]
You have surely heard the story about the flap of a butterfly wing in the Amazon
causing hurricanes in the North Atlantic. In a prior section, we mentioned that a small
change in initial conditions of the planets can lead to very different configuration of the
planets in the long term. These are examples of chaotic systems. Mathematical chaos is
not really chaos, there is precise order behind the scenes. Everything is still deterministic.
However a chaotic system is extremely sensitive to initial conditions. This also means even
small errors induced via numerical approximation create large errors very quickly, so it is
almost impossible to numerically approximate for long times. This is a large part of the
trouble, as chaotic systems cannot be in general solved analytically.
Take the weather, the most well-known chaotic system. A small change in the initial
conditions (the temperature at every point of the atmosphere for example) produces
drastically different predictions in relatively short time, and so we cannot accurately
predict weather. And we do not actually know the exact initial conditions. We measure
temperatures at a few points with some error, and then we somehow estimate what is in
between. There is no way we can accurately measure the effects of every butterfly wing.
Then we solve the equations numerically introducing new errors. You should not trust
weather prediction more than a few days out.
Chaotic behavior was first noticed by Edward Lorenz in the 1960s when trying to
model thermally induced air convection (movement). Lorenz was looking at the relatively
simple system:
$$x' = -10x + 10y, \qquad y' = 28x - y - xz, \qquad z' = -\frac{8}{3}z + xy.$$
A small change in the initial conditions yields a very different solution after a reasonably
short time.
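The sensitivity to initial conditions is easy to observe on a computer. The following sketch is an illustration under assumptions, not a careful numerical study: the starting point (1, 1, 1), the perturbation of size 10^-6 in z, the step size, and the function names are all arbitrary choices, and the integrator is the classical Runge–Kutta method.

```python
def lorenz(state):
    """Right-hand side of the Lorenz system."""
    x, y, z = state
    return (-10 * x + 10 * y, 28 * x - y - x * z, -8 / 3 * z + x * y)

def rk4_step(f, state, h):
    """One classical fourth-order Runge-Kutta step for a tuple-valued state."""
    k1 = f(state)
    k2 = f(tuple(s + h / 2 * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + h / 2 * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h / 6 * (p + 2 * q + 2 * r + w)
                 for s, p, q, r, w in zip(state, k1, k2, k3, k4))

def separations(u, v, h=0.005, steps=8000, every=1000):
    """Distance between two solutions, sampled every `every` steps."""
    out = []
    for i in range(steps + 1):
        if i % every == 0:
            out.append(sum((p - q) ** 2 for p, q in zip(u, v)) ** 0.5)
        u, v = rk4_step(lorenz, u, h), rk4_step(lorenz, v, h)
    return out

# Two starting points that differ by one millionth in z (assumed values).
seps = separations((1.0, 1.0, 1.0), (1.0, 1.0, 1.000001))
for k, s in enumerate(seps):
    print(f"t = {5 * k:2d}: separation = {s:.3e}")
```

The printed separations start at one millionth and grow until they are comparable to the size of the solution itself, after which the two computed solutions have nothing to do with each other.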
A simple example the reader can experiment with, and which displays
chaotic behavior, is a double pendulum. The equations for this setup are
somewhat complicated, and their derivation is quite tedious, so we will not
bother to write them down. The idea is to put a pendulum on the end of
another pendulum. The movement of the bottom mass will appear chaotic.
This type of chaotic system is a basis for a whole number of office novelty
desk toys. It is simple to build a version. Take a piece of a string. Tie two
heavy nuts at different points of the string; one at the end, and one a bit
above. Now give the bottom nut a little push. As long as the swings are not too big and
the string stays tight, you have a double pendulum system.
The Duffing equation is
$$x'' + ax' + bx + cx^3 = C\cos(\omega t).$$
Here $a$, $b$, $c$, $C$, and $\omega$ are constants. Except for the $cx^3$ term, this equation looks like a
forced mass-spring system. The $cx^3$ means the spring does not exactly obey Hooke's law
(which no real-world spring actually does obey exactly). When $c$ is not zero, the equation
does not have a closed form solution, so we must resort to numerical solutions, as is usual
for nonlinear systems. Not all choices of constants and initial conditions exhibit chaotic
behavior. Let us study
$$x'' + 0.05x' + x^3 = 8\cos(t).$$
The equation is not autonomous, so we cannot draw the vector field in the phase plane.
We can still draw the trajectories. In Figure 8.12 we plot trajectories for t going from 0
to 15, for two very close initial conditions (2, 3) and (2, 2.9), and also the solutions in the
(x, t) space. The two trajectories are close at first, but after a while diverge significantly.
This sensitivity to initial conditions is precisely what we mean by the system behaving
chaotically.
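A comparison of two nearby solutions can be reproduced along the following lines. This is a minimal sketch under assumptions: the equation is written as a first-order system with y = x', the pairs (2, 3) and (2, 2.9) are read as (x(0), x'(0)), and the step size 0.001 and the function names are arbitrary choices.

```python
import math

def duffing(t, x, y):
    """x'' + 0.05 x' + x^3 = 8 cos(t), written with y = x'."""
    return y, -0.05 * y - x ** 3 + 8 * math.cos(t)

def rk4_step(t, x, y, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1x, k1y = duffing(t, x, y)
    k2x, k2y = duffing(t + h / 2, x + h / 2 * k1x, y + h / 2 * k1y)
    k3x, k3y = duffing(t + h / 2, x + h / 2 * k2x, y + h / 2 * k2y)
    k4x, k4y = duffing(t + h, x + h * k3x, y + h * k3y)
    return (x + h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x),
            y + h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y))

def solve(x0, y0, t_end=15.0, h=0.001):
    """Return the list of (t, x, y) from t = 0 to t = t_end."""
    t, x, y = 0.0, x0, y0
    path = [(t, x, y)]
    for i in range(int(round(t_end / h))):
        x, y = rk4_step(t, x, y, h)
        t = (i + 1) * h
        path.append((t, x, y))
    return path

p1 = solve(2.0, 3.0)
p2 = solve(2.0, 2.9)
# Distance in the (x, y) plane between the two solutions at equal times.
gap = [math.hypot(u[1] - v[1], u[2] - v[2]) for u, v in zip(p1, p2)]
for t_probe in (0, 5, 10, 15):
    print(t_probe, gap[t_probe * 1000])
```

Plotting x against y for `p1` and `p2` gives the phase-space picture, and plotting x against t gives the other view. The list `gap` shows directly how far apart the two solutions are at each time.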
Figure 8.12: On left, two trajectories in phase space for $0 \leq t \leq 15$, for the Duffing equation, one with
initial conditions (2, 3) and the other with (2, 2.9). On right the two solutions in $(x, t)$-space.
Let us see the long term behavior. In Figure 8.13 on the next page, we plot the behavior
of the system for initial conditions (2, 3) for a longer period of time. It is hard to see any
particular pattern in the shape of the solution except that it seems to oscillate, but each
oscillation appears quite unique. The oscillation is expected due to the forcing term. We
mention that to produce the picture accurately, a ridiculously large number of steps had
to be used in the numerical algorithm, as even small errors quickly propagate in a chaotic
system. (In fact for reference, 30,000 steps were used with the Runge–Kutta algorithm, see exercises in § 1.7.)
Figure 8.13: The solution to the given Duffing equation for t from 0 to 100.
It is very difficult to analyze chaotic systems, or to find the order behind the madness,
but let us try to do something that we did for the standard mass-spring system. One
way we analyzed the system is that we figured out what was the long term behavior (not
dependent on initial conditions). From the figure above, it is clear that we will not get a
nice exact description of the long term behavior for this chaotic system, but perhaps we
can find some order to what happens on each “oscillation” and what do these oscillations
have in common.
The concept we explore is that of a Poincaré section. Instead of looking at $t$ in a certain
interval, we look at where the system is at a certain sequence of points in time. Imagine
flashing a strobe at a fixed frequency and drawing the points where the solution is during
the flashes. The right strobing frequency depends on the system in question. The correct
frequency for the forced Duffing equation (and other similar systems) is the frequency of
the forcing term. For the Duffing equation above, find a solution $(x(t), y(t))$, and look at
the points
$$(x(0), y(0)), \quad (x(2\pi), y(2\pi)), \quad (x(4\pi), y(4\pi)), \quad (x(6\pi), y(6\pi)), \quad \ldots$$
As we are really not interested in the transient part of the solution, that is, the part of
the solution that depends on the initial condition, we skip some number of steps in the
beginning. For example, we might skip the first 100 such steps and start plotting points at
$t = 100(2\pi)$, that is
$$(x(200\pi), y(200\pi)), \quad (x(202\pi), y(202\pi)), \quad (x(204\pi), y(204\pi)), \quad \ldots$$
The plot of these points is the Poincaré section. After plotting enough points, a curious
pattern emerges in Figure 8.14 on the facing page (the left-hand picture), a so-called strange
attractor.
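The strobing procedure can be sketched in code as follows. This is an illustration under assumptions, not the computation behind the figure: y stands for x', each forcing period 2π is divided into a fixed number of Runge–Kutta steps so that the strobe times are hit exactly, and the names `poincare_section`, `skip`, `count`, and `steps_per_period` are made up for this example.

```python
import math

def duffing(t, x, y):
    """x'' + 0.05 x' + x^3 = 8 cos(t), written with y = x'."""
    return y, -0.05 * y - x ** 3 + 8 * math.cos(t)

def rk4_step(t, x, y, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1x, k1y = duffing(t, x, y)
    k2x, k2y = duffing(t + h / 2, x + h / 2 * k1x, y + h / 2 * k1y)
    k3x, k3y = duffing(t + h / 2, x + h / 2 * k2x, y + h / 2 * k2y)
    k4x, k4y = duffing(t + h, x + h * k3x, y + h * k3y)
    return (x + h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x),
            y + h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y))

def poincare_section(x0, y0, skip=100, count=200, steps_per_period=400):
    """Record (x, y) at t = 2*pi*k for k = skip, ..., skip + count - 1."""
    h = 2 * math.pi / steps_per_period
    x, y = x0, y0
    points = []
    for period in range(skip + count):
        if period >= skip:
            points.append((x, y))        # strobe flash at t = 2*pi*period
        for k in range(steps_per_period):
            t = (period * steps_per_period + k) * h
            x, y = rk4_step(t, x, y, h)
    return points

# A short run to keep the example quick; use a larger skip and count for a real picture.
points = poincare_section(2.0, 3.0, skip=20, count=50)
print(len(points), points[:3])
```

Plotting the pairs in `points` as dots, with many more of them, gives a picture of the section. Individual points of a chaotic solution cannot be computed reliably over long times, but the set they trace out is much less sensitive to numerical error.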
Named for the French polymath Jules Henri Poincaré (1854–1912).
Figure 8.14: Strange attractor. The left plot is with no phase shift, the right plot has phase shift $\pi/4$.
Given a sequence of points, an attractor is a set that the points in the sequence
eventually get closer and closer to, that is, they are attracted. The Poincaré section is not
really the attractor itself, but as the points are very close to it, we see its shape. The strange
attractor is a very complicated set. It has fractal structure, that is, if you zoom in as far as
you want, you keep seeing the same complicated structure.
The initial condition makes no difference. If we start with a different initial condition,
the points eventually gravitate towards the attractor, and so as long as we throw away
the first few points, we get the same picture. Similarly small errors in the numerical
approximations do not matter here.
An amazing thing is that a chaotic system such as the Duffing equation is not random
at all. There is a very complicated order to it, and the strange attractor says something
about this order. We cannot quite say what state the system will be in eventually, but given
the fixed strobing frequency we narrow it down to the points on the attractor.
If we use a phase shift, for example $\pi/4$, and look at the times
$$\pi/4, \quad 2\pi + \pi/4, \quad 4\pi + \pi/4, \quad 6\pi + \pi/4, \quad \ldots$$
we obtain a slightly different attractor. The picture is the right-hand side of Figure 8.14. It
is as if we had rotated, moved, and slightly distorted the original. For each phase shift you
can find the set of points towards which the system periodically keeps coming back.
Study the pictures and notice especially the scales—where are these attractors located
in the phase plane. Notice the regions where the strange attractor lives and compare it to
the plot of the trajectories in Figure 8.12 on page 379.
Let us compare this section to the discussion in § 2.6 about forced oscillations. Take the
equation
$$x'' + 2px' + \omega_0^2 x = \frac{F_0}{m}\cos(\omega t).$$
This is like the Duffing equation, but with no $x^3$ term. The steady periodic solution is of
the form
$$x = C\cos(\omega t + \gamma).$$
Strobing using the frequency $\omega$, we obtain a single point in the phase space. The attractor
in this setting is a single point, an expected result as the system is not chaotic. It was
the opposite of chaotic: Any difference induced by the initial conditions dies away very
quickly, and we settle into always the same steady periodic motion.
Let us return to the Lorenz system
$$x' = -10x + 10y, \qquad y' = 28x - y - xz, \qquad z' = -\frac{8}{3}z + xy.$$
The Lorenz system is an autonomous system in three dimensions exhibiting chaotic behavior.
See Figure 8.15 for a sample trajectory, which is now a curve in three-dimensional space.
The solutions tend to an attractor in space, the so-called Lorenz attractor. In this case no
strobing is necessary. Again we cannot quite see the attractor itself, but if we try to follow a
solution for long enough, as in the figure, we get a pretty good picture of what the attractor
looks like. The Lorenz attractor is also a strange attractor and has a complicated fractal
structure. And, just as for the Duffing equation, what we want to draw is not the whole
trajectory, but start drawing the trajectory after a while, once it is close to the attractor.
The path of the trajectory is not simply a repeating figure-eight. The trajectory spins
some seemingly random number of times on the left, then spins a number of times on the
right, and so on. As this system arose in weather prediction, one can perhaps imagine a few
days of warm weather and then a few days of cold weather, where it is not easy to predict
when the weather will change, just as it is not really easy to predict far in advance when
the solution will jump onto the other side. See Figure 8.16 for a plot of the x component of
the solution drawn above. A negative x corresponds to the left “loop” and a positive x
corresponds to the right “loop”.
Most of the mathematics we studied in this book is quite classical and well understood.
On the other hand, chaos, including the Lorenz system, continues to be the subject of
current research. Furthermore, chaos has found applications not just in the sciences, but
also in art.
8.5.3 Exercises
Exercise 8.5.1: For the non-chaotic equation $x'' + 2px' + \omega_0^2 x = \frac{F_0}{m}\cos(\omega t)$, suppose we strobe
with frequency $\omega$ as we mentioned above. Use the known steady periodic solution to find precisely
the point which is the attractor for the Poincaré section.
Exercise 8.5.2 (project): A simple fractal attractor can be drawn via the following chaos game.
Draw the three vertices of a triangle and label them, say $p_1$, $p_2$ and $p_3$. Draw some random point $p$
(it does not have to be one of the three points above). Roll a die to pick one of $p_1$, $p_2$, or $p_3$ randomly
(for example 1 and 4 mean $p_1$, 2 and 5 mean $p_2$, and 3 and 6 mean $p_3$). Suppose we picked $p_2$, then
let $p_{\text{new}}$ be the point exactly halfway between $p$ and $p_2$. Draw this point and let $p$ now refer to this
new point $p_{\text{new}}$. Rinse, repeat. Try to be precise and draw as many iterations as possible. Your
points will be attracted to the so-called Sierpinski triangle. A computer was used to run the game
for 10,000 iterations to obtain the picture in Figure 8.17.
Figure 8.17: 10,000 iterations of the chaos game producing the Sierpinski triangle.
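One possible way to run the chaos game on a computer is sketched below. It is only an illustration: the triangle vertices, the starting point, and the fixed random seed are arbitrary assumptions, and a pseudorandom choice among the three vertices stands in for the die roll.

```python
import random

def chaos_game(vertices, start, iterations, rng):
    """Repeatedly jump halfway towards a randomly chosen vertex."""
    px, py = start
    points = []
    for _ in range(iterations):
        vx, vy = rng.choice(vertices)            # plays the role of the die roll
        px, py = (px + vx) / 2, (py + vy) / 2    # halfway between p and the vertex
        points.append((px, py))
    return points

# Triangle vertices and starting point are arbitrary illustrative choices.
vertices = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.75)]
rng = random.Random(0)                           # fixed seed so runs are repeatable
points = chaos_game(vertices, (0.3, 0.3), 10_000, rng)
print(len(points), points[-1])
```

Plotting the pairs in `points` as small dots, with any plotting tool, should reveal the triangle-within-triangle pattern after the first few points.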
Exercise 8.5.3 (project): Construct the double pendulum described in the text with a string and
two nuts (or heavy beads). Play around with the position of the middle nut, and perhaps use different
weight nuts. Describe what you find.
Exercise 8.5.4 (computer project): Using computer software (such as Matlab, Octave, or perhaps
even a spreadsheet), plot the solution of the given forced Duffing equation with Euler's method.
Plot the solution for t from 0 to 100 with several different (small) step sizes. Discuss.
Exercise 8.5.101: Find critical points of the Lorenz system and the associated linearizations.
Appendix A
Linear algebra
Suppose we want to solve the equations
$$x - y = 2,$$
$$2x + y = 4,$$
for $x$ and $y$, that is, find numbers $x$ and $y$ such that the two equations are satisfied. Let us
perhaps start by adding the equations together to find
$$x + 2x - y + y = 2 + 4, \qquad \text{or} \qquad 3x = 6.$$
In other words, $x = 2$. Once we have that, we plug in $x = 2$ into the first equation to find
$2 - y = 2$, so $y = 0$. OK, that was easy. What is all this fuss about linear equations? Well,
try doing this if you have 5000 unknowns†. Also, we may have such equations not of just
numbers, but of functions and derivatives of functions in differential equations. Clearly
we need a more systematic way of doing things. A nice consequence of making things
systematic and simpler to write down is that it becomes easier to have computers do the
work for us. Computers are rather stupid, they do not think, but are very good at doing
lots of repetitive tasks precisely, as long as we figure out a systematic way for them to
perform the tasks.
† One of the downsides of making everything look like a linear problem is that the number of variables
tends to become huge.
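As a small taste of what "systematic" means, here is a sketch of how a computer might solve the two equations above. It uses elimination (subtracting a multiple of the first equation from the second) rather than the ad hoc addition used in the text, and exact fractions to avoid rounding. The function name `solve_2x2` and its argument order are assumptions made for this illustration.

```python
from fractions import Fraction

def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1, a21*x + a22*y = b2 by elimination.

    Assumes a11 is nonzero and that the system has a unique solution.
    """
    a11, a12, b1, a21, a22, b2 = map(Fraction, (a11, a12, b1, a21, a22, b2))
    factor = a21 / a11
    a22 -= factor * a12        # eliminate x from the second equation
    b2 -= factor * b1
    y = b2 / a22               # back-substitute: first y, then x
    x = (b1 - a12 * y) / a11
    return x, y

# The system x - y = 2, 2x + y = 4.
x, y = solve_2x2(1, -1, 2, 2, 1, 4)
print(x, y)  # 2 0
```

The same two steps, eliminate and back-substitute, are what one repeats many times over when there are thousands of unknowns.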
Figure A.1: The vector (1, 2) drawn as an arrow from the origin to the point (1, 2).
As vectors are arrows, when we want to give a name to a vector, we draw a little arrow
above it:
$$\vec{x}$$
Named after the ancient Greek mathematician Euclid of Alexandria (around 300 BC), possibly the most
famous of mathematicians; even small towns often have Euclid Street or Euclid Avenue.
† Named after the French mathematician René Descartes (1596–1650). It is “cartesian” as his name in Latin
is Renatus Cartesius.
‡ A common notation to distinguish vectors from points is to write $(1, 2)$ for the point and $\langle 1, 2 \rangle$ for the
vector.
Another popular notation is $\mathbf{x}$, although we will use the little arrows. It may be easy to
write a bold letter in a book, but it is not so easy to write it by hand on paper or on the
board. Mathematicians often don’t even write the arrows. A mathematician would write x
and just remember that x is a vector and not a number. Just like you remember that Bob is
your uncle, and you don’t have to keep repeating “Uncle Bob” and you can just say “Bob.”
In this book, however, we will call Bob “Uncle Bob” and write vectors with the little arrows.
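The magnitude can be computed using the Pythagorean theorem. The vector $(1, 2)$ drawn in
the figure has magnitude $\sqrt{1^2 + 2^2} = \sqrt{5}$. The magnitude is denoted by $\lVert \vec{x} \rVert$, and, in any
number of dimensions, it is computed in the same way: $\lVert \vec{x} \rVert = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$.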
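In code, the magnitude is a one-line computation. The sketch below represents a vector as a plain Python tuple of its components, which is an assumption made for illustration.

```python
import math

def magnitude(v):
    """Length of the vector v, given as a sequence of components."""
    return math.sqrt(sum(c * c for c in v))

print(magnitude((1, 2)))   # the square root of 5
```

The same function works for a vector with any number of components.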
For reasons that will become clear in the next section, we often write vectors as so-called
column vectors:
$$\vec{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}.$$
Don't worry. It is just a different way of writing the same thing, and it will be useful later.
For example, the vector $(1, 2)$ can be written as
$$\begin{bmatrix} 1 \\ 2 \end{bmatrix}.$$
The fact that we write arrows above vectors allows us to write several vectors $\vec{x}_1$, $\vec{x}_2$,
etc., without confusing these with the components of some other vector $\vec{x}$.
So where is the algebra from linear algebra? Well, arrows can be added, subtracted, and
multiplied by numbers. First we consider addition. If we have two arrows, we simply move
along one, and then along the other. See Figure A.2.
Figure A.2: Adding the vectors $(1, 2)$, drawn dotted, and $(2, -3)$, drawn dashed. The result, $(3, -1)$, is
drawn as a solid arrow.
It is rather easy to see what it does to the numbers that represent the vectors. Suppose
we want to add $(1, 2)$ to $(2, -3)$ as in the figure. So we travel along $(1, 2)$ and then we travel
along $(2, -3)$. What we did was travel one unit right, two units up, and then we travelled
two units right, and three units down (the negative three). That means that we ended up
at $\bigl(1 + 2, 2 + (-3)\bigr) = (3, -1)$. And that's how addition always works:
$$\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix}.$$
Subtracting is similar. What $\vec{x} - \vec{y}$ means visually is that we first travel along $\vec{x}$, and then
we travel backwards along $\vec{y}$. See Figure A.3. It is like adding $\vec{x} + (-\vec{y})$ where $-\vec{y}$ is the
arrow we obtain by erasing the arrow head from one side and drawing it on the other side,
that is, we reverse the direction. In terms of the numbers, we simply go backwards in both
directions, so we negate both numbers. For example, if $\vec{y}$ is $(-2, 1)$, then $-\vec{y}$ is $(2, -1)$.
Figure A.3: Subtraction, the vector $(1, 2)$, drawn dotted, minus $(-2, 1)$, drawn dashed. The result, $(3, 1)$,
is drawn as a solid arrow.
times further but in the opposite direction, so 3 units to the left and 6 units down, or in
other words, $(-3, -6)$. As we mentioned above, $-\vec{y}$ is a reverse of $\vec{y}$, and this is the same as
$(-1)\vec{y}$.
In Figure A.4, you can see a couple of examples of what scaling a vector means visually.
Figure A.4: A vector $\vec{x}$, the vector $2\vec{x}$ (same direction, double the magnitude), and the vector $-1.5\vec{x}$
(opposite direction, 1.5 times the magnitude).
We put all of these operations together to work out more complicated expressions. Let
us compute a small example:
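\[
3 \begin{bmatrix} 1 \\ 2 \end{bmatrix}
+ 2 \begin{bmatrix} -4 \\ -1 \end{bmatrix}
- 3 \begin{bmatrix} -2 \\ 2 \end{bmatrix}
= \begin{bmatrix} 3(1) + 2(-4) - 3(-2) \\ 3(2) + 2(-1) - 3(2) \end{bmatrix}
= \begin{bmatrix} 1 \\ -2 \end{bmatrix} .
\]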
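The same small computation can be sketched in code. This is only an illustration: vectors are stored as plain Python lists, and the helper names `scale` and `add` are made up for this sketch and are not part of the notes.

```python
def scale(c, v):
    """Multiply every component of the vector v by the number c."""
    return [c * vi for vi in v]


def add(*vectors):
    """Add vectors of equal length component by component."""
    return [sum(components) for components in zip(*vectors)]


x, y, z = [1, 2], [-4, -1], [-2, 2]

# 3x + 2y - 3z, written as a sum of scaled vectors
result = add(scale(3, x), scale(2, y), scale(-3, z))
print(result)  # [1, -2]
```

Subtraction is handled as adding a vector scaled by a negative number, which mirrors the remark that $-\vec{y}$ is the same as $(-1)\vec{y}$.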
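\[
F(\vec{x}) = 2\vec{x}.
\]
For example,
\[
F\left( \begin{bmatrix} 1 \\ 3 \end{bmatrix} \right) = 2 \begin{bmatrix} 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 2 \\ 6 \end{bmatrix} .
\]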
\[
F \colon \mathbb{R}^2 \to \mathbb{R}^2 .
\]
The words function and mapping are used rather interchangeably, although more often than
not, mapping is used when talking about a vector-valued function, and the word function is
often used when the function is scalar-valued.
A beginning student of mathematics (and many a seasoned mathematician) who sees
an expression such as
\[
f(3x + 8y)
\]
yearns to write
\[
3 f(x) + 8 f(y).
\]
After all, who hasn’t wanted to write $\sqrt{x + y} = \sqrt{x} + \sqrt{y}$ or something like that at some
point in their mathematical lives? Wouldn’t life be simple if we could do that? Of course
we can’t always do that (for example, not with the square roots!). It turns out there are
many functions where we can do exactly the above. Such functions are called linear.
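A mapping $F \colon \mathbb{R}^n \to \mathbb{R}^m$ is called linear if
\[
F(\vec{x} + \vec{y}) = F(\vec{x}) + F(\vec{y})
\]
for any vectors $\vec{x}$ and $\vec{y}$, and also
\[
F(\alpha \vec{x}) = \alpha F(\vec{x})
\]
for any scalar $\alpha$. The $F$ we defined above that doubles the size of all vectors is linear. Let
us check:
\[
F(\vec{x} + \vec{y}) = 2(\vec{x} + \vec{y}) = 2\vec{x} + 2\vec{y} = F(\vec{x}) + F(\vec{y}),
\]
and also
\[
F(\alpha \vec{x}) = 2\alpha \vec{x} = \alpha 2\vec{x} = \alpha F(\vec{x}).
\]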
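A numerical spot check of the two linearity conditions might look as follows. It is a sketch, not a proof: it tests the doubling map on one arbitrarily chosen pair of vectors and one scalar (the sample values are assumptions of this illustration), and then shows the square root failing additivity on one pair of numbers.

```python
import math


def F(v):
    """The doubling map F(x) = 2x on vectors stored as lists."""
    return [2 * vi for vi in v]


def add(u, v):
    return [ui + vi for ui, vi in zip(u, v)]


def scale(c, v):
    return [c * vi for vi in v]


x, y, alpha = [1, 3], [-2, 5], 7  # arbitrary sample values

additive = F(add(x, y)) == add(F(x), F(y))               # F(x + y) = F(x) + F(y)
homogeneous = F(scale(alpha, x)) == scale(alpha, F(x))    # F(alpha x) = alpha F(x)
print(additive, homogeneous)

# The square root is not additive: sqrt(9 + 16) = 5, but sqrt(9) + sqrt(16) = 7.
sqrt_additive = math.sqrt(9 + 16) == math.sqrt(9) + math.sqrt(16)
print(sqrt_additive)
```

A single failing pair is enough to show a function is not linear, while passing a few samples only suggests linearity; the algebraic check in the text is what settles it.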
We also call a linear function a linear transformation. If you want to be really fancy and
impress your friends, you can call it a linear operator.
When a mapping is linear we often do not write the parentheses. We write simply
\[
F\vec{x}
\]
instead of $F(\vec{x})$. We do this because linearity means that the mapping $F$ behaves like
multiplying $\vec{x}$ by “something.” That something is a matrix.
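A matrix is an $m \times n$ array of numbers ($m$ rows and $n$ columns). A $3 \times 5$ matrix is
\[
A = \begin{bmatrix}
a_{11} & a_{12} & a_{13} & a_{14} & a_{15} \\
a_{21} & a_{22} & a_{23} & a_{24} & a_{25} \\
a_{31} & a_{32} & a_{33} & a_{34} & a_{35}
\end{bmatrix} .
\]
The numbers $a_{ij}$ are called elements or entries.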
If we know where A takes all the basis vectors, we know where it takes all vectors.
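As an example, suppose $M$ is the $2 \times 2$ matrix from above, and suppose we wish to find
\[
M \begin{bmatrix} -2 \\ 0.1 \end{bmatrix}
= \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} -2 \\ 0.1 \end{bmatrix}
= -2 \begin{bmatrix} 1 \\ 3 \end{bmatrix} + 0.1 \begin{bmatrix} 2 \\ 4 \end{bmatrix}
= \begin{bmatrix} -1.8 \\ -5.6 \end{bmatrix} .
\]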
\[
Lf = g
\]
this section, that we can “make everything a vector.” That’s not strictly true, but it is true
approximately. Those “infinite-dimensional” spaces of functions can be approximated by a
finite-dimensional space, and then linear operators are just matrices. So approximately,
this is true. And as far as actual computations that we can do on a computer, we can work
only with finitely many dimensions anyway. If you ask a computer or your calculator to
plot a function, it samples the function at finitely many points and then connects the dots.
It does not actually give you infinitely many values. So the way that you have been using
the computer or your calculator so far has already been a certain approximation of the
space of functions by a finite-dimensional space.
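To end the section, we notice how $A\vec{x}$ can be written more succinctly. Suppose
\[
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}
\qquad \text{and} \qquad
\vec{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} .
\]
Then
\[
A\vec{x} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} a_{11} x_1 + a_{12} x_2 + a_{13} x_3 \\ a_{21} x_1 + a_{22} x_2 + a_{23} x_3 \end{bmatrix} .
\]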
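For example,
\[
\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 2 \\ -1 \end{bmatrix}
= \begin{bmatrix} 1 \cdot 2 + 2 \cdot (-1) \\ 3 \cdot 2 + 4 \cdot (-1) \end{bmatrix}
= \begin{bmatrix} 0 \\ 2 \end{bmatrix} .
\]
In other words, you take a row of the matrix, you multiply its entries by the entries in your
vector, you add things up, and that’s the corresponding entry in the resulting vector.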
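The row-times-vector rule can be written as a short function. This is a minimal sketch, assuming a matrix is stored as a list of rows; the name `mat_vec` is chosen here for illustration only.

```python
def mat_vec(A, x):
    """Return A x, where A is a list of rows and x is a list of entries."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("each row of A must have as many entries as x")
    # Each output entry: multiply a row entrywise with x and add things up.
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


A = [[1, 2],
     [3, 4]]
x = [2, -1]
print(mat_vec(A, x))  # [0, 2]
```

The size check reflects the fact that an $m \times n$ matrix can only be applied to a vector with $n$ entries.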
A.1.3 Exercises
Exercise A.1.1: On a piece of graph paper draw the vectors:
a) $\begin{bmatrix} 2 \\ 5 \end{bmatrix}$ b) $\begin{bmatrix} 2 \\ 4 \end{bmatrix}$ c) $(3, 4)$
Exercise A.1.2: On a piece of graph paper draw the vector $(1, 2)$ starting at (based at) the given
point:
a) based at $(0, 0)$ b) based at $(1, 2)$ c) based at $(0, 1)$
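Exercise A.1.3: On a piece of graph paper draw the following operations. Draw and label the
vectors involved in the operations as well as the result:
a) $\begin{bmatrix} 1 \\ 4 \end{bmatrix} + \begin{bmatrix} 2 \\ 3 \end{bmatrix}$ b) $\begin{bmatrix} 3 \\ 2 \end{bmatrix} - \begin{bmatrix} 1 \\ 3 \end{bmatrix}$ c) $3 \begin{bmatrix} 2 \\ 1 \end{bmatrix}$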
Exercise A.1.4: Compute the magnitude of
a) $\begin{bmatrix} 7 \\ 2 \end{bmatrix}$ b) $\begin{bmatrix} 2 \\ 3 \\ 1 \end{bmatrix}$ c) $(1, 3, 4)$
If you have ever used Matlab, you may have noticed that to plot a function, we take a vector of inputs, ask
Matlab to compute the corresponding vector of values of the function, and then we ask it to plot the result.
Exercise A.1.8: Write $(1, 2, 3)$ as a linear combination of the standard basis vectors $\vec{e}_1$, $\vec{e}_2$, and $\vec{e}_3$.
Exercise A.1.10: Suppose a linear mapping $F \colon \mathbb{R}^2 \to \mathbb{R}^2$ takes $(1, 0)$ to $(2, 1)$ and it takes $(0, 1)$
to $(3, 3)$. Where does it take
Exercise A.1.11: Suppose a linear mapping $F \colon \mathbb{R}^3 \to \mathbb{R}^2$ takes $(1, 0, 0)$ to $(2, 1)$ and it takes
$(0, 1, 0)$ to $(3, 4)$ and it takes $(0, 0, 1)$ to $(5, 6)$. Write down the matrix representing the mapping $F$.
Exercise A.1.12: Suppose that a mapping $F \colon \mathbb{R}^2 \to \mathbb{R}^2$ takes $(1, 0)$ to $(1, 2)$, $(0, 1)$ to $(3, 4)$, and it
takes $(1, 1)$ to $(0, 1)$. Explain why $F$ is not linear.
Exercise A.1.13 (challenging): Let $\mathbb{R}^3$ represent the space of quadratic polynomials in $t$: a point
$(a_0, a_1, a_2)$ in $\mathbb{R}^3$ represents the polynomial $a_0 + a_1 t + a_2 t^2$. Consider the derivative $\frac{d}{dt}$
as a mapping of $\mathbb{R}^3$ to $\mathbb{R}^3$, and note that $\frac{d}{dt}$ is linear. Write down $\frac{d}{dt}$ as a $3 \times 3$ matrix.
Exercise A.1.105: Suppose a linear mapping $F \colon \mathbb{R}^2 \to \mathbb{R}^2$ takes $(1, 0)$ to $(1, 1)$ and it takes $(0, 1)$
to $(2, 0)$. Where does it take
\[
\alpha x + \beta x = (\alpha + \beta) x.
\]
We get a new mapping $\alpha + \beta$ that multiplies $x$ by, well, $\alpha + \beta$. If $D$ is a mapping that doubles
things, $Dx = 2x$, and $T$ is a mapping that triples, $Tx = 3x$, then $D + T$ is a mapping that
multiplies by 5, $(D + T)x = 5x$.
Similarly we can compose such mappings, that is, we could apply one and then the other.
We take $x$, we run it through the first mapping $\alpha$ to get $\alpha$ times $x$, then we run $\alpha x$ through
the second mapping $\beta$. In other words,
\[
\beta(\alpha x) = (\beta\alpha) x.
\]
We just multiply those two numbers. Using our doubling and tripling mappings, if we
double and then triple, that is $T(Dx)$, then we obtain $3(2x) = 6x$. The composition $TD$ is
the mapping that multiplies by 6. For larger matrices, composition also ends up being a
kind of multiplication.
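\[
(A + B)\vec{x} = A\vec{x} + B\vec{x}.
\]
It turns out you just add the matrices element-wise: If the $ij$th entry of $A$ is $a_{ij}$, and the $ij$th
entry of $B$ is $b_{ij}$, then the $ij$th entry of $A + B$ is $a_{ij} + b_{ij}$. If
\[
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}
\qquad \text{and} \qquad
B = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix} ,
\]
then
\[
A + B = \begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} & a_{13} + b_{13} \\ a_{21} + b_{21} & a_{22} + b_{22} & a_{23} + b_{23} \end{bmatrix} .
\]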
Let us illustrate on a more concrete example:
\[
\begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}
+ \begin{bmatrix} 7 & 8 \\ 9 & 10 \\ 11 & -1 \end{bmatrix}
= \begin{bmatrix} 1 + 7 & 2 + 8 \\ 3 + 9 & 4 + 10 \\ 5 + 11 & 6 - 1 \end{bmatrix}
= \begin{bmatrix} 8 & 10 \\ 12 & 14 \\ 16 & 5 \end{bmatrix} .
\]
Let’s check that this does the right thing to a vector. Let’s use some of the vector algebra
that we already know, and regroup things:
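\[
\begin{aligned}
\begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix} \begin{bmatrix} 2 \\ -1 \end{bmatrix}
+ \begin{bmatrix} 7 & 8 \\ 9 & 10 \\ 11 & -1 \end{bmatrix} \begin{bmatrix} 2 \\ -1 \end{bmatrix}
&= \left( 2 \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} - \begin{bmatrix} 2 \\ 4 \\ 6 \end{bmatrix} \right)
+ \left( 2 \begin{bmatrix} 7 \\ 9 \\ 11 \end{bmatrix} - \begin{bmatrix} 8 \\ 10 \\ -1 \end{bmatrix} \right) \\
&= 2 \left( \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} + \begin{bmatrix} 7 \\ 9 \\ 11 \end{bmatrix} \right)
- \left( \begin{bmatrix} 2 \\ 4 \\ 6 \end{bmatrix} + \begin{bmatrix} 8 \\ 10 \\ -1 \end{bmatrix} \right) \\
&= 2 \begin{bmatrix} 1 + 7 \\ 3 + 9 \\ 5 + 11 \end{bmatrix} - \begin{bmatrix} 2 + 8 \\ 4 + 10 \\ 6 - 1 \end{bmatrix}
= 2 \begin{bmatrix} 8 \\ 12 \\ 16 \end{bmatrix} - \begin{bmatrix} 10 \\ 14 \\ 5 \end{bmatrix} \\
&= \begin{bmatrix} 8 & 10 \\ 12 & 14 \\ 16 & 5 \end{bmatrix} \begin{bmatrix} 2 \\ -1 \end{bmatrix}
\quad \left( = \begin{bmatrix} 2(8) - 10 \\ 2(12) - 14 \\ 2(16) - 5 \end{bmatrix} = \begin{bmatrix} 6 \\ 10 \\ 27 \end{bmatrix} \right) .
\end{aligned}
\]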
If we replaced the numbers by letters that would constitute a proof! You’ll notice that we
didn’t really have to even compute what the result is to convince ourselves that the two
expressions were equal.
If the sizes of the matrices do not match, then addition is not defined. If $A$ is $3 \times 2$ and $B$
is $2 \times 5$, then we cannot add these matrices. We don’t know what that could possibly mean.
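The check that $(A + B)\vec{x}$ equals $A\vec{x} + B\vec{x}$ can also be run numerically. The sketch below uses the $3 \times 2$ matrices and the vector $(2, -1)$ from the worked example; the helper names `mat_add` and `mat_vec` are assumptions of this illustration.

```python
def mat_add(A, B):
    """Add two matrices of the same size entry by entry."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrix sizes do not match")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_vec(A, x):
    """Apply the matrix A (a list of rows) to the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


A = [[1, 2], [3, 4], [5, 6]]
B = [[7, 8], [9, 10], [11, -1]]
x = [2, -1]

left = mat_vec(mat_add(A, B), x)                                # (A + B) x
right = [p + q for p, q in zip(mat_vec(A, x), mat_vec(B, x))]   # A x + B x
print(mat_add(A, B))
print(left, right)
```

Adding a $3 \times 2$ matrix to a $2 \times 5$ matrix raises an error in this sketch, matching the statement that such a sum is not defined.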
It is also useful to have a matrix that when added to any other matrix does nothing.
This is the zero matrix, the matrix of all zeros:
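\[
\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} .
\]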
We often denote the zero matrix by 0 without specifying size. We would then just write
A + 0, where we just assume that 0 is the zero matrix of the same size as A.
There are really two things we can multiply matrices by. We can multiply matrices by
scalars or we can multiply by other matrices. Let us first consider multiplication by scalars.
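For a matrix $A$ and a scalar $\alpha$ we want $\alpha A$ to be the matrix that accomplishes
\[
(\alpha A)\vec{x} = \alpha (A\vec{x}).
\]
That is just scaling the result by $\alpha$. If you think about it, scaling every term in $A$ by $\alpha$
accomplishes just that: If
\[
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} ,
\qquad \text{then} \qquad
\alpha A = \begin{bmatrix} \alpha a_{11} & \alpha a_{12} & \alpha a_{13} \\ \alpha a_{21} & \alpha a_{22} & \alpha a_{23} \end{bmatrix} .
\]
For example,
\[
2 \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} = \begin{bmatrix} 2 & 4 & 6 \\ 8 & 10 & 12 \end{bmatrix} .
\]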
Let us list some properties of matrix addition and scalar multiplication. Denote by 0
the zero matrix, by $\alpha$, $\beta$ scalars, and by $A$, $B$, $C$ matrices. Then:
\[
\begin{aligned}
A + 0 &= A = 0 + A, \\
A + B &= B + A, \\
(A + B) + C &= A + (B + C), \\
\alpha(A + B) &= \alpha A + \alpha B, \\
(\alpha + \beta) A &= \alpha A + \beta A.
\end{aligned}
\]
\[
AB\vec{x} = A(B\vec{x}).
\]
First, a vector $\vec{x}$ in $\mathbb{R}^p$ gets taken to the vector $B\vec{x}$ in $\mathbb{R}^n$. Then the mapping $A$ takes it to the
vector $A(B\vec{x})$ in $\mathbb{R}^m$. In other words, the composition $AB$ should be an $m \times p$ matrix. In
terms of sizes we should have
\[
\text{“ } [m \times n] \, [n \times p] = [m \times p]. \text{ ”}
\]
And similarly for larger (or smaller) vectors. A dot product is really a product of two
matrices: a $1 \times n$ matrix and an $n \times 1$ matrix resulting in a $1 \times 1$ matrix, that is, a number.
Armed with the dot product we define the product of matrices. First let us denote by
$\operatorname{row}_i(A)$ the $i$th row of $A$ and by $\operatorname{column}_j(A)$ the $j$th column of $A$. For an $m \times n$ matrix $A$
and an $n \times p$ matrix $B$ we can compute the product $AB$. The matrix $AB$ is an $m \times p$ matrix
whose $ij$th entry is the dot product
\[
\operatorname{row}_i(A) \cdot \operatorname{column}_j(B).
\]
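For example, given a $2 \times 3$ and a $3 \times 2$ matrix we should end up with a $2 \times 2$ matrix:
\[
\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}
\begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix}
= \begin{bmatrix}
a_{11} b_{11} + a_{12} b_{21} + a_{13} b_{31} & a_{11} b_{12} + a_{12} b_{22} + a_{13} b_{32} \\
a_{21} b_{11} + a_{22} b_{21} + a_{23} b_{31} & a_{21} b_{12} + a_{22} b_{22} + a_{23} b_{32}
\end{bmatrix} ,
\tag{A.1}
\]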
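or with some numbers:
\[
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}
\begin{bmatrix} -1 & 2 \\ -7 & 0 \\ 1 & -1 \end{bmatrix}
= \begin{bmatrix}
1 \cdot (-1) + 2 \cdot (-7) + 3 \cdot 1 & 1 \cdot 2 + 2 \cdot 0 + 3 \cdot (-1) \\
4 \cdot (-1) + 5 \cdot (-7) + 6 \cdot 1 & 4 \cdot 2 + 5 \cdot 0 + 6 \cdot (-1)
\end{bmatrix}
= \begin{bmatrix} -12 & -1 \\ -33 & 2 \end{bmatrix} .
\]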
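The definition "entry $ij$ is row $i$ of $A$ dotted with column $j$ of $B$" translates almost word for word into code. This is a sketch under the assumption that matrices are lists of rows; `mat_mul` is a name chosen for this illustration.

```python
def mat_mul(A, B):
    """Return AB; entry (i, j) is the dot product of row i of A and column j of B."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("A must have as many columns as B has rows")
    columns = list(zip(*B))  # the columns of B
    return [[sum(a * b for a, b in zip(row, col)) for col in columns]
            for row in A]


A = [[1, 2, 3],
     [4, 5, 6]]
B = [[-1, 2],
     [-7, 0],
     [1, -1]]
print(mat_mul(A, B))  # [[-12, -1], [-33, 2]]
```

The size check encodes the rule that an $m \times n$ matrix can only be multiplied by an $n \times p$ matrix.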
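A useful consequence of the definition is that the evaluation $A\vec{x}$ for a matrix $A$ and a
(column) vector $\vec{x}$ is also matrix multiplication. That is really why we think of vectors as
column vectors, or $n \times 1$ matrices. For example,
\[
\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 2 \\ -1 \end{bmatrix}
= \begin{bmatrix} 1 \cdot 2 + 2 \cdot (-1) \\ 3 \cdot 2 + 4 \cdot (-1) \end{bmatrix}
= \begin{bmatrix} 0 \\ 2 \end{bmatrix} .
\]
If you look at the last section, that is precisely the last example we gave.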
You should stare at the computation of multiplication of matrices $AB$ and the previous
definition of $A\vec{y}$ as a mapping for a moment. What we are doing with matrix multiplication
is applying the mapping $A$ to the columns of $B$. This is usually written as follows. Suppose
we write the $n \times p$ matrix $B = [\vec{b}_1 \; \vec{b}_2 \; \cdots \; \vec{b}_p]$, where $\vec{b}_1, \vec{b}_2, \ldots, \vec{b}_p$ are the columns of $B$.
Then for an $m \times n$ matrix $A$,
\[
AB = A[\vec{b}_1 \; \vec{b}_2 \; \cdots \; \vec{b}_p] = [A\vec{b}_1 \; A\vec{b}_2 \; \cdots \; A\vec{b}_p].
\]
The columns of the $m \times p$ matrix $AB$ are the vectors $A\vec{b}_1, A\vec{b}_2, \ldots, A\vec{b}_p$. For example, in
(A.1), the columns of
\[
\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}
\begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix}
\]
are
\[
\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}
\begin{bmatrix} b_{11} \\ b_{21} \\ b_{31} \end{bmatrix}
\qquad \text{and} \qquad
\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}
\begin{bmatrix} b_{12} \\ b_{22} \\ b_{32} \end{bmatrix} .
\]
This is a very useful way to understand what matrix multiplication is. It should also make
it easier to remember how to perform matrix multiplication.
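\[
I = I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} .
\]
Let us see how the matrix works on a smaller example,
\[
\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} a_{11} \cdot 1 + a_{12} \cdot 0 & a_{11} \cdot 0 + a_{12} \cdot 1 \\ a_{21} \cdot 1 + a_{22} \cdot 0 & a_{21} \cdot 0 + a_{22} \cdot 1 \end{bmatrix}
= \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} .
\]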
Multiplication by the identity from the left looks similar, and also does not touch anything.
We have the following rules for matrix multiplication. Suppose that $A$, $B$, $C$ are matrices
of the correct sizes so that the following make sense. Let $\alpha$ denote a scalar (number). Then
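Example A.2.1: Let us demonstrate a couple of these rules. For example, the associative
law:
\[
\underbrace{\begin{bmatrix} -3 & 3 \\ 2 & -2 \end{bmatrix}}_{A}
\left( \underbrace{\begin{bmatrix} 4 & 4 \\ 1 & -3 \end{bmatrix}}_{B} \underbrace{\begin{bmatrix} -1 & 4 \\ 5 & 2 \end{bmatrix}}_{C} \right)
= \underbrace{\begin{bmatrix} -3 & 3 \\ 2 & -2 \end{bmatrix}}_{A} \underbrace{\begin{bmatrix} 16 & 24 \\ -16 & -2 \end{bmatrix}}_{BC}
= \underbrace{\begin{bmatrix} -96 & -78 \\ 64 & 52 \end{bmatrix}}_{A(BC)} ,
\]
and
\[
\left( \underbrace{\begin{bmatrix} -3 & 3 \\ 2 & -2 \end{bmatrix}}_{A} \underbrace{\begin{bmatrix} 4 & 4 \\ 1 & -3 \end{bmatrix}}_{B} \right)
\underbrace{\begin{bmatrix} -1 & 4 \\ 5 & 2 \end{bmatrix}}_{C}
= \underbrace{\begin{bmatrix} -9 & -21 \\ 6 & 14 \end{bmatrix}}_{AB} \underbrace{\begin{bmatrix} -1 & 4 \\ 5 & 2 \end{bmatrix}}_{C}
= \underbrace{\begin{bmatrix} -96 & -78 \\ 64 & 52 \end{bmatrix}}_{(AB)C} .
\]
and
\[
\underbrace{\begin{bmatrix} -3 & 3 \\ 2 & -2 \end{bmatrix}}_{A}
\left( 10 \underbrace{\begin{bmatrix} 4 & 4 \\ 1 & -3 \end{bmatrix}}_{B} \right)
= \underbrace{\begin{bmatrix} -3 & 3 \\ 2 & -2 \end{bmatrix}}_{A} \underbrace{\begin{bmatrix} 40 & 40 \\ 10 & -30 \end{bmatrix}}_{10B}
= \underbrace{\begin{bmatrix} -90 & -210 \\ 60 & 140 \end{bmatrix}}_{A(10B)} .
\]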
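A multiplication rule you have used since primary school on numbers is quite conspicuously
missing for matrices. That is, matrix multiplication is not commutative. Firstly, just
because $AB$ makes sense, it may be that $BA$ is not even defined. For example, if $A$ is $2 \times 3$,
and $B$ is $3 \times 4$, then we can multiply $AB$ but not $BA$.
Even if $AB$ and $BA$ are both defined, it does not mean that they are equal. For example,
take $A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}$:
\[
AB = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}
= \begin{bmatrix} 1 & 2 \\ 1 & 2 \end{bmatrix}
\neq \begin{bmatrix} 1 & 1 \\ 2 & 2 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = BA.
\]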
A.2.5 Inverse
A couple of other algebra rules you know for numbers do not quite work on matrices:
For example:
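\[
\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}
= \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
= \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 2 \\ 0 & 0 \end{bmatrix} .
\]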
To make these rules hold, we do not just need one of the matrices to not be zero, we
would need to “divide” by a matrix. This is where the matrix inverse comes in. Suppose
that $A$ and $B$ are $n \times n$ matrices such that
\[
AB = I = BA.
\]
The computation is what you would do in regular algebra with numbers, but you have to
be careful never to commute matrices:
\[
\begin{aligned}
AB &= AC, \\
A^{-1}AB &= A^{-1}AC, \\
IB &= IC, \\
B &= C.
\end{aligned}
\]
\[
ad - bc \neq 0
\]
and otherwise it is singular. The expression $ad - bc$ is called the determinant and we will
look at it more carefully in a later section. There is a similar expression for a square matrix
of any size.
It is no wonder that the way we solve many problems in linear algebra (and in differential
equations) is to try to reduce the problem to the case of diagonal matrices.
A.2.7 Transpose
Vectors do not always have to be column vectors; that is just a convention. Swapping rows
and columns is from time to time needed. The operation that swaps rows and columns is
the so-called transpose. The transpose of $A$ is denoted by $A^T$. Example:
\[
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}^T = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix} .
\]
So transpose takes an $m \times n$ matrix to an $n \times m$ matrix.
A key fact about the transpose is that if the product $AB$ makes sense then $B^T A^T$ also
makes sense, at least from the point of view of sizes. In fact, we get precisely the transpose
of $AB$. That is:
\[
(AB)^T = B^T A^T .
\]
For example,
\[
\left( \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ 2 & -2 \end{bmatrix} \right)^T
= \begin{bmatrix} 0 & 1 & 2 \\ 1 & 0 & -2 \end{bmatrix} \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix} .
\]
It is left to the reader to verify that computing the matrix product on the left and then
transposing is the same as computing the matrix product on the right.
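For readers who want to check their hand computation, here is a small sketch that evaluates both sides of $(AB)^T = B^T A^T$. It assumes the example matrices are $A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ 2 & -2 \end{bmatrix}$; the helpers `transpose` and `mat_mul` are illustrative names.

```python
def transpose(A):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def mat_mul(A, B):
    columns = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns]
            for row in A]


A = [[1, 2, 3],
     [4, 5, 6]]
B = [[0, 1],
     [1, 0],
     [2, -2]]

left = transpose(mat_mul(A, B))               # (AB)^T
right = mat_mul(transpose(B), transpose(A))   # B^T A^T
print(left, right)
```

Note the reversed order on the right-hand side: $A^T B^T$ would be a $3 \times 3$ matrix here, so it could not possibly equal the $2 \times 2$ matrix $(AB)^T$.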
If we have a column vector $\vec{x}$ to which we apply a matrix $A$ and we transpose the result,
then the row vector $\vec{x}^T$ applies to $A^T$ from the left:
\[
(A\vec{x})^T = \vec{x}^T A^T .
\]
Another place where transpose is useful is when we wish to apply the dot product to
two column vectors:
\[
\vec{x} \cdot \vec{y} = \vec{y}^T \vec{x}.
\]
That is the way that one often writes the dot product in software.
We say a matrix $A$ is symmetric if $A = A^T$. For example,
\[
\begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 5 \\ 3 & 5 & 6 \end{bmatrix}
\]
is a symmetric matrix. Notice that a symmetric matrix is always square, that is, $n \times n$.
Symmetric matrices have many nice properties† , and come up quite often in applications.
As a side note, mathematicians write $\vec{y}^T \vec{x}$ and physicists write $\vec{x}^T \vec{y}$. Shhh. . . don’t tell anyone, but the
physicists are probably right on this.
† Although so far we have not learned enough about matrices to really appreciate them.
A.2.8 Exercises
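Exercise A.2.1: Add the following matrices
a) $\begin{bmatrix} 1 & 2 & 2 \\ 5 & 8 & 1 \end{bmatrix} + \begin{bmatrix} 3 & 2 & 3 \\ 8 & 3 & 5 \end{bmatrix}$
b) $\begin{bmatrix} 1 & 2 & 4 \\ 2 & 3 & 1 \\ 0 & 5 & 1 \end{bmatrix} + \begin{bmatrix} 2 & 8 & 3 \\ 3 & 1 & 0 \\ 6 & 4 & 1 \end{bmatrix}$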
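c) $\begin{bmatrix} 4 & 1 & 6 & 3 \\ 5 & 6 & 5 & 0 \\ 4 & 6 & 6 & 0 \end{bmatrix} \begin{bmatrix} 2 & 5 \\ 1 & 2 \\ 3 & 5 \\ 5 & 6 \end{bmatrix}$
d) $\begin{bmatrix} 1 & 1 & 4 \\ 0 & 5 & 1 \end{bmatrix} \begin{bmatrix} 2 & 2 \\ 1 & 0 \\ 6 & 4 \end{bmatrix}$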
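Exercise A.2.4: Compute the inverse of the given matrices
a) $\begin{bmatrix} 3 \end{bmatrix}$ b) $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ c) $\begin{bmatrix} 1 & 4 \\ 1 & 3 \end{bmatrix}$ d) $\begin{bmatrix} 2 & 2 \\ 1 & 4 \end{bmatrix}$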
A.3 Elimination
Note: 2–3 lectures
If we knew the inverse of $A$, then we would be done; we would simply solve the equation:
\[
\vec{x} = A^{-1}A\vec{x} = A^{-1}\vec{b}.
\]
Well, but that is part of the problem, we do not know how to compute the inverse for
matrices bigger than $2 \times 2$. We will see later that to compute the inverse we are really
solving $A\vec{x} = \vec{b}$ for several different $\vec{b}$. In other words, we will need to do elimination to
find $A^{-1}$. In addition, we may wish to solve $A\vec{x} = \vec{b}$ even if $A$ is not invertible, or perhaps
not even square.
Let us return to the equations themselves and see how we can manipulate them. There
are a few operations we can perform on the equations that do not change the solution.
First, perhaps an operation that may seem stupid, we can swap two equations in (A.2):
\[
\begin{aligned}
x_1 + x_2 + 3x_3 &= 5, \\
2x_1 + 2x_2 + 2x_3 &= 2, \\
x_1 + 4x_2 + x_3 &= 10.
\end{aligned}
\]
Although perhaps we have this backwards, quite often we solve a linear system of equations to find out
something about matrices, rather than vice versa.
Clearly these new equations have the same solutions $x_1$, $x_2$, $x_3$. A second operation is that
we can multiply an equation by a nonzero number. For example, we multiply the third
equation in (A.2) by 3:
\[
\begin{aligned}
2x_1 + 2x_2 + 2x_3 &= 2, \\
x_1 + x_2 + 3x_3 &= 5, \\
3x_1 + 12x_2 + 3x_3 &= 30.
\end{aligned}
\]
Finally we can add a multiple of one equation to another equation. For example, we add 3
times the third equation in (A.2) to the second equation:
\[
\begin{aligned}
2x_1 + 2x_2 + 2x_3 &= 2, \\
(1 + 3)x_1 + (1 + 12)x_2 + (3 + 3)x_3 &= 5 + 30, \\
x_1 + 4x_2 + x_3 &= 10.
\end{aligned}
\]
The same $x_1$, $x_2$, $x_3$ should still be solutions to the new equations. These were just examples;
we did not get any closer to the solution. We must do these three operations in some
more logical manner, but it turns out these three operations suffice to solve every linear
equation.
The first thing is to write the equations in a more compact manner. Given
\[
A\vec{x} = \vec{b},
\]
we write down the so-called augmented matrix
\[
[A \mid \vec{b}],
\]
where the vertical line is just a marker for us to know where the “right-hand side” of the
equation starts. For example, for the system (A.2) the augmented matrix is
\[
\left[ \begin{array}{ccc|c} 2 & 2 & 2 & 2 \\ 1 & 1 & 3 & 5 \\ 1 & 4 & 1 & 10 \end{array} \right] .
\]
The entire process of elimination, which we will describe, is often applied to any sort of
matrix, not just an augmented matrix. Simply think of the matrix as the $3 \times 4$ matrix
\[
\begin{bmatrix} 2 & 2 & 2 & 2 \\ 1 & 1 & 3 & 5 \\ 1 & 4 & 1 & 10 \end{bmatrix} .
\]
We run these operations until we get into a state where it is easy to read off the answer, or
until we get into a contradiction indicating no solution.
More specifically, we run the operations until we obtain the so-called row echelon form.
Let us call the first (from the left) nonzero entry in each row the leading entry. A matrix is
in row echelon form if the following conditions are satisfied:
(i) The leading entry in any row is strictly to the right of the leading entry of the row
above.
(ii) Any zero rows are below all the nonzero rows.
(iii) All leading entries are 1.
A matrix is in reduced row echelon form if furthermore the following condition is satisfied.
(iv) All the entries above a leading entry are zero.
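Example A.3.1: The following matrices are in row echelon form. The leading entries are
marked:
\[
\begin{bmatrix} \boxed{1} & 2 & 9 & 3 \\ 0 & 0 & \boxed{1} & 5 \\ 0 & 0 & 0 & \boxed{1} \end{bmatrix}
\qquad
\begin{bmatrix} \boxed{1} & 1 & 3 \\ 0 & \boxed{1} & 5 \\ 0 & 0 & \boxed{1} \end{bmatrix}
\qquad
\begin{bmatrix} \boxed{1} & 2 & 1 \\ 0 & \boxed{1} & 2 \\ 0 & 0 & 0 \end{bmatrix}
\qquad
\begin{bmatrix} 0 & \boxed{1} & 5 & 2 \\ 0 & 0 & 0 & \boxed{1} \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]
Note that the definition applies to matrices of any size. None of the matrices above are in
reduced row echelon form. For example, in the first matrix none of the entries above the
second and third leading entries are zero; they are 9, 3, and 5.
The following matrices are in reduced row echelon form. The leading entries are
marked:
\[
\begin{bmatrix} \boxed{1} & 3 & 0 & 8 \\ 0 & 0 & \boxed{1} & 6 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\qquad
\begin{bmatrix} \boxed{1} & 0 & 2 & 0 \\ 0 & \boxed{1} & 3 & 0 \\ 0 & 0 & 0 & \boxed{1} \end{bmatrix}
\qquad
\begin{bmatrix} \boxed{1} & 0 & 3 \\ 0 & \boxed{1} & 2 \\ 0 & 0 & 0 \end{bmatrix}
\qquad
\begin{bmatrix} 0 & \boxed{1} & 2 & 0 \\ 0 & 0 & 0 & \boxed{1} \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]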
The procedure we will describe to find a reduced row echelon form of a matrix is called
Gauss–Jordan elimination. The first part of it, which obtains a row echelon form, is called
Gaussian elimination or row reduction. For some problems, a row echelon form is sufficient,
and it is a bit less work to only do this first part.
To attain the row echelon form we work systematically. We go column by column,
starting at the first column. We find the topmost entry in the first column that is not zero, and
we call it the pivot. If there is no nonzero entry we move to the next column. We swap rows
to put the row with the pivot as the first row. We divide the first row by the pivot to make
the pivot entry be a 1. Now look at all the rows below and subtract the correct multiple of
the pivot row so that all the entries below the pivot become zero.
After this procedure we forget that we had a first row (it is now fixed), and we forget
about the column with the pivot and all the preceding zero columns. Below the pivot row,
all the entries in these columns are just zero. Then we focus on the smaller matrix and we
repeat the steps above.
It is best shown by example, so let us go back to the example from the beginning of the
section. We keep the vertical line in the matrix, even though the procedure works on any
matrix, not just an augmented matrix. We start with the first column and we locate the
pivot, in this case the first entry of the first column.
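\[
\left[ \begin{array}{ccc|c} \boxed{2} & 2 & 2 & 2 \\ 1 & 1 & 3 & 5 \\ 1 & 4 & 1 & 10 \end{array} \right]
\]
We multiply the first row by $1/2$.
\[
\left[ \begin{array}{ccc|c} \boxed{1} & 1 & 1 & 1 \\ 1 & 1 & 3 & 5 \\ 1 & 4 & 1 & 10 \end{array} \right]
\]
We subtract the first row from the second and third row (two elementary operations).
\[
\left[ \begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 0 & 0 & 2 & 4 \\ 0 & 3 & 0 & 9 \end{array} \right]
\]
We are done with the first column and the first row for now. We almost pretend the matrix
doesn’t have the first column and the first row.
\[
\left[ \begin{array}{ccc|c} * & * & * & * \\ * & 0 & 2 & 4 \\ * & 3 & 0 & 9 \end{array} \right]
\]
OK, look at the second column, and notice that now the pivot is in the third row.
\[
\left[ \begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 0 & 0 & 2 & 4 \\ 0 & \boxed{3} & 0 & 9 \end{array} \right]
\]
We swap rows.
\[
\left[ \begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 0 & \boxed{3} & 0 & 9 \\ 0 & 0 & 2 & 4 \end{array} \right]
\]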
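We divide the pivot row by 3 so that the pivot becomes 1; the second row is then $0 \;\; 1 \;\; 0 \mid 3$.
We do not need to subtract anything as everything below the pivot is already zero. We
move on, we again start ignoring the second row and second column and focus on
\[
\left[ \begin{array}{ccc|c} * & * & * & * \\ * & * & * & * \\ * & * & 2 & 4 \end{array} \right] .
\]
We find the pivot, then divide that row by 2:
\[
\left[ \begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & \boxed{2} & 4 \end{array} \right]
\to
\left[ \begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & \boxed{1} & 2 \end{array} \right] .
\]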
The matrix is now in row echelon form.
The equation corresponding to the last row is $x_3 = 2$. We know $x_3$ and we could
substitute it into the first two equations to get equations for $x_1$ and $x_2$. Then we could
do the same thing with $x_2$, until we solve for all 3 variables. This procedure is called
backsubstitution and we can achieve it via elementary operations. We start from the lowest
pivot (leading entry in the row echelon form) and subtract the right multiple from the row
above to make all the entries above this pivot zero. Then we move to the next pivot and so
on. After we are done, we will have a matrix in reduced row echelon form.
We continue our example. Subtract the last row from the first to get
\[
\left[ \begin{array}{ccc|c} 1 & 1 & 0 & -1 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & 2 \end{array} \right] .
\]
The entry above the pivot in the second row is already zero. So we move onto the next
pivot, the one in the second row. We subtract this row from the top row to get
\[
\left[ \begin{array}{ccc|c} 1 & 0 & 0 & -4 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & 2 \end{array} \right] .
\]
The matrix is in reduced row echelon form.
If we now write down the equations for $x_1$, $x_2$, $x_3$, we find
\[
x_1 = -4, \qquad x_2 = 3, \qquad x_3 = 2.
\]
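The whole procedure can be sketched as a short program. This is an illustrative implementation, not the author’s: the function name `rref` is made up, exact arithmetic from Python’s standard `fractions` module is used to avoid rounding, and, unlike the two-pass description above, the sketch clears the entries above and below each pivot in the same pass (the resulting reduced row echelon form is the same).

```python
from fractions import Fraction


def rref(M):
    """Return the reduced row echelon form of M (a list of rows)."""
    M = [[Fraction(entry) for entry in row] for row in M]
    rows, cols = len(M), len(M[0])
    pivot_row = 0
    for col in range(cols):
        # Topmost nonzero entry in this column at or below pivot_row.
        hit = next((r for r in range(pivot_row, rows) if M[r][col] != 0), None)
        if hit is None:
            continue  # no pivot in this column, move to the next column
        M[pivot_row], M[hit] = M[hit], M[pivot_row]                # swap rows
        pivot = M[pivot_row][col]
        M[pivot_row] = [entry / pivot for entry in M[pivot_row]]   # make the pivot 1
        for r in range(rows):
            if r != pivot_row and M[r][col] != 0:
                factor = M[r][col]
                # Subtract the right multiple of the pivot row.
                M[r] = [a - factor * b for a, b in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return M


# Augmented matrix of the system (A.2), without the vertical line.
augmented = [[2, 2, 2, 2],
             [1, 1, 3, 5],
             [1, 4, 1, 10]]
reduced = rref(augmented)
solution = [row[-1] for row in reduced]
print(*solution)  # -4 3 2
```

Reading off the last column as the solution is only valid when every variable column has a pivot and the system is consistent, as is the case in this example.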
\[
\left[ \begin{array}{ccc|c} \boxed{1} & 2 & 0 & -5 \\ 0 & 0 & \boxed{1} & 3 \\ 0 & 0 & 0 & 0 \end{array} \right]
\]
The last row is all zeros; it just says $0 = 0$ and we ignore it. The two remaining equations
are
\[
x_1 + 2x_2 = -5, \qquad x_3 = 3.
\]
Let us solve for the variables that corresponded to the pivots, that is $x_1$ and $x_3$ as there was
a pivot in the first column and in the third column:
\[
\begin{aligned}
x_1 &= -2x_2 - 5, \\
x_3 &= 3.
\end{aligned}
\]
The variable $x_2$ can be anything you wish and we still get a solution. The $x_2$ is called a free
variable. There are infinitely many solutions, one for every choice of $x_2$. For example, if we
pick $x_2 = 0$, then $x_1 = -5$, and $x_3 = 3$ give a solution. But we also get a solution by picking
say $x_2 = 1$, in which case $x_1 = -7$ and $x_3 = 3$, or by picking $x_2 = -5$ in which case $x_1 = 5$
and $x_3 = 3$.
The general idea is that if any row has all zeros in the columns corresponding to the
variables, but a nonzero entry in the column corresponding to the right-hand side $\vec{b}$, then
the system is inconsistent and has no solutions. In other words, the system is inconsistent
if you find a pivot on the right side of the vertical line drawn in the augmented matrix.
Otherwise, the system is consistent, and at least one solution exists.
(ii) If there are columns corresponding to variables with no pivot, then those are free
variables that can be chosen arbitrarily, and there are infinitely many solutions.
When $\vec{b} = \vec{0}$, we have a so-called homogeneous matrix equation
\[
A\vec{x} = \vec{0}.
\]
There is no need to write an augmented matrix in this case. As the elementary operations
do not do anything to a zero column, it always stays a zero column. Moreover, $A\vec{x} = \vec{0}$
always has at least one solution, namely $\vec{x} = \vec{0}$. Such a system is always consistent. It may
have other solutions: If you find any free variables, then you get infinitely many solutions.
The set of solutions of $A\vec{x} = \vec{0}$ comes up quite often so people give it a name. It is called
the nullspace or the kernel of $A$. One place where the kernel comes up is invertibility of a
square matrix $A$. If the kernel of $A$ contains a nonzero vector, then it contains infinitely
many vectors (there was a free variable). But then it is impossible to invert $\vec{0}$, since infinitely
many vectors go to $\vec{0}$, so there is no unique vector that $A$ takes to $\vec{0}$. So if the kernel is
nontrivial, that is, if there are any nonzero vectors, in other words, if there are any free
variables, or in yet other words, if the row echelon form of $A$ has columns without pivots,
then $A$ is not invertible. We will return to this idea later.
is a solution to $A\vec{y} = \vec{0}$:
So if you have found enough solutions, you have them all. The question is, when did
we find enough of them?
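We say the vectors $\vec{y}_1, \vec{y}_2, \ldots, \vec{y}_n$ are linearly independent if the only solution to
\[
\alpha_1 \vec{y}_1 + \alpha_2 \vec{y}_2 + \cdots + \alpha_n \vec{y}_n = \vec{0}
\]
is $\alpha_1 = \alpha_2 = \cdots = \alpha_n = 0$. Otherwise, we say the vectors are linearly dependent.
For example, the vectors $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ and $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ are linearly independent. Let’s try:
\[
\alpha_1 \begin{bmatrix} 1 \\ 2 \end{bmatrix} + \alpha_2 \begin{bmatrix} 0 \\ 1 \end{bmatrix}
= \begin{bmatrix} \alpha_1 \\ 2\alpha_1 + \alpha_2 \end{bmatrix} = \vec{0} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} .
\]
So $\alpha_1 = 0$, and then it is clear that $\alpha_2 = 0$ as well. In other words, the vectors are linearly
independent.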
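If a set of vectors is linearly dependent, that is, some of the $\alpha_j$’s are nonzero, then we can
solve for one vector in terms of the others. Suppose $\alpha_1 \neq 0$. Since $\alpha_1 \vec{x}_1 + \alpha_2 \vec{x}_2 + \cdots + \alpha_n \vec{x}_n = \vec{0}$,
then
\[
\vec{x}_1 = -\frac{\alpha_2}{\alpha_1} \vec{x}_2 - \frac{\alpha_3}{\alpha_1} \vec{x}_3 - \cdots - \frac{\alpha_n}{\alpha_1} \vec{x}_n .
\]
For example,
\[
2 \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} - 4 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + 2 \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} ,
\]
and so
\[
\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = 2 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} - \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} .
\]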
You may have noticed that solving for those $\alpha_j$’s is just solving linear equations, and so
you may not be surprised that to check if a set of vectors is linearly independent we use
row reduction.
Given a set of vectors, we may not be interested in just finding if they are linearly
independent or not, we may be interested in finding a linearly independent subset. Or
perhaps we may want to find some other vectors that give the same linear combinations
and are linearly independent. The way to figure this out is to form a matrix out of our
vectors. If we have row vectors we consider them as rows of a matrix. If we have column
vectors we consider them columns of a matrix. The set of all linear combinations of a set of
vectors is called their span.
\[
\operatorname{span} \{ \vec{x}_1, \vec{x}_2, \ldots, \vec{x}_n \} = \text{Set of all linear combinations of } \vec{x}_1, \vec{x}_2, \ldots, \vec{x}_n .
\]
Given a matrix $A$, the maximal number of linearly independent rows is called the rank
of $A$, and we write “$\operatorname{rank} A$” for the rank. For example,
\[
\operatorname{rank} \begin{bmatrix} 1 & 1 & 1 \\ 2 & 2 & 2 \\ -1 & -1 & -1 \end{bmatrix} = 1.
\]
The second and third row are multiples of the first one. We cannot choose more than one
row and still have a linearly independent set. But what is
\[
\operatorname{rank} \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} = \; ?
\]
That seems to be a tougher question to answer. The first two rows are linearly independent,
so the rank is at least two. If we would set up the equations for the $\alpha_1$, $\alpha_2$, and $\alpha_3$, we
would find a system with infinitely many solutions. One solution is
\[
\begin{bmatrix} 1 & 2 & 3 \end{bmatrix} - 2 \begin{bmatrix} 4 & 5 & 6 \end{bmatrix} + \begin{bmatrix} 7 & 8 & 9 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \end{bmatrix} .
\]
So the set of all three rows is linearly dependent, the rank cannot be 3. Therefore the rank
is 2.
But how can we do this in a more systematic way? We find the row echelon form!
\[
\text{Row echelon form of} \quad \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \quad \text{is} \quad \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix} .
\]
The elementary row operations do not change the set of linear combinations of the rows
(that was one of the main reasons for defining them as they were). In other words, the
span of the rows of $A$ is the same as the span of the rows of the row echelon form of
$A$. In particular, the number of linearly independent rows is the same. And in the row
echelon form, all nonzero rows are linearly independent. This is not hard to see. Consider
the two nonzero rows in the above example. Suppose we tried to solve for the $\alpha_1$ and $\alpha_2$ in
\[
\alpha_1 \begin{bmatrix} 1 & 2 & 3 \end{bmatrix} + \alpha_2 \begin{bmatrix} 0 & 1 & 2 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \end{bmatrix} .
\]
Since the first column of the row echelon matrix has zeros except in the first row,
$\alpha_1 = 0$. For the same reason, $\alpha_2$ is zero. We only have two nonzero rows, and they are
linearly independent, so the rank of the matrix is 2.
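Counting pivots during row reduction gives a mechanical way to compute the rank. The following is a sketch under the same assumptions as before (matrices as lists of rows, exact `fractions` arithmetic, and a made-up function name `rank`); it skips normalizing pivots to 1 because only their number matters here.

```python
from fractions import Fraction


def rank(M):
    """Count the pivots found while reducing M to an echelon shape."""
    M = [[Fraction(entry) for entry in row] for row in M]
    rows, cols = len(M), len(M[0])
    pivots = 0
    for col in range(cols):
        hit = next((r for r in range(pivots, rows) if M[r][col] != 0), None)
        if hit is None:
            continue  # no pivot in this column
        M[pivots], M[hit] = M[hit], M[pivots]
        for r in range(pivots + 1, rows):
            factor = M[r][col] / M[pivots][col]
            M[r] = [a - factor * b for a, b in zip(M[r], M[pivots])]
        pivots += 1
        if pivots == rows:
            break
    return pivots


print(rank([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))        # 2
print(rank([[1, 1, 1], [2, 2, 2], [-1, -1, -1]]))     # 1
```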
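The span of the rows is called the row space. The row space of $A$ and the row echelon
form of $A$ are the same. In the example,
\[
\begin{aligned}
\text{row space of } \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}
&= \operatorname{span} \left\{ \begin{bmatrix} 1 & 2 & 3 \end{bmatrix} , \begin{bmatrix} 4 & 5 & 6 \end{bmatrix} , \begin{bmatrix} 7 & 8 & 9 \end{bmatrix} \right\} \\
&= \operatorname{span} \left\{ \begin{bmatrix} 1 & 2 & 3 \end{bmatrix} , \begin{bmatrix} 0 & 1 & 2 \end{bmatrix} \right\} .
\end{aligned}
\]
Similarly to row space, the span of columns is called the column space.
\[
\text{column space of } \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}
= \operatorname{span} \left\{ \begin{bmatrix} 1 \\ 4 \\ 7 \end{bmatrix} , \begin{bmatrix} 2 \\ 5 \\ 8 \end{bmatrix} , \begin{bmatrix} 3 \\ 6 \\ 9 \end{bmatrix} \right\} .
\]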
So it may also be good to find the number of linearly independent columns of A. One
way to do that is to find the number of linearly independent rows of $A^T$. It is a tremendously
useful fact that the number of linearly independent columns is always the same as the
number of linearly independent rows:
Theorem A.3.1. $\operatorname{rank} A = \operatorname{rank} A^T$
In particular, to find a set of linearly independent columns we need to look at where
the pivots were. If you recall above, when solving $A \vec{x} = \vec{0}$ the key was finding the pivots;
any non-pivot columns corresponded to free variables. That means we can solve for the
non-pivot columns in terms of the pivot columns. Let's see an example. First we reduce
some random matrix:
\[
\begin{bmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 5 & 6 \\ 3 & 6 & 7 & 8 \end{bmatrix} .
\]
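We find a pivot and reduce the rows below:
\[
\begin{bmatrix} \boxed{1} & 2 & 3 & 4 \\ 2 & 4 & 5 & 6 \\ 3 & 6 & 7 & 8 \end{bmatrix}
\to
\begin{bmatrix} \boxed{1} & 2 & 3 & 4 \\ 0 & 0 & -1 & -2 \\ 3 & 6 & 7 & 8 \end{bmatrix}
\to
\begin{bmatrix} \boxed{1} & 2 & 3 & 4 \\ 0 & 0 & -1 & -2 \\ 0 & 0 & -2 & -4 \end{bmatrix} .
\]
We find the next pivot, make it one, and rinse and repeat:
\[
\begin{bmatrix} \boxed{1} & 2 & 3 & 4 \\ 0 & 0 & \boxed{-1} & -2 \\ 0 & 0 & -2 & -4 \end{bmatrix}
\to
\begin{bmatrix} \boxed{1} & 2 & 3 & 4 \\ 0 & 0 & \boxed{1} & 2 \\ 0 & 0 & -2 & -4 \end{bmatrix}
\to
\begin{bmatrix} \boxed{1} & 2 & 3 & 4 \\ 0 & 0 & \boxed{1} & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix} .
\]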
The final matrix is the row echelon form of the matrix. Consider the pivots that we marked.
The pivot columns are the first and the third column. All other columns correspond to free
The mapping $A^{-1}$ is linear and hence given by a matrix, and we have seen that to figure
out the matrix we just need to find where $A^{-1}$ takes the standard basis vectors $\vec{e}_1$, $\vec{e}_2$,
\ldots, $\vec{e}_n$.
That is, to find the first column of $A^{-1}$ we solve $A \vec{x} = \vec{e}_1$, because then $A^{-1} \vec{e}_1 = \vec{x}$. To
find the second column of $A^{-1}$ we solve $A \vec{x} = \vec{e}_2$. And so on. It is really just $n$ eliminations
that we need to do. But it gets even easier. If you think about it, the elimination is the same
for everything on the left side of the augmented matrix. Doing $n$ eliminations separately
we would redo most of the computations. Best is to do all at once.
Therefore, to find the inverse of $A$, we write an $n \times 2n$ augmented matrix $[\, A \mid I \,]$, where
$I$ is the identity matrix, whose columns are precisely the standard basis vectors. We then
perform row reduction until we arrive at the reduced row echelon form. If $A$ is invertible,
then pivots can be found in every column of $A$, and so the reduced row echelon form of
$[\, A \mid I \,]$ looks like $[\, I \mid A^{-1} \,]$. We then just read off the inverse $A^{-1}$. If you do not find a pivot
in every one of the first $n$ columns of the augmented matrix, then $A$ is not invertible.
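This is best seen by example. Suppose we wish to invert the matrix
\[
\begin{bmatrix} 1 & 2 & 3 \\ 2 & 0 & 1 \\ 3 & 1 & 0 \end{bmatrix} .
\]
We write the augmented matrix and we start reducing:
\[
\begin{aligned}
& \left[ \begin{array}{ccc|ccc} 1 & 2 & 3 & 1 & 0 & 0 \\ 2 & 0 & 1 & 0 & 1 & 0 \\ 3 & 1 & 0 & 0 & 0 & 1 \end{array} \right]
\to
\left[ \begin{array}{ccc|ccc} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & -4 & -5 & -2 & 1 & 0 \\ 0 & -5 & -9 & -3 & 0 & 1 \end{array} \right] \to \\
\to {} & \left[ \begin{array}{ccc|ccc} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & 1 & 5/4 & 1/2 & -1/4 & 0 \\ 0 & -5 & -9 & -3 & 0 & 1 \end{array} \right]
\to
\left[ \begin{array}{ccc|ccc} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & 1 & 5/4 & 1/2 & -1/4 & 0 \\ 0 & 0 & -11/4 & -1/2 & -5/4 & 1 \end{array} \right] \to \\
\to {} & \left[ \begin{array}{ccc|ccc} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & 1 & 5/4 & 1/2 & -1/4 & 0 \\ 0 & 0 & 1 & 2/11 & 5/11 & -4/11 \end{array} \right]
\to
\left[ \begin{array}{ccc|ccc} 1 & 2 & 0 & 5/11 & -15/11 & 12/11 \\ 0 & 1 & 0 & 3/11 & -9/11 & 5/11 \\ 0 & 0 & 1 & 2/11 & 5/11 & -4/11 \end{array} \right] \to \\
\to {} & \left[ \begin{array}{ccc|ccc} 1 & 0 & 0 & -1/11 & 3/11 & 2/11 \\ 0 & 1 & 0 & 3/11 & -9/11 & 5/11 \\ 0 & 0 & 1 & 2/11 & 5/11 & -4/11 \end{array} \right] .
\end{aligned}
\]
So
\[
\begin{bmatrix} 1 & 2 & 3 \\ 2 & 0 & 1 \\ 3 & 1 & 0 \end{bmatrix}^{-1}
=
\begin{bmatrix} -1/11 & 3/11 & 2/11 \\ 3/11 & -9/11 & 5/11 \\ 2/11 & 5/11 & -4/11 \end{bmatrix} .
\]
Not too terrible, no? Perhaps harder than inverting a $2 \times 2$ matrix for which we had a
formula, but not too bad. Really in practice this is done efficiently by a computer.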
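As an illustration of how a computer might carry out the procedure, here is a minimal Python sketch that row reduces $[\, A \mid I \,]$ with exact fractions. The function name `invert` and the choice to return `None` for a singular matrix are assumptions made for this sketch only; practical software uses floating point and more careful pivoting.

```python
from fractions import Fraction

def invert(matrix):
    """Invert a square matrix by row reducing [A | I]; return None if A is singular."""
    n = len(matrix)
    # Build the n x 2n augmented matrix [A | I] with exact rational entries.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Look for a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None  # no pivot in this column, so A is not invertible
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot becomes 1.
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Clear the rest of the column, both above and below the pivot.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    # The augmented matrix is now [I | A^{-1}]; read off the right half.
    return [row[n:] for row in aug]

A = [[1, 2, 3], [2, 0, 1], [3, 1, 0]]
A_inv = invert(A)
for row in A_inv:
    print([str(x) for x in row])
```

The sketch clears entries above and below each pivot in one pass, so the order of the intermediate steps differs from a hand computation, but the reduced row echelon form it reaches is the same.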
A.3.6 Exercises
Exercise A.3.1: Compute the reduced row echelon form for the following matrices:
1 3 1 3 3 3 6 6 6 7 7
a� b� c� d�
0 1 1 6 3 2 3 1 1 0 1
29 3 0 23 22 1 3 337 26 6 537 20 2 137
6 7 6 6 6 0
e� 668 6 3 677 f� 66 6 0 0 177 g� 660 2 277 h� 666 6 3 3 77
67 9 7 97 6 2 4 4 3 75 66 5 675 66 2 5 75
4 5 4 4 4 3
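Exercise A.3.2: Compute the inverse of the given matrices
a) $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}$
b) $\begin{bmatrix} 1 & 1 & 1 \\ 0 & 2 & 1 \\ 0 & 0 & 1 \end{bmatrix}$
c) $\begin{bmatrix} 1 & 2 & 3 \\ 2 & 0 & 1 \\ 0 & 2 & 1 \end{bmatrix}$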
Exercise A.3.3: Solve (find all solutions), or show no solution exists
x1 + 5x 2 + 3x3 ⇤ 7
4x 1 + 3x2 ⇤ 2
a� b� 8x1 + 7x 2 + 8x3 ⇤ 8
x1 + x2 ⇤ 4
4x1 + 8x 2 + 6x3 ⇤ 4
4x 1 + 8x2 + 2x 3 ⇤ 3 x + 2y + 3z ⇤ 4
c� x1 2x2 + 3x 3 ⇤ 1 d� 2x y + 3z ⇤ 1
4x 1 + 8x2 ⇤2 3x + y + 6z ⇤ 6
Exercise A.3.4: By computing the inverse, solve the following systems for xÆ.
4 1 13 3 3 2
a� xÆ ⇤ b� xÆ ⇤
1 3 26 3 4 1
Exercise A.3.5: Compute the rank of the given matrices
26 3 53 25 137 21 3 37
6 7 6 2 6 2
a� 661 4 177 b� 663 0 6 77 c� 66 1 2 377
67 7 67 62 5 75 62 6 75
4 5 4 4 4 4
Exercise A.3.6: For the matrices in Exercise A.3.5, find a linearly independent set of row vectors
that span the row space (they don't need to be rows of the matrix).
Exercise A.3.7: For the matrices in Exercise A.3.5, find a linearly independent set of columns that
span the column space. That is, find the pivot columns of the matrices.
Exercise A.3.8: Find a linearly independent subset of the following vectors that has the same span.
2 13 223 2 23 2 13
6 7 6 7 6 7 6 7
617, 6 27 , 647, 637
6 7 6 7 6 7 6 7
627 6 47 617 6 27
4 5 4 5 4 5 4 5
Exercise A.3.101: Compute the reduced row echelon form for the following matrices:
21 1 37
6 3
d� 66 4 277
1 0 1 1 2 1 1
a� b� c� 6
0 1 0 3 4 2 2 6 2 275
4 6
22 2 37 2 2 6 4 337
6 2 5 6
e� 661 177 f� 66 6 0 3 077
0 0 0 0 1 2 3 3
2 4 g� h�
60 275 64 2 1 175
0 0 0 0 1 2 3 5
4 3 1 4
Exercise A.3.102: Compute the inverse of the given matrices
2 0 1 03 21 1 13 22 4 03
6 7 6 7 6 7
a� 66 1 0 077 b� 661 1 077 c� 662 2 377
6 0 0 17 61 0 07 62 4 17
4 5 4 5 4 5
Exercise A.3.103: Solve (find all solutions), or show no solution exists
5x + 6y + 5z ⇤ 7
4x1 + 3x2 ⇤ 1
a� b� 6x + 8y + 6z ⇤ 1
5x1 + 6x2 ⇤ 4
5x + 2y + 5z ⇤ 2
such that every other vector in $S$ is a linear combination of $\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_k$, then the set
$\{ \vec{v}_1, \vec{v}_2, \ldots, \vec{v}_k \}$ is called a basis of $S$. In other words, $S$ is the span of $\{ \vec{v}_1, \vec{v}_2, \ldots, \vec{v}_k \}$. We
say that $S$ has dimension $k$, and we write
\[
\dim S = k .
\]
Theorem A.4.1. If $S \subset \mathbb{R}^n$ is a subspace and $S$ is not the trivial subspace $\{ \vec{0} \}$, then there exists
a unique positive integer $k$ (the dimension) and a (not unique) basis $\{ \vec{v}_1, \vec{v}_2, \ldots, \vec{v}_k \}$, such that
every $\vec{w}$ in $S$ can be uniquely represented by
\[
\vec{w} = \alpha_1 \vec{v}_1 + \alpha_2 \vec{v}_2 + \cdots + \alpha_k \vec{v}_k ,
\]
\[
\dim \mathbb{R}^n = n .
\]
We would have simply formed the matrix A with these vectors as columns and repeated
the computation above. The subspace X is then the column space of A.
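Example A.4.4: Consider the matrix
\[
L = \begin{bmatrix} 1 & 2 & 0 & 0 & 3 \\ 0 & 0 & 1 & 0 & 4 \\ 0 & 0 & 0 & 1 & 5 \end{bmatrix} .
\]
Conveniently, the matrix is in reduced row echelon form. The matrix is of rank 3. The
column space is the span of the pivot columns. It is the 3-dimensional space
\[
\text{column space of } L = \operatorname{span} \left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} , \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} , \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\} = \mathbb{R}^3 .
\]
The row space is the 3-dimensional space
\[
\text{row space of } L = \operatorname{span} \left\{ \begin{bmatrix} 1 & 2 & 0 & 0 & 3 \end{bmatrix} , \begin{bmatrix} 0 & 0 & 1 & 0 & 4 \end{bmatrix} , \begin{bmatrix} 0 & 0 & 0 & 1 & 5 \end{bmatrix} \right\} .
\]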
A.4.2 Kernel
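The set of solutions of a linear equation $L \vec{x} = \vec{0}$, the kernel of $L$, is a subspace: If $\vec{x}$ and $\vec{y}$
are solutions, then
\[
L(\vec{x} + \vec{y}) = L \vec{x} + L \vec{y} = \vec{0} + \vec{0} = \vec{0},
\qquad \text{and} \qquad
L(\alpha \vec{x}) = \alpha L \vec{x} = \alpha \vec{0} = \vec{0} .
\]
So $\vec{x} + \vec{y}$ and $\alpha \vec{x}$ are solutions. The dimension of the kernel is called the nullity of the
matrix.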
The same sort of idea governs the solutions of linear differential equations. We try to
describe the kernel of a linear differential operator, and as it is a subspace, we look for a
basis of this kernel. Much of this book is dedicated to finding such bases.
The kernel of a matrix is the same as the kernel of its reduced row echelon form. For a
matrix in reduced row echelon form, the kernel is rather easy to find. If a vector $\vec{x}$ is
applied to a matrix $L$, then each entry in $\vec{x}$ corresponds to a column of $L$, the column that
the entry multiplies. To find the kernel, pick a non-pivot column and make a vector that has a
1 in the entry corresponding to this non-pivot column and zeros at all the other entries
corresponding to the other non-pivot columns. Then for each entry corresponding
to a pivot column, put the negative of the value in the corresponding row of the non-pivot
column, so that the vector is a solution to $L \vec{x} = \vec{0}$. This procedure is best understood by
example.
Example A.4.5: Consider
\[
L = \begin{bmatrix} \boxed{1} & 2 & 0 & 0 & 3 \\ 0 & 0 & \boxed{1} & 0 & 4 \\ 0 & 0 & 0 & \boxed{1} & 5 \end{bmatrix} .
\]
This matrix is in reduced row echelon form, the pivots are marked. There are two non-pivot
columns, so the kernel has dimension 2, that is, it is the span of 2 vectors. Let us find the
first vector. We look at the first non-pivot column, the 2nd column, and we put a 1 in the
2nd entry of our vector. We put a 0 in the 5th entry as the 5th column is also a non-pivot
column:
\[
\begin{bmatrix} ? \\ 1 \\ ? \\ ? \\ 0 \end{bmatrix} .
\]
Let us fill the rest. When this vector hits the first row, we get a 2 and 1 times whatever the
first question mark is. So make the first question mark $-2$. For the second and third rows, it is
sufficient to make the question marks zero. We are really filling in the negative of the non-pivot
column into the remaining entries. Let us check:
\[
\begin{bmatrix} 1 & 2 & 0 & 0 & 3 \\ 0 & 0 & 1 & 0 & 4 \\ 0 & 0 & 0 & 1 & 5 \end{bmatrix}
\begin{bmatrix} -2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} .
\]
Yay! How about the second vector. We start with
\[
\begin{bmatrix} ? \\ 0 \\ ? \\ ? \\ 1 \end{bmatrix} .
\]
We set the first question mark to $-3$, the second to $-4$, and the third to $-5$. Let us check,
as previously,
\[
\begin{bmatrix} 1 & 2 & 0 & 0 & 3 \\ 0 & 0 & 1 & 0 & 4 \\ 0 & 0 & 0 & 1 & 5 \end{bmatrix}
\begin{bmatrix} -3 \\ 0 \\ -4 \\ -5 \\ 1 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} .
\]
There are two non-pivot columns, so we only need two vectors. We have found the basis of
the kernel. So,
\[
\text{kernel of } L = \operatorname{span} \left\{
\begin{bmatrix} -2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} ,
\begin{bmatrix} -3 \\ 0 \\ -4 \\ -5 \\ 1 \end{bmatrix}
\right\} .
\]
What we did in finding a basis of the kernel is we expressed all solutions of $L \vec{x} = \vec{0}$ as a
linear combination of some given vectors.
The procedure to find the basis of the kernel of a matrix $L$:
(i) Find the reduced row echelon form of $L$.
(ii) Write down the basis of the kernel as above, one vector for each non-pivot column.
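The two steps of the procedure can be sketched in Python with exact fractions. This is only an illustration under stated assumptions: the helper names `rref` and `kernel_basis` are made up for this sketch, and columns are numbered from 0 as is usual in Python.

```python
from fractions import Fraction

def rref(matrix):
    """Return (reduced row echelon form, list of pivot column indices)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    pivots = []
    r = 0  # index of the row where the next pivot will be placed
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot here, so this is a non-pivot column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        p = rows[r][c]
        rows[r] = [x / p for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    return rows, pivots

def kernel_basis(matrix):
    """One basis vector of the kernel for each non-pivot column."""
    reduced, pivots = rref(matrix)
    n = len(matrix[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)  # 1 in this non-pivot entry, 0 in the other non-pivot entries
        for row, p in zip(reduced, pivots):
            v[p] = -row[free]  # negative of the non-pivot column entry in the pivot's row
        basis.append(v)
    return basis

L = [[1, 2, 0, 0, 3],
     [0, 0, 1, 0, 4],
     [0, 0, 0, 1, 5]]
for v in kernel_basis(L):
    print([int(x) for x in v])
```

For the matrix $L$ of Example A.4.5 the sketch prints the two vectors found by hand.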
The rank of a matrix is the dimension of the column space, and that is the span of the
pivot columns, while the kernel is the span of vectors, one for each non-pivot column. So
the two numbers must add to the number of columns.
Theorem A.4.3 (Rank–Nullity). If a matrix $A$ has $n$ columns, rank $r$, and nullity $k$ (dimension
of the kernel), then
\[
n = r + k .
\]
The theorem is immensely useful in applications. It allows one to compute the rank r if
one knows the nullity k and vice-versa, without doing any extra work.
Let us consider an example application, a simple version of the so-called Fredholm
alternative. A similar result is true for differential equations. Consider
\[
A \vec{x} = \vec{b} ,
\]
where $A$ is a square $n \times n$ matrix. There are then two mutually exclusive possibilities:
How does the Rank–Nullity theorem come into the picture? Well, if $A$ has a nonzero
solution $\vec{x}$ to $A \vec{x} = \vec{0}$, then the nullity $k$ is positive. But then the rank $r = n - k$ must be less
than $n$. In particular it means that the column space of $A$ is of dimension less than $n$, so it
is a subspace that does not include everything in $\mathbb{R}^n$. So $\mathbb{R}^n$ has to contain some vector $\vec{b}$
not in the column space of $A$. In fact, most vectors in $\mathbb{R}^n$ are not in the column space of $A$.
A.4.3 Exercises
Exercise A.4.1: For the following sets of vectors, find a basis for the subspace spanned by the
vectors, and find the dimension of the subspace.
213 2 13 213 20 3 203 2 43 22 3 223
6 7 6 7 6 7 6 7 6 7 6 7 6 7 6 7
a� 66177 , 6 17
6 7 b� 66077 , 61 7 ,
6 7
6 17
6 7 c� 66 377 , 63 7 ,
6 7
607
6 7
617 6 17 657 60 7 607 657 63 7 627
4 5 4 5 4 5 4 5 4 5 4 5 4 5 4 5
213 20 3 2 13 233 223 2 53
6 7 6 7 6 7 6 7 6 7 6 7
d� 66377 , 62 7 , 6 17 f� 66177 , 647, 6 57
1 0 1
6 7 6 7 e� , , 6 7 6 7
607 62 7 627 3 2 1 637 6 47 6 27
4 5 4 5 4 5 4 5 4 5 4 5
Exercise A.4.2: For the following matrices, find a basis for the kernel (nullspace).
21 1 1 37 22 337 2 4 4 43 2 2 1 1 13
6 6 1 6 7 6 7
a� 661 1 5 77 b� 66 4 0 477 c� 66 1 1 177 d� 66 4 2 2 277
61 1 475 6 1 2 75 6 5 5 57 6 1 0 4 37
4 4 1 4 5 4 5
Exercise A.4.3: Suppose a 5 ⇥ 5 matrix A has rank �. What is the nullity�
Exercise A.4.4: Suppose that $X$ is the set of all the vectors of $\mathbb{R}^3$ whose third component is zero. Is
$X$ a subspace? And if so, find a basis and the dimension.
Exercise A.4.5: Consider a square matrix $A$, and suppose that $\vec{x}$ is a nonzero vector such that
$A \vec{x} = \vec{0}$. What does the Fredholm alternative say about invertibility of $A$?
Exercise A.4.6: Consider
2 1 2 33
6 7
M ⇤ 66 2 ? ?77 .
6 1 ? ?7
4 5
If the nullity of this matrix is �, fill in the question marks. Hint� What is the rank�
Exercise A.4.101: For the following sets of vectors, find a basis for the subspace spanned by the
vectors, and find the dimension of the subspace.
213 22 3 213 253 253 2 13
6 7 6 7 6 7 6 7 6 7 6 7
b� 66177 , 62 7 , 617 c� 66377 , 6 17 , 637
1 1
a� , 6 7 6 7 6 7 6 7
2 1 617 62 7 627 617 657 6 47
4 5 4 5 4 5 4 5 4 5 4 5
223 22 3 243 213 223 203
6 7 6 7 6 7 6 7 6 7 6 7
d� 66277 , 62 7 , 647 f� 66077 , 607 , 617
1 2 3
6 7 6 7 e� , , 6 7 6 7
647 63 7 6 37 0 0 0 607 607 627
4 5 4 5 4 5 4 5 4 5 4 5
Exercise A.4.102: For the following matrices, find a basis for the kernel (nullspace).
22 6 1 93 22 537 21 437 20 4 43
6 7 6 2 6 5 6 7
a� 661 3 2 977 b� 66 1 1 5 77 c� 66 2 3 5 77 d� 660 1 177
63 9 0 97 6 5 375 6 3 2 75 60 5 57
4 5 4 5 4 5 4 5
a� Rank of A. b� Nullity of A.
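(iv) $\langle \vec{x} + \vec{y} , \vec{z} \rangle = \langle \vec{x} , \vec{z} \rangle + \langle \vec{y} , \vec{z} \rangle$ and $\langle \vec{x} , \vec{y} + \vec{z} \rangle = \langle \vec{x} , \vec{y} \rangle + \langle \vec{x} , \vec{z} \rangle$.
In fact, anything that satisfies the above properties can be called an inner product, although
in this section we are concerned with the standard inner product in $\mathbb{R}^n$.
The standard inner product gives the euclidean length:
\[
\lVert \vec{x} \rVert = \sqrt{\langle \vec{x}, \vec{x} \rangle} = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} .
\]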
That is, $\theta$ is the angle that $\vec{x}$ and $\vec{y}$ make when they are based at the same point.
In $\mathbb{R}^n$, we are simply going to say that $\theta$ from the formula is what the angle is. This
makes sense as any two vectors based at the origin lie in a 2-dimensional plane (subspace),
and the formula works in 2 dimensions. In fact, one could even talk about angles between
functions this way, and we do in chapter 4, where we talk about orthogonal functions
(functions at right angle to each other).
To compute the angle we compute
\[
\cos \theta = \frac{\langle \vec{x}, \vec{y} \rangle}{\lVert \vec{x} \rVert \, \lVert \vec{y} \rVert} .
\]
Our angles are always in radians. We are computing the cosine of the angle, which is really
the best we can do. Given two vectors at an angle $\theta$, we can give the angle as $\theta$, $2\pi - \theta$,
etc., see Figure A.5. Fortunately, $\cos \theta = \cos(-\theta) = \cos(2\pi - \theta)$. If we solve for $\theta$ using the
inverse cosine $\cos^{-1}$, we can just decree that $0 \leq \theta \leq \pi$.
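[Figure A.5: the vectors $\vec{x}$ and $\vec{y}$ based at the same point, with the angle between them marked as $\theta$ or $2\pi - \theta$.]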
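Example A.5.1: Let us compute the angle between the vectors $(3, 0)$ and $(1, 1)$ in the plane.
Compute
\[
\cos \theta = \frac{\bigl\langle (3,0), (1,1) \bigr\rangle}{\lVert (3,0) \rVert \, \lVert (1,1) \rVert} = \frac{3 + 0}{3 \sqrt{2}} = \frac{1}{\sqrt{2}} .
\]
Therefore $\theta = \pi/4$.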
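The same computation can be checked numerically. The following is a small Python sketch; the helper names `inner` and `angle` are made up for this illustration, and the clamp only guards against rounding error pushing the cosine slightly outside $[-1, 1]$.

```python
import math

def inner(x, y):
    """Standard inner product of two vectors of the same length."""
    return sum(a * b for a, b in zip(x, y))

def angle(x, y):
    """Angle in radians, between 0 and pi, between nonzero vectors x and y."""
    c = inner(x, y) / (math.sqrt(inner(x, x)) * math.sqrt(inner(y, y)))
    # Clamp to [-1, 1] in case floating point rounding overshoots.
    return math.acos(max(-1.0, min(1.0, c)))

theta = angle((3, 0), (1, 1))
print(theta, math.pi / 4)  # the two numbers should agree up to rounding
```

Since `math.acos` returns values between $0$ and $\pi$, the sketch follows the same convention for $\theta$ as the text.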
As we said, the most important angle is the right angle. A right angle is $\pi/2$ radians,
and $\cos(\pi/2) = 0$, so the formula is particularly easy in this case. We say vectors $\vec{x}$ and $\vec{y}$ are
orthogonal if they are at right angles, that is if
\[
\langle \vec{x}, \vec{y} \rangle = 0 .
\]
The vectors $(1, 0, 0, 1)$ and $(1, 2, 3, -1)$ are orthogonal. So are $(1, 1)$ and $(1, -1)$. However,
$(1, 1)$ and $(1, 2)$ are not orthogonal as their inner product is 3 and not 0.
For the geometric idea, see Figure A.6. That is, we find the "shadow of $\vec{w}$" on the line
spanned by $\vec{v}$ if the direction of the sun's rays were exactly perpendicular to the line.
Another way of thinking about it is that the tip of the arrow of $\operatorname{proj}_{\vec{v}}(\vec{w})$ is the closest point
on the line spanned by $\vec{v}$ to the tip of the arrow of $\vec{w}$. In terms of euclidean distance,
$\vec{u} = \operatorname{proj}_{\vec{v}}(\vec{w})$ minimizes the distance $\lVert \vec{w} - \vec{u} \rVert$ among all vectors $\vec{u}$ that are multiples of
$\vec{v}$. Because of this, this projection comes up often in applied mathematics in all sorts of
contexts where we cannot solve a problem exactly: We can't always solve "Find $\vec{w}$ as a multiple of
$\vec{v}$," but $\operatorname{proj}_{\vec{v}}(\vec{w})$ is the best "solution."
Figure A.6: Orthogonal projection.
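The formula follows from basic trigonometry. The length of $\operatorname{proj}_{\vec{v}}(\vec{w})$ should be $\cos \theta$
times the length of $\vec{w}$, that is $(\cos \theta) \lVert \vec{w} \rVert$. We take the unit vector in the direction of $\vec{v}$, that
is, $\frac{\vec{v}}{\lVert \vec{v} \rVert}$, and we multiply it by the length of the projection. In other words,
\[
\operatorname{proj}_{\vec{v}}(\vec{w}) = (\cos \theta) \lVert \vec{w} \rVert \frac{\vec{v}}{\lVert \vec{v} \rVert}
= \frac{(\cos \theta) \lVert \vec{w} \rVert \, \lVert \vec{v} \rVert}{\lVert \vec{v} \rVert^2} \vec{v}
= \frac{\langle \vec{w}, \vec{v} \rangle}{\langle \vec{v}, \vec{v} \rangle} \vec{v} .
\]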
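Example A.5.2: Suppose we wish to project the vector $(3, 2, 1)$ onto the vector $(1, 2, 3)$.
Compute
\[
\operatorname{proj}_{(1,2,3)} \bigl( (3,2,1) \bigr)
= \frac{\bigl\langle (3,2,1), (1,2,3) \bigr\rangle}{\bigl\langle (1,2,3), (1,2,3) \bigr\rangle} (1,2,3)
= \frac{10}{14} (1,2,3)
= \left( \frac{5}{7}, \frac{10}{7}, \frac{15}{7} \right) .
\]
Let us double check that the projection is orthogonal. That is, the difference
\[
(3,2,1) - \left( \frac{5}{7}, \frac{10}{7}, \frac{15}{7} \right) = \left( \frac{16}{7}, \frac{4}{7}, \frac{-8}{7} \right)
\]
ought to be orthogonal to $(1, 2, 3)$. We compute the inner product and we had better get
zero:
\[
\left\langle \left( \frac{16}{7}, \frac{4}{7}, \frac{-8}{7} \right) , (1, 2, 3) \right\rangle = \frac{16}{7} \cdot 1 + \frac{4}{7} \cdot 2 - \frac{8}{7} \cdot 3 = 0 .
\]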
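The check above can be repeated with exact arithmetic. The sketch below assumes the reading of the example in which $(3, 2, 1)$ is projected onto $(1, 2, 3)$, which is the reading consistent with the residual $(16/7, 4/7, -8/7)$; the function names `inner` and `proj` are made up for this illustration.

```python
from fractions import Fraction

def inner(x, y):
    """Standard inner product, computed exactly with fractions."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(x, y))

def proj(v, w):
    """Orthogonal projection of w onto the line spanned by the nonzero vector v."""
    coeff = inner(w, v) / inner(v, v)
    return [coeff * a for a in v]

v, w = (1, 2, 3), (3, 2, 1)
p = proj(v, w)
residual = [b - a for a, b in zip(p, w)]  # w minus its projection

print([str(x) for x in p])         # components of the projection
print([str(x) for x in residual])  # should be orthogonal to v
print(inner(residual, v))          # inner product with v
```

The last printed value being zero is exactly the orthogonality check done by hand in the example.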
for all choices of $j$ and $k$ where $j \neq k$ (a nonzero vector cannot be orthogonal to itself).
A basis is furthermore called an orthonormal basis if all the vectors in a basis are also
unit vectors, that is, if all the vectors have magnitude 1. For example, the standard basis
{(1, 0, 0), (0, 1, 0), (0, 0, 1)} is an orthonormal basis of R3 : Any pair is orthogonal, and each
vector is of unit magnitude.
The reason why we are interested in orthogonal (or orthonormal) bases is that they
make it really simple to represent a vector (or a projection onto a subspace) in the basis.
The simple formula for the orthogonal projection onto a vector gives us the coefficients.
In chapter 4 we use the same idea by finding the correct orthogonal basis for the set of
solutions of a differential equation. We are then able to find any particular solution by simply
applying the orthogonal projection formula, which is just a couple of inner products.
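Let us come back to linear algebra. Suppose that we have a subspace and an orthogonal
basis $\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_n$. We wish to express $\vec{x}$ in terms of the basis. If $\vec{x}$ is not in the span of
the basis (when it is not in the given subspace), then of course it is not possible, but the
following formula gives us at least the orthogonal projection onto the subspace.
First suppose that $\vec{x}$ is in the span. Then it is the sum of the orthogonal projections:
\[
\vec{x} = \operatorname{proj}_{\vec{v}_1}(\vec{x}) + \operatorname{proj}_{\vec{v}_2}(\vec{x}) + \cdots + \operatorname{proj}_{\vec{v}_n}(\vec{x})
= \frac{\langle \vec{x}, \vec{v}_1 \rangle}{\langle \vec{v}_1, \vec{v}_1 \rangle} \vec{v}_1
+ \frac{\langle \vec{x}, \vec{v}_2 \rangle}{\langle \vec{v}_2, \vec{v}_2 \rangle} \vec{v}_2
+ \cdots
+ \frac{\langle \vec{x}, \vec{v}_n \rangle}{\langle \vec{v}_n, \vec{v}_n \rangle} \vec{v}_n .
\]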
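Another way to derive this formula is to work in reverse. Suppose that $\vec{x} = a_1 \vec{v}_1 + a_2 \vec{v}_2 +
\cdots + a_n \vec{v}_n$. Take an inner product with $\vec{v}_j$, and use the properties of the inner product:
\[
\langle \vec{x}, \vec{v}_j \rangle
= \langle a_1 \vec{v}_1 + a_2 \vec{v}_2 + \cdots + a_n \vec{v}_n , \vec{v}_j \rangle
= a_1 \langle \vec{v}_1, \vec{v}_j \rangle + a_2 \langle \vec{v}_2, \vec{v}_j \rangle + \cdots + a_n \langle \vec{v}_n, \vec{v}_j \rangle .
\]
As the basis is orthogonal, $\langle \vec{v}_k, \vec{v}_j \rangle = 0$ whenever $k \neq j$. That means that only one of
the terms, the $j^{\text{th}}$ one, on the right hand side is nonzero and we get
\[
\langle \vec{x}, \vec{v}_j \rangle = a_j \langle \vec{v}_j, \vec{v}_j \rangle .
\]
Solving for $a_j$ we find $a_j = \frac{\langle \vec{x}, \vec{v}_j \rangle}{\langle \vec{v}_j, \vec{v}_j \rangle}$
as before.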
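Example A.5.3: The vectors $(1, 1)$ and $(1, -1)$ form an orthogonal basis of $\mathbb{R}^2$. Suppose we
wish to represent $(3, 4)$ in terms of this basis, that is, we wish to find $a_1$ and $a_2$ such that
\[
(3, 4) = a_1 (1, 1) + a_2 (1, -1) .
\]
We compute:
\[
a_1 = \frac{\bigl\langle (3,4), (1,1) \bigr\rangle}{\bigl\langle (1,1), (1,1) \bigr\rangle} = \frac{7}{2} ,
\qquad
a_2 = \frac{\bigl\langle (3,4), (1,-1) \bigr\rangle}{\bigl\langle (1,-1), (1,-1) \bigr\rangle} = \frac{-1}{2} .
\]
So
\[
(3, 4) = \frac{7}{2} (1, 1) + \frac{-1}{2} (1, -1) .
\]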
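If the basis is orthonormal rather than orthogonal, then the denominators are always
just one. It is easy to make a basis orthonormal, just by dividing all the vectors by their
size. If you want to decompose many vectors, it may be better to find an orthonormal basis.
In the above example, the orthonormal basis we would thus create is
\[
\left( \frac{1}{\sqrt{2}} , \frac{1}{\sqrt{2}} \right) , \quad \left( \frac{1}{\sqrt{2}} , \frac{-1}{\sqrt{2}} \right) .
\]
Then the computation would have been
\[
\begin{aligned}
(3, 4) & = \left\langle (3,4), \left( \tfrac{1}{\sqrt{2}} , \tfrac{1}{\sqrt{2}} \right) \right\rangle \left( \tfrac{1}{\sqrt{2}} , \tfrac{1}{\sqrt{2}} \right)
+ \left\langle (3,4), \left( \tfrac{1}{\sqrt{2}} , \tfrac{-1}{\sqrt{2}} \right) \right\rangle \left( \tfrac{1}{\sqrt{2}} , \tfrac{-1}{\sqrt{2}} \right) \\
& = \frac{7}{\sqrt{2}} \left( \tfrac{1}{\sqrt{2}} , \tfrac{1}{\sqrt{2}} \right) + \frac{-1}{\sqrt{2}} \left( \tfrac{1}{\sqrt{2}} , \tfrac{-1}{\sqrt{2}} \right) .
\end{aligned}
\]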
Maybe the example is not so awe inspiring, but given vectors in $\mathbb{R}^{20}$ rather than $\mathbb{R}^2$,
then surely one would much rather do 20 inner products (or 40 if we did not have an
orthonormal basis) rather than solving a system of twenty equations in twenty unknowns
using row reduction of a $20 \times 21$ matrix.
As we said above, the formula still works even if $\vec{x}$ is not in the subspace, although
then it does not get us the vector $\vec{x}$ but its projection. More concretely, suppose that $S$ is a
subspace that is the span of $\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_n$ and $\vec{x}$ is any vector. Let $\operatorname{proj}_S(\vec{x})$ be the vector in
$S$ that is the closest to $\vec{x}$. Then
\[
\operatorname{proj}_S(\vec{x}) =
\frac{\langle \vec{x}, \vec{v}_1 \rangle}{\langle \vec{v}_1, \vec{v}_1 \rangle} \vec{v}_1
+ \frac{\langle \vec{x}, \vec{v}_2 \rangle}{\langle \vec{v}_2, \vec{v}_2 \rangle} \vec{v}_2
+ \cdots
+ \frac{\langle \vec{x}, \vec{v}_n \rangle}{\langle \vec{v}_n, \vec{v}_n \rangle} \vec{v}_n .
\]
Of course, if xÆ is in S, then projS (xÆ) ⇤ xÆ, as the closest vector in S to xÆ is xÆ itself. But
true utility is obtained when xÆ is not in S. In much of applied mathematics we cannot find
an exact solution to a problem, but we try to find the best solution out of a small subset
(subspace). The partial sums of Fourier series from chapter 4 are one example. Another
example is least square approximation to fit a curve to data. Yet another example is given
by the most commonly used numerical methods to solve differential equations, the finite
element methods.
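Example A.5.4: The vectors $(1, 2, 3)$ and $(3, 0, -1)$ are orthogonal, and so they are an
orthogonal basis of a subspace $S$:
\[
S = \operatorname{span} \bigl\{ (1, 2, 3), (3, 0, -1) \bigr\} .
\]
Let us find the vector in $S$ that is closest to $(2, 1, 0)$. That is, let us find $\operatorname{proj}_S \bigl( (2, 1, 0) \bigr)$.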
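The projection asked for in this example can be computed with a short Python sketch of the formula for $\operatorname{proj}_S$. The sketch assumes the second basis vector is $(3, 0, -1)$, since that is the sign needed for the two vectors to be orthogonal; the function names `inner` and `proj_subspace` are made up for this illustration.

```python
from fractions import Fraction

def inner(x, y):
    """Standard inner product, computed exactly with fractions."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(x, y))

def proj_subspace(basis, x):
    """Project x onto span(basis); the basis vectors must be nonzero and mutually orthogonal."""
    result = [Fraction(0)] * len(x)
    for v in basis:
        coeff = inner(x, v) / inner(v, v)  # coefficient <x, v> / <v, v>
        result = [r + coeff * a for r, a in zip(result, v)]
    return result

basis = [(1, 2, 3), (3, 0, -1)]  # assumed orthogonal: their inner product is 3 + 0 - 3 = 0
x = (2, 1, 0)
p = proj_subspace(basis, x)
print([str(c) for c in p])

# The difference x - p should be orthogonal to every basis vector.
residual = [Fraction(a) - b for a, b in zip(x, p)]
print([inner(residual, v) for v in basis])
```

If the basis were not orthogonal, the formula would not give the closest vector, which is why the sketch states orthogonality as a requirement rather than checking it.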
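\[
\begin{aligned}
\vec{w}_1 & = \vec{v}_1 , \\
\vec{w}_2 & = \vec{v}_2 - \operatorname{proj}_{\vec{w}_1}(\vec{v}_2) , \\
\vec{w}_3 & = \vec{v}_3 - \operatorname{proj}_{\vec{w}_1}(\vec{v}_3) - \operatorname{proj}_{\vec{w}_2}(\vec{v}_3) , \\
\vec{w}_4 & = \vec{v}_4 - \operatorname{proj}_{\vec{w}_1}(\vec{v}_4) - \operatorname{proj}_{\vec{w}_2}(\vec{v}_4) - \operatorname{proj}_{\vec{w}_3}(\vec{v}_4) , \\
& \;\; \vdots \\
\vec{w}_n & = \vec{v}_n - \operatorname{proj}_{\vec{w}_1}(\vec{v}_n) - \operatorname{proj}_{\vec{w}_2}(\vec{v}_n) - \cdots - \operatorname{proj}_{\vec{w}_{n-1}}(\vec{v}_n) .
\end{aligned}
\]
What we do is at the $k^{\text{th}}$ step, we take $\vec{v}_k$ and we subtract the projection of $\vec{v}_k$ to the subspace
spanned by $\vec{w}_1, \vec{w}_2, \ldots, \vec{w}_{k-1}$.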
Example A.5.5: Consider the vectors $(1, 2, -1)$, and $(0, 5, -2)$ and call $S$ the span of the two
vectors. Let us find an orthogonal basis of $S$:
\[
\begin{aligned}
\vec{w}_1 & = (1, 2, -1) , \\
\vec{w}_2 & = (0, 5, -2) - \operatorname{proj}_{(1,2,-1)} \bigl( (0, 5, -2) \bigr) \\
& = (0, 5, -2) - \frac{\bigl\langle (0,5,-2), (1,2,-1) \bigr\rangle}{\bigl\langle (1,2,-1), (1,2,-1) \bigr\rangle} (1, 2, -1)
= (0, 5, -2) - 2 (1, 2, -1) = (-2, 1, 0) .
\end{aligned}
\]
So $(1, 2, -1)$ and $(-2, 1, 0)$ span $S$ and are orthogonal. Let us check: $(1, 2, -1) \cdot (-2, 1, 0) = 0$.
Suppose we wish to find an orthonormal basis, not just an orthogonal one. Well, we
simply make the vectors into unit vectors by dividing them by their magnitude. The two
vectors making up the orthonormal basis of $S$ are:
\[
\frac{1}{\sqrt{6}} (1, 2, -1) = \left( \frac{1}{\sqrt{6}} , \frac{2}{\sqrt{6}} , \frac{-1}{\sqrt{6}} \right) ,
\qquad
\frac{1}{\sqrt{5}} (-2, 1, 0) = \left( \frac{-2}{\sqrt{5}} , \frac{1}{\sqrt{5}} , 0 \right) .
\]
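The Gram-Schmidt process is easy to mechanize. Below is a minimal Python sketch using exact fractions; it assumes the input vectors are linearly independent (otherwise some $\vec{w}_k$ is zero and a later step divides by zero), and the function names are made up for this illustration.

```python
from fractions import Fraction

def inner(x, y):
    """Standard inner product, computed exactly with fractions."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(x, y))

def gram_schmidt(vectors):
    """Return an orthogonal basis with the same span as the linearly independent input."""
    basis = []
    for v in vectors:
        w = [Fraction(a) for a in v]
        # Subtract from v its projection onto each previously found w_1, ..., w_{k-1}.
        for u in basis:
            coeff = inner(v, u) / inner(u, u)
            w = [a - coeff * b for a, b in zip(w, u)]
        basis.append(w)
    return basis

w1, w2 = gram_schmidt([(1, 2, -1), (0, 5, -2)])
print([int(a) for a in w1], [int(a) for a in w2])
print(inner(w1, w2))  # inner product of the two resulting vectors
```

To get an orthonormal basis one would further divide each vector by its magnitude, which introduces square roots and so is left out of this exact-arithmetic sketch.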
A.5.5 Exercises
Exercise A.5.1: Find the $s$ that makes the following vectors orthogonal: $(1, 2, 3)$, $(1, 1, s)$.
Exercise A.5.2: Find the angle $\theta$ between $(1, 3, 1)$, $(2, 1, 1)$.
a) $\langle \vec{u} , 2 \vec{v} \rangle$ b) $\langle \vec{v} , 2 \vec{w} + 3 \vec{u} \rangle$ c) $\langle \vec{w} + 3 \vec{u} , \vec{v} \rangle$
Exercise A.5.5: Consider the vectors $(1, 2, 3)$, $(-3, 0, 1)$, $(1, -5, 3)$.
a) Check that the vectors are linearly independent and so form a basis.
b) Check that the vectors are mutually orthogonal, and are therefore an orthogonal basis.
Exercise A.5.6: Let S be the subspace spanned by (1, 3, 1), (1, 1, 1). Find an orthogonal basis of
S by the Gram-Schmidt process.
Exercise A.5.7: Starting with (1, 2, 3), (1, 1, 1), (2, 2, 0), follow the Gram-Schmidt process to find
an orthogonal basis of R3 .
Exercise A.5.8: Find an orthogonal basis of $\mathbb{R}^3$ such that $(3, 1, 2)$ is one of the vectors. Hint:
First find two extra vectors to make a linearly independent set.
Exercise A.5.9: Using cosines and sines of $\theta$, find a unit vector $\vec{u}$ in $\mathbb{R}^2$ that makes angle $\theta$ with
$\vec{\imath} = (1, 0)$. What is $\langle \vec{\imath} , \vec{u} \rangle$?
Exercise A.5.101: Find the $s$ that makes the following vectors orthogonal: $(1, 1, 1)$, $(1, s, 1)$.
Exercise A.5.102: Find the angle $\theta$ between $(1, 2, 3)$, $(1, 1, 1)$.
Exercise A.5.105: The vectors (1, 1, 1), (2, 1, 1), (1, 5, 3) for an orthonormal basis. Represent
the following vectors in terms of this basis.
Exercise A.5.106: Let S be the subspace spanned by (2, 1, 1), (2, 2, 2). Find an orthogonal basis
of S by the Gram-Schmidt process.
Exercise A.5.107: Starting with (1, 1, 1), (2, 3, 1), (1, 1, 1), follow the Gram-Schmidt process
to find an orthogonal basis of R3 .
A.6 Determinant
Note: 1 lecture
For square matrices we define a useful quantity called the determinant. We define the
determinant of a $1 \times 1$ matrix as the value of its only entry
\[
\det \bigl( \begin{bmatrix} a \end{bmatrix} \bigr) \overset{\text{def}}{=} a .
\]
Before defining the determinant for larger matrices, we note the meaning of the
determinant. An $n \times n$ matrix gives a mapping of the $n$-dimensional euclidean space $\mathbb{R}^n$
to itself. In particular, a $2 \times 2$ matrix $A$ is a mapping of the plane to itself. The determinant
of $A$ is the factor by which the area of objects changes. If we take the unit square (square of
side 1) in the plane, then $A$ takes the square to a parallelogram of area $\lvert \det(A) \rvert$. The sign of
$\det(A)$ denotes a change of orientation (negative if the axes get flipped). For example, let
\[
A = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} .
\]
Then $\det(A) = 1 + 1 = 2$. Let us see where $A$ sends the unit square with vertices $(0, 0)$, $(1, 0)$,
$(0, 1)$, and $(1, 1)$. The point $(0, 0)$ gets sent to $(0, 0)$.
\[
\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \end{bmatrix} ,
\qquad
\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} ,
\qquad
\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \end{bmatrix} .
\]
The image of the square is another square with vertices $(0, 0)$, $(1, -1)$, $(1, 1)$, and $(2, 0)$. The
image square has a side of length $\sqrt{2}$ and is therefore of area 2. See Figure A.7.
[Figure A.7: the unit square and its image under the mapping $A$.]
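The claim that the area scales by $\lvert \det(A) \rvert$ can be checked directly for this example. The Python sketch below maps the corners of the unit square by $A$ and computes the area of the image with the shoelace formula, a standard polygon area formula that is not part of the text above; the helper names are made up for this illustration, and the $2 \times 2$ determinant is computed as $ad - bc$.

```python
def apply(A, p):
    """Multiply the 2 x 2 matrix A by the point p, treated as a column vector."""
    return (A[0][0] * p[0] + A[0][1] * p[1], A[1][0] * p[0] + A[1][1] * p[1])

def polygon_area(points):
    """Signed area by the shoelace formula; positive when vertices run counterclockwise."""
    total = 0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2

A = [[1, 1], [-1, 1]]
square = [(0, 0), (1, 0), (1, 1), (0, 1)]  # unit square, counterclockwise
image = [apply(A, p) for p in square]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]

print(image)
print(polygon_area(square), polygon_area(image), det)
```

The signed area of the image stays positive here, matching the positive determinant: this particular $A$ does not flip the orientation.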
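\[
\det(A) \overset{\text{def}}{=} \sum_{j=1}^n (-1)^{1+j} a_{1j} \det(A_{1j}) ,
\]
or in other words
\[
\det(A) = a_{11} \det(A_{11}) - a_{12} \det(A_{12}) + a_{13} \det(A_{13}) - \cdots
\begin{cases}
+ a_{1n} \det(A_{1n}) & \text{if } n \text{ is odd,} \\
- a_{1n} \det(A_{1n}) & \text{if } n \text{ even.}
\end{cases}
\]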
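For a $3 \times 3$ matrix, we get $\det(A) = a_{11} \det(A_{11}) - a_{12} \det(A_{12}) + a_{13} \det(A_{13})$. For example,
\[
\begin{aligned}
\det \left( \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \right)
& = 1 \cdot \det \left( \begin{bmatrix} 5 & 6 \\ 8 & 9 \end{bmatrix} \right)
- 2 \cdot \det \left( \begin{bmatrix} 4 & 6 \\ 7 & 9 \end{bmatrix} \right)
+ 3 \cdot \det \left( \begin{bmatrix} 4 & 5 \\ 7 & 8 \end{bmatrix} \right) \\
& = 1 (5 \cdot 9 - 6 \cdot 8) - 2 (4 \cdot 9 - 6 \cdot 7) + 3 (4 \cdot 8 - 5 \cdot 7) = 0 .
\end{aligned}
\]
It turns out that we did not necessarily have to use the first row. That is, for any $i$,
\[
\det(A) = \sum_{j=1}^n (-1)^{i+j} a_{ij} \det(A_{ij}) .
\]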
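It is sometimes useful to use a row other than the first. In the following example it is more
convenient to expand along the second row. Notice that for the second row we are starting
with a negative sign.
\[
\begin{aligned}
\det \left( \begin{bmatrix} 1 & 2 & 3 \\ 0 & 5 & 0 \\ 7 & 8 & 9 \end{bmatrix} \right)
& = - 0 \cdot \det \left( \begin{bmatrix} 2 & 3 \\ 8 & 9 \end{bmatrix} \right)
+ 5 \cdot \det \left( \begin{bmatrix} 1 & 3 \\ 7 & 9 \end{bmatrix} \right)
- 0 \cdot \det \left( \begin{bmatrix} 1 & 2 \\ 7 & 8 \end{bmatrix} \right) \\
& = 0 + 5 (1 \cdot 9 - 3 \cdot 7) + 0 = -60 .
\end{aligned}
\]
Let us check if it is really the same as expanding along the first row,
\[
\begin{aligned}
\det \left( \begin{bmatrix} 1 & 2 & 3 \\ 0 & 5 & 0 \\ 7 & 8 & 9 \end{bmatrix} \right)
& = 1 \cdot \det \left( \begin{bmatrix} 5 & 0 \\ 8 & 9 \end{bmatrix} \right)
- 2 \cdot \det \left( \begin{bmatrix} 0 & 0 \\ 7 & 9 \end{bmatrix} \right)
+ 3 \cdot \det \left( \begin{bmatrix} 0 & 5 \\ 7 & 8 \end{bmatrix} \right) \\
& = 1 (5 \cdot 9 - 0 \cdot 8) - 2 (0 \cdot 9 - 0 \cdot 7) + 3 (0 \cdot 8 - 5 \cdot 7) = -60 .
\end{aligned}
\]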
In computing the determinant, we alternately add and subtract the determinants of the
submatrices $A_{ij}$ multiplied by $a_{ij}$ for a fixed $i$ and all $j$. The numbers $(-1)^{i+j} \det(A_{ij})$ are
called cofactors of the matrix. And that is why this method of computing the determinant
is called the cofactor expansion.
Similarly we do not need to expand along a row, we could expand along a column. For
any $j$
\[
\det(A) = \sum_{i=1}^n (-1)^{i+j} a_{ij} \det(A_{ij}) .
\]
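The cofactor expansion translates almost word for word into a recursive function. The following Python sketch is meant only to illustrate the definition; the names `minor`, `det`, and `expand_row` are made up here, rows and columns are numbered from 0 (so the sign $(-1)^{i+j}$ is unchanged, since both indices shift by one), and the method is far too slow for large matrices.

```python
def minor(A, i, j):
    """The submatrix A_ij: delete row i and column j (both counted from 0)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def expand_row(A, i):
    """Cofactor expansion along row i; should agree with det(A) for every i."""
    return sum((-1) ** (i + j) * A[i][j] * det(minor(A, i, j)) for j in range(len(A)))

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[1, 2, 3], [0, 5, 0], [7, 8, 9]]
print(det(A))
print(det(B), expand_row(B, 1))
```

Expanding `B` along its second row (index 1) only touches one nonzero entry, which is the convenience the text points out.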
A matrix is upper triangular if all elements below the main diagonal are 0. For example,
\[
\begin{bmatrix} 1 & 2 & 3 \\ 0 & 5 & 6 \\ 0 & 0 & 9 \end{bmatrix}
\]
is upper triangular. Similarly a lower triangular matrix is one where everything above the
diagonal is zero. For example,
\[
\begin{bmatrix} 1 & 0 & 0 \\ 4 & 5 & 0 \\ 7 & 8 & 9 \end{bmatrix} .
\]
The determinant for triangular matrices is very simple to compute. Consider the lower
triangular matrix. If we expand along the first row, we find that the determinant is 1
times the determinant of the lower triangular matrix $\left[ \begin{smallmatrix} 5 & 0 \\ 8 & 9 \end{smallmatrix} \right]$. So the determinant is just the
product of the diagonal entries.
\[
\det \left( \begin{bmatrix} 1 & 0 & 0 \\ 4 & 5 & 0 \\ 7 & 8 & 9 \end{bmatrix} \right) = 1 \cdot 5 \cdot 9 = 45 .
\]
Similarly for upper triangular matrices
\[
\det \left( \begin{bmatrix} 1 & 2 & 3 \\ 0 & 5 & 6 \\ 0 & 0 & 9 \end{bmatrix} \right) = 1 \cdot 5 \cdot 9 = 45 .
\]
The determinant is telling you how geometric objects scale. So if $B$ doubles the sizes of
geometric objects and $A$ triples them, then $AB$ (which applies $B$ to an object and then $A$)
should make size go up by a factor of 6. This is true in general:
Theorem A.6.1.
\[
\det(AB) = \det(A) \det(B) .
\]
This property is one of the most useful, and it is employed often to actually compute
determinants. A particularly interesting consequence is to note what it means for existence
of inverses. Take $A$ and $B$ to be inverses, that is $AB = I$. Then
\[
\det(A) \det(B) = \det(AB) = \det(I) = 1 .
\]
Neither $\det(A)$ nor $\det(B)$ can be zero. This fact is an extremely useful property of the
determinant, and one which is used often in this book:
Theorem A.6.2. An $n \times n$ matrix $A$ is invertible if and only if $\det(A) \neq 0$.
In fact, $\det(A^{-1}) \det(A) = 1$ says that
\[
\det(A^{-1}) = \frac{1}{\det(A)} .
\]
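So we know what the determinant of $A^{-1}$ is without computing $A^{-1}$.
Let us return to the formula for the inverse of a $2 \times 2$ matrix:
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} .
\]
Notice the determinant of the matrix $\left[ \begin{smallmatrix} a & b \\ c & d \end{smallmatrix} \right]$ in the denominator of the fraction. The formula
only works if the determinant is nonzero, otherwise we are dividing by zero.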
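As a quick illustration, here is a Python sketch of the $2 \times 2$ formula with exact fractions. The function name `inverse_2x2` and the decision to raise `ValueError` for a zero determinant are choices made for this sketch, not something prescribed by the text.

```python
from fractions import Fraction

def inverse_2x2(a, b, c, d):
    """Inverse of [[a, b], [c, d]] by the formula; fails if ad - bc = 0."""
    det = Fraction(a) * d - Fraction(b) * c
    if det == 0:
        raise ValueError("matrix is singular: ad - bc = 0")
    return [[d / det, -b / det], [-c / det, a / det]]

inv = inverse_2x2(1, 2, 3, 4)
print([[str(x) for x in row] for row in inv])

# The determinant of the inverse should be 1 / det(A).
det_A = Fraction(1 * 4 - 2 * 3)
det_inv = inv[0][0] * inv[1][1] - inv[0][1] * inv[1][0]
print(det_inv, 1 / det_A)
```

The last line compares $\det(A^{-1})$ with $1/\det(A)$ for the sample matrix, in line with the consequence of Theorem A.6.1 noted above.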
A common notation for the determinant is a pair of vertical lines:
\[
\begin{vmatrix} a & b \\ c & d \end{vmatrix} = \det \left( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \right) .
\]
Personally, I find this notation confusing as vertical lines usually mean a positive quantity,
while determinants can be negative. Also think about how to write the absolute value of a
determinant. This notation is not used in this book.
A.6.1 Exercises
Exercise A.6.1: Compute the determinant of the following matrices:
21 2 33
⇥ ⇤ 6 7
d� 660 4 577
1 3 2 1
a� 3 b� c�
2 1 4 2 60 0 67
4 5
20 7 37 20 037
22 1 0 37 22 1 33 6 2 5 6 1 2
6 6 7 60 377 61 277
e� 66 2 7 377 f� 668 6 377 g� 66 h� 66
0 2 1 1
60 2 7 77 177
0 75 67 9 77 63 61
4 5 1 2
4 4 5 60 4 75 62 375
4 0 2 4 1 2
Exercise A.6.2: For which $x$ are the following matrices singular (not invertible).
2 x 0 13
6 7
d� 66 1 4 277
2 3 2 x x 1
a� b� c�
2 x 1 2 4 x 6 1 6 27
4 5
Exercise A.6.3: Compute
2 337
1
© 62 1 2 ™
≠ 60 577 Æ
det ≠≠ 66 Æ
8 6
977 Æ
≠ 60 0 3 Æ̈
60 175
´4
0 0
without computing the inverse.
21 037 25 sin(1)37
6 0 0 6 9 1
62 077 60 1 77
L ⇤ 66 U ⇤ 66
1 0 1 88
077 3 77
and .
67 ⇡ 1 60 0 1
628 175 60 1 75
4 5 99 4 0 0
Let $A = LU$. Compute $\det(A)$ in a simple way, without computing what is $A$. Hint: First read off
$\det(L)$ and $\det(U)$.
Exercise A.6.5: Consider the linear mapping from $\mathbb{R}^2$ to $\mathbb{R}^2$ given by the matrix $A = \left[ \begin{smallmatrix} 1 & x \\ 2 & 1 \end{smallmatrix} \right]$ for
some number $x$. You wish to make $A$ such that it doubles the area of every geometric figure. What
are the possibilities for $x$ (there are two answers).
Exercise A.6.6: Suppose $A$ and $S$ are $n \times n$ matrices, and $S$ is invertible. Suppose that $\det(A) = 3$.
Compute $\det(S^{-1} A S)$ and $\det(S A S^{-1})$. Justify your answer using the theorems in this section.
Exercise A.6.7: Let $A$ be an $n \times n$ matrix such that $\det(A) = 1$. Compute $\det(xA)$ given a number $x$.
Hint: First try computing $\det(xI)$, then note that $xA = (xI)A$.
Exercise A.6.102: For which $x$ are the following matrices singular (not invertible).
2 x 1 03
6 7
d� 66 1 4 077
1 3 3 x x 3
a� b� c�
1 x 1 3 3 x 6 1 6 27
4 5
Exercise A.6.103: Compute
2 7 12 37 ™
1
© 63 4
≠ 60 877 ÆÆ
det ≠≠ 66
1 9
≠ 60 0 2 4 77 ÆÆ̈
60 2 75
´4
0 0
without computing the inverse.
Exercise A.6.104 (challenging): Find all the $x$ that make the matrix inverse
\[
\begin{bmatrix} 1 & 2 \\ 1 & x \end{bmatrix}^{-1}
\]
have only integer entries (no fractions). Note that there are two answers.
Appendix B
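The function $u$ is the Heaviside function, $\delta$ is the Dirac delta function, and
\[
\Gamma(t) = \int_0^\infty e^{-\tau} \tau^{t-1} \, d\tau ,
\qquad
\operatorname{erf}(t) = \frac{2}{\sqrt{\pi}} \int_0^t e^{-\tau^2} \, d\tau ,
\qquad
\operatorname{erfc}(t) = 1 - \operatorname{erf}(t) .
\]
\begin{tabular}{ll}
$f(t)$ & $F(s) = \mathcal{L} \bigl\{ f(t) \bigr\} = \int_0^\infty e^{-st} f(t) \, dt$ \\
\hline
$C$ & $\frac{C}{s}$ \\
$t$ & $\frac{1}{s^2}$ \\
$t^2$ & $\frac{2}{s^3}$ \\
$t^n$ & $\frac{n!}{s^{n+1}}$ \\
$t^p$ \quad ($p > 0$) & $\frac{\Gamma(p+1)}{s^{p+1}}$ \\
$e^{-at}$ & $\frac{1}{s+a}$ \\
$\sin(\omega t)$ & $\frac{\omega}{s^2 + \omega^2}$ \\
$\cos(\omega t)$ & $\frac{s}{s^2 + \omega^2}$ \\
$\sinh(\omega t)$ & $\frac{\omega}{s^2 - \omega^2}$ \\
$\cosh(\omega t)$ & $\frac{s}{s^2 - \omega^2}$ \\
$u(t-a)$ & $\frac{e^{-as}}{s}$ \\
$\delta(t)$ & $1$ \\
$\delta(t-a)$ & $e^{-as}$ \\
$\operatorname{erf} \left( \frac{t}{2a} \right)$ & $\frac{1}{s} e^{(as)^2} \operatorname{erfc}(as)$ \\
$\frac{1}{\sqrt{\pi t}} \exp \left( \frac{-a^2}{4t} \right)$ \quad ($a \geq 0$) & $\frac{e^{-a \sqrt{s}}}{\sqrt{s}}$ \\
$\frac{1}{\sqrt{\pi t}} - a e^{a^2 t} \operatorname{erfc}(a \sqrt{t})$ \quad ($a > 0$) & $\frac{1}{\sqrt{s} + a}$
\end{tabular}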
444 APPENDIX B. TABLE OF LAPLACE TRANSFORMS
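\begin{tabular}{ll}
$f(t)$ & $F(s) = \mathcal{L} \bigl\{ f(t) \bigr\} = \int_0^\infty e^{-st} f(t) \, dt$ \\
\hline
$a f(t) + b g(t)$ & $a F(s) + b G(s)$ \\
$f(at)$ \quad ($a > 0$) & $\frac{1}{a} F \left( \frac{s}{a} \right)$ \\
$f(t-a) u(t-a)$ & $e^{-as} F(s)$ \\
$e^{-at} f(t)$ & $F(s+a)$ \\
$g'(t)$ & $s G(s) - g(0)$ \\
$g''(t)$ & $s^2 G(s) - s g(0) - g'(0)$ \\
$g'''(t)$ & $s^3 G(s) - s^2 g(0) - s g'(0) - g''(0)$ \\
$g^{(n)}(t)$ & $s^n G(s) - s^{n-1} g(0) - \cdots - g^{(n-1)}(0)$ \\
$(f * g)(t) = \int_0^t f(\tau) g(t - \tau) \, d\tau$ & $F(s) G(s)$ \\
$t f(t)$ & $-F'(s)$ \\
$t^n f(t)$ & $(-1)^n F^{(n)}(s)$ \\
$\int_0^t f(\tau) \, d\tau$ & $\frac{1}{s} F(s)$ \\
$\frac{f(t)}{t}$ & $\int_s^\infty F(\sigma) \, d\sigma$
\end{tabular}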
Further Reading
[BM] Paul W. Berg and James L. McGregor, Elementary Partial Differential Equations,
Holden-Day, San Francisco, CA, 1966.
[BD] William E. Boyce and Richard C. DiPrima, Elementary Differential Equations and
Boundary Value Problems, 11th edition, John Wiley & Sons Inc., New York, NY, 2017.
[EP] C.H. Edwards and D.E. Penney, Differential Equations and Boundary Value Problems:
Computing and Modeling, 5th edition, Pearson, 2014.
[I] E.L. Ince, Ordinary Differential Equations, Dover Publications, Inc., New York, NY,
1956.
[T] William F. Trench, Elementary Differential Equations with Boundary Value Problems.
Books and Monographs. Book 9. 2013. https://digitalcommons.trinity.edu/mono/9
Solutions to Selected Exercises
1.1.104: 170
1.1.105: If $n \neq 1$, then $y = \bigl( (1-n) x + 1 \bigr)^{1/(1-n)}$. If $n = 1$, then $y = e^x$.
1.1.106: The equation is $r' = -C$ for some constant $C$. The snowball will be completely
melted in 25 minutes from time $t = 0$.
1.1.107: $y = Ax^3 + Bx^2 + Cx + D$, so 4 constants.
1.2.101:
1.7.102: a) 0, 8, 12 b) x(4) ⇤ 16, so errors are: 16, 8, 4. c) Factors are 0.5, 0.5, 0.5.
1.7.103: a) 0, 0, 0 b) x ⇤ 0 is a solution so errors are: 0, 0, 0.
1.7.104: a) Improved Euler: $y(1) \approx 3.3897$ for $h = 1/4$, $y(1) \approx 3.4237$ for $h = 1/8$, b)
Standard Euler: $y(1) \approx 2.8828$ for $h = 1/4$, $y(1) \approx 3.1316$ for $h = 1/8$, c) $y = 2e^x - x - 1$, so
$y(1)$ is approximately 3.4366. d) Approximate errors for improved Euler: 0.046852 for
$h = 1/4$, and 0.012881 for $h = 1/8$. For standard Euler: 0.55375 for $h = 1/4$, and 0.30499 for
$h = 1/8$. Factor is approximately 0.27 for improved Euler, and 0.55 for standard Euler.
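The numbers in 1.7.104 can be reproduced with a few lines of code. The sketch below assumes the initial value problem is $y' = x + y$, $y(0) = 1$; that is an inference from the closed form $y = 2e^x - x - 1$ given in part c), not something stated in this answer. The function names are ours, and "improved Euler" is taken to mean the usual predictor-corrector (Heun) step.

```python
import math

def f(x, y):
    # Assumed right-hand side, inferred from y = 2e^x - x - 1.
    return x + y

def euler(f, x0, y0, h, n):
    """Standard Euler: n steps of size h starting at (x0, y0)."""
    x, y = x0, y0
    for _ in range(n):
        y += h * f(x, y)
        x += h
    return y

def improved_euler(f, x0, y0, h, n):
    """Improved Euler (Heun): average the slope at both ends of the step."""
    x, y = x0, y0
    for _ in range(n):
        k1 = f(x, y)
        k2 = f(x + h, y + h * k1)
        y += h * (k1 + k2) / 2
        x += h
    return y

exact = 2 * math.e - 2  # y(1) for y = 2e^x - x - 1

for n in (4, 8):
    h = 1 / n
    e1 = euler(f, 0.0, 1.0, h, n)
    e2 = improved_euler(f, 0.0, 1.0, h, n)
    print(f"h=1/{n}: Euler {e1:.4f} (error {exact - e1:.5f}), "
          f"improved {e2:.4f} (error {exact - e2:.6f})")
```

Halving $h$ roughly halves the error for standard Euler and roughly quarters it for improved Euler, which is what the factors in part d) reflect.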
1.8.101: a) $e^{xy} + \sin(x) = C$ b) $x^2 + xy - 2y^2 = C$ c) $e^x + e^y = C$ d) $x^3 + 3xy + y^3 = C$
1.8.102: a) Integrating factor is $y$, equation becomes $dx + 3y^2\,dy = 0$. b) Integrating
factor is $e^x$, equation becomes $e^x\,dx - e^{-y}\,dy = 0$. c) Integrating factor is $y^2$, equation
becomes $(\cos(x) + y)\,dx + x\,dy = 0$. d) Integrating factor is $x$, equation becomes
$(2xy + y^2)\,dx + (x^2 + 2xy)\,dy = 0$.
1.8.103: a) The equation is $-f(x)\,dx + \frac{1}{g(y)}\,dy = 0$, and this is exact because $M = -f(x)$, $N = \frac{1}{g(y)}$,
so $M_y = 0 = N_x$. b) $-x\,dx + \frac{1}{y}\,dy = 0$, leads to potential function $F(x, y) = -\frac{x^2}{2} + \ln|y|$,
solving $F(x, y) = C$ leads to the same solution as the example.
1.9.101: a) $u = \frac{1}{1+(x+5t)^2}$ b) $u = \cos(x - 2t)$
1.9.102: $u = \cos(x - t)\,e^{-t^2/2}$
1.9.103: $u = x + 4t$
2.1.101: Yes. To justify try to find a constant $A$ such that $\sin(x) = Ae^x$ for all $x$.
2.1.102: No. $e^{x+2} = e^2 e^x$.
2.1.103: $y = 5$
2.1.104: $y = C_1 \ln(x) + C_2$
2.1.105: $y'' - 3y' + 2y = 0$
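2.2.101: $y = C_1 e^{(-2+\sqrt{2})x} + C_2 e^{(-2-\sqrt{2})x}$
2.2.102: $y = C_1 e^{3x} + C_2 x e^{3x}$
2.2.103: $y = e^{-x/4} \cos\bigl((\sqrt{7}/4)x\bigr) - \sqrt{7}\, e^{-x/4} \sin\bigl((\sqrt{7}/4)x\bigr)$
2.2.104: $y = \frac{2(a-b)}{5} e^{-3x/2} + \frac{3a+2b}{5} e^x$
2.2.105: $z(t) = 2e^{-t} \cos(t)$
2.2.106: $y = \frac{a\beta - b}{\beta - \alpha} e^{\alpha x} + \frac{b - a\alpha}{\beta - \alpha} e^{\beta x}$
2.2.107: $y'' - y' - 6y = 0$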
2.3.101: $y = C_1 e^x + C_2 x^3 + C_3 x^2 + C_4 x + C_5$
2.3.102: a) $r^3 - 3r^2 + 4r - 12 = 0$ b) $y''' - 3y'' + 4y' - 12y = 0$ c) $y = C_1 e^{3x} + C_2 \sin(2x) + C_3 \cos(2x)$
2.3.103: $y = 0$
2.3.104: No. $e^1 e^x - e^{x+1} = 0$.
2.3.105: Yes. (Hint: First note that sin(x) is bounded. Then note that x and x sin(x) cannot
be multiples of each other.)
2.3.106: $y''' - y'' + y' - y = 0$
2.4.101: $k = 8/9$ (and larger)
2.4.102: a) $0.05 I'' + 0.1 I' + (1/5) I = 0$ b) $I = Ce^{-t} \cos(\sqrt{3}\,t - \gamma)$ c) $I = 10e^{-t} \cos(\sqrt{3}\,t) + \frac{10}{\sqrt{3}} e^{-t} \sin(\sqrt{3}\,t)$
2.4.103: a) $k = 500000$ b) $\frac{1}{5\sqrt{2}} \approx 0.141$ c) 45000 kg d) 11250 kg
2.4.104: $m_0 = \frac{1}{3}$. If $m < m_0$, then the system is overdamped and will not oscillate.
2.5.101: $y = \frac{-16 \sin(3x) + 6 \cos(3x)}{73}$
2.5.102: a) $y = \frac{2e^x + 3x^3 - 9x}{6}$ b) $y = C_1 \cos(\sqrt{2}\,x) + C_2 \sin(\sqrt{2}\,x) + \frac{2e^x + 3x^3 - 9x}{6}$
2.5.103: $y(x) = x^2 - 4x + 6 + e^{-x}(x - 5)$
2.5.104: $y = \frac{2xe^x - (e^x + e^{-x}) \log(e^{2x} + 1)}{4}$
2.5.105: $y = \frac{-\sin(x+c)}{3} + C_1 e^{\sqrt{2}\,x} + C_2 e^{-\sqrt{2}\,x}$
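2.6.101: $\omega = \frac{\sqrt{31}}{4\sqrt{2}} \approx 0.984$, $C(\omega) = \frac{16}{3\sqrt{7}} \approx 2.016$
2.6.102: $x_{sp} = \frac{(\omega_0^2 - \omega^2) F_0}{m(2\omega p)^2 + m(\omega_0^2 - \omega^2)^2} \cos(\omega t) + \frac{2\omega p F_0}{m(2\omega p)^2 + m(\omega_0^2 - \omega^2)^2} \sin(\omega t) + \frac{A}{k}$, where $p = \frac{c}{2m}$ and $\omega_0 = \sqrt{\frac{k}{m}}$.
2.6.103: a) $\omega = 2$ b) 25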
3.1.101: $y_1 = C_1 e^{3x}$, $y_2 = y(x) = C_2 e^x + \frac{C_1}{2} e^{3x}$, $y_3 = y(x) = C_3 e^x + \frac{C_1}{2} e^{3x}$
3.1.102: $x = \frac{5}{3} e^{2t} - \frac{2}{3} e^{-t}$, $y = \frac{5}{3} e^{2t} + \frac{4}{3} e^{-t}$
⇥ x ⇤0 ⇥3 ⇤ ⇥x⇤ ⇥ t⇤
1 1 1
3.3.103: ⇤ 1 + e
y t 0 y 0
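3.3.104: a) $\vec{x}\,' = \begin{bmatrix} 0 & 2t \\ 0 & 2t \end{bmatrix} \vec{x}$ b) $\vec{x} = \begin{bmatrix} C_2 e^{t^2} + C_1 \\ C_2 e^{t^2} \end{bmatrix}$
3.4.101: a) Eigenvalues: $4, 0, -1$ Eigenvectors: $\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$, $\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$, $\begin{bmatrix} 3 \\ 5 \\ -2 \end{bmatrix}$
b) $\vec{x} = C_1 \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} e^{4t} + C_2 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + C_3 \begin{bmatrix} 3 \\ 5 \\ -2 \end{bmatrix} e^{-t}$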
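3.4.102: a) Eigenvalues: $\frac{1+\sqrt{3}\,i}{2}$, $\frac{1-\sqrt{3}\,i}{2}$, Eigenvectors: $\begin{bmatrix} -2 \\ 1-\sqrt{3}\,i \end{bmatrix}$, $\begin{bmatrix} -2 \\ 1+\sqrt{3}\,i \end{bmatrix}$
b) $\vec{x} = C_1 e^{t/2} \begin{bmatrix} -2\cos\left(\frac{\sqrt{3}\,t}{2}\right) \\ \cos\left(\frac{\sqrt{3}\,t}{2}\right) + \sqrt{3} \sin\left(\frac{\sqrt{3}\,t}{2}\right) \end{bmatrix} + C_2 e^{t/2} \begin{bmatrix} -2\sin\left(\frac{\sqrt{3}\,t}{2}\right) \\ \sin\left(\frac{\sqrt{3}\,t}{2}\right) - \sqrt{3} \cos\left(\frac{\sqrt{3}\,t}{2}\right) \end{bmatrix}$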
3.4.103: $\vec{x} = C_1 \begin{bmatrix} 1 \\ 1 \end{bmatrix} e^t + C_2 \begin{bmatrix} 1 \\ -1 \end{bmatrix} e^{-t}$
3.4.104: $\vec{x} = C_1 \begin{bmatrix} \cos(t) \\ -\sin(t) \end{bmatrix} + C_2 \begin{bmatrix} \sin(t) \\ \cos(t) \end{bmatrix}$
3.5.101: a) Two eigenvalues: $\pm\sqrt{2}$ so the behavior is a saddle. b) Two eigenvalues: 1
and 2, so the behavior is a source. c) Two eigenvalues: $\pm 2i$, so the behavior is a center
(ellipses). d) Two eigenvalues: $-1$ and $-2$, so the behavior is a sink. e) Two eigenvalues:
5 and $-3$, so the behavior is a saddle.
3.5.102: Spiral source.
3.5.103:
The solution does not move anywhere if $y = 0$. When $y$ is positive, the solution moves
(with constant speed) in the positive $x$ direction. When $y$ is negative, the solution moves
(with constant speed) in the negative $x$ direction. It is not one of the behaviors we have
seen.
Note that the matrix has a double eigenvalue 0 and the general solution is $x = C_1 t + C_2$
and $y = C_1$, which agrees with the above description.
h 1 i p p h 0 i p p
3.6.101: xÆ ⇤ 1 a1 cos( 3 t) + b 1 sin( 3 t) + 1 a 2 cos( 2 t) + b2 sin( 2 t) +
h0i 1 2
1
0 a 3 cos(t) + b 3 sin(t) + 1/2
cos(2t)
1 2/3
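3.6.102: $\begin{bmatrix} m & 0 & 0 \\ 0 & m & 0 \\ 0 & 0 & m \end{bmatrix} \vec{x}\,'' = \begin{bmatrix} -k & k & 0 \\ k & -2k & k \\ 0 & k & -k \end{bmatrix} \vec{x}$. Solution: $\vec{x} = \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix} \bigl( a_1 \cos(\sqrt{3k/m}\, t) + b_1 \sin(\sqrt{3k/m}\, t) \bigr) + \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} \bigl( a_2 \cos(\sqrt{k/m}\, t) + b_2 \sin(\sqrt{k/m}\, t) \bigr) + \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} (a_3 t + b_3)$.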
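The three modes in 3.6.102 can be checked by hand or by machine: each vector should be an eigenvector of the stiffness matrix divided by $k$, with eigenvalue $-3$, $-1$, and $0$ respectively (so the frequencies are $\sqrt{3k/m}$ and $\sqrt{k/m}$, and the zero eigenvalue gives the $a_3 t + b_3$ term). A minimal sketch, where the helper `matvec` and the variable names are ours and only exact integer arithmetic is used:

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

# Stiffness matrix from 3.6.102 with the common factor k pulled out.
K_over_k = [
    [-1,  1,  0],
    [ 1, -2,  1],
    [ 0,  1, -1],
]

# (eigenvector, expected eigenvalue of K/k) pairs read off the solution.
modes = [
    ([1, -2, 1], -3),   # frequency sqrt(3k/m)
    ([1, 0, -1], -1),   # frequency sqrt(k/m)
    ([1, 1, 1], 0),     # zero eigenvalue: the a_3 t + b_3 term
]

for v, lam in modes:
    lhs = matvec(K_over_k, v)
    rhs = [lam * c for c in v]
    print(v, lam, lhs == rhs)
```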
3.7.102: a) 1, 1, 2
b) Eigenvalue
h i 1 has a defect
⇣ h i of 1h i⌘ h i
0 1 0 3
c) xÆ ⇤ C1 1 et + C2 0 +t 1 et + C3 3 e 2t
1 0 1 2
3.7.103: a) 2, 2, 2
b) Eigenvalue
h 0 i 2 has a defect
⇣ h 0 i of 2h 0 i ⌘ ⇣h 1 i h i h 0 i⌘
2t 2t 0 t2
c) xÆ ⇤ C1 3 e + C2 1 +t 3 e + C3 0 + t 1 + 2 3 e 2t
⇥5 5⇤
1 0 1 0 0 1
3.7.104: A⇤
05
e 3t +e t e t e 3t
3.8.101: e tA ⇤ 2
e t e 3t
2
e 3t +e t
" #
2 2
3e t 3e 3t
2e 3t 4e 2t +3e t 2 2 e 3t +4e 2t 3e t
3.8.102: e tA ⇤ 2e t 2e 2t et 2e 2t 2e t
3e t 3e 3t
2e 5e 2t +3e t
3t e +5e 2t 3e t
3t
h i h i
2 2
(t+1) e 2t te 2t (1 t) e 2t
3.8.103: a) e tA ⇤ te 2t (1 t) e 2t
b) xÆ ⇤ (2 t) e 2t
h i ⇥ 1.25 0.36 ⇤
3.8.104: 1+2t+5t 2 3t+6t 2 e 0.1A ⇡
2t+4t 2 1+2t+5t 2 0.24 1.25
5(3n ) 2n+2 4(3n ) 2n+2 3 2(3n ) 2(3n ) 2
3.8.105: a) b)
5(2n ) 5(3n ) 5(2n ) 4(3n ) 3 3n+1 3n+1 2
1 0 0 1
c) if n is even, and if n is odd.
0 1 1 0
3.9.101: The general solution is (particular solutions should agree with one of these):
$x(t) = C_1 e^{9t} + 4C_2 e^{4t} - t/3 - 5/54$, $y(t) = C_1 e^{9t} - C_2 e^{4t} + t/6 + 7/216$
3.9.102: The general solution is (particular solutions should agree with one of these):
$x(t) = C_1 e^t + C_2 e^{-t} + te^t$, $y(t) = C_1 e^t - C_2 e^{-t} + te^t$
⇥1⇤ 5 t
⇥ 1
⇤ 1
3.9.103: xÆ ⇤ 2e t 1 + 2 e
t
⇥ 1 ⇤ ⇣⇣ ⌘ ⇣ ⌘ ⌘
1 1
p p
1 1p 6t 1 1p 6t cos(t)
3.9.104: xÆ ⇤ + e + + e t
⇤⇣ ⌘
9 140 140 60 70
⇥
120 6 120 6
1 9 1 9t cos(t)
+ 1 80 sin(2t) + 30 cos(2t) + 40 30
4.1.101: $\omega = \pi \sqrt{\frac{15}{2}}$
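4.2.102: $\sum_{n=1}^{\infty} \frac{(\pi - n) \sin(\pi n + \pi^2) + (\pi + n) \sin(\pi n - \pi^2)}{\pi n^2 - \pi^3} \sin(nt)$
4.2.103: $\frac{1}{2} - \frac{1}{2} \cos(2t)$
4.2.104: $\frac{\pi^4}{5} + \sum_{n=1}^{\infty} \frac{(-1)^n (8\pi^2 n^2 - 48)}{n^4} \cos(nt)$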
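4.3.101: a) $\frac{8}{6} + \sum_{n=1}^{\infty} \frac{16(-1)^n}{\pi^2 n^2} \cos\left(\frac{n\pi}{2} t\right)$ b) $\frac{8}{6} - \frac{16}{\pi^2} \cos\left(\frac{\pi}{2} t\right) + \frac{4}{\pi^2} \cos(\pi t) - \frac{16}{9\pi^2} \cos\left(\frac{3\pi}{2} t\right) + \cdots$
4.3.102: a) $\sum_{n=1}^{\infty} \frac{(-1)^{n+1} 2\lambda}{n\pi} \sin\left(\frac{n\pi}{\lambda} t\right)$ b) $\frac{2\lambda}{\pi} \sin\left(\frac{\pi}{\lambda} t\right) - \frac{\lambda}{\pi} \sin\left(\frac{2\pi}{\lambda} t\right) + \frac{2\lambda}{3\pi} \sin\left(\frac{3\pi}{\lambda} t\right) - \cdots$
4.3.103: $f'(t) = \sum_{n=1}^{\infty} \frac{\pi}{n^2+1} \cos(n\pi t)$
4.3.104: a) $F(t) = \frac{t}{2} + C + \sum_{n=1}^{\infty} \frac{1}{n^4} \sin(nt)$ b) no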
4.3.105: a) $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} \sin(nt)$ b) $f$ is continuous at $t = \pi/2$ so the Fourier series converges
to $f(\pi/2) = \pi/4$. Obtain $\pi/4 = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{2n-1} = 1 - 1/3 + 1/5 - 1/7 + \cdots$. c) Using the first 4
terms get $76/105 \approx 0.72$ (quite a bad approximation, you would have to take about 50 terms
to start to get to within 0.01 of $\pi/4$).
4.3.106: a) $F(0) = 1$, b) $F(-1) = 0$, c) $F(1) = 2$, d) $F(-2) = 1$, e) $F(4) = 1$, f) $F(-9) = 0$
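4.4.101: a) $1/2 + \sum_{\substack{n=1 \\ n \text{ odd}}}^{\infty} \frac{-4}{\pi^2 n^2} \cos\left(\frac{n\pi}{3} t\right)$ b) $\sum_{n=1}^{\infty} \frac{2(-1)^{n+1}}{\pi n} \sin\left(\frac{n\pi}{3} t\right)$
4.4.102: a) $\cos(2t)$ b) $\sum_{\substack{n=1 \\ n \text{ odd}}}^{\infty} \frac{4n}{\pi n^2 - 4\pi} \sin(nt)$
4.4.103: a) $f(t)$ b) 0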
Õ
1
1
4.4.104: n 2 (1+n 2 )
sin(nt)
n⇤1
Õ
1
1
4.4.105: t
⇡ + 2n (⇡ n 2 )
sin(nt)
n⇤1
1 Õ
1
4.5.103: x⇤ p + p4 cos(n⇡t)
2 3 2 2 2 2
n⇤1 n ⇡ ( 3 n ⇡ )
n odd
1 2 Õ
1
4
4.5.104: x⇤ p 3 t sin(⇡t) + n 2 ⇡ 4 (1 n 2 )
cos(n⇡t)
2 3 ⇡
n⇤3
n odd
6.1.102: $2t^2 - 2t + 1 - e^{-2t}$
6.1.103: $\frac{1}{(s+1)^2}$
6.1.104: $\frac{1}{s^2+2s+2}$
6.2.101: $f(t) = (t-1)\bigl(u(t-1) - u(t-2)\bigr) + u(t-2)$
6.2.102: $x(t) = (2e^{t-1} - t^2 - 1)\,u(t-1) - \frac{1}{2} e^{-t} + \frac{3}{2} e^t$
6.2.103: $H(s) = \frac{1}{s+1}$
6.3.101: $\frac{1}{2} (\cos t + \sin t - e^{-t})$
6.3.102: $5t - 5 \sin t$
6.3.103: $\frac{1}{2} (\sin t - t \cos t)$
6.3.104: $\int_0^t f(\tau) \bigl(1 - \cos(t - \tau)\bigr)\,d\tau$
6.4.101: $x(t) = t$
6.4.102: $x(t) = e^{-at}$
7.2.103: Applying the method of this section directly we obtain $a_k = 0$ for all $k$ and so
$y(x) = 0$ is the only solution we find.
7.3.101: a) ordinary, b) singular but not regular singular, c) regular singular, d) regular
singular, e) ordinary.
7.3.102: $y = A x^{\frac{1+\sqrt{5}}{2}} + B x^{\frac{1-\sqrt{5}}{2}}$
Õ
1
( 1) 1 k
7.3.103: y ⇤ x 3/2 k! (k+2)!
x (Note that for convenience we did not pick a 0 ⇤ 1)
k⇤0
7.3.104: $y = Ax + Bx \ln(x)$
8.1.101: a) Critical points (0, 0) and (0, 1). At (0, 0) using u ⇤ x, v ⇤ y the linearization
is u 0 ⇤ 2u (1/⇡)v, v 0 ⇤ v. At (0, 1) using u ⇤ x, v ⇤ y 1 the linearization is
u 0 ⇤ 2u + (1/⇡)v, v 0 ⇤ v.
b) Critical point (0, 0). Using u ⇤ x, v ⇤ y the linearization is u 0 ⇤ u + v, v 0 ⇤ u.
c) Critical point (1/2, 1/4). Using u ⇤ x 1/2, v ⇤ y + 1/4 the linearization is u 0 ⇤ u + v,
v 0 ⇤ u + v.
8.1.102: 1) is c), 2) is a), 3) is b)
8.1.103: Critical points are (0, 0, 0), and ( 1, 1, 1). The linearization at the origin using
variables u ⇤ x, v ⇤ y, w ⇤ z is u 0 ⇤ u, v 0 ⇤ v, z 0 ⇤ w. The linearization at the point
( 1, 1, 1) using variables u ⇤ x + 1, v ⇤ y 1, w ⇤ z + 1 is u 0 ⇤ u 2w, v 0 ⇤ v 2w,
w 0 ⇤ w 2u.
8.1.104: $u' = f(u, v, w)$, $v' = g(u, v, w)$, $w' = 1$.
8.2.101: a) (0, 0): saddle (unstable), (1, 0): source (unstable), b) (0, 0): spiral sink
(asymptotically stable), (0, 1): saddle (unstable), c) (1, 0): saddle (unstable), (0, 1):
saddle (unstable)
8.2.102: a) $\frac{1}{2} y^2 + \frac{1}{3} x^3 - 4x = C$, critical points: $(-2, 0)$, an unstable saddle, and $(2, 0)$, a
stable center. b) $\frac{1}{2} y^2 + e^x = C$, no critical points. c) $\frac{1}{2} y^2 + xe^x = C$, critical point at
$(-1, 0)$ is a stable center.
8.2.103: Critical point at $(0, 0)$. Trajectories are $y = \pm\sqrt{2C - (1/2)x^4}$, for $C > 0$, these give
closed curves around the origin, so the critical point is a stable center.
8.2.104: A critical point $x_0$ is stable if $f'(x_0) < 0$ and unstable when $f'(x_0) > 0$.
8.3.101: a) Critical points are $\omega = 0$, $\theta = k\pi$ for any integer $k$. When $k$ is odd, we have a
saddle point. When k is even we get a sink. b) The findings mean the pendulum will
simply go to one of the sinks, for example (0, 0) and it will not swing back and forth. The
friction is too high for it to oscillate, just like an overdamped mass-spring system.
8.3.102: a) Solving for the critical points we get $(0, -h/d)$ and $\left(\frac{bh+ad}{ac}, \frac{a}{b}\right)$. The Jacobian
matrix at $(0, -h/d)$ is $\begin{bmatrix} a + bh/d & 0 \\ -ch/d & -d \end{bmatrix}$ whose eigenvalues are $a + bh/d$ and $-d$. So the eigenvalues
are always real of opposite signs and we get a saddle (In the application however we are
only looking at the positive quadrant so this critical point is not relevant). At $\left(\frac{bh+ad}{ac}, \frac{a}{b}\right)$
we get Jacobian matrix $\begin{bmatrix} 0 & -\frac{b(bh+ad)}{ac} \\ \frac{ac}{b} & \frac{bh+ad}{a} - d \end{bmatrix}$. b) For the specific numbers given, the second critical
point is $\left(\frac{550}{3}, 40\right)$, the matrix is $\begin{bmatrix} 0 & -11/6 \\ 3/25 & 1/4 \end{bmatrix}$, which has eigenvalues $\frac{5 \pm i\sqrt{327}}{40}$. Therefore there
is a spiral source. This means the solution spirals outwards. The solution will eventually
hit one of the axes, $x = 0$ or $y = 0$, so something will die out in the forest.
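The eigenvalues quoted in part b) follow from the trace and determinant of the $2 \times 2$ matrix. As a quick sanity check, here is a small sketch (the function name `eigenvalues_2x2` is ours) that solves the characteristic equation $\lambda^2 - (\text{trace})\lambda + \det = 0$ with the quadratic formula.

```python
import cmath
import math

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the characteristic polynomial."""
    trace = a + d
    det = a * d - b * c
    disc = cmath.sqrt(trace * trace - 4 * det)  # complex square root
    return (trace + disc) / 2, (trace - disc) / 2

# Matrix from part b) of 8.3.102.
lam1, lam2 = eigenvalues_2x2(0, -11 / 6, 3 / 25, 1 / 4)

# Value stated in the answer: (5 + i sqrt(327)) / 40.
expected = complex(5 / 40, math.sqrt(327) / 40)

print(lam1, lam2, expected)
```

The real part $5/40$ is positive and the imaginary part is nonzero, which is exactly the spiral source behavior described in the answer.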
8.3.103: The critical points are on the line $x = 0$. In the positive quadrant the $y'$ is always
positive and so the fox population always grows. The constant of motion is $C = y^a e^{-cx - by}$,
for any $C$ this curve must hit the $y$-axis (why?), so the trajectory will simply approach a
point on the $y$ axis somewhere and the number of hares will go to zero.
8.4.101: Use Bendixson–Dulac Theorem. a) $f_x + g_y = 1 + 1 > 0$, so no closed trajectories.
b) $f_x + g_y = -\sin^2(y) + 0 < 0$ for all $x, y$ except the lines given by $y = k\pi$ (where we get
zero), so no closed trajectories. c) $f_x + g_y = y^2 + 0 > 0$ for all $x, y$ except the line given by
$y = 0$ (where we get zero), so no closed trajectories.
8.4.102: Using the Poincaré–Bendixson Theorem, the system has a limit cycle, which is the
unit circle centered at the origin, as $x = \cos(t) + e^{-t}$, $y = \sin(t) + e^{-t}$ gets closer and closer
to the unit circle. Thus we also have that $x = \cos(t)$, $y = \sin(t)$ is the periodic solution.
8.4.103: $f(x, y) = y$, $g(x, y) = \mu(1 - x^2)y - x$. So $f_x + g_y = \mu(1 - x^2)$. The Bendixson–Dulac
Theorem says there is no closed trajectory lying entirely in the set $x^2 < 1$.
8.4.104: The closed trajectories are those where $\sin(r) = 0$, therefore, all the circles centered
at the origin with radius that is a multiple of $\pi$ are closed trajectories.
8.5.101: Critical points: $(0, 0, 0)$, $(3\sqrt{8}, 3\sqrt{8}, 27)$, $(-3\sqrt{8}, -3\sqrt{8}, 27)$. Linearization at $(0, 0, 0)$
using $u = x$, $v = y$, $w = z$ is $u' = -10u + 10v$, $v' = 28u - v$, $w' = -(8/3)w$. Linearization at
$(3\sqrt{8}, 3\sqrt{8}, 27)$ using $u = x - 3\sqrt{8}$, $v = y - 3\sqrt{8}$, $w = z - 27$ is $u' = -10u + 10v$, $v' = u - v - 3\sqrt{8}\,w$,
$w' = 3\sqrt{8}\,u + 3\sqrt{8}\,v - (8/3)w$. Linearization at $(-3\sqrt{8}, -3\sqrt{8}, 27)$ using $u = x + 3\sqrt{8}$, $v = y + 3\sqrt{8}$,
$w = z - 27$ is $u' = -10u + 10v$, $v' = u - v + 3\sqrt{8}\,w$, $w' = -3\sqrt{8}\,u - 3\sqrt{8}\,v - (8/3)w$.
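The critical points in 8.5.101 can be verified by plugging them back into the right-hand side. The sketch below assumes the system is the Lorenz system $x' = -10x + 10y$, $y' = 28x - y - xz$, $z' = -(8/3)z + xy$; this is an assumption inferred from the linearization at the origin given above, since the exercise statement itself is not reproduced in this answer. The function names are ours.

```python
import math

def lorenz_rhs(x, y, z):
    """Right-hand side of the (assumed) Lorenz system with parameters 10, 28, 8/3."""
    return (-10 * x + 10 * y,
            28 * x - y - x * z,
            -(8 / 3) * z + x * y)

def jacobian(x, y, z):
    """Matrix of partial derivatives of lorenz_rhs at (x, y, z)."""
    return [[-10, 10, 0],
            [28 - z, -1, -x],
            [y, x, -8 / 3]]

r = 3 * math.sqrt(8)
critical_points = [(0.0, 0.0, 0.0), (r, r, 27.0), (-r, -r, 27.0)]

for p in critical_points:
    # Each component should vanish up to floating point rounding.
    print(p, lorenz_rhs(*p))

# Rows of the Jacobian at (3 sqrt(8), 3 sqrt(8), 27) give the coefficients
# of u, v, w in the second linearization.
print(jacobian(r, r, 27.0))
```

Reading off the rows of the Jacobian at each point reproduces the coefficients in the three linearizations, including the sign change of the $3\sqrt{8}$ terms between the two nonzero critical points.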
A.1.101: a) $\sqrt{10}$ b) $\sqrt{14}$ c) 3
" # 2 p1 3
6 67 ⇣ ⌘
p1
6 17
A.1.102: a) 2 b) 6 p6 7 c) p2 , p 5 , p2
p1 627 33 33 33
2 6p 7
4 65
9 3 5 4 3 8
A.1.103: a) b) c) d) e) f)
2 3 3 8 7 3
A.1.104: a) 20 b) 10 c) 20
A.1.105: a) (3, 1) b) (4, 0) c) ( 1, 1)
25 3 037
6
b) 66 13 10 677
7 4 4
A.2.101: a)
2 3 4 6 1 3 17
4 5
1 13 2 5
A.2.102: a) b)
9 14 5 5
218 18 12 3 2 11 12 36 14 3 2 2 1237
6 7 6 7 6
b) 66 6 0 8 77 c) 66 2 4 5 277 d) 66 3 24 77
22 31
A.2.103: a)
42 44 634 48 27 6 13 38 20 28 7 61 9 75
4 5 4 5 4
⇥ ⇤ 0 1 5 2 1/2 1/4
A.2.104: a) 1/2 b) c) d)
1 0 3 1 1/2 1/2
2 0 0 37
21/4 0 37 6 1 0
6 0 6 0 0 77
b) 66 0 0 77 c) 66
1/2 0 0 1/2
1/3 0 7
A.2.105: a) 1/5
0 60 175 6 0 0 7
1/3
4 0 6 0 1075
4 0 0
21 0 0 37 21 0 0 77/15 3
6 6 7
d) 660 1 1/377 e) 660 1 0 2/1577
1 0 1 1 0 1 1
A.3.101: a) b) c)
0 1 0 0 1 0 0 60 0 0 75 60 0 1 8/5 75
4 4
21 0 1/2 0 37
6
f) 660 1 1/2 1/277
0 0 0 0 1 2 3 0
g) h)
60 0 0 75
0 0 0 0 0 0 0 1
4 0
20 1 037 20 1 37 2 5/2 337
6 6 0 6 1
A.3.102: a) 661 0 077 b) 660 1 177 c) 66 1 1/2 3/2 7
7
60 0 175 61 0 75 6 1 1 75
4 4 1 4 0
A.3.103: a) x1 ⇤ 2, x 2 ⇤ 7/3 b) no solution c) a ⇤ 3, b ⇤ 10, c ⇤ 8 d) x3 is free,
x1 ⇤ 1 + 3x 3 , x 2 ⇤ 2 x3
1 3
A.3.104: a) b)
3 1
A.3.105: a) 3 b) 1 c) 2
⇥ ⇤ ⇥ ⇤ ⇥ ⇤ ⇥ ⇤ ⇥ ⇤ ⇥ ⇤
A.3.106: a) 1 0 0 , 0 1 0 , 0 0 1 b) 1 1 1 c) 1 0 1/3 , 0 1 1/3
A.5.101: s ⇤ 2
A.5.102: ✓ ⇡ 0.3876
A.5.103: a) -15 b) -1 c) 28
A.5.104: a) ( 1/2, 0, 21 ) b) (0, 0, 0) c) (2, 0, 2)
A.5.105: a) (1, 1, 1) (2, 1, 1) + 2(1, 5, 3) b) 2(2, 1, 1) + (1, 5, 3) c) 2(1, 1, 1)
2(2, 1, 1) + 2(1, 5, 3)
A.5.106: (2, 1, 1), (2/3 , 8/3 , 4/3)
A.5.107: (1, 1, 1), (0, 1, 1), (4/3, 2/3, 2/3)
A.6.101: a) 2 b) 8 c) 0 d) 6 e) 3 f) 28 g) 16 h) 24
A.6.102: a) 3 b) 9 c) 3 d) 1/4
A.6.103: 12
A.6.104: 1 and 3
Index
undamped, 98
undamped motion, 96
systems, 152
underdamped, 101
undetermined coefficients, 105
for second order systems, 159, 185
for systems, 182
unforced motion, 96
unit step function, 294
unit vector, 389
unstable critical point, 51, 357
unstable node, 147
upper triangular, 438
upper triangular matrix, 164