2021 Lecture Notes MAT1330
Lecture Notes
1 Preface
1.1 How to use these notes
1.2 How to Succeed
2 Fundamental Skills
2.1 Mathematical language (new)
2.1.1 Sets
2.1.2 Sums and the geometric series formula (new)
2.2 Algebra
2.2.1 Parentheses and the order of operations
2.2.2 Powers and exponentials
2.2.3 Logarithms
2.2.4 Fractions and rationalization
2.2.5 Polynomials: factoring and long division
2.3 Inequalities and absolute values (new)
2.3.1 Solving inequalities
2.3.2 Absolute values: how to handle them and how to solve equations
2.4 Functions
2.4.1 Potential characteristics of functions
2.4.2 Polynomial functions
2.4.3 Rational functions
2.4.4 Root or radical functions
2.4.5 Absolute value function (new)
2.4.6 Trigonometric functions
2.4.7 Exponential and logarithmic functions
2.4.8 Summary
3.5.1 Examples: finding equilibria
3.5.2 Stability of equilibria
3.6 Stability in linear models: a theorem
3.7 Stability in nonlinear models: examples
5 The Derivative
5.1 The definition
5.1.1 Rate of change: the idea
5.1.2 Average rate of change (over an interval)
5.1.3 Instantaneous rate of change
5.2 Examples of using the definition
5.3 Five ways not to be differentiable at x
5.4 What f′ tells you about f
5.5 Differentiation Rules: The basics
5.5.1 Why the power rule is true
5.5.2 Why derivative is linear
5.5.3 Why the product rule is true
5.5.4 Why the chain rule is true
5.5.5 Why the quotient rule is true
5.6 Derivatives of exponential functions
5.7 Derivatives of logarithms
5.8 Derivatives of functions like $f(x)^{g(x)}$
5.9 Implicit differentiation
5.9.1 More examples
5.10 Derivatives of sine and cosine
5.10.1 Geometric arguments
5.11 Derivatives of other trigonometric functions
5.12 Inverse trigonometric functions
5.12.1 The inverse sine function, arcsin(x) or $\sin^{-1}(x)$
5.12.2 The inverse tangent function arctan(x) = $\tan^{-1}(x)$
5.12.3 The remaining inverse trig functions
5.13 Summary of known derivatives
6 Applications of the Derivative
6.1 The first derivative
6.2 The second derivative
6.3 Graphing functions
6.4 Extrema
6.4.1 Local extrema: First and second derivative tests
6.4.2 Methods for finding local extrema
6.4.3 Global Extrema and the Extreme Value Theorem
6.4.4 Method for finding global extrema
6.5 Optimization
6.5.1 Maximization with trade-offs
6.5.2 Areas and volumes:
6.5.3 Distances
6.5.4 Maximize yield in a DTDS.
6.5.5 Minimal perimeter for area
6.5.6 Some more examples
6.6 L’Hôpital’s rule for finding limits
6.6.1 Recall: Algebra of limits with infinity
6.6.2 Recall: evaluating limits
6.6.3 Indeterminate forms and L’Hôpital’s rule
6.6.4 Product and difference indeterminate forms
6.6.5 Exponential indeterminate forms
6.6.6 Graphing even more complex functions
6.7 Approximating functions with polynomials
6.7.1 Estimating a function using a secant line
6.7.2 Linear approximation
6.7.3 Taylor polynomials
6.7.4 More examples of Taylor approximations
6.7.5 The Mean Value Theorem
6.7.6 Applications of the Mean Value Theorem (optional)
6.7.7 Proof of Rolle’s theorem and advanced applications of the Mean Value Theorem (optional)
6.8 Stability of Discrete Time Dynamical Systems
6.8.1 Stability of linear DTDS (recall)
6.8.2 Stability of general DTDS
6.8.3 Example: Allee effect
6.8.4 Logistic growth
6.8.5 The Ricker equation
6.8.6 Harvesting and optimization: DTDS
6.9 The Intermediate Value Theorem
6.9.1 Classic solution: Bisection Method
6.9.2 Sophisticated solution: Newton’s Method
6.9.3 Discussion: So why does it work? Can it fail?
MAT 1330 : Fall 2020
7 Integration
7.1 Introduction
7.1.1 Motivation for Differential Equations
7.1.2 Could there be more than one anti-derivative satisfying a given initial condition?
7.1.3 Anti-differentiation
7.1.4 Applications of Anti-differentiation
7.2 Techniques of integration: Substitution
7.2.1 The method of substitution:
7.2.2 Other situations where you might try substitution
7.2.3 Trying a substitution in the hopes of simplifying a complicated integrand
7.2.4 Tips on substitution
7.2.5 Two examples where substitution is not enough
7.3 Techniques of integration: Integration by Parts
7.3.1 Method of integration by parts
7.3.2 Applying by parts more than once: two different kinds of examples
7.3.3 Tips for integration by parts
7.4 Mixed examples, and applications
7.4.1 More examples with integration by parts and substitution
7.4.2 Application examples
7.4.3 Integrals that we still can’t solve
7.5 Definite Integrals
Index
Chapter 1
Preface
These lecture notes have been developed from multiple sources, including primarily the handwritten
notes of Dr. Frithjof Lutscher, from the University of Ottawa, Department of Mathematics and
Statistics, who originally developed MAT1330 : Calculus I for the Life Sciences.
For more details, and a different perspective, you can read the corresponding section of the textbook
by Adler and Lovric, a very good textbook with very few typos and excellent, illustrated
explanations and examples. The textbook also includes tons of exercises to practice and to hone
your skills.
I am also grateful to Dheera Venkatraman, who developed the online graphing tool FooPlot
(fooplot.com) which I have used to make most of the graphs in these notes. Elsewhere, I have
borrowed images from the internet and I have attempted to identify the source; please let me know
of errors or omissions.
1.1 How to use these notes
Please note the following features of these lecture notes. The distinctive colours are there to help
you spot different kinds of material more easily.
Theorem 1.1.2. A theorem (or lemma, or proposition, or corollary) is a result that usually has
the form “If Thing A is true, then Thing B is true.” It works like a short-cut: every time you
notice that Thing A is true, you can just jump ahead and know for sure that Thing B is true.
Example 1.1.3. Examples are essential — but remember to always step back afterward and try
to see the important parts: how did we start the problem, and what was the big step in the
solution?
Exercise 1.1.4. Exercises in these notes are usually meant as triggers to keep you thinking about
the concepts just presented. Some solutions are included at the end of the notes.
As with all exercises: once you have read the solutions, the problem has lost most of its value as
a study tool for you — it’s important to graduate to solving problems on your own (including,
importantly, figuring out how to start). That’s why these notes are supplemented with:
The Course Guide, with one-page summaries of each lecture, and a list of problems from the
textbook, as well as problems from old assignments and exams;
A DGD Workbook, which you are encouraged to print off, with a subset of the above problems
that you absolutely should try in advance of the DGD each week, as the TA will solve them
there;
Note. Sometimes, a box like this is used to summarize the important points after a complicated
discussion, or after a sequence of examples. In this case, the biggest message for success in a math
course is:
Be active: don’t just copy the steps of a solved problem — constantly ask yourself what you
should do next, and use the solution to hone your judgement.
is used to approximately delineate the content of the course by lectures. These markers are hyperlinked
from the Index, at the end of the notes. Alternately, the Table of Contents, at the beginning
of the notes, has hyperlinks to let you jump directly to a particular section of the text.
1.2 How to Succeed
To succeed in this course, you need to do math problems. Organize your week and build time to
work on math problems into your calendar. The DGD workbook and the Mobius assignments are
tools to help motivate you to achieve the goal of five problems per day. The key to honing your
skills is to keep trying new problems rather than rehashing ones you’ve already seen. So turn to
the Course Guide and the Textbook for more depth.
There are many resources available to you as part of this course; it is up to you to choose to use
them. The specific modalities of these in Fall 2020 will be as described in our course home on
Brightspace:
Brightspace The hub for our course, where you’ll find everything you need, and links to everything
on this list.
Classes Two classes per week, each 80 minutes long. Theory and examples and a sense of the big
picture.
DGDs Once per week, 80 minutes, led by a graduate student Teaching Assistant (TA). Lots of
examples, smaller group size, more opportunities to ask questions.
Math Help Centre Open five days a week, staffed by graduate student TAs. If in-person: just
drop in. If on-line: book your same-day or next-day appointment online. Bring your exercises
(but not homework problems that you will subsequently submit for grades). The TAs will
help you learn how to solve math problems on your own. If your question is more theoretical
in nature, come to office hours instead.
Office Hours As scheduled. Come with your questions: theory, examples, exercises, etc.
Textbook Has an index at the back, and a good table of contents at the front, to help you find
things. Well-written, with lots of examples and excellent graphics. Many exercises have a
brief answer at the back of the book.
Homework We will have regular assignments using Mobius Assessments, an interactive platform
that you can use to measure your progress and see what topics you need to focus more
attention on. Always work the problems by hand, writing your steps logically, and then type
in the answers and see how you do. Don’t leave assignments to the last minute; working and
focussing on them is part of your learning process.
Peer help groups There are various mentorship and peer help groups on campus; this can be a
good way to learn and keep motivated. These groups are not overseen by the instructor and
are not part of the course — so please do confirm any rumours about course policies or test
content directly with your professor!
Friends Teaching, by trying to explain an example to a friend, can be the most effective way of
learning. Working with friends can also be very motivating — everyone’s in this together.
Note. In this course, we declare that you can do the assignments with friends or peer help groups,
but not with the help of a tutor (private or through the help centre) or instructor. In all cases, you
should ensure that you, personally, have done the work to get to the answer that you enter.
Midterms and exams are to be done individually, without the help of another person or computer
aids. We set out rules for each test, but note that the rules are not a game to play — any attempt
to undermine the academic integrity of a test, regardless of the outcome, is an act of academic
fraud, and has no place at the University.
Chapter 2
Fundamental Skills
We will review some algebra, trigonometry and functions in our first two modules to ensure we have
a common vocabulary and to preview things to come. Experience has shown that mastering
these pre-Calculus skills is 100% essential for success in Calculus. Bring your questions
to the drop-in center or office hours; we will be very happy to help.
This chapter includes far more material than what will be reviewed in a typical pair of classes! Our
classes will focus on the material that is probably new to you (as indicated) but you should use the
background as “warm-up.”
2.1 Mathematical language (new)
The beauty of math is its universality, and its universality comes from the precision of its language.
We will practise speaking and writing mathematics in this course, and so become fluent.
2.1.1 Sets
Sets of numbers come up everywhere: the domain of a function, the set of solutions to an equation,
or the set of solutions to an inequality.
Our notation is very precise and can be read out loud as a sentence. For example
$S = \{x \in \mathbb{R} \mid x^2 > 5\}$
is read as “S is the set of x in the real numbers such that $x^2 > 5$”. We can parse this as
\[
S \; \underbrace{= \{}_{\text{is the set of}} \; x \; \underbrace{\in}_{\text{in}} \; \underbrace{\mathbb{R}}_{\text{the real numbers}} \; \underbrace{\mid}_{\text{such that}} \; x^2 > 5 \; \}.
\]
In general, when we write a set this way, you can expect it to have the form
S = {x ∈ (a set of numbers) | condition on x to be in the set}
or just
S = {x | condition on x to be in the set}
if it’s obvious from context what kind of number x should be. In Calculus, our sets are usually
in the real numbers R, but occasionally we might prefer the integers Z or the natural numbers
N = Z≥0 = {x ∈ Z | x ≥ 0}.
This final form is an example of a very common kind of set called an interval.
Definition 2.1.1. Given two real numbers a < b, the closed interval from a to b is the set
[a, b] = {x ∈ R | a ≤ x ≤ b}.
If we want to exclude one or both endpoints, we replace the square bracket with a round bracket (or,
in French notation, with a square bracket facing the wrong way). That is,
Example 2.1.2. We have worked out that $\{x^2 \mid 0 < x < 10\} = (0, 100)$.
Note. So math is a precise language, yes, but not without its foibles. Notice that if we write just
(3, 4)
then we could mean the open interval 3 < x < 4, or we could mean the point in the Cartesian plane
with coordinates x = 3 and y = 4.
Oh well. We just ran out of different symbols. The takeaway: when you see something like this,
read the whole sentence for context, and it should be clear.
Definition 2.1.3. The union of two or more intervals is the set of all points that are in at least
one of them. We write the union with the symbol ∪.
So for example
[−5, −3] ∪ [7, 19] = {x ∈ R | −5 ≤ x ≤ −3 or 7 ≤ x ≤ 19}
and
(−1, 0) ∪ (0, 1) = {x ∈ R | −1 < x < 1, x ≠ 0}.
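The two unions above can also be read as membership tests: a number belongs to the union when it passes at least one of the interval conditions. The following is a minimal Python sketch of that reading; the helper names (`in_closed`, `in_open`, `in_first_union`, `in_second_union`) are made up for this illustration and are not part of the course material.

```python
def in_closed(x, a, b):
    """True when x lies in the closed interval [a, b]."""
    return a <= x <= b


def in_open(x, a, b):
    """True when x lies in the open interval (a, b)."""
    return a < x < b


def in_first_union(x):
    """Membership in [-5, -3] ∪ [7, 19]: x is in at least one interval."""
    return in_closed(x, -5, -3) or in_closed(x, 7, 19)


def in_second_union(x):
    """Membership in (-1, 0) ∪ (0, 1): endpoints and 0 are excluded."""
    return in_open(x, -1, 0) or in_open(x, 0, 1)


print(in_first_union(-4), in_first_union(0), in_first_union(19))
print(in_second_union(0.5), in_second_union(0), in_second_union(1))
```

The word "or" in the set description corresponds directly to Python's `or`, and the choice between `<=` and `<` mirrors the choice between square and round brackets.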
Infinity, both the infinitely large and the infinitesimally small, plays a big role in Calculus.
You can say that Calculus was basically discovered because humanity desperately needed to find
a way to correctly talk about infinity. We’ll talk about this in great depth in Chapter 4; for now,
let’s just use it for intervals.
{x ∈ R | x > 5} = (5, ∞)
since −∞ is something smaller than every real number. Note that [−∞, 19] is poor notation: we
don’t want to include −∞ in our set because we only want real numbers in our set.
Now, consider $S = \{x \in \mathbb{R} \mid x^2 > 7\}$. We’ll talk about solving inequalities like this in more detail
1 + 2 + 4 + · · · + 64,
then you’d have to creatively think about how to fill in all the points in between. Fun for a game,
but not fun for science. So: we have a notation for that. We write
\[
1 + 2 + 3 + \cdots + 64 = \sum_{i=1}^{64} i
\]
First, $\sum$ is the summation symbol; it is there to signal to you that this is about adding numbers
together.³
Below and above the summation symbol are the limits of the sum (in this case, a and b, which
are integers, and a ≤ b) and the summation variable (in this case, j, but you can use any
letter you like, it’s just a pattern holder).
This variable j is a number that starts at a and then counts up (by integers) until it reaches
b. For each value, you get one term of your big sum.
After the summation symbol is the actual formula f (j) for the terms that you are summing.
They usually include the index of summation j, and we think of it as the pattern function.
Let’s do some short examples (that really don’t even need this fancy notation):
\[
\sum_{j=0}^{3} 2^j = 2^0 + 2^1 + 2^2 + 2^3.
\]
\[
\sum_{n=1}^{5} n^2 = 1^2 + 2^2 + 3^2 + 4^2 + 5^2.
\]
\[
\sum_{m=4}^{6} (m^2 - 10)x^m = 6x^4 + 15x^5 + 26x^6.
\]
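Summation notation translates almost word for word into a loop: the summation variable runs over the integers between the two limits, and the pattern function produces one term per value. Here is a small Python sketch of the sums above; the variable and function names are chosen for this illustration only.

```python
# range(a, b + 1) visits a, a+1, ..., b; Python's range stops before its
# second argument, hence the "+ 1" on the upper limit.
sum_of_integers = sum(i for i in range(1, 64 + 1))     # 1 + 2 + 3 + ... + 64
sum_of_powers = sum(2**j for j in range(0, 3 + 1))     # 2^0 + 2^1 + 2^2 + 2^3
sum_of_squares = sum(n**2 for n in range(1, 5 + 1))    # 1^2 + 2^2 + ... + 5^2


def polynomial_sum(x):
    """Evaluate the sum of (m^2 - 10) * x^m for m = 4, 5, 6 at a given x."""
    return sum((m**2 - 10) * x**m for m in range(4, 6 + 1))


print(sum_of_integers, sum_of_powers, sum_of_squares)
print(polynomial_sum(2), 6 * 2**4 + 15 * 2**5 + 26 * 2**6)
```

The last line compares the sigma form with the expanded form $6x^4 + 15x^5 + 26x^6$ at the sample value x = 2, which is a quick way to confirm that the coefficients were expanded correctly.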
³ Adding, as opposed to multiplying, or making a set; note that Σ is the Greek capital letter Sigma, which is an
“s” for “sum.”
A particularly interesting sum that we’ll encounter early in this course is the geometric series. For
any real number r, the expression
\[
\sum_{i=0}^{n} r^i = 1 + r + r^2 + r^3 + \cdots + r^n
\]
is called a geometric series. In any particular example, we could add it up directly; for example,
\[
\sum_{i=0}^{5} 2^i = 1 + 2 + 2^2 + 2^3 + 2^4 + 2^5 = 1 + 2 + 4 + 8 + 16 + 32 = 63,
\]
and
\[
\sum_{i=0}^{4} 3^i = 1 + 3 + 3^2 + 3^3 + 3^4 = 1 + 3 + 9 + 27 + 81 = 121.
\]
Theorem 2.1.5 (Geometric series formula). Let r be a real number with r ≠ 1 and let t be a
positive integer. Then
\[
1 + r + r^2 + r^3 + \cdots + r^{t-1} = \frac{1 - r^t}{1 - r}.
\]
If r = 1, then the sum on the left is just equal to t.
Notice how the sum goes to t − 1 on the left but the formula on the right is about t.
Example 2.1.6. With r = 2 and t = 6, the formula is telling us the sum of $1 + 2 + 2^2 + 2^3 + 2^4 + 2^5 = 1 + 2 + 4 + 8 + 16 + 32$; the answer is $\frac{1 - 2^6}{1 - 2} = \frac{-63}{-1} = 63$, which is what we got above.
Example 2.1.7. With $r = \frac{1}{2}$ and t = 6, the formula says that
\[
1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \frac{1}{32} = \frac{1 - \frac{1}{64}}{1 - \frac{1}{2}} = 1.96875.
\]
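The two examples can be double-checked by computing each side of the theorem separately. The sketch below uses Python's `fractions.Fraction` so that $r = \frac{1}{2}$ is handled exactly rather than as a rounded decimal; the function names are illustrative only.

```python
from fractions import Fraction


def geometric_sum_direct(r, t):
    """Add up 1 + r + r^2 + ... + r^(t-1) term by term (t terms in total)."""
    r = Fraction(r)
    return sum(r**i for i in range(t))


def geometric_sum_formula(r, t):
    """Closed form (1 - r^t) / (1 - r), with the special case r = 1."""
    r = Fraction(r)
    if r == 1:
        return Fraction(t)
    return (1 - r**t) / (1 - r)


for r in (2, Fraction(1, 2), 1):
    direct = geometric_sum_direct(r, 6)
    formula = geometric_sum_formula(r, 6)
    print(r, direct, formula, float(formula))
```

Note that `range(t)` runs from 0 to t − 1, which matches the remark that the sum stops at the power t − 1 while the formula involves t.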
$S = 1 + r + r^2 + \cdots + r^{t-1}.$
$rS = r + r^2 + r^3 + \cdots + r^t.$
\[
\begin{array}{rcl}
S &=& 1 + r + r^2 + \cdots + r^{t-1} \\
-rS &=& \phantom{1} - r - r^2 - \cdots - r^{t-1} - r^t \\ \hline
S - rS &=& 1 - r^t.
\end{array}
\]
Solving for S gives $(1 - r)S = 1 - r^t$ or $S = \frac{1 - r^t}{1 - r}$, as desired.
Finally, note that if r = 1, then the sum had t terms (because we start our powers at 0 and end
at t − 1) and each of them was equal to $1^i = 1$, so $S = \underbrace{1 + \cdots + 1}_{t \text{ times}} = t$.
We’ll use the geometric series early in the course, but other than that we won’t see summation
notation in MAT1330; it is, however, used a lot in Statistics, for example.
2.2 Algebra
Solid algebraic skills are essential — they are the rules of the language of mathematics, and you
will be using them in all your courses and any analytic or quantitative work you do in your field.
Let’s recap some rules that you need to know in order to do the exercises attached to this section
(and indeed, that you will be using throughout the course, without comment). These represent
simplification steps in a calculation that would typically be skipped on the blackboard.⁴
Note. Above all, the proper use of parentheses and respect for the order of operations is essential.
⁴ For solutions to the problems in this section, come to office hours, or the math help centre.
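Why? Well:
\[
\frac{\;\frac{2}{3}\;}{4} = \frac{2}{3} \cdot \frac{1}{4} = \frac{2}{12} = \frac{1}{6}
\]
whereas
\[
\frac{2}{\;\frac{3}{4}\;} = 2 \cdot \frac{4}{3} = \frac{8}{3}
\]
which is a totally different answer.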
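The same contrast shows up as soon as you type a stacked fraction into a calculator or a program, because the parentheses decide which division happens first. A short Python sketch using exact fractions (the variable names are only for this illustration):

```python
from fractions import Fraction

# (2/3) / 4 : the fraction 2/3 is the numerator
two_thirds_over_four = Fraction(2, 3) / 4

# 2 / (3/4) : the fraction 3/4 is the denominator
two_over_three_quarters = 2 / Fraction(3, 4)

print(two_thirds_over_four, two_over_three_quarters)
```

Without parentheses, Python evaluates `2 / 3 / 4` from left to right, that is, as `(2 / 3) / 4`. Whenever the second reading is intended, the parentheses have to be written explicitly.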
A typical place where this kind of mistake slips in is with rational functions:
Don’t do this. The “equal sign” with the frown is wrong. Rewrite the first term as
$(x + 1)\frac{x+3}{x+2}$ instead (notice the flip of the denominator), which conveniently fits nicely on
two lines, unlike the original headache.
Another dangerous place where this crops up is when you know perfectly well the parentheses are
there, but you’ve only put them in mentally... and then forget:
Don’t do this. Those parentheses around x + 3 were necessary and the “equal sign” with the
frown is wrong. Use parentheses. Please.
whereas
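\[
(a^m)^n = \underbrace{a^m \times a^m \times \cdots \times a^m}_{n \text{ times}} = \underbrace{\underbrace{(a \times a \times \cdots \times a)}_{m \text{ times}} \times \cdots \times \underbrace{(a \times a \times \cdots \times a)}_{m \text{ times}}}_{n \text{ times}} = a^{mn}.
\]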
Now assume a > 0. Then these two rules continue to hold for exponents that are arbitrary real
numbers. For example we can say
$a^{1/2} = \sqrt{a}$ since $(a^{1/2})^2 = a^{\frac{1}{2} \times 2} = a$.
(Defining $a^x$ for a real number x that is not a fraction, like x = π, requires the notion of limits
from Calculus (and a ≥ 0). But your calculator can approximate it very well.)
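These statements are easy to spot-check numerically. The sketch below picks arbitrary sample values (a = 3, m = 2, n = 5, chosen only for illustration) and uses `math.isclose` because floating-point results are approximations, not exact real numbers.

```python
import math

a, m, n = 3.0, 2, 5   # illustrative values only; any a > 0 would do

# Power of a power: (a^m)^n compared with a^(mn)
power_of_power = (a**m) ** n
single_power = a ** (m * n)
print(power_of_power, single_power)

# The exponent 1/2 behaves like a square root
half_power = a**0.5
print(half_power, math.sqrt(a))

# An irrational exponent: the computer returns a decimal approximation
irrational_exponent = a**math.pi
print(irrational_exponent)
```

A numerical check like this is not a proof, but it is a cheap way to catch a misremembered rule before it propagates through a calculation.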
Let’s consider how to solve equations that are phrased in terms of exponents.
Example 2.2.2. To solve $x^{3/5} = 8$ for x, we can take both sides to the power 5/3 (since $\frac{3}{5} \times \frac{5}{3} = 1$)
to get
\[
(x^{3/5})^{5/3} = 8^{5/3} \implies x = (\sqrt[3]{8})^5 = 2^5 = 32.
\]
Since $64 = 8^2$,
\[
\sqrt[4]{64} = \sqrt[4]{8^2} = \sqrt{8} = \sqrt{4} \cdot \sqrt{2} = 2\sqrt{2}.
\]
Thus $x = 2\sqrt{2}$ is one solution (the positive solution). But $x = -2\sqrt{2}$ is another solution, since
$(-2\sqrt{2})^4 = (8)^2 = 64$ as well. These are the only solutions.
This example emphasizes that the rules of exponents are designed for a positive base, but sometimes,
there is a negative solution as well. You probably know perfectly well that an equation like $x^2 = 5$
has two solutions, but beware: that’s only for even powers!
Example 2.2.4. To solve $x^3 = 8$, we get our positive solution x as $x = 8^{1/3} = 2$. This time, we
check that $(-2)^3 = -8 \neq 8$, so −2 is not a solution.
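Plugging candidate solutions back into the equation is the safest way to decide whether a negative solution exists. Here is a small Python sketch of that habit for the examples above; the variable names are illustrative, and `math.isclose` is used because fractional powers are computed in floating point.

```python
import math

# x^(3/5) = 8: the candidate x = 32 should give back 8
print(32 ** (3 / 5))

# Fourth roots of 64: both signs work because the power 4 is even
x_positive = 64 ** 0.25          # should agree with 2 * sqrt(2)
x_negative = -x_positive
print(x_positive, 2 * math.sqrt(2))
print(x_positive**4, x_negative**4)

# x^3 = 8: only the positive candidate works because the power 3 is odd
print(2**3, (-2) ** 3)
```

The pattern to notice is that raising a negative number to an even power removes the sign, while an odd power keeps it.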
On the other hand, when the variable is in the exponent, such as in the equation $3^x = 25$, we need
to use the logarithm function.
2.2.3 Logarithms
Recall that the logarithm of base a > 0 is the inverse function of the exponential: for x > 0
\[
y = \log_a(x) \iff a^y = x. \qquad (2.1)
\]
When a = 10, then we often just write log; when a = e (Euler’s number) then it’s called the natural
logarithm and we write ln. Of all the logarithm functions, it’s ln(x) that is the easiest to work with
in Calculus, but log(x) is very common in the natural sciences (e.g. measuring pH values).
Example 2.2.5. To evaluate $\log_2(8)$, we have to ask ourselves: is 8 a nice power of 2? Oh, yes, it
is: $8 = 2^3$, so $\log_2(8) = 3$.
Example 2.2.6. To evaluate $\log_2(10)$, we ask ourselves: is 10 a nice power of 2? No, it really isn’t.
But a fancy calculator will tell you that the answer is about 3.321928....
Remark 2.2.7. You can check that the answer $\log_2(10) = 3.321928...$ makes sense (again using a
calculator) because:
That is, remembering that 3.321928... is an infinite decimal, we’re noticing that as we take 2 to
powers which are more and more precise approximations of this decimal, we are getting an answer
that is closer and closer to 10. In the language of Calculus: as x approaches $\log_2(10)$, $2^x$ approaches
10.
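The calculator experiment described in this remark can be reproduced in a few lines. The sketch below is one possible version: the list of truncated exponents is chosen here for illustration and is not taken from the notes.

```python
import math

print(math.log2(8))            # 8 is a nice power of 2
log2_of_10 = math.log2(10)     # 10 is not, so we get a decimal approximation
print(log2_of_10)

# Raise 2 to better and better approximations of log2(10)
approximations = [3.3, 3.32, 3.321, 3.3219]
errors = []
for exponent in approximations:
    value = 2**exponent
    errors.append(abs(10 - value))
    print(exponent, value)
```

Each extra decimal place in the exponent brings $2^x$ closer to 10, which is the behaviour the remark describes.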
Another way to write the fundamental identity of logarithms (2.1) is as the following identities:
$a^{\log_a(x)} = x$ for all x > 0, and similarly, $10^{\log(x)} = x$ and $e^{\ln(x)} = x$, for all x > 0.
Note. Since $e^x > 0$ for all x, it makes no sense to ask for the value of ln(0) or ln(−5): there are
no such values.
Example 2.2.8. The pH of a solution measures the concentration of hydrogen ions in a solution.
If the molar concentration is $C = 10^{-k}$, then the pH is k. What is the formula for pH in terms of
C?
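One way to approach this: since $C = 10^{-k}$, applying the base-10 logarithm to both sides gives $\log(C) = -k$, so $k = -\log(C)$. The Python sketch below encodes that relationship; the function name `ph_from_concentration` and the sample concentration are assumptions made for this illustration.

```python
import math


def ph_from_concentration(c):
    """Return k such that c = 10**(-k); c must be a positive concentration."""
    return -math.log10(c)


sample_concentration = 1e-3    # an arbitrary illustrative value
k = ph_from_concentration(sample_concentration)
print(k, 10 ** (-k))           # the second number should recover the concentration
```

The round trip in the last line is a useful sanity check: converting a concentration to a pH and back should return the value you started with, up to rounding.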
Exercise 2.2.9. (a) Which solution has a greater concentration of hydrogen ions, one with a pH
of k = 1 or one with a pH of k = 7? (b) The formula for k has a minus sign in it. Could it ever
happen that k comes out to be a negative number? Why or why not?
The rules for logarithms are the inverse of the rules for exponentials. Where exponentials take
sums to products and products to exponents, logarithms do the opposite.
\[
\log_a(xy) = \log_a(x) + \log_a(y), \qquad \log_a(x^t) = t \log_a(x). \qquad (2.2)
\]
In other words, taking the logarithm of an expression can make it simpler, which is part of the
reason they are so useful.
Proof. This proof gives us practice in manipulating logarithms. Suppose a, x, y > 0 and set
$p = \log_a(x)$, and $q = \log_a(y)$.
By definition, this means $a^p = x$ and $a^q = y$. Therefore by the rules of exponents (Section 2.2.2)
we have
\[
a^{p+q} = a^p a^q = xy.
\]
So by definition, $\log_a(xy) = p + q$. Putting this together gives the first statement.
Note. Because we’ll use natural logarithms most in this course, let’s write our identities out that
way:
\[
\ln(xy) = \ln(x) + \ln(y), \qquad \ln(x^t) = t \ln(x)
\]
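Both identities in (2.2) can be spot-checked numerically. The sketch below uses arbitrary positive sample values (x = 2.5, y = 4, t = 3 and base a = 7, all chosen only for illustration) and Python's `math.log`, which computes ln by default and accepts an optional base as a second argument.

```python
import math

x, y, t = 2.5, 4.0, 3     # illustrative positive values
a = 7                     # an arbitrary base for the general rule

# Natural logarithm versions
product_side = math.log(x * y)
sum_side = math.log(x) + math.log(y)
power_side = math.log(x**t)
multiple_side = t * math.log(x)
print(product_side, sum_side)
print(power_side, multiple_side)

# The same product rule in base a
base_a_product = math.log(x * y, a)
base_a_sum = math.log(x, a) + math.log(y, a)
print(base_a_product, base_a_sum)
```

The pairs agree up to floating-point rounding, which is why the comparison should be made with a tolerance rather than with `==`.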
Solution: $\ln(x e^y) = \ln(x) + \ln(e^y)$ since products become sums. Then, since $\ln = \log_e$, we note that
$\ln(e^y) = y$. Thus we have
\[
\ln(x e^y) = \ln(x) + \ln(e^y) = \ln(x) + y.
\]
When you have a variable in an exponent, you will very often use a logarithm to simplify the
expression. If you cannot get your terms over a common base, the default is to use ln(x).
Example 2.2.12. To solve $3^x = 25$, we could apply $\log_3$ to both sides to get
$x = \log_3(25).$
This is, however, not very enlightening; I don’t have a $\log_3$ button on my calculator. Instead, let’s
use ln (which is what we’ll always do in this course, anyway):
\[
3^x = 25 \iff \ln(3^x) = \ln(25) \iff x \ln(3) = \ln(25) \iff x = \frac{\ln(25)}{\ln(3)}.
\]
Note. In fact, $\log_3(25) = \frac{\ln(25)}{\ln(3)}$, or, more generally, for all a, b > 0
\[
\log_a(b) = \frac{\ln(b)}{\ln(a)}
\]
which you can either memorize, or else remember how we got it (which is a lot more useful, since
it’s a “trick” we use a couple of times in this course):
\[
x = \log_a(b) \iff a^x = b \iff \ln(a^x) = \ln(b) \iff x \ln(a) = \ln(b) \iff x = \frac{\ln(b)}{\ln(a)}.
\]
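The change-of-base recipe is exactly what you would type on a calculator that only has an ln button. A short Python sketch follows; the helper name `log_base` is made up for this illustration.

```python
import math


def log_base(a, b):
    """Compute log_a(b) as ln(b) / ln(a), following the derivation above."""
    return math.log(b) / math.log(a)


x = log_base(3, 25)       # the solution of 3^x = 25
print(x)
print(3**x)               # plugging back in should return (approximately) 25
print(math.log(25, 3))    # Python's built-in two-argument log, for comparison
```

Substituting the answer back into $3^x$ is the check that matters: it confirms the solution without relying on any logarithm rule.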
Well, it’s a nightmare, but the nightmare is in the exponents, and both sides are positive, so we
carefully apply ln to both sides:
At this point, your instinct might be to think that this expression is a lot worse: fight it! Notice
that ln(2) and ln(3) are just numbers, and we will treat them as such.⁵ So just as if the equation
were the nicer-looking “5x = 2x(7) + 7 = 14x + 7”, we bring all the x terms to the left hand side
to get
⁵ We don’t want to replace them with their decimal approximations (0.69 and 1.10, respectively) because that
introduces round-off errors into our calculations, and our final answer would be (close but) wrong.
We made some side remarks in the preceding example that are worth pointing out.
Note. It is common to lose accuracy when you round off intermediate steps in a calculation. In
Science, you will keep track using significant figures, which is a good rule of thumb for tracking
measurement error. In Mathematics, you will avoid using decimals, or rounding off, as much as
possible (and when you can’t, you’ll keep many more decimals than you need for your intermediate
steps) — because in math, there is no measurement error. It’s a perfect world!
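To see how much early rounding can cost, take the equation $2^x = 3^{2x+1}$ whose two sides are quoted in the footnote to this example. Applying ln and collecting the x terms gives $x(\ln 2 - 2\ln 3) = \ln 3$. The sketch below compares the answer obtained with full-precision logarithms against the one obtained from the rounded values 0.69 and 1.10; the variable names are illustrative.

```python
import math

ln2, ln3 = math.log(2), math.log(3)

# Keep ln(2) and ln(3) at full precision until the very end
x_precise = ln3 / (ln2 - 2 * ln3)

# Round the logarithms to two decimals first, then compute
x_rounded_early = 1.10 / (0.69 - 2 * 1.10)

print(x_precise, x_rounded_early)

# Check the precise answer against the original equation
left_side = 2**x_precise
right_side = 3 ** (2 * x_precise + 1)
print(left_side, right_side)
```

The two candidate answers already differ in the third decimal place, even though each logarithm was only rounded in its third decimal place; this is the loss of accuracy the note warns about.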
So that’s a lot of useful stuff about logarithms. But sometimes the best way to understand them
is by noticing things that are NOT true. We learn SO MUCH MORE from our mistakes than we
ever do by getting things right the first time.
Exercise 2.2.14. We want to solve $\ln(e^x + 3) = 5$. DANGER. Which of the following manipulations are correct?
$$\ln(e^x + 3) = (x + 3)\ln(e) = x + 3$$
$$\ln(e^x + 3) = 5 \implies e^x + 3 = 5$$
$$e^x + 3 = e^5 \iff e^x = e^5 - 3 \iff x = 5 - \ln(3)$$
Work this out, then check out your answers with the solutions at the end of the notes!
So many pitfalls to avoid! But actually not: it’s that the only true things are (2.1) (or its equivalent
form) and (2.2), and you have to stick to them like glue.
Exercise 2.2.15. Here are some problems to try, to test your facility with powers, exponentials
and logarithms. In all cases: go back and verify that your answer(s) is(are) correct — including
that they lie in the domain of the functions at hand.⁷
6
But how can you check it? Well, using a calculator, x ≈ −0.7304, to a precision of 4 decimal places. Plugging
this value back for x in the left hand side ($2^x$) and right hand side ($3^{2x+1}$) of the equation gives the same answer
(0.603) to (only) three decimal places. Nice!
7
This is a precursor to our work later in the course, where we really have to be aware enough to say that “-5” is
not an acceptable answer if the question is about the number of turkeys in a field, for example.
√
x1/2 y 5
simplify x5 y 1/4
, where x, y > 0;
√ √
( x 3 y)−1/2
simplify √
3 2 where x, y > 0;
x y
$$\frac{a}{b} = \frac{a}{b} \times 1 = \frac{a}{b} \times \frac{1/d}{1/d} = \frac{a/d}{b/d}.$$
This is easy when a and b are numbers, but takes more deliberation when the numerator and
denominator are algebraic expressions, and it is often not immediately obvious when two are equiv-
alent.
Example 2.2.16. We have that $\frac{e^{-x} + 1}{2e^{-x}} = \frac{1}{2}(1 + e^x)$ because
$$\frac{e^{-x} + 1}{2e^{-x}} = \frac{e^{-x} + 1}{2e^{-x}} \cdot \frac{e^x}{e^x} = \frac{e^{-x}e^x + e^x}{2e^{-x}e^x} = \frac{e^0 + e^x}{2e^0} = \frac{1}{2}(1 + e^x).$$
Example 2.2.17. We have that $\frac{x+3}{x} = \frac{x}{x} + \frac{3}{x} = 1 + \frac{3}{x} = 1 + 3x^{-1}$. (The first form is more
convenient for graphing, but the latter is better for differentiating. Be flexible!)
Example 2.2.18. Is $\frac{x}{x+2} = 1 + \frac{x}{2}$? NO. No way. (Plot their graphs: the left side has a vertical
asymptote at x = −2 and the right side is a straight line!) Sums in denominators rarely disappear.
The correct simplification here uses a simple form of long division:
$$\frac{x}{x+2} = \frac{x+2-2}{x+2} = \frac{(x+2)-2}{x+2} = \frac{x+2}{x+2} - \frac{2}{x+2} = 1 - \frac{2}{x+2}.$$
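A quick numerical spot check can expose a false simplification before it causes trouble. This is a minimal Python sketch; the function names `lhs`, `wrong` and `right` are ours, and the sample points are arbitrary (they only need to avoid x = −2).

```python
def lhs(x):
    return x / (x + 2)

def wrong(x):
    return 1 + x / 2          # the tempting but incorrect "simplification"

def right(x):
    return 1 - 2 / (x + 2)    # the form obtained by long division

for x in [-5.0, -1.0, 0.5, 3.0, 10.0]:   # sample points, avoiding x = -2
    print(x, lhs(x), wrong(x), right(x))
```

The first and last computed columns agree at every sample point, while the middle one does not. Agreement at a few points is evidence rather than proof, but a single disagreement is enough to reject a proposed identity.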
Note. The goal of simplification is to get a more tractable form of the expression in front of you,
one that is more suitable to the next step you need to do. So what “simplify” means often depends
on your next step. If it’s the last step: then “simplify” just means bringing it to an easily intelligible
form.
$$(a + b)(a - b) = a^2 - b^2$$
With more complex expressions, there are two common mistakes to avoid.
Working too quickly, you might put a minus sign in a likely-looking place, instead of where it’s
needed.
and this does not simplify further; nothing in sight is a perfect square.
Once you have the conjugate, it can help simplify a sum or difference of square root functions in a
denominator.
Example 2.2.22. Using the preceding calculation, we have
$$\frac{1}{x - \sqrt{x^2+2}} = \frac{1}{x - \sqrt{x^2+2}} \left( \frac{x + \sqrt{x^2+2}}{x + \sqrt{x^2+2}} \right) = -\frac{1}{2}\left(x + \sqrt{x^2+2}\right).$$
We swept one detail under the rug here: in fact, $(\sqrt{x})^2 = |x|$, not x. It was OK here because
$x^2 + 2 > 0$ for every x, so $|x^2 + 2| = x^2 + 2$. More on this in Section 2.3.2, below.
Exercise 2.2.23. Problems to try:
simplify $\dfrac{4 + \frac{1}{k}}{\frac{5}{k} - 2}$;
simplify $\dfrac{z^{-1} + 3}{z^{-2} + 2}$;
rationalize the denominator $\dfrac{1}{\sqrt{10} - 3}$;
simplify $\dfrac{\sqrt{6} - \sqrt{8}}{\sqrt{6} + \sqrt{8}}$.
$$a_0 + a_1 x + \cdots + a_n x^n$$
for some real numbers $a_0, a_1, \ldots, a_n$, and some integer n ≥ 0. If $a_n \neq 0$, then n is called the degree
of the polynomial.
The polynomials of degree 0 are constants (boring); the polynomials of degree 1 are called linear
(very useful). Where things get exciting is when n ≥ 2.
Basics on quadratics
$$ax^2 + bx + c = 0$$
We know how to factor quadratic polynomials. If we had two roots, $r_+$ and $r_-$, then the factorisation
is
$$ax^2 + bx + c = a(x - r_+)(x - r_-).$$
If there is only one root r, then it is a repeated root, and our factorisation is
If there are no real roots, then it does not factor. The most useful thing you might do in this case
(which you can do for any quadratic, whether it factors or not) is to complete the square:
$$ax^2 + bx + c = a\left(x + \frac{b}{2a}\right)^2 + c - \frac{b^2}{4a}.$$
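Completing the square is mechanical enough to sketch in a few lines of Python. The helper `complete_square` and the sample coefficients below are our own illustration, not part of the notes; the function simply evaluates the two constants in the formula above.

```python
def complete_square(a, b, c):
    """Return (h, k) such that a*x**2 + b*x + c == a*(x + h)**2 + k.

    From the formula in the text: h = b/(2a), k = c - b**2/(4a). Requires a != 0.
    """
    h = b / (2 * a)
    k = c - b**2 / (4 * a)
    return h, k

a, b, c = 2, -12, 23          # arbitrary sample coefficients
h, k = complete_square(a, b, c)
print(h, k)                   # -3.0 5.0, i.e. 2x^2 - 12x + 23 = 2(x - 3)^2 + 5
```

Since k = 5 is positive and a is positive here, the completed square also shows at a glance that this sample quadratic has no real roots.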
Applications of quadratics
Some equations don’t appear to be quadratics until you simplify — which is another reason to
always try to simplify an expression before doing the next step.
Example 2.2.25. Solve for x if $x + \frac{2}{x} = 3$.
$$x^2 + 2 = 3x \iff x^2 - 3x + 2 = 0 \iff (x - 1)(x - 2) = 0$$
so there are exactly two solutions, x = 1 and x = 2. We check this by plugging them back into the
ORIGINAL equation⁸ and yes, they are solutions.
Solution: Since $(e^{3x})^2 = e^{6x}$, we can set $u = e^{3x}$ and the equation becomes
$$u^2 - u = 20 \iff u^2 - u - 20 = 0 \iff (u - 5)(u + 4) = 0$$
8
Always check your answer in the original equation: we will see many examples in this course where we pick up
errant solutions from our intermediate steps that do not solve the original problem.
whose solutions are u = 5 and u = −4. But we want solutions for x. The equation u = 5 gives
$e^{3x} = 5$ or 3x = ln(5) or $x = \frac{1}{3}\ln(5)$. But equation u = −4 gives $e^{3x} = -4$, which has no solutions.
Therefore, we conclude that $x = \frac{1}{3}\ln(5)$ is the only solution to our original equation.
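The substitution argument can be replayed numerically. This Python sketch assumes the original equation is e^(6x) − e^(3x) = 20, which is what the substituted form u² − u = 20 with u = e^(3x) suggests; the variable names are ours.

```python
import math

# Solve u**2 - u - 20 = 0 with the quadratic formula, where u = e**(3x)
disc = (-1) ** 2 - 4 * 1 * (-20)             # discriminant = 81
u_candidates = [(1 + math.sqrt(disc)) / 2,   # 5.0
                (1 - math.sqrt(disc)) / 2]   # -4.0

# e**(3x) is always positive, so only positive u values lead to a real x
solutions = [math.log(u) / 3 for u in u_candidates if u > 0]
print(solutions)                             # a single value, ln(5)/3

x = solutions[0]
print(math.exp(6 * x) - math.exp(3 * x))     # should be 20 up to rounding
```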
$$x^3 - 7x + 6$$
then it can be impossible to find a formula for its roots.⁹ But (neat fact) if a polynomial has a
factorization with integer roots, those integers must divide the constant term.
So in this case, the possible roots are ±1, ±2, ±3, ±6. Plug in values until you find one root. In
this case, for example, you might find that x = 1 is a root, whence (x − 1) is a factor. Then use
long division
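$$
\begin{array}{rrrrr}
 & & x^2 & {}+x & {}-6 \\ \cline{2-5}
x - 1\ \big) & x^3 & & {}-7x & {}+6 \\
 & -x^3 & {}+x^2 & & \\ \cline{2-3}
 & & x^2 & {}-7x & \\
 & & -x^2 & {}+x & \\ \cline{3-4}
 & & & -6x & {}+6 \\
 & & & 6x & {}-6 \\ \cline{4-5}
 & & & & 0
\end{array}
$$
to get
$$\frac{x^3 - 7x + 6}{x - 1} = x^2 + x - 6 = (x - 2)(x + 3)$$
so
$$x^3 - 7x + 6 = (x - 1)(x - 2)(x + 3).$$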
Of course, we might have found all these roots by guessing, and that would have brought us to the
same result (quicker!).
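The root-hunting procedure can be sketched in Python. All three helper names below are our own. Note one swap: instead of written-out long division, the sketch uses synthetic division, a compact bookkeeping of the same computation that applies when dividing by a linear factor x − r.

```python
def integer_root_candidates(constant_term):
    """All positive and negative divisors of the (nonzero) constant term."""
    n = abs(constant_term)
    divisors = [d for d in range(1, n + 1) if n % d == 0]
    return [s * d for d in divisors for s in (1, -1)]

def evaluate(coeffs, x):
    """Evaluate a polynomial given as [leading, ..., constant] (Horner's rule)."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

def divide_by_linear(coeffs, r):
    """Synthetic division by (x - r): returns (quotient coefficients, remainder)."""
    row = []
    carry = 0
    for c in coeffs:
        carry = carry * r + c
        row.append(carry)
    return row[:-1], row[-1]

p = [1, 0, -7, 6]                      # x^3 + 0x^2 - 7x + 6
roots = [r for r in integer_root_candidates(p[-1]) if evaluate(p, r) == 0]
print(sorted(roots))                   # [-3, 1, 2]
print(divide_by_linear(p, 1))          # ([1, 1, -6], 0), i.e. quotient x^2 + x - 6
```

A remainder of 0 confirms that x − 1 really is a factor; a nonzero remainder would mean the candidate was not a root.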
Solution: Again, any integer roots need to divide −2. We try some values but p(1) ≠ 0, p(−1) ≠ 0,
... but p(2) = 8 − 4 − 2 − 2 = 0, so 2 is a root, so (x − 2) is a factor.
9
The technique of Newton’s method will give us roots, to any degree of precision, quickly — later in the course.
MAT 1330 : Fall 2020 2.3. INEQUALITIES AND ABSOLUTE VALUES (NEW)
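$$
\begin{array}{rrrrr}
 & & x^2 & {}+x & {}+1 \\ \cline{2-5}
x - 2\ \big) & x^3 & {}-x^2 & {}-x & {}-2 \\
 & -x^3 & {}+2x^2 & & \\ \cline{2-3}
 & & x^2 & {}-x & \\
 & & -x^2 & {}+2x & \\ \cline{3-4}
 & & & x & {}-2 \\
 & & & -x & {}+2 \\ \cline{4-5}
 & & & & 0
\end{array}
$$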
Exercise 2.2.28. Problems to try; be flexible and in each case think about what you could do to
make the expression look friendlier. Be willing to try a couple of options!
solve for x: $\frac{1}{x} + \frac{1}{x^2} = 1$;
solve for m: $m = \sqrt{m + 6}$;
solve for x: $\frac{4x}{1+x} = 3x$;
solve for y: $4^y - 3(2^y) + 2 = 0$ (Hint: $4^y = (2^y)^2$. Let $z = 2^y$ and thus make this one
complicated problem into two less complicated problems.)
factor $x^3 + 1000$;
divide $x^3 + x^2 + \frac{5}{4}x + 3$ by $x + \frac{3}{2}$;
find all values of k for which the quadratic $x^2 + 2kx + 9k - 8 = 0$ has only one solution for x.
You have used inequalities and absolute values in high school, but often not to the extent and in
the abstraction that we’ll need in this course. Much of this section may therefore be new to you,
and we’ll spend time on this in class.
$$2 < 3 \iff -2 > -3.$$
[Figure: number line marked from −3 to 3.]
Answer: We multiplied both sides of the inequality by x − 2, which is sometimes positive and
sometimes negative. So when x − 2 > 0 (meaning x > 2) then our reasoning holds, and we find
that the only values of x > 2 which satisfy the inequality are x > 5/2.
But when x − 2 < 0, meaning x < 2, then we have to change the direction of the inequality when
we multiply:
$$(x < 2):\qquad \frac{1}{x-2} < 2 \iff 1 \underbrace{>}_{\text{because } x - 2 < 0} 2(x - 2).$$
We can then further simplify to deduce that the inequality in this case is equivalent to
$$1 > 2x - 4 \iff 5 > 2x \iff x < \frac{5}{2} = 2.5.$$
What?! This is saying: when x < 2, the condition to satisfy the inequality is that x must be less
than 5/2 — which is always true. Therefore every x < 2 satisfies the condition.
Our answer: all x < 2 and all x > 5/2 : this is written as
(−∞, 2) ∪ (5/2, ∞)
which we read out loud as: the union of the open interval from negative infinity to 2 and the open
interval from 2.5 to positive infinity.
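A claimed solution set for an inequality can be tested against the original inequality on a grid of sample points. The Python sketch below does this for 1/(x − 2) < 2 and the answer (−∞, 2) ∪ (5/2, ∞); the function names and the choice of grid are ours.

```python
def satisfies(x):
    return 1 / (x - 2) < 2           # the original inequality (x != 2)

def in_claimed_solution(x):
    return x < 2 or x > 5 / 2        # the set (-inf, 2) U (5/2, inf)

# Sample points from -5 to 8 in steps of 0.1, skipping x = 2 (undefined there)
samples = [k / 10 for k in range(-50, 81) if k != 20]
mismatches = [x for x in samples if satisfies(x) != in_claimed_solution(x)]
print(mismatches)                    # an empty list means no sample disagrees
```

A grid check cannot prove the answer, but it catches a flipped inequality or a forgotten case very quickly.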
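Example 2.3.3. Another way to solve $\frac{1}{x-2} < 2$: avoid the problematic multiplication by
a variable term.
$$
\begin{aligned}
\frac{1}{x-2} < 2 &\iff \frac{1}{x-2} - 2 < 0 \\
&\iff \frac{1}{x-2} - \frac{2(x-2)}{x-2} < 0 && \text{common denominator} \\
&\iff \frac{5 - 2x}{x-2} < 0 \\
&\iff \frac{2x - 5}{x-2} > 0 && \text{multiplied both sides by } -1
\end{aligned}
$$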
which holds only if the numerator and the denominator have the same sign. So we work out:
2x − 5 > 0 iff x > 5/2; and x − 2 > 0 iff x > 2. So both are positive if x > 5/2 since 5/2 > 2.
2x − 5 < 0 iff x < 5/2, and x − 2 < 0 iff x < 2, so both are negative iff x < 2, since 2 < 5/2.
Another way to deduce the set of values x for which $\frac{2x-5}{x-2} > 0$ is to use a table, based on the
following principles:
Thus, noting that the numerator can only change sign at its root (2.5) and the denominator can
only change sign at its root (2), our table is
This kind of reasoning is solid and valid. If you are using it on a test, be sure to think it through,
rather than using some old memory of how the signs normally turn out — we will have to use this
reasoning for more complex functions, later.
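The sign reasoning can be sketched in code for the quotient (2x − 5)/(x − 2). This Python snippet is only an illustration of the idea (one test point per interval between consecutive roots), not a reproduction of the table in the notes; the helper `sign` and the chosen test points are ours.

```python
def sign(value):
    return "+" if value > 0 else "-" if value < 0 else "0"

# One test point inside each interval cut out by the roots 2 and 5/2
test_points = {"x < 2": 0.0, "2 < x < 5/2": 2.25, "x > 5/2": 4.0}

for label, x in test_points.items():
    num, den = 2 * x - 5, x - 2
    print(f"{label:12} numerator {sign(num)}  denominator {sign(den)}  "
          f"quotient {sign(num / den)}")
```

The quotient comes out positive on the first and last intervals and negative on the middle one, in agreement with the answer (−∞, 2) ∪ (5/2, ∞).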
If x > 0 then we can multiply both sides by x, to get −1 > x, or x < −1. But there are no positive
values of x which are less than −1! We conclude that there are no solutions arising from this case.
If x < 0, then we can still multiply both sides by x, but this time the direction of the inequality
changes. We thus get −1 < x. The negative values of x that satisfy the inequality are those between
−1 and 0.
Our total answer: the solution set is the interval (−1, 0) = {x ∈ R | −1 < x < 0}.
We can check our answer against reality by sketching the graph (for the original inequality).
y = 2/x + 5 is in blue; y = 3 is in red; the answer should be all x-values for which the blue graph
lies below the red line: so just the region between −1 and 0.
Example 2.3.5. Another way to solve the previous question:
$$\frac{2}{x} + 5 < 3 \iff \frac{2}{x} + 2 < 0 \iff \frac{2}{x} + \frac{2x}{x} < 0 \iff \frac{2 + 2x}{x} < 0 \iff \frac{2(x + 1)}{x} < 0.$$
This fraction is < 0 iff the numerator and the denominator have opposite signs. If the numerator
is negative (so x < −1) then the denominator is forced to be negative, so that doesn’t work. If the
numerator is positive (so x > −1) and the denominator is negative (so x < 0), then the quotient is
negative. So that’s the only solution interval.
(Or: create a table, based on the roots of the numerator and denominator.)
(a) easy: solve for all x such that $\frac{x}{2} - 3 > 5$;
(b) harder: solve for all x such that $\frac{2}{x} - 3 > 5$;
(c) solve for t: $\frac{1}{3}t + 4 > -1$;
(d) solve for t: $\frac{3}{t} + 4 < -1$;
(e) solve for x: $x^2 - 3x + 2 < x + 5$;
(f) solve for x: $\frac{x^2 - 3x + 2}{x + 5} < 0$.
2.3.2 Absolute values: how to handle them and how to solve equations
Definition 2.3.7.
$$|x| = \begin{cases} x & \text{if } x \ge 0; \\ -x & \text{if } x < 0. \end{cases}$$
Note. If your expression has variables, you can’t think of the absolute value as “stripping away the
minus sign” — because you can’t tell if your variable is negative or not. Instead, use the definition.
$$|x - 3| \neq |x| + 3 \quad\text{and}\quad |x - 3| \neq |x| - 3.$$
That is, the above equalities are sometimes accidentally true, but are NOT ALWAYS true. Find
values of x that make the equalities fail.
That is, you cannot simplify an expression by moving the absolute value signs.
Solution: So
|x − 3| = 5 means x − 3 = −5 or 5.
That is, one equation with absolute values actually turns out to be two equations, in disguise.
We solve:
$$x - 3 = 5 \iff x = 8, \quad\text{and}$$
$$x - 3 = -5 \iff x = -2.$$
Thus we found two solutions: x = −2 and x = 8. We check by plugging these back in: yes! ✓
Solution: So
$|x^2 - 3| = 1$ happens if and only if $x^2 - 3 = \pm 1$.
We solve both equations:
$$x^2 - 3 = 1 \iff x^2 = 4 \iff x = \pm 2;$$
and
$$x^2 - 3 = -1 \iff x^2 = 2 \iff x = \pm\sqrt{2}.$$
We got four solutions; plugging them back in, we see they are correct. ✓
Our answer: there are four solutions: $x \in \{\pm\sqrt{2}, \pm 2\}$.
But
$$x^2 - 3 = -7 \iff x^2 = -4 \ \ldots?$$
which has no (real number) solutions.¹⁰
So in this case, there are only two solutions, $x \in \{\pm\sqrt{10}\}$.
Solution: The simplest way to approach this is to first solve the equality, and then solve the
inequality.
So let’s solve $|x^2 - 5| = 4$, as above (exercise). We get four solutions: x ∈ {±1, ±3}.
Now consider a number line (or a table, if you prefer) with these 4 points:
[Figure: number line from −4 to 4 with the points −3, −1, 1 and 3 marked in red.]
Now, just as with polynomials, the function $|x^2 - 5| - 4$ can’t change sign without going through
zero.¹¹ The consequence: on each of these intervals between the red points the inequality we want
is either true or false, and we can detect which by plugging in values, for example. That is, we can
build a table to figure out the signs, just as we did before.
10
It does, however, have two complex solutions: x = ±2i, where i is a square root of −1. The complex numbers
are obtained from the real numbers by including this “imaginary” number i. We’ll only need or talk about complex
numbers a bit in MAT1332.
11
This is because it’s a continuous function, as we’ll review in Chapter 4.
Putting all these intervals together, we deduce the final answer is the union of two intervals:
(−3, −1) ∪ (1, 3)
that is, x satisfies $|x^2 - 5| < 4$ if and only if either −3 < x < −1 or 1 < x < 3.
Note. When you want to solve an inequality with an absolute value, first solve the equality, and
then use a table of values to decide where the inequality holds true.
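The "solve the equality, then test each interval" recipe is easy to mimic in Python. The sketch below applies it to |x² − 5| < 4; the function name `holds` and the particular test points are our own choices (any point strictly inside each interval would do).

```python
def holds(x):
    return abs(x**2 - 5) < 4

boundaries = [-3, -1, 1, 3]          # solutions of |x^2 - 5| = 4
# One test point in each of the five intervals cut out by the boundaries
test_points = [-4, -2, 0, 2, 4]

for x in test_points:
    print(x, holds(x))
# True appears only at -2 and 2, matching (-3, -1) U (1, 3)
```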
Things get more difficult when there are variables on both sides; it can happen that your two
equations generate ghost answers that do not solve the original equation.
Example 2.3.15. Solve $|x^2 - 3| = 3x + 1$.
Solution: we first solve $x^2 - 3 = \pm(3x + 1)$ (notice the parentheses!). This gives two equations to
solve:
$$x^2 - 3 = 3x + 1 \iff x^2 - 3x - 4 = 0 \iff (x - 4)(x + 1) = 0 \iff x \in \{-1, 4\},$$
and
$$x^2 - 3 = -(3x + 1) \iff x^2 - 3 = -3x - 1 \iff x^2 + 3x - 2 = 0.$$
To solve the latter, we need the quadratic formula; we get
$$x = \frac{-3 \pm \sqrt{9 + 8}}{2} = -\frac{3}{2} \pm \frac{1}{2}\sqrt{17}.$$
We make a table and plug these values in to check, and get a surprise:
$$
\begin{array}{c|c|c|c}
x & |x^2 - 3| & 3x + 1 & \text{equal?} \\ \hline
4 & 13 & 13 & \text{yes} \\
-1 & 2 & -2 & \text{no} \\
\frac{1}{2}(-3 + \sqrt{17}) & \left|\frac{1}{2}(7 - 3\sqrt{17})\right| & \frac{1}{2}(-7 + 3\sqrt{17}) & \text{yes} \\
\frac{1}{2}(-3 - \sqrt{17}) & \frac{1}{2}(7 + 3\sqrt{17}) & \frac{1}{2}(-7 - 3\sqrt{17}) & \text{no}
\end{array}
$$
Therefore, the solutions are (only) x = 4 and $x = \frac{1}{2}(-3 + \sqrt{17})$.
Notice what happened: we accidentally solved the equation $|x^2 - 3| = |3x + 1|$, by allowing all
combinations of signs; but not every solution to this equation is also a solution to our equation.
MAT 1330 : Fall 2020 2.4. FUNCTIONS
Note. Moral: When you want to solve an equality with an absolute value, like |f (x)| = 3x + 2:
solve the two equations f (x) = 3x + 2 and f (x) = −(3x + 2) and then plug the solutions back into the original equation to check.
In fact, this is the approach you need to use for many problems in this course: check your answer,
because we do a lot of complicated steps, and sometimes, we accidentally get stray “solutions”.
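The "plug it back in" step can be automated. As a sketch, the Python snippet below filters the four candidates found for |x² − 3| = 3x + 1 in Example 2.3.15, keeping only those that satisfy the original equation; the function names and the tolerance are our own choices.

```python
import math

def lhs(x):
    return abs(x**2 - 3)

def rhs(x):
    return 3 * x + 1

# Candidates from solving x^2 - 3 = +(3x + 1) and x^2 - 3 = -(3x + 1)
candidates = [4.0, -1.0, (-3 + math.sqrt(17)) / 2, (-3 - math.sqrt(17)) / 2]

# Keep only candidates that satisfy the ORIGINAL equation (within rounding)
solutions = [x for x in candidates if math.isclose(lhs(x), rhs(x), abs_tol=1e-9)]
print(solutions)   # keeps 4.0 and (-3 + sqrt(17))/2, drops the other two
```

The two rejected candidates are exactly the ones where 3x + 1 is negative, which an absolute value can never equal.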
Exercise 2.3.16. Try the following problems to build your confidence with absolute values:
End of lecture # 1
2.4 Functions
A function is a rule which assigns to each element in the domain a unique element in the range. In
this course, the domain and range will always be subsets of the real numbers. One way to specify
the domain D explicitly is to write such a function f with domain D ⊆ R as
$$f : D \to \mathbb{R}, \qquad x \mapsto f(x).$$
For example,
$$f : (0, \infty) \to \mathbb{R}, \qquad x \mapsto \frac{1}{\sqrt{x}}.$$
Very often, we just write something like
$$f(x) = \frac{x - 2}{\sqrt{x - 5}}$$
from which you may deduce that the natural domain or domain of definition is all those x for which
the formula is well-defined. Here, since $\sqrt{x - 5}$ is only defined if x − 5 ≥ 0, and since x = 5 is
excluded because we can’t divide by 0, we conclude that the domain of definition of f is {x | x > 5}
(that is, D = (5, ∞)).
We sometimes restrict the function to a smaller domain (for example, so that it becomes one-to-
one); in this case we say we specify the function on a given domain.
You can plug values into a function to get some values, but to understand a function is to understand
its behaviour as a whole. Calculus is about understanding this behaviour, but even without
Calculus, you can often identify key characteristics of a function that dictate its graph.
Definition 2.4.1. A function is even if for all x in the domain, −x is also in the domain and
f (x) = f (−x); its graph will be symmetric via reflection in the y-axis. A function is odd if instead
f (−x) = −f (x), which means the symmetry is via reflection in both axes.
Examples of even functions: $x^2$, cos(x); examples of odd functions: $x^3$, sin(x). Most functions are
neither even nor odd.
Definition 2.4.2. A function f is said to be increasing on an interval I in its domain if for every
$x_1, x_2 \in I$ with $x_1 < x_2$, we have $f(x_1) \le f(x_2)$. (It is called strictly increasing if you in fact have
the stronger condition $f(x_1) < f(x_2)$.) The function f is an increasing function if it is increasing
on every interval of its domain.
y = −1/x is on the left; it is increasing on (−∞, 0) and on (0, ∞) but not defined at 0, so is an
increasing function. (A break in the domain means you are allowed to reset.) The graph of an
increasing function which is not strictly increasing is on the right.
We can define a decreasing function similarly.
Linear transformations We can transform functions by linear operators, which retain the general
shape of the graph. For example,
the graph of y = f (x) + 2 is obtained from the graph of y = f (x) by shifting two units up;
the graph of y = f (x + 2) is obtained from the graph of y = f (x) by shifting two units to the
left;
the graph of y = 2f (x) is obtained from the graph of y = f (x) by doubling the y-values
(stretching vertically by a factor of 2);
the graph of y = f (2x) is obtained from the graph of y = f (x) by halving the x-values
(stretching horizontally by a factor of 1/2).
Notice how transformations to the output variable have the obvious effect you would guess, but
transformations to the input variable have the opposite effect. If you’re ever confused for a moment:
ask yourself what happens at x = 1, for example.
The graph of $y = f(x) = x^4 - x^3 - x^2$ is in blue; the graph of y = f (x) + 5 is in black; and the
graph of y = f (x + 5) is in red.
Composition More generally, we can compose functions, which means to evaluate them se-
quentially (rather than in parallel, as one does for multiplication). The important point is that
composition is not commutative, meaning, order matters.
$$(g \circ f)(x) = g(f(x)) = g(x^5 + 3x) = (x^5 + 3x)^2 = x^{10} + 6x^6 + 9x^2.$$
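That order matters in composition can be seen directly in code. The Python sketch below assumes, as the displayed computation suggests, that f(x) = x⁵ + 3x and g(x) = x²; the helper `compose` is our own.

```python
def f(x):
    return x**5 + 3 * x

def g(x):
    return x**2

def compose(outer, inner):
    """Return the function x -> outer(inner(x))."""
    return lambda x: outer(inner(x))

g_after_f = compose(g, f)     # (g o f)(x) = (x^5 + 3x)^2
f_after_g = compose(f, g)     # (f o g)(x) = x^10 + 3x^2

print(g_after_f(2))           # (32 + 6)^2 = 1444
print(f_after_g(2))           # 4^5 + 3*4 = 1036
```

The two outputs differ at x = 2, so g ∘ f and f ∘ g are different functions.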
This function is obtained by splicing three constant functions together. Some of the most interesting
medical questions about this dosage model concern the transition points; these are also the most
interesting points from the point of view of Calculus.
Example 2.4.4. Another drug’s maximum daily dosage, in mg, expressed as a function of mass,
in kg, is given by
$$g(x) = \begin{cases} 0 & \text{if } x < 30 \\ 4x - 120 & \text{if } x > 30. \end{cases}$$
Notice that the x-intercept of the function 4x − 120 is x = 30, so there is no jump at the transition
point. The transition is still sharp (sketch the graph); we say the graph has a “cusp”. Again,
this cusp is the most interesting point, from both the applications and the mathematical point of
view.
Note. Notice that we swapped variables to define the function. The convention is always to use x
as the independent variable, and y as the dependent variable.
Example 2.4.5. Find the inverse function of $y = \frac{5x + 1}{3x - 2}$.
Solution: To make sure the inverse function exists, we could sketch the graph and verify that f is
one-to-one; or else we can try to solve for x in terms of y, and if we get a unique solution for each
y (as opposed to having choices) then our function is one-to-one and we will have found a formula
for the inverse. So let’s do that:
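$$
\begin{aligned}
& (3x - 2)y = 5x + 1 \\
\iff\ & 3xy - 2y = 5x + 1 && \text{expand} \\
\iff\ & 3xy - 5x = 2y + 1 && \text{group terms with } x \text{ together} \\
\iff\ & x(3y - 5) = 2y + 1 \\
\iff\ & x = \frac{2y + 1}{3y - 5}
\end{aligned}
$$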
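A formula for an inverse can be sanity-checked by composing it with the original function. In this Python sketch the names `f` and `f_inverse` are ours, and the sample inputs are arbitrary (they avoid x = 2/3, where f is undefined).

```python
def f(x):
    return (5 * x + 1) / (3 * x - 2)      # defined for x != 2/3

def f_inverse(y):
    return (2 * y + 1) / (3 * y - 5)      # defined for y != 5/3

for x in [-4.0, 0.0, 1.0, 7.5]:
    y = f(x)
    print(x, y, f_inverse(y))             # last column should reproduce x
```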
In the latter two cases, we had to limit the domain of definition to a smaller one on which the
function was one-to-one. There were many choices but the world (and your calculator) has agreed
to use these.
Linear functions are the simplest kind of relationship that we can express between two variables.
Very often, in an experiment, you will hope your points fall on a straight line, confirming a linear
relationship.
Quadratic functions show up in some logistic growth (= limited resource) population models: the
rate of growth relative to population size hits a peak, then decreases.
If n ≥ 3, then this is a higher-degree polynomial. We recognize their shapes, but it is more difficult
to find intercepts algebraically. Moreover, these functions have many more features (local maxima
and minima) which are easiest to find using Calculus.
$y = (x - 5)^3 + (x - 5)^2 - 3(x - 5) + 5$ is in blue; $y = (x + 2)^4 + 2(x + 2)^3 - 5(x + 2)^2 + 2$ is in red;
$y = x^5 - 2x^4 - 5x^3 + 3x^2 + x - 7$ is in black. Note the limits as x → ±∞, as well as the number of
local minima and maxima, in relation to the degree of the polynomial.
Given a polynomial of degree ≥ 3, if we know one root, we can use long division to simplify and
perhaps find other roots. See Section 2.2.5.
Very often (that is, unless the root of the denominator is also a root of the numerator), a root of
the denominator induces a vertical asymptote of the graph of f .
The graph of y = 1/x is in blue; the graph of $y = 1/x^2$ is in red. Notice that y = 1/x is an odd
function whereas $y = 1/x^2$ is even. Both have a vertical asymptote at x = 0.
Exercise 2.4.7. Find the domain of $f(x) = \frac{x + 1}{x^2 - 2}$, and of $g(x) = \frac{2}{x - 1} + \frac{5}{x^2 + 3}$.
A function like
$$f(x) = \sqrt{x^2 + 2}, \quad g(x) = \sqrt{x - 5}, \quad\text{or}\quad h(x) = \sqrt[3]{x - 1}$$
is a root or radical function. More generally, a function of the form
$$f(x) = g(x)^{a/b}$$
where a/b is a reduced fraction that is not an integer and g(x) is a polynomial, might be called a
radical function.
If b is even then
If b is odd (like a cube root) then f (x) is also defined for negative values (for example,
$\sqrt[3]{-8} = -2$)
If the power is a real number but not expressed as a fraction, then the domain of definition
is the same as for the b even case.
These kinds of functions commonly show up as part of a distance formula; for example, the distance
between (x, 1) and (1, 2) is $d(x) = \sqrt{(x - 1)^2 + 1}$. They are also common in various biological
applications. For example, Kleiber’s law states that the metabolic rate is proportional to $m^{3/4}$
where m is the individual’s mass. For another example: heart rate has been observed to be
proportional to $m^{1/4}$.
Example 2.4.8. Find the domain of $g(x) = \sqrt{x^2 - 5}$.
Solution: $\sqrt{x^2 - 5}$ is defined only if $x^2 - 5 \ge 0$, or $x^2 \ge 5$. So the domain is the set
$(-\infty, -\sqrt{5}] \cup [\sqrt{5}, \infty)$.
The square root function $y = \sqrt{x}$ in particular is only defined on half of the real line.
The graph of $y = \sqrt{x^2 - 5}$ is in blue; the graph of $y = \sqrt{x}$ is in red. Notice how they are not
defined on the entire real line.
The graph of y = |x|. The graph of the absolute value function is obtained by splicing together
the graphs of y = −x (for x < 0) with the graph of y = x (for x > 0).
Example 2.4.10. In Example 2.3.10 we solved |x − 3| = 5, and found the solutions were x = −2
and x = 8. We can see this makes sense by sketching the graph of y = |x − 3| (which is the
translation of the graph of y = |x| by 3 units to the right) and seeing where it intersects the line
y = 5, below.
What happens when we compose the absolute value function with another function f ?
Given y = f (x), the graph of y = |f (x)| is obtained by reflecting any parts below the x-axis
upwards. That’s because we took the absolute value of the output of f (x), so our final answer
must be ≥ 0.
On the other hand, the graph of y = f (|x|) is obtained by erasing the parts to the left of the
y-axis and instead reflecting the graph to make it even. That’s because we took the absolute
value of the input to f : we refuse to evaluate f on any negative inputs.
The graph of y = | sin(x)| is on the left; the graph of y = sin(|x|) is on the right.
Notice how we had packed a rather complex function into some really concise notation.
Now, that’s all well and good, but to understand for which x we need to use $x^2 - 3$ and for which
x we need to use $-(x^2 - 3)$, we need to solve the inequality $x^2 - 3 \ge 0$. Good thing we reviewed
that in Section 2.3.1.
So, to get a really useful formula for $|x^2 - 3|$, we need to solve the conditions $x^2 - 3 \ge 0$ (and
$x^2 - 3 < 0$). Well,
$$x^2 - 3 \ge 0 \iff x^2 \ge 3 \iff x \ge \sqrt{3} \ \text{ or } \ -x \ge \sqrt{3}.$$
Since $-x \ge \sqrt{3}$ is the same as $x \le -\sqrt{3}$, we can rewrite our definition of $|x^2 - 3|$ in the following
very helpful way:
$$
|x^2 - 3| = \begin{cases}
x^2 - 3 & \text{if } x \le -\sqrt{3}; \\
-(x^2 - 3) & \text{if } -\sqrt{3} < x < \sqrt{3}; \\
x^2 - 3 & \text{if } x \ge \sqrt{3}.
\end{cases}
$$
You can verify this is true by plugging in different values for x (like −10, 0, 10) and seeing that
$|x^2 - 3|$ coincides with the formula above.
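The suggested verification can be carried out in a few lines of Python. The function `abs_piecewise` below is our own transcription of the three-case formula; it is compared against the built-in `abs` at a handful of sample points.

```python
import math

def abs_piecewise(x):
    """|x^2 - 3| written without absolute value bars, using the three cases."""
    root3 = math.sqrt(3)
    if x <= -root3:
        return x**2 - 3
    elif x < root3:
        return -(x**2 - 3)
    else:
        return x**2 - 3

for x in [-10, -1.5, 0, 1.5, 10]:
    print(x, abs_piecewise(x), abs(x**2 - 3))   # the last two columns should agree
```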
In Example 2.3.11 we solved $|x^2 - 3| = 1$ algebraically. Now we can see graphically why we ended
up with a total of four solutions: the bottom part of the parabola $y = x^2 - 3$ was reflected up, and
thus the graph of $y = |x^2 - 3|$ meets y = 1 four times in total. See the graph below.
$y = |x^2 - 3|$ is in black; y = 1 is in red; the answer is the x-values for the points of intersection,
which are ±2 and $\pm\sqrt{2}$.
You can also use this graph to understand why the equation $|x^2 - 3| = 7$ in Example 2.3.12 has
only two solutions.
Example 2.4.12. In Example 2.3.13 we solved $|x^2 - 5| < 4$ algebraically and deduced that the
answer was x ∈ (−3, −1) ∪ (1, 3). If we draw the graph we see why it is true.
$y = |x^2 - 5|$ is in blue; y = 4 is in red; the answer is the x-values such that the blue curve lies
below the red line, which are the intervals (−3, −1) and (1, 3).
Exercise 2.4.13. Sketch the graphs of $y = |x^2 - 3|$ and y = 3x + 1 to see where they intersect.
Compare with Example 2.3.15.
The basic trigonometric functions are sine and cosine. Their quotient is called the tangent function
because you can think of it as measuring a slope.
Three perspectives:
Triangles Given a right triangle with an acute angle θ, we label the adjacent and opposite sides,
as well as the hypotenuse, and have
$$\sin(\theta) = \frac{\text{opp}}{\text{hyp}}, \quad \cos(\theta) = \frac{\text{adj}}{\text{hyp}}, \quad \tan(\theta) = \frac{\text{opp}}{\text{adj}}.$$
This perspective is good for practical applications, and for figuring out the values of your
trigonometric functions at your favourite angles (using geometry) but it’s limited to $0 < \theta < \frac{\pi}{2}$.
The circle Given a point on the unit circle, at angle θ (measured counterclockwise from the
positive x-axis), its coordinates are
$$(\cos(\theta), \sin(\theta))$$
and the slope of the line from the origin to this point is tan(θ) (except when θ is an odd
multiple of π/2, where the line is vertical, so tan(θ) is not defined).
This perspective is excellent: you can tell with a glance at the quadrant what the signs will
be, and can fit your triangles into the picture to figure out the common ratios.
Their graphs For every input θ, we get an output sin(θ). Therefore this is a function on the real
line, and we can draw the graph of input versus output as usual. That said: our favourite
letter for the independent variable (input) of a function is x, and our favourite letter for the
dependent variable (output) of a function is y, so in this context we’d write y = sin(x), or
y = cos(x), etc.
This is the perspective we’ll use most in Calculus: trigonometric functions are the most
fundamental examples of periodic functions.
Note. Calculus only works with radians. Degrees have been obsolete since the 1650s. Spread the
word.
The six trigonometric functions Sine and cosine are two of the six trigonometric functions
$$\tan(x) = \frac{\sin(x)}{\cos(x)}, \quad \csc(x) = \frac{1}{\sin(x)}, \quad \sec(x) = \frac{1}{\cos(x)}, \quad \cot(x) = \frac{\cos(x)}{\sin(x)}.$$
These kinds of functions are crucial for modeling periodic phenomena, like the swinging of a pen-
dulum, or population cycles, or circadian rhythms (eg, sleep-wake cycle). See their graphs, below.
Wave functions These trigonometric functions are the building blocks we use in many instances
to model periodic phenomena. As an important example, a wave function is characterized by its
mean, amplitude, period and phase. For example, the function
$$f(x) = M + A\cos\left(\frac{2\pi}{T}(x - \phi)\right)$$
has:
amplitude A (meaning, this is its maximum distance from the mean, in absolute value),
period T (meaning, the graph repeats after time T ; more precisely, f (x) = f (x + T ) for all
x), and
phase ϕ (meaning, the left-right displacement of the wave; here, the peak will be at x = ϕ
rather than the usual x = 0 which it would be for cos(x)).
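The roles of the four parameters can be checked numerically. In this Python sketch the function `wave` and its parameter names are our own, and the sample values for M, A, T and ϕ are chosen only for illustration.

```python
import math

def wave(x, mean, amplitude, period, phase):
    """M + A*cos((2*pi/T)*(x - phi)), with parameter names chosen for readability."""
    return mean + amplitude * math.cos(2 * math.pi / period * (x - phase))

M, A, T, phi = 3.0, 1.5, 4 * math.pi, 5 * math.pi / 2   # sample values

print(wave(phi, M, A, T, phi))            # peak value M + A = 4.5 at x = phi
print(wave(phi + T / 2, M, A, T, phi))    # trough value M - A, about 1.5
print(wave(1.0, M, A, T, phi), wave(1.0 + T, M, A, T, phi))  # equal up to rounding
```

The first line confirms the peak sits at x = ϕ, the second that the wave swings a distance A on either side of the mean, and the third that shifting x by T leaves the value unchanged.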
The graph of $y = 3 + 1.5\cos\left(\frac{2\pi}{4\pi}\left(x - \frac{5\pi}{2}\right)\right)$, above the graph of y = cos(x), for comparison. It
has mean 3, amplitude 1.5, period 4π and phase 5π/2. Each of these is a standard graph
transformation (shift or stretch), but physicists tend to use these special names to identify them.
Trigonometric identities There are many excellent trigonometric identities, but the key one is
The next most useful thing to know: the values of sin(x) and cos(x) at x = 0, π/6, π/3, π/2, 2π/3, π.
There’s a lot of repetition in this table. In fact, by reflecting through the four quadrants you can
see:
sin(−x) = − sin(x) and sin(π − x) = sin(x)
whereas
cos(−x) = cos(x) and cos(π − x) = − cos(x).
Don’t forget that these are not the only repeated values! Sine and cosine are periodic functions of
period 2π, meaning
for every integer (positive or negative) k. On the other hand, tangent is already periodic of period
π, as you can see from the graph:
Your calculator has a button for the inverse sine function, called arcsine or inverse sine and written
arcsin or $\sin^{-1}$; but it only gives one answer (being a function!), and the one it gives is the answer
in the interval $-\frac{\pi}{2} \le \theta \le \frac{\pi}{2}$. USE RADIANS.
Note. WARNING: $\sin^{-1}$ is the inverse function of sin; it is NOT csc. THIS IS HORRIBLE
NOTATION AND I’M REALLY SORRY. We want to get rid of it but notice how we’re still
fighting, almost 400 years later, to get rid of degrees; fixing this annoying notation is going to take
a while.
Similarly, to solve for all x such that cos(x) = r, arccos(r) gives you just one answer (in the interval
0 ≤ x ≤ π) and you use the identity cos(x) = cos(−x), as well as its periodicity, to generate the
other solutions.
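Generating the other solutions from the single arccos value can be sketched as follows. The helper `cos_solutions` and its default range of k values are our own; `math.acos` is the standard-library arccosine and works in radians.

```python
import math

def cos_solutions(r, k_values=range(-1, 2)):
    """Solutions of cos(x) = r of the form +/-arccos(r) + 2*pi*k, for the given k.

    Hypothetical helper; requires -1 <= r <= 1.
    """
    base = math.acos(r)                    # the one answer in [0, pi]
    found = set()
    for k in k_values:
        found.add(base + 2 * math.pi * k)  # periodicity applied to +arccos(r)
        found.add(-base + 2 * math.pi * k) # cos(x) = cos(-x), then periodicity
    return sorted(found)

for x in cos_solutions(0.5):
    print(x, math.cos(x))                  # every cosine should be close to 0.5
```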
See also Sections 2.2.2 and 2.2.3; there’s repetition because this is important, and will be used a
lot in this course. Our perspective here is about the functions, rather than just the algebra.
For any number a > 0 with a ≠ 1 we define the exponential function with base a as $f(x) = a^x$. The domain
is all of R, but its range is only the positive real numbers (because $a^x \le 0$ can never happen).
The base $a = e \approx 2.718\ldots$, called Euler’s number, is the distinguished base in Calculus, called the
natural base. We sometimes write $f(x) = e^x = \exp(x)$, especially when the exponent is bulky and
would be hard to write above the e.
The graphs of various exponential functions. We have $y = e^x$ in blue, $y = 2^x$ in black (note that
1 < 2 < e), $y = \left(\frac{1}{2}\right)^x = 2^{-x}$ in red (note that 0 < 1/2 < 1), and $y = 10^x$ in green (note that 10 > e).
A short line segment of slope one is tangent to the graph of $y = e^x$ at x = 0; the other exponential
functions have slope ln(a) at x = 0.
Exponential functions arise in a huge variety of applications, including bacterial growth, radioactive
decay, and continuously compounded interest. They are among the most important functions used in
biology.
For any a > 0 with a ≠ 1 we also have the inverse function of the exponential, which is called the logarithm.
It is defined by the relation
$\log_a(x) = y \iff a^y = x$
which is saying that these are inverse functions. In particular, note that the domain of the log
function is only the positive part of the real line, (0, ∞), but its range is all of R.
The natural logarithm is $\ln(x) = \log_e(x)$. In the physical sciences, $\log_{10}(x)$ is very common (such as
for the pH scale); so if we write log(x) without specifying the base, we mean base 10.¹² Logarithms
come up a great deal because exponential functions do, and to solve an equation like
$300 = 200 + 5e^{x-7}$
you need the logarithm.
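As a worked sketch of how the logarithm is used on this equation (the intermediate steps are standard algebra and are not taken from the notes; the decimal value is rounded):

```latex
\begin{align*}
300 &= 200 + 5e^{x-7} \\
100 &= 5e^{x-7} \\
20 &= e^{x-7} \\
\ln(20) &= x - 7 \\
x &= 7 + \ln(20) \approx 9.996
\end{align*}
```

The only step that needs the logarithm is the fourth one, where ln undoes the exponential.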
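Note. The laws of logarithms (let’s state them for ln, but similar rules hold for all logarithms):
for every x, y > 0 and every real number p we have
$\ln(xy) = \ln(x) + \ln(y)$, $\ln(x/y) = \ln(x) - \ln(y)$, and $\ln(x^p) = p\ln(x)$.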
The logarithm is a key tool when your measurements will vary by orders of magnitude (from $10^{-7}$
to $10^{7}$, for example), and so it is actually the exponent you are interested in, rather than the digits
of the value. For example, the pH of a chemical solution is given by $p = -\log_{10}(k)$.
In Calculus, the logarithm is a wonderful function that allows you to simplify complex expressions
so they are easier to deal with. For example, for all x > 3 we have
$\ln\left(\frac{x^2 (x-3)}{(x+2)^7}\right) = 2\ln(x) + \ln(x-3) - 7\ln(x+2).$
The right side is a lot easier to differentiate, or even to plug in values on your calculator.
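The identity above can be spot-checked numerically. The following is a minimal sketch, assuming Python's standard `math` module; the helper names `lhs` and `rhs` and the sample points are illustrative choices, not part of the notes.

```python
import math

def lhs(x):
    # Left side: a single logarithm of the full quotient (needs x > 3).
    return math.log(x**2 * (x - 3) / (x + 2)**7)

def rhs(x):
    # Right side: the expanded form obtained from the laws of logarithms.
    return 2 * math.log(x) + math.log(x - 3) - 7 * math.log(x + 2)

for x in (3.5, 4.0, 10.0):
    # The two columns should agree up to floating-point round-off.
    print(x, lhs(x), rhs(x), math.isclose(lhs(x), rhs(x)))
```

A numerical check like this does not prove the identity, but it is a quick way to catch a sign or exponent slip when expanding a logarithm by hand.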
12
In computer science, the most common is base 2, so in that subject log(x) means by default $\log_2(x)$.
Note. Love logarithms! They turn exponents into multiplication, and multiplication into addition
— they make expressions EASIER!
2.4.8 Summary
From Gizmodo.com, with thanks; posted by Philips Shiu. Can you spot and fix the mistakes?
End of lecture # 2
Chapter 3
Let’s now put these functions to their intended use: mathematical modeling.
Change is what we study in science, and the life sciences are full of examples. Individuals grow and
die; the size of a population varies; individuals physically move within their environment; individuals
can change; wounds heal; hearts beat regularly; the immune system responds to threats; diseases
spread through populations; drugs are absorbed into the bloodstream; ...
One key goal is to be able to predict future states from the present state, based on understanding
the mechanisms of the change. For example, if we know how an organism’s life cycle depends on
the external temperature, we can predict future developments under climate change.
Experiments can sometimes tell us what the future could bring, by allowing us to extrapolate from
past to future — but experiments can be costly, risky, impractical, or have large time requirements.
Mathematical models are invaluable tools to help prediction. Based on experimental data and/or
our understanding of the mechanisms of change, mathematical models are used in a huge variety
of applications. We use them to try to predict the weather, the stock market, the progress of a
pandemic, and to regulate species harvesting and management.
In this course, we will work through examples of doing each of the three steps, with a very large
emphasis on the second step (analysis). The ultimate goal of MAT1330 and MAT1332 is for you
to have the tools to model, analyse and interpret phenomena in the life sciences, pertinent to your
scientific interests.
Suppose we grow bacteria in four petri dishes, measuring the amount of bacteria in cm² (that is,
by area) before and after a 24-hour growth period. We collect the following empirical data.
Let’s write this as a formula. If $x_{\text{today}}$ is the area covered by bacteria today, then
$x_{\text{tomorrow}} = 2x_{\text{today}}.$
Predicting one day into the future is nice, but how can we go further? Answer: repeat!
Let’s formalize this idea, and also give ourselves a cleaner and clearer notation.
MAT 1330 : Fall 2020 3.2. FIRST EXAMPLES
an updating function f that describes the change during a single time step.
The quantity was the area covered by bacteria in our petri dish. The time step is days (or 24-hour
increments). The updating function in our bacteria example was f(x) = 2x. Writing $x_0$ for the
amount today, $x_1$ for the amount tomorrow, $x_2$ for the amount in two days, etc., we say our DTDS
is $x_{t+1} = f(x_t)$, or, explicitly:
$x_{t+1} = 2x_t.$
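Repeating the updating function is easy to mechanize. Here is a minimal sketch in Python; the helper name `iterate` and the starting area of 1.5 cm² are hypothetical choices made only for illustration.

```python
def iterate(f, x0, steps):
    """Return [x_0, x_1, ..., x_steps] for the DTDS x_{t+1} = f(x_t)."""
    xs = [x0]
    for _ in range(steps):
        # Apply the updating function to the most recent value.
        xs.append(f(xs[-1]))
    return xs

def double(x):
    # Updating function of the bacteria example: f(x) = 2x.
    return 2 * x

areas = iterate(double, 1.5, 4)
print(areas)  # [1.5, 3.0, 6.0, 12.0, 24.0]
```

The same helper works for any updating function, which is the point of separating the rule f from the act of repeating it.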
Bamboo is one of the fastest-growing plants on Earth, with a growth rate of 3 cm/hour (!). So
let us take x as the length of bamboo in cm and t the time measured in hours (so the time step
is 1 hour). At the end of each hour, x has increased by 3. Therefore the updating function is
f(x) = x + 3. The DTDS is $x_{t+1} = x_t + 3$.
Suppose you prescribe your patient pain medication as a daily dose. The daily dose increases the
amount of drug in their body, but over the course of the day, the body is absorbing and eventually
eliminating (some of) the drug.
Let us use a time step of 1 day, and let $x_t$ be the amount of drug in the body in mg just after the
daily dose is administered. What is the updating function in this case?
We think of what happens over the course of a day. If the patient began with $x_t$ mg in their
bloodstream, then over the day they eliminate a certain percentage of the drug. Just before we
next measure, a constant amount of drug is added. We could write this mathematically as
$x_t \xrightarrow{\text{elimination}} r x_t \xrightarrow{\text{intake}} r x_t + c$
where 0 ≤ r < 1 is the percentage (meaning: fraction) of drug left in the bloodstream after one
day (so 1 − r is the rate of elimination) and c > 0 is the amount of drug administered.
Therefore our updating function is f(x) = rx + c and the DTDS is $x_{t+1} = r x_t + c$.
Example 3.2.5. In the preceding example, if a certain drug is cleared at a rate of 25% per day
and the daily dose is 10 mg, then r = 1 − 0.25 = 0.75 and c = 10, giving a DTDS of
$x_{t+1} = 0.75x_t + 10.$
1. In Example 3.2.5, suppose instead that we measure the amount of drug in the body each day
immediately before the daily dose. What is the DTDS in this case?
2. Suppose the DTDS is $x_{t+1} = 0.75x_t + 10$ and the initial drug level in the bloodstream was $x_0 = 8$
mg (that is, immediately after the first dose). What is the amount of drug in the body after the
next daily dose? In two days? Plot these points. In your opinion, how does the amount of drug
in the bloodstream vary over the course of the day? Should we just connect the dots with straight
lines, or will the graph be much spikier? Does this model capture the maximum amount of drug
in the bloodstream per day, or the minimum? What about if we model it as the exercise above?
3. A DTDS need not only model applications in the life sciences. Assume you borrow $1000 and
agree to pay back $50 per month. The bank charges 0.5% in continuously compounded interest
per month.¹ Write the updating function for this DTDS, where x_t is the amount that you owe
immediately after making the t-th payment, and t is measured in months.
What else does the updating function of a DTDS give you? The updating function f of
a DTDS tells you what happens to your measured value from one time step to the next. Therefore
it can also see several steps into the future, or the past.
Note. The updating function is NOT telling you what your measured value is at time t — it is
only describing CHANGE. That is, we calculate NOT f (t) but rather f (xt ).
MAT 1330 : Fall 2020 3.3. SOLUTIONS OF A DTDS
2. Consider $f(x) = \frac{1}{2}x + 2$. Calculate the two-time step map: f ∘ f. Calculate the previous time
step map: $f^{-1}$.
3. Suppose our DTDS models bacterial growth by $x_{t+1} = 3x_t$, where $x_t$ is measured in cm² and t
is measured in intervals of 6 hours. Thinking of 24 hours as iterating the DTDS four times,
give the DTDS for bacterial growth where t is instead measured in days (that is, each interval
is 24 hours). Starting with an initial condition of 10 cm² of bacteria, check your answer by
finding $x_4$ in the first (6-hour) model and finding $x_1$ in the second (24-hour) model.
4. Using the DTDS of the previous exercise: how much bacteria did we start with if after 12
hours we have 100 cm²? Check by evaluating the original DTDS on the value you got.
We now know what a DTDS is. What is the “solution” of a DTDS? The goal was to predict future
and past events using the present (initial condition) and the short-term mechanism for change (the
updating function). So one answer is: the solution of a DTDS is the sequence of all future values.
Definition 3.3.1. The solution of the DTDS $x_{t+1} = f(x_t)$ with initial value $x_0$ is the sequence
$\{x_0, x_1, x_2, x_3, \ldots\}$
Example 3.3.2. (Bacteria) The solution of $x_{t+1} = 2x_t$ with $x_0 = 20$ is {20, 40, 80, 160, ...}.
Example 3.3.4. (Drug model) If our DTDS is $x_{t+1} = 0.75x_t + 10$ with initial concentration $x_0 = 0$,
then we calculate
$x_1 = f(x_0) = 0.75(0) + 10 = 10$
$x_2 = f(x_1) = 0.75(10) + 10 = 17.5$
$x_3 = f(x_2) = 0.75(17.5) + 10 = 23.125$
$x_4 = f(x_3) = 0.75(23.125) + 10 = \cdots$
So a solution is not a single number, nor is it a finite set of numbers; it is an entire sequence,
consisting of infinitely many numbers. Here, we are talking about the solution to a dynamical
system, rather than the solution to an equation.
Another way of saying it: the solution is the graph of xt versus t, which is an infinite sequence of
points, one per time t.
Remark 3.3.5. Note that the solution of the DTDS is not the same as the updating function at
all: the solution is xt versus t but the updating function was xt+1 versus xt . You can see they are
different in the examples above.
So if you have written down the solution up to the 20th element of the sequence2 , then you can
easily answer “What is xt ?” for each t from 0 to 20.
That said, wouldn’t it be nice to have a simple formula (in terms of t) for each element of the
sequence? Then we wouldn’t have to write a long list, but could instead answer “What is xt ?”
using the formula.
There is one case where it’s super easy to write down the solution: when x0 is a fixed point of the
DTDS.
Definition 3.3.6. Suppose a DTDS has updating function f (x). Then any value x∗ satisfying
f (x∗ ) = x∗ is called a fixed point or an equilibrium or a steady state of the DTDS.
In this notation, the ∗ is just a decoration we put on the letter x to make it stand out, and you
don’t need to use it.
Example 3.3.7. We notice that when f (x) = 2x, then x∗ = 0 is a steady state. Indeed, if we start
with a population of x0 = 0, then the solution is just {0, 0, 0, . . . } — that is, the population stays
steady at that value.
This is always true: if $x_0 = x^*$, then $x_1 = f(x_0) = f(x^*) = x^*$, and $x_2 = f(x_1) = f(x^*) = x^*$, and
so on. That is, the solution is $\{x^*, x^*, x^*, \ldots\}$. It is a fixed point because it doesn’t change over
time; this makes it an equilibrium of the system.
Assume that we are given a linear DTDS , that is, a DTDS of the form
xt+1 = rxt + c
where r and c are some constants, that is, real numbers that will not vary over time. For example,
for bacterial growth, we take r = 2 and c = 0; for bamboo growth, we take r = 1 and c = 3; for
medication levels, our constants will satisfy 0 ≤ r < 1 and c > 0, with actual values depending on
the drug.
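Starting from $x_0$, we iterate:
$x_1 = r x_0 + c$
$x_2 = r x_1 + c = r(r x_0 + c) + c = r^2 x_0 + c(r + 1)$
$x_3 = r x_2 + c = r(r^2 x_0 + c(r + 1)) + c = r^3 x_0 + cr(r + 1) + c = r^3 x_0 + c(r^2 + r + 1)$
$x_4 = r x_3 + c = r^4 x_0 + cr(r^2 + r + 1) + c = r^4 x_0 + c(r^3 + r^2 + r + 1)$
The pattern is
$x_t = r^t x_0 + c(r^{t-1} + r^{t-2} + \cdots + r + 1).$ (3.2)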
This is wonderful! To find $x_t$, we can use (3.2) directly, instead of iterating our updating function
t times. And we can simplify further, using our geometric series formula (Theorem 2.1.5), which
says that if r ≠ 1 then
$r^{t-1} + r^{t-2} + \cdots + r^2 + r + 1 = \frac{1 - r^t}{1 - r}.$ (3.3)
Let’s first consider when r = 1, though. Then our DTDS is of the form $x_{t+1} = x_t + c$. The sum on
the right of (3.2) is equal to t, and since $r^t = 1$, our general solution is just
$x_t = x_0 + ct$ (3.4)
Now let’s consider every other case. If r ≠ 1, then using (3.3) the general formula for the solution
becomes
$x_t = r^t x_0 + c\,\frac{1 - r^t}{1 - r}.$ (3.5)
Actually, we can simplify this a bit further, by noticing that the formula we’d discovered earlier for
the fixed point of this linear DTDS, $x^* = \frac{c}{1-r}$, shows up. After that, let’s factor out the $r^t$:
$x_t = r^t x_0 + (1 - r^t)x^* = r^t x_0 - r^t x^* + x^* = r^t(x_0 - x^*) + x^*.$
Theorem 3.3.9. Let $x_{t+1} = r x_t + c$ be a linear DTDS, with initial condition $x_0$. Then the general
solution formula is
$x_t = x_0 + ct$, if r = 1, and
$x_t = r^t(x_0 - x^*) + x^*$ if r ≠ 1, where $x^* = \frac{c}{1-r}$ is the fixed point.
Example 3.3.10. Suppose $x_{t+1} = 0.75x_t + 10$, as in Example 3.3.4. Then r = 0.75, c = 10 and
the fixed point is $x^* = 10/(1 - 0.75) = 40$ (as we calculated in Example 3.3.8). Therefore since r ≠ 1 the general
solution formula is
$x_t = (0.75)^t(x_0 - 40) + 40.$
If our initial concentration is $x_0 = 0$, then this simplifies to
$x_t = (0.75)^t(-40) + 40.$
Plugging in t = 0, 1, 2, 3, . . . gives
0, 10, 17.5, 23.125, · · ·
which is exactly what we had computed by iterating f earlier in Example 3.3.4. (But now: I can
calculate $x_{10} \approx 37.75$ in a second flat.... with a calculator.)
Notice how the term (x0 − x∗ ) being negative implies that our solution is always less than x∗ —
but it doesn’t force the solution to be negative (which wouldn’t make sense).
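The agreement between iterating and using the formula of Theorem 3.3.9 can be checked directly. This is a minimal sketch in Python; the function names `solve_linear_dtds` and `iterate_linear` are hypothetical, chosen only for this illustration.

```python
def solve_linear_dtds(r, c, x0, t):
    """Closed-form x_t for x_{t+1} = r*x_t + c, following Theorem 3.3.9."""
    if r == 1:
        return x0 + c * t
    x_star = c / (1 - r)  # the fixed point
    return r**t * (x0 - x_star) + x_star

def iterate_linear(r, c, x0, t):
    """x_t obtained by applying the updating function t times."""
    x = x0
    for _ in range(t):
        x = r * x + c
    return x

# Drug model of Example 3.3.10: r = 0.75, c = 10, x0 = 0.
for t in range(4):
    print(t, iterate_linear(0.75, 10, 0, t), solve_linear_dtds(0.75, 10, 0, t))

print(round(solve_linear_dtds(0.75, 10, 0, 10), 2))  # 37.75
```

The two columns agree up to floating-point round-off; the closed form costs one evaluation no matter how large t is, whereas iteration costs t steps.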
Example 3.3.11. Suppose $x_{t+1} = \frac{1}{2}x_t + 2$. Then $r = \frac{1}{2} \neq 1$, and c = 2, so $x^* = \frac{2}{1 - \frac{1}{2}} = 4$.
Therefore the general solution to the DTDS is
$x_t = \left(\frac{1}{2}\right)^t (x_0 - x^*) + x^* = \left(\frac{1}{2}\right)^t (x_0 - 4) + 4.$
For example, if the initial condition is $x_0 = 10$ then the first few terms of the solution (using the
original DTDS) are
{10, 7, 5.5, 4.75, ...}
which (check!) we can also get by plugging t = 0, 1, 2, 3 into the formula
$x_t = \frac{1}{2^t}(10 - 4) + 4 = \frac{6}{2^t} + 4.$
In this example, we could graph the solution function $s(t) = 6(2^{-t}) + 4$ on a graph of x versus t:
sketch the decreasing exponential function $y = 2^{-t}$, then scale it by a factor of 6 in the vertical
direction, and then shift it up by 4 units. Then the solution to the DTDS will be the points on
this graph with integer t-coordinates (check). This lets you quickly see what happens as t → ∞!
Example 3.3.12. (Bacterial model) The DTDS was $x_{t+1} = 2x_t = 2x_t + 0$ so it is a linear DTDS
with r = 2 and c = 0. This gives $x^* = 0/(1 - 2) = 0/(-1) = 0$, and general solution formula
$x_t = 2^t(x_0 - 0) + 0 = 2^t x_0.$
We are greatly relieved to see this is the general solution, as it is exactly the exponential growth
model we expect for bacterial growth.
MAT 1330 : Fall 2020 3.4. BEHAVIOUR OF GENERAL DTDS: COBWEBBING
Example 3.3.13. (Fixed point) If $x_0 = x^*$, a fixed point, then the general solution formula tells
us that
$x_t = r^t(x_0 - x^*) + x^* = r^t(x^* - x^*) + x^* = x^*$
for all t. Again, we are happy to see we get the answer we expected.
A DTDS consists of: a quantity $x_t$ that varies with time t ∈ {0, 1, 2, 3, ...}; the units of this
time variable; and an updating function f. It is written as $x_{t+1} = f(x_t)$.
There is a general solution formula for any linear DTDS, given by Theorem 3.3.9: when
you plug in values for r, c and $x_0$, it gives you a formula for $x_t$ as a function of t.
Notice that we derived our wonderful formula by iterating until we saw a pattern, and then using
the geometric series identity. But if our updating function was not linear, then we wouldn’t get
this pattern — in fact, it might be incredibly difficult to find any pattern at all! So what can we
do?
End of lecture # 3
So we defined a DTDS as a system of the form xt+1 = f (xt ), where xt is the value of the object of
interest at time t, and a time step for t (and units for xt ) have been given.
When the updating function f is linear, that is, f (x) = ax + b for some constants a and b, then we
found an explicit solution, that is, a formula that immediately tells us the value of xt for every t,
without having to iterate through all the preceding values. (See Example 3.3.11 for instance.)
Our goal now: Visualize the behaviour of solutions, even if f is not linear.
Example 3.4.1. Suppose our DTDS is $x_{t+1} = \frac{1}{2}x_t + 1$, and the initial condition is $x_0 = 1$.
To make this concrete, we could say that xt is the concentration of a drug in a patient’s bloodstream
at time t, given that the patient receives a daily dose that counteracts the natural elimination of
the drug. So t is measured in units of days and xt is the concentration right after the daily dose.
We calculate:
x1 = .5(1) + 1 = 1.5
x2 = .5(1.5) + 1 = 1.75
x3 = .5(1.75) + 1 = 1.875
···
This plot shows us that the concentration of the drug is increasing over time, but that this concen-
tration seems to be levelling off to about x∗ = 2.
(Check for yourself that this fits perfectly with the general solution formula we derived in Theorem 3.3.9:
since c = 1 and $r = \frac{1}{2}$, $x^* = \frac{1}{1 - \frac{1}{2}} = 2$ and so $x_t = (0.5)^t(x_0 - 2) + 2$.)
We calculate:
$x_1 = \frac{8}{5} = 1.6$
$x_2 = \frac{3.2}{2.6} \approx 1.23$
$x_3 \approx 1.10$
$x_4 \approx 1.05$
This method is tedious and prone to round-off errors. Good for computers, but not for us.
Method 2: Cobwebbing
Cobwebbing is a technique for calculating iterations of the DTDS graphically, instead of numerically.
It is fast and gives you an overall sense of what is happening to the solution over time.
Note. Here is the algorithm (or recipe) for cobwebbing; it’s done on a graph of xt+1 (on the y-axis)
versus xt (on the x-axis):
1) Graph the updating function y = f (x) and also the diagonal line y = x.
2) Start with x0 on the x-axis and go vertically to the graph of f : this intersection is the point
(x0 , x1 ) where x1 = f (x0 ).
If required, we can then draw the solution over time on a separate graph, as the points (t, xt ) on a
graph of t versus x.
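The cobweb path can also be generated by a short program. The sketch below assumes the usual continuation of the recipe after the first vertical move: go horizontally to the diagonal y = x, then vertically back to the graph of f, and repeat. The function name `cobweb_points` is hypothetical, and the code only lists the corner coordinates rather than drawing them.

```python
def cobweb_points(f, x0, steps):
    """Corners of the cobweb path for x_{t+1} = f(x_t), starting on the x-axis."""
    points = [(x0, 0.0)]
    x = x0
    for _ in range(steps):
        y = f(x)
        points.append((x, y))  # vertical move to the graph of f
        points.append((y, y))  # horizontal move to the diagonal y = x
        x = y
    return points

def f(x):
    # Updating function of Example 3.4.1.
    return 0.5 * x + 1

path = cobweb_points(f, 1.0, 3)
# Every second point lies on the graph of f: (x0, x1), (x1, x2), (x2, x3).
print(path[1::2])  # [(1.0, 1.5), (1.5, 1.75), (1.75, 1.875)]
```

Feeding `path` to any plotting tool as a connected polyline, together with the graphs of y = f(x) and y = x, reproduces the staircase picture.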
Example 3.4.3. Let’s apply this algorithm to our familiar example of a linear DTDS.
An example of a cobweb for the DTDS $x_{t+1} = \frac{1}{2}x_t + 1$. The graph on the left shows the cobweb
diagram, starting from some point $x_0$; the corners along the graph of the updating function f give
the values $x_1, x_2, \ldots$. To graph the general solution, we plot $(0, x_0), (1, x_1), (2, x_2), \ldots$ in a graph
of t versus $x_t$, at right.
The solution (at right) is the same as the graph plotted in Example 3.4.1.
Example 3.4.4. Now let’s do a nonlinear DTDS using cobwebbing. Suppose the updating function
is
$f(x) = \frac{2x}{1 + x}$
and $x_0 = 0.3$. We can graph y = f(x) using the tools of Calculus. We will remind ourselves of
the techniques for doing so later; for now, note that its graph starts at (0, 0) and has a horizontal
asymptote at y = 2.
An example of a cobweb for the DTDS $x_{t+1} = \frac{2x_t}{1 + x_t}$. The graph on the left shows the cobweb
diagram, starting from some point $x_0$; the corners along the graph of the updating function f give
the values $x_1, x_2, \ldots$. To graph the general solution, we plot $(0, x_0), (1, x_1), (2, x_2), \ldots$ in a graph
of t versus $x_t$, at right.
Note that the solution (at right) is also what you get by plugging x0 = 0.3 into f and iterating.
MAT 1330 : Fall 2020 3.5. STEADY STATES
An example of a cobweb for the DTDS xt+1 = xt + 1. The graph on the left shows the cobweb
diagram, starting from some point x0 ; the corners along the graph of the updating function f give
the values x1 , x2 , . . .. To graph the general solution, we plot (0, x0 ), (1, x1 ), (2, x2 ), . . . in a graph
of t versus xt , at right.
This geometric approach gives us a lot more qualitative information about the DTDS; plugging
values into the updating function repeatedly gave us more quantitative information.⁵ Both are
doing the same thing: iterating f.
In the next sections, we infer properties of the DTDS and its solutions from these cobweb diagrams.
Recall that a fixed point or steady state or equilibrium⁶ of the DTDS $x_{t+1} = f(x_t)$ is a number $x^*$
such that
$f(x^*) = x^*.$
Thinking now in terms of graphs: a fixed point is where the graph of y = f (x) intersects the graph
of y = x.
Suppose that our initial value x0 happens to be where y = f (x) intersects the diagonal line y = x.
We do the cobweb algorithm... but nothing happens! We stay put. This is what we noticed
quantitatively last time: if x0 = x∗ is a steady state, then the updating function f does nothing to
it, and our solution is just the constant {x∗ , x∗ , x∗ , · · · }.
5
Motivation #1 for Calculus: How can we sketch updating functions and thereby do cobwebbing, rather than
using a calculator and being surprised by how the numbers turn out? Answer: Calculus tells you the shapes of
curves.
6
Plural of equilibrium: equilibria.
These points are very important, both mathematically and for the application we are modeling,
which is why they got a special name (actually, many special names for the same thing).
Example 3.5.1. Last time, we saw that for a linear DTDS $x_{t+1} = r x_t + c$ there is exactly one
equilibrium if r ≠ 1, namely $x^* = \frac{c}{1-r}$.
Solution: The updating function is $f(x) = \frac{1}{2}x + 1$, which is a linear function. Therefore we can
apply the formula: $x^* = \frac{c}{1-r} = \frac{1}{1 - \frac{1}{2}} = 2$.
Suppose we’ve forgotten the formula. Here’s what we do. A steady state is a solution to x = f(x),
in other words, to
$x = \frac{1}{2}x + 1.$
Subtract $\frac{1}{2}x$ from both sides to get
$\frac{1}{2}x = 1 \iff x = 2.$
So there is only one steady state, and it is x∗ = 2. (Note that we use a special notation to help us
emphasize the significance of 2 — it’s not just any old value, it’s a fi∗ed point.)
We check our answer: $f(2) = \frac{1}{2}(2) + 1 = 1 + 1 = 2$; yes, it’s a fixed point of the DTDS.
Note: common mistake #5 in these problems is to cancel the x while solving, and end up (in this
case) with only one answer, x = 1. What happened? Division by zero! When you divide by x, first
ask yourself: what happens if x = 0? Once you know if 0 is a solution or not, you can say “ok, now
let’s consider x ≠ 0” and continue on to divide by x to find the remaining solutions.
Solution: The updating function is f (x) = x + 1. This is linear, but we cannot apply our formula
because r = 1 (go see the problem!). A steady state is a solution to the equation x = f (x). Here,
this means solving
x=x+1
but this equation has no solution! (Quick check: indeed, the lines y = f (x) and y = x are parallel
in this case.) Therefore this DTDS does not have any equilibria.
Exercise 3.5.5. Compare the answers found here with the cobweb graphs of the previous section,
to agree that we have found all the equilibria.
Remark 3.5.6. When solving for steady states, it always helps to sketch the graph, so that you
can see how many solutions to expect.
Our second observation from our cobweb examples is that sometimes our solutions approach an
equilibrium, and sometimes they move away from an equilibrium.
To explore this phenomenon more fully, let’s consider a richer example: a population exhibiting
the Allee effect.⁷
A plot of $y = \frac{4x^2}{1 + x^2}$, on a graph of $x_{t+1}$ versus $x_t$.
The shape of this curve is the interesting part. The way to read it: the slope of the curve at any
point tells you about the reproductive rate as a function of the current population. When it is
shallow, the population does not grow much from $x_t$ to $x_{t+1}$, so has a low reproductive rate; when
it is steep, the population grows quickly from $x_t$ to $x_{t+1}$, corresponding to a high reproductive rate.
So this updating function models the following observations about our population
(written in words below, and corresponding to the shape of the graph above):
If the population is low, then the scarcity of mates and general low fitness of the population
implies a low reproductive rate.
If the population is too high, then the limited resources reduce the reproductive rate.
Let us apply the cobweb algorithm to some different initial values x0 and see what happens.
Cobweb diagram applied to the updating function $f(x) = \frac{4x^2}{1 + x^2}$. An initial condition $x_0$ between
0 and the second fixed point gives a cobweb that tends down to 0 (in blue); an initial condition of
$x_0 = x^*$ (in green) does not change over time; an initial condition $x_0$ between the second and
third fixed points gives a cobweb that tends to the third fixed point (in red).
We see that the middle equilibrium is not approached by any cobweb diagram, but the two other
fixed points are approached by cobweb diagrams from various initial states.
Definition 3.5.8. A fixed point $x^*$ is called stable (also called “asymptotically stable”) if all nearby initial conditions give solutions
that approach $x^*$. A fixed point $x^*$ is called unstable if there is at least one initial condition near
$x^*$ such that the solution does not approach $x^*$.
Example 3.5.9. In the preceding example, the middle fixed point is unstable.
Example 3.5.10. In the linear DTDS $x_{t+1} = \frac{1}{2}x_t + 1$, the fixed point $x^* = 2$ was stable.
Exercise 3.5.11. Calculate the fixed points of $x_{t+1} = \frac{4x_t^2}{1 + x_t^2}$ explicitly. Prove that the smallest
and the largest of these fixed points are stable, by drawing cobwebs starting on either side of each
fixed point.
MAT 1330 : Fall 2020 3.6. STABILITY IN LINEAR MODELS: A THEOREM
Exercise 3.5.13. Do a cobweb for the linear DTDS $x_{t+1} = -\frac{1}{2}x_t + 2$. Identify any fixed points
and classify by stability.
Note. Explore cobwebbing and graphing using the Excel file provided: linearDTDS.xls.
End of lecture # 4
Recall:
Given a DTDS xt+1 = f (xt ), a fixed point is a number x∗ such that f (x∗ ) = x∗ . There may
be none, one, or many fixed points for a given DTDS.
There are two kinds of fixed points (=equilibria, steady states): stable and unstable.
Note. The stability of its steady states tells us about the long-term behaviour of a DTDS.
So let’s start by figuring out the stability of fixed points in linear models.⁸ All the linear models
with fixed points we have looked at in the last two classes have been stable, but this is a little
misleading, as we’ll see.
$x_{t+1} = r x_t$, $x^* = 0$.
For different values of r, to decide stability we must draw at least two cobwebs: one for an initial
condition x0 > x∗ and one for an initial condition x0 < x∗ . The fixed point is stable only if all
nearby initial states give solutions that approach it.
8
Why linear models? First: because linear updating functions give DTDS for which we have a general solution
formula and so can answer all questions completely. Second: we will see that this answer gives us the hint we need
on how to predict stability in general.
Cobwebs applied to the updating function y = rx, with two initial conditions each: x0 < 0 and
x0 > 0. At left, the slope of the updating function is 0 < r < 1; at right, the slope of the updating
function is r > 1. We see that in the former x∗ is a stable fixed point; in the latter, x∗ is an
unstable fixed point.
What if r < 0?
Cobwebs applied to the updating function y = rx, with two initial conditions each: x0 < 0 and
x0 > 0. At left, the slope of the updating function is −1 < r < 0; at right, the slope of the
updating function is r < −1. We see that in the former x∗ is a stable fixed point; in the latter, x∗
is an unstable fixed point.
Exercise 3.6.1. Show that if r = 0 then the fixed point is stable. Decide if this makes sense, given
that the DTDS is xt+1 = 0 · xt = 0?
Exercise 3.6.2. Examine the stability of the fixed points when r = 1 or r = −1. Argue (making
reference to the DTDS, and to the kinds of cobwebs you obtain) that these two borderline cases are
an incredible balancing act. We won’t consider them here — they are just too extreme to occur in
nature.
Well, that pattern seems clear enough: the fixed point is stable if −1 < r < 1 and is unstable if
|r| > 1. In fact, it is true even with a general linear updating function; let’s see why.
Theorem 3.6.3. Let $x_{t+1} = r x_t + c$ be a linear DTDS with r ≠ ±1. Then the fixed point
$x^* = \frac{c}{1-r}$
is stable if |r| < 1 and is unstable if |r| > 1.
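Proof. By Theorem 3.3.9, the solution is
$x_t = r^t(x_0 - x^*) + x^*.$
If |r| < 1, then $r^t \to 0$ as $t \to \infty$,⁹ so $x_t \to x^*$ for every initial condition $x_0$, and the fixed point is stable.
If |r| > 1, then the powers of r will grow (in absolute value!), and in fact $|r^t| \to \infty$ as $t \to \infty$. In
particular, $r^t \not\to 0$, so $x_t \not\to x^*$ whenever $x_0 \neq x^*$, so the fixed point is unstable.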
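The dichotomy between |r| < 1 and |r| > 1 is easy to observe numerically. Below is a minimal sketch in Python; the function name `distance_from_fixed_point` and the sample values c = 1, x0 = 3 are hypothetical choices for illustration only.

```python
def distance_from_fixed_point(r, c, x0, t):
    """|x_t - x*| for x_{t+1} = r*x_t + c, found by direct iteration (needs r != 1)."""
    x_star = c / (1 - r)
    x = x0
    for _ in range(t):
        x = r * x + c
    return abs(x - x_star)

# Two slopes with |r| < 1 and two with |r| > 1.
for r in (0.5, -0.5, 2.0, -2.0):
    gaps = [distance_from_fixed_point(r, 1.0, 3.0, t) for t in (0, 5, 10)]
    print(r, gaps)
```

In this run the gap shrinks toward 0 for r = ±0.5 and blows up for r = ±2, in line with the factor |r|^t in the solution formula. A few sample values are evidence, not a proof.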
Example 3.6.4. Consider the medication model $x_{t+1} = \frac{1}{2}x_t + 1$, with constant dose 1. The fixed
point is
$x^* = \frac{c}{1-r} = \frac{1}{1 - \frac{1}{2}} = 2.$
Since $|r| = \frac{1}{2} < 1$, this fixed point is stable. Therefore, if we continue this regular daily dose, the
concentration in the bloodstream will eventually stabilize to $x^* = 2$.¹⁰
Now let’s do an important example that explains why we’re working so hard to understand the
relationship between the DTDS and its solution.
Example 3.6.5. Suppose we are in the setting of the previous example and we want to change the
daily dose so that the steady state is $x^* = 3$. Your first guess might be: add 1 to the daily dose.
Let’s try that:
$x_{t+1} = \frac{1}{2}x_t + 2 \;\Rightarrow\; x^* = \frac{2}{1 - \frac{1}{2}} = 4.$
Oops! No, that was not the correct approach, because we forgot that some of the extra daily dose
is being kept in the system from one day to the next (and thus we overdosed our patient).
9
Motivation #2 for Calculus: what does it mean that $r^t \to 0$ as $t \to \infty$? How could we decide this if the expression
were more complicated? Answer: limits.
10
Mathematically speaking, xt will get closer and closer to 2 without ever being equal to 2 — but in real life, we
can only measure the concentration to a given precision, so what you’ll measure, eventually, is a concentration of 2.
MAT 1330 : Fall 2020 3.7. STABILITY IN NONLINEAR MODELS: EXAMPLES
Solution: the actual problem we want to solve is the following. We want to choose the daily dose
c so that the linear DTDS $x_{t+1} = \frac{1}{2}x_t + c$ has a steady state of $x^* = 3$. So we solve
$\frac{c}{1 - \frac{1}{2}} = 3 \iff c = \frac{3}{2}.$
So our answer is: the daily dose should be increased to 1.5 from 1 to achieve a steady state of
x∗ = 3.
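The same calculation works for any retention fraction r ≠ 1: solving c/(1 − r) = x* for c gives c = x*(1 − r). Here is a minimal sketch in Python; the function name `dose_for_target` is hypothetical.

```python
def dose_for_target(r, target):
    """Daily dose c for which x_{t+1} = r*x_t + c has fixed point `target`."""
    if r == 1:
        raise ValueError("no unique fixed point when r = 1")
    # Rearranged from target = c / (1 - r).
    return target * (1 - r)

print(dose_for_target(0.5, 3))  # 1.5
```

Note that the dose scales with the target: going from x* = 2 to x* = 3 multiplies the dose by 3/2 rather than adding 1 to it.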
Exercise 3.6.6. Suppose the DTDS for a different drug is $x_{t+1} = \frac{2}{3}x_t + 4$. Find $x^*$ and explain
why it is stable, using two different arguments. (Hint: cobweb, theorem). Now suppose we instead
want a steady state of x∗ = 10. What should the new daily dosage be?
Nonlinear models are very common in nature. Our eventual goal is a simple criterion that could
tell us (mathematically) whether a given fixed point is stable or not.
In Life Sciences: only stable states are visible in nature! (You would have to conduct a controlled
experiment to observe the initial few solutions, given any biological process; under normal circum-
stances, you’re seeing what has happened in the long run.) So the long term behaviour is about
the steady states.
Example 3.7.1. (Allee effect; see Example 3.5.7)
$x_{t+1} = \frac{0.7x_t^2}{1 + 0.1x_t^2}$
The graph of the updating function $f(x) = 0.7x^2/(1 + 0.1x^2)$ of a DTDS displaying the Allee
effect, in red. The diagonal y = x is in blue; the axes are $x_{t+1}$ versus $x_t$. The fixed points are the
solutions to $f(x^*) = x^*$, which are 0, 2 and 5; these are the intersections of the two graphs.
When we cobweb on the updating function $f(x) = \frac{0.7x^2}{1 + 0.1x^2}$, we see that the fixed points 0 and 5
are stable, whereas the middle fixed point, at $x^* = 2$, is unstable.
See also the Excel file AlleeDTDS.xls, to vary the parameters and see the effect on the graph of
the updating function and on the long-term behaviour.
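For readers without the spreadsheet, the long-term behaviour can be explored with a few lines of Python. This is a minimal sketch, not a replacement for the Excel file; the helper name `long_run`, the number of steps, and the starting values 1.9, 2.1 and 8.0 are hypothetical choices.

```python
def f(x):
    # Allee-effect updating function of Example 3.7.1.
    return 0.7 * x**2 / (1 + 0.1 * x**2)

def long_run(x0, steps=200):
    """Value of x_t after `steps` iterations of x_{t+1} = f(x_t)."""
    x = x0
    for _ in range(steps):
        x = f(x)
    return x

# Check that 0, 2 and 5 satisfy f(x) = x (up to round-off).
for x_star in (0, 2, 5):
    print(x_star, abs(f(x_star) - x_star) < 1e-12)

# Start just below 2, just above 2, and above 5.
for x0 in (1.9, 2.1, 8.0):
    print(x0, round(long_run(x0), 4))
```

Starting just below 2 the iterates fall to 0, while starting just above 2 or above 5 they settle at 5, which matches what the cobweb shows: 0 and 5 attract nearby solutions and 2 repels them.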
When the graph of f crosses the diagonal from below to above, then x∗ is unstable.
When the graph of f crosses the diagonal from above to below (with positive slope), then x∗ is stable.
We’d love to talk about and calculate the slope of f at x∗ , and compare it to the slope of y = x.11
So far, we have used a simple model for drug elimination in the bloodstream. In reality, more
factors can come into play. For example, the rate at which a person’s body absorbs or eliminates
alcohol in the bloodstream depends on the alcohol level: the more alcohol in the body, the smaller
the fraction that can be absorbed and eliminated.
Let t be time in hours, and zt the concentration of alcohol in the blood at time t. (The units in
this model are such that z = 7 corresponds to one drink for an average-sized person.)
First model: pure absorption (no new alcohol added to the body). Instead of a constant rate
of absorption R, we imagine there is a function R(z), depending on the concentration z of alcohol
already in the blood, which tells us the fraction of alcohol absorbed over one hour. Then our DTDS
is
zt+1 = zt − R(zt )zt .
What is R(zt )? We do experiments! For example, using empirical data, we establish that for a
certain population a good fit is
\[ R(z) = \begin{cases} \dfrac{10}{4+z} & \text{when } z \geq 6, \\ 1 & \text{when } z < 6. \end{cases} \]
So in this example, the body can completely eliminate the alcohol in one hour if the concentration
is below 6 at the beginning of the hour, but otherwise, it only eliminates some.
The graph of the updating function $f(z) = z - R(z)z$ is given below.
11
Motivation #3 for Calculus!
We only sketched part of the graph; are we sure there isn’t another fixed point way off the graph?12
So we check, for $x > 6$:
\[ x = f(x) \iff x = \frac{x(x-6)}{x+4}. \]
So if $x > 6$, then $x \neq 0$, so we can divide by $x$ and multiply by $x + 4$ to conclude
\[ x + 4 = x - 6, \]
which has no solution. Therefore there are no fixed points with x > 6 (and only the obvious fixed
point x = 0 in the region x ≤ 6).
Second model: absorption plus drinking. Assume the subject raises their blood alcohol level by
$d$ each hour through drinking. Then we have
\[ z_{t+1} = z_t - R(z_t)z_t + d. \]
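Let’s solve for the steady states, which we call $z^*$; that is, we solve $f(z) = z$:
\[
\begin{aligned}
& z = z - R(z)z + d \\
\iff{} & R(z)z = d \\
\iff{} & \frac{10z}{4+z} = d \\
\iff{} & 10z = 4d + zd \\
\iff{} & (10 - d)z = 4d \\
\iff{} & z^* = \frac{4d}{10 - d}.
\end{aligned}
\]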
(We call the answer z ∗ ). What a strange answer! As with all Life Science applications, it is helpful
to ask ourselves: when is this positive, and when is it negative?
12
And what would it mean?
The graphs of the updating function f (x) = x − R(x)x + d of a DTDS for alcohol elimination
with drinking, in red; on the left, d < 10 and on the right, d > 10. The diagonal y = x is in blue.
By cobwebbing, we see that the fixed point in the case d < 10 is stable. When d > 10, the fixed
point is negative and is not biologically relevant.
We see that the steady state is only biologically relevant when d < 10, because that’s the only
condition under which z ∗ ≥ 0; and this corresponds to a steady level of alcohol in the bloodstream
in the long run. We conclude that the body remains intoxicated, but at some steady level (which
is not d, but rather 4d/(10 − d), which can be higher or lower than d !).
When d > 10, we see that the fixed point is negative; doing cobwebbing, we see that it is also
unstable and that the concentration of alcohol over time climbs without bound (until death).
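To see both regimes side by side, here is a small Python sketch of the second model. The function names, the initial value $z_0 = 0$, and the doses $d = 8$ and $d = 12$ are assumptions made for illustration; $d = 8$ is chosen so that the predicted steady state $4d/(10-d) = 16$ lies in the region $z \geq 6$ where the formula $R(z) = 10/(4+z)$ applies.

```python
def R(z):
    """Fraction of alcohol eliminated in one hour (the empirical fit above)."""
    return 10 / (4 + z) if z >= 6 else 1.0


def simulate(d, z0=0.0, hours=500):
    """Iterate z_{t+1} = z_t - R(z_t) z_t + d and return the final concentration."""
    z = z0
    for _ in range(hours):
        z = z - R(z) * z + d
    return z


d = 8                                   # assumed dose with 6 <= d < 10
print(simulate(d), 4 * d / (10 - d))    # simulated level vs. predicted steady state
print(simulate(12, hours=100))          # d > 10: the level keeps growing
```

For $d = 8$ the simulated concentration settles at the predicted value, while for $d = 12$ each hour adds more than the body can remove, so the value only increases with the number of hours simulated.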
Note. You can experiment with this model (changing the parameter d and the initial value) in the
Excel file provided: AlcoholDTDS.xls.
ct+1 = 0.87ct + d.
Our conclusion: the more coffee one drinks per unit time, the greater the concentration in the body
(as d increases, the steady state increases); but at a constant rate of drinking coffee, the level of
caffeine in the body levels off and stabilizes.
Now let’s consider a famous population model: logistic growth. It captures the phenomenon that
reproductive rate can decline with increasing population.
Example 3.7.4. (The logistic equation)
Let t represent time in years and xt the population at time t, normalized so that 1 represents the
maximum population that the resources can sustain. Then the logistic DTDS is
\[ x_{t+1} = r x_t (1 - x_t) \quad \text{for some } 0 < r < 4, \]
where the per capita growth rate $\frac{x_{t+1}}{x_t}$ is proportional to $1 - x_t$ with factor $r$. This means that the
rate of growth declines with the density of the population, due to intraspecific13 competition for
resources.
Let’s solve for the fixed points, then determine their stability using cobwebbing. We have $f(x) = rx(1 - x)$,
so a fixed point satisfies
\[ x = rx(1 - x) \iff rx^2 + (1 - r)x = 0 \iff x(rx + 1 - r) = 0, \]
so we have two fixed points: $x^* = 0$ and $x^* = \frac{r-1}{r}$.
Now to set up the cobweb, we need to choose r; and it turns out that the behaviour is very different
depending on the value of r!
Cobweb diagrams for the logistic equation with 0 < r < 1, 1 < r < 2, 2 < r < 3 and 3 < r < 4.
What we conclude is:
13
intraspecific: between individuals of a single species
if 0 < r < 1, then 0 is the only nonnegative equilibrium, and it is stable : r is too small and
the population dies out;
if 1 < r < 3, then there is a positive equilibrium, and it is stable, though if r < 2 the
population climbs up to the steady state and if r > 2 it fluctuates around the steady state
until it stabilizes;
if r > 4, then the population fluctuates more and more wildly, in boom and bust cycles, until
it dies out; the equilibrium is unstable.
Compare this to the linear model we analysed in Section 3.6, to notice that the stability is similarly
related to the “slope of the tangent line of f at x∗ ”.
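The dependence on $r$ can be explored without drawing cobwebs. The sketch below is in Python; the function name `logistic_orbit`, the initial value $x_0 = 0.2$, and the four sample values of $r$ (one from each range in the figure) are illustrative assumptions. It iterates the logistic DTDS and prints the last few values of the solution next to the nonzero fixed point $(r-1)/r$.

```python
def logistic_orbit(r, x0=0.2, steps=1000, keep=4):
    """Iterate x_{t+1} = r x_t (1 - x_t) and return the last `keep` values."""
    x = x0
    tail = []
    for t in range(steps):
        x = r * x * (1 - x)
        if t >= steps - keep:
            tail.append(x)
    return tail


for r in (0.5, 1.5, 2.5, 3.5):
    fixed_point = (r - 1) / r
    values = ", ".join(f"{x:.4f}" for x in logistic_orbit(r))
    print(f"r = {r}: (r-1)/r = {fixed_point:.4f}, long-run values = {values}")
```

For $r = 0.5$ the values shrink to 0; for $r = 1.5$ and $r = 2.5$ they settle at $(r-1)/r$; for $r = 3.5$ they keep jumping between several values and never settle at the fixed point.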
The variety of interesting applications of DTDS to the life sciences is huge. Check out more
examples in the textbook. Of particular interest is a sophisticated model of the heartbeat, using a
discontinuous updating function that lets one understand arrhythmia from an electrical viewpoint.
End of lecture # 5
Chapter 4
The most important scientific discovery of the second millennium was the discovery of Calculus. It
changed natural philosophers into scientists: able to quantify not only the observations about the
state of matter, but also about its change.
The breakthrough result was the understanding of the concept of a limit. Isaac Newton formulated
limits as a theory of infinitesimals — theoretical “numbers” so small that when you square them
you get zero — but our modern version expresses itself as:
Where this becomes Calculus is when you use your understanding of the function at hand to do so.
Note. Weird fact: There is no number that is “right after” or “right next to” 0. If you choose a
number that’s close, like 0.000000001, there are always a ton of numbers that are even closer to 0,
like 0.000000000134345243098. Just like there is no largest number, there is no smallest positive
number. You can always zoom in closer with your microscope!
The goal: characterize the behaviour of a function at a point where it might not be defined.
Example 4.1.1. (Motivating example #1) The average rate of change of a function g over an
interval [x, a] in its domain is the rise over the run (as we’ll discuss in greater detail in Chapter 5).
The rise is g(x) − g(a) and the run is x − a; their quotient is the slope of the so-called secant line
(the line joining the point (a, g(a)) to the point (x, g(x))).
\[ f(x) = \frac{g(x) - g(a)}{x - a} \quad \text{if } x \neq a \]
MAT 1330 : Fall 2020 4.1. LIMITS OF FUNCTIONS: THE CONCEPT
which, as x varies, gives the slope of all the possible secant lines through (a, g(a)).
The instantaneous rate of change of g at the point a would be obtained by choosing x to be equal
to a (but, oops, no: f (a) is illegal, being division by 0), or “right next to a” (but, oops, no: there
is no such x).
Example 4.1.2. (Motivating example #2) In our study of DTDS, we wanted to know the long-term
behaviour of the general solution, which is a function of the variable $t$ (like $x_t = 4\left(\frac{1}{3}\right)^t - 6$).
“Long-term behaviour” kind of means “when t is ∞” — but ∞ is not a number, so the general
solution function is not defined there; there is no number we can plug into our formula to give the
answer. And if we just choose a large number t, how do we know if xt is correctly predicting the
value from then on?
In both of these motivating examples, the solution is to take the limit of the function, in the first
case as x goes to a (today’s lecture) and in the second case as t goes to ∞ (next lecture).
A first try. Suppose in the setting of Example 4.1.1 we have $g(x) = x^3$ and $a = 1$. Then we are
trying to understand the function
\[ f(x) = \frac{x^3 - 1}{x - 1}, \quad x \neq 1, \]
as $x$ approaches 1.¹ Using a calculator, we could make the following tables of values (approaching
from the left and from the right):
We infer that the closer x gets to 1, the closer f (x) gets to 3. We formalize this idea in the following
definition.
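Definition 4.1.3. We say that the limit of a function $f$ as $x$ approaches $a$ is equal to a number
$L$, and we write
\[ \lim_{x \to a} f(x) = L, \]
if we can make $f(x)$ as close to $L$ as we wish by choosing $x$ sufficiently close to $a$ (but not equal to $a$).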
This seems to be the case in our example above, that is, we want to say
\[ \lim_{x \to 1} \frac{x^3 - 1}{x - 1} = 3 \]
(which is in fact true); but limits can be tricky and sometimes deceiving, so let’s start slowly.
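A table of values of the kind described above can be produced with a few lines of Python. This is only a sketch: the step sizes $10^{-1}, \dots, 10^{-5}$ and the layout are arbitrary choices, not the tables from the lecture.

```python
def f(x):
    """Secant slope of g(x) = x^3 through (1, 1); not defined at x = 1."""
    return (x**3 - 1) / (x - 1)


print("approaching from the left     approaching from the right")
for k in range(1, 6):
    h = 10.0**(-k)
    left, right = 1 - h, 1 + h
    print(f"f({left:.5f}) = {f(left):.5f}        f({right:.5f}) = {f(right):.5f}")
```

Both columns approach 3, which is consistent with the claimed limit. A table like this is evidence rather than a proof, and calling `f(1)` itself fails with a division by zero.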
1 We note that when we plug in $x = 1$ in the formula for $f(x)$, we get $\frac{0}{0}$, which is pure garbage. There is NO
WAY to pretend that this is a valid number; we call it an indeterminate form.
A key observation: independence of f (a). The first thing to notice is that it doesn’t matter
what f (a) is, or even if f is defined at the point a. The limit is inferring what f would like to be
at a, not necessarily what it is.
Example 4.1.4. Consider three functions, whose graphs are drawn below.
The graphs of three different functions, each having the same limit as x goes to 1, despite being
different at the point x = 1.
\[ f_2(x) = \frac{x^3 - x^2}{x - 1}, \quad x \neq 1. \]
The third represents
\[ f_3(x) = \begin{cases} x^2 & \text{if } x \neq 1; \\ 2 & \text{if } x = 1. \end{cases} \]
For example, f3 could represent the rule for winnings in a game, where the rules have an exception
that when you hit 1 exactly on the nose, your winnings double.
because from looking at the graph, we see that we can pick a y-value c as close to 1 as we want,
and go back and find an x value b such that f (b) = c. This is what the definition of the limit asks
us to verify.
Another observation: disagreement is possible. It can happen that the function does not
have a limit as x approaches a.
f (x) = sin(π/x),
This last example suggests we might sometimes do well to also consider one-sided limits.
2 To relate this to the definition of the limit: the limit can’t be $L = 2$ because there are bad $x$-values like $x = 0.99$
that are super-close to $x = 1$ and yet $f(x) \approx 1$ is very far from 2. Similarly, you can exclude $L = 1$ as the limit, and
in fact you can exclude every value. Hence the limit doesn’t exist.
MAT 1330 : Fall 2020 4.2. EVALUATING LIMITS
Definition 4.1.7. We say that the limit of a function $f$ as $x$ approaches $a$ from above (or from
the right) is equal to the number $L$, and write
\[ \lim_{x \to a^+} f(x) = L \]
if we can make $f(x)$ as close to $L$ as we wish by choosing $x$ sufficiently close to $a$ and larger than
$a$. Similarly, we say that the limit of a function $f$ as $x$ approaches $a$ from below (or from the left)
is equal to the number $L$, and write
\[ \lim_{x \to a^-} f(x) = L \]
if we can make $f(x)$ as close to $L$ as we wish by choosing $x$ sufficiently close to $a$ and smaller than
$a$.
These limits are called the one-sided limits of the function f at a; when needed we call them
the right-hand limit and left-hand limit, respectively.
One-sided limits are also the correct thing to consider when the function is only defined on one side
of the point a.
Example 4.1.8. Consider $f(x) = \sqrt{x}$, whose domain of definition is the interval $[0, \infty)$. In this
case, we cannot ask about $\lim_{x \to 0} f(x)$, or $\lim_{x \to 0^-} f(x)$, because $f$ is not defined at any number
less than 0. But it is reasonable to ask about the right-hand limit:
\[ \lim_{x \to 0^+} \sqrt{x} = 0. \]
Proposition 4.1.9 (Existence test). We say that the limit of $f$ as $x$ approaches $a$ exists if the two
one-sided limits exist and are equal, that is,
\[ \lim_{x \to a^-} f(x) = \lim_{x \to a^+} f(x). \]
So for example, we would say that $\lim_{x \to 0} \sqrt{x}$ does not exist, because the left-hand limit is not
defined.
does exist, because it is defined on both sides of 0, and the limit from either side is 0.
MAT 1330 : Fall 2020 4.3. ALGEBRAIC LIMIT LAWS
(a) by calculator. But this can go wrong (see below) and is generally not accepted in this course.
(b) by reading the graph (see above). Great if you know the graph; but otherwise, not an option.
(c) by limit laws and algebraic manipulation. This is the most elegant method, and the only precise
way, to determine limits — and is the method required in this course.
whereas in fact, as we will show later, the limit is 0.1. The problem here was the round-off error
inherent to calculators.
Example 4.2.2. Consider the function $f(x) = \sin(\pi/x)$ whose graph we drew earlier. If we make
the following table of values
we might erroneously think that the limit was 1. But the problem here was our choice of numbers
$x$ approaching 0; we could have chosen a different sequence of $x$-values to make a table, like
and think the limit was $-\sqrt{3}/2$ (!!).
The problem with using a calculator is that we can’t possibly test every single way that x approaches
0, and by focussing just on some numbers, we might miss the big picture completely.
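The trap is easy to reproduce. In the Python sketch below, the two sequences of $x$-values are hypothetical choices (they are not necessarily the ones used in the tables of the lecture): both shrink to 0, but along the first $\pi/x$ is always $\pi/2$ plus a multiple of $2\pi$, and along the second it is always $4\pi/3$ plus a multiple of $2\pi$.

```python
import math


def f(x):
    return math.sin(math.pi / x)


# Two assumed sequences of x-values, both shrinking to 0.
seq_a = [2 / (4 * k + 1) for k in range(1, 6)]   # pi/x = pi/2 + 2*pi*k
seq_b = [3 / (6 * k + 4) for k in range(1, 6)]   # pi/x = 4*pi/3 + 2*pi*k

for xa, xb in zip(seq_a, seq_b):
    print(f"f({xa:.4f}) = {f(xa):+.4f}    f({xb:.4f}) = {f(xb):+.4f}")
```

The first column is constantly 1 (up to rounding) and the second is constantly $-\sqrt{3}/2 \approx -0.8660$, so each table on its own suggests a different "limit".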
1. $\lim_{x \to a} c = c$. (“If the function doesn’t depend on x, neither does its limit.”)
For the rest of the laws, suppose you already know that $\lim_{x \to a} f(x)$ and $\lim_{x \to a} g(x)$ exist (e.g., from
Laws #1 and #2). Then
Assuming for the moment that all the limits exist, we could use the sum rule (Law 3) to write
We can apply the product rule (Law 4) to each of x3 = x × x × x and −3x = (−3) × x (and apply
the constant rule (Law 1) to the limit of −3) to get
\[ = \left(\lim_{x \to 2} x\right)^3 + (-3)\lim_{x \to 2} x + \lim_{x \to 2} 5. \]
Finally, applying Laws 1 and 2, we see that yes, in fact, all the limits exist, so we deduce that our
assumption was valid and thus conclude
= (2)3 − 3(2) + 5 = 7.
What this example shows is that if your function $f(x)$ is a polynomial function, then
\[ \lim_{x \to a} f(x) = f(a), \]
because you can repeatedly apply the limit laws until you’re just taking the limit of $x$ as $x$ goes to
$a$ (and the constant limit).
Are there other functions that behave this perfectly? Certainly. They are the most magnificent,
wonderful, desirable functions we know.
MAT 1330 : Fall 2020 4.4. CONTINUOUS FUNCTIONS
Definition 4.4.1. A function $f$ is called continuous at a point $a$ (in its domain) if $\lim_{x \to a} f(x)$ exists
and is equal to $f(a)$. If $\lim_{x \to a} f(x)$ does not exist, or is not equal to $f(a)$, then the function is called
discontinuous (at $a$). A function that is continuous at every point in its domain is simply called
continuous.
We know the graphs of many key functions already. Consider the next figure, and note from the
graph that each function has the following feature: for every point $a$ in the domain, you can make
$f(x)$ be as close to $f(a)$ as you like by choosing $x$ sufficiently close to $a$ (that is, $\lim_{x \to a} f(x) = f(a)$).
f (x) = cos(x) at left, g(x) = |x| at right. Although g(x) has a cusp at x = 0, there is no question
that for x very near zero, |x| is very near zero, and vice-versa.
f (x) = 1/x at left, g(x) = tan(x) at right. There are points missing from their domains, but these
functions are continuous at every point in their domains.
Another way to say that a function is continuous: you can draw its graph without lifting your
pencil (accepting that you can reset at a vertical asymptote, for example).
Theorem 4.4.3 (Our favourite functions are continuous). Polynomial, rational, exponential, loga-
rithmic, trigonometric, inverse trigonometric, absolute value, and root functions are continuous at
every point in their domain.
Proof. We know from their graphs that this is true: there are no gaps or jumps (except at those
points we have removed from the domain).
Combining this theorem with our limit laws gives us an incredible array of continuous functions to
work with!
(a) if f and g are continuous, so is their sum, difference, product and quotient;a
(b) if f and g are continuous, so is their composition f ◦ g;
(c) if $f$ is continuous and $g$ is any function such that $\lim_{x \to a} g(x) = b$ exists, then
\[ \lim_{x \to a} f(g(x)) = f\left(\lim_{x \to a} g(x)\right) = f(b). \]
a
Don’t forget that the domain of the quotient f /g excludes any point where g(x) = 0.
Proof. (This proof may seem very pedantic; I include it here to illuminate how the corollary can
be deduced from the limit laws as easily as we deduced the continuity of polynomial functions. For
any particular function at hand, you could plug it into this proof to be convinced that the function
must be continuous — without needing to be able to sketch the graph of the function! Cool, eh?
(But you can skip ahead to the example if this is not your cup of tea.))
(a) Suppose f and g are functions that are continuous at a common point x = a of their domains.
That means we know that
Since $f(x) - g(x) = f(x) + (-1)g(x)$, we can combine limit laws to deduce
\[ \lim_{x \to a} (f(x) - g(x)) = \lim_{x \to a} f(x) + \lim_{x \to a} (-1)g(x) = f(a) + \lim_{x \to a} (-1) \lim_{x \to a} g(x) = f(a) + (-1)g(a) = f(a) - g(a), \]
and therefore the difference function $f - g$ is continuous at $a$. And finally $x = a$ is in the domain
of $f/g$ if and only if $g(a) \neq 0$, in which case by Law 5 we have
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \frac{f(a)}{g(a)}. \]
(b) Now suppose that $a$ is in the domain of $g$, and that $g(a) = b$, and that furthermore $b$ is in the
domain of f . Then f ◦ g is the composition function and (f ◦ g)(a) = f (g(a)) = f (b). Then by
hypothesis we know that
But that’s actually a bit confusing, since it’s not the same x on both sides. So let’s use the variable
name y for f (y) instead corresponding to the fact that it’s the “y-values” of g that we plug into
the function f . Mathematically speaking, there is no difference in meaning if I write instead
But now we can see: if x goes to a, then the continuity of g says that g(x) goes to g(a); but y = g(x)
and b = g(a) so we’re saying y goes to b — and then the continuity of f says that f (y) goes to f (b)
or in other words
\[ \lim_{x \to a} f(g(x)) = f(g(a)); \]
thus $f \circ g$ is continuous at $x = a$.
(c) The new thing here is that it is not important if g is even defined at a, or continuous at a;
we just need the innermost limit to exist. In fact you can just repeat the preceding paragraph
replacing g(a) with b and you’ll conclude that
that is, if you know where the pieces of your function are going in the limit, then you can deduce
where the whole function is going in the limit.
MAT 1330 : Fall 2020 4.5. BACK TO FINDING LIMITS: THE NICE CASE
So, what was the point of continuity, again? Ah, yes, it made finding limits easy: if f is continuous
at a then
\[ \lim_{x \to a} f(x) = f(a), \]
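\[ \lim_{x \to 2} \frac{5x^4 - 3x + 4}{x^2 - 7}. \]
This is a rational function, which is continuous on its domain. We see that 2 is in the domain;
therefore, the limit can be evaluated by direct substitution. That is:
\[ \lim_{x \to 2} \frac{5x^4 - 3x + 4}{x^2 - 7} = \frac{5(2)^4 - 3(2) + 4}{(2)^2 - 7} = \frac{78}{-3} = -26. \]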
Example 4.5.2. Find $\lim_{x \to a} \frac{e^{x-2} - 1}{e^x - 1}$.
Solution: First note that this function is continuous on its domain. The denominator vanishes when $e^x = 1$, that is, when
$x = 0$. So if $a \neq 0$, then $a$ is in the domain, and so by direct substitution we may conclude
\[ (a \neq 0): \quad \lim_{x \to a} \frac{e^{x-2} - 1}{e^x - 1} = \frac{e^{a-2} - 1}{e^a - 1}. \]
However, if a = 0, then this is not in the domain, and we need other methods to find the limit.
Example 4.5.3. Find the value of the parameter $c$, if it exists, such that the following function is
continuous:
\[ f(x) = \begin{cases} x + c & \text{if } x < 0; \\ \cos(x) & \text{if } x \geq 0. \end{cases} \]
Solution: this question doesn’t at first seem to have anything to do with limits — until you look
at the definition of continuity (Definition 4.4.1). What the question is actually saying is: for what
value of $c$ does $\lim_{x \to 0} f(x) = f(0)$?
Well, since the function is defined by different formulas on the left and on the right of 0, we have
to split the problem into left-hand and right-hand limits.
On the right side, we have $x > 0$ so the formula for $f(x)$ is $f(x) = \cos(x)$. Therefore we have
\[ \lim_{x \to 0^+} f(x) = \lim_{x \to 0^+} \cos(x) = \cos(0) = 1 \]
MAT 1330 : Fall 2020 4.6. FINDING LIMITS: METHODS FOR THE TRICKIER CASES
since cos(x) is continuous everywhere and so direct substitution applies. On the other hand, on
the left side we have $x < 0$ and so the formula for $f(x)$ is $f(x) = x + c$. Thus
\[ \lim_{x \to 0^-} f(x) = \lim_{x \to 0^-} (x + c) = c \]
by our limit laws (or continuity). For the limit to exist, these two one-sided limits have to be
equal, so we deduce that we must choose c = 1. Finally, we have to check that the resulting limit
coincides with f (0). Looking at the formula, you see that you use the second one for x = 0, so
f (0) = cos(0) = 1. Excellent! That’s exactly the limit we got.
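As a quick numerical sanity check of this answer (not a substitute for the limit argument), one can evaluate the piecewise function just to the left and just to the right of 0 for a couple of values of $c$. The Python sketch below uses the offset $10^{-8}$ and the comparison value $c = 0$ purely as illustrative choices.

```python
import math


def f(x, c):
    """Piecewise function from Example 4.5.3."""
    return x + c if x < 0 else math.cos(x)


for c in (0.0, 1.0):
    left = f(-1e-8, c)     # a point just to the left of 0
    right = f(1e-8, c)     # a point just to the right of 0
    print(f"c = {c}: left ~ {left:.6f}, right ~ {right:.6f}, f(0) = {f(0, c)}")
```

With $c = 0$ the values on the two sides disagree (about 0 versus 1), while with $c = 1$ both sides and $f(0)$ agree at 1.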
End of lecture # 6
The substitution rule is very helpful in many cases, but the real cases of interest are those for which
the rule cannot be applied (like in motivating example #1) — in particular, when a is not in the
domain. So the strategy in such cases is: use algebraic manipulation to transform your function
into another form where a is in the domain.
Example 4.6.1. (Simplify) Consider $\lim_{x \to 1} \frac{x^3 - 1}{x - 1}$. When we plug in $x = 1$, we get 0/0, which
means we don’t know. In this case, we can use long division to see that
\[ \frac{x^3 - 1}{x - 1} = \frac{(x - 1)(x^2 + x + 1)}{x - 1} = x^2 + x + 1 \quad \text{except at } x = 1. \]
That is, these two functions are identical everywhere except at $x = 1$, where the former function is
not defined but the latter function is. Therefore their limits as $x \to 1$ are the same; and since 1 is
in the domain of the latter function, we can evaluate by the Direct Substitution Rule:
\[ \lim_{x \to 1} \frac{x^3 - 1}{x - 1} = \lim_{x \to 1} (x^2 + x + 1) = 3. \]
When direct substitution yields the completely illegal expression “$\frac{0}{0}$”, we call the limit an indeterminate
form. It is not a number, and could come out to anything. For example, the indeterminate
form in the preceding example came out to be 3!
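Example 4.6.2. (Rationalize) When the problem has a difference of square roots, rationalisation
can transform it into something very different that might be easier to deal with. For example,
plugging in $x = 0$ into $\frac{\sqrt{x^6 + 25} - 5}{x^6}$ gives the indeterminate form 0/0. So we transform this
by multiplying the numerator and denominator by the conjugate $\sqrt{x^6 + 25} + 5$:
\[ \lim_{x \to 0} \frac{\sqrt{x^6 + 25} - 5}{x^6} = \lim_{x \to 0} \frac{(x^6 + 25) - 25}{x^6\left(\sqrt{x^6 + 25} + 5\right)} = \lim_{x \to 0} \frac{1}{\sqrt{x^6 + 25} + 5} = \frac{1}{10}. \]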
The key is: if you plug the value into a quotient and it comes out as “0/0”, then what you
can hope is that there is secretly a common factor between the numerator and denominator that
could be cancelled by algebraic manipulation.
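Algebraic manipulation also protects against calculator round-off. As an illustration, take the expression $(\sqrt{x^6 + 25} - 5)/x^6$ from Example 4.6.2: multiplying by the conjugate gives the equivalent form $1/(\sqrt{x^6 + 25} + 5)$ for $x \neq 0$. The Python sketch below (function names and sample $x$-values are illustrative assumptions) compares the two forms in ordinary floating-point arithmetic.

```python
import math


def naive(x):
    """Direct formula (sqrt(x^6 + 25) - 5) / x^6."""
    return (math.sqrt(x**6 + 25) - 5) / x**6


def rationalized(x):
    """The same function for x != 0, after multiplying by the conjugate."""
    return 1 / (math.sqrt(x**6 + 25) + 5)


for x in (1.0, 0.1, 0.01, 0.001):
    print(f"x = {x}: naive = {naive(x):.6f}, rationalized = {rationalized(x):.6f}")
```

For $x = 0.001$ the quantity $x^6$ is too small to change 25 in floating point, so the direct formula returns 0, whereas the rationalized form stays close to 0.1.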
Example 4.6.3. (Simplify: find a common factor) Plugging in $t = 0$ in $\frac{(2 + t)^2 - 4}{t}$ gives the
nonsense answer 0/0, so to evaluate the following limit we expand and factor:
\[ \lim_{t \to 0} \frac{(2 + t)^2 - 4}{t} = \lim_{t \to 0} \frac{4 + 4t + t^2 - 4}{t} = \lim_{t \to 0} \frac{4t + t^2}{t} = \lim_{t \to 0} (4 + t) = 4. \]
Example 4.6.4. (Simplify: find a common factor) Plugging in $z = 3$ in $\frac{z - 3}{z^2 - 9}$ gives the
indeterminate form 0/0, so we look for a common factor:
\[ \lim_{z \to 3} \frac{z - 3}{z^2 - 9} = \lim_{z \to 3} \frac{z - 3}{(z - 3)(z + 3)} = \lim_{z \to 3} \frac{1}{z + 3} = \frac{1}{6}. \]
Sometimes, your function is just a mess, and simplifying is about cleaning it up.
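Example 4.6.5. (Simplify) We can’t even plug in $x = 0$ in the expression $\frac{\frac{3}{x} - x^2 + 4}{5 + \frac{1}{x}}$, so we just
clean it up by multiplying the numerator and denominator by $x$:
\[ \lim_{x \to 0} \frac{\frac{3}{x} - x^2 + 4}{5 + \frac{1}{x}} = \lim_{x \to 0} \frac{3 - x^3 + 4x}{5x + 1} = \frac{3}{1} = 3, \]
where in the second-last equality we evaluated the limit using direct substitution.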
You can also use the limit laws to evaluate one-sided limits.
3
Notice how we use the language of limits here: as long as the expression has an “x” in it, we wrote “limx→0 ” in
front of it. We removed the limit symbol exactly when we replaced the x with 0 for our direct substitution.
In this case, we do not have a single formula that is valid on both sides of 0; therefore we have no
choice but to consider the one-sided limits. Namely,
\[ \lim_{x \to 0^+} f(x) = \lim_{x \to 0^+} (x^2 - 4x) \quad \text{because that’s the formula when } x > 0 \]
\[ = 0 \quad \text{by the Direct Substitution Rule,} \]
whereas
In this case, the left-hand and right-hand limits are different, so $\lim_{x \to 0} f(x)$ does not exist.
Remember that the absolute value function is one of these piecewise-defined functions in disguise!
whereas
\[ \lim_{x \to 0^-} \frac{x}{|x|} = \lim_{x \to 0^-} \frac{x}{-x} = -1 \quad \text{since } x < 0 \text{ means } |x| = -x. \]
Since the two limits are different, $\lim_{x \to 0} \frac{x}{|x|}$ does not exist.
That’s kind of obvious from the graph! Because in fact, sgn(x) is the function that returns 1 if
x > 0 and −1 if x < 0.
Notice that when we compute $\lim_{x \to a} f(x)$, we only care about values of $x$ near $a$. That is a handy
observation when the function has many strange features.
MAT 1330 : Fall 2020 4.7. DISCONTINUOUS FUNCTIONS
On the right side, it is not true that f (x) = sin(πx) for all x > −2, but this equality does hold for
all x > −2 and close to −2 (namely, x < 2). So we may still write
by direct substitution. Since the limit exists and equals $f(-2) = 0$, $f$ is continuous at $x = -2$.
Exercise 4.6.9. Is the function f from the above example continuous at x = 2? Justify your
answer.
Just because a function is piecewise-defined, doesn’t mean you have to use one-sided limits.
Example 4.6.10. Find $\lim_{x \to 4} \operatorname{sgn}(x)$.
In this case, we notice that near 4 (namely, for all $x > 0$), the function is given by $\operatorname{sgn}(x) = 1$.
Therefore $\lim_{x \to 4} \operatorname{sgn}(x) = \lim_{x \to 4} 1 = 1$.
Continuous functions are the best, but many very interesting functions are discontinuous.
Example 4.7.1. Examples of discontinuous functions.
MAT 1330 : Fall 2020 4.8. LIMITS INVOLVING INFINITY
In our model of drug absorption in the body, with a daily dose, we agreed that “connecting the
dots” of the general solution was not a good model of what actually happened over the course
of the day. Instead, we could model the concentration of drug in the body as a discontinuous
function: decreasing linearly but with a jump discontinuity with each daily dose that makes
the concentration suddenly jump much higher. (See graph below.)
A graph representing the change in level of a drug in the body over time, according to the DTDS
$x_{t+1} = \frac{1}{2}x_t + 1$ from Example 3.4.1 with a discrete daily dose. This function is naturally
discontinuous, since it models a discrete (not continuous) phenomenon.
Note. The most common occurrence of a discontinuity is at the junctions of a piecewise defined
function. The points of discontinuity, and the points excluded from the domain of f , are often
where some of the most critical features of a function are to be found.
There are several ways that a limit question might involve ∞. That said, it is really important to
remember:
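Note. Infinity is NOT a number. Infinity is a concept. Arithmetic with ∞ does not follow all the
rules of arithmetic. So we can say
\[ \infty + \infty = \infty, \quad \infty \times \infty = \infty, \]
\[ n \times \infty = \infty \text{ if } n > 0, \quad n \times \infty = -\infty \text{ if } n < 0, \]
and similarly
\[ (\infty)(-\infty) = -\infty, \quad \frac{1}{\pm\infty} = 0. \]
But the following expressions make NO SENSE; we call them indeterminate forms (as 0/0 was):
\[ \infty - \infty, \quad \frac{\infty}{\infty}. \]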
For example, see the illogical mess we get if we “subtract ∞ from both sides of ∞ + ∞ = ∞.”
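Floating-point arithmetic on a computer happens to follow the same conventions, which gives a way to experiment. The Python sketch below uses `math.inf`; note that this is the behaviour of floating-point numbers, offered as an analogy, and not a definition of arithmetic with ∞ in this course.

```python
import math

inf = math.inf

print(inf + inf)      # inf
print(inf * inf)      # inf
print(-3 * inf)       # -inf
print(1 / inf)        # 0.0
print(inf - inf)      # nan ("not a number"): no meaningful value
print(inf / inf)      # nan: no meaningful value
```

The two indeterminate forms are exactly the ones for which the computer refuses to commit to an answer.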
At issue is that functions can approach ∞ at different rates, and that makes all the difference.
Definition 4.8.1. We say that the limit of a function $f$ as $x$ approaches $a$ is infinity (or: diverges
to infinity), and we write
\[ \lim_{x \to a} f(x) = \infty \]
if we can make $f(x)$ as large as we wish by choosing $x$ sufficiently close to $a$. Similarly, we say the
limit of $f$ as $x$ approaches $a$ is negative infinity (or: diverges to negative infinity), and we write
\[ \lim_{x \to a} f(x) = -\infty \]
if we can make $f(x)$ as large a negative value as we wish by choosing $x$ sufficiently close to $a$.
We also apply this definition to one-sided limits. Geometrically, a one-sided limit which diverges
to ∞ or −∞ corresponds to a vertical asymptote on the graph.
Example 4.8.2. $f(x) = \frac{1}{x}$; what is $\lim_{x \to 0} f(x)$?
First consider $\lim_{x \to 0^+} f(x)$. If $x > 0$ but is getting smaller, then $\frac{1}{x}$ gets bigger. In fact, if we want
$f(x) > 10^n$, then we should take $0 < x < 10^{-n}$. (For example, to get $f(x) > 1000$, choose $0 < x < 0.001$.)
Thus we can make $f(x)$ arbitrarily large by taking $x$ close enough to 0, so we conclude
\[ \lim_{x \to 0^+} \frac{1}{x} = \infty. \]
Next consider $\lim_{x \to 0^-} f(x)$. If $x < 0$ and is close to zero, then $\frac{1}{x}$ will be very “large negative”. For
example, to get $f(x) < -10^n$, we should choose $-10^{-n} < x < 0$. So we conclude that
\[ \lim_{x \to 0^-} \frac{1}{x} = -\infty. \]
These results are confirmed when we look at the graph of y = 1/x. (See Section 4.4 if you’ve
forgotten the graph, but then memorize it for future reference.) There is a vertical asymptote at
x = 0, and the graph goes down to −∞ to the left of it, but up to +∞ to the right of it.
In this case, the two one-sided limits are different; this often happens. We say the limit does not
exist, but then we say: it diverges to +∞ on the right and −∞ on the left.
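The quantitative statement in this example (to get $f(x) > 10^n$, take $0 < x < 10^{-n}$, and symmetrically on the left) can be spot-checked. The Python sketch below picks, as an arbitrary illustration, the point $x = 0.5 \cdot 10^{-n}$ inside each interval.

```python
def f(x):
    return 1 / x


for n in range(1, 7):
    x = 0.5 * 10.0**(-n)      # a point with 0 < x < 10^(-n)
    print(f"n = {n}: f({x:g}) = {f(x):g} > 10^{n};  f({-x:g}) = {f(-x):g} < -10^{n}")
```

However large a bound $10^n$ we ask for, a small enough positive $x$ beats it, and the mirror-image negative $x$ goes below $-10^n$.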
Exercise 4.8.3. Find $\lim_{x \to 5^+} \frac{3}{x - 5}$ and also $\lim_{x \to 5^+} \frac{3}{5 - x}$. Describe your reasoning.
Note. We abbreviate what we have understood in this example with the following mnemonic:
\[ \frac{1}{0^+} = \infty, \quad \frac{1}{0^-} = -\infty, \]
which means: if $f(t)$ goes to 0 on the positive side then $\frac{1}{f(t)}$ goes to ∞; and if $f(t)$ goes to 0 on the
negative side then $\frac{1}{f(t)}$ goes to −∞.
In general, if substitution of $x = a$ gives $c/0$ (for some number $c \neq 0$), then you can reason that
the function is going to grow very large as x → a, and can reason whether it is going to ∞, −∞,
or oscillating in between (in which case we just say it diverges).
On the other hand, since cos(x) > 0 if x < π/2 and x is close to π/2 (and sin(x) is still near 1 > 0),
we conclude
\[ \lim_{x \to \frac{\pi}{2}^-} \tan(x) = \infty. \]
Again, these results are exactly consistent with what we see when we sketch the graph of y = tan(x)
(see Section 4.4 if needed).
both from reasoning about what the logarithm function does on very small values of x, and from
describing the graph of y = ln(x).
Note that in this case we cannot take the left-hand limit because ln(x) is not defined for x ≤ 0.
The kind of limit we encountered in DTDS were those where t → ∞, where in that case the function
was xt (depending on the variable t). We define these limits as follows.
Definition 4.8.6. We say that the limit of a function $f$ as $x$ goes to ∞ is equal to the number $L$,
and write
\[ \lim_{x \to \infty} f(x) = L, \]
if we can make $f(x)$ as close to $L$ as we wish by choosing $x$ sufficiently large. We can define each
of the following expressions in a similar way:
as well. We again confirm this by looking at the graph of $y = \frac{1}{x}$, which has horizontal asymptote
$y = 0$ at both extremes (that is, as $x \to \infty$ and as $x \to -\infty$).
Note. We could abbreviate what we have understood here by saying that $\frac{1}{\pm\infty} = 0$.
\[ \lim_{x \to -\infty} e^x = 0. \]
Exercise 4.8.9. Let α be a real number. What is $\lim_{x \to \infty} x^\alpha$? Hint: your answer will be different for
certain different values of α; there are three cases.
Exercise 4.8.10. Let r > 0 be a real number. What is $\lim_{x \to \infty} r^x$? Hint: your answer will be different
for certain different values of r; there are three cases.
Exercise 4.8.11. We didn’t define one-sided limits when talking about limits to ±∞. Write a little
story about what “$\lim_{x \to \infty^+} f(x)$” or “$\lim_{x \to \infty^-} f(x)$” should mean; you’ll have to be creative. Result: we
don’t.
You might like to use the Limit Laws — but remember that they only apply to limits that exist.
We can extend them to infinite limits ONLY if it doesn’t lead to any indeterminate forms. So for
example, it’s OK to say that
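\[ \lim_{x \to \infty} (3000x^2 + 4x^3) = \text{“}\infty + \infty\text{”} = \infty, \]
since both $3000x^2 \to \infty$ and $4x^3 \to \infty$ as $x \to \infty$.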
A function need not have a limit, or diverge to ∞, as x → ∞. Key examples to remember are the
trigonometric functions. For example,
\[ \lim_{x \to \infty} \sin(x) \text{ does not exist} \]
because the value of sin(x) oscillates between −1 and 1 as x grows larger and larger, never settling
to any single value L.
There are some standard techniques for working out limits as x → ±∞.
Example 4.8.12. (Factor out the dominant term.) We have
\[ \lim_{x \to \infty} (3000x^2 - 4x^3) = \lim_{x \to \infty} (-4x^3)\left(-\frac{750}{x} + 1\right) = -\infty, \]
since $-4x^3 \to -\infty$ and $-\frac{750}{x} + 1 \to 1$ as $x \to \infty$.
In fact, all polynomials functions of degree at least 1 give limx→∞ = ±∞, where the signs are
determined by the coefficient of the highest degree term.
Example 4.8.13. (Factor numerator and denominator by highest power term in the
denominator)
\[ \lim_{x\to\infty} \frac{3x^2 + 2x + 1}{-x^2 + 3} = \lim_{x\to\infty} \frac{3 + 2/x + 1/x^2}{-1 + 3/x^2} = \frac{3}{-1} = -3 \]
since the other terms in the numerator and the denominator are going to zero. Similarly
because the numerator is growing without bound towards ∞ while the denominator is staying very
close to −1. At the other extreme,
\[ \lim_{x\to\infty} \frac{3x^2 + 2x + 1}{-x^3 + 3} = \lim_{x\to\infty} \frac{3/x + 2/x^2 + 1/x^3}{-1 + 3/x^3} = 0 \]
since this time the numerator is going to 0 while the denominator is staying near −1.
This technique also works with other rapidly-growing functions. What it amounts to is to scale
the numerator and the denominator by a factor which will make the denominator go to a constant,
nonzero value.6
Example 4.8.14. (Scaling with exponentials)
\[ \lim_{x\to\infty} \frac{e^x - 1}{4 + 5e^x} = \lim_{x\to\infty} \frac{1 - e^{-x}}{4e^{-x} + 5} = \frac{1}{5} \]
where in the first step we divided numerator and denominator through by $e^x$, the dominant term.
Note that we could find the final answer by a kind of direct substitution (coming from continuity)
since $\lim_{x\to\infty} e^{-x} = 0$.
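The scaling step in Example 4.8.14 can be checked numerically. The following is only an illustrative sketch (the function names `original` and `scaled` are made up here, not taken from the notes): it evaluates both forms of the quotient at a few values of x and shows that the scaled form can still be evaluated when x is very large.

```python
import math

def original(x):
    """(e^x - 1) / (4 + 5 e^x), as first written in Example 4.8.14."""
    return (math.exp(x) - 1) / (4 + 5 * math.exp(x))

def scaled(x):
    """The same quotient after dividing numerator and denominator by e^x."""
    return (1 - math.exp(-x)) / (4 * math.exp(-x) + 5)

for x in (1, 10, 50):
    print(f"x = {x:>3}: original = {original(x):.12f}, scaled = {scaled(x):.12f}")

# For very large x, math.exp(x) raises OverflowError, so original(1000) fails,
# while math.exp(-x) simply becomes 0.0 and the scaled form still evaluates.
print("scaled(1000) =", scaled(1000))
```

Both columns should approach 1/5, in agreement with the limit computed above.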
Example 4.8.15. (Scaling with radicals) Consider $f(x) = \frac{x-2}{\sqrt{3x^2+4}}$. As x → ∞, the numerator
goes to ∞ and the denominator goes to ∞. You might be tempted to divide numerator and
denominator by $x^2$, but look what that gives in the denominator:
\[ \frac{1}{x^2}\sqrt{3x^2+4} = \sqrt{\frac{1}{x^4}(3x^2+4)} = \sqrt{\frac{3}{x^2} + \frac{4}{x^4}} \]
which goes to 0 as x → ∞. Since the scaled numerator would also go to 0 (check!) we deduce we
scaled by too much: we changed our indeterminate form ∞/∞ into another indeterminate form of
type 0/0. So actually, we meant to multiply by $\frac{1}{x}$ in the numerator and denominator instead.
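A different approach: factor the leading (most important) term out of the denominator, carefully:
\begin{align*}
\lim_{x\to\infty} f(x) &= \lim_{x\to\infty} \frac{x-2}{\sqrt{3x^2+4}} \\
&= \lim_{x\to\infty} \frac{x-2}{\sqrt{x^2(3+4/x^2)}} \\
&= \lim_{x\to\infty} \frac{x-2}{|x|\sqrt{3+4/x^2}} && \text{since } \sqrt{x^2} = |x| \\
&= \lim_{x\to\infty} \frac{x-2}{x\sqrt{3+4/x^2}} && \text{since } x\to\infty \text{ means } x > 0 \\
&= \lim_{x\to\infty} \frac{1-2/x}{\sqrt{3+4/x^2}} && \text{since } \frac{x-2}{x} = 1 - \frac{2}{x} \\
&= \frac{1-0}{\sqrt{3+0}} && \text{since } 1/x \to 0 \\
&= \frac{1}{\sqrt{3}}
\end{align*}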
6
Alternately, you can scale so as to make the numerator go to a constant, nonzero value instead; the only difference
is that if in that case your denominator goes to zero, you need to decide if it is approaching zero from above or below
to decide if the limit diverges to ∞ or −∞.
In fact, this is the same thing as multiplying numerator and denominator by $\frac{1}{x}$.
Note. Scale by the net power of the denominator, and watch out for the algebra.
Example 4.8.16. (Scaling with radicals: careful with negatives) Consider now $\lim_{x\to-\infty} \frac{x-2}{\sqrt{3x^2+4}}$.
This is legitimate, as the domain of this function is all of R. The work is almost the same, except,
crucially, as x → −∞, we have x < 0 so |x| = −x. Therefore we have:
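\begin{align*}
\lim_{x\to-\infty} \frac{x-2}{\sqrt{3x^2+4}} &= \lim_{x\to-\infty} \frac{x-2}{\sqrt{x^2(3+4/x^2)}} \\
&= \lim_{x\to-\infty} \frac{x-2}{|x|\sqrt{3+4/x^2}} && \text{since } \sqrt{x^2} = |x| \\
&= \lim_{x\to-\infty} \frac{x-2}{-x\sqrt{3+4/x^2}} && \text{since } x\to-\infty \text{ means } x < 0 \\
&= \lim_{x\to-\infty} \frac{-1+2/x}{\sqrt{3+4/x^2}} && \text{since } \frac{x-2}{-x} = -1 + \frac{2}{x} \\
&= \frac{-1+0}{\sqrt{3+0}} && \text{since } 1/x \to 0 \\
&= \frac{-1}{\sqrt{3}}
\end{align*}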
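As a sanity check on Examples 4.8.15 and 4.8.16, one can evaluate the function at large positive and large negative inputs. This is a minimal numerical sketch, not a proof; the sample points are arbitrary choices.

```python
import math

def f(x):
    """f(x) = (x - 2) / sqrt(3x^2 + 4), the function from Examples 4.8.15 and 4.8.16."""
    return (x - 2) / math.sqrt(3 * x**2 + 4)

# Value of the limit as x -> infinity; the limit as x -> -infinity is -target.
target = 1 / math.sqrt(3)

for x in (1e2, 1e4, 1e6):
    print(f"x = {x:>9.0f}: f(x) = {f(x):.6f}   f(-x) = {f(-x):.6f}")
print(f"1/sqrt(3) = {target:.6f}")
```

The sign change between the two columns comes entirely from |x| = −x when x < 0, which is the step that is easy to miss by hand.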
We used a nice shortcut in our last few examples: we know that 1/x → 0 as x → ∞, so we “plugged
in 0” for 1/x when we evaluated the limit. This is in fact an application of Corollary 4.4.4 for
infinite limits. Let’s do examples.
Example 4.8.17. Since the exponential function is continuous,
\[ \lim_{x\to\infty} e^{1/x} = e^{\lim_{x\to\infty} 1/x} = e^0 = 1. \]
End of lecture # 7
Chapter 5
The Derivative
One of the two central notions in Calculus is that of the derivative. The derivative of a function
at a point is its instantaneous rate of change at that point; if we know the derivative of f at every
point x, this gives us a new function $f'(x)$. Finding this function, and understanding what it tells
us about f, is the object of this chapter.
Suppose first that we have a linear function like y = 3x + 2. Then the rate of change of y with
respect to x is 3: for each unit of increase of x, we get a 3 unit increase of y, and so forth.
Now consider what happens if we have a nonlinear function, like $y = x^2$. When x = 1, increasing
x by ∆x = 1 unit increases y from 1 to 4, so the change in y is ∆y = 3; when x = 2, increasing x
by ∆x = 1 unit increases y from 4 to 9, so ∆y = 5. Point: the rate of change ∆y/∆x depends on the
value of x.
Even worse: the fraction ∆y/∆x depends on the value of ∆x. For example, say x = 1. Then
if ∆x = 1 then we saw ∆y = 3 so ∆y/∆x = 3; but
if ∆x = 0.5 then y goes from 1 to $(1.5)^2 = 2.25$ so ∆y = 1.25, and ∆y/∆x = 1.25/0.5 = 2.5.
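The two computations above can be repeated for smaller and smaller values of ∆x. The sketch below is only an illustration (the helper name `difference_quotient` is an assumption, not notation from the notes); it tabulates ∆y/∆x for $y = x^2$ at x = 1.

```python
def f(x):
    """The nonlinear function y = x^2 used in the text."""
    return x**2

def difference_quotient(f, x, dx):
    """Change in y divided by change in x, starting at x with step dx."""
    return (f(x + dx) - f(x)) / dx

for dx in (1, 0.5, 0.1, 0.01, 0.001):
    print(f"dx = {dx:<6} dy/dx = {difference_quotient(f, 1, dx):.6f}")
```

The quotient keeps changing with ∆x, but the changes shrink as ∆x shrinks, which is what motivates taking a limit in the next section.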
MAT 1330 : Fall 2020 5.1. THE DEFINITION
What we did above was to say: suppose we start at a particular value x = a and then change to a
new point x = b. We want to know how this changes the y-value, from y = f (a) to y = f (b).
The line joining (a, f (a)) and (b, f (b)) is called a secant line of the curve y = f (x). This image is
taken from https://www.shmoop.com/derivatives/slope-function.html, with thanks.
The change is often denoted $\frac{\Delta y}{\Delta x}$, where this notation means: the change in y-value divided by the
change in x-value. You can think of it as the slope of the secant line:
\[ \frac{\Delta y}{\Delta x} = \frac{f(b) - f(a)}{b - a}. \]
This leads to the following definition.
Definition 5.1.1. The average rate of change of a function f over an interval [a, b] in its domain
is the rise over the run:
\[ f_{av} = \frac{f(b) - f(a)}{b - a} \]
which is the slope of a secant line of the curve.
The average rate of change tells us something about the function. For example:
• if f(t) represents the reading on your odometer at time t, then $f_{av}$ is your average speed from
time t = a to time t = b;
• if f(t) represents the population of an organism at time t, then $f_{av}$ is the average net growth
rate from time t = a to time t = b;
• if f(x) is the amount in mg of a chemical (or drug) absorbed by the lungs when the amount
in each breath is x (which varies as x varies! 1 ), then $f_{av}$ is the average marginal rate of
absorption of the drug as the concentration rises from x = a to x = b.
This information is crude, however: in the case of your odometer, it doesn’t tell you if you drove
within the speed limit during that interval; in the case of chemical absorption, it only gives a kind
of rule of thumb (eg: “when x increases from a to b, you are probably absorbing half of the extra
drug with each breath”). This is not precise enough to do science (or avoid a speeding ticket).
1
This kind of variation is called functional response. In Chemistry, you might call it Michaelis-Menten or Monod
reaction kinetics; it’s the effect that your lungs (or any absorbing substance) reach saturation and can’t absorb more.
See also Absorption Functions in your textbook for varied examples.
Example 5.1.2. Find the average rate of change of $f(x) = e^x$ on the interval [0, 1] and on the
interval [1, 2].
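One way to check an answer to Example 5.1.2 is to apply Definition 5.1.1 directly. The following sketch assumes a helper named `average_rate_of_change` (a hypothetical name chosen for this illustration) and uses the standard library exponential.

```python
import math

def average_rate_of_change(f, a, b):
    """Slope of the secant line of f over [a, b], as in Definition 5.1.1."""
    return (f(b) - f(a)) / (b - a)

on_0_1 = average_rate_of_change(math.exp, 0, 1)  # equals e - 1
on_1_2 = average_rate_of_change(math.exp, 1, 2)  # equals e^2 - e

print("average rate of change on [0, 1]:", on_0_1)
print("average rate of change on [1, 2]:", on_1_2)
```

The second value is larger than the first, reflecting that the graph of $e^x$ gets steeper as x increases.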
What we want is the instantaneous rate of change of f at the point x in its domain. In the case
of your odometer, the instantaneous rate of change means the value on your speedometer (your
instantaneous speed); in the case of chemical absorption, the instantaneous rate of change tells you
about the precise sensitivity of your lungs to the uptake of the drug as a function of concentration,
which can let you correctly prescribe an increased dosage that will have the effect you want.
The idea: we know how to find the average rate of change on any interval [a, b]. So now consider
smaller and smaller intervals
[a, b] and [b, a]
for values of b getting closer and closer to a. These correspond to secant lines that are getting
closer and closer to the tangent line of f at a: the line that represents the slope of the curve “at
a”. So we take the limit as b → a.
Definition 5.1.3. Let f be a function defined on an interval around a. Then f is called differentiable
at a if the following limit exists:
\[ \lim_{b\to a} \frac{f(b) - f(a)}{b - a} \]
in which case we denote the limit $f'(a)$, and call it the derivative of f at a. If f is differentiable at
every point in an interval, then this defines a function $f'$ and we say that f is differentiable, and
$f'$ is the derivative (function).
So the derivative at a point a is a number $f'(a)$; we can also say the derivative at a point x is the
number $f'(x)$. If we evaluate the derivative at every point x, then we get a function $f'(x)$.
The names of the variables are not relevant, as long as we are consistent. So if we want a formula
for the derivative at x, we might rename the variables so that we have
\[ f'(x) = \lim_{u\to x} \frac{f(u) - f(x)}{u - x}. \]
MAT 1330 : Fall 2020 5.2. EXAMPLES OF USING THE DEFINITION
It’s often easier to make a new variable h = u − x and then notice that as u → x, we have h → 0.
This gives the equivalent definition:
\[ f'(x) = \lim_{h\to 0} \frac{f(x + h) - f(x)}{h}. \]
Sometimes we’ll use ∆x for this difference h, because it represents the difference in the x-value; so
we could write
\[ f'(x) = \lim_{\Delta x\to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}. \]
All these equivalent definitions mean (and calculate) the same thing: the slope of the tangent line
to the curve y = f(x) at the point (x, f(x)).
Mathematical definitions are gold: they tell you exactly what you have to do to calculate the
answer.
Example 5.2.1. Consider the function f(x) = mx + b. This is a straight line with slope m; the
derivative should therefore come out to be equal to m. Let’s check:
\begin{align*}
f'(x) &= \lim_{h\to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h\to 0} \frac{m(x + h) + b - (mx + b)}{h} \\
&= \lim_{h\to 0} \frac{mx + mh + b - mx - b}{h} = \lim_{h\to 0} \frac{mh}{h} = \lim_{h\to 0} m = m.
\end{align*}
Notice how we had to algebraically manipulate the difference quotient in order to evaluate the
limit. By construction, the difference quotient gives an indeterminate form of type 0/0 (that’s kind
of the point); but as we saw in the last chapter, we know several ways of simplifying such a fraction
to cancel the hidden common factor in the numerator and denominator and thus reveal the actual
limit.
Example 5.2.2. Consider a quadratic function like $f(x) = 5x^2$. The slope of this curve varies
with x, so we expect a more interesting answer. Indeed:
\begin{align*}
f'(x) &= \lim_{h\to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h\to 0} \frac{5(x + h)^2 - 5x^2}{h} \\
&= \lim_{h\to 0} \frac{5(x^2 + 2xh + h^2) - 5x^2}{h} = \lim_{h\to 0} \frac{5x^2 + 10xh + 5h^2 - 5x^2}{h} \\
&= \lim_{h\to 0} \frac{10xh + 5h^2}{h} = \lim_{h\to 0} (10x + 5h) = 10x.
\end{align*}
MAT 1330 : Fall 2020 5.3. FIVE WAYS NOT TO BE DIFFERENTIABLE AT x
We look at this formula and judge that it makes sense: as x gets larger positive, y = f(x) gets
steeper (larger slope), and when x = 0, the slope of $y = 5x^2$ is 0, and when x is large negative, then
the slope is a large negative number, too. We could also sketch the graph of y = f(x) carefully and
measure the slope of the tangent line at each point to compare.
The number 5 was almost just a decoration in the preceding calculation. We could do the same
thing for a general abstract quadratic function.
Example 5.2.3. Let $f(x) = ax^2 + bx + c$, where a, b, c are parameters (that is, do not vary with
x). Then we compute
\begin{align*}
f'(x) &= \lim_{h\to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h\to 0} \frac{a(x + h)^2 + b(x + h) + c - (ax^2 + bx + c)}{h} \\
&= \lim_{h\to 0} \frac{a(x^2 + 2xh + h^2) + bx + bh + c - ax^2 - bx - c}{h} = \lim_{h\to 0} \frac{2axh + ah^2 + bh}{h} \\
&= \lim_{h\to 0} (2ax + ah + b) = 2ax + b.
\end{align*}
More complex functions generally take more work to solve for the limit.
Example 5.2.4. Let $f(x) = \sqrt{x}$. Then if x > 0, we have:
\begin{align*}
f'(x) &= \lim_{h\to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h\to 0} \frac{\sqrt{x+h} - \sqrt{x}}{h} \\
&= \lim_{h\to 0} \frac{\sqrt{x+h} - \sqrt{x}}{h} \cdot \frac{\sqrt{x+h} + \sqrt{x}}{\sqrt{x+h} + \sqrt{x}} = \lim_{h\to 0} \frac{x + h - x}{h(\sqrt{x+h} + \sqrt{x})} \\
&= \lim_{h\to 0} \frac{h}{h(\sqrt{x+h} + \sqrt{x})} = \lim_{h\to 0} \frac{1}{\sqrt{x+h} + \sqrt{x}} = \frac{1}{2\sqrt{x}}.
\end{align*}
If x = 0, this formula fails, and with good reason.
First: since $\sqrt{x}$ is not defined on both sides of x = 0, we are not allowed to define the derivative of
$\sqrt{x}$ at 0. The rule is: the function must be defined on both sides of the point; we have to take the
two-sided limit.
Secondly: as $x \to 0^+$, the slope of the curve is increasing to ∞, so the instantaneous rate of change
is growing without bound. Although it seems reasonable to do so, no, we don’t say “$f'(0) = \infty$”;
we say “$f'(0)$ does not exist”.
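The formula from Example 5.2.4 can be compared with a difference quotient for a small h, and the behaviour near 0 can be tabulated. This is a rough numerical sketch; the step size 1e-6 and the sample points are arbitrary choices made for illustration.

```python
import math

def difference_quotient(f, x, h):
    """(f(x + h) - f(x)) / h, the quotient inside the definition of the derivative."""
    return (f(x + h) - f(x)) / h

x = 4.0
approx = difference_quotient(math.sqrt, x, 1e-6)
exact = 1 / (2 * math.sqrt(x))  # formula from Example 5.2.4
print(f"at x = {x}: difference quotient = {approx:.8f}, formula = {exact:.8f}")

# The slope 1 / (2 sqrt(x)) grows without bound as x approaches 0 from the right.
for x in (1e-2, 1e-4, 1e-6):
    print(f"x = {x:g}: slope = {1 / (2 * math.sqrt(x)):g}")
```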
There are several ways that a function could fail to be differentiable. Each one indicates that the
behaviour of the function at that point is in some way unpredictable, which means it will be an
important point to understand — it will be the place where the function acts in an interesting way.
Reason #1: The graph has a corner or a cusp at x. Consider for example
\[ f(x) = |x| = \begin{cases} x & \text{if } x \ge 0; \\ -x & \text{if } x < 0. \end{cases} \]
The graph of y = |x|. Its slope is −1 if x < 0 and +1 if x > 0, and the two do not agree at x = 0.
Since this is a piecewise defined function, to calculate the derivative at x = 0 we need to compute
the two one-sided limits. As $h \to 0^+$ we have h > 0 so therefore
\[ \lim_{h\to 0^+} \frac{|0 + h| - |0|}{h} = \lim_{h\to 0^+} \frac{h}{h} = 1 \]
whereas
\[ \lim_{h\to 0^-} \frac{|0 + h| - |0|}{h} = \lim_{h\to 0^-} \frac{-h}{h} = -1. \]
Since the two one-sided limits disagree, the (two-sided) limit does not exist, so f is not differentiable
at x = 0.
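The disagreement between the two one-sided limits is easy to see numerically. In this sketch, `one_sided_quotients` is a hypothetical helper written for illustration; it evaluates the difference quotient of |x| at 0 for small positive and small negative h.

```python
def one_sided_quotients(f, a, steps=(0.1, 0.01, 0.001)):
    """Difference quotients of f at a, using h > 0 (right) and h < 0 (left)."""
    right = [(f(a + h) - f(a)) / h for h in steps]
    left = [(f(a - h) - f(a)) / (-h) for h in steps]
    return right, left

right, left = one_sided_quotients(abs, 0)
print("from the right:", right)
print("from the left: ", left)
```

Every quotient from the right equals 1 and every quotient from the left equals −1, so no single number can serve as the two-sided limit.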
Reason #2: The graph is vertical at x. Consider for example $f(x) = \sqrt[3]{x} = x^{1/3}$, which is
the inverse function to $y = x^3$. Its graph is below.
Note that f(x) is still a function (it passes the vertical line test), but at the instant it passes
zero its tangent line is vertical. We can see this from the definition as well:
\[ \lim_{h\to 0} \frac{f(0 + h) - f(0)}{h} = \lim_{h\to 0} \frac{h^{1/3}}{h} = \lim_{h\to 0} \frac{1}{h^{2/3}} = \infty. \]
Reason #3: The function does not exist on both sides of x. This is part of the requirement
for the derivative, and it reflects the idea that it only makes sense to talk about the instantaneous
rate of change at a point if you can pass through that instant.
For example, consider $f(x) = x^{3/2} = \sqrt{x^3}$, which is only defined for x ≥ 0. Its graph is drawn
below.
The graph of $y = x^{3/2}$. The derivative of this function is not defined at 0, because the function is
not defined on both sides of 0.
So $f'(0)$ does not exist. In this case, however, we might reasonably ask about $\lim_{x\to 0^+} f'(x)$, which
would be answering the (interesting, but not equivalent) question “What is the limit of the slope
of f as x approaches 0?”. You can verify that $f'(x) = \frac{3}{2}\sqrt{x}$, so this limiting slope (as opposed to
instantaneous slope) is 0, which is vaguely reasonable-looking from the graph.
Reason #4: The function is discontinuous at x. For example, consider a function like
\[ \operatorname{sgn}(x) = \begin{cases} 1 & \text{if } x \ge 0; \\ -1 & \text{if } x < 0. \end{cases} \]
This is discontinuous at 0 and we claim that the derivative at 0 does not exist. We look at the
graph, below.
The graph of y = sgn(x) in blue. A secant line of the function going through (0, f (0)) but starting
from the left, is drawn in green.
You might feel it is quite reasonable to say that the derivative of f(x) = sgn(x) is 0 at 0, since at
every point except x = 0, we have $f'(x) = 0$, and therefore
\[ \lim_{x\to 0} f'(x) = 0. \]
But wait: that is a cute fact, and good reasoning, but it is NOT THE DEFINITION OF THE
DERIVATIVE. The derivative of f at 0 is given by
\[ f'(0) = \lim_{h\to 0} \frac{f(h) - f(0)}{h}. \]
But since f(h) = 1 if h ≥ 0 and f(h) = −1 if h < 0, we have
This is not just a technical oddity: it’s really important. When the curve is discontinuous, we
don’t have a tangent line, so we can’t have a derivative.
This is a general fact, which we can state in two equivalent ways, as follows.
That is, we omit the transition point; if you then determine that the function is continuous, you
can see if the derivatives on both sides match as well.a
a
Technically, we are assuming that the derivatives on both sides are also continuous functions here; see upper-year
math courses for examples where this can fail.
Reason #5: The function f is not defined at x. If f is not defined at x, we cannot even
write down the formula for the derivative, because we have no value for f (x). At best, we could be
asking for the limit of the derivative as we approach x (which, as we saw in the previous example,
is just absolutely totally different from asking for the derivative at x itself).
The graph of y = 1/x in blue. It is undefined at 0 so we cannot draw a secant line through
(0, f (0)); moreover, any average rate of change across an interval containing 0 is meaningless.
Note. The domain of $f'$ can never be bigger than the domain of f, but it can be smaller.
We saw an example where the domain of $f'$ is smaller than the domain of f: remember $f(x) = x^{2/3}$.
MAT 1330 : Fall 2020 5.4. WHAT $f'$ TELLS YOU ABOUT f
Since we defined the derivative in terms of the tangent line, there is a nice correspondence between
the graph of a function and the properties of its derivative:
So for example, by looking at the graph of a function f, we can sketch the graph of $f'$:
The graph of f(x) in blue, and the inferred graph of $f'$ in red. Where f has a horizontal tangent,
$f'$ is 0; where f is increasing, $f'$ is positive; where f is decreasing, $f'$ is negative.
We can also reverse this process, that is, sketch f from the graph of $f'$; but the answer will not be
unique. The derivative determines the shape of the function, but not where exactly it is.2 Thus
f(x) and f(x) + c for any constant c will have the same derivative.
End of lecture # 8
2
For example, if you know exactly what speed you were driving at every instant of a day, you could figure out
how far you’d travelled in that day — but not where you were!
MAT 1330 : Fall 2020 5.5. DIFFERENTIATION RULES: THE BASICS
Although the definition of the derivative can be used to compute derivatives, this is quite tedious.
Thankfully, over the years since the discovery of the derivative, people have figured out a number
of simple rules that, taken together, can be used to evaluate the derivative of almost any function that
is given by a formula. The definition of the derivative then only needs to be used if
We state all the rules here and then give some examples of how they are used. This section concludes
with an explanation of why each of the rules is true.
1. (Power rule) If $f(x) = x^n$ for some n ∈ R, then $f'(x) = nx^{n-1}$. In particular, the derivative of
the constant function 1 is 0.
2. (Constant multiple rule) If f is differentiable and c is a constant, then g(x) = cf(x) is differentiable
and $g'(x) = cf'(x)$.
3. (Sum/difference rule) If f and g are differentiable, then h(x) = f(x) ± g(x) is differentiable and $h'(x) = f'(x) \pm g'(x)$.
4. (Product rule) h(x) = f(x)g(x) is differentiable and $h'(x) = f'(x)g(x) + f(x)g'(x)$.
5. (Quotient rule) $h(x) = \frac{f(x)}{g(x)}$ is differentiable wherever g(x) ≠ 0 and
\[ h'(x) = \frac{f'(x)g(x) - f(x)g'(x)}{g(x)^2}. \]
6. (Chain rule) h(x) = f(g(x)) is differentiable and $h'(x) = f'(g(x))g'(x)$.
Note. Please memorize these formulas, in whatever way that works for you. I remember the
quotient rule as: “the bottom times the derivative of the top, minus the top times the derivative
of the bottom, all over the bottom squared.” Others write $h(x) = \frac{u}{v}$ and remember “v du − u dv over
$v^2$”.
Note. For the chain rule: remember that f was evaluated at g(x), so that is where you have to
evaluate $f'$: it’s $f'(g(x))$ NOT $f'(x)$ in the chain rule.
We can apply these rules to infer some of the things we proved directly from the definition. For
example:
So by the constant multiple rule, if f(x) = mx for some constant m, then $f'(x) = m$.
Finally, by the sum rule, if f(x) = mx + b for some constants m and b, then
$f'(x) = (mx)' + (b)' = m(x)' + b(1)' = m(1) + b(0) = m$.
Example 5.5.2. If $f(x) = x^{47.5}$ then $f'(x) = 47.5x^{46.5}$ by the power rule.
Example 5.5.3. If $f(x) = \sqrt[3]{x}$ then rewrite this as $f(x) = x^{1/3}$. So by the power rule,
\[ f'(x) = \frac{1}{3}x^{\frac{1}{3}-1} = \frac{1}{3}x^{-2/3} = \frac{1}{3x^{2/3}}. \]
Notice that the domain of $f'(x)$ excludes 0.
Example 5.5.4. If $f(x) = \frac{1}{x^3}$, then rewrite this as $f(x) = x^{-3}$. So by the power rule,
\[ f'(x) = -3x^{-4} = \frac{-3}{x^4}. \]
Example 5.5.6. If f(x) = (2x + 1)(3x + 4) then by the product rule, we have
\[ f'(x) = (2)(3x + 4) + (2x + 1)(3) = 12x + 11. \]
We could also have gotten this answer by multiplying out $f(x) = 6x^2 + 11x + 4$ and applying the
power, sum and constant multiple rules.
Using the chain rule requires you to be strongly aware of the composition of functions. For example,
here is a table of some compositions of functions:
In each case, we apply the chain rule to find the derivative as $F'(x) = f'(g(x))g'(x)$:
Consider $F(x) = \sqrt{4x + 1}$. This is f(g(x)) where g(x) = 4x + 1 (so $g'(x) = 4$) and
$f(u) = \sqrt{u} = u^{1/2}$ (so $f'(u) = \frac{1}{2}u^{-1/2}$). Therefore $F(x) = f(g(x)) = \sqrt{4x + 1}$ has derivative
\[ F'(x) = \frac{1}{2}(4x + 1)^{-1/2} \cdot 4 = \frac{2}{\sqrt{4x + 1}}. \]
When we want to work it out in stages, we might write:
\[ \frac{d}{dx}\sqrt{4x + 1} = \frac{d}{dx}(4x + 1)^{\frac{1}{2}} = \frac{1}{2}(4x + 1)^{-\frac{1}{2}} \frac{d}{dx}(4x + 1) = \frac{1}{2}(4x + 1)^{-\frac{1}{2}} \cdot 4 = \frac{2}{\sqrt{4x + 1}}. \]
Consider $F(x) = 4\sqrt{x} + 1$. We could think of it as f(g(x)) with $g(x) = \sqrt{x}$ and f(u) = 4u + 1.
Then $g'(x) = \frac{1}{2}x^{-1/2}$ and $f'(u) = 4$, so
\[ F'(x) = 4 \cdot \frac{1}{2}x^{-1/2} = \frac{2}{\sqrt{x}} \]
as we can see by applying the constant multiple rule directly.
Consider $F(x) = \frac{1}{1 + x^2}$. We can think of this as f(g(x)) with $g(x) = 1 + x^2$ and f(u) = 1/u.
Then $g'(x) = 2x$ and $f'(u) = -u^{-2}$, so we have
\[ F'(x) = -(1 + x^2)^{-2} \cdot (2x) = \frac{-2x}{(1 + x^2)^2} \]
as we can check directly with the quotient rule (but this way is faster).
Example 5.5.8. If $f(x) = \frac{1}{3x^4 + x}$ then by the quotient rule
\[ f'(x) = \frac{(3x^4 + x) \cdot 0 - 1 \cdot (12x^3 + 1)}{(3x^4 + x)^2} = \frac{-12x^3 - 1}{(3x^4 + x)^2}. \]
Alternately, we could write $f(x) = (3x^4 + x)^{-1}$ and then use the chain rule
\[ f'(x) = -(3x^4 + x)^{-2}(12x^3 + 1) = \frac{-12x^3 - 1}{(3x^4 + x)^2}, \]
which of course comes out the same.
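A formula obtained from the rules can always be compared against the definition numerically. The sketch below does this for Example 5.5.8; the step size and sample points are arbitrary, and a small difference quotient only approximates the derivative, so agreement is expected to a few decimal places rather than exactly.

```python
def f(x):
    """f(x) = 1 / (3x^4 + x), from Example 5.5.8."""
    return 1 / (3 * x**4 + x)

def f_prime(x):
    """The derivative found in Example 5.5.8 with the quotient (or chain) rule."""
    return (-12 * x**3 - 1) / (3 * x**4 + x) ** 2

def difference_quotient(func, x, h=1e-6):
    """(func(x + h) - func(x)) / h for a small fixed h."""
    return (func(x + h) - func(x)) / h

for x in (0.5, 1.0, 2.0):
    print(f"x = {x}: rule gives {f_prime(x):.6f}, "
          f"difference quotient gives {difference_quotient(f, x):.6f}")
```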
Example 5.5.9. Suppose h(x) is a differentiable function and $h'(x)$ is its derivative. Now suppose
that
\[ f(x) = (h(x))^3. \]
Then by the chain rule
\[ f'(x) = 3(h(x))^2 \cdot h'(x). \]
Similarly, if $g(x) = \frac{h(x)}{x}$, then by the quotient rule,
\[ g'(x) = \frac{xh'(x) - h(x)}{x^2}. \]
Note. It is hugely important to practice these rules! Over the coming sections, we will be adding
the rules for differentiating more functions, and combining functions in better ways. These rules
get easier to use the more you practice with them. Like knowing your multiplication tables by
heart, being able to differentiate easily will make everything we do after this point make better
sense and go more easily.
Exercise 5.5.10. Differentiate each of the following functions using the rules in this section. In
each case, consider if there are multiple ways of writing or interpreting the function, so that you
use different rules, and verify that you always get the same answer.
The following subsections on why the rules are true are optional, but you are encouraged to read
them. Appreciating where these rules come from and why they are true is an important part of
making the connection between the definition of the derivative and these great rules — and really
helps with remembering them.
We can explain here why the power rule is true for the case that n is a positive integer. (To prove
it holds for all values of n requires using the exponential and logarithm functions.)
In particular, each term is divisible by h except the first; and after the second term, each term is
divisible by h2 .
Note. If a and b are constants, and f(x) and g(x) are differentiable functions, then k(x) = af(x) +
bg(x) is differentiable and its derivative is
\[ k'(x) = af'(x) + bg'(x). \]
The constant multiple rule and the sum/difference rule can be summarized as the one rule above,
which is mathematically the statement that “differentiation is a linear operator on the vector space
of functions.” You get the constant multiple rule by taking b = 0 and you get the sum rule by
taking a = 1 = b and the difference rule by taking a = 1, b = −1.
where in that last part we used the Limit Laws (Theorem 4.3.1) to evaluate the parts of the limit
separately.
The reason for the strange mixed term comes from geometry. Algebraically, the way we have to
look at the difference is as follows:
where we have used the continuity of g at x to conclude that $\lim_{h\to 0} g(x + h) = g(x)$, and we have
used the definition of the derivative in the other two cases.
Notice that we differentiate each function once, and multiply the result.
This example was boring because the derivatives were all constants. To see why the rule holds in
general, here is an argument.
So let’s write g(x) = y and for each h, define a new variable k by the formula k = g(x + h) − g(x).
So g(x + h) = y + k for some small value k. Since g is continuous at x, we see that as h → 0, we
also have k → 0. That lets us write
\begin{align*}
\lim_{h\to 0} \frac{f(g(x + h)) - f(g(x))}{h} &= \lim_{h\to 0} \frac{f(y + k) - f(y)}{h} \\
&= \lim_{h\to 0} \frac{f(y + k) - f(y)}{k} \cdot \frac{k}{h} \\
&= \lim_{k\to 0} \frac{f(y + k) - f(y)}{k} \cdot \lim_{h\to 0} \frac{g(x + h) - g(x)}{h} \\
&= f'(y)g'(x) \\
&= f'(g(x))g'(x).
\end{align*}
(What we didn’t allow for in this formula was the possibility that g is constant, so that k = 0; but
there are other ways to deduce the same formula even in these weird cases.)
Since we already know why the product rule and the chain rule are true, we can use those to prove
the quotient rule (which is shorter than using the definition).
First, rewrite $F(x) = \frac{f(x)}{g(x)}$ as $F(x) = f(x)(g(x))^{-1}$. By the product rule
\[ \frac{d}{dx}\left( f(x)(g(x))^{-1} \right) = f'(x)(g(x))^{-1} + f(x) \frac{d}{dx}(g(x))^{-1}. \]
By the chain rule,
\[ \frac{d}{dx}(g(x))^{-1} = -(g(x))^{-1-1} g'(x) = -\frac{g'(x)}{g(x)^2}. \]
Therefore, putting this together gives
\[ \frac{d}{dx} \frac{f(x)}{g(x)} = f'(x)(g(x))^{-1} + f(x)\left( -\frac{g'(x)}{g(x)^2} \right) \]
which over a common denominator gives
\[ = \frac{f'(x)}{g(x)} - \frac{f(x)g'(x)}{g(x)^2} = \frac{f'(x)g(x) - f(x)g'(x)}{g(x)^2} \]
as required.
MAT 1330 : Fall 2020 5.6. DERIVATIVES OF EXPONENTIAL FUNCTIONS
To find its derivative, lacking any other ideas, we use the definition:
\[ f'(x) = \lim_{h\to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h\to 0} \frac{a^{x+h} - a^x}{h} = \lim_{h\to 0} a^x \frac{a^h - 1}{h} = a^x \lim_{h\to 0} \frac{a^h - 1}{h} = a^x f'(0). \tag{5.1} \]
We have gone in a bit of a circle here: we wanted the derivative of $f(x) = a^x$ at some random point
x, but instead figured out that the derivative will satisfy
\[ f'(x) = a^x f'(0). \]
Well, at least that is a simpler problem: just figure out the derivative at 0, which should be the
slope of the tangent line to the curve y = f(x) at x = 0.
So $f(x) = e^x$, the natural exponential, is the one that satisfies, for any x, $f'(x) = e^x$, that is,
Note.
\[ \frac{d}{dx} e^x = e^x \]
That’s an incredible property, for a function to be equal to its own derivative; in fact, the only
functions with this property are those of the form $Ke^x$ for some constant K.
Example 5.6.2. The normal distribution describes a standard bell curve, and the basic form is
\[ h(x) = e^{-x^2}. \]
The graph of $y = e^{-x^2}$, which describes a standard bell curve in Statistics.
Problem: Find $h'(x)$.
Solution: This is a composition of two functions, so we apply the chain rule. We have h(x) = f(g(x))
where $g(x) = -x^2$ is the innermost function and $f(u) = e^u$ is the outermost function; therefore
\[ h'(x) = f'(g(x))g'(x) = e^{g(x)} g'(x) = e^{-x^2}(-2x) = -2xe^{-x^2}. \]
Example 5.6.3. Consider $F(x) = x^n e^{-x}$, where n is some fixed number; this is related to another
important function in Statistics, called the Gamma Distribution. Then by the product rule and
the chain rule
\[ \frac{d}{dx} F(x) = nx^{n-1}e^{-x} + x^n(-e^{-x}) = (n - x)x^{n-1}e^{-x}. \]
Example 5.6.4. If $f(x) = e^{g(x)}$ then $f'(x) = e^{g(x)} g'(x)$ by the chain rule. Similarly, if $h(x) = g(e^x)$
then $h'(x) = g'(e^x)e^x = e^x g'(e^x)$ by the chain rule.
So we have solved for the derivative of $f(x) = e^x$. What about $f(x) = a^x$, for some other a > 0?
MAT 1330 : Fall 2020 5.7. DERIVATIVES OF LOGARITHMS
So instead of tackling $f(x) = a^x$, we rewrite it as $f(x) = e^{x\ln(a)}$ and apply the chain rule (remembering
that ln(a) is just a number, because a is some fixed number):
\[ f'(x) = e^{x\ln(a)} \cdot \ln(a) = a^x \ln(a), \]
that is,
Note.
\[ \frac{d}{dx} a^x = a^x \ln(a). \]
In fact, we have actually found the mysterious value from the beginning of this section, in (5.1):
the derivative of $a^x$ at x = 0! That is, since $f'(0) = \ln(a)$ we have shown
\[ \lim_{h\to 0} \frac{a^h - 1}{h} = \ln(a). \]
Exercise 5.6.6. What a surprising limit; we didn’t do any of our usual tricks to find it. Convince
yourself it is true, at least for a = 2: evaluate $\frac{2^h - 1}{h}$ for smaller and smaller values of h and compare
the answer with ln(2).
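The experiment suggested in Exercise 5.6.6 takes only a few lines. This is one possible way to set it up (the function name `quotient` and the chosen values of h are assumptions made for this sketch).

```python
import math

def quotient(a, h):
    """(a^h - 1) / h, the expression whose limit as h -> 0 appears in (5.1)."""
    return (a**h - 1) / h

for h in (0.1, 0.01, 0.001, 1e-4, 1e-5):
    print(f"h = {h:<8g} (2^h - 1)/h = {quotient(2, h):.8f}")
print(f"ln(2)              = {math.log(2):.8f}")
```

Taking h far smaller than this is not recommended in floating point, because the subtraction $2^h - 1$ loses accuracy when $2^h$ is extremely close to 1.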
Example 5.6.7. If $f(x) = 2^{x^3+1}$ then by the above and the chain rule we have
\[ f'(x) = 2^{x^3+1} \ln(2) (3x^2) = (3\ln(2)x^2)2^{x^3+1}. \]
Alternatively, we’d rewrite $f(x) = e^{\ln(2)(x^3+1)}$ to deduce
\[ f'(x) = e^{\ln(2)(x^3+1)} \cdot 3\ln(2)x^2. \]
In the previous section, we found the derivatives of exponential functions, by using the definition
and discovering the number e for which the definition gives a nice limit. Now we want to differentiate
f(x) = ln(x) (or more generally $\log_a(x)$ for some fixed constant a). We could use the definition,
but now we actually have more tools available, so there is an easier way.
equal to the function on the right hand side at every point x, and so their graphs are the same and
their derivatives are the same. So this should give us an equation to find $g'(x)$!
\[ e^{g(x)} = x. \]
whereas the derivative of the right hand side is 1. Therefore, differentiating both sides gives the
new equation
\[ e^{g(x)} g'(x) = 1 \]
which says that
\[ g'(x) = \frac{1}{e^{g(x)}}. \]
Now g(x) = ln(x) so $e^{\ln(x)} = x$; thus we conclude:
Note.
\[ \frac{d}{dx}(\ln(x)) = \frac{1}{x} \]
This is an incredible formula! When we differentiate the natural logarithm, the answer doesn’t
have a logarithm, or even an exponential, in it.
Notice that it fills a gap in our differentiation tables: the derivative of $x^n$ is $nx^{n-1}$, so there was
no function that gave us as derivative the function $x^{-1}$. Now we’ve found it: it’s the natural
logarithm, which you can think of as a very slowly growing function that still goes to ∞ as x → ∞.
Example 5.7.2. Find the derivative of $\sqrt{\ln(x) + 4}$.
MAT 1330 : Fall 2020 5.8. DERIVATIVES OF FUNCTIONS LIKE $f(x)^{g(x)}$
We only know the derivative of g(x) = ln(x), so we need to change base. As with exponentials,
there is a standard method:
Therefore
\[ \frac{d}{dx}(\log_a(x)) = \frac{d}{dx} \frac{\ln(x)}{\ln(a)} = \frac{1}{\ln(a)} \cdot \frac{1}{x} = \frac{1}{x\ln(a)}, \]
that is
Note.
\[ \frac{d}{dx}(\log_a(x)) = \frac{1}{x\ln(a)}. \]
Example 5.7.4. If $f(x) = \log_2(e^x + x)$ then by the above and the chain rule we have
\[ f'(x) = \frac{1}{\ln(2)(e^x + x)}(e^x + 1). \]
We have seen that $\frac{d}{dx} x^n = nx^{n-1}$, if n is a constant. We have also seen that $\frac{d}{dx} a^x = a^x \ln(a)$, if a
is a positive constant. So what is
\[ \frac{d}{dx} x^x ? \]
Would it be $x(x^{x-1})$ “power rule” or $x^x \ln(x)$ “exponential rule”? Answer: NEITHER. You may
only apply a rule under the hypotheses in which it was derived, and in this case, both are wrong.
Instead, we go back and recall how we solved for the derivative of $a^x$: we converted it to base e.
That process will work here as well:
\[ x^x = e^{x\ln(x)}. \]
Fabulous! This is now a function that we can differentiate, using the exponential and chain rules:
\[ \frac{d}{dx} x^x = \frac{d}{dx} e^{x\ln(x)} = e^{x\ln(x)} \frac{d}{dx}(x\ln(x)) = e^{x\ln(x)}\left(1 \cdot \ln(x) + x \cdot \frac{1}{x}\right) = x^x(\ln(x) + 1). \]
(We used that $e^{x\ln(x)} = x^x$ to simplify the expression in the last step.)
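To see concretely that neither tempting shortcut works, the three candidate formulas can be compared with a difference quotient at a sample point. This is only a numerical illustration; the point x = 2 and the step size are arbitrary choices, and the function names are made up for this sketch.

```python
import math

def f(x):
    """f(x) = x^x, for x > 0."""
    return x**x

def candidate_power_rule(x):
    """Misapplied power rule: x * x^(x - 1)."""
    return x * x**(x - 1)

def candidate_exponential_rule(x):
    """Misapplied exponential rule: x^x * ln(x)."""
    return x**x * math.log(x)

def derived_formula(x):
    """The formula derived above: x^x * (ln(x) + 1)."""
    return x**x * (math.log(x) + 1)

def difference_quotient(func, x, h=1e-6):
    return (func(x + h) - func(x)) / h

x = 2.0
print("difference quotient:     ", difference_quotient(f, x))
print("power rule guess:        ", candidate_power_rule(x))
print("exponential rule guess:  ", candidate_exponential_rule(x))
print("x^x (ln(x) + 1):         ", derived_formula(x))
```

Only the last formula should agree with the difference quotient; notice also that the correct answer happens to be the sum of the two incorrect guesses.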
It comes from the identity $a^b = e^{b\ln(a)}$, for any a > 0 and any b.
Example 5.8.2. Find the derivative of $h(x) = (x^2 + 1)^{\ln(x)}$.
\[ \frac{d}{dx} f(x)^{g(x)} = f(x)^{g(x)} \left( g'(x)\ln(f(x)) + \frac{g(x)f'(x)}{f(x)} \right) \]
but this is too ridiculous to memorize; instead we remember the technique of Proposition 5.8.1.
Exercise 5.8.3. You can use this method to go back and prove the power rule for any power n ∈ R,
not just positive integers, by rewriting $x^n = e^{n\ln(x)}$ and simplifying your answer. So the power rule
is a consequence of the derivatives of exponentials!
End of lecture # 9
MAT 1330 : Fall 2020 5.9. IMPLICIT DIFFERENTIATION
A function $y = f(x)$ is a particular kind of relation between the variables $x$ and $y$: one whose
graph passes the vertical line test. If our variables satisfy a relation like
\[ x^2 + y^2 = 9 \]
then the corresponding graph is not the graph of a function, and does not pass the vertical line test. However,
we know we can decompose this graph into pieces, such that each piece is the graph of a function; in this case,
the graph is the union of the graphs of
\[ y = \sqrt{9 - x^2} \quad\text{and}\quad y = -\sqrt{9 - x^2}. \]
Now here’s the clever idea: if we want to find the slope of the tangent line to the circle at a certain
point, do we really need to solve for y in terms of x? After all, if we know that y is a function of x
near each point, then we can differentiate y with respect to x. For example,
\[ \frac{d}{dx}(y^2) = 2y\,\frac{dy}{dx}. \]
What this implies is that sometimes we can solve for $y'$ without first having to solve for $y$. (This is
in fact the trick we used when finding the derivative of $y = \ln(x)$; now we’ll state it more generally.)
For example, if $x^2 + y^2 = 9$ then near any point we can think of both sides as being functions of $x$;
since they’re equal, their derivatives are equal. So we have
\[ 2x + 2yy' = 0, \]
whence $y' = -x/y$ wherever $y \neq 0$.
Example 5.9.2. Find the equation of the tangent line to the curve $x^2 + y^2 = 9$ at the point
$(1, -\sqrt{8})$.
Solution: this is indeed a point on the curve, and by the preceding, the slope of the tangent line at
that point is
\[ m = y' = \frac{-x}{y} = \frac{-1}{-\sqrt{8}} = \frac{1}{\sqrt{8}} = \frac{\sqrt{2}}{4}. \]
A line is $y = mx + b$; given the point $(x, y) = (1, -\sqrt{8})$ and the slope $m = \frac{\sqrt{2}}{4}$ we solve to get
\[ b = y - mx = -\sqrt{8} - \frac{\sqrt{2}}{4} = -2\sqrt{2} - \frac{1}{4}\sqrt{2} = -\frac{9}{4}\sqrt{2}. \]
Thus the equation of the tangent line is $y = \frac{\sqrt{2}}{4}x - \frac{9}{4}\sqrt{2}$, which you can judge to be about right
(using a calculator to find out what these values are like).
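The calculator check suggested in Example 5.9.2 can be sketched as follows. The explicit branch `lower_semicircle` and the step `h` are assumptions introduced here to compare the implicit slope with a numerical one.

```python
import math

x0, y0 = 1.0, -math.sqrt(8)

# Slope from implicit differentiation: 2x + 2y y' = 0, so y' = -x/y
m = -x0 / y0
b = y0 - m * x0

def lower_semicircle(x):
    # Near (1, -sqrt(8)) the circle is the graph of y = -sqrt(9 - x^2)
    return -math.sqrt(9 - x * x)

h = 1e-6
numerical_slope = (lower_semicircle(x0 + h) - lower_semicircle(x0 - h)) / (2 * h)
print(f"m = {m:.6f}, b = {b:.6f}, numerical slope = {numerical_slope:.6f}")
```

Here we did solve for $y$, but only to confirm the answer; the slope itself came from the relation alone.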
So:
At a mechanical level, implicit differentiation is saying that if you have an equation with
variables which depend on x, then you can differentiate both sides with respect to x using
the chain rule — remembering that
Note.
\[ \frac{dx}{dx} = 1 \quad\text{but}\quad \frac{dy}{dx} = y'. \]
\[ \frac{1}{y}\,y' = \frac{1}{x}\ln(x^2 + 1) + \ln(x)\,\frac{1}{x^2 + 1}\,(2x) = \frac{\ln(x^2 + 1)}{x} + \frac{2x\ln(x)}{x^2 + 1} \]
whence, upon multiplying the far left and the far right sides by $y$ we get
\[ y' = (x^2 + 1)^{\ln(x)}\left( \frac{\ln(x^2 + 1)}{x} + \frac{2x\ln(x)}{x^2 + 1} \right) \]
Applying the logarithm to a complicated equation y = f (x) to make it ln(y) = ln(f (x)) (and
then using the laws of logarithms to simplify the ln(f (x)) term), and then differentiating, is called
logarithmic differentiation. It is a good approach to use when f (x) is deeply ugly and unwieldy.
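A quick numerical check of the result of logarithmic differentiation for $y = (x^2+1)^{\ln(x)}$ might look like the sketch below. The sample point `x0` and the step size are illustrative assumptions.

```python
import math

def y(x):
    # y = (x^2 + 1)^(ln x), defined for x > 0
    return (x * x + 1) ** math.log(x)

def y_prime(x):
    # y' = y * ( ln(x^2 + 1)/x + 2x ln(x)/(x^2 + 1) )
    return y(x) * (math.log(x * x + 1) / x + 2 * x * math.log(x) / (x * x + 1))

def central_difference(func, x, h=1e-6):
    # Symmetric difference quotient: a numerical estimate of func'(x)
    return (func(x + h) - func(x - h)) / (2 * h)

x0 = 2.0  # sample point, chosen only for illustration
estimate = central_difference(y, x0)
formula = y_prime(x0)
print(f"numerical slope {estimate:.6f}, formula {formula:.6f}")
```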
Exercise 5.9.4. Find the derivative of $y = (\ln(x^2 + 1))^{x^2}$ using two methods: (1) by rewriting
the function as $y = e^{x^2 \ln(\ln(x^2 + 1))}$; and (2) by writing $\ln(y) = x^2 \ln(\ln(x^2 + 1))$ and differentiating
implicitly.
Example 5.9.5. Find the equation of the tangent line to the astroid
\[ x^{2/3} + y^{2/3} = 5 \]
at the point $(1, 8)$.
Solution: we verify that $(1, 8)$ is indeed a point on the curve. We differentiate both sides with
respect to $x$:
\[ \frac{2}{3}x^{-1/3} + \frac{2}{3}y^{-1/3}y' = 0 \]
whence
\[ y' = -\frac{x^{-1/3}}{y^{-1/3}} = \left(\frac{-y}{x}\right)^{1/3}. \]
At the point $(1, 8)$, we get $y' = (-8/1)^{1/3} = -2$. Therefore the equation of the tangent line is
\[ y = 8 - 2(x - 1) = -2x + 10. \]
Using software to sketch the graph of this curve, we see this answer looks about right.
The graph of $x^{2/3} + y^{2/3} = 5$. This shape is called an astroid and is traced out by a point on a small circle
rolling along the inside of a large circle: http://mathworld.wolfram.com/Astroid.html.
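\[ (x^2 + y^2)^2 = 4xy^2. \]
Three points on the graph are $(0, 0)$, $(1, 1)$ and $(3/4, \sqrt{3}/4)$. Find the slope of the tangent
line, when defined.
Differentiating both sides with respect to $x$ gives
\[ 2(x^2 + y^2)(2x + 2yy') = 4y^2 + 8xyy', \]
whence
\[ \big(4y(x^2 + y^2) - 8xy\big)\,y' = 4y^2 - 4x(x^2 + y^2), \]
or
\[ y' = \frac{y^2 - x(x^2 + y^2)}{y(x^2 + y^2) - 2xy}. \]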
At $(0, 0)$, this is $0/0$ so undefined. On the graph we see that there’s a huge mess at the origin; of
course there’s no tangent line.
At $(1, 1)$ this is $-1/0$ so again undefined; but on the graph we see that in fact there’s a vertical
tangent line at this point. (So $y$ is not a function of $x$ there; rather, $x$ is a function of $y$.)
At $(3/4, \sqrt{3}/4)$, we have
\[ y' = \frac{3/16 - (3/4)(3/4)}{(\sqrt{3}/4)(3/4) - (3\sqrt{3}/8)} = \frac{2\sqrt{3}}{3} \]
which looks reasonable from the graph.
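The slope formula for $(x^2+y^2)^2 = 4xy^2$ can be evaluated mechanically, as in this sketch. The function names `on_curve` and `slope_parts` and the tolerance are hypothetical choices made here; the numerator and denominator are kept separate so that a vanishing denominator can be detected rather than dividing by zero.

```python
import math

def on_curve(x, y, tol=1e-12):
    # Check that (x, y) satisfies (x^2 + y^2)^2 = 4 x y^2 up to rounding
    return abs((x * x + y * y) ** 2 - 4 * x * y * y) < tol

def slope_parts(x, y):
    # Numerator and denominator of y' = (y^2 - x(x^2+y^2)) / (y(x^2+y^2) - 2xy)
    r2 = x * x + y * y
    return y * y - x * r2, y * r2 - 2 * x * y

points = [(1.0, 1.0), (0.75, math.sqrt(3) / 4)]
for x, y in points:
    num, den = slope_parts(x, y)
    if abs(den) < 1e-12:
        print(f"({x}, {y:.4f}): denominator is 0, so y' is undefined here")
    else:
        print(f"({x}, {y:.4f}): slope {num / den:.4f}")
```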
MAT 1330 : Fall 2020 5.10. DERIVATIVES OF SINE AND COSINE
\[ 4x^3 = 2x - 2yy' \]
or $y' = \frac{1}{y}\,x(1 - 2x^2)$, after simplifying. At $(-\frac{1}{2}, -\frac{\sqrt{3}}{4})$, this is $\frac{1}{\sqrt{3}}$.
Now differentiate the relation $4x^3 = 2x - 2yy'$, noting that $\frac{d}{dx}(yy') = y'y' + yy''$ by the
product rule, to get:
\[ 12x^2 = 2 - 2y'y' - 2yy'' \]
or $y'' = (1 - (y')^2 - 6x^2)/y$ after simplifying. Plugging in the point $(x, y) = (-\frac{1}{2}, -\frac{\sqrt{3}}{4})$ and the first
derivative $y' = \frac{1}{\sqrt{3}}$ at this point yields
\[ y'' = \frac{1 - \frac{1}{3} - \frac{3}{2}}{-\sqrt{3}/4} = \frac{10}{3\sqrt{3}}. \]
We compare with the graph, and agree that the slope is positive and around 0.6 at that point; we
agree that the curve is concave up (see Section 6.2).
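\begin{align*}
f'(x) &= \lim_{h\to 0} \frac{\sin(x + h) - \sin(x)}{h} \\
      &= \lim_{h\to 0} \frac{\sin(x)\cos(h) + \sin(h)\cos(x) - \sin(x)}{h} \\
      &= \lim_{h\to 0} \left( \sin(x)\,\frac{\cos(h) - 1}{h} + \cos(x)\,\frac{\sin(h)}{h} \right) \\
      &= \sin(x) \lim_{h\to 0} \frac{\cos(h) - 1}{h} + \cos(x) \lim_{h\to 0} \frac{\sin(h)}{h}.
\end{align*}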
So it all comes down to understanding these two limits — which are exactly the derivatives of
cos(x) and of sin(x) at x = 0.
In fact, we have [3]:
[3] That $\sin'(0) = 1$ is ONLY TRUE when we measure our angle in RADIANS. If you change the units with which
you measure the x-axis, such as by using degrees, the value of the slope will change (in this case, to the useless and
annoying value $\pi/180$). Use RADIANS for Calculus.
Note.
\[ \cos'(0) = \lim_{h\to 0} \frac{\cos(h) - 1}{h} = 0 \quad\text{and}\quad \sin'(0) = \lim_{h\to 0} \frac{\sin(h)}{h} = 1 \]
(see below). Thus $\sin'(x) = \cos(x)$. A similar process with the definition of the derivative of cosine
comes down to the same two limits, and after some work we conclude that
Note.
\[ \frac{d}{dx}\sin(x) = \cos(x) \quad\text{and}\quad \frac{d}{dx}\cos(x) = -\sin(x). \]
Here are some good arguments, using geometry, to explain why the derivative of sin(x) at 0 is 1
and the derivative of cos(x) at 0 is 0.
Why $\lim_{h\to 0} \frac{\sin(h)}{h} = 1$: We can make a geometric argument.
Or:
\[ \cos'(0) = \lim_{h\to 0} \frac{\cos(0 + h) - \cos(0)}{h} = \lim_{h\to 0} \frac{\cos(h) - 1}{h}. \]
We know that
\[ \sin^2(x) + \cos^2(x) = 1. \]
(This is more correctly written as $(\sin(x))^2 + (\cos(x))^2 = 1$.) Therefore we can differentiate both
sides to give
\[ 2\sin(x)\sin'(x) + 2\cos(x)\cos'(x) = 0. \]
MAT 1330 : Fall 2020 5.11. DERIVATIVES OF OTHER TRIGONOMETRIC FUNCTIONS
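Now at this moment we don’t know the derivative of $\sin(x)$ everywhere (that depends on knowing
this limit!), but when $x = 0$, the fact that $\sin(0) = 0$ and $\cos(0) = 1$ is enough to tell us that
\[ 2\cdot 0\cdot\sin'(0) + 2\cdot 1\cdot\cos'(0) = 0, \quad\text{that is,}\quad \cos'(0) = 0. \]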
A deeper connection with exponentials. As an aside: the same mathematician Euler who
discovered and calculated $e$ also figured out why the derivatives of exponential and trig functions
show the same dependency on their values at 0. If we let $i = \sqrt{-1}$ denote a complex number whose
square is $-1$, and flesh out what this should mean in terms of functions, we get the identity:
\[ e^{ix} = \cos(x) + i\sin(x). \]
This formula is only valid if $x$ is measured in radians. So sine and cosine functions are essentially
special cases of exponential functions, if you are willing to work with complex numbers.
That said, although complex numbers are the only way to discuss electricity and magnetism, for
example, their main application in the life sciences is through linear algebra rather than Calculus
(as we’ll see later in MAT1332). So we won’t be pursuing this thought further here.
Note.
\[ \tan(x) = \frac{\sin(x)}{\cos(x)}, \quad \cot(x) = \frac{\cos(x)}{\sin(x)}, \quad \csc(x) = \frac{1}{\sin(x)}, \quad \sec(x) = \frac{1}{\cos(x)}. \]
Therefore we can just apply the quotient rule to deduce their derivatives from those of $\sin(x)$ and
$\cos(x)$.
Example 5.11.1. Find $\frac{d}{dx}\tan(x)$.
Solution: we apply the quotient rule
\[ \frac{d}{dx}\tan(x) = \frac{d}{dx}\,\frac{\sin(x)}{\cos(x)} = \frac{\cos(x)\cos(x) - \sin(x)(-\sin(x))}{\cos^2(x)} = \frac{1}{\cos^2(x)} = \sec^2(x). \]
Note.
\[ \sin^2(x) + \cos^2(x) = 1 \]
MAT 1330 : Fall 2020 5.12. INVERSE TRIGONOMETRIC FUNCTIONS
Example 5.11.2. Find $\frac{d}{dx}\csc(x)$.
Solution: we apply the chain rule to $\csc(x) = (\sin(x))^{-1}$. This gives
\[ \frac{d}{dx}\csc(x) = -(\sin(x))^{-2}\cos(x) = \frac{-\cos(x)}{\sin^2(x)} = -\frac{\cos(x)}{\sin(x)}\cdot\frac{1}{\sin(x)} = -\cot(x)\csc(x). \]
Exercise 5.11.3. Use the quotient rule and standard identities to find the derivatives of sec(x)
and of cot(x).
It is very useful to memorize the derivatives of the six standard trigonometric functions:
Note.
\begin{align*}
\frac{d}{dx}\sin(x) &= \cos(x), & \frac{d}{dx}\cos(x) &= -\sin(x) \\
\frac{d}{dx}\tan(x) &= \sec^2(x), & \frac{d}{dx}\cot(x) &= -\csc^2(x) \\
\frac{d}{dx}\sec(x) &= \sec(x)\tan(x), & \frac{d}{dx}\csc(x) &= -\csc(x)\cot(x)
\end{align*}
by the chain rule. Notice that sec(x) tan(x) does NOT occur in this expression.
Inverse trigonometric functions have a special place in Calculus, because their derivatives are such
astonishingly normal-looking functions. This means that inverse trigonometric functions sometimes
pop up when you need to find anti-derivatives (see “integrals”, later in this course) even when there
are no trigonometric functions in sight! (This is a bit how the logarithm ln(x) shows up as an
anti-derivative of 1/x, a rational function.)
We also need inverse trigonometric functions whenever we want to solve an equation like $\cos(x) = 0.3$
or $\tan(x) = 17$.
Both notations are acceptable but you must recall that $\sin^{-1}(x)$ means the inverse function of sine,
NOT $\csc(x)$, DESPITE the suggestive $-1$. The “$-1$” is intended to evoke “inverse function”, NOT
reciprocal. We write $\arcsin(x)$ for the inverse sine function in this course (and in the homework
software Mobius).
So we sketch the graph of $y = \sin(x)$; this is not one-to-one; therefore, like we did for $y = x^2$, we
have to agree on a portion of the domain of $y = \sin(x)$ to which we can restrict the function. We
have universally agreed on $[-\pi/2, \pi/2]$.
The graph of y = sin(x) on left, with the portion over [−π/2, π/2] on which it is one-to-one
highlighted in green, together with the graph of y = arcsin(x), which is the inverse of sin
restricted to [−π/2, π/2], and thus has domain [−1, 1]. Note the scales on the axes.
So we conclude that:
Note.
Solution: exactly one solution is given by x = arcsin(0) = 0. To find all the others, we look at the
graph of y = sin(x) and see that kπ, for any integer k, is also a solution.
Solution: exactly one solution is given by $x = \arcsin(\frac{1}{2}) = \pi/6$. To find all others, we look at the
graph of y = sin(x), or we use the identities:
and
sin(x) = sin(π − x).
We see that these account for all the possible solutions, so our final answer is:
So
\[ y = \arcsin(x) \iff \sin(y) = x \text{ and } -\pi/2 \le y \le \pi/2. \]
We apply implicit differentiation to the equation $\sin(y) = x$ to get
\[ \cos(y)\,y' = 1 \iff y' = \frac{1}{\cos(y)} = \frac{1}{\cos(\arcsin(x))}. \]
This looks somewhat hideous: but let’s simplify it.
So $\arcsin(x)$ is the angle $y$, with $-\pi/2 \le y \le \pi/2$, such that $\sin(y) = x$. Now we know
\[ \sin^2(y) + \cos^2(y) = 1 \]
and moreover, on $-\pi/2 \le y \le \pi/2$, $\cos(y) \ge 0$. Thus
\[ \cos(\arcsin(x)) = \cos(y) = \sqrt{1 - \sin^2(y)} = \sqrt{1 - x^2}. \]
Therefore:
Note.
\[ \frac{d}{dx}(\arcsin(x)) = \frac{1}{\sqrt{1 - x^2}} \quad\text{for all } x \in (-1, 1) \]
A quick reality check: indeed, this function is defined only for −1 < x < 1, which is what you’d
expect for the derivative of arcsin(x). It is, nonetheless, a little shocking that the derivative of this
function isn’t another inverse trig function — but notice that the formula is definitely related to
trigonometry, which is more obvious from the following example.
Example 5.12.3. Let’s prove that $\cos(\arcsin(x)) = \sqrt{1 - x^2}$ using triangles. We pretend that
$0 < \arcsin(x) < \pi/2$ but the argument can be adapted for $-\pi/2 < \arcsin(x) \le 0$ to give the same
answer as well.
Now that we have a formula for the derivative of arcsin(x), we can use it to differentiate any
function involving the arcsine function; we don’t need to rederive it each time.
Example 5.12.4. Let $y = \arcsin(e^x + x^2)$. Then
\[ y' = \frac{1}{\sqrt{1 - (e^x + x^2)^2}} \cdot (e^x + 2x) \]
Remark 5.12.5. Notice that when arcsin is the outermost function of a composition, it does not
occur in the derivative.
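Example 5.12.4 can be checked numerically, as sketched below. One caveat the code makes visible: the formula only makes sense where $e^x + x^2$ lies strictly between $-1$ and $1$, so the sample point `x0 = -0.5` (an assumption made here) was chosen inside that region.

```python
import math

def y(x):
    # y = arcsin(e^x + x^2); requires -1 <= e^x + x^2 <= 1
    return math.asin(math.exp(x) + x * x)

def y_prime(x):
    # Chain rule: (e^x + 2x) / sqrt(1 - (e^x + x^2)^2)
    u = math.exp(x) + x * x
    return (math.exp(x) + 2 * x) / math.sqrt(1 - u * u)

def central_difference(func, x, h=1e-6):
    # Symmetric difference quotient: a numerical estimate of func'(x)
    return (func(x + h) - func(x - h)) / (2 * h)

x0 = -0.5  # here e^x + x^2 is about 0.857, safely inside (-1, 1)
estimate = central_difference(y, x0)
formula = y_prime(x0)
print(f"numerical slope {estimate:.6f}, formula {formula:.6f}")
```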
Sketching the graph of y = tan(x), we see that our natural choice for restricting the domain is
similar to that for sin(x); except we must exclude endpoints because of the vertical asymptotes.
The graph of y = tan(x), with a maximal portion selected on which it is one-to-one, together with
the graph of y = arctan(x), which is the inverse of tan restricted to (−π/2, π/2), which has
domain all of R.
Note.
One solution is arctan(1) = π/4. The graph of y = tan(x) is periodic with period π, and so in fact,
we simply have that all solutions are
x = π/4 + πk
for some integer k.
\[ \sec^2(y)\,y' = 1 \]
so
\[ y' = \frac{1}{\sec^2(y)} = \frac{1}{1 + \tan^2(y)} = \frac{1}{1 + x^2} \]
by the identity
Note.
\[ \sec^2(\theta) = 1 + \tan^2(\theta). \]
So
Note.
\[ \frac{d}{dx}\arctan(x) = \frac{1}{1 + x^2} \quad\text{for all } x. \]
Again, this is reasonable; this function is defined on all of $\mathbb{R}$ and goes to 0 as $x$ goes to $\pm\infty$, as
you’d expect from the graph of $\arctan(x)$.
(This function, you’ll notice, is completely unrelated to $\sec^2(x)$, because $1/\arctan(x)$ is completely
unrelated to $\tan(x)$.)
The graph of y = cos(x) on left, with the portion over [0, π] on which it is one-to-one highlighted
in green, together with the graph of y = arccos(x), which is the inverse of cos restricted to [0, π],
and thus is defined on domain [−1, 1]. Note the scales on the axes.
Note.
Example 5.12.8. Find all solutions to $\cos(x) = \sqrt{3}/2$.
One solution is $\arccos(\sqrt{3}/2) = \pi/6$; this is the only solution in the interval $[0, \pi]$ since $\cos(x)$ is
one-to-one there. To find all other solutions, we use the identities:
and
cos(−x) = cos(x).
We see from the graph that these give us all other solutions. Therefore the answer is
We could repeat the argument used above, but instead we might stare at the graph of arccos(x)
and realize that it has a very similar shape to that of arcsin(x), because they are each portions of
a sinusoidal curve in y. How can we use this?
We can relate the portions of the sine and cosine graphs on which we took the inverse functions.
We know that for 0 ≤ x ≤ π,
cos(x) = sin(π/2 − x)
with −π/2 ≤ π/2 − x ≤ π/2. So
so that
Note.
arccos(x) = π/2 − arcsin(x).
It follows that
Note.
\[ \frac{d}{dx}(\arccos(x)) = \frac{-1}{\sqrt{1 - x^2}} \]
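Both the identity $\arccos(x) = \pi/2 - \arcsin(x)$ and the resulting derivative can be spot-checked at a few points, as in this sketch. The sample points and helper names are illustrative assumptions.

```python
import math

def central_difference(func, x, h=1e-6):
    # Symmetric difference quotient: a numerical estimate of func'(x)
    return (func(x + h) - func(x - h)) / (2 * h)

def arccos_prime(x):
    # Derivative stated above: -1 / sqrt(1 - x^2), for -1 < x < 1
    return -1 / math.sqrt(1 - x * x)

samples = [-0.9, -0.5, 0.0, 0.5, 0.9]

# Gap between arccos(x) and pi/2 - arcsin(x) at each sample point
identity_gaps = [abs(math.acos(x) - (math.pi / 2 - math.asin(x))) for x in samples]

# Gap between the numerical slope of arccos and the formula
slope_gaps = [abs(central_difference(math.acos, x) - arccos_prime(x)) for x in samples]

print("largest identity gap:", max(identity_gaps))
print("largest slope gap:", max(slope_gaps))
```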
MAT 1330 : Fall 2020 5.13. SUMMARY OF KNOWN DERIVATIVES
Exercise 5.12.9. You might ask about the remaining inverse trigonometric functions (which no
one uses). They can be expressed in terms of the ones we know. For example, if you want to know
about the inverse cosecant function, you would reason as follows:
\[ y = \operatorname{arccsc}(x) \text{ means } x = \csc(y) = \frac{1}{\sin(y)}, \]
which is valid for $y \in [-\pi/2, 0) \cup (0, \pi/2]$. (Exercise: why?) Then you solve:
\[ x = \frac{1}{\sin(y)} \iff \sin(y) = \frac{1}{x} \iff y = \arcsin(1/x). \]
(a) Find similar expressions for arcsec(x) and arccot(x), as well as their domains and ranges.
They’re interesting, but since most calculators don’t even have these functions as buttons, it’s kind
of pointless to use them.
It is good to remember all our rules of differentiation as the first step of a chain rule. In the
following, u represents some function of x; we are giving a formula to reduce the problem of finding
the derivative of a composite function to finding the derivative of u (i.e. reducing to a smaller
problem). Iterating this gives you the answer.
Note.
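\begin{align*}
\frac{d}{dx}\,1 &= 0 & \frac{d}{dx}\,u^n &= nu^{n-1}\,\frac{du}{dx}, \quad n \neq 0 \\
\frac{d}{dx}\,e^u &= e^u\,\frac{du}{dx} & \frac{d}{dx}\ln(u) &= \frac{1}{u}\,\frac{du}{dx} \\
\frac{d}{dx}\sin(u) &= \cos(u)\,\frac{du}{dx} & \frac{d}{dx}\cos(u) &= -\sin(u)\,\frac{du}{dx} \\
\frac{d}{dx}\tan(u) &= \sec^2(u)\,\frac{du}{dx} & \frac{d}{dx}\cot(u) &= -\csc^2(u)\,\frac{du}{dx} \\
\frac{d}{dx}\sec(u) &= \sec(u)\tan(u)\,\frac{du}{dx} & \frac{d}{dx}\csc(u) &= -\csc(u)\cot(u)\,\frac{du}{dx} \\
\frac{d}{dx}\arcsin(u) &= \frac{1}{\sqrt{1 - u^2}}\,\frac{du}{dx} & \frac{d}{dx}\arctan(u) &= \frac{1}{1 + u^2}\,\frac{du}{dx}
\end{align*}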
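As a worked illustration of iterating the table, consider $y = \arctan(e^{x^2})$. This function is chosen here purely as an example and does not come from the notes. Each line applies one rule from the table and leaves a smaller derivative to compute.

```latex
\begin{align*}
\frac{d}{dx}\arctan\!\left(e^{x^2}\right)
  &= \frac{1}{1 + \left(e^{x^2}\right)^2}\cdot\frac{d}{dx}\,e^{x^2}
     && \text{(arctan rule with } u = e^{x^2}\text{)} \\
  &= \frac{1}{1 + e^{2x^2}}\cdot e^{x^2}\cdot\frac{d}{dx}\,x^2
     && \text{(exponential rule with } u = x^2\text{)} \\
  &= \frac{2x\,e^{x^2}}{1 + e^{2x^2}}
     && \text{(power rule)}
\end{align*}
```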
End of lecture # 10
Chapter 6
Suppose we are given a formula for a function that models a phenomenon of interest (e.g. drug
absorption over time, population as a function of environmental pollutants). From that formula,
using our pre-Calculus skills, we can deduce its domain and the values at its endpoints. Using
limits, we can often work out its behaviour near gaps in its domain, or towards $\pm\infty$.
In the previous chapter, we learned how to differentiate everything. If a function has a formula
that you recognize, then you can differentiate it at most points, and can additionally tell where
there is a cusp (a place where it is continuous but turns sharply, as $y = |x|$ does). That is, we can
now say how the function changes with its input.
The goal in this chapter is to show how powerful differentiation is as a tool for understanding
functions. We begin by interpreting the first and second derivative of a function (Sections 6.1 and
6.2), and then see how to use these to sketch graphs (Section 6.3). We then identify local and
global extrema of functions (Section 6.4). These extrema help us model phenomena and optimize
functions (Section 6.5).
The derivative does much more, too. We also develop a new tool for finding certain kinds of
limits, called L’Hôpital’s rule (Section 6.6). It lets us approximate complex functions with simpler
ones (Section 6.7), come up with criteria for when fixed points of nonlinear DTDS are stable or not
(Section 6.8), and find roots of unsolvable equations to any degree of accuracy we want (Section 6.9).
One of the key things the derivative can tell us is where the function does something interesting.
We call these critical points.
Definition 6.1.1. A number $c$ in the domain of a function $f$ is a critical point or a critical number
of $f$ if either $f'(c) = 0$ or else $f'(c)$ is undefined.
MAT 1330 : Fall 2020 6.1. THE FIRST DERIVATIVE
The function $f(x) = 2x + 3$ has derivative $f'(x) = 2$ everywhere, so it has no critical points.
The function $f(x) = 5x^2$ has derivative $f'(x) = 10x$, which is 0 at $x = 0$, so $c = 0$ is a critical
point of $f$.
The derivative of the function $f(x) = |x|$ is not defined at $x = 0$, so 0 is a critical point of $f$.
In each case, the critical point identifies the existence of an interesting feature of the graph (none,
a minimum, and a cusp, respectively).
What happens between critical points? Well: suppose $f'(x)$ is continuous [1]. Then $f'(x)$ can’t
change sign between critical points: it’s either positive on the whole interval, or negative on the
whole interval.
Note. Recall from Section 5.4 that if $f'(x) > 0$ on an interval, then $f$ is increasing; and if $f'(x) < 0$
on an interval, then $f$ is decreasing.
So if we divide the real line into intervals delimited by critical points, then we can determine the
sign on each interval, and thereby see where f is increasing, decreasing, or has a horizontal tangent.
Example 6.1.3. Consider $f(x) = -x^3 + 3x^2 + 45x - 8$. We find
\[ f'(x) = -3x^2 + 6x + 45 = -3(x + 3)(x - 5) \]
so the domain of $f'$ is $\mathbb{R}$ and the only critical points are its roots, where $f'(x) = 0$, that is,
$x \in \{-3, 5\}$.
We make a table with columns for the critical points and the intervals in the domain that they
define. On each interval and point, we evaluate $f'(x)$; then we interpret this in terms of the
behaviour of $f(x)$. We also record the value of the function at each critical point, because this can
help us check our answer makes sense.
The values of $f$ at the critical points are consistent with the function being increasing in between.
We can also check that $\lim_{x\to-\infty} f(x) = \infty$ and $\lim_{x\to\infty} f(x) = -\infty$ so it’s all consistent, and therefore
we have confidence that we didn’t make a mistake.
The graph of f is below; notice how it has a horizontal tangent line exactly at the critical points
we found.
[1] This is true of all the functions we consider in this course. There are stranger functions for which $f'(x)$ is not
continuous, but then we just add the points of discontinuity to the list of critical points.
The graph of $y = -x^3 + 3x^2 + 45x - 8$. Notice that the graph is increasing between $-3$ and $5$, and
decreasing otherwise.
Solution: we compute $f'(x) = -2x^{-3}$. This is undefined at $x = 0$ because you can’t divide by 0
(and $-2x^{-3} = \frac{-2}{x^3}$) but $x = 0$ isn’t in the domain of $f$, so is not a critical point. Thus $f$ has no
critical points.
DANGER: You might be tempted to say “since $f$ has no critical points, $f'$ never changes sign;
since $f'(1) = -2$, $f$ is always decreasing.” But let’s look at the graph of $f(x) = 1/x^2$ to see if this
is true.
The graph of $y = x^{-2}$. Notice that the graph is increasing on $(-\infty, 0)$ and decreasing on $(0, \infty)$.
What happened? Ah: $f'$ changed sign at $x = 0$, which is where it is undefined. Makes sense.
The correct table to make is the following, which is defined by ALL the points where $f'$ is zero or
undefined:
x                      (−∞, 0)       0            (0, ∞)
sign of −2             −             −            −
sign of x^(−3)         −             undefined    +
sign of f'(x)          +             undefined    −
behaviour of f(x)      increasing    undefined    decreasing
The point: the table should include all points where f could change from increasing to decreasing
and vice versa.
2. Create a table with all intervals in the domain cut out by critical points. Include critical points,
gaps in the domain, and endpoints (if applicable).
3. On each interval, find the sign of $f'(x)$. If $f'(x) > 0$, then $f$ is increasing; if $f'(x) < 0$ then $f$ is
decreasing.
• write $f'(x)$ as a product of factors, and reason out the signs of each of the factors and then
multiply them (as in the table in Example 6.1.3); or
• choose a sample point in each interval, and just plug that point in for $x$ in the formula for
$f'(x)$. (This works because the sign of $f'(x)$ is the same for all points in that interval, by our
construction; but for complicated functions it is tedious and prone to errors.)
Either way is fine. On a test, you must clearly communicate your reasoning, so either work out the
signs of the factors as above or state which sample point you used in each interval, for example.
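The sample-point method can be sketched in a few lines of code. The helper `sign_chart` below is hypothetical (not from the notes) and assumes the domain is all of $\mathbb{R}$ and that the list of cut points is complete; it is demonstrated on the derivative of $f(x) = -x^3 + 3x^2 + 45x - 8$ from Example 6.1.3, computed with the power rule.

```python
def sign_chart(f_prime, cut_points, sample_offset=1.0):
    """Classify f on each interval cut out by the given points."""
    points = sorted(cut_points)
    # One sample per interval: left of the first cut, midpoints, right of the last
    samples = [points[0] - sample_offset]
    samples += [(a + b) / 2 for a, b in zip(points, points[1:])]
    samples.append(points[-1] + sample_offset)
    edges = [float("-inf")] + points + [float("inf")]
    chart = []
    for left, right, s in zip(edges, edges[1:], samples):
        value = f_prime(s)
        if value > 0:
            behaviour = "increasing"
        elif value < 0:
            behaviour = "decreasing"
        else:
            behaviour = "flat at sample"
        chart.append(((left, right), s, behaviour))
    return chart

def f_prime(x):
    # Derivative of -x^3 + 3x^2 + 45x - 8
    return -3 * x**2 + 6 * x + 45

chart = sign_chart(f_prime, [-3, 5])
for interval, sample, behaviour in chart:
    print(interval, "sample", sample, "->", behaviour)
```

On a test you would still state the sample points and signs by hand; the code only automates the bookkeeping.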
Some more examples Note that we can use this method to analyse functions for which we do
not yet know anything about the graph.
Example 6.1.5. Find the critical points of $f(x) = x^2 e^{-x}$, and where this function is increasing
and decreasing. This is an example of a Gamma distribution, which is used to model the expected
length of time it will take for something to occur (e.g. for three synaptic impulses to occur; for you
to receive four phone calls) when the average time between these random events is known.
By the product rule, $f'(x) = 2xe^{-x} - x^2e^{-x} = x(2 - x)e^{-x}$.
The domain of $f'(x)$ is $\mathbb{R}$, so the only critical points are where $f'(x) = 0$. Since $e^{-x} \neq 0$ for any $x$,
the critical points are $x = 0$ and $x = 2$.
We make a table with columns for the critical points and the intervals in the domain that they
define. On each interval and point, we evaluate $f'(x)$; then we interpret this in terms of the
behaviour of $f(x)$. We also record the value of the function at each critical point, because this can
help us check our answer makes sense.
MAT 1330 : Fall 2020 6.2. THE SECOND DERIVATIVE
Notice that f (0) < f (2) so it makes sense that the function is increasing there. When f models the
Gamma distribution, we are only interested in x ≥ 0. This table tells us that the probability that
the random event will occur increases until x = 2, and then decreases: meaning it is most likely
the event will occur around x = 2.
Example 6.1.6. Consider $f(x) = x\ln(x)$. We have $f'(x) = \ln(x) + x\cdot\frac{1}{x} = \ln(x) + 1$. This is
defined for all $x > 0$, and it is zero exactly when $\ln(x) = -1$, that is, at $x = e^{-1}$.
We make a table with columns for the critical points and the intervals in the domain that they
define. On each interval and point, we evaluate $f'(x)$; then we interpret this in terms of the
behaviour of $f(x)$:
So our function decreases on $0 < x < e^{-1}$ until it bottoms out at the point $f(e^{-1}) = -e^{-1}$, after
which it is increasing; since $\lim_{x\to\infty} x\ln(x) = \infty$, there is no horizontal asymptote. (But what happens
near $x = 0$? Good question! Stay tuned in a few classes...)
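The conclusions of Example 6.1.6 are easy to confirm numerically. In this sketch the sample points 0.1 and 1.0, one on each side of the critical point, are arbitrary choices.

```python
import math

def f(x):
    return x * math.log(x)  # defined for x > 0

def f_prime(x):
    return math.log(x) + 1

c = math.exp(-1)  # the critical point e^(-1)

print(f"f'(c) = {f_prime(c):.2e}")           # essentially zero
print(f"f'(0.1) = {f_prime(0.1):.4f}")       # negative: f is decreasing on (0, c)
print(f"f'(1.0) = {f_prime(1.0):.4f}")       # positive: f is increasing on (c, infinity)
print(f"f(c) = {f(c):.6f}, -1/e = {-math.exp(-1):.6f}")
```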
So the first derivative tells us where f is increasing, but that isn’t the whole story. Let’s consider
an example for motivation.
Example 6.2.1. Consider the two functions $f(x) = e^x$ and $g(x) = \frac{x}{1+x}$ for $x \ge 0$. Exercise:
\[ f'(x) = e^x \quad\text{and}\quad g'(x) = \frac{1}{(1 + x)^2}. \]
Note that both functions have positive first derivatives, hence, both are increasing functions on
x ≥ 0. However, their graphs look quite different.
The graphs of $y = e^x$ in red and $y = \frac{x}{1+x}$ in blue. Both are increasing on $x \ge 0$ but their shapes
are opposite.
We observe that the slope of $f$ increases with $x$ but the slope of $g$ decreases with $x$. In other words,
the function $f'(x)$ is increasing but the function $g'(x)$ is decreasing. Let’s calculate this (for $x \ge 0$):
\[ \frac{d}{dx}[f'(x)] = e^x > 0, \quad\text{whereas}\quad \frac{d}{dx}[g'(x)] = \frac{-2}{(1 + x)^3} < 0. \]
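The opposite signs of these two second derivatives can also be seen numerically using a second difference, which estimates how the slope itself is changing. The helper `second_difference`, its step size and the sample point are assumptions made for this sketch.

```python
import math

def second_difference(func, x, h=1e-4):
    # (f(x+h) - 2 f(x) + f(x-h)) / h^2 approximates the second derivative at x
    return (func(x + h) - 2 * func(x) + func(x - h)) / (h * h)

def f(x):
    return math.exp(x)

def g(x):
    return x / (1 + x)

x0 = 1.0  # sample point in x >= 0, chosen only for illustration
f_second = second_difference(f, x0)
g_second = second_difference(g, x0)
print(f"f'' estimate {f_second:.5f} vs e^x = {math.exp(x0):.5f}")
print(f"g'' estimate {g_second:.5f} vs -2/(1+x)^3 = {-2 / (1 + x0) ** 3:.5f}")
```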
Given a function $f(x)$, its second derivative is the derivative of $f'(x)$, which we denote
$f''(x)$. The third derivative is the derivative of $f''(x)$, and we usually write $f^{(3)}(x)$ rather than
$f'''(x)$ just because it’s confusing otherwise. We can define the $n$th derivative of a function, denoted
$f^{(n)}(x)$.
The first derivative of $f$ tells us about the rate of change of $f$. If $f$ is increasing, then $f'(x) > 0$; if
$f$ is decreasing, then $f'(x) < 0$.
Therefore, the second derivative of $f$ tells us about the rate of change of $f'$: if $f''(x) > 0$ then $f'$ is
increasing; if $f''(x) < 0$ then $f'$ is decreasing. What does this look like?
Note how the slopes of the tangent lines are decreasing in the figure on the left (as x increases)
and increasing with x on the right. This change in slope of the tangent line forces the curve to
take on a characteristic shape. With thanks to The MathRoom
http://www.the-mathroom.ca/cal1/ and this diagram from
http://www.the-mathroom.ca/cal1/cald4/cald4.htm.
Let’s reason this out further:
If $f''(x) > 0$ on an interval, meaning $f'$ is increasing there, then the slope of the tangent line
to $f$ is increasing.
On this kind of graph, the tangent line is under the curve, because the tangent line is “pushing
it up”.
We call this shape concave up, and the shape is like that of a cup ∪ (or a smile).
Since $f'(x) = 3x^2$ and $f''(x) = 6x$, we see that $f''(x) < 0$ for $x < 0$ and $f''(x) > 0$ for $x > 0$. This
confirms that $f'(x) = 3x^2$ is decreasing on $(-\infty, 0)$ and increasing on $(0, \infty)$. By the above, it tells
us that $f$ is concave down on $(-\infty, 0)$ and concave up on $(0, \infty)$.
On the other hand, $g'(x) = 1/x = x^{-1}$ and $g''(x) = -x^{-2}$ which is negative everywhere on its
domain (which is bigger than the domain of $g$, but we only care about the domain of $g$). Thus
$g''(x) < 0$ and so $g'(x)$ is decreasing and so $g(x)$ is concave down.
We confirm all these observations by sketching the well-known graphs of these functions.
The graph of $f(x) = x^3$ on the left and the graph of $g(x) = \ln(x)$ on the right. Note their
concavity.
The change from concave up to concave down (or vice versa) is a subtle but important feature on
a graph. We give it a name.
Definition 6.2.3. An inflection point on the graph of y = f (x) is a point (x, y) on the curve where
the concavity of the curve changes: from concave up to concave down, or vice versa.
Example 6.2.4. So $(0, 0)$ is an inflection point of $f(x) = x^3$ because the graph goes from concave
down to concave up there. The graph of $g(x) = \ln(x)$ has no inflection points.
Note. Notice how we use the language: each inflection point of $f$ occurs at a critical point of $f'$
(that is, a point in the domain where $f''$ is either 0 or undefined), but not every critical point
of $f'$ is an inflection point of $f$. The critical points of $f'$ are just “potential inflection points”.
Example 6.2.5. The function $f(x) = 1/x$ is concave down on $(-\infty, 0)$ and concave up on $(0, \infty)$
but does not have any inflection points: where it changes concavity is not a point on the curve (it’s at
the vertical asymptote).
The function $g(x) = x^4$ is always concave up, even though $g''(x) = 12x^2$ is zero at $x = 0$. This is
an example of a critical point of $g'$ (that is, a place where $g''$ is zero) that didn’t turn out to be an
inflection point of $g$ (because the concavity ended up staying the same on both sides).
The graphs of $f(x) = 1/x$ at left and $g(x) = x^4$ at right. Neither has an inflection point, for
different reasons.
different reasons.
A good example to know is the family of power functions. Let $p$ be a real number and consider
\[ f(x) = x^p. \]
We compute
\[ f(x) = x^p, \quad f'(x) = px^{p-1}, \quad f''(x) = p(p - 1)x^{p-2}. \]
Let’s focus on what they look like when $x > 0$ (because this is in the common domain of all of
these functions). We make a table:
We can sketch the three kinds of shapes (for the part with x > 0).
Graphs of power functions in the first quadrant: case $p < 0$ at left ($x^{-1}$); case $0 < p < 1$ in the
middle ($x^{1/3}$); case $p > 1$ at right ($x^3$).
MAT 1330 : Fall 2020 6.3. GRAPHING FUNCTIONS
Our knowledge of limits and derivatives lets us graph functions by identifying the most important
features, rather than by plotting points and connecting the dots. This is important for two reasons:
(a) when you plot points and connect the dots, you may easily miss key features of the graph that
would completely change a cobwebbing, for example;
(b) when you consider functions of more than one variable next term, you definitely can’t plot
points anymore, because the graph is 3D (or more!).
Here are the things to look for when you are graphing a function:
2. The limit of $f$ as $x$ approaches a point not in the domain (e.g. asymptotes) as well as $\lim_{x\to\infty} f(x)$
and $\lim_{x\to-\infty} f(x)$;
3. The derivative;
4. The critical points of $f$ (i.e. where $f'(x) = 0$ or undefined) and the intervals of increase or
decrease (i.e. the sign of $f'$ between critical points);
6. Where $f''(x) = 0$ or undefined, and the sign of $f''$ between these points: to tell us concavity
of $f$, and to identify any inflection points of the graph.
7. Consistency! Make sure these clues all fit together and make sense. (Otherwise: check your
work.)
Example 6.3.1. We saw that the updating function for one kind of limited population model was
\[ f(x) = \frac{2x}{1 + x}. \]
Graph this function.
3. $f'(x) = \frac{(1 + x)(2) - 2x(1)}{(1 + x)^2} = \frac{2}{(1 + x)^2}$
4. $f'(x) > 0$ for all $x$ at which it’s defined, so $f$ is increasing on each interval of its domain and
there are no critical points;
6. If $x < -1$ then $(1 + x)^3 < 0$ so $f''(x) > 0$ and the function is concave up;
if $x > -1$ then $(1 + x)^3 > 0$ so $f''(x) < 0$ and the function is concave down;
7. Since it never reaches a local extremum, but goes to a horizontal asymptote, it must approach
the asymptote from one side. We sketch the graph by drawing some dotted lines for the
horizontal asymptotes at each side (not shown: just draw them at the extremes of the graph,
since the graph can sometimes cross a horizontal asymptote) and also at (0, 0) (the point we
found in part 1.) then little arrows for the 4 limits; then connect them, respecting concavity:
We then look back at our analysis and check: yes, increasing on each interval of the domain.
No critical points, inflection points or extrema; just some asymptotes.
Note. The key thing is that all the clues you collect offer multiple confirmations of the same overall
picture; if something seems impossible to draw: check your derivatives!!
Example 6.3.2. The updating function for a population showing the Allee effect was
\[ f(x) = \frac{4x^2}{1 + x^2} \]
Let’s graph it.
1. Domain $\mathbb{R}$; the only zero is at $x = 0$
2. $\lim_{x\to\pm\infty} \frac{4x^2}{1 + x^2} = 4$ (divide by highest power)
3.
\[ f'(x) = \frac{(1 + x^2)(8x) - 4x^2(2x)}{(1 + x^2)^2} = \frac{8x}{(1 + x^2)^2} \]
4. The only critical point is 0; $f'(x) < 0$ when $x < 0$, so $f$ is decreasing there; but $f'(x) > 0$ for
$x > 0$, so $f$ is increasing there;
\[ f''(x) = 8(1 + x^2)^{-2} - 16x(1 + x^2)^{-3}(2x) = \big(8(1 + x^2) - 32x^2\big)(1 + x^2)^{-3} = \frac{8 - 24x^2}{(1 + x^2)^3} \]
x           (−∞, −1/√3)     −1/√3 ≈ −0.58    (−1/√3, 1/√3)    1/√3 ≈ 0.58    (1/√3, ∞)
f''(x):     −               0                +                0              −
f(x) is     concave down    1                concave up       1              concave down
7. Putting this together : we plot our 3 points and 2 limits and connect the dots (with the
concavity as indicated) to yield
Again, we confirm our result by comparing with our analysis of the first derivative. There
are two inflection points and one local and global minimum.
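The inflection points of the Allee updating function can be confirmed by checking that $f''$ changes sign at $x = \pm 1/\sqrt{3}$ and that $f$ equals 1 there. In this sketch the test points $-1$, $0$ and $1$ are arbitrary representatives of the three intervals.

```python
import math

def f(x):
    return 4 * x * x / (1 + x * x)

def f_second(x):
    # Second derivative found above: (8 - 24 x^2) / (1 + x^2)^3
    return (8 - 24 * x * x) / (1 + x * x) ** 3

c = 1 / math.sqrt(3)  # candidate inflection points are at -c and c

for x in (-1.0, -c, 0.0, c, 1.0):
    print(f"x = {x:+.4f}: f(x) = {f(x):.4f}, f''(x) = {f_second(x):+.4f}")
```

The sign pattern of $f''$ across the five points should read negative, zero, positive, zero, negative, matching the concavity table.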
We’ve written it as seven steps above, but another way to think about it: we glean information
from the formula for f , then from f 0 , and then from f 00 , and put it all together.
Example 6.3.3. Sketch the graph of $f(x) = \dfrac{\ln(x)}{x}$.
Solution:
domain is x > 0.
x-intercept is ln(x) = 0 or x = 1.
$\lim_{x \to 0^+} \dfrac{\ln(x)}{x} = -\infty$ (vertical asymptote)
$\lim_{x \to \infty} \dfrac{\ln(x)}{x} = 0$ (as we’ll see in Example 6.6.6) (horizontal asymptote)
conclude that there’s a local maximum at $x = e$ (with coordinates $(e, 1/e) \approx (2.72, 0.37)$),
which must in fact be the absolute maximum since there are no gaps in the domain, and $f$
never bounces again.
when $x < e^{3/2}$, $f''(x) < 0$ so $f$ is concave down there; good, because our critical point is
on this interval and we’d said it was a maximum!
conclude that we change concavity at $(e^{1.5}, 1.5e^{-1.5}) \approx (4.48, 0.33)$, so this is an inflection
point
Putting these clues together gives the graph, and our sketch will be quite accurate.
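As a quick numerical cross-check of these clues, here is a minimal Python sketch. It assumes the derivatives $f'(x) = (1 - \ln x)/x^2$ and $f''(x) = (2\ln x - 3)/x^3$, which come from the quotient rule and are our own computation rather than formulas displayed in this part of the notes; the function names are ours too.

```python
import math

def f(x):
    return math.log(x) / x

def f1(x):
    # assumed first derivative of ln(x)/x (quotient rule)
    return (1 - math.log(x)) / x**2

def f2(x):
    # assumed second derivative of ln(x)/x
    return (2 * math.log(x) - 3) / x**3

crit = math.e            # candidate critical point, where f1 vanishes
infl = math.exp(1.5)     # candidate inflection point, where f2 vanishes

print(round(crit, 2), round(f(crit), 2))   # coordinates of the maximum
print(round(infl, 2), round(f(infl), 2))   # coordinates of the inflection point
print(f2(2.0) < 0, f2(10.0) > 0)           # concave down before, concave up after
```

If the printed coordinates did not match the marked points on the sketch, that would be a signal to recheck the derivatives.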
MAT 1330 : Fall 2020 6.4. EXTREMA
Graph of $y = \ln(x)/x$. The intercepts, critical point and point of inflection are marked. Notice
the properties of this graph are consistent with all clues from the function and its derivatives.
End of lecture # 11
6.4 Extrema
When a function f comes up in an application, we are particularly interested in its extrema. For
example, if f is the function describing the valuation of a certain stock, then you’re interested in
its highs and lows: they tell you about the volatility and about the range over a period of time.
Definition 6.4.1.
3. A local maximum or a local minimum is also called a local extremum (plural: local extrema).
So a local maximum (respectively, minimum) is a part of the curve that’s not an endpoint, and
where if you zoom in close enough, f (c) is the largest (respectively, the smallest) y-value of your
function in that neighbourhood. It’s an interesting feature of the graph.
Definition 6.4.2.
1. A function f attains a global or absolute maximum at a point c in its domain if f (x) ≤ f (c)
for all x in the domain of f .
3. A global maximum or global minimum is called a global extremum (plural: global extrema).
So a global maximum or minimum is the very largest or smallest y-value of the graph.
Here are some examples to help us think about the difference between “local” and “global” and
also about how not every graph will have local or global extrema.
Example 6.4.3. Local extrema but no global extrema. Consider
Note. Summary: Not every function attains a global maximum or global minimum, and even if it
does, the x-value where the extremum is attained need not be unique. Not every local extremum
is a global extremum; not every global extremum is a local extremum.
Considering these examples, we notice that the local extrema, when there were any, occurred at
the critical points of $f$ (Definition 6.1.1: a critical point is a number $c$ in the domain of $f$ such
that either $f'(c) = 0$ or $f'(c)$ is undefined). In fact, local extrema can only occur at critical points.
This is a theorem due to Fermat².
Theorem 6.4.10. If $f$ has a local extremum at $c$, and $f'(c)$ exists, then $f'(c) = 0$.
Equivalently:
If f is a continuous function, then its local extrema, if any, can only occur at critical points
(but not all critical points give local extrema).
Please note: Fermat’s theorem only goes one way. That is, just because $f'(c) = 0$ you cannot
deduce that $c$ gives a local extremum. Think of $f(x) = x^3$, which has a critical point at 0 but no
extremum there.
Knowing that local extrema occur at critical points, we can now use what we learned in previous
sections to classify all local extrema (without needing to know the graph).
Proposition 6.4.11 (First Derivative Test). Suppose $c$ is a critical point of $f$ and $f$ is continuous
at $c$. Then, on a small interval on both sides of $c$:
if $f'(x) < 0$ for $x < c$ and $f'(x) > 0$ for $x > c$, then $(c, f(c))$ is a local minimum of $f$;
if $f'(x) > 0$ for $x < c$ and $f'(x) < 0$ for $x > c$, then $(c, f(c))$ is a local maximum of $f$.
This just says that you increase to a maximum and then decrease afterwards, and vice-versa for a
minimum.
Example 6.4.12. We saw in Example 6.1.6 that $x = e^{-1}$ is a critical point of the function $f(x) =
x\ln(x)$, since $f'(x) = \ln(x) + 1$ is zero there. When $x \in (0, e^{-1})$, we have $f'(x) < 0$ and when $x > e^{-1}$
we have $f'(x) > 0$; therefore, by the first derivative test, the point $(e^{-1}, f(e^{-1})) = (e^{-1}, -e^{-1})$ is a
local minimum.
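The sign check in the first derivative test can be sketched in a few lines of Python. This is only a numerical heuristic under our own assumptions: the helper name `first_derivative_test` and the step size `eps` are hypothetical, and sampling $f'$ at one point on each side is not a proof that the sign is constant on a whole interval.

```python
import math

def first_derivative_test(df, c, eps=1e-3):
    """Classify the critical point c from the sign of df just left and right of c."""
    left, right = df(c - eps), df(c + eps)
    if left < 0 < right:
        return "local minimum"
    if left > 0 > right:
        return "local maximum"
    return "no conclusion"

# f(x) = x ln(x) has derivative ln(x) + 1 and critical point e^(-1)
df = lambda x: math.log(x) + 1
c = math.exp(-1)
result = first_derivative_test(df, c)
print(result)

# f(x) = x^3 has derivative 3x^2: the sign does not change at 0
print(first_derivative_test(lambda x: 3 * x**2, 0.0))
```

The second call illustrates the warning after Fermat’s theorem: a critical point with no sign change in $f'$ is not classified as an extremum.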
Thinking about concavity gives us a second test that is simpler, but please note that sometimes it
doesn’t apply or it gives no answer:
² Pierre Fermat had a lot of theorems; his most famous was called Fermat’s Last Theorem, about solutions to
equations like $x^3 + y^3 = z^3$. He’d written the theorem in the margin of a book in 1637, with a little note to the effect
that he had a “marvelous proof, but the margin is too small to contain it.” No one ever found the proof, but Andrew
Wiles famously finally proved the theorem by other means in 1995.
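Proposition 6.4.13 (Second Derivative Test). Suppose $c$ is a critical point of $f$ such that $f'(c) = 0$.
if $f''(c) > 0$, then $(c, f(c))$ is a local minimum of $f$;
if $f''(c) < 0$, then $(c, f(c))$ is a local maximum of $f$;
if $f''(c) = 0$, or $f''(c)$ does not exist, then the test gives no information.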
This is saying that if your curve is concave up at a critical point, then it must be a local minimum;
and if it is concave down, then it must be a local maximum. Nice!
Example 6.4.14. Find and classify the local extrema of $f(x) = x^2e^{-x}$.
Solution: We have
$$f'(x) = 2xe^{-x} + x^2e^{-x}(-1) = (2x - x^2)e^{-x},$$
which gives us two critical points: $x = 0$ and $x = 2$. To use the second derivative test, we first
compute the second derivative:
$$f''(x) = (2 - 2x)e^{-x} + (2x - x^2)e^{-x}(-1) = (x^2 - 4x + 2)e^{-x}.$$
Now we put the critical points into $f''$. Since $f''(0) = 2 > 0$, $x = 0$ gives a local minimum. Since
$f''(2) = (4 - 8 + 2)e^{-2} = -2e^{-2} < 0$, $x = 2$ gives a local maximum. (Compare with Example 6.1.5, using
the First Derivative Test.)
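To see the second derivative test in action, here is a small Python sketch that evaluates $f''$ from this example at the two critical points. The helper name `second_derivative_test` is ours, and the last call (with $f(x) = x^4$, so $f''(x) = 12x^2$) is an extra illustration of the case where the test is silent.

```python
import math

def d2f(x):
    # f''(x) = (x^2 - 4x + 2) e^(-x), as computed in the example
    return (x**2 - 4 * x + 2) * math.exp(-x)

def second_derivative_test(d2, c):
    """Classify a critical point c with f'(c) = 0 from the sign of f''(c)."""
    value = d2(c)
    if value > 0:
        return "local minimum"
    if value < 0:
        return "local maximum"
    return "inconclusive"

for c in (0, 2):
    print(c, d2f(c), second_derivative_test(d2f, c))

# f(x) = x^4: f''(0) = 0, so the test says nothing
print(second_derivative_test(lambda x: 12 * x**2, 0))
```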
Example 6.4.15. Consider $f(x) = x^2$. Then $f'(x) = 2x$ so 0 is a critical point; since $f''(x) = 2 > 0$,
by the second derivative test, 0 is a local minimum.
Example 6.4.16. Consider $f(x) = sx^p$ where $s \neq 0$ is a real number and $p > 2$ is an integer.
Then the domain of $f$ is $\mathbb{R}$ and $f'(x) = spx^{p-1}$ so 0 is a critical point; but $f''(x) = sp(p - 1)x^{p-2}$ so
$f''(0) = 0$, which tells us nothing, according to Proposition 6.4.13. In fact, anything can happen:
$s = 1$, $p = 3$ gives $f(x) = x^3$, which does not have a local extremum at $(0, 0)$;
$s = 1$, $p = 4$ gives $f(x) = x^4$, which has a local minimum at $(0, 0)$;
$s = -1$, $p = 4$ gives $f(x) = -x^4$, which has a local maximum at $(0, 0)$.
So: the second derivative test fails; we have to use the first derivative test for these functions.
Example 6.4.18. The function $f(x) = x(x + 1)(x - 1) = x^3 - x$ has $f'(x) = 3x^2 - 1$, so it has two critical
points where $f'(x) = 0$, that is, $3x^2 = 1$ or $x = \pm\frac{1}{\sqrt{3}}$. We have $f''(x) = 6x$ so by the second derivative
test, $x = 1/\sqrt{3}$ gives a local minimum and $x = -1/\sqrt{3}$ gives a local maximum. This is consistent
with our sketch of this cubic function (see Example 6.4.3).
So $y = x(x - 1)(x + 1)$ has a local maximum at $x = -\frac{1}{\sqrt{3}}$, where $f(x) = \frac{1}{\sqrt{3}} - \frac{1}{3\sqrt{3}} = \frac{2}{3\sqrt{3}} \approx 0.385$. But
you can find many points $x$ such that $f(x) > 0.385$, like $x = 10$ for which $f(10) = 990$, so this
local maximum is not a global maximum.
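The comparison between the local maximum value and a far-away value is easy to check by direct evaluation. The following Python lines are just that arithmetic, with variable names of our own choosing.

```python
import math

def f(x):
    return x**3 - x

c = -1 / math.sqrt(3)          # location of the local maximum
local_max_value = f(c)

print(round(local_max_value, 3))   # value of f at the local maximum
print(f(10))                       # a much larger value elsewhere on the graph
```

Evaluating $f$ at the critical point gives $2/(3\sqrt{3})$, roughly 0.385, which is far below $f(10)$.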
Example 6.4.19. The function $f(x) = \sqrt{x}$ is defined only on $x \geq 0$. We have that $f'(x) = \dfrac{1}{2\sqrt{x}}$,
which is not defined at 0, so 0 is a critical point. Since $f'(0)$ is undefined, the second derivative test
does not apply. Since $f'(x)$ is not defined for $x < 0$, the first derivative test does not apply either.
Is this a problem? No: $(0, 0)$ can’t be a local extremum because 0 is an endpoint of the domain:
the function is not defined on both sides of $c = 0$. (As it happens, $(0, 0)$ is the global minimum of
$f$.)
In the previous section, we determined that local extrema can only occur at critical points, and
learned two tests to classify them. In this section we classify the global extrema.
The first question to ask: When do global (or absolute) extrema exist? Well, there is one case,
which is very common in practice, where we are in fact guaranteed that both an absolute maximum
and an absolute minimum exist.
Theorem 6.4.20 (Extreme Value Theorem). Suppose that f is a continuous function. Then for
any closed interval [a, b] in the domain of f , f attains both a global maximum and a global minimum
on [a, b].
Example 6.4.21. The function $f(x) = x^2$ attains a global maximum and a global minimum on
any interval of the form $[a, b]$, by the Extreme Value Theorem. For example,
if $[a, b] = [-10, 15]$ then the global maximum is at $(15, 225)$ and the global minimum at $(0, 0)$,
because these are points on the graph and $0 \leq f(x) \leq 225$ for all $x \in [-10, 15]$;
if $[a, b] = [-2, -1]$ then the global maximum is at $(-2, 4)$ and the global minimum at $(-1, 1)$,
because these are points on the graph and $1 \leq f(x) \leq 4$ for all $x \in [-2, -1]$.
What the theorem amounts to saying is that if you start at a point $(a, f(a))$ and draw the graph of
a continuous function (with no gaps or breaks) until $(b, f(b))$, then you necessarily hit a maximum
and a minimum value: no asymptotes, no jumps, no question about “close but not quite”.
This seems kind of obvious! But it’s helpful to consider why we needed “continuous” and “closed
interval” to make the theorem work, by looking back on our examples.
It is defined on a closed interval [0, 3] but is not continuous so the Extreme Value Theorem does
not apply. In fact, −1 < f (x) < 1 on this interval and we can get arbitrarily close to these extremes
— but there is no value of x for which f (x) = 1 or f (x) = −1 so f does not attain an absolute
max or an absolute min on [0, 3].
Example 6.4.23. Consider f (x) = 2x on the interval (1, 3). This interval is not closed so the
Extreme Value Theorem does not apply. In fact, we can get arbitrarily close to 2 and to 6, but
there is no value of x such that f (x) = 2 or f (x) = 6, so f does not attain its max or min.
This failure to attain a maximum or a minimum can also happen when the interval is an open
unbounded interval.
Example 6.4.24. Consider $f(x) = 1/x$. It is defined and continuous on the open unbounded
interval (0, ∞) but is always strictly decreasing there, and never attains either a global maximum
or a global minimum.
The Extreme Value Theorem doesn’t say that if f is discontinuous, or if f is defined on some other
kind of domain, that it doesn’t have global extrema. It’s just removing the guarantee in that case.
So those were some examples of where the Extreme Value Theorem doesn’t apply. Let’s now see
how to use it when it does apply.
We can’t evaluate the function at all of the points in its domain; there are infinitely many. But we
can use Calculus to reduce the question to consideration of a few points.
Note. Principle: If f is continuous on a closed interval, then its absolute maximum and absolute
minimum must occur at critical points or endpoints of the interval.
2. Evaluate f at each boundary point, that is, calculate f (a) and f (b).
3. The largest is the global maximum of f , and the smallest is the global minimum of f .
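The principle above (compare the values of $f$ at the critical points inside the interval and at the two endpoints) can be sketched in Python as follows. The function name and its signature are our own, and the sketch assumes that $f$ is continuous on $[a, b]$ and that the caller supplies the complete list of critical points; it does not find them.

```python
def global_extrema_on_closed_interval(f, a, b, critical_points):
    """Compare f at the endpoints and at the critical points inside (a, b)."""
    candidates = [a, b] + [c for c in critical_points if a < c < b]
    values = [(f(x), x) for x in candidates]
    return max(values), min(values)   # each result is a (value, location) pair

# f(x) = x^2 has a single critical point, x = 0
square = lambda x: x**2
(max_val, max_x), (min_val, min_x) = global_extrema_on_closed_interval(square, -10, 15, [0])
print(max_x, max_val, min_x, min_val)

# on [-2, -1] the critical point lies outside the interval, so only endpoints matter
print(global_extrema_on_closed_interval(square, -2, -1, [0]))
```

The two calls reproduce the intervals of Example 6.4.21.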
As a perk: in the process, you are sometimes able to deduce what critical points are local extrema
(not always, but often).
Example 6.4.25. Consider f (x) = |x| on the interval −1 ≤ x ≤ 2. Find its global maximum and
minimum on this domain.
Solution:
3. Comparing these values, and knowing that the function is continuous between these points,
we conclude that f attains a global max at x = 2 and a global min at x = 0.
We can also therefore deduce that there is a local minimum at 0 but no local maximum.
Now, suppose f is not continuous, or the interval is not closed. Then there is no guarantee that a
global maximum or minimum exists. Nevertheless, the method is similar:
3. Evaluate the left and right hand limits (as appropriate) as x approaches the boundary (if the
boundary is not in the domain) or the points of discontinuity.
4. If there is a largest value (that is, you didn’t get ∞ as a limit) and that value is actually a
point on the curve (c, f (c)) (that is, it’s not just a limit you never reach), then you have a
global maximum.
5. If there is a smallest value (that is, you didn’t get −∞ as one of your limits) and that value
is actually attained for some point on the graph (c, f (c)) (that is, it’s not just a limit you
never reach), then you have a global minimum.
Example 6.4.26. Let $f(x) = \sqrt{x}\,e^{-x}$. Find all local and global maxima and minima.
3. In this case, we don’t necessarily know $\lim_{x \to \infty} \sqrt{x}\,e^{-x}$ (wait for L’Hôpital’s rule in Section 6.6),
so we reason otherwise: when $x > 1/2$ we have $f'(x) < 0$ so $f$ decreases after $x = \frac{1}{2}$, but
from the formula we deduce that $f(x) \geq 0$ for all $x \geq 0$. So the limit is between 0 and $\frac{1}{2}$ (in
fact, it’s 0).
5. Also, f has a global minimum at (0, 0). It has no local minima, since the minimum is at an
endpoint.
Example 6.4.27. Consider $f(x) = \dfrac{\sqrt{x}}{1 + x}$. Find its local and global extrema.
Solution: The domain of definition is [0, ∞).
1. The derivative is
$$f'(x) = \frac{1}{(1 + x)^2}\left(\frac{1}{2\sqrt{x}}(1 + x) - \sqrt{x}\right) = \frac{1 - x}{2\sqrt{x}(1 + x)^2},$$
so the critical points are $x = -1$ (not in the domain), $x = 0$ (an endpoint) and $x = 1$ (where
$f'(x) = 0$). We compute $f(1) = \frac{1}{2}$.
4. Comparing values, we see that f has a global max at x = 1 and a global min at x = 0.
We also infer that f has a local max at 1 but no local min. We could also infer this from other
tests, as follows.
Note that we could apply the first derivative test to the critical point 1, because calculating $f''(x)$
looks hard (making the second derivative test unappealing). In fact, we can reason as follows:
$$f'(x) > 0 \iff \frac{1 + x}{2\sqrt{x}} > \sqrt{x}$$
and since $\sqrt{x} > 0$, we can safely multiply both sides by $2\sqrt{x}$ to get
$$\frac{1 + x}{2\sqrt{x}} > \sqrt{x} \iff 1 + x > 2x \iff x < 1.$$
Thus f is increasing before x = 1 and decreasing after, so this is a local max. In fact it must also
be a global max.
MAT 1330 : Fall 2020 6.5. OPTIMIZATION
These examples came out nicely; answering about the global extrema told us about the local
extrema as well. So effectively there are several methods, not all perfect, that you can use to verify
if a critical point is a local extremum:
Note. All methods are acceptable but you must share your logical reasoning — explain why a
point is irrefutably a local (or global) maximum or minimum (with reference to a test, the Extreme
Value Theorem, or how the function must behave, given that you have determined all of the critical
points etc.).
End of lecture # 12
6.5 Optimization
Finding extreme values of functions is the goal of optimization. Respecting our constraints and our
goals, we want the highest yield, the minimum cost, the highest temperature or the lowest dosage.
These are all the maximum or the minimum of a function.
The yield of crop in agriculture changes with the amount of fertilizer (for example, nitrogen)
applied. When nitrogen levels in the soil are low, then adding some nitrogen will greatly increase
yield. When nitrogen levels are already very high, however, adding more might decrease yield.
Assume that yield $Y$ as a function of the amount of nitrogen in the soil $N$ is given by the equation
$$Y(N) = \frac{N}{1 + N^2}.$$
What is the optimal level of nitrogen in the soil?
Solution. We want to choose N so as to maximize Y , so we are looking for the absolute maximum
value of the function Y (N ).
We compute
$$Y'(N) = \frac{(1 + N^2) - N(2N)}{(1 + N^2)^2} = \frac{1 - N^2}{(1 + N^2)^2},$$
so the only critical points are where $1 - N^2 = 0$ or $N = \pm 1$. Since $N$ is the amount of nitrogen,
$N \geq 0$, and there is only one critical point to consider. Since $Y'(N) > 0$ if $0 \leq N < 1$ and
$Y'(N) < 0$ if $N > 1$, we conclude that $N = 1$ gives a local and global maximum.
Thus the maximum yield is $Y(1) = \frac{1}{2}$ and it occurs at $N = 1$.
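A brute-force check of this answer is to evaluate $Y$ on a fine grid and pick the largest value. The Python sketch below does this; the grid range (0 to 10) and spacing (0.001) are arbitrary choices of ours, and a grid search only supports the calculus argument, it does not replace it.

```python
def Y(N):
    return N / (1 + N**2)

def dY(N):
    # derivative computed above: (1 - N^2) / (1 + N^2)^2
    return (1 - N**2) / (1 + N**2)**2

grid = [i / 1000 for i in range(0, 10001)]   # N from 0 to 10 in steps of 0.001
best_N = max(grid, key=Y)

print(best_N, Y(best_N))
print(dY(0.5) > 0, dY(2.0) < 0)   # increasing before N = 1, decreasing after
```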
Minimize the material used to produce a cylindrical can of a fixed volume $V = 355\ \text{cm}^3$.
Solution. We draw a picture, and introduce some variables by labeling the important parts of the
picture. We obtain equations by relating the variables. Once we have thus translated the question
into math, we can decide what we need to optimize.
Denote by r the radius of the bottom of the can and by h its height, in cm.
Then the volume is
$$V = \pi r^2 h = 355\ \text{cm}^3 \qquad (6.1)$$
and the surface area is
$$A = 2\pi rh + 2\pi r^2. \qquad (6.2)$$
So the question says we want to minimize $A$, but right now this is a function of two variables, $r$
and $h$ (which is bad, because this is a one-variable Calculus course). We need to use the information
that $V$ is a fixed value (355); this constraint allows us to solve for $h$ in terms of $r$. Namely, from
(6.1)
$$h = \frac{355}{\pi r^2}$$
so that we rewrite (6.2) as
$$A = 2\pi r \cdot \frac{355}{\pi r^2} + 2\pi r^2 = \frac{710}{r} + 2\pi r^2$$
which expresses $A$ as a function of one variable, $r$. Its derivative is
$$A' = -710r^{-2} + 4\pi r$$
whose zeros are where $4\pi r = 710r^{-2}$, that is, $r^3 = \frac{355}{2\pi}$. This has only one root, at about $r \approx 3.8372$ cm.
Caution: We chose more than 3 significant figures at this point because we will need
our final answer to be accurate to the same precision as the data given: 3 significant
figures, and round-off error comes in whenever we use estimates.
Since $A'' = 1420r^{-3} + 4\pi > 0$ for all $r > 0$, this function is always concave up, and so our critical
point is a local and global minimum. The dimensions of the cylinder having volume $355\ \text{cm}^3$
and with minimal surface area are thus $r = 3.84$ and $h = 7.67$ (cm); the minimal surface area is
$A = 278\ \text{cm}^2$. (Note all these answers were only rounded to 3 significant digits at the last step.)
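Because round-off matters here, it is worth carrying full precision through the computation and rounding only at the end. The Python sketch below does so, using the formulas for $r$, $h$ and $A$ from this example; the variable names are ours.

```python
import math

V = 355.0                                   # volume in cm^3
r = (V / (2 * math.pi)) ** (1 / 3)          # from r^3 = 355 / (2 pi)
h = V / (math.pi * r**2)                    # from the volume constraint (6.1)
A = 2 * math.pi * r * h + 2 * math.pi * r**2

print(f"r = {r:.4f} cm, h = {h:.4f} cm, A = {A:.3f} cm^2")
print(abs(h - 2 * r) < 1e-9)                # the optimal can has height = diameter
```

Computing $h$ from the unrounded $r$, rather than doubling a rounded radius, is what keeps the third significant digit trustworthy.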
6.5.3 Distances
Find the distance of the line y = 1 + 2x from the origin and find the point on the line that is closest
to the origin.
Solution. Draw a picture, assign variable names and decide what we are trying to minimize.
Here, a point on the line is $(x, y)$, and its distance to the origin is $d = \sqrt{x^2 + y^2}$.
Again, there are 3 variables and we need to cut it down to two; again, there is an equation relating
x and y, namely y = 1 + 2x.
In this case: minimizing the distance is equivalent to minimizing the square of the distance, since
the square root function is an increasing function. So let’s minimize
$$D = x^2 + y^2 = x^2 + (1 + 2x)^2$$
instead, because it is easier. We have $D' = 2x + 2(1 + 2x)(2) = 10x + 4$, which has a unique critical
point at $x = -\frac{2}{5}$. Since $D'' = 10 > 0$, the function is concave up everywhere, so this is a local
minimum and must also be a global minimum (by concavity).
So the minimum distance occurs when $x = -\frac{2}{5}$, and thus $y = \frac{1}{5}$; the distance is
$$d = \sqrt{\left(-\frac{2}{5}\right)^2 + \left(\frac{1}{5}\right)^2} = \frac{1}{\sqrt{5}}.$$
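The closest point can be double-checked numerically. This Python sketch evaluates the squared distance $D$ at the critical point and at two nearby points; the offset 0.01 is an arbitrary choice of ours.

```python
import math

def D(x):
    # squared distance from the point (x, 1 + 2x) on the line to the origin
    return x**2 + (1 + 2 * x)**2

x_star = -4 / 10                  # solves D'(x) = 10x + 4 = 0
y_star = 1 + 2 * x_star
d_min = math.sqrt(D(x_star))

print(x_star, y_star, d_min)
print(D(x_star) < D(x_star - 0.01), D(x_star) < D(x_star + 0.01))
```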
Exercise 6.5.1. In the setting of the example above, minimize instead the distance function
$$d(x) = \sqrt{x^2 + (1 + 2x)^2} = \sqrt{5x^2 + 4x + 1}$$
(rather than the square of the distance function) and verify that you get the exact same answer
(with a touch more work).
Assume that a population grows logistically and is being harvested regularly. In this case, we model
“harvesting” by saying that at each time t, a certain fraction h of the population is removed. (Thus,
a higher population yields a higher harvest, but a smaller population yields a smaller harvest.)
Actually (to make the final formula simpler), the way we determine harvest is relative to last
month’s population: We start with population $x_t$, we let it grow over the course of a month, then
harvest a total of $hx_t$ individuals (instead of $hx_{t+1}$).³ This gives the DTDS
$$x_{t+1} = 2.5x_t(1 - x_t) - hx_t$$
where $h > 0$ is a parameter which denotes the intensity of harvesting. Let us assume that this
DTDS has a stable positive steady state $x^*$. Then in the long term, the yield of the harvest is
$$Y(h) = hx^*.$$
Solution. We notice that our function Y (h) depends on h and on x∗ , but not on xt or t. So in fact,
the first step is to find the equilibria of the DTDS and decide what x∗ is.
Recall that $x^*$ is an equilibrium of a DTDS with updating function $f$ if $x^* = f(x^*)$. In this case,
we have updating function
$$f(x) = 2.5x(1 - x) - hx$$
so $f(x) = x$ means $2.5x(1 - x) - hx = x$, that is, $x(1.5 - h - 2.5x) = 0$, which gives $x^* = 0$ or $x^* = \frac{1}{5}(3 - 2h)$.
But wait: $\frac{1}{5}(3 - 2h)$ will be negative if $3 - 2h < 0$, meaning $h > 3/2$. That doesn’t make sense: so
we have to restrict our values of $h$ to where the equilibrium is positive (so actually exists) and also
where $h \geq 0$. So the domain we are considering is
$$0 \leq h \leq 1.5.$$
Good. For this range of values of $h$ we now have a function for $Y(h)$ just in terms of $h$:
$$Y(h) = hx^* = h\,\frac{3 - 2h}{5} = \frac{1}{5}(3h - 2h^2).$$
We want the maximum value of this function on the domain [0, 1.5], which exists by the Extreme
Value Theorem.
Now in fact this is a parabola which is concave down, so has a unique global maximum at its critical
point. We compute $Y'(h) = \frac{3}{5} - \frac{4}{5}h$ so the unique critical point is where
$$\frac{3}{5} - \frac{4}{5}h = 0 \iff 4h = 3 \iff h = 3/4.$$
Since $0.75 \in [0, 1.5]$, this is in our domain; since it’s the global maximum of $Y$ on $\mathbb{R}$, it is for sure
the global maximum on our interval. Success!
Thus the optimal harvesting rate is $h = 0.75$ and the maximum steady state harvest will be
$$Y^* = Y\left(\frac{3}{4}\right) = \frac{3}{4} \cdot \frac{3 - \frac{3}{2}}{5} = \frac{9}{40}.$$
So for example if this is an annual harvest, then by harvesting 75% of the population each year
our population continues to grow to a positive steady state of x∗ = 0.3 (30% of the maximum
population possible given the limited resources, as per the logistic growth model) and we harvest
75% of this (which is 9/40).
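The whole chain (equilibrium, yield, optimal $h$) can be checked with a short simulation. The Python sketch below uses the updating function $f(x) = 2.5x(1 - x) - hx$ and the equilibrium $x^* = (3 - 2h)/5$ from this example; the grid of $h$ values, the starting population 0.5 and the 200 iterations are hypothetical choices of ours.

```python
def f(x, h):
    # updating function with proportional harvesting
    return 2.5 * x * (1 - x) - h * x

def x_star(h):
    # positive equilibrium (3 - 2h) / 5
    return (3 - 2 * h) / 5

def Y(h):
    # long-term yield h * x*
    return h * x_star(h)

grid = [i / 100 for i in range(0, 151)]   # h from 0 to 1.5 in steps of 0.01
h_best = max(grid, key=Y)

x = 0.5                                   # hypothetical starting population
for _ in range(200):
    x = f(x, h_best)                      # iterate the DTDS at the best harvest rate

print(h_best, Y(h_best), x)
```

The iteration settling near 0.3 is consistent with the assumption that the positive steady state is stable at this harvesting rate.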
Of course, we can add a harvesting component to any population model. For example, if we consider
a population growing according to a Ricker model $x_{t+1} = 2x_te^{1 - 0.4x_t}$, then if we harvest at a rate
of $h$ (meaning: we harvest $hx_t$ just before the next $(t + 1)$ population count), this gives us a new
model with harvesting of
$$x_{t+1} = 2x_te^{1 - 0.4x_t} - hx_t.$$
Exercise 6.5.2. (Challenge) You might ask: why didn’t we just set up a constant harvesting rate
and optimize from there? That is, define your DTDS with constant harvesting by $x_{t+1} = 2.5x_t(1 - x_t) - h$
for some parameter $h$, meaning you harvest the same amount, regardless of population.
Biologically speaking, this isn’t a good strategy, because you know that you should harvest less when
the population is small. Mathematically speaking, this is a more complex model that has to take into
account how intensely you can harvest before you drive the population to extinction, and the stability
of the fixed point is a huge concern.
Let’s start with a farmer wanting to build a rectangular yard for his sheep. The fence costs 20 dollars
per meter, and he wants to enclose $100\ \text{m}^2$. What should the dimensions be to minimize cost?
A rectangle, representing the fence and field. We label the sides a, b, a, b, which represent length
in meters.
So the perimeter of the fence is $2a + 2b$ and so the cost, in dollars, is $C = 20 \times (2a + 2b) = 40a + 40b$.
Now $C$ is currently a function of two variables, $a$ and $b$, so we are not ready yet. We look back
and see that we haven’t yet taken into account that the area should be $100\ \text{m}^2$. The area of the
rectangle is $ab$; so the equation is $ab = 100$ or $b = 100a^{-1}$.
That gives $C = 40a + 4000a^{-1}$. Excellent: cost as a function of the length of one side; let’s minimize.
The derivative is
$$C' = 40 - 4000a^{-2}$$
so $C' = 0$ when $4000a^{-2} = 40$ or $100 = a^2$ or $a = 10$ (taking the positive root, since $a$ is a length). The second derivative is
$$C'' = 8000a^{-3} > 0$$
so the function is concave up at $a = 10$ (and in fact on $(0, \infty)$) so this is a global minimum on the
domain $(0, \infty)$.
The minimum cost is $C(10) = 400 + 400 = 800$ dollars, attained with $a = b = 10$. The minimal perimeter is $P = 40$.
So the optimal shape for minimizing perimeter for fixed area (or: for maximizing area given the
perimeter) if you start with a rectangle is a square: the most symmetric one.
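As a sanity check on the fence problem, the following Python sketch evaluates the cost function $C(a) = 40a + 4000/a$ at the critical point and at nearby side lengths. The nearby values 9.9 and 10.1 are arbitrary choices of ours.

```python
def cost(a):
    # C(a) = 40a + 4000/a, after substituting b = 100/a
    return 40 * a + 4000 / a

a_opt = 10.0
b_opt = 100 / a_opt          # the other side, from ab = 100

print(a_opt, b_opt, cost(a_opt))
print(cost(9.9) > cost(a_opt), cost(10.1) > cost(a_opt))
```

Both sides come out equal, which is the square found above.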
Next question: consider the different regular polygons: equilateral triangle, square, regular pen-
tagon, regular hexagon, etc. Could the farmer do better? With a perimeter of P = 40, could he
get more area with a different shape? (See textbook.) Answer: yes, with a circle.
Neat fact: Bees literally deal with this problem, wanting to use the least amount of material to
build their honeycombs but have the most space for honey. But they don’t want just one cell; they
want to create dozens of them stacked together, so if there are gaps, they would waste space (and
material). The only regular polygons that tile the plane are triangles, squares and hexagons — so
bees use hexagons!
Optimal age of reproduction Semelparous organisms, like Pacific salmon, reproduce only once
in their lifetime, and then die. Typically, they can produce more female offspring as they get older,
which is an advantage for population growth. But if they wait too long, then they might die before
they reproduce.
To answer this, we need a mathematical model and we need to refine our question.
Fact:⁵ If we denote by $\ell(x)$ the probability that an individual lives to age $x$ and by $m(x)$ the
average number of female offspring of an individual at age $x$, then the average annual reproduction
is given by
$$r(x) = \frac{\ln(\ell(x)m(x))}{x}.$$
We⁶ want to maximize $r$ as a function of $x$.
Specific problem: Suppose that our semelparous organism is such that $\ell(x) = e^{-ax}$ and $m(x) = bx^c$,
for some positive constants $a, b, c$. Find the value of $x$ that maximizes $r$ as above.
Solution. We are given r as a function of x, so this question is purely an extreme value problem.
Let’s simplify $r$ before differentiating:
$$r(x) = \frac{1}{x}\left(\ln(\ell(x)) + \ln(m(x))\right) = \frac{1}{x}\left(\ln(e^{-ax}) + \ln(bx^c)\right) = \frac{1}{x}\left(-ax + \ln(b) + c\ln(x)\right)$$
so
$$r(x) = -a + \frac{\ln(b)}{x} + c\,\frac{\ln(x)}{x}.$$
So
$$r'(x) = -\ln(b)x^{-2} + c\,\frac{x \cdot \frac{1}{x} - \ln(x)}{x^2} = \frac{1}{x^2}\left(-\ln(b) + c - c\ln(x)\right) = \frac{1}{x^2}\left(c - \ln(bx^c)\right).$$
The critical points are $x = 0$ (technically not a critical point, since it’s not in the domain of $r$, but
it certainly is a critical value in this model) and where $c = \ln(bx^c)$. We solve this:
$$e^c = bx^c \iff x^c = \frac{e^c}{b} \iff x = eb^{-1/c}.$$
The critical point $x = 0$ is irrelevant, as it means a lifetime of length 0. The other critical point
is positive, since $b > 0$. Since $\ln$ is an increasing function, as is $x^c$, we deduce that $c - \ln(bx^c)$ is
decreasing, so goes from positive to negative. Thus $eb^{-1/c}$ is a local maximum (and in fact global
maximum).
We deduce that $x = eb^{-1/c}$ is a formula for the optimal age of reproduction for this species.
⁵ For example, see Vaupel JW, Missov TI, Metcalf CJE (2013) Optimal Semelparity. PLoS ONE 8(2): e57133.
https://doi.org/10.1371/journal.pone.0057133
⁶ Why isn’t it just $\ell(x)m(x)$? Because that’s just for one individual; we have to average over the whole population
size, so there are more individuals if they reproduce more often. This formula takes all that into account; see MAT2379
intro to biostatistics.
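To see the formula $x = eb^{-1/c}$ at work, we can compare it with a brute-force search. In the Python sketch below the constants $a = 0.1$, $b = 2$, $c = 3$ are hypothetical values chosen only for illustration (they do not come from the notes or from any data), and the grid of ages from 0.001 to 20 is likewise our own choice.

```python
import math

a, b, c = 0.1, 2.0, 3.0      # hypothetical positive constants, for illustration only

def r(x):
    # r(x) = (-a x + ln(b) + c ln(x)) / x, the simplified form derived above
    return (-a * x + math.log(b) + c * math.log(x)) / x

x_opt = math.e * b ** (-1 / c)                 # closed-form optimum e * b^(-1/c)
grid = [i / 1000 for i in range(1, 20001)]     # ages 0.001, 0.002, ..., 20
x_grid = max(grid, key=r)                      # best age found by brute force

print(x_opt, x_grid)
```

Note that $a$ does not appear in the formula for the optimum: it shifts $r$ by a constant without moving the maximum.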
Optimal clutch size. If an organism produces only few offspring, then each has a high probability
of survival; if there are many offspring then the survival probability individually declines7 .
Again, to get to a mathematical question, we need to convert this concept to an equation using a
mathematical model.
Let $R$ denote the total resources (per adult female) available for reproduction and $N$ the clutch
size. Then the amount of resources per offspring is $x = R/N$. Denote the survival probability of
an offspring having resources $x$ as $f(x)$. This function should be positive (between 0 and 1) and
non-decreasing (since more resources should not decrease survival). Then the expected number of
surviving offspring is
$$w(x) = Nf(x) = \frac{R}{x}f(x).$$
We want to maximize the number of offspring w and we have expressed this number as a function
of x, the amount of resources per offspring.
Solution. Since
$$w(x) = \frac{R}{x}f(x)$$
we have
$$w'(x) = R\,\frac{xf'(x) - f(x)}{x^2},$$
which gives critical points $x = 0$ (not relevant) and $xf'(x) = f(x)$.
If $f(x) = \dfrac{x^2}{x^2 + k^2}$, then
$$f'(x) = \frac{(x^2 + k^2)(2x) - x^2(2x)}{(x^2 + k^2)^2} = \frac{2xk^2}{(x^2 + k^2)^2}$$
so
$$xf'(x) = f(x) \iff \frac{2x^2k^2}{(x^2 + k^2)^2} = \frac{x^2}{x^2 + k^2} \iff 2k^2 = x^2 + k^2 \iff x^2 = k^2,$$
and the relevant (positive) critical point is $x = k$.
We compute $w''(x)$ to classify this critical point. To do so efficiently, let’s write
$$w'(x) = -\frac{R}{x^2}f(x) + \frac{R}{x}f'(x)$$
so that
$$w''(x) = \frac{2R}{x^3}f(x) - \frac{R}{x^2}f'(x) - \frac{R}{x^2}f'(x) + \frac{R}{x}f''(x) = \frac{2R}{x^3}\left(f(x) - xf'(x)\right) + \frac{R}{x}f''(x),$$
and at the critical point, the first term in this last expression is 0 since $xf'(x) = f(x)$. Great! So
the concavity at the critical point comes down to the sign of $\frac{R}{x}f''(x)$, which is the same as the
sign of $f''(x)$.
We compute
$$f''(x) = \frac{2k^2(k^2 - 3x^2)}{(x^2 + k^2)^3}$$
so that $f''(k) = -4k^4/(2k^2)^3 < 0$ and thus the critical point is a local maximum.
Since the function $w(x)$ is concave down at $x = k$, we deduce that $w$ is increasing before $k$ and
decreasing after $k$; and since there are no other critical points, this must therefore be a global
max.
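A numerical spot check of the clutch size result: with the survival function $f(x) = x^2/(x^2 + k^2)$, the number of surviving offspring $w(x) = \frac{R}{x}f(x)$ should peak at $x = k$. In this Python sketch the values $R = 10$ and $k = 2$ are hypothetical, chosen only so that the grid contains the expected optimum.

```python
R, k = 10.0, 2.0             # hypothetical values, for illustration only

def f(x):
    # survival probability of an offspring with resources x
    return x**2 / (x**2 + k**2)

def w(x):
    # expected number of surviving offspring, w(x) = (R / x) f(x)
    return R / x * f(x)

grid = [i / 1000 for i in range(1, 20001)]   # resources per offspring from 0.001 to 20
x_best = max(grid, key=w)

print(x_best, w(x_best), R / x_best)         # optimum, surviving offspring, clutch size N
```

The last printed value is the corresponding clutch size $N = R/x$.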
Optimize food intake by adjusting residence time. Consider a bee consuming nectar from
flowers. Suppose that it remains at each flower for a fixed amount of time before it travels to the
next flower. If that residence time is small, then the bee might leave valuable nectar behind. If it
is large, then it might be depleting all the nectar and getting less than if it went to look for the
next flower. What is the optimal residence time?
Approach: To answer this question, we need to know how much food the bee collects in t time
units while at one flower, measured from t = 0 its arrival at the flower. Let’s call this function
F (t).
So we collect some data and plot points and choose a curve that seems to fit the data and our
expectations well, and come up with
$$F(t) = \frac{t}{t + 0.5}.$$
But we’re not done yet. We have to take into account how long it takes the bee to go between
flowers — if the flowers are close, it’s negligible, but if the flowers are far apart, maybe bees should
spend more time where they are, right?
Thus we create a parameter: suppose now that the bee takes on average $d$ time units to fly to the
next flower. Then if it spent time $t$ at the flower and time $d$ flying, and gained $F(t)$ units of nectar
over that time, then the rate of nectar collection (amount of nectar per unit time) is
$$R(t) = \frac{F(t)}{t + d}.$$
The bee wants to maximize R as a function of t, the amount of time it spends at one flower.
Solution for the specific function $F(t) = \dfrac{t}{t + \frac{1}{2}}$. We have
$$R(t) = \frac{t}{(t + \frac{1}{2})(t + d)}$$
where $d$ is a positive constant parameter.
Thus
$$R'(t) = \frac{(t + \frac{1}{2})(t + d) - t(2t + d + \frac{1}{2})}{(t + \frac{1}{2})^2(t + d)^2} = \frac{t^2 + (d + \frac{1}{2})t + \frac{1}{2}d - 2t^2 - (d + \frac{1}{2})t}{(t + \frac{1}{2})^2(t + d)^2}$$
thus
$$R'(t) = \frac{\frac{1}{2}d - t^2}{(t + \frac{1}{2})^2(t + d)^2}.$$
This is undefined when $t = -\frac{1}{2}$ and $t = -d$ (neither of which is biologically relevant); and it is
zero when $t^2 = \frac{1}{2}d$ or $t = \pm\sqrt{d/2}$. Again, only the positive root is relevant.
Therefore there is only one biologically relevant critical point, $t = \sqrt{d/2}$. The denominator is positive;
$R'(t)$ is positive when $0 < t < \sqrt{d/2}$ and negative if $t > \sqrt{d/2}$, so this is a local (and in
fact, on this domain, global) maximum.
Conclusion: in this model, the bee should spend $\sqrt{d/2}$ time units at each flower to maximize its average
yield. For example, if $d = 50$ then $t = 5$; if $d = 2$ then $t = 1$. Funny: if the flowers are
closer together, it spends less time at each flower (but spends a larger percentage of its time on
flowers).
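The two numerical cases in the conclusion can be reproduced directly. This Python sketch evaluates the optimal residence time $\sqrt{d/2}$ and the fraction of time spent on flowers, $t/(t + d)$; the function names are ours.

```python
import math

def R(t, d):
    # average rate of nectar collection for F(t) = t / (t + 1/2)
    F = t / (t + 0.5)
    return F / (t + d)

def optimal_time(d):
    # positive critical point of R: t = sqrt(d / 2)
    return math.sqrt(d / 2)

for d in (50, 2):
    t = optimal_time(d)
    share_on_flower = t / (t + d)      # fraction of each cycle spent at the flower
    print(d, t, R(t, d), share_on_flower)
```

With travel time 50 the bee spends a smaller share of each cycle on the flower than with travel time 2, even though the stay itself is longer.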
What does it all mean? Sometimes, it’s helpful to work with a more general formula so we can
better see the patterns.
MAT 1330 : Fall 2020 6.6. L’HÔPITAL’S RULE FOR FINDING LIMITS
General Solution. So now just suppose that $F$ is increasing and concave down and $R(t) = F(t)/(t + d)$
for some constant parameter $d$. We differentiate $R$:
$$R'(t) = \frac{(t + d)F'(t) - F(t)}{(t + d)^2}.$$
This is zero when $(t + d)F'(t) = F(t)$ or $F'(t) = \dfrac{F(t)}{t + d} = R(t)$.
Wow: this means something! The critical point is where the average rate of nectar collection ($R(t)$)
is equal to the instantaneous rate of nectar collection ($F'(t)$) on the flower.
Since $F$ is concave down, $F'' < 0$, so the numerator $(t + d)F'(t) - F(t)$ has derivative
$(t + d)F''(t) < 0$ and is therefore decreasing. Hence the numerator is positive before the critical
point and negative after. That is, we conclude that $R' > 0$ before the critical point and $R' < 0$ after, meaning it is a local (and in
fact global) maximum.
The result is the marginal value theorem: the bee should leave the flower if the instantaneous food
intake falls below the average food intake.
End of lecture # 13
We studied limits, and continuous functions, back in Chapter 4, and used them to help analyse
the graphs and behaviour of functions in Section 6.3. But we did encounter some functions whose
limits we could not evaluate using existing algebraic methods. Calculus, to the rescue!
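For infinite limits, some arithmetic is valid; for example, let $c > 0$ be any finite number. Then
\[ \infty \pm c = \infty, \qquad (-1)\cdot\infty = -\infty, \]
\[ \infty + \infty = \infty, \qquad \infty \cdot \infty = \infty, \qquad c \cdot \infty = \infty, \]
\[ \frac{c}{0^+} = \infty, \qquad \frac{c}{0^-} = -\infty, \]
\[ \frac{c}{\infty} = 0, \qquad \frac{\infty}{c} = \infty, \]
\[ \frac{\infty}{0^+} = \infty, \qquad \frac{\infty}{0^-} = -\infty \]
(along with many other variations).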
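Example 6.6.1.
\[ \lim_{x\to 3^-} \frac{x+4}{x-3} = \text{“}\frac{7}{0^-}\text{”} = -\infty \]
\[ \lim_{x\to\infty} e^{2x}\left(\frac{1}{x} + 4\right) = \text{“}\infty\cdot 4\text{”} = \infty \]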
Note. But the following are examples of what are called indeterminate forms, and their value cannot be assessed without analyzing the functions involved:
\[ \frac{0}{0}, \qquad \frac{\infty}{\infty}, \qquad \infty - \infty, \qquad 0\cdot\infty. \]
Example 6.6.2.
\[ \lim_{x\to 2} \frac{x^2-4}{x-2} \]
is an indeterminate form of type 0/0; to find the limit we have to algebraically manipulate the function (here, simplify: $\frac{x^2-4}{x-2} = x+2$ for $x \neq 2$) to deduce the true value (which is 4). The point is that the indeterminate form holds no information about the value of the limit.
Earlier in the course, we talked about the limit, but then we jumped to the theorem that all our favourite functions are continuous, and in that case you can evaluate the limit
\[ \lim_{x\to a} f(x) \]
simply by plugging in $x = a$.
Example 6.6.3.
So therefore
\[ \lim_{x\to\pi/2^+} e^{\tan(x)} = 0. \]
Example 6.6.4. Find $\displaystyle\lim_{x\to 0^+} \frac{\ln(x)}{x}$.
So as $x \to 0^+$, the numerator goes to $-\infty$ and the denominator goes to 0 (on the positive side). Dividing by a very small positive number makes the magnitude bigger, so we see that the fraction $\ln(x)/x$ also tends to $-\infty$, by our rules of “algebra with infinity.”
So $\displaystyle\lim_{x\to 0^+} \frac{\ln(x)}{x} = -\infty$.
Let’s consider
\[ \lim_{x\to\infty} \frac{\ln(x)}{x}. \]
This time we cannot reason it out so easily: both $\ln(x)$ and $x$ go to $\infty$ as $x \to \infty$. We call this an indeterminate form of type $\infty/\infty$ (indeterminate because we can’t determine the answer without thinking more about the functions involved).
\[ \lim_{x\to 0} \frac{\sin(x)}{x} \]
which we can solve geometrically (see Section 5.10.1).
So what’s going on with these types of limits? For ln(x)/x it’s a question of who “gets to infinity”
fastest, and for sin(x)/x, it’s about how quickly sin(x) goes to 0 as compared with x going to 0. In
other words: if they are both headed for zero or both headed for ∞, then what we need to do is
compare their rates — that is, their derivatives.
Theorem 6.6.5 (L’Hôpital’s rule). Let $f$ and $g$ be differentiable functions such that $g'(x)$ is nonzero around $a$. If
\[ \lim_{x\to a} \frac{f(x)}{g(x)} \text{ is an indeterminate form of type } \frac{0}{0} \text{ or } \pm\frac{\infty}{\infty} \]
then
\[ \lim_{x\to a} \frac{f(x)}{g(x)} = \lim_{x\to a} \frac{f'(x)}{g'(x)}. \]
This formula is also valid for one-sided limits, and limits $x \to \pm\infty$.
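Example 6.6.6. Type: $\frac{\infty}{\infty}$
\begin{align*}
\lim_{x\to\infty} \frac{\ln(x)}{x} &= \text{“}\frac{\infty}{\infty}\text{”}, \text{ so we can apply L’Hôpital’s rule} \\
&\overset{L'H}{=} \lim_{x\to\infty} \frac{1/x}{1} = 0
\end{align*}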
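Example 6.6.7. Type: $\frac{0}{0}$
\begin{align*}
\lim_{x\to 0} \frac{\sin(x)}{x} &= \text{“}\frac{0}{0}\text{”}, \text{ so we can apply L’Hôpital’s rule} \\
&\overset{L'H}{=} \lim_{x\to 0} \frac{\cos(x)}{1} = \cos(0) = 1
\end{align*}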
Note this was a lot easier than our geometric argument — but we had to use the geometric argument
back then because we were trying to find out the derivative of sin(x)!
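A quick numerical look at both limits supports these values. This is only an illustrative sketch (the helper names `ratio_ln` and `ratio_sin` and the sample points are assumptions of this example), and evaluating at a few points is evidence, not a proof.

```python
import math

def ratio_ln(x):
    """ln(x) / x, the quotient from Example 6.6.6."""
    return math.log(x) / x

def ratio_sin(x):
    """sin(x) / x with x in radians, the quotient from Example 6.6.7."""
    return math.sin(x) / x

# As x grows, ln(x)/x shrinks toward 0.
for x in (1e2, 1e4, 1e8):
    print(x, ratio_ln(x))

# As x shrinks toward 0, sin(x)/x approaches 1.
for x in (1e-1, 1e-3, 1e-6):
    print(x, ratio_sin(x))
```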
Note. You CANNOT apply L’Hôpital’s rule UNLESS it’s an indeterminate form of type $\frac{0}{0}$ or $\pm\frac{\infty}{\infty}$.
Type 0 · ∞
This doesn’t mean actually 0, it means a limit of a product where one term is shrinking to 0 and
the other is growing to infinity.
Since sin(x) → 0 and ln(x) → −∞, this is an indeterminate form of type 0 · ∞. Does the function
sin(x) go to zero faster than ln(x) goes to −∞? or vice versa? Or do their rates match and cancel?
Note. Convert $0\cdot\infty$ into $\frac{0}{0}$ or $\frac{\infty}{\infty}$ by the identity $ab = \dfrac{b}{a^{-1}}$ or $ab = \dfrac{a}{b^{-1}}$.
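We have
\begin{align*}
\lim_{x\to 0^+} \sin(x)\ln(x) &= \lim_{x\to 0^+} \frac{\ln(x)}{\csc(x)} && \text{since } \frac{1}{\sin(x)} = \csc(x) \\
&= \text{“}\frac{-\infty}{\infty}\text{”} && \text{so we can apply L’Hôpital’s rule} \\
&\overset{L'H}{=} \lim_{x\to 0^+} \frac{1/x}{-\csc(x)\cot(x)} && \text{(still } \tfrac{\infty}{\infty}\text{, but let’s simplify)} \\
&= \lim_{x\to 0^+} \frac{1/x}{-\cos(x)/\sin^2(x)} && \text{(converted to sine and cosine)} \\
&= \lim_{x\to 0^+} \frac{\sin^2(x)}{-x\cos(x)} \\
&= \text{“}\frac{0}{0}\text{”} && \text{so we can apply L’Hôpital’s rule} \\
&\overset{L'H}{=} \lim_{x\to 0^+} \frac{2\sin(x)\cos(x)}{-\cos(x) + x\sin(x)} \\
&= 0,
\end{align*}
(because the numerator is 0 and the denominator is −1) so sin(x) won the race.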
The graph of y = sin(x) ln(x), which indeed tends to 0 as x → 0, even though ln(x) → −∞.
Type ∞ − ∞
so we think of ways to simplify this and/or put it over a common denominator to make it a fraction.
Since it’s trig, our first thought is to write everything in terms of sine and cosine:
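\begin{align*}
\lim_{x\to 0^+} (\csc(x) - \cot(x)) &= \lim_{x\to 0^+} \left(\frac{1}{\sin(x)} - \frac{\cos(x)}{\sin(x)}\right) \\
&= \lim_{x\to 0^+} \frac{1 - \cos(x)}{\sin(x)} \\
&= \text{“}\frac{0}{0}\text{”} \text{ so we can apply L’Hôpital’s rule} \\
&\overset{L'H}{=} \lim_{x\to 0^+} \frac{\sin(x)}{\cos(x)} = 0.
\end{align*}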
Once it’s over a common denominator, plug in the values again, and decide if you’ve got an
indeterminate form (so apply L’Hôpital’s rule) or else a limit that you can reason out.
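\begin{align*}
&= \lim_{x\to\infty} \frac{-2x}{\sqrt{x^3 - x} + \sqrt{x^3 + x}} \\
&= \text{“}\frac{\infty}{\infty}\text{”} \text{ so we can apply L’Hôpital’s rule} \\
&\overset{L'H}{=} \lim_{x\to\infty} \frac{-2}{\frac12(3x^2 - 1)(x^3 - x)^{-1/2} + \frac12(3x^2 + 1)(x^3 + x)^{-1/2}}.
\end{align*}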
Similarly,
\[ \lim_{x\to\infty} \frac12 (3x^2 + 1)(x^3 + x)^{-1/2} = \infty, \]
so in fact the denominator is going to $\infty + \infty = \infty$ while the numerator is constant. Thus we can conclude from the above calculations that
\[ \lim_{x\to\infty} \left(\sqrt{x^3 - x} - \sqrt{x^3 + x}\right) = 0. \]
So in fact: in the limit, the little ±x under the square root didn’t make a difference, and “at
infinity” these functions are indistinguishable! (Check it out with a calculator to see that this is
completely true.)
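The calculator check can be sketched in a few lines. This is an illustrative sketch only: the names `direct` and `conjugate_form` and the sample values of x are assumptions of this example. It evaluates the difference of square roots as written, and also in the form obtained after multiplying by the conjugate.

```python
import math

def direct(x):
    """sqrt(x^3 - x) - sqrt(x^3 + x), computed exactly as written."""
    return math.sqrt(x**3 - x) - math.sqrt(x**3 + x)

def conjugate_form(x):
    """The same quantity rewritten as -2x / (sqrt(x^3 - x) + sqrt(x^3 + x))."""
    return -2 * x / (math.sqrt(x**3 - x) + math.sqrt(x**3 + x))

for x in (1e1, 1e2, 1e4, 1e6):
    # Both columns shrink toward 0 as x grows, roughly like -1/sqrt(x).
    print(x, direct(x), conjugate_form(x))
```

For very large x the direct form subtracts two nearly equal floating-point numbers and loses digits, which is one practical reason to prefer the conjugate form.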
Exercise 6.6.11. Our initial step was to convert our ∞ − ∞ limit into the more tractable limit
\[ \lim_{x\to\infty} \frac{-2x}{\sqrt{x^3 - x} + \sqrt{x^3 + x}} \]
and then to apply L’Hôpital’s rule. We didn’t have to use L’Hôpital’s rule, though. Show that you
can reason out this limit using our old techniques.
“$\infty^0$ should be $\infty$ because the base is $\infty$” but “$\infty^0$ should be 1 because any number to the power 0 is 1” (but remember that $\infty$ is not a number and the exponent is only tending to 0 in the limit, not really 0)
“$0^0$ should be 0 because the base is 0”, but “$0^0$ should be 1 because the exponent is 0”
“$1^\infty$ should be 1 because 1 to any power is 1” but “if the base is just a bit bigger than 1 then $(1^+)^\infty$ should be $\infty$ because anything bigger than 1 to the power $\infty$ is $\infty$” but then again “if the base is just a bit smaller than 1 then $(1^-)^\infty$ should be 0 because anything just under 1 to the power $\infty$ is 0”. Phew!
These lines of reasoning are fine: the fact that you get contradictory answers is what says you don’t
have enough information, and that this is an indeterminate form.
The method of solution is the same for all exponential types like this: convert to base e. (Notice
how this is the solution to a lot of problems with exponentials in this course.)
Example 6.6.12. So
\[ \lim_{x\to\infty} x^{1/x} = \lim_{x\to\infty} e^{\ln(x)/x} = e^0 = 1 \]
by Example 6.6.6.
and since $\lim_{x\to 0^+} x\ln(x)$ is an indeterminate form of type $0\cdot\infty$, we have to work a bit harder; let’s focus on this limit.
\[ \lim_{x\to 0^+} x\ln(x) = \lim_{x\to 0^+} \frac{\ln(x)}{1/x} \overset{L'H}{=} \lim_{x\to 0^+} \frac{1/x}{-1/x^2} = \lim_{x\to 0^+} (-x) = 0 \]
where we had carefully checked that we had an indeterminate form of type $\infty/\infty$ before we applied L’Hôpital’s rule in the second step.
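\begin{align*}
\lim_{x\to\infty} x\ln\left(1 + \frac{1}{x}\right) &= \lim_{x\to\infty} \frac{\ln\left(1 + \frac{1}{x}\right)}{\frac{1}{x}} \\
&= \lim_{x\to\infty} \frac{\frac{1}{1 + 1/x}\left(-\frac{1}{x^2}\right)}{-1/x^2} \\
&= \lim_{x\to\infty} \frac{1}{1 + 1/x} = 1
\end{align*}
So that
\[ \lim_{x\to\infty} \left(1 + \frac{1}{x}\right)^x = \lim_{x\to\infty} e^{x\ln\left(1 + \frac{1}{x}\right)} = e^1 = e. \]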
Summary
To find limits of continuous functions, try plugging in the values. When the result is an indeterminate form ($0/0$, $\infty/\infty$, $0\cdot\infty$, $\infty - \infty$, $0^0$, $\infty^0$ or $1^\infty$), either simplify it algebraically to find the answer, or else turn the limit into an indeterminate form of type $0/0$ or $\infty/\infty$ so that you can apply L’Hôpital’s rule (as many times as it takes).
As you saw in the examples in this section: always take the time to simplify; and ALWAYS check
that you have the right kind of indeterminate form before using L’Hôpital’s rule.
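2. $\lim_{x\to-\infty} x^2 e^{-x} = \infty$ since both $x^2 \to \infty$ and $e^{-x} \to \infty$; $\lim_{x\to\infty} x^2 e^{-x}$: oops, an indeterminate form of type $\infty\cdot 0$, so we have to work:
\begin{align*}
\lim_{x\to\infty} x^2 e^{-x} &= \lim_{x\to\infty} \frac{x^2}{e^x} \\
&= \text{“}\frac{\infty}{\infty}\text{”} \text{ so apply L’Hôpital’s rule} \\
&\overset{L'H}{=} \lim_{x\to\infty} \frac{2x}{e^x} \\
&= \text{“}\frac{\infty}{\infty}\text{”} \text{ so apply L’Hôpital’s rule} \\
&\overset{L'H}{=} \lim_{x\to\infty} \frac{2}{e^x} \\
&= 0
\end{align*}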
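6. This is zero only if $x^2 - 4x + 2 = 0$, so when $x = \frac12(4 \pm \sqrt{16 - 8}) = 2 \pm \sqrt{2}$:
$x$: | $x < 2-\sqrt{2}$ | $x = 2-\sqrt{2}$ | $2-\sqrt{2} < x < 2+\sqrt{2}$ | $x = 2+\sqrt{2}$ | $x > 2+\sqrt{2}$
$f''(x)$: | + | 0 | − | 0 | +
$f(x)$ is: | ∪ | 0.19 | ∩ | 0.38 | ∪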
7. This gives the following graph, once you put all the pieces together. Start by plotting the
points of interest (x, f (x)) (the critical points, and potential inflection points), and draw
little arrows for all the limits, then connect them, using the concavity information to get the
curvature right:
Label all the features on your graph (sorry, can’t do it with FooPlot): critical points (there are two), inflection points (there are two), local and global minimum at $(0, 0)$, local maximum at $(2, 4e^{-2}) \approx (2, 0.54)$, horizontal asymptote at $y = 0$ as $x \to \infty$.
Let’s do another example of graphing a function using all the information we can readily glean
from the function itself, and its first and second derivatives.
Example 6.6.16. Let $f(x) = e^{1/x}$.
Information from f :
f is always positive.
Since f is undefined at 0, we need to find the limit of f as x approaches 0, that is, what does f look like near x = 0? So:
\[ \lim_{x\to 0^+} e^{1/x} = \text{“}e^{\infty}\text{”} = \infty \]
since for $x > 0$, $1/x > 0$ and as $x \to 0^+$ we have $1/x \to \infty$; so $e^{1/x} \to \infty$ also. On the other hand
\[ \lim_{x\to 0^-} e^{1/x} = \text{“}e^{-\infty}\text{”} = 0 \]
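since as $x \to 0^-$, $1/x \to -\infty$; but we know that $\lim_{z\to-\infty} e^z = 0$ so we deduce $e^{1/x} \to 0$. This is a weird answer; so we check, but it’s right.
Finally, we would like to know if there are horizontal asymptotes, or more generally, how the graph of f behaves as $x \to \infty$ and $x \to -\infty$:
\[ \lim_{x\to\infty} e^{1/x} = \text{“}e^{1/\infty}\text{”} = e^0 = 1 \]
and
\[ \lim_{x\to-\infty} e^{1/x} = \text{“}e^{-1/\infty}\text{”} = e^0 = 1 \]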
Remember: a graph can cross a horizontal asymptote! (Look at our graph for y = ln(x)/x, for
example.)
Ok, this has given us quite a few details about the graph but now we look for the bumps and
valleys, the local extrema, which really start to define the shape of the curve in between the points
we’ve figured out so far.
$f'(x) = 0$ is never true, since $f'(x)$ is a product of two functions and neither one is ever zero.
We see that $e^{1/x} > 0$ for all $x \neq 0$, and $-1/x^2 < 0$ for all $x \neq 0$. So $f'(x) < 0$ for all $x \neq 0$.
This says f is decreasing on every connected component of its domain. That is, f is decreasing on $(-\infty, 0)$ and also on $(0, \infty)$.
YUCK! This is worse than what we started with!! So let’s flip things around and see if it improves:
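\begin{align*}
\lim_{x\to 0^-} \frac{-1}{x^2} e^{1/x} &= \lim_{x\to 0^-} \frac{-x^{-2}}{e^{-1/x}} && \text{indeterminate form } \frac{\infty}{\infty} \\
&\overset{L'H}{=} \lim_{x\to 0^-} \frac{2x^{-3}}{e^{-1/x}\,(x^{-2})} && \text{since } \tfrac{d}{dx}\left(-\tfrac{1}{x}\right) = x^{-2} \\
&= \lim_{x\to 0^-} \frac{2x^{-1}}{e^{-1/x}} && \text{indeterminate form } \frac{\infty}{\infty} \\
&\overset{L'H}{=} \lim_{x\to 0^-} \frac{-2x^{-2}}{e^{-1/x}\,(x^{-2})} \\
&= \lim_{x\to 0^-} -2e^{1/x} = 0
\end{align*}
$f''(x) = 0$ when $1 + 2x = 0$ or $x = -\frac12$
We see that $(-0.5, e^{-2})$ is an inflection point since the graph changes concavity there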
From these details, we piece together a very good sketch of the graph. Your hand-drawn sketch
would be better, because you’d exaggerate the features and pick a better scale.
MAT 1330 : Fall 2020 6.7. APPROXIMATING FUNCTIONS WITH POLYNOMIALS
Graph of $y = e^{1/x}$. The point of inflection and horizontal asymptote are marked. Notice the properties of this graph are consistent with all clues from the function and its derivatives and the values of the limits as $x \to 0^{\pm}$ and $x \to \pm\infty$.
You should do practice problems with functions that make you fearful, because they will challenge you! The hard parts are finding the derivatives and then finding the critical points of $f$ and of $f'$; once you have those, it becomes a fun puzzle of connecting the dots.
Exercise 6.6.18. Sketch $f(x) = xe^{-1/x}$ and $f(x) = \dfrac{x}{\sqrt{x^2 + 1}}$, noting all the critical points, points of inflection, asymptotes, and local and global extrema, as above.
End of lecture # 14
Having the full graph of a function, or a formula for it, is wonderful. But many functions are difficult to work with and to evaluate (such as $\sqrt{x}$, $\ln(x)$, $e^x$) compared to polynomial functions (such as linear, quadratic functions). Can we locally approximate the more complex functions with polynomials?
This kind of approximation is used to simplify complex mathematical calculations. But it is also
important when you need to make complex mathematics accessible. For example, if the absorption
model of a drug is given by a complicated function, but you want those who use it to understand
the effect of modifying their dosage, then you are better off estimating it with a linear function
that can be explained in words.
The simplest approach is to connect the dots. Formally, we are saying we might approximate our
function by a secant line approximation. That is, given a function f (x) and two points a, b defining
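an interval in the domain of the function, the secant line of f from a to b is the straight line from $(a, f(a))$ to $(b, f(b))$, given by
\[ y = f(a) + m(x - a), \qquad m = \frac{f(b) - f(a)}{b - a}. \]
We usually leave this in factored form, because it’s quite natural to evaluate $x - a$.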
Example 6.7.1. Give the secant line approximation of the function $\sqrt{x}$ on the interval [144, 169], and use it to approximate $\sqrt{150}$.
Solution: We have $(a, f(a)) = (144, 12)$ and $(b, f(b)) = (169, 13)$ so
\[ m = \frac{f(b) - f(a)}{b - a} = \frac{1}{25} \]
and thus the secant line is given by
\[ y = 12 + \frac{1}{25}(x - 144). \]
When $x = 150$, this gives $y = 12 + \frac{6}{25} = 12.24$. That’s quite a decent approximation to the actual value $\sqrt{150} \approx 12.247$.
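The secant line recipe is mechanical enough to write as a small function. This is a hedged sketch, not part of the original notes; the helper name `secant_line` is an assumption of this example.

```python
import math

def secant_line(f, a, b):
    """Return the function x -> f(a) + m (x - a), with m the secant slope on [a, b]."""
    m = (f(b) - f(a)) / (b - a)
    return lambda x: f(a) + m * (x - a)

approx = secant_line(math.sqrt, 144, 169)
# Compare the secant estimate of sqrt(150) with the library value.
print(approx(150), math.sqrt(150))
```

Note that the approximation needs the value of f at both endpoints, which is what the tangent line approach avoids.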
So this is great. But can we get a more sophisticated approximation by using our knowledge of the
derivative of f ? In particular, can we get away with approximating values of f from only knowing
f at a single point?
The point of Calculus and the derivative is that locally near any point, the graph of a differentiable
function is quite close to its tangent line. In practice, this means you can estimate f (x) for x near
a by its linear approximation, meaning, the function for this tangent line.
That is: think of the tangent line to f at a as another function (a far simpler function!!!) which is
pretty close to f near a.
\[ y - f(a) = f'(a)(x - a) \]
or
\[ y = f(a) + f'(a)(x - a). \]
Note. So given a function f (x) and a base point a, the tangent line to f at a is the graph of
\[ T(x) = f(a) + f'(a)(x - a) \]
and this function is the linear approximation to f at a (also called the linearization of f at a) in the sense that
of all the lines that satisfy these three properties, T (x) gives the best approximation to f (x).
\[ T(x) = f(100) + f'(100)(x - 100) = \sqrt{100} + \frac{1}{2\sqrt{100}}(x - 100) = 10 + \frac{1}{20}(x - 100). \]
So $\sqrt{150} \approx T(150) = 10 + \frac{1}{20}(50) = 12.5$. Check: $\sqrt{150} \approx 12.247$, not bad.
The graph of $y = \sqrt{x}$ is in red. Its tangent line at the base point a = 100 is given in blue and its tangent line at the base point a = 144 is given in black. Compare the closeness of the approximation at x = 150.
Example 6.7.4. Find the linear approximation to $f(x) = e^x$ near $x = 0$.
Solution: since $f(0) = e^0 = 1$ and $f'(0) = e^0 = 1$, we get $T(x) = 1 + x$.
We compare:
\[ f(0.1) = e^{0.1} \approx 1.10517 \]
whereas
\[ T(0.1) = 1 + 0.1 = 1.1 \]
and we needed a calculator to find f (0.1) whereas we didn’t for T (0.1).
However, if x is not near a, then your linear approximation won’t be very good: for example $e^1 \approx 2.718$ but $T(1) = 1 + 1 = 2$. This is obvious from the graph.
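The tangent line approximation only needs two numbers, f(a) and f'(a). The sketch below is illustrative (the helper name `tangent_line` is an assumption of this example); it rebuilds the two linearizations used above and shows how the error grows away from the base point.

```python
import math

def tangent_line(f_at_a, fprime_at_a, a):
    """Return T(x) = f(a) + f'(a) (x - a), given the two numbers f(a) and f'(a)."""
    return lambda x: f_at_a + fprime_at_a * (x - a)

T_sqrt = tangent_line(10.0, 1 / 20, 100)  # sqrt(x) at a = 100
T_exp = tangent_line(1.0, 1.0, 0)         # e^x at a = 0

print(T_sqrt(150), math.sqrt(150))
for x in (0.01, 0.1, 1.0):
    # The error of the linear approximation grows as x moves away from 0.
    print(x, T_exp(x), math.exp(x), math.exp(x) - T_exp(x))
```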
Note. The linear approximation $T(x) = f(a) + f'(a)(x - a)$ is useful anytime you know $f(a)$ and $f'(a)$ and want to estimate $f(b)$ for some number $b$ near $a$: just calculate $T(b)$.
If your function f is given numerically, then you might estimate its derivative by divided differences,
and then T is some nice numerical extrapolation of your data.
Remark 6.7.5. In Physics, it is routine to replace sin(x) or cos(x) with their linearizations at 0
when the physical problem involves smaller angles. For example, if
Replacing sin(x) by x makes it easier to solve for x in certain formulas, such as approximating the
period of oscillation of a pendulum, or in the refraction of light through a lens.
Linear approximation is fine, but we could do better. The second derivative tells us about the concavity of the function, so if we took it into account, we could find a simple function that matches both the slope of f and the concavity of f at a point a.
Example 6.7.6. Our linear approximation of $\sqrt{150}$ was too high, because the graph of $y = \sqrt{x}$ is concave down near x = 150. Our linear approximation of $e^{0.1}$ was too low, because the graph of $y = e^x$ is concave up near x = 0.1.
Theorem 6.7.7. Let $f$ be a function and $a$ a base point, and let $n > 0$ be such that the first $n$ derivatives of $f$ exist at $a$. Then there is a polynomial of degree n, called the Taylor polynomial of degree n and denoted $T_n(x)$, with the property that
\[ T_n(a) = f(a), \quad T_n'(a) = f'(a), \quad T_n''(a) = f''(a), \quad \cdots \quad T_n^{(n)}(a) = f^{(n)}(a) \]
where $f^{(n)}(a)$ denotes the nth derivative of $f$ evaluated at the number $a$.
That’s quite cool: it says for any function whatsoever, you can come up with a polynomial that
matches it perfectly, up to the nth derivative, at the point a. (And notice that this is not about
fitting a curve to datapoints — we are only using our knowledge of the function at a single point
a.)
$f^{(n)}(a) = \frac{d^n}{dx^n} f(x)$ at $x = a$, the n-th derivative;
$T_1(x) = T(x)$, the linear approximation = the function defining the tangent line to f at a;
You get the nth approximation from the (n − 1)st approximation by just adding one term:
\[ \frac{1}{n!} f^{(n)}(a)(x - a)^n. \]
Example 6.7.8. So let $f(x) = \sqrt{x}$, so that $f'(x) = \frac{1}{2\sqrt{x}}$, $f''(x) = \frac{-1}{4x^{3/2}}$. At $a = 100$ we have $\sqrt{a} = 10$ and $a^{3/2} = 10^3 = 1000$, so
\begin{align*}
T_2(x) &= f(a) + f'(a)(x - a) + \frac{f''(a)}{2}(x - a)^2 \\
&= 10 + \frac{1}{20}(x - 100) + \frac{-1}{2\cdot 4000}(x - 100)^2 \\
&= 10 + \frac{1}{20}(x - 100) - \frac{1}{8000}(x - 100)^2
\end{align*}
(which will indeed be less than our linear approximation, as desired) and we can evaluate
\[ T_2(150) = 10 + \frac{50}{20} - \frac{50^2}{8000} = 10 + \frac{5}{2} - \frac{5}{16} = 12.1875. \]
This time we are a little too small, but much closer.
We can always do better by choosing the base point a closer to the x at which we want the
approximation.
Example 6.7.9. Find $T_2(x)$ for $f(x) = \sqrt{x}$ at $a = 144$, and use it to estimate $\sqrt{150}$.
Solution: as above, we need to know $\sqrt{144} = 12$, $144^{3/2} = 12^3$ so
\[ T_2(x) = 12 + \frac{1}{24}(x - 144) - \frac{1}{2\cdot 4\cdot 12^3}(x - 144)^2 \]
and therefore
\[ T_2(150) = 12.25 - \frac{36}{8\cdot 12^3} = 12.25 - \frac{1}{384} \approx 12.247 \]
which is correct to three decimal places!
The graph of $y = \sqrt{x}$ is in red. Its quadratic Taylor approximation at the base point a = 100 is given in blue and its quadratic Taylor approximation at the base point a = 144 is given in black. Compare the closeness of the approximation over a large interval to that of the linear approximation.
So we can get a better approximation by choosing a base point closer to x; or we can get a better
approximation by choosing a higher order Taylor polynomial.
Example 6.7.10. Find $T_3(x)$ for $f(x) = \sqrt{x}$ at base point $a = 100$, and use it to estimate $\sqrt{150}$.
Solution:
\begin{align*}
T_3(x) &= f(a) + f'(a)(x - a) + \frac{f''(a)}{2}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 \\
&= \sqrt{a} + \frac{1}{2\sqrt{a}}(x - a) + \frac{-1}{2\cdot 4a^{3/2}}(x - a)^2 + \frac{1}{6}\cdot\frac{3}{8a^{5/2}}(x - a)^3 \\
&= 10 + \frac{1}{20}(x - 100) - \frac{1}{8000}(x - 100)^2 + \frac{1}{16\cdot 10^5}(x - 100)^3
\end{align*}
so that
\[ T_3(150) = 12.1875 + \frac{5}{64} \approx 12.266. \]
So we can approximate $\sqrt{x}$ with a linear, quadratic, cubic or even higher order polynomial.
Example 6.7.11. Find the cubic Taylor polynomial of $f(x) = x^3$ at the point $a = 8$ and use it to estimate $(8.1)^3$.
Solution: We have $f(x) = x^3$ so $f'(x) = 3x^2$, $f''(x) = 6x$ and $f'''(x) = 6$. Thus $f(a) = 8^3 = 512$, $f'(a) = 3(64) = 192$, $f''(a) = 6(8) = 48$ and $f'''(a) = 6$. Then
\[ T_3(x) = 512 + 192(x - 8) + \frac{48}{2}(x - 8)^2 + \frac{6}{3!}(x - 8)^3 = 512 + 192(x - 8) + 24(x - 8)^2 + (x - 8)^3 \]
which gives
\[ T_3(8.1) = 512 + 19.2 + 0.24 + 0.001 = 531.441 \]
Amazing, right? Well, not so amazing. If you multiply out the formula for $T_3(x)$ you get simply $x^3$. So Taylor’s theorem in this case just gives us a really cool refactorisation of $x^3$ which makes it easy to evaluate near the base point.
Example 6.7.12. Find the 4th Taylor polynomial of $f(x) = \cos(x)$ at the base point $a = 0$.
derivative | function | evaluated at a = 0 | Taylor polynomial coefficient $\frac{1}{n!} f^{(n)}(a)$
$f(x)$ | $\cos(x)$ | 1 | 1
$f'(x)$ | $-\sin(x)$ | 0 | 0
$f''(x)$ | $-\cos(x)$ | −1 | $-\frac12$
$f'''(x)$ | $\sin(x)$ | 0 | 0
$f^{(4)}(x)$ | $\cos(x)$ | 1 | $\frac{1}{4!} = \frac{1}{24}$
Remark 6.7.13. In fact, you can continue this pattern infinitely to get the Taylor series of f at a = 0:
\[ \cos(x) = 1 - \frac{1}{2}x^2 + \frac{1}{24}x^4 - \frac{1}{6!}x^6 + \frac{1}{8!}x^8 \pm \cdots \]
and yes, if you really could add all these infinitely many terms, the answer would exactly equal cos(x). Basically, this is how your calculator evaluates cos(x) (with x in RADIANS!!); it works better than drawing triangles and measuring ratios...
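Partial sums of this series can be compared directly against a library cosine. The sketch below is illustrative only; the function name `cos_taylor` is an assumption of this example, and it makes no claim about how any particular calculator is actually implemented.

```python
import math

def cos_taylor(x, n_terms):
    """Sum the first n_terms nonzero terms of the Taylor series of cos at a = 0 (x in radians)."""
    return sum((-1) ** k * x ** (2 * k) / math.factorial(2 * k) for k in range(n_terms))

x = 1.0
for n_terms in (1, 2, 3, 5, 10):
    # The error drops very quickly as more terms are included.
    print(n_terms, cos_taylor(x, n_terms), abs(cos_taylor(x, n_terms) - math.cos(x)))
```

With three terms the sum is exactly the 4th Taylor polynomial $1 - \frac12 x^2 + \frac{1}{24}x^4$ from Example 6.7.12.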
Example 6.7.14. Find the third Taylor polynomial of $\ln(x)$ at base point $a = 1$ (you can’t use $a = 0$ here!!) and use it to estimate $\ln(1.1)$.
derivative | function | evaluated at a = 1 | Taylor polynomial coefficient $\frac{1}{n!} f^{(n)}(a)$
$f(x)$ | $\ln(x)$ | 0 | 0
$f'(x)$ | $\frac{1}{x}$ | 1 | 1
$f''(x)$ | $-\frac{1}{x^2}$ | −1 | $-\frac12$
$f'''(x)$ | $\frac{2}{x^3}$ | 2 | $\frac{2}{3!} = \frac13$
so we have
\[ T_3(x) = 0 + (x - 1) - \frac{1}{2}(x - 1)^2 + \frac{1}{3}(x - 1)^3 \]
and thus we estimate ln(1.1) by
\[ T_3(1.1) = 0.1 - \frac{1}{2}(0.01) + \frac{1}{3}(0.001) \approx 0.0953. \]
With a calculator, we check that ln(1.1) = 0.0953101798; pretty fabulous approximation to have
done by hand.
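The pattern in the table (coefficients $1, -\frac12, \frac13, \dots$) suggests the general term $(-1)^{k+1}(x-1)^k/k$. The sketch below assumes that pattern and uses the illustrative name `ln_taylor`; it is a check on the hand computation, not a derivation.

```python
import math

def ln_taylor(x, n):
    """Degree-n Taylor polynomial of ln at a = 1, assuming coefficients (-1)^(k+1) / k."""
    return sum((-1) ** (k + 1) * (x - 1) ** k / k for k in range(1, n + 1))

for n in (1, 2, 3, 6):
    print(n, ln_taylor(1.1, n), abs(ln_taylor(1.1, n) - math.log(1.1)))
```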
Again, you can infer a pattern from the derivatives and figure out the Taylor series in this case.
Exercise 6.7.15. Estimate $\sqrt{6}$ using the cubic Taylor polynomial of $f(x) = \sqrt{x}$ at the base point $a = 4$.
Exercise 6.7.16. Estimate sin(1) (radians!!!!) using the quintic Taylor polynomial of f (x) = sin(x)
at the base point a = 0.
\[ |f(x) - T_n(x)| \]
on the approximation, for a given value of x near a, if we do not know f (x)? This is a hugely
significant question, and absolutely essential in mathematical modeling. After all, in science it’s
not good enough to say that the answer is “near” 5; we need to say something like “it’s within the
range 5 ± 0.5”.
In brief: the answer is yes, we certainly can; Proposition 6.7.23, coming up below, is just the barest
of introductions to the subject, which you can explore in much greater depth in a subsequent course
on mathematical modeling or differential equations.
To get there, we need to talk about the second of a trio of theorems in Calculus.
We know how to get from a function to its derivative: take the limit of the slope of the secant line. But soon we will want to go backwards: from the derivative back to the function. The first clue is a deceptively simple theorem which will end up being the key to understanding how things work, called the Mean Value Theorem.⁸
⁸ This is one of the grand trio of key theorems about continuous and differentiable functions: the Intermediate Value Theorem, the Mean Value Theorem and the Extreme Value Theorem.
Theorem 6.7.17 (Mean Value Theorem). Suppose f is a function that satisfies the following hypotheses:
(i) f is continuous on the closed interval [a, b] (and maybe more); and
(ii) f is differentiable on the open interval (a, b) (and maybe more).
Then there is at least one (unknown) value c somewhere in the interval [a, b] for which
\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]
That is, this theorem says that for a nice function like f , at some point in the interval, the
instantaneous rate of change of f is equal to the average rate of change over the whole interval (or,
the slope of the secant line on that interval).
Let us agree that this is very plausible. If you drive from Ottawa to Montreal (200km) in 2 hours
(f (b) − f (a) = 200, b − a = 2) you could not have driven less than 100km/h (your average speed)
the whole time; nor could you have driven more than 100km/h the whole time. At some instant —
maybe just ONE instant in the whole 2 hours, maybe for lots of minutes during those two hours,
you were driving exactly 100km/h. Any one of those instants could be called c.
So setting $c = \frac{1}{\sqrt{3}}$ we have $f'(c) = 1$ and $a \le c \le b$, as required.
But where this is particularly interesting is when we can’t solve for c — the theorem still tells us
that c exists.
Example 6.7.19. Consider $f(x) = \sin(x)\ln(x)$ on the interval $[\pi, 2\pi]$. Since $f(\pi) = 0$ and $f(2\pi) = 0$, the slope of the secant line is 0. Therefore, by the Mean Value Theorem, there is some number c between $\pi$ and $2\pi$ such that $f'(c) = 0$. If we calculate:
\[ f'(x) = \cos(x)\ln(x) + \frac{\sin(x)}{x} \]
we are completely stuck: it is impossible to solve $f'(x) = 0$ algebraically!! But we know a solution exists.
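Although there is no algebraic solution, the point c whose existence the theorem guarantees can be located numerically. The sketch below uses bisection, a technique chosen for this illustration and not taken from the notes; the names `fprime` and `bisect` are assumptions of this example. It relies on $f'(\pi) = -\ln(\pi) < 0$ and $f'(2\pi) = \ln(2\pi) > 0$.

```python
import math

def fprime(x):
    """Derivative of sin(x) ln(x), from the product rule."""
    return math.cos(x) * math.log(x) + math.sin(x) / x

def bisect(g, lo, hi, tol=1e-12):
    """Find a zero of a continuous g on [lo, hi], given that g changes sign there."""
    if g(lo) * g(hi) >= 0:
        raise ValueError("g must have opposite signs at the endpoints")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid  # the sign change is in the left half
        else:
            lo = mid  # the sign change is in the right half
    return (lo + hi) / 2

c = bisect(fprime, math.pi, 2 * math.pi)
print(c, fprime(c))
```

Bisection only finds one such point; the Mean Value Theorem says at least one exists, and does not rule out others.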
This special case of the Mean Value Theorem is so common that it has its own name.
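Theorem 6.7.20 (Rolle’s theorem). Let f be a function and let [a, b] be an interval in the domain of f. Suppose that f is continuous on [a, b], that f is differentiable on (a, b), and that
\[ f(a) = f(b). \]
Then there is at least one value $c \in (a, b)$ for which $f'(c) = 0$.
Notice that this is just the Mean Value Theorem in the case that the slope of the secant line is $m = 0$.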
Example 6.7.21. Suppose an object follows a straight line path, and occupies the same position at two different moments in time; that is, there are times $t_1 < t_2$ such that $s(t_1) = s(t_2)$.
It then follows that there was a time in between, say $t_3$, such that $s'(t_3) = 0$. But this is just saying the velocity was zero at time $t_3$, such as when it turned around.
6.7.7 Proof of Rolle’s theorem and advanced applications of the Mean Value
Theorem (optional)
Proof of Rolle’s theorem. If f is a constant function, then in fact $f'(c) = 0$ for all $c \in (a, b)$, so the theorem is true but boring.
So let’s assume f is not a constant function. Then since it is continuous, by the Extreme Value Theorem it attains its absolute maximum and its absolute minimum somewhere on [a, b].
If the absolute maximum is $f(a) = f(b)$, then the absolute minimum has to be somewhere in the middle, at a point $c \in (a, b)$, meaning it is a local minimum. Since f is differentiable, c is a critical point with $f'(c) = 0$. Done.
Otherwise, the absolute maximum is somewhere in the middle, at a point $c \in (a, b)$, meaning it is a local maximum. Again, this means $f'(c) = 0$.
As a more advanced application of the Mean Value Theorem, we can prove the following result about the function sin(x): for all real numbers $x$ and $y$,
\[ |\sin(x) - \sin(y)| \le |x - y|. \]
Interpretation: the difference in the y-values of the function y = sin(x) is always less than or equal to the difference in the x-values; i.e. the slope of any secant line is always at most 1 in absolute value. That sounds good!
MAT 1330 : Fall 2020 6.8. STABILITY OF DISCRETE TIME DYNAMICAL SYSTEMS
Proof. The Mean Value Theorem applies to $f(x) = \sin(x)$, and says that for every $a < b$ in $\mathbb{R}$ there is a number $c \in [a, b]$ such that
\[ f'(c) = \frac{f(b) - f(a)}{b - a} \]
or, since $f'(c) = \cos(c)$, we can rewrite this as
\[ \frac{\sin(b) - \sin(a)}{b - a} = \cos(c). \]
Taking absolute values of both sides, this gives
\[ \left|\frac{\sin(b) - \sin(a)}{b - a}\right| = |\cos(c)| \le 1. \]
Therefore
\[ \frac{|\sin(b) - \sin(a)|}{|b - a|} \le 1 \]
which gives
\[ |\sin(b) - \sin(a)| \le |b - a|. \]
Since $|x - y| = |y - x|$, this is equivalent to the inequality we wanted to prove.
The Mean Value Theorem can also be used to figure out how far from the correct answer your Taylor approximation can be, that is, to estimate the error $|f(x) - T_n(x)|$. As a simple example, we do this for $n = 0$, which is the constant approximation $T_0(x) = f(a)$.
End of lecture # 15
Our method for distinguishing stable from unstable fixed points was cobwebbing. For example, if we consider the DTDS
\[ x_{t+1} = f(x_t) = \frac{2x_t}{1 + x_t} \]
we see it has two fixed points, $x_1^* = 0$ and $x_2^* = 1$. If we draw the cobweb on some initial value between the two fixed points, we get the following:
which shows that $x_1^*$ is unstable and $x_2^*$ is stable. (Well, we should also check an initial value beyond $x_2^*$; exercise.)
Goal: Find an analytical way to distinguish stable and unstable fixed points, that is, a method
that doesn’t rely on graphing and cobwebbing, or on numerical tests.
A linear DTDS is a special kind of DTDS, where the updating function f (x) is a linear function $f(x) = rx + c$. If the slope $r \neq 1$, then the corresponding DTDS has exactly 1 fixed point
\[ x^* = rx^* + c \iff (1 - r)x^* = c \iff x^* = \frac{c}{1 - r}. \]
We had examined the stability of the fixed point in Section 3.6. Our conclusion was:
Note. For a linear DTDS $x_{t+1} = rx_t + c$, with slope $r \neq 1$, the fixed point $x^* = \frac{c}{1 - r}$ is stable if the slope satisfies $|r| < 1$ and unstable if the slope satisfies $|r| > 1$.
We saw in Section 6.7.2 that we can approximate a function near a point like $a = x^*$ by its linearization. So the stability of the fixed point should be determined by the stability of the fixed point of the corresponding linear DTDS.
So the idea is: near a fixed point $x^*$, the DTDS $x_{t+1} = f(x_t)$ should have similar behaviour to the DTDS $y_{t+1} = L(y_t)$, where L is the linearization of f at $x^*$.
What is the slope r of the linearization? Why, it’s the derivative evaluated at $a = x^*$, that is, $r = f'(x^*)$, of course!
\[ y_{t+1} = 2y_t \]
and 0 is an unstable fixed point of this DTDS since $r = 2 > 1$. On the other hand, $f(1) = 1$ and $f'(1) = 0.5$ so the linearization of this DTDS at $x^* = 1$ is
\[ y_{t+1} = 0.5(y_t - 1) + 1 \]
and 1 is a stable fixed point of this DTDS since $r = 0.5$ satisfies $|r| < 1$.
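The same conclusion can be reached numerically, both by estimating the slope at each fixed point and by simply iterating the system. This is a minimal sketch: the helper names `derivative`, `classify` and `iterate`, the central-difference step `h`, and the starting value 0.1 are all assumptions of this example.

```python
def f(x):
    """Updating function of the DTDS x_{t+1} = 2 x_t / (1 + x_t)."""
    return 2 * x / (1 + x)

def derivative(g, x, h=1e-6):
    """Central-difference estimate of g'(x); h is an arbitrary small step."""
    return (g(x + h) - g(x - h)) / (2 * h)

def classify(g, fixed_point):
    """Compare |g'(x*)| with 1 (a slope numerically equal to 1 is left undecided)."""
    slope = abs(derivative(g, fixed_point))
    if slope < 1:
        return "stable"
    if slope > 1:
        return "unstable"
    return "inconclusive"

def iterate(g, x0, steps):
    """Apply the updating function `steps` times starting from x0."""
    x = x0
    for _ in range(steps):
        x = g(x)
    return x

print(classify(f, 0.0), classify(f, 1.0))
print(iterate(f, 0.1, 60))  # starting near 0, the solution moves away from 0 toward 1
```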
Note that in fact, in the preceding example, we didn’t need to actually write down the linearized
DTDS: it was enough to find the value of f 0 (x∗ ) to figure out how the stability would work out.
This leads to the following theorem.
Theorem 6.8.2 (Stability of Fixed Points). Suppose $x_{t+1} = f(x_t)$ is a DTDS and $x^*$ is a fixed point. We have:
If $|f'(x^*)| < 1$, then $x^*$ is stable.
If $|f'(x^*)| > 1$, then $x^*$ is unstable.
If $|f'(x^*)| = 1$, we can’t use this test; see further courses on differential equations.
The idea of the proof. To make the proof easier to read, let’s set
\[ a = x^*, \]
the fixed point in question of the DTDS. Let’s write down the linearization of f at $a = x^*$:
\[ L(x) = f(a) + f'(a)(x - a) = rx + c, \qquad \text{where } r = f'(a) \text{ and } c = f(a) - f'(a)a = a(1 - r), \]
using $f(a) = a$.
If $x_t$ is close to a, then since the linearization gives a good approximation of f near a we have
\begin{align*}
x_{t+1} &= f(x_t) \\
&\approx L(x_t) \\
&= rx_t + c
\end{align*}
so near a the DTDS behaves like the linear DTDS
\[ x_{t+1} = rx_t + c \]
whose fixed point is also a (check this, using the formulas for r and c above!).
So if $|f'(a)| = |r| < 1$, then since a is a stable fixed point of the linear DTDS, it attracts all solutions, and $x_{t+1}$ will be closer to a than $x_t$. So if we repeat the argument, $x_{t+2}$ will be closer to a, and so on. Thus it follows that $\lim_{t\to\infty} x_t = a$, and the fixed point is stable.
But if $|f'(a)| = |r| > 1$, then $x_{t+1}$ will be further away from a, so the approximation will get worse, not better. Thus although we don’t know what $\lim_{t\to\infty} x_t$ is, we at least can be sure that $\lim_{t\to\infty} x_t \neq a$, and the fixed point is unstable.
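\[ x_{t+1} = x_t e^{-3x_t + 1}. \]
Its fixed points are the solutions of $x = xe^{-3x+1}$, so $x^* = 0$, or $1 = e^{-3x+1}$, that is $x^* = \frac13$. Are these fixed points stable or unstable? By the theorem, we should evaluate $f'(x^*)$ for each, and compare its absolute value with 1. By the product rule, $f'(x) = e^{-3x+1}(1 - 3x)$, so
\[ |f'(0)| = e > 1 \implies x^* = 0 \text{ is unstable, by the theorem,} \]
and
\[ \left|f'\left(\tfrac13\right)\right| = |e^0 (0)| = 0 < 1 \implies x^* = \tfrac13 \text{ is stable, by the theorem.} \]
We can confirm this by drawing the graph of $y = f(x) = xe^{-3x+1}$ and cobwebbing.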
Consider a population displaying the Allee effect, and described by the DTDS
\[ x_{t+1} = f(x_t) = \frac{3x_t^2}{1 + x_t^2}. \]
Find its fixed points and classify them according to their stability.
\[ x = f(x) = \frac{3x^2}{1 + x^2} \iff x(1 + x^2) = 3x^2 \]
Therefore:
At x1 = 0, we have f 0 (0) = 0. Since |f 0 (0)| < 1, this is a stable fixed point. (So if our
population is too small, it dies out.)
At x2 , we have
6x2
f 0 (x2 ) =
(1 + x22 )2
6x2
= since x2 is a root of 1 + x2 = 3x, see above
(3x2 )2
6 2
= = ' 5.236 > 1
3x2 x2
so that x2 is an unstable fixed point.
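The slopes can be checked numerically. The sketch below evaluates $f'$ at all three solutions of the fixed-point equation; the larger root of $1 + x^2 = 3x$, called `x3` here, is not classified in the text above, so treat its inclusion as an addition for illustration.

```python
import math

def f(x):
    # Updating function with Allee effect
    return 3 * x**2 / (1 + x**2)

def f_prime(x):
    # Quotient rule: f'(x) = 6x / (1 + x^2)^2
    return 6 * x / (1 + x**2) ** 2

x1 = 0.0
x2 = (3 - math.sqrt(5)) / 2  # smaller root of 1 + x^2 = 3x
x3 = (3 + math.sqrt(5)) / 2  # larger root of 1 + x^2 = 3x

for name, x in [("x1", x1), ("x2", x2), ("x3", x3)]:
    slope = f_prime(x)
    verdict = "stable" if abs(slope) < 1 else "unstable"
    print(name, round(x, 3), round(slope, 3), verdict)
```

The output for `x2` should agree with the simplified expression $2/(3x_2)$ used in the hand calculation.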
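A population growing under logistic growth follows the dynamics of a DTDS of the form
$$x_{t+1} = rx_t(1 - x_t)$$
for some parameter $r$ satisfying $r > 0$. Let’s find the fixed points and classify their stability.$^9$
The fixed points satisfy
$$x = rx(1 - x) \iff x = 0 \text{ or } 1 = r(1 - x) \iff x = 0 \text{ or } x = \frac{r - 1}{r}.$$
With $f(x) = rx(1 - x) = rx - rx^2$, we have
$$f'(x) = r - 2rx.$$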
So the fixed point $x^* = 0$ gives $f'(x^*) = f'(0) = r$, meaning that it is stable if $|r| < 1$ and unstable
if $|r| > 1$. But we said at the beginning that $r > 0$; so we conclude that 0 is stable when $0 < r < 1$
and is unstable when $r > 1$.
Now assume $r > 1$, so that we have a second relevant fixed point $x^* = \frac{r-1}{r}$. We have $f'(x^*) =
r - 2(r - 1) = 2 - r$. Thus $x^*$ is stable if $|2 - r| < 1$. How do we solve this?
First method: $|2 - r|$ is the distance between 2 and $r$. So we need $r$ to be less than one unit
away from 2, which means $1 < r < 3$.
Second method: the numbers with absolute value less than 1 are the numbers between −1
and 1. Therefore we have
$$-1 < 2 - r < 1 \iff -3 < -r < -1 \iff 1 < r < 3.$$
On the other hand, $x^*$ is unstable if $|2 - r| > 1$. Remembering that $r > 1$, we deduce that this
comes down to $r > 3$.
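These parameter ranges can be sanity-checked by simulation. The sketch below iterates the logistic DTDS for a few sample values of $r$ (0.5, 1.5, 2.8 and 3.2, chosen here only for illustration) and compares the long-run value with the nonzero fixed point $(r-1)/r$.

```python
def logistic_orbit(r, x0=0.2, steps=2000):
    # Iterate x_{t+1} = r x_t (1 - x_t) and return the final value
    x = x0
    for _ in range(steps):
        x = r * x * (1 - x)
    return x

def nonzero_fixed_point_is_stable(r):
    # f'(x*) = 2 - r at x* = (r - 1)/r, so the test is |2 - r| < 1
    return abs(2 - r) < 1

for r in (0.5, 1.5, 2.8, 3.2):
    x_star = (r - 1) / r
    print(r, logistic_orbit(r), x_star, nonzero_fixed_point_is_stable(r))
```

For $r = 0.5$ the population dies out, for $r = 1.5$ and $r = 2.8$ it settles at $(r-1)/r$, and for $r = 3.2$ it does not settle at the fixed point at all.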
Putting these together gives the several different cobwebbing scenarios we found in Example 3.7.4:
$^9$ Notice that we can’t use cobwebbing to solve this question because to cobweb, we need to sketch the graph, which
means we have to choose a value of $r$.
Note. If $f'(x^*)$ contains a parameter, then you need to solve the inequality
$$|f'(x^*)| < 1$$
to find the range of parameters that give a stable fixed point. Rewrite this inequality as
$$-1 < f'(x^*) < 1.$$
You can also, if you prefer, write this latter as the pair of inequalities
$$f'(x^*) > -1 \quad \text{AND} \quad f'(x^*) < 1.$$
Similarly, if $f'(x^*)$ has parameters, to solve for the parameters that give $|f'(x^*)| > 1$, we rewrite it as
$f'(x^*) < -1$ OR $f'(x^*) > 1$. This time you have to solve the two inequalities separately; each one
gives you conditions for an unstable fixed point.
If $f'(x^*)$ is just a number, then you can apply the Stability of Fixed Points Theorem directly; there’s
nothing to solve for.
The logistic model incorporates a diminishing per capita rate of reproduction $r(1 - x_t)$, reflecting
that the rate of reproduction goes down as the size of the population increases. But this model for
the reproduction rate is only valid for $x_t < 1$ (or else it becomes negative).
A more sophisticated model would model the per capita reproduction rate with a function like
$$e^{r(1 - x_t)}$$
for some positive constant $r$. This function is always positive and decreasing. This leads to the
Ricker model which is often used in modeling the population in fisheries:
$$x_{t+1} = x_t e^{r(1 - x_t)}.$$
We need to solve $x = f(x) = xe^{r(1-x)}$; one solution is $x = 0$, the other must satisfy
$$1 = e^{r(1-x)} \iff 0 = r(1 - x) \iff x = 1.$$
By the product rule, $f'(x) = e^{r(1-x)} - rxe^{r(1-x)} = (1 - rx)e^{r(1-x)}$.
Thus $f'(0) = e^r > 1$ for $r > 0$; this fixed point is always unstable, which means that small populations always grow.
For $x^* = 1$, we compute $f'(1) = (1 - r)e^{r(1-1)} = 1 - r$. So this fixed point is stable if and only if
$$|1 - r| < 1$$
$$\iff -1 < 1 - r < 1$$
$$\iff -2 < -r < 0 \quad \text{(subtracted 1 from all terms)}$$
$$\iff 0 < r < 2 \quad \text{(multiplied all terms by $-1$, and changed the direction of inequalities).}$$
The graph of $f(x) = xe^{3(1-x)}$ is given below. Try cobwebbing to see what kind of instability we
have at x∗ = 1 in this case. Both fixed points being unstable essentially drives the population into
chaos. In terms of what’s happening in the fishery: the fish reproduce so much one year that they
annihilate the resources in their environment, leading to a population crash the following year. The
population swings can seem completely random — the dynamics of this population are in fact an
example of the mathematical concept of chaos.
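To see the contrast between $0 < r < 2$ and $r = 3$ without drawing a cobweb, one can simply iterate the Ricker model. The sketch below is a minimal simulation; the starting value 0.5, the number of steps, and the comparison value $r = 1.5$ are arbitrary choices made for this illustration.

```python
import math

def ricker_orbit(r, x0=0.5, steps=200):
    # Iterate x_{t+1} = x_t * e^{r(1 - x_t)} and keep the whole history
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x * math.exp(r * (1 - x)))
    return xs

settled = ricker_orbit(1.5)  # |f'(1)| = |1 - 1.5| < 1: expect convergence to 1
wild = ricker_orbit(3.0)     # |f'(1)| = |1 - 3| > 1: expect no convergence

print(settled[-3:])
print([round(x, 3) for x in wild[-6:]])
```

With $r = 1.5$ the last few values sit at the fixed point 1, while with $r = 3$ the population keeps swinging between large and small values.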
Recall in Section 6.5.4 that we considered a population undergoing logistic growth, but being
harvested regularly at a rate $h$, which led to a DTDS of the form
$$x_{t+1} = 2.5x_t(1 - x_t) - hx_t.$$
We started by assuming that this DTDS had a stable positive steady state $x^*$, so that we could
talk about the yield of the harvest in the long term, which would be $Y(h) = hx^*$.
Now, we can complete the problem by determining if the level of harvest that we chose gives a
steady state that is stable.
We begin by finding the steady states by solving $x = f(x) = 2.5x(1 - x) - hx$. This gives one
solution $x = 0$ and the other satisfies
$$1 = 2.5(1 - x) - h \iff 1 = 2.5 - 2.5x - h \iff 2.5x = 1.5 - h \iff x = \frac{1.5 - h}{2.5}.$$
As always, we first decide what the biologically relevant range is. This second fixed point is
nonnegative if $1.5 - h \ge 0$ or $h \le 1.5$. Since $h$ is a rate of harvesting, $h \ge 0$. So the region of
interest is
$$0 \le h \le 1.5.$$
Next, we discuss stability. We have
$$f'(x) = 2.5 - 5x - h.$$
When $x^* = \frac{1.5 - h}{2.5}$ this gives
$$f'(x^*) = 2.5 - 5\left(\frac{1.5 - h}{2.5}\right) - h = 2.5 - 2(1.5 - h) - h = 2.5 - 3 + 2h - h = -0.5 + h$$
so the steady state arising from harvesting is stable only if $|-0.5 + h| < 1$ or
$$-1 < -0.5 + h < 1 \iff -0.5 < h < 1.5.$$
In particular, the harvesting rate we found, of $h = 0.75$, falls in the good range.
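A short numerical check of this example is sketched below: it evaluates the steady state and the slope $f'(x^*)$ for $h = 0.75$, then iterates the harvested DTDS to see whether the population settles there. The function names and the starting value 0.1 are assumptions made for the sketch.

```python
def nonzero_steady_state(h):
    # x* = (1.5 - h) / 2.5
    return (1.5 - h) / 2.5

def slope_at_steady_state(h):
    # f'(x) = 2.5 - 5x - h evaluated at x*
    x = nonzero_steady_state(h)
    return 2.5 - 5 * x - h

def simulate(h, x0=0.1, steps=500):
    # Iterate x_{t+1} = 2.5 x_t (1 - x_t) - h x_t
    x = x0
    for _ in range(steps):
        x = 2.5 * x * (1 - x) - h * x
    return x

h = 0.75
print(nonzero_steady_state(h), slope_at_steady_state(h), simulate(h))
```

The slope should match $-0.5 + h$, and the simulated population should approach the steady state.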
Exercise 6.8.4. In the above example, the range of values of h for which the nonzero steady state
was stable was wider than the range of values of h for which the nonzero steady state was positive.
Find the range of $h \ge 0$ that makes the nonzero steady state $x^*$ of the following DTDS (a) positive
(b) stable:
$$x_{t+1} = 2x_t(2 - x_t) - hx_t$$
and notice that this time it is possible to choose h to give a nonstable positive steady state. This
would be a very problematic rate of harvesting, as it would give a different yield each year, in some
kind of chaotic dynamics, and the potential of extinction.
End of lecture # 16
MAT 1330 : Fall 2020 6.9. THE INTERMEDIATE VALUE THEOREM
The final application of differentiation we’ll consider is about solving equations. We know a great
many algebraic techniques, but occasionally come across equations that are impossible to solve
algebraically. Along the way, we’ll meet the third of our trio of theorems in Calculus.
Solution: We have to solve $x = e^{-x}$. But applying ln gives $\ln(x) = -x$, which is just as difficult.
We look at the graph:
The graphs of $y = e^{-x}$ (in blue) and $y = x$ (in red). They have a unique point of intersection near
$x = 0.55$.
We agree that there is a solution. Actually, we can do better than that.
In other words, if our function is negative in one place and positive in another, then it must have
gone through 0 ... right?
Caution: Imagine our function f had a graph like one of the following.
On the left, the graph of a discontinuous function y = f (x) which satisfies that f (0) < 0 and
f (1) > 0 but there is no number c between 0 and 1 such that f (c) = 0. On the right, the graph of
f (x) = 1/x, which satisfies f (−1) < 0 and f (1) > 0 but there is no x for which f (x) = 0.
It is important that the function be continuous on your interval for this strategy to work!
Theorem 6.9.1 (Intermediate Value Theorem). Suppose that f is a function which is continuous
on the interval [a, b], and that y is a number between f (a) and f (b). Then there is a number
c ∈ [a, b] such that f (c) = y.
The idea of the theorem is: if your function is continuous on [a, b], then if you draw the curve from
(a, f (a)) to (b, f (b)), you have to cross every horizontal line (every y-value) between f (a) and f (b).
You can’t avoid solving f (c) = y.
The graphs of a continuous function f (x) and two marked points in red (−2, −18) and (6, 43.12),
whose y-values are marked on the y-axis in green. The Intermediate Value Theorem says: every
horizontal line between y = f (a) and y = f (b) has to cross the graph at least once, at an x-value
lying between a and b.
Remark 6.9.2. This is a different application of an idea we have used before in a completely
different context: if we have an interval $(a, b)$ and $f$ has no critical points in that interval (meaning
$f'(x)$ is never zero or undefined) then necessarily $f'$ is either always positive or always negative on
that interval. In other words: if $f'$ changes sign on an interval, then that interval must contain a
critical point of $f$. This is the Intermediate Value Theorem, applied to $f'$!
Now the theorem says that if your function is continuous on an interval and changes sign from one
endpoint to the other, then it must cross the x-axis somewhere in between.
So let’s apply the Intermediate Value Theorem to our example, to solve $x = e^{-x}$.
The function $f(x) = x - e^{-x}$ is defined and continuous on all of $\mathbb{R}$, so in particular on any subinterval.
Tip: in general, watch out for asymptotes!!
We have $f(0) < 0$ and $f(1) > 0$ so there is a root of $f$ (that is, a value $x$ such that $f(x) = 0$)
between 0 and 1.
We compute $f(0.5) \approx -0.11 < 0$. So there is a root of $f$ between 0.5 and 1.
We compute $f(0.75) \approx 0.28 > 0$. So there is a root of $f$ between 0.5 and 0.75.
We compute $f(0.625) \approx 0.09 > 0$. So there is a root of $f$ between 0.5 and 0.625.
Continuing to narrow the interval in this way, we compute $f(0.57) \approx 0.004$, which is equal to zero to two decimal places, which we think is
good enough.
Our method yields the estimate of 0.57 for the solution to $x = e^{-x}$.
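The interval-halving procedure above can be automated. The sketch below is one plain way to write it, assuming a tolerance of 0.001 on the width of the interval; the function name `bisect` and the tolerance are choices made for this illustration.

```python
import math

def f(x):
    # We want a root of f(x) = x - e^{-x}
    return x - math.exp(-x)

def bisect(f, a, b, tol=1e-3):
    """Shrink [a, b] while keeping a sign change of f inside it."""
    if f(a) * f(b) > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tol:
        mid = (a + b) / 2
        if f(a) * f(mid) <= 0:
            b = mid  # sign change is in the left half
        else:
            a = mid  # sign change is in the right half
    return (a + b) / 2

root = bisect(f, 0.0, 1.0)
print(round(root, 3))
```

The continuity of $f$ is what justifies each step: by the Intermediate Value Theorem, whichever half still shows a sign change must contain a root.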
Let’s use information about the derivative and the tangent line to make better guesses at a root of
$f(x) = x - e^{-x}$. The idea is in the following picture, where we try the root of the linearization as
a guess at the root of the function.
The graph of $f(x) = x - e^{-x}$ in black. An initial guess $x_0 = 1$ gives a point on the curve $y = f(x)$.
We draw the tangent line to the curve and see where it intersects the x-axis.
Actually, if we just work this out in general, we’ll get a simple formula.
So given $f(x)$, our goal is to solve $f(x) = 0$. We assume we have used the Intermediate Value
Theorem to help us make an initial guess $x_0$.
The tangent line at $x_0$ is the graph of the linearization $L(x) = f(x_0) + f'(x_0)(x - x_0)$. Setting $L(x) = 0$ and solving for $x$ gives
$$0 = f(x_0) + f'(x_0)x - f'(x_0)x_0 \iff f'(x_0)x = f'(x_0)x_0 - f(x_0) \iff x = x_0 - \frac{f(x_0)}{f'(x_0)}.$$
Therefore:
$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}, \quad x_2 = x_1 - \frac{f(x_1)}{f'(x_1)}, \quad \ldots$$
that is, we have in effect created a DTDS which ought to converge to the root we were looking for!
Note. Newton’s method: To solve $f(x) = 0$ with initial guess $x_0$, use the iterative formula
$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}, \quad n = 0, 1, 2, 3, \ldots$$
Memorize this formula correctly, or else remember how to derive it. Wrong formula = garbage.
Example 6.9.3. Solve $x = e^{-x}$ with an accuracy of three decimal places using Newton’s method.
Solution: We first have to convert the problem into a root-finding problem. Let
$$f(x) = x - e^{-x}.$$
Then the roots of $f$ are the solutions to the equation $x = e^{-x}$ that we are looking for. Good; we
can apply Newton’s method. We compute
$$f'(x) = 1 + e^{-x}.$$
Start with $x_0 = 0.57$, which is the best guess we got with our bisection method. Then
$$x_1 = 0.57 - \frac{0.57 - e^{-0.57}}{1 + e^{-0.57}} = 0.56714181501$$
$$x_2 = x_1 - \frac{x_1 - e^{-x_1}}{1 + e^{-x_1}} = 0.5671432904$$
and we stop, because our solution is already accurate to 5 decimal places!
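The iteration is easy to reproduce on a computer. The sketch below implements the Newton update as a small helper (the name `newton` and the choice of three steps are illustrative) and applies it to $f(x) = x - e^{-x}$ from $x_0 = 0.57$.

```python
import math

def newton(f, f_prime, x0, steps):
    # Repeatedly apply x_{n+1} = x_n - f(x_n) / f'(x_n); return all iterates
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / f_prime(x))
    return xs

def f(x):
    return x - math.exp(-x)

def f_prime(x):
    return 1 + math.exp(-x)

xs = newton(f, f_prime, 0.57, 3)
for n, x in enumerate(xs):
    print(n, x)
```

The first two updates should reproduce the values of $x_1$ and $x_2$ computed by hand above, up to rounding.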
Example 6.9.4. Find $\sqrt{2}$ to 3 decimal places using Newton’s method.
Solution: We need to create a nice function with $\sqrt{2}$ as a root; $f(x) = x^2 - 2$ is a good choice.
Then we need a first guess; $x_0 = 1$ is close. Our formula is
$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} = x_n - \frac{x_n^2 - 2}{2x_n}$$
so that gives
$$x_1 = 1 - \frac{1 - 2}{2} = 1.5$$
$$x_2 = 1.5 - \frac{2.25 - 2}{3} = 1.4166667$$
$$x_3 = 1.41421568633$$
$$x_4 = 1.41421356237$$
so $x_3$ and $x_4$ already agree to at least 4 decimals; in fact $x_5 = 1.41421356237 = x_4$ is already the maximum precision my
calculator can do.
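Because $f(x) = x^2 - 2$ only involves rational arithmetic, the iterates can be computed exactly as fractions. The sketch below does this with Python's `fractions` module; it is an illustration, and its last printed digits can differ slightly from a hand calculation that rounds intermediate values.

```python
from fractions import Fraction

def newton_sqrt2(x0, steps):
    # Newton's method for f(x) = x^2 - 2, carried out in exact arithmetic
    x = Fraction(x0)
    out = [x]
    for _ in range(steps):
        x = x - (x * x - 2) / (2 * x)
        out.append(x)
    return out

iterates = newton_sqrt2(1, 4)
for x in iterates:
    print(x, float(x))
```

Each iterate is a fraction whose decimal expansion agrees with $\sqrt{2}$ to roughly twice as many digits as the previous one.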
So if $f'(x_n)$ is not near zero, but $f(x_n)$ is near zero, then we are correcting our guess by a small
amount each time, and by our graph, we see we are getting better and better.
If $f'(x_n)$ is near zero (that is, if we have a critical point near our guess) then Newton’s method
can just spike off and give ridiculous numbers. You can tell when that happens: $x_n$ is nowhere near
$x_{n+1}$, and your value of $f'(x_n)$ was suspiciously small.
In particular, if $x$ is a root of both $f$ and $f'$ (like a double root of a polynomial) then you shouldn’t
do Newton’s method to $f$; do it to $f'$ instead!
Another possible failure: if you apply Newton’s method to a function with many roots, it could
happen that it converges to a different root than the one you wanted. The solution is to make as
good a first guess x0 as possible, so that your function more or less looks like a straight line from
the root to your guess (no local maxima or local minima in between).
So: Newton’s method can fail, but only for reasons that you should have noticed when you started.
Finally: using a computer, it’s very fun and easy to get maximum precision. Use a spreadsheet,
for example, and just cut and paste the formula from one line to the next. Therefore, as long as
you can differentiate the function, you never have to worry about solving an equation again.
End of lecture # 17
Chapter 7
Integration
We have explored lots of applications of the derivative; now it’s time to turn the story around and
consider anti-derivatives. We begin with some motivation, and setting the stage for why we really
want to do this.
7.1 Introduction
Utility companies measure the rate of flow of water (or gas, or electricity) into your home in
litres/second (respectively, m³/second, Watts = joules/second). In the end, though, they bill you
for the total amount consumed in litres (respectively, m³, kWh). In other words: they measure the
instantaneous rate of change and use that to compute the total amount used. This is the inverse
process of differentiation.
Examples abound (differentiation takes us from the left column to the right; anti-differentiation, which is new, takes us from the right column back to the left):
f(x)        f'(x)
value       rate of change
position    velocity
mass        growth rate
volume      flow rate
amount      production rate
If my speedometer is broken, I can use my GPS to find out my speed: it knows my position
each second and can use that to estimate my velocity. So knowing your position as a function
of time implies knowing your velocity.
If my GPS is broken, I cannot use my speedometer to find out where I am. The speedometer
could tell me exactly the same velocity function whether I was driving in B.C. or Newfoundland.
BUT if you told me that we started in Ottawa and drove along the TransCanada
Highway west for 22 hours at a steady speed of 100 km/h, then I know that we are now in
Winnipeg.
In math terms:
Given $f'(x)$, and the value of $f(a)$ for some $a$, we can find $f(x)$.
Example 7.1.1. Suppose that the volume of a cell increases continuously at a rate of 2 µm³ per
second. What is the volume of the cell after 3 seconds, if the cell starts at a volume of 1 µm³?
Solution: Denote $V(t)$ = volume of the cell. We are given that $V'(t) = 2$ µm³/s. Since the derivative
is a constant, the function must be linear, with slope 2, which gives:
$$V(t) = 2\,\frac{\mu\text{m}^3}{\text{s}} \cdot t + V(0),$$
and this initial value is $V(0) = 1$ µm³. So our solution is
$$V(3) = 2 \cdot 3 + 1 = 7\ \mu\text{m}^3.$$
Notice that:
The units work out! Rate in µm³/s times time in s gives µm³.
Different initial conditions give different volumes — of course! Without the initial condition,
we couldn’t tell you V (t) just from the rates of change.
Remark 7.1.2. Problems like the above — and like we will do in the rest of MAT1330 this term —
are called pure-time differential equations. Next term, we will consider also autonomous differential
equations, in which the derivative is given in terms of the function itself.
For example, if the amount $x(t)$ of radioactive material decreases at a continuous rate of 1% per
day, and we start with 10 grams, then this is telling us that
$$x'(t) = -0.01\,x(t), \quad x(0) = 10. \qquad (7.1)$$
That is: we don’t know the actual rate of change at any given time. Instead, we know the rate
of change based on how much there is; but we’re trying to find out how much there is, so it feels
circular! (Don’t worry, it works out: MAT1332.)
If it wasn’t the derivative, but instead just the average rate of change each day, then you would
replace $x'(t)$ with $\frac{x(t+1) - x(t)}{(t+1) - t}$ on the left, and the result would be the DTDS $x(t + 1) = 0.99x(t)$
(which we would write $x_{t+1} = 0.99x_t$). Notice that this DTDS would give a rough approximation to
the correct answer, but we agree that the autonomous differential equation (7.1), which takes into
account that the growth compounds continuously, should give the more accurate answer.
In MAT1332, we’ll show that the solution to (7.1) is $x(t) = 10e^{-0.01t}$; for now, you should be able
to verify that $x'(t) = -0.01x(t)$ and that $x(0) = 10$, which means it is a solution to (7.1).
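To get a feel for how rough the DTDS approximation is, the sketch below compares $x_t = 10(0.99)^t$ with the continuous solution $10e^{-0.01t}$ at a few sample times; the times 1, 10 and 100 days are arbitrary choices for the illustration.

```python
import math

def dtds(t):
    # Solution of x_{t+1} = 0.99 x_t with x_0 = 10
    return 10 * 0.99 ** t

def continuous(t):
    # Solution of x'(t) = -0.01 x(t) with x(0) = 10
    return 10 * math.exp(-0.01 * t)

for t in (1, 10, 100):
    print(t, dtds(t), continuous(t), continuous(t) - dtds(t))
```

The two models stay close, with the discrete one slightly below the continuous one, and the gap grows slowly over the first hundred days.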
7.1.2 Could there be more than one anti-derivative satisfying a given initial condition?
First, an intuitive argument: Let’s work with some graphs to see how it’s possible to go backwards
from the derivative to the function, given an initial condition.
Example 7.1.3. Suppose $\frac{dV}{dt} = 2$ and $V(0) = 1$. We have sketched the graph of the derivative (the red horizontal line), together with the initial value $(0, 1)$ (red dot). Starting at the red dot, draw a curve (in blue) with slope equal to $V'(t)$ at each point $t$ (which in this case, means a line with constant slope 2); this is the graph of $V(t)$.
Example 7.1.4. Suppose $\frac{dV}{dt} = 4 - 2t$ and $V(0) = 2$. We have sketched the graph of the derivative (the red line), together with the initial value $(0, 2)$ (red dot). Starting at the red dot, draw a curve (in blue) with slope equal to $V'(t)$ at each point $t$ (so: initially steep, with slope 4, but this slope decreases as $t$ increases); this is the graph of $V(t)$.
Now, a proof. Suppose you had two functions $F(x)$ and $G(x)$ that both satisfied $F'(x) = f(x) =
G'(x)$. Let’s look at $h(x) = F(x) - G(x)$. This is a function with derivative
$$h'(x) = F'(x) - G'(x) = f(x) - f(x) = 0.$$
What could it be? By the Mean Value Theorem applied to the function $h(x)$ on the interval $[a, b]$,
there is a point $c$ between $a$ and $b$ so that
$$h'(c) = \frac{h(b) - h(a)}{b - a}.$$
But since $h'(x) = 0$ for all $x$, we have $h'(c) = 0$. Therefore after simplifying, we get that $h(b) = h(a)$.
This is true for every such $b$, so in fact $h$ is a constant function (with graph a horizontal line).
Conclusion: if F and G are two anti-derivatives of f on an interval, then they are almost the
same; in fact, there is a constant C such that
F (x) = G(x) + C
i.e. the graphs of the two anti-derivatives are identical, up to a vertical shift.
7.1.3 Anti-differentiation
Let’s begin by being clear about what we are looking for: we are given a function $f(t)$ and want to
find a new function $F(t)$ such that $F'(t) = f(t)$.
Definition 7.1.5. An antiderivative of a function $f(t)$ is a function $F(t)$ with the property that
$F'(t) = f(t)$. If $F$ is one anti-derivative of $f$, then so is $F + c$, for any constant $c$. We write
$$\int f(t)\,dt = F(t) + c,$$
which we read out loud as “the integral of f of t dt is F(t) plus an arbitrary constant c.” In this
expression, the function $f$ is called the integrand and $\int f(t)\,dt$ is called the indefinite integral of $f$.
We will meet the “definite integral”, which is a number representing the area between the graph of
f and the x-axis, at the end of MAT1330.
Note. So an antiderivative is one function whose derivative is f ; the indefinite integral is the set
of all functions whose derivative is f .
Example 7.1.6. Suppose $f(t) = 1$. Then $F(t) = t$ is an antiderivative; and $F(t) = t + 5$ is another
antiderivative. The indefinite integral is
$$\int f(t)\,dt = \int 1\,dt = t + c \quad \text{with } c \in \mathbb{R}.$$
We can turn our differentiation rules into rules for anti-derivatives. Let’s start by making a list of
anti-derivatives of common functions.
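Recall that
$$\frac{d}{dt}\,t^{n+1} = (n + 1)t^n;$$
therefore, dividing by $n + 1$,
$$\int t^n\,dt = \frac{1}{n+1}\,t^{n+1} + c \quad \text{for } n \neq -1.$$
Examples:
$$\int t^3\,dt = \frac{1}{4}t^4 + c$$
$$\int x^5\,dx = \frac{1}{6}x^6 + c$$
$$\int \sqrt{x}\,dx = \int x^{1/2}\,dx = \frac{1}{\frac{1}{2} + 1}\,x^{\frac{1}{2} + 1} + c = \frac{2}{3}x^{3/2} + c$$
$$\int \frac{1}{t^2}\,dt = \int t^{-2}\,dt = \frac{1}{-2 + 1}\,t^{-2+1} + c = -t^{-1} + c.$$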
If $a$ is a constant, then $\frac{d}{dx}(aF(x)) = a\,\frac{d}{dx}(F(x))$. Therefore, thinking about what this says about
anti-derivatives, we get the rule
Note.
$$\int af(x)\,dx = a\int f(x)\,dx \quad \text{for any constant } a.$$
Note.
$$\int \big(f(x) + g(x)\big)\,dx = \int f(x)\,dx + \int g(x)\,dx.$$
Examples:
$$\int (3x^2 - 7x)\,dx = 3\int x^2\,dx - 7\int x\,dx = 3\cdot\frac{1}{3}x^3 - 7\cdot\frac{1}{2}x^2 + c = x^3 - \frac{7}{2}x^2 + c.$$
$$\int \left(\frac{3}{\sqrt{x}} + \frac{4}{x^3}\right)dx = 3\int x^{-1/2}\,dx + 4\int x^{-3}\,dx = 3\cdot\frac{1}{1/2}x^{1/2} + 4\cdot\frac{1}{-2}x^{-2} + c = 6\sqrt{x} - \frac{2}{x^2} + c.$$
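So the way to CORRECTLY solve the final indefinite integral above is to simplify it algebraically
until it is in a form we can handle:
$$\int \frac{3 + 4x^2}{\sqrt{x}}\,dx = \int \left(\frac{3}{\sqrt{x}} + \frac{4x^2}{\sqrt{x}}\right)dx$$
$$= \int \left(3x^{-1/2} + 4x^{3/2}\right)dx$$
$$= 3\int x^{-1/2}\,dx + 4\int x^{3/2}\,dx$$
$$= 3\left(\frac{1}{-\frac{1}{2} + 1}\,x^{(-\frac{1}{2} + 1)}\right) + 4\left(\frac{1}{\frac{3}{2} + 1}\,x^{(\frac{3}{2} + 1)}\right) + c$$
$$= \frac{3}{1/2}\,x^{1/2} + \frac{4}{5/2}\,x^{5/2} + c$$
$$= 6\sqrt{x} + \frac{8}{5}x^{5/2} + c.$$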
And of course we check the answer by quickly differentiating.
Note. Tip: always check your final answer by differentiating. It is very easy to mess up the
coefficients, but if you differentiate and your answer is $\frac{3}{4}x^{-1/2}$ instead of $3x^{-1/2}$, for example, you
know that you were off by a factor of 4, and you can fix it, without trying to just repeat your
calculation line by line.
Note. If when you differentiate you get a totally different function than what you were supposed
to get, then you know you did something wrong. Try again.
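The "check by differentiating" habit can also be carried out numerically. The sketch below compares a central-difference estimate of the derivative of the candidate answer $6\sqrt{x} + \frac{8}{5}x^{5/2}$ with the integrand $\frac{3 + 4x^2}{\sqrt{x}}$ at a few sample points; the step size and sample points are arbitrary choices, and a numerical check like this is evidence, not a proof.

```python
import math

def integrand(x):
    # The function we were asked to integrate
    return (3 + 4 * x**2) / math.sqrt(x)

def candidate(x):
    # Our proposed antiderivative (with c = 0)
    return 6 * math.sqrt(x) + (8 / 5) * x ** 2.5

def numerical_derivative(F, x, h=1e-5):
    # Central difference approximation of F'(x)
    return (F(x + h) - F(x - h)) / (2 * h)

for x in (0.5, 1.0, 2.0, 7.0):
    print(x, numerical_derivative(candidate, x), integrand(x))
```

If a coefficient were off by a constant factor, the two printed columns would differ by that same factor at every sample point.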
Special functions
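Note.
Since $(e^x)' = e^x$, we have $\int e^t\,dt = e^t + c$;
Since $(\sin(x))' = \cos(x)$, we have $\int \cos(x)\,dx = \sin(x) + c$;
Since $(\cos(x))' = -\sin(x)$, we have $\int \sin(x)\,dx = -\cos(x) + c$;
Since $(\arctan(x))' = \frac{1}{1 + x^2}$, we have $\int \frac{1}{1 + x^2}\,dx = \arctan(x) + c$;
Since $(\arcsin(x))' = \frac{1}{\sqrt{1 - x^2}}$, we have $\int \frac{1}{\sqrt{1 - x^2}}\,dx = \arcsin(x) + c$.
We also know that $(\ln(x))' = 1/x$, but this one is annoying. The domain of $\ln(x)$ is $(0, \infty)$
whereas the domain of $1/x$ is all real numbers except 0. But there turns out to be an easy solution.
Consider the function
$$f(x) = \ln|x| = \begin{cases} \ln(x) & \text{if } x > 0 \\ \ln(-x) & \text{if } x < 0 \end{cases}$$
so if we differentiate, we get
$$f'(x) = \begin{cases} \frac{1}{x} & \text{if } x > 0 \\ \frac{1}{-x}(-1) = \frac{1}{x} & \text{if } x < 0 \end{cases}$$
which is just perfect!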
Remark 7.1.8. There is another little glitch: since there is a gap in the domain of $\ln|x|$, you could
vertically shift the two parts of the graph independently and still have the same derivative. So the
indefinite integral of $\frac{1}{x}$ is the function
$$\int \frac{1}{x}\,dx = \begin{cases} \ln|x| + c_1 & \text{if } x > 0 \\ \ln|x| + c_2 & \text{if } x < 0 \end{cases} \qquad (7.2)$$
where $c_1$ and $c_2$ are constants which might not be equal to each other.
Note. This is too much bother! We will be lazy and just write
$$\int \frac{1}{x}\,dx = \ln|x| + c$$
and if we are ever given a question with two initial conditions (one on each half of the domain)
then we can revert to (7.2).
In that same spirit, for all the functions with vertical asymptotes, we use the same shorthand
notation (understanding that in the unlikely event that we need to adjust the vertical shifts in
different parts, we can, by writing them as spliced functions):
Note.
$$\int \sec^2(x)\,dx = \tan(x) + c$$
$$\int \sec(x)\tan(x)\,dx = \sec(x) + c$$
$$\int \csc^2(x)\,dx = -\cot(x) + c$$
$$\int \csc(x)\cot(x)\,dx = -\csc(x) + c$$
In fact, you could write down lots of rules like this, but it gets a little out of hand. For example,
since $(xe^x)' = e^x + xe^x$ we know that $\int (e^x + xe^x)\,dx = xe^x + c$,
since $(e^{x^2})' = 2xe^{x^2}$ we know that $\int 2xe^{x^2}\,dx = e^{x^2} + c$.
But we can’t possibly write down all the possible rules. What we need are methods to undo the
chain rule and the product rule (coming up in the next two lectures).
Solution: Let $A(t)$ = number of cases, with $A(0) = 340$. We are told that $A'(t) = 500t^2$.
Therefore,
$$A(t) = \int A'(t)\,dt = \int 500t^2\,dt = \frac{500}{3}t^3 + c$$
for some constant $c$. Since $A(0) = 340$, plugging in $t = 0$ gives $c = 340$. Therefore $A(t) = \frac{500}{3}t^3 + 340$.
We want to know the number of cases in 1991, which corresponds to $t = 10$. So plug in:
$$A(10) = \frac{500}{3}(10)^3 + 340 \approx 167{,}000.$$
Example 7.1.10. A bucket falls from a window cleaner’s platform, and experiences constant
acceleration due to gravity of $a = -9.8\ \text{m/s}^2$.
Recall: the rate of change of position is velocity, and the rate of change of velocity is acceleration.
Suppose the platform is 49 m up and the initial speed of the bucket is 0.
(a) Find the equation for the position $p(t)$ of the bucket.
Solution: The velocity is $v(t) = \int a(t)\,dt = \int -9.8\,dt = -9.8t + c$, and $v(0) = 0$ gives $c = 0$, so $v(t) = -9.8t$ m/s. Then $p(t) = \int v(t)\,dt = -4.9t^2 + c'$, and $p(0) = 49$ gives $c' = 49$, so
$$p(t) = -4.9t^2 + 49.$$
(b) After 1 second, the position will be $p(1) = -4.9 + 49 = 44.1$ m above the ground.
(c) It will hit the ground when its position is 0. So we solve $p(t) = 0$ for $t$, which gives
$$0 = -4.9t^2 + 49 \iff 4.9t^2 = 49 \iff t^2 = 10 \iff t = \pm\sqrt{10}\ \text{s};$$
since we are going forward in time, the positive solution is our answer, giving $t \approx 3.2$ s.
(d) Since it hits the ground after $\sqrt{10}$ seconds, its velocity at that time is
$$v(\sqrt{10}) = -9.8(\sqrt{10}) \approx -31\ \text{m/s}.$$
Example 7.1.11. (Buckets, continued) Suppose now that the window cleaner tosses the bucket
straight upwards towards another platform, but it misses and falls to the ground. If the bucket is
thrown with an initial velocity of 10m/s,
Solution: Again, the only force exerted on the bucket is gravity, so we have
$$v(t) = \int a(t)\,dt = \int -9.8\,dt = -9.8t + c.$$
Since $v(0) = 10$, we have $c = 10$, so $v(t) = -9.8t + 10$ m/s. Next, we solve
$$p(t) = \int v(t)\,dt = \int (-9.8t + 10)\,dt = \frac{-9.8}{2}t^2 + 10t + c',$$
and since $p(0) = 49$, we have $c' = 49$. Therefore the equation of motion for the bucket is
$$p(t) = -4.9t^2 + 10t + 49.$$
(a) The highest point is attained where the bucket stops for a moment, that is, when $v(t) = 0$. We
solve
$$v(t) = 0 \iff -9.8t + 10 = 0 \iff t = \frac{10}{9.8} \approx 1.02\ \text{s}.$$
(b) Its position at this time is $p(10/9.8) = -4.9(10/9.8)^2 + 10(10/9.8) + 49 \approx 54.1$ m, which is about
5 m above the platform.
(c) It hits the ground when $p(t) = 0$; the positive solution of $-4.9t^2 + 10t + 49 = 0$ is $t \approx 4.34$ s.
(d) Its speed when it hits the ground will be $v(4.34) \approx -32.6$ m/s, which is about 117 km/h.
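The numbers in this example can be reproduced with a few lines of code. The sketch below assumes the equations $v(t) = -9.8t + 10$ and $p(t) = -4.9t^2 + 10t + 49$ obtained by integrating twice, and uses the quadratic formula for the landing time; the variable names are illustrative.

```python
import math

G = 9.8      # magnitude of gravitational acceleration, m/s^2
V0 = 10.0    # initial upward velocity, m/s
P0 = 49.0    # height of the platform, m

def velocity(t):
    # v(t) = -9.8 t + 10
    return -G * t + V0

def position(t):
    # p(t) = -4.9 t^2 + 10 t + 49
    return -G / 2 * t**2 + V0 * t + P0

t_top = V0 / G  # time when v(t) = 0
# Positive root of -4.9 t^2 + 10 t + 49 = 0 via the quadratic formula
a, b, c = -G / 2, V0, P0
t_ground = (-b - math.sqrt(b**2 - 4 * a * c)) / (2 * a)

print(t_top, position(t_top))
print(t_ground, velocity(t_ground), abs(velocity(t_ground)) * 3.6)  # last value in km/h
```

Multiplying a speed in m/s by 3.6 converts it to km/h, which is where the figure of about 117 km/h comes from.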
Example 7.1.12. (Buckets, continued) Suppose now that the window cleaner tosses the bucket
straight upwards towards another platform 10m higher, and it gets to exactly the correct height
but it isn’t caught and the bucket falls to the ground.
Solution: Again, the only force exerted on the bucket is gravity, so we have
$$v(t) = \int a(t)\,dt = \int -9.8\,dt = -9.8t + c.$$
MAT 1330 : Fall 2020 7.2. TECHNIQUES OF INTEGRATION: SUBSTITUTION
This time we don’t know anything about $v(0)$, so we just have to continue as is. We have
$$p(t) = \int v(t)\,dt = \int (-9.8t + c)\,dt = \frac{-9.8}{2}t^2 + ct + c',$$
where we’ve remembered the constants are most likely different, and $c$ is just a number so the
constant multiple rule applied.
We are given that $p(0) = 49$, so $c' = 49$. Therefore we have two functions:
$$p(t) = -4.9t^2 + ct + 49 \quad \text{and} \quad v(t) = -9.8t + c.$$
The highest point is attained where the bucket stops for a moment, that is, when $v(t) = 0$. We
solve
$$v(t) = 0 \iff -9.8t + c = 0 \iff t = \frac{c}{9.8}.$$
So at time $t = c/9.8$, we’re at the highest point, which according to the question is 10 m above the
platform, so at position $p(t) = 49 + 10 = 59$ m above the ground. Using our equation for $p(t)$ with
$t = c/9.8$ we get
$$59 = -4.9(c/9.8)^2 + c(c/9.8) + 49 \iff 10 = c^2\left(\frac{-4.9}{9.8^2} + \frac{1}{9.8}\right) \iff c^2 = 196$$
so $c = \pm 14$. Since $t = c/9.8$ is the time when it reaches the max, and time is positive, we deduce
that $c = 14$ and $t = 14/9.8 = 10/7$ s. In particular, the initial velocity was $v(0) = c = 14$ m/s.
End of lecture # 18
So far we can only calculate indefinite integrals when the integrand is a function whose anti-
derivative we already know. This is a fairly small list (see Table 7.2.1 in the textbook, for example).
Today we’ll learn and practice a method which is based on undoing the chain rule; we call it the
method of substitution.
But how would we recognize that our integrand has this special form? This is where the special
notation of integrals comes in handy.
Note. If $u = g(x)$, so that $du = g'(x)\,dx$, then
$$\int f'(g(x))\,g'(x)\,dx = \int f'(u)\,du.$$
For example, with $u = x^2$ and $du = 2x\,dx$:
$$\int 2xe^{x^2}\,dx = \int e^{x^2}\,2x\,dx = \int e^u\,du = e^u + c = e^{x^2} + c.$$
To find $\int e^{3x}\,dx$, we try $u = 3x$, which gives $du = 3\,dx$. We don’t have a 3 in the integral, but it’s just a constant,
so we can write $dx = \frac{1}{3}\,du$. That gives:
$$\int e^{3x}\,dx = \int e^u\,\frac{1}{3}\,du = \frac{1}{3}\int e^u\,du = \frac{1}{3}e^u + c = \frac{1}{3}e^{3x} + c.$$
Example 7.2.5. Given $\int \cos(2\pi(x - 1))\,dx$, we try the substitution $u = 2\pi(x - 1)$ which gives
$du = 2\pi\,dx$ or $dx = \frac{1}{2\pi}\,du$ so
$$\int \cos(2\pi(x - 1))\,dx = \int \cos(u)\,\frac{1}{2\pi}\,du = \frac{1}{2\pi}\sin(u) + c = \frac{1}{2\pi}\sin(2\pi(x - 1)) + c.$$
Note. It is nice and easy to make a linear substitution like $u = mx + b$ because then $dx = \frac{1}{m}\,du$,
so it always works.
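With $u = 1/x$, so that $du = -\frac{1}{x^2}\,dx$, we get
$$\int \frac{\sec^2(1/x)}{x^2}\,dx = \int \sec^2(1/x)\,\frac{1}{x^2}\,dx$$
$$= \int \sec^2(u)(-1)\,du$$
$$= -\int \sec^2(u)\,du$$
$$= -\tan(u) + c$$
$$= -\tan(1/x) + c.$$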
Remark 7.2.7. What happens if you pick a “wrong” substitution? What does it mean to say that
it “doesn’t work”? Consider again
$$\int \frac{\sec^2(1/x)}{x^2}\,dx.$$
If you’d tried $u = x^2$, then $du = 2x\,dx$. There is no “$2x$” in the integrand; so we can say $x = \sqrt{u}$
so $du = 2\sqrt{u}\,dx$ or $dx = \frac{1}{2\sqrt{u}}\,du$, fine. But we still have a $1/x$, which is $1/\sqrt{u}$. So OK, it worked:
$$\int \frac{\sec^2(1/x)}{x^2}\,dx = \int \frac{\sec^2(1/\sqrt{u})}{u} \cdot \frac{1}{2\sqrt{u}}\,du = \int \frac{\sec^2(1/\sqrt{u})}{2u^{3/2}}\,du$$
which is definitely worse, not better.
Moral: keep your options open, and keep thinking critically as you go along.
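So, with $u = e^{-3t} + 1$ and $du = -3e^{-3t}\,dt$, we get
$$\int \frac{1}{e^{-3t} + 1}\,e^{-3t}\,dt = \int \frac{1}{u}\left(-\tfrac{1}{3}\right)du = -\frac{1}{3}\ln|u| + c = -\frac{1}{3}\ln|e^{-3t} + 1| + c = -\frac{1}{3}\ln(e^{-3t} + 1) + c$$
where in the last step we noticed that $e^{-3t} + 1 > 0$ for all $t$ so the absolute value sign was superfluous.
We check by differentiating: yes!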
Substitution is a great thing to try whenever you see both a function and its derivative in the
integrand.
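The substitution answers in this section can all be spot-checked the same way: differentiate the proposed antiderivative numerically and compare with the integrand. The sketch below does this for five of the integrals worked out above; the sample points 0.8 and 1.3 and the step size are arbitrary choices, and agreement at a few points is a sanity check rather than a proof.

```python
import math

def derivative(F, x, h=1e-6):
    # Central difference approximation of F'(x)
    return (F(x + h) - F(x - h)) / (2 * h)

# Each entry pairs an integrand with the antiderivative found by substitution
pairs = {
    "2x e^(x^2)": (lambda x: 2 * x * math.exp(x * x),
                   lambda x: math.exp(x * x)),
    "e^(3x)": (lambda x: math.exp(3 * x),
               lambda x: math.exp(3 * x) / 3),
    "cos(2pi(x-1))": (lambda x: math.cos(2 * math.pi * (x - 1)),
                      lambda x: math.sin(2 * math.pi * (x - 1)) / (2 * math.pi)),
    "sec^2(1/x)/x^2": (lambda x: 1 / (math.cos(1 / x) ** 2 * x * x),
                       lambda x: -math.tan(1 / x)),
    "e^(-3t)/(e^(-3t)+1)": (lambda t: math.exp(-3 * t) / (math.exp(-3 * t) + 1),
                            lambda t: -math.log(math.exp(-3 * t) + 1) / 3),
}

# Largest mismatch between F'(x) and the integrand over the sample points
errors = {
    name: max(abs(derivative(F, x) - f(x)) for x in (0.8, 1.3))
    for name, (f, F) in pairs.items()
}
for name, err in errors.items():
    print(name, err)
```

Every printed mismatch should be tiny; a sign error or a missing constant factor in any antiderivative would show up as a mismatch of ordinary size.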
Example 7.2.9. Find
$$\int \frac{\arctan(x)}{1 + x^2}\,dx.$$
This time, there is no composition of functions, so it’s not obvious that substitution is the thing
to do. But we notice that the integrand is a product of a function $u = \arctan(x)$ and its derivative
$du = \frac{1}{1 + x^2}\,dx$, so we will just go ahead and make the substitution and see what happens:
$$\int \frac{\arctan(x)}{1 + x^2}\,dx = \int u\,du = \frac{1}{2}u^2 + c = \frac{1}{2}(\arctan(x))^2 + c.$$
We check by differentiation.
So what happened here was: we couldn’t see the term $f'(g(x))$ that normally clues us in because
in this case $f(u) = \frac{1}{2}u^2$ and so $f'(u) = u$.
Sometimes, you try a substitution without actually realizing what f (u) is going to be.
substitution v = ln(u). So it all works out just fine, it just takes a little longer.
Note. Lesson: sometimes you just try a substitution to see if it will work out.
(where we have gathered all the constants into a new $c'$). We check our answer$^2$ is correct, by
differentiating it:
$$\frac{d}{dx}\left(\frac{3}{2}x^{2/3} - 3x^{1/3} + 3\ln|x^{1/3} + 1| + c'\right)$$
$$= x^{-1/3} - x^{-2/3} + \frac{3}{x^{1/3} + 1}\left(\frac{1}{3}x^{-2/3}\right)$$
$$= \frac{1}{x^{1/3} + 1}\left(x^{-1/3}(x^{1/3} + 1) - x^{-2/3}(x^{1/3} + 1) + x^{-2/3}\right)$$
$$= \frac{1}{x^{1/3} + 1}\left(1 + x^{-1/3} - x^{-1/3} - x^{-2/3} + x^{-2/3}\right)$$
$$= \frac{1}{x^{1/3} + 1}$$
as required, as if by a minor miracle.
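Example 7.2.14.
$$\int \frac{\tan(x)}{\ln(\cos(x))}\,dx$$
We do not see what this will be, but there is a composition of ln with $\cos(x)$ as the innermost
function, and $\tan(x)$ is related to $\cos(x)$, so we give it a shot.
$$u = \cos(x) \implies du = -\sin(x)\,dx$$
so that, writing $\tan(x) = \frac{\sin(x)}{\cos(x)}$,
$$\int \frac{\tan(x)}{\ln(\cos(x))}\,dx = \int \frac{\sin(x)}{\cos(x)\ln(\cos(x))}\,dx = -\int \frac{1}{u\ln(u)}\,du.$$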
$^2$ Careful with Mobius, please: we’ll often rig the questions in Mobius so the absolute value is not needed, because
you’d have to type abs(x) not |x| which is just a pain. In real life, put the absolute values and then remove them
if it is correct to do so; remember the derivative of $\ln|x|$ is $1/x$ for all $x \neq 0$.
Well, this integral is definitely an improvement, but we’re still not done. This time, however, we
see that there is a $\ln(u)$ in our integral, and also a $\frac{1}{u}\,du$, which is exactly what we’d need to make
the substitution for $\ln(u)$! So (and of course we have to take a different letter) we set
$$w = \ln(u) \implies dw = \frac{1}{u}\,du$$
so that our integral becomes
$$-\int \frac{1}{\ln(u)} \cdot \frac{1}{u}\,du = -\int \frac{1}{w}\,dw = -\ln|w| + c$$
$$= -\ln|\ln(u)| + c = -\ln|\ln(\cos(x))| + c.$$
Example 7.2.15.
(4t + 2)2
Z
dt
t2
We look at this integral and think that the most complicated piece is 4t+2 — but wait, a substitution
is the SECOND thing you think of, after “CAN I SIMPLIFY THIS?” because in fact
Not everything needs substitution. Always look to see if you can find an anti-derivative
directly, first.
Don’t be too greedy with your substitution: don’t take u = ln(sin(x)) in one go but instead
start with u = sin(x) and see what happens. You can always do a second substitution.
Be meticulous in your work. Sloppy substitution will give you garbage and is worthless.
Make sure that you have translated every part of your integral to your new variable. Never
write any integral with two different variables in it.
Be flexible. Try a substitution even if you can’t see how it will turn out. If you can’t get it
to work, or it gives a yuckier integral, don’t erase it — you might later see what to do with
it after trying something else.
Example 7.2.17. Consider $\int e^{x^2}\,dx$. If we try $u = x^2$, we would need $du = 2x\,dx$. There is no $x$ in the integral. But we could say: $u = x^2$ so $\sqrt{u} = x$. Therefore $dx = \frac{1}{2x}\,du = \frac{1}{2\sqrt{u}}\,du$. That means we have
\[ \int e^{x^2}\,dx = \int \frac{1}{2\sqrt{u}}\,e^{u}\,du. \]
Nice, but we still don’t see an antiderivative. In fact: there is no formula for a function whose derivative is $e^{x^2}$ (or $u^{-1/2}e^{u}$, for that matter). The antiderivative must exist, but it is a brand new function that has no name or formula besides “an anti-derivative of $e^{x^2}$”.
End of lecture # 19
MAT 1330 : Fall 2020 7.3. TECHNIQUES OF INTEGRATION : INTEGRATION BY PARTS
Last time we learned how to use substitution (which is like an anti-chain rule) to change one integral
into a (hopefully) simpler integral. The goal is to change our integrand into an elementary function
whose anti-derivative we know. Today: we will learn how to use integration by parts, which you can
think of as the anti-product rule.
So of course if your integrand looks like the left side, you can solve it. However, that’s not what
usually happens. Let’s rewrite the above equation by splitting the left side into a sum of two
integrals and moving it to the other side of the equation:
\[ \int f(x)g'(x)\,dx = f(x)g(x) - \int g(x)f'(x)\,dx. \]
This tells us: if you have to solve $\int f(x)g'(x)\,dx$, then by using this identity you can reduce the problem to finding $\int g(x)f'(x)\,dx$. This is great if this other integral is easier!
1. Divide your integrand into two pieces: a function that is easy to differentiate, and one that
is easy to anti-differentiate.
2. Call the piece you will differentiate $u = f(x)$; call the rest $dv = g'(x)\,dx$. Then differentiate $u$ to give $du = f'(x)\,dx$ and choose an antiderivative $v = g(x) = \int dv$. I write this in a little table like
\[ \begin{array}{ll} u = f(x) & dv = g'(x)\,dx \\ du = f'(x)\,dx & v = g(x) \end{array} \]
3. Using the notation of the integral makes the rule easier to remember:
\[ \int u\,dv = uv - \int v\,du \]
since we multiply the two functions (on the diagonal in our table) and then subtract the integral of the product across the bottom row.
4. Note: unlike with substitution, your resulting integral is still in terms of x; just solve and
check.
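We choose to split our integrand into $u = f(x)$ and $dv = g'(x)\,dx$, so:
\[ \begin{array}{ll} u = x & dv = e^{x}\,dx \\ du = dx & v = e^{x} \end{array} \]
\[
\begin{aligned}
\int x e^{x}\,dx &= x e^{x} - \int e^{x}\,dx \\
&= x e^{x} - e^{x} + c \\
&= (x - 1)e^{x} + c
\end{aligned}
\]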
Remark 7.3.2. What happens if we pick $u = e^{x}$ and $dv = x\,dx$? Let’s try:
\[ \begin{array}{ll} u = e^{x} & dv = x\,dx \\ du = e^{x}\,dx & v = \frac{1}{2}x^{2} \end{array} \]
That gives
\[ \int x e^{x}\,dx = \frac{1}{2}x^{2} \cdot e^{x} - \int \frac{1}{2}x^{2} \cdot e^{x}\,dx \]
which is clearly more complicated. It is correct (that is an honest equals sign), but not helpful for your goal of finding an antiderivative, since the new integral is a bit harder to solve than the old one.
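We like to use something that differentiates well for $u$, so $u = f(x) = x^2$; and the rest should be easy to integrate, $dv = e^{3x}\,dx$:
\[ \begin{array}{ll} u = x^{2} & dv = e^{3x}\,dx \\ du = 2x\,dx & v = \frac{1}{3}e^{3x} \end{array} \]
(where we solved $\int e^{3x}\,dx$ either ...). For the integral that remains we use by parts again, with
\[ \begin{array}{ll} u = x & dv = e^{3x}\,dx \\ du = dx & v = \frac{1}{3}e^{3x} \end{array} \]
Altogether:
\[
\begin{aligned}
\int x^{2}e^{3x}\,dx &= \frac{1}{3}x^{2}e^{3x} - \frac{2}{3}\int x e^{3x}\,dx \\
&= \frac{1}{3}x^{2}e^{3x} - \frac{2}{3}\left( \frac{1}{3}x e^{3x} - \frac{1}{3}\int e^{3x}\,dx \right) \\
&= \frac{1}{3}x^{2}e^{3x} - \frac{2}{9}x e^{3x} + \frac{2}{9}\cdot\frac{1}{3}e^{3x} + c \\
&= \frac{1}{3}x^{2}e^{3x} - \frac{2}{9}x e^{3x} + \frac{2}{27}e^{3x} + c'
\end{aligned}
\]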
which we verify is correct by differentiation.
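Repeating the by-parts step is mechanical enough to automate. The sketch below is an illustration rather than anything from the notes: it assumes the general pattern $\int x^{n}e^{ax}\,dx = P(x)e^{ax} + c$ with $u = x^{k}$, $dv = e^{ax}\,dx$ at every step, and the function name `by_parts_coefficients` is hypothetical. It uses exact fractions so the coefficients can be compared with the hand computation.

```python
from fractions import Fraction

def by_parts_coefficients(n, a):
    """Coefficients [p_0, ..., p_n] of the polynomial P(x) for which
    the integral of x^n e^(a x) equals P(x) e^(a x) + c (a must be nonzero)."""
    a = Fraction(a)
    coeffs = [Fraction(0)] * (n + 1)
    scale = Fraction(1)          # factor in front of the integral still to be done
    for k in range(n, -1, -1):
        # By parts with u = x^k, dv = e^(a x) dx, v = (1/a) e^(a x):
        # the "uv" term contributes (scale / a) x^k e^(a x) ...
        coeffs[k] = scale / a
        # ... and the remaining integral is -(k / a) times the x^(k-1) case.
        scale *= -Fraction(k) / a
    return coeffs

# n = 2, a = 3 is the worked example: expect 2/27, -2/9, 1/3.
print(by_parts_coefficients(2, 3))
# n = 1, a = 1 gives -1, 1, that is, (x - 1) e^x.
print(by_parts_coefficients(1, 1))
```

The list is ordered from the constant term upward, so the first call reads as $\frac{2}{27} - \frac{2}{9}x + \frac{1}{3}x^{2}$.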
Example 7.3.4.
\[ \int \ln(x)\,dx \]
We do not know this antiderivative, and there is no substitution to make, since there is only the one function, so we try integration by parts. No choice:
\[ \begin{array}{ll} u = \ln(x) & dv = dx \\ du = \frac{1}{x}\,dx & v = x \end{array} \]
Therefore
\[ \int \ln(x)\,dx = x\ln(x) - \int x \cdot \frac{1}{x}\,dx = x\ln(x) - \int dx = x\ln(x) - x + c \]
where we have remembered that $\int dx = \int 1\,dx$. We check by differentiating!
Example 7.3.5.
\[ \int x\ln(x)\,dx \]
This time we have choices. You might be tempted to set $dv = \ln(x)\,dx$, since we now know the integral (try it! it gets messy) but the strategy is: choose $u$ to have a nice derivative, and $dv$ to have a nice integral. So here we go with
\[ \begin{array}{ll} u = \ln(x) & dv = x\,dx \\ du = \frac{1}{x}\,dx & v = \frac{1}{2}x^{2} \end{array} \]
which gives
\[ \int x\ln(x)\,dx = \frac{1}{2}x^{2}\ln(x) - \int \frac{1}{2}x^{2} \cdot \frac{1}{x}\,dx = \frac{1}{2}x^{2}\ln(x) - \frac{1}{2}\int x\,dx = \frac{1}{2}x^{2}\ln(x) - \frac{1}{4}x^{2} + c \]
which we again check by differentiating.
Example 7.3.6.
\[ \int \frac{\ln(x)}{x^{2}}\,dx \]
Again, doesn’t seem to be one we know, or a good candidate for substitution, so we go to integration by parts; again, $\ln(x)$ is the one you’d like to differentiate, because then it turns into a function in the same family as $x^{-2}$, which will make the integral easier. This means
\[ \begin{array}{ll} u = \ln(x) & dv = \frac{1}{x^{2}}\,dx \\ du = \frac{1}{x}\,dx & v = -x^{-1} \end{array} \]
which gives
\[ \int \frac{\ln(x)}{x^{2}}\,dx = -\frac{\ln(x)}{x} - \int \left( -\frac{1}{x} \right)\frac{1}{x}\,dx = -\frac{\ln(x)}{x} + \int x^{-2}\,dx = -\frac{\ln(x)}{x} - x^{-1} + c \]
Remark 7.3.7. If the answers to the two preceding examples seem strangely similar, you might note that
\[ -2x^{-2}\ln(x) = x^{-2}\ln(x^{-2}) = u\ln(u) \quad \text{with } u = x^{-2}. \]
Example 7.3.8.
\[ \int \frac{\ln(x)}{x}\,dx \]
Given our success, we just dive right in:
\[ \begin{array}{ll} u = \ln(x) & dv = \frac{1}{x}\,dx \\ du = \frac{1}{x}\,dx & v = \ln(x) \end{array} \]
which gives$^3$
\[ \int \frac{\ln(x)}{x}\,dx = \ln(x)\ln(x) - \int \frac{\ln(x)}{x}\,dx \]
(!!!!??!!) But actually: this is marvelous. It’s an equation, and the thing we want ($\int \frac{\ln(x)}{x}\,dx$) can be isolated:
\[ 2\int \frac{\ln(x)}{x}\,dx = \ln(x)^{2} \]
so
\[ \int \frac{\ln(x)}{x}\,dx = \frac{1}{2}(\ln(x))^{2} + c \]
where we remember to put $+c$ in the end. We check by differentiating.
Remark 7.3.9. Actually, in the previous example, we should have noticed that it was a prime candidate for substitution: set $w = \ln(x)$, then $dw = \frac{1}{x}\,dx$, so
\[ \int \frac{\ln(x)}{x}\,dx = \int w\,dw = \frac{1}{2}w^{2} + c = \frac{1}{2}(\ln(x))^{2} + c. \]
That’s easier!
Exercise 7.3.10. Find $\int t^{2}\ln(t)\,dt$.
We don’t know this anti-derivative, and there’s no substitution to make, so we try by parts. Our only real choice is:
\[ \begin{array}{ll} u = \arcsin(x) & dv = dx \\ du = \frac{1}{\sqrt{1-x^{2}}}\,dx & v = x \end{array} \]
which gives
\[ \int \arcsin(x)\,dx = x\arcsin(x) - \int \frac{x}{\sqrt{1-x^{2}}}\,dx. \]
Now we figure out the resulting integral; since we see an $x\,dx$ in the numerator, we’ll do a substitution $w = 1 - x^{2}$ and $dw = -2x\,dx$. (You can use $u$ if you want to, but I don’t want to be confusing.)
\[ \int \frac{x}{\sqrt{1-x^{2}}}\,dx = \frac{-1}{2}\int w^{-1/2}\,dw = -w^{1/2} + c = -\sqrt{1-x^{2}} + c \]
3
We used ln(x) instead of ln |x| here because the ln(x) in the integrand means we are only considering x > 0
anyway.
Therefore:
\[ \int \arcsin(x)\,dx = x\arcsin(x) - \left( -\sqrt{1-x^{2}} \right) + c' = x\arcsin(x) + \sqrt{1-x^{2}} + c' \]
Exercise 7.3.12. Find $\int \arctan(x)\,dx$.
7.3.2 Applying by parts more than once: two different kinds of examples
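Example 7.3.13. Find $\int x^{2}\sin(3x)\,dx$.
This is a product of two unrelated functions so a good candidate for by parts. As usual, we pick for $u$ the one that gets simpler when you differentiate:
\[ \begin{array}{ll} u = x^{2} & dv = \sin(3x)\,dx \\ du = 2x\,dx & v = -\frac{1}{3}\cos(3x) \end{array} \]
So
\[
\begin{aligned}
\int x^{2}\sin(3x)\,dx &= -\frac{1}{3}x^{2}\cos(3x) - \int \left( -\frac{1}{3} \right)\cos(3x)(2x)\,dx \\
&= -\frac{1}{3}x^{2}\cos(3x) + \frac{2}{3}\int x\cos(3x)\,dx.
\end{aligned}
\tag{7.3}
\]
To work out the resulting integral, we need to use by parts again. So let’s solve
\[ \int x\cos(3x)\,dx \]
using by parts:
\[ \begin{array}{ll} u = x & dv = \cos(3x)\,dx \\ du = dx & v = \frac{1}{3}\sin(3x) \end{array} \]
which gives
\[ \int x\cos(3x)\,dx = \frac{1}{3}x\sin(3x) - \frac{1}{3}\int \sin(3x)\,dx = \frac{1}{3}x\sin(3x) + \frac{1}{9}\cos(3x) + c \]
Therefore plugging it back into (7.3) we have:
\[
\begin{aligned}
\int x^{2}\sin(3x)\,dx &= -\frac{1}{3}x^{2}\cos(3x) + \frac{2}{3}\int x\cos(3x)\,dx \\
&= -\frac{1}{3}x^{2}\cos(3x) + \frac{2}{3}\left( \frac{1}{3}x\sin(3x) + \frac{1}{9}\cos(3x) \right) + c' \\
&= -\frac{1}{3}x^{2}\cos(3x) + \frac{2}{9}x\sin(3x) + \frac{2}{27}\cos(3x) + c'
\end{aligned}
\]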
which we check by differentiation.
There’s also a stranger way that doing integration by parts twice can pay off.
This is a product of two unrelated functions so a good candidate for by parts. We have two choices in this case, since both functions differentiate and integrate equally easily; it doesn’t matter which one we take. Let’s go with:
\[ \begin{array}{ll} u = e^{-\theta} & dv = \cos(\theta)\,d\theta \\ du = -e^{-\theta}\,d\theta & v = \sin(\theta) \end{array} \]
so
\[ \int e^{-\theta}\cos(\theta)\,d\theta = e^{-\theta}\sin(\theta) - (-1)\int e^{-\theta}\sin(\theta)\,d\theta. \]
Now the resulting integral looks analogous to the one we had before; it is certainly not easier. But we persevere. We do integration by parts again.
CAREFUL: if at this point you were to choose $u = \sin(\theta)$ and $dv = e^{-\theta}\,d\theta$, you would just UNDO your first step and get back exactly to where you started. Try this, and then compare what happens with the magic of the following steps.
Instead we choose
\[ \begin{array}{ll} u = e^{-\theta} & dv = \sin(\theta)\,d\theta \\ du = -e^{-\theta}\,d\theta & v = -\cos(\theta) \end{array} \]
which gives
\[
\begin{aligned}
\int e^{-\theta}\sin(\theta)\,d\theta &= -e^{-\theta}\cos(\theta) - \int (-e^{-\theta})(-\cos(\theta))\,d\theta \\
&= -e^{-\theta}\cos(\theta) - \int e^{-\theta}\cos(\theta)\,d\theta.
\end{aligned}
\]
Now let’s carefully write out what we’ve figured out, putting all this together:
\[ \int e^{-\theta}\cos(\theta)\,d\theta = e^{-\theta}\sin(\theta) + \left( -e^{-\theta}\cos(\theta) - \int e^{-\theta}\cos(\theta)\,d\theta \right) \]
and the integral we want to solve for DOES NOT CANCEL OUT. In other words, we can add $\int e^{-\theta}\cos(\theta)\,d\theta$ to both sides of this equation to get
\[ 2\int e^{-\theta}\cos(\theta)\,d\theta = e^{-\theta}\sin(\theta) - e^{-\theta}\cos(\theta) \]
or
\[ \int e^{-\theta}\cos(\theta)\,d\theta = \frac{1}{2}e^{-\theta}\sin(\theta) - \frac{1}{2}e^{-\theta}\cos(\theta) + c \]
which we check by differentiation.
Note: in this example, we could have swapped the functions we used for u and dv; the answer
comes out the same.
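For reference, here is one way the check by differentiation for $\int e^{-\theta}\cos(\theta)\,d\theta$ can be written out; it is a sketch of the computation left to the reader, using the product rule on each of the two terms of the answer:
\[
\begin{aligned}
\frac{d}{d\theta}\left( \frac{1}{2}e^{-\theta}\sin(\theta) - \frac{1}{2}e^{-\theta}\cos(\theta) + c \right)
&= \frac{1}{2}\left( -e^{-\theta}\sin(\theta) + e^{-\theta}\cos(\theta) \right) - \frac{1}{2}\left( -e^{-\theta}\cos(\theta) - e^{-\theta}\sin(\theta) \right) \\
&= e^{-\theta}\cos(\theta).
\end{aligned}
\]
The two $\sin(\theta)$ terms cancel and the two $\cos(\theta)$ terms add, which is the same bookkeeping that made the “solve for the integral” trick work.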
MAT 1330 : Fall 2020 7.4. MIXED EXAMPLES, AND APPLICATIONS
Good candidates for u: polynomials, exp, log, trig, inverse trig — anything whose derivative
is a bit simpler
Good candidates for dv: polynomials, exp, sine, cosine — functions whose anti-derivative is
(a) known and (b) hopefully simpler
Keep CAREFUL TRACK of all signs. There’s a minus sign in the formula, and often extra
constants floating around. Be meticulous!
If the result of your by parts doesn’t look helpful, don’t erase it! Try another combination of
udv, or maybe come back and do a second by parts, or look for a substitution .... persistence
is key!
Remember that you have two big methods: substitution and by parts. They sometimes both
work on the same integrand, but most of the time, substitution helps you with compositions
of functions and most of the time, by parts helps you with products of functions.
Example 7.4.1.
\[ \int \sin(\sqrt{x})\,dx \]
Now the messy part is $\sqrt{x}$ so we try a substitution, which effectively turns a composition problem into a product problem. So set $t = \sqrt{x}$. Knowing that we don’t have the derivative available, we shortcut to saying $t^{2} = x$ so $2t\,dt = dx$ by implicit differentiation. Then we can make the substitution:
\[ \int \sin(\sqrt{x})\,dx = \int \sin(t)(2t)\,dt = 2\int t\sin(t)\,dt. \]
This is now a great candidate for by parts:
\[ \begin{array}{ll} u = t & dv = \sin(t)\,dt \\ du = dt & v = -\cos(t) \end{array} \]
which gives
\[ 2\int t\sin(t)\,dt = 2\left( -t\cos(t) - \int (-\cos(t))\,dt \right) = -2t\cos(t) + 2\sin(t) + c = -2\sqrt{x}\cos(\sqrt{x}) + 2\sin(\sqrt{x}) + c. \]
We check by differentiating; as always with by parts, one of the two terms produced by the product rule on the first summand cancels the derivative of the second summand.
(We could also have solved this last one using by parts with $u = \ln(y)$, $dv = \frac{1}{\sqrt{y}}\,dy$ and no substitution.)
Example 7.4.3. A fish grows in length over time by a function $L(t)$ which obeys the pure-time differential equation
\[ L'(t) = 7e^{-0.1t} \text{ cm/year.} \]
Suppose that $L(0) = 0$ (meaning we measure from fertilization); how long until the fish reaches 50 cm in length?
This seems like a stupid answer since the function takes negative values, but let’s find $c$. Since $L(0) = 0$, we have
\[ -70e^{-0.1(0)} + c = 0 \iff c = 70, \]
and thus our answer is
\[ L(t) = 70 - 70e^{-0.1t} = 70(1 - e^{-0.1t}) \]
which makes perfect sense! In fact, we can sketch the graph of $y = L(t)$ to get:
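\[
\begin{aligned}
L(t) = 50 &\iff 1 - e^{-0.1t} = \frac{5}{7} \\
&\iff e^{-0.1t} = 1 - \frac{5}{7} \\
&\iff e^{-0.1t} = \frac{2}{7} \\
&\iff -0.1t = \ln(2/7) \\
&\iff t = \ln(2/7)/(-0.1) \approx 12.5 \text{ years.}
\end{aligned}
\]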
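The arithmetic in the last step is easy to reproduce. The following is a small sketch, assuming the solution $L(t) = 70(1 - e^{-0.1t})$ found above; the function names `length_cm` and `time_to_reach` are made up for this illustration.

```python
import math

def length_cm(t):
    # L(t) = 70 (1 - e^(-0.1 t)), the solution with L(0) = 0.
    return 70 * (1 - math.exp(-0.1 * t))

def time_to_reach(target_cm):
    # Solve 70 (1 - e^(-0.1 t)) = target for t.
    # Only targets in [0, 70) are ever reached, since L(t) < 70 for all t.
    if not 0 <= target_cm < 70:
        raise ValueError("the model never reaches this length")
    return math.log(1 - target_cm / 70) / (-0.1)

t50 = time_to_reach(50)
print(round(t50, 2))    # about 12.53 years
print(length_cm(t50))   # 50, up to rounding
```

The guard on `target_cm` reflects the shape of the graph: the length levels off below 70 cm, so asking for 70 cm or more has no solution.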
Example 7.4.4. The mass of a worm, $M(t)$, changes over time according to the pure-time differential equation $M'(t) = ate^{-t}$, for some positive constant $a$. If $M(0) = 0$, find a formula for $M(t)$ and $\lim_{t\to\infty} M(t)$.
This is a good candidate for by parts:
\[ \begin{array}{ll} u = t & dv = e^{-t}\,dt \\ du = dt & v = -e^{-t} \end{array} \]
giving
\[ M(t) = a\left( -te^{-t} - \int (-e^{-t})\,dt \right) = -ate^{-t} + a\int e^{-t}\,dt = -ate^{-t} - ae^{-t} + c. \]
It seems worrisome that the functions are all negative, but let’s plug in the initial condition of $M(0) = 0$. This gives
\[ -a(0)e^{0} - ae^{0} + c = 0 \iff c = a > 0. \]
So our formula is
\[ M(t) = a(1 - (t+1)e^{-t}). \]
Since $M'(t) > 0$ for all $t > 0$ and $M(0) = 0$, this is always positive for $t > 0$ (try it out!) and
\[ \lim_{t\to\infty} M(t) = \lim_{t\to\infty} a\left( 1 - \frac{t+1}{e^{t}} \right). \]
The quotient gives an indeterminate form of type $\infty/\infty$ in the limit, so we can apply l’Hôpital’s rule
\[ \lim_{t\to\infty} \frac{t+1}{e^{t}} = \lim_{t\to\infty} \frac{1}{e^{t}} = 0, \]
whence $\lim_{t\to\infty} M(t) = a$. The mass of the worm increases over its lifetime to asymptotically approach $a$.
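To see the limit numerically, one can simply tabulate $M(t) = a(1 - (t+1)e^{-t})$ for growing $t$. This is a sketch only: the value $a = 1$ is an arbitrary choice made for the illustration (the example leaves $a$ unspecified), and the function name `mass` is hypothetical.

```python
import math

def mass(t, a=1.0):
    # M(t) = a (1 - (t + 1) e^(-t)); a = 1.0 is an assumed sample value.
    return a * (1 - (t + 1) * math.exp(-t))

for t in (0, 1, 2, 5, 10, 20):
    # The values start at 0, increase, and level off just below a.
    print(t, mass(t))
```

With a different positive `a` the whole table is just rescaled, which matches the formula: `a` only appears as an overall factor.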
MAT 1330 : Fall 2020 7.5. DEFINITE INTEGRALS
Example 7.4.5.
\[ \int \frac{e^{x}}{x}\,dx \]
It is not a function we recognize. If we do a substitution $w = e^{x}$, then $dw = e^{x}\,dx$ which is the numerator (weird!); but we still have the $x$ in the denominator. So we use our first equation to write $x = \ln(w)$, and the substitution becomes
\[ \int \frac{e^{x}}{x}\,dx = \int \frac{dw}{\ln(w)} = \int \frac{1}{\ln(w)}\,dw \]
which we can’t solve, either. If we now try the substitution $t = \ln(w)$ then we will end up right back at $x$.
With
\[ \begin{array}{ll} u = e^{x} & dv = \frac{1}{x}\,dx \\ du = e^{x}\,dx & v = \ln|x| \end{array} \]
we get
\[ \int \frac{e^{x}}{x}\,dx = e^{x}\ln|x| - \int \ln|x|\,e^{x}\,dx \]
which is not better; and if we do by parts again with $u = \ln|x|$ we’ll end up back where we started.
With
\[ \begin{array}{ll} u = x^{-1} & dv = e^{x}\,dx \\ du = -x^{-2}\,dx & v = e^{x} \end{array} \]
we get
\[ \int x^{-1}e^{x}\,dx = x^{-1}e^{x} + \int x^{-2}e^{x}\,dx \]
In fact, the reason for our failure is not for lack of ability: this integral again has no elementary
function as antiderivative; there is no formula for the anti-derivative.
Or is there?
End of lecture # 20
We now get to the geometric interpretation of the integral. Remember that taking the derivative
of a function means finding the slope of the tangent line at every point. It turns out that taking
the integral of a function is about measuring the area under the curve.
The area problem is as follows. Given a function y = f (x), and two points a and b, find the area
of the region bounded by y = f (x), x = a, x = b and the x-axis, as in the following picture:
Our strategy is to divide the interval $[a, b]$ into $n$ subintervals, approximate the area over each
subinterval by the area of a rectangle, and add them all together. If we choose $n$ larger and larger,
then we should get closer and closer to the actual value; we will define the integral as the limit as
$n \to \infty$.
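The rectangle strategy translates almost word for word into code. The sketch below makes two choices that the notes leave open, so treat them as assumptions: the subintervals all have equal width, and the sample point in each one is its midpoint. The function name `riemann_sum` and the test function $f(x) = x^{2}$ on $[0, 1]$ are also just illustrative picks.

```python
def riemann_sum(f, a, b, n):
    """Approximate the area under f on [a, b] using n rectangles
    of equal width, each with height f(midpoint of its subinterval)."""
    width = (b - a) / n
    total_height = sum(f(a + (i + 0.5) * width) for i in range(n))
    return total_height * width

for n in (1, 3, 10, 100, 1000):
    # As n grows, the printed values should settle down to a single number.
    print(n, riemann_sum(lambda x: x**2, 0.0, 1.0, n))
```

Swapping `i + 0.5` for `i` or `i + 1` gives left or right sample points instead; for a reasonable function all three choices approach the same limit as `n` grows.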
But what are we really calculating? To make this feel concrete, suppose that f (t) was the velocity
function (the reading from your speedometer) at time t, and that we’re doing this on a long trip
from time t = 0 to t = 3 (hours).
If we use one interval, then we get one rectangle. We pick our sample point (say t = 1.5) and so
the height is f (1.5) (the speed we were going at that time). The area of the rectangle is
Of course, if our speed varied a lot, this estimate would be pretty bad. But we could do better, by
taking, say, 3 intervals of one hour each:
Choose a moment in each interval, and measure your speed at that moment.
Pretend that we were driving exactly that speed for the entire hour.
Then the area of each rectangle is f (t)km/h × 1h = the distance in km you would have
travelled if you stuck to that speed for one hour.
Adding them together: you have estimated how far you drove in the 3 hours.
As we choose smaller and smaller intervals, we should get better and better estimates of how far
we actually drove. In the limit (when our intervals become “infinitely small”) we should have the
exact distance that we drove.
In other words:
Note. The area under the velocity curve from time a to time b is the distance travelled over that
time period. In other words:
\[ \int_{a}^{b} v(t)\,dt = s(b) - s(a). \]
But there is nothing special about calling f (t) velocity and the answer displacement: this is a general
statement about the area under the curve being a difference of the values of an anti-derivative.
which equals 3(b−a), as expected. That is to say, we had f (x) = 3 and F (x) = 3x, so F (b)−F (a) =
3b − 3a.
as required.
Where it gets interesting is when we calculate areas that we didn’t previously have formulas for.
Example 7.5.4. Find the area under the curve of $y = x^{2}$ between $x = 0$ and $x = 1$.
[Figure: The graph of $y = x^{2}$ between $x = 0$ and $x = 1$, in red. For reference, the unit square is outlined in green, as is its diagonal. The definite integral $\int_{0}^{1} x^{2}\,dx$ represents the area of the region under the red curve but above the x-axis and to the left of $x = 1$.]
So it’s definitely less than $\frac{1}{2}$, since it’s less than half the unit square, but how much should it really be? The Fundamental Theorem of Calculus gives us that
\[ \int_{0}^{1} x^{2}\,dx = \left. \frac{1}{3}x^{3} \right|_{0}^{1} = \frac{1}{3}(1)^{3} - \frac{1}{3}(0)^{3} = \frac{1}{3}. \]
Wow, cool. You can count squares to estimate the area yourself and see if this is reasonable.
When the function is below the x-axis, then the integral can give a negative answer. The correct
interpretation is of net area: the difference between the area under the curve but above the x-axis,
and the area under the x-axis but above the curve.
Example 7.5.6.
\[ \int_{1}^{2} (-3)\,dx = \left. (-3x) \right|_{1}^{2} = (-3(2)) - (-3(1)) = -6 + 3 = -3, \]
which is the negative of the area of the rectangle, because it’s below the axis.
Thinking about the integral as relating to (net) area gives us some very nice consequences.
Example 7.5.7. Suppose $r(t)$ is the rate of change of your population, where your population is given by $n(t)$. So $r(t) = n'(t)$. Then
\[ \int_{a}^{b} r(t)\,dt = n(b) - n(a), \]
that is, the area under the rate-of-change curve is just the total change in population.
Example 7.5.8. If $\rho(x)$ represents the linear density of a rod, then each rectangle corresponds to density times length, which is mass. So in the limit,
\[ \int_{a}^{x} \rho(x)\,dx \]
is the total mass of a piece of rod starting at the point $a$ and ending at $x$. That is, if $\rho(x)$ is the linear density of a rod, then $\rho(x) = m'(x)$, where $m(x)$ is the mass of a length $x$ of the rod, measured from any arbitrary starting point. Then $\int_{a}^{b} \rho(x)\,dx = m(b) - m(a)$.
Remark 7.5.9. What this last example gives us is something unexpected: a formula for the anti-derivative of any function at all.
For example: What is an anti-derivative of $f(x) = e^{-x^{2}}$?
We saw earlier that we couldn’t find a formula for a function $F(x)$ for which $F'(x) = f(x)$. But now we have something:
\[ F(x) = \int_{0}^{x} e^{-t^{2}}\,dt. \]
That is: let $F(x)$ be the function that measures the area under the curve of $f(t) = e^{-t^{2}}$ between 0 and $x$ (the “area so far” function, which we can approximate to any degree of precision we want). Then $F'(x) = f(x)$, so it’s an anti-derivative of $f(x)$. This is quite cool, and is the other part of the fundamental theorem of Calculus (part 1).
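The claim that the “area so far” function has derivative $f$ can be tested numerically. In this sketch, $F(x) = \int_{0}^{x} e^{-t^{2}}\,dt$ is approximated by a midpoint rectangle sum and then differentiated with a difference quotient. The number of rectangles, the step size `h`, the sample point `x = 0.8` and the helper names are all assumptions chosen for the illustration.

```python
import math

def area_so_far(x, n=2000):
    # Midpoint-rule approximation of F(x), the area under e^(-t^2) from 0 to x.
    width = x / n
    return sum(math.exp(-((i + 0.5) * width) ** 2) for i in range(n)) * width

def central_difference(f, x, h=1e-4):
    # Symmetric difference quotient: a numerical stand-in for f'(x).
    return (f(x + h) - f(x - h)) / (2 * h)

x = 0.8
# The slope of the area function at x, next to the height of the curve at x.
print(central_difference(area_so_far, x), math.exp(-x**2))
```

The two printed numbers should agree to several decimal places: the rate at which area accumulates at $x$ is the height of the curve there.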
But one weird thing: the fundamental theorem of calculus doesn’t specify which anti-derivative we
have to use, and we know there are infinitely many choices.
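Suppose $F(x)$ and $G(x)$ are two anti-derivatives of $f(x)$ on the interval $[a, b]$. Then we proved earlier that there must be a constant $c$ such that $F(x) = G(x) + c$. Now FTC tells us that
\[ \int_{a}^{b} f(x)\,dx = F(b) - F(a) \quad \text{and also} \quad \int_{a}^{b} f(x)\,dx = G(b) - G(a). \]
These two answers agree, since $F(b) - F(a) = (G(b) + c) - (G(a) + c) = G(b) - G(a)$.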
Note. This interpretation of the anti-derivative, and definite integrals, is the starting point we’ll
use in MAT1332.
Example 7.5.10. Find $\int_{0}^{1} \arctan(x)\,dx$.
Answer: we begin by finding an antiderivative, that is, solving the indefinite integral
\[ \int \arctan(x)\,dx. \]
We use integration by parts:
\[ \begin{array}{ll} u = \arctan(x) & dv = dx \\ du = \frac{1}{1+x^{2}}\,dx & v = x \end{array} \]
\[ \int \arctan(x)\,dx = x\arctan(x) - \int \frac{x}{1+x^{2}}\,dx, \]
and to solve this new integral, we see a good substitution: $w = 1 + x^{2}$, $dw = 2x\,dx$ or $x\,dx = \frac{1}{2}\,dw$. Thus
\[ \int \frac{x}{1+x^{2}}\,dx = \int \frac{1}{w} \cdot \frac{1}{2}\,dw = \frac{1}{2}\int \frac{1}{w}\,dw = \frac{1}{2}\ln|w| + c = \frac{1}{2}\ln|1+x^{2}| + c = \frac{1}{2}\ln(1+x^{2}) + c \]
whence
\[ \int \arctan(x)\,dx = x\arctan(x) - \frac{1}{2}\ln(1+x^{2}) + c' \]
(for some constant $c'$; here $c' = -c$). We can check this by differentiating. Once we’re sure we have found the general antiderivative, we pick our favourite one (say, the one with no constant term) and apply FTC. Therefore
\[
\begin{aligned}
\int_{0}^{1} \arctan(x)\,dx &= \left. \left( x\arctan(x) - \frac{1}{2}\ln(1+x^{2}) \right) \right|_{0}^{1} \\
&= \left( 1\arctan(1) - \frac{1}{2}\ln(2) \right) - \left( 0\arctan(0) - \frac{1}{2}\ln(1) \right) \\
&= \left( \frac{\pi}{4} - \frac{1}{2}\ln(2) \right) - (0 - 0) \\
&= \frac{\pi}{4} - \frac{1}{2}\ln(2)
\end{aligned}
\]
End of lecture # 21
Appendix A
$\ln(e^{x} + 3) = \ln(e^{x}) + \ln(3) = x + \ln(3)$ : WRONG. In general $\ln(a + b) \neq \ln(a) + \ln(b)$, so the first step is false. (But the second equality is true since $\ln(e^{x}) = x$.)
Solution to Exercise 2.3.14: Solve the equality $|x^{2} - 4| = 5$ following the pattern of the preceding examples to get only two solutions: $x \in \{\pm 3\}$. Make a table of values as you normally do to deduce that the answer is the closed interval $[-3, 3]$.
[Figure: The intersection of $y = |x^{2} - 3|$ and $y = 3x + 1$. There are only two solutions, whereas intersecting $y = x^{2} - 3$ with $y = 3x + 1$ and $y = -(3x + 1)$ gave four solutions.]
Solution to Exercise 3.2.6:
1. If we measure before the daily dose, then we start with $x_t$ mg/L in the blood, and immediately add 10 mg/L, giving $x_t + 10$ mg/L. Then as the day progresses, 25% is absorbed, leaving $0.75(x_t + 10)$. Thus the DTDS is
\[ x_{t+1} = 0.75(x_t + 10) = 0.75x_t + 7.5. \]
It’s a different DTDS! But that should make sense: we’re measuring at a different time in the daily cycle.
2. Using the DTDS $x_{t+1} = 0.75x_t + 10$: If the initial drug level is $x_0 = 8$, then
\[ x_1 = 0.75x_0 + 10 = 0.75 \cdot 8 + 10 = 16 \]
and
\[ x_2 = 0.75x_1 + 10 = 0.75 \cdot 16 + 10 = 22. \]
We plot these on a graph of $x_t$ vs $t$ as points $(0, 8)$, $(1, 16)$ and $(2, 22)$. To make a continuous graph, we have to go beyond the DTDS, and go back to our understanding of drug absorption. So we’d add points like $(0.99, 6)$ and $(1.99, 12)$, representing the concentration in the blood just before the daily dose, and connect the dots to get a zig-zag graph. So measuring after the daily dose gives a (local) maximum, and measuring just before gives a (local) minimum.
3. Let $x_t$ be the amount you owe after the $t$th payment. Then over the course of the month, the bank adds 0.5% of interest to the amount you owe, increasing the total to $x_t + 0.005x_t = 1.005x_t$. But then you pay off \$50, reducing this total, so $x_{t+1} = 1.005x_t - 50$. Our updating function is thus $f(x) = 1.005x - 50$.
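Iterating an updating function by hand, as in item 2, is exactly what a short loop does. The sketch below uses a hypothetical helper `iterate`; the drug-level numbers come from item 2, while the starting loan balance of 1000 in the second call is a made-up value, since item 3 does not give one.

```python
def iterate(update, x0, steps):
    """Return [x_0, x_1, ..., x_steps] for the DTDS x_{t+1} = update(x_t)."""
    values = [x0]
    for _ in range(steps):
        values.append(update(values[-1]))
    return values

# Item 2: drug level measured just after the daily dose, starting from x_0 = 8.
drug = iterate(lambda x: 0.75 * x + 10, 8, 2)
print(drug)   # [8, 16.0, 22.0]

# Item 3: loan balance after each payment; the 1000 is an assumed starting balance.
loan = iterate(lambda x: 1.005 * x - 50, 1000, 3)
print(loan)
```

Changing only the lambda switches models, which mirrors the point of item 1: the same situation measured at a different time of day gives a different updating function.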
3. In words: if you triple every six hours, then you will have done this four times in 24 hours, yielding a total of $3^{4}$ times what you started with. In math: if $t = 0$ is now, and $t = 1$ is in 6 hours, then $t = 4$ is in 24 hours. So we calculate:
\[ x_1 = 3x_0; \quad x_2 = 3x_1 = 3(3x_0) = 9x_0; \quad x_3 = 3x_2 = 3(9x_0) = 27x_0; \quad x_4 = 3x_3 = 3(27x_0) = 81x_0. \]
Therefore the DTDS tells us that $x_4 = 81x_0$. Now let’s write $y_t$ for the population of bacteria in cm$^2$ where $t$ is measured in days; we have decided that the model for growth is
\[ y_{t+1} = 81y_t. \]
Indeed, $y_1 = x_4$.
4. We are told that $x_2 = 100$ and that we want to know $x_0$. We found an equation relating these two values above: $x_2 = 9x_0$. Therefore $x_0 = 100/9$. Check.
Solution to Exercise 3.5.11: The updating function is $f(x) = \frac{4x^{2}}{1+x^{2}}$. Therefore we have to solve
\[ x = \frac{4x^{2}}{1+x^{2}} \iff x(1+x^{2}) = 4x^{2}. \]
We see the common factor of $x$ on both sides, that we would like to cancel. But of course because there is a common factor of $x$, $x^{*} = 0$ is a solution! So that is one fixed point. Continuing:
\[ 1 + x^{2} = 4x \iff x^{2} - 4x + 1 = 0 \iff x^{*} = \frac{4 \pm \sqrt{16-4}}{2} = 2 \pm \frac{1}{2}\sqrt{12} = 2 \pm \sqrt{3}. \]
Since $3 < 4$, $\sqrt{3} < \sqrt{4} = 2$, so both of these are positive. Therefore there are a total of 3 fixed points, all of which are biologically relevant. Do the cobwebbing on a well-drawn graph to determine stability.
Solution to Exercise 3.5.12: In Example 3.4.4, we had the updating function $f(x) = \frac{2x}{1+x}$. Its fixed points are solutions of
\[ x = \frac{2x}{1+x} \iff x(1+x) = 2x \iff x^{2} - x = 0 \iff x(x-1) = 0 \iff x^{*} = 0, 1. \]
We had already cobwebbed with a value near $x^{*} = 0$ and seen that the solution moved away; therefore $x^{*} = 0$ is an unstable fixed point. For $x^{*} = 1$, our work showed that a cobweb from below converges towards the fixed point $x^{*} = 1$; it only remains to do a cobweb from above, say $x_0 = 1.1$. A quick cobweb shows that it also converges to 1, and therefore we conclude that the fixed point $x^{*} = 1$ is stable.
Similarly, if $y = \operatorname{arcsec}(x)$ then $\sec(y) = x$ or $x = 1/\cos(y)$, which means $\cos(y) = 1/x$. Thus $y = \arccos(1/x)$, in other words:
\[ \operatorname{arcsec}(x) = \arccos(1/x). \]
The domain of $\operatorname{arcsec}(x)$ is $(-\infty, -1] \cup [1, \infty)$ and its range is $[0, \pi/2) \cup (\pi/2, \pi]$. Its derivative, using the above formula, the derivative of $\arccos(x)$ and the chain rule, is
\[ \frac{d}{dx}\operatorname{arcsec}(x) = \frac{-1}{\sqrt{1 - (1/x)^{2}}} \cdot \frac{-1}{x^{2}} = \frac{1}{|x|\sqrt{x^{2}-1}}, \]
The graph of y = arcsec(x), which is always increasing, and has the same (mirrored) shape as
arccsc(x).
Now y = arccot(x) is a different situation. We’re asking about x = cot(y), so let’s begin with the
graph of y = cot(x).
The graph of y = cot(x). It is one-to-one (i.e. passes the horizontal line test) on the interval
(0, π), for example.
Thus while it would seem reasonable to deduce arccot(x) = arctan(1/x), the images only overlap
on (0, π/2). So instead we go at it straight:
so
\[ -\csc^{2}(y)\,y' = 1 \iff y' = \frac{-1}{\csc^{2}(y)} = \frac{-1}{1+\cot^{2}(y)} = \frac{-1}{1+x^{2}}, \]
where we used $1 + \cot^{2}(y) = \csc^{2}(y)$ (derived from our favourite identity by dividing both sides by $\sin^{2}(y)$).
Solution to Exercise 6.7.16: The quintic Taylor polynomial for the sine function centered at 0 is
\[ T(x) = x - \frac{1}{3!}x^{3} + \frac{1}{5!}x^{5} \]
(as you can check). So $T(x) \sim \sin(x)$ for $x$ near 0; this means
\[ \sin(1) \sim T(1) = 1 - \frac{1}{6} + \frac{1}{120} = 0.84167, \]
and our calculator tells us that $\sin(1) = 0.8415$.
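The comparison with the calculator can be reproduced in a couple of lines. This is a minimal sketch; the function name `taylor_sin_quintic` is made up for the illustration, and `math.sin` plays the role of the calculator.

```python
import math

def taylor_sin_quintic(x):
    # T(x) = x - x^3/3! + x^5/5!, the quintic Taylor polynomial of sine at 0.
    return x - x**3 / math.factorial(3) + x**5 / math.factorial(5)

print(taylor_sin_quintic(1.0))   # about 0.84167
print(math.sin(1.0))             # about 0.84147
```

Trying inputs further from 0, such as `x = 3.0`, shows the approximation getting visibly worse, which is why the solution says “for $x$ near 0”.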
Solution to Exercise 6.8.4: Consider the DTDS $x_{t+1} = 2x_t(2 - x_t) - hx_t$, for a positive parameter $h$ representing the harvesting rate. We are first asked about when the steady state is positive. So we need to find $x^{*}$. We note $f(x) = 2x(2 - x) - hx$ is the updating function, so the fixed points are all solutions to $f(x) = x$. So $x^{*} = 0$ is a solution; otherwise, we can divide by $x$ and the other fixed point is a solution to $1 = 2(2 - x) - h$ or $\frac{1+h}{2} = 2 - x$ or $x^{*} = 2 - \frac{1+h}{2}$. Thus we have a fixed point satisfying $x^{*} > 0$ only if $2 - \frac{1+h}{2} > 0$ or $2 > \frac{1+h}{2}$ or $4 > 1 + h$ or $h < 3$. Since we said $h > 0$, the answer is: we have a strictly positive steady state if $0 < h < 3$.
Next we asked when the nonzero steady state is stable. So we compute $f'(x) = 4 - 4x - h$. At $x^{*} = 2 - \frac{1+h}{2}$ we have
\[ f'(x^{*}) = 4 - 4\left( 2 - \frac{1+h}{2} \right) - h = -4 + 2(1+h) - h = -2 + h, \]
so that $x^{*}$ is stable iff $|-2 + h| < 1$ iff $-1 < -2 + h < 1$ iff $1 < h < 3$.
Thus there are some harvesting rates $h$ ($0 < h < 1$) for which there is a positive fixed point but it is unstable.
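The stability conclusion can be explored by simulation. The sketch below starts the DTDS a little away from the nonzero fixed point and iterates; the sample harvesting rates, the starting offset of 0.01, the 200 steps, and all function names are assumptions made for the illustration.

```python
def fixed_point(h):
    # Nonzero fixed point x* = 2 - (1 + h)/2 of x_{t+1} = 2 x_t (2 - x_t) - h x_t.
    return 2 - (1 + h) / 2

def slope_at_fixed_point(h):
    # f'(x*) with f'(x) = 4 - 4x - h; the solution shows this equals h - 2.
    return 4 - 4 * fixed_point(h) - h

def iterate(h, x0, steps):
    # Apply the updating function `steps` times starting from x0.
    x = x0
    for _ in range(steps):
        x = 2 * x * (2 - x) - h * x
    return x

for h in (0.5, 1.5, 2.5):
    x_star = fixed_point(h)
    # Columns: h, x*, f'(x*), and where the orbit is after 200 steps.
    print(h, x_star, slope_at_fixed_point(h), iterate(h, x_star + 0.01, 200))
```

For the two rates with $1 < h < 3$ the final value should sit essentially on $x^{*}$, while for $h = 0.5$ the orbit stays positive but does not settle there, in line with $|f'(x^{*})| = 1.5 > 1$.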