Calculus 1 - Amber Habib (MAT101)
chapter sets up the basic language for describing quantities and relationships
between them. Quantities are described by numbers and you would have seen
different kinds of numbers: Natural numbers, whole numbers, integers, rational
numbers, real numbers, perhaps complex numbers. Of all these, real numbers
provide the right setting for the techniques of Calculus and so we begin by
listing their properties and understanding what distinguishes them from other
number systems. The key element here is the Completeness Axiom, without
which Calculus would lose its power.
wait later, and we need to get ready for them by practicing on easier material.
If you are in a hurry and confident of your basic skills with numbers and proofs,
you may skip ahead to the next section, although a patient reading of these few
pages would also help in later encounters with Linear Algebra and Abstract
Algebra.
Any concept of `numbers' involves rules for combining them to create new
ones. We shall use the term binary operation to denote a rule for associating
a single member of a set to each pair of elements from that set.
[Figure: the real number line R, with the points 0 and 1 marked.]
The set R is equipped with two binary operations, + (addition) and · (multi-
plication), and has two special elements named zero (0) and one (1), with the
following fundamental properties:
Proof. The cancellation laws are based on associativity and the existence of
inverses:
a + b = a + c =⇒ (−a) + (a + b) = (−a) + (a + c)
=⇒ ((−a) + a) + b = ((−a) + a) + c
=⇒ 0 + b = 0 + c =⇒ b = c.
0′ = 0 + 0′ = 0 and 1′ = 1′ · 1 = 1.
4 Chapter 1. Real Numbers and Functions
a + b = 0 = a + b′ =⇒ b = b′, by the cancellation law. □
The properties R8 to R12 are called the order axioms of R. Let us see some
of their consequences:
Theorem 1.1.5.
1. If x, y ∈ R− then x + y ∈ R− .
1.1. Arithmetic and Order Properties 5
2. If x, y ∈ R− then xy ∈ R+ .
3. If x ∈ R+ and y ∈ R− then xy ∈ R− .
4. If x ∈ R∗ then x² ∈ R+ .
5. 1 ∈ R+ .
Again, these are familiar properties which you were asked to memorize
at school. We wish to convert them to proved facts. We treat the first
two to show you the way, and leave the others as exercises.
Proof. 1. x, y ∈ R− =⇒ −x, −y ∈ R+ =⇒ (−x) + (−y) ∈ R+
=⇒ x + y = −((−x) + (−y)) ∈ R− .
2. x, y ∈ R− =⇒ −x, −y ∈ R+ =⇒ (−x)(−y) ∈ R+
=⇒ xy = (−x)(−y) ∈ R+ .
□
The split into positive and negative allows us to think of larger and smaller
real numbers (an `ordering') as follows: We say that a is greater than b, denoted
by a > b, if a − b ∈ R+ . In this case, we also say that b is less than a and
denote that by b < a.
Let A be a subset of R.
An element M ∈ A is called the maximum of A if a ≤ M for every
a ∈ A. We write M = max(A).
By combining (5) of Theorem 1.1.5 and (4) of Theorem 1.1.6 we see that
1 < 2 < 3 < ···.
By including zero with the natural numbers we get the whole numbers:
W = N ∪ {0} = {0, 1, 2, . . . }.
By further including the additive inverse of each whole number we get the
integers:
Z = {. . . , −2, −1, 0, 1, 2, . . . }.
We make a small digression to recall some important facts about the natural
numbers.
Strong induction has that name because its hypothesis is weaker than ordinary
induction. That is, if A satisfies the hypothesis of ordinary induction, it will
also satisfy the hypothesis of strong induction. One can see from this that the
Principle of Strong Mathematical Induction implies the Principle of Mathe-
matical Induction. In fact, all three of these principles are equivalent. Starting
from any one, we can prove the others. (See Exercises 6 to 8) Their typical
uses are quite different: The Induction Principles are typically used to prove
something involving all natural numbers, while the Well Ordering Principle is
used to show the existence of special numbers.
Absolute Value
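Let us finish this overview of familiar facts about the real numbers by recalling
the definition of the absolute value of a real number x:
$$|x| = \begin{cases} x & \text{if } x \ge 0, \\ -x & \text{if } x < 0. \end{cases}$$
[Figure: the number line R with the points −x, 0 and x marked; both −x and x lie at distance |x| from 0.]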
i=1
+ 0 1 2 · 0 1 2
0 0 1 2 0 0
and
1 1 1 0 1 2
2 2 2 2
3. In each case below, find the numbers with the given property and sketch
the solutions on the number line:
4. Find the numbers which meet all the conditions of the previous exercise.
wherever required:
(a) $\sum_{k=1}^{n} k = \frac{n(n+1)}{2}$, (b) $\sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}$.
12. Recall that the factorials of whole numbers are defined by 0! = 1 and
n! = n · (n − 1)! for n ∈ N. Further, the binomial coefficients are defined by
$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$ for 0 ≤ k ≤ n. Prove the following for 0 ≤ k ≤ n:
(a) $\binom{n+1}{k+1} = \binom{n}{k} + \binom{n}{k+1}$, (b) $\binom{n}{k} \in \mathbb{N}$.
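A quick numerical sanity check can build confidence in the summation and binomial identities above before attempting the induction proofs. The sketch below is only an illustration (the helper names `binom` and `check_identities` are not from the text), and testing finitely many cases proves nothing in general.

```python
from math import factorial

def binom(n, k):
    """Binomial coefficient n! / (k! (n-k)!) for 0 <= k <= n."""
    return factorial(n) // (factorial(k) * factorial(n - k))

def check_identities(max_n=30):
    """Return True if the identities hold for every n from 1 to max_n."""
    for n in range(1, max_n + 1):
        # Sum of the first n natural numbers and of their squares
        if sum(range(1, n + 1)) != n * (n + 1) // 2:
            return False
        if sum(k * k for k in range(1, n + 1)) != n * (n + 1) * (2 * n + 1) // 6:
            return False
        for k in range(n + 1):
            # n! is exactly divisible by k! (n-k)!, so the coefficient is a whole number
            if factorial(n) % (factorial(k) * factorial(n - k)) != 0:
                return False
        for k in range(n):
            # Pascal's rule, checked only where every term is defined (k + 1 <= n)
            if binom(n + 1, k + 1) != binom(n, k) + binom(n, k + 1):
                return False
    return True
```

Calling `check_identities()` tests every n up to 30; a `False` result would point to a mistake in one of the formulas.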
R13. Suppose A and B are non-empty subsets of R such that a ≤ b for every
a ∈ A and b ∈ B . Then there is a real number m such that a ≤ m ≤ b for
every a ∈ A and b ∈ B .
This is called the Completeness Axiom and it has been established that
the real numbers form the only system which satisfies the field and order axioms
as well as the Completeness Axiom.
You are probably aware that the rational numbers do not have this
property. For example, there is no rational number whose square is 2.
Thus the Completeness Axiom distinguishes R from Q.
1.2. Completeness Axiom and Archimedean Property 11
(We call b the positive square root of a and denote it by a^{1/2} or √a.)
The Completeness Axiom lends itself to showing the existence of a number
with a particular property by locating it between numbers which
are too large or too small to have that property. To find a number whose
square is a we create one class of numbers whose squares are greater
than a and another of numbers whose squares are less than a.
Let A = { x ∈ R+ | x² < a } and B = { y ∈ R+ | y² > a }. Then
1 ∈ A and a ∈ B. So A and B are non-empty.
Now x ∈ A and y ∈ B implies that x² < a < y², and hence x < y. By the
Completeness Axiom, there is a number b ∈ R+ such that x ≤ b ≤ y for every
x ∈ A, y ∈ B. We'll show that b² = a.
First, suppose b² > a. Then b² = a + δ with δ > 0. Let's take a number
b′ which is slightly smaller than b. It will have the form b′ = b − h with h > 0.
Hence
(b′)² = b² − 2bh + h² > b² − 2bh = a + δ − 2bh.
If we take h = δ/2b, we have (b′)² > a and hence b′ ∈ B. But this contradicts
the fact that b′ < b.
Second, suppose b² < a. Then b² = a − δ with δ > 0. Let's take a number
b′ which is slightly greater than b. It will have the form b′ = b + h with h > 0.
Choose h to be less than both b and δ/3b. Then
(b′)² = b² + 2bh + h² < b² + 3bh < b² + δ = a,
so b′ ∈ A. But this contradicts the fact that b′ > b. Hence b² = a.
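The squeezing idea of this proof can be imitated numerically. The following sketch is an illustration only, with a hypothetical function name: it keeps one number whose square is less than a and one whose square is at least a, and halves the gap between them at each step. It approximates the square root; the existence of the exact value still rests on the Completeness Axiom.

```python
def sqrt_by_squeezing(a, steps=60):
    """Squeeze the positive square root of a > 1 between two floats.

    Returns (lo, hi) with lo*lo < a <= hi*hi in floating-point arithmetic.
    """
    if a <= 1:
        raise ValueError("this sketch assumes a > 1")
    lo, hi = 1.0, float(a)          # 1*1 < a and a*a > a when a > 1
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid * mid < a:
            lo = mid                # the square of mid is too small
        else:
            hi = mid                # the square of mid is large enough
    return lo, hi
```

A fixed number of halving steps is used instead of a tolerance so that the loop always terminates, whatever the size of a.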
One can prove in a similar fashion, for any a ∈ R+ and n ∈ N, the existence
of the positive nth root a^{1/n}. However, we shall wait until we have studied the
exponential function, when we shall have an extremely simple proof. In the
meantime, we'll use only square roots and not cube roots or any other nth
roots.
Let A be a subset of R.
The supremum and infimum of an interval are called its endpoints. All
other points of the interval are called its interior points.
An interval is called closed if it contains its endpoints. Intervals of the
form [a, b], [a, ∞), (−∞, b] and (−∞, ∞) are closed.
Task 1.2.5. Let I be an interval and a, b ∈ I . Show that a < x < b implies
x ∈ I.
Theorem 1.2.6 (Archimedean Property of R, ver. 1). The set N is not bounded
above in R.
DRAFT August 15, 2020
Proof. Suppose N is bounded above. Then the set B of all upper bounds of N is
non-empty. Further, if a ∈ N and b ∈ B then a ≤ b. Hence, by the Completeness
Axiom, there is a real number α such that a ≤ α ≤ b for every a ∈ N, b ∈ B .
Task 1.2.7. Show that Z has neither an upper nor a lower bound in R.
Theorem 1.2.8 (Archimedean Property of R, ver. 2). Let x, y ∈ R+ . Then
there exists N ∈ N such that N x > y.
Proof. There is N ∈ N such that Nx > y > 0. Hence 0 < y/N < x. □
The next result is the one Archimedes used to prove formulas for lengths,
areas and volumes of a variety of shapes, and is the reason that this kind of
reasoning is called `Archimedean'.
Theorem 1.2.10. Suppose x, y ∈ R and M > 0 such that y − M/n ≤ x ≤ y + M/n
for every n ∈ N. Then y = x.
$$\frac{M-1}{N} = \frac{M}{N} - \frac{1}{N} \ge y - \frac{1}{N} > x \implies M - 1 \in A,$$
which is a contradiction. Therefore M/N < y. Hence M/N ∈ (x, y). □
2. Show that if x, y ≥ 0 then (x + y)/2 ≥ √(xy), with equality if and only if x = y.
3. Prove that every non-empty subset of R which is bounded below has a
greatest lower bound.
$$1,\quad 1 + \frac{1}{\sqrt{2}},\quad 1 + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}},\quad 1 + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \frac{1}{\sqrt{4}},\quad \ldots.$$
6. Show that the following sets are bounded, and find their supremum:
$$1,\quad 1 + \frac{1}{2^2},\quad 1 + \frac{1}{2^2} + \frac{1}{3^2},\quad 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2},\quad \ldots.$$
$$\inf(cA) = \begin{cases} c \cdot \inf(A), & c \ge 0 \text{ and } A \text{ is bounded below}, \\ c \cdot \sup(A), & c < 0 \text{ and } A \text{ is bounded above}. \end{cases}$$
12. Produce a rational number as well as an irrational number that lie between
17/12 and √2.
1.3 Functions
A function f from a set X to a set Y is usually described as a rule that
associates exactly one element of Y to each element of X. The element of Y
associated to x ∈ X is denoted by f(x). This is not quite a formal definition as
one has to wonder what is allowed as a `rule'. We can do better by stating our
requirements purely in terms of membership of sets:
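[Figure: four arrow diagrams, labelled (A), (B), (C) and (D), each relating points of a domain to points of a codomain.]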
(A) does not represent a function because there is a point in the domain that
has no image. (B) also does not represent a function, since there is a point in
the domain that has two images. (C) and (D) do represent functions, since it
is permitted for points in the codomain of a function to have no pre-image as
well as to have multiple pre-images.
Task 1.3.3. Consider a binary operation on a set X . Can you describe it as
a function with a certain domain and codomain?
We have been using the name f for a function. This is simply the most
commonly used notation for a function (as x is for a variable), but we are free
to use any other letter, symbol, or word. Other popular choices are g , h, u, v ,
F , G, H , η , θ and so on. Functions that are particularly important have their
own names such as sin, cos, exp and log.
A function f: X → Y is called one-one or injective if distinct points in
X have distinct images in Y: If a, b ∈ X and a ≠ b then f(a) ≠ f(b).
Task 1.3.4. Show that f : X → Y is one-one if and only if f (a) = f (b) implies
a = b.
Example 1.3.5. Consider the functions depicted below.
[Figure: two arrow diagrams, labelled (A) and (B); in (B) the points a and b have the same image.]
The function in (A) is 1-1 because distinct points in the domain are mapped
to distinct points in the codomain. The function in (B) is not 1-1 because the
points a and b are mapped to the same value.
[Figure: two arrow diagrams, labelled (A) and (B); in (A) the point z of the codomain has no pre-image.]
The function in (A) is not onto because the point z in the codomain has no
pre-image. The function in (B) is onto.
Task 1.3.7. Find out whether the following functions are 1-1 or onto. If a
function is not onto, give its range.
1. f : R → R, f(x) = ½ (x + |x|).
2. g : R → R, g(x) = x².
3. h : R → R, $h(x) = \begin{cases} x^2 + x + 1 & \text{if } x \ge 0, \\ x + 1 & \text{if } x < 0. \end{cases}$
A function f: X →Y is called a one-one correspondence or bijection
if it is both one-one and onto.
The function in (A) is one-one but not onto. The function in (B) is onto but
not one-one. Finally, the one in (C) is both one-one and onto, hence it is a
bijection.
(f⁻¹ ◦ g⁻¹) ◦ (g ◦ f) = ((f⁻¹ ◦ g⁻¹) ◦ g) ◦ f = (f⁻¹ ◦ (g⁻¹ ◦ g)) ◦ f
= (f⁻¹ ◦ 1_Y) ◦ f = f⁻¹ ◦ f = 1_X,
(g ◦ f) ◦ (f⁻¹ ◦ g⁻¹) = ((g ◦ f) ◦ f⁻¹) ◦ g⁻¹ = (g ◦ (f ◦ f⁻¹)) ◦ g⁻¹
= (g ◦ 1_X) ◦ g⁻¹ = g ◦ g⁻¹ = 1_Y. □
(a) f : R \ {0} → R, f(x) = 1/x.
(b) g : R \ {1} → R, g(x) = x/(1 − x).
(c) h : R \ {0, 1} → R, h(x) = 1/(x(1 − x)).
2. Show that the following functions are bijections, and find their inverses:
(a) f : R \ {0} → R \ {0}, f(x) = 1/x.
(b) g : R \ {1} → R \ {−1}, g(x) = x/(1 − x).
(c) h : [1/2, 1) → [4, ∞), h(x) = 1/(x(1 − x)).
(a) f ◦ f = f,
(b) f ◦ f = 1_R and f ≠ 1_R ,
(c) f ◦ f = 0.
7. Give bijections between the following sets:
(a) N and W.
(b) N and Z. (Hint: 0, −1, 1, −2, 2, . . . )
1
(c) [0, 1) and [0, ∞). (Hint: )
1
8. Give a bijection between N and N × N. The diagram below gives a hint for
one such bijection:
(This particular bijection will be useful when we multiply infinite series in the
last chapter. Your task is to find a formula for it.)
Task 1.4.1. Identify the domains of the real functions given by the following
rules:
(a) f(x) = √(1 − x²),
(b) g(x) = 1/x,
(c) h(x) = (x − 1)(x − 2),
(d) k(x) = 1/√(1 − x²).
[Figure: graph of f(x), with −1 and 1 marked on the x-axis.]
We have utilized a useful convention in depicting this graph. The unfilled circle
at the origin indicates that the origin is not part of the graph. The filled circle
at (0, 1) emphasizes that the function value at x = 0 is 1.
Sign Function
The sign or signum function is defined by
$$\operatorname{sgn}(x) = \begin{cases} -1, & \text{if } x < 0, \\ 0, & \text{if } x = 0, \\ 1, & \text{if } 0 < x. \end{cases}$$
[Figures: graph of the sign function; a graph with both axes marked from −2 to 2; the vertical shifts f + c and f − c of the graph of f; the vertical scalings cf and −cf of the graph of f.]
Task 1.4.4. Let f be a real function with domain A and let c be a real number.
What are the domains of g(x) = f (x + c) and h(x) = f (cx)?
Consider the function g(x) = f (x + c) with c > 0. We have f (x) = f ((x −
c) + c) = g(x − c). That is, the value taken by f at x is taken by g at x − c.
Thus the graph of g is a horizontal shift to the left of the graph of f .
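The relation f(x) = g(x − c) is easy to test on a concrete function. The sketch below is a small illustration with hypothetical names (`shift_left`, `square`): it builds g(x) = f(x + c) and lets us confirm that the lowest point of the parabola moves from 0 to −c.

```python
def shift_left(f, c):
    """Return the function g defined by g(x) = f(x + c)."""
    def g(x):
        return f(x + c)
    return g

def square(x):
    """The standard parabola f(x) = x^2, with its lowest point at x = 0."""
    return x * x

# With c = 2 the graph moves two units to the left: g(x) = (x + 2)^2
g = shift_left(square, 2)
```

Here `g(-2)` equals `square(0)`, and more generally `square(x)` equals `g(x - 2)` for every x, which is the statement f(x) = g(x − c).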
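[Figure: the graph of g is the graph of f shifted horizontally, with f(x) = g(x − c).]
[Figure: the graph of h is the graph of f scaled horizontally, with f(x) = h(x/c).]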
Note that the graph will contract if c>1 and will stretch if c < 1.
Task 1.4.6. Describe the graph of h(x) = f (cx) when c < 0.
Task 1.4.7. Recall that the graph of f(x) = x² is an upward opening parabola.
Use your understanding of shifts and scalings to plot the graphs of the
following on the same xy-plane:
(a) g(x) = (x − 2)² + 1, (b) h(x) = 4x² + 12x + 5.
2. The graph of f is symmetric with respect to the y -axis: f (−x) = f (x) for
every x ∈ X.
An example of an even function:
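[Figure: graph of an even function, taking the same value f(x) at −x and at x.]
[Figure: graph of an odd function, with the values f(x) at x and f(−x) at −x marked.]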
Task 1.4.8. Determine whether the following functions are even, or odd, or
neither: |x|, sgn(x) and [x].
Task 1.4.9. Can a function be both even and odd?
Monotonic Functions
A real function f : A → R is called an increasing function if, for all points
x, y ∈ A, x ≤ y implies f (x) ≤ f (y). It is called strictly increasing if x < y
implies f (x) < f (y).
Periodic Functions
We have been looking at functions which are special in having some regularity.
Even and odd functions have reflection symmetry. Monotonic functions have a
persistent trend of either growth or decay. Another kind of regularity is when a
function represents a cyclic phenomenon, one in which the same pattern repeats
indefinitely.
(b) $g(x) = \begin{cases} 0, & [x] \text{ is even} \\ 1, & [x] \text{ is odd} \end{cases}$ has period 2.
The function f is called the `sawtooth wave' and g is the `square wave'.
1.4. Real Functions and Graphs 27
Arithmetic of Functions
Let f, g be real functions. We use them to define new functions as follows:
Inverse Functions
Suppose I, J are subsets of R and f : I → J is a bijection. Then we have the
inverse function f⁻¹ : J → I. We'll establish a relationship between the graphs
of f and f⁻¹:
(x, y) is in the graph of f ⇐⇒ y = f(x)
⇐⇒ x = f⁻¹(y)
⇐⇒ (y, x) is in the graph of f⁻¹
Polynomials
A monomial is an expression of the form x^n where n = 0, 1, 2, . . . . If we let x
vary over real numbers, a monomial gives a real function. Here are graphs of
some monomial functions:
[Figure: graphs of the monomial functions x⁰, x¹, x², x³ and x⁴.]
$$p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 = \sum_{i=0}^{n} a_i x^i,$$
[Figure: the graphs of x³ and −x are added to give the graph of x³ − x.]
If p(x) is a polynomial and c is a real number such that p(c) = 0 then c is called
a zero or root of p(x).
Let us recall the following from our school algebra:
Theorem 1.4.18 (Division Algorithm for Polynomials). Let p(x) and q(x) be
polynomials with q ≠ 0. Then there are unique polynomials s(x) and r(x) such
that p(x) = s(x)q(x) + r(x) and either deg r < deg q or r = 0.
$$p(x) = r_0(x) + \frac{a_n}{b_m} x^{n-m} q(x) = s_0(x) q(x) + r(x) + \frac{a_n}{b_m} x^{n-m} q(x) = \left( s_0(x) + \frac{a_n}{b_m} x^{n-m} \right) q(x) + r(x).$$
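The reduction step in this argument, subtracting (a_n/b_m) x^{n−m} q(x) to lower the degree, can be repeated mechanically until the remainder has degree less than deg q. The sketch below is one possible illustration, not the text's own procedure: the function name and the convention of listing coefficients from the highest power down to the constant term are assumptions made here, and `Fraction` is used to keep the arithmetic exact.

```python
from fractions import Fraction

def poly_divmod(p, q):
    """Divide p by q and return (s, r) with p = s*q + r and deg r < deg q.

    Polynomials are coefficient lists, highest power first.
    The empty list stands for the zero polynomial.
    """
    r = [Fraction(c) for c in p]
    q = [Fraction(c) for c in q]
    while q and q[0] == 0:          # drop leading zeros of the divisor
        q.pop(0)
    if not q:
        raise ZeroDivisionError("q must not be the zero polynomial")
    while r and r[0] == 0:          # drop leading zeros of the dividend
        r.pop(0)
    s = [Fraction(0)] * max(len(r) - len(q) + 1, 0)
    while len(r) >= len(q):
        coeff = r[0] / q[0]         # a_n / b_m
        shift = len(r) - len(q)     # n - m
        s[len(s) - 1 - shift] = coeff
        for i, b in enumerate(q):   # subtract coeff * x^shift * q(x)
            r[i] -= coeff * b
        r.pop(0)                    # the leading term has cancelled
        while r and r[0] == 0:
            r.pop(0)
    return s, r
```

For example, dividing x³ − x by x − 1 with `poly_divmod([1, 0, -1, 0], [1, -1])` gives the quotient x² + x and zero remainder, reflecting that 1 is a root of x³ − x.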
If c is a root of p(x), its multiplicity is the largest natural number k such
that (x − c)^k divides p(x).
Theorem 1.4.20. Let p(x) be a polynomial of degree n ≥ 1. Then p(x) has at
most n roots, counting multiplicities.
Rational Functions
A rational function has the form p(x)/q(x) where p(x) and q(x) are polynomials
and q(x) is not the zero polynomial.
Example 1.4.21. We shall sketch the graph of f(x) = 1/((x − 1)(x + 1)). We
start by considering the graph of (x − 1)(x + 1):
[Figure: graph of (x − 1)(x + 1), with −1 and 1 marked on the x-axis.]
The reciprocal function f (x) will take values of large magnitude where (x −
1)(x + 1) takes values of small magnitude, and of the same sign. So, as we
approach 1 from the right, f (x) will take large positive values, while from the
left it will take large negative values. Thus we obtain the following graph:
[Figure: graph of f(x) = 1/((x − 1)(x + 1)), with −1 and 1 marked on the x-axis.]
(a) f(x) = √(x² − 2),
(b) g(x) = 1/[x],
(c) h(x) = √(x(x² − 1)),
(d) k(x) = 1/(x − [x]).
2. Draw the graphs of the following functions, not by plotting points but by
transforming the graphs of the standard functions like x, x² and 1/x.
(a) 1 − x²
(b) x² − 4x + 3
(c) √(x − 1)
(d) x/(x − 4)
(a) (b)
(a) (b)
(a) When f and g are both even. (c) When f is even and g is odd.
(b) When f and g are both odd. (d) When f is odd and g is even.
(b) Show f can be written as the sum of an even function and an odd
function.
11. Define x⁺ = max{0, x}. Draw the graphs of the following functions:
15. Show that if a function has either of the following pairs of properties, it
must be constant:
16. The graph of a function is given. Draw the graph of its inverse function.
(a) (b)
Each chapter of this book ends with one or two sets of thematic exercises.
These either develop applications of the material of that chapter, or illustrate
theoretical concerns that future courses would take up in detail. It is not
essential that you solve them at first sight, but you should at least browse them and
keep them in mind as you read on. Chances are you will suddenly recognize a
relevant idea and how to apply it here. Or, studying another course, you will
see how these exercises support the techniques you are learning there.
A2. Find the unique linear or quadratic function that passes through the
points (−h, a), (0, b) and (h, c), with h > 0.
Actual data has errors and a perfect match to imperfect data has little
importance. It is more useful to find a function which is only an approximate
match but is easy to work with or allows some special insight. The most common
approach is to find a line which passes as close as possible to the data points.
This can be done by geometry!
The goal is to find a, b such that E(a, b) is as small as possible. The
corresponding line is called the least squares line for the data.
A5. Show that the problem of minimizing the total squared error is equivalent
to considering the plane $\Pi = \{\, a\vec{x} + b\vec{v} \mid a, b \in \mathbb{R} \,\}$ with $\vec{v} = (1, \ldots, 1)$, and
finding the member closest to $\vec{y}$.
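A6. Show that the least squares line y = ax + b is given by
$$a = \frac{n \sum_{i=1}^{n} x_i y_i - \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}, \qquad b = \frac{\sum_{i=1}^{n} y_i - a \sum_{i=1}^{n} x_i}{n}.$$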
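The formulas of A6 translate directly into a few lines of code. The sketch below is a minimal illustration under the assumption that the data are given as two equal-length lists; the function name is hypothetical, and the check for a zero denominator covers the case where all the x values coincide and no such line exists.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y = a*x + b through the data."""
    n = len(xs)
    if n != len(ys) or n == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values coincide; the slope is undefined")
    a = (n * sum_xy - sum_x * sum_y) / denom
    b = (sum_y - a * sum_x) / n
    return a, b
```

For data lying exactly on a line, such as the points (0, 1), (1, 3), (2, 5), (3, 7), the function recovers that line: slope 2 and intercept 1.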
Cardinality
We consider two sets as having the same amount (why didn't we say `number'?)
of elements if there is a bijection between them. In this case we say the two sets
have the same cardinality. A set is called finite if it is empty or it has the
same cardinality as {1, . . . , n} for some n ∈ N. Otherwise it is called infinite.
The most familiar infinite set is N. One can ask if all infinite sets have the
same cardinality. The first surprise is that it is not so easy to go beyond N. In
Exercises 7 and 8 of 1.3 you were asked to show that W, Z and even N × N
have the same cardinality as N. A set which is either finite or has the same
cardinality as N is called countable.
B1. Prove that the following function f: N×N→N is a bijection:
can be written as p/q in many ways and we have to account for this. We
present below an elegant bijection which appears to have been first discovered
by an undergraduate student in 1960! (McCrimmon [44]) It is based on the
Fundamental Theorem of Arithmetic: Every natural number greater than
one can be written uniquely as a product of prime powers. That is, if n ∈ N
and n > 1, then there is a unique choice of primes p_1 < · · · < p_k and natural
numbers α_1, . . . , α_k such that $n = p_1^{\alpha_1} \cdots p_k^{\alpha_k}$.
$$\varphi(1) = 1 \quad\text{and}\quad \varphi(p_1^{\alpha_1} \cdots p_k^{\alpha_k}) = p_1^{f(\alpha_1)} \cdots p_k^{f(\alpha_k)}$$
for any distinct primes p_i and natural numbers α_i. Show that ϕ is a bijection.
B3.
Now we'll show there can't be a surjection from N to R, and hence there
certainly can't be a bijection either. Therefore R is uncountable.
2 | Integration
Calculus has two parts: differential and integral. Integral calculus owes its origins
to fundamental problems of measurement in geometry: length, area and
volume. It is by far the older branch. Nevertheless, it depends on differential
calculus for its more difficult calculations, and so nowadays we typically teach
differentiation before integration.
We shall revert to the historical sequence and begin our journey with integration.
Our first reason is that it provides a direct application of the Completeness
Axiom without needing the concept of limits. The second is that important
functions such as the trigonometric, exponential and logarithmic functions are
most conveniently constructed through integration. Finally, the student should
become aware that integration is not just an application of differentiation or a
set of calculational techniques.
[Figure: a region drawn over three grids, with both axes marked from 0 to 5, giving the estimates 3 ≤ Area ≤ 17, 3.75 ≤ Area ≤ 13 and 6.48 ≤ Area ≤ 9.64.]
We have said that we are estimating area. But what is our definition of area? In
school books you will find descriptions such as Area is the measure of the part
of plane or region enclosed by the figure. It should be evident that this is not
40 Chapter 2. Integration
1. Could there be a figure for which multiple numbers satisfy the definition
of its area?
2. If we slightly shifted or rotated the grids, could that change our calculation?
That is, could moving a figure change its area?
It takes some effort to answer these objections. Indeed, we have to concede the
first point, for there are such subsets of the plane. We shall content ourselves
with working out a method for assigning area only to certain regions, those
that are bounded by graphs of functions. In this special situation we shall use
rows of rectangles rather than grids of squares. The word `integration' will be
used for this process.
[Figure: two panels, labelled Underestimate and Overestimate.]
In what follows, it will be important that we use our pre-existing notions about
area only to motivate or guess results, and never to justify them. For area and
its properties will formally exist only at the end of our analysis.
For our final piece of motivation, suppose we know velocity rather than
speed, and wish to recover the total displacement. We again cut [a, b] into some
subintervals, of widths Δt_i, and approximate by a constant velocity v_i in each
subinterval. Then the total displacement is approximated by Σ_i v_i Δt_i. Some
of the v_i could be negative and then the product v_i Δt_i would be negative, and
thus not the area of a rectangle. We shall call it the `signed area' instead.
2.1. Integration of Step and Bounded Functions 41
[Figure: a partition a = x_0 < x_1 < x_2 < · · · < x_{n−1} < x_n = b marked on the real line R.]
As illustrated in the diagram, the subintervals need not have equal lengths.
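[Figure: graph of a step function over the partition a = x_0 < x_1 < · · · < x_n = b, taking the constant values s_1, s_2, . . . , s_i, . . . on the open subintervals.]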
Task 2.1.1. We have already encountered some step functions. How many can
you recall?
Suppose s : [a, b] → R is a step function with adapted partition P =
{x_0, . . . , x_n}, such that s(x) = s_i if x_{i−1} < x < x_i. Then the integral of
s from a to b is defined by
$$\int_a^b s(x)\,dx = \sum_{i=1}^{n} s_i (x_i - x_{i-1}).$$
This integral represents the total signed area of the rectangles enclosed by
the graph of s(x), the x-axis, and the vertical lines x=a and x = b.
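The definition of the integral of a step function is a finite sum, so it can be computed directly. In the sketch below, which is only an illustration, a step function is represented by its adapted partition and its list of values on the open subintervals; this representation and the function name are assumptions made here.

```python
def step_integral(partition, values):
    """Integral of a step function.

    partition = [x0, x1, ..., xn] with x0 < x1 < ... < xn,
    values    = [s1, ..., sn], where s_i is the value on (x_{i-1}, x_i).
    """
    if len(partition) != len(values) + 1:
        raise ValueError("need exactly one value per subinterval")
    return sum(s * (right - left)
               for s, left, right in zip(values, partition, partition[1:]))
```

For instance, the step function equal to −1 on (0, 1) and 1 on (1, 3) has `step_integral([0, 1, 3], [-1, 1])` equal to 1. Refining the partition to [0, 1, 2, 3] with values [−1, 1, 1] gives the same number, since values at the partition points play no role.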
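[Figure: graph of a step function over the partition a = x_0 < x_1 < · · · < x_{n−1} < x_n = b, with values s_1, s_2, . . . , s_{n−1}, s_n.]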
The term `signed area' refers to the area of a rectangle marked by the step
function being taken as positive if the rectangle lies above the x-axis and as
negative if it lies below the x-axis.
Note that this definition ignores the step function's values at the partition
points. This is acceptable because these values create finitely many
line segments, and we view a line segment as having zero area regardless
of its length.
Calculate $\int_0^3 s(x)\,dx$.
Since the definition of the integral involved the choice of an adapted partition
we need to show that the resulting number is independent of the choice
of partition. Let P, P′ be partitions of [a, b]. We say P′ is a refinement of P
if P ⊆ P′.
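[Figure: three copies of the real line showing the partition P : a = x_0 < x_1 < x_2 < x_3 < x_4 = b, the partition P′ : a = y_0 < y_1 < y_2 < y_3 = b, and their union P ∪ P′ : a = x_0 < y_1 < x_1 < x_2 < y_2 < x_3 < x_4 = b.]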
Task 2.1.5. Suppose s, t : [a, b] → R are step functions. Show there is a parti-
tion which is adapted to both s and t.
Theorem 2.1.6 (Comparison Theorem). Let s, t : [a, b] → R be step functions
such that s(x) ≤ t(x) for every x ∈ [a, b]. Then
$$\int_a^b s(x)\,dx \le \int_a^b t(x)\,dx.$$
$$L_f = \left\{ \int_a^b s(x)\,dx \;\middle|\; s : [a, b] \to \mathbb{R} \text{ is a step function and } s(x) \le f(x) \text{ for every } x \in [a, b] \right\}.$$
The members of L_f are called lower sums for f, while the members of U_f are
called upper sums.
(a) s(x) ≤ f(x) for every x ∈ [a, b] and $\int_a^b s(x)\,dx = \ell$.
(b) t(x) ≥ f(x) for every x ∈ [a, b] and $\int_a^b t(x)\,dx = u$.
Then s(x) ≤ f(x) ≤ t(x) for every x ∈ [a, b]. Hence, by the Comparison
Theorem,
$$\ell = \int_a^b s(x)\,dx \le \int_a^b t(x)\,dx = u. \qquad \square$$
$$D(x) = \begin{cases} 1, & x \in \mathbb{Q} \cap [0, 1], \\ 0, & x \in \mathbb{Q}^c \cap [0, 1]. \end{cases}$$
Let s : [0, 1] → R be a step function such that s(x) ≤ D(x) for every x ∈ [0, 1].
Each open subinterval of [0, 1] contains an irrational number, hence s(x) ≤ 0 on
each open subinterval, and so $\int_0^1 s(x)\,dx \le 0$. Similarly, we see that $\int_0^1 t(x)\,dx \ge 1$
if t : [0, 1] → R is a step function such that t(x) ≥ D(x) for every x ∈ [0, 1].
Therefore every number α between 0 and 1 has the property that ℓ ≤ α ≤ u
for every ℓ ∈ L_D and u ∈ U_D.
For functions like the Dirichlet function we have to admit that our approach
fails to successfully assign a `signed area'. For the functions where our approach
does work we have a special term: A bounded function f : [a, b] → R is called
integrable if there is a unique number I such that ℓ ≤ I ≤ u for every ℓ ∈ L_f
and u ∈ U_f. This unique I is called the (definite) integral of f on [a, b] and
is denoted by $\int_a^b f(x)\,dx$.
Example 2.1.9. Consider the function f(x) = x on [0, 1]. Consider a partition
P : x_0 < · · · < x_n which cuts [0, 1] into n subintervals of equal length. That is,
x_i = i/n. Corresponding to this partition we define two step functions, s_n and
t_n, as follows.
$$\int_0^1 s_n(x)\,dx = \sum_{i=1}^{n} x_{i-1}\,\frac{1}{n} = \frac{1}{n}\sum_{i=1}^{n} x_{i-1} = \frac{1}{n}\sum_{i=1}^{n} \frac{i-1}{n} = \frac{1}{n^2}\sum_{i=1}^{n-1} i = \frac{n-1}{2n} = \frac{1}{2} - \frac{1}{2n},$$
$$\int_0^1 t_n(x)\,dx = \sum_{i=1}^{n} x_i\,\frac{1}{n} = \cdots = \frac{1}{2} + \frac{1}{2n}.$$
The number 1/2 is the only number that fits between all these lower and upper
sums, and hence it is the integral of f over [0, 1].
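The closed forms for these lower and upper sums can be confirmed by direct computation. The sketch below is an illustration with a hypothetical function name; it uses exact rational arithmetic so that the comparison with 1/2 − 1/(2n) and 1/2 + 1/(2n) involves no rounding.

```python
from fractions import Fraction

def lower_upper_sums(n):
    """Lower and upper sums for f(x) = x on [0, 1] with n equal subintervals."""
    width = Fraction(1, n)
    # On the i-th subinterval the lower step function takes the value x_{i-1}
    lower = sum(Fraction(i - 1, n) * width for i in range(1, n + 1))
    # and the upper step function takes the value x_i
    upper = sum(Fraction(i, n) * width for i in range(1, n + 1))
    return lower, upper
```

With n = 4 the two sums are 3/8 and 5/8, and as n grows both approach 1/2 from either side.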
Task 2.1.10. Why was it enough to consider only some special partitions and
step functions in the above example?
Example 2.1.11. Consider the function f : [−1, 1] → R dened by f (0) = 1
and f(x) = 0 if x ≠ 0. This is a step function, and its integral as a step function
is zero. But we can also treat it as a bounded function and ask whether it is
integrable.
[Figure: graph of f on [−1, 1].]
Proof. First, suppose f is not integrable. We have to find an ε > 0 for which
there are no step functions s, t satisfying the given conditions. Now since f is
not integrable, there are two numbers I_1, I_2 such that
$$\int_a^b s(x)\,dx \le I_1 < I_2 \le \int_a^b t(x)\,dx,$$
Proof. Exercise. ␣
2. Graph the following step functions on the given domain and compute their
integrals:
(a) [2x − 1] on [0, 2], (b) [√x] on [0, 9].
4. Let f (t) = 1 for t ∈ [0, 1] and f (t) = 2 for t ∈ (1, 2]. Compute and graph
the following function for x ∈ [0, 2]:
$$F(x) = \int_0^x f(t)\,dt.$$
(a) $\sum_{k=1}^{n} k^3 = \left( \frac{n(n+1)}{2} \right)^2$, (b) $\int_0^b x^3\,dx = \frac{b^4}{4}$.
8. Let f : [a, b] → R and let I ∈ R. Show that $I = \int_a^b f(x)\,dx$ if and only if
the following two conditions hold:
Proof. Suppose that $\int_a^b f(x)\,dx > \int_a^b g(x)\,dx$.
Let s, t : [a, b] → R be step functions such that s(x) ≤ f(x) ≤ t(x) for
every x ∈ [a, b]. The inequality between f and g gives us s(x) ≤ g(x) for
every x ∈ [a, b], and hence $\int_a^b s(x)\,dx \le \int_a^b g(x)\,dx$. We also have
$\int_a^b g(x)\,dx < \int_a^b f(x)\,dx \le \int_a^b t(x)\,dx$. Thus the integral of g satisfies the defining properties
of the integral of f and so must be equal to it. But this contradicts the assumed
inequality. □
Proof. First, we prove the result for a step function s. Let P = {x_0, . . . , x_n} be
a partition of [a, b], such that s(x) = s_i if x_{i−1} < x < x_i. Then c · s(x) = c · s_i
on the same subintervals, and so
$$\int_a^b c\,s(x)\,dx = \sum_{i=1}^{n} (c s_i)(x_i - x_{i-1}) = c \sum_{i=1}^{n} s_i (x_i - x_{i-1}) = c \int_a^b s(x)\,dx.$$
Second, consider an arbitrary integrable function f and c > 0. Let ε > 0. Then
there are step functions s, t such that s(x) ≤ f(x) ≤ t(x) for every x ∈ [a, b]
and
$$\int_a^b f(x)\,dx - \int_a^b s(x)\,dx < \frac{\varepsilon}{c} \quad\text{and}\quad \int_a^b t(x)\,dx - \int_a^b f(x)\,dx < \frac{\varepsilon}{c}.$$
It follows that cs, ct are step functions such that cs(x) ≤ cf(x) ≤ ct(x) for
every x ∈ [a, b] and
$$c \int_a^b f(x)\,dx - \int_a^b c\,s(x)\,dx < \varepsilon \quad\text{and}\quad \int_a^b c\,t(x)\,dx - c \int_a^b f(x)\,dx < \varepsilon.$$
By Theorem 2.1.14, cf is integrable and $\int_a^b c f(x)\,dx = c \int_a^b f(x)\,dx$.
The c < 0 case can be done in a similar fashion. The c = 0 case is trivial.
□
Task 2.2.3. Consider the step functions s(x) and t(x) defined by
$$s(x) = \begin{cases} -1, & 0 \le x < 1 \\ 1, & 1 \le x \le 3 \end{cases}, \qquad t(x) = \begin{cases} 0, & 0 \le x \le 1.5 \\ 2, & 1.5 < x \le 2.5 \\ 3, & 2.5 < x \le 3 \end{cases}.$$
Proof. We start by proving this property for step functions. Suppose s, t are
step functions on [a, b]. If a partition P_s is adapted to s and a partition P_t is
adapted to t, then P = P_s ∪ P_t is adapted to all three of s, t and s + t. In
particular, s + t is a step function. Let P = {x_0, . . . , x_n}, and suppose
s(x) = s_i and t(x) = t_i if x_{i−1} < x < x_i. Then
$$\int_a^b \big(s(x) + t(x)\big)\,dx = \sum_{i=1}^{n} (s_i + t_i)(x_i - x_{i-1}) = \sum_{i=1}^{n} s_i (x_i - x_{i-1}) + \sum_{i=1}^{n} t_i (x_i - x_{i-1}) = \int_a^b s(x)\,dx + \int_a^b t(x)\,dx.$$
Now consider any integrable functions f, g : [a, b] → R. Let ε > 0. We have step
functions sf , sg , tf , tg such that the following hold:
Proof. The function h = g − f is zero except at finitely many points. Hence it
is a step function which has integral zero. Hence g = f + h is integrable and
$$\int_a^b g(x)\,dx = \int_a^b f(x)\,dx + \int_a^b h(x)\,dx = \int_a^b f(x)\,dx + 0. \qquad \square$$
Theorem 2.2.6 (Additivity over Intervals). Let a < c < b and suppose
f : [a, b] → R is a bounded function. Then
1. f is integrable on [a, b] if and only if f is integrable on both [a, c] and [c, b].
2. If f is integrable on [a, b] then $\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$.
Proof. Exercise. □
Proof. Exercise. (Hint: If s(x) is a step function with domain [a, b] then s(x/k)
is a step function with domain [ka, kb].) □
Theorem 2.2.9 (Shift of interval of integration). Let f : [a, b] → R be integrable
and let k ∈ R. Then f(x − k) is integrable on [a + k, b + k] and
$$\int_{a+k}^{b+k} f(x - k)\,dx = \int_a^b f(x)\,dx.$$
Proof. Exercise. (Hint: If k > 0 and s(x) is a step function with domain [a, b]
then s(x − k) is a step function with domain [a + k, b + k].) □
and s_n(b) = t_n(b) = f(b). Then we have s_n(x) ≤ f(x) ≤ t_n(x) for every
x ∈ [a, b]. Further,
$$\int_a^b t_n(x)\,dx - \int_a^b s_n(x)\,dx = \sum_{i=1}^{n} f(x_i)\,\frac{b-a}{n} - \sum_{i=1}^{n} f(x_{i-1})\,\frac{b-a}{n} = \frac{b-a}{n} \sum_{i=1}^{n} \big( f(x_i) - f(x_{i-1}) \big) = \frac{b-a}{n} \big( f(b) - f(a) \big).$$
By the Archimedean property, this quantity can be made smaller than any
given positive ε.
The same approach works for decreasing functions. We just switch the
definitions of s_n and t_n. □
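The telescoping in this proof can be watched in a concrete case. The sketch below is an illustration only (the function name is hypothetical): for an increasing function it forms the lower and upper sums over n equal subintervals, using exact rational arithmetic, so that their difference can be compared with (b − a)(f(b) − f(a))/n.

```python
from fractions import Fraction

def monotone_sums(f, a, b, n):
    """Lower and upper sums of an increasing f on [a, b], n equal subintervals."""
    a, b = Fraction(a), Fraction(b)
    h = (b - a) / n
    xs = [a + h * i for i in range(n + 1)]   # x_0, ..., x_n
    lower = h * sum(f(x) for x in xs[:-1])   # left endpoints x_0 .. x_{n-1}
    upper = h * sum(f(x) for x in xs[1:])    # right endpoints x_1 .. x_n
    return lower, upper
```

For f(x) = x² on [0, 1] with n = 10 the sums are 57/200 and 77/200. Their difference is exactly 1/10, as the proof predicts, and doubling n halves the gap.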
This approach can also succeed in nding the value of the integral of a
given function, as we saw earlier for f (x) = x.
Task 2.2.11. Suppose that f : [a, b] → R is an increasing function. Let x_i =
a + (b − a)i/n. Prove that
$$\frac{b-a}{n} \sum_{i=0}^{n-1} f(x_i) \le \int_a^b f(x)\,dx \le \frac{b-a}{n} \sum_{i=1}^{n} f(x_i).$$
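[Figure: two graphs, over the partitions x_0 < x_1 < x_2 < x_3 and x_0 < x_1 < x_2.]
$$g_i(x) = \begin{cases} -M, & \text{if } x = x_i, \\ f(x), & \text{if } x_i < x < x_{i+1}, \\ M, & \text{if } x = x_{i+1}. \end{cases}$$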
This result suces to establish the integrability of the functions that are
typically encountered in Calculus and its applications.
[Figure: the graphs of H(x) and F(x).]
Thus, integration has smoothened out the step function and removed its jump.
This is a general phenomenon which we shall explore when we take up con-
tinuous functions. For now, here is a result along these lines about integrals of
monotone functions:
F (b) < L < F (c). Then there is α ∈ (b, c) such that F (α) = L.
First, suppose F(α) < L. Then F(α) = L − ε with ε > 0. Define δ = ε/f(α).
So,
\[ F(\alpha+\delta) = F(\alpha) + \int_{\alpha}^{\alpha+\delta} f(t)\,dt < (L-\varepsilon) + \int_{\alpha}^{\alpha+\delta} f(\alpha)\,dt = (L-\varepsilon) + f(\alpha)\delta = L. \]
Hence α + δ ∈ A, a contradiction.
If F (α) > L, we can show in a similar fashion and with the same choice of
δ, that α − δ ∈ B . Therefore, by trichotomy, F (α) = L. ␣
(a) \int_1^2 f(x)\,dx,
(b) \int_{-2}^{2} f(x)\,dx,
(c) \int_0^2 f(1-x)\,dx,
(d) \int_1^2 (f(x) + f(-x))\,dx.
2. Integrate:
(a) \int_1^2 (x-1)(x-2)\,dx,
(b) \int_1^2 (x-1)(x-2)(x-3)\,dx.
(a) \int_0^c x(1-x)\,dx = 0.    (b) \int_0^c |x(1-x)|\,dx = 0.
6. Let f : [a, b] → R. Define f^+(x) = max{f(x), 0}, f^−(x) = max{−f(x), 0}.
Prove the following:
\[ \int_a^b t(x)^2\,dx - \int_a^b s(x)^2\,dx \le 2M \left( \int_a^b t(x)\,dx - \int_a^b s(x)\,dx \right). \]
\[ \int_{f(a)}^{f(b)} f^{-1}(y)\,dy + \int_a^b f(x)\,dx = b f(b) - a f(a). \]
\[ \int_0^b x^{1/3}\,dx = \frac{3}{4}\, b^{4/3}. \]
By definition, log(1) = 0.
We shall often just write log x for log(x). Using fewer brackets increases
readability but we should use brackets whenever ambiguity is a danger. For
example, log x + y should be clearly expressed as either log(x + y) or (log x) + y ,
two expressions with very different meanings. The same convention will be used
later for any function which has its own name.
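\[ \log(ab) = \int_1^{ab} \frac{1}{t}\,dt = \int_1^{a} \frac{1}{t}\,dt + \int_a^{ab} \frac{1}{t}\,dt = \log a + \frac{1}{a}\int_a^{ab} \frac{1}{t/a}\,dt \]
\[ = \log a + \int_1^{b} \frac{1}{t}\,dt \qquad \left( \because\ \frac{1}{k}\int_{k\alpha}^{k\beta} f(t/k)\,dt = \int_{\alpha}^{\beta} f(t)\,dt \right) \]
\[ = \log a + \log b. \]
2. log(a/b) + log b = log((a/b) · b) = log a.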
3. If m = 0, both sides are 0. For any m ∈ N, we have
\[ \log b - \log a = \int_1^b \frac{1}{t}\,dt - \int_1^a \frac{1}{t}\,dt = \int_a^b \frac{1}{t}\,dt \ge \int_a^b \frac{1}{b}\,dt = \frac{b-a}{b} > 0. \]
Consider any y > 0. We know that log 2 > log 1 = 0, hence by the
Archimedean Property we have N ∈ N such that y < N log 2 = log(2^N).
So,
log 1 < y < log 2^N.
By the Intermediate Value Property (Theorem 2.2.13) there is x ∈ (1, 2^N) such
that log x = y.
Now consider y < 0. There is x > 0 such that log x = −y . Then log(1/x) =
− log x = y . So log is onto. ␣
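The identities above can be checked numerically straight from the integral definition of log. This is only a sketch under stated assumptions: the function name `log_by_area`, the midpoint rule, and the number of subintervals are illustrative choices, and the library value `math.log` is printed purely for comparison.

```python
import math


def log_by_area(x, n=100_000):
    """Approximate log x as the integral of 1/t from 1 to x (midpoint rule)."""
    width = (x - 1.0) / n          # negative when 0 < x < 1
    return width * sum(1.0 / (1.0 + width * (i + 0.5)) for i in range(n))


a, b = 2.0, 3.0
print(log_by_area(a * b), log_by_area(a) + log_by_area(b))   # log(ab) vs log a + log b
print(log_by_area(0.5), -log_by_area(2.0))                   # log(1/x) vs -log x
print(math.log(6.0))                                         # library value, for comparison
```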
Exponential Function
Since log : (0, ∞) → R is a bijection, it has an inverse function exp : R → (0, ∞)
which we call the exponential function. We have y = exp x if and only if
log y = x. Since log is strictly increasing, so is exp.
Theorem 2.3.3. Let a, b ∈ R. Then
1. exp(a + b) = exp a exp b,
2. exp(a − b) = (exp a)/(exp b),
3. exp(ma) = (exp a)^m for m ∈ Z.
Proof. Since log is 1-1, we can prove these identities by applying log to both
sides and observing the results are equal. For example, the first identity is
proved as follows:
log(exp(a + b)) = a + b,
log(exp a exp b) = log(exp a) + log(exp b) = a + b. ␣
Euler's Number
The number e = exp 1 is called Euler's number. It also satisfies log e = 1.
Consider the y = 1/x graph between 1 and e. The area under the graph is
\[ \int_1^e \frac{1}{t}\,dt = \log e = 1. \]
\[ \frac{1}{2} + \frac{e-2}{e} < 1 \implies \frac{3e-4}{2e} < 1 \implies 3e - 4 < 2e \implies e < 4. \]
Later, we will see how to get more accurate estimates of e.
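In the meantime, the defining property that the area under 1/t from 1 to e equals 1 already allows a rough numerical estimate. The sketch below is not the method the text develops later; it is a hypothetical illustration that bisects on the area, using the bound e < 4 from above and the assumed lower bracket 2 (the area over [1, 2] is less than that of a unit square). The helper names are invented for this example.

```python
def area_under_reciprocal(x, n=20_000):
    """Midpoint-rule approximation of the integral of 1/t from 1 to x."""
    width = (x - 1.0) / n
    return width * sum(1.0 / (1.0 + width * (i + 0.5)) for i in range(n))


def estimate_e(low=2.0, high=4.0, steps=30):
    """Bisect for the x whose area equals 1, starting from the bracket (2, 4)."""
    for _ in range(steps):
        mid = (low + high) / 2
        if area_under_reciprocal(mid) < 1.0:
            low = mid       # area too small, so e lies to the right of mid
        else:
            high = mid      # area at least 1, so e lies at or left of mid
    return (low + high) / 2


print(estimate_e())
```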
The graph of exp can be obtained by reflecting the log graph in the line y = x:
[Figure: the graphs of exp x and log x.]
Roots
Theorem 2.3.4. Let a > 0 and n ∈ N. Then there is a unique b > 0 such that
b^n = a. (This b is called the nth root of a and is denoted by a^{1/n} or \sqrt[n]{a}.)
Proof. Since exp : R → (0, ∞) is surjective, there is a real number x such that
exp x = a. Then b = exp(x/n) satisfies b > 0 and b^n = a. Uniqueness follows
from b^n − c^n = (b − c)(b^{n−1} + · · · + c^{n−1}): for b, c > 0 the second factor is
positive, so this shows that b^n − c^n = 0 ⇐⇒ b − c = 0. ␣
Task 2.3.5. Let n be an odd natural number. Show that every real number a
has a unique nth root b, i.e. b^n = a.
Rational Powers
Let a > 0 and r ∈ Q. If r = m/n with m ∈ Z, n ∈ N, we define a^r = a^{m/n} =
(a^m)^{1/n}.
Since a rational number can be expressed as m/n in many different ways,
we need to check that our definition of a^{m/n} is independent of these choices.
\[ \log(a^r) = \log\bigl((a^m)^{1/n}\bigr) = \frac{1}{n}\log(a^m) = \frac{m}{n}\log a = r\log a. \]
Real Powers
Recall that Euler's number was defined by e = exp 1. From the second part of
the last theorem we see that
e^r = exp r, if r ∈ Q.
Therefore we define
e^x = exp x, if x ∈ R.
More generally, given any a > 0 and r ∈ Q we note that
a^r = exp(r log a).
Therefore we define
a^x = exp(x log a), if x ∈ R.
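The two routes to a power, the rational definition (a^m)^{1/n} and the general definition exp(x log a), can be compared numerically. The following is a sketch only; the function names `real_power` and `rational_power` are hypothetical, and Python's `math.exp` and `math.log` stand in for the exp and log constructed in the text.

```python
import math
from fractions import Fraction


def real_power(a, x):
    """a^x = exp(x log a) for a > 0 and any real x."""
    if a <= 0:
        raise ValueError("the base must be positive")
    return math.exp(x * math.log(a))


def rational_power(a, r):
    """a^(m/n) computed as (a^m)^(1/n), following the rational-power definition."""
    r = Fraction(r)                              # r = m/n with n > 0
    a_to_m = float(a) ** r.numerator             # integer power; m may be negative
    return real_power(a_to_m, 1.0 / r.denominator)


print(rational_power(8.0, Fraction(2, 3)), real_power(8.0, 2 / 3))
print(real_power(2.0, 0.5) ** 2)
```

Both calls in the first line should agree up to rounding, which is the numerical counterpart of the observation a^r = exp(r log a) for rational r.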
With this definition of real powers of a positive real number, we can use the
properties of the exponential function to verify the following:
Proof. The first two are easily checked using the properties of exp. For the
third, we first consider the special case when a = e:
This is different from school, where log x = log_10 x, and the natural
logarithm had the special name `ln'.
Theorem 2.3.12. Fix a > 0, a ≠ 1. Then the functions a^x and log_a x are
strictly monotone on their domains.
Hyperbolic Functions
Some particular combinations of exponential functions are very convenient in
calculations, and also show up directly in some physical problems. These are
called hyperbolic functions and are defined by
\[ \cosh x = \frac{e^x + e^{-x}}{2} \quad\text{and}\quad \sinh x = \frac{e^x - e^{-x}}{2}. \]
These are called the `hyperbolic cosine' and `hyperbolic sine' respectively.
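As an illustration of how these definitions are used, here is a short worked identity. It is a standard fact added for this purpose rather than a statement taken from the text, and it relies only on exp(a + b) = exp a exp b:

```latex
\[
\cosh^2 x - \sinh^2 x
  = \frac{(e^x + e^{-x})^2 - (e^x - e^{-x})^2}{4}
  = \frac{4\, e^x e^{-x}}{4}
  = 1 .
\]
```

In particular, every point of the form (sinh t, cosh t) satisfies y^2 − x^2 = 1.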
[Figures: the points (sinh t, cosh t) and (sinh t, −cosh t); the graphs y = cosh x and y = sinh x.]
2. You are given some values of the natural logarithm, accurate to two decimal
places: log 2 = 0.69, log 3 = 1.10, log 5 = 1.61 and log 7 = 1.95. Use these to
find the natural logarithms of the following numbers:
6. Prove the following inequalities for x > 0: 1 − 1/x ≤ log x ≤ x − 1.
7. Plot the graphs of the following functions on the same coordinate plane:
(a) log x, (b) log 2x, (c) log x^2, (d) log |x|.
8. Plot the graphs of the following functions on the same coordinate plane:
(a) exp x, (b) exp(1 + x), (c) exp(−x), (d) exp |x|.
10. Prove that the function a^x is strictly increasing if a > 1 and strictly
decreasing if 0 < a < 1.
11. Graph the functions a^x for a = 1/2, 1, 2, e, 3.
[Figure: the region R under the graph y = f(x), between the lines x = a and x = b.]
The definite integral \int_a^b f(x)\,dx is our definition of the area of the region R. If
f is not integrable, then we fail to define the area of R.
Next, suppose we have two integrable functions f, g : [a, b] → R such that
f (x) ≥ g(x) ≥ 0 for every x ∈ [a, b]. Then we can create a region S which is
enclosed by the graphs of f and g , and by the lines x = a, x = b:
[Figure: the region S between the graphs y = f(x) and y = g(x) and the lines x = a and x = b.]
This region can be viewed as the result of removing the region under the graph
of g from that which is under the graph of f:
[Figure: S shown as the region under the graph of f minus the region under the graph of g.]
\[ \int_a^b f(x)\,dx - \int_a^b g(x)\,dx = \int_a^b (f(x) - g(x))\,dx. \]
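[Figure: the line segment joining (x1, y1) and (x2, y2).]
Then,
\[ \int_{x_1}^{x_2} f(x)\,dx = \int_{x_1}^{x_2} \left( y_1 + \frac{y_2-y_1}{x_2-x_1}(x-x_1) \right) dx \]
\[ = \int_0^{x_2-x_1} \left( y_1 + \frac{y_2-y_1}{x_2-x_1}\,x \right) dx \]
\[ = (x_2-x_1)\,y_1 + \frac{y_2-y_1}{x_2-x_1}\cdot\frac{(x_2-x_1)^2}{2} \]
\[ = \frac{1}{2}(x_2-x_1)(y_2+y_1). \]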
Example 2.4.1. Let us find the area of the triangle T with vertices at (x1 , y1 ),
(x2 , y2 ) and (x3 , y3 ) as shown below:
[Figure: the triangle T with vertices (x1, y1), (x2, y2) and (x3, y3).]
\[ f(x) = y_1 + \frac{y_3-y_1}{x_3-x_1}(x-x_1), \qquad x \in [x_1, x_3] \]
\[ g(x) = \begin{cases} y_1 + \dfrac{y_2-y_1}{x_2-x_1}(x-x_1), & x \in [x_1, x_2] \\[1ex] y_2 + \dfrac{y_3-y_2}{x_3-x_2}(x-x_2), & x \in [x_2, x_3] \end{cases} \]
This calculation can be used to recover the area formulas for basic polyg-
onal shapes.
Example 2.4.2. Consider a triangle with base b and height h. Place it with
base along the x-axis and one vertex at origin:
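[Figure: the triangle with vertices (0, 0), (b, 0) and (c, h).]
Its area is
\[ \frac{1}{2}\bigl[ (0\cdot 0 - b\cdot 0) + (b\cdot h - c\cdot 0) + (c\cdot 0 - 0\cdot h) \bigr] = \frac{1}{2}\,bh. \]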
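The expression used in Example 2.4.2 follows a pattern that extends to any polygon whose vertices are listed counterclockwise. A minimal Python sketch, assuming that ordering (the function name `polygon_area` and the sample coordinates are illustrative, not from the text):

```python
def polygon_area(vertices):
    """Half the sum of (x_i * y_{i+1} - x_{i+1} * y_i) over consecutive vertices.

    The result is positive when the vertices are listed counterclockwise.
    """
    total = 0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]    # wrap around to the first vertex
        total += x1 * y2 - x2 * y1
    return total / 2


# Triangle with base b = 4 and height h = 3, placed as in Example 2.4.2 with c = 1.
print(polygon_area([(0, 0), (4, 0), (1, 3)]))
# Trapezium with base 4, top 2 and height 2.
print(polygon_area([(0, 0), (4, 0), (3, 2), (1, 2)]))
```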
Example 2.4.3. Consider a trapezium with height h, base b, and top t. Place
it as follows:
[Figure: the trapezium with vertices (0, 0), (b, 0), (c + t, h) and (c, h).]
Using the previous example, we find its area to be (1/2)bh + (1/2)th = ((b + t)/2)h. If t = b,
we get the formula bh for the area of a parallelogram.
Example 2.4.4. We wish to find the area of the region enclosed by the curves
given by y = 3x and y = 4 − x^2. The first step is to sketch this region. For this,
we first check if the curves meet. Setting 3x = 4 − x^2 gives x = 1, −4. Hence
the curves meet at (−4, −12) and (1, 3). The curves plot as follows, and the
shaded part is the region that is enclosed by the curves.
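[Figure: the line y = 3x and the parabola y = 4 − x^2, meeting at (−4, −12) and (1, 3), with the enclosed region shaded.]
\[ \text{Area} = \int_{-4}^{1} (4 - x^2 - 3x)\,dx = 4\int_{-4}^{1} 1\,dx - \int_{-4}^{1} x^2\,dx - 3\int_{-4}^{1} x\,dx \]
\[ = 4\cdot(1-(-4)) - \frac{1}{3}\bigl(1^3 - (-4)^3\bigr) - 3\cdot\frac{1}{2}\bigl(1^2 - (-4)^2\bigr) = 20\tfrac{5}{6}. \]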
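The arithmetic in Example 2.4.4 can be repeated with exact fractions. This sketch simply reuses the formulas for the integrals of 1, x and x^2 that the example relies on; the helper `integral_of_power` is a hypothetical name introduced here.

```python
from fractions import Fraction


def integral_of_power(k, a, b):
    """Integral of x^k over [a, b] for k = 0, 1, 2, ..., as an exact fraction."""
    return Fraction(b ** (k + 1) - a ** (k + 1), k + 1)


a, b = -4, 1
area = (4 * integral_of_power(0, a, b)
        - integral_of_power(2, a, b)
        - 3 * integral_of_power(1, a, b))
print(area)   # 125/6, that is, 20 5/6
```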
[Figure: the circle of radius R, bounded by y = √(R^2 − x^2) and y = −√(R^2 − x^2) for −R ≤ x ≤ R.]
The table shows the bounds and their means for various N:
N       Lower Bound   Upper Bound   Mean
10^2    3.1204        3.1604        3.1404
10^3    3.1395        3.1435        3.1415
10^6    3.1415906     3.1415946     3.1415926
(The actual value of π is 3.141592653 . . . . The calculations in the 10^6 case took
7 seconds with the Maxima program.)
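Bounds of this kind can be produced in a few lines. The sketch below is an assumption about the setup rather than the program used for the table: it takes lower and upper sums of √(1 − x^2) over [0, 1] with N equal subintervals and multiplies by 4. This is consistent with the table in that the gap between the bounds is 4/N, but the exact computation behind the table is not shown in the text.

```python
import math


def pi_bounds(n):
    """Lower bound, upper bound and their mean for pi, from a quarter unit circle."""
    heights = [math.sqrt(1.0 - (i / n) ** 2) for i in range(n + 1)]
    lower = 4.0 * sum(heights[1:]) / n     # right endpoints: the function is decreasing
    upper = 4.0 * sum(heights[:-1]) / n    # left endpoints
    return lower, upper, (lower + upper) / 2


for n in (10 ** 2, 10 ** 3):
    print(n, pi_bounds(n))
```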
\[ \frac{1}{2}\bigl[ (x_1 y_2 - x_2 y_1) + (x_2 y_3 - x_3 y_2) + (x_3 y_4 - x_4 y_3) + (x_4 y_1 - x_1 y_4) \bigr]. \]
[Figure: the quadrilateral with vertices (x1, y1), (x2, y2), (x3, y3) and (x4, y4).]
3. The trapezium drawn below is to be cut into two parts of equal area by a
vertical line. Where should it be drawn?
[Figure: a trapezium with the dimensions k and h marked.]
4. Use definite integrals to represent the area of the region enclosed by the
given curves (do not try to evaluate the integrals):
[Figures (a) and (b): regions enclosed by the curves y = log(3 − x), y = log(1 + x), y = √(1 − x^2) and y = x^2 − 2x.]
Darboux Integral
Our approach to integration is a minor variation of one due to Gaston Darboux,
which he gave in 1875. At this time the work of Fourier had led to a broadening
of the idea of a function and so it became important to give integration a formal
structure and not rely on intuitions about area. Bernhard Riemann gave the
first general approach in 1854, but Darboux's definition is much easier to set
up. For completeness, we give the standard version of the Darboux integral
below. We also introduce the Riemann integral in the supplementary exercises
to Chapter 6.
We then take the lower and upper Darboux sums created by these numbers:
\[ L(f, P) = \sum_{i=1}^{n} m_i (x_i - x_{i-1}) \quad\text{and}\quad U(f, P) = \sum_{i=1}^{n} M_i (x_i - x_{i-1}). \]
A1. Show that the collection of lower Darboux sums is bounded above and
the collection of upper Darboux sums is bounded below.
A2. Find the upper and lower Darboux integrals of the Dirichlet function
(Example 2.1.8).
A3. Show that \underline{\int_a^b} f(x)\,dx \le \overline{\int_a^b} f(x)\,dx.
get a global result. In the previous chapter, we set up the general process for
achieving this. We also saw how a specific kind of information (monotonicity)
could be used to obtain global results. Already, we were able to formally define
the natural logarithm and exponential functions, which are usually taken for
granted in school mathematics.
3.1 Limits
You have seen in school the notation lim_{x→p} f(x) = L, which is read as "the
limit of f(x) at p is L" and is interpreted as "the values of f(x) approach L as
the values of x approach p". We need a clear definition of what we mean by
`approaches'.
This example shows what we mean by `limit'. We mean that we can control
the nearness of the output to a certain number, by controlling the nearness of
the input to another number. This is expressed formally as follows:
We say lim_{x→p} f(x) = L if for every ε > 0 there is a corresponding δ > 0
such that 0 < |x − p| < δ =⇒ |f(x) − L| < ε.
In Example 3.1.1 we have δ = ε/2.
Task 3.1.2. Show that at most one number can satisfy the definition of the
limit of a given function at a given point.
(c) Since the definition is intended for situations where x can approach p, it
should only be applied to such situations. In Calculus, this means that
we shall only consider the limit of f at p if there is an α>0 such that
the open interval (p − α, p + α) is contained in the domain of f, except
perhaps for p itself.
[Figure: The two stages in a limit process. In the first stage, we have a requirement to
make the output f(x) lie between L − ε and L + ε. In the second stage, we meet
the requirement by finding a δ such that input being between p − δ and p + δ
guarantees that the output is between L − ε and L + ε (except perhaps at p
itself).]
Example 3.1.3. Consider lim_{x→a} x. This amounts to asking "What does x
approach when x approaches a?" Obviously, our response has to be that it will
approach a, that is, lim_{x→a} x = a. While this is indeed obvious, let us still work it
out with the ε-δ formulation, for practice.
√4.5 − 2 = 0.121.
[Figure: the interval (√3.5, √4.5) around p = 2, whose endpoints lie at distances 0.129 and 0.121 from 2.]
Now if we take δ = √4.5 − 2 = 0.121 (the smaller of the two values) it has the
required properties, since then (2 − δ, 2 + δ) ⊂ (√3.5, √4.5).
Next, consider ε = 0.01. Can you confirm that δ = √4.01 − 2 will meet the
requirements?
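A numerical spot-check can support, though not replace, this confirmation. The sketch assumes that the function under discussion is f(x) = x^2 with p = 2 and L = 4, which is inferred from the square roots appearing above; the names `delta_for` and `works` are hypothetical.

```python
import math


def delta_for(eps, p=2.0):
    """A delta that works for f(x) = x^2 at p, assuming 0 < eps < p^2."""
    left = p - math.sqrt(p * p - eps)      # distance from p down to sqrt(p^2 - eps)
    right = math.sqrt(p * p + eps) - p     # distance from p up to sqrt(p^2 + eps)
    return min(left, right)


def works(eps, delta, p=2.0, samples=10_000):
    """Spot-check |x^2 - p^2| < eps at sample points with 0 < |x - p| < delta."""
    for i in range(1, samples):
        offset = delta * i / samples
        for x in (p - offset, p + offset):
            if abs(x * x - p * p) >= eps:
                return False
    return True


for eps in (0.5, 0.01):
    d = delta_for(eps)
    print(eps, d, works(eps, d))
```

Sampling finitely many points cannot prove the implication; it only fails to find a counterexample, which is why the ε-δ argument is still needed.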
Proof. We simply match the definitions of the three limits and see that they
are the same:
• lim_{x→p} f(x) = L:
For every ε > 0 there is a corresponding δ > 0 such that 0 < |x − p| <
δ =⇒ |f(x) − L| < ε.
• lim_{x→p} (f(x) − L) = 0:
For every ε > 0 there is a corresponding δ > 0 such that 0 < |x − p| <
δ =⇒ |(f(x) − L) − 0| < ε.
• lim_{h→0} f(p + h) = L:
For every ε > 0 there is a corresponding δ > 0 such that 0 < |h| < δ =⇒
|f(p + h) − L| < ε.
The first two are completely identical. The first can be converted to the
third, and conversely, by the substitution x = p + h. ␣
• lim_{x→p} f(x) = 0:
For every ε > 0 there is a corresponding δ > 0 such that 0 < |x − p| <
δ =⇒ |f(x) − 0| < ε.
• lim_{x→p} |f(x)| = 0:
For every ε > 0 there is a corresponding δ > 0 such that 0 < |x − p| <
δ =⇒ ||f(x)| − 0| < ε.
The two definitions are the same because |f(x) − 0| = |f(x)| = |f(x)| − 0 =
||f(x)| − 0|. ␣
Proof. We know from the triangle inequality that ||f (x)| − |M || ≤ |f (x) − M |.
Consider any ε > 0. Since lim_{x→p} f(x) = M, there is a δ > 0 such that
0 < |x − p| < δ =⇒ |f(x) − M| < ε. The same δ works for |f(x)| since
|f (x) − M | < ε implies ||f (x)| − |M || ≤ |f (x) − M | < ε. ␣
Now we consider three examples which illustrate the typical ways in which
a limit can fail to exist.
\[ \operatorname{sgn}(x) = \begin{cases} -1, & x < 0 \\ 0, & x = 0 \\ 1, & x > 0 \end{cases}. \]
In any (−δ, δ) interval, S takes both the values ±1 and so we can argue as in
the previous two examples to show that lim_{x→0} S(x) does not exist.
Remember our statement that the limit does not have to equal the func-
tion's value? Here is an example.
Example 3.1.13. Let f(x) = 0 when x ≠ 0 and f(0) = 1. We will show that
lim_{x→0} f(x) = 0.
Limit Theorems
We now take up questions regarding limits of combinations of functions. If we
know the limits of two functions at the same point, what can we say about
their sum, product, etc.?
We begin by considering the special case when the initial functions have
limit zero. The ε-δ arguments are much simpler in this situation.
2. lim_{x→p} (f(x) + g(x)) = 0,
3. lim_{x→p} f(x)g(x) = 0,
4. If lim_{x→p} h(x) = 1 then lim_{x→p} f(x)/h(x) = 0.
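Proof.
\[ 0 < |x - p| < \delta \implies |f(x) + g(x) - 0| \le |f(x)| + |g(x)| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
\[ 0 < |x - p| < \delta \implies |f(x)g(x)| < \sqrt{\varepsilon}\,\sqrt{\varepsilon} = \varepsilon. \]
Second, there is a δ2 > 0 such that 0 < |x − p| < δ2 implies |f(x)| < ε/2.
Let δ = min{δ1, δ2}. Then
\[ 0 < |x - p| < \delta \implies \left| \frac{f(x)}{h(x)} \right| < \frac{\varepsilon/2}{1/2} = \varepsilon. \] ␣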
Now we take up the general situation. We are able to reduce the calcula-
tions to the cases considered in the lemma.
1. lim_{x→p} c f(x) = cM (c ∈ R),
2. lim_{x→p} (f(x) + g(x)) = M + N,
3. lim_{x→p} (f(x) − g(x)) = M − N,
4. lim_{x→p} f(x)g(x) = MN,
5. lim_{x→p} f(x)/g(x) = M/N (N ≠ 0).
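\[ \lim_{x\to p} \bigl( f(x)g(x) - MN \bigr) = \lim_{x\to p} \bigl( [f(x) - M][g(x) - N] + M g(x) + N f(x) - 2MN \bigr) \]
\[ = \lim_{x\to p} [f(x) - M][g(x) - N] + M \lim_{x\to p} g(x) + N \lim_{x\to p} f(x) - 2MN \]
\[ = 0 + MN + NM - 2MN = 0. \]
5. Due to part 4 of this theorem, it is enough to prove that lim_{x→p} 1/g(x) = 1/N.
\[ \lim_{x\to p} \left( \frac{1}{g(x)} - \frac{1}{N} \right) = \lim_{x\to p} \frac{N - g(x)}{N g(x)} = \lim_{x\to p} \frac{1}{N}\cdot\frac{1 - g(x)/N}{g(x)/N} = 0. \quad \text{(Part 4 of Lemma 3.1.14)} \] ␣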
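Example 3.1.18. Calculate lim_{x→1} (x − 1)^2/(x^2 − 1).
We first simplify:
\[ \frac{(x-1)^2}{x^2-1} = \frac{(x-1)^2}{(x-1)(x+1)} = \frac{x-1}{x+1}. \]
The cancellation in the last step is allowed because when we calculate lim_{x→1} we
work with x ≠ 1 and hence x − 1 ≠ 0. This simplified form is easily dealt with:
\[ \lim_{x\to 1} (x-1) = 0 \ \text{ and } \ \lim_{x\to 1} (x+1) = 2 \implies \lim_{x\to 1} \frac{x-1}{x+1} = \frac{0}{2} = 0. \]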
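Tabulating the quotient near x = 1 gives a quick sanity check on this value. The snippet is only an illustration (the name `g` and the sample offsets are arbitrary), and a table of values is of course not a proof.

```python
def g(x):
    """The quotient (x - 1)^2 / (x^2 - 1), undefined at x = 1 and x = -1."""
    return (x - 1) ** 2 / (x ** 2 - 1)


for h in (0.1, 0.01, 0.001, -0.001, -0.01, -0.1):
    x = 1 + h
    # Second column: the original quotient; third column: the simplified form.
    print(x, g(x), (x - 1) / (x + 1))
```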
1. lim_{x→2} 1/x^2    2. lim_{x→3} (x^2 − 6x + 9)/(x^2 − 9)    3. lim_{x→0} |x|/x
In order to avoid the x>0 and x<0 cases we work with |xS(x)|:
Since lim_{x→0} |x| = 0, the Sandwich Theorem implies that lim_{x→0} |xS(x)| = 0. Hence
we have lim_{x→0} xS(x) = 0.
Example 3.1.23. Let a > 0 and consider lim_{x→a} √x. The natural guess for this
limit is √a. To confirm this, we calculate as follows:
\[ 0 \le |\sqrt{x} - \sqrt{a}| = \frac{|x - a|}{\sqrt{x} + \sqrt{a}} \le \frac{|x - a|}{\sqrt{a}}. \]
We have lim_{x→a} |x − a|/√a = 0. Hence, by the Sandwich Theorem, lim_{x→a} |√x − √a| = 0.
One-sided Limits
We say that lim_{x→p+} f(x) = L if for every ε > 0 there is a corresponding δ > 0
such that 0 < x − p < δ =⇒ |f(x) − L| < ε. The quantity lim_{x→p+} f(x) is called
the right-hand limit of f at p.
We say that lim_{x→p−} f(x) = L if for every ε > 0 there is a corresponding
δ > 0 such that 0 < p − x < δ =⇒ |f(x) − L| < ε. The quantity lim_{x→p−} f(x) is
called the left-hand limit of f at p.
[Figures: Right-hand Limit, with inputs between p and p + δ; Left-hand Limit, with inputs between p − δ and p. In both, the outputs lie between L − ε and L + ε.]
Task 3.1.24. Evaluate the following one-sided limits:
1. lim_{x→p+} C,    3. lim_{x→1+} [x],    5. lim_{x→0+} |x|/x,
2. lim_{x→p−} C,    4. lim_{x→1−} [x],    6. lim_{x→0+} √x.
0 < |x − p| < δ =⇒ |f(x) − L| < ε. The same δ works for lim_{x→p+} f(x) = L and
lim_{x→p−} f(x) = L.
Next, suppose lim_{x→p+} f(x) = lim_{x→p−} f(x) = L. Let ε > 0. Then there is a
δ1 > 0 such that 0 < x − p < δ1 =⇒ |f(x) − L| < ε. There is also a δ2 > 0
such that 0 < p − x < δ2 =⇒ |f(x) − L| < ε. Then δ = min{δ1, δ2} works for
lim_{x→p} f(x) = L:
Since the one-sided limits are not equal, lim_{x→0} H(x) does not exist.
Task 3.1.27. Confirm that the Algebra of Limits and the Sandwich Theorem
also hold for one-sided limits.
2. Consider the function f(x) = x^3. For the given a and ε, find δ > 0 such
that 0 < |x − a| < δ implies |f(x) − f(a)| < ε:
4. Compute the limits, explaining which theorem you are using for each step:
(a) lim_{x→2} 1/x^2
(b) lim_{x→2} (x^2 − 4)/(x − 2)
(c) lim_{h→0} ((t + h)^2 − t^2)/h
(d) lim_{x→−4} (x^2 + 5x + 4)/(x^2 + 3x − 4)
(e) lim_{x→0} (1 − √(1 − x^2))/x^2
(f) lim_{x→1} (x^3 − 1)/(x − 1)
(g) lim_{t→0} (1/t − 1/(t^2 + t))
(h) lim_{x→7} (√(x + 2) − 3)/(x − 7)
(a) lim_{x→a} (f(x) + g(x)) may exist even when neither lim_{x→a} f(x) nor lim_{x→a} g(x)
exists.
(b) lim_{x→a} (f(x)g(x)) may exist even when neither lim_{x→a} f(x) nor lim_{x→a} g(x)
exists.
6. Suppose lim_{x→2} (f(x) − 5)/(x − 2) = 3. What can you say about lim_{x→2} f(x)?
x→0−
11. Suppose lim_{x→a} f(x) = 0 and g(x) is a bounded function defined on an open
3.2 Continuity
A function f is said to be continuous at p if
lim_{x→p} f(x) = f(p).
Alternately, f is continuous at p if for every ε > 0 there is a corresponding
δ > 0 such that |x − p| < δ =⇒ |f(x) − f(p)| < ε.
[Figure: the graph of f near p, with the band from f(p) − ε to f(p) + ε and the interval from p − δ to p + δ.]
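\[ H(x) = \begin{cases} 0, & x < 0 \\ 1, & x \ge 0 \end{cases}, \qquad \operatorname{sgn}(x) = \begin{cases} \dfrac{x}{|x|}, & x \ne 0 \\ 0, & x = 0 \end{cases}. \]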
(In both cases the limit is 0 but the function value is 1.)
On the other extreme, the Dirichlet function is not continuous at any point!
Theorem 3.2.1. Let f (x) and g(x) be continuous at p. Then the following are
also continuous at p:
1. C f(x),    3. f(x)g(x),
2. f(x) ± g(x),    4. f(x)/g(x) (if g(p) ≠ 0).
Proof. We prove the last claim. The others are left as an exercise for the reader.
First note that lim_{x→p} g(x) = g(p) ≠ 0, by continuity of g(x) at x = p and
the given condition that g(p) ≠ 0. So, by the Algebra of Limits,
\[ \lim_{x\to p} \frac{f(x)}{g(x)} = \frac{\lim_{x\to p} f(x)}{\lim_{x\to p} g(x)} = \frac{f(p)}{g(p)}. \] ␣
i=0 ai xi is continuous. ␣
Recall that a rational function has the form p(x)/q(x) where p(x) and
q(x) are polynomials and q(x) is not the zero polynomial. The domain of this
rational function consists of all real numbers x where q(x) ≠ 0. Recall that q(x)
has only finitely many zeroes. Hence each point of the domain is the center of
an open interval which is contained in the domain, and we can talk about the
function's limit at each point in the domain.
One-sided Continuity
A function f is said to be left-continuous at p if lim_{x→p−} f(x) = f(p). It is called
right-continuous at p if lim_{x→p+} f(x) = f(p).
right-continuous at p.
Proof. lim_{x→p} f(x) = f(p) ⇐⇒ lim_{x→p+} f(x) = f(p) and lim_{x→p−} f(x) = f(p). ␣
For example, we can argue that the Heaviside step function H(x) is not
continuous at x=0 because it is right continuous but not left continuous.
Removable discontinuity: lim_{x→a} f(x) exists but does not equal f(a). We can
make f continuous at a by changing its value at a to lim_{x→a} f(x).
Continuity of Compositions
Theorem 3.2.6. Let f and g be real functions such that their composition g ◦ f
is defined on an interval (a, b). Let p ∈ (a, b) with q = lim_{x→p} f(x) and suppose g
is continuous at q. Then
\[ \lim_{x\to p} g(f(x)) = g(q) = g\Bigl( \lim_{x\to p} f(x) \Bigr). \]
Proof. Let ε > 0. Since g is continuous at q there is a δ′ > 0 such that |y − q| < δ′
implies |g(y) − g(q)| < ε. And there is a δ > 0 such that 0 < |x − p| < δ implies
|f(x) − q| < δ′. Hence
Theorem 3.2.8. Let f and g be real functions such that their composition g ◦ f
is defined on an interval (a, b). Let p ∈ (a, b) such that f is continuous at p and
g is continuous at f(p). Then g ◦ f is continuous at p.
Proof. lim_{x→p} g(f(x)) = g(lim_{x→p} f(x)) = g(f(p)). ␣
Monotone Functions
Theorem 3.2.9. If I, J are intervals and f : I → J is a surjective monotone
function, then f is continuous on I .
Proof. We'll do the case when J is an open interval. For other intervals we have
to do a similar analysis at any end-points that are included in J.
Let x0 ∈ I and let ε > 0. We may assume that f (x0 ) ± ε ∈ J , by shrinking
ε if necessary. (A δ that works for a smaller ε will also work for the original
one.)
[Figure: the graph of f(x) near x0, showing the levels f(x0) − ε, f(x0), f(x0) + ε and the corresponding points x−, x0, x+.]
Take δ = min{x0 − x− , x+ − x0 }. ␣
Theorem 3.2.10. All logarithms and exponential functions are continuous.
Proof. They are strictly monotonic bijections between intervals. ␣
Task 3.2.11. Let r ≥ 0. Show that the function x^r is continuous on [0, ∞).
Indefinite Integrals
We call a function integrable on an interval I if it is integrable on every
[a, b] with a, b ∈ I.
is called an indefinite integral of f.
Example 3.2.12. Calculate the indefinite integral F(x) = \int_0^x H(t)\,dt for the
\[ -M(x - p) = \int_p^x (-M)\,dt \le \int_p^x f(t)\,dt \le \int_p^x M\,dt = M(x - p). \]
Therefore,
−M (x − p) ≤ F (x) − F (p) ≤ M (x − p),
and so 0 ≤ |F(x) − F(p)| ≤ M|x − p|. By the Sandwich Theorem, we have
lim_{x→p+} |F(x) − F(p)| = 0. Therefore lim_{x→p+} F(x) = F(p).
p
(a) lim log x, (c) lim (1 + h)1/h ,
x→1+ h→0
p 2
(b) lim log x2 + 1 , (d) lim xx .
x→0 x→0+
8. Compute and graph the indefinite integral \int_0^x f(t)\,dt of each f(t):
integrals of monotone functions have this property, and this established that
the natural logarithm is a surjective function.
Proof. Assume that f (x) is never zero. First, let a0 = a and b0 = b. Let c0 be
the midpoint of [a0, b0]. If f(c0) = 0 we have succeeded. Define
\[ [a_1, b_1] = \begin{cases} [a_0, c_0] & \text{if } f(a_0) f(c_0) < 0 \\ [c_0, b_0] & \text{if } f(b_0) f(c_0) < 0 \end{cases}. \]
[Figure: the signs of f at a = a0, c0 and b = b0, and the resulting choice of [a1, b1] in the two cases f(c0) > 0 and f(c0) < 0.]
We have f (a1 )f (b1 ) < 0, so we repeat this process with [a1 , b1 ] replacing [a0 , b0 ].
Proceeding in this manner, we find a sequence of intervals [an, bn] such that
a0 ≤ a1 ≤ a2 ≤ · · · ≤ b2 ≤ b1 ≤ b0 .
We have a contradiction since f changes sign on [aN , bN ] but not on (c−δ, c+δ).
The f (c) < 0 case similarly leads to a contradiction.
The Intermediate Value Theorem is very useful for showing the existence of
special numbers. For example, suppose we wish to show that a certain equation
has a solution. By moving all terms to one side of the equality, we put it in
the form f (x) = 0. If f is continuous we can try to use the Intermediate Value
Theorem.
Example 3.3.2. Consider the equation x^4 + 4x^3 + x^2 − 6x − 1 = 0. Since the LHS
is a polynomial of degree 4 this equation has at most 4 distinct real solutions,
but it may have fewer, or even none. Let us see how many the Intermediate
Value Theorem can help us to locate. We start by calculating the values of
f(x) = x^4 + 4x^3 + x^2 − 6x − 1 at various points:
x       −4   −3   −2   −1    0    1    2
f(x)    39   −1   −1    3   −1   −1   39
By tracking the sign changes of f (x) we see there are solutions in the intervals
(−4, −3), (−2, −1), (−1, 0) and (1, 2). We can shrink these intervals further
by employing the bisection method. For example, let us consider the solution
that lies in (1, 2). We find the value of f(x) at the midpoint of (1, 2):
f(1.5) = 10.8. Therefore the solution is in (1, 1.5). This process can be repeated
indefinitely for greater accuracy:
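The repeated halving is straightforward to automate. The following Python sketch applies it to the four sign-change intervals found above; the function name `bisect`, the tolerance, and the stopping rule are illustrative assumptions rather than anything prescribed by the text.

```python
def f(x):
    return x ** 4 + 4 * x ** 3 + x ** 2 - 6 * x - 1


def bisect(func, a, b, tol=1e-10):
    """Shrink [a, b], on which func changes sign, until it is shorter than tol."""
    fa, fb = func(a), func(b)
    if fa * fb >= 0:
        raise ValueError("func must change sign on [a, b]")
    while b - a > tol:
        c = (a + b) / 2
        fc = func(c)
        if fc == 0:
            return c                 # landed exactly on a solution
        if fa * fc < 0:
            b, fb = c, fc            # sign change is in the left half
        else:
            a, fa = c, fc            # sign change is in the right half
    return (a + b) / 2


roots = [bisect(f, a, b) for a, b in [(-4, -3), (-2, -1), (-1, 0), (1, 2)]]
print(roots)
```

Each pass halves the interval, so about 34 passes take an interval of length 1 below the assumed tolerance of 10^−10.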
Proof. Suppose f(a) < L < f(b). Define g : [a, b] → R by g(x) = f(x) − L.
Then g(a) = f (a) − L < 0 and g(b) = f (b) − L > 0. Hence there is a number
c ∈ (a, b) such that g(c) = 0, and f (c) = g(c) + L = L.
(a) Assuming f (a) < 0 < f (b), let A = {x ∈ [a, b] : f (x) < 0}. Show that
c = sup(A) exists.
(b) Show that f (c) > 0 and f (c) < 0 lead to contradictions.
What are the relative merits and demerits of this proof and the original one?
9. Suppose f, g : [a, b] → R are continuous functions such that f (a) > g(a)
and f (b) < g(b). Show that there is a c ∈ (a, b) such that f (c) = g(c).
10. Suppose f : [0, 2] → R is a continuous function with f (0) = f (2). Show
that there are a, b ∈ [0, 2] such that b − a = 1 and f (a) = f (b).
11. Let f : [a, b] → R be a continuous and injective function. Assume that
f (a) < f (b).
(a) Show that f (a) is the minimum value of f and f (b) is the maximum
value of f. Hence the image of f is [f (a), f (b)].
(c) Show that f : [a, b] → [f (a), f (b)] has an inverse function which is also
strictly increasing and continuous.
12. Show that there cannot be a continuous bijection f : (0, 1) → [0, 1].
13. The following tasks will establish that a cubic polynomial p(x) = x^3 +
ax^2 + bx + c has at least one real root.
(a) Show that for x ≥ 1, p(x) ≥ x^3 (1 − (|a| + |b| + |c|)/x). Hence there is an
x1 with p(x1) > 0.
(b) Show that for x ≤ −1, p(x) ≤ x^3 (1 + (|a| + |b| + |c|)/x). Hence there is
an x2 with p(x2) < 0.
14. Prove that every polynomial of odd degree has at least one real root.
Hence the image of such a function is all of R.
15. Prove that if a polynomial p(x) = x^n + a_{n−1}x^{n−1} + · · · + a_0 of even degree
has a negative value then it has a real root.
for every x.
To measure the angle we draw a unit circle whose centre is the meeting point
of the rays. We take twice the area enclosed by this circle within the angle, and
call that the radian measure of the angle. Thus the full circle corresponds to
2π radians while a right angle corresponds to π/2 radians. (See the discussion
of π on page 68)
The usual definition of radian is to take the length of the arc of unit
radius cut by the angle. However, we haven't taken up lengths of curves
yet and have to work with areas. We use twice the area to keep our
definition compatible with the arc length approach. The association of
radians with arc length is achieved later in Example 6.4.8.
At this point, we have associated a real number between 0 and 2π to each angle.
We would like to be assured that every such number is the radian measure of
some angle. Then we shall have a perfect identification of physical angles with
radian measures.
[Figure: two positions of the angle on the unit circle, with the lengths x and √(1 − x^2) marked.]
Let R(x) be the radian measure of this angle. We now have a function
R : [−1, 1] → [0, 2π] defined by
\[ R(x) = x\sqrt{1 - x^2} + 2\int_x^1 \sqrt{1 - t^2}\,dt. \]
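The formula for R(x) can be evaluated numerically to get a feel for it. The sketch below is an illustration only: the name `radian_measure`, the midpoint rule, and the number of subintervals are assumptions, and `math.pi` is used purely to compare against the expected values at x = 0 and x = −1.

```python
import math


def radian_measure(x, n=100_000):
    """Approximate R(x) = x*sqrt(1 - x^2) + 2 * integral of sqrt(1 - t^2) over [x, 1]."""
    width = (1.0 - x) / n
    integral = width * sum(
        math.sqrt(max(0.0, 1.0 - (x + width * (i + 0.5)) ** 2))   # guard against rounding
        for i in range(n)
    )
    return x * math.sqrt(1.0 - x * x) + 2.0 * integral


for x in (1.0, 0.0, -1.0):
    print(x, radian_measure(x))
print(math.pi / 2, math.pi)   # comparison values for x = 0 and x = -1
```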
Task 3.4.1. Show that every number between π and 2π is the radian measure
of an angle.
Now that we know how to identify angles with real numbers, we are in a
position to define the trigonometric functions.
Consider the ray in the xy-plane created by rotating the positive x-axis
counterclockwise through an angle of t radians. This ray cuts the unit circle
with centre at origin at exactly one point (x, y). We then define cos t = x and
sin t = y (cos is an abbreviation of `cosine' while sin is an abbreviation of `sine').
The figures below illustrate the definitions for an acute and an obtuse angle
respectively.
[Figures: an acute angle t and an obtuse angle t, each with cos t marked on the x-axis.]
Task 3.4.2. Show that sin^2 t + cos^2 t = 1 for every t ∈ [0, 2π].
Task 3.4.3. Show that sin(π/2 − t) = cos t for every t ∈ [0, π/2].
The following values of sine and cosine are obvious from the definitions:
x        0    π/2    π     3π/2   2π
sin x    0    1      0     −1     0
cos x    1    0      −1    0      1
Task 3.4.4. Show that sin(π/4) = cos(π/4) = 1/√2.
These observations indicate that the graph of cosine over the interval
[0, π/2] is likely to be as follows:
[Figure: the graph of cosine over [0, π/2], falling from 1 at 0 through 1/√2 at π/4 to 0 at π/2.]
As we learn more Calculus, we will be able to confirm that the graph indeed
looks like this. The following identities also follow directly from the definitions:
With their help we can visualize the graph over the entire interval [0, 2π], using
the piece for [0, π/2] as the building block.
[Figures: the graphs of cos x and sin x over [0, 2π].]
We notice that as the input changes from 0 to 2π , the sine and cosine functions
return to their initial values. Thus the function domain can be extended on
each side by just repeating the function values, using sin(x + 2π) = sin x and
cos(x + 2π) = cos x:
[Figures: the graphs of sin x and cos x extended periodically from −2π to 4π.]
The following properties of sin, cos : R → [−1, 1] are obvious from the
definitions:
1. The sine and cosine functions are periodic, with a period of 2π: sin(t +
2π) = sin t and cos(t + 2π) = cos t for every t.
Our figure is only valid for 0 ≤ α, β and with α + β ≤ π/2. The identity can be
extended to arbitrary α, β by other appropriate figures.
The other sum of angle identities can be obtained from this one. First,
replacing β by −β gives
Task 3.4.7. Compute numerical values of (sin x)/x for x = π/2^n, n = 1, 2, 3.
(You can use a calculator for the arithmetic operations and square roots, but
do not use the inbuilt sine and cosine functions.)
Applying the Sandwich Theorem gives lim_{x→0+} sin x = 0. Since sin x is an odd
function, we get
lim_{x→0−} sin x = − lim_{x→0+} sin x = 0.
\[ \lim_{x\to 0} \cos x = \lim_{x\to 0} \Bigl( 1 - 2\sin^2 \frac{x}{2} \Bigr) = 1. \]
The limits at 0 can be combined with the angle sum identities to compute the
limits at other points:
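\[ \lim_{x\to a} \sin x = \lim_{h\to 0} \sin(a + h) = \lim_{h\to 0} [\sin a \cos h + \cos a \sin h] = \sin a, \]
\[ \lim_{x\to a} \cos x = \lim_{h\to 0} \cos(a + h) = \lim_{h\to 0} [\cos a \cos h - \sin a \sin h] = \cos a. \]
\[ \tan x = \frac{\sin x}{\cos x}, \quad \cot x = \frac{\cos x}{\sin x}, \quad \sec x = \frac{1}{\cos x}, \quad \csc x = \frac{1}{\sin x}. \]
\[ 0 < x < \pi/2 \implies \frac{\sin x}{x} < 1. \]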
\[ \cos x < \frac{\sin x}{x} < 1. \]
Therefore, by the Sandwich Theorem,
\[ \lim_{x\to 0+} \frac{\sin x}{x} = 1. \]
Since (sin x)/x is an even function, we also get lim_{x→0−} (sin x)/x = 1. We have reached
\[ \lim_{x\to 0} \frac{\sin x}{x} = 1 \qquad (3.1) \]
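The squeeze behind (3.1) is easy to observe numerically. This sketch uses Python's library sine and cosine for illustration only; the helper name `sandwich_row` and the sample inputs are assumptions. Inputs much smaller than those shown are avoided because floating-point rounding would make (sin x)/x print as exactly 1.

```python
import math


def sandwich_row(x):
    """Return (cos x, sin x / x, 1) for a given x with 0 < x < pi/2."""
    return math.cos(x), math.sin(x) / x, 1.0


for x in (1.0, 0.5, 0.1, 0.01, 0.001):
    low, middle, high = sandwich_row(x)
    print(x, low, middle, high)
```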
[Figure: the graphs of (sin x)/x and (1 − cos x)/x for −10 ≤ x ≤ 10.]
Task 3.4.9. Calculate lim_{x→1} sin(x^2 − 1)/(x − 1).
At this point, all the standard continuous functions of calculus are avail-
able to us: polynomials, rational functions, roots, real powers, exponential, log-
arithm, sine, cosine.
(c) lim_{x→0} (tan 2x)/(sin x),    (g) lim_{x→0} x sin(1/x),
(d) lim_{x→π/2} (sec x − tan x),    (h) lim_{x→1} sin √((x^2 − 1)/(x − 1)).
5. Consider the function tan : (−π/2, π/2) → R. Show that it is odd, strictly
increasing and surjective. Plot its graph.
6. Consider the function tan : R \ { (2n + 1)π/2 | n ∈ Z } → R. Show that it
has period π. Plot its graph.
7. Find the domains and plot the graphs of the functions cot x, sec x, csc x.
8. Prove that a linear combination A sin x + B cos x can be expressed in the
form R sin(x + φ) for some R, φ. (Hint: First do the case when A^2 + B^2 = 1.)
9. Prove the following identities:
(a) cos x + cos y = 2 cos((x + y)/2) cos((x − y)/2),
(b) sin x + sin y = 2 sin((x + y)/2) cos((x − y)/2).
10. Consider the following identities, which we have already proved:
sin^2(π/2 − θ) = 1 − sin^2 θ and sin^2 θ = (1 − sin(π/2 − 2θ))/2.
(a) Starting with the known values of sin θ for θ = π/6, π/4 and π/3, use
these identities to find the sine values for θ = π/12, 5π/12.
(b) Continue the above process to find the sine values for θ starting at zero
and increasing in steps of π/24 to π/2.
(This process is described in the work Pancha-Siddhantika by Varahamihira,
written in the 6th century CE. Varahamihira went one step further and
calculated in steps of π/48 or 3°45′.)
11. We will develop a rational function that is a close approximation to $\sin x$
over the interval $[0, \pi]$.
(a) Find a quadratic polynomial $p$ such that $p(x) = \sin x$ for $x = 0, \frac{\pi}{2}, \pi$.
(b) Find a quadratic polynomial $q$ such that $q(x) = \frac{p(x)}{\sin x}$ for $x = \frac{\pi}{6}, \frac{\pi}{2}, \frac{\pi}{3}$.
(c) Plot $\sin x$ and $r(x) = p(x)/q(x)$ over $[0, \pi]$ using a graphing software.
(The approximation $r(x)$ was first developed by Bhaskara in the 7th century
CE, of course using degrees rather than radians.)
Task 3.5.1. Find the spans of the following functions on [0, 1]: sgn(x), sin x,
1/x (with the value at x = 0 set to 0).
Theorem 3.5.2 (Small Span Theorem). Let f : [a, b] → R be continuous. For
every ε > 0 there is a partition P of [a, b] such that the span of f is less than
ε on every subinterval of P .
Proof. Suppose there is an $\varepsilon > 0$ such that no such partition of $[a, b]$ exists. Let
$a_1 = a$, $b_1 = b$ and define $c_1 = (a_1 + b_1)/2$. Then at least one of the intervals
$[a_1, c_1]$ and $[c_1, b_1]$ fails to have such a partition. Let that one be called $[a_2, b_2]$.
(If both fail to have such a partition, we take the left one.) Repeating this halving
gives intervals $[a_n, b_n]$ of length $(b-a)/2^{n-1}$, none of which has such a partition.
Note that
\[ a = a_1 \le a_2 \le \cdots \le b_2 \le b_1 = b. \]
By the Completeness Axiom, there is an $\alpha \in \mathbb{R}$ such that $a_i \le \alpha \le b_i$ for every
$i$. Hence $\alpha$ is in each $[a_n, b_n]$. Since $f$ is continuous at $\alpha$, there is a $\delta > 0$ such
that $|x - \alpha| < \delta$ implies $|f(x) - f(\alpha)| < \varepsilon/3$. And then $x, y \in (\alpha - \delta, \alpha + \delta) \implies
|f(x) - f(y)| < 2\varepsilon/3$.
By taking large enough $n$ we can ensure that $\frac{b-a}{2^n} < \delta$ and hence
$[a_{n+1}, b_{n+1}] \subset (\alpha - \delta, \alpha + \delta)$. Then we would have span of $f$ being less than
or equal to $2\varepsilon/3 < \varepsilon$ on $[a_{n+1}, b_{n+1}]$, which contradicts our earlier observation
about these intervals. ␣
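The halving idea in this proof can also be run forwards as a constructive search. The sketch below is an illustration only: it estimates the span on each subinterval by sampling, which is an assumption and not an exact supremum, and the names `approx_span` and `small_span_partition` are hypothetical.

```python
import math

def approx_span(f, lo, hi, samples=200):
    """Estimate max f - min f on [lo, hi] from equally spaced sample values."""
    values = [f(lo + (hi - lo) * k / samples) for k in range(samples + 1)]
    return max(values) - min(values)

def small_span_partition(f, a, b, eps, max_depth=30):
    """Bisect [a, b] until the estimated span is below eps on every piece."""
    def split(lo, hi, depth):
        # Keep an interval once its estimated span is small enough.
        if approx_span(f, lo, hi) < eps or depth == max_depth:
            return [lo]
        mid = (lo + hi) / 2
        return split(lo, mid, depth + 1) + split(mid, hi, depth + 1)
    return split(a, b, 0) + [b]

eps = 0.1
partition = small_span_partition(math.sin, 0.0, math.pi, eps)
print(len(partition) - 1, "subintervals")
```

For a continuous function the theorem guarantees that the bisection eventually succeeds; the `max_depth` guard is only a safety net for inputs that are not continuous.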
Bounded Image
Theorem 3.5.3 (Boundedness Theorem) . Let f : [a, b] → R be continuous.
Then the image of f is bounded.
Proof. We have to show there are numbers m, M such that m ≤ f (x) ≤ M for
every x ∈ [a, b].
By the Small Span Theorem, there is a partition $P = \{x_0, \dots, x_n\}$ of $[a, b]$
such that
\[ x_{i-1} \le x, y \le x_i \implies |f(x) - f(y)| < 1. \]
In particular, $|f(x_{i-1}) - f(x_i)| < 1$ for each $i$. Take any $x \in [a, b]$. Then $x_{i-1} \le
x \le x_i$ for some $i$. Hence
\[ |f(a) - f(x)| \le |f(x_0) - f(x_1)| + |f(x_1) - f(x_2)| + \cdots + |f(x_{i-1}) - f(x)| < i \le n. \]
So we can take $M = f(a) + n$ and $m = f(a) - n$. ␣
Theorem 3.5.4 (Extreme Value Theorem) . Let f : [a, b] → R be continuous.
Then there are points c, d ∈ [a, b] such that
f (c) = max{ f (x) | x ∈ [a, b] } and f (d) = min{ f (x) | x ∈ [a, b] }.
Proof. We prove the existence of c. Consider the set A = { f (x) | x ∈ [a, b] }.
A is bounded, by the Boundedness Theorem. Therefore, A has a least upper
bound M. We need to show that M ∈ A.
If $f(x)$ never equals $M$, then $M - f(x)$ is never zero, and $g(x) = \frac{1}{M - f(x)}$
defines a positive and continuous function on $[a, b]$. By the Boundedness Theorem,
the image of $g$ is bounded, so let $g(x) \le R$ on $[a, b]$. Then $M - f(x) \ge 1/R$
and so $f(x) \le M - 1/R$ on $[a, b]$. But then $M - 1/R$ is an upper bound of $A$
and $M - 1/R < M$, a contradiction.
Hence there is a $c \in [a, b]$ such that $f(c) = M$. ␣
Integrability
Theorem 3.5.5 (Integrability of Continuous Functions). Let f : [a, b] → R be
continuous. Then f is integrable on [a, b].
The plan is to apply the Riemann Condition. For it to work, we need to
find upper and lower sums which are close to each other. The Small Span
Theorem helps out by giving a partition where the function fluctuates
little on each subinterval.
\[ \int_a^b t(x)\,dx - \int_a^b s(x)\,dx = \sum_{i=1}^n (M_i - m_i)(x_i - x_{i-1}) < \frac{\varepsilon}{b-a} \sum_{i=1}^n (x_i - x_{i-1}) = \varepsilon. \]
Mean Values
Let us recall that the average of numbers $x_1, \dots, x_n$ is defined by $\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i$.
This notion of average can be generalised from finitely many numbers
to the values of integrable functions. First consider an interval $[a, b]$ with a
partition $P = \{x_0, \dots, x_n\}$ of equally spaced points. Consider a step function
$s : [a, b] \to \mathbb{R}$ such that $s(x) = s_i$ for $x \in (x_{i-1}, x_i)$. Then,
\[ \int_a^b s(x)\,dx = \sum_{i=1}^n s_i (x_i - x_{i-1}) = \frac{b-a}{n}\sum_{i=1}^n s_i = (b-a)\bar{s} \implies \bar{s} = \frac{1}{b-a}\int_a^b s(x)\,dx. \]
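The step-function computation can be illustrated numerically. In the sketch below, which is an illustration and not the author's code, the step values $s_i$ are taken to be the values of a function at the midpoints of the subintervals (an assumption made here), and the plain average of the $s_i$ is compared with $\frac{1}{b-a}\int_a^b f$ for $f(x) = x^2$ on $[0, 3]$, where the integral is 9 and the mean is 3.

```python
def average_of_samples(f, a, b, n):
    """Plain average of f at the midpoints of n equal subintervals of [a, b]."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) / n

def square(x):
    return x * x

# (1 / (3 - 0)) * integral of x^2 over [0, 3] = 9 / 3 = 3.
for n in (10, 100, 1000):
    print(n, average_of_samples(square, 0.0, 3.0, n))
```

As $n$ grows the averages settle near 3, which is the mean value of the function over the interval.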
3.5. Continuity and Integration 105
Task 3.5.6. Show that if f has upper and lower bounds M and m respectively
then m ≤ f¯ ≤ M .
Further, recall that if $x_1, \dots, x_n$ have average $\bar{x}$ and $y_1, \dots, y_m$ have average
$\bar{y}$ then the pooled collection $x_1, \dots, x_n, y_1, \dots, y_m$ has average
$\frac{n}{m+n}\bar{x} + \frac{m}{m+n}\bar{y}$.
Task 3.5.7. Suppose $a < b < c$ and $f : [a, c] \to \mathbb{R}$ is integrable. Show that
\[ \bar{f}_{[a,c]} = \frac{b-a}{c-a}\,\bar{f}_{[a,b]} + \frac{c-b}{c-a}\,\bar{f}_{[b,c]}. \]
Finally, if we acquire new data whose values are lower than previous values,
then the average decreases:
Task 3.5.8. Suppose that f is a decreasing function. Show that f¯[a,x] is also a
decreasing function of x.
Now let us see a phenomenon that is special to averages of functions. The
average of a collection of numerical data is usually not a member of that data
set. However, the average of a continuous function is a value of that function:
[Figure: graph of $f(x)$ over $[a, b]$, with a point $c$ between $a$ and $b$ marked.]
A weighted average of numbers $x_1, \dots, x_n$ is a combination $\sum_{i=1}^n w_i x_i$
where each $w_i \ge 0$ and $\sum_{i=1}^n w_i = 1$. The concept of weighted average
generalises that of ordinary average by allowing different importance (or weight) for
each number. If we set each $w_i = 1/n$ we get the original $\bar{x}$.
The analogue for integration is to define the weighted average of an
integrable function $f$ to be $\int_a^b f(x)g(x)\,dx$ where $g$ is non-negative on $[a, b]$ and
$\int_a^b g(x)\,dx = 1$. This definition requires $fg$ to be integrable. For that, see
Exercise 9 of 2.2.
(b) A bounded continuous function which does not have a maximum value.
(a) For any real number y there is a real number R(y) ≥ 0 such that |x| >
R(y) implies p(x) > y .
(b) Let m be the minimum value of p(x) over the interval [−R(a0 ), R(a0 )].
Show that m is the minimum value of p(x) over the entire real line.
\[ \frac{1}{3\sqrt{2}} \le \int_0^1 \frac{x^2}{\sqrt{1+x}}\,dx \le \frac{1}{3}. \]
4 | Differentiation
In this chapter, we take a closer look at the idea that local information about
Among the continuous functions, the ones that are easiest to integrate are
the `piecewise linear' ones. Their graphs consist of line segments, such as in the
example below:
[Figure: a piecewise linear graph over $[a, b]$.]
This can happen even for functions with rapid oscillations. Let us look at
the function defined by $y = x^2\cos(1/x)$ if $x \ne 0$, and $y = 0$ if $x = 0$, near the
origin.
[Figure: three successive zooms of the graph of $y = x^2\cos(1/x)$ near the origin.]
No matter how much we zoom in, the function has infinitely many oscillations.
Nevertheless, their amplitudes decrease and in that sense the function becomes
closer to the line $y = 0$. (We have kept a constant ratio between the unit lengths
in the $x$ and $y$ directions.)
We wish to set up a clear criterion for when a function $y = f(x)$ can be
considered to merge, on zooming in, with a line which passes through $(a, f(a))$
and has slope $m$. This line has equation $y = f(a) + m(x - a)$. A `nearby' line
would have equation $y = f(a) + m'(x - a)$ with $|m' - m|$ being small. The graph
of $f$ will merge with the given line if for any $\varepsilon > 0$, we can ensure that $f(x)$
lies between $f(a) + (m \pm \varepsilon)(x - a)$ for $x$ close enough to $a$. This leads to the
following definition.
If $f$ has derivative $m$ at $a$, we use the notation $f'(a)$ or $\frac{df}{dx}(a)$ or $\left.\frac{df}{dx}\right|_{x=a}$
for $m$.
Task 4.1.1. Consider a linear function y = mx + c. Show that its derivative
at any point is m. (Hence the derivative of a constant function is zero)
The next sequence of graphs illustrates this definition for $y = x^2$, $a = 1$,
$m = 2$ and $\varepsilon = 0.1$. The curve is $y = x^2 - 1 - 2(x - 1) = x^2 - 2x + 1$ and the
shaded zone is bounded by the lines $y = \pm 0.1(x - 1)$. We see that $\delta = 0.05$
works for these values, and brings the curve inside the shaded zone.
[Figure: three successive zooms of the curve and the shaded zone near $x = 1$.]
Proof. Exercise. ␣
Example 4.1.3. Let us check that the derivative of f (x) = x2 at a=1 is 2:
\[ f^{(n)}(x) = \frac{d^n f}{dx^n}(x). \]
[a, x]. Letting $x$ approach $a$ gives us a better idea of the velocity in the immediate
vicinity of $a$, and the limit is seen as defining the instantaneous velocity
at $a$. In general, the derivative of any function $f$ is called the (instantaneous)
rate of change of $f$.
Task 4.1.7. Show (again) that a constant function will have zero derivative.
Our original definition of derivative is useful for conceptualizing and proving
abstract results. For example, it gives the right starting point for discussing
differentiation in higher dimensions. On the other hand, the limit expression is
convenient for calculations. Let us see an example.
Example 4.1.8 (Power Rule). Consider the function $x^n$, for a fixed $n \in \mathbb{N}$. Its
derivative can be calculated as follows:
\[ (x^n)' = \lim_{y\to x} \frac{y^n - x^n}{y - x} = \lim_{y\to x} \sum_{i=0}^{n-1} y^i x^{n-1-i} = \sum_{i=0}^{n-1} x^i x^{n-1-i} = \sum_{i=0}^{n-1} x^{n-1} = n x^{n-1}. \]
The second equality uses the factorisation
\[ y^n - x^n = (y - x)(y^{n-1} + y^{n-2}x + \cdots + y x^{n-2} + x^{n-1}) = (y - x)\sum_{i=0}^{n-1} y^i x^{n-1-i}. \]
In particular, $x' = 1$, $(x^2)' = 2x$, etc.
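The factorisation behind the Power Rule can be checked with exact rational arithmetic. This is a small illustrative sketch; the function names `secant_slope` and `factor_sum` are hypothetical, and $n = 5$, $x = 2$ are arbitrary choices.

```python
from fractions import Fraction

def secant_slope(n, x, y):
    """Slope (y^n - x^n) / (y - x) of the secant of t^n between x and y."""
    return (y ** n - x ** n) / (y - x)

def factor_sum(n, x, y):
    """The sum y^0 x^(n-1) + y^1 x^(n-2) + ... + y^(n-1) x^0."""
    return sum(y ** i * x ** (n - 1 - i) for i in range(n))

n, x = 5, Fraction(2)
for k in (1, 3, 6):
    y = x + Fraction(1, 10 ** k)  # y approaches x as k grows
    print(k, float(secant_slope(n, x, y)), float(factor_sum(n, x, y)))

# Setting y = x in the sum gives n * x^(n-1).
print("n * x^(n-1) =", n * x ** (n - 1))
```

The two columns agree exactly for every $y \ne x$, and both approach $n x^{n-1} = 80$.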
The last limit does not exist, since the right-hand limit is 1 while the left-hand
limit is −1.
One-Sided Derivatives
The concept of one-sided limits can be applied to derivatives:
$f'_+(a) = \lim_{x\to a+} \frac{f(x) - f(a)}{x - a}$ is the right derivative of $f$ at $a$.
$f'_-(a) = \lim_{x\to a-} \frac{f(x) - f(a)}{x - a}$ is the left derivative of $f$ at $a$.
Task 4.1.10. Show that a function $f$ is differentiable at $x = a$ if and only if
the left and right derivatives of $f$ at $a$ exist and are equal.
Graph of Derivative
A useful skill is to be able to sketch the graph of $f'$ from that of $f$, without
actually calculating $f'$. We can do this by observing where the tangent slopes
appear to be 0, positive, or negative. As an example, let $y = f(x)$ have the
`bell-shaped' graph shown below.
9. Let $f(x) = x^n$ for $n \in \mathbb{N}$. Prove that $f^{(n)}(x) = n!$ and $f^{(n+1)}(x) = 0$.
10. Suppose $f$ is an even function which is differentiable at 0. Show that
$f'(0) = 0$.
11. Let $f : \mathbb{R} \to \mathbb{R}$ have period $T$ and be differentiable. Show that $f'$ has
period $T$.
12. Match the graphs of $f$ in (a), (b), (c) with the graphs of $f'$ in (i), (ii),
(iii).
[Figure: graphs (a), (b), (c) of $f$ and graphs (i), (ii), (iii) of $f'$.]
1. Scaling:
2. Sum Rule:
4. Product Rule:
\[ (fg)'(p) = \lim_{x\to p} \frac{f(x)g(x) - f(p)g(p)}{x - p} \]
\[ = \lim_{x\to p} \frac{f(x)g(x) - f(x)g(p) + f(x)g(p) - f(p)g(p)}{x - p} \]
\[ = \lim_{x\to p} f(x)\,\frac{g(x) - g(p)}{x - p} + \lim_{x\to p} \frac{f(x) - f(p)}{x - p}\,g(p) \]
\[ = f(p)g'(p) + f'(p)g(p). \]
\[ \left(\frac{1}{f}\right)'(p) = \lim_{x\to p} \frac{1/f(x) - 1/f(p)}{x - p} = \lim_{x\to p} \frac{f(p) - f(x)}{f(x)f(p)(x - p)} = -\frac{f'(p)}{f(p)^2}. \]
Task 4.2.2. Differentiate the given functions and identify the points where the
derivative exists:
(a) $\frac{x}{x-1}$    (b) $[x]$    (c) $x^{-n}$, $n \in \mathbb{N}$
Trigonometric Functions
To differentiate the trigonometric functions, we use the two `fundamental limits'
calculated earlier, namely:
\[ \lim_{x\to 0} \frac{\sin x}{x} = 1 \quad\text{and}\quad \lim_{x\to 0} \frac{1 - \cos x}{x} = 0. \]
Theorem 4.2.3. For every $x \in \mathbb{R}$, $\sin' x = \cos x$ and $\cos' x = -\sin x$.
Proof. We differentiate the sine function, and leave the cosine for the reader.
\[ \sin' x = \lim_{h\to 0} \frac{\sin(x+h) - \sin x}{h} \]
\[ = \lim_{h\to 0} \frac{\sin x\cos h + \cos x\sin h - \sin x}{h} \]
\[ = \lim_{h\to 0} \left( \frac{\cos h - 1}{h}\,\sin x + \frac{\sin h}{h}\,\cos x \right) \]
\[ = 0\cdot\sin x + 1\cdot\cos x \]
\[ = \cos x. \] ␣
Task 4.2.4. Use the reciprocal and quotient rules to show that
\[ \sec' x = \sec x\tan x, \quad \csc' x = -\csc x\cot x, \]
\[ \tan' x = \sec^2 x, \quad \cot' x = -\csc^2 x. \]
Logarithms
In order to obtain the derivative of log x we begin with the following simple
inequalities:
Theorem 4.2.5. For $x > 0$, $1 - \frac{1}{x} \le \log x \le x - 1$.
Proof. For $x \ge 1$ these inequalities are obtained from
\[ \int_1^x \frac{1}{x}\,dt \le \int_1^x \frac{1}{t}\,dt \le \int_1^x 1\,dt. \]
Theorem 4.2.6. For every $x > 0$, $\log' x = \frac{1}{x}$.
Proof. We apply the definition of the derivative:
h→1− h−1
Task 4.2.7. Let $a > 0$ and $a \ne 1$. Show that $\log_a' x = \frac{1}{x\log a}$.
The limit calculation that we carried out in the last proof can also be
expressed as
\[ \lim_{h\to 0} \frac{\log(1+h)}{h} = 1 \quad\text{or}\quad \lim_{h\to 0} \log((1+h)^{1/h}) = 1. \]
\[ \lim_{h\to 0} (1+h)^{1/h} = e. \]
We can use this limit to get better estimates of $e$. Let us take a closer look at
the behaviour of $(1+h)^{1/h}$ for $h > 0$. First, $\log(1+h)^{1/h} = \frac{1}{h}\int_1^{1+h} \frac{dx}{x}$ is
the average of $1/x$ over the interval $[1, 1+h]$. Since $1/x$ is a decreasing function,
so is $\log(1+h)^{1/h}$ (Task 3.5.8). Hence $(1+h)^{1/h}$ is also a decreasing function of $h$.
Therefore $(1+h)^{1/h}$ is an underestimate of $e$ when $h > 0$. Similarly, it is an
overestimate when $h < 0$. So we can get bounds for $e$ by taking small $h$ of both
signs.
$h$         $(1+h)^{1/h}$    $(1-h)^{-1/h}$
$1/2$       2.25             4
$1/10$      2.59             2.87
$1/10^3$    2.717            2.719
Thus, by taking h = 0.001 we already know that e ≈ 2.718. The actual value
of e when rounded to 6 decimal places is 2.718282.
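The two-sided estimates of $e$ are easy to compute. The sketch below is an illustration only; the function names `lower_estimate` and `upper_estimate` are hypothetical, and it prints more digits than the table shows.

```python
import math

def lower_estimate(h):
    """(1 + h)^(1/h): an underestimate of e for h > 0."""
    return (1 + h) ** (1 / h)

def upper_estimate(h):
    """(1 - h)^(-1/h): an overestimate of e for 0 < h < 1."""
    return (1 - h) ** (-1 / h)

for h in (1 / 2, 1 / 10, 1 / 10 ** 3):
    print(f"h = {h:<6g} {lower_estimate(h):.5f} < e < {upper_estimate(h):.5f}")

print("math.e =", math.e)
```

Each row brackets the library value `math.e`, and the bracket tightens as $h$ decreases.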
(a) $\frac{x^2 - 1}{x^2 + 1}$,    (c) $x\log x - x$,
(b) $\sin 2x$,    (d) $\log|x|$.
2. Given the formula $1 + x + x^2 + \cdots + x^n = \frac{x^{n+1} - 1}{x - 1}$ determine, by
differentiation, formulas for:
3. (Leibniz rule) Let $u, v$ be real functions with a common domain, each being
differentiable $n$ times. Show that
\[ (uv)^{(n)} = \sum_{k=0}^{n} \binom{n}{k} u^{(k)} v^{(n-k)}. \]
Task 4.3.2. Differentiate the given functions and identify the points where the
derivative exists:
Implicit Differentiation
We have been studying relationships of the form $y = f(x)$ between two variables
$x, y$. Sometimes the relationship is not of such a simple form. For example, it
may be $x^2 + y^2 = 1$. Clearly this shows a dependence: for any $x \in [-1, 1]$ we can
solve for corresponding values $y = \pm\sqrt{1 - x^2}$. We say that $x^2 + y^2 = 1$ defines
$y$ implicitly in terms of $x$. In fact this implicit relation can be separated into
two explicit functions $y = \sqrt{1 - x^2}$ and $y = -\sqrt{1 - x^2}$.
[Figure: the circle $x^2 + y^2 = 1$, and the graphs of $y = \sqrt{1 - x^2}$ and $y = -\sqrt{1 - x^2}$.]
The Chain Rule allows us to calculate $dy/dx$ without solving explicitly for $y$,
as follows:
\[ x^2 + y^2 = 1 \implies 2x + 2y\,y' = 0 \implies y' = -x/y \quad (\text{if } y \ne 0) \]
This works simultaneously for both cases of $y = \pm\sqrt{1 - x^2}$!
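As a numerical sanity check, the implicit formula $y' = -x/y$ can be compared with a difference quotient on each explicit branch. The sketch is illustrative; the helper names and the step size `h` are assumptions made here.

```python
import math

def upper(x):
    """Upper semicircle y = sqrt(1 - x^2)."""
    return math.sqrt(1 - x * x)

def lower(x):
    """Lower semicircle y = -sqrt(1 - x^2)."""
    return -math.sqrt(1 - x * x)

def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

for branch in (upper, lower):
    for x in (-0.6, 0.0, 0.3, 0.8):
        y = branch(x)
        # Compare the numerical slope with the implicit formula -x/y.
        print(branch.__name__, x, central_difference(branch, x), -x / y)
```

The same expression $-x/y$ matches both branches because the sign of $y$ carries the information about which branch is in use.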
It is hard to separate this into explicit functions, but easy to differentiate
implicitly:
Suppose we wish to find a point on the curve where the tangent line is
horizontal. We have
\[ y' = 0 \implies 2y - x^2 = 0 \implies y = x^2/2 \]
\[ \implies x^3 + x^6/8 = 3x^3 \implies x^3 = 16 \implies x = 2^{4/3},\ y = 2^{5/3}. \]
4.3. Chain Rule and Applications 135
\[ \frac{2x}{a^2} + \frac{2y}{b^2}\,y' = 0. \]
If $(x_0, y_0)$ is a point on the ellipse, the slope $m$ of the tangent line there is given
by
\[ \frac{2x_0}{a^2} + \frac{2y_0}{b^2}\,m = 0 \quad\text{or}\quad m = -\frac{x_0 b^2}{y_0 a^2}. \]
[Figure: graphs of $f(x)$ and $f^{-1}(x)$, with the points $(a, b)$ and $(b, a)$ marked.]
[Figure: graph of $\sin x$ on $[-\pi/2, \pi/2]$.]
This restriction has an inverse function $\sin^{-1} : [-1, 1] \to [-\pi/2, \pi/2]$. It is also
called arcsine and its values are denoted by $\arcsin(x)$. It is continuous because
it is monotone and its image is an interval. We can get its graph by reflecting
the $y = \sin x$ graph in the $y = x$ line:
[Figure: graphs of $y = \sin x$ and $y = \arcsin x$.]
[Figure: graph of $\cos x$ on $[0, \pi]$.]
This restriction has an inverse function $\cos^{-1} : [-1, 1] \to [0, \pi]$. It is also denoted
by arccos and is continuous because it is monotone and its image is an interval.
We can get its graph by reflecting the $y = \cos x$ graph in the $y = x$ line:
[Figure: graphs of $y = \cos x$ and $y = \arccos x$.]
[Figure: graphs of $y = \tan x$ and $y = \arctan x$.]
Proof. We apply the formula for differentiating inverse functions to the arcsine
function:
\[ \arcsin' x = \frac{1}{\sin'(\arcsin x)} = \frac{1}{\cos(\arcsin x)}. \]
Now $\cos^2(\arcsin x) = 1 - \sin^2(\arcsin x) = 1 - x^2$ and since $\arcsin x \in [-\pi/2, \pi/2]$
we know that $\cos(\arcsin x) \ge 0$. Hence
\[ \arcsin' x = \frac{1}{\sqrt{1 - x^2}}, \quad x \in (-1, 1). \]
The calculation for arccosine is similar and is left to the reader. Finally,
\[ \arctan' x = \frac{1}{\tan'(\arctan x)} = \frac{1}{\sec^2(\arctan x)} = \frac{1}{1 + \tan^2(\arctan x)} = \frac{1}{1 + x^2}. \]
␣
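These two derivative formulas can be compared with difference quotients of the library functions `math.asin` and `math.atan`. This is a hedged numerical sketch; the helper `central_difference`, the step size and the sample points are choices made here.

```python
import math

def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def arcsin_prime(x):
    """Claimed derivative of arcsin on (-1, 1)."""
    return 1 / math.sqrt(1 - x * x)

def arctan_prime(x):
    """Claimed derivative of arctan on the real line."""
    return 1 / (1 + x * x)

for x in (-0.9, -0.5, 0.0, 0.5, 0.9):
    print(x,
          central_difference(math.asin, x), arcsin_prime(x),
          central_difference(math.atan, x), arctan_prime(x))
```

The sample points stay inside $(-1, 1)$, since the arcsine derivative is unbounded near the endpoints.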
$\sin x^{-1}$ is the sine function applied to $1/x$. The safer way to write it is $\sin(x^{-1})$.
Exponential Function
Theorem 4.3.8. The derivative of the exponential function is itself:
\[ (e^x)' = e^x. \]
Proof. Consider $f(x) = \log x$. Its inverse function is $f^{-1}(x) = e^x$. Applying the
formula for differentiating an inverse function, we get:
\[ (e^x)' = (f^{-1})'(x) = \frac{1}{f'(f^{-1}(x))} = \frac{1}{\log'(e^x)} = \frac{1}{1/e^x} = e^x. \] ␣
\[ x(t)x'(t) - y(t)y'(t) = 0 \quad\text{or}\quad \frac{x'(t)}{y'(t)} = \frac{y(t)}{x(t)}. \]
(In discovery mode, we do not worry about dividing by zero.) It is natural to
try to arrange $y'(t) = x(t)$ and $x'(t) = y(t)$. This leads to $y''(t) = y(t)$. We
can easily check that functions of the form $y(t) = Ae^{t} + Be^{-t}$ satisfy this
requirement. Now suppose we want the motion to start at $(1, 0)$ when $t = 0$.
This gives the equations $A + B = 0$ and $A - B = 1$, with solutions $A = 1/2$,
$B = -1/2$. Hence
\[ y(t) = \frac{1}{2}e^{t} - \frac{1}{2}e^{-t} = \sinh t \quad\text{and}\quad x(t) = y'(t) = \frac{1}{2}e^{t} + \frac{1}{2}e^{-t} = \cosh t. \]
You may recall that we had already verified that the hyperbolic functions do
trace the hyperbola.
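The requirements used in this derivation can be checked numerically for the functions obtained. The following sketch is an illustration; the names `x_of_t`, `y_of_t` and the sample times are chosen here and are not from the text.

```python
import math

def x_of_t(t):
    """x(t) = (e^t + e^(-t)) / 2, that is cosh t."""
    return (math.exp(t) + math.exp(-t)) / 2

def y_of_t(t):
    """y(t) = (e^t - e^(-t)) / 2, that is sinh t."""
    return (math.exp(t) - math.exp(-t)) / 2

def central_difference(f, t, h=1e-6):
    """Symmetric difference quotient approximating f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

for t in (-2.0, -0.5, 0.0, 1.0, 3.0):
    on_hyperbola = x_of_t(t) ** 2 - y_of_t(t) ** 2   # should be 1
    dy = central_difference(y_of_t, t)               # should be x(t)
    dx = central_difference(x_of_t, t)               # should be y(t)
    print(t, on_hyperbola, dy - x_of_t(t), dx - y_of_t(t))
```

At $t = 0$ the point is $(1, 0)$, matching the chosen starting position.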
It follows that the hyperbolic sine function has an inverse which is strictly
increasing as well as continuous. We denote it by $\sinh^{-1}$ or arcsinh, following
the same pattern as for inverse trigonometric functions.
The hyperbolic cosine function is even and hence not one-one. Therefore
we restrict the domain to $[0, \infty)$ and try again.
Task 4.3.18. Show that $\cosh : [0, \infty) \to [1, \infty)$ is a strictly increasing bijection.
The corresponding inverse function is called $\cosh^{-1}$ or arccosh. It is also
strictly increasing and continuous.
Task 4.3.19. Prove that $(\sinh^{-1} x)' = \frac{1}{\sqrt{x^2 + 1}}$ and $(\cosh^{-1} x)' = \frac{1}{\sqrt{x^2 - 1}}$.
(b) $g(x) = \sin^2 x - \sin x^2$,    (e) $\ell(x) = \log(1 + x^2)$,
(c) $h(x) = \sin(\sin x)$,    (f) $r(x) = \pi^x - x^\pi$.
Show that whenever a curve from one family cuts a curve from the other
family, their tangent lines are perpendicular to each other.
5. Let $\cot^{-1}$ or arccot denote the inverse of $\cot : (0, \pi) \to \mathbb{R}$. Show that
\[ \operatorname{arccot}'(x) = \frac{-1}{1 + x^2}. \]
6. Let $\sec^{-1}$ or arcsec denote the inverse of $\sec : (0, \frac{\pi}{2}) \cup (\frac{\pi}{2}, \pi) \to \mathbb{R} \setminus [-1, 1]$.
Show that
\[ \operatorname{arcsec}' x = \frac{1}{|x|\sqrt{x^2 - 1}}. \]
7. Let $\csc^{-1}$ or arccsc denote the inverse of $\csc : (-\frac{\pi}{2}, 0) \cup (0, \frac{\pi}{2}) \to \mathbb{R} \setminus [-1, 1]$.
Show that
\[ \operatorname{arccsc}' x = \frac{-1}{|x|\sqrt{x^2 - 1}}. \]
that $\left(f(x)^{g(x)}\right)' = f(x)^{g(x)} \left[ g'(x)\log f(x) + g(x)\,\frac{f'(x)}{f(x)} \right]$.
$f(x)$    $F(x) = \int_0^x f(t)\,dt$    $F'(x)$
The following rough argument gives geometric insight and also brings out
the need for assuming continuity. The change F (x + h) − F (x) is approximated
by the area of the trapezium whose vertices are at x, x + h on the x-axis and
the corresponding points on the graph of f:
4.4. The First Fundamental Theorem 143
[Figure: the trapezium over $[x, x+h]$ with heights $f(x)$ and $f(x+h)$.]
\[ F(x+h) - F(x) = \int_a^{x+h} f(t)\,dt - \int_a^{x} f(t)\,dt = \int_x^{x+h} f(t)\,dt. \]
Hence,
\[ F(x+h) - F(x) - hf(x) = \int_x^{x+h} f(t)\,dt - \int_x^{x+h} f(x)\,dt = \int_x^{x+h} (f(t) - f(x))\,dt. \]
Define $\varphi(h) = \frac{1}{h}\int_x^{x+h} (f(t) - f(x))\,dt$, so that $\varphi(h) = \frac{F(x+h) - F(x)}{h} - f(x)$.
Consider $\varepsilon > 0$. If $f$ is continuous at $x$,
there is a $\delta > 0$ such that $|t - x| < \delta$ implies $|f(t) - f(x)| < \varepsilon$. Therefore, if
$0 < |h| < \delta$, we obtain:
\[ |\varphi(h)| = \frac{1}{|h|}\left| \int_x^{x+h} (f(t) - f(x))\,dt \right| \le \frac{1}{|h|}\,|h|\,\varepsilon = \varepsilon. \]
Therefore, $\varphi(h) \to 0$ as $h \to 0$, and so $F'(x) = f(x)$. ␣
Example 4.4.2. Suppose we have to differentiate $F(x) = \int_0^x \sin\sqrt{t}\,dt$. By the
First Fundamental Theorem we know immediately that $F'(x) = \sin\sqrt{x}$. We
didn't have to first find a closed form expression for $F(x)$!
Example 4.4.3. We shall combine the First Fundamental Theorem and the
Chain Rule to differentiate $G(x) = \int_x^{x^2} \sin\sqrt{t}\,dt$, $x > 0$. First, let $F(x) =
\int_0^x \sin\sqrt{t}\,dt$, as in the previous example. Then
\[ G(x) = \int_0^{x^2} \sin\sqrt{t}\,dt - \int_0^{x} \sin\sqrt{t}\,dt = F(x^2) - F(x). \]
\[ G'(x) = F'(x^2)\,2x - F'(x) = 2x\sin\sqrt{x^2} - \sin\sqrt{x} = 2x\sin|x| - \sin\sqrt{x}. \]
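The formula for $G'(x)$ can be cross-checked numerically without any closed form for $G$. The sketch below is one possible check and is not part of the original text: it approximates the integral by composite Simpson's rule (a substitute technique chosen here) and then takes a difference quotient; all function names are hypothetical.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def integrand(t):
    return math.sin(math.sqrt(t))

def G(x):
    """G(x) = integral of sin(sqrt(t)) from x to x^2, for x > 0."""
    return simpson(integrand, x, x * x)

def G_prime_formula(x):
    """The derivative obtained from the Fundamental Theorem and Chain Rule."""
    return 2 * x * math.sin(abs(x)) - math.sin(math.sqrt(x))

for x in (1.5, 2.0):
    h = 1e-4
    numeric = (G(x + h) - G(x - h)) / (2 * h)
    print(x, numeric, G_prime_formula(x))
```

The two printed columns agree to several decimal places, which is all a check of this kind can be expected to show.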
(a) $F(x) = \int_0^{x^3} (1 + t^2)^{-3}\,dt$,    (b) $F(x) = \int_x^{x^2} (1 + t^2)^{-3}\,dt$.
I, dierentiate
g(x)
f (t) dt.
4.5. Extreme Values and Monotonicity 145
[Figure: The function $x + 2\sin x$ has several local maxima (discs) and local minima
(squares) which are not absolute extrema.]
Theorem 4.5.1 (Fermat's Theorem). Let $f(x)$ have a local extreme at an
interior point $c$ of an interval in its domain. Then either $f'(c)$ does not exist
or $f'(c) = 0$.
Proof. Suppose $f'(c)$ exists. We have to show that $f'(c) = 0$. Suppose $f'(c) > 0$,
that is,
\[ \lim_{x\to c} \frac{f(x) - f(c)}{x - c} > 0. \]
Since the limit is positive, the secant slopes must themselves be positive once
we are close to $c$. That is, there must be a $\delta > 0$ such that $0 < |x - c| < \delta \implies
\frac{f(x) - f(c)}{x - c} > 0$. Then,
This rules out $f'(c) > 0$. We similarly rule out $f'(c) < 0$. ␣
Example 4.5.2. Consider $f(x) = |x|$. It has a local minimum at $x = 0$ but
$f'(0)$ is not defined.
In the next example we have a point where $f'$ is zero but it is not a local
extreme:
Example 4.5.3. Consider $f(x) = x^3$. Then $f'(0) = 0$ but there isn't a local
extreme at $x = 0$.
Example 4.5.4. Consider $f(x) = x^3 - 3x + 1$ with domain $[0, 3]$. We make the
following calculations:
1. Calculate the function values at the endpoints: $f(0) = 1$ and $f(3) = 19$.
2. Find the critical points. Since $f$ is differentiable we only have to look for
$f'(c) = 0$. This gives $3c^2 - 3 = 0$ or $c = \pm 1$. Thus $c = 1$ is the only critical
point (in the given domain).
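The two steps above produce a short list of candidate points, and comparing the function values there locates the absolute extrema. The sketch below carries out that comparison; treating it as the continuation of the example is an assumption, and the variable names are illustrative.

```python
def f(x):
    return x ** 3 - 3 * x + 1

# Step 1: endpoints of the domain [0, 3].
endpoints = [0, 3]

# Step 2: roots of f'(c) = 3c^2 - 3 that lie inside the domain.
critical_points = [c for c in (-1, 1) if 0 < c < 3]

# Compare the values at all candidate points.
candidates = endpoints + critical_points
values = {c: f(c) for c in candidates}
x_max = max(values, key=values.get)
x_min = min(values, key=values.get)
print("values:", values)
print("absolute maximum", values[x_max], "at x =", x_max)
print("absolute minimum", values[x_min], "at x =", x_min)
```

Since $f$ is continuous on a closed bounded interval, the Extreme Value Theorem guarantees that the extremes exist, so they must occur among these candidates.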
Monotonicity
Theorem 4.5.5 (Monotonicity Theorem). Suppose $I$ is an interval and $f : I \to
\mathbb{R}$ is differentiable on $I$.
1. If $f'(x) > 0$ for every $x \in I$ then $f$ is strictly increasing.
2. If $f'(x) \ge 0$ for every $x \in I$ then $f$ is increasing.
We also have the corresponding statements regarding negative derivatives and
decreasing functions.
Proof. (1) Let $p, q \in I$ with $p < q$. We have to show that $f(p) < f(q)$.
Since $f$ is continuous on $[p, q]$ it achieves its maximum and minimum over
this interval. By Fermat's Theorem the points of maximum and minimum can
only be the endpoints $p, q$.
If the maximum and minimum values are equal, then $f$ is a constant function,
and $f' = 0$. So they are not equal and $f(p) \ne f(q)$. Suppose $f(q)$ is the
minimum value over $[p, q]$. Then
\[ f'(q) = \lim_{x\to q-} \frac{f(x) - f(q)}{x - q} \le 0, \]
since $p < x < q$ implies $f(x) \ge f(q)$. This contradicts the positivity of $f'$. It
follows that $f(q)$ is the maximum value over $[p, q]$ and hence $f(p) < f(q)$.
(2) Let $p, q \in I$ with $p < q$. Take any $\varepsilon > 0$ and consider the function $g(x) =
f(x) + \varepsilon x$. Then $g'(x) = f'(x) + \varepsilon > 0$ and $g$ is strictly increasing. Now,
Thus f (q) − f (p) is greater than every negative number and hence must be
non-negative. ␣
Example 4.5.6. We will show that the equation $x^3 + 3x + 1 = 0$ has exactly
one solution.
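A numerical companion to this claim: since the derivative $3x^2 + 3$ is always positive, the function is strictly increasing, and a sign change on $[-1, 0]$ lets bisection home in on the single root. The sketch below is an illustration; the bracket $[-1, 0]$, the tolerance and the function names are choices made here.

```python
def f(x):
    return x ** 3 + 3 * x + 1

def f_prime(x):
    """Derivative 3x^2 + 3, positive for every real x."""
    return 3 * x ** 2 + 3

def bisect(g, lo, hi, tol=1e-12):
    """Bisection for a root of g on [lo, hi], given a sign change."""
    if g(lo) * g(hi) >= 0:
        raise ValueError("g must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid   # the root lies in the left half
        else:
            lo = mid   # the root lies in the right half
    return (lo + hi) / 2

root = bisect(f, -1.0, 0.0)
print("root is approximately", root, "with f(root) =", f(root))
```

Bisection only locates a root; uniqueness is the part that comes from the Monotonicity Theorem.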
(b) Find the absolute maximum and minimum values of the function.
(b) Find the absolute maximum and minimum values of the function.
4. Prove that $f(x) = \frac{x^3}{3} + 2x - 2\cos x$ has exactly one zero.
5. Show that $x^2 = x\sin x + \cos x$ for exactly two values of $x$.
6. Suppose that $f : \mathbb{R} \to \mathbb{R}$ satisfies $f^{(n+1)} = 0$. Prove that $f$ is a polynomial
with degree $n$ or less.
7. Find the equation $y(x)$ of a curve such that the tangent line at the point
$(x, y)$ intersects the $x$-axis at $x - 1$.
8. Suppose a function $f$ satisfies the differential equation $f'(x) = k(M - f(x))$.
Find the general form of $f$.
9. Find all functions $f : \mathbb{R} \to \mathbb{R}$ with the property that $x \ne y$ implies $f(x) -
f(y) \le (x - y)^2$.
10. Let $f$ be a differentiable function such that every tangent line to its graph
passes through the origin. Show that the graph of $f$ is a line through the origin.
11. Prove that there is no function $f$ such that $f'(x) = \operatorname{sgn}(x)$ for every
$x \in \mathbb{R}$.
12. Let $I$ be an interval and $f : I \to \mathbb{R}$ a differentiable function such that
$f'(x)$ is never zero. Show that $f$ is strictly monotonic.
[Figure: a graph with a local max, a saddle point and a local min marked.]
As we move from left to right and pass through the saddle point, the derivative
changes from positive to zero and back to positive. Thus, it has the same
sign on each side of the saddle point. As we pass through the local maximum
the derivative changes from positive to negative, and at the local minimum it
changes from negative to positive. These observations give a test for deciding
whether a critical point is a local extreme and of what kind.
Similarly, if $f'(x) < 0$ for $x \in (a, c)$ and $f'(x) > 0$ for $x \in (c, b)$, there is a
local minimum at $c$.
But if $f'(x)$ has the same sign on both sides of $c$ then values on one side
are higher and on the other are lower. Hence there is neither a local maximum
nor a local minimum at $c$. ␣
Example 4.6.2. Consider $f(x) = x^2 e^x$. This is a differentiable function, so
its critical points are given by the derivative being zero. We have $f'(x) =
2xe^x + x^2 e^x = x(x + 2)e^x$. Hence
\[ f'(c) = 0 \iff c(c + 2) = 0 \iff c = 0, -2. \]
To identify the nature of the critical points we have to find the sign of the
derivative on either side of them:
We have seen that the first derivative tells us whether a function is increasing
or decreasing, and how fast. We can apply the same logic to get information
from the second derivative. The sign of $f''$ will determine whether $f'$ is rising
or falling, and therefore whether the graph of $f$ rises or falls with increasing or
decreasing steepness.
\[ y = f(c) + \frac{f(d) - f(c)}{d - c}\,(x - c). \]
Consider the difference $g(x) = f(c) + \frac{f(d) - f(c)}{d - c}\,(x - c) - f(x)$. Note that
$g(c) = g(d) = 0$. Further, $g'' = -f'' \le 0$ and so $g'$ is a decreasing function.
We wish to show that for each $x$, $g(x) \ge 0$. Suppose that $g(x) < 0$ at some
point $x \in (c, d)$. By the Monotonicity Theorem, we obtain $\alpha, \beta$ as follows:
\[ \alpha \in (c, x) \text{ and } g'(\alpha) < 0, \]
Theorem 4.6.6 (Second Derivative Test). Let $f$ have a critical point at $c$ and
$f''$ be continuous in an open interval containing $c$. Then
1. $f''(c) > 0$ implies there is a local minimum at $c$.
Example 4.6.7. Let $f(x) = x^2 e^x$. We saw earlier that this has a local maximum
at $-2$ and a local (as well as absolute) minimum at 0. Now we identify
the inflection points and convexity. First, we calculate the second derivative:
\[ f''(c) = 0 \iff c^2 + 4c + 2 = 0 \iff c = -2 \pm \sqrt{2} \approx -3.4, -0.6. \]
             $x < -2 - \sqrt{2}$    $-2 - \sqrt{2} < x < -2 + \sqrt{2}$    $x > -2 + \sqrt{2}$
$f''(x)$     $+$                    $-$                                    $+$
Convexity    Convex                 Concave                                Convex
Note that $f''(-2) = -2e^{-2} < 0$ confirms the local maximum at $-2$ and $f''(0) =
2 > 0$ confirms the local minimum at 0.
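The sign pattern in the table can be reproduced by evaluating the second derivative at one test point per interval. The sketch below assumes $f''(x) = (x^2 + 4x + 2)e^x$, obtained by differentiating $f'(x) = x(x+2)e^x$ once more (it is consistent with the quadratic $c^2 + 4c + 2$ above); the test points are arbitrary choices.

```python
import math

def second_derivative(x):
    """f''(x) for f(x) = x^2 e^x, namely (x^2 + 4x + 2) e^x."""
    return (x * x + 4 * x + 2) * math.exp(x)

# Zeros of the second derivative: x = -2 - sqrt(2) and x = -2 + sqrt(2).
roots = (-2 - math.sqrt(2), -2 + math.sqrt(2))

# One test point inside each of the three intervals cut out by the roots.
for x in (-4.0, -2.0, 0.0):
    label = "convex" if second_derivative(x) > 0 else "concave"
    print(f"x = {x:5.1f}: f''(x) = {second_derivative(x):+.4f} ({label})")
```

The middle test point $x = -2$ is also the local maximum, which ties the concavity there to the Second Derivative Test.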
4.6. Derivative Tests and Curve Sketching 155
Here is the graph of $f(x)$ showing the convex parts as solid curves and the
concave part as a dashed curve:
[Figure: graph of $f(x) = x^2 e^x$.]
Example 4.6.8. Let $f(x) = x + \sin(x)$ on the interval $[0, 2\pi]$. We saw in
Example 4.6.3 that the only critical point is at $x = \pi$ and this is not a local
maximum or minimum. Now we calculate the second derivative:
Curve Sketching
We have seen how first and second derivative calculations can give us key features
of a graph. We can capture all the essential aspects of a function's behaviour
by supplementing these with the following: domain, axis-intercepts,
points of discontinuity, symmetry (even, odd, periodic), asymptotes (vertical,
slant).
Example 4.6.9. $f(x) = \arctan\frac{x - 1}{x + 1}$.
Domain: Since the domain of arctan is $\mathbb{R}$, the only point where this expression
is undefined is $x = -1$. Hence the domain is $\mathbb{R} \setminus \{-1\}$. Note also that
$f(x) \in (-\pi/2, \pi/2)$.
Intercepts: The function is zero at $x = 1$. It cuts the $y$-axis at $y = f(0) =
\arctan(-1) = -\pi/4$.
Symmetry: We have $f(2) = \arctan(1/3)$ and $f(-2) = \arctan(3)$. They
are positive and unequal (arctan is 1-1) so $f(x)$ is neither even nor odd.
\[ \lim_{x\to -1-} \frac{x - 1}{x + 1} = \lim_{t\to 0+} \frac{t + 2}{t} = \infty \implies \lim_{x\to -1-} \arctan\frac{x - 1}{x + 1} = \frac{\pi}{2}. \]
Since the limits are finite there isn't a vertical asymptote at $x = -1$. They are
still useful in plotting the graph.
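Horizontal Asymptotes:
\[ \lim_{x\to\infty} \frac{x - 1}{x + 1} = 1 \implies \lim_{x\to\infty} \arctan\frac{x - 1}{x + 1} = \arctan(1) = \frac{\pi}{4}, \]
\[ \lim_{x\to -\infty} \frac{x - 1}{x + 1} = 1 \implies \lim_{x\to -\infty} \arctan\frac{x - 1}{x + 1} = \arctan(1) = \frac{\pi}{4}. \]
Critical Points:
\[ \frac{d}{dx}\arctan\frac{x - 1}{x + 1} = \frac{1}{1 + \left(\frac{x - 1}{x + 1}\right)^2}\,\frac{d}{dx}\left(\frac{x - 1}{x + 1}\right) \]
\[ = \frac{(x + 1)^2}{(x + 1)^2 + (x - 1)^2} \times \frac{(x + 1) - (x - 1)}{(x + 1)^2} \]
\[ = \frac{2}{2x^2 + 2} = \frac{1}{x^2 + 1}. \]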
The derivative $f'(x)$ always exists and is never zero, so there are no critical
points. In fact $f'(x) > 0$ and so $f$ is strictly increasing on any interval in its
domain. So $f$ is strictly increasing on $(-\infty, -1)$ and also on $(-1, \infty)$.
Convexity: $f''(x) = \frac{d}{dx}\,\frac{1}{x^2 + 1} = \frac{-2x}{(x^2 + 1)^2}$.
We have $f''(x) > 0$ for $x < 0$ and $f''(x) < 0$ for $x > 0$. So $f$ is convex on
$(-\infty, -1)$ and on $(-1, 0)$. It is concave on $(0, \infty)$. The only inflection point is
$x = 0$.
Here is the graph of $f$:
[Figure: graph of $f$, with the horizontal asymptote $y = \pi/4$.]
Example 4.6.10. Let $f(x) = \frac{x^2}{x^2 + 9}$.
Domain: Clearly, the domain is R. And the image is in [0, 1).
Intercepts: The function is zero at x = 0.
Symmetry: The function is even.
Vertical Asymptotes: As f is continuous on R it has no vertical asymp-
totes.
Horizontal Asymptotes:
\[ \lim_{x\to\infty} \frac{x^2}{x^2 + 9} = \lim_{x\to\infty} \frac{1}{1 + 9/x^2} = 1, \]
\[ \lim_{x\to -\infty} \frac{x^2}{x^2 + 9} = \lim_{x\to -\infty} \frac{1}{1 + 9/x^2} = 1. \]
Critical Points:
\[ \frac{d}{dx}\left(\frac{x^2}{x^2 + 9}\right) = \frac{2x(x^2 + 9) - x^2(2x)}{(x^2 + 9)^2} = \frac{18x}{(x^2 + 9)^2}. \]
The only critical point is $x = 0$. We have $f'(x) < 0$ for $x < 0$ and $f'(x) > 0$ for
$x > 0$. So the First Derivative Test implies that there is a local minimum at $x = 0$.
Note that $f(0) = 0$.
Convexity:
\[ \frac{d^2}{dx^2}\left(\frac{x^2}{x^2 + 9}\right) = \frac{d}{dx}\,\frac{18x}{(x^2 + 9)^2} = 18\,\frac{(x^2 + 9) - 4x^2}{(x^2 + 9)^3} = 54\,\frac{3 - x^2}{(x^2 + 9)^3}. \]
The possible inflection points are $x = \pm\sqrt{3}$. Note that $f(\pm\sqrt{3}) = 0.25$.
             $x < -\sqrt{3}$    $-\sqrt{3} < x < \sqrt{3}$    $x > \sqrt{3}$
$f''(x)$     $-$                $+$                           $-$
Convexity    Concave            Convex                        Concave
[Figure: graph of $f$, with the horizontal asymptote $y = 1$.]
Example 4.6.11. Let $f(x) = (x - x^3)^{1/3}$. The computations are a little lengthy,
so we just give the results. Verifying them will be an excellent test of your
algebra and differentiation skills!
\[ a = \lim_{x\to\infty} \frac{(x - x^3)^{1/3}}{x} = \lim_{x\to\infty} \left(\frac{1}{x^2} - 1\right)^{1/3} = -1, \]
\[ b = \lim_{x\to\infty} \left((x - x^3)^{1/3} - (-x)\right) = \lim_{x\to\infty} \frac{x^{-1}}{\left(\frac{1}{x^2} - 1\right)^{2/3} - \left(\frac{1}{x^2} - 1\right)^{1/3} + 1} = 0. \]
Critical Points:
\[ \frac{df}{dx}(x) = \frac{1 - 3x^2}{3(x - x^3)^{2/3}}. \]
The derivative is undefined for $x = 0, \pm 1$. It has infinite limit at these points,
indicating a vertical slope. The derivative is zero at $x = \pm 1/\sqrt{3}$. Thus, there
are five critical points. The intervals of increase and decrease are:
$x$        $(-\infty, -1)$   $(-1, \frac{-1}{\sqrt{3}})$   $(\frac{-1}{\sqrt{3}}, 0)$   $(0, \frac{1}{\sqrt{3}})$   $(\frac{1}{\sqrt{3}}, 1)$   $(1, \infty)$
$f'(x)$    $-$               $-$                           $+$                          $+$                         $-$                         $-$
Convexity:
\[ \frac{d^2 f}{dx^2}(x) = -\frac{2 + 6x^2}{9(x - x^3)^{5/3}}. \]
The second derivative is never zero. But there are possible inflection points
where it is undefined, i.e., at $x = 0, \pm 1$.
[Figure: graph of $f(x) = (x - x^3)^{1/3}$.]
2. For each of the following functions, use the first derivative to find and
classify the critical points, and identify the intervals of decrease and increase.
(a) $f(x) = \sin^2 x$, $x \in [-\pi, \pi]$    (c) $h(x) = xe^{-x^2/2}$, $x \in [-3, 3]$
(b) $g(x) = e^{-x^2/2}$, $x \in [-3, 3]$    (d) $k(x) = x|x - 1|$, $x \in [-1, 2]$
3. For each function in the previous exercise, find the intercepts and asymptotes,
use the second derivative to find the inflection points, identify the intervals
of convexity and concavity, and incorporate this information in their
graphs.
(a) Show that u does not have a positive local maximum or a negative local
minimum in (0, 1).
(b) Suppose u(0) = u(1) = 0. Show that u = 0.
8. Suppose $f$ satisfies $x^2 f''(x) + 4x f'(x) + 2f(x) \ge 0$ on $(a, b)$ and $f(a) =
f(b) = 0$. Show that $f(x) \le 0$ on $[a, b]$. (Hint: Consider $g(x) = x^2 f(x)$.)
9. Prove that $e^x > 1 + x + x^2/2$ for $x > 0$.
10. Consider a function f on an interval I. Prove the following:
(a) f is convex on I if and only if f ((1 − t)x + ty) ≤ (1 − t)f (x) + tf (y) for
any x, y ∈ I and t ∈ [0, 1].
(b) f is concave on I if and only if f ((1 − t)x + ty) ≥ (1 − t)f (x) + tf (y)
for any x, y ∈ I and t ∈ [0, 1].
Differential Equations
We'll solve a very important class of differential equations, called `second-order
ordinary differential equations with constant coefficients'. Such differential
equations crop up in the study of mechanics, waves, electrical circuits
and market equilibrium. They have the form
\[ f'' + af' + bf = g, \]
(a) If $f(0) = 0$ and $f'(0) = 1$ then $f(x) = \sinh x$.
(b) If $f(0) = 1$ and $f'(0) = 0$ then $f(x) = \cosh x$.
With the help of the cases $f'' = \pm f$ we can solve a general equation
$f'' = kf$:
A3. Suppose $f : \mathbb{R} \to \mathbb{R}$ satisfies $f'' = kf$ for some $k \in \mathbb{R}$, $f(0) = A$ and
$f'(0) = B$. Prove the following:
(b) If $d = 0$ and the characteristic equation has repeated root $\lambda$, show that
any solution has the form $f(x) = Ae^{\lambda x} + Bxe^{\lambda x}$.
(c) If $d > 0$ and the characteristic equation has distinct real roots $\lambda_1, \lambda_2$,
show that any solution has the form $f(x) = Ae^{\lambda_1 x} + Be^{\lambda_2 x}$.
(d) If $d < 0$ and the characteristic equation has complex roots $r \pm wi$, show
that any solution has the form $f(x) = Ae^{rx}\cos wx + Be^{rx}\sin wx$.
A5. Suppose the roots of the characteristic equation of $f'' + af' + bf = 0$
have negative real parts. Then every solution $f$ satisfies $\lim_{x\to\infty} f(x) = 0$.
two parameters $A, B$. By varying them we can generate all solutions. This form
will be called the `general solution' of the homogeneous differential equation
and we shall denote it by $f_h$. For example, if we consider $f'' + 2f' + f = 0$ then
$f_h(x) = Ae^{-x} + Bxe^{-x}$.
A7. Consider the differential equation $f''(x) - 3f'(x) + 2f(x) = \cos x$. Since
cosine can be obtained by differentiating sine once or cosine twice, we try
$f_p(x) = \alpha\cos x + \beta\sin x$.
(a) Substitute $f_p$ in the given differential equation and show $\alpha = 0.1$, $\beta =
-0.3$.
(b) Show the general solution is $f(x) = 0.1\cos x - 0.3\sin x + Ae^x + Be^{2x}$.
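The stated solution can be checked by substituting it back into the equation numerically. The sketch below is a check, not a derivation; the derivatives are entered by hand, and the constants `A` and `B` are arbitrary test values chosen here.

```python
import math

A, B = 2.0, -1.0  # arbitrary test constants (not from the text)

def f(x):
    return 0.1 * math.cos(x) - 0.3 * math.sin(x) + A * math.exp(x) + B * math.exp(2 * x)

def f1(x):
    """First derivative of f, computed by hand."""
    return -0.1 * math.sin(x) - 0.3 * math.cos(x) + A * math.exp(x) + 2 * B * math.exp(2 * x)

def f2(x):
    """Second derivative of f, computed by hand."""
    return -0.1 * math.cos(x) + 0.3 * math.sin(x) + A * math.exp(x) + 4 * B * math.exp(2 * x)

def residual(x):
    """Left side minus right side of f'' - 3 f' + 2 f = cos x."""
    return f2(x) - 3 * f1(x) + 2 * f(x) - math.cos(x)

for x in (-1.0, -0.3, 0.0, 0.5, 1.0):
    print(x, residual(x))
```

The residual vanishes up to rounding for any choice of `A` and `B`, which reflects that $e^x$ and $e^{2x}$ solve the homogeneous equation.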