Calculus 1 - Amber Habib (MAT101)


CALCULUS by AMBER HABIB

1 | Real Numbers and Functions

Calculus could be described as a means of studying how one quantity is affected


by another, focusing on relationships which are smooth rather than erratic. This
DRAFT August 15, 2020

chapter sets up the basic language for describing quantities and relationships
between them. Quantities are described by numbers and you would have seen
different kinds of numbers: natural numbers, whole numbers, integers, rational
numbers, real numbers, perhaps complex numbers. Of all these, real numbers
provide the right setting for the techniques of Calculus and so we begin by
listing their properties and understanding what distinguishes them from other
number systems. The key element here is the Completeness Axiom, without
which Calculus would lose its power.

The mathematical object which describes relationships is the `function'. We
recall the definition of a function and then concentrate on functions that relate
real numbers. Such functions are visualized through their graphs, and this
visualization is a key part of Calculus. We make a small beginning with simple
examples. A more thorough investigation of graphs can only be carried out
after Calculus has been developed to a certain level. Indeed, the more
interesting functions such as the trigonometric functions, logarithms and
exponentials, require Calculus for their very definition.

1.1 Arithmetic and Order Properties


Field Properties of Real Numbers
We begin with a review of the set R of real numbers, which is also called the
Euclidean line. It is a `review' in that we do not define or construct the set as
such but just list its key attributes, and use them to derive others. For descrip-
tions of how real numbers can be constructed from scratch, you can consult
Hamilton and Landin [10] or Landau [13] or most books on Real Analysis. The
fundamental ideas underlying these constructions are easy to absorb, but the
checking of details can be arduous. You would probably appreciate them more
after reading this book.

What is the need of this review? Mainly, it is intended as a warm-up session


before we begin Calculus proper. Many involved definitions and proofs lie in
wait later, and we need to get ready for them by practicing on easier material.
If you are in a hurry and confident of your basic skills with numbers and proofs,
you may skip ahead to the next section, although a patient reading of these few
pages would also help in later encounters with Linear Algebra and Abstract
Algebra.

Any concept of `numbers' involves rules for combining them to create new
ones. We shall use the term binary operation to denote a rule for associating
a single member of a set to each pair of elements from that set.

[Figure: the real line R, with the points 0 and 1 marked.]

The set R is equipped with two binary operations, + (addition) and · (multi-
plication), and has two special elements named zero (0) and one (1), with the
following fundamental properties:

R1. Addition and multiplication are commutative: a + b = b + a and a · b = b · a


for every a, b ∈ R.
R2. Addition and multiplication are associative: a + (b + c) = (a + b) + c and
a · (b · c) = (a · b) · c for every a, b, c ∈ R.
R3. 0 (zero) serves as identity for addition: 0 + a = a for every a ∈ R.
R4. 1 (one) serves as identity for multiplication: 1 · a = a for every a ∈ R.
R5. Each a ∈ R has an additive inverse b ∈ R, with the property a + b = 0.
We denote b by −a.
R6. Each non-zero a ∈ R has a multiplicative inverse c ∈ R, with the property
a · c = 1.
We denote c by a⁻¹ or 1/a.
R7. Multiplication distributes over addition: a · (b + c) = (a · b) + (a · c) for
every a, b, c ∈ R.
The properties R1 to R7 are called the field axioms for R. In general, if a set F
has two binary operations + and ·, such that these seven properties hold (with
R replaced by F everywhere), then F is called a field. Other familiar examples
of fields are the set Q of rational numbers and the set C of complex numbers.
Each field has its own binary operations and its own special elements called
zero and one.

Task 1.1.1. Let F₂ = {0, 1} have binary operations + and · defined by

    + | 0  1          · | 0  1
    --+------   and   --+------
    0 | 0  1          0 | 0  0
    1 | 1  0          1 | 0  1

Verify that F₂ is a field.
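For a finite set, the field axioms R1 to R7 can be checked by brute force. The following is a minimal sketch in Python, not a substitute for the proof asked for in the task. The names `is_field`, `add` and `mul` are our own, and we assume that the two tables above agree with addition and multiplication modulo 2.

```python
from itertools import product

F2 = (0, 1)

def add(a, b):
    # Addition table of F2 (assumed to agree with addition modulo 2).
    return (a + b) % 2

def mul(a, b):
    # Multiplication table of F2 (assumed to agree with multiplication modulo 2).
    return (a * b) % 2

def is_field(elems, add, mul, zero, one):
    """Brute-force check of the axioms R1 to R7 on a finite set."""
    pairs = list(product(elems, repeat=2))
    triples = list(product(elems, repeat=3))
    # Both operations must send pairs of elements back into the set.
    closed = all(add(a, b) in elems and mul(a, b) in elems for a, b in pairs)
    r1 = all(add(a, b) == add(b, a) and mul(a, b) == mul(b, a) for a, b in pairs)
    r2 = all(add(a, add(b, c)) == add(add(a, b), c)
             and mul(a, mul(b, c)) == mul(mul(a, b), c) for a, b, c in triples)
    r3 = all(add(zero, a) == a for a in elems)
    r4 = all(mul(one, a) == a for a in elems)
    r5 = all(any(add(a, b) == zero for b in elems) for a in elems)
    r6 = all(any(mul(a, c) == one for c in elems) for a in elems if a != zero)
    r7 = all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c)) for a, b, c in triples)
    return all([closed, r1, r2, r3, r4, r5, r6, r7])

print(is_field(F2, add, mul, 0, 1))
```

The same function can be pointed at other finite tables. For instance, arithmetic modulo 4 fails the check, because 2 has no multiplicative inverse there.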



The set of non-zero real numbers is denoted by R∗. We shall usually
abbreviate a·b to ab.
The fundamental properties listed above imply various other properties of
R, such as:

Theorem 1.1.2. Let a, b, c ∈ R. Then the following hold:


1. (Cancellation laws)
(a) If a + b = a + c then b = c.
(b) If ab = ac and a ≠ 0 then b = c.
2. 0 is the only additive identity and 1 is the only multiplicative identity.
3. The additive inverse of any real number is unique.

4. The multiplicative inverse of any non-zero real number is unique.


5. −(−a) = a.
6. 0 · a = 0.
7. If a ∈ R∗ then (a⁻¹)⁻¹ = a.
8. (−1)a = −a.
9. (−1)(−1) = 1.
10. (−a)(−b) = ab.
11. If ab = 0 then a = 0 or b = 0.
The important thing is to realize that these claims need proof, and then
to prove them using only the field axioms. We shall prove the first three
and leave the others as exercises for you.

Proof. The cancellation laws are based on associativity and the existence of
inverses:

a + b = a + c =⇒ (−a) + (a + b) = (−a) + (a + c)
=⇒ ((−a) + a) + b = ((−a) + a) + c
=⇒ 0 + b = 0 + c =⇒ b = c.

If a ≠ 0 then it has a multiplicative inverse a⁻¹ and we have:

ab = ac =⇒ a⁻¹(ab) = a⁻¹(ac) =⇒ (a⁻¹a)b = (a⁻¹a)c
=⇒ 1 · b = 1 · c =⇒ b = c.

Suppose 0′ and 1′ are additive and multiplicative identities, respectively. Then:

0′ = 0 + 0′ = 0 and 1′ = 1′ · 1 = 1.

Let b and b′ be additive inverses of a. Then

a + b = 0 = a + b′ =⇒ b = b′, by the cancellation law. □

For any a, b ∈ R, the sum a + (−b) is denoted by a − b and is called
the difference of a and b. This creates a new binary operation on R, called
subtraction. Similarly, if b ∈ R∗, the product a · (1/b) is denoted by $\frac{a}{b}$ or a/b
and is called the ratio of a and b.

Task 1.1.3. Is division a binary operation?


We define powers as follows. First, we define $x^0 = 1$ for any x ∈ R. Then,
for any n ∈ N, we define $x^n = x \cdot x^{n-1}$. If x ≠ 0, we define $x^{-n} = (x^n)^{-1}$.
It follows from the above properties that $(-x)^2 = x^2$.
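The recursive definition of powers translates almost word for word into code. Below is a small sketch in Python; the function name `power` is our own, and we use exact rational arithmetic (`fractions.Fraction`) as a stand-in for real numbers so that the comparisons are exact.

```python
from fractions import Fraction

def power(x, n):
    """x raised to the integer power n, following the recursive definition."""
    if n == 0:
        return Fraction(1)              # x^0 = 1 for every x
    if n > 0:
        return x * power(x, n - 1)      # x^n = x * x^(n-1)
    if x == 0:
        raise ZeroDivisionError("negative powers require x != 0")
    return 1 / power(x, -n)             # x^(-n) = (x^n)^(-1)

print(power(Fraction(2), 3), power(Fraction(2), -3))
print(power(Fraction(-3), 2) == power(Fraction(3), 2))
```

Each branch corresponds to one clause of the definition, so the recursion ends after |n| steps.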
Task 1.1.4. Use the field axioms of R to prove the following:
(a) −(a/b) = (−a)/b = a/(−b) if b ≠ 0,
(b) a/b + c/d = (ad + bc)/(bd) if b, d ≠ 0.

Order Properties of Real Numbers


Since the field axioms of R are also satisfied by Q and C we know that they do
not completely determine the real numbers. What else is special about R?
The non-zero real numbers split into two types: positive and negative.
We shall denote the set of positive real numbers by R+ and the set of negative
real numbers by R− . The key facts associated to this split are:

R8. Every non-zero real number is either positive or negative.


R9. Zero is neither positive nor negative.
R10. No real number is both negative and positive.
R11. A real number is negative if and only if its additive inverse is positive.
R12. The sum and product of positive numbers are positive.
The complex numbers cannot be split into positive and negative ones
in this manner. For, one of ±i would be positive, as well as one of ±1.
Hence both −1 = i² = (−i)² and 1 = 1² = (−1)² would be positive!

The properties R8 to R12 are called the order axioms of R. Let us see some
of their consequences:

Theorem 1.1.5.
1. If x, y ∈ R− then x + y ∈ R− .

2. If x, y ∈ R− then xy ∈ R+ .
3. If x ∈ R+ and y ∈ R− then xy ∈ R− .
4. If x ∈ R∗ then x² ∈ R+.
5. 1 ∈ R+ .
Again, these are familiar properties which you were asked to memorize
at school. We wish to convert them to proved facts. We treat the first
two to show you the way, and leave the others as exercises.

Proof. x, y ∈ R− =⇒ −x, −y ∈ R+ =⇒ (−x) + (−y) ∈ R+



=⇒ x + y = −((−x) + (−y)) ∈ R− .
x, y ∈ R− =⇒ −x, −y ∈ R+ =⇒ (−x)(−y) ∈ R+
=⇒ xy = (−x)(−y) ∈ R+ .

The split into positive and negative allows us to think of larger and smaller
real numbers (an `ordering') as follows: We say that a is greater than b, denoted
by a > b, if a − b ∈ R+ . In this case, we also say that b is less than a and
denote that by b < a.

Theorem 1.1.6. Let a, b, c ∈ R. Then the following hold:


1. R+ = { x ∈ R | x > 0 } and R− = { x ∈ R | x < 0 }.
2. (Trichotomy) Exactly one of the following holds: a = b or a > b or a < b.
3. (Transitivity) If a > b and b > c then a > c.
4. If a > b then a + c > b + c.
5. Let c > 0. If a > b then ac > bc.
6. Let c < 0. If a > b then ac < bc.
7. If a < b then a < (a + b)/2 < b.
8. If 0 < a < b then 0 < 1/b < 1/a.
9. Suppose a, b > 0. Then a > b ⇐⇒ a² > b².
We shall prove trichotomy and leave the others as exercises. Note that
item 7 implies that there are infinitely many real numbers between any
two distinct real numbers.

Proof. Let a, b be distinct real numbers. We have to prove that exactly one
of a > b and a < b holds. Since a, b are distinct, a − b ≠ 0. Therefore a − b
belongs to exactly one of R+ and R−. Now a − b ∈ R+ corresponds to a > b
and a − b ∈ R− corresponds to a < b. □

Let A be a subset of R.
• An element M ∈ A is called the maximum of A if a ≤ M for every
a ∈ A. We write M = max(A).

• An element m ∈ A is called the minimum of A if m ≤ a for every a ∈ A.
We write m = min(A).
A maximum element is also called greatest while a minimum element is also
called least.

Example 1.1.7. Let A = { x ∈ R | x ≤ 1 }. Then 1 is the maximum of A.


Task 1.1.8. Let A = { x ∈ R | x < 1 }. Show that A has no maximum.
The set of real numbers includes various special types of numbers:

• By repeatedly adding 1 we generate the subset of natural numbers:

N = {1, 2 = 1 + 1, 3 = 2 + 1, . . . }.

By combining (5) of Theorem 1.1.5 and (4) of Theorem 1.1.6 we see that
1 < 2 < 3 < ···.

• By including zero with the natural numbers we get the whole numbers:
W = N ∪ {0} = {0, 1, 2, . . . }.

• By further including the additive inverse of each whole number we get the
integers:
Z = {. . . , −2, −1, 0, 1, 2, . . . }.

• By dividing integers with each other we get the rational numbers:

Q = { a/b | a, b ∈ Z and b ≠ 0 }.

The positive rational numbers will be denoted by Q+.

Some mathematicians include 0 in the set of natural numbers itself. So
be careful when you see someone using N, and check whether or not
they include 0.

We make a small digression to recall some important facts about the natural
numbers.
Principle of Mathematical Induction: If A is a subset of N which contains
1 and is closed under adding 1 then A = N. Alternately: If P (n) is a
statement about n (for every natural number n) such that P (1) is true
and the truth of P (n) implies the truth of P (n + 1), then P (n) is true for
every natural number n.

Principle of Strong Mathematical Induction: If A is a subset of N which


contains 1 and contains n + 1 whenever it contains all of 1, . . . , n, then
A = N. Alternately: If P (n) is a statement about n (for every natural
number n) such that P (1) is true and the truth of P (1), . . . , P (n) implies
the truth of P (n + 1), then P (n) is true for every natural number n.

Well Ordering Principle: Every non-empty subset of N has a least element.



Strong induction has that name because its hypothesis is weaker than ordinary
induction. That is, if A satisfies the hypothesis of ordinary induction, it will
also satisfy the hypothesis of strong induction. One can see from this that the
Principle of Strong Mathematical Induction implies the Principle of Mathe-
matical Induction. In fact, all three of these principles are equivalent. Starting
from any one, we can prove the others. (See Exercises 6 to 8) Their typical
uses are quite different: The Induction Principles are typically used to prove
something involving all natural numbers, while the Well Ordering Principle is
used to show the existence of special numbers.

Absolute Value
Let us finish this overview of familiar facts about the real numbers by recalling
the definition of the absolute value of a real number x:

$$|x| = \begin{cases} x & \text{if } x \ge 0, \\ -x & \text{if } x < 0. \end{cases}$$

We think of a real number as having two aspects: a direction determined by


whether it is positive or negative, and a magnitude given by its absolute value.

[Figure: the real line R with −x, 0 and x marked; the points −x and x both lie at distance |x| from 0.]

Theorem 1.1.9. Let x, y, z ∈ R. Then:


1. |x| ≥ 0.
2. |x| = 0 if and only if x = 0.
3. |x²| = |x|² = x².
4. |xy| = |x||y|.

5. (Triangle Inequality) |x + y| ≤ |x| + |y|.


6. |x − y| ≥ ||x| − |y||.
Proof. The first two claims are obvious from the definition. Proofs of the others
are given below. We make use of the fact that if a, b ≥ 0 then a = b ⇐⇒ a² = b²
and a ≤ b ⇐⇒ a² ≤ b² (see item 9 of Theorem 1.1.6).

3. Since x² ≥ 0, we have |x²| = x² = (±x)² = |x|².

4. |xy|² = (xy)² = x²y² = |x|²|y|² = (|x||y|)².

5. |x + y|² = (x + y)² = x² + y² + 2xy ≤ |x|² + |y|² + 2|x||y| = (|x| + |y|)².

6. |x − y|² = (x − y)² = x² + y² − 2xy ≥ |x|² + |y|² − 2|x||y| = (|x| − |y|)² =
||x| − |y||². □
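A numerical spot-check is no replacement for the proof, but it is a quick way to catch a misremembered inequality. The sketch below tests items 4, 5 and 6 of the theorem on a grid of rational sample points. The names `absolute`, `samples` and `holds_on_samples` are our own, and the choice of sample points is arbitrary.

```python
from fractions import Fraction
from itertools import product

def absolute(x):
    # The definition: |x| = x if x >= 0, and -x if x < 0.
    return x if x >= 0 else -x

# A small grid of rational sample points, including negative values and zero.
samples = [Fraction(n, d) for n in range(-6, 7) for d in (1, 2, 3)]

def holds_on_samples():
    """Check |xy| = |x||y|, |x + y| <= |x| + |y| and |x - y| >= ||x| - |y||."""
    for x, y in product(samples, repeat=2):
        if absolute(x * y) != absolute(x) * absolute(y):
            return False
        if absolute(x + y) > absolute(x) + absolute(y):
            return False
        if absolute(x - y) < absolute(absolute(x) - absolute(y)):
            return False
    return True

print(holds_on_samples())
```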
Task 1.1.10. For any x, a ∈ R with a ≥ 0, prove that |x| ≤ a ⇐⇒ −a ≤ x ≤
a.
Task 1.1.11. Let $x_1, \dots, x_n$ ∈ R. Use induction to show that

$$\left| \sum_{i=1}^{n} x_i \right| \le \sum_{i=1}^{n} |x_i|.$$

Since we think of |x| as the magnitude or size of a real number, |x − y|


becomes a measure of the gap between x and y . We call it the distance between
x and y. The properties of absolute value convert to the following properties of
distance:

Theorem 1.1.12. Let x, y, z ∈ R. Then:


1. (Positivity) |x − y| ≥ 0, and |x − y| = 0 if and only if x = y.
2. (Symmetry) |x − y| = |y − x|.
3. (Triangle Inequality) |x − z| ≤ |x − y| + |y − z|.
Proof. Exercise. □

Exercises for §1.1


1. Let F₃ = {0, 1, 2}. Complete the following tables for + and · such that F₃
becomes a field:

    + | 0  1  2          · | 0  1  2
    --+---------   and   --+---------
    0 | 0  1  2          0 |    0
    1 | 1                1 | 0  1  2
    2 | 2                2 |    2

2. Put the following numbers in ascending order without converting them to
decimal form:

$$3,\ \sqrt{2},\ \frac{14}{10},\ \frac{17}{12},\ 2,\ -2,\ -\frac{3}{2}.$$

3. In each case below, find the numbers with the given property and sketch
the solutions on the number line:

(a) x² − x > 0, (b) 3x² + 2x − 1 ≥ 0, (c) x² − 5x + 6 < 0.

4. Find the numbers which meet all the conditions of the previous exercise.

5. Let x, y ∈ R and m, n ∈ Z. Prove the following, taking x, y to be non-zero
wherever required:

(a) $x^m x^n = x^{m+n}$, (b) $x^m y^m = (xy)^m$, (c) $(x^m)^n = x^{mn}$.

6. The Principle of Mathematical Induction (stated earlier in this section) is
usually taken as an axiom for the natural numbers. The Principle of Strong
Mathematical Induction can be derived from it. Let A be a non-empty subset
of N such that 1 ∈ A and 1, . . . , n ∈ A =⇒ n + 1 ∈ A. Define

S = { n ∈ N | k ∈ N and k ≤ n implies k ∈ A }.

Show the following:

(a) 1 ∈ S and n ∈ S =⇒ n + 1 ∈ S.
(b) A = N.
7. The Principle of Mathematical Induction also implies the Well Ordering
Principle. Given a non-empty subset A of N, let

S = { n ∈ N | k ∈ N and k ≤ n implies k ∉ A }.

Show the following:

(a) If S = ∅ then 1 is the least element of A.
(b) If S ≠ ∅ then there is N ∈ N such that N ∈ S and N + 1 ∉ S. The
number N + 1 is the least element of A.
8. Show that the Well Ordering Principle implies the Principle of Mathemat-
ical Induction.

9. Use mathematical induction to prove that for any x, y ∈ R and n ∈ N,

$$x^{n+1} - y^{n+1} = (x - y) \sum_{i=0}^{n} x^i y^{n-i}.$$

Hence, for x ≠ 1, $\sum_{i=0}^{n} x^i = \frac{1 - x^{n+1}}{1 - x}$.
10. Prove the following for a, b ∈ R and n ∈ N:

(a) $1^n = 1$. (b) 0 < a < b =⇒ $a^n < b^n$.

11. Prove the following:

(a) $\sum_{k=1}^{n} k = \frac{n(n+1)}{2}$, (b) $\sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}$.

12. Recall that the factorials of whole numbers are defined by 0! = 1 and
n! = n · (n − 1)! for n ∈ N. Further, the binomial coefficients are defined by
$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$ for 0 ≤ k ≤ n. Prove the following for 0 ≤ k ≤ n:

(a) $\binom{n+1}{k+1} = \binom{n}{k} + \binom{n}{k+1}$, (b) $\binom{n}{k}$ ∈ N.

13. Prove the Binomial Theorem: For any x, y ∈ R and n ∈ N,

$$(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}.$$

(Note that this requires the definition $0^0 = 1$.)

1.2 Completeness Axiom and Archimedean Property


The rational numbers also satisfy the same order axioms as R, and hence the
field and order axioms together still do not completely determine the real
numbers. For that, we need one more property:

R13. Suppose A and B are non-empty subsets of R such that a ≤ b for every
a ∈ A and b ∈ B . Then there is a real number m such that a ≤ m ≤ b for
every a ∈ A and b ∈ B .
This is called the Completeness Axiom and it has been established that
the real numbers form the only system which satisfies the field and order axioms
as well as the Completeness Axiom.

An important consequence of the completeness axiom is the existence of


square roots.

You are probably aware that the rational numbers do not have this
property. For example, there is no rational number whose square is 2.
Thus the Completeness Axiom distinguishes R from Q.
Theorem 1.2.1. Let a ∈ R+. Then there is a unique b ∈ R+ such that b² = a.
(We call b the positive square root of a and denote it by $a^{1/2}$ or $\sqrt{a}$.)
The Completeness Axiom lends itself to showing the existence of a number
with a particular property by locating it between numbers which
are too large or too small to have that property. To find a number whose
square is a we create one class of numbers whose squares are greater
than a and another of numbers whose squares are less than a.

Proof. Note that if $b = a^{1/2}$ then $1/b = (1/a)^{1/2}$. So it is enough to do the
a > 1 case, and we assume a > 1 in the rest of this proof.

Existence: Let A = { x ∈ R+ | x² < a } and B = { y ∈ R+ | y² > a }. Then
1 ∈ A and a ∈ B. So A and B are non-empty.
Now x ∈ A and y ∈ B implies that x² < a < y², and hence x < y. By the
Completeness Axiom, there is a number b ∈ R+ such that x ≤ b ≤ y for every
x ∈ A, y ∈ B. We'll show that b² = a.

First, suppose b² > a. Then b² = a + δ with δ > 0. Let's take a number
b′ which is slightly smaller than b. It will have the form b′ = b − h with h > 0.
Hence

(b′)² = b² + h² − 2bh > b² − 2bh = a + δ − 2bh.

If we take h = δ/(2b), we have (b′)² > a and hence b′ ∈ B. But this contradicts
the fact that b′ < b.

Second, suppose b² < a. Then b² = a − δ with δ > 0. Let's take a number
b′ which is slightly greater than b. It will have the form b′ = b + h with h > 0.
Choose h to be less than both b and δ/(3b). Then

(b′)² = a − δ + h(h + 2b) < a − δ + 3bh < a − δ + δ = a =⇒ b′ ∈ A.

This contradicts b′ > b.

Since b² > a and b² < a have both been shown to be impossible, by
trichotomy we must have b² = a.

Uniqueness: Suppose b, b′ ∈ R+ and b² = (b′)². Then 0 = b² − (b′)² =
(b − b′)(b + b′). Since b + b′ > 0 we must have b − b′ = 0. □

One can prove in a similar fashion, for any a ∈ R+ and n ∈ N, the existence
of the positive nth root $a^{1/n}$. However, we shall wait until we have studied the
exponential function, when we shall have an extremely simple proof. In the
meantime, we'll use only square roots and not cube roots or any other nth
roots.
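The idea of trapping the square root between a class of numbers that are too small and a class that are too large can also be used to approximate it. The sketch below does this by repeated halving (bisection). Bisection is our own choice of method here and is not part of the proof above; the function name `sqrt_bounds` and the parameter `steps` are also our own. Exact rational arithmetic is used so that the inequalities are verified without rounding.

```python
from fractions import Fraction

def sqrt_bounds(a, steps=30):
    """Return (lo, hi) with lo*lo < a <= hi*hi, for a rational a > 1.

    The gap hi - lo equals (a - 1) / 2**steps.
    """
    a = Fraction(a)
    if a <= 1:
        raise ValueError("this sketch assumes a > 1, as in the proof")
    lo, hi = Fraction(1), a      # 1 squares to less than a, a squares to more
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid * mid < a:
            lo = mid             # mid is too small: raise the lower bound
        else:
            hi = mid             # mid is large enough: lower the upper bound
    return lo, hi

lo, hi = sqrt_bounds(2)
print(float(lo), float(hi))
```

Every number produced this way is rational, so for a = 2 neither bound ever squares to exactly 2. The square root itself is only reached through the Completeness Axiom.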

Let A be a subset of R.
• An element M ∈ R is called an upper bound of A if a ≤ M for every
a ∈ A.
• An element m ∈ R is called a lower bound of A if m ≤ a for every a ∈ A.
If A has an upper bound, we say A is bounded above. If it has a lower bound,
we say it is bounded below. If it is bounded above and below, we simply say
it is bounded. If it is not bounded, we say it is unbounded.

When A, B satisfy the hypotheses of the Completeness Axiom then A is


automatically bounded above, and B consists of certain upper bounds of A.
Now if we are originally given just a non-empty set A which is bounded above,
we can choose B to be the set of all upper bounds of A. The Completeness
Axiom then gives an α which separates A and B. We note the following:

1. Since a ≤ α for every a ∈ A, α is an upper bound of A.
2. Since α ≤ b for every b ∈ B, α is least among all the upper bounds of A.
3. The number α is the only number with these properties.

Therefore the number α is called the least upper bound (or LUB) of A. It is
also called the supremum (or sup) of A. We have just proved the following:

Theorem 1.2.2 (Least Upper Bound or LUB Property). Every non-empty


subset of R which is bounded above has a (unique) least upper bound. □

Similarly, any non-empty subset A of R which is bounded below has a


(unique) greatest lower bound (or GLB). This is a number β which is a
lower bound of A and which is the greatest among all the lower bounds of A.
It is also called the infimum (or inf) of A.
The following subsets of real numbers are called intervals:
(a, b) = { x ∈ R | a < x < b },
[a, b] = { x ∈ R | a ≤ x ≤ b },
[a, b) = { x ∈ R | a ≤ x < b },
(a, b] = { x ∈ R | a < x ≤ b },
(a, ∞) = { x ∈ R | a < x },
[a, ∞) = { x ∈ R | a ≤ x },
(−∞, b) = { x ∈ R | x < b },
(−∞, b] = { x ∈ R | x ≤ b },
(−∞, ∞) = R.
Task 1.2.3. Let a < b. Show that the supremum of [a, b] is b, while its infimum
is a.
Task 1.2.4. Let a < b. Show that the supremum of (a, b) is b, while its infimum
is a.

The supremum and infimum of an interval are called its endpoints. All
other points of the interval are called its interior points.
An interval is called closed if it contains its endpoints. Intervals of the
form [a, b], [a, ∞), (−∞, b] and (−∞, ∞) are closed.

An interval is called open if it contains none of its endpoints. Intervals of


the form (a, b), (a, ∞), (−∞, b) and (−∞, ∞) are open.

Task 1.2.5. Let I be an interval and a, b ∈ I . Show that a < x < b implies
x ∈ I.
Theorem 1.2.6 (Archimedean Property of R, ver. 1). The set N is not bounded
above in R.

Proof. Suppose N is bounded above. Then the set B of all upper bounds of N is
non-empty. Further, if a ∈ N and b ∈ B then a ≤ b. Hence, by the Completeness
Axiom, there is a real number α such that a ≤ α ≤ b for every a ∈ N, b ∈ B .

Now α − 1 ∉ B. Hence there is an N ∈ N such that N > α − 1. But then
N + 1 ∈ N and N + 1 > α, a contradiction. □

Task 1.2.7. Show that Z has neither an upper nor a lower bound in R.
Theorem 1.2.8 (Archimedean Property of R, ver. 2). Let x, y ∈ R+ . Then
there exists N ∈ N such that N x > y.

Proof. Consider y/x. By the Archimedean Property (ver. 1) there is N ∈ N such that
N > y/x, and hence Nx > y. □

Theorem 1.2.9 (Archimedean Property of R, ver. 3). Let x, y ∈ R+. Then
there exists N ∈ N such that 0 < y/N < x.

Proof. There is N ∈ N such that Nx > y > 0. Hence 0 < y/N < x. □
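For rational inputs, the number N promised by versions 2 and 3 can be written down explicitly. The sketch below is an illustration under that assumption, not part of the proofs. The function name `archimedean_n` is our own, and the formula N = ⌊y/x⌋ + 1 is one convenient choice that happens to give the smallest such N.

```python
from fractions import Fraction
from math import floor

def archimedean_n(x, y):
    """Smallest natural number N with N*x > y, for positive rationals x and y."""
    x, y = Fraction(x), Fraction(y)
    if x <= 0 or y <= 0:
        raise ValueError("x and y must be positive")
    # floor(y/x) <= y/x < floor(y/x) + 1, so this N is the first one that works.
    return floor(y / x) + 1

ruler, distance = Fraction(1, 1000), Fraction(5)
N = archimedean_n(ruler, distance)
print(N, N * ruler > distance, distance / N < ruler)
```

The last line of the example illustrates both readings: N copies of a tiny ruler exceed the distance (version 2), and the distance cut into N parts is shorter than the ruler (version 3).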

The Archimedean property justifies viewing the real numbers as strung
out on a line marked off by integers. The first version tells us that the
natural numbers can reach out and exceed any real number. The second
one says we can change our unit size without losing this property. Any
ruler can be used to measure a distance, no matter how small the ruler
and how immense the distance. The third version reassures us that by
taking fractional parts of the ruler we can also measure arbitrarily small
distances.

The next result is the one Archimedes used to prove formulas for lengths,
areas and volumes of a variety of shapes, and is the reason that this kind of
reasoning is called `Archimedean'.
Theorem 1.2.10. Suppose x, y ∈ R and M > 0 such that y − M/n ≤ x ≤ y + M/n
for every n ∈ N. Then y = x.

Proof. We apply trichotomy. First, suppose y > x. Then 0 < y − x. By


Archimedean Property ver. 3, there is N ∈ N such that 0 < M/N < y − x.
Hence x < y − M/N . This contradicts the given relationship between x, y, M .
So y > x is false. We similarly prove that y < x is false (Try!). Therefore, by
trichotomy, y = x. □
Theorem 1.2.11 (Denseness of Q). Let x, y ∈ R with x < y. Then there is a
rational number in the interval (x, y).

Proof. By shifting the interval (x, y) by a rational, we may assume that x, y


are both positive.

Since y − x > 0 there is N ∈ N such that 1/N < y − x (Archimedean
Principle, ver. 3). Further, there is a k ∈ N such that k/N > x (Archimedean
Principle, ver. 2). Hence, the set A = { m ∈ N | m/N > x } is non-empty. By
the Well Ordering Principle, A has a least element M. Then M/N > x. If
M/N ≥ y then M ≥ 2 (because 1/N < y − x < y) and

$$\frac{M-1}{N} = \frac{M}{N} - \frac{1}{N} \ge y - \frac{1}{N} > x \implies M - 1 \in A,$$

which contradicts the choice of M as the least element of A. Therefore
M/N < y. Hence M/N ∈ (x, y). □
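The proof is constructive enough to be turned into a short computation. The sketch below mirrors its two steps for positive rational endpoints: first pick N with 1/N < y − x, then take the least M with M/N > x. The function name `rational_between` is our own, and using the floor function to find N and M directly is our shortcut in place of the Archimedean Property and the Well Ordering Principle.

```python
from fractions import Fraction
from math import floor

def rational_between(x, y):
    """A rational M/N with x < M/N < y, assuming 0 < x < y."""
    x, y = Fraction(x), Fraction(y)
    if not 0 < x < y:
        raise ValueError("this sketch assumes 0 < x < y")
    N = floor(1 / (y - x)) + 1   # guarantees 1/N < y - x
    M = floor(N * x) + 1         # least natural number with M/N > x
    return Fraction(M, N)

r = rational_between(Fraction(17, 12), Fraction(3, 2))
print(r, Fraction(17, 12) < r < Fraction(3, 2))
```

Since M − 1 ≤ Nx, we get M/N ≤ x + 1/N < y, which is the same estimate the proof uses.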

Members of Qᶜ = R \ Q are called irrational numbers.

Theorem 1.2.12 (Denseness of Qᶜ). Let x, y ∈ R with x < y. Then there is an
irrational number in the interval (x, y).

Proof. By the denseness of Q, there is a non-zero p ∈ Q such that p ∈ (√2 x, √2 y).
(If this interval contains 0, choose p in (0, √2 y) instead.) Then
t = p/√2 is an irrational number and t ∈ (x, y). □

Exercises for §1.2


1. Show that the Completeness Axiom is equivalent to the following (which is
Dedekind's original formulation): If all points of the straight line fall into two
classes such that every point of the first class lies to the left of every point of
the second class, then there exists one and only one point which produces this
division of all points into two classes, this severing of the straight line into two
portions.

2. Show that if x, y ≥ 0 then $\frac{x+y}{2} \ge \sqrt{xy}$, with equality if and only if x = y.
3. Prove that every non-empty subset of R which is bounded below has a
greatest lower bound.

4. Which intervals are bounded?

5. Show that the following set is unbounded:

$$\left\{ 1,\ 1 + \frac{1}{\sqrt{2}},\ 1 + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}},\ 1 + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \frac{1}{\sqrt{4}},\ \dots \right\}.$$

6. Show that the following sets are bounded, and find their supremum:

(a) {1, 1 + 1/2, 1 + 1/2 + 1/2², 1 + 1/2 + 1/2² + 1/2³, . . . },
(b) {0.1, 0.11, 0.111, 0.1111, . . . }.
7. Show that the following set is bounded:

$$\left\{ 1,\ 1 + \frac{1}{2^2},\ 1 + \frac{1}{2^2} + \frac{1}{3^2},\ 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2},\ \dots \right\}.$$

8. Show that 1 is the supremum of each of the following:

(a) A = [0, 1),


(b) B = { 1 − 1/n : n ∈ N }.

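9. Let A be a non-empty subset of R. For c ∈ R, define cA = { cx | x ∈ A }.
Show that:

$$\sup(cA) = \begin{cases} c \cdot \sup(A), & c \ge 0 \text{ and } A \text{ is bounded above}, \\ c \cdot \inf(A), & c < 0 \text{ and } A \text{ is bounded below}, \end{cases}$$

$$\inf(cA) = \begin{cases} c \cdot \inf(A), & c \ge 0 \text{ and } A \text{ is bounded below}, \\ c \cdot \sup(A), & c < 0 \text{ and } A \text{ is bounded above}. \end{cases}$$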

10. Let A, B ⊆ R be non-empty and bounded above.

(a) Show that sup(A ∪ B) = max{sup(A), sup(B)}.


(b) Define A + B = { a + b | a ∈ A and b ∈ B }. Show that sup(A + B) =
sup(A) + sup(B).
(c) Define AB = { ab | a ∈ A and b ∈ B }. Show that if the members of
A, B are all non-negative then sup(AB) = sup(A) sup(B).
11. State and prove the results for infimum that correspond to the previous
exercise.

12. Produce a rational number as well as an irrational number that lie between

17/12 and 2.

1.3 Functions
A function f from a set X to a set Y is usually described as a rule that
associates exactly one element of Y to each element of X. The element of Y
associated to x ∈ X is denoted by f(x). This is not quite a formal definition as
one has to wonder what is allowed as a `rule'. We can do better by stating our
requirements purely in terms of membership of sets:

Let X, Y be sets. A function f from X to Y is a subset of the cartesian
product X × Y such that for each x ∈ X there is exactly one member of f
of the form (x, y), and this member is denoted by (x, f(x)).

In the above definitions, X is called the domain of f and Y is called the
codomain of f. The notation f : X → Y is used as shorthand for `f is a
function with domain X and codomain Y'. The subset of Y consisting of the
values actually taken by the function is called its range.
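For finite sets, the set-theoretic definition can be tested directly. The sketch below stores a candidate function as a Python set of pairs and checks the two requirements: it is a subset of X × Y, and each x ∈ X appears as a first coordinate exactly once. The names `is_function` and `range_of`, and the small sets X and Y, are our own illustrative choices.

```python
def is_function(pairs, X, Y):
    """True if `pairs` (a set of (x, y) tuples) is a function from X to Y."""
    if any(x not in X or y not in Y for x, y in pairs):
        return False                       # not a subset of X x Y
    # Each x in X must be the first coordinate of exactly one pair.
    return all(sum(1 for (u, _) in pairs if u == x) == 1 for x in X)

def range_of(pairs):
    """The values actually taken, as a subset of the codomain."""
    return {y for _, y in pairs}

X = {-2, -1, 0, 1, 2}
Y = {0, 1, 2, 3, 4}
square = {(x, x * x) for x in X}           # the squaring rule restricted to X

print(is_function(square, X, Y), range_of(square))
```

Adding a second pair with the same first coordinate, or dropping the pair for some x, makes the check fail. These are the two ways a rule can fail to be a function.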
Example 1.3.1. Consider f : R → R defined by f(x) = x². The domain and
codomain are both R, but the range is [0, ∞).
Example 1.3.2. Consider the rules depicted by the following pictures. In each
pair of ovals, the one on the left represents the domain, while the one on the
right represents the codomain. The arrows mark the associations given by the
rule. Which diagrams represent functions?

[Figure: four arrow diagrams, labelled (A), (B), (C) and (D).]

(A) does not represent a function because there is a point in the domain that
has no image. (B) also does not represent a function, since there is a point in
the domain that has two images. (C) and (D) do represent functions, since it
is permitted for points in the codomain of a function to have no pre-image as
well as to have multiple pre-images.
Task 1.3.3. Consider a binary operation on a set X . Can you describe it as
a function with a certain domain and codomain?
We have been using the name f for a function. This is simply the most
commonly used notation for a function (as x is for a variable), but we are free

to use any other letter, symbol, or word. Other popular choices are g , h, u, v ,
F , G, H , η , θ and so on. Functions that are particularly important have their
own names such as sin, cos, exp and log.
A function f : X → Y is called one-one or injective if distinct points in
X have distinct images in Y: If a, b ∈ X and a ≠ b then f(a) ≠ f(b).
Task 1.3.4. Show that f : X → Y is one-one if and only if f (a) = f (b) implies
a = b.
Example 1.3.5. Consider the functions depicted below.

[Figure: two arrow diagrams, labelled (A) and (B); the points a and b are marked in the domain of (B).]

The function in (A) is 1-1 because distinct points in the domain are mapped
to distinct points in the codomain. The function in (B) is not 1-1 because the
points a and b are mapped to the same value.

A function f : X → Y is called onto or surjective if its range is all of Y ,


that is, for each b ∈ Y there exists a ∈ X such that f (a) = b.
Example 1.3.6. Consider the functions depicted below.

[Figure: two arrow diagrams, labelled (A) and (B); the point z is marked in the codomain of (A).]

The function in (A) is not onto because the point z in the codomain has no
pre-image. The function in (B) is onto.
Task 1.3.7. Find out whether the following functions are 1-1 or onto. If a
function is not onto, give its range.
1. f : R → R, $f(x) = \frac{1}{2}\,(x + |x|)$.
2. g : R → R, g(x) = x².
3. h : R → R, $h(x) = \begin{cases} x^2 + x + 1 & \text{if } x \ge 0, \\ x + 1 & \text{if } x < 0. \end{cases}$
A function f: X →Y is called a one-one correspondence or bijection
if it is both one-one and onto.
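When the domain and codomain are finite, these three properties can be checked mechanically. In the sketch below a function is stored as a Python dictionary whose keys form the domain; the codomain has to be supplied separately, since being onto depends on it. The function names and the sample data are our own.

```python
def is_injective(f):
    # One-one: distinct inputs have distinct images, so no value repeats.
    return len(set(f.values())) == len(f)

def is_surjective(f, codomain):
    # Onto: the range is all of the codomain.
    return set(f.values()) == set(codomain)

def is_bijective(f, codomain):
    return is_injective(f) and is_surjective(f, codomain)

f = {1: "a", 2: "b", 3: "a"}     # 1 and 3 share an image, so f is not one-one
g = {1: "a", 2: "b", 3: "c"}

print(is_injective(f), is_surjective(f, {"a", "b"}))
print(is_bijective(g, {"a", "b", "c"}), is_surjective(g, {"a", "b", "c", "d"}))
```

The last call shows that the same rule can be onto for one codomain and fail to be onto for a larger one.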

Example 1.3.8. Consider the following functions:


[Figure: three arrow diagrams, labelled (A), (B) and (C).]

The function in (A) is one-one but not onto. The function in (B) is onto but
not one-one. Finally, the one in (C) is both one-one and onto, hence it is a
bijection.

Let f : X → Y. We ask whether there is a function g : Y → X which
`reverses' f. That is, if f takes x to y then g takes y to x and conversely.

[Figure: arrow diagram of a function f in which the point v of the codomain has no pre-image.]

Consider f as in the first picture. A function which reverses f would be
obtained by reversing the arrows in the picture. However, when we reverse
the arrows, we find we have not created a function, since no reversed arrow
starts at v. Why did we run into this problem? Because f is not onto.

[Figure: arrow diagram of a function f in which two arrows end at the point v of the codomain.]

Now consider a different function f, depicted in the second picture. On
reversing the arrows, we have difficulty because two of the reversed arrows
start at v. Thus we have trouble if f is not one-one. These examples show
that our reversing process can only be successful if f is both onto and
one-one, that is, it is a bijection.

So, suppose f : X → Y is a bijection. We define its inverse function
f⁻¹ : Y → X by f⁻¹(y) = x ⇐⇒ f(x) = y. Starting with any y ∈ Y we note
that since f is a bijection there is exactly one x ∈ X such that f(x) = y. This
x is termed f⁻¹(y).
Task 1.3.9. Let f : X → Y be a bijection. Then f⁻¹ : Y → X is also a bijection
and (f⁻¹)⁻¹ = f.
Let f : X → Y and g : Y → Z. Then their composition g ◦ f : X → Z is
defined by
(g ◦ f)(x) = g(f(x)), ∀x ∈ X.

Note that to define g ◦ f, we have to ensure that the codomain of f equals
the domain of g. Only then does f(x) become a valid input for g. (It would
actually be enough for the codomain of f to be a subset of the domain of g,
but that greater generality does not bring any significant improvement in the
theory.)

Task 1.3.10. Show that composition of functions is associative: If f : W → X,


g: X → Y and h : Y → Z then h ◦ (g ◦ f ) = (h ◦ g) ◦ f .

The identity function from a set A to itself is denoted $1_A$ and maps
every element to itself: $1_A(a) = a$ for every a ∈ A.

Task 1.3.11. Let f : X → Y and g : Y → X. Show that g is the inverse
function of f if and only if $g \circ f = 1_X$ and $f \circ g = 1_Y$.

Theorem 1.3.12. Let f : X → Y and g : Y → Z be bijections. Then
$(g \circ f)^{-1} = f^{-1} \circ g^{-1}$.

Proof. We start with the following observations:

1. Since f is a bijection it has an inverse $f^{-1} : Y \to X$.

2. Since g is a bijection it has an inverse $g^{-1} : Z \to Y$.

To verify that $f^{-1} \circ g^{-1}$ is the inverse of $g \circ f$, we carry
out the following calculations:

$$(f^{-1} \circ g^{-1}) \circ (g \circ f) = ((f^{-1} \circ g^{-1}) \circ g) \circ f = (f^{-1} \circ (g^{-1} \circ g)) \circ f = (f^{-1} \circ 1_Y) \circ f = f^{-1} \circ f = 1_X,$$

$$(g \circ f) \circ (f^{-1} \circ g^{-1}) = ((g \circ f) \circ f^{-1}) \circ g^{-1} = (g \circ (f \circ f^{-1})) \circ g^{-1} = (g \circ 1_Y) \circ g^{-1} = g \circ g^{-1} = 1_Z. \qquad \square$$
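The identity in Theorem 1.3.12 can be checked on small finite sets. The sketch below is only an illustration, not part of the proof: it models bijections as Python dictionaries, and the sets and the helper names `inverse` and `compose` are hypothetical choices made for this example.

```python
# Finite bijections modelled as dicts (hypothetical example data).
f = {1: "a", 2: "b", 3: "c"}       # f : X -> Y
g = {"a": 10, "b": 20, "c": 30}    # g : Y -> Z


def inverse(h):
    """Reverse the arrows of a one-one function given as a dict."""
    inv = {y: x for x, y in h.items()}
    if len(inv) != len(h):
        raise ValueError("h is not one-one, so the arrows cannot be reversed")
    return inv


def compose(outer, inner):
    """Return outer ∘ inner, the map x -> outer(inner(x))."""
    return {x: outer[y] for x, y in inner.items()}


lhs = inverse(compose(g, f))             # (g ∘ f)^{-1} : Z -> X
rhs = compose(inverse(f), inverse(g))    # f^{-1} ∘ g^{-1} : Z -> X
print(lhs == rhs)
```

Note the order: the inverse of g is applied first, since it must undo the last map that was applied.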

A constant function is a function that only takes one value. Suppose


f: X →Y only takes one value c∈Y, that is, f (x) = c for every x ∈ X. Then
we denote f by c and call it `the constant function c'.

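Exercises for §1.3

1. Find out whether the following functions are 1-1 or onto. If a function is
not onto, give its range.

(a) $f : \mathbb{R} \setminus \{0\} \to \mathbb{R}$, $f(x) = \frac{1}{x}$.
(b) $g : \mathbb{R} \setminus \{1\} \to \mathbb{R}$, $g(x) = \frac{x}{1-x}$.
(c) $h : \mathbb{R} \setminus \{0, 1\} \to \mathbb{R}$, $h(x) = \frac{1}{x(1-x)}$.

2. Show that the following functions are bijections, and find their inverses:

(a) $f : \mathbb{R} \setminus \{0\} \to \mathbb{R} \setminus \{0\}$, $f(x) = \frac{1}{x}$.
(b) $g : \mathbb{R} \setminus \{1\} \to \mathbb{R} \setminus \{-1\}$, $g(x) = \frac{x}{1-x}$.
(c) $h : [1/2, 1) \to [4, \infty)$, $h(x) = \frac{1}{x(1-x)}$.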

3. Give an example to show that composition of functions is not commutative.

4. Consider functions f: X →Y and g : Y → Z. Prove the following:

(a) If f and g are injective then g◦f is injective.

(b) If f and g are surjective then g◦f is surjective.

(c) If f and g are bijective then g◦f is bijective.

5. Give a counterexample to the converse of each part of the previous exercise.

6. In each case, give an example of a non-constant function f : R → R with
the given property:

(a) $f \circ f = f$,
(b) $f \circ f = 1_{\mathbb{R}}$ and $f \neq 1_{\mathbb{R}}$,
(c) $f \circ f = 0$.
7. Give bijections between the following sets:

(a) N and W.
(b) N and Z. (Hint: 0, −1, 1, −2, 2, . . . )
1
(c) [0, 1) and [0, ∞). (Hint: )
1
8. Give a bijection between N and N × N. The diagram below gives a hint for
one such bijection. The bracketed number next to a pair is the natural number
matched with that pair:

(1,1) [1]    (1,2) [3]    (1,3) [6]    (1,4)    ···
(2,1) [2]    (2,2) [5]    (2,3)        (2,4)    ···
(3,1) [4]    (3,2)        (3,3)        (3,4)    ···
(4,1)        (4,2)        (4,3)        (4,4)    ···
  ⋮            ⋮            ⋮            ⋮

(This particular bijection will be useful when we multiply infinite series in the
last chapter. Your task is to find a formula for it.)

1.4 Real Functions and Graphs


A real function is a function whose domain and codomain are subsets of R.
In this book, we deal solely with real functions, except for the very last section.

Commonly, mathematicians provide only the rule of association that
describes a real function and expect the reader or listener to work out the
domain. For example, one might write `Consider the function $f(x) = \sqrt{x}$'
without specifying the domain. In this case, the reader is expected to realise
that since square roots are defined only for non-negative real numbers, the
domain is [0, ∞). In general, if a function is given only by a rule f(x), the
domain is to be taken to consist of all real numbers x such that f(x) is
defined. As for the codomain, if it is not explicitly given, it is taken to be R.

Task 1.4.1. Identify the domains of the real functions given by the following
rules:

(a) $f(x) = 1 - x^2$,
(b) $g(x) = \frac{1}{x}$,
(c) $h(x) = (x - 1)(x - 2)$,
(d) $k(x) = \frac{1}{\sqrt{1 - x^2}}$.

Suppose f is a real function with domain X. Then the graph of f is the
subset of the xy-plane which consists of all ordered pairs of the form (x, f(x))
with x ∈ X.

[Figure: graph of a function f, with the point (x, f(x)) marked.]

Absolute Value Function


The absolute value |x| of a real number x defines a real function called the
absolute value or modulus function. Its domain is R and its graph is:

[Figure: graph of |x|, with −1 and 1 marked on the x-axis.]

Unit Step Function

The Heaviside or unit step function is given by

$$H(x) = \begin{cases} 0, & \text{if } x < 0, \\ 1, & \text{if } x \ge 0. \end{cases}$$

Its graph is:

[Figure: graph of H, with an unfilled circle at the origin and a filled circle
at (0, 1).]

We have utilized a useful convention in depicting this graph. The unfilled circle
at the origin indicates that the origin is not part of the graph. The filled circle
at (0, 1) emphasizes that the function value at x = 0 is 1.

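Sign Function

The sign or signum function is defined by

$$\operatorname{sgn}(x) = \begin{cases} -1, & \text{if } x < 0, \\ 0, & \text{if } x = 0, \\ 1, & \text{if } 0 < x. \end{cases}$$

[Figure: graph of sgn(x), with −1 and 1 marked.]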

Greatest Integer Function


Consider any real number x. By the Archimedean Property, the set A =
{ n ∈ Z | n > x } is non-empty. By the Well Ordering Principle, A has a least
element α. (Why?) Then we have α − 1 ≤ x < α. We denote α − 1 by [x]. The
function which associates [x] to x is called the greatest integer function.
Sometimes it is called the floor function and is denoted by ⌊x⌋.

[Figure: graph of the greatest integer function, with −2, −1, 1, 2 marked on
both axes.]

Task 1.4.2. The ceiling function, whose value at x is denoted by ⌈x⌉, is
defined by setting ⌈x⌉ to be the least integer which is greater than or equal to
x. Draw the graph of the ceiling function.

Task 1.4.3. Draw the graph of the `sawtooth' function r(x) = x − [x].

Now we will consider some simple ways of creating new functions by
modifying existing ones.

Vertical Shift and Scaling


Given a real function f and a real number c, we define a function called f + c
as follows:
(f + c)(x) = f(x) + c.
Adding a constant to a function shifts its graph vertically. For example, adding
2 will shift the graph up by 2 units and adding −2 will shift it down by 2 units.
The figure below shows the graphs of f ± c when c is positive.

[Figure: graphs of f − c, f and f + c, with the points f(x) − c, f(x) and
f(x) + c marked above x.]

Another modification is to use multiplication to create a function cf defined
by
(cf)(x) = c · f(x).
Multiplying a function by a constant scales its graph vertically. For example,
multiplying by 2 will scale the graph vertically by a factor of 2, while multiplying
by −2 will further reflect it in the x-axis. The figure below shows the graphs of
±cf when c is positive.

[Figure: graphs of −cf, f and cf, with the points −c · f(x), f(x) and c · f(x)
marked above x.]

Horizontal Shift and Scaling


We have just seen how the graph of a function changes when a constant is
added to or multiplies its output. Now we consider what happens when the
input of the function is changed in this manner. That is, given a function f
and a constant c, we consider the functions defined by g(x) = f(x + c) and
h(x) = f(cx).

Task 1.4.4. Let f be a real function with domain A and let c be a real number.
What are the domains of g(x) = f (x + c) and h(x) = f (cx)?
Consider the function g(x) = f (x + c) with c > 0. We have f (x) = f ((x −
c) + c) = g(x − c). That is, the value taken by f at x is taken by g at x − c.
Thus the graph of g is a horizontal shift to the left of the graph of f .

[Figure: graphs of g and f; the value f(x) = g(x − c) is attained by g at x − c
and by f at x.]

Task 1.4.5. Describe the graph of g(x) = f (x + c) when c < 0.


Now consider the function h(x) = f (cx) with c > 0. Reasoning as above
(try!) we conclude that the value taken by f at x is taken by h at x/c. In this
case the graph is scaled horizontally by a factor of 1/c.

[Figure: graphs of h and f; the value f(x) = h(x/c) is attained by h at x/c
and by f at x.]

Note that the graph will contract if c > 1 and will stretch if 0 < c < 1.
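The two relations f(x) = g(x − c) and f(x) = h(x/c) can be checked numerically. This is a small sketch under stated assumptions: the sample function f and the constant c = 2 are arbitrary illustrative choices, and `shift` and `scale` are hypothetical helper names.

```python
def shift(f, c):
    """Return g with g(x) = f(x + c)."""
    return lambda x: f(x + c)


def scale(f, c):
    """Return h with h(x) = f(c * x)."""
    return lambda x: f(c * x)


f = lambda x: x * x + 1    # hypothetical sample function
c = 2.0
g, h = shift(f, c), scale(f, c)

for x in [-1.0, 0.0, 0.5, 3.0]:
    assert f(x) == g(x - c)    # the value of f at x is taken by g at x - c
    assert f(x) == h(x / c)    # the value of f at x is taken by h at x / c
```

With c = 2 the graph of g is the graph of f moved 2 units to the left, and the graph of h is the graph of f contracted horizontally by a factor of 1/2.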
Task 1.4.6. Describe the graph of h(x) = f (cx) when c < 0.
Task 1.4.7. Recall that the graph of $f(x) = x^2$ is an upward opening parabola.
Use your understanding of shifts and scalings to plot the graphs of the
following on the same xy-plane:
(a) $g(x) = (x - 2)^2 + 1$, (b) $h(x) = 4x^2 + 12x + 5$.

Even and Odd Functions


A real function f: X →R is called an even function if
1. X is symmetric about 0: x ∈ X ⇐⇒ −x ∈ X , and

2. The graph of f is symmetric with respect to the y -axis: f (−x) = f (x) for
every x ∈ X.
An example of an even function:

[Figure: graph of an even function, with the common value f(x) marked at −x
and at x.]

A real function f: X →R is called an odd function if



1. X is symmetric about 0: x ∈ X ⇐⇒ −x ∈ X , and

2. The graph of f is symmetric with respect to the origin: f (−x) = −f (x)


for every x ∈ X.
An example of an odd function:

[Figure: graph of an odd function, with f(x) marked at x and f(−x) marked
at −x.]

Task 1.4.8. Determine whether the following functions are even, or odd, or
neither: |x|, sgn(x) and [x].
Task 1.4.9. Can a function be both even and odd?

Monotonic Functions
A real function f : A → R is called an increasing function if, for all points
x, y ∈ A, x ≤ y implies f (x) ≤ f (y). It is called strictly increasing if x < y
implies f (x) < f (y).

[Figure: two graphs, captioned `An increasing function' and `A strictly
increasing function'.]



Similarly, a real function f is called a decreasing function if x ≤ y implies


f (x) ≥ f (y). It is called strictly decreasing if x < y implies f (x) > f (y).

[Figure: two graphs, captioned `A decreasing function' and `A strictly
decreasing function'.]

A real function is called monotone or monotonic if it is either increasing or


decreasing.

Task 1.4.10. Identify which of these functions are monotonic. If a function is
monotonic, state whether it is (strictly) increasing or decreasing:

(a) $f(x) = \operatorname{sgn}(x)$,
(b) $g(x) = [x]$,
(c) $h(x) = x^2$,
(d) $k(x) = -x^3$.

Periodic Functions
We have been looking at functions which are special in having some regularity.
Even and odd functions have reflection symmetry. Monotonic functions have a
persistent trend of either growth or decay. Another kind of regularity is when a
function represents a cyclic phenomenon, one in which the same pattern repeats
indefinitely.

A real function f is called periodic if there is a positive number T such that
f(x + nT) = f(x) for every n ∈ Z and every x in the domain of f. One can
view this as symmetry under horizontal shifts by integer multiples of T. The
number T is called a period of f.
Task 1.4.11. Show that if T is a period of f then every nT , n ∈ N, is also a
period of f .
Example 1.4.12. Here are some examples of periodic functions:

(a) $f(x) = x - [x]$ has period 1.

(b) $g(x) = \begin{cases} 0, & [x] \text{ is even} \\ 1, & [x] \text{ is odd} \end{cases}$ has period 2.

The function f is called the `sawtooth wave' and g is the `square wave'.
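The periods claimed in Example 1.4.12 can be spot-checked at a few sample points. This is only a numerical sketch, not a proof: the sample points are arbitrary, and they are chosen as binary fractions so that floating-point arithmetic is exact. The names `sawtooth` and `square_wave` are hypothetical.

```python
import math


def sawtooth(x):
    """f(x) = x - [x]."""
    return x - math.floor(x)


def square_wave(x):
    """g(x) = 0 if [x] is even, 1 if [x] is odd."""
    return 0 if math.floor(x) % 2 == 0 else 1


samples = [0.25, 1.5, -2.75, 3.0]
for x in samples:
    for n in range(-3, 4):
        assert sawtooth(x + n * 1) == sawtooth(x)        # period T = 1
        assert square_wave(x + n * 2) == square_wave(x)  # period T = 2

# T = 1 is not a period of the square wave: shifting by 1 flips the value.
assert square_wave(0.5) != square_wave(0.5 + 1)
```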

Arithmetic of Functions
Let f, g be real functions. We use them to define new functions as follows:

$$(f + g)(x) = f(x) + g(x), \qquad (f - g)(x) = f(x) - g(x),$$

$$(fg)(x) = f(x)g(x), \qquad \left(\frac{f}{g}\right)(x) = \frac{f(x)}{g(x)}.$$
Task 1.4.13. Let f, g be real functions with domains A, B respectively. Describe
the domains of the following functions: f + g, f − g, f g, f /g.
Task 1.4.14. Under the given conditions, are the functions f + g, f − g, f g,
f /g even or odd?
(a) When f and g are both even.

(b) When f and g are both odd.


(c) When f is even and g is odd.
(d) When f is odd and g is even.

Inverse Functions
Suppose I, J are subsets of R and f : I → J is a bijection. Then we have the
inverse function $f^{-1} : J \to I$. We'll establish a relationship between the
graphs of f and $f^{-1}$:

(x, y) is in the graph of f ⇐⇒ y = f(x)
                            ⇐⇒ $x = f^{-1}(y)$
                            ⇐⇒ (y, x) is in the graph of $f^{-1}$

Now the point (y, x) can be obtained by reflecting (x, y) in the line y = x.
Therefore the graph of $f^{-1}$ can be obtained by reflecting the graph of f in
the line y = x.

[Figure: graphs of f and $f^{-1}$, mirror images of each other in the line y = x.]

Task 1.4.15. Sketch the graphs of f and $f^{-1}$ on the same coordinate plane:
(a) f : R → R, $f(x) = 2x + 1$.
(b) f : [0, ∞) → [0, ∞), $f(x) = x^2$.
(c) f : [0, 1] → [0, 1], $f(x) = 1 - x^2$.

Polynomials
A monomial is an expression of the form $x^n$ where n = 0, 1, 2, . . . . If we
let x vary over real numbers, a monomial gives a real function. Here are graphs
of some monomial functions:

[Figure: graphs of $x^0$, $x^1$, $x^2$, $x^3$ and $x^4$.]

Note the convention that $x^0 = 1$ for every real number, including $0^0 = 1$.


Task 1.4.16. Which monomials are even? Which are odd?
A polynomial is obtained by scaling and adding monomials. Thus a general
polynomial has the form

$$p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 = \sum_{i=0}^{n} a_i x^i,$$

where n = 0, 1, 2, . . . and each $a_i$ is a real number. If $a_n \neq 0$ we say
that p(x) has degree n, and we write deg p = n. If each $a_i = 0$ we call p the
zero polynomial and write p = 0. The degree of the zero polynomial is not defined.

Task 1.4.17. Let p, q be two non-zero polynomials. Show that deg(pq) =
(deg p) + (deg q) and deg(p + q) ≤ max{deg p, deg q}.

If we let x vary over real numbers, p(x) gives a real function whose domain
is R.
If a polynomial involves only a few monomials, we can combine the graphs
of these monomials to understand at least the main features of the graph of the
polynomial. For example, consider p(x) = x3 − x.

[Figure: the graphs of $x^3$ and $-x$ are added to give the graph of $x^3 - x$.]

If p(x) is a polynomial and c is a real number such that p(c) = 0 then c is called
a zero or root of p(x).
Let us recall the following from our school algebra:

Theorem 1.4.18 (Division Algorithm for Polynomials). Let p(x) and q(x) be
polynomials with $q \neq 0$. Then there are unique polynomials s(x) and r(x) such
that p(x) = s(x)q(x) + r(x) and either deg r < deg q or r = 0.

Proof. First we prove existence, by strong induction on the degree of p. If
deg p < deg q we can take s = 0 and r = p. Now let $p(x) = \sum_{i=0}^{n} a_i x^i$
and $q(x) = \sum_{i=0}^{m} b_i x^i$ with n ≥ m and $a_n, b_m \neq 0$. Define

$$r_0(x) = p(x) - \frac{a_n}{b_m} x^{n-m} q(x).$$

We have $\deg r_0 < \deg p$. By the strong induction hypothesis,
$r_0 = s_0 q + r$ with r = 0 or deg r < deg q. Then,

$$p(x) = r_0(x) + \frac{a_n}{b_m} x^{n-m} q(x) = s_0(x) q(x) + r(x) + \frac{a_n}{b_m} x^{n-m} q(x) = \left( s_0(x) + \frac{a_n}{b_m} x^{n-m} \right) q(x) + r(x).$$

As for uniqueness, let $p = sq + r = s'q + r'$ be two such decompositions. Then
$(s - s')q = r' - r$. The only way the degrees of the left and right side can
match is if $s = s'$ and $r = r'$. □
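The existence argument is effectively a procedure: keep subtracting $\frac{a_n}{b_m} x^{n-m} q(x)$ until the degree drops below deg q. The sketch below unrolls the induction into a loop. It is an illustration under stated assumptions, not the book's own code: polynomials are represented as coefficient lists `[a0, a1, ..., an]`, exact rational arithmetic from `fractions` is used, and the names `degree` and `divide` are hypothetical.

```python
from fractions import Fraction


def degree(p):
    """Degree of a coefficient list [a0, a1, ...]; None for the zero polynomial."""
    for i in range(len(p) - 1, -1, -1):
        if p[i] != 0:
            return i
    return None


def divide(p, q):
    """Return (s, r) with p = s*q + r and either r = 0 or deg r < deg q."""
    m = degree(q)
    if m is None:
        raise ValueError("q must not be the zero polynomial")
    r = [Fraction(a) for a in p]
    s = [Fraction(0)] * max(len(p) - m, 1)
    while degree(r) is not None and degree(r) >= m:
        n = degree(r)
        coeff = r[n] / Fraction(q[m])    # a_n / b_m
        s[n - m] += coeff
        # Replace r by r - (a_n / b_m) x^(n-m) q, which has smaller degree.
        for i in range(m + 1):
            r[i + n - m] -= coeff * q[i]
    return s, r


# Divide x^3 - x by x - 2.
s, r = divide([0, -1, 0, 1], [-2, 1])
print(s, r)
```

For this input the quotient is $x^2 + 2x + 3$ and the remainder is the constant 6, so that $x^3 - x = (x^2 + 2x + 3)(x - 2) + 6$.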
Theorem 1.4.19 (Factor Theorem). Let p(x) be a polynomial and let c be a
root of p(x). Then x − c divides p(x). That is, there is a polynomial q(x) such
that p(x) = (x − c)q(x).

Proof. Divide p(x) by x − c. There will be a quotient polynomial q(x) and
a constant remainder r. Thus p(x) = (x − c)q(x) + r. Substituting the value
x = c on each side gives 0 = p(c) = (c − c)q(c) + r = r. Hence r = 0 and
p(x) = (x − c)q(x). □

If c is a root of p(x), its multiplicity is the largest natural number k such
that $(x - c)^k$ divides p(x).

Theorem 1.4.20. Let p(x) be a polynomial of degree n ≥ 1. Then p(x) has at
most n roots, counting multiplicities.

Proof. Let $c_1, \dots, c_m$ be the distinct roots of p(x) and let
$k_1, \dots, k_m$ be their respective multiplicities. Then p(x) has the form
$(x - c_1)^{k_1} \cdots (x - c_m)^{k_m} q(x)$. It follows that
$k_1 + \cdots + k_m \le \deg p(x) = n$. □

Rational Functions
A rational function has the form $\frac{p(x)}{q(x)}$ where p(x) and q(x) are
polynomials and q(x) is not the zero polynomial.

The simplest interesting example is the function given by f(x) = 1/x. Its
domain is $\mathbb{R}^*$ and the graph is given on the right. As x increases on
the positive side, 1/x decreases, and the graph moves towards the x-axis. While,
as x decreases towards zero, the graph rises without bound.

[Figure: graph of 1/x for positive x, with 1/2, 1 and 2 marked on both axes.]

Once the graph is obtained for positive x, we obtain it for negative x by
symmetry since the function 1/x is an odd function.

Example 1.4.21. We shall sketch the graph of $f(x) = \frac{1}{(x-1)(x+1)}$. We
start by considering the graph of (x − 1)(x + 1):

[Figure: graph of (x − 1)(x + 1), crossing the x-axis at −1 and 1.]

The reciprocal function f(x) will take values of large magnitude where
(x − 1)(x + 1) takes values of small magnitude, and of the same sign. So, as we
approach 1 from the right, f(x) will take large positive values, while from the
left it will take large negative values. Thus we obtain the following graph:

[Figure: graph of f(x), with −1 and 1 marked on the x-axis.]

Exercises for §1.4

1. Identify the domains of the real functions given by the following rules:

(a) $f(x) = \sqrt{x^2 - 2}$,
(b) $g(x) = \frac{1}{[x]}$,
(c) $h(x) = \sqrt{x(x^2 - 1)}$,
(d) $k(x) = \frac{1}{x - [x]}$.

2. Draw the graphs of the following functions, not by plotting points but by
transforming the graphs of the standard functions like $x$, $x^2$ and $1/x$.

(a) $1 - x^2$
(b) $x^2 - 4x + 3$
(c) x−1
(d) $\frac{x}{x - 4}$

3. Graph the following functions:

(a) $f(x) = (x - 1)(x - 2)(x - 3)$,
(b) $g(x) = |x| + |x - 1|$,
(c) $h(x) = |x| - |x - 1|$,
(d) $k(x) = \frac{x}{x^2 - 1}$.

4. Let f : [0, 2] → R be defined as follows: f(x) = 1 if 0 ≤ x ≤ 1 and f(x) = 2
if 1 < x ≤ 2.

(a) Draw the graph of f.
(b) Describe the domain and draw the graph of g(x) = f(2x).
(c) Describe the domain and draw the graph of h(x) = f(x − 2).
(d) Describe the domain and draw the graph of k(x) = f(2x) + f(x − 2).

5. Graph the following modifications of the greatest integer function over the
domain [−2, 2]:

(a) $f(x) = 2[x]$,
(b) $g(x) = [2x]$,
(c) $h(x) = [x^2]$,
(d) $k(x) = (-1)^{[x]}$.

6. Extend the given graph to a suitable domain so that it represents an even


function:

(a) (b)

7. Extend the given graph to a suitable domain so that it represents an odd


function:

(a) (b)

8. Under the given conditions, is the function f ◦g even or odd?



(a) When f and g are both even. (c) When f is even and g is odd.

(b) When f and g are both odd. (d) When f is odd and g is even.

9. Prove or give a counter-example:

(a) If f, g are increasing, so is their sum f + g.


(b) If f, g are increasing, so is their product f g.
10. Consider a function f : R → R.
(a) Show f can be written as the difference of two functions, each of which
takes only non-negative values.

(b) Show f can be written as the sum of an even function and an odd
function.

11. Define $x^+ = \max\{0, x\}$. Draw the graphs of the following functions:

(a) $f(x) = x^+$,
(b) $g(x) = (-x)^+$,
(c) $h(x) = x^+ + (-x)^+$,
(d) $k(x) = x^+ - (-x)^+$,
(e) $\ell(x) = (1 - x)^+ + (x - 2)^+$,
(f) $m(x) = (x - 1)^+ - (x - 2)^+$.

12. For each n ∈ Z, define $f_n : \mathbb{R} \to \mathbb{R}$ by
$f_n(x) = (x - n)^+ - 2(x - n - 1/2)^+ + (x - n - 1)^+$. (See the previous
exercise for the $x^+$ notation.)

(a) Graph $f_n(x)$.
(b) Graph $F(x) = \sup\{ f_n(x) \mid n \in \mathbb{Z} \}$.
13. Let functions f, g have period T . Show that the following functions also
have period T : f + g , f − g , f g , f /g .
14. Find out whether the following functions have any of the properties of
being odd/even, increasing/decreasing, or periodic:

(a) $1/x$,
(b) $[x^2]$,
(c) $[x + 1/2] - [x]$,
(d) $[x + 1/2] - [x] - 1/2$.

15. Show that if a function has either of the following pairs of properties, it
must be constant:

(a) Monotonic and even. (b) Monotonic and periodic.

16. The graph of a function is given. Draw the graph of its inverse function.

(a) (b)

Each chapter of this book ends with one or two sets of thematic exercises.
These either develop applications of the material of that chapter, or illustrate
theoretical concerns that future courses would take up in detail. It is not
essential that you solve them at first sight, but you should at least browse them
and keep them in mind as you read on. Chances are you will suddenly recognize a
relevant idea and how to apply it here. Or, studying another course, you will
see how these exercises support the techniques you are learning there.

Curve Fitting: Interpolation and Least Squares


In the laboratory, we measure a function's values at finitely many points. From
these values, we attempt to establish the form of the function. One common
practice is to find a polynomial which matches the data and has as low degree as
possible. This is called an interpolating polynomial for the given data. We
shall show below that, given n + 1 data points, there is a unique interpolating
polynomial of degree n or less.

A1. Suppose that $x_0, x_1, \dots, x_n \in \mathbb{R}$ satisfy $x_i \neq x_j$
when $i \neq j$.

(a) Show there is a unique polynomial $w_i(x)$ of degree n such that
$w_i(x_i) = 1$ and $w_i(x_j) = 0$ when $j \neq i$. (Hint: Use the Factor Theorem
to find the formula of $w_i(x)$.)

(b) Let $y_0, y_1, \dots, y_n \in \mathbb{R}$. Show there is a unique polynomial
p(x) of degree n or less such that $p(x_i) = y_i$ for each i, and it is given by

$$p(x) = \sum_{i=0}^{n} y_i w_i(x).$$

A2. Find the unique linear or quadratic function that passes through the
points (−h, a), (0, b) and (h, c), with h > 0.
Actual data has errors and a perfect match to imperfect data has little
importance. It is more useful to find a function which is only an approximate
match but is easy to work with or allows some special insight. The most common
approach is to find a line which passes as close as possible to the data points.
This can be done by geometry!

Let $\mathbb{R}^n = \{ \vec{x} = (x_1, \dots, x_n) \mid x_i \in \mathbb{R} \}$.
Recall that the dot product on this space is defined by
$\vec{x} \bullet \vec{y} = \sum_{i=1}^{n} x_i y_i$. Vectors $\vec{x}, \vec{y}$
are perpendicular to each other if and only if $\vec{x} \bullet \vec{y} = 0$.
The length of a vector is given by
$\|\vec{x}\| = (\vec{x} \bullet \vec{x})^{1/2} = \left( \sum_{i=1}^{n} x_i^2 \right)^{1/2}$.
The distance between $\vec{x}, \vec{y}$ is $\|\vec{y} - \vec{x}\|$.

A3. Prove Pythagoras' Theorem: Vectors $\vec{x}, \vec{y}$ are perpendicular if
and only if $\|\vec{x} + \vec{y}\|^2 = \|\vec{x}\|^2 + \|\vec{y}\|^2$.

A4. Let $\vec{u}, \vec{v}$ be distinct non-zero points in $\mathbb{R}^n$, and
$\Pi = \{ a\vec{u} + b\vec{v} \mid a, b \in \mathbb{R} \}$ be the plane passing
through them and origin. Take any fixed vector $\vec{y}$. Show that Π has a
unique member $\vec{x}$ which is closest to $\vec{y}$ and that it is given by
the equations $(\vec{y} - \vec{x}) \bullet \vec{u} = (\vec{y} - \vec{x}) \bullet \vec{v} = 0$.

Consider $\vec{x} = (x_1, \dots, x_n)$ and $\vec{y} = (y_1, \dots, y_n)$. The
closeness of a line y = ax + b to the points $(x_i, y_i)$ is measured by the
`total squared error':

$$E(a, b) = \sum_{i=1}^{n} (y_i - a x_i - b)^2.$$

The goal is to find a, b such that E(a, b) is as small as possible. The
corresponding line is called the least squares line for the data.

A5. Show that the problem of minimizing the total squared error is equivalent
to considering the plane $\Pi = \{ a\vec{x} + b\vec{v} \mid a, b \in \mathbb{R} \}$
with $\vec{v} = (1, \dots, 1)$, and finding the member closest to $\vec{y}$.

A6. Show that the least squares line y = ax + b is given by

$$a = \frac{n \sum_{i=1}^{n} x_i y_i - \left( \sum_{i=1}^{n} x_i \right) \left( \sum_{i=1}^{n} y_i \right)}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}, \qquad b = \frac{\sum_{i=1}^{n} y_i - a \sum_{i=1}^{n} x_i}{n}.$$
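The formulas in A6 translate directly into a short computation. The sketch below simply evaluates them; it does not replace the derivation the exercise asks for. The function name `least_squares_line` and the sample data are hypothetical, and the guard against a zero denominator (all $x_i$ equal) is an added assumption about how to treat that degenerate case.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y = a*x + b, using the A6 formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    denominator = n * sum_xx - sum_x ** 2
    if denominator == 0:
        # Happens exactly when all x_i coincide; no unique line exists then.
        raise ValueError("the x-values must not all be equal")
    a = (n * sum_xy - sum_x * sum_y) / denominator
    b = (sum_y - a * sum_x) / n
    return a, b


# Data lying exactly on y = 2x + 1: the least squares line should recover it.
a, b = least_squares_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)
```

When the data lie exactly on a line, the total squared error of that line is zero, so the formulas must return it.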

Cardinality
We consider two sets as having the same amount (why didn't we say `number'?)
of elements if there is a bijection between them. In this case we say the two sets
have the same cardinality. A set is called finite if it is empty or it has the
same cardinality as {1, . . . , n} for some n ∈ N. Otherwise it is called infinite.
The most familiar infinite set is N. One can ask if all infinite sets have the
same cardinality. The first surprise is that it is not so easy to go beyond N. In
Exercises 7 and 8 of §1.3 you were asked to show that W, Z and even N × N
have the same cardinality as N. A set which is either finite or has the same
cardinality as N is called countable.

B1. Prove that the following function f : N × N → N is a bijection:

$$f(m, n) = 2^{m-1}(2n - 1).$$
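Before attempting the proof in B1, it can help to experiment. The following sketch checks on a finite range that every natural number is hit exactly once; this is evidence, not a proof. The names `pair` and `unpair` are hypothetical, and `unpair` relies on splitting k into a power of 2 times an odd number.

```python
def pair(m, n):
    """f(m, n) = 2^(m-1) * (2n - 1) for natural numbers m, n >= 1."""
    return 2 ** (m - 1) * (2 * n - 1)


def unpair(k):
    """Recover (m, n) from k >= 1 by stripping factors of 2 from k."""
    m = 1
    while k % 2 == 0:
        k //= 2
        m += 1
    # What remains is odd, so it equals 2n - 1 for exactly one n.
    return m, (k + 1) // 2


# Every k in a finite range has a preimage, and unpair undoes pair.
assert all(pair(*unpair(k)) == k for k in range(1, 1000))
assert all(unpair(pair(m, n)) == (m, n)
           for m in range(1, 8) for n in range(1, 50))
print(unpair(12))
```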

Since Q is made of pairs of integers we expect it to be countable. Finding
an explicit bijection with N is a little daunting because a rational number
can be written as p/q in many ways and we have to account for this. We
present below an elegant bijection which appears to have been first discovered
by an undergraduate student in 1960! (McCrimmon [44]) It is based on the
Fundamental Theorem of Arithmetic: Every natural number greater than
one can be written uniquely as a product of prime powers. That is, if n ∈ N
and n > 1, then there is a unique choice of primes $p_1 < \cdots < p_k$ and
natural numbers $\alpha_1, \dots, \alpha_k$ such that
$n = p_1^{\alpha_1} \cdots p_k^{\alpha_k}$.

B2. Let f : W → Z be a bijection such that f(0) = 0. Define
$\varphi : \mathbb{N} \to \mathbb{Q}^+$ by

$$\varphi(1) = 1 \quad \text{and} \quad \varphi(p_1^{\alpha_1} \cdots p_k^{\alpha_k}) = p_1^{f(\alpha_1)} \cdots p_k^{f(\alpha_k)}$$

for any distinct primes $p_i$ and natural numbers $\alpha_i$. Show that φ is a
bijection.

B3. Give a bijection between N and Q.


So Q turns out to be countable. What about R? Cantor gave several proofs
that R is uncountable (i.e., infinite and not in bijection with N). We present
a version of his very first proof, as it is more in line with our general approach.
First, we have a simple application of the Completeness Axiom:

B4. (Nested Interval Property of R) Let $J_n = [a_n, b_n]$ be closed intervals
such that

$$[a_1, b_1] \supseteq [a_2, b_2] \supseteq \cdots \supseteq [a_n, b_n] \supseteq \cdots$$

Show that $\bigcap_{n=1}^{\infty} [a_n, b_n] \neq \emptyset$.

Now we'll show there can't be a surjection from N to R, and hence there
certainly can't be a bijection either. Therefore R is uncountable.

B5. Let f : N → R be a surjection. Construct closed intervals
$J_n = [a_n, b_n]$ such that $J_1 \supset J_2 \supset J_3 \cdots$ and
$f(n) \notin J_n$. Show that this gives a contradiction.
(Hint: $J_1$ is easily chosen so that $f(1) \notin J_1$. Cut it into three closed
subintervals of equal length. At least one of these does not include f(2).)

B6. Is the set of irrational numbers countable or uncountable?



2 | Integration

Calculus has two parts: differential and integral. Integral calculus owes its
origins to fundamental problems of measurement in geometry: length, area and
volume. It is by far the older branch. Nevertheless, it depends on differential
calculus for its more difficult calculations, and so nowadays we typically teach
differentiation before integration.

We shall revert to the historical sequence and begin our journey with
integration. Our first reason is that it provides a direct application of the
Completeness Axiom without needing the concept of limits. The second is that
important functions such as the trigonometric, exponential and logarithmic
functions are most conveniently constructed through integration. Finally, the
student should become aware that integration is not just an application of
differentiation or a set of calculational techniques.

Suppose we wish to find the area of a shape in the Cartesian plane. We
can at least estimate it by comparing the shape with a standard area, that of
a square. We cover the shape with a grid of unit squares and count how many
squares touch it, and also how many squares are completely contained in it.
This gives an upper and a lower estimate for the area. We can obtain better
estimates by taking finer grids with smaller squares. The figures below illustrate
this process of iteratively improving the estimates.

[Figure: the same shape covered by three successively finer grids on the square
with both axes running from 0 to 5. The resulting estimates are
3 ≤ Area ≤ 17, then 3.75 ≤ Area ≤ 13, then 6.48 ≤ Area ≤ 9.64.]
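The counting process can be tried out on a shape whose area is known. The sketch below is an illustration under stated assumptions: the shape is a disk of radius 2 centred at (2.5, 2.5), which is not the shape in the book's figure, and the function name `grid_estimates` and its parameters are hypothetical. A square counts towards the lower estimate if all of it lies in the disk, and towards the upper estimate if any point of it lies in the disk.

```python
import math


def grid_estimates(cx, cy, radius, size, per_unit):
    """Lower and upper area estimates for a disk, using grid squares of side
    1/per_unit covering the square [0, size] x [0, size]."""
    h = 1.0 / per_unit
    contained = touching = 0
    for i in range(size * per_unit):
        for j in range(size * per_unit):
            x0, y0 = i * h, j * h
            x1, y1 = x0 + h, y0 + h
            # Distance from the centre to the farthest corner of the square.
            far = max(math.hypot(x - cx, y - cy)
                      for x in (x0, x1) for y in (y0, y1))
            # Distance from the centre to the nearest point of the square.
            nearest_x = min(max(cx, x0), x1)
            nearest_y = min(max(cy, y0), y1)
            near = math.hypot(nearest_x - cx, nearest_y - cy)
            if far <= radius:
                contained += 1
            if near <= radius:
                touching += 1
    return contained * h * h, touching * h * h


true_area = math.pi * 2.0 ** 2
for per_unit in (1, 2, 4):
    lower, upper = grid_estimates(2.5, 2.5, 2.0, 5, per_unit)
    assert lower <= true_area <= upper
    print(per_unit, lower, upper)
```

Halving the side of the squares can only raise the lower estimate and lower the upper estimate, which is the iterative improvement described above.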

We have said that we are estimating area. But what is our definition of area? In
school books you will find descriptions such as `Area is the measure of the part
of plane or region enclosed by the figure.' It should be evident that this is not
a very useful prescription. It means nothing without a description of the
measuring process. In fact, the estimation process described above could become
the basis for a meaningful definition of area, by requiring it to be a number
that lies above all the lower estimates produced by the process, and below all
the upper estimates. Its existence would be guaranteed by the Completeness
Axiom. This is a promising start, but the skeptic can raise various objections
that would have to be answered:

1. Could there be a figure for which multiple numbers satisfy the definition
of its area?

2. If we slightly shifted or rotated the grids, could that change our
calculation? That is, could moving a figure change its area?

It takes some effort to answer these objections. Indeed, we have to concede the
first point, for there are such subsets of the plane. We shall content ourselves
with working out a method for assigning area only to certain regions, those
that are bounded by graphs of functions. In this special situation we shall use
rows of rectangles rather than grids of squares. The word `integration' will be
used for this process.

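[Figure: a region under a graph approximated by rows of rectangles, once from
inside (`Underestimate') and once from outside (`Overestimate').]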

In what follows, it will be important that we use our pre-existing notions about
area only to motivate or guess results, and never to justify them. For area and
its properties will formally exist only at the end of our analysis.

The idea of approximating by rectangles arises very naturally in the context


of motion. Imagine a body moving in a straight line with a constant speed s
over a time interval [a, b]. Then the distance traveled equals s(b − a), which
is the area of a rectangle with base b−a and height s. Now, suppose the
speed is a function s(t) of time. Let us cut [a, b] into some subintervals and
approximate the speed by a constant speed over each subinterval. Then the
distance traveled is approximated by the distance traveled under this sequence
of constant speeds, and the latter is the sum of the areas of rectangles!

For our final piece of motivation, suppose we know velocity rather than
speed, and wish to recover the total displacement. We again cut [a, b] into some
subintervals, of widths $\Delta t_i$, and approximate by a constant velocity
$v_i$ in each subinterval. Then the total displacement is approximated by
$\sum_i v_i \Delta t_i$. Some of the $v_i$ could be negative and then the product
$v_i \Delta t_i$ would be negative, and thus not the area of a rectangle. We
shall call it the `signed area' instead.

Our development of integration in this chapter will be along the lines of
the previous paragraph. We shall use integration to define `signed area' rather
than `area', as it is the more general concept. As applications, we will rigorously
develop the logarithmic and exponential functions, prove the existence of nth
roots, define real powers, and define and estimate π.

2.1 Integration of Step and Bounded Functions


Integration of Step Functions
Consider a closed and bounded interval [a, b] in R. A partition of [a, b] is a set
$P = \{x_0, x_1, \dots, x_{n-1}, x_n\}$ such that

$$a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b.$$

Such a partition cuts [a, b] into n subintervals

$$[a, x_1], [x_1, x_2], \dots, [x_{n-2}, x_{n-1}], [x_{n-1}, b].$$

[Figure: the points $a = x_0, x_1, x_2, \dots, x_{n-1}, x_n = b$ marked on the
real line.]

As illustrated in the diagram, the subintervals need not have equal lengths.

A function s : [a, b] → R is called a step function if there is a partition
$P = \{x_0, \dots, x_n\}$ of [a, b] such that s is constant on each open
subinterval $(x_{i-1}, x_i)$:

$$s(x) = s_i \quad \text{if } x_{i-1} < x < x_i.$$

In this case we say the partition P is adapted to the step function s.

[Figure: An example of a step function.]

Task 2.1.1. We have already encountered some step functions. How many can
you recall?
Suppose s : [a, b] → R is a step function with adapted partition
$P = \{x_0, \dots, x_n\}$, such that $s(x) = s_i$ if $x_{i-1} < x < x_i$. Then
the integral of s from a to b is defined by

$$\int_a^b s(x)\,dx = \sum_{i=1}^{n} s_i (x_i - x_{i-1}).$$

This integral represents the total signed area of the rectangles enclosed by
the graph of s(x), the x-axis, and the vertical lines x = a and x = b.

[Figure: a step function with values $s_1, s_2, \dots, s_{n-1}, s_n$ on the
subintervals of $a = x_0, x_1, \dots, x_{n-1}, x_n = b$, and the rectangles
between its graph and the x-axis.]

The term `signed area' refers to the area of a rectangle marked by the step
function being taken as positive if the rectangle lies above the x-axis and as
negative if it lies below the x-axis.

Note that this definition ignores the step function's values at the partition
points. This is acceptable because these values create finitely many
line segments, and we view a line segment as having zero area regardless
of its length.
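The defining sum is easy to evaluate mechanically. The following is a minimal sketch, assuming the step function is described by its adapted partition `[x0, ..., xn]` and the list of values `[s1, ..., sn]` on the open subintervals; the function name `step_integral` and the sample data are hypothetical.

```python
def step_integral(partition, values):
    """Integral of a step function: the sum of s_i * (x_i - x_{i-1})."""
    if len(values) != len(partition) - 1:
        raise ValueError("need exactly one value per subinterval")
    return sum(s * (right - left)
               for s, left, right in zip(values, partition, partition[1:]))


# s = 2 on (0, 1), s = -1 on (1, 3), s = 0.5 on (3, 4).
total = step_integral([0, 1, 3, 4], [2, -1, 0.5])
print(total)    # 2*1 + (-1)*2 + 0.5*1 = 0.5
```

The rectangle below the x-axis contributes −2, which is why the result is smaller than the total unsigned area 4.5. The values at the partition points never enter the computation, in line with the remark above.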

Task 2.1.2. Consider the step function s(x) defined by

$$s(x) = \begin{cases} 2, & 0 \le x \le 1.5, \\ -1, & 1.5 < x \le 2.5, \\ 3, & 2.5 < x \le 3. \end{cases}$$

Calculate $\int_0^3 s(x)\,dx$.

Since the definition of the integral involved the choice of an adapted
partition we need to show that the resulting number is independent of the choice
of partition. Let $P, P'$ be partitions of [a, b]. We say $P'$ is a refinement
of P if $P \subseteq P'$.

Task 2.1.3. Suppose a partition P is adapted to a step function s. Show that
every refinement of P is also adapted to s.

Note that if $P, P'$ are partitions of [a, b] then $P \cup P'$ is a common
refinement for both of them.

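[Figure: three copies of the real line showing the points of each partition.]

$P$: $a = x_0 < x_1 < x_2 < x_3 < x_4 = b$

$P'$: $a = y_0 < y_1 < y_2 < y_3 = b$

$P \cup P'$: $a = x_0 = y_0 < y_1 < x_1 < x_2 < y_2 < x_3 < x_4 = y_3 = b$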

Theorem 2.1.4. Suppose s : [a, b] → R is a step function and P, Q are partitions of [a, b] which are adapted to s. Then both P and Q lead to the same value of ∫_a^b s(x) dx.
Proof. Let I(P) be the value of the integral of s corresponding to the partition P. We need to prove that I(P) = I(Q). It suffices to prove this when one partition is a refinement of the other. For, if we have proved this, we'll have I(P) = I(P ∪ Q) = I(Q).
Next, it suffices to prove I(P) = I(Q) when Q has just one point more than P. Let P = {x_0, ..., x_n} and Q = {x_0, ..., x_{k−1}, t, x_k, ..., x_n}. For each i, let s take the value s_i on (x_{i−1}, x_i). Then
\begin{align*}
I(Q) &= \sum_{i=1}^{k-1} s_i (x_i - x_{i-1}) + s_k (t - x_{k-1}) + s_k (x_k - t) + \sum_{i=k+1}^{n} s_i (x_i - x_{i-1}) \\
&= \sum_{i=1}^{k-1} s_i (x_i - x_{i-1}) + s_k (x_k - x_{k-1}) + \sum_{i=k+1}^{n} s_i (x_i - x_{i-1}) \\
&= \sum_{i=1}^{n} s_i (x_i - x_{i-1}) = I(P). \qquad \Box
\end{align*}
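
The key step of the proof, adding one point to an adapted partition, can be checked on a concrete case. The sketch below is only an illustration under our own conventions: the helper names `step_integral` and `insert_point` are hypothetical, and exact fractions are used so that the two values can be compared with `==`.

```python
from bisect import bisect_left
from fractions import Fraction


def step_integral(partition, values):
    """Sum of values[i] * (partition[i + 1] - partition[i])."""
    return sum(v * (r - l) for v, l, r in zip(values, partition, partition[1:]))


def insert_point(partition, values, t):
    """Refine the partition by one point t lying strictly inside a subinterval.

    The value of the step function on that subinterval is repeated on the
    two pieces, exactly as s_k is used twice in the proof.
    """
    k = bisect_left(partition, t)  # partition[k - 1] < t <= partition[k]
    if k == 0 or k == len(partition) or partition[k] == t:
        raise ValueError("t must lie strictly inside an open subinterval")
    new_partition = partition[:k] + [t] + partition[k:]
    new_values = values[:k] + [values[k - 1]] + values[k:]
    return new_partition, new_values


P = [Fraction(0), Fraction(1), Fraction(3)]
s = [2, 5]                                   # s = 2 on (0, 1), 5 on (1, 3)
Q, s_on_Q = insert_point(P, s, Fraction(3, 2))

print(step_integral(P, s), step_integral(Q, s_on_Q))  # both equal 12
```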

Task 2.1.5. Suppose s, t : [a, b] → R are step functions. Show there is a parti-
tion which is adapted to both s and t.
Theorem 2.1.6 (Comparison Theorem). Let s, t : [a, b] → R be step functions such that s(x) ≤ t(x) for every x ∈ [a, b]. Then
\[ \int_a^b s(x)\,dx \le \int_a^b t(x)\,dx. \]

Proof. Let P = {x_0, ..., x_n} be a partition which is adapted to both s and t. Let s(x) = s_i and t(x) = t_i for each x ∈ (x_{i−1}, x_i). Then s_i ≤ t_i for each i, and therefore
\[ \int_a^b s(x)\,dx = \sum_{i=1}^{n} s_i (x_i - x_{i-1}) \le \sum_{i=1}^{n} t_i (x_i - x_{i-1}) = \int_a^b t(x)\,dx. \qquad \Box \]

Integration of Bounded Functions


A function f : [a, b] → R is bounded if there is a real number M such that −M ≤ f(x) ≤ M for every x ∈ [a, b]. We then say that f is bounded by M.
Let f : [a, b] → R be bounded by M. Consider a step function s : [a, b] → R such that s(x) ≤ f(x) for every x ∈ [a, b].

We view the integral of s over [a, b] as being an underestimate of the `signed area' under the graph of f. We can improve the estimate by considering another step function which lies between s and f. Therefore we consider the collection of all such underestimates:
\[ L_f = \left\{ \int_a^b s(x)\,dx \;\middle|\; s : [a, b] \to \mathbb{R} \text{ is a step function and } s(x) \le f(x) \text{ for every } x \in [a, b] \right\}. \]

The set L_f is non-empty because the constant function −M is a step function whose values never exceed those of f.
We can similarly obtain a non-empty collection of overestimates of the `signed area' under the graph of f:
\[ U_f = \left\{ \int_a^b t(x)\,dx \;\middle|\; t : [a, b] \to \mathbb{R} \text{ is a step function and } t(x) \ge f(x) \text{ for every } x \in [a, b] \right\}. \]

The members of L_f are called lower sums for f, while the members of U_f are called upper sums.

Theorem 2.1.7. Let f : [a, b] → R be a bounded function. Then ℓ ∈ L_f and u ∈ U_f implies ℓ ≤ u.
Proof. Let s, t : [a, b] → R be step functions such that

(a) s(x) ≤ f(x) for every x ∈ [a, b] and ∫_a^b s(x) dx = ℓ,

(b) t(x) ≥ f(x) for every x ∈ [a, b] and ∫_a^b t(x) dx = u.

Then s(x) ≤ f(x) ≤ t(x) for every x ∈ [a, b]. Hence, by the Comparison Theorem,
\[ \ell = \int_a^b s(x)\,dx \le \int_a^b t(x)\,dx = u. \qquad \Box \]

The Completeness Axiom informs us that for a bounded function f there will be a number I such that ℓ ≤ I ≤ u for every ℓ ∈ L_f and u ∈ U_f. This I is our natural candidate for the value of the signed area under the graph of f. However, we have to face the possibility of there being more than one such I. Here is an example:

Example 2.1.8. Consider the Dirichlet function D : [0, 1] → R which is defined by
\[ D(x) = \begin{cases} 1, & x \in \mathbb{Q} \cap [0, 1] \\ 0, & x \in \mathbb{Q}^c \cap [0, 1] \end{cases} \]

Let s : [0, 1] → R be a step function such that s(x) ≤ D(x) for every x ∈ [0, 1]. Each open subinterval of [0, 1] contains an irrational number, hence s(x) ≤ 0 on each open subinterval of an adapted partition, and so ∫_0^1 s(x) dx ≤ 0. Similarly, since each open subinterval also contains a rational number, we see that ∫_0^1 t(x) dx ≥ 1 if t : [0, 1] → R is a step function such that t(x) ≥ D(x) for every x ∈ [0, 1]. Therefore every number α between 0 and 1 has the property that ℓ ≤ α ≤ u for every ℓ ∈ L_D and u ∈ U_D.

For functions like the Dirichlet function we have to admit that our approach fails to successfully assign a `signed area'. For the functions where our approach does work we have a special term: A bounded function f : [a, b] → R is called integrable if there is a unique number I such that ℓ ≤ I ≤ u for every ℓ ∈ L_f and u ∈ U_f. This unique I is called the (definite) integral of f on [a, b] and is denoted by ∫_a^b f(x) dx.

Example 2.1.9. Consider the function f(x) = x on [0, 1]. Consider a partition P : x_0 < ··· < x_n which cuts [0, 1] into n subintervals of equal length. That is, x_i = i/n. Corresponding to this partition we define two step functions, s_n and t_n, as follows:

s_n(x) = x_{i−1} and t_n(x) = x_i if x_{i−1} ≤ x < x_i,

and s_n(1) = t_n(1) = 1. Then
\begin{align*}
\int_0^1 s_n(x)\,dx &= \sum_{i=1}^{n} x_{i-1} \frac{1}{n} = \frac{1}{n} \sum_{i=1}^{n} x_{i-1} = \frac{1}{n} \sum_{i=1}^{n} \frac{i-1}{n} = \frac{1}{n^2} \sum_{i=1}^{n-1} i = \frac{n-1}{2n} = \frac{1}{2} - \frac{1}{2n}, \\
\int_0^1 t_n(x)\,dx &= \sum_{i=1}^{n} x_i \frac{1}{n} = \cdots = \frac{1}{2} + \frac{1}{2n}.
\end{align*}

The number 1/2 is the only number that fits between all these lower and upper sums, and hence it is the integral of f over [0, 1].
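
The closed forms for these lower and upper sums can be confirmed with exact arithmetic. The following Python sketch is an illustration only; the function name `lower_upper_sums` is our own choice, not the text's.

```python
from fractions import Fraction


def lower_upper_sums(n):
    """Lower and upper sums of f(x) = x on [0, 1] for n equal subintervals."""
    width = Fraction(1, n)
    points = [Fraction(i, n) for i in range(n + 1)]   # x_i = i/n
    lower = sum(points[i - 1] * width for i in range(1, n + 1))  # uses s_n
    upper = sum(points[i] * width for i in range(1, n + 1))      # uses t_n
    return lower, upper


for n in (1, 10, 100):
    lo, up = lower_upper_sums(n)
    # Compare with the formulas 1/2 - 1/(2n) and 1/2 + 1/(2n).
    assert lo == Fraction(1, 2) - Fraction(1, 2 * n)
    assert up == Fraction(1, 2) + Fraction(1, 2 * n)
    print(n, lo, up)
```

As n grows, both sums close in on 1/2 from either side, which is the squeezing used in the example.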
Task 2.1.10. Why was it enough to consider only some special partitions and
step functions in the above example?
Example 2.1.11. Consider the function f : [−1, 1] → R defined by f(0) = 1 and f(x) = 0 if x ≠ 0. This is a step function, and its integral as a step function is zero. But we can also treat it as a bounded function and ask whether it is integrable.

[Figure: graph of f on [−1, 1].]

Since f is itself a step function we see that 0 · (0 − (−1)) + 0 · (1 − 0) = 0 is both a lower sum as well as an upper sum for f. So any number I that lies between the members of L_f and U_f must satisfy 0 ≤ I ≤ 0. Therefore, I = 0 is the only such number.
Task 2.1.12. Show that every step function is integrable, and the two ways of
calculating its integral give the same number.

The following formulation is often useful in establishing the integrability of a function.

Theorem 2.1.13 (Riemann Condition). Let f : [a, b] → R be a bounded function. Then f is integrable on [a, b] if and only if for each ε > 0 there are step functions s, t : [a, b] → R such that
1. s(x) ≤ f(x) ≤ t(x) for every x ∈ [a, b],
2. ∫_a^b t(x) dx − ∫_a^b s(x) dx ≤ ε.

Proof. First, suppose f is not integrable. We have to find an ε > 0 for which there are no step functions s, t satisfying the given conditions. Now since f is not integrable, there are two numbers I_1, I_2 such that
\[ \int_a^b s(x)\,dx \le I_1 < I_2 \le \int_a^b t(x)\,dx, \]
whenever s, t satisfy the two inequalities in the first condition. Consider ε = (I_2 − I_1)/2. Then any s, t satisfying the first condition will not satisfy the second condition.

Now, suppose f is integrable with integral I. Given any ε > 0, consider the interval (I − ε/2, I + ε/2). If this does not contain any lower sum of f, then I − ε/2 will also meet the conditions for the integral of f. So this interval must contain a lower sum ∫_a^b s(x) dx of f. Similarly, it must contain an upper sum ∫_a^b t(x) dx of f. And then s, t fulfill all requirements. □

There is a similar formulation for establishing that a certain number I is the integral of f over [a, b].
Theorem 2.1.14. Let f : [a, b] → R be a bounded function. Then I ∈ R is the integral of f over [a, b] if and only if for each ε > 0 there are step functions s, t : [a, b] → R such that
1. s(x) ≤ f(x) for every x ∈ [a, b] and I − ∫_a^b s(x) dx < ε,
2. t(x) ≥ f(x) for every x ∈ [a, b] and ∫_a^b t(x) dx − I < ε.

Proof. Exercise. □

We finish this section with two conventions:

1. ∫_a^a f(x) dx = 0,
2. If a < b then ∫_b^a f(x) dx = −∫_a^b f(x) dx.

The first is consistent with line segments having zero area. The second takes into account the direction of travel.

Exercises for §2.1

1. Let H be the unit step function defined on page 23. Compute the following integrals of step functions:
(a) ∫_{−1}^{2} [H(x) + 2H(x − 1)] dx, (b) ∫_0^2 [H(1 − x) − H(x − 1)] dx.

2. Graph the following step functions on the given domain and compute their
integrals:


(a) [2x − 1] on [0, 2], (b) [ x] on [0, 9].

3. Let s : [−a, a] → R be a step function. Prove the following:

(a) If s is even then ∫_{−a}^{a} s(t) dt = 2 ∫_0^a s(t) dt.
(b) If s is odd then ∫_{−a}^{a} s(t) dt = 0.

4. Let f(t) = 1 for t ∈ [0, 1] and f(t) = 2 for t ∈ (1, 2]. Compute and graph the following function for x ∈ [0, 2]:
\[ F(x) = \int_0^x f(t)\,dt. \]

5. Calculate the following integrals:

(a) ∫_0^b x dx,
(b) ∫_0^b x² dx,
(c) ∫_0^b √x dx. (Hint: Relate the upper (lower) sums of x² to the lower (upper) sums of √x.)
6. Prove the following:

(a) \sum_{k=1}^{n} k^3 = \left( \frac{n(n+1)}{2} \right)^2, (b) ∫_0^b x³ dx = b⁴/4.

7. Suppose f is integrable on [0, a] with a > 0.

(a) If f is even then ∫_{−a}^{a} f(x) dx = 2 ∫_0^a f(x) dx.
(b) If f is odd then ∫_{−a}^{a} f(x) dx = 0.

8. Let f : [a, b] → R and let I ∈ R. Show that I = ∫_a^b f(x) dx if and only if the following two conditions hold:

(a) Every u > I is an upper sum for f, and

(b) Every ℓ < I is a lower sum for f.

9. Show that
\[ F(x) = \begin{cases} 1, & \text{if } x = \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \ldots \\ 0, & \text{else} \end{cases} \]
is integrable on [0, 1].

10. Let f : [a, b] → R be a bounded function which is integrable on every interval [c, b] with a < c < b. Show that f is integrable on [a, b].

2.2 Properties of Integration


We now take up various general properties of integration. The common idea is to first establish results for step functions, and then use those to get results for general integrable functions. Our first instance is to generalise the Comparison Theorem, proved earlier for step functions, to arbitrary integrable functions.

Theorem 2.2.1 (Comparison Theorem). Suppose f, g : [a, b] → R are integrable functions such that f(x) ≤ g(x) for every x ∈ [a, b]. Then
\[ \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx. \]

Proof. Suppose that ∫_a^b f(x) dx > ∫_a^b g(x) dx.

Let s, t : [a, b] → R be step functions such that s(x) ≤ f(x) ≤ t(x) for every x ∈ [a, b]. The inequality between f and g gives us s(x) ≤ g(x) for every x ∈ [a, b], and hence ∫_a^b s(x) dx ≤ ∫_a^b g(x) dx. We also have ∫_a^b g(x) dx < ∫_a^b f(x) dx ≤ ∫_a^b t(x) dx. Thus the integral of g satisfies the defining properties of the integral of f and so must be equal to it. But this contradicts the assumed inequality. □

Theorem 2.2.2 (Homogeneity). Let f : [a, b] → R be an integrable function and c ∈ R. Then cf is integrable and
\[ \int_a^b c f(x)\,dx = c \int_a^b f(x)\,dx. \]

Proof. First, we prove the result for a step function s. Let P = {x_0, ..., x_n} be a partition of [a, b], such that s(x) = s_i if x_{i−1} < x < x_i. Then c · s(x) = c · s_i if x_{i−1} < x < x_i. Hence
\[ \int_a^b c s(x)\,dx = \sum_{i=1}^{n} (c s_i)(x_i - x_{i-1}) = c \sum_{i=1}^{n} s_i (x_i - x_{i-1}) = c \int_a^b s(x)\,dx. \]

Second, consider an arbitrary integrable function f and c > 0. Let ε > 0. Then there are step functions s, t such that s(x) ≤ f(x) ≤ t(x) for every x ∈ [a, b] and
\[ \int_a^b f(x)\,dx - \int_a^b s(x)\,dx < \frac{\varepsilon}{c} \quad \text{and} \quad \int_a^b t(x)\,dx - \int_a^b f(x)\,dx < \frac{\varepsilon}{c}. \]

It follows that cs, ct are step functions such that cs(x) ≤ cf(x) ≤ ct(x) for every x ∈ [a, b] and
\[ c \int_a^b f(x)\,dx - \int_a^b c s(x)\,dx < \varepsilon \quad \text{and} \quad \int_a^b c t(x)\,dx - c \int_a^b f(x)\,dx < \varepsilon. \]
By Theorem 2.1.14, cf is integrable and ∫_a^b cf(x) dx = c ∫_a^b f(x) dx.
The c < 0 case can be done in a similar fashion. The c = 0 case is trivial. □

Task 2.2.3. Consider the step functions s(x) and t(x) defined by
\[ s(x) = \begin{cases} -1, & 0 \le x < 1 \\ 1, & 1 \le x \le 3 \end{cases}, \qquad t(x) = \begin{cases} 0, & 0 \le x \le 1.5 \\ 2, & 1.5 < x \le 2.5 \\ 3, & 2.5 < x \le 3 \end{cases}. \]

1. Calculate s(x) + t(x).

2. Verify ∫_0^3 (s(x) + t(x)) dx = ∫_0^3 s(x) dx + ∫_0^3 t(x) dx.

Theorem 2.2.4 (Additivity). Let f, g : [a, b] → R be integrable functions. Then f + g is an integrable function and
\[ \int_a^b \big( f(x) + g(x) \big)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx. \]

Proof. We start by proving this property for step functions. Suppose s, t are step functions on [a, b]. If a partition P_s is adapted to s and a partition P_t is adapted to t, then P = P_s ∪ P_t is adapted to all three of s, t and s + t. In particular, s + t is a step function. Let P = {x_0, ..., x_n}, and suppose

s(x) = s_i if x_{i−1} < x < x_i,
t(x) = t_i if x_{i−1} < x < x_i.

Then s(x) + t(x) = s_i + t_i if x_{i−1} < x < x_i. Hence,
\begin{align*}
\int_a^b \big( s(x) + t(x) \big)\,dx &= \sum_{i=1}^{n} (s_i + t_i)(x_i - x_{i-1}) \\
&= \sum_{i=1}^{n} s_i (x_i - x_{i-1}) + \sum_{i=1}^{n} t_i (x_i - x_{i-1}) \\
&= \int_a^b s(x)\,dx + \int_a^b t(x)\,dx.
\end{align*}

Now consider any integrable functions f, g : [a, b] → R. Let ε > 0. We have step functions s_f, s_g, t_f, t_g such that the following hold:
\begin{gather*}
s_f(x) \le f(x) \le t_f(x) \quad \text{and} \quad s_g(x) \le g(x) \le t_g(x), \\
\int_a^b f(x)\,dx - \int_a^b s_f(x)\,dx < \frac{\varepsilon}{2} \quad \text{and} \quad \int_a^b g(x)\,dx - \int_a^b s_g(x)\,dx < \frac{\varepsilon}{2}, \\
\int_a^b t_f(x)\,dx - \int_a^b f(x)\,dx < \frac{\varepsilon}{2} \quad \text{and} \quad \int_a^b t_g(x)\,dx - \int_a^b g(x)\,dx < \frac{\varepsilon}{2}.
\end{gather*}

Then s_f + s_g and t_f + t_g are step functions such that:
\begin{gather*}
s_f(x) + s_g(x) \le f(x) + g(x) \le t_f(x) + t_g(x), \\
\int_a^b f(x)\,dx + \int_a^b g(x)\,dx - \int_a^b \big( s_f(x) + s_g(x) \big)\,dx < \varepsilon, \\
\int_a^b \big( t_f(x) + t_g(x) \big)\,dx - \int_a^b f(x)\,dx - \int_a^b g(x)\,dx < \varepsilon.
\end{gather*}

Now apply Theorem 2.1.14. □


Theorem 2.2.5. Suppose f, g : [a, b] → R such that f is integrable and g = f except at finitely many points. Then g is integrable and ∫_a^b g(x) dx = ∫_a^b f(x) dx.

Proof. The function h = g − f is zero except at finitely many points. Hence it is a step function which has integral zero. Hence g = f + h is integrable and
\[ \int_a^b g(x)\,dx = \int_a^b f(x)\,dx + \int_a^b h(x)\,dx = \int_a^b f(x)\,dx + 0. \qquad \Box \]

Theorem 2.2.6 (Additivity over Intervals). Let a < c < b and suppose f : [a, b] → R is a bounded function. Then
1. f is integrable on [a, b] if and only if f is integrable on both [a, c] and [c, b].
2. If f is integrable on [a, b] then ∫_a^b f(x) dx = ∫_a^c f(x) dx + ∫_c^b f(x) dx.

Proof. Exercise. □

The equality ∫_a^b f(x) dx = ∫_a^c f(x) dx + ∫_c^b f(x) dx holds for any ordering of a, b, c. For example, suppose a < b < c. Then:
\begin{align*}
\int_a^c f(x)\,dx + \int_c^b f(x)\,dx &= \int_a^b f(x)\,dx + \int_b^c f(x)\,dx - \int_b^c f(x)\,dx \\
&= \int_a^b f(x)\,dx.
\end{align*}

Task 2.2.7. Calculate ∫_{−1}^{1} |t| dt.

Theorem 2.2.8 (Scaling of Interval of Integration). Let f(x) be integrable on [a, b] and let k ≠ 0. Then f(x/k) is integrable on [ka, kb] and
\[ \int_{ka}^{kb} f(x/k)\,dx = k \int_a^b f(x)\,dx. \]

Proof. Exercise. (Hint: If s(x) is a step function with domain [a, b] then s(x/k) is a step function with domain [ka, kb].) □
Theorem 2.2.9 (Shift of Interval of Integration). Let f : [a, b] → R be integrable and let k ∈ R. Then f(x − k) is integrable on [a + k, b + k] and
\[ \int_{a+k}^{b+k} f(x - k)\,dx = \int_a^b f(x)\,dx. \]

Proof. Exercise. (Hint: If k > 0 and s(x) is a step function with domain [a, b] then s(x − k) is a step function with domain [a + k, b + k].) □

Integration of Monotone Functions


Theorem 2.2.10. Let f : [a, b] → R be a monotone function. Then f is integrable on [a, b].
Proof. Suppose f is an increasing function. Let P_n = {x_0, ..., x_n} be the partition of [a, b] that cuts it into n equally sized subintervals, each of length (b − a)/n. Define step functions s_n and t_n on [a, b] by

s_n(x) = f(x_{i−1}) and t_n(x) = f(x_i) if x_{i−1} ≤ x < x_i,

and s_n(b) = t_n(b) = f(b). Then we have s_n(x) ≤ f(x) ≤ t_n(x) for every x ∈ [a, b]. Further,
\begin{align*}
\int_a^b t_n(x)\,dx - \int_a^b s_n(x)\,dx &= \sum_{i=1}^{n} f(x_i) \frac{b-a}{n} - \sum_{i=1}^{n} f(x_{i-1}) \frac{b-a}{n} \\
&= \frac{b-a}{n} \sum_{i=1}^{n} \big( f(x_i) - f(x_{i-1}) \big) \\
&= \frac{b-a}{n} \big( f(b) - f(a) \big).
\end{align*}
By the Archimedean property, this quantity can be made smaller than any given positive ε.
The same approach works for decreasing functions. We just switch the definitions of s_n and t_n. □
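
The telescoping gap (b − a)(f(b) − f(a))/n from this proof is easy to observe numerically. The sketch below is an illustration with floating-point arithmetic; the function name `monotone_bounds` and the test function x³ on [0, 2] are our own choices.

```python
def monotone_bounds(f, a, b, n):
    """Lower and upper sums of an increasing function f on [a, b],
    using n equal subintervals (left endpoints below, right endpoints above)."""
    h = (b - a) / n
    xs = [a + i * h for i in range(n + 1)]
    lower = h * sum(f(x) for x in xs[:-1])   # integral of s_n
    upper = h * sum(f(x) for x in xs[1:])    # integral of t_n
    return lower, upper


def cube(x):
    return x ** 3


lo, up = monotone_bounds(cube, 0.0, 2.0, 1000)
gap = up - lo
expected_gap = (2.0 - 0.0) * (cube(2.0) - cube(0.0)) / 1000   # (b - a)(f(b) - f(a))/n
print(lo, up, gap, expected_gap)
```

Doubling n halves the gap, so any prescribed accuracy ε is reached once n exceeds (b − a)(f(b) − f(a))/ε.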

This approach can also succeed in finding the value of the integral of a given function, as we saw earlier for f(x) = x.
Task 2.2.11. Suppose that f : [a, b] → R is an increasing function. Let x_i = a + (b − a)i/n. Prove that
\[ \frac{b-a}{n} \sum_{i=0}^{n-1} f(x_i) \le \int_a^b f(x)\,dx \le \frac{b-a}{n} \sum_{i=1}^{n} f(x_i). \]

A function f : [a, b] → R is called piecewise monotone if there is a partition x_0 < ··· < x_n of [a, b] such that f is monotone on each open subinterval (x_i, x_{i+1}).
Examples of piecewise monotone functions:

[Figure: graphs of two piecewise monotone functions, with the partition points x_0, x_1, ... marked on the axis.]

Theorem 2.2.12. If f : [a, b] → R is bounded and piecewise monotone then it is integrable.

Proof. Let x_0 < ··· < x_n be a partition of [a, b] such that f is monotone on each (x_i, x_{i+1}). Also, suppose |f(x)| ≤ M for every x ∈ [a, b].
If f is increasing on (x_i, x_{i+1}), define g_i : [x_i, x_{i+1}] → R by
\[ g_i(x) = \begin{cases} -M, & \text{if } x = x_i \\ f(x), & \text{if } x_i < x < x_{i+1} \\ M, & \text{if } x = x_{i+1} \end{cases} \]

Then g_i is an increasing function. If f is decreasing on (x_i, x_{i+1}), switch the definitions of g_i(x_i) and g_i(x_{i+1}), and then g_i will be a decreasing function.

In either case, g_i is monotone and hence integrable. Since f equals g_i except perhaps at two points, f is integrable on [x_i, x_{i+1}]. Therefore f is integrable on [a, b], by the result on additivity over intervals. □

This result suffices to establish the integrability of the functions that are typically encountered in Calculus and its applications.

Intermediate Value Property


The Heaviside step function H(x) jumps from 0 to 1 at x = 0. We create a new function by integrating it: F(x) = ∫_0^x H(t) dt. We find that this function does not have a jump at x = 0.

[Figure: graphs of H(x) and F(x).]

Thus, integration has smoothened out the step function and removed its jump. This is a general phenomenon which we shall explore when we take up continuous functions. For now, here is a result along these lines about integrals of monotone functions:

Theorem 2.2.13 (Intermediate Value Property). Let I be an interval and f : I → R a decreasing function which is always positive. Let a ∈ I and define F(x) = ∫_a^x f(t) dt for x ∈ I. Suppose b, c ∈ I and L ∈ R such that b < c and F(b) < L < F(c). Then there is α ∈ (b, c) such that F(α) = L.

Proof. We start by noting that F is an increasing function. Let x < y. Then,
\[ F(y) - F(x) = \int_a^y f(t)\,dt - \int_a^x f(t)\,dt = \int_x^y f(t)\,dt \ge \int_x^y 0\,dt = 0. \]

Now define A = { x ∈ I | F(x) < L } and B = { y ∈ I | F(y) > L }.

Suppose x ∈ A and y ∈ B. Then F(x) < L < F(y). Since F is increasing, we must have x < y. Hence, by the Completeness Axiom, there is a real number α such that x ≤ α ≤ y for every x ∈ A, y ∈ B. We shall use trichotomy to prove F(α) = L.

First, suppose F(α) < L. Then F(α) = L − ε with ε > 0. Define δ = ε/f(α). So,
\begin{align*}
F(\alpha + \delta) &= F(\alpha) + \int_\alpha^{\alpha+\delta} f(t)\,dt < (L - \varepsilon) + \int_\alpha^{\alpha+\delta} f(\alpha)\,dt \\
&= (L - \varepsilon) + f(\alpha)\delta = L.
\end{align*}

Hence α + δ ∈ A, a contradiction.

If F(α) > L, we can show in a similar fashion and with the same choice of δ, that α − δ ∈ B. Therefore, by trichotomy, F(α) = L. □

Exercises for §2.2

1. For a certain function f, you are given that ∫_0^a f(x) dx = a³ for every a ∈ R. Compute the following:

(a) ∫_1^2 f(x) dx, (c) ∫_0^2 f(1 − x) dx,
(b) ∫_{−2}^{2} f(x) dx, (d) ∫_1^2 (f(x) + f(−x)) dx.

2. Integrate:

(a) ∫_1^2 (x − 1)(x − 2) dx, (b) ∫_1^2 (x − 1)(x − 2)(x − 3) dx.

3. Find all values of c for which:

(a) ∫_0^c x(1 − x) dx = 0. (b) ∫_0^c |x(1 − x)| dx = 0.

4. Find a quadratic polynomial P(x) such that P(0) = P(1) = 0 and ∫_0^1 P(x) dx = 1.

5. Calculate ∫_0^2 f(x) dx where
\[ f(x) = \begin{cases} x^2, & 0 \le x \le 1 \\ x - 2, & 1 < x \le 2 \end{cases}. \]

6. Let f : [a, b] → R. Define f⁺(x) = max{f(x), 0}, f⁻(x) = max{−f(x), 0}. Prove the following:

(a) f = f⁺ − f⁻ and |f| = f⁺ + f⁻. (Definition: |f|(x) = |f(x)|)

(b) If f is integrable, so are f⁺ and f⁻. (Hint: If a step function s gives a lower/upper sum for f, consider s^±.)

(c) If f is integrable, so is |f|, and ∫_a^b f(x) dx ≤ ∫_a^b |f(x)| dx.

7. Let f, g : [a, b] → R be integrable and define f ∨ g, f ∧ g by

(f ∨ g)(x) = max{f(x), g(x)} and (f ∧ g)(x) = min{f(x), g(x)}.

Prove that f ∨ g and f ∧ g are integrable.

8. Give an example of a function f which is not integrable on [0, 1] yet |f| is integrable there.

9. Prove the following:

(a) If step functions s, t satisfy 0 ≤ s ≤ t ≤ M on [a, b] then
\[ \int_a^b t(x)^2\,dx - \int_a^b s(x)^2\,dx \le 2M \left( \int_a^b t(x)\,dx - \int_a^b s(x)\,dx \right). \]

(b) If f is integrable on [a, b], so is f². (Hint: First assume f ≥ 0)

(c) If f and g are integrable on [a, b], so is their product fg.

10. Prove that all polynomials are integrable on any [a, b].

11. Suppose f is integrable on [−a, a] and we define F(x) = ∫_0^x f(t) dt for each x ∈ [−a, a]. Prove the following:

(a) If f is even then F is odd.

(b) If f is odd then F is even.

12. Find ∫_1^2 (1/x) dx within an error of no more than 0.1.

13. Let f : [a, b] → R be strictly increasing with a, f(a) ≥ 0. Prove that
\[ \int_{f(a)}^{f(b)} f^{-1}(y)\,dy + \int_a^b f(x)\,dx = b f(b) - a f(a). \]

14. Prove that Theorem 2.2.13 can be generalised to arbitrary monotone functions by following the given hints:

(a) If f is an increasing and positive function, consider g(x) = f(−x).

(b) If f is monotone and is not given to be positive, consider shifts g(x) = f(x) + K.

15. Show that f(x) = x³ is a bijection from R to R. (Hence every real number has a unique cube root)

16. Consider the function C : R → R defined by C(x) = x³. Since it is bijective (see the previous exercise, or the proof of existence of nth roots in the next section), every x has a unique cube root x^{1/3}. Now prove that
\[ \int_0^b x^{1/3}\,dx = \frac{3}{4} b^{4/3}. \]

17. Let f : R → R have period T and be integrable on [0, T].

(a) Show that f is integrable on every interval [a, b].

(b) Show that the function F(x) = ∫_0^x f(t) dt has period T.

2.3 Logarithm and Exponential Functions


Natural Logarithm
Consider the function f(x) = 1/x on (0, ∞). It is monotonic and hence integrable on every [a, b] ⊂ (0, ∞). We use this observation to define the natural logarithm function by
\[ \log(x) = \int_1^x \frac{1}{t}\,dt \qquad (x > 0). \]

By definition, log(1) = 0.
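
Because 1/t is decreasing, the lower and upper sums from the proof that monotone functions are integrable give computable bounds for log x straight from this definition. The sketch below is an illustration only: `log_bounds` is a hypothetical helper name, and the library function `math.log` is used purely as an independent cross-check.

```python
import math


def log_bounds(x, n=10_000):
    """Bounds (low, high) for log x = integral of 1/t from 1 to x,
    from n equal subintervals. Since 1/t is decreasing, right endpoints
    give the lower sum and left endpoints give the upper sum."""
    if x <= 0:
        raise ValueError("log is defined only for x > 0")
    a, b = min(1.0, x), max(1.0, x)
    h = (b - a) / n
    lower = h * sum(1 / (a + i * h) for i in range(1, n + 1))
    upper = h * sum(1 / (a + i * h) for i in range(n))
    if x >= 1:
        return lower, upper
    # For x < 1 the integral from 1 to x is minus the integral from x to 1.
    return -upper, -lower


for x in (0.5, 2.0, 10.0):
    low, high = log_bounds(x)
    print(x, low, high, math.log(x))
```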
We shall often just write log x for log(x). Using fewer brackets increases readability but we should use brackets whenever ambiguity is a danger. For example, log x + y should be clearly expressed as either log(x + y) or (log x) + y, two expressions with very different meanings. The same convention will be used later for any function which has its own name.

Theorem 2.3.1. Let a, b > 0. Then

1. log(ab) = log a + log b,
2. log(a/b) = log a − log b,
3. log(a^m) = m log a for any m ∈ Z.
Proof.
1. We use the property of scaling of interval of integration:
\begin{align*}
\log(ab) &= \int_1^{ab} \frac{1}{t}\,dt = \int_1^{a} \frac{1}{t}\,dt + \int_a^{ab} \frac{1}{t}\,dt = \log a + \frac{1}{a} \int_a^{ab} \frac{1}{t/a}\,dt \\
&= \log a + \int_1^{b} \frac{1}{t}\,dt \qquad \left( \because\ \frac{1}{k} \int_{k\alpha}^{k\beta} f(t/k)\,dt = \int_\alpha^\beta f(t)\,dt \right) \\
&= \log a + \log b.
\end{align*}

2. log(a/b) + log b = log((a/b) · b) = log a.
3. If m = 0, both sides are 0. For any m ∈ N, we have

log(a^m) = log(a ··· a) = log a + ··· + log a = m log a,
log(a^{−m}) = log(1/a^m) = −log(a^m) = −m log a. □

Theorem 2.3.2. The function log : (0, ∞) → R is a strictly increasing bijection.
Proof. Let b > a > 0. Then
\[ \log b - \log a = \int_1^b \frac{1}{t}\,dt - \int_1^a \frac{1}{t}\,dt = \int_a^b \frac{1}{t}\,dt \ge \int_a^b \frac{1}{b}\,dt = \frac{b-a}{b} > 0. \]

So log is strictly increasing and hence also 1-1.

Consider any y > 0. We know that log 2 > log 1 = 0, hence by the Archimedean Property we have N ∈ N such that y < N log 2 = log(2^N). So,
log 1 < y < log 2^N.
By the Intermediate Value Property (Theorem 2.2.13) there is x ∈ (1, 2^N) such that log x = y.
Now consider y < 0. There is x > 0 such that log x = −y. Then log(1/x) = −log x = y. Since log 1 = 0 covers the case y = 0, log is onto. □

Exponential Function
Since log : (0, ∞) → R is a bijection, it has an inverse function exp : R → (0, ∞) which we call the exponential function. We have y = exp x if and only if log y = x. Since log is strictly increasing, so is exp.
Theorem 2.3.3. Let a, b ∈ R. Then
1. exp(a + b) = exp a exp b,
2. exp(a − b) = exp a / exp b,
3. exp(ma) = (exp a)^m for m ∈ Z.
Proof. Since log is 1-1, we can prove these identities by applying log to both sides and observing the results are equal. For example, the first identity is proved as follows:

log(exp(a + b)) = a + b,
log(exp a exp b) = log(exp a) + log(exp b) = a + b. □

Euler's Number
The number e = exp 1 is called Euler's number. It also satisfies log e = 1.
Consider the y = 1/x graph between 1 and e. The area under the graph is
\[ \int_1^e \frac{1}{t}\,dt = \log e = 1. \]

[Figure: two copies of the graph of y = 1/x. The first shows the rectangle of height 1 over [1, e]. The second shows two inner rectangles, over [1, 2] and [2, e].]

The area of the sketched rectangle in the first diagram is 1(e − 1) = e − 1. Since the rectangle has greater area, we see that e − 1 > 1 and therefore e > 2.
Now consider the two inner rectangles shown in the second figure. Adding their areas gives
\[ \frac{1}{2} + \frac{e-2}{e} < 1 \implies \frac{3e-4}{2e} < 1 \implies 3e - 4 < 2e \implies e < 4. \]
Later, we will see how to get more accurate estimates of e.
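
Starting from the bounds 2 < e < 4, a rough numerical estimate of e can already be obtained by bisection, using the fact that log is strictly increasing and log e = 1. This Python sketch is only an illustration and not the method developed later in the text; the names `log_estimate` and `estimate_e` are hypothetical, and log is approximated by averaging the lower and upper sums of 1/t.

```python
import math


def log_estimate(x, n=20_000):
    """Approximate log x for x >= 1 by the average of the lower and upper
    sums of 1/t on [1, x] with n equal subintervals."""
    h = (x - 1) / n
    lower = h * sum(1 / (1 + i * h) for i in range(1, n + 1))
    upper = h * sum(1 / (1 + i * h) for i in range(n))
    return (lower + upper) / 2


def estimate_e(tol=1e-4):
    """Bisection for the solution of log x = 1 on the interval [2, 4]."""
    lo, hi = 2.0, 4.0            # the bounds 2 < e < 4 obtained above
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if log_estimate(mid) < 1:
            lo = mid             # log mid < 1 = log e, so mid < e
        else:
            hi = mid
    return (lo + hi) / 2


approx_e = estimate_e()
print(approx_e, math.e)          # math.e is used only for comparison
```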

Graphs of Log and Exp


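The graph of the log function has the following appearance:

[Figure: graph of log x, with the values 1 and 2 attained at x = e and x = e².]

The graph of exp can be obtained by reflecting the log graph in the y = x line:

[Figure: graphs of exp x and log x, mirror images of each other in the line y = x.]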

Roots
Theorem 2.3.4. Let a > 0 and n ∈ N. Then there is a unique b > 0 such that b^n = a. (This b is called the nth root of a and is denoted by a^{1/n} or ⁿ√a.)

Proof. Since exp : R → (0, ∞) is surjective, there is a real number x such that exp x = a. Then b = exp(x/n) satisfies b > 0 and b^n = a. Uniqueness follows from b^n − c^n = (b − c)(b^{n−1} + ··· + c^{n−1}), as this shows that, for b, c > 0, b^n − c^n = 0 ⇐⇒ b − c = 0. □

Task 2.3.5. Let n be an odd natural number. Show that every real number a
has a unique nth root b, i.e. bn = a.

Rational Powers
Let a > 0 and r ∈ Q. If r = m/n with m ∈ Z, n ∈ N, we define a^r = a^{m/n} = (a^m)^{1/n}.
Since a rational number can be expressed as m/n in many different ways, we need to check that our definition of a^{m/n} is independent of these choices.

Task 2.3.6. Let a > 0, m, n, p, q ∈ Z, n, q ≠ 0 such that m/n = p/q. Show that (a^m)^{1/n} = (a^p)^{1/q}. (Hint: Take the nq power of both sides)
Task 2.3.7. Let a > 0, m, n ∈ Z, n ≠ 0. Show that (a^m)^{1/n} = (a^{1/n})^m.
Theorem 2.3.8. Let a > 0, x ∈ R, and r ∈ Q. Then
1. log(a^r) = r log a,
2. exp(rx) = (exp x)^r.
Proof.
1. Let r = m/n with m ∈ Z and n ∈ N. We have shown that log(a^m) = m log a if m ∈ Z. Replacing m by n ∈ N and a by a^{1/n} in this equality, we get log a = n log(a^{1/n}), and hence log(a^{1/n}) = (1/n) log a. Combine these calculations:
\[ \log(a^r) = \log\big( (a^m)^{1/n} \big) = \frac{1}{n} \log(a^m) = \frac{m}{n} \log a = r \log a. \]

2. Apply log to both sides. □

Real Powers
Recall that Euler's number was defined by e = exp 1. From the second part of the last theorem we see that

e^r = exp r, if r ∈ Q.

Therefore we define
e^x = exp x, if x ∈ R.
More generally, given any a > 0 and r ∈ Q we note that

a^r = [exp(log a)]^r = exp(r log a).

Therefore we define
a^x = exp(x log a), if x ∈ R.
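
This definition can be compared numerically with a built-in power operator. In the Python sketch below, `power` is a hypothetical helper written for this illustration; `math.exp` and `math.log` stand in for exp and log, and the comparison with `**` is only a floating-point sanity check, not a proof.

```python
import math


def power(a, x):
    """a**x for a > 0 and real x, computed as exp(x * log a)."""
    if a <= 0:
        raise ValueError("real powers are defined here only for a > 0")
    return math.exp(x * math.log(a))


for a, x in [(2.0, 0.5), (10.0, -1.5), (0.3, math.pi)]:
    print(a, x, power(a, x), a ** x)
    assert math.isclose(power(a, x), a ** x, rel_tol=1e-12)
```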

With this definition of real powers of a positive real number, we can use the properties of the exponential function to verify the following:

Theorem 2.3.9. Let a > 0 and x, y ∈ R. Then

1. a^{x+y} = a^x a^y, 2. a^{x−y} = a^x / a^y, 3. (a^x)^y = a^{xy}.

Proof. The first two are easily checked using the properties of exp. For the third, we first consider the special case when a = e:

(e^x)^y = exp(y log e^x) = exp(xy) = e^{xy}.

Now we consider the general case:

(a^x)^y = (e^{x log a})^y = e^{xy log a} = a^{xy}. □

For a ≠ 1, the function log(a) · x is a bijection of R with R, while e^x is a bijection of R with (0, ∞). Hence their composition a^x = e^{log(a) x} is a bijection of R with (0, ∞).
For a > 0 and a ≠ 1, the inverse function of a^x is called the logarithm with base a and is denoted by log_a x: y = log_a x ⇐⇒ a^y = x. We have log x = log_e x.

This is different from school, where log x = log_10 x, and the natural logarithm had the special name `ln'.

Task 2.3.10. Solve for y in terms of x:

(a) log(y + 1) − log(y − 1) = 2 log x, (b) x = (2^y + 2^{−y}) / (2^y − 2^{−y}).
Theorem 2.3.11. Let a, x > 0 and a ≠ 1. Then log_a x = log x / log a.

Proof. Rearrange the given expression to log a · log_a x = log x. Now,
\begin{align*}
\log a \, \log_a x = \log x &\iff e^{\log a \, \log_a x} = e^{\log x} \\
&\iff (e^{\log a})^{\log_a x} = x \\
&\iff a^{\log_a x} = x \iff x = x. \qquad \Box
\end{align*}
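
As a small numerical illustration of this change-of-base formula, the Python sketch below defines a hypothetical helper `log_base` (our own name) in terms of the natural logarithm `math.log`, and checks it on a few values where the answer is known.

```python
import math


def log_base(a, x):
    """Logarithm of x with base a, computed as log x / log a."""
    if a <= 0 or a == 1 or x <= 0:
        raise ValueError("need a > 0, a != 1 and x > 0")
    return math.log(x) / math.log(a)


print(log_base(2, 8))        # close to 3, since 2**3 = 8
print(log_base(10, 0.001))   # close to -3, since 10**(-3) = 0.001

# The defining property y = log_a x  <=>  a**y = x, up to rounding:
assert math.isclose(2 ** log_base(2, 5), 5)
```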

Theorem 2.3.12. Fix a > 0, a ≠ 1. Then the functions a^x and log_a x are strictly monotone on their domains.

Proof. Apply the previous theorem. □

Task 2.3.13. Let a, b, x > 0 with a, b ≠ 1. Show that log_a x = log_b x / log_b a.

Hyperbolic Functions
Some particular combinations of exponential functions are very convenient in calculations, and also show up directly in some physical problems. These are called hyperbolic functions and are defined by
\[ \cosh x = \frac{e^x + e^{-x}}{2} \quad \text{and} \quad \sinh x = \frac{e^x - e^{-x}}{2}. \]

These are called the `hyperbolic cosine' and `hyperbolic sine' respectively.

Task 2.3.14. Prove the following:

(a) The function cosh is even and sinh is odd.
(b) The image of cosh is [1, ∞) and that of sinh is R.
(c) For every x ∈ R, (cosh x)² − (sinh x)² = 1.

This implies that if we take y = cosh t and x = sinh t, and vary t, we will trace out the upper branch of the hyperbola given by y² − x² = 1. Taking y = −cosh t and x = sinh t will trace the lower branch. This explains the choice of names.
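
The claim that the points (sinh t, cosh t) lie on the upper branch of y² − x² = 1 can be spot-checked numerically. The Python sketch below builds cosh and sinh directly from the definitions above, with `math.exp` standing in for exp; it is a floating-point illustration, not a substitute for Task 2.3.14.

```python
import math


def cosh(x):
    """Hyperbolic cosine from its definition (e^x + e^-x)/2."""
    return (math.exp(x) + math.exp(-x)) / 2


def sinh(x):
    """Hyperbolic sine from its definition (e^x - e^-x)/2."""
    return (math.exp(x) - math.exp(-x)) / 2


for t in (-2.0, -0.5, 0.0, 1.0, 3.0):
    x, y = sinh(t), cosh(t)
    # The point (x, y) should satisfy y^2 - x^2 = 1 with y >= 1.
    assert math.isclose(y * y - x * x, 1.0, rel_tol=1e-9)
    assert y >= 1
    print(t, x, y)
```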

[Figure: the hyperbola y² − x² = 1, with the point (sinh t, cosh t) on the upper branch and the point (sinh t, −cosh t) on the lower branch.]

Let us consider the graph of cosh x. Note that cosh x > e^x/2 and cosh x approaches e^x/2 as x becomes larger on the positive side. On the negative side, cosh x approaches e^{−x}/2. This gives the following picture:

[Figure: graph of y = cosh x.]

Task 2.3.15. Justify the following depiction of the graph of sinh:

[Figure: graph of y = sinh x.]

Exercises for §2.3

1. Estimate log 2 and log 3 within an error of 0.1.

2. You are given some values of the natural logarithm, accurate to two decimal places: log 2 = 0.69, log 3 = 1.10, log 5 = 1.61 and log 7 = 1.95. Use these to find the natural logarithms of the following numbers:

(a) 80, (b) 120, (c) 2.1, (d) 3/35.

3. Express x in terms of log 2 and exp 2:

(a) 2 exp(5x) = 1/16, (b) 12 log(x − 3) = 96.

4. Which is bigger: log_2 3 or log_3 5? (Hint: Compare with 3/2)

5. Use the following figures to show that 2.4 < e < 3.5:

[Figures: two copies of the graph of y = 1/x starting at x = 1, with rectangles drawn over subintervals whose endpoints are marked at 1, 3/2, 2 and e.]

6. Prove the following inequalities for x > 0: 1 − 1/x ≤ log x ≤ x − 1.

7. Plot the graphs of the following functions on the same coordinate plane:

(a) log x, (b) log 2x, (c) log x², (d) log |x|.

8. Plot the graphs of the following functions on the same coordinate plane:

(a) exp x, (b) exp(1 + x), (c) exp(−x), (d) exp |x|.

9. Suppose a, b > 0 and x ∈ R. Show that a^x b^x = (ab)^x.

10. Prove that the function a^x is strictly increasing if a > 1 and strictly decreasing if 0 < a < 1.

11. Graph the functions a^x for a = 1/2, 1, 2, e, 3.

12. Verify the following identities for the hyperbolic functions:

(a) cosh(2x) = (cosh x)² + (sinh x)²,

(b) sinh(2x) = 2(cosh x)(sinh x),

(c) cosh(x + y) = cosh x cosh y + sinh x sinh y,

(d) sinh(x + y) = sinh x cosh y + cosh x sinh y.

2.4 Integration and Area


We began this chapter with the promise that integration will enable us to
calculate areas. Now we shall deliver on that promise. To begin with, consider
a bounded function f : [a, b] → R such that f (x) ≥ 0 for every x ∈ [a, b]. This
creates a region R in the cartesian plane, enclosed by the graph of f , the x-axis,
and the lines x = a, x = b:

[Figure: the region R enclosed by the graph y = f(x), the x-axis, and the lines x = a and x = b.]

The definite integral ∫_a^b f(x) dx is our definition of the area of the region R. If f is not integrable, then we fail to define the area of R.
Next, suppose we have two integrable functions f, g : [a, b] → R such that f(x) ≥ g(x) ≥ 0 for every x ∈ [a, b]. Then we can create a region S which is enclosed by the graphs of f and g, and by the lines x = a, x = b:

[Figure: the region S between the graphs y = f(x) and y = g(x), from x = a to x = b.]

This region can be viewed as the result of removing the region under the graph of g from that which is under the graph of f:

[Figure: S shown as the region under the graph of f minus the region under the graph of g.]

Hence we define the area of S to be
\[ \int_a^b f(x)\,dx - \int_a^b g(x)\,dx = \int_a^b \big( f(x) - g(x) \big)\,dx. \]

The assumption f, g ≥ 0 can be relaxed. Since f, g are bounded, there is a number C such that f + C, g + C ≥ 0. The region that lies between the graphs of f + C and g + C is just a vertical shift of S, so it should have the same area. And that area is
\[ \int_a^b \big( f(x) + C \big)\,dx - \int_a^b \big( g(x) + C \big)\,dx = \int_a^b \big( f(x) - g(x) \big)\,dx. \]
Now, we shall develop an approach for calculating the area of a polygon
whose vertices are given. The building block is the integral corresponding to
one side of the polygon. Thus, let f be the function whose graph is the straight
line joining (x1 , y1 ) to (x2 , y2 ), with x1 < x2 .

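[Figure: the straight line segment joining (x1, y1) to (x2, y2).]

Then,
\begin{align*}
\int_{x_1}^{x_2} f(x)\,dx &= \int_{x_1}^{x_2} \Big( y_1 + \frac{y_2 - y_1}{x_2 - x_1}(x - x_1) \Big)\,dx \\
&= \int_{0}^{x_2 - x_1} \Big( y_1 + \frac{y_2 - y_1}{x_2 - x_1}\,x \Big)\,dx \\
&= (x_2 - x_1)\,y_1 + \frac{y_2 - y_1}{x_2 - x_1} \cdot \frac{(x_2 - x_1)^2}{2} \\
&= \frac{1}{2}(x_2 - x_1)(y_2 + y_1).
\end{align*}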

Example 2.4.1. Let us find the area of the triangle T with vertices at (x1, y1),
(x2, y2) and (x3, y3) as shown below:

[Figure: the triangle T with vertices (x1, y1), (x2, y2) and (x3, y3), where x1 < x2 < x3 and (x2, y2) lies below the side joining the other two vertices.]

The triangle lies between the graphs of the following functions:

\[ f(x) = y_1 + \frac{y_3 - y_1}{x_3 - x_1}(x - x_1), \quad x \in [x_1, x_3], \]
\[ g(x) = \begin{cases} y_1 + \dfrac{y_2 - y_1}{x_2 - x_1}(x - x_1), & x \in [x_1, x_2], \\[1ex] y_2 + \dfrac{y_3 - y_2}{x_3 - x_2}(x - x_2), & x \in [x_2, x_3]. \end{cases} \]

Now we calculate the area as follows:


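\begin{align*}
\mathrm{Area}(T) &= \int_{x_1}^{x_3} f(x)\,dx - \int_{x_1}^{x_3} g(x)\,dx \\
&= \int_{x_1}^{x_3} f(x)\,dx - \int_{x_1}^{x_2} g(x)\,dx - \int_{x_2}^{x_3} g(x)\,dx \\
&= \frac{1}{2}\Big( (x_3 - x_1)(y_3 + y_1) - (x_2 - x_1)(y_2 + y_1) - (x_3 - x_2)(y_3 + y_2) \Big) \\
&= \frac{1}{2}\Big( (x_1 y_2 - x_2 y_1) + (x_2 y_3 - x_3 y_2) + (x_3 y_1 - x_1 y_3) \Big). \qquad \square
\end{align*}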
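The triangle formula can be checked against the one-side integrals on a concrete triangle. The following is only a sketch: the function names and the sample vertices are our own choices, with the vertices placed as in the example (x1 < x2 < x3, and the middle vertex below the long side).

```python
from fractions import Fraction

def segment_integral(p, q):
    """Integral of the straight line from p = (x1, y1) to q = (x2, y2):
    (x2 - x1)(y2 + y1)/2, the building-block formula derived above."""
    (x1, y1), (x2, y2) = p, q
    return Fraction(x2 - x1) * (y1 + y2) / 2

def triangle_area(p1, p2, p3):
    """Half the sum of the terms x_i*y_j - x_j*y_i, as in Example 2.4.1."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    total = (x1 * y2 - x2 * y1) + (x2 * y3 - x3 * y2) + (x3 * y1 - x1 * y3)
    return Fraction(total) / 2

# Hypothetical sample triangle (not from the text).
p1, p2, p3 = (0, 1), (2, 0), (4, 3)
by_integrals = segment_integral(p1, p3) - segment_integral(p1, p2) - segment_integral(p2, p3)
print(by_integrals, triangle_area(p1, p2, p3))
```

Exact rational arithmetic is used so that the two routes can be compared with equality rather than with a tolerance.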

This calculation can be used to recover the area formulas for basic polygonal
shapes.

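Example 2.4.2. Consider a triangle with base b and height h. Place it with
base along the x-axis and one vertex at the origin:

[Figure: the triangle with vertices (0, 0), (b, 0) and (c, h).]

Its area is
\[ \frac{1}{2}\Big( (0 \cdot 0 - b \cdot 0) + (b \cdot h - c \cdot 0) + (c \cdot 0 - 0 \cdot h) \Big) = \frac{1}{2}\,bh. \qquad \square \]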
Example 2.4.3. Consider a trapezium with height h, base b, and top t. Place
it as follows:

[Figure: the trapezium with base from the origin to (b, 0) and top from (c, h) to (c + t, h).]

Using the previous example, we find its area to be $\frac{1}{2}bh + \frac{1}{2}th = \frac{b + t}{2}\,h$. If t = b,
we get the formula bh for the area of a parallelogram. □
Example 2.4.4. We wish to find the area of the region enclosed by the curves
given by y = 3x and $y = 4 - x^2$. The first step is to sketch this region. For this,
we first check if the curves meet. Setting $3x = 4 - x^2$ gives x = 1, −4. Hence
the curves meet at (−4, −12) and (1, 3). The curves plot as follows, and the
shaded part is the region that is enclosed by the curves.

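[Figure: the line y = 3x and the parabola $y = 4 - x^2$, meeting at (−4, −12) and (1, 3), with the enclosed region shaded.]

The area of this region is calculated as follows:

\begin{align*}
\mathrm{Area} &= \int_{-4}^{1} (4 - x^2 - 3x)\,dx = 4\int_{-4}^{1} 1\,dx - \int_{-4}^{1} x^2\,dx - 3\int_{-4}^{1} x\,dx \\
&= 4 \cdot (1 - (-4)) - \frac{1}{3}\big(1^3 - (-4)^3\big) - 3 \cdot \frac{1}{2}\big(1^2 - (-4)^2\big) = 20\tfrac{5}{6}. \qquad \square
\end{align*}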
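The arithmetic in this example can be repeated in exact fractions. The sketch below assumes only the power formula already used in the calculation, namely that the integral of $x^k$ from a to b is $(b^{k+1} - a^{k+1})/(k+1)$; the helper name `power_integral` is ours.

```python
from fractions import Fraction

def power_integral(k, a, b):
    """Integral of x^k over [a, b], using (b^(k+1) - a^(k+1)) / (k + 1)."""
    return Fraction(b ** (k + 1) - a ** (k + 1), k + 1)

a, b = -4, 1
area = 4 * power_integral(0, a, b) - power_integral(2, a, b) - 3 * power_integral(1, a, b)
print(area)  # the mixed number 20 5/6 written as a single fraction
```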

Defining and Estimating π


If we change our unit of length by a factor k, then all length measurements will
change by a factor 1/k and all area measurements by a factor $1/k^2$. This shows,
for example, that the ratio of a circle's circumference to its radius is constant,
and also that the ratio of its area to the square of its radius is constant. What is
remarkable is that the same constant appears in both cases, in the relationships
Circumference = 2πR and Area = $\pi R^2$.
In this section we study π using area. Eventually, in § ??, we shall connect
π to lengths as well. First, we verify by a direct calculation that the area of a
circle is proportional to the square of its radius. Consider a circle with centre
at the origin and radius R. It is given by the equation $x^2 + y^2 = R^2$, and can be
seen as the region enclosed by the graphs of the functions $y = \sqrt{R^2 - x^2}$ and
$y = -\sqrt{R^2 - x^2}$.

[Figure: the circle of radius R, bounded above by $y = \sqrt{R^2 - x^2}$ and below by $y = -\sqrt{R^2 - x^2}$, for −R ≤ x ≤ R.]

Its area is given by


\[ \int_{-R}^{R} \Big( \sqrt{R^2 - x^2} - \big(-\sqrt{R^2 - x^2}\big) \Big)\,dx = 2\int_{-R}^{R} \sqrt{R^2 - x^2}\,dx. \]

We apply scaling of the interval of integration (Theorem 2.2.8) to get


\[ 2\int_{-R}^{R} \sqrt{R^2 - x^2}\,dx = 2R\int_{-R/R}^{R/R} \sqrt{R^2 - (Rx)^2}\,dx = 2R^2\int_{-1}^{1} \sqrt{1^2 - x^2}\,dx. \]

Thus, we get the area to be $\pi R^2$, where

\[ \pi = 2\int_{-1}^{1} \sqrt{1 - x^2}\,dx = 4\int_{0}^{1} \sqrt{1 - x^2}\,dx. \]

Since $\sqrt{1 - x^2}$ is a piecewise monotone function on [−1, 1], these integrals exist.
We can use lower and upper sums to put bounds on the value of π. For example,
suppose we partition [0, 1] into N equal parts. Then we have $x_i = i/N$ and
\[ 4\sum_{i=1}^{N} \frac{\sqrt{1 - x_i^2}}{N} < \pi < 4\sum_{i=1}^{N} \frac{\sqrt{1 - x_{i-1}^2}}{N}. \]

[Figure: three panels on [0, 1], labelled Lower Sum, Upper Sum and Mean, each showing the strip over a subinterval $[x_{i-1}, x_i]$.]

The table shows the bounds and their means for various N:

N       Lower Bound   Upper Bound   Mean
10^2    3.1204        3.1604        3.1404
10^3    3.1395        3.1435        3.1415
10^6    3.1415906     3.1415946     3.1415926

(The actual value of π is 3.141592653 . . . . The calculations in the 10^6 case took
7 seconds with the Maxima program.)
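The bounds can be recomputed with a few lines of code. This is a sketch in Python rather than the Maxima program mentioned above, and the function name `pi_bounds` is our own. It relies on $\sqrt{1 - x^2}$ being decreasing on [0, 1], so that right endpoints give the lower sum and left endpoints the upper sum.

```python
import math

def pi_bounds(n):
    """Lower and upper sums for 4 * (integral of sqrt(1 - x^2) over [0, 1]),
    using n equal parts."""
    heights = [math.sqrt(1 - (i / n) ** 2) for i in range(n + 1)]
    lower = 4 * sum(heights[1:]) / n    # right endpoints x_1, ..., x_n
    upper = 4 * sum(heights[:-1]) / n   # left endpoints x_0, ..., x_{n-1}
    return lower, upper

for n in (10 ** 2, 10 ** 3):
    lower, upper = pi_bounds(n)
    print(n, lower, upper, (lower + upper) / 2)
```

The two sums differ only in their first and last terms, so the gap between the bounds is exactly 4/N; this is why each tenfold increase in N gains about one decimal digit.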

Exercises for §2.4


1. Show that the area of the quadrilateral depicted below is

\[ \frac{1}{2}\Big( (x_1 y_2 - x_2 y_1) + (x_2 y_3 - x_3 y_2) + (x_3 y_4 - x_4 y_3) + (x_4 y_1 - x_1 y_4) \Big). \]

[Figure: a quadrilateral with vertices (x1, y1), (x2, y2), (x3, y3) and (x4, y4).]

2. Can the result of the previous exercise be generalised to a polygon with n
sides?

3. The trapezium drawn below is to be cut into two parts of equal area by a
vertical line. Where should it be drawn?

[Figure: a trapezium, with the labels k and h.]

4. Use definite integrals to represent the area of the region enclosed by the
given curves (do not try to evaluate the integrals):

[Figure: two panels, (a) and (b), showing the curves y = log(3 − x), y = log(1 + x), $y = \sqrt{1 - x^2}$ and $y = x^2 - 2x$; the value 1/2 is marked.]

5. Find the area of the region enclosed by the curves $y = x^2$ and $y = x^3$
between x = 0 and x = 1.
6. Prove that the area enclosed by the ellipse with equation $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ with
a, b > 0 is πab.
7. Let f be monotonically increasing, and y = m, y = M be horizontal lines
that cut its graph. Find the point on its graph such that the area depicted
below is minimized.

[Figure: the graph of f, cut by the horizontal lines y = m and y = M.]

Darboux Integral
Our approach to integration is a minor variation of one due to Gaston Darboux,
which he gave in 1875. At this time the work of Fourier had led to a broadening
of the idea of a function and so it became important to give integration a formal
structure and not rely on intuitions about area. Bernhard Riemann gave the
first general approach in 1854, but Darboux's definition is much easier to set
up. For completeness, we give the standard version of the Darboux integral
below. We also introduce the Riemann integral in the supplementary exercises
to Chapter 6.

Consider a bounded function f : [a, b] → R and a partition $P : x_0 < \cdots < x_n$
of [a, b]. Using the LUB Property, we define

\[ m_i = \inf\{\, f(x) \mid x \in [x_{i-1}, x_i] \,\} \quad\text{and}\quad M_i = \sup\{\, f(x) \mid x \in [x_{i-1}, x_i] \,\}. \]

We then take the lower and upper Darboux sums created by these numbers:
\[ L(f, P) = \sum_{i=1}^{n} m_i (x_i - x_{i-1}) \quad\text{and}\quad U(f, P) = \sum_{i=1}^{n} M_i (x_i - x_{i-1}). \]

A1. Show that the collection of lower Darboux sums is bounded above and
the collection of upper Darboux sums is bounded below.

Hence we can define the lower and upper Darboux integrals:

\[ \underline{\int_a^b} f(x)\,dx = \sup\{\, L(f, P) \mid P \text{ is a partition of } [a, b] \,\}, \]
\[ \overline{\int_a^b} f(x)\,dx = \inf\{\, U(f, P) \mid P \text{ is a partition of } [a, b] \,\}. \]

A2. Find the upper and lower Darboux integrals of the Dirichlet function
(Example 2.1.8).

A3. Show that $\underline{\int_a^b} f(x)\,dx \le \overline{\int_a^b} f(x)\,dx$.

We say f is Darboux integrable on [a, b] if its upper and lower Darboux
integrals are equal. The common value is called the Darboux integral of f.
CALCULUS by Amber Habib

3 | Limits and Continuity

DRAFT September 25, 2020

Integration can be seen as accumulation, the summing up of local changes to
get a global result. In the previous chapter, we set up the general process for
achieving this. We also saw how a specific kind of information, monotonicity,
could be used to obtain global results. Already, we were able to formally define
the natural logarithm and exponential functions, which are usually taken for
granted in school mathematics.

Further progress requires a closer look at the local behaviour of functions.
The more we know of the local behaviour, the better our chances of extracting
global information. These considerations underlie our development of the
notions of limit and continuity in this chapter. As applications, we will rigorously
develop angles and their radian measures, followed by the trigonometric
functions and their properties.

3.1 Limits
You have seen in school the notation $\lim_{x \to p} f(x) = L$, which is read as "the
limit of f(x) at p is L" and is interpreted as "the values of f(x) approach L as
the values of x approach p". We need a clear definition of what we mean by
`approaches'.

Example 3.1.1. Consider f(x) = 2x + 5. What happens if we take values of
x that approach 0? Here are some calculations:

x       1    0.1   0.01   0.001   0.0001   0.00001
f(x)    7    5.2   5.02   5.002   5.0002   5.00002

We see that as x gets closer to 0, f(x) appears to be getting closer to 5. Can
we control this? Can we get the output f(x) close to 5 within any required
accuracy level, simply by making the input x appropriately close to 0?
Suppose ε is some positive number and we need f(x) = 2x + 5 to be within
ε of 5. Now:

|(2x + 5) − 5| < ε ⇐⇒ |2x| < ε ⇐⇒ |x| < ε/2.

Thus, if |x| < ε/2, we are guaranteed that |f(x) − 5| < ε. □
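The guarantee can be spot-checked numerically. The sketch below samples inputs with 0 < |x| < ε/2 and tests the output condition; the sampling scheme and the function names are our own, and a finite sample illustrates the implication but does not prove it.

```python
def f(x):
    return 2 * x + 5

def delta_for(eps):
    """The bound on |x| found in Example 3.1.1."""
    return eps / 2

def within_tolerance(eps, samples=1000):
    """Check |f(x) - 5| < eps at sample points x with 0 < |x| < eps/2."""
    delta = delta_for(eps)
    xs = [sign * delta * k / (samples + 1)
          for k in range(1, samples + 1) for sign in (1, -1)]
    return all(abs(f(x) - 5) < eps for x in xs)

for eps in (1, 0.1, 0.001):
    print(eps, delta_for(eps), within_tolerance(eps))
```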

This example shows what we mean by `limit'. We mean that we can control
the nearness of the output to a certain number, by controlling the nearness of
the input to another number. This is expressed formally as follows:

We say $\lim_{x \to p} f(x) = L$ if for every ε > 0 there is a corresponding δ > 0
such that 0 < |x − p| < δ =⇒ |f(x) − L| < ε.

In Example 3.1.1 we have δ = ε/2.
Task 3.1.2. Show that at most one number can satisfy the definition of the
limit of a given function at a given point.

We make three observations about this definition.

(a) It sets up δ as depending on ε.

(b) We do not care about the value of f(p), or even whether it is defined.

(c) Since the definition is intended for situations where x can approach p, it
should only be applied to such situations. In Calculus, this means that
we shall only consider the limit of f at p if there is an α > 0 such that
the open interval (p − α, p + α) is contained in the domain of f, except
perhaps for p itself.

We may also write `f(x) → L as x → p' for $\lim_{x \to p} f(x) = L$.

[Figure: The two stages in a limit process. In the first stage, we have a requirement to
make the output f(x) lie between L − ε and L + ε. In the second stage, we meet
the requirement by finding a δ such that input being between p − δ and p + δ
guarantees that the output is between L − ε and L + ε (except perhaps at p
itself).]
Example 3.1.3. Consider $\lim_{x \to a} x$. This amounts to asking "What does x
approach when x approaches a?" Obviously, our response has to be that it will
approach a, that is, $\lim_{x \to a} x = a$. While this is indeed obvious, let us still work it
out with the ε-δ formulation, for practice.
We start by considering an ε > 0. We need to find a δ > 0 such that
|x − a| < δ =⇒ |x − a| < ε. Clearly δ = ε will work. □
Task 3.1.4. Let f(x) = c be a constant function. Show that $\lim_{x \to p} f(x) = c$.

Example 3.1.5. Consider the limit of the function $y = x^2$ at x = 2. A natural
guess is that as x approaches 2, $x^2$ should approach $2^2 = 4$. We can test this
guess by trying out some values of ε > 0.
For example, suppose ε = 0.5. We need a positive δ such that x ∈ (2 −
δ, 2 + δ) implies $x^2 \in (4 - 0.5, 4 + 0.5) = (3.5, 4.5)$. We first note that since the
function is an increasing one, it maps $(\sqrt{3.5}, \sqrt{4.5})$ into (3.5, 4.5). The interval
$(\sqrt{3.5}, \sqrt{4.5})$ contains 2 but is not centered on it, since $2 - \sqrt{3.5} \approx 0.129$ while
$\sqrt{4.5} - 2 \approx 0.121$.

[Figure: the interval $(\sqrt{3.5}, \sqrt{4.5})$ around p = 2, extending 0.129 to the left and 0.121 to the right.]

Now if we take $\delta = \sqrt{4.5} - 2 \approx 0.121$ (the smaller of the two values) it has the
required properties, since then $(2 - \delta, 2 + \delta) \subset (\sqrt{3.5}, \sqrt{4.5})$.

Next, consider ε = 0.01. Can you confirm that $\delta = \sqrt{4.01} - 2$ will meet the
requirements?

If you have understood the arguments for these two values of ε, you are
ready to handle any choice of positive ε. Just take $\delta = \min\{2 - \sqrt{4 - \varepsilon},\ \sqrt{4 + \varepsilon} - 2\}$. □
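The formula for δ is easy to evaluate for the two values of ε discussed in this example. The following sketch assumes 0 < ε ≤ 4, so that the square root of 4 − ε is defined; the function name is our own.

```python
import math

def delta_for_square_at_2(eps):
    """delta = min{2 - sqrt(4 - eps), sqrt(4 + eps) - 2}, assuming 0 < eps <= 4."""
    return min(2 - math.sqrt(4 - eps), math.sqrt(4 + eps) - 2)

for eps in (0.5, 0.01):
    delta = delta_for_square_at_2(eps)
    # Squares of the endpoints of (2 - delta, 2 + delta) stay within eps of 4.
    print(eps, delta, (2 - delta) ** 2, (2 + delta) ** 2)
```

In both cases the minimum is attained by the right-hand distance $\sqrt{4 + \varepsilon} - 2$, in line with the choices made in the example.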

This example also demonstrates that δ typically depends on ε, with smaller
ε requiring smaller δ.
Theorem 3.1.6. $\lim_{x \to p} f(x) = L \iff \lim_{x \to p} (f(x) - L) = 0 \iff \lim_{h \to 0} f(p + h) = L$.

Proof. We simply match the definitions of the three limits and see that they
are the same:

• $\lim_{x \to p} f(x) = L$:
For every ε > 0 there is a corresponding δ > 0 such that 0 < |x − p| <
δ =⇒ |f(x) − L| < ε.
• $\lim_{x \to p} (f(x) - L) = 0$:
For every ε > 0 there is a corresponding δ > 0 such that 0 < |x − p| <
δ =⇒ |(f(x) − L) − 0| < ε.
• $\lim_{h \to 0} f(p + h) = L$:
For every ε > 0 there is a corresponding δ > 0 such that 0 < |h| < δ =⇒
|f(p + h) − L| < ε.

The first two are completely identical. The first can be converted to the
third, and conversely, by the substitution x = p + h. □

Task 3.1.7. Let a, b ∈ R with a ≠ 0. Show that:

\[ \lim_{x \to p} f(ax + b) = L \iff \lim_{y \to ap + b} f(y) = L. \]

Theorem 3.1.8. $\lim_{x \to p} f(x) = 0 \iff \lim_{x \to p} |f(x)| = 0$.

Proof. Again, just match the definitions:

• $\lim_{x \to p} f(x) = 0$:
For every ε > 0 there is a corresponding δ > 0 such that 0 < |x − p| <
δ =⇒ |f(x) − 0| < ε.
• $\lim_{x \to p} |f(x)| = 0$:
For every ε > 0 there is a corresponding δ > 0 such that 0 < |x − p| <
δ =⇒ ||f(x)| − 0| < ε.

The two definitions are the same because |f(x) − 0| = |f(x)| = |f(x)| − 0 =
||f(x)| − 0|. □

Theorem 3.1.9. $\lim_{x \to p} f(x) = M \implies \lim_{x \to p} |f(x)| = |M|$.

Proof. We know from the triangle inequality that ||f(x)| − |M|| ≤ |f(x) − M|.
Consider any ε > 0. Since $\lim_{x \to p} f(x) = M$, there is a δ > 0 such that
0 < |x − p| < δ =⇒ |f(x) − M| < ε. The same δ works for |f(x)| since
|f(x) − M| < ε implies ||f(x)| − |M|| ≤ |f(x) − M| < ε. □

Now we consider three examples which illustrate the typical ways in which
a limit can fail to exist.

Example 3.1.10. Consider the signum function

\[ \operatorname{sgn}(x) = \begin{cases} -1, & x < 0, \\ 0, & x = 0, \\ 1, & x > 0. \end{cases} \]

We shall prove by contradiction that $\lim_{x \to 0} \operatorname{sgn}(x)$ does not exist.
Suppose $\lim_{x \to 0} \operatorname{sgn}(x) = L$. Consider ε = 1. By the existence of the limit,
there is a δ > 0 such that 0 < |x| < δ =⇒ |sgn(x) − L| < 1.
Both x = δ/2 and x = −δ/2 satisfy the condition 0 < |x| < δ.
Hence |sgn(δ/2) − L| < 1 and |sgn(−δ/2) − L| < 1.
Therefore, by the triangle inequality,

\begin{align*}
|\operatorname{sgn}(\delta/2) - \operatorname{sgn}(-\delta/2)| &= |(\operatorname{sgn}(\delta/2) - L) - (\operatorname{sgn}(-\delta/2) - L)| \\
&\le |\operatorname{sgn}(\delta/2) - L| + |\operatorname{sgn}(-\delta/2) - L| \\
&< 1 + 1 = 2.
\end{align*}

On the other hand, using the definition of sgn(x), we have

|sgn(δ/2) − sgn(−δ/2)| = |1 − (−1)| = 2.

This equality contradicts the previous inequality. So $\lim_{x \to 0} \operatorname{sgn}(x)$ does not exist. □

Example 3.1.11. Define f : R → R by f(0) = 0 and f(x) = 1/x when x ≠ 0.

[Figure: the interval (−δ, δ) on the number line, containing a point 1/n.]

We proceed as in the previous example. Suppose $\lim_{x \to 0} f(x) = L$ and consider
ε = 1/2. Now consider any δ > 0. By the Archimedean property, the interval
(−δ, δ) contains points of the form 1/n and 1/(n + 1) with n ∈ N. Then
f(1/(n + 1)) − f(1/n) = 1 and so it is impossible that both f(1/(n + 1)) and f(1/n) are
within a distance ε = 1/2 of L. □
Example 3.1.12. Let S : [−1, 1] → R be defined by $S(1/n) = (-1)^n$ for each
n ∈ N and let its graph be a straight line on each interval between these points.
Further, let S(0) = 0.

[Figure: the graph of S, with the points −1, 1/3, 1/2 and 1 marked on the x-axis.]

In any (−δ, δ) interval, S takes both the values ±1 and so we can argue as in
the previous two examples to show that $\lim_{x \to 0} S(x)$ does not exist. □

Remember our statement that the limit does not have to equal the function's
value? Here is an example.

Example 3.1.13. Let f(x) = 0 when x ≠ 0 and f(0) = 1. We will show that
$\lim_{x \to 0} f(x) = 0$.

Consider any ε > 0. Let δ = 1. Then

0 < |x − 0| < δ =⇒ x ≠ 0 =⇒ f(x) = 0 =⇒ |f(x) − 0| = 0 < ε.

(Note that the implications are entirely based on 0 < |x| and so any choice of
positive δ would have worked here.) □

Limit Theorems
We now take up questions regarding limits of combinations of functions. If we
know the limits of two functions at the same point, what can we say about
their sum, product, etc.?

We begin by considering the special case when the initial functions have
limit zero. The ε-δ arguments are much simpler in this situation.

Lemma 3.1.14. Let f, g be real functions such that $\lim_{x \to p} f(x) = \lim_{x \to p} g(x) = 0$.
Then
1. $\lim_{x \to p} c\,f(x) = 0$ (c ∈ R),

2. $\lim_{x \to p} (f(x) + g(x)) = 0$,

3. $\lim_{x \to p} f(x)g(x) = 0$,

4. If $\lim_{x \to p} h(x) = 1$ then $\lim_{x \to p} \dfrac{f(x)}{h(x)} = 0$.

Proof.

1. This is trivial if c = 0. So suppose c ≠ 0. Take any ε > 0.
There is a δ > 0 such that 0 < |x − p| < δ implies |f(x)| < ε/|c|.
Then 0 < |x − p| < δ implies $|c f(x) - 0| = |c|\,|f(x)| < |c|\,\dfrac{\varepsilon}{|c|} = \varepsilon$.

2. Take any ε > 0.
First, there is a δ1 > 0 such that 0 < |x − p| < δ1 implies |f(x)| < ε/2.
Second, there is a δ2 > 0 such that 0 < |x − p| < δ2 implies |g(x)| < ε/2.
Let δ = min{δ1, δ2}. Then
\[ 0 < |x - p| < \delta \implies |f(x) + g(x) - 0| \le |f(x)| + |g(x)| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]

3. Take any ε > 0.
First, there is a δ1 > 0 such that 0 < |x − p| < δ1 implies $|f(x)| < \sqrt{\varepsilon}$.
Second, there is a δ2 > 0 such that 0 < |x − p| < δ2 implies $|g(x)| < \sqrt{\varepsilon}$.
Let δ = min{δ1, δ2}. Then
\[ 0 < |x - p| < \delta \implies |f(x)g(x)| < \sqrt{\varepsilon}\,\sqrt{\varepsilon} = \varepsilon. \]

4. Take any ε > 0.
First, there is a δ1 > 0 such that 0 < |x − p| < δ1 implies $\frac{1}{2} < h(x) < \frac{3}{2}$.
Second, there is a δ2 > 0 such that 0 < |x − p| < δ2 implies $|f(x)| < \frac{\varepsilon}{2}$.
Let δ = min{δ1, δ2}. Then $0 < |x - p| < \delta \implies \left| \dfrac{f(x)}{h(x)} \right| < \dfrac{\varepsilon/2}{1/2} = \varepsilon$. □

Now we take up the general situation. We are able to reduce the calculations
to the cases considered in the lemma.

Theorem 3.1.15 (Algebra of Limits). Let f, g be real functions such that
$\lim_{x \to p} f(x) = M$ and $\lim_{x \to p} g(x) = N$. Then

1. $\lim_{x \to p} c\,f(x) = cM$ (c ∈ R),

2. $\lim_{x \to p} (f(x) + g(x)) = M + N$,

3. $\lim_{x \to p} (f(x) - g(x)) = M - N$,

4. $\lim_{x \to p} f(x)g(x) = MN$,

5. $\lim_{x \to p} \dfrac{f(x)}{g(x)} = \dfrac{M}{N}$ (N ≠ 0).

Proof. We use the equivalence $\lim_{x \to p} F(x) = K \iff \lim_{x \to p} (F(x) - K) = 0$.

1. $\lim_{x \to p} \big( c\,f(x) - cM \big) = \lim_{x \to p} c\,\big( f(x) - M \big) = 0$. (By part 1 of Lemma 3.1.14)

2. $\lim_{x \to p} \big( (f(x) + g(x)) - (M + N) \big) = \lim_{x \to p} \big( (f(x) - M) + (g(x) - N) \big) = 0$.
(By part 2 of Lemma 3.1.14)

3. Combine parts 1 and 2 of this theorem, using c = −1.

4. We use part 3 of Lemma 3.1.14 and parts 1, 2, 3 of this theorem:
\begin{align*}
\lim_{x \to p} \big( f(x)g(x) - MN \big) &= \lim_{x \to p} \big( [f(x) - M][g(x) - N] + M g(x) + N f(x) - 2MN \big) \\
&= \lim_{x \to p} [f(x) - M][g(x) - N] + \lim_{x \to p} (M g(x)) + \lim_{x \to p} (N f(x)) - \lim_{x \to p} 2MN \\
&= 0 + MN + NM - 2MN = 0.
\end{align*}

5. Due to part 4 of this theorem, it is enough to prove that $\lim_{x \to p} \dfrac{1}{g(x)} = \dfrac{1}{N}$.
\[ \lim_{x \to p} \Big( \frac{1}{g(x)} - \frac{1}{N} \Big) = \lim_{x \to p} \frac{N - g(x)}{N g(x)} = \frac{1}{N} \lim_{x \to p} \frac{1 - g(x)/N}{g(x)/N} = 0. \]
(Part 4 of Lemma 3.1.14, applied to the numerator 1 − g(x)/N, which has limit 0, and the denominator g(x)/N, which has limit 1.) □

Let us take up some applications to limit calculations:

Example 3.1.16. Calculate $\lim_{x \to 2} (x^2 + 9)$.

By part 2 of Algebra of Limits, we have $\lim_{x \to 2} (x^2 + 9) = \lim_{x \to 2} x^2 + \lim_{x \to 2} 9 = \lim_{x \to 2} x^2 + 9$.

By part 4 we have $\lim_{x \to 2} x^2 = (\lim_{x \to 2} x)(\lim_{x \to 2} x) = 2 \cdot 2 = 4$.
Hence $\lim_{x \to 2} (x^2 + 9) = 4 + 9 = 13$. □

Example 3.1.17. Calculate $\lim_{x \to 2} (7x)^9$.

By part 1 of Algebra of Limits, we have $\lim_{x \to 2} (7x)^9 = 7^9 \lim_{x \to 2} x^9$.

By part 4 we have $\lim_{x \to 2} x^9 = (\lim_{x \to 2} x) \cdots (\lim_{x \to 2} x) = (\lim_{x \to 2} x)^9 = 2^9$.
Hence $\lim_{x \to 2} (7x)^9 = 7^9\, 2^9 = 14^9$. □

Example 3.1.18. Calculate $\lim_{x \to 1} \dfrac{(x - 1)^2}{x^2 - 1}$.

The limit of the denominator is $\lim_{x \to 1} (x^2 - 1) = \lim_{x \to 1} x^2 - \lim_{x \to 1} 1 = 1^2 - 1 = 0$.
So we can't apply the rule for ratios. However, we can first simplify the
expression and remove this obstacle:

\[ \lim_{x \to 1} \frac{(x - 1)^2}{x^2 - 1} = \lim_{x \to 1} \frac{(x - 1)^2}{(x - 1)(x + 1)} = \lim_{x \to 1} \frac{x - 1}{x + 1}. \]

The cancellation in the last step is allowed because when we calculate $\lim_{x \to 1}$ we
work with x ≠ 1 and hence x − 1 ≠ 0. This simplified form is easily dealt with:

\[ \lim_{x \to 1} (x - 1) = 0 \ \text{ and } \ \lim_{x \to 1} (x + 1) = 2 \implies \lim_{x \to 1} \frac{x - 1}{x + 1} = \frac{0}{2} = 0. \qquad \square \]
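A quick numerical look at the original quotient agrees with this answer. The sketch below evaluates it at points on both sides of 1, never at 1 itself, where it is undefined; the function name `q` is our own.

```python
def q(x):
    """The quotient (x - 1)^2 / (x^2 - 1); undefined at x = 1 and x = -1."""
    return (x - 1) ** 2 / (x ** 2 - 1)

for h in (0.1, 0.01, 0.001, 0.0001):
    print(1 - h, q(1 - h), 1 + h, q(1 + h))
```

The printed values shrink roughly like h/2, which is what the simplified form (x − 1)/(x + 1) predicts near x = 1.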

Task 3.1.19. Evaluate the following limits:

1. $\lim_{x \to 2} \dfrac{1}{x^2}$    2. $\lim_{x \to 3} \dfrac{x^2 - 6x + 9}{x^2 - 9}$    3. $\lim_{x \to 0} \dfrac{|x|}{x}$

Task 3.1.20. Show that if m ≤ f(x) ≤ M for all x and $\lim_{x \to a} f(x)$ exists, then
$m \le \lim_{x \to a} f(x) \le M$.

Theorem 3.1.21 (Sandwich or Squeeze Theorem). Suppose that f(x) ≤
g(x) ≤ h(x) in an interval (p − δ′, p + δ′), with δ′ > 0, except perhaps at
p. If $\lim_{x \to p} f(x) = \lim_{x \to p} h(x) = L$ then $\lim_{x \to p} g(x) = L$.

Proof. Let ε > 0.
There exists $\delta_f > 0$ such that $0 < |x - p| < \delta_f$ implies L − ε < f(x) < L + ε.
There exists $\delta_h > 0$ such that $0 < |x - p| < \delta_h$ implies L − ε < h(x) < L + ε.
Let $\delta = \min\{\delta_f, \delta_h, \delta'\}$. Now, if 0 < |x − p| < δ then:

• $\delta \le \delta_f$ =⇒ L − ε < f(x) < L + ε.

• $\delta \le \delta_h$ =⇒ L − ε < h(x) < L + ε.

• $\delta \le \delta'$ =⇒ f(x) ≤ g(x) ≤ h(x).

Combining these gives L − ε < f(x) ≤ g(x) ≤ h(x) < L + ε. Hence L − ε <
g(x) < L + ε. Therefore $\lim_{x \to p} g(x) = L$. □

Example 3.1.22. The Sandwich Theorem allows us to calculate a limit without
an exact analysis of every part of the expression. For example, suppose
we are asked about $\lim_{x \to 0} xS(x)$, where S(x) is the function defined in Example
3.1.12. We have already seen that $\lim_{x \to 0} S(x)$ does not exist. This means that the
algebra of limits cannot be applied to the product xS(x). On the other hand,
since S(x) takes values between ±1 it follows that xS(x) takes values between
±x, and this suggests that we take the help of the Sandwich Theorem.

[Figure: the graph of xS(x) for −1 ≤ x ≤ 1; xS(x) lies between −x and x.]

In order to avoid the x > 0 and x < 0 cases we work with |xS(x)|:

0 ≤ |S(x)| ≤ 1 =⇒ 0 ≤ |xS(x)| ≤ |x|.

Since $\lim_{x \to 0} |x| = 0$, the Sandwich Theorem implies that $\lim_{x \to 0} |xS(x)| = 0$. Hence
we have $\lim_{x \to 0} xS(x) = 0$. □
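The squeeze can also be watched numerically. The sketch below implements S only for 0 < x ≤ 1, which is our own restriction made to keep the code short, by locating the interval [1/(n+1), 1/n] that contains x and interpolating linearly between the prescribed values at its endpoints.

```python
import math

def S(x):
    """Piecewise-linear S with S(1/n) = (-1)^n, for 0 < x <= 1 only
    (an assumption of this sketch; S(0) and negative x are not handled)."""
    if not 0 < x <= 1:
        raise ValueError("this sketch only covers 0 < x <= 1")
    n = math.floor(1 / x)               # 1/(n+1) < x <= 1/n, up to rounding
    left, right = 1 / (n + 1), 1 / n
    t = (x - left) / (right - left)     # position of x inside [left, right]
    return (-1) ** (n + 1) + t * ((-1) ** n - (-1) ** (n + 1))

for x in (1.0, 0.5, 0.3, 0.01, 0.0001):
    print(x, S(x), x * S(x))
```

S itself keeps swinging between −1 and 1 as x shrinks, while the product x S(x) is trapped between −x and x.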

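Example 3.1.23. Let a > 0 and consider $\lim_{x \to a} \sqrt{x}$. The natural guess for this
limit is $\sqrt{a}$. To confirm this, we calculate as follows:

\[ 0 \le |\sqrt{x} - \sqrt{a}| = \frac{|x - a|}{\sqrt{x} + \sqrt{a}} \le \frac{|x - a|}{\sqrt{a}}. \]

We have $\lim_{x \to a} \dfrac{|x - a|}{\sqrt{a}} = 0$. Hence, by the Sandwich Theorem, $\lim_{x \to a} |\sqrt{x} - \sqrt{a}| = 0$,
and therefore $\lim_{x \to a} \sqrt{x} = \sqrt{a}$ by Theorems 3.1.6 and 3.1.8. □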

One-sided Limits
We say that $\lim_{x \to p+} f(x) = L$ if for every ε > 0 there is a corresponding δ > 0
such that 0 < x − p < δ =⇒ |f(x) − L| < ε. The quantity $\lim_{x \to p+} f(x)$ is called
the right-hand limit of f at p.
We say that $\lim_{x \to p-} f(x) = L$ if for every ε > 0 there is a corresponding
δ > 0 such that 0 < p − x < δ =⇒ |f(x) − L| < ε. The quantity $\lim_{x \to p-} f(x)$ is
called the left-hand limit of f at p.

The right-hand limit at p can be considered if there is an α > 0 such
that (p, p + α) is in the domain of f. The left-hand limit needs an α > 0
such that (p − α, p) is in the domain.

[Figure: two panels. Right-hand Limit: inputs between p and p + δ give outputs between L − ε and L + ε. Left-hand Limit: inputs between p − δ and p give outputs between L − ε and L + ε.]
Task 3.1.24. Evaluate the following one-sided limits:
1. $\lim_{x \to p+} C$,    3. $\lim_{x \to 1+} [x]$,    5. $\lim_{x \to 0+} \dfrac{|x|}{x}$,

2. $\lim_{x \to p-} C$,    4. $\lim_{x \to 1-} [x]$,    6. $\lim_{x \to 0+} \sqrt{x}$.

Theorem 3.1.25. $\lim_{x \to p} f(x) = L$ if and only if $\lim_{x \to p+} f(x) = \lim_{x \to p-} f(x) = L$.

Proof. Suppose $\lim_{x \to p} f(x) = L$. Let ε > 0. Then there is a δ > 0 such that
0 < |x − p| < δ =⇒ |f(x) − L| < ε. The same δ works for $\lim_{x \to p+} f(x) = L$ and
$\lim_{x \to p-} f(x) = L$.

Next, suppose $\lim_{x \to p+} f(x) = \lim_{x \to p-} f(x) = L$. Let ε > 0. Then there is a
δ1 > 0 such that 0 < x − p < δ1 =⇒ |f(x) − L| < ε. There is also a δ2 > 0
such that 0 < p − x < δ2 =⇒ |f(x) − L| < ε. Then δ = min{δ1, δ2} works for
$\lim_{x \to p} f(x) = L$:

0 < |x − p| < δ =⇒ 0 < x − p < δ or 0 < p − x < δ
               =⇒ 0 < x − p < δ1 or 0 < p − x < δ2
               =⇒ |f(x) − L| < ε. □

For many functions, this characterization is a useful way to calculate limits,
or to show they do not exist.

Example 3.1.26. Consider the Heaviside step function
\[ H(x) = \begin{cases} 0, & x < 0, \\ 1, & x \ge 0. \end{cases} \]

We calculate the one-sided limits at zero:

\[ \lim_{x \to 0+} H(x) = \lim_{x \to 0+} 1 = 1, \]
\[ \lim_{x \to 0-} H(x) = \lim_{x \to 0-} 0 = 0. \]

Since the one-sided limits are not equal, $\lim_{x \to 0} H(x)$ does not exist. □
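The two one-sided behaviours are easy to see by sampling H on either side of zero. This is only an illustration with sample points of our own choosing; the limits themselves come from the definitions above.

```python
def H(x):
    """Heaviside step function of Example 3.1.26."""
    return 0 if x < 0 else 1

right = [H(10.0 ** -k) for k in range(1, 8)]      # x approaching 0 from the right
left = [H(-(10.0 ** -k)) for k in range(1, 8)]    # x approaching 0 from the left
print(right)
print(left)
```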

Task 3.1.27. Confirm that the Algebra of Limits and the Sandwich Theorem
also hold for one-sided limits.

Exercises for §3.1

1. Suppose $\lim_{x \to p} |f(x)| = L$. Can we conclude that $\lim_{x \to p} f(x) = \pm L$?

2. Consider the function $f(x) = x^3$. For the given a and ε, find δ > 0 such
that 0 < |x − a| < δ implies |f(x) − f(a)| < ε:

(a) a = 0, ε = 1,      (c) a = 1, ε = 0.1,

(b) a = 0, ε = 0.1,    (d) a = 2, ε = 0.1.

3. Use mathematical induction to prove that if p(x) is a polynomial then
$\lim_{x \to a} p(x) = p(a)$.

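4. Compute the limits, explaining which theorem you are using for each step:

(a) $\lim_{x \to 2} \dfrac{1}{x^2}$    (e) $\lim_{x \to 0} \dfrac{1 - \sqrt{1 - x^2}}{x^2}$

(b) $\lim_{x \to 2} \dfrac{x^2 - 4}{x - 2}$    (f) $\lim_{x \to 1} \dfrac{x^3 - 1}{x - 1}$

(c) $\lim_{h \to 0} \dfrac{(t + h)^2 - t^2}{h}$    (g) $\lim_{t \to 0} \Big( \dfrac{1}{t} - \dfrac{1}{t^2 + t} \Big)$

(d) $\lim_{x \to -4} \dfrac{x^2 + 5x + 4}{x^2 + 3x - 4}$    (h) $\lim_{x \to 7} \dfrac{\sqrt{x + 2} - 3}{x - 7}$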

5. Show by means of examples that:

(a) $\lim_{x \to a} (f(x) + g(x))$ may exist even when neither $\lim_{x \to a} f(x)$ nor $\lim_{x \to a} g(x)$
exists.

(b) $\lim_{x \to a} (f(x)g(x))$ may exist even when neither $\lim_{x \to a} f(x)$ nor $\lim_{x \to a} g(x)$
exists.

6. Suppose $\lim_{x \to 2} \dfrac{f(x) - 5}{x - 2} = 3$. What can you say about $\lim_{x \to 2} f(x)$?

7. Compute the following limits:

(a) $\lim_{x \to 0} x \exp(x^2)$,    (c) $\lim_{x \to a} \log x$,

(b) $\lim_{x \to 1} \log x$,    (d) $\lim_{x \to 0+} x^2 \log x$.

8. Let n ∈ N. Show that

(a) $\lim_{x \to a} x^{1/n} = a^{1/n}$ if a > 0,    (b) $\lim_{x \to 0+} x^{1/n} = 0$.

9. Compute the following limits:

(a) $\lim_{h \to 0} \dfrac{\log(1 + h)}{h} = 1$,    (b) $\lim_{h \to 0} \dfrac{\log(x + h) - \log(x)}{h} = \dfrac{1}{x}$.

10. Suppose $\lim_{x \to 0+} f(x) = L$. Show the following:

(a) If f is even then $\lim_{x \to 0-} f(x) = L$.

(b) If f is odd then $\lim_{x \to 0-} f(x) = -L$.

11. Suppose $\lim_{x \to a} f(x) = 0$ and g(x) is a bounded function defined on an open
interval that includes a. Show that $\lim_{x \to a} f(x)g(x) = 0$.

3.2 Continuity
A function f is said to be continuous at p if
\[ \lim_{x \to p} f(x) = f(p). \]
Alternately, f is continuous at p if for every ε > 0 there is a corresponding
δ > 0 such that |x − p| < δ =⇒ |f(x) − f(p)| < ε.

[Figure: the graph of f near p, with inputs between p − δ and p + δ and outputs between f(p) − ε and f(p) + ε.]

The concept of continuity is only to be applied to points which are in
the domain of f. In fact they need to be in an open interval which is
completely contained in the domain of f, so that the limit can be talked
about.

Examples of functions which are not continuous at 0 because their limit
does not exist at 0:

\[ H(x) = \begin{cases} 0, & x < 0, \\ 1, & x \ge 0, \end{cases} \qquad \operatorname{sgn}(x) = \begin{cases} \dfrac{x}{|x|}, & x \ne 0, \\ 0, & x = 0. \end{cases} \]

Examples of functions which are not continuous at 0 because their limit at
0 does not equal their value at 0:

\[ D(x) = \begin{cases} 0, & x \ne 0, \\ 1, & x = 0, \end{cases} \qquad E(x) = \begin{cases} x, & x = 1/n,\ n \in \mathbb{N}, \\ 1, & x = 0, \\ 0, & \text{else}. \end{cases} \]

(In both cases the limit is 0 but the function value is 1.)

Examples of functions which are continuous at every point of R:

f (x) = C, g(x) = x, h(x) = |x|.

On the other extreme, the Dirichlet function is not continuous at any point!

Using the Algebra of Limits we can conclude the following:

Theorem 3.2.1. Let f(x) and g(x) be continuous at p. Then the following are
also continuous at p:
1. C f(x),    3. f(x)g(x),
2. f(x) ± g(x),    4. $\dfrac{f(x)}{g(x)}$ (if g(p) ≠ 0).

Proof. We prove the last claim. The others are left as an exercise for the reader.

First note that $\lim_{x \to p} g(x) = g(p) \ne 0$, by continuity of g(x) at x = p and
the given condition that g(p) ≠ 0. So, by the Algebra of Limits,

\[ \lim_{x \to p} \frac{f(x)}{g(x)} = \frac{\lim_{x \to p} f(x)}{\lim_{x \to p} g(x)} = \frac{f(p)}{g(p)}. \qquad \square \]
x→p

Theorem 3.2.2. Any polynomial is continuous at every point of R.

Proof. Let $a_0, \dots, a_n \in \mathbb{R}$. The functions $y = a_0$ and y = x are continuous. By
repeated application of part 3 of Theorem 3.2.1, every function $y = x^i$ (i ∈ N)
is continuous. By part 1, every function $y = a_i x^i$ is continuous. So by part 2,
$\sum_{i=0}^{n} a_i x^i$ is continuous. □

Recall that a rational function has the form p(x)/q(x) where p(x) and
q(x) are polynomials and q(x) is not the zero polynomial. The domain of this
rational function consists of all real numbers x where q(x) ≠ 0. Recall that q(x)
has only finitely many zeroes. Hence each point of the domain is the center of
an open interval which is contained in the domain, and we can talk about the
function's limit at each point in the domain.

Theorem 3.2.3. A rational function is continuous at every point of its domain.

Proof. Combine continuity of polynomials with part 4 of Theorem 3.2.1. □

One-sided Continuity
A function f is said to be left-continuous at p if $\lim_{x \to p-} f(x) = f(p)$. It is called
right-continuous at p if $\lim_{x \to p+} f(x) = f(p)$.

Example 3.2.4. The greatest integer function is right-continuous at every
point. It is left-continuous at all points except the integers.

[Figure: the graph of the greatest integer function for −2 ≤ x ≤ 2.]

Theorem 3.2.5. A function f is continuous at p if and only if it is left and
right-continuous at p.

Proof. $\lim_{x \to p} f(x) = f(p) \iff \lim_{x \to p+} f(x) = f(p)$ and $\lim_{x \to p-} f(x) = f(p)$. □

For example, we can argue that the Heaviside step function H(x) is not
continuous at x = 0 because it is right continuous but not left continuous.

Our recognition of different ways in which a discontinuity can happen leads
to the following classification:

Removable discontinuity: $\lim_{x \to a} f(x)$ exists but does not equal f(a). We can
make f continuous at a by changing its value at a to $\lim_{x \to a} f(x)$.

Jump discontinuity: $\lim_{x \to a+} f(x)$ and $\lim_{x \to a-} f(x)$ exist but are not equal. The
quantity $\lim_{x \to a+} f(x) - \lim_{x \to a-} f(x)$ is called the jump of f at a.

Essential discontinuity: Either $\lim_{x \to a+} f(x)$ or $\lim_{x \to a-} f(x)$ fails to exist.

A function f is called continuous on an interval I if

1. f is continuous at every interior point of I,

2. f is right-continuous at the left endpoint, if the left endpoint is in I,

3. f is left-continuous at the right endpoint, if the right endpoint is in I.

Continuity of Compositions
Theorem 3.2.6. Let f and g be real functions such that their composition g ∘ f
is defined on an interval (a, b). Let p ∈ (a, b) with $q = \lim_{x \to p} f(x)$ and suppose g
is continuous at q. Then
\[ \lim_{x \to p} g(f(x)) = g(q) = g\big( \lim_{x \to p} f(x) \big). \]

Proof. Let ε > 0. Since g is continuous at q there is a δ′ > 0 such that |y − q| < δ′
implies |g(y) − g(q)| < ε. And there is a δ > 0 such that 0 < |x − p| < δ implies
|f(x) − q| < δ′. Hence

0 < |x − p| < δ =⇒ |f(x) − q| < δ′ =⇒ |g(f(x)) − g(q)| < ε. □


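Example 3.2.7. Calculate $\lim_{x \to 1} \sqrt{\dfrac{x^2 - 1}{x - 1}}$.
We first note that $\lim_{x \to 1} \dfrac{x^2 - 1}{x - 1} = 2$. Since the square root function is continuous
at 2 (we proved that for a > 0, $\lim_{x \to a} \sqrt{x} = \sqrt{a}$), we have
\[ \lim_{x \to 1} \sqrt{\frac{x^2 - 1}{x - 1}} = \sqrt{\lim_{x \to 1} \frac{x^2 - 1}{x - 1}} = \sqrt{2}. \qquad \square \]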
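As a sanity check, the expression can be evaluated at points near 1 and compared with the square root of 2. This is a numerical sketch only; the function name `g_of_f` is our own, and the point x = 1 itself is excluded since the quotient is undefined there.

```python
import math

def g_of_f(x):
    """sqrt((x^2 - 1)/(x - 1)); undefined at x = 1."""
    return math.sqrt((x * x - 1) / (x - 1))

for h in (0.1, 0.001, 0.000001):
    print(h, g_of_f(1 - h), g_of_f(1 + h), math.sqrt(2))
```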

Theorem 3.2.8. Let f and g be real functions such that their composition g ∘ f
is defined on an interval (a, b). Let p ∈ (a, b) such that f is continuous at p and
g is continuous at f(p). Then g ∘ f is continuous at p.

Proof. $\lim_{x \to p} g(f(x)) = g\big( \lim_{x \to p} f(x) \big) = g(f(p))$. □
x→p

Monotone Functions
Theorem 3.2.9. If I, J are intervals and f : I → J is a surjective monotone
function, then f is continuous on I.
Proof. We'll do the case when J is an open interval. For other intervals we have
to do a similar analysis at any end-points that are included in J.
Let $x_0 \in I$ and let ε > 0. We may assume that $f(x_0) \pm \varepsilon \in J$, by shrinking
ε if necessary. (A δ that works for a smaller ε will also work for the original
one.)

The surjectivity of f implies that there are $x_\pm \in I$ such that $f(x_-) = f(x_0) - \varepsilon$
and $f(x_+) = f(x_0) + \varepsilon$.

[Figure: the graph of f near $x_0$, with $x_-$ and $x_+$ mapped to $f(x_0) - \varepsilon$ and $f(x_0) + \varepsilon$.]

Take $\delta = \min\{x_0 - x_-,\ x_+ - x_0\}$. □
Theorem 3.2.10. All logarithms and exponential functions are continuous.
Proof. They are strictly monotonic bijections between intervals. □
Task 3.2.11. Let r ≥ 0. Show that the function $x^r$ is continuous on [0, ∞).
Indefinite Integrals
We call a function integrable on an interval I if it is integrable on every
[a, b] with a, b ∈ I.

Suppose f is integrable on an interval I containing a. Then the function
\[ F(x) = \int_a^x f(t)\,dt \qquad (x \in I) \]
is called an indefinite integral of f.

Example 3.2.12. Calculate the indefinite integral $F(x) = \int_0^x H(t)\,dt$ for the
unit step function H(t).

\begin{align*}
x < 0 &\implies \int_0^x H(t)\,dt = -\int_x^0 H(t)\,dt = -\int_x^0 0\,dt = 0, \\
x \ge 0 &\implies \int_0^x H(t)\,dt = \int_0^x 1\,dt = x.
\end{align*}

Hence $F(x) = \begin{cases} 0, & x < 0, \\ x, & x \ge 0. \end{cases}$ □
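The closed form can be compared with a direct numerical estimate of the integral. The sketch below uses a midpoint sum with a signed step, which is our own choice of method, so that x < 0 is covered by the same code.

```python
def H(t):
    """Unit step function."""
    return 0 if t < 0 else 1

def integral_from_zero(f, x, n=1000):
    """Midpoint-sum estimate of the integral of f from 0 to x.
    The step is signed, so the estimate also works when x < 0."""
    width = x / n
    return sum(f((i + 0.5) * width) for i in range(n)) * width

def F(x):
    """Closed form found in Example 3.2.12."""
    return 0 if x < 0 else x

for x in (-2, -0.5, 0, 0.5, 3):
    print(x, integral_from_zero(H, x), F(x))
```

Although H jumps at 0, the values of F join up there, which is the behaviour that Theorem 3.2.14 describes in general.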
Task 3.2.13. Show that any two indefinite integrals $F(x) = \int_a^x f(t)\,dt$ and
$G(x) = \int_b^x f(t)\,dt$ of the same function differ only by a constant.
Theorem 3.2.14. Let f(x) be integrable on an interval I and let a ∈ I. Then
the indefinite integral $F(x) = \int_a^x f(t)\,dt$ is continuous on I.
Proof. For any x, p ∈ I we have
\[ F(x) - F(p) = \int_a^x f(t)\,dt - \int_a^p f(t)\,dt = \int_p^x f(t)\,dt. \]

Suppose p is not the right endpoint of I. We shall establish the right-continuity
of F at p. First, there is a δ > 0 such that [p, p + δ] ⊆ I. Since f is
integrable on [p, p + δ] it is bounded there. Hence there is a positive number M
such that −M ≤ f(x) ≤ M for every x ∈ [p, p + δ]. Now for p < x < p + δ,

\[ -M(x - p) = \int_p^x (-M)\,dt \le \int_p^x f(t)\,dt \le \int_p^x M\,dt = M(x - p). \]

Therefore,
−M(x − p) ≤ F(x) − F(p) ≤ M(x − p),
and so 0 ≤ |F(x) − F(p)| ≤ M|x − p|. By the Sandwich Theorem, we have
$\lim_{x \to p+} |F(x) - F(p)| = 0$. Therefore $\lim_{x \to p+} F(x) = F(p)$.

Similarly, we check that if p is not the left endpoint of I then F is left-continuous
at p. This establishes the continuity of F on I. □

Exercises for §3.2


1. Identify the points at which the given functions are not continuous. Check
one-sided continuity at those points, and also classify the discontinuity as re-
movable, jump, or essential.

(a) f(x) = x − [x],
(b) g(x) = sgn(x) exp(x),
(c) h(x) = S(x),
(d) k(x) = H(x)S(x).

(S is the function defined in Example 3.1.12. H is the Heaviside step function.)

2. Compute the following limits:

(a) $\lim_{x\to 1+} \sqrt{\log x}$,
(b) $\lim_{x\to 0} \log\sqrt{x^2+1}$,
(c) $\lim_{h\to 0} (1+h)^{1/h}$,
(d) $\lim_{x\to 0+} x^{x^2}$.

3. Consider f : R → R defined by f(x) = x if x is rational and f(x) = 0 if x
is irrational.

(a) Show that f is continuous only at x = 0.

(b) Create a function g : R → R which is continuous only at x = 0 and
x = 1.
4. Suppose f : (a, b) → R is a monotone function. Show that the left and right
limits of f exist at every point of (a, b). Thus any discontinuity of f is a jump
discontinuity. (Hint: Use the concepts of sup and inf.)

5. Is the following statement correct? If f ∘ g is defined, $\lim_{x\to p} g(x) = q$
and $\lim_{y\to q} f(y) = L$, then $\lim_{x\to p} (f \circ g)(x) = L$.

6. Prove Theorem 3.2.9 for the case J = (a, b].


7. Prove that every monotone and continuous function f : [a, b] → R has
the `Intermediate Value Property': If L lies strictly between f (a) and f (b) then
there is a c ∈ (a, b) such that f (c) = L. (Hint: First review the proof of Theorem
2.2.13)
8. Compute and graph the indefinite integral $\int_0^x f(t)\,dt$ of each f(t):

(a) sgn(t), (b) [t], (c) H(t)t, (d) t^2.

3.3 Intermediate Value Theorem


We have encountered and applied the `Intermediate Value Property'. A function
f is said to have this property if, whenever f(a) < L < f(b), there is a c between
a and b with f(c) = L. That is, f assumes all intermediate values between any
two of its values. In particular, we saw in Theorem 2.2.13 that the indefinite
integrals of monotone functions have this property, and this established that
the natural logarithm is a surjective function.

Now that we have learned that indefinite integrals are continuous, it is
natural to ask if all continuous functions have this property. We start by giving
a positive answer when L = 0.
Theorem 3.3.1 (Intermediate Value Theorem, ver. 1). Suppose f is contin-
uous on [a, b] and f (a)f (b) < 0. Then there is a number c ∈ (a, b) such that
f (c) = 0.

Proof. Assume that f(x) is never zero. First, let a0 = a and b0 = b. Let c0 be
the midpoint of [a0, b0]. Since f(c0) ≠ 0 and f(a0), f(b0) have opposite signs,
exactly one of the products f(a0)f(c0) and f(b0)f(c0) is negative. Define

\[ [a_1, b_1] = \begin{cases} [a_0, c_0] & \text{if } f(a_0)f(c_0) < 0, \\ [c_0, b_0] & \text{if } f(b_0)f(c_0) < 0. \end{cases} \]

[Figure: the signs of f at a0, c0 and b0, and the two possible choices of [a1, b1]
according to whether f(c0) > 0 or f(c0) < 0.]

We have f(a1)f(b1) < 0, so we repeat this process with [a1, b1] replacing [a0, b0].
Proceeding in this manner, we find a sequence of intervals [an, bn] such that

[a0 , b0 ] ⊃ [a1 , b1 ] ⊃ [a2 , b2 ] ⊃ · · ·

The endpoints of these intervals are arranged as follows:

a0 ≤ a1 ≤ a2 ≤ · · · ≤ b2 ≤ b1 ≤ b0 .
From the Completeness Axiom we obtain a number c such that an ≤ c ≤ bn
for every n.
Suppose f(c) > 0. By continuity, there is a δ > 0 such that x ∈ (c − δ, c + δ)
implies f(x) > 0.
Now, note that $b_n - a_n = \frac{b-a}{2^n} < \frac{b-a}{n}$.
Hence, by the Archimedean Property, there exists N such that bN − aN < δ.
Since c ∈ [aN, bN], this implies [aN, bN] ⊂ (c − δ, c + δ).

[Figure: the interval [aN, bN], which contains c, lying inside (c − δ, c + δ).]

We have a contradiction since f changes sign on [aN, bN] but not on (c − δ, c + δ).
The f(c) < 0 case similarly leads to a contradiction.

Hence f must be zero at some point in (a, b). □

The Intermediate Value Theorem is very useful for showing the existence of
special numbers. For example, suppose we wish to show that a certain equation
has a solution. By moving all terms to one side of the equality, we put it in
the form f (x) = 0. If f is continuous we can try to use the Intermediate Value
Theorem.

Example 3.3.2. Consider the equation x^4 + 4x^3 + x^2 − 6x − 1 = 0. Since the LHS
is a polynomial of degree 4 this equation has at most 4 distinct real solutions,
but it may have fewer, or even none. Let us see how many the Intermediate
Value Theorem can help us to locate. We start by calculating the values of
f(x) = x^4 + 4x^3 + x^2 − 6x − 1 at various points:

x       −4    −3    −2    −1     0     1     2
f(x)    39    −1    −1     3    −1    −1    39

By tracking the sign changes of f(x) we see there are solutions in the intervals
(−4, −3), (−2, −1), (−1, 0) and (1, 2). Since there are at most four solutions,
we have located all of them. We can shrink these intervals further
by employing the bisection method. For example, let us consider the
solution that lies in (1, 2). We find the value of f(x) at the midpoint of (1, 2):
f(1.5) = 10.8. Therefore the solution is in (1, 1.5). This process can be repeated
indefinitely for greater accuracy:

f(1.25) = 3.3 ⟹ solution is in (1, 1.25).
f(1.125) = 0.81 ⟹ solution is in (1, 1.125).
f(1.0625) = −0.17 ⟹ solution is in (1.0625, 1.125).

If we take the next midpoint 1.094 to be an approximate solution, we know
it is accurate to within about ±0.03.
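The halving steps of this example can be automated. The following is a minimal sketch of the bisection method under the hypotheses of Theorem 3.3.1; the function name `bisect` and its `steps` parameter are illustrative choices, not notation from the text.

```python
def f(x):
    """Left-hand side of the equation in Example 3.3.2."""
    return x**4 + 4 * x**3 + x**2 - 6 * x - 1


def bisect(func, a, b, steps):
    """Halve [a, b] `steps` times, always keeping a sign change of func inside."""
    fa, fb = func(a), func(b)
    if fa * fb >= 0:
        raise ValueError("func must have opposite signs at a and b")
    for _ in range(steps):
        c = (a + b) / 2
        fc = func(c)
        if fc == 0:
            return c, c          # landed exactly on a solution
        if fa * fc < 0:
            b, fb = c, fc        # the sign change is in the left half
        else:
            a, fa = c, fc        # the sign change is in the right half
    return a, b


print(bisect(f, 1.0, 2.0, 4))    # (1.0625, 1.125), the last interval found above
print(bisect(f, 1.0, 2.0, 40))   # a much narrower interval around the same solution
```

Each step halves the length of the interval, so after n steps the midpoint is within (b − a)/2^(n+1) of a solution.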

The Intermediate Value Theorem can be proved for arbitrary L by just
vertically shifting the function.

Theorem 3.3.3 (Intermediate Value Theorem, ver. 2). Suppose f is continuous
on [a, b] and L is a value between f(a) and f(b), i.e., f(a) < L < f(b) or
f(b) < L < f(a). Then there is a number c ∈ (a, b) such that f(c) = L.

Proof. Suppose f(a) < L < f(b). Define g : [a, b] → R by g(x) = f(x) − L.
Then g is continuous, g(a) = f(a) − L < 0 and g(b) = f(b) − L > 0. Hence, by
Theorem 3.3.1, there is a number c ∈ (a, b) such that g(c) = 0, and
f(c) = g(c) + L = L.

The case f(b) < L < f(a) is dealt with in a similar manner. □

We have concentrated on continuity as a guarantor for the intermediate
value property. We must keep in mind that the domain is also
important. As an example, consider the signum function on the domain
[−1, 0) ∪ (0, 1]. It is continuous, has opposite signs at ±1, but is
never zero. Care is needed because we often describe a function by its
rule and omit giving the domain.

Exercises for §3.3

1. Use the Intermediate Value Theorem to find three disjoint intervals, each
of which contains a solution of 4x^3 − 6x^2 − 6x + 2 = 0.
2. Is there a number which is one more than its cube?

3. Use the Bisection Method to approximate $\sqrt[3]{7}$ to within two decimal places.
4. Fill in the details of this alternate proof of the Intermediate Value Theorem:

(a) Assuming f (a) < 0 < f (b), let A = {x ∈ [a, b] : f (x) < 0}. Show that
c = sup(A) exists.
(b) Show that f (c) > 0 and f (c) < 0 lead to contradictions.

What are the relative merits and demerits of this proof and the original one?

5. Consider the function S of Example 3.1.12, which we know is not contin-


uous. Show that it has the intermediate value property.

6. Can there be a non-constant continuous function f : R → Q?


7. Let f : [a, b] → R be continuous and c1, ..., cn ∈ [a, b]. Then there is a
point c ∈ [a, b] such that $f(c) = \frac{f(c_1) + \cdots + f(c_n)}{n}$.
8. Prove the following fixed point theorem: If f : [0, 1] → [0, 1] is continuous
then there is a c ∈ [0, 1] such that f(c) = c.

9. Suppose f, g : [a, b] → R are continuous functions such that f (a) > g(a)
and f (b) < g(b). Show that there is a c ∈ (a, b) such that f (c) = g(c).
10. Suppose f : [0, 2] → R is a continuous function with f (0) = f (2). Show
that there are a, b ∈ [0, 2] such that b − a = 1 and f (a) = f (b).
11. Let f : [a, b] → R be a continuous and injective function. Assume that
f (a) < f (b).
(a) Show that f (a) is the minimum value of f and f (b) is the maximum
value of f. Hence the image of f is [f (a), f (b)].

(b) Show that f is strictly increasing.

(c) Show that f : [a, b] → [f (a), f (b)] has an inverse function which is also
strictly increasing and continuous.

12. Show that there cannot be a continuous bijection f : (0, 1) → [0, 1].
13. The following tasks will establish that a cubic polynomial
p(x) = x^3 + ax^2 + bx + c has at least one real root.

(a) Show that for x ≥ 1, $p(x) \ge x^3\left(1 - \frac{|a| + |b| + |c|}{x}\right)$. Hence there is an
x1 with p(x1) > 0.

(b) Show that for x ≤ −1, $p(x) \le x^3\left(1 + \frac{|a| + |b| + |c|}{x}\right)$. Hence there is
an x2 with p(x2) < 0.

14. Prove that every polynomial of odd degree has at least one real root.
Hence the image of such a function is all of R.
15. Prove that if a polynomial p(x) = x^n + a_{n−1} x^{n−1} + · · · + a_0 of even degree
has a negative value then it has a real root.

16. Suppose f : [−1, 1] → [−1, 1] is continuous and satisfies x^2 + f(x)^2 = 1
for every x. Prove that either $f(x) = \sqrt{1 - x^2}$ for every x or $f(x) = -\sqrt{1 - x^2}$
for every x.

3.4 Trigonometric Functions


Let us review the geometric definitions of cos t and sin t with t ∈ R. For this,
we first need to describe angles and their measurement. We define an angle to
be a region bounded by two rays with a common starting point.

To measure the angle we draw a unit circle whose centre is the meeting point
of the rays. We take twice the area enclosed by this circle within the angle, and
call that the radian measure of the angle. Thus the full circle corresponds to
2π radians while a right angle corresponds to π/2 radians. (See the discussion
of π on page 68.)
The usual definition of radian is to take the length of the arc of unit
radius cut by the angle. However, we haven't taken up lengths of curves
yet and have to work with areas. We use twice the area to keep our
definition compatible with the arc length approach. The association of
radians with arc length is achieved later in Example 6.4.8.

At this point, we have associated a real number between 0 and 2π to each angle.
We would like to be assured that every such number is the radian measure of
some angle. Then we shall have a perfect identification of physical angles with
radian measures.

First consider any number x between −1 and 1. We create an angle
corresponding to x as shown below:

[Figure: two unit-circle diagrams, one for each sign of x, showing the angle
determined by x, with the lengths x and √(1 − x^2) marked.]

Let R(x) be the radian measure of this angle. We now have a function
R : [−1, 1] → [0, 2π] defined by

\[ R(x) = x\sqrt{1 - x^2} + 2\int_x^1 \sqrt{1 - t^2}\,dt. \]

Now R is continuous, R(−1) = π and R(1) = 0. By the Intermediate Value
Theorem, R takes every value between 0 and π. Thus every number between 0
and π is the radian measure of an angle.
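The behaviour of R can be explored numerically. The sketch below estimates the integral in the formula for R by a midpoint rule, which is an assumption made only for illustration. It also prints the library value `math.acos(x)` alongside; that comparison is not made in the text and is offered only as an informal sanity check.

```python
import math


def R(x, n=20_000):
    """Midpoint-rule estimate of x*sqrt(1 - x^2) + 2 * (integral of sqrt(1 - t^2) from x to 1)."""
    h = (1.0 - x) / n
    integral = sum(math.sqrt(1.0 - (x + (i + 0.5) * h) ** 2) for i in range(n)) * h
    return x * math.sqrt(1.0 - x * x) + 2.0 * integral


# R should fall from about pi at x = -1 to 0 at x = 1.
for x in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(f"{x:5.1f}  {R(x):.6f}  {math.acos(x):.6f}")
```

The printed values decrease steadily from approximately π to 0, in line with the use of the Intermediate Value Theorem above.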

Task 3.4.1. Show that every number between π and 2π is the radian measure
of an angle.
Now that we know how to identify angles with real numbers, we are in a
position to define the trigonometric functions.

Consider the ray in the xy-plane created by rotating the positive x-axis
counterclockwise through an angle of t radians. This ray cuts the unit circle
with centre at origin at exactly one point (x, y). We then define cos t = x and
sin t = y (cos is an abbreviation of `cosine' while sin is an abbreviation of `sine').
The figures below illustrate the definitions for an acute and an obtuse angle
respectively.

[Figure: the unit circle with the point (x, y) at angle t, showing cos t and sin t,
for an acute and an obtuse angle.]

It follows from this definition that sin, cos : [0, 2π] → [−1, 1] are onto. For, let
−1 ≤ y ≤ 1. Then 1 − y^2 ≥ 0 and x = √(1 − y^2) is defined. Note that x^2 + y^2 = 1
and hence (x, y) is on the unit circle with centre at origin. Let t be the angle
between the positive x-axis and the ray emanating from origin and passing
through (x, y). Then, by the definition of the sine function, sin t = y. Therefore
sine is onto. Similarly, cosine is also onto.

Task 3.4.2. Show that sin^2 t + cos^2 t = 1 for every t ∈ [0, 2π].
Task 3.4.3. Show that sin(π/2 − t) = cos t for every t ∈ [0, π/2].
The following values of sine and cosine are obvious from the definitions:

x        0     π/2     π     3π/2    2π
sin x    0      1      0     −1      0
cos x    1      0     −1      0      1

Task 3.4.4. Show that sin(π/4) = cos(π/4) = 1/√2.

These observations indicate that the graph of cosine over the interval
[0, π/2] is likely to be as follows:

[Figure: graph of cos x on [0, π/2], falling from 1 at x = 0 through 1/√2 at
x = π/4 to 0 at x = π/2.]

As we learn more Calculus, we will be able to confirm that the graph indeed
looks like this. The following identities also follow directly from the definitions:

cos(π − t) = cos(π + t) = − cos t for every t ∈ [0, π].

With their help we can visualize the graph over the entire interval [0, 2π], using
the piece for [0, π/2] as the building block.

[Figure: graph of cos x on [0, 2π].]

We have similar identities for the sine function:

sin(π − t) = − sin(π + t) = sin t.

These generate the graph of the sine function:

[Figure: graph of sin x on [0, 2π].]

We notice that as the input changes from 0 to 2π, the sine and cosine functions
return to their initial values. Thus the function domain can be extended on
each side by just repeating the function values, using sin(x + 2π) = sin x and
cos(x + 2π) = cos x:

[Figure: graphs of sin x and cos x on [−2π, 4π].]

The following properties of sin, cos : R → [−1, 1] are obvious from the
definitions:

1. The sine and cosine functions are periodic, with a period of 2π:
sin(t + 2π) = sin t and cos(t + 2π) = cos t for every t.

2. We have sin^2 t + cos^2 t = 1 for every t ∈ R.

3. The sine function is odd while the cosine function is even.

We need one last property of trigonometric functions, which will be essential
in later work. This is the set of identities involving sums and differences of
angles.

[Figure: points O, P, R, A along the ray OA, points Q, B along the ray OB, and
the point C, with the angles α and β marked.]

In this figure, CP and QR are perpendicular to OA, while CQ is perpendicular to
OB. We have the following calculations:

OQ = cos β ⟹ OR = cos α cos β,
CQ = sin β ⟹ PR = sin α sin β.

Therefore,

cos(α + β) = OP = OR − PR = cos α cos β − sin α sin β.

Our figure is only valid for 0 ≤ α, β and with α + β ≤ π/2. The identity can be
extended to arbitrary α, β by other appropriate figures.

The other sum of angle identities can be obtained from this one. First,
replacing β by −β gives

cos(α − β) = cos α cos β + sin α sin β.

Alternately, replacing α by π/2 − α and β by −β gives

sin(α + β) = sin α cos β + cos α sin β.

Substituting β by −β in the last identity gives

sin(α − β) = sin α cos β − cos α sin β.



Task 3.4.5. Show that sin π/6 = cos π/3 = 1/2 and cos π/6 = sin π/3 = √3/2.
Task 3.4.6. Prove the half-angle formulas:
cos 2x = cos^2 x − sin^2 x = 1 − 2 sin^2 x = 2 cos^2 x − 1,
sin 2x = 2 cos x sin x.

Task 3.4.7. Compute numerical values of (sin x)/x for x = π/2^n, n = 1, 2, 3.
(You can use a calculator for the arithmetic operations and square roots, but
do not use the inbuilt sine and cosine functions.)

Limits and Continuity


Now we'll verify that the trigonometric functions are continuous. We compute
their limits at zero, and then use the identities for sin(x + y) and cos(x + y) to
transfer these calculations to other points.

[Figure: a circle with centre O and points P, Q on it, with angle x at O.]

In the figure, the circle has radius 1. Consider the areas of the triangle △OPQ
and the sector OPQ. The triangle is fully contained in the sector, hence has
smaller area. So for 0 < x < π/2 we have

\[ 0 < \operatorname{Area}(\triangle OPQ) < \operatorname{Area}(\text{sector } OPQ)
   \implies 0 < \tfrac{1}{2}\sin x < \tfrac{x}{2}
   \implies 0 < \sin x < x. \]

Applying the Sandwich Theorem gives $\lim_{x\to 0+} \sin x = 0$. Since sin x is an odd
function, we get
\[ \lim_{x\to 0-} \sin x = -\lim_{x\to 0+} \sin x = 0. \]

Both the one-sided limits being 0, we have $\lim_{x\to 0} \sin x = 0$.

Now use the half-angle formula:

\[ \lim_{x\to 0} \cos x = \lim_{x\to 0} \left(1 - 2\sin^2\frac{x}{2}\right) = 1. \]
The limits at 0 can be combined with the angle sum identities to compute the
limits at other points:

\[ \lim_{x\to a} \sin x = \lim_{h\to 0} \sin(a + h) = \lim_{h\to 0} [\sin a \cos h + \cos a \sin h] = \sin a, \]
\[ \lim_{x\to a} \cos x = \lim_{h\to 0} \cos(a + h) = \lim_{h\to 0} [\cos a \cos h - \sin a \sin h] = \cos a. \]

Thus the sine and cosine functions are continuous on R.

Task 3.4.8. Calculate $\lim_{x\to 1} \sin\left(\frac{x^2 - 2x + 1}{x^2 - 1}\right)$.

Let us now recall the other four trigonometric functions:

\[ \tan x = \frac{\sin x}{\cos x}, \quad \cot x = \frac{\cos x}{\sin x}, \quad \sec x = \frac{1}{\cos x}, \quad \csc x = \frac{1}{\sin x}. \]

By the properties of continuity, these functions are continuous at every
point of their domains.

Two Fundamental Limits


We have seen that 0 < x < π/2 ⟹ sin x < x. Hence,

\[ 0 < x < \pi/2 \implies \frac{\sin x}{x} < 1. \]

[Figure: the unit circle with centre O, the sector OPQ of angle x, and the
triangle ORQ containing it.]

Now, consider the unit circle drawn in the figure, and compare the areas of the
sector OPQ and the triangle △ORQ. The sector has less area than the triangle,
and so
\[ \frac{x}{2} < \frac{1}{2}\tan x, \quad\text{or}\quad \cos x < \frac{\sin x}{x}. \]
Hence, for 0 < x < π/2, we have

\[ \cos x < \frac{\sin x}{x} < 1. \]
Since $\lim_{x\to 0} \cos x = 1$, the Sandwich Theorem gives

\[ \lim_{x\to 0+} \frac{\sin x}{x} = 1. \]
Since (sin x)/x is an even function, we also get $\lim_{x\to 0-} \frac{\sin x}{x} = 1$. We have reached
\[ \lim_{x\to 0} \frac{\sin x}{x} = 1. \tag{3.1} \]

Another important limit is of (1 − cos x)/x at 0. We again use the half-angle
formula:

\[ \lim_{x\to 0} \frac{1 - \cos x}{x} = \lim_{x\to 0} \frac{2\sin^2(x/2)}{x}
   = \lim_{x\to 0} \sin(x/2) \cdot \lim_{x\to 0} \frac{\sin(x/2)}{x/2} = 0 \cdot 1 = 0. \tag{3.2} \]
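The two limits (3.1) and (3.2) can be illustrated by tabulating the ratios for small x. This is only a numerical illustration using Python's library sine and cosine, and it is no substitute for the Sandwich Theorem argument; the helper names are chosen for this sketch.

```python
import math


def sin_ratio(x):
    """The ratio (sin x)/x from limit (3.1)."""
    return math.sin(x) / x


def cos_ratio(x):
    """The ratio (1 - cos x)/x from limit (3.2)."""
    return (1 - math.cos(x)) / x


rows = [(x, math.cos(x), sin_ratio(x), cos_ratio(x)) for x in (1.0, 0.1, 0.01, 0.001)]
print("      x       cos x   (sin x)/x   (1 - cos x)/x")
for x, c, s, r in rows:
    # Each row should satisfy the sandwich cos x < (sin x)/x < 1.
    print(f"{x:7.3f}  {c:.8f}  {s:.8f}  {r:.8f}")
```

As x shrinks, the third column approaches 1 while staying between cos x and 1, and the last column approaches 0.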

[Figure: graphs of (sin x)/x and (1 − cos x)/x for −10 ≤ x ≤ 10.]

Task 3.4.9. Calculate $\lim_{x\to 1} \frac{\sin(x^2 - 1)}{x - 1}$.
At this point, all the standard continuous functions of calculus are available
to us: polynomials, rational functions, roots, real powers, exponential,
logarithm, sine, cosine.

Exercises for §3.4

1. Graph the following functions:

(a) f(x) = sin(x − 1),
(b) g(x) = sin 2x,
(c) h(x) = sin |x|,
(d) k(x) = sin(x^2),
(e) p(x) = | sin x|,
(f) q(x) = sin^2 x.

2. Compute the following limits:

(a) $\lim_{x\to 0} \frac{\sin 5x - \sin 3x}{x}$,
(b) $\lim_{x\to a} \frac{\sin x - \sin a}{x - a}$,
(c) $\lim_{x\to 0} \frac{\tan 2x}{\sin x}$,
(d) $\lim_{x\to \pi/2} (\sec x - \tan x)$,
(e) $\lim_{x\to 0} \frac{1 - \cos x}{x^2}$,
(f) $\lim_{x\to 0} \frac{1 - \cos(1 - \cos x)}{x^4}$,
(g) $\lim_{x\to 0} x \sin\frac{1}{x}$,
(h) $\lim_{x\to 1} \sin\sqrt{\frac{x^2 - 1}{x - 1}}$.

3. Graph the given function and identify where it is continuous:

(a) f(x) = sin(1/x) if x ≠ 0 and f(0) = 0,
(b) g(x) = x sin(1/x) if x ≠ 0 and g(0) = 0.

4. Show that the equation cos x = x^3 has at least one solution.

5. Consider the function tan : (−π/2, π/2) → R. Show that it is odd, strictly
increasing and surjective. Plot its graph.

6. Consider the function tan : R \ { (2n + 1)π/2 | n ∈ Z } → R. Show that it
has period π. Plot its graph.

7. Find the domains and plot the graphs of the functions cot x, sec x, csc x.
8. Prove that a linear combination A sin x + B cos x can be expressed in the
form R sin(x + φ) for some R, φ. (Hint: First do the case when A^2 + B^2 = 1.)
9. Prove the following identities:

(a) $\cos x + \cos y = 2\cos\left(\frac{x + y}{2}\right)\cos\left(\frac{x - y}{2}\right)$,
(b) $\sin x + \sin y = 2\sin\left(\frac{x + y}{2}\right)\cos\left(\frac{x - y}{2}\right)$.

10. Consider the following identities, which we have already proved:

\[ \sin^2(\pi/2 - \theta) = 1 - \sin^2\theta \quad\text{and}\quad \sin^2\theta = \frac{1 - \sin(\pi/2 - 2\theta)}{2}. \]

(a) Starting with the known values of sin θ for θ = π/6, π/4 and π/3, use
these identities to find the sine values for θ = π/12, 5π/12.

(b) Continue the above process to find the sine values for θ starting at zero
and increasing in steps of π/24 to π/2.
(This process is described in the work Pancha-Siddhantika by Varahamihira,
written in the 6th century CE. Varahamihira went one step further and
calculated in steps of π/48 or 3°45′.)
11. We will develop a rational function that is a close approximation to sin x
over the interval [0, π].

(a) Find a quadratic polynomial p such that p(x) = sin x for x = 0, π/2, π.
(b) Find a quadratic polynomial q such that q(x) = p(x)/sin x for x = π/6, π/2, π/3.
(c) Plot sin x and r(x) = p(x)/q(x) over [0, π] using a graphing software.

(The approximation r(x) was first developed by Bhaskara in the 7th century
CE, of course using degrees rather than radians.)

3.5 Continuity and Integration


In this section, we establish the most important properties of continuous func-
tions.

Let f : [a, b] → R be continuous. The span of f on [a, b] is sup{|f (x) −


f (y)| : x, y ∈ [a, b]} (if the sup exists).

Task 3.5.1. Find the spans of the following functions on [0, 1]: sgn(x), sin x,
1/x (with the value at x = 0 set to 0).
Theorem 3.5.2 (Small Span Theorem). Let f : [a, b] → R be continuous. For
every ε > 0 there is a partition P of [a, b] such that the span of f is less than
ε on every subinterval of P .

We shall use the same approach as in proving the Intermediate Value
Theorem, of successively halving the intervals, until they are small
enough for continuity to pay off.

Proof. Suppose there is an ε > 0 such that no such partition of [a, b] exists. Let
a1 = a, b1 = b and define c1 = (a1 + b1)/2. Then at least one of the intervals
[a1, c1] and [c1, b1] fails to have such a partition, since otherwise the two
partitions would combine into one for [a1, b1]. Let that one be called [a2, b2].
(If both fail to have such a partition, we take the left one.)

Applying the same process to [a2, b2], and so on, we get a sequence of
intervals
[a, b] = [a1, b1] ⊃ [a2, b2] ⊃ · · · ⊃ [an, bn] ⊃ · · ·
none of which have such a partition. In particular, the span of f is at least ε on
every [ai, bi].

Note that
a = a1 ≤ a2 ≤ · · · ≤ b2 ≤ b1 = b.
By the Completeness Axiom, there is an α ∈ R such that ai ≤ α ≤ bi for every
i. Hence α is in each [an, bn]. Since f is continuous at α, there is a δ > 0 such
that |x − α| < δ implies |f(x) − f(α)| < ε/3. And then x, y ∈ (α − δ, α + δ) ⟹
|f(x) − f(y)| < 2ε/3.
By taking large enough n we can ensure that (b − a)/2^n < δ and hence
[a_{n+1}, b_{n+1}] ⊂ (α − δ, α + δ). Then we would have span of f being less than
or equal to 2ε/3 < ε on [a_{n+1}, b_{n+1}], which contradicts our earlier observation
about these intervals. □
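The halving idea of this proof can be turned into a constructive procedure for a given continuous function. The sketch below is only illustrative: the function names are chosen here, and the span on each subinterval is estimated from finitely many sample points (an assumption of this sketch), so the result is a numerical approximation rather than a proof.

```python
import math


def approx_span(f, lo, hi, samples=200):
    """Estimate sup|f(x) - f(y)| on [lo, hi] as max - min over sample points."""
    values = [f(lo + (hi - lo) * k / samples) for k in range(samples + 1)]
    return max(values) - min(values)


def small_span_partition(f, a, b, eps):
    """Halve subintervals until the estimated span on each one is below eps."""
    points = [a]
    pending = [(a, b)]               # subintervals still to be examined
    while pending:
        lo, hi = pending.pop()
        if approx_span(f, lo, hi) < eps:
            points.append(hi)        # accept [lo, hi] as a piece of the partition
        else:
            mid = (lo + hi) / 2
            pending.append((mid, hi))
            pending.append((lo, mid))  # the left half is examined next
    return points


P = small_span_partition(math.sin, 0.0, math.pi, 0.1)
print(len(P) - 1, "subintervals")
```

The pieces come out finer where the function changes quickly and coarser where it is nearly flat.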

Bounded Image
Theorem 3.5.3 (Boundedness Theorem) . Let f : [a, b] → R be continuous.
Then the image of f is bounded.
Proof. We have to show there are numbers m, M such that m ≤ f (x) ≤ M for
every x ∈ [a, b].
By the Small Span Theorem, there is a partition P = {x0 , . . . , xn } of [a, b]
such that
xi−1 ≤ x, y ≤ xi ⟹ |f(x) − f(y)| < 1.
In particular, |f(xi−1) − f(xi)| < 1 for each i. Take any x ∈ [a, b]. Then xi−1 ≤
x ≤ xi for some i. Hence
|f(a) − f(x)| ≤ |f(x0) − f(x1)| + |f(x1) − f(x2)| + · · · + |f(xi−1) − f(x)|
< i ≤ n.
So we can take M = f(a) + n and m = f(a) − n. □
Theorem 3.5.4 (Extreme Value Theorem) . Let f : [a, b] → R be continuous.
Then there are points c, d ∈ [a, b] such that
f (c) = max{ f (x) | x ∈ [a, b] } and f (d) = min{ f (x) | x ∈ [a, b] }.
Proof. We prove the existence of c. Consider the set A = { f(x) | x ∈ [a, b] }.
A is bounded, by the Boundedness Theorem. Therefore, A has a least upper
bound M. We need to show that M ∈ A.
If f(x) never equals M, then M − f(x) is never zero, and $g(x) = \frac{1}{M - f(x)}$
defines a positive and continuous function on [a, b]. By the Boundedness
Theorem, the image of g is bounded, so let g(x) ≤ R on [a, b]. Then M − f(x) ≥ 1/R
and so f(x) ≤ M − 1/R on [a, b]. But then M − 1/R is an upper bound of A
and M − 1/R < M, a contradiction.
Hence there is a c ∈ [a, b] such that f(c) = M. □

Integrability
Theorem 3.5.5 (Integrability of Continuous Functions). Let f : [a, b] → R be
continuous. Then f is integrable on [a, b].
The plan is to apply the Riemann Condition. For it to work, we need to
find upper and lower sums which are close to each other. The Small Span
Theorem helps out by giving a partition where the function fluctuates
little on each subinterval.

Proof. Let ε > 0. By the Small Span Theorem, there is a partition P =
{x0, ..., xn} of [a, b] such that
\[ x_{i-1} \le x, y \le x_i \implies |f(x) - f(y)| < \frac{\varepsilon}{b - a}. \]

Let mi = min{f(x) : x ∈ [xi−1, xi]} and Mi = max{f(x) : x ∈ [xi−1, xi]}.
(These exist by the Extreme Value Theorem.)
Then Mi − mi < ε/(b − a) for every i. Define step functions s ∈ Lf and t ∈ Uf
by s(xi) = t(xi) = f(xi) for each i and

s(x) = mi if xi−1 < x < xi,
t(x) = Mi if xi−1 < x < xi.

Then s ≤ f ≤ t on [a, b] and,

\[ \int_a^b t(x)\,dx - \int_a^b s(x)\,dx = \sum_{i=1}^n (M_i - m_i)(x_i - x_{i-1})
   < \frac{\varepsilon}{b - a} \sum_{i=1}^n (x_i - x_{i-1}) = \varepsilon. \]

By the Riemann Condition, f is integrable on [a, b]. □
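The squeeze between lower and upper sums can be seen in a simple case. The sketch below assumes an increasing function and a uniform partition, so that the minimum and maximum on each subinterval sit at its left and right endpoints; these simplifications and the function name are choices made for this illustration, not part of the proof.

```python
def lower_upper_sums(f, a, b, n):
    """Lower and upper sums of an increasing f on a uniform partition into n pieces."""
    h = (b - a) / n
    xs = [a + i * h for i in range(n + 1)]
    lower = sum(f(xs[i - 1]) * h for i in range(1, n + 1))  # m_i = f(x_{i-1})
    upper = sum(f(xs[i]) * h for i in range(1, n + 1))      # M_i = f(x_i)
    return lower, upper


# For f(x) = x^2 on [0, 1] the gap between the sums shrinks as the partition is refined.
for n in (10, 100, 1000):
    lo, up = lower_upper_sums(lambda x: x * x, 0.0, 1.0, n)
    print(n, lo, up, up - lo)
```

Here the gap is (f(b) − f(a))(b − a)/n, so it can be made smaller than any given ε by taking n large.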

Mean Values
Let us recall that the average of numbers x1, ..., xn is defined by
$\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i$.
This notion of average can be generalised from finitely many numbers
to the values of integrable functions. First consider an interval [a, b] with a
partition P = {x0, ..., xn} of equally spaced points. Consider a step function
s : [a, b] → R such that s(x) = si for x ∈ (xi−1, xi). Then,
\[ \int_a^b s(x)\,dx = \sum_{i=1}^n s_i (x_i - x_{i-1}) = \frac{b - a}{n} \sum_{i=1}^n s_i = (b - a)\bar{s}
   \implies \bar{s} = \frac{1}{b - a} \int_a^b s(x)\,dx. \]

This motivates the following: If f : [a, b] → R is integrable, we define its average
by
\[ \bar{f} = \bar{f}_{[a,b]} = \frac{1}{b - a} \int_a^b f(x)\,dx. \]
The average of a function has the same basic properties as the average of a
collection of numbers. For example:

Task 3.5.6. Show that if f has upper and lower bounds M and m respectively
then $m \le \bar{f} \le M$.
Further, recall that if x1, ..., xn have average x̄ and y1, ..., ym have average
ȳ then the pooled collection x1, ..., xn, y1, ..., ym has average
$\frac{n}{m+n}\bar{x} + \frac{m}{m+n}\bar{y}$.
Task 3.5.7. Suppose a < b < c and f : [a, c] → R is integrable. Show that
\[ \bar{f}_{[a,c]} = \frac{b - a}{c - a}\,\bar{f}_{[a,b]} + \frac{c - b}{c - a}\,\bar{f}_{[b,c]}. \]

Finally, if we acquire new data whose values are lower than previous values,
then the average decreases:

Task 3.5.8. Suppose that f is a decreasing function. Show that $\bar{f}_{[a,x]}$ is also a
decreasing function of x.
Now let us see a phenomenon that is special to averages of functions. The
average of a collection of numerical data is usually not a member of that data
set. However, the average of a continuous function is a value of that function:

Theorem 3.5.9 (Mean Value Theorem for Integration). Consider a continuous
function f : [a, b] → R. There is a number c ∈ (a, b) such that
\[ f(c) = \frac{1}{b - a} \int_a^b f(x)\,dx. \]

[Figure: graph of f(x) on [a, b], with a point c at which f(c) equals the average
value of f.]

Proof. Let m, M be the minimum and maximum values, respectively, of f(x)
on [a, b]. Then there exist a′, b′ ∈ [a, b] such that f(a′) = m and f(b′) = M.
Further,
\[ f(a') = m \le \frac{1}{b - a} \int_a^b f(x)\,dx \le M = f(b'). \]

By the Intermediate Value Theorem, there is a number c between a′ and b′
with $f(c) = \frac{1}{b - a} \int_a^b f(x)\,dx$. □
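For a concrete function the point c can be located numerically by combining an estimate of the average with bisection. This is a minimal sketch under stated assumptions: f is taken to be increasing on [a, b], so that a′ = a and b′ = b, the average is estimated by a midpoint rule, and the function names are chosen for this illustration.

```python
def average(f, a, b, n=10_000):
    """Midpoint-rule estimate of the average of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h / (b - a)


def mean_value_point(f, a, b, steps=60):
    """Bisect for c with f(c) equal to the average; assumes f is increasing on [a, b]."""
    target = average(f, a, b)
    lo, hi = a, b
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2, target


c, avg = mean_value_point(lambda x: x * x, 0.0, 3.0)
print(c, avg)   # c is close to the square root of 3, and avg is close to 3
```

For f(x) = x^2 on [0, 3] the average is 3, which is attained at c = √3, a point strictly inside the interval.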
b−a a

A weighted average of numbers x1, ..., xn is a combination $\sum_{i=1}^n w_i x_i$
where each wi ≥ 0 and $\sum_{i=1}^n w_i = 1$. The concept of weighted average
generalises that of ordinary average by allowing different importance (or weight) for
each number. If we set each wi = 1/n we get the original x̄.
The analogue for integration is to define the weighted average of an
integrable function f to be $\int_a^b f(x)g(x)\,dx$ where g is non-negative on [a, b] and
$\int_a^b g(x)\,dx = 1$. This definition requires fg to be integrable. For that, see
Exercise 9 of §2.2.

When we calculate a weighted average of a continuous function, we again
find that it equals one of the values of the function.

Theorem 3.5.10 (Mean Value Theorem for Weighted Integration). Consider
functions f, g : [a, b] → R where f is continuous, while g is integrable and g ≥ 0
on [a, b]. Then there is a number c ∈ (a, b) such that
\[ f(c) \int_a^b g(x)\,dx = \int_a^b f(x)g(x)\,dx. \]

Proof. Let m, M and a′, b′ be as in the proof of the previous theorem. Then,

\[ m \le f(x) \le M \implies m\,g(x) \le f(x)g(x) \le M g(x)
   \implies m \int_a^b g(x)\,dx \le \int_a^b f(x)g(x)\,dx \le M \int_a^b g(x)\,dx. \]

If $\int_a^b g(x)\,dx = 0$, these inequalities give $\int_a^b f(x)g(x)\,dx = 0$, and then any c will
work. If $\int_a^b g(x)\,dx \ne 0$, we have
\[ m \le \frac{\int_a^b f(x)g(x)\,dx}{\int_a^b g(x)\,dx} \le M. \]
Then the Intermediate Value Theorem gives the desired c. □

Exercises for §3.5

1. Prove the following functions are not continuous, by showing they lack
some property of continuous functions (and not by showing discontinuity at
any particular point):

(a) f : [0, 1] → R, f(x) = 1/x if x ≠ 0 and f(0) = 0.
(b) g : [0, 1] → R, g(x) = x if x ≠ 1 and g(1) = 0.
(c) h : [0, 1] → R, h(x) = [e^x].
2. Give an example of each of the following:

(a) An unbounded continuous function with a bounded domain.

(b) A bounded continuous function which does not have a maximum value.

3. Consider a non-constant polynomial p(x) = x^n + a_{n−1} x^{n−1} + · · · + a_0 of
even degree. Prove the following:

(a) For any real number y there is a real number R(y) ≥ 0 such that |x| >
R(y) implies p(x) > y .
(b) Let m be the minimum value of p(x) over the interval [−R(a0 ), R(a0 )].
Show that m is the minimum value of p(x) over the entire real line.

(c) Show that the image of p is [m, ∞).


4. Complete the following sketch of an alternate proof of the Boundedness
Theorem:

(a) If a function is bounded on sets A and B then it is bounded on A ∪ B.

(b) Given a continuous function f : [a, b] → R, define

A = { x ∈ [a, b] | f is bounded on [a, x] }.

Show that α = sup A exists.

(c) If α < b, use the continuity of f at α to show that f is bounded on


[a, α + δ] for some δ > 0. Hence α = b.
(d) Use the continuity of f at b to show that f is bounded on [a, b].
5. Suppose f is continuous and positive on [a, b]. Show there is a δ>0 such
that f (x) ≥ δ for every x ∈ [a, b].
6. Suppose f is bounded on [a, b] and continuous on (a, b]. Show that f is
integrable on [a, b].
7. Suppose f is continuous and non-negative on [a, b]. If $\int_a^b f(x)\,dx = 0$ then
f = 0 on [a, b]. (Hint: If f(c) > 0 then f(x) > f(c)/2 for x near c.)
8. Suppose f is continuous on [a, b] and $\int_a^b f(x)^2\,dx = 0$. Show that f = 0
on [a, b].
9. Suppose f is continuous on [a, b] and $\int_a^b f(x)\,dx = 0$. Show that f(c) = 0
for some c ∈ (a, b).
10. Use the Mean Value Theorem for Weighted Integration to prove:

\[ \frac{1}{3\sqrt{2}} \le \int_0^1 \frac{x^2}{\sqrt{1 + x}}\,dx \le \frac{1}{3}. \]
4 | Differentiation

In this chapter, we take a closer look at the idea that local information about
functions should help us resolve integration problems. We already saw that
continuity guarantees integrability. Another application of continuity combined
with integration was in enabling the definition of the trigonometric functions.
On the other hand, continuity did not give us new tools to calculate integrals.

Among the continuous functions, the ones that are easiest to integrate are
the `piecewise linear' ones. Their graphs consist of line segments, such as in the
example below:

[Figure: graph of a piecewise linear function on [a, b].]

This suggests that we try to locally approximate functions by straight line
segments. If we can get good approximations of this type, we can use them to
assess the integral. We shall give the name `differentiable' to functions which
can be locally approximated by straight lines. Most of this chapter is devoted to
identifying these functions and to calculating the corresponding straight lines.
Then we make the first connection between the processes of differentiation
and integration, the so-called First Fundamental Theorem of Calculus. Finally,
we see that differentiation has a life of its own, and we use it to explore the
problems of finding the extreme values and sketching the graph of a function.

4.1 Derivative of a Function


Let us consider what happens if we zoom in for a closer look at the graph of a
function such as y = x^2, near a point such as (1, 1).

[Figure: three successive zooms of the graph of y = x^2 towards the point (1, 1).]

We see that the graph of the function y = x^2 looks more like the line y = 2x − 1
as we zoom in towards (1, 1), and at some stage becomes indistinguishable from
it.

This can happen even for functions with rapid oscillations. Let us look at
the function defined by y = x^2 cos(1/x) if x ≠ 0, and y = 0 if x = 0, near the
origin.

[Figure: three successive zooms of the graph of y = x^2 cos(1/x) towards the origin.]

No matter how much we zoom in, the function has infinitely many oscillations.
Nevertheless, their amplitudes decrease and in that sense the function becomes
closer to the line y = 0. (We have kept a constant ratio between the unit lengths
in the x and y directions.)

We wish to set up a clear criterion for when a function $y = f(x)$ can be
considered to merge, on zooming in, with a line which passes through $(a, f(a))$
and has slope $m$. This line has equation $y = f(a) + m(x - a)$. A `nearby' line
would have equation $y = f(a) + m'(x - a)$ with $|m' - m|$ being small. The graph
of $f$ will merge with the given line if for any $\varepsilon > 0$, we can ensure that $f(x)$
lies between $f(a) + (m \pm \varepsilon)(x - a)$ for $x$ close enough to $a$. This leads to the
following definition.

We say that a function $f : I \to \mathbb{R}$, where $I$ is an open interval, has
derivative $m$ at a point $a \in I$ if for each $\varepsilon > 0$ there is a $\delta > 0$ such that
$|x - a| < \delta$ implies $|f(x) - f(a) - m(x - a)| \le \varepsilon |x - a|$.

• If $f$ has a derivative at $a$ we say that $f$ is differentiable at $a$. The act of
  finding the derivative is called differentiation.
• If $f$ has derivative $m$ at $a$, the line $y = f(a) + m(x - a)$ is called the
  tangent line to the graph of $f$ at $(a, f(a))$.
• If $f$ has derivative $m$ at $a$, we use the notation $f'(a)$ or $\frac{df}{dx}(a)$ or
  $\left.\frac{df}{dx}\right|_{x=a}$ for $m$.
Task 4.1.1. Consider a linear function y = mx + c. Show that its derivative
at any point is m. (Hence the derivative of a constant function is zero)
The next sequence of graphs illustrates this definition for $y = x^2$, $a = 1$,
$m = 2$ and $\varepsilon = 0.1$. The curve is $y = x^2 - 1 - 2(x - 1) = x^2 - 2x + 1$ and the
shaded zone is bounded by the lines $y = \pm 0.1(x - 1)$. We see that $\delta = 0.05$
works for these values, and brings the curve inside the shaded zone.
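The same check can be carried out on a grid of sample points. This is only a numerical spot check under stated assumptions (the function name `within_wedge` and the grid size are hypothetical choices), not a proof that a given $\delta$ works.

```python
def within_wedge(f, a, m, eps, delta, samples=2001):
    """Check |f(x) - f(a) - m(x - a)| <= eps*|x - a| on a grid of points
    with |x - a| < delta (a numerical spot check, not a proof)."""
    for i in range(1, samples):
        x = a - delta + 2 * delta * i / samples  # strictly inside (a - delta, a + delta)
        if abs(f(x) - f(a) - m * (x - a)) > eps * abs(x - a):
            return False
    return True

square = lambda x: x * x
print(within_wedge(square, 1.0, 2.0, 0.1, 0.05))  # delta = 0.05 passes the check
print(within_wedge(square, 1.0, 2.0, 0.1, 0.25))  # delta = 0.25 is too large
```

For this function the inequality reduces to $|x - 1| \le \varepsilon$, so any $\delta \le 0.1$ would pass and any larger $\delta$ would fail.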

[Figure: the curve and the shaded zone, viewed for δ = 1, δ = 0.25 and δ = 0.05.]

DRAFT September 25, 2020

Functions can fail to be differentiable at a given point. The next set of
diagrams shows functions whose graph passes through the origin but never
resembles a line no matter how much we zoom in.

[Figure: graphs of sgn(x), |x| and x cos(1/x) near the origin.]

Our definition of derivative can be rephrased as follows:

Theorem 4.1.2. A function $f$ has derivative $f'(a)$ at $a$ if and only if there is
a function $\varphi$ such that $f(x) - f(a) - f'(a)(x - a) = \varphi(x)(x - a)$ and
$\lim_{x \to a} \varphi(x) = \varphi(a) = 0$.

Proof. Exercise. □

Example 4.1.3. Let us check that the derivative of $f(x) = x^2$ at $a = 1$ is 2:

$$x^2 - 1^2 - 2(x - 1) = \varphi(x)(x - 1) \implies \varphi(x) = x - 1 \implies \lim_{x \to 1} \varphi(x) = 0 = \varphi(1).$$

Theorem 4.1.4. If a function is differentiable at a point, then it is continuous
at that point.

Proof. Suppose $f$ has derivative $f'(a)$ at $x = a$. Then there is a function $\varphi$ such
that $f(x) - f(a) - f'(a)(x - a) = \varphi(x)(x - a)$ and $\lim_{x \to a} \varphi(x) = 0$. Hence,

$$\lim_{x \to a} f(x) = \lim_{x \to a} \, [f(a) + f'(a)(x - a) + \varphi(x)(x - a)] = f(a).$$ □

Task 4.1.5. Give an example of a function which is continuous at every point
but fails to be differentiable at some point.

When we differentiate a real function $f : D \to \mathbb{R}$ we create a new function
$f' : D' \to \mathbb{R}$ where $D'$ is the subset of $D$ consisting of all the points where $f$
is differentiable. We can further differentiate $f'$ to get a function $f'' = (f')'$,
called the second derivative of $f$. Then we can create the third derivative
$f''' = (f'')'$, and so on. Other choices of notation are:

$$\begin{aligned}
f^{(0)}(x) &= f(x), \\
f^{(1)}(x) &= f'(x) = \frac{df}{dx}(x), \\
f^{(2)}(x) &= f''(x) = \frac{d^2 f}{dx^2}(x), \\
f^{(3)}(x) &= f'''(x) = \frac{d^3 f}{dx^3}(x), \\
&\;\;\vdots \\
f^{(n)}(x) &= \frac{d^n f}{dx^n}(x).
\end{aligned}$$

The function $f^{(n)}$, obtained by differentiating $f$ successively $n$ times, is called
the $n$th derivative of $f$.
Derivative via Limits
We have established a connection between derivatives and limits. Let us make
it more explicit.

Theorem 4.1.6. Let $f : I \to \mathbb{R}$ where $I$ is an open interval. Then $f$ has
derivative $f'(a)$ at $a \in I$ if and only if

$$f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a} = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}.$$

Proof. We begin with the first equality:

$$\begin{aligned}
f'(a) = m &\iff \exists \varphi \text{ such that } f(x) - f(a) - m(x - a) = \varphi(x)(x - a) \text{ and } \lim_{x \to a} \varphi(x) = \varphi(a) = 0 \\
&\iff \lim_{x \to a} \frac{f(x) - f(a) - m(x - a)}{x - a} = 0 \\
&\iff \lim_{x \to a} \frac{f(x) - f(a)}{x - a} = m.
\end{aligned}$$

The second equality follows from Theorem 3.1.6. □

This result gives us another viewpoint on the nature of the derivative.
Imagine that $x$ represents time and $f(x)$ is a position on the number line. Then
the ratio $\frac{f(x) - f(a)}{x - a}$ represents the average velocity over the time interval
$[a, x]$. Letting $x$ approach $a$ gives us a better idea of the velocity in the
immediate vicinity of $a$, and the limit is seen as defining the instantaneous velocity
at $a$. In general, the derivative of any function $f$ is called the (instantaneous)
rate of change of $f$.
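As a small illustration of average velocities approaching an instantaneous one, here is a sketch in which the position function $f(t) = t^2$ and the step sizes are assumptions chosen for the example.

```python
def difference_quotient(f, a, h):
    """Average rate of change of f over the interval between a and a + h."""
    return (f(a + h) - f(a)) / h

position = lambda t: t * t  # hypothetical position function f(t) = t^2

# Shorter time intervals give average velocities closer to f'(1) = 2.
for h in (0.1, 0.01, 0.001):
    print(h, difference_quotient(position, 1.0, h))
```

For this choice the quotient equals $2 + h$ up to rounding, so the printed values approach 2 as $h$ shrinks.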
Task 4.1.7. Show (again) that a constant function will have zero derivative.
Our original definition of derivative is useful for conceptualizing and proving
abstract results. For example, it gives the right starting point for discussing
differentiation in higher dimensions. On the other hand, the limit expression is
convenient for calculations. Let us see an example.

Example 4.1.8 (Power Rule). Consider the function $x^n$, for a fixed $n \in \mathbb{N}$. Its
derivative can be calculated as follows:

$$\begin{aligned}
(x^n)' &= \lim_{y \to x} \frac{y^n - x^n}{y - x} = \lim_{y \to x} \sum_{i=0}^{n-1} y^i x^{n-1-i} \\
&= \sum_{i=0}^{n-1} x^i x^{n-1-i} = \sum_{i=0}^{n-1} x^{n-1} = n x^{n-1}.
\end{aligned}$$

The second equality uses the identity

$$y^n - x^n = (y - x)(y^{n-1} + y^{n-2} x + \cdots + y x^{n-2} + x^{n-1}) = (y - x) \sum_{i=0}^{n-1} y^i x^{n-1-i}.$$

In particular, $x' = 1$, $(x^2)' = 2x$, etc.

And here is an example of using limits to show that a certain function is
not differentiable:

Example 4.1.9. Let $f(x) = |x|$. Let us try to calculate $f'(0)$:

$$f'(0) = \lim_{x \to 0} \frac{f(x) - f(0)}{x - 0} = \lim_{x \to 0} \frac{|x| - |0|}{x - 0} = \lim_{x \to 0} \frac{|x|}{x}.$$

The last limit does not exist, since the right-hand limit is 1 while the left-hand
limit is −1.
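The disagreement between the two sides is easy to see numerically. In this sketch the helper name `one_sided_quotients` is a hypothetical choice; it evaluates the difference quotient of $|x|$ at 0 using a step to the right and a step to the left.

```python
def one_sided_quotients(f, a, h):
    """Return the difference quotients of f at a taken from the right and from the left."""
    right = (f(a + h) - f(a)) / h
    left = (f(a - h) - f(a)) / (-h)
    return right, left

# For |x| at 0 the two quotients stay at 1 and -1 however small h is.
for h in (0.1, 0.001):
    print(h, one_sided_quotients(abs, 0.0, h))
```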

One-Sided Derivatives
The concept of one-sided limits can be applied to derivatives:

• $f'_+(a) = \lim_{x \to a+} \frac{f(x) - f(a)}{x - a}$ is the right derivative of $f$ at $a$.
• $f'_-(a) = \lim_{x \to a-} \frac{f(x) - f(a)}{x - a}$ is the left derivative of $f$ at $a$.

Task 4.1.10. Show that a function $f$ is differentiable at $x = a$ if and only if
the left and right derivatives of $f$ at $a$ exist and are equal.

We say $f$ is differentiable on an interval $I$ if it is differentiable at every
interior point of $I$, and has the appropriate one-sided derivative at any
end-point which is included in $I$. Further, we shall denote the one-sided derivative
at an end-point $c$ by $f'(c)$ for simplicity.

Graph of Derivative
A useful skill is to be able to sketch the graph of $f'$ from that of $f$, without
actually calculating $f'$. We can do this by observing where the tangent slopes
appear to be 0, positive, or negative. As an example, let $y = f(x)$ have the
`bell-shaped' graph shown below.

We see that the tangent line at $x = 0$ is horizontal, by symmetry, and
hence has slope 0. Thus $f'(0) = 0$.

As we move to the right from $x = 0$, the tangents have negative slope. We
also see that for a while their steepness increases but then they start flattening
out. Thus, as $x$ increases, $f'(x)$ at first takes more and more negative values
but then starts moving up towards zero. Similarly, as we move to the left from
$x = 0$, $f'(x)$ at first takes more and more positive values but then starts moving
down towards zero. The plot below shows the graph of $f'(x)$ according to these
observations.

Exercises for §4.1

1. Differentiate each function at $a = 0$:

(a) $f(x) = 2x + 1$,  (b) $g(x) = 2x^2 + 1$,  (c) $h(x) = x^2 + x + 1$,  (d) $k(x) = x|x|$.

2. Show that the following functions are not differentiable at $a = 0$:

(a) $f(x) = \operatorname{sgn}(x)$,  (b) $g(x) = \sqrt{|x|}$.

3. Consider the function $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) = x^2$ when $x \in \mathbb{Q}$ and
$f(x) = 0$ when $x \notin \mathbb{Q}$. Show that $f$ is differentiable only at $x = 0$.

4. Prove the following:

(a) The function $f(x) = \begin{cases} x \sin(1/x), & x \neq 0 \\ 0, & x = 0 \end{cases}$ is continuous but not
differentiable at $x = 0$.

(b) The function $g(x) = \begin{cases} x^2 \sin(1/x), & x \neq 0 \\ 0, & x = 0 \end{cases}$ is differentiable at $x = 0$.

5. Let $n \in \mathbb{N}$ and consider the function $f(x) = x^{1/n}$ with domain $x > 0$. Show
that $f'(x) = \frac{1}{n} x^{(1/n) - 1}$.

6. Prove the following:

(a) $\sin'(0) = 1$,  (b) $\cos'(0) = 0$.

7. Suppose a function $f$ is differentiable at $x = a$. Prove that the function
$g(x) = x f(x)$ is also differentiable at $x = a$, with $g'(a) = a f'(a) + f(a)$.

8. Suppose $f : \mathbb{R} \to \mathbb{R}$ is differentiable at every point, so that we have
$f' : \mathbb{R} \to \mathbb{R}$. Prove:

(a) If $f$ is even then $f'$ is odd.  (b) If $f$ is odd then $f'$ is even.

9. Let $f(x) = x^n$ for $n \in \mathbb{N}$. Prove that $f^{(n)}(x) = n!$ and $f^{(n+1)}(x) = 0$.

10. Suppose $f$ is an even function which is differentiable at 0. Show that
$f'(0) = 0$.

11. Let $f : \mathbb{R} \to \mathbb{R}$ have period $T$ and be differentiable. Show that $f'$ has
period $T$.

12. Match the graphs of $f$ in (a), (b), (c) with the graphs of $f'$ in (i), (ii),
(iii).

[Figure: graphs (a), (b), (c) of $f$ and graphs (i), (ii), (iii) of $f'$.]

4.2 Algebra of Derivatives


Theorem 4.2.1 (Algebra of Derivatives).
Let $f$ and $g$ be differentiable at $p$. Then
1. (Scaling) $(Cf)'(p) = C f'(p)$.
2. (Sum Rule) $(f + g)'(p) = f'(p) + g'(p)$.
3. (Difference Rule) $(f - g)'(p) = f'(p) - g'(p)$.
4. (Product Rule) $(fg)'(p) = f'(p) g(p) + f(p) g'(p)$.
5. (Reciprocal Rule) $\left(\frac{1}{f}\right)'(p) = -\frac{f'(p)}{f(p)^2}$, if $f(p) \neq 0$.
6. (Quotient Rule) $\left(\frac{g}{f}\right)'(p) = \frac{g'(p) f(p) - g(p) f'(p)}{f(p)^2}$, if $f(p) \neq 0$.

Proof. We apply the Algebra of Limits:

1. Scaling:

$$(Cf)'(p) = \lim_{x \to p} \frac{Cf(x) - Cf(p)}{x - p} = C \lim_{x \to p} \frac{f(x) - f(p)}{x - p} = C f'(p).$$

2. Sum Rule:

$$\begin{aligned}
(f + g)'(p) &= \lim_{x \to p} \frac{f(x) + g(x) - f(p) - g(p)}{x - p} \\
&= \lim_{x \to p} \frac{f(x) - f(p)}{x - p} + \lim_{x \to p} \frac{g(x) - g(p)}{x - p} = f'(p) + g'(p).
\end{aligned}$$

3. Difference Rule: Combine the sum rule with scaling by $C = -1$.

4. Product Rule:

$$\begin{aligned}
(fg)'(p) &= \lim_{x \to p} \frac{f(x) g(x) - f(p) g(p)}{x - p} \\
&= \lim_{x \to p} \frac{f(x) g(x) - f(x) g(p) + f(x) g(p) - f(p) g(p)}{x - p} \\
&= \lim_{x \to p} f(x) \frac{g(x) - g(p)}{x - p} + \lim_{x \to p} g(p) \frac{f(x) - f(p)}{x - p} \\
&= f(p) g'(p) + f'(p) g(p).
\end{aligned}$$

(Since $f'(p)$ exists, $f$ is continuous at $p$ and we have $\lim_{x \to p} f(x) = f(p)$.)

5. Reciprocal Rule: By continuity, $f(x) \neq 0$ for $x$ near $p$. Hence,

$$\left(\frac{1}{f}\right)'(p) = \lim_{x \to p} \frac{1/f(x) - 1/f(p)}{x - p} = \lim_{x \to p} \frac{f(p) - f(x)}{f(x) f(p) (x - p)} = -\frac{f'(p)}{f(p)^2}.$$

(Since $f'(p)$ exists, we have $\lim_{x \to p} f(x) = f(p)$.)

6. Quotient Rule: Combine the product rule and reciprocal rule. □

With these rules we can differentiate polynomials and rational functions.
For example,

$$\begin{aligned}
(x^{45} + 7x^4 + 99)' &= (x^{45})' + (7x^4)' + (99)' && \text{(sum rule)} \\
&= (x^{45})' + (7x^4)' && (C' = 0) \\
&= (x^{45})' + 7(x^4)' && \text{(scaling)} \\
&= 45x^{44} + 28x^3 && \text{(power rule)}
\end{aligned}$$
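The term-by-term procedure used in this calculation is mechanical enough to write down as code. The sketch below is an illustration only; storing a polynomial as a dictionary from exponents to coefficients is an assumed representation, not one used in the text.

```python
def differentiate(poly):
    """Differentiate a polynomial stored as {exponent: coefficient}.

    Applies the sum rule, scaling and the power rule term by term;
    constant terms (exponent 0) have derivative zero and are dropped."""
    return {n - 1: n * c for n, c in poly.items() if n != 0}

p = {45: 1, 4: 7, 0: 99}   # x^45 + 7x^4 + 99
print(differentiate(p))    # {44: 45, 3: 28}, i.e. 45x^44 + 28x^3
```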

Task 4.2.2. Differentiate the given functions and identify the points where the
derivative exists:

(a) $\frac{x}{x - 1}$  (b) $[x]$  (c) $x^{-n}$, $n \in \mathbb{N}$

The Algebra of Derivatives gives us ways to deal with combinations of
functions. Its effectiveness requires a collection of `elementary' functions whose
derivatives are already known. We now begin building that collection by working
out the derivatives of the trigonometric and logarithmic functions.

Trigonometric Functions
To differentiate the trigonometric functions, we use the two `fundamental limits'
calculated earlier, namely:

$$\lim_{x \to 0} \frac{\sin x}{x} = 1 \quad \text{and} \quad \lim_{x \to 0} \frac{1 - \cos x}{x} = 0.$$

Theorem 4.2.3. For every $x \in \mathbb{R}$, $\sin' x = \cos x$ and $\cos' x = -\sin x$.

Proof. We differentiate the sine function, and leave the cosine for the reader.

$$\begin{aligned}
\sin' x &= \lim_{h \to 0} \frac{\sin(x + h) - \sin x}{h} \\
&= \lim_{h \to 0} \frac{\sin x \cos h + \cos x \sin h - \sin x}{h} \\
&= \lim_{h \to 0} \left( \frac{\cos h - 1}{h} \sin x + \frac{\sin h}{h} \cos x \right) \\
&= 0 \cdot \sin x + 1 \cdot \cos x \\
&= \cos x.
\end{aligned}$$ □

Task 4.2.4. Use the reciprocal and quotient rules to show that
$\sec' x = \sec x \tan x$, $\csc' x = -\csc x \cot x$,
$\tan' x = \sec^2 x$, $\cot' x = -\csc^2 x$.

Logarithms
In order to obtain the derivative of $\log x$ we begin with the following simple
inequalities:

Theorem 4.2.5. For $x > 0$, $1 - \frac{1}{x} \le \log x \le x - 1$.

Proof. For $x \ge 1$ these inequalities are obtained from

$$\int_1^x \frac{1}{x} \, dt \le \int_1^x \frac{1}{t} \, dt \le \int_1^x 1 \, dt.$$

Substituting $1/x$ for $x$ gives the inequalities for $0 < x \le 1$. □

Theorem 4.2.6. For every $x > 0$, $\log' x = \frac{1}{x}$.

Proof. We apply the definition of the derivative:

$$\log' x = \lim_{y \to x} \frac{\log y - \log x}{y - x} = \lim_{y \to x} \frac{\log(y/x)}{y - x} = \lim_{h \to 1} \frac{\log(hx/x)}{hx - x} = \frac{1}{x} \lim_{h \to 1} \frac{\log h}{h - 1}.$$

For $h > 1$, we have $\frac{1}{h} \le \frac{\log h}{h - 1} \le 1$ from Theorem 4.2.5. The Sandwich Theorem
gives $\lim_{h \to 1+} \frac{\log h}{h - 1} = 1$. If $h < 1$, the inequalities reverse and again give
$\lim_{h \to 1-} \frac{\log h}{h - 1} = 1$. □

Task 4.2.7. Let $a > 0$ and $a \neq 1$. Show that $\log_a' x = \frac{1}{x \log a}$.

The limit calculation that we carried out in the last proof can also be
expressed as

$$\lim_{h \to 0} \frac{\log(1 + h)}{h} = 1 \quad \text{or} \quad \lim_{h \to 0} \log\big((1 + h)^{1/h}\big) = 1.$$

Applying the exponential function, and recalling that it is continuous, we get:

$$\lim_{h \to 0} (1 + h)^{1/h} = e.$$

We can use this limit to get better estimates of $e$. Let us take a closer look at
the behaviour of $(1 + h)^{1/h}$ for $h > 0$. First, since $\log (1 + h)^{1/h} = \frac{1}{h} \int_1^{1+h} \frac{dx}{x}$ is
the average of $1/x$ over the interval $[1, 1 + h]$ and $1/x$ is a decreasing function,
so is $\log (1 + h)^{1/h}$ (Task 3.5.8). Hence $(1 + h)^{1/h}$ is also a decreasing function.
Therefore $(1 + h)^{1/h}$ is an underestimate of $e$ when $h > 0$. Similarly, it is an
overestimate when $h < 0$. So we can get bounds for $e$ by taking small $h$ of both
signs.

    h          (1 + h)^(1/h)    (1 − h)^(−1/h)
    1/2        2.25             4
    1/10       2.59             2.87
    1/10^3     2.717            2.719

Thus, by taking h = 0.001 we already know that e ≈ 2.718. The actual value
of e when rounded to 6 decimal places is 2.718282.
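The two-sided estimates are easy to reproduce. The following is a small sketch (the function name `e_bounds` is a hypothetical choice) that evaluates both expressions for a few values of $h$ between 0 and 1.

```python
import math

def e_bounds(h):
    """Return (lower, upper) estimates of e from (1+h)^(1/h) and (1-h)^(-1/h), for 0 < h < 1."""
    return (1 + h) ** (1 / h), (1 - h) ** (-1 / h)

for h in (1 / 2, 1 / 10, 1 / 10**3):
    lower, upper = e_bounds(h)
    print(f"h = {h:<6} {lower:.4f} < e < {upper:.4f}")
```

Smaller values of $h$ tighten the bracket around $e$ from both sides.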

Exercises for §4.2

1. Differentiate the following functions:

(a) $\frac{x^2 - 1}{x^2 + 1}$,  (b) $\sin 2x$,  (c) $x \log x - x$,  (d) $\log |x|$.

2. Given the formula $1 + x + x^2 + \cdots + x^n = \frac{x^{n+1} - 1}{x - 1}$ determine, by
differentiation, formulas for:

(a) $1x + 2x^2 + 3x^3 + \cdots + n x^n$,  (b) $1^2 x + 2^2 x^2 + 3^2 x^3 + \cdots + n^2 x^n$.

3. (Leibniz rule) Let $u$, $v$ be real functions with a common domain, each being
differentiable $n$ times. Show that

$$(uv)^{(n)} = \sum_{k=0}^{n} \binom{n}{k} u^{(k)} v^{(n-k)}.$$

4. Let $p$ be a polynomial of degree $n$. Show that $p^{(n+1)} = 0$.

5. Can there be a polynomial $p(x)$ such that $p(x) = \sin x$ on some interval
$(a, b)$?

6. Let functions $f_1, \dots, f_n$ have derivatives $f_1', \dots, f_n'$.

(a) Find a rule for differentiating the product $g = f_1 \cdots f_n$ and prove it by
mathematical induction.

(b) Show that if $f_i(x) \neq 0$ for every $i$ then

$$\frac{g'(x)}{g(x)} = \frac{f_1'(x)}{f_1(x)} + \cdots + \frac{f_n'(x)}{f_n(x)}.$$

7. Let $f$ be a differentiable function. Show that $(f(x)^n)' = n f(x)^{n-1} f'(x)$ for
each $n \in \mathbb{Z}$.

8. Let $r \in \mathbb{Q}$ and consider the function $f(x) = x^r$ with domain $x > 0$. Show
that $f'(x) = r x^{r-1}$.

9. We say that $a$ is a zero with multiplicity $k$ ($k \in \mathbb{N}$) of a polynomial $p$ if
$p(x) = (x - a)^k q(x)$, where $q$ is a polynomial such that $q(a) \neq 0$. Show that $a$ is
a zero of $p$, with multiplicity $k$, if and only if $p(a) = p'(a) = \cdots = p^{(k-1)}(a) = 0$
and $p^{(k)}(a) \neq 0$.

10. Prove that $e^x \ge 1 + x$ for each $x \in \mathbb{R}$.

11. Prove that if $a, b \in \mathbb{N}$ satisfy $a < b$ and $b^a = a^b$ then $a = 2$ and $b = 4$.
(Hint: Take $b = a(1 + t)$ and apply the previous exercise.)

4.3 Chain Rule and Applications


The next task is to be able to differentiate compositions of functions. Let's first
see what happens in the simplest case, when the functions are linear. Consider
$f(x) = mx + c$ and $g(x) = nx + d$. Then $f \circ g(x) = f(nx + d) = m(nx + d) + c =
mnx + md + c$. Therefore $(f \circ g)'(x) = mn$, the product of the individual
derivatives of $f$ and $g$. This motivates the following result:

Theorem 4.3.1 (Chain Rule). Let $g$ be differentiable at $a$ and let $f$ be
differentiable at $b = g(a)$. Then the composition $f \circ g$ is differentiable at $a$ and the
derivative is given by

$$(f \circ g)'(a) = f'(g(a)) \, g'(a).$$

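Proof. Let $f'(g(a)) = m$ and $g'(a) = n$. Then

1. $\exists \varphi$ such that $g(x) - g(a) = [n + \varphi(x)](x - a)$ and $\lim_{x \to a} \varphi(x) = \varphi(a) = 0$.

2. $\exists \psi$ such that $f(y) - f(b) = [m + \psi(y)](y - b)$ and $\lim_{y \to b} \psi(y) = \psi(b) = 0$.

Hence,

$$\begin{aligned}
f(g(x)) - f(g(a)) &= [m + \psi(g(x))](g(x) - b) \\
&= [m + \psi(g(x))][n + \varphi(x)](x - a) \\
&= mn(x - a) + E(x)(x - a),
\end{aligned}$$

and $\lim_{x \to a} E(x) = \lim_{x \to a} [m \varphi(x) + n \psi(g(x)) + \psi(g(x)) \varphi(x)] = 0$. (Here
$\psi(g(x)) \to \psi(b) = 0$ because $g$, being differentiable at $a$, is continuous there
and $\psi$ is continuous at $b$.) □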
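The Chain Rule formula can be sanity-checked numerically. In the sketch below the functions $f = \sin$ and $g(x) = x^2$, the point $a = 0.7$ and the helper `central_diff` are all illustrative assumptions; symmetric difference quotients stand in for the exact derivatives.

```python
import math

def central_diff(f, a, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for f'(a)."""
    return (f(a + h) - f(a - h)) / (2 * h)

g = lambda x: x * x   # inner function (illustrative choice)
f = math.sin          # outer function (illustrative choice)
a = 0.7

lhs = central_diff(lambda x: f(g(x)), a)          # approximates (f o g)'(a)
rhs = central_diff(f, g(a)) * central_diff(g, a)  # approximates f'(g(a)) * g'(a)
print(lhs, rhs)
```

Both numbers should agree with $\cos(a^2) \cdot 2a$ up to the accuracy of the difference quotients.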

Task 4.3.2. Differentiate the given functions and identify the points where the
derivative exists:

(a) $f(x) = (x^2 + 1)^{10}$,  (b) $g(x) = |\cos x|$,  (c) $h(x) = \cos |x|$,  (d) $k(x) = \frac{\sin^2 x}{\sin x^2}$.

Implicit Dierentiation
We have been studying relationships of the form $y = f(x)$ between two variables
$x$, $y$. Sometimes the relationship is not of such a simple form. For example, it
may be $x^2 + y^2 = 1$. Clearly this shows a dependence: for any $x \in [-1, 1]$ we can
solve for corresponding values $y = \pm\sqrt{1 - x^2}$. We say that $x^2 + y^2 = 1$ defines
$y$ implicitly in terms of $x$. In fact this implicit relation can be separated into
two explicit functions $y = \sqrt{1 - x^2}$ and $y = -\sqrt{1 - x^2}$.

[Figure: the circle $x^2 + y^2 = 1$, and the graphs of $y = \sqrt{1 - x^2}$ and $y = -\sqrt{1 - x^2}$.]

The Chain Rule allows us to calculate $dy/dx$ without solving explicitly for $y$,
as follows:

$$x^2 + y^2 = 1 \implies 2x + 2y \, y' = 0 \implies y' = -x/y \quad (\text{if } y \neq 0)$$

This works simultaneously for both cases of $y = \pm\sqrt{1 - x^2}$!

The real advantage of this process of implicit differentiation is that
often one may be unable to solve for $y$ as an explicit function of $x$, and then
this is the only approach available.

Example 4.3.3 (Folium of Descartes). Consider the relation $x^3 + y^3 = 6xy$.
Its solutions plot as follows:

[Figure: the Folium of Descartes, $x^3 + y^3 = 6xy$.]

It is hard to separate this into explicit functions, but easy to differentiate
implicitly:

$$\begin{aligned}
x^3 + y^3 = 6xy &\implies 3x^2 + 3y^2 y' = 6y + 6x y' \\
&\implies (y^2 - 2x) y' = 2y - x^2 \implies y' = \frac{2y - x^2}{y^2 - 2x}.
\end{aligned}$$

Suppose we wish to find a point on the curve where the tangent line is
horizontal. We have

$$\begin{aligned}
y' = 0 &\implies 2y - x^2 = 0 \implies y = x^2/2 \\
&\implies x^3 + x^6/8 = 3x^3 \implies x^3 = 16 \implies x = 2^{4/3}, \; y = 2^{5/3}.
\end{aligned}$$
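As a quick check of this example, the sketch below (with a hypothetical helper `folium_slope`) confirms numerically that the point $(2^{4/3}, 2^{5/3})$ lies on the curve and that the implicit slope formula gives a value close to zero there.

```python
def folium_slope(x, y):
    """Slope y' = (2y - x^2) / (y^2 - 2x) obtained by implicit differentiation
    of x^3 + y^3 = 6xy (valid where y^2 != 2x)."""
    return (2 * y - x**2) / (y**2 - 2 * x)

x0, y0 = 2 ** (4 / 3), 2 ** (5 / 3)
print(x0**3 + y0**3 - 6 * x0 * y0)  # residual of the curve equation, close to 0
print(folium_slope(x0, y0))         # close to 0: the tangent is horizontal
```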

Example 4.3.4 (Tangent to Ellipse). The equation of an ellipse in standard
form is

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1.$$

Implicit differentiation gives

$$\frac{2x}{a^2} + \frac{2y}{b^2} y' = 0.$$

If $(x_0, y_0)$ is a point on the ellipse, the slope $m$ of the tangent line there is given
by

$$\frac{2x_0}{a^2} + \frac{2y_0}{b^2} m = 0 \quad \text{or} \quad m = -\frac{x_0 b^2}{y_0 a^2}.$$

Hence the equation of the tangent line at $(x_0, y_0)$ is

$$y = y_0 - \frac{x_0 b^2}{y_0 a^2}(x - x_0) \quad \text{or} \quad \frac{y y_0 - y_0^2}{b^2} + \frac{x x_0 - x_0^2}{a^2} = 0 \quad \text{or} \quad \frac{y y_0}{b^2} + \frac{x x_0}{a^2} = 1.$$

This technique of implicit differentiation takes the following for granted:
that the given relation between $x$ and $y$ breaks into parts, each of which
implicitly defines $y$ as a function of $x$, and that each of these is differentiable. Ideally,
we should have criteria for deciding whether these assumptions hold. Such
criteria exist but involve the study of functions $f(x, y)$ of two variables. They can
be found in texts on Analysis such as Apostol [1] under the name of `Implicit
Function Theorem'.

Derivatives of Inverse Functions


Suppose $f$ is a differentiable function such that its inverse exists and is also
differentiable. Then, using the Chain Rule and assuming $f(a) = b$, we have:

$$\begin{aligned}
(f \circ f^{-1})(y) = y &\implies (f \circ f^{-1})'(b) = 1 \implies f'(a) (f^{-1})'(b) = 1 \\
&\implies (f^{-1})'(b) = \frac{1}{f'(a)}.
\end{aligned}$$

That was easy. But it leaves something unanswered. Is there a guarantee that
$f^{-1}$ will indeed be differentiable?

Theorem 4.3.5. Let $f$ be a continuous and monotonic bijection between two
intervals. Let $f'(a)$ exist and be non-zero. Then $f^{-1}$ is differentiable at $b = f(a)$
and the derivative is given by

$$(f^{-1})'(b) = \frac{1}{f'(a)}.$$

Proof. We begin by noting that if a line with slope $m \neq 0$ is reflected in
the $y = x$ line, the resulting line has slope $1/m$. The following picture now
represents a proof.

[Figure: the graphs of $f(x)$ and $f^{-1}(x)$ as mirror images in the line $y = x$, with the points $(a, b)$ and $(b, a)$ marked.]

An Alternate Proof: First, we note that $f^{-1}$ is a monotone function whose
image is an interval, hence $f^{-1}$ is continuous. Now, define a function $g$ by

$$g(y) = \begin{cases} \dfrac{f^{-1}(y) - f^{-1}(b)}{y - b} & \text{if } y \neq b \\[2mm] 1/f'(a) & \text{if } y = b \end{cases}.$$

Substituting $y = f(x)$ and $b = f(a)$ gives:

$$g(f(x)) = \begin{cases} \dfrac{x - a}{f(x) - f(a)} & \text{if } x \neq a \\[2mm] 1/f'(a) & \text{if } x = a \end{cases}.$$

So $g \circ f$ is continuous at $a$. Therefore $g = g \circ f \circ f^{-1}$ is continuous at $b$. This
gives the result. □

Task 4.3.6. Differentiate the given functions and identify the points where the
derivative exists:

(a) $g(x) = \sqrt{x}$  (b) $h(x) = \sqrt{x + \sqrt{x}}$

Inverse Trigonometric Functions


The sine function is neither one-one nor onto. We can make it onto simply by
choosing the codomain to be $[-1, 1]$ instead of $\mathbb{R}$. Now $\sin : \mathbb{R} \to [-1, 1]$ is still
not one-one, but we can choose a piece of the function which is one-one. A
standard choice is to restrict the domain to $[-\pi/2, \pi/2]$. On this domain, sine
is strictly increasing and hence one-one.

[Figure: The graph of the bijection $\sin : [-\pi/2, \pi/2] \to [-1, 1]$.]

This restriction has an inverse function $\sin^{-1} : [-1, 1] \to [-\pi/2, \pi/2]$. It is also
called arcsine and its values are denoted by $\arcsin(x)$. It is continuous because
it is monotone and its image is an interval. We can get its graph by reflecting
the $y = \sin x$ graph in the $y = x$ line:

[Figure: graphs of $y = \sin x$ and $y = \arcsin x$.]

Similarly, the cosine function can be made a bijection by choosing the
codomain to be $[-1, 1]$ and restricting the domain to $[0, \pi]$. On this domain,
cosine is strictly decreasing.

[Figure: The graph of the bijection $\cos : [0, \pi] \to [-1, 1]$.]

This restriction has an inverse function $\cos^{-1} : [-1, 1] \to [0, \pi]$. It is also denoted
by arccos and is continuous because it is monotone and its image is an interval.
We can get its graph by reflecting the $y = \cos x$ graph in the $y = x$ line:

[Figure: graphs of $y = \cos x$ and $y = \arccos x$.]
Comparing the graphs of $\sin^{-1} x$ and $\cos^{-1} x$ we see the first can be
converted to the second by translating up by $\pi/2$ and then reflecting in the
$y$-axis. This means that $\sin^{-1} x + \pi/2 = \cos^{-1}(-x)$. Similarly, we see that
$-\sin^{-1} x + \pi/2 = \cos^{-1} x$. Which familiar identities do these reflect?

The portion of $\tan x$ defined on $(-\pi/2, \pi/2)$ is a bijection with $\mathbb{R}$. Its
inverse is denoted by $\tan^{-1} x$ or $\arctan x$ and maps $\mathbb{R}$ to $(-\pi/2, \pi/2)$.

[Figure: graphs of $y = \tan x$ and $y = \arctan x$.]

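Theorem 4.3.7. The derivatives of the inverse trigonometric functions are:

$$\begin{aligned}
\arcsin' x &= \frac{1}{\sqrt{1 - x^2}}, & x &\in (-1, 1) \\
\arccos' x &= \frac{-1}{\sqrt{1 - x^2}}, & x &\in (-1, 1) \\
\arctan' x &= \frac{1}{1 + x^2}, & x &\in \mathbb{R}
\end{aligned}$$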

Proof. We apply the formula for differentiating inverse functions to the arcsine
function:

$$\arcsin' x = \frac{1}{\sin'(\arcsin x)} = \frac{1}{\cos(\arcsin x)}.$$

Now $\cos^2(\arcsin x) = 1 - \sin^2(\arcsin x) = 1 - x^2$ and since $\arcsin x \in [-\pi/2, \pi/2]$
we know that $\cos(\arcsin x) \ge 0$. Hence

$$\arcsin' x = \frac{1}{\sqrt{1 - x^2}}, \quad x \in (-1, 1).$$

The calculation for arccosine is similar and is left to the reader. Finally,

$$\arctan' x = \frac{1}{\tan'(\arctan x)} = \frac{1}{\sec^2(\arctan x)} = \frac{1}{1 + \tan^2(\arctan x)} = \frac{1}{1 + x^2}.$$
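The three derivative formulas for the inverse trigonometric functions can be spot-checked against difference quotients. This is a minimal sketch; the sample point $x = 0.3$ and the helper `central_diff` are assumptions made for the illustration.

```python
import math

def central_diff(f, a, h=1e-6):
    """Symmetric difference quotient approximating f'(a)."""
    return (f(a + h) - f(a - h)) / (2 * h)

x = 0.3
# Each line prints a numerical derivative next to the closed-form value.
print(central_diff(math.asin, x), 1 / math.sqrt(1 - x * x))
print(central_diff(math.acos, x), -1 / math.sqrt(1 - x * x))
print(central_diff(math.atan, x), 1 / (1 + x * x))
```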

The notation for inverse can be dangerous. Note the following:

• $\sin^{-1} x$ is the inverse sine function applied to $x$.
• $\sin x^{-1}$ is the sine function applied to $1/x$. The safer way to write
  it is $\sin(x^{-1})$.
• $(\sin x)^{-1}$ is $1/(\sin x)$. A common error is to mistake $\sin^{-1} x$ for
  $1/(\sin x)$.

Exponential Function
Theorem 4.3.8. The derivative of the exponential function is itself:

$$(e^x)' = e^x.$$

Proof. Consider $f(x) = \log x$. Its inverse function is $f^{-1}(x) = e^x$. Applying the
formula for differentiating an inverse function, we get:

$$(e^x)' = (f^{-1})'(x) = \frac{1}{f'(f^{-1}(x))} = \frac{1}{\log'(e^x)} = \frac{1}{1/e^x} = e^x.$$ □

It is a general principle in mathematics that objects or relations left
unchanged by operations are especially important. The fact that the
exponential function is unchanged by differentiation indicates it will be
a fundamental object in Calculus.

Task 4.3.9. Let $a > 0$. Show that $(a^x)' = a^x \log a$.


The differentiation of the exponential function can be combined with the
Chain Rule to differentiate arbitrary powers:

Theorem 4.3.10 (Power Rule). If $r \in \mathbb{R}$ then $(x^r)' = r x^{r-1}$ for $x > 0$.

Proof. $(x^r)' = (e^{r \log x})' = e^{r \log x} \, \dfrac{r}{x} = x^r \, \dfrac{r}{x} = r x^{r-1}$. □

Example 4.3.11. We'll differentiate the function $y = x^x$, with $x > 0$. We use
the same technique as in the proof of the Power Rule:

$$(x^x)' = (e^{x \log x})' = e^{x \log x} (x \log x)' = x^x (1 + \log x).$$
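A numerical comparison gives some reassurance about the formula for $(x^x)'$. The sketch below uses assumed sample points and a hypothetical helper `central_diff`; it is an illustration, not a proof.

```python
import math

def central_diff(f, a, h=1e-6):
    """Symmetric difference quotient approximating f'(a)."""
    return (f(a + h) - f(a - h)) / (2 * h)

power_tower = lambda x: x ** x                    # y = x^x, for x > 0
formula = lambda x: x ** x * (1 + math.log(x))    # claimed derivative x^x (1 + log x)

for x in (0.5, 1.0, 2.0):
    print(x, central_diff(power_tower, x), formula(x))
```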

At this point, we have a catalogue of basic functions whose derivatives are
known, and techniques for dealing with both their algebraic combinations as
well as compositions. So we can differentiate anything that can be described by
such combinations.

Task 4.3.12. Dierentiate exp( x2 + arctan x).

The trick that we used to differentiate $x^r$ and $x^x$ can be formalized into
a method called logarithmic differentiation. We start by noting that if $f$
takes only positive values then $f$ is differentiable if and only if $g = \log \circ f$ is
differentiable. (Since $g = \log \circ f \implies \exp \circ g = f$.) And then $g' = f'/f \implies
f' = f g'$.

Example 4.3.13. Let us express the $(x^x)'$ calculation as logarithmic
differentiation. Let $f(x) = x^x$ and $g(x) = \log f(x) = x \log x$. Then $g'(x) = 1 + \log x$.
Hence $f'(x) = f(x) g'(x) = x^x (1 + \log x)$.

Task 4.3.14. Use logarithmic differentiation to find the derivative of the
function $(\sin x)^{\cos x}$ for $x \in (0, \pi/2)$.
Example 4.3.15 (Hyperbolic Functions). We'll show how differentiation can
be used to discover the hyperbolic functions. Consider the hyperbola given by
$x^2 - y^2 = 1$. We want to find functions $x(t)$, $y(t)$ such that varying $t$ generates
all the points on the branch with $x > 0$.

[Figure: the hyperbola $x^2 - y^2 = 1$, with its branches $x < 0$ and $x > 0$.]

These functions have to satisfy $x(t)^2 - y(t)^2 = 1$. Differentiating both sides
with respect to $t$, we get

$$x(t) x'(t) - y(t) y'(t) = 0 \quad \text{or} \quad \frac{x'(t)}{y'(t)} = \frac{y(t)}{x(t)}.$$

(In discovery mode, we do not worry about dividing by zero.) It is natural to
try to arrange $y'(t) = x(t)$ and $x'(t) = y(t)$. This leads to $y''(t) = y(t)$. We
can easily check that functions of the form $y(t) = A e^{t} + B e^{-t}$ satisfy this
requirement. Now suppose we want the motion to start at $(1, 0)$ when $t = 0$.
This gives the equations $A + B = 0$ and $A - B = 1$, with solutions $A = 1/2$,
$B = -1/2$. Hence

$$y(t) = \frac{1}{2} e^t - \frac{1}{2} e^{-t} = \sinh t \quad \text{and} \quad x(t) = y'(t) = \frac{1}{2} e^t + \frac{1}{2} e^{-t} = \cosh t.$$

You may recall that we had already verified that the hyperbolic functions do
trace the hyperbola.

Task 4.3.16. Prove that $\cosh' x = \sinh x$ and $\sinh' x = \cosh x$.


Inverse Hyperbolic Functions
Task 4.3.17. Show that $\sinh : \mathbb{R} \to \mathbb{R}$ is a strictly increasing bijection.

It follows that the hyperbolic sine function has an inverse which is strictly
increasing as well as continuous. We denote it by $\sinh^{-1}$ or arcsinh, following
the same pattern as for inverse trigonometric functions.

The hyperbolic cosine function is even and hence not one-one. Therefore
we restrict the domain to $[0, \infty)$ and try again.

Task 4.3.18. Show that $\cosh : [0, \infty) \to [1, \infty)$ is a strictly increasing bijection.

The corresponding inverse function is called $\cosh^{-1}$ or arccosh. It is also
strictly increasing and continuous.

Task 4.3.19. Prove that $(\sinh^{-1} x)' = \dfrac{1}{\sqrt{x^2 + 1}}$ and $(\cosh^{-1} x)' = \dfrac{1}{\sqrt{x^2 - 1}}$.

Exercises for §4.3

1. Differentiate the following functions:

(a) $f(x) = \sqrt{x^2 + 1}$,  (b) $g(x) = \sin^2 x - \sin x^2$,  (c) $h(x) = \sin(\sin x)$,
(d) $k(x) = e^{-1/x^2}$,  (e) $\ell(x) = \log(1 + x^2)$,  (f) $r(x) = \pi^x - x^\pi$.

2. Use implicit differentiation to find the tangent line at a point $(x_0, y_0)$ on
the hyperbola given by $\dfrac{x^2}{a^2} - \dfrac{y^2}{b^2} = 1$.

3. Consider the families of curves $y = c x^2$ and $x^2 + 2y^2 = k^2$, where $c$ and $k$
vary over all real numbers.

Show that whenever a curve from one family cuts a curve from the other
family, their tangent lines are perpendicular to each other.

4. Graph the following functions:

(a) $\sin(\arcsin x)$,  (b) $\arcsin(\sin x)$.

(Keep in mind that arcsine inverts only a part of sine.)

5. Let $\cot^{-1}$ or arccot denote the inverse of $\cot : (0, \pi) \to \mathbb{R}$. Show that

$$\operatorname{arccot}'(x) = \frac{-1}{1 + x^2}.$$

6. Let $\sec^{-1}$ or arcsec denote the inverse of $\sec : (0, \frac{\pi}{2}) \cup (\frac{\pi}{2}, \pi) \to \mathbb{R} \setminus [-1, 1]$.
Show that

$$\operatorname{arcsec}' x = \frac{1}{|x| \sqrt{x^2 - 1}}.$$

7. Let $\csc^{-1}$ or arccsc denote the inverse of $\csc : (-\frac{\pi}{2}, 0) \cup (0, \frac{\pi}{2}) \to \mathbb{R} \setminus [-1, 1]$.
Show that

$$\operatorname{arccsc}' x = \frac{-1}{|x| \sqrt{x^2 - 1}}.$$

8. Use logarithmic differentiation to differentiate $y = x^{x^x}$, with $x > 0$.

9. Let $f$ and $g$ be differentiable functions, with $f(x) > 0$ for every $x$. Prove
that $\big(f(x)^{g(x)}\big)' = f(x)^{g(x)} \left( g'(x) \log f(x) + g(x) \dfrac{f'(x)}{f(x)} \right)$.

4.4 The First Fundamental Theorem


As we have seen with the logarithm function, important functions may be
defined not through an algebraic formula but as indefinite integrals. Such
functions are automatically continuous. Are they differentiable? Can we calculate
their derivatives? Let us consider the few simple examples we already know:

    f(x)       F(x) = ∫_0^x f(t) dt     F'(x)
    sgn(x)     |x|                      sgn(x), for x ≠ 0
    x          x^2/2                    x
    x^2        x^3/3                    x^2

Similarly, if we take $f(x) = 1/x$ and $F(x) = \log x = \int_1^x f(t) \, dt$, then $F'(x) =
1/x$. It seems that on differentiation, $F$ always reverts to $f$. Keeping the first
example in mind, we should add the qualifier, where $f$ is continuous.

This result is to be expected on physical grounds. Recall that one
motivation for defining integration the way we did was to obtain displacement
from velocity. If we have done so correctly, then differentiating the integral that
represents displacement should give us back the velocity.

The following rough argument gives geometric insight and also brings out
the need for assuming continuity. The change F (x + h) − F (x) is approximated
by the area of the trapezium whose vertices are at x, x + h on the x-axis and
the corresponding points on the graph of f:
[Figure: the trapezium with base from $x$ to $x + h$ on the $x$-axis and heights $f(x)$ and $f(x + h)$.]

Hence,

$$\frac{F(x + h) - F(x)}{h} \approx \frac{1}{h} \cdot \frac{f(x) + f(x + h)}{2} \, h = \frac{f(x) + f(x + h)}{2} \to f(x), \text{ as } h \to 0.$$

Theorem 4.4.1 (First Fundamental Theorem). Let $I$ be an interval and
$f : I \to \mathbb{R}$ be integrable on each subinterval $[a, b] \subseteq I$. Fix $a \in I$ and consider
the indefinite integral $F : I \to \mathbb{R}$ defined by

$$F(x) = \int_a^x f(t) \, dt.$$

Then $F'(x) = f(x)$ if $f$ is continuous at $x$. (If $x$ is an end-point, use the
appropriate one-sided notion of continuity and differentiability.)

Proof. For $h \neq 0$ we have

$$F(x + h) - F(x) = \int_a^{x+h} f(t) \, dt - \int_a^x f(t) \, dt = \int_x^{x+h} f(t) \, dt.$$

Hence,

$$\begin{aligned}
F(x + h) - F(x) - h f(x) &= \int_x^{x+h} f(t) \, dt - \int_x^{x+h} f(x) \, dt \\
&= \int_x^{x+h} (f(t) - f(x)) \, dt.
\end{aligned}$$

Define $\varphi(h) = \frac{1}{h} \int_x^{x+h} (f(t) - f(x)) \, dt$. Consider $\varepsilon > 0$. If $f$ is continuous at $x$,
there is a $\delta > 0$ such that $|t - x| < \delta$ implies $|f(t) - f(x)| < \varepsilon$. Therefore, if
$0 < |h| < \delta$, we obtain:

$$|\varphi(h)| = \frac{1}{|h|} \left| \int_x^{x+h} (f(t) - f(x)) \, dt \right| \le \frac{1}{|h|} |h| \varepsilon = \varepsilon.$$

Therefore, $\varphi(h) \to 0$ as $h \to 0$, and so $F'(x) = f(x)$. □
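The theorem can be illustrated numerically: approximate the indefinite integral by a Riemann-type sum, differentiate that approximation with a difference quotient, and compare with the integrand. In this sketch the integrand $\cos$, the base point $a = 0$, the midpoint rule and the helper names are all assumptions made for the example.

```python
import math

def integral(f, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    w = (b - a) / n
    return w * sum(f(a + (i + 0.5) * w) for i in range(n))

def F(x):
    """Indefinite integral of cos based at a = 0 (cos is an illustrative choice)."""
    return integral(math.cos, 0.0, x)

x, h = 1.2, 1e-4
# The difference quotient of F should be close to the integrand at x.
print((F(x + h) - F(x - h)) / (2 * h), math.cos(x))
```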


Example 4.4.2. Suppose we have to differentiate $F(x) = \int_0^x \sin \sqrt{t} \, dt$. By the
First Fundamental Theorem we know immediately that $F'(x) = \sin \sqrt{x}$. We
didn't have to first find a closed form expression for $F(x)$!

Example 4.4.3. We shall combine the First Fundamental Theorem and the
Chain Rule to differentiate $G(x) = \int_x^{x^2} \sin \sqrt{t} \, dt$, $x > 0$. First, let $F(x) =
\int_0^x \sin \sqrt{t} \, dt$, as in the previous example. Then

$$G(x) = \int_0^{x^2} \sin \sqrt{t} \, dt - \int_0^x \sin \sqrt{t} \, dt = F(x^2) - F(x).$$

Hence, by the Chain Rule,

$$G'(x) = F'(x^2) \, 2x - F'(x) = 2x \sin \sqrt{x^2} - \sin \sqrt{x} = 2x \sin |x| - \sin \sqrt{x}.$$
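The formula for $G'(x)$ can be compared with a direct numerical computation. The following sketch makes several assumptions for illustration: a midpoint rule for the integral, the sample point $x = 1.5$, and a symmetric difference quotient for the derivative.

```python
import math

def integral(f, a, b, n=4000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    w = (b - a) / n
    return w * sum(f(a + (i + 0.5) * w) for i in range(n))

def G(x):
    """G(x) = integral of sin(sqrt(t)) from t = x to t = x^2, for x > 0."""
    return integral(lambda t: math.sin(math.sqrt(t)), x, x * x)

x, h = 1.5, 1e-4
numeric = (G(x + h) - G(x - h)) / (2 * h)
formula = 2 * x * math.sin(abs(x)) - math.sin(math.sqrt(x))
print(numeric, formula)
```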

At this point, the First Fundamental Theorem is another method of
differentiation. However, it has also established a bridge between the realms of
differentiation and integration. We shall use this bridge in the next chapter to
supply integration with a rich supply of computational techniques.

Exercises for Ÿ4.4


1. Compute and graph F (x):
Z x Z x
(a) F (x) = tH(t) dt, (c) F (x) = [t] dt.
0 0
Z x Z x
(b) F (x) = sgn|t| dt. (d) F (x) = (−1)[t] dt.
0 0

(H is the Heaviside step function)

2. Find the derivative F 0 (x):

Z x3 Z x2
(a) F (x) = (1 + t2 )−3 dt, (b) F (x) = (1 + t2 )−3 dt.
0 x

3. We say that $f$ is a $C^k$ function if $f^{(k)}$ exists at every point of the domain of $f$ and is continuous. Prove that if $f$ is a $C^k$ function then its indefinite integral $F(x) = \int_a^x f(t)\,dt$ is a $C^{k+1}$ function.
4. Differentiate $g(x) = \int_0^{\int_0^x \sqrt{1+t^3}\,dt} \sqrt{1+t^3}\,dt$.
5. If $f$ is continuous on an interval $I$ and $g, h$ are differentiable with range in $I$, differentiate $\int_{g(x)}^{h(x)} f(t)\,dt$.

4.5 Extreme Values and Monotonicity


We now make a start on using differentiation to explore different aspects of functions. The very first question we consider is "How large or small a value can a certain function take?"

We say a function f : D → R has a global or absolute maximum at a point c if f(c) ≥ f(x) for every x ∈ D. Similarly, it has a global or absolute minimum at a point d if f(d) ≤ f(x) for every x ∈ D.
Various situations are possible:

1. f may not have an absolute maximum or an absolute minimum. For example, f(x) = x : R → R and f(x) = 1/x : (0, 1) → R.

2. f may have an absolute maximum but not an absolute minimum. For example, $f(x) = -x^2$ : R → R.

3. f may have an absolute minimum but not an absolute maximum. For example, $f(x) = x^2$ : R → R.

4. f may have both an absolute minimum and an absolute maximum, and these may occur several times. For example, sin : R → R.
The Extreme Value Theorem (Theorem 3.5.4) tells us that if the function
is continuous and the domain is of the form [a, b] then an absolute maximum
and an absolute minimum will certainly be present.

Absolute maxima and minima are collectively known as absolute extrema.
Local Extrema and Fermat's Theorem
We don't always need to know the very largest (or smallest) value of a function.
If we kick a ball into some rough ground, we know it will stop in a depression,
but it need not stop in the very deepest one. The stopping point will be lower
than the immediately surrounding points, but perhaps not lower than ones further off.

We say a function f : D → R has a local or relative maximum at a point c if there is an open interval I containing c such that f(c) ≥ f(x) for every x ∈ I ∩ D. Similarly, it has a local or relative minimum at a point d if there is an open interval I containing d such that f(d) ≤ f(x) for every x ∈ I ∩ D. Local maxima and minima are collectively known as local extrema.
Obviously, an absolute maximum will also be a local maximum, and an absolute minimum will be a local minimum. But local extrema need not be absolute extrema, and a function could well have local extrema without having any absolute extreme.

[Figure: graph of x + 2 sin x for −10 ≤ x ≤ 10.]
The function x + 2 sin x has several local maxima (discs) and local minima (squares) which are not absolute extrema.
Theorem 4.5.1 (Fermat's Theorem). Let $f(x)$ have a local extreme at an interior point $c$ of an interval in its domain. Then either $f'(c)$ does not exist or $f'(c) = 0$.
Proof. Suppose $f'(c)$ exists. We have to show that $f'(c) = 0$. Suppose $f'(c) > 0$, that is,
\[ \lim_{x \to c} \frac{f(x) - f(c)}{x - c} > 0. \]
Since the limit is positive, the secant slopes must themselves be positive once we are close to $c$. That is, there must be a $\delta > 0$ such that $0 < |x - c| < \delta \implies \frac{f(x) - f(c)}{x - c} > 0$. Then,

$c - \delta < x < c \implies f(x) < f(c) \implies c$ is not a point of local minimum,

$c < x < c + \delta \implies f(x) > f(c) \implies c$ is not a point of local maximum.

This rules out $f'(c) > 0$. We similarly rule out $f'(c) < 0$. □

Here is an example of a local extreme which occurs at a point where $f'$ does not exist:

Example 4.5.2. Consider $f(x) = |x|$. It has a local minimum at $x = 0$ but $f'(0)$ is not defined.

In the next example we have a point where $f'$ is zero but it is not a local extreme:

Example 4.5.3. Consider $f(x) = x^3$. Then $f'(0) = 0$ but there isn't a local extreme at $x = 0$.

We call a point $c$ a critical point or critical number of $f(x)$ if it is an interior point of an interval in the domain and either $f'(c)$ does not exist or $f'(c) = 0$.

Let f(x) have an interval [a, b] as domain. Then, by Fermat's Theorem, the local extremes of f occur either at critical points or at the end-points of the domain.

Example 4.5.4. Consider $f(x) = x^3 - 3x + 1$ with domain [0, 3]. We make the following calculations:

1. Calculate the function values at the endpoints: $f(0) = 1$ and $f(3) = 19$.

2. Find the critical points. Since $f$ is differentiable we only have to look for $f'(c) = 0$. This gives $3c^2 - 3 = 0$ or $c = \pm 1$. Thus $c = 1$ is the only critical point (in the given domain).

3. Calculate the function values at the critical points: $f(1) = -1$.

Thus the candidates for absolute extremes are only $f(0) = 1$, $f(1) = -1$ and $f(3) = 19$. So the absolute maximum is at $x = 3$ and the absolute minimum is at $x = 1$.
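The three steps of this example can be mirrored in a few lines of code. This is a minimal sketch rather than a general-purpose routine: the helper `absolute_extrema` is a name of our own choosing, and it assumes that the critical points have already been found by hand and are passed in as a list.

```python
def f(x):
    return x**3 - 3*x + 1

def absolute_extrema(f, endpoints, critical_points):
    """Compare f at the endpoints and at the critical points inside (a, b).

    Returns ((x_min, f(x_min)), (x_max, f(x_max))).
    """
    a, b = endpoints
    # Keep only the critical points that are interior to the domain.
    candidates = [a, b] + [c for c in critical_points if a < c < b]
    x_min = min(candidates, key=f)
    x_max = max(candidates, key=f)
    return (x_min, f(x_min)), (x_max, f(x_max))

# f'(c) = 3c^2 - 3 = 0 gives c = -1 and c = 1; c = -1 lies outside [0, 3].
print(absolute_extrema(f, (0, 3), [-1, 1]))  # ((1, -1), (3, 19))
```

Note that the endpoints are always included among the candidates, which guards against the first of the common errors mentioned in the text.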

The common errors made by students in such problems are: ignoring the endpoints, and forgetting those critical points where $f'$ does not exist.

Monotonicity
Theorem 4.5.5 (Monotonicity Theorem). Suppose $I$ is an interval and $f : I \to \mathbb{R}$ is differentiable on $I$.
1. If $f'(x) > 0$ for every $x \in I$ then $f$ is strictly increasing.
2. If $f'(x) \ge 0$ for every $x \in I$ then $f$ is increasing.
We also have the corresponding statements regarding negative derivatives and decreasing functions.

Proof. (1) Let $p, q \in I$ with $p < q$. We have to show that $f(p) < f(q)$. Since $f$ is continuous on $[p, q]$ it achieves its maximum and minimum over this interval. By Fermat's Theorem, since $f'$ is never zero, the points of maximum and minimum can only be the endpoints $p, q$.
If the maximum and minimum values are equal, then $f$ is a constant function on $[p, q]$, and $f' = 0$ there, contradicting $f' > 0$. So they are not equal and $f(p) \neq f(q)$. Suppose $f(q)$ is the minimum value over $[p, q]$. Then
\[ f'(q) = \lim_{x \to q^-} \frac{f(x) - f(q)}{x - q} \le 0, \]
since $p < x < q$ implies $f(x) \ge f(q)$. This contradicts the positivity of $f'$. It follows that $f(q)$ is the maximum value over $[p, q]$ and hence $f(p) < f(q)$.

(2) Let $p, q \in I$ with $p < q$. Take any $\varepsilon > 0$ and consider the function $g(x) = f(x) + \varepsilon x$. Then $g'(x) = f'(x) + \varepsilon > 0$ and $g$ is strictly increasing. Now,
\[ g(p) < g(q) \implies f(q) - f(p) > \varepsilon (p - q). \]
Thus $f(q) - f(p)$ is greater than every negative number and hence must be non-negative. □
Example 4.5.6. We will show that the equation $x^3 + 3x + 1 = 0$ has exactly one solution.

Let $f(x) = x^3 + 3x + 1$. Then $f(x)$ is a polynomial, hence continuous and differentiable everywhere.

We have $f(-1) = -3 < 0$ and $f(0) = 1 > 0$. So by the Intermediate Value Theorem applied to $f : [-1, 0] \to \mathbb{R}$ we have a $c \in (-1, 0)$ such that $f(c) = 0$, i.e. $c^3 + 3c + 1 = 0$. (We haven't found $c$ but we know it is somewhere in there. If required, we can use the Bisection Method to further narrow the range in which it lies.)

Now calculate the derivative: $f'(x) = 3x^2 + 3 \ge 3 > 0$, hence $f$ is strictly increasing, therefore one-one. So there can only be one $c$ with $f(c) = 0$.
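The remark about the Bisection Method can be made concrete. The following is a minimal sketch, with the function name `bisect` and the tolerance chosen by us for illustration; it assumes only that $f$ is continuous and changes sign on the starting interval, which is what the Intermediate Value Theorem requires.

```python
def f(x):
    return x**3 + 3*x + 1

def bisect(f, lo, hi, tol=1e-10):
    """Shrink [lo, hi] around a zero of f by repeated halving.

    Assumes f is continuous and f(lo), f(hi) have opposite signs.
    """
    if f(lo) * f(hi) > 0:
        raise ValueError("f must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid   # a zero lies in [lo, mid]
        else:
            lo = mid   # a zero lies in [mid, hi]
    return (lo + hi) / 2

c = bisect(f, -1.0, 0.0)
print(c, f(c))
```

Since $f$ is strictly increasing, the value returned approximates the unique solution of the equation.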
Theorem 4.5.7. Let $f, g$ be differentiable functions from an interval $I$ to $\mathbb{R}$.
1. If $f'(x) = 0$ for each $x \in I$ then $f(x) = \text{constant}$.
2. If $f'(x) = g'(x)$ for each $x \in I$ then $f(x) = g(x) + \text{constant}$.
Proof. Exercise. □

These results often enable us to characterize a function by properties of its derivative. For example, suppose a function $f$ satisfies $f'(x) = f(x)$ for every $x$. One such function is $e^x$. In fact, every function of the form $Ae^x$ has this property. So we wonder whether they are the only ones to have this property. We have a positive answer:

Theorem 4.5.8. If $f'(x) = k f(x)$ on an interval $I$ then $f(x) = Ae^{kx}$.

Proof. Consider $g(x) = f(x)e^{-kx}$. Then
\[ g'(x) = f'(x)e^{-kx} - kf(x)e^{-kx} = kf(x)e^{-kx} - kf(x)e^{-kx} = 0. \]
Hence $g(x) = A$, a constant, and $f(x) = Ae^{kx}$. □


Task 4.5.9. Suppose $f : \mathbb{R} \to \mathbb{R}$ is differentiable, $f' = f$ and $f(0) = 1$. Show that $f(x) = e^x$.
Now consider the sine and cosine functions. They satisfy the relation $f'' = -f$. More generally, every combination $a\cos x + b\sin x$ satisfies this relation. Again, we wonder if they are the only ones. We begin with a special case.

Task 4.5.10. Suppose $f : \mathbb{R} \to \mathbb{R}$ is differentiable, $f'' = -f$ and $f(0) = f'(0) = 0$. Show that $f(x) = 0$. (Hint: Differentiate the function $f^2 + (f')^2$.)
Task 4.5.11. Suppose $f : \mathbb{R} \to \mathbb{R}$ is differentiable and $f'' = -f$. Show that if $f(0) = a$ and $f'(0) = b$ then $f(x) = a\cos x + b\sin x$.

An equation involving a function and its derivatives is called a differential equation. Mathematical modeling of physical and economic systems usually leads to differential equations and the goal is to find all the functions which satisfy them. These functions are called their solutions. We have seen that the solutions of $f' = kf$ have the form $f(x) = Ae^{kx}$ while solutions of $f'' = -f$ have the form $f(x) = a\cos x + b\sin x$. These are two simple but quite important examples of differential equations.

We close with a curious property of derivatives. The derivative of a function need not be continuous. Nevertheless, it always has the Intermediate Value Property!

Theorem 4.5.12 (Darboux's Theorem). Let $a < b$ and $f : [a, b] \to \mathbb{R}$ be differentiable. Suppose $L$ is strictly between $f'(a)$ and $f'(b)$. Then there is $c \in (a, b)$ such that $f'(c) = L$.

Proof. First, suppose $L = 0$. Then $f'(a)f'(b) < 0$. Since $f$ is continuous, it assumes its maximum and minimum values on $[a, b]$. If either occurs at a point $c \in (a, b)$ then by Fermat's Theorem we have $f'(c) = 0$. So, suppose they are assumed only on $a, b$. We may assume that the maximum is assumed on $a$ and the minimum on $b$. Then $f'(a), f'(b) \le 0$, a contradiction. This resolves the $L = 0$ case.
Now let $L \neq 0$. Consider $g(x) = f(x) - Lx$. Taking, say, $f'(a) < L < f'(b)$ (the other case is similar), we have $g'(a) = f'(a) - L < 0 < f'(b) - L = g'(b)$. So, by the first case applied to $g$, there is $c \in (a, b)$ such that $g'(c) = 0$, hence $f'(c) = L$. □

Exercises for §4.5


1. Consider the function $f(x) = xe^{-x^2}$ with domain $[-1/2, 2]$.
(a) Identify the critical points.
(b) Find the absolute maximum and minimum values of the function.

2. Consider the function $f(x) = 1 - x^{2/3}$ with domain $[-1, 1]$.
(a) Identify the critical points.
(b) Find the absolute maximum and minimum values of the function.

3. Consider the rectangle inscribed inside a triangle as shown below. What is its maximum possible area?

4. Prove that $f(x) = \frac{x^3}{3} + 2x - 2\cos x$ has exactly one zero.

5. Show that $x^2 = x\sin x + \cos x$ for exactly two values of $x$.

6. Suppose that $f : \mathbb{R} \to \mathbb{R}$ satisfies $f^{(n+1)} = 0$. Prove that $f$ is a polynomial with degree $n$ or less.

7. Find the equation $y(x)$ of a curve such that the tangent line at the point $(x, y)$ intersects the $x$-axis at $x - 1$.

8. Suppose a function $f$ satisfies the differential equation $f'(x) = k(M - f(x))$. Find the general form of $f$.

9. Find all functions $f : \mathbb{R} \to \mathbb{R}$ with the property that $x \neq y$ implies $f(x) - f(y) \le (x - y)^2$.

10. Let $f$ be a differentiable function such that every tangent line to its graph passes through the origin. Show that the graph of $f$ is a line through the origin.

11. Prove that there is no function $f$ such that $f'(x) = \operatorname{sgn}(x)$ for every $x \in \mathbb{R}$.

12. Let $I$ be an interval and $f : I \to \mathbb{R}$ a differentiable function such that $f'(x)$ is never zero. Show that $f$ is strictly monotonic.

13. Prove that if a derivative $f'$ is monotonic then it is continuous.

4.6 Derivative Tests and Curve Sketching


We defined local extrema and critical points in §4.5. Fermat's Theorem informs us that a local extreme in the interior of an interval can only occur at a critical point. At the same time, it is possible that a critical point fails to be a local extreme. A critical point of the last kind is called a saddle point. Naturally, we wish to be able to classify a given critical point as a local maximum, local minimum, or saddle point. One way to do this is to use the values of the derivative of the function on either side of the critical point. For example, consider the following graph:

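[Figure: a graph passing, from left to right, through a saddle point, a local maximum and a local minimum. On the four successive intervals between these points the derivative satisfies f′ > 0, f′ > 0, f′ < 0, f′ > 0.]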

As we move from left to right and pass through the saddle point, the derivative changes from positive to zero and back to positive. Thus, it has the same sign on each side of the saddle point. As we pass through the local maximum the derivative changes from positive to negative, and at the local minimum it changes from negative to positive. These observations give a test for deciding whether a critical point is a local extreme and of what kind.

Theorem 4.6.1 (First Derivative Test). Let a function $f$ be continuous on an interval $(a, b)$ and let $c \in (a, b)$ be a critical point of $f$. Suppose $f$ is differentiable on $(a, b)$ except perhaps at $c$. Then,
1. If $f'(x) > 0$ for $x \in (a, c)$ and $f'(x) < 0$ for $x \in (c, b)$ then $f$ has a local maximum at $c$.
2. If $f'(x) < 0$ for $x \in (a, c)$ and $f'(x) > 0$ for $x \in (c, b)$ then $f$ has a local minimum at $c$.
3. If $f'$ has the same sign on either side of $c$ then $f$ has a saddle point at $c$.
Proof. Suppose $f'(x) > 0$ for $x \in (a, c)$ and $f'(x) < 0$ for $x \in (c, b)$. By the Monotonicity Theorem, $f$ is strictly increasing on $(a, c)$ and strictly decreasing on $(c, b)$. The continuity of $f$ then gives us that $f$ is strictly increasing on $(a, c]$ and strictly decreasing on $[c, b)$. It follows that $f(c)$ is the largest value taken by $f(x)$ on $(a, b)$ and hence there is a local maximum at $c$.

Similarly, if $f'(x) < 0$ for $x \in (a, c)$ and $f'(x) > 0$ for $x \in (c, b)$, there is a local minimum at $c$.

But if $f'(x)$ has the same sign on both sides of $c$ then values on one side are higher and on the other are lower. Hence there is neither a local maximum nor a local minimum at $c$. □
Example 4.6.2. Consider $f(x) = x^2e^x$. This is a differentiable function, so its critical points are given by the derivative being zero. We have $f'(x) = 2xe^x + x^2e^x = x(x + 2)e^x$. Hence
\[ f'(c) = 0 \iff c(c + 2) = 0 \iff c = 0, -2. \]
To identify the nature of the critical points we have to find the sign of the derivative on either side of them:

             x < −2     −2 < x < 0     x > 0
  f′(x)        +            −            +

By the First Derivative Test, there is a local maximum at $-2$ and a local minimum at $0$. The function increases on $(-\infty, -2)$ to the value $4e^{-2} \approx 0.54$ at $-2$, then decreases to the value $0$ at $0$. Beyond $0$ it increases again. Note that $\lim_{x \to \infty} x^2e^x = \infty$ and $\lim_{x \to -\infty} x^2e^x = \lim_{x \to \infty} \frac{x^2}{e^x} = 0$.

Here is the graph of $f(x)$ showing these features:

[Figure: graph of $x^2e^x$.]
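The sign pattern used by the First Derivative Test in this example can be reproduced mechanically. The sketch below is a heuristic, not a proof: the function `classify` is our own name, and it samples $f'$ at a small distance `h` on either side of the critical point, which is reliable only when no other critical point lies within that distance.

```python
import math

def f_prime(x):
    """Derivative of x^2 e^x, namely x (x + 2) e^x."""
    return x * (x + 2) * math.exp(x)

def classify(f_prime, c, h=1e-3):
    """Classify a critical point c from the sign of f' just left and right of c."""
    left, right = f_prime(c - h), f_prime(c + h)
    if left > 0 and right < 0:
        return "local maximum"
    if left < 0 and right > 0:
        return "local minimum"
    if left * right > 0:
        return "saddle point"
    return "inconclusive"

for c in (-2, 0):
    print(c, classify(f_prime, c))
```

Applied to the derivative $3x^2$ of $x^3$ at $c = 0$, the same helper reports a saddle point, in agreement with the third part of the test.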

Example 4.6.3. Consider $f(x) = x + \sin x$ on $[0, 2\pi]$. We have $f'(x) = 1 + \cos x$ and so $f'(c) = 0 \iff \cos c = -1$. Thus the only critical point in the given domain is $c = \pi$. We have $f'(x) = 1 + \cos(x) \ge 0$ always and so $f(x)$ is monotonically increasing in the domain. In particular, although there is a horizontal tangent at $c = \pi$, it is not a local extreme but a saddle point.

[Figure: graph of $x + \sin x$ on $[0, 2\pi]$.]

We have seen that the first derivative tells us whether a function is increasing or decreasing, and how fast. We can apply the same logic to get information from the second derivative. The sign of $f''$ will determine whether $f'$ is rising or falling, and therefore whether the graph of $f$ rises or falls with increasing or decreasing steepness.

A function f : I → R is said to be convex on I if its graph over every interval [a, b] in I lies below the secant line through the endpoints of the graph over that interval.

The graph of a convex function turns upwards as we move from left to right. The special exercises at the end of this chapter provide more insight related to this comment.

Similarly, f : I → R is said to be concave on I if its graph over every interval [a, b] in I lies above the secant line through the endpoints of the graph over that interval. The graph of a concave function turns downwards as we move from left to right.

The formal definitions are as follows: Consider a function $f : I \to \mathbb{R}$ where $I$ is an interval. Then

1. $f$ is called convex on $I$ if $f(x) \le f(a) + \frac{f(b) - f(a)}{b - a}(x - a)$ for any $a, x, b \in I$ such that $a < x < b$.
2. $f$ is called concave on $I$ if $f(x) \ge f(a) + \frac{f(b) - f(a)}{b - a}(x - a)$ for any $a, x, b \in I$ such that $a < x < b$.
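The inequality in the definition of convexity can be tested on sample points. This is only an illustrative sketch: the name `lies_below_secant` and the small tolerance are our own choices, and the function examines a single pair of endpoints $a, b$ at finitely many $x$, so a `True` result is evidence of convexity, not a proof.

```python
import math

def lies_below_secant(f, a, b, samples=100, tol=1e-12):
    """Check f(x) <= f(a) + (f(b) - f(a)) / (b - a) * (x - a) at sample points in (a, b)."""
    slope = (f(b) - f(a)) / (b - a)
    for i in range(1, samples):
        x = a + (b - a) * i / samples
        if f(x) > f(a) + slope * (x - a) + tol:
            return False
    return True

print(lies_below_secant(math.exp, -1, 2))        # exp is convex
print(lies_below_secant(math.sin, -math.pi, 0))  # sin is convex on [-pi, 0]
print(lies_below_secant(math.sin, 0, math.pi))   # sin is concave on [0, pi]
```

Reversing the inequality in the helper would give the corresponding check for concavity.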
Task 4.6.4. Can a function be both convex and concave?
The convex or concave nature of a function is called its convexity. A function may be convex over one interval and concave over another. A point where the function is continuous and switches its convexity is called an inflection point of the function. For example, the sine function is convex on [−π, 0] and concave on [0, π]. Hence 0 is an inflection point.

Theorem 4.6.5 (Convexity Test). Let $f$ be twice differentiable on an interval $I$. Then
1. $f'' \ge 0$ on $I$ implies $f$ is convex on $I$.
2. $f'' \le 0$ on $I$ implies $f$ is concave on $I$.
3. If $f''$ is continuous at an inflection point $c$ then $f''(c) = 0$.
Proof. First, suppose $f'' \ge 0$ on $I$. Let $c, d \in I$ with $c < d$. The secant from $(c, f(c))$ to $(d, f(d))$ has the equation
\[ y = f(c) + \frac{f(d) - f(c)}{d - c}(x - c). \]
Consider the difference $g(x) = f(c) + \frac{f(d) - f(c)}{d - c}(x - c) - f(x)$. Note that $g(c) = g(d) = 0$. Further, $g'' = -f'' \le 0$ and so $g'$ is a decreasing function. We wish to show that for each $x \in (c, d)$, $g(x) \ge 0$. Suppose that $g(x) < 0$ at some point $x \in (c, d)$. By the Monotonicity Theorem, we obtain $\alpha, \beta$ as follows (otherwise $g$ would be increasing on $[c, x]$ or decreasing on $[x, d]$, which is incompatible with $g(c) = g(d) = 0 > g(x)$):
- $\alpha \in (c, x)$ and $g'(\alpha) < 0$,
- $\beta \in (x, d)$ and $g'(\beta) > 0$.

This contradicts $g'$ being a decreasing function. Hence $g(x) < 0$ is impossible, and $f$ is convex.

If $f'' \le 0$ on $I$, apply the first part to $-f$.

For the third part, suppose $f''(c) > 0$. Then, by continuity, $f'' > 0$ in an interval $J$ centered at $c$. So $f$ is convex on $J$ and $c$ is not an inflection point. This rules out $f''(c) > 0$. We can similarly rule out $f''(c) < 0$. □

Theorem 4.6.6 (Second Derivative Test). Let $f$ have a critical point at $c$ and $f''$ be continuous in an open interval containing $c$. Then
1. $f''(c) > 0$ implies there is a local minimum at $c$.
2. $f''(c) < 0$ implies there is a local maximum at $c$.

Proof. Let $f''(c) > 0$. By continuity, $f'' > 0$ in an open interval containing $c$. Then $f'$ is strictly increasing in that interval. Since $f'(c) = 0$, it follows that $f'$ changes from negative to positive at $c$, and there is a local minimum at $c$ (by the First Derivative Test).

For the second part, apply the first part to $-f$. □

Example 4.6.7. Let $f(x) = x^2e^x$. We saw earlier that this has a local maximum at $-2$ and a local (as well as absolute) minimum at $0$. Now we identify the inflection points and convexity. First, we calculate the second derivative:
\[ f'(x) = (x^2 + 2x)e^x \implies f''(x) = (x^2 + 4x + 2)e^x. \]
Then we identify the possible inflection points:
\[ f''(c) = 0 \iff c^2 + 4c + 2 = 0 \iff c = -2 \pm \sqrt{2} \approx -3.4, -0.6. \]

               x < −2 − √2     −2 − √2 < x < −2 + √2     x > −2 + √2
  f″(x)             +                    −                     +
  Convexity      Convex               Concave                Convex

Note that $f''(-2) = -2e^{-2} < 0$ confirms the local maximum at $-2$ and $f''(0) = 2 > 0$ confirms the local minimum at $0$.

Here is the graph of $f(x)$ showing the convex parts as solid curves and the concave part as a dashed curve:

[Figure: graph of $x^2e^x$, with the convex parts solid and the concave part dashed.]
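The convexity table for $x^2e^x$ can be checked by evaluating the second derivative at one point in each interval. The sketch below uses names of our own (`f_second`, `convexity`) and relies on the closed form $f''(x) = (x^2 + 4x + 2)e^x$; it samples one point per interval and is not a substitute for the sign analysis.

```python
import math

def f_second(x):
    """Second derivative of x^2 e^x."""
    return (x * x + 4 * x + 2) * math.exp(x)

def convexity(x):
    """Report the convexity suggested by the sign of f'' at x."""
    value = f_second(x)
    if value > 0:
        return "convex"
    if value < 0:
        return "concave"
    return "undetermined"

# Zeros of f'': the candidates for inflection points.
candidates = [-2 - math.sqrt(2), -2 + math.sqrt(2)]
print([round(c, 1) for c in candidates])  # [-3.4, -0.6]

# One sample point from each of the three intervals.
for x in (-5, -2, 0):
    print(x, convexity(x))
```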

Example 4.6.8. Let $f(x) = x + \sin(x)$ on the interval $[0, 2\pi]$. We saw in Example 4.6.3 that the only critical point is at $x = \pi$ and this is not a local maximum or minimum. Now we calculate the second derivative:
\[ f'(x) = 1 + \cos(x) \implies f''(x) = -\sin(x) \implies f''(x) = 0 \text{ at } x = 0, \pi, 2\pi. \]

               0 < x < π     π < x < 2π
  f″(x)            −              +
  Convexity     Concave        Convex

[Figure: graph of $x + \sin x$ on $[0, 2\pi]$.]

Remark: If $c$ is a critical point of $f$ and $f''(c) = 0$, the Second Derivative Test is inconclusive. For example, each of the following functions has a critical point at $c = 0$ with $f''(c) = 0$ but the first has a local maximum there, the next has a local minimum, and the last has a saddle point: $-x^4$, $x^4$, $x^3$.

Curve Sketching
We have seen how first and second derivative calculations can give us key features of a graph. We can capture all the essential aspects of a function's behaviour by supplementing these with the following: domain, axis-intercepts, points of discontinuity, symmetry (even, odd, periodic), asymptotes (vertical, slant).
 
Example 4.6.9. Consider $f(x) = \arctan\left(\frac{x - 1}{x + 1}\right)$.

Domain: Since the domain of arctan is $\mathbb{R}$, the only point where this expression is undefined is $x = -1$. Hence the domain is $\mathbb{R} \setminus \{-1\}$. Note also that $f(x) \in (-\pi/2, \pi/2)$.
Intercepts: The function is zero at $x = 1$. It cuts the $y$-axis at $y = f(0) = \arctan(-1) = -\pi/4$.
Symmetry: We have $f(2) = \arctan(1/3)$ and $f(-2) = \arctan(3)$. They are positive and unequal (arctan is 1-1) so $f(x)$ is neither even nor odd.

Vertical Asymptotes: As $f(x)$ is continuous on its domain, the only possibility of vertical asymptotes is at $x = -1$. So we calculate the one-sided limits there:
\[ \lim_{x \to -1^+} \frac{x - 1}{x + 1} = \lim_{t \to 0^+} \frac{t - 2}{t} = -\infty \implies \lim_{x \to -1^+} \arctan\left(\frac{x - 1}{x + 1}\right) = -\frac{\pi}{2}, \]
\[ \lim_{x \to -1^-} \frac{x - 1}{x + 1} = \lim_{t \to 0^+} \frac{t + 2}{t} = \infty \implies \lim_{x \to -1^-} \arctan\left(\frac{x - 1}{x + 1}\right) = \frac{\pi}{2}. \]
Since the limits are finite there isn't a vertical asymptote at $x = -1$. They are still useful in plotting the graph.

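Horizontal Asymptotes:
\[ \lim_{x \to \infty} \frac{x - 1}{x + 1} = 1 \implies \lim_{x \to \infty} \arctan\left(\frac{x - 1}{x + 1}\right) = \arctan(1) = \frac{\pi}{4}, \]
\[ \lim_{x \to -\infty} \frac{x - 1}{x + 1} = 1 \implies \lim_{x \to -\infty} \arctan\left(\frac{x - 1}{x + 1}\right) = \arctan(1) = \frac{\pi}{4}. \]
Therefore $y = \pi/4$ is a horizontal asymptote on both sides.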

Critical Points:
\[ \frac{d}{dx} \arctan\left(\frac{x - 1}{x + 1}\right) = \frac{1}{1 + \left(\frac{x - 1}{x + 1}\right)^2} \, \frac{d}{dx}\left(\frac{x - 1}{x + 1}\right) = \frac{(x + 1)^2}{(x + 1)^2 + (x - 1)^2} \times \frac{(x + 1) - (x - 1)}{(x + 1)^2} = \frac{2}{2x^2 + 2} = \frac{1}{x^2 + 1}. \]
The derivative $f'(x)$ always exists and is never zero, so there are no critical points. In fact $f'(x) > 0$ and so $f$ is strictly increasing on any interval in its domain. So $f$ is strictly increasing on $(-\infty, -1)$ and also on $(-1, \infty)$.
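The simplification of the derivative to $1/(x^2 + 1)$ can be sanity-checked numerically. The sketch below uses our own helper names (`f`, `f_prime_formula`, `central_diff`) and compares the closed form with a central difference at a few points on both sides of $x = -1$; agreement is expected only up to rounding and truncation error.

```python
import math

def f(x):
    """f(x) = arctan((x - 1) / (x + 1)), defined for x != -1."""
    return math.atan((x - 1) / (x + 1))

def f_prime_formula(x):
    """Closed form obtained from the Chain Rule and Quotient Rule."""
    return 1 / (x * x + 1)

def central_diff(g, x, h=1e-6):
    """Central difference approximation to g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

for x in (-3.0, -1.5, 0.0, 2.0):
    print(x, f_prime_formula(x), central_diff(f, x))
```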
Convexity: $f''(x) = \frac{d}{dx}\,\frac{1}{x^2 + 1} = \frac{-2x}{(x^2 + 1)^2}$.
We have $f''(x) > 0$ for $x < 0$ and $f''(x) < 0$ for $x > 0$. So $f$ is convex on $(-\infty, -1)$ and on $(-1, 0)$. It is concave on $(0, \infty)$. The only inflection point is $x = 0$.
Here is the graph of $f$:

[Figure: graph of $f$, with the horizontal asymptote $y = \pi/4$.]


Example 4.6.10. Let $f(x) = \frac{x^2}{x^2 + 9}$.
Domain: Clearly, the domain is $\mathbb{R}$. And the image is in $[0, 1)$.
Intercepts: The function is zero at $x = 0$.
Symmetry: The function is even.
Vertical Asymptotes: As $f$ is continuous on $\mathbb{R}$ it has no vertical asymptotes.

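Horizontal Asymptotes:
\[ \lim_{x \to \infty} \frac{x^2}{x^2 + 9} = \lim_{x \to \infty} \frac{1}{1 + 9/x^2} = 1, \]
\[ \lim_{x \to -\infty} \frac{x^2}{x^2 + 9} = \lim_{x \to -\infty} \frac{1}{1 + 9/x^2} = 1. \]
So $y = 1$ is a horizontal asymptote on each side.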

Critical Points:
\[ \frac{d}{dx}\left(\frac{x^2}{x^2 + 9}\right) = \frac{2x(x^2 + 9) - x^2(2x)}{(x^2 + 9)^2} = \frac{18x}{(x^2 + 9)^2}. \]
The only critical point is $x = 0$. We have $f'(x) < 0$ for $x < 0$ and $f'(x) > 0$ for $x > 0$. So the First Derivative Test implies that there is a local minimum at $x = 0$. Note that $f(0) = 0$.
Convexity:
\[ \frac{d^2}{dx^2}\left(\frac{x^2}{x^2 + 9}\right) = \frac{d}{dx}\left(\frac{18x}{(x^2 + 9)^2}\right) = 18\,\frac{(x^2 + 9) - 4x^2}{(x^2 + 9)^3} = 54\,\frac{3 - x^2}{(x^2 + 9)^3}. \]
The possible inflection points are $x = \pm\sqrt{3}$. Note that $f(\pm\sqrt{3}) = 0.25$.

               x < −√3     −√3 < x < √3     x > √3
  f″(x)           −              +              −
  Convexity    Concave        Convex        Concave

[Figure: graph of $f$, with the horizontal asymptote $y = 1$.]
Example 4.6.11. Let $f(x) = (x - x^3)^{1/3}$. The computations are a little lengthy, so we just give the results. Verifying them will be an excellent test of your algebra and differentiation skills!

Domain: The domain is $\mathbb{R}$.
Intercepts: The function is zero at $x = 0, \pm 1$.
Symmetry: The function is odd.
Vertical Asymptotes: As $f$ is continuous on $\mathbb{R}$ it has no vertical asymptotes.

Slant Asymptotes: It is evident that the function has infinite limits at $\pm\infty$ and so has no horizontal asymptotes. The form of the function suggests it should behave like $y = -x$ for large $x$. Let us look for slant asymptotes:
\[ a = \lim_{x \to \infty} \frac{(x - x^3)^{1/3}}{x} = \lim_{x \to \infty} \left(\frac{1}{x^2} - 1\right)^{1/3} = -1, \]
\[ b = \lim_{x \to \infty} \left((x - x^3)^{1/3} - (-x)\right) = \lim_{x \to \infty} \frac{x^{-1}}{\left(\frac{1}{x^2} - 1\right)^{2/3} - \left(\frac{1}{x^2} - 1\right)^{1/3} + 1} = 0. \]
This confirms that $y = ax + b = -x$ is a slant asymptote. By symmetry, it is a slant asymptote on both sides.

Critical Points:
\[ \frac{df}{dx}(x) = \frac{1 - 3x^2}{3(x - x^3)^{2/3}}. \]
The derivative is undefined for $x = 0, \pm 1$. It has infinite limit at these points, indicating a vertical slope. The derivative is zero at $x = \pm 1/\sqrt{3}$. Thus, there are five critical points. The intervals of increase and decrease are:

  x        (−∞, −1)   (−1, −1/√3)   (−1/√3, 0)   (0, 1/√3)   (1/√3, 1)   (1, ∞)
  f′(x)       −            −             +            +           −          −

Convexity:
\[ \frac{d^2f}{dx^2}(x) = -\frac{2 + 6x^2}{9(x - x^3)^{5/3}}. \]
The second derivative is never zero. But there are possible inflection points where it is undefined, i.e., at $x = 0, \pm 1$.

               x < −1     −1 < x < 0     0 < x < 1     1 < x
  f″(x)          −            +              −            +
  Convexity   Concave      Convex        Concave      Convex

The resulting graph:

[Figure: graph of $(x - x^3)^{1/3}$ for $-2 \le x \le 2$.]
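Since the computations in this example are lengthy, a numerical cross-check is reassuring. The sketch below assumes the formulas $f'(x) = \frac{1 - 3x^2}{3(x - x^3)^{2/3}}$ and $f''(x) = -\frac{2 + 6x^2}{9(x - x^3)^{5/3}}$, which follow from direct differentiation and agree with the sign tables above. The helper names (`cbrt`, `f1`, `f2`, `central_diff`) are ours, and `cbrt` is a hand-written real cube root because raising a negative float to the power 1/3 in Python does not return the real root.

```python
import math

def cbrt(u):
    """Real cube root, valid for negative arguments as well."""
    return math.copysign(abs(u) ** (1.0 / 3.0), u)

def f(x):
    return cbrt(x - x**3)

def f1(x):
    """First derivative, valid away from x = 0, 1, -1."""
    return (1 - 3 * x * x) / (3 * cbrt(x - x**3) ** 2)

def f2(x):
    """Second derivative, valid away from x = 0, 1, -1."""
    return -(2 + 6 * x * x) / (9 * cbrt(x - x**3) ** 5)

def central_diff(g, x, h=1e-5):
    """Central difference approximation to g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

# One sample point from each interval of the convexity table.
for x in (-2.0, -0.5, 0.5, 2.0):
    print(x, f1(x), central_diff(f, x), f2(x), central_diff(f1, x))
```

At each sample point the sign of `f2(x)` should match the corresponding column of the convexity table.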

Exercises for §4.6


1. Give an example of a function with a saddle point where the function is not differentiable.

2. For each of the following functions, use the first derivative to find and classify the critical points, and identify the intervals of decrease and increase.
(a) $f(x) = \sin^2 x$, $x \in [-\pi, \pi]$
(b) $g(x) = e^{-x^2/2}$, $x \in [-3, 3]$
(c) $h(x) = xe^{-x^2/2}$, $x \in [-3, 3]$
(d) $k(x) = x|x - 1|$, $x \in [-1, 2]$

3. For each function in the previous exercise, find the intercepts and asymptotes, use the second derivative to find the inflection points, identify the intervals of convexity and concavity, and incorporate this information in their graphs.

4. Consider the function plotted in Example 4.6.9. Show that on each of the intervals $(-\infty, -1)$ and $(-1, \infty)$ it has the form $\arctan(x) + C$. Is this surprising?

5. Make a detailed sketch of the graph of $f(x) = (x^2 - x^3)^{1/3}$.

6. Show that every cubic polynomial has exactly one inflection point.

7. Let $u : [0, 1] \to \mathbb{R}$ be a twice continuously differentiable function which satisfies the differential equation $u''(x) = e^x u(x)$.

(a) Show that $u$ does not have a positive local maximum or a negative local minimum in $(0, 1)$.
(b) Suppose $u(0) = u(1) = 0$. Show that $u = 0$.

8. Suppose $f$ satisfies $x^2 f''(x) + 4x f'(x) + 2f(x) \ge 0$ on $(a, b)$ and $f(a) = f(b) = 0$. Show that $f(x) \le 0$ on $[a, b]$. (Hint: Consider $g(x) = x^2 f(x)$.)

9. Prove that $e^x > 1 + x + x^2/2$ for $x > 0$.

10. Consider a function $f$ on an interval $I$. Prove the following:
(a) $f$ is convex on $I$ if and only if $f((1 - t)x + ty) \le (1 - t)f(x) + tf(y)$ for any $x, y \in I$ and $t \in [0, 1]$.
(b) $f$ is concave on $I$ if and only if $f((1 - t)x + ty) \ge (1 - t)f(x) + tf(y)$ for any $x, y \in I$ and $t \in [0, 1]$.

Differential Equations
We'll solve a very important class of differential equations, called `second-order ordinary differential equations with constant coefficients'. Such differential equations crop up in the study of mechanics, waves, electrical circuits and market equilibrium. They have the form
\[ f'' + af' + bf = g, \]
where $a, b \in \mathbb{R}$ and $g$ is some given function. These generalize $f'' = -f$, which we have already solved. We shall show that we can go from the special case to the general by a completion of squares, analogous to how we go from finding square roots to solving a general quadratic equation.

First, we consider another special case: $f'' = f$.

A1. Suppose $f : \mathbb{R} \to \mathbb{R}$ satisfies $f'' = f$. Show that $f(x) = Ae^x + Be^{-x}$. (Hint: Differentiate the functions $f + f'$ and $f - f'$.)
A2. Suppose $f : \mathbb{R} \to \mathbb{R}$ satisfies $f'' = f$. Show the following:
(a) If $f(0) = 0$ and $f'(0) = 1$ then $f(x) = \sinh x$.
(b) If $f(0) = 1$ and $f'(0) = 0$ then $f(x) = \cosh x$.
With the help of the cases $f'' = \pm f$ we can solve a general equation $f'' = kf$:
A3. Suppose $f : \mathbb{R} \to \mathbb{R}$ satisfies $f'' = kf$ for some $k \in \mathbb{R}$, $f(0) = A$ and $f'(0) = B$. Prove the following:
(a) If $k = 0$ then $f(x) = A + Bx$.
(b) If $k = -w^2 < 0$ with $w > 0$ then $f(x) = A\cos wx + \frac{B}{w}\sin wx$.
(c) If $k = w^2 > 0$ with $w > 0$ then $f(x) = A\cosh wx + \frac{B}{w}\sinh wx$.
(Hint: For (b) and (c), consider $g(x) = f(x/w)$.)


The differential equation $f'' + af' + bf = g$ is called `homogeneous' if $g = 0$ and `inhomogeneous' otherwise. The homogeneous case can be completely solved as follows:

A4. Consider the homogeneous differential equation $f'' + af' + bf = 0$. The corresponding `characteristic equation' is $\lambda^2 + a\lambda + b = 0$, with discriminant $d = a^2 - 4b$.
(a) Prove that $f$ is a solution of this differential equation if and only if $f(x) = e^{-ax/2}v(x)$ where $v$ satisfies $v'' + \left(b - \frac{a^2}{4}\right)v = 0$.
(b) If $d = 0$ and the characteristic equation has repeated root $\lambda$, show that any solution has the form $f(x) = Ae^{\lambda x} + Bxe^{\lambda x}$.
(c) If $d > 0$ and the characteristic equation has distinct real roots $\lambda_1, \lambda_2$, show that any solution has the form $f(x) = Ae^{\lambda_1 x} + Be^{\lambda_2 x}$.
(d) If $d < 0$ and the characteristic equation has complex roots $r \pm wi$, show that any solution has the form $f(x) = Ae^{rx}\cos wx + Be^{rx}\sin wx$.
A5. Suppose the roots of the characteristic equation of $f'' + af' + bf = 0$ have negative real parts. Then every solution $f$ satisfies $\lim_{x \to \infty} f(x) = 0$.

We have found a general form for the solutions of $f'' + af' + bf = 0$. It has two parameters $A, B$. By varying them we can generate all solutions. This form will be called the `general solution' of the homogeneous differential equation and we shall denote it by $f_h$. For example, if we consider $f'' + 2f' + f = 0$ then $f_h(x) = Ae^{-x} + Bxe^{-x}$.

Now consider an inhomogeneous differential equation $f'' + af' + bf = g$. The first important observation is that if we know one solution of this equation then we know all its solutions!

A6. Suppose $f_p(x)$ is a solution of $f'' + af' + bf = g$. Let $f_h$ be the general solution of $f'' + af' + bf = 0$. Then the general solution of $f'' + af' + bf = g$ is $f = f_p + f_h$.
There is a general approach for finding $f_p$, called `variation of parameters'. We leave that for a course in differential equations. We give below an example of how $f_p$ can often be found quite quickly by recognising patterns of differentiation.

A7. Consider the differential equation $f''(x) - 3f'(x) + 2f(x) = \cos x$. Since cosine can be obtained by differentiating sine once or cosine twice, we try $f_p(x) = \alpha\cos x + \beta\sin x$.
(a) Substitute $f_p$ in the given differential equation and show $\alpha = 0.1$, $\beta = -0.3$.
(b) Show the general solution is $f(x) = 0.1\cos x - 0.3\sin x + Ae^x + Be^{2x}$.
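The values stated in A7 can be checked by substituting into the equation numerically. The sketch below is only a plausibility check and does not replace the derivation asked for in the exercise: the function names are ours, the derivatives of $f_p$ are entered by hand, and `residual` measures how far $f_p'' - 3f_p' + 2f_p$ is from $\cos x$ at a given point.

```python
import math

ALPHA, BETA = 0.1, -0.3

def fp(x):
    """Trial particular solution alpha*cos(x) + beta*sin(x)."""
    return ALPHA * math.cos(x) + BETA * math.sin(x)

def fp1(x):
    """First derivative of fp, computed by hand."""
    return -ALPHA * math.sin(x) + BETA * math.cos(x)

def fp2(x):
    """Second derivative of fp, computed by hand."""
    return -ALPHA * math.cos(x) - BETA * math.sin(x)

def residual(x):
    """Left side minus right side of f'' - 3f' + 2f = cos x for f = fp."""
    return fp2(x) - 3 * fp1(x) + 2 * fp(x) - math.cos(x)

def homogeneous_residual(lam):
    """Value of the characteristic polynomial lam^2 - 3*lam + 2."""
    return lam * lam - 3 * lam + 2

for x in (0.0, 1.0, 2.5):
    print(x, residual(x))
print(homogeneous_residual(1), homogeneous_residual(2))
```

The residuals should vanish up to rounding error, and the characteristic polynomial vanishes at 1 and 2, which accounts for the terms $Ae^x$ and $Be^{2x}$ in the general solution.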
