Analysis 1
Alessio Figalli
Preface
Welcome to ETH Zürich and to your exploration of these lecture notes. Originally crafted
in German for the academic year 2016/2017 by Manfred Einsiedler and Andreas Wieser, these
notes were designed for the Analysis I and II courses in the Interdisciplinary Natural Sciences,
Physics, and Mathematics Bachelor programs. In the academic year 2019/2020, a substantial
revision was undertaken by Peter Jossen.
For the academic year 2023/2024, Alessio Figalli has developed this English version. It
differs from the German original in several aspects: reorganization and alternative proofs
of some materials, extensive rewriting and expansion in certain areas, and a more concise
presentation. This version strictly aligns with the material presented in class, offering a
streamlined educational experience.
The courses Analysis I/II and Linear Algebra I/II are fundamental to the mathematics
curriculum at ETH and other universities worldwide. They lay the groundwork upon which
most future studies in mathematics and physics are built.
Throughout Analysis I/II, we will delve into various aspects of differential and integral
calculus. Although some topics might be familiar from high school, our approach requires
minimal prior knowledge beyond an intuitive understanding of variables and basic algebraic
skills. Contrary to high-school methods, our lectures emphasize the development of mathemat-
ical theory over algorithmic practice. Understanding and exploring topics such as differential
equations and multidimensional integral theorems is our primary goal. However, students are
encouraged to engage with numerous exercises from these notes and other resources to deepen
their understanding and proficiency in these new mathematical concepts.
Contents
1 Introduction
1.1 Quadrature of the Parabola
1.2 Tips on Studying
Chapter 1
Introduction
This area was already determined by Archimedes (ca. 287–ca. 212 BCE) in the 3rd century BCE; it was the first curvilinearly bounded area ever computed. For the area calculation, let us assume that we
know what the symbols in the definition in equation (1.1) mean and that P describes the area
in the following figure. In particular, we assume for the moment that we already know the
set of real numbers R.
Of course, calculating the area of P is not a challenge if we use integrals and the associated
calculation rules. However, we do not want to assume we know the integral calculus. Strictly
speaking, we must ask ourselves the following fundamental question before calculating:
What is an area?
If we cannot answer this question exactly, then we cannot know what it means to calculate
the area of P . Therefore, we qualify our goal in the following way:
1.1 Quadrature of the Parabola
Proposition 1.1. — Suppose there is a notion of area in R2 that satisfies the following
properties:
3. For sets F, G in R2 without common points, the area of the union F ∪ G is the sum of
the areas of F and G.
In other words, we have left open the question of whether there is a notion of area and for
what areas it is defined, but we want to show that 1/3 is the only reasonable value for the area
of P .
For the proof of Proposition 1.1 we need a lemma (also called an “auxiliary theorem”):
For every natural number n ≥ 1, we have
\[
1^2 + 2^2 + \cdots + (n-1)^2 + n^2 \;=\; \frac{n^3}{3} + \frac{n^2}{2} + \frac{n}{6}. \tag{1.2}
\]
Proof. We perform the proof using induction. For n = 1, the left-hand side of equation (1.2)
is equal to 1 and the right-hand side is equal to 1/3 + 1/2 + 1/6 = 1. So equation (1.2) is true for
n = 1. This part of the proof is called the beginning of induction.
Suppose we already know that equation (1.2) holds for the natural number n. We now
want to show that it follows that equation (1.2) also holds for n + 1. The left-hand side of
equation (1.2), for (n + 1) instead of n, is given by
\[
1^2 + 2^2 + \cdots + n^2 + (n+1)^2 = \frac{n^3}{3} + \frac{n^2}{2} + \frac{n}{6} + (n+1)^2 = \frac{n^3}{3} + \frac{3n^2}{2} + \frac{13n}{6} + 1,
\]
where, in the first equality, we have used the validity of equation (1.2) for the number n. The
right-hand side of equation (1.2), for (n + 1) instead of n, is given by
\[
\frac{(n+1)^3}{3} + \frac{(n+1)^2}{2} + \frac{n+1}{6} = \frac{n^3}{3} + \frac{3n^2}{2} + \frac{13n}{6} + 1.
\]
This shows that the left and right sides of equation (1.2) also agree for n + 1. This part of
the proof is called the induction step.
It follows that equation (1.2) is true for n = 1 due to the validity of the beginning of the
induction. Therefore, it is also true for n = 2 due to the induction step, and for n = 3 again
due to the induction step. Continuing in this way, we obtain (1.2) for any natural number.
We say that equation (1.2) follows by means of induction for all natural numbers n ≥ 1.
Furthermore, we indicate the end of the proof with a small square.
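As a quick sanity check (not part of the original notes), the identity (1.2) can also be verified numerically; the sketch below uses Python's exact Fraction arithmetic, an assumption of this illustration, to compare both sides:

```python
from fractions import Fraction

def sum_of_squares(n):
    """Left-hand side of (1.2): 1^2 + 2^2 + ... + n^2."""
    return sum(k * k for k in range(1, n + 1))

def closed_form(n):
    """Right-hand side of (1.2): n^3/3 + n^2/2 + n/6, computed exactly."""
    n = Fraction(n)
    return n ** 3 / 3 + n ** 2 / 2 + n / 6

# Both sides agree for every tested n, as the induction proof guarantees.
for n in range(1, 101):
    assert sum_of_squares(n) == closed_form(n)
```

Exact rational arithmetic avoids floating-point rounding, so the check confirms the identity on the tested range rather than an approximation.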
Proof of Proposition 1.1. We assume that there is a notion of area with the properties
in the proposition and that it is defined for P . Suppose that I is the area of P . We
cover P for a given natural number n ≥ 1 with rectangles whose base has length 1/n, as in
Figure 1.1 on the left.
Note that the straight-line segments where the rectangles touch have area 0, and we may
ignore them.
If, on the other hand, we use rectangles as in Figure 1.1 on the right, we also get
\[
\begin{aligned}
I &\ge \frac{1}{n}\cdot\frac{0}{n^2} + \frac{1}{n}\cdot\frac{1^2}{n^2} + \cdots + \frac{1}{n}\cdot\frac{(n-1)^2}{n^2} \\
&= \frac{1}{n^3}\bigl(1^2 + \cdots + (n-1)^2\bigr) \\
&= \frac{1}{n^3}\bigl(1^2 + \cdots + (n-1)^2 + n^2 - n^2\bigr) \\
&= \frac{1}{n^3}\Bigl(\frac{n^3}{3} + \frac{n^2}{2} + \frac{n}{6} - n^2\Bigr) \\
&= \frac{1}{3} - \frac{1}{2n} + \frac{1}{6n^2}.
\end{aligned}
\]
So in summary
\[
-\frac{1}{2n} + \frac{1}{6n^2} \;\le\; I - \frac{1}{3} \;\le\; \frac{1}{2n} + \frac{1}{6n^2} \tag{1.3}
\]
for all natural numbers n ≥ 1. The only number that satisfies this for all natural numbers
n ≥ 1 is 0. Therefore, I − 1/3 = 0 and the proposition follows.
To rigorously prove the statement above, one needs to show that the only real number
satisfying (1.3) for all n ≥ 1 is 0. This is intuitively clear: indeed, taking n larger and larger,
the two expressions 1/(2n) + 1/(6n²) and −1/(2n) + 1/(6n²) get smaller and smaller. However, we cannot give
a proof of it at this point, as we lack a rigorous definition of the real numbers.
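Numerically, the lower and upper rectangle sums from the proof indeed squeeze the value 1/3; the following sketch (plain Python floats, illustrative only) reproduces both sums:

```python
def lower_sum(n):
    # Rectangles below the parabola: sum_{k=0}^{n-1} (1/n) * (k/n)^2
    return sum((k / n) ** 2 / n for k in range(n))

def upper_sum(n):
    # Rectangles above the parabola: sum_{k=1}^{n} (1/n) * (k/n)^2
    return sum((k / n) ** 2 / n for k in range(1, n + 1))

for n in [10, 100, 1000]:
    assert lower_sum(n) <= 1 / 3 <= upper_sum(n)

# The gap between the two sums is exactly 1/n, so it shrinks as n grows,
# mirroring the two-sided bound (1.3).
assert abs(upper_sum(1000) - lower_sum(1000) - 1 / 1000) < 1e-9
```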
As already mentioned, we have not answered the question of whether there is a notion of
area for sets in R2 . Nor have we described precisely what domains in R2 are, but we have
implicitly assumed that domains are those subsets of R2 to which we can assign an area. The
notions of the Riemann and Lebesgue integrals and measurable sets answer these fundamental
questions.
The idea of the proof is illustrated in the following applet.
Applet 1.3 (Estimating an area). We use up to 1000 rectangles to estimate the area from
below and from above. In the proof below, however, we will use an unlimited number of
rectangles and can thus determine the area exactly without any fuzziness.
Note that in the previous examples, we informally used the notion of “set”. Here, for
completeness, we give a more precise definition.
(4) Every statement A about elements of a set X defines the set of elements in X for
which the statement A is true; one writes {x ∈ X | A is true for x}.
The empty set, written as ∅ (or sometimes also {}), is the set containing no elements.
For example, we can write
{n ∈ Z | ∃ m ∈ Z : n = 2m}
to describe the set of even numbers. The symbol ∃ means “there exists”, while the symbol ∀
means “for all”. The symbols “ | ” and “:” in the formula above both mean “such that”. With
time, this mathematical terminology will become familiar; one just needs practice.
You will notice a big difference between school mathematics and university mathematics. The
latter also uses its own language, which you will have to learn. The sooner you take this on,
the more you will take away from the lectures. This brings us to the next tip.
You cannot learn mathematics by watching it; just as you cannot learn tennis or skiing by
watching all available tournaments or world championships on television. Rather, you should
learn mathematics like a language, and a language is taught by using it. Discuss the topics of
the lectures with colleagues. Explain to each other the proofs from the lecture or the solution
of the exercise examples. Above all, solve as many exercises as possible; this is the only way
to be sure that you have mastered the topics.
It is fine to work on the exercises in small groups. This even has the advantage that the
group discussions make the objects of the lectures more lively. However, you should ensure
that you fully understand the solutions, explain them and subsequently solve similar problems
on your own.
“He who asks is a fool for a minute. He who does not ask is a fool all his life.”
Confucius
Ask as many questions as you can and then ask them when they come up. Probably many of
your colleagues have the same question, or have not even noticed the problem. This allows the
lecturer or teaching assistant to simultaneously fix a problem for many and identify problems
in students where she or he thought none existed. Furthermore, good question formulation
needs to be practiced; the first year is the ideal time to do this.
Interlude: Groups
A group is a (non-empty) set G endowed with an operation “⋆” that satisfies the following properties:
• (Associativity) for all a, b, c ∈ G we have
(a ⋆ b) ⋆ c = a ⋆ (b ⋆ c).
• (Neutral element) there exists a neutral element, i.e., e ∈ G such that for all
a ∈ G we have
a ⋆ e = e ⋆ a = a.
• (Inverse element) for each element a ∈ G there is an inverse element, i.e., a−1 ∈ G
such that
a ⋆ a−1 = a−1 ⋆ a = e.
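As an illustration (this finite example is mine, not from the notes), the set {0, 1, 2, 3, 4} with addition modulo 5 satisfies all three group axioms, and a short Python script can verify them exhaustively:

```python
# Addition modulo 5 on G = {0, 1, 2, 3, 4}: a finite illustrative group.
G = range(5)

def op(a, b):
    return (a + b) % 5

# Associativity: (a * b) * c == a * (b * c) for all triples.
assert all(op(op(a, b), c) == op(a, op(b, c)) for a in G for b in G for c in G)

# Neutral element: e = 0.
assert all(op(a, 0) == op(0, a) == a for a in G)

# Inverse element: the inverse of a is (5 - a) % 5.
assert all(op(a, (5 - a) % 5) == 0 for a in G)
```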
Example 2.1. — To make these concepts easier to access, let us assume for the moment
that we already know the natural numbers N and the integers Z. We check whether they
satisfy the above properties if we replace ⋆ with the operations you are already familiar with.
1. Consider the natural numbers N = {0, 1, 2, 3, ...} with the usual addition + that you
probably know since primary school. This operation is associative,
(k + l) + m = k + (l + m),
and 0 is a neutral element,
0 + n = n + 0 = n,
but
• no element apart from 0 has an inverse element. In fact, the inverse element of
n ∈ N\{0} would be −n, which is not included in the natural numbers N.
2. The same arguments show that the integers Z = {..., −3, −2, −1, 0, 1, 2, 3, ...}, again
with the addition, form a group. Moreover, this is a commutative group, since for all
integers n, m ∈ Z we have
n + m = m + n.
3. Consider the set Q \ {0} of non-zero rational numbers with the usual multiplication · between numbers. In this case, one can check that the
multiplication is associative and commutative, the neutral element is 1, and the inverse
of p/q is q/p. Hence, this is a commutative group.
2.2. — It follows directly from the definition of the neutral element that it is unique.
Indeed, assume that additionally to e ∈ G, we have a second element e′ with the property
such that e′ ⋆ a = a ⋆ e′ = a for all elements a ∈ G. Then, we can choose a = e and obtain
e = e ⋆ e′ = e′,
where the first equality follows from the fact that e′ is neutral, while in the second equality
we used that e is neutral.
We can thus speak of the neutral element of a group.
In the same spirit, assume that for an element a ∈ G, there exist two inverse elements a−1
and ã−1. Then, using associativity, we observe that
ã−1 = ã−1 ⋆ e = ã−1 ⋆ (a ⋆ a−1) = (ã−1 ⋆ a) ⋆ a−1 = e ⋆ a−1 = a−1.
So also for inverse elements, we may speak of the inverse element. In particular, since
a ⋆ a−1 = e, we deduce that a is the inverse of a−1, thus (a−1)−1 = a.
a · (b + c) = a · b + a · c
and
(a + b) · c = a · c + b · c.
Example 2.3. — Let us continue with our examples. We have already established that the
integers Z form a commutative group, but are they also a ring with the usual multiplication?
We must check:
• Associativity of the multiplication: for all integers k, l, m ∈ Z, we have
(k · l) · m = k · (l · m).
• Neutral element for the multiplication: The neutral element for the multiplication is
1 ∈ Z as, for all integers k ∈ Z, we have
1 · k = k · 1 = k.
• Distributivity: for all integers k, l, m ∈ Z, we have
k · (l + m) = k · l + k · m
and
(k + l) · m = k · m + l · m.
Example 2.4. — The set of rational numbers Q = {p/q | p, q ∈ Z, q ≠ 0} with the usual
addition and multiplication is a field.
2.5. — Before going on, we look at some immediate consequences of the definition of field.
In the current notation, −a denotes the inverse of a with respect to the addition, while a−1
is the inverse of a with respect to the multiplication. Note that, in the current context, (2.1)
implies that
−(−a) = a, and (a−1 )−1 = a whenever a ̸= 0. (2.2)
(i) 0 · a = 0 and a · 0 = 0.
Proof: Since 0 is the neutral element for the addition, we have 0 = 0 + 0. Hence, using
distributivity, we get
0 · a = (0 + 0) · a = (0 · a) + (0 · a).
Adding −0 · a (i.e., the inverse of 0 · a for the addition), we deduce that 0 · a = 0. The
case of a · 0 is analogous.
(iii) (−a) · (−b) = a · b and, whenever a ≠ 0, (−a)−1 = −(a−1).
Proof: Using distributivity, we have
a · b + a · (−b) = a · (b + (−b)) = a · 0 = 0,
hence
a · b = −(a · (−b)).
On the other hand, applying (ii) with (−b) instead of b, we also have
(−a) · (−b) = −(a · (−b)).
Combining the two identities above, we conclude that (−a) · (−b) = a · b. Finally, taking
b = a−1 yields (−a) · (−(a−1)) = a · a−1 = 1, which gives the second assertion.
Remark 2.6. — A natural question one may ask is the following: Is it possible to construct
a field K where 0 (i.e., the neutral element for +) and 1 (i.e., the one for ·) are equal?
Assume that 0 = 1. Then, using (i) above and the fact that 1 is the neutral element for
multiplication, we get
0=a·0=a·1=a
for every a ∈ K. So, the only possibility for having 0 = 1 is that K consists of the single
element 0. From now on, we shall assume that K always contains at least two elements, so in
particular, 0 and 1 cannot coincide.
Next, we introduce the second ingredient of an ordered field, the order relation. Again we
will do so in steps:
X × Y = {(x, y) | x ∈ X, y ∈ Y }.
Interlude: Subsets
Let P and Q be sets.
• We say that P is subset of Q, and write P ⊂ Q (or P ⊆ Q), if for all x ∈ P also
x ∈ Q holds.
For instance, {x, y} = {z} holds if x = y = z. Note that there are no “multiplicities” for
elements of a set (for instance, {x, x, x} = {x}).
Interlude: Relations
Let X be a set. A relation on X is a subset R ⊂ X × X, that is, a list of ordered
pairs of elements of X. We also write xRy if (x, y) ∈ R and often use symbols such as
<, ≪, ≤, ≅, ≡, ∼ for the relations.
If ∼ is a relation, we write “x ̸∼ y” if “x ∼ y” does not hold. A relation ∼ is called:
Example 2.8. — We look again at the integers Z and two examples of relations on them.
Let m, n, p ∈ Z be integers.
• Consider the relation ≤ of being “less than or equal to", i.e., we write n ≤ m if n is less
than or equal to m. Then we see that this relation is:
• Consider next the relation < of being “strictly smaller than", i.e., we write n < m if n
and m are distinct integers and n is less than m. This relation is:
We conclude that < is neither an equivalence relation nor an order relation, since it does
not satisfy the reflexivity property.
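These properties can be checked mechanically on a finite window of Z; the following sketch (the window {−3, …, 3} is an illustrative choice of mine) tests reflexivity and transitivity for both relations:

```python
X = range(-3, 4)  # a finite window of Z, chosen for illustration

def leq(n, m):
    return n <= m

def lt(n, m):
    return n < m

def reflexive(R):
    return all(R(x, x) for x in X)

def transitive(R):
    return all(not (R(x, y) and R(y, z)) or R(x, z)
               for x in X for y in X for z in X)

# "less than or equal to" is reflexive and transitive.
assert reflexive(leq) and transitive(leq)
# "strictly smaller than" is transitive but fails reflexivity.
assert transitive(lt) and not reflexive(lt)
```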
x ≤ y =⇒ x + z ≤ y + z.
0 ≤ x and 0 ≤ y =⇒ 0 ≤ x · y.
The following terminology is standard and will be used throughout these lecture notes:
• Analogously, we define x > y when y < x, and say “x is greater than y” or “x is strictly
greater than y”.
We often use these symbols in “equidirectional chains”; for example, x ≤ y < z = a stands for “x ≤ y and y < z and z = a”.
Example 2.10. — A well-known example of an ordered field is the one of rational numbers
Q, together with the usual order relation given by
p/q ≤ p′/q′ ⟺ pq′ ≤ p′q, for p, p′ ∈ Z and q, q′ ∈ N.
Here on the right-hand side is the order on the integers, which we assume to be known.
(l) If x + y ≤ x + z, then y ≤ z.
Proof: Exercise 2.12.
Exercise 2.12. — Prove the inferences (k),(l),(m). What happens in (m) when you drop
the condition x > 0, that is, when x < 0 or x = 0? For some of the above inferences, formulate
and prove similar versions for the strict relation “<”.
2.13. — Let (K, ≤) be an ordered field. As usual, we write 2, 3, 4, . . . for the elements of
K given by 2 = 1 + 1, 3 = 2 + 1, et cetera. By the compatibility of + and ≤ in Definition 2.9,
and recalling property (g) in Paragraph 2.11, the inequalities
0 < 1 < 2 < 3 < · · ·
hold in K. In particular, the elements . . . , −2, −1, 0, 1, 2, 3, . . . of K are all distinct. We identify the set Z of integers with a subset of K. That is, we call the elements {. . . , −2, −1, 0, 1, 2, 3, . . .}
of K “integers”. Consequently, we call the elements {pq−1 | p, q ∈ Z, q ≠ 0} in K “rational numbers” and identify them with Q, so that
Z ⊊ Q ⊆ K.
In other words, if (K, ≤) is an ordered field, then it always includes a copy of the rationals
inside it.
The above axioms, inferences, and statements in the exercises represent the usual properties
for inequalities. We can also use them to solve problems like the one in the following exercise:
{x ∈ R \ {0} | x + 3/x + 4 ≥ 0} = {x ∈ R \ {0} | −3 ≤ x ≤ −1 or x > 0}.
Hint: note that x + 3/x + 4 = (x + 3)(x + 1)/x.
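The claimed equality of the two sets can be sampled numerically; this sketch (floating-point sampling on a grid, an illustrative assumption, not a proof) compares membership on both sides:

```python
def lhs(x):
    # Membership in the left-hand set: x + 3/x + 4 >= 0
    return x + 3 / x + 4 >= 0

def rhs(x):
    # Membership in the right-hand set: -3 <= x <= -1 or x > 0
    return (-3 <= x <= -1) or x > 0

# Compare the two conditions on a grid of nonzero sample points.
xs = [k / 10 for k in range(-100, 101) if k != 0]
assert all(lhs(x) == rhs(x) for x in xs)
```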
Interlude: Functions
A function f from a set X to a set Y is a map that assigns to each x ∈ X a uniquely
determined element y = f (x) ∈ Y . We write f : X → Y for a function from X to Y
and sometimes also speak of a mapping or a transformation. We refer to the set X
as domain, and the set Y as domain of values or codomain.
The set F = {(x, f (x)) | x ∈ X} is called the graph of f . In the context of a function
f : X → Y , an element x of the domain of definition is also called argument, and an
element y = f (x) ∈ Y assumed by the function is also called value of the function. If
f : X → Y is a function, one also writes
f : X → Y, x ↦ f(x),
Definition 2.15. — Let (K, ≤) be an ordered field. The absolute value or modulus on
K is the function | · | : K → K given by
|x| = x if x ≥ 0, and |x| = −x if x < 0.
2.16. — In what follows, let (K, ≤) always be an ordered field, and x, y, z, w denote
elements from K.
|x + y| ≤ |x| + |y|.
Proof: Note that, by (e), we have −|x| ≤ x ≤ |x| and −|y| ≤ y ≤ |y|. Adding these two
inequalities, we get
−(|x| + |y|) ≤ x + y ≤ |x| + |y|.
By property (e), this gives |x + y| ≤ |x| + |y|. For the inverse triangle inequality, writing
|x| = |(x − y) + y| ≤ |x − y| + |y| leads to |x| − |y| ≤ |x − y|. Exchanging x and y, we get |y| − |x| ≤ |x − y|. So, again by property
(e), ||x| − |y|| ≤ |x − y|, as desired.
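Both inequalities can be spot-checked on random real inputs; the following sketch (random sampling, illustrative only, not part of the notes) exercises the triangle and inverse triangle inequalities:

```python
import random

random.seed(0)  # reproducible sampling
for _ in range(1000):
    x = random.uniform(-10.0, 10.0)
    y = random.uniform(-10.0, 10.0)
    # Triangle inequality: |x + y| <= |x| + |y|
    assert abs(x + y) <= abs(x) + abs(y) + 1e-12
    # Inverse triangle inequality: ||x| - |y|| <= |x - y|
    assert abs(abs(x) - abs(y)) <= abs(x - y) + 1e-12
```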
Exercise 2.17. — For which x, y ∈ R does equality hold in the triangle inequality? And
in the inverse triangle inequality?
(V) Let X, Y be non-empty subsets of K such that for all x ∈ X and y ∈ Y the
inequality x ≤ y holds. Then there exists c ∈ K lying between X and Y , in the
sense that for all x ∈ X and y ∈ Y the inequality x ≤ c ≤ y holds.
We call statement (V) the completeness axiom.
2.20. — We will often visualise the real numbers as the points on a straight line, which is
why we also call it the number line.
We interpret the relation x < y for x, y ∈ R as “on the straight line, the point y lies to the
right of the point x”. What does the completeness axiom mean in this picture?
Let X, Y be non-empty subsets of R such that for all x ∈ X and all y ∈ Y the inequality
x ≤ y holds. Then all elements of X are to the left of all elements of Y as in the following
figure.
So, according to the completeness axiom, there exists a number c that lies in between. The
existence of the number c is, in a sense, an assurance that R has no “gaps”. It is advisable to
visualize definitions, statements, and their proofs on the number line. However, the number
line should always be used only as a motivation and to develop a good intuition, but not for
rigorous proof.
f (A) = {y ∈ Y | ∃ x ∈ A : f (x) = y}
and call this subset of Y the image of A under the function f . For a subset B ⊂ Y we
write
f −1 (B) = {x ∈ X | ∃ y ∈ B : f (x) = y}
Example 2.24. — Let X, Y be two finite sets with the same number of elements (for
example, X and Y could be the same set). Then, for a function f : X → Y , injectivity and
surjectivity are equivalent.
To show this, assume that X and Y have n elements and write X = {x1 , . . . , xn }. Suppose
first that f is injective. Then all the elements f (xi ) are distinct, which means that the set
f (X) = {f (x1 ), . . . , f (xn )} also has n elements. Since f (X) is a subset of Y and Y has n
elements, the only option is that f (X) = Y . This proves that injectivity implies surjectivity.
Conversely, to show that surjectivity implies injectivity, we prove that if f is not injective
then f is not surjective. So, assume there exist at least two elements xi ̸= xj such that
f (xi ) = f (xj ). This means that f (X) has at most n − 1 elements, so f cannot be surjective.
Remark 2.25. — For infinite sets, injectivity and surjectivity are not necessarily equivalent.
Consider for instance the functions f1 , f2 : N → N defined as
f1(n) = n + 1,    f2(n) = 0 if n = 0, and f2(n) = n − 1 if n ≥ 1.
Then f1 is injective but not surjective, while f2 is surjective but not injective.
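On a finite truncation of N (the truncation is an artifice of this illustration; the statement itself concerns all of N), the asymmetry between f1 and f2 is easy to observe:

```python
def f1(n):
    return n + 1

def f2(n):
    return 0 if n == 0 else n - 1

N = range(100)  # finite truncation of the natural numbers

# f1 is injective (all values distinct) but misses 0, so it is not
# surjective onto N.
values_f1 = {f1(n) for n in N}
assert len(values_f1) == 100 and 0 not in values_f1

# f2 is not injective (0 and 1 share the value 0), yet it hits every
# value below 99 on this truncation.
assert f2(0) == f2(1) == 0
assert set(range(99)) <= {f2(n) for n in N}
```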
Exercise 2.27. — In this exercise, we show the existence and uniqueness of a bijective
function √· : R≥0 → R≥0 with the property (√a)² = a for all a ∈ R≥0.
2. Use Step 1 to deduce that, for every a ∈ R≥0 , there can exist at most one element
c ∈ R≥0 satisfying c2 = a.
3. Given a ∈ R≥0, consider the sets X = {x ∈ R≥0 | x² ≤ a} and Y = {y ∈ R≥0 | y² ≥ a},
and apply the completeness axiom to find c ∈ R with x ≤ c ≤ y for all x ∈ X and
y ∈ Y. Prove that c ∈ X and c ∈ Y to conclude that both c² ≤ a and c² ≥ a hold, thus
c² = a.
Hint: If by contradiction c ∉ X (that is, c² > a), then one can find a suitably small
real number ε > 0 such that (c − ε)² ≥ a. Thus c − ε ∈ Y, which contradicts y ≥ c for
every y ∈ Y. The case of c ∉ Y is analogous.
We call square root function the function √· : R≥0 → R≥0 that assigns to each a ∈ R≥0
the number c ∈ R≥0 uniquely determined by the above construction. We note that c² = a,
and we call c = √a the square root of a. Show that:
4. The function √· is increasing: for x, y ∈ R≥0 with x < y, the inequality √x < √y holds.
5. The function √· : R≥0 → R≥0 is bijective.
6. For all x, y ∈ R≥0, √(xy) = √x √y.
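The construction in Exercise 2.27 pins √a between the numbers whose square is at most a and those whose square is at least a. A bisection sketch in Python (the name sqrt_bisect and the step count are my own choices, not from the notes) mirrors that idea numerically:

```python
def sqrt_bisect(a, steps=60):
    """Approximate the unique c >= 0 with c^2 = a by repeatedly halving the
    gap between a point with square <= a and a point with square >= a."""
    lo, hi = 0.0, max(1.0, a)  # lo^2 <= a and hi^2 >= a
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid * mid <= a:
            lo = mid  # mid's square is still <= a
        else:
            hi = mid  # mid's square exceeds a
    return (lo + hi) / 2

assert abs(sqrt_bisect(2.0) ** 2 - 2.0) < 1e-9
assert abs(sqrt_bisect(9.0) - 3.0) < 1e-9
```

The shrinking pair (lo, hi) plays the role of the sets X and Y: completeness of R is what guarantees the two sides close in on a single number c.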
Exercise 2.28. — For all x ∈ R, show that x² = |x|² and √(x²) = |x|.
2.29. — In summary, in a field of real numbers as defined in Definition 2.19, the usual
arithmetic rules and equation transformations work, although (as usual) division by zero is
not defined. Furthermore, the relations ≤ and < satisfy the usual transformation laws for
inequalities. In particular, when multiplying by negative numbers, the inequalities must be
reversed. We will use these laws in the following without reference. We will see the deep
meaning of the completeness axiom when we use it for further statements. In particular, until
further notice, we will always refer to it when we use it.
It is not clear for the moment that there is indeed a field of real numbers as in Definition
2.19. The fact that we occasionally even speak of the real numbers stems from the fact that,
up to certain identifications, there is only one complete ordered field. In this course,
we assume, in agreement with your high school experience, that a field of real numbers exists
and is unique.
2.1.3 Intervals
[a, b] = {x ∈ R | a ≤ x ≤ b}.
• The unbounded closed intervals
2.31. — The intervals (a, b], [a, b), (a, b) for a, b ∈ R are non-empty exactly when a < b,
and [a, b] is non-empty exactly when a ≤ b. If the interval is non-empty, a is called the left
endpoint, b is called the right endpoint and b − a is called the length of the interval.
Intervals of the kind [a, b], (a, b], [a, b), (a, b) for a, b ∈ R are also called bounded intervals
if we want to distinguish them from the unbounded intervals.
Instead of round brackets, inverted square brackets are sometimes used to denote open and
half-open intervals. For example, instead of (a, b) for a, b ∈ R, one can also find ]a, b[ in the
literature. We will always use round brackets here.
P ∩ Q = {x | x ∈ P and x ∈ Q}
P ∪ Q = {x | x ∈ P or x ∈ Q}
P \ Q = {x | x ∈ P and x ∉ Q}
P △Q = (P ∪ Q) \ (P ∩ Q).
These definitions are illustrated in the following pictures. Sketches of this kind are called
Venn diagrams.
If it is clear from the context that all sets under consideration are subsets of a given basic
set X, then the complement P c of P is defined by P c = X \ P .
2. When is a union of two intervals an interval again? In this case, what happens when
you unite two intervals of the same type (open, closed, half-open)?
2.34. — For example, both [−1, 1] and Q ∪ [−1, 1] are neighbourhoods of 0 ∈ R (since they
both contain, for instance, (−1/2, 1/2)), but [0, 1] is not a neighbourhood of 0.
We note further that, for δ > 0 and x ∈ R, the δ-neighbourhood of x is given by {y ∈
R | |x − y| < δ}. We will interpret |x − y| as the distance from x to y. In terms of “distance”,
a few of the above inferences can be re-expressed more intuitively. For example, property (a)
in Paragraph 2.16 implies that, for x, y ∈ R, the equality |x − y| = | − (x − y)| = |y − x| holds.
In other words, the distance from x to y is equal to the distance from y to x.
2.36. — Open intervals are open, closed intervals are closed. Intuitively, a subset is open
if, for any point x in the set, all points close enough to x are also in the set. Contrary to
conventional usage, “open” is not the opposite of “closed”.
The sets ∅ and R are both open in R. Hence, they are also closed since ∅ = Rc and R = ∅c .
We note that Q ⊆ R and [a, b) ⊂ R are neither open nor closed.
Exercise 2.37. — Show that a subset U ⊆ R is open exactly if, for every element x ∈ U ,
there exists δ > 0 such that (x − δ, x + δ) ⊆ U .
Exercise 2.38. — Let U be a family of open sets, and F be a family of closed subsets of
R. Show that the union ⋃_{U ∈ U} U is open and that the intersection ⋂_{F ∈ F} F is closed.
C = R2 = {(x, y) | x, y ∈ R}.
We call elements z = (x, y) ∈ C complex numbers, and will write them in the form z = x + iy,
where the symbol i is called the imaginary unit. Note that in this identification, the symbol
+ is, for the time being, to be understood as a substitute for the comma. The number x ∈ R is
called the real part of z and one writes x = Re(z); the number y ∈ R is the imaginary part
of z and one writes y = Im(z). The elements of C with imaginary part 0 are also called real,
and the elements with real part 0 are called purely imaginary. Via the injective mapping
x ∈ R 7→ x + i0 ∈ C we identify R with the subset of real elements of C.
The graphical representation of the set C is called the complex plane or also Gaussian
number plane. From this geometric point of view, the set of real points is called the real
axis and the set of purely imaginary points is called the imaginary axis.
As you might expect from previous knowledge, i should correspond to a square root of
−1. Hence, we want to define an addition and a multiplication on the set C so that the set C
together with these operations is a field in which i² = −1 holds. Note that, if i² = −1, then
it follows from commutativity and distributivity that
(x1 + iy1)(x2 + iy2) = (x1x2 − y1y2) + i(x1y2 + y1x2).
Proof. We review the axioms of fields. The associativity and commutativity of the addition,
and the fact that (0, 0) is a neutral element for the addition, are direct consequences of the
corresponding properties of the addition of real numbers. The inverse element of (x, y) for
the addition is given by (−x, −y), since
Proving the properties of multiplication requires a little more effort. We start with associa-
tivity of multiplication: let (x1 , y1 ), (x2 , y2 ), and (x3 , y3 ) be elements of C. Now calculate
((x1, y1) · (x2, y2)) · (x3, y3) = (x1x2 − y1y2, x1y2 + y1x2) · (x3, y3)
= (x1x2x3 − y1y2x3 − x1y2y3 − y1x2y3, x1y2x3 + y1x2x3 + x1x2y3 − y1y2y3).
Analogously, we calculate
(x1, y1) · ((x2, y2) · (x3, y3)) = (x1, y1) · (x2x3 − y2y3, x2y3 + y2x3)
= (x1x2x3 − y1y2x3 − x1y2y3 − y1x2y3, x1y2x3 + y1x2x3 + x1x2y3 − y1y2y3).
Also, in the same way, we check that (1, 0) is the neutral element for multiplication:
Next, we check the distributivity law: Let again (x1 , y1 ), (x2 , y2 ), and (x3 , y3 ) be elements of
C. Then
(x1, y1) · ((x2, y2) + (x3, y3)) = (x1, y1) · (x2 + x3, y2 + y3)
= (x1 x2 + x1 x3 − y1 y2 − y1 y3 , y1 x2 + y1 x3 + x1 y2 + x1 y3 )
= (x1 x2 − y1 y2 , y1 x2 + x1 y2 ) + (x1 x3 − y1 y3 , y1 x3 + x1 y3 )
= (x1 , y1 ) · (x2 , y2 ) + (x1 , y1 ) · (x3 , y3 ),
which shows that C is a ring when endowed with the addition and multiplication given in
Definition 2.39.
To finish the proof, we still need to show the existence of multiplicative inverses. Let
(x, y) ∈ C be such that (x, y) ̸= (0, 0). So either x ̸= 0 or y ̸= 0 holds, and therefore
x² + y² > 0. Then the multiplicative inverse of (x, y) is given by (x/(x² + y²), −y/(x² + y²)), because
\[
(x, y) \cdot \Bigl(\frac{x}{x^2 + y^2}, \frac{-y}{x^2 + y^2}\Bigr)
= \Bigl(x \cdot \frac{x}{x^2 + y^2} - y \cdot \frac{-y}{x^2 + y^2},\; y \cdot \frac{x}{x^2 + y^2} + x \cdot \frac{-y}{x^2 + y^2}\Bigr)
= \Bigl(\frac{x^2 + y^2}{x^2 + y^2}, \frac{yx - xy}{x^2 + y^2}\Bigr) = (1, 0).
\]
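The component formulas from the proof can be transcribed directly into Python and cross-checked against the built-in complex type (the helper names c_mul and c_inv are my own, not from the notes):

```python
def c_mul(z, w):
    # Multiplication of pairs (x, y) as in Definition 2.39
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])

def c_inv(z):
    # Multiplicative inverse (x/(x^2+y^2), -y/(x^2+y^2)) from the proof
    x, y = z
    d = x * x + y * y
    return (x / d, -y / d)

# i * i = -1 in this representation.
assert c_mul((0.0, 1.0), (0.0, 1.0)) == (-1.0, 0.0)

# z * z^{-1} = 1 for a sample z, up to floating-point rounding.
z = (3.0, -4.0)
prod = c_mul(z, c_inv(z))
assert abs(prod[0] - 1.0) < 1e-12 and abs(prod[1]) < 1e-12

# Cross-check against Python's built-in complex arithmetic.
assert abs(complex(*c_inv(z)) - 1 / complex(*z)) < 1e-12
```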
Applet 2.41 (Complex numbers). We consider the field operations (addition, multiplication,
multiplicative inverse) on the complex numbers. The true geometric meaning of multiplication
and of the multiplicative inverse will be discussed later.
As already explained, we do not write (x, y) for complex numbers, but x + iy. Instead of
x + i0 we also simply write x, and instead of 0 + iy we write iy, and finally we also write i
for i1. By construction, i² = −1 holds. With this notation, we regard R as a subset of
C. This makes sense since addition and multiplication on C, restricted to R, agree with
addition and multiplication on R. Also, given z, w ∈ C, we write zw for z · w.
Given z ̸= 0, we shall use both z −1 and z1 to denote the inverse of z for the multiplication.
For instance, i−1 = 1i = −i (since (−i) · i = −i2 = 1).
Proof. Part (1) follows from the fact that, for z = x + iy,
z z̄ = (x + iy)(x − iy) = x2 + y 2 .
To show parts (2) and (3), we write z = x1 + iy1 and w = x2 + iy2 . Then z + w =
(x1 + x2 ) + i(y1 + y2 ) and we get
as desired.
Re(z) = (z + z̄)/2 and Im(z) = (z − z̄)/(2i)
for all z ∈ C. In particular, conclude that R = {z ∈ C | z = z̄}. Can you interpret these
equalities geometrically?
2.45. — Since i2 = −1 < 0, property (f) in Paragraph 2.11 implies that no order compatible
with addition and multiplication can be defined on C. Nevertheless, calculus can be performed
on the complex numbers, which is partly addressed in this course but mainly in the course on
complex analysis in the second year of the study of mathematics and physics. The reason for
this is that C satisfies a generalisation of the completeness axiom, which we can only discuss
after some more theory.
|z| = √(z z̄) = √(x² + y²) for z = x + iy ∈ C.
At this point we note that, given x ∈ R, the absolute value |x| = sgn(x) · x and the absolute
value of x as an element of C coincide, since √(x x̄) = √(x²) = |x| holds. In particular, the newly
introduced notation is consistent, and we have extended the absolute value of R to C.
Note that |z| ≥ 0 for all z ∈ C, and |z| = 0 exactly when z = 0. Also, the absolute value on
C is multiplicative, namely
|zw| = √(zw · z̄w̄) = √(z z̄ w w̄) = √(z z̄) √(w w̄) = |z||w| for all z, w ∈ C.
Furthermore,
z−1 = z̄/|z|² for all z ≠ 0.
These are essential consequences of Lemma 2.43. Finally, the triangle inequality holds, as
shown in the next proposition.
x1 x2 + y1 y2 ≤ |z||w|. (2.3)
Proof. We begin by observing that
(x1x2 + y1y2)² ≤ (x1x2 + y1y2)² + (x1y2 − y1x2)² = (x1² + y1²)(x2² + y2²) = |z|²|w|².
Taking the square root on both sides and recalling Exercise 2.28, we get |x1x2 + y1y2| ≤ |z||w|,
which implies (2.3) (recall that x ≤ |x| for any x ∈ R).
2.49. — The absolute value of the complex number z = x + iy is the square root of x2 + y 2
and, in the geometric notion of complex numbers, is equivalent to the length of the straight
line from the origin 0 + i0 to z. In the same way, for two complex numbers z and w, we
interpret |z − w| as the distance from z to w.
The closed circular disk with radius r > 0 around z ∈ C is the set
B(z, r) = {w ∈ C | |z − w| ≤ r}.
2.51. — The open circular disk B(z, r) thus consists precisely of those points that have
distance strictly less than r from z. Open circular disks in C and open intervals in R are
compatible in the following sense: If x ∈ R and r > 0, then the intersection of the open
circular disk B(x, r) ⊆ C with R is just the open interval (x − r, x + r) lying symmetrically
about x.
Exercise 2.52. — Show the following property of open circular disks: let z1 , z2 ∈ C, r1 > 0
and r2 > 0. For each point z ∈ B(z1 , r1 ) ∩ B(z2 , r2 ) there exists a radius r > 0 such that
The definition of open set in C given below generalizes that in R from Exercise 2.37.
For example, thanks to Exercise 2.52, all open circular disks are open. In addition to open
circular disks, there are many other subsets of C. For example, every union of open subsets
is open. We will return to studying open sets and related notions in much greater generality
in the second semester.
s = max(X)
• The terms bounded from below, lower bound, and minimum are defined
analogously.
In other words, if it exists, the supremum of X is the smallest upper bound of X. We can
describe the supremum s = sup(X) of X directly by
Equivalently, the supremum of X could also be characterised by the fact that no real number
x1 strictly smaller than s = sup(X) is an upper bound of X:
holds for all x ∈ X and a ∈ A. From the first inequality it follows that c is an upper bound
of X, so c ∈ A. From the second inequality it follows that c is the minimum of the set A.
Applet 2.58 (Supremum of a bounded non-empty set). We consider a bounded non-empty
subset of R and two equivalent characterizations of the supremum of this set.
X + Y := {x + y | x ∈ X, y ∈ Y } and XY := {xy | x ∈ X, y ∈ Y }.
ε = x0 + y0 − z0 > 0.
Since x0 is the supremum of X, it follows from (2.5) that there exists x ∈ X with x > x0 − ε/2,
and similarly there exists y ∈ Y with y > y0 − ε/2. Set z = x + y. It follows that
z = x + y > (x0 − ε/2) + (y0 − ε/2) = x0 + y0 − ε = z0,
which contradicts the fact that z0 is an upper bound for X + Y. Thus z0 = x0 + y0, as desired.
The proof of (4) is done in a similar way.
2.60. — For a non-empty subset X ⊆ R bounded from below, the largest lower bound of X
will also be called the infimum inf(X) of X. An existence statement analogous to Theorem
2.57 holds for the infimum. Alternatively, the infimum of X can be written as
inf(X) = − sup{−x | x ∈ X}.
In this way, practically all statements about infima can be traced back to statements about
suprema.
Here we have added the point +∞ to the right of R and the point −∞ to the left of R
to the straight line. We extend the order relation of the real numbers ≤ to R by requiring
−∞ < x < +∞ for all x ∈ R.
To simplify the notation, we also write ∞ in place of +∞. Often used calculation rules for
the symbols −∞ and ∞, such as the following, are standard conventions:
∞ + x = ∞ + ∞ = ∞ and −∞ + x = −∞ − ∞ = −∞ for all x ∈ R.
One should use such conventions as sparingly as possible and be careful with them. The
expressions ∞ − ∞ and 0 · ∞ or similar remain undefined.
∀ x0 ∈ R ∃ x ∈ X : x > x0 .
Corollary 2.65: 1/n is arbitrarily small
For every ε > 0 there exists an integer n ≥ 1 such that 1/n < ε holds.
For every two real numbers a, b ∈ R with a < b, there exists r ∈ Q with a < r < b.
Proof. Set ε = b − a. According to Archimedes' principle in the form of Corollary 2.65, there
exists m ∈ N with 1/m < ε. Similarly, according to Archimedes' principle from Theorem 2.63,
there exists n ∈ Z with n ≤ ma < n + 1, or equivalently n/m ≤ a < (n + 1)/m.
Hence, since 1/m < ε and a + ε = b, we get
a < (n + 1)/m = n/m + 1/m ≤ a + 1/m < a + ε = b.
Stated differently, the above corollary shows that Q intersects any open non-empty interval
I, that is, I ∩ Q ≠ ∅. A subset X of R is called dense in R if every open non-empty interval
of R contains an element of X. Corollary 2.66 thus states: Q is dense in R.
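The proof of Corollary 2.66 is constructive, so it can be turned directly into a procedure that produces a rational number in a given interval. The following sketch (an illustration, not part of the original notes) uses exact rational arithmetic:

```python
from fractions import Fraction
import math

def rational_between(a: Fraction, b: Fraction) -> Fraction:
    """Mirror the proof of Corollary 2.66: choose m with 1/m < b - a, then
    n with n <= m*a < n + 1; the rational (n + 1)/m lies strictly in (a, b)."""
    assert a < b
    eps = b - a
    m = 1
    while Fraction(1, m) >= eps:  # Archimedes' principle guarantees this stops
        m += 1
    n = math.floor(m * a)         # n <= m*a < n + 1
    return Fraction(n + 1, m)

r = rational_between(Fraction(1, 3), Fraction(1, 2))
assert Fraction(1, 3) < r < Fraction(1, 2)
```

The chain of inequalities in the proof, a < (n + 1)/m ≤ a + 1/m < b, is exactly what guarantees the returned value lies in the open interval.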
The following two exercises establish a generalization of Archimedes’ principle, which will
be used in the section about decimal fractions.
a0 , a1 , a2 , a3 , . . .
with a0 ∈ Z, and 0 ≤ an ≤ 9 for all n ≥ 1. Thus, given a decimal fraction, we can assign to
it a unique element of R as follows.
Suppose a0 ≥ 0. We set
x_n = Σ_{k=0}^{n} a_k · 10^(−k) and y_n = 10^(−n) + Σ_{k=0}^{n} a_k · 10^(−k). (2.6)
x0 ≤ x1 ≤ . . . ≤ xn ≤ xn+1 ≤ . . . ≤ yn+1 ≤ yn ≤ . . . ≤ y1 ≤ y0 .
Thus, if we consider the sets X = {x0 , x1 , x2 , . . .} and Y = {y0 , y1 , y2 , . . .}, according to the
completeness axiom we can conclude the existence of c ∈ R with the property that
xn ≤ c ≤ yn (2.7)
for all n ∈ N. Archimedes' principle in the form of Corollary 2.65 shows that there is precisely
one real number c that satisfies the inequality x_n ≤ c ≤ y_n for all n ≥ 0. Indeed, if there were
two different such numbers, say c and d with c < d, then it would follow that
x_n ≤ c < d ≤ y_n
for all n ≥ 0, which contradicts Exercise 2.68 with m_i = 10^i. We call the element c ∈ R
uniquely determined by (2.7) the real number with decimal expansion a0 , a1 a2 a3 a4 . . .. We
note that two possible alternative definitions would be c = sup(X) and c = inf(Y).
If a0 is negative, we first consider the real number c with decimal expansion −a0 , a1 a2 a3 a4 . . .
and then define the real number with decimal expansion a0 , a1 a2 a3 a4 . . . as −c.
2.70. — Now, the following question arises: Can every element of R be written as a decimal
fraction? This is indeed the case: Let c ∈ R and c ≥ 0. Then we can write a0 := ⌊c⌋ and
define
an := ⌊10n c⌋ − 10⌊10n−1 c⌋. (2.8)
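Formula (2.8) can be evaluated mechanically. The sketch below (illustrative, not part of the original notes) uses exact rational arithmetic via Python's Fraction type, since floating-point rounding would corrupt the digits:

```python
from fractions import Fraction
import math

def decimal_digits(c: Fraction, count: int) -> list[int]:
    """Digits a_0 = floor(c) and a_n = floor(10^n c) - 10*floor(10^{n-1} c),
    exactly as in (2.8)."""
    assert c >= 0
    digits = [math.floor(c)]
    for n in range(1, count + 1):
        digits.append(math.floor(10**n * c) - 10 * math.floor(10**(n - 1) * c))
    return digits

assert decimal_digits(Fraction(1, 5), 4) == [0, 2, 0, 0, 0]
assert decimal_digits(Fraction(1, 3), 4) == [0, 3, 3, 3, 3]
```

Note that for 1/5 the formula produces the expansion 0, 2000 . . . rather than 0, 1999 . . ., in line with the discussion of non-uniqueness in Paragraph 2.71 below.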
2.71. — It is important to remark that two different decimal fractions can represent the
same real number. For example, the two decimal fractions
0, 2000 . . . and 0, 1999 . . .
both represent the real number 1/5. However, the problem only occurs when a decimal fraction
becomes a constant sequence . . . 9999 . . . after a certain point. To rule this out, we can consider
the following definition: We call a real decimal fraction any sequence of integers
a0 , a1 , a2 , a3 , . . .
with a0 ∈ Z, 0 ≤ a_n ≤ 9 for all n ≥ 1, and with the property that for every n0 ≥ 1 there exists
an n ≥ n0 with a_n ≠ 9.
Exercise 2.72. — Let c ≥ 0 be a real number. Verify that the sequence a0 , a1 , a2 , . . .
defined by (2.8) is a real decimal fraction. Then show that this gives rise to a bijection
between R and the set of all real decimal fractions.
Interlude: Cardinality
Let X and Y be two sets. We say that X and Y have the same cardinality or the
same number of elements, written X ∼ Y , if there is a bijection f : X → Y . We
say that Y is larger than X, and write X ≲ Y , if there is an injection f : X → Y .
• We say that the cardinality of the empty set is zero, and write |∅| = 0.
• Let X be a set and n ≥ 1 a natural number. We say the set X has cardinality
n, and write |X| = n, if there is a bijection from X to {1, . . . , n}. In this case we
call X a finite set and write |X| < ∞.
X ≲ Y and Y ≲ X =⇒ X ∼ Y.
Proof. The function i : X → P(X) given by i(x) = {x} is injective. So P(X) is larger than
X. It remains to show that there is no bijection from X to P(X). To show this, we assume
that there is a bijection f : X → P(X) and derive a contradiction. For this, we define
the set
A = {x ∈ X | x ̸∈ f (x)}.
In other words, A ∈ P(X) consists of all elements x in X for which x is not an element of the
subset f (x) ⊂ X.
Since, by assumption, f : X → P(X) is a bijection, there exists a ∈ X such that A = f (a).
We now ask ourselves: does a belong to A or not?
If a ∈ A then, by the definition of A, a ∉ f(a). However, this is impossible since f(a) = A.
Vice versa, if a ∉ A, then a ∈ f(a), which is again impossible since f(a) = A.
This proves that there exists no a ∈ X with f (a) = A, which contradicts the surjectivity
of f . So there can be no bijection f : X → P(X).
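For a finite set the diagonal argument can even be verified exhaustively. The sketch below (an illustration, not part of the original notes) checks, for a three-element set X, that for every function f : X → P(X) the diagonal set A = {x | x ∉ f(x)} fails to be in the image:

```python
from itertools import combinations, product

X = [0, 1, 2]
# The power set P(X): all subsets of X.
powerset = [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

# Every function f : X -> P(X) is a choice of one subset per element of X.
# For each such f, the diagonal set A = {x in X | x not in f(x)} is never
# in the image of f, so no f is surjective.
for images in product(powerset, repeat=len(X)):
    f = dict(zip(X, images))
    A = frozenset(x for x in X if x not in f[x])
    assert A not in images
```

Here the brute-force check over all 8³ = 512 functions confirms what the proof shows in general: assuming A = f(a) for some a is contradictory, so A is outside the image.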
Proof. By Theorem 2.75, P(N) is strictly larger than N, and thus uncountable. Thus, to show
that R is uncountable, it suffices to prove the existence of an injection
φ : P(N) → R.
We construct such an injection by assigning to each subset A the real number φ(A) whose
decimal fraction expansion is given by a0 , a1 a2 a3 a4 . . . with
a_n = 1 if n ∈ A, and a_n = 0 if n ∉ A.
Injectivity of the function φ can be proved in two ways. As a first option, one can simply
apply Exercise 2.72. Alternatively, one can argue as follows. Let A and B be distinct subsets
of N, and let n be the smallest element of A△B (recall Definition 2). If n ∈ A and n ∈ / B
then φ(A) > φ(B) holds. On the other hand, if n ∈ / A and n ∈ B then φ(A) < φ(B) holds.
Therefore φ(A) ̸= φ(B) holds in all cases, which proves the injectivity.
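Restricting to finite subsets, the injectivity of the map φ can be tested directly. The sketch below (illustrative, not part of the original notes) evaluates a finite truncation of φ with exact rational arithmetic and confirms that distinct subsets receive distinct values:

```python
from fractions import Fraction
from itertools import combinations

def phi(A: set, N: int = 8) -> Fraction:
    """Finite truncation of the injection from the proof: the n-th decimal
    digit of phi(A) is 1 if n is in A and 0 otherwise (for n = 0, ..., N)."""
    return sum(Fraction(1, 10**n) for n in range(N + 1) if n in A)

universe = range(6)
subsets = [set(c) for r in range(7) for c in combinations(universe, r)]
values = [phi(A) for A in subsets]
assert len(values) == len(set(values))  # distinct subsets, distinct reals
```

Distinctness holds because two different subsets first differ at some index n, and there the digit strings (hence the values) differ, exactly as in the comparison argument above.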
Since we primarily use the letter x to denote a real number, for sequences of real numbers
we shall mostly use the notation (x_n)_{n∈N}, (x_n)_{n=0}^∞, or (x_n)_{n≥0}.
Let (x_n)_{n=0}^∞ be a sequence in R. We say that (x_n)_{n=0}^∞ is convergent or converges
to a limit A ∈ R if for every ε > 0 there exists N ∈ N such that |x_n − A| < ε for all n ≥ N.
It is a priori not clear that a converging sequence has only one limit. In the following lemma
we show that the limit is indeed unique, so the notation (2.9) is justified.
Proof. Let A ∈ R and B ∈ R be limits of the sequence (x_n)_{n=0}^∞. Let ε > 0. Then we can
find N_A, N_B ∈ N such that |x_n − A| < ε/2 for all n ≥ N_A, and |x_n − B| < ε/2 for all
n ≥ N_B. Set N = max{N_A, N_B}. Then
|A − B| ≤ |A − x_N| + |x_N − B| < ε/2 + ε/2 = ε.
Since ε > 0 was arbitrary, it follows that A = B.
Example 2.82. — A constant sequence (xn )∞ n=0 with xn = A ∈ R for all n ∈ N converges
to A. Similarly, eventually constant sequences converge to the value they eventually take.
Example 2.83. — The sequence of real numbers (1/n)_{n=1}^∞ converges to zero, i.e.,
lim_{n→∞} 1/n = 0.
Indeed, given ε > 0, by Archimedes' principle (Theorem 2.63) there exists N ∈ N with 1/N < ε.
Therefore, for every n ∈ N with n ≥ N, we have 0 ≤ 1/n ≤ 1/N < ε.
Example 2.84. — The sequence of real numbers (y_n)_{n=0}^∞ given by y_n = (−1)^n for n ∈ N
is not convergent, since the sequence members 1, −1, 1, −1, 1, −1, . . . alternate between 1 and
−1 and, in particular, do not approach any real number.
x0 , x1 , x4 , x9 , x16 , x25 , . . .
Let (x_n)_{n=0}^∞ be a sequence in R. A subsequence of (x_n)_{n=0}^∞ is a sequence of the form
(x_{n_k})_{k=0}^∞, where (n_k)_{k=0}^∞ is a sequence of nonnegative integers such that n_{k+1} > n_k for
all k ∈ N.
Remark 2.86. — In the previous definition, since nk+1 > nk for all k ∈ N, it follows that
nk ≥ k for every k ∈ N. (Exercise: Prove this fact by induction on k ∈ N.)
Let (x_n)_{n=0}^∞ be a sequence in R converging to A ∈ R. Then each subsequence of (x_n)_{n=0}^∞
also converges to A.
2.88. — A sequence can have convergent subsequences without itself converging. For
example, the sequence of real numbers given by x_n = (−1)^n is not convergent, while the
subsequences
(x_{2n})_{n=0}^∞ and (x_{2n+1})_{n=0}^∞
converge (to 1 and −1, respectively).
Let (x_n)_{n=0}^∞ be a sequence in R. A point A ∈ R is called an accumulation point of
the sequence (x_n)_{n=0}^∞ if for every ε > 0 and every N ∈ N there exists a natural number
n ≥ N with |x_n − A| < ε.
Let (x_n)_{n=0}^∞ be a sequence in R. An element A ∈ R is an accumulation point of (x_n)_{n=0}^∞
if and only if there exists a convergent subsequence of (x_n)_{n=0}^∞ with limit A.
Proof. By Proposition 2.90 there exists a subsequence (x_{n_k})_{k≥0} such that lim_{k→∞} x_{n_k} = A.
In particular, given ε > 0, there exists N ∈ N such that all elements x_{n_k} with k ≥ N are inside
(A − ε, A + ε).
A converging sequence has exactly one accumulation point, which coincides with its
limit.
Exercise 2.93. — Let (x_n)_{n=0}^∞ be a sequence in R, and let F ⊆ R be the set of accumulation
points of the sequence (x_n)_{n=0}^∞. Show that F is closed.
(x_n)_{n=0}^∞ + (y_n)_{n=0}^∞ = (x_n + y_n)_{n=0}^∞,
α · (x_n)_{n=0}^∞ = (α x_n)_{n=0}^∞,
(x_n)_{n=0}^∞ · (y_n)_{n=0}^∞ = (x_n y_n)_{n=0}^∞.
Remark 2.95. — With the addition and multiplication defined above, the set of sequences
forms a commutative ring, where the zero element is the constant sequence (0)_{n=0}^∞, and the
neutral element for multiplication is the constant sequence (1)_{n=0}^∞.
|x_n − A| < ε/2 ∀ n ≥ N_A, and |y_n − B| < ε/2 ∀ n ≥ N_B.
Hence, for n ≥ max{N_A, N_B},
|(x_n + y_n) − (A + B)| ≤ |x_n − A| + |y_n − B| < ε/2 + ε/2 = ε,
|x_n − A| ≤ |A|/2.
|x_n| = |A + (x_n − A)| ≥ |A| − |x_n − A| ≥ |A|/2 ∀ n ≥ N_0.
|x_n^{−1} − A^{−1}| ≤ 2 |x_n − A| / |A|² < ε ∀ n ≥ N.
Let (x_n)_{n=0}^∞ and (y_n)_{n=0}^∞ be sequences of real numbers with limits A = lim_{n→∞} x_n
and B = lim_{n→∞} y_n.
1. If A < B, then there exists N ∈ N such that x_n < y_n for all n ≥ N.
Proof. Suppose A < B, and let ε = (B − A)/3 > 0. Then there exist N_A, N_B ∈ N such that
n ≥ N_A =⇒ A − ε < x_n < A + ε,
n ≥ N_B =⇒ B − ε < y_n < B + ε
holds. Note now that by the choice of ε it holds 2ε < B − A, therefore
A + ε < B − ε.
Remark 2.98. — In Proposition 2.97(2), even if one assumes that xn < yn , one can only
deduce that A ≤ B. Indeed, consider the sequences
x_n = −1/n, y_n = 1/n ∀ n ≥ 1.
lim_{n→∞} (7n⁴ + 15)/(3n⁴ + n³ + n − 1),  lim_{n→∞} (n² + 5)/(n³ + n + 1),  lim_{n→∞} (n⁵ − 10)/(n² + 1).
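Before computing such limits by hand, it can be instructive to evaluate the terms for large n. The following sketch (illustrative, not part of the original notes) suggests limits of 7/3, 0, and +∞ respectively:

```python
def terms(n: float):
    """The n-th terms of the three sequences in the exercise."""
    return ((7 * n**4 + 15) / (3 * n**4 + n**3 + n - 1),
            (n**2 + 5) / (n**3 + n + 1),
            (n**5 - 10) / (n**2 + 1))

a, b, c = terms(10**6)
assert abs(a - 7 / 3) < 1e-5   # first limit appears to be 7/3
assert b < 1e-5                # second limit appears to be 0
assert c > 1e12                # third grows without bound (improper limit +infinity)
```

Such numerical evidence is of course no proof; the exercise asks for a rigorous argument, e.g. dividing numerator and denominator by the highest power of n.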
2.5.4 Bounded Sequences
In this section, we study bounded sequences of real numbers.
A sequence (x_n)_{n=0}^∞ in R is called bounded if there exists a real number M ≥ 0 such
that |x_n| ≤ M for all n ∈ N.
Proof. Let (x_n)_{n=0}^∞ be a convergent sequence with limit A ∈ R. Choosing ε = 1 in the
definition of limit, there exists N ∈ N such that |x_n − A| ≤ 1 for all n ≥ N. In particular, by
the triangle inequality (see property (g) in Paragraph 2.16), |x_n| ≤ 1 + |A| for all n ≥ N. Set
M = max{1 + |A|, |x_0|, |x_1|, . . . , |x_{N−1}|}.
Then |x_n| ≤ M for all n ∈ N.
|x_n y_n − AB| = |x_n y_n − x_n B + x_n B − AB| ≤ |x_n| |y_n − B| + |x_n − A| |B|
2.104. — As we will show, bounded sequences of real numbers always have at least one
accumulation point, or equivalently, a convergent subsequence. This fact gives rise to the
important notion of superior limit and inferior limit.
A sequence (x_n)_{n=0}^∞ is called:
2.107. — Monotone bounded sequences are always convergent. We illustrate this with a
picture:
Figure 2.3
A monotone sequence of real numbers (x_n)_{n=0}^∞ converges if and only if it is bounded.
If the sequence (x_n)_{n=0}^∞ is monotonically increasing, then
lim_{n→∞} x_n = sup {x_n | n ∈ N}.
Proof. If (x_n)_{n=0}^∞ is convergent, it follows by Lemma 2.102 that (x_n)_{n=0}^∞ is bounded.
Vice versa, suppose that (x_n)_{n=0}^∞ is monotonically increasing and bounded. Then there
exists M > 0 such that x_n ≤ M for all n ∈ N. This means that the set {x_n | n ∈ N} is
bounded from above, so the supremum A = sup {x_n | n ∈ N} exists. We now want to prove
that x_n converges to A.
By the definition of supremum, we have:
(i) xn ≤ A for all n ∈ N;
(ii) for every ε > 0 there exists N ∈ N with xN > A − ε.
Thus, for n ≥ N, it follows from (i), (ii), and the monotonicity of (x_n)_{n=0}^∞ that
A − ε < x_N ≤ x_n ≤ A < A + ε.
Hence |x_n − A| < ε for all n ≥ N, which proves that x_n converges to A.
Remark 2.109. — If (x_n)_{n=0}^∞ is monotone and there exists a bounded subsequence (x_{n_k})_{k=0}^∞,
then the whole sequence is bounded (and therefore converges, thanks to Theorem 2.108).
Indeed, assume for instance that (x_n)_{n=0}^∞ is increasing and the subsequence (x_{n_k})_{k=0}^∞ is
bounded from above by a number M. Then, recalling Remark 2.86, by monotonicity we have
x_0 ≤ x_k ≤ x_{n_k} ≤ M ∀ k ∈ N.
So, (x_n)_{n=0}^∞ is bounded. The case when (x_n)_{n=0}^∞ is decreasing is analogous.
for n ≥ 1. Show that (x_n)_{n=0}^∞ converges and determine the limit.
Hint: First, prove that the sequence converges to a nonnegative limit. Second, show that if
A ≥ 0 is the limit, then it satisfies A = (2/3)A + 1/A. Use this relation to identify A.
Exercise 2.111. — Let (x_n)_{n=0}^∞ be a monotonically increasing sequence and (y_n)_{n=0}^∞ be
a monotonically decreasing sequence with x_n ≤ y_n for all n ∈ N. Show that both sequences
converge and that lim_{n→∞} x_n ≤ lim_{n→∞} y_n holds. Illustrate your argument with a picture
similar to the one in Figure 2.3.
2.112. — Let (x_n)_{n=0}^∞ be a bounded sequence of real numbers. For the definition of limits
and accumulation points of the sequence (x_n)_{n=0}^∞, only its long-term behavior is relevant, or
more precisely, the end (x_k)_{k=N}^∞ for arbitrarily large N ∈ N. Following this observation, for
every n ∈ N we consider the supremum
s_n = sup {x_k | k ≥ n}
over the final part {x_k | k ≥ n} of the sequence. Since {x_k | k ≥ m} ⊂ {x_k | k ≥ n} for m > n,
it follows that s_m ≤ s_n for m > n. The sequence (s_n)_{n=0}^∞ is therefore monotonically decreasing.
Since (x_n)_{n=0}^∞ is bounded by assumption, (s_n)_{n=0}^∞ is also bounded, and so it is a
monotonically decreasing bounded sequence. Therefore, the sequence (s_n)_{n=0}^∞ converges to
the infimum of the set {s_n | n ∈ N} by Theorem 2.108. This infimum is called the superior limit
of the given sequence (x_n)_{n=0}^∞.
Analogously, one can define the inferior limit of (x_n)_{n=0}^∞ considering
i_n = inf {x_k | k ≥ n}.
in ≤ xn ≤ sn ∀ n ∈ N. (2.10)
The limits
lim sup_{n→∞} x_n = lim_{n→∞} sup{x_k | k ≥ n} and lim inf_{n→∞} x_n = lim_{n→∞} inf{x_k | k ≥ n}
are called superior limit, respectively inferior limit of the sequence (x_n)_{n=0}^∞. Note
that, as a consequence of (2.10) and Proposition 2.97,
n     1     2     3     4     5     6     7     8    . . .
x_n   0    3/2  −2/3   5/4  −4/5   7/6  −6/7   9/8  . . .
s_n  3/2   3/2   5/4   5/4   7/6   7/6   9/8   9/8  . . .
i_n  −1    −1    −1    −1    −1    −1    −1    −1   . . .
Note here that s_n = x_n when n is even, and s_n = x_{n+1} otherwise. Therefore
lim sup_{n→∞} ((−1)^n + 1/n) = lim_{n→∞} ((−1)^{2n} + 1/(2n)) = lim_{n→∞} (2n + 1)/(2n) = 1.
Because x_n ≥ −1 for every n and lim_{n→∞} x_{2n+1} = −1, we get i_n = −1 for all n ∈ N. Therefore
lim inf_{n→∞} ((−1)^n + 1/n) = −1.
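The tail suprema and infima of this example can be approximated numerically. The sketch below (illustrative, not part of the original notes) replaces the infinite tails by long finite windows, which for this particular sequence captures the behavior well:

```python
import math

def x(n: int) -> float:
    return (-1) ** n + 1 / n

def tail_sup(n: int, window: int = 10**4) -> float:
    # Approximates s_n = sup{x_k | k >= n} by a finite window; for this
    # sequence the omitted tail only contributes values below the window's max.
    return max(x(k) for k in range(n, n + window))

def tail_inf(n: int, window: int = 10**4) -> float:
    return min(x(k) for k in range(n, n + window))

assert tail_sup(5) == x(6)  # sup of the tail is attained at the first even index
assert math.isclose(tail_sup(1000), 1.0, abs_tol=1e-2)   # lim sup = 1
assert math.isclose(tail_inf(1000), -1.0, abs_tol=1e-2)  # lim inf = -1
```

Note the asymmetry visible in the table: the tail supremum is attained (at the first even index), whereas the tail infimum −1 is approached but never attained, so the finite window only approximates it.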
Proof. Define
i_N = inf{x_n | n ≥ N}, s_N = sup{x_n | n ≥ N},
A − ε ≤ i_N ≤ s_N ≤ A + ε,
A − ε ≤ I ≤ S ≤ A + ε.
Proof. Let ε > 0, and write s_n = sup{x_k | k ≥ n}. The sequence (s_n)_{n=0}^∞ is monotonically
decreasing and converges to A. So there exists N_0 ∈ N such that
A ≤ s_n < A + ε ∀ n ≥ N_0. (2.11)
Every bounded sequence of real numbers has an accumulation point and has a convergent
subsequence.
Proof. By Theorem 2.116, the limsup (and analogously for the liminf) is always an accumu-
lation point. Also, by Proposition 2.90, every accumulation point is the limit of a converging
subsequence. So, a convergent subsequence always exists.
Exercise 2.118. — Let (x_n)_{n=0}^∞ be a bounded sequence in R, and let E ⊆ R be the set of
accumulation points of the sequence (x_n)_{n=0}^∞. Show that
Exercise 2.119. — Let (a_n)_{n=0}^∞, (b_n)_{n=0}^∞ and (c_n)_{n=0}^∞ be convergent sequences of real
numbers, with limits A, B, and C respectively. Let (x_n)_{n=0}^∞ be the sequence defined by
x_n = a_n if n = 3k, k ∈ N;  x_n = b_n if n = 3k + 1, k ∈ N;  x_n = c_n if n = 3k + 2, k ∈ N.
Calculate lim sup_{n→∞} x_n, lim inf_{n→∞} x_n, and the set of accumulation points of the
sequence (x_n)_{n=0}^∞.
2.5.5 Cauchy-Sequences
Proof. The proof is very similar to the one of Lemma 2.102. Choosing ε = 1 in the definition
of Cauchy sequence, there exists N ∈ N such that |x_n − x_N| ≤ 1 for all n ≥ N. In particular
|x_n| ≤ 1 + |x_N| for all n ≥ N, and therefore the sequence is bounded by
M = max{|x_0|, |x_1|, . . . , |x_{N−1}|, 1 + |x_N|}.
Exercise 2.123. — Show that a Cauchy sequence converges if and only if it has a conver-
gent subsequence.
|x_n − x_m| ≤ |x_n − A| + |x_m − A| < ε/2 + ε/2 = ε ∀ n, m ≥ N,
hence (xn )∞
n=0 is a Cauchy sequence.
Vice versa, let (x_n)_{n=0}^∞ be a Cauchy sequence. Since it is bounded (see Lemma 2.122),
|x_n − x_m| < ε/2 ∀ m, n ≥ N_0,
0, 1, 1 + 1/2, 2, 2 + 1/3, 2 + 2/3, 3, 3 + 1/4, 3 + 2/4, 3 + 3/4, 4, 4 + 1/5, 4 + 2/5, 4 + 3/5, 4 + 4/5, 5, 5 + 1/6, . . .
that progresses between n − 1 and n in steps of length 1/n. This sequence is unbounded and thus
not convergent. On the other hand, the distance between two successive elements decreases
as the sequence progresses, and becomes arbitrarily small.
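This staircase sequence is worth probing numerically, since it separates two properties that are easy to conflate: successive differences tending to zero, versus the Cauchy property. A small sketch (illustrative, not part of the original notes):

```python
def seq(num_blocks: int) -> list[float]:
    """The sequence 0, 1, 1+1/2, 2, 2+1/3, 2+2/3, 3, ...:
    block n walks from n-1 toward n in steps of length 1/n."""
    out = []
    for n in range(1, num_blocks + 1):
        out.extend((n - 1) + k / n for k in range(n))
    return out

xs = seq(200)
# Unbounded: the sequence reaches arbitrarily large values ...
assert max(xs) > 198
# ... yet successive differences become arbitrarily small. Small consecutive
# gaps do NOT make it Cauchy: being Cauchy requires |x_n - x_m| small for
# ALL large n and m, not just for neighbors.
diffs = [abs(b - a) for a, b in zip(xs, xs[1:])]
assert diffs[-1] < 0.006
```

Within block n every step has size 1/n, and the jump between blocks also has size 1/n, so all late differences are small even though the sequence diverges.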
We say that (x_n)_{n=0}^∞ diverges to ∞, and write
lim_{n→∞} x_n = ∞,
if for every M > 0 there exists N ∈ N such that x_n > M for all n ≥ N.
Similarly, we say that (xn )∞
n=0 diverges to −∞ if for every real number M > 0 there
exists N ∈ N such that xn < −M for all n ≥ N .
In both cases, we speak of an improper limit.
2.127. — An unbounded sequence need not diverge to ∞ or −∞. For example, the sequence
Exercise 2.128. — Let (xn )∞ n=0 be an unbounded sequence of real numbers. Show that
there exists a subsequence that diverges to ∞ or to −∞.
2.129. — We can use improper limits to define the superior and inferior limits for unbounded
sequences in R. If the sequence (x_n)_{n=0}^∞ is not bounded from above, then sup{x_k | k ≥ n} = ∞
for all n ∈ N and we write
lim sup_{n→∞} x_n = ∞.
If (x_n)_{n=0}^∞ is bounded from above but not from below, then we write
lim sup_{n→∞} x_n = lim_{n→∞} sup{x_k | k ≥ n},
where the right-hand side is a real limit if the monotonically decreasing sequence sup{xk | k ≥
n} is bounded, and the improper limit −∞ otherwise. We use this terminology analogously
for the inferior limit.
Exercise 2.130. — Prove the following version of the sandwich lemma for improper limits.
For two sequences of real numbers (x_n)_{n=0}^∞ and (y_n)_{n=0}^∞ with x_n ≤ y_n for all n ∈ N,
the following holds:
lim_{n→∞} x_n = ∞ =⇒ lim_{n→∞} y_n = ∞.
A sequence of complex numbers (z_n)_{n=0}^∞ = (x_n + i y_n)_{n=0}^∞ is convergent with limit
A + iB ∈ C if the two sequences of real numbers (x_n)_{n=0}^∞ and (y_n)_{n=0}^∞ converge to A
and B, respectively.
Remark 2.132. — As we did for sequences of real numbers, also for sequences of complex
numbers we can consider subsequences. This corresponds to considering (z_{n_k})_{k=0}^∞ =
(x_{n_k} + i y_{n_k})_{k=0}^∞ for some strictly increasing sequence of nonnegative integers (n_k)_{k=0}^∞.
Exercise 2.133. — Let (z_n)_{n=0}^∞ be a convergent sequence in C. Show that (|z_n|)_{n=0}^∞
converges, and find the limit. Conversely, does the convergence of (|z_n|)_{n=0}^∞ imply the
convergence of (z_n)_{n=0}^∞?
3.1. — For an arbitrary non-empty set D ⊂ R, we define the set of real-valued functions
on D as
F(D) = {f | f : D → R}.
Chapter 3.1 Real-valued Functions
Remark 3.2. — The interested reader may notice that, with the addition and multi-
plication defined above, F(D) is a commutative ring (the neutral element for addition is the
constant function f ≡ 0, the neutral element for multiplication is the constant function f ≡ 1).
We say that x ∈ D is a zero of f ∈ F(D) if f (x) = 0 holds. The zero set of f is defined
by {x ∈ D | f (x) = 0}. Finally, we define an order relation on F(D): given f1 , f2 ∈ F(D) we
say
f1 ≤ f2 ⇐⇒ f1 (x) ≤ f2 (x) ∀ x ∈ D,
Exercise 3.4. — Verify that the relation ≤ defined above on F(D) is indeed an order
relation.
We say that f ∈ F(D) is bounded from above if there exists M > 0 such that
f(x) ≤ M ∀ x ∈ D.
We say that f is bounded from below if there exists M > 0 such that
f (x) ≥ −M ∀ x ∈ D.
Finally, we say that f is bounded if f is bounded from above and from below. Equiv-
alently, f is bounded if there exists M > 0 such that
|f (x)| ≤ M ∀ x ∈ D.
• For any subset D ⊂ R and any odd integer n ≥ 1, the function x ↦ x^n on D is
strictly monotonically increasing.
• The rounding function ⌊·⌋ : R → R (recall Definition 2.64) is increasing, but not strictly
increasing.
Figure 3.1: A strictly monotone function is always injective. However, it need not be surjective.
For example, the function f : R → R given by f(x) = x/8 + sgn(x) is strictly monotone
increasing but not surjective (e.g., 1/2 is not in the image of f).
Exercise 3.8. — Let D ⊆ R, and let f1 , f2 ∈ F(D) be strictly increasing. Show that:
(ii) given a ∈ R, the function af ∈ F(D) is strictly increasing if a > 0, and strictly
decreasing if a < 0;
3.1.2 Continuity
Remark 3.10. — In the definition of continuity, it is only important to check the implication
for ε > 0 small. Indeed, assume that a function f satisfies the following:
There exists ε0 > 0 such that, for all ε ∈ (0, ε0 ] there exists δ > 0 such that
Then f is continuous at x0 . Indeed, for ε > ε0 we can choose the number δ > 0 corresponding
to ε0 to get
∀ x ∈ D, |x − x0 | < δ =⇒ |f (x) − f (x0 )| < ε0 < ε.
3.11. — The following illustration shows a continuous function on D = [a, b) ∪ (c, d] ∪ {e}.
We see that f is continuous at every point x0 : no matter how small one chooses ε > 0, for a
suitable δ > 0 we have that for all x that are δ-close to x0 , f (x) is also ε-close to f (x0 ).
Applet 3.12 (Continuity). We consider a function that is continuous at most (but not all)
points in the domain of definition.
Example 3.13. — • Let a and b be real numbers. The affine function f : R → R given
by f (x) = ax + b is continuous.
Indeed, if a = 0, then the function is constant and therefore continuous. So, let a ≠ 0.
Given x_0 ∈ R and ε > 0, note that |f(x) − f(x_0)| = |a||x − x_0| holds for all x ∈ R.
Thus, choosing δ = ε/|a|, for any x ∈ R with |x − x_0| < δ we have
|f(x) − f(x_0)| = |a||x − x_0| < |a| δ = ε.
• The function f : R → R given by f(x) = |x| is also continuous. Indeed, let x_0 ∈ R and
ε > 0. Choosing δ = ε we notice that if |x − x_0| < δ then
||x| − |x_0|| ≤ |x − x_0| < δ = ε.
• The rounding function f : R → R given by f(x) = ⌊x⌋ (recall Definition 2.64) is not
continuous at points in Z. Indeed, if x_0 ∈ Z, then for any δ > 0 small it holds that
|(x_0 − δ/2) − x_0| < δ and |f(x_0 − δ/2) − f(x_0)| = 1.
Note that we consider f |D′ and f as different functions, since their domains of definition
are not the same – except, of course, when D′ = D.
Proof. Let ε > 0. Since f_1 and f_2 are continuous at x_0, there exist δ_1, δ_2 > 0 such that, for
all x ∈ D, it holds
|x − x_0| < δ_1 =⇒ |f_1(x) − f_1(x_0)| < ε/2,
|x − x_0| < δ_2 =⇒ |f_2(x) − f_2(x_0)| < ε/2.
Setting δ = min{δ_1, δ_2}, for all x ∈ D we get
|x − x_0| < δ =⇒ |(f_1 + f_2)(x) − (f_1 + f_2)(x_0)| ≤ |f_1(x) − f_1(x_0)| + |f_2(x) − f_2(x_0)| < ε/2 + ε/2 = ε.
The argument for f1 f2 is similar but a little more complicated. Given x ∈ D, using the
triangle inequality we have the estimate
|f1 (x)f2 (x) − f1 (x0 )f2 (x0 )| = |f1 (x)f2 (x) − f1 (x0 )f2 (x) + f1 (x0 )f2 (x) − f1 (x0 )f2 (x0 )|
≤ |f1 (x)f2 (x) − f1 (x0 )f2 (x)| + |f1 (x0 )f2 (x) − f1 (x0 )f2 (x0 )|
= |f1 (x) − f1 (x0 )||f2 (x)| + |f1 (x0 )||f2 (x) − f2 (x0 )|.
Given ε > 0, choose δ_1, δ_2 > 0 such that, for all x ∈ D,
|x − x_0| < δ_1 =⇒ |f_1(x) − f_1(x_0)| < ε / (2(|f_2(x_0)| + ε)),
|x − x_0| < δ_2 =⇒ |f_2(x) − f_2(x_0)| < ε / (2(|f_1(x_0)| + ε)).
Set δ = min{δ_1, δ_2}. For x ∈ D with |x − x_0| < δ we have |f_2(x)| ≤ |f_2(x_0)| + ε, therefore
|f_1(x) − f_1(x_0)| |f_2(x)| < [ε / (2(|f_2(x_0)| + ε))] (ε + |f_2(x_0)|) = ε/2.
Analogously, for the second term, for x ∈ D with |x − x_0| < δ we have
|f_1(x_0)| |f_2(x) − f_2(x_0)| < |f_1(x_0)| · ε / (2(|f_1(x_0)| + ε)) < ε/2.
Combining the inequalities above, we obtain |f1 (x)f2 (x) − f1 (x0 )f2 (x0 )| < ε as desired.
The statement about af1 for a ∈ R follows choosing f2 equal to the constant function
f2 ≡ a and applying the previous result.
We will refer to a_j as the summand and j as the index or the running variable of
the sum. If J is a finite set, and if for each j ∈ J a number a_j is given, we write
Σ_{j∈J} a_j
for the sum of all numbers in the set {a_j | j ∈ J}. Finally, we will use the convention
Σ_{j∈∅} a_j = 0 for the sum over the empty index set.
the parentheses are irrelevant because the following is true for all w ∈ W :
Proof. Let ε > 0. Then, due to the continuity of g at f(x_0), there exists η > 0 such that for
all y ∈ D_2
|y − f(x_0)| < η =⇒ |g(y) − g(f(x_0))| < ε.
Since η > 0 and f is continuous at x_0, there exists δ > 0 such that, for all x ∈ D_1,
|x − x_0| < δ =⇒ |f(x) − f(x_0)| < η =⇒ |g(f(x)) − g(f(x_0))| < ε.
Remark 3.21. — Applying Proposition 3.20 with g(x) = |x| (see Example 3.13), we deduce
that if f : D → R is continuous then also the function x 7→ |f (x)| is continuous.
Exercise 3.23. — Let a < b < c be real numbers, and f_1 : [a, b] → R and f_2 : [b, c] → R
be continuous functions. Show that the function f : [a, c] → R given by
f(x) = f_1(x) if x ∈ [a, b), and f(x) = f_2(x) if x ∈ [b, c],
is continuous if and only if f_1(b) = f_2(b).
Exercise 3.24. — Let I ⊂ R be an open interval and let f : I → R be a function. Show that
f is continuous if and only if, for every open set U ⊂ R, f⁻¹(U) is also open.
Thus
n ≥ N =⇒ |f (xn ) − f (x̄)| < ε,
Using this with δ = 2^(−n) > 0, for each n ∈ N we find x_n ∈ D with
converge?
X = {x ∈ [a, b] | f (x) ≤ c} .
Since a ∈ X and X ⊆ [a, b], the set X is non-empty and bounded from above. Therefore, by
Theorem 2.57, the supremum x̄ = sup(X) ∈ [a, b] exists. We will now use the continuity of f
at x̄ to show that f(x̄) = c holds.
First of all, since x̄ is the supremum of X, for any n ≥ 0 we can find an element x_n ∈
X ∩ [x̄ − 2^(−n), x̄]. Since x_n ∈ X it follows that f(x_n) ≤ c. Then, since the sequence (x_n)_{n=0}^∞
converges to x̄ (because |x_n − x̄| ≤ 2^(−n)) and f is continuous, Theorem 3.26 implies that
f(x̄) = lim_{n→∞} f(x_n) ≤ c.
Hence, f(x̄) ≤ c.
Suppose now by contradiction that f(x̄) < c. Since c ≤ f(b), it follows that x̄ < b. Then,
by continuity, given ε = c − f(x̄) > 0 there exists δ > 0 such that, for x ∈ [a, b],
|x − x̄| < δ =⇒ f(x) < f(x̄) + ε = c.
This implies in particular that (x̄ − δ, x̄ + δ) ∩ [a, b] ⊂ X, so X contains the set
(x̄, x̄ + δ) ∩ (x̄, b], a contradiction to the fact that x̄ = sup(X). In conclusion f(x̄) = c, as
desired.
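The intermediate value theorem also underlies a practical root-finding algorithm: repeatedly halve the interval, keeping the half on which the value c is still "crossed". The following sketch (an illustration in the spirit of the supremum argument above, not part of the original notes) implements this bisection idea:

```python
def bisect_root(f, a: float, b: float, c: float, tol: float = 1e-10) -> float:
    """Locate x with f(x) = c on [a, b], assuming f is continuous and
    f(a) <= c <= f(b), by repeated halving of the interval."""
    assert f(a) <= c <= f(b)
    while b - a > tol:
        m = (a + b) / 2
        if f(m) <= c:
            a = m   # keep the half where f(a) <= c still holds
        else:
            b = m
    return (a + b) / 2

# Example: f(t) = t^3 - t is continuous with f(1) = 0 <= 4 <= f(2) = 6.
x = bisect_root(lambda t: t**3 - t, 1.0, 2.0, 4.0)
assert abs(x**3 - x - 4.0) < 1e-8
```

The loop invariant f(a) ≤ c ≤ f(b) plays the same role as the set X in the proof: the left endpoint always stays inside {x | f(x) ≤ c}.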
Remark 3.30. — If f : [a, b] → R is a continuous function such that f (a) > f (b), Theo-
rem 3.29 still holds in the following way:
For every real number c with f (a) ≥ c ≥ f (b) there exists x̄ ∈ [a, b] such that f (x̄) = c.
To prove this statement, there are two possible ways:
(1) repeat the proof of Theorem 3.29 defining X = {x ∈ [a, b] | f (x) ≥ c};
(2) apply Theorem 3.29 to the function g = −f .
The function g is called inverse function (or inverse mapping) of f , and is often
denoted by f −1 .
3.33. — In this subsection, we show that every continuous strictly monotone function has
an inverse function that is also continuous.
Proof. Without loss of generality, we can assume that I is non-empty and not a single point.
Also, we can assume that f is strictly increasing (otherwise replace f with −f ).
We write J = f (I), and first notice that the function f : I → J is bijective, since it is
surjective by definition, and due to strict monotonicity it is also injective. Thus, there exists
a uniquely determined inverse g = f −1 : J → I.
We note that the function g is strictly increasing: since f is strictly increasing,
x_1 < x_2 ⟺ f(x_1) < f(x_2) ∀ x_1, x_2 ∈ I,
which leads to
g(y_1) < g(y_2) ⟺ y_1 < y_2 ∀ y_1, y_2 ∈ J.
Define x_n = g(y_n) and x̄ = g(ȳ). The property above tells us that, for every n ∈ N,
either xn ≤ x̄ − ε, or xn ≥ x̄ + ε.
In particular, there are infinitely many indices n’s for which one of the above options holds.
Without loss of generality, let us assume that there are infinitely many indices n’s for which
x_n ≤ x̄ − ε, and define a subsequence (x_{n_k})_{k=0}^∞ using such indices, so that
x_{n_k} ≤ x̄ − ε ∀ k ∈ N.
Since f is strictly increasing and x̄ − ε ∈ I (since both x_{n_k} = g(y_{n_k}) and x̄ = g(ȳ) belong to
I, and I is an interval), we deduce that
y_{n_k} = f(x_{n_k}) ≤ f(x̄ − ε) < f(x̄) = ȳ ∀ k ∈ N,
a contradiction.
is continuous, strictly increasing, and surjective. According to the Inverse Function Theorem,
there exists a continuous strictly increasing inverse [0, ∞) → [0, ∞), that we express as
x ↦ x^(1/n) = ⁿ√x.
Moreover, for integers m, n ≥ 1 one sets
x^(−m/n) = 1 / x^(m/n) for x ∈ (0, ∞).
Exercise 3.36. — For any real number a > 0, we define the sequence of real numbers
(x_n)_{n=1}^∞ by x_n = ⁿ√a. Show that the sequence (x_n)_{n=1}^∞ converges, and that
lim_{n→∞} ⁿ√a = 1.
Exercise 3.37. — We define a sequence of real numbers (x_n)_{n=1}^∞ by x_n = ⁿ√n. Show that
this sequence converges, with limiting value
lim_{n→∞} ⁿ√n = 1.
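Both limits are easy to probe numerically before proving them. A short sketch (illustrative, not part of the original notes):

```python
import math

n = 10**6
# For any fixed a > 0, the n-th root of a approaches 1 ...
for a in (0.1, 2.0, 1000.0):
    assert math.isclose(a ** (1 / n), 1.0, abs_tol=1e-4)
# ... and, less obviously, so does the n-th root of n itself.
assert math.isclose(n ** (1 / n), 1.0, abs_tol=1e-4)
```

The second assertion is the more delicate one: the base grows with n, yet the exponent 1/n wins, since n^(1/n) = exp(ln(n)/n) and ln(n)/n → 0.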
Proof. Assume by contradiction that f is not bounded. Then, for every n ∈ N there exists
a point x_n ∈ [a, b] such that |f(x_n)| ≥ n. Applying Lemma 3.38, we can find a subsequence
(x_{n_k})_{k=0}^∞ such that lim_{k→∞} x_{n_k} = x̄ ∈ [a, b].
Hence, by the continuity of |f| (recall Remark 3.21) we deduce that the sequence (|f(x_{n_k})|)_{k=0}^∞
converges to the real number |f(x̄)|, a contradiction since
|f(x_{n_k})| ≥ n_k → ∞ as k → ∞.
Proof. Since f is bounded (thanks to Theorem 3.39), Theorem 2.57 implies that the supremum
S = sup(f ([a, b])) exists. By definition of supremum, for any n ∈ N there exists yn = f (xn )
with xn ∈ [a, b], such that S − 2−n ≤ yn ≤ S. Hence, limn→∞ yn = S.
Thanks to Lemma 3.38, there exists a subsequence (x_{n_k})_{k=0}^∞ such that lim_{k→∞} x_{n_k} = x̄ ∈
[a, b]. Hence, by the continuity of f and Theorem 3.26, this implies that
f(x̄) = lim_{k→∞} f(x_{n_k}) = lim_{k→∞} y_{n_k} = S.
Exercise 3.43. — Does any continuous function f on the open interval (0, 1) attain its
maximum?
Exercise 3.45. — Show that the polynomial function f(x) = x² is continuous on R but
not uniformly continuous. Then show that the restriction of f to [0, 1] is uniformly continuous.
Proof. Assume by contradiction that f is not uniformly continuous. This means that there
exists ε > 0 such that, for all δ > 0, one can find x, y ∈ [a, b] with
|x − y| < δ and |f(x) − f(y)| ≥ ε.
Using this with δ = 2^(−n) > 0, for each n ∈ N we can find x_n, y_n ∈ [a, b] satisfying
|ynk − x̄| ≤ |ynk − xnk | + |xnk − x̄| < 2−nk + |xnk − x̄|.
k ≥ N_1 =⇒ |f(x_{n_k}) − f(x̄)| < ε/2, and k ≥ N_2 =⇒ |f(y_{n_k}) − f(x̄)| < ε/2.
Hence, for k ≥ max{N_1, N_2},
|f(x_{n_k}) − f(y_{n_k})| ≤ |f(x_{n_k}) − f(x̄)| + |f(y_{n_k}) − f(x̄)| < ε/2 + ε/2 = ε,
Exercise 3.47. — Does the statement of Theorem 3.46 hold for continuous functions on
the open interval (0, 1)?
Give examples of Lipschitz continuous functions, and show that a Lipschitz continuous
function is also uniformly continuous.
2. Let f : R≥0 → R be the root function, f(x) = √x. Show that:
(i) f|[0,1] : [0, 1] → R is not Lipschitz continuous;
(ii) f|[1,∞) : [1, ∞) → R is Lipschitz continuous;
(iii) f : [0, ∞) → R is uniformly continuous.
x / ((n + 1)(n + x)) ≤ (x + n) / ((n + 1)(n + x)) = 1/(n + 1) ≤ 1,
that is
−x / ((n + 1)(n + x)) ≥ −1 ∀ n ≥ n_0.
a_{n+1} / a_n = (1 + x/(n+1))^(n+1) / (1 + x/n)^n
= (1 + x/n) · [ (1 + x/(n+1)) / (1 + x/n) ]^(n+1)
= (1 + x/n) · [ n(n + 1 + x) / ((n + 1)(n + x)) ]^(n+1)
= ((n + x)/n) · [ 1 − x / ((n + 1)(n + x)) ]^(n+1)
≥ ((n + x)/n) · [ 1 − x/(n + x) ] = 1.
To show boundedness, we first consider the case x ≤ 0. In this case we note that
0 < 1 + x/n ≤ 1 ∀ n ≥ n_0,
thus 0 < (1 + x/n)^n ≤ 1 holds. So 1 is an upper bound for the increasing positive sequence
(a_n)_{n=n_0}^∞. Therefore
lim_{n→∞} (1 + x/n)^n = sup { (1 + x/n)^n | n ≥ n_0 } > 0.
In the case x > 0 we have
1 ≤ (1 + x/n)^n ≤ (1 − x/n)^(−n) = 1 / (1 + (−x)/n)^n ∀ n > x.
Since the sequence ((1 + (−x)/n)^n)_{n=1}^∞ converges to a positive number (by the case x ≤ 0
above), Proposition 2.96(3) implies that also the sequence ((1 + (−x)/n)^(−n))_{n=1}^∞ converges,
and in particular it is bounded (see Lemma 2.102). This implies that the monotonically
increasing sequence ((1 + x/n)^n)_{n=1}^∞ is also bounded, so it converges.
\[
\exp(x) = \lim_{n\to\infty}\Bigl(1+\frac{x}{n}\Bigr)^n \qquad \text{for all } x \in \mathbb{R}.
\]
e = 2.71828182845904523536028747135266249775724709369995 . . .
Proof. It suffices to observe that, given x ∈ R with x > −n, Lemma 3.51 and Definition 3.52
imply that an ≤ an+1 ≤ . . . ≤ exp(x).
\[
\exp(0) = 1, \tag{3.3}
\]
\[
\exp(-x) = \exp(x)^{-1} \quad \text{for all } x \in \mathbb{R}, \tag{3.4}
\]
\[
\exp(x+y) = \exp(x)\exp(y) \quad \text{for all } x, y \in \mathbb{R}. \tag{3.5}
\]
1. The identity (3.3) follows directly from the definition of the exponential function.
Regarding (3.4), the Bernoulli inequality gives, for n large,
\[
1 - \frac{x^2}{n} \le \Bigl(1-\frac{x^2}{n^2}\Bigr)^n = \Bigl(1+\frac{x}{n}\Bigr)^n\Bigl(1-\frac{x}{n}\Bigr)^n \le 1.
\]
Since the left-hand side converges to 1 as n → ∞, the Sandwich Lemma 2.99 implies that lim_{n→∞} (1 − x²/n²)^n = 1, so (3.4) holds.
To prove (3.5), note that
\[
\Bigl(1-\frac{x}{n}\Bigr)\Bigl(1-\frac{y}{n}\Bigr)\Bigl(1+\frac{x+y}{n}\Bigr)
= 1-\frac{(x+y)^2}{n^2}+\frac{xy}{n^2}\Bigl(1+\frac{x+y}{n}\Bigr)
= 1+\frac{c_n}{n^2},
\]
where c_n = −(x + y)² + xy(1 + (x + y)/n). Since
\[
-2|xy| \le -xy + xy\,\frac{x+y}{n} \le 2|xy| \qquad \forall\, n \ge |x|+|y|,
\]
and 2|xy| ≤ x² + y², we deduce that −2(x² + y²) ≤ c_n ≤ 0 for all n ≥ |x| + |y|. Hence, by the Bernoulli inequality,
\[
1-\frac{2(x^2+y^2)}{n} \le 1+\frac{c_n}{n} \le \Bigl(1+\frac{c_n}{n^2}\Bigr)^n \le 1 \qquad \forall\, n \ge |x|+|y|,
\]
so lim_{n→∞} (1 + c_n/n²)^n = 1 by the Sandwich Lemma 2.99. Using (3.4) and Proposition 2.96(2), we get
\[
\frac{\exp(x+y)}{\exp(x)\exp(y)}
= \lim_{n\to\infty}\Bigl(1-\frac{x}{n}\Bigr)^n\Bigl(1-\frac{y}{n}\Bigr)^n\Bigl(1+\frac{x+y}{n}\Bigr)^n
= \lim_{n\to\infty}\Bigl(1+\frac{c_n}{n^2}\Bigr)^n = 1.
\]
It remains to prove the continuity, monotonicity, and bijectivity of the exponential function.
We shall first prove some useful estimates. First of all, we claim that
\[
\exp(x) \ge 1+x \qquad \forall\, x \in \mathbb{R}. \tag{3.6}
\]
Indeed, for x ≤ −1 this is clear since exp(x) > 0. For x > −1, it follows from Corollary 3.53.
Combining (3.6) and (3.4), we also deduce that
\[
\exp(x) = \frac{1}{\exp(-x)} \le \frac{1}{1-x} \qquad \forall\, x < 1. \tag{3.7}
\]
In particular, given δ ∈ (0, 1), we deduce that
\[
x \in [0,\delta) \implies |\exp(x)-1| = \exp(x)-1 \le \frac{1}{1-x}-1 \le \frac{1}{1-\delta}-1 = \frac{\delta}{1-\delta},
\]
\[
x \in (-\delta,0] \implies |\exp(x)-1| = 1-\exp(x) \le 1-(1+x) = -x \le \delta \le \frac{\delta}{1-\delta}.
\]
Recalling Remark 3.10, to prove the continuity of exp at 0 it suffices to consider ε ∈ (0, 1]. So, let ε ∈ (0, 1] and choose δ = ε/(2 + ε). With this choice we see that δ < 1 and that
\[
x \in (-\delta,\delta) \implies |\exp(x)-\exp(0)| \le \frac{\delta}{1-\delta} = \frac{\varepsilon}{2} < \varepsilon.
\]
We can thus write the exponential function as a composition, namely exp = µ ◦ exp ◦ τ, with τ : R → R and µ : R → R given by τ(x) = x − x₀ and µ(y) = exp(x₀)·y (indeed, exp(x) = exp(x₀) exp(x − x₀) by (3.5)). Note that the functions τ and µ are continuous. In particular, τ is continuous at x₀, and exp is continuous at 0 = τ(x₀). It follows from Proposition 3.20 that exp is continuous at x₀.
2. The function exp is strictly increasing. For all real numbers x < y it follows from (3.6) that exp(y − x) ≥ 1 + (y − x) > 1, therefore exp(y) = exp(x) exp(y − x) > exp(x).
3. The function exp : R → R>0 is bijective. The exponential function is strictly increasing and, therefore, injective. To show surjectivity, choose a ∈ R>0 arbitrarily. If we set x₀ = −a^{−1} and x₁ = a, it follows from (3.8) that exp(x₀) < a < exp(x₁).
Since exp is continuous on all of R, it follows from the Intermediate Value Theorem 3.29 that there exists x ∈ [x₀, x₁] such that exp(x) = a. This shows the assertion and finishes the proof.
\[
\log(1) = 0, \tag{3.9}
\]
\[
\log(a^{-1}) = -\log(a) \quad \text{for all } a \in \mathbb{R}_{>0}, \tag{3.10}
\]
\[
\log(ab) = \log(a) + \log(b) \quad \text{for all } a, b \in \mathbb{R}_{>0}. \tag{3.11}
\]
Proof. This follows directly from Theorem 3.54 and the Inverse Function Theorem 3.34. The
equations (3.9), (3.10), and (3.11) follow from the corresponding properties of the exponential,
choosing x = log a and y = log b.
Figure 3.2: The graphs of the exponential function and the logarithm. The auxiliary lines
show exp(x) ≥ x + 1 and log(x) ≤ x − 1.
3.57. — The logarithm function defined here is also called the natural logarithm to distinguish it from the logarithm with another base a > 1, typically a = 10 or a = 2. Let a > 1 be a real number. We can define the logarithm log_a : R>0 → R in base a via
\[
\log_a(x) = \frac{\log x}{\log a} \qquad \forall\, x \in \mathbb{R}_{>0}.
\]
Verify that log10 (10n ) = n holds for all n ∈ Z. However, we will not use this definition, not
even for a = 10 or a = 2, and log(x) will always denote the natural logarithm of x ∈ R>0 to
base e.
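The base-change formula is easy to verify numerically (an illustration only; the helper name `log_base` is ours, not the notes'):

```python
import math

# Logarithm in base a, defined via natural logarithms as in Paragraph 3.57.
def log_base(a, x):
    return math.log(x) / math.log(a)

# Verify log_10(10^n) = n for several integers n, as the text asks.
for n in range(-3, 4):
    print(n, log_base(10.0, 10.0**n))
```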
3.58. — We can use the logarithm and exponential mapping to define more general powers. For a positive number a > 0 and arbitrary exponents x ∈ R, we write
\[
a^x = \exp(x \log(a)).
\]
Similarly, for x > 0 and a ∈ R, we define the power function
\[
x^a = \exp(a \log(x)).
\]
Exercise 3.59. — Show that for x ∈ Q and a > 0 this definition agrees with the definition of rational powers from Example 3.35. Furthermore, check the calculation rules
\[
a^{x+y} = a^x a^y, \qquad (a^x)^y = a^{xy}, \qquad (ab)^x = a^x b^x \qquad \forall\, a, b > 0,\ x, y \in \mathbb{R}.
\]
Exercise 3.60. — Let a > 0 be a positive number. Show that there exists a real number
Ca > 0 such that log(x) ≤ Ca xa holds for all x > 0.
Exercise 3.61. — Show that for all real numbers x ≥ −1 and p ≥ 1, the continuous Bernoulli inequality
\[
(1+x)^p \ge 1 + px
\]
holds.
Exercise 3.62. — In this exercise, we consider another continuity term (compare with
Exercise 3.48).
2. Given α ∈ (0, 1], consider the function f : [0, ∞) → R given by f (x) = xα . Show that
f is α-Hölder continuous.
Hint: use the inequality
Applet 3.63 (Slide rule). Using the slide rule, calculate some products and quotients. Recall
the properties of the logarithm to see how to do these calculations. Before the introduction of
electronic calculators, these mechanical aids were widely used.
holds for all δ > 0. Whenever (3.13) holds, we say that x0 is an accumulation point of D.
Note that when x0 ∈ D, (3.13) is always satisfied.
We remark that (3.13) implies that there exists a sequence of points in D converging to
x0 .
In general, the limit of f(x) as x → x₀ may not exist. But if a limit exists, then it is uniquely determined. Therefore, from now on, we speak of the limit and write
lim f (x) = L
x→x0
if the limit of f (x) as x → x0 exists and is equal to L. Informally, this means that the function
values of f are arbitrarily close to L if x ∈ D is close to x0 .
3.66. — The limit of a function satisfies properties analogous to Proposition 2.96. If f and g are functions on D such that the limits
\[
L = \lim_{\substack{x\to x_0 \\ x\ne x_0}} f(x) \tag{3.15}
\]
in place of (3.14). Note that, in the situation above, the function f̃ : D → R defined by
\[
\tilde f(x) = \begin{cases} f(x) & \text{if } x \in D \setminus \{x_0\} \\ L & \text{if } x = x_0 \end{cases} \tag{3.16}
\]
is continuous at the point x₀. In other words, we have removed the discontinuity by replacing the value of the function f at the location x₀ with L.
3.69. — Suppose instead that x₀ ∉ D but the limit in (3.15) exists. In this situation, we call the function f̃ defined in (3.16) the continuous extension of f to D ∪ {x₀}.
Arguing exactly as in the proof of Theorem 3.26, we also have the validity of the following:
Let f : D → R. Then L = lim_{x→x̄} f(x) if and only if, for every sequence (x_n)_{n=0}^∞ ⊂ D converging to x̄, lim_{n→∞} f(x_n) = L also holds.
Finally, we state a result on the limit of the composition with a continuous function.
3.72. — We can introduce conventions for improper limits of functions, as we have already
done for sequences.
We say that lim_{x→x₀} f(x) = ∞ if for every real number M > 0 there exists δ > 0 with the property that
\[
x \in D \cap (x_0-\delta, x_0+\delta) \implies f(x) \ge M.
\]
Analogously, lim_{x→x₀} f(x) = −∞ if for every real number M > 0 there exists δ > 0 with the property that
\[
x \in D \cap (x_0-\delta, x_0+\delta) \implies f(x) \le -M.
\]
\[
L = \lim_{\substack{x\to x_0 \\ x\ge x_0}} f(x). \tag{3.17}
\]
\[
L = \lim_{\substack{x\to x_0 \\ x> x_0}} f(x), \quad \text{or also} \quad L = \lim_{x\to x_0^+} f(x).
\]
Analogous to limits from the right, we can also define limits from the left, with the notation
\[
\lim_{\substack{x\to x_0 \\ x\le x_0}} f(x) \qquad \text{and} \qquad \lim_{x\to x_0^-} f(x) = \lim_{\substack{x\to x_0 \\ x< x_0}} f(x).
\]
Finally, we can also allow the symbols +∞ and −∞ for one-sided limits, as in Paragraph 3.72.
Instead, we say that f diverges to ∞ as x → ∞ if for every M > 0 there exists R > 0
such that
x ∈ D ∩ (R, ∞) =⇒ f (x) ≥ M.
If the limit of f as x → ∞ exists, then it is unique and we write
\[
L = \lim_{x\to\infty} f(x).
\]
If f diverges to ∞ as x → ∞, we write
\[
\lim_{x\to\infty} f(x) = \infty.
\]
Of course, also limits at −∞ and/or the case when f diverges to −∞ can be considered, and
the definitions are analogous.
3.76. — Limits as x → ∞ can be transformed into limits from the right as x → 0. Indeed,
if D and f are given as in Definition 3.75, consider the set E and the function g : E → R
defined as
E = {x ∈ R>0 | x−1 ∈ D}, g(x) = f (x−1 ).
Then it holds
\[
\lim_{x\to\infty} f(x) = \lim_{x\to 0^+} g(x),
\]
in the sense that one limit exists if and only if the other limit exists, and in that case they coincide.
If the one-sided limit
\[
\lim_{x\to x_0^+} f(x)
\]
exists and is equal to f(x₀), then we say that f is continuous from the right at x₀. Similarly, we define continuity from the left.
We call x0 ∈ D a jump point if the one-sided limits
3.78. — The following graph represents a function with three points of discontinuity x1 ,
x2 , x3 .
Example 3.79. — The domain of definition for all functions in this example is D = R>0. We want to study the limit as x → 0⁺ of the function f : D → R given by f(x) = xˣ = exp(x log x).
1. We claim that
\[
\lim_{y\to\infty} y \exp(-y) = 0 \tag{3.19}
\]
holds. Indeed, Corollary 3.53 implies that exp(y) ≥ (1 + y/2)² holds for y > 0. This gives
\[
0 \le y\exp(-y) \le \frac{y}{(1+\frac{y}{2})^2} \le \frac{4}{y},
\]
which implies (3.19) because of the sandwich lemma.
2. Next, we claim that lim_{x→0⁺} x log(x) = 0. (3.20) So let ε > 0. Because of (3.19) there exists R > 0 such that |y exp(−y)| < ε for all y > R. Set δ = exp(−R) and consider x ∈ (0, δ). Then y = −log x satisfies y > R due to the strict monotonicity of the logarithm, therefore |x log x| = |exp(−y) y| < ε, which shows (3.20).
Because of Proposition 3.71 and since the exponential mapping is continuous, (3.20) yields
\[
\lim_{x\to 0^+} x^x = \lim_{x\to 0^+} \exp(x\log x) = \exp(0) = 1.
\]

Exercise. — Compute the limits
\[
\lim_{x\to 2} \frac{x^3-x^2-x-2}{x-2}, \qquad \lim_{x\to\infty} \frac{3e^{2x}+e^x+1}{2e^{2x}-1}, \qquad \lim_{x\to\infty} \frac{e^x}{x^a}, \qquad \lim_{x\to\infty} \frac{\log(x)}{x^a}.
\]
In each case, choose a suitable domain on which the given formulas define a function.
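The values of these limits can be guessed numerically before computing them rigorously (an illustration only; the sample exponent a = 3 is our choice, not the notes'):

```python
import math

# First limit: x^3 - x^2 - x - 2 = (x - 2)(x^2 + x + 1), so the quotient tends to 7.
f1 = lambda x: (x**3 - x**2 - x - 2) / (x - 2)
# Second limit: divide numerator and denominator by e^(2x); the quotient tends to 3/2.
f2 = lambda x: (3 * math.exp(2 * x) + math.exp(x) + 1) / (2 * math.exp(2 * x) - 1)

print(f1(2 + 1e-7), f2(30.0))
# Sample exponent a = 3: exp beats every power, log loses to every positive power.
print(math.exp(30.0) / 30.0**3, math.log(30.0) / 30.0**3)
```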
3.5.3 Landau Notation
We now introduce two common notations that relate the asymptotic behaviour of a function to
the asymptotic behaviour of another function – that is, describe relative asymptotic behaviour.
These notations are named after the German-Jewish mathematician Edmund Georg Hermann
Landau (1877 - 1938).
f (x) = O(g(x)) as x → x0
If g(x) ≠ 0 for all x ∈ D \ {x₀} sufficiently close to x₀, then f(x) = O(g(x)) is equivalent to f(x)/g(x) being bounded in a neighbourhood of x₀.
We can also allow for x0 the elements ∞ and −∞ of the extended number line, as discussed
in the next definition. We define only the case at ∞, the definition for −∞ is analogous.
f (x) = O(g(x)) as x → ∞
The advantage of this notation is that we do not need to introduce the name for the upper
bound M . If we are not particularly interested in this constant, then we can concentrate on
the essentials in calculations. In this context, one also speaks of implicit constants.
• It holds
\[
x^2 = O(x) \quad \text{as } x \to 0.
\]
• It holds
\[
\frac{3x^3}{x^3+3} = O(1) \quad \text{as } x \to \infty,
\]
but not 3x³/(x³ + 3) = O(x^α) for α < 0.
As discussed above, the big-O means that f is bounded by a multiple of g. One may also
consider a stronger condition, namely that f is asymptotically negligible with respect to g.
This leads to the following definition.
f (x) = o(g(x)) as x → x0
f (x) = o(g(x)) as x → ∞
For example,
\[
\frac{3x^3}{2x^2+x^{10}} = o(|x|^\alpha) \quad \text{as } x \to 0
\]
holds for every α < 1, but not for α ≥ 1. In fact, for any α < 1 the limit
\[
\lim_{x\to 0} \frac{3|x|^3}{|x|^\alpha(2x^2+x^{10})} = \lim_{x\to 0} \frac{3|x|^{1-\alpha}}{2+x^8} = \frac{3}{2}\lim_{x\to 0}|x|^{1-\alpha}
\]
exists and is equal to 0.
1. x^p = o(x) as x → 0;
2. x = o(x^p) as x → ∞;
3. x^a = o(e^x) as x → ∞;
4. log(x) = o(x^b) as x → ∞.
for every α1 , α2 ∈ R. Formulate and show the analogous statement for big-O.
x → x0 ;
• If f1 (x) = o(g1 (x)) and f2 (x) = O(g2 (x)) as x → x0 , then f1 (x)f2 (x) = o(g1 (x)g2 (x)) as
x → x0 ;
• If f1 (x) = O(g1 (x)) and f2 (x) = O(g2 (x)) as x → x0 , then f1 (x)f2 (x) = O(g1 (x)g2 (x)) as
x → x0 .
Example 3.91. — Let f(x) = x + x³ + 4x⁴ + x⁷ and g(x) = x + 3x²/(1 + x³). Then f(x) = x + o(x²) and g(x) = x + O(x²) as x → 0. In particular, their product satisfies the following:
3.92. — Landau notation is often used as a placeholder, for example, to express that one
term in a sum is increasing or decreasing faster than the others. In an expression of the form
f (x) + o(g(x)) as x → x0
the term o(g(x)) stands for a function h : D → R with the property that h(x) = o(g(x)) as x → x₀.
For example, one often writes
\[
\frac{x^3-7x^2+6x+2}{x^2} = x-7+O\Bigl(\frac{1}{x}\Bigr) = x-7+o(1) = x+O(1) = x+o(x) \qquad \text{as } x \to \infty,
\]
and thus keeps on the right-hand side only those terms that make up the bulk of the expression as x → ∞. It may perhaps come as a surprise that, in the above example, all four formulas could be true or useful. The assertions all follow directly from polynomial division with remainder and, depending on the context, one might want to use the slightly more precise assertion with error −7 + O(1/x) or the coarser assertion with error o(x).
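One can see these error terms numerically (an illustration only): the difference between the quotient and x − 7 is exactly 6/x + 2/x², i.e. an O(1/x) quantity.

```python
# h(x) = (x^3 - 7x^2 + 6x + 2) / x^2 = x - 7 + 6/x + 2/x^2.
def h(x):
    return (x**3 - 7 * x**2 + 6 * x + 2) / x**2

for x in [1e2, 1e3, 1e4]:
    # The O(1/x) error term: x * (h(x) - (x - 7)) should stay close to 6.
    print(x, h(x) - (x - 7), x * (h(x) - (x - 7)))
```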
Exercise 3.96. — Show that the pointwise limit of a sequence of functions is uniquely
determined if it exists.
3.97. — In the following example we show that in general the continuity property is not preserved under pointwise convergence.
Example 3.98. — Let D = [0, 1] and let f_n : D → R be given by f_n(x) = xⁿ. Then the sequence of continuous functions (f_n)_{n=0}^∞ converges pointwise to the discontinuous function f : D → R given by
\[
f(x) = \lim_{n\to\infty} f_n(x) = \lim_{n\to\infty} x^n = \begin{cases} 0 & \text{for } x < 1 \\ 1 & \text{for } x = 1. \end{cases}
\]
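A numerical illustration (not part of the notes) of why this convergence is only pointwise: f_n(x) = xⁿ becomes small at any fixed x < 1, yet its supremum over [0, 1) stays close to 1 for every n.

```python
# Pointwise: at a fixed x < 1 the values x^n tend to 0.
print(0.5**50)                                  # essentially 0

# Not uniform: near x = 1 the function x^50 is still close to 1.
grid = [k / 1000 for k in range(1000)]          # sample points in [0, 1)
print(max(x**50 for x in grid))                 # close to 1
```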
3.100. — The estimate |f_n(x) − f(x)| < ε is equivalent to f(x) − ε < f_n(x) < f(x) + ε.
Thus, uniform convergence can also be described by the graph of a function sequence and its
limit function, as the following figure shows: The function sequence fn converges uniformly
to f if for every ε > 0 the graph of fn lies in the “ε-tube” around f for all sufficiently large n.
Exercise 3.101. — Let D be a set and let (f_n)_{n=0}^∞ be a sequence of functions f_n : D → R. Show that if (f_n)_{n=0}^∞ converges uniformly to a function f, then (f_n)_{n=0}^∞ also converges pointwise to f.
Proof. Let x̄ ∈ D and ε > 0. First, by the uniform convergence of f_n to f, there exists N ∈ N such that |f_N(y) − f(y)| < ε/3 for all y ∈ D. Then, since f_N is continuous at x̄, there exists δ > 0 such that
\[
|x-\bar x| < \delta \implies |f_N(x)-f_N(\bar x)| < \frac{\varepsilon}{3}
\]
holds for all x ∈ D. Then, for all x ∈ D with |x − x̄| < δ it follows that
\[
|f(x)-f(\bar x)| \le |f(x)-f_N(x)| + |f_N(x)-f_N(\bar x)| + |f_N(\bar x)-f(\bar x)| < \frac{\varepsilon}{3}+\frac{\varepsilon}{3}+\frac{\varepsilon}{3} = \varepsilon,
\]
which proves that f is continuous at x̄. Since x̄ is arbitrary in D, the theorem follows.
Exercise 3.103. — Let (f_n)_{n=0}^∞ be a sequence of bounded real-valued functions on a set D, and let f : D → R be another real-valued function on D. Suppose that D = D₁ ∪ D₂ for two subsets such that (f_n|_{D₁})_{n=0}^∞ tends uniformly towards f|_{D₁} and (f_n|_{D₂})_{n=0}^∞ tends uniformly towards f|_{D₂}. Show that (f_n)_{n=0}^∞ tends uniformly towards f.
Exercise 3.104. — Let (f_n)_{n=0}^∞ be a sequence of bounded real-valued functions on a set D. Show that if (f_n)_{n=0}^∞ converges uniformly to a function f : D → R, then f is also bounded. Find also an example in which a sequence (f_n)_{n=0}^∞ of bounded functions converges pointwise to an unbounded function.
Exercise 3.105. — Let D ⊂ R and (f_n)_{n=0}^∞ be a sequence of uniformly continuous real-valued functions on D that tends uniformly to f : D → R. Let (x_n)_{n=0}^∞ be a sequence in D that converges towards x̄ ∈ D. Show that lim_{n→∞} f_n(x_n) = f(x̄).
Exercise 3.106. — Let D ⊂ R and (f_n)_{n=0}^∞ be a sequence of uniformly continuous real-valued functions on D, uniformly converging to f : D → R. Show that f is uniformly continuous.
In this chapter we will consider so-called series, i.e., “infinite sums”, which will lead us to the
definitions of known functions, in particular to the definitions of trigonometric functions.
\[
A = \lim_{n\to\infty} \sum_{k=0}^{n} a_k.
\]
In other words, computing the infinite sum Σ_{k=0}^∞ a_k corresponds to finding (if it exists) the limit of the sequence (s_n)_{n=0}^∞ given by the partial sums
\[
s_n = \sum_{k=0}^{n} a_k.
\]
We call a_n the n-th element or the n-th summand of the series. We call the series Σ_{k=0}^∞ a_k convergent if the limit exists, in which case we call it the value of the series. Otherwise, the series is not convergent.
If the sequence (s_n)_{n=0}^∞ diverges to ∞ (respectively, −∞), then we call the series Σ_{k=0}^∞ a_k divergent to ∞ (respectively, −∞).
Remark 4.2. — Unless otherwise specified, all sequences always consist of real numbers.
Chapter 4.1 Series of Real Numbers
Proof. By assumption, the partial sums s_n = Σ_{k=0}^n a_k for n ∈ N have a limit lim_{n→∞} s_n = S. Hence a_n = s_n − s_{n−1} → S − S = 0 as n → ∞.

Example 4.4 (Geometric Series). — The geometric series Σ_{n=0}^∞ qⁿ for q ∈ R converges exactly when |q| < 1 and, in this case,
\[
\sum_{k=0}^{\infty} q^k = \frac{1}{1-q}.
\]
Indeed, by Proposition 4.3, convergence of the series implies that qⁿ → 0, hence |q| < 1. Conversely, for |q| < 1, one first proves by induction on n ∈ N the validity of the "geometric sum formula"
\[
\sum_{k=0}^{n} q^k = \frac{1-q^{n+1}}{1-q} \qquad \forall\, n \in \mathbb{N},\ q \ne 1,
\]
and then lets n → ∞, using that q^{n+1} → 0.
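A quick check of the geometric sum formula and the limit (an illustration only):

```python
# Partial sums of the geometric series versus the closed forms.
def geom_partial(q, n):
    return sum(q**k for k in range(n + 1))

q = 0.5
print(geom_partial(q, 10))               # matches (1 - q^11) / (1 - q)
print(geom_partial(q, 60), 1 / (1 - q))  # the partial sums approach 1/(1 - q) = 2
```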
Example 4.5 (Harmonic Series). — The converse of Proposition 4.3 does not hold. For example, the harmonic series Σ_{k=1}^∞ 1/k is divergent. We prove the divergence with a concrete estimate.
Let ℓ ∈ N and consider n = 2^ℓ. Then the partial sum of the harmonic series for n satisfies the estimate
\[
\sum_{k=1}^{2^\ell} \frac{1}{k}
= 1 + \frac12 + \Bigl(\frac13+\frac14\Bigr) + \Bigl(\frac15+\frac16+\frac17+\frac18\Bigr) + \cdots + \Bigl(\frac{1}{2^{\ell-1}+1}+\cdots+\frac{1}{2^{\ell}}\Bigr)
\ge 1 + \frac12 + 2\cdot\frac14 + 4\cdot\frac18 + \cdots + 2^{\ell-1}\cdot\frac{1}{2^{\ell}}
= 1 + \frac{\ell}{2}.
\]
Since ℓ ∈ N is arbitrary we see that the partial sums are not bounded, and therefore, the
harmonic series is divergent.
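The estimate 1 + ℓ/2 can be verified numerically (an illustration only):

```python
# Partial sums of the harmonic series up to 2^l, compared with the lower bound 1 + l/2.
def H(n):
    return sum(1.0 / k for k in range(1, n + 1))

for l in [1, 5, 10, 15]:
    print(l, H(2**l), 1 + l / 2)   # the partial sum always dominates 1 + l/2
```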
\[
\sum_{k=0}^{\infty} (\alpha a_k + \beta b_k) = \alpha \sum_{k=0}^{\infty} a_k + \beta \sum_{k=0}^{\infty} b_k.
\]
\[
\sum_{k=0}^{\infty} a_k = \sum_{k=0}^{N-1} a_k + \sum_{k=N}^{\infty} a_k.
\]
Indeed, for every n ≥ N,
\[
\sum_{k=0}^{n} a_k = \sum_{k=0}^{N-1} a_k + \sum_{k=N}^{n} a_k.
\]
In particular, the partial sums of Σ_{k=N}^∞ a_k converge exactly when the partial sums of Σ_{k=0}^∞ a_k converge. The case of a divergent sequence is completely analogous.
Proof. From a_{n+1} ≥ 0 it follows that s_{n+1} = s_n + a_{n+1} ≥ s_n for all n ∈ N, so the sequence (s_n)_{n=0}^∞ is increasing. If the partial sums {s_n | n ∈ N} are bounded, then they converge according to Theorem 2.108.
Remark 4.9. — If Σ_{k=0}^∞ a_k is a series with non-negative elements, then the sequence of partial sums (s_n)_{n=0}^∞ is bounded if and only if there exists a bounded subsequence (s_{n_k})_{k=0}^∞ (see Remark 2.109).
\[
\sum_{k=0}^{\infty} b_k \text{ convergent} \implies \sum_{k=0}^{\infty} a_k \text{ convergent},
\qquad
\sum_{k=0}^{\infty} a_k \text{ divergent to } \infty \implies \sum_{k=0}^{\infty} b_k \text{ divergent to } \infty.
\]
Proof. From a_k ≤ b_k it follows that Σ_{k=0}^n a_k ≤ Σ_{k=0}^n b_k for all n ∈ N. Thus, according to the
Under the assumptions of the corollary, one calls the series Σ_{k=0}^∞ b_k a majorant of the series Σ_{k=0}^∞ a_k, and the latter is a minorant of the series Σ_{k=0}^∞ b_k. This is why one refers to the above result as the majorant and the minorant criterion.
\[
\sum_{k=0}^{\infty} a_k \text{ converges} \iff \sum_{k=0}^{\infty} 2^k a_{2^k} \text{ converges}.
\]
Proof. (Extra material) Due to the monotonicity of the sequence (a_k)_{k=0}^∞, the following inequalities hold:
\[
1\cdot a_1 \ge a_2 \ge 1\cdot a_2, \qquad 2\cdot a_2 \ge a_3+a_4 \ge 2\cdot a_4,
\]
\[
4\cdot a_4 \ge a_5+a_6+a_7+a_8 \ge 4\cdot a_8, \qquad 8\cdot a_8 \ge a_9+\ldots+a_{16} \ge 8\cdot a_{16},
\]
and so on. Summing, we get
\[
\sum_{k=0}^{n} 2^k a_{2^k} \ge \sum_{k=0}^{n} \bigl(a_{2^k+1}+\ldots+a_{2^{k+1}}\bigr) = \sum_{j=2}^{2^{n+1}} a_j
\]
and
\[
\sum_{j=2}^{2^{n+1}} a_j = \sum_{k=0}^{n} \bigl(a_{2^k+1}+\ldots+a_{2^{k+1}}\bigr) \ge \sum_{k=0}^{n} 2^k a_{2^{k+1}} = \frac12\sum_{k=0}^{n} 2^{k+1} a_{2^{k+1}} = \frac12\sum_{k=1}^{n+1} 2^k a_{2^k}.
\]
Because of Remark 4.9 and Corollary 4.10, it follows that the series Σ_{k=0}^∞ 2^k a_{2^k} has bounded partial sums (and therefore converges) if and only if the series Σ_{k=0}^∞ a_k has bounded partial sums (and therefore converges).
Example 4.14. — Given p ∈ R, the series Σ_{n=1}^∞ 1/n^p converges exactly when p > 1. Indeed, for p ≤ 0 the terms do not tend to zero, so the series diverges by Proposition 4.3. For p > 0 the sequence (1/n^p)_{n=1}^∞ is decreasing, so by Proposition 4.13 the series converges exactly when
\[
\sum_{k=0}^{\infty} 2^k \frac{1}{(2^k)^p} = \sum_{k=0}^{\infty} (2^{1-p})^k
\]
converges. According to Example 4.4, this series converges exactly when 2^{1−p} < 1, that is, p > 1.
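Numerically (an illustration only), the dichotomy at p = 1 is clearly visible; the p = 2 series happens to converge to π²/6, a classical fact not proved at this point in the notes.

```python
import math

# Partial sums of sum 1/n^p: bounded for p > 1, unbounded for p <= 1.
def partial(p, n):
    return sum(1.0 / k**p for k in range(1, n + 1))

print(partial(2.0, 10**4), math.pi**2 / 6)   # p = 2: converges
print(partial(1.0, 10**4))                   # p = 1: grows like log n
```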
Remark 4.15. — The argument in Example 4.14 gives an alternative proof that the har-
monic series diverges (see Example 4.5).
Exercise 4.16. — Given p ∈ R, the series Σ_{n=2}^∞ 1/(n log(n)^p) converges exactly when p > 1. Hint: for p ≤ 0, compare the series above with the harmonic series; for p > 0, use Proposition 4.13 and Example 4.14.
Exercise 4.17. — Is the series Σ_{n=3}^∞ 1/(n log(n) log log(n)) convergent or divergent?
The series Σ_{k=0}^∞ a_k is called absolutely convergent if the series Σ_{k=0}^∞ |a_k| converges. The series Σ_{k=0}^∞ a_k is conditionally convergent if it converges but is not absolutely convergent.
The critical property of a conditionally convergent series is that one can rearrange the terms to obtain any possible limit!
P = {n ∈ N | an ≥ 0}, N = {n ∈ N | an < 0}
depending on the sign of the corresponding a_n, and we enumerate the elements of P and N in increasing order, i.e., P = {p₀, p₁, ...} and N = {n₀, n₁, ...}, with p₀ < p₁ < p₂ < · · · and n₀ < n₁ < n₂ < · · ·.
divergence of the series (4.2). Since an → 0 as n → ∞, the sequence of partial sums (sn )∞
n=0
converges to A, which shows (4.1).
Exercise 4.20. — Complete the details omitted from the proof of Theorem 4.19. Also,
show that for A one can also take one of the symbols −∞ or ∞.
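The proof idea — alternately spending unused positive and negative terms to steer the partial sums toward A — can be sketched for the alternating harmonic series Σ (−1)^{n+1}/n (an illustration only; the target A = 1.5 is our arbitrary choice):

```python
# Greedy rearrangement of the conditionally convergent series 1 - 1/2 + 1/3 - ...
def rearranged_partial(A, steps):
    pos, neg = 1, 2     # next unused odd denominator (positive term) / even one (negative)
    s = 0.0
    for _ in range(steps):
        if s <= A:      # below the target: take the next positive term 1/pos
            s += 1.0 / pos
            pos += 2
        else:           # above the target: take the next negative term -1/neg
            s -= 1.0 / neg
            neg += 2
    return s

print(rearranged_partial(1.5, 10**5))   # close to the chosen target 1.5
```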
\[
\sum_{k=0}^{2n+1} (-1)^k a_k \;\le\; \sum_{k=0}^{\infty} (-1)^k a_k \;\le\; \sum_{k=0}^{2n} (-1)^k a_k. \tag{4.3}
\]
Proof. For n ∈ N, let s_n = Σ_{k=0}^n (−1)^k a_k. Since the sequence (a_n)_{n=0}^∞ is decreasing and non-negative, we have
\[
s_{2n+2} = s_{2n} - a_{2n+1} + a_{2n+2} \le s_{2n}
\qquad\text{and}\qquad
s_{2n+3} = s_{2n+1} + a_{2n+2} - a_{2n+3} \ge s_{2n+1}.
\]
Hence, the sequence (s_{2n})_{n=0}^∞ is decreasing and bounded from below, while the sequence (s_{2n+1})_{n=0}^∞ is increasing and bounded from above, so the limits A = lim_{n→∞} s_{2n+1} and B = lim_{n→∞} s_{2n} exist. However, since s_{2n+2} − s_{2n+1} = a_{2n+2} converges to zero, then A = B and the result follows.
Consider the alternating harmonic series Σ_{n=1}^∞ (−1)^{n+1}/n. Proposition 4.22 guarantees that this series converges, while the series of its absolute values
\[
\sum_{n=1}^{\infty} \frac{|(-1)^{n+1}|}{n} = \sum_{n=1}^{\infty} \frac{1}{n}
\]
diverges to infinity (see Example 4.5). So this series is only conditionally convergent.
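Numerically (an illustration only — the value log 2 is a classical fact, not established here):

```python
import math

# Partial sums of the alternating harmonic series 1 - 1/2 + 1/3 - ...
def alt_partial(n):
    return sum((-1)**(k + 1) / k for k in range(1, n + 1))

print(alt_partial(10**5), math.log(2))             # the series converges (to log 2)
print(sum(1.0 / k for k in range(1, 10**5 + 1)))   # absolute values: divergent harmonic sums
```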
|sn − sm | < ε ∀ n, m ≥ N.
Example 4.25. — To see the divergence of the harmonic series, we can also use the Cauchy criterion. We do this by setting ε = 1/2 and noticing that, for N ∈ N, the following holds:
\[
\sum_{k=N}^{2N} \frac{1}{k} = \underbrace{\frac{1}{N} + \frac{1}{N+1} + \ldots + \frac{1}{2N}}_{N+1 \text{ terms}} \ge \frac{N+1}{2N} > \frac12.
\]
Hence the harmonic series cannot converge, since it does not satisfy the Cauchy criterion in Theorem 4.24.
Proof. Since the series Σ_{n=0}^∞ |a_n| converges, according to the Cauchy criterion in Theorem
We now prove two important criteria to guarantee the absolute convergence of a series. In
their proofs, we will implicitly use the following fact:
Assume that a sequence (x_n)_{n=0}^∞ converges to a limit α. Then, given q > α (respectively, q < α), there exists N ∈ N such that x_n < q (respectively, x_n > q) for every n ≥ N. This fact is a consequence of Proposition 2.97.
\[
\alpha = \limsup_{n\to\infty} \sqrt[n]{|a_n|} \in \mathbb{R} \cup \{\infty\}.
\]
Then
\[
\alpha < 1 \implies \sum_{n=0}^{\infty} a_n \text{ converges absolutely}
\]
and
\[
\alpha > 1 \implies \sum_{n=0}^{\infty} a_n \text{ does not converge}.
\]
such that
\[
x_N = \sup_{k\ge N} \sqrt[k]{|a_k|} < q.
\]
In particular, |a_k| ≤ q^k for all k ≥ N, so the absolute convergence of the series follows from Corollary 4.10 and the convergence of the geometric series in Example 2.134 (recall that q < 1).
If α > 1 holds, since the limsup is an accumulation point (see Theorem 2.116), Proposition 2.90 implies the existence of a subsequence (a_{n_k})_{k=0}^∞ such that lim_{k→∞} \sqrt[n_k]{|a_{n_k}|} > 1. In particular, |a_{n_k}| ≥ 1 for all k sufficiently large, so (a_n)_{n=0}^∞ does not converge to zero and the series does not converge by Proposition 4.3.
4.28. — Let (a_n)_{n=0}^∞ be a sequence of real numbers and α = limsup_{n→∞} \sqrt[n]{|a_n|} as in the root criterion. If α = 1 holds, then no decision about convergence or divergence of the series Σ_{n=0}^∞ a_n can be made using the root criterion:
• For the harmonic series we have \sqrt[n]{1/n} → 1 as n → ∞, and the series diverges by Example 4.5.
• On the other hand, \sqrt[n]{1/n^2} → 1 as n → ∞, but the series Σ_{n=1}^∞ 1/n² converges according to Example 4.11.
Let (a_n)_{n=0}^∞ be a sequence of real numbers with a_n ≠ 0 for all n ∈ N, and assume that
\[
\alpha = \lim_{n\to\infty} \frac{|a_{n+1}|}{|a_n|} \quad \text{exists.}
\]
Then
\[
\alpha < 1 \implies \sum_{n=0}^{\infty} a_n \text{ converges absolutely}
\]
and
\[
\alpha > 1 \implies \sum_{n=0}^{\infty} a_n \text{ does not converge}.
\]
Proof. If α < 1, choose q ∈ (α, 1). Then there exists N ∈ N such that
\[
\frac{|a_{k+1}|}{|a_k|} < q \qquad \forall\, k \ge N,
\]
hence |a_k| ≤ q^{k−N} |a_N| for all k ≥ N, and the absolute convergence follows by comparison with the geometric series. If α > 1, there exists N ∈ N such that
\[
\frac{|a_{k+1}|}{|a_k|} > 1 \qquad \forall\, k \ge N,
\]
therefore
\[
|a_k| = \frac{|a_k|}{|a_{k-1}|}\cdot\frac{|a_{k-1}|}{|a_{k-2}|}\cdot\ldots\cdot\frac{|a_{N+1}|}{|a_N|}\cdot|a_N| > |a_N| \qquad \forall\, k > N.
\]
This implies that the sequence (a_n)_{n≥0} does not converge to zero. Hence, according to Proposition 4.3, Σ_{n=0}^∞ a_n does not converge.
\[
\alpha^{+} = \limsup_{n\to\infty} \frac{|a_{n+1}|}{|a_n|}, \qquad \alpha^{-} = \liminf_{n\to\infty} \frac{|a_{n+1}|}{|a_n|}.
\]
\[
\sum_{n=0}^{\infty} a_n = \sum_{n=0}^{\infty} a_{\varphi(n)}. \tag{4.4}
\]
Proof. Let φ : N → N be a bijection, and fix ε > 0. By the convergence of Σ_{n=0}^∞ |a_n|, there exists N ∈ N such that
\[
\sum_{k=N+1}^{\infty} |a_k| < \frac{\varepsilon}{2}.
\]
Let M be the maximum of the finite set {φ^{−1}(k) | 0 ≤ k ≤ N}. Equivalently, M ∈ N is the smallest number such that
\[
\{a_0, \ldots, a_N\} \subset \{a_{\varphi(0)}, \ldots, a_{\varphi(M)}\}.
\]
Then, for every n ≥ M,
\[
\sum_{\ell=0}^{n} a_{\varphi(\ell)} - \sum_{k=0}^{N} a_k = \sum_{\substack{0\le\ell\le n \\ \varphi(\ell)>N}} a_{\varphi(\ell)}.
\]
Hence
\[
\Bigl|\sum_{\ell=0}^{n} a_{\varphi(\ell)} - \sum_{k=0}^{\infty} a_k\Bigr|
= \Bigl|\sum_{\ell=0}^{n} a_{\varphi(\ell)} - \sum_{k=0}^{N} a_k - \sum_{k=N+1}^{\infty} a_k\Bigr|
= \Bigl|\sum_{\substack{0\le\ell\le n \\ \varphi(\ell)>N}} a_{\varphi(\ell)} - \sum_{k=N+1}^{\infty} a_k\Bigr|
\le \Bigl|\sum_{\substack{0\le\ell\le n \\ \varphi(\ell)>N}} a_{\varphi(\ell)}\Bigr| + \Bigl|\sum_{k=N+1}^{\infty} a_k\Bigr|
\le \sum_{\substack{0\le\ell\le n \\ \varphi(\ell)>N}} |a_{\varphi(\ell)}| + \sum_{k=N+1}^{\infty} |a_k|.
\]
Note now that all terms of the form a_{φ(ℓ)} with φ(ℓ) > N are contained inside the infinite set {a_k | k > N}, therefore
\[
\sum_{\substack{0\le\ell\le n \\ \varphi(\ell)>N}} |a_{\varphi(\ell)}| \le \sum_{k=N+1}^{\infty} |a_k|.
\]
Proof. Consider first a bijection α : N → N² written as α(n) = (α₁(n), α₂(n)) such that {α(k) | 0 ≤ k < n²} = {0, 1, ..., n − 1}² for all n ∈ N. For example, (α(n))_{n=0}^∞ could pass through the set N² as in the following figure.
Then
\[
\sup_{n\in\mathbb{N}} \sum_{k=0}^{n^2-1} |a_{\alpha_1(k)}|\,|b_{\alpha_2(k)}| \le \Bigl(\sum_{\ell=0}^{\infty} |a_\ell|\Bigr)\Bigl(\sum_{m=0}^{\infty} |b_m|\Bigr) < \infty,
\]
so the series Σ_{k=0}^∞ a_{α₁(k)} b_{α₂(k)} is absolutely convergent, and in particular converges.
\[
\Bigl(\sum_{n=0}^{\infty} a_n\Bigr)\Bigl(\sum_{n=0}^{\infty} b_n\Bigr) = \sum_{n=0}^{\infty}\sum_{k=0}^{n} a_{n-k} b_k,
\]
where the series Σ_{n=0}^∞ Σ_{k=0}^n a_{n−k} b_k is absolutely convergent.
Proof. Consider the bijection α : N → N × N, α(n) = (α1 (n), α2 (n)), as in the picture below.
Now, if we write explicitly the terms appearing in the sum and we group them in blocks of
length 1, 2, 3, 4, . . . (geometrically, this corresponds to grouping terms that belong to the same
diagonal in the figure above), we see that
\[
\sum_{n=0}^{\infty} a_{\alpha_1(n)} b_{\alpha_2(n)}
= a_0b_0 + (a_0b_1 + a_1b_0) + (a_2b_0 + a_1b_1 + a_0b_2) + (a_0b_3 + a_1b_2 + a_2b_1 + a_3b_0) + \ldots
= \sum_{n=0}^{\infty}\sum_{k=0}^{n} a_{n-k} b_k.
\]
Finally, the absolute convergence follows from the triangle inequality and Theorem 4.32:
\[
\sum_{n=0}^{\infty}\Bigl|\sum_{k=0}^{n} a_{n-k} b_k\Bigr| \le \sum_{n=0}^{\infty}\sum_{k=0}^{n} |a_{n-k} b_k| = \sum_{n=0}^{\infty} |a_{\alpha_1(n)}|\,|b_{\alpha_2(n)}| < \infty.
\]
Example 4.34. — Let q ∈ R be such that |q| < 1. Then Σ_{n=0}^∞ qⁿ converges absolutely according to Example 4.4. If we apply the Cauchy product to this series with itself, we get
\[
\frac{1}{(1-q)^2} = \Bigl(\sum_{n=0}^{\infty} q^n\Bigr)^2 = \sum_{n=0}^{\infty}\sum_{k=0}^{n} q^{n-k} q^k = \sum_{n=0}^{\infty} (n+1) q^n.
\]
In this way, we obtain an explicit formula for Σ_{n=0}^∞ n qⁿ:
\[
\sum_{n=0}^{\infty} n q^n = \sum_{n=0}^{\infty} (n+1) q^n - \sum_{n=0}^{\infty} q^n = \frac{1}{(1-q)^2} - \frac{1}{1-q} = \frac{q}{(1-q)^2}.
\]
The series Σ_{n=0}^∞ z_n is convergent with limit Z if the two series of real numbers Σ_{n=0}^∞ x_n and Σ_{n=0}^∞ y_n are convergent, with limits A and B, respectively. We say that Σ_{n=0}^∞ z_n converges absolutely if the series Σ_{n=0}^∞ |z_n| converges.
Our next goal is to investigate power series. These are series whose elements are powers of the variable x ∈ R (or z ∈ C, if one wants to consider complex power series) multiplied by a coefficient, i.e., series of the form
\[
\sum_{n=0}^{\infty} a_n x^n,
\]
where (a_n)_{n=0}^∞ is a sequence in R, and x ∈ R. Here, x is called the variable, and the element a_n ∈ R is called the coefficient of xⁿ.
4.38. — A power series is a polynomial if only finitely many of its coefficients are nonzero. The convergence of a power series depends heavily on the coefficients a_n; this question is answered in Theorem 4.41.
Exercise 4.40. — Find, for each R ∈ [0, ∞) ∪ {∞}, a power series Σ_{n=0}^∞ a_n xⁿ with radius of convergence R.
Proof. Let x ∈ R, and write ρ = limsup_{n→∞} \sqrt[n]{|a_n|} as in Definition 4.39. It holds
\[
\limsup_{n\to\infty} \sqrt[n]{|a_n x^n|} = \limsup_{n\to\infty} \sqrt[n]{|a_n|}\,|x| = \rho\,|x|.
\]
According to the root criterion applied to the series Σ_{n=0}^∞ b_n with b_n = a_n xⁿ, the series converges absolutely for all x ∈ R with ρ|x| < 1, and does not converge if ρ|x| > 1 (in particular, if ρ = 0, then the series converges absolutely for all x ∈ R).
(f_n)_{n=0}^∞ converges uniformly to f on (−r, r). In particular, the power series defines a continuous function f : (−R, R) → R.
Proof. To prove the result, we note that Theorem 4.41 applied with x = r implies that Σ_{n=0}^∞ |a_n| rⁿ < ∞ holds. Therefore, for every ε > 0 there exists N ∈ N such that Σ_{k=N+1}^∞ |a_k| r^k < ε. Thus, for all x ∈ (−r, r) and n ≥ N,
\[
|f_n(x) - f(x)| = \Bigl|\sum_{k=0}^{n} a_k x^k - \sum_{k=0}^{\infty} a_k x^k\Bigr| = \Bigl|\sum_{k=n+1}^{\infty} a_k x^k\Bigr| \le \sum_{k=n+1}^{\infty} |a_k|\,|x|^k \le \sum_{k=N+1}^{\infty} |a_k| r^k < \varepsilon.
\]
This proves the uniform convergence on (−r, r) of the sequence of continuous functions (f_n)_{n=0}^∞ to f so, by Theorem 3.102, f is continuous on (−r, r). Since r < R is arbitrary, f : (−R, R) → R is continuous.
Example 4.43. — In general, it is not true that the partial sums f_n(x) = Σ_{k=0}^n a_k x^k tend uniformly to the function f(x) = Σ_{k=0}^∞ a_k x^k on the whole open interval (−R, R). We illustrate this with the geometric series, for which f(x) = 1/(1 − x) on (−1, 1). Indeed, if f_N were uniformly close to f on (−1, 1), say sup_{x∈(−1,1)} |f(x) − f_N(x)| ≤ 1 for some N, we would get
\[
\frac{1}{1-x} < 1 + \sum_{k=0}^{N} x^k \le 1 + \sum_{k=0}^{N} |x|^k \le 2+N \qquad \forall\, x \in (-1,1),
\]
which is impossible, since 1/(1 − x) is unbounded as x → 1⁻.
Exercise 4.45. — Let Σ_{n=0}^∞ a_n xⁿ be a power series with a_n ≠ 0 for all n ∈ N, and assume that the limit β = lim_{n→∞} |a_{n+1}|/|a_n| exists in [0, ∞) ∪ {∞}. Show that the radius of convergence of the power series equals 1/β (with the conventions 1/0 = ∞ and 1/∞ = 0).
Proof. Due to the linearity of the limit and Corollary 4.33, the absolute convergence of the power series Σ_{n=0}^∞ a_n xⁿ and Σ_{n=0}^∞ b_n xⁿ for |x| < R implies that both power series
\[
\sum_{n=0}^{\infty} (a_n+b_n) x^n
\qquad\text{and}\qquad
\Bigl(\sum_{n=0}^{\infty} a_n x^n\Bigr)\Bigl(\sum_{n=0}^{\infty} b_n x^n\Bigr) = \sum_{n=0}^{\infty}\sum_{k=0}^{n} a_{n-k} b_k\, x^n
\]
converge absolutely for |x| < R. Since a power series does not converge for |x| larger than its radius of convergence, this implies that both power series have radii of convergence at least R.
Example 4.47. — If Σ_{n=0}^∞ a_n xⁿ has radius of convergence at least 1, then
\[
\frac{1}{1-x}\sum_{n=0}^{\infty} a_n x^n = \sum_{n=0}^{\infty} (a_0+\ldots+a_n) x^n \qquad \forall\, x \in (-1,1). \tag{4.6}
\]
Indeed, the power series Σ_{n=0}^∞ xⁿ has radius of convergence 1, and for x ∈ (−1, 1) we have Σ_{n=0}^∞ xⁿ = 1/(1 − x), so (4.6) follows from Proposition 4.46.
Exercise 4.48. — Calculate Σ_{n=1}^∞ n 2^{−n}.
Addition and multiplication of power series are the same as in the real case, namely
\[
\sum_{n=0}^{\infty} a_n z^n + \sum_{n=0}^{\infty} b_n z^n = \sum_{n=0}^{\infty} (a_n+b_n) z^n, \tag{4.7}
\]
\[
\Bigl(\sum_{n=0}^{\infty} a_n z^n\Bigr)\Bigl(\sum_{n=0}^{\infty} b_n z^n\Bigr) = \sum_{n=0}^{\infty}\sum_{k=0}^{n} a_{n-k} b_k\, z^n. \tag{4.8}
\]
Also, the radius of convergence is still defined as in Definition 4.39. Several results that are true for real power series hold also in the complex case.
We note that, with the very same proof as in the real case, the analogue of Theorem 4.41
holds:
Also in the complex case, the analog of Theorem 4.42 holds. To prove that, one defines
continuous functions exactly as in Definition 3.9, and uniform convergence as in Definition 3.99
(with the only warning that | · | now denotes the absolute value on C, see Definition 2.46). In
this way, one can prove that Theorem 3.102 also holds in the complex case, namely, uniform
limit of continuous functions is continuous (in the courses of Analysis II or Complex Analysis,
this will be proved in full detail), and we get the following: the sequence of polynomials (f_n)_{n=0}^∞ converges uniformly to f on B(0, r). In particular, the power series defines a continuous function f : B(0, R) → C.
where
0! = 1, k! = 1 · 2 · . . . · k.
It follows directly from the quotient criterion (see Exercise 4.45) that this series has infinite
radius of convergence, so in particular the right-hand side of (4.9) is well-defined for all x ∈ R.
Alternatively, one can note that, given N ∈ N and n ≥ N,
\[
n! \ge \underbrace{n\cdot(n-1)\cdot\ldots\cdot N}_{n-N+1 \text{ terms}} \ge N^{\,n-N+1},
\]
hence
\[
\limsup_{n\to\infty} \sqrt[n]{|a_n|} \le \limsup_{n\to\infty} \frac{1}{N^{\frac{n-N+1}{n}}} = \frac{1}{N}.
\]
Since N ∈ N can be chosen arbitrarily large, this implies that limsup_{n→∞} \sqrt[n]{|a_n|} = 0 and, therefore, R = ∞.
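The very fast convergence of the exponential series is easy to observe numerically (an illustration only; the helper `exp_series` is ours):

```python
import math

# Truncated exponential series sum_{k <= n} x^k / k!.
def exp_series(x, n):
    return sum(x**k / math.factorial(k) for k in range(n + 1))

for x in [1.0, -2.5, 10.0]:
    print(x, exp_series(x, 60), math.exp(x))   # the two columns agree closely
```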
The representation of the exponential mapping as a power series is, in many aspects, more
flexible than the representation as a limit. In addition, as we shall see, its complex version is
related to sine and cosine in a very practical way.
We first show that our new definition of exponential coincides with the one in Defini-
tion 3.52.
Proof. (Extra material) We first observe that, for any n ≥ 0, the identity
\[
\Bigl(1+\frac{x}{n}\Bigr)^n = \sum_{k=0}^{n} \frac{n!}{k!\,(n-k)!}\,\frac{x^k}{n^k} = \sum_{k=0}^{n} \frac{x^k}{k!}\,\frac{1}{n^k}\prod_{l=0}^{k-1}(n-l) = \sum_{k=0}^{n} \frac{x^k}{k!}\prod_{l=0}^{k-1}\Bigl(1-\frac{l}{n}\Bigr)
\]
holds. Now, given x ∈ R and ε > 0, since Σ_{k=0}^∞ |x|^k/k! < ∞ we can find N ∈ N such that
\[
\sum_{k=N+1}^{\infty} \frac{|x|^k}{k!} < \varepsilon.
\]
In particular,
\[
\Bigl|\sum_{k=0}^{N} \frac{x^k}{k!} - \sum_{k=0}^{\infty} \frac{x^k}{k!}\Bigr| \le \Bigl|\sum_{k=N+1}^{\infty} \frac{x^k}{k!}\Bigr| \le \sum_{k=N+1}^{\infty} \frac{|x|^k}{k!} < \varepsilon. \tag{4.10}
\]
Moreover,
\[
\Bigl|\sum_{k=0}^{N} \frac{x^k}{k!} - \sum_{k=0}^{n} \frac{x^k}{k!}\prod_{\ell=0}^{k-1}\Bigl(1-\frac{\ell}{n}\Bigr)\Bigr|
\le \sum_{k=0}^{N} \frac{|x|^k}{k!}\Bigl(1-\prod_{\ell=0}^{k-1}\Bigl(1-\frac{\ell}{n}\Bigr)\Bigr)
+ \sum_{k=N+1}^{n} \frac{|x|^k}{k!}\underbrace{\prod_{\ell=0}^{k-1}\Bigl(1-\frac{\ell}{n}\Bigr)}_{\le 1}
\le \sum_{k=0}^{N} \frac{|x|^k}{k!}\Bigl(1-\prod_{\ell=0}^{k-1}\Bigl(1-\frac{\ell}{n}\Bigr)\Bigr) + \sum_{k=N+1}^{\infty} \frac{|x|^k}{k!},
\]
so, by the identity above,
\[
\Bigl|\sum_{k=0}^{N} \frac{x^k}{k!} - \Bigl(1+\frac{x}{n}\Bigr)^n\Bigr| \le \sum_{k=0}^{N} \frac{|x|^k}{k!}\Bigl(1-\prod_{\ell=0}^{k-1}\Bigl(1-\frac{\ell}{n}\Bigr)\Bigr) + \sum_{k=N+1}^{\infty} \frac{|x|^k}{k!}.
\]
Since
\[
\lim_{n\to\infty}\Bigl(1-\prod_{\ell=0}^{k-1}\Bigl(1-\frac{\ell}{n}\Bigr)\Bigr) = 0 \qquad \forall\, k \in \{0, \ldots, N\},
\]
letting n → ∞ we deduce that
\[
\Bigl|\sum_{k=0}^{N} \frac{x^k}{k!} - \lim_{n\to\infty}\Bigl(1+\frac{x}{n}\Bigr)^n\Bigr| \le \sum_{k=N+1}^{\infty} \frac{|x|^k}{k!} < \varepsilon.
\]
Therefore, combining this bound with (4.10),
\[
\Bigl|\sum_{k=0}^{\infty} \frac{x^k}{k!} - \lim_{n\to\infty}\Bigl(1+\frac{x}{n}\Bigr)^n\Bigr|
\le \sum_{k=N+1}^{\infty} \frac{|x|^k}{k!} + \Bigl|\sum_{k=0}^{N} \frac{x^k}{k!} - \lim_{n\to\infty}\Bigl(1+\frac{x}{n}\Bigr)^n\Bigr| < 2\varepsilon.
\]
Since ε > 0 is arbitrary, the two expressions coincide.
Figure 4.1: The graph of the exponential mapping, and the graphs of some partial sums of
the exponential series.
for all z ∈ C.
For a positive real number a ∈ R>0 and z ∈ C we write az = exp(z log(a)), and in
particular also ez = exp(z) for all z ∈ C.
Before stating the main properties of the exponential, we recall the binomial formula: given
z, w ∈ C and n ∈ N,
\[
(z+w)^n = \sum_{k=0}^{n} \binom{n}{k} z^k w^{n-k}, \qquad \binom{n}{k} = \frac{n!}{k!\,(n-k)!}. \tag{4.11}
\]
It remains to prove the formula for the absolute value. Since the conjugation C → C is a
continuous function, the following holds:
\[
\overline{e^z} = \overline{\lim_{n\to\infty}\sum_{k=0}^{n} \frac{1}{k!} z^k} = \lim_{n\to\infty}\sum_{k=0}^{n} \overline{\frac{1}{k!} z^k} = \lim_{n\to\infty}\sum_{k=0}^{n} \frac{1}{k!}\,\overline{z}^{\,k} = e^{\overline{z}},
\]
where the equality z k = z k follows from Lemma 2.43(3). Recalling that for a complex number
w it holds |w|2 = ww and w + w = 2 Re(w), we get
Exercise 4.55. — Show that e^z = lim_{n→∞} (1 + z/n)ⁿ holds for all z ∈ C.
Starting from this formula, we define the sine function and the cosine function as
\[
\sin(x) = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!}\, x^{2n+1}
\quad\text{and}\quad
\cos(x) = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n)!}\, x^{2n}, \tag{4.13}
\]
so that e^{ix} = cos(x) + i sin(x) holds.
As for the exponential, the radius of convergence of these series is infinite, and by Theorems 4.50 and 4.51 they define continuous functions. Since $(-x)^{2n+1} = -x^{2n+1}$ and $(-x)^{2n} = x^{2n}$ for every n ∈ N, it follows directly from the definition as power series that the sine function is odd, i.e., sin(−x) = −sin(x), and the cosine function is even, i.e., cos(−x) = cos(x), for all x ∈ R.
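The series (4.13) can be evaluated directly by partial sums; the following hedged sketch (not part of the notes) compares them with the library sine and cosine and checks the parity just discussed:

```python
import math

# Illustration only: evaluate the power series (4.13) by partial sums and
# compare with math.sin / math.cos; also check that sin is odd and cos even.
def sin_series(x, terms=25):
    return sum((-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1)
               for n in range(terms))

def cos_series(x, terms=25):
    return sum((-1) ** n * x ** (2 * n) / math.factorial(2 * n)
               for n in range(terms))

for x in (-2.0, 0.3, 1.0):
    assert abs(sin_series(x) - math.sin(x)) < 1e-12
    assert abs(cos_series(x) - math.cos(x)) < 1e-12
    assert abs(sin_series(-x) + sin_series(x)) < 1e-15  # odd
    assert abs(cos_series(-x) - cos_series(x)) < 1e-15  # even
```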
If we add (respectively subtract) these two identities, we obtain the formulae for cos(x) and
sin(x).
To prove the addition formulas, we multiply eix by eiy and using (4.12) we get
Recalling that $|e^{ix}|^2 = 1$, we also get the circle equation for sine and cosine:
$$\cos^2(x) + \sin^2(x) = 1 \qquad \forall\, x \in \mathbb{R}.$$
Applet 4.58 (Power Series). We consider the first partial sums of the power series defining
exp, sin and cos (respectively sinh, cosh from the next section). By zooming in and out, you
can get a feeling for the quality of the approximations of the various partial sums. In the case
of the trigonometric functions, you can also clearly see in the picture that the power series
form alternating series.
n
Proof. The sequence of real numbers ( xn! )∞ n=2 is monotonically decreasing for all x ∈ (0, 2].
Therefore, from the Leibniz criterion for alternating series (see Proposition 4.22), the following
estimates hold for every x ∈ (0, 2]:
x3 x3 x5 x2 x2 x4
x− ≤ sin(x) ≤ x − + and 1− ≤ cos(x) ≤ 1 − + .
3! 3! 5! 2 2 24
20 1 1
sin(1) ≥ 1 − >√ .
6 2
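These alternating-series bounds can be sanity-checked numerically; the sketch below (an illustration, with x restricted to (0, 2] as in the proof) tests them on a grid of sample points:

```python
import math

# Illustration only: check the Leibniz-criterion bounds for sin and cos
# on sample points of (0, 2].
def sin_bounds_hold(x):
    return x - x**3 / 6 <= math.sin(x) <= x - x**3 / 6 + x**5 / 120

def cos_bounds_hold(x):
    return 1 - x**2 / 2 <= math.cos(x) <= 1 - x**2 / 2 + x**4 / 24

samples = [2 * i / 20 for i in range(1, 21)]  # 0.1, 0.2, ..., 2.0
assert all(sin_bounds_hold(x) and cos_bounds_hold(x) for x in samples)
```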
Therefore, since the sine function is continuous, it follows from the Intermediate Value Theorem 3.29 that there exists a number p ∈ (0, 1) such that $\sin(p) = \frac{1}{\sqrt{2}}$.
Because $\sin^2(p) + \cos^2(p) = 1$ and $\cos(x) \ge 1 - \frac{1}{2}x^2 > 0$ for x ∈ [0, 1], it also follows that
$$\cos(p) = \sqrt{1 - \sin^2(p)} = \frac{1}{\sqrt{2}}.$$
In other words,
$$e^{ip} = \cos(p) + i\sin(p) = \frac{1+i}{\sqrt{2}}.$$
$$e^{i\frac{\pi}{2}} = e^{i2p} = \left(e^{ip}\right)^{2} = \frac{(1+i)^2}{2} = i, \qquad e^{i\pi} = \left(e^{i\frac{\pi}{2}}\right)^{2} = i^2 = -1, \qquad e^{i2\pi} = (-1)^2 = 1.$$
In particular, from the identity cos(π) + i sin(π) = eiπ = −1 we deduce that sin(π) = 0 and
cos(π) = −1.
It remains to show the uniqueness of π as in the theorem. From the estimate
$$\sin(x) \ge x - \frac{x^3}{3!} = x\left(1 - \frac{x^2}{6}\right) > 0 \qquad \text{for } x \in (0, 2]$$
it follows that the sine function has no zeros in (0, 2]. In particular, π ∈ (2, 4). Suppose now, by contradiction, that there exists another value s ∈ (2, 4) satisfying sin(s) = 0, and define
$$r = \begin{cases} \pi - s & \text{if } 2 < s < \pi, \\ s - \pi & \text{if } 2 < \pi < s. \end{cases}$$
Then r ∈ (0, 2) and using (4.14) we get (the sign ± below depends on whether π − s is positive or negative)
$$\sin(r) = \pm\sin(\pi - s) = \pm\big(\underbrace{\sin(\pi)}_{=0}\cos(s) - \cos(\pi)\underbrace{\sin(s)}_{=0}\big) = \pm(0 - 0) = 0.$$
This is a contradiction, since sin never vanishes on (0, 2]. This proves that π ∈ (0, 4) is uniquely determined by the equation sin(π) = 0.
Proof. Rewriting the formulas in Theorem 4.59 in terms of sine and cosine, we see that
4.61. — From Corollary 4.60 it follows that sine and cosine are both periodic functions
with period length 2π. To know the numerical value of sin(x) or cos(x) for a given real number
x, it is sufficient to know the values of the sine on the interval [0, π2 ].
Exercise 4.62. — Show that the zeros of sin : R → R are exactly the points in πZ ⊂ R,
and the zeros of cos : R → R are exactly the points in πZ + π2 . Also, cos(x) = 1 only when
x = 2nπ with n ∈ Z.
for all x, y ∈ R. Use this to show that $\sin : \left[-\frac{\pi}{2}, \frac{\pi}{2}\right] \to [-1, 1]$ is strictly monotonically increasing.
Exercise 4.64. — Show that 3.1 < π < 3.2 holds. Using an electronic tool to calculate
certain rational numbers may be helpful.
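A hedged numerical companion (not a proof, and not part of the notes): since sin > 0 on (0, 2] and sin(π) = 0, a sign change of the sine on [3, 3.2] locates π by bisection, in the spirit of Exercise 4.64:

```python
import math

# Illustration only: bisection for the sign change of sin on [3, 3.2].
# This relies on sin > 0 on (0, 2] and sin(pi) = 0 from the text.
def first_sine_zero(lo=3.0, hi=3.2, steps=60):
    assert math.sin(lo) > 0 > math.sin(hi)  # sign change on [lo, hi]
    for _ in range(steps):
        mid = (lo + hi) / 2
        if math.sin(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Sixty bisection steps shrink the interval by a factor 2⁶⁰, far below floating-point resolution.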
where r is the distance from the origin 0 ∈ C to z, i.e., the absolute value r = |z| of z, and θ
is the angle enclosed between the half-lines R≥0 and zR≥0 .
If z ̸= 0, then the angle θ is uniquely determined, and is called the argument of z and written
as θ = arg(z). The set of complex numbers with absolute value one is thus
Proof. (Extra material) Let r = |z|, and consider the complex number $w = \frac{z}{r}$. Note that $|w| = \frac{|z|}{r} = 1$. We want to prove that there exists a unique θ ∈ [0, 2π) such that $w = e^{i\theta}$.
Assume first that Im(w) ≥ 0 and recall that Re(w) ∈ [−1, 1] (since Re(w)2 + Im(w)2 = 1).
Hence, since cos(0) = 1 and cos(π) = −1, according to the Intermediate Value Theorem 3.29
there exists θ ∈ [0, π] such that Re(w) = cos(θ). Since Im(w) ≥ 0 by assumption and
sin(θ) ≥ 0 (since θ ∈ [0, π]), this implies that
$$\sin(\theta) = \sqrt{1 - \cos^2(\theta)} = \sqrt{1 - \mathrm{Re}(w)^2} = \mathrm{Im}(w),$$
thus $w = e^{i\theta}$.
If Im(w) < 0, then we apply the above argument to −w to find ϑ ∈ (0, π) such that
−w = eiϑ (note that ϑ must be different from 0 and π, since Im(e0 ) = Im(eiπ ) = 0). Recalling
that eiπ = −1, it follows that w = eiπ eiϑ = eiθ with θ = π + ϑ ∈ (π, 2π).
It remains to show the uniqueness of θ. If θ, θ′ ∈ [0, 2π) satisfy $w = e^{i\theta} = e^{i\theta'}$, then $e^{i(\theta - \theta')} = 1$, that is,
$$\sin(\theta - \theta') = 0, \qquad \cos(\theta - \theta') = 1.$$
Note that θ − θ′ ∈ (−2π, 2π). Hence, from the uniqueness of π in Theorem 4.59 and the formula sin(x + π) = −sin(x) (see Corollary 4.60) it follows that θ − θ′ ∈ {−π, 0, π}. But if θ − θ′ ∈ {−π, π} then cos(θ − θ′) = −1, so the only possibility is θ − θ′ = 0, as desired.
Applet 4.67 (Geometric Meaning of Complex Numbers). From the polar coordinate lines drawn, we can see the geometric meaning of the multiplication of complex numbers, as well as of the inverses and roots of a given number.
Exercise 4.68. — Let $w = re^{i\theta}$ be non-zero. Show that the n-th roots of w (namely, the solutions z ∈ C to the equation $z^n = w$) are given by the n numbers
$$\sqrt[n]{r}\; e^{i\left(2\pi\alpha + \frac{\theta}{n}\right)}, \qquad \alpha = 0, \tfrac{1}{n}, \tfrac{2}{n}, \dots, \tfrac{n-1}{n}.$$
In particular, the n-th roots of unity are the numbers $e^{i2\pi\alpha}$ with $\alpha = 0, \tfrac{1}{n}, \tfrac{2}{n}, \dots, \tfrac{n-1}{n}$.
Exercise 4.69. — For all natural numbers n ≥ 2, show the identity $\sum_{k=0}^{n-1} e^{i2\pi\frac{k}{n}} = 0$.
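Both facts can be checked numerically (an illustration only, using hypothetical sample values for the modulus and argument of w):

```python
import cmath

# Illustration only: the n-th roots of unity sum to zero (Exercise 4.69),
# and a fifth root of w computed via the formula of Exercise 4.68.
def roots_of_unity_sum(n):
    return sum(cmath.exp(2j * cmath.pi * k / n) for k in range(n))

# Sample values (assumptions for the demo): w = 2 e^{i 0.7}, alpha = 2/5.
w = 2 * cmath.exp(0.7j)
z = 2 ** (1 / 5) * cmath.exp(1j * (2 * cmath.pi * 2 / 5 + 0.7 / 5))
assert abs(z ** 5 - w) < 1e-12  # z is indeed a fifth root of w
```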
4.70. — The tangent function and the cotangent function are given by
$$\tan(x) = \frac{\sin(x)}{\cos(x)} \qquad\text{and}\qquad \cot(x) = \frac{\cos(x)}{\sin(x)}.$$
The addition formula
$$\tan(x + y) = \frac{\tan(x) + \tan(y)}{1 - \tan(x)\tan(y)}$$
holds where defined. Find and prove an analogous addition formula for the cotangent.
4.72. — The hyperbolic sine and the hyperbolic cosine are the functions given by the power series
$$\sinh(x) = \sum_{k=0}^{\infty} \frac{1}{(2k+1)!}\,x^{2k+1} \qquad\text{and}\qquad \cosh(x) = \sum_{k=0}^{\infty} \frac{1}{(2k)!}\,x^{2k}.$$
It holds
$$\sinh(x) = \frac{e^x - e^{-x}}{2} \qquad\text{and}\qquad \cosh(x) = \frac{e^x + e^{-x}}{2},$$
and so $e^x = \cosh(x) + \sinh(x)$ for all x ∈ R. The hyperbolic tangent and the hyperbolic cotangent are given by
$$\tanh(x) = \frac{\sinh(x)}{\cosh(x)} \qquad\text{and}\qquad \coth(x) = \frac{\cosh(x)}{\sinh(x)},$$
and the hyperbolic cotangent is defined for all x ∈ R \ {0} (since sinh(x) ≠ 0 for x ≠ 0). The functions sinh and tanh are odd, and cosh is even. The addition formulae
20
Exercise 4.73. — Starting from the definitions of hyperbolic sine and hyperbolic cosine,
prove the above formulae.
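The exponential formulas for sinh and cosh, together with the identities just stated, lend themselves to a quick numerical verification (an illustration, not part of the notes):

```python
import math

# Illustration only: the exponential formulas for sinh and cosh,
# e^x = cosh(x) + sinh(x), and cosh^2 - sinh^2 = 1.
def sinh_exp(x):
    return (math.exp(x) - math.exp(-x)) / 2

def cosh_exp(x):
    return (math.exp(x) + math.exp(-x)) / 2

for x in (-1.5, 0.0, 0.8, 2.0):
    assert abs(sinh_exp(x) - math.sinh(x)) < 1e-12
    assert abs(cosh_exp(x) - math.cosh(x)) < 1e-12
    assert abs(cosh_exp(x) + sinh_exp(x) - math.exp(x)) < 1e-12
```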
Differential Calculus
We deal with differential calculus in one variable. This is of fundamental importance for
understanding functions on R.
Chapter 5.1 The Derivative
Therefore
$$\lim_{x\to x_0} f(x) = \lim_{x\to x_0}\big(f(x_0) + f'(x_0)(x - x_0) + o(x - x_0)\big) = f(x_0),$$
hence f is continuous at x₀.
5.3. — An alternative notation for the derivative of f is $\frac{df}{dx}$. If x₀ ∈ D is an accumulation point from the right of D, then f is differentiable from the right at x₀ if the derivative from the right
$$f'_+(x_0) = \lim_{x\to x_0^+} \frac{f(x) - f(x_0)}{x - x_0}$$
exists. Differentiability from the left and the derivative from the left $f'_-(x_0)$ are defined analogously, considering the limit $x \to x_0^-$.
5.4. — An affine function is a function of the form x 7→ sx + r, for real numbers s and r.
The graph of an affine function is a non-vertical line in R2 . The parameter s in the equation
y = sx + r is called the slope of the straight line. If f : D → R is differentiable at a point
a ∈ D, the function x 7→ f (a) + f ′ (a)(x − a) is called affine approximation of f at a.
Example 5.5. — • Constant functions are everywhere differentiable and have the zero
function as their derivative.
• The identity function f(x) = x is differentiable, and its derivative is the constant function 1. Indeed
$$f'(x_0) = \lim_{x\to x_0} \frac{x - x_0}{x - x_0} = 1 \qquad \forall\, x_0 \in \mathbb{R}.$$
Example 5.6. — The exponential function exp : R → R>0 is differentiable and its derivative is again the exponential function. Indeed, for x ∈ R, since $e^{x+h} = e^x e^h$ we get
$$(e^x)' = \lim_{h\to 0} \frac{e^{x+h} - e^x}{h} = e^x \lim_{h\to 0} \frac{e^h - 1}{h} = e^x \lim_{h\to 0} \sum_{k=1}^{\infty} \frac{1}{k!}\,h^{k-1} = e^x \lim_{h\to 0} \sum_{n=0}^{\infty} \frac{1}{(n+1)!}\,h^{n}.$$
Note now that the power series $h \mapsto \sum_{n=0}^{\infty} \frac{1}{(n+1)!}h^n$ has infinite radius of convergence, as it follows for instance from Exercise 4.45. In particular the function $g(h) = \sum_{n=0}^{\infty} \frac{1}{(n+1)!}h^n$ is continuous at 0, hence the limit above equals $g(0) = 1$ and $(e^x)' = e^x$.
More generally, let α be a complex number and let f : R → C be the complex-valued function given by $f(x) = e^{\alpha x}$. The derivative of f can still be defined as the limit of the incremental ratios, namely
$$f'(x) = \lim_{h\to 0} \frac{f(x+h) - f(x)}{h},$$
and one gets $f'(x) = \alpha e^{\alpha x}$.
Similarly, for the function $f(x) = \frac{1}{x}$ defined for x ≠ 0, one computes
$$f'(x) = \lim_{h\to 0} \frac{\frac{1}{x+h} - \frac{1}{x}}{h} = \lim_{h\to 0} \frac{x - (x+h)}{(x+h)xh} = -\lim_{h\to 0} \frac{1}{(x+h)x} = -\frac{1}{\lim_{h\to 0}(x+h)x} = -\frac{1}{x^2}.$$
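The incremental-ratio computation can be illustrated numerically (a sketch, not part of the notes): the difference quotients of f(x) = 1/x at x = 2 approach f′(2) = −1/4 as h shrinks.

```python
# Illustration only: difference quotients of f(x) = 1/x at x = 2
# approach the derivative f'(2) = -1/4 as h -> 0.
def diff_quotient(f, x, h):
    return (f(x + h) - f(x)) / h

f = lambda t: 1 / t
errors = [abs(diff_quotient(f, 2.0, 10 ** -k) - (-0.25)) for k in range(1, 6)]
# The error shrinks with h, consistent with differentiability at x = 2.
assert all(e2 < e1 for e1, e2 in zip(errors, errors[1:]))
```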
for all n ∈ N. If f (n) exists for any n ∈ N, f is called n-times differentiable. If the n-
th derivative f (n) is also continuous, f is called n-times continuously differentiable.
We denote the set of n-times continuously differentiable functions on D by C n (D).
Differently put, C 0 (D) denotes the set of real-valued continuous functions on D, and
C 1 (D) denotes the set of all differentiable functions whose derivative is continuous. We call
such functions continuously differentiable or of class C 1 . Recursively, for n ≥ 1 we define
Proof. We compute using the properties of the limit introduced in Section 3.5.1:
$$\lim_{x\to x_0} \frac{(f+g)(x) - (f+g)(x_0)}{x - x_0} = \lim_{x\to x_0} \left(\frac{f(x) - f(x_0)}{x - x_0} + \frac{g(x) - g(x_0)}{x - x_0}\right) = \lim_{x\to x_0} \frac{f(x) - f(x_0)}{x - x_0} + \lim_{x\to x_0} \frac{g(x) - g(x_0)}{x - x_0} = f'(x_0) + g'(x_0),$$
and
where we used that g is continuous at x₀ to say that $\lim_{x\to x_0} g(x) = g(x_0)$ (recall Remark 5.2).
Proof. For n = 1 this corresponds to Proposition 5.12. The general case follows by induction
on n ≥ 1.
Polynomial functions are differentiable on all of R. It holds $(1)' = 0$ and $(x^n)' = n x^{n-1}$ for all n ≥ 1.
Proof. The cases n = 0 and n = 1 have already been discussed in Example 5.5. We now prove by induction the case n > 1. Assume that, for some n ≥ 1, $(x^n)' = n x^{n-1}$ holds. Then it follows from (5.3) that $x^{n+1} = x \cdot x^n$ is differentiable and
and analogously cos′ (x) = − sin(x). Similarly, sinh′ (x) = cosh(x) and cosh′ (x) = sinh(x).
Proof. Note that one can write $g(y) = g(y_0) + g'(y_0)(y - y_0) + \varepsilon_g(y)(y - y_0)$ with
$$\varepsilon_g(y) = \begin{cases} \dfrac{g(y) - g(y_0)}{y - y_0} - g'(y_0) & \text{if } y \in E \setminus \{y_0\}, \\[4pt] 0 & \text{if } y = y_0. \end{cases}$$
Hence
$$g(f(x)) = g(f(x_0)) + g'(f(x_0))\big[f(x) - f(x_0)\big] + \varepsilon_g(f(x))\big[f(x) - f(x_0)\big],$$
where we used Proposition 3.71 and the continuity of $\varepsilon_g$ at y₀ = f(x₀) to deduce that $\varepsilon_g(f(x)) \to \varepsilon_g(f(x_0))$ as x → x₀.
If we now use the product rule in Proposition 5.12, it follows that $\frac{f}{g} = f \cdot \frac{1}{g}$ is differentiable at x₀, and
$$\left(\frac{f}{g}\right)'(x_0) = \left(f \cdot \frac{1}{g}\right)'(x_0) = f'(x_0)\,\frac{1}{g(x_0)} - f(x_0)\,\frac{g'(x_0)}{g(x_0)^2} = \frac{f'(x_0)\,g(x_0) - f(x_0)\,g'(x_0)}{g(x_0)^2}.$$
obtain
f ′ (x) = exp(g(x))g ′ (x),
Exercise 5.20. — Determine the derivative of the function $x \mapsto \cos\big(\sin^3(\exp(x))\big)$.
Proof. We first show that ȳ is an accumulation point of E \ {ȳ}, which allows us to speak of differentiability at ȳ. In fact, since by assumption x̄ is an accumulation point of D \ {x̄}, there exists a sequence $(x_n)_{n=0}^{\infty}$ in D \ {x̄} with xₙ → x̄ as n → ∞. Since f is continuous and bijective, the sequence $(f(x_n))_{n=0}^{\infty}$ converges to ȳ = f(x̄).
Now, to compute the derivative, let $(y_n)_{n=0}^{\infty}$ be an arbitrary sequence in E \ {ȳ} converging to ȳ. Then $x_n = f^{-1}(y_n)$ tends towards x̄ (since f⁻¹ is continuous by assumption), and the following holds:
$$\lim_{n\to\infty} \frac{f^{-1}(y_n) - f^{-1}(\bar y)}{y_n - \bar y} = \lim_{n\to\infty} \frac{x_n - \bar x}{f(x_n) - f(\bar x)} = \lim_{n\to\infty} \frac{1}{\frac{f(x_n) - f(\bar x)}{x_n - \bar x}} = \frac{1}{f'(\bar x)}.$$
Hence, if we set $g(y) = \frac{f^{-1}(y) - f^{-1}(\bar y)}{y - \bar y}$, this proves that for every sequence yₙ converging to ȳ it holds $g(y_n) \to \frac{1}{f'(\bar x)}$. Recalling Lemma 3.70, this proves that $\lim_{y\to\bar y} g(y) = \frac{1}{f'(\bar x)}$, as desired.
Figure 5.1: An intuitive representation of Theorem 5.21. Mirroring the graph of f and the tangent line at the point (x₀, y₀) around the straight line x = y in R², we get the graph of f⁻¹ and, this is the assertion, the tangent line at (y₀, x₀). A short calculation shows that the reflection of a straight line with slope m around x = y has slope $\frac{1}{m}$.
$$g'(y) = \log'(y) = \frac{1}{\exp(x)} = \frac{1}{\exp(\log(y))} = \frac{1}{y}.$$
For y < 0, since g(y) = log(−y), we apply the chain rule (Theorem 5.16) to get $g'(y) = -\log'(-y) = -\frac{1}{-y} = \frac{1}{y}$.
Example 5.23. — Given x > 0 and α ∈ R, we can compute the derivative of $x^\alpha$ as follows:
$$x^\alpha = \exp(\alpha \log(x)) \implies (x^\alpha)' = \exp'(\alpha \log(x))\,\alpha \log'(x) = \exp(\alpha \log(x))\,\frac{\alpha}{x} = \alpha\, x^{\alpha - 1}.$$
where $f_n$ is a polynomial. Then, using that the exponential function grows faster than any polynomial, prove by induction that
$$\lim_{x\to 0^+} \frac{\psi(x)}{x^k} = 0 \qquad \forall\, k \in \mathbb{N}.$$
In particular,
$$\psi^{(n+1)}(0) = \lim_{x\to 0} \frac{\psi^{(n)}(x) - \psi^{(n)}(0)}{x - 0} = \lim_{x\to 0^+} \frac{\psi(x)\, f_n\!\left(\frac{1}{x}\right) - 0}{x} = \lim_{x\to 0^+} \frac{1}{x}\,\psi(x)\, f_n\!\left(\frac{1}{x}\right) = 0.$$
$$f'_+(x_0) = \lim_{x\to x_0^+} \frac{f(x) - f(x_0)}{x - x_0} \le 0,$$
since, for x close to x₀ to the right of x₀, f(x) − f(x₀) ≤ 0 and x − x₀ > 0. Analogously,
$$f'_-(x_0) = \lim_{x\to x_0^-} \frac{f(x) - f(x_0)}{x - x_0} \ge 0,$$
since now, for x close to x₀ to the left of x₀, f(x) − f(x₀) ≤ 0 and x − x₀ < 0.
Since f is differentiable at x₀, we have $f'(x_0) = f'_+(x_0) = f'_-(x_0)$, therefore f′(x₀) = 0.
1. x0 ∈ I is an endpoint of I;
2. f is not differentiable at x0 ;
In particular, all local extrema of a differentiable function on an open interval are zeros
of the derivative.
Exercise 5.29. — Let f : R → R be the polynomial function f (x) = x3 − x. Find all local
extrema of f . Find all local extrema of the function |f | on [−3, 3].
Proof. According to Theorem 3.42, the minimum and maximum of f exist in [a, b]. That
is, there exist x0 , x1 ∈ [a, b] with f (x0 ) ≤ f (x) ≤ f (x1 ) for all x ∈ [a, b]. According to
Proposition 5.27, the derivative of f must be zero for all extrema in (a, b). So if either
x0 ∈ (a, b) or x1 ∈ (a, b) holds, then we have already found a ξ ∈ (a, b) with f ′ (ξ) = 0.
Instead, if both x0 and x1 are endpoints of the interval, because f (a) = f (b) it follows that f
is constant, hence f ′ (x) = 0 holds for all x ∈ (a, b).
$$f'(\xi) = \frac{f(b) - f(a)}{b - a}.$$
Proof. Consider the auxiliary function
$$g(x) = f(x) - \frac{f(b) - f(a)}{b - a}(x - a).$$
Then g(a) = g(b) = f(a), and since the two functions
$$x \mapsto f(x) \qquad\text{and}\qquad x \mapsto \frac{f(b) - f(a)}{b - a}(x - a)$$
are differentiable in (a, b), it follows from Proposition 5.12 that g is differentiable in (a, b). Thus, according to Rolle's theorem, there exists ξ ∈ (a, b) such that
$$0 = g'(\xi) = f'(\xi) - \frac{f(b) - f(a)}{b - a},$$
as desired.
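A Mean Value Theorem witness can be located numerically in concrete cases; the following sketch (an illustration with a sample function, not the proof) finds ξ for f(x) = x³ on [0, 2] by bisection, using that f′ − slope changes sign there:

```python
# Illustration only: for f(x) = x^3 on [0, 2], find xi with
# f'(xi) = (f(2) - f(0)) / 2 = 4 by bisection on f'(x) - 4 = 3x^2 - 4.
f = lambda x: x ** 3
fprime = lambda x: 3 * x ** 2
a, b = 0.0, 2.0
slope = (f(b) - f(a)) / (b - a)  # average slope, here 4.0
lo, hi = a, b
for _ in range(60):
    mid = (lo + hi) / 2
    if fprime(mid) < slope:
        lo = mid
    else:
        hi = mid
xi = (lo + hi) / 2  # the MVT witness, here 2 / sqrt(3)
```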
5.32. — So, in words, Rolle's theorem states that if a differentiable function on an interval takes the same value at the endpoints, the slope must be zero somewhere between the endpoints. We illustrate this in the following picture on the left.
Instead, according to the Mean Value Theorem, for every differentiable function on an interval, there is at least one point where the slope of the function is exactly the average slope. This can be seen in the picture on the right. One can also see how the proof of the Mean Value Theorem can be traced back to Rolle's theorem: shearing the graph of the function f on the right in such a way that f(a) = f(b) holds afterwards.
Example 5.34. — Let f : [0, 2π] → C be the complex-valued function given by f (t) =
exp(it) = cos(t) + i sin(t). At the endpoints of the interval [0, 2π], f (0) = f (2π) = 1 holds.
However, the derivative of f never takes the value zero because, according to Example 5.5,
f ′ (t) = i exp(it) ̸= 0
for all t ∈ [0, 2π]. Thus, the statements of Rolle’s Theorem and the Mean Value Theorem for
complex-valued functions are false in this generality.
If in addition g ′ (x) ̸= 0 holds for all x ∈ (a, b), then g(a) ̸= g(b) and
Thus, according to Rolle’s Theorem 5.30, there exists ξ ∈ (a, b) such that
This proves (5.5). If in addition g ′ (x) ̸= 0 for all x ∈ (a, b), then it follows from Rolle’s
theorem that g(b) ̸= g(a) holds. After dividing (5.5) by g ′ (ξ)(g(b) − g(a)), we get the second
assertion of the theorem.
5.36. — Just like for the Mean Value Theorem 5.31, Cauchy’s Mean Value Theorem
has a geometrical interpretation, only this time you have to look in the two-dimensional
plane. There, Cauchy’s mean value theorem states, under the assumptions made, that the
curve t 7→ (f (t), g(t)) has a tangent that is parallel to the straight line through the points
(f (a), g(a)), (f (b), g(b)).
3. The limit $L = \lim_{x\to a^+} \frac{f'(x)}{g'(x)}$ exists.
Then $\lim_{x\to a^+} \frac{f(x)}{g(x)}$ also exists and is equal to L.
Proof. By assumption (2), we can extend f and g continuously on [a, b) by setting f(a) = g(a) = 0. Fix ε > 0. According to assumption (3), there exists δ > 0 such that
$$\frac{f'(\xi)}{g'(\xi)} \in (L - \varepsilon, L + \varepsilon) \qquad \forall\, \xi \in (a, a + \delta).$$
Now, given x ∈ (a, a + δ), we apply the Mean Value Theorem 5.35 on the interval [a, x] to find ξₓ ∈ (a, x) such that
$$\frac{f(x)}{g(x)} = \frac{f(x) - f(a)}{g(x) - g(a)} = \frac{f'(\xi_x)}{g'(\xi_x)}.$$
Therefore
$$\frac{f(x)}{g(x)} = \frac{f'(\xi_x)}{g'(\xi_x)} \in (L - \varepsilon, L + \varepsilon) \qquad \forall\, x \in (a, a + \delta).$$
Since ε > 0 is arbitrary, this proves that $\lim_{x\to a^+} \frac{f(x)}{g(x)} = L$, as desired.
Theorem 5.37 is one of several versions of the rule of l'Hôpital. For instance, one can assume that the limits in (2) are both improper, i.e., that $\lim_{x\to a^+} g(x) = \pm\infty$ and $\lim_{x\to a^+} f(x) = \pm\infty$ hold with arbitrary signs. More precisely, the following holds:
3. The limit $L = \lim_{x\to a^+} \frac{f'(x)}{g'(x)}$ exists.
Then $\lim_{x\to a^+} \frac{f(x)}{g(x)}$ also exists and is equal to L.
Proof. (Extra material) Fix ε > 0. According to assumption (3), there exists δ > 0 such that
$$\frac{f'(\xi)}{g'(\xi)} \in (L - \varepsilon, L + \varepsilon) \qquad \forall\, \xi \in (a, a + \delta).$$
Now, given x ∈ (a, a + δ), we apply the Mean Value Theorem 5.35 on the interval [x, a + δ] to find ξₓ ∈ (x, a + δ) such that
$$\frac{f(x) - f(a + \delta)}{g(x) - g(a + \delta)} = \frac{f'(\xi_x)}{g'(\xi_x)}.$$
Therefore
$$\frac{f(x) - f(a + \delta)}{g(x) - g(a + \delta)} = \frac{f'(\xi_x)}{g'(\xi_x)} \in (L - \varepsilon, L + \varepsilon) \qquad \forall\, x \in (a, a + \delta). \tag{5.6}$$
Also, since |f(x)|, |g(x)| → ∞ as x → a⁺,
$$\lim_{x\to a^+} \frac{1 - \frac{g(a + \delta)}{g(x)}}{1 - \frac{f(a + \delta)}{f(x)}} = 1. \tag{5.7}$$
Hence, recalling (5.6) and (5.7), there exists η ∈ (0, δ) such that
$$\frac{f(x)}{g(x)} \in (L - 2\varepsilon, L + 2\varepsilon) \qquad \forall\, x \in (a, a + \eta).$$
Since ε > 0 is arbitrary, this proves that $\lim_{x\to a^+} \frac{f(x)}{g(x)} = L$, as desired.
3. The limit $L = \lim_{x\to\infty} \frac{f'(x)}{g'(x)}$ exists.
Then $\lim_{x\to\infty} \frac{f(x)}{g(x)}$ also exists and is equal to L.
Proof. (Extra material) If $\lim_{x\to\infty} f(x) = \lim_{x\to\infty} g(x) = 0$, apply Theorem 5.37 in the interval $\left(0, \frac{1}{R}\right)$ to the functions $x \mapsto f\!\left(\frac{1}{x}\right)$ and $x \mapsto g\!\left(\frac{1}{x}\right)$.
If $\lim_{x\to\infty} |f(x)| = \lim_{x\to\infty} |g(x)| = \infty$, apply instead Theorem 5.38 in the interval $\left(0, \frac{1}{R}\right)$ to the functions $x \mapsto f\!\left(\frac{1}{x}\right)$ and $x \mapsto g\!\left(\frac{1}{x}\right)$.
(a) $\displaystyle\lim_{x\to 0^+} \frac{\sin(x) - x}{x^2 \sin(x)}$; (b) $\displaystyle\lim_{x\to 0} \frac{e^x - x - 1}{\cos x - 1}$; (c) $\displaystyle\lim_{x\to 2} \frac{x^4 - 4^x}{\sin(\pi x)}$; (d) $\displaystyle\lim_{x\to -\infty} x^3 e^x$.
Exercise 5.42. — Let a < b be real numbers and f : [a, b] → R a continuous function.
Suppose x0 ∈ [a, b] is a point such that f is differentiable on [a, b] \ {x0 } and suppose that the
limit limx→x0 f ′ (x) exists. Show that f is differentiable at x0 and that f ′ is continuous at x0 .
$$f''(x) = \lim_{h\to 0} \frac{f(x + h) - 2f(x) + f(x - h)}{h^2}$$
for all x ∈ I. Using the sign function x ↦ sgn(x), verify that the existence of the above limit does not imply twice differentiability.
Hint: Apply l'Hôpital's Rule twice.
$$f' \ge 0 \iff f \text{ is increasing.}$$
Proof. Suppose f is increasing. Then we can note that f(x + h) − f(x) ≥ 0 for h > 0, and f(x + h) − f(x) ≤ 0 for h < 0. Therefore, in either case $\frac{f(x+h) - f(x)}{h} \ge 0$, and we get
$$f'(x) = \lim_{h\to 0} \frac{f(x + h) - f(x)}{h} \ge 0.$$
To prove the converse implication, assume that f is not increasing. Then there exist two points x₁ < x₂ in I with f(x₂) < f(x₁), and according to the Mean Value Theorem 5.31 there exists ξ ∈ (x₁, x₂) with
$$f'(\xi) = \frac{f(x_2) - f(x_1)}{x_2 - x_1} < 0.$$
So f′ ̸≥ 0 on I.
Remark 5.45. — If f′ > 0, the above argument can be used to show that f is strictly increasing. However, the converse is false: the function f : R → R, x ↦ x³ is strictly increasing, but f′(0) = 0.
Proof. On the one hand, the derivative of a constant function is the zero function.
Conversely, if f is differentiable and f ′ = 0, then f ′ ≥ 0 and −f ′ ≥ 0. Hence, Proposition
5.44 implies that both f and −f are increasing, so f is constant.
$$f\big((1 - t)a + tb\big) \le (1 - t)f(a) + tf(b) \tag{5.8}$$
holds. We say that f is strictly convex if the inequality in (5.8) is strict. A function g : I → R is called (strictly) concave if −g is (strictly) convex.
5.49. — The inequality (5.8) can be understood geometrically as follows: If a < b are
points in the domain of definition of f , then the graph of f in the interval [a, b] lies below
the secant through the points (a, f (a)) and (b, f (b)). Convexity can also be characterized by
means of slopes of secants. Namely, f : I → R is convex if for all x ∈ (a, b) ⊂ I the inequality
holds, and strictly convex if the inequality above is strict. Geometrically, this means that the
slope of the lines through the points (a, f (a)) and (x, f (x)) is smaller than the slope of the
lines through the points (x, f (x)) and (b, f (b)).
23
Exercise 5.50. — Show that the inequality (5.8) for all t ∈ (0, 1) is equivalent to the
inequality (5.9) for all x ∈ (a, b).
Proof. Suppose first that f ′ is increasing. Fix a, b ∈ I with a < b, and consider x ∈ (a, b).
According to the Mean Value Theorem 5.31, there exist ξ ∈ (a, x) and ζ ∈ (x, b) such that
$$\frac{f(b - h) - f(a + h)}{(b - h) - (a + h)} \le \frac{f(b) - f(b - h)}{h}.$$
Combining the two inequalities above, we deduce that for all h > 0 sufficiently small,
$$\frac{f(a + h) - f(a)}{h} \le \frac{f(b) - f(b - h)}{h}. \tag{5.10}$$
Taking the limit as h → 0⁺, we obtain f′(a) ≤ f′(b). Since a < b are arbitrary, this proves that f′ is increasing.
Exercise 5.52. — With the same assumptions as in Proposition 5.51, prove that f is
strictly convex if and only if f ′ is strictly increasing.
Exercise 5.54. — Under the same assumptions as in Corollary 5.53, prove that if f ′′ (x) > 0
for all x ∈ I, then f is strictly convex. Is the converse true?
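The defining inequality (5.8) can be tested numerically for a concrete convex function; the sketch below (an illustration with f = exp, not part of the notes) also shows the inequality failing for a concave function:

```python
import math

# Illustration only: the convexity inequality (5.8),
# f((1-t)a + tb) <= (1-t) f(a) + t f(b), for sample functions.
def convex_inequality_holds(f, a, b, t):
    return f((1 - t) * a + t * b) <= (1 - t) * f(a) + t * f(b)

a, b = -1.0, 2.0
# exp is convex, so (5.8) holds for every t in [0, 1]:
assert all(convex_inequality_holds(math.exp, a, b, i / 10) for i in range(11))
```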
$$f'(x) = \log(x) + x \cdot \frac{1}{x} = \log(x) + 1, \qquad f''(x) = \frac{1}{x} > 0$$
for all x > 0. Furthermore, we already know from Example 3.79 that $\lim_{x\to 0^+} x \log(x) = 0$. Lastly, we note that $\lim_{x\to 0^+} f'(x) = -\infty$, all of which can be seen in the graph of f.
Then, prove that the function x 7→ xα is concave for x ∈ (0, ∞) and conclude the validity of
(5.11).
By Theorem 4.59 and Exercise 4.62, the zeros of cos : R → R are the set { π2 + kπ | k ∈ Z}, and
cos(0) = 1. From the Intermediate Value Theorem 3.29 it follows that sin′ (x) = cos(x) > 0
for all x ∈ (− π2 , π2 ). Thus, by Remark 5.45, the function
is strictly increasing and bijective (recall that $\sin(-\frac{\pi}{2}) = -1$ and $\sin(\frac{\pi}{2}) = 1$). Consequently, the sine function restricted to $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$ has an inverse, which we express as
$$\arcsin : [-1, 1] \to \left[-\tfrac{\pi}{2}, \tfrac{\pi}{2}\right].$$
Remark 5.59. — Since sin″ = −sin, it follows that sin is convex in the interval $\left[-\frac{\pi}{2}, 0\right]$ and concave in $\left[0, \frac{\pi}{2}\right]$.
By Theorem 5.21, for $s = \sin(x)$ with $x \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$,
$$\arcsin'(s) = \frac{1}{\cos(x)} = \frac{1}{\sqrt{1 - \sin^2(x)}} = \frac{1}{\sqrt{1 - s^2}}.$$
5.61. — The discussion above can be done analogously for the cosine. The cosine function is
strictly monotonically decreasing in the interval [0, π] and satisfies cos(0) = 1 and cos(π) = −1.
In particular, the restricted cosine function
is bijective.
Just as for the arcsine, we can apply the differentiation rules for the inverse and get, for s = cos(x) with x ∈ (0, π),
$$\arccos'(s) = \frac{1}{-\sin(x)} = -\frac{1}{\sqrt{1 - \cos^2(x)}} = -\frac{1}{\sqrt{1 - s^2}}.$$
Remark 5.62. — Since cos″ = −cos, it follows that cos is concave in the interval $\left[0, \frac{\pi}{2}\right]$ and convex in $\left[\frac{\pi}{2}, \pi\right]$.
$$\lim_{x\to \frac{\pi}{2}^-} \tan(x) = \lim_{x\to \frac{\pi}{2}^-} \frac{\sin(x)}{\cos(x)} = +\infty \qquad\text{and}\qquad \lim_{x\to -\frac{\pi}{2}^+} \tan(x) = \lim_{x\to -\frac{\pi}{2}^+} \frac{\sin(x)}{\cos(x)} = -\infty.$$
Thus, it follows from the Intermediate Value Theorem 3.29 that the tangent function tan :
(− π2 , π2 ) → R is bijective.
The inverse
$$\arctan : \mathbb{R} \to \left(-\tfrac{\pi}{2}, \tfrac{\pi}{2}\right)$$
is called arctangent. By Theorem 5.21 the arctangent is differentiable, and for $x \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$ and s = tan(x) one gets $\arctan'(s) = \cos^2(x)$. Since
$$s^2 = \frac{\sin^2(x)}{\cos^2(x)} = \frac{\sin^2(x) + \cos^2(x) - \cos^2(x)}{\cos^2(x)} = \frac{1}{\cos^2(x)} - 1,$$
it follows that $1 + s^2 = \frac{1}{\cos^2(x)}$, and therefore
$$\arctan'(s) = \frac{1}{1 + s^2} \qquad \forall\, s \in \mathbb{R}.$$
5.64. — The cotangent and its inverse function behave similarly. The restriction $\cot|_{(0,\pi)} : (0, \pi) \to \mathbb{R}$ is strictly monotonically decreasing and bijective. The inverse
$$\operatorname{arccot} : \mathbb{R} \to (0, \pi)$$
is differentiable, and
$$\operatorname{arccot}'(s) = -\frac{1}{1 + s^2} \qquad \forall\, s \in \mathbb{R}.$$
5.65. — It holds sinh′(x) = cosh(x) > 0 for all x ∈ R. Thus, according to Proposition 5.44, the hyperbolic sine is strictly monotonically increasing. Since $\lim_{x\to\infty} \sinh(x) = \infty$ and $\lim_{x\to-\infty} \sinh(x) = -\infty$, by the Intermediate Value Theorem 3.29 we get that
$$\sinh : \mathbb{R} \to \mathbb{R}$$
is bijective. The inverse
$$\operatorname{arsinh} : \mathbb{R} \to \mathbb{R}$$
is called the inverse hyperbolic sine. According to the theorem on differentiability of the inverse function, arsinh is differentiable and it holds, for x ∈ R and s = sinh(x),
$$\operatorname{arsinh}'(s) = \frac{1}{\cosh(x)} = \frac{1}{\sqrt{1 + \sinh^2(x)}} = \frac{1}{\sqrt{1 + s^2}}.$$
The inverse hyperbolic sine has a closed form, unlike the inverse functions arcsin, arccos, and arctan. In fact, starting from the relation sinh(x) = s we have
$$\frac{e^x - e^{-x}}{2} = s \implies e^{2x} - 2s\,e^x - 1 = 0,$$
and solving this quadratic equation in $e^x$ (whose positive root is $e^x = s + \sqrt{s^2 + 1}$) gives $\operatorname{arsinh}(s) = \log\big(s + \sqrt{s^2 + 1}\big)$.
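The closed form obtained from the quadratic equation in $e^x$ can be checked against the library implementation (an illustration only):

```python
import math

# Illustration only: the closed form arsinh(s) = log(s + sqrt(s^2 + 1)),
# obtained from the positive root of e^{2x} - 2 s e^x - 1 = 0.
def arsinh_closed_form(s):
    return math.log(s + math.sqrt(s * s + 1))

for s in (-3.0, -0.5, 0.0, 1.0, 10.0):
    assert abs(arsinh_closed_form(s) - math.asinh(s)) < 1e-12
    assert abs(math.sinh(arsinh_closed_form(s)) - s) < 1e-9
```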
5.66. — The hyperbolic cosine satisfies cosh′(x) = sinh(x) and cosh″(x) = cosh(x) > 0 for all x ∈ R. In particular, the hyperbolic cosine is strictly convex by Corollary 5.53 and has a global minimum at 0 by Corollary 5.28 (since 0 = cosh′(x) = sinh(x) implies x = 0). For x > 0 we have cosh′(x) > 0, so cosh is strictly monotonically increasing on R≥0. Since cosh(0) = 1 and $\lim_{x\to\infty} \cosh(x) = +\infty$, it follows that cosh : R≥0 → R≥1 is bijective. Its inverse arcosh : R≥1 → R≥0 is differentiable on (1, ∞), and for s = cosh(x) with x > 0,
$$\operatorname{arcosh}'(s) = \frac{1}{\sinh(x)} = \frac{1}{\sqrt{s^2 - 1}}$$
for all s > 1. We leave the proof of the above properties to those interested.
of the strictly monotonically increasing bijection tanh : R → (−1, 1). According to Theorem 5.21, artanh is differentiable and the following holds:
$$\operatorname{artanh}'(s) = \frac{1}{1 - s^2}.$$
Exercise 5.68. — Check all assertions made in Paragraphs 5.65, 5.66, and 5.67.
In this chapter we will take up the idea of Section 1.1 and extend it to the notion of the Riemann integral with the help of the supremum and the infimum, i.e., implicitly, of the completeness axiom.
Interlude: Partitions
Two sets A, B are called disjoint if A ∩ B = ∅. For a collection A of sets, we say
that the sets in A are pairwise disjoint if for all A1 , A2 ∈ A with A1 ̸= A2 it holds
A1 ∩ A2 = ∅.
Let X be a set. A partition of X is a family $\mathcal{P}$ of non-empty pairwise disjoint subsets of X such that
$$X = \bigcup_{P \in \mathcal{P}} P.$$
In other words, the sets $P \in \mathcal{P}$ are non-empty, and each element of X is an element of exactly one $P \in \mathcal{P}$.
For the following discussion, we fix two real numbers a < b, and work with the compact
interval [a, b] ⊂ R.
Chapter 6.1 Step Functions and their Integral
with n ∈ N. The points x0 , . . . , xn ∈ [a, b] are called the division points of the decom-
position.
which we will use implicitly from now on. A decomposition a = y₀ < y₁ < ⋯ < y_m = b is called a refinement of a decomposition a = x₀ < x₁ < ⋯ < xₙ = b if
$$\{x_0, x_1, \dots, x_n\} \subseteq \{y_0, y_1, \dots, y_m\}.$$
The notion of refinement leads to an order relation on the set of all decompositions of [a, b]. Note that any two decompositions of [a, b] always have a common refinement (take the union of the division points).
Figure 6.2: The graph of a step function on the interval [a, b].
Proof. There exist decompositions of [a, b] with respect to which f and g are step functions.
For these decompositions, there exists a common refinement a = x0 < x1 < · · · < xn = b with
respect to which f and g are step functions. Thus, the functions f and g are both constant
on the open intervals (xk−1 , xk ), and consequently so is αf + βg, which means that αf + βg
is a step function with respect to a = x0 < x1 < · · · < xn = b.
Remark 6.6. — Just as in the proof of Proposition 6.4, one can show that the product of
two step functions is again a step function. Also, we note that step functions are bounded,
since they take finitely many values.
For the moment, in (6.1), the individual symbols ∫ and dx have no meaning. Originally, the symbol ∫ stands for an S for "sum", and the symbol dx indicates an "infinitesimal length", i.e. $x_k - x_{k-1}$ for an "infinitesimally fine" decomposition. The notation was introduced by Leibniz (1646-1716).
Figure 6.3: For a non-negative step function f ≥ 0 we interpret (6.1) as the area of the set $\{(x, y) \in \mathbb{R}^2 \mid a \le x \le b,\ 0 \le y \le f(x)\}$, and in general as the signed net area.
6.8. — The equation (6.1) defining the integral is not without problems. A priori, in fact,
the right-hand side depends on the choice of a decomposition of the interval [a, b]. We must
convince ourselves that this is only an apparent dependence. In other words, if a = y0 < · · · <
ym = b is another decomposition of [a, b] with respect to which f is a step function, then
$$\sum_{k=1}^{n} c_k (x_k - x_{k-1}) = \sum_{k=1}^{m} d_k (y_k - y_{k-1}) \tag{6.2}$$
Proof. We have already shown in Proposition 6.4 that αf + βg is a step function. Let a = x₀ < ⋯ < xₙ = b be a decomposition such that the functions f and g (and consequently αf + βg) are constant on the intervals (x_{k-1}, x_k). If cₖ is the value of f and dₖ the value of g on (x_{k-1}, x_k), then αcₖ + βdₖ is the value of αf + βg on (x_{k-1}, x_k). Therefore
$$\int_a^b (\alpha f + \beta g)(x)\,dx = \sum_{k=1}^{n} (\alpha c_k + \beta d_k)(x_k - x_{k-1}) = \alpha \sum_{k=1}^{n} c_k (x_k - x_{k-1}) + \beta \sum_{k=1}^{n} d_k (x_k - x_{k-1}) = \alpha \int_a^b f(x)\,dx + \beta \int_a^b g(x)\,dx,$$
as desired.
Proof. As in the proofs of Proposition 6.4 and Proposition 6.9, we can find a decomposition
a = x0 < · · · < xn = b such that f and g are constant on the intervals (xk−1 , xk ). We again
write ck for the value of f and dk for the value of g on (xk−1 , xk ). Now, because f ≤ g holds,
i.e., f (x) ≤ g(x) for all x ∈ [a, b], we get ck ≤ dk for all k ∈ {1, . . . , n}. Therefore
$$\int_a^b f(x)\,dx = \sum_{k=1}^{n} c_k (x_k - x_{k-1}) \le \sum_{k=1}^{n} d_k (x_k - x_{k-1}) = \int_a^b g(x)\,dx.$$
Exercise 6.11. — Let [a, b], [b, c] be two bounded and closed intervals and let f₁ : [a, b] → R and f₂ : [b, c] → R be step functions. Show that the function
$$f : [a, c] \to \mathbb{R}, \qquad x \mapsto \begin{cases} f_1(x) & \text{if } x \in [a, b), \\ f_2(x) & \text{if } x \in [b, c] \end{cases}$$
is a step function on [a, c]. Then prove that the integral of f is given by
$$\int_a^c f(x)\,dx = \int_a^b f_1(x)\,dx + \int_b^c f_2(x)\,dx.$$
Finally, show that every step function on [a, c] is of the form described above.
Finally, show that every step function on [a, c] is of the form described above.
Note that, if f is bounded, then these sets are non-empty. Indeed, if |f| ≤ M, then ℓ = −M ∈ L(f) and u = M ∈ U(f).
For ℓ, u ∈ SF with ℓ ≤ f ≤ u, Proposition 6.10 implies that
$$\int_a^b \ell\,dx \le \int_a^b u\,dx,$$
therefore s ≤ t for all s ∈ L(f) and t ∈ U(f). In particular, we have the inequality
if f is bounded.
6.14. — We call a the lower (integration) limit and b the upper (integration) limit, and the function f the integrand of the integral $\int_a^b f\,dx$. If f ≥ 0 is Riemann integrable, then we interpret the number $\int_a^b f\,dx$ as the area of the set
Remark 6.15. — Since, for the time being, we only know Riemann integrability and the Riemann integral, we will simply take the liberty of speaking of integrability and of the integral. Note however that, besides the Riemann integration theory, there is another important such theory, called the Lebesgue integral.
In such a case,
$$\int_a^b f\,dx - \int_a^b \ell\,dx < \varepsilon, \qquad \int_a^b u\,dx - \int_a^b f\,dx < \varepsilon.$$
Proof. Let A and B be nonempty subsets of R with the property that a ≤ b holds for all a ∈ A and all b ∈ B. Then sup A ≤ inf B, and equality sup A = inf B holds exactly if for every ε > 0 there is an a ∈ A and a b ∈ B with b − a < ε. This reasoning holds in particular for the sets L(f) and U(f). The implications
6.17. — It is good to know that the Riemann integral is a generalisation of the integral of step functions, and in this sense we can simply speak of the Riemann integral of a step function.
Exercise 6.18. — Let f : [a, b] → R be a step function. Show that f is Riemann integrable
and that the Riemann integral of f is equal to the integral of f as a step function.
Exercise 6.19. — Repeat the proof of Proposition 1.1 and show, in the language of this section, that f : [0, 1] → R, x ↦ x² is Riemann integrable with $\int_0^1 x^2\,dx = \frac{1}{3}$. Also, $L(f) = \left(-\infty, \frac{1}{3}\right)$ and $U(f) = \left(\frac{1}{3}, \infty\right)$.
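As a numerical companion to this exercise (an illustration only), the integrals of the natural lower and upper step functions on a uniform decomposition squeeze the value 1/3:

```python
# Illustration only: integrals of lower and upper step functions for
# f(x) = x^2 on [0, 1], constant on [k/n, (k+1)/n) with the inf/sup of x^2.
def lower_upper_sums(n):
    lower = sum((k / n) ** 2 for k in range(n)) / n
    upper = sum(((k + 1) / n) ** 2 for k in range(n)) / n
    return lower, upper

lo, up = lower_upper_sums(10 ** 4)
assert lo <= 1 / 3 <= up   # the integral 1/3 is squeezed in between
assert up - lo < 1e-3      # the gap is 1/n, arbitrarily small
```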
Example 6.20. — Not all functions are Riemann integrable, as the following example
shows. Consider f : [0, 1] → R defined as
$$f(x) = \begin{cases} 1 & x \in \mathbb{Q}, \\ 0 & x \notin \mathbb{Q}. \end{cases}$$
Since Q is dense in [0, 1], every step function u ≥ f satisfies u ≥ 1 on the interior of each interval of a corresponding decomposition, hence $\int_0^1 u\,dx \ge 1$ using telescopic sums. Thus, the upper integral of f is given by 1, since the step function with constant value 1 has integral 1 and u was arbitrary.
Similarly, one shows that the lower integral of f is given by 0. Thus, f is not Riemann integrable.
Proof. Given ε > 0, thanks to Proposition 6.16 we can find step functions ℓ₁, ℓ₂, u₁, u₂ such that
$$\ell_1 \le f \le u_1, \qquad \ell_2 \le g \le u_2, \qquad \int_a^b (u_1 - \ell_1)\,dx < \varepsilon, \qquad \int_a^b (u_2 - \ell_2)\,dx < \varepsilon,$$
$$\int_a^b f\,dx - \int_a^b \ell_1\,dx < \varepsilon, \qquad \int_a^b g\,dx - \int_a^b \ell_2\,dx < \varepsilon.$$
Assume first that α, β ≥ 0. Then $\alpha\ell_1 + \beta\ell_2 \le \alpha f + \beta g \le \alpha u_1 + \beta u_2$ and
$$\int_a^b \big((\alpha u_1 + \beta u_2) - (\alpha \ell_1 + \beta \ell_2)\big)\,dx = \alpha \int_a^b (u_1 - \ell_1)\,dx + \beta \int_a^b (u_2 - \ell_2)\,dx < (\alpha + \beta)\varepsilon.$$
Since ε is arbitrary, this shows that αf + βg is integrable. Also, by the triangle inequality and Proposition 6.9 applied to ℓ1 and ℓ2, we get
| ∫_a^b (αf + βg) dx − α ∫_a^b f dx − β ∫_a^b g dx |
  ≤ | ∫_a^b (αf + βg) dx − ∫_a^b (αℓ1 + βℓ2) dx |
  + | ∫_a^b (αℓ1 + βℓ2) dx − α ∫_a^b ℓ1 dx − β ∫_a^b ℓ2 dx |   (this term equals 0)
  + α | ∫_a^b ℓ1 dx − ∫_a^b f dx | + β | ∫_a^b ℓ2 dx − ∫_a^b g dx |
  ≤ (α + β)ε + αε + βε = 2(α + β)ε,
which implies, again from the arbitrariness of ε, that ∫_a^b (αf + βg) dx = α ∫_a^b f dx + β ∫_a^b g dx.
The case when α or β is negative is analogous, but one needs to reverse some inequalities.
For instance, if α ≥ 0 but β < 0 then
αℓ1 + βu2 ≤ αf + βg ≤ αu1 + βℓ2
and
∫_a^b ((αu1 + βℓ2) − (αℓ1 + βu2)) dx = α ∫_a^b (u1 − ℓ1) dx + β ∫_a^b (ℓ2 − u2) dx < (α + |β|)ε.
Since ε is arbitrary, this shows that αf + βg is integrable, and analogously one proves that ∫_a^b (αf + βg) dx = α ∫_a^b f dx + β ∫_a^b g dx.
Proof. For any step function u : [a, b] → R, if u ≤ f then u ≤ g. This implies that L(f ) ⊆
L(g), and therefore
∫_a^b f dx = sup L(f) ≤ sup L(g) = ∫_a^b g dx,
as desired.
f⁺(x) = max{0, f(x)},   f⁻(x) = −min{0, f(x)},   |f|(x) = |f(x)|   ∀ x ∈ [a, b].
The function f⁺ is the positive part, f⁻ is the negative part, and |f| is the absolute value of the function f. One can check that
f = f⁺ − f⁻,   |f| = f⁺ + f⁻,   f⁺ = (|f| + f)/2,   f⁻ = (|f| − f)/2.
In addition,
f ≤ g ⟹ f⁺ ≤ g⁺   and   f ≤ g ⟹ f⁻ ≥ g⁻.
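These pointwise identities are elementary to verify by distinguishing the signs; a quick numerical sketch (the sample values are chosen arbitrarily):

```python
# Check f = f+ - f-, |f| = f+ + f-, f+ = (|f|+f)/2, f- = (|f|-f)/2
# at a few sample values y = f(x).
def pos_neg_parts(y):
    return max(0.0, y), -min(0.0, y)  # (f+(x), f-(x))

for y in [-2.5, -1.0, 0.0, 0.7, 3.0]:
    p, m = pos_neg_parts(y)
    assert y == p - m and abs(y) == p + m
    assert p == (abs(y) + y) / 2 and m == (abs(y) - y) / 2
print("identities hold at all sample points")
```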
Proof. We start by showing that f⁺ is Riemann integrable. Let ε > 0. Since f is integrable, there exist step functions ℓ and u with the property
ℓ ≤ f ≤ u   and   ∫_a^b (u − ℓ) dx < ε.
The functions ℓ⁺ and u⁺ are also step functions, and ℓ⁺ ≤ f⁺ ≤ u⁺ holds. Since u − ℓ is non-negative, u − ℓ = (u − ℓ)⁺ holds. Also, considering all possible cases (i.e., u(x) ≥ ℓ(x) ≥ 0, u(x) ≥ 0 > ℓ(x), or 0 > u(x) ≥ ℓ(x)), one checks that u⁺ − ℓ⁺ ≤ (u − ℓ)⁺. Therefore
∫_a^b (u⁺ − ℓ⁺) dx ≤ ∫_a^b (u − ℓ)⁺ dx = ∫_a^b (u − ℓ) dx < ε
where we used Theorem 6.21, as well as ∫_a^b f⁺ dx ≥ 0 and ∫_a^b f⁻ dx ≥ 0.
Exercise 6.26. — Let a < b < c be real numbers. Show that a function f : [a, c] → R is integrable exactly when f|[a,b] and f|[b,c] are integrable, and that in this case
∫_a^c f dx = ∫_a^b f|[a,b] dx + ∫_b^c f|[b,c] dx.
Exercise 6.27. — Let f : [a, b] → R be integrable, and λ > 0 be a real number. Let g : [λa, λb] → R be the function given by g(x) = f(λ⁻¹x). Show that g is integrable, and that
λ ∫_a^b f dx = ∫_{λa}^{λb} g dx.
Exercise 6.28. — Let f : [a, b] → R be an integrable function. Show that the function F : [a, b] → R given by
F(x) = ∫_a^x f(t) dt
is continuous.
Exercise 6.29. — Let C be the space of continuous functions on [a, b], and let I : C → R be the integration map
I(f) = ∫_a^b f dx.
Show that the function I is continuous, in the following sense: for all ε > 0 there exists a δ > 0 such that
|f(x) − g(x)| < δ ∀ x ∈ [a, b]   ⟹   |I(f) − I(g)| ≤ ε.
Exercise 6.30. — Let f : [0, 1] → R be an integrable function and ε > 0. Show that there exists a continuous function g : [0, 1] → R such that
∫_0^1 |f(x) − g(x)| dx < ε. (6.3)
Proof. Without loss of generality, f : [a, b] → R is increasing – if not, replace f with −f and apply Proposition 6.21. We want to apply Proposition 6.16, that is, for a given ε > 0 we want to find two step functions ℓ, u ∈ SF such that ℓ ≤ f ≤ u and ∫_a^b (u − ℓ) dx < ε.
We construct ℓ, u using a natural number n ∈ N which we will specify later, and the equidistant decomposition
a = x0 < x1 < . . . < xn = b,   xk = a + k (b − a)/n.
Since f is increasing, ℓ ≤ f ≤ u holds. Indeed, for x ∈ [a, b] either x = b, where ℓ(x) = f (x), or
there is a k ∈ {1, . . . , n} with x ∈ [xk−1 , xk ). In the latter case we get ℓ(x) = f (xk−1 ) ≤ f (x),
and thus ℓ ≤ f holds. An analogous argument yields f ≤ u.
Recalling that xn = b and x0 = a, this yields
∫_a^b (u − ℓ) dx = Σ_{k=1}^n (f(xk) − f(xk−1))(xk − xk−1) = (b − a)/n Σ_{k=1}^n (f(xk) − f(xk−1))
  = (b − a)/n (f(xn) − f(xn−1) + f(xn−1) − f(xn−2) + . . . + f(x1) − f(x0))
  = (b − a)/n (f(b) − f(a)).
Following Archimedes' principle, we can now choose n ∈ N such that ∫_a^b (u − ℓ) dx < ε. Thus, it follows from Proposition 6.16 that f is Riemann integrable.
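The telescoping estimate in the proof can be observed numerically; the test function exp is our own choice, not from the notes:

```python
import math

# For increasing f on [a, b] with n equal subintervals, the upper minus
# lower sum telescopes to exactly (b - a) * (f(b) - f(a)) / n.
def darboux_gap(f, a, b, n):
    xs = [a + k * (b - a) / n for k in range(n + 1)]
    lower = sum(f(xs[k - 1]) * (xs[k] - xs[k - 1]) for k in range(1, n + 1))
    upper = sum(f(xs[k]) * (xs[k] - xs[k - 1]) for k in range(1, n + 1))
    return upper - lower

gap = darboux_gap(math.exp, 0.0, 1.0, 1000)
print(gap, (math.e - 1) / 1000)  # agree up to rounding
```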
Exercise 6.32. — Show that the function x ∈ [0, 1] ↦ √(1 − x²) ∈ R is Riemann integrable.
Using the addition property in Exercise 6.26, the statement of Theorem 6.31 can be ex-
tended to functions that are only piecewise monotonic.
of [a, b] such that f |(xk−1 ,xk ) is monotone for all k ∈ {1, . . . , n}.
Proof. (Extra Material) This follows from Theorem 6.31, and Exercises 6.26 and 6.22.
Proof. Let f : [a, b] → R be continuous, and ε > 0. By Theorem 3.46 f is uniformly continuous, so there exists δ > 0 such that |f(x) − f(y)| < ε for all x, y ∈ [a, b] with |x − y| < δ. Since ck ≤ dk, we see that ℓ ≤ f ≤ u. Also, because dk − ck < ε, it follows that
∫_a^b (u − ℓ) dx = Σ_{k=1}^n (dk − ck)(xk − xk−1) < ε Σ_{k=1}^n (xk − xk−1) = ε(b − a).
Again by the addition property in Exercise 6.26, the statement of Theorem 6.35 can be
extended to functions that are only piecewise continuous.
of [a, b] such that f|(xk−1, xk) is continuous for all k ∈ {1, . . . , n}, and both limits
lim_{x→xk−1⁺} f(x) and lim_{x→xk⁻} f(x) exist. In other words, each function f|(xk−1, xk) can
be extended to a continuous function on [xk−1 , xk ].
Proof. (Extra Material) This follows from Theorem 6.35 applied to the continuous extension
of f |(xk−1 ,xk ) on [xk−1 , xk ], and Exercises 6.26 and 6.22.
6.38. — Most “common” functions are piecewise continuous or piecewise monotone, and in
particular integrable according to Theorem 6.31 or according to Theorem 6.35. We note that
there exist functions that are continuous on their domain of definition but are not monotone
on any open subinterval.
Applet 6.39 (Integrability of a "shaky" function). We see that a continuous but shaky function as in the graph shown is also Riemann integrable. We also note that GeoGebra sometimes has problems with the function used, and some of the displayed lower or upper sums are not displayed and calculated correctly. Regardless of this, we have proven the Riemann integrability, so we should not worry about some computational errors in GeoGebra.
hold?
One can show that the pointwise limit of integrable functions need not be integrable. More importantly, as the following example shows, even if a sequence of integrable functions (fn)n∈N converges pointwise to an integrable function f, the limit of the integrals does not necessarily coincide with the integral of the limit.
for x ∈ [0, 1] and n ∈ N. Then fn is continuous and, in particular, integrable. Also, its integral is equal to the area of the triangle in the figure, which is ½ · (1/n) · (n/2) = ¼.
Note that the sequence (fn)_{n=0}^∞ converges pointwise to the constant function f(x) = 0. Indeed, fn(0) = 0 for every n, so fn(0) → 0. Also, for every x > 0 it follows that fn(x) = 0 for every n > 1/x (since this is equivalent to x > 1/n), so again fn(x) → 0.
However, for all n ∈ N the following is true:
∫_0^1 fn(x) dx = ¼ ≠ 0 = ∫_0^1 f(x) dx.
So, the limit of the integrals is not equal to the integral of the limit function, although all
functions fn and f are continuous.
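The figure defining fn is not reproduced here; a piecewise-linear tent of height n/2 supported on [0, 1/n] matches the stated area ¼, and with this (assumed) reconstruction both claims can be checked numerically:

```python
# Hypothetical reconstruction of f_n from the (missing) figure: a tent
# rising to height n/2 at x = 1/(2n) and falling back to 0 at x = 1/n.
def f_n(n, x):
    peak = 1.0 / (2 * n)
    if 0.0 <= x <= peak:
        return n * n * x
    if peak < x <= 1.0 / n:
        return n * n * (1.0 / n - x)
    return 0.0

def midpoint_integral(n, steps=100_000):
    h = 1.0 / steps
    return sum(f_n(n, (i + 0.5) * h) for i in range(steps)) * h

print(midpoint_integral(50))  # stays near 1/4 for every n
print(f_n(50, 0.3))           # = 0: pointwise, f_n(x) -> 0 for fixed x > 0
```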
On the other hand, if the convergence fn → f is uniform, then both questions can be
answered affirmatively:
Since ε > 0 is arbitrary, the Riemann integrability of f follows from Proposition 6.16.
From the monotonicity and triangle inequality for the Riemann integral in Propositions
6.23 and 6.25, we also have
| ∫_a^b f dx − ∫_a^b fn dx | = | ∫_a^b (f − fn) dx | ≤ ∫_a^b |f − fn| dx ≤ ε(b − a)   ∀ n ≥ N.
In this chapter we will examine the connections between the Riemann integral from Chapter 6 and the derivative from Chapter 5. These connections are of fundamental importance for the further theory.
Chapter 7.1 The Fundamental Theorem of Calculus
is a primitive of f .
Moreover, any primitive F : [a, b] → R has this form for some constant C ∈ R.
Noticing that in the last integral t ∈ [x0, x] ⊂ [x0, x0 + δ) ∩ [a, b], it follows from the continuity of f that |f(t) − f(x0)| < ε, therefore
| (F(x) − F(x0))/(x − x0) − f(x0) | < 1/(x − x0) ∫_{x0}^x ε dt = ε.
If instead x < x0, then
| (F(x) − F(x0))/(x − x0) − f(x0) | = | −1/(x − x0) ∫_x^{x0} f(t) dt − f(x0) |
  = | 1/(x0 − x) ∫_x^{x0} f(t) dt − f(x0) |
  = | 1/(x0 − x) ∫_x^{x0} (f(t) − f(x0)) dt |
  ≤ 1/(x0 − x) ∫_x^{x0} |f(t) − f(x0)| dt < ε.
In both cases we conclude that
lim_{x→x0} (F(x) − F(x0))/(x − x0) = f(x0),
7.5. — Illustration 7.1 shows the essential estimate in the proof of Theorem 7.4. The value F(x) − F(x0) can be written as f(x0)(x − x0) plus the area in red, which is smaller than ε(x − x0). Thus (F(x) − F(x0))/(x − x0), up to an error less than ε, is given by f(x0).
Figure 7.1
Theorem 7.4, as stated or in the form of one of the following corollaries, is known as the Fundamental Theorem of Integral and Differential Calculus and goes back to the work of Leibniz, Newton and Barrow, which largely marks the starting point of calculus. Isaac Barrow (1630–1677) was a theologian, but also a physics and mathematics professor at Cambridge. His most famous student was Isaac Newton.
Proof. Apply Corollary 7.6 with f = F ′ and x = b.
Exercise 7.9. — Let f : [a, b] → R have at most finitely many discontinuity points. Show that the function F(x) = ∫_a^x f(t) dt is continuous in [a, b], differentiable at all continuity points of f, and at such points it satisfies F′(x) = f(x).
Exercise 7.10. — Let f : [a, b] → R be continuous. Show that there exists ξ ∈ (a, b) with
∫_a^b f(x) dx = f(ξ)(b − a).
f g′ = (f g)′ − f′ g.
Integrating this identity on [a, b] and using Corollary 7.7 yields
∫_a^b f(x) g′(x) dx = f(b)g(b) − f(a)g(a) − ∫_a^b f′(x) g(x) dx,
as desired.
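A numerical sanity check of the formula, with the arbitrary choices f(x) = x and g(x) = −cos(x) on [0, π]:

```python
import math

# Verify ∫_a^b f g' dx = f(b)g(b) - f(a)g(a) - ∫_a^b f' g dx numerically
# for f(x) = x, g(x) = -cos(x), so g'(x) = sin(x) and f'(x) = 1.
def midpoint(h, a, b, n=100_000):
    w = (b - a) / n
    return sum(h(a + (i + 0.5) * w) for i in range(n)) * w

a, b = 0.0, math.pi
lhs = midpoint(lambda x: x * math.sin(x), a, b)
rhs = (b * -math.cos(b) - a * -math.cos(a)) - midpoint(lambda x: -math.cos(x), a, b)
print(lhs, rhs)  # both ≈ pi
```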
Proof. Fix y0 ∈ J, and define G(y) = ∫_{y0}^y g(t) dt. By Theorem 7.4 we know that G′ = g, so it follows from the Chain Rule (see Theorem 5.16) that (g ∘ f) f′ = (G′ ∘ f) f′ = (G ∘ f)′. Hence
∫_a^b g(f(x)) f′(x) dx = ∫_a^b (G ∘ f)′(x) dx = G(f(b)) − G(f(a)) = ∫_{y0}^{f(b)} g(y) dy − ∫_{y0}^{f(a)} g(y) dy = ∫_{f(a)}^{f(b)} g(y) dy.
Before stating the following result, we recall that if h : [a, b] → R is continuously differentiable with h′ ≠ 0, then it follows from the Intermediate Value Theorem 3.29 that h′ > 0 or h′ < 0. So, Remark 5.45 implies that h is strictly monotone and therefore invertible. Thanks to Theorem 3.34 it follows that h⁻¹ is continuous, and then Theorem 5.21 implies that h⁻¹ is differentiable on (h(a), h(b)).
Proof. By Theorem 7.13 applied with g/f′ in place of g we have
∫_a^b g(f(x)) dx = ∫_a^b (g(f(x))/f′(x)) f′(x) dx = ∫_a^b (g(f(x))/((f′ ∘ f⁻¹)(f(x)))) f′(x) dx = ∫_{f(a)}^{f(b)} g(y)/f′(f⁻¹(y)) dy.
Recalling that 1/(f′ ∘ f⁻¹) = (f⁻¹)′ (see Theorem 5.21), the result follows.
whenever both limits exist and the sum makes sense (so, if the limits are infinite, we
do not admit an expression of the form ∞ − ∞).
If the limit is finite, we say that the improper integral converges. If the limit is ∞ or
−∞, we say that the improper integral is divergent. Otherwise, we call the improper
integral not convergent.
In particular, the above improper integral is convergent exactly when α > 1. In fact
∫_1^b x^{−α} dx = [x^{1−α}/(1−α)]_1^b = b^{1−α}/(1−α) − 1/(1−α)   if α ≠ 1,
∫_1^b x^{−1} dx = [log(x)]_1^b = log(b)   if α = 1,
and
lim_{b→∞} b^{1−α}/(1−α) = ∞ if α < 1,  = 0 if α > 1;   lim_{b→∞} log(b) = ∞.
Proof. Since the function b ∈ [a, ∞) ↦ ∫_a^b f(x) dx is monotonically increasing, it always has a limit as b → ∞, which is equal to the supremum sup{ ∫_a^b f(x) dx | b > a }. This supremum is either finite (in which case the improper integral converges) or it is infinite (in which case the improper integral diverges to ∞).
The function x ∈ R ↦ e^{−x²} is called the Gaussian. Due to Lemma 7.17, to prove that the integral above converges it suffices to find a "majorant function" which defines a convergent improper integral. Since x² ≥ x for x ∈ [1, ∞), it follows that e^{−x²} ≤ e^{−x}, and therefore
∫_1^∞ e^{−x²} dx ≤ ∫_1^∞ e^{−x} dx = lim_{b→∞} [−e^{−x}]_1^b = lim_{b→∞} (e^{−1} − e^{−b}) = e^{−1} < ∞.
This shows the convergence of the second improper integral. Therefore, due to the symmetry of the function, ∫_{−∞}^{−1} e^{−x²} dx = ∫_1^∞ e^{−x²} dx is also convergent. Finally, since e^{−x²} ≤ 1, the integral on [−1, 1] is bounded by ∫_{−1}^1 1 dx = 2.
This proves the convergence of the integral. However, we will not be able to calculate the
exact value of this integral until the second semester.
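Numerically, the full integral is already close to √π, anticipating the value computed in the second semester (the quadrature and cutoff parameters below are our choices):

```python
import math

# Midpoint-rule approximation of ∫_{-b}^{b} exp(-x^2) dx; the tail beyond
# |x| = 10 is smaller than exp(-100) and hence negligible here.
def gauss_integral(b=10.0, n=200_000):
    w = 2 * b / n
    return sum(math.exp(-(-b + (i + 0.5) * w) ** 2) for i in range(n)) * w

print(gauss_integral(), math.sqrt(math.pi))  # agree to many digits
```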
Σ_{n=1}^{N+1} f(n) ≤ ∫_0^{N+1} f(x) dx ≤ Σ_{n=0}^{N} f(n).
In particular
Σ_{n=1}^∞ f(n) ≤ ∫_0^∞ f(x) dx ≤ Σ_{n=0}^∞ f(n),
therefore, the series Σ_{n=1}^∞ f(n) converges exactly when the improper integral ∫_0^∞ f(x) dx converges.
Proof. Due to the monotonicity of f, Theorem 6.31 implies that the function f is locally integrable. We consider the functions ℓ, u : [0, ∞) → R≥0 given by
ℓ(x) = f(⌈x⌉)   and   u(x) = f(⌊x⌋),
where ⌊·⌋ denotes the floor function (i.e., ⌊x⌋ is the largest integer n ≤ x) and ⌈·⌉ the ceiling function (i.e., ⌈x⌉ is the smallest integer n ≥ x). With this choice, ℓ ≤ f ≤ u. Therefore, for all N ∈ N with N > 1, we have
Σ_{n=1}^{N+1} f(n) = ∫_0^{N+1} ℓ(x) dx ≤ ∫_0^{N+1} f(x) dx ≤ ∫_0^{N+1} u(x) dx = Σ_{n=0}^{N} f(n),
which can also be seen in the following picture. The statement of the theorem follows by letting N → ∞.
Recalling (7.3), given ε > 0 there exists N ∈ N such that 1 − ε ≤ ⁿ√(1/n) ≤ 1 + ε for all n ≥ N. This implies that
lim sup_{n→∞} ⁿ√|cn| = lim sup_{n→∞} ⁿ√(|a_{n−1}|/n) ≤ (1 + ε) lim sup_{n→∞} ⁿ√|a_{n−1}| = (1 + ε) lim sup_{n→∞} ( ⁿ⁻¹√|a_{n−1}| )^{(n−1)/n} = (1 + ε)ρ,
and analogously
lim sup_{n→∞} ⁿ√|cn| ≥ (1 − ε) lim sup_{n→∞} ( ⁿ⁻¹√|a_{n−1}| )^{(n−1)/n} = (1 − ε)ρ.
Hence
(1 − ε)ρ ≤ ρ̄ ≤ (1 + ε)ρ.
Since ε > 0 is arbitrary, this implies that ρ̄ = ρ, so the power series F(x) = Σ_{n=0}^∞ cn xⁿ has radius of convergence R.
We now want to prove that F′ = f on (−R, R). To prove that, fix an interval [a, b] ⊂ (−R, R), and consider the polynomial functions fn(t) = Σ_{k=0}^n ak t^k. We note that
∫_a^x fn(t) dt = Σ_{k=0}^n ∫_a^x ak t^k dt = Σ_{k=0}^n ak/(k+1) x^{k+1} − Σ_{k=0}^n ak/(k+1) a^{k+1}   ∀ x ∈ [a, b].
By Theorem 4.42, the sequence of functions (fn)_{n=0}^∞ converges uniformly to f in [a, x] ⊂ [a, b], so it follows from Theorem 6.42 that
∫_a^x f(t) dt = lim_{n→∞} ∫_a^x fn(t) dt = F(x) − F(a)   ∀ x ∈ [a, b].
According to Theorem 7.4, this implies that F ′ (x) = f (x) for all x ∈ [a, b]. Since [a, b] ⊂
(−R, R) is an arbitrary interval, this implies that F ′ = f on (−R, R), as desired.
where the power series on the right has also radius of convergence R.
This implies that G and f have the same radius of convergence and g = G′ = (f − a0 )′ = f ′ .
Therefore, we conclude that R̄ = R and f ′ = g.
Exercise 7.23. — Let f(x) = Σ_{n=0}^∞ an xⁿ be a power series with radius of convergence R > 0. Show that f : (−R, R) → R is smooth, and for each n ∈ N find a representation of the n-th derivative f^(n) by a power series.
radii of convergence Rf , Rg > 0. Let R = min{Rf , Rg } and suppose that f (x) = g(x) for all
x ∈ (−R, R). Show that cn = dn for all n ∈ N, and therefore Rf = Rg .
f(x) = Σ_{n=0}^∞ (α choose n) xⁿ
f′(x) = α f(x)/(1 + x)   ∀ x ∈ (−1, 1). (7.5)
(c) Define g(x) = f(x)/(1 + x)^α and use (8.20) to show that g′ = 0 on (−1, 1). Conclude the validity of (7.4) by noticing that f(0) = 1 = g(0).
Example 7.26. — We have already seen in Example 4.23 that, as a consequence of the
Leibniz criterion in Proposition 4.22, the alternating harmonic series converges. However,
with the results of Chapter 4 we could not determine the value of the series. Now, with the
help of the fundamental theorem of integral and differential calculus, we can show that
Σ_{n=1}^∞ (−1)^{n+1}/n = log(2).
To prove this, using the formula for the geometric series (which has radius of convergence 1),
we see that
(log(1 + x))′ = 1/(1 + x) = 1/(1 − (−x)) = Σ_{n=0}^∞ (−1)ⁿ xⁿ   ∀ x ∈ (−1, 1).
Note now that, given x ∈ [0, 1], the sequence ak = x^k/k is non-negative, decreasing, and converging to zero. Hence it follows from Proposition 4.22 that, for x ∈ [0, 1],
Σ_{k=1}^{2n} (−1)^{k+1} x^k/k ≤ Σ_{k=1}^∞ (−1)^{k+1} x^k/k = log(1 + x) ≤ Σ_{k=1}^{2n+1} (−1)^{k+1} x^k/k   ∀ n ∈ N.
In particular, taking x = 1,
Σ_{k=1}^{2n} (−1)^{k+1}/k ≤ log(2) ≤ Σ_{k=1}^{2n+1} (−1)^{k+1}/k   ∀ n ∈ N.
Finally, letting n → ∞ in the above inequalities proves the result.
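The bracketing can be checked directly: even partial sums lie below log 2, odd partial sums above (the truncation level is arbitrary):

```python
import math

# Partial sums of the alternating harmonic series; by the inequalities
# above, S_{2n} <= log(2) <= S_{2n+1} for every n.
def partial_sum(m):
    return sum((-1) ** (k + 1) / k for k in range(1, m + 1))

n = 1000
print(partial_sum(2 * n), math.log(2), partial_sum(2 * n + 1))
```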
Using again the formula for the geometric series, we see that
arctan′(x) = 1/(1 + x²) = Σ_{k=0}^∞ (−1)^k x^{2k}   ∀ x ∈ (−1, 1).
Arguing as before, for x ∈ [0, 1],
Σ_{k=0}^{2n+1} (−1)^k x^{2k+1}/(2k+1) ≤ arctan(x) ≤ Σ_{k=0}^{2n} (−1)^k x^{2k+1}/(2k+1)   ∀ n ∈ N.
In particular, taking x = 1,
Σ_{k=0}^{2n+1} (−1)^k/(2k+1) ≤ arctan(1) = π/4 ≤ Σ_{k=0}^{2n} (−1)^k/(2k+1)   ∀ n ∈ N,
so the result follows by letting n → ∞ (note that the series converges, thanks to Leibniz
criterion in Proposition 4.22).
Sometimes, the above methods for determining an indefinite integral of a function do not
produce a result. This may be because the primitive function we are looking for cannot be
expressed in terms of “known” functions.
Example 7.28 (Integral Sine). — The integral sine is the primitive function Si : R → R of the continuous function
x ∈ R ↦ sin(x)/x if x ≠ 0,   1 if x = 0,
with normalisation Si(0) = 0, that is Si(x) = ∫_0^x sin(t)/t dt. Thanks to Theorem 7.21, the function Si can be written as a power series:
Si(x) = ∫_0^x sin(t)/t dt = ∫_0^x Σ_{n=0}^∞ (−1)ⁿ t^{2n}/(2n+1)! dt = Σ_{n=0}^∞ (−1)ⁿ x^{2n+1}/((2n+1)!(2n+1))
for all x ∈ R.
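The series can be compared against a direct numerical integration of sin(t)/t; the truncation level and quadrature below are our choices:

```python
import math

# Si(x) via the power series from the example, truncated after `terms` terms.
def si_series(x, terms=30):
    return sum((-1) ** n * x ** (2 * n + 1)
               / (math.factorial(2 * n + 1) * (2 * n + 1))
               for n in range(terms))

# Si(x) via midpoint-rule integration of sin(t)/t; the integrand extends
# continuously by 1 at t = 0, and midpoints avoid t = 0 anyway.
def si_numeric(x, n=100_000):
    w = x / n
    return sum(math.sin((i + 0.5) * w) / ((i + 0.5) * w) for i in range(n)) * w

print(si_series(2.0), si_numeric(2.0))  # agree closely
```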
7.29. — In this whole section, I ⊆ R denotes a non-empty interval that does not consist of one point only. Also, all functions in this section are real-valued functions with domain I that are integrable on any compact interval [a, b] ⊆ I.
7.30. — Let f and g be functions with primitives F and G, respectively. Recall that, from the product rule for the derivative in Proposition 5.12, it follows that (F G)′ = f G + F g. This implies the integration by parts formula
∫ F(x) g(x) dx = F(x)G(x) − ∫ f(x)G(x) dx + C. (7.7)
In Leibniz notation, f = dF/dx and g = dG/dx. This leads to the notation f dx = dF and g dx = dG,
∫ g(f(x)) f′(x) dx = ∫ g(u) du + C (7.8)
where we used the change of variables u = f(x). The substitution rule is also called change of variable, as one has replaced the variable u in ∫ g(u) du by u = f(x). In Leibniz notation
∫ x eˣ dx = x eˣ − eˣ + C.
We note that it is sufficient to use only one integration constant C in such calculations, since
several such constants can be combined into one.
= log(x) · x − ∫ x · (1/x) dx + C = log(x) · x − ∫ 1 dx + C = x log(x) − x + C.
Suggestion: To ensure that the final result is correct, differentiate the result and check if you
get the original function. For instance, in this case, one can easily check that
Exercise 7.34. — Give a recursive formula for calculating the indefinite integrals
∫ xⁿ eˣ dx,   ∫ xⁿ sin(x) dx,   ∫ xⁿ cos(x) dx
for n ∈ N.
for all s, a, b ∈ R. Note that the case s = −1 needs to be treated separately, in analogy with
Example 7.8(6)-(7).
Example 7.37. — Given r > 0, we want to compute the indefinite integral ∫ √(r² − x²) dx. Due to the trigonometric identity √(r² − r² sin²(θ)) = r cos(θ), it is convenient to use the substitution x = r sin(θ).
∫ cos²(θ) dθ = ∫ cos(θ) sin′(θ) dθ = cos(θ) sin(θ) − ∫ cos′(θ) sin(θ) dθ + C = cos(θ) sin(θ) + ∫ sin²(θ) dθ + C,
therefore
2 ∫ cos²(θ) dθ = cos(θ) sin(θ) + θ + C   ⟹   ∫ cos²(θ) dθ = ½ (cos(θ) sin(θ) + θ) + C
(note that, since C ∈ R is arbitrary, in the last formula we still write C in place of C/2). This
proves that
∫ √(r² − x²) dx = r² ∫ cos²(θ) dθ = (r²/2)(sin(θ) cos(θ) + θ) + C,
and substituting back θ = arcsin(x/r) we conclude that
∫ √(r² − x²) dx = ½ x √(r² − x²) + (r²/2) arcsin(x/r) + C.
• Although this is not a trigonometric substitution, we still note the following: For the expression x(a² − x²)^{n/2} or the expression x(a² + x²)^{n/2}, the substitutions u = a² − x² and u = a² + x², respectively, allow us to compute the indefinite integrals.
Example 7.39. — (i) Given a > 0, using the substitution x = a tan(θ), recalling that (a² + x²)^{1/2} = a/cos(θ) and dx = a/cos²(θ) dθ, we get
∫ 1/(a² + x²)^{3/2} dx = ∫ (cos³(θ)/a³)(a/cos²(θ)) dθ = (1/a²) ∫ cos(θ) dθ = (1/a²) sin(θ) + C = (1/a²) tan(θ) cos(θ) + C = x/(a² √(a² + x²)) + C.
Certain indefinite integrals can be computed with hyperbolic substitutions. For instance, for expressions of the form (x² − a²)^{n/2} with a ∈ R, the substitution x = a cosh(u) yields dx = a sinh(u) du and (x² − a²)^{1/2} = a sinh(u).
Example 7.40. — Using the substitution x = cosh(u) (so dx = sinh(u) du), we compute
∫ √(x² − 1) dx = ∫ √(cosh²(u) − 1) sinh(u) du = ∫ sinh²(u) du.
In analogy to the argument used in Example 7.37, we compute ∫ sinh²(u) du as follows:
∫ sinh²(u) du = cosh(u) sinh(u) − ∫ cosh²(u) du = cosh(u) sinh(u) − ∫ (1 + sinh²(u)) du + C = cosh(u) sinh(u) − u − ∫ sinh²(u) du + C.
This yields
2 ∫ sinh²(u) du = cosh(u) sinh(u) − u + C   ⟹   ∫ sinh²(u) du = (cosh(u) sinh(u) − u)/2 + C,
hence
∫ √(x² − 1) dx = (cosh(u) sinh(u) − u)/2 + C = (x √(x² − 1) − arcosh(x))/2 + C.
Another method that we would like to mention briefly here is the so-called half-angle method (or Weierstrass substitution). This is useful for the integral of expressions like 1/sin(x) or (cos²(x) + cos(x) + sin(x))/(1 + sin(x)), see also Remark 7.46 below. We show this method in detail in the next example.
Example 7.41. — We want to compute ∫ 1/sin(x) dx, and we consider the change of variable u = tan(x/2). We note that, by the double-angle formulas for sine and cosine, it follows that
sin(x) = 2u/(1 + u²)   and   cos(x) = (1 − u²)/(1 + u²),
and analogously for the second formula. Furthermore, the relation u = tan(x/2) implies that x = 2 arctan(u), therefore dx = 2/(1 + u²) du (recall that arctan′(s) = 1/(1 + s²)). Hence
∫ 1/sin(x) dx = ∫ ((1 + u²)/(2u)) (2/(1 + u²)) du = ∫ (1/u) du = log|u| + C = log|tan(x/2)| + C.
The integrals (7.10) and (7.11) are calculated with the substitution u = x − a; for (7.12) substitute u = x/a, and for (7.13) and (7.14) substitute u = a² + x².
7.43. — To integrate a general rational function, we use what is called the partial fraction
decomposition of rational functions. Let p, q be polynomials without nontrivial common
divisors such that q ̸= 0 and deg p < deg q.
for some k ≤ ki and ℓ ≤ ℓj , and then one needs to integrate each of these individual terms.
Example 7.44. — We calculate the indefinite integral ∫ (x⁴ + 1)/(x²(x + 1)) dx. First, we perform division with remainder:
(x⁴ + 1)/(x³ + x²) = x − 1 + (x² + 1)/(x²(x + 1)).
To obtain the partial fraction decomposition of (x² + 1)/(x²(x + 1)), we set
(x² + 1)/(x²(x + 1)) = (ax + b)/x² + c/(x + 1).
Comparing coefficients gives
a + c = 1,   a + b = 0,   b = 1,
so a = −1, b = 1, c = 2, and hence
(x⁴ + 1)/(x²(x + 1)) = x − 1 − 1/x + 1/x² + 2/(x + 1).
Therefore
∫ (x⁴ + 1)/(x²(x + 1)) dx = ∫ x dx − ∫ 1 dx − ∫ (1/x) dx + ∫ (1/x²) dx + 2 ∫ 1/(x + 1) dx
  = x²/2 − x − log|x| − 1/x + 2 log|x + 1| + C.
polynomial x² + 2x + 2 has no real zeros. For the partial fraction decomposition we use the ansatz
1/(x(x² + 2x + 2)) = a/x + (bx + c)/(x² + 2x + 2),
thus
a + b = 0, 2a + c = 0, 2a = 1,
In some cases, the above procedure may also lead to the integral ∫ 1/(a² + x²)ⁿ dx for an a ∈ R and n ≥ 2, which (as explained previously) we can handle with the trigonometric substitution tan(u) = x/a.
Remark 7.46. — Now that we know how to integrate rational functions, we can rediscuss the half-angle method introduced before. This allows one to compute the integral of rational expressions in sine and cosine. In fact, with the substitution u = tan(x/2), using that
sin(x) = 2u/(1 + u²),   cos(x) = (1 − u²)/(1 + u²),   dx = 2/(1 + u²) du
(see Example 7.41), one ends up with the integral of a rational function in u.
Exercise 7.47. — Calculate the indefinite integral ∫ cos(x)/(2 + sin(x)) dx using the substitution u = tan(x/2).
Remark 7.48. — Sometimes one or the other substitution is carried out because there is a nested function in the function to be integrated and one simply has no other method available. For example, in the integral ∫ sin(√x) dx none of the mentioned methods is available, but one is tempted to set u = √x, and this indeed leads to an integral that one can solve. Similarly, in an integral of the form ∫ 1/(1 + eˣ) dx, one sets u = eˣ.
Example 7.49. — We compute the improper integral ∫_0^1 log(x) dx using
∫_0^1 log(x) dx = lim_{a→0} ∫_a^1 log(x) dx = lim_{a→0} [x log(x) − x]_a^1
Thus the improper integral ∫_0^1 (1/x) dx diverges, and we can assign it the value ∞.
Exercise 7.51. — Calculate ∫_0^1 1/√x dx and ∫_0^{π/2} tan(x) dx.
Exercise 7.52. — Decide for which p ∈ R≥0 the improper integral ∫_0^∞ x sin(x^p) dx converges.
To verify that this improper integral indeed converges, we examine the integration limits 0
and ∞ separately. For 0 < a < b we find, using integration by parts,
∫_a^b x^{s−1} e^{−x} dx = [(1/s) x^s e^{−x}]_a^b + (1/s) ∫_a^b x^s e^{−x} dx. (7.16)
We obtain
∫_0^b x^{s−1} e^{−x} dx = lim_{a→0} ( [(1/s) x^s e^{−x}]_a^b + (1/s) ∫_a^b x^s e^{−x} dx )
  = (1/s) b^s exp(−b) + (1/s) ∫_0^b x^s e^{−x} dx,
where the integral on the right is an actual Riemann integral since the function x^s e^{−x} is continuous on [0, b]. To investigate the upper limit of integration, we note that there exists
R > 0 such that eˣ > x^{s+2} holds for all x > R. Thus
∫_0^∞ x^s e^{−x} dx ≤ ∫_0^R x^s e^{−x} dx + ∫_R^∞ x^{−2} dx < ∞.
This shows that the gamma function satisfies the functional equation
7.54. — The Gamma function extends the factorial function from N to (0, ∞). In fact
Γ(1) = ∫_0^∞ x⁰ e^{−x} dx = e⁰ − lim_{x→∞} e^{−x} = 1
At the moment, it is not clear whether the Gamma function is continuous. Eventually, it will turn out that Γ is smooth. Also, for example, we cannot calculate the value Γ(1/2) with the integration methods we know so far, but we will see later by means of a two-dimensional integral that it is √π.
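Both claims can already be observed via the standard library's implementation of Γ:

```python
import math

# math.gamma agrees with the factorial on the positive integers
# (Gamma(n+1) = n!), and math.gamma(0.5) matches sqrt(pi).
for n in range(1, 8):
    assert math.isclose(math.gamma(n + 1), math.factorial(n))
print(math.gamma(0.5), math.sqrt(math.pi))
```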
7.55. — David Hilbert (1862–1943), in his 1893 article [Hil1893], used improper integrals in
the style of the Gamma function to prove that e (as first proved by Hermite in 1873) and π (as
first proved by Lindemann in 1882) are transcendental. We note here that the irrationality
of these numbers is much easier to prove. Transcendence proofs are generally much more
difficult. The difficulty in making such statements is perhaps illustrated by the fact that it is
still not known whether e + π is a transcendental number or not.
approximates the function f to within an error f (x) − y(x) = o(x − x0 ) as x → x0 . The “qual-
ity” of the approximation can be increased by considering higher polynomial approximations
instead of affine approximations.
In this section, it will be convenient to use the following abuse of notation: Given a, b ∈ R,
irrespective of the order between a and b, [a, b] denotes the interval between them. In other
words, for all a, b ∈ R, [a, b] and [b, a] denote the same interval.
We also recall that, if a < b, then
| ∫_a^b f(x) dx | ≤ ∫_a^b |f(x)| dx,
see Theorem 6.25. If instead b < a, then a minus sign appears (recall (7.2)) and we get
| ∫_a^b f(x) dx | = | ∫_b^a f(x) dx | ≤ ∫_b^a |f(x)| dx = −∫_a^b |f(x)| dx = | ∫_a^b |f(x)| dx |.
The coefficients are chosen so that P (k) (x0 ) = f (k) (x0 ) for k ∈ {0, . . . , n}.
We will state and prove several versions of Taylor’s Theorem. We begin with this first
version:
Remark 7.58. — In the above theorem, the assumption that f is an n-times continuously differentiable function guarantees that the integral of the continuous function t ↦ f^(n)(t) (x − t)^{n−1}/(n−1)! exists.
Proof. The proof follows by induction on n and integration by parts. If n = 1 then f : [a, b] →
R is continuously differentiable and, by Corollary 7.6, we get
f(x) = f(x0) + ∫_{x0}^x f′(t) dt.
If f is twice continuously differentiable, we can apply integration by parts to the above integral
with u(t) = f ′ (t) and v(t) = t − x. Indeed, since v ′ = 1 and v(x) = 0, we get
f(x) = f(x0) + ∫_{x0}^x f′(t) v′(t) dt
  = f(x0) + [f′(t) v(t)]_{x0}^x − ∫_{x0}^x f″(t) v(t) dt
  = f(x0) + f′(x0)(x − x0) + ∫_{x0}^x f″(t)(x − t) dt
  = P1(x) + ∫_{x0}^x f^(2)(t) (x − t)¹/1! dt.
f(x) = Σ_{k=0}^{n−1} f^(k)(x0)/k! (x − x0)^k − [f^(n)(t) (x − t)ⁿ/n!]_{x0}^x + ∫_{x0}^x f^(n+1)(t) (x − t)ⁿ/n! dt
  = Σ_{k=0}^{n} f^(k)(x0)/k! (x − x0)^k + ∫_{x0}^x f^(n+1)(t) (x − t)ⁿ/n! dt.
We can now state our two versions of Taylor’s Approximation, using first the big-O nota-
tion, and then a refined version with the little-o notation.
Also, since f^(n) is continuous on [a, b], it is bounded (recall Theorem 8.25). Hence, there exists a constant M such that |f^(n)| ≤ M on [a, b]. This implies that
|f(x) − P_{n−1}(x)| ≤ | ∫_{x0}^x f^(n)(t) (x − t)^{n−1}/(n−1)! dt | ≤ M | ∫_{x0}^x |x − t|^{n−1}/(n−1)! dt |   ∀ x ∈ [a, b].
Observe now that the sign of (x − t)^{n−1} is constant for t in the interval [x0, x], so
| ∫_{x0}^x |x − t|^{n−1}/(n−1)! dt | = | ∫_{x0}^x (x − t)^{n−1}/(n−1)! dt |
and the last integral can be computed with a change of variable: setting s = x − t we get
| ∫_{x0}^x (x − t)^{n−1}/(n−1)! dt | = | ∫_0^{x−x0} s^{n−1}/(n−1)! ds | = |(x − x0)ⁿ/n!| = |x − x0|ⁿ/n!.
We now show that by replacing Pn−1 with Pn , we can improve the previous result using
the little-o notation.
Therefore
f(x) = P_{n−1}(x) + f^(n)(x0) (x − x0)ⁿ/n! + ∫_{x0}^x (f^(n)(t) − f^(n)(x0)) (x − t)^{n−1}/(n−1)! dt
  = P_n(x) + ∫_{x0}^x (f^(n)(t) − f^(n)(x0)) (x − t)^{n−1}/(n−1)! dt. (7.22)
Now, given ε > 0, it follows from the continuity of f (n) at x0 that there exists δ > 0 such that
|f (n) (x) − f (n) (x0 )| < ε for all x ∈ (x0 − δ, x0 + δ) ∩ [a, b]. Hence, if x ∈ (x0 − δ, x0 + δ) ∩ [a, b],
we can bound the integrand in the last integral by
This implies
|f(x) − P_n(x)| ≤ | ∫_{x0}^x |f^(n)(t) − f^(n)(x0)| (x − t)^{n−1}/(n−1)! dt |
  < ε | ∫_{x0}^x |x − t|^{n−1}/(n−1)! dt | = ε |x − x0|ⁿ/n! ≤ ε |x − x0|ⁿ,
where the last integral has been computed as in the proof of Corollary 7.59. This proves that
|f(x) − P_n(x)| / |x − x0|ⁿ < ε   ∀ x ∈ (x0 − δ, x0 + δ) ∩ [a, b],
f(x) = f(x0) + f′(x0)(x − x0) + ½ f″(x0)(x − x0)² + o(|x − x0|²) as x → x0.
Hence, while in the case where f has a finite number of derivatives Corollary 7.60
provides a stronger result, in the case when f is smooth, the bound on f − Pn provided
by Corollary 7.59 is more convenient.
While for proving Corollary 7.60 the continuity of f (n) plays a crucial role, in the proof
of Corollary 7.59 we mainly used that f (n) is bounded (the continuity of f (n) is needed only
to guarantee that f (n) is integrable). In fact, it is possible to prove Corollary 7.59 under
the weaker assumption that the n-th derivative exists and is bounded (but is not necessarily
continuous). For this, we first prove the following alternative version of Taylor Theorem. Note
that in the case n = 1, this result corresponds to the Mean Value Theorem 5.31.
f(x) − P_{n−1}(x) = (1/n!) f^(n)(ξL)(x − x0)ⁿ. (7.23)
Proof. (Extra Material) Fix x ∈ (a, b) and consider the function F : (a, b) → R defined as
F(t) = f(t) + f^(1)(t)(x − t) + . . . + f^(n−1)(t)/(n−1)! (x − t)^{n−1} = Σ_{k=0}^{n−1} f^(k)(t)/k! (x − t)^k. (7.24)
Then F (x) = f (x) and F (x0 ) = Pn−1 (x). Also, its derivative is given by
F′(t) = Σ_{k=0}^{n−1} f^(k+1)(t)/k! (x − t)^k − Σ_{k=0}^{n−1} f^(k)(t)/k! · k (x − t)^{k−1}
  = Σ_{k=0}^{n−1} f^(k+1)(t)/k! (x − t)^k − Σ_{k=1}^{n−1} f^(k)(t)/(k−1)! (x − t)^{k−1}
  = Σ_{k=0}^{n−1} f^(k+1)(t)/k! (x − t)^k − Σ_{k=0}^{n−2} f^(k+1)(t)/k! (x − t)^k = f^(n)(t)/(n−1)! (x − t)^{n−1}.
Hence, applying the Cauchy Mean Value Theorem 5.35 in the interval [x0, x] to the functions F and g(t) = −(x − t)ⁿ we deduce the existence of a point ξL ∈ (x0, x) such that
(f(x) − P_{n−1}(x))/(x − x0)ⁿ = (F(x) − F(x0))/(g(x) − g(x0)) = F′(ξL)/g′(ξL) = [f^(n)(ξL)/(n−1)! (x − ξL)^{n−1}] / [n (x − ξL)^{n−1}] = f^(n)(ξL)/n!.
This implies (7.23) and concludes the proof.
Proof. (Extra Material) Given x ∈ [a, b], we apply (7.23) to find a point ξL ∈ (x0, x) such that
f(x) − P_{n−1}(x) = (1/n!) f^(n)(ξL)(x − x0)ⁿ.
Since |f^(n)(ξL)| ≤ M, it follows that
|f(x) − P_{n−1}(x)| ≤ (M/n!) |x − x0|ⁿ   ∀ x ∈ [a, b],
Another version of Taylor formula is the one with the so-called Cauchy remainder. We
discuss it in the following exercise.
f(x) − P_{n−1}(x) = (1/(n−1)!) f^(n)(ξC)(x − ξC)^{n−1}(x − x0). (7.26)
Hint: Consider the function F defined in (7.24) and apply to it the Mean Value Theorem 5.31
in the interval [x0 , x].
Example 7.65. — We can use the Taylor approximation to refine the discussion in Section
5.2.1. Let f : (a, b) → R be a n-times continuously differentiable function. Suppose x0 ∈ (a, b)
satisfies
f ′ (x0 ) = . . . = f (n−1) (x0 ) = 0.
• If f^(n)(x0) < 0 and n is even, then f has an isolated local maximum in x0.
• If f^(n)(x0) > 0 and n is even, then f has an isolated local minimum in x0.
• If n is odd, then x0 is not a local extremum of f.
All three statements follow from (7.23), which, in this case, takes the form
f(x) = f(x0) + (1/n!) f^(n)(ξL)(x − x0)ⁿ,   ξL ∈ (x0, x).
Indeed, if f (n) (x0 ) > 0, by continuity there exists δ > 0 such that f (n) (ξL ) > 0 for ξL ∈
(x0 , x) ⊂ (x0 − δ, x0 + δ). If n is even, then (x − x0 )n > 0 for x ̸= 0 and we deduce that
f (x) > f (x0 ) for x ∈ (x0 − δ, x0 + δ) with x ̸= 0. If n is odd, then (x − x0 )n changes sign
when considering x > x0 and x < x0 , so x0 is not a local extremum of f .
On the other hand, if f (n) (x0 ) < 0 and n is even, the same argument as above shows that
f (x) < f (x0 ) for x ∈ (x0 − δ, x0 + δ) with x ̸= 0, while in the case n odd x0 is not a local
extremum of f .
then one should recover f (x). Unfortunately, this is false, and functions that satisfy such a
property are rather special.
Note that the Taylor series is centered at x0 instead of 0 (i.e., xn is replaced with (x−x0 )n ).
Hence, all theorems about power series from Section 4.4 still hold, but taking into account
that now x0 plays the role of the center. In particular, if the series has radius of convergence
R > 0, then it converges for all x ∈ (x0 − R, x0 + R), while it diverges for |x − x0 | > R.
In other words, analytic functions f : I → R are characterized by the fact that, for every
point x0 ∈ I, there exists a power series that converges to f in a neighborhood of x0 .
As the next example shows, there are smooth functions f whose Taylor series converges to
a function different from f .
As shown in Exercise 5.25, ψ is smooth on R and satisfies ψ (n) (0) = 0 for all n ∈ N. Hence,
the Taylor series of the function ψ at the point x0 = 0 is the zero series:
Σ_{n=0}^∞ (ψ^{(n)}(0)/n!) x^n = Σ_{n=0}^∞ (0/n!) x^n = 0.
This series has an infinite radius of convergence and converges to the function 0. Since
ψ(x) > 0 holds for all x > 0, the Taylor series does not converge to ψ, and so ψ is not analytic
at the point x0 = 0.
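Exercise 5.25 is not reproduced here; the standard function with the properties stated above, which we assume in the sketch below, is ψ(x) = e^{−1/x} for x > 0 and ψ(x) = 0 for x ≤ 0. The check illustrates both claims numerically: ψ is strictly positive for x > 0, yet its difference quotients at 0 vanish faster than any fixed power, consistent with ψ^{(n)}(0) = 0 for all n.

```python
import math

def psi(x):
    # smooth but non-analytic at 0 (assumed definition, cf. Exercise 5.25)
    return math.exp(-1.0 / x) if x > 0 else 0.0

assert psi(0.1) > 0              # psi is strictly positive for x > 0 ...
assert psi(-1.0) == 0.0          # ... and identically 0 for x <= 0
# the difference quotients at 0 collapse faster than any fixed power of h,
# matching psi^{(n)}(0) = 0 for all n:
for h_ in [1e-1, 1e-2, 1e-3]:
    assert psi(h_) / h_ < h_
```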
The next result provides a criterion that guarantees that the Taylor series of f converges
to f in a neighborhood of x0 .
Then f is analytic at x0 .
Proof. We first estimate the radius of convergence R of the Taylor series. If we define a_n = f^{(n)}(x_0)/n!, then the Taylor series is equal to Σ_{n=0}^∞ a_n (x − x_0)^n. Also, thanks to our assumption on the size of |f^{(n)}|, it follows that

|a_n| ≤ (c A^n n!)/n! = c A^n.

Moreover, for x ∈ I with |x − x_0| ≤ δ, (7.23) gives

|f(x) − P_{n−1}(x)| ≤ (1/n!) |f^{(n)}(ξ_L)| |x − x_0|^n ≤ c A^n |x − x_0|^n ≤ c (Aδ)^n.
Hence, choosing δ > 0 with Aδ < 1, the right-hand side tends to 0 as n → ∞, therefore

f(x) = lim_{n→∞} P_{n−1}(x) = Σ_{n=0}^∞ (f^{(n)}(x_0)/n!) (x − x_0)^n   ∀ x ∈ (x_0 − δ, x_0 + δ) ∩ I,
as desired.
Exercise 7.71. — 1. Show that the functions exp, sin, sinh satisfy the property (7.27)
on any interval [a, b] ⊂ R.
2. Show that the function log satisfies (7.27) on any interval [a, b] ⊂ (0, ∞).
3. Let f, g : [a, b] → R be functions satisfying (7.27). Show that f + g and f · g also satisfy
this property (possibly with different constants c and A).
Setting up and solving differential equations stands as a primary practical use of calculus.
These equations are instrumental in addressing a wide range of challenges in fields such as
physics, chemistry, biology, and more. Moreover, disciplines like structural analysis, modern
economics, and information technology heavily rely on differential equations, making them
indispensable in these domains.
1. Order: An ODE is of order n if u^{(n)} is the highest-order derivative appearing in the ODE. For instance:
3. Homogeneity (for linear ODEs): A linear ODE is homogeneous if all terms involve the
function or its derivatives (this is equivalent to asking that if u is a solution, then Au is a
solution for all A ∈ R). It is non-homogeneous if there is an additional term independent
of the function.
Example 8.2. — Here we present some classic examples of ODEs and their applications:
1. Newton’s Law of Cooling: In the field of heat transfer, Newton’s Law of Cooling plays
a pivotal role in understanding the dynamics of temperature change. This law states
that:
This principle leads to the formulation of a differential equation that governs the temperature dynamics of an object. The equation is expressed as:

Ṫ(t) = −k (T(t) − T_env),

where T(t) is the temperature of the object at time t, T_env is the temperature of the surrounding environment, and k > 0 is a proportionality constant.
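A quick numerical illustration of the cooling law (with hypothetical values of k, T_env and the initial temperature T_0, not taken from the text): a forward-Euler integration of the ODE should reproduce the standard explicit solution T(t) = T_env + (T_0 − T_env) e^{−kt}.

```python
import math

# Hypothetical parameters, not from the text: cooling rate k, ambient
# temperature Tenv, initial temperature T0.
k, Tenv, T0 = 0.3, 20.0, 90.0

h, steps = 1e-4, 50000          # forward Euler for T' = -k (T - Tenv), up to t = 5
T = T0
for _ in range(steps):
    T += h * (-k) * (T - Tenv)

# compare with the explicit solution T(t) = Tenv + (T0 - Tenv) e^{-k t}
exact = Tenv + (T0 - Tenv) * math.exp(-k * 5.0)
assert abs(T - exact) < 1e-2
```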
From Newton's second law of motion ẍ(t) = −F(x(t)), one obtains the following ODE for the simple harmonic oscillator:
ẍ(t) + ω 2 x(t) = 0,
where ω denotes the angular frequency of the oscillations. This ODE is linear, homogeneous, and of second order.
In real oscillators, friction slows the motion of the system. In many vibrating systems,
the frictional force can be modeled as being proportional to the velocity ẋ of the object.
This leads to the formulation of the following ODE for the damped harmonic oscillator:

m ẍ(t) + d ẋ(t) + k x(t) = 0,

where m is the mass, d ≥ 0 is the damping constant, and k > 0 is the spring constant.
8.3. — So far, we have only talked about single differential equations, but one may also
study systems of differential equations. In addition, solutions of a differential equation are
required to satisfy some “initial conditions” such as u(0) = 0 (for example, this can correspond
to prescribing position at time 0) and/or u′ (0) = 1 (this can correspond to prescribing velocity
at time 0). These are called boundary conditions.
In other words, the set of solutions of (8.2) forms a one-dimensional linear subspace of
C 1 (I).
By Corollary 5.46, we deduce that v(x) = A for some A ∈ R, or equivalently u(x) = Ae^{−F(x)}.
Remark 8.5. — As we have seen, solutions of (8.2) are defined in terms of a primitive of f .
Since primitives are defined up to a constant, one could wonder what happens if one replaces
F by F + C for some constant C ∈ R. This would correspond to replacing Ae−F (x) with
Ae−C e−F (x) , but since A ∈ R is arbitrary, this plays no essential role in the final statement.
8.6. — We can now investigate the solvability of non-homogeneous linear first order ODEs,
namely
u′ (x) + f (x)u(x) = g(x) ∀ x ∈ I. (8.3)
To motivate the next result, we look for a special solution by applying the method of variation
of constants. The idea is that, instead of looking for solutions of the form x 7→ Ae−F (x)
(that we know solve the homogeneous equation), we look for solutions u(x) = H(x)e−F (x) for
some C 1 function H : I → R. With this choice it follows that
u′ (x) = H ′ (x)e−F (x) − H(x)F ′ (x)e−F (x) = H ′ (x)e−F (x) − f (x)u(x).
Hence, if we want u to solve (8.3) we need to impose that H ′ (x)e−F (x) = g(x), or equivalently,
H is a primitive of g(x)eF (x) .
In other words, the set of solutions of (8.3) forms a one-dimensional affine subspace of
C 1 (I).
Proof. First, given A ∈ R and u(x) = (H(x) + A)e^{−F(x)}, it follows that

u′(x) + f(x)u(x) = H′(x)e^{−F(x)} = g(x)e^{F(x)}e^{−F(x)} = g(x),

so u solves (8.3).
Vice versa, if u solves (8.3) then v(x) = u(x) − H(x)e^{−F(x)} solves

v′(x) + f(x)v(x) = 0.

In other words, v(x) solves (8.2), so Proposition 8.7 implies that v(x) = Ae^{−F(x)} for some
A ∈ R. Since u(x) = v(x) + H(x)e−F (x) , this proves the result.
The previous results give us formulas to solve every linear first order ODE. However, in
a concrete case, the difficulty will be determining the primitive F of f and then the one of
g(x)eF (x) . As we have seen above, solutions are uniquely determined up to a free parameter
A ∈ R. This will be used to impose the boundary condition.
u′(x) − 2x u(x) = e^{x²},   u(0) = 1, (8.4)

on R. Following Proposition 8.7, we set f(x) = −2x and g(x) = e^{x²}. Then a primitive of f is the function F(x) = −x², while a primitive of g(x)e^{F(x)} = e^{x²} e^{−x²} = 1 is given by x. So, u must be of the form u(x) = (x + A)e^{x²}. Imposing the boundary condition u(0) = 1 we obtain A = 1, therefore the solution to the above ODE is given by

u(x) = (x + 1)e^{x²}. (8.5)
Remark 8.9. — If one forgets the formula from Proposition 8.7, one can try to remember
the following procedure to solve (8.3).
Recalling that multiplying by a function of the form ew(x) for some function w is “useful”
(based on what we have seen in previous pages), we multiply (8.3) by ew(x) , so to get
Hence, if we choose w = F a primitive of f (note that here we can choose any primitive of
f without worrying about the additional constant C, since all that matters is that F ′ = f ),
then

(u(x)e^{F(x)})′ = g(x)e^{F(x)},

therefore

u(x)e^{F(x)} = ∫ g e^{F} + A,
u′(x) − (4/x + 1) u(x) = x⁴,   u(1) = 1.
u′(x)/f(u(x)) = 1.

Integrating both sides and using the change of variable formula (7.8), we get

∫ 1/f(u) du = ∫ u′(x)/f(u(x)) dx = ∫ 1 dx + C = x + C, (8.7)
Note that, since by assumption f ≠ 0 on the domain of integration (otherwise we could not divide by f(u(x))), it means that H′ = 1/f ≠ 0, so H is invertible (since it is either strictly increasing or strictly decreasing).
Example 8.11. — Consider the logistic growth model used in population dynamics:
u′(x) = r u(x) (1 − u(x)/K), (8.8)
where u(x) ∈ (0, K) represents the population at “time” x, r is the growth rate, and K is the carrying capacity (see Example 8.2(3)). To solve this, we rearrange and integrate:

K u′(x)/(u(x)(K − u(x))) = r   =⇒   ∫ K/(u(K − u)) du = ∫ r dx = rx + C.
The left-hand side is the integral of a rational function, that can be solved as discussed in
Section 7.3.4: one can observe that
K/(u(K − u)) = 1/u + 1/(K − u),
This implies

log(u(x)/(K − u(x))) = C + rx   =⇒   u(x)/(K − u(x)) = e^{C+rx}
=⇒   u(x) = K e^{C+rx} − u(x) e^{C+rx}
=⇒   (1 + e^{C+rx}) u(x) = K e^{C+rx}
=⇒   u(x) = K e^{C+rx}/(1 + e^{C+rx}).
If u0 ∈ (0, K) denotes the initial population, then setting x = 0 (here x plays the role of time)
we get
u_0 = K e^C/(1 + e^C)   =⇒   e^C = u_0/(K − u_0),
which gives
u(x) = K u_0/(u_0 + (K − u_0) e^{−rx}).
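As a quick sanity check of this closed-form solution, one can compare it with a direct numerical integration of (8.8). The parameters r, K, u0 below are illustrative choices, not values from the text; this is a minimal forward-Euler sketch.

```python
import math

# Illustrative parameters (not from the text): growth rate r, carrying
# capacity K, initial population u0.
r, K, u0 = 1.5, 10.0, 2.0

def logistic_exact(x):
    # closed-form solution u(x) = K u0 / (u0 + (K - u0) e^{-r x})
    return K * u0 / (u0 + (K - u0) * math.exp(-r * x))

h, steps = 1e-4, 30000           # forward Euler for u' = r u (1 - u/K), up to x = 3
u = u0
for _ in range(steps):
    u += h * r * u * (1 - u / K)

assert abs(u - logistic_exact(3.0)) < 1e-2
```

The Euler trajectory also exhibits the qualitative behavior of the model: monotone growth from u0 toward the carrying capacity K.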
u′ (x) = −ku(x)
where k > 0 is the decay constant. This ODE could be solved using (8.4), but we show an
alternative approach via separation of variables.
More precisely, if u is identically zero then there is nothing to prove. Otherwise, if u is
non-zero in some interval, then we can divide by u and integrate to get
∫ 1/u du = ∫ −k dx   =⇒   log |u| = −kx + C. (8.9)
This proves that, in each interval I where u does not vanish, there exists a constant C ∈ R such that |u(x)| = e^C e^{−kx}. Since u has to be continuous, this implies that either u(x) = e^C e^{−kx} or u(x) = −e^C e^{−kx} on the whole R. Therefore, in conclusion, u(x) = a e^{−kx} for some a ∈ R. Imposing the condition that u(0) = u_0, this gives

u(x) = u_0 e^{−kx}.
Remark 8.13 (Method of Separation of Variables). — The method described above can
be generalized to ODEs of the form

u′(x) = f(u(x)) g(x)

for some continuous functions f, g : R → R. More precisely, assuming as before that f(u(x)) ≠
0, we can divide both sides and obtain

u′(x)/f(u(x)) = g(x).

Integrating both sides and using the change of variable formula (7.8), we get

∫ 1/f(u) du = ∫ u′(x)/f(u(x)) dx = ∫ g(x) dx + C, (8.12)
u′′ + a1 u′ + a0 u = 0 (8.13)
Example 8.14. — Solutions of u′′ = 0 are affine functions, that is, u(x) = Ax + B for
constants A, B ∈ R.
Solutions of u″ − u = 0 are functions of the form

u(x) = A e^{x} + B e^{−x}

for constants A, B ∈ R.
Solutions of u″ + u = 0 are functions of the form

u(x) = A cos(x) + B sin(x)
for constants A, B ∈ R. Since sine and cosine can be written in terms of e±ix , one can also
look for solutions of the form
u(x) = Ceix + De−ix
with C, D ∈ C and then re-express the solution in terms of sine and cosine (recall that we are
interested in real-valued functions).
8.16. — The last two examples above suggest the approach of looking for solutions of
(8.13) of the form u(x) = e^{αx} for some α ∈ C. Indeed, with this choice we see that

(e^{αx})″ + a_1 (e^{αx})′ + a_0 e^{αx} = (α² + a_1 α + a_0) e^{αx},

so u(x) = e^{αx} solves (8.13) if and only if α is a root of the characteristic polynomial

p(t) = t² + a_1 t + a_0.
We distinguish three cases, depending on whether the discriminant ∆ = a21 − 4a0 is positive,
negative, or zero.
• Case 1: ∆ > 0. In this case, the characteristic polynomial p(t) has two distinct real roots

α = (−a_1 + √∆)/2,   β = (−a_1 − √∆)/2. (8.14)
This implies that the real-valued functions x ↦ e^{αx} and x ↦ e^{βx} are solutions, and therefore so is any linear combination u(x) = A e^{αx} + B e^{βx} with A, B ∈ R.
• Case 3: ∆ = 0. In this case, the characteristic polynomial has the double real root

α = −a_1/2, (8.16)
thus x 7→ eαx is a solution of (8.13). To find another solution we recall the example u′′ = 0.
In this case two solutions are given by 1 and x, and these two solutions can be written as eγx
and xeγx with γ = 0.
Motivated by this observation, one could wonder whether x 7→ xeαx is a solution. This is
indeed the case:
(x e^{αx})″ + a_1 (x e^{αx})′ + a_0 x e^{αx} = (α² + a_1 α + a_0) x e^{αx} + (2α + a_1) e^{αx} = 0,
where the first term vanishes because α is a root of the characteristic polynomial, while the second term vanishes because of (8.16). This shows that the functions x ↦ (A + Bx)e^{αx}, A, B ∈ R, are solutions of (8.13).
It is customary, for second order ODEs, to prescribe both the value of u and the value of
its derivative at some point (for instance, u(0) = 1 and u′ (0) = 0). The fact that we have
two constants A, B guarantees that we can choose them so as to impose these two boundary
conditions.
Now that we have found solutions to (8.13), we want to prove that they are the only ones.
This is the content of the next result.
Following the terminology from Paragraph 8.16 above, consider the following solutions
to the homogeneous ODE (8.13):
If u ∈ C 2 (I) solves (8.13), then there exist A, B ∈ R such that u = Au1 + Bu2 .
In other words, the set of solutions of (8.13) forms a two-dimensional linear subspace
of C 2 (I).
Proof. (Extra material) We consider the case ∆ > 0 (the other cases can be treated analogously). Assume for simplicity that 0 ∈ I (the general case can be treated similarly, choosing a point x_0 ∈ I and arguing in a similar fashion with x_0 in place of 0). Since u_1(x) = e^{αx} and u_2(x) = e^{βx} satisfy u_1(0) = u_2(0) = 1, u_1′(0) = α, and u_2′(0) = β, if we define

v_1(x) = (α u_2(x) − β u_1(x))/(α − β),   v_2(x) = (u_1(x) − u_2(x))/(α − β),
then v_1 and v_2 are two solutions of (8.13) satisfying v_1(0) = 1, v_1′(0) = 0, v_2(0) = 0, v_2′(0) = 1.
Now, given u ∈ C²(I) solution of (8.13), consider w(x) = u(x) − u(0)v_1(x). This is still a solution and w(0) = u(0) − u(0)v_1(0) = 0. We then consider the function

W(x) = w(x)v_2′(x) − w′(x)v_2(x)

(this function is called “Wronskian”). Using that both w and v_2 solve (8.13) one can check
that W ′ = 0, thus W is constant. Since W (0) = 0 (because w(0) = v2 (0) = 0), we conclude
that
w(x)v2′ (x) − w′ (x)v2 (x) = 0.
Now, if w is identically zero, then there is nothing to prove (since this means that u = u(0)v1 ).
Otherwise, if w is not identically zero, by continuity we can find a small interval where both w and v_2 do not vanish. There, dividing by w(x)v_2(x), we get w′(x)/w(x) = v_2′(x)/v_2(x),
which implies that log |w(x)| − log |v_2(x)| = C for some C ∈ R. This shows that, in each
interval I where w and v2 do not vanish, there exists a constant C ∈ R such that |w(x)| =
eC |v2 (x)|. By continuity this implies that as long as v2 does not vanish, then also w does not
vanish and w(x) = a v_2(x) for some constant a ∈ R. Since in our case v_2 vanishes only at 0 (as one can easily check), we deduce that there exist two constants b, c ∈ R such that w(x) = b v_2(x) for x > 0 and w(x) = c v_2(x) for x < 0. So, since w ∈ C²(I) by assumption, the only option is b = c, which proves that w(x) = b v_2(x) in R. In conclusion, u = u(0)v_1 + b v_2, which is a linear combination of u_1 and u_2, as desired.
A component of this force arises from the expansion of the spring and is oriented towards rest.
According to Hooke’s law, this force is given by −ku, where the real number k > 0 is called
the spring constant. Furthermore, friction forces generally act against the movement. We assume that the corresponding force is given by −du′, where d ≥ 0 is the damping constant. The
differential equation describing the motion u(t) of the mass is thus mu′′ = −du′ − ku, or
u″ + (d/m) u′ + (k/m) u = 0,

which is, therefore, a linear homogeneous differential equation of second order. If we set ω = √(k/m) and ζ = d/(2mω), then the ODE becomes
u′′ + 2ζωu′ + ω 2 u = 0
The characteristic polynomial is

p(t) = t² + 2ζω t + ω².

If 0 ≤ ζ < 1, its roots are complex conjugate and the solutions take the form

u(t) = e^{−ζωt} (A cos(γt) + B sin(γt))

with γ = √(1 − ζ²) ω. The constants A and B depend on the initial position u(0) and the initial velocity u′(0) of the mass. In the case ζ = 0, the oscillation is undamped and u is a periodic function.
If friction is large compared to the strength of the spring so that ∆ ≥ 0 (this happens
when ζ ≥ 1), then the oscillating behavior disappears and the weight returns exponentially
fast to its steady state: if ζ > 1 then
p
u(t) = Ae−λ1 t + Be−λ2 t , λ1,2 = ζ ± ζ 2 − 1 ω,
while if ζ = 1 then
u(t) = Ae−ωt + Bte−ωt .
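The damping regimes above can be checked numerically: the sketch below verifies by central finite differences that the stated closed-form solutions satisfy u″ + 2ζωu′ + ω²u = 0. The values of ω, ζ, A, B are illustrative assumptions, not taken from the text.

```python
import math

# Illustrative parameters (not from the text): angular frequency omega,
# damping ratios zeta, and free constants A, B.
omega = 2.0

def u_under(t, zeta=0.2, A=1.0, B=0.5):
    # under-damped solution: e^{-zeta omega t} (A cos(gamma t) + B sin(gamma t))
    gamma = math.sqrt(1 - zeta ** 2) * omega
    return math.exp(-zeta * omega * t) * (A * math.cos(gamma * t) + B * math.sin(gamma * t))

def u_crit(t, zeta=1.0, A=1.0, B=0.5):
    # critically damped solution (zeta = 1): (A + B t) e^{-omega t}
    return (A + B * t) * math.exp(-omega * t)

def residual(u, t, zeta, h=1e-4):
    # central-difference approximation of u'' + 2 zeta omega u' + omega^2 u
    d1 = (u(t + h) - u(t - h)) / (2 * h)
    d2 = (u(t + h) - 2 * u(t) + u(t - h)) / h ** 2
    return d2 + 2 * zeta * omega * d1 + omega ** 2 * u(t)

for t in [0.3, 1.0, 2.5]:
    assert abs(residual(u_under, t, zeta=0.2)) < 1e-4
    assert abs(residual(u_crit, t, zeta=1.0)) < 1e-4
```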
One can note that ζ − √(ζ² − 1) < 1 for ζ > 1, so the fastest exponential convergence is achieved in the critically damped case ζ = 1.
u′′ + a1 u′ + a0 u = g (8.17)
for given a0 , a1 ∈ R and g ∈ C 0 (I). Following the terminology from Paragraph 8.16, consider
the solutions to the homogeneous ODE:
We want to solve (8.17) using the method of variation of constants, that is, we look for a
solution u of the form u(x) = H1 (x)u1 (x) + H2 (x)u2 (x) for some functions H1 , H2 ∈ C 2 (I).
Calculating u′ and u″ gives

u′ = (H_1′u_1 + H_2′u_2) + (H_1u_1′ + H_2u_2′),
u″ = (H_1′u_1 + H_2′u_2)′ + (H_1′u_1′ + H_2′u_2′) + (H_1u_1″ + H_2u_2″).
Therefore
u′′ + a1 u′ + a0 u = (H1′ u1 + H2′ u2 )′ + (H1′ u′1 + H2′ u′2 ) + (H1 u′′1 + H2 u′′2 )
+ a1 (H1′ u1 + H2′ u2 ) + a1 (H1 u′1 + H2 u′2 ) + a0 (H1 u1 + H2 u2 ).
Using that u_1 and u_2 solve the homogeneous equation, the expression above gives

u″ + a_1u′ + a_0u = (H_1′u_1 + H_2′u_2)′ + (H_1′u_1′ + H_2′u_2′) + a_1(H_1′u_1 + H_2′u_2).

Hence, u solves (8.17) if we impose the two conditions

H_1′(x)u_1(x) + H_2′(x)u_2(x) = 0,   H_1′(x)u_1′(x) + H_2′(x)u_2′(x) = g(x).
Using the first equation, one can express H2′ in terms of H1′ , that is,
H_2′ = −(u_1/u_2) H_1′. (8.18)

Substituting this into the second equation gives

H_1′ u_1′ − (u_1/u_2) u_2′ H_1′ = g   =⇒   H_1′ = u_2 g/(u_1′ u_2 − u_2′ u_1).
In other words, if H_1 and H_2 are primitives of the functions above, then u = H_1u_1 + H_2u_2 is a particular solution of (8.17). Finally, the set of all solutions can
Following the terminology from Paragraph 8.16 above, consider the following solutions
to the homogeneous ODE (8.13):
In other words, the set of solutions of (8.17) forms a two-dimensional affine subspace of C²(I).
Although this method is very general, computationally it can be very involved. So, in some (very special) cases it may be easier to “guess” a particular solution to the non-homogeneous equation by looking at functions of the form p(x)e^{γx}, p(x)e^{γx} cos(ηx), or p(x)e^{γx} sin(ηx), where p(x) is a polynomial and γ, η > 0 (depending on the structure of g, one of these functions may work).
In this case, all solutions to the homogeneous equation are A cos(x) + B sin(x). Therefore, we
look for a solution of the form u(x) = H1 (x) cos(x) + H2 (x) sin(x).
This leads to the two equations
H1′ (x) cos(x) + H2′ (x) sin(x) = 0, −H1′ (x) sin(x) + H2′ (x) cos(x) = 1,
Solving this linear system gives H_1′(x) = −sin(x) and

H_2 = ∫ cos(x)/(cos²(x) + sin²(x)) dx = ∫ cos(x) dx.
Hence, we can take H1 = cos(x) and H2 = sin(x), which leads to the particular solution
u(x) = cos2 (x) + sin2 (x) = 1 (you see that, in this case, one may have tried to guess it!). So,
the general solution is given by
particular solution

u(x) = (1/2) cos²(x) sin(x) − (1/2) x cos(x) − (1/2) cos²(x) sin(x) = −(1/2) x cos(x).

So, the general solution is given by

u(x) = −(x cos(x))/2 + A cos(x) + B sin(x).

Imposing the boundary conditions u(0) = 0 and u′(0) = 1, we obtain

u(x) = −(1/2) x cos(x) + (3/2) sin(x).
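One can verify numerically that −x cos(x)/2 is indeed a particular solution of u″ + u = sin(x); the finite-difference sketch below is purely illustrative.

```python
import math

def u_part(x):
    # candidate particular solution of u'' + u = sin(x)
    return -0.5 * x * math.cos(x)

def residual(x, h=1e-4):
    # central-difference approximation of u'' + u - sin(x)
    d2 = (u_part(x + h) - 2 * u_part(x) + u_part(x - h)) / h ** 2
    return d2 + u_part(x) - math.sin(x)

for x in [0.0, 1.0, 2.0, 5.0]:
    assert abs(residual(x)) < 1e-5
```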
Remark 8.23. — In the solution of Example 8.22, one may note the presence of x in front
of cos(x). This is due to the fact that sin(x) and cos(x) are solutions to the homogeneous
equation, so the solution to the non-homogeneous problem cannot be just a linear combination
of them. As a general strategy, in such situations, a particular solution to the non-homogeneous equation is sought by multiplying the solutions of the homogeneous equation by x.
1. f is continuous in R × R;
2. f is Lipschitz with respect to the second variable; that is, there exists a constant L > 0 such that

|f(x, y_1) − f(x, y_2)| ≤ L |y_1 − y_2|   ∀ x, y_1, y_2 ∈ R.
Then, for any point (x0 , y0 ) ∈ R × R there exists a unique C 1 function u : R → R such
that

u′(x) = f(x, u(x)) for all x ∈ R,   u(x_0) = y_0. (8.19)
As we shall see in Section 8.2.2 below, the proof is based on the method of successive
approximations, also known as Picard iterations. It involves constructing a sequence of con-
tinuous functions that converge to the solution of the differential equation. Before diving into
the proof of this important theorem, we first discuss some examples and generalizations.
Theorem 8.26 guarantees that solutions to the first order ODE (8.19) are unique when f
is Lipschitz in the second variable. This assumption is crucial, as the next example shows.
Hence, Theorem 8.26 guarantees that the solution is unique. Since u = 0 is a solution, this is
the unique solution.
• For α < 1 the function f(y) = |y|^α is not Lipschitz. Indeed, choosing y_2 = 0 in the definition of Lipschitz continuity, if this function were Lipschitz there would exist a constant L > 0 such that

|f(y_1) − f(0)| = |y_1|^α ≤ L |y_1|   ∀ y_1 ∈ R,

but this is false since lim_{y_1→0} |y_1|^{1−α} = 0 (recall that α < 1).
Note now that, also in this case, the function u = 0 is a solution. We now try to use
separation of variables (recall Section 8.1.2) to find a second solution that is not zero, say
with u(x) > 0 somewhere:
u′(x) = u(x)^α   =⇒   u′(x)/u(x)^α = 1,

∫ du/u^α = ∫ dx + C   =⇒   u^{1−α}/(1 − α) = x + C.
Motivated by the previous example, one may wonder if the solution of (8.20) is unique for α > 1. We begin with the following observation, which we state as an exercise.
Exercise 8.28. — Let α > 1. Prove that the function f : R → R, y 7→ |y|α , is locally
Lipschitz (i.e., it is Lipschitz in every compact interval [a, b]), but it is not Lipschitz on the
whole R.
Hint: To prove local Lipschitz continuity, use that f(y_1) − f(y_2) = ∫_{y_2}^{y_1} f′(x) dx.
By the previous exercise, we see that Theorem 8.26 does not apply to (8.20) when α > 1. Still, since this function is locally Lipschitz, one may hope that some existence and uniqueness
theorem still holds.
This is indeed the case, as implied by the local version of Cauchy-Lipschitz Theorem
stated below. As we shall discuss later, since now the function f is only assumed to be locally
Lipschitz, in general we cannot find a solution u defined in whole R.
1. f is continuous in I × R;
2. f is locally Lipschitz with respect to the second variable; that is, for all compact intervals [a, b] ⊂ I and [c, d] ⊂ R there exists a constant L > 0 such that

|f(x, y_1) − f(x, y_2)| ≤ L |y_1 − y_2|   ∀ x ∈ [a, b], y_1, y_2 ∈ [c, d].
Then, for any point (x0 , y0 ) ∈ I × R there exist an interval I ′ ⊂ I containing x0 , and
a unique C 1 function u : I ′ → R, such that
u′(x) = f(x, u(x)) for all x ∈ I′,   u(x_0) = y_0. (8.21)
In other words, under a local Lipschitz assumption, one can only guarantee the existence
and uniqueness of a solution for some interval around x0 . Also, as long as the solution u(x)
remains bounded within I ′ , then one can continue applying Theorem 8.29 to extend the
interval I ′ as much as possible.
To better understand why solutions are defined only in some interval I ′ ⊂ I, we consider
the following example.
u′(x) = u(x)²   =⇒   u′(x)/u(x)² = 1   =⇒   −1/u(x) = x + C.

Imposing u(0) = 1 gives C = −1, hence

u(x) = 1/(1 − x).
Note that this function solves the ODE in (−∞, 1), but limx→1− u(x) = ∞, so we cannot
extend this solution beyond x = 1.
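The blow-up at x = 1 is visible numerically as well: a forward-Euler integration (with an illustrative step size h) tracks 1/(1 − x) accurately away from 1 and produces exploding values as x approaches 1.

```python
def exact(x):
    # explicit solution u(x) = 1/(1 - x) of u' = u^2, u(0) = 1
    return 1.0 / (1.0 - x)

h = 1e-5                         # illustrative step size for forward Euler
u, x = 1.0, 0.0
while x < 0.5:                   # Euler tracks the solution well away from 1 ...
    u += h * u * u
    x += h
assert abs(u - exact(0.5)) < 1e-3    # exact(0.5) = 2

while x < 0.999:                 # ... but the values explode near x = 1
    u += h * u * u
    x += h
assert u > 100
```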
with α > 1. Show that Theorem 8.29 applies and find the unique solution.
Although Theorem 8.29 guarantees that most nonlinear ODEs have a unique solution (at
least locally in x), nonlinear ODEs are very difficult to solve and there are no general tech-
niques to tackle such problems, neither in practice nor in theory. Therefore, in applications,
one often resorts to numerical methods.
Proof. We first prove existence in four steps. In the fifth step, we prove uniqueness.
• Step 1: An equivalent integral equation. We claim that u : R → R is a C 1 solution to
(8.19) if and only if u : R → R is a continuous function satisfying
u(x) = y_0 + ∫_{x_0}^x f(s, u(s)) ds   ∀ x ∈ R. (8.22)
Indeed, if u solves the ODE, then by integration (see Corollary 7.6) we deduce the validity of
(8.22).
Vice versa, if u is a continuous function satisfying (8.22), then Theorem 7.4 and Re-
mark 8.32 imply that u(x) is a primitive of the continuous function f (x, u(x)), therefore
u′ (x) = f (x, u(x)). In particular, u is C 1 (since its derivative is continuous). Finally, choos-
ing x = x0 in (8.22) we deduce that u(x0 ) = y0 .
Therefore, to prove the existence, it suffices to construct a solution to (8.22). This will be accomplished by constructing what are known as Picard approximations, a sequence of functions that converge to a solution of (8.22).
• Step 2: Construction of Picard approximations. First, we define the continuous
function u_0 : R → R as

u_0(x) = y_0   ∀ x ∈ R.

Then, we define u_1 : R → R as

u_1(x) = y_0 + ∫_{x_0}^x f(s, u_0(s)) ds.
Note that the integral is well defined since u0 is continuous and therefore also the function
s 7→ f (s, u0 (s)) is continuous (see Remark 8.32). We also observe that u1 is the primitive of
a continuous function, so it is C 1 (and, in particular, continuous).
More generally, given n ∈ N, once the continuous function u_n : R → R is constructed, then we define

u_{n+1}(x) = y_0 + ∫_{x_0}^x f(s, u_n(s)) ds.
Again, since un is continuous, also s 7→ f (s, un (s)) is continuous, and therefore un+1 is C 1
(and in particular continuous).
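For the model problem u′(x) = u(x), u(0) = 1 (that is, f(x, y) = y, x_0 = 0, y_0 = 1), each Picard iterate is a polynomial and the integration step can be carried out exactly on coefficient lists. The sketch below (an illustration, not part of the proof) shows that the iterates are precisely the Taylor polynomials of exp, the solution of this ODE.

```python
import math

# Picard iterations for u'(x) = u(x), u(0) = 1, i.e. f(x, y) = y, x0 = 0,
# y0 = 1. Each iterate is a polynomial stored as a coefficient list
# [c0, c1, ...]; integrating from 0 and adding y0 maps it to [y0, c0/1, c1/2, ...].
def picard_step(coeffs, y0=1.0):
    return [y0] + [c / (k + 1) for k, c in enumerate(coeffs)]

u = [1.0]                        # u_0(x) = y0
for _ in range(10):
    u = picard_step(u)

# u_10 is the degree-10 Taylor polynomial of exp, the exact solution:
assert u[:4] == [1.0, 1.0, 0.5, 1.0 / 6.0]
assert abs(sum(c * 0.5 ** k for k, c in enumerate(u)) - math.exp(0.5)) < 1e-9
```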
• Step 3: Convergence of Picard approximations. Fix τ = 1/(2M), where M denotes the Lipschitz constant of f in the second variable (so that τM = 1/2).
We now prove the uniform convergence of the sequence of Picard approximations un on the
interval [x0 − τ, x0 + τ ] by proving that this sequence corresponds to the partial sums of a
uniformly convergent series of functions. More precisely, define v_k = u_k − u_{k−1}, so that

u_n = u_0 + Σ_{k=1}^n v_k.

We will show that the series

u_∞(x) = u_0(x) + Σ_{k=1}^∞ v_k(x)

converges absolutely for every x ∈ [x_0 − τ, x_0 + τ], so that the function u_∞ : [x_0 − τ, x_0 + τ] → R is well defined, and that the sequence of functions {u_n}_{n=0}^∞ converges uniformly to u_∞ on [x_0 − τ, x_0 + τ].
To prove this, we observe that

v_{n+1}(x) = u_{n+1}(x) − u_n(x) = ∫_{x_0}^x [f(s, u_n(s)) − f(s, u_{n−1}(s))] ds   ∀ x ∈ R,

hence, by the Lipschitz assumption,

|v_{n+1}(x)| ≤ | ∫_{x_0}^x M |v_n(s)| ds |

for every x ∈ R and n ≥ 1. In particular, if we define a_n = max_{x∈[x_0−τ,x_0+τ]} |v_n(x)|, given
x ∈ [x0 − τ, x0 + τ ] it follows that [x0 , x] ⊂ [x0 − τ, x0 + τ ]. Therefore
| ∫_{x_0}^x |v_n(s)| ds | ≤ | ∫_{x_0}^x a_n ds | = a_n |x − x_0| ≤ a_n τ,

therefore

|v_{n+1}(x)| ≤ M a_n τ = a_n/2.

Taking the maximum over x ∈ [x_0 − τ, x_0 + τ], this gives

a_{n+1} ≤ M a_n τ = a_n/2   ∀ n ≥ 1.
This implies (by induction) that a_{n+1} ≤ 2^{−n} a_1, which gives, for x ∈ [x_0 − τ, x_0 + τ],

u_∞(x) = y_0 + Σ_{k=1}^∞ v_k(x),   |v_{k+1}(x)| ≤ 2^{−k} a_1.
By the majorant criterion, the series above converges absolutely. Also, for x ∈ [x_0 − τ, x_0 + τ],

|u_∞(x) − u_n(x)| ≤ Σ_{k=n+1}^∞ |v_k(x)| ≤ a_1 Σ_{k=n+1}^∞ 2^{1−k} = a_1 2^{1−n} → 0 as n → ∞,

which proves that the sequence of functions {u_n}_{n=0}^∞ converges uniformly to u_∞ on [x_0 − τ, x_0 + τ].
• Step 4: The limit function u∞ (x) solves (8.22). We want to take the limit, as n → ∞,
in the formula
u_{n+1}(x) = y_0 + ∫_{x_0}^x f(s, u_n(s)) ds   ∀ x ∈ [x_0 − τ, x_0 + τ].
We first observe that the term on the left-hand side converges to u∞ (x) as n → ∞.
To prove the convergence of the right-hand side, we estimate

| ∫_{x_0}^x |f(s, u_∞(s)) − f(s, u_n(s))| ds | ≤ M | ∫_{x_0}^x |u_∞(s) − u_n(s)| ds |,
and the last term converges to 0 as n → ∞ thanks to Theorem 6.42 (since |u∞ − un | → 0
uniformly). This proves that
∫_{x_0}^x f(s, u_n(s)) ds → ∫_{x_0}^x f(s, u_∞(s)) ds   as n → ∞,
and we find a solution on u∞ : [x−1 − τ, x1 ] → R. Then we define x−2 = x−1 − τ and y−2 =
Hence, if we define a = maxx∈[x0 −τ,x0 +τ ] |u1 (x) − u2 (x)|, then the inequality above yields
|u_1(x) − u_2(x)| ≤ M | ∫_{x_0}^x a ds | = M a |x − x_0| ≤ M τ a   ∀ x ∈ [x_0 − τ, x_0 + τ],

therefore

a ≤ M τ a = a/2.
This implies that a = 0, that is, u1 = u2 on [x0 − τ, x0 + τ ].
We can now repeat this argument in subsequent intervals as we did in Step 4. More
precisely, we first repeat the argument in [x0 + τ, x0 + 2τ ] to show that u1 = u2 there, then
in [x0 + 2τ, x0 + 3τ ], and so on. In this way, we deduce that u1 = u2 on [x0 − τ, ∞). Then,
we repeat the argument in [x0 − 2τ, x0 − τ ], then in [x0 − 3τ, x0 − 2τ ], etc., until we conclude
that u1 = u2 on the whole R.
is possible to isolate the highest order derivative u(n) (x) and express it as a function of x and
lower order derivatives of u. Specifically, we assume that
U_n′(x) = u^{(n)}(x) = f(x, u(x), u′(x), . . . , u^{(n−1)}(x)) = f(x, U_1(x), U_2(x), . . . , U_n(x)),
thus converting the original n-th order ODE into a system of first-order ODEs:
U_1′ = U_2,
U_2′ = U_3,
⋮
U_{n−1}′ = U_n,
U_n′ = f(x, U_1(x), U_2(x), . . . , U_n(x)).
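As an illustration of this reduction, the sketch below rewrites u″ = −u with u(0) = 1, u′(0) = 0 (whose solution is u(x) = cos x) as the system U_1′ = U_2, U_2′ = −U_1, and integrates it with the forward Euler method; the step size and horizon are illustrative choices.

```python
import math

# u'' = -u with u(0) = 1, u'(0) = 0 (solution u(x) = cos x), rewritten as
# the first-order system U1' = U2, U2' = -U1 and integrated by forward Euler.
h, steps = 1e-5, 100000          # up to x = 1
U1, U2 = 1.0, 0.0
for _ in range(steps):
    U1, U2 = U1 + h * U2, U2 - h * U1

assert abs(U1 - math.cos(1.0)) < 1e-4
```

Note that U_1 approximates u and U_2 approximates u′, so the same loop also yields the derivative of the solution for free.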
for some x0 ∈ I, then there exists a unique solution (local or global, depending on the
assumptions) satisfying these boundary conditions. As a consequence of this result, one can
for instance recover (and generalize) Proposition 8.17.
Proposition 8.33: Existence and Uniqueness for Linear 2nd Order ODEs
which has a unique solution once we prescribe the values of U1 (x0 ) and U2 (x0 ). Equivalently,
(8.25) has a unique solution once we prescribe the values of u(x0 ) and u′ (x0 ).
Now, let u1 denote the unique solution satisfying u1 (x0 ) = 1 and u′1 (x0 ) = 0, and let
u2 denote the unique solution satisfying u2 (x0 ) = 0 and u′2 (x0 ) = 1. Then, the function
u = Au1 + Bu2 is a solution of (8.25) for every A, B ∈ R.
Vice versa, if u ∈ C 2 (I) solves (8.25) and we set A = u(x0 ) and B = u′ (x0 ), then
v = u − Au1 − Bu2 is a solution of (8.25) satisfying v(x0 ) = v ′ (x0 ) = 0. By uniqueness v
must be identically zero (since the zero function is a solution), therefore u = Au1 + Bu2 as
desired.
a global Lipschitz condition) will provide a solid foundation for studying the generalized form of the Cauchy-Lipschitz Theorem for systems.