Lecture Notes: Metric Spaces - Sergey Mozgovoy

Download as pdf or txt
Download as pdf or txt
You are on page 1of 41

LECTURE NOTES

METRIC SPACES (2018)

SERGEY MOZGOVOY

Contents
1. Metric spaces 2
1.1. Definition and examples 2
1.2. Balls and bounded sets 5
1.3. Open and closed sets 6
1.4. Convergent sequences 8
1.5. Continuous maps 10
1.6. Complete metric spaces 12
1.7. Completion 14
1.8. Banach fixed point theorem 16
2. Topological spaces 18
2.1. Definition and examples 18
2.2. Subspace and product topologies 20
2.3. Hausdorff spaces 22
2.4. Connected topological spaces 23
2.5. Compact topological spaces 26
2.6. Compact metric spaces 28
2.7. Continuous functions on compact spaces 30
3. Normed vector spaces 32
3.1. Definition of a normed vector space 32
3.2. Examples of normed vector spaces 33
3.3. Bounded linear operators 34
3.4. Equivalent norms 36
3.5. Banach spaces 37
Appendix A. Some inequalities 40

Date: November 27, 2018.


2

1. Metric spaces
1.1. Definition and examples. The usual distance between points on the plane is given by
p
d(x, y) = (x1 − y1 )2 + (x2 − y2 )2
where x = (x1 , x2 ) ∈ R2 and y = (y1 , y2 ) ∈ R2 . This distance satisfies in particular the triangle inequality
d(x, z) ≤ d(x, y) + d(y, z), x, y, z ∈ R2 .
Using this example as a motivation we can axiomatize the properties of a distance as follows.
Definition 1.1. A metric space is a pair (X, d), where X is a set and d: X × X → R is map such that for
all x, y, z ∈ X
(0) d(x, y) ≥ 0,
(1) d(x, y) = 0 ⇐⇒ x = y,
(2) d(x, y) = d(y, x) (symmetry),
(3) d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality).
The map d is called a metric (or a distance function). The elements of X are called points of X.
Remark 1.2. Axiom 0 follows from other axioms. Indeed 0 = d(x, x) ≤ d(x, y) + d(y, x) = 2d(x, y), hence
d(x, y) ≥ 0.
Example 1.3 (Euclidean spaces). Let
Rn = { (x1 , . . . , xn ) | xi ∈ R ∀1 ≤ i ≤ n} .
This is a vector space over R, where the sum of elements x, y ∈ Rn is defined by
x + y = (x1 + y1 , . . . , xn + yn ) ∈ Rn
and the product of x ∈ Rn with a scalar λ ∈ R is defined by
λx = (λx1 , . . . , λxn ) ∈ Rn .
Define the scalar product on Rn by
(x, y) = x1 y1 + · · · + xn yn ∈ R.
n
Define the norm on R by q
p
kxk = (x, x) = x21 + · · · + x2n .
Define the map d: Rn × Rn → R by
p
d(x, y) = kx − yk = (x1 − y1 )2 + · · · + (xn − yn )2 .
To prove that d is a metric on Rn we will apply (see Theorem A.1)
Theorem 1.4 (Cauchy-Schwartz inequality). For all x, y ∈ Rn , we have
|(x, y)| ≤ kxk · kyk .
Corollary 1.5 (Triangle inequality). For all x, y ∈ Rn , we have
kx + yk ≤ kxk + kyk .
Proof. We have
2
kx + yk = (x + y, x + y) = (x, x) + 2(x, y) + (y, y)
2 2
≤ kxk + 2 kxk · kyk + kyk = (kxk + kyk)2 ,
hence kx + yk ≤ kxk + kyk. 
n n n
Corollary 1.6. The map d: R × R → R defined above is a metric on R .
Proof. Let us verify the axioms of a metric:
pP
(1) If x = y, then clearly d(x, y) = kx − yk = 0. Conversely, if d(x, y) = (xi − yi )2 = 0 =⇒
2
(xi − yi ) = 0 for all i =⇒ xi = yi for all i =⇒ x = y.
(2) d(x, y) = kx − yk = ky − xk = d(y, x).
3

(3) We have to prove d(x, z) ≤ d(x, y) + d(y, z) or equivalently kx − zk ≤ kx − yk + ky − zk. But


kx − zk = k(x − y) + (y − z)k ≤ kx − yk + ky − zk .


Remark 1.7. The metric space (Rn , d) is called the Euclidean space. The metric d is called the Euclidean
metric (or Euclidean distance).
Example 1.8. Define a metric d∞ on R2 by
d∞ (x, y) = max {|x1 − y1 | , |x2 − y2 |} .
n
More generally, define for x, y ∈ R
kxk∞ = max |xi | , d∞ (x, y) = kx − yk∞ = max |xi − yi | .
1≤i≤n 1≤i≤n
n
Then d∞ is a metric on R different from the Euclidean metric.
Example 1.9. Define a metric d1 on R2 by
d1 (x, y) = |x1 − y1 | + |x2 − y2 | .
More generally, define for x, y ∈ Rn
n
X n
X
kxk1 = |xi | , d1 (x, y) = kx − yk1 = |xi − yi | .
i=1 i=1

Then d1 is a metric on Rn different from the Euclidean metric.


Example 1.10. For any p ≥ 1 and x, y ∈ Rn , define
n
!1/p n
!1/p
X p
X p
kxkp = |x|i , dp (x, y) = kx − ykp = |xi − yi | .
i=1 i=1

To see that dp satisfies the triangle inequality we will apply (see Theorem A.2)
Theorem 1.11 (Minkowski inequality). For any p ≥ 1 and x, y ∈ Rn , we have
n
!1/p n
!1/p n
!1/p
X p
X p
X p
|xi + yi | ≤ |xi | + |yi | .
i=1 i=1 i=1

We can write this inequality as kx + ykp ≤ kxkp + kykp . Triangle inequality for dp follows from
dp (x, z) = kx − zkp = k(x − y) + (y − z)kp ≤ kx − ykp + ky − zkp = dp (x, y) + dp (y, z).
pP
Remark 1.12. For P p = 2, we obtain d2 (x, y) = (xi − yi )2 which is the Euclidean metric. For p = 1, we
obtain d1 (x, y) = |xi − yi | which is the metric considered in the previous example.
Remark 1.13. One can show that for any x ∈ Rn , we have limp→∞ kxkp = maxi |xi | = kxk∞ . This implies
that limp→∞ dp (x, y) = d∞ (x, y).
Example 1.14 (Discrete metric). Let X be any set. Define the discrete metric on X by
(
0 if x = y
d(x, y) =
1 if x 6= y

Example 1.15 (Hamming distance). This is a distance used in information theory and coding theory. Let
X be a set and let n ≥ 1. Define a metric on X n by the rule
d(x, y) = # { 1 ≤ i ≤ n | xi 6= yi } , x, y ∈ X n .
This metric is called the Hamming distance. It is equal to the number of positions where the messages x
P1 we obtain the discrete metric on X. For X = {0, 1} we can write the distance
and y are different. For n =
also in the form d(x, y) = i |xi − yi |.
4

Example 1.16 (Levenshtein distance). This is S a distance between strings used in computer science and
linguistics. Let X be a finite set and let X ∗ = n≥0 X n be the set of finite sequences x = (x1 , . . . , xn ),
n ≥ 0, called words (or strings) in the alphabet X. Define the distance between two words to be the minimal
number of 1-letter edits (insertion, deletion or substitution) needed to obtain one word from the other.
Formally, let x, y be two words of length m, n respectively. For 0 ≤ i ≤ m and 0 ≤ j ≤ n define
(
max {i, j} ij = 0,
`x,y (i, j) = 
min `x,y (i − 1, j) + 1, `x,y (i, j − 1) + 1, `x,y (i − 1, j − 1) + 1xi 6=yj ij > 0
Then define d(x, y) = `x,y (m, n). For example
d(tip, top) = d(ti, to) = d(t, t) + 1 = 1
d(desert, dessert) = d(de, des) = d(de, de) + 1 = 1
Example 1.17 (Distance on graphs). A graph is a pair (V, E), where V is a set, called the set of vertices,
and E is a set of unordered pairs of elements in V , called the set of edges. An edge e = {v, v 0 } ∈ E (also
written as vv 0 ) is called an edge between vertices v and v 0 . Given a function f : E → R>0 , we can interpret
f (e) for an edge e = vv 0 as the time needed to travel between v and v 0 along e. For any path p = e1 . . . en
0
f (ei ). Define d(v, v 0 ) as the minimal value of f (p) for the paths p
P
between vertices v and v , let f (p) =
0
between v and v (and +∞ if there are no such paths). One can show that (V, d) is a metric space (if we
also allow the value +∞). For example, if f (e) = 1 for every e ∈ E, then d(v, v 0 ) is equal to the number of
edges along the shortest path between v and v 0 .
Example 1.18. We can interpret the space Rn as the set of all functions f : {1, . . . , n} → R. Let us
generalize this construction to infinite domains. Given real numbers a < b, let C[a, b] be the set of all
continuous functions f : [a, b] → R. Define a metric d = d∞ on C[a, b] by
d(f, g) = sup |f (x) − g(x)| , f, g ∈ C[a, b].
x∈[a,b]

Let us check the axioms of a metric


(1) If f = g then clearly d(f, g) = 0. Conversely, if d(f, g) = 0 then |f (x) − g(x)| = 0 for all x ∈ [a, b].
Therefore f (x) = g(x) for all x ∈ [a, b], hence f = g.
(2) d(f, g) = d(g, f ) is clear.
(3) We have to show that for any f, g, h ∈ C[a, b], we have d(f, g) ≤ d(f, h) + d(h, g). For any x ∈ [a, b]
we have
|f (x) − g(x)| ≤ |f (x) − h(x)| + |h(x) − g(x)| ≤ d(f, h) + d(h, g).
Taking the supremum over all x ∈ [a, b] we obtain the required inequality.
Example 1.19. Define a metric d1 on C[a, b] by
Z b
d1 (f, g) = |f (x) − g(x)| dx, f, g ∈ C[a, b].
a
The axioms 1-2 of a metric are verified as before. The triangle inequality
d1 (f, g) ≤ d1 (f, h) + d1 (h, g)
follows from the inequality
Z b Z b Z b
|u(x) + v(x)| dx ≤ |u(x)| dx + |v(x)| dx
a a a
(substitute f − h by u, h − g by v and f − g by u + v).
Example 1.20 (Subspace). Let (X, d) be a metric space and let A ⊂ X be any subset. Then A together
with the restriction of d to A × A is a metric space denoted by (A, dA ) and called a subspace of (X, d). The
metric dA on A is called the induced metric.
Example 1.21. Let (X, d) and (Y, d) be two metric spaces. Then we can equip X ×Y with different induced
metrics. For example
d1 ((x1 , y1 ), (x2 , y2 )) = d(x1 , x2 ) + d(y1 , y2 )
p
d2 ((x1 , y1 ), (x2 , y2 )) = d(x1 , x2 )2 + d(y1 , y2 )2
5

1.2. Balls and bounded sets. Let (X, d) be a metric space.


Definition 1.22. For any x ∈ X and r > 0 define
(1) the open ball B(x, r) with center x and radius r by
B(x, r) = { y ∈ X | d(x, y) < r} .
(2) the closed ball B(x, r) with center x and radius r by
B(x, r) = { y ∈ X | d(x, y) ≤ r} .
Example 1.23. Describe an open ball B(0, 1) and a closed ball B(0, 1) in R2 with metrics d2 , d1 , d∞ .
Example 1.24. Let X be a discrete metric space and x ∈ X. Describe B(x, 1), B(x, 1), B(x, 2).
Definition 1.25. A subset A ⊂ X is called bounded if it is contained in some open ball, that is, if there
exist x ∈ X and r > 0 s.t. A ⊂ B(X, r)
Definition 1.26. Define the diameter of a subset A ⊂ X to be
d(A): = sup d(x, y).
x,y∈A

Example 1.27. The diameter of B(0, 1) in the Euclidean space R2 is 2.


Remark 1.28. If A ⊂ B then d(A) ≤ d(B). Indeed, for any x, y ∈ A we have also x, y ∈ B. Therefore
d(x, y) ≤ d(B).
Taking the supremum over all x, y ∈ A we obtain d(A) ≤ d(B).
Lemma 1.29. The diameter of an open ball B(x, r) is ≤ 2r.
Proof. For any two points y, z ∈ B(x, r) we have
d(y, z) ≤ d(y, x) + d(x, z) < r + r = 2r.
This implies
d(B(x, r)) = sup d(y, z) ≤ 2r.
y,z∈B(x,r)

Lemma 1.30. A subset A ⊂ X is bounded if and only if its diameter is finite.
Proof. If A is bounded then A ⊂ B(x, r) for some x ∈ X, r > 0. Then
d(A) ≤ d(B(x, r)) ≤ 2r < ∞.
Conversely, assume that d(A) < ∞ and let x ∈ A, r = d(A) + 1. Then, for any y ∈ A,
d(x, y) ≤ d(A) < r =⇒ y ∈ B(x, r).
This means that A ⊂ B(x, r). 
2 2
Example 1.31. (1) y = x in R is unbounded
(2) Let X have a discrete metric and assume that #X ≥ 2. Then d(X) = 1. The set X and any its
subset are bounded.
6

1.3. Open and closed sets.


1.3.1. Open sets. Let (X, d) be a metric space.
Definition 1.32. Let A ⊂ X be a subset.
(1) A point x ∈ A is called an internal (or interior) point if there exists r > 0 such that B(x, r) ⊂ A.
(2) A subset A ⊂ X is called an open set if every point of A is an internal point. That is, if for any
x ∈ A there exists r > 0 such that B(x, r) ⊂ A.
(3) The set of all internal points of A is denoted by A◦ and is called the interior of A.
Lemma 1.33. Every open ball B(x, r) in X is an open set in X.
Proof. For any y ∈ B(x, r) we have to find r0 > 0 such that B(y, r0 ) ⊂ B(x, r). As y ∈ B(x, r) =⇒
d(x, y) < r =⇒ r − d(x, y) > 0. Let r0 = r − d(x, y). Then, for any z ∈ B(y, r0 ),
d(x, z) ≤ d(x, y) + d(y, z) < d(x, y) + r0 = r.
This implies that z ∈ B(x, r). Therefore B(y, r0 ) ⊂ B(x, r) as required. 
Example 1.34. Any open interval (a, b) ⊂ R is an open set as we can represent it as an open Ball B(x, r),
where x = a+b b−a
2 , r = 2 .

Theorem 1.35. We have


(1) The sets ∅ and X are open sets in X.
(2) The union of any collection of open sets is an open set.
(3) The intersection of any finite collection of open sets is an open set.
Proof. (1) Clear. S
(2) Let (Ui )i∈I be a collection of open sets in X. Let U = i∈I Ui . For any x ∈ U , there exists i ∈ I:
x ∈ Ui . As Ui is open, ∃r > 0: B(x, r) ⊂ Ui ⊂ U . This implies that x is an internal point of U . This
proves that U is open. T
(3) Let (Ui )i∈I be a finite collection of open sets in X. Let U = i∈I Ui . Given x ∈ U , we have x ∈ Ui
∀i ∈ I. For any i ∈ I the set Ui is open, therefore ∃ri > 0: x ∈ B(x, ri ). Let r = mini∈I
T ri (we can
take this minimum as I is finite). Then B(x, r) ⊂ Ui for any i ∈ I. Therefore B(x, r) ⊂ i∈I Ui = U .
This implies that x is an internal point of U . This proves that U is open.

S
Example 1.36. The union of open intervals in R is an open set. In particular, the set (a, +∞) = n≥1 (a, a+
n) is an open set in R.
Remark 1.37. It is not true in general that the intersection of any collection of open sets is an open set.
For example, consider a collection of open sets in R defined by
 
1 1
Un = − , , n ≥ 1.
n n
Their intersection is
\ \  1 1
Un = − , = {0} ,
n n
n≥1 n≥1
which is not an open set.
Lemma 1.38. For any subset A ⊂ X its interior A◦ is an open set. It is the union of all open subsets of A.
Proof. For every x ∈ A◦ , there exists εx > 0 such that B(x, εx ) ⊂ A. Then
[
A◦ ⊂ U = B(x, εx ) ⊂ A.
x∈A◦

The set U is open as a union of open balls. Therefore all points of U are internal in U , hence also in A. This
implies that U ⊂ A◦ and we conclude that A◦ = U and it is open.
If V ⊂ A is the union of all open subsets of A, then A◦ ⊂ V . On the other hand all points of V are
internal in A, hence V ⊂ A◦ and we conclude that A◦ = V . 
7

1.3.2. Closed sets.


Definition 1.39. Let A ⊂ X be a subset.
(1) A point x ∈ X is called a limit point of A if for any ε > 0 there exists y ∈ A\ {x} such that
d(x, y) < ε.
(2) A is called a closed set if it contains all of its limit points.
(3) Define the closure of A to be the union of A and the set of all its limit points. It is denoted by A.
Theorem 1.40. A subset A ⊂ X is a closed set if and only if X\A is an open set.
Proof. Assume that A is a closed set. To prove that X\A is open, we have to show that for any x ∈ X\A
there exists ε > 0 such that B(x, ε) ⊂ X\A. If this is not the case then for any ε > 0 the set B(x, ε) is not
contained in X\A, that is,
B(x, ε) ∩ A 6= ∅.
This implies that there exists y ∈ A such that d(x, y) < ε. This means that x is a limit point of A. As A is
closed, we conclude that x ∈ A. But by our assumption x ∈ X\A, a contradiction.
Conversely, assume that X\A is an open set. We have to show that every limit point x of A is contained
in A. Assume that x ∈ X\A. As X\A is open, there exists ε > 0 such that B(x, ε) ⊂ X\A, that is,
B(x, ε) ∩ A = ∅. But this implies that x can not be a limit point of A, a contradiction. 
Example 1.41. The closed interval [a, b] ⊂ R is a closed set. Indeed, its complement R\[a, b] = (−∞, a) ∪
(b, ∞) is a union of open intervals and we know that they are open sets.
Theorem 1.42. Let (X, d) be a metric space. Then
(1) The sets ∅ and X are closed sets in X.
(2) The intersection of any collection of closed sets is a closed set.
(3) The union of any finite collection of closed sets is an closed set.
Proof. (1) X\∅ = X is an open set =⇒ ∅ is closed. X\X = ∅ is an open set =⇒ X is closed.
(2) Let (Fi )i∈I be a collection of closed sets. By the De Morgan’s law
\ [
X\ Fi = (X\Fi ).
i∈I i∈I
S T T
The sets X\Fi are open =⇒ i∈I (X\Fi ) is open =⇒ X\ i∈I Fi is open =⇒ i∈I Fi is closed.
(3) Let (Fi )i∈I be a finite collection of closed sets. By the De Morgan’s law
[ \
X\ Fi = (X\Fi ).
i∈I i∈I
T S S
The sets X\Fi are open =⇒ i∈I (X\Fi ) is open =⇒ X\ i∈I Fi is open =⇒ i∈I Fi is closed.

Theorem 1.43. For any subset A ⊂ X its closure A is a closed set. It is the intersection of all closed sets
containing A.
Proof. Let x be a limit point of A such that x ∈
/ A. Then ∀ε > 0 ∃y ∈ A with d(x, y) < ε/2. If y ∈ A\A
then ∃z ∈ A with d(y, z) < ε/2, hence
d(x, z) ≤ d(x, y) + d(y, z) < ε/2 + ε/2 = ε.
If y ∈ A then we take z = y and also obtain d(x, z) < ε/2 < ε. This implies that x is a limit point of A,
hence x ∈ A, a contradiction. We conclude that A is closed.
If F is any closed set containing A then all limitTpoints of A are in F . Therefore A ⊂ F . The set A is
one of the closed sets containing A. Therefore A = A⊂F − cl F . 
Example 1.44. Consider A = (0, 1) ⊂ R. Its closure is [0, 1] and we have seen that it a closed set.
8

1.4. Convergent sequences. Let (X, d) be a metric space. A sequence (xn )∞ n=1 of points in X is a collection
of elements x1 , x2 , . . . in X. We denote a sequence (xn )∞
n=1 also by (x n ) n≥1 or just (xn ).
Definition 1.45. A sequence (xn ) of points in X is called convergent to x ∈ X if ∀ε > 0 ∃N > 0 such that
d(xn , x) < ε ∀n ≥ N.
The point x is called the limit of the sequence and is denoted by limn→∞ xn . We also write xn → x as
n → ∞.
Remark 1.46. The definition of convergence of a sequence (xn ) to x is equivalent to the statement that
d(xn , x) → 0, n → ∞.
Lemma 1.47. Every convergent sequence in a metric space has a unique limit.
Proof. Assume that (xn )n≥1 converges to x, y ∈ X. For any ε > 0 there exist N, N 0 > 0 such that
d(xn , x) < ε for n ≥ N and d(xn , y) < ε for n ≥ N 0 . Then for any n ≥ N, N 0 we have
d(x, y) ≤ d(x, xn ) + d(xn , y) < ε + ε = 2ε.
This inequality is true for any ε > 0 =⇒ d(x, y) = 0 =⇒ x = y. 
n−1
Example 1.48. Consider a sequence (xn )n≥1 in R2 with xn = n1 , n . Then limn→∞ xn = x = (0, 1).


Indeed, s 2  2 r √
1 n−1 1 1 2
d(xn , x) = −0 + −1 = + 2 = → 0, n → ∞.
n n n2 n n
This example shows that convergence in Rk can be studied coordinatewise. This is the content of the next
lemma.
(n)
Lemma 1.49. A sequence (x(n) )n≥1 in the Euclidean space Rk converges to x ∈ Rk ⇐⇒ xi → xi as
n → ∞, for all 1 ≤ i ≤ k.
q
(n) (n)
P (n) (n)
Proof. =⇒ If x → x as n → ∞, then d(x , x) = i (xi − xi )2 → 0 =⇒ (xi − xi )2 → 0 for all
(n)
1 ≤ i ≤ k =⇒ xi → xi as n → ∞. q
(n) (n) (n)
− xi )2 → 0 for all i, hence d(x(n) , x) =
P
⇐= If xi → xi as n → ∞ for all i, then (xi i (xi − xi )2 → 0.
This means that x(n) → x as n → ∞. 
Sequences are useful for the characterization of limit points of subsets.
Lemma 1.50. Let A ⊂ X be a subset. A point x ∈ X is a limit point of A ⊂ X ⇐⇒ there exists a
sequence in A\ {x} convergent to x.
Proof. =⇒ Let x ∈ X be a limit point of A. Then, for any n ≥ 1 there exists xn ∈ A\ {x} such that
d(xn , x) < 1/n. This implies that the sequence (xn )n≥1 is contained in A\ {x} and converges to x.
⇐= Assume that there exists a sequence (xn )n≥1 in A\ {x} convergent to x. Then, for any ε > 0 there exists
n ≥ 1 such that d(xn , x) < ε. This implies that x is a limit point of A. 
Corollary 1.51. A subset A ⊂ X is closed ⇐⇒ the limit of any convergent sequence of points in A is
contained in A.
Proof. =⇒ If A is closed and (xn ) is a sequence in A convergent to x ∈ X, then x is a limit point of A, hence
x ∈ A. ⇐= If x is a limit point of A, then there exists a sequence (xn ) in A\ {x} convergent to x. By our
assumption this limit is contained in A, hence x ∈ A and A is closed set. 
Example 1.52. Consider the space C[0, 1] equipped with the metric d1 . Consider the sequence of elements
xn ∈ C[0, 1] defined by xn (t) = tn for t ∈ [0, 1]. For every t ∈ [0, 1), we have limn→∞ xn (t) = limn→∞ tn = 0.
Therefore we can guess that the sequence (xn ) converges to the constant zero function 0 ∈ C[0, 1]. This is
indeed the case as
Z 1 Z 1 1
n n tn+1 1
d1 (xn , 0) = |t − 0| dt = t dt = = → 0, n → ∞.
0 0 n+1 0 n+1
9

On the other hand, let us equip C[0, 1] with the metric d∞ (called the supremum metric). Then
d∞ (xn , 0) = sup |tn − 0| = 1,
t∈[0,1]

hence xn 6→ 0 as n → ∞ with respect to the metric d∞ .


Example 1.53. Let us again equip C[a, b] with the supremum metric d∞ . Then a sequence of functions
(xn ) in C[a, b] converges to a function x ∈ C[a, b] ⇐⇒ ∀ε > 0 ∃N > 0:
d∞ (xn , x) = sup |xn (t) − x(t)| ≤ ε ∀n ≥ N
t∈[a,b]

or equivalently
|xn (t) − x(t)| ≤ ε ∀n ≥ N ∀t ∈ [a, b].
This leads us to the following definition.
Definition 1.54. Let x(t) be a function and (xn (t))n≥1 be a sequence of functions on the interval [a, b].
(1) A sequence (xn (t))n≥1 is said to converge pointwise to a function x(t) if
∀t ∈ [a, b] ∀ε > 0 ∃N > 0: |xn (t) − x(t)| < ε ∀n ≥ N
or equivalently, ∀t ∈ [a, b] we have limn→∞ xn (t) = x(t).
(2) A sequence (xn (t)) is said to converge uniformly to a function x(t) if
∀ε > 0 ∃N > 0: |xn (t) − x(t)| < ε ∀n ≥ N ∀t ∈ [a, b].
Remark 1.55.
(1) Note that for pointwise convergence we can choose N depending on t, while for uniform convergence
we have to choose N that works for all t simultaneously (uniformly).
(2) Uniform convergence implies pointwise convergence.
(3) We can see from the previous example that uniform convergence is equivalent to the convergence
with respect to the metric d∞ .
Example 1.56. Consider the sequence of functions xn (t) = tn − t2n in C[0, 1]. It converges pointwise to
the zero function as lim xn (t) = limn→∞ (tn − t2n ) = 0 for every t ∈ [0, 1]. On the other hand this sequence
does not converge uniformly to the zero function as
1
d∞ (xn , 0) = sup (tn − t2n ) = ∀n ≥ 1.
t∈[0,1] 4
Example 1.57. We have seen earlier that the sequence of functions xn (t) = tn in C[0, 1] does not converge
to the zero function with respect to the metric d∞ . But does it converge to any other function x ∈ C[0, 1]?
If this is the case, then (xn ) converges uniformly to x, hence also pointwise. Then, for any t ∈ [0, 1], we have
(
0, 0 ≤ t < 1
x(t) = lim xn (t) = lim tn =
n→∞ n→∞ 1, t = 1
But this function is not continuous, hence the sequence (xn ) does not converge in (C[0, 1], d∞ ).
Lemma 1.58. Let (xn ) be a sequence in X convergent to x ∈ X and let y ∈ X. Then
lim d(xn , y) = d(x, y).
n→∞

Proof. In the proof of this lemma we will need the following useful result
Lemma 1.59. For any points x, y, z ∈ X we have
|d(x, y) − d(x, z)| ≤ d(y, z).
Proof. From the triangle inequalities d(x, y) ≤ d(x, z) + d(y, z) and d(x, z) ≤ d(x, y) + d(y, z), we obtain
d(x, y) − d(x, z) ≤ d(y, z) and d(x, z) − d(x, y) ≤ d(y, z), hence the required statement. 
Going back to the original lemma, we obtain
|d(xn , y) − d(x, y)| ≤ d(xn , x) → 0, n → ∞.
Therefore d(xn , y) → d(x, y) as n → ∞. 
10

1.5. Continuous maps.


Definition 1.60. Let (X, d), (Y, d0 ) be two metric spaces. A map f : X → Y is called
(1) continuous at a point x ∈ X if ∀ε > 0 ∃δ > 0:
d(x, y) < δ =⇒ d0 (f (x), f (y)) < ε.
(2) continuous if it is continuous at every point of X.
Remark 1.61. Continuity at the point x can be written in the form
∀ε > 0 ∃δ > 0: y ∈ B(x, δ) =⇒ f (y) ∈ B(f (x), ε)
or in the form
∀ε > 0 ∃δ > 0: f (B(x, δ)) ⊂ B(f (x), ε).
Example 1.62. A constant function f : X → Y is continuous. Assume that f (x) = z for all x ∈ X and
some z ∈ Y . For any ε > 0 choose arbitrary δ > 0. Then f (B(x, δ)) = {z} ∈ B(f (x), ε).
Example 1.63. Let X have the discrete metric. Then any map f : X → Y is continuous. For every ε > 0
we choose δ = 1. Then B(x, δ) = B(x, 1) = {x} and the condition f (x) ∈ B(f (x), ε) is obviously satisfied.
Theorem 1.64. A map f : X → Y is continuous ⇐⇒ for any sequence (xn ) in X convergent to x ∈ X,
the sequence (f (xn )) converges to f (x).
Proof. =⇒ Let f be continuous and let (xn ) be a sequence in X convergent to x ∈ X. Then ∀ε > 0 ∃δ > 0
d(x, y) < δ =⇒ d0 (f (x), f (y)) < ε.
As xn → x, there exists N > 0
d(x, xn ) < δ ∀n ≥ N.
This implies
d0 (f (x), f (xn )) < ε ∀n ≥ N,
hence f (xn ) → f (x) as n → ∞.
⇐= Assume that f is not continuous at x ∈ X. This means that ∃ε > 0 ∀δ > 0 ∃y ∈ X such that
d(x, y) < δ, d0 (f (x), f (y)) ≥ ε.
For any n ≥ 1, choosing δ = 1/n, we can find xn ∈ X with d(x, xn ) < 1/n and d0 (f (x), f (xn )) ≥ ε. In this
way we obtain a sequence (xn ) such that xn → x, but f (xn ) 6→ f (x). A contradiction. 
Theorem 1.65. A map f : X → Y is continuous ⇐⇒ for any open set U ⊂ Y the preimage f −1 (U ) =
{ x ∈ X | f (x) ∈ U } is open in X.
Proof. =⇒ Assume that f : X → Y is continuous and let U ⊂ Y be open. For any point x ∈ f −1 (U ), the
point f (x) ∈ U is an internal point of U . Therefore ∃ε > 0 with B(f (x), ε) ⊂ U . By the continuity of f ,
there exists δ > 0 such that
f (B(x, δ)) ⊂ B(f (x), ε) ⊂ U.
Then B(x, δ) ⊂ f (U ) =⇒ x is an internal point of f −1 (U ). Therefore f −1 (U ) is an open set.
−1

⇐= Let us show that f is continuous at any point x ∈ X. For every ε > 0, the set U = B(f (x), ε) is open
in Y . Therefore its preimage f −1 (U ) is open in X. In particular, the point x ∈ f −1 (U ) is an internal point.
Therefore there exists δ > 0 such that B(x, δ) ⊂ f −1 (U ), that is,
f (B(x, δ)) ⊂ U = B(f (x), δ).
This meas that f is continuous at x ∈ X. Therefore f is continuous. 
Lemma 1.66. Let f : X → Y and g: Y → Z be continuous maps between metric spaces. Then their compo-
sition gf : X → Z is also a continuous map.
Proof. We have to show that for any open set U ⊂ Z the preimage (gf )−1 (U ) is also open. We have
(gf )−1 (U ) = f −1 (g −1 (U )).
If U ⊂ Z is open, then g −1 (U ) ⊂ Y is open as g is continuous. Therefore f −1 (g −1 (U )) is open as f is
continuous. This implies that (gf )−1 (U ) is open. 
11

Definition 1.67. A map f : (X, d) → (Y, d0 ) is called Lipschitz continuous if there exists L ≥ 0 such that
d0 (f (x), f (y)) ≤ L · d(x, y) ∀x, y ∈ X.
The number L above is called a Lipschitz constant for f .
Theorem 1.68. If f : X → Y is Lipschitz continuous, then it is continuous.
Proof. Let L > 0 be a Lipschitz constant for f . For any ε > 0 choose δ = ε/L. Then if d(x, y) < δ =⇒
d0 (f (x), f (y)) ≤ L · d(x, y) < Lδ = ε. Therefore f is continuous. 

Example 1.69. Consider the function f (t) = t for t ∈ [0, 1]. This function is continuous, although it is
not Lipschitz continuous. Indeed, if f would be Lipschitz continuous with a Lipschitz constant L > 0, then
in particular, √
t − 0 ≤ L · |t − 0| = Lt,

√ √
hence t ≥ L1 for t > 0. But this is obviously wrong for t = 2L 1
.
Lemma 1.70. Let f : [a, b] → R be a function having a continuous derivative. Then f is Lipschitz continuous
with the Lipschitz constant maxt∈[a,b] |f 0 (t)|.
Proof. The function f 0 : [a, b] → R is continuous, hence bounded. Let L = maxt∈[a,b] |f 0 (t)|. By the mean
value theorem, for any a ≤ x < y ≤ b, there exists t ∈ (x, y) such that
f (y) − f (x)
= f 0 (t).
y−x
Therefore
|f (y) − f (x)| = |f 0 (t)| · |x − y| ≤ L · |x − y|
and f is Lipschitz continuous. 
Definition 1.71. A map f : (X, d) → (Y, d0 ) between two metric spaces is called an isometry if
d0 (f (x), f (y)) = d(x, y) ∀x, y ∈ X.
An isometry is called a global isometry if it is bijective.
Lemma 1.72. Let f : (X, d) → (Y, d0 ) be an isometry. Then
(1) f is continuous.
(2) f is injective.
(3) If f is a global isometry, then f −1 : Y → X is continuous.
Proof. (1) An isometry is Lipschitz continuous with a Lipschitz constant L = 1. Therefore it is also
continuous.
(2) Assume that x, y ∈ X satisfy f (x) = f (y). Then
d(x, y) = d0 (f (x), f (y)) = 0,
hence x = y. This means that f is injective.
(3) If f is a global isometry then f −1 : Y → X is also an isometry. Therefore by (1) it is continuous. 
12

1.6. Complete metric spaces. Let (X, d) be a metric space.


Definition 1.73. A sequence (xn )n≥1 in X is called a Cauchy sequence (or a fundamental sequence) if
∀ε > 0 ∃N > 0:
d(xm , xn ) < ε ∀m, n ≥ N.
Lemma 1.74. A convergent sequence in a metric space is a Cauchy sequence.
Proof. Let (xn )n≥1 be a sequence in X convergent to x ∈ X. Then ∀ε > 0 ∃N > 0
ε
d(xn , x) < ∀n ≥ N.
2
This implies that ∀m, n ≥ N
ε ε
d(xm , xn ) ≤ d(xm , x) + d(x, xn ) < + = ε,
2 2
hence (xn )n≥1 is a Cauchy sequence. 
Example 1.75. There exist Cauchy sequences that are not convergent. For example, the sequence (1/n)n≥1
is a Cauchy sequence which is not convergent in (0, 1]. It is a Cauchy sequence as it converges to 0 in R. It
is not convergent in (0, 1] as the limit limn→∞ n1 = 0 ∈
/ (0, 1].
Definition 1.76. A metric space (X, d) is called a complete metric space if any Cauchy sequence in X
converges in X.
Example 1.77. We obtain from the previous example that (0, 1] is not complete.
Theorem 1.78. The space R is a complete metric space.
We will need some auxiliary statements.
Theorem 1.79. A bounded monotone sequence in R is convergent.
Proof. Without loss of generality we assume that (xn ) is a non-decreasing sequence bounded from above.
By a property of real numbers there exists x = sup { xn | n ≥ 1}. Then for any ε > 0 there exists N ≥ 1
such that xN > x − ε. Therefore for all n ≥ N we have x − ε < xN ≤ xn ≤ x, hence |xn − x| < ε. This
implies that (xn ) converges to x. 
Theorem 1.80 (Bolzano-Weierstrass theorem). A bounded sequence in R has a convergent subsequence.
Proof. Let (xn ) be a bounded sequence. We will say that n ≥ 1 is a peak if xn > xm for all m > n. If
there exist infinitely many peaks n1 < n2 < . . . , then we have xn1 > xn2 > . . . , hence we found a monotone
bounded subsequence, which is convergent by the previous result. If there exist only finitely many peaks,
then we can choose N ≥ 1 such every n ≥ N is not a peak. As n1 = N is not a peak, there exists n2 > n1
with xn1 ≤ xn2 . Continuing this process we find a bounded monotone subsequence xn1 ≤ xn2 ≤ . . . which
is convergent by the previous result. 
Proof of Theorem 1.78. Let (xn ) be a Cauchy sequence. This sequence is bounded. Indeed, for any ε > 0
there exists N ≥ 1 such that d(xm , xn ) ≤ ε for all m, n ≥ N . In particular, d(xN , xn ) ≤ ε for n ≥ 1.
Choosing r = max { ε, d(xN , xn ) | 1 ≤ n ≤ N }, we obtain xn ∈ B(xN , r + 1) for all n ≥ 1.
By the Bolzano-Weierstrass theorem there exists a convergent subsequence (xnk ), with a limit, say x ∈ R.
For every ε > 0 there exists N ≥ 1 such that d(xm , xn ) < 2ε for all m, n ≥ N . On the other hand there
exists K ≥ 1 such that nK ≥ N and d(xnk , x) < 2ε for all k ≥ K. This implies that for all n ≥ N we have
d(xn , x) ≤ d(xn , xnK ) + d(xnK , x) < 2ε + 2ε = ε. 
Example 1.81. Let us show that the Euclidean space Rk is complete. Let (x(n) )n≥1 be a Cauchy sequence
(n)
in Rk . Then for any 1 ≤ i ≤ k, the sequence (xi )n≥1 in R is a Cauchy sequence: ∀ε > 0 ∃N > 0
qX
(m) (n)
∀m, n ≥ N : d(x(m) , x(n) ) = (xi − xi )2 < ε,

(m) (n) (n)
hence xi − xi < ε. But we know that R is complete, hence the sequence (xi )n≥1 converges to some
xi ∈ R. We conclude that (x(n) )n≥1 converges to x = (x1 , . . . , xk ).
13

Lemma 1.82. Let (X, d) be a metric space and A ⊂ X be a subset. Then


(1) If X is complete and A is closed then (A, d) is complete.
(2) If (A, d) is complete then A is closed.
Proof. (1) Assume that (xn ) is a Cauchy sequence in A. Then it is Cauchy in X =⇒ it is convergent to
some x ∈ X. As A is closed, we conclude that x ∈ A. This means that (xn ) is convergent to a point in A.
Therefore A is complete.
(2) Assume that A is complete. Let (xn ) be a sequence in A convergent to x ∈ X. We have to show that
x ∈ A. As (xn ) is a convergent sequence, it is a Cauchy sequence. Therefore it converges to some y ∈ A.
We conclude from the uniqueness of limits that x = y ∈ A. 
Example 1.83. A subspace [a, b] ⊂ R is complete as R is complete and [a, b] is closed. A subspace Q ⊂ R
is not complete as Q is not closed in R (we have seen that Q = R).
Theorem 1.84. The space C[a, b] with the supremum metric d∞ is complete.
Proof. Let (xn ) be a Cauchy sequence in C[a, b]. Then, for any t0 ∈ [a, b], the sequence (xn (t0 ))n≥1 of real
numbers is also a Cauchy sequence: ∀ε > 0 ∃N > 0 ∀m, n ≥ N :
(1) |xm (t0 ) − xn (t0 )| ≤ sup |xm (t) − xn (t)| = d∞ (xm , xn ) < ε.
t∈[a,b]

Therefore this sequence converges to some x(t0 ) ∈ R. In this way we obtain a function x: [a, b] → R. We
have to show that (xn ) converges uniformly to x and that x is a continuous function. We can write the
Cauchy condition in the form: ∀ε > 0 ∃N > 0
∀m, n ≥ N ∀t ∈ [a, b]: |xm (t) − xn (t)| < ε.
Taking the limit m → ∞ we obtain
∀n ≥ N ∀t ∈ [a, b]: |x(t) − xn (t)| ≤ ε
hence the sequence of functions (xn ) converges uniformly to x: [a, b] → R. The continuity of x follows from
the next lemma. 
Lemma 1.85. Assume that a sequence of functions (xn (t)) in C[a, b] converges uniformly to a function
x: [a, b] → R. Then x ∈ C[a, b].
Proof. Let us show that x: [a, b] → R is continuous at t0 ∈ [a, b]. For any ε > 0 ∃N > 0
ε
|x(t) − xn (t)| < ∀n ≥ N ∀t ∈ [a, b].
3
We will need just n = N . As xn ∈ C[a, b], there exists δ > 0
ε
|xn (t) − xn (t0 )| < ∀t ∈ [t0 − δ, t0 + δ].
3
Then ∀t ∈ [t0 − δ, t0 + δ]:
ε ε ε
|x(t) − x(t0 )| ≤ |x(t) − xn (t)| + |xn (t) − xn (t0 )| + |xn (t0 ) − x(t0 )| < + + = ε.
3 3 3
This implies that x is continuous at t0 . Therefore x is continuous on [a, b]. 
14

1.7. Completion. Assume that X is a metric space which is not complete. We would like to embed X into
a larger metric space X ⊂ X ∗ such that X ∗ is complete and such that the closure of X is equal to X ∗ .
Definition 1.86. A subset A ⊂ X is called dense in X if A = X.
Example 1.87.
(1) Let X = [0, 1] and A = (0, 1) ⊂ X. Then the closure of A is [0, 1] = X, hence A is dense in X.
(2) The set Q is dense in R. Indeed, any x ∈ R has a decimal representation
x = a0 .a1 a2 . . .
where ai ∈ Z and 0 ≤ ai ≤ 9 for i ≥ 1 (the digits). For any n ≥ 1 the numbers
n
X ai
xn = a0 .a1 . . . an = i
i=0
10
are rational. Moreover, limn→∞ xn = x as
X ai X 9 1
|x − xn | = 0.0 . . . 0an+1 an+2 · · · = i
≤ = n → 0, n → ∞.
10 10i 10
i≥n+1 i≥n+1

This implies that x is a limit point of Q. Therefore, the closure of Q equals R.


Lemma 1.88. Let f, g: X → Y be two continuous maps between metric spaces. Assume that there exists a
dense subset A ⊂ X such that f (x) = g(x) for all x ∈ X. Then f (x) = g(x) for all x ∈ X.
Proof. Let F = { x ∈ X | f (x) = g(x)}. We claim that F is closed in X. Indeed, let (xn ) be a sequence in F
convergent to x ∈ X. By the continuity of f and g, we have f (xn ) → f (x) and g(xn ) → g(x) as n → ∞. But
xn ∈ F =⇒ f (xn ) = g(xn ) =⇒ the limits of these sequences coincide, hence f (x) = g(x) and x ∈ F . This
means that F is closed. But A ⊂ F , hence X = A ⊂ F and F = X. Therefore f (x) = g(x) ∀x ∈ X. 
Definition 1.89. A completion of a metric space (X, d) is a complete metric space (X ∗ , d∗ ) together with
an isometry i: X → X ∗ such that i(X) = X ∗ (that is, i(X) is dense in X ∗ ).
Remark 1.90. We have seen that an isometry is an injective map. Therefore we will often identify X with
its image i(X) ⊂ X ∗ .
Example 1.91. Let Q be equipped with the Euclidean metric. Its completion is the Euclidean space R
with the canonical inclusion i: Q → R, x 7→ x. Indeed
(1) We know that R is complete.
(2) The canonical inclusion is an isometry.
(3) We have seen that Q is dense in R.
Theorem 1.92. Every metric space has a completion (unique up to an isometry).
Proof. The idea of the construction is the following:
(1) Define an equivalence relation on the set of all Cauchy sequences in X
(xn ) ∼ (yn ) ⇐⇒ lim d(xn , yn ) = 0.
n→∞

(2) Define the completion X to be the set of all equivalence classes of Cauchy sequences.
(3) Define the metric on X ∗ by d([(xn )], [(yn )]) = limn→∞ d(xn , yn ).
(4) Construct an embedding i: X → X ∗ by sending x ∈ X to the (equivalence class of) constant sequence
(xn ) with xn = x for all n ≥ 1.
Uniqueness: Assume that X has two completions i: X → X ∗ and i0 : X → X 0 . Let us construct an isometry
φ: X ∗ → X 0 that preserves X. We will identify x ∈ X with i(x) ∈ X ∗ and i0 (x) ∈ X 0 . For any point x∗ ∈ X ∗
there exists a sequence (xn ) in X that converges to x∗ . The sequence (xn ) is a Cauchy sequence, therefore it
converges to some x0 ∈ X 0 . This point does not depend on the choice of the sequence (xn ) converging to x∗ .
We define φ(x∗ ) = x0 . For any x ∈ X, we have φ(x) = x. To see that φ: X ∗ → X 0 is an isometry, consider
two Cauchy sequences (xn ), (yn ) in X that converge to x∗ , y ∗ ∈ X ∗ and x0 , y 0 ∈ X 0 . Then
d∗ (x∗ , y ∗ ) = lim d(xn , yn ) = d0 (x0 , y 0 ).
n→∞
15

This implies that d0 (φ(x∗ ), φ(y ∗ )) = d∗ (x∗ , y ∗ ). We can also show that φ is a bijection.
Existence:
1. Construction of X ∗ . Let X be the set of all Cauchy sequences in X. We call two sequences (xn ),
(yn ) equivalent if limn→∞ d(xn , yn ) = 0. This is an equivalence relation (one can check reflexivity, symmetry,
transitivity). Let X ∗ be the set of equivalence classes.
2. Metric on X ∗ . For any two equivalence classes x∗ = [(xn )] and y ∗ = [(yn )] in X ∗ define
d(x∗ , y ∗ ) = lim d(xn , yn ).
n→∞
(1) The sequence (d(xn , yn ))n≥1 is Cauchy and therefore converges in R.
(2) The limit limn→∞ d(xn , yn ) is independent of the choice of Cauchy sequences in the classes x∗ and
y∗ .
To show that d defines a metric on X ∗ , we have to check the triangle inequality. Let x∗ , y ∗ , z ∗ ∈ X ∗ and let
(xn ), (yn ), (zn ) be their representatives. For any n ≥ 1 we have
d(xn , yn ) ≤ d(xn , zn ) + d(zn , yn ).
Taking the limits, we obtain
d(x∗ , y ∗ ) ≤ d(x∗ , z ∗ ) + d(z ∗ , y ∗ ).
3. Embedding of X into X ∗ . For any x ∈ X, consider the constant sequence xn = x for n ≥ 1. This
is a Cauchy sequence that defines a point in X ∗ . This embedding X ⊂ X ∗ is an isometry.
4. Density of X in X ∗ . Indeed, let x∗ ∈ X ∗ and let (xn ) be its representative. Then (xn ) is a Cauchy
sequence =⇒ ∀ε > 0 ∃N > 0
∀m, n ≥ N : d(xm , xn ) < ε.
Therefore ∀n ≥ N :
d(x∗ , xn ) = lim d(xm , xn ) ≤ ε.
m→∞
This implies that the sequence (xn ) in X converges to x∗ ∈ X ∗ =⇒ X is dense in X ∗ .
5. Completeness of X ∗ . Any Cauchy sequence (xn ) in X converges to x∗ ∈ X ∗ which is an equivalence
class of (xn ). Consider now a Cauchy sequence (x∗n ) in X ∗ . As X is dense in X ∗ , for any n ≥ 1 we can
choose xn ∈ X such that d(xn , x∗n ) < 1/n. The sequence (xn ) is also a Cauchy sequence:
d(xm , xn ) < d(x∗m , x∗ n) + 1/m + 1/n.
Therefore it converges to some x∗ ∈ X ∗ . This implies that (x∗n ) also converges to x∗
d(x∗n , x∗ ) ≤ (x∗n , xn ) + d(xn , x) < 1/n + d(xn , x) → 0, n → ∞.

Example 1.93. We have seen that R is the completion of Q. But we can also apply the above theorem in
order to construct the set R of real numbers as the completion of Q. According to the above construction
R can be identified with the set of equivalence classes of Cauchy sequences Pin Q. One can show that every
n
such class x∗ has a representative Cauchy sequence (xn ) of the form xn = i=0 ai /10i , where ai ∈ Z and

0 ≤ ai ≤ 9 for i ≥ 1. In this way we can represent x using the usual decimal representation a0 .a1 a2 a3 . . . .
Note that two different decimal representations can induce equivalent Cauchy sequences, hence the same
real number. For example,Pnconsider decimal representations 0.999
Pn. . . and 1.000 . . . with the corresponding
Cauchy sequences xn = i=1 9/10n = 1 − 1/10n and yn = 1 + i=1 0 = 1. These sequences are equivalent
as d(xn , yn ) = 1/10n → 0 as n → ∞, hence they represent the same real number.
16

1.8. Banach fixed point theorem.


Definition 1.94. Let (X, d) be a metric space. A map f : X → X is called a contraction if there exists
0 ≤ α < 1 such that ∀x, y ∈ X
d(f (x), f (y)) ≤ αd(x, y).
Remark 1.95. A contraction is always continuous.
Definition 1.96. An element x ∈ X is called a fixed point of a map f : X → X if f (x) = x.
Theorem 1.97 (Banach’s fixed point theorem). Let X be a complete metric space and f : X → X be a
contraction. Then f has a unique fixed point (x ∈ X such that f (x) = x).
Proof. Existence: Let x0 ∈ X, x1 = f (x0 ) and generally xn+1 = f (xn ). The sequence (xn ) is Cauchy.
Indeed, for any k, n ≥ 1
k−1
X
d(xn , xn+k ) ≤ d(xn , xn+1 ) + · · · + d(xn+k−1 , xn+k ) = d(xn+i , xn+i+1 )
i=0
k−1 k−1 ∞
X X X d(x0 , x1 )
= d(f n+i (x0 ), f n+i (x1 )) ≤ αn+i d(x0 , x1 ) ≤ αn d(x0 , x1 ) αi = αn .
i=0 i=0 i=0
1−α

For any ε > 0 ∃N > 0 such that αN d(x1−α


0 ,x1 )
< ε. Then for any m ≥ n ≥ N
d(x0 , x1 ) d(x0 , x1 )
d(xm , xn ) ≤ αn ≤ αN < ε.
1−α 1−α
This implies that (xn ) is a Cauchy sequence and therefore it converges to some x ∈ X. As f is continuous,
we have
f (x) = lim f (xn ) = lim xn+1 = x,
n→∞ n→∞
that is, x is a fixed point.
Uniqueness: If x, y ∈ X are two fixed points, then
d(x, y) = d(f (x), f (y)) < αd(x, y).
Therefore d(x, y) = 0 and x = y. 
Example 1.98. Let X = (0, 1) and f : X → X be given by f (x) = 21 x. Then f is a contraction, but it does
not have fixed points (if 21 x = x then x = 0 ∈
/ X). The reason is that X is not complete. On the other hand,
if X = [0, 1], which is complete, and f : X → X, x 7→ 12 x, then f has a fixed point x = 0.
Example 1.99 (Picard Theorem). Let
f : [a, b] × [c, d] → R
be a differentiable map. Consider an ordinary differential equation (ODE)
dy
(t) = f (t, y(t))
dt
with an initial value condition
y(t0 ) = y0
for some t0 ∈ (a, b), y0 ∈ (c, d). We are going to show that this equation has a unique solution in a
neighborhood [t0 − ε, t0 + ε] of t0 for some ε > 0. We can assume that ∃L, M > 0 such that
|f (t, y1 ) − f (t, y2 )| ≤ L |y1 − y2 | , t ∈ [a, b], y1 , y2 ∈ [c, d],
|f (t, y)| ≤ M, t ∈ [a, b], y ∈ [c, d].
Let us choose ε > 0 such that
εL < 1, I = [t0 − ε, t0 + ε] ⊂ [a, b], J = [y0 − εM, y0 + εM ] ⊂ [c, d].
Consider the space
X = { y ∈ C[t0 − ε, t0 + ε] | y(t) ∈ J ∀t ∈ I}
17

equipped with the supremum metric. Consider the map


Z t
F : X → X, F (y)(t) = y0 + f (s, y(s))ds y ∈ X, t ∈ I.
t0
Note that F (y) is continuous and
Z t Z t

|F (y)(t) − y0 | =
f (s, y(s))ds ≤ M ds ≤ εM.
t0 t0
Therefore F (y) ∈ X. Note that F (y) = y if and only if
Z t
y(t) = y0 + f (s, y(s))ds ∀t ∈ I
t0
if and only if y 0 (t) = f (t, y(t)) and y(t0 ) = y0 , that is, y is a solution of our ODE. To show that F has a
unique fixed point we will apply the previous theorem. The space X is complete as a closed subspace of a
complete metric space C[t0 − ε, t0 + ε]. The map F is a contraction as
Z t

d(F (y1 ), F (y2 )) = sup |F (y1 )(t) − F (y2 )(t)| = sup
(f (s, y1 (s)) − f (s, y2 (s))) ds
t∈I t∈I t0
Z t Z t
≤ sup L |y1 (s) − y2 (s)| ds ≤ sup Ld∞ (y1 , y2 )ds = Ld∞ (y1 , y2 ) sup |t − t0 | = εLd(y1 , y2 )
t∈I t0 t∈I t0 t∈I
and εL < 1. This implies that we can apply the Banach fixed point theorem to F and find a unique y ∈ X
with F (y) = y. This proves existence and uniqueness of a solution of ODE.
18

2. Topological spaces
2.1. Definition and examples. Given a metric space (X, d), we proved that the collection of all its open
sets satisfies the following properties: ∅ and X are open sets, unions of open sets are open, finite intersections
of open sets are open. A collection of subsets of X satisfying these conditions is called a topology on X. It
turns out that many properties of the metric space depend only on the underlying topology. For this reason
we will study topologies in more detail.
Definition 2.1. A topological space is a pair (X, τ ), where X is a set and τ is a collection of subsets of
X such that
(1) ∅, X ∈ τ . S
(2) If (Ui )i∈I is a collection of sets in τ then the union i∈I Ui is inTτ .
(3) If (Ui )i∈I is a finite collection of sets in τ then the intersection i∈I Ui is in τ .
The collection τ is called a topology on X. The elements of τ are called open sets.
Example 2.2. Let (X, d) be a metric space and τ be the collection of all open sets in (X, d). Then τ is a
topology on X, called a metric topology generated by metric d.
Definition 2.3. A topological space (X, τ ) is called metrizable if τ is generated by some metric on X.
Example 2.4. Let X be an arbitrary set and τ = {∅, X}. Then τ is a topology, called the trivial topology.
This topology is not metrizable if |X| ≥ 2. Indeed, if τ is generated by some metric d, consider x, y ∈ X and
let r = d(x, y). Then y ∈
/ B(x, r), hence B(x, r) 6= ∅, X. Therefore B(x, r) ∈/ τ , while it should be open. A
contradiction.
Example 2.5. Let X be an arbitrary set and let τ consist of all subsets of X. Then τ is a topology, called
the discrete topology. This topology is actually a metric topology generated by the discrete metric
(
0, x = y,
d(x, y) =
1, x 6= y.
S
Indeed, B(x, 1) = {x} is open with respect to d for all x ∈ X. Then for any A ⊂ X, the set A = x∈A {x}
is also open with respect to d.
Example 2.6. Let X = {a, b}. Then the possible topologies on X are
(1) τ = {∅, X} (trivial topology).
(2) τ = {∅, {a} , X}.
(3) τ = {∅, {b} , X}.
(4) τ = {∅, {a} , {b} , X} (discrete topology).
Only the last topology is metrizable. One can show that a metric on a finite set always generates the discrete
topology.
Definition 2.7. Let (X, τ ) be a topological space.
(1) A subset A ⊂ X is called closed if its complement X\A is open.
(2) Given a subset A ⊂ X, its closure Ā is defined to be the minimal closed set that contains A. It is
the intersection of all closed sets that contain A.
(3) Given a subset A ⊂ X, its interior A◦ is defined to be the maximal closed set contained in A. It is
the union of all closed sets contained in A.
Example 2.8. If X is equipped with the trivial topology, then {x} = X for every x ∈ X. If X is equipped
with the discrete topology, then {x} = {x} for every x ∈ X.
Example 2.9. Let X = {a, b} and τ = {∅, {a} , X}. Then {a} = {a} and {b} = X.
Definition 2.10. A map f : (X, τ ) → (Y, τ 0 ) is called continuous if for every open set U ⊂ Y , the preimage
f −1 (U ) is open in X. A map f is called a homeomorphism if f is bijective and both f, f −1 are continuous.
Example 2.11. Let X be a set with |X| ≥ 2, τdisc be the discrete topology on X and τtriv be the trivial
topology on X. Then the identity map
f : (X, τdisc ) → (X, τtriv ), x 7→ x,
19

is bijective and continuos. On the other hand g = f −1 (X, τtriv ) → (X, τdisc ) is not continuos. Indeed a one-
point set {x} is open in (X, τdisc ), but its preimage g −1 ({x}) = {x} is not open in (X, τtriv ). We conclude
that f is not a homeomorphism, although it is bijective and continuous.
Lemma 2.12. If f : X → Y and g: Y → Z are continuous maps between topological spaces, then gf : X → Z
is also continuous.
Proof. For any open set U ⊂ Z, we have (gf )−1 (U ) = f −1 (g −1 (U )). Note that g −1 (U ) is open in Y , hence
f −1 (g −1 (U )) is open in X. 
Example 2.13. Let X be an arbitrary set and let τ consist of ∅ and all U ⊂ X with X\U finite. This is a
topology, called a cofinite topology (or Zariski topology). Let us verify the axioms.
(1) It is clear that ∅ and X are in τ . S
(2) Let (Ui )i∈I be a collection of elements in τ . To show that the union i Ui is in τ , we can assume
that Ui 6= ∅, hence X\Ui is finite for all i ∈ I. Consider the complement
[  \
X\ Ui = (X\Ui ),
i∈I i∈I
where
S we applied De Morgan’s law. This set is finite as an intersection of finite sets. Therefore
i U i ∈ τ. T
T a finite collection of elements in τ and let us show that i Ui ∈ τ . If some set Ui is
(3) Let (Ui )i∈I be
empty, then i Ui = ∅ ∈ τ . Therefore we can assume that Ui 6= ∅, hence X\Ui is finite for all i ∈ I.
Consider the complement \  [
X\ Ui = (X\Ui ),
i∈I i∈I
where we again applied
T De Morgan’s law. This set is a finite union of finite sets, hence is itself finite.
This implies that i∈I Ui ∈ τ .
We conclude that τ is a topology on X. Note that if X is finite, then τ is a discrete topology and in particular
metrizable. If X is infinite, then τ is not metrizable. Indeed, assume that τ is generated by a metric d. Let
x, y ∈ X be two distinct points and let r = 21 d(x, y). Then open sets U = B(x, r) and V = B(y, r) satisfy
U ∩ V = ∅ and therefore
X = X\(U ∩ V ) = (X\U ) ∪ (X\V ).
But U, V are open and non-empty, hence X\U, X\V are finite. This implies that X is finite as a union of
two finite sets. A contradiction.
20

2.2. Subspace and product topologies.


Definition 2.14. Let (X, τ ) be a topological space and A ⊂ X be a subset. Then the collection
τA = { U ∩ A | U ∈ τ }
is a topology on A, called the subspace topology. The topological space (A, τA ) is called a subspace of
(X, τ ).
Lemma 2.15. Let (X, τ ) be a topological space and A ⊂ X be a subset. Then
(1) the inclusion map i: A → X, x 7→ x, is continuous.
(2) if f : (X, τ ) → (Y, τ 0 ) is a continuous map, then the restriction map f |A : A → Y , x 7→ f (x), is
continuous.
Proof. (1) Let U ⊂ X be open. Then its preimage i−1 (U ) = U ∩ A is open in (A, τA ). Therefore i: A → X
i f
is continuous. (2) We can write f |A : A → Y as a composition A →
− X −
→ Y . The composition f i of two
continuous maps is continuous. Therefore f |A is continuous. 
0
Definition 2.16. Let (X,S τ ), (Y, τ ) be two topological spaces. We define the topology on X × Y to be the
collection of all unions i∈I Ui × Vi , where Ui are open subsets of X and Vi are open subsets of Y for i ∈ I.
This topology is called the product topology.
Remark 2.17. Let us check that the above collection τ of subsets in X × Y is indeed a topology.
(1) The first axiom is satisfied as ∅ × ∅ and X × Y are in τ .
(2) The second axiom (about unions of open sets) is automatically satisfied. S
(3) To
S prove the third axiom it is enough to consider an intersection of just two elements Ui × Vi and
Uj0 × Vj0 from τ . We obtain
[ \ [  [ \
0 0
Uj0 × Vj0

Ui × V i Uj × Vj = (Ui × Vi )
i∈I j∈J (i,j)∈I×J
[
= (Ui ∩ Uj0 ) × (Vi ∩ Vj0 ).
(i,j)∈I×J

The intersections Ui ∩ Uj0 are open in X and the intersections Vi ∩ Vj0 are open in Y . Therefore the
above union is an element of τ .
This implies that τ satisfies all axioms of a topology.
Theorem 2.18. Let X, Y be two topological spaces and let X × Y be equipped with a product topology. Then
(1) The projection maps
p1 : X × Y → X, (x, y) 7→ x, p2 : X × Y → Y, (x, y) 7→ y
are continuous.
(2) Given continuous maps f : Z → X, g: Z → Y , there exists a unique continuous map h: Z → X × Y
that makes the diagram commute (p1 h = f and p2 h = g).
Z
g
h

f
X ×Y p2 Y
p1

Proof. 1. Let us show that p1 : X ×Y → X is continuous. For any open U ⊂ X its preimage is p−1
1 (U ) = U ×Y
which is an open subset of X × Y .
2. It is clear that h: Z → X × Y should satisfy
h(z) = (f (z), g(z)), z ∈ Z.
21

We just have to show that this map is continuous. An open set in X × Y can be written as ∪i∈I Ui × Vi ,
where Ui are open in X and Vi are open in Y . Its preimage is equal to
[
h−1 (Ui × Vi )
i∈I
−1
and it is enough to show that h (U × V ) is open for any open U ⊂ X and any open V ⊂ Y . We have
−1
h (U × V ) = { z ∈ Z | f (z) ∈ U, g(z) ∈ V } = f −1 (U ) ∩ g −1 (V )
and the last set is open as an intersection of two open sets f −1 (U ) and g −1 (V ). This means that preimages
of open sets in X × Y with respect to h are open in Z. Therefore h: Z → X × Y is continuous. 
Example 2.19. Let us show that the following three topologies on R2 coincide
(1) Topology τ2 generated by the Euclidean metric d2 .
(2) Topology τ∞ generated by the supremum metric d∞ .
(3) The product topology τ̄ on R × R = R2 .
Similarly one can show that these topologies are equivalent to the topology τp generated by the metric dp
for every p ≥ 1. To show that two topologies τ, τ 0 on a set X are equal one needs to prove that the identity
map f : (X, τ ) → (X, τ 0 ), x 7→ x, is a homeomorphism. Indeed, continuity of f implies that τ 0 ⊂ τ (if U ∈ τ 0
=⇒ U = f −1 (U ) ∈ τ ) and continuity of f −1 implies that τ ⊂ τ 0 . We conclude that τ = τ 0 . In our case we
consider the identity map f : (R2 , τ2 ) → (R2 , τ∞ ). Then
p
d∞ (f (x), f (y)) = d∞ (x, y) = max |xi − yi | ≤ (x1 − y1 )2 + (x2 − y2 )2 = d2 (x, y).
i=1,2

This implies that f is Lipschitz continuous, hence continuous. The inverse function g = f −1 : (R2 , τ∞ ) →
(R2 , τ2 ) satisfies
p √
d2 (g(x), g(y)) = d2 (x, y) = (x1 − y1 )2 + (x2 − y2 )2 ≤ 2 · d∞ (x, y).
This implies that g is Lipschitz continuous, hence continuous. We conclude that f is a homeomorphism, hence
τ2 = τ∞ . Let us compare now τ∞ and τ̄ . The identity map h: (R2 , τ∞ ) → (R2 , τ̄ ) is continuous by the previous
theorem (take Z = (R2 , τ∞ ) and X = Y = R). We need to show that its inverse g = h−1 : (R2 , τ̄ ) → (R2 , τ∞ )
is also continuous. Every open set in (R2 , τ∞ ) is a union of open balls with respect to the metric d∞ . Such
an open ball has a form
 
B(x, ε) = y ∈ R d∞ (x, y) = max |xi − yi | < ε = y ∈ R2 yi ∈ (xi − ε, xi + ε) ∀i
2

i=1,2

= (x1 − ε, x1 + ε) × (x2 − ε, x2 + ε) = U1 × U2 ,
where Ui = (xi − ε, xi + ε) is open in R. We conclude that g −1 (B(x, ε)) = B(x, ε) = U1 × U2 . But this set
is open in the product topology τ̄ , hence g is continuous. We conclude that h is a homeomorphism, hence
τ∞ = τ̄ .
Example 2.20. In contrast to the previous example, consider the identity map
Id: (C[0, 1], d∞ ) → (C[0, 1], d1 ).
This map is Lipschitz continuous, hence continuous. On the other hand its inverse is not continuous as there
exists a sequence of functions xn (t) = tn that converges to zero with respect to d1 , but diverges with respect
to d∞ . This implies that the corresponding topologies satisfy τ1 ⊂ τ∞ , but they are not equal.
22

2.3. Hausdorff spaces. To motivate the introduction of Hausdorff topological spaces let us discuss first
convergent sequences in topological spaces. In the case of metric spaces we required open balls for the
definition of convergent sequences. In the case of topological spaces the analogue of open balls are open
neighborhoods of a point.
Definition 2.21. Let (X, τ ) be a topological space and x ∈ X.
(1) An open set U ⊂ X that contains x is called an open neighborhood of x.
(2) A subset V ⊂ X is called a neighborhood of x if there exists an open set U ⊂ X such that
x∈U ⊂V.
Remark 2.22. Note that a subset V ⊂ X is open ⇐⇒ for every point x ∈ V there exists an open
neighborhood U 3 x contained in V . Indeed, if V is open, then we can take U = V for everyS x ∈ V .
Conversely, assume that for every x ∈ V , there exists open Ux 3 x contained in V . Then V = x∈V Ux ,
hence V is open.
Definition 2.23. Let (X, τ ) be a topological space. A sequence (xn ) in X is called convergent to x ∈ X
if for any neighborhoodU 3 x there exists N > 0 such that xn ∈ U for n ≥ N .
Example 2.24. Let (X, τ ) be a topological space with the trivial topology τ = {∅, X} and let x ∈ X. Then
the sequence (xn = x)n≥1 converges to any y ∈ X. Indeed, the only open set that contains y is X and the
elements xn are contained in X for all n ≥ 1.
This example shows that arbitrary topological spaces can have rather undesirable properties. In order to
ensure that sequences have at most one limit, we have to put some restrictions on the topological space.
Definition 2.25. A topological space (X, τ ) is called a Hausdorff space (or a Hausdorff topological space)
if for any x, y ∈ X with x 6= y there exist open sets U 3 x, V 3 y such that U ∩ V = ∅.
Example 2.26. Any metric space (X, d) is Hausdorff. For x 6= y, let r = 21 d(x, y) and
U = B(x, r), V = B(y, r).
Then U, V are open and U ∩ V = ∅.
Example 2.27. Let (X, τ ) have the trivial topology and let |X| ≥ 2. Then X is not Hausdorff. Indeed, let
x, y ∈ X with x 6= y and let U 3 x, V 3 y be open. Then U = V = X, hence U ∩ V 6= 0.
Lemma 2.28. Let (X, τ ) be a Hausdorff topological space. Then any convergent sequence in X has a unique
limit.
Proof. Let (xn ) be a sequence convergent to x, y ∈ X and assume that x 6= y. There exist open sets U 3 x,
V 3 y with U ∩ V = ∅. As xn → x, y for n → ∞, there exist N, N 0 > 0 s.t.
xn ∈ U ∀n ≥ N, xn ∈ V ∀n ≥ N 0 .
Then for any n ≥ max {N, N 0 }: xn ∈ U ∩V =⇒ U ∩V 6= ∅ which contradicts to our assumption. Therefore
x = y. 
Lemma 2.29. Let (X, τ ) be a Hausdorff topological space. For any point x ∈ X the set {x} is closed.
Proof. Let U = X\ {x}. By the Hausdorff condition, for any y ∈ U , there exists an open set Vy 3 y such
that x ∈
/ Vy . Therefore Vy ⊂ U . It is clear that U = ∪y∈U Vy and this set is open as a union of open sets.
This means that {x} = X\U is closed. 
Example 2.30. Let X be an infinite set with the Zariski topology (closed sets are finite sets and X). Then
every 1-point set {x} is closed. However, X is not Hausdorff. Indeed, let x 6= y and let U 3 x, V 3 y be
open sets such that U ∩ V = ∅. Then
X = X\(U ∩ V ) = (X\U ) ∪ (X\V )
is finite as a union of two finite sets. This contradicts to our assumption that X is infinite.
Exercise 2.31. A topological space X is Hausdorff if and only if the diagonal { (x, x) | x ∈ X} is closed in
X × X.
23

2.4. Connected topological spaces.


Definition 2.32. A topological space (X, τ ) is called connected if X can not be represented as a disjoint
union X = U ∪V˙ of two nonempty open subsets U, V . Equivalently, we require that if U ⊂ X is both open
and closed then U = ∅ or U = X.
Remark 2.33. A subset A ⊂ X is called connected if the subspace (A, τA ) is connected. This means that
there don’t exist open U, V ⊂ X such that A ∩ U 6= ∅, A ∩ V 6= ∅ and A ⊂ U ∪ V , A ∩ U ∩ V = ∅.
Example 2.34. The subset A = [0, 1] ∪ [2, 3] ⊂ R is not connected. Note that the interval [0, 1] is not open
in R, but it is open in A.
Proposition 2.35. Let f : X → Y be a continuous map between topological spaces and let X be connected.
Then f (X) is connected.
˙
Proof. Assume that f (X) is not connected. We can assume that Y = f (X). We can represent Y = U ∪V
−1 ˙ −1 −1 −1
with nonempty disjoint U, V . Then X = f (U )∪f (V ), where f (U ), f (V ) are nonempty and open.
This contradicts to the assumption that X is connected. 
Corollary 2.36. Let f : X → Y be a continuous map between topological spaces and let Y have discrete
topology. Then f is a constant map.
Proof. The image f (X) is connected and has discrete topology. If y ∈ f (X) then we can represent f (X) =
{y} ∪ (f (X)\ {y}), where both subsets on the right are open. As f (X) is connected, we conclude that
f (X) = {y}. 
Lemma 2.37. Intervals [a, b], (a, b), [a, b), (a, b] ⊂ R are connected.
Proof. We will prove this result just for the interval (a, b). For other intervals one can use similar arguments
or apply the result for (a, b) together with the result about connected closures, to be proved later.
Assume that (a, b) = U ∪V ˙ , where U, V 6= ∅ are both open and closed in (a, b). Choose c ∈ U , d ∈ V and
assume that c < d without loss of generality. Let x = sup U ∩ [c, d]. Then x ∈ U as U is closed in (a, b). We
have (x, d] ⊂ V , hence its limit point x is contained in V . We conclude that x ∈ U ∩ V , a contradiction. 
Theorem 2.38 (Intermediate value theorem). Let f : [a, b] → R be a continuous function. Then every value
between f (a) and f (b) is attained by f .
Proof. Let f (a) ≤ f (b) without loss of generality and assume that f (a) < y < f (b) is not in the image of f .
But this would imply that A = f ([a, b]) is not connected as
A = (A ∩ (−∞, y)) ∪ (A ∩ (y, +∞))
and these two sets contain f (a) and f (b) respectively. We proved, however, that the image f ([a, b]) of a
connected set is connected. 
Lemma 2.39. Let (X, τ ) be a topological space and A ⊂ X be a connected subset. Then the closure A ⊂ X
is connected.
Proof. Restricting the topology to A, we can assume that A = X. Assume that X = U ∪V ˙ , where U, V are
open. Then
A = (U ∩ A) ∪ (V ∩ A).
Without loss of generality U ∩ A 6= ∅. As A is connected we conclude U ∩ A = A and A ⊂ U . As U is closed
in X, we obtain X = A ⊂ U and therefore U = X. This implies that X is connected. 
Lemma 2.40. If X, Y are connected topological spaces, then X × Y is also connected.
˙ with open U, V . Assume that U is nonempty and let (x0 , y0 ) ∈ U . The
Proof. Assume that X × Y = U ∪V
subset A = {x0 } × Y is homeomorphic to Y , hence is connected. From
A = (U ∩ A) ∪ (V ∩ A)
and (x0 , y0 ) ∈ U ∩ A, we obtain that U ∩ A = A, hence A ⊂ U . This implies that, for every y ∈ Y , we have
(x0 , y) ∈ U . Similarly to the above argument, we conclude that also X × {y} ∈ U . This implies that
[
X ×Y = X × {y} ⊂ U,
y∈Y
24

hence U = X × Y . Therefore X × Y is connected. 


Example 2.41. The square [0, 1] × [0, 1] ⊂ R2 is connected as [0, 1] ⊂ R is connected.

Example 2.42. The disk D = (x, y) ∈ R2 x2 + y 2 ≤ 1 is connected. Consider a continuous map (polar
coordinates)
f : A = [0, 1] × [0, 2π] → R2 , (r, θ) 7→ (r cos θ, r sin θ).
Then A is connected (as a product of two intervals), hence D = f (A) is also connected.
Proposition 2.43. Let (X, τ ) be a topological space and let (Ai )i∈I be a collection of connected subsets such
that ∩i∈I Ai 6= ∅. Then ∪i∈I Ai is connected.
Proof. We can assume that X = ∪i∈I Ai . Assume that X = U ∪V ˙ , where U, V are open. Let x ∈ ∩i∈I Ai .
Without loss of generality we can assume that x ∈ U . For any i ∈ I, we have x ∈ Ai ∩ U and therefore
Ai ∩ U 6= ∅. As
Ai = (U ∩ Ai ) ∪ (V ∩ Ai )
and Ai is connected we conclude that U ∩ Ai = Ai and Ai ⊂ U . This implies X = ∪i∈I Ai ⊂ U , that is,
U = X. Therefore X = ∪i∈I Ai is connected. 
Definition 2.44. A subset A ⊂ X is called a connected component of X if it is connected and it is not
contained in any larger connected subset.
Theorem 2.45. Let X be a topological space. Then
(1) X is a disjoint union of its connected components.
(2) Every connected component of X is closed.
Proof. (1) For any x ∈ X, let A be the union of all connected subsets of X that contain x. Then A is
connected by the previous result. It is not contained in any larger connected subset B as this would imply
that B is contained in the above union, hence B ⊂ A and A = B. We conclude that A is a connected
component. As every point x ∈ X is contained in a connected component, we can cover X with connected
components (represent it as a union). If two connected components A, B have a nonempty intersection then
A ∪ B is connected by the previous result. By maximality of A, B, this implies that A = B = A ∪ B. We
conclude that X is a disjoint union of its connected components.
(2) If A is a connected component then Ā is connected and A ⊂ Ā. By maximality of A we obtain A = Ā,
hence A is closed. 
Remark 2.46. If X is a finite union A1 ∪ · · · ∪ An of its connected components, then
S all Ai are
T both open
and closed. We proved already that Ai are closed. On the other hand Ai = X\ j6=i Aj = j6=i (X\Aj ).
This set is open as a finite intersection of open sets X\Aj .
S
Example 2.47. Let U ⊂ R be an open set. Then U can be represented as a disjoint union i∈I Ui of its
connected components. We claim that every Ui is an open interval (possibly infinite), hence every open set
U ⊂ R can be written as a disjoint union of open intervals.
Let V ⊂ R be an arbitrary connected subset and let a = inf V and b = sup V (possibly a = −∞
and b = +∞). Then (a, b) ⊂ V as otherwise we would find a point c ∈ (a, b)\V and decompose V =
(V ∩ (−∞, c)) ∪ (V ∩ (c, +∞)) into a disjoint union of two non-empty open sets. This implies that V is one
of the intervals (a, b), [a, b], (a, b], [a, b).
Assume now that V is a connected component of U and a, b are defined as before. If a ∈ V , then
A = (a − ε, a + ε) ⊂ U for some ε > 0, hence A ∪ V is connected and therefore A ⊂ V . But this implies that
inf V < a, a contradiction. We conclude that a ∈ / V and similarly that b ∈
/ V . Therefore V = (a, b), hence
every connected component of U is an open interval.
Definition 2.48. A topological space X is called path-connected if for any two points x, y ∈ X there
exists a continuous map f : [0, 1] → X such that f (0) = x, f (1) = y (a path from x to y).
Lemma 2.49. If X is path-connected, then it is connected.
Proof. Assume that X is not connected. Then we can decompose X = U ∪V ˙ , where U, V 6= ∅ are open
subsets of X. Choose x ∈ U , y ∈ V and a continuous map f : [0, 1] → X such that f (0) = x and f (1) = y.
Then we can decompose [0, 1] = f −1 (U )∪f˙ −1 (V ). The sets f −1 (U ), f −1 (U ) are open and non-empty as
f (0) = x ∈ U and f (1) = y ∈ V . This implies that [0, 1] is not connected, a contradiction. 
25

Example 2.50. There exist connected topological spaces that are not path-connected. For example, consider
a subset E ⊂ R2 , called a deleted comb space,
[   
1
E= n × [0, 1] ∪ [0, 1] × {0} ∪ {(0, 1)}
n≥1

One can show that the set E 0 = E\ {p}, where p = (0, 1), is path-connected, hence also connected. Its
closure (in E) is equal to E, hence E is also connected. On the other hand there is no path (contained in
E) between p and E 0 , hence E is not path-connected.
26

2.5. Compact topological spaces.


Definition 2.51. Let (X, τ ) be a topological space.
S
(1) A collection (Ui )i∈I of open sets in X is called an open cover of X if X = i∈I Ui .
(2) X is called a compact topological space if every open coverS (Ui )i∈I of it contains a finite subcover,
that is, there exists a finite subset I 0 ⊂ I such that X = i∈I 0 Ui .
(3) A subset A ⊂ X is called compact if (A, τA ) is a compact topological space.
Remark 2.52. The compactnessSof A ⊂ X means that given an open cover of A, that is, a collection (Ui )i∈I
of open sets in X such that A ⊂ i∈I Ui , there exists a finite subset I 0 ⊂ I such that A ⊂ i∈I 0 Ui .
S

Example 2.53. The following topological spaces are compact.


(1) Finite topological sets.
(2) (X, τ ) with trivial topology (that is, τ = {∅, X}).
(3) A closed interval [a, b] ⊂ R (Heine-Borel Theorem).
Example 2.54. The following topological spaces are not compact.
(1) Euclidean space R. Consider the cover (Un )n≥1 , where Un = (−n, n). It does not have a finite
subcover.
(2) The interval (0, 1). Consider the cover (Un )n≥2 , where Un = n1 , 1 . It does not have a finite


cubcover.
Theorem 2.55 (Heine-Borel Theorem on R). A closed interval [a, b] is compact.
Proof. Assume that I0 = [a, b] is not compact and let (Ui )i∈S be an infinite cover without a finite subcover.
Divide I0 into two equal subintervals. At least one of them does not have a finite subcover. Denote it by I1 .
Continuing this process we get intervals
I0 ⊃ I1 ⊃ I2 ⊃ . . .
b−a
with In having length 2n . Let xn = min In . Then
x0 ≤ x1 ≤ x2 ≤ . . .
is an increasing sequence in [a, b]. Therefore it converges to some x ∈ [a, b]. For any n ≥ 1 and k ≥ n, we
have xk ∈ Ik ⊂ In . Therefore x = limk→∞ xk ∈ In as In is closed. As (Ui )i∈S is a cover of [a, b], we can
find i ∈ S such that x ∈ Ui . Then there exists ε > 0 such that (x − ε, x + ε) ⊂ Ui . There exists n > 0 such
that b−a
2n < ε and therefore In ⊂ (x − ε, x + ε) ⊂ Ui (In contains x and has length < ε). This implies that
In ⊂ Ui , contradicting to our assumption that In does not have a finite subcover. 
Lemma 2.56. Let (X, τ ) be a compact topological space and let A ⊂ X be closed. Then A is compact.
S
Proof. Let (Ui )i∈I be an open cover of A, that is, A ⊂ i∈I Ui and Ui ⊂ X are open. The set U = X\A is
open, so there is an open cover of X [ 
X= Ui ∪ U.
i∈I
0
S
S compactness of X, there exists a finite subset I ⊂ I such that X = (
By the i∈I 0 Ui ) ∪ U. This implies
A ⊂ i∈I 0 Ui , hence A is compact. 
Theorem 2.57. Let X, Y be compact topological spaces. Then X × Y (with product topology) is compact.
Proof. Let (Wi )i∈I be an open cover of X × Y . Refining the cover (Wi ), we can assume that Wi = Ui × Vi ,
where Ui , Vi are open in X, Y respectively. For any x ∈ X, the space {x} × Y is compact and has an
open coverS(Ui × Vi )i∈I . ThereforeT there exists a finite subset Ix ⊂ I such that x ∈ Ui for all i ∈ Ix
and Y = i∈Ix Vi . Let Ux = i∈Ix Ui 3 x. The family (Ux )x∈X covers X, hence has a finite subcover
(Ux1 , . . . , Uxn ). We claim that 
U i × V i | i ∈ Ixj , 1 ≤ j ≤ n
S cover of X × Y . Indeed, for any (x, y) ∈ X × Y , we have x ∈ Uxj for some 1 ≤ j ≤ n. We also have
is a finite
Y = i∈Ix Vi , hence y ∈ Vi for some i ∈ Ixj . Then x ∈ Uxj ⊂ Ui , hence (x, y) ∈ Ui × Vi . This proves the
j
above statement, hence the cover of X × Y has a finite subcover and we conclude that X × Y is compact. 
27

Example 2.58. Let us show that every closed and bounded subset A ⊂ Rn is compact (part of the Heine-
Borel theorem on Rn to be proved later). As A is bounded, we can embed A ⊂ [−N, N ]n for some N > 0.
We know that the interval [−N, N ] is compact. Therefore the product [−N, N ]n is compact by the previous
theorem. As A ⊂ [−N, N ]n is closed, we conclude that it is also compact.
Proposition 2.59. Let f : X → Y be a continuous map between topological spaces. If X is compact then
f (X) is compact.
Proof. Let (Ui )i∈I be an open cover of f (X), that is, Ui ⊂ Y are open and f (X) ⊂ ∪i∈I Ui . Then
[
X= f −1 (Ui ),
i∈I
−1
where f (Ui ) are open. By the compactness of X there exists a finite subset I 0 ⊂ I such that
[
X= f −1 (Ui ).
i∈I 0
S
This implies f (X) ⊂ Ui , hence we found a finite subcover of f (X). Therefore f (X) is compact.
i∈I 0 

Example 2.60. Let us show that the unit circle S 1 = (x, y) ∈ R2 x2 + y 2 = 1 is compact. Consider a
continuous map
f : [0, 2π] → R2 , θ 7→ (cos θ, sin θ).
Then S 1 = f ([0, 2π]). As the closed interval [0, 2π] is compact, we conclude that S 1 = f ([0, 2π]) is also
compact.
Proposition 2.61. Let (X, τ ) be a Hausdorff topological space and let A ⊂ X be a compact subspace. Then
A is closed in X.
Proof. We need to show that X\A is open. Consider S some x ∈ X\A. For every y ∈ A there exist open sets
Uy 3 x, Vy 3 y such that Uy ∩ Vy = ∅. Then A ⊂ y∈A Vy and by the compactness of A there exists a finite
set {y1 , . . . , yn } ⊂ A such that A ⊂ Vy1 ∪ · · · ∪ Vyn . We claim that
U = Uy1 ∩ . . . Uyn
T
is an open neighborhood of x contained in X\A. It is clear that U is open and x ∈ i Uyi = U . Moreover,
we know that Uyi ∩ Vyi = ∅, hence U ∩ Vyi = ∅ and therefore
U ∩ (Vy1 ∪ · · · ∪ Vyn ) = ∅.
But A ⊂ Vy1 ∪ · · · ∪ Vyn , hence U ∩ A = ∅. This implies that U ⊂ X\A and x is an internal point of X\A.
Therefore X\A is open. 
Corollary 2.62. Let f : X → Y be a continuous and bijective map between topological spaces. If X is
compact and Y is Hausdorff, then f is a homeomorphism.
Proof. We have to prove that g = f −1 : Y → X is continuous. To do this it is enough to show that for any
closed subset A ⊂ X the set g −1 (A) = f (A) is closed. As X is compact and A ⊂ X is closed, we obtain
that A is also compact. Therefore f (A) ⊂ Y is compact. By the assumption that Y is Hausdorff and the
previous statement, we conclude that f (A) ⊂ Y is closed. 
Example 2.63. We have seen previously that the identity map
Id: (C[0, 1], d∞ ) → (C[0, 1], d1 ), f →f
is Lipschitz continuous, hence also continuous. On the other hand we have seen that the inverse map is not
continuous (there is a sequence of functions xn (t) = tn convergent to zero with respect to d1 , but divergent
with respect to d∞ ). This means that the previous statement can not be applied. As any topology generated
by a metric is automatically Hausdorff, we conclude that (C[0, 1], d∞ ) is not compact.
28

2.6. Compact metric spaces.


Lemma 2.64. Let (X, d) be a metric space and A ⊂ X be a compact subspace. Then A is closed and bounded
(that is, A ⊂ B(x, r) for some x ∈ X and r > 0).
Proof. (1) As X is a metric space, it is Hausdorff, hence every compact subset A ⊂ X is closed in X. (2)
For any x ∈ X, the family of open balls (B(x, n))n≥1 is an open cover of A. By the compactness of A it
contains a finite subcover. If N is the maximal radius of balls in this subcover, then A ⊂ B(x, N ), hence A
is bounded. 

Theorem 2.65 (Heine-Borel Theorem on Rn ). A subset of Rn is compact if and only if it is closed and
bounded.
Proof. We have seen that a compact set in a metric space is always closed and bounded. Conversely, assume
that A ⊂ Rn is closed and bounded. Then there exists N > 0 such that A ⊂ [−N, N ]n . We know that
[−N, N ] is compact in R. Therefore [−N, N ]n is compact. Therefore its closed subset A is compact. 

Example 2.66. It is not true in general that a closed and bounded set is compact. For example, let X be
an infinite set equipped with the discrete metric. Then X is closed
S and bounded Sas X = B(x, 2) for any
x ∈ X. But X is not compact. For example, the open cover X = x∈X B(x, 1) = x∈X {x} does not have
a finite subcover.
We will prove now some useful characterizations of compact metric spaces.
Definition 2.67. A metric space (X, d) is called sequentially compact if any sequence (xn ) in X has a
convergent subsequence.
Definition 2.68. Let (X, d) be a metric space.
(1) GivenSε > 0, a subset A ⊂ X is called an ε-net of X if ∀x ∈ X ∃y ∈ A: d(x, y) < ε. Equivalently,
X = y∈A B(y, ε).
(2) A metric space X is called totally bounded if for any ε > 0 there exists a finite ε-net of X.
Theorem 2.69. Let (X, d) be a metric space. Then the following conditions are equivalent
(1) (X, d) is compact.
(2) (X, d) is sequentially compact.
(3) (X, d) is complete and totally bounded.
Proof. (1) =⇒ (2). Assume that (X, d) is compact. Let (xn )n≥1 be a sequence in X. Then it contains a
subsequence convergent to y ∈ X if and only if ∀ε > 0 and ∀N > 0
∃n ≥ N : xn ∈ B(y, ε).
Assuming that (xn ) does not contain a convergent subsequence, for any y ∈ X there exist εy > 0 and Ny > 0
such that
xn 6∈ B(y, εy ) ∀n ≥ Ny .
The open balls B(y, εy ) with y ∈ X form a cover of X. By the compactness of X there exists a finite subset
I ⊂ X s.t. [
X= B(y, εy ).
y∈I

S then for any n ≥ max { Ny | y ∈ I} we would have xn 6∈ B(y, εy ) for y ∈ I and therefore xn 6∈
But
y∈I B(y, εy ) = X, a contradiction.
(2) =⇒ (3). Assume that X is sequentially compact. We will prove that it is complete and totally bounded.
Let (xn ) be a Cauchy sequence. We can find a convergent subsequence (xnk )k≥1 with a limit x ∈ X. For
any ε > 0 there exists N > 0 such that d(xm , xn ) < ε/2 for m, n ≥ N . There exists k > 0 such that nk ≥ N
and d(xnk , x) < ε/2. Then for any n ≥ N
d(xn , x) ≤ d(xn , xnk ) + d(xnk , x) < ε/2 + ε/2 = ε.
This implies that the sequence (xn ) converges to x. This proves that X is complete.
29

Assume that X is not totally bounded. Then there exists ε > 0 s.t. for any finite set of points {x1 , . . . , xn }
the balls B(xi , ε) don’t cover X. Then we can choose a sequence (xn )n≥1 in such a way that
xn+1 6∈ B(x1 , ε) ∪ · · · ∪ B(xn , ε), n ≥ 1.
This implies that d(xm , xn ) ≥ ε for m 6= n. This sequence can not contain a convergent subsequence (any
subsequence is not Cauchy).
(3) =⇒ (1). Assume that X is complete and totally bounded and let’s prove that X is compact. Let (Ui )i∈I
be an open cover that does not contain a finite subcover. By our assumption, for any n ≥ 1, there exists a
finite set An ⊂ X such that [
X= B(x, 1/n).
x∈An
At least one of the sets B(x, 1) with x ∈ A1 can not be covered by a finite number of Ui . Let x1 ∈ A1 be
the corresponding point. At least one of the sets
B(x1 , 1) ∩ B(x, 1/2), x ∈ A2
can not be covered by a finite number of Ui . Let x2 ∈ A2 be the corresponding point. Continuing this
process we find a sequence of points xn ∈ An such that
B(x1 , 1) ∩ B(x2 , 1/2) ∩ · · · ∩ B(xn , 1/n)
can not be covered by a finite number of Ui and in particular is non-empty. The sequence (xn ) is Cauchy.
Indeed, for m > n we have B(xm , 1/m) ∩ B(xn , 1/n) 6= ∅, hence
d(xm , xn ) < 1/m + 1/n < 2/n.
As X is complete, the sequence (xn ) converges to some x ∈ X. Choose i ∈ I such that x ∈ Ui . Then
B(x, ε) ⊂ Ui for some ε > 0. There exists n > 0 such that d(xn , x) < ε/2 and n1 < ε/2. Then B(xn , 1/n) ⊂
B(x, ε) ⊂ Ui , hence
B(x1 , 1) ∩ B(x2 , 1/2) ∩ · · · ∩ B(xn , 1/n)
is covered by just one Ui , contradicting to our choice of the sequence (xn ). 
Example 2.70. Consider the space C[0, 1] with the supremum metric d∞ . The set A = B̄(0, 1) =
{ f ∈ C[0, 1] | d∞ (f, 0) ≤ 1} is closed and bounded, but it is not compact. By the previous theorem it is
enough to show that A is not sequentially compact. The sequence xn (t) = tn in A converges pointwise to
the function x(t) = δ1 (t) which is 0 for t 6= 1 and 1 for t = 1. This function is not continuous, hence the
sequence (xn ) does not have a convergent subsequence.
We can use Theorem 2.69 to give an alternative proof of the Heine-Borel theorem:
Theorem 2.71 (Heine-Borel theorem, see 2.55). A closed interval [a, b] is compact.
Proof. We know that [a, b] is complete. To show that it is totally bounded, for every ε > 0, choose N > 1/ε
and divide [a, b] into N equal subintervals. The set A of the end-points of these subintervals forms a finite
ε-net of [a, b]. We conclude that [a, b] is totally bounded, hence compact. 
Remark 2.72. In the same way we can show that an n-dimensional cube [a, b]n ⊂ Rn is complete and
totally bounded, hence compact. This can be used to prove Theorem 2.65.
We can also use Theorem 2.69 to give an alternative proof of the Bolzano-Weierstrass theorem:
Theorem 2.73 (Bolzano-Weierstrass theorem, see 1.80). A bounded sequence in Rk has a convergent sub-
sequence.
Proof. Let (xn ) be a bounded sequence in Rk . Let A = { xn | n ≥ 1} ⊂ Rk . Then Ā is closed and bounded.
Therefore Ā is compact, hence also sequentially compact. This implies that a sequence (xn ) in Ā has a
convergent subsequence. 
30

2.7. Continuous functions on compact spaces.


Remark 2.74. Recall that given a subset A ⊂ R, we define
(1) The supremum sup A ∈ R ∪ {+∞, −∞} to be the least upper bound of A, that is, an element
c ∈ R ∪ {+∞, −∞} such that x ≤ c for all x ∈ A (upper bound of A) and c is the least element
with this property, meaning that if x ≤ d for all x ∈ A, then c ≤ d. By the supremum property of
R, there always exists sup A. For example sup(0, 1) = 1, sup R = +∞, sup ∅ = −∞. Given a map
f : X → R, we define supx∈X f (x) = sup f (X).
(2) The infimum inf A ∈ R ∪ {+∞, −∞} to be the greatest lower bound of A, that is, an element
c ∈ R∪{+∞, −∞} such that x ≥ c for all x ∈ A (lower bound of A) and c is the greatest element with
this property, meaning that if x ≥ d for all x ∈ A, then c ≥ d. There always exists inf A. For example
inf(0, 1) = 0, inf R = −∞, inf ∅ = +∞. Given a map f : X → R, we define inf x∈X f (x) = inf f (X).
(3) The maximum max A to be an element c ∈ A such that x ≤ c for all x ∈ A. For example max(0, 1) is
undefined, max R is undefined, max ∅ is undefined, max[0, 1] = 1. Note that if A has the maximum,
then max A = sup A. Given a map f : X → R, we define maxx∈X f (x) = max f (X). We say that f
attains its maximum at x0 ∈ X if f (x0 ) = supx∈X f (x) (then necessarily f (x0 ) = maxx∈X f (x)).
(4) The minimum min A to be an element c ∈ A such that x ≥ c for all x ∈ A. Note that if A has the
minimum, then min A = inf A. Given a map f : X → R, we define minx∈X f (x) = min f (X). We say
that f attains its minimum at x0 ∈ X if f (x0 ) = inf x∈X f (x) (then necessarily f (x0 ) = inf x∈X f (x)).
Theorem 2.75. Let (X, τ ) be a compact topological space and let f : X → R be a continuous function. Then
the set of values f (X) is bounded and there exist points x0 , x1 ∈ X such that
f (x0 ) = inf f (x), f (x1 ) = sup f (x).
x∈X x∈X

Proof. Let A = f (X) ⊂ R. Then A is compact, hence closed and bounded. Therefore
M = sup f (x) = sup A < +∞
x∈X

and M ∈ A. This implies that there exists x1 ∈ X with M = f (x1 ). The proof for the infimum is similar. 

Remark 2.76. Let us generalize the space of continuous functions C[a, b]. Given a topological space X,
define C(X) to be the set of continuous functions f : X → R. If X is compact, define the supremum metric
d∞ on C(X)
d∞ (f, g) = sup |f (x) − g(x)| .
x∈X
This metric is well-defined as the function h: X → R, x 7→ |f (x) − g(x)| is continuous, hence bounded by
the above result.
Remark 2.77. Even more generally, given topological spaces X, Y , define C(X, Y ) to be the set of con-
tinuous maps f : X → Y . If X is compact and (Y, dY ) is a metric space, define the supremum metric on
C(X, Y )
d∞ (f, g) = sup dY (f (x), g(x)).
x∈X
This metric is well-defined as the function h: X → R, x 7→ dY (f (x), g(x)) is continuous, hence bounded by
the above result. One can show that the generated topology on C(X, Y ) coincides with the compact-open
topology. This is the minimal topology that contains the sets W (K, U ) = { f ∈ C(X, Y ) | f (K) ⊂ U } for all
compact K ⊂ X and open U ⊂ Y .
Example 2.78. If X is finite and has the discrete topology, then C(X) ' Rn , where n = |X|, hence a
subset A ⊂ C(X) is compact if it is closed and bounded (by the Heine-Borel theorem). But this is not true
in general. For example, the subset A = B̄(0, 1) ⊂ C[0, 1] is closed and bounded, but not compact (see
Example 2.70). The next result is a useful characterization of compact subsets of C(X).
Theorem 2.79 (Arzela-Ascoli Theorem). Let X be a compact metric space and C(X) be the space of
continuous maps f : X → R with the supremum metric. A subset A ⊂ C(X) is compact if and only if
(1) A is closed and bounded.
31

(2) A is equicontinuous, that is, ∀ε > 0 ∃δ > 0


∀f ∈ A: d(x, y) < δ =⇒ |f (x) − f (y)| < ε.
Definition 2.80. A map f : X → Y between two metric spaces is called uniformly continuous if ∀ε > 0
∃δ > 0:
dX (x, y) < δ =⇒ dY (f (x), f (y)) < ε.
Remark 2.81. For a map f : X → Y between metric spaces one can show that
(1) Lipschitz continuity implies uniform continuity.
(2) Uniform continuity implies continuity.
Example 2.82. Let us construct a continuous map that is not uniformly continuous. Consider the map
f : (0, 1) → R, x 7→ 1/x. It is continuous, but not uniformly continuous. Indeed, let ε > 0 and assume ∃δ > 0
such that
d(x, y) < δ =⇒ d(f (x), f (y)) < ε.
Let y = δ and 0 < x < δ. Then d(x, y) = |x − y| < δ and
d(f (x), f (y)) = |1/x − 1/δ| → ∞, x → 0.
This means that d(f (x), f (y)) > ε for x > 0 sufficiently small.
Example 2.83. Let us construct a uniform √ continuous map f : X → Y that is not Lipschitz continuos.
Consider the map f : [0, 1] → R, x 7→ x. This map is continuous on a compact, hence also uniformly
continuous by the next theorem. We can also √show√this < ε < 1, choose δ = ε2 . Assume
directly: for any 0√

that x ≤ y and |x − y| < δ. If y < ε, then x − y < ε as required. If y ≥ ε, then

√ √ √ √ √ √
δ > |y − x| = x + y · x − y ≥ ε x − y ,
√ √
hence x − y < δ/ε = ε as required. On the other hand f is not Lipschitz continuous:
|f (x) − f (y)| 1
= √ √ → ∞, x, y → 0.
|x − y| x + y
Theorem 2.84 (Uniform continuity theorem). Let f : X → Y be a continuous map between metric spaces.
If X is compact, then f is uniformly continuous.
Proof. Assume the contrary. Then there exists ε > 0 such that for all n ≥ 1 ∃xn , yn ∈ X:
dX (xn , yn ) < 1/n, dY (f (xn ), f (yn )) ≥ ε.
As X is compact, there exists a subsequence (xnk )k≥1 of (xn )n≥1 convergent to some x ∈ X. By the above
inequality, the sequence (ynk ) also converges to x. Then (f (xnk )) and (f (ynk )) converge to f (x), hence
∃N > 0 ∀k ≥ N :
dY (f (xnk ), f (x)) < ε/2, dY (f (ynk ), f (x)) < ε/2.
=⇒ dY (f (xnk ), f (ynk )) < ε, a contradiction. 
32

3. Normed vector spaces


3.1. Definition of a normed vector space. Recall that a vector space over a field K (we will need only
K = R and K = C) is a set X together with two operations
addition: X × X → X, (x, y) 7→ x + y
scalar multiplication: K × X → X, (λ, x) 7→ λx
satisfying the axioms:
(1) (X, +) is an abelian group,
(2) λ(µx) = (λµ)x,
(3) 1x = x,
(4) λ(x + y) = λx + λy,
(5) (λ + µ)x = λx + µx
for any x, y ∈ X and λ, µ ∈ K.
Definition 3.1. A normed vector space is a pair (X, k·k), where X is a vector space (over K = R or K = C)
and k·k is a map
k·k : X → R, x 7→ kxk
satisfying the axioms
(1) kxk = 0 ⇐⇒ x = 0.
(2) kλxk = |λ| · kxk for all λ ∈ K, x ∈ X.
(3) kx + yk ≤ kxk + kyk for all x, y ∈ X (triangle inequality).
The map k·k is called a norm on X and kxk is called the norm of x.
Remark 3.2. For any x ∈ X we have
k−xk = k(−1) · xk = |−1| · kxk = kxk .
We also have kxk ≥ 0. Indeed
0 = kx + (−x)k ≤ kxk + k−xk = 2 kxk ,
so kxk ≥ 0. We can think of kxk as the length of a vector x.
Lemma 3.3. Let (X, k·k) be a normed vector space. Then the map
d: X × X → R, d(x, y) = kx − yk , x, y ∈ X
defines a metric on X.
Proof. For any x, y, z ∈ X we have
(1) d(x, y) = 0 ⇐⇒ kx − yk = 0 ⇐⇒ x − y = 0 ⇐⇒ x = y.
(2) d(x, y) = kx − yk = ky − xk = d(y, x).
(3) d(x, y) = kx − yk = kx − z + z − yk ≤ kx − zk + kz − yk = d(x, z) + d(z, y).

In this way any normed vector space can be considered as a metric space (or a topological space). Many
examples of metric spaces that we considered before arise in this way.
33

3.2. Examples of normed vector spaces.


Example 3.4. The Euclidean norm on Rn
q
kxk2 = x21 + · · · + x2n
for x = (x1 , . . . , xn ) ∈ Rn .
Example 3.5. For any p ≥ 1 there is a norm k·kp on Rn given by
X n 1/p
p
kxkp = |xi |
i=1
n
for x = (x1 , . . . , xn ) ∈ R . The triangle inequality axiom follows from the Minkowski inequality. There is
also the ∞-norm (or the supremum norm) on Rn
kxk∞ = max {|x1 | , . . . , |xn |} .
Example 3.6. Let C[a, b] be the set of all continuous functions f : [a, b] → R. Define addition and scalar
multiplication on C[a, b] pointwise:
For any x, y ∈ C[a, b] define x + y ∈ C[a, b] by
(x + y)(t) = x(t) + y(t), t ∈ [a, b].
For any λ ∈ R, x ∈ C[a, b] define λx ∈ C[a, b] by
(λx)(t) = λ · x(t), t ∈ [a, b].
The set C[a, b] equipped with these operations is a vector space. For any p ≥ 1 there is a norm k·kp on
C[a, b] given by
Z b 1/p
p
kxkp = |x(t)| dt .
a
There is also the ∞-norm (or the supremum norm) on C[a, b]
kxk∞ = sup |x(t)| = max |x(t)| < ∞
t∈[a,b] t∈[a,b]

where in the last equality we used the fact that [a, b] is compact.
P p
Example 3.7. For any p ≥ 1, let `p be the set of sequences (xn )n≥1 in R such that n≥1 |xn | < ∞. For
 1/p
P p
any such sequence, define kxkp = n≥1 |xn | . Define addition and scalar multiplication of sequences
componentwise:
(x + y)n = xn + yn , (λx)n = λ · xn .
It is clear that if x ∈ `p and λ ∈ R, then λx ∈ `p . Let us show that if x, y ∈ `p , then x + y ∈ `p . By the
Minkowski inequality, for any m ≥ 1,
X m 1/p  X m 1/p  Xm 1/p
p p p
|xn + yn | ≤ |xn | + |yn | ≤ kxkp + kykp .
n=1 n=1 n=1
Taking the limit as m → ∞, we obtain
X 1/p
p
|xn + yn | ≤ kxkp + kykp < ∞,
n≥1

hence x + y ∈ `p . This implies that `p is a vector space. The map k·kp defines a norm on `p .
Similarly, define `∞ to be the set of all sequences (xn )n≥1 in R such that supn≥1 |xn | < ∞ (that is,
bounded sequences). Then `p is a vector space with addition and scalar multiplication defined as before. It
has a norm
kxk∞ = sup |xn | ∀x ∈ `∞ .
n≥1
34

3.3. Bounded linear operators.


Definition 3.8. Let (X, k·k), (Y, k·k) be two normed vector spaces over K.
(1) A map A: X → Y is called a linear operator (or a linear map) if
A(x + y) = Ax + Ay, A(λx) = λAx
for any x, y ∈ X and λ ∈ K.
(2) A linear operator A: X → Y is called bounded if there exists M ≥ 0 such that
kAxk ≤ M kxk ∀x ∈ X.
(3) A linear operator A: X → Y is called continuous if it is continuous with respect to the metrics on
X and Y induced by the norms.
Lemma 3.9. A linear operator A: X → Y between two normed vector spaces is bounded if and only if it is
continuous.
Proof. (1) =⇒ (2). If A: X → Y is bounded with a constant M > 0, then it is Lipschitz continuous
dY (Ax, Ay) = kAx − Ayk = kA(x − y)k ≤ M kx − yk = M dX (x, y),
hence continuous.
(2) =⇒ (1). Let A: X → Y be continuous. If A is not bounded, then for any n ≥ 1, there exists xn ∈ X
xn
such that kAxn k > n kxn k. Let yn = nkx nk
. Then kyn k = 1/n and
kAxn k
kAyn k = > 1.
n kxn k
This implies that yn → 0, while Ayn 6→ 0 as n → ∞. This contradicts to the continuity of A. 

Definition 3.10. Let X, Y be normed vector spaces. Define L(X, Y ) to be the set of all bounded linear
operators from X to Y . The set L(X, Y ) has a structure of a vector space:
(1) Given A, B ∈ L(X, Y ), define A + B ∈ L(X, Y ) by
(A + B)x = Ax + Bx, x ∈ X.
If A is bounded by M > 0 and B is bounded by N > 0, then A + B is also bounded:
k(A + B)xk = kAx + Bxk ≤ kAxk + kBxk ≤ M kxk + M kxk = (M + N ) kxk .
(2) Given λ ∈ K and A ∈ L(X, Y ) define λA ∈ L(X, Y ) by
(λA)x = λ · Ax.
It is clear that if A is bounded, then λA is bounded.
Define L(X) = L(X, X) and define the dual space X ∗ = L(X, R).
Definition 3.11. Define the operator norm kAk of a bounded linear operator A ∈ L(X, Y ) to be the
minimal number M ≥ 0 satisfying kAxk ≤ M kxk for all x ∈ X. That is,
kAk = inf { M ≥ 0 | kAxk ≤ M kxk ∀x ∈ X} .
Lemma 3.12. For any A ∈ L(X, Y ), we have
kAxk
kAk = sup = sup kAxk .
x6=0 kxk kxk=1

Proof. For any x 6= 0, consider the vector y = x/ kxk. Then kyk = 1 and kAxk kAyk
kxk = kyk = kAyk. This implies
the second equality of the lemma.
Let a = supx6=0 kAxk kAxk
kxk . Then kxk ≤ a for all x 6= 0, hence kAxk ≤ a kxk for all x ∈ X. Therefore kAk ≤ a.
kAxk kAxk
Conversely, we have kAxk ≤ kAk · kxk, hence kxk ≤ kAk for all x 6= 0. Therefore a = supx6=0 kxk ≤ kAk.
We conclude that kAk = a. 
35

Example 3.13. Consider a linear map A: (Rn , k·k1 ) → (Rn , k·k∞ ) given by a matrix (aij )ni,j=1 . Then

n
X  X

kAxk∞ = max aij xj ≤ max max |aij | |xj | = max |aij | · kxk1 .
i i j i,j
j=1 j

This implies that kAk ≤ maxi,j |aij |. To prove the inverse inequality, consider x = ej ∈ Rn for 1 ≤ j ≤ n.
Then
kAxk∞ = k(a1j , a2j , . . . , anj )k∞ = max |aij | .
i
Therefore
kAxk∞
kAk ≥ = max |aij | .
kxk1 i

This is true for all 1 ≤ j ≤ n, hence kAk = maxi,j |aij |.


Theorem 3.14. For any A, B ∈ L(X, Y ) and λ ∈ K, we have
(1) kAk = 0 ⇐⇒ A = 0.
(2) kλAk = |λ| kAk.
(3) kA + Bk ≤ kAk + kBk.
Therefore L(X, Y ) equipped with the operator norm is itself a normed vector space.
Proof. (1) Assume that kAk = 0. Then ∀x ∈ X: kAxk ≤ 0 kxk = 0 =⇒ kAxk = 0 =⇒ Ax = 0. Therefore
A = 0. Conversely, if A = 0, then
kAk = sup kAxk / kxk = sup 0 = 0.
x6=0 x∈X

(2)
kλAxk kAxk
kλAk = sup = |λ| sup = |λ| · kAk .
x6=0 kxk x6=0 kxk
(3) For any x ∈ X
k(A + B)xk = kAx + Bxk ≤ kAxk + kBxk ≤ kAk kxk + kBk kxk .
Therefore kA + Bk ≤ kAk + kBk. 
Example 3.15. Let X = (Rn , k−k1 ). We will show that the dual space X ∗ = L(X, R) is given by
(Rn , k−k∞ ). A linear map f : Rn → R can be written in the form
n
X
f (x) = fi xi x ∈ Rn
i=1
for some (f1 , . . . , fn ) ∈ Rn (we will denote this vector also by f ). Then
X X
kf (x)k = fi xi ≤ max |fi | · |x|i = kf k∞ · kxk1 ,

i
hence kf k = supx6=0 kf (x)k / kxk1 ≤ kf k∞ . On the other hand, if kf k∞ = |fi | for some i, then for x = ei
(element of the standard basis) we have kf (x)k / kxk1 = |fi |, hence kf k = supx6=0 kf (x)k / kxk1 = kf k∞ .
Similarly one can show that the dual space of (Rn , k−k∞ ) is given by (Rn , k−k1 ). More generally, for any
p, q ≥ 1 such that p1 + 1q = 1, the dual space of (Rn , k−kp ) is given by (Rn , k−kq ).
Lemma 3.16. Let A ∈ L(X, Y ) and B ∈ L(Y, Z). Then BA is bounded and kBAk ≤ kBk · kAk.
Proof. For any x ∈ X we have
kBAxk ≤ kBk · kAxk ≤ kBk · kAk · kxk .
This implies kBAk ≤ kBk · kAk. 
Remark 3.17. The previous result implies that the normed vector space L(X) = L(X, X) has a multipli-
cation given by composition of operators. This means that L(X) is equipped with a structure of an algebra.
Its identity element is the identity operator.
36

3.4. Equivalent norms.


Definition 3.18. Two norms k·k1 and k·k2 on a vector space X are called equivalent if there exist C, C 0 > 0
such that
kxk2 ≤ C kxk1 , kxk1 ≤ C 0 kxk2 , ∀x ∈ X.
Theorem 3.19. Two norms k·k1 and k·k2 on a vector space X are equivalent if and only if they generate
the same topology on X.
Proof. Consider the identity operator
A: (X, k·k1 ) → (X, k·k2 ), x 7→ x.
Then the norms k·k1 and k·k2 are equivalent if and only if A and its inverse are bounded linear operators:
(1) If kxk2 ≤ C kxk1 =⇒ kAxk2 ≤ C kxk1 =⇒ kAk ≤ C.
(2) If A is bounded, then kxk2 = kAxk2 ≤ kAk kxk1 and we can take C = kAk.
(3) The argument for A−1 is the same.
On the other hand the norms k·k1 and k·k2 generate the same topology if and only if A and its inverse are
continuous (the preimage of an open set is open). But we know that for linear operators continuity and
boundedness are equivalent. 
Example 3.20. Consider the vector space X = C[0, 1] with the norms
Z 1
kxk1 = |x(t)| dt, kxk∞ = sup |x(t)| .
0 x∈[0,1]
n
These norms are not equivalent. The sequence xn (t) = t converges to zero with respect to k·k1 , but it
diverges with respect to k·k∞ (it converges pointwise to t 7→ δ1t which is not continuous).
Theorem 3.21. Let X be a finite dimensional vector space. Then all norms on X are equivalent.
Proof. Fixing a basis (e1 , . . . , en ) of X, we can assume that X = Rn . We have to prove that any norm k·k
on X is equivalent to the fixed norm
Xn
kxk1 = |xk | , x = (x1 , . . . , xn ).
k=1
Consider the identity operator A: (X, k·k1 ) → (X, k·k). Then
X X
kAxk = kxk = kx1 e1 + · · · + xn en k ≤ |xi | · kei k ≤ C |xi | = C kxk1 ,
where C = maxi kei k. Therefore A is bounded and continuous. Define
kAxk
a = inf = inf kAxk .
x6=0 kxk1 kxk1 =1

The last minimum is attained as the set { x ∈ Rn | kxk1 = 1} is compact and the map (Rn , k·k1 ) → R,
x 7→ kAxk is continuous by the previous argument. This implies that a > 0. We have kxk / kxk1 ≥ a for all
x 6= 0, hence
1
kxk1 ≤ kxk .
a
Therefore the norms are equivalent. 
Theorem 3.22. Let A: X → Y be a linear operator between normed vector spaces. If X is finite-dimensional,
then A is continuous.
n
Proof. Choosing a basis (e1 , . . . , en ) of X, we can assumePnthat X = R . As all norms on X are equivalent,
we can assume that the norm on X is given by kxk1 = i=1 |xi |. Then
X  X X
kAxk = A xi ei = xi Aei ≤ kxi Aei k

X X
= |xi | kAei k ≤ C |xi | = C kxk1 ,
where C = maxi kAei k. This implies that A is bounded. 
37

3.5. Banach spaces.


Definition 3.23. A Banach space is a normed vector space (X, k·k) which is complete (with respect to
the metric d(x, y) = kx − yk).
Example 3.24. Any finite dimensional normed vector space is Banach. We have seen that the Euclidean
space Rn is complete. Any other norm on Rn is equivalent to the Euclidean norm.
Example 3.25. We have seen that the space of continuous functions C[a, b] with the supremum metric is
complete. Therefore C[a, b] with the supremum norm
kxk∞ = sup |x(t)|
t∈[a,b]

is a Banach space.
R1
Example 3.26. Consider the space C[0, 1] equipped with the norm kxk1 = 0 |x(t)| dt. Let us show that
it is non-Banach. Consider the function f (t) = 0 for t ∈ [0, 21 ] and f (t) = 1 for t ∈ ( 12 , 1]. This is a
non-continuous function which can be interpolated by the continuous functions (for n ≥ 2)

0
 t ≤ 12
xn (t) = n(t − 2 ) 12 ≤ t ≤ 12 + n1
1

t ≥ 21 + n1

1

We have kf − xn k1 ≤ 1/n, hence the sequence (xn ) is Cauchy. On the other hand if it converges to some
continuous function g with respect to the norm k−k1 , then kf − gk1 = 0. This would imply that g(t) = 0
for t < 12 and g(t) = 1 for t > 12 , contradicting the continuity of g.
Theorem 3.27. The space `p is Banach.
Proof. Let (x(n) )n≥1 be a Cauchy sequence in `p . Then ∀ε > 0 ∃N > 0:
X p 1/p
(m) (m) (n)
(2) − x(n) = xi − xi <ε ∀m, n ≥ N.

x
p
i≥1

Then for any k ≥ 1


X p 1/p
(m) (n) (m) (n)
xk − x k ≤ xi − x i < ε.
i≥1
(n)
Therefore the sequence in R is Cauchy, hence converges to some xk ∈ R. Consider the sequence
(xk )n≥1
x = (xk )k≥1 . Taking the limit m → ∞ in (2), we obtain

x − x(n) ≤ ε.

p
(n) (n)
This implies that x − x ∈ `p , hence x = (x − x ) + x(n) ∈ `p . This also implies that (x(n) )n≥1 converges
to x. 

Theorem 3.28. Let X, Y be two normed vector spaces and assume that Y is a Banach space. Then the
space of bounded linear operators L(X, Y ) is a Banach space.
Proof. Let (An )n≥1 be a Cauchy sequence in L(X, Y ). For any x ∈ X and for any ε > 0 there exists N > 0
s.t.
kAm − An k < ε ∀m, n ≥ N.
Therefore
kAm x − An xk ≤ kAm − An k kxk ≤ ε kxk , m, n ≥ N
and this implies that (An x) is a Cauchy sequence in Y . Therefore it is convergent to some Ax ∈ Y . Let us
show that A: X → Y is linear. For any x, y ∈ X, we have
A(x + y) = lim An (x + y) = lim (An x + An y) = lim An x + lim An y = Ax + Ay.
n→∞ n→∞ n→∞ n→∞
38

Similarly one proves that A(λx) = λAx for any A ∈ R, x ∈ X. Let us show that A is bounded. For any
ε > 0 there exists N > 0 such that
kAm − An k < ε ∀m, n ≥ N.
Therefore
kAm x − An xk ≤ kAm − An k kxk = ε kxk
and
kAx − An xk ≤ ε kxk .
This implies that A − An is bounded and therefore A is bounded. We also obtain that
kA − An k ≤ ε ∀n ≥ N.
This implies that An → A as n → ∞. 

Remark 3.29. Given a normed vector space (X, k·k) over R, we defined its dual space to be X = L(X, R).
Its elements are continuous linear maps f : X → R, called linear functionals. Note that X ∗ is a Banach space
(as R is complete), although X can be non-Banach. This implies that X ∗∗ is not necessarily isomorphic to
X (although there is a canonical map X → X ∗∗ ).
Definition 3.30. Let X be a normed vector space and (xn )n≥1 be a sequence in X.
P
(1) The (formal) series n≥1 xn is called convergent to x ∈ X if the sequence of partial sums (sn )n≥1
Pn P
with sn = k=1 xk converges to x. We denote the limit limn→∞ sn by n≥1 xn .
P P
(2) The series n≥1 xn is called absolutely convergent if the series n≥1 kxn k converges in R.
P∞
Theorem
P∞ 3.31. Let X be a Banach space and n=1 xn be an absolutely convergent series in X. Then
n=1 xn is convergent.

Proof. For any n ≥ 1, let


n
X n
X
sn = xk ∈ X, an = kxk k ∈ R.
k=1 k=1
By assumption, the sequence (an )n≥1 is convergent and in particular Cauchy. Therefore ∀ε > 0 ∃N > 0
|am − an | < ε ∀m, n ≥ N.
If n > m ≥ N , then
n
X
ksn − sm k = kxm+1 + · · · + xn k ≤ kxk k = an − am < ε,
k=m+1

hence the sequence (sn )n≥1 in X is Cauchy. As X is complete, we conclude that (sn )n≥1 is convergent. 
Theorem 3.32. Let X be a Banach space and A ∈ L(X). Then the series
X Ak
eA =
k!
k≥0

is convergent.
Proof. As X is Banach, the space L(X) = L(X, X) equipped with an operator norm is also Banach. To
prove the theorem, we have to show that the above series is absolutely convergent. We have
X X 1 k X 1 k
Ak /k! = A ≤ kAk = ekAk ,
k! k!
k≥0 k≥0 k≥0
k
where we used the fact that kABk ≤ kAk · kBk (and therefore Ak ≤ kAk ). This means that the series
k
A
P
k≥0 k! in L(X) is absolutely convergent and therefore convergent. 
Definition 3.33. Let X be a normed vector space and let I ∈ L(X) be the identity operator defined by
I(x) = x for x ∈ X. An operator A ∈ L(X) is called invertible in L(X) if there exists an operator B ∈ L(X)
such that AB = BA = I. The operator B is called the inverse of A and is denoted by A−1 .
39

Theorem 3.34. Let X be a Banach Pspace and let A ∈ L(X) with kAk < 1. Then the operator I − A is
invertible in L(X) and it inverse is k≥0 Ak .
Proof. The normed vector space L(X) is Banach. To show that k≥0 Ak is convergent, we have to show
P
that it is absolutely convergent. We have
X X
Ak ≤ k 1
kAk = < ∞.
1 − kAk
k≥0 k≥0
Pn
Ak is absolutely convergent and therefore convergent. Let sn = k
P
This means that k≥0 k=0 A and
B = limn→∞ sn = k≥0 Ak . Then
P

(I − A)B = (I − A) lim sn = lim (I − A)sn


n→∞ n→∞
= lim ((I + A + · · · + An ) − (A + A2 + · · · + An+1 )) = lim (I − An+1 ) = I
n→∞ n→∞
n
as kAn k ≤
PkAk → 0 as n → ∞. Similarly one can show that B(I − A) = I. Therefore I − A is invertible
and B = k≥0 Ak is its inverse. 
Corollary 3.35. Let X be a Banach space and let A ∈ L(X) be invertible in L(X). Then for any B ∈ L(X)
−1
with kBk < A−1 . The operator A + B is invertible.
Proof. It is enough to show that the operator A−1 (A + B) = I + C (where C = A−1 B) is invertible. We
have −1
kCk = A−1 B ≤ A−1 · kBk < A−1 · A−1 = 1.

Therefore by the above theorem I + C is invertible. 


Assume that we have a bounded linear operator A ∈ L(X) that is bijective. Then its inverse map
A−1 : X → X is automatically linear, but it is not bounded in general (hence A is not invertible in L(X) in
general). However, one can show that A−1 is bounded if X is a Banach space.
Theorem 3.36 (Banach-Shauder Theorem). If X, Y are Banach spaces and A ∈ L(X, Y ) is surjective then
it maps open sets in X to open sets in Y .
Corollary 3.37. Let X be a Banach space and let A ∈ L(X) be bijective. Then the linear operator A−1 : X →
X is bounded. Therefore A is invertible in L(X).
Proof. By the above theorem A maps open sets to open sets. This means that the preimages of open sets
with respect to A−1 are open. Therefore A−1 is continuous, or equivalently, bounded. 
Example 3.38. Let X be the set of sequences (xn )n≥1 in R that have only finitely many nonzero elements.
We equip X with the componentwise addition and scalar multiplication and with the norm kxk = supn≥1 |xn |.
Then the linear operator
A: X → X, x 7→ (xn /n)n≥1
is bounded and bijective, while its inverse A−1 : X 7→ X, x 7→ (nxn )n≥1 is not bounded. This implies that A
is not invertible in L(X). By the previous result we conclude that X is not Banach.
40

Appendix A. Some inequalities


Theorem A.1 (Cauchy-Schwarz inequality). For all x, y ∈ Rn , we have
|(x, y)| ≤ kxk · kyk .
Proof. We have
n X
X n n
X n
X n
X n
X n
X n
X
(xi yj − xj yi )2 = x2i yj2 + yi2 x2j − 2 xi yi xj yj
i=1 j=1 i=1 j=1 i=1 j=1 i=1 j=1
2 2 2 2 2 2
= kxk kyk + kyk kxk − 2(x, y)(x, y) = 2 kxk kyk − 2(x, y)2 .
As the first expression above is non-negative, we obtain
2 2
2 kxk kyk − 2(x, y)2 ≥ 0,
hence |(x, y)| ≤ kxk kyk. 
Theorem A.2 (Minkowski’s inequality). For any p ≥ 1, we have
n
!1/p n
!1/p n
!1/p
X p
X p
X p
|xi + yi | ≤ |xi | + |yi |
i=1 i=1 i=1

First, let us prove two auxiliary inequalities.


1 1
Theorem A.3 (Young’s inequality). If p, q > 1 satisfy p + q = 1, then
p q
x y
xy ≤ + , x, y ≥ 0.
p q
Proof. As f (x) = − log(x) is convex (convex-downward) and p1 + 1q = 1, we obtain
 
1 p 1 q 1 1
f x + y ≤ f (xp ) + f (y q )
p q p q
hence  
1 p 1 q 1 1
log x + y ≥ log(xp ) + log(y q ) = log(xy)
p q p q
and xy ≤ p1 xp + 1q y q . 

Remark A.4. A convex (convex-downward) function is a function f : [a, b] → R satisfying


f (tx + (1 − t)y) ≤ tf (x) + (1 − t)f (y)
for 0 < t < 1 and a ≤ x < y ≤ b. If f is twice differentiable and f 00 ≥ 0, then f is convex.
Theorem A.5 (Hölder inequality). If p, q > 1 satisfy 1/p + 1/q = 1, then
n n
!1/p n
!1/q
X X p
X q
|xi yi | ≤ |xi | · |yi |
i=1 i=1 i=1
P p P q
Proof. We can assume that xi , yi ≥ 0. Also we can rescale our vectors to get xi = yi = 1. Then we
obtain
n
!1/p n
!1/q
X 1X p 1X q 1 1 X p
X q
xi yi ≤ xi + yi = + = 1 = |xi | · |yi | .
p q p q i=1 i=1

Remark A.6. For p = 1, we would get q = ∞ and the corresponding inequality should have the form
X X
|xi yi | ≤ kxk1 · kyk∞ = |xi | · max |yi |
i

which is obviously true.


41

Remark A.7. For p = q = 2, we obtain the Cauchy-Schwarz inequality


X
(x, y) = xi yi ≤ kxk · kyk .
Proof of Minkowski’s inequality. If p = 1, the inequality follows from the triangle inequality for real numbers
|x + y| ≤ |x| + |y|. If p > 1, let q > 1 be defined by 1/p + 1/q = 1. We can assume that xi , yi ≥ 0. Then
n
X n
X n
X
(xi + yi )p = (xi + yi )p−1 xi + (xi + yi )p−1 yi
i=1 i=1 i=1
n
!1/q  n
!1/p n
!1/p 
X X X
≤ (xi + yi )(p−1)q  xpi + yip 
i=1 i=1 i=1
Pn 1/q
From 1/p + 1/q = 1 we obtain (p − 1)q = p, hence we can divide both sides by ( i=1 (xi + yi )p ) and get
n
!1/p n
!1/p n
!1/p
X X X
(xi + yi )p ≤ xpi + yip
i=1 i=1 i=1

Theorem A.8 (Jensen’s inequality). P Let f be a convex function, let x1 , . . . , xn be elements in its domain
and let p1 , . . . , pn ≥ 0 satisfy pi = 1. Then
X  X
f p i xi ≤ pi f (xi ).

Intuitive P
explanation.
P Put particles with mass pi at the points
P (xi , f (xi ))
Pof the plain. Then the center of
gravity ( pi xi , pi f (xi )) will be over the curve, hence pi f (xi ) ≥ f ( pi xi )). 
Proof. By induction on n
n+1
! n
!
X X pi
f pi xi = f (1 − pn+1 ) xi + pn+1 xn+1
i=1 i=1
1 − pn+1
n
! n
X pi X
≤ (1 − pn+1 )f xi + pn+1 f (xn+1 ) ≤ pi f (xi ) + pn+1 f (xn+1 ).
i=1
1 − pn+1 i=1

Remark A.9. We can consider the sequence (pi ) as a collection
P of probabilities Pand the sequence x = (xi )
as a random variable. Then we define expectations E[x] = pi xi , E[f (x)] = pi f (xi ) and we can write
Jensen’s inequality in the form
f (E[x]) ≤ E[f (x)].

You might also like