Mul Downey
muldown
2010/1/10
page 1
Contents
Preface
1 The Real Number System & Finite Dimensional Cartesian Space
3 Riemann Integration
3.1 Content and the Riemann Integral
3.1.1 Partition of I
3.1.2 Riemann Sums
Exercises
3.2 Cauchy Criteria and Properties of Integrals
Exercises
3.3 Evaluation of Integrals
3.3.1 Real valued functions of a real variable
3.3.2 Real valued functions on $\mathbb{R}^2$
3.3.3 Real valued functions on $\mathbb{R}^n$
Exercises
Preface
These notes are not intended as a textbook. It is hoped, however, that they
will minimize the amount of note-taking activity which occupies so much of a student's class time in most courses in mathematics. Since the material is presented in
the sterile definition, theorem, proof form without much background colour or
discussion, most students will find it profitable to use the notes in conjunction with
a textbook recommended by the instructor.
Probably the most important aspect of the notes is the set of exercises. You
should develop the practice of attempting several of these problems every week.
Many of the problems are quite difficult so please consult your instructor if you
are not blessed with success initially. Do not acquire the habit of abandoning a
problem if it does not yield to your first attempt; a defeatist attitude is your greatest
adversary. Solution of a problem, even with some assistance from the teacher when
necessary, is a fine boost to your morale. You will find that a strong effort expended
on the earlier part of the course will be rewarded by growing self-confidence and
easier success later.
NOTATION: Except when specified otherwise, upper case (capital) letters will
denote sets and lower case (small) letters will denote elements of sets.
$a \in A$ means $a$ is an element of the set $A$.
$A \subseteq B$ means $A$ is a subset of $B$.
$a \notin A$ means $a$ is NOT an element of the set $A$.
$A \not\subseteq B$ means $A$ is NOT a subset of $B$.
Remark The slash through any symbol will mean the negation of the corresponding statement.
$B \supseteq A$ means $B$ contains $A$.
$A \subsetneq B$ means $A$ is a proper subset of $B$.
$P \implies Q$ means statement $P$ implies statement $Q$.
$P \iff Q$ means $P$ holds if and only if $Q$ holds.
$\exists$ there exists.
$\forall$ for all.
s.t. such that.
end of proof.
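$\{x : \ldots\}$ means the set of all things $x$ that satisfy the conditions specified in $\ldots$. For
example, see the following.
$A \cup B$ The union of sets $A$ and $B$, $\{x : x \in A \text{ or } x \in B\}$.
$A \cap B$ The intersection of sets $A$ and $B$, $\{x : x \in A \text{ and } x \in B\}$.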
Chapter 1
1.1
1.1.1 Fields
A field is a set $F$ together with two binary operations $+$ and $\cdot$ (addition, multiplication) which satisfy the following axioms: For all $a, b, c, \ldots$ in $F$
F1 $a + b \in F$ and $a \cdot b \in F$ (closure)
F2 $a + b = b + a$ and $a \cdot b = b \cdot a$ (commutativity)
F3 $a + (b + c) = (a + b) + c$ and $a \cdot (b \cdot c) = (a \cdot b) \cdot c$ (associativity)
F4 $(a + b) \cdot c = (a \cdot c) + (b \cdot c)$ (distributivity)
F5 There exist unique elements $0$ and $1$ in $F$, $0 \neq 1$, such that
$$a + 0 = a \quad \text{and} \quad a \cdot 1 = a, \quad \forall a \in F.$$
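The axioms F1 to F5 can be checked by brute force on a small finite example. The sketch below is only an illustration: it assumes the integers modulo 5 as the candidate field (this example and the names `P`, `add`, `mul`, `check_axioms` are not taken from the notes), and it tests only the axioms listed above.

```python
from itertools import product

P = 5          # assumed modulus for the illustration; any prime would do
F = range(P)   # the candidate field {0, 1, 2, 3, 4}

def add(a, b):
    return (a + b) % P

def mul(a, b):
    return (a * b) % P

def check_axioms():
    """Brute-force check of F1-F5 on the integers modulo P."""
    for a, b, c in product(F, repeat=3):
        assert add(a, b) in F and mul(a, b) in F                    # F1 closure
        assert add(a, b) == add(b, a) and mul(a, b) == mul(b, a)    # F2 commutativity
        assert add(a, add(b, c)) == add(add(a, b), c)               # F3 for +
        assert mul(a, mul(b, c)) == mul(mul(a, b), c)               # F3 for .
        assert mul(add(a, b), c) == add(mul(a, c), mul(b, c))       # F4 distributivity
    assert all(add(a, 0) == a and mul(a, 1) == a for a in F)        # F5 identities
    return True

print(check_axioms())
```

A finite check of this kind proves nothing about infinite fields such as $\mathbb{R}$; it only makes the content of each axiom concrete.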
Chapter 1. The Real Number system & Finite Dimensional Cartesian Space
1. $a \cdot b$ will henceforth be written $ab$.
2. $a(b + c) = ab + ac$ is a consequence of axioms F2 and F4, and need not be
separately assumed. (Verify this for yourself.)
3. The elements $(-a)$ and $a^{-1}$ are uniquely determined by $a$. Suppose, for
example, that there are two elements $(-a_1)$, $(-a_2)$ that satisfy $a + (-a_1) = 0 = a + (-a_2)$. Then
$$(-a_1) = (-a_1) + 0 = (-a_1) + (a + (-a_2)) = ((-a_1) + a) + (-a_2) = 0 + (-a_2) = (-a_2).$$
4. It is customary to write $a - b$ for $a + (-b)$ and $\frac{a}{b}$ for $ab^{-1}$.
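Example 1.2
i A two-element field containing $e$, with the operations $+$ and $\cdot$ defined by tables.
ii The set $\mathbb{Q}$ of rational numbers $\frac{m}{n}$ ($n \neq 0$) with the usual addition and multiplication.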
iii The sets R and C of real and complex numbers respectively with the usual
addition and multiplication are fields.
iv The set $\mathbb{Q}(t)$ of rational functions with rational coefficients (i.e. functions of
the form $\frac{p(t)}{q(t)}$ where $p(t)$ and $q(t)$ are polynomials with rational coefficients)
is a field.
v The set N = {1, 2, 3, 4, . . .} of natural numbers and the set Z of integers are
NOT fields.
1.1.2
Ordered Fields
and $ab \in P$.
O2 $0 \notin P$.
O3 $\forall x \in F$, $x \neq 0 \implies x \in P$ or $-x \in P$.
Remark: Every ordered field contains Q as a subfield (we do not prove this). Thus,
Q may be characterized as an ordered field containing no ordered proper subfield,
i.e., Q is the smallest ordered field. (Two ordered fields are considered the same if
they are isomorphic and the isomorphism preserves the order.) A discussion of this
point may be found in
C. Goffman, Real Functions, Proposition 1 and 2 in Chapter 3.
E. Hewitt and K. Stromberg, Real and Abstract Analysis, Theorem 5.9.
We define a relation $>$ on an ordered field as follows: If $a, b \in F$, write $a > b$
(equivalently, $b < a$) if $a - b \in P$. Then the axioms O1, O2, and O3 have the
following consequences:
Proposition 1.3 (Properties of <).
i a > b, b > c implies a > c.
ii If $a, b \in F$, then exactly one of the following holds: $a > b$, $b > a$, $a = b$.
iii $a \ge b$, $b \ge a$ implies $a = b$.
iv $a > b$ implies $a + c > b + c$ for each $c \in F$.
v $a > b$, $c > d$ implies $a + c > b + d$.
vi $a > b$, $c > 0$ implies $ac > bc$ and $a > b$, $c < 0$ implies $ac < bc$.
vii $a > 0$ implies $a^{-1} > 0$ and $a < 0$ implies $a^{-1} < 0$.
viii $a > b$ implies $a > \frac{a+b}{2} > b$.
Exercises
1.1. Establish the following properties of a field:
(a) $a \cdot 0 = 0$
(c) $(-a)(-b) = ab$
(b) $a \neq 0 \implies a^2 \in P$
(c) If $n \in \mathbb{N}$, then $n \in P$.
(d) The field {, e} in Example 1.2(i) cannot be ordered.
(e) The field C of complex numbers cannot be ordered.
1.1.3
The notion of completeness for an ordered field will stretch your imagination a little
further. We will introduce some terminology necessary to discuss this topic. Let S
be a subset of an ordered field F .
(a) An element $u \in F$ is an upper bound of the set $S$ if $s \le u$, $\forall s \in S$.
(b) $w \in F$ is a lower bound of $S$ if $w \le s$, $\forall s \in S$.
1.1.4
Properties of R
1
n
1
n
< a.
> 0.
m
< b.
n
Exercises
1.3. Guess the supremum and infimum of the following sets (when they exist):
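$(0, 1) = \{x : 0 < x < 1\}$, $[0, 1] = \{x : 0 \le x \le 1\}$,
$\mathbb{N} = \{1, 2, 3, 4, \ldots\}$,
$\{\frac{1}{n} : n = 1, 2, 3, 4, \ldots\}$.
1.4. If $a > 0$, there exists $n \in \mathbb{N}$ such that $0 < \frac{1}{2^n} < a$. Hint: Show $2^n > n$,
$\forall n \in \mathbb{N}$.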
1.5. Show that $\mathbb{Q}$ is Archimedean.
1.6. Let $F$ be an Archimedean ordered field containing an irrational element $\xi$.
Show that if $a, b \in F$, $a < b$, then there is an irrational element $\eta$ such that
$a < \eta < b$.
1.7. Show that $\mathbb{R}$ contains an irrational element. Hint: Show first that no rational
$p$ satisfies $p^2 = 2$. Then show that $p = \sup\{x > 0 : x^2 < 2\}$ must satisfy
$p^2 = 2$.
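The supremum in the hint to Exercise 1.7 can be bracketed by rationals. The following is a minimal sketch, assuming bisection on the interval $[1, 2]$ with exact rational arithmetic (the function name `sup_approx` and the starting interval are choices made here, not part of the notes). Every lower endpoint lies in $\{x > 0 : x^2 < 2\}$ and every upper endpoint is an upper bound of that set.

```python
from fractions import Fraction

def sup_approx(steps):
    """Bisect [1, 2]: lo always satisfies lo^2 < 2 and hi always satisfies hi^2 > 2."""
    lo, hi = Fraction(1), Fraction(2)
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid * mid < 2:
            lo = mid      # mid still belongs to the set {x > 0 : x^2 < 2}
        else:
            hi = mid      # mid is an upper bound of the set
    return lo, hi

lo, hi = sup_approx(30)
print(float(lo), float(hi))
```

No rational midpoint ever squares to exactly 2, which is the first half of the hint; the brackets shrink to a single real number only because $\mathbb{R}$ is complete.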
Theorem 1.10. Let $I_n = [a_n, b_n]$, and $I_{n+1} \subseteq I_n$, $\forall n \in \mathbb{N}$. Then
$\bigcap_{n=1}^{\infty} I_n \neq \emptyset$. In
other words, a nested sequence of closed intervals has at least one point common to
all intervals.
Proof. First note that $a_n < b_m$ for all $n, m$. Thus, each $b_m$ is an upper bound for
the set $\{a_n : n \in \mathbb{N}\}$. Therefore, $a := \sup\{a_n : n \in \mathbb{N}\} \le b_m$ for all $m$. It follows
that $a_n \le a \le b_n$ for all $n$. Thus, $a \in I_n$ for all $n$.
$$-|b| \le b \le |b|$$
which is the desired right hand inequality. This implies the left hand inequality
since
$$|b| = |b - a + a| \le |b - a| + |a| \implies |b| - |a| \le |b - a| = |a - b|,$$
interchanging $a$ and $b$ $\implies |a| - |b| \le |a - b|$.
Exercises
1.9. Let $F$ be an ordered field, with the property that if $\{I_n\}$ is a nested sequence
of closed intervals in $F$, then
$\bigcap_{n=1}^{\infty} I_n \neq \emptyset$. Show that $F$ is complete. (Remark:
This exercise and Theorem 1.10 show that the supremum (completeness)
property and the nested interval property are equivalent.)
1.10. Let $I_n = (0, \frac{1}{n})$. Show that
$\bigcap_{n=1}^{\infty} I_n = \emptyset$.
1.11. Let $K_n = [n, \infty) := \{x : x \ge n\}$. Show that
$\bigcap_{n=1}^{\infty} K_n = \emptyset$.
1.12. If a set $S$ of real numbers contains one of its upper bounds $a$, then $a = \sup S$.
Such a supremum is called a maximum.
1.13. Show that $S \subseteq \mathbb{R}$ cannot have two suprema.
1.14. Show that an ordered field $F$ is complete if and only if every non-empty
subset of $F$ which has a lower bound has an infimum.
1.15. Show that $\mathbb{Q}$ is not complete.
1.16. If $S \subseteq \mathbb{R}$ is bounded and $S_0 \subseteq S$, show
$$\inf S \le \inf S_0 \le \sup S_0 \le \sup S.$$
1.17. If $S = \{(-1)^n (1 - \frac{1}{n}) : n = 1, 2, \ldots\}$, find $\sup S$, $\inf S$. Prove any statements
you make.
1.18. Prove (i)-(iv) of Proposition 1.12.
1.2 Cartesian Spaces
Figure 1.1. The sum of vectors $p$, $q$ and the scalar multiple $kp$, $k > 1$.
(iv) $x + (-1)x = O$
(v) $1x = x$, and $0x = O$
(vi) $\alpha(\beta x) = (\alpha\beta)x$
(vii) $\alpha(x + y) = \alpha y + \alpha x$
There is another kind of product on $\mathbb{R}^n$:
Definition 1.15. The inner product or dot product of two vectors $x$ and $y$ is the
quantity
$$x \cdot y := \sum_{i=1}^{n} x_i y_i = x_1 y_1 + \ldots + x_n y_n.$$
by (i)
Hence, $x \cdot y \le |x|\,|y|$. If equality holds in the last expression, then working backwards through the proof yields
$$O = z = |y|x - |x|y, \quad \text{i.e. } x = \frac{|x|}{|y|}\, y.$$
$$|x_1 y_1 + \ldots + x_n y_n| \le \left(x_1^2 + \ldots + x_n^2\right)^{1/2} \left(y_1^2 + \ldots + y_n^2\right)^{1/2}.$$
We have
$$\begin{aligned}
|x + y|^2 &= (x + y) \cdot (x + y) = x \cdot x + 2x \cdot y + y \cdot y \\
&= |x|^2 + 2x \cdot y + |y|^2 \\
&\le |x|^2 + 2|x|\,|y| + |y|^2 \quad \text{(CBS inequality)} \\
&= (|x| + |y|)^2
\end{aligned}$$
$$\implies |x + y| \le |x| + |y|.$$
The left-hand inequality follows from this just as in the scalar triangle inequality.
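The inner product, the norm it induces, and the two inequalities just derived can be spot-checked numerically. This is a sketch only: the helper names `dot`, `norm`, `check` and the small floating-point tolerance `tol` are assumptions made for the illustration, and random sampling is evidence, not a proof.

```python
import math
import random

def dot(x, y):
    """Inner product x . y = x_1 y_1 + ... + x_n y_n."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """Euclidean norm |x| = (x . x)^(1/2)."""
    return math.sqrt(dot(x, x))

def check(x, y, tol=1e-9):
    """Return (CBS holds, triangle inequality holds), up to rounding error."""
    cbs = abs(dot(x, y)) <= norm(x) * norm(y) + tol
    s = [xi + yi for xi, yi in zip(x, y)]
    tri = norm(s) <= norm(x) + norm(y) + tol
    return cbs, tri

random.seed(0)
results = [
    check([random.uniform(-5, 5) for _ in range(4)],
          [random.uniform(-5, 5) for _ in range(4)])
    for _ in range(1000)
]
print(all(c and t for c, t in results))
```

Trying `check([1, 2], [2, 4])` exercises the equality case of the CBS inequality, where one vector is a nonnegative multiple of the other.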
Proposition 1.20 (Properties of Norm).
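$\bigcap_{k=1}^{\infty} I_k \neq \emptyset$.
Proof. If $I_k = I_{k,1} \times \ldots \times I_{k,n}$, $k = 1, 2, \ldots$, where the $I_{k,i}$ are closed intervals in
$\mathbb{R}$, then for each $i$, $I_{k+1,i} \subseteq I_{k,i}$, $k = 1, 2, \ldots$, and
$\bigcap_{k=1}^{\infty} I_{k,i} \neq \emptyset$. Thus, there exists
$x_i \in I_{k,i}$, $\forall k \in \mathbb{N}$, and each $i = 1, 2, \ldots, n$. Hence, $(x_1, \ldots, x_n) \in I_{k,1} \times \ldots \times I_{k,n} = I_k$, $\forall k \in \mathbb{N}$.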
1.2.1 Functions
$f([-1, 1/2]) = [0, 1]$, $f^{-1}([-1, 1/2]) = [0, 1/\sqrt{2}]$.
Definition 1.25. A function $f : A \to B$ is one-to-one if it also satisfies
$$(x_1, y), (x_2, y) \in f \implies x_1 = x_2.$$
We write this $f : A \xrightarrow{1-1} B$.
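Definition 1.25 treats a function as a set of ordered pairs, which translates directly into a check on finite examples. The sketch below assumes that representation; the names `is_function`, `is_one_to_one`, `square` and `double` are illustrative and do not come from the notes.

```python
def is_function(pairs):
    """True if no x is paired with two different values y."""
    seen = {}
    for x, y in pairs:
        if x in seen and seen[x] != y:
            return False
        seen[x] = y
    return True

def is_one_to_one(pairs):
    """True if (x1, y) and (x2, y) in the set force x1 == x2."""
    seen = {}
    for x, y in pairs:
        if y in seen and seen[y] != x:
            return False
        seen[y] = x
    return True

square = {(x, x * x) for x in range(-2, 3)}   # x -> x^2 on {-2, ..., 2}
double = {(x, 2 * x) for x in range(-2, 3)}   # x -> 2x on {-2, ..., 2}

print(is_function(square), is_one_to_one(square))
print(is_function(double), is_one_to_one(double))
```

The squaring map fails the test because $(-1, 1)$ and $(1, 1)$ both belong to it, while the doubling map passes.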
Exercises
1.19. Prove Corollary 1.18.
1.20. Show $|x + y|^2 + |x - y|^2 = 2(|x|^2 + |y|^2)$ for all $x, y \in \mathbb{R}^n$. (parallelogram
identity)
1.21. If $x = (x_1, \ldots, x_n)$, show
$$|x_i| \le |x| \le \sqrt{n}\, \sup\{|x_1|, \ldots, |x_n|\}, \quad \text{for } i = 1, \ldots, n.$$
1.22. Show that $|x + y|^2 = |x|^2 + |y|^2 \iff x \cdot y = 0$. In this case, $x$ and $y$ are said
to be orthogonal. This is sometimes denoted by $x \perp y$.
1.23. Is it true that
$$|x + y| = |x| + |y| \iff x = \alpha y \text{ or } y = \alpha x \text{ with } \alpha \ge 0?$$
1.24. Two sets $A$ and $B$ have the same cardinality if there is a one-to-one function
$\varphi : A \to B$ such that $\varphi(A) = B$ and $\varphi^{-1}(B) = A$. Show that the following
sets have the same cardinality:
(a) N = {1, 2, 3, . . .} and 2N = {2, 4, 6, 8, . . .}.
1.25. A set A is said to be finite if it has the same cardinality as some initial segment
{1, . . . , n} of the natural numbers, and is said to be infinite otherwise. (Thus,
finite means that the elements can be labeled a1 , . . . , an .) Show that a finite
set of real numbers contains its inf and its sup. Hint: Induction.
1.26. A set A is countable if it has the same cardinality as N, the set of natural
numbers, or if it is finite. Otherwise, the set is said to be uncountable. Show
that a countable set need not contain its sup or its inf. (Countability means
all elements of the set can be labeled by the natural numbers {a1 , a2 , a3 , . . .}.)
1.27. Show that the union of a countable collection of countable sets is countable.
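Hint:
$$\begin{array}{lllll}
S_1 : & a_{1,1} & a_{1,2} & a_{1,3} & \ldots \\
S_2 : & a_{2,1} & a_{2,2} & a_{2,3} & \ldots \\
S_3 : & a_{3,1} & a_{3,2} & a_{3,3} & \ldots \\
S_4 : & a_{4,1} & a_{4,2} & a_{4,3} & \ldots \\
\vdots & \vdots & \vdots & \vdots &
\end{array}$$
The elements may be counted by the scheme indicated. Deduce that $\mathbb{Q}$ is
countable.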
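One way to read the counting scheme in the hint is to walk the array along its anti-diagonals. The sketch below applies that reading to the positive rationals, placing $\frac{n}{m}$ in row $n$ and column $m$; the diagonal order and the name `positive_rationals` are assumptions made here, since the arrows of the original diagram are not reproduced.

```python
from fractions import Fraction
from itertools import count, islice

def positive_rationals():
    """Yield each positive rational exactly once, walking the diagonals n + m = s."""
    seen = set()
    for s in count(2):              # s = n + m indexes the anti-diagonal
        for n in range(1, s):
            q = Fraction(n, s - n)  # entry in row n, column m = s - n
            if q not in seen:       # skip repeats such as 2/2 = 1/1
                seen.add(q)
                yield q

first = list(islice(positive_rationals(), 10))
print(first)
```

Every entry of the array sits on some finite diagonal, so it receives a finite label; that is the whole content of the countability argument.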
1.28. $[0, 1]$ is uncountable. Complete this sketch of proof: Suppose $[0, 1]$ is countable and that
$$[0, 1] = \{a_1, a_2, \ldots\}.$$
At least one of the intervals $[0, \frac{1}{3}]$, $[\frac{1}{3}, \frac{2}{3}]$, $[\frac{2}{3}, 1]$ does not contain $a_1$; call this
interval $I_1$. Subdivide $I_1$ into three closed intervals; then $a_2$ is not in one of
those three; call it $I_2$. Continuing in this manner, we obtain a nested sequence
of closed intervals, $I_n$, with the property that $a_n \notin I_n$. Therefore, none of
the $a_n$ are in
$\bigcap_{k=1}^{\infty} I_k$. But, by the nested interval theorem, the intersection
is non-empty so there must be an $x \in [0, 1]$ with $x \neq a_n$ for any $n$. This
contradicts our assumption.
1.2.2 Convexity
(ii) and the line segment between $x$ and $y$ as the set $\{x + t(y - x) : t \in [0, 1]\}$.
A subset $C$ of $\mathbb{R}^n$ is convex if
$$x, y \in C \implies x + t(y - x) \in C, \quad \forall t \in [0, 1];$$
that is, if the line segment between any two points of the set is a subset of $C$.
Example 1.28 $S := \{x : |x| \le 1\}$ is convex.
Proof. If $|x| \le 1$ and $|y| \le 1$, then
$$\begin{aligned}
|x + t(y - x)| = |(1 - t)x + ty| &\le |(1 - t)x| + |ty| \quad \text{(triangle inequality)} \\
&\le (1 - t)|x| + t|y| \le (1 - t) + t = 1.
\end{aligned}$$
Thus, $x + t(y - x) \in S$, for $0 \le t \le 1$; that is, $S$ is convex.
Exercises
1.29. Prove that $\{x : |x| = 1\}$ is not convex.
1.30. Prove that $\{(x, y) \in \mathbb{R}^2 : y > 0\}$ is convex.
1.31. Let $\mathcal{C}$ be any collection of convex sets. Show that $\bigcap_{A \in \mathcal{C}} A$ is convex. Is $\bigcup_{A \in \mathcal{C}} A$
necessarily convex?
1.32. The convex hull $H(A)$ of a set $A$ is the intersection of all convex sets containing $A$ as a subset. Prove that $H(A)$ is convex. What is $H(A)$ if $A$ is a
set consisting of two points only?
1.33. A subset $C$ of $\mathbb{R}^n$ is a cone if $\{tx : x \in C\} \subseteq C$ for all $t \ge 0$.
(i) Prove that a cone $C$ is a convex set if and only if
$$\{x + y : x \in C, y \in C\} \subseteq C.$$
(ii) Draw pictures of convex and non-convex cones in R2 .
1.3 Topology
Definition 1.29. If $\varepsilon > 0$ and $x_0 \in \mathbb{R}^n$, then the open ball of center $x_0$ and radius
$\varepsilon$ is the set
$$B(x_0, \varepsilon) := \{x : |x - x_0| < \varepsilon\}.$$
definition of 1
= .
(since A is open)
For (c) let $\mathcal{C}$ be a collection of open sets, and select any $x \in \bigcup_{A \in \mathcal{C}} A$. Then
$x \in A$ for some $A \in \mathcal{C}$
and $x \notin V\} = C \cap V^c$.
(= |x| < ,
x S).
will be called the diameter of $I$. Note that if $x, y$ are two points in $I$, then $|x - y| \le \lambda(I)$.
The next theorem has important consequences.
Theorem 1.38 (Bolzano-Weierstrass Theorem). Every bounded infinite subset
of Rn has a cluster point.
Proof. Let $K$ be a bounded infinite subset of $\mathbb{R}^n$. Then, by definition, there is
a closed interval $I_1$ such that $K \subseteq I_1$. Bisect the sides of $I_1$ to partition $I_1$ into
$2^n$ subintervals. At least one of these new intervals must contain infinitely many
points of $K$; pick one of these and call it $I_2$. Bisect the sides of $I_2$. Again, at least
one of the $2^n$ subintervals of $I_2$ must contain infinitely many points of $K$; pick one
of these and call it $I_3$.
Continue this procedure inductively to define a sequence of closed intervals Ik
with the properties:
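$\frac{1}{2^{k-1}} \lambda(I_1)$, $k = 1, 2, 3, \ldots$.
$\frac{1}{2^{k-1}} \lambda(I_1) < \varepsilon$. Why is this possible?
Then $0 < \lambda(I_k) < \varepsilon$. Remember that for any $x \in I_k$, $|x - x_0| < \lambda(I_k) < \varepsilon$ since $x_0$
is also in $I_k$. Thus, $I_k \subseteq B(x_0, \varepsilon)$. But, $I_k$ contains (infinitely many) points of $K$;
a contradiction. Therefore, $x_0$ is a cluster point of $K$.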
3. (0, 1) is not compact. (Find the infinite open cover that does not have a finite
subcover.)
$\frac{1}{k}\}$.
(Why is $G_k$ open?) Clearly, the collection $\mathcal{G}$ covers all of $\mathbb{R}^n \setminus \{x_0\}$, and hence covers
$K$. Since $K$ is compact, there is a finite subcover $\{G_{k_i} : i = 1, \ldots, m\}$ of $K$. Now,
the $G_j$ are nested with $G_j \subseteq G_i$ if $i > j$. Hence, letting $k_0 = \max\{k_1, \ldots, k_m\}$, we
have $K \subseteq G_{k_0}$. In other words,
$$B(x_0, \tfrac{1}{k_0}) := \{x : |x - x_0| < \tfrac{1}{k_0}\} \subseteq \{x : |x - x_0| \le \tfrac{1}{k_0}\} = G_{k_0}^c \subseteq K^c.$$
Figure 1.4. A disconnection.
is open, so there is a $\delta > 0$ so that $B(x_0, \delta) \subseteq G_0$. Now, by (2), we can choose $k$ so
large that $\lambda(I_k) < \delta$. Hence,
$$I_k \subseteq B(x_0, \delta) \subseteq G_0,$$
which contradicts (1).
which implies by
$0 \le c \le 1$.
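(i) $A \neq \emptyset$, $B \neq \emptyset$.
(ii) $A \cap B = \emptyset$.
(iii) $A \cup B = \mathbb{R}^n$.
Choose any points $x \in A$ and $y \in B$, and consider the subsets of $\mathbb{R}$ determined by
$$A_1 := \{t \in \mathbb{R} : x + t(y - x) \in A\},$$
$$B_1 := \{t \in \mathbb{R} : x + t(y - x) \in B\}.$$
Since $A$, $B$ are open in $\mathbb{R}^n$, both $A_1$ and $B_1$ are open in $\mathbb{R}$ (this is not obvious, so
think how to show it). Now the statements (i), (ii) and (iii) above imply
(i)$_1$ $A_1 \cap [0, 1] \neq \emptyset$, $B_1 \cap [0, 1] \neq \emptyset$. (Indeed, $0 \in A_1$ and $1 \in B_1$.)
(ii)$_1$ $(A_1 \cap [0, 1]) \cap (B_1 \cap [0, 1]) = \emptyset$.
(iii)$_1$ $(A_1 \cap [0, 1]) \cup (B_1 \cap [0, 1]) = [0, 1]$.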
This means, (A1 , B1 ) is a disconnection of [0, 1] which, we have seen, is connected. The contradiction implies that Rn is connected.
Corollary 1.46. The only sets in $\mathbb{R}^n$ which are both open and closed are $\mathbb{R}^n$ and
$\emptyset$.
Proof. If the nonempty set $A \neq \mathbb{R}^n$ is both open and closed, then so is $A^c$. Then,
$(A, A^c)$ would provide a disconnection of $\mathbb{R}^n$.
More restricted notions of connectedness are sometimes used. A set $C$ is
polygonally connected if, for each pair of points $x_0$, $x_m$ in $C$, there is a finite subset
$\{x_1, \ldots, x_{m-1}\}$ of $C$ such that the polygon
$$\{x_{i-1} + t(x_i - x_{i-1}) : t \in [0, 1],\ i = 1, 2, \ldots, m\}$$
is a subset of $C$. A set $C$ is arcwise connected if, for each pair of points $x, y \in C$,
there is a path joining $x$ to $y$ lying entirely in $C$. That is, there is a continuous
function $f : [0, 1] \to C$ such that $f(0) = x$ and $f(1) = y$. Either polygonally
connected or arcwise connected will imply the set is connected in the sense defined
here. However, the converse is not true. For example, the set $\{(x, y) \in \mathbb{R}^2 : x \neq 0,\ 0 < y \le x^2\} \cup \{(0, 0)\}$ is connected (in fact, arcwise connected), but is not
polygonally connected. The set $\{(x, y) \in \mathbb{R}^2 : y = \sin(\frac{1}{x}),\ x \neq 0\} \cup \{(0, y) : -1 \le y \le 1\}$ is connected, but is not arcwise connected.
Exercises
1.33. Property (b) of Proposition 1.31 implies that the intersection of any finite
collection of open sets is open. Show that it is not true that this holds for an
infinite collection of open sets.
1.34. Prove items (a), (b), and (c) of Proposition 1.33.
1.35. Prove that the two definitions of a bounded set are equivalent.
1.36. Prove that the intersection of any finite collection of open sets is open. Hint:
Use Property (b) of open sets and induction.
1.37. Prove that $\{x : |x| \le 1\}$ is closed in $\mathbb{R}^n$.
1.38. Prove that a subset $U$ of $\mathbb{R}^n$ is open if and only if it is the union of a collection
of open balls.
1.39. If $A$ is a subset of $\mathbb{R}^n$, then $\bar{A}$, the closure of $A$, is the intersection of all closed
sets which contain $A$ as a subset. Show that
(a) $\bar{A}$ is closed.
(b) $A \subseteq \bar{A}$.
(c) $\bar{\bar{A}} = \bar{A}$.
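(d) $\overline{A \cup B} = \bar{A} \cup \bar{B}$.
(e) $\bar{\emptyset} = \emptyset$.
1.40. If $A$ is a subset of $\mathbb{R}^n$, then $A^\circ$, the interior of $A$, is the union of all open sets
contained in $A$. Show that
(a) $A^\circ$ is open.
(b) $A^\circ \subseteq A$.
(c) $(A^\circ)^\circ = A^\circ$.
(d) $(A \cap B)^\circ = A^\circ \cap B^\circ$.
(e) $(\mathbb{R}^n)^\circ = \mathbb{R}^n$.
1.41. Let $A \subseteq \mathbb{R}^n$. The derived set $A'$ of $A$ is the set of all cluster points of $A$.
(a) Prove that $A'$ is closed.
(c) Theorem 1.39 says that $A$ is closed if and only if $A' \subseteq A$. A set $A$ for
which $A = A'$ is called perfect. Give examples of perfect and non-perfect
sets.
1.42. If $A \subseteq \mathbb{R}^n$, then $\partial A$, the boundary of $A$, is the set of all points $x$ such that
each neighborhood of $x$ contains a point of $A$ and a point of $A^c$.
(a) Show $A$ is closed if and only if $\partial A \subseteq A$.
(b) $\{\frac{1}{n} \in \mathbb{R} : n \in \mathbb{N}\}$
(c) $\{(\frac{1}{n}, \frac{1}{m}) \in \mathbb{R}^2 : n, m \in \mathbb{N}\}$
1.44. Without using the Heine-Borel Theorem show that $\{(x, y) : x^2 + y^2 < 1\}$ is
not compact in $\mathbb{R}^2$.
1.45. Let $A$ and $B$ be open in $\mathbb{R}$. Prove that $A \times B$ is open in $\mathbb{R}^2$.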
1.46. Show that a finite subset of Rn is closed.
1.47. Show that a countable subset of R is not open. Show that it may or may not
be closed.
1.48. Let S be an uncountable subset of R. Show that S has a cluster point.
Hint: Show that at least one of the intervals $[n, n + 1]$, $n \in \mathbb{Z}$, must contain
uncountably many points of $S$.
1.49. Show that a closed interval is closed.
1.50. Let $\mathbb{Q}^2$ denote the set of points in $\mathbb{R}^2$ with rational coordinates. What is the
interior of $\mathbb{Q}^2$? The boundary of $\mathbb{Q}^2$? Show that $\mathbb{Q}^2$ is not connected.
1.51. Show that $\mathbb{Q}^c$ is not connected. Show that $(\mathbb{Q}^2)^c$ is connected, in fact, polygonally connected.
1.52. Show that an open connected set in Rn is also polygonally connected.
Chapter 2
2.1 Sequences
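If $k = \max\{N_1, N_2\}$, then
$$|q - p| = |q - p_k + p_k - p| \le |p_k - q| + |p_k - p| < 2\varepsilon.$$
Therefore, $|q - p| < 2\varepsilon$ for each $\varepsilon > 0$; thus $|p - q| = 0$ (Archimedean Property).
Hence, $p = q$.
Example 2.5 Two simple limits:
1. $x_k = 1$, $k = 1, 2, 3, \ldots$, then $\lim_{k\to\infty} x_k = 1$.
2. $\lim_{k\to\infty} \frac{1}{k} = 0$
$k \ge N$.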
there must exist $k_{j+1} > k_j$ such that $p_{k_{j+1}} \in B(p, \frac{1}{j+1})$. So, by induction, there
exists a subsequence $\{p_{k_j}\}$ of $\{p_k\}$ such that $|p_{k_j} - p| < \frac{1}{j}$, $j = 1, 2, 3, \ldots$. The
Archimedean Property implies $\lim_{j\to\infty} p_{k_j} = p$.
Theorem 2.11. A sequence is convergent to a limit $p$ if and only if each subsequence is convergent and has limit $p$.
Proof. $\Longrightarrow$ Suppose $\lim_{k\to\infty} p_k = p$, that is, for each $\varepsilon > 0$ there exists $N$ such
that $j \ge N \implies |p_j - p| < \varepsilon$. If $\{p_{k_j}\}$ is a subsequence of $\{p_k\}$, then
$$j \ge N \implies k_j \ge j \ge N \implies |p_{k_j} - p| < \varepsilon.$$
Thus, $\lim_{j\to\infty} p_{k_j} = p$.
$\Longleftarrow$ Suppose each subsequence $\{p_{k_j}\}$ of $\{p_k\}$ satisfies $\lim_{j\to\infty} p_{k_j} = p$.
But, $\{p_k\}$ is a subsequence of itself.
Example 2.12 If $p_k = (-1)^k$, then $\{p_k\}$ is divergent. Indeed,
$$\lim_{k\to\infty} p_{2k} = 1, \qquad \lim_{k\to\infty} p_{2k+1} = -1.$$
If $\{p_k\}$ were convergent, these two limits would be the same number by the previous
theorem.
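and
$$\lim_{k\to\infty} q_k = q,$$
then
(i) $\lim_{k\to\infty} (p_k + q_k) = p + q$,
(ii) $\lim_{k\to\infty} (p_k \cdot q_k) = p \cdot q$.
Further, if $\{x_k\}$ is a sequence in $\mathbb{R}$ such that $\lim_{k\to\infty} x_k = x$, then
(iii) $\lim_{k\to\infty} x_k p_k = xp$, and
(iv) $\lim_{k\to\infty} \frac{1}{x_k} p_k = \frac{1}{x} p$, if $x_k, x \neq 0$.
Proof. Exercise 2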
The following result shows that it is sufficient to consider only sequences in R
when considering convergence or divergence.
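Theorem 2.14. Let $p_k = (x_{1,k}, \ldots, x_{n,k}) \in \mathbb{R}^n$. Then $\{p_k\}$ is convergent if and
only if each of the sequences $\{x_{i,k}\}$ is convergent for $i = 1, 2, \ldots, n$. Furthermore,
$$\lim_{k\to\infty} p_k = p = (x_1, \ldots, x_n) \iff \lim_{k\to\infty} x_{i,k} = x_i, \quad i = 1, 2, \ldots, n.$$
$$|x_{i,k} - x_i| < \frac{\varepsilon}{\sqrt{n}},$$
$\lim_{k\to\infty} \frac{2k+3}{k+2} = 2$. Indeed,
$$\frac{2k + 3}{k + 2} = \frac{2 + \frac{3}{k}}{1 + \frac{2}{k}}$$
and
$$\lim_{k\to\infty} \frac{1}{k} = 0 \implies \lim_{k\to\infty} \frac{3}{k} = 0, \quad \lim_{k\to\infty} \frac{2}{k} = 0,$$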
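The limit $\lim_{k\to\infty} \frac{2k+3}{k+2} = 2$ can also be read through the $\varepsilon$, $N$ definition, since $\left|\frac{2k+3}{k+2} - 2\right| = \frac{1}{k+2}$. A minimal sketch, assuming exact rational arithmetic and hypothetical helper names `term` and `find_N`:

```python
from fractions import Fraction

def term(k):
    """k-th term of the sequence (2k + 3) / (k + 2)."""
    return Fraction(2 * k + 3, k + 2)

def find_N(eps):
    """Smallest N with |term(k) - 2| < eps for all k >= N.

    The error equals 1/(k + 2), which decreases in k, so the first index
    at which it drops below eps works for every later index as well.
    """
    k = 1
    while abs(term(k) - 2) >= eps:
        k += 1
    return k

N = find_N(Fraction(1, 100))
print(N)
```

Smaller values of `eps` force larger `N`, which is exactly the dependence $N = N(\varepsilon)$ in the definition of convergence.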
Exercises
2.1. Show that the two definitions of the limit of a sequence are equivalent.
2.2. Prove Theorem 2.13.
2.3. Finish the proof of Theorem 2.14.
2.4. Let $x_k \le z_k \le y_k$. Prove that if $\{x_k\}$ and $\{y_k\}$ are convergent with limit $c$,
then $\{z_k\}$ is convergent with limit $c$.
2.5. Discuss the convergence or divergence of the sequences whose $k$th terms are
given by:
(a) $\frac{k}{k+1}$
(b) $\frac{(-1)^k k}{k+1}$
(c) $\frac{2k}{3k^2+1}$
(d) $\frac{2k^2+3}{3k^2+1}$
(e) $(\frac{1}{k}, k)$
(f) $((-1)^k, \frac{1}{k})$
that $\lim_{k\to\infty} \sqrt{x_k} = \sqrt{x}$. HINT: $\sqrt{x_k} - \sqrt{x} = \frac{x_k - x}{\sqrt{x_k} + \sqrt{x}}$.
2.11. (Ratio Test) Let $\{x_k\}$ be such that $\lim_{k\to\infty} \frac{x_{k+1}}{x_k} = r$. Show
HINT: In (a), if $r < s < 1$, show that for some constant $A$ and all large
enough $k$, $0 \le |x_k| \le A s^k$ (using the previous exercise).
k
<x<1
(1 + )k
1 + k
= x k 1 <
< , if k N.
1+
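2.16. (Root Test) Let $\{x_k\}$ be such that $\lim_{k\to\infty} |x_k|^{1/k} = r$. Show
2.17. If $a$ and $b$ are nonnegative real numbers, show that $\lim_{k\to\infty} (a^k + b^k)^{1/k} = \max\{a, b\}$.
1. $\lim_{k\to\infty} r^k = 0$ if $0 \le r < 1$.
Note that $0 < r^{k+1} = r r^k \le r^k < 1$ so $\{r^k\}$ is decreasing and bounded, hence
is convergent with $0 \le \lim_{k\to\infty} r^k < 1$. Now $\{r^{k+1}\}$ is a subsequence of $\{r^k\}$
so they have the same limit. But, $\lim_{k\to\infty} r^k > 0$ implies
$$\lim_{k\to\infty} r^{k+1} = r \lim_{k\to\infty} r^k < \lim_{k\to\infty} r^k,$$
$\frac{k}{2^k}$,
$$\frac{k+1}{2^{k+1}} = \frac{k+1}{2k}\, a_k \implies 0 < a_{k+1} \le a_k.$$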
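$$\lim_{k\to\infty} \frac{k+1}{2k}\, a_k = \frac{1}{2} \lim_{k\to\infty} a_k$$
$a_k = \left(1 + \frac{1}{k}\right)^k$
Notice that the typical term after the first two has the form
$$\frac{1}{m!}\left(1 - \frac{1}{k}\right) \cdots \left(1 - \frac{m-1}{k}\right)$$
and this increases in $k$. Thus, each term of $a_k$ is no greater than the corresponding term in $a_{k+1}$ and $a_{k+1}$ has one more term. Thus, $a_{k+1} > a_k$ and
$\{a_k\}$ is increasing.
We next show $a_k$ is bounded. Examining the above, we find
$$2 < a_k \le 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \ldots + \frac{1}{k!} \le 1 + 1 + \frac{1}{2} + \frac{1}{2^2} + \ldots + \frac{1}{2^{k-1}} \le 3.$$
$$\begin{aligned}
x_{k+1}^2 - A &= \frac{1}{4}\left(x_k^2 + 2A + \frac{A^2}{x_k^2}\right) - A \\
&= \frac{1}{4}\left(x_k^2 - 2A + \frac{A^2}{x_k^2}\right) = \frac{1}{4}\left(x_k - \frac{A}{x_k}\right)^2 \ge 0
\end{aligned}$$
$$\implies x_{k+1}^2 \ge A \implies x_{k+1} \ge \sqrt{A}, \quad k = 1, 2, \ldots.$$
Hence, $x_k \ge \sqrt{A}$, for $k = 2, 3, \ldots$. Thus,
$$x_k - x_{k+1} = x_k - \frac{1}{2}\left(x_k + \frac{A}{x_k}\right) = \frac{1}{2}\left(x_k - \frac{A}{x_k}\right) = \frac{1}{2}\, \frac{x_k^2 - A}{x_k} \ge 0.$$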
$$x_{k+1} = \frac{1}{2}\left(x_k + \frac{A}{x_k}\right) \implies L = \frac{1}{2}\left(L + \frac{A}{L}\right) \implies L^2 = A \implies L = \sqrt{A}.$$
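The recursion $x_{k+1} = \frac{1}{2}(x_k + \frac{A}{x_k})$ and the two facts derived for it (from the second term on, $x_k^2 \ge A$ and the sequence is decreasing) can be observed with exact rational arithmetic. The sketch below assumes $A = 2$ and the starting value $x_1 = 1$; both choices, and the name `sqrt_iterates`, are illustrative.

```python
from fractions import Fraction

def sqrt_iterates(A, x1, n):
    """Return [x_1, ..., x_n] for the recursion x_{k+1} = (x_k + A / x_k) / 2."""
    xs = [Fraction(x1)]
    for _ in range(n - 1):
        x = xs[-1]
        xs.append((x + Fraction(A) / x) / 2)
    return xs

xs = sqrt_iterates(2, 1, 6)
for x in xs:
    # Print each iterate together with x^2 - A, which stays >= 0 from x_2 on.
    print(x, float(x * x - 2))
```

The first term is allowed to lie below $\sqrt{A}$; the inequality $x_{k+1}^2 \ge A$ only takes effect from the second term on, which is why the text states it for $k = 2, 3, \ldots$.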
The results on monotone sequences are interesting in that, unlike the preceding examples, it is not necessary first to guess the limit of a sequence in order to
show that it converges. Fortunately, we are able to do this in general.
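Definition 2.19. A sequence $\{p_k\}$ in $\mathbb{R}^n$ is a Cauchy sequence if, for each $\varepsilon > 0$,
there exists a natural number $N = N(\varepsilon)$ such that if $k, m \ge N$, then
$$|p_k - p_m| < \varepsilon.$$
Theorem 2.20 (Cauchy Criterion). $\{p_k\}$ is convergent if and only if $\{p_k\}$ is a
Cauchy sequence.
Proof. $\Longrightarrow$ Suppose $\lim_{k\to\infty} p_k = p$. Thus, for each $\varepsilon > 0$, there exists $N$ such
that $k \ge N \implies |p_k - p| < \varepsilon/2$. Hence, if $k, m \ge N$, we have
$$|p_k - p_m| = |p_k - p + p - p_m| \le |p_k - p| + |p - p_m| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
$$\frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
Thus, $\lim_{k\to\infty} p_k = p$.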
Remark: An ordered field F is complete if and only if each Cauchy sequence in F
is convergent (with its limit in F ).
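Example 2.21
1. The sequence $\{(-1)^k\}$ is divergent.
In fact, we have for $x_k = (-1)^k$ that $|x_k - x_{k+1}| = 2$ for all $k$, and so $\{x_k\}$
cannot be a Cauchy sequence.
2. For $x_k = 1 + \frac{1}{2} + \frac{1}{3} + \ldots + \frac{1}{k}$,
$$|x_{2k} - x_k| = \frac{1}{k+1} + \frac{1}{k+2} + \ldots + \frac{1}{2k} \ge \underbrace{\frac{1}{2k} + \ldots + \frac{1}{2k}}_{k \text{ times}} = \frac{1}{2}.$$
Therefore, $|x_{2k} - x_k| \ge \frac{1}{2}$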
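The estimate $|x_{2k} - x_k| \ge \frac{1}{2}$ for the partial sums $x_k = 1 + \frac{1}{2} + \ldots + \frac{1}{k}$ can be confirmed exactly for small $k$. A minimal sketch with exact fractions; the function name `x` mirrors the notation of the example, and the range of $k$ tested is an arbitrary choice.

```python
from fractions import Fraction

def x(k):
    """Partial sum x_k = 1 + 1/2 + ... + 1/k, computed exactly."""
    return sum(Fraction(1, j) for j in range(1, k + 1))

# Gaps x_{2k} - x_k for k = 1, ..., 49; each should be at least 1/2.
gaps = [x(2 * k) - x(k) for k in range(1, 50)]
print(min(gaps), float(max(gaps)))
```

Since the gap never falls below $\frac{1}{2}$, no $N$ can satisfy the Cauchy condition for $\varepsilon = \frac{1}{2}$, whatever index is tried.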
Exercises
2.17. Show that the following sequences are divergent by proving directly they are
not Cauchy sequences.
(a) $\{k\}$ (b) $\{(-1)^k (1 - \frac{1}{k})\}$
2.18. Show directly that the following are Cauchy sequences and hence are convergent.
(a) $\{\frac{k+1}{k}\}$
(b) $\{1 + \frac{1}{1!} + \frac{1}{2!} + \ldots + \frac{1}{k!}\}$.
2.19. Determine whether each of the following sequences is convergent or divergent.
In the case of convergence, find the limit.
k3 +k2 +1
4k2 +k+4
(b)
(a)
3
4
(k+1)
(1+k)(2+ k
o
n
2
2
(k +1)
(d) {k 2 + k}
(c)
(k+1)(k+2)(k+3)
sin k
+3k
3
3
(e) {1 k }
(f)
k
(g)
sin
k
3
+2k2
(h)
k2 +1
k3 +1
cos2
k
4
o
k
k
(i)
3 [3] ,
where [x] denotes the greatest integer not exceeding x.
2.20. For what values of $x$ are the following convergent, divergent? Wherever you
can, give the limit.
xk
(a)
(b)
(k + 1)(k + 2)xk
(k+2)(k+1)
o
n k k1 o
n
kx +x
+1
xk +k
(d)
(c)
xk+1 +(k+1)
kxk1 +1
n ko
k
x
(f)
k!x
(e)
k!
2.21. If a1 = 2, ak+1 =
limk ak = 3.
$\frac{a_1 + \ldots + a_k}{k}$.
(c) Give an example to show that $\{\sigma_k\}$ may be convergent even though $\{a_k\}$
is not.
2.25. We have seen that $\lim_{k\to\infty} x^{1/k} = 1$ if $x > 0$ (Exercise 15). An easier proof is
now available to us from our results on monotone sequences. Show $\{x^{1/k}\}$ is
monotone and bounded, hence convergent. Consider the subsequence $\{x^{1/(2k)}\}$
and deduce the result.
2.26. Let $\{x_k\}$ be a sequence of positive real numbers. Show that
$$\lim_{k\to\infty} \frac{x_{k+1}}{x_k} = r \implies \lim_{k\to\infty} x_k^{1/k} = r.$$
Hint: If $r > 0$ and $\varepsilon > 0$, then there exist positive numbers $A$, $B$ and a natural
number $N$ such that
$$A(r - \varepsilon)^k \le x_k \le B(r + \varepsilon)^k, \quad \forall k > N.$$
2.27. By applying the last exercise to the sequence $\{\frac{k^k}{k!}\}$ show that
$$\lim_{k\to\infty} \frac{k}{(k!)^{1/k}} = e, \quad \text{where} \quad e = \lim_{k\to\infty} \left(1 + \frac{1}{k}\right)^k.$$
2.28. Let $s_k = (1 + \frac{1}{k})^k$, $t_k = 1 + \frac{1}{1!} + \frac{1}{2!} + \ldots + \frac{1}{k!}$. We have seen that both
sequences are convergent. Show that they have the same limit. Hint: First
use the Binomial Theorem to show
$$s_k = 1 + \frac{1}{1!} + \frac{1}{2!}\left(1 - \frac{1}{k}\right) + \ldots + \frac{1}{k!}\left(1 - \frac{1}{k}\right) \cdots \left(1 - \frac{k-1}{k}\right) \le t_k.$$
$$\frac{1}{1!} + \frac{1}{2!}\left(1 - \frac{1}{k}\right) + \ldots + \frac{1}{m!}\left(1 - \frac{1}{k}\right) \cdots \left(1 - \frac{m-1}{k}\right)$$
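The inequality $s_k \le t_k$ from the hint to Exercise 2.28 can be inspected numerically before it is proved. The sketch below computes both sequences exactly; the function names `s` and `t` follow the exercise, while the range of $k$ examined is an arbitrary choice made here.

```python
from fractions import Fraction
from math import factorial

def s(k):
    """s_k = (1 + 1/k)^k, computed exactly."""
    return (1 + Fraction(1, k)) ** k

def t(k):
    """t_k = 1 + 1/1! + 1/2! + ... + 1/k!, computed exactly."""
    return sum(Fraction(1, factorial(j)) for j in range(k + 1))

for k in (1, 2, 5, 10, 20):
    # Print s_k, t_k and the nonnegative gap between them.
    print(k, float(s(k)), float(t(k)), float(t(k) - s(k)))
```

The table suggests that the gap $t_k - s_k$ shrinks only slowly, so a numerical check of this kind illustrates the inequality but does not replace the limit argument the exercise asks for.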
2.2 Continuity
and
In general, $\delta = \delta(\varepsilon, p_0)$.
Corollary 2.26.
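(i) If $f, g : \mathbb{R}^n \to \mathbb{R}^m$ are continuous at $p_0$, then $f + g$ and $f \cdot g$ are continuous at
$p_0$.
(ii) If $f : \mathbb{R}^n \to \mathbb{R}^m$ and $\varphi : \mathbb{R}^n \to \mathbb{R}$ are continuous at $p_0$, then $\varphi f$ is continuous
at $p_0$ and $\frac{1}{\varphi} f$ is continuous at $p_0$ when $\varphi(p_0) \neq 0$.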
Proof. This follows immediately from the corresponding theorem for sequences
Theorem 2.13.
Proof.
i = 1, 2, . . . , m,
by Theorem 2.14.
Example 2.28 Some examples of continuous and discontinuous functions:
(1) $D = [0, 1]$, $f(x) = 1$, for $0 < x \le 1$, and $f(0) = 0$. Then $f$ is discontinuous
at $x = 0$. Indeed, if $x_k$ is any sequence in $(0, 1]$ with $\lim_{k\to\infty} x_k = 0$, then
$f(x_k) = 1$ so $\lim_{k\to\infty} f(x_k) = 1 \neq 0$.
(2) Let $D = \mathbb{R}$, $f(x) = \sin(\frac{1}{x})$, for $x \neq 0$ and $f(0) = 0$. Then $f$ is discontinuous at
$x = 0$ since
$$\lim_{k\to\infty} \frac{2}{(2k + 1)\pi} = 0 \quad \text{and} \quad f\left(\frac{2}{(2k + 1)\pi}\right) = (-1)^k,$$
and the latter sequence is not convergent.
Remark: The discontinuity in the first example is removable, i.e. the discontinuity at x = 0 can be removed by changing the value of the function
at 0. The discontinuity in the second example is essential since no matter
what value is assigned to the function at x = 0, the discontinuity cannot be
removed.
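(3) $D = \mathbb{R}^2$, $f(x, y) = \frac{xy}{x^2 + y^2}$, for $(x, y) \neq O$ and $f(0, 0) = 0$. The function $f$ is
discontinuous at $(0, 0)$. Indeed, for the sequence of points
$$p_k = \left(\frac{1}{k}, \frac{1}{k}\right), \quad k \in \mathbb{N},$$
$\lim_{k\to\infty} p_k = O$ and
$$\lim_{k\to\infty} f(p_k) = \lim_{k\to\infty} \frac{1}{k^2} \Big/ \left(2\, \frac{1}{k^2}\right) = \frac{1}{2} \neq 0.$$
The discontinuity at $(0, 0)$ is essential since if $q_k = (\frac{1}{k}, \frac{1}{2k})$, then
$\lim_{k\to\infty} q_k = O$,
but
$$\lim_{k\to\infty} f(q_k) = \lim_{k\to\infty} \frac{1}{2k^2} \Big/ \left(\frac{1}{k^2} + \frac{1}{4k^2}\right) = \frac{2}{5}.$$
$\frac{x^2 y^2}{x^2 + y^2}$,
$$|f(x, y) - f(0, 0)| = \frac{x^2 y^2}{x^2 + y^2} \le \frac{(x^2 + y^2)^2}{2(x^2 + y^2)} = \frac{x^2 + y^2}{2} = \frac{1}{2} |(x, y)|^2.$$
Hence, if $|(x, y)| = |(x, y) - (0, 0)| < \sqrt{2\varepsilon}$, we have $|f(x, y) - f(0, 0)| < \varepsilon$.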
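The two sequences used in item (3) can be evaluated exactly to see that $f(x, y) = \frac{xy}{x^2 + y^2}$ takes a different constant value along each of them. This is a sketch with exact fractions; the helper name `f` follows the example, and the sample indices are arbitrary.

```python
from fractions import Fraction

def f(x, y):
    """f(x, y) = xy / (x^2 + y^2) away from the origin, and f(0, 0) = 0."""
    if x == 0 and y == 0:
        return Fraction(0)
    return Fraction(x * y) / (x * x + y * y)

# Values along p_k = (1/k, 1/k) and q_k = (1/k, 1/(2k)) for a few k.
along_p = [f(Fraction(1, k), Fraction(1, k)) for k in (1, 10, 1000)]
along_q = [f(Fraction(1, k), Fraction(1, 2 * k)) for k in (1, 10, 1000)]
print(along_p, along_q)
```

Both sequences of points tend to the origin, yet the function values stay at two different constants, so no choice of $f(0, 0)$ can make $f$ continuous there.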
2.3
$$f([\tfrac{1}{2}, 2]) = [\tfrac{1}{4}, 1], \qquad f^{-1}([\tfrac{1}{4}, 3]) = [-1, -\tfrac{1}{2}] \cup [\tfrac{1}{2}, 1].$$
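which implies
$$B(p_0, \delta(p_0)) \cap D \subseteq f^{-1}(V), \quad \forall p_0 \in f^{-1}(V). \tag{2.1}$$
Hence, if we set
$$U := \bigcup_{p_0 \in f^{-1}(V)} B(p_0, \delta(p_0)),$$
then $U$ is open and (2.1) implies
$$U \cap D \subseteq f^{-1}(V). \tag{2.2}$$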
(2.3)
Let $p_0 \in D$, and $\varepsilon > 0$ be given. Consider $V = B(f(p_0), \varepsilon)$. Then there exists an
open $U \subseteq \mathbb{R}^n$ with $U \cap D = f^{-1}(V)$. In particular, $p_0 \in U$, so $U$ is a neighborhood
of $p_0$, and
$$p \in U \cap D \implies f(p) \in V.$$
Thus, $f$ is continuous at $p_0$.
Theorem 2.33 (Preservation of Connectedness). Suppose $f : \mathbb{R}^n \to \mathbb{R}^m$. If
$f$ is continuous on $D$ and $D$ is connected, then $f(D)$ is connected.
Proof. Suppose $f(D)$ is not connected. Then there exist open sets $V_1$ and $V_2$ in
$\mathbb{R}^m$ such that
(i) $V_1 \cap f(D) \neq \emptyset$ and $V_2 \cap f(D) \neq \emptyset$;
(ii) $(V_1 \cap f(D)) \cap (V_2 \cap f(D)) = \emptyset$; and
(iii) $(V_1 \cap f(D)) \cup (V_2 \cap f(D)) = f(D)$.
By the Global Continuity Theorem (Theorem 2.32), there exist open sets $U_1$
and $U_2$ in $\mathbb{R}^n$ such that
$$U_1 \cap D = f^{-1}(V_1), \qquad U_2 \cap D = f^{-1}(V_2).$$
Then
(i) $U_1 \cap D \neq \emptyset$, $U_2 \cap D \neq \emptyset$, from (i),
(ii) $(U_1 \cap D) \cap (U_2 \cap D) = \emptyset$, from (ii),
$M \in f(D)$, that is, that $M = f(p_0)$ for some $p_0 \in D$. Suppose $M \notin f(D)$, which implies $M > f(p)$ for all $p \in D$. Consider the function
\[ g(p) = \frac{1}{M - f(p)} > 0, \qquad \text{for } p \in D. \]
Exercises
2.32.
2.33.
2.34.
2.35.
2.36.
2.4
Uniform Continuity
and
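Here the $\delta$ depends only on $\varepsilon$ (and possibly $D$), but not on the choice of points.
Example 2.40 Some examples on uniform continuity:
(1) $f(x) = x$, $-\infty < x < +\infty$ ($D = \mathbb{R}$), is uniformly continuous on $\mathbb{R}$. Obviously, if $x, y \in \mathbb{R}$ and $|x - y| < \delta = \varepsilon$, then $|f(x) - f(y)| < \varepsilon$.
(2) $f(x) = x^2$, $0 \le x \le 1$ ($D = [0, 1]$), is uniformly continuous on $D$. Indeed,
\[ |f(x) - f(y)| = |x^2 - y^2| = |x - y||x + y| \le 2|x - y| \qquad \text{for } x, y \in [0, 1]. \]
Thus, if $x, y \in D$ and $|x - y| < \frac{\varepsilon}{2}$, then $|f(x) - f(y)| < \varepsilon$, so $f$ is uniformly continuous on $D = [0, 1]$.
(3) $f(x) = x^2$, $0 \le x < +\infty$ ($D = [0, +\infty)$), then $f$ is not uniformly continuous on $D$. We will show the condition is not satisfied for $\varepsilon = 1$ by any choice of $\delta$. Let $\delta > 0$ be given and choose $x = \frac{1}{\delta}$, $y = \frac{1}{\delta} + \frac{\delta}{2}$. Then
\[ |x - y| = \frac{\delta}{2} < \delta \quad\text{but}\quad |f(x) - f(y)| = |x - y||x + y| = \frac{\delta}{2}\left(\frac{2}{\delta} + \frac{\delta}{2}\right) > 1. \]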
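\[ |f(x) - f(y)| = |\sqrt{x} - \sqrt{y}| = \frac{|x - y|}{\sqrt{x} + \sqrt{y}} \le \frac{1}{2}|x - y|, \qquad x, y \in [1, +\infty). \]
If $y \le x$, $x \ge \varepsilon^2$ and $|x - y| < \varepsilon^2$, then $|f(x) - f(y)| = \frac{|x - y|}{\sqrt{x} + \sqrt{y}} < \frac{\varepsilon^2}{\varepsilon} = \varepsilon$. If $y < x < \varepsilon^2$, then $|f(x) - f(y)| = \sqrt{x} - \sqrt{y} < \varepsilon - 0 = \varepsilon$.
In either case, we see that
\[ |x - y| < \delta = \varepsilon^2 \text{ and } x, y \in [0, +\infty) \implies |f(x) - f(y)| < \varepsilon. \]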
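The failure of uniform continuity for $f(x) = x^2$ on $[0, +\infty)$ can be checked with exact arithmetic. The sketch below is only an illustration: the helper name `witness` and the sample values of $\delta$ are assumptions made here, not part of the notes. It uses the points $x = 1/\delta$, $y = 1/\delta + \delta/2$ from example (3).

```python
from fractions import Fraction

def witness(delta):
    """Return the pair x = 1/delta, y = 1/delta + delta/2 from example (3)."""
    x = 1 / delta
    y = x + delta / 2
    return x, y

# Sample deltas (hypothetical choices); any positive delta behaves the same way.
deltas = [Fraction(1, 10 ** k) for k in range(0, 6)]

# For each delta record (|x - y|, |x^2 - y^2|).
gaps = []
for delta in deltas:
    x, y = witness(delta)
    gaps.append((abs(x - y), abs(x * x - y * y)))

for delta, (dx, dfx) in zip(deltas, gaps):
    print(f"delta={float(delta):.0e}  |x-y|={float(dx):.1e}  |f(x)-f(y)|={float(dfx):.6f}")
```

In every row the points are closer than $\delta$ while the function values differ by more than 1, so no $\delta$ works for $\varepsilon = 1$.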
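Theorem 2.41. Let $f : D \subseteq \mathbb{R}^n \to \mathbb{R}^m$. If $f$ is continuous on $D$ and $D$ is compact, then $f$ is uniformly continuous on $D$.
Proof. Let $\varepsilon > 0$ be given. By the continuity of $f$ on $D$, for each $p_0 \in D$, there exists a $\delta(\varepsilon, p_0) > 0$ such that
\[ p \in B(p_0, \delta(\varepsilon, p_0)) \cap D \implies |f(p) - f(p_0)| < \frac{\varepsilon}{2}. \]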
1
(.pj ).
2
(2.6)
(2.7)
Therefore,
\[ (2.6) \implies |f(p) - f(p_j)| < \frac{\varepsilon}{2}. \]
Therefore, from the triangle inequality, if $p, q \in D$ and $|p - q| < \delta(\varepsilon)$, then $|f(p) - f(q)| < \varepsilon$. So, $f$ is uniformly continuous on $D$.
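Alternative Proof: Suppose $f$ is not uniformly continuous on $D$. Negating the definition of uniform continuity, there exists an $\varepsilon_0 > 0$ such that for every $\delta > 0$, there exist $p, q \in D$ with $|p - q| < \delta$ but $|f(p) - f(q)| \ge \varepsilon_0$. In particular, there exist $p_k, q_k \in D$ such that $|p_k - q_k| < \frac{1}{k}$ and $|f(p_k) - f(q_k)| \ge \varepsilon_0 > 0$.
Now $\{p_k\} \subseteq D$ and $D$ is compact, hence closed and bounded. Therefore, there is a subsequence $\{p_{k_j}\}$ converging to some point $p_0$ in $D$; $\lim_{j\to\infty} p_{k_j} = p_0$ (Corollary 2.10). Now
\[ |q_{k_j} - p_0| \le |q_{k_j} - p_{k_j}| + |p_{k_j} - p_0| \le \frac{1}{k_j} + |p_{k_j} - p_0|, \]
and the right hand side tends to zero as $j \to \infty$. Hence, $\lim_{j\to\infty} q_{k_j} = p_0$ also. But the continuity of $f$ at the point $p_0 \in D$ implies that
\[ \lim_{j\to\infty} \left(f(p_{k_j}) - f(q_{k_j})\right) = f(p_0) - f(p_0) = 0, \]
which contradicts $|f(p_{k_j}) - f(q_{k_j})| \ge \varepsilon_0 > 0$.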
Exercises
2.38. If $f(x) = \frac{1}{x}$, $x \neq 0$, then $f$ is continuous on its domain.
2.39. Show that a polynomial
\[ f(x) = a_nx^n + a_{n-1}x^{n-1} + \dots + a_1x + a_0, \qquad (a_i \text{ constants}) \]
1 x, x Q;
x,
x
6 Q.
xu
x+u
) cos(
)
2
2
1
1+x2
2.46. If $f(x) = \frac{1}{1+x^2}$, then $f$ is uniformly continuous on $\mathbb{R}$.
2.47. Let f be a real-valued function with domain D, D being an open subset of
Rn . Prove that f is continuous if and only if the sets
{p : f (p) > },
{p : f (p) < }
2.5
Limits
The notion of a limit, which was introduced for sequences, can, as you recall from first year Calculus, be extended to any function.
Definition 2.42. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ with domain $D \subseteq \mathbb{R}^n$. Let $p_0$ be a cluster point of $D$ (it is not assumed that $p_0 \in D$). A point $L \in \mathbb{R}^m$ is the limit at $p_0$ of $f$ if, for each $\varepsilon > 0$, there exists a $\delta > 0$ such that
\[ (p \in D \text{ and } 0 < |p - p_0| < \delta) \implies |f(p) - L| < \varepsilon. \]
Write
\[ \lim_{p\to p_0} f(p) = L. \]
\[ f(x,y) = \begin{cases} 0, & \text{if } y \neq x^2; \\ 1, & \text{if } y = x^2; \end{cases} \]
does not have a limit at $O = (0, 0)$ since each neighborhood of $(0, 0)$ contains points $(x, y) \neq (0, 0)$ at which $f(x, y) = 0$ and at which $f(x, y) = 1$.
Just as in the case of sequences there is a Cauchy criterion for the existence of $\lim_{p\to p_0} f(p)$.
Theorem 2.44 (Cauchy Criterion). The limit $\lim_{p\to p_0} f(p)$ exists if and only if for each $\varepsilon > 0$, there is a $\delta > 0$ such that
\[ p, q \in D \cap B(p_0, \delta)\setminus\{p_0\} \implies |f(p) - f(q)| < \varepsilon. \]
Proof. ($\Longrightarrow$) Suppose $\lim_{p\to p_0} f(p) = L$; thus, if $\varepsilon > 0$ is given, there exists a $\delta > 0$ such that
\[ p \in D,\ 0 < |p - p_0| < \delta \implies |f(p) - L| < \varepsilon/2. \]
Then, if $p, q \in D \cap B(p_0, \delta)\setminus\{p_0\}$, we have
Exercise 55.
The following corollaries follow immediately from the corresponding statements for sequences.
Corollary 2.47. Let $f : D \subseteq \mathbb{R}^n \to \mathbb{R}^m$, $f = (f_1, \dots, f_m)$, $L = (L_1, \dots, L_m)$, and $p_0 \in D$. Then $\lim_{p\to p_0} f(p) = L$ if and only if $\lim_{p\to p_0} f_i(p) = L_i$, for all $i = 1, 2, \dots, m$.
Corollary 2.48. If $f, g : \mathbb{R}^n \to \mathbb{R}$, $p_0 \in (D_f \cap D_g)$, and $\lim_{p\to p_0} f(p) = L$, $\lim_{p\to p_0} g(p) = M$, then
(i) $\lim_{p\to p_0} (f(p) + g(p)) = L + M$.
(ii) $\lim_{p\to p_0} (f(p)g(p)) = LM$.
(iii) $\lim_{p\to p_0} \frac{f(p)}{g(p)} = \frac{L}{M}$, if $M \neq 0$.
xx0
xx0
x0+
lim f (x) = 0.
x0
0, if y 6= x2 ;
1, if y = x2 .
is 0. Notice that in this case the limit along any line through the origin is 0 at $O$, i.e.
\[ \lim_{t\to 0} f(tx_0, ty_0) = 0 \qquad \text{if } (x_0, y_0) \neq O. \]
The moral of this is that when the limit properties of functions of more than
one variable are being considered, it is not enough to look at their behaviour on
straight lines.
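A small exact computation makes the moral concrete. The sketch below assumes the function that equals 1 on the parabola $y = x^2$ and 0 elsewhere (the value at the origin plays no role in a limit); the direction list and the sample values of $t$ are hypothetical choices made for illustration.

```python
from fractions import Fraction

def f(x, y):
    """1 on the parabola y = x^2, 0 everywhere else."""
    return 1 if y == x * x else 0

def along_line(x0, y0, ts):
    """Values of f at the points t*(x0, y0) of a line through the origin."""
    return [f(t * x0, t * y0) for t in ts]

def along_parabola(ts):
    """Values of f at the points (t, t^2) of the parabola."""
    return [f(t, t * t) for t in ts]

# Nonzero parameters tending to 0; exact rationals avoid rounding issues.
ts = [Fraction(1, 10 ** k) for k in range(3, 9)]
directions = [(1, 0), (0, 1), (1, 1), (2, -3), (5, 7)]

line_values = {d: along_line(d[0], d[1], ts) for d in directions}
parabola_values = along_parabola(ts)

print(line_values)       # every sampled line gives only zeros
print(parabola_values)   # the parabola gives only ones
```

Each line meets the parabola in at most one point other than the origin, so sufficiently close to $O$ the function vanishes on the line, while it is identically 1 along the curve.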
2.6
We review the most important results on differentiation from first year calculus. Throughout this section $f : \mathbb{R} \to \mathbb{R}$.
Definition 2.50. If $f$ is defined in a neighborhood of $x_0$ and
\[ \lim_{x\to x_0} \frac{f(x) - f(x_0)}{x - x_0} \]
exists, then this limit is called the derivative of $f$ at $x_0$ and is denoted $f'(x_0)$.
Proposition 2.51. If $f'(x_0)$ exists, then $f$ is continuous at $x_0$.
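Proof. Taking $\varepsilon = 1$ in the definition of the limit, there is a $\delta > 0$ such that
\[ 0 < |x - x_0| < \delta \implies \left|\frac{f(x) - f(x_0)}{x - x_0} - f'(x_0)\right| < 1. \]
Hence, $|f(x) - f(x_0)| < [|f'(x_0)| + 1]\,|x - x_0|$. From this it follows that $\lim_{x\to x_0} f(x) = f(x_0)$.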
Definition 2.52. The function $f$ has an interior relative maximum (respectively, minimum) at $c$ if there is a neighborhood $U$ of $c$ such that
\[ f(x) \le f(c) \ \ \forall x \in U \qquad (\text{respectively, } f(x) \ge f(c) \ \ \forall x \in U). \]
xc
f (x) f (c)
.
xc
c < x < c + =
Similarly,
f (x) f (c)
0
xc
f (x) f (c)
= lim
0.
xc
xc
c < x < c =
Proof. Consider the function $\varphi(x) = [f(x) - f(a)](b - a) - [f(b) - f(a)](x - a)$. The function $\varphi$ is continuous on $[a, b]$, $\varphi'$ exists on $(a, b)$ and $\varphi(a) = \varphi(b) = 0$. By Rolle's Theorem, there is a $c \in (a, b)$ such that
\[ 0 = \varphi'(c) = f'(c)(b - a) - [f(b) - f(a)]. \]
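The point $c$ in this proof can be located numerically. The following is a minimal sketch, assuming a hypothetical helper `mean_value_point` and the sample function $f(x) = x^3$ on $[0, 2]$; it runs bisection on the derivative of the auxiliary function and requires that this derivative change sign between the endpoints.

```python
def mean_value_point(f, df, a, b, tol=1e-12):
    """Bisection for a root of phi'(x) = df(x)*(b - a) - (f(b) - f(a)) on (a, b).

    Assumes phi' is continuous and takes opposite signs at a and b.
    """
    def slope_gap(x):
        return df(x) * (b - a) - (f(b) - f(a))

    lo, hi = a, b
    if slope_gap(lo) * slope_gap(hi) > 0:
        raise ValueError("phi' does not change sign on [a, b]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if slope_gap(lo) * slope_gap(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

f = lambda x: x ** 3          # sample function (illustrative choice)
df = lambda x: 3 * x ** 2     # its derivative

c = mean_value_point(f, df, 0.0, 2.0)
print(c, df(c), (f(2.0) - f(0.0)) / 2.0)
```

For this sample the mean slope is 4, and the computed $c$ agrees with $2/\sqrt{3}$.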
Corollary 2.56 (Cauchy Mean Value Theorem). Suppose that
(i) $f, g$ are continuous on $[a, b]$
(ii) $f'$ and $g'$ exist on $(a, b)$
Then there exists $c \in (a, b)$ such that
\[ f'(c)[g(b) - g(a)] = g'(c)[f(b) - f(a)]. \]
Proof.
xb
f (x)
f (x)
= L = lim
= L.
g (x)
xb g(x)
Note: $\lim_{x\to b^-} f(x) = \infty$ means that, for each real number $N$, there is a $\delta > 0$ such that $x \in (b - \delta, b) \implies f(x) > N$.
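Proof. Case (i): Suppose $\lim_{x\to b^-} f(x) = \lim_{x\to b^-} g(x) = 0$ and $\lim_{x\to b^-} \frac{f'(x)}{g'(x)} = L$. If $\varepsilon > 0$ is given, there is a $\delta > 0$ such that
\[ x \in (b - \delta, b) \implies \left|\frac{f'(x)}{g'(x)} - L\right| < \varepsilon. \tag{2.8} \]
Let $f, g$ be defined at $b$ by $f(b) = g(b) = 0$. Then $f$ and $g$ are continuous on $[x, b]$ if $x > a$. From the Cauchy Mean Value Theorem,
\[ \frac{f(x)}{g(x)} = \frac{f(x) - f(b)}{g(x) - g(b)} = \frac{f'(c)}{g'(c)}, \qquad \text{for some } c \in (x, b). \]
Therefore,
\[ \left|\frac{f(x)}{g(x)} - L\right| = \left|\frac{f'(c)}{g'(c)} - L\right| < \varepsilon, \qquad \text{for } x \in (b - \delta, b). \]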
Thus, $\lim_{x\to b^-} \frac{f(x)}{g(x)} = L$.
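Case (ii): Suppose $\lim_{x\to b^-} f(x) = \lim_{x\to b^-} g(x) = \infty$ and $\lim_{x\to b^-} \frac{f'(x)}{g'(x)} = L$. As before, (2.8) holds if $x \in (b - \delta, b)$ for some $\delta > 0$. However, we cannot define $f(b)$ and $g(b)$ in this case so that $f$ is continuous at $b$. Let $x_0 = b - \delta$ and $x \in (b - \delta, b)$. Then
\[ \frac{f(x) - f(x_0)}{g(x) - g(x_0)} = \frac{f'(c)}{g'(c)}, \qquad \text{for some } c \in (b - \delta, b). \]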
L
h(x)
L
g(x) g(x0 )
g(x)
where
h(x) =
f (x0 )
f (x)
g(x0 )
g(x)
(2.9)
(2.10)
1
.
2
(2.11)
Therefore,
\[ \lim_{x\to b^-} \frac{f(x)}{g(x)} = L. \]
Remark: Define $\lim_{x\to\infty} f(x) = L$ if, for each $\varepsilon > 0$, there exists $N$ such that $x \ge N$ implies $|f(x) - L| < \varepsilon$. L'Hospital's Rule is also valid if $\lim_{x\to b^-}$ is replaced by $\lim_{x\to\infty}$. Minor changes are required for the proof, however.
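A quick numerical look at Case (i) of the rule. The sketch below uses an illustrative pair chosen here, $f(x) = 1 - \cos x$ and $g(x) = x^2$ with $b = 0$ approached from the left; these functions are not taken from the notes. Both quotients $f/g$ and $f'/g'$ are tabulated as $x \to 0^-$.

```python
import math

def f(x):
    return 1.0 - math.cos(x)

def g(x):
    return x * x

def df(x):
    return math.sin(x)

def dg(x):
    return 2.0 * x

# Points approaching b = 0 from the left.
xs = [-(10.0 ** (-k)) for k in range(1, 5)]
ratios = [f(x) / g(x) for x in xs]
derivative_ratios = [df(x) / dg(x) for x in xs]

for x, r, dr in zip(xs, ratios, derivative_ratios):
    print(f"x={x:+.0e}  f/g={r:.8f}  f'/g'={dr:.8f}")
```

Both columns settle near $\frac{1}{2}$, the common limit predicted by the rule for this pair.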
Exercises
2.54. If $f(x) = \frac{x^2 - a^2}{x - a}$ for $x \neq a$, show that $\lim_{x\to a} f(x) = 2a$.
2.55. Prove Theorem 2.45.
xy 2
(x,y)(0,0) x2 + y 4
lim
xy 2
,
+ y4
lim
(x,y)(0,0) x2
If
\[ c_0 + \frac{c_1}{2} + \dots + \frac{c_{n-1}}{n} + \frac{c_n}{n+1} = 0, \]
prove that
\[ c_0 + c_1x + \dots + c_{n-1}x^{n-1} + c_nx^n = 0 \]
has at least one solution in $(0, 1)$.
2.61. If $f(x) = \sqrt{|x|}$, then $f'(x)$ exists if $x \neq 0$ and does not exist if $x = 0$.
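Prove that
(iii) $\lim_{x\to 2} \dfrac{x^4 - 5x^3 + 6x^2 + 4x - 8}{x^4 - 7x^3 + 18x^2 - 20x + 8} = 3$.
(iv) $\lim_{x\to 0} \left(\cot(x) - \dfrac{1}{x}\right) = 0$.
(v) $\lim_{x\to 0} \left(1 + \tan(x)\right)^{\csc(x)} = e$.
(vi) $\lim_{x\to \frac{\pi}{2}^+} \dfrac{\log\left(x - \frac{\pi}{2}\right)}{\tan(x)} = 0$.
2.65. If $f'(x)$ exists for all $x$ near $c$, $f$ is continuous at $c$ and $\lim_{x\to c} f'(x) = A$, then $f'(c)$ exists and equals $A$.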
2.66. (Darboux Property of the Derivative) Let $f'(x)$ exist for each $x \in [a, b]$ and $f'(a) = \alpha$, $f'(b) = \beta$. Suppose $\alpha < \gamma < \beta$. Show that there is a $c \in (a, b)$ such that $f'(c) = \gamma$. Hint: Show that $g(x) = f(x) - \gamma x$ must achieve its minimum in $(a, b)$. This exercise shows that derivatives, like continuous functions, have the Intermediate Value Property. However, a derivative need not be continuous on its domain. See Exercise 70 (i) and (ii).
2.67. Let
\[ f(x) = \begin{cases} 0, & \text{if } -1 \le x \le 0; \\ 1, & \text{if } 0 < x \le 1. \end{cases} \]
Is there a function $g$ such that $g'(x) = f(x)$ on $-1 \le x \le 1$?
2.68. The function defined by
\[ g(x) = \begin{cases} x^2, & x \in \mathbb{Q}; \\ 0, & x \notin \mathbb{Q}; \end{cases} \]
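\[ f(\beta) = f(\alpha) + \frac{(\beta - \alpha)}{1!} f'(\alpha) + \dots + \frac{(\beta - \alpha)^{n-1}}{(n-1)!} f^{(n-1)}(\alpha) + R_n(f; \alpha, \beta), \]
where
\[ R_n(f; \alpha, \beta) = \frac{(\beta - \alpha)^m (\beta - \gamma)^{n-m}}{m\,(n-1)!} f^{(n)}(\gamma) \]
for some point $\gamma \in (\alpha, \beta)$. This is called Schlömilch's form of the remainder. Special cases are given by taking
\[ m = n: \quad R_n(f; \alpha, \beta) = \frac{(\beta - \alpha)^n}{n!} f^{(n)}(\gamma) \quad \text{(Lagrange form)} \]
\[ m = 1: \quad R_n(f; \alpha, \beta) = \frac{(\beta - \alpha)(\beta - \gamma)^{n-1}}{(n-1)!} f^{(n)}(\gamma) \quad \text{(Cauchy form)} \]
The Lagrange form is the easiest to remember and is adequate for most purposes. Hint: Let the constant $C$ be defined by
\[ f(\beta) - f(\alpha) - \frac{(\beta - \alpha)}{1!} f'(\alpha) - \dots - \frac{(\beta - \alpha)^{n-1}}{(n-1)!} f^{(n-1)}(\alpha) - (\beta - \alpha)^m C = 0, \]
and consider
\[ \varphi(x) := f(\beta) - f(x) - \frac{(\beta - x)}{1!} f'(x) - \dots - \frac{(\beta - x)^{n-1}}{(n-1)!} f^{(n-1)}(x) - (\beta - x)^m C. \]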
(b) (i) If $f(x) = e^x$, show that $\lim_{n\to\infty} R_n(f; 0, x) = 0$ for all $x$. That is, show that
\[ \lim_{n\to\infty} \left(1 + \frac{x}{1!} + \frac{x^2}{2!} + \dots + \frac{x^{n-1}}{(n-1)!}\right) = e^x, \qquad \forall x. \]
(b) (ii) If $f(x) = \sin(x)$, then $\lim_{n\to\infty} R_n(f; 0, x) = 0$ for all $x$.
(b) (iii) If $f(x) = \frac{1}{1-x}$,
(c) (i) How large must I take $n$ to approximate $e$ to four decimal places by the expression
\[ 1 + \frac{1}{1!} + \frac{1}{2!} + \dots + \frac{1}{(n-1)!}. \]
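One way to explore part (c) (i) numerically is sketched below. It rests on two assumptions made here: the Lagrange form of the remainder at $x = 1$, bounded by $3/n!$ since $e < 3$, and the reading of "four decimal places" as an error below $0.5 \times 10^{-4}$. The names `terms_needed` and `partial_sum` are hypothetical helpers.

```python
from fractions import Fraction
from math import e, factorial

def terms_needed(tolerance):
    """Smallest n with 3/n! < tolerance (Lagrange bound, using e < 3)."""
    n = 1
    while Fraction(3, factorial(n)) >= tolerance:
        n += 1
    return n

def partial_sum(n):
    """1 + 1/1! + ... + 1/(n-1)!, computed exactly."""
    return sum(Fraction(1, factorial(k)) for k in range(n))

tolerance = Fraction(1, 20000)   # 0.5e-4, the assumed meaning of four decimal places
n = terms_needed(tolerance)
approx = partial_sum(n)

print(n, float(approx), abs(float(approx) - e))
```

The bound is sufficient rather than sharp, so a smaller $n$ may already achieve the same accuracy.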
dn 2
(x 1)n ,
dxn
n = 1, 2, . . . .
Chapter 3
Riemann Integration
The definition of the Riemann integral is essentially the same in higher dimensions
as in one dimension. You may have learned Riemann integration of functions of one
variable by the lower and upper Riemann sums approach; the treatment adopted
here is slightly different but equivalent to that approach (see Exercise 17).
3.1
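\[ \lambda(I) := \left[(b_1 - a_1)^2 + \dots + (b_n - a_n)^2\right]^{1/2} = \sqrt{\sum_{k=1}^n (b_k - a_k)^2}. \]
If $p, q \in I$, then $|p - q| \le \lambda(I)$.
\[ \mu(I) := \prod_{k=1}^n (b_k - a_k). \]
In $\mathbb{R}$, $\mathbb{R}^2$, $\mathbb{R}^3$, $\mu$ is length, area, and volume respectively. We write $\mu_n$ if the dimension of the space needs to be emphasized. A set $S \subseteq \mathbb{R}^n$ has content zero (Jordan measure zero) if, for each $\varepsilon > 0$, there is a finite collection of intervals $I_i$, $i = 1, \dots, k$, such that
\[ S \subseteq \bigcup_{i=1}^k I_i \quad\text{and}\quad \sum_{i=1}^k \mu(I_i) < \varepsilon. \]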
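The two quantities attached to a closed interval, its diameter and its content, are easy to compute. The sketch below is illustrative only: the representation of an interval as a list of `(a_k, b_k)` pairs and the function names are assumptions made here.

```python
import math
from itertools import product

def diameter(interval):
    """Square root of the sum of squared side lengths."""
    return math.sqrt(sum((b - a) ** 2 for a, b in interval))

def content(interval):
    """Product of the side lengths (length, area, volume, ...)."""
    return math.prod(b - a for a, b in interval)

# Sample interval [0, 3] x [1, 5] in the plane.
I = [(0.0, 3.0), (1.0, 5.0)]

# The largest distance between two points of I is attained at corners.
corners = list(product(*I))
max_corner_distance = max(math.dist(p, q) for p in corners for q in corners)

print(diameter(I), content(I), max_corner_distance)
```

The largest corner-to-corner distance equals the diameter, which is why $|p - q|$ never exceeds it for $p, q \in I$.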
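(5) If $f$ is a continuous real valued function on $[0, 1]$, then $\{(x, f(x)) : 0 \le x \le 1\}$, the graph of $f$, has content zero in $\mathbb{R}^2$.
Proof. Let $\varepsilon > 0$. Since $f$ is uniformly continuous on $[0, 1]$ (Theorem 2.41), there exists $\delta > 0$ such that
\[ x, y \in [0, 1] \text{ and } |x - y| < \delta \implies |f(x) - f(y)| \le \varepsilon/2. \]
Now let $m$ be a natural number so that $m\delta < 1$ and $(m + 1)\delta \ge 1$. Consider the following intervals in $\mathbb{R}^2$:
\[ I_k = [k\delta, (k + 1)\delta] \times [f(k\delta) - \varepsilon/2, f(k\delta) + \varepsilon/2], \qquad k = 0, \dots, m - 1, \]
and $I_m = [m\delta, 1] \times [f(m\delta) - \varepsilon/2, f(m\delta) + \varepsilon/2]$. These intervals cover the graph of $f$ and
\[ \sum_{k=0}^{m} \mu(I_k) = (m\delta + 1 - m\delta)\varepsilon = \varepsilon, \]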
the union of the two half circles which are the graphs of $f(x) = \pm\sqrt{r^2 - x^2}$, and thus has zero content by (5) and (6).
3.1.1
Partition of I
i = 1, . . . , n.
Then $P = P_1 \times P_2 \times \dots \times P_n$ is said to be a partition of $I$. Notice that a partition generates a subdivision of $I$ into a finite collection of closed non-overlapping subintervals of the form
\[ [t_{j_1,1}, t_{j_1+1,1}] \times [t_{j_2,2}, t_{j_2+1,2}] \times \dots \times [t_{j_n,n}, t_{j_n+1,n}]. \]
The total number of subintervals generated by $P$ is $m_1 m_2 \cdots m_n$. A partition $Q$ is a refinement of $P$ if $P \subseteq Q$. Notice that a refinement of $P$ further subdivides the intervals $I_i$ generated by $P$.
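Partitions and refinements are straightforward to model. In the sketch below a partition is represented as one sorted list of points per coordinate axis; this representation, the function names and the sample partitions are assumptions made here for illustration.

```python
from itertools import product

def subintervals(P):
    """All subintervals generated by P; P holds one sorted point list per axis."""
    axis_pieces = [list(zip(pts[:-1], pts[1:])) for pts in P]
    return list(product(*axis_pieces))

def is_refinement(Q, P):
    """True if every partition point of P also appears in Q, axis by axis."""
    return all(set(p) <= set(q) for p, q in zip(P, Q))

# A partition of [0, 2] x [0, 1] with m1 = 2 and m2 = 1 pieces per axis.
P = [[0, 1, 2], [0, 1]]
# A refinement obtained by adding points on both axes.
Q = [[0, 0.5, 1, 2], [0, 0.25, 1]]

print(len(subintervals(P)), len(subintervals(Q)))
print(is_refinement(Q, P), is_refinement(P, Q))
```

The counts are the products of the numbers of pieces along each axis, in line with the formula $m_1 m_2 \cdots m_n$.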
3.1.2
Riemann Sums
Ii
i = 1, 2.
Exercises
3.1. If f is integrable on I, then f is bounded on I.
3.2
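Theorem 3.5 (Cauchy Criterion). Let $f : I \to \mathbb{R}^m$. $\int_I f$ exists if and only if for each $\varepsilon > 0$ there exists a partition $P_\varepsilon$ of $I$ such that if $P, Q \supseteq P_\varepsilon$, then
\[ |S(P, f) - S(Q, f)| < \varepsilon \]
for all Riemann sums $S(P, f)$, $S(Q, f)$ corresponding to $P$, $Q$.
Proof. ($\Longrightarrow$) If $\alpha = \int_I f$ exists, then for each $\varepsilon > 0$ there is a partition $P_\varepsilon$ of $I$ such that if $P \supseteq P_\varepsilon$, then
\[ |S(P, f) - \alpha| < \varepsilon/2. \tag{3.1} \]
Let $P, Q \supseteq P_\varepsilon$, then (3.1) implies
\[ |S(P, f) - S(Q, f)| \le |S(P, f) - \alpha| + |\alpha - S(Q, f)| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon, \]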
1
,
k
k = 1, 2, . . . .
(3.2)
1
,
k
, m k.
exists.
(3.3)
R
It remains to show that = I f (Theorem 2.20). From (3.3) it follows that, if
> 0, there is a natural number N such that
1
<
N
2
.
2
(3.4)
1
+ < + .
N
2
2 2
Proof. This can be proved by a straightforward induction on the number of subintervals into which I is partitioned. Do it.
The next corollary gives an equivalent but more readily applicable version of
the Cauchy Criterion.
Corollary 3.7 (Cauchy Criterion). $\int_I f$ exists if and only if for each $\varepsilon > 0$, there exists a partition $P_\varepsilon$ such that if $S_1(P_\varepsilon, f)$ and $S_2(P_\varepsilon, f)$ are any two Riemann sums corresponding to $P_\varepsilon$, then
\[ |S_1(P_\varepsilon, f) - S_2(P_\varepsilon, f)| < \varepsilon. \]
Proof. It is evident from Theorem 3.5 that the condition is necessary. To see that it is sufficient, let $P_\varepsilon$ satisfy the requirement of Corollary 3.7, and let $P$ and $Q$ be refinements of $P_\varepsilon$, generating subintervals $\{A_k\}$ and $\{B_j\}$ respectively. Then
\[ |S(P, f) - S(Q, f)| = \Big|\sum_k f(p_k)\mu(A_k) - \sum_j f(q_j)\mu(B_j)\Big| = \Big|\sum_i \Big[\sum_{A_k \subseteq I_i} f(p_k)\mu(A_k) - \sum_{B_j \subseteq I_i} f(q_j)\mu(B_j)\Big]\Big| \tag{3.5} \]
where $I_i$ are the subintervals generated by $P_\varepsilon$. Now there exist points $x_i'$, $x_i''$ in $I_i$ such that
\[ \Big|\sum_{A_k \subseteq I_i} f(p_k)\mu(A_k) - \sum_{B_j \subseteq I_i} f(q_j)\mu(B_j)\Big| \le \left[f(x_i') - f(x_i'')\right]\mu(I_i). \tag{3.6} \]
Indeed, to see that (3.6) is true, let
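where the max and min are taken over the $k$ and $j$ for which $A_k \subseteq I_i$ and $B_j \subseteq I_i$, two finite sets. Then
\[ f(x_i'')\sum_{A_k \subseteq I_i} \mu(A_k) - f(x_i')\sum_{B_j \subseteq I_i} \mu(B_j) \le \sum_{A_k \subseteq I_i} f(p_k)\mu(A_k) - \sum_{B_j \subseteq I_i} f(q_j)\mu(B_j) \le f(x_i')\sum_{A_k \subseteq I_i} \mu(A_k) - f(x_i'')\sum_{B_j \subseteq I_i} \mu(B_j), \]
and
\[ \sum_{A_k \subseteq I_i} \mu(A_k) = \sum_{B_j \subseteq I_i} \mu(B_j) = \mu(I_i), \]
so (3.6) follows.
From (3.5) and (3.6), we find
\[ |S(P, f) - S(Q, f)| \le \sum_i \left[f(x_i') - f(x_i'')\right]\mu(I_i) = |S_1(P_\varepsilon, f) - S_2(P_\varepsilon, f)| < \varepsilon. \]
Hence, by Theorem 3.5, $\int_I f$ exists.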
(I) = .
(I)
(I)
Thus,
(i) f is bounded on I,
(ii) the set of points of discontinuity of f has content zero.
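Then $\int_I f$ exists.
Proof. Suppose
\[ |f(p)| \le N, \qquad \forall p \in I \tag{3.7} \]
and $K$ is the set of points of discontinuity of $f$ in $I$. Let $\varepsilon > 0$. Since $K$ has content zero, there exists a partition $P_0$ of $I$ such that if $I_i$ are the subintervals generated by $P_0$, then
\[ \sum_{i,\ K \cap I_i \neq \emptyset} \mu(I_i) < \frac{\varepsilon}{4N}. \tag{3.8} \]
Let
\[ L_1 = \bigcup_{i,\ K \cap I_i \neq \emptyset} I_i, \qquad L_2 = \bigcup_{i,\ K \cap I_i = \emptyset} I_i. \]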
.
2(I)
.
2(I)
(3.9)
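Thus,
\[ |S_1(P, f) - S_2(P, f)| = \Big|\sum_i \left[f(p_i) - f(q_i)\right]\mu(I_i)\Big| \le \sum_i |f(p_i) - f(q_i)|\,\mu(I_i) \]
\[ = \sum_{I_i \subseteq L_1} |f(p_i) - f(q_i)|\,\mu(I_i) + \sum_{I_i \subseteq L_2} |f(p_i) - f(q_i)|\,\mu(I_i) \]
\[ \le 2N \sum_{I_i \subseteq L_1} \mu(I_i) + \frac{\varepsilon}{2\mu(I)} \sum_{I_i \subseteq L_2} \mu(I_i) < 2N\,\frac{\varepsilon}{4N} + \frac{\varepsilon}{2\mu(I)}\,\mu(I) = \varepsilon \]
by (3.7), (3.8), and (3.9). Thus, $f$ satisfies the Cauchy Criterion (Corollary 3.7) and $\int_I f$ exists.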
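Example 3.10 Let $f(x, y) = xy^2$, $0 \le x \le 1$, $0 \le y \le 1$ and $I = [0, 1] \times [0, 1]$. Then $f$ is continuous on $I$ so $\int_I f$ exists. Partition $I$ into $k^2$ intervals of side length $\frac{1}{k}$:
\[ P_k := \left\{ I_{i,j} = \left[\frac{i-1}{k}, \frac{i}{k}\right] \times \left[\frac{j-1}{k}, \frac{j}{k}\right],\ i = 1, \dots, k,\ j = 1, \dots, k \right\}. \]
If $p_{i,j} \in I_{i,j}$, then $\frac{i-1}{k}\left(\frac{j-1}{k}\right)^2 \le f(p_{i,j}) \le \frac{i}{k}\left(\frac{j}{k}\right)^2$, so that
\[ \sum_{i,j=1}^k \frac{i-1}{k}\left(\frac{j-1}{k}\right)^2 \frac{1}{k^2} \le S(P_k, f) \le \sum_{i,j=1}^k \frac{i}{k}\left(\frac{j}{k}\right)^2 \frac{1}{k^2}. \]
Here
\[ \sum_{i,j=1}^k \frac{i-1}{k}\left(\frac{j-1}{k}\right)^2 \frac{1}{k^2} = \frac{1}{k^5} \cdot \frac{k(k-1)}{2} \cdot \frac{(k-1)k(2k-1)}{6} \]
and
\[ \frac{1}{k^2}\sum_{i,j=1}^k \frac{i}{k}\left(\frac{j}{k}\right)^2 = \frac{1}{k^5}\sum_{i=1}^k i \sum_{j=1}^k j^2 = \frac{1}{k^5} \cdot \frac{k(k+1)}{2} \cdot \frac{k(k+1)(2k+1)}{6}, \]
and both bounds tend to $\frac{1}{6}$ as $k \to \infty$. Thus, $\int_I f = \frac{1}{6}$.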
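The squeeze in Example 3.10 can be reproduced with exact rational arithmetic. The following sketch assumes the uniform partition into $k^2$ squares; the function name `lower_upper_sums` and the choice $k = 40$ are illustrative.

```python
from fractions import Fraction

def lower_upper_sums(k):
    """Smallest and largest Riemann sums of f(x, y) = x*y^2 on the k-by-k grid."""
    h = Fraction(1, k)
    cell = h * h  # content of each square
    lower = sum((i - 1) * h * ((j - 1) * h) ** 2 * cell
                for i in range(1, k + 1) for j in range(1, k + 1))
    upper = sum(i * h * (j * h) ** 2 * cell
                for i in range(1, k + 1) for j in range(1, k + 1))
    return lower, upper

k = 40
lower, upper = lower_upper_sums(k)

# Closed forms from the sums of i and of j^2.
lower_closed = Fraction(k * (k - 1), 2) * Fraction((k - 1) * k * (2 * k - 1), 6) / k ** 5
upper_closed = Fraction(k * (k + 1), 2) * Fraction(k * (k + 1) * (2 * k + 1), 6) / k ** 5

print(float(lower), float(upper))
```

As $k$ grows the two bounds close in on $\frac{1}{6}$ from either side.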
Proof. The function $f$, with its domain extended to the interval $I \supseteq D$ as in the previous definition, is continuous at each point of $I$ except possibly at points in $\partial D$. But $\mu(\partial D) = 0$, so $\int_I f$ exists by Theorem 3.9. Therefore, $\int_D f$ exists and, by definition, is equal to $\int_I f$.
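Definition 3.13. A bounded set $D \subseteq \mathbb{R}^n$ has content if $\int_D 1$ exists and
\[ \mu(D) := \int_D 1. \]
Let
\[ \chi_D(p) := \begin{cases} 1, & \text{if } p \in D; \\ 0, & \text{if } p \notin D, \end{cases} \]
and let $I$ be any closed interval containing $D$ as a subset. Then $D$ has content if $\int_I \chi_D$ exists and then
\[ \mu(D) := \int_I \chi_D. \]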
Exercise 4.
(c) If $\int_D f$ exists, then $\int_D |f|$ exists, and
\[ \left|\int_D f\right| \le \int_D |f|. \]
i
i
D1
f and
D2
D2
D1
g
f
D
D
D
D
Z
Z
|| S(P, f )
f + || S(P, g)
g
D
(|| + ||).
Thus, $\int_D (\alpha f + \beta g)$ exists and equals $\alpha\int_D f + \beta\int_D g$.
Proof of (b). Exercise 6
Proof of (c). Exercise 7
Proof of (d): Let $\chi_{D_i}$ be the characteristic functions of $D_i$, $i = 1, 2$, and let $I$ be a closed interval containing $D_1 \cup D_2$. By definition
Z
Z
Z
f D1
f = f D1 =
ZD1 D2
ZI
ZD1
f D2
f = f D2 =
D2
D1 D2
D1 D2
D1 D2
D1
f.
D2
p 6 D1 D2 ,
with
(D1 D2 ) = 0,
Thus,
D1 D2
D1
f+
D2
f.
\[ m := \inf\{f(p) : p \in D\}, \quad M := \sup\{f(p) : p \in D\} \implies m \le \frac{\int_D f}{\mu(D)} \le M. \]
By Corollary 2.38, $D$ compact and $f$ continuous on $D$ implies that there exist points $p_1, p_2 \in D$ with $f(p_1) = m$ and $f(p_2) = M$. Therefore, by Corollary 2.34 (the Intermediate Value Theorem), there exists $p_0 \in D$ such that
\[ f(p_0) = \frac{\int_D f}{\mu(D)} \implies f(p_0)\mu(D) = \int_D f. \]
Discussion: A subset $K$ of $\mathbb{R}^n$ has Jordan measure zero if, for each $\varepsilon > 0$, there is a finite collection of intervals $\{I_k\}$ such that
\[ K \subseteq \bigcup_k I_k \quad\text{and}\quad \sum_k \mu(I_k) \le \varepsilon. \]
$K$ has Lebesgue measure zero if, for each $\varepsilon > 0$, there is a countable collection of intervals $\{I_k\}$ satisfying the displayed relation. This simple extension of the concept of measure zero has profound consequences. We know that the set of rationals in $[0, 1]$ does not have Jordan content. However, if this set is enumerated as $\{r_k : k = 1, 2, 3, \dots\}$, then $r_k \in I_k$ with $\mu(I_k) = \frac{\varepsilon}{2^k}$ yields
\[ \{r_k : k = 1, 2, 3, \dots\} \subseteq \bigcup_k I_k \quad\text{and}\quad \sum_k \mu(I_k) = \sum_{k=1}^\infty \frac{\varepsilon}{2^k} = \varepsilon. \]
Thus the set of rationals in [0, 1] has Lebesgue measure zero. In fact, the same
argument shows that Q, or any countable set of real numbers has Lebesgue measure
zero. Countable sets are not the only sets of Lebesgue measure zero however; in
fact, they can be quite complicated. The remarkable Cantor set which you will
study in Real Analysis has Lebesgue measure zero but nonetheless has the same
cardinality as [0, 1], i.e. it has zero length but has the same number of points
as [0, 1]. A simple discussion of the Cantor set can be found in Bartle's book.
An interesting theorem of Lebesgue states that, for a bounded real-valued function $f$, the Riemann integral $\int_I f$ exists if and only if the set of points in $I$ at which $f$ is discontinuous has Lebesgue measure zero. For example, for the functions in Exercises 15 and 16, the one in the first exercise is discontinuous at each point in [0, 1], while the function in the second exercise is only discontinuous at the rational points in [0, 1].
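The covering argument for the rationals in the discussion above can be tried out on a finite initial segment of an enumeration. In the sketch below the enumeration order (by increasing denominator), the cut-off denominator 12 and the helper names are assumptions made for illustration; the $k$-th rational receives an interval of length $\varepsilon/2^k$.

```python
from fractions import Fraction
from math import gcd

def rationals_in_unit_interval(max_denominator):
    """Reduced fractions p/q in [0, 1] with q <= max_denominator, each listed once."""
    points = []
    for q in range(1, max_denominator + 1):
        for p in range(0, q + 1):
            if gcd(p, q) == 1:
                points.append(Fraction(p, q))
    return points

def cover(points, eps):
    """Centre a closed interval of length eps / 2**k on the k-th point, k = 1, 2, ..."""
    intervals = []
    for k, r in enumerate(points, start=1):
        half = eps / 2 ** (k + 1)
        intervals.append((r - half, r + half))
    return intervals

eps = Fraction(1, 100)
points = rationals_in_unit_interval(12)
intervals = cover(points, eps)
total_length = sum(b - a for a, b in intervals)

print(len(points), float(total_length))
```

However many rationals are listed, the total length stays below $\varepsilon$, because the lengths form a geometric series.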
Exercises
3.3. Suppose $f$ and $g$ are bounded on $I$ and $f(p) = g(p)$ for all $p \in I\setminus K$, where $K$ has content zero. Prove that
\[ \int_I f \text{ exists} \implies \int_I g \text{ exists and equals } \int_I f. \]
f = sup{S(P, f )}
and
the sup and inf here being taken over all partitions P of I.
(i) Show that $Q \supseteq P$ implies
\[ \underline{S}(P, f) \le \underline{S}(Q, f) \le \overline{S}(Q, f) \le \overline{S}(P, f). \]
(ii) Show that
If.
f=
f=
f=
I
f.
I
(iv) Show that $f$ is integrable in this sense if and only if there is exactly one number $\alpha$ such that
\[ \underline{S}(P, f) \le \alpha \le \overline{S}(P, f) \]
for every partition $P$ of $I$, in which case $\int_I f = \alpha$.
(v) (Cauchy Criterion). Show that $\int_I f$ exists in this sense if and only if, for each $\varepsilon > 0$, there exists a partition $P_\varepsilon$ of $I$ such that
\[ |\overline{S}(P_\varepsilon, f) - \underline{S}(P_\varepsilon, f)| < \varepsilon. \]
(vi) Show that for a bounded $f$, $f$ is integrable on $I$ in the sense of (iii) if and only if $f$ is integrable in the sense adopted in these notes and that both definitions give the same value for $\int_I f$.
3.3
Evaluation of Integrals
3.3.1
Real valued functions of a real variable
a x b.
Proof. Consider the function $F(x) = \int_a^x f$, $a \le x \le b$, which exists by Exercise 14. We have from Theorem 3.15 (d) and (f) that
\[ F(x + h) - F(x) = \int_x^{x+h} f = f(c_h)h, \]
Proof. By the preceding Proposition, there is a constant $C$ so that $\int_a^x f = F(x) - C$. When $x = a$, this yields $\int_a^a f = 0 = F(a) - C$, while using this when $x = b$ we find $\int_a^b f = F(b) - C = F(b) - F(a)$.
Corollary 3.19 (Change of Variable Formula). Suppose that
(i) $\varphi'$ exists and is continuous on $[a, b]$.
(ii) $f$ is continuous on $\varphi([a, b])$.
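Then
\[ \int_{\varphi(a)}^{\varphi(b)} f = \int_a^b (f \circ \varphi)\varphi', \quad\text{or}\quad \int_{\varphi(a)}^{\varphi(b)} f(x)\,dx = \int_a^b f(\varphi(u))\varphi'(u)\,du. \]
Proof. If $F$ is an antiderivative of $f$, then
\[ \frac{d}{du} F(\varphi(u)) = f(\varphi(u))\varphi'(u), \]
so
\[ \int_{\varphi(a)}^{\varphi(b)} f = F(\varphi(b)) - F(\varphi(a)) = \int_a^b (f \circ \varphi)\varphi'. \]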
You will have noticed that we used the Chain Rule here even though it was
not yet proven in these Notes. A proof will be given in the next chapter in a more
general context.
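Example 3.20 To evaluate $\int_0^{\pi/2} \sin^2(u)\cos(u)\,du$, we take $f(x) = x^2$, $\varphi(u) = \sin(u)$, and $\varphi'(u) = \cos(u)$. Then
\[ \int_0^{\pi/2} \sin^2(u)\cos(u)\,du = \int_{\varphi(0)}^{\varphi(\pi/2)} x^2\,dx = \int_0^1 x^2\,dx = \frac{1}{3}. \]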
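Both sides of the change of variable in Example 3.20 can be approximated independently. The sketch below uses a plain midpoint rule; the helper `midpoint_rule` and the number of subintervals are assumptions made here, not part of the notes.

```python
import math

def midpoint_rule(h, a, b, n=10_000):
    """Midpoint Riemann sum of h over [a, b] with n equal subintervals."""
    width = (b - a) / n
    return width * sum(h(a + (i + 0.5) * width) for i in range(n))

# Left side: integrate (f o phi) * phi' over [0, pi/2].
left = midpoint_rule(lambda u: math.sin(u) ** 2 * math.cos(u), 0.0, math.pi / 2)

# Right side: integrate f(x) = x^2 over [phi(0), phi(pi/2)] = [0, 1].
right = midpoint_rule(lambda x: x ** 2, math.sin(0.0), math.sin(math.pi / 2))

print(left, right)
```

The two approximations agree with each other and with the exact value $\frac{1}{3}$ to within the accuracy of the midpoint rule.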
Example 3.22
\[ \int_0^1 xe^x\,dx = \left[xe^x\right]_0^1 - \int_0^1 e^x\,dx = e - \left[e^x\right]_0^1 = e - (e - 1) = 1. \]
3.3.2
Real valued functions on $\mathbb{R}^2$
The important theorem of Fubini allows us to extend the use of the Fundamental
Theorem of Calculus to functions of more than one variable.
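Theorem 3.23 (Fubini's Theorem). Let $f : I = [a, b] \times [c, d] \to \mathbb{R}$. Suppose that
(i) $\int_I f$ exists
(ii) $F(x) := \int_c^d f(x, y)\,dy$ exists for each $x \in [a, b]$.
Then
\[ \int_a^b F(x)\,dx = \int_a^b \left[\int_c^d f(x, y)\,dy\right] dx = \int_I f. \]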
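\[ P = \{x_0, x_1, \dots, x_m\} \times \{y_0, y_1, \dots, y_k\} \supseteq P_\varepsilon \implies \Big|\sum_{i=1}^m \sum_{j=1}^k f(x_i, y_j)(x_i - x_{i-1})(y_j - y_{j-1}) - \int_I f\Big| < \varepsilon, \]
that is,
\[ \Big|\sum_{i=1}^m (x_i - x_{i-1}) \sum_{j=1}^k f(x_i, y_j)(y_j - y_{j-1}) - \int_I f\Big| < \varepsilon. \tag{3.10} \]
Condition (ii) implies that for any fixed set of numbers $x_i$, $i = 1, \dots, m$, the partition $\{y_j\}$ of $[c, d]$ may be chosen to be so fine that
\[ \Big|\sum_{j=1}^k f(x_i, y_j)(y_j - y_{j-1}) - \int_c^d f(x_i, y)\,dy\Big| < \frac{\varepsilon}{b - a}, \qquad i = 1, \dots, m, \]
since each of the integrals $\int_c^d f(x_i, y)\,dy$ exists. Therefore, for the $x_i$ we have
\[ \Big|\sum_{j=1}^k f(x_i, y_j)(y_j - y_{j-1}) - F(x_i)\Big| < \frac{\varepsilon}{b - a}, \qquad i = 1, \dots, m. \tag{3.11} \]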
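Multiplying (3.11) by $(x_i - x_{i-1})$ and summing over $i$ gives
\[ \Big|\sum_{i=1}^m (x_i - x_{i-1}) \sum_{j=1}^k f(x_i, y_j)(y_j - y_{j-1}) - \sum_{i=1}^m F(x_i)(x_i - x_{i-1})\Big| < \frac{\varepsilon}{b - a} \sum_{i=1}^m (x_i - x_{i-1}) = \frac{\varepsilon}{b - a}(b - a) = \varepsilon. \tag{3.12} \]
The triangle inequality with (3.10) and (3.12) gives
\[ \Big|\sum_{i=1}^m F(x_i)(x_i - x_{i-1}) - \int_I f\Big| < 2\varepsilon. \]
Since $\varepsilon > 0$ is arbitrary, $\int_a^b F$ exists and equals $\int_I f$.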
Corollary 3.24 (Interchanging the order of integration). Let I = [a, b][c, d].
If
R
(i) I f exists
(ii)
(iii)
Rd
then
Rb
a
"Z
f (x, y) dy dx =
c
d
c
"Z
f (x, y) dx dy.
a
f=
"Z
f (x, y) dy dx.
Proof. Conditions (i) and (ii) imply that the integral $\int_I f$ exists by Theorem 3.9. Condition (iii) says that the intersection of $K$ with each vertical line (considered as a set in $\mathbb{R}$) has content zero so that $\int_c^d f(x, y)\,dy$ exists for each $x \in [a, b]$ again by Theorem 3.9. Therefore, all the conditions of Fubini's Theorem are satisfied.
(x)
Proof. Use Corollary 3.25. The graphs of $\varphi$ and $\psi$ have zero content in $\mathbb{R}^2$. Each vertical line intersects each graph once, and these two points have 1 dimensional content zero. See Figure 3.5.
Also,
f=
Z
1
0
x
1
dx = .
3
6
y2
1
dy = .
2
6
Z
xy dx dy =
2
.
=
3
6
21
30
70
0
0
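Also,
\[ \int_D f = \int_0^1 \left[\int_y^{\sqrt{y}} x^3y^2\,dx\right] dy = \int_0^1 \left[\frac{1}{4}x^4y^2\right]_{x=y}^{x=\sqrt{y}} dy = \int_0^1 \left(\frac{y^4}{4} - \frac{y^6}{4}\right) dy = \left[\frac{y^5}{20} - \frac{y^7}{28}\right]_0^1 = \frac{1}{70}. \]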
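The agreement of the two orders of integration can be checked numerically. The sketch below assumes, from the limits $x = y$ and $x = \sqrt{y}$ that survive in the example, that $f(x, y) = x^3y^2$ and that $D$ is the region between $y = x^2$ and $y = x$ for $0 \le x \le 1$; the nested midpoint rule and its resolution are illustrative choices.

```python
import math

def midpoint_rule(h, a, b, n=400):
    """Midpoint Riemann sum of h over [a, b] with n equal subintervals."""
    width = (b - a) / n
    return width * sum(h(a + (i + 0.5) * width) for i in range(n))

def f(x, y):
    return x ** 3 * y ** 2

# Integrate in y first: for each x in [0, 1], y runs from x^2 to x.
dy_first = midpoint_rule(
    lambda x: midpoint_rule(lambda y: f(x, y), x * x, x), 0.0, 1.0)

# Integrate in x first: for each y in [0, 1], x runs from y to sqrt(y).
dx_first = midpoint_rule(
    lambda y: midpoint_rule(lambda x: f(x, y), y, math.sqrt(y)), 0.0, 1.0)

print(dy_first, dx_first, 1 / 70)
```

Both iterated integrals come out close to $\frac{1}{70}$, as Fubini's Theorem predicts under these assumptions.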
y=x
2
y=x
y=0
Z
f (x, y) dy dx =
where
(y) =
f=
f (x, y) dx dy
1, if 0 y 1
y, if 1 y 2.
(y)
"Z
f (x, y) dx dy.
3.3.3
Real valued functions on $\mathbb{R}^n$
In
f.
F =
Z Z
I
Im
f (p, q) dq dp
(1) The proof of Theorem 3.29 is exactly the same as that given before when n = 2.
(2) The symbols dp, dq above are simply used as devices to indicate the spaces
on which we are integrating.
(3) For a more general formulation of Fubini's Theorem where the condition (ii) is dropped, see Calculus on Manifolds by M. Spivak (p. 58). However, the above statement of the theorem is sufficient for our needs.
(4) We will prove a change of variables formula for integrals in higher dimensions
in a later chapter.
Exercises
3.18. Let $f$ be a real-valued function on $[a, b]$ such that $f'(x)$ exists for each $x \in [a, b]$ and $\int_a^b f'$ exists. Prove that
\[ f(b) - f(a) = \int_a^b f'. \]
Rb
a
f is the
\[ D := \{(x, y) : a \le x \le b,\ 0 \le y \le f(x)\}. \]
That is, show $\int_D 1 = \int_a^b f$.
3.20. Let $D := \{(x, y) : 1 \le x \le 3,\ x^2 \le y \le x^2 + 1\}$. Show that
\[ \mu(D) = \int_1^3 \left[\int_{x^2}^{x^2+1} dy\right] dx = 2. \]
131
120 .
(Do
y
y=1
y=x
y=(x3)
4
Z
Z
g(t) dt dx =
tg(t) dt.
[Figure: sketch with curves labeled $y = 4$, $y = (2x + 4)/3$, $y = x^2 - 4x + 5$, $x = 4$, $y = x/2$, $x = 0$.]
f exists, then
f 2 exists. [Hint:
3.28. The Mean Value Theorem for Integrals (Theorem 3.15) implies the following: if $f$ is continuous on $[a, b]$, then there exists $c \in [a, b]$ such that
\[ \int_a^b f = f(c)(b - a). \]
Can you replace "continuous" by a less restrictive condition which still implies the result? [Hint: What was Darboux's first name?]
3.29. (Cavalieri's Principle) Let $A$ and $B$ be subsets of $\mathbb{R}^2$ with content. If $x \in \mathbb{R}$, define
\[ A_x := \{y : (x, y) \in A\}, \qquad B_x := \{y : (x, y) \in B\} \]
(sections of $A$ and $B$). Suppose that, for each $x$, $A_x$ and $B_x$ have content in $\mathbb{R}$ and $\mu_1(A_x) = \mu_1(B_x)$. Prove that $\mu_2(A) = \mu_2(B)$. [Hint: Spell Fubini.]
3.30. Let $f$ be a real-valued function on $[0, 1]$ such that $\int_0^1 f$ exists. Define $a_k$ by
\[ a_k := \frac{1}{k}\sum_{j=1}^k f\left(\frac{j}{k}\right), \qquad k = 1, 2, 3, \dots. \]
xf (sin(x)) dx =
0
/2
f (sin(x)) dx.
2
x sin(x)
.
dx =
2
4
2 sin (x)
x2
f (x, y) dy dx+
"Z
(x1)2
f (x, y) dy dx.
3.35. Show that $\int_D 1 = \frac{1}{6}$ where $D := \{(x, y, z) : x \ge 0,\ y \ge 0,\ z \ge 0,\ 0 \le x + y + z \le 1\}$.
3.36. Let $D$ be a subset of $\mathbb{R}^2$ with content and $f$ be a positive continuous function on $D$. Use Fubini's Theorem to show that if
\[ K := \{(x, y, z) : (x, y) \in D,\ 0 \le z \le f(x, y)\}, \]
then
\[ \mu_3(K) = \int_K 1 = \int_D f. \]
Deduce that $m\,\mu_2(D) \le \mu_3(K) \le M\,\mu_2(D)$ where $m$, $M$ are lower and upper bounds for $f$ on $D$.
3.37. Evaluate
Z 2Z 2
Z Z
2
sin(y)
ex dx dy and
dy dx.
y
0
y
0
x
3.38. Show
"Z 2
1y
1y 2
sin
1 x2
dx dy = 1.
Chapter 4
Differentiation of Functions of Several Variables
4.1
Preliminaries
4.1.1
Linear Functions
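\[ \begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix} = \begin{pmatrix} C_{1,1} & \dots & C_{1,n} \\ \vdots & & \vdots \\ C_{m,1} & \dots & C_{m,n} \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \quad\text{or}\quad q^T = Cp^T \quad (T \text{ is the transpose}). \]
In particular, $y_i = \sum_{j=1}^n C_{i,j}x_j$, $i = 1, \dots, m$.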
If $e_j$ is the vector with 1 in the $j$th coordinate and zeros elsewhere, $j = 1, \dots, n$, then
\[ p = (x_1, \dots, x_n) = x_1e_1 + \dots + x_ne_n = \sum_{j=1}^n x_je_j, \]
\[ L(p) = L(x_1e_1 + \dots + x_ne_n) = x_1L(e_1) + \dots + x_nL(e_n). \]
Figure 4.1. An affine plane in $\mathbb{R}^3$.
Example 4.3
\[ \operatorname{rank} \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix} = 2. \]
Thus, for the linear map $L : \mathbb{R}^3 \to \mathbb{R}^3$ corresponding to this matrix, the range $L(\mathbb{R}^3)$ has dimension 2. In fact, it consists of all vectors of the form $(s, s, t)$, that is, the plane $x = y$. An example of an affine space in $\mathbb{R}^3$ is $(0, 1, 0) + L(\mathbb{R}^3)$, which is the parallel plane through the point $(0, 1, 0)$, i.e., the plane $y = x + 1$.
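The rank and the shape of the range in Example 4.3 can be confirmed by exact Gaussian elimination. The sketch below assumes the matrix is read with rows $(1, 1, 0)$, $(1, 1, 0)$, $(0, 1, 1)$; the helper names `rank` and `apply` and the sample vectors are hypothetical.

```python
from fractions import Fraction

def rank(matrix):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def apply(C, p):
    """Matrix-vector product: y_i = sum_j C[i][j] * p[j]."""
    return [sum(c * x for c, x in zip(row, p)) for row in C]

C = [[1, 1, 0],
     [1, 1, 0],
     [0, 1, 1]]

samples = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (2, -5, 7)]
images = [apply(C, p) for p in samples]

print(rank(C), images)
```

Every image has equal first and second coordinates, which is the plane $x = y$ described above.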
Theorem 4.4. If $L$ is linear, there is a constant $M$ such that
\[ |L(p)| \le M|p|, \qquad \forall p \in \mathbb{R}^n. \]
Proof. Using the matrix representation of a linear map, we have by the Cauchy-Schwarz inequality
\[ |y_i| = |L_i(p)| = \Big|\sum_{j=1}^n C_{i,j}x_j\Big| \le \sqrt{\sum_{j=1}^n C_{i,j}^2}\,\sqrt{\sum_{j=1}^n x_j^2} \]
\[ \implies |L(p)| = \sqrt{\sum_{i=1}^m y_i^2} \le \sqrt{\sum_{i=1}^m \sum_{j=1}^n C_{i,j}^2}\,\sqrt{\sum_{j=1}^n x_j^2}, \]
or $|L(p)| \le M|p|$ where $M := \sqrt{\sum_{i=1}^m \sum_{j=1}^n C_{i,j}^2}$.
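The bound of Theorem 4.4 can be spot-checked on random vectors. In the sketch below the matrix `C`, the random seed and the sample size are arbitrary illustrative choices; `M` is the square root of the sum of the squared entries, as in the proof.

```python
import math
import random

def apply(C, p):
    """Matrix-vector product: y_i = sum_j C[i][j] * p[j]."""
    return [sum(c * x for c, x in zip(row, p)) for row in C]

def norm(v):
    """Euclidean norm."""
    return math.sqrt(sum(x * x for x in v))

def frobenius(C):
    """Square root of the sum of squares of all matrix entries."""
    return math.sqrt(sum(c * c for row in C for c in row))

random.seed(0)
C = [[1.0, -2.0, 0.5],
     [3.0, 0.0, -1.0]]       # a sample 2 x 3 matrix (illustrative)
M = frobenius(C)

samples = [[random.uniform(-10, 10) for _ in range(3)] for _ in range(1000)]
worst_ratio = max(norm(apply(C, p)) / norm(p) for p in samples)

print(M, worst_ratio)
```

The largest observed ratio $|L(p)|/|p|$ stays below $M$; the constant from the proof is generally not the smallest one that works.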
($\Longrightarrow$) $L$ is continuous on $\mathbb{R}^n$ by Corollary 4.5, which implies that $|L|$ is continuous on $\mathbb{R}^n$, and in particular, $|L|$ is continuous on $S := \{p : |p| = 1\}$. Since $S$ is compact and $|L|$ is continuous on $S$, $|L|$ achieves its minimum on $S$ at some point, call it $p_0$. Thus, we define
\[ k := \min\{|L(p)| : p \in S\} = |L(p_0)|, \quad\text{and } |p_0| = 1. \]
Exercises
4.1. Let $L : \mathbb{R}^2 \to \mathbb{R}^3$ be linear with $L(e_1) = (2, 1, 0)$, $L(e_2) = (1, 0, 1)$, where $e_1 = (1, 0)$, $e_2 = (0, 1)$. Find $L(2, 0)$, $L(1, 1)$, $L(1, 3)$. Draw pictures.
4.2. Show that $L(\mathbb{R}^2) \neq \mathbb{R}^3$ for the function in the last exercise.
4.3. Show that if $L : \mathbb{R}^2 \to \mathbb{R}^3$ is linear, then $L(\mathbb{R}^2) \neq \mathbb{R}^3$.
4.4. Let $L : \mathbb{R}^3 \to \mathbb{R}^2$ be linear. Show that there are non-zero vectors $p \in \mathbb{R}^3$ such that $L(p) = O$.
4.5. If $L : \mathbb{R}^2 \to \mathbb{R}^2$ has a matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$, show that
(i) $L(\mathbb{R}^2)$ is a point if and only if $a = b = c = d = 0$.
(ii) $L(\mathbb{R}^2)$ is a line if and only if $\Delta = ad - bc = 0$ and $a^2 + b^2 + c^2 + d^2 > 0$.
(iii) $L(\mathbb{R}^2) = \mathbb{R}^2$ if and only if $\Delta \neq 0$.
4.6. Show that in case (iii) of the last exercise, $L$ is one-to-one (and only in that case), and that the inverse function $L^{-1}$ is linear with matrix
\[ \frac{1}{\Delta}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}. \]
4.7. Show that the sum and composition of two linear functions are linear. What are the matrix representations of these?
4.8. If $L : \mathbb{R}^n \to \mathbb{R}^m$ is linear and one-to-one, then $L^{-1}$ is linear on its domain.
4.9. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be such that
(i) $f(p + q) = f(p) + f(q)$, for all $p, q \in \mathbb{R}^n$ (additive)
(ii) $f$ is continuous at $O$.
(f is homogeneous).
Notice that $f$ must be linear since $f(x) = f(1)x$. Show that homogeneity does not imply linearity for functions of more than one variable. HINT: Consider the function
\[ f(x, y) = \begin{cases} \dfrac{x^3 + y^3}{x^2 + y^2}, & \text{if } (x, y) \neq O \\ 0, & \text{otherwise.} \end{cases} \]
4.1.2
c+tu
p+t(qp)
u
q
qp
c
f ( ) f (0 )
: t R}.
0
f ( ) f (0 )
. We therefore define
0
f( )
f( )
0
f ( ) f (0 )
0
exists.
Evidently,
f (0 ) =
fn ( ) fn (0 )
f1 ( ) f1 (0 )
lim
, . . . , lim
0
0
0
0
= (f1 (0 ), . . . , fn (0 )).
4.2
(t R)
t0
t0
Thus, $G_u(c) = (u, f_u(c))$ is the direction (in $\mathbb{R}^{n+m}$) of the tangent to the curve
\[ \{G(c + tu) : t \in \mathbb{R}\} = \{(c + tu, f(c + tu)) : t \in \mathbb{R}\} \]
at the point $(c, f(c))$. (See Figure 4.5.)
Notice that
1
= (u1 , u2 , 2u1 c1 + 2u2 c2 ).
t
u1
1
= 0
u2
2u1 c1 + 2u2 c2
2c1
0
u
1 1
u2
2c2
\[ \lim_{t\to 0} \frac{f(c + tu) - f(c)}{t} = \lim_{h\to 0} \frac{f(c + h) - f(c)}{h}\,u = f'(c)u. \]
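(4) Define $f : \mathbb{R}^2 \to \mathbb{R}$ by
\[ f(x, y) = \begin{cases} 0, & \text{when } y \neq x^2; \\ 1, & \text{when } y = x^2 \text{ and } (x, y) \neq O; \\ 0, & \text{when } (x, y) = O. \end{cases} \]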
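(5) This example shows that the directional derivative does not have to be linear in $u$. Define $f : \mathbb{R}^2 \to \mathbb{R}$ by
\[ f(x, y) = \begin{cases} \dfrac{x^2y}{x^3 - y^2}, & \text{when } x^3 \neq y^2; \\ 0, & \text{when } x^3 = y^2. \end{cases} \]
Then $f$ is not continuous at $(0, 0)$ (why?!). Computing $f_u(0, 0)$, for $u = (u, v) \in \mathbb{R}^2$, we find
\[ f_{(u,v)}(0, 0) = \begin{cases} -\dfrac{u^2}{v}, & \text{if } v \neq 0; \\ 0, & \text{if } v = 0. \end{cases} \]
In particular, $f_{(u,v)}$ is not linear in $u = (u, v)$.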
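The directional derivatives in example (5) can be approximated with exact difference quotients. The sketch below assumes the reading $f(x, y) = x^2y/(x^3 - y^2)$ off the curve $x^3 = y^2$, with the minus signs restored; the helper names and the step $t = 10^{-9}$ are illustrative.

```python
from fractions import Fraction

def f(x, y):
    """x^2 y / (x^3 - y^2) off the curve x^3 = y^2, and 0 on it."""
    if x ** 3 == y ** 2:
        return 0
    return Fraction(x ** 2 * y) / (x ** 3 - y ** 2)

def difference_quotient(u, v, t):
    """(f(t*u, t*v) - f(0, 0)) / t for a nonzero step t."""
    return (f(t * u, t * v) - f(0, 0)) / t

def limit_formula(u, v):
    """Closed form of the directional derivative at the origin (integer u, v)."""
    return Fraction(-u * u, v) if v != 0 else Fraction(0)

t = Fraction(1, 10 ** 9)
d_e1 = difference_quotient(1, 0, t)     # direction e1 = (1, 0)
d_e2 = difference_quotient(0, 1, t)     # direction e2 = (0, 1)
d_sum = difference_quotient(1, 1, t)    # direction e1 + e2 = (1, 1)

print(float(d_e1), float(d_e2), float(d_sum))
```

The derivatives along $e_1$ and $e_2$ are both 0, while along $e_1 + e_2$ the quotient is close to $-1$, so additivity in $u$ fails.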
The preceding examples show that when $f$ is a nice function at $c$, then $f_u(c)$ is linear in $u$, but in general it does not need to be linear even if it exists for all $u$. Observe however, that all the $f_u(c)$ in the above examples are homogeneous in $u$, that is, $f_{\lambda u}(c) = \lambda f_u(c)$ for all $\lambda \in \mathbb{R}$. When they fail to be linear, it is the additive property that fails. The homogeneity in $u$ is always true when $f_u(c)$ exists.
Exercises
4.11. Prove that if $f_u(c)$ exists, then $f_{\lambda u}(c) = \lambda f_u(c)$.
4.12. Prove that the expressions given in the directional derivatives of (4) and (5) in Example 4.10 are correct.
4.13. Check that the following partial derivatives are correct:
(a) The function $f(x, y) = x^2 + y^3$ has partial derivatives
\[ \frac{\partial f}{\partial x}(x, y) = 2x, \qquad \frac{\partial f}{\partial y}(x, y) = 3y^2, \]
f
(x, y) = (x cos(xy), 0, sin(y)).
y
2x
3y 2
0 cos(y) .
ex
0
Motivation: At first glance it seems that the directional derivative determines the properties of functions in the same way that the derivative in one variable does. Unfortunately, that is not quite the case. For example, in Example 4.10 (4), we have shown that a function may be discontinuous at a point where all directional derivatives exist. Therefore, a more rigorous approach must be used to extend the role of differentiation. By reformulating the definition, we can realize the expected properties, and hence the usefulness, of the derivative. To this end, we reformulate the definition of the derivative of a function in one variable to showcase the linearity as found in Example 4.10 (3).
Definition 4.11 (Derivative in one variable reformulated). A function $f : \mathbb{R} \to \mathbb{R}$ is differentiable at $c$ if there exists a number $f'(c)$ such that
\[ \lim_{x\to c} \frac{f(x) - f(c)}{x - c} = f'(c) \iff \lim_{x\to c} \left|\frac{f(x) - f(c)}{x - c} - f'(c)\right| = \lim_{x\to c} \frac{|f(x) - f(c) - f'(c)(x - c)|}{|x - c|} = 0. \tag{4.1} \]
= lim
pc
u Rn
and
i = 1, . . . , m,
where
f (p) = (f1 (p), . . . , fm (p)),
Proof. This follows because the limit exists if and only if the limit of each component exists (this uses the equivalence of $d(p, c)$ and $d_\infty(p, c)$):
\[ \lim_{p\to c} \frac{|f(p) - f(c) - L(p - c)|}{|p - c|} = 0 \iff \lim_{p\to c} \frac{|f_i(p) - f_i(c) - L_i(p - c)|}{|p - c|} = 0, \qquad i = 1, \dots, m. \]
The matrix for the differential must have the form provided by the next theorem.
Theorem 4.15. If $f : D \subseteq \mathbb{R}^n \to \mathbb{R}^m$ and $f$ is differentiable at an interior point $c$ of $D$, then
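$$\begin{bmatrix}
\frac{\partial f_1}{\partial x_1}(c) & \cdots & \frac{\partial f_1}{\partial x_n}(c)\\
\vdots & \ddots & \vdots\\
\frac{\partial f_m}{\partial x_1}(c) & \cdots & \frac{\partial f_m}{\partial x_n}(c)
\end{bmatrix},$$
This matrix is the Jacobian matrix of $f$ at $c$, and can be denoted also as $f'(c)$ or $\left[\frac{\partial f_i}{\partial x_j}(c)\right]$.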
Proof.
t0
Here we used the homogeneous property of linearity, Df (c)(tu) = tDf (c)(u) for
the second step.
(ii) Recall from Theorem 4.2 that the matrix of $L$ is $[C_{i,j}] = [L_i(e_j)]$. Here, by part (i),
$$L_i(e_j) = Df_i(c)(e_j) = (f_i)_{e_j}(c) = \frac{\partial f_i}{\partial x_j}(c),\quad i=1,\dots,m,\ j=1,\dots,n.$$
Notes: In the case when $n = m$, the determinant $\det f'(c)$ is called the Jacobian
of $f$ at $c$ and is often denoted by
$$J_f(c) \quad\text{or}\quad \frac{\partial(f_1,\dots,f_n)}{\partial(x_1,\dots,x_n)}(c).$$
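$$\sum_{j=1}^{n}\frac{\partial f_i}{\partial x_j}(c)\,u_j.$$
Alternatively, the partials $\frac{\partial f_i}{\partial x_j}(c)$ are unique and hence the matrix $f'(c) = \left[\frac{\partial f_i}{\partial x_j}(c)\right]$ is unique.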
i.e. $|f(p) - f(c) - L(p - c)| < \varepsilon|p - c|$. In particular, with $\varepsilon = 1$, we have
$$|p - c| < \delta(1) \implies |f(p) - f(c) - L(p - c)| < |p - c| \implies |f(p) - f(c)| \le (1 + M)|p - c|,$$
where the last step uses Theorem 4.4.
i = 1, . . . , m,
j = 1, . . . , n,
$p = p_0$, $p_1 = (\gamma_1, x_2, \dots, x_n)$, $p_2 = (\gamma_1, \gamma_2, x_3, \dots, x_n)$, $\dots$, $c = p_n$, where $c = (\gamma_1, \dots, \gamma_n)$.
(See Figure 4.8.) Note that
and
(4.2)
$$f(p) - f(c) = \sum_{k=1}^{n}\left[f(p_{k-1}) - f(p_k)\right]. \tag{4.3}$$
Now on each line segment between $p_{k-1}$ and $p_k$, $k = 1, \dots, n$, we are really considering a real-valued differentiable function of a real variable, so we may use the Mean Value Theorem:
$$f(p_{k-1}) - f(p_k) = \frac{\partial f}{\partial x_k}(p_k^*)\,(x_k - \gamma_k) \tag{4.4}$$
where $p_k^*$ is on the line segment from $p_{k-1}$ to $p_k$, $k = 1, \dots, n$. Clearly, $p_k^* \in \{p : |p - c| < \delta\}$ by (4.2), since this set is convex. From (4.3) and (4.4), we find
$$f(p) - f(c) = \sum_{k=1}^{n}\frac{\partial f}{\partial x_k}(p_k^*)\,(x_k - \gamma_k).$$
n
X
f
(c)uk .
xk
k=1
Pn
k=1
xk (pk )
|f (p)f (c)L(pc)|
|pc|
(xk k )
2 1/2 P
1/2
n
[ k=1 (xk k )2
f
xk (c)
(Cauchy-Schwarz)
n|p c|
=
f
xk (c)
<
n ,
when |p c| <
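$$f'(x_1, x_2) = \begin{bmatrix} 1 & 0\\ 0 & 1\\ 2x_1 & 2x_2 \end{bmatrix}.$$
For example, if $L = Df(0, 0)$, then $L : \mathbb{R}^2 \to \mathbb{R}^3$ and
$$\begin{bmatrix} L_1 u\\ L_2 u\\ L_3 u \end{bmatrix} = \begin{bmatrix} 1 & 0\\ 0 & 1\\ 0 & 0 \end{bmatrix}\begin{bmatrix} u_1\\ u_2 \end{bmatrix},\quad\text{where } u = (u_1, u_2).$$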
In other words, $Df(0, 0)(u_1, u_2) = (u_1, u_2, 0)$ for each $(u_1, u_2) \in \mathbb{R}^2$. Check for
yourself that
$$Df(0, 1)(u_1, u_2) = (u_1, u_2, 2u_2),$$
and
$$f'(t) = \begin{bmatrix} -\sin(t)\\ \cos(t)\\ 1 \end{bmatrix};$$
and $Df(0)(u) = (0, u, u)$, $Df(\pi/2)(u) = (-u, 0, u)$.
$$f'(x, y) = \begin{bmatrix} 1 & 1\\ 2(x + y) & 2(x + y) \end{bmatrix}$$
and $Df(x_0, y_0)(u, v) = (u + v,\ 2(x_0 + y_0)(u + v))$.
Interpretation: Suppose $f : \mathbb{R}^n \to \mathbb{R}^m$ and let $L : \mathbb{R}^n \to \mathbb{R}^m$ be linear. Now
$$\lim_{u\to O}\frac{|f(c + u) - f(c) - L(u)|}{|u|} = 0 \quad\text{if } L = Df(c),$$
Exercises
4.16. In Example 4.22, sketch the range of f (R2 ) (it is a curve in R2 ). Find the
tangent at one or two points of the range.
4.17. Let $f : \mathbb{R}^2 \to \mathbb{R}^2$ be defined as $f(x, y) = (x^2 + y, y^2)$. Check that
$$f'(x_0, y_0) = \begin{bmatrix} 2x_0 & 1\\ 0 & 2y_0 \end{bmatrix}.$$
then $Df(c)(u) = 0$ for all $u \in \mathbb{R}^n$. [Hint: Show $\frac{\partial f}{\partial x_i}(c) = 0$, for $i = 1, \dots, n$.]
4.19. Let f be a real-valued continuous function on a compact subset K of Rn such
that
for all u Rn .
for each
u Rn .
Deduce that the largest value of $f_u(c)$ under the restriction $|u| = 1$ is $|\nabla f(c)|$
and this value is attained when $u = \nabla f(c)/|\nabla f(c)|$. This means that the
direction of maximum rate of increase of $f$ at $c$ is the direction of the gradient
vector. [Hint: What was Cauchy-Schwarz's first name?]
4.24. Sketch the surface $\{(x, y, xy) : (x, y) \in \mathbb{R}^2\}$ in $\mathbb{R}^3$ (i.e. $z = xy$) and show that
the tangent to this surface at the point $(1, 1, 1)$ is $(1, 1, 1) + \{(u, v, u + v) : (u, v) \in \mathbb{R}^2\}$ (i.e. $z + 1 = x + y$).
4.25. If $L : \mathbb{R}^n \to \mathbb{R}^m$ is linear, then the differential $DL$ exists and equals $L$ at each
point in $\mathbb{R}^n$.
4.2.1 Differentiation Rules
for each
u Rn .
for each
u Rn .
by the triangle inequality. Both terms in this last expression have a limit of zero as
p c, therefore, Dh(c) exists and equals the specified L.
(ii) Let L : Rn R be defined by
L(u) = (c) L (u) + (c) L (u),
u Rn .
Then
|k(p) k(c) L(p c)|
=
|p c|
|(p) (p) (c) (c) (c) L (p c) (c) L (p c)|
|p c|
= |(p) [(p) (c) L (p c)] + (c) [(p) (c) L (p c)]
+ [(p) (c)] L (p c)| /|p c|
|(p)|
The last inequality was the result of several applications of the Cauchy-Schwarz
inequality. As p c in the last expression, the first term goes to zero since
is continuous and D(c) exists, the second term goes to zero since D(c) exists,
while the third term goes to zero by the continuity of at c and the fact that
|L (p c)| M |p c| for some constant M since L is linear.
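Theorem 4.24 (The Chain Rule). Suppose $\varphi : \mathbb{R}^n \to \mathbb{R}^m$ and $\psi : \mathbb{R}^m \to \mathbb{R}^\ell$,
and
(i) $\varphi$ is differentiable at $c \in \mathbb{R}^n$,
(ii) $\psi$ is differentiable at $b = \varphi(c) \in \mathbb{R}^m$.
Then $f = \psi \circ \varphi$, defined by $f(p) = \psi(\varphi(p))$, is differentiable at $c$ and
$$Df(c) = D\psi(b) \circ D\varphi(c).$$
In particular, the Jacobian matrices $f'$, $\psi'$, $\varphi'$ satisfy
$$f'(c) = \psi'(b)\,\varphi'(c),$$
that is,
$$\begin{bmatrix}
\frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_n}\\
\vdots & \ddots & \vdots\\
\frac{\partial f_\ell}{\partial x_1} & \cdots & \frac{\partial f_\ell}{\partial x_n}
\end{bmatrix}
=
\begin{bmatrix}
\frac{\partial \psi_1}{\partial y_1} & \cdots & \frac{\partial \psi_1}{\partial y_m}\\
\vdots & \ddots & \vdots\\
\frac{\partial \psi_\ell}{\partial y_1} & \cdots & \frac{\partial \psi_\ell}{\partial y_m}
\end{bmatrix}
\begin{bmatrix}
\frac{\partial \varphi_1}{\partial x_1} & \cdots & \frac{\partial \varphi_1}{\partial x_n}\\
\vdots & \ddots & \vdots\\
\frac{\partial \varphi_m}{\partial x_1} & \cdots & \frac{\partial \varphi_m}{\partial x_n}
\end{bmatrix},$$
or, componentwise,
$$\frac{\partial f_i}{\partial x_j}(c) = \sum_{k=1}^{m}\frac{\partial \psi_i}{\partial y_k}(b)\,\frac{\partial \varphi_k}{\partial x_j}(c),\quad i = 1, \dots, \ell,\ j = 1, \dots, n.$$
Less precisely,
$$\frac{\partial f}{\partial x_j} = \sum_{k=1}^{m}\frac{\partial f}{\partial y_k}\,\frac{\partial y_k}{\partial x_j},$$
and, in one variable,
$$\frac{df}{dt} = \frac{d\psi}{d\varphi}\,\frac{d\varphi}{dt}.$$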
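The matrix form of the Chain Rule can be spot-checked numerically. The sketch below is only an illustration: the maps `phi` and `psi` are hypothetical sample functions (not taken from the text), and the Jacobians are approximated by central differences rather than computed exactly.

```python
import math

def jacobian(func, point, h=1e-6):
    """Central-difference approximation of the Jacobian matrix of func at point."""
    base = func(point)
    rows = [[0.0] * len(point) for _ in base]
    for j in range(len(point)):
        plus = list(point)
        minus = list(point)
        plus[j] += h
        minus[j] -= h
        fp, fm = func(plus), func(minus)
        for i in range(len(base)):
            rows[i][j] = (fp[i] - fm[i]) / (2 * h)
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# Hypothetical sample maps: phi maps R^2 to R^3, psi maps R^3 to R^2.
def phi(p):
    x, y = p
    return [x * y, x + y, math.sin(x)]

def psi(q):
    a, b, c = q
    return [a * a + b, b * c]

def f(p):
    return psi(phi(p))

c = [0.7, -0.3]
lhs = jacobian(f, c)                                    # f'(c)
rhs = matmul(jacobian(psi, phi(c)), jacobian(phi, c))   # psi'(b) phi'(c)
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_gap)
```

The printed gap should be at the level of finite-difference error, which is consistent with $f'(c) = \psi'(b)\,\varphi'(c)$ for this sample.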
where
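In other words, $f = \psi \circ \varphi$ where $\varphi(x, y) = (u(x, y), v(x, y))$. If $D\varphi(x_0, y_0)$ and
$D\psi(u_0, v_0)$ exist, where $u_0 = u(x_0, y_0)$ and $v_0 = v(x_0, y_0)$, then $Df(x_0, y_0)$ exists
and
$$f'(x_0, y_0) = \psi'(u_0, v_0)\,\varphi'(x_0, y_0).$$
In matrix form,
$$\begin{bmatrix} \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} \end{bmatrix}_{(x_0, y_0)}
= \begin{bmatrix} \frac{\partial \psi}{\partial u} & \frac{\partial \psi}{\partial v} \end{bmatrix}_{(u_0, v_0)}
\begin{bmatrix} \frac{\partial u}{\partial x} & \frac{\partial u}{\partial y}\\ \frac{\partial v}{\partial x} & \frac{\partial v}{\partial y} \end{bmatrix}_{(x_0, y_0)}.$$
Less precisely,
$$\frac{\partial f}{\partial x} = \frac{\partial \psi}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial \psi}{\partial v}\frac{\partial v}{\partial x},\qquad
\frac{\partial f}{\partial y} = \frac{\partial \psi}{\partial u}\frac{\partial u}{\partial y} + \frac{\partial \psi}{\partial v}\frac{\partial v}{\partial y}.$$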
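Example 4.27 $n = 1$, $m = 2$, $\ell = 1$, $F(t) = h(x, y)$ where $x = r(t)$, $y = s(t)$. Then
$$\frac{dF}{dt} = \begin{bmatrix} \frac{\partial h}{\partial x} & \frac{\partial h}{\partial y} \end{bmatrix}\begin{bmatrix} \frac{dx}{dt}\\ \frac{dy}{dt} \end{bmatrix},$$
or,
$$\frac{dF}{dt} = \frac{\partial h}{\partial x}\frac{dx}{dt} + \frac{\partial h}{\partial y}\frac{dy}{dt}.$$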
Example 4.28 Suppose a particle's position $(x, y, z)$ in space is given at time $t$ by
$x = \cos(t)$, $y = \sin(t)$, $z = t$ (it is moving on a helix), and the temperature at any
point $(x, y, z)$ is given by $T(x, y, z) = x^2 + y^2 + z^2$. If $H(t)$ is the temperature of
the particle at time $t$, find $dH/dt$.
In this case, $H = T \circ f$, where
$$T(x, y, z) = x^2 + y^2 + z^2,\qquad f(t) = (\cos(t), \sin(t), t).$$
From the Chain Rule, $H'(t) = T'(x, y, z)\,f'(t)$; more exactly
$$\frac{dH}{dt} = \begin{bmatrix} 2x & 2y & 2z \end{bmatrix}\begin{bmatrix} -\sin(t)\\ \cos(t)\\ 1 \end{bmatrix},\quad\text{or}\quad \frac{dH}{dt} = -2x\sin(t) + 2y\cos(t) + 2z = 2t.$$
You can confirm this by computing $\frac{dH}{dt}$ directly from $H(t) = 1 + t^2$.
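As a quick numerical check of Example 4.28, the sketch below evaluates the chain-rule product along the helix and compares it with $2t$ and with a difference quotient of $H$. The function names are illustrative choices, not notation from the text.

```python
import math

def temperature(x, y, z):
    return x * x + y * y + z * z

def position(t):
    """Helix from Example 4.28."""
    return math.cos(t), math.sin(t), t

def dH_dt_chain(t):
    """Row vector T'(x, y, z) times the column vector f'(t)."""
    x, y, z = position(t)
    grad_T = (2 * x, 2 * y, 2 * z)
    velocity = (-math.sin(t), math.cos(t), 1.0)
    return sum(g * v for g, v in zip(grad_T, velocity))

def dH_dt_numeric(t, h=1e-6):
    """Central difference quotient of H(t) = T(f(t))."""
    return (temperature(*position(t + h)) - temperature(*position(t - h))) / (2 * h)

for t in (0.0, 1.3, -2.0):
    print(t, dH_dt_chain(t), dH_dt_numeric(t), 2 * t)
```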
|p c|
|p c|
as $p \to c$ by (1) and (2). Since $L_\psi \circ L_\varphi$ is a linear function, we have shown that
$Df(c)$ exists and equals $L_\psi \circ L_\varphi$.
It remains to prove assertions (1) and (2). For assertion (1), let $\varepsilon > 0$ be given.
Since $\varphi$ is differentiable at $c$, there is a neighborhood $U$ of $c$ and a constant $K > 0$
such that if $p \in U$, then
$$|\varphi(p) - \varphi(c)| < K|p - c|$$
(see the proof of Theorem 4.17). Since $L_\psi$ is the differential of $\psi$ at $b = \varphi(c)$, there
exists a $\delta > 0$ such that
$$G : \mathbb{R}^{2m} \to \mathbb{R}^m,\ G(q_1, q_2) = q_1 + q_2,\qquad H : \mathbb{R}^{2m} \to \mathbb{R},\ H(q_1, q_2) = q_1 \cdot q_2,\qquad q_1, q_2 \in \mathbb{R}^m.$$
Then $G \circ F = \varphi + \psi$ and $H \circ F = \varphi \cdot \psi$.
Theorem 4.29 (Mean Value Theorem). Suppose $f : \mathbb{R}^n \to \mathbb{R}$. If $a, b \in \mathbb{R}^n$
and $f$ is differentiable at each point of the line segment $S$ between $a$ and $b$, then
there is a point $c \in S$, $c \ne a$ or $b$, such that
$$f(b) - f(a) = Df(c)(b - a).$$
In the notation of Exercise 23, this may be written as
$$f(b) - f(a) = \nabla f(c) \cdot (b - a) = \sum_{j=1}^{n}\frac{\partial f}{\partial x_j}(c)(b_j - a_j),$$
where $b = (b_1, \dots, b_n)$ and $a = (a_1, \dots, a_n)$.
Proof. We reduce the problem to one variable by introducing the function $F : [0, 1] \to \mathbb{R}$ defined by
$$F(t) = f(a + t(b - a)) = f(\varphi(t)),\qquad \varphi(t) = a + t(b - a),\quad 0 \le t \le 1.$$
By the Chain Rule, $F'(t)$ exists for $0 \le t \le 1$ and
$$F'(t) = DF(t)(1) = Df(\varphi(t)) \circ D\varphi(t)(1) = Df(\varphi(t))\,\varphi'(t) = Df(\varphi(t))(b - a).$$
By the Mean Value Theorem for $F(t)$,
$$F(1) - F(0) = F'(t_0)(1 - 0) = F'(t_0),$$
for some $t_0 \in (0, 1)$. Thus,
$$f(b) - f(a) = Df(\varphi(t_0))(b - a) = Df(c)(b - a),\qquad c = \varphi(t_0).$$
The Mean Value Theorem does not hold for functions $f : \mathbb{R}^n \to \mathbb{R}^m$ if $m > 1$.
Can you tell why? If you cannot, try to carry out a proof with $m > 1$. See Exercise
30 for what can be said.
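To make the statement concrete, the following sketch locates a mean value point $c$ on a segment for one hypothetical sample function (chosen here for illustration, not taken from the text). It uses bisection on $t \mapsto Df(\varphi(t))(b - a) - (f(b) - f(a))$, which assumes this quantity changes sign on $[0, 1]$; that holds for the sample below.

```python
def f(p):
    x, y = p
    return x * x + 3 * x * y

def grad_f(p):
    x, y = p
    return (2 * x + 3 * y, 3 * x)

def mean_value_point(a, b, steps=60):
    """Return c on the segment from a to b with f(b) - f(a) = Df(c)(b - a)."""
    d = [bj - aj for aj, bj in zip(a, b)]
    target = f(b) - f(a)

    def gap(t):
        p = [aj + t * dj for aj, dj in zip(a, d)]
        return sum(g * dj for g, dj in zip(grad_f(p), d)) - target

    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2
        # Keep the half-interval on which gap still changes sign.
        if gap(lo) * gap(mid) <= 0:
            hi = mid
        else:
            lo = mid
    t0 = (lo + hi) / 2
    return [aj + t0 * dj for aj, dj in zip(a, d)]

c = mean_value_point([0.0, 0.0], [1.0, 2.0])
print(c)
```

For this sample, $F(t) = 7t^2$, so $F'(t_0) = 7$ gives $t_0 = 1/2$ and $c = (1/2, 1)$.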
Exercises
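4.26. Let $f : \mathbb{R} \to \mathbb{R}$ be differentiable. If $F : \mathbb{R}^2 \to \mathbb{R}$ is defined by
(a) $F(x, y) = f(xy)$, then $x\frac{\partial F}{\partial x} = y\frac{\partial F}{\partial y}$,
(b) $F(x, y) = f(ax + by)$, then $b\frac{\partial F}{\partial x} = a\frac{\partial F}{\partial y}$,
(c) $F(x, y) = f(x^2 + y^2)$, then $y\frac{\partial F}{\partial x} = x\frac{\partial F}{\partial y}$.
4.27. Define $f : \mathbb{R}^2 \to \mathbb{R}$ by
$$f(x, y) = \begin{cases} (x^2 + y^2)\sin\left(\dfrac{1}{x^2 + y^2}\right), & \text{if } (x, y) \ne (0, 0),\\ 0, & \text{if } (x, y) = (0, 0). \end{cases}$$
Show that $f$ is differentiable at $(0, 0)$, but $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ are not continuous at
$(0, 0)$.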
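4.28. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a differentiable function and $C$ be a smooth curve in
$\mathbb{R}^n$ on which the function is constant. Prove that for any point $c \in C$, the
tangent to $C$ at $c$ is perpendicular to
$$\nabla f(c) = \left(\frac{\partial f}{\partial x_1}(c), \dots, \frac{\partial f}{\partial x_n}(c)\right).$$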
xn
t R.
x5 + y 5 + z 5
,
(x + y + z)5
x2 y 7 + 2z 4 x5 ,
$$x_1\frac{\partial f}{\partial x_1} + \dots + x_n\frac{\partial f}{\partial x_n} = mf.$$
f (x, y) dy,
F (x) =
f
(x, y) dy.
x
f
(x, y) dy
x
[Hint: Let
(x) =
then
R x is continuous by part (i). Use the Fubini Theorem to show that
= F (x) F (a). Hence, F is an antiderivative of so F exists and
a
equals .]
(iii) Suppose (x) and (x) have continuous derivatives on [a, b], and f and
f
x are continuous on I. Show
d
dx
(x)
(x)
(x)
(x)
f
(x, t) dt.
x
dx
,
= 2
a + b cos(x)
(a b2 )1/2
a > 0,
|b| < a,
dx
a
= 2
2
(a + b cos(x))
(a b2 )3/2
b
cos(x) dx
= 2
.
2
(a + b cos(x))
(a b2 )3/2
dx
,
0 x2 +a2
(v) Evaluate
Z
[Hint:
4.3
R
0
105
dx
1
= 3 arctan
2
2
2
(x + a )
2a
f = limT
dx
= 3,
(x2 + a2 )2
4a
RT
0
b
b
+ 2 2
,
a
2a (b + a2 )
a > 0.
a > 0.
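$$\frac{\partial^2 f}{\partial x_j\,\partial x_i} = \frac{\partial}{\partial x_j}\left(\frac{\partial f}{\partial x_i}\right)$$
and, in particular,
$$\frac{\partial^2 f}{\partial x_i^2} = \frac{\partial}{\partial x_i}\left(\frac{\partial f}{\partial x_i}\right).$$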
These are the partial derivatives of second order. Partial derivatives of third and
higher order are similarly defined.
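Example 4.31 Let $f(x, y) = x^3 + 3y^2 + 2xy$. Then
$$\frac{\partial f}{\partial x} = 3x^2 + 2y,\qquad \frac{\partial f}{\partial y} = 6y + 2x,$$
$$\frac{\partial^2 f}{\partial x^2} = 6x,\qquad \frac{\partial^2 f}{\partial x\,\partial y} = 2 = \frac{\partial^2 f}{\partial y\,\partial x},\qquad \frac{\partial^2 f}{\partial y^2} = 6.$$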
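The second-order partials in Example 4.31 can be spot-checked with nested central differences. This is only a numerical sketch: the helper `second_partial` and its step size are choices made here, and the estimate differentiates first in one named variable and then in the other.

```python
def f(x, y):
    return x**3 + 3 * y**2 + 2 * x * y

def second_partial(func, x, y, first, second, h=1e-3):
    """Central-difference estimate: differentiate in `first`, then in `second`."""
    def shift(px, py, var, step):
        return (px + step, py) if var == "x" else (px, py + step)

    total = 0.0
    for s1 in (1, -1):
        for s2 in (1, -1):
            px, py = shift(x, y, first, s1 * h)
            px, py = shift(px, py, second, s2 * h)
            total += s1 * s2 * func(px, py)
    return total / (4 * h * h)

x0, y0 = 1.5, -0.5
f_xx = second_partial(f, x0, y0, "x", "x")
f_xy = second_partial(f, x0, y0, "x", "y")
f_yx = second_partial(f, x0, y0, "y", "x")
f_yy = second_partial(f, x0, y0, "y", "y")
print(f_xx, f_xy, f_yx, f_yy)
```

At $(1.5, -0.5)$ the values should be close to $6x = 9$, $2$, $2$ and $6$, in agreement with the example.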
It is usually the case (but not always; see Exercise 13, page 349 in Buck)
that the successive partial derivatives may be taken in any order we please, e.g., in
the above example we have seen
$$\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x}.$$
The following two theorems give sufficient conditions for this. There is no loss
of generality in the fact that these theorems are proved in R2 only; we are only
concerned with the behaviour of f with respect to two of the variables in any case.
2f
yx
exist and are continuous on an open set U R2 , then they are equal at each point
of U .
Proof.
J1 J2 =
xy yx
I
by Theorem 3.15(a); a contradiction. Hence, we must have equality throughout U .
The next Theorem is more general than the one just proved, but is also more
difficult to prove.
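Theorem 4.33. If $f : \mathbb{R}^2 \to \mathbb{R}$ satisfies
(i) $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$, $\frac{\partial^2 f}{\partial x\,\partial y}$ exist in a neighborhood of $(x_0, y_0)$, and
(ii) $\frac{\partial^2 f}{\partial x\,\partial y}$ is continuous at $(x_0, y_0)$,
then $\frac{\partial^2 f}{\partial y\,\partial x}(x_0, y_0)$ exists and equals $\frac{\partial^2 f}{\partial x\,\partial y}(x_0, y_0)$.
Proof.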
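We must show that
$$\frac{\partial^2 f}{\partial y\,\partial x}(x_0, y_0) = \lim_{k\to 0}\frac{\frac{\partial f}{\partial x}(x_0, y_0 + k) - \frac{\partial f}{\partial x}(x_0, y_0)}{k} = \frac{\partial^2 f}{\partial x\,\partial y}(x_0, y_0), \tag{4.5}$$
or equivalently,
$$\lim_{k\to 0}\left(\frac{\frac{\partial f}{\partial x}(x_0, y_0 + k) - \frac{\partial f}{\partial x}(x_0, y_0)}{k} - \frac{\partial^2 f}{\partial x\,\partial y}(x_0, y_0)\right) = 0.$$
To do this, we seek to write
$$\frac{\frac{\partial f}{\partial x}(x_0, y_0 + k) - \frac{\partial f}{\partial x}(x_0, y_0)}{k} = \frac{\partial^2 f}{\partial x\,\partial y}(x^*(k), y^*(k)) + \text{terms going to zero with } k \tag{4.6}$$
and with $(x^*(k), y^*(k)) \to (x_0, y_0)$ as $k \to 0$, so that the continuity of $\frac{\partial^2 f}{\partial x\,\partial y}$ at
$(x_0, y_0)$ may be used. To that end, given $\varepsilon > 0$, choose $\delta = \delta_\varepsilon > 0$ so that
$$\max\{|x_0 - x^*|, |y_0 - y^*|\} < \delta \implies \left|\frac{\partial^2 f}{\partial x\,\partial y}(x^*, y^*) - \frac{\partial^2 f}{\partial x\,\partial y}(x_0, y_0)\right| < \varepsilon/2. \tag{4.7}$$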
Fix $k$ with $0 < |k| < \delta$. We rewrite the numerator of the quotient in the limit
of (4.5) using
(a) the definition of the partial derivative,
(b) the fact that the Mean Value Theorem for one variable can be applied to
$g(y) = f(x_0 + h, y) - f(x_0, y)$ for any fixed $h$ (why?), and
(c) that the Mean Value Theorem for one variable can be applied to $h(x) = \frac{\partial f}{\partial y}(x, y_1)$ for any fixed $y_1$ (why?),
as follows
f
f
(x0 , y0 + k)
(x0 , y0 ) =
[f (x, y0 + k) f (x, y0 )]x=x0
x
x
x
=
dg
k
g(y0 + k) g(y0 )
+ (h) =
(y0 + h,k k) + (h)
h
dy
h
h
i
f
f
y (x0 + h, y0 + h,k k) y (x0 , y0 + h,k k)
=
k + (h), where |h,k | < 1 by (b)
h
=
=k
2f
(x0 + h,k h, y0 + h,k k) + (h),
xy
If h = h(k) is chosen to be small enough that |h| < |k| and |(h)| < |k|2 , then, after
dividing by k, we obtain (4.6) with x = x (k) = x0 +h,k h, y = y (k) = y0 +h,k k,
and (h)/k as the term that goes to zero with k. Since
max{|x (k) x0 |, |y (k) x0 |} = max{|x0 + h,k h x0 |, |y0 + h,k k y0 |} |k|,
all the requirements are met to achieve
f (x , y + k) f (x , y )
2f
2f
2f
0 0
x 0 0
x
(x0 , y0 ) =
(x , y )
(x0 , y0 ) < ,
k
xy
xy
xy
using (4.7) provided |k| < /2, for then |(h)/k| < |k| < /2.
Notation: Let $D \subset \mathbb{R}^n$. By $C(D)$ we will denote the set of all continuous functions
on $D$. By $C^k(D)$, we mean all functions defined on $D$ having all $k$-th order partial
derivatives continuous on $D$. The range of the functions will be clear from the
context in which the notation is used.
We first give Taylor's Theorem in an expanded form emphasizing two variables.
The Theorem can be stated cleanly in a form similar to the univariate form even in
higher dimensions, which we do in a second pass.
Theorem 4.34 (Taylor's Theorem). Suppose $f : U \subset \mathbb{R}^n \to \mathbb{R}$, $f \in C^k(U)$, where $U$
is a convex open subset of $\mathbb{R}^n$, and $a, b \in U$.
(i) For $n = 2$, with $b = (b_1, b_2)$, $a = (a_1, a_2)$, we can write $f(b)$ as
$$f(b) = \sum_{j=0}^{k-1}\frac{1}{j!}\left[(b_1 - a_1)\frac{\partial}{\partial x_1} + (b_2 - a_2)\frac{\partial}{\partial x_2}\right]^j f(a) + R_k,$$
where for some point $c$ on the line segment joining $b$ and $a$,
$$R_k = \frac{1}{k!}\left[(b_1 - a_1)\frac{\partial}{\partial x_1} + (b_2 - a_2)\frac{\partial}{\partial x_2}\right]^k f(c).$$
(ii) For an arbitrary $n$, with $b = (b_1, b_2, \dots, b_n)$, $a = (a_1, a_2, \dots, a_n)$, we have
$$f(b) = \sum_{j=0}^{k-1}\frac{1}{j!}\left[(b_1 - a_1)\frac{\partial}{\partial x_1} + \dots + (b_n - a_n)\frac{\partial}{\partial x_n}\right]^j f(a) + R_k,$$
where for some point $c$ on the line segment joining $b$ and $a$,
$$R_k = \frac{1}{k!}\left[(b_1 - a_1)\frac{\partial}{\partial x_1} + \dots + (b_n - a_n)\frac{\partial}{\partial x_n}\right]^k f(c).$$
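With $\varphi(t) := a + t(b - a)$ and $F(t) := f(\varphi(t))$, $0 \le t \le 1$, the Chain Rule gives
$$F'(t) = \begin{bmatrix} \frac{\partial f}{\partial x_1}(\varphi(t)) & \frac{\partial f}{\partial x_2}(\varphi(t)) \end{bmatrix}\begin{bmatrix} b_1 - a_1\\ b_2 - a_2 \end{bmatrix}
= (b_1 - a_1)\frac{\partial f}{\partial x_1}(\varphi(t)) + (b_2 - a_2)\frac{\partial f}{\partial x_2}(\varphi(t))$$
$$= \left[(b_1 - a_1)\frac{\partial}{\partial x_1} + (b_2 - a_2)\frac{\partial}{\partial x_2}\right] f(\varphi(t)).$$
The last expression is again a differentiable function of $t$ (if $k \ge 2$), so we can apply
the same argument to it as we did for $f$ to get inductively
$$F''(t) = \left[(b_1 - a_1)\frac{\partial}{\partial x_1} + (b_2 - a_2)\frac{\partial}{\partial x_2}\right]\left[(b_1 - a_1)\frac{\partial}{\partial x_1} + (b_2 - a_2)\frac{\partial}{\partial x_2}\right] f(\varphi(t)),$$
$$F^{(j)}(t) = \left[(b_1 - a_1)\frac{\partial}{\partial x_1} + (b_2 - a_2)\frac{\partial}{\partial x_2}\right]^j f(\varphi(t)).$$
Applying Taylor's Theorem in one variable on $0 \le t \le 1$ to obtain $F(1)$ in
terms of the derivatives of $F$ at zero, we find
$$F(1) = \sum_{j=0}^{k-1}\frac{1}{j!}F^{(j)}(0) + R_k,\qquad R_k = \frac{1}{k!}F^{(k)}(\tau),$$
for some $\tau$ with $0 < \tau < 1$, where
$$F(1) = f(b),\qquad F^{(j)}(0) = \left[(b_1 - a_1)\frac{\partial}{\partial x_1} + (b_2 - a_2)\frac{\partial}{\partial x_2}\right]^j f(a),\qquad c = \varphi(\tau).$$
Part (ii) is exactly the same, except more terms are used in the sum of partials
because there are more variables.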
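To view Taylor's Theorem somewhat differently, we define the gradient (see
Exercise 23, which you may have done already), or the operator $\nabla$, defined on
real-valued differentiable functions on $\mathbb{R}^n$ with values in the real-valued functions
on $\mathbb{R}^n$:
$$\nabla := \left(\frac{\partial}{\partial x_1}, \dots, \frac{\partial}{\partial x_n}\right),\qquad \nabla : f \mapsto \left(\frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n}\right). \tag{4.8}$$
The gradient vector has many important applications (see Exercise 23). One is that
the directional derivative in the direction $u \in \mathbb{R}^n$ at a point $c$ can be given by the
inner product with the gradient vector at the point:
$$f_u(c) = Df(c)(u) = \sum_{j=1}^{n} u_j\frac{\partial f}{\partial x_j} = \nabla f(c) \cdot u = u \cdot \nabla f(c) =: \langle \nabla f(c), u \rangle. \tag{4.9}$$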
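Formula (4.9) can be illustrated numerically. In the sketch below the function `f`, the point `c` and the direction `u` are hypothetical sample choices; the inner product with the gradient is compared with a symmetric difference quotient along $u$.

```python
import math

def f(p):
    x, y, z = p
    return x * x * y + math.sin(z)

def grad_f(p):
    x, y, z = p
    return (2 * x * y, x * x, math.cos(z))

def directional_derivative(p, u):
    """Inner product of grad f(p) with u, as in (4.9)."""
    return sum(g * uj for g, uj in zip(grad_f(p), u))

def difference_quotient(p, u, t=1e-6):
    """Symmetric quotient (f(p + t u) - f(p - t u)) / (2 t)."""
    forward = [pj + t * uj for pj, uj in zip(p, u)]
    backward = [pj - t * uj for pj, uj in zip(p, u)]
    return (f(forward) - f(backward)) / (2 * t)

c = (1.0, 2.0, 0.5)
u = (0.3, -0.4, 1.2)
exact = directional_derivative(c, u)
approx = difference_quotient(c, u)
print(exact, approx)
```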
Below we phrase Taylor's Theorem using the gradient notation, which is simpler in form but richer in meaning. When properly viewed it allows the concept to
expand to several variables without unnecessary clutter. (Unfortunately, our minds
sometimes demand a struggle with the clutter before we can fully conceptualize.
Don't be afraid to get dirty with this.) Before stating Taylor's Theorem, we do
some housekeeping on notation and look at iterations of the particular operator
$u \cdot \nabla$. Look at how the gradient operator is used to define, for a fixed direction $u$,
the directional derivative as an operator on differentiable functions:
Given $u$ in $\mathbb{R}^n$:
$$\nabla_u = u \cdot \nabla : f \mapsto \sum_{j=1}^{n} u_j\frac{\partial f}{\partial x_j}.$$
If the resulting function $\sum_{j=1}^{n} u_j\frac{\partial f}{\partial x_j} : \mathbb{R}^n \to \mathbb{R}$ is differentiable, the operator may
be applied once again. Quite literally, this gives
$$(u \cdot \nabla)^2 f := (u \cdot \nabla)\big((u \cdot \nabla) f\big) = \sum_{\ell=1}^{n} u_\ell\frac{\partial}{\partial x_\ell}\left(\sum_{j=1}^{n} u_j\frac{\partial f}{\partial x_j}\right) = \sum_{\ell=1}^{n}\sum_{j=1}^{n} u_\ell u_j\frac{\partial^2 f}{\partial x_\ell\,\partial x_j} = \sum_{\ell, j=1}^{n} u_\ell u_j\frac{\partial^2 f}{\partial x_\ell\,\partial x_j}. \tag{4.10}$$
As long as the resulting function (the function defined as the last sum in (4.10)) is
differentiable, we can apply the operator again. In this way, we define inductively
the operator
$$(u \cdot \nabla)^k f := (u \cdot \nabla)\big((u \cdot \nabla)^{k-1} f\big).$$
This can be written out in long form (in the same manner as the binomial formula)
as a $k$-fold sum
$$(u \cdot \nabla)^k f = \sum_{1 \le j_1, \dots, j_k \le n} u_{j_1} \cdots u_{j_k}\frac{\partial^k f}{\partial x_{j_1} \cdots \partial x_{j_k}}.$$
Since we can only apply this operator again if all the partial derivatives of each
kf
xj1 xjk
(4.11)
$$f(b) = f(a) + \sum_{k=1}^{N-1}\frac{1}{k!}\big((b - a) \cdot \nabla\big)^k f(a) + R_N,$$
where
$$R_N = \frac{1}{N!}\big((b - a) \cdot \nabla\big)^N f(c)$$
Proof. We will reduce the theorem to the univariate Taylor theorem. In fact, one
can see directly the analogy when one views $\big((b - a) \cdot \nabla\big)^k f$ as the $k$-th directional
derivative in the direction $b - a$.
Let $\varphi(t) := a + t(b - a)$, and define
$$F(t) := f(\varphi(t)),\quad 0 \le t \le 1.$$
Then
$$F'(t) = Df(\varphi(t))(b - a) = \begin{bmatrix} \frac{\partial f}{\partial x_1}(\varphi(t)), \dots, \frac{\partial f}{\partial x_n}(\varphi(t)) \end{bmatrix}\begin{bmatrix} b_1 - a_1\\ \vdots\\ b_n - a_n \end{bmatrix}$$
$$= (b_1 - a_1)\frac{\partial f}{\partial x_1}(\varphi(t)) + \dots + (b_n - a_n)\frac{\partial f}{\partial x_n}(\varphi(t)) = \big((b - a) \cdot \nabla\big) f(\varphi(t)).$$
By the univariate Taylor theorem,
$$F(1) = \sum_{k=0}^{N-1}\frac{F^{(k)}(0)}{k!} + R_N,\quad\text{where}\quad R_N = \frac{F^{(N)}(t_0)}{N!}\ \text{ for some } t_0,\ 0 < t_0 < 1.$$
Substituting (4.12) into this formula and noting $\varphi(0) = a$ and $\varphi(1) = b$, we obtain
the Theorem with $c = a + t_0(b - a)$.
4.3.1 Min-Max Theory
We use Taylor's Theorem to look for tests for relative maxima and minima, as
was done with the second derivative tests for functions of one variable. Suppose $f$
has continuous second order partial derivatives. Then
$$f(b) = f(a) + (b - a) \cdot \nabla f(a) + \big((b - a) \cdot \nabla\big)^2 f(c)/2!$$
for some $c$ on the line segment from $a$ to $b$. If $f$ has a relative max or min at
$x = a$, then $f$ restricted to any coordinate direction is a real-valued function of one
variable with a relative max or relative min at $a$. Therefore (see Exercise 18),
$$\frac{\partial f}{\partial x_j}(a) = 0 \quad\text{for all } j = 1, \dots, n. \tag{4.13}$$
Points that satisfy (4.13) are called stationary or critical points. These are
the potential points for relative maxima and minima. At a stationary point $a$, Taylor's
theorem of order 2 reads
$$f(b) = f(a) + \big((b - a) \cdot \nabla\big)^2 f(c)/2.$$
Thus, we find
$$\big((b - a) \cdot \nabla\big)^2 f(c) > 0 \implies f(b) > f(a),$$
and
$$\big((b - a) \cdot \nabla\big)^2 f(c) < 0 \implies f(b) < f(a).$$
Hence the behavior of the operator $\big((b - a) \cdot \nabla\big)^2 f$ is important for determining
whether the critical point is a max or min.
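We will view this a little differently, using matrix notation. Let $u \in \mathbb{R}^n$, and
consider $(u \cdot \nabla)^2 f$ written in some imaginative ways:
$$(u \cdot \nabla)^2 f = (u \cdot \nabla)\big((u \cdot \nabla) f\big) = u^T \nabla^T \nabla f\, u
= [u_1, \dots, u_n]\begin{bmatrix} \frac{\partial}{\partial x_1}\\ \vdots\\ \frac{\partial}{\partial x_n} \end{bmatrix}\left[\frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n}\right]\begin{bmatrix} u_1\\ \vdots\\ u_n \end{bmatrix}$$
$$= [u_1, \dots, u_n]\begin{bmatrix}
\frac{\partial^2 f}{\partial x_1 \partial x_1} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n}\\
\frac{\partial^2 f}{\partial x_2 \partial x_1} & \cdots & \frac{\partial^2 f}{\partial x_2 \partial x_n}\\
\vdots & \ddots & \vdots\\
\frac{\partial^2 f}{\partial x_n \partial x_1} & \cdots & \frac{\partial^2 f}{\partial x_n \partial x_n}
\end{bmatrix}\begin{bmatrix} u_1\\ \vdots\\ u_n \end{bmatrix}
= [u_1, \dots, u_n]\begin{bmatrix} \sum_{\ell=1}^{n}\frac{\partial^2 f}{\partial x_1 \partial x_\ell}u_\ell\\ \vdots\\ \sum_{\ell=1}^{n}\frac{\partial^2 f}{\partial x_n \partial x_\ell}u_\ell \end{bmatrix}$$
$$= \sum_{j=1}^{n}\sum_{\ell=1}^{n} u_j u_\ell\frac{\partial^2 f}{\partial x_j\,\partial x_\ell}. \tag{4.14}$$
With
$$A_c := \begin{bmatrix}
\frac{\partial^2 f}{\partial x_1 \partial x_1}(c) & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n}(c)\\
\frac{\partial^2 f}{\partial x_2 \partial x_1}(c) & \cdots & \frac{\partial^2 f}{\partial x_2 \partial x_n}(c)\\
\vdots & \ddots & \vdots\\
\frac{\partial^2 f}{\partial x_n \partial x_1}(c) & \cdots & \frac{\partial^2 f}{\partial x_n \partial x_n}(c)
\end{bmatrix} \tag{4.15}$$
we have
$$(u \cdot \nabla)^2 f(c) = u^T A_c u.$$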
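The identity $(u \cdot \nabla)^2 f(c) = u^T A_c u$ can be illustrated with the function of Example 4.31, whose second partials are known in closed form. The sketch below compares the quadratic form with a second difference of $t \mapsto f(c + tu)$; the point, direction and step size are sample choices made here.

```python
def f(x, y):
    return x**3 + 3 * y**2 + 2 * x * y

def hessian(x, y):
    """Matrix of second partials of f (see Example 4.31)."""
    return [[6 * x, 2.0], [2.0, 6.0]]

def quadratic_form(a, u):
    """Compute u^T A u for a square matrix given as a list of rows."""
    n = len(u)
    return sum(u[i] * a[i][j] * u[j] for i in range(n) for j in range(n))

def second_directional_derivative(x, y, u, h=1e-3):
    """Second difference of t -> f(c + t u) at t = 0."""
    return (f(x + h * u[0], y + h * u[1]) - 2 * f(x, y)
            + f(x - h * u[0], y - h * u[1])) / (h * h)

c = (1.5, -0.5)
u = (0.6, -0.8)
exact = quadratic_form(hessian(*c), u)
approx = second_directional_derivative(c[0], c[1], u)
print(exact, approx)
```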
We make the following observations: If $f$ has continuous partial derivatives of
second order, then
1. $A_c$ is symmetric (Theorem 4.32),
2. if $(b - a)^T A_c (b - a) = \big((b - a) \cdot \nabla\big)^2 f(c) > 0$ for all $b$ and $c$ close to $a$, then
$f(a)$ is a relative minimum,
3. if $(b - a)^T A_c (b - a) = \big((b - a) \cdot \nabla\big)^2 f(c) < 0$ for all $b$ and $c$ close to $a$, then
$f(a)$ is a relative maximum,
4. if in any small neighborhood of $a$ there are $b$ and $b'$ with
$$(b - a)^T A_c (b - a) = \big((b - a) \cdot \nabla\big)^2 f(c) > 0 \quad\text{and}\quad (b' - a)^T A_{c'} (b' - a) = \big((b' - a) \cdot \nabla\big)^2 f(c') < 0,$$
then $f(a)$ is neither a max nor a min, and is called a saddle point because
$f(b) > f(a)$ while $f(b') < f(a)$.
5. Since the second order derivatives are continuous, the function in (4.14) is
continuous in $u = b - c$ and therefore, (1)-(4) will be true for $A_c$ replaced by
$A_a$ if $b$ is sufficiently close to $a$.
Thus, we need to consider the properties of the mapping $u \mapsto u^T A u$ on $\mathbb{R}^n$
for a given $n \times n$ matrix $A$. These mappings are called quadratic forms, since the
values $u^T A u$ are homogeneous multivariate polynomials of degree 2 in the variables
$u_1, \dots, u_n$, that is, they have the form
$$\sum_{1 \le \ell, j \le n} a_{\ell,j}\, u_\ell u_j.$$
j = 1, . . . , n,
and
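(ii) the symmetric matrix $A(a)$ is given by
$$A(a) = \nabla^T \nabla f(a) = \begin{bmatrix}
\frac{\partial^2 f}{\partial x_1 \partial x_1}(a) & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n}(a)\\
\frac{\partial^2 f}{\partial x_2 \partial x_1}(a) & \cdots & \frac{\partial^2 f}{\partial x_2 \partial x_n}(a)\\
\vdots & \ddots & \vdots\\
\frac{\partial^2 f}{\partial x_n \partial x_1}(a) & \cdots & \frac{\partial^2 f}{\partial x_n \partial x_n}(a)
\end{bmatrix}.$$
Then $f$ has
(a) a relative maximum at $a$ if $A$ is negative definite;
(b) a relative minimum at $a$ if $A$ is positive definite;
(c) a saddle point at $a$ if $A$ is indefinite; and
(d) unknown properties (the test doesn't work) at $a$ if $A$ is semi-definite.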
Proof.
Before going further, we return to $\mathbb{R}^2$ and look at these ideas more concretely.
A $2 \times 2$ symmetric matrix being positive (negative) semidefinite means
$$Q(x, y) = \begin{bmatrix} x & y \end{bmatrix}\begin{bmatrix} a & b\\ b & c \end{bmatrix}\begin{bmatrix} x\\ y \end{bmatrix} = ax^2 + 2bxy + cy^2 \ge 0\ (\le 0),\quad (x, y) \in \mathbb{R}^2.$$
It is said to be positive (negative) definite if $Q(x, y) > 0$ ($< 0$) for all $(x, y) \ne O$.
For two by two matrices this expression reminds one of the quadratic formula, and
we have the following easily checked criterion.
Lemma 4.39. Let the $2 \times 2$ symmetric matrix be as in the preceding paragraph.
(i) The matrix is positive (negative) semidefinite if and only if
$$ac - b^2 \ge 0 \quad\text{and}\quad a \ge 0\ (\le 0),\quad c \ge 0\ (\le 0).$$
and
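$$f_x := \frac{\partial f}{\partial x},\qquad f_y := \frac{\partial f}{\partial y},\qquad
f_{xy} := \frac{\partial^2 f}{\partial y\,\partial x},\qquad f_{yx} := \frac{\partial^2 f}{\partial x\,\partial y},\qquad
f_{xx} := \frac{\partial^2 f}{\partial x\,\partial x},\qquad f_{yy} := \frac{\partial^2 f}{\partial y\,\partial y},$$
$$A(x, y) = \begin{bmatrix} f_{xx}(x, y) & f_{xy}(x, y)\\ f_{yx}(x, y) & f_{yy}(x, y) \end{bmatrix}.$$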
fxx < 0
at (x0 , y0 );
fxx > 0
at (x0 , y0 );
at (x0 , y0 );
at (x0 , y0 ).
Figure 4.12. Graphical illustrations of a relative maximum (top), a relative minimum (middle), and a saddle point (bottom).
satisfies $f_{xx}(0, 0) = 2 > 0$ and $2 \cdot 2 - 1^2 = 3 > 0$, so that the matrix is positive
definite and $f$ has a relative minimum at $(0, 0)$. In fact, it is a global minimum
since the matrix above is positive definite for all $(x, y)$, so that the remainder in
Taylor's Theorem satisfies $R_2(x, y) > 0$ for any $(x, y) \ne (0, 0)$.
Example 4.42 For the function $f(x, y) = x^2 + 4xy + y^2$, from the equations $f_x = 2x + 4y = 0$ and $f_y = 4x + 2y = 0$, we find that $(x, y) = (0, 0)$ is the only stationary
point, but in this case
$$\begin{bmatrix} f_{xx}(x, y) & f_{xy}(x, y)\\ f_{yx}(x, y) & f_{yy}(x, y) \end{bmatrix} = \begin{bmatrix} 2 & 4\\ 4 & 2 \end{bmatrix}\quad\text{satisfies}\quad 2 \cdot 2 - 4^2 = -12 < 0,$$
so the matrix is indefinite. Hence, $f$ has a saddle point at $(0, 0)$. Alternatively, one
can see this from the observation that $f(x, y) > 0$ on the coordinate axes except at
the origin, but is negative on the line $y = -x$ when $x > 0$.
Example 4.43 For the function $f(x, y) = 3x^2 - y^2 + x^3$, from the equations
$$f_x = 6x + 3x^2 = 3x(2 + x) = 0 \quad\text{and}\quad f_y = -2y = 0,$$
we see that the stationary points are $(0, 0)$ and $(-2, 0)$. We observe
$$\text{for } (0, 0):\ \begin{bmatrix} 6 & 0\\ 0 & -2 \end{bmatrix}\ \text{is indefinite} \implies \text{saddle point},$$
$$\text{for } (-2, 0):\ \begin{bmatrix} -6 & 0\\ 0 & -2 \end{bmatrix}\ \text{is negative definite} \implies \text{relative maximum}.$$
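The classifications in Examples 4.42 and 4.43 can be reproduced with the usual two-variable second-derivative test, based on the sign of $f_{xx}f_{yy} - (f_{xy})^2$ and of $f_{xx}$. The sketch below is a minimal illustration; the function name `classify` and the returned labels are choices made here, and the second partials are entered by hand.

```python
def classify(fxx, fxy, fyy):
    """Classify a stationary point from its second partials (2 x 2 case)."""
    det = fxx * fyy - fxy * fxy
    if det < 0:
        return "saddle point"          # matrix is indefinite
    if det > 0:
        # Definite matrix: the sign of fxx decides which kind.
        return "relative minimum" if fxx > 0 else "relative maximum"
    return "inconclusive"              # semidefinite case: the test gives no answer

# Example 4.42: f = x^2 + 4xy + y^2 at (0, 0).
print(classify(2, 4, 2))
# Example 4.43: f = 3x^2 - y^2 + x^3 at (0, 0) and at (-2, 0).
print(classify(6, 0, -2))
print(classify(-6, 0, -2))
```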
Example 4.44 For the function $f(x, y) = x^4 + y^4$, from the equations
$$f_x = 4x^3 = 0 \quad\text{and}\quad f_y = 4y^3 = 0,$$
and fy = 4y 3 = 0,
The last two examples illustrate the fact that anything can happen in the case
that $f_{xx}f_{yy} - (f_{xy})^2 = 0$ at a stationary point. Our next example looks at a function
on $\mathbb{R}^3$. However, we need some more applicable criteria for the definiteness of a
matrix.
The following criteria may be found, say, in the book by Gantmacher, Matrix
Theory.
Theorem 4.46. Let $A$ be a symmetric $n \times n$ matrix. Then
(a) $A$ is positive semidefinite if and only if the determinants of all the $k \times k$ submatrices of $A$ symmetric about the main diagonal are non-negative ($\ge 0$).
(b) $A$ is negative semidefinite if and only if the determinants of all the $k \times k$ submatrices of $A$ symmetric about the main diagonal have sign $(-1)^k$ or are zero.
(c) $A$ is positive definite if and only if the determinants
$$\det\begin{bmatrix} a_{1,1} & \cdots & a_{1,k}\\ \vdots & \ddots & \vdots\\ a_{k,1} & \cdots & a_{k,k} \end{bmatrix} > 0,\quad k = 1, 2, \dots, n.$$
(d) $A$ is negative definite if and only if the determinants
$$(-1)^k\det\begin{bmatrix} a_{1,1} & \cdots & a_{1,k}\\ \vdots & \ddots & \vdots\\ a_{k,1} & \cdots & a_{k,k} \end{bmatrix} > 0,\quad k = 1, 2, \dots, n.$$
(e) $A$ is indefinite if and only if it satisfies none of (a), (b), (c), or (d).
The determinants in parts (c) and (d) are sometimes called the principal minors.
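Parts (c) and (d) translate directly into a small computation. The sketch below is one possible way to evaluate the determinants of the upper-left $k \times k$ blocks; it uses cofactor expansion, which is adequate only for small matrices, and the helper names are choices made here.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def leading_minors(a):
    """Determinants of the upper-left k x k blocks, k = 1, ..., n."""
    return [det([row[:k] for row in a[:k]]) for k in range(1, len(a) + 1)]

def is_positive_definite(a):
    """Criterion (c): every leading determinant is positive."""
    return all(d > 0 for d in leading_minors(a))

def is_negative_definite(a):
    """Criterion (d): (-1)^k times the k-th leading determinant is positive."""
    return all((-1) ** k * d > 0 for k, d in enumerate(leading_minors(a), start=1))

print(leading_minors([[2, 1], [1, 2]]), is_positive_definite([[2, 1], [1, 2]]))
print(leading_minors([[-6, 0], [0, -2]]), is_negative_definite([[-6, 0], [0, -2]]))
```

The semidefinite criteria (a) and (b) involve all submatrices symmetric about the diagonal, not only the upper-left blocks, so they are not covered by this sketch.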
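Example 4.47 For the function $f(x, y, z) = x^2 + y^2 + z^2 + 2xyz$, solving the equations
$$f_x = 2x + 2yz = 0,\qquad f_y = 2y + 2zx = 0,\qquad f_z = 2z + 2xy = 0$$
gives the stationary points
$$(0, 0, 0),\quad (-1, -1, -1),\quad (-1, 1, 1),\quad (1, -1, 1),\quad (1, 1, -1).$$
Here
$$A(x, y, z) = \begin{bmatrix} f_{xx} & f_{xy} & f_{xz}\\ f_{yx} & f_{yy} & f_{yz}\\ f_{zx} & f_{zy} & f_{zz} \end{bmatrix} = \begin{bmatrix} 2 & 2z & 2y\\ 2z & 2 & 2x\\ 2y & 2x & 2 \end{bmatrix}.$$
At $(0, 0, 0)$, $A(0, 0, 0) = 2I_{3 \times 3}$, which is positive definite; hence $f$ has a relative
minimum at $(0, 0, 0)$. At the other stationary points, $|x| = |y| = |z| = 1$ and
$xyz = -1$, so the principal minors of $A$ satisfy
$$\det[2] > 0,\qquad \det\begin{bmatrix} 2 & 2z\\ 2z & 2 \end{bmatrix} = 4 - 4z^2 = 0,$$
$$\det\begin{bmatrix} 2 & 2z & 2y\\ 2z & 2 & 2x\\ 2y & 2x & 2 \end{bmatrix} = 8(1 - x^2 - y^2 - z^2 + 2xyz) < 0.$$
Thus, the matrix is indefinite at all the other critical points so that these are all
saddle points.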
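The arithmetic in Example 4.47 is easy to confirm by direct evaluation. The sketch below checks that the gradient vanishes at the five stationary points and evaluates the three determinants using the closed forms given in the example; the helper names are illustrative.

```python
def grad(x, y, z):
    """Gradient of f(x, y, z) = x^2 + y^2 + z^2 + 2xyz."""
    return (2 * x + 2 * y * z, 2 * y + 2 * z * x, 2 * z + 2 * x * y)

def principal_minors(x, y, z):
    """The 1x1, 2x2 and 3x3 upper-left determinants of A(x, y, z)."""
    m1 = 2
    m2 = 4 - 4 * z * z
    m3 = 8 * (1 - x * x - y * y - z * z + 2 * x * y * z)
    return m1, m2, m3

points = [(0, 0, 0), (-1, -1, -1), (-1, 1, 1), (1, -1, 1), (1, 1, -1)]
for p in points:
    print(p, grad(*p), principal_minors(*p))
```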
In the following exercises, assume all the differentiability you need unless otherwise specified.
Exercises
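4.32. Find $V_x$, $V_y$, $V_z$, $V_u$, $V_{xy}$, $V_{xyz}$, $V_{xyzu}$ for the following functions $V$:
(i) $\dfrac{x^2y^2u}{a^2 - z^2}$, (ii) $\dfrac{x}{y} + \dfrac{y}{z} + \dfrac{z}{u} + \dfrac{u}{x} + \dfrac{xy}{zu}$.
For (i):
$$V_x = \frac{2xy^2u}{a^2 - z^2},\quad V_y = \frac{2x^2yu}{a^2 - z^2},\quad V_z = \frac{2x^2y^2uz}{(a^2 - z^2)^2},\quad V_u = \frac{x^2y^2}{a^2 - z^2},$$
$$V_{xy} = \frac{4xyu}{a^2 - z^2},\quad V_{xyz} = \frac{8xyzu}{(a^2 - z^2)^2},\quad V_{xyzu} = \frac{8xyz}{(a^2 - z^2)^2}.$$
For (ii):
$$V_x = \frac{1}{y} - \frac{u}{x^2} + \frac{y}{zu},\quad V_y = -\frac{x}{y^2} + \frac{1}{z} + \frac{x}{zu},\quad V_z = -\frac{y}{z^2} + \frac{1}{u} - \frac{xy}{z^2u},\quad V_u = -\frac{z}{u^2} + \frac{1}{x} - \frac{xy}{u^2z},$$
$$V_{xy} = -\frac{1}{y^2} + \frac{1}{zu},\quad V_{xyz} = -\frac{1}{uz^2},\quad V_{xyzu} = \frac{1}{u^2z^2}.$$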
Vy = Q,
for some 6= 0,
, Vz = R,
show that
$$P(Q_z - R_y) + Q(R_x - P_z) + R(P_y - Q_x) = 0.$$
4.38. Let $P, Q : \mathbb{R}^2 \to \mathbb{R}$, with $P$, $Q$, $\frac{\partial P}{\partial y}$, $\frac{\partial Q}{\partial x}$ continuous.
P
y
Q
x .
f (x, y) =
P (t, y0 ) dt +
Q(x, t) dt.
y0
x0
[ 2xy
x2 + 3y 2 ] ,
[ 2ye2x + 2x cos(y)
(b)
e2x x2 sin(y) ] .
2
x2
2
y 2 .
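(a) $$\begin{bmatrix} V_u & V_v \end{bmatrix} = \begin{bmatrix} V_x & V_y \end{bmatrix}\begin{bmatrix} x_u & x_v\\ y_u & y_v \end{bmatrix}$$
(b) $$\begin{bmatrix} V_{uu} & V_{uv}\\ V_{vu} & V_{vv} \end{bmatrix} = \begin{bmatrix} x_u & y_u\\ x_v & y_v \end{bmatrix}\begin{bmatrix} V_{xx} & V_{xy}\\ V_{yx} & V_{yy} \end{bmatrix}\begin{bmatrix} x_u & x_v\\ y_u & y_v \end{bmatrix} + V_x\begin{bmatrix} x_{uu} & x_{uv}\\ x_{vu} & x_{vv} \end{bmatrix} + V_y\begin{bmatrix} y_{uu} & y_{uv}\\ y_{vu} & y_{vv} \end{bmatrix}.$$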
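(ii) If $U = U(x, y)$, $V = V(x, y)$, $x = x(u, v)$, $y = y(u, v)$, show that
$$\frac{\partial(U, V)}{\partial(u, v)} = \frac{\partial(U, V)}{\partial(x, y)}\,\frac{\partial(x, y)}{\partial(u, v)}.$$
[Hint: Use (i)(a)]
(iii) If $x = r\cos(\theta)$, $y = r\sin(\theta)$, show
$$\frac{\partial(U, V)}{\partial(x, y)} = \frac{1}{r}\,\frac{\partial(U, V)}{\partial(r, \theta)}.$$
4.41. If $V = 3x^2 + 2y^2 + \varphi(x^2 - y^2)$, prove that
$$yV_x + xV_y = 10xy.$$
4.42. If $V = x^2 + y^2 + \varphi(xy) + \psi(y/x)$, prove that
$$x^2V_{xx} - y^2V_{yy} + xV_x - yV_y = 4(x^2 - y^2).$$
4.43. If $V = V(x, y)$, $x = \rho\cos(\theta)$, $y = \rho\sin(\theta)$, show that
$$\nabla^2 V := V_{xx} + V_{yy} = V_{\rho\rho} + \frac{1}{\rho}V_\rho + \frac{1}{\rho^2}V_{\theta\theta}.$$
[Hint: $\rho = \sqrt{x^2 + y^2}$, $\theta = \arctan(y/x)$.]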
4.44. If V = V (r, ), = c2 /r, = 2 , show that
2 V + V + V = r2 Vrr + rVr + V .
4.45. If $V = V(x, y)$, $x = x(u, v)$, $y = y(u, v)$, and $x_u = y_v$, $x_v = -y_u$, show
$$\frac{V_{uu} + V_{vv}}{V_{xx} + V_{yy}} = x_u^2 + x_v^2 = y_u^2 + y_v^2.$$
4.46. (Euler's Theorem continued, see Exercise 31) If $V$ is a homogeneous function
of $(x, y, z)$ of the $m$th degree, show that
$$x^2V_{xx} + y^2V_{yy} + z^2V_{zz} + 2xyV_{xy} + 2yzV_{yz} + 2zxV_{zx} = m(m - 1)V.$$
4.47. If $z_y = F(z_x)$, $F' \ne 0$, then $z_{xx}z_{yy} = z_{xy}^2$.
4.48. If z = xF (x + y) + G(x + y), show that
(c) A vibrating string for which initial displacement and velocity are specified
is governed by relations of the form
$$z_{tt} - c^2z_{xx} = 0,\qquad z(x, 0) = f(x),\qquad z_t(x, 0) = g(x).$$
Show that
$$z(x, t) = \frac{1}{2}\{f(x + ct) + f(x - ct)\} + \frac{1}{2c}\int_{x - ct}^{x + ct} g(s)\,ds.$$
F (xj ) yj
2
which are easily solved for A and B. The line y = Ax + B is the line which
best fits the given set of points in the sense of least squares.
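For a concrete feel for this exercise, here is a minimal sketch of the least-squares line fit. It assumes the standard normal equations obtained by setting the partial derivatives of $\sum_j (Ax_j + B - y_j)^2$ with respect to $A$ and $B$ equal to zero; the function name `fit_line` and the sample data are illustrative.

```python
def fit_line(points):
    """Return (A, B) minimizing the sum of (A*x + B - y)^2 over the points."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    denom = n * sxx - sx * sx  # zero only if all x values coincide
    a = (n * sxy - sx * sy) / denom
    b = (sy * sxx - sx * sxy) / denom
    return a, b

# Sample data lying exactly on y = 2x + 1.
A, B = fit_line([(0, 1), (1, 3), (2, 5), (3, 7)])
print(A, B)
```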
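4.55. Let $\{\varphi_k\}$ be a sequence of real-valued continuous functions on $[a, b]$ such that
$$\int_a^b \varphi_k\varphi_m = \begin{cases} 1, & \text{if } k = m;\\ 0, & \text{if } k \ne m. \end{cases}$$
Let $f$ be a real-valued continuous function on $[a, b]$. Prove that the choice of
constants $\gamma_1, \dots, \gamma_\ell$ which minimizes the quantity
$$\int_a^b \left(f - \sum_{k=1}^{\ell}\gamma_k\varphi_k\right)^2,$$
for a given $\ell$, is
$$\gamma_k = \int_a^b f\varphi_k,\quad k = 1, \dots, \ell.$$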
ab
.
ba
.
C
a ba
b ba
f (p) + f (q) .
f
2
2
What does this result mean geometrically?
4.4
xn p0
xn
i = 1, . . . , n) b a = 0
since the system of equations has full rank. Thus, f (a) = f (b) if and only if a = b
and f is one-to-one on U .
The following is an immediate consequence of this lemma:
Theorem 4.50. Let $f : \mathbb{R}^n \to \mathbb{R}^m$, $m \ge n$. Suppose $D$ is an open subset of $\mathbb{R}^n$
and
(i) $f \in C^1(D)$,
(ii) $\operatorname{rank} f'(p) = n$ for all $p \in D$,
for all
(x, y) R2 .
2
y=k
x=c
x
u 2 + v 2= e 2
(Why?).
(p) = |f (p) q| ,
p B(p0 , 0 ).
Figure 4.14. The point $q_0$ is away from the image of the boundary.
Now p 6 C since otherwise we would have
p
2
p
p C = (p )
> > (p0 ),
3
3
n
X
i=1
(fi (p) qi )2
n
X
fi
(p ) = 0 = 2
(p ) (fi (p ) qi ) ,
=
xj
x
j
i=1
j = 1, . . . , n.
But the last line can be viewed as a homogeneous system of linear equations with
coefficient matrix having determinant $J_f(p^*) \ne 0$. Consequently, we must have
$$\{f_i(p^*) = q_i,\ i = 1, \dots, n\} \implies f(p^*) = q \implies q \in f(D).$$
The proof is complete, and $f(D)$ is open.
Remark: The condition that $J_f(p) \ne 0$ cannot be dropped even at a single point
in $D$. Indeed, take $D = \mathbb{R}$ and $f(x) = x^2$. Then $f(\mathbb{R}) = [0, \infty)$ is not open even
though $f'(x) \ne 0$ except at the single point $x = 0$.
Theorem 4.53. Suppose $f : D \subset \mathbb{R}^n \to \mathbb{R}^m$, $m \le n$, for some open set $D$. If
$f \in C^1(D)$ and the Jacobian matrix has full rank, i.e. $\operatorname{rank} f'(p) = m$, for every
$p \in D$, then $f(D)$ is open.
Proof. Given $c \in D$, we must prove that $f(c)$ belongs to the interior of $f(D)$. By
relabeling the $x_i$'s if necessary, we may assume that
$$\frac{\partial(f_1, \dots, f_m)}{\partial(x_1, \dots, x_m)}(c) \ne 0.$$
Since $f \in C^1(D)$, there is an open neighborhood $U$ of $c$ such that
$$\frac{\partial(f_1, \dots, f_m)}{\partial(x_1, \dots, x_m)}(p) \ne 0,\quad p \in U.$$
by
= f (c) f (D) ,
c D,
Now the image of the sequence, $\{f^{-1}(q_k)\}$, is a sequence in the compact set $S$, hence
there is a subsequence $\{f^{-1}(q_{k_j})\}$ such that
$$\lim f^{-1}(q_{k_j}) = p_1 \in S. \tag{4.19}$$
2m|u|,
2m|p p0 |,
for all p.
(4.20)
(4.21)
But then
$$m|p - p_0| \ge |f(p) - f(p_0) - Df(p_0)(p - p_0)| \ge |Df(p_0)(p - p_0)| - |f(p) - f(p_0)| \ge 2m|p - p_0| - |f(p) - f(p_0)| \quad\text{by (4.21)}$$
=
Let $g : \mathbb{R}^n \to \mathbb{R}^n$ be defined by
$$g(p) := \frac{1}{|p - p_0|}\big[f(p) - f(p_0) - Df(p_0)(p - p_0)\big] \tag{4.22}$$
$$\implies \lim_{p\to p_0} g(p) = O. \tag{4.23}$$
Since $J_f(p_0) \ne 0$, the inverse matrix $[Df(p_0)]^{-1}$ exists and from (4.22),
$$|p - p_0|\,[Df(p_0)]^{-1}g(p) = [Df(p_0)]^{-1}\big[f(p) - f(p_0) - Df(p_0)(p - p_0)\big] = [Df(p_0)]^{-1}\big[f(p) - f(p_0)\big] - (p - p_0). \tag{4.24}$$
lim
qq0
1
Df (p0 )
(g(p)) = 0.
1
Df (p0 )
(g(p)) = lim
pp0
Thus, since $[Df(p_0)]^{-1}$ is linear, $Df^{-1}$ exists and equals $[Df(p_0)]^{-1}$.
From the fact that $Df^{-1}(f(p_0)) = [Df(p_0)]^{-1}$, it follows that $f^{-1} \in C^1(f(D))$
since the partial derivatives of $f^{-1}$ are rational functions of the partial derivatives
of $f$ in which the denominator $J_f(p)$ is not zero.
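Example 4.58 We have seen that $f(x, y) = (e^x\cos(y), e^x\sin(y))$ is one-to-one on
the strip $0 \le y < 2\pi$. Recall
$$\det f'(x, y) = \det\begin{bmatrix} e^x\cos(y) & -e^x\sin(y)\\ e^x\sin(y) & e^x\cos(y) \end{bmatrix} = e^{2x} \ne 0,$$
$$\implies [f'(x, y)]^{-1} = e^{-2x}\begin{bmatrix} e^x\cos(y) & e^x\sin(y)\\ -e^x\sin(y) & e^x\cos(y) \end{bmatrix}.$$
To find $f^{-1}$, solve $(u, v) = (e^x\cos(y), e^x\sin(y))$ for $x$ and $y$:
$$u^2 + v^2 = e^{2x},\quad \tan(y) = \frac{v}{u} \implies x = \frac{1}{2}\log(u^2 + v^2),\quad y = \arctan\left(\frac{v}{u}\right),$$
and
$$f^{-1}(u, v) = \left(\frac{1}{2}\log(u^2 + v^2),\ \arctan\left(\frac{v}{u}\right)\right).$$
Then
$$(f^{-1})'(u, v) = \begin{bmatrix} \frac{u}{u^2 + v^2} & \frac{v}{u^2 + v^2}\\ \frac{-v}{u^2 + v^2} & \frac{u}{u^2 + v^2} \end{bmatrix} = \frac{1}{u^2 + v^2}\begin{bmatrix} u & v\\ -v & u \end{bmatrix} = e^{-2x}\begin{bmatrix} e^x\cos(y) & e^x\sin(y)\\ -e^x\sin(y) & e^x\cos(y) \end{bmatrix}.$$
Thus, we see that $Df^{-1}(u, v) = [Df(x, y)]^{-1}$.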
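The conclusion of Example 4.58 can be checked numerically at a sample point. In the sketch below, `math.atan2(v, u)` is used in place of $\arctan(v/u)$ so that the quadrant is handled; this is an assumption of the sketch and agrees with the formula in the example when $u > 0$. The sample point and the helper names are illustrative.

```python
import math

def f(x, y):
    return math.exp(x) * math.cos(y), math.exp(x) * math.sin(y)

def f_inverse(u, v):
    # atan2 picks the angle in (-pi, pi]; an assumption made for this sketch.
    return 0.5 * math.log(u * u + v * v), math.atan2(v, u)

def jacobian_f(x, y):
    ex = math.exp(x)
    return [[ex * math.cos(y), -ex * math.sin(y)],
            [ex * math.sin(y), ex * math.cos(y)]]

def jacobian_f_inverse(u, v):
    r2 = u * u + v * v
    return [[u / r2, v / r2], [-v / r2, u / r2]]

def matmul2(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

x, y = 0.4, 1.1
u, v = f(x, y)
product = matmul2(jacobian_f_inverse(u, v), jacobian_f(x, y))
print(f_inverse(u, v))
print(product)
```

The product should be the identity matrix up to rounding, which is the matrix statement of $Df^{-1}(u, v) = [Df(x, y)]^{-1}$ at this point.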
Exercises
4.60. In Exercise 29 a sufficient condition was given for $f : \mathbb{R}^n \to \mathbb{R}^n$ to be globally
one-to-one. Verify that this condition is not satisfied by Example 4.51. (If
you did not do Exercise 29, a proof is given in Lemma 4.49.)
4.61. Show that if $f : \mathbb{R} \to \mathbb{R}$ and $f'(x) \ne 0$ for each $x \in \mathbb{R}$, then $f$ is one-to-one
globally on $\mathbb{R}$.
4.62. Let $f : \mathbb{R} \to \mathbb{R}$. Show that if $f$ is continuous and one-to-one on a connected
subset $S$ of $\mathbb{R}$, then $f^{-1}$ is continuous on $f(S)$.
4.63. Let $y = y(x)$, that is, $y_i = y_i(x_1, \dots, x_n)$ for $i = 1, \dots, n$, be $C^1$. Show that
$$\frac{\partial(y_1, \dots, y_n)}{\partial(x_1, \dots, x_n)} \ne 0 \implies \frac{\partial(y_1, \dots, y_n)}{\partial(x_1, \dots, x_n)}\,\frac{\partial(x_1, \dots, x_n)}{\partial(y_1, \dots, y_n)} = 1.$$
4.64. Let $f(x, y) = (x^2, y/x)$ when $x > 0$. Find $f'(x, y)$. Show that $f$ is one-to-one
on its domain by finding $f^{-1}$. Check that $[f']^{-1} = (f^{-1})'$.
4.65. Let
$$f(x, y) = \left(\frac{x}{\sqrt{x^2 + y^2}},\ \frac{y}{\sqrt{x^2 + y^2}}\right).$$
Show that $J_f(x, y) = 0$ for all $(x, y) \in \mathbb{R}^2 \setminus O$ and $f$ is not locally one-to-one
anywhere on its domain. Show that the range of $f$ is the circle $u^2 + v^2 = 1$
and thus contains no open subset.
4.5
i = 1, 2, . . . , n
(4.26)
i = 1, 2, . . . , n?
(4.27)
Of course, the way the question was originally stated, part of the problem is to
find which are the independent variables and which are the dependent variables. The
equations will not be written for you with the independent variable so conveniently
labelled. However, since we can always rearrange variables (once we identify them),
for the sake of the statement of the Theorem, we assume the given order.
If the equations (4.26) are linear in the variables yi , then we know we have a
solution if the coefficient matrix is nonsingular. But observe, the coefficient matrix
of linear functions is nothing more than the matrix of partial derivatives for those
functions with respect to the linear variables. This suggests that the answer lies in
whether or not the Jacobian of the functions with respect to the dependent variables
is nonzero.
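Theorem 4.59 (Implicit Function Theorem). Suppose that
(i) $f : \mathbb{R}^{m+n} \to \mathbb{R}^n$ is $C^1(D)$ for some open domain $D \subset \mathbb{R}^{m+n}$, and
(ii) there is a point $(p_0, q_0)$ in $D$ for which
\[ f(p_0, q_0) = O \quad\text{and}\quad \frac{\partial(f_1,\dots,f_n)}{\partial(y_1,\dots,y_n)}(p_0,q_0) \ne 0. \tag{4.28} \]
Then there exist a neighborhood $U$ of $p_0$ and a function $\varphi : U \to \mathbb{R}^n$, $\varphi \in C^1(U)$, with $\varphi(p_0) = q_0$, such that
\[ f(p, \varphi(p)) = O, \qquad p \in U, \tag{4.29} \]
and
\[ -\begin{pmatrix} \frac{\partial f_1}{\partial x_1} & \dots & \frac{\partial f_1}{\partial x_m} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_n}{\partial x_1} & \dots & \frac{\partial f_n}{\partial x_m} \end{pmatrix}
= \begin{pmatrix} \frac{\partial f_1}{\partial y_1} & \dots & \frac{\partial f_1}{\partial y_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_n}{\partial y_1} & \dots & \frac{\partial f_n}{\partial y_n} \end{pmatrix}
\begin{pmatrix} \frac{\partial \varphi_1}{\partial x_1} & \dots & \frac{\partial \varphi_1}{\partial x_m} \\ \vdots & \ddots & \vdots \\ \frac{\partial \varphi_n}{\partial x_1} & \dots & \frac{\partial \varphi_n}{\partial x_m} \end{pmatrix}
\qquad \Big( -\Big[\frac{\partial f}{\partial x}\Big] = \Big[\frac{\partial f}{\partial y}\Big]\Big[\frac{\partial \varphi}{\partial x}\Big] \Big), \tag{4.30} \]
and
\[ \frac{\partial \varphi_i}{\partial x_j} = \frac{\partial y_i}{\partial x_j} = -\,\frac{\dfrac{\partial(f_1,\dots,f_n)}{\partial(y_1,\dots,y_{i-1},x_j,y_{i+1},\dots,y_n)}}{\dfrac{\partial(f_1,\dots,f_n)}{\partial(y_1,\dots,y_n)}}, \qquad i = 1,\dots,n,\quad j = 1,\dots,m. \tag{4.31} \]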
Notice there are two solutions of the form x = y defined on [0, ), but even
these are not C 1 at x = 0. There are infinitely many discontinuous solutions defined
(f1 (x, y, z) = 0)
(f2 (x, y, z) = 0)
which of the variables can be solved in terms of the others and at what points?
Basically, we want to test when (4.28) holds. Therefore, we choose pairs of
variables and look at the Jacobian:
(f1 , f2 )
1
1
=
= 2(y x + z).
2x 2z 2y
(x, y)
or z 2 + (z + y)2
1
= 0,
2
q
which implies y = z 12 z 2 . Substituting this in the first equation gives x as
a function of z.
Testing another pair of variables, we find
(f1 , f2 )
1
1
= 0.
=
2x 2z 2z 2x
(x, z)
Hence, we can never solve for x, z as a function of y. One can check this directly:
from the first equation y = (x + z) which substituted into the second equation
gives 2(x + z)2 1 = 0, or 2y 2 = 1. Thus, eliminating x also eliminates z.
Since the equations are symmetric in x and z, it should be no surprise to
discover that we can solve for z, y as a function of x in a neighborhood of any point
except those lying on the plane x z y = 0.
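Example 4.62 Consider the equations
\[ \begin{aligned}
2x^2 - u + v &= 0 &\qquad& (f_1(x, y, z, u, v) = 0)\\
2y^3 - u - v &= 0 && (f_2(x, y, z, u, v) = 0)\\
z + u - v^2 &= 0 && (f_3(x, y, z, u, v) = 0).
\end{aligned} \]
The Jacobians with respect to several triples of variables are
\[ \frac{\partial(f_1,f_2,f_3)}{\partial(u,v,z)} = \begin{vmatrix} -1 & +1 & 0 \\ -1 & -1 & 0 \\ 1 & -2v & 1 \end{vmatrix} = 2, \]
\[ \frac{\partial(f_1,f_2,f_3)}{\partial(u,v,x)} = \begin{vmatrix} -1 & +1 & 4x \\ -1 & -1 & 0 \\ 1 & -2v & 0 \end{vmatrix} = 4x(2v+1), \]
\[ \frac{\partial(f_1,f_2,f_3)}{\partial(u,v,y)} = \begin{vmatrix} -1 & +1 & 0 \\ -1 & -1 & 6y^2 \\ 1 & -2v & 0 \end{vmatrix} = -6y^2(2v-1), \]
\[ \frac{\partial(f_1,f_2,f_3)}{\partial(x,y,z)} = \begin{vmatrix} 4x & 0 & 0 \\ 0 & 6y^2 & 0 \\ 0 & 0 & 1 \end{vmatrix} = 24xy^2. \]
This shows that we can always solve for $u, v, z$ in terms of $x$ and $y$, but for the other combinations tested, there are points at which we cannot apply the theorem. The first case leads to the solution
\[ v = y^3 - x^2, \qquad u = x^2 + y^3, \qquad z = -x^2 - y^3 + y^6 - 2y^3x^2 + x^4. \]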
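As a quick sanity check, one can substitute the explicit solution back into the three equations. The sketch below assumes the signs $2x^2 - u + v = 0$, $2y^3 - u - v = 0$, $z + u - v^2 = 0$ (the minus signs are inferred from the stated solution); the function name `residuals` is ours.

```python
def solution(x, y):
    """Explicit solution (u, v, z) in terms of (x, y) for Example 4.62."""
    u = x**2 + y**3
    v = y**3 - x**2
    z = -x**2 - y**3 + y**6 - 2 * y**3 * x**2 + x**4
    return u, v, z

def residuals(x, y):
    """Values of f1, f2, f3 at (x, y, z(x,y), u(x,y), v(x,y)); all should vanish."""
    u, v, z = solution(x, y)
    f1 = 2 * x**2 - u + v
    f2 = 2 * y**3 - u - v
    f3 = z + u - v**2
    return f1, f2, f3

# A few arbitrary sample points, including ones where other Jacobians vanish.
samples = [(0.0, 0.0), (1.0, -2.0), (0.5, 0.25), (-1.5, 0.75)]
worst = max(abs(r) for (x, y) in samples for r in residuals(x, y))
```

The quantity `worst` should be zero up to rounding, including at points such as $x = 0$ or $y = 0$ where the Jacobians for the other variable choices vanish.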
= 0
= 0.
(4.32)
Now, the last equation implies $y_0 \ne 0$ and $u_0 \ne 0$, and putting this in (4.32) indicates that $x_0 \ne 0$ as well. The equations are easy to solve as
\[ u = \frac{x^2}{y}, \qquad v = -\frac{y^2}{x}. \]
You will see from the simple exercises for this section that when the basic
condition on the Jacobian in (4.28) is not satisfied there may be no solution, or no
C 1 solution, or indeed infinitely many solutions.
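Consider the function $F : \mathbb{R}^{m+n} \to \mathbb{R}^{m+n}$ defined by
\[ F(p, q) = (p, f(p, q)). \]
Then
\[ \det F'(p_0,q_0) = \det \begin{pmatrix} I_m & O_{mn} \\[1mm] \Big[\dfrac{\partial f_i}{\partial x_j}\Big] & \Big[\dfrac{\partial f_i}{\partial y_k}\Big] \end{pmatrix}_{(p_0,q_0)}
= \begin{vmatrix} \frac{\partial f_1}{\partial y_1} & \dots & \frac{\partial f_1}{\partial y_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_n}{\partial y_1} & \dots & \frac{\partial f_n}{\partial y_n} \end{vmatrix}_{(p_0,q_0)}
= \frac{\partial(f_1,\dots,f_n)}{\partial(y_1,\dots,y_n)}(p_0,q_0) \ne 0. \]
[Figure: the map $F$ carries a neighborhood $W$ of $(p_0, q_0)$ in $\mathbb{R}^m(p) \times \mathbb{R}^n(q)$ onto $F(W)$ in $\mathbb{R}^m(u) \times \mathbb{R}^n(v)$, a neighborhood of $(p_0, O)$.]
By the Inverse Function Theorem, $F$ has a $C^1$ inverse near $F(p_0, q_0) = (p_0, O)$, of the form
\[ F^{-1}(u,v) = (u, \psi(u,v)) \qquad (\text{notice } u = p), \tag{4.33} \]
that is, $p = u$ and $q = \psi(u, v)$. Hence
\[ (u,v) = F(F^{-1}(u, v)) = F(u, \psi(u, v)) = (u, f(u, \psi(u, v))) \qquad (u = p), \]
and thus,
\[ O = f(p, \psi(p, O)) \]
for all $p$ in a neighborhood $U$ of $p_0$. Therefore,
\[ O = f(p, \varphi(p)) \quad\text{if}\quad \varphi(p) := \psi(p, O). \]
Differentiating this identity with respect to $x_j$ by the Chain Rule,
\[ 0 = \frac{\partial f_i}{\partial x_j} + \sum_{k=1}^{n} \frac{\partial f_i}{\partial y_k}\frac{\partial \varphi_k}{\partial x_j}, \qquad i = 1,\dots,n,\quad j = 1,\dots,m, \]
which gives (4.30). Viewing that as a linear system of equations for the unknowns $\partial\varphi_k/\partial x_j$, $k = 1, \dots, n$, Cramer's Rule gives (4.31).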
Remark: The function $\varphi$ is the unique solution to $f(p, \varphi(p)) = O$ with $\varphi(p_0) = q_0$ since if to the contrary we had two solutions
\[ f(p, \varphi_1(p)) = f(p, \varphi_2(p)) = O \]
\[ \frac{dy}{dx} = -\frac{\partial f/\partial x}{\partial f/\partial y}. \]
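The formula $dy/dx = -(\partial f/\partial x)/(\partial f/\partial y)$ is easy to test on a concrete curve. The sketch below uses the circle $f(x,y) = x^2 + y^2 - 1 = 0$ near the point $(0.6, 0.8)$ as an illustrative choice of ours; the partial derivatives are approximated by central differences, and the result is compared with the slope of the explicit branch $y = \sqrt{1 - x^2}$.

```python
import math

def f(x, y):
    """Illustrative constraint: the unit circle x^2 + y^2 - 1 = 0."""
    return x * x + y * y - 1.0

def partial_x(g, x, y, h=1e-6):
    """Central-difference approximation of dg/dx at (x, y)."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def partial_y(g, x, y, h=1e-6):
    """Central-difference approximation of dg/dy at (x, y)."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

def implicit_slope(g, x, y):
    """dy/dx = -(dg/dx)/(dg/dy), valid where dg/dy != 0."""
    return -partial_x(g, x, y) / partial_y(g, x, y)

def explicit_branch(x):
    """Upper branch of the circle, y = sqrt(1 - x^2)."""
    return math.sqrt(1.0 - x * x)

x0 = 0.6
y0 = explicit_branch(x0)                      # 0.8
slope_implicit = implicit_slope(f, x0, y0)    # expected -x0/y0
h = 1e-6
slope_explicit = (explicit_branch(x0 + h) - explicit_branch(x0 - h)) / (2 * h)
```

Both slopes should agree with the exact value $-x_0/y_0 = -0.75$; at $(1, 0)$, where $\partial f/\partial y = 0$, the formula breaks down, as the theorem predicts.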
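\[ f_1(x, y, z, u, v) = 0 \quad\text{and}\quad f_2(x, y, z, u, v) = 0, \]
\[ u = \varphi_1(x, y, z), \qquad v = \varphi_2(x, y, z) \]
with
\[ u_0 = \varphi_1(x_0, y_0, z_0), \qquad v_0 = \varphi_2(x_0, y_0, z_0), \]
\[ \frac{\partial\varphi_1}{\partial x} = \frac{\partial u}{\partial x} = -\frac{\partial(f_1,f_2)/\partial(x,v)}{\partial(f_1,f_2)/\partial(u,v)}, \qquad
\frac{\partial\varphi_2}{\partial x} = \frac{\partial v}{\partial x} = -\frac{\partial(f_1,f_2)/\partial(u,x)}{\partial(f_1,f_2)/\partial(u,v)}, \]
\[ \frac{\partial\varphi_1}{\partial y} = \frac{\partial u}{\partial y} = -\frac{\partial(f_1,f_2)/\partial(y,v)}{\partial(f_1,f_2)/\partial(u,v)}, \qquad
\frac{\partial\varphi_2}{\partial y} = \frac{\partial v}{\partial y} = -\frac{\partial(f_1,f_2)/\partial(u,y)}{\partial(f_1,f_2)/\partial(u,v)}, \]
\[ \frac{\partial\varphi_1}{\partial z} = \frac{\partial u}{\partial z} = -\frac{\partial(f_1,f_2)/\partial(z,v)}{\partial(f_1,f_2)/\partial(u,v)}, \qquad
\frac{\partial\varphi_2}{\partial z} = \frac{\partial v}{\partial z} = -\frac{\partial(f_1,f_2)/\partial(u,z)}{\partial(f_1,f_2)/\partial(u,v)}. \]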
Exercises
4.66. Prove Corollary 4.64. That is, work through the proof of Theorem 4.59 in
this special case.
4.67. The equation $y^2 - x^2 = 0$ has two $C^1$ solutions, $y = \pm x$, in a neighborhood of $x = 0$; this shows uniqueness may not hold. What condition of the Implicit Function Theorem does not hold? Check that there are four solutions of class $C$ and infinitely many real-valued solutions.
4.68. Show that the equations
x2 yu = 0,
u = u0 ,
v = v0 ,
xy + uv = 0,
u
u
=
,
y
y
v
y
2vx
=
,
x
u
uy
v
x v
= +
y
u y
= (1)n
.
(u1 , . . . , un ) (x1 , . . . , xn )
(x1 , . . . , xn )
[Hint: Use the Chain Rule and think matrices.]
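4.71. If $u^2 + v^2 + 2xuv + y = 0$, $uv + (u + v)y + x^2 = 0$, prove that
\[ \frac{\partial(u,v)}{\partial(x,y)} = \frac{uv(u+v) - x}{(u-v)[(u+v) + y(1-x)]}. \]
4.72. If $u_1 = x_1 + x_2 + x_3 + x_4$, $u_1u_2 = x_2 + x_3 + x_4$, $u_1u_2u_3 = x_3 + x_4$, and $u_1u_2u_3u_4 = x_4$, show that
\[ \frac{\partial(x_1,x_2,x_3,x_4)}{\partial(u_1,u_2,u_3,u_4)} = u_1^3u_2^2u_3. \]
4.73. Do the following
(a) If $V = \varphi(u, v)$, $\xi(u, v) = E(x, y)$, $\eta(u, v) = F(x, y)$, and $\partial(\xi,\eta)/\partial(u, v) \ne 0$. Prove
\[ \frac{\partial V}{\partial x}\,\frac{\partial(\xi,\eta)}{\partial(u,v)} = \frac{\partial E}{\partial x}\,\frac{\partial(\varphi,\eta)}{\partial(u,v)} + \frac{\partial F}{\partial x}\,\frac{\partial(\xi,\varphi)}{\partial(u,v)}. \]
(b) If $V = u^2 + v^2 + uv$, $u + v = x^2 + y^2$, $u^3 + v^3 = 2xy$, prove that
\[ 3\frac{\partial V}{\partial y} = \frac{2x(x^2-y^2)}{(x^2+y^2)^2} + 8y(x^2+y^2). \]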
V (1 , . . . , n ) X fk (1 , . . . , k1 , , k+1 , . . . , n )
=
.
xj (u1 , . . . , un )
xj
(u1 , . . . , un )
k=1
f
2 f
y
2
y
yx
4.5.1
Dimension
We have a notion of dimension for linear objects, specifically for vector spaces, namely, the number of vectors in a basis. If $L : \mathbb{R}^k \to \mathbb{R}^n$ is linear, $L(x) = Ax$, then the dimension of $L(\mathbb{R}^k) \subset \mathbb{R}^n$ is the rank of $A$. In particular, if rank $A = k$, then $L(\mathbb{R}^k) \subset \mathbb{R}^n$ is $k$-dimensional, the same dimension as $\mathbb{R}^k$.
Definition 4.67. A subset $S$ of $\mathbb{R}^n$ is a $k$-dimensional segment, $k > 0$, if there is an open connected set $D \subset \mathbb{R}^k$ and a function $f : \mathbb{R}^k \to \mathbb{R}^n$ such that
(a) $f \in C^1(D)$ and $f : D \to S$ is one-to-one and onto (i.e. $f(D) = S$),
(b) rank $f'(p) = k$ for all $p \in D$.
A 0-dimensional segment in $\mathbb{R}^n$ is a single point in $\mathbb{R}^n$.
A subset $S$ of $\mathbb{R}^n$ is a $k$-dimensional manifold if for every $q \in S$, there exists an open neighborhood $V \subset \mathbb{R}^n$ of $q$ such that $V \cap S$ is a $k$-dimensional segment.
Remarks: Note that $k \le n$ always. A $k$-dimensional segment is automatically a $k$-dimensional manifold. If the condition of one-to-one is dropped in (a), then $f$ is still locally one-to-one by Theorem 4.50, and so, $f(U)$ is a $k$-dimensional segment if $U$ is a sufficiently small open subset of $D$.
Example 4.68 The set $S := \{(x, y) : y = 3x,\ 0 < x < 1\}$ is a 1-dimensional manifold (segment) in $\mathbb{R}^2$. Indeed, let $D = (0, 1) \subset \mathbb{R}$ and define $f(t) = (t, 3t)$, $0 < t < 1$. Then $f'(t) = \begin{pmatrix} 1 \\ 3 \end{pmatrix}$ and rank $f' = 1$.
Example 4.69
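Example 4.70 The set $S := \{(x, y, z) : x^2 + y^2 + z^2 = 1,\ x > 0,\ y > 0,\ z > 0\}$ is a 2-dimensional segment in $\mathbb{R}^3$. In this case, consider
\[ f(\theta, \varphi) := (\cos(\theta)\sin(\varphi),\ \sin(\theta)\sin(\varphi),\ \cos(\varphi)) = (x, y, z), \qquad 0 < \theta < \tfrac{\pi}{2},\quad 0 < \varphi < \tfrac{\pi}{2}. \]
Then $f \in C^1\big((0, \pi/2) \times (0, \pi/2)\big)$, $f$ is one-to-one, and
\[ f'(\theta, \varphi) = \begin{pmatrix} -\sin(\theta)\sin(\varphi) & \cos(\theta)\cos(\varphi) \\ \cos(\theta)\sin(\varphi) & \sin(\theta)\cos(\varphi) \\ 0 & -\sin(\varphi) \end{pmatrix}. \]
Here $\dfrac{\partial(y,z)}{\partial(\theta,\varphi)} = -\cos(\theta)\sin^2(\varphi) \ne 0$ on the domain, so rank $f' = 2$.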
Figure 4.18. The 1-dimensional segment of Example 4.71.
of the segment (the variable of f being called the parameter). A local parameterization of a manifold is essentially a local coordinate system for the manifold; for
example, in Example 4.70,
i = 1, . . . , k
fi
xj
has rank k.
i = 1, . . . , k,
F =
, and rank F = n k.
I(nk)(nk)
with
1 ]
and
rank F = 1.
1 ] ,
fy
and
rank F = 1.
Question: Let
\[ A(x) = \begin{pmatrix} a(x) & b(x) & c(x) \\ d(x) & e(x) & f(x) \end{pmatrix}. \]
Does rank $A(x_0) = 2$ imply that rank $A(x) = 2$ near $x_0$ if the entries in $A$ are continuous? Does rank $A(x_0) = 1$ imply rank $A(x) = 1$ near $x_0$?
Theorem 4.79. Let $D \subset \mathbb{R}^n$ be open. Suppose
(i) $f : D \to \mathbb{R}^n$ and $f \in C^1(D)$,
(ii) rank $f'(p) = k$ for every $p \in D$.
Then for each $c \in D$, there is a neighborhood $U_c$ of $c$ such that $f(U_c)$ is a $k$-dimensional segment.
This theorem is analogous to the statement that the dimension of the range
of any linear function L(x) = Ax is the rank of A. The proof of this theorem is
notationally quite complicated so we will consider a few examples and special cases
first. Actually, all the essential ideas of the proof are contained in the special cases
so you may skip the proof if you wish.
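Example 4.80 Let $f : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by $f(x, y) = (x + y, (x + y)^2)$. Then
\[ \operatorname{rank} f'(x, y) = \operatorname{rank} \begin{pmatrix} 1 & 1 \\ 2(x+y) & 2(x+y) \end{pmatrix} = 1 \qquad \forall (x, y). \]
Thus, $f(\mathbb{R}^2)$ is the 1-dimensional segment $\{(t, t^2) : t \in \mathbb{R}\}$, a parabola. We have used $t = x + y$ to parameterize the curve.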
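Example 4.81 Let $f : \mathbb{R}^3 \to \mathbb{R}^3$ be defined by
\[ f(x, y, z) = (x - y,\ y - z,\ x(x - 2y) - z(z - 2y)). \]
Then
\[ \operatorname{rank} f'(x, y, z) = \operatorname{rank} \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 2(x-y) & 2(z-x) & 2(y-z) \end{pmatrix} = 2. \]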
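The rank computation in Example 4.81 can be illustrated numerically. The sketch below checks that $\det f'$ vanishes at sample points and that the third component is a function of the first two; the explicit relation $w = u^2 - v^2$ is our own observation (it follows by expanding $(x-y)^2 - (y-z)^2$), not something stated in the text, and the helper names are hypothetical.

```python
def f(x, y, z):
    """The map of Example 4.81, written as (u, v, w)."""
    return (x - y, y - z, x * (x - 2 * y) - z * (z - 2 * y))

def jacobian(x, y, z):
    """Jacobian matrix f'(x, y, z) as computed in the example."""
    return [[1.0, -1.0, 0.0],
            [0.0, 1.0, -1.0],
            [2 * (x - y), 2 * (z - x), 2 * (y - z)]]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def dependence_defect(x, y, z):
    """w - (u^2 - v^2); zero if w depends on (u, v) through w = u^2 - v^2."""
    u, v, w = f(x, y, z)
    return w - (u * u - v * v)

points = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (-0.5, 0.25, 4.0), (2.0, -1.0, 0.5)]
max_det = max(abs(det3(jacobian(*p))) for p in points)
max_defect = max(abs(dependence_defect(*p)) for p in points)
```

Both `max_det` and `max_defect` should be zero up to rounding, while the upper-left $2 \times 2$ minor of the Jacobian equals 1, so the rank is exactly 2.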
u = (v)
or
v = (u),
, C 1 ,
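Proof. If $\partial f_1/\partial x\,(x_0, y_0) \ne 0$, then $\partial f_1/\partial x \ne 0$ in an open neighborhood $U$ of $(x_0, y_0)$, so $f_1(U)$ is open (Theorem 4.53). By the Implicit Function Theorem, near $(x_0, y_0, u_0)$, the equation $u = f_1(x, y)$ may be solved uniquely for $x$ in the form
\[ x = \varphi(u, y), \quad\text{with}\quad \varphi \in C^1 \quad\text{and}\quad \frac{\partial\varphi}{\partial y} = -\frac{\partial f_1/\partial y}{\partial f_1/\partial x}. \]
Then $v = f_2(\varphi(u, y), y)$ and
\[ \frac{\partial v}{\partial y} = \frac{\partial f_2}{\partial x}\frac{\partial \varphi}{\partial y} + \frac{\partial f_2}{\partial y}
= \frac{\dfrac{\partial f_1}{\partial x}\dfrac{\partial f_2}{\partial y} - \dfrac{\partial f_2}{\partial x}\dfrac{\partial f_1}{\partial y}}{\dfrac{\partial f_1}{\partial x}}
= \frac{\dfrac{\partial(f_1,f_2)}{\partial(x,y)}}{\dfrac{\partial f_1}{\partial x}} = 0, \]
so that $v$ is a function of $u$ alone.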
v = f2 (x, y, z),
w = f3 (x, y, z),
w = (u, v),
or
u = (v, w),
or
v = (w, u),
, , C 1 ,
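\[ x = \varphi_1(u, v, z), \qquad y = \varphi_2(u, v, z), \qquad \varphi_1, \varphi_2 \in C^1; \]
hence
\[ w = f_3(\varphi_1(u, v, z), \varphi_2(u, v, z), z). \]
We observe that this last function is actually independent of $z$:
\[ \begin{aligned}
\frac{\partial w}{\partial z} &= \frac{\partial f_3}{\partial x}\frac{\partial \varphi_1}{\partial z} + \frac{\partial f_3}{\partial y}\frac{\partial \varphi_2}{\partial z} + \frac{\partial f_3}{\partial z} \\
&= -\frac{\partial f_3}{\partial x}\,\frac{\partial(f_1,f_2)/\partial(z,y)}{\partial(f_1,f_2)/\partial(x,y)} - \frac{\partial f_3}{\partial y}\,\frac{\partial(f_1,f_2)/\partial(x,z)}{\partial(f_1,f_2)/\partial(x,y)} + \frac{\partial f_3}{\partial z} \\
&= \frac{\dfrac{\partial f_3}{\partial x}\dfrac{\partial(f_1,f_2)}{\partial(y,z)} - \dfrac{\partial f_3}{\partial y}\dfrac{\partial(f_1,f_2)}{\partial(x,z)} + \dfrac{\partial f_3}{\partial z}\dfrac{\partial(f_1,f_2)}{\partial(x,y)}}{\dfrac{\partial(f_1,f_2)}{\partial(x,y)}} = 0,
\end{aligned} \]
since rank $f' = 2$ (the numerator is the expansion of $\det f'$ along its third row). Therefore,
\[ w = \varphi(u, v) \]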
i = 1, . . . , n.
(4.34)
(4.35)
i = 1, . . . , k.
(4.36)
i = k + 1, . . . , n.
(4.37)
We claim that these are functions of (u1 , . . . , un ) only. We show only that there is
no dependence on xk+1 . By the derivative formula (4.31) of the Implicit Function
Theorem,
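\[ \frac{\partial \varphi_\ell}{\partial x_{k+1}} = -\,\frac{\dfrac{\partial(f_1,\dots,f_k)}{\partial(x_1,\dots,x_{\ell-1},x_{k+1},x_{\ell+1},\dots,x_k)}}{\dfrac{\partial(f_1,\dots,f_k)}{\partial(x_1,\dots,x_k)}}, \qquad \ell = 1, \dots, k. \]
Substituting these into the Chain Rule applied to equations from (4.37), using the effect of interchanging rows in determinants, and using the expansion of a determinant about row $i$, we obtain
\[ \begin{aligned}
\frac{\partial u_i}{\partial x_{k+1}} &= \sum_{\ell=1}^{k} \frac{\partial f_i}{\partial x_\ell}\frac{\partial \varphi_\ell}{\partial x_{k+1}} + \frac{\partial f_i}{\partial x_{k+1}}
= -\sum_{\ell=1}^{k} \frac{\partial f_i}{\partial x_\ell}\, \frac{\dfrac{\partial(f_1,\dots,f_k)}{\partial(x_1,\dots,x_{\ell-1},x_{k+1},x_{\ell+1},\dots,x_k)}}{\dfrac{\partial(f_1,\dots,f_k)}{\partial(x_1,\dots,x_k)}} + \frac{\partial f_i}{\partial x_{k+1}} \\
&= \frac{\displaystyle\sum_{\ell=1}^{k+1} (-1)^{k+1-\ell}\, \frac{\partial f_i}{\partial x_\ell}\, \frac{\partial(f_1,\dots,f_k)}{\partial(x_1,\dots,x_{\ell-1},x_{\ell+1},\dots,x_k,x_{k+1})}}{\dfrac{\partial(f_1,\dots,f_k)}{\partial(x_1,\dots,x_k)}} = 0,
\end{aligned} \]
since rank $f' = k$.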
i = k + 1, . . . , n,
with
i C 1 (V0 )
(4.38)
4.5.2
Here we provide an application of the Implicit Function Theorem to extremal problems subject to constraints.
Definition 4.84. Let $f : D \subset \mathbb{R}^n \to \mathbb{R}$ be a real-valued function on the domain $D$. The function $f$ has a relative maximum (respectively, minimum) with respect to a set $S$ at $p_0 \in S \cap D$ if, for some neighborhood $U$ of $p_0$,
\[ f(p) \le f(p_0), \quad p \in S \cap U \qquad (\text{respectively, } f(p) \ge f(p_0), \quad p \in S \cap U). \]
i = 1, . . . , k},
i = 1, . . . , n.
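\[ 2x + \lambda = 0, \qquad 2y + \lambda = 0, \qquad\text{with}\qquad x + y - 1 = 0. \]
\[ d^2 = \Big(\frac12\Big)^2 + \Big(\frac12\Big)^2 = \frac12, \qquad\text{or}\qquad d = \frac{1}{\sqrt2}. \]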
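\[ \frac{\partial f}{\partial x} = 4x - 2 = 0, \qquad \frac{\partial f}{\partial y} = -6y = 0 \quad\Longrightarrow\quad (x, y) = \Big(\frac12, 0\Big). \]
But,
\[ \begin{pmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{pmatrix} = \begin{pmatrix} 4 & 0 \\ 0 & -6 \end{pmatrix} \quad\text{is indefinite,} \]
so $f$ has a saddle point at $(\frac12, 0)$. Therefore, the maximum and minimum occur in the set
\[ \{(x, y) : x^2 + y^2 - 1 = 0\}. \]
The Lagrange Multiplier Rule then gives the existence of a $\lambda$ such that at the extrema
\[ \begin{aligned} 4x - 2 + 2\lambda x &= 0 \\ -6y + 2\lambda y &= 0 \\ x^2 + y^2 - 1 &= 0. \end{aligned} \]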
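The Lagrange system above can be solved by hand and cross-checked by brute force. The sketch below assumes the objective is $f(x,y) = 2x^2 - 2x - 3y^2$, which is our reading of the partial derivatives $4x - 2$ and $-6y$ (the signs are inferred, since the statement of the example is not fully legible); the candidate list and function names are ours.

```python
import math

def f(x, y):
    """Assumed objective, with f_x = 4x - 2 and f_y = -6y."""
    return 2 * x * x - 2 * x - 3 * y * y

def lagrange_residual(x, y, lam):
    """Left-hand sides of the three Lagrange equations on the unit circle."""
    return (4 * x - 2 + 2 * lam * x,
            -6 * y + 2 * lam * y,
            x * x + y * y - 1)

# Candidates found by hand: y = 0 gives x = +-1; y != 0 forces lam = 3, x = 1/5.
s = math.sqrt(24) / 5
candidates = [(1.0, 0.0, -1.0), (-1.0, 0.0, -3.0), (0.2, s, 3.0), (0.2, -s, 3.0)]
candidate_values = [f(x, y) for (x, y, _lam) in candidates]

def scan_circle(n=3600):
    """Brute-force min and max of f over n equally spaced points of the circle."""
    values = [f(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
              for k in range(n)]
    return min(values), max(values)

scan_min, scan_max = scan_circle()
```

Under this assumption the maximum on the circle is $4$ at $(-1, 0)$ and the minimum is $-16/5$ at $(1/5, \pm\sqrt{24}/5)$; the scan should reproduce both values to within the grid resolution.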
The Lagrange multipliers are of no interest per se, and if they can be eliminated from the equations so much the better. You will observe that either of the examples above could have been done by solving the constraints to reduce the dimension of the problem and completed by applying standard methods. For example, in the first of the examples, we could solve the constraint for $y = 1 - x$ and rewrite $x^2 + y^2 = x^2 + (1 - x)^2 = 2x^2 - 2x + 1$, which has a minimum $1/2$ at $x = 1/2$. Such a reduction technique, however, often leads to unwieldy calculations; nonetheless, we will use it in one of the proofs of the theorem.
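The reduction just described is simple enough to carry out exactly. The following sketch minimizes $x^2 + y^2$ subject to $x + y = 1$ both ways, by eliminating $y$ and by the multiplier equations $2x + \lambda = 0$, $2y + \lambda = 0$; exact rational arithmetic is used, and the variable names are our own.

```python
from fractions import Fraction

def reduced_objective(x):
    """x^2 + y^2 with the constraint solved as y = 1 - x, i.e. 2x^2 - 2x + 1."""
    return x * x + (1 - x) ** 2

# Vertex of the parabola 2x^2 - 2x + 1 is at x = -b/(2a) = 2/4.
x_star = Fraction(2, 4)
y_star = 1 - x_star
minimum = reduced_objective(x_star)

# Lagrange route: 2x + lam = 0 and 2y + lam = 0 force x = y, so x = y = 1/2.
lam = -2 * x_star
lagrange_residuals = (2 * x_star + lam, 2 * y_star + lam, x_star + y_star - 1)

# Nearby feasible points give larger values, as expected at a minimum.
nearby = [reduced_objective(x_star + Fraction(k, 10)) for k in (-2, -1, 1, 2)]
```

Both routes give the minimum value $1/2$ at $(1/2, 1/2)$, with multiplier $\lambda = -1$.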
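Proof of 1 of Theorem 4.85. Let $f$ have a relative extremum at $p_0$ with respect to the set $S = \{p : g(p) = O\}$. Consider the function $F : \mathbb{R}^n \to \mathbb{R}^{k+1}$ given by
\[ F(p) = (f(p), g(p)) = (f(p), g_1(p), \dots, g_k(p)). \]
Then
\[ F'(p) = \begin{pmatrix} f'(p) \\ g'(p) \end{pmatrix} = \begin{pmatrix} \frac{\partial f}{\partial x_1} & \dots & \frac{\partial f}{\partial x_n} \\ \frac{\partial g_1}{\partial x_1} & \dots & \frac{\partial g_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial g_k}{\partial x_1} & \dots & \frac{\partial g_k}{\partial x_n} \end{pmatrix}. \]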
i = 1, . . . , n.
[Figure: a neighborhood $U$ of $p_0$ in $\mathbb{R}^n$, the constraint set $S : g(p) = 0$, and the image $F(U)$ containing the point $(f(p_0), O)$.]
j = 1, . . . , k,
serve to determine the $(n + k)$ numbers which are the coordinates of $p_0$ and the $\lambda_1, \dots, \lambda_k$.
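Proof of 2 of Theorem 4.85. This proof works in general but we prove just the case of one constraint. Suppose $f, g : \mathbb{R}^n \to \mathbb{R}$ are $C^1$ functions and $f(p)$, $p = (x_1, \dots, x_n)$, has an extremum at $c = (\gamma_1, \dots, \gamma_n)$, with respect to the constraint
\[ g(p) = 0. \tag{4.39} \]
If rank $g'(c) = 1$, then we may assume that $(\partial g/\partial x_1)(c) \ne 0$. Then, equation (4.39) may be solved for $x_1$ in a neighborhood of $(\gamma_2, \dots, \gamma_n)$ in the form
\[ x_1 = \varphi(x_2, \dots, x_n), \quad\text{with } \varphi \in C^1 \text{ and } \gamma_1 = \varphi(\gamma_2, \dots, \gamma_n). \]
The function $f(\varphi(x_2, \dots, x_n), x_2, \dots, x_n)$ then has an unconstrained extremum at $(\gamma_2, \dots, \gamma_n)$, so
\[ \frac{\partial f}{\partial x_i}(c) + \frac{\partial f}{\partial x_1}(c)\frac{\partial \varphi}{\partial x_i}(c) = 0, \quad i = 2, \dots, n, \quad\text{or}\quad
\frac{\partial f}{\partial x_i}(c) + \frac{\partial f}{\partial x_1}(c)\left( -\frac{\frac{\partial g}{\partial x_i}(c)}{\frac{\partial g}{\partial x_1}(c)} \right) = 0, \quad i = 2, \dots, n. \]
The last equation holds trivially for $i = 1$ as well. Hence we find
\[ \frac{\partial f}{\partial x_i}(c) + \lambda \frac{\partial g}{\partial x_i}(c) = 0, \quad i = 1, \dots, n, \qquad\text{with}\quad \lambda = -\frac{\partial f/\partial x_1}{\partial g/\partial x_1}(c). \]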
Exercises
4.77. Let f (x, y) = (x + y, 2x + ay) = (u, v).
\[ K := \{(x, y) : 0 \le x \le 1,\ 0 \le y \le 1\} \]
x+y
z ,
v=
z+y
x ,
w=
y(x+y+z)
.
xz
(iii) u = x + y + z, v = x2 + y 2 + z 2 , w = x3 + y 3 + z 3 3xyz.
i = 1, . . . , k
=0
C
=0
(iii) $xyz = a^3$ [Solution: $3a^2$ at $(a, a, a)$, $(a, -a, -a)$, $(-a, a, -a)$, $(-a, -a, a)$.]
1
nn
1
(a1 + a2 + . . . + an ),
n
ai 0, i = 1, . . . , n.
=
=
.
fy (, )
gy (, )
Use this to find the shortest distance between the ellipse $x^2 + 2xy + 5y^2 - 16y = 0$ and the line $x + y - 8 = 0$.
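4.92. Let $f$ be a real valued function of class $C^1$ on $\mathbb{R}^3$. Prove that there are at least two points on the sphere $x^2 + y^2 + z^2 = R^2$, $R > 0$, at which the equations
\[ y\frac{\partial f}{\partial x} - x\frac{\partial f}{\partial y} = 0, \qquad z\frac{\partial f}{\partial y} - y\frac{\partial f}{\partial z} = 0, \qquad x\frac{\partial f}{\partial z} - z\frac{\partial f}{\partial x} = 0 \]
are satisfied.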
4.93. (The Hölder and Minkowski inequalities.) Let $p > 1$, $q > 1$ and
xp
p
ap
p
+
+
y
q
1
p
1
q
= 1.
bq
q .
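\[ \sum_{k=1}^{n} a_k b_k \le \Big( \sum_{k=1}^{n} a_k^p \Big)^{1/p} \Big( \sum_{k=1}^{n} b_k^q \Big)^{1/q} \qquad \text{(Hölder's Inequality).} \]
[Hint: Let $A = \big( \sum_{k=1}^{n} a_k^p \big)^{1/p}$ and $B = \big( \sum_{k=1}^{n} b_k^q \big)^{1/q}$ and consider $a = a_k/A$, $b = b_k/B$.]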
\[ \Big( \sum_{k=1}^{n} |a_k + b_k|^p \Big)^{1/p} \le \Big( \sum_{k=1}^{n} |a_k|^p \Big)^{1/p} + \Big( \sum_{k=1}^{n} |b_k|^p \Big)^{1/p} \]
Chapter 5
Further Topics in
Integration
5.1
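\[ A_1 = \begin{pmatrix} I & O & O \\ O & a & O \\ O & O & I \end{pmatrix} \begin{matrix} \\ k\text{-row} \\ \\ \end{matrix} \qquad
A_2 = \begin{pmatrix} I & O & O & O & O \\ O & 1 & O & 1 & O \\ O & O & I & O & O \\ O & 0 & O & 1 & O \\ O & O & O & O & I \end{pmatrix} \begin{matrix} \\ k\text{-row} \\ \\ j\text{-row} \\ \\ \end{matrix} \qquad
A_3 = \begin{pmatrix} I & O & O & O & O \\ O & 0 & O & 1 & O \\ O & O & I & O & O \\ O & 1 & O & 0 & O \\ O & O & O & O & I \end{pmatrix} \begin{matrix} \\ k\text{-row} \\ \\ j\text{-row} \\ \\ \end{matrix}. \]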
Recall the concepts of an interval in $\mathbb{R}^n$, $I = \prod_{j=1}^{n} [a_j, b_j]$; the Jordan content of an interval in $\mathbb{R}^n$, $\mu(I) = \prod_{j=1}^{n} (b_j - a_j)$; and the diameter of an interval $I$ in $\mathbb{R}^n$, $d(I) = \big( \sum_{j=1}^{n} (b_j - a_j)^2 \big)^{1/2}$. A special case of an interval is an $n$-cube.
Definition 5.1. If for the interval $I$, $b_j - a_j = a$, $j = 1, \dots, n$, then $I$ is an $n$-cube of side $a$ and center $\big( \frac{a_1 + b_1}{2}, \dots, \frac{a_n + b_n}{2} \big)$.
Our immediate goal is to look at sets with content and to investigate what
happens to the content after mapping by nice functions. We begin with investigating the content of intervals under linear functions, and get progressively more
sophisticated. This is background for change of variables in integration.
Lemma 5.2. If $I$ is an interval in $\mathbb{R}^n$ and $\varphi : \mathbb{R}^n \to \mathbb{R}^n$ is linear, then
\[ \mu(\varphi(I)) = |J_\varphi|\,\mu(I). \]
Proof. The matrices $A_j$ are the Jacobian matrices for the linear functions $L_j$, $j = 1, 2, 3$, respectively. It is easy to determine what these matrices do to intervals $I$. Only the linear function $L_1$ changes the content by changing the length of one side.
Indeed, both $L_1(I)$ and $L_3(I)$ are again intervals. For $L_1(I)$, the $k$-th interval becomes either $[aa_k, ab_k]$ ($a > 0$), or $[ab_k, aa_k]$ ($a < 0$), and the content is $\mu(L_1(I)) = |a|\,\mu(I)$. For $L_3(I)$, the intervals of the $k$-th and $j$-th coordinates are swapped, and thus, the overall content, namely the product of the lengths of the intervals, remains unchanged.
Since $L_2(I)$ is no longer an interval, we need to use the theory developed in Chapter 3. In passing from $I$ to $L_2(I)$ only the description of the $k$-th coordinate changes:
\[ L_2(I) = \big\{ p = (x_1, \dots, x_n) : a_m \le x_m \le b_m,\ m = 1, \dots, k-1, k+1, \dots, n,\ \ a_k + x_j \le x_k \le b_k + x_j \big\}. \]
Writing $I_{n-2}$ for the product of the remaining $n - 2$ intervals $[a_m, b_m]$, $m \ne j, k$, we find
\[ \mu(L_2(I)) = \int_{I_{n-2}} \int_{a_j}^{b_j} \int_{a_k + x_j}^{b_k + x_j} 1 \, dx_k \, dx_j = (b_j - a_j)(b_k - a_k) \int_{I_{n-2}} 1 = \mu(I) \qquad \text{(Corollary 3.26)}. \]
Thus, the Lemma holds for the elementary linear functions. Since any linear
function is a composition of finitely many elementary linear functions, the differential is the product of the differentials, and hence the Jacobian is the product of the
Jacobians, the general case follows.
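In the plane, Lemma 5.2 says that a linear map multiplies the area of a rectangle by the absolute value of its determinant. The sketch below checks this for a shear (the analogue of $L_2$), a scaling (the analogue of $L_1$) and a general matrix, computing the area of the image parallelogram with the shoelace formula; all function names and the sample rectangle are our own choices.

```python
def apply(A, p):
    """Apply the 2x2 matrix A (nested lists) to the point p."""
    return (A[0][0] * p[0] + A[0][1] * p[1],
            A[1][0] * p[0] + A[1][1] * p[1])

def det2(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def shoelace_area(poly):
    """Area of a simple polygon given by its vertices in order."""
    s = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2

def image_area(A, a1, b1, a2, b2):
    """Area of the image of the interval [a1,b1] x [a2,b2] under A."""
    corners = [(a1, a2), (b1, a2), (b1, b2), (a1, b2)]
    return shoelace_area([apply(A, c) for c in corners])

rect = (0.0, 2.0, 1.0, 4.0)                  # [0,2] x [1,4], content 6
content = (rect[1] - rect[0]) * (rect[3] - rect[2])
shear = [[1.0, 1.0], [0.0, 1.0]]             # x1 -> x1 + x2, determinant 1
scale = [[-2.5, 0.0], [0.0, 1.0]]            # one side scaled by a = -2.5
general = [[2.0, 1.0], [1.0, 3.0]]           # determinant 5
areas = [image_area(A, *rect) for A in (shear, scale, general)]
```

The three areas should equal $|\det A|$ times the content of the rectangle, namely $6$, $15$ and $30$.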
The next three lemmas and Theorem 5.7 that follows are technical in nature.
In view of the last lemma, you will find it easy to accept Theorem 5.7, so on your
first reading, you may proceed directly to the important change of variables theorem,
Theorem 5.8; however, read the statement of Theorem 5.7 first.
Lemma 5.3. Suppose that : G Rn 7 Rn , is C 1 (G) on the open set G. Then
D G,
p m
j=1 Ij ,
u Rn .
p, q Ij , j = 1, . . . , m,
Therefore, the diameter of (Ij ) is at most nM (Ij )1/n , and so we can find an
n-cube Kj with
n
Consequently,
(D) nj=1 Kj
and ((D)
n
X
j=1
n
(Kj ) 2 nM .
and
J (p) 6= 0,
p G.
If D is a compact subset of G and D has content, then (D) is a compact set with
content.
Proof. We know that C(D) and D compact implies (D) is compact (Theorem
2.36). In particular, D D implies that the boundary of (D) is contained in
(D). By Theorem 4.50 and Theorem 4.53, is locally one-to-one on G and maps
open sets onto open sets. Therefore,
both
(G)
and (G)\(D)
(5.1)
The set D has content if and only if the content of its boundary is zero (Theorem 3.14). Therefore
(D) = 0 = (((D))) = 0,
= (((D))) = 0,
= (D)
Lemma 5.3
by (5.1)
n
n
((K))
p K = 1 n <
< 1+ n .
(K)
n-cube of side 2(1 + n)r with center O. Since (K) = (2r)n , this implies
n
n
((K))
< 1+ n .
1 n <
(K)
(K)
p
Figure 5.1. The cube K is mapped toa region (K)
whose boundary is
captured between two cubes of side lengths 1 n and 1 + n.
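We next need a stronger version of Lemma 5.2.
Lemma 5.6. Suppose that $L : \mathbb{R}^n \to \mathbb{R}^n$ is a linear map of full rank, and that $\varphi : G \subset \mathbb{R}^n \to \mathbb{R}^n$, with $G$ open, is in $C^1(G)$ and $J_\varphi(p) \ne 0$ for all $p \in G$. Then for any interval $I \subset G$,
\[ \mu(L(\varphi(I))) = |J_L|\,\mu(\varphi(I)). \]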
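Proof. By Lemma 5.4, $\varphi(I)$ has content. Therefore, for any interval $K$ with $\varphi(I) \subset K$,
\[ \mu(\varphi(I)) = \int_K \chi_{\varphi(I)}. \]
By the Cauchy Criterion for integrals, Corollary 3.7, and by the proper choice of Riemann sums, given any $\varepsilon > 0$, there are finitely many intervals with nonoverlapping interior, $I_1, \dots, I_m, I_{m+1}, \dots, I_N$ such that
\[ \bigcup_{j=1}^{m} I_j \subset \varphi(I) \subset \bigcup_{j=1}^{N} I_j \quad\text{and}\quad \mu(\varphi(I)) - \varepsilon \le \sum_{j=1}^{m} \mu(I_j) \le \mu(\varphi(I)) \le \sum_{j=1}^{N} \mu(I_j) \le \mu(\varphi(I)) + \varepsilon. \]
Then
\[ L\Big( \bigcup_{j=1}^{m} I_j \Big) \subset L(\varphi(I)) \subset L\Big( \bigcup_{j=1}^{N} I_j \Big), \]
\[ \sum_{j=1}^{m} |J_L|\,\mu(I_j) = \mu\Big( L\Big( \bigcup_{j=1}^{m} I_j \Big) \Big) \le \mu(L(\varphi(I))) \le \mu\Big( L\Big( \bigcup_{j=1}^{N} I_j \Big) \Big) = \sum_{j=1}^{N} |J_L|\,\mu(I_j) \]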
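\[ \varphi \in C^1(G) \quad\text{with}\quad J_\varphi(p) \ne 0, \quad p \in G. \]
Then, given $\varepsilon > 0$, there is a $\delta(\varepsilon) > 0$ such that for any $n$-cube, $K$, with center $p \in D$ and side length less than $2\delta$,
\[ |J_\varphi(p)|\,(1 - \varepsilon)^n \le \frac{\mu(\varphi(K))}{\mu(K)} \le |J_\varphi(p)|\,(1 + \varepsilon)^n. \]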
p D
and u Rn . (Why?)
|u|.
M n
((K))
(1 + )n .
(K)
e
But, by Lemma 5.6 applied to e and K p where (u)
= (p + u) (p), and
from the fact that content is invariant under translation, we find
e p)) = | det[Lp ]|((K
e p))
((K)) = (Lp ((K
=
((K))
((K) (p))
=
|J (p)|
|J (p)|
= |J (p)| (1 )n
((K))
|J (p)| (1 + )n .
(K)
Proof. The set $\varphi(D)$ has content by Lemma 5.4. Without loss of generality, we may assume both $f \ge 0$ and $J_\varphi \ge 0$ (WHY?). By Theorem 3.12 both of the integrals in the theorem exist. Therefore, it only remains to show that they are equal.
For any $\varepsilon$, $1 > \varepsilon > 0$, we may choose a partition on $D$ consisting of $n$-cubes $K_j$, $j = 1, \dots, m$, which are sufficiently small so that
(a) $\Big| \int_D (f \circ \varphi) J_\varphi - \sum_{j=1}^{m} (f \circ \varphi)(q_j)\, J_\varphi(p_j)\, \mu(K_j) \Big| < \varepsilon$, for any $q_j \in K_j$ and $p_j$ being the center of $K_j$;
(b) $J_\varphi(p_j)(1 - \varepsilon)^n \le \dfrac{\mu(\varphi(K_j))}{\mu(K_j)} \le J_\varphi(p_j)(1 + \varepsilon)^n$,
(item (a) by the definition of the integral and the continuity of $J_\varphi(p)$, and item (b) by Theorem 5.7).
(D)
j=1
(since is one-to-one)
(Kj )
=
=
m
X
j=1
m
X
pj (Kj )
where qj Kj .
j=1
(Theorem 3.15(e))
m
X
(D)
j=1
Remarks: Strictly speaking, the use of the Mean Value Theorem for Integrals, Theorem 3.15(e), is not applicable to those $K_j$ which intersect the boundary of $D$. But this presents no problem (why?).
You will notice that Theorem 5.8 is much more restrictive than the corresponding result in $\mathbb{R}^1$, Corollary 3.19, which states that
\[ \int_{\varphi(a)}^{\varphi(b)} f = \int_a^b (f \circ \varphi)\,\varphi' \]
without the restrictions that $\varphi' \ne 0$ and $\varphi$ being one-to-one. These restrictions are necessary in $\mathbb{R}^n$ because of considerations derived from the notion of orientation which will be discussed later.
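Example 5.9 Compute the area of the region
\[ \{(x, y) : 0 \le y,\ 0 < r^2 \le x^2 + y^2 \le R^2\} \subset \mathbb{R}^2. \]
\[ \mu(\varphi(D)) = \int_{\varphi(D)} 1 = \int_D |J_\varphi| = \iint_D v \, du \, dv = \pi \int_r^R v \, dv = \frac{\pi}{2}\big( R^2 - r^2 \big). \]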
Figure 5.3. (a) The regions $D$ and $\varphi(D)$ for Example 5.9
Alternatively, you may write this as
\[ \iint_{\varphi(D)} 1 \, dx \, dy = \iint_D \left| \frac{\partial(x, y)}{\partial(u, v)} \right| du \, dv. \]
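Figure 5.4. The region $D$ in the $(\theta, r)$-plane and its image, the inside of a cardioid in the $(x, y)$-plane for Example 5.10.
\[ J_\varphi = \frac{\partial(x, y)}{\partial(r, \theta)} = \begin{vmatrix} \cos(\theta) & -r\sin(\theta) \\ \sin(\theta) & r\cos(\theta) \end{vmatrix} = r. \]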
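Therefore,
\[ \begin{aligned}
\iint_{\varphi(D)} dx \, dy &= \iint_D \left| \frac{\partial(x, y)}{\partial(r, \theta)} \right| dr \, d\theta = \int_0^{2\pi} \int_0^{a + b\cos(\theta)} r \, dr \, d\theta \\
&= \frac12 \int_0^{2\pi} (a + b\cos(\theta))^2 \, d\theta = \frac12 \int_0^{2\pi} \big( a^2 + 2ab\cos(\theta) + b^2\cos^2(\theta) \big) \, d\theta \\
&= \frac12 \int_0^{2\pi} \Big( a^2 + \frac12 b^2 (1 + \cos(2\theta)) \Big) \, d\theta = \pi \Big( a^2 + \frac12 b^2 \Big).
\end{aligned} \]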
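The closed form $\pi(a^2 + \tfrac12 b^2)$ can be compared with a direct numerical evaluation of $\tfrac12\int_0^{2\pi}(a + b\cos\theta)^2\,d\theta$. The sketch below uses the midpoint rule; the sample values of $a$ and $b$ (with $a \ge b > 0$, so that $r = a + b\cos\theta \ge 0$) and the function name are our assumptions.

```python
import math

def polar_region_area(a, b, n=2000):
    """Midpoint-rule value of (1/2) * integral over [0, 2*pi] of (a + b cos t)^2 dt."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        r_max = a + b * math.cos(t)      # outer radius in direction t
        total += 0.5 * r_max * r_max     # inner integral of r dr from 0 to r_max
    return total * h

def closed_form(a, b):
    """The value pi * (a^2 + b^2 / 2) obtained in the text."""
    return math.pi * (a * a + 0.5 * b * b)

numeric_21 = polar_region_area(2.0, 1.0)
numeric_11 = polar_region_area(1.0, 1.0)   # a = b gives the cardioid
```

For a trigonometric polynomial the equally spaced midpoint rule is exact up to rounding, so the numerical and closed-form values should agree to many digits; for $a = b = 1$ the area is $3\pi/2$.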
Figure 5.5. The triangle D in the (u, v) plane is transformed to the inside
of the loop of the curve x3 + y 3 = xy in Example 5.11
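onto $(0, 0)$. If $f$ is continuous on $\varphi(D)$, then
\[ \int_{\varphi(D)} f = \int_D (f \circ \varphi)\,|J_\varphi|. \]
For example, if $f(x, y) = xy$, then $(f \circ \varphi)(u, v) = (u^{2/3}v^{1/3})(u^{1/3}v^{2/3}) = uv$; hence
\[ \int_{\varphi(D)} f = \frac13 \int_0^1 \int_0^{1-v} uv \, du \, dv = \frac16 \int_0^1 v(1 - v)^2 \, dv = \frac{1}{72}. \]
Why is this valid? $\varphi$ is not $C^1$ at $(0, 0)$ and is not one-to-one on the $u$ and $v$ axes.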
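The value $1/72$ and the Jacobian factor $1/3$ can both be checked numerically. The sketch below assumes the transformation is $\varphi(u,v) = (u^{2/3}v^{1/3}, u^{1/3}v^{2/3})$, as suggested by the composition $(f\circ\varphi)(u,v)$ above; the Jacobian is approximated by central differences at an arbitrary interior point and the integral by an iterated midpoint rule.

```python
def phi(u, v):
    """Assumed transformation phi(u, v) = (u^(2/3) v^(1/3), u^(1/3) v^(2/3))."""
    return (u ** (2 / 3) * v ** (1 / 3), u ** (1 / 3) * v ** (2 / 3))

def jacobian_fd(u, v, h=1e-6):
    """Central-difference approximation of the Jacobian determinant of phi."""
    xu = (phi(u + h, v)[0] - phi(u - h, v)[0]) / (2 * h)
    xv = (phi(u, v + h)[0] - phi(u, v - h)[0]) / (2 * h)
    yu = (phi(u + h, v)[1] - phi(u - h, v)[1]) / (2 * h)
    yv = (phi(u, v + h)[1] - phi(u, v - h)[1]) / (2 * h)
    return xu * yv - xv * yu

def integral_over_triangle(n=400):
    """(1/3) * integral of u*v over the triangle u, v >= 0, u + v <= 1."""
    h = 1.0 / n
    total = 0.0
    for j in range(n):
        v = (j + 0.5) * h
        hu = (1.0 - v) / n                 # inner step on [0, 1 - v]
        inner = sum((i + 0.5) * hu * v for i in range(n)) * hu
        total += inner * h
    return total / 3.0

jac = jacobian_fd(0.3, 0.4)
value = integral_over_triangle()
```

The Jacobian should come out as $1/3$ at every interior point, and the integral should agree with $1/72 \approx 0.013889$ to within the discretization error.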
The first three examples illustrated how the change of variable formula may
be used to simplify the region of integration. It may also be used to simplify the
integrand, which was its basic role in one-dimension.
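Figure 5.6. The tip down triangle, $D$, in the $(u, v)$-plane is transformed to the right triangle $D^*$ in Example 5.12.
\[ D = \varphi^{-1}(D^*), \qquad D^* = \varphi(D), \qquad (x, y) = \Big( \frac{u + v}{2}, \frac{v - u}{2} \Big) = \varphi(u, v), \]
\[ J_\varphi = \frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} \frac12 & \frac12 \\ -\frac12 & \frac12 \end{vmatrix} = \frac12. \]
Alternatively,
\[ J_{\varphi^{-1}} = \frac{\partial(u, v)}{\partial(x, y)} = \begin{vmatrix} 1 & -1 \\ 1 & 1 \end{vmatrix} = 2 \quad\Longrightarrow\quad J_\varphi = \frac12, \]
\[ \begin{aligned}
\iint_{D^*} \exp\Big( \frac{x - y}{x + y} \Big) dx \, dy &= \iint_D \exp\Big( \frac{u}{v} \Big) \frac12 \, du \, dv = \frac12 \int_0^1 \int_{-v}^{v} \exp\Big( \frac{u}{v} \Big) du \, dv \\
&= \frac12 \int_0^1 \Big[ v e^{u/v} \Big]_{u=-v}^{u=v} dv = \frac12 \int_0^1 v \Big( e - \frac1e \Big) dv = \frac14 \Big( e - \frac1e \Big) = \frac12 \sinh(1).
\end{aligned} \]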
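As an independent check of the value $\tfrac12\sinh(1)$, the integral can be approximated directly in the original $(x, y)$ variables, without any change of variables. The sketch below assumes $D^*$ is the right triangle $x \ge 0$, $y \ge 0$, $x + y \le 1$ and uses an iterated midpoint rule; because the integrand is bounded but not continuous at the origin, only modest accuracy should be expected.

```python
import math

def integrand(x, y):
    """exp((x - y)/(x + y)), defined for x + y > 0."""
    return math.exp((x - y) / (x + y))

def triangle_integral(n=400):
    """Iterated midpoint rule over the triangle x, y >= 0, x + y <= 1."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        hy = (1.0 - x) / n                 # inner step on [0, 1 - x]
        inner = 0.0
        for j in range(n):
            y = (j + 0.5) * hy
            inner += integrand(x, y)
        total += inner * hy * h
    return total

numeric = triangle_integral()
exact = 0.5 * math.sinh(1.0)               # equals (e - 1/e)/4
```

The two numbers should agree to a few decimal places, which is some reassurance that the factor $|J_\varphi| = \tfrac12$ and the limits $-v \le u \le v$ were handled correctly.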
See also the examples in Buck pp 306-311 and the exercises on p311-313. The
particular exercises 10-12 indicate a proof of Theorem 5.8 based on the Implicit
Function Theorem when n = 2.
Exercises
5.1. For f1 (x, y) =
Z
f1
x2
a2
and
y2
b2
1
(1 + x2 + y 2 )2
and (ii) D2 is the triangle with vertices (0, 0), (2, 0), (1, 3). [Solution: 4 12
and 23 arctan(1/2).]
5.7. Show that
\[ \iiint_K |xyz| \, dx \, dy \, dz = \frac{a^2b^2c^2}{6} \quad\text{where}\quad K := \Big\{ (x, y, z) : \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \le 1 \Big\}. \]
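5.8. Let $\varphi(x, y) = Ax^2 + 2Hxy + By^2 + 2Gx + 2Fy + C$.
(i) Show
\[ \iint_K \varphi = \frac14 \pi ab(Aa^2 + Bb^2 + 4C) \quad\text{where}\quad K := \Big\{ (x, y) : \frac{x^2}{a^2} + \frac{y^2}{b^2} \le 1 \Big\}. \]
(iii) Show $\iint_S (2x^2 + y^2 + 3x - 2y + 4) \, dx \, dy = \dfrac{57\pi}{2}$ where $S$ is inside $x^2 + 4y^2 - 2x + 8y + 1 = 0$.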
log(x2 + y 2 ) dx dy
R
5.10. Show that D x3 y 2 dx dy = 248 52 , where D is the region bounded by the lines
y = 3(x 2), y = 3(x 4). [Hint: Move the origin to the center of the
region and use symmetry as much as you can.]
5.11. Show that the area of the region in the first quadrant bounded by
y 3 = a1 x2 ,
2
b31 ,
y 3 = a2 x2 ,
2
a1 > a2 > 0,
and
b32 ,
xy =
b1 > b2 0, is
7 15/7
1/7
1/7
15/7
.
Area =
a2
a1
b1 b2
5
xy =
5.12. Show that the area of the region bounded by the loop of the curve x3 + y 3 =
3axy is 3a2 /2.
5.13. The transformation x = um (v), y = un (v) maps the rectangle [u1 , u2 ]
[v1 , v2 ] into a region in the (x, y)-plane. The area of this region is
Rv
n
(um+n
if m 6= n and
um+n
) v12 mm+n
,
1
2
if m = n.
What conditions must and satisfy if this statement is correct? Prove the
statement under your conditions.
5.14. Find $\iiint_D xyz \, dx \, dy \, dz$ where $D$ is the region in the first octant bounded by the cylinder $x^2 + y^2 = 16$ and the plane $z = 3$, that is,
\[ D = \{(x, y, z) : 0 \le x,\ 0 \le y,\ 0 \le z \le 3,\ x^2 + y^2 \le 16\}. \]
= 1 (p.q > 0)
p
q
and (x, y) = ax2 + 2hxy + by 2 + 2gx + 2f y + c.
5.17. Show the following:
(i) If $X = x + y + z + u$, $XY = y + z + u$, $XYZ = z + u$, $XYZU = u$, show that
\[ \frac{\partial(x, y, z, u)}{\partial(X, Y, Z, U)} = X^3Y^2Z. \]
(ii) Show that
\[ \int_0^1 \int_0^{1-u} \int_0^{1-z-u} \int_0^{1-y-z-u} (x + y + z + u)^n \, xyzu \, dx \, dy \, dz \, du
= \int_0^1 X^{n+7} dX \int_0^1 Y^5(1 - Y) \, dY \int_0^1 Z^3(1 - Z) \, dZ \int_0^1 U(1 - U) \, dU = \frac{1}{(n + 8)\,7!}. \]
5.2
$\int_0^1 x^x \, dx$ or $\int_0^1 \int_0^1 (xy)^{xy} \, dx \, dy$?
Figure 5.7. The function on the right has a graph consisting of isosceles triangles with base on $[1/(k + 1), 1/k]$, $k = 1, 2, \dots$, alternating up and down. Since the height is one, the length of the graph of each triangle is greater than 2. Hence, the total length cannot be finite. The graph on the left is piecewise smooth and intuitively does have length.
segments piecewise.
5.2.1
Curves (1-surfaces)
n 1.
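\[ \gamma'(t) = \begin{pmatrix} x_1'(t) \\ \vdots \\ x_n'(t) \end{pmatrix}, \qquad \gamma \text{ smooth} \iff \sum_{i=1}^{n} x_i'(t)^2 > 0, \quad t \in U. \]
(iii) Two smooth curves $\gamma$ and $\tilde\gamma$ on $U$ and $\tilde U$ respectively are called parametrically equivalent if there is a function $f : U \to \tilde U$, $f \in C^1(U)$, such that $f'(t) > 0$ and $\gamma(t) = \tilde\gamma(f(t))$ for all $t \in U$.
(iv) $\gamma(U)$ is the trace of the curve in $\mathbb{R}^n$.
(v) A curve $\gamma : U \to \mathbb{R}^n$ is piecewise smooth if
(a) $\gamma$ is continuous on $U$
(b) Each compact subinterval of $U$ is the union of a finite number of intervals $I$ such that $\gamma : I \to \mathbb{R}^n$ is a smooth curve and $|\gamma'|$ is Riemann integrable on $I$.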
We shall refer informally to the curve $\gamma$ and its trace as the curve. We will see that this is unambiguous in the integration theory for parametrically equivalent curves. It is perhaps even more precise to consider curves as equivalence classes of functions $\gamma : \mathbb{R} \to \mathbb{R}^n$, the equivalence relation being parametric equivalence. However, you might consider this idea of a curve as too eccentric on your first encounter.
Example 5.14 A line in $\mathbb{R}^3$ through a point $p_0 = (x_0, y_0, z_0)$ may be parametrized by
\[ \gamma : \begin{cases} x = x_0 + at, \\ y = y_0 + bt, \\ z = z_0 + ct, \end{cases} \quad t \in \mathbb{R}, \qquad \gamma'(t) = \begin{pmatrix} a \\ b \\ c \end{pmatrix}. \]
The numbers
\[ (\ell, m, n) = \frac{1}{\sqrt{a^2 + b^2 + c^2}}\,(a, b, c), \quad\text{with}\quad \ell^2 + m^2 + n^2 = 1, \]
are the direction cosines of $\gamma$. The triple $(\ell, m, n)$ are the cosines of the angles determined by the line and the directions $(1, 0, 0)$, $(0, 1, 0)$, $(0, 0, 1)$ respectively.
A line is determined by (i) a point $(x_0, y_0, z_0)$ and a direction $(a, b, c)$, or (ii) two points $(x_0, y_0, z_0)$ and $(x_1, y_1, z_1)$. In the latter case, a direction is determined as $(a, b, c) = (x_1 - x_0, y_1 - y_0, z_1 - z_0)$.
Figure 5.8. The line through p0 = (x0 , y0 , z0 ) in the direction (a, b, c).
The direction $(a, b, c)$ is orthogonal to $(\lambda, \mu, \nu)$ if
\[ 0 = (a, b, c) \cdot (\lambda, \mu, \nu) = a\lambda + b\mu + c\nu. \]
The set of points $(x, y, z)$ in $\mathbb{R}^3$ such that $(x - x_0, y - y_0, z - z_0)$ is orthogonal to $(a, b, c)$ is a plane through $(x_0, y_0, z_0)$ with normal $(a, b, c)$. The equation of this plane is
\[ 0 = (x - x_0, y - y_0, z - z_0) \cdot (a, b, c) \quad\text{or}\quad a(x - x_0) + b(y - y_0) + c(z - z_0) = 0. \]
\[ \gamma : \begin{cases} x = t, \\ y = t^2, \end{cases} \quad 0 < t < 1, \qquad \gamma'(t) = \begin{pmatrix} 1 \\ 2t \end{pmatrix}. \]
Figure 5.10. A parametrization of the parabola in Example 5.15.
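Example 5.16 Another example we've seen before is the helix:
\[ \gamma : \begin{cases} x = \cos(t) \\ y = \sin(t), \\ z = t \end{cases} \quad t \in \mathbb{R}, \qquad \gamma'(t) = \begin{pmatrix} -\sin(t) \\ \cos(t) \\ 1 \end{pmatrix}. \]
Again $x'(t)^2 + y'(t)^2 + z'(t)^2 = 2 > 0$, so $\gamma$ is a smooth curve.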
\[ \gamma : \begin{cases} x = x(t) \\ y = y(t) \\ z = z(t) \end{cases}, \ \text{is the curve} \quad \begin{cases} x = x_0 + x'(t_0)(t - t_0) \\ y = y_0 + y'(t_0)(t - t_0) \\ z = z_0 + z'(t_0)(t - t_0) \end{cases}, \]
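\[ \frac{1}{\sqrt{x'(t_0)^2 + y'(t_0)^2 + z'(t_0)^2}}\,\big( x'(t_0), y'(t_0), z'(t_0) \big). \]
\[ [\, x'(t_0) \ \ y'(t_0) \ \ z'(t_0) \,] \begin{pmatrix} x - x_0 \\ y - y_0 \\ z - z_0 \end{pmatrix} = 0. \]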
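Remark: Two parametrically equivalent curves have the same direction. That is, if $\gamma(t) = \tilde\gamma(f(t))$ with $f' > 0$, then
\[ \frac{\gamma'(t)}{|\gamma'(t)|} = \frac{\tilde\gamma'(f(t))\,f'(t)}{|\tilde\gamma'(f(t))\,f'(t)|} = \frac{\tilde\gamma'(f(t))}{|\tilde\gamma'(f(t))|}. \]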
Definition 5.18. Let $[a, b] \subset U \subset \mathbb{R}$ and let $\gamma : U \to \mathbb{R}^n$ be a curve.
(ii) The curve $\gamma$ is piecewise smooth on $U$ if for any $[a, b]$ there are nonoverlapping intervals of $[a, b]$, $[a_i, b_i]$, such that $\gamma$ is smooth on $(a_i, b_i)$. In this case,
\[ \ell(\gamma([a, b])) = \sum_i \ell(\gamma([a_i, b_i])). \]
(f (t)) f (t) dt
(t) dt =
( ) =
a
b
| (t)| dt.
\[ \int_a^b \sqrt{1 + f'(t)^2} \, dt. \]
NOTE: $\ell(\gamma([0, 4\pi])) = 2\,\ell(\gamma([0, 2\pi]))$ even though these 2 curves have the same trace.
Motivation and Remarks: By way of motivation for the definition of the length of a curve, we offer the following:
(a) The length of the line segment $p(t) = p_0 + q_0t$ for $t \in [t_1, t_2]$ is
\[ |p(t_2) - p(t_1)| = |q_0(t_2 - t_1)| = \int_{t_1}^{t_2} |p'(t)| \, dt. \]
k=1
| (k )|(tk tk1 ),
Rb
a
| (t)| dt.
Figure 5.13. The two means to approximate length, length of tangent lines (top), and lengths of secant lines (bottom).
\[ \sup \Big\{ \sum_{k=1}^{m} |\gamma(t_k) - \gamma(t_{k-1})| : (t_k) \text{ partitions of } [a, b] \Big\}. \]
It is not difficult to see by the Mean Value Theorem, that this is equivalent to the definition $\ell(\gamma([a, b])) = \int_a^b |\gamma'(t)| \, dt$ if $\gamma$ is $C^1$. However, this alternative definition is more general in that it pertains to any curve for which the supremum exists as a real number; in particular, $\gamma$ need not be $C^1$.
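The agreement between the two notions of length can be seen on the helix $(\cos t, \sin t, t)$, for which $|\gamma'(t)| = \sqrt2$. The sketch below compares the length of an inscribed polygon (a sum of secant lengths over a uniform partition, our choice of partition) with the integral $\int_a^b |\gamma'(t)|\,dt = \sqrt2\,(b - a)$; the function names are hypothetical.

```python
import math

def helix(t):
    """The helix gamma(t) = (cos t, sin t, t)."""
    return (math.cos(t), math.sin(t), t)

def polygon_length(curve, a, b, m):
    """Sum of secant lengths |gamma(t_k) - gamma(t_{k-1})| for a uniform partition."""
    total = 0.0
    prev = curve(a)
    for k in range(1, m + 1):
        cur = curve(a + (b - a) * k / m)
        total += math.sqrt(sum((c - p) ** 2 for c, p in zip(cur, prev)))
        prev = cur
    return total

def integral_length(a, b):
    """Integral of |gamma'(t)| = sqrt(2) over [a, b]."""
    return math.sqrt(2.0) * (b - a)

coarse = polygon_length(helix, 0.0, 2 * math.pi, 10)
fine = polygon_length(helix, 0.0, 2 * math.pi, 1000)
exact = integral_length(0.0, 2 * math.pi)
two_turns = polygon_length(helix, 0.0, 4 * math.pi, 2000)
```

Every inscribed polygon is shorter than the curve, and refining the partition increases the polygonal length toward $2\pi\sqrt2$, as the supremum definition suggests.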
5.2.2
Surfaces (2-surfaces)
n 2.
.. and is smooth
> 0.
= ...
.
(u, v)
x
x
i,j=1
n
(iii) Two smooth surfaces $\sigma$ and $\tilde\sigma$ on $U$ and $\tilde U$ respectively are called parametrically equivalent if there is a function $f : U \to \tilde U$ with $f \in C^1(U)$, $J_f(p) > 0$ for all $p \in U$, $f$ is one-to-one from $U$ onto $\tilde U$ and $\sigma(p) = \tilde\sigma(f(p))$ for all $p \in U$.
(iv) $\sigma(U)$ is called the trace of the surface in $\mathbb{R}^n$. Again, we will not worry excessively about distinguishing between a surface and its trace.
\[ \left( \sum_{i,j=1}^{n} \Big( \frac{\partial(x_i, x_j)}{\partial(u, v)} \Big)^2 \right)^{1/2} \]
is Riemann integrable on $D$.
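\[ \sigma : \begin{cases} x = u + v \\ y = u - v, \\ z = u \end{cases} \quad (u, v) \in \mathbb{R}^2, \qquad \sigma' = \begin{pmatrix} 1 & 1 \\ 1 & -1 \\ 1 & 0 \end{pmatrix}, \]
describes a smooth plane since rank $\sigma' = 2$, which may also be described as the solution set of
\[ x + y - 2z = 0, \quad\text{or}\quad [\, 1 \ \ 1 \ \ {-2} \,] \begin{pmatrix} x \\ y \\ z \end{pmatrix} = 0. \]
From the latter equation, we see that the direction numbers of the normal vector are $(1, 1, -2)$. Notice also that these direction numbers are in fact given by
\[ \Big( \frac{\partial(y, z)}{\partial(u, v)},\ \frac{\partial(z, x)}{\partial(u, v)},\ \frac{\partial(x, y)}{\partial(u, v)} \Big). \]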
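The three Jacobian determinants are simply the $2 \times 2$ minors of $\sigma'$, that is, the components of the cross product of its two columns. A small sketch, assuming the parametrization $x = u + v$, $y = u - v$, $z = u$ (the minus sign is inferred from the plane equation); the helper name `normal_from_jacobian` is ours.

```python
def normal_from_jacobian(J):
    """Given the 3x2 matrix J = [(x_u, x_v), (y_u, y_v), (z_u, z_v)], return
    (d(y,z)/d(u,v), d(z,x)/d(u,v), d(x,y)/d(u,v))."""
    (xu, xv), (yu, yv), (zu, zv) = J
    return (yu * zv - yv * zu,
            zu * xv - zv * xu,
            xu * yv - xv * yu)

# Jacobian of sigma(u, v) = (u + v, u - v, u).
J = [(1, 1), (1, -1), (1, 0)]
n = normal_from_jacobian(J)

# The normal must be orthogonal to both columns of J (the tangent vectors).
col_u = tuple(row[0] for row in J)
col_v = tuple(row[1] for row in J)
dot_u = sum(a * b for a, b in zip(n, col_u))
dot_v = sum(a * b for a, b in zip(n, col_v))

def on_plane(u, v):
    """Check that sigma(u, v) satisfies x + y - 2z = 0."""
    x, y, z = u + v, u - v, u
    return x + y - 2 * z == 0
```

With this parametrization the computation gives $n = (1, 1, -2)$, matching the coefficients of the plane equation $x + y - 2z = 0$.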
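\[ \sigma : \begin{cases} x = \cos(\theta)\sin(\varphi) \\ y = \sin(\theta)\sin(\varphi), \\ z = \cos(\varphi) \end{cases} \quad (\theta, \varphi) \in \mathbb{R}^2. \]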
Example 5.24 In Example 5.22 we had a particular plane through the origin in
R3 . In this example, we take a longer look at the general form for a plane. Let
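$(x_0, y_0, z_0)$, $(a_1, a_2, a_3)$ and $(b_1, b_2, b_3)$ be given triples of real numbers, and consider the parametrization
\[ \sigma : \begin{cases} x = x_0 + a_1u + b_1v \\ y = y_0 + a_2u + b_2v, \\ z = z_0 + a_3u + b_3v \end{cases} \quad (u, v) \in \mathbb{R}^2, \quad\text{or}\quad
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x_0 \\ y_0 \\ z_0 \end{pmatrix} + \begin{pmatrix} a_1 & b_1 \\ a_2 & b_2 \\ a_3 & b_3 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}. \]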
a
b3
+ e3 1
a2
b1
b1
.
b2
The best affine approximation to any surface (u, v) for (u, v) near (u0 , v0 ) is
(u0 , v0 ) + D(u0 , v0 )(u u0 , v v0 ); hence the following definition.
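Definition 5.25. For 2-surfaces
(i) the tangent plane for a smooth surface $\sigma$ at the point $\sigma(u_0, v_0)$ is
\[ p(u, v) = \sigma(u_0, v_0) + D\sigma(u_0, v_0)(u - u_0, v - v_0). \]
For example, in $\mathbb{R}^3$, the tangent to the surface
\[ \sigma:\ \begin{cases} x = x(u, v) \\ y = y(u, v) \\ z = z(u, v) \end{cases} \quad \text{is} \quad \begin{cases} x = x_0 + \dfrac{\partial x}{\partial u}\Big|_{p_0} (u - u_0) + \dfrac{\partial x}{\partial v}\Big|_{p_0} (v - v_0) \\[1ex] y = y_0 + \dfrac{\partial y}{\partial u}\Big|_{p_0} (u - u_0) + \dfrac{\partial y}{\partial v}\Big|_{p_0} (v - v_0) \\[1ex] z = z_0 + \dfrac{\partial z}{\partial u}\Big|_{p_0} (u - u_0) + \dfrac{\partial z}{\partial v}\Big|_{p_0} (v - v_0), \end{cases} \]
where $p_0 = (u_0, v_0)$, $x_0 = x(u_0, v_0)$, $y_0 = y(u_0, v_0)$, $z_0 = z(u_0, v_0)$. As in Example 5.24, this is the plane
\[ \frac{\partial(y, z)}{\partial(u, v)}\Big|_{p_0} (x - x_0) + \frac{\partial(z, x)}{\partial(u, v)}\Big|_{p_0} (y - y_0) + \frac{\partial(x, y)}{\partial(u, v)}\Big|_{p_0} (z - z_0) = 0. \]
(ii) the normal line to the surface at $p_0$ in $\mathbb{R}^3$ has the direction numbers
\[ n = \left( \frac{\partial(y, z)}{\partial(u, v)}\Big|_{p_0},\ \frac{\partial(z, x)}{\partial(u, v)}\Big|_{p_0},\ \frac{\partial(x, y)}{\partial(u, v)}\Big|_{p_0} \right). \]
The unit normal direction is $n_1 = n/|n|$.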
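The tangent plane and unit normal of the definition can be approximated numerically. The sketch below is one possible illustration in Python, not part of the text: the sample surface `sigma` (a paraboloid), the function names, and the use of central differences in place of exact partial derivatives are all assumptions made for the example.

```python
import math

def sigma(u, v):
    # Hypothetical sample surface (a paraboloid), chosen only for illustration.
    return (u, v, u * u + v * v)

def partials(surface, u0, v0, h=1e-5):
    """Central-difference estimates of the two columns of D sigma at (u0, v0)."""
    up, um = surface(u0 + h, v0), surface(u0 - h, v0)
    vp, vm = surface(u0, v0 + h), surface(u0, v0 - h)
    s_u = tuple((p - m) / (2 * h) for p, m in zip(up, um))
    s_v = tuple((p - m) / (2 * h) for p, m in zip(vp, vm))
    return s_u, s_v

def tangent_plane(surface, u0, v0):
    """Return p(u, v) = sigma(p0) + D sigma(p0) (u - u0, v - v0)."""
    base = surface(u0, v0)
    s_u, s_v = partials(surface, u0, v0)
    def p(u, v):
        return tuple(b + a * (u - u0) + c * (v - v0)
                     for b, a, c in zip(base, s_u, s_v))
    return p

def unit_normal(surface, u0, v0):
    """n / |n| with n = (d(y,z)/d(u,v), d(z,x)/d(u,v), d(x,y)/d(u,v)) at p0."""
    (xu, yu, zu), (xv, yv, zv) = partials(surface, u0, v0)
    n = (yu * zv - zu * yv, zu * xv - xu * zv, xu * yv - yu * xv)
    length = math.sqrt(sum(c * c for c in n))
    return tuple(c / length for c in n)

p = tangent_plane(sigma, 1.0, 1.0)
n1 = unit_normal(sigma, 1.0, 1.0)
```

For this sample surface at $(u_0, v_0) = (1, 1)$ the exact values are $n = (-2, -2, 1)$ and $n_1 = (-2/3, -2/3, 1/3)$, which the numerical result should reproduce up to rounding.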
Question: In R4 , what corresponds to the normal line discussed above for the
surface in R3 ? If you cannot figure it out, ask.
Exercises
5.19. Show that two parametrically equivalent surfaces in R3 have the same unit
normal n1 .
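5.20. Let $\sigma : \mathbb{R}^2 \to \mathbb{R}^n$, $(x_i = x_i(u, v),\ i = 1, \ldots, n)$, be a smooth surface. Show
that the vectors
\[ \mathbf{u} = \left( \frac{\partial x_1}{\partial u}, \ldots, \frac{\partial x_n}{\partial u} \right)\Big|_{p_0}, \qquad \mathbf{v} = \left( \frac{\partial x_1}{\partial v}, \ldots, \frac{\partial x_n}{\partial v} \right)\Big|_{p_0} \]
are tangent vectors to certain smooth curves in $\sigma$. Check that the condition
rank $\sigma'(p_0) = 2$ simply requires that $\mathbf{u}$ and $\mathbf{v}$ be linearly independent. Check
that in the case $n = 3$, $\mathbf{n} = \mathbf{u} \times \mathbf{v}$.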
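If
\[ \sigma:\ \begin{cases} x = x(u, v) \\ y = y(u, v) \\ z = z(u, v) \end{cases} \quad (u, v) \in U, \]
then
\[ A(\sigma(D)) := \int_D |n| \, du \, dv = \int_D \sqrt{ \left( \frac{\partial(y, z)}{\partial(u, v)} \right)^2 + \left( \frac{\partial(z, x)}{\partial(u, v)} \right)^2 + \left( \frac{\partial(x, y)}{\partial(u, v)} \right)^2 } \, du \, dv. \]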
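Then
\[ \sigma:\ \begin{cases} x = u \\ y = v \\ z = f(u, v) \end{cases} \quad \text{that is} \quad z = f(x, y), \quad f \in C^1(D), \]
and
\[ \sigma' = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ f_u & f_v \end{pmatrix}, \qquad \text{rank } \sigma' = 2. \]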
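For a graph $z = f(x, y)$ the three Jacobians of the parametrization are $-f_u$, $-f_v$ and $1$, so $|n| = \sqrt{1 + f_u^2 + f_v^2}$. The following Python sketch estimates the resulting area integral with a midpoint rule over a rectangle; the function name `graph_area`, the choice of quadrature, and the sample plane are assumptions made for illustration only.

```python
import math

def graph_area(f_u, f_v, u_range, v_range, steps=200):
    """Midpoint-rule estimate of the area of z = f(u, v) over a rectangle.

    f_u and f_v are the partial derivatives of f, supplied by the caller.
    """
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / steps, (v1 - v0) / steps
    total = 0.0
    for i in range(steps):
        u = u0 + (i + 0.5) * du
        for j in range(steps):
            v = v0 + (j + 0.5) * dv
            # |n| = sqrt(1 + f_u^2 + f_v^2) for a graph
            total += math.sqrt(1.0 + f_u(u, v) ** 2 + f_v(u, v) ** 2)
    return total * du * dv

# Hypothetical test case: the plane z = 2u + 2v over the unit square,
# where |n| = sqrt(1 + 4 + 4) = 3 everywhere, so the area is exactly 3.
area = graph_area(lambda u, v: 2.0, lambda u, v: 2.0, (0.0, 1.0), (0.0, 1.0))
```

A plane is used as the check because its integrand is constant, so the midpoint rule is exact apart from floating-point rounding.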
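Example 5.28 The surface area of a sphere of radius $a$ is easily computed when
the sphere is described as the surface
\[ \sigma:\ \begin{cases} x = a \sin(\varphi) \cos(\theta) \\ y = a \sin(\varphi) \sin(\theta) \\ z = a \cos(\varphi) \end{cases} \quad \text{on } D = \{ (\theta, \varphi) : 0 \le \theta \le 2\pi,\ 0 \le \varphi \le \pi \}. \]
Then $x^2 + y^2 + z^2 = a^2$ can easily be checked, and
\[ \sigma' = \begin{pmatrix} -a \sin(\varphi) \sin(\theta) & a \cos(\varphi) \cos(\theta) \\ a \sin(\varphi) \cos(\theta) & a \cos(\varphi) \sin(\theta) \\ 0 & -a \sin(\varphi) \end{pmatrix}. \]
It follows that
\[ |n|^2 = a^4 \sin^4(\varphi) \cos^2(\theta) + a^4 \sin^4(\varphi) \sin^2(\theta) + a^4 \left( \sin(\varphi) \cos(\varphi) \sin^2(\theta) + \cos(\varphi) \sin(\varphi) \cos^2(\theta) \right)^2 = a^4 \sin^4(\varphi) + a^4 \sin^2(\varphi) \cos^2(\varphi), \]
and
\[ A(\sigma(D)) = \int_D |n| = \int_0^{2\pi} \int_0^{\pi} \sqrt{ a^4 \sin^2(\varphi) \cos^2(\varphi) + a^4 \sin^4(\varphi) } \, d\varphi \, d\theta = \int_0^{2\pi} \int_0^{\pi} a^2 \sin(\varphi) \, d\varphi \, d\theta = -2\pi a^2 \cos(\varphi) \Big|_0^{\pi} = 4\pi a^2. \]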
Figure 5.14. The sphere in Example 5.28.
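The simplification $|n| = a^2 \sin(\varphi)$ used in the sphere computation can be spot-checked numerically. The Python sketch below is an illustration under stated assumptions: the parameter order $(\theta, \varphi)$, the function names, and the midpoint rule are choices made here, not taken from the text.

```python
import math

def sphere_minors(a, theta, phi):
    """The three Jacobians of the sphere parametrization w.r.t. (theta, phi)."""
    x_t, x_p = -a * math.sin(phi) * math.sin(theta), a * math.cos(phi) * math.cos(theta)
    y_t, y_p = a * math.sin(phi) * math.cos(theta), a * math.cos(phi) * math.sin(theta)
    z_t, z_p = 0.0, -a * math.sin(phi)
    return (y_t * z_p - z_t * y_p,    # d(y,z)/d(theta,phi)
            z_t * x_p - x_t * z_p,    # d(z,x)/d(theta,phi)
            x_t * y_p - y_t * x_p)    # d(x,y)/d(theta,phi)

def norm_n(a, theta, phi):
    """|n| computed from the three minors."""
    return math.sqrt(sum(m * m for m in sphere_minors(a, theta, phi)))

def sphere_area(a, steps=400):
    """Midpoint rule in phi for 2*pi times the integral of a^2 sin(phi) on [0, pi]."""
    dphi = math.pi / steps
    inner = sum(a * a * math.sin((k + 0.5) * dphi) for k in range(steps)) * dphi
    return 2 * math.pi * inner
```

Comparing `norm_n(a, theta, phi)` with `a * a * math.sin(phi)` at a few points with $0 < \varphi < \pi$, and `sphere_area(a)` with $4\pi a^2$, gives a quick consistency check on the algebra.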
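Example 5.29 The surface area of a torus. A torus can be parametrized by
\[ \sigma:\ \begin{cases} x = (R - a \cos(\varphi)) \cos(\theta) \\ y = (R - a \cos(\varphi)) \sin(\theta) \\ z = a \sin(\varphi) \end{cases} \quad \text{on } D : \begin{cases} 0 \le \theta \le 2\pi \\ 0 \le \varphi \le 2\pi \end{cases}, \quad 0 < a < R, \]
and
\[ \sigma' = \begin{pmatrix} -(R - a \cos(\varphi)) \sin(\theta) & a \sin(\varphi) \cos(\theta) \\ (R - a \cos(\varphi)) \cos(\theta) & a \sin(\varphi) \sin(\theta) \\ 0 & a \cos(\varphi) \end{pmatrix}. \]
Then
\[ |n|^2 = a^2 (R - a \cos(\varphi))^2 \left( \cos^2(\varphi) \cos^2(\theta) + \cos^2(\varphi) \sin^2(\theta) + \sin^2(\varphi) \right) = a^2 (R - a \cos(\varphi))^2. \]
Therefore,
\[ A(\sigma(D)) = \int_D |n| = \int_0^{2\pi} \int_0^{2\pi} a (R - a \cos(\varphi)) \, d\varphi \, d\theta = 2\pi a \int_0^{2\pi} (R - a \cos(\varphi)) \, d\varphi = (2\pi)^2 a R. \]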
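The area integral $\int_D |n| \, du \, dv$ can also be estimated without computing the Jacobians by hand. The Python sketch below is a rough illustration under stated assumptions: partial derivatives are replaced by central differences, the integral by a midpoint rule on a rectangle, and the names `surface_area` and `torus` as well as the sample values of $R$ and $a$ are hypothetical.

```python
import math

def surface_area(sigma, u_range, v_range, steps=100, h=1e-6):
    """Midpoint-rule estimate of the integral of |n| du dv over a rectangle D.

    The partial derivatives of sigma are approximated by central differences.
    """
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / steps, (v1 - v0) / steps
    total = 0.0
    for i in range(steps):
        u = u0 + (i + 0.5) * du
        for j in range(steps):
            v = v0 + (j + 0.5) * dv
            xu, yu, zu = ((p - m) / (2 * h)
                          for p, m in zip(sigma(u + h, v), sigma(u - h, v)))
            xv, yv, zv = ((p - m) / (2 * h)
                          for p, m in zip(sigma(u, v + h), sigma(u, v - h)))
            n = (yu * zv - zu * yv, zu * xv - xu * zv, xu * yv - yu * xv)
            total += math.sqrt(n[0] ** 2 + n[1] ** 2 + n[2] ** 2)
    return total * du * dv

def torus(R, a):
    """Parametrization of the torus with 0 < a < R; arguments are (theta, phi)."""
    def sigma(theta, phi):
        return ((R - a * math.cos(phi)) * math.cos(theta),
                (R - a * math.cos(phi)) * math.sin(theta),
                a * math.sin(phi))
    return sigma

R, a = 3.0, 1.0   # sample values, chosen only for illustration
area = surface_area(torus(R, a), (0.0, 2 * math.pi), (0.0, 2 * math.pi))
exact = (2 * math.pi) ** 2 * a * R
```

With these sample values `area` should agree with `exact` to several decimal places, since the integrand $a(R - a\cos(\varphi))$ is a trigonometric polynomial over a full period.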
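As motivation for the definition of $A(\sigma)$ we offer the following two ideas.
(a) The area of a plane segment (see the figure): if the segment has area $A$ and unit normal $(l, m, n)$ making angles $\theta_1, \theta_2, \theta_3$ with the coordinate axes, then its projections onto the coordinate planes have areas
\[ A_{yz} = A \cos(\theta_1) = A l, \qquad A_{zx} = A \cos(\theta_2) = A m, \qquad A_{xy} = A \cos(\theta_3) = A n, \]
so that
\[ A_{yz}^2 + A_{zx}^2 + A_{xy}^2 = A^2 (l^2 + m^2 + n^2) = A^2. \tag{5.2} \]
Thus, $A^2$ is the sum of the squares of the areas of the projections onto the
coordinate planes (yeah Pythagoras!).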
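Consider the affine function $\sigma : \mathbb{R}^2 \to \mathbb{R}^3$ and the projections $\sigma_{yz}$, $\sigma_{zx}$, $\sigma_{xy}$
onto the coordinate planes in $\mathbb{R}^3$:
\[ \sigma:\ \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x_0 \\ y_0 \\ z_0 \end{pmatrix} + \begin{pmatrix} a_1 & b_1 \\ a_2 & b_2 \\ a_3 & b_3 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}, \]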
[Figure: a plane segment of area $A$ with normal $n = (l, m, n)$ and its projection of area $A_{xy}$.]
Figure 5.16. The plane segment area and its projection onto the xy-plane.
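\[ \sigma_{yz}:\ \begin{pmatrix} y \\ z \end{pmatrix} = \begin{pmatrix} y_0 \\ z_0 \end{pmatrix} + \begin{pmatrix} a_2 & b_2 \\ a_3 & b_3 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}, \]
\[ \sigma_{zx}:\ \begin{pmatrix} z \\ x \end{pmatrix} = \begin{pmatrix} z_0 \\ x_0 \end{pmatrix} + \begin{pmatrix} a_3 & b_3 \\ a_1 & b_1 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}, \]
\[ \sigma_{xy}:\ \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x_0 \\ y_0 \end{pmatrix} + \begin{pmatrix} a_1 & b_1 \\ a_2 & b_2 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}. \]
Then $A(\sigma_{yz}(D)) = \left| \dfrac{\partial(y, z)}{\partial(u, v)} \right| A(D)$, etc., and by (5.2)
\[ A(\sigma(D)) = \sqrt{ A(\sigma_{yz}(D))^2 + A(\sigma_{zx}(D))^2 + A(\sigma_{xy}(D))^2 } = \sqrt{ \left( \frac{\partial(y, z)}{\partial(u, v)} \right)^2 + \left( \frac{\partial(z, x)}{\partial(u, v)} \right)^2 + \left( \frac{\partial(x, y)}{\partial(u, v)} \right)^2 } \; A(D) = |n| A(D) = \int_D |n| \, du \, dv \]
(since $n$ is constant here).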
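The Pythagorean relation between a plane segment and its three projections can be checked on a concrete parallelogram. The Python sketch below is illustrative only: it takes $D$ to be the unit square (so $A(D) = 1$), uses hypothetical sample columns `a` and `b`, and compares the sum-of-squares formula against an independent area formula, $\sqrt{|a|^2 |b|^2 - (a \cdot b)^2}$.

```python
import math

def minors(a, b):
    """Signed areas of the projections of the parallelogram a, b onto yz, zx, xy."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def area_from_projections(a, b):
    """sqrt(A_yz^2 + A_zx^2 + A_xy^2) for the unit square D, so A(D) = 1."""
    return math.sqrt(sum(m * m for m in minors(a, b)))

def area_from_gram(a, b):
    """Parallelogram area computed independently as sqrt(|a|^2 |b|^2 - (a.b)^2)."""
    def dot(p, q):
        return sum(s * t for s, t in zip(p, q))
    return math.sqrt(dot(a, a) * dot(b, b) - dot(a, b) ** 2)

a, b = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)   # sample columns, illustrative only
```

For these sample columns both functions should return $\sqrt{54}$, since the minors are $(-3, 6, -3)$ and $|a|^2 |b|^2 - (a \cdot b)^2 = 14 \cdot 77 - 32^2 = 54$.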
Now that the notion of content or measure has been introduced for curves and
surfaces it is a simple matter to extend the idea of integration to such objects. For
example in $\mathbb{R}^3$: If $\gamma : \mathbb{R} \to \mathbb{R}^3$ is a smooth curve and $I$ is a closed interval in $\mathbb{R}$, then