Edwin E Moise Calculus PDF
SECOND EDITION
EDWIN E. MOISE
Harvard University
All rights reserved. No part of this publication may be reproduced, stored in a retrieval
system, or transmitted, in any form or by any means, electronic, mechanical, photocopying,
recording, or otherwise, without the prior written permission of the publisher. Printed in the
United States of America. Published simultaneously in Canada. Library of Congress Catalog
Card No. 76-150576.
Author's Note on the Second Edition
The preface to the first edition was an explanation of the author's intentions, and
since these intentions have not changed, the original preface is reprinted after this one.
But the present edition is a thorough revision of the first, with many major changes
and even more minor ones. Some of these are as follows.
1. The use of language has been simplified throughout. Excessive colloquialisms
have been eliminated.
2. Many problems have been added, most of them being easy. In cases where
several problems form a sequence, they have been combined into a single problem,
with parts (a), (b), (c), .... Thus it is now safe to assign all odd-numbered problems,
without checking to make sure that problem-sequences are not being broken up.
3. Various long sections have been divided into two parts.
4. The classical definition of a limit has been restored. Exploratory problems,
dealing with limits and continuity in terms of "boxes," have been eliminated, and
this material has been inserted in later portions of the text.
5. Section 5.8, on the derivative of one function with respect to another, has been
completely recast, following suggestions of Professor Hugh Thurston, of the University
of British Columbia. The new version is mathematically straightforward, and it
bridges the gap between the modern concepts of function and derivative and the
"fractional" notation du/dv commonly used in physics.
6. Chapter 8, on the conic sections, has been shortened and simplified, by
omitting various topics not ordinarily covered in a first course in calculus. In
particular, the section on the geometry of the ellipse has been omitted. In a way it is
a pity to leave this out, because it is very good mathematics, but in a first course in
calculus we barely have time for essentials.
7. In the chapters on vector spaces, the standard use of the terms "vector space"
and "inner product space" has been restored.
8. The old Chapter 10, on number theory and partial fractions, has been omitted.
The above remarks about the geometry of the ellipse also apply here.
9. The chapter on infinite series has been completely recast. In the first edition,
the idea of uniform convergence was built into the presentation, almost from the
outset; it was used, in various special cases, to justify term-wise integration, long
before the general definition of uniform convergence was stated. This treatment had
advantages, for some students, but it had a serious disadvantage: it meant that the
hardest part of the study of infinite series could not be skipped, or even postponed.
The chapter has now been arranged in such a way that the hardest parts of it come
last. Term-wise integration and differentiation of power series are introduced early,
and play a central part throughout the chapter; but the justification of these processes
is saved for the end.
The construction of the complex numbers (using congruence classes of real
polynomials modulo $1 + x^2$) has been moved to an appendix.
10. The chapter on linear transformations, matrices, and determinants has been
recast and simplified in various ways. For example, the idea of isometries between
subspaces has been omitted (from the text and therefore from the problems).
11. In the chapter on functions of several variables, the Leibniz notation for
partial derivatives has been introduced, in parallel with the subscript notation
$f_x, f_y, \ldots$. The former notation is of course universal in physics, and it cannot be
denied that it makes the chain rule easier to remember.
These examples should make it plain that this is not a perfunctory revision.
The intent of the revision is to make the book more teachable and more flexible,
without weakening its mathematical content. Some sections (as indicated above)
have been omitted outright. Some chapters have been recast in such a way that more
topics can be omitted at the teacher's discretion. But the main substance of the book,
and the conception of calculus that it attempts to teach, have not been changed. All
the hard problems in the first edition have been retained (except for one very
embarrassing case, in which I asked the student to prove a false theorem).
Most of the calculus books now in print are of one of the following three types:
1) Some are written on a high plateau of austerity and rigor, and the Devil take
the hindmost.
2) Some are "quick calculus" books. A typical device, in this sort of book, is to
use the Fundamental Theorem of Integral Calculus as a definition of the definite
integral. This enables the student to imitate the behavior of mathematicians, in
calculating definite integrals, without sharing the mathematicians' conception of
what the problem meant in the first place.
The rock-bottom minimum, in any good mathematics course, is for the student
to attach conceptual meanings to the problems that he solves and to the "answers"
that he computes. If we settle for less than this, we are making a bad bargain. The
hard fact is that "practical" calculus courses are not practical. In real life it seldom,
if ever, happens that a mathematical problem takes the form of a homework exercise
which can be solved by copying the pattern of the "solved problems" that immediately
precede it. In physics it is the conceptual definite integral that is crucial, and numerical
valuations are often done by computers. Thus the art of setting up integrals is often
more useful than the art of calculating them by elementary methods. The same
principle applies very widely. When people put their mathematical training to
practical use, they seldom need the logical refinements that appear in a thorough
treatise, but they nearly always use their conceptual grasp, at some level, of
mathematical ideas. Obviously, many techniques are needed, and in this book we have
worked hard to teach them. But at the same time we have tried to produce the sort
of conceptual grasp of mathematics that can be put to work in real life.
The mathematical content of the first ten chapters of this book is familiar and easy
to describe. These chapters present, more thoroughly than is customary, the material
normally covered in one-year introductions to college calculus, and end with a chapter
on infinite series. (This portion of the book is being published separately, under
the title Elements of Calculus.) In the last four chapters of the complete edition, the
choice of material is nowhere nearly so traditional. In particular, we have laid heavy
stress on the methods of linear algebra.
In the latter portion of this preface, we explain the considerations on which the
selection of topics in the last few chapters is based. Most of the novelties in the
Elements are in the style of treatment; and the ideas underlying them may best be
explained by means of numerous examples.
The central concepts of the calculus are deep. It is not to be expected that they can
be learned all at once, in the forms in which a modern mathematician thinks of them.
Therefore, in this book, the more difficult ideas are presented in a series of different
forms, in ascending order of difficulty, generality, and exactitude. Thus the idea of
the definite integral makes its first and simplest appearance in Section 2.10; it is
generalized in Section 3.7; and it is not presented in final form (using Riemann sums)
until Section 7.1, where Riemann sums are needed, in the calculation of arc length.
Similarly, the chain rule for derivatives appears first in Section 3.6, for powers
and square roots of functions; it is proposed, in more general forms, in Problem
Sets 3.6, 3.8, 4.3, and 4.5; and it appears in final form only in Section 4.6.
The mean-value theorem is first stated, in geometric terms, in Section 3.2, before
any formal definition of the derivative. It is used freely thereafter. Finally, in Section
5.7, it is proved, after the ideas needed in the proof have been used and motivated
in other ways.
The idea of the limit of a function appears first in Section 2.7. The formal
definition is in Section 3.3. Earlier sections include a lengthy preparation for the
formal definition, designed to eliminate in advance as many of its difficulties as
possible. This purpose is served by the text of Sections 1.4 and 2.5. Thus the style of
treatment is such that an inspection of isolated sections of the book is likely to lead
to an overestimate of the difficulty of the course. The point is that the sections are
not isolated: difficult discussions have been provided with elaborate foundations, in
the text and especially in the problems.
The spiral treatment, in which concepts appear in various forms as the theory
develops, is intended to make the concepts easier to learn. But this is not its only
purpose. The processes by which special ideas are generalized, and heuristic ideas
are made concrete and exact, are part of the substance of what we ought to be
teaching. Thus the heuristic treatment of exponentials and logarithms, in Section
4.9, is not given merely in order to make the student's life easier. The transition from
Section 4.9 to Sections 4.10 and 4.11 (in which the theory is based on the definition
$\ln x = \int_1^x dt/t$) is valuable in itself, as an illustration of a recasting process which
is essential both in the growth of mathematics and in the growth of the people who
use it.
2. MOTIVATION
The desire to solve interesting puzzles is very strong; there is no maturity level at
which it disappears; and we should appeal to it continually. Most of the time,
however, when new ideas are introduced, they ought to be motivated by a sense of
power, and by the light that they throw on ideas already regarded as significant.
For example, if we present Riemann sums, in full generality, long before we deal
with problems in which they are needed, it is not reasonable to expect the student to
master their complications. Similarly, the completeness of the real number system,
in the sense of Dedekind, is not needed at all in the theory of pointwise limits: this
theory takes exactly the same form in the rational domain as in the real domain.
If we postpone the idea of completeness until the point where it is needed, in the
study of functions continuous on an interval, it is more likely to be understood,
partly because it is more likely to get the student's attention.
The problem of motivating the idea of the limit of a function involves a peculiar
difficulty. The only cases in which $\lim_{x \to a} f(x)$ is easy to calculate are those in which
f is a continuous function, described by a simple formula. In these cases, the formula
works just as well for x = a as for other values of x; in practice, it turns out that
the limit is f(a); and the student is likely to get the idea that the expression $\lim_{x \to a} f(x)$
is merely a devious and pretentious description of f(a). If we avoid this trouble by
starting with significant cases, such as
$$\lim_{x \to 0} \frac{\sin x}{x},$$
then the technical difficulties are formidable, and workable problem material is hard
to come by. If we choose, instead, to discuss limits of sequences, then we have
evaded the issue by changing the subject: in the differential calculus, limits of
functions are what we need.
But there is a fourth alternative: we can introduce the idea of a limit not as a
subject in its own right, but as a device for solving a problem. In Section 2.7 we
mention limits for the first time, in finding the limit of a linear function. To pass
to the limit, in this case, we merely plug the hole in a punctured line. This process
has no intrinsic significance. But in the context of Section 2.7, it has an extrinsic
3. BLACK BOXES
It is generally agreed that in a physics laboratory the student should build as much as
possible of his own equipment. Nobody learns very much by watching the
performance of the proverbial "black box." In mathematics the situation is similar:
we do not learn mathematical principles by hearing them mentioned once, no matter
how elegantly; we need to live with them and use them. Therefore, in this book
certain extremely powerful theorems have been proved long before being stated.
That is, the proof has been presented, in the form of a method of solving a certain
class of problems; and after the student has learned the idea by using it on many
problems, we have summed up the situation by stating the general theorem that the
proof proves. This scheme costs very little time, even in the short run; and in the
long run it is likely to save a great deal of time. The point is that if we allow recipes
to take the place of ideas, in a first course, then the ideas need to be taught all over
again later; and the second attempt may be harder, because the problem-solving
motivation for these particular ideas has already been used up.
There are good reasons for not giving examples of this technique.
It should be understood that the avoidance of black boxes has no particular
connection with the pursuit of logical rigor. Indeed, if we have to choose, it is better
to master an idea in an heuristic form, by using it repeatedly, than to listen once to a
rigorous exposition, and then forget it.
4. PROBLEMS
In a quick examination of a textbook, it is not a good idea to read the text and skip
the problems; it is better to read the problems and skip the text. The problems
represent the life that the student leads when he studies the course; and any ideas
that do not appear in them are unlikely to be learned, no matter how much preachment
may be devoted to them.
In this book, a variety of problems are used for a variety of purposes. There are:
1) Technical problems, as, for example, in the chapter on the technique of integration.
These are carefully graded, and often they form sequences, in which the answer
to one problem can be used in the solution of the next.
2) Theoretical problems, some easy, some hard. Vigorous attempts have been
made to find easy ones, so as to avoid a dichotomy between techniques (which the
student really uses) and "theory" (of which he is intermittently a spectator).
3) Puzzle problems.
4) Sketching exercises, in which the student is asked to translate back and forth
between analytic ideas and visual images.
5) Discovery problems, which anticipate, in special cases, ideas which will later
be explained in the text.
There is wide general agreement on the content of the first year course in college
calculus; and in writing the Elements, the author was in the happy position of
working on the basis of a consensus with which he was fully in sympathy. But there is
no such general agreement on the content of a course in intermediate calculus. In
the past decade, calculus courses have tended to grow, by including various topics
from advanced calculus and linear algebra. But it is not easy to decide which of these
topics should be included, and what relative stress should be placed on them; and in
fact there is no reason to suppose that such questions have unique answers.
On the other hand, every book and every course must make some choices, and
then stick to them long enough to permit a valid learning process. If the pursuit of
flexibility turns an intermediate calculus book into an anthology, then its little pieces
are unlikely to have any lasting effect. For example, if the treatment of infinite series
is sketchy, then its residuum in the mind of the student may include hardly more than
the ratio test. And the dangers presented by brief treatments of linear algebra are
worse.
Modern algebra is modern because its motivations and its applications came late.
Today, there are very good reasons for studying groups, rings, fields, vector spaces,
normed vector spaces, inner product spaces, linear transformations, matrices, and so
on. But the logical simplicity of the rudiments of these theories is misleading. For
example, the manipulative process of multiplying matrices can be taught to almost
anybody, at almost any level; but the significant applications of this process are
another matter entirely. In a short treatment of axiomatic and linear algebra, at the
freshman or sophomore level, we cannot presuppose knowledge of the significant
applications, and we have no time in which to present them. Thus we may fall into a
peculiar form of use-mention confusion: the reader hopes that the ideas of modern
algebra are going to be used, but in the end he sees that they have merely been
mentioned.
For these reasons we have tried, throughout, never to state an algebraic definition
until the reader already knows at least one important instance of the idea that the
definition describes; and once an algebraic idea has been introduced, we have tried,
throughout, to put it to work for the purposes that it is good for. Thus, for example,
matrices are introduced as a shorthand for handling linear transformations; and
thereafter the treatment of the two is closely tied together. The Schwarz inequality is
first introduced (on page 521) as a theorem in Cartesian three-space, and for this
case it is proved by the trivial observation that $\cos^2 \theta \le 1$ for every $\theta$. Later, on
page 536, it is proved in the general case, and thereafter it is used in a great variety
of ways, to trivialize problems which would not otherwise be trivial. It appears in
disguised forms in many problems (which should not be listed here). These examples
are typical of the style of Chapters 11 through 13. It appears to the author that the
nature, the purposes, and the power of algebraic methods are not likely to be
understood unless they are conveyed to the student by some such extended experience.
The most impressive, but also the most difficult, of these applications occurs in
Chapter 12, on Fourier series. This topic is not ordinarily included in intermediate
courses; and if something must be omitted, in teaching a course from this book,
Chapter 12 is an excellent candidate for omission. (None of the material in it is used
later.)
Chapters 1 through 13 amount to more than 600 pages; something had to be
shortened; and so the treatment of functions of many variables is shorter than might
have been expected, and there is no separate chapter on differential equations. It
should be noted, however, that there is a substantial treatment of linear differential
equations at the end of Chapter 13, and that the viewpoint of differential equations
has been stressed throughout. (Recall, for example, the treatment of the fundamental
theorem of integral calculus, and of the elementary functions, in the Elements.) In
Chapter 10, the standard method of showing that a given series converges to a given
function is first to show that the series and the function satisfy the same differential
equation, and then to show that the differential equation (with initial condition) has
only one solution. Usually, the series is derived from the differential equation, and so
the student is not likely to be surprised when the same process is applied later to
equations whose solutions were not previously known. For this sort of reason, the
book conveys much more of the spirit and methodology of differential equations
than the table of contents would suggest.
Moreover, it appeared to the author that the natural sequels of the material in
Chapter 14 would grow exponentially more difficult, and that they rightly belong in
an advanced calculus course. The hard fact is that multivariate calculus, once we get
past its beginnings, is not an elementary subject; and if we try to make it seem
elementary, we are likely to give up both intuition and logic in favor of a bewildering
formalism. Thus it appeared, at the end of Chapter 14, that we should say either
much more, or no more at all; and since every book, even a calculus book, has got
to end somewhere, the choice was clear.
The above discussion is an attempt to indicate some of the author's objectives,
and some of the methods used in pursuing them. Obviously no such discussion can
prove anything about the extent of the contribution that the text makes to the
achievement of these objectives. A great deal has happened, in the teaching of
calculus, in the past decade, and it remains to be seen how much more can be
accomplished, and how.
Chapter 1 Inequalities
1.1 Introduction
1.2 Products which are equal to zero
1.3 Order
1.4 Absolute values. Intervals on the number line
Chapter 2 Analytic Geometry
2.1 Introduction
2.2 Coordinate systems. The distance formula
2.3 The graph of a condition. Equations for circles
2.4 Equations of lines. Slopes, parallelism, and perpendicularity
2.5 Graphs of inequalities. And, or, and if ... then
2.6 Parabolas
2.7 Tangents
2.8 A shorthand for sums
2.9 The induction principle and the well-ordering principle
2.10 Solution of the area problem for parabolas
1.1 INTRODUCTION
In this book it is assumed that you know elementary geometry and the algebra of the
real number system. Theorems of plane geometry will be used only occasionally, and
there is no need to reexamine the subject as a whole.
Inequalities, however, are another matter. We shall be using them constantly,
and they are tricky. We shall therefore handle them with care. To derive the laws that
govern them we first need to recall the elementary laws of the number system. These
are as follows.
We have given the set R of real numbers, with the operations of addition and
multiplication. Thus the number system is a triplet
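[R, +, ·].
Addition and multiplication are subject to the following laws:
Associativity. For every a, b, and c,
a + (b + c) = (a + b) + c,
and
a(bc) = (ab)c.
Commutativity. For every a and b,
a + b = b + a,
and
ab = ba.
Distributivity. For every a, b, and c,
a(b + c) = ab + ac.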
Existence of 0 and 1. There are two different numbers 0 and 1 such that
a + 0 = a and a · 1 = a
for every a.
Existence of Negatives. For every a there is a number -a such that a+(-a)= 0.
Existence of Reciprocals. For every $a \ne 0$ there is a number 1/a such that a · (1/a) = 1.
These laws are called the field postulates; and any number system which satisfies
them is called a field. There are many such number systems: the real numbers form a
field, and so do the complex numbers. For a long time to come, however, we shall be
working only with the real numbers. Therefore, when we speak of numbers, we mean
real numbers, unless the contrary is stated.
We shall assume not only the field postulates but also the familiar laws based on
them. For example, we know that $(a - b)(a + b) = a^2 - b^2$, and that a · 0 = 0
for every a.
When we perform calculations, we shall not stop to justify them on the basis of the
field postulates. But the following principle is worth special mention, because it is
used in reasoning processes which don't involve calculations:
Theorem 1. If ab = 0, then either a = 0 or b = 0.
Proof
1) If a = 0, there is nothing to prove.
2) If $a \ne 0$, then a has a reciprocal. Therefore
$$\frac{1}{a}(ab) = \frac{1}{a} \cdot 0, \qquad 1 \cdot b = 0,$$
and
b = 0.
Thus either a = 0 or b = 0.
Obviously it is possible that a and b are both 0. In Theorem 1 (and everywhere
else in mathematics) when we say either ... or ..., we allow the possibility of both.
$(x - 1)(x + 1) = 0.$
$x^2 - 5x + 6 = 0.$
$0 \cdot x = 1.$
$x^3 - 6x^2 + 11x - 6 = 0.$
$\frac{1}{x} + \frac{1}{a} = \frac{1}{x + a}$ ?
$(a + b)^2 = a^2 + b^2$ ?
$(a + b)^3 = a^3 + b^3$ ?
*9. Consider the "number system" which has only two elements 0 and 1, with addition and
multiplication defined by the following tables:
 + | 0  1         · | 0  1
---+------       ---+------
 0 | 0  1         0 | 0  0
 1 | 1  0         1 | 0  1
Which of the field postulates hold true, in this system? Which, if any, fail to hold?
(The answer to this question suggests that the field postulates are not, in themselves,
a very adequate description of the real number system.)
*10. Consider the number system in which the "numbers" are 0, 1, 2, and 3, with addition
and multiplication defined by the following tables:
 + | 0  1  2  3         · | 0  1  2  3
---+------------       ---+------------
 0 | 0  1  2  3         0 | 0  0  0  0
 1 | 1  2  3  0         1 | 0  1  2  3
 2 | 2  3  0  1         2 | 0  2  0  2
 3 | 3  0  1  2         3 | 0  3  2  1
Exactly one of the field postulates fails to hold in this number system. Find out which
one. [Hint: Don't bother to test the Associative and Distributive Laws; in fact, they
hold true in this system, although the verifications are extremely tedious.]
Does Theorem 1 hold true in this system? Why or why not?
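For a number system with only finitely many elements, each postulate can be tested by trying every case. The following is a minimal sketch of such a test in Python. It assumes that the tables of Problem 10 are those of addition and multiplication modulo 4; the names check_postulates, zero_product_property, add_mod4, and mul_mod4 are illustrative and do not come from the text.

```python
from itertools import product


def check_postulates(elements, add, mul):
    """Test each field postulate by brute force on a finite number system."""
    elems = list(elements)
    pairs = list(product(elems, repeat=2))
    triples = list(product(elems, repeat=3))
    return {
        "associativity": all(
            add(a, add(b, c)) == add(add(a, b), c)
            and mul(a, mul(b, c)) == mul(mul(a, b), c)
            for a, b, c in triples
        ),
        "commutativity": all(
            add(a, b) == add(b, a) and mul(a, b) == mul(b, a) for a, b in pairs
        ),
        "distributivity": all(
            mul(a, add(b, c)) == add(mul(a, b), mul(a, c)) for a, b, c in triples
        ),
        "zero_and_one": all(add(a, 0) == a and mul(a, 1) == a for a in elems),
        "negatives": all(any(add(a, n) == 0 for n in elems) for a in elems),
        "reciprocals": all(
            any(mul(a, r) == 1 for r in elems) for a in elems if a != 0
        ),
    }


def zero_product_property(elements, mul):
    """Theorem 1: ab = 0 only when a = 0 or b = 0."""
    return all(
        a == 0 or b == 0
        for a, b in product(list(elements), repeat=2)
        if mul(a, b) == 0
    )


# Assumption: the tables of Problem 10 encode arithmetic modulo 4.
def add_mod4(a, b):
    return (a + b) % 4


def mul_mod4(a, b):
    return (a * b) % 4


report = check_postulates(range(4), add_mod4, mul_mod4)
for name, holds in report.items():
    print(name, "holds" if holds else "fails")
theorem_1 = zero_product_property(range(4), mul_mod4)
print("Theorem 1", "holds" if theorem_1 else "fails")
```

A test of this kind only reports which laws hold; it does not explain why, and the explanation is what the problem asks for.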
1.3 ORDER
[Figure: the number line, with the integers from -3 to 3 marked, together with the points $-\sqrt{3}$ and $\pi$.]
When we write a<b, this means (roughly speaking) that a lies to the left of b on the
number line. Thus what we have in mind is a system
[R, +, ·, <],
O.1. (Trichotomy) For every a and b in R, one and only one of the following
conditions holds:
a < b, or a = b, or b < a.
These four laws, in combination, tell the whole story: all of the elementary laws
of inequalities can be derived from them. You will carry out this process, in the
following problem set. Meanwhile we state the theorems without proof.
a+c<b+d.
Theorem 4. An inequality is preserved if both sides are multiplied by the same positive
number.
Theorem 5. An inequality is preserved if both sides are divided by the same positive
number.
Theorem 6. An inequality is reversed if both sides are multiplied by the same negative
number.
Theorem 7. An inequality is reversed if both sides are divided by the same negative
number.
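Theorems 4 through 7 can be tried out on particular numbers before they are proved. The following is a minimal sketch in Python, assuming exact rational arithmetic from the standard fractions module; the names samples and check_theorems_4_to_7 are illustrative. A check on finitely many samples is of course not a proof.

```python
from fractions import Fraction
from itertools import product

# A small grid of rational numbers, positive, negative, and zero.
samples = [Fraction(n, d) for n in range(-4, 5) for d in (1, 2, 3)]


def check_theorems_4_to_7(values):
    """For every a < b and every nonzero c, test the four theorems."""
    for a, b, c in product(values, repeat=3):
        if not a < b or c == 0:
            continue
        if c > 0:
            # Theorems 4 and 5: the inequality is preserved.
            ok = a * c < b * c and a / c < b / c
        else:
            # Theorems 6 and 7: the inequality is reversed.
            ok = a * c > b * c and a / c > b / c
        if not ok:
            return False
    return True


all_hold = check_theorems_4_to_7(samples)
print(all_hold)
```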
3x + 4 < 5x + 7.
3(-5) + 4 < 5(-5) + 7
-3 < 2x, (3)
by AO; and so
x > -3/2, (4)
by Theorem 4. (We have multiplied, on each side, by 1/2, and then written the inequality
backwards, to put x on the left.)
Thus every number which satisfies (1) also satisfies (4). And all of our steps can
be reversed. If
x > -3/2, (4)
then
-3 < 2x, (3)
by Theorem 4; therefore
4 < 2x + 7, (2)
by AO; and so
3x + 4 < 5x + 7, (1)
by AO. Therefore every number which satisfies (4) also satisfies (1). We can sum all
this up briefly by writing
Here the symbol <=> is pronounced "is equivalent to." When we write <=> between
two inequalities (or any two open sentences of any kind) we mean that whenever one
of them is satisfied, so is the other.
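The equivalence of the inequality 3x + 4 < 5x + 7 and the condition x > -3/2 can be tested on particular numbers. The following is a minimal sketch in Python, assuming exact rational arithmetic; the names satisfies_1, satisfies_4, samples, and agree are illustrative. The sample points include the boundary value -3/2, where both conditions fail.

```python
from fractions import Fraction


def satisfies_1(x):
    """Inequality (1): 3x + 4 < 5x + 7."""
    return 3 * x + 4 < 5 * x + 7


def satisfies_4(x):
    """Inequality (4): x > -3/2."""
    return x > Fraction(-3, 2)


# Quarter-steps from -10 to 10; the boundary -3/2 = -6/4 is among them.
samples = [Fraction(n, 4) for n in range(-40, 41)]

# "Equivalent" means: whenever one is satisfied, so is the other.
agree = all(satisfies_1(x) == satisfies_4(x) for x in samples)
print(agree)
```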
is false, because x = -1 satisfies the second inequality but not the first. Similarly,
$a = b$ => $a^2 = b^2$ is true, but
(?) $a = b$ <=> $a^2 = b^2$ (?)
is false, because if $a \ne 0$ and b = -a, then the second equation holds, but the first
does not.
The shorthand symbols <=> and => are worth learning and using. The reason is
that when we write down strings of formulas, in solving a problem, we ought to
indicate what the connection between them is supposed to be. We are more likely to
do this if we have a way of doing it briefly.
Using the symbols => and <=>, we can restate some of the theorems of this section
in a more efficient way. For example, AO says that
We shall refer to this, for short, as ALO. Similarly, Theorem 4 says that
We sum all this up in the short form on the next page. The meanings of the
abbreviations should be plain.
Order
Trich. For every a and b in R, one and only one of the following
conditions holds:
a<b, or a = b, or b <a.
The last three of these are convenient in solving inequalities; they enable us to
use <=> at each stage, instead of working first forward and then backward. For
example, the solution of the illustrative problem above can now be written like this:
In the following problems, we develop the theory in which all of the results of this
section are derived from Trich., Trans., MO, and AO. Therefore, at the start, these are
the only statements that can be given as reasons in proofs. In each problem, however,
you may assume that the results given in the preceding problems are known and you may
cite them as reasons.
13. Following are the steps in the proof of Theorem 1. Complete the proof by giving a
reason for each step.
14. Following is an outline of the proof of Theorem 2. Complete the proof by giving a
reason for each =>.
c) Prove Theorem 4.
19. a) Everybody knows that 1 > 0. Prove it, on the basis of the theory that we have
developed so far. (You may assume, of course, that $1 \ne 0$.)
b) Show that
a > 0 => 1/a > 0.
That is, the reciprocal of every positive number is positive. [Hint: By Trich., it
will be sufficient to show that the conditions 1/a = 0 and 1/a < 0 are impossible.
Remember that a · 1/a = 1.]
21. Give the reason for each step in the following proof of Theorem 6.
24. Is there a negative number which is larger than all other negative numbers? Why
or why not?
*25. Is it possible to define, for the complex numbers, a relation < which obeys the laws
O.1 and O.2? (That is, can an order relation be defined for the complex numbers?)
Why or why not?
*26. Is it possible to define, for the complex numbers, a relation < which satisfies not only
O.1 and O.2 but also MO and AO? [Hint: Since $i \ne 0$, we must have i > 0 or -i > 0.]
The language in which these problems are stated ought to suggest what the answers are.
The answer to Problem 26 indicates why it is that arranging the complex numbers in an
order is not a useful proceeding. In the complex number system, no theory of inequalities
can be made to work.
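1.4 ABSOLUTE VALUES. INTERVALS ON THE NUMBER LINE
The absolute value $|x|$ of a number x is defined by the following two conditions:
1) If $x \ge 0$, then $|x| = x$.
2) If $x < 0$, then $|x| = -x$.
Thus under Condition (1) we have
$|2| = 2$,
and under Condition (2) we have
$|-2| = -(-2) = 2$,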
Thus the operation $|\ |$ leaves positive numbers unchanged, and replaces each negative
number by the corresponding positive number. On this basis it is easy to see that the
following theorem holds.
Theorem 1. For every x,
$|x| \ge 0$.
Case 2. x < 0. Here $|x| = -x$, by definition of $|x|$; and $-x > 0$, by Theorem 2
Theorem 2. For every x,
$|x|^2 = x^2$.
positive square root of a. Thus, for example, 9 has two square roots, 3 and -3;
and $\sqrt{9}$ is 3, which is the positive square root. We define $\sqrt{0} = 0$. Here and hereafter,
we are assuming that positive numbers have roots of all orders: square roots, cube
roots, and so on.
Theorem 3. For every x,
$|x| = \sqrt{x^2}$.
Proof. By Theorem 2, $|x|^2 = x^2$, and so $|x|$ is a square root of $x^2$. By Theorem 1,
$|x| \ge 0$. Therefore $|x| = \sqrt{x^2}$, by definition of $\sqrt{\ }$.
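The two-case definition of the absolute value translates directly into a short function, and Theorems 1, 2, and 3 can then be tried on particular integers. The following is a minimal sketch in Python; the name absolute_value is illustrative, and math.isqrt is used because it gives the exact nonnegative square root of a perfect square. A check on finitely many integers is not a proof.

```python
from math import isqrt


def absolute_value(x):
    """Condition (1): |x| = x if x >= 0. Condition (2): |x| = -x if x < 0."""
    return x if x >= 0 else -x


integers = range(-50, 51)

# Theorem 1: |x| >= 0.
theorem_1 = all(absolute_value(x) >= 0 for x in integers)
# Theorem 2: |x|^2 = x^2.
theorem_2 = all(absolute_value(x) ** 2 == x ** 2 for x in integers)
# Theorem 3: |x| is the nonnegative square root of x^2.
theorem_3 = all(absolute_value(x) == isqrt(x * x) for x in integers)

print(theorem_1, theorem_2, theorem_3)
```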
Theorem 4. For every x,
$|-x| = |x|$.
$x \le |x|$.
$|x + y| = x + y$.
Since
$x \le |x|$, and $y \le |y|$,
we have
$x + y \le |x| + |y|$,
and so
$|x + y| \le |x| + |y|$.
Case 2. Suppose that x + y < 0. Then -x - y > 0. Therefore
$|x| < d$
This is geometrically obvious: $|x|$ is "the distance between 0 and x, on the number
line"; and the points that lie within a distance d of the origin are the numbers between
-d and d. We get a more general result by using any given point a instead of the
origin.
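The distance interpretation can be tried on particular numbers: the condition |x - a| < d should hold exactly when x lies between a - d and a + d, and the case a = 0 is the statement about the origin. The following is a minimal sketch in Python, assuming exact rational arithmetic; the names near, between, neighborhoods_agree, and triangle_holds are illustrative. The last check tries the inequality |x + y| ≤ |x| + |y| on the same samples.

```python
from fractions import Fraction
from itertools import product

samples = [Fraction(n, 2) for n in range(-8, 9)]
positive = [d for d in samples if d > 0]


def near(x, a, d):
    """x lies within a distance d of a."""
    return abs(x - a) < d


def between(x, a, d):
    """x lies strictly between a - d and a + d."""
    return a - d < x < a + d


neighborhoods_agree = all(
    near(x, a, d) == between(x, a, d)
    for x, a in product(samples, repeat=2)
    for d in positive
)
triangle_holds = all(
    abs(x + y) <= abs(x) + abs(y) for x, y in product(samples, repeat=2)
)
print(neighborhoods_agree, triangle_holds)
```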
$|x - a| < d$
If a<b, then the set of all numbers between a and b is called an open interval,
and is denoted by (a, b).
[Figure: the open interval (a, b) on the number line.]
$\{x \mid \sqrt{x^2 + 1} = x - 1\} = \{\ \}$.
The notation { } is designed to suggest its meaning: we describe sets in the brace
notation; and when there is nothing written between the braces, this means that the
set has nothing in it.
If we add to the open interval (a, b) the endpoints a and b, we get a closed interval,
denoted by [a, b].
[Figure: the closed interval [a, b] on the number line.]
Thus
$[a, b] = \{x \mid a \le x \le b\}$.
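The only difference between the open interval (a, b) and the closed interval [a, b] is whether the endpoints belong to the set. The following is a minimal sketch in Python that makes the difference concrete; the names in_open_interval and in_closed_interval, and the sample endpoints 1 and 3, are illustrative.

```python
from fractions import Fraction


def in_open_interval(x, a, b):
    """x belongs to (a, b): strictly between a and b."""
    return a < x < b


def in_closed_interval(x, a, b):
    """x belongs to [a, b]: between a and b, endpoints included."""
    return a <= x <= b


a, b = Fraction(1), Fraction(3)
for x in (a, Fraction(2), b, Fraction(7, 2)):
    print(x, in_open_interval(x, a, b), in_closed_interval(x, a, b))
```

Only at the endpoints x = a and x = b do the two tests give different answers.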
We shall also be dealing with "infinite intervals." In the first figure below,
the "infinite interval" is
[Figures: the infinite intervals $(a, \infty)$ and $(-\infty, a)$ on the number line.]
This notation, in which "$\infty$" is used as if it denoted a number, is not very logical,
but it is convenient. To keep track of the notation, you should think of fictitious
"numbers" $-\infty$ and $\infty$ as the "ends" of the number line, as shown below.
[Figures: the half-open intervals $[a, b)$ and $(a, b]$, and the infinite interval $(-\infty, a]$, on the number line.]
Finally, we may refer to the whole real number system R as the interval $(-\infty, \infty)$.
Thus we have a total of nine kinds of interval:
In some of the problems below, you may find it convenient to use the following:
Proof
$|x| = |y|$ => $|x|^2 = |y|^2$
=> $x^2 = y^2$
=> $x^2 - y^2 = 0$
=> $(x - y)(x + y) = 0$
=> x = y or x = -y.
(The converse is obvious.)
Describe each of the following sets in the interval notation. Your answers should be in
a form like the following:
9. a) Is it true that $\sqrt{x^2} = x$ for every $x$? Why or why not? Describe the set
$\{x \mid \sqrt{x^2} = x\}$,
in the interval notation.
b) Describe the set
$\{x \mid \sqrt{(x + 1)^2} = x + 1\}$,
in the interval notation.
Find out for what numbers x (if any) each of the following conditions holds. In each
case in which the solution set is an interval, the answer should be given in the interval
notation.
Indicate graphically, on a number scale, the places where the following conditions hold;
describe the graphs in the interval notation if possible.
$\left|\dfrac{1}{b}\right| = \dfrac{1}{|b|}$.
(By definition, the reciprocal of $|b|$ is the number $y$ such that $|b| \cdot y = 1$. Therefore
it is sufficient to show that $|b| \cdot |1/b| = 1$.)
$\left|\dfrac{a}{b}\right| = \dfrac{|a|}{|b|}$.
34. a) Show that for every $a$ and $b$,
$|a - b| \geq |a| - |b|$.
(There is a short proof.)
b) Show that for every $a$ and $b$,
$|a + b| \geq |a| - |b|$.
(The proof is short.)
35. For what numbers $a$ is the fraction $a/|a|$ defined? What is this fraction equal to, for
various values of $a$?
36. Sketch
$\{x \mid |x - 2| + |7 - x| = 5\}$
on the number line, and describe this set in the interval notation.
2 Analytic Geometry
2.1 INTRODUCTION
This chapter includes various topics which serve as a preparation for calculus. Some
of these topics are familiar to you, at least in some form. In such cases you should
still read the text carefully, in order to learn the terminology that will be used hereafter.
2.2 COORDINATE SYSTEMS. THE DISTANCE FORMULA
We shall now apply algebra to the study of geometry. We start with a plane, in the
usual sense of Euclidean geometry; and we suppose that a unit of distance has been
chosen, once for all, so that the distance between two points $P$ and $Q$ is a well-defined
nonnegative number. The distance between the points $P$ and $Q$ is denoted by $PQ$.
(We say merely that $PQ$ is nonnegative, rather than $PQ > 0$, because we are allowing
the case $P = Q$, and in this case $PQ = 0$.)
To set up a coordinate system in a plane, we first need to assign number-labels to
the points of a line. We choose a point $O$ as the origin; it is given the label 0.
Each point $P_1$ to the right of $O$ is labeled with the distance $x_1 = OP_1$, which is positive.
And each point $P_2$ to the left of $O$ is labeled with the number $x_2 = -OP_2$, which is
negative. Thus we have a matching scheme, under which each point of the line is
matched with exactly one real number.
[Figure: a number line with the points $P$, $Q$, $O$, $R$, $S$, $T$ marked at $-2$, $-1$, $0$, $1$, $\sqrt{2}$, $2$]
For the points marked in the figure, the matching pairs are
$S \leftrightarrow \sqrt{2}$, $T \leftrightarrow 2$, $U \leftrightarrow \pi$.
Here the double arrow $\leftrightarrow$ is pronounced "is matched with." Every such pair has the
form $P \leftrightarrow x$, where $P$ is a point and $x$ is a number. A one-to-one matching scheme,
between the elements of one set and the elements of another, is called a one-to-one
correspondence between the two sets.
If the correspondence is set up in the way that we have just described, then we
can compute the distance between any two points by means of the formula
$P_1P_2 = |x_2 - x_1|$.
Here $P_1 \leftrightarrow x_1$ and $P_2 \leftrightarrow x_2$. This distance formula holds no matter how the points
$P_1$ and $P_2$ are situated on the line:
[Figure: several arrangements of the points $P_1$ and $P_2$ relative to the origin]
and so on; in every case, $P_1P_2 = |x_2 - x_1|$. Thus we have a one-to-one correspondence
$P \leftrightarrow x$, between the points of the line and the real numbers, such that the
distance formula holds for every pair of points. Such a correspondence is called a
coordinate system for the line. If $P \leftrightarrow x$, then $x$ is called the coordinate of $P$.
These ideas are summed up in the following postulate.
The Ruler Postulate. Every line has a coordinate system. And given any two points
$O$ and $P$ of the line, there is a coordinate system in which the coordinate of $O$ is 0
and the coordinate of $P$ is positive.
[Figure: a line with the point $O$ at coordinate 0 and the point $P$ at coordinate $x > 0$]
On the basis of the ruler postulate, it is easy to set up a coordinate system in the
plane. We take two perpendicular lines $X$ and $Y$, intersecting in a point $O$. On each
of the two lines we set up a coordinate system, in such a way that $O \leftrightarrow 0$; that is, the
coordinate of $O$ is zero on each of the lines $X$ and $Y$. $X$ is called the $x$-axis, $Y$ is
called the $y$-axis, and the point $O$ is called the origin.
Given any point $P$ of the plane, we drop a perpendicular from $P$ to the $x$-axis,
ending at a point $M$. The point $M$ has a coordinate $x$, on the line $X$. If $M \leftrightarrow x$,
then $x$ is called the $x$-coordinate of $P$.
[Figure: a point $P$ with the feet $M$ and $N$ of the perpendiculars from $P$ to the $x$-axis and the $y$-axis]
$P \leftrightarrow (x, y)$
between the points $P$ of the plane and the ordered pairs $(x, y)$ of real numbers. The
order in which we write the numbers makes a difference. In the left-hand figure below,
We may speak of "the point $(1, 2)$" or "the point $(x, y)$," meaning "the point
which is matched with $(1, 2)$" or "the point which is matched with $(x, y)$." Thus we
may write $P = (x, y)$, meaning $P \leftrightarrow (x, y)$.
[Figure: points plotted in the coordinate plane, with the perpendiculars from $P$ to the axes meeting them at $M$ and $N$]
Obviously $x$ and $y$ are determined when $P$ is known. And $P$ is determined when
$x$ and $y$ are known, because the vertical line through $M$ and the horizontal line
through $N$ intersect in exactly one point. Thus we have a one-to-one correspondence
$P \leftrightarrow (x, y)$
between the points of the plane and the ordered pairs of real numbers. Such a correspondence
is called a coordinate system for the plane. We need to see how the algebra
in this situation is related to the geometry.
Consider first the question of distance. If we know the coordinates $(x_1, y_1)$ and
$(x_2, y_2)$ of two points $P$ and $Q$, then the points are determined, and so the distance
between them is determined. The following theorem gives a formula for the distance.
Theorem 1. If
$P \leftrightarrow (x_1, y_1)$
and
$Q \leftrightarrow (x_2, y_2)$,
then
$PQ = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$.
Proof Draw the vertical line through $Q$ and the horizontal line through $P$, meeting
at the point $R$. Let $S$ and $T$ be the feet of the perpendiculars to $X$, from $P$ and $R$
respectively. Then
$PR = ST$,
because opposite sides of a rectangle have the same length. And
$PR = |x_2 - x_1|$.
For the same sort of reason,
$RQ = |y_2 - y_1|$.
By the Pythagorean theorem, $PQ^2 = PR^2 + RQ^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2$, and the formula follows.
But the axes are usually drawn as shown on the right above. This figure shows the
minimum that must be indicated when graph paper is used for drawing pictures of
coordinate systems. That is, the axes must be labeled, and the number scale must be
shown on each axis, by indicating the coordinate of at least one point.
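The distance formula of Theorem 1 can be tried out numerically. The following is only a minimal sketch in Python; the function name `distance` and the sample points are illustrative assumptions, not part of the text.

```python
import math

def distance(p, q):
    """Distance PQ between p = (x1, y1) and q = (x2, y2), by the distance formula."""
    (x1, y1), (x2, y2) = p, q
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)

# A 3-4-5 right triangle, two points on the same vertical line, and the case P = Q.
print(distance((0, 0), (3, 4)))
print(distance((1, 2), (1, 7)))
print(distance((2, 5), (2, 5)))
```

The second and third calls exercise the cases in which there is no triangle: the formula still returns $|y_2 - y_1|$ and 0 respectively.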
The two axes separate the plane into four parts, called quadrants. The quadrants
are numbered I, II, III, IV. That is, the first quadrant is the set of all points $(x, y)$
of the plane for which $x > 0$ and $y > 0$; the second quadrant is the set of all points
$(x, y)$ for which $x < 0$ and $y > 0$; and so on.
[Figure: the coordinate axes, with the quadrants labeled I, II, III, IV]
We have used the letters $X$ and $Y$ in order to have convenient names for the $x$-
and $y$-axes. The axes are more commonly labeled as on the right above.
Calculate the distances between the following pairs of points. Then plot the points and
check the plausibility of your answers.
The following problems are a review of the main theorems of elementary geometry that
we have been using so far.
12. Show that an exterior angle of a triangle is greater than either of its remote interior
angles.
[Figure: left, triangle $ABC$ with side $BC$ extended to $D$; right, the same triangle with the construction used in the proof]
That is, show that in the left-hand figure we have $\angle ACD > \angle A$. The proof is
based on the figure on the right. [Query: If you know that $\angle ACD > \angle A$, how do
you infer that $\angle ACD > \angle B$?]
13. Show that there is only one perpendicular to a given line, from a given external point.
That is, show that the left-hand figure below is impossible for $A \neq B$. (We needed this
in order to explain what was meant by the $x$-coordinate of a point; $A$ must be determined
when $P$ is known.)
[Figure: left, two perpendiculars from an external point to a line $L$, meeting it at $A$ and $B$; right, a figure for the Pythagorean theorem]
14. Write the proof of the Pythagorean theorem suggested by the figure on the right above.
15. The proof of Theorem 1 of this section was incomplete: it discussed only the most
significant case and neglected to mention two other cases. The point is that if P and Q
lie on the same horizontal line, or the same vertical line, then there is no such thing as
$\triangle PQR$, and so the Pythagorean theorem cannot be used.
Show that the distance formula holds in the case $x_1 = x_2$, and also in the case
$y_1 = y_2$.
2.3 THE GRAPH OF A CONDITION. EQUATIONS FOR CIRCLES
Given a point $P$ and a positive number $r$, the circle with center $P$ and radius $r$ is the
set of all points of the plane whose distance from $P$ is equal to $r$. That is, a point $Q$
is on the circle if $PQ = r$.
This is the first and simplest example of the idea of the graph of a condition. If
we state a condition which every point of the plane either satisfies or doesn't satisfy,
then the graph of the condition is the set of all points of the plane that satisfy it. (Thus
the graph is simply the solution set of an open sentence; we use the word graph
when the solution set is a set of points.) In this language, we say that the graph of the
condition $OQ = r$ is the circle with center at the origin and radius $r$.
[Figure: the circle with center at the origin and radius $r$]
The interior of the circle with center P and radius r (r > 0) is the set of all points
Q such that PQ < r. Thus the interior is the graph of the inequality PQ < r. We
indicate such graphs in figures by means of shading or cross-hatching.
Sometimes the condition takes the form of an algebraic equation. For example,
if $Q \leftrightarrow (x, y)$, then the distance formula tells us that
$OQ = \sqrt{x^2 + y^2}$.
Therefore the condition
$OQ = r$ (1)
can be written in the form
$\sqrt{x^2 + y^2} = r$, (2)
or
$x^2 + y^2 = r^2$. (3)
The point $(x, y)$ is on the circle if and only if $x$ and $y$ satisfy (2). And
$\sqrt{x^2 + y^2} = r \Leftrightarrow x^2 + y^2 = r^2$ $(r > 0)$.
Thus the circle with center at the origin and radius 2 is the graph of
$\sqrt{x^2 + y^2} = 2 \Leftrightarrow x^2 + y^2 = 4$.
Similarly, the first quadrant is the graph of the condition $x > 0$ and $y > 0$.
[Figure: the first quadrant, where $x > 0$ and $y > 0$, and the fourth quadrant, where $x > 0$ and $y < 0$]
The fourth quadrant is the graph of the condition x > 0 and y < 0.
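We found that the circle with center at the origin and radius $r$ is the graph of the
equation
$x^2 + y^2 = r^2$.
[Figure: the circle with center $Q(a, b)$ and radius $r$, passing through a point $P(x, y)$]
Theorem 1. The circle with center $(a, b)$ and radius $r$ is the graph of the equation
$(x - a)^2 + (y - b)^2 = r^2$.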
An equation written in the above form is easy to interpret. For example, given
$(x + 2)^2 + (y - 5)^2 = 4$,
we see by Theorem 1 what the graph is. On the other hand, if such an equation
is "simplified" algebraically, it may look like this:
$x^2 + y^2 + 4x - 10y + 25 = 0$.
[Figure: the circle with center $(-2, 5)$ and radius 2]
To find out what the graph is, we first "unsimplify" by completing the square:
$x^2 + 4x + 4 + y^2 - 10y + 25 = 4 \Leftrightarrow (x + 2)^2 + (y - 5)^2 = 4$.
For an equation of the form
$x^2 + y^2 + Dx + Ey + F = 0$
there are three possibilities for the graph. In some cases, the graph is a circle. But
$x^2 + y^2 = 0$
is also an equation of this form, and its graph is not a circle, but a single point, namely
the origin. And the equation
$x^2 + y^2 + 1 = 0$
is never satisfied, for any $x$ and $y$. Its graph is therefore the empty set $\{\ \}$.
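By completing the square, starting with the general form, we shall show that these
three possibilities (a circle, a point, and the empty set) are in fact the only ones:
$x^2 + y^2 + Dx + Ey + F = 0$
$\Leftrightarrow x^2 + Dx + \dfrac{D^2}{4} + y^2 + Ey + \dfrac{E^2}{4} = \dfrac{D^2 + E^2 - 4F}{4}$
$\Leftrightarrow \left(x + \dfrac{D}{2}\right)^2 + \left(y + \dfrac{E}{2}\right)^2 = \dfrac{D^2 + E^2 - 4F}{4}$.
If the fraction on the right, in the last equation, is positive, then it is $= r^2$ for some
positive number $r$, and so the graph is the circle with center at $(-D/2, -E/2)$ and
radius $r$. If the fraction on the right is $= 0$, then the equation takes the form
$\left(x + \dfrac{D}{2}\right)^2 + \left(y + \dfrac{E}{2}\right)^2 = 0$,
and the graph is the single point $(-D/2, -E/2)$. If the fraction on the right is negative,
then the equation is never satisfied, and the graph is the empty set.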
x2 + y2 + Dx + Ey + F = 0
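The three possibilities for the graph of $x^2 + y^2 + Dx + Ey + F = 0$ can be sorted out mechanically by completing the square. The sketch below is one way to do it in Python; the function name `classify` and the returned labels are assumptions made for illustration. Exact fractions are used so that the test "is the right-hand side $= 0$?" is not disturbed by rounding.

```python
from fractions import Fraction

def classify(D, E, F):
    """Classify the graph of x^2 + y^2 + D*x + E*y + F = 0.

    Returns (kind, center, rhs), where rhs = (D^2 + E^2 - 4F)/4 is the
    right-hand side obtained by completing the square.
    """
    D, E, F = Fraction(D), Fraction(E), Fraction(F)
    center = (-D / 2, -E / 2)
    rhs = (D * D + E * E - 4 * F) / 4
    if rhs > 0:
        return ("circle", center, rhs)   # rhs is the square of the radius
    if rhs == 0:
        return ("point", center, rhs)    # the graph is the single point `center`
    return ("empty", None, rhs)          # no point satisfies the equation

print(classify(4, -10, 25))   # the "simplified" equation of the text
print(classify(0, 0, 0))      # x^2 + y^2 = 0
print(classify(0, 0, 1))      # x^2 + y^2 + 1 = 0
```

For the first call the center is $(-2, 5)$ and the right-hand side is 4, so the radius is 2.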
Problems 1 through 6.
In the illustration below six figures are drawn. For each of these figures, state a condition
which has the given figure as its graph. In the figure, the arrowheads merely indicate that the
line is supposed to go infinitely far in the indicated direction. Thus (1) and (2) are entire
lines; (3) is a ray, going infinitely far on the right, but stopping at the point $(0, 4)$ on the left;
and (6) is a segment, with endpoints $(1, -3)$ and $(4, -3)$.
[Figure: six figures, labeled (1) through (6), drawn in the coordinate plane]
Sketch the graphs of the following conditions, using cross-hatching to indicate regions.
11. $x^2 + y^2 = 1$ 12. $x^2 + y^2 < 1$ 13. $x^2 + y^2 > 1$
21. a) Sketch the graph of the condition "(x, y) is equidistant from the points (0, 1) and
(1, 0)."
b) Write this condition in the simplest possible algebraic form.
22. Write the simplest equation that you can get, for the set of all points that are equidistant
from (1, 2) and (0, 3). What sort of a figure is this graph? How is it related to the
segment from (1, 2) to (0, 3)?
23. Same problem, for the set of all points that are equidistant from (1, 2) and (2, 2).
24. Same problem, for the set of all points that are equidistant from $P_1 = (x_1, y_1)$ and
$P_2 = (x_2, y_2)$.
*25. Describe and sketch the graph of the equation
$\sqrt{x^2 + (y - 1)^2} + \sqrt{(x - 2)^2 + y^2} = 1$.
$x^3y + y^3x - xy = 0$.
$x^2y + xy^2 - xy = 0$.
29. Consider the set of all points that are twice as far from the origin as from the point
$(3, 0)$. Find an equation for this graph, and sketch.
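2.4 EQUATIONS OF LINES. SLOPES, PARALLELISM, AND PERPENDICULARITY
$Ax + By + C = 0$,
[Figure: a line $L$ in the coordinate plane]
$x^2 - 2a_1x + a_1^2 + y^2 - 2b_1y + b_1^2 = x^2 - 2a_2x + a_2^2 + y^2 - 2b_2y + b_2^2$
$\Leftrightarrow Ax + By + C = 0$,
with
$A = 2(a_2 - a_1)$, $B = 2(b_2 - b_1)$,
and
$C = a_1^2 + b_1^2 - a_2^2 - b_2^2$.
Here $A$ and $B$ cannot both be $= 0$; the number pairs $(a_1, b_1)$ and $(a_2, b_2)$ are the coordinates of $Q$ and $R$, and
$Q \neq R$, because $Q$ and $R$ are the endpoints of a segment.
An equation of this type is called a linear equation in $x$ and $y$. Thus we have
[Figure: a line $L$ in the coordinate plane]
If the line is not vertical, we can say more. In this case, the perpendicular segment
from $Q$ to $R$ is not horizontal, and this means that $b_2 - b_1 \neq 0$. Therefore $B \neq 0$,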
[Figure: segments from $P_1$ to $P_2$, with the signs of $\Delta x$ and $\Delta y$ indicated]
We shall show that all segments of the same line have the same slope, and that
this slope is the number $m$ which appears in the equation $y = mx + k$.
Given two points $P_1 \leftrightarrow (x_1, y_1)$ and $P_2 \leftrightarrow (x_2, y_2)$, on the line
$y = mx + k$,
then
$y_2 = mx_2 + k$ and $y_1 = mx_1 + k$.
Therefore
$y_2 - y_1 = m(x_2 - x_1)$
and
$\dfrac{y_2 - y_1}{x_2 - x_1} = m$.
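The claim that every segment of the line $y = mx + k$ has slope $m$ can be checked on a few sample points. This is a small Python sketch under assumed values ($m = 3$, $k = -2$, and four sample $x$-values, all chosen only for illustration); the function name `slope` is likewise an assumption.

```python
from fractions import Fraction

def slope(p1, p2):
    """Slope (y2 - y1)/(x2 - x1) of the segment from p1 to p2 (integer coordinates)."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("vertical segment: the slope is not defined")
    return Fraction(y2 - y1, x2 - x1)

m, k = 3, -2                                      # assumed sample line y = 3x - 2
points = [(x, m * x + k) for x in (-4, 0, 1, 7)]  # four points of that line
slopes = {slope(p, q) for p in points for q in points if p != q}
print(slopes)   # a set with a single element, equal to m
```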
Theorem 2. The graph of the equation
$y = mx + k$
is the nonvertical line with slope $m$ and $y$-intercept $k$. All segments of this line have
slope $= m$.
The equation given in this theorem is called the slope-intercept form of the equation
of the line.
[Figure: the line $y = x$]
A line can be described by many different equations. For example, the bisector
of the first and third quadrants above is the graph of each of the following equations:
$y = x$
$\Leftrightarrow x - y = 0$
$\Leftrightarrow 3x - 3y = 0$
$\Leftrightarrow (x - y)^2 = 0$
$\Leftrightarrow (x - y)^{177} = 0$,
and so on. But there is only one equation, in the slope-intercept form, for every
nonvertical line, because when the line is named, its slope and its y-intercept are
determined.
Often a line will be described by its slope $m$ and the coordinates $x_1$, $y_1$ of one of
its points. We can then find an equation for it in the following way. If $(x, y)$ is any
other point of the line, then
$\dfrac{y - y_1}{x - x_1} = m$,
because all segments of the line have the same slope $m$. Therefore
$y - y_1 = m(x - x_1)$.
The graph of this equation contains $(x_1, y_1)$, because $0 = m \cdot 0$. And the graph is a
line with slope $= m$, because the equation has the form
$y = mx + (y_1 - mx_1)$.
Thus:
Theorem 3. The graph of the equation $y - y_1 = m(x - x_1)$ is the line which has
slope $= m$ and contains the point $(x_1, y_1)$.
For example, the graph of the equation $y - 3 = -2(x + 1)$ is the line which has
slope $= -2$ and passes through the point $(-1, 3)$. Solving for $y$, we get the slope-intercept
form $y = -2x + 1$.
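Passing from the point-slope form to the slope-intercept form is a one-line computation, $k = y_1 - mx_1$. A minimal Python sketch, with the hypothetical helper name `slope_intercept`, reproduces the example above:

```python
from fractions import Fraction

def slope_intercept(m, x1, y1):
    """Return (m, k) such that y - y1 = m(x - x1) is the same line as y = m*x + k."""
    m = Fraction(m)
    k = Fraction(y1) - m * Fraction(x1)
    return m, k

m, k = slope_intercept(-2, -1, 3)   # slope -2 through the point (-1, 3)
print(f"y = {m}x + {k}")            # y = -2x + 1
```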
Theorem 4. Two nonvertical lines are parallel if and only if they have the same slope.
Given:
$L_1$: $y = m_1x + k_1$,
$L_2$: $y = m_2x + k_2$,
we need to prove two things: (1) If the slopes are the same, and the lines are different,
then the lines are parallel. (2) If the slopes are different, then the lines are not parallel.
1) If $m_1 = m_2$, then $k_1 \neq k_2$, because the lines are different. Therefore the lines are
parallel, because the two equations are inconsistent: they take the form
$y = m_1x + k_1$, $y = m_1x + k_2$ $(k_1 \neq k_2)$.
Theorem 5. If two nonvertical lines are perpendicular, then their slopes are negative
reciprocals of each other.
[Figure: perpendicular lines $L_1$ and $L_2$, intersecting at $T$]
Proof Given $L_1$ with slope $m_1$ and $L_2$ with slope $m_2$, intersecting at right angles at $T$.
Let $(a_1, b_1)$ and $(a_2, b_2)$ be points of $L_2$ which are equidistant from $T$. Then $L_1$ is the
perpendicular bisector of the segment between these points. As we found earlier, the
slope of $L_1$ is
$m_1 = -\dfrac{a_2 - a_1}{b_2 - b_1}$.
But we can calculate the slope $m_2$ of $L_2$ by the slope formula, using the points $(a_1, b_1)$
and $(a_2, b_2)$. This gives
$m_2 = \dfrac{b_2 - b_1}{a_2 - a_1}$.
Obviously $m_2 = -1/m_1$.
This also works the other way around:
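Theorem 6. Given two lines $L_1$, $L_2$, with slopes $m_1$, $m_2$. If
$m_2 = -\dfrac{1}{m_1}$,
then $L_1$ and $L_2$ are perpendicular.
[Figure: the lines $L_1$, $L_2$, and $L_2'$ through the point $T$]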
Proof First we observe that the lines cannot be parallel, because $m_2$ cannot be $= m_1$.
(Why?) Let $T$ be the point where they intersect. Let $L_2'$ be the line through $T$,
perpendicular to $L_1$. Then $L_2'$ has slope $m_2 = -1/m_1$. But through a given point
there is only one line with a given slope. (Why?) Therefore $L_2'$ is $L_2$, and $L_2$ is
perpendicular to $L_1$.
Probably you have seen these theorems proved before, in different ways. The
treatment given above is intended to avoid repetitions and also to furnish some
practice in drawing geometric conclusions by algebraic methods.
Find point-slope equations, and slope-intercept equations, for the lines containing
the following pairs of points.
1. $(-3, 2)$, $(2, 1)$ 2. $(3, -4)$, $(1, 2)$ 3. $(1, 0)$, $(3, 3)$ 4. $(-1, 1)$, $(2, -2)$
5. Find an equation for the tangent to the graph of
$x^2 + y^2 = 25$,
at the point $(3, 4)$.
6. Given that $P_1 \leftrightarrow (x_1, y_1)$ lies on the circle
$x^2 + y^2 = a^2$,
with
Let $P_2$ be the point where the tangent at $P_1$ crosses the $x$-axis. Find the distance $P_1P_2$.
[Warning: Geometric distances are never negative.]
7. Find the points $P$ on the circle $x^2 + y^2 = 2$ so that the tangent line to the circle at $P$
passes through the point $(2, 0)$. (You may use the fact that, at any point $P$ on a circle,
the tangent and the radius are perpendicular.)
$x^2 + y^2 + 1 + 2xy + 2x + 2y = 0$.
$x^2 + 4y^2 + 1 - 4xy + 2x - 4y = 0$.
a) $y = |x|$ b) $y = -|2x|$ c) $y = 1 - |x - 1|$
For this problem we offer a hint which applies equally well to a very large number of other
problems. If you didn't know the meaning of the symbol $|x|$, you would have no hope of
sketching the graph. This suggests that you should recall the definition of $|x|$, and use it.
[Hint: As a first step, sketch the portion of the graph that lies in the first quadrant.]
12. Sketch the graph of the equation
$y = x + |x| + 1$.
$|x| - |y| = 1$.
14. Sketch the graph of the equation
15. Let $C$ be the set of all points $P$ such that the segment from $(-1, 0)$ to $P$ is perpendicular
to the segment from $P$ to $(2, 1)$. What sort of figure is $C$? Sketch. (In answering this
one you should bear in mind that the endpoints of a segment are always different. That
is, there is no such thing as the segment from $P$ to $P$.)
*16. Let $A = (-2, 0)$, let $B = (2, 0)$, and let $G$ be the set of all points $P$ such that $\angle APB$
is an angle of 60°. What sort of figure is $G$? Sketch. (You will have to remember and
use some plane geometry, to do this one. If you have suitable drawing instruments, you
ought to be able to do a good sketch.)
2.5 GRAPHS OF INEQUALITIES. AND, OR, AND IF ... THEN
The graph of the equation
$(x - 1)^2 + (y - 1)^2 = 1$
is the circle with center at $A = (1, 1)$ and radius 1. The interior of the circle is the
graph of the condition $AP < 1$. This is the region marked $R_1$ in the figure. It is the
graph of the inequality
$(x - 1)^2 + (y - 1)^2 < 1$.
The graph of the equation
$y = 1 - x$
is a line $L$. The points lying above $L$ form a set $H_1$, called a half-plane. Evidently $H_1$
is the graph of the inequality
$y > 1 - x$.
The points lying below $L$ form a half-plane $H_2$; and $H_2$ is the graph of the inequality
$y < 1 - x$.
$\tfrac{1}{2} < x < \tfrac{3}{2}$.
The graph is an infinite vertical strip $R_1$, lying between the lines
$x = \tfrac{1}{2}$ and $x = \tfrac{3}{2}$.
Similarly, the graph of
$\tfrac{1}{2} < y < 1$
is an infinite horizontal strip, as shown on the left below.
[Figure: left, the horizontal strip $\tfrac{1}{2} < y < 1$; right, the cross-shaped region described below]
$\tfrac{1}{2} < x < \tfrac{3}{2}$ or $\tfrac{1}{2} < y < 1$.
The graph of the condition using or is an infinite cross-shaped region. This region R'
is the union of an infinite vertical strip R1 and an infinite horizontal strip R2; it contains
all points of the plane that belong to R1 or to R2.
(In mathematics, when we say that one condition holds or another condition
holds, we allow the possibility that both conditions hold. If we mean " ... but not
both," we have to say so.)
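The inclusive meaning of "or" can be made concrete by testing sample points against two conditions. In the Python sketch below, the strip bounds ($\tfrac{1}{2} < x < \tfrac{3}{2}$ and $\tfrac{1}{2} < y < 1$) and all function names are assumptions chosen for illustration.

```python
def in_vertical_strip(x, y):
    return 0.5 < x < 1.5          # assumed bounds for the vertical strip

def in_horizontal_strip(x, y):
    return 0.5 < y < 1.0          # assumed bounds for the horizontal strip

def in_union(x, y):
    # "or" in the mathematical sense: at least one condition holds, possibly both
    return in_vertical_strip(x, y) or in_horizontal_strip(x, y)

def in_intersection(x, y):
    # "and": both conditions hold
    return in_vertical_strip(x, y) and in_horizontal_strip(x, y)

def in_exactly_one(x, y):
    # "... but not both" has to be said separately
    return in_vertical_strip(x, y) != in_horizontal_strip(x, y)

for point in [(1, 0.75), (1, 3), (3, 0.75), (3, 3)]:
    print(point, in_union(*point), in_intersection(*point), in_exactly_one(*point))
```

The point $(1, 0.75)$ lies in both strips: it belongs to the union, but it fails the "but not both" condition.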
Similarly, the graphs of the conditions
$y > x$, $y > -x$
are two half-planes $H_1$ and $H_2$. They are respectively to the left of the line $y = x$
and to the right of the line $y = -x$, as shown in the figure on the left below. The graph
of the condition
$y > x$ and $y > -x$
[Figure: the lines $y = -x$ and $y = x$, with the half-planes $H_1$ and $H_2$]
$y > x$ or $y > -x$
[Figure: the regions $R$, $R'$, $S_1$, and $S_2$]
Let us now see what sort of graph we get when we combine two inequalities by
"if ... then." Consider the condition
If $\tfrac{1}{2} < x < \tfrac{3}{2}$, then $\tfrac{1}{2} < y < 1$.
1) If $(x, y)$ is a point of $R$, and $\tfrac{1}{2} < x < \tfrac{3}{2}$, then we must have $\tfrac{1}{2} < y < 1$. Therefore
the part of $R$ that lies between the lines $x = \tfrac{1}{2}$ and $x = \tfrac{3}{2}$ must be the interior of a
rectangle, as indicated by the dashed lines in the figure.
2) On the other hand, if $x$ is not between $\tfrac{1}{2}$ and $\tfrac{3}{2}$, then the condition for the graph
imposes no restriction on $y$ at all. Therefore $R$ contains all points to the left of the
line $x = \tfrac{1}{2}$ and all points to the right of the line $x = \tfrac{3}{2}$. $R$ also contains these two
vertical lines, for the same reason.
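An "if ... then" condition holds at a point exactly when the hypothesis fails there or the conclusion holds there. The following Python sketch illustrates this reading; the bounds $\tfrac{1}{2}$, $\tfrac{3}{2}$ for $x$ and $\tfrac{1}{2}$, 1 for $y$ are assumed sample values, and the function name `satisfies` is hypothetical.

```python
def satisfies(x, y):
    """'If 1/2 < x < 3/2, then 1/2 < y < 1', read as (not hypothesis) or conclusion."""
    hypothesis = 0.5 < x < 1.5
    conclusion = 0.5 < y < 1.0
    return (not hypothesis) or conclusion

print(satisfies(1, 0.75))    # inside the rectangle: the condition holds
print(satisfies(1, 2))       # hypothesis true, conclusion false: the condition fails
print(satisfies(0.5, 100))   # on a boundary line: hypothesis false, no restriction on y
print(satisfies(3, -5))      # far to the right: hypothesis false, no restriction on y
```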
The reasoning in (2) may seem a little tricky, but may be clarified by an analogy
from everyday life. The law in most places requires that if a person has seriously
defective vision, then he must wear corrective glasses when driving a car. A person
with normal vision automatically obeys this law; its restrictive clause does not apply
to him. In the same way, the "law"
[Figure: the four quadrants, each marked according to whether its points obey the "law"]
$x < 0$, $y > 0$: Yes    $x > 0$, $y > 0$: Yes
$x < 0$, $y < 0$: Yes    $x > 0$, $y < 0$: No
There is no need to use graph paper in the following problem set. Reasonably
neat freehand sketches, with cross-hatching used to indicate regions, are sufficient.
2.6 PARABOLAS
The distance from a point to a line is the length of the perpendicular from the point
to the line. Given a point $F$ and a line $D$ not containing $F$, the parabola with focus $F$
and directrix $D$ is the set of all points of the plane that are equidistant from $F$ and $D$.
That is, a point $P$ is on the parabola if
$FP = MP$,
where $M$ is the foot of the perpendicular from $P$ to $D$. The perpendicular line to $D$
through $F$ is called the axis of the parabola. The point where the axis crosses the
parabola is called the vertex. (There is only one such point, because any such point
is midway between the focus and the directrix.)
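The first step in the study of parabolas is to get equations for them.
[Figure: a parabola with vertex at the origin, focus $F$, directrix $D$: $y = -\dfrac{p}{2}$, and a point $P$ with foot $M$ on $D$]
In setting up our axes, we take the vertex as the origin, and the $x$-axis parallel to
the directrix, in such a way that $D$ is below the $x$-axis and the focus is above it. The
number $p$ is the distance from the focus to the directrix. Now let $P \leftrightarrow (x, y)$ be a
point of the parabola. Then
$FP = \sqrt{x^2 + \left(y - \dfrac{p}{2}\right)^2}$
and
$MP = \sqrt{\left(y + \dfrac{p}{2}\right)^2}$.
Therefore
$FP = MP$
$\Leftrightarrow \sqrt{x^2 + \left(y - \dfrac{p}{2}\right)^2} = \sqrt{\left(y + \dfrac{p}{2}\right)^2}$
$\Leftrightarrow x^2 + \left(y - \dfrac{p}{2}\right)^2 = \left(y + \dfrac{p}{2}\right)^2$
$\Leftrightarrow x^2 + y^2 - py + \dfrac{p^2}{4} = y^2 + py + \dfrac{p^2}{4}$
$\Leftrightarrow x^2 = 2py$
$\Leftrightarrow y = \dfrac{1}{2p}x^2$
$\Leftrightarrow y = ax^2$,
where $a = \dfrac{1}{2p}$ and $p = \dfrac{1}{2a}$.
The graph of
$y = ax^2$
is a parabola, with focus at $\left(0, \dfrac{1}{4a}\right)$ and directrix
$y = -\dfrac{1}{4a}$.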
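[Figure: the parabola $y = ax^2$, with focus $F$, a point $P$ on the curve, and the foot $M$ of the perpendicular from $P$ to the directrix $y = -\dfrac{1}{4a}$]
$FP = MP \Leftrightarrow y = ax^2$.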
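The equivalence $FP = MP \Leftrightarrow y = ax^2$ can be spot-checked numerically: for points on the curve, the distance to the focus $(0, 1/4a)$ should match the distance to the directrix $y = -1/4a$ up to rounding. This Python sketch uses the assumed sample value $a = 2$ and a hypothetical helper name.

```python
import math

def focus_directrix_gap(a, x):
    """|FP - MP| for P = (x, a*x^2), with focus (0, 1/(4a)) and directrix y = -1/(4a)."""
    y = a * x * x
    c = 1 / (4 * a)
    fp = math.hypot(x, y - c)   # distance from P to the focus
    mp = abs(y + c)             # distance from P to the directrix
    return abs(fp - mp)

# Largest gap over sample points x = -3.0, -2.9, ..., 3.0 on y = 2x^2.
worst = max(focus_directrix_gap(2.0, i / 10) for i in range(-30, 31))
print(worst)   # only floating-point rounding error remains
```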
If a parabola is situated like this, relative to the axes, then the parabola is said
to be in standard position. The use of standard position simplifies the equation
considerably. For example, if $F$ is the point $(2, -1)$ and $D$ is the line $y = 3$, then the
parabola is the graph of the equations
$FP = MP$
$\Leftrightarrow \sqrt{(x - 2)^2 + (y + 1)^2} = \sqrt{(y - 3)^2}$
$\Leftrightarrow x^2 - 4x + 4 + y^2 + 2y + 1 = y^2 - 6y + 9$
$\Leftrightarrow x^2 - 4x - 4 = -8y$
$\Leftrightarrow y = -\tfrac{1}{8}x^2 + \tfrac{1}{2}x + \tfrac{1}{2}$.
It is not hard to check, in general, that if the directrix is horizontal, then the equation
always takes the form
$y = Ax^2 + Bx + C$, $A \neq 0$.
If the directrix is vertical, then the equation takes the form
$x = Ay^2 + By + C$, $A \neq 0$.
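For a horizontal directrix the coefficients can be computed directly: squaring $FP = MP$ for a focus $(h, k)$ and a directrix $y = d$ gives $y = \dfrac{(x - h)^2 + k^2 - d^2}{2(k - d)}$. The Python sketch below carries this out with exact fractions; the names `parabola_coefficients`, `h`, `k`, `d` are introduced here for illustration only.

```python
from fractions import Fraction

def parabola_coefficients(h, k, d):
    """(A, B, C) with y = A*x^2 + B*x + C, for focus (h, k) and directrix y = d."""
    h, k, d = Fraction(h), Fraction(k), Fraction(d)
    if k == d:
        raise ValueError("the focus must not lie on the directrix")
    denom = 2 * (k - d)
    # Expand ((x - h)^2 + k^2 - d^2) / denom in powers of x.
    return (1 / denom, -2 * h / denom, (h * h + k * k - d * d) / denom)

A, B, C = parabola_coefficients(2, -1, 3)   # focus (2, -1), directrix y = 3
print(A, B, C)                              # -1/8 1/2 1/2
```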
If the directrix is neither horizontal nor vertical, then the equation involves, in
general, terms in $x^2$, $y^2$, and $xy$, as well as linear terms and a constant. In this case it
is hard to derive the equation when the focus and directrix are given; and it is even
harder, when the equation is given, to see that the graph is a parabola. This case will
be discussed in Chapter 8.
For a long time to come, however, we shall deal only with the simplest case, in
which the directrix is horizontal.
Parabolas arise in a variety of contexts which appear at first to be unrelated.
Following are a few.
1) If a right circular cone is cut by a plane parallel to an element of the cone, the
resulting curve is a parabola. This was the viewpoint from which the Greeks studied
parabolas; and it is for this reason that a parabola is one of the conic sections. There
are other kinds of conic sections, obtained by slicing cones by planes in various
positions.
[Figure: left, a parabola with its directrix $D$; right, the path of a projectile]
2) If a theoretical projectile is fired from the surface of the earth, in any direction
other than straight upward, the path that it moves along is a portion of a parabola.
In the figure on the right above, the x-axis lies along the surface of the earth, the
$y$-axis is vertical, $\angle \alpha$ represents the angle at which the gun is aimed, and $T$ is the
point where the projectile hits the ground. We say, "a theoretical projectile," because
to get this result you must assume both that the weight of the projectile is independent
of its altitude and that the air makes no resistance. These assumptions are false, but
they are good approximations to the truth, if the projectile is not going very fast or
very high. For high-speed, long-range projectiles, both assumptions are quite
unrealistic, and the situation is more complicated.
3) If you rotate a parabola around its axis, you get a surface which is called a paraboloid
of revolution. The mirror in a reflecting telescope is a paraboloid of revolution,
as is the reflector in an automobile headlight. The reason is that if a ray of light
travels along a line parallel to the axis, and is reflected in the usual way, it always hits
the focus. And conversely, if a ray of light starts at the focus, hits the surface and is
reflected, it always continues along a line parallel to the axis. The first of these principles
is used in telescopes, and the second in headlights.
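[Figure: rays of light parallel to the axis of a paraboloid, reflected to the focus $F$]
horizontal axis as the $t$-axis; we measure time starting at the moment of firing; and
we plot, for each time $t$, the height of the projectile at time $t$.
[Figure: the height $h$ of the projectile plotted against the time $t$]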
1. Take a full-size sheet of graph paper; draw the y-axis in the center; and draw the x-axis
near the bottom of the paper. Then choose the largest uniform scale that you can,
on the axes, in such a way that x ranges from - 2 to 2 and y ranges from -! to 4.
Now sketch the graph of $y = x^2$. First plot the points corresponding to the following
values of $x$:
$x = 0$, $x = 0.1$, $x = 0.2$, ..., $x = 0.9$, $x = 1$,
Then draw the curve, freehand, as smoothly as you can. If this is done carefully, it will
really look as if FP = MP at every point of the curve.
One of the reasons for doing this is that it will give you an accurate idea of what a
parabola really looks like.
2. Show that
$0 < x_1 < x_2 \Rightarrow x_1^2 < x_2^2$.
[Figure: the parabola $y = ax^2$, $a > 0$, with $x_1 < x_2$ and $y_1 < y_2$]
This means that the right-hand half of a parabola in standard position rises as we go
from left to right along the curve.
3. Show that
2.7 TANGENTS
Definition. A tangent to a circle is a line (in the same plane) which intersects the
circle in one and only one point. This point is called the point of contact.
It is then shown that a line is tangent to the circle if and only if the line is per
pendicular to the radius drawn to the point of contact. (In fact, the latter condition
is probably the one that you used to find the slopes of tangent lines to circles, in
Problem Set 2.4.)
[Figure: left, a line $L$ tangent to a circle; right, the ellipse $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} = 1$ with a tangent line]
Tangency can be defined in the same way for an ellipse. Ellipses will be studied
in Chapter 8. Meanwhile we observe that an ellipse is an oval curve, of the sort shown
in the right-hand figure above, and the tangents to it are the lines that intersect
it in one and only one point.
But for some curves, tangents cannot be described by the definition that we use
for circles. Consider, for example, a parabola, as shown in the figure below. The
tangent to the parabola, at the point $(x_1, y_1)$, intersects the curve only at $(x_1, y_1)$. But
the vertical line through $(x_1, y_1)$ has the same property; and the vertical line is not
a tangent.
We may try to get around this trouble by providing that the tangent line must
only touch the curve, without crossing it. But for many curves, this won't work
either. The graph of $y = x^3$ is shown below. The tangent to this curve at the origin
turns out to be the $x$-axis; and the $x$-axis crosses the curve, at the point of tangency.
In other cases a tangent line may cross a curve in many points.
[Figure: left, a parabola with its tangent and the vertical line through $(x_1, y_1)$; right, the graph of $y = x^3$]
The geometric idea of tangency is obvious in all these cases. But the above
examples indicate that the mathematical definition that works for circles does not
work in general. To find the tangents to other curves, we need a better definition.
Consider first the graph of $y = x^2$, and the fixed point $(1, 1)$ at which we want to
find the slope of the tangent. For every other point $(x, x^2)$ of the curve, let $L_x$ be the
secant line through $(1, 1)$ and $(x, x^2)$. Then the slope of $L_x$ is
$m_x = \dfrac{x^2 - 1}{x - 1}$ $(x \neq 1)$.
Here the restriction $x \neq 1$ reflects the geometric fact that it takes two different
points to determine a line. It also refers to the algebraic fact that fractions with
denominator 0 have no meaning.
[Figure: left, the parabola $y = x^2$ with the secant line $L_x$ through $(1, 1)$; right, the graph of $y = m_x$]
$y = m_x = x + 1$ $(x \neq 1)$.
The graph is a line from which one point has been deleted. For $x = 1$, there is no
such thing as "the secant line through $(1, 1)$ and $(1, 1^2)$"; and for $x = 1$, there is no
such thing as the "fraction" $m_1 = 0/0$. But this causes no trouble, because it is easy
to see that $m_x$ is very close to 2 when $x$ is very close to 1. We express this by writing
$\lim_{x \to 1} m_x = 2$.
This is read: "The limit of $m_x$, as $x$ approaches 1, is equal to 2." Later we shall give a
general definition of the idea of a limit. But in the present case, the meaning of the
limit is clear, and so we use it in the definition of the tangent to the parabola.
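The statement that $m_x$ is very close to 2 when $x$ is very close to 1 is easy to watch numerically. The following Python sketch tabulates secant slopes for a few values of $x$ on both sides of 1; the function name `secant_slope` and the chosen step sizes are illustrative assumptions.

```python
def secant_slope(x):
    """Slope m_x of the secant through (1, 1) and (x, x^2), defined only for x != 1."""
    if x == 1:
        raise ZeroDivisionError("m_1 would be 0/0, which has no meaning")
    return (x * x - 1) / (x - 1)

for h in (0.1, 0.01, 0.001, -0.001, -0.01, -0.1):
    x = 1 + h
    print(x, secant_slope(x))   # each slope is close to x + 1, hence close to 2
```

No value of $x$ in the table equals 1; the limit describes the missing point of the graph, not a value of $m_x$.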
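Definition. The tangent to the graph of
$y = ax^2 + bx + c$,
at a point $(x_0, y_0)$ of the graph, is the line through $(x_0, y_0)$ with slope
$m = \lim_{x \to x_0} m_x$,
where $m_x$ is the slope of the secant line passing through the points $(x_0, y_0)$ and
$(x, ax^2 + bx + c)$ $(x \neq x_0)$.
Even in the general case, the slope is easy to calculate on the basis of this definition.
We have
$m_x = \dfrac{(ax^2 + bx + c) - (ax_0^2 + bx_0 + c)}{x - x_0} = \dfrac{a(x^2 - x_0^2) + b(x - x_0)}{x - x_0} = ax + (ax_0 + b)$ $(x \neq x_0)$.
The graph of $y = m_x$ is a line with one point missing. The line from which the point
is missing is shown on the left below. The graph of $y = m_x$ is on the right.
[Figure: left, the line $y = ax + (ax_0 + b)$; right, the graph of $y = m_x$, with the point at $x = x_0$ missing]
Here again, the limit of $m_x$ is simply the $y$-coordinate of the point that is missing from
the graph. Thus we have:
Theorem. The number $2ax_0 + b$ is the slope of the tangent, at the point $(x_0, y_0)$, to the graph of
$y = ax^2 + bx + c$.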
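For a general quadratic, the secant slopes $ax + (ax_0 + b)$ approach $2ax_0 + b$ as $x$ approaches $x_0$. The Python sketch below compares the two for the sample quadratic $y = x^2 - 4x + 4$; the function names and the choice of sample points are assumptions made for illustration.

```python
def tangent_slope(a, b, x0):
    """Limit of the secant slopes of y = a*x^2 + b*x + c at x0 (c does not matter)."""
    return 2 * a * x0 + b

def quadratic_secant_slope(a, b, c, x0, x):
    """Slope of the secant through (x0, f(x0)) and (x, f(x)), for x != x0."""
    def f(t):
        return a * t * t + b * t + c
    return (f(x) - f(x0)) / (x - x0)

a, b, c = 1, -4, 4                  # the sample quadratic y = x^2 - 4x + 4
for x0 in (-2, 0, 2):
    near = quadratic_secant_slope(a, b, c, x0, x0 + 1e-6)
    print(x0, tangent_slope(a, b, x0), near)
```

At $x_0 = 2$ the tangent slope is 0, which fits the fact that $y = (x - 2)^2$ has its lowest point there.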
For some curves, there is no tangent. Consider, for example, the graph of
$y = |x|$, at the point $(0, 0)$. For each $x \neq 0$,
$m_x = \dfrac{|x| - 0}{x - 0} = \dfrac{|x|}{x}$.
Thus:
$m_x = 1$ for $x > 0$, $m_x = -1$ for $x < 0$.
[Figure: left, the graph of $y = |x|$; right, the graph of $y = m_x$, with $y = 1$ for $x > 0$ and $y = -1$ for $x < 0$]
(Remember the definition of $|x|$.) Therefore the graph of $y = m_x$ looks like the
drawing on the right above. For this graph there is no one number that $y$ is close to,
whenever $x$ is close to 0. Therefore, there is no such thing as
$\lim_{x \to 0} m_x$.
1. You already have a carefully drawn graph of the equation $y = x^2$. At each point $(x, y)$
of the graph, the slope of the tangent ought to be $2x$. Check this graphically by drawing
lines of the proper slope at the points where $x = 0.2$, 0.4, 0.6, 0.8, and 1.
2. Given $y = x^2 - 4x + 4$. Find the slopes of the tangents at the points where $x = -2$,
$x = 0$, and $x = 2$, and sketch, showing all three of these tangents.
$y = ax^2 + bx + c$
$y = a(x - A)^2 + B$.
For $a > 0$, this means that the point where $x = A$ is the lowest point on the curve.
Find the slope of the tangent at this point.
and a point $(x_0, y_0)$ of the curve. Show that the tangent at $(x_0, y_0)$ is the only nonvertical
line which passes through $(x_0, y_0)$ and has no other point in common with the
$(y - y_0) = m(x - x_0)$
$m = 2ax_0$.
$y = ax^2 + bx + c$.
6. a) Get a plausible answer for the slope of the tangent to the graph of $y = x^3$, at the
point $(1, 1)$. Sketch the graph of $y = m_x$, explain what sort of graph it is, and explain
as well as you can why your value for the slope is plausible.
b) Do the same for $y = x^3$, at an arbitrary point $(x_0, x_0^3)$.
7. a) Show that, if $m < 0$, then the line through the origin with slope $m$ meets the graph
of $y = x^3$ at precisely one point.
b) Show that, if $m > 0$, then the line through the origin with slope $m$ meets the graph
of $y = x^3$ at precisely three points.
8. Sketch the graph of
$y = x|x|,$
and describe this curve in terms of types of curve that we already know about. At
which points does this graph have a tangent? What is $S_2$? What is $S_{-2}$? Give, if
possible, a general formula for $S_x$. Is there such a thing as $S_0$?
9. Consider the graph of
$y = x^3 - 4x.$
Where does this cross the x-axis? At which points is the tangent horizontal? What is
the slope of the tangent at (0, 0)? For what values of x is y > 0? For what values
of x is y < 0? Use this information to draw a reasonable sketch of the graph, plotting
only five points.
$y = 2x^3 - 6x.$
11. Show that every parabola has the reflecting property. In the figure, T is the tangent at P,
and you need to show that $\alpha = \beta$. The key to the proof is that the quadrilateral FPRQ
is a rhombus. (That is, all four sides have the same length.)
2.8 A Shorthand for Sums 49
$S_n = a + (a + d) + (a + 2d) + \cdots + (a + [n - 1]d).$
A geometric series is a sum of the form
There is a shorthand for sums, which makes them easier to handle. Given a sum
we write
$$S_n = \sum_{i=1}^{n} a_i.$$
(This is pronounced: "The summation from 1 to n of $a_i$.") That is, when we write
$\sum_{i=1}^{n}$ and follow it with an expression involving i, this means that we are to substitute
all (integral) values of i, from 1 to n, and add the results.
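$$S_n = a + ar + ar^2 + \cdots + ar^{n-1}$$
can be written as
i          | 1        | 2        | 3        | 4        | ... | n
ar^{i-1}   | ar^{1-1} | ar^{2-1} | ar^{3-1} | ar^{4-1} | ... | ar^{n-1}
ith term   | a        | ar       | ar^2     | ar^3     | ... | ar^{n-1}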
$$S_n = \sum_{i=1}^{n} [a + (i - 1)d].$$
This can be checked by means of a table of the sort that we gave above for the case
of the geometric series.
In each case, the formula after $\sum$ gives the ith term; i = 1 gives the first term,
i = 2 gives the second term, and so on. This will always be true so long as we are
$$\sum_{i=2}^{5} i^3.$$
Here we take all values of i from i = 2 to i = 5 inclusive and add the results. Therefore
$$\sum_{i=2}^{5} i^3 = 2^3 + 3^3 + 4^3 + 5^3 = 8 + 27 + 64 + 125 = 224.$$
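The summation shorthand translates directly into a loop over the index. The following is a minimal sketch in Python; the helper name `summation` is an illustrative choice, not notation from the text.

```python
def summation(term, m, n):
    """Return term(m) + term(m + 1) + ... + term(n), both limits included."""
    return sum(term(i) for i in range(m, n + 1))

# The example above: the sum of i^3 for i = 2, 3, 4, 5.
cubes = summation(lambda i: i ** 3, 2, 5)
print(cubes)
```

Note that `range(m, n + 1)` is used because the upper limit n is included in the sum.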
In general, for $m \le n$,
$$\sum_{i=m}^{n} a_i = a_m + a_{m+1} + \cdots + a_n.$$
Thus
4
Note that $\sum$ applies only to the expression immediately after it; in the last line, we
are told to add the numbers $a_i^2$ (from i = 2 to i = 4) and add 3 to the result. The
parentheses in the formula $\sum_{i=2}^{4} (a_i^2 + 1)$ indicate that 1 is part of every term of the
sum.
1. $\sum_{i=1}^{3} i^2$
2. $\sum_{i=1}^{3} (i - 1)^2$
3. $\sum_{i=2}^{5} (i^2 - 1)$
4. $\sum_{i=1}^{4} 2i^2$
5. $\sum_{i=2}^{3} (i^3 - 1)$
Each of the sums below is of the form $\sum_{i=m}^{n} a_i$. Write each of them in the long form
$a_m + a_{m+1} + \cdots + a_n$.
3 n
s. I (3b� + d) 9. !i7
i=3 i=2 i=m
Convert each of the indicated sums to the short form:
k-1 k
--
L kai = k 2, ai?
i=l i=l
Why or why not?
L..
( -) · -
hi 2 h
=
h3
-a
n
L.. iz?
i=1 n n n i=l
17. For $0 \le k \le n$, $\binom{n}{k}$ is the number of subsets with exactly k elements, in a given set with
n elements. For example, $\binom{52}{13}$ is the number of possible 13-card bridge hands; $\binom{52}{5}$ is
the number of possible 5-card draw poker hands. Show that $\binom{n}{k} = \binom{n}{n-k}$.
Consider the following game. We have three spindles, of the sort used as targets in
quoits. On the first spindle is a stack of wooden disks, diminishing in size from bottom
to top. (See the figure.) The disks are numbered 1, 2, 3, . .. , n, from top to bottom;
in the figure, n = 5.
[Figure: three spindles labeled A, B, and C, with five disks stacked on spindle A.]
A legal move consists in taking the topmost disk from one spindle and placing it
on one of the other spindles, provided that we never, at any stage, place a disk
above a smaller disk.
At the start, all the disks are on spindle A. The object of the game is to get all
the disks onto spindle B, by a series of legal moves.
For example, we might begin by taking disk 1 off spindle A and putting it on
spindle B. There would then be three possibilities for the second move: ( 1) Put disk 1
back on spindle A, (2) put disk 1 on spindle C, and (3) put disk 2 on spindle C. It
would not be legal to put disk 2 on spindle B, because disk 2 would then be above
disk 1, which is smaller.
We shall see that the game can always be completed, no matter how large the
positive integer n may be. For each positive integer n, let $P_n$ be the proposition that
the game can be completed, starting with n disks. What we need to show is that all
of the propositions $P_n$ are true.
Lemma 1. P1 is true.
Proof of Lemma 1. Move the one and only disk from spindle A to spindle B. Then
the game is over.
Lemma 2. $P_2$ is true.
Proof of Lemma 2. (1) Move disk 1 to spindle C. (2) Move disk 2 to spindle B.
(3) Move disk 1 to spindle B. Then the game is over.
Lemma 3. P3 is true.
And Lemma 1 tells us that the first statement in the chain is true. Therefore all of
the statements $P_1, P_2, \ldots$ are true. This idea is conveyed mathematically as follows:
a) $P_1$ is true, and
b) $P_n \Rightarrow P_{n+1}$ for every n,
The problem of the disks is probably the clearest illustration of what the induction
principle means. The principle is used continually, in all branches of mathematics.
In this section, we shall use it to get short formulas for certain sums.
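The step from $P_n$ to $P_{n+1}$ in the problem of the disks can be read as a recursive procedure: clear the smaller disks out of the way, move the largest disk, then bring the smaller disks back on top of it. The following is a minimal sketch in Python, not a construction taken from the text; the function names `solve` and `is_legal` are illustrative assumptions.

```python
def solve(n, source="A", target="B", spare="C"):
    """Return a list of moves (disk, from, to) transferring disks 1..n
    from `source` to `target`."""
    if n == 0:
        return []
    # Use the case of n - 1 disks to park the smaller disks on the spare spindle.
    moves = solve(n - 1, source, spare, target)
    # The largest disk can now move directly to its destination.
    moves.append((n, source, target))
    # Use the case of n - 1 disks again to stack the smaller disks on top of it.
    moves += solve(n - 1, spare, target, source)
    return moves

def is_legal(n, moves):
    """Replay the moves, checking that only topmost disks are moved and that
    no disk is ever placed above a smaller disk."""
    spindles = {"A": list(range(n, 0, -1)), "B": [], "C": []}  # last item = top
    for disk, src, dst in moves:
        if not spindles[src] or spindles[src][-1] != disk:
            return False
        if spindles[dst] and spindles[dst][-1] < disk:
            return False
        spindles[dst].append(spindles[src].pop())
    return spindles["B"] == list(range(n, 0, -1))

print(solve(2))
print(len(solve(5)), is_legal(5, solve(5)))
```

For n = 2 the procedure reproduces the three moves in the proof of Lemma 2.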
2.9 The Induction Principle and the Well-Ordering Principle 53
$$\Rightarrow \sum_{i=1}^{n+1} i = \left(\frac{n}{2} + 1\right)(n + 1)$$
$$\Rightarrow \sum_{i=1}^{n+1} i = \frac{n + 1}{2}(n + 2).$$
In this chain of implications, the first equation is $P_n$ and the last is $P_{n+1}$. Therefore
$P_n \Rightarrow P_{n+1}$. By the induction principle, $P_n$ is true for every n, which was to be proved.
In fact, there is a simpler way of getting this result. If
$$S_n = 1 + 2 + 3 + \cdots + (n - 1) + n,$$
then
$$S_n = n + (n - 1) + (n - 2) + \cdots + 2 + 1;$$
and adding terms in pairs, we get
$$2S_n = (n + 1) + (n + 1) + \cdots + (n + 1)$$
to n terms. Therefore
$$2S_n = n(n + 1) \quad \text{and} \quad S_n = \frac{n}{2}(n + 1),$$
as before. This device is neat but very special. Consider now the problem of calculating
$$S_n = \sum_{i=1}^{n} i^2 = 1^2 + 2^2 + 3^2 + \cdots + n^2.$$
We have just found that the sum of the first n positive integers is a polynomial in n,
of degree 2. This suggests that $S_n$ is a polynomial of degree 3. That is, we conjecture
that
$$S_n = An^3 + Bn^2 + Cn + D,$$
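We want $P_n \Rightarrow P_{n+1}$, to make the induction proof work. This means that
$$\sum_{i=1}^{n} i^2 = \tfrac{1}{3}n^3 + \tfrac{1}{2}n^2 + \tfrac{1}{6}n,$$
for every n. Taking a common denominator on the right and factoring, we get:
$$\sum_{i=1}^{n} i^2 = \frac{n}{6}(n + 1)(2n + 1).$$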
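The conjecture that the sum of the first n squares is a cubic polynomial in n can be explored by fitting the four coefficients to the first four values of the sum. This is a numerical check, not the induction argument of the text; the helper names below are illustrative assumptions, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction

def sum_of_squares(n):
    return sum(i * i for i in range(1, n + 1))

def solve_linear(rows):
    """Gauss-Jordan elimination on an augmented matrix of Fractions."""
    size = len(rows)
    rows = [row[:] for row in rows]
    for col in range(size):
        pivot = next(r for r in range(col, size) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(size):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

# One equation A n^3 + B n^2 + C n + D = S_n for each of n = 1, 2, 3, 4.
system = [[Fraction(n ** 3), Fraction(n ** 2), Fraction(n), Fraction(1),
           Fraction(sum_of_squares(n))] for n in range(1, 5)]
A, B, C, D = solve_linear(system)
print(A, B, C, D)

# The fitted cubic should then agree with the sum for other values of n as well.
print(all(A * n**3 + B * n**2 + C * n + D == sum_of_squares(n)
          for n in range(1, 50)))
```

Agreement for finitely many n is only evidence; the induction proof is what establishes the formula for every n.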
For some purposes, the following idea is easier to use than the Induction Principle.
(See, for example, Problems 10 and 12 below.) The Well-Ordering Principle and the
Induction Principle are equivalent. (See Problems 14 and 15 below.)
1. Prove by any method that for every n, the sum of the first n odd numbers is $n^2$. That is,
$$\sum_{i=1}^{n} (2i - 1) = n^2.$$
This can be shown by induction, but there are at least two other ways.
$$\sum_{i=1}^{n} (3i - 1).$$
$$\sum_{i=1}^{n} (4i - 2).$$
6. Find a formula for
$$\sum_{i=1}^{n} (i^2 + i + 1).$$
7. Find a formula for
$$\sum_{i=1}^{n} (i^2 - i).$$
9. a) Let $p_n$ be the number of moves required to complete the game with n disks. Show
that for every n,
$$p_{n+1} = 2p_n + 1.$$
$$p_n = 2^n - 1.$$
(Since $2^{10} = 1024$, this means that the game with 20 disks requires over a million
moves. Thus, if you want to verify that $P_{20}$ is true, the easiest way to do it is to
show by induction that $P_n$ is true for every n, and then set n = 20.)
*10. Throughout this problem, the numbers under discussion are positive integers. If
a = bc for some c, then b is called a factor of a (or a divisor of a). If p > 1, and the
only positive factors of p are p and 1, then p is a prime. Obviously every prime has a
prime factor, namely, itself. Prove that every number greater than 1 has a prime factor.
[Beginning of the proof: "Let K be the set of all numbers which are greater than 1 and
have no prime factors. We need to show that K is empty. If K is not empty, then ..."]
* 11. Following is the beginning of Euclid's proof that there are infinitely many primes.
Suppose that there are only a finite number of primes, say
$G_i = G_{i-1} + i.$
Let Tn be the total number of gifts sent on the first n days of Christmas. Get a
formula for Tn, in the form
$$\frac{?(? + ?)(? + ?)}{?}$$
As a check, the final value is T12 = 364. (I am indebted, for this problem, to Professor
Thomas F. Banchoff.)
* 14. Show that, if the Well-Ordering Principle is taken as a postulate, then the Induction
Principle can be proved as a theorem. [Start of the proof: Suppose that not all of the
propositions Pn are true, and let
$K = \{n \mid P_n \text{ is false}\}.$
Then $K \neq \{\,\}$. Therefore ...]
* 15. Show conversely that, if the Induction Principle is taken as a postulate, then the
Well-Ordering Principle can be proved as a theorem. [Start of the proof: For each n, let $P_n$
be the proposition that none of the integers 1, 2, ..., n belongs to K. ...]
If a line intersects a parabola in two points, then it cuts off a region called a parabolic
sector. In the left-hand figure below, the sector is the region lying above the parabola
and below the line. In the third century B.C., Archimedes discovered a method for
finding the area of a parabolic sector. In this section we shall give an easier solution
of the problem.
The problem will be solved if we can find the area of a "curvilinear triangle" of
the type shown on the right above. If we can do this, then we can find the area of the
trapezoid in the other figure, and subtract the areas of the two curvilinear triangles.
The result will be the area of the sector.
We shall attack the area problem, for the graph of y = x2, by approximating the
region with rectangles, like this:
We cut the closed interval [0, h] into n little intervals of equal length, using the
division points
With each of these intervals as base, we construct a rectangle, using as altitude the
height of the parabola at the right-hand endpoint. The right-hand endpoint of the
ith interval is ih/n. Therefore the altitude of the ith rectangle is $(ih/n)^2$.
Therefore the area of the ith rectangular region is
$$a_i = \left(\frac{ih}{n}\right)^2 \cdot \frac{h}{n} = \frac{h^3 i^2}{n^3}.$$
Let $R_n$ be the union of all these rectangular regions. Then the area of $R_n$ is
$$A_n = \sum_{i=1}^{n} a_i = \sum_{i=1}^{n} \frac{h^3 i^2}{n^3} = \frac{h^3}{n^3} \sum_{i=1}^{n} i^2.$$
We want to find out what limit An approaches as n becomes very large. If we find this
limit, then our problem is solved, because the limit is the area of the region R that
we started with.
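We found, in Theorem 2 of Section 2.9, that
$$\sum_{i=1}^{n} i^2 = \frac{n}{6}(n + 1)(2n + 1).$$
Therefore
$$A_n = \frac{h^3}{n^3} \cdot \frac{n}{6}(n + 1)(2n + 1) = \frac{h^3}{3}\left(1 + \frac{1}{n}\right)\left(1 + \frac{1}{2n}\right).$$
As n becomes large without limit, it is easy to see that
$$\frac{1}{n} \to 0, \quad 1 + \frac{1}{n} \to 1, \quad \frac{1}{2n} \to 0, \quad \text{and} \quad 1 + \frac{1}{2n} \to 1,$$
so that
$$\left(1 + \frac{1}{n}\right)\left(1 + \frac{1}{2n}\right) \to 1$$
and
$$A_n = \frac{h^3}{3}\left(1 + \frac{1}{n}\right)\left(1 + \frac{1}{2n}\right) \to \frac{h^3}{3}.$$
Thus the area is
$$A = \frac{h^3}{3}.$$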
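The approach of the rectangle sums to $h^3/3$ can be watched numerically. The sketch below, in Python with exact fractions, adds up the rectangles with altitudes taken at right-hand endpoints and compares the total with the expression $(h^3/3)(1 + 1/n)(1 + 1/(2n))$; the function names are illustrative assumptions.

```python
from fractions import Fraction

def outer_area(h, n):
    """Sum of the n rectangles of width h/n whose altitudes are the heights
    of the parabola y = x^2 at the right-hand endpoints ih/n."""
    width = Fraction(h) / n
    return sum((i * width) ** 2 * width for i in range(1, n + 1))

def closed_form(h, n):
    """The expression (h^3 / 3)(1 + 1/n)(1 + 1/(2n))."""
    h = Fraction(h)
    return h ** 3 / 3 * (1 + Fraction(1, n)) * (1 + Fraction(1, 2 * n))

for n in (1, 10, 100, 1000):
    print(n, float(outer_area(1, n)), outer_area(1, n) == closed_form(1, n))
```

With h = 1 the printed values decrease toward 1/3 as n grows, and each one agrees exactly with the closed form.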
It would have been equally natural to approximate the area from the inside. We
shall see that this procedure leads to the same answer as before. Here we have cut
up the interval [0, h] into the same little intervals as before; but on each little
interval we have set up a rectangle whose altitude is the height of the parabola at the
left-hand endpoint.
[Figure: inscribed rectangles under the parabola on the interval [0, h].]
Therefore, on [0, h/n] our "rectangle" is merely the base interval,
with area 0; and thereafter the area of the ith rectangle is
To see why the last equation holds, observe that each of the indicated sums is the
sum of the squares of the integers from 1 to n - 1. Therefore
$$A_n' = \frac{h^3}{n^3}\left[\frac{n}{6}(n + 1)(2n + 1) - n^2\right] = A_n - \frac{h^3}{n}.$$
As n increases, $A_n \to h^3/3$ and $h^3/n \to 0$. Therefore $A_n' \to h^3/3$, and we get the same
limit as before. To sum up:
Theorem 1. Let
$$R = \{(x, y) \mid 0 \le x \le h \text{ and } 0 \le y \le x^2\}.$$
It is easy to extend this result to the case in which the parabola is the graph of
y = kx2, k > 0.
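[Figure: the graph of $y = kx^2$, with an approximating rectangle at $x = ih/n$.]
$$a_i = \left(\frac{ih}{n}\right)^2 \cdot \frac{h}{n}$$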
n n
Bn = .2 a;,
-i=l
we have
$$B_n = \sum_{i=1}^{n} k a_i = k \sum_{i=1}^{n} a_i = kA_n.$$
3
Therefore we have the following theorem:
Theorem 2. Let
be the area of the region under the graph of y = kx2, from x = a to x = b. Then
we have the following:
Theorem 3.
Find the area under the graph of $y = 5x^2$, between the following limits.
4. From 2 to 4 5. From -2 to 2
Find the area under the graph of $y = 2x^2 + 1$, between the following limits.
6. From 0 to 4 7. From -1 to 0 8. From -1 to 3
9. Find the area of the parabolic sector between the graphs of $y = 2x^2$ and $y = x + 1$.
13. Solve, for the general case, the problem of Archimedes, stated at the beginning of
this section.
Obviously $A_n > 1$ for every n. Under what condition for n can you be sure that
$$A_n - 1 < \frac{1}{10{,}000{,}000}?$$
c) Let $\epsilon$ be any positive number. Under what condition for n can you be sure that
$$A_n - 1 < \epsilon?$$
Under what condition for n can you be sure that En < lo?
b) Under what condition for n can you be sure that
c) Given any positive number $\epsilon$, under what condition for n can you be sure that
$E_n < \epsilon$?
16. For each n, let
Obviously $C_n > 4$ for every n. Given a positive number $\epsilon$, show that $C_n - 4 < \epsilon$
+
n 1
17. a) For each n, let Dn n2 + + 2 . Under what condition for n can you be sure
3n
1
that Dn <
102 ?
b) Given any positive number $\epsilon$, under what condition for n can you be sure that
$D_n < \epsilon$?
18. Given an ellipse, find its area.
[Figures: the ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$.]
This can be done by a method somewhat similar to one used in the preceding section of
the text. [Hint: In the figures, what is the relation between y and k?]
19. In the discussion preceding Theorem 1, we found that $A_n' = A_n - h^3/n$. Verify this
statement geometrically, without using a formula for either $A_n$ or $A_n'$.
Hint: Draw a figure showing both the inner and outer rectangles, and explain why
$$A_n - A_n' = \frac{h}{n} \cdot h^2.$$
n
*20. a) Find a formula for
$$\sum_{i=1}^{n} i^3.$$
b) Find the area of the region under the graph of y = x3, from 0 to 1.
*21. a) Let
Thus $E_n$ is the error in the approximation $A_n \approx h^3/3$, and $E_n > 0$ for each n.
Calculate $E_n$ and show that $E_n < h^3/n$ for each n.
b) Show that for every $\epsilon > 0$, $E_n < \epsilon$ when n is sufficiently large. That is, find a number
N such that $E_n < \epsilon$ whenever n > N.
3 Functions, Derivatives, and Integrals
To distinguish these two functions, we give them different names, say, X and Y.
Thus
$X: E \to \mathbf{R},$
$: P \mapsto x$
is the "x-coordinate function," and
$Y: E \to \mathbf{R},$
$: P \mapsto y$
is the "y-coordinate function." When we write $P \mapsto x$ (with the vertical bar on the
left-hand end of the arrow), this means that each point P is matched with its
x-coordinate x. Thus we write $\to$ between sets and $\mapsto$ between elements of the sets.
3) If the real number x is known, then $x^2$ is determined. Thus we have a function
$f: \mathbf{R} \to \mathbf{R},$
$: x \mapsto x^2.$
4) Every nonnegative real number has one and only one nonnegative square root.
Thus we have a function
$g: \mathbf{R}^+ \to \mathbf{R},$
$: x \mapsto \sqrt{x}.$
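Example | Function | Domain | Range | Rule
1 | $X$ | $E$ | $\mathbf{R}$ | $P \mapsto x$
2 | $Y$ | $E$ | $\mathbf{R}$ | $P \mapsto y$
3 | $f$ | $\mathbf{R}$ | $\mathbf{R}$ | $x \mapsto x^2$
4 | $g$ | $\mathbf{R}^+$ | $\mathbf{R}$ | $x \mapsto \sqrt{x}$
5 | $h$ | $[2, \infty)$ | $\mathbf{R}$ | $x \mapsto \sqrt{x - 2}$
6 | $i$ | $\mathbf{R}$ | $\mathbf{R}$ | $x \mapsto |x|$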
It is not required that all the elements of the range actually get used. Thus, in
Example 3, $x^2 \ge 0$ for every x, and so we could equally well write
$f: \mathbf{R} \to \mathbf{R}^+,$
$: x \mapsto x^2,$
using $\mathbf{R}^+$ as the range instead of $\mathbf{R}$.
Often functions are defined by algebraic formulas, but some of the most important
functions are defined in other ways. Consider the following example.
7) Given the parabola, shown below, which is the graph of the equation $y = x^2$.
For each point P of the parabola, the arc of the curve from the origin O to P has a
certain length. If to each x we let correspond the length of the arc from O = (0, 0)
(Here we are talking about simple geometric length, independent of direction, and
so the length of the arc is never negative.) Later we shall find that this function can
be described by a formula. But we don't need to know this, let alone find the formula,
to know that we are dealing with a function.
$k \mapsto \tfrac{1}{3}k^3$
for every $k \ge 0$.
9) Given the graph of $y = x^n$, for $x \ge 0$.
(The rest of the graph goes upward when n is even and downward when n is odd.)
To each k � 0 there corresponds a number A which measures the area of the shaded
region
$: k \mapsto A.$
Only for the cases n = 1 and n = 2 do we know how to calculate the values of A.
But for n = 3, we nevertheless have a well-defined function $f_3$. Later in this chapter,
you will see how this function can be calculated.
Given a function $f: A \to B$, for each a in A we denote by $f(a)$ the element of B
which corresponds to a. For example, if f is the function which squares things
($x \mapsto x^2$), then
$$f(1) = 1, \quad f(2) = 4, \quad f(3) = 9, \quad f(\sqrt{2}) = 2;$$
and
$$f(\sqrt{x}) = x \quad \text{for every } x \ge 0.$$
In Example 9 above, $f_3(1)$ is the area under the graph of $y = x^3$, from 0 to 1; and
so on.
If the domain A and the range B are sets of real numbers, then we can draw
pictures of the function. The graph of a function $f: A \to B$ is the set of all points
of the coordinate plane that have the form $(x, f(x))$. In other words, to draw the
graph of the function, we plot the point $(x, f(x))$ for each x in A.
[Figure: left, the graph of a function on a closed interval [a, b]; right, the graph of g.]
In the case shown in the left-hand figure above, the domain is a closed interval
[a, b]. Consider, next, the function g in Example 4, which extracts nonnegative
square roots:
$g: \mathbf{R}^+ \to \mathbf{R},$
$: x \mapsto \sqrt{x}.$
The graph of g (the right-hand figure above) is the graph of the equation $y = \sqrt{x}$.
To see that this graph is approximately right, observe that
$$y = \sqrt{x} \iff x \ge 0, \; y \ge 0, \; x = y^2.$$
We get $x = y^2$ by interchanging x and y in the equation $y = x^2$. Therefore the graph
of $x = y^2$ is a parabola with directrix $x = -\tfrac{1}{4}$ and focus $(\tfrac{1}{4}, 0)$. And the graph of
$y = \sqrt{x}$ is the upper half of this graph.
A curve which is the graph of a function is called a function-graph. It is easy to
see what sort of curve is a function-graph: A set of points in a coordinate plane is a
function-graph if it intersects every vertical line in at most one point.
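For a finite sample of points, the vertical-line criterion can be checked mechanically: no two distinct points may share an x-coordinate. The sketch below, in Python, applies it to sample points of $y = \sqrt{x}$ and of the whole parabola $x = y^2$; the function name is an illustrative assumption, and a finite sample can only suggest the answer for the full curve.

```python
def is_function_graph(points):
    """True if no vertical line meets the finite point set more than once,
    that is, if no two distinct points share an x-coordinate."""
    seen = {}
    for x, y in points:
        if x in seen and seen[x] != y:
            return False
        seen[x] = y
    return True

upper_half = [(y * y, y) for y in range(0, 4)]        # sample of y = sqrt(x)
whole_parabola = [(y * y, y) for y in range(-3, 4)]   # sample of x = y^2
print(is_function_graph(upper_half), is_function_graph(whole_parabola))
```

The second sample fails because, for example, the points (1, 1) and (1, -1) lie on the same vertical line.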
[Figure: two curves, one marked "Yes" (a function-graph) and one marked "No" (not a function-graph).]
For example, in the figure above, f is a set of points, and is a function-graph. We use
the same symbol f for the corresponding function. Thus we say that the domain of f
is the closed interval [-1, 7], and the range of f is $\mathbf{R}$. (Obviously some smaller set
could be used as the range, but it is not obvious from the figure just what the smallest
possible range is.) We write f(0) = 2, f(1) = 1, f(2) = 3, and so on, because
$0 \mapsto 2$, $1 \mapsto 1$, $2 \mapsto 3$, under the action of the function f.
Given a function
$f: A \to B.$
If b = f(a) for some a in A, we say that b is a value of the function. For example,
4 is a value of the function $x \mapsto x^2$, but -1 is not. The set of all values of a function
is called the image. If you reexamine Examples 1 through 6 above, you will find
that in 1 and 2 the image is all of $\mathbf{R}$, and in the remaining cases the image is $\mathbf{R}^+$.
(You should check these cases.)
Similarly, for
Here the graph is a quadrant of a circle, as shown on the left below, and the image
is the closed interval [O, 1].
you get a curve which really is a function-graph. We often use this device, to study
various curves C for which the reflection C' is a function-graph. But this does not
mean that C was a function-graph in the first place. Therefore, in the following
problem set, when you are asked whether certain curves are function-graphs, you
must look at the curves right side up. For the curve C shown in the above figure, the
answer is "No," even though for C' the answer is "Yes."
In some of the problems below, you are asked to find the image. In some cases,
the image is not an interval; and you may find it convenient to use the notation
{a, b, c, . ..}
6. For what positive integers n (if any) is the graph of $y = |x|^n$ a function-graph? For
each such case (if any), what are the domain and range?
7. Same problem, for the graphs of the equations $|y|^n = x$.
*8. Same question as 6, for $y^3 + ny = x$.
9. Is the graph of $x = \sqrt{y}$ a function-graph? If so, what are the domain and the image?
Sketch.
10. Same question, for $y = |x|/x$.
11. Same question, for $|y| = x$.
12. Same question, for $y = |x| + x$.
13. Same question, for $y = x^2 + x + 1$. (Here, of course, the only trouble is in finding
the image. The image is an interval, and should therefore be described in the interval
notation.)
14. The postage rate for airmail letters within the United States is now (1971) ten cents
per ounce or fraction thereof. Thus we have a function
$\text{amp}: \mathbf{R}^+ \to \mathbf{R}^+,$
where amp x is the airmail postage (in cents) for a letter of weight x (in ounces). Thus
amp $\tfrac{1}{2}$ = 10, amp 1 = 10, amp $\pi$ = 40, amp 0 = 0, and so on. Sketch the graph of
this function. What is the image?
15. The roundoff function $r: \mathbf{R} \to \mathbf{R}$ assigns to each number the nearest integer (with a
half-integer assigned to the next highest integer). Thus $r(2) = 2$, $r(2\tfrac{1}{4}) = 2$, $r(2\tfrac{1}{2}) = 3$,
$r(2\tfrac{3}{4}) = 3$. Sketch the graph of this function from 0 to 3. What is its image?
16. Under what conditions is a semicircle a function-graph?
17. Under what conditions is a parabola a function-graph? (To solve this one, you will
need a theorem from a problem in Chapter 2.)
In Section 2.7 we solved the tangent problem for parabolas. Given the graph of
$y = ax^2 + bx + c,$
we found that for each $x_0$, the slope of the tangent at the point $(x_0, y_0)$ of the graph
was
$2ax_0 + b.$
Obviously a parabola with its axis vertical is a function-graph; its equation expresses
y in terms of x. Thus we have a function
$f: \mathbf{R} \to \mathbf{R},$
$: x \mapsto ax^2 + bx + c.$
Now at each point of the graph of f there is one and only one tangent; and this
tangent has a certain slope. Thus we have another function
$f': \mathbf{R} \to \mathbf{R},$
$: x \mapsto S_x = 2ax + b.$
For each x, $f'(x)$ is the slope of the tangent to the graph of f at the point $(x, f(x))$.
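The slope $2ax + b$ can be compared with the slopes of secant lines through nearby points of the parabola. The sketch below, in Python with exact fractions, uses arbitrarily chosen coefficients a = 3, b = -2, c = 5 and the point $x_0 = 1$ (all illustrative assumptions); the secant slopes should approach $2a x_0 + b = 4$.

```python
from fractions import Fraction

def secant_slope(f, x0, x):
    """Slope of the line through (x0, f(x0)) and (x, f(x)), for x != x0."""
    return (f(x) - f(x0)) / (x - x0)

a, b, c = 3, -2, 5                       # arbitrary illustrative coefficients

def f(x):
    return a * x * x + b * x + c

x0 = Fraction(1)
for step in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    print(step, secant_slope(f, x0, x0 + step))
```

Algebraically the secant slope is $a(x + x_0) + b$, so each printed value exceeds 4 by exactly 3 times the step.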
To see how this works, consider the simplest example, in which
$f(x) = x^2.$
Here the parabola is the graph of the function
$f: x \mapsto x^2,$
and the line is the graph of the function
$f': x \mapsto 2x.$
For each x, the value of f' is the slope of the tangent to the graph of f. For example,
at the point where x = 1, the slope of the tangent to f is 2; and $f'(1) = 2 \cdot 1 = 2$.
At x %, we getf'(i)
=
t; and tis the slope of the tangent to the parabola. Where
=
y
3.2 The Derivative of a Function, Intuitively Considered 71
If the graph of f has a nonvertical tangent at each point $(x, f(x))$, we let $f'(x)$ be the
slope of this tangent. This gives a new function
$f': \mathbf{R} \to \mathbf{R}.$
The new function f' is called the derivative of f. Consider another example.
A careful inspection of the figure above indicates that f' is (at least approximately)
the derivative of f. Thus, at x = 0, the tangent to f is horizontal; and f'(0) = 0, as it
should be. At x = 1, the tangent to f seems to have slope = -1; and f'(1) = -1.
At x = -1, the tangent to f has slope = 1; and f'(-1) = 1. For x > 0, the
tangent to f has negative slope; and f'(x) < 0 for x > 0. For x < 0, the tangent
to f has positive slope; and f'(x) > 0 for x < 0.
It may be that at some points f has no tangent. At such points, f' is not defined.
Thus, in some cases, the domain of f' is a smaller set than the domain of f. Consider,
for example, the function $f: x \mapsto |x|$.
[Figure: left, the graph of $f: x \mapsto |x|$; right, the graph of f', with the values 1 and -1.]
For every x > 0, the slope of the tangent is 1; and for every x < 0, the slope of the
tangent is -1. Therefore the graph of f' looks like the figure on the right above.
Drawing both f and f' on the same set of axes, we get the left-hand figure below.
You should carefully inspect the figure on the right below, to convince yourself that
f' is the derivative of f, at least approximately. Here f has a tangent at x = 0, but
the tangent is vertical, and therefore there is no such thing as f'(0). When x > 0
and x is small, then f'(x) is large, because f is rising steeply. When x > 1, f'(x) is
small. It looks as if f'(2) = 0; and the graph of f has a horizontal tangent at the
point (2, f(2)).
[Figure: left, the graphs of $f: x \mapsto |x|$ and f' on the same axes; right, the graphs of another function f and its derivative f'.]
A function which has a derivative at every point of its domain is called differentiable.
The following theorem describes a fundamental property of differentiable
functions:
[Figure: two graphs on [a, b], each showing a chord and a tangent line parallel to it.]
Here by a chord we mean a segment joining two points of the graph. The theorem
says that if f is differentiable on [a, b], then there is some $\bar{x}$ between a and b at which
the slope of the tangent is the same as the slope of the chord. As indicated in the
right-hand figure above, there may be more than one such point.
The situation with regard to this theorem is awkward. It is geometrically
obvious. Also it is important and we shall need it soon. On the other hand, the
proof of the theorem is hard, and involves ideas which belong in the later portion of a
calculus course. We shall therefore postpone the proof, but use the theorem whenever
we need it.
The theorem can be stated in a form which looks more algebraic. If f is defined
on an interval [a, b], then the slope of the chord joining the endpoints is
$$\frac{f(b) - f(a)}{b - a},$$
and the slope of the tangent at $\bar{x}$ is $f'(\bar{x})$. Thus the theorem states that
$$f'(\bar{x}) = \frac{f(b) - f(a)}{b - a}$$
for some $\bar{x}$ between a and b. In this style we can restate the theorem as follows:
$$f'(\bar{x}) = \frac{f(b) - f(a)}{b - a}.$$
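For a concrete function, a point where the tangent is parallel to the chord can be located numerically. The sketch below, in Python, uses $f(x) = x^3$ on [0, 2] as an illustrative choice and assumes the derivative formula $f'(x) = 3x^2$, which the text has not derived at this point; the bisection helper and its name are likewise assumptions, and it requires that $f'$ minus the chord slope change sign on the interval.

```python
def find_mean_value_point(f_prime, chord_slope, lo, hi, steps=60):
    """Bisection for a point in [lo, hi] where f_prime equals chord_slope.
    Assumes f_prime - chord_slope has opposite signs at lo and hi."""
    def g(x):
        return f_prime(x) - chord_slope
    for _ in range(steps):
        mid = (lo + hi) / 2
        if (g(lo) < 0) == (g(mid) < 0):
            lo = mid      # the sign change lies in the right half
        else:
            hi = mid      # the sign change lies in the left half
    return (lo + hi) / 2

def f(x):
    return x ** 3

def f_prime(x):
    return 3 * x ** 2     # assumed derivative of x^3

a, b = 0.0, 2.0
chord = (f(b) - f(a)) / (b - a)          # slope of the chord over [a, b]
x_bar = find_mean_value_point(f_prime, chord, a, b)
print(chord, x_bar, f_prime(x_bar))
```

Here the chord has slope 4, and the point found is close to $2/\sqrt{3}$, which lies strictly between a and b.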
Note that, if we merely required that the graph have a tangent at every point, the
theorem would become false. The graph shown below has a tangent at every point,
but one of these tangents is vertical. Therefore the function f is not differentiable on
[a, b]. And no tangent line is parallel to the chord from P to Q.
[Figure: a graph on [a, b] joining the points P and Q, with a vertical tangent at one point.]
horizontal; f'(x) should be > 0 where the original graph slopes upward; f'(x) should be < 0
where f slopes downward; and so on. In some cases, you may find that the values of f' are
so large that there is no room for them on the paper. In such cases, draw as much of the
graph of f' as space permits.
Some but not all of the functions shown below satisfy the conditions of the mean-value
theorem. For each such function, draw the chord between the endpoints of the graph,
draw a tangent line which is parallel to this chord, and drop a dashed line from the point of
tangency to the point x on the x-axis. (See the figures in the text, illustrating MVT.)
[Figures for Problems 1 through 15: graphs of functions f.]
3.3 Continuity and Limits 75
[Figures for Problems 19 through 21.]
[Figure: the graph of y = amp x, with the values 10, 20, 30, and 40 marked on the y-axis.]
Here y = amp x, where amp x is the airmail postage on a letter weighing x ounces.
The values of this function make sudden jumps at integral values: the graph cannot be
drawn without lifting the pencil from the paper, and so the function is discontinuous.
Functions of this kind are used in physics. For example, the so-called Heaviside
function is defined by the conditions
$$h(x) = \begin{cases} 0 & \text{if } x < 0 \\ 1 & \text{if } x \ge 0. \end{cases}$$
The graph looks like the figure below. It makes a sudden jump at x = 0.
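Both jumps can be seen by evaluating the functions just to the left and right of the points in question. The sketch below, in Python, models amp x as ten cents times the number of ounces rounded up, and takes the Heaviside function to be 0 for negative x and 1 otherwise; both definitions are stated here as assumptions of the sketch.

```python
import math

def amp(x):
    """Airmail postage in cents: ten cents per ounce or fraction thereof."""
    return 10 * math.ceil(x)

def heaviside(x):
    """0 for negative x, 1 for x >= 0 (the convention assumed in this sketch)."""
    return 0 if x < 0 else 1

for x in (0.999, 1.0, 1.001):
    print(x, amp(x))
for x in (-0.001, 0.0, 0.001):
    print(x, heaviside(x))
```

However close x is taken to 1 from the right, amp x stays 10 cents above amp 1, which is the sudden jump described in the text.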
We shall now make the idea of continuity more exact, in several stages. Given
a point $x_0$, in the domain of f, we want to explain what it means to say that f is
continuous at $x_0$. First we try the following:
This is the idea, but it is not good enough; the question is how close things are
supposed to be to each other. As x gets very close to $x_0$, $f(x)$ is supposed to become
very close to $f(x_0)$. This suggests:
2) We can make $f(x)$ as close as we please to $f(x_0)$, by taking x sufficiently close to $x_0$.
This is better, but it can be improved. We measure the closeness of two numbers
by taking the absolute value of their difference. Thus if $\epsilon$ is a positive number, and
$$|f(x) - f(x_0)| < \epsilon,$$
then we say that $f(x)$ is $\epsilon$-close to $f(x_0)$. In these terms, we can restate (2) as follows:
3) For each $\epsilon > 0$, $f(x)$ is $\epsilon$-close to $f(x_0)$ whenever x is sufficiently close to $x_0$.
If $\delta > 0$ and $|x - x_0| < \delta$, then we say that x is $\delta$-close to $x_0$. The idea of
"sufficiently close" can be described by taking a positive number $\delta$. This gives:
4) For each $\epsilon > 0$, there is a $\delta > 0$ such that $f(x)$ is $\epsilon$-close to $f(x_0)$ whenever x is
$\delta$-close to $x_0$.
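Statement (4) can be tried out on a specific function by sampling. The sketch below, in Python, tests whether a proposed positive number works for a given tolerance, using $f(x) = 2x$ at $x_0 = 1$ as an illustrative choice; the function name and the sampling scheme are assumptions. Sampling can show that a proposed number fails, but it cannot prove that one works.

```python
def stays_eps_close(f, x0, eps, delta, samples=1000):
    """Check |f(x) - f(x0)| < eps at evenly spaced sample points x
    with |x - x0| < delta."""
    for k in range(1, samples + 1):
        x = x0 - delta + 2 * delta * k / (samples + 1)
        if abs(f(x) - f(x0)) >= eps:
            return False
    return True

def f(x):
    return 2 * x

print(stays_eps_close(f, 1.0, 0.001, 0.0005))   # half the tolerance
print(stays_eps_close(f, 1.0, 0.001, 0.001))    # too generous for f(x) = 2x
```

For this function, halving the tolerance is enough because $|2x - 2| = 2|x - 1|$.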
[Figure: an $\epsilon\delta$-box around the point $(x_0, f(x_0))$, between the heights $f(x_0) - \epsilon$ and $f(x_0) + \epsilon$.]
In the figure, the solid rectangular region is called an $\epsilon\delta$-box for the function f at the
point $(x_0, f(x_0))$. When we call it a box for the function, we mean that no point of the
graph lies above the box or below it. If the function is continuous, then for every
positive number $\epsilon$, no matter how small, we can find a $\delta > 0$ that gives an $\epsilon\delta$-box.
We now restate (4) as follows.
Definition. Let $x_0$ be a point in the domain of the function f. Suppose that for
every $\epsilon > 0$ there is a $\delta > 0$ such that
$$|x - x_0| < \delta \Rightarrow |f(x) - f(x_0)| < \epsilon.$$
Then f is continuous at $x_0$.
This definition applies very simply to the function $f(x) = 2x$, at the point (1, 2).
Given any $\epsilon > 0$, we can find an $\epsilon\delta$-box, as shown in the figure; we simply take $\delta = \epsilon/2$.
To find the desired number $\delta$ directly requires clumsy calculations, but there is an
easier way. Let d and d' be the numbers such that
(See the lower figure on page 78.) The graph rises from left to right. Therefore,
Thus the dotted rectangle in the figure boxes in the graph, in the same way that an
$\epsilon\delta$-box does. We call such a rectangle a dd'-box. Obviously a dd'-box is just as
good as an $\epsilon\delta$-box. And, in fact, given a dd'-box, we can always get an $\epsilon\delta$-box that
lies in it. Let $\delta$ be the smaller of the positive numbers 1 - d and d' - 1. (In fact,
d' - 1 is the smaller, but we don't need to use this.) Then
which is what we wanted. We are going to use this method again, and so we record
it as a theorem.
Theorem 1. Let $x_0$ be a point in the domain of the function f. Suppose that for every
$\epsilon > 0$ there are numbers d and d' such that $d < x_0 < d'$ and
[Figure: a dd'-box for f at $(x_0, f(x_0))$, with an $\epsilon\delta$-box lying inside it.]
Proof. Let $\delta$ be the smaller of the numbers $x_0 - d$ and $d' - x_0$. (In the figure,
$\delta = d' - x_0$.) Then
$$|x - x_0| < \delta \Rightarrow d < x < d' \Rightarrow f(x_0) - \epsilon < f(x) < f(x_0) + \epsilon.$$
We shall now reexamine the idea of a limit, which we used in defining the slope
of the tangent to the graph of a function.
To find the slope of the tangent at the point $(x_0, f(x_0))$, we let m(x) be the slope of the
secant line through the points $(x_0, f(x_0))$ and $(x, f(x))$, where $x \neq x_0$. Thus the slope
of the secant is a function, and we are now describing it in functional notation. By
definition, the slope of the tangent is
if such a limit exists. We shall now give a definition of the limit. The idea is that
$\lim_{x \to x_0} m(x) = L$ if the function m becomes continuous at $x_0$ when we insert the
value L as the value $m(x_0)$. Thus we want to use L as $m(x_0)$ in the definition of
continuity. This gives the following:
Then
$$\lim_{x \to x_0} m(x) = L.$$
$$m(x) = \frac{x^2 - 1}{x - 1} = x + 1 \quad (x \neq 1).$$
When we insert the point (1, 2) on the graph of the function m, we get a continuous
function (which is equal to x + 1 for every x). Thus $\lim_{x \to x_0} m(x) = 2$, not just
what do we mean by $\lim_{x \to x_0} f(x)$? The answer is that we ignore the value of f at $x_0$,
and investigate how the rest of the graph behaves. To be exact:
Definition. Let f be a function defined on an interval I, except perhaps at the point $x_0$.
Suppose that for every $\epsilon > 0$ there is a $\delta > 0$ such that
$$0 < |x - x_0| < \delta \Rightarrow |f(x) - L| < \epsilon.$$
Then
$$\lim_{x \to x_0} f(x) = L.$$
Note that here we have simply copied the preceding definition, using f for m: all
along, the value $x = x_0$ was ruled out by the condition $0 < |x - x_0|$. The left-hand
figure, showing the $\epsilon\delta$-box, looks the same as before, except that there is no point in
plotting $f(x_0)$ (which may not be defined, and which will not in any case be used).
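The role of the condition $0 < |x - x_0|$ can be seen on the secant-slope function $m(x) = (x^2 - 1)/(x - 1)$, which has no value at x = 1 but has values close to 2 at nearby points. The following is a minimal sketch in Python using floating-point arithmetic; the sample points are illustrative choices.

```python
def m(x):
    """Secant slope (x^2 - 1)/(x - 1); not defined at x = 1."""
    return (x * x - 1) / (x - 1)

L = 2
for delta in (0.1, 0.01, 0.001):
    xs = [1 - delta / 2, 1 + delta / 2]      # points with 0 < |x - 1| < delta
    print(delta, [abs(m(x) - L) for x in xs])

try:
    m(1.0)
except ZeroDivisionError:
    print("m(1) is not defined, and the limit never asks for it")
```

Since $m(x) = x + 1$ for $x \neq 1$, the distance from m(x) to 2 equals $|x - 1|$ up to rounding, so here any $\delta \le \epsilon$ serves.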
[Figure: an $\epsilon\delta$-box for f near $x_0$, with no point plotted at $x_0$.]
1. How close to 3 does x need to be, for 2x to be within 0.001 of 6? (Answer in the form
$|x - 3| < \cdots \Rightarrow |2x - 6| < 0.001$. Sketch the graph of f(x) = 2x, and sketch your
$\epsilon\delta$-box ($\epsilon = 0.001$).)
2. Find numbers d and d' such that $d < x < d' \Rightarrow |x^2 - 3^2| < 0.0001$. Sketch the graph
of $f(x) = x^2$ and sketch your dd'-box. In your sketch, you will have to distort the scale
grossly, because of the small size of your $\epsilon$.
3. Show that the function $f(x) = x^2$ is continuous at the point $x_0 = 3$. Use the method
that was used in the text for the same function at the point $x_0 = 1$ and apply Theorem 1.
Thus your answer will include statements in the form: "Let d = ..., and let d' = ....
Then $d < x < d' \Rightarrow 3^2 - \epsilon < x^2 < 3^2 + \epsilon$." Sketch the function, showing your
dd'-box.
In this section we shall give the elementary rules that we use in dealing with limits.
These rules are much easier to learn and to use than they are to prove, and so many of
the proofs are omitted from this section. (You will find the missing proofs in Appendix
B.) But some of the proofs are easy, and they throw some light on the idea of a limit.
Theorem 1. If $\lim_{x \to x_0} f(x) = L$, then $\lim_{x \to x_0} [-f(x)] = -L$.
Proof. To get -f from f, we flip the graph of f across the x-axis. We know that for
every $\epsilon > 0$, f has an $\epsilon\delta$-box at $(x_0, L)$. If we flip the box across the x-axis, in the
same way that we flipped the graph, this gives us a box for -f at $(x_0, -L)$.
This theorem can also be proved algebraically. The hypothesis means that:
(1) for every $\epsilon > 0$ there is a $\delta > 0$ such that
$0 < |x - x_0| < \delta \Rightarrow |f(x) - L| < \epsilon$;
3.4 Theorems on Limits 83
the conclusion means that: (2) for every $\epsilon > 0$ there is a $\delta > 0$ such that
$0 < |x - x_0| < \delta \Rightarrow |-f(x) - (-L)| < \epsilon.$
Since $|-f(x) - (-L)| = |f(x) - L|$, it is obvious that (1) $\Rightarrow$ (2), and thus the
theorem holds.
[Figure: graphs of f and f - L.]
To prove this, merely use the previous proof in reverse; move the box along
with the graph.
Theorem 4. If $\lim_{x \to x_0} f(x) = L$, and k is any number, then $\lim_{x \to x_0} kf(x) = kL$.
That is, the limit of a constant times a function is the same constant times the
limit of the function.
Proof.
1) For k = 0, this is easy: kf(x) = 0 for every x, and so $\lim_{x \to x_0} kf(x) = 0 = 0 \cdot L$.
2) Suppose that k > 0. For every $\epsilon > 0$, the graph of f has an $\epsilon\delta$-box at $(x_0, L)$.
Therefore, for every $\epsilon > 0$, the graph of f has an $(\epsilon/k)\delta$-box at $(x_0, L)$. Thus
That is, if each of the functions f and g has a limit, as $x \to x_0$, then the sum also
has a limit, and the limit of the sum is the sum of the limits.
Theorem 7. If $\lim_{x \to x_0} f(x) = L$ and $\lim_{x \to x_0} g(x) = L'$, and $L' \neq 0$, then
$\lim_{x \to x_0} \frac{f(x)}{g(x)} = \frac{L}{L'}.$
Caution: The preceding theorem says nothing about what happens when $L' = 0$.
And in fact, for $L' = 0$ anything can happen, even in very simple cases. If
And any number k can be used in place of 2. Therefore, if $f(x) \to 0$ and $g(x) \to 0$,
the quotient f/g can approach any number whatever as a limit. This should not surprise
us, because every time we calculate a derivative we are finding the limit of a
quotient
$\frac{f(x) - f(x_0)}{x - x_0}.$
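The caution about $L' = 0$ can be made concrete with a small Python sketch. The particular pair f(x) = kx, g(x) = x at $x_0 = 0$ is our own assumed example, chosen because both functions approach 0 while their quotient is k for every $x \neq 0$; the function name is hypothetical.

```python
def quotient_near_zero(k, x):
    """Return f(x)/g(x) for the assumed pair f(x) = k*x, g(x) = x, with x != 0."""
    f = k * x   # f(x) -> 0 as x -> 0
    g = x       # g(x) -> 0 as x -> 0
    return f / g


# Numerator and denominator both shrink, but the quotient stays near k.
for k in (2.0, -3.5, 0.0):
    print(k, [quotient_near_zero(k, x) for x in (1e-3, 1e-6, 1e-9)])
```

Thus the limit of the quotient depends entirely on the choice of k, even though both limits are 0.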
(This was Theorem 2.) Hereafter, we shall regard the above formula as interchangeable
with the definition of continuity. Thus every theorem on limits automatically
gives us a theorem on continuous functions. Some of these are as follows:
By Theorem 4,
$\lim_{x \to x_0} kf(x) = kf(x_0),$
Theorem 9. If f and g are continuous at $x_0$, then so also are f + g and fg.
Theorem 10. If f and g are continuous at $x_0$, and $g(x_0) \neq 0$, then f/g is continuous
at $x_0$.
Most of the time we shall apply these results not just at one point $x_0$ but throughout
the domain of the functions f and g. For these cases, we can state our theorems
more briefly as follows:
Theorem 11. Let f and g be functions with the same domain. If f and g are continuous,
then so also are kf, f + g, and fg. And f/g is continuous at every point $x_0$ where
$g(x_0) \neq 0$.
on the entire real number system, we can infer immediately that kf, f + g, fg, and f/g
have the same property. Here $g(x) \neq 0$ for every x. Given
$h(x) = x^2 - 1,$
we can infer that f + h and fh are continuous everywhere, and that f/h is continuous
except at 1 and -1. Of course, at x = 1 and x = -1 it is not just continuity that
breaks down: the quotient function is not even defined at these points, because the
denominator of f/h becomes 0.
Finally, a trivial observation.
[Figure: graph of the constant function f(x) = k near $x_0$.]
Proof. Given f(x) = k, for every x in a certain domain. We need to show that for
In solving the following problems, you need not base your work directly on the definition
of continuity or the definition of a limit; you are free to use all the theorems stated in this
section. Note that the later problems are not based on this section at all; they are extensions
of the theory.
1. Show that if f(x) = kx, then f is continuous.
2. Same, for $f(x) = kx^2$.
Find out which of the following functions are bounded, on the given domains, and justify
your answers:
7. $f(x) = \frac{1}{1 + x^2}$, $-\infty < x < \infty$
8. $f(x) = x^2$, $-\infty < x < \infty$
9. $f(x) = x^2$, $0 \le x \le 1$
10. $f(x) = \frac{x^2}{1 + x^2}$, $-\infty < x < \infty$
11. $f(x) = x^3$, $0 \le x \le 2$
12. $f(x) = \frac{x}{1 + x^2}$, $0 \le x \le 1$
13. $f(x) = \frac{x}{1 + x^2}$, $1 \le x < \infty$
14. $f(x) = \frac{x}{1 + x^2}$, $-\infty < x < -1$
15. $f(x) = \frac{x}{1 + x^2}$, $-\infty < x < \infty$
16. $f(x) = \frac{1}{1 + x^3}$, $1 < x < \infty$
17. $f(x) = \frac{1}{1 + x^3}$, $-1 < x < 1$
18. $f(x) = \frac{x^4}{1 + x^3}$, $0 < x < \infty$
19. $f(x) = \frac{x + 1}{1 + x^3}$, $-1 < x < 1$
20. Show that if f is bounded, then so also is kf for every k.
21. Show that if f and g are bounded (on the same domain), then so also is f + g.
*22. Show that if f and g are bounded (on the same domain), then so also is fg. You may
find it convenient to write the condition for boundedness in the form $|f(x)| \le M$.
Can you infer also that f/g is bounded? Why or why not?
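*23. a) Show that if f is bounded and
$\lim_{x \to x_0} g(x) = 0,$
then
$\lim_{x \to x_0} [f(x)g(x)] = 0.$
b) Show that if
$\lim_{x \to x_0} f(x) = 0,$
then
$\lim_{x \to x_0} [f(x) \sin x] = 0.$
d) Show that if
$\lim_{x \to 0} f(x) = 0,$
then
$\lim_{x \to 0} \left[ f(x) \sin \frac{1}{x} \right] = 0.$
[Figure: a graph lying between the horizontal lines y = M and y = -M.]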
*24. a) A function f is locally bounded at $x_0$ if there are positive numbers M and $\delta$ such that
b) If f is bounded, does it follow that f is locally bounded at each point of its domain?
c) Conversely, if f is locally bounded at each point of its domain, does it follow that f
is bounded?
d) If f is locally bounded at each point of the open interval (0, 1), does it follow that f
is bounded on (0, 1)? Why or why not?
e) Show that if
$\lim_{x \to x_0} f(x) = L,$
then f is locally bounded at $x_0$. (This result does not require that $x_0$ be in the domain
of f. If you draw a picture of what you have, and a picture of what you want, and
$\lim_{x \to x_0} g(x) = 0,$
then
$\lim_{x \to x_0} f(x)g(x) = 0.$
3.5 The Process of Differentiation 89
The theorems in the preceding section tell us enough about limits to give us some
information about derivatives.
To make some formulas easier to write, we introduce an alternative notation for
the derivative: we write Df to mean the derivative of f. Thus
$Df = f',$
by definition. Similarly, if
$h(x) = f(x) + g(x)$
for every x in a certain domain, then
$D(f + g) = h'.$
Similarly, when we write
$D(x^2 + 2x + 5)$
we mean the derivative of the function
$x \mapsto x^2 + 2x + 5.$
We know already what this derivative is:
$D(x^2 + 2x + 5) = 2x + 2.$
Here we are merely rewriting the result which we got quite a while ago: for each x,
the slope of the tangent to the graph of
$y = ax^2 + bx + c$
is given by the formula
$S_x = 2ax + b.$
We recall that a function f is differentiable at a point $x_0$ if it has a derivative at $x_0$.
When we say that f is differentiable, we mean that it has a derivative at every point of
its domain. For example, if $f(x) = |x|$, then f is differentiable at 1, but not at 0.
But if $f(x) = x^2$, then f is differentiable (without qualification).
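The behavior of $|x|$ at 0 and at 1 can be seen by computing difference quotients from the right and from the left. This Python sketch is an illustration only; the helper name `difference_quotient` and the step size are our own choices.

```python
def difference_quotient(f, x0, h):
    """(f(x0 + h) - f(x0)) / h, the slope of a secant line through (x0, f(x0))."""
    return (f(x0 + h) - f(x0)) / h


# At x0 = 0 the right-hand and left-hand quotients are +1 and -1,
# so they cannot approach a common limit.
right_at_0 = difference_quotient(abs, 0.0, 1e-3)
left_at_0 = difference_quotient(abs, 0.0, -1e-3)

# At x0 = 1 both quotients are close to 1.
right_at_1 = difference_quotient(abs, 1.0, 1e-3)
left_at_1 = difference_quotient(abs, 1.0, -1e-3)

print(right_at_0, left_at_0, right_at_1, left_at_1)
```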
$f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{x \to x_0} \frac{k - k}{x - x_0} = \lim_{x \to x_0} 0 = 0.$
$D(kf) = k\,Df.$
$Dx^2 = 2x,$
and
$D(kx^2) = 2kx,$
as it should be.
$D(f + g) = Df + Dg.$
$\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = f'(x_0)$
and
$Dx^3 = 3x^2.$
$f(x) = x^3;$
$f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{x \to x_0} \frac{x^3 - x_0^3}{x - x_0}$
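Proof. Let $f(x) = x^n$ for every x, and take any $x_0$. Then
$f'(x_0) = \lim_{x \to x_0} \frac{x^n - x_0^n}{x - x_0}$
$= \lim_{x \to x_0} \frac{(x - x_0)(x^{n-1} + x^{n-2}x_0 + \cdots + x x_0^{n-2} + x_0^{n-1})}{x - x_0}$
$= \lim_{x \to x_0} (x^{n-1} + x^{n-2}x_0 + \cdots + x x_0^{n-2} + x_0^{n-1})$ (to n terms)
$= n x_0^{n-1},$
because each of the n terms approaches $x_0^{n-1}$. Therefore
$Dx^n = nx^{n-1},$
which was to be proved.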
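The factorization used in this proof can be checked in exact integer arithmetic. The Python sketch below is an illustration, not part of the text; `secant_sum` is a hypothetical helper name for the n-term sum $x^{n-1} + x^{n-2}x_0 + \cdots + x_0^{n-1}$.

```python
def secant_sum(x, x0, n):
    """x^(n-1) + x^(n-2)*x0 + ... + x*x0^(n-2) + x0^(n-1), a sum of n terms."""
    return sum(x ** (n - 1 - j) * x0 ** j for j in range(n))


x, x0, n = 3, 2, 5

# The factorization: x^n - x0^n = (x - x0) * (sum of n terms).
print((x - x0) * secant_sum(x, x0, n), x ** n - x0 ** n)

# Setting x = x0 in the sum gives n equal terms, that is, n * x0^(n-1).
print(secant_sum(x0, x0, n), n * x0 ** (n - 1))
```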
because the derivative of the sum is the sum of the derivatives. This is
$\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = f'(x_0),$
then
$\lim_{x \to x_0} [f(x) - f(x_0)] = \lim_{x \to x_0} \left[ \frac{f(x) - f(x_0)}{x - x_0} \cdot (x - x_0) \right] = f'(x_0) \cdot 0 = 0.$
(Theorem needed, for the second step?) By Theorem 3 of Section 3.4, $\lim_{x \to x_0} f(x) = f(x_0)$, which was to be proved.
The differential calculus would be simpler if the derivative of the product were
equal to the product of the derivatives; but this is not so. For example, take
A correct formula can be derived as follows. Take any x0, and suppose, as usual,
that f and g are differentiable. Then
$\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = f'(x_0) \qquad (1)$
and
$\lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0} = g'(x_0). \qquad (2)$
$\lim_{x \to x_0} \frac{f(x)g(x) - f(x_0)g(x_0)}{x - x_0}. \qquad (3)$
In a similar situation, when we were finding the derivative of f + g, there was no
problem: we looked at the fraction whose limit we wanted to find, and observed
that it was the sum of the two fractions
whose limits we knew. If these fractions appeared in (3), then we could use the fact
that their limits are given. Since neither of them appears, we use a trick: we simply
put one of them there, fix up the rest of the fraction so that its value is unchanged,
and hope for the best:
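$\frac{f(x)g(x) - f(x_0)g(x_0)}{x - x_0} = \frac{f(x) - f(x_0)}{x - x_0}\, g(x) + f(x_0)\, \frac{g(x) - g(x_0)}{x - x_0}.$
Here
$\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = f'(x_0), \qquad \lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0} = g'(x_0),$
and $g(x) \to g(x_0)$, because a differentiable function is continuous.
Therefore, by our theorems on limits of sums and products, we have
$\lim_{x \to x_0} \frac{f(x)g(x) - f(x_0)g(x_0)}{x - x_0} = f'(x_0)g(x_0) + f(x_0)g'(x_0).$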
In words:
More briefly:
$D(fg) = f'g + g'f.$
Let us try this one out for
$f(x) = x^3, \quad g(x) = x^2.$
Now
$f'(x) = 3x^2, \quad g'(x) = 2x.$
Therefore
$f'(x)g(x) + f(x)g'(x) = 3x^2 \cdot x^2 + x^3 \cdot 2x = 5x^4,$
as it should be.
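The same check can be carried out numerically. In the Python sketch below, the helper names and the step size are our own assumptions; the symmetric difference quotient only approximates the derivative of $x^3 \cdot x^2 = x^5$, so we compare with a tolerance.

```python
def product_rule(f, df, g, dg, x):
    """f'(x) g(x) + f(x) g'(x)."""
    return df(x) * g(x) + f(x) * dg(x)


def central_difference(func, x, step=1e-5):
    """Symmetric difference quotient; an approximation to the derivative at x."""
    return (func(x + step) - func(x - step)) / (2 * step)


f = lambda x: x ** 3
df = lambda x: 3 * x ** 2
g = lambda x: x ** 2
dg = lambda x: 2 * x

x = 1.5
by_rule = product_rule(f, df, g, dg, x)                    # equals 5 x^4
by_limit = central_difference(lambda t: f(t) * g(t), x)    # slope of x^5 near x
print(by_rule, by_limit)
```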
Next we want to find the derivative of the reciprocal g = 1/f of a function f.
As usual, we take a fixed $x_0$; and we assume that f is differentiable at $x_0$. We must
have $f(x_0) \neq 0$, or $g(x_0)$ would not be defined. Now
$g'(x_0) = \lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0} = \lim_{x \to x_0} \frac{1/f(x) - 1/f(x_0)}{x - x_0},$
if such a limit exists. Algebraically,
$\frac{1/f(x) - 1/f(x_0)}{x - x_0} = \frac{f(x_0) - f(x)}{f(x)f(x_0)(x - x_0)} = \frac{f(x) - f(x_0)}{x - x_0} \cdot \frac{-1}{f(x)f(x_0)}.$
As $x \to x_0$, the first fraction approaches $f'(x_0)$, and the second fraction approaches
$-1/[f(x_0)]^2$, because $f(x_0) \neq 0$. Therefore
$g'(x_0) = \frac{-f'(x_0)}{[f(x_0)]^2},$
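and so
$D\left(\frac{1}{f}\right) = -\frac{f'}{f^2}.$
$D\left(\frac{f}{g}\right) = D\left(f \cdot \frac{1}{g}\right) = f' \cdot \frac{1}{g} + f\,D\left(\frac{1}{g}\right) = \frac{f'}{g} + f \cdot \frac{-g'}{g^2} = \frac{f'g - g'f}{g^2}.$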
In words:
the derivative of the denominator, all divided by the square of the denominator
(wherever the denominator is not = 0).
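More briefly, at every point x where $g(x) \neq 0$, we have
$D\left(\frac{f}{g}\right) = \frac{gf' - fg'}{g^2}.$
$D\left(\frac{x^4}{x^2}\right) = \frac{x^2 \cdot 4x^3 - x^4 \cdot 2x}{(x^2)^2} = \frac{2x^5}{x^4} = 2x.$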
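The quotient computation for $x^4/x^2$ can be repeated in exact rational arithmetic. This is a sketch for illustration; the function name `quotient_rule` is hypothetical, and the check is meaningful only at points where the denominator is not 0.

```python
from fractions import Fraction


def quotient_rule(f, df, g, dg, x):
    """(g f' - f g') / g^2, evaluated at a point x where g(x) != 0."""
    return (g(x) * df(x) - f(x) * dg(x)) / (g(x) ** 2)


f = lambda x: x ** 4
df = lambda x: 4 * x ** 3
g = lambda x: x ** 2
dg = lambda x: 2 * x

# For x != 0 the rule should give 2x, the derivative of x^4 / x^2 = x^2.
for x in (Fraction(3, 2), Fraction(-2), Fraction(1, 7)):
    print(x, quotient_rule(f, df, g, dg, x), 2 * x)
```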
(i) $Dk = 0,$
$t^2 - tx^2 + x,$
$f: t \mapsto t^2 - tx^2 + x.$
$g: x \mapsto t^2 - tx^2 + x.$
because nobody could tell whether we meant f' or g'. To eliminate the ambiguity,
we write $D_t$ or $D_x$, to indicate which letter does not represent a constant. Thus
$D_t(t^2 - tx^2 + x) = f'(t) = 2t - x^2,$
while
$D_x(t^2 - tx^2 + x) = g'(x) = -2tx + 1.$
Similarly,
$D_x(ax^3z + z^2) = 3ax^2z,$
All of the following are differentiation problems. Most of them can be worked by the
standard formulas that we have just derived. But in some cases you will need to start with
the definition of $f'(x_0)$ and then use various algebraic stratagems.
1 x
2.
0
1. D(7x1 - x8) D-- 3. D --
x + 1 x + 1
1 y
4. D -- 5. D -- 6. D(?y4 _ y2 + 7T)
x2 + 1 y3 - 3
7.
n (�) 8.
(
n 2 � x) 9. D(l + x)3
16. If you worked Problem 9 of Section 3.3, you know that $f(x) = \sqrt{x}$ is continuous in its
entire domain $R^+$. Assuming, in any case, that this is true, find f'. [Hint: Set up the
fraction whose limit is $f'(x_0)$, rationalize the numerator and hope for the best.]
17. Given $f(x) = \sqrt{x + 1}$ $(x \ge -1)$, find f'. Here you may assume that f is continuous.
But you should mention this fact, at the stage where you need it.
20. Given $g(x) = x^2\sqrt{1 - x^2}$, find g'.
21. Find $D(1/\sqrt{x})$ (x > 0).
22. Find $D(1/\sqrt{1 - x^2})$.
23. Find $D(x/\sqrt{1 - x^2})$ (-1 < x < 1).
24. Now solve Problem 19 by the methods of Chapter 2, without using limits or differentiation
formulas.
Find out whether the following formulas are correct, and give your reasons.
25. $D(x^2 + 1)^2 = 2(x^2 + 1)$ (?)
26. $D(x^2 + 1)^2 = 2(x^2 + 1) \cdot 2x$ (?)
27. $D(x^2 + 1)^3 = 3(x^2 + 1)$ (?)
28. $D(x^2 + 1)^3 = 3(x^2 + 1) \cdot 2x$ (?)
29. $D(x^2 + 1)^{500} = 500(x^2 + 1)^{499}$ (?) [Hint: In fact, this formula is wrong. And it is
possible to prove that it is wrong, without finding out what the derivative of the given
function really is.]
30. Prove by induction that $Dx^n = nx^{n-1}$, without using either a factorization formula
or the binomial theorem.
Some of the answers that you got in the preceding problem set deserve to be regarded
as standard differentiation formulas. For example, you found that for
$f(x) = \sqrt{x}$
we have
$f'(x) = \frac{1}{2\sqrt{x}}.$
This problem is going to come up again. We had, therefore, better add it to the list
of formulas at the end of Section 3.5:
(viii) $D\sqrt{x} = \frac{1}{2\sqrt{x}}.$
You found also that
$D\sqrt{x + 1} = \frac{1}{2\sqrt{x + 1}} \quad (x > -1),$
$D\sqrt{x^2 + 1} = \frac{x}{\sqrt{x^2 + 1}},$
$D\sqrt{1 - x^2} = \frac{-x}{\sqrt{1 - x^2}} \quad (-1 < x < 1).$
In each of these cases, we have the problem of finding the derivative of the positive
square root $\sqrt{f}$ of the differentiable positive function f. If we can solve this problem
$D\sqrt{f} = \;?;$
and we can then apply the formula hereafter. Suppose, then, that we are given
$g(x) = \sqrt{f(x)},$
on a domain where f (x) > 0; and suppose that f has a derivative at a certain point
x0 of the domain. We want to find g'(x0). By definition,
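$g'(x_0) = \lim_{x \to x_0} \frac{\sqrt{f(x)} - \sqrt{f(x_0)}}{x - x_0}$
$= \lim_{x \to x_0} \left[ \frac{\sqrt{f(x)} - \sqrt{f(x_0)}}{x - x_0} \cdot \frac{\sqrt{f(x)} + \sqrt{f(x_0)}}{\sqrt{f(x)} + \sqrt{f(x_0)}} \right]$
$= \lim_{x \to x_0} \left[ \frac{f(x) - f(x_0)}{x - x_0} \cdot \frac{1}{\sqrt{f(x)} + \sqrt{f(x_0)}} \right]$
$= f'(x_0) \lim_{x \to x_0} \frac{1}{\sqrt{f(x)} + \sqrt{f(x_0)}},$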
provided that the latter limit exists. It is easy to see that this limit exists, provided that
Since the limit of the sum is the sum of the limits, it then follows that
$D\sqrt{f} = \frac{f'}{2\sqrt{f}}.$
Let us try this on the function $x \mapsto \sqrt{x + 1}$. Here f(x) = x + 1, and f'(x) = 1
for every x. Therefore
$D\sqrt{x + 1} = \frac{1}{2\sqrt{x + 1}},$
which is the right answer. For $x \mapsto \sqrt{x^2 + 1}$, we have
$f(x) = x^2 + 1,$
$f'(x) = 2x,$
$D\sqrt{x^2 + 1} = D\sqrt{f(x)} = \frac{f'(x)}{2\sqrt{f(x)}} = \frac{2x}{2\sqrt{x^2 + 1}} = \frac{x}{\sqrt{x^2 + 1}}.$
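The square-root formula can also be compared with a numerical slope. In this Python sketch the helper names `d_sqrt` and `central_difference` and the step size are assumptions of ours; the comparison is approximate, since a difference quotient is not a limit.

```python
import math


def d_sqrt(f, df, x):
    """The formula D sqrt(f) = f' / (2 sqrt(f)), valid where f(x) > 0."""
    return df(x) / (2.0 * math.sqrt(f(x)))


def central_difference(func, x, step=1e-6):
    """Symmetric difference quotient; an approximation to the derivative at x."""
    return (func(x + step) - func(x - step)) / (2 * step)


f = lambda x: x * x + 1.0
df = lambda x: 2.0 * x

x = 2.0
by_formula = d_sqrt(f, df, x)                                   # x / sqrt(x^2 + 1)
by_limit = central_difference(lambda t: math.sqrt(f(t)), x)
print(by_formula, by_limit)
```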
3.6 The Process of Differentiation: Roots and Powers of Functions 99
which we haven't proved. We postpone the proof, observing meanwhile that (1) is
reasonable. Since f is continuous, we have
$g(x) = f^n(x),$
where by $f^n(x)$ we mean $[f(x)]^n$. Then
$g'(x_0) = \lim_{x \to x_0} \frac{f^n(x) - f^n(x_0)}{x - x_0}$
$= \lim_{x \to x_0} \left( \frac{[f(x) - f(x_0)]\,[f^{n-1}(x) + f^{n-2}(x)f(x_0) + \cdots + f^{n-1}(x_0)]}{x - x_0} \right)$
for each positive integer k. Equation (3) states that the kth power of a continuous
function is always continuous. And this is true: given that
because the limit of the product is the product of the limits. For the same reason,
In k - 1 such steps, we get Eq. (3). Therefore Eq. (2) is correct; and we have:
Theorem 2. If f is differentiable, and n is a positive integer, then
$Df^n = nf^{n-1}f'.$
Let us try this on our polynomial of degree 12,350:
$D(x^{50} - x^{17} + 1)^{247} = Df^{247} = 247f^{246}f' = 247(x^{50} - x^{17} + 1)^{246}(50x^{49} - 17x^{16}).$
Note that our use of the shortcut formula $Df^n = nf^{n-1}f'$ has two advantages
over the method based on the binomial expansion. First, the calculation is possible,
as a practical matter. Second, it gives the answer not merely in a correct form, but
also in a factored form, which is easier to handle than the binomial expansion of the
derivative.
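For a polynomial small enough to expand, the shortcut formula can be compared with the expansion directly. The Python sketch below is our own illustration: it stores a polynomial as a list of integer coefficients (constant term first), and it uses the assumed example $f(x) = x^5 - x^2 + 1$ with n = 4 in place of the degree-12,350 polynomial of the text.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists, constant term first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_pow(p, n):
    """n-th power of a polynomial, by repeated multiplication."""
    out = [1]
    for _ in range(n):
        out = poly_mul(out, p)
    return out


def poly_diff(p):
    """Term-by-term derivative, using D x^i = i x^(i-1)."""
    return [i * a for i, a in enumerate(p)][1:]


f = [1, 0, -1, 0, 0, 1]    # 1 - x^2 + x^5
n = 4

expanded = poly_diff(poly_pow(f, n))                                     # D(f^n), expanded
shortcut = [n * c for c in poly_mul(poly_pow(f, n - 1), poly_diff(f))]   # n f^(n-1) f'
print(expanded == shortcut)
```

The expanded form has 20 coefficients even for this small case, which suggests why the factored answer is the one to prefer.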
Since we know how to differentiate fractions, we know how to differentiate
functions of the form
$f(x) = \frac{1}{x^k},$
where k is a positive integer. We have
$f'(x) = \frac{-Dx^k}{(x^k)^2} = \frac{-kx^{k-1}}{x^{2k}} = \frac{-k}{x^{k+1}},$
where k + 1 = 2k - (k - 1). If we express $1/x^k$ as $x^{-k}$, and make the same change
in the formula for f', then we get
$Dx^{-k} = -kx^{-k-1}.$
This has the same form as our previous formula $Dx^n = nx^{n-1}$, with n = -k. What
is needed, to take care of all such cases, is the following:
Theorem 3. If n is a positive integer, and f is differentiable, then
$Df^n = nf^{n-1}f'.$
If n is a negative integer, then the same formula holds at every point x where $f(x) \neq 0$.
The last condition is necessary: if n < 0, then f(x) appears in the denominator of
$f^n(x) = 1/f^{-n}(x)$, and $f^n$ is therefore not defined at points where f(x) = 0.
Theorem 3 has already been proved for the case in which n is a positive integer.
For the case in which n is a negative integer, n = -k, the proof is as follows:
$Df^n = Df^{-k} = D\frac{1}{f^k} = -\frac{kf^{k-1}f'}{f^{2k}} = -kf^{-k-1}f' = nf^{n-1}f'.$
For convenience of reference, we list all the differentiation formulas that we have
so far:
(vii) $D\left(\frac{f}{g}\right) = \frac{gf' - fg'}{g^2},$
(viii) $D\sqrt{x} = \frac{1}{2\sqrt{x}},$
(ix) $D\sqrt{f} = \frac{f'}{2\sqrt{f}},$
(x) $Df^n = nf^{n-1}f' \quad (n \neq 0).$
Each of these formulas holds for every x for which its right-hand member is defined.
1 x
Dv'(x +l)(x +2) 2. D D
(x2 +x +1)2 (x2 +2x+1)2
I. 3.
x2 +1
4. DVx4 +5x2 +2 5. D(x3 +x2 - x +7)�2 6. D- --
x2 - 1
v'x - I
7. DVx(x - 2) 8. DVx3 +2x +1 D --
x2 + 1
9.
x2
J�
1
10. D- - -- 12. DYv'�
-
x2 +1 +x
11. D
1
x3y2
14. D,, v' x2 +a2 15. Dv(x3y2 - x2y3)3 16 · Dx
(x2 +y2)2
17. Recall that for x > 0,
Therefore you need to show that the qth power of the right-hand side of the above
equation is $x^p$.]
This is an instance of a frequent phenomenon: often, a problem becomes easy if
we rewrite it, using the definitions of the ideas that the problem involves.
18. Find $Dx^{3/2}$, and write the answer in a form which brings out the analogy with the
formula $Dx^n = nx^{n-1}$. You may assume that $x^{3/2}$ is continuous.
*27. Given positive integers p, q. Find a formula for $Dx^{p/q}$, valid for x > 0, and write the
answer in a form which brings out the analogy with the formula $Dx^n = nx^{n-1}$.
Given the graph of $y = kx^2$. In Section 2.10, we calculated the area $A_h$ of the shaded
region
$R = \{(x, y) \mid 0 \le x \le h,\ 0 \le y \le kx^2\}.$
We found that
$A_h = \frac{k}{3} h^3.$
[Figure: two graphs of the parabola; the second is labeled $y = kt^2$.]
3.7 The Integral of a Nonnegative Function 103
$F(1) = \frac{k}{3} \cdot 1^3 = \frac{k}{3}.$
F(3) is the area under the parabola from 0 to 3; this is
$F(3) = \frac{k}{3} \cdot 3^3 = 9k.$
And so on. Thus we have a function $F: R^+ \to R^+$. And we have a formula giving
the values of F:
$F(x) = \frac{k}{3} x^3.$
Here we have replaced h by x in the area formula
$A_h = \frac{k}{3} h^3.$
Thus, starting with the nonnegative function
$f: t \mapsto kt^2,$
we get a new function
$F: x \mapsto \frac{k}{3} x^3.$
For each x, the value of the new function is the area under the graph of the old one.
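The area function can be approximated without any formula, by adding up thin rectangles. The Python sketch below is an illustration under our own assumptions: `area_under` is a hypothetical helper using midpoint rectangles, and the value k = 2 is chosen arbitrarily. Its output should be close to, but not exactly equal to, $(k/3)x^3$.

```python
def area_under(f, a, x, n=10_000):
    """Approximate the area under f from a to x by n midpoint rectangles."""
    width = (x - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


k = 2.0
f = lambda t: k * t * t            # the old function, t -> k t^2
F = lambda x: (k / 3.0) * x ** 3   # the new function, x -> (k/3) x^3

for x in (1.0, 3.0):
    print(x, area_under(f, 0.0, x), F(x))
```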
We can generalize this scheme in the following way.
[Figure: graph of y = f(t) on the interval from a to b, with a point x between a and b.]
[Figure: graph of y = f(t) = t + 1, showing the region under the graph from a = 1 to x and the point (x, x + 1).]
For every x on the infinite interval $[1, \infty)$, let F(x) be the area under the graph, from
1 to x. We now have a new function
In this case, it is easy to write a formula for F. For x = 1, the area is 0. Therefore
F(1) = 0. For x > 1, F(x) is the area of a trapezoid lying on its side, with its "bases"
vertical. The altitude is h = x - 1, and the lengths of the bases are $b_1 = 2$ and
$b_2 = x + 1$. Therefore
$F(x) = \tfrac{1}{2} h (b_1 + b_2) = \tfrac{1}{2}(x - 1)(x + 3).$
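[Figure: a graph over the t-axis, with -2 and x marked.]
For each $x \ge -2$, F(x) is the area under this graph, from -2 to x. By our old formula,
$F(x) = \tfrac{1}{3}x^3 - \tfrac{1}{3}(-2)^3 = \tfrac{1}{3}x^3 + \tfrac{8}{3}.$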
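For the earlier example f(t) = t + 1 on $[1, \infty)$, the trapezoid area can be computed exactly and compared with a polynomial. In this Python sketch the names are hypothetical, and the comparison polynomial $\tfrac{1}{2}x^2 + x - \tfrac{3}{2}$ is our own choice: it has derivative x + 1 and value 0 at x = 1.

```python
from fractions import Fraction


def trapezoid_area(x):
    """Area under y = t + 1 from 1 to x (x >= 1): altitude x - 1, bases 2 and x + 1."""
    h = Fraction(x) - 1
    b1 = Fraction(2)
    b2 = Fraction(x) + 1
    return Fraction(1, 2) * h * (b1 + b2)


def candidate(x):
    """(1/2) x^2 + x - 3/2, a polynomial with derivative x + 1 that vanishes at x = 1."""
    x = Fraction(x)
    return x * x / 2 + x - Fraction(3, 2)


for x in (1, 3, Fraction(5, 2)):
    print(x, trapezoid_area(x), candidate(x))
```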
In all these situations, F(x) is determined by (a) the given function f, (b) the
number a, and (c) the number x. All this is conveyed by the notation
$F(x) = \int_a^x f(t)\,dt.$
[Figure: the region under the graph of y = f(t) from a to x, labeled $\int_a^x f(t)\,dt$.]
$\int_a^x f(t)\,dt$
is called the integral from a to x of the function f. The number a is called the lower
limit of integration (or, briefly, the lower limit) and x is called the upper limit of
integration (or simply the upper limit). The function f is called the integrand. The
notation for the integral may look formidable at first, but it is not hard to learn and
is convenient: it shows at a glance that we are taking the integral, of a certain function,
between certain limits.
We proceed to generalize these ideas in two ways.
a) Suppose that f is negative, for some values of t. In this case, areas below the
t-axis are counted negatively. For example, in the left-hand figure below, $A_1$ and $A_2$
are positive numbers, representing the areas of the two shaded regions. We count
$A_1$ positively; it is the area of a region above the t-axis. We count $A_2$ negatively;
it is the area of a region below the t-axis.
Thus we have
$\int_x^a f(t)\,dt,$
and then reverse the sign. Thus, in the figure on the left below,
[Figure: left, a graph of y = f(t) with x and a marked; right, the graph of f(t) = 2 - t with the point (x, 2 - x).]
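Consider, for example, f(t) = 2 - t, defined for every t (in the right-hand figure). For $x \ge 2$,
$\int_{-1}^{x} (2 - t)\,dt = \tfrac{1}{2} \cdot 3 \cdot 3 - \tfrac{1}{2}(x - 2)(x - 2),$
and for x < -1,
$\int_{-1}^{x} (2 - t)\,dt = -\int_{x}^{-1} (2 - t)\,dt = \tfrac{1}{2}(x + 1)(5 - x).$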
You should check that these are the right answers for the three cases. In each
case, we have computed areas of triangles and trapezoids by elementary area formulas,
and then attached the correct sign to the area of each region.
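The bookkeeping of signs for f(t) = 2 - t, with lower limit -1, can be written out case by case. The Python sketch below is our own reading of the three cases (x below -1, x between -1 and 2, x above 2); the function names are hypothetical, and exact fractions are used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def signed_area(x):
    """Integral of 2 - t from -1 to x, assembled from triangles and trapezoids."""
    x = Fraction(x)
    if x > 2:
        # Triangle above the axis on [-1, 2], minus the triangle below it on [2, x].
        return HALF * 3 * 3 - HALF * (x - 2) * (x - 2)
    if x >= -1:
        # Trapezoid above the axis, with vertical bases 3 and 2 - x.
        return HALF * (x + 1) * (3 + (2 - x))
    # x < -1: take the trapezoid on [x, -1] and reverse the sign.
    return -(HALF * (-1 - x) * ((2 - x) + 3))


def closed_form(x):
    """The single expression (1/2)(x + 1)(5 - x)."""
    x = Fraction(x)
    return HALF * (x + 1) * (5 - x)


for x in (-4, -1, 0, 2, Fraction(7, 2), 5):
    print(x, signed_area(x), closed_form(x))
```

In this example all three cases agree with the one expression $\tfrac{1}{2}(x + 1)(5 - x)$.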
f ltldt,
d) Let
F(x) = f' t
l l dt.
f'(t2 + 1) dt.
$\operatorname{sig} t = \begin{cases} -1 & \text{when } t < 0, \\ 0 & \text{when } t = 0, \\ 1 & \text{when } t > 0. \end{cases}$
f' sig t dt
i1 sig t dt = I;
1-i sigtdt = - ( - !) = 1,
13 sigtdt = 3,
3.8 The Derivative of the Integral 109
$\int_{-1}^{1} \operatorname{sig} t\,dt = -1 + 1 = 0;$
and so on.
Do the same six things for f(t) = sig t that you did for f(t) = |t| and $f(t) = t^2 + 1$.
4. Do the same six things, for f(t) = t|t|.
5. a) Explain why
$\int_{-1}^{1} t^3\,dt = 0.$
b) Explain why
$\int_{-1}^{1} t^{273}\,dt = 0.$
fa f(t) cit = o.
6. Explain why
0 <
J3 -dt 1 14
< - .
1
t 12
7. Explain why
3 1---+t2 dt
J
1 3
0 < < 5.
1
[Figure: graphs of y = f(t) and y = F(x).]
$F'(x_0) = \lim_{x \to x_0} \frac{F(x) - F(x_0)}{x - x_0}.$
Now
$F(x_0) = \int_a^{x_0} f(t)\,dt.$
Therefore
$F(x) - F(x_0) = \int_a^{x} f(t)\,dt - \int_a^{x_0} f(t)\,dt.$
$F(x) - F(x_0) = \int_{x_0}^{x} f(t)\,dt.$
Since f is continuous,
$F(x) - F(x_0) \approx (x - x_0)f(x_0).$
Here "$\approx$" means "is approximately equal to." We are claiming that the area under
the curve, from $x_0$ to x, is closely approximated by the area of a rectangle with base
$x - x_0$ and altitude $f(x_0)$. Therefore
$\frac{F(x) - F(x_0)}{x - x_0} \approx f(x_0).$
[Figure: graph of y = f(t), with a and x marked on the axis.]
For the situation shown in the figure, in which the graph rises to the right of $x_0$,
it is easy to see why the approximation is good. Here we have
$F(x) - F(x_0) = (x - x_0)f(x_0) + E,$
where E is the area of the little curvilinear triangle at the upper right. Now
$E < \epsilon(x - x_0)$ and $\frac{E}{x - x_0} < \epsilon,$
If x < x0, the same approximation formula holds, although the reasons are
slightly different. Here the area under the curve from x to x0 is
F(x0) - F(x);
the area of the rectangle is
( x0 - x)f (x0);
these are approximately equal. Changing the sign of each, we get
as before.
But the fraction on the left is the slope of the secant line to the graph of F. The
limit of this fraction is $F'(x_0)$; and since the fraction is close to $f(x_0)$ when x is close
to $x_0$, we ought to have
$F'(x_0) = f(x_0).$
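The heuristic can be tried on a specific function. In this Python sketch, the integrand $f(t) = t^4$, the point $x_0 = 1.5$, and the helper names are our own assumptions; the integral is approximated by midpoint rectangles, so the output only suggests, and does not prove, that the secant slopes of F approach $f(x_0)$.

```python
def integral(f, a, x, n=20_000):
    """Approximate the integral of f from a to x by n midpoint rectangles.

    If x < a, the width is negative, so the sign is reversed automatically.
    """
    width = (x - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


def secant_slope_of_F(f, x0, h):
    """(F(x0 + h) - F(x0)) / h, using F(x0 + h) - F(x0) = integral from x0 to x0 + h."""
    return integral(f, x0, x0 + h) / h


f = lambda t: t ** 4
x0 = 1.5

# Slopes from the right and from the left, for shrinking h.
for h in (1e-1, 1e-2, 1e-4, -1e-4):
    print(h, secant_slope_of_F(f, x0, h), f(x0))
```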
To solve this problem, the first step is to realize that whoever proposed the
problem has asked the wrong question: the answer to his question is a number, and
there is nothing about this number that is easy to see.
[Figure: graphs of $y = x^4$ and $y = f(t) = t^4$.]
$F(x) = \int_0^x t^4\,dt.$
It might seem that Problem 2 must be harder, but this is not true. The point is
that, while information about the number F(l) is hard to come by, Theorem 1 tells us
something about the function F, namely,
$F'(x) = x^4.$
We now ask ourselves what sort of function has $x^4$ as its derivative. We get powers
of x by differentiating powers of x, using the formula
$Dx^n = nx^{n-1}.$
Thus
$Dx^5 = 5x^4.$
This is like F'(x), except for the factor 5. But the 5 is easy to get rid of: we divide
$x^5$ by 5, getting
$D(\tfrac{1}{5}x^5) = \tfrac{1}{5} \cdot 5x^4 = x^4.$
$G(0) = \tfrac{1}{5} \cdot 0^5 = 0$
and that
$F(0) = \int_0^0 t^4\,dt = 0.$
The functions F and G start with the same value, at x = 0. And (1) tells us that F
and G always change at the same rate. This suggests the following:
Here we call the interval I because we want to allow intervals of all kinds, including
[a, b], [a, b), [a, oo), ( -oo, a], and so on. We also allow the case I= ( - oo, oo).
This is the case for the functions
Suppose that $F(b) \neq G(b)$ for some b on the interval I. For each x on [a, b], let
The slope of the chord joining the endpoints of the graph of H is
$\frac{H(b) - H(a)}{b - a} \neq 0.$
By MVT there is an x between a and b such that
$H'(x) = \frac{H(b) - H(a)}{b - a}.$
Therefore $H'(x) \neq 0$. But this is impossible: for every x we have
Obviously this area is
Let
Then
$F'(x) = 1 - x^2,$
by Theorem 1. It is easy to find another function with this derivative, namely,
$G(x) = x - \tfrac{1}{3}x^3.$
Now F(-1) = 0, and
$G(-1) = -1 + \tfrac{1}{3} = -\tfrac{2}{3}. \quad (?)$
$G(x) = x - \tfrac{1}{3}x^3 + \tfrac{2}{3}.$
Then
$G(-1) = 0,$
as it should be; and by the uniqueness theorem it follows that
$G(x) = F(x)$
for every x. Therefore
: 1
-1 1 2
fy2 + t + 2) dt.
Let
As our first guess, let $G(x) = \tfrac{1}{3}x^3 + \tfrac{1}{2}x^2 + 2x$, so that G'(x) = F'(x). We find that
$G(-1) = -\tfrac{1}{3} + \tfrac{1}{2} - 2 = -\tfrac{11}{6}$. Therefore we really want $G(x) = \tfrac{1}{3}x^3 + \tfrac{1}{2}x^2 + 2x + \tfrac{11}{6}$,
so that G(-1) = 0. This gives the answer in the form
$F(x) = \int_a^x f(t)\,dt,$
$\int_a^b f(t)\,dt$
we tried. We then know that G(x) = F(x) for every x. Therefore
$\int_a^b f(t)\,dt = F(b) = G(b),$
in the same way as for positive functions, and for the same reasons.
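The two requirements on G, namely that G' = F' and that G vanishes at the lower limit, can both be checked exactly for the examples above. This Python sketch is an illustration under our own conventions: a polynomial is a list of coefficients with the constant term first, and the helper names are hypothetical.

```python
from fractions import Fraction as Fr


def poly_diff(p):
    """Term-by-term derivative of a polynomial (constant term first)."""
    return [i * c for i, c in enumerate(p)][1:]


def poly_eval(p, x):
    """Value of the polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(p))


# G(x) = x - (1/3) x^3 + 2/3, for the integrand 1 - t^2 with lower limit -1.
G1 = [Fr(2, 3), Fr(1), Fr(0), Fr(-1, 3)]
print(poly_diff(G1), poly_eval(G1, Fr(-1)))

# G(x) = (1/3) x^3 + (1/2) x^2 + 2x + 11/6, for the integrand t^2 + t + 2.
G2 = [Fr(11, 6), Fr(2), Fr(1, 2), Fr(1, 3)]
print(poly_diff(G2), poly_eval(G2, Fr(-1)))
```

In each case the derivative reproduces the integrand and the value at -1 is 0, which is what the uniqueness theorem requires before we may conclude that G = F.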
1. By the methods of this section, find the area under the graph of $y = x - x^3$, from
x = 0 to x = 1, and sketch. (Here, and hereafter in this problem set, you should
explain what functions you are using as the functions f, F, and G.)
2. Find the area of the region lying below the x-axis and above the graph of $y = x^4 - 1$.
Note that here the function f is negative, so that the area and the integral are different;
the area A is positive, and
f' tndt,
4. Find
5. a) Find
f (t3 + t - 1) dt.
b) Get a general formula for
F(x) f + t - 1) dt.
= (t3
$y = \frac{x}{\sqrt{x^2 + 1}},$
from 1 to 2. Unless you happen to remember a function whose derivative is
$x/\sqrt{x^2 + 1}$, you are going to have to figure out how this function might arise as
the answer to a differentiation problem. The radical in the denominator suggests
that somebody has been using the formula
$D\sqrt{f} = \frac{f'}{2\sqrt{f}}.$
b) Find
$\int_{-1}^{1} \frac{t}{\sqrt{t^2 + 1}}\,dt.$
$y = f(t) = \frac{t}{\sqrt{t^2 + 1}},$
as well as you can, and explain how the numerical value that you got for the integral
could have been predicted, without any calculations at all.
8. Let
F(x) f(1 +
= v'()0 dt.
Express F'(x) by an elementary formula (that is, by a formula not involving integrals
or differentiations). Note that you are not being asked to express F(x) by an elementary
formula.
9. Same as Problem 8, for
1
11. Find the area under the graph of $y = \frac{1}{\sqrt{x + 1}}$ (and above the x-axis) from 0 to 1.
14. Find
15. Find
2 x2
for x > 0; the derivative is 0; and 1/x still does not appear. Thus f(x) = 1/x is not the
If you attempt to evaluate this integral by applying the methods of this section in a
mechanical sort of way, you will get an "answer." If you try to interpret your answer
geometrically, you will see that your answer cannot possibly be right. What went wrong?
(Evidently we must have been trying to apply a theorem in a case in which its hypothesis
is not satisfied. The question is what theorem and what hypothesis.)
*18. In Theorem 1, suppose that we had omitted the hypothesis that f is continuous. Give
an example to show that the resulting theorem would not have been true. [Hint: You
have already seen cases in which a function of the type
$F(x) = \int_a^x f(t)\,dt$
fails to have a derivative at some point x0; and surely we cannot have
Suppose that a particle is moving, according to some given law, along a line . If we
think of the line of motion as the y-axis, then the motion can be described by a
function
$f: I \to R;$
for each time t on the interval, f(t) is the y-coordinate of the moving particle at time t.
Thus, for example, in the figure below, the total time interval I is the closed interval
$[t_1, t_4]$. The figure tells us that, at the start of the motion, the time is $t_1$ and the particle
is at the point y = 1; in the time interval $[t_1, t_2]$, the particle rises from 1 to 3; in the
time interval $[t_2, t_3]$, the particle falls from 3 to -1; and in the time interval $[t_3, t_4]$,
the particle rises from -1 to 4.
The figure shows a finite time interval $I = [t_1, t_4]$. More generally, the function
f may be defined on an infinite time interval $I = [t_1, \infty)$ or $I = R = (-\infty, \infty)$.
But most of the time, on or near the earth, the motion begins at some time t0, and
eventually the motion stops. The velocity is the function
$v = f': I \to R,$
$a = v': I \to R,$
provided that v is differentiable. Thus the acceleration is the derivative of the derivative
of f. We call this the second derivative of f, and denote it by f''. Thus we can sum up:
$v = f', \quad a = v' = f''$
(by definition).
Finally, there is a fourth function associated with the motion. This is the function
$F: I \to R$
which gives, for each time t, the force F(t) acting on the body at time t.
We shall now see what form these functions take when f describes the motion of a
freely falling body. Before we can work mathematically on the problem, we have to
state our physical assumptions in mathematical form.
2) For a freely falling body (or a body projected vertically upward), the force is the
resultant of the weight (which acts downward) and the air resistance (which acts upward
when the body is falling and downward when the body is rising). If the speed is
moderate, then the air resistance can be neglected. Hereafter, we shall assume that the
weight is the only force, so that
and
$a(t) = \frac{k_3}{m} < 0.$
This last equation says that for each falling body there is a constant which is equal to
the acceleration, independently of the time.
4) There remains, however, a question: is there one constant which works for all
falling bodies, or does the constant acceleration depend on what sort of body is
falling? Conceivably, the law governing the free fall of heavy bodies (such as cannon
balls) might be different from the law governing the fall of light bodies (such as BB
shots). In fact, until the time of Galileo, everybody thought that heavy bodies fell
faster. The story goes that Galileo proved them wrong by dropping two iron balls
of different sizes off the leaning tower of Pisa: they hit the ground at the same time.
Since $k_3/m$ is independent of m, there is a constant $-g = k_3/m$ which gives the
acceleration of every freely falling body, regardless of its mass. The number
$g = -\frac{k_3}{m}$
is called the acceleration of gravity. If distance is measured in feet and time in seconds,
then numerically
$g \approx 32$, measured in ft/sec².
3.9 Uniformly Accelerated Motion 121
$a(t) = -g,$
where g is a constant and
$g \approx 32$ ft/sec².
We now consider the problem of finding the functions that satisfy the equation
c) $f(0) = y_0$.
Thus our data consist of (a) the constant acceleration -g, (b) the initial velocity
v0 = v(O), and (c) the initial position y0 = f(O). The solution is as follows:
a) We know that
v'(t) = a(t) = -g
for every t. The function
u(t) = -gt (?)
has -g as its derivative; the only trouble is that u(0) is 0 instead of $v_0$. But this is
easy to fix: we change our minds and let
$u(t) = -gt + v_0.$
Our function u then has the same derivative as v, and has the same value at t = 0.
By the uniqueness theorem, u and v are the same function, and so
$v(t) = -gt + v_0.$
The function
$z(t) = -\frac{g}{2}t^2 + v_0t \quad (?)$
has $-gt + v_0$ as its derivative; the only trouble is that z(0) is 0 instead of $y_0$. But
this is easy to fix: we change our minds and let
$z(t) = -\frac{g}{2}t^2 + v_0t + y_0.$
The function z then has the same derivative as f, and has the same value at t = 0.
By the uniqueness theorem, f and z are the same function, and so
then
d) $f(t) = (-g/2)t^2 + v_0t + y_0$ for every t.
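It is easy to confirm that the formula in (d) meets the three conditions. The Python sketch below is only an illustration: the function names, the sample values $v_0 = 10$ and $y_0 = 5$, and the use of a second difference quotient in place of the second derivative are all our own assumptions, with g taken as 32 ft/sec².

```python
G_FT_PER_SEC2 = 32.0   # approximate acceleration of gravity, in ft/sec^2


def position(t, v0, y0, g=G_FT_PER_SEC2):
    """f(t) = -(g/2) t^2 + v0 t + y0."""
    return -(g / 2.0) * t * t + v0 * t + y0


def velocity(t, v0, g=G_FT_PER_SEC2):
    """v(t) = -g t + v0."""
    return -g * t + v0


def second_difference(func, t, h=1e-2):
    """(f(t + h) - 2 f(t) + f(t - h)) / h^2, an approximation to f''(t)."""
    return (func(t + h) - 2.0 * func(t) + func(t - h)) / (h * h)


v0, y0 = 10.0, 5.0
f = lambda t: position(t, v0, y0)

print(f(0.0))                      # condition (c): the initial position y0
print(velocity(0.0, v0))           # condition (b): the initial velocity v0
print(second_difference(f, 1.0))   # condition (a): close to -g
```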
Thus the mathematical problem defined by (a), (b), and (c) has only one solution.
This fact is important in applications, because, if our mathematical problem had two
solutions, we would have to find out which of the two solutions applied to the
physical situation that we started out to investigate. But, if $f''(t) = -g$, $f'(0) =$
v'(t) = -g.
We try
u(t) = -gt (?);
we observe that u(t0) = -gt0 instead of v0; to fix this, we let
u(t) = -gt + gt0 + Vo·
Now u has the same derivative as the unknown function v, and has the same value
at $t = t_0$. By the uniqueness theorem, u(t) = v(t) for every t, and so
we observe that $z(t_0) = (g/2)t_0^2 + v_0t_0$ instead of $y_0$; and we fix this by letting
$z(t) = -\frac{g}{2}t^2 + gt_0t + v_0t - \frac{g}{2}t_0^2 - v_0t_0 + y_0.$
Then z has the same derivative as f, and has the same value at $t_0$. By the uniqueness
theorem it follows that z(t) = f(t) for every t. Therefore
None of these formulas should be learned. What you need to learn is the process by
which they were derived; if you remember the method, you can use it. For example:
Problem. Find a function f such that $f''(t) = 3$, $f'(3) = 1$, and $f(3) = 2$.
Solution. Let $v = f'$. Then $v'(t) = 3$. This suggests that $v(t) = 3t$. Adding the
appropriate constant, to get $v(3) = 1$, we obtain
$$v(t) = 3t - 8.$$
Now
$$f'(t) = 3t - 8.$$
This suggests $f(t) = \frac{3}{2}t^2 - 8t$. Adding the appropriate constant, to get $f(3) = 2$,
we have
$$f(t) = \frac{3}{2}t^2 - 8t - \frac{3}{2}\cdot 3^2 + 8\cdot 3 + 2 = \frac{3}{2}t^2 - 8t + \frac{25}{2}.$$
This is the answer. (Two differentiations verify that it is an answer; and two applications
of the uniqueness theorem tell us that it is the only answer.)
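The two-step process used above (find v from its derivative, then find f from v, adjusting a constant each time) can be sketched in a few lines of Python. This is only an illustration and not part of the text: the function name `constant_acceleration` and its argument names are our own, and exact fractions are used so that the coefficients come out in closed form.

```python
from fractions import Fraction

def constant_acceleration(a, t0, v0, y0):
    """Return (A, B, C) such that f(t) = A*t**2 + B*t + C satisfies
    f''(t) = a, f'(t0) = v0, and f(t0) = y0."""
    a, t0, v0, y0 = (Fraction(value) for value in (a, t0, v0, y0))
    # Step 1: v(t) = a*t + B, with B chosen so that v(t0) = v0.
    B = v0 - a * t0
    # Step 2: f(t) = (a/2)*t**2 + B*t + C, with C chosen so that f(t0) = y0.
    A = a / 2
    C = y0 - A * t0**2 - B * t0
    return A, B, C

# The worked example: f'' = 3, f'(3) = 1, f(3) = 2.
A, B, C = constant_acceleration(3, 3, 1, 2)
print(A, B, C)  # prints: 3/2 -8 25/2
```

With `a = -g`, `t0 = 0`, the same function returns the coefficients of the free-fall formula in (d).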
Find formulas for the unknown functions, under each of the following sets of conditions.
In all but one of these problems, the conditions are enough to determine the function.
In three cases, however, there are infinitely many possibilities; and in these cases you should
try to explain what the possible functions are.
5. $f''(t) = t^3$, $f'(0) = 1$, $f(1) = 0$
6. $f'(x) = \dfrac{1}{x^2}$, $f(1) = 2$
7. $g'(x) = \dfrac{x}{\sqrt{1 - x^2}}$, $g(0) = -1$
8. $g'(x) = x(x^2 + 1)^2$, $g(3) = 1$
9. $f'(t) = \dfrac{1}{\sqrt{t}}$, $f(2) = 5$
10. $f'(t) = t^2(1 + t^3)^{10}$, $f(0) = 2$ (By all means, do not use the binomial theorem on this
one.)
13. $f'(t) = \dfrac{t^2}{(1 + t^3)^2}$, $f(1) = 1$
14. $g''(t) = \dfrac{1}{(t + 1)^3}$, $g(0) = 1$, $g(1) = 1$.
15. A "theoretical projectile" is fired vertically upward, from the surface of the earth, at
time 0, with initial velocity 10 ft/sec. When will it hit the ground again? For what time
interval is its motion described by the condition $a(t) = -g$? (Following the advice
given at the end of this section, you should solve this problem with your book closed,
using the methods but not the results given in the text.)
16. A "theoretical projectile" is fired vertically upward, from the surface of the earth, and
hits the ground again ten seconds later. What was the initial velocity?
17. A "theoretical projectile" is fired vertically downward from the top of a 200-foot
building and hits the ground 2 seconds later. What was the initial velocity?
18. We state this problem in a nonmilitary form. A billiard ball is raised to a certain height
y0 and simply dropped, so that it begins its free fall at velocity v0 = 0. Five seconds
later it hits the ground. What was y0?
19. Free fall near the surface of the moon works the same way as free fall near the surface
of earth, except that the constant acceleration -gL (L for lunar) is different; the smaller
mass of the moon makes the difference. Suppose you went to the moon, dropped a
billiard ball as in Problem 18, and found that it dropped 3 feet in one second. What
could you conclude about gL?
Let $\varepsilon$ be any positive number. Since f is continuous, we know that the graph of f
has an $\varepsilon\delta$-box at the point $(x_0, f(x_0))$.
3.10 Proof of the Formula for the Derivative of the Integral 125
[Figure: an $\varepsilon\delta$-box for the graph of f at the point $(x_0, f(x_0))$.]
Thus
$$|x - x_0| < \delta \implies |f(x) - f(x_0)| < \varepsilon.$$
We are going to use these inequalities to get information about the function
$$m(x) = \frac{1}{x - x_0}\,[F(x) - F(x_0)].$$
Here m is the slope function for the function F, so that $\lim_{x \to x_0} m(x) = F'(x_0)$.
Evidently
$$F(x) - F(x_0) = \int_{x_0}^{x} f(t)\,dt,$$
and so
$$m(x) = \frac{1}{x - x_0}\,[F(x) - F(x_0)] = \frac{1}{x - x_0}\int_{x_0}^{x} f(t)\,dt.$$
If f is positive and x0 < x, as in the figure, then F(x) - F(x0) is the area of the
shaded region.
and so
$$f(x_0) - \varepsilon < \frac{1}{x_0 - x}\int_{x}^{x_0} f(t)\,dt < f(x_0) + \varepsilon.$$
When we interchange x and $x_0$, this changes the sign of each of the factors in the
middle of this expression. Therefore
$$f(x_0) - \varepsilon < \frac{1}{x - x_0}\int_{x_0}^{x} f(t)\,dt < f(x_0) + \varepsilon.$$
To sum up:
$$x_0 - \delta < x < x_0 \implies f(x_0) - \varepsilon < m(x) < f(x_0) + \varepsilon \implies |m(x) - f(x_0)| < \varepsilon, \tag{2}$$
exactly as in Case 1. Fitting together our results in Cases 1 and 2, we get
$$0 < |x - x_0| < \delta \implies |m(x) - f(x_0)| < \varepsilon.$$
Therefore
$$\lim_{x \to x_0} m(x) = \lim_{x \to x_0} \frac{1}{x - x_0}\,[F(x) - F(x_0)] = f(x_0),$$
which was to be proved.
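The conclusion of the proof can also be watched numerically. The sketch below is only an illustration under our own assumptions: the integrand $\sqrt{1 + t^8}$ is borrowed from the problem set, the integral is approximated by a midpoint rule (not the method of the text), and the names `integral` and `slope` are ours.

```python
import math

def integral(f, a, b, n=2000):
    """Midpoint-rule approximation to the integral of f from a to b.
    If b < a the step h is negative, so the sign comes out right."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def slope(f, x0, x):
    """m(x) = [F(x) - F(x0)] / (x - x0), where F(x) - F(x0) is the
    integral of f from x0 to x."""
    return integral(f, x0, x) / (x - x0)

f = lambda t: math.sqrt(1 + t**8)
x0 = 1.0
# As x approaches x0 from either side, m(x) should approach f(x0).
for dx in (0.1, -0.1, 0.001, -0.001):
    print(dx, slope(f, x0, x0 + dx), f(x0))
```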
This proof is not easy, but it might have been worse. It was made simpler by the
fact that for each $\varepsilon > 0$, the $\delta > 0$ that we get from the hypothesis $\lim_{x \to x_0} f(x) = f(x_0)$
also works for the conclusion $\lim_{x \to x_0} m(x) = f(x_0)$.
4. {"' � dt
3. h(x) =
5. g(x) = r4 + t8dt
Vl 6. h(x) i v t2 + 1dt
=
7. f(x) =
J("'4,, 1dt
v( 8. g(x) f (1 + t3)100dt
=
2
=
� dt
-
2
17. J"' J 1 + tdt 18. {"' J 1 + t dt
-1 1 t
- Ja 1 + t4
19. If you know that
$$D\int_a^x f(t)\,dt = f(x)$$
for every continuous function f, this does not immediately enable you to find the
derivative of
$$f(x) = \int_0^{2x} \sqrt{1 + t^8}\,dt.$$
But find the answer $f'$, by any method.
*20. Find $g'(x)$, given
g(x) =
(
Jo
"' ' vl + t8 dt.
4 Trigonometric and Exponential Functions
If $\overrightarrow{AB}$ and $\overrightarrow{AC}$ are rays which have the same endpoint A, but do not lie on the same
line, then their union is the angle $\angle BAC$. (In the figure, the arrowheads remind us
that the sides of an angle are rays rather than segments.) Some authors define the
word angle in such a way as to allow "zero angles" and "straight angles."
In any case, in elementary geometry the idea of an angle does not include the idea
of order; the sides of an angle are not arranged in an order, any more than the sides
of a triangle are.
[Figure: the directed angles $\angle AOB$ and $\angle BOA$, with initial and terminal sides marked.]
In trigonometry, however, the order of the sides of an angle makes a difference.
Henceforth, whenever we speak of an angle we shall mean a directed angle. Thus,
in the figures above, $\angle AOB$ is an ordered pair of rays $(\overrightarrow{OA}, \overrightarrow{OB})$; the ray $\overrightarrow{OA}$ is the
initial side, and $\overrightarrow{OB}$ is the terminal side. Thus $\angle AOB$ is different from $\angle BOA$.
4.1 Directed Angles. Trigonometric Functions of Angles and Numbers 129
We can now define the trigonometric functions of an angle $\angle AOB$. The procedure
is as follows. We set up a right-handed coordinate system, in which the initial side $\overrightarrow{OA}$
is the positive half of the x-axis. On the terminal side $\overrightarrow{OB}$ we choose a point $P \neq O$.
P has coordinates (x, y), in the coordinate system that we have set up, and the distance
OP is a positive number r.
It is easy to show (by similar triangles) that the ratios x/r, y/r, y/x, x/y, r/x, r/y are
independent of the choice of P; they depend only on the angle that we started with.
Thus we can define the trigonometric functions of $\angle AOB$ as follows:
$$\sin \angle AOB = \frac{y}{r}, \quad \cos \angle AOB = \frac{x}{r}, \quad \tan \angle AOB = \frac{y}{x},$$
$$\cot \angle AOB = \frac{x}{y}, \quad \sec \angle AOB = \frac{r}{x}, \quad \csc \angle AOB = \frac{r}{y}.$$
We have defined six functions. Note that the domains of these functions are not
sets of numbers, but sets of angles.
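Consider now the unit circle C, with center at the origin, in the xy-plane.
[Figure: the unit circle C, with the points $P_0$ and $P_\theta$ marked.]
Let $P_0$ be the point (1, 0), as in the figure. To each real number $\theta$ there corresponds a
point $P_\theta$ of C, under the following rules:
1) Given $\theta \geq 0$, we start at $P_0$ and move around C in the counterclockwise direction,
until we have traced out a path whose total length is $\theta$. The point where our path ends
is $P_\theta$.
2) Given $\theta < 0$, we start at $P_0$ and move around C in the clockwise direction, until
we have traced out a path whose total length is $|\theta|$. The point where our path ends
is $P_\theta$.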
Thus we have a function
$$w\colon \mathbb{R} \to C, \qquad \theta \mapsto P_\theta = w(\theta),$$
under which to each real number $\theta$ there corresponds a point of C. The function w
is called the winding function. Note that the values of the function w are points
rather than numbers. Note also that
$$P_{\theta + 2\pi} = P_\theta$$
for every $\theta$. The reason is that when we add $2\pi$ to $\theta$, this merely means that we
take another round trip around the circle, ending at the same point $P_\theta$ where we
began. Similarly,
and
$$\angle\theta = \angle P_0OP_\theta.$$
The symbol $\angle\theta$ is pronounced "angle $\theta$"; $\angle\theta$ is the angle which corresponds to the
number $\theta$. We now define
$$x_\theta^2 + y_\theta^2 = 1,$$
and we have:
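If the sign of $\theta$ is changed, this sends us around the circle C in the opposite
direction. Therefore the points $P_\theta$ and $P_{-\theta}$ are symmetric across the x-axis, as in the
figure.
[Figure: the points $P_\theta$ and $P_{-\theta}$ on the unit circle.]
Therefore
$$x_{-\theta} = x_\theta, \qquad y_{-\theta} = -y_\theta.$$
This gives: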
Theorem 3.
$$\sin 0 = 0, \quad \cos 0 = 1, \quad \sin\frac{\pi}{2} = 1, \quad \cos\frac{\pi}{2} = 0.$$
Proof. For each $\theta$, the points $P_\theta$ and $P_{\pi+\theta}$ are symmetric across the origin. This
holds in all quadrants. Therefore
In the kind of trigonometry that we are dealing with now, the relation between
angles and numbers is a little tricky. If $\theta$ is known, then $P_\theta$ is determined, and so
$\angle\theta$ is determined; $\angle\theta$ is $\angle P_0OP_\theta$.
But if the angle is known, the number $\theta$ is not determined. In the figure on the right,
$\angle P_0OQ$ is given, but for this angle we may have
() = !7T + 2n1T.
If an angle $\angle AOB$ corresponds to a number $\theta$, under the rules that we have been
giving, then we shall say that $\angle AOB$ has measure $\theta$, and we shall write, for short,
$$\angle AOB = \angle\theta.$$
(We have seen that every angle $\angle AOB$ has infinitely many measures $\theta$. For
this reason, it would be misleading to speak of "the measure of an angle.")
So far, we have used the notation $\angle\theta$ only for angles "in standard position,"
that is, angles with the positive half of the x-axis as initial side. But it will be convenient
to use the same shorthand for angles in general. Thus
R x'
'
But if we set up new axes x', y , we can also say that
LQOS = L 7T - 2
, and LQOT = LTT.
LSOP0 = L (- ) 37T ,
4
and so on.
Derive the trigonometric identities given or suggested below. The derivations should
be based on the definitions and theorems given in this section of the text.
1.
sin 8
2.
cos 8
3.
tan x
4.
cotx
5. --1
secy
sin 8 cos 8 sec 8 csc 8
6. 7. 8. 9. 10.
csc z cos 8 sin 8 csc 8 sec 0
15. $\tan(-\theta) =$   16. $\cot(-\theta) =$   17. $\sec(-\theta) =$   18. $\csc(-\theta) =$
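In the figure on the left below, we have $x_\theta = \cos\theta$ and $y_\theta = \sin\theta$, by definition of
the sine and cosine.
[Figure: the point $P(x, y)$.]
$$\frac{|x|}{a} = \frac{|x_\theta|}{1} \quad\text{and}\quad \frac{|y|}{a} = \frac{|y_\theta|}{1}.$$
Therefore
$$|x| = a\,|x_\theta|, \qquad |y| = a\,|y_\theta|.$$
In these equations x and $x_\theta$ also agree in sign, and similarly for y and $y_\theta$. Therefore
$$B = (a\cos\theta,\ a\sin\theta).$$
And obviously
$$A = (b, 0).$$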
Therefore, by the distance formula,
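Then
$$A = (1, 0), \qquad C = (\cos(\theta + \phi),\ \sin(\theta + \phi)),$$
$$\begin{aligned}
AC^2 &= [\cos(\theta + \phi) - 1]^2 + \sin^2(\theta + \phi) \\
&= \cos^2(\theta + \phi) - 2\cos(\theta + \phi) + 1 + \sin^2(\theta + \phi) \\
&= 2 - 2\cos(\theta + \phi).
\end{aligned}$$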
4.2 The Law of Cosines and the Addition Formulas 137
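We now set up a new coordinate system, with $\overrightarrow{OP_\theta}$ as the positive x'-axis. In the
new system, A has coordinates $(\cos\theta, -\sin\theta)$ and C has coordinates $(\cos\phi, \sin\phi)$, so that
$$\begin{aligned}
AC^2 &= (\cos\theta - \cos\phi)^2 + (-\sin\theta - \sin\phi)^2 \\
&= \cos^2\theta - 2\cos\theta\cos\phi + \cos^2\phi + \sin^2\theta + 2\sin\theta\sin\phi + \sin^2\phi \\
&= 2 - 2(\cos\theta\cos\phi - \sin\theta\sin\phi).
\end{aligned}$$
But the distance AC is independent of the coordinate system. Therefore
$$\cos(\theta + \phi) = \cos\theta\cos\phi - \sin\theta\sin\phi.$$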
Once we have the addition formula for the cosine, it is easy to get similar formulas
for the other trigonometric functions.
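The argument for the cosine rests on computing the same squared distance $AC^2$ in two coordinate systems. The following Python sketch checks numerically that the two computations agree. It is an illustration only: the function names are ours, and the coordinates used in the rotated system (A at angle $-\theta$, C at angle $\phi$) are our reading of the construction.

```python
import math

def dist_sq(p, q):
    """Square of the distance between the points p and q."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def ac_squared(theta, phi):
    """Return AC^2 computed in the original and in the rotated axes."""
    # Original axes: A = (1, 0) and C = (cos(theta+phi), sin(theta+phi)).
    original = dist_sq((1.0, 0.0),
                       (math.cos(theta + phi), math.sin(theta + phi)))
    # Rotated axes (assumed: x'-axis along the ray from O through P_theta),
    # so A sits at angle -theta and C sits at angle phi.
    rotated = dist_sq((math.cos(theta), -math.sin(theta)),
                      (math.cos(phi), math.sin(phi)))
    return original, rotated

for theta, phi in [(0.3, 1.1), (2.0, -0.7), (-1.5, 4.0)]:
    original, rotated = ac_squared(theta, phi)
    print(theta, phi, original, rotated)
```

Equating the two values and cancelling gives the addition formula for the cosine.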
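Proof. By Theorem 4,
$$\cos\left(\frac{\pi}{2} - \theta\right) = \cos\frac{\pi}{2}\cos\theta + \sin\frac{\pi}{2}\sin\theta = \sin\theta.$$
Applying this with $\frac{\pi}{2} - \theta$ in place of $\theta$, we get
$$\cos\left[\frac{\pi}{2} - \left(\frac{\pi}{2} - \theta\right)\right] = \sin\left(\frac{\pi}{2} - \theta\right).$$
Therefore
$$\sin\left(\frac{\pi}{2} - \theta\right) = \cos\theta.$$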
(The name of the cosine is a reference to this theorem; the word cosine is from
the Latin complementi sinus, meaning sine of the complement.)
Theorem 6. For every $\theta$ and $\phi$, $\sin(\theta + \phi) = \sin\theta\cos\phi + \cos\theta\sin\phi$.
Proof.
1. $\tan(A + B) = \dfrac{\tan A + \tan B}{1 - \tan A \tan B}$
2. $\tan(A - B) =$
3. $\cot(\theta + \phi) = \dfrac{\cot\theta\cot\phi - 1}{\cot\theta + \cot\phi}$
4. $\cot(A - B) =$
5. $\sin 2\theta = 2\sin\theta\cos\theta$
6. $\cos 2\theta = 2\cos^2\theta - 1$
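9. a) $\sin\dfrac{3\pi}{2} =$   b) $\sin\left(\dfrac{3\pi}{2} + \theta\right) =$
10. a) $\cos\dfrac{3\pi}{2} =$   b) $\cos\left(\dfrac{3\pi}{2} + \theta\right) =$
11. a) $\tan\dfrac{3\pi}{2} =$   b) $\tan\left(\dfrac{3\pi}{2} + \theta\right) =$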
o (}
-
0 l + cos 20 =
12. 2 sin - cos- =
2 2
13. 2 cos2
2
- 1 14.
J 2
l + cos(} l - cos 20
�
15.
J 2
= 16.
J 2
= 17. =
(} sin(} [Hint: Let <P = (}/2, so that 8 2</i, and rewrite the formula
18. tan -
=
(} 1 - cos(}
19. tan - =
2 sin(}
*20. Show geometrically (without using any of the theory developed in this section) that the
formula in Problem 18 holds whenever $\theta$ is between 0 and $\pi$. Discuss the problem of
extending the formula from this special case to the general case.
21. Show that there is no formula which expresses $\sin(\theta/2)$ in terms of $\sin\theta$. That is,
show that $\sin(\theta/2)$ is not determined if only $\sin\theta$ is known.
22. Find a formula which expresses $|\sin(\theta/2)|$ in terms of $\cos\theta$.
23. Show that there is no formula which expresses $\sin\theta$ in terms of $\tan\theta$. That is, show that
$\tan\theta$ does not determine $\sin\theta$.
24. Show that there is no formula which expresses $\sin(\theta/2)$ in terms of $\sin\theta$ and $\cos\theta$.
25. Show that if $P_\theta$ is known, then $P_{3\theta}$ is determined. [Hint: If $P_\theta = P_\phi$, what is the relation
between $\theta$ and $\phi$? In this case, what is the relation between $3\theta$ and $3\phi$? Between $P_{3\theta}$
and $P_{3\phi}$?]
26. It is a consequence of Problem 25 that, if $\sin\theta$ and $\cos\theta$ are known, then $P_{3\theta}$ is determined,
and therefore $\sin 3\theta$ is determined. How? That is, find a formula which expresses
$\sin 3\theta$ in terms of $\sin\theta$ and $\cos\theta$.
27. Can $\cos 3\theta$ be expressed in terms of $\cos\theta$? If so, derive such a formula. If not, explain
how you know that no such formula exists.
If we try, in a straightforward way, to find the derivative of the sine, we get into
trouble. By definition,
$$f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0},$$
if the indicated limit exists. For $f(x) = \sin x$, this definition says that
$$\sin' x_0 = \lim_{x \to x_0} \frac{\sin x - \sin x_0}{x - x_0},$$
if the indicated limit exists. In fact, the limit does exist. But it is not obvious what we
ought to do to this expression
$$\frac{\sin x - \sin x_0}{x - x_0}$$
in order to find its limit. For functions f which were defined algebraically, we found
ways to cancel out $x - x_0$ in fractions of the form
$$\frac{f(x) - f(x_0)}{x - x_0}$$
using various algebraic tricks. Evidently some new device is needed for the sine.
It is as follows. Let
$$\Delta x = x - x_0.$$
The symbol $\Delta x$ is all one symbol. It is pronounced "delta x," and the Greek delta
is supposed to remind us that $\Delta x$ is the difference in x. Obviously, $x = x_0 + \Delta x$.
Similarly, let
$$\Delta f = f(x) - f(x_0).$$
[Figure: the graph of f, showing $x_0$, $x = x_0 + \Delta x$, and $\Delta f = f(x) - f(x_0)$.]
$$\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = f'(x_0)$$
and
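The point of this procedure, in finding the derivative of the sine, is that it enables
us to apply the addition formula for the sine. For $f(x) = \sin x$, we have
$$\begin{aligned}
f'(x_0) &= \lim_{\Delta x \to 0} \frac{\sin(x_0 + \Delta x) - \sin x_0}{\Delta x} \\
&= \lim_{\Delta x \to 0} \frac{\sin x_0 \cos\Delta x + \cos x_0 \sin\Delta x - \sin x_0}{\Delta x} \\
&= \lim_{\Delta x \to 0}\left[\sin x_0\,\frac{\cos\Delta x - 1}{\Delta x}\right] + \lim_{\Delta x \to 0}\left[\cos x_0\,\frac{\sin\Delta x}{\Delta x}\right].
\end{aligned}$$
Therefore, if
$$\lim_{\Delta x \to 0} \frac{\cos\Delta x - 1}{\Delta x} = 0 \tag{1}$$
and
$$\lim_{\Delta x \to 0} \frac{\sin\Delta x}{\Delta x} = 1, \tag{2}$$
then $f'(x_0) = \cos x_0$ for every $x_0$, and
$$D\sin x = \cos x.$$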
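The unknown limits (1) and (2) have curious forms. Since $\cos 0 = 1$, the first
limit has the form
$$\lim_{\Delta x \to 0} \frac{\cos(0 + \Delta x) - \cos 0}{\Delta x} = \cos' 0,$$
and, since $\sin 0 = 0$, the second limit has the form
$$\lim_{\Delta x \to 0} \frac{\sin(0 + \Delta x) - \sin 0}{\Delta x} = \sin' 0.$$
Thus we have found that if $\cos' 0 = 0$ and $\sin' 0 = 1$, then $\sin' x = \cos x$ for every x.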
To simplify the notation, in the theorems that follow, we use $\theta$ in place of $\Delta x$,
and state the theorems that we need in the following way:
Theorem 1. $\displaystyle\lim_{\theta \to 0} \frac{\sin\theta}{\theta} = 1.$
Theorem 2. $\displaystyle\lim_{\theta \to 0} \frac{\cos\theta - 1}{\theta} = 0.$
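Theorem 1 is the hard part; given that Theorem 1 holds, Theorem 2 follows from
it. To see this, we first observe that
$$\frac{\cos\theta - 1}{\theta} = \frac{\cos^2\theta - 1}{\theta(\cos\theta + 1)} = \frac{-\sin^2\theta}{\theta(\cos\theta + 1)}.$$
Therefore
$$\lim_{\theta \to 0} \frac{\cos\theta - 1}{\theta} = \left[-\lim_{\theta \to 0} \frac{\sin\theta}{\theta}\right]\left[\lim_{\theta \to 0} \sin\theta\right]\left[\lim_{\theta \to 0} \frac{1}{\cos\theta + 1}\right],$$
and so
$$\lim_{\theta \to 0} \frac{\cos\theta - 1}{\theta} = -1 \cdot 0 \cdot \frac{1}{2} = 0.$$
(Query: How do we know that $\lim_{\theta \to 0} \sin\theta = 0$, and that $\lim_{\theta \to 0} \cos\theta = 1$?)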
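A short numerical table makes Theorems 1 and 2 plausible, and also checks the factorization used to get Theorem 2 from Theorem 1. This is a sketch for illustration only; the function names are ours, and a table of values is of course not a proof.

```python
import math

def sine_ratio(theta):
    """(sin theta) / theta, the quotient in Theorem 1."""
    return math.sin(theta) / theta

def cosine_ratio(theta):
    """(cos theta - 1) / theta, the quotient in Theorem 2."""
    return (math.cos(theta) - 1.0) / theta

def cosine_ratio_factored(theta):
    """The same quotient written as -(sin t / t) * sin t * 1/(cos t + 1)."""
    return -sine_ratio(theta) * math.sin(theta) / (math.cos(theta) + 1.0)

for theta in (0.1, 0.01, 0.001, -0.001):
    print(theta, sine_ratio(theta), cosine_ratio(theta),
          cosine_ratio_factored(theta))
```

The first column of quotients approaches 1 and the other two approach 0 as $\theta$ shrinks.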
$$\sin\theta \leq \theta \leq \tan\theta.$$
4.3 The Derivatives of the Trigonometric Functions 143
$$RP \leq \theta \leq QS.$$
segments of equal length $a_1 = a_2 = \cdots = a_n$. Thus the length of the broken line is
$$A_n = a_1 + a_2 + \cdots + a_n.$$
(In the figure, n = 3.) We extend the radii of the circle until they intersect the vertical
line through Q; and for each segment of our broken line we let $b_i$ be the length of the
corresponding segment on the vertical line through Q.
y s
and that
for each i.
Therefore
$$RP < A_n < QS,$$
and so
$$\sin\theta < A_n < \tan\theta.$$
As $n \to \infty$, $A_n \to \theta$. In fact, this is the definition of the length of a circular arc.
Therefore
$$\sin\theta \leq \theta \leq \tan\theta.$$
In fact, the limit is 1, which is $\geq 1$, but not $> 1$. Hence the overcautious weak
inequalities that we have written above. The strong inequalities $\sin\theta < \theta < \tan\theta$
always hold for $0 < \theta < \pi/2$, but we are not stopping to prove it.) Therefore
$$1 \leq \frac{\theta}{\sin\theta} \leq \frac{1}{\cos\theta}.$$
As $\theta \to 0$, $\cos\theta \to \cos 0 = 1$. (You proved this in Problem 30(b) of Section 4.1.)
Therefore $1/\cos\theta \to 1$, because the limit of the reciprocal is the reciprocal of the limit.
Thus the picture must look something like the figure below.
y= cos e
e
----y = sine
e
$$\lim_{\theta \to 0} \frac{\theta}{\sin\theta} = 1.$$
Therefore
$$\lim_{\theta \to 0} \frac{\sin\theta}{\theta} = 1,$$
because the limit of the reciprocal is the reciprocal of the limit. As we have seen,
this means that:
Theorem 3.
D sin x =cos x.
Once we know how to find D sin x, the derivatives of the other trigonometric
functions are easy.
Theorem 4.
D cos x = -sin x.
D tan x =sec2 x,
D cot x = -csc2 x,
D sec x = sec x tan x,
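The differentiation formulas listed above can be compared with difference quotients. The sketch below uses a symmetric difference quotient, which is our own choice and not the definition used in the text; the helper names `D`, `sec`, `csc`, `cot` and the dictionary `formulas` are likewise ours.

```python
import math

def D(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def sec(x):
    return 1.0 / math.cos(x)

def csc(x):
    return 1.0 / math.sin(x)

def cot(x):
    return math.cos(x) / math.sin(x)

# Each entry pairs a function with the derivative claimed for it.
formulas = {
    "sin": (math.sin, math.cos),
    "cos": (math.cos, lambda x: -math.sin(x)),
    "tan": (math.tan, lambda x: sec(x) ** 2),
    "cot": (cot, lambda x: -csc(x) ** 2),
    "sec": (sec, lambda x: sec(x) * math.tan(x)),
}

x = 0.7
for name, (f, fprime) in formulas.items():
    print(name, D(f, x), fprime(x))
```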
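Theorem 5 (The squeeze principle). Let f and g be functions defined at every point
of the interval I, except perhaps at the point $x_0$. If
$$\lim_{x \to x_0} f(x) = L,$$
and g(x) is between f(x) and L for every $x \neq x_0$ in I, then
$$\lim_{x \to x_0} g(x) = L.$$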
[Figure: two illustrations of the squeeze principle, showing the graphs of f and g near $(x_0, L)$.]
Two illustrations of the theorem are shown above. The theorem is geometrically
clear, and is also easy to prove. The point is that since g(x) is between f(x) and L,
any box for f at $(x_0, L)$ is automatically a box for g at $(x_0, L)$. Since f has an $\varepsilon\delta$-box
at $(x_0, L)$, for every $\varepsilon > 0$, it follows that g does also. Therefore
$$\lim_{x \to x_0} g(x) = L,$$
by definition of a limit.
The same idea also works when two functions approach the same limit, and a
third function lies between them.
If
$$g(x) \leq h(x) \leq f(x),$$
and
$$\lim_{x \to x_0} f(x) = \lim_{x \to x_0} g(x) = L,$$
then
$$\lim_{x \to x_0} h(x) = L.$$
All of these ideas are very closely related, and we shall refer to all of them as the
squeeze principle.
5. $D\sqrt{1 - \sin^2 x}$ [Warning: It is very easy to get a wrong answer to this one.]
6. $D\sqrt{1 - \cos^2 x}$ [Same warning.]
7. $D\cos^2 x$
8. $D(\cos^2\theta + \sin^2\theta)$
9. $D\,2\sin x\cos x$
10. $D\sqrt{1 + \tan^2\theta}$
11. $D(\csc^2\theta - \cot^2\theta)$
12. $D\,\dfrac{\sin x}{1 + \cos x}$
13. $D\,\dfrac{\cos x}{1 + \sin x}$
14. $D(x^2\sin x)$
Show that the following differentiation formulas are correct:
15. $D\sin 2x = (\cos 2x)\cdot 2$
16. $D\cos 2x = (-\sin 2x)\cdot 2$
17. $D\tan 2x = 2\sec^2 2x$
18. $D\sin(-x) = [\cos(-x)](-1)$
19. $D\cos(-x) = [-\sin(-x)](-1)$
20. $D\csc 2x = -2\csc 2x\cot 2x$
21. $D\tan(-x) = [\sec^2(-x)](-1)$
22. $D\sin 3x = (\cos 3x)\cdot 3$
23. $D\cos 3x = (-\sin 3x)\cdot 3$
24. $D\tan 3x = 3\sec^2 3x$
*25. Make a plausible guess for $D\sin ax$, and verify it if you can.
What sort of function is F? From what you learn about F, what can you conclude
about $f_1$, $f_2$, $g_1$, and $g_2$?]
The answer to this problem has a rather curious significance: it means that all
properties of the sine and cosine are contained, implicitly, in conditions (a) through (d).
That is, the sine and cosine are completely described by the conditions
$$\sin' = \cos, \quad \cos' = -\sin, \quad \sin 0 = 0, \quad \cos 0 = 1.$$
We recall, from the preceding section, the apparatus which we set up in order to
calculate the derivative of the sine. Given a function
$$f\colon I \to \mathbb{R},$$
where I is an interval, and a fixed point $x_0$ of I. For each point x of I, we let $\Delta x = x - x_0$.
We let
$$\Delta f = f(x) - f(x_0) = f(x_0 + \Delta x) - f(x_0).$$
In the old notation,
$$f'(x_0) = \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x}.$$
Thus
$$\frac{\Delta f}{\Delta x} \approx f'(x_0) \quad\text{when } \Delta x \approx 0,$$
where $\approx$ stands for the phrase "is approximately equal to." This ought to mean that
$$\Delta f \approx f'(x_0)\,\Delta x \quad\text{when } \Delta x \approx 0.$$
[Figure: the graph of f and the tangent line T at the point $(x_0, f(x_0))$.]
In the figure, the line T is the tangent to the graph of f at the point $(x_0, f(x_0))$. Thus
the slope of T is $f'(x_0)$. If S = (x, y), then
$$\frac{y - f(x_0)}{x - x_0} = f'(x_0),$$
because the slope of the segment from P to S is the slope of the line T. This gives
$$y - f(x_0) = f'(x_0)\,\Delta x.$$
This quantity is called the differential of f at $x_0$, and is denoted by df. (See the label
in the figure.) To repeat:
$$df = f'(x_0)\,\Delta x, \qquad \Delta f \approx df \quad\text{when } \Delta x \approx 0.$$
Let us try this on some numerical examples, and see how good the approximation
looks.
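Example 1. Let
$$f(x) = \sqrt{x}, \qquad x_0 = 25, \qquad \Delta x = 0.4.$$
[Figure: the graph of $y = \sqrt{x}$ for $0 \leq x \leq 26$, with $x_0 = 25$ marked.]
Then
$$f(x_0) = \sqrt{25} = 5, \qquad f'(x) = \frac{1}{2\sqrt{x}} \quad (x > 0), \qquad f'(x_0) = \frac{1}{10},$$
and
$$df = \frac{1}{10}\,\Delta x = \frac{1}{10}(0.4) = 0.04.$$
The approximation formula
$$df \approx \Delta f$$
suggests that $\sqrt{25.4} \approx 5.04$. In fact, to six decimal places,
$$\sqrt{25.4} = 5.039841.$$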
Thus the error in our approximation is 0.000159, which is not bad. Note also that the
approximation $\Delta f \approx df$ wasn't supposed to be good except when $\Delta x$ is small; and
$\Delta x = 0.4$ is not very small. Using $\Delta x = 0.1$, we get
$$df = \frac{1}{2\sqrt{25}}(0.1) = 0.01;$$
and using $\Delta x = 0.01$, we get
$$df = \frac{1}{10}(0.01) = 0.001;$$
$$\Delta f \approx df = f'(x_0)\,\Delta x$$
should be as good as it is. The reason is as follows. We know that
when you multiply two numbers each of which is small, the product is even smaller.
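The numbers in Example 1 are easy to reproduce, and doing so for several values of $\Delta x$ shows how quickly the error $df - \Delta f$ shrinks. The following Python sketch is an illustration only; the names `differential` and `error` are ours.

```python
import math

def differential(fprime, x0, dx):
    """df = f'(x0) * dx."""
    return fprime(x0) * dx

f = math.sqrt
fprime = lambda x: 1.0 / (2.0 * math.sqrt(x))
x0 = 25.0

def error(dx):
    """df minus the true change delta_f = f(x0 + dx) - f(x0)."""
    delta_f = f(x0 + dx) - f(x0)
    return differential(fprime, x0, dx) - delta_f

for dx in (0.4, 0.1, 0.01):
    print(dx, differential(fprime, x0, dx), f(x0 + dx) - f(x0), error(dx))
```

When $\Delta x$ is divided by 4, the error is divided by roughly 16, which is the sense in which the approximation is better than one might expect.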
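We shall now express these ideas in a more exact form. For each $\Delta x \neq 0$, let
$$E(\Delta x) = \frac{\Delta f}{\Delta x} - f'(x_0).$$
Then
$$\lim_{\Delta x \to 0} E(\Delta x) = 0,$$
because
$$\lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x} = f'(x_0).$$
Thus the graph of the function E looks like the figure on the left below.
[Figure: left, the graph of $y = E(\Delta x)$ for $\Delta x \neq 0$; right, the same graph with the origin added.]
To this graph we add the origin. That is, we define
$$E(0) = 0.$$
The graph of the extended function E is shown on the right above. We now have
$$\Delta f = f'(x_0)\,\Delta x + E(\Delta x)\,\Delta x.$$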
The reason is that if the open interval $(x_0 - a, x_0 + a)$ lies in the domain of f,
then the open interval $(-a, a)$ lies in the domain of E. An open interval containing
a given point will be called a neighborhood of the given point. In this language, we
can sum up the above discussion in the following theorem.
Following is a partial table of the sine and cosine functions, for ready reference in solving
some of the following problems:
x sin x cos x
1. Find sin 0.1251 approximately (sin 0.1251 = 0.1248, correct to four decimal places).
Explain how this formula is related to the ideas in this section of the text. [Hint:
Consider the general approximation formula
Interpret this in terms of the theory that we have been developing, and justify it. [Hint:
Surely the given formula is equivalent to
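$$\sqrt{1 + x} \approx 1 + \frac{x}{2} \quad\text{when } x \approx 0.$$
$$\frac{1}{\sqrt[3]{1 + x}} \approx 1 - \frac{x}{3} \quad\text{when } x \approx 0.$$
$$\frac{1}{1 + x} \approx 1 - x \quad\text{when } x \approx 0.$$
Is this a "doubly good" approximation in the same sense in which $\Delta f \approx df$ is "doubly
good"? Why or why not?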
$$\phi(x) = \sqrt{x^2 + 1},$$
we let
$$g(x) = x^2 + 1.$$
We then get $\phi'$ in the form
$$\phi'(x) = \frac{2x}{2\sqrt{x^2 + 1}} = \frac{x}{\sqrt{x^2 + 1}}.$$
4.5 Composition of Functions 155
The idea that we have been using is that of composition of functions. In the
first case, the action of $\phi$ is described by
$$\phi\colon x \mapsto (x^2 + 3x + 5)^5.$$
This can be done in two steps:
$$x \mapsto x^2 + 3x + 5 \mapsto (x^2 + 3x + 5)^5.$$
The first of these steps represents the action of the function
$$g\colon x \mapsto x^2 + 3x + 5.$$
The second step raises things to the fifth power. It can thus be described by the
function
$$f\colon u \mapsto u^5.$$
In this situation, g is called the inside function; it represents the first step. The function
f is called the outside function; it represents the second step. And $\phi$ is called the
composition of f and g. The reason for the use of the terms inside and outside is that
we can write
$$\phi(x) = f(g(x)).$$
To get $\phi(x)$, we substitute g(x) in the formula for f.
Diagrammatically:
$$x \overset{g}{\longmapsto} x^2 + 3x + 5 \overset{f}{\longmapsto} (x^2 + 3x + 5)^5.$$
Our second example fits the same pattern. We have
$$g(x) = x^2 + 1, \qquad f(u) = \sqrt{u}, \qquad \phi(x) = f(g(x)) = \sqrt{x^2 + 1}.$$
Diagrammatically:
$$x \overset{g}{\longmapsto} x^2 + 1 \overset{f}{\longmapsto} \sqrt{x^2 + 1}.$$
Algebraically, to get the values of the composite function $\phi = f(g)$, we substitute
g(x) for u in the formula for f(u). This is why we described the "square-root function"
f by the formula
$$f(u) = \sqrt{u}$$
instead of the equally logical formula $f(x) = \sqrt{x}$. We want to form the composite
function by setting
$$u = g(x) = x^2 + 1,$$
and it would hardly make sense to set (?) $x = g(x) = x^2 + 1$ (?).
We sum all this up in the following definition:
Definition. Given functions
$$g\colon A \to B, \qquad f\colon B \to C,$$
the composition
$$f(g)\colon A \to C$$
is the function whose values are given by the formula
$$f(g)(x) = f(g(x)).$$
Here, for each x, $f(g)(x)$ denotes the value of the function $f(g)$ at the point x.
Diagrammatically:
$$A \overset{g}{\longrightarrow} B \overset{f}{\longrightarrow} C.$$
Let us consider some more examples.
Example 1. Let
$$f(u) = \sin u, \qquad g(x) = 2x + 1.$$
Then
$$f(g(x)) = \sin g(x) = \sin(2x + 1).$$
(In this example, what is A? What are B and C?)
Example 2. Let
$$f(u) = u^2 + u + 1, \qquad g(x) = \sqrt{x}.$$
Then
$$f(g(x)) = (\sqrt{x})^2 + \sqrt{x} + 1 = x + \sqrt{x} + 1.$$
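Composition is easy to express in code, since a program can build the function $f(g)$ out of f and g directly. The sketch below reproduces Examples 1 and 2; the helper name `compose` and the variable names are ours, chosen only for this illustration.

```python
import math

def compose(f, g):
    """Return the composite function f(g), whose value at x is f(g(x))."""
    return lambda x: f(g(x))

# Example 1: f(u) = sin u, g(x) = 2x + 1.
example1 = compose(math.sin, lambda x: 2 * x + 1)

# Example 2: f(u) = u^2 + u + 1, g(x) = sqrt(x).
example2 = compose(lambda u: u**2 + u + 1, math.sqrt)

print(example1(0.5))  # the same number as math.sin(2.0)
print(example2(4.0))  # 4 + 2 + 1

# The order matters: g(f) is in general a different function from f(g).
reversed_example1 = compose(lambda x: 2 * x + 1, math.sin)
print(reversed_example1(0.5))
```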
composition of two other functions, each of which is simpler than $\phi$. For example,
to investigate the function
$$\phi(x) = \int_0^{x^2} (t^4 - 1)\,dt,$$
we write
$$\phi(x) = \int_0^{g(x)} (t^4 - 1)\,dt,$$
where
$$g(x) = x^2.$$
Thus
$$\phi = f(g),$$
where
$$f(u) = \int_0^{u} (t^4 - 1)\,dt.$$
$$g(f(u)) = \sin u^3, \qquad g'(f(u)) = \cos u^3, \qquad g'(f(u))\,f'(u) = (\cos u^3)\,3u^2,$$
which will turn out to be the derivative of $\sin u^3$. (Here $\cos u^3$ is the cosine of $u^3$,
not the cube of $\cos u$.) Let us try one more example:
For each of the functions $\phi$, given in the problems below, find formulas for functions
f and g, such that $\phi = f(g)$. Then get formulas for $f'$, $g'$, $f'(g)$, and $\phi'$.
can be expressed without the use of integral signs; f can be calculated as a polynomial.)
13. Find $\displaystyle\lim_{x \to x_0} \frac{\sin x^3 - \sin x_0^3}{x - x_0}$.
14. Given $\phi(x) = \sin x^2$, proceed as in Problems 1 through 11.
15. Do the same, for $\phi(x) = \sin x^3$.
16. Do the same, for <f>(x) = sin ·<X:·
4.6 The Chain Rule 159
17. Given
$$f(u) = \int_0^u \sqrt{1 + t^2}\,dt, \qquad g(x) = \sin x.$$
On the basis of the theory that you know so far, you are in no position to calculate
$$D[f(g)] = \ ?$$
On the other hand, you ought by this time to be able to make a good guess about
$D[f(g)]$, and then use your guess to write some kind of formula for
$$\phi'(x) = D\int_0^{\sin x} \sqrt{1 + t^2}\,dt.$$
[Hint: As a start, what is $f'(u)$?]
18. Find $\displaystyle\lim_{x \to \pi/2} \frac{\sin x - 1}{x - \pi/2}$.
[Hint: If you can figure out what the geometric meaning of this limit is, it will then be
easy to find its numerical value.]
19. Find $\displaystyle\lim_{x \to \pi} \frac{\cos x + 1}{x - \pi}$. [Same hint.]
20. Find $\displaystyle\lim_{x \to \pi/4} \frac{\tan x - 1}{x - \pi/4}$.
21. Find $\displaystyle\lim_{x \to \pi/4} \frac{\sin 2x - 1}{x - \pi/4}$.
22. Find $\displaystyle\lim_{x \to 0} \frac{\sec x - 1}{x}$.
You may have observed, in the preceding problem set, that the formula
$$Df(g) = f'(g)\,g'$$
held in a number of cases. For example, if
$$f(u) = u^n,$$
then
$$f'(u) = nu^{n-1};$$
and
$$Df(g) = Dg^n = ng^{n-1}g' = f'(g)\,g'.$$
Similarly, if
$$f(u) = \sqrt{u},$$
then
$$Df(g) = D\sqrt{g} = \frac{1}{2\sqrt{g}} \cdot g' = f'(g)\,g'.$$
Example 2. Consider
$$\phi(x) = \sin(k + x).$$
By the chain rule,
$$D\sin(k + x) = [\cos(k + x)]\,D(k + x) = \cos(k + x).$$
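Example 3. Consider
$$\phi(x) = \int_1^{kx} \frac{1}{t}\,dt \quad (k, x > 0).$$
Here
$$\phi(x) = f(g(x)), \qquad f(u) = \int_1^{u} \frac{1}{t}\,dt \quad (u > 0), \qquad g(x) = kx.$$
Therefore
$$\phi'(x) = f'(g(x))\,g'(x) = \frac{1}{kx} \cdot k = \frac{1}{x}.$$
This is a curious result:
$$D\int_1^{kx} \frac{1}{t}\,dt = \frac{1}{x} = D\int_1^{x} \frac{1}{t}\,dt.$$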
Example 4. The chain rule can be applied several times in the same problem. For
example, we know that
$$D\sin g = (\cos g)\,g',$$
whatever g may be. We can then apply the formula in cases where $g'$ itself needs to
be calculated by the chain rule:
$$D\sin\sin x = (\cos\sin x)\cos x.$$
Here $\sin\sin x$ is the sine of the sine of x, which is different from $\sin^2 x$.
Therefore
$$\begin{aligned}
D\sin\sin\sin x &= (\cos\sin\sin x)\,D\sin\sin x \\
&= (\cos\sin\sin x)(\cos\sin x)\cos x.
\end{aligned}$$
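The result of Example 4 can be compared with a difference quotient. The sketch below is only a numerical check, not a proof; the symmetric difference quotient `D` and the function names are our own choices.

```python
import math

def D(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def sin_sin_sin(x):
    return math.sin(math.sin(math.sin(x)))

def sin_sin_sin_prime(x):
    """The chain rule applied twice: one cosine factor for each layer."""
    return (math.cos(math.sin(math.sin(x)))
            * math.cos(math.sin(x))
            * math.cos(x))

for x in (0.0, 0.8, 2.5):
    print(x, D(sin_sin_sin, x), sin_sin_sin_prime(x))
```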
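Example 5. Similarly,
$$\begin{aligned}
D\{[(x^3 + 1)^2 + 1]^2 + 1\}^3 &= 3\{[(x^3 + 1)^2 + 1]^2 + 1\}^2\, D\{[(x^3 + 1)^2 + 1]^2 + 1\} \\
&= 3\{\ \}^2 \cdot 2[(x^3 + 1)^2 + 1]\, D[(x^3 + 1)^2 + 1] \\
&= 3\{\ \}^2 \cdot 2[\ ] \cdot 2(x^3 + 1)\, D(x^3 + 1) \\
&= 3\{\ \}^2 \cdot 2[\ ] \cdot 2(\ ) \cdot 3x^2.
\end{aligned}$$
Here we have left braces, brackets, and parentheses empty, in the intermediate
stages, to make the steps easier to follow. The final answer is
$$36x^2(x^3 + 1)[(x^3 + 1)^2 + 1]\{[(x^3 + 1)^2 + 1]^2 + 1\}^2.$$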
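Given
$$\phi(x) = f(g(x)),$$
we want to show that for each $x_0$ we have
$$\phi'(x_0) = f'(g(x_0))\,g'(x_0).$$
Obviously, we must assume that g is differentiable at $x_0$ and that f is differentiable
at $u_0 = g(x_0)$. By definition,
$$\begin{aligned}
\phi'(x_0) &= \lim_{x \to x_0} \frac{\phi(x) - \phi(x_0)}{x - x_0} \\
&= \lim_{\Delta x \to 0} \frac{\phi(x_0 + \Delta x) - \phi(x_0)}{\Delta x} \\
&= \lim_{\Delta x \to 0} \frac{f(g(x_0 + \Delta x)) - f(g(x_0))}{\Delta x}.
\end{aligned}$$
Let
$$\Delta u = g(x_0 + \Delta x) - g(x_0) = g(x_0 + \Delta x) - u_0,$$
so that
$$g(x_0 + \Delta x) = u_0 + \Delta u.$$
Then
$$\phi'(x_0) = \lim_{\Delta x \to 0} \frac{f(u_0 + \Delta u) - f(u_0)}{\Delta x}.$$
Here the numerator is a difference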
Therefore
$$\lim_{\Delta x \to 0} E(\Delta u) = 0.$$
This gives
$$\phi'(x_0) = \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x} = f'(g(x_0))\,g'(x_0) + 0 \cdot g'(x_0).$$
We therefore have:
Theorem 1. Let f and g be functions. Then
$$Df(g) = f'(g)\,g'.$$
That is, the formula holds at every point x0 such that (a) g is differentiable at
x0 and (b) f is differentiable at g(x0). These conditions illustrate the normal pattern
of theorems involving differentiation formulas: the equation holds whenever the
quantities mentioned in the right-hand member exist.
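In this problem set, your main job is to learn to use the chain rule. In each odd-numbered
problem, from 1 to 19, you should indicate the logic of your work by writing formulas for
$f$, $g$, $f'$, $f'(g)$, and $g'$, before writing the answer in the form $D[f(g)] = f'(g)\,g'$. For example,
given the function
$$\phi(x) = \sin(x^2 + 1),$$
your solution should be written in the form
$$f(u) = \sin u, \quad g(x) = x^2 + 1, \quad f'(u) = \cos u, \quad f'(g(x)) = \cos(x^2 + 1), \quad g'(x) = 2x,$$
$$\phi'(x) = f'(g(x))\,g'(x) = 2x\cos(x^2 + 1).$$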
1. 2. 3. 4. 5. 1) 6.
-1
sinx2 sin2 x cosx3 cos3x tan(t2 + tan t2 +
7. 8. 9. 10.
2
x
sin(x3 + x) sinx3 +x cos ,i:x ,1 cosx I I. tan --
16. a) tan2 x b) tanx2 17. cos4x - sin4x 18. cos 2x 19. cos2 x - sin2 x
24. sec vx2 + 1 25. cos vx2 + 1 26. a) '\!tanx b) tan ,;:;.:
cos:c
cos cosx sin cosx sin sin sin tan sinx
37.
f,k:J; -1 dt f,x 1 f,k - dt.
Let k be any positive number; and for each positive number x, let
- - dt -
t t
1
¢(x) =
1 f l 1
Find the simplest possible formula for ¢'(x). Then do the same, for the functions ¢(x)
f,x' - dt J,"'' 1- dt - 2 Jx 1- dt x3 1
defined by the following formulas.
V;; 1 1 "' 1
t
41. J t 43. J - dt - - - dt
t t 2J t
sin 1
1 1
f 1 1
- dt (0 < <
1
44. J
t
.t
X TT )
=f,x �dt.
t
f(x)
1
[Hint:
$$f(ab) = f(a) + f(b).$$
When we try to attack this problem by the methods of calculus, the obvious
46. Let $\phi(x) = f(x^n)$, where x > 0 and f is as in the preceding problem. Find $\phi'(x)$.
*47. Given
$$D\sin = \cos, \quad D\cos = -\sin, \quad \sin 0 = 0, \quad \cos 0 = 1,$$
and given no other information whatever about the sine and cosine, prove that
$$\sin(k + x) = \sin k\cos x + \cos k\sin x, \qquad \cos(k + x) = \cos k\cos x - \sin k\sin x$$
for every k and x. [Hint: Let f be the function which is = 0 if the first equation holds;
let g be the function which is = 0 if the second equation holds, and investigate the
function
$$F = f^2 + g^2.]$$
This result tends to confirm a claim that was made in Problem *27 of Problem Set
4.3. The claim was that all properties of the sine and cosine are contained, implicitly,
in the properties that we have just used to prove the addition formulas. Later we shall
find further confirmation of this.
*48. Let f be a function, defined for every x, such that
(a) $f'' = -f$, (b) $f(0) = 0$, (c) $f'(0) = 1$.
A function f is called invertible if its graph intersects every horizontal line in at most
one point. Thus $f(x) = x^3$ is invertible, but $f(x) = x^2$ is not.
[Figure: the graphs of $y = f(x) = x^3$ and $y = f(x) = x^2$.]
If f is invertible, then for each number y in the image of f there is exactly one number
x in the domain of f such that $f(x) = y$.
Thus to every invertible function f there corresponds a new function $f^{-1}$, called
the inverse of f. (This is pronounced f inverse. The symbol $-1$ is not an exponent,
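$$f^{-1}(x) = y \quad\text{if}\quad f(y) = x.$$
For $f(x) = x^3$, the equation $f(y) = x$ says that $y^3 = x$, so that
$$y = \sqrt[3]{x}.$$
Thus
$$f^{-1}(x) = \sqrt[3]{x},$$
as we would expect: the inverse of cubing is the extraction of cube roots.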
4.7 Invertible Functions. The Inverse Trigonometric Functions 167
Theorem 1. If f is invertible, then
$$f(f^{-1}(x)) = x,$$
for every x.
Proof. For each x, let $y = f^{-1}(x)$. Then $f(y) = x$, by definition of $f^{-1}$. Therefore,
$$f(f^{-1}(x)) = f(y) = x.$$
We can use this idea to calculate the derivatives of inverse functions, assuming
that the inverse function has a derivative.
Example 1. The function $f(x) = x^3$ is invertible, and its inverse is $f^{-1}(x) = \sqrt[3]{x}$.
Thus
$$(\sqrt[3]{x})^3 = x.$$
We take the derivative on each side of this equation, using the chain rule for the
composite function on the left. This gives:
$$3(\sqrt[3]{x})^2\, D\sqrt[3]{x} = 1,$$
and so
$$D\sqrt[3]{x} = \frac{1}{3(\sqrt[3]{x})^2} \quad (x \neq 0).$$
You may have calculated this by another method, in Problem Set 3.6, but the present
method is easier.
The same steps apply to the q-th root:
1) $(\sqrt[q]{x})^q = x$,
2) $q(\sqrt[q]{x})^{q-1}\, D\sqrt[q]{x} = 1$,
3) $D\sqrt[q]{x} = \dfrac{1}{q\sqrt[q]{x^{q-1}}}$ $(x > 0)$.
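Formula 3) for the derivative of the q-th root can be checked against a difference quotient. The Python sketch below is an illustration only; the names `root_derivative` and `D` are ours, and the symmetric difference quotient is our choice of numerical check.

```python
import math

def D(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def root_derivative(x, q):
    """Formula 3): D x^(1/q) = 1 / (q * (x^(1/q))^(q-1)), for x > 0."""
    r = x ** (1.0 / q)
    return 1.0 / (q * r ** (q - 1))

# Cube root at x = 8: the formula gives 1 / (3 * 2^2) = 1/12.
print(root_derivative(8.0, 3), 1 / 12)

# Fifth root at x = 2, compared with a difference quotient.
print(root_derivative(2.0, 5), D(lambda x: x ** 0.2, 2.0))
```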
When we use this method, the equations that we write have the following general
form:
1) $f(f^{-1}(x)) = x$,
2) $f'(f^{-1}(x))\, Df^{-1}(x) = 1$,
3) $Df^{-1}(x) = \dfrac{1}{f'(f^{-1}(x))}$ $\quad (f'(f^{-1}(x)) \neq 0)$.
(You should check this against the preceding examples.) The method assumes that
our problem has an answer, that is, that $f^{-1}$ has a derivative. Thus we need to show
that this holds, in every case in which the fraction at the last stage has a meaning.
This is easy to see. Consider f, $f^{-1}$, as in the figure below, with
$$y_1 = f^{-1}(x_1), \qquad x_1 = f(y_1),$$
as the labels indicate.
If f has a tangent line L, at $(y_1, x_1)$, then $f^{-1}$ has a tangent line L' at $(x_1, y_1)$: to get
this, we reflect both the graph and the tangent line across the line y = x. The slope
of L is
$$m = f'(y_1) = f'(f^{-1}(x_1)).$$
If $m \neq 0$, then L is not horizontal. Therefore L' is not vertical, and $f^{-1}$ has a derivative
at $x_1$. Thus we have completed the proof of the following theorem.
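Theorem 2. If f is invertible, then
$$Df^{-1}(x) = \frac{1}{f'(f^{-1}(x))}$$
at every point x where $f'(f^{-1}(x))$ exists and is different from 0.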
every value that a trigonometric function takes on at all is taken on for infinitely
many values of x. For example, the graph of $f(x) = \sin x$ looks something like this:
[Figure: the graph of $y = f(x) = \sin x$.]
If we restrict x to the interval $[-\pi/2, \pi/2]$, then we get a new function whose graph
includes some, but not all, of the original graph. This new function is denoted by
Sin, and the graph of y = Sin x looks like the left-hand figure below.
[Figure: left, the graph of Sin; right, a semicircle illustrating the definition of the sine.]
The graph looks as if Sin ought to be invertible; and in fact this is not hard to see.
In the right-hand figure above, we have switched the notation to fit the definition of
the sine, so that $y = \sin\theta$. Every point of the semicircle corresponds to exactly one
$\theta$ on the interval $[-\pi/2, \pi/2]$; and every horizontal line intersects the semicircle in
exactly one point.
As always for inverse functions, we get the graph of $\operatorname{Sin}^{-1}$ by reflecting the graph
of Sin across the line y = x. Therefore the graph of $\operatorname{Sin}^{-1}$ looks like this:
[Figure: the graph of $\operatorname{Sin}^{-1}$.]
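Similarly, we define Cos x to be equal to cos x, on the interval $[0, \pi]$, and we
show that Cos is invertible. The graphs of Cos and $\operatorname{Cos}^{-1}$ look like this:
[Figure: the graphs of Cos and $\operatorname{Cos}^{-1}$.]
We have
$$\sin \operatorname{Sin}^{-1} x = x,$$
and so, by the method described above,
$$D\operatorname{Sin}^{-1} x = \frac{1}{\cos \operatorname{Sin}^{-1} x}.$$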
We want to simplify the expression $\cos \mathrm{Sin}^{-1} x$ on the right, and, while we are at it,
we shall get a formula for $\sin \mathrm{Cos}^{-1} x$. Since
$$\cos^2 u + \sin^2 u = 1,$$
we can now solve, getting
$$\cos u = \pm\sqrt{1 - \sin^2 u}, \qquad (1)$$
$$\sin u = \pm\sqrt{1 - \cos^2 u}. \qquad (2)$$
For
$$u = \mathrm{Sin}^{-1} x,$$
this gives
$$\sin u = \sin \mathrm{Sin}^{-1} x = x,$$
and so from (1) we get
$$\cos \mathrm{Sin}^{-1} x = \pm\sqrt{1 - x^2}. \qquad (3)$$
Similarly, for
$$u = \mathrm{Cos}^{-1} x$$
we have
$$\cos u = \cos \mathrm{Cos}^{-1} x = x,$$
and so from (2) we get
$$\sin \mathrm{Cos}^{-1} x = \pm\sqrt{1 - x^2}. \qquad (4)$$
Formulas (3) and (4) are correct, but they are not good enough for our purposes.
In fact, the double signs can be omitted, and the formulas still hold:
4.7 Invertible Functions. The Inverse Trigonometric Functions 171
Theorem 3.
On this interval, the cosine is ≥ 0. Therefore, in (3), it must be the plus sign that
applies. Similarly,
$$0 \le \mathrm{Cos}^{-1} x \le \pi.$$
On this interval, the sine is ≥ 0. Therefore, in (4), it must be the plus sign that
applies.
We now substitute $\sqrt{1 - x^2}$ for $\cos \mathrm{Sin}^{-1} x$, in the formula that we got for
$D\,\mathrm{Sin}^{-1} x$. This gives:
Theorem 4. $D\,\mathrm{Sin}^{-1} x = 1/\sqrt{1 - x^2}$ $\quad (-1 < x < 1)$.
Note that $D\,\mathrm{Sin}^{-1} x$ is always > 0, just as the graph suggests that it ought to be.
At the endpoints of the graph, the tangents are vertical.
The proof of the following theorem is like that of the preceding one:
To get an invertible function Tan, we take the portion of the graph that lies between
x = -π/2 and x = π/2. We could verify by brute force that Tan is invertible, but
it is easier to prove first the following theorem:
The proof is based on the mean-value theorem. If f is not invertible, then the
graph intersects some horizontal line in more than one point. Thus
$$f(a) = f(b),$$
for some a and b in I. Therefore the graph has a horizontal chord. By MVT, this
means that the graph has a horizontal tangent; that is, f'(x) = 0 for some x, which
contradicts the hypothesis for f.
Now the domain of Tan is an open interval (-π/2, π/2). On this interval,
Tan' x = sec² x ≠ 0. Therefore Tan is invertible.
The graphs of Tan and $\mathrm{Tan}^{-1}$ look like this:
[Figure: graphs of y = Tan x and $y = \mathrm{Tan}^{-1} x$]
The derivation is easier than the preceding ones, because it turns out that there
are no double signs to be eliminated.
For the secant, the situation is trickier, and some handbooks contain formulas
that are wrong. The reason is that the graph of the secant looks like this:
[Figure: graph of y = sec x]
(Remember that sec x = 1/cos x wherever cos x ≠ 0.) This graph consists of
infinitely many connected pieces, but none of these connected pieces is the graph of an
invertible function. We therefore cannot use all of any one of the pieces. Everybody
agrees that we ought to use the part of the graph where 0 ≤ x < π/2, but there is no
general agreement on what else we ought to use. To be safe, we define Sec x only for
0 ≤ x < π/2. (See the graphs below.)
[Figure: graphs of y = Sec x and $y = \mathrm{Sec}^{-1} x$]
(Query: How do you know that the secant never takes on the same value twice,
on the interval [0, π/2)?)
On the basis of the definition of $\mathrm{Sec}^{-1}$, it is plain that the equation
$$y = \mathrm{Sec}^{-1} x$$
for every u from 0 to π/2. (Why are we justified in using capital letters on the right?)
Therefore, by the chain rule,
and
$$x\,(\mathrm{Tan}\,\mathrm{Sec}^{-1} x)\,D\,\mathrm{Sec}^{-1} x = 1. \qquad (8)$$
Therefore
$$D\,\mathrm{Sec}^{-1} x = \frac{1}{x\,\mathrm{Tan}\,\mathrm{Sec}^{-1} x}. \qquad (9)$$
We now need a formula for $\mathrm{Tan}\,\mathrm{Sec}^{-1} x$, analogous to the formulas for $\sin \mathrm{Cos}^{-1} x$
$$1 + \tan^2 u = \sec^2 u$$
for every u. Therefore
$$\tan u = \pm\sqrt{\sec^2 u - 1},$$
and so it must be the plus sign that applies on the right. Therefore
$$D\,\mathrm{Tan}^{-1} x = \frac{1}{1 + x^2}, \qquad D\,\mathrm{Sec}^{-1} x = \frac{1}{x\sqrt{x^2 - 1}}.$$
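The derivative formulas for the inverse trigonometric functions can be compared with difference quotients. This is a minimal sketch, not part of the text: it assumes that Python's `math.asin` and `math.atan` agree with $\mathrm{Sin}^{-1}$ and $\mathrm{Tan}^{-1}$, and it builds $\mathrm{Sec}^{-1}$ (for x ≥ 1, with values in [0, π/2)) from `math.acos`; the helper names are hypothetical.

```python
import math

def central_difference(g, x, h=1e-6):
    # Symmetric difference quotient approximating g'(x).
    return (g(x + h) - g(x - h)) / (2 * h)

def sec_inverse(x):
    # Assumed construction: sec y = x is the same as cos y = 1/x, for x >= 1.
    return math.acos(1.0 / x)

# (name, function, sample point, value predicted by the formula)
checks = [
    ("Sin^-1", math.asin, 0.5, 1 / math.sqrt(1 - 0.5**2)),
    ("Tan^-1", math.atan, 2.0, 1 / (1 + 2.0**2)),
    ("Sec^-1", sec_inverse, 2.0, 1 / (2.0 * math.sqrt(2.0**2 - 1))),
]

for name, func, x, predicted in checks:
    numeric = central_difference(func, x)
    print(f"D {name} at x = {x}: formula {predicted:.8f}, difference quotient {numeric:.8f}")
```

In each row the two printed values should agree to several decimal places; none of the formulas needs a double sign.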
We now have a new set of functions arising as derivatives: none of these four functions
has appeared before as a result of differentiation. This means, for one thing, that we
can use our new functions to solve certain area problems that we couldn't solve
before. Later we shall see that the process by which we find a function whose derivative
is a given function has many other applications.
You will also need to remember
for every x on the interval [-1, 1]. (A very short proof is possible. Remember the
uniqueness theorem of Section 3.8.)
'/�
- ? dt. 2dt.
I I
30. Find f -1 -
+ r-
Sketch. 31. Find
L l--
+ t
Sketch.
0
-1
1v2 r
f/2 dt. dt.
1
32. Find
0 VI - t2
33. Find f o v1 - t2
34. Try to get the right answer for the area under the graph of y = 1/(1 + x²) on the whole
interval (-∞, ∞). You need not justify your answer, so long as it is right.
35. Given
0 ≤ x ≤ 1,
find a formula for f-1(x). Then explain how your answer might have been predicted
without a calculation.
(2 1
dt.
2 1
dx.
36. Find
J 21v3 iVi2 1
37. Find
J xVx2
__
_
_
l
-
1 -
11v3 t
38. Find 1 --- v t2 + 1
dt.
0
39. In Theorem 6 we required that f'(x) be different from 0 everywhere on the interval I.
This hypothesis was satisfied by Tan on the open interval (-π/2, π/2), and so we
could conclude that Tan is invertible. But Theorem 6, as it stands, does not apply
to Sin on [-π/2, π/2] or to Sec on [0, π/2), because the derivatives of these functions
vanish at the endpoints -π/2, π/2, and 0. To take care of such cases, we need the
following:
Reread the proof of Theorem 6 and see whether it proves this more general theorem.
If so, say so and explain. If not, furnish whatever additional reasoning is necessary.
40. It might also be convenient to have the following generalized form of the uniqueness
theorem (of Section 3.8). Here we require that F'(x) = G'(x) at all interior points of
the interval I.
Theorem (?). Let F and G be differentiable functions, defined on the same interval I, and
let a be a point of I. If (i) F(a) = G(a) and (ii) F'(x) = G'(x) for every interior point
f !(x) dx,
Then F'(x) = f(x), for every x. We find another function G, such that G' = f.
Then F and G have the same derivative f; and by adding a constant to G, we get a
function, say H, such that H' = G' = f and H(a) = 0. Since F(a) = 0, we know
by the uniqueness theorem that F(x) = H(x) for every x. Therefore
$$\int_a^b f(t)\,dt = H(b).$$
The proof reproduces the procedure that we have been using all along. G is the
first G that we try, with G' = f; and H is the function that we get when we adjust the
constant.
But in many cases it is hard to find a known function which has a given function
f as its derivative. For example, if we had never heard of tan, Tan, or $\mathrm{Tan}^{-1}$, then we
would have had no chance at all of finding a known function G such that
$$G'(x) = \frac{1}{1 + x^2}.$$
Later, we shall learn more and better methods for attacking such problems. But no
method, and no system of methods, works all the time. Therefore we often need to
use numerical methods, to calculate definite integrals approximately.
One way is the following. Suppose that we didn't know anything about derivatives,
but we needed to find $\int_0^1 (1 - x^3)\,dx$ approximately. We might divide the interval
[0, 1] into 10 subintervals of length 0.1, and add the areas of the circumscribed
rectangles.
[Figure: circumscribed rectangles under the graph of y = 1 - x³ on [0, 1]]
 i     x_i     y_i       a_i
 0     0       1         0.1
 1     0.1     0.999     0.0999
 2     0.2     0.992     0.0992
 3     0.3     0.973     0.0973
 4     0.4     0.936     0.0936
 5     0.5     0.875     0.0875
 6     0.6     0.784     0.0784
 7     0.7     0.657     0.0657
 8     0.8     0.488     0.0488
 9     0.9     0.271     0.0271
                 Total:  0.7975
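The table can be reproduced in a few lines. This is an illustrative sketch (the variable names are assumptions); it also computes, for comparison, the sum over inscribed rectangles and the average of the two sums.

```python
def f(x):
    return 1 - x**3

n = 10
width = 1 / n

# f is decreasing on [0, 1], so the circumscribed rectangle over each
# subinterval takes its height at the left endpoint, the inscribed one
# at the right endpoint.
circumscribed_sum = sum(f(i * width) * width for i in range(n))
inscribed_sum = sum(f((i + 1) * width) * width for i in range(n))
average_of_sums = (circumscribed_sum + inscribed_sum) / 2

print(circumscribed_sum, inscribed_sum, average_of_sums)
```

The first sum is the total 0.7975 of the table. Since the region has area 1 - 1/4 = 0.75, the average of the two sums is much closer to the truth than either sum alone.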
We might also have used inscribed rectangles. Their total area would be
The sum A3 has a geometric meaning: it is the sum of the areas of the inscribed
trapezoids.
[Figure: an inscribed trapezoid over one subinterval]
Over each of the little intervals, the area of the trapezoid is the average of the areas
of the inscribed and circumscribed rectangles; and it is not hard to check that the
same is true of the sums. This helps to explain why the approximation A ≈ A3 is
4.8 Simpson's Rule. The Computation of π 179
g y
3 2
then
G' = g,
and so
In the figure above, the approximation looks good, because the errors on the
two halves of [a, b] seem to cancel each other out. Most of the time, we cut [a, b]
into a certain number of little intervals $[a_i, a_{i+1}]$; we then use Simpson's rule on each
of the little intervals, and add the results.
We shall now develop a shortcut formula for Simpson's rule, in a special case.
$$\int_{-k}^{k} g(x)\,dx = \frac{k}{3}(y_0 + 4y_1 + y_2),$$
Before proving that this formula is true, let us first check it, in a simple case, to
make sure that it is not absurd. One of the possibilities is that g(x) = 1 for every x.
In this case, the integral on the left is equal to 2k. Thus our formula says that
$$2k = \frac{k}{3}(1 + 4 + 1),$$
which is correct. Any time you wonder whether you have remembered Simpson's
rule correctly, you should check by this method; the check uncovers the most common
errors in recollection.
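We proceed to the proof. We have
$$g(x) = Ax^2 + Bx + C.$$
Let
$$G(x) = \frac{A}{3}x^3 + \frac{B}{2}x^2 + Cx,$$
so that G' = g. Then
$$\int_{-k}^{k} g(x)\,dx = G(k) - G(-k) = \tfrac{2}{3}Ak^3 + 2Ck.$$
Now $y_0 = g(-k) = Ak^2 - Bk + C$, $y_1 = g(0) = C$, and $y_2 = g(k) = Ak^2 + Bk + C$, so that
$y_0 - 2y_1 + y_2 = 2Ak^2$. Therefore
$$\int_{-k}^{k} g(x)\,dx = \tfrac{2}{3}Ak^3 + 2Ck = \frac{k}{3}(y_0 - 2y_1 + y_2) + 2ky_1 = \frac{k}{3}(y_0 + 4y_1 + y_2).$$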
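Both the sanity check with g(x) = 1 and the statement for quadratic functions can be tried numerically. The sketch below is illustrative only; the function name, the value of k, and the sample quadratic are assumptions chosen for the example.

```python
def simpson_three_point(g, k):
    """Return (k/3)(y0 + 4*y1 + y2), with y0 = g(-k), y1 = g(0), y2 = g(k)."""
    y0, y1, y2 = g(-k), g(0.0), g(k)
    return k / 3 * (y0 + 4 * y1 + y2)

k = 2.5

# Check 1: g(x) = 1 for every x, where the integral over [-k, k] is 2k.
constant_check = simpson_three_point(lambda x: 1.0, k)

# Check 2: a sample quadratic g(x) = 3x^2 - 2x + 5.
# An antiderivative is x^3 - x^2 + 5x, so the integral over [-k, k] is 2k^3 + 10k.
quadratic_check = simpson_three_point(lambda x: 3 * x**2 - 2 * x + 5, k)
exact_quadratic = 2 * k**3 + 10 * k

print(constant_check, 2 * k)
print(quadratic_check, exact_quadratic)
```

Each printed pair should agree up to rounding, as the proof predicts for any quadratic.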
$$f(x) = \frac{1}{x + 2}, \qquad -1 \le x \le 1.$$
Here we have
$$k = 1, \quad y_0 = 1, \quad y_1 = \tfrac{1}{2}, \quad y_2 = \tfrac{1}{3}.$$
[Figure: graph of f(x) = 1/(x + 2) on [-1, 1]]
The rule gives
$$\int_{-1}^{1} \frac{dx}{x + 2} \approx \tfrac{1}{3}\left(1 + 2 + \tfrac{1}{3}\right) \approx 1.11.$$
Later, we shall find ways to calculate this integral as exactly as we please. It will then
turn out that the right answer, correct to four decimal places, is 1.0986. In this case,
the approximation is good, in spite of the length of the interval [-1, 1], because the
portion of the graph of f that we are dealing with is very close to its approximating
parabola.
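[Figure: graph of y = 1/(1 + x²) on [-1, 1]]
Here we have
$$k = 1, \quad y_0 = \tfrac{1}{2}, \quad y_1 = 1, \quad y_2 = \tfrac{1}{2}.$$
The rule gives
$$\int_{-1}^{1} \frac{dx}{x^2 + 1} \approx \tfrac{1}{3}\left(\tfrac{1}{2} + 4 + \tfrac{1}{2}\right) = \tfrac{5}{3} \approx 1.67.$$
Since
$$D\,\mathrm{Tan}^{-1} x = \frac{1}{1 + x^2},$$
the right answer is
$$\int_{-1}^{1} \frac{dx}{1 + x^2} = \mathrm{Tan}^{-1} 1 - \mathrm{Tan}^{-1}(-1) = \frac{\pi}{4} - \left(-\frac{\pi}{4}\right) = \frac{\pi}{2} \approx 1.57.$$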
Here the error is about 0.10, which is not very bad. To get better results, we need to
cut up our intervals into smaller pieces. The first step in deriving the necessary
formulas is to generalize Theorem 2, to take care of the case in which the origin is not
necessarily the midpoint of the interval over which we are integrating.
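The two worked examples can be recomputed directly. This is a sketch for checking the arithmetic only; the helper name is an assumption, and `math.log(3)` is used as the exact value of the first integral, which the text quotes as 1.0986.

```python
import math

def simpson_three_point(f, a, b):
    """(k/3)(y0 + 4*y1 + y2) on [a, b], where k is half the length of the interval."""
    k = (b - a) / 2
    return k / 3 * (f(a) + 4 * f(a + k) + f(b))

approx_1 = simpson_three_point(lambda x: 1 / (x + 2), -1.0, 1.0)
approx_2 = simpson_three_point(lambda x: 1 / (1 + x * x), -1.0, 1.0)

exact_1 = math.log(3)   # assumed exact value of the first integral
exact_2 = math.pi / 2   # Tan^-1 1 - Tan^-1 (-1)

print(approx_1, exact_1, abs(approx_1 - exact_1))
print(approx_2, exact_2, abs(approx_2 - exact_2))
```

The first error is a little over 0.01 and the second is a little under 0.10, which matches the remarks in the text about the two examples.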
The easiest way to see this is to move the graph a + k units to the left, so that the
point (a + k, 0) falls on the origin. When a parabola (or a line) is moved in this way,
it is still a parabola (or a line); the integral does not change, and neither do the
numbers $k$, $y_0$, $y_1$, and $y_2$. Therefore Theorem 3 is a consequence of Theorem 2.
Consider now a function f, on an interval [a, b]. We cut up the interval [a, b]
into an even number 2n of little intervals, each of length
$$k = \frac{b - a}{2n}.$$
The division points are $x_0, x_1, \ldots, x_{2n}$, as shown in the figure for n = 2.
[Figure: graph of f over [a, b], with ordinates $y_0, y_1, \ldots$ at the division points]
where $y_i = f(x_i)$, for each i. On the interval $[x_2, x_4] = [a + 2k, a + 4k]$ we get
$$\int_{a+2k}^{a+4k} f(x)\,dx \approx \frac{k}{3}(y_2 + 4y_3 + y_4).$$
This formula is the final form of Simpson's rule. Let us try it, with k = 0.2, to get a
better approximation of
$$\int_{-1}^{1} \frac{dx}{x + 2}.$$
The computation looks like this:
 i     x_i     y_i
                 Total:  16.4795
This gives
$$\int_{-1}^{1} \frac{dx}{x + 2} \approx \frac{0.2}{3}(16.4795) \approx 1.0986.$$
This answer is actually correct, to the fourth decimal place. Obviously, however,
we must have been lucky: Simpson's rule is not supposed to be exact, and, besides,
we were carrying only four decimal places in the calculation.
When you use Simpson's rule, it is a good idea to use a table like the one shown
above. Make sure that the last entry in the fourth column of your table is 1 and not 2.
We have postponed until now the presentation of Simpson's rule, because this
is the first point at which we can do something interesting with it. The interesting
thing is as follows. We know that
$$\int_0^1 \frac{dx}{1 + x^2} = \mathrm{Tan}^{-1} 1 - \mathrm{Tan}^{-1} 0 = \frac{\pi}{4}.$$
Therefore
$$\pi = 4\int_0^1 \frac{dx}{1 + x^2}.$$
Applying Simpson's rule, we can thus get a numerical approximation of π. This is
Problem 1 below.
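The procedure of cutting [a, b] into 2n little intervals and adding the three-point results can be sketched as follows. The function `simpson` and its parameters are illustrative assumptions, and the choice k = 0.1 for the second computation is made only for this example.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson's rule with 2n little intervals of length k = (b - a)/(2n)."""
    k = (b - a) / (2 * n)
    ys = [f(a + i * k) for i in range(2 * n + 1)]
    weighted = ys[0] + ys[-1]          # end ordinates have weight 1
    weighted += 4 * sum(ys[1:-1:2])    # odd-numbered ordinates have weight 4
    weighted += 2 * sum(ys[2:-1:2])    # interior even-numbered ordinates have weight 2
    return k / 3 * weighted

# The example of the text: k = 0.2 on [-1, 1], that is, n = 5.
ln3_estimate = simpson(lambda x: 1 / (x + 2), -1.0, 1.0, 5)

# Four times the integral of 1/(1 + x^2) over [0, 1], here with k = 0.1.
pi_estimate = 4 * simpson(lambda x: 1 / (1 + x * x), 0.0, 1.0, 5)

print(ln3_estimate, math.log(3))
print(pi_estimate, math.pi)
```

The weights 1, 4, 2, 4, ..., 2, 4, 1 come from adding the three-point formula over consecutive pairs of little intervals; each interior even-numbered ordinate is shared by two pairs, and the last ordinate, like the first, has weight 1.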
$$f(x) = \frac{1}{1 + x^2} \qquad (0 \le x \le 1),$$
with k = !. Check your answer against what people have been telling you about 'TT.
If you want to use k = 0.1, to get a more exact approximation, it might occur to you
to use a slide rule to calculate the $y_i$'s. Would this be a good idea? Why or why not?
How about five-place log tables?
2. Apply Simpson's rule to the function
with k = 2. Then calculate $\int_{-2}^{2} f(x)\,dx$ exactly, and compute the error in your
approximation.
3. Apply Simpson's rule to the function
with k = 100. Then calculate the integral exactly, and compute the error in the
approximation.
4. Apply Simpson's rule to
over the same interval as in Problem 4, using the same k, and compute the error.
6. There ought to be a theorem which accounts for some of the results that you have been
getting. State and prove the theorem.
.
7. Apply Simpson's rule to the function
f(x) =1 - x3,
on the interval [O, 1], using k = 0.1. (This is the integral which we investigated in the
text above, using inscribed rectangles, circumscribed rectangles, and finally trapezoids.)
*8. Given a positive number k and numbers $y_0$, $y_1$, and $y_2$, write an explicit formula for a
quadratic function g such that $g(-k) = y_0$, $g(0) = y_1$, and $g(k) = y_2$. That is, write
*9. Does the theorem that you proved in Problem 6 hold only on intervals of the type
[ -k, k] or does it hold on any interval [a, b]? Proof or refutation?
After finishing Problem 1, you may want to try a smaller k, to get a better approximation
of π. As a check,
π = 3.14159265,
correct to eight decimal places.
In Appendix F, at the end of the book, you will find a theorem which enables us, under
some conditions, to set a limit on the error in Simpson's rule.
For the case in which the exponents are positive integers, exponentials are part of
elementary algebra. We begin with:
It is then easy to see, simply by counting factors on the left and on the right,
that the familiar laws of exponents hold:
$$x^m x^n = x^{m+n}, \qquad (A)$$
$$(x^m)^n = x^{mn}. \qquad (B)$$
$$x^{-3} = \frac{1}{x^{-(-3)}} = \frac{1}{x^3}.$$
It can be shown that, if x ≠ 0, then formulas (A) and (B) hold for all integers m and n.
When the exponents are allowed to range over all real numbers, exponentials
cease to be part of elementary algebra. In this section we shall state the facts about
exponentials and logarithms, but will make no attempt to verify them. (In the following
two sections, we shall see how these facts fit together to make a logical theory.)
We begin with a positive base and a rational exponent.
1) Suppose that a > 0, and that x is a rational number p/q (where p and q are
integers and q > 0). We want to define $a^x = a^{p/q}$ in such a way that (A) and (B) will
continue to hold. For (B) to hold, we must have
$$(a^{p/q})^q = a^p.$$
That is, $a^{p/q}$ must be the qth root of $a^p$. Hence the following:
$$a^{p/q} = \sqrt[q]{a^p}.$$
Here we cannot allow the case a < 0. For a = -1, we would get
$$(-1)^{1/3} = \sqrt[3]{-1} = -1,$$
$$(-1)^{2/6} = \sqrt[6]{(-1)^2} = \sqrt[6]{1} = 1.$$
Thus, for a < 0, $a^{p/q}$ would depend not merely on the number that we use as an
exponent but also on the notation in which the number is expressed. This would lead
to nothing but trouble.
2) It is a fact that for a > 0, and x and y rational, the following laws hold:
[Figure: graphs of $f(x) = a^x$, x in Q, for a > 1 and for a < 1]
$$f: \mathbf{Q} \to \mathbf{R}^+$$
It is a fact that the definition of this function can be extended so as to give a new
function:
$$f: \mathbf{R} \to (0, \infty), \qquad x \mapsto a^x > 0,$$
defined for every x, such that f is continuous and satisfies (A), (B), and (C). For
a = 1, we have $f(x) = 1^x = 1$ for every x. But for a > 0 and a ≠ 1, f is invertible.
by definition. The image of the exponential function includes all positive numbers.
4.9 Exponentials and Logarithms 187
Therefore the domain of its inverse includes all positive numbers, and we have a
function
$$\log_a : (0, \infty) \to \mathbf{R}.$$
$$\log_a b^x = x \log_a b \quad (b > 0,\ b \neq 1), \qquad (B')$$
$$\log_a 1 = 0. \qquad (C')$$
$$\log_a a^x = x, \qquad a^{\log_a x} = x.$$
And the graph of either of these functions is the reflection of the graph of the other
across the line y = x.
[Figure: graph of $y = f(x) = a^x$, a > 1, and its reflection across the line y = x]
Using the laws (A'), (B'), and (C'), we express this in the form
$$\lim_{\Delta x \to 0} \frac{1}{\Delta x}\log_a \frac{x_0 + \Delta x}{x_0} = \lim_{\Delta x \to 0} \log_a \left(1 + \frac{\Delta x}{x_0}\right)^{1/\Delta x} = \lim_{\Delta x \to 0} \frac{1}{x_0}\log_a \left[\left(1 + \frac{\Delta x}{x_0}\right)^{x_0/\Delta x}\right].$$
Let
$$h = \frac{\Delta x}{x_0}.$$
$$e = \lim_{h \to 0} (1 + h)^{1/h}.$$
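The behavior of $(1 + h)^{1/h}$ for small h can be tabulated. This is only a numerical illustration under the assumption that `math.e` is the number e of the text; it does not show that the limit exists.

```python
import math

def one_plus_h_power(h):
    # The expression (1 + h)^(1/h), defined for h != 0 and h > -1.
    return (1 + h) ** (1 / h)

for h in (0.1, 0.01, 0.001, 1e-4, 1e-5, 1e-6, -1e-6):
    print(h, one_plus_h_power(h))

print("math.e =", math.e)
```

For small positive h the values lie slightly below e, and for small negative h slightly above, so the two one-sided tables squeeze toward the same number.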
Suppose that $\log_a$ is continuous, so that the limit of the logarithm is the logarithm of
the limit. Then
Since $e^1 = e$, we have
$$\log_e e = 1;$$
and so for a = e our differentiation formula takes the form
$$D \log_e x = \frac{1}{x}.$$
$$f(x) = \log_a x,$$
so that
$$f^{-1}(x) = a^x.$$
The general formula
$$f(f^{-1}(x)) = x$$
thus takes the form
$$\log_a a^x = x.$$
Since
$$D_u \log_a u = \frac{1}{u}\log_a e,$$
the chain rule gives:
$$\left(\frac{1}{a^x}\log_a e\right) Da^x = 1.$$
Therefore
$$Da^x = \frac{1}{\log_a e}\,a^x.$$
$$\frac{1}{\log_a e} = \log_e a.$$
Proof. Let
$$x = \log_a e, \qquad y = \log_e a.$$
Then
$$a^x = e,$$
by the definition of the logarithm. Therefore
This holds when xy = 1. Since the exponential function is invertible, it cannot take
on the same value twice. Therefore the equation can hold only when xy = 1. Therefore
$$\frac{1}{x} = y,$$
$$Da^x = a^x \log_e a.$$
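The formula $Da^x = a^x \log_e a$ and the identity $1/\log_a e = \log_e a$ can both be checked numerically. This sketch assumes that Python's `math.log(x)` is $\log_e x$ and that `math.log(x, a)` is $\log_a x$; the base and the sample point are arbitrary choices for illustration.

```python
import math

def central_difference(g, x, h=1e-6):
    # Symmetric difference quotient approximating g'(x).
    return (g(x + h) - g(x - h)) / (2 * h)

a, x = 2.0, 1.5

formula_value = a**x * math.log(a)                     # a^x log_e a
numeric_value = central_difference(lambda t: a**t, x)  # difference quotient for a^t

reciprocal_lhs = 1 / math.log(math.e, a)               # 1 / log_a e
reciprocal_rhs = math.log(a)                           # log_e a

print(formula_value, numeric_value)
print(reciprocal_lhs, reciprocal_rhs)
```

Each printed pair should agree to many decimal places.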
This is better, not just because it avoids a fraction, but also because e is one of the
two bases for which tables of logarithms are published.
Throughout the following problem set you may assume that the statements
made in this section are true. (They will be proved in the following two sections.)
For convenience of reference, we give a summary.
25. $\log_e \sin x$   26. $\log_e \cos x$   27. $\log_e (\sec x + \tan x)$
28. log.
J 1 -x
29. $\log_e (\csc x + \cot x)$   30. $\log_e (x + \sqrt{x^2 + 1})$
l
31. Show that for every x > 0,
$$\log_e x = \int_1^x \frac{1}{t}\,dt.$$
32. Show that if a and b are positive and different from 1, then
$$\log_b a = \frac{1}{\log_a b}.$$
33. Show that, under the same conditions,
34. Show that, if a and b are positive and different from 1, then
Prove this.
36. Show that the function $f(x) = e^x$ is completely described by the conditions
$$f'(x) = f(x) \quad (-\infty < x < \infty), \qquad (1)$$
$$f(0) = 1. \qquad (2)$$
That is, show that (1) and (2) imply that $f(x) = e^x$ for every x.
37. Show that the function $f(x) = e^{-x}$ is completely determined by the conditions
$$f'(x) = -f(x) \quad (-\infty < x < \infty), \qquad (1)$$
$$f(0) = 1. \qquad (2)$$
That is, show that (1) and (2) imply that $f(x) = e^{-x}$ for every x.
4.10 The Functions ln and exp 191
In the preceding section, we gave a sketch of the way that logarithms and exponentials
ought to behave, postponing both the proofs and also the basic definitions. We
shall now fill these gaps.
If you review the formulas of the preceding section, you will see that after considerable
complications in the middle, we got a formula that looked simple:
$$D \log_e x = \frac{1}{x}.$$
$$\log_e x = \int_1^x \frac{1}{t}\,dt.$$
If the theory works, then this formula must be right: the functions on the two sides
of the equation have the same derivative (namely, 1/x), and they have the same value
at x = 1 (namely, 0); and so it follows by the uniqueness theorem that they are the
same function.
We shall use the function $\int_1^x (1/t)\,dt$ as the foundation of the theory of exponentials
and logarithms. The scheme is to investigate the function $\int_1^x (1/t)\,dt$, learn its properties,
and then define all our other functions in terms of it. Thus, at the beginning,
we shall investigate $\int_1^x (1/t)\,dt$ without assuming that we know anything about
logarithms, or about exponentials, or about the number e. To emphasize that we are
starting afresh, we give the function a new name ln. (Here ln is suggested by natural
logarithm.) And the official theory begins with the definition of ln in terms of an
integral:
$$\ln x = \int_1^x \frac{dt}{t}.$$
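The definition of ln as an integral can be explored numerically before any theorems are proved. The sketch below approximates the integral by a midpoint sum; the function name, the number of steps, and the comparison with Python's `math.log` are assumptions made for illustration only.

```python
import math

def ln_by_integral(x, steps=100_000):
    """Approximate the integral of 1/t from 1 to x (x > 0) by a midpoint sum."""
    h = (x - 1.0) / steps   # h is negative when 0 < x < 1, which reverses the sign
    return sum(h / (1.0 + (i + 0.5) * h) for i in range(steps))

print(ln_by_integral(2.0), math.log(2.0))

# The law ln ab = ln a + ln b, tried with a = 2 and b = 3.
print(ln_by_integral(6.0), ln_by_integral(2.0) + ln_by_integral(3.0))

# For 0 < x < 1 the integral runs backward, so the value is negative.
print(ln_by_integral(0.5))
```

The integral alone, with no prior knowledge of logarithms, already reproduces the logarithm's characteristic behavior in these samples.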
Soon we shall show that every real number y is equal to ln x for some x. For this
purpose we shall need:
That is, if $f(x_1) < k < f(x_2)$, then there is an x, between $x_1$ and $x_2$, such that
f(x) = k. And if $f(x_2) < k < f(x_1)$, then the same conclusion follows. This theorem
will be proved in the next chapter.
Our first few theorems on the function ln are easy.
Theorem 2. D ln x = 1/x.
Proof. This follows from the definition of ln and the formula for the derivative of
the integral.
Theorem 3. ln 1 = 0.
This is obvious.
Proof. The trouble with this theorem is that it does not appear to involve any functions.
To prove it, we first restate it, using k for a and x for b. It then says that for
every k, x > 0,
ln kx = ln k + ln x.
We now want to show that the graph of ln looks approximately like the drawing
above. The figure suggests that ln 1 = 0 and ln' 1 = 1. These things we already
know. Other things suggested by the figure are conveyed by some of the following
theorems.
Theorem 6. ln is invertible.
Proof. We know that ln' x = 1/x; and 1/x ≠ 0 for every x. Therefore ln' x ≠ 0 for
every x. By Theorem 6 of Section 4.7, ln is invertible.
= ln x + n ln x = (n + 1) ln x.
Therefore, by induction, we have $\ln x^n = n \ln x$ for every x, which was to be proved.
Then
$$\ln 2^n = n \ln 2,$$
for every n. And ln 2 > 0, because ln 2 is the area of a region. Therefore ln cannot
have an upper bound; no number M is greater than or equal to all of the numbers
n ln 2, because
$$n \ln 2 > M \quad \text{whenever} \quad n > \frac{M}{\ln 2}.$$
Proof.
$$\ln \frac{1}{x} + \ln x = \ln \left(\frac{1}{x} \cdot x\right) = \ln 1 = 0;$$
Because no number m is less than or equal to all the numbers $\ln 2^{-n} = -n \ln 2$.
Theorem 12. Every real number is a value of the function ln. That is, every number y
is equal to ln x for some x > 0.
Proof. Since ln is unbounded both above and below, it follows that every number
y lies between two values of ln. If $\ln x_1 < y < \ln x_2$, then it follows by Theorem 1
that y = ln x for some x. Thus the image of the function ln is the entire interval
R = (-∞, ∞).
We know by Theorem 6 that ln is invertible. Its inverse will be denoted by exp.
That is:
Definition. $\exp = \ln^{-1}$.
Since ln x will turn out to be $\log_e x$, this means that exp x will turn out to be $e^x$.
But we should not use the notation $e^x$, at this stage, because we have not yet defined e
$$f^{-1}(f(x)) = f(f^{-1}(x)) = x.$$
As always, for functions which are inverses of one another, the image of exp is the
domain of ln. Therefore
[Figure: graphs of exp and ln, reflections of each other across the line y = x]
This theorem is also easy to see graphically, in the figure above. The graph
of ln lies to the right of the y-axis. Reflecting this graph across the line y = x, we get
the graph of exp. Therefore the graph of exp lies above the x-axis.
Because ln 1 = 0.
Theorem 17. exp (k + x) = (exp k)(exp x) for every k and x.
ln exp (k + x) = k + x,
because ln and exp are inverses of each other. And
Proof. We know that ln exp x = x. Since ln' u = 1/u, the chain rule gives
$$(\ln' \exp x)\,\exp' x = 1, \qquad \frac{1}{\exp x}\,\exp' x = 1, \qquad \exp' x = \exp x,$$
for every x. Therefore exp' = exp, which was to be proved.
We now have functions ln x and exp x which have the properties that the functions
$\log_e x$ and $e^x$ are expected to have. A natural next step is to find a number e, and
define the exponential function $e^x$, in such a way that $\exp x = e^x$. We shall do this in
the next section. Meanwhile we give a quick summary of this one.
Definitions
a) $\ln x = \int_1^x \dfrac{dt}{t}$ (x > 0),
Laws for ln
c) ln 1 = 0, d) $\ln x^n = n \ln x$ (x > 0),
e) ln kx = ln k + ln x (k, x > 0), f) ln' x = 1/x (x > 0).
Some of the problems below are to be solved by any method that works, including
methods based on the unproved results of Section 4.9. Some, however, are supposed to
be worked strictly on the basis of the theory developed in this section; and these are stated
in the notation of ln and exp. Thus, if the problem uses the notation $a^x$, $\log_a x$, then the
solution may use the theory in Section 4.9; but if the problem uses ln, exp, then the solution
should also.
Find the derivatives of the following functions:
1. $\ln^2 x$   2. ln ln x   3. $\ln (x^2 + 1)$   4. $\ln x^2 + 1$
$$\ln x = \int_1^x \frac{dt}{t}$$
$$f(x) = \int_1^x \frac{dt}{\sqrt{t}}\,?$$
Why or why not?
18. How about the function
$$g(x) = \int_1^x \frac{dt}{t^2}\,?$$
19. Given
$$h(x) = \int_1^{x^2} \frac{dt}{\sqrt{t}} \quad (0 < x < \infty),$$
find h'(x), by any method. Note, however, that you are not being asked to calculate h;
you are being asked only to calculate h'.
20. Given
$$f(x) = \int_0^{\sin x} \sqrt{1 + t^2}\,dt,$$
find f'(x).
4.11 Exponentials and Logarithms. The Existence of e 197
21. Given
$$f(x) = \int_0^{\tan x} \sqrt{1 + t^2}\,dt,$$
find f'(x).
22. Given
g(x) = fxdtt'
i
findg'(x).
23. Given
tanx-1 lnx In x2
25. Find Jim 26. Find Jim . 27. Find Jim 1. --
x�1X - 1
--
32. Given $f(x) = x^{-1}$, find a formula for $f^{-1}(x)$, and sketch both functions on the
same set of axes.
Show that exp x ≥ x + 1, for every x. [Hint: Try to use a known property of ln.]
33. Let $k = \ln^{-1} 1$. Show that k > 2.
34. Show that k < 4.
$$f(h) = (1 + h)^{1/h},$$
as h → 0. To investigate this limit, we first need a proper definition of the function
$(1 + h)^{1/h}$. Since the exponent 1/h varies continuously through real values, we need
a definition of the exponential $a^x$, where a > 0 and x is not necessarily rational.
The right definition is not hard to find. We know that if n is a positive integer, then
$$\ln a^n = n \ln a.$$
Therefore
$$a^n = \exp (n \ln a).$$
If the laws of logarithms hold for all exponents, then
$$\ln a^x = x \ln a,$$
which gives
$$a^x = \exp (x \ln a).$$
We take this last equation as our definition of the exponential function $a^x$. Thus:
This gives:
Theorem 1. $\ln a^x = x \ln a$.
We are now ready to show that $(1 + h)^{1/h}$ approaches a limit, as h → 0. Let
$$f(x) = (1 + x)^{1/x}.$$
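Then
$$\ln f(x) = \frac{1}{x}\ln (1 + x).$$
$$\ln f(\Delta x) = \frac{1}{\Delta x}\ln (1 + \Delta x) = \frac{\ln (1 + \Delta x) - \ln 1}{\Delta x}.$$
This last fraction is the fraction whose limit is ln' 1, by definition of the derivative.
Therefore
$$\lim_{\Delta x \to 0} \ln f(\Delta x) = \ln' 1 = \tfrac{1}{1} = 1,$$
and
$$\lim_{\Delta x \to 0} f(\Delta x) = \lim_{\Delta x \to 0} \exp \ln f(\Delta x) = \exp \lim_{\Delta x \to 0} \ln f(\Delta x) = \exp 1 = \ln^{-1} 1.$$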
And we know:
$$e = 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!} + \cdots,$$
where
$$n! = 1 \cdot 2 \cdot 3 \cdots n.$$
The series on the right is infinite, but the terms diminish so rapidly as n increases
that we get good numerical approximations by using the first few terms.
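How quickly the terms diminish can be seen by tabulating partial sums. This is a small illustrative sketch; the function name is an assumption, and `math.e` is taken as the reference value of e.

```python
import math

def e_partial_sum(n_terms):
    """Sum of the first n_terms terms 1/0! + 1/1! + 1/2! + ... of the series."""
    total, term = 0.0, 1.0
    for n in range(n_terms):
        total += term      # at this point term equals 1/n!
        term /= n + 1      # turn 1/n! into 1/(n + 1)!
    return total

for n_terms in (1, 2, 3, 5, 8, 11):
    approximation = e_partial_sum(n_terms)
    print(n_terms, approximation, abs(approximation - math.e))
```

With the terms through 1/10! the sum already agrees with e to about seven decimal places.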
We expected exp x to be $e^x$. We can now show that this is true:
$$e^x = \exp (x \ln e),$$
$$y = \log_a x \iff a^y = x.$$
Since $e^x$ and exp x are the same function, they have the same inverse. Therefore
we have:
$$a^y = \exp (y \ln a),$$
by definition. Therefore
$$\ln a^y = \ln \exp (y \ln a),$$
and
$$\ln a^y = y \ln a.$$
$$a^{\log_a x} = x.$$
In this equation, we take the ln of each side, getting
$$(\log_a x) \ln a = \ln x.$$
This gives:
Thus the function $\log_a$ is a constant times the function ln; and this means that the
extension of the theory from ln to $\log_a$ is easy.
$$\log_a 1 = 0, \qquad (i)$$
$$\log_a' x = \frac{1}{x \log_e a}, \qquad (j)$$
Here the formula designations are those of the summary at the end of Section 4.9.
The proofs are as follows.
Proof. For a = e, the first three formulas are known to hold, because in this case
$\log_a = \log_e = \ln$. If we divide every term by ln a, then we get $\log_a$ throughout, and
the equations still hold. To get Eq. (j), we observe that
$$\log_a' x = D \log_a x = D\,\frac{\ln x}{\ln a} = \frac{1}{x \ln a}.$$
Equations (k) and (l) merely remind us that $\log_a x$ and $a^x$ are inverses of one
another. Similarly, we get the laws governing the exponential $a^x$ by using the fact that
$$a^x = \exp (x \ln a) = e^{x \ln a}.$$
a) We have
$$\ln (a^x \cdot a^y) = \ln a^x + \ln a^y = x \ln a + y \ln a = (x + y) \ln a = \ln a^{x+y}.$$
Since $a^x \cdot a^y$ and $a^{x+y}$ have the same ln, they must be the same; ln is invertible,
and so ln never takes on the same value twice.
b) By definition, $b^y = \exp (y \ln b)$. Therefore
$$(a^x)^y = \exp (y \ln a^x) = \exp (yx \ln a) = \exp (xy \ln a) = a^{xy}.$$
e) $Da^x = D \exp (x \ln a) = [\exp (x \ln a)] \ln a$, by the chain rule. Therefore $Da^x = a^x \ln a$.
This completes the program that was sketched in Section 4.9. There are, however,
some things that we still need to check. In the elementary theory, we stated:
(to n factors).
$$\ln (x^n) = n \ln x,$$
by definition of $[x^n]$.
Similarly, we now have two definitions of $a^{p/q}$.
$$a^x = a^{p/q} = \exp (x \ln a) = \exp \left(\frac{p}{q} \ln a\right).$$
$$\sqrt[q]{a^p} = \exp \left(\frac{p}{q} \ln a\right).$$
Then
and
$$q \ln y = p \ln a.$$
Therefore
$$\ln y = \frac{p}{q} \ln a, \qquad \text{and} \qquad y = \exp \left(\frac{p}{q} \ln a\right),$$
$$Dx^k = kx^{k-1}$$
holds true in certain cases. We first proved it for the case in which k was a positive
integer. Later we found that it held true when k was a negative integer. For k = 1/2,
and x > 0, it says that
$$Dx^{1/2} = \tfrac{1}{2}x^{1/2 - 1} = \tfrac{1}{2}x^{-1/2} = \frac{1}{2\sqrt{x}},$$
$$= x^k \cdot k \cdot \frac{1}{x} = kx^{k-1}.$$
In this section we have presented no new results, except for Theorem 8; we have
merely furnished proofs for the theory sketched in Section 4.9. You therefore have
no new material for problem work. Hence we give the definitions of a new set of
functions, the hyperbolic functions, and list various identities which they satisfy.
In the following problem set you will be asked to derive these. The theory is simpler
than the theory of trigonometric functions. In fact, once you know about the exponential
function, most of the following formulas have straightforward derivations.
The functions are called the hyperbolic sine, hyperbolic cosine, hyperbolic
tangent, and so on.
Definitions
$$\sinh x = \frac{e^x - e^{-x}}{2}$$
$$\cosh x = \frac{e^x + e^{-x}}{2}$$
$$\tanh x = \frac{e^x - e^{-x}}{e^x + e^{-x}} = \frac{\sinh x}{\cosh x}$$
$$\mathrm{sech}\, x = \frac{2}{e^x + e^{-x}} = \frac{1}{\cosh x}$$
$$\mathrm{csch}\, x = \frac{2}{e^x - e^{-x}} = \frac{1}{\sinh x}$$
Identities
sinh (-x) = -sinh x. (1)
Derivatives
sinh' x = cosh x. (14)
cosh' x = sinh x. (15)
tanh' x = sech2 x. (16)
coth' x = -csch2 x. (17)
sech' x = -sech x tanh x. (18)
csch' x = - csch x coth x. (19)
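Before deriving these formulas it may help to test a few of them numerically. The sketch below builds the functions from the exponential exactly as in the definitions and compares difference quotients with formulas (14), (15), and (16); the sample point and helper names are assumptions, and Python's `math.sinh` and `math.cosh` are used only as an outside check.

```python
import math

def sinh(x):
    return (math.exp(x) - math.exp(-x)) / 2

def cosh(x):
    return (math.exp(x) + math.exp(-x)) / 2

def tanh(x):
    return sinh(x) / cosh(x)

def sech(x):
    return 1 / cosh(x)

def central_difference(g, x, h=1e-6):
    # Symmetric difference quotient approximating g'(x).
    return (g(x + h) - g(x - h)) / (2 * h)

x = 0.7
print(central_difference(sinh, x), cosh(x))        # formula (14)
print(central_difference(cosh, x), sinh(x))        # formula (15)
print(central_difference(tanh, x), sech(x) ** 2)   # formula (16)
print(sinh(x) - math.sinh(x), cosh(x) - math.cosh(x))
```

Note that cosh' x = sinh x carries no minus sign, unlike cos' x = -sin x.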
Verify the following. (The numbers in parentheses refer to the numbered formulas
above in the text.)
Show that A + B = 0. (It is not necessary to go back to the definitions to show this.
Try Identities (12) and (13).)
17. Let A and B be as in Problem 16. Show that A - B = 0.
18. (7) 19. (8) 20. (9) 21. (10) 22. (11)
23. Express cosh 3x in terms of cosh x.
24. Show that
x > 0 => sinh x > 0.
25. Show that
x < 0 => sinh x < 0.
for every x. Note that there is no double sign in this formula; if your derivation leads
to a "±" sign, you must find a way to get rid of it.
35. Let
Cosh x = cosh x (0 ≤ x).
(Compare with the definition of Cos: Cos x = cos x (0 ≤ x ≤ π).) Show that Cosh is
invertible.
$$\sinh x = \begin{cases} \sqrt{\cosh^2 x - 1}, & \text{for } x \ge 0, \\ -\sqrt{\cosh^2 x - 1}, & \text{for } x < 0. \end{cases}$$
37. Show that
$$\sinh \mathrm{Cosh}^{-1} x = \sqrt{x^2 - 1}.$$
38. Find $D\,\mathrm{Cosh}^{-1} x$.
46. Find a formula which expresses $\sinh^{-1} x$ as the logarithm of an algebraic expression.
Hint: The graph of sinh is the graph of the equation
Here we have reflected the graph across the line y = x, by interchanging x and y in
Eq. (1). Now solve for y in (2), getting
y = ( · · · ).
Then
$\sinh^{-1} x$ = ( · · · ).
Here x and x' are any points in the domain of f. Some simple functions are neither
increasing nor decreasing. For example, f(x) = x² satisfies neither of the above
conditions.
Often, however, we can get a good description of a function by cutting up its
domain into subintervals, in such a way that on each subinterval the function is either
increasing or decreasing. For example, the domain might be a closed interval [a, b],
and the graph might look like this:
5.1 Intervals on Which a Function Increases, or Decreases 207
[Figure: graph of f over a closed interval, with division points $x_1$, $x_2$, $x_3$ marked on the x-axis]
We recall that an interior point of an interval is any point which is not an endpoint.
Theorem 1 is a consequence of the mean-value theorem (MVT).
If we had
(?) a, b in I, a < b, f(a) > f(b) (?),
as on the left of the graph above, then the slope of the chord would be
$$\frac{f(b) - f(a)}{b - a} < 0,$$
and this would give f'(x) < 0 for some x between a and b. This is impossible,
because such an x would be an interior point of I. If we had
[π, 2π]. We don't need theorems to be as general as possible, but we want them to be
general enough to be usable. And it is not unusual to find that f'(x) = 0 at an endpoint;
in fact, this is what usually happens, when we break up the domain of our
function into the largest possible subintervals on which the derivative does not change
sign. Here f is increasing on $I_1$ and $I_3$, and decreasing on $I_2$; and the derivative
vanishes at the endpoints $x_1$ and $x_2$.
$$f(x) = x^2 - x \quad (0 \le x \le 2).$$
Here
[Figure: two graphs on x-y axes; one of them is labeled f.]
$$f(x) = x^3 + 2x^2 - 3x - 4, \quad -2 \le x \le 2.$$
This is not a put-up job; it is a "real-life" problem, and nothing is going to come out
even. We need to find out where f' > 0 and where f' < 0. Now
$$f'(x) = 3x^2 + 4x - 3,$$
so that
$$f'(x) = 0 \quad\text{when}\quad x = \frac{-2 \pm \sqrt{13}}{3}.$$
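Since the graph of f' is a parabola opening upward, it must look like the drawing
on the left above. Thus
$$f'(x) > 0 \text{ when } x < x_2, \qquad f'(x) < 0 \text{ when } x_2 < x < x_1, \qquad f'(x) > 0 \text{ when } x > x_1,$$
where $x_2 < x_1$ are the two roots of f'.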
This gives us our sketch on the right. (The problems in the following problem set are
not this awkward.)
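The sign analysis of this example can be checked numerically. What follows is a minimal Python sketch, not part of the original text; the helper name `monotonic_intervals` is hypothetical and introduced only for illustration. It computes the zeros of f' from the quadratic formula and tests the sign of f' at the midpoint of each resulting subinterval of [-2, 2].

```python
import math

def f_prime(x):
    """Derivative of f(x) = x^3 + 2x^2 - 3x - 4."""
    return 3 * x**2 + 4 * x - 3

# Zeros of f' from the quadratic formula: x = (-2 +/- sqrt(13)) / 3.
x1 = (-2 + math.sqrt(13)) / 3   # larger root
x2 = (-2 - math.sqrt(13)) / 3   # smaller root

def monotonic_intervals(a, b):
    """Split [a, b] at the zeros of f' and label each piece (hypothetical helper)."""
    cuts = [a] + sorted(r for r in (x1, x2) if a < r < b) + [b]
    pieces = []
    for left, right in zip(cuts, cuts[1:]):
        mid = (left + right) / 2
        # f' has constant sign between consecutive zeros, so one sample suffices.
        label = "increasing" if f_prime(mid) > 0 else "decreasing"
        pieces.append((left, right, label))
    return pieces

for left, right, label in monotonic_intervals(-2, 2):
    print(f"[{left:.4f}, {right:.4f}]: f is {label}")
```

Sampling one point per subinterval is enough here only because f' is continuous and its zeros are known exactly; for a general function the zeros would have to be located first.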
To apply this method, you need to know how the derivative behaves; and we
may use the same method in investigating the derivative. For example, in the
preceding problem we had
$$f'(x) = 3x^2 + 4x - 3.$$
If we let
$$g(x) = f'(x) = 3x^2 + 4x - 3,$$
then
$$g'(x) = 6x + 4.$$
Therefore g is increasing for $x > -\tfrac{2}{3}$, and is decreasing for $x < -\tfrac{2}{3}$. Plotting g
exactly, at the points $-2$, $x_2$, 0, and $x_1$, we get the sketch of f' which is given
above. We know that $f'(x) > 0$ for $x_1 < x < 2$, because f' increases, starting at the
value $f'(x_1) = 0$. Similarly, $f'(x) > 0$ for $-2 < x < x_2$, because on the interval
$[-2, x_2]$, f' decreases toward $f'(x_2) = 0$. Similarly in the middle interval $[x_2, x_1]$.
This idea is simple enough, but it is so useful that we had better record it as a theorem:
Theorem 3. If f is increasing on $[x_1, x_2]$, then $f(x) > f(x_1)$ for every x on $(x_1, x_2]$.
Theorem 4. If f is decreasing on $[x_1, x_2]$, then $f(x) < f(x_1)$ for every x on $(x_1, x_2]$.
For each function given, state on what intervals the function is increasing, and on what
intervals it is decreasing; and sketch the graph.
5.2 Local Maxima and Minima, Direction of Concavity, Inflection Points 211
1. $f(x) = \sin x$,
2. $f(x) = \operatorname{Sin}^{-1} x$, $-1 \le x \le 1$
3. $f(x) = \dfrac{1}{1 + x^2}$, $-2 \le x \le 2$
4. $f(x) = \dfrac{x}{1 + x^2}$, $-2 \le x \le 2$
5. $f(x) = x^3 - 3x$, $-2 \le x \le 2$
6. $f(x) = x^3 + 3x^2 - 2$, $-1 \le x \le 3$
7. $f(x) = (\sin x + \cos x)^2 - 1$,
16. $f(x) = \dfrac{x}{1 + x^4}$, $-1 \le x \le 1$
17. $f(x) = x \cos x - \sin x$
18. $f(x) = x/2 + \sin x$, $0 \le x \le 2\pi$
19. $f(x) = e^x - 2x$, $0 \le x \le 2$
(Here you are not going to be able to get answers in an exact numerical form. The figure
should indicate plausible approximations.)
20. Investigate the converse of Theorem 1. That is, find out whether the following
statement is true:
Theorem (?). If (i) f is continuous on [a, b], (ii) f is differentiable on (a, b), and (iii) f is
increasing on [a, b], then (iv) $f'(x) > 0$ for every x of (a, b).
21. Is the following true?
Theorem (?). If f is differentiable at $x_0$ and $f'(x_0) > 0$, then some chord of the graph of
f has a positive slope.
22. Investigate:
Theorem (?). Let f be a function satisfying (i), (ii), and (iii) of Problem 20. Then (iv')
$f'(x) \ge 0$ for every x of (a, b).
Again we consider a continuous function f, defined on a closed interval [a, b]. In the
figure, $f(x_2) = M$; and M is the largest value of f.
212 The Variation of Continuous Functions 5.2
We say that f has a maximum at $x_2$; and we say that M is the maximum value of f.
Similarly, $f(x_3) = m$; and m is the smallest value of f. We say that f has a minimum
at $x_3$; and we say that m is the minimum value of f.
Here when we speak of maxima and minima, we mean maxima and minima on the
whole domain of the function f; in this case the domain is [a, b]. Before you know
what is a maximum or minimum, you must first know the domain of the function.
In the figure above, $f(x_1)$ is not a minimum value, because $f(x_3) < f(x_1)$. But
$f(x_1)$ is the smallest value that f takes on when x is close to $x_1$. We say that f has a
local minimum at $x_1$. This is abbreviated as LMin. Local minima can occur in three
ways:
[Figure: the three ways in which a local minimum can occur at $x_1$: at an interior point of $(x_1 - \delta, x_1 + \delta)$, at a left-hand endpoint, and at a right-hand endpoint.]
1) $x_1$ may lie on an open interval $(x_1 - \delta, x_1 + \delta)$, in the domain of f; and $f(x_1)$
may be the smallest value of the function on the interval $(x_1 - \delta, x_1 + \delta)$. In this
case, we say that f has an interior local minimum at $x_1$. This is abbreviated as ILMin.
2) $x_1$ may be the left-hand endpoint of the domain of f; and $f(x_1)$ may be the smallest
value of f on an interval $[x_1, x_1 + \delta)$.
3) $x_1$ may be the right-hand endpoint of the domain of f; and $f(x_1)$ may be the smallest
value of f on an interval $(x_1 - \delta, x_1]$.
Thus, for the function f whose graph is sketched at the beginning of this section,
we have local minima at x1 and x3. Note that every minimum is automatically a local
minimum, just as the tallest man in the world is automatically the tallest in his own
neighborhood.
Local maxima are defined similarly. Local maximum is abbreviated as LMax.
A local maximum can occur in three ways:
[Figure: the three ways in which a local maximum can occur at $x_1$: at an interior point of $(x_1 - \delta, x_1 + \delta)$, at a left-hand endpoint, and at a right-hand endpoint.]
In the figure on the left, f has an interior local maximum at $x_1$. This is abbreviated
ILMax.
There are simple conditions under which a function has an ILMax or an ILMin
at a given point.
[Figure: a graph with a horizontal tangent at $x_1$, where $f'(x_1) = 0$.]
Theorem 3. If f has an ILMax at $x_1$, and f is differentiable at $x_1$, then $f'(x_1) = 0$.
$$m(x) = \frac{f(x) - f(x_1)}{x - x_1},$$
so that
$$\lim_{x \to x_1} m(x) = f'(x_1).$$
1) Suppose that $f'(x_1) > 0$. Then the function m(x) must be $> 0$ when $x \approx x_1$.
2) Suppose that $f'(x_1) < 0$. Then the function m(x) must be $< 0$ when $x \approx x_1$.
This is the standard method for finding an ILMax. Given a differentiable function
f, we find the points x where $f'(x) = 0$. Usually there are only a finite number of
such points. These are the only possible places where interior local maxima can occur.
Therefore we have only a finite number of values of x to investigate; and when we are
done, our list of interior local maxima is complete.
Note, however, that the converse of Theorem 3 is false: if $f'(x_1) = 0$, it does not
follow that f has a local maximum (or a local minimum) at $x_1$. For example, if
$f(x) = x^3$, $-1 \le x \le 1$, then $f'(0) = 0$, but f is increasing on the whole interval
$[-1, 1]$.
Theorem 4. If f has an ILMin at $x_1$, and f is differentiable at $x_1$, then $f'(x_1) = 0$.
Proof. Let
$$g(x) = -f(x).$$
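[Figure: two graphs illustrating direction of concavity; in the right-hand figure the direction of concavity changes at $x_2$.]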
If f' is increasing on an interval $[x_1, x_2]$, then f is concave upward on $[x_1, x_2]$.
(You ought to be able to convince yourself that this is a reasonable use of language.)
If f' is decreasing on $[x_2, x_3]$, then f is concave downward on $[x_2, x_3]$. In the figure on
the right, $x_2$ is the point at which the direction of concavity changes. Such a point is
called an inflection point. Of course, the direction of concavity can change from up to
down or from down to up. Hence:
Note the way in which these definitions fit together. If you know how to
investigate (a) increasing, (b) decreasing, (c) interior local maxima, and (d) interior local
minima, then automatically you know how to investigate direction of concavity and
inflection points. The reason is that f', once you get it, is a function, and can be
investigated in the same way as any other function, with the aid of its derivative f''.
Where f' increases, f is concave upward; where f' decreases, f is concave downward;
and where f' has an interior local maximum or minimum, f has an inflection point.
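As an illustration of how these definitions fit together, here is a small Python sketch for $f(x) = x^3 - 3x$ (Problem 5 of the preceding problem set). It is only a sketch under stated assumptions: the function names `classify_critical_point` and `concavity` are hypothetical, and the derivatives are written out by hand. A critical point is classified by the sign change of f' across it, and concavity is read off from the sign of f'', since f' increases where f'' is positive.

```python
def f_prime(x):
    """f'(x) for f(x) = x^3 - 3x."""
    return 3 * x**2 - 3

def f_second(x):
    """f''(x) for f(x) = x^3 - 3x."""
    return 6 * x

def classify_critical_point(x0, h=1e-3):
    """Classify a zero of f' by the sign of f' just left and right of it."""
    left, right = f_prime(x0 - h), f_prime(x0 + h)
    if left > 0 > right:
        return "ILMax"      # f increases, then decreases
    if left < 0 < right:
        return "ILMin"      # f decreases, then increases
    return "neither"

def concavity(x):
    """Direction of concavity at x, from the sign of f''."""
    s = f_second(x)
    if s > 0:
        return "concave upward"
    if s < 0:
        return "concave downward"
    return "undetermined"

print(classify_critical_point(-1.0), classify_critical_point(1.0))
print(concavity(-0.5), concavity(0.5))
```

The direction of concavity changes at x = 0, where f' has an interior local minimum, so x = 0 is the inflection point of this f.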
Most of the time, we investigate local maxima and local minima because we
want to find the maxima and minima. We find the maxima and minima, on the whole
domain, by looking to see which local maximum value is the largest and which local
minimum value is the smallest.
Finally, we observe that a function may easily have a local maximum or minimum
at an endpoint at which it is not differentiable. For example, the function
$f(x) = x^{2/3}$ $(0 \le x < \infty)$ has a minimum (and hence a local minimum) at x = 0. The
theory takes care of this case. Since the derivative $\tfrac{2}{3}x^{-1/3}$ is positive in the interior
of the interval $[0, \infty)$, it follows that the function is increasing, and so it has a
minimum at the left-hand endpoint.
1 through 19. For each of the functions described in Problems 1 through 19 of the
preceding problem set, find the local maxima, the local minima, the maximum, the minimum,
the inflection points (if any), and the image. (The image will always turn out to be a closed
interval.) Tell where each of the functions is concave upward and where it is concave
downward.
20. Consider the function defined by the following conditions:
$$f(x) = x \sin\frac{1}{x} \quad\text{for } 0 < x \le \frac{1}{\pi}, \qquad f(0) = 0.$$
An exact sketch is not practical, because the ILMax and ILMin points are hard to
calculate. Give a rough sketch, however, indicating as well as you can how the function
behaves. Is it continuous at 0? Does it have a local maximum or minimum at 0? Is it
differentiable at 0?
*21. Suppose that f is both continuous and differentiable on [0, 1]. Does it follow that f has
an LMax or an LMin at 0? Why or why not?
So far, we have been discussing functions on closed intervals. In this section, we shall
consider larger domains, including infinite intervals, such as $(-\infty, \infty)$, $[0, \infty)$,
and so on, and also intervals with holes in them. For example, the domain of the
tangent is
$$D = \{x \mid x \ne \pi/2 + n\pi\};$$
and this is an infinite interval $(-\infty, \infty)$ with infinitely many holes in it.
5.3 The Behavior of Functions at Infinity 217
Most of the ideas that we shall be investigating are illustrated by a simple function,
whose domain has a hole in it at 0.
[Figure: left, the graph of $f(x) = 1/x$ $(x \ne 0)$; right, the graph of $f(x) = 1/(x^2 - 1)$, $x \ne \pm 1$.]
A careful inspection of the left-hand graph above will give you an idea of the
meanings of the following statements:
$$\lim_{x \to \infty} f(x) = 0, \tag{1}$$
Definitions will come later. Meanwhile, let us look at another example. The
function whose graph is shown on the right above has the following properties:
Here statements (ii) through (v) mean the things that the figure suggests. An
examination of the formula shows why the figure is right. For example, if x > 1,
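Definition. $\lim_{x \to \infty} f(x) = L$ means that for every $\varepsilon > 0$ there is an M such that
$$L - \varepsilon < f(x) < L + \varepsilon \quad\text{whenever } x > M.$$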
and
In the definitions, the condition $x \approx x_0$ is expressed by $0 < |x - x_0| < \delta$, and the
condition $x \approx \infty$ is expressed by $x > M$.
Let us see how our definition of $\lim_{x \to \infty}$ applies to the function
$$f(x) = \frac{1}{x} \quad (x \ne 0).$$
We claim that
$$\lim_{x \to \infty} \frac{1}{x} = 0.$$
Under the definition, given $\varepsilon > 0$, we are supposed to find an M such that
$$-\varepsilon < \frac{1}{x} < \varepsilon \quad\text{whenever } x > M.$$
This is trivial: take $M = 1/\varepsilon$. When $x > 1/\varepsilon$, obviously $0 < 1/x < \varepsilon$. Similarly:
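The choice $M = 1/\varepsilon$ can be spot-checked numerically. The following Python sketch is an illustration only, not a proof; the names `M_for` and `check` are hypothetical. For a given ε it computes M and confirms that 1/x lies strictly between −ε and ε at several sample points x > M.

```python
def M_for(eps):
    """The M chosen in the text for f(x) = 1/x: M = 1/eps."""
    return 1.0 / eps

def check(eps, factors=(1.5, 2.0, 10.0, 1e6)):
    """Test -eps < 1/x < eps at sample points x = k*M with k > 1."""
    M = M_for(eps)
    return all(-eps < 1.0 / (M * k) < eps for k in factors)

for eps in (1.0, 0.1, 1e-3, 1e-9):
    print(eps, M_for(eps), check(eps))
```

A finite sample cannot establish the limit; the actual argument is the inequality $0 < 1/x < \varepsilon$ for every $x > 1/\varepsilon$.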
Definition. $\lim_{x \to -\infty} f(x) = L$ means that for every $\varepsilon > 0$ there is an M such that
$$L - \varepsilon < f(x) < L + \varepsilon \quad\text{whenever } x < M.$$
Definition. $\lim_{x \to x_0^+} f(x) = \infty$ means that for every M there is a $\delta > 0$ such that
$$f(x) > M \quad\text{whenever } x_0 < x < x_0 + \delta.$$
[Figure: the graph of f to the right of $x_0$, lying above the level M on the interval $(x_0, x_0 + \delta)$.]
That is, you can make f(x) as big as you want (i.e., $> M$) by taking x to the right
of $x_0$ and very close to $x_0$ (i.e., between $x_0$ and $x_0 + \delta$).
We need to talk about one-sided limits (as $x \to x_0^+$ or $x \to x_0^-$) because these
often turn out to be different. In some cases, however, the one-sided limits have the
same value. In such cases, $\lim_{x \to x_0} f(x)$ must exist, and must be their common value.
Thus
$$\lim_{x \to 0} \frac{1}{x^2} = \infty.$$
The following two theorems justify the remarks that were made above about
$f(x) = 1/(x^2 - 1)$.
Theorem 1. Suppose that $f(x) > 0$ on an interval $(x_0, x_1)$. If $\lim_{x \to x_0^+} f(x) = 0$, then
$$\lim_{x \to x_0^+} \frac{1}{f(x)} = \infty.$$
Proof. Given $M > 0$. We need to find a $\delta > 0$ such that $1/f(x) > M$ whenever
$x_0 < x < x_0 + \delta$. Let $\varepsilon = 1/M$. By the definition of the statement $\lim_{x \to x_0^+} f(x) = 0$,
we know that there is a $\delta > 0$ such that $f(x) < \varepsilon$ whenever $x_0 < x < x_0 + \delta$. For such x,
$$f(x) < \frac{1}{M},$$
and hence
$$\frac{1}{f(x)} > M.$$
(Remember that $M > 0$, and $f(x) > 0$ for the values of x that we are interested in.)
Similarly, we have:
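Theorem 2. Suppose that $f(x) < 0$ on an interval $(x_0, x_1)$. If $\lim_{x \to x_0^+} f(x) = 0$, then
$$\lim_{x \to x_0^+} \frac{1}{f(x)} = -\infty.$$
Proof. Given $M < 0$. Let $\varepsilon = -1/M > 0$. Let $\delta$ be a positive number such that
$f(x) > -\varepsilon$ whenever $x_0 < x < x_0 + \delta$. For such x,
$$f(x) > -\varepsilon, \qquad 1 < \frac{-\varepsilon}{f(x)}, \qquad -\frac{1}{f(x)} > \frac{1}{\varepsilon}, \qquad \frac{1}{f(x)} < M.$$
(Here we have been reversing inequalities, because we have been dividing by negative
numbers.)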
Following the analogy of the above definitions, you ought to be able to write
your own definitions of the following statements:
$$\text{(?)}\quad \lim_{x \to \infty} \left(1 + \frac{1}{x}\right)^x \quad\text{(?)}$$
We use question marks, because it is not obvious that the indicated limit exists at all:
as $x \to \infty$, $1/x \to 0$, and $1 + 1/x \to 1$. Therefore we have an "indeterminate form
of the type $1^\infty$." We recall, however, a similar situation before where we got an
$$f(u) = (1 + u)^{1/u} \quad\text{and}\quad g(x) = \frac{1}{x},$$
then
$$\left(1 + \frac{1}{x}\right)^x = f(g(x)),$$
Theorem. If $\lim_{x \to x_0} g(x) = u_0 = g(x_0)$ and $\lim_{u \to u_0} f(u) = f(u_0)$, then
$$\lim_{x \to x_0} f(g(x)) = f(u_0).$$
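For the case in which $x \to \infty$, instead of $x \to x_0$, this theorem is still true, and
the proof is virtually the same. That is:
$$\lim_{x \to \infty} f(g(x)) = L.$$
$$x \approx \infty \;\Rightarrow\; g(x) \approx u_0 \;\Rightarrow\; f(g(x)) \approx L.$$
$$\lim_{x \to \infty} f(g(x)) = L.$$
$$f(u) = (1 + u)^{1/u}, \qquad g(x) = \frac{1}{x}, \qquad f(g(x)) = \left(1 + \frac{1}{x}\right)^x,$$
we get immediately: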
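The composite $f(g(x)) = (1 + 1/x)^x$ can be tabulated numerically to see where it is heading. This Python sketch is an illustration only; the name `compound` is hypothetical, and the comparison value is `math.e`, on the standard fact that this limit is e. A table of values suggests the limit but does not prove that it exists.

```python
import math

def f(u):
    """f(u) = (1 + u)^(1/u), defined for u != 0."""
    return (1 + u) ** (1 / u)

def g(x):
    """g(x) = 1/x."""
    return 1 / x

def compound(x):
    """The composite f(g(x)) = (1 + 1/x)^x."""
    return f(g(x))

for x in (1, 10, 100, 10_000, 1_000_000):
    print(x, compound(x))
print("e =", math.e)
```

The printed values increase with x and stay below e, which is consistent with the chain $x \approx \infty \Rightarrow g(x) \approx 0 \Rightarrow f(g(x)) \approx e$.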
Investigate the following functions for maxima, minima, local maxima, local minima,
direction of concavity, and inflection points. Then investigate for limits of the sort defined
in this section.
1. $f(x) = \dfrac{1}{x(x - 2)}$ $(x \ne 0,\ x \ne 2)$
2. $f(x) = \dfrac{1}{(x - 1)(x - 3)}$ $(x \ne 1,\ x \ne 3)$
3. $f(x) = \dfrac{1}{x^2 - x - 6}$ $(x \ne -2,\ x \ne 3)$
4. $f(x) = \dfrac{1}{x^2 + 1}$
5. $f(x) = \dfrac{1}{x^2}$ $(x \ne 0)$
6. $f(x) = \dfrac{x}{x^2 + 1}$
7. $f(x) = \dfrac{x^2}{x^2 + 1}$
8. $f(x) = \dfrac{1}{x^3 + 1}$ $(-1 \ne x)$
9. $f(x) = \dfrac{x + 1}{x^3 + 1}$ $(-1 \ne x)$
10. $f(x) = \dfrac{1}{x^3 - x}$ $(x \ne 0,\ x \ne 1,\ x \ne -1)$
11. $f(x) = \dfrac{1 - x^3}{1 + x^3}$ $(-1 \ne x)$
Investigate:
12. $\lim_{x \to \infty} \left(1 + \dfrac{1}{x^2}\right)^{x^2}$
13. $\lim_{x \to \pi/2} (1 + \cos x)^{\sec x}$
14. $\lim_{x \to 0^+} (1 + \sqrt{x})^{1/\sqrt{x}}$
( Ir
Investigate the following, for $\lim_{x \to \infty}$.
17. f(x) = 1 +-
2x
18. f(x) =
( 2
1 +-
x
r2
19. f(x)(1 L)"'
= +
3. 3r2
= 1 + = (1 + e-xy' 22.
ln x
1 + -
(
2
x
24. Discuss as in Problems 1 through 11
(Here the sticky point is limx-• er,· You ought to be able to figure out what this limit is,
and convince yourself that your answer must be right. But to prove that the answer is
right is an unreasonably hard problem, at this stage.)
25. Find $\lim_{x \to 0^+} x \ln(1/x)$. You need not prove that your answer is right.
26. Is there such a thing as $\lim_{x \to \infty} \sin x$? Why or why not?
27. Is there such a thing as $\lim_{x \to \infty} (1/x) \sin x$? Why or why not?
28. Prove the following:
29. If you borrow a dollar for a year, at 100% simple interest, then at the end of the year
you owe 2 dollars. (A certain Marcus Junius Brutus lent money at this rate, in the first
century B.C. He was also an assassin.) If interest is compounded semiannually, then
at the end of the year you owe
$$\left(1 + \tfrac{1}{2}\right)^2 = \tfrac{9}{4} = 2.25 \text{ dollars}.$$
Suppose now that interest is compounded continuously: the bank passes to the limit,
as n increases without limit, and at the end of the year they charge you the limit. How
much do you owe?
30. Suppose that the basic interest rate is 6%, but interest is compounded continuously,
as in Problem 29. How much do you owe? (To get a numerical answer to this one,
you will need to use one of the tables at the end of the book.)
On several occasions already we have been confronted with problems which did not
appear to involve functions, and have solved them by introducing functions.
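For example, in Section 3.7 we wanted to find the area under the graph of $y = x^4$,
from 0 to 1.
[Figure: the region under the graph of $y = x^4$, and the area F(x) under the graph of $y = t^4$.]
We solved this problem by attacking the more general problem of calculating the
function
$$F(x) = \int_0^x t^4\,dt.$$
We found that
$$F(x) = \frac{x^5}{5}.$$
$$\ln(ab) = \ln a + \ln b,$$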
for every pair of positive numbers a, b. To use the methods of calculus, we had to
introduce functions into the problem. Given $k > 0$, we set
$$f(x) = \ln kx \quad (x > 0),$$
$$g(x) = \ln k + \ln x \quad (x > 0).$$
We then found that $f'(x) = g'(x)$ for every x, and $f(1) = g(1)$. It followed that
$f = g$; and this proved our theorem.
We use the same kind of method to attack problems in maxima and minima which
may be stated in geometric or physical terms. Consider some examples.
Problem 1. A segment of length 1 has its endpoints on the sides of a right angle.
What position for the segment gives maximum area for the resulting triangle?
The first step is to introduce a coordinate system, as shown on the right above.
The endpoints of the segment now lie on the positive ends of the axes.
Let x be the x-coordinate of the endpoint that lies on the x-axis; and let the other
endpoint be (0, y). When x is named, y is determined. Thus there is a function f
which gives y in terms of x. Since
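$$x^2 + y^2 = 1,$$
we have
$$f(x) = \sqrt{1 - x^2} \quad (0 \le x \le 1).$$
And for each x, the area enclosed is
$$A(x) = \tfrac{1}{2} x f(x) = \tfrac{1}{2} x \sqrt{1 - x^2}.$$
We need to investigate the function A for maxima. Now
$$A'(x) = \frac{1}{2}\left[x \cdot \frac{-x}{\sqrt{1 - x^2}} + \sqrt{1 - x^2}\right] = \frac{1}{2} \cdot \frac{-x^2 + (1 - x^2)}{\sqrt{1 - x^2}} = \frac{1}{2} \cdot \frac{1 - 2x^2}{\sqrt{1 - x^2}} \quad (0 \le x < 1).$$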
Therefore $A'(x) = 0$ when $x = \pm\sqrt{2}/2$. Since we are concerned only with numbers
on the interval [0, 1], only $x = \sqrt{2}/2$ is of interest to us. Here $A = \tfrac{1}{4}$. Any maximum
of A is surely an ILMax, because $A(0) = A(1) = 0$, and $A(x) > 0$ for $0 < x < 1$.
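The conclusion can be cross-checked numerically. The Python sketch below is an illustration under stated assumptions: the grid of 1001 points and the names `x_star` and `x_best` are choices made here, not part of the text. It evaluates A and A' at $x = \sqrt{2}/2$ and compares with a brute-force search over [0, 1].

```python
import math

def A(x):
    """Area of the triangle: A(x) = (1/2) x sqrt(1 - x^2), 0 <= x <= 1."""
    return 0.5 * x * math.sqrt(1 - x * x)

def A_prime(x):
    """A'(x) = (1/2)(1 - 2x^2)/sqrt(1 - x^2), valid for 0 <= x < 1."""
    return 0.5 * (1 - 2 * x * x) / math.sqrt(1 - x * x)

x_star = math.sqrt(2) / 2                 # the critical point found in the text

# Brute-force comparison on a uniform grid (an assumed, arbitrary resolution).
grid = [i / 1000 for i in range(1001)]
x_best = max(grid, key=A)

print("A(x*)  =", A(x_star))
print("A'(x*) =", A_prime(x_star))
print("grid maximizer =", x_best)
```

The grid search only brackets the maximizer to within the grid spacing; the exact location comes from solving A'(x) = 0.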
5.4 The Introduction of Functions into Geometric Problems 225
[Figure: a function on [a, b] attaining its maximum value M at an interior point x (left) and at the endpoint b = x (right).]
Therefore
$$A'(\theta) = \tfrac{1}{4}(\cos 2\theta) \cdot 2 = \tfrac{1}{2} \cos 2\theta.$$
The only point $\theta$ on the interval $[0, \pi/2]$ where $A'(\theta) = 0$ is the point where
$$\theta = \frac{\pi}{4}.$$
We claim, without further investigation of derivatives, that this must be where the
maximum occurs. (As in the previous discussion, there must be a maximum
somewhere; this is not at an endpoint 0 or $\pi/2$; it is therefore an interior local maximum;
at an ILMax, $A'(\theta) = 0$; and $\theta = \pi/4$ is the only point of the interval at which
$A'(\theta) = 0$.)
Setting $\theta = \pi/4$, we get the maximum value of A as
$$A\left(\frac{\pi}{4}\right) = \tfrac{1}{4} \sin\left(2 \cdot \frac{\pi}{4}\right) = \tfrac{1}{4} \sin\frac{\pi}{2} = \tfrac{1}{4},$$
as before.
On reflection, you may find a way to solve this problem by purely geometrical
methods, without taking any derivatives or even introducing any functions. The
geometric method is easier if you think of it. Even in cases where elementary methods
can be made to work, however, calculus does the same job methodically.
figure. What is the length of the shortest path from A to the x-axis to B? And where
should the path touch the x-axis, for this minimum to be attained?
In other words, for what choice of $P = (x, 0)$ is the sum of the distances $AP + PB$ a minimum?
Solution. Let
$$f(x) = AP + PB = \sqrt{1^2 + x^2} + \sqrt{(3 - x)^2 + 2^2} = \sqrt{1 + x^2} + \sqrt{x^2 - 6x + 13}.$$
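Then
$$f'(x) = \frac{x}{\sqrt{1 + x^2}} + \frac{x - 3}{\sqrt{x^2 - 6x + 13}} = \frac{x\sqrt{x^2 - 6x + 13} + (x - 3)\sqrt{1 + x^2}}{\sqrt{1 + x^2}\,\sqrt{x^2 - 6x + 13}}.$$
Therefore $f'(x) = 0$ when
$$x\sqrt{x^2 - 6x + 13} + (x - 3)\sqrt{1 + x^2} = 0.$$
Squaring and simplifying gives $x^2 + 2x - 3 = 0$, whose roots are 1 and $-3$; of these, only x = 1 makes the numerator vanish.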
To examine second derivatives looks hard. Let us try to use reasoning instead.
1) When x decreases past 0, AP increases, and so does PB. The same is true when x
increases past 3. Therefore, in searching for a minimum, we can restrict the search to,
say, the interval $[-1, 4]$.
2) Suppose that we know that the function has a minimum, somewhere on the
interval $[-1, 4]$. Then the minimum must be an ILMin, at which $f'(x) = 0$. There
is only one such point on our interval, namely, x = 1. Therefore the minimum
must be at x = 1.
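The location of the minimum can be confirmed numerically. This Python sketch assumes, from the formula for f, that A = (0, 1) and B = (3, 2); the bisection helper `bisect_root` is a hypothetical name written out here rather than taken from any library. It finds the zero of f' on [-1, 4] and compares the minimum path length with the straight-line distance from the mirror image (0, -1) of A to B.

```python
import math

def f(x):
    """Path length AP + PB for P = (x, 0), assuming A = (0, 1), B = (3, 2)."""
    return math.sqrt(1 + x * x) + math.sqrt((3 - x) ** 2 + 4)

def f_prime(x):
    """Derivative of the path length."""
    return x / math.sqrt(1 + x * x) + (x - 3) / math.sqrt((3 - x) ** 2 + 4)

def bisect_root(func, lo, hi, tol=1e-12):
    """Plain bisection; assumes func changes sign exactly once on [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if func(lo) * func(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

x_min = bisect_root(f_prime, -1.0, 4.0)
print("zero of f':", x_min)
print("minimum length:", f(x_min))
print("distance from (0, -1) to (3, 2):", math.hypot(3, 3))
```

The last line corresponds to the purely geometric solution: reflecting A across the x-axis turns the broken path into a straight segment of length $3\sqrt{2}$.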
Theorem 2 (Existence of minima). If f is continuous on [a, b], then f has a minimum
value on [a, b].
The proof is easy, granted that Theorem 1 is true. Since $-f$ is continuous, it has
a maximum; and any maximum of $-f$ is a minimum of f.
[Figure: the graphs of f and $-f$ on [a, b], with the levels M and $-M$ marked.]
Here again, once the problem is solved, you may be able to think of a simpler
attack on it. But the methods of calculus work in any case.
Problem 3. Find the right circular cylinder of largest volume, inscribed in a sphere
of radius 1.
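Let x be the radius of the base of the cylinder. Then half the altitude is $\sqrt{1 - x^2}$, and the volume is
$$V(x) = 2\pi x^2 \sqrt{1 - x^2} \quad (0 \le x \le 1).$$
This gives
$$V'(x) = 2\pi\left(2x\sqrt{1 - x^2} + x^2 \cdot \frac{-x}{\sqrt{1 - x^2}}\right) = 2\pi \cdot \frac{2x - 3x^3}{\sqrt{1 - x^2}}.$$
Therefore
$$V'(x) = 0 \iff 2x - 3x^3 = 0 \iff x(3x^2 - 2) = 0.$$
Since x must lie on $[0, 1)$, we find that $V'(x) = 0$ only when x = 0 or $x = \sqrt{2/3}$.
Now V has a maximum, because V is continuous on [0, 1]. And this must be an
ILMax, because $V(0) = 0 = V(1)$, and $V(x) > 0$ everywhere else on [0, 1]. Therefore
$V'(x) = 0$ at the maximum. Therefore the maximum occurs at $x = \sqrt{2/3}$. Hence
the maximum volume is
$$V\left(\sqrt{2/3}\right) = 2\pi \cdot \frac{2}{3} \cdot \sqrt{\frac{1}{3}} = \frac{4\pi}{3\sqrt{3}}.$$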
There is another function that we might have used to solve the same problem. We
might have written
so that
$$V'(y) = 0 \iff 3y^2 = 1 \iff y = \sqrt{1/3} \ \text{ or } \ y = -\sqrt{1/3}.$$
Here again only the positive number applies, because y must be on the interval [0, 1].
As before, we conclude that the maximum value occurs at $y = \sqrt{1/3}$.
The second method is simpler. This sort of thing happens often. It is therefore a
good idea to have a quick look at all of the functions that it seems natural to try,
before doing any hard work with any one of them. If the first function that you try
looks simple, there is no point in examining others.
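The two ways of setting up the cylinder problem can be compared directly. In the Python sketch below, `V_radius` uses the radius x as the variable, and `V_half_altitude` uses half the altitude y; the second formula, $V(y) = 2\pi(1 - y^2)y$, is an assumption reconstructed from the condition $V'(y) = 0 \iff 3y^2 = 1$, and both function names are hypothetical.

```python
import math

def V_radius(x):
    """Volume in terms of the base radius x: 2*pi*x^2*sqrt(1 - x^2)."""
    return 2 * math.pi * x * x * math.sqrt(1 - x * x)

def V_half_altitude(y):
    """Volume in terms of half the altitude y (assumed form): 2*pi*(1 - y^2)*y."""
    return 2 * math.pi * (1 - y * y) * y

x_star = math.sqrt(2 / 3)    # critical point of the first function
y_star = math.sqrt(1 / 3)    # critical point of the second function
expected = 4 * math.pi / (3 * math.sqrt(3))

print(V_radius(x_star), V_half_altitude(y_star), expected)
print("x*^2 + y*^2 =", x_star**2 + y_star**2)   # same point on the unit circle
```

Both parameterizations describe the same cylinder, so they give the same maximum volume; the second avoids the square root and is easier to differentiate.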
Our third problem shows a danger which should be remembered hereafter. We
might have supposed that the inscribed cylinder attains its maximum volume at the
stage where the inscribed rectangle (in the cross section) attains its maximum area.
But this is false: it is easy to show that the inscribed rectangle of maximum area is a
square; and the cross section of the maximal cylinder is a rectangle of base $2\sqrt{2/3}$ and
altitude $2\sqrt{1/3}$. Therefore we should never assume without proof that two
maximum-minimum problems are equivalent.
A further word of caution: In establishing that a certain $x_0$ gives a maximum or
minimum, you may use the theorems of the preceding sections. Under certain
conditions, you may avoid these theorems (and the calculations that they require) by the
sort of reasoning that we have used in the problems above. But in any case, you must
use either the theorems of the preceding sections or a reasoning process which justifies
your conclusions. To find a point x0 where a derivative vanishes and hence infer that
your problem is solved is a mistake. For one thing, x0 may give a minimum when you
were looking for a maximum, or vice versa. For another thing, x0 may give a point
of inflection.
1. Find the area of the largest rectangle that can be inscribed in a semicircle of radius a.
2. Find the area of the largest rectangle that can be inscribed in an equilateral triangle
whose sides have length a.
3. Find the area of the triangle with the smallest area which contains a square with side a.
4. Find the perimeter of the triangle with the smallest perimeter which contains a square
with side a.
5. A rectangular field has one side along a river and a fence along the other three sides.
If the total length of the fence is k, what is the maximum possible area of the field?
6. Given a rectangular field with one side along a river, as in Problem 5. If the area of
the field is A, what is the minimum possible length of the fence?
7. If a rectangular wooden beam is supported horizontally at its ends, then the maximum
weight that it can support at its midpoint is proportional (at least approximately) to its
width, and to the square of its thickness. That is, $W = k \cdot x \cdot y^2$, where x is the width,
y is the thickness, and k is a constant depending on the wood (and on the units of
length and weight).
Suppose that such a beam is to be cut from a cylindrical log of radius a, in such a
way as to maximize W. What should be the width and the thickness?
8. An open pan is to be made out of a square metal sheet, by cutting out the square pieces
from the corners of the sheet and folding up the sides of the metal that is left. (The
square pieces are to be thrown away.) If the sheet has edges of length a, what is the
volume of the pan of largest volume that can be made in this way?
9. An open pan, of the sort described in the preceding problem, has a total surface area
of 128 sq. in. What is the largest possible volume?
10. Find the closed circular cylinder with volume 10 cu. in. and surface area as small as
possible.
11. Solve the same problem, given that the cylinder is open at one end.
12. Solve the same problem, given that the cylinder is open at both ends. (It sits on a flat
table and holds flour.)
13. A piece of sheet metal, n feet long and w feet wide, is to be bent so as to form a trough
n feet long, with open top, open ends, and triangular cross sections. What is the greatest
possible cross sectional area?
15. In a rectangular parallelepiped, with a square base, the total length of the edges is k.
What is the largest possible volume?
16. A rectangle is to be inscribed in the region above the x-axis and below the graph of
$y = 1 - x^2$. Find the area of the rectangle of maximum area.
18. Find the rectangle of maximum area contained in the region above the line y = t,
to the right of the line x = 1, and under the graph of y = 1/x.
19. A rectangle is inscribed in the region $R = \{(x, y) \mid |x| + |y| \le 1\}$, in such a way as to
maximize the area. Find the area of the rectangle.
Find the values of x at which the following functions take on their maximum values,
and justify your answers. You need not find the maximum values of the functions.
20. $f(x) = \dfrac{x}{1 + x^2}$
21. $g(x) = \dfrac{x}{1 + x^4}$
Problems 24 through 27. Investigate the preceding four functions for minimum values.
28. An isosceles triangle has base d and altitude h. Find the area of the rectangle of largest
area that can be inscribed in it.
29. Given a triangle with angles of 30°, 60°, and 90°, there are three plausible ways of
inscribing in it a rectangle of maximum area; the rectangle may have a side lying along
any one of the three sides of the triangle. Show that all three of these "maximal"
rectangles are really maximal; that is, show that they all have the same area.
30. Show that there are some triangles for which the conclusion in Problem 29 does not hold.
31. Show, however, that the conclusion of Problem 29 holds for a class of triangles which
includes more than the 30°-60°-90° triangles.
32. Consider the curve which is the graph of the equation x2 + 4y2 = 4. Find the area
of the rectangle of largest area that can be inscribed in this curve.
33. A right circular cone has a base of diameter d, and altitude h. Find the volume of the
largest right circular cylinder that can be inscribed in it.
34. Find the area of the isosceles triangle of maximum area that can be inscribed in a circle
of radius r.
35. Find the volume of the right circular cylinder of maximum volume that can be inscribed
in a sphere of radius r.
36. Suppose that in Problem 34 the word "isosceles" is omitted. Is the solution of the
resulting problem the same as before?
37. Similarly, discuss the problem obtained by omitting the word "right" in Problem 35.
38. Find the length of the longest ladder that can be carried (in a horizontal position)
around the corner shown on the left below. The segment from P to Q shows a possible
position of the ladder.
39. In the right-hand figure above, the circle (of radius r) is inscribed in the right angle
$\angle BAC$. What is the minimum possible area of $\triangle ADE$?
**40. Suppose that in Problem 39 we do not require that $\angle BAC$ be a right angle. Given
that $\angle BAC$ has measure $\alpha$, find the minimum possible area of $\triangle ADE$, in terms of r and $\alpha$.
In the preceding section, we found that under some conditions we could locate
maximum and minimum values merely by finding a point where the derivative vanishes.
We shall now see that in some cases we can locate maximum and minimum values
without calculating the function. Consider first a simple problem, from Section 5.4.
Problem 1. A segment of length 1 has its endpoints on the sides of a right angle.
What position for the segment gives maximum area for the resulting triangle?
As in Section 5.4, we set up the axes as shown. Let x be the x-coordinate of the
lower endpoint of the segment; and for each x from 0 to 1, let f(x) be the y-coordinate
of the other endpoint. Note that we are entitled to use functional notation: f(x)
really is determined when x is named. And for each x, we have
$$x^2 + [f(x)]^2 = 1^2,$$
because $x^2 + [f(x)]^2$ is the square of the length of the segment. Therefore the function
f satisfies the equation
$$x^2 + f^2 = 1 \quad (0 \le x \le 1). \tag{1}$$
The area of the triangle is
$$A(x) = \tfrac{1}{2} x \cdot f(x). \tag{2}$$
Now in (1), the left-hand member is a function, whose derivative is $2x + 2ff'$.
But this function is known to be a constant, equal to 1 for every x from 0 to 1.
Therefore
$$x + ff' = 0 \quad (0 < x < 1). \tag{1'}$$
Here, of course, we are assuming that f has a derivative, for $0 < x < 1$, but this
must be true, because the graph of f is a quadrant of a circle. Obviously
$$A'(x) = \tfrac{1}{2} x f'(x) + \tfrac{1}{2} f(x). \tag{2'}$$
The maximum of A(x) must be an ILMax; and so, at the maximum of A(x), we have
$$xf' + f = 0. \tag{2''}$$
We now know:
$$f' = -\frac{x}{f} \quad\text{on } (0, 1), \qquad f' = -\frac{f}{x} \quad\text{at the maximum}.$$
Therefore
$$-\frac{x}{f} = -\frac{f}{x} \quad\text{and}\quad x = f(x).$$
That is, the maximum is achieved when the triangle is isosceles.
This discussion has been long, because ideas needed to be explained; but once the
ideas are understood, the calculations are simple:
$$x^2 + f^2 = 1, \qquad 2x + 2ff' = 0, \qquad f' = -\frac{x}{f};$$
$$A(x) = \tfrac{1}{2} x \cdot f(x), \qquad A'(x) = \tfrac{1}{2} x \cdot f'(x) + \tfrac{1}{2} f(x);$$
and hence
$$-\frac{x}{f} = -\frac{f}{x} \quad\text{and}\quad x = f(x).$$
In this case, of course, it was not much trouble to find a formula for f and use it.
But in many cases, equations like
$$x^2 + f^2 = 1$$
are more convenient than formulas for the function f. These are called functional
equations. Obviously every trigonometric identity is a functional equation. Usually,
however, we use the word identity when the function is known, and the term
functional equation when the equation itself is being used as a working definition of the
function.
Consider another example, Problem 3 in Section 5.4.
As before, we show a vertical cross section of the figure. Let x be the radius of the
inscribed cylinder, and let f(x) be half the altitude. Then
$$x^2 + f^2 = a^2, \qquad 2x + 2ff' = 0,$$
and
$$f' = -\frac{x}{f} \quad (0 < x < a). \tag{3}$$
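so that
$$V' = 2\pi(x^2 f' + 2xf).$$
At the maximum, $V' = 0$, and so
$$f' = -\frac{2xf}{x^2} = -\frac{2f}{x} \quad\text{(at Max)}. \tag{4}$$
Therefore, at the maximum, both our formulas for f' must hold, and so
$$-\frac{x}{f} = -\frac{2f}{x}$$
and
$$f = \frac{\sqrt{2}}{2} \cdot x. \tag{5}$$
For a = 1, this tells us that
$$x = \frac{\sqrt{6}}{3} = \sqrt{\frac{2}{3}},$$
as before.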
Note, however, that in a way the most natural answer to a problem like this is a
shape, rather than a size. And the solution based on the functional equation ordinarily
gives the answer in the form of a shape, that is, in the form of a ratio between two
measurements. For example, in the preceding problem the constant a, which
determines the size of the whole configuration, disappeared immediately when we
differentiated in the equation $x^2 + f^2 = a^2$. Our final equation (5) means that at the
maximum,
$$2f(x) = \sqrt{2}\, x,$$
that is, the altitude of the maximum cylinder is equal to $\sqrt{2}$ times the radius of its base.
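That the answer is a shape, independent of the size a, can be seen numerically. The Python sketch below is an illustration under stated assumptions: it uses a brute-force grid search with an arbitrary resolution (`steps=20000`), and the names `best_radius` and `shape_ratio` are hypothetical. For several sphere radii a, it maximizes the volume and reports the ratio of altitude to base radius.

```python
import math

def best_radius(a, steps=20000):
    """Grid search for the base radius x in (0, a) that maximizes the volume."""
    best_x, best_v = 0.0, 0.0
    for i in range(1, steps):
        x = a * i / steps
        half_altitude = math.sqrt(a * a - x * x)
        v = 2 * math.pi * x * x * half_altitude
        if v > best_v:
            best_x, best_v = x, v
    return best_x

def shape_ratio(a):
    """Altitude divided by base radius, for the maximal cylinder in a sphere of radius a."""
    x = best_radius(a)
    return 2 * math.sqrt(a * a - x * x) / x

for a in (1.0, 2.0, 5.0):
    print(a, shape_ratio(a))
print("sqrt(2) =", math.sqrt(2))
```

The ratios agree with $\sqrt{2}$ to within the resolution of the grid, whatever the value of a; the functional-equation method gets this ratio without ever solving for x.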
5.5 The Use of Functional Equations as Shortcuts 235
The answer is also a shape when the problem is to find the rectangle of maximum
area in a given circle:
$$x^2 + f^2 = a^2, \qquad 2x + 2ff' = 0,$$
$$-\frac{x}{f} = f' = -\frac{f}{x} \quad\text{and}\quad x = f,$$
because x and f are both positive. This is a qualitative answer, as it should be: it
says that the maximum rectangle is a square. The constant a has disappeared,
because the shape of the maximum rectangle is the same for all circles.
In the following problem set, you will find more cases in which maxima and
minima can most conveniently be found by using functional equations. Meanwhile
let us look carefully at what happens when we take the derivative on each side of a
functional equation. The ideas here are illustrated by a simple case. When we write
$$x^2 + f^2 = a^2 \tag{6}$$
$$\Rightarrow\quad 2x + 2ff' = 0, \tag{7}$$
we are claiming that every differentiable function which satisfies Eq. (6) also satisfies
Eq. (7). It often happens that there is more than one such function f. For example,
consider
$$f_1(x) = \sqrt{a^2 - x^2}, \qquad f_2(x) = -\sqrt{a^2 - x^2}.$$
Here
$$f_1'(x) = \frac{-x}{\sqrt{a^2 - x^2}} = \frac{-x}{f_1(x)},$$
and
$$f_2'(x) = \frac{x}{\sqrt{a^2 - x^2}} = \frac{-x}{-\sqrt{a^2 - x^2}} = \frac{-x}{f_2(x)}.$$
Therefore
$$f_1(x) f_1'(x) = -x, \quad\text{and}\quad f_2(x) f_2'(x) = -x.$$
Therefore
$$2x + 2 f_1 f_1' = 0, \qquad 2x + 2 f_2 f_2' = 0. \tag{8}$$
That is, both $f_1$ and $f_2$ satisfy (7). A figure makes it obvious what is going on here.
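A numerical check makes the same point. In this Python sketch the two branches are taken to be the upper and lower semicircles $\pm\sqrt{a^2 - x^2}$ (an assumption consistent with the derivatives computed above), the derivative is approximated by a central difference, and the names `derivative` and `residual` are hypothetical. The residual $2x + 2ff'$ should be close to zero for both branches.

```python
import math

def f1(x, a=1.0):
    """Upper branch: sqrt(a^2 - x^2)."""
    return math.sqrt(a * a - x * x)

def f2(x, a=1.0):
    """Lower branch: -sqrt(a^2 - x^2)."""
    return -math.sqrt(a * a - x * x)

def derivative(func, x, h=1e-6):
    """Central-difference approximation to func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

def residual(func, x):
    """Left-hand side of Eq. (7): 2x + 2 f f'."""
    return 2 * x + 2 * func(x) * derivative(func, x)

for x in (-0.8, -0.3, 0.0, 0.5, 0.9):
    print(x, residual(f1, x), residual(f2, x))
```

The residuals vanish only up to the error of the finite-difference approximation; the exact statement is the algebra in the text.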
$$y = x^3 - x \tag{9}$$
looks like the left-hand figure below.
Therefore the graph of
$$x = y^3 - y \tag{10}$$
looks like the right-hand drawing below.
We have interchanged x and y in Eq. (9), and reflected the graph across the line
5.5 The Use of Functional Equations as Shortcuts 237
y = x. This gives the curve C which is the graph of (10). C is not a function-graph.
But C is the union of the graphs of three functions fi,f2,f3, as indicated in the figure.
And each of the functions ft> h, and /3 satisfies the functional equation
x
=/3 - f
Therefore each of these functions satisfies the differential equation
1
= 3 ·Pf' - !'.
This is what we are claiming when we differentiate the functional equation, and write
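The claim can be checked numerically for each of the three branches of $x = f^3 - f$. The sketch below is ours, not part of the text: it finds each branch by bisection (the brackets are chosen by hand and work only for |x| less than about 0.38), estimates f' by a difference quotient, and evaluates $1 - (3f^2f' - f')$, which ought to be nearly 0.

```python
import math

R = 1 / math.sqrt(3)  # y-values where the graph of y^3 - y has a horizontal tangent
# One bracket per branch: lower, middle, upper (an assumption of this sketch).
BRACKETS = [(-2.0, -R), (-R, R), (R, 2.0)]

def branch(k, x):
    """Solve y^3 - y = x for y on branch k (0, 1, or 2) by bisection."""
    lo, hi = BRACKETS[k]
    g = lambda y: y ** 3 - y - x
    for _ in range(200):
        mid = (lo + hi) / 2
        if (g(lo) < 0) == (g(mid) < 0):
            lo = mid   # the sign change is in the right-hand half
        else:
            hi = mid   # the sign change is in the left-hand half
    return (lo + hi) / 2

def residual(k, x, h=1e-6):
    """Return 1 - (3 f^2 f' - f') on branch k, with f' estimated numerically."""
    f = branch(k, x)
    fprime = (branch(k, x + h) - branch(k, x - h)) / (2 * h)
    return 1 - (3 * f * f * fprime - fprime)

for k in range(3):
    print(k, branch(k, 0.2), residual(k, 0.2))
```

The three values of f at x = 0.2 are different, but the residual is close to 0 on every branch.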
In Problems 1 through 10 below, the notation 5.4.n refers to Problem n of Problem Set
5.4. In each of these cases, the indicated ratio is to be found by the method based on functional
equations.
1. In 5.4.1, find altitude/base, at the maximum.
2. In 5.4.2, same.
3. In 5.4.5, same, using the side parallel to the river as base.
4. In 5.4.7, find y/x, at the maximum.
5. In 5.4.14, let l be the length of the rectangular side and let w be the width. Find w/l
at the maximum.
6. In 5.4.15, let h be the altitude and let e be the length of each edge of the base. Find
h/e, at the maximum.
$x^4 + [f(x)]^4 = 1$.
14. a) Let $n = 10^{10^{10}}$. Sketch the graph of $x^n + y^n = 1$. [Hint: A commonly used drawing
$x + ff' = 0$.
(You need not show that the solutions that you describe are the only ones.)
16. Given that f and f' are continuous, let
*17. Now show that your list of solutions, in Problem 15, is complete.
18. Let f be the function whose graph is the union of (a) the lower left-hand quadrant of
the circle with center at (0, 1) and radius 1 and (b) the upper right-hand quadrant of the
circle with center at (0, -1) and radius 1. Show that f is a solution of the differential
equation
$[f'(x)]^2 = [x + f(x)f'(x)]^2$,
except, of course, at the endpoints x = ±1, where the tangent lines are vertical and the
function has no derivative. As a start, observe that at x = 0, the tangent to the graph
5.6 THE COMPLETENESS OF R AND THE EXISTENCE OF MAXIMA
In Section 5.4 and later, we have used the fact that, if f is continuous on [a, b],
then f has a maximum value on [a, b]. In Section 5.4 this theorem was used as a
shortcut in finding maximum values, but this is only one of the uses of the theorem.
In fact, the theorem is part of the foundation of the calculus, as we shall see.
In proving it, we shall need to use, for the first time, the fact that the number line
has no holes in it. As a guide in giving an exact description of this property of the
number system, let us consider what happens when you remove a point from the
number line, thus getting a system which really does have a hole in it.
Let A be the set of all negative numbers, and let B be the set of all positive
numbers. We mean strictly positive and strictly negative, so that 0 belongs neither
to A nor to B. Then
[Figure: the number line with the point 0 removed, split into $R^-$ and $R^+$.]
Evidently this situation could not have arisen if we had not excluded 0: if we
put 0 in A, then 0 would be the greatest element of A; and if we put 0 in B, then 0
would be the least element of B. Thus the situation described in (3) can arise only in a
number system with a hole in it, and so the following statement conveys the idea that
there are no holes in R:
The Dedekind Cut Postulate (DCP). Suppose R is expressed as the union of two
nonempty sets A and B, such that every element of A is less than every element of B.
Then either A has a greatest element or B has a least element.
[Figure: the number line cut into the sets A and B at the point $x_0$.]
for every i.
For example, if
$[a_i, b_i] = \left[-\frac{1}{i}, \frac{1}{i}\right]$ for every i,
then the sequence is nested. This sequence "closes down on 0." That is, 0 lies in each
of the intervals in the sequence, and 0 is the only number that lies in all of them.
A more important example is as follows. Given a circle of radius 1, let $p_n$ be the
perimeter of an inscribed regular (n + 2)-gon, and let $q_n$ be the perimeter of a circumscribed
regular (n + 2)-gon. Evidently
and
for each i.
of closed intervals. And this sequence "closes down on $2\pi$." That is, $2\pi$ lies in all of
the intervals in the sequence, and no other number lies in all of them.
The following postulate says that every nested sequence of intervals closes down
on at least one point.
The Nested Interval Postulate (NIP). For every nested sequence of closed intervals
there is a number x which lies in every interval in the sequence.
This conveys the idea that the number system is complete. Suppose, for example,
that $2\pi$ were missing, so that the number system had a hole in it where $2\pi$ ought to be.
Then no number at all would lie on all of the intervals $[p_1, q_1], [p_2, q_2], \ldots$ that we
have just discussed. Similarly, if $\sqrt{2}$ were missing, then there would be a nested
sequence of closed intervals closing down on no number whatever. (We could use
$[\sqrt{2} - 1/i, \sqrt{2} + 1/i]$ as the ith interval in the sequence.)
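The nested sequence of polygon perimeters can be tabulated. The sketch below assumes the standard trigonometric expressions for the perimeters of regular m-gons inscribed in and circumscribed about a circle of radius 1, namely $2m \sin(\pi/m)$ and $2m \tan(\pi/m)$; these formulas are not derived in this section, and the program uses the machine value of $\pi$, so it illustrates the squeeze rather than computes $\pi$ from scratch.

```python
import math

def p(n):
    """Perimeter of the inscribed regular (n + 2)-gon, circle of radius 1."""
    m = n + 2
    return 2 * m * math.sin(math.pi / m)

def q(n):
    """Perimeter of the circumscribed regular (n + 2)-gon, circle of radius 1."""
    m = n + 2
    return 2 * m * math.tan(math.pi / m)

for n in (1, 2, 4, 10, 100, 1000):
    # Each interval [p(n), q(n)] contains 2*pi, and the widths shrink toward 0.
    print(n, p(n), q(n), q(n) - p(n))
```

The left endpoints increase, the right endpoints decrease, and $2\pi$ stays between them, which is what it means for the sequence to close down on $2\pi$.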
Using the nested interval postulate (NIP), we shall prove the following theorem:
Theorem 1. If f is continuous on [a, b], then f has an upper bound on [a, b].
That is, there is a number M such that $f(x) \le M$ for each x of [a, b].
Lemma. If f is unbounded above on an interval [c, d], then f is unbounded above on
at least one of the halves of [c, d].
By the halves of [c, d] we mean the intervals [c, (c + d)/2] and [(c + d)/2, d].
The proof of the lemma is immediate: if f has an upper bound $M_1$ on [c, (c + d)/2]
and has an upper bound $M_2$ on [(c + d)/2, d], then f has an upper bound on [c, d].
We merely use the larger of the bounds $M_1$ and $M_2$.
[Figure: the graph of f over [c, d], with the interval divided at its midpoint (c + d)/2.]
We proceed to prove the theorem. For short, we say that an interval is good
if f is bounded above on the interval; and we say that an interval is bad if it is not
good. Thus we need to prove that [a, b] is good. We start by supposing that [a, b]
is bad, and we shall show this assumption leads to a contradiction.
If [a, b] is bad, then it follows that at least one of the halves of [a, b] must be bad.
Let [a1, b1] be a bad half of [a, b]. For the same reason, [a1, b1] must have a bad half.
Let [a2, b2] be a bad half of [a1, b1]. Continuing this process to infinity, we get a
sequence
and so
$b_i - a_i = \frac{1}{2^i}(b - a)$.
[Figure: the graph of f near $\bar{x}$, lying below the horizontal line at height $M = f(\bar{x}) + \epsilon$.]
Thus
$|x - \bar{x}| < \delta \Rightarrow f(\bar{x}) - \epsilon < f(x) < f(\bar{x}) + \epsilon$,
and so $f(\bar{x}) + \epsilon$ is an upper bound for f on the interval $(\bar{x} - \delta, \bar{x} + \delta)$. But since
we have
$b_i - a_i < \delta$
for some i. For such an i, the closed interval $[a_i, b_i]$ lies inside the open interval
$(\bar{x} - \delta, \bar{x} + \delta)$. That is,
$\bar{x} - \delta < a_i \le \bar{x} \le b_i < \bar{x} + \delta$.
(This is easy to see geometrically, because $[a_i, b_i]$ contains the midpoint $\bar{x}$ of the open
interval, and is less than half as long.)
But this situation is impossible, because f is bounded above on $(\bar{x} - \delta, \bar{x} + \delta)$
and is not bounded above on the smaller interval $[a_i, b_i]$. This contradiction completes
the proof of the theorem.
One of the ideas that we have just used is going to be useful later. We therefore
record it as a theorem:
[Figure: the graph of $y = \mathrm{Tan}^{-1} x$, approaching the horizontal line $y = \pi/2$.]
When x is far to the right, $\mathrm{Tan}^{-1} x$ is close to $\pi/2$, but $\mathrm{Tan}^{-1} x$ is never actually equal
to $\pi/2$ for any x. Similarly, when x is far to the left, $\mathrm{Tan}^{-1} x$ is close to $-\pi/2$, but
$-\pi/2$ is not one of the values of the function. On the other hand, it is easy to see that
the numbers $\pi/2$ and $-\pi/2$ are related to the function $\mathrm{Tan}^{-1}$ in a special way: $\pi/2$
is an upper bound of the function; and of all upper bounds of the function, $\pi/2$ is the
smallest. We express this by writing
$k = \sup f$
[Figure: a set B on the number line, with right-hand endpoint b.]
Here b is an upper bound of B, and b is smaller than all other upper bounds of B.
Therefore
$b = \sup B$.
Consider now
$B = \left\{\frac{1}{2}, \frac{2}{3}, \frac{3}{4}, \ldots, \frac{n-1}{n}, \ldots\right\}$.
Here the upper bounds of B are the points of the interval $[1, \infty)$, and $\sup B = 1$.
In each of these cases, starting with a nonempty set B which is bounded above,
we have found that the upper bounds form an interval of the type $[k, \infty)$, and $k = \sup B$.
The following postulate says that this is what always happens:
The Least Upper Bound Postulate (LUBP). Let B be a nonempty set of numbers.
If B has an upper bound, then B has a supremum.
Using the least upper bound postulate, we shall show that no continuous function
can behave like $\mathrm{Tan}^{-1}$ if its domain is a closed interval:
$k = \sup f$.
Then $f(x) \le k$ for every x on [a, b]. We need to show that f(x) = k for some x.
Suppose not, and let
$g(x) = \frac{1}{k - f(x)}$ $(a \le x \le b)$.
Then g is continuous. But g is unbounded. For suppose that
$g(x) \le M$ for $a \le x \le b$.
Then
$\frac{1}{k - f(x)} \le M$, $\frac{1}{M} \le k - f(x)$,
and
$f(x) \le k - \frac{1}{M}$ for $a \le x \le b$.
This is impossible, because k is the least of the upper bounds of f.
Thus, if f has no maximum, there is a continuous function g which is unbounded
on [a, b]. This contradicts Theorem 1, and so completes the proof of Theorem 3.
We have already observed, in Section 5.4, that the existence of maxima implies
the existence of minima. Therefore
1. Let B be the set of all rational numbers p/q for which $p^2/q^2 < 2$. What is sup B?
2. Consider a circle of radius 1. For each polygon P inscribed in the circle, let k(P) be the
perimeter of P. Let B be the set of all numbers k(P). What is sup B?
3. Consider the graph of $f(x) = \sin x$, $0 \le x \le \pi$. Suppose that we cut up the interval
$[0, \pi]$ into little intervals, in any way, using subdivision points
$0 = x_1 < x_2 < \cdots < x_i < x_{i+1} < \cdots < x_n = \pi$.
Over each little interval $[x_i, x_{i+1}]$ we set up the tallest
possible inscribed rectangle with $[x_i, x_{i+1}]$ as base. Let s be the sum of the areas of the
rectangles. Let B be the set of all numbers s which are obtainable in this way. What is
sup B? (A numerical answer is called for here.)
4. Let B be any set of numbers. If $b \in B$, and b is larger than every other element of B,
then b is called the greatest element of B, and we write b = Max B. Question: If B
has an upper bound, does it follow that B has a Max?
5. Suppose that we had defined bounds and suprema in the following way:
"Let B be a set of numbers, and let k be a number. If x < k, for every x in B,
then k is a strict upper bound of B. If k is a strict upper bound of B, and is smaller
than every other strict upper bound of B, then k = sup B."
a) What is the difference between this "definition" and the usual definition of upper
bounds and suprema?
Under the new "definition" of "supremum," which if any of the following statements
are true?
6. If B is a set of numbers, then -B denotes the set obtained when we replace every
element x of B by its negative -x. That is,
$-B = \{-x \mid x \in B\}$.
7. If k is a lower bound of the set B, and k is greater than every other lower bound of B,
then k is called the infimum of B, and we write k = inf B. Show that if a set B is bounded
below, then B has an infimum.
8. Let B be a set which is bounded below, and let K be the set of all lower bounds of B.
Describe K in the interval notation.
*9. Let $[a_1, b_1], [a_2, b_2], \ldots$ be a nested sequence, and let $A = \{a_1, a_2, \ldots\}$, $B = \{b_1, b_2, \ldots\}$.
Show that (a) every number $b_i$ is an upper bound of A. Let $x = \sup A$. Then show that
(b) $a_i \le x \le b_i$ for every i.
This result means that the least upper bound postulate (LUBP) implies the nested
interval postulate (NIP).
*10. Let K be a (nonempty) set of numbers, bounded above. Let A be the set of all numbers
a which are not upper bounds of K. That is, $a \in A$ if a < k for some k in K.
Show that A cannot contain a greatest element.
*11. Show that the Dedekind cut postulate (DCP) implies the least upper bound postulate
(LUBP).
The results of Problems 9 and 11 mean that
DCP => LUBP => NIP.
Thus our only really new assumption, in this section, is DCP.
The mean-value theorem was stated in Chapter 3, and we have been using it ever
since. We are now finally in a position to prove it. We need one preliminary result.
Rolle's Theorem. If f is continuous on the closed interval [a, b] and differentiable on
the open interval (a, b), and f(a) = f(b) = 0, then f'(x) = 0 for some x between
a and b.
3) If f(x) < 0 for some x, then the minimum of f is an ILMin. By Theorem 4 of
Section 5.2 we know that at an ILMin the derivative vanishes.
5.7 The Mean-Value Theorem and the No-Jump Theorem 247
Given that f is continuous on [a, b] and differentiable on (a, b), let g be the linear
function which agrees with f at a and at b. Thus
$g(a) = f(a)$, $g(b) = f(b)$.
We could write a formula for g, in the form g(x) = mx + k, if we needed to, but
we don't need to. Since the derivative of a linear function is simply the slope of the
line which is its graph, we know that
$g'(x) = \frac{f(b) - f(a)}{b - a}$
for every x. For each x of [a, b], let
$\phi(x) = f(x) - g(x)$, so that $\phi'(x) = f'(x) - \frac{f(b) - f(a)}{b - a}$.
Since $\phi(a) = \phi(b) = 0$, we can apply Rolle's theorem. Therefore $\phi'(x) = 0$ for
some x. Thus
$f'(x) - \frac{f(b) - f(a)}{b - a} = 0$,
and
$f'(x) = \frac{f(b) - f(a)}{b - a}$.
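For a particular function, a point x of the kind promised by the theorem can be located numerically. The sketch below is an illustration under an extra assumption that the theorem does not need: it supposes that $f'(x) - [f(b) - f(a)]/(b - a)$ has opposite signs at a and at b, so that bisection applies. The name mean_value_point is our own.

```python
import math

def mean_value_point(f, fprime, a, b, steps=80):
    """Find x in (a, b) with f'(x) = (f(b) - f(a)) / (b - a) by bisection.
    Assumes f'(x) - slope has opposite signs at a and at b."""
    slope = (f(b) - f(a)) / (b - a)
    phi_prime = lambda x: fprime(x) - slope
    lo, hi = a, b
    if (phi_prime(lo) > 0) == (phi_prime(hi) > 0):
        raise ValueError("this sketch needs a sign change of phi' on [a, b]")
    for _ in range(steps):
        mid = (lo + hi) / 2
        if (phi_prime(lo) > 0) == (phi_prime(mid) > 0):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Example: f(x) = x^3 on [0, 2]. The chord has slope 4, and 3x^2 = 4 at x = 2/sqrt(3).
x = mean_value_point(lambda t: t ** 3, lambda t: 3 * t * t, 0.0, 2.0)
print(x, 2 / math.sqrt(3))
```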
In the definition of a limit, we take $\epsilon = f(x_0) > 0$. There is a $\delta > 0$ such that
Proof? (The proof of Lemma 1 can be adapted, to give a proof of Lemma 2. But
it is quicker to derive Lemma 2 from the statement of Lemma 1.)
A function f changes sign, on an interval I, if f(x) > 0 for some x in I and f(x') < 0
for some x' in I.
Lemma 3. If f is continuous, on an interval containing $x_0$, and $f(x_0) \ne 0$, then there
is a $\delta > 0$ such that f does not change sign on the interval $(x_0 - \delta, x_0 + \delta)$.
Proof. For $f(x_0) > 0$, this follows from Lemma 1. For $f(x_0) < 0$, it follows from
Lemma 2.
We are now ready to prove the following convenient special case of the no-jump
theorem.
Theorem 1. If f is continuous on [a, b], and f changes sign on [a, b], then
$f(x_0) = 0$
for some $x_0$ in [a, b].
The proof is based on Lemma 3 and the nested interval postulate (NIP). We
suppose that $f(x) \ne 0$ for every x in [a, b]. We shall show that this assumption leads
to a contradiction.
Given that f changes sign on [a, b] and that f(x) is never 0, it follows that f
changes sign on one of the halves of [a, b]. We recall, from Section 5.6, that the
halves of [a, b] are [a, (a + b)/2] and [(a + b)/2, b]. Let $[a_1, b_1]$ be a half of [a, b],
such that f changes sign on $[a_1, b_1]$. Similarly, let $[a_2, b_2]$ be a half of $[a_1, b_1]$, such
that f changes sign on $[a_2, b_2]$. Proceeding to infinity in this way, we get a nested
sequence
$[a_1, b_1], [a_2, b_2], \ldots$
of closed intervals, such that f changes sign on each of them. Evidently
$b_i - a_i = 2^{-i}(b - a)$,
and so
$\lim_{i \to \infty} (b_i - a_i) = 0$,
as in the proof of Theorem 1 of Section 5.6. By NIP, there is an $x_0$ which lies on all
of the intervals in the nested sequence. That is,
$a_i \le x_0 \le b_i$ for every i.
By Lemma 3 there is a $\delta > 0$ such that f does not change sign on the interval $(x_0 - \delta,
x_0 + \delta)$. By Theorem 2 of Section 5.6, there is an i for which $[a_i, b_i]$ lies in $(x_0 - \delta,
x_0 + \delta)$, as indicated in the figure.
This is impossible, because f changes sign on $[a_i, b_i]$, but does not change sign on
$(x_0 - \delta, x_0 + \delta)$. This contradiction completes the proof of Theorem 1.
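The halving process used in this proof is also a practical way of locating a zero. The following sketch is ours, with a simplification: it assumes that f has opposite signs at the two endpoints, whereas the theorem only requires that f change sign somewhere on [a, b]. At each step it keeps a half on which the endpoint signs still differ.

```python
def bisect_zero(f, a, b, steps=60):
    """Approximate a zero of a continuous f on [a, b] by repeated halving.
    Assumes f(a) and f(b) have opposite signs (or that one of them is 0)."""
    if f(a) == 0:
        return a
    if f(b) == 0:
        return b
    if (f(a) > 0) == (f(b) > 0):
        raise ValueError("f must have opposite signs at a and b")
    for _ in range(steps):
        m = (a + b) / 2
        fm = f(m)
        if fm == 0:
            return m
        if (f(a) > 0) != (fm > 0):
            b = m  # f changes sign on the left half [a, m]
        else:
            a = m  # otherwise f changes sign on the right half [m, b]
    return (a + b) / 2

root = bisect_zero(lambda x: x * x - 2, 0.0, 2.0)
print(root)  # close to the square root of 2
```

After i steps the interval has length $2^{-i}(b - a)$, exactly as in the proof.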
It is now easy to prove the no-jump theorem.
Then g changes sign on $[x_1, x_2]$. Therefore $g(x_0) = 0$ for some $x_0$ on $[x_1, x_2]$. This
gives $f(x_0) - k = 0$, and $f(x_0) = k$.
If $f(x_2) < k < f(x_1)$, then the same function g still changes sign, and so the proof
is exactly the same.
Nobody expects that a doctor will write down a definition of the word man
and then write a few assumptions about men, in such a way that all medical science
can be derived by logical reasoning from the definition and from the assumptions.
Medicine is an empirical science: it depends on observations of fact, not just at the
outset but continually. Mathematics is different.
Moreover, in your study of mathematics you have already passed the point where
the truth can be relied upon to be obvious and where obvious things can be relied
on to be true. From now on, logic is going to be an important part of your mathematical
equipment. This is partly due to recent developments. As late as 1800,
calculus was illogical, and very few people cared. In the last century, however,
mathematical ideas which require careful logical analysis have become more
important, in pure research and also in applications.
Let f and g be differentiable functions. Take a point $x_0$, and form the differences
$\lim_{\Delta x \to 0} \frac{\Delta f}{\Delta g} = \frac{df}{dg}$,
by definition. In fact, the limit always exists, whenever $g'(x_0) \ne 0$.
Theorem 1.
$\frac{df}{dg} = \frac{f'}{g'}$,
wherever $g'(x) \ne 0$.
Proof.
$\lim_{\Delta x \to 0} \frac{\Delta f}{\Delta g} = \lim_{\Delta x \to 0} \frac{\Delta f/\Delta x}{\Delta g/\Delta x} = \frac{f'(x_0)}{g'(x_0)}$.
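The theorem can be tried out numerically by forming the quotient of the two differences directly. In the sketch below (the helper name d_wrt and the step size are our own choices), $\Delta f = f(x_0 + \Delta x) - f(x_0)$ and $\Delta g = g(x_0 + \Delta x) - g(x_0)$ for one small $\Delta x$, and the result is compared with $f'(x_0)/g'(x_0)$.

```python
import math

def d_wrt(f, g, x0, dx=1e-6):
    """Approximate df/dg at x0 by the difference quotient (delta f)/(delta g)."""
    delta_f = f(x0 + dx) - f(x0)
    delta_g = g(x0 + dx) - g(x0)
    return delta_f / delta_g

x0 = 1.0
approx = d_wrt(math.sin, math.cos, x0)
exact = math.cos(x0) / (-math.sin(x0))  # f'/g' with f = sin, g = cos
print(approx, exact)
```

The two numbers agree to several decimal places, as long as $g'(x_0) \ne 0$.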
For the case in which g(x) = x for every x, the derivative of f with respect to g
reduces to an ordinary derivative:
$\frac{df}{dg} = \frac{df}{dx} = f'(x)$.
Obviously,
$\frac{df}{dg} = \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x} = f'(x_0)$,
for each $x_0$.
5.8 The Derivative of One Function with Respect to Another 251
$\frac{d \sin x}{d \cos x} = \frac{\cos x}{-\sin x} = -\cot x$ (wherever $\sin x \ne 0$),
$\frac{d \sin x}{dx} = \cos x$,
$\frac{d e^x}{d x^2} = \frac{e^x}{2x}$ (wherever $x \ne 0$).
We often write
$\frac{d}{dx} f(x)$ for $\frac{df}{dx}$.
Thus every derivative can be written in the form
$f'(x) = \frac{d}{dx} f(x)$, $f' = \frac{df}{dx}$.
The notation df/dx for derivatives is widely used, especially in physics, and it is
natural to use it when you are continually dealing with the derivative df/dg of one
function with respect to another. It has a disadvantage, however: there is no convenient
way to write the value of the derivative at a particular point $x_0$. Sometimes
we denote this by
$\left.\frac{df}{dx}\right|_{x = x_0}$,
but the notation $f'(x_0)$ is more convenient.
We now want to prove a sort of cancellation law
$\frac{df}{dg} \cdot \frac{dg}{dh} = \frac{df}{dh}$.
We can derive this from the equation
l:lj df !:lg dg
---- _ _,,_ _
g'(x0) ":/= 0,
$\frac{df}{dg} \cdot \frac{dg}{dh} = \frac{df}{dh}$,
wherever $g' \ne 0$ and $h' \ne 0$.
Theorem 4.
$\frac{dg}{df} = \frac{1}{df/dg}$.
This is like
$\frac{df}{dx} = f'(x) = 2x$.
That is, to find du2/du (where u is a function), we treat u as if it were a dummy variable
x and differentiate in one step. This is an example of the following situation.
For example, $\sin^2 x$ is a function of $\sin x$, with $\phi(u) = u^2$. And $\cos^2 x - 2\cos x$
is a function of $\cos x$, with $\phi(u) = u^2 - 2u$. The easiest way to calculate df/du, in
$\frac{d \sin^2 x}{d \sin x} = \frac{du^2}{du} = 2u = 2\sin x$,
$\frac{d(\cos^2 x - 2\cos x)}{d \cos x} = \frac{d(u^2 - 2u)}{du} = 2u - 2 = 2\cos x - 2$.
This procedure is justified by the following theorem.
Theorem 5. Let f be a function of g, $f = \phi(g)$, where all the functions are differentiable.
Then
$\frac{df}{dg} = \phi'(g)$,
wherever $g' \ne 0$.
Proof.
$\frac{df}{dg} = \frac{f'}{g'} = \frac{\phi'(g)g'}{g'} = \phi'(g)$.
Using this theorem, we can write immediately
$\frac{d \tan^2 x}{d \tan x} = 2\tan x$,
calculate $\phi'(g)$, and compare it with your previous formula for df/dg. (Or, if you worked
the problems this way in the first place, work them by the other method, and check.)
15. $f(t) = \sin t$, $g(t) = e^t$ 16. $f(t) = \cos t$, $g(t) = \mathrm{Tan}\ t$
26. Given $f^3 + t^3 = 1$, find df/dt. Then calculate f = f(t), find f'(t), and compare the
result with df/dt.
27. Same, for $f^4 + t^4 = 1$. (Here there are two functions f = f(t) to be considered.)
28. Now try to check your answer to Problem 25 in the same way that you checked your
answers to Problems 26 and 27. (It often happens that a formal process gives "answers"
in cases where there never was a question.)
6 The Technique of Integration
6.1 INTRODUCTION
In Section 3.7 we found a way to solve certain types of area problem. To find the
area under the graph of a continuous function f, from a to b, we introduce the area
function
$A(x) = \int_a^x f(t)\,dt$.
[Figure: the region under the graph of f, from a to x.]
We know that
6.2 Independent Variables and Indefinite Integrals 255
To sum up:
The notations t and G were introduced for the sake of the derivation. Once we have
the answer, it is natural to use x and F, and write:
To apply the theorem, of course, we need to find F when f is given. This process
is called antidifferentiation. We shall see later that the method of antidifferentiation
enables us to solve not only the sort of area problems that we have used it on so far,
but also a variety of problems which, offhand, don't look like area problems at all.
But these applications should be postponed. The point is that, to apply the method,
we need to know how to calculate a function F whose derivative is a given function f;
up to now we have been finding such functions F only by hit-or-miss procedures, in
simple cases; and it would not be good to reduce various problems to problems in
antidifferentiation, when we are unable to solve the antidifferentiation problems.
We should therefore first learn better methods for calculating functions when their
derivatives are given.
The usual way of defining a function is to write an expression which gives the value
of the function for every number in the domain. For example, we may define functions
f and g by writing
The set of all functions F for which F' = f is commonly denoted by
$\int f(x)\,dx$.
This is called the indefinite integral of f. Thus
and so on. Any other dummy letter would have done as well:
we might have gotten along without the "dx," because the only constants involved
are the numerical constants 2 and 3. On the other hand, if we write
the "dx" is needed; it tells us that $\alpha$, $\beta$, and y are to be regarded as constants, and that
the function which we are dealing with is
$f(x) = \alpha x^3 y + \beta x^2 y^2$.
When the problem is understood in this sense, it is plain that the answer is
(ii)
In (ii), $\alpha$, $\beta$, and x are constants, and the function is
$g(y) = \alpha x^3 y + \beta x^2 y^2$.
$Dx^{n+1} = (n + 1)x^n$ $(n \ne -1)$,
$D \ln x = \frac{1}{x}$ $(x > 0)$ $\Rightarrow$ $\int \frac{1}{x}\,dx = \{\ln x + C\}$ $(x > 0)$,
We know many more differentiation formulas than this, and so we could have
written many more integration formulas. But we postpone the complete list until we
can write it in a better form, which we shall now explain.
Given a function f, if u is another function, then f(u) is a composite function.
By the chain rule,
$Df(u) = f'(u)u'$.
It follows that
$\int f'(u(x))u'(x)\,dx = \{f(u(x)) + C\}$.
For example, if
$f(u) = \sin u$, $u(x) = x^2 + 1$,
then
$F' = f$,
so that
$\int f(x)\,dx = \{F(x) + C\}$,
then
$D[F(u(x))] = F'(u(x))u'(x) = f(u(x))u'(x)$,
so that
or $\int e^{u(t)}u'(t)\,dt$,
and so on. More often, however, we start with an integral described in the long
notation and observe that it is convertible to a short form. For example,
$\int e^{x^2+1}\,2x\,dx$
has the form
$\int e^{x^2+1}\,2x\,dx = \int e^u\,du = \{e^u + C\} = \{e^{x^2+1} + C\}$.
Similarly, $\int [\sin(t^2 + 1)]\,2t\,dt$ has the form $\int \sin u\,du$. Therefore
Note that the solution is not finished in the third formula above, because u is a
function. To complete the solution, we need to express the function u in terms of
the dummy letter t. To sum up:
$F' = f \Rightarrow D[F(u)] = f(u)u'$.
Therefore
$\int f(x)\,dx = \{F(x) + C\} \Rightarrow \int f(u)\,du = \{F(u) + C\}$.
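A numerical check may make the change from the long form to the short form more concrete. The sketch below uses Simpson's rule, which is not discussed in this section and is brought in only as a convenient way of approximating areas. It compares the area under $e^{x^2+1}\,2x$ from x = 0 to x = 1 with the area under $e^u$ from u = 1 to u = 2 (the values of $u = x^2 + 1$ at the endpoints) and with $e^2 - e$.

```python
import math

def simpson(f, a, b, n=1000):
    """Approximate the integral of f from a to b by Simpson's rule (n even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

long_form = simpson(lambda x: math.exp(x * x + 1) * 2 * x, 0.0, 1.0)
short_form = simpson(math.exp, 1.0, 2.0)   # integral of e^u, u from 1 to 2
exact = math.exp(2) - math.exp(1)          # e^(x^2+1) evaluated at x = 1 and x = 0
print(long_form, short_form, exact)
```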
Using this general idea, we can write all of our old integration formulas in the more
general form. The first few look like this:
because
$D[f + g] = Df + Dg$ and $D(kf) = kDf$.
Let us now consider how to apply such formulas as these, as a practical matter.
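Example 1. Consider
$\int (x^2 + 1)^7 x\,dx$.
This is almost, but not quite, in the form
$\int u^7\,du$.
If we take $u(x) = x^2 + 1$, then
$du = u'(x)\,dx = 2x\,dx$.
We therefore have
$\int (x^2 + 1)^7 x\,dx = \frac{1}{2}\int (x^2 + 1)^7\,2x\,dx = \frac{1}{2}\int u^7\,du = \left\{\frac{1}{16}(x^2 + 1)^8 + C\right\}$.
This checks:
$D\left[\frac{1}{16}(x^2 + 1)^8\right] = \frac{8}{16}(x^2 + 1)^7 \cdot 2x = (x^2 + 1)^7 x$.
Example 2. Consider
$\int \frac{\cos\sqrt{x}}{\sqrt{x}}\,dx$ $(x > 0)$.
The only form that might fit this integral is the form $\int \cos u\,du$. Thus we would have
$u(x) = \sqrt{x}$, $du = u'(x)\,dx = \frac{1}{2\sqrt{x}}\,dx$.
The only difference between what we have and what we want is a multiplicative
constant. Therefore
$\int \frac{\cos\sqrt{x}}{\sqrt{x}}\,dx = \int (\cos\sqrt{x})\frac{1}{\sqrt{x}}\,dx = 2\int (\cos\sqrt{x})\frac{1}{2\sqrt{x}}\,dx = 2\int \cos u\,du = \{2\sin\sqrt{x} + C\}$.
$\int e^{\cos x}\sin x\,dx = -\int e^{\cos x}(-\sin x)\,dx = -\int e^u\,du$
$= \{-e^u + C\} = \{-e^{\cos x} + C\}$.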
Below we shall give a list of all the integration formulas that we can write, at this
stage, on the basis of the differentiation formulas that we know. Special explanations
are needed, however, in connection with the formula for $\int (1/u)\,du$. Given a function
u, defined on a domain where u(x) > 0 for every x, we know that
$D \ln u(x) = \frac{1}{u(x)} Du(x)$.
We need to know that u(x) > 0 on the domain under consideration, because only
positive numbers have logarithms. Therefore we write
$\int \frac{1}{u}\,du$;
that is, it makes sense to ask what functions f have (1/u)u' as their derivatives. The
answer is easy: if u(x) < 0, then -u(x) > 0. Therefore -u(x) has a logarithm, and
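$\int kf(x)\,dx = k\int f(x)\,dx$ $(k \ne 0)$ (1)
$\int [f(x) + g(x)]\,dx = \int f(x)\,dx + \int g(x)\,dx$ (2)
$\int u^n\,du = \left\{\frac{u^{n+1}}{n+1} + C\right\}$ $(n \ne -1)$ (3)
$\int \frac{1}{u}\,du = \{\ln u + C\}$ $(u > 0)$ (4)
$\int \frac{1}{u}\,du = \{\ln(-u) + C\}$ $(u < 0)$ (5)
$\int \cos u\,du = \{\sin u + C\}$ (6)
$\int \sin u\,du = \{-\cos u + C\}$ (7)
$\int \sec^2 u\,du = \{\tan u + C\}$ (8)
$\int \csc^2 u\,du = \{-\cot u + C\}$ (9)
$\int \sec u \tan u\,du = \{\sec u + C\}$ (10)
$\int \csc u \cot u\,du = \{-\csc u + C\}$ (11)
$\int e^u\,du = \{e^u + C\}$ (12)
$\int a^u\,du = \left\{\frac{a^u}{\ln a} + C\right\}$ $(a > 0, a \ne 1)$ (13)
$\int \frac{du}{\sqrt{1 - u^2}} = \{\mathrm{Sin}^{-1} u + C\}$ $(|u| < 1)$ (14)
$\int \frac{du}{1 + u^2} = \{\mathrm{Tan}^{-1} u + C\}$ (15)
$\int \frac{du}{u\sqrt{u^2 - 1}} = \{\mathrm{Sec}^{-1} u + C\}$ $(u > 1)$ (16)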
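Each formula in this list can be checked by differentiating its right-hand member. The sketch below does this numerically at a single sample point for several of the formulas, with u treated as the independent variable. The sample points, the choice a = 3 in formula (13), and the use of acos(1/u) for the inverse secant on u > 1 are our own choices for the illustration.

```python
import math

def deriv(F, u, h=1e-6):
    """Central-difference estimate of F'(u)."""
    return (F(u + h) - F(u - h)) / (2 * h)

# (formula number, integrand f, claimed antiderivative F, sample point u)
CHECKS = [
    (6, math.cos, math.sin, 0.7),
    (7, math.sin, lambda u: -math.cos(u), 0.7),
    (8, lambda u: 1 / math.cos(u) ** 2, math.tan, 0.7),
    (9, lambda u: 1 / math.sin(u) ** 2, lambda u: -1 / math.tan(u), 0.7),
    (12, math.exp, math.exp, 0.7),
    (13, lambda u: 3.0 ** u, lambda u: 3.0 ** u / math.log(3.0), 0.7),
    (14, lambda u: 1 / math.sqrt(1 - u * u), math.asin, 0.7),
    (15, lambda u: 1 / (1 + u * u), math.atan, 0.7),
    (16, lambda u: 1 / (u * math.sqrt(u * u - 1)), lambda u: math.acos(1 / u), 1.7),
]

def max_error():
    """Largest gap between F'(u) and f(u) over all the checks."""
    return max(abs(deriv(F, u) - f(u)) for _, f, F, u in CHECKS)

for number, f, F, u in CHECKS:
    print(number, deriv(F, u), f(u))
```

A check of this kind at one point proves nothing, but it uncovers a wrong sign or a missing constant factor very quickly.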
To solve the following problems, you will start by expressing the given integral
in the form $\int f(u)\,du$. In each such case, you should (a) say what u and du are and (b)
state the general formula that you are applying. It is natural to write down the original
integral first, and after this it would be awkward to interrupt the solution with the
formulas for u(x) and du = u'(x) dx. But u and du can be filled in on the right, like
this, for example:
$= \int \frac{1}{3}u^{10}\,du = \left\{\frac{1}{3} \cdot \frac{1}{11}u^{11} + C\right\}$
$du = 3x^2\,dx$
as if it were true that for $u(x) = x^3 + 1$, $du = x^2\,dx$. When we write formulas for u
and du, we uncover such errors.
Similarly for the following wrong solution:
Calculate the following integrals, and check by differentiation in each case. Some of
these problems fit together in sequences, in which the answer to one problem helps in the
solution of another; you should watch for such patterns.
10. J(rs/2 - l)rs/2dt 11. f (I + sin x)2 cos xdx 12. f (1 + tan x)3i2 sec2 x dx
13 . fv'cos x sin xdx 14. f (e"' + 2)4 e"' dx 15. f (e"' - 2)3e-"' dx
19. f(I
x2
+ x3)3
dx 20.
f : 1 x2
dx
21. f x2
1 + xs
dx (There are two intervals to be considered in this problem.)
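27. $\int \sin^{101} x \cos x\,dx$ 28. $\int \cos^2 x \sin x\,dx$ 29. $\int \cos^3 x \sin x\,dx$
36. $\int \frac{\cos\theta}{\sin^2\theta}\,d\theta$ 37. $\int \sin 2\theta\,d\theta$ 38. $\int \cos 2\theta\,d\theta$
39. $\int (\cos^2\theta - \sin^2\theta)\,d\theta$ 40. $\int (\cos^2\theta + \sin^2\theta)\,d\theta$ 41. $\int (2\cos^2\theta - 1)\,d\theta$
45. $\int \sin^2\theta\,d\theta$ 46. $\int \sin^2 2\theta\,d\theta$ 47. $\int \sin^2\theta \cos^2\theta\,d\theta$
48. $\int \cos^2\theta \sin\theta\,d\theta$ 49. $\int \sin\theta\,(1 - \sin^2\theta)\,d\theta$ 50. $\int \sin^3\theta\,d\theta$
51. $\int \cos(\theta/2)\,d\theta$ 52. $\int \sqrt{\frac{1 - \cos\theta}{2}}\,d\theta$ 53. $\int \sqrt{1 - \cos\theta}\,\sin\theta\,d\theta$
54. $\int xe^{-x^2}\,dx$ 55. $\int t^2e^{t^3}\,dt$ 56. $\int xe^{x^2}\,dx$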
57. f
e2"'dx 58. J e5t dt 59. J e1
2
tdt
6.3 Integrals Leading to the Logarithm and the Inverse Secant, Algebraic Devices 265
60. J dt
et'+3t 61.
f dx
ein sec' x 62.
f cosxdx
esin x
63. J sin t dt
e008 t 64. J2x+idx 65. J xdx
10x•
66.
I dx(10x)2 67.
I (2 dt
+ o-3/2 68. J dt
t(2 +t2)-312
J (2 dt dt dt
I
tdt
69. +i-312) 70.
I .Y1 - t2
71.
.y 1 - t2
dt dt
t2 dt
I - (2t)2
t
72.
I ( .Y1· _ 12)s
73.
I\o/1 +t3
74.
.y 4
I exdx I -=dx
t3 ex ex
75.
I 1 +t
4 dt 7 6.
1 +
77.
Y] _ex
78.
I
ex
Yl - e2x
dx
osx
c--dx xdx
(There are different intervals to consider in Problems 79 through 84.)
79.
I x 80. --
I 2 x
sin
81. J xdx
tan
8 .
sec 2 +sec
secx +tan
tan
83.
I x x csc
+ csc
+cot
cot
84. J xdx
sec
In the preceding section, we got two formulas for $\int du/u$, for the intervals $(0, \infty)$
and $(-\infty, 0)$.
$\int \frac{du}{u} = \{\ln u + C\}$ $(u > 0)$ (4)
$\int \frac{du}{u} = \{\ln(-u) + C\}$ $(u < 0)$. (5)
Since |u| = u when u > 0 and |u| = -u when u < 0, these two formulas can be
combined into one:
$\int \frac{du}{u} = \{\ln |u| + C\}$ (on $(0, \infty)$ or $(-\infty, 0)$). (17)
Here the expression in parentheses on the right reminds us that the formula can
be used on an interval where u > 0, or on an interval where u < 0; it cannot be used
on an interval where u takes on the value 0. When u = 0, there is no such thing as the
"1/u" on the left or the "ln |u|" on the right. Thus, whenever we apply formula (17),
we might have used formula (4) or (5). The advantage of (17) is that it is easier to use.
Consider
$\int_{-2}^{-1} \frac{dx}{x}$.
$f(x) = \frac{1}{x}$, $F(x) = \ln |x|$.
$\int_{-2}^{-1} \frac{dx}{x} = F(-1) - F(-2) = \ln |-1| - \ln |-2| = 0 - \ln 2 = -\ln 2$.
This is negative, as it should be; the integrand is negative, and we are integrating
from left to right. The calculation might be confusing if we used formula (5):
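Whichever formula is used, the value is the same, and it can be compared with a direct numerical approximation of the integral. The sketch below uses a midpoint sum (our choice of method, not the text's) together with formulas (17) and (5).

```python
import math

def midpoint_sum(f, a, b, n=100_000):
    """Approximate the integral of f from a to b by a midpoint Riemann sum."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

numeric = midpoint_sum(lambda x: 1 / x, -2.0, -1.0)
by_17 = math.log(abs(-1.0)) - math.log(abs(-2.0))   # F(x) = ln |x|
by_5 = math.log(-(-1.0)) - math.log(-(-2.0))        # F(x) = ln (-x), valid since x < 0
print(numeric, by_17, by_5, -math.log(2))
```

With formula (5) the minus signs inside the logarithms are easy to mishandle, which is the advantage of writing ln |x|.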
Hereafter, we shall use the following shorthand for this kind of calculation:
In general
$[F(x)]_a^b = F(b) - F(a)$,
by definition. Sometimes, where no confusion could result, we may omit the opening
bracket on the left. Thus
�3I= � � �- - =
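$\int \tan u\,du = \int \frac{\sin u}{\cos u}\,du$.
Let $v = \cos u$, $dv = -\sin u\,du$.
Since
$\int \frac{dv}{v} = \{\ln |v| + C\}$ $(v > 0$ or $v < 0)$,
we have
$\int \tan u\,du = \int \frac{\sin u}{\cos u}\,du = -\int \frac{-\sin u}{\cos u}\,du = \{-\ln |\cos u| + C\}$ $(\cos u > 0$ or $\cos u < 0)$. (18)
Similarly,
$\int \cot u\,du = \int \frac{\cos u}{\sin u}\,du$.
This gives
$\int \cot u\,du = \{\ln |\sin u| + C\}$ $(\sin u > 0$ or $\sin u < 0)$. (19)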
By an ingenious device, we can find
$\int \sec x\,dx$.
Since
$D \sec x = \sec x \tan x$
and
$D \tan x = \sec^2 x$,
the integral has the form
$\int \sec x\,dx = \int \frac{\sec x(\sec x + \tan x)}{\sec x + \tan x}\,dx = \int \frac{du}{u}$,
where
$u = \sec x + \tan x$,
$du = (\sec x \tan x + \sec^2 x)\,dx$.
Therefore
$\int \sec x\,dx = \{\ln |u| + C\} = \{\ln |\sec x + \tan x| + C\}$.
As always, the chain rule gives us a more general formula for $\int \sec u\,du$:
$\int \sec u\,du = \{\ln |\sec u + \tan u| + C\}$ $(\sec u + \tan u > 0$ or $< 0)$. (20)
Similarly,
$\int \csc x\,dx = \int \frac{\csc x(\csc x + \cot x)}{\csc x + \cot x}\,dx = \{-\ln |\csc x + \cot x| + C\}$;
$\int \csc u\,du = \{-\ln |\csc u + \cot u| + C\}$ $(\csc u + \cot u > 0$ or $< 0)$. (21)
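Since these two formulas come from a device rather than from a known derivative, it is worth checking them by differentiation. The sketch below does so numerically, at sample points where the expression inside the absolute-value signs is positive and at points where it is negative. The sample points and helper names are our own.

```python
import math

def deriv(F, x, h=1e-6):
    """Central-difference estimate of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

def sec(x):
    return 1 / math.cos(x)

def csc(x):
    return 1 / math.sin(x)

def cot(x):
    return math.cos(x) / math.sin(x)

def F_sec(x):
    return math.log(abs(sec(x) + math.tan(x)))    # right-hand member of (20)

def F_csc(x):
    return -math.log(abs(csc(x) + cot(x)))        # right-hand member of (21)

# sec x + tan x > 0 at x = 0.5 and < 0 at x = 2.5;
# csc x + cot x > 0 at x = 0.5 and < 0 at x = -0.5.
for x in (0.5, 2.5):
    print(x, deriv(F_sec, x), sec(x))
for x in (0.5, -0.5):
    print(x, deriv(F_csc, x), csc(x))
```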
[Figure: the graph of $y = \mathrm{Sec}^{-1} x$.]
(See Section 4.7.) Thus $\mathrm{Sec}^{-1}$ is defined on the interval $[1, \infty)$. But at 1 its tangent is
vertical; and so the differentiation formula holds only for x > 1. It gives
$\int \frac{dx}{x\sqrt{x^2 - 1}} = \{\mathrm{Sec}^{-1} x + C\}$ $(x > 1)$,
$\int \frac{du}{u\sqrt{u^2 - 1}} = \{\mathrm{Sec}^{-1} u + C\}$ $(u > 1)$. (16)
Notice, however, that the integral
$\int_{-3}^{-2} \frac{1}{x\sqrt{x^2 - 1}}\,dx$
makes sense. We therefore need an integration formula which will apply to this
integrand on the interval $(-\infty, -1)$. On this domain,
$\int \frac{1}{x\sqrt{x^2 - 1}}\,dx = \int \frac{(-1)}{(-x)\sqrt{(-x)^2 - 1}}\,dx$,
where
$u(x) = -x$, $du = (-1)\,dx$.
Therefore, for x < -1,
$\int \frac{dx}{x\sqrt{x^2 - 1}} = \{\mathrm{Sec}^{-1} u + C\} = \{\mathrm{Sec}^{-1}(-x) + C\} = \{\mathrm{Sec}^{-1} |x| + C\}$,
because |x| = -x when x < 0. Fitting our two formulas together, and passing to
the general case (with a function u instead of x), we get
$\int \frac{du}{u\sqrt{u^2 - 1}} = \{\mathrm{Sec}^{-1} |u| + C\}$ $(u > 1$ or $u < -1)$. (22)
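Formula (22) can be tested on an interval to the left of -1. The sketch below takes the limits -3 and -2 as an illustrative choice, approximates the integral by a midpoint sum, and compares the result with the difference of the values of $\mathrm{Sec}^{-1} |x|$. Python has no built-in inverse secant, so we compute it as acos(1/u) for u at least 1; this is an assumption of the sketch, consistent with $\mathrm{Sec}^{-1}$ taking values in $[0, \pi/2)$ on $[1, \infty)$.

```python
import math

def arcsec(u):
    """Inverse secant for u >= 1, computed as acos(1/u)."""
    return math.acos(1 / u)

def midpoint_sum(f, a, b, n=20_000):
    """Approximate the integral of f from a to b by a midpoint Riemann sum."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

integrand = lambda x: 1 / (x * math.sqrt(x * x - 1))
numeric = midpoint_sum(integrand, -3.0, -2.0)
by_formula = arcsec(abs(-2.0)) - arcsec(abs(-3.0))   # [Sec^-1 |x|] from -3 to -2
print(numeric, by_formula)
```

Both numbers are negative, as they should be, since the integrand is negative when x < -1.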
There is a rough rule to help you decide which of our present list of formulas to
apply to a given problem: look in the integrand for functions which are the derivatives
of other functions. The point is that all our formulas have left-hand members of the
form $\int f(u)\,du$; and we need to decide, in each case, what u is.
Example 1.
$\int \frac{\ln^3 x}{x}\,dx$.
$D \ln x = \frac{1}{x}$.
Taking $u(x) = \ln x$, we have
$du = \frac{1}{x}\,dx$.
Thus our integral has the form
$\int \frac{\ln^3 x}{x}\,dx = \int (\ln^3 x)\frac{1}{x}\,dx = \int u^3\,du = \left\{\frac{u^4}{4} + C\right\} = \left\{\frac{1}{4}\ln^4 x + C\right\}$.
Example 2.
$\int \frac{x\,dx}{(1 + x^2)^7}$.
Looking for functions which are derivatives of other functions, we observe that
$\int \frac{x\,dx}{(1 + x^2)^7} = \frac{1}{2}\int \frac{2x\,dx}{(1 + x^2)^7} = \frac{1}{2}\int u^{-7}\,du$,
where
$u(x) = 1 + x^2$, $du = u'(x)\,dx = 2x\,dx$.
Example 3.
$\int \frac{x\,dx}{\sqrt{1 - x^4}}$.
There is no hope that $1/\sqrt{1 - x^4}$ is part of du. Either the problem is hard or du
must be x dx, or a constant multiple of x dx. Now $2x = Dx^2$; and $x^2$ is what gets
squared under the radical sign in the denominator. This suggests
$u = x^2$,
$\int \frac{x\,dx}{\sqrt{1 - x^4}} = \frac{1}{2}\int \frac{2x\,dx}{\sqrt{1 - (x^2)^2}} = \frac{1}{2}\int \frac{du}{\sqrt{1 - u^2}}$
$= \left\{\frac{1}{2}\mathrm{Sin}^{-1} u + C\right\} = \left\{\frac{1}{2}\mathrm{Sin}^{-1} x^2 + C\right\}$.
Example 4. Some obscure-looking integrals may be calculated algebraically:
$\int \frac{x\,dx}{x - 1} = \int \left(1 + \frac{1}{x - 1}\right)dx$.
(Here we have divided the denominator into the numerator, getting a quotient and a
remainder.) Therefore
$\int \frac{x\,dx}{x - 1} = \{x + \ln |x - 1| + C\}$.
Example 5. Sometimes we need to find other algebraic devices, for such problems
as this:
$\int \frac{dx}{1 + e^{-x}}$.
As it stands, this is hopeless: nothing in the integrand is the derivative of anything
else. But
$\int \frac{dx}{1 + e^{-x}} = \int \frac{e^x\,dx}{e^x + 1} = \int \frac{du}{u}$ $(u = e^x + 1)$
$= \{\ln |u| + C\} = \{\ln(1 + e^x) + C\}$.
(No absolute-value signs are needed, because $1 + e^x > 1$ for every x.)
Example 6. Sometimes the same devices appear in more complicated forms:
$\int \frac{dx}{e^x + e^{-x}} = \int \frac{e^x\,dx}{e^{2x} + 1} = \int \frac{e^x\,dx}{1 + (e^x)^2} = \int \frac{du}{1 + u^2}$ $(u = e^x$, $du = e^x\,dx)$
$= \{\mathrm{Tan}^{-1} u + C\} = \{\mathrm{Tan}^{-1} e^x + C\}$.
Here we have used, in combination, the methods that worked in Examples 3 and 5.
_E_= ! dx
= !.2.
t dx
J + x2 J J
.
4 4 1 + (x/2)2 4 1 + (x/2)2 ·
{t Tan-1 � + c}.
6.3 Integrals Leading to the Logarithm and the Inverse Secant. Algebraic Devices 271
Similarly,
\[ \int \frac{(1/\sqrt3)\,dx}{\sqrt{1-(x/\sqrt3)^2}} = \Bigl\{\operatorname{Sin}^{-1} \frac{x}{\sqrt3} + C\Bigr\}. \]
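\[ \int \frac{dx}{a^2+x^2} = \Bigl\{\frac1a \operatorname{Tan}^{-1} \frac{x}{a} + C\Bigr\} \qquad (a > 0), \]
\[ \int \frac{dx}{\sqrt{a^2-x^2}} = \Bigl\{\operatorname{Sin}^{-1} \frac{x}{a} + C\Bigr\} \qquad (a > 0). \]
Passing from $x$ to any differentiable function $u$, we get two more standard formulas:
\[ \int \frac{du}{a^2+u^2} = \Bigl\{\frac1a \operatorname{Tan}^{-1} \frac{u}{a} + C\Bigr\} \qquad (a > 0), \tag{23} \]
\[ \int \frac{du}{\sqrt{a^2-u^2}} = \Bigl\{\operatorname{Sin}^{-1} \frac{u}{a} + C\Bigr\} \qquad (a > 0). \tag{24} \]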
t
1. I � v' 1 x4
dx 2. J 4 v'1 - t2
dt 3. J � v'1 4y4
d
y
x
4. J y
v'4 - y4
dy J
5. 1 + 9x 4
dx 6. J 1
:39 4 dx
x
7. I 9 :sx4 dx 8. J � 5 t4
dt 9 J �
. 1 z6
dz
10. I � 2 zs
dz J 1�
11.
2zs
dz 12. J 1 �6dz
+ z
x
13. I � 5 zs
dz 14. J 1 :55zs
dz 15. J 7
Vl - x8
dx
x
16. J 3
Vl - x8
dx 17. J 1 � xs
dx 18. J � 1 xs
dx
1
19. J : e•
1
• dz 20. J 1 :t e2t dt 21. J e"' e-x +
dx
e-
,.dx
·
23. I e"'e"' - e-x
e-x +
dx 24. J (e"' e- )( + "' e"' - e-"') dx
2
x x
25. J (e"' e-•')(e2"'
+ + e-2"') dx 26. J 2
v'2 - x3
dx 2 7. J 2
v'2 - x6
dx
x2
Consider
= =
•
JI-Ix21 dx.
y
6.4 Integration by Parts 273
\[ f(x) = \frac{1}{x^2}, \qquad F(x) = -\frac{1}{x}. \]
Then $F' = f$. Therefore
\[ \int_{-1}^{1} \frac{dx}{x^2} = F(1) - F(-1) = -2. \]
Now we interpret the problem geometrically. We seem to have proved that the region
under a positive function has negative area.
b) Show that the area in question is not only positive but infinite. (This does not
follow from the mere fact that the region is unbounded. Some unbounded regions have
finite areas.)
65. Let R be the region under the graph of f(x) = I/v'�, from x = 0 to x = 1. Show
·
By differentiation, we get
D[x sin x] = x cos x + sin x.
Since
D cos x = - sin x,
we have
D[x sin x + cos x] = x cos x + sin x - sin x = x cos x.
Therefore
\[ \int x \cos x\,dx = \{x \sin x + \cos x + C\}. \]
Thus, working backward, we have found the solution of an integration problem which
might have looked hard if we had approached it forward, starting with the unknown
integral $\int x \cos x\,dx$. We shall now describe a general method of solving problems of
this kind.
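The formula for the derivative of a product is
\[ D(uv) = u\,v' + v\,u'. \]
Integrating both sides and solving for $\int u\,v'\,dx$, we get
\[ \int u\,dv = uv - \int v\,du. \]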
This is the formula for integration by parts; the word parts refers to the functions
u(x) and v'(x) in the integral on the left. Any time we apply the formula, we replace
one integral by another. The method is useful when the new integral is easier to
calculate than the old one.
Let us first try the method on
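\[ \int x \cos x\,dx. \]
Let
\[ u = x, \qquad dv = \cos x\,dx, \]
so that
\[ du = dx \quad\text{and}\quad v = \sin x. \]
(We need not allow for a constant here; any function $v$ whose derivative is $\cos x$
will work. We will return to this point in a moment.) By the basic formula, we get
\[ \int x \cos x\,dx = \int u\,dv = uv - \int v\,du = x \sin x - \int \sin x\,dx = \{x \sin x + \cos x + C\}. \]
If we had used
\[ v = \sin x + c, \]
we would have
\[ \int x \cos x\,dx = x(\sin x + c) - \int (\sin x + c)\,dx = \{x \sin x + \cos x + C\}, \]
which is the same answer as before.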
In applying the basic formula, we made what may seem to be an arbitrary choice
of $u$ and $dv$. We might have taken
\[ u = \cos x, \qquad du = -\sin x\,dx, \qquad dv = x\,dx, \qquad v = \frac{x^2}{2}. \]
This gives
\[ \int x \cos x\,dx = \int u\,dv = uv - \int v\,du = \frac{x^2}{2} \cos x + \frac12 \int x^2 \sin x\,dx. \]
This is true, but is worthless as a method of finding f x cos x dx, because the new
integral is harder to calculate than the old one.
An equally bad choice would be
\[ u = x \cos x, \qquad dv = dx, \]
which gives
\[ \int x \cos x\,dx = \int u\,dv = uv - \int v\,du = x^2 \cos x - \int x(\cos x - x \sin x)\,dx. \]
Here again the new integral is harder than the old one. We remember also that no
term of the form x2cos x appears in the right answer. Therefore the term x2 cos x
cannot be the beginning of the solution, as we might hope: it must be a blind alley.
These examples indicate that integration by parts can be either a good or a bad
method, according to the skill with which we choose the parts. Practice is a help,
but there are general rules which help us to decide what choices are promising:
1) dv has got to be something that we know how to integrate. (If it isn't, we can't
apply the method at all.)
2) We want $\int v\,du$ to be an easier integral than $\int u\,dv$. Therefore we want $du$ to be
simpler than $u$. At least, we don't want $du$ to be more complicated than $u$.
3) For the same reason, we want $v$ to be simpler than $dv$; at least we don't want it to
look worse than $dv$.
These rules are not infallible, but they are a help. Let us try them on
\[ \int x e^x\,dx. \]
We can integrate both $x$ and $e^x$. Therefore (1) gives us no guidance. Rule (2)
suggests that $u = x$ and $u = e^x$ are both acceptable, but that $u = x$ is to be preferred.
($De^x = e^x$, which is no worse than $e^x$, but $Dx = 1$, which looks good.) We therefore
use
\[ u = x, \qquad du = dx, \qquad dv = e^x\,dx, \qquad v = e^x. \]
This gives
\[ \int x e^x\,dx = \int u\,dv = uv - \int v\,du = x e^x - \int e^x\,dx = \{x e^x - e^x + C\}. \]
because we would then get $v = x^2/2$, which looks worse than $dv$. In fact, this choice
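\[ \int x^2 e^x\,dx. \]
Rule (3) tells us that we had better take $dv = e^x\,dx$. We therefore take
\[ u = x^2, \qquad du = 2x\,dx, \qquad dv = e^x\,dx, \qquad v = e^x. \]
This looks good under rule (2), and acceptable under rule (3). We get
\[ \int x^2 e^x\,dx = x^2 e^x - 2 \int x e^x\,dx. \]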
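We take
\[ u = e^x, \qquad du = e^x\,dx, \qquad dv = \sin x\,dx, \qquad v = -\cos x. \]
Writing $I$ for $\int e^x \sin x\,dx$, and integrating by parts a second time, we then have
\[ I = -e^x \cos x + \int e^x \cos x\,dx = -e^x \cos x + e^x \sin x - \int e^x \sin x\,dx. \]
Here the last integral is simply the one we started with. Therefore
\[ 2I = e^x(\sin x - \cos x), \qquad I = \{\tfrac12 e^x(\sin x - \cos x) + C\}. \]
\[ \int \ln x\,dx. \]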
Here we use
\[ u = \ln x, \qquad du = \frac1x\,dx, \qquad dv = dx, \qquad v = x. \]
When we replace 1 by $x$, we seem to have lost somewhat, but the profit in passing
from $\ln x$ to $1/x$ more than makes up for it. In fact, this scheme works:
\[ \int \ln x\,dx = uv - \int v\,du = x \ln x - \int x \cdot \frac1x\,dx = \{x \ln x - x + C\}. \]
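The formula $\int u\,dv = uv - \int v\,du$ also holds between limits, in the form $\int_a^b u\,v'\,dx = [uv]_a^b - \int_a^b v\,u'\,dx$. The sketch below checks this numerically for the two choices of parts used above ($u = x$, $v = \sin x$, and $u = \ln x$, $v = x$). The Simpson-rule helper, the function names and the intervals are assumptions made for illustration only.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def parts_gap(u, du, v, dv, a, b):
    """|integral of u dv  -  (uv at the limits - integral of v du)| on [a, b]."""
    lhs = simpson(lambda x: u(x) * dv(x), a, b)
    rhs = u(b) * v(b) - u(a) * v(a) - simpson(lambda x: v(x) * du(x), a, b)
    return abs(lhs - rhs)

# u = x, v = sin x on [0, 2]
gap_xcos = parts_gap(lambda x: x, lambda x: 1.0, math.sin, math.cos, 0.0, 2.0)
# u = ln x, v = x on [1, 3]
gap_ln = parts_gap(math.log, lambda x: 1 / x, lambda x: x, lambda x: 1.0, 1.0, 3.0)
print(gap_xcos, gap_ln)
```

Both gaps should be at the level of the quadrature error, which is what the formula predicts.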
Evaluate the following integrals. Each of them can be calculated by the method of
integration by parts. You should try to work these problems with the smallest possible
number of false starts. In each case, survey the situation and try to arrive at a conclusion on
the question of what choice of $u$ and $dv$ is most promising. If you do this carefully,
you ought to be able to solve each of the problems below on the first try.
Each answer should be checked by differentiation.
where n and m are any integers, positive, negative, or zero, and integrals of the forms
\[ \int \sec^n x \tan^m x\,dx, \]
and
\[ \int \csc^n x \cot^m x\,dx, \]
where $n \ge 0$ and $m \ge 0$. We shall discuss the various cases in the order of increasing
difficulty.
\[ \int \sin^n x \cos^{2k+1} x\,dx = \int \sin^n x\,(1 - \sin^2 x)^k \cos x\,dx. \]
6.5 Integration of Powers of Trigonometric' Functions 279
We expand (1 - sin2 x)k by the binomial theorem. This gives us a sum of integrals
of the form
This is like the preceding case. Here n = 2k + 1, and the integral has the form
.
2
Making these substitutions in the integrand, we get a form in which the exponents
are divided by 2. For example,
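\[ \int \sin^2 x \cos^2 x\,dx = \int \frac{1 - \cos 2x}{2} \cdot \frac{1 + \cos 2x}{2}\,dx = \frac14 \int (1 - \cos^2 2x)\,dx \]
\[ = \frac14 \int \frac{1 - \cos 4x}{2}\,dx = \int \bigl(\tfrac18 - \tfrac18 \cos 4x\bigr)\,dx = \{\tfrac18 x - \tfrac{1}{32} \sin 4x + C\}. \]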
When the exponents are large, this method is tedious, but at least we know that it
will work.
(n positive). (4)
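\[ \int \tan x\,dx = \{-\ln|\cos x| + C\} \qquad (\cos x > 0 \text{ or } \cos x < 0). \tag{4a} \]
For $n = 2$,
\[ \int \tan^2 x\,dx = \int (\sec^2 x - 1)\,dx = \{\tan x - x + C\}. \tag{4b} \]
For $n > 2$,
\[ \tan^n x = \tan^{n-2} x\,(\sec^2 x - 1), \]
and so
\[ \int \tan^n x\,dx = \frac{1}{n-1} \tan^{n-1} x - \int \tan^{n-2} x\,dx. \tag{4c} \]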
This is called a reduction formula. By repeated applications of it, we can reduce the
integral to one of the forms (4a) and (4b).
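\[ \int \cot x\,dx = \int \frac{\cos x}{\sin x}\,dx = \{\ln|\sin x| + C\} \qquad (\sin x > 0 \text{ or } \sin x < 0). \tag{5a} \]
For $n = 2$,
\[ \int \cot^2 x\,dx = \int (\csc^2 x - 1)\,dx = \{-\cot x - x + C\}. \tag{5b} \]
For $n > 2$,
\[ \cot^n x = \cot^{n-2} x\,(\csc^2 x - 1), \]
and so
\[ \int \cot^n x\,dx = -\frac{1}{n-1} \cot^{n-1} x - \int \cot^{n-2} x\,dx. \tag{5c} \]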
By repeated applications of (5c), we can reduce our integral to one of the forms
(5a) and (5b).
When we expand (1 + tan2 x)k-l by the binomial formula, we get a sum of integrals
of the form
We integrate each of these by the power formula and add the results. For example,
Jccot2 1t 1 csc2 x dx
= x + -
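\[ \int \sec x\,dx = \int \frac{\sec x\,(\sec x + \tan x)}{\sec x + \tan x}\,dx = \int \frac{du}{u}, \]
where
\[ u = \sec x + \tan x, \qquad du = (\sec x \tan x + \sec^2 x)\,dx. \]
Therefore
\[ \int \sec x\,dx = \{\ln|\sec x + \tan x| + C\}. \]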
For n odd and greater than 1, we have a problem. For example, in f sec3 x dx
it does no good to write
because the second term fits no standard form. The solution is obtained by integrating
by parts. We have
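\[ \int \sec^n x\,dx = \int \sec^{n-2} x \sec^2 x\,dx = \sec^{n-2} x \tan x - \int (n-2) \sec^{n-2} x \tan^2 x\,dx. \]
Since $\tan^2 x = \sec^2 x - 1$, the last integral is $(n-2)\int \sec^n x\,dx - (n-2)\int \sec^{n-2} x\,dx$. Solving for $\int \sec^n x\,dx$, we get
\[ \int \sec^n x\,dx = \frac{1}{n-1} \sec^{n-2} x \tan x + \frac{n-2}{n-1} \int \sec^{n-2} x\,dx. \tag{8} \]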
There is a similar reduction formula which works for odd powers of the cosecant:
\[ \int \csc^n x\,dx = -\frac{1}{n-1} \csc^{n-2} x \cot x + \frac{n-2}{n-1} \int \csc^{n-2} x\,dx. \tag{9} \]
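A reduction formula is naturally a recursion. As a rough sketch, the function below applies the secant formula $\int \sec^n x\,dx = \frac{1}{n-1}\sec^{n-2}x\tan x + \frac{n-2}{n-1}\int \sec^{n-2}x\,dx$ repeatedly until it reaches $n = 1$ or $n = 0$. The function names are hypothetical, and the check by numerical differentiation is only a sanity test at a few points of $(-\pi/2, \pi/2)$.

```python
import math

def sec_power_antiderivative(n, x):
    """One antiderivative of sec^n x (integer n >= 0) on (-pi/2, pi/2),
    built by repeated use of the reduction formula for the secant."""
    sec, tan = 1 / math.cos(x), math.tan(x)
    if n == 0:
        return x                              # integral of 1
    if n == 1:
        return math.log(abs(sec + tan))       # integral of sec x
    return (sec ** (n - 2) * tan) / (n - 1) \
        + (n - 2) / (n - 1) * sec_power_antiderivative(n - 2, x)

def derivative_gap(n, x, h=1e-5):
    """|d/dx of the antiderivative - sec^n x| by a central difference."""
    slope = (sec_power_antiderivative(n, x + h)
             - sec_power_antiderivative(n, x - h)) / (2 * h)
    return abs(slope - (1 / math.cos(x)) ** n)

for n in (2, 3, 5):
    print(n, derivative_gap(n, 0.7))
```

For $n = 2$ the recursion stops after one step and returns $\tan x$, as it should.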
Before starting to work on these problems, you should read Section 6.5 carefully, until
you understand what the methods are and why they work. In working the problems, you
should refer to the text as seldom as possible. You should try to avoid looking up even the
reduction formulas (8) and (9), unless a problem requires you to apply one of them more
than once. If only one reduction is required, you should integrate by parts, instead of using
the reduction formula. As you will see, the first few problems below are designed to remind
you of the methods that we have been using. Check by differentiation in each case.
13. $\int \csc^3 x\,dx$    14. $\int \sec^5 x\,dx$    15. $\int \csc^5 x\,dx$
16. $\int \sin x \sec x\,dx$    17. $\int \cos x \csc x\,dx$    18. $\int \sin x \sec^3 x\,dx$
19. $\int \sin^2 x \sin 2x\,dx$    20. $\int \cos^3 2x\,dx$    21. $\int \cos^2 2x\,dx$
22. $\int \tan 2x \sec 2x\,dx$    23. $\int \csc x \tan x\,dx$    24. $\int \dfrac{\sin x}{\cos^2 x}\,dx$
2 . J x dx
8
sec
1
tan
29. x1 x dx
J
csc cot
30.
x
\[ \int \cos^n x\,dx = A \cos^{n-1} x \sin x + B \int \cos^{n-2} x\,dx. \]
Derive it.
In Section 6.2 we found that there was a close connection between certain simple
integrals and some more complicated ones. For example, if we know that
(We are using a different dummy letter in the second problem, for reasons which will
soon be clear.) Thus we have two related integration problems:
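\[
\begin{array}{ccc}
\int x^2\,dx & = & \{\tfrac13 x^3 + C\} \\
\downarrow\ x \to \sin\theta & & \downarrow\ x \to \sin\theta \\
\int \sin^2\theta \cos\theta\,d\theta & \overset{(!)}{=} & \{\tfrac13 \sin^3\theta + C\}
\end{array}
\]
The (!) at the bottom indicates that the equation in the bottom line is the final
conclusion. The pattern here is the following:
\[
\begin{array}{ccc}
\int f(x)\,dx & = & \{F(x) + C\} \\
\downarrow\ x \to u(\theta) & & \downarrow\ x \to u(\theta) \\
\int f(u(\theta))\,u'(\theta)\,d\theta & \overset{(!)}{=} & \{F(u(\theta)) + C\}
\end{array}
\]
Thus, if we know how to find $\int f(x)\,dx$, we can use the result to find $\int f(u)\,du$.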
It sometimes happens, however, that we want to move in the opposite direction;
sometimes we can see how to calculate
\[ \int f(u(\theta))\,u'(\theta)\,d\theta, \]
and we want to use the result to calculate $\int f(x)\,dx$. So as to give ourselves a simple
example to work with at first, let us suppose that we know about the functions
Sin and $\operatorname{Sin}^{-1}$, but do not know that $1/\sqrt{1-x^2}$ is the derivative of $\operatorname{Sin}^{-1} x$. We then
consider
\[ \int \frac{dx}{\sqrt{1-x^2}}. \]
We observe that it does not fit any form that we know. But perhaps it would be
manageable if we could extract the indicated square root. For $x = \operatorname{Sin}\theta$, the square
root can be extracted. (See below.) If we replace the dummy letter $x$ by the function
$\operatorname{Sin}\theta$, then $dx$ becomes $\operatorname{Sin}'\theta\,d\theta$, and we get the related integrals on the left in the
following diagram:
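\[
\begin{array}{ccc}
\int \dfrac{dx}{\sqrt{1-x^2}} & = & ? \\
\downarrow\ x \to \operatorname{Sin}\theta & & \\
\int \dfrac{\cos\theta\,d\theta}{\sqrt{1-\operatorname{Sin}^2\theta}} & = & ?
\end{array}
\]
Now
\[ \int \frac{\cos\theta\,d\theta}{\sqrt{1-\operatorname{Sin}^2\theta}} = \int \frac{\cos\theta\,d\theta}{\sqrt{\cos^2\theta}} = \int 1\,d\theta = \{\theta + C\}. \]
Letting $\theta \to \operatorname{Sin}^{-1} x$, we complete the diagram:
\[
\begin{array}{ccc}
\int \dfrac{dx}{\sqrt{1-x^2}} & \overset{(!)}{=} & \{\operatorname{Sin}^{-1} x + C\} \\
\downarrow\ x \to \operatorname{Sin}\theta & & \uparrow\ \theta \to \operatorname{Sin}^{-1} x \\
\int \dfrac{\cos\theta\,d\theta}{\sqrt{1-\operatorname{Sin}^2\theta}} & = & \{\theta + C\}
\end{array}
\]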
In this case, of course, the solution in the top line was known before we started.
But the same scheme works in general, whenever we can calculate the new integral
on the lower left:
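\[
\begin{array}{ccc}
\int f(x)\,dx & \overset{(!)}{=} & \{G(u^{-1}(x)) + C\} \\
\downarrow\ x \to u(\theta) & & \uparrow\ \theta \to u^{-1}(x) \\
\int f(u)\,du = \int f(u(\theta))\,u'(\theta)\,d\theta & = & \{G(\theta) + C\}
\end{array}
\]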
We shall prove, at the end of this section, that this procedure is valid, whenever the
symbols u' and u-1 have a meaning; that is, whenever u has both a derivative and an
inverse. Meanwhile, we shall show how the scheme is used to solve problems
which would otherwise be hard.
Example 1.
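\[ \int \frac{dx}{x^2\sqrt{1-x^2}} = ? \qquad (-1 < x < 1,\ x \ne 0). \]
As in the preceding case, it seems to be the radical that is causing the trouble; and so
we get rid of it by the substitution $x \to \operatorname{Sin}\theta$:
\[ \int \frac{dx}{x^2\sqrt{1-x^2}} \to \int \frac{\cos\theta\,d\theta}{\operatorname{Sin}^2\theta \cos\theta} = \int \csc^2\theta\,d\theta = \{-\cot\theta + C\}. \]
(Throughout, $-\pi/2 < \theta < \pi/2$; on this interval, $\sin\theta = \operatorname{Sin}\theta$, and the usual
identities hold automatically.)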
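We now reverse the substitution, using $\theta \to \operatorname{Sin}^{-1} x$. This gives
\[ \int \frac{dx}{x^2\sqrt{1-x^2}} = \{-\cot \operatorname{Sin}^{-1} x + C\}. \]
Here
\[ \operatorname{Sin}\theta = x, \qquad \theta = \operatorname{Sin}^{-1} x; \]
\[ \cot \operatorname{Sin}^{-1} x = \cot\theta = \frac{\cos\theta}{\sin\theta} = \frac{\sqrt{1-x^2}}{x}. \]
Therefore
\[ \int \frac{dx}{x^2\sqrt{1-x^2}} = \Bigl\{-\frac{\sqrt{1-x^2}}{x} + C\Bigr\}. \]
Note that all the trigonometry has cancelled out of the problem. Our answer checks:
\[ D\,\frac{\sqrt{1-x^2}}{x} = \frac{1}{x^2}\Bigl(x \cdot \frac{-x}{\sqrt{1-x^2}} - \sqrt{1-x^2}\Bigr) = \frac{1}{x^2\sqrt{1-x^2}}\bigl[-x^2 - (\sqrt{1-x^2})^2\bigr] = \frac{-1}{x^2\sqrt{1-x^2}}. \]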
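We can sum this up in a diagram as follows:
\[
\begin{array}{ccc}
\int \dfrac{dx}{x^2\sqrt{1-x^2}} & \overset{(!)}{=} & \Bigl\{-\dfrac{\sqrt{1-x^2}}{x} + C\Bigr\} \\
\downarrow\ x \to \operatorname{Sin}\theta & & \uparrow\ \theta \to \operatorname{Sin}^{-1} x \\
\int \csc^2\theta\,d\theta & = & \{-\cot\theta + C\}
\end{array}
\]
The substitution $x \to \operatorname{Sin}\theta$ is the usual one to try, if the troublesome part of the
integrand is $\sqrt{1-x^2}$. In other cases, $x \to \operatorname{Tan}\theta$ works in much the same way.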
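The three steps of the diagram (substitute, integrate in $\theta$, reverse the substitution) can be mirrored numerically. The sketch below, with illustrative function names, takes $G(\theta) = -\cot\theta$, forms $F(x) = G(\operatorname{Sin}^{-1} x)$, and compares $F$ with the closed form $-\sqrt{1-x^2}/x$ and $F'$ with the integrand $1/(x^2\sqrt{1-x^2})$.

```python
import math

def G(theta):
    """Antiderivative found in the theta-variable: -cot(theta)."""
    return -1 / math.tan(theta)

def F(x):
    """Reverse substitution theta -> Sin^{-1} x applied to G."""
    return G(math.asin(x))

def closed_form(x):
    """The same function with the trigonometry cancelled out."""
    return -math.sqrt(1 - x * x) / x

def integrand(x):
    return 1 / (x * x * math.sqrt(1 - x * x))

def derivative(func, x, h=1e-6):
    return (func(x + h) - func(x - h)) / (2 * h)

for x in (-0.6, 0.3, 0.8):
    print(x, F(x) - closed_form(x), derivative(F, x) - integrand(x))
```

Both printed differences should be close to zero for every $x$ in $(-1, 1)$ other than $0$.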
Example 2. Consider
This gives
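\[ \int \sec^3\theta\,d\theta = \int \sec^n\theta\,d\theta \qquad (n = 3) \]
\[ = \frac{1}{n-1} \sec^{n-2}\theta \tan\theta + \frac{n-2}{n-1} \int \sec^{n-2}\theta\,d\theta \]
\[ = \tfrac12 \sec\theta \tan\theta + \tfrac12 \int \sec\theta\,d\theta \]
\[ = \{\tfrac12 \sec\theta \tan\theta + \tfrac12 \ln|\sec\theta + \tan\theta| + C\} \]
\[ = \{G(\theta) + C\}. \]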
We complete the solution by letting e -+ Tan-1 x. This gives
y
6.6 Integration by Substitution 289
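In the figure, $-\pi/2 < \theta < \pi/2$, but $\theta$ may be positive or negative. We take $OP = 1$.
This gives
\[ x = \operatorname{Tan}\theta, \qquad \theta = \operatorname{Tan}^{-1} x, \qquad r = \sec\theta = \sec \operatorname{Tan}^{-1} x, \]
so that
\[ \sec \operatorname{Tan}^{-1} x = \sqrt{1+x^2}. \]
Therefore the answer is
\[ \{\tfrac12 x\sqrt{1+x^2} + \tfrac12 \ln|\sqrt{1+x^2} + x| + C\}. \]
This can be simplified slightly: since $\sqrt{1+x^2} + x > 0$ for every $x$, we can omit the
absolute value bars, getting:
\[ \int \sqrt{1+x^2}\,dx = \{\tfrac12 x\sqrt{1+x^2} + \tfrac12 \ln(\sqrt{1+x^2} + x) + C\}. \]
As before, we sum up in a diagram the process by which the problem was solved:
\[
\begin{array}{ccc}
\int \sqrt{1+x^2}\,dx & \overset{(!)}{=} & \{\tfrac12 x\sqrt{1+x^2} + \tfrac12 \ln(\sqrt{1+x^2} + x) + C\} \\
\downarrow\ x \to \operatorname{Tan}\theta & & \uparrow\ \theta \to \operatorname{Tan}^{-1} x \\
\int \sec^3\theta\,d\theta & = & \{\tfrac12 \sec\theta \tan\theta + \tfrac12 \ln|\sec\theta + \tan\theta| + C\}
\end{array}
\]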
Such diagrams are worth drawing, especially the first few times you use the
substitution process; often the calculations are long, and it is easy to lose track of
what the process means.
The answer in Example 2 suggests that no method would have made the problem
seem easy. Note that the formulas of Section 6. 5 are turning out to be useful in solving
problems which do not appear, at first, to involve trigonometry at all.
We return to the general theory, to see why this method works. The pattern of
our work is described by the diagram:
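\[
\begin{array}{ccc}
\int f(x)\,dx & \overset{(!)}{=} & \{G(u^{-1}(x)) + C\} \\
\downarrow\ x \to u(\theta) & & \uparrow\ \theta \to u^{-1}(x) \\
\int f(u(\theta))\,u'(\theta)\,d\theta & = & \{G(\theta) + C\}
\end{array}
\]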
What we are claiming, when we use the method of substitution, is that, if the second
equation holds, so does the first. In terms of the definition of the indefinite integral,
this means the following:
\[ f(u(u^{-1}))\,u'(u^{-1})\,Du^{-1} = f \cdot u'(u^{-1})\,Du^{-1} = f \cdot u'(u^{-1}) \cdot \frac{1}{u'(u^{-1})}, \]
by the general formula for the derivative of the inverse of a function. Now u' (u-1)
cancels, and gives us D[G(u-1)] = f, which was to be proved.
Calculate each of the following integrals, by any method. In most cases, but not all,
the easiest method is to use a substitution of the form $x \to \operatorname{Sin}\theta$, $x \to \operatorname{Tan}\theta$, or $x \to \operatorname{Sec}\theta$.
In each case where you do use the method of substitution, you should sum up the process
of solution in a diagram as in Examples 1 and 2 in the text. Finally, check in each case by
differentiation.
1. J (1 - x2)-3!2dx 2. J v' x2
dx
+ 1
3.
J
dx
v' x2 - 1
4. J dx
x(l + x2)
dx 5.
J xv' x2 - 1
dx
dx 6. J x(l - x2)-3f2dx
7. J dx
x2(1 + x2)
8. J 1
xdx
+ x2
9.
J Vl - x2dx
10.
J
dx
x2v' x2 - 1
11.
J v' x2 - 1
xdx
12. J x2v'1 - x2dx
13. J x2dx
Vl - x2
14.
J
x3dx
v'1 - x2
15.
J 1 +
x2dx
x2
16.
J x2v'l + x2 dx 17. J� v'l - x2dx 18. J (1 : x2)3dx
19. J xv'l + x2dx 20.
J x3 v' 1 + x2dx 21.
J 1
x3
+ x2
dx
= /(x) dx,
a ida>
C u-l(c)
24. Obviously there is no point in writing this on a paper which is to be turned in and
graded; but for your own benefit, reproduce the proof of the following, without reference
to the text:
G' = f(u)u' => D[G(u-1)] = f
It is a good rule, if you have a problem which you don't see how to solve, to try to
think of an easier problem that resembles it. If you can solve the easier problem, and
bridge the gap between the two, then you have solved the problem which you started
with. For example, consider
\[ \int \frac{x^2\,dx}{\sqrt{2x+1}}. \]
This does not fit any of the standard forms that we know about. There is no reason to
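\[ 2x + 1 = t \iff x = \tfrac12(t-1). \]
We therefore try the substitution
\[ x \to u(t) = \tfrac12(t-1), \qquad dx \to \tfrac12\,dt. \]
Under this substitution,
\[ \int \frac{x^2\,dx}{\sqrt{2x+1}} \to \int \frac{\tfrac14(t-1)^2 \cdot \tfrac12\,dt}{\sqrt t}. \]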
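The latter integral is easy to calculate. It is
\[ \tfrac18 \int (t^{3/2} - 2t^{1/2} + t^{-1/2})\,dt = \{\tfrac{1}{20} t^{5/2} - \tfrac16 t^{3/2} + \tfrac14 t^{1/2} + C\} = \{G(t) + C\}. \]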
To get the answer to the problem which we started with, we use the inverse substitution
\[ \int \frac{x^2\,dx}{\sqrt{2x+1}} = \{\tfrac{1}{20}(2x+1)^{5/2} - \tfrac16(2x+1)^{3/2} + \tfrac14(2x+1)^{1/2} + C\} = \{G(u^{-1}(x)) + C\}. \]
The scheme here is the same as in the preceding section:
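\[
\begin{array}{ccc}
\int \dfrac{x^2}{\sqrt{2x+1}}\,dx & \overset{(!)}{=} & \{G(u^{-1}(x)) + C\} \\
\downarrow\ x \to u(t) & & \uparrow\ t \to u^{-1}(x) \\
\int \dfrac{t^2 - 2t + 1}{8\sqrt t}\,dt & = & \{G(t) + C\}
\end{array}
\]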
The only differences are that (a) the functions u and u-1 are described algebraically,
and (b) the formulas for G(t) and G(u-1(x)) are too long to be conveniently written in
the diagram. In any case, we know that the method works: this follows from Theorem
1 of Section 6.6.
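As a numerical illustration of the scheme, the sketch below compares a direct quadrature of $\int_0^4 x^2/\sqrt{2x+1}\,dx$ with $G(u^{-1}(4)) - G(u^{-1}(0))$, where $G(t) = \tfrac{1}{20}t^{5/2} - \tfrac16 t^{3/2} + \tfrac14 t^{1/2}$ and $u^{-1}(x) = 2x + 1$. The interval $[0, 4]$, the Simpson helper and the function names are assumptions chosen for the example.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrand(x):
    return x * x / math.sqrt(2 * x + 1)

def G(t):
    """Antiderivative in the t-variable, after x -> (t - 1)/2."""
    return t ** 2.5 / 20 - t ** 1.5 / 6 + math.sqrt(t) / 4

def u_inverse(x):
    return 2 * x + 1

direct = simpson(integrand, 0.0, 4.0)
via_substitution = G(u_inverse(4.0)) - G(u_inverse(0.0))
print(direct, via_substitution)
```

The two numbers should agree to within the quadrature error, which is what Theorem 1 of Section 6.6 guarantees.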
Often we can tell that a substitution is going to work, long before we know what the
answer is. As soon as we wrote
\[ \int \frac{x^2\,dx}{\sqrt{2x+1}} \to \int \frac{\tfrac14(t-1)^2 \cdot \tfrac12\,dt}{\sqrt t}, \]
it was evident that the numerator was a polynomial. We can integrate the quotient
of a polynomial and a power. Similarly, we know that we can integrate
HY2 - 2. Y
J J - 2y2+ 1) dy
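In
\[ \int \frac{dx}{\sqrt{x^2+1}}, \]
we want to extract the square root; we can do this if $x \to \operatorname{Tan}\theta$:
\[ \int \frac{dx}{\sqrt{x^2+1}} \to \int \frac{\sec^2\theta\,d\theta}{\sec\theta} = \int \sec\theta\,d\theta. \]
If instead we try the algebraic substitution
\[ z^2 = x^2 + 1, \qquad x \to u(z) = \sqrt{z^2 - 1}, \]
this gives
\[ dx \to \frac{z}{\sqrt{z^2-1}}\,dz, \]
\[ \int \frac{dx}{\sqrt{x^2+1}} \to \int \frac1z \cdot \frac{z}{\sqrt{z^2-1}}\,dz = \int \frac{dz}{\sqrt{z^2-1}}, \]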
which gets us nowhere, unless we happen to remember the solution of Problem 3 of
Problem Set 6.6.
Usually, to find out what algebraic substitution is going to work, we need to
solve an algebraic equation. For example, given
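\[ \int \frac{dz}{1+\sqrt z}, \]
to get
\[ 1 + \sqrt z \to t, \]
we need
\[ z \to (t-1)^2. \]
We usually write this with "=" signs:
\[ 1 + \sqrt z = t, \qquad z = (t-1)^2, \qquad dz = 2(t-1)\,dt. \]
This gives
\[ \int \frac{dz}{1+\sqrt z} \to \int \frac{2(t-1)\,dt}{t} = \int \Bigl(2 - \frac2t\Bigr)\,dt = \{2t - 2\ln|t| + C\}. \]
The reverse substitution
\[ t \to u^{-1}(z) = 1 + \sqrt z \]
gives the final answer
\[ \{2(1+\sqrt z) - 2\ln(1+\sqrt z) + C\}. \]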
(Query: Would it be all right to delete the "1" in the first parenthesis?)
This is probably the most efficient solution. If we hadn't thought of it, we might
have tried
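\[ \sqrt z = t, \]
which gives the substitution
\[ z \to u(t) = t^2, \qquad dz \to 2t\,dt. \]
Now
\[ \frac{t}{1+t} = 1 - \frac{1}{1+t}. \]
Therefore
\[ \int \frac{dz}{1+\sqrt z} \to \int \frac{2t\,dt}{1+t} = \int \Bigl(2 - \frac{2}{1+t}\Bigr)\,dt = \{2t - 2\ln|1+t| + C\}, \]
getting
\[ \int \frac{dz}{1+\sqrt z} = \{2\sqrt z - 2\ln(1+\sqrt z) + C\}. \]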
(Is this really the same as the previous answer? Why or why not?)
We have used the substitutions x---+ Sin8; x---+ Tan8, and x---+ Sec() to handle
integrands involving the radicals
and
() = Sin-1 �.
a
Thus
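\[ \int \sqrt{a^2-x^2}\,dx \to \int a^2\cos^2\theta\,d\theta = \int a^2 \cdot \tfrac12(\cos 2\theta + 1)\,d\theta = \Bigl\{\frac{a^2}{4}\sin 2\theta + \frac{a^2}{2}\theta + C\Bigr\}. \]
Now
\[ \sin 2\theta = 2\sin\theta\cos\theta = 2 \cdot \frac{x}{a} \cdot \frac{\sqrt{a^2-x^2}}{a}. \]
This gives
\[ \int \sqrt{a^2-x^2}\,dx = \Bigl\{\tfrac12 x\sqrt{a^2-x^2} + \frac{a^2}{2}\operatorname{Sin}^{-1}\frac{x}{a} + C\Bigr\}. \]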
In the same way, we use
x---+ a Tan 8,
the trouble seems to be that the integrand is concentrated in its own denominator.
We ought to be able to correct this by letting
\[ x \to u(t) = \frac1t, \qquad dx \to -\frac{1}{t^2}\,dt. \]
This gives
\[ \int \frac{dx}{x^2(x^2+1)} \to \int \Bigl(-1 + \frac{1}{1+t^2}\Bigr)\,dt = \{-t + \operatorname{Tan}^{-1} t + C\}, \]
\[ t \to u^{-1}(x) = \frac1x. \]
In writing up solutions of problems, in the following problem set, you need not
draw diagrams of the form:
\[
\begin{array}{ccc}
\downarrow\ x \to u(t) & & \\
\int f(u(t))\,u'(t)\,dt & = & \{G(t) + C\}
\end{array}
\]
But whenever you use a substitution, you should explain what you are doing, by
writing formulas of the type
x ---+ u(t) = · · · ,
6.8 Algebraic Devices: Completing the Square and Partial Fractions 297
dx vx
1.
J (I + V:x-)3
2.
J (1 + Vx)3
dx 3.
J (a2 _
x2)-af2dx
4. J (a2 + x2)-af2dx 5.
J
dx
v1 + e"'
6.
J
dx
v1 - e"'
dx dx dx
7.
J l + �x
8.
J (1 - �x)2
9.
J Yvx + 1
10.
J
z3dz
vz + I
11. J Vz2
z3dz
- I
12.
J
dx
YI + �'x
dx dx
13.
J � l + vx
14.
J x4(x - I)
15.
J (l - x2)4dx
dx dx
16.
J (I + vx)3vxdx 17.
J� 18.
J (1 + e"')4
19.
J
dx
Yl + e2"'
20. J Sin-1xdx 21.
J xln xdx
22.
J x Sin-1xdx 23. J Tan-1 xdx 24.
J xTan-1 xdx
25.
J
l
(1 + vx)2
dx 26. J 1
vx(l + vx)2
dx 27.
J
1
l + fix
dx
28. J v1 +fix
I
dx 29. J : (l
dx
e"')2
30.
J
dx
v 1 + e3X
31. J dx
x3(1 - x)
32. J x2 ln xdx 33. J x2Tan-1xdx
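\[ x^2 + x + 1 = x^2 + x + \tfrac14 + \tfrac34 = (x + \tfrac12)^2 + \Bigl(\frac{\sqrt3}{2}\Bigr)^2. \]
Therefore
\[ \int \frac{dx}{\sqrt{x^2+x+1}} = \int \frac{dx}{\sqrt{(x+\tfrac12)^2 + (\sqrt3/2)^2}}, \]
which has the form
\[ \int \frac{du}{\sqrt{u^2+a^2}}. \]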
\[ \int \frac{du}{\sqrt{u^2-a^2}}. \]
Here we would use
\[ u \to a \operatorname{Sec}\theta, \]
and proceed as in Section 6.6.
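The following simple-looking problem has a curious solution:
\[ \int \frac{dx}{x^2-1} = ? \]
We try
\[ x \to \operatorname{Sec}\theta \qquad (x > 1) \]
so that
\[ dx \to \sec\theta \tan\theta\,d\theta, \]
giving
\[ \int \frac{\sec\theta\tan\theta\,d\theta}{\tan^2\theta} = \int \csc\theta\,d\theta = \{-\ln|\csc\theta + \cot\theta| + C\}. \]
Now
\[ \csc\theta + \cot\theta \to \frac{x}{\sqrt{x^2-1}} + \frac{1}{\sqrt{x^2-1}} = \frac{x+1}{\sqrt{x^2-1}} = \sqrt{\frac{x+1}{x-1}} \qquad (x > 1). \]
Therefore
\[ \int \frac{dx}{x^2-1} = \Bigl\{\tfrac12 \ln\frac{x-1}{x+1} + C\Bigr\} \qquad (x > 1). \]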
We check by differentiation:
\[ D\Bigl(\tfrac12 \ln\Bigl|\frac{x-1}{x+1}\Bigr|\Bigr) = D\bigl(\tfrac12\ln|x-1| - \tfrac12\ln|x+1|\bigr) = \frac12 \cdot \frac{1}{x-1} - \frac12 \cdot \frac{1}{x+1} = \frac12 \cdot \frac{(x+1)-(x-1)}{(x-1)(x+1)} = \frac{1}{x^2-1}. \]
This shows that our answer was right. But it also shows that our use of trigonometry
was unnecessary; the solution depends merely on the algebraic identity
\[ \frac{1}{x^2-1} = \frac{\tfrac12}{x-1} - \frac{\tfrac12}{x+1}. \]
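\[ \frac{cx+d}{(x-a)(x-b)} = \frac{A}{x-a} + \frac{B}{x-b} \qquad (x \ne a, b). \]
\[ \frac{cx+d}{(x-a)(x-b)} = \frac{A}{x-a} + \frac{B}{x-b} \iff cx + d = A(x-b) + B(x-a) \iff A + B = c \ \text{and}\ Ab + Ba = -d. \]
\[ A = \frac{ac+d}{a-b}, \qquad B = \frac{bc+d}{b-a}. \]
Theorem 1. If $a \ne b$, then
\[ \frac{cx+d}{(x-a)(x-b)} = \frac{ac+d}{a-b} \cdot \frac{1}{x-a} + \frac{bc+d}{b-a} \cdot \frac{1}{x-b}. \]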
But nobody could remember this formula. The efficient way to handle such
problems is the following. Given
\[ \int \frac{dx}{(x-2)(x-5)} = ? \]
we know by Theorem 1 that there are numbers $A$ and $B$ such that
\[ \frac{1}{(x-2)(x-5)} = \frac{A}{x-2} + \frac{B}{x-5}. \]
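The only problem is to find out what they are, numerically. We first write
\[ 1 = A(x-5) + B(x-2), \]
and solve, getting
\[ A = -\tfrac13, \qquad B = \tfrac13. \]
\[ \frac{1}{(x-2)(x-5)} = -\frac13 \cdot \frac{1}{x-2} + \frac13 \cdot \frac{1}{x-5}. \]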
would not have been valid. To see this, consider the following analogous procedure:
Suppose that $\sin x = ax$ for some number $a$. Setting $x = \pi/2$, we get
\[ 1 = a \cdot \frac{\pi}{2}, \]
and $a = 2/\pi$. Therefore
\[ \sin x = \frac{2x}{\pi} \quad \text{for every } x \quad (?) \]
This is wrong: in fact, our formula is correct for only three values of $x$, namely,
$x = 0$ and $x = \pm\pi/2$. The fallacy was in assuming at the outset that the problem
had a solution, when in fact it has none. What the above line of reasoning really
proves is the following:
$\sin x$ is a linear function (1)
$\Rightarrow$ $\sin x$ is the linear function $2x/\pi$. (2)
The statement (1) $\Rightarrow$ (2) is true, but it is not useful, because (1) is false.
The method that we used for quadratic denominators also works whenever the
denominator can be factored into linear factors all of which are different.
2x2+ 1 A B C
��������-- = + + �� �� ��,
Theorem 2. If $p(x)$ is of degree $\le 2$, and $a$, $b$, and $c$ are all different, then there are
numbers $A$, $B$, and $C$ such that
\[ \frac{p(x)}{(x-a)(x-b)(x-c)} = \frac{A}{x-a} + \frac{B}{x-b} + \frac{C}{x-c}. \tag{3} \]
With our present equipment, we could give only a brute-force proof. But we know
how to handle simple cases, and in the following problem set you will see how various
more difficult problems of the same type can be solved.
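For distinct linear factors the coefficients can be computed directly: multiplying the identity by $x - a$ and letting $x \to a$ gives $A = p(a)/((a-b)(a-c))$, and similarly for the others. This "evaluate at the root" rule is a standard fact, not a method stated in the text above; the sketch below applies it, with a hypothetical function name and a made-up test polynomial.

```python
def partial_fraction_coefficients(p, roots):
    """Coefficients A_k with p(x) / prod(x - r_k) = sum of A_k / (x - r_k).
    Assumes the roots are distinct and deg p < number of roots."""
    coeffs = []
    for k, r in enumerate(roots):
        denom = 1.0
        for j, s in enumerate(roots):
            if j != k:
                denom *= (r - s)          # product of (r_k - r_j), j != k
        coeffs.append(p(r) / denom)
    return coeffs

# The quadratic case worked in the text: 1 / ((x - 2)(x - 5)).
A, B = partial_fraction_coefficients(lambda x: 1.0, [2.0, 5.0])
print(A, B)

# A cubic denominator, as in Theorem 2 (illustrative numerator and roots).
p = lambda x: 2 * x * x + 1
roots = [1.0, 2.0, 3.0]
coeffs = partial_fraction_coefficients(p, roots)
x = 0.3
lhs = p(x) / ((x - 1) * (x - 2) * (x - 3))
rhs = sum(c / (x - r) for c, r in zip(coeffs, roots))
print(lhs, rhs)
```

The first call reproduces the values of $A$ and $B$ found above for $1/((x-2)(x-5))$.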
Find:
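1. $\int \dfrac{dx}{x^2+2x+5}$    2. $\int \dfrac{dx}{\sqrt{x^2+2x+5}}$    3. $\int \dfrac{dx}{x^2-4}$
4. $\int \dfrac{dx}{x^2+x-4}$    5. $\int \dfrac{dx}{\sqrt{x^2+x-4}}$    6. $\int \dfrac{dx}{\sqrt{2-x^2}}$
7. $\int \dfrac{dx}{\sqrt{2-2x-x^2}}$
8. $\int \dfrac{dx}{\sqrt{-2x-x^2}}$
(Is this an impossible problem at the outset?)
9. $\int \dfrac{dx}{x^2+6x+10}$    10. $\int \dfrac{dx}{\sqrt{-x^2-6x+10}}$    11. $\int \dfrac{dx}{x^2-6x+10}$
12. $\int \dfrac{dx}{(x-1)(x-2)}$    13. $\int \dfrac{x\,dx}{(x-1)(x-2)}$    14. $\int \dfrac{dx}{x(x-1)(x-2)}$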
J x(x - l)(x - 2)
x dx
15.
J x(x- l)(x- 2)
Find the unknown coefficients A, B, C, . . . which satisfy the following equations:
1��x
1 A B C
16.
(x - I)2 (x - 2)
=
(x - 1)2
+
x- 1
+ --
x - 2
--
17. Find
J (x _ _ 2)
1 A B C D d
18.
x2 (x - l )2 = x - + (x - J)2 +x- I
2 +x --
19. Find
J x2(x
�
1)2
I A B c D
20. = + + +
x+ I (x- 2)3 (x - 2)2 (x - 2)
---
dx I A Bx+ C
21. Find
J (x+ l)(x- 2)3
22.
2
x(x + I)
=-+
x x2+ I
dx 1 A Bx+ C Dx+ E
23. Find
J x(xz + I)
24.
x(x2+1)2
=-+
x (x2+ 1)2
+
x2+1
dx
J J
sine
25. Find 26. Find d8
x(xz+1)2 1 + cos 8
302 The Technique of Integration 6.8
28. Find J d8/(1 +cos 8). (One way to do this is to use the substitution
= 2 dx
8 ->- u(x) = 2 Tan-1 x, d8 _,. u'(x) dx
1 + x 2'
sin 8 _,. ? cos 8 ->- ?
But when you see the answer, you may be able to think of a simpler method of solution.)
Find:
f f f
d8 d8 dx
29. 30. 31.
1 +sin 8 cos 8 x2 +6x +9
f f J
d8 d8
32. 33. 34. sec3 8 cot 8 d8
sin 8 +cos 8 sec 8 +tan 8
f J
d8 d8
J
d8
35. 36. 37.
1 - sin 8 2 +cos 8 cos 8 - sin 8
7 The Definite Integral
We join successive points $P_{i-1}$, $P_i$ with segments, getting a broken line as in the figure.
Such a broken line is said to be inscribed in the graph of $f$. Its length is
\[ P_0P_1 + P_1P_2 + \cdots + P_{n-1}P_n. \]
We denote this by $p(N)$. That is,
\[ p(N) = \sum_{i=1}^{n} P_{i-1}P_i. \]
We use the functional notation p(N), because when the net N is named, the broken
line is determined, and so also is its length.
The graph of a continuous function on a closed interval may have infinite length.
But if the length is finite, we ought to be able to approximate it by using a net N
which cuts up [a, b] into very small pieces. This idea is the basis of the following
definitions.
304 The Definite Integral 7.1
Definition. Let
be a net over [a, b]. The mesh of N is the largest of the numbers
\[ \lim_{|N| \to 0} p(N) = L. \]
Intuitively, this means that $p(N) \approx L$ when $|N| \approx 0$. We define this idea by the same
method that we used to define the limit of a function at a point. To make the analogy
clearer, we write the old and new definitions in parallel.
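Old definition. Suppose that for every $\epsilon > 0$ there is a $\delta > 0$ such that if $x$ is a point of $[a, b]$, then
\[ 0 < |x - x_0| < \delta \Rightarrow |f(x) - L| < \epsilon. \]
Then
\[ \lim_{x \to x_0} f(x) = L. \]
New definition. Suppose that for every $\epsilon > 0$ there is a $\delta > 0$ such that if $N$ is a net over $[a, b]$, then
\[ |N| < \delta \Rightarrow |p(N) - L| < \epsilon. \]
Then
\[ \lim_{|N| \to 0} p(N) = L. \]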
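To calculate the arc length, we first express the length $p(N)$ (of the inscribed
broken line) in terms of things that we know how to handle. By definition,
\[ p(N) = \sum_{i=1}^{n} P_{i-1}P_i. \]
(See the earlier figure of the inscribed broken line.) The segment from $P_{i-1}$ to $P_i$ looks like this:
[Figure: the segment from $P_{i-1} = (x_{i-1}, y_{i-1})$ to $P_i = (x_i, y_i)$, where $y_{i-1} = f(x_{i-1})$ and $y_i = f(x_i)$; an intermediate point $\bar x_i$ is marked between $x_{i-1}$ and $x_i$ on the $x$-axis.]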
7.1 The Problem of Arc Length 305
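Thus
\[ (P_{i-1}P_i)^2 = (\Delta x_i)^2 + (\Delta y_i)^2 = \Bigl[1 + \Bigl(\frac{\Delta y_i}{\Delta x_i}\Bigr)^2\Bigr] (\Delta x_i)^2. \]
Here the fraction $\Delta y_i/\Delta x_i$ is the slope of the chord from $P_{i-1}$ to $P_i$; and the mean-value
theorem says that this is the slope of the tangent line at some intermediate point.
Thus we have
\[ \frac{\Delta y_i}{\Delta x_i} = f'(\bar x_i) \qquad (x_{i-1} < \bar x_i < x_i). \]
Making this substitution, and extracting the square root, we get
\[ p(N) = \sum_{i=1}^{n} P_{i-1}P_i = \sum_{i=1}^{n} \sqrt{1 + [f'(\bar x_i)]^2}\,\Delta x_i. \]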
The problem is to find out what happens to the sum on the right as INI -+ 0. We can
find this by giving a geometric interpretation to the sum.
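[Figure: the graph of $y = g(x)$ over $[a, b]$, with a rectangle of altitude $g(\bar x_i)$ set up over each interval $[x_{i-1}, x_i]$; here $x_{i-1} < \bar x_i < x_i$ for $1 \le i \le n$.]
Here $g(x) = \sqrt{1 + [f'(x)]^2}$, so that the sum above is $\sum g(\bar x_i)\,\Delta x_i$.
On each little interval $[x_{i-1}, x_i]$, of length $\Delta x_i = x_i - x_{i-1}$, we have set up a rectangle
with $[x_{i-1}, x_i]$ as base, and altitude $g(\bar x_i)$. The area of this rectangle is then
\[ g(\bar x_i)\,\Delta x_i. \]
The sum of these areas is
\[ \sum_{i=1}^{n} g(\bar x_i)\,\Delta x_i. \]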
If $f'$ is continuous, then so is $g$; and $\sum g(\bar x_i)\,\Delta x_i$ ought to be close to the area
under the graph of $g$, when the mesh of the net $N$ is small. That is, we ought to have
\[ L = \lim_{|N| \to 0} \sum_{i=1}^{n} g(\bar x_i)\,\Delta x_i = \int_a^b g(x)\,dx. \]
This holds whenever f' is continuous; and we will complete the proof later in this
chapter. Meanwhile, consider some examples.
Example 1. Let
\[ f(x) = 1, \qquad 0 \le x \le 1. \]
Then $f'(x) = 0$, and
\[ L = \int_0^1 \sqrt{1 + 0^2}\,dx = 1, \]
which is the right answer.
Example 2. Let
\[ f(x) = kx, \qquad 0 \le x \le 1. \]
Then $f'(x) = k$, and
\[ L = \int_0^1 \sqrt{1+k^2}\,dx = \sqrt{1+k^2}, \]
\[ 1 + [f'(x)]^2 = \frac{1 - x^2 + x^2}{1-x^2} = \frac{1}{1-x^2}, \]
and
\[ L = \int_0^{\sqrt2/2} \sqrt{1 + [f'(x)]^2}\,dx = \int_0^{\sqrt2/2} \frac{dx}{\sqrt{1-x^2}} = \bigl[\operatorname{Sin}^{-1} x\bigr]_0^{\sqrt2/2} = \operatorname{Sin}^{-1}\frac{\sqrt2}{2} = \frac{\pi}{4}. \]
This is the right answer, because Lis one-eighth of the circumference of a circle of
radius 1.
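\[ L = \int_0^1 \sqrt{1+4x^2}\,dx. \]
Now
\[ \int \sqrt{1+4x^2}\,dx = \frac12 \int \sqrt{1+u^2}\,du, \qquad u = 2x. \]
This is
\[ \{\tfrac12 x\sqrt{1+4x^2} + \tfrac14 \ln|2x + \sqrt{1+4x^2}| + C\}. \]
Therefore, by the fundamental theorem of integral calculus, we have
\[ L = \tfrac12\sqrt5 + \tfrac14 \ln(2 + \sqrt5). \]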
The answer suggests that no method would have made the problem look easy.
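The definition of arc length as $\lim_{|N|\to 0} p(N)$ can be watched at work numerically. The sketch below computes the length $p(N)$ of the inscribed broken line for $f(x) = x^2$ on $[0, 1]$ over uniform nets, and compares it with the value $\tfrac12\sqrt5 + \tfrac14\ln(2+\sqrt5)$ obtained from the integral. The function name and the choice of uniform nets are assumptions made for illustration.

```python
import math

def inscribed_length(f, a, b, n):
    """Length p(N) of the broken line inscribed in the graph of f,
    over the uniform net that cuts [a, b] into n equal pieces."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    return sum(math.hypot(xs[i] - xs[i - 1], f(xs[i]) - f(xs[i - 1]))
               for i in range(1, n + 1))

exact = 0.5 * math.sqrt(5) + 0.25 * math.log(2 + math.sqrt(5))
for n in (1, 10, 100, 1000):
    print(n, inscribed_length(lambda x: x * x, 0.0, 1.0, n), exact)
```

With a single piece the broken line is the chord from $(0,0)$ to $(1,1)$, of length $\sqrt2$; as the mesh shrinks, $p(N)$ increases toward the arc length.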
Find the lengths of the graphs of the following functions, between the indicated limits.
*6. $f(x) = e^x$, $0 \le x \le 1$
7. a) $f(x) = \tfrac12(e^x + e^{-x})$, $0 \le x \le 1$ (You can solve this one, by an algebraic trick,
without using any of the standard formulas for hyperbolic functions. But the
problem is a little easier if you remember that $\sinh x = \tfrac12(e^x - e^{-x})$, $\cosh x = \tfrac12(e^x + e^{-x})$.
For the definitions of the hyperbolic functions sinh and cosh, and the
formulas governing them, see the end of Section 4.11.)
"'
Let r(x) P0P,,/P0Px, that is, the ratio of the arc length to the length of the chord.
=
10. Let $f$ be any function on $[a, b]$, and let $x$ be any point of $(a, b)$. Let $m_1$, $m_2$, and $m$ be the
slopes of the chords over the intervals $[a, x]$, $[x, b]$, and $[a, b]$. Show that $m$ is between
$m_1$ and $m_2$. (Unless, of course, $m_1 = m_2 = m$.)
More precisely, with
\[ m = \frac{f(b) - f(a)}{b - a}, \]
the theorem says that either (a) $m_1 \le m \le m_2$ or (b) $m_2 \le m \le m_1$.
In Section 3.7, we defined the definite integral in terms of area, with areas above the
x-axis counted positively and those below counted negatively. In the preceding
7.2 The Definite Integral, Defined as a Limit of Sample Sums 309
section, however, we regarded the integral as the limit of a sum:
\[ \int_a^b g(x)\,dx = \lim_{|N| \to 0} \sum_{i=1}^{n} g(\bar x_i)\,\Delta x_i. \]
Most of the time hereafter, the definite integral will be used in this way, and so we
shall redefine the integral, using the above formula as a definition. For this purpose,
we need to investigate nets, and sums of the type
\[ \sum_{i=1}^{n} g(\bar x_i)\,\Delta x_i. \]
Consider first an increasing continuous function $f$, on an interval $[a, b]$.
[Figure: (a) a positive increasing function over a net $a = x_0 < x_1 < x_2 < \cdots < x_n = b$; (b) an increasing function which takes negative values.]
The points of the net $N$ cut up the interval $[a, b]$ into little intervals $[x_{i-1}, x_i]$. For
each $i$ from 1 to $n$, let $m_i$ be the minimum value of $f$ on the $i$th interval $[x_{i-1}, x_i]$,
and let $M_i$ be the maximum value. Since $f$ is increasing, we have $m_i = f(x_{i-1})$,
$M_i = f(x_i)$. As usual, $\Delta x_i = x_i - x_{i-1}$, so that $\Delta x_i$ is the length of the $i$th interval
$[x_{i-1}, x_i]$. If $f$ is positive, as in part (a) of the figure above, then the sum
\[ s(N) = \sum_{i=1}^{n} m_i\,\Delta x_i \]
is the sum of the areas of the inscribed rectangles, and the sum
\[ S(N) = \sum_{i=1}^{n} M_i\,\Delta x_i \]
is the sum of the areas of the circumscribed rectangles. For functions which may be
negative, as in part (b) of the figure, $s(N)$ and $S(N)$ are sums of signed areas. In either
case, $s(N)$ is called the lower sum of $f$ over the net $N$, and $S(N)$ is called the upper sum
of $f$ over $N$.
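For an increasing function the lower and upper sums are easy to compute, since $m_i = f(x_{i-1})$ and $M_i = f(x_i)$. The following sketch, with hypothetical helper names and the illustrative choice $f(x) = x^2$ on $[0, 1]$, computes both sums over a uniform net. For such a net the difference $S(N) - s(N)$ telescopes to $[f(b) - f(a)]\cdot(b-a)/n$.

```python
def uniform_net(a, b, n):
    """The net a = x_0 < x_1 < ... < x_n = b with equal spacing."""
    return [a + (b - a) * i / n for i in range(n + 1)]

def lower_upper_sums(f, net):
    """Lower sum s(N) and upper sum S(N) of an INCREASING function f:
    the minimum on each piece is at its left end, the maximum at its right end."""
    s = sum(f(net[i - 1]) * (net[i] - net[i - 1]) for i in range(1, len(net)))
    S = sum(f(net[i]) * (net[i] - net[i - 1]) for i in range(1, len(net)))
    return s, S

f = lambda x: x * x
for n in (10, 100, 1000):
    s, S = lower_upper_sums(f, uniform_net(0.0, 1.0, n))
    print(n, s, S, S - s)
```

Every lower sum stays below every upper sum, and the gap between them shrinks in proportion to the mesh.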
(1 � i � 11) .
The sequence
Let R be the region between the graph off and the x-axis, from a to b.
Theorem 1. If $f$ is continuous, and $N_1$ and $N_2$ are any nets over $[a, b]$, then
\[ s(N_1) \le S(N_2). \]
That is, every lower sum of $f$ is less than or equal to every upper sum of $f$. For
positive functions this is obvious, because in this case $s(N_1)$ is the area of an inscribed
polygonal region (lying under the curve) and $S(N_2)$ is the area of a circumscribed
polygonal region. In general,
where $A$ and $D$ are areas of inscribed regions, and $B$ and $C$ are areas of circumscribed
regions. (To see how this works, see the figure (b) above.) Therefore,
[Figure: two graphs of an increasing function on $[a, b]$, with the levels $f(a)$ and $f(b)$ marked on the $y$-axis.]
This difference is
n n n
Proof. We have to start the proof by naming the number $k$. The numbers $s(N)$ are
bounded above. (By Theorem 1, every upper sum is an upper bound of the lower
sums.) Let
\[ k = \sup\{s(N)\}. \]
Consider an interval $[x_{i-1}, x_i]$. For each $i$,
\[ m_i \le f(\bar x_i) \le M_i. \]
That is, every sample sum lies between the lower sum and the upper sum.
We are now almost done. Given $\epsilon > 0$, we want a $\delta > 0$ such that
Thus when $|N| < \delta$, the interval from $s(N)$ to $S(N)$ has length less than $\epsilon$.
This interval contains $\sum(X)$. (See the inequalities above.) And it also contains $k$:
$s(N) \le k$, because $k$ is an upper bound for the lower sums; and $k \le S(N)$, because
$k$ is the least upper bound of the lower sums. Therefore $|\sum(X) - k| < \epsilon$, because
$\sum(X)$ and $k$ are squeezed together: they both lie on the same short interval.
We can now give the new definition of the integral.
Definition. $\int_a^b f(x)\,dx = \lim_{|N| \to 0} \sum(X)$, if the indicated limit on the right exists.
If the limit exists, then we say that f is integrable on [a, b ].
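A sample sum uses one arbitrary point from each interval of the net. The sketch below, in which the helper names, the seed and the test function $f(x) = x^2$ are all illustrative assumptions, draws random samples over finer and finer uniform nets on $[0, 1]$ and prints the sample sums next to the value $\tfrac13$ of the integral.

```python
import random

def uniform_net(a, b, n):
    """The net a = x_0 < x_1 < ... < x_n = b with equal spacing."""
    return [a + (b - a) * i / n for i in range(n + 1)]

def random_sample(net, rng):
    """One sample point chosen at random in each interval [x_{i-1}, x_i]."""
    return [rng.uniform(net[i - 1], net[i]) for i in range(1, len(net))]

def sample_sum(f, net, sample):
    """The sample sum: f at each sample point times the interval length."""
    return sum(f(sample[i - 1]) * (net[i] - net[i - 1])
               for i in range(1, len(net)))

rng = random.Random(0)      # fixed seed so the run is repeatable
f = lambda x: x * x
for n in (10, 100, 1000):
    net = uniform_net(0.0, 1.0, n)
    print(n, sample_sum(f, net, random_sample(net, rng)), 1.0 / 3.0)
```

Whatever sample is drawn, the sum lies between the lower and upper sums for that net, so it is forced toward the integral as the mesh tends to 0.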
\[ D \int_a^x f(t)\,dt = f(x), \]
where $f$ is continuous. We need to know that this differentiation formula still holds,
under the new definition of the integral. This is the purpose of the following theorem.
then
\[ m(b - a) \le \int_a^b f(x)\,dx \le M(b - a). \]
Proof. Let $N$ be any net over $[a, b]$, and let $X$ be any sample of $N$. Then
\[ \sum_{i=1}^{n} \Delta x_i = b - a. \]
(Why?) Therefore
\[ m(b - a) \le \sum(X) \le M(b - a); \]
and this holds for every sample sum, over every net $N$. Therefore the same inequalities
hold for $\lim \sum(X)$, and the integral lies between $m(b - a)$ and $M(b - a)$, which
was to be proved.
\[ D \int_a^x f(t)\,dt = f(x), \]
in Section 3.10, you will find that in this proof, all that we needed to know about the
integral was the information conveyed by the betweenness theorem for integrals.
Therefore the differentiation formula continues to hold, under our new definition,
wherever the integrand is continuous. It follows that the fundamental theorem of
integral calculus still holds true.
At the end of this chapter we shall prove that every continuous function is integrable.
1. In Theorem 2 it was assumed that the function f is increasing. Does the same scheme
of proof work, for a decreasing function? If so, draw a figure illustrating the proof for
decreasing functions. If not, explain how the scheme breaks down, for the case of a
decreasing function.
2. In Theorem 2 it was assumed that f is both continuous and increasing. Suppose we
assume that f is increasing, but not that f is continuous. What changes (if any) do we
then need to make in
a) the definitions of $m_i$ and $M_i$, b) the definitions of s(N) and S(N), and
c) the proof of Theorem 2?
Theorem A (The mean-value theorem for integrals). If f is continuous on [a, b], then
there is a point $\bar{x}$, between a and b, such that
$$f(\bar{x}) = \frac{1}{b - a} \int_a^b f(x)\,dx.$$
$f(x) = 0$ for every other x on [0, 1]. Is this function integrable? Why
or why not?
*5. Consider the following function, on [0, 1]. If x is irrational, then $f(x) = 0$. If $x = p/q$,
in lowest terms, then $f(x) = 1/q$. At what points (if any) is this function continuous?
7. In Section 7.1 we showed that for every net N we could choose a sample X in such a
way that the length of the inscribed broken line is equal to the sample sum $\Sigma(X)$,
not just approximately but exactly. Is it always possible to choose a sample X' such
that $\Sigma(X')$ is exactly equal to the arc length? (Here we are assuming, as usual, that
$f'$ is continuous.)
*8. The following remarks are a very sketchy indication of an amusing proof of an important
theorem which is known to you in a slightly weaker form. Fill in the gaps, and state the
theorem which is proved.
$F' = f$. f is known to be integrable on [a, b], but is not necessarily continuous.
$$\sum_{i=1}^{n} f(\bar{x}_i)\,\Delta x_i = \sum_{i=1}^{n} [F(x_i) - F(x_{i-1})].$$
As $|N| \to 0$, $\sum_{i=1}^{n} f(\bar{x}_i)\,\Delta x_i \to\ ?$; but $\sum_{i=1}^{n} [F(x_i) - F(x_{i-1})]$ was something simple, all
along.
*9. Let f be differentiable on [a, b]. Show that if
*10. Theorem (The no-jump theorem for derivatives). If f is differentiable on [a, b], and k
is between $f'(a)$ and $f'(b)$, then $k = f'(x)$ for some x between a and b.
for $x \ne 0$,
for $x = 0$
13. a) A function f is uniformly continuous on [a, b] if for every $\epsilon > 0$ there is a $\delta > 0$
such that
$$|x - x'| < \delta \;\Rightarrow\; |f(x) - f(x')| < \epsilon.$$
(Here x and x' are any points of [a, b].) Show that if $f'$ is continuous, then f is
uniformly continuous on [a, b].
b) Show that every uniformly continuous function is integrable.
The volumes of various solids can be expressed as definite integrals. In this process,
we shall assume that the following volume formulas are known.
[Figure: a rectangular parallelepiped, a right circular cylinder, and a cylindrical shell.]
The first of these solids is a rectangular parallelepiped; the second is a right circular
cylinder; and the third is a cylindrical shell, that is, the portion of the larger cylinder
that lies outside the smaller cylinder.
We get a coordinate system in space by setting up a z-axis, perpendicular to the
xy-plane at the origin. Here, and throughout this chapter, we shall indicate only the
positive half of each axis, thus getting a picture of only the "first octant," in which
the points have nonnegative coordinates.
$$f(x) = \frac{1}{x}$$
over the interval [1, 2]. For convenience, we use equally spaced points, so that
$$x_i - x_{i-1} = \Delta x = 1/n$$
for each i. Over the intervals $[x_{i-1}, x_i]$ we set up the circumscribed rectangles. These
form a region $R_n$ which is an approximation of the region R. Then we rotate $R_n$
about the x-axis. Each of our rectangles then gives a cylinder (lying on its side), and
the cylinders form a solid $S_n$ which is an approximation of S. In the figure on the
right below we show only the ith cylinder. Its altitude is $\Delta x = x_i - x_{i-1}$, and the
radius of its base is $1/x_{i-1}$. Therefore its volume is
$$v_i = \pi r^2\,\Delta x = \pi \left(\frac{1}{x_{i-1}}\right)^2 \Delta x,$$
and
$$vS_n = \sum_{i=1}^{n} v_i = \sum_{i=1}^{n} \pi \left(\frac{1}{x_{i-1}}\right)^2 \Delta x.$$
This is a sample sum of the function $g(x) = \pi/x^2$.
(In fact, it is the upper sum of g over the net N, because $g(x_{i-1})$ is the maximum
value of g on the interval $[x_{i-1}, x_i]$.) The mesh of N is
$$|N| = \Delta x = \frac{1}{n},$$
and so
$$\lim_{n \to \infty} vS_n = \int_1^2 \frac{\pi}{x^2}\,dx = \left[-\frac{\pi}{x}\right]_1^2 = \frac{\pi}{2}. \qquad (1)$$
If we use inscribed rectangles, and rotate them about the x-axis, then we get an
inscribed solid $S_n'$, with volume
$$vS_n' = \sum_{i=1}^{n} \pi \left(\frac{1}{x_i}\right)^2 \Delta x.$$
Therefore the volume vS of S is squeezed between the volumes of the inscribed and
circumscribed solids:
$$vS_n' \le vS \le vS_n \quad \text{for every } n,$$
and so
$$vS = \int_1^2 \frac{\pi}{x^2}\,dx = \frac{\pi}{2}. \qquad (3)$$
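The squeeze described above can be watched numerically. The following is a minimal sketch under stated assumptions: the helper name `disk_sums` and the choice n = 1000 are illustrative and do not come from the text.

```python
import math

def disk_sums(n):
    """Volumes of the inscribed and circumscribed solids for f(x) = 1/x on [1, 2]."""
    dx = 1.0 / n
    xs = [1.0 + i * dx for i in range(n + 1)]
    # circumscribed cylinders use the radius 1/x_{i-1}; inscribed ones use 1/x_i
    upper = sum(math.pi * (1.0 / xs[i - 1]) ** 2 * dx for i in range(1, n + 1))
    lower = sum(math.pi * (1.0 / xs[i]) ** 2 * dx for i in range(1, n + 1))
    return lower, upper

lower, upper = disk_sums(1000)
print(lower, math.pi / 2, upper)
```

The two sums bracket π/2, and the gap between them shrinks in proportion to 1/n.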
We shall now review this process and state the assumptions on which it is based.
Not all solids are measurable, in the sense that they have volumes; but the solids that
you are likely to encounter soon are measurable, and their volumes are governed by
the following laws.
By an elementary solid we mean a right parallelepiped, cylinder, or cylindrical
shell, as at the beginning of this section. We have been assuming that:
V.1. Elementary solids are measurable, and their volumes are given by the formulas
$$v = abc, \qquad v = \pi r^2 h, \qquad v = \pi h(r^2 - s^2).$$
Two solids are nonoverlapping if they have no solid in common. (They may have
surfaces in common.)
V.2. If $s_1, s_2, \ldots, s_n$ are nonoverlapping elementary solids, and $S_n$ is their union,
then $S_n$ is measurable, and
$$vS_n = \sum_{i=1}^{n} vs_i.$$
V.3. If S and S' are measurable, and S' lies in S, then $vS' \le vS$.
V.4 (The squeeze principle). If (a) $S_1, S_2, \ldots$ are measurable solids containing S,
(b) $S_1', S_2', \ldots$ are measurable solids lying in S, and (c) $\lim_{n \to \infty} vS_n = L = \lim_{n \to \infty} vS_n'$,
then S is measurable, and
$$vS = L.$$
Using V.1 through V.4, we can show that the method of disks, which we have
used for the function
$$f(x) = 1/x,$$
[Figure: the region under the graph of f, with inscribed and circumscribed rectangles of heights $m_i$ and $M_i$ over $[x_{i-1}, x_i]$, and the corresponding solid of revolution.]
If we rotate the inscribed rectangles about the x-axis, we get an inscribed solid $S_n'$,
of volume
$$vS_n' = \sum_{i=1}^{n} \pi m_i^2\,\Delta x.$$
$$g(x) = \pi f(x)^2,$$
and $vS_n'$ is a lower sum of the same g. As $n \to \infty$, $|N| \to 0$;
$$vS = \int_a^b \pi f(x)^2\,dx.$$
We also use this formula sidewise.
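[Figure: the region R bounded by the graph of $y = \sqrt{x}$, and the solid obtained by rotating R about the y-axis.]
Suppose that the region R on the left is rotated about the y-axis. Sidewise, R can be
regarded as the region under the graph of a function
$$x = f(y) = y^2, \qquad 0 \le y \le 1.$$
Therefore the volume is
$$\pi \int_0^1 [f(y)]^2\,dy = \int_0^1 \pi y^4\,dy = \frac{\pi}{5}.$$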
1. Obviously a right circular cone can be regarded as the solid of revolution of a right
triangle about one of its legs. If we place the triangle in the xy-plane as shown in the
figure, then the hypotenuse becomes the graph of a function f. Calculate f, and find the
volume of the cone by the methods of this section.
2. Similarly, a round ball of radius r can be regarded as the solid of revolution of a
semicircular region about its diameter. Find the volume, by the methods of this section.
3. The region under the graph of $f(x) = \sqrt{x}$ ($0 \le x \le 1$) is rotated about the x-axis.
Find the volume of the resulting solid.
4. Same, for $f(x) = \sin x$, $0 \le x \le \pi$.
5. Same, for $f(x) = x^{3/2}$, $0 \le x \le 1$.
6. Same, for $f(x) = \cos x$, $-\pi/2 \le x \le \pi/2$.
7. Let $R = \{(x, y) \mid 0 \le x \le 1,\ x^2 \le y \le 1\}$ be rotated about the y-axis. Find the
volume.
Theorem (?). Let T and T' be triangles each of which has a side on the x-axis. If T
and T' have the same area, then when they are rotated about the x-axis, they give solids
with the same volume.
13. a) For each x from 0 to 1, let $T_x$ be the triangle whose vertices are (0, 0), (1, 0), and
(x, 1). What value or values of x give maximum volume, when $T_x$ is rotated about
the x-axis?
b) Suppose that the triangles $T_x$ are rotated about the y-axis (instead of the x-axis).
Which value or values of x give maximum volume?
14. For each k from 0 to 1, let $T_k$ be the triangle whose vertices are (0, 0), (k, 0), and
$(0, \sqrt{1 - k^2})$. (Thus the hypotenuse of $T_k$ has length 1.) $T_k$ is rotated about the x-axis.
What value of k gives maximum volume? What is the maximum volume?
15. a) Given $f(x) = 1/x$. Let R be the region under the graph of f, from 1 to $\infty$. Give a
reasonable definition of the area of R. Is this area finite?
b) The region R is rotated about the x-axis, giving a solid S. Give a reasonable definition
of the volume of S. Is this volume finite?
16. a) The region R under the graph of $f(x) = 1/x^2$ from 1 to $\infty$ is rotated about the x-axis,
giving a solid S. Does S have finite volume?
b) If the region R is rotated about the y-axis, do we obtain a solid with finite volume?
The method of disks can be generalized in the following way. Given a solid S in
space. Suppose that we can calculate the areas of the cross sections perpendicular to
the x-axis.
[Figure: left, a solid S lying between x = a and x = b, with the cross section at x; right, the ith approximating cylinder.]
For each x from a to b we let A(x) be the area of the cross section. This gives a
function A which expresses the cross-sectional area in terms of x. (In our previous
examples, the cross sections were all circular.) As before, we divide the interval
[a, b] into n equal parts, and we approximate the volume by cylinders. In the figure
at the right, we show only the ith cylinder.
We then have
$$v_n = \sum_{i=1}^{n} A(x_i)\,\Delta x,$$
and the sum on the right-hand side is a sample sum of the function A. Therefore,
as the mesh goes to 0,
$$vS = \int_a^b A(x)\,dx.$$
By this method we can calculate volumes. For example, take the parabola
$y = x^2$, for $0 \le x \le 1$. For each y from 0 to 1, we take the horizontal segment
from (0, y) to the point (x, y) of the parabola; and using this segment as an edge,
we construct a horizontal square. Thus we get a solid, as shown in the figure.
[Figure: the solid formed by the horizontal squares.]
The cross-sectional areas perpendicular to the y-axis are given by the formula
$$A(y) = x^2 = y.$$
Therefore the volume is
$$V = \int_0^1 A(y)\,dy = \int_0^1 y\,dy = \left[\frac{y^2}{2}\right]_0^1 = \frac{1}{2}.$$
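The sample sums behind this calculation are easy to evaluate directly. The following is a minimal sketch; the helper name `cross_section_volume`, the use of midpoints as sample points, and n = 100 are assumptions made for illustration.

```python
def cross_section_volume(area, a, b, n):
    """Sample sum of the cross-sectional area function, using midpoints as sample points."""
    dx = (b - a) / n
    return sum(area(a + (i + 0.5) * dx) * dx for i in range(n))

# For the solid built on the parabola y = x^2, the cross section at height y has area y.
volume = cross_section_volume(lambda y: y, 0.0, 1.0, 100)
print(volume)
```

Because A(y) = y is linear, the midpoint sample sum already agrees with the integral 1/2, up to rounding.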
The general method of cross sections applies, in a sense, to every volume problem.
That is, it is always true that
$$vS = \int_a^b A(x)\,dx.$$
But often this formula leads to difficult calculations.
7.4 The General Method of Cross Sections, and the Method of Shells 323
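[Figure: the region R under the graph of $f(x) = \cos x$, and the front half of the solid obtained by rotating R about the y-axis.]
$$f(x) = \cos x,$$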
We rotate R about the y-axis, getting a solid of revolution, of which only the front
half is shown in the figure. We can find the volume by the cross-section method. We
have
$$A(y) = \pi x^2 = \pi (\operatorname{Cos}^{-1} y)^2.$$
Therefore
$$V = \int_0^1 \pi (\operatorname{Cos}^{-1} y)^2\,dy.$$
We can calculate this by integrating by parts twice, but there is a better way.
Instead of approximating the solid by thin cylinders, we approximate it by thin
cylindrical shells.
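The altitude of the ith shell is $\cos x_i$;
the outer radius is $x_i$; and the inner radius is $x_{i-1} = x_i - \Delta x$. Therefore the volume
of the ith shell is
$$v_i = \pi x_i^2 \cos x_i - \pi (x_i - \Delta x)^2 \cos x_i = 2\pi x_i \cos x_i\,\Delta x - \pi \cos x_i\,\Delta x^2,$$
and
$$vS_n = \sum_{i=1}^{n} v_i = \sum_{i=1}^{n} 2\pi x_i \cos x_i\,\Delta x - \pi\,\Delta x \sum_{i=1}^{n} \cos x_i\,\Delta x.$$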
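We need to find out what happens as the mesh goes to 0. The first sum is a sample
sum of the function $2\pi x \cos x$. Therefore
$$\sum_{i=1}^{n} 2\pi x_i \cos x_i\,\Delta x \to \int_0^{\pi/2} 2\pi x \cos x\,dx.$$
In the second term, $\sum \cos x_i\,\Delta x \to \int_0^{\pi/2} \cos x\,dx$, while $\pi\,\Delta x \to 0$.
Thus the entire second sum, in the expression for $vS_n$, drops out when we pass to the
limit. Therefore
$$vS_n \to \int_0^{\pi/2} 2\pi x \cos x\,dx,$$
and
$$V = \int_0^{\pi/2} 2\pi x \cos x\,dx = 2\pi [x \sin x + \cos x]_0^{\pi/2}.$$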
The same method applies if we rotate a region lying to the right of the y-axis.
If the width of the region is given by a function h(x), then the volume of the ith
cylindrical shell is
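$$v_i = \pi x_i^2 h(x_i) - \pi (x_i - \Delta x)^2 h(x_i)$$
$$= \pi h(x_i)(x_i^2 - x_i^2 + 2x_i\,\Delta x - \Delta x^2)$$
$$= 2\pi x_i h(x_i)\,\Delta x - \pi h(x_i)\,\Delta x^2.$$
Therefore
$$v_n = \sum_{i=1}^{n} v_i = 2\pi \sum_{i=1}^{n} x_i h(x_i)\,\Delta x - \pi\,\Delta x \sum_{i=1}^{n} h(x_i)\,\Delta x.$$
Therefore
$$V = \int_a^b 2\pi x h(x)\,dx.$$
[Figure: the ith cylindrical shell, cut and flattened into a rectangular prism of length $2\pi x_i$, altitude $h(x_i)$, and thickness $\Delta x$.]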
If we make a vertical cut in the ith cylindrical shell, and flatten it out, we get a
rectangular prism. The length of the prism is the circumference of the outer circle in the
base of the shell. This is $2\pi x_i$. The altitude and the thickness of the prism are the
same as the altitude and the thickness of the shell; these are $h(x_i)$ and $\Delta x$. Therefore
the volume of the prism is exactly $2\pi x_i h(x_i)\,\Delta x$; and this ought to be a good
approximation to $v_i$ when $\Delta x$ is small, because when the shell is thin, we can flatten it out
without distorting it very much. As we have seen, the error goes to zero as the mesh
goes to zero.
The method of shells applies to the problem that we were discussing above.
We know that the volume is
$$V = \int_0^{\pi/2} 2\pi x \cos x\,dx.$$
Therefore
$$V = [2\pi (x \sin x + \cos x)]_0^{\pi/2} = \pi^2 - 2\pi.$$
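Both methods can be compared numerically for this solid. The following is a minimal sketch, assuming equally spaced nets with n = 2000; the function names `shell_volume` and `cross_section_volume` are illustrative, and `math.acos` plays the role of $\operatorname{Cos}^{-1}$.

```python
import math

def shell_volume(n):
    """Sum of the exact volumes of n cylindrical shells for y = cos x, 0 <= x <= pi/2."""
    dx = (math.pi / 2) / n
    total = 0.0
    for i in range(1, n + 1):
        outer = i * dx          # outer radius x_i
        inner = outer - dx      # inner radius x_{i-1}
        total += math.pi * (outer ** 2 - inner ** 2) * math.cos(outer)
    return total

def cross_section_volume(n):
    """Midpoint sample sum for the integral of pi * (arccos y)^2 over [0, 1]."""
    dy = 1.0 / n
    return sum(math.pi * math.acos((j + 0.5) * dy) ** 2 * dy for j in range(n))

exact = math.pi ** 2 - 2 * math.pi
by_shells = shell_volume(2000)
by_sections = cross_section_volume(2000)
print(by_shells, by_sections, exact)
```

The shell sum falls slightly below the exact value, since each shell uses the smallest altitude on its interval, and the difference shrinks with the mesh.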
[Figure: the region R.]
If R is rotated about the line x = -1, then by the shell method the volume of the
resulting solid is
$$V = 2\pi \left[-\frac{x^4}{4} + \frac{4x^3}{3} - \frac{x^2}{2} - 6x\right]_2^3 = \frac{7}{6}\pi.$$
The horizontal cross section at height y is the region between a circle of radius 1
and a circle of radius $x = \sqrt{y}$. Therefore
$$V = \int_0^1 (\pi - \pi y)\,dy = \pi - \pi \int_0^1 y\,dy = \frac{\pi}{2},$$
as before.
7.5 The Area of a Surface of Revolution 327
1. Let R be the circular region with center at (5, 0) and radius 2. R is rotated about the
y-axis. Find the volume of the resulting solid.
2. A solid of the sort described in Problem 1 is called a solid torus. More generally, suppose
we have given a circular region of radius a, and a line L in the same plane, such that
the perpendicular distance from L to the center of R is b, with b � a. When R is rotated
about the line L, the result is a solid torus. Find its volume, in terms of a and b.
3. Let R be the square region with center at (4, 0) and sides of length 2, parallel to the
coordinate axes. R is rotated about the y-axis. Find the volume of the resulting solid.
4. Let T be the square region with center at (4, 0) and sides of length 2, with diagonals
parallel to the coordinate axes. Find the volume of the solid which results when T is
rotated about the y-axis.
$$y = \ln x, \qquad 1 \le x \le e,$$
is rotated about the x-axis. Find the volume, by the method of disks.
6. For each x from 0 to 1, let $R_x$ be the circular region perpendicular to the xy-plane, with
center at the point $(x, x^2)$ and radius 1. Let S be the solid formed by the regions $R_x$.
Find the volume of S.
7. a) The region described in Problem 5a is rotated about the y-axis. Find the volume, by
the shell method.
8. a) The region under the graph of $y = e^x$, $0 \le x \le 1$, is rotated about the y-axis. Find
the volume by the method of shells.
b) Now solve Problem 8a by the cross-section method.
9. Let C be the cylinder with the y-axis as its axis of symmetry, and radius 1. Let S be the
sphere with center at the origin and radius 2. Find the volume of the solid which lies
inside the sphere and outside the cylinder.
10. Let $C_x$ be the cylinder of radius 1, with the x-axis as its axis of symmetry; and let $C_y$
be the cylinder of radius 1 with the y-axis as its axis of symmetry. Find the volume of the
solid which lies in both $C_x$ and $C_y$.
11. Let S be the sphere of radius $\sqrt{2}$ with center at the origin. Let C be the cone with vertex
at the origin, axis along the y-axis, and passing through the point (1, 1). Find the
volume of the solid which lies inside the sphere and inside the cone.
Given a line and a curve, lying in the same plane and lying on one side of the given
line. If the curve is rotated about the line, the resulting surface is called a surface of
revolution. The area of such a surface can be expressed as an integral. We begin with
the simplest case, in which a function-graph is rotated about the y-axis. Here the
function f is defined on a closed interval [a, b] on the positive half of the x-axis. We
assume that f has a continuous derivative.
To calculate the area of the surface of revolution, we need the formula for the
lateral surface of a right circular cone. Let s be the slant height of the cone, and let
r be the radius of the base, so that the circumference of the base is $2\pi r$. We assert
that the lateral surface is the same as the area of a circular sector of radius s, with
boundary arc of length $2\pi r$. The reason is that we can make a straight cut in the cone,
starting at the vertex, and then flatten out the surface, without changing its area, so
that the resulting surface lies in a plane. The plane surface thus obtained is the sector
shown below.
But the area of a circular sector is half the product of its radius and the length of its
boundary arc. Therefore, for cones, we have
$$A = \pi r s.$$
(Note that for a "cone of altitude 0," that is, a disk, this formula gives the right
answer $\pi r s = \pi r^2$.) From this we can get a formula for the lateral area of a frustum
of a cone. If the larger cone (with slant height $s_2$) has area $A_2$, and the smaller cone
$$\Delta x = x_i - x_{i-1} = \frac{b - a}{n} = |N|$$
for each i. For each i, let $P_i$ be the point $(x_i, f(x_i))$. These points determine a broken
line $B_n$ which is an approximation of the graph of f. When $B_n$ is rotated about the
y-axis, we get a surface $S_n$, with area $A_n$. By definition, the area of the surface of
revolution of f is
$$A = \lim_{|N| \to 0} A_n,$$
if the limit exists. (This is like the definition of arc length.)
We shall now calculate $A_n$, and find its limit as $|N| \to 0$. Consider the ith segment,
from $P_{i-1}$ to $P_i$. When this segment is rotated, it gives a frustum whose area is
$$a_i = 2\pi \bar{x}_i \cdot P_{i-1}P_i;$$
[Figure: the ith frustum, with $x_{i-1}$, $\bar{x}_i$, and $x_i$ marked on the x-axis.]
$$P_{i-1}P_i = \sqrt{1 + \left(\frac{f(x_i) - f(x_{i-1})}{\Delta x}\right)^2}\,\Delta x = \sqrt{1 + f'(x_i')^2}\,\Delta x,$$
where $x_{i-1} < x_i' < x_i$, as shown in the last figure.
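We now have a formula for the area $A_n$ of the approximating surface:
$$A_n = \sum_{i=1}^{n} a_i = \sum_{i=1}^{n} 2\pi \bar{x}_i\,P_{i-1}P_i = \sum_{i=1}^{n} 2\pi \bar{x}_i \sqrt{1 + f'(x_i')^2}\,\Delta x.$$
Here $\bar{x}_i$ is the midpoint of $[x_{i-1}, x_i]$, and $x_i'$ is somewhere on the same interval.
If it were true that $x_i' = \bar{x}_i$ for each i, then $A_n$ would be a sample sum of the function
$$g(x) = 2\pi x \sqrt{1 + f'(x)^2}.$$
For the sample sum
$$\sum_{i=1}^{n} a_i' = \sum_{i=1}^{n} 2\pi x_i' \sqrt{1 + f'(x_i')^2}\,\Delta x,$$
we know that
$$\lim_{n \to \infty} \sum_{i=1}^{n} a_i' = \int_a^b g(x)\,dx = \int_a^b 2\pi x \sqrt{1 + f'(x)^2}\,dx.$$
$$|a_i - a_i'| \le \pi \sqrt{1 + f'(x_i')^2}\,\Delta x^2.$$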
Therefore
$$\lim_{|N| \to 0} \sum_{i=1}^{n} a_i = \lim_{|N| \to 0} \sum_{i=1}^{n} a_i'.$$
[Figure: a surface of revolution about the y-axis.]
$$1 + f'(x)^2 = 1 + \frac{x^2}{a^2 - x^2} = \frac{a^2}{a^2 - x^2}.$$
It follows that the total area of a sphere of radius a is $4\pi a^2$. This is the standard
formula.
It is harder to find the area when we rotate a function-graph about the x-axis
instead of the y-axis.
$$x_i - x_{i-1} = \Delta x = \frac{b - a}{n} = |N|.$$
As before, we approximate the graph by a broken line $B_n$. Then we rotate $B_n$ about
the x-axis, getting a surface $S_n$, with area $A_n$. We define the area of the surface of
revolution to be $\lim_{|N| \to 0} A_n$, if such a limit exists. We proceed to calculate:
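$$A_n = \sum_{i=1}^{n} a_i,$$
where $a_i$ is the area of the ith frustum, shown in the figure. As before,
$$P_{i-1}P_i = \sqrt{1 + f'(x_i')^2}\,\Delta x.$$
But when we rotate the chord from $P_{i-1}$ to $P_i$ about the x-axis, the "average
circumference" is
$$2\pi \bar{y}_i, \quad \text{where} \quad \bar{y}_i = \tfrac{1}{2}[f(x_{i-1}) + f(x_i)].$$
Obviously $\bar{y}_i$ is between $f(x_{i-1})$ and $f(x_i)$, because $\bar{y}_i$ is their average. By the no-jump
theorem of Section 5.7,
$$\bar{y}_i = f(\bar{x}_i) \quad \text{for some } \bar{x}_i \text{ between } x_{i-1} \text{ and } x_i.$$
Therefore
$$A_n = \sum_{i=1}^{n} a_i = \sum_{i=1}^{n} 2\pi f(\bar{x}_i) \sqrt{1 + f'(x_i')^2}\,\Delta x.$$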
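If it were true that $\bar{x}_i = x_i'$ for each i, then the sum on the right-hand side would be a
sample sum of the function
$$g(x) = 2\pi f(x) \sqrt{1 + f'(x)^2}.$$
As it stands, it is very close to being a sample sum. The idea is that
$$|N| = \Delta x \approx 0$$
$$\Rightarrow\; \bar{x}_i \approx x_i' \quad \text{for each } i$$
$$\Rightarrow\; f(\bar{x}_i) \approx f(x_i') \quad \text{for each } i$$
$$\Rightarrow\; \sum_{i=1}^{n} 2\pi f(\bar{x}_i) \sqrt{1 + f'(x_i')^2}\,\Delta x \approx \sum_{i=1}^{n} 2\pi f(x_i') \sqrt{1 + f'(x_i')^2}\,\Delta x.$$
At the end of the chapter, these ideas will be turned into a proof. Meanwhile let us
look at some applications of the formula
$$A = \int_a^b 2\pi f(x) \sqrt{1 + f'(x)^2}\,dx.$$
For the semicircle $f(x) = \sqrt{a^2 - x^2}$, $-a \le x \le a$, we have $1 + f'(x)^2 = a^2/f(x)^2$, and so
$$A = \int_{-a}^{a} 2\pi f(x) \sqrt{\frac{a^2}{f(x)^2}}\,dx = \int_{-a}^{a} 2\pi a\,dx = 4\pi a^2.$$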
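The definition of surface area as a limit of frustum areas can be tried out on the sphere. The following is a minimal sketch; the function name `frustum_area_sum`, the radius 1, and n = 2000 are assumptions chosen for illustration.

```python
import math

def frustum_area_sum(f, a, b, n):
    """Area swept out when the broken line through the points (x_i, f(x_i))
    is rotated about the x-axis. The function f is assumed nonnegative."""
    total = 0.0
    x_prev, y_prev = a, f(a)
    for i in range(1, n + 1):
        x = a + (b - a) * i / n
        y = f(x)
        chord = math.hypot(x - x_prev, y - y_prev)
        # frustum area = (average circumference) * (slant height)
        total += 2 * math.pi * 0.5 * (y_prev + y) * chord
        x_prev, y_prev = x, y
    return total

radius = 1.0

def semicircle(x):
    # max(...) guards against a tiny negative value caused by rounding at the endpoints
    return math.sqrt(max(0.0, radius ** 2 - x ** 2))

area = frustum_area_sum(semicircle, -radius, radius, 2000)
print(area, 4 * math.pi * radius ** 2)
```

The sum of the frustum areas comes out just under $4\pi a^2$, as one would expect for a surface inscribed in the sphere.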
PROBLEM SET 7.5
2. The entire circle $C_a$ is rotated about the x-axis, giving a sphere of radius a. $E_b$ and $E_c$
are two planes, perpendicular to the x-axis, at x = b and x = c; and S is the part of
the sphere that lies between them. Find the area of S, in terms of a, b, and c. The
form of your answer ought to suggest a somewhat surprising theorem which can be
stated without the use of formulas. What is the theorem?
3. The circle with center at (b, 0) and radius a, a < b, is rotated about the y-axis. The
resulting surface is called a torus. Find its area.
4. The square with corners at the points (a, O), (a+k, k), (a+k, -k), and (a + 2k, 0)
is rotated about the y-axis. (Here 0 < k < a.) Find the area of the resulting surface.
5. Find the volume of the solid obtained when the corresponding square region is rotated
about the y-axis.
6. The same square is rotated about the line x = a+2k. Find the surface area.
7. The square region is rotated about the line x = a + 2k. Find the volume.
8. The square with center at (a, 0) and sides of length 2k parallel to the coordinate axes is
rotated about the line x = 2a. Find the area of the resulting surface. (Here 0 < k < a.)
9. Find the volume when the corresponding square region is rotated about the line y = k/2.
10. Consider the curve consisting of (a) the segment from (0, 0) to (a, 0), (b) the segment
from (0, 1) to (a, 1), and (c) the semicircle, pointing outward, with endpoints at (a, 0)
and (a, 1). This curve is to be rotated about the y-axis. For what value of a is it true
that the total area of the resulting surface is equal to 15?
11. For each a, let $S_a$ be the area of the surface described in Problem 10, and let $V_a$ be the
volume of the solid that it encloses. What value of a maximizes the ratio $V_a/S_a$?
12. The circle with center at (b, b) and radius a is rotated about the line
x + y = 1.
Here a and b are both positive, and the circle does not intersect the line. Find (a) the
area of the resulting surface, and (b) the volume of the solid that it encloses.
13. Same question, for the circle with center at (2, 10) and radius 1 and the line x + y = 2.
(The only natural solutions of this, on the basis of the theory that we have so far, are
rather clumsy. This suggests that some new ideas are needed.)
14. The graph of
y = 2x2,
from x = -1 to x = 1, is rotated about the line x = 5. Find the area of the resulting
surface.
15. If the same surface is rotated about the line x = 4, would the area of the resulting
surface be greater, or would it be less, than the answer to Problem 14? Get a plausible
answer to this, and justify it as well as you can.
16. The graph of $y = \tfrac{1}{2}(e^x + e^{-x})$, $0 \le x \le 1$, is rotated about the x-axis. Find the area of
the resulting surface.
17. The same graph is rotated about the y-axis. Find the area of the resulting surface.
18. Let G be the graph of $f(x) = \sin x$, from x = 0 to $x = \pi/2$. G is first rotated about the
line x + y = 4, and then about the line x + y = 5. Which of the resulting surfaces
has the larger area? Why? (A right answer, with a plausibility argument, is acceptable
as an answer to this one. It is possible, however, to give a proof of the right answer,
without calculating the area of either of the surfaces. That is, you can prove an
inequality of the form A < B, without calculating either A or B.)
7.6 Moments and Centroids. The Theorems of Pappus
The ideas in this section are mathematical descriptions of physical ideas. Given a
finite set of "point masses" $m_i$, at the points $P_i = (x_i, y_i)$ in a coordinate plane, the
moment (of the system) about the y-axis is
$$M_y = \sum_{i=1}^{n} x_i m_i.$$
The left-hand figure below shows the general case.
[Figure: systems of point masses $P_1, P_2, \ldots, P_n$ in the coordinate plane.]
Physically speaking, this means that if the plane is horizontal, resting on a knife-edge
along the y-axis, it will balance. The formula $\sum m_i x_i$ for $M_y$ makes it plain that the
effect of each point mass depends only on the product $m_i x_i$; if we divide $m_i$ by 2,
and double $x_i$, then the moment $M_y$ is unchanged.
Similarly, the moment about the x-axis, of our finite system of point masses, is
defined to be
$$M_x = \sum_{i=1}^{n} y_i m_i.$$
The total mass of all the particles in the system is denoted by m. That is,
$$m = \sum_{i=1}^{n} m_i.$$
The centroid of the system is defined to be the point
$$P = (\bar{x}, \bar{y})$$
such that
$$M_y = \bar{x} m \quad \text{and} \quad M_x = \bar{y} m.$$
Thus if we concentrate the entire mass of the system at P, the moments about the
x-axis and y-axis are unchanged.
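For example, if we have $m_1 = 2$ at $P_1 = (1, 2)$ and $m_2 = 3$ at $P_2 = (2, 5)$, then
$$m = \sum m_i = 5, \qquad M_y = 1 \cdot 2 + 2 \cdot 3 = 8, \qquad M_x = 2 \cdot 2 + 5 \cdot 3 = 19,$$
$$8 = \bar{x} \cdot 5, \qquad 19 = \bar{y} \cdot 5,$$
$$\bar{x} = \tfrac{8}{5}, \qquad \bar{y} = \tfrac{19}{5}.$$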
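The arithmetic of this example can be reproduced exactly. The following is a minimal sketch; the function name `centroid` and the use of `fractions.Fraction` to keep the answers as exact fractions are choices made for illustration.

```python
from fractions import Fraction

def centroid(masses, points):
    """Total mass, moments about the y-axis and x-axis, and centroid of point masses."""
    m = sum(masses)
    M_y = sum(mi * x for mi, (x, _) in zip(masses, points))
    M_x = sum(mi * y for mi, (_, y) in zip(masses, points))
    return m, M_y, M_x, (Fraction(M_y, m), Fraction(M_x, m))

m, M_y, M_x, P = centroid([2, 3], [(1, 2), (2, 5)])
print(m, M_y, M_x, P)
```

Concentrating the whole mass m at P leaves both moments unchanged, which is exactly the defining property of the centroid.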
The above discussion does not prove that $M_y$, $M_x$, and $P = (\bar{x}, \bar{y})$ have any
physical significance; only experiments can prove this. The fact, however, is that the
physical conditions for equilibrium are described by moments and centroids.
Let us now consider how these ideas can be applied to a region R in the xy-plane.
We shall think of R as a very thin sheet of homogeneous material, so that the mass per
unit area is constant, say, = 1.
Suppose that we take a net over the interval [a, b], as in the figure; for each x,
we let h(x) be the height of the cross section of R at x. Then
$$h(x_i)\,\Delta x$$
is the area of the rectangle in the figure. The rectangle is narrow, and so its moment
about the y-axis should be approximately
$$x_i h(x_i)\,\Delta x.$$
If we approximate the region R by a finite set of such narrow rectangles, then the
moment of R about the y-axis ought to be approximately
$$\sum_{i=1}^{n} x_i h(x_i)\,\Delta x,$$
and the approximation ought to get better as the mesh $\Delta x$ decreases. This is the idea
of the following definition.
Definition. Let R be the region lying between the graphs of two continuous functions
$f_1$ and $f_2$, on an interval [a, b], with $f_1 \le f_2$, and let
$$h(x) = f_2(x) - f_1(x).$$
Then the moment of R about the y-axis is
$$M_y = \int_a^b x h(x)\,dx.$$
$$w(y) = g_2(y) - g_1(y),$$
and by definition,
$$M_x = \int y\,w(y)\,dy.$$
Since
$$A = \int h(x)\,dx = \int w(y)\,dy,$$
it is natural to define the centroid of R as the point $P = (\bar{x}, \bar{y})$ such that
$$M_y = \bar{x} A, \qquad M_x = \bar{y} A.$$
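$$y = \sqrt{a^2 - x^2}, \qquad 0 \le x \le a.$$
Here
$$M_y = \int_0^a x \sqrt{a^2 - x^2}\,dx = \frac{a^3}{3}.$$
Obviously
$$A = \frac{\pi a^2}{4},$$
and so
$$\frac{a^3}{3} = \bar{x} \cdot \frac{\pi a^2}{4}.$$
Therefore
$$\bar{x} = \frac{4}{3\pi} a.$$
By symmetry,
$$\bar{y} = \bar{x} = \frac{4}{3\pi} a.$$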
o
fcx - x0)h(x)dx fxh(x)dx - x0fh(x)dx =My - x0A.
=
This is 0 for x0 = x. The proof of the other half of the theorem is the same. In fact,
the equation
M"'="o = My - x0A
shows that the converse of Theorem I is also true.
p p
L
pt
P'
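It is easy to see that if R is symmetric about the y-axis, then $\bar{x} = 0$. In the figure
on the left, h(x) is an even function, with h(-x) = h(x). Therefore xh(x) is an odd
function, with (-x)h(-x) = -[xh(x)]. Therefore
$$M_y = \int_{-a}^{a} x h(x)\,dx = 0,$$
and $\bar{x} = 0$.
[Figure: left, a region symmetric about the y-axis; right, a region symmetric about the line $x = x_0$, lying over the interval from $x_0 - k$ to $x_0 + k$.]
By symmetry,
$$h(x_0 - t) = h(x_0 + t)$$
for every t; and so, for $\phi(x) = (x - x_0) h(x)$,
$$\phi(x_0 - t) = -\phi(x_0 + t).$$
Therefore the graph of $\phi$ must be like the graph shown below.
[Figure: the graph of $\phi$ over the interval from $x_0 - k$ to $x_0 + k$.]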
Therefore
$$\int_{x_0 - k}^{x_0} \phi(x)\,dx = -\int_{x_0}^{x_0 + k} \phi(x)\,dx,$$
and
$$M_{x = x_0} = \int_{x_0 - k}^{x_0 + k} \phi(x)\,dx = 0.$$
It follows that $x_0 = \bar{x}$.
In this proof, all that we have used is the assumption that
Theorem 5 (Pappus' theorem, for volumes). If a region is rotated about a line not
intersecting it, then the volume of the resulting solid is equal to the area of the region
times the circumference of the circle described by the centroid.
That is, if the region below is rotated about the y-axis, then
$$V = 2\pi \bar{x} A.$$
By the method of shells,
$$V = \int_a^b 2\pi x h(x)\,dx.$$
Therefore
$$V = 2\pi M_y = 2\pi \bar{x} A,$$
because
$$M_y = \bar{x} A.$$
Pappus' theorem can be applied in two ways. If we know $\bar{x}$ and A, we can
compute $V = 2\pi \bar{x} A$; and if we know V and A, we can solve for $\bar{x} = V/(2\pi A)$. For
example, consider a circular region R, of radius a, with center at the point (b, 0),
b > a. When R is rotated about the y-axis, we get a solid which is called a solid torus.
(The surface of the solid is called a torus.) By Pappus' theorem, we get
$$V = 2\pi b \cdot \pi a^2 = 2\pi^2 a^2 b.$$
[Figure: a semicircular region of radius a, with its centroid P to be found.]
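We can use the theorem in reverse to find the centroid of a semicircular region.
If the region is rotated about the y-axis, we get a sphere of radius a, with volume
$$V = \tfrac{4}{3} \pi a^3.$$
Obviously
$$A = \tfrac{1}{2} \pi a^2.$$
Therefore
$$\tfrac{4}{3} \pi a^3 = 2\pi \bar{x} \cdot \tfrac{1}{2} \pi a^2,$$
and
$$\bar{x} = \frac{4}{3\pi} a.$$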
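The centroid of the semicircular region can also be found directly from the moment integral, and Pappus' theorem then gives back the volume of the sphere. The following is a minimal sketch; the helper `midpoint_integral`, the radius a = 1, and the number of subintervals are assumptions made for illustration.

```python
import math

def midpoint_integral(g, a, b, n):
    """Sample sum of g over [a, b], using the midpoints of n equal subintervals."""
    dx = (b - a) / n
    return sum(g(a + (i + 0.5) * dx) for i in range(n)) * dx

a = 1.0

def h(x):
    # height of the cross section of the half-disk x >= 0 at the point x
    return 2.0 * math.sqrt(a * a - x * x)

M_y = midpoint_integral(lambda x: x * h(x), 0.0, a, 20000)   # moment about the y-axis
A = math.pi * a * a / 2                                      # area of the half-disk
x_bar = M_y / A
V = 2 * math.pi * x_bar * A                                  # Pappus' theorem
print(x_bar, 4 * a / (3 * math.pi))
print(V, 4 * math.pi * a ** 3 / 3)
```

Both routes agree: the moment integral gives $\bar{x}$ close to $4a/(3\pi)$, and the resulting volume is close to that of the sphere.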
These ideas apply also to arcs. We shall think of an arc as a thin homogeneous
wire whose mass per unit length is constant, say, = 1. Suppose that the arc is the
graph of a function f, on an interval [a, b ]. As usual, we take a net over [a, b], with
equal subdivisions. The arc length over the interval $[x_{i-1}, x_i]$ is
$$s_i = \int_{x_{i-1}}^{x_i} \sqrt{1 + f'(x)^2}\,dx.$$
Now
$$M_y \approx \sum_{i=1}^{n} x_i \sqrt{1 + f'(x_i)^2}\,\Delta x.$$
Definitions. Given the function f, with a continuous derivative f', on [a, b], the
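moment of the graph about the y-axis is
$$M_y = \int_a^b x \sqrt{1 + f'(x)^2}\,dx;$$
and the moment about the line $x = x_0$ is
$$M_{x = x_0} = \int_a^b (x - x_0) \sqrt{1 + f'(x)^2}\,dx.$$
Similarly, we state the following:
$$M_x = \int_a^b f(x) \sqrt{1 + f'(x)^2}\,dx,$$
$$M_{y = y_0} = \int_a^b (f(x) - y_0) \sqrt{1 + f'(x)^2}\,dx;$$
and the centroid of the graph is the point $P = (\bar{x}, \bar{y})$ for which
$$M_y = \bar{x} L, \qquad M_x = \bar{y} L,$$
where L is the length of the arc.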
Our previous theorems for regions now have analogous forms for arcs, as follows.
Theorem 6. If the graph of f is symmetric about a line $x = x_0$ (or $y = y_0$) then this
line contains the centroid.
Theorem 7. If the graph of f is rotated about a horizontal or vertical line not
intersecting the graph, then the area of the resulting surface of revolution is equal to the
length of the arc times the circumference of the circle described by the centroid.
For example, if we rotate about the y-axis, then the area of the resulting surface is
$$S = \int_a^b 2\pi x \sqrt{1 + f'(x)^2}\,dx = 2\pi M_y = 2\pi \bar{x} L,$$
by definition of $\bar{x}$. The proof of the theorem in the other cases is similar.
Throughout this section, we have used a fixed coordinate system to define and
investigate moments and centroids of regions and arcs. It is a fact, however, that
moments and centroids do not depend on the choice of a coordinate system; they
depend only on the regions and the arcs. In particular, any line of symmetry (horizontal,
vertical, or sloping) must contain the centroid. You may use this fact in the
problem set below.
1. Let A, B, and C be the points (0, 0), (a, 0), and (b, c), a, c > 0. At each of these points
there is a particle of mass 1. Find the centroid of the resulting system.
3. A median of a triangle is a segment between a vertex and the midpoint of the opposite
side. Show that every median of the triangle described in Problem 1 passes through the
centroid.
4. Now consider the triangular region R determined by the same points A, B, and C.
Find the centroid of R.
7. The figure formed by the sloping sides of the triangle is rotated about the x-axis. Find
the area of the resulting surface.
10. The region T is rotated about the x-axis. Find the volume.
11. The region T is rotated about the y-axis. Find the volume.
12. The figure formed by the four sides of the trapezoid is rotated about the y-axis. Find the
surface area.
13. The circle with center at (b, 0) and radius a, with 0 < a < b, is rotated about the
y-axis. Find the area of the resulting torus.
14. Let the arc A be the portion of the circle with center at the origin and radius a which
lies in the first quadrant. Find the centroid of A.
15. The square with corners at (a, 0), (a + k, k), (a + k, -k), (a + 2k, 0) is rotated
about the y-axis. (Here 0 < k < a.) Find the area of the resulting surface.
16. Find the volume of the solid obtained if the corresponding square region is rotated.
17. The same square is rotated about the line x = a + 2k. Find the surface area.
18. The square region is rotated about the line x = a + 2k. Find the volume.
19. Consider the curve consisting of: (a) the segment from (0, 0) to (a, 0); (b) the segment
from (0, 1) to (a, 1); and (c) a semicircle, pointing outward, with endpoints at (a, 0)
and (a, 1). This curve is to be rotated about the y-axis. For what value of a is it true
that the total area of the resulting surface is equal to 15?
20. For each a let $S_a$ be the area of the surface described in Problem 19, and let $V_a$ be the
volume of the solid that it encloses. What value of a maximizes the ratio $V_a/S_a$?
The definite integral is defined as a limit of sample sums as the mesh of the net
approaches 0. This limit exists if the integrand f is continuous. But this definition
of the integral does not apply to the function
$$f(x) = \frac{1}{\sqrt{x}}$$
on the interval
$$(0, 1] = \{x \mid 0 < x \le 1\}$$
that we are dealing with, because at x = 0 the function is not defined. On this
half-open interval the function is unbounded. Therefore, for every net over (0, 1]
we can form a sample sum as large as we please, by taking the first sample point $x_1'$
close to 0. Thus,
the sample sum is large when $x_1'$ is small, and so the sample sums do not approach a
limit as the mesh approaches 0.
Nevertheless, we can extend the definition of the integral in such a way that our
problem has an answer. The function $f(x) = 1/\sqrt{x}$ is defined and continuous on
every closed interval [a, 1], where a > 0. Therefore $\int_a^1 (1/\sqrt{x})\,dx$ is well defined.
7.7 Improper Integrals 345
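[Figure: the graph of $f(x) = 1/\sqrt{x}$.]
We define
$$\int_0^1 \frac{dx}{\sqrt{x}} = \lim_{a \to 0^+} \int_a^1 \frac{dx}{\sqrt{x}},$$
if the indicated limit on the right exists. (We write $a \to 0^+$, because a takes on only
positive values.) In the present case, the limit exists and is finite:
$$\int_a^1 \frac{dx}{\sqrt{x}} = [2\sqrt{x}]_a^1 = 2 - 2\sqrt{a}.$$
Therefore
$$\lim_{a \to 0^+} \int_a^1 \frac{dx}{\sqrt{x}} = \lim_{a \to 0^+} [2 - 2\sqrt{a}] = 2.$$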
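There are similar-looking problems for which the limit is infinite. For example,
$$\int_0^1 \frac{dx}{x^2} = \lim_{a \to 0^+} \int_a^1 \frac{dx}{x^2},$$
$$\int_a^1 \frac{dx}{x^2} = \left[-\frac{1}{x}\right]_a^1 = -1 + \frac{1}{a} \quad (a > 0),$$
and so
$$\lim_{a \to 0^+} \int_a^1 \frac{dx}{x^2} = \infty, \qquad \int_0^1 \frac{dx}{x^2} = \infty.$$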
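The contrast between these two examples shows up clearly in a short table of values. The following is a minimal sketch: the values of a, the helper `midpoint_integral`, and the variable names are assumptions made for illustration, and the closed forms are the ones computed above.

```python
import math

def midpoint_integral(g, a, b, n):
    """Sample sum of g over [a, b], using the midpoints of n equal subintervals."""
    dx = (b - a) / n
    return sum(g(a + (i + 0.5) * dx) for i in range(n)) * dx

a_values = (1e-2, 1e-4, 1e-6, 1e-8)
inv_sqrt = [2 - 2 * math.sqrt(a) for a in a_values]   # integral of 1/sqrt(x) from a to 1
inv_square = [-1 + 1 / a for a in a_values]           # integral of 1/x^2 from a to 1

# Numerical check of the first closed form on the closed interval [0.01, 1].
numeric = midpoint_integral(lambda x: 1 / math.sqrt(x), 0.01, 1.0, 100000)

for a, s, q in zip(a_values, inv_sqrt, inv_square):
    print(a, s, q)
print(numeric)
```

As a decreases toward 0, the first column of integrals creeps up toward 2, while the second grows without bound.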
The same test can be applied at any point where the function "blows up," as long
as there is only one such point, at an endpoint of the interval. For example, consider
$$\int_0^{\pi/2} \sec x\,dx = \lim_{a \to \pi/2-} \int_0^a \sec x\,dx,$$
where the minus sign means that $a \to \pi/2$ through values less than $\pi/2$.
346 The Definite Integral 7.7
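[Figure: the graph of $\sec x$, with the point $\pi/2$ marked on the x-axis.]
Now
$$\int_0^a \sec x\,dx = \ln|\sec a + \tan a|.$$
As $a \to \pi/2-$,
$$\sec a = \frac{1}{\cos a} \to \infty \quad\text{and}\quad \tan a = \frac{\sin a}{\cos a} \to \infty.$$
Therefore
$$\lim_{a \to \pi/2-} \int_0^a \sec x\,dx = \infty.$$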
Consider next the integral
$$\int_1^\infty \frac{dx}{x^2}.$$
Here the integration is supposed to be carried out all the way to the right, starting
at x = 1. Again our definition of the definite integral (as the limit of the sample sums)
does not apply, and so we define the improper integral as a limit:
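$$\int_1^\infty \frac{dx}{x^2} = \lim_{a \to \infty} \int_1^a \frac{dx}{x^2}.$$
For
$$g(a) = \int_1^a \frac{dx}{x^2},$$
the limit exists and is finite:
$$\int_1^a \frac{dx}{x^2} = \left[-\frac{1}{x}\right]_1^a = -\frac{1}{a} + 1;$$
and so
$$\int_1^\infty \frac{dx}{x^2} = 1.$$
[Figure: graph of $f(x) = 1/x^2$.]
On the other hand,
$$\int_1^\infty \frac{dx}{x} = \lim_{a \to \infty} \int_1^a \frac{dx}{x} = \lim_{a \to \infty} \left[\ln x\right]_1^a = \lim_{a \to \infty} \ln a = \infty.$$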
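As a rough numerical check, the sketch below approximates the integrals from 1 to a by sample sums with midpoints as sample points, and compares them with $1 - 1/a$ and $\ln a$. The helper name `midpoint_sum` and the choice of nets are assumptions made for this illustration only.

```python
import math

def midpoint_sum(f, lo, hi, n):
    """Sample sum for f over [lo, hi]: n equal intervals, midpoints as sample points."""
    dx = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * dx) for i in range(n)) * dx

for a in (10.0, 100.0, 1000.0):
    n = int(100 * a)  # keeps the mesh at roughly 0.01
    s2 = midpoint_sum(lambda x: 1.0 / x**2, 1.0, a, n)
    s1 = midpoint_sum(lambda x: 1.0 / x, 1.0, a, n)
    # s2 should track 1 - 1/a (bounded); s1 should track ln a (unbounded).
    print(a, s2, 1.0 - 1.0 / a, s1, math.log(a))
```

The values for $1/x^2$ stay below 1 as a grows, while those for $1/x$ keep growing like $\ln a$.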
Integrals of the type that we have been discussing here are called improper
integrals, of the first and second kind respectively. The two kinds can occur in
combination. For example,
$$\int_0^\infty \frac{dx}{\sqrt{x}\,(1 + x)}$$
is improper in two ways: the function blows up at the lower limit, and also the upper
limit is $\infty$. Thus, for the integral to be finite, the two limits
$$\lim_{a \to 0+} \int_a^1 \frac{dx}{\sqrt{x}\,(1 + x)}, \qquad \lim_{b \to \infty} \int_1^b \frac{dx}{\sqrt{x}\,(1 + x)}$$
must both be finite; if they are, then the integral from 0 to $\infty$ is their sum. (Any
positive number k would have done equally well in place of 1.) In this case, both
limits are finite.
Improper integrals may appear in a disguised form, and so we need to be careful.
For example, a careless calculation gives
$$\int_{-1}^{1} \frac{dx}{x^2} = \left[-\frac{1}{x}\right]_{-1}^{1} = -1 - 1 = -2.$$
This is impossible, because the integrand is positive. The trouble here is that the
function blows up at x = 0, and so we need to make separate investigations of the
integrals
$$\int_{-1}^{0} \frac{dx}{x^2}, \qquad \int_{0}^{1} \frac{dx}{x^2}.$$
We find that both these limits are equal to $\infty$. Therefore the original integral is equal
to $\infty$.
Note that when we got the "answer" -2 we had no right to complain that the
theory was wrong, because the Fundamental Theorem of Integral Calculus does not
apply to functions whose domains have holes in them; the theorem applies only
to functions which are defined and continuous on the interval over which we are
integrating.
An even worse example of the same kind is
$$\int_{-1}^{1} \frac{dx}{x} = \left[\ln|x|\right]_{-1}^{1} = \ln 1 - \ln 1 = 0 \ (?).$$
Here
$$\int_{-1}^{0} \frac{dx}{x} = \lim_{a \to 0-} \left[\ln|x|\right]_{-1}^{a} = \lim_{a \to 0-} \ln|a| = -\infty,$$
$$\int_{0}^{1} \frac{dx}{x} = \lim_{a \to 0+} \left[\ln|x|\right]_{a}^{1} = \lim_{a \to 0+} \left[-\ln a\right] = \infty.$$
The limits $-\infty$ and $\infty$ do not combine to give a well-defined limit, either finite or
infinite, and therefore the original integral is not defined at all. We also get no
answer for
$$\int_0^\infty \sin x\,dx.$$
Here
$$\int_0^a \sin x\,dx = \left[-\cos x\right]_0^a = -\cos a + 1,$$
which oscillates forever between 0 and 2, and therefore does not approach a limit.
Thus, when we investigate an improper integral, there are three situations that
we may encounter.
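1) The integral may exist, as a finite limit. For example,
$$\int_1^\infty \frac{dx}{x^2} = \lim_{a \to \infty} \int_1^a \frac{dx}{x^2} = \lim_{a \to \infty} \left[-\frac{1}{x}\right]_1^a = 1.$$
2) The integral may be infinite. For example,
$$\int_1^\infty \frac{dx}{x} = \lim_{a \to \infty} \int_1^a \frac{dx}{x} = \lim_{a \to \infty} \ln a = \infty.$$
3) The integral may fail to have any value, finite or infinite, as in the examples $\int_{-1}^{1} dx/x$ and $\int_0^\infty \sin x\,dx$ above.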
In the following problem set, when you are asked to "investigate" an improper
integral, you should find out which of the above three cases it represents. If it is
Case 1, you should find the limit, unless the contrary is stated.
1.
x2 2.
1 x2 x 3.
1 xs
oo dx f dx loo dx
1 + +
J x3
1
4. 5. 6.
-v:x -v:x
loo dx loo dx loo dx
o o 1
f dx f dx1 0001
10.
1x x 11.
0 x. 12.
0 xo.9999
loo
In
19.
1 x4 1 +
20. Show that $\int_0^\infty x^n\,dx$ is never finite, for any value of n, positive, negative, or zero. (The
point is that something always goes wrong, either at 0 or at $\infty$.)
21. Consider the graph of $f(x) = 1/x$, $1 \le x < \infty$. Let R be the region under the graph;
let S be the solid of revolution (about the x-axis); and let T be the surface of revolution.
Investigate the improper integrals which represent (a) the area of R, (b) the volume of S,
and (c) the area of T.
Investigate the following for existence. (That is, find out whether the integral represents
a finite limit; but in the cases where it does, you need not calculate the limit.)
loo-1--.,
dx dx loo dx Joo dx
22.
1 e +
23.
2 x x
1 + In
24.
o 1 + v' x(l x)+
loo--dx
x2e-x f-00oo e-x• dx f;" e-x• dx f=� e-"'2
x
(To show that this is finite, it will be suffi-
25. 26.
1 1 + cient to show that < oo. It will
dx
then follow by symmetry that
e-x•
< oo. And obviously
because is continuous on
f:_1 e-x•-dx1,
[
x
< oo,
1].)
Joo --dx x
1T
x2
sin
*36.
..
IJv';;;(n+iJ.-. I 'Jv;;;
sm x2 dx < __ sin x2 dx .
I
f 00 v
v' (n-1>.-
38. Let f be a decreasing function, with a continuous derivative, on the interval $[a, \infty)$.
The graph (and the region under it) are rotated about the x-axis. Show that if the
surface area is finite, then so also is the volume.
The definition of continuity applies to the points x of the interval, one at a time.
It may appear, therefore, that if f is continuous at each point x of the interval, then we
have to use infinitely many boxes (one for each x) in order to exhibit the fact. But
if I is a closed interval, this is not so:
[Figure: the graph of a continuous function on a closed interval, covered by finitely many boxes.]
Proof Let [c, d] be a subinterval of [a, b]. If there is a finite collection of h-boxes,
covering the part of the graph determined by [c, d], then we shall say that [c, d] is good.
7.8 The Integrability of Continuous Functions 351
If no such finite collection exists, then we say that [c, d] is bad. We allow the case in
which [c, d] is all of [a, b]. Thus what we need to show is that [a, b] is good. Suppose,
then, that [a, b] is bad. We shall show that this leads to a contradiction.
Let $[a_1, b_1] = [a, b]$. If the left-hand half of $[a_1, b_1]$ and the right-hand half of
$[a_1, b_1]$ are both good, then it follows that $[a_1, b_1]$ is good; we can fit together two
finite collections of boxes, getting another finite collection of boxes that covers the
whole graph. Therefore one or both of the halves of $[a_1, b_1]$ must be bad. Let $[a_2, b_2]$
be a bad half of $[a_1, b_1]$. Similarly, one of the halves of $[a_2, b_2]$ must be bad. Let
$[a_3, b_3]$ be a bad half of $[a_2, b_2]$. Proceeding in this way, we get a nested sequence
$$[a_1, b_1],\ [a_2, b_2],\ [a_3, b_3],\ \ldots$$
of closed intervals, each of which is bad. By the nested interval postulate, there is a
number $\bar{x}$ which lies on all of these intervals.
Now f is continuous at $\bar{x}$. Therefore f has an h-box at $(\bar{x}, f(\bar{x}))$. Suppose that the
box is $\{(x, y) \mid x_0 < x < x_1,\ y_0 < y < y_1\}$. Since
$$b_n - a_n = \frac{b - a}{2^{n-1}},$$
the length of the nth interval $[a_n, b_n]$ approaches 0. Therefore we must have
$$x_0 < a_n \le \bar{x} \le b_n < x_1$$
for some n.
This means that $[a_n, b_n]$ is good after all: one h-box covers the part of the graph that
lies above it; and 1 is finite.
We continue now at the point where we left off in Section 7.2. There we defined
net, mesh (of a net), upper sum S(N), lower sum s(N), and sample sum $\Sigma(X)$. Theorem
3 of Section 7.2 was as follows:
Proof. Let $\varepsilon > 0$ be given. We need to show that there is a $\delta > 0$ such that
$$|N| < \delta \ \Rightarrow\ S(N) - s(N) < \varepsilon.$$
We know by the finite covering theorem that for every h > 0 there is a finite
collection of h-boxes, covering the graph. (See the left-hand figure below. We have
not yet decided what h we want to use.) The x-coordinates of the vertical sides of the
boxes, together with a and b, form a net $N_0$ over [a, b]. Let $\delta$ be the length of the
shortest interval in $N_0$. We assert that if N is any other net over [a, b], with $|N| < \delta$,
then every little interval $[x_{i-1}, x_i]$ in N lies under some one of our boxes. We illustrate
with the simpler figure on the right.
[Figures: left, the graph covered by a finite collection of h-boxes; right, a simpler case.]
If $[x_{i-1}, x_i]$ contains no point of $N_0$ (as on the
right) this is evident. If $[x_{i-1}, x_i]$ contains a point $y_j$ of $N_0$ (as on the left), then $y_j$
lies on the open interval under one of our boxes, and so $[x_{i-1}, x_i]$ lies under the same
box.
Now take a net N, with $|N| < \delta$. The difference S(N) - s(N) is the sum of the
areas of a collection of rectangles, like this:
[Figure: the rectangles lying between the upper sum and the lower sum.]
Each of these rectangles lies in one of our h-boxes. (Why?) Therefore each of them
has height $\le h$. Hence
$$S(N) - s(N) \le h(b - a).$$
Thus we want
$$h(b - a) < \varepsilon,$$
and this will hold if
$$h < \frac{\varepsilon}{b - a}.$$
This is the way we should choose h at the beginning of the proof. The resulting $\delta$
is the $\delta$ that we need.
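The shrinking of S(N) - s(N) with the mesh can be seen in a small numerical experiment. The sketch below is an illustration under a simplifying assumption not made in the proof: f is increasing, so that on each interval of the net the largest and smallest values occur at the right and left endpoints. The function name is hypothetical.

```python
def upper_lower_gap(f, a, b, n):
    """S(N) - s(N) for the net of n equal intervals over [a, b], assuming f is increasing."""
    dx = (b - a) / n
    xs = [a + i * dx for i in range(n + 1)]
    upper = sum(f(xs[i]) * dx for i in range(1, n + 1))      # right endpoints give the maximum
    lower = sum(f(xs[i - 1]) * dx for i in range(1, n + 1))  # left endpoints give the minimum
    return upper - lower

for n in (10, 100, 1000):
    print(n, upper_lower_gap(lambda x: x * x, 0.0, 1.0, n))
```

For $f(x) = x^2$ on [0, 1] the difference telescopes to $(f(1) - f(0))/n = 1/n$, so it can be made smaller than any given $\varepsilon$ by taking the mesh small enough.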
Definition. Let f be a function on an interval I. Suppose that for every $\varepsilon > 0$ there
is a $\delta > 0$ such that
$$|x - x'| < \delta \ \Rightarrow\ |f(x) - f(x')| < \varepsilon,$$
where x and x' are any two points of I. Then f is uniformly continuous on I.
Note that while continuity is defined for one point x at a time, uniform continuity
is defined for the graph as a whole. The difference between these ideas may be clarified
by an analogy:
Thus, uniform literacy is a property not of the individuals in a group but of the
group as a whole; if each of the members of the group is literate, it follows that the
group is literate [see (2)], but it does not follow that the group is uniformly literate.
The difference between continuity of a function f on an interval I and uniform
continuity of f on I is analogous. For example, $f(x) = 1/x$ is continuous on the open
interval I = (0, 1), because f is continuous at every point x of I. But f is not uniformly
continuous on I. (For every $\varepsilon > 0$, we can find two points x, x', as close together as
we please, such that $|f(x) - f(x')| > \varepsilon$.)
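A concrete family of such pairs is easy to write down. In the sketch below (an illustration; the helper `close_pair` is hypothetical) the points $x = 1/n$ and $x' = 1/(n+1)$ lie in (0, 1) for $n \ge 2$, their distance is $1/(n(n+1))$, and yet $f(x') - f(x) = 1$ for every n. Replacing n + 1 by n + k would make the difference equal to k.

```python
def f(x):
    return 1.0 / x

def close_pair(n):
    """Two points of (0, 1), at distance 1/(n*(n+1)), whose f-values differ by 1."""
    return 1.0 / n, 1.0 / (n + 1)

for n in (10, 100, 1000):
    x, x2 = close_pair(n)
    # The distance between the points shrinks, but the gap in f stays at 1.
    print(n, abs(x - x2), abs(f(x) - f(x2)))
```

No single $\delta$ can therefore serve for $\varepsilon = 1/2$ over the whole interval, which is exactly the failure of uniform continuity.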
But continuity implies uniform continuity when the domain of the function is a
closed interval.
Proof. Let $\varepsilon > 0$ be given. By the finite covering theorem there is a finite collection
of boxes of height $\varepsilon$, covering the graph. (We are using $\varepsilon$ as the h of Theorem 1.)
Let $N_0$ be the corresponding net over [a, b], as in the proof of Theorem 2. As before,
let $\delta$ be the length of the shortest interval in $N_0$. It follows that if $|x - x'| < \delta$, then
x and x' lie under some one of our boxes. Therefore $|f(x) - f(x')| < \varepsilon$, which was
to be proved.
This is the idea that we need, to complete the proof of the formula
$$A = \int_a^b 2\pi f(x)\sqrt{1 + f'(x)^2}\,dx$$
for the area of a surface of revolution about the x-axis. In Section 7.5, we knew that
$$\Sigma' = \sum_{i=1}^{n} 2\pi f(x_i')\sqrt{1 + f'(x_i')^2}\,\Delta x_i$$
approaches the integral. But the area of the approximating surface was
$$\Sigma = \sum_{i=1}^{n} 2\pi f(\bar{x}_i)\sqrt{1 + f'(x_i')^2}\,\Delta x_i,$$
with two different sample points $\bar{x}_i$, $x_i'$ used on each interval $[x_{i-1}, x_i]$. Thus we need
to show that
$$\lim_{|N| \to 0} |\Sigma' - \Sigma| = 0.$$
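We are assuming that f' is continuous. Therefore so also is $2\pi\sqrt{1 + f'(x)^2}$. Therefore
the latter function is bounded. Let M be such that
$$2\pi\sqrt{1 + f'(x)^2} \le M \qquad (a \le x \le b).$$
Let $\varepsilon > 0$ be given. Then
$$\frac{\varepsilon}{M(b - a)} > 0.$$
By the uniform continuity theorem, there is a $\delta > 0$ such that
$$|x - x'| < \delta \ \Rightarrow\ |f(x) - f(x')| < \frac{\varepsilon}{M(b - a)}.$$
Also
$$|N| < \delta \ \Rightarrow\ |\bar{x}_i - x_i'| < \delta \ \text{ for each } i.$$
Now
$$\Sigma' - \Sigma = \sum_{i=1}^{n} 2\pi\left[f(x_i') - f(\bar{x}_i)\right]\sqrt{1 + f'(x_i')^2}\,\Delta x_i,$$
and so
$$|\Sigma' - \Sigma| \le \sum_{i=1}^{n} M\,|f(x_i') - f(\bar{x}_i)|\,\Delta x_i.$$
Therefore
$$|N| < \delta \ \Rightarrow\ |\Sigma' - \Sigma| < \sum_{i=1}^{n} M \cdot \frac{\varepsilon}{M(b - a)}\,\Delta x_i = \frac{\varepsilon}{b - a}\,(b - a) = \varepsilon.$$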
Most of the questions below can be answered on the basis of a careful reexamination of
the theorems and proofs in Sections 7.2 and 7.8. Some of them, however, require independent
investigation. Naturally, all answers should be explained.
1. Suppose that f is known to be increasing on [a, b], but is not known to be continuous.
Does it follow that f is integrable?
6. Suppose that f is (a) continuous at a, (b) continuous at b, and (c) uniformly continuous
on (a, b). Does it follow that f is uniformly continuous on [a, b]?
*7. Suppose that f is bounded and integrable, but not necessarily continuous, on [a, b].
For each x of [a, b], let
$$F(x) = \int_a^x f(t)\,dt.$$
Show that F is continuous. (The betweenness theorem for integrals, which is Theorem 5
of Section 7.2, may be useful here.)
*8. For the definition of Lipschitzian, see Problem 12a of Problem Set 7.2. Show that if f
is Lipschitzian on I, then/ is uniformly continuous on I. (Here I may be any interval,
open or closed, finite or infinite.)
$$\left|\int_a^b f(x)\,dx\right| \le \int_a^b |f(x)|\,dx.$$
8 The Conic Sections
A coordinate system on a line L is a one-to-one correspondence
$$L \leftrightarrow \mathbb{R}, \qquad P \leftrightarrow x,$$
between the points P of L and the real numbers x, such that the distance between
any two points is the absolute value of the difference of the corresponding numbers.
That is,
$$P_1P_2 = |x_2 - x_1|.$$
If we subtract the same number from the coordinate of every point, we obtain
another coordinate system on the line. If we subtract h from every x, then
$$x_2' - x_1' = (x_2 - h) - (x_1 - h) = x_2 - x_1,$$
and so
$$|x_2' - x_1'| = |x_2 - x_1|.$$
Therefore the distance formula works, for the new coordinates $x' = x - h$.
This process is called translation. The origin is moved to the point h, and all the
other number labels are moved with it.
[Figure: a number line carrying both the old labels x = 0, 1, 2, ..., h, h + 1, ... and the new labels x' = x - h.]
Thus the old and new coordinates are related by the formulas
$$x' = x - h, \qquad x = x' + h.$$
8.1 Translation of Axes 357
[Figure: the old axes x, y and the new axes x', y', which lie along the lines y = k and x = h.]
Suppose that we translate the coordinate system on the x-axis, subtracting h
from every x-coordinate, and then translate the coordinate system on the y-axis,
subtracting k from every y-coordinate. The effect is to move the origin to the point
(h, k). Every point P now has a new pair of coordinates x', y', and these are related
to the old ones by the formulas
$$x = x' + h, \qquad x' = x - h,$$
$$y = y' + k, \qquad y' = y - k.$$
These formulas are easy to remember; the only way you are likely to go wrong
is to get them backwards (by writing x' = x + h (?) or y' = y + k (?)). It is easy
to see, however, that the new origin must have old coordinates h, k and new coordinates
0, 0; and from this we can tell which way the formulas ought to go.
As usual, (x, y) denotes the point whose old coordinates are x and y. Thus the
old origin is (0, 0), and the new origin is (h, k). When we write (a, b)' (with a prime
outside the parentheses) we mean the point whose new coordinates are a and b.
Thus the new origin is (0, 0)', and the old origin is (-h, -k)'. More examples are
given below:
[Figure: the old axes x, y and the new axes x', y', with the points P = (5, 2) = (2, 1)' and Q = (-1, -1) = (-4, -2)'.]
In the figure, h = 3 and k = 1. Two points have been labeled both ways. At the
point P, we have
x = 5, y = 2, x' = 2, y' = 1,
so that the label
P = (5, 2) = (2, 1)'
is correct. Similarly, at Q we have
x = -1, y = -1, x' = -4, y' = -2,
so that
Q = (-1, -1) = (-4, -2)'.
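The two labelings can be checked mechanically. The short Python sketch below is an illustration only; the function names `to_new` and `to_old` are hypothetical, and they implement the formulas x' = x - h, y' = y - k and x = x' + h, y = y' + k.

```python
def to_new(point, h, k):
    """New coordinates (x', y') of a point given by its old coordinates (x, y)."""
    x, y = point
    return (x - h, y - k)

def to_old(point, h, k):
    """Old coordinates (x, y) of a point given by its new coordinates (x', y')."""
    xp, yp = point
    return (xp + h, yp + k)

h, k = 3, 1
print(to_new((5, 2), h, k))    # the point P
print(to_new((-1, -1), h, k))  # the point Q
print(to_old((0, 0), h, k))    # the new origin, in old coordinates
```

Checking that the new origin (0, 0)' comes out as (h, k) in old coordinates is a quick way to catch the formulas being applied backwards.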
When we write an equation to describe a figure in the plane, the equation depends
on the choice of axes; and often one choice of axes gives a simpler equation than any
other. If we didn't start with the axes in the best position, then we can simplify the
equation by translation of axes. For example, consider the parabola with directrix
y = -1 and focus F = (3, 3).
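[Figure: the parabola with focus F = (3, 3), directrix D (the line y = -1), vertex V, a point P = (x, y) of the curve, and the foot M of the perpendicular from P to D.]
A point P = (x, y) lies on the parabola if and only if it is equidistant from the focus and the directrix:
$$PF = PM$$
$$\Leftrightarrow\ \sqrt{(x - 3)^2 + (y - 3)^2} = |y + 1|$$
$$\Leftrightarrow\ x^2 - 6x + 9 + y^2 - 6y + 9 = y^2 + 2y + 1$$
$$\Leftrightarrow\ 8y = x^2 - 6x + 17$$
$$\Leftrightarrow\ y = \tfrac{1}{8}x^2 - \tfrac{3}{4}x + \tfrac{17}{8}.$$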
We know, however, that a parabola with a horizontal directrix and vertex at the
origin always has an equation of the form $y = ax^2$. The vertex of our parabola is
halfway between the focus and the directrix, at the point V = (3, 1). This means
that we should translate the axes so that the new origin becomes the point (3, 1). With
$$x = x' + 3, \qquad y = y' + 1,$$
the equation $8y = x^2 - 6x + 17$ becomes
$$8y' = x'^2,$$
and so the translation causes the equation
to appear in a simple form. If we hadn't known this, we could still have investigated
algebraically, to find out what sort of simplifications a translation could accomplish.
To do this, we would regard h and k as unknown quantities, and make the substitution
$$x = x' + h, \qquad y = y' + k$$
in the equation $8y = x^2 - 6x + 17$. This gives
$$x'^2 + (2h - 6)x' - 8y' + (h^2 - 6h + 17 - 8k) = 0.$$
Certain facts are now obvious: (1) We can't get rid of the term $x'^2$, by any choice
of h and k, because h and k do not appear in the coefficient of $x'^2$. (2) For the same
reason, we can't get rid of the linear term in y'. (3) The total coefficient of x' is 2h - 6,
and the total constant term is
$$h^2 - 6h + 17 - 8k.$$
We can therefore get rid of the x' term, by using h = 3. The constant term then
becomes
$$9 - 18 + 17 - 8k, \quad\text{or}\quad 8 - 8k,$$
which is 0 when k = 1. Thus, translating the origin to the point (h, k) = (3, 1),
we get the equation in the form
$$8y' = x'^2,$$
as before. This is the process that you follow if you don't know the answer in advance.
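The bookkeeping in this substitution can be mechanized. The sketch below is an illustration, not part of the text: it writes a second-degree equation as $Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0$ (a labeling assumed here), substitutes x = x' + h, y = y' + k, and returns the coefficients of the translated equation. The function name is hypothetical.

```python
def translate_conic(A, B, C, D, E, F, h, k):
    """Coefficients of the equation in x', y' after substituting x = x' + h, y = y' + k."""
    D2 = 2 * A * h + B * k + D          # total coefficient of x'
    E2 = B * h + 2 * C * k + E          # total coefficient of y'
    F2 = A * h * h + B * h * k + C * k * k + D * h + E * k + F  # total constant term
    return (A, B, C, D2, E2, F2)        # second-degree coefficients are unchanged

# The parabola 8y = x^2 - 6x + 17, written as x^2 - 6x - 8y + 17 = 0, with (h, k) = (3, 1).
print(translate_conic(1, 0, 0, -6, -8, 17, 3, 1))
```

For the parabola above the result has no x' term and no constant term, leaving $x'^2 - 8y' = 0$, in agreement with the computation by hand.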
1. Find a translation which eliminates both of the linear terms from the equation
$xy - 5y - 6x - 30 = 0$.
2. Is there a translation which eliminates the xy-term from the above equation? Why
or why not? How about the possibility of removing the constant term?
3. Find a translation which removes both linear terms from the equation
$x^2 + y^2 + x + y - 2 = 0$.
4. Find a translation which removes both linear terms from the equation
$2xy - x + 3y - 2 = 0$.
5. Find a translation which removes both linear terms from the equation
$x^2 + y^2 + 4x + 2y + 1 = 0$.
6. Find a translation which removes both linear terms from the equation
$x^2 + xy - 3x + 2 = 0$.
7. Find a translation which removes both linear terms from the equation
$x^2 + xy + y^2 + x + y + 5 = 0$.
8. Find a translation which removes both linear terms from the equation
$x^2 + xy + y^2 + x + y + 1 = 0$.
9. Show that there is no translation which removes both linear terms from the equation
$x^2 + 2xy + y^2 + x - y + 1 = 0$.
10. Show that there is no translation which removes both linear terms from the equation
with possible linear terms but no constant term? (You may be able to think of a way to
answer this question without doing any calculations at all.)
12. Show that if $ad - bc \ne 0$, then the linear system
$$ah + bk = e, \qquad ch + dk = f$$
always has a solution. (Simply start solving it; at some point, you will need to assume
that $ad - bc \ne 0$.)
13. Consider an equation of the form
$$Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0.$$
Show that if $B^2 - 4AC \ne 0$, then there is always a translation that eliminates both of
the linear terms. (The converse is not true; there are simple examples of equations
where $B^2 - 4AC = 0$, but where the linear terms are absent to begin with. Examples?)
Show that if the axes are translated, then C is the graph of an equation of the same form,
in the new coordinates x' and y'.
Let F and F' be two points, let c be half the distance between them, so that
FF'= 2c,
and let a be a number greater than c. Let C be the graph of the equation
FP + F'P = 2a.
The curve C is called the ellipse with foci F, F' and focal sum 2a.
8.2 The Ellipse 361
(For a = c, the graph of the condition FP + F'P = 2a is the segment from F to F'; and for
a < c, the graph is empty.)
Some things about ellipses are easily seen from the definition. For the definition
of symmetry of a figure, about a line or a point, see Section 7.6.
FP + F'P = 2a.
Proof? (This is not quite as simple as the preceding theorem. See the right-hand
figure above.)
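[Figures: symmetry of the ellipse about a line L' (left, with a reflected point P') and about the point $P_0$ midway between the foci (right).]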
Theorem 4. Every ellipse is symmetric about the point midway between its foci.
P0 is called the center of the ellipse. (See the right-hand figure above.)
These symmetry theorems convey nearly all that is easy to see about ellipses
merely from the definition. Our next step is to set up a coordinate system, and describe
our ellipses by equations. We take the origin at the center of the ellipse, and the foci
on the x-axis. The ellipse is then said to be in standard position, relative to the axes.
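[Figure: the ellipse in standard position, with a point P(x, y) on it.]
As indicated in the figure above, let F and F' be the points (-c, 0) and (c, 0). Then
$$FP + F'P = 2a$$
$$\Leftrightarrow\ \sqrt{(x + c)^2 + y^2} + \sqrt{(x - c)^2 + y^2} = 2a$$
$$\Leftrightarrow\ \sqrt{(x + c)^2 + y^2} = 2a - \sqrt{(x - c)^2 + y^2}$$
$$\Rightarrow\ x^2 + 2cx + c^2 + y^2 = 4a^2 - 4a\sqrt{(x - c)^2 + y^2} + x^2 - 2cx + c^2 + y^2$$
$$\Leftrightarrow\ a\sqrt{(x - c)^2 + y^2} = a^2 - cx$$
$$\Rightarrow\ a^2(x^2 - 2cx + c^2 + y^2) = a^4 - 2a^2cx + c^2x^2$$
$$\Leftrightarrow\ (a^2 - c^2)x^2 + a^2y^2 = a^2(a^2 - c^2)$$
$$\Leftrightarrow\ \frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1.$$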
Thus every point on the ellipse satisfies the final equation. It is possible to show,
conversely, that every point (x, y) that satisfies the final equation lies on the ellipse.
(See Problem 22 below.) Thus we have:
Theorem 5. The ellipse with foci at (±c, 0) and focal sum 2a is the graph of
the equation
$$\frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1.$$
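The theorem can be spot-checked numerically. The sketch below is an illustration under assumptions chosen here: a = 3 and c = 2 (so that $a^2 - c^2 = 5$), and points of the curve produced by the parametrization $x = a\cos t$, $y = \sqrt{a^2 - c^2}\,\sin t$, which satisfies the equation identically. The function name is hypothetical.

```python
import math

def focal_sum(x, y, c):
    """FP + F'P for the foci F = (-c, 0), F' = (c, 0) and the point P = (x, y)."""
    return math.hypot(x + c, y) + math.hypot(x - c, y)

a, c = 3.0, 2.0
b = math.sqrt(a * a - c * c)  # so the curve is x^2/9 + y^2/5 = 1

for t in (0.0, 0.7, 1.5, 3.0, 4.2):
    x, y = a * math.cos(t), b * math.sin(t)
    # Every printed value should agree with 2a up to rounding error.
    print(t, focal_sum(x, y, c), 2 * a)
```

This checks only the direction "points of the graph have focal sum 2a" at a few sample points; it is a sanity check, not a substitute for the algebra.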
[Figure: the ellipse $x^2/9 + y^2/5 = 1$.]
Given an equation
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1,$$
the graph is as follows.
For $b^2 < a^2$, the graph is the ellipse with focal sum 2a and foci at (±c, 0), where
$c = \sqrt{a^2 - b^2}$. For $a^2 < b^2$, the graph is the ellipse with focal sum 2b and foci
at (0, ±c), where $c = \sqrt{b^2 - a^2}$.
[Figure: the two cases, with intercepts ±a on the x-axis and ±b on the y-axis.]
If the foci are not in either of the two positions shown above, then the equation
of the ellipse is more complicated. In some cases, when the equation is given, we can
simplify the equation by a translation of axes. Consider
Evidently we want h = 1, k = -1; and this gives the equation in the form
$$4x'^2 + 9y'^2 - 36 = 0, \quad\text{or}\quad \frac{x'^2}{9} + \frac{y'^2}{4} = 1.$$
The graph is the ellipse with foci at $(\pm\sqrt{5}, 0)'$ and focal sum 6; it intersects the x'-
and y'-axes at the points (±3, 0)', (0, ±2)'. We can now sketch, showing both sets of
axes.
[Figure: the ellipse, drawn with both the old axes x, y and the new axes x', y'.]
In doing such sketches, we start by drawing the new axes and the curve, in a convenient
position on the paper, and then draw the old axes, in the position where
they must have been.
1. Foci at (±1, 0); focal sum 4. 2. Foci at (0, ±1); focal sum 4.
3. Foci at (1, 2), (1, 4); focal sum 4. 4. Foci at (-1, -1), (1, 1); focal sum 4.
5. Foci at (-1, 1), (1, -1); focal sum 4.
6. Foci at (±2, 0); focal sum 6. 7. Foci at (0, ±2); focal sum 6.
8. Foci at (-1, 1) and (1, -1); focal sum 6.
Find the foci and the focal sum, and sketch, showing both sets of axes, in cases where
more than one set is used.
13. $4x^2 + y^2 = 1$    14. $x^2 + x + \dfrac{y^2}{9} + \dfrac{2y}{3} + 1 = 0$
15. Given an equation of the form
$$Ax^2 + By^2 + Cx + Ey + F = 0,$$
where A and B are both positive, show that the graph is (a) an ellipse, (b) a point, or (c)
the empty set. (The same conclusion follows if A and B are both negative.)
16. A function f is odd if f(-x) = -f(x) for every x. Show that the graph of an odd
function is symmetric about the origin.
17. a) Let C be the graph of the sine function. Show that C is symmetric about the origin.
b) Now show that C is also symmetric about infinitely many other points. (Thus it
may happen that an unbounded figure has more than one "center." In fact, there
is a simpler example: a line is symmetric about each of its points, and so every point
of a line is "a center" of the line. For this reason, we ordinarily use the word
center only for bounded figures.)
18. a) Show that the graph of the cosine is symmetric about infinitely many points.
b) Show that the graph of the sine is symmetric about infinitely many lines.
19. Consider the infinite strip R between the lines y =1 and y = -1. That is,
20. Show that every cubic curve is symmetric about its point of inflection. Here by a cubic
curve we mean the graph of an equation $y = ax^3 + bx^2 + cx + d$, with $a \ne 0$.
21. Suppose that in Theorem 3 of this section we drop the hypothesis that the two lines of
symmetry are perpendicular. Would the resulting theorem be true? Why or why not?
*22. Given 0 < c < a, as in the definition of an ellipse. Let F = (-c, 0), F' = (c, 0).
Let P = (x, y) be a point satisfying the equation
$$\frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1.$$
a) Show that
$$y^2 = \frac{a^2 - c^2}{a^2}\,(a^2 - x^2).$$
b) Show that
$$FP + F'P = \frac{1}{a}\sqrt{(a^2 + cx)^2} + \frac{1}{a}\sqrt{(a^2 - cx)^2}.$$
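Given 0 < a < c, and the points F and F', with FF' = 2c. Let C be the graph of the
condition
$$FP - F'P = \pm 2a \qquad (a < c).$$
The curve C is called the hyperbola with foci F, F' and focal difference 2a. The figure
shows what a hyperbola looks like, but the reasons for this appearance of the graph
are not obvious; the only thing that is easy to see, on the basis of the definition, is
that the hyperbola is symmetric about each of the two perpendicular lines. The first
step in our investigation of hyperbolas is to take the axes in a convenient position, as
shown above, with F = (-c, 0) and F' = (c, 0), and get an equation for the curve.
$$FP - F'P = \pm 2a$$
$$\Leftrightarrow\ FP = F'P \pm 2a$$
$$\Leftrightarrow\ \sqrt{(x + c)^2 + y^2} = \sqrt{(x - c)^2 + y^2} \pm 2a$$
$$\Rightarrow\ x^2 + 2cx + c^2 + y^2 = x^2 - 2cx + c^2 + y^2 \pm 4a\sqrt{(x - c)^2 + y^2} + 4a^2$$
$$\Leftrightarrow\ cx - a^2 = \pm a\sqrt{(x - c)^2 + y^2}$$
$$\Rightarrow\ c^2x^2 - 2a^2cx + a^4 = a^2(x^2 - 2cx + c^2 + y^2)$$
$$\Leftrightarrow\ (c^2 - a^2)x^2 - a^2y^2 = a^2(c^2 - a^2)$$
$$\Leftrightarrow\ \frac{x^2}{a^2} - \frac{y^2}{c^2 - a^2} = 1.$$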
Thus every point P = (x, y) of the hyperbola satisfies the final equation. As in the
case of the ellipse, it can be shown conversely that every point on the graph of the
final equation is on the hyperbola. (See Problem 32 below.) Since $c^2 > a^2$, we may let
$$b^2 = c^2 - a^2.$$
The equation then takes the form
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1.$$
And we can sum up as follows:
Theorem 1. The graph of the equation
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$$
is the hyperbola with foci at (±c, 0) (where $c = \sqrt{a^2 + b^2}$) and focal difference 2a.
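A numerical spot check of Theorem 1 is sketched below. It is an illustration under assumptions chosen here: a = 3, b = 4 (so c = 5), and points of the graph produced by the parametrization $x = \pm a\cosh t$, $y = b\sinh t$, which satisfies the equation identically. The function name is hypothetical.

```python
import math

def focal_difference(x, y, c):
    """FP - F'P for the foci F = (-c, 0), F' = (c, 0) and the point P = (x, y)."""
    return math.hypot(x + c, y) - math.hypot(x - c, y)

a, b = 3.0, 4.0
c = math.sqrt(a * a + b * b)

for t in (-1.0, 0.0, 0.5, 2.0):
    x, y = a * math.cosh(t), b * math.sinh(t)  # a point of the right-hand branch
    # Right-hand branch: FP - F'P is +2a; mirror image on the left branch: -2a.
    print(t, focal_difference(x, y, c), focal_difference(-x, y, c))
```

The two signs in the condition FP - F'P = ±2a correspond to the two branches of the curve.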
We shall use our equation to justify the sketch which we gave at the outset.
1) No point of the hyperbola lies between the lines x = -a and x = a. The
reason is as follows. Solving for y, we get
$$y = \pm\frac{b}{a}\sqrt{x^2 - a^2},$$
and the square root is defined only when $x^2 \ge a^2$.
It remains to discuss the two lines which the curve seems to be getting close to
when both x and y become numerically large. The behavior of the hyperbola relative
to these lines is like the behavior of the graph of
$$y = f(x) = \frac{1}{x}$$
relative to the coordinate axes. The coordinate axes are called asymptotes of this
curve. By this we mean, roughly speaking, that points of the curve, far from the
origin, in the appropriate directions, are close to the axes. We want to extend this
idea to cases in which the asymptote is neither horizontal nor vertical.
[Figure: the graph of $y = f(x) = 1/x$, $x \ne 0$.]
As $x \to \infty$, the distance from the point P = (x, y) to the x-axis approaches 0.
(This distance is $MP = |y| = |1/x|$.) We shall take this property as our definition
of an asymptote. That is, a line L is an asymptote of a function-graph if the distance
from the line to the point P = (x, f(x)) approaches 0 as $x \to \infty$, or as $x \to -\infty$.
It is evident that the x-axis is an asymptote of the graph of $f(x) = 1/x$ under this
definition. In fact, the x-axis is an asymptote in both the positive and negative
directions. We also say that a line L is an asymptote of a curve C if C contains a
function-graph which has L as an asymptote. In the case of y = 1/x, we also have
x = g(y) = 1/y; thus the curve, looked at sidewise, is still a function-graph, and has
the y-axis as an asymptote, in both the positive and negative directions. This is shown
in the left-hand figure below.
[Figures: left, the curve $x = g(y) = 1/y$ ($y \ne 0$), with $\lim MP = 0$ as $y \to \infty$ and as $y \to -\infty$; right, the curve $y = f(x) = 1/x$ ($x \ne 0$), with $\lim MP = 0$.]
8.3 The Hyperbola 369
We return to our hyperbola. In the last figure on the preceding page, the slope
of the segment from the origin to the point P = (x, y) is
$$m(x) = \frac{y}{x} = \frac{1}{x} \cdot \frac{b}{a}\sqrt{x^2 - a^2} = \frac{b}{a}\sqrt{1 - \frac{a^2}{x^2}}.$$
Obviously
$$\lim_{x \to \infty} m(x) = b/a,$$
and this suggests that the line y = bx/a, or x/a - y/b = 0, is an asymptote of the
part of the curve that lies in the first quadrant. If we show this, then it will follow
by symmetry that the lines x/a ± y/b = 0 are asymptotes of the curve in all four
quadrants.
Thus we need to show that $\lim_{x \to \infty} MP = 0$. Since MP < NP, it will be sufficient
to show that lim NP = 0. This can be done by an algebraic trick.
$$NP = \frac{b}{a}x - \frac{b}{a}\sqrt{x^2 - a^2} = \frac{b}{a}\left(x - \sqrt{x^2 - a^2}\right)$$
$$= \frac{b}{a}\left(x - \sqrt{x^2 - a^2}\right) \cdot \frac{x + \sqrt{x^2 - a^2}}{x + \sqrt{x^2 - a^2}}$$
$$= \frac{b}{a} \cdot \frac{a^2}{x + \sqrt{x^2 - a^2}}.$$
Obviously $NP \to 0$ as $x \to \infty$. Therefore $MP \to 0$, which was to be proved. This
gives the following theorem.
Theorem. The lines x/a ± y/b = 0 are asymptotes of the hyperbola
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1.$$
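The algebraic trick can be checked numerically. In the sketch below (an illustration; the function names and the values a = 3, b = 2 are assumptions made here), NP is computed both directly, as the vertical gap between the line y = bx/a and the first-quadrant part of the curve, and from the rationalized form $\frac{b}{a} \cdot \frac{a^2}{x + \sqrt{x^2 - a^2}}$.

```python
import math

def gap_direct(x, a, b):
    """Vertical distance NP between the line y = b*x/a and the curve, for x >= a."""
    return (b / a) * x - (b / a) * math.sqrt(x * x - a * a)

def gap_rationalized(x, a, b):
    """The same quantity after multiplying by the conjugate expression."""
    return (b / a) * (a * a) / (x + math.sqrt(x * x - a * a))

a, b = 3.0, 2.0
for x in (3.0, 10.0, 100.0, 1000.0):
    # Both forms agree, and both shrink toward 0 as x grows.
    print(x, gap_direct(x, a, b), gap_rationalized(x, a, b))
```

The rationalized form makes the limit evident, since its numerator is constant while its denominator grows without bound.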
You can sketch a hyperbola by drawing in the asymptotes and x-intercepts
exactly, and then filling in the curve freehand.
[Figure: a hyperbola sketched from its asymptotes and x-intercepts.]
The x-intercepts are at x = ±3, and the asymptotes are the lines
$$\frac{x}{3} \pm \frac{y}{2} = 0, \quad\text{or}\quad y = \pm\tfrac{2}{3}x.$$
' /
' /
' /
' /
' /
' /
' /
' /
' /
' /
' /
/ '
/ '
/ '
/
/
'
/ '
/ '
/ '
/ '
/ '
/ '
/ '
If such a hyperbola is in standard position, then the asymptotes must be the lines
x ± y = 0, and the equation must have the form
$$\frac{x^2}{a^2} - \frac{y^2}{a^2} = 1, \quad\text{or}\quad x^2 - y^2 = a^2.$$
If the foci are on the y-axis, at the points (0, ±c), then the equation of the hyperbola
takes the form
$$\frac{y^2}{a^2} - \frac{x^2}{c^2 - a^2} = 1.$$
The graph of
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = \pm 1$$
is the union of two hyperbolas with the same asymptotes. These are called conjugate
hyperbolas.
11. Foci at (±2, 0); focal difference 3. 12. Foci at (±2, 2); focal difference 3.
13. Foci at (0, 0) and (0, 4); focal difference 3. 14. Foci at (0, ±2); focal difference 3.
15. Foci at (±1, ±1); focal difference 2.
16. Foci at (±2, 0); passing through the point (3, 4).
17. Foci at (±2, 0); focal difference 2.
18. Foci at (±3, 0); focal difference 4.
19. Foci at (0, ±3); focal difference 4.
20. Foci at (±3, 0); passing through the point (5, 5).
21. Given F, F', and a, as for a hyperbola in standard position. What is the graph of the
condition FP - F'P = 2a? How about the graph of FP - F'P = -2a?
22. Find a rectangular hyperbola in standard position (with asymptotes x + y = 0 and
x - y = 0) passing through the point (5, 3).
Investigate the graphs of the following equations. In each case, find all asymptotes.
(x - l)(x - 2)
27. Let D be the line x = -1, let F be the origin, and for each point P let DP be the
perpendicular distance between D and P. Let C be the graph of the condition
$$\frac{FP}{DP} = 2.$$
What sort of curve is this? Sketch.
28. Let F and D be as in the preceding problem, and let C' be the graph of the condition
FP 1
=
DP 1.
What sort of curve is this? Sketch.
29. Let G be the set of points P such that CP = 2DP, where C is the circle $x^2 + y^2 = 1$ and
D is the line x = 4. What sort of figure is G? Discuss and sketch.
30. The following passage occurs in the U.S. Internal Revenue Act of 1964.
" ...There shall be allowed as a deduction moving expenses paid ... in connection
with the commencement of work by the taxpayer ... at a new principal place of work ...
[However,] no deduction shall be allowed ... unless ... the taxpayer's new principal
place of work ... is at least 20 miles farther from his former residence than was his
former principal place of work ... "
Give a sketch, showing what this means. Your sketch should show (a) the former
residence, (b) the former place of work, and (c) the region in which the new place of
work must lie, for the moving expenses to be deductible. (The author is indebted,
for this problem, to Dr. Henry Pollak, of the Bell Telephone Laboratories.)
31. The region between two conjugate hyperbolas stretches out infinitely far, in each of four
directions. Find out whether the area of such a region is finite.
*32. Given 0 < a < c, as in the definition of a hyperbola. Let F = (-c, 0), F' = (c, 0).
Let P = (x, y) be a point satisfying the equation
$$\frac{x^2}{a^2} - \frac{y^2}{c^2 - a^2} = 1.$$
a) Show that
$$y^2 = \frac{c^2 - a^2}{a^2}\,(x^2 - a^2).$$
b) Show that
$$FP - F'P = \frac{1}{a}\sqrt{(cx + a^2)^2} - \frac{1}{a}\sqrt{(cx - a^2)^2}.$$
c) Show that, if $x \ge a$, then
where at least one of the coefficients A, B, and C is different from zero. The latter
condition is to guarantee that the degree of the equation really is 2, rather than 1 or 0.
We have found that all conic sections are graphs of equations of this type; and we shall
now investigate the converse. That is, we propose to find out what sort of figure can
be the graph of a second-degree equation. The possibilities that we have already
found are
a) a circle,
b) a parabola,
c) an ellipse,
d) a hyperbola.
There are other possibilities, which we noted as exceptional cases when we were
studying the equation
x2 + y2 + Dx + Ey + F = 0,
8.4 The General Equation of the Second Degree. Rotation of Axes 373
The graph of
$$x^2 + y^2 = 0$$
is a point; and the graph of
$$x^2 + y^2 + 1 = 0$$
is empty. (See Theorem 2 of Section 2.3.) Our list of possible graphs of second-degree
equations must therefore include
e) a point, and
f) the empty set.
And this is not all. The graph of
y2 = 0
is a line, namely, the x-axis. And the graph of
xy = 0
is the union of two lines, namely, the two axes. Similarly, the graph of
$$x^2 - y^2 = 0$$
is the union of two lines. The reason is that $x^2 - y^2 = (x - y)(x + y)$. This is = 0
when x - y = 0 or x + y = 0.
In this example, the lines intersect, but we may get the union of two parallel lines.
The equation
$$x^2 - x = 0$$
is equivalent to
$$x(x - 1) = 0;$$
and the graph is therefore the union of the two parallel lines x = 0 and x = 1. Thus
the graphs, for the general equation of the second degree, include
g) a line, and
h) the union of two lines, either parallel or intersecting.
We shall show that the eight possibilities that we have just listed are the only possibilities.
The method will be to reduce the equation to a recognizable form by moving
the axes. In some cases, this cannot be done by translation; we may also have to use
rotation of the axes.
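Suppose that we rotate the axes through an angle of measure $\theta$, getting a new
pair of axes.
[Figure: the old axes x, y, the new axes x', y', and a point P at distance r from the origin O.]
In the figure, r is the distance OP; P has coordinates x, y in the old coordinate system,
and coordinates x', y' in the new coordinate system. Evidently
$$x' = r\cos(\varphi - \theta) = r\cos\varphi\cos\theta + r\sin\varphi\sin\theta,$$
$$y' = r\sin(\varphi - \theta) = r\sin\varphi\cos\theta - r\cos\varphi\sin\theta,$$
where $\varphi$ is the angle from the positive x-axis to OP, so that $x = r\cos\varphi$ and $y = r\sin\varphi$.
Therefore the new coordinates are given in terms of the old ones by the formulas
$$x' = x\cos\theta + y\sin\theta, \qquad y' = -x\sin\theta + y\cos\theta.$$
If we rotate the new axes through an angle of measure $-\theta$ we are back where we
started. Therefore the old coordinates are expressed in terms of the new ones by the
formulas
$$x = x'\cos(-\theta) + y'\sin(-\theta), \qquad y = -x'\sin(-\theta) + y'\cos(-\theta).$$
These give
$$x = x'\cos\theta - y'\sin\theta, \qquad y = x'\sin\theta + y'\cos\theta.$$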
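The two sets of rotation formulas should undo one another. The sketch below is an illustration (the function names are hypothetical): `to_new` uses $x' = x\cos\theta + y\sin\theta$, $y' = -x\sin\theta + y\cos\theta$, which is what the inverse formulas above give when $\theta$ is replaced by $-\theta$, and `to_old` uses $x = x'\cos\theta - y'\sin\theta$, $y = x'\sin\theta + y'\cos\theta$.

```python
import math

def to_new(x, y, theta):
    """Coordinates (x', y') relative to axes rotated through the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return (x * c + y * s, -x * s + y * c)

def to_old(xp, yp, theta):
    """Old coordinates (x, y) recovered from the new coordinates (x', y')."""
    c, s = math.cos(theta), math.sin(theta)
    return (xp * c - yp * s, xp * s + yp * c)

theta = math.pi / 6
xp, yp = to_new(2.0, 1.0, theta)
print((xp, yp), to_old(xp, yp, theta))  # the round trip should return (2, 1) up to rounding
```

A useful spot check: when the axes are rotated through a right angle, the point (1, 0) on the old x-axis lies on the negative half of the new y'-axis.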
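For example, consider the equation
$$xy = 1.$$
We substitute
$$x = x'\cos\theta - y'\sin\theta, \qquad y = x'\sin\theta + y'\cos\theta. \tag{1}$$
The equation then becomes
$$(x'\cos\theta - y'\sin\theta)(x'\sin\theta + y'\cos\theta) = 1.$$
For $\theta = \pi/4$ we have
$$\sin\theta = \cos\theta = \frac{1}{\sqrt{2}}$$
and
$$\sin\theta\cos\theta = \tfrac{1}{2}.$$
Thus our new equation is
$$\frac{x'^2}{2} - \frac{y'^2}{2} = 1.$$
This is the equation of a rectangular hyperbola.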
$$A(x'\cos\theta - y'\sin\theta)^2 + B(x'\cos\theta - y'\sin\theta)(x'\sin\theta + y'\cos\theta)$$
$$+\; C(x'\sin\theta + y'\cos\theta)^2 + D(x'\cos\theta - y'\sin\theta)$$
$$+\; E(x'\sin\theta + y'\cos\theta) + F = 0.$$
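When we collect coefficients for the terms of various types, we get a new equation of
the same form, like this:
$$A'x'^2 + B'x'y' + C'y'^2 + D'x' + E'y' + F' = 0,$$
where
$$A' = A\cos^2\theta + B\sin\theta\cos\theta + C\sin^2\theta,$$
$$B' = 2(C - A)\sin\theta\cos\theta + B(\cos^2\theta - \sin^2\theta),$$
$$C' = A\sin^2\theta - B\sin\theta\cos\theta + C\cos^2\theta,$$
$$D' = D\cos\theta + E\sin\theta, \qquad E' = -D\sin\theta + E\cos\theta, \qquad F' = F.$$
For future reference, we have written down all of these, but for the moment, all we
are interested in is $B'$: we want to find a $\theta$ that makes $B' = 0$. Simplifying
trigonometrically, we get
$$B' = (C - A)\sin 2\theta + B\cos 2\theta.$$
1) If $A = C$, then $B' = B\cos 2\theta$. We must have $B \ne 0$, or there wouldn't be any
$xy$-term in the original equation. Therefore $B' = 0$ when $\cos 2\theta = 0$, and we can take
$$\theta = \frac{\pi}{4}.$$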
This proves the theorem. (The theorem did not say that the coefficients in the new
equation were easy to compute.)
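The coefficients of the rotated equation are tedious by hand but easy to compute by machine. The following is a rough sketch under stated assumptions: the function names `rotate_conic` and `angle_removing_xy` are hypothetical, and the coefficient formulas are those obtained by expanding the substitution $x = x'\cos\theta - y'\sin\theta$, $y = x'\sin\theta + y'\cos\theta$.

```python
import math

def rotate_conic(A, B, C, D, E, F, theta):
    """Coefficients (A', B', C', D', E', F') after rotating the axes by theta."""
    c, s = math.cos(theta), math.sin(theta)
    A1 = A * c * c + B * s * c + C * s * s
    B1 = 2 * (C - A) * s * c + B * (c * c - s * s)
    C1 = A * s * s - B * s * c + C * c * c
    D1 = D * c + E * s
    E1 = -D * s + E * c
    return A1, B1, C1, D1, E1, F

def angle_removing_xy(A, B, C):
    """An angle theta that makes B' = 0 (pi/4 when A = C)."""
    if A == C:
        return math.pi / 4
    return 0.5 * math.atan(B / (A - C))

# The equation xy = 1, written as 0*x^2 + 1*xy + 0*y^2 - 1 = 0.
theta = angle_removing_xy(0.0, 1.0, 0.0)
new_coeffs = rotate_conic(0.0, 1.0, 0.0, 0.0, 0.0, -1.0, theta)
print(new_coeffs)
```

For $xy = 1$ the computed coefficients correspond to $\tfrac{1}{2}x'^2 - \tfrac{1}{2}y'^2 - 1 = 0$, a rectangular hyperbola.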
Proof By the preceding theorem, we can assume that there is no xy-term. The
equation then has the form
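$$Ax^2 + Cy^2 + Dx + Ey + F = 0.$$
1) Suppose that $A \ne 0$ and $C \ne 0$. Completing the square in $x$ and in $y$, we get
$$A\left(x^2 + \frac{D}{A}x\right) + C\left(y^2 + \frac{E}{C}y\right) = -F,$$
$$A\left(x + \frac{D}{2A}\right)^2 + C\left(y + \frac{E}{2C}\right)^2 = -F + \frac{D^2}{4A} + \frac{E^2}{4C}.$$
We translate the axes, taking
$$x' = x + \frac{D}{2A}, \qquad y' = y + \frac{E}{2C}.$$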
There are six possibilities to be considered. For each of these cases, we have indicated
on the right what sort of figure the graph is.
$$Ax^2 + Dx + Ey + F = 0,$$
and then we complete the square in $x$, so as to eliminate the linear term in $x$. This
gives an equation of the form
$$x'^2 + F'' = E'y.$$
For $E' \ne 0$, this is a parabola. For $E' = 0$, the equation $x'^2 = -F''$ gives one line,
two lines, or the empty set.
3) Suppose that A = 0. This is exactly like Case 2; we interchange x and y, and
proceed as before. This completes the proof of the theorem.
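$$\theta = \frac{1}{2}\,\mathrm{Tan}^{-1}\frac{B}{A - C}.$$
Thus we want to express $\sin\theta$ and $\cos\theta$ in terms of $\tan 2\theta$ ($= B/(A - C)$) for the
case where
$$-\frac{\pi}{2} < 2\theta < \frac{\pi}{2}.$$
When $2\theta$ is in the first or fourth quadrant, $\cos 2\theta > 0$, and $\sin 2\theta$ has the same sign
as $\tan 2\theta$.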
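[Figure: right triangles with legs 1 and k and hypotenuse $\sqrt{1 + k^2}$.]
In the figure,
$$k = \tan 2\theta = \frac{B}{A - C}.$$
Therefore
$$\cos 2\theta = \frac{1}{\sqrt{1 + k^2}}.$$
The half-angle formulas are
$$\cos\frac{x}{2} = \pm\sqrt{\frac{1 + \cos x}{2}}, \qquad \sin\frac{x}{2} = \pm\sqrt{\frac{1 - \cos x}{2}}.$$
These give
$$\cos\theta = \sqrt{\frac{1 + \cos 2\theta}{2}}, \qquad \sin\theta = \pm\sqrt{\frac{1 - \cos 2\theta}{2}},$$
where
$$\cos 2\theta = \frac{1}{\sqrt{1 + k^2}}$$
and where the sign in the formula for $\sin\theta$ is the same as the sign of $k = \tan 2\theta$.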
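For example, consider
$$3x^2 + 2xy + y^2 = 1.$$
Here
$$A = 3, \qquad B = 2, \qquad C = 1,$$
and
$$k = \frac{B}{A - C} = \frac{2}{3 - 1} = 1.$$
Therefore
$$\cos 2\theta = \frac{1}{\sqrt{1 + k^2}} = \frac{1}{\sqrt{2}}.$$
Hence
$$\cos\theta = \sqrt{\frac{1 + 1/\sqrt{2}}{2}} = \sqrt{\frac{2 + \sqrt{2}}{4}},$$
and
$$\sin\theta = \sqrt{\frac{1 - 1/\sqrt{2}}{2}} = \sqrt{\frac{2 - \sqrt{2}}{4}}.$$
(In the second formula, $\sin\theta > 0$ because $k > 0$.) Therefore
$$\cos^2\theta = \frac{2 + \sqrt{2}}{4}, \qquad \sin^2\theta = \frac{2 - \sqrt{2}}{4}, \qquad \sin\theta\cos\theta = \frac{\sqrt{2}}{4}.$$
Thus
$$A' = A\cos^2\theta + B\sin\theta\cos\theta + C\sin^2\theta = 3\cdot\frac{2 + \sqrt{2}}{4} + 2\cdot\frac{\sqrt{2}}{4} + \frac{2 - \sqrt{2}}{4} = 2 + \sqrt{2},$$
and
$$C' = A\sin^2\theta - B\sin\theta\cos\theta + C\cos^2\theta = 3\cdot\frac{2 - \sqrt{2}}{4} - 2\cdot\frac{\sqrt{2}}{4} + \frac{2 + \sqrt{2}}{4} = 2 - \sqrt{2}.$$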
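The arithmetic in this example can be confirmed with a short computation. This is only a sketch: the variable names are illustrative, and the formulas for $A'$, $B'$, $C'$ are the ones obtained by expanding the rotation substitution.

```python
import math

# Coefficients of 3x^2 + 2xy + y^2 = 1.
A, B, C = 3.0, 2.0, 1.0

k = B / (A - C)                                   # k = tan(2*theta)
cos_2theta = 1.0 / math.sqrt(1.0 + k * k)

# Half-angle formulas; sin(theta) takes the sign of k.
cos_theta = math.sqrt((1.0 + cos_2theta) / 2.0)
sin_theta = math.copysign(math.sqrt((1.0 - cos_2theta) / 2.0), k)

A_new = A * cos_theta**2 + B * sin_theta * cos_theta + C * sin_theta**2
B_new = 2 * (C - A) * sin_theta * cos_theta + B * (cos_theta**2 - sin_theta**2)
C_new = A * sin_theta**2 - B * sin_theta * cos_theta + C * cos_theta**2

print("A' =", A_new, " B' =", B_new, " C' =", C_new)
```

The computed values agree with $A' = 2 + \sqrt{2}$ and $C' = 2 - \sqrt{2}$, and $B'$ vanishes up to rounding.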
In these problems, when you are asked to investigate an equation, you should find out
what sort of figure the graph is, and sketch. If the graph is a conic section, you should also
find the coefficients in the standard form.
1. Investigate
x - xy = 1.
(Here it is easier to translate first and rotate second. Sketch, showing all three sets
of axes.)
2. Investigate
x2 - xy = 1.
3. Investigate
xy - x - 2y = 0.
4. Investigate
2xy - y2 + 2 = 0.
5. Investigate
x2 + 2xy + y2 + 2x + 2y + 1 = 0.
6. Investigate
x2 + 4xy + 4y2 + 4x + 8y + 3 = 0.
8. a) Given the general equation of the second degree. Let $A_\theta$, $B_\theta$, $C_\theta$, ... be the new
coefficients, when the axes are rotated through an angle of measure $\theta$. Thus $A_\theta$, $B_\theta$,
$C_\theta$, ... are the $A'$, $B'$, $C'$, ... of the text; and so
x2 + 2xy + 5y2 - 10 = 0.
12. a) Let D be a line, let F be a point not on D, and let e be a positive number. Let G
be the set of all points P such that
$$\frac{FP}{DP} = e,$$
where DP is the perpendicular distance from D to P. G is called the conic section
with directrix D, focus F, and eccentricity e. Show that G is (a) an ellipse if e < 1,
(b) a parabola if e = 1, and (c) a hyperbola if e > 1.
b) Is a circle a conic section, in the sense defined in Problem 12(a)? Why or why not?
9 Paths and Vectors in a Plane
$P\colon I \to E.$
For the motion shown in the figure, I is the infinite interval [O, oo ), and the initial
point P(O) is the point (1, 1 ) .
In general:
$P\colon I \to E,$
where I is an interval and E is a plane.
382 Paths and Vectors in a Plane 9.1
Briefly, the locus of P is the image of I under the function P. The locus is deter
mined when the path is named, but given a locus, the path is not determined: the
same curve can be traced out by a moving point in infinitely many ways.
We describe a path in a coordinate plane by defining two functions which give
the coordinates of the moving point for each time t. For example, we might take
$$x = f(t) = 4t, \qquad y = g(t) = 8t^2.$$
Here
$$I = (-\infty, \infty), \quad\text{and}\quad P(t) = (4t, 8t^2).$$
At $t = 0$, $P(t) = (0, 0)$. As $t$ increases, starting from 0, both $x$ and $y$ increase, but $y$
increases faster. In fact, the locus of this path is a parabola. To see this, we observe
that from the first equation, $t = x/4$. Substituting in the second equation, we get
$$y = 8\left(\frac{x}{4}\right)^2 = \tfrac{1}{2}x^2.$$
Thus every point of the path lies on the graph of the equation
$$y = \tfrac{1}{2}x^2.$$
And it is easy to check, conversely, that every point $(x, y)$ of the parabola is on the
path.
y
$y = x^2.$
But the converse is not true. On the path, we always have $x \ge 0$, because $t^2 \ge 0$.
Therefore the locus of P is only half of the parabola.
9.1 Motion of a Particle in a Plane 383
$$x = \cos t, \qquad y = \sin t.$$
These functions describe uniform motion around a circle. Here we may regard the
parameter as the measure of an angle, and write
$$x = \cos\theta, \qquad y = \sin\theta.$$
Somewhat similar looking paths have ellipses as their loci. For example, consider
$$x = a\cos\theta, \qquad y = b\sin\theta.$$
Here
$$\frac{x}{a} = \cos\theta, \qquad \frac{y}{b} = \sin\theta,$$
and so
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = \cos^2\theta + \sin^2\theta = 1.$$
In the figure,
$$Q = (a\cos\theta, a\sin\theta), \qquad R = (b\cos\theta, b\sin\theta).$$
Therefore
$$P = P(\theta) = (a\cos\theta, b\sin\theta).$$
Following the scheme of the above figure, using drawing instruments, you can plot
as many points of the ellipse as you want to, without making any numerical calculations.
The same idea is used in the construction of a drawing instrument called the
ellipsograph, which can be adjusted so as to draw the ellipse with any pair of semiaxes
a, b.
Investigate the paths described by the following pairs of coordinate functions, sketch
the loci, and label a few points as $P(0)$, $P(\pi/4)$, and so on, so as to indicate the way in which
the moving point traverses the locus.
l. x = sec e, y = tan e 2. x = cos e, y = cos2 0 3. x = 2 cos e, y = sin 0
4. x = cos2 e, y = sin2 e 5. x = t3, y = lt31
(Check that not only $f(t) = t^3$ but also $g(t) = |t^3|$ have continuous derivatives. Thus a
moving point can go smoothly around a sharp corner, if only it does so slowly enough.)
11. In the left-hand figure above, $\theta$ ranges over the open interval $(0, \pi)$, $OR = b$, and $QP$
is a constant $a$. Find a parametric description of the path, and sketch the locus.
12. In the right-hand figure above, OR = b as before, and QP is a constant a. Find a
parametric description of the path, and sketch the loci, showing the three cases a < b,
a = b, a> b.
13. A circle of radius a rolls without slipping inside a circle of radius 2a. The initial position
is shown on the left below; a later position is shown on the right. Observe that RQ =
-2a
Then
$$x' = a\cos(\theta - \phi), \qquad y' = a\sin(\theta - \phi),$$
$$x = x' + h, \qquad y = y' + k.$$
Complete the discussion to get a parametric description of the path, and find out what
the locus is. It will turn out that the figure on the right above is slightly misleading.
**14. If you solved the preceding problem correctly, you found that some of the machinery
that you used was not necessary after all. But consider the case where the outer circle
has radius a and the inner circle has radius b = a/4. Find parametric equations for the
path, and eliminate the parameter to get the rectangular equation
x = f(t), y = g(t),
we may want to find the slope of the tangent line at the point corresponding to a
particular t.
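In the figure, we see the path; we want to find the slope of the tangent at $P$, if such a
tangent exists. Suppose that $P$ is the point corresponding to a certain $t$; and let $Q$
be the neighboring point corresponding to $t + \Delta t$. Let
$$\Delta x = f(t + \Delta t) - f(t), \qquad \Delta y = g(t + \Delta t) - g(t),$$
and let
$$m = \lim_{\Delta t \to 0} \frac{\Delta y}{\Delta x},$$
if such a limit exists. Suppose now that $f$ and $g$ are differentiable, and that $f'(t) \ne 0$.
Then we can write
$$m = \lim_{\Delta t \to 0} \frac{[g(t + \Delta t) - g(t)]/\Delta t}{[f(t + \Delta t) - f(t)]/\Delta t} = \frac{\lim_{\Delta t \to 0} \{[g(t + \Delta t) - g(t)]/\Delta t\}}{\lim_{\Delta t \to 0} \{[f(t + \Delta t) - f(t)]/\Delta t\}} = \frac{g'(t)}{f'(t)}.$$
Thus we get the formula
$$m = \frac{g'(t)}{f'(t)}.$$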
This will be called the parametric slope formula. We have shown the following. Given a path
$$x = f(t), \qquad y = g(t).$$
If $f$ and $g$ are differentiable at $t$, and $f'(t) \ne 0$, then the path has a tangent at the
corresponding point $P$, and the slope of the tangent is given by the formula
$$m = m(t) = \frac{g'(t)}{f'(t)}.$$
An important case is the one in which $f'(t) \ne 0$ for $a < t < b$. Here $x = f(t)$
can never take on the same value twice, and so the locus of the path is the graph of a
function $\phi$.
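The parametric slope formula can be compared with the slope of a short secant. The sketch below uses, as an assumed illustration, the path $x = 4t$, $y = 8t^2$ discussed earlier in this chapter; the function names are hypothetical.

```python
def f(t):
    return 4.0 * t            # x-coordinate function

def g(t):
    return 8.0 * t * t        # y-coordinate function

def f_prime(t):
    return 4.0

def g_prime(t):
    return 16.0 * t

def parametric_slope(t):
    """Slope of the tangent, m = g'(t) / f'(t); requires f'(t) != 0."""
    return g_prime(t) / f_prime(t)

def secant_slope(t, dt=1e-6):
    """Slope of the secant through the points for t and t + dt."""
    return (g(t + dt) - g(t)) / (f(t + dt) - f(t))

t0 = 0.5
print("tangent slope:", parametric_slope(t0))
print("secant slope: ", secant_slope(t0))
```

At $t = 0.5$ the point is $(2, 2)$ on the parabola $y = \tfrac{1}{2}x^2$, where the ordinary derivative is $x = 2$; the parametric formula gives the same slope.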
9.2 The Parametric Mean-Value Theorem; L'Hopital's Rule 387
If P and Q are the endpoints of the graph, as in the figure, then the slope of the
secant line through P and Q is
$$\frac{g(b) - g(a)}{f(b) - f(a)}.$$
By the mean-value theorem, there is a point $\bar{x}$ where the derivative $\phi'(\bar{x})$ is the slope of
the secant line. Thus
$$\frac{g(b) - g(a)}{f(b) - f(a)} = \phi'(\bar{x}).$$
This number $\bar{x}$ must have come from somewhere. That is, there must be a $\bar{t}$ between
$a$ and $b$ such that
$$\bar{x} = f(\bar{t}).$$
It follows that
$$\phi'(\bar{x}) = \frac{g'(\bar{t})}{f'(\bar{t})}.$$
Therefore
$$\frac{g(b) - g(a)}{f(b) - f(a)} = \frac{g'(\bar{t})}{f'(\bar{t})} \qquad (a < \bar{t} < b).$$
What we have just proved is a parametric form of the mean-value theorem. The idea
is that, if a function-graph is presented parametrically, then we can rewrite the
mean-value theorem parametrically, expressing both the slope of the secant and the
slope of the tangent in terms of the parameter.
Theorem 2 (The parametric mean-value theorem). Given two continuous functions
$f$ and $g$, for $a \le t \le b$. If both functions are differentiable for $a < t < b$, and $f'(t) \ne 0$
for $a < t < b$, then
$$\frac{g(b) - g(a)}{f(b) - f(a)} = \frac{g'(\bar{t})}{f'(\bar{t})}$$
for some $\bar{t}$ between $a$ and $b$.
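The theorem asserts only that a suitable $\bar{t}$ exists, but in a concrete case it can be located numerically. The following sketch takes $f(t) = t^2$ and $g(t) = t^3$ on $[1, 2]$, an example chosen here for illustration, and finds $\bar{t}$ by bisection; bisection is an assumption of this sketch, not a method used in the text.

```python
def f(t):
    return t * t

def g(t):
    return t ** 3

def f_prime(t):
    return 2.0 * t

def g_prime(t):
    return 3.0 * t * t

def bisect(h, lo, hi, tol=1e-12):
    """Find a zero of h between lo and hi, assuming h changes sign there."""
    assert h(lo) * h(hi) < 0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if h(lo) * h(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0

a, b = 1.0, 2.0
secant = (g(b) - g(a)) / (f(b) - f(a))            # slope of the secant line

# Solve g'(t)/f'(t) = secant slope for t between a and b.
t_bar = bisect(lambda t: g_prime(t) / f_prime(t) - secant, a, b)
print("secant slope:", secant, " t_bar:", t_bar)
```

Here $g'(t)/f'(t) = \tfrac{3}{2}t$ and the secant slope is $\tfrac{7}{3}$, so the exact value is $\bar{t} = \tfrac{14}{9}$, which lies between 1 and 2.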
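Suppose now that
$$f(a) = g(a) = 0.$$
Then the formula of the theorem takes the form
$$\frac{g(b)}{f(b)} = \frac{g'(\bar{t})}{f'(\bar{t})},$$
for some $\bar{t}$ between $a$ and $b$. This has the following consequence: if $g'(t)/f'(t)$
approaches a limit $L$, as $t \to a$, then $g(t)/f(t)$ approaches the same limit $L$. That is,
if
$$f(a) = g(a) = 0, \quad\text{and}\quad \lim_{t \to a} \frac{g'(t)}{f'(t)} = L,$$
then
$$\lim_{t \to a} \frac{g(t)}{f(t)} = L.$$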
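This is called l'Hopital's rule. Roughly, the reason why it holds true is as follows.
Since $\bar{t}$ is between $t$ and $a$, we know that $t \approx a \Rightarrow \bar{t} \approx a$. But
$$\bar{t} \approx a \;\Rightarrow\; \frac{g'(\bar{t})}{f'(\bar{t})} \approx L,$$
because $g'(t)/f'(t) \to L$. Therefore
$$t \approx a \;\Rightarrow\; \bar{t} \approx a \;\Rightarrow\; \frac{g(t)}{f(t)} = \frac{g'(\bar{t})}{f'(\bar{t})} \approx L.$$
Therefore
$$t \approx a \;\Rightarrow\; \frac{g(t)}{f(t)} \approx L,$$
and so
$$\lim_{t \to a} \frac{g(t)}{f(t)} = L.$$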
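It is very easy to express this idea in the form of an $\epsilon$-$\delta$ proof; all we do is to
formulate the hypothesis and the conclusion in terms of $\epsilon$ and $\delta$.
1) Hypothesis. For every $\epsilon > 0$ there is a $\delta > 0$ such that
$$0 < |t - a| < \delta \;\Rightarrow\; \left|\frac{g'(t)}{f'(t)} - L\right| < \epsilon.$$
2) Conclusion. For every $\epsilon > 0$ there is a $\delta > 0$ such that
$$0 < |t - a| < \delta \;\Rightarrow\; \left|\frac{g(t)}{f(t)} - L\right| < \epsilon.$$
We need to show that (1) $\Rightarrow$ (2). Given $\epsilon > 0$, as in (2), we take the $\delta > 0$
furnished by (1). For each $t$, let $\bar{t}$ be the $\bar{t}$ furnished by the parametric mean-value
theorem. Then
$$0 < |t - a| < \delta \;\Rightarrow\; 0 < |\bar{t} - a| < \delta \;\Rightarrow\; \left|\frac{g'(\bar{t})}{f'(\bar{t})} - L\right| < \epsilon \;\Rightarrow\; \left|\frac{g(t)}{f(t)} - L\right| < \epsilon.$$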
$$\lim_{t \to a} f(t) = \lim_{t \to a} g(t) = 0.$$
If these relations hold, and f and g are not defined at x = a, then we define f (a) and
g(a) to be O;f and g are then continuous, and the discussion of the limit of g/f goes
through exactly as before.
Using x instead of t, we get our theorem in the following form:
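Theorem (L'Hopital's rule). If
$$\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = 0 \quad\text{and}\quad \lim_{x \to a} \frac{g'(x)}{f'(x)} = L,$$
then
$$\lim_{x \to a} \frac{g(x)}{f(x)} = L.$$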
Let us now look at some applications. Consider
$$\lim_{x \to 0} \frac{\sin x}{x}.$$
Taking derivatives, we find
$$\lim_{x \to 0} \frac{\cos x}{1} = 1.$$
Therefore
$$\lim_{x \to 0} \frac{\sin x}{x} = 1.$$
This discussion does not supersede the geometric proof of the same statement,
given in Section 4.2. The reason is that to apply l'Hopital's rule, we had to know the
derivative of the sine, and to find the derivative of the sine we needed to know that
$\lim_{x \to 0} [(\sin x)/x] = 1$. Moreover, if you know the derivative of the sine, you can
remind yourself of what $\lim_{x \to 0} [(\sin x)/x]$ is, without using l'Hopital's rule. The point
is that
$$\lim_{x \to 0} \frac{\sin x}{x} = \lim_{x \to 0} \frac{\sin x - \sin 0}{x - 0} = \sin' 0,$$
by definition of $\sin' 0$. Since $\sin' = \cos$, and $\cos 0 = 1$, we get the answer immediately.
It is not an accident that in applying the first form of l'Hopital's rule, we sometimes
find that we are merely solving a differentiation problem. The reason is that the
formula used in the definition of the derivative is always an instance of the rule,
whenever the function is differentiable. By definition,
$$F'(x_0) = \lim_{x \to x_0} \frac{F(x) - F(x_0)}{x - x_0}.$$
The indicated limit on the right satisfies the conditions of Theorem 1, with
$$g(x) = F(x) - F(x_0) \to 0, \qquad f(x) = x - x_0 \to 0,$$
as $x \to x_0$. Thus every differentiation problem is a problem of the sort that l'Hopital's
rule deals with. The rule, of course, applies in many other cases; and it is the other
cases that make it significant. For example,
$$\lim_{x \to 0} \frac{\sin^2 x + x}{e^x - 1} = \lim_{x \to 0} \frac{2\sin x\cos x + 1}{e^x} = 1.$$
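A numerical table makes the last limit plausible, although it proves nothing. The sketch below evaluates both the original quotient and the quotient of derivatives at points approaching 0; the function names are illustrative, and `math.expm1` is used only to compute $e^x - 1$ accurately for small $x$.

```python
import math

def quotient(x):
    """(sin^2 x + x) / (e^x - 1), the original quotient g/f."""
    return (math.sin(x) ** 2 + x) / math.expm1(x)

def derivative_quotient(x):
    """(2 sin x cos x + 1) / e^x, the quotient of derivatives g'/f'."""
    return (2.0 * math.sin(x) * math.cos(x) + 1.0) / math.exp(x)

for x in (1e-1, 1e-2, 1e-3, 1e-4):
    print(x, quotient(x), derivative_quotient(x))
```

Both columns approach 1 as $x$ approaches 0, in agreement with the rule.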
$$\lim_{x \to 0} x\cot x.$$
Investigate the following indicated limits. (That is, calculate the ones which exist.)
sin2 e sin3 e
1. limx cotx 2. lim 3. lim
82 82
x-o o-o o-o
cos2x - 1 lnx x- I
4. lim 5. Jim -- 6. lim -.,--
x -o x2 x-1
X - 1 x-1
e - 1
e"' - 1 . y2 - 2y + 4 cos3x - 1
7. Jim 8. ltm 9. Jim
.,_0 In (x + 1) y-2 y- 3 .,-,,/2 x3 - 1
1 x4 - 1
10. Jim x2 sec2 x 11. limx2 sin 2 12. Jim . 2X - 1
00-+rr/2 x-o x x-rr
Sill
sine - cose x2 - 4x + 3
1 6. Jim 1 7. Jim x2 csc2 x 18. lim
e-rr/4 e - Tr/4 x-o x-+1 X
2 - 3x + 2
x2 - 1 3x
1 9. Jim .,-
20. Jim -- . -
X x-oe
---
x-l
In 1 -
(t)
21. Jim
t-e
In
t -
- 1
e
22. Jim
t-rr/2
[ f
sect
rr/2
Vl + sin3 t dt ]
23. Jim
x-1
[�· J,"'
X 1 1
e13 v( dt] 24. Jim
x-rr
[ l"'
csc x
rr/2
Vl + sin3 t dt ]
*25. Jim x In x 26. Jim x In sin x
CC-+O+ X-+0+
*27. Jim x In (sin x cos x) *28. Jim x In (cos2 x sin2 x)
a;_.o+ x-o+
29. A circle starts off tangent to the x-axis at the origin. The circle then rolls, without
slipping, along the x-axis. The point P which started at the origin then traces out a path;
this path is called a cycloid. (The same term cycloid is applied also to the locus of the
path.) The parameter in the coordinate functions is the $\theta$ indicated in the figure. Sketch
the locus, and calculate the coordinate functions of the path.
[Figure: the rolling circle, with the fixed axes x, y and the moving axes x', y'.]
As the figure suggests, the easiest method is to use a "moving coordinate system,"
as in Problem 13 of Problem Set 9.1; we need to calculate the coordinates (h, k) of the
"moving origin" O', and calculate x' and y' as $a\cos\phi$ and $a\sin\phi$. (What is $\phi$?)
30. a) When a circle rolls on the inside of another circle, we get a hypocycloid. In the figure,
the fixed circle has radius a and the rolling circle has radius b.
$$x = f(\theta) = (a - b)\cos\theta + b\cos\frac{b - a}{b}\theta,$$
$$y = g(\theta) = (a - b)\sin\theta + b\sin\frac{b - a}{b}\theta.$$
b) Get a rectangular equation for the locus, for the case b = a/4. Sketch.
31. When one circle rolls around the outside of another, the figure traced out is called an
epicycloid. Derive parametric equations for the epicycloid, using radius a for the fixed
circle and radius b for the moving circle. Use the same parameter $\theta$ as in Problem 30(a).
32. Suppose that a railroad wheel rolls (without slipping) along a flat track. Find coordinate
functions for the path traced out by a point at the outer edge of the flange on the wheel.
In the figure below, the outer radius is b and the inner radius is a. Sketch the locus,
bearing in mind that it is not a function-graph; it has loops in it.
33. Make the same modification in the definition of the epicycloid, as suggested by the
figure below, and sketch the curve. The fixed circle has radius a; the rolling wheel
has inner radius b and outer radius c.
*34. A path is regular if the coordinate functions $f$, $g$ are differentiable, and we never have
$f'(t) = g'(t) = 0$ for the same $t$. Show that every chord of a regular path is parallel to
9.3 Other Forms of L'Hopital's Rule 393
(Evidently the locus need not be a function-graph, and the chord may be vertical.)
*35. Given a path, with differentiable coordinate functions $f$, $g$. Show that, if the axes are
rotated, then the coordinate functions $F$, $G$ that work for the new set of axes are also
differentiable. Show that if $f'$ and $g'$ never vanish simultaneously, then $F'$ and $G'$
never vanish simultaneously.
Thus, in the most general form of l'Hopital's rule, we have: (1) $x \to a$, $x \to \infty$, or
$x \to -\infty$; (2) $g'(x)/f'(x) \to L$, $g'(x)/f'(x) \to \infty$, or $g'(x)/f'(x) \to -\infty$; and (3)
$f(x), g(x) \to 0$, or $\to \infty$, or $\to -\infty$. Thus we have a grand total of 27 theorems, all of
which are true. One of these has already been proved, and the only hard one among
the others is the following.
then
$$\lim_{x \to \infty} \frac{g(x)}{f(x)} = L.$$
Example 1. To find
$$\lim_{x \to \infty} \frac{\ln x}{x},$$
we take derivatives and find
$$\lim_{x \to \infty} \frac{1/x}{1} = 0.$$
By the Northeast theorem,
$$\lim_{x \to \infty} \frac{\ln x}{x} = 0.$$
Example 2. To find
$$\lim_{x \to \infty} \frac{e^x}{x},$$
we investigate
$$\lim_{x \to \infty} \frac{e^x}{1} = \infty.$$
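Theorem 2. If
$$\lim_{x \to \infty} f(x) = \lim_{x \to \infty} g(x) = \infty, \quad\text{and}\quad \lim_{x \to \infty} \frac{g'(x)}{f'(x)} = \infty,$$
then
$$\lim_{x \to \infty} \frac{g(x)}{f(x)} = \infty.$$
The proof is easy, on the basis of the Northeast theorem; we merely investigate
reciprocals. Since $g'(x)/f'(x) \to \infty$, we have
$$\lim_{x \to \infty} \frac{f'(x)}{g'(x)} = 0.$$
By the Northeast theorem,
$$\lim_{x \to \infty} \frac{f(x)}{g(x)} = 0.$$
Therefore
$$\lim_{x \to \infty} \frac{g(x)}{f(x)} = \infty.$$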
Example 3. Consider
$$\lim_{x \to 0^+} x\ln x = -\lim_{x \to 0^+} \frac{-\ln x}{1/x}.$$
Taking derivatives, we find
$$\lim_{x \to 0^+} \frac{-1/x}{-1/x^2} = \lim_{x \to 0^+} x = 0.$$
By one form of l'Hopital's rule, it follows that
$$\lim_{x \to 0^+} \frac{-\ln x}{1/x} = 0,$$
and so the answer to the original problem is $-0 = 0$. The theorem being used here
is the following.
Theorem 3. If
$$\lim_{x \to a^+} f(x) = \lim_{x \to a^+} g(x) = \infty, \quad\text{and}\quad \lim_{x \to a^+} \frac{g'(x)}{f'(x)} = L,$$
then
$$\lim_{x \to a^+} \frac{g(x)}{f(x)} = L.$$
$$\lim_{x \to a^+} \frac{g(x)}{f(x)} = \lim_{y \to \infty} \frac{g(a + 1/y)}{f(a + 1/y)}.$$
Taking derivatives, we find
$$\lim_{y \to \infty} \frac{g'(a + 1/y)(-1/y^2)}{f'(a + 1/y)(-1/y^2)} = \lim_{y \to \infty} \frac{g'(a + 1/y)}{f'(a + 1/y)} = \lim_{x \to a^+} \frac{g'(x)}{f'(x)} = L.$$
The Northeast theorem now applies to
$$\lim_{y \to \infty} \frac{g(a + 1/y)}{f(a + 1/y)},$$
and tells us that this limit is $L$. Therefore
$$\lim_{x \to a^+} \frac{g(x)}{f(x)} = L.$$
We have now discussed all the troublesome cases of l'Hopital's rule; once we
have gotten this far, the rest of the derivations are routine. Hereafter, we shall use
all forms of the rule without comment.
Sometimes we can apply the rule by taking logarithms. Consider
$$\lim_{x \to 0^+} x^x = \;?$$
Let $\phi(x) = x^x$. Then
$$\ln\phi(x) = x\ln x = \frac{\ln x}{1/x} = \frac{g(x)}{f(x)}.$$
Now
$$\lim_{x \to 0^+} \frac{g'(x)}{f'(x)} = \lim_{x \to 0^+} \frac{1/x}{-1/x^2} = \lim_{x \to 0^+} (-x) = 0.$$
Therefore
$$\lim_{x \to 0^+} \ln\phi(x) = 0, \quad\text{and}\quad \lim_{x \to 0^+} x^x = e^0 = 1.$$
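The logarithmic device can be watched at work numerically. The following sketch, with illustrative function names, tabulates $x\ln x$ and $x^x$ for small positive $x$; it is a plausibility check, not a proof.

```python
import math

def log_of_power(x):
    """ln(x^x) = x ln x, for x > 0."""
    return x * math.log(x)

def x_to_the_x(x):
    """x^x computed as exp(x ln x), for x > 0."""
    return math.exp(log_of_power(x))

for x in (1e-1, 1e-2, 1e-4, 1e-6):
    print(x, log_of_power(x), x_to_the_x(x))
```

The middle column approaches 0 and the last column approaches $e^0 = 1$, although the approach is slow.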
7. Jim [(sin x)(In x)/x] 8. Jim (1 + Tan-1 o:)Tau-ia 9. Jim ( l + csc x)•in' x
a--+ co x-o+
( � ))Tan-'
x
13. Jim (l - 2x)1fx 14. Jim 1 + Tan 1 (x 15. Jim x2e-11x
x-'"12 a:- o+
x)sin
=
x-o
'
32. Jim ww 33. Jim (I + tan 3 fJ)-CSC 9
Tf'-o+ o-o
*34. The $n$th derivative of a function $f$ is denoted by $f^{(n)}$. Let
$$f(x) = \begin{cases} e^{-1/x^2} & \text{for } x \ne 0, \\ 0 & \text{for } x = 0. \end{cases}$$
Show that for each $n > 0$, $f$ has an $n$th derivative, for every $x$; and show that $f^{(n)}(0) = 0$
for every $n$. [Hint: You are not likely to find a manageable general formula for $f^{(n)}(x)$.
But you ought to be able to show that for $x \ne 0$, $f^{(n)}(x)$ is always given by a formula of
a certain form, involving certain constant coefficients; and you may be able to use this
form, to show that $f^{(n)}(0) = 0$, without needing to determine the coefficients.]
9.4 Polar Coordinates 397
And the correspondence also works in reverse: when $P$ is named, $x$ and $y$ are determined.
Thus a rectangular coordinate system gives us a one-to-one correspondence
$\{(x, y)\} \leftrightarrow E$ between the ordered pairs of real numbers and the points of $E$.
We now consider another way of labeling points with pairs of numbers.
Given two numbers $r$ and $\theta$, we first draw the ray which starts at the origin and
has direction $\theta$. On the line containing this ray we set up a coordinate system, with
the direction $\theta$ as the positive direction; and we let $P$ be the point with coordinate $r$.
(This is equivalent to saying that the directed distance $OP$ is $= r$.) We then say that
$P$ has polar coordinates $(r, \theta)$.
For example, in the left-hand figure it looks as if $P_1$ has polar coordinates $(2, \pi/3)$,
and $P_2$ has polar coordinates $(-2, \pi/3)$.
[Figure: on the left, the points $P_1$ and $P_2$; on the right, a point $P_2(r, \theta)$ with $r < 0$.]
Thus to every pair $(r, \theta)$ of numbers there corresponds a point $P$. But the correspondence
does not work uniquely the other way: every point $P$ corresponds to
infinitely many number pairs $(r, \theta)$. Thus, in the right-hand figure, the point $P$ with
rectangular coordinates $(1, 1)$ has polar coordinates $(\sqrt{2}, \pi/4)$. But $P$ also has polar
coordinates $(-\sqrt{2}, 5\pi/4)$. And this is not all; the possible polar coordinates for $P$ are
$(\sqrt{2}, \pi/4 + 2n\pi)$ and $(-\sqrt{2}, 5\pi/4 + 2n\pi)$, for every integer $n$.
$\{(r, \theta)\} \to E,$
Since the cosine is periodic, with period $2\pi$, we can get all of the locus by restricting $\theta$
to the interval $[0, 2\pi]$.
As an aid in sketching the polar graph, we first sketch the rectangular graph of the
equation $r = \cos\theta$. We cut the curve into four parts, as indicated, and then sketch the
portion of the polar graph corresponding to each of them. As $\theta$ increases from 0 to
$\pi/2$, $r$ decreases from 1 to 0. As $\theta$ increases from $\pi/2$ to $\pi$, $r$ decreases from 0 to $-1$.
Therefore the second part of the curve, in the fourth quadrant, comes from values of $\theta$
in the second quadrant. (See the figure on the left.)
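As $\theta$ continues to increase, from $\pi$ to $2\pi$, we trace out a curve shown on the right.
This looks like the curve that we had already. And in fact it is exactly the same curve
as before, because $\cos(\theta + \pi) = -\cos\theta$, so that $(\cos\theta, \theta)$ and $(\cos(\theta + \pi), \theta + \pi)$
are polar coordinates of the same point.
If $P$ has polar coordinates $(r, \theta)$, then the rectangular coordinates of $P$ are
$$x = r\cos\theta, \qquad y = r\sin\theta.$$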
(There are two cases to check. If $r > 0$, then these formulas follow from the definitions
of the sine and cosine. Verification for $r = 0$ and $r < 0$?) Therefore
$$x^2 + y^2 = r^2\cos^2\theta + r^2\sin^2\theta = r^2.$$
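The conversion formulas are easy to try out. The sketch below, with an illustrative function name, converts two different polar coordinate pairs and shows that they label the same point $(1, 1)$, including a case with $r < 0$.

```python
import math

def polar_to_rect(r, theta):
    """Rectangular coordinates (x, y) of the point with polar coordinates (r, theta)."""
    return (r * math.cos(theta), r * math.sin(theta))

p1 = polar_to_rect(math.sqrt(2), math.pi / 4)        # r > 0
p2 = polar_to_rect(-math.sqrt(2), 5 * math.pi / 4)   # r < 0, opposite direction

print(p1)
print(p2)
```

Both pairs give the point $(1, 1)$ up to rounding, and in each case $x^2 + y^2 = r^2 = 2$.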
The equation $r = \cos\theta$
does not involve any of the three expressions $r\cos\theta$, $r\sin\theta$, $r^2$ which we know how
to convert into rectangular coordinates. But we can multiply by $r$, on both sides,
getting
$$r^2 = r\cos\theta;$$
and this means that
$$x^2 + y^2 = x.$$
This is the circle with center at the point $(\tfrac{1}{2}, 0)$ and radius $\tfrac{1}{2}$.
2) Consider
$$r = \sec\theta.$$
It is easier, however, to multiply both sides of the equation by $\cos\theta$. This gives
$$r\cos\theta = 1, \quad\text{or}\quad x = 1.$$
As $\theta$ increases from 0 to $2\pi$ (skipping $\pi/2$ and $3\pi/2$), this line is traversed twice. (It
is worthwhile to figure out how.)
(As for $r = \cos\theta$, the interval $[0, 2\pi]$ gives us the entire locus of the path.) First
we do a rectangular sketch as on the left. We then sketch the polar graph in four
parts:
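$$r\cos(\theta - \phi) - p = 0.$$
Theorem 1. Let $d$ be the distance between the points with polar coordinates $(r_1, \theta_1)$
and $(r_2, \theta_2)$. Then
$$d^2 = r_1^2 + r_2^2 - 2r_1r_2\cos(\theta_1 - \theta_2).$$
Proof. The rectangular coordinates of the two points are given by
$$x_i = r_i\cos\theta_i, \qquad y_i = r_i\sin\theta_i$$
for $i = 1, 2$. Therefore
$$d^2 = (x_1 - x_2)^2 + (y_1 - y_2)^2$$
$$= (r_1\cos\theta_1 - r_2\cos\theta_2)^2 + (r_1\sin\theta_1 - r_2\sin\theta_2)^2$$
$$= r_1^2(\cos^2\theta_1 + \sin^2\theta_1) + r_2^2(\cos^2\theta_2 + \sin^2\theta_2) - 2r_1r_2(\cos\theta_1\cos\theta_2 + \sin\theta_1\sin\theta_2)$$
$$= r_1^2 + r_2^2 - 2r_1r_2\cos(\theta_1 - \theta_2).$$
For $r_1 > 0$, $r_2 > 0$, $0 < \theta_1 - \theta_2 < \pi$, this is simply the law of cosines.
But the polar distance formula applies for any values of $r_1$, $r_2$, $\theta_1$, and $\theta_2$.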
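The claim that the formula holds for all values, including negative $r$, can be spot-checked against the rectangular distance formula. The following is a small sketch with hypothetical function names; the clamp at zero only guards against rounding.

```python
import math

def polar_distance(r1, theta1, r2, theta2):
    """Distance between (r1, theta1) and (r2, theta2) by the polar distance formula."""
    d_squared = r1 * r1 + r2 * r2 - 2.0 * r1 * r2 * math.cos(theta1 - theta2)
    return math.sqrt(max(d_squared, 0.0))

def rect_distance(r1, theta1, r2, theta2):
    """The same distance, computed after converting to rectangular coordinates."""
    x1, y1 = r1 * math.cos(theta1), r1 * math.sin(theta1)
    x2, y2 = r2 * math.cos(theta2), r2 * math.sin(theta2)
    return math.hypot(x1 - x2, y1 - y2)

# The points (2, pi/3) and (-2, pi/3) lie on one line through the origin, 4 units apart.
print(polar_distance(2.0, math.pi / 3, -2.0, math.pi / 3))
print(rect_distance(2.0, math.pi / 3, -2.0, math.pi / 3))
```

The two computations agree, even though one of the radii is negative and the law of cosines, read literally, would not apply.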
r
22. r = 2 +cos 8 23. = 2. 24. r =eBI�
1 +r cos 0
25. The figure given in the text suggests that at the origin, the two sides of the cardioid
have the same tangent, namely, the line $\theta = \pi/2$. Show that this is correct.
Discuss, as in Problems 1 through 24.
Find polar equations for the curves defined by the following conditions, and sketch.
Identify the curve if possible.
31. The set of all points which are equidistant from the origin and the line $r = \csc\theta$.
32. The set of all points which are equidistant from the origin and the point $(2\sqrt{2}, \pi/4)$.
33. The set of all points $P$ such that $PA = 2PB$, where $A$ is the origin and $B = (2\sqrt{2}, \pi/4)$.
34. The circle with center at $(2, \pi/4)$ and radius 2.
Sketch:
Given
$$r = f(\theta) \ge 0,$$
where $f$ is continuous, and the length of the interval $[\alpha, \beta]$ is $\le 2\pi$. Let $R$ be the
region between the polar graph and the origin.
9.5 Areas in Polar Coordinates 403
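That is,
$$R = \{(r, \theta) \mid \alpha \le \theta \le \beta \text{ and } 0 \le r \le f(\theta)\}.$$
Consider a subinterval $[\theta_{i-1}, \theta_i]$ of the interval $[\alpha, \beta]$. Let $m_i$ be the minimum
value of $f$ on $[\theta_{i-1}, \theta_i]$, let $M_i$ be the maximum value, and let $\Delta A_i$ be the area of the
region between the origin and the part of the curve from $\theta = \theta_{i-1}$ to $\theta = \theta_i$.
This region contains a circular sector of radius $m_i$, and lies in a circular sector of radius $M_i$,
each with angle $\Delta\theta_i = \theta_i - \theta_{i-1}$. Therefore
$$\tfrac{1}{2}m_i^2\,\Delta\theta_i \le \Delta A_i \le \tfrac{1}{2}M_i^2\,\Delta\theta_i.$$
Therefore
$$\sum_{i=1}^{n} \tfrac{1}{2}m_i^2\,\Delta\theta_i \le A \le \sum_{i=1}^{n} \tfrac{1}{2}M_i^2\,\Delta\theta_i, \quad\text{where}\quad A = \sum_{i=1}^{n} \Delta A_i.$$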
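But the sum on the left is the lower sum $s(N)$ of the function
$$F(\theta) = \tfrac{1}{2}f(\theta)^2,$$
over the net $N$, and the sum on the right is the upper sum $S(N)$, of the same function
$F$, over the net $N$. Thus
$$s(N) \le A \le S(N);$$
and so
$$\lim_{|N| \to 0} s(N) \le A \le \lim_{|N| \to 0} S(N).$$
Since
$$\lim_{|N| \to 0} s(N) = \lim_{|N| \to 0} S(N) = \int_\alpha^\beta \tfrac{1}{2}f(\theta)^2\,d\theta,$$
it follows that $A$ is squeezed, and
$$A = \int_\alpha^\beta \tfrac{1}{2}f(\theta)^2\,d\theta.$$
Thus we have:
Theorem 1. Let $f$ be continuous and $\ge 0$ on $[\alpha, \beta]$, with $\beta - \alpha \le 2\pi$, and let $R$ be
the region between the origin and the polar graph of $f$. Then the area of $R$ is
$$A = \int_\alpha^\beta \tfrac{1}{2}f(\theta)^2\,d\theta.$$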
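Let us try this in some simple cases. For the circular region with radius $a$ and
center at the origin, the formula gives
$$A = \int_0^{2\pi} \tfrac{1}{2}a^2\,d\theta = \pi a^2.$$
For the region between the origin and the graph of $r = 1/(\cos\theta + \sin\theta)$, $0 \le \theta \le \pi/2$,
the formula gives
$$A = \int_0^{\pi/2} \frac{1}{2}\cdot\frac{1}{(\cos\theta + \sin\theta)^2}\,d\theta
= \frac{1}{2}\int_0^{\pi/2} \frac{1}{1 + \sin 2\theta}\,d\theta
= \frac{1}{2}\int_0^{\pi/2} \frac{1 - \sin 2\theta}{\cos^2 2\theta}\,d\theta$$
$$= \tfrac{1}{2}\left[\tfrac{1}{2}\tan 2\theta - \tfrac{1}{2}\sec 2\theta\right]_0^{\pi/2}
= \tfrac{1}{2}\left(0 + \tfrac{1}{2}\right) - \tfrac{1}{2}\left(0 - \tfrac{1}{2}\right) = \tfrac{1}{2}.$$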
9.6 The Length of a Path 405
This is correct, because the region is a right triangle with legs of length 1.
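The area formula $A = \int \tfrac{1}{2}f(\theta)^2\,d\theta$ can also be checked by numerical integration. The sketch below uses a midpoint sum, which is a choice made here for illustration; the function name `polar_area` and the number of subintervals are assumptions.

```python
import math

def polar_area(f, alpha, beta, n=20000):
    """Midpoint-sum approximation to the integral of (1/2) f(theta)^2 over [alpha, beta]."""
    h = (beta - alpha) / n
    total = 0.0
    for i in range(n):
        theta = alpha + (i + 0.5) * h      # midpoint of the i-th subinterval
        total += 0.5 * f(theta) ** 2 * h
    return total

# Circular region of radius 3: the exact area is 9*pi.
circle_area = polar_area(lambda theta: 3.0, 0.0, 2.0 * math.pi)

# Region under r = 1/(cos(theta) + sin(theta)), 0 <= theta <= pi/2: the exact area is 1/2.
triangle_area = polar_area(lambda theta: 1.0 / (math.cos(theta) + math.sin(theta)),
                           0.0, math.pi / 2.0)

print(circle_area, 9.0 * math.pi)
print(triangle_area)
```

The second curve is the segment of the line $x + y = 1$ in the first quadrant, so the region is the right triangle with legs of length 1.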
Find the areas of the regions enclosed by the following curves, and sketch.
10. Find the area of the inside loop of the graph of $r = 1 - 2\sin\theta$, and sketch.
11. $r = \dfrac{1}{|\cos\theta| + |\sin\theta|}$   12. $r = e^{\theta}$, $0 \le \theta \le 2\pi$   13. $r = e^{2\theta}$, $0 \le \theta \le 2\pi$
14. Given a polar graph defined by a differentiable function $r = f(\theta)$ ($\alpha \le \theta \le \beta$), derive
a formula for the slope of the tangent, at a point $(r_0, \theta_0)$ ($r_0 = f(\theta_0)$). Here we really
mean the slope, relative to a rectangular coordinate system superimposed on the polar
coordinate system.
Roughly speaking, the length of a path is the total distance traversed by the moving
point. For example, consider the path defined by the coordinate functions
$$x = a\cos t, \qquad y = a\sin t \qquad (0 \le t \le 4\pi).$$
The locus of this path is a circle with radius $a$ and circumference $2\pi a$. But as $t$ increases
from 0 to $4\pi$, this locus is traversed twice. Therefore the length of the path is
$2 \cdot 2\pi a = 4\pi a$.
Lengths of paths are undirected; they are always positive (or zero, in trivial cases).
Thus the length of the path
is four, not zero; the two halves of the path do not cancel each other out.
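The idea that length counts every traversal can be illustrated with inscribed broken lines. The sketch below assumes the circular path $x = a\cos t$, $y = a\sin t$ for $0 \le t \le 4\pi$, and adds up the lengths of the chords joining successive points; the function name and the number of subdivisions are illustrative.

```python
import math

def polygonal_length(f, g, t_start, t_end, n):
    """Sum of the chord lengths P_{i-1}P_i for n equal subdivisions of [t_start, t_end]."""
    total = 0.0
    prev = (f(t_start), g(t_start))
    for i in range(1, n + 1):
        t = t_start + (t_end - t_start) * i / n
        cur = (f(t), g(t))
        total += math.hypot(cur[0] - prev[0], cur[1] - prev[1])
        prev = cur
    return total

radius = 2.0
length = polygonal_length(lambda t: radius * math.cos(t),
                          lambda t: radius * math.sin(t),
                          0.0, 4.0 * math.pi, 10000)

print(length, 4.0 * math.pi * radius)
```

The sum is close to $4\pi a$, twice the circumference, because the circle is traversed twice.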
Yi = g(ti),
Then
$$s = \lim_{|N| \to 0} \sum_{i=1}^{n} P_{i-1}P_i.$$
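Then
$$P_{i-1}P_i = \sqrt{(x_i - x_{i-1})^2 + (y_i - y_{i-1})^2}.$$
Since $f$ and $g$ are differentiable, the mean-value theorem gives
$$x_i - x_{i-1} = f'(\bar{t}_i)\,\Delta t_i$$
for some $\bar{t}_i$ between $t_{i-1}$ and $t_i$, and
$$y_i - y_{i-1} = g'(\bar{t}_i')\,\Delta t_i$$
for some $\bar{t}_i'$ between $t_{i-1}$ and $t_i$. (We do not know that $\bar{t}_i = \bar{t}_i'$, and this leads to trouble,
as we shall see.) Therefore
$$P_{i-1}P_i = \sqrt{f'(\bar{t}_i)^2 + g'(\bar{t}_i')^2}\;\Delta t_i,$$
and so
$$\sum_{i=1}^{n} P_{i-1}P_i = \sum_{i=1}^{n} \sqrt{f'(\bar{t}_i)^2 + g'(\bar{t}_i')^2}\;\Delta t_i.$$
It is plausible that
$$\sum_{i=1}^{n} \sqrt{f'(\bar{t}_i)^2 + g'(\bar{t}_i')^2}\;\Delta t_i \approx \sum_{i=1}^{n} \sqrt{f'(\bar{t}_i)^2 + g'(\bar{t}_i)^2}\;\Delta t_i$$
when $|N| \approx 0$; and the sum on the right is a sample sum of the function $\sqrt{f'^2 + g'^2}$.
For a proof of this, see Appendix I. Meanwhile we shall state the
following theorem and use it.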
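This formula can be converted to polar coordinates in the following way. Suppose
that a polar path is described by a function
$$r = \phi(\theta) \qquad (a \le \theta \le b).$$
The rectangular coordinate functions of the path are then
$$x = \phi(\theta)\cos\theta, \qquad y = \phi(\theta)\sin\theta \qquad (a \le \theta \le b).$$
For short, let us write $\phi'$ for $\phi'(\theta)$, $c$ for $\cos\theta$, and $s$ for $\sin\theta$. This gives
$$\frac{dx}{d\theta} = \phi' c - \phi s, \qquad \frac{dy}{d\theta} = \phi' s + \phi c,$$
and
$$\left(\frac{dx}{d\theta}\right)^2 + \left(\frac{dy}{d\theta}\right)^2 = \phi'^2 + \phi^2.$$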
Theorem (?). For each $i$, there is a single point $\bar{t}_i$, between $t_{i-1}$ and $t_i$, such that
$$P_{i-1}P_i = \sqrt{f'(\bar{t}_i)^2 + g'(\bar{t}_i)^2}\;\Delta t_i.$$
We could then have expressed $\sum P_{i-1}P_i$ as a sample sum, and passed to the limit, as
in Section 7.1. But the above theorem is false. Give an example of a path (with $f'$ and $g'$
continuous) for which the theorem fails. There is a very simple example of this kind.
9.7 Vectors in a Plane 409
In Section 3.8, we found that the motion of a particle on a line could be described by
a single function $f$, with real numbers as values, and that the velocity and acceleration
functions were the first and second derivatives
$$v = f' \quad\text{and}\quad a = v' = f''.$$
As we remarked at the time, these ideas are not adequate to describe the motion of a
particle in a plane (or in space). The motion of a particle in a plane $E$ is described
by a path, which is a function
$$P\colon I \to E, \qquad t \mapsto P(t),$$
where $I$ is an interval, and $P(t)$ is the location of the moving particle at time $t$. Velocity
in this case is a "vector quantity," with both a magnitude and a direction, conveniently
pictured by an arrow. At each point $P(t)$, the direction of the velocity vector is
the direction of the motion, so that the arrow always lies on the tangent line, pointing
in the appropriate direction on the tangent line; and the length of the velocity vector
is the speed.
We allow the "degenerate segment" $\overrightarrow{OO}$; this is called the zero vector, and may be
denoted simply by $\vec{0}$. Moreover, since all our directed segments, in this section, are
going to start at the origin, we can denote the directed segment $\overrightarrow{OP}$ by the shorter
symbol $\vec{P}$. Three operations can be performed, in this system:
Addition. Given $\vec{P}_1$, $\vec{P}_2$, with $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$, the sum is defined to be
$$\vec{P}_1 + \vec{P}_2 = \vec{Q},$$
where
$$Q = (x_1 + x_2,\; y_1 + y_2).$$
Vector addition is governed by the same formal laws that govern addition of real
numbers, as follows.
A.1 Associativity. $(\vec{P}_1 + \vec{P}_2) + \vec{P}_3 = \vec{P}_1 + (\vec{P}_2 + \vec{P}_3)$.
A.2 Existence of $\vec{0}$. There is a vector $\vec{0}$ such that $\vec{0} + \vec{P} = \vec{P} + \vec{0} = \vec{P}$ for
every vector $\vec{P}$.
A.3 Existence of negatives. For each vector $\vec{P}$ there is a vector $-\vec{P}$ such that
$\vec{P} + (-\vec{P}) = (-\vec{P}) + \vec{P} = \vec{0}$.
A.4 Commutativity. $\vec{P}_1 + \vec{P}_2 = \vec{P}_2 + \vec{P}_1$.
These follow from the corresponding laws for real numbers. For example, if
$$(\vec{P}_1 + \vec{P}_2) + \vec{P}_3 = \vec{Q}, \quad\text{and}\quad \vec{P}_1 + (\vec{P}_2 + \vec{P}_3) = \vec{Q}',$$
then
$$Q = ((x_1 + x_2) + x_3,\; (y_1 + y_2) + y_3) = (x_1 + (x_2 + x_3),\; y_1 + (y_2 + y_3)) = Q',$$
and so $\vec{Q} = \vec{Q}'$. The existence of $\vec{0}$ is obvious: $\vec{0}$ is $\overrightarrow{OO}$. If $P = (x, y)$, then $-\vec{P} = \vec{Q}$,
where $Q = (-x, -y)$. Similarly for A.4.
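M.1. $(\alpha\beta)\vec{P} = \alpha(\beta\vec{P})$.
Because $(\alpha\beta)x = \alpha(\beta x)$, and $(\alpha\beta)y = \alpha(\beta y)$. Multiplication is connected with
vector addition by two distributive laws.
M.2. $(\alpha + \beta)\vec{P} = \alpha\vec{P} + \beta\vec{P}$.
M.3. $\alpha(\vec{P}_1 + \vec{P}_2) = \alpha\vec{P}_1 + \alpha\vec{P}_2$.
Zero and 1 work in the usual way:
M.4. $0 \cdot \vec{P} = \vec{0}$, for every $\vec{P}$.
M.5. $1 \cdot \vec{P} = \vec{P}$, for every $\vec{P}$.
M.6. $\alpha \cdot \vec{0} = \vec{0}$, for every $\alpha$.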
Let $\mathcal{V}$ be the set of all vectors $\vec{P}$. In $\mathcal{V}$ we have defined two operations (addition
and scalar multiplication), and shown that they satisfy the laws A.1 through A.4 and
M.1 through M.6; $\mathcal{V}$ is called a vector space (relative to these two operations). More
generally, any collection $\mathcal{V}$ of objects is called a vector space if it is provided with two
operations satisfying the above formal laws. There are many important vector spaces
other than the one which we are now discussing. For example, we may consider the
directed segments $\vec{P} = \overrightarrow{OP}$, starting from the origin in three-dimensional space, with
the two operations defined in an analogous way.
Finally, we introduce another kind of multiplication for vectors, called the
dot product or inner product. If $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$, as before, then the
inner product is a scalar, namely,
$$\vec{P}_1 \cdot \vec{P}_2 = x_1x_2 + y_1y_2.$$
The following properties of this operation are easy to check:
S.1. $\vec{P}_1 \cdot \vec{P}_2 = \vec{P}_2 \cdot \vec{P}_1$.
S.2. $(\alpha\vec{P}_1) \cdot \vec{P}_2 = \alpha(\vec{P}_1 \cdot \vec{P}_2)$.
S.5. If $\vec{P} \cdot \vec{P} = 0$, then $\vec{P} = \vec{0}$.
(The last condition rules out trivial "dot products" for which $\vec{P}_1 \cdot \vec{P}_2$ is always 0,
for every $\vec{P}_1$, $\vec{P}_2$.)
Thus, $\mathcal{V}$ is called an inner product space (relative to the three operations which
have now been defined). More generally, any collection $\mathcal{V}$ is called an inner product
space if it is provided with three operations (addition, scalar multiplication, and
inner product) satisfying all the above laws.
As a matter of convenience, we have defined our three operations algebraically,
using the coordinates (x, y) of the terminal points P of the vectors. But it is important
to understand that all three of them have geometric meanings. We can add two
vectors, geometrically, by completing a parallelogram, as shown on the left.
To do this, we don't need to know the directions of the x- and y-axes. Therefore the
axes can be, and have been, omitted from the figure. If $\vec{P}_1$ and $\vec{P}_2$ are collinear, then
the parallelogram collapses, but the idea is the same.
Geometrically, $-\vec{P}$ is the vector $\vec{Q}$ which has the same length as $\vec{P}$, but has the
opposite direction.
To multiply a vector $\vec{P}$ by a positive scalar $\alpha$, we draw a vector with the same
direction as $\vec{P}$, but multiply the length by $\alpha$. If $\alpha < 0$, we go in the opposite direction,
and multiply the length by $|\alpha|$.
[Figure: $\vec{Q} = -\vec{P}$ and $\vec{Q} = \alpha\vec{P}$.]
$$\vec{P}_1 \cdot \vec{P}_2 = x_1 x_2 + y_1 y_2.$$
9.7 Vectors in a Plane 413
so that
$$\vec{P}_1 \cdot \vec{P}_2 = OP_1 \cdot OP_2 \cos\theta.$$
Obviously $\cos\theta$ is independent of the directions of the axes, because $\theta$ measures the
angle between the two vectors. Note that the length of the vector $\vec{P}$ can be expressed
in terms of the dot product:
$$\vec{P} \cdot \vec{P} = x^2 + y^2 = OP^2.$$
The length of a vector $\vec{P}$ may also be denoted by $|\vec{P}|$. Thus
$$|\vec{P}| = \sqrt{\vec{P} \cdot \vec{P}}.$$
By a linear combination of two vectors $\vec{P}_1$, $\vec{P}_2$ we mean a vector $\vec{Q}$ which can be
expressed in the form
$$\vec{Q} = \alpha\vec{P}_1 + \beta\vec{P}_2,$$
where $\alpha$ and $\beta$ are scalars. In a coordinate plane, it is easy to find two vectors i and j
such that every vector is a linear combination of them. If the vectors i and j are as in
the left-hand figure below, and P = (x, y), then
$$\vec{P} = x\mathbf{i} + y\mathbf{j} \qquad (\mathbf{i} = (1, 0),\ \mathbf{j} = (0, 1)).$$
This is an equation between vectors, not numbers. On the right, we have multiplied
the vectors i and j by the scalars x and y, and added the resulting vectors.
414 Paths and Vectors in a Plane 9.7
[Figure: the vectors i and j on the coordinate axes (left); the vector P as the sum of xi and yj (right).]
This section contains no new information, but quite a lot of new language.
Learning a language takes practice. Therefore, while some of the problems below
are genuine problems, many of them are merely exercises in the process of translation
from the language of coordinate systems to the language of vectors and back again.
28. Let P1 = (2, 1), P2 = (1, 2). Sketch the set of all points P such that
30. p . i �0 31. p . j � 0
32. p . (i + j) � 0 33. p . (i - j) � 0
34. p = et:i + {i'j ( et: � 0, fi' � 0) 35. p = et:i + {i'(i + j) (o: � 0, /! � 0)
36. p . (i + 2j) �0 37. p . (i - 2j) < 0
*33. Let $\mathscr{W}$ be the set of all continuous functions on the interval [-1, 1]. State, for the
functions in $\mathscr{W}$, definitions of (a) addition, (b) scalar multiplication, and (c) inner
product, in such a way that $\mathscr{W}$ forms an inner product space. Verify that under your
definitions, the inner product space laws are all satisfied. (There is only one reasonable
definition for (a), and similarly for (b); but the "right" definition of the inner product is
less obvious. Hint: The "$\cdot$" operation is supposed to assign a number $f \cdot g$ to each pair
where h and k are constants. This is different from the idea of translation of axes,
which we used in Chapter 8. Then, we were moving the axes, while now we are
moving the points (x, y), with $(x, y) \mapsto (x + h, y + k)$.
Suppose that we have given two directed segments $\overrightarrow{PQ}$, $\overrightarrow{P'Q'}$, in a coordinate
plane. If there is a translation under which $P \mapsto P'$ and $Q \mapsto Q'$, then we say that
$\overrightarrow{PQ}$ and $\overrightarrow{P'Q'}$ are equivalent.
[Figure: equivalent directed segments PQ and P'Q' in the xy-plane.]
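$$x \mapsto x + h, \qquad y \mapsto y + k,$$
where
$$h = x_1' - x_1, \qquad k = y_1' - y_1.$$
If it is true that
$$x_1' - x_1 = x_2' - x_2 \quad\text{and}\quad y_1' - y_1 = y_2' - y_2,$$
then this translation also moves Q onto Q', and $\overrightarrow{PQ}$ and $\overrightarrow{P'Q'}$ are equivalent.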
For each pair P, Q, the symbol $\overrightarrow{PQ}$ denotes the set of all directed segments
$\overrightarrow{P'Q'}$ that are equivalent to the directed segment from P to Q. Such a set of equivalent
directed segments is called a free vector (or simply a vector, if the context makes it obvious
what meaning is intended). Thus the figure on the left below is a partial picture of exactly
one free vector. A free vector is called an equivalence class of directed segments; and any
directed segment which belongs to such an equivalence class is called a representative
of the class. Thus each of the arrows in the figure is a representative of the free
vector $\overrightarrow{PQ}$.
If two directed segments $\overrightarrow{PQ}$, $\overrightarrow{P'Q'}$ are equivalent, then they determine the same
free vector, and $\overrightarrow{PQ} = \overrightarrow{P'Q'}$. And if $\overrightarrow{PQ} = \overrightarrow{P'Q'}$, then the segments PQ and P'Q'
are equivalent. Therefore, when we write an equation of the form
$$\overrightarrow{PQ} = \overrightarrow{P'Q'},$$
we are saying that the segments PQ, P'Q' are equivalent under a translation.
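The test for equivalence is easy to mechanize. In the Python sketch below (an illustration, with hypothetical function names), a directed segment is given by its two endpoints, and two segments are equivalent when the translation that takes P to P' also takes Q to Q'. Equivalently, the two segments have the same components.

```python
def components(p, q):
    """Components (x2 - x1, y2 - y1) of the directed segment from p to q."""
    return (q[0] - p[0], q[1] - p[1])

def equivalent(p, q, p2, q2):
    """True if some translation moves p onto p2 and q onto q2."""
    h, k = p2[0] - p[0], p2[1] - p[1]   # the translation taking P to P'
    return (q[0] + h, q[1] + k) == q2    # does it also take Q to Q'?

# Two representatives of the same free vector, and one that is not.
assert equivalent((0, 0), (1, 2), (3, 3), (4, 5))
assert not equivalent((0, 0), (1, 2), (3, 3), (5, 4))
assert components((0, 0), (1, 2)) == components((3, 3), (4, 5))
```

Integer coordinates are used here so that the comparison is exact; with measured coordinates one would compare within a tolerance.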
It is now easy to define, for free vectors, the operations of addition, scalar
multiplication, and dot product. If
$$\overrightarrow{OP} + \overrightarrow{OQ} = \overrightarrow{OR},$$
9.8 Free Vectors 417
It is easy to see that the right-hand figure above is correctly labeled. Therefore:
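Theorem 1. $\overrightarrow{PQ} + \overrightarrow{QP} = \vec{0}$, for every P, Q.
Similarly, the labels are correct in the parallelogram below. Since
$$\overrightarrow{OP} + \overrightarrow{OR} = \overrightarrow{OQ},$$
we have
$$\overrightarrow{OP} + \overrightarrow{PQ} = \overrightarrow{OQ}.$$
This has a geometric meaning: we can add free vectors by laying representative
segments end to end. Solving for $\overrightarrow{PQ}$, we get $\overrightarrow{PQ} = \overrightarrow{OQ} - \overrightarrow{OP}$. And this gives: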
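Theorem 2. $\overrightarrow{PQ} + \overrightarrow{QR} + \overrightarrow{RP} = \vec{0}$, for every P, Q, R.
Proof.
$$\overrightarrow{PQ} + \overrightarrow{QR} + \overrightarrow{RP} = \overrightarrow{OQ} - \overrightarrow{OP} + \overrightarrow{OR} - \overrightarrow{OQ} + \overrightarrow{OP} - \overrightarrow{OR} = \vec{0}.$$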
Note that while the directions $\theta$, $\theta'$ depend on the directions of the axes, the equation
$\theta = \theta'$ does not; if the equation holds, and the axes are rotated, then the equation
continues to hold.
Thus we say that the relation of equivalence between directed segments, used in
defining free vectors, is invariant under changes in the coordinate system.
It very often happens that we use coordinate systems in the study of things which
are invariant under changes of coordinates. Thus the distance between two points is
invariant, and so also is the question whether a given curve is a parabola. But we use
coordinate systems in the study of parabolas, and similarly we use coordinate systems
in the study of vectors. If P = (x, y), then x and y are called the x- and y-components
of $\overrightarrow{OP}$. In this case
$$\vec{P} = \overrightarrow{OP} = x\mathbf{i} + y\mathbf{j},$$
where $\vec{P}$, i, and j are as in the preceding section. Corresponding to the vectors i, j
we have free vectors i, j; and $\overrightarrow{OP}$ is a linear combination of these free vectors:
$$\overrightarrow{OP} = x\mathbf{i} + y\mathbf{j},$$
as shown on the left below. And of course pictures of the new i and j can be drawn
starting at any point that we want. In the right-hand figure, $\overrightarrow{PQ}$, i, and j are all
free vectors. In general, if V and T are any vectors, with $T \ne 0$, then the T-component
of V is the number
$$V_T = |V| \cos\theta,$$
where $\theta$ measures the angle between the direction of T and the direction of V.
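[Figure: $\overrightarrow{OP} = x\mathbf{i} + y\mathbf{j}$ (left); the free vector $\overrightarrow{PQ} = \mathbf{i} + 2\mathbf{j}$ (right).]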
Thus, in the figure below $V_T$ is the directed distance PQ, relative to the given positive
direction on the line that contains $\overrightarrow{PR}$.
Since
$$V \cdot T = |V| \cdot |T| \cos\theta,$$
we have
$$V_T = \frac{V \cdot T}{|T|}.$$
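The formula for the T-component translates directly into a computation. The following Python sketch is only an illustration; the names `dot`, `length`, and `component` are hypothetical, and vectors are represented by their (x, y) components.

```python
import math

def dot(v, t):
    """Dot product of two plane vectors given as (x, y) tuples."""
    return v[0] * t[0] + v[1] * t[1]

def length(v):
    """Length |V| = sqrt(V . V)."""
    return math.sqrt(dot(v, v))

def component(v, t):
    """T-component of V, computed as (V . T) / |T|; T must not be the zero vector."""
    return dot(v, t) / length(t)

v = (3.0, 4.0)
print(component(v, (2.0, 0.0)))    # component along the positive x-direction
print(component(v, (0.0, -5.0)))   # negative: V points against this T
```

Note that the result depends only on the direction of T, not on its length: replacing T by any positive multiple of T gives the same number.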
In the figures below, we use tick marks to indicate that segments have the same length.
Thus the tick marks in the figure below say that AB = AC.
1. a) Calculate $\overrightarrow{OS}$ as a linear combination of $\overrightarrow{OR}$ and $\overrightarrow{OP}$. (The figure is a parallelogram.)
0 0
b) Show that, in a rhombus, $\overrightarrow{SR} \cdot \overrightarrow{OT} = 0$. (These two answers, in combination, give
a vector proof that the diagonals of a rhombus are perpendicular.)
3. a) Calculate $\overrightarrow{OS}$ as a linear combination of $\overrightarrow{OP}$ and $\overrightarrow{OR}$ in the left-hand figure below.
R R
b) Calculate $\overrightarrow{OT}$ as a linear combination of $\overrightarrow{OP}$ and $\overrightarrow{OR}$, in the right-hand figure.
4. Do Problems 3a and 3b give a vector proof that the three medians of a triangle are
concurrent? Or do you need to carry out a third calculation of the same kind, to
complete the proof?
5. a) Show that
$$|V \cdot T| \le |V| \cdot |T|,$$
for every two free vectors V and T.
b) Show that for any real numbers a, b, x, y, we have
11. a) A set of vectors $V_1, V_2, \ldots, V_n$ are linearly dependent if there are scalars $\alpha_1, \alpha_2, \ldots,$
Show that for any V, the vectors i, j, and V are linearly dependent.
b) Show that if one of the vectors $V_i$ is 0, then the vectors $V_1, V_2, \ldots, V_n$ are
linearly dependent.
c) Find a number a such that 2i + j and 7i + aj are linearly dependent.
12. a) A set of vectors V1, V2, ... , Vn are linearly independent if they are not linearly
dependent. Thus the V/s are linearly dependent if
n
and
then
$$|V_1 - V_2| = |W_1 - W_2|.$$
(Remember that $|V|^2 = V \cdot V$, for every V.) Then draw a figure, and restate the
15. a) Consider the vector space which you were asked to define in the last problem of the
preceding problem set. Let 1 be the constant function which is = 1 for each x on
[-1, 1]. Find ten nonconstant functions $f_1, f_2, \ldots, f_{10}$ such that $1 \cdot f_i = 0$ for each i.
We return to the discussion of moving particles in a plane. Suppose that the motion is
described by a path
$$P: I \to E, \qquad t \mapsto P(t),$$
where I is a time interval. Let the coordinate functions of the path be f and g, so that
$$\vec{P}_t = \overrightarrow{OP_t},$$
where Pt is a vector in the sense of Section 9.7, and P(t) is denoted by Pt, to fit it into
the vector notation.
We then have
where i and j are the free vectors corresponding to i and j. Since $V_t$ and $A_t$ are free
vectors, we can draw pictures of them in any position we want; and so we picture
them by drawing arrows starting at the point $P_t$.
The picture then says that at time t, the moving particle is at the point $P_t$ and has the
indicated velocity and acceleration vectors $V_t$ and $A_t$. Note that $V_t$ lies along the
tangent line; and this is right. (This should be checked, for the various possible cases.
(a) If $f'(t)$ and $g'(t)$ are both 0, then $V_t = 0$, and there is nothing to prove. (b) If
$f'(t) \ne 0$, then $V_t$ and the tangent line both have slope $g'(t)/f'(t)$. (c) If $f'(t) = 0$
and $g'(t) \ne 0$, then $V_t$ and the tangent line are both vertical.)
When we write $V_t = f'(t)\mathbf{i} + g'(t)\mathbf{j}$, $A_t = f''(t)\mathbf{i} + g''(t)\mathbf{j}$, we are describing each
Next we take a free vector N, with length 1, perpendicular to T, and lying on the
same side of T as $A_t$. Then $A_t$ must be expressible as a linear combination
$$A_t = \alpha T + \beta N.$$
[Figure: the vectors N and $A_t$.]
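In the right-hand figure above, $\theta$ is the direction of $V_t$. Since $V_t = f'(t)\mathbf{i} + g'(t)\mathbf{j}$,
we have
$$\cos\theta = \frac{f'(t)}{|V_t|}, \qquad \sin\theta = \frac{g'(t)}{|V_t|},$$
where
$$|V_t| = \sqrt{f'(t)^2 + g'(t)^2}.$$
Similarly, if $\phi$ is the direction of $A_t$, then
$$\cos\phi = \frac{f''(t)}{|A_t|}, \qquad \sin\phi = \frac{g''(t)}{|A_t|}.$$
By definition of the T-component $\alpha$ of $A_t$, we have
$$\alpha = |A_t| \cos(\phi - \theta) = |A_t| \cos\phi \cos\theta + |A_t| \sin\phi \sin\theta$$
$$= f''(t) \cos\theta + g''(t) \sin\theta = \frac{f''(t)f'(t) + g''(t)g'(t)}{|V_t|} = \frac{f'(t)f''(t) + g'(t)g''(t)}{\sqrt{f'(t)^2 + g'(t)^2}}.$$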
The normal component $\beta$ is computed as follows. If N is counterclockwise from
T, as in the figure on p. 423, then
If the direction of N is reversed, the sign of $\sin(\theta - \phi)$ is also reversed. In any case,
we want $\beta \ge 0$, because N is taken on the same side of the tangent as $A_t$. Therefore
9.9 Velocity, Acceleration, and Curvature 425
we must have
$$\beta = |A_t| \cdot |\sin(\theta - \phi)|,$$
in all cases. Therefore
$$\beta = \bigl|\, |A_t| \sin\theta \cos\phi - |A_t| \cos\theta \sin\phi \,\bigr| = |f''(t) \sin\theta - g''(t) \cos\theta|$$
$$= \frac{|f''(t)g'(t) - g''(t)f'(t)|}{|V_t|} = \frac{|f''(t)g'(t) - g''(t)f'(t)|}{\sqrt{f'(t)^2 + g'(t)^2}}.$$
This formula for $\beta$ also has an interpretation, but its interpretation is harder to
see, and requires the idea of the curvature of a path at a point.
For the sake of simplicity, we start with the idea of the curvature of the graph of a
twice differentiable function at a point.
For each x, let s(x) be the length of the graph from t = a to t = x. Then
$$K = \left| \frac{d\theta}{ds} \right|.$$
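Therefore
$$K = \left| \frac{d\theta}{ds} \right| = \left| \frac{d\theta/dx}{ds/dx} \right| = \left| \frac{f''(x)}{1 + f'(x)^2} \cdot \frac{1}{\sqrt{1 + f'(x)^2}} \right| = \left| \frac{f''(x)}{[1 + f'(x)^2]^{3/2}} \right|.$$
For future reference:
$$K = \frac{|f''(x)|}{[1 + f'(x)^2]^{3/2}}.$$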
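As a quick numerical illustration of this formula, the Python sketch below evaluates K from given values of $f'(x)$ and $f''(x)$. The function name `graph_curvature` and the choice of example, $y = \cosh x$, are assumptions made for illustration. For this curve $f' = \sinh x$ and $f'' = \cosh x$, so the formula reduces to $K = 1/\cosh^2 x$, which gives an independent check.

```python
import math

def graph_curvature(f1, f2):
    """K = |f''(x)| / [1 + f'(x)^2]^(3/2), given the values f1 = f'(x) and f2 = f''(x)."""
    return abs(f2) / (1.0 + f1 * f1) ** 1.5

# Example curve (an assumption for illustration): y = cosh x.
for x in (0.0, 0.5, 1.0, 2.0):
    k = graph_curvature(math.sinh(x), math.cosh(x))
    assert math.isclose(k, 1.0 / math.cosh(x) ** 2)
```

The function takes derivative values rather than the function f itself, so no numerical differentiation is involved.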
For paths, the idea is similar. Take a fixed t0, and for each t, let s(t) be the length
of the path, from t0 to t. Then
and
For each t, let $\theta(t)$ be the direction of the velocity vector at time t. We are working
on a portion of the path where $|V_t| = \sqrt{f'(t)^2 + g'(t)^2} \ne 0$. On such a portion of the
path, s is an increasing function, and so $\theta(t)$ is determined when s(t) is known. Thus
there is a function h such that
$$\theta(t) = h(s(t)).$$
Therefore
$$\frac{d\theta}{ds} = h'(s(t)) = \frac{d\theta/dt}{ds/dt}.$$
$$K = \left| \frac{d\theta}{ds} \right|.$$
In order to calculate K
we take first the case in which $V_t$ is not vertical, so that $\tan\theta(t) = g'(t)/f'(t)$.
Differentiating, we get
$$[\sec^2\theta(t)]\,\theta'(t) = \frac{f'(t)g''(t) - g'(t)f''(t)}{f'(t)^2}.$$
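Now $\sec^2\theta(t) = 1 + \tan^2\theta(t) = [f'(t)^2 + g'(t)^2]/f'(t)^2$. Therefore
$$\theta'(t) = \frac{f'(t)g''(t) - g'(t)f''(t)}{f'(t)^2 + g'(t)^2}.$$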
This derivation works whenever the velocity vector is nonvertical. (Query: How
would you derive the same formula, in the case where the velocity vector is vertical?)
This gives
$$K = \left| \frac{d\theta}{ds} \right| = \left| \frac{d\theta/dt}{ds/dt} \right| = \left| \frac{\theta'(t)}{s'(t)} \right| = \frac{|f'(t)g''(t) - g'(t)f''(t)|}{[f'(t)^2 + g'(t)^2]^{3/2}}.$$
Thus we have:
Theorem 3. The curvature of a twice differentiable path, at any point where the
speed is not 0, is given by the formula
$$K = \frac{|f'(t)g''(t) - g'(t)f''(t)|}{[f'(t)^2 + g'(t)^2]^{3/2}}.$$
Comparing this with the formula
$$\beta = \frac{|f''(t)g'(t) - g''(t)f'(t)|}{\sqrt{f'(t)^2 + g'(t)^2}},$$
we get:
Theorem 4. At any point where the speed is not zero, the normal component of
acceleration is given by the formula
$$\beta = A_N = K\,|V_t|^2.$$
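Theorems 3 and 4 can be checked numerically on a particular path. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, and the example path, the cycloid $P_t = (t - \sin t)\mathbf{i} + (1 - \cos t)\mathbf{j}$, is chosen only because its derivatives are easy to write down. The sketch also checks that the tangential and normal components account for the whole acceleration, that is, $\alpha^2 + \beta^2 = |A_t|^2$.

```python
import math

def curvature(fp, gp, fpp, gpp):
    """Theorem 3: K = |f'g'' - g'f''| / (f'^2 + g'^2)^(3/2); the speed must not be 0."""
    return abs(fp * gpp - gp * fpp) / (fp * fp + gp * gp) ** 1.5

def tangential_component(fp, gp, fpp, gpp):
    """alpha = (f'f'' + g'g'') / sqrt(f'^2 + g'^2)."""
    return (fp * fpp + gp * gpp) / math.sqrt(fp * fp + gp * gp)

def normal_component(fp, gp, fpp, gpp):
    """beta = |f''g' - g''f'| / sqrt(f'^2 + g'^2)."""
    return abs(fpp * gp - gpp * fp) / math.sqrt(fp * fp + gp * gp)

def cycloid_derivatives(t):
    """(f', g', f'', g'') for the example path f(t) = t - sin t, g(t) = 1 - cos t."""
    return (1.0 - math.cos(t), math.sin(t), math.sin(t), math.cos(t))

for t in (0.5, 1.0, 2.0, math.pi):
    d = cycloid_derivatives(t)
    speed_sq = d[0] ** 2 + d[1] ** 2
    alpha = tangential_component(*d)
    beta = normal_component(*d)
    assert math.isclose(beta, curvature(*d) * speed_sq)          # Theorem 4
    assert math.isclose(alpha ** 2 + beta ** 2, d[2] ** 2 + d[3] ** 2)
```

At t = 0 the speed of this path is 0, which is why that value is left out of the loop.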
In our discussion, we used the notation f', g', ... for derivatives, most of the
time, in order to connect our work with the preceding theory. We used the notation
$d\theta/dt$, $d\theta/ds$, . . . only when we really needed to talk about the derivative of one
function with respect to another, in defining and calculating curvature. In the literature
of physics, however, the notation f', g', . . . is hardly used at all. The following
notations are far more common:
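$$V = \frac{df}{dt}\mathbf{i} + \frac{dg}{dt}\mathbf{j}, \qquad V = \frac{dx}{dt}\mathbf{i} + \frac{dy}{dt}\mathbf{j}, \qquad V = \dot{x}\mathbf{i} + \dot{y}\mathbf{j}.$$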
In the last expression the dots over x and y indicate differentiation with respect to
time. Similarly,
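$$A = \frac{d^2 f}{dt^2}\mathbf{i} + \frac{d^2 g}{dt^2}\mathbf{j} = \frac{d^2 x}{dt^2}\mathbf{i} + \frac{d^2 y}{dt^2}\mathbf{j} = \ddot{x}\mathbf{i} + \ddot{y}\mathbf{j}.$$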
In these notations,
and
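$$K = \frac{|\dot{x}\ddot{y} - \dot{y}\ddot{x}|}{[\dot{x}^2 + \dot{y}^2]^{3/2}} = \frac{|(dx/dt)(d^2y/dt^2) - (dy/dt)(d^2x/dt^2)|}{[(dx/dt)^2 + (dy/dt)^2]^{3/2}}.$$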
1. Find the point of maximum curvature of the parabola $y = x^2$, and find the maximum
value of K.
2. Find the point of maximum curvature of the parabola $y = 2 + x + x^2$, and find the
maximum value of K.
3. Find the points of maximum and minimum curvature of the graph of $y = x^3$, and
calculate the values of K at these points.
x2 y2
�+ =1.
/;2
$$P_t = \mathbf{i}\cos t + \mathbf{j}\sin t,$$
$$P_t = \mathbf{i}\cos\frac{t}{2} + \mathbf{j}\sin\frac{t}{2}.$$
16. For a certain path, the velocity at time 0 has direction $\alpha$ and length 1. The initial
point $P_0$ is the origin. For each t, $A_t = -g\mathbf{j}$. Express the path in the form
$P_t = f(t)\mathbf{i} + g(t)\mathbf{j}$.
17. Discuss as in Problems 6 through 15, and express the tangential and normal
components of acceleration as functions of the time:
Describe this as a path in polar coordinates; find a rectangular equation for its locus,
and identify the locus.
19. Discuss and sketch
The figure indicates that in some cases, at least, there is such a time t.
24. Given a path which has curvature K at time t0, suppose that the axes are rotated. Does
the curvature change? Why or why not? [Hint: This problem does not require a
calculation.]
25. Let a = i + j, b = i - j. Suppose we define an "inner product" V1 * V2, by agreeing
that for
$$V_1 = x_1 a + y_1 b, \qquad V_2 = x_2 a + y_2 b,$$
the * product is
$$V_1 * V_2 = x_1 x_2 + y_1 y_2.$$
a) Does * obey the same formal laws as the old inner product?
b) Is it true that $V_1 * V_2 = V_1 \cdot V_2$ for every $V_1$ and $V_2$? Why or why not? In any
case, express the new operation * in terms of the old.
The treatment of vectors in this chapter has been brief, because so far we are working
in a plane, and the main advantages of a vector approach appear in three-dimensional
space, and in spaces of higher dimensions. Meanwhile we must bear in mind that
vector ideas appear in many different forms.
1) Free vectors. Velocity and acceleration are vectors in this sense, as in Sections
9.8 and 9.9.
2) Bound vectors. These have not only length and direction, but also position.
For example, if two forces act in opposite directions on the ends of a spring, then
they may be regarded as bound vectors.
[Figure: forces $F_1$ and $F_2$ acting in opposite directions on the ends of a spring.]
In the figure, the two forces have the same length and opposite directions, but they
do not cancel each other out, as free vectors would; on the contrary, they compress
the spring.
$$(w_1, x_1, y_1, z_1) + (w_2, x_2, y_2, z_2) = (w_1 + w_2, x_1 + x_2, y_1 + y_2, z_1 + z_2),$$
$$(w_1, x_1, y_1, z_1) \cdot (w_2, x_2, y_2, z_2) = w_1 w_2 + x_1 x_2 + y_1 y_2 + z_1 z_2.$$
4) Systems of other kinds, regarded as vector spaces and inner product spaces. Some
of these are unexpected, but turn out to be useful. See, for example, Problem 33 of
Problem Set 9.7, in which it appeared that a set of functions can be regarded as an
inner product space, although functions may not seem like vectors when we look at
them one at a time.
For this reason, when people speak of "vectors," we need to find out what kind
of vectors they are talking about.
10 Infinite Series
$$A_n = \frac{h^3}{3}\left(1 + \frac{1}{n}\right)\left(1 + \frac{1}{2n}\right),$$
and we found that
$$\lim_{n\to\infty} A_n = \frac{h^3}{3}.$$
We are now going to use limits of sequences more extensively, as a way of dealing
with infinite series. Given an infinite sum
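$$\sum_{i=1}^{\infty} a_i = a_1 + a_2 + \cdots,$$
we define
$$A_n = \sum_{i=1}^{n} a_i = a_1 + a_2 + \cdots + a_n,$$
and we call the $A_n$'s the partial sums of $\sum_{i=1}^{\infty} a_i$. Thus the $A_n$'s form a sequence
$A_1, A_2, \ldots$. If $\lim_{n\to\infty} A_n = A$, then we write
$$\sum_{i=1}^{\infty} a_i = A.$$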
We shall now examine limits of sequences more carefully, starting with the
definition of the limit, and building up the theory that is needed.
Definition. Given a sequence $A_1, A_2, \ldots$ of numbers, and a number L. Suppose that
for every $\epsilon > 0$ there is an integer N such that
$$n > N \implies |A_n - L| < \epsilon.$$
Then
$$\lim_{n\to\infty} A_n = L.$$
Note that this is like the definition of $\lim_{x\to\infty} f(x)$. A sequence which has a limit
is called convergent. Here, as always, when we speak of a limit we mean a finite limit
(unless the contrary is stated.)
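Theorem 1. $\lim_{n\to\infty} \dfrac{1}{n} = 0$.
Proof. Here L = 0, and $|A_n - L| = |1/n - 0| = 1/n$. Thus we need to show that
for every $\epsilon > 0$ there is an N such that
$$n > N \implies \frac{1}{n} < \epsilon.$$
Now
$$\frac{1}{n} < \epsilon \iff n > \frac{1}{\epsilon}.$$
If $1/\epsilon$ is an integer, let $N = 1/\epsilon$. In any case, there is an integer $N > 1/\epsilon$. Then
$$n > N \implies n > \frac{1}{\epsilon} \implies \frac{1}{n} < \epsilon.$$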
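The proof is constructive: given $\epsilon$, it tells us how to find N. The following Python sketch (an illustration; the function name `n_for_epsilon` is hypothetical) chooses N as the least integer that is at least $1/\epsilon$ and then tests the implication for a range of values of n. A finite test of this kind illustrates the definition; it does not replace the proof.

```python
import math

def n_for_epsilon(eps):
    """An integer N with N >= 1/eps, so that n > N implies 1/n < eps."""
    return math.ceil(1.0 / eps)

for eps in (0.5, 0.1, 0.003):
    N = n_for_epsilon(eps)
    # Every n beyond N that we try satisfies |1/n - 0| < eps.
    assert all(1.0 / n < eps for n in range(N + 1, N + 1000))
```

Any larger N would serve equally well; the definition asks only that some N exist.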
On the basis of the definition of $\lim_{n\to\infty} A_n$, we can prove the expected theorems
on sums, products, and quotients. These are much like the corresponding theorems
for limits of functions. In Appendix C they are listed in such an order that they become
easy to prove. Meanwhile we shall state the main results and use them.
and
These theorems justify the procedures that we have been using informally. For
example, they give a proof that
$$\lim_{n\to\infty} \frac{h^3}{3}\left(1 + \frac{1}{n}\right)\left(1 + \frac{1}{2n}\right) = \frac{h^3}{3}.$$
10.1 Limits of Sequences 433
$$\lim_{n\to\infty} \frac{1}{n} = 0, \qquad \lim_{n\to\infty} \left(1 + \frac{1}{n}\right) = 1, \qquad \lim_{n\to\infty} \left(1 + \frac{1}{2n}\right) = 1.$$
sequence is decreasing if An �
sequence is strictly increasing; and if $A_{n+1} < A_n$ for every n, then the sequence is
strictly decreasing.)
Definition. If there is a number M such that $A_n \le M$ for every n, then M is called
an upper bound of the sequence $A_1, A_2, \ldots$, and we say that the sequence is bounded
above. If there is a number m such that $m \le A_n$ for every n, then m is called a
lower bound of the sequence, and we say that the sequence is bounded below. If there
is a K > 0 such that $|A_n| \le K$ for every n, then the sequence is bounded.
bounded. (3) If $A_n = \sin n$, then the sequence is neither increasing nor decreasing,
but is bounded, with $|\sin n| \le 1$ for each n.
It is easy to see that if a sequence is bounded both above and below, then it is
bounded. Given $m \le A_n \le M$ for every n, let K be the larger of the numbers $|m|$
and $|M|$.
That is, if
$$A_1 \le A_2 \le \cdots \le A_n \le A_{n+1} \le \cdots \le M,$$
then the sequence has a limit. The first application of this principle that you may
have seen is in geometry. Given a circle of diameter 1, we inscribe in it a regular
polygon of $2^n$ sides. For each n, let $A_n$ be the perimeter of our $2^n$-gon. (Note that
we had better start with n = 2.) It is a matter of elementary geometry to show that the
sequence $A_2, A_3, \ldots$ is increasing. Also, $A_n < 4$ for every n, because the perimeter of
every inscribed polygon is less than the perimeter of the circumscribed square.
(Draw a figure.) Therefore the sequence is convergent. Its limit, of course, is $\pi$.
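The behavior of these perimeters can be observed numerically. In the Python sketch below (an illustration; the function name is hypothetical), the side of a regular N-gon inscribed in a circle of diameter 1 is $\sin(\pi/N)$, so the perimeter is $N \sin(\pi/N)$. The computation uses the library value of $\pi$ to evaluate the sine, so it shows the sequence increasing toward $\pi$ under the bound 4; it is not a method of computing $\pi$.

```python
import math

def inscribed_perimeter(n):
    """Perimeter of a regular 2**n-gon inscribed in a circle of diameter 1."""
    sides = 2 ** n
    return sides * math.sin(math.pi / sides)

perimeters = [inscribed_perimeter(n) for n in range(2, 16)]

assert all(a < b for a, b in zip(perimeters, perimeters[1:]))  # increasing
assert all(p < 4 for p in perimeters)                          # bounded above by 4
print(perimeters[0], perimeters[-1])  # from the square up to the 2**15-gon
```

The first value is the perimeter of the inscribed square, $2\sqrt{2}$.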
We proceed to the proof. Let S be the set of all numbers An. That is,
Then S has an upper bound. By the Least Upper Bound Postulate (LUBP), S has a
least upper bound. (See Section 5.6.) This is called the supremum of S, and is denoted
by sup S. Let
$$A = \sup S.$$
We shall show that
$$\lim_{n\to\infty} A_n = A.$$
Therefore
for every n.
Therefore
$$n > N \implies |A_n - A| < \epsilon,$$
That is, if $A_1, A_2, \ldots$ is decreasing, and $A_n \ge K$ for every n, then the sequence
has a limit.
Proof. For each n, let $B_n = -A_n$. Then $B_1, B_2, \ldots$ is increasing, and is bounded
above. Therefore it is convergent. Let $\lim_{n\to\infty} B_n = B$. Then $\lim_{n\to\infty} A_n = -B$.
Some simple sequences converge for reasons which are not covered by the
preceding theorems. For example, given that
$$\lim_{n\to\infty} \frac{1}{n} = 0,$$
it is obvious that
lim
n-co
1.V"- = 0,
n + 2 n
because the second sequence is smaller, term by term. This is the idea of the following
theorem.
Theorem 6 (The annihilation theorem). If $\lim_{n\to\infty} A_n = 0$, and $B_1, B_2, \ldots$ is bounded,
then $\lim_{n\to\infty} A_n B_n = 0$.
mean what you would expect. You should be able to state your own definitions of
them, following, if you need to, the models of Section 5.3. Sequences like this are
not called convergent. If $\lim_{n\to\infty} A_n = \infty$, then we say that the sequence diverges
to infinity. And if $\lim_{n\to\infty} A_n = -\infty$, we say that the sequence diverges to minus
infinity. We have to be careful about this: if convergence allowed the limits $\infty$ and
$-\infty$, then Theorem 7 would become false, and Theorem 2 would be meaningless
in many cases. (You can't perform algebraic operations on the "numbers" $\infty$ and
$-\infty$.)
Investigate the following indicated limits. That is, find out whether they exist, and find
out, if possible, what they are.
I
1. lim 2
n�oo
n
2 + 311
2. Jim --- (Try dividing the numerator and denominator by n.)
n-Cf) 3n
1 ll
4. Jim 3 5. lim 3
n-oo 2 ll + n n->00 ll + n2 + 7T
� n
)
8. Jim
n->OO
( 1 +
n
� -l/n
) [Hint: (1 x)1fx.
9. lim
n_,.oo
( 1 -
n (1 +
Surely you know limx-o
1/y)v,
+
and apply the result to the problem in hand.]
Now find liffiy_oo
10. $\lim_{n\to\infty} B_n$, where $B_n$ is the perimeter of a regular $2^n$-gon circumscribed about a circle
of radius 1. (You need not prove that your answer to this one is right.)
17. Jim
f " dx
312
n-oo 1 X
n 1
18. Jim .L � (Investigate existence only. A geometric interpretation is useful.)
n---+CO z=l l
"dx
19. Jim (
n--oo J1 X
n
1
20. Jim L :a (Investigate existence only.)
n--i-oo i=1 l
n
1
21. lim L .31 (Investigate existence only.)
n-+OO 1.=l l 2
n
I
22. Jim L -: (Investigate existence only.)
n->OO t=l l
23. Jim
n--co i=l
I ( hr)n n sin : (Geometric interpretation?)
n
I I)
_L --- (
!!__
24. Jim I ( 2n) n cos
hr
25. Jim
n�oo i=l 2 n-oo i=l I + (i/n) ;
y
10.2 Infinite Series. Convergence. Comparison Tests 437
26. lim I
" 1 (1)- 27. Jim -
1 n
L eifn
n-•co i=l 1 + (i/211) n n-co n i=l
28.
l n
lim - L e-i/n
n-co n i=l
29. Jim n -n1
sin
1
30. Jim n2 sin 2
n--+CO n
31. Jim 11 1
n-+CO
( - cos �
n
)
32. Jim n tan
1
33. Jim n
1-
sec
n-.oo n n
34. Jim
n--+CO
[I � - ]
1.=l l
In n
(In fact, this limit exists; if you can find a geometric interpretation of the problem, you
can prove it. The limit is known as Euler's constant. Nobody knows whether it is
rational.)
n 1 n 1
35. Jim
n->OO
I
i=l
.
(21 + 1) 2
(Investigate existence only.) 36. um I
n-.oo i=l (31.2
+ l
)2
$$\sum_{i=1}^{\infty} a_i = a_1 + a_2 + \cdots + a_n + \cdots.$$
We say "an indicated sum" because in many cases there is no such thing as the sum
of infinitely many terms. For example, the series
$$1 - 1 + 1 - 1 + 1 - \cdots \quad \text{(to infinity).}$$
In many cases, however, the "sum of infinitely many terms" can be defined, by a
passage to a limit, in the following way.
Given the series
$$\lim_{n\to\infty} A_n = A,$$
where A is a (finite) number, then we say that the series is convergent and that A
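is its sum. We also say that the series converges to A. If the sequence $A_1, A_2, \ldots$
diverges to infinity, so that
$$\lim_{n\to\infty} A_n = \infty,$$
then we say that the series diverges to infinity. And if
$$\lim_{n\to\infty} A_n = -\infty,$$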
then the series diverges to minus infinity. We may write these statements briefly as
"' "' 00
Probably the first example that you have seen of a convergent series is the geometric
series
$$1 + r + r^2 + \cdots + r^n + \cdots \qquad (0 < r < 1).$$
Here
$$A_n = 1 + r + r^2 + \cdots + r^n = \frac{1 - r^{n+1}}{1 - r} = \frac{1}{1 - r} - \frac{r}{1 - r}\, r^n.$$
If we know that
$$\lim_{n\to\infty} r^n = 0 \qquad (0 < r < 1), \qquad (1)$$
then it follows that
$$\lim_{n\to\infty} A_n = \frac{1}{1 - r} \qquad (0 < r < 1); \qquad (2)$$
and this means that
$$\sum_{i=0}^{\infty} r^i = \frac{1}{1 - r} \qquad (0 < r < 1). \qquad (3)$$
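Formulas (2) and (3) are easy to watch in action. The Python sketch below (an illustration; the function name is hypothetical) adds the terms one at a time, compares the result with the closed form $(1 - r^{n+1})/(1 - r)$, and shows the partial sums approaching $1/(1 - r)$.

```python
import math

def geometric_partial_sum(r, n):
    """A_n = 1 + r + r**2 + ... + r**n, added term by term."""
    total, term = 0.0, 1.0
    for _ in range(n + 1):
        total += term
        term *= r
    return total

for r in (0.5, 0.9, 0.1):
    for n in (0, 1, 5, 50):
        closed_form = (1.0 - r ** (n + 1)) / (1.0 - r)
        assert math.isclose(geometric_partial_sum(r, n), closed_form)
    # For large n the partial sum is very close to the sum 1/(1 - r).
    assert abs(geometric_partial_sum(r, 500) - 1.0 / (1.0 - r)) < 1e-9
```

The closer r is to 1, the more terms are needed before the partial sums settle down.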
There are many ways of proving (1). The following proof is the easiest. Since
0 < r < 1, we have
$$r^{n+1} < r^n \quad \text{for every } n.$$
Therefore the sequence $r, r^2, r^3, \ldots$ is decreasing. And it has a lower bound, namely 0.
Therefore the sequence is convergent, to some limit L. Thus
$$\lim_{n\to\infty} r^{n+1} = L.$$
(Why? What happens to the limit of a sequence, if you omit the first term?) Therefore
$$L = \lim_{n\to\infty} r^{n+1} = r \lim_{n\to\infty} r^n = rL,$$
and so
$$L = rL, \quad \text{and} \quad (1 - r)L = 0.$$
Since $1 - r \ne 0$, it follows that L = 0. Therefore
Proof? (If you rewrite these two statements, using the definitions of the statements
$\lim_{n\to\infty} |a_n| = 0$ and $\lim_{n\to\infty} a_n = 0$, they hardly even look different.) Thus we get
the following theorem.
$$\lim_{n\to\infty} r^n = 0.$$
Algebraically, the formula
$$1 + r + r^2 + \cdots + r^n = (1 - r^{n+1})/(1 - r)$$
holds for every $r \ne 1$. We therefore have a more general result for geometric series:
The following theorem often makes it easy to see that a series diverges.
$$\lim_{n\to\infty} (A_n - A_{n-1}) = A - A = 0.$$
For example, the geometric series $\sum_{i=0}^{\infty} a r^i$ is divergent for $a \ne 0$ and $|r| \ge 1$.
In this case, $|a_n| = |a| \cdot |r|^n \ge |a|$, and so $a_n$ does not approach 0.
Warning. The converse of Theorem 5 is false. That is, the nth term of a series
may approach 0, and the series may still diverge. The simplest example of this is the
series
$$1 + \tfrac12 + \tfrac12 + \tfrac13 + \tfrac13 + \tfrac13 + \tfrac14 + \tfrac14 + \tfrac14 + \tfrac14 + \cdots$$
The next five terms are each equal to $\tfrac15$; and so on. Here $a_n \to 0$, but the series
diverges to infinity.
A more natural example of the same phenomenon is the harmonic series
$$\sum_{i=1}^{\infty} \frac{1}{i} = 1 + \frac12 + \frac13 + \cdots + \frac1n + \cdots.$$
In fact, this diverges. The easiest way to see this is to draw a picture:
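[Figure: the graph of y = 1/x, with circumscribed rectangles over the intervals from 1 to 2, 2 to 3, ..., n to n + 1.]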
For each n, the area under the graph from x = 1 to x =n + 1 is less than the total
area of the circumscribed rectangles. Therefore
$$A_n = 1 + \frac12 + \frac13 + \cdots + \frac1n > \int_1^{n+1} \frac{dx}{x} = \ln(n + 1).$$
But
$$\lim_{n\to\infty} \ln(n + 1) = \infty.$$
Therefore the partial sums An form an unbounded sequence, and the series must
diverge to infinity. Briefly:
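The comparison with the integral can be checked numerically. The following Python sketch (an illustration; the function name is hypothetical) confirms that the partial sums of the harmonic series stay above $\ln(n + 1)$, and shows how slowly they grow.

```python
import math

def harmonic_partial_sum(n):
    """A_n = 1 + 1/2 + 1/3 + ... + 1/n."""
    return sum(1.0 / i for i in range(1, n + 1))

for n in (1, 10, 100, 10_000):
    a_n = harmonic_partial_sum(n)
    assert a_n > math.log(n + 1)   # A_n exceeds the area under y = 1/x
    print(n, a_n, math.log(n + 1))
```

The growth is so slow that a table of partial sums alone would not convince anyone that the series diverges; the inequality is what settles the question.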
The same sort of comparison scheme can be used for other series, to show that
they converge. Consider, for example,
$$\sum_{i=1}^{\infty} \frac{1}{i^2}.$$
Here $a_i = 1/i^2$, and so $\lim_{i\to\infty} a_i = 0$. This does not, in itself, show that the series
converges. But the algebraic pattern suggests that the series is related to the improper
integral
$$\int_1^{\infty} \frac{dx}{x^2} = \lim_{a\to\infty} \int_1^{a} \frac{dx}{x^2} = \lim_{a\to\infty} \left[-\frac{1}{x}\right]_1^{a} = \lim_{a\to\infty} \left(-\frac{1}{a} + 1\right) = 1.$$
$$A_n = \sum_{i=1}^{n} \frac{1}{i^2} < 1 + \int_1^{n} \frac{dx}{x^2} < 2 \qquad (n > 1).$$
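Here the integral gives an upper bound instead of a lower bound. The Python sketch below (an illustration; the function name is hypothetical) checks the inequality $A_n < 1 + \int_1^n dx/x^2 < 2$, using the value $1 - 1/n$ of the integral.

```python
def inverse_square_partial_sum(n):
    """A_n = 1 + 1/4 + 1/9 + ... + 1/n**2."""
    return sum(1.0 / (i * i) for i in range(1, n + 1))

for n in (2, 10, 1000, 100_000):
    integral = 1.0 - 1.0 / n              # integral of dx/x**2 from 1 to n
    a_n = inverse_square_partial_sum(n)
    assert a_n < 1.0 + integral < 2.0
    print(n, a_n)
```

The printed values increase and stay below 2, which is all that the convergence argument requires; the argument does not tell us what the sum is.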
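Theorem 8 (The comparison theorem). Let $\sum_{i=1}^{\infty} a_i$ and $\sum_{i=1}^{\infty} b_i$ be series, with
$$0 \le a_i \le b_i \quad \text{for each } i.$$
Then (1) if $\sum_{i=1}^{\infty} b_i$ is convergent, then so also is $\sum_{i=1}^{\infty} a_i$; and (2) if $\sum_{i=1}^{\infty} a_i$ is divergent,
then so also is $\sum_{i=1}^{\infty} b_i$.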
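Proof. For each n, let
$$A_n = \sum_{i=1}^{n} a_i, \qquad B_n = \sum_{i=1}^{n} b_i.$$
Then
$$A_n \le B_n \quad \text{for every } n.$$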
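(Why?) And each of the sequences $A_1, A_2, \ldots$ and $B_1, B_2, \ldots$ is increasing. An
increasing sequence is convergent if it is bounded, and conversely. We can therefore
prove (1) in the following steps:
$$\sum_{i=1}^{\infty} b_i \text{ is convergent} \implies B_1, B_2, \ldots \text{ is bounded} \implies A_1, A_2, \ldots \text{ is bounded} \implies \sum_{i=1}^{\infty} a_i \text{ is convergent}.$$
And (2) is the contrapositive of (1):
$$\sum_{i=1}^{\infty} a_i \text{ is divergent} \implies \sum_{i=1}^{\infty} b_i \text{ is divergent}.$$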
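The comparison theorem gives us easy tests for some series. Consider for
example,
$$\sum_{i=0}^{\infty} \frac{1}{i!} = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots.$$
Here
$$n! = 1 \cdot 2 \cdot 3 \cdots (n - 1) \cdot n.$$
Let $a_i = 1/i!$ and $b_i = (1/2)^{i-1}$. Then
$$a_i \le b_i \quad \text{for each } i;$$
$$\frac{1}{0!} < \left(\frac12\right)^{-1} \qquad (i = 0),$$
$$\frac{1}{1!} = \left(\frac12\right)^{0} \qquad (i = 1),$$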
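and thereafter $1/n! \le (1/2)^{n-1}$ for $n \ge 2$, with strict inequality for $n \ge 3$. Therefore
our series is term by term less than or equal to the geometric series
$$\sum_{i=0}^{\infty} \left(\frac12\right)^{i-1},$$
which converges. Therefore our series converges. In fact,
$$\sum_{i=0}^{\infty} \frac{1}{i!} = e = \ln^{-1} 1 = \lim_{x\to 0} (1 + x)^{1/x}.$$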
But we won't be able to prove this until we have developed the theory much further.
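The situation here is peculiar: the easiest way to get this special result is first to show
that
$$e^x = \sum_{i=0}^{\infty} \frac{x^i}{i!} \quad \text{for every } x,$$
and then to set x = 1. (You have seen a situation like this before. The easiest way to
find $\int_0^1 x^4\,dx$ is first to calculate the function $\int_0^x t^4\,dt$, and then to set x = 1.) Consider
next
$$\sum_{n=2}^{\infty} \frac{1}{n - \sqrt{n}}.$$
Since
$$\frac{1}{n - \sqrt{n}} > \frac{1}{n} \quad \text{for every } n,$$
and the harmonic series diverges, the comparison theorem gives
$$\sum_{n=2}^{\infty} \frac{1}{n - \sqrt{n}} = \infty.$$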
While the comparison theorem tells us, under some conditions, that a series
converges, it never tells us what the sum is. But such partial information may be
useful. In fact, some of the most important uses of series are in cases where a number
(or a function) can best be described by a series; in such cases, we use $\sum_{i=1}^{n} a_i$ (for
some large n) to get an approximation of $\sum_{i=1}^{\infty} a_i$. For example, the approximation
is excellent, even for fairly small values of n; it gives by far the best way of computing
e; and in fact, the series approaches its infinite sum so fast that e is much easier to
Therefore, when you are asked to show that a series has a sum, without finding
out what the sum is, you should not consider that the problem is artificial.
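The speed with which the series for e approaches its sum can be seen in a short computation. The Python sketch below is an illustration; the function name is hypothetical, and the library constant `math.e` is used only as a reference value for comparison.

```python
import math

def e_partial_sum(n):
    """Sum of 1/i! for i = 0, 1, ..., n, building each term from the one before."""
    total, term = 0.0, 1.0
    for i in range(n + 1):
        total += term
        term /= (i + 1)    # 1/(i+1)! = (1/i!) / (i+1)
    return total

for n in (2, 5, 10, 17):
    print(n, e_partial_sum(n), abs(e_partial_sum(n) - math.e))

assert abs(e_partial_sum(10) - math.e) < 1e-7
```

Eleven terms already give about seven correct decimal places, and a few more terms exhaust the precision of ordinary floating-point arithmetic.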
Find out which of the following series are convergent. If the series is geometric, calculate
the sum.
I. -
oo
oo
1 1 00
. i� j 3 /2
2 Vi: 3. .2
= --
i=l ----: 2 i l Vi + 1
oo
1
6. i(
) � 2i
i � j3/2 + 2
4.
i=l 4
00 00
sin2 (2i - 1)
.2 (-l)i7T-i 8. L ( -2)ie2i I
.
7 9.
i=l i=l i =l j2
cos3 (2i)
i�
00 00
1
10. �l j2
11.
i=l j 3
12.
i�i'
oo 00
1 1 00
1
13. 14. 0 9 15. .2 -.-·
i�l j l-1 i� j. i=2 1 nl I
oo 00
1 1 00
1
i=2 �
2 l 1n I . 17. 2 _2
i=2 z:---1n2 z
16. . 18.
1n
i=2 1-:---
z z
t
•
00 co
00
1 1 1
i=2 l:a-n
1 .
19 20 .2 21.
· i�2 ln2 i i�O (i !)2
•
22.
� (i ! - 1)
£.,
i=O (i!)3
23.
oo
2 i(i+1)
i=l -
1
24. 2
co
-
1
i=2 i(i - 1)
--- i -._
i_
00 00
1 1
26. 2
2
(i + l)(i + 2) (i - l)(i)(i + 2) + 1
25. 27.
i=l i i=2 --- i=l l
+ 1
£.,
2 8.
� i
i=l 21_
30. i: �
i=l j2 - 1
31. If you think of Theorem 3 backwards, it says that
$$\frac{1}{1 - r} = 1 + r + r^2 + \cdots.$$
That is, 1/(1 - r) can be expressed as the sum for an infinite series. Express 1/(1+x)
as the sum of an infinite series. For what numbers x does your series converge?
32. Express 1/(1 +x2) as an infinite series. For what numbers x does the series converge?
33. Same question, for 1/(1 + x4).
*34. Suppose that $\sum_{i=0}^{\infty} a_i x^i$ converges for every x. The series then defines a function
$$f(x) = \sum_{i=0}^{\infty} a_i x^i.$$
It will turn out that functions which can be defined in this way are always differentiable,
and that their derivatives can be calculated by differentiating the series a term at a time.
That is,
co
(Don't try to prove this; you haven't got a chance.) Granted that all this is true, what
must the $a_i$'s be, if f(0) = 1 and f'(x) = f(x) for every x? Comment on your result.
00
i 00 2i2
35. :L-.-
3 36. 1 I�1
i=l I + 1 i=l +
*37. For which numbers $\alpha$ is the series $\sum_{n=1}^{\infty} (1/n^{\alpha})$ convergent?
*38. Prove the following.
Theorem A (The Integral Test). Let f be a positive decreasing continuous function,
on the interval [1, oo ) . If
then
00
Given a series $\sum_{i=0}^{\infty} a_i$ (in which the terms may be positive, negative, or zero), we can
form a new series by taking the absolute value $|a_i|$ of each term $a_i$. For example, if
$$\sum_{i=0}^{\infty} a_i = \sum_{i=0}^{\infty} (-1)^i r^i = 1 - r + r^2 - \cdots,$$
then
00 00
i=O i=O
Given that $\sum a_i$ converges, it does not follow that $\sum |a_i|$ converges. For example,
the series
$$\sum a_i = 1 - 1 + \tfrac12 - \tfrac12 + \tfrac13 - \tfrac13 + \cdots$$
is convergent, but
$$\sum |a_i| = 1 + 1 + \tfrac12 + \tfrac12 + \tfrac13 + \tfrac13 + \cdots$$
is not, because the harmonic series is not. The same sort of thing happens if we take
absolute values in the series
$$\sum_{i=1}^{\infty} a_i = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{1}{i} = 1 - \frac12 + \frac13 - \frac14 + \cdots.$$
Here it is plain that $\sum |a_i|$ diverges, but it is not quite so easy to see that $\sum a_i$ is
convergent. This is worth proving, however, because the idea used in the proof is
useful in other connections.
$$A_{2k} = 1 - \frac12 + \frac13 - \cdots - \frac{1}{2k} = \left(1 - \frac12\right) + \left(\frac13 - \frac14\right) + \cdots + \left(\frac{1}{2k-1} - \frac{1}{2k}\right).$$
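Therefore the sequence $A_2, A_4, A_6, \ldots, A_{2k}, \ldots$ is increasing. And it has an upper
bound, because
$$A_{2k} = 1 - \left(\frac12 - \frac13\right) - \left(\frac14 - \frac15\right) - \cdots - \left(\frac{1}{2k-2} - \frac{1}{2k-1}\right) - \frac{1}{2k} < 1.$$
Therefore the sequence converges, to some limit A:
$$\lim_{k\to\infty} A_{2k} = A. \qquad (1)$$
Now $A_{2k+1} = A_{2k} + a_{2k+1}$,
so that
$$\lim_{k\to\infty} A_{2k+1} = \lim_{k\to\infty} [A_{2k} + a_{2k+1}] = \lim_{k\to\infty} A_{2k} + \lim_{k\to\infty} \frac{1}{2k+1},$$
$$\lim_{k\to\infty} A_{2k+1} = A + 0 = A. \qquad (2)$$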
Thus we see that (1) as $n \to \infty$ through even values, $A_n \to A$ and (2) as $n \to \infty$ through
odd values, $A_n \to A$. It follows that (3) $\lim_{n\to\infty} A_n = A$.
Proof? (You need to show that for every $\epsilon > 0$ there is an $N$ such that $|A_n - A| < \epsilon$ for every
$n > N$. Given such an $\epsilon$, you know from (1) that there is an $N_1$ such that
$|A_{2k} - A| < \epsilon$ for every $k > N_1$; and you know from (2) that there is an $N_2$ such that
$|A_{2k+1} - A| < \epsilon$ for every $k > N_2$. How can $N$ be defined in terms of $N_1$ and $N_2$?)
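The behavior of the even and odd partial sums can be watched numerically. The following is a minimal sketch, not part of the text: the function name `partial_sum` is hypothetical, and the comparison value $\ln 2$ is assumed here as the known sum of the series (the text has not yet established it at this point).

```python
import math

def partial_sum(n):
    """Return A_n = 1 - 1/2 + 1/3 - ... with n terms."""
    return sum((-1) ** (i + 1) / i for i in range(1, n + 1))

# Even-indexed partial sums A_2, A_4, ... increase and stay below 1.
even_sums = [partial_sum(2 * k) for k in range(1, 8)]
# Odd-indexed partial sums A_3, A_5, ... approach the same limit from above.
odd_sums = [partial_sum(2 * k + 1) for k in range(1, 8)]

limit = math.log(2)  # assumed known value of the sum
for k in (10, 100, 1000):
    print(k, partial_sum(2 * k), partial_sum(2 * k + 1), limit)
```

The printed rows suggest that both subsequences squeeze in on the same number, which is the content of statements (1), (2), and (3).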
The scheme that we used to prove Theorem 1 applies more generally. If you
reexamine the proof, you will see that the only facts about the series that we used were the following.
1) The series is alternating. That is, successive terms $a_i$, $a_{i+1}$ have opposite signs.
2) $\lim_{n\to\infty} a_n = 0$.
3) The sequence $|a_1|, |a_2|, \ldots$ is decreasing.
Theorem 2 (The alternating series test). Given an alternating series $\sum_{i=1}^{\infty} a_i$. If
$\lim_{n\to\infty} a_n = 0$, and the sequence $|a_1|, |a_2|, \ldots$ is decreasing, then the series converges.
(Strictly speaking, some of our formulas in the proof of Theorem 1 used the fact
that the first term was positive instead of negative. If you know that the theorem
holds in this case, how would you show that it also holds when a1 < 0 ?)
We have seen that if $\sum a_i$ converges, it does not follow that $\sum |a_i|$ converges.
But the reverse implication does hold:
into a sum of positive terms and a sum of negative terms. To do this, we let
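\[ a_i^+ = \begin{cases} a_i & \text{if } a_i \ge 0, \\ 0 & \text{if } a_i < 0, \end{cases} \]
and let
\[ a_i^- = \begin{cases} a_i & \text{if } a_i \le 0, \\ 0 & \text{if } a_i > 0. \end{cases} \]
Let
\[ A_n^+ = \sum_{i=1}^{n} a_i^+, \qquad A_n^- = \sum_{i=1}^{n} a_i^-. \]
Then
\[ a_i = a_i^+ + a_i^- \]
for each $i$. Obviously $A_1^+, A_2^+, \ldots$ is an increasing sequence, and $A_1^-, A_2^-, \ldots$ is a
decreasing sequence. Let
\[ k = \sum_{i=1}^{\infty} |a_i|. \]
Then
\[ A_n^+ \le \sum_{i=1}^{n} |a_i|, \]
because $A_n^+$ is the sum of some (perhaps all) of the terms on the right-hand side.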
and $\sum_{i=1}^{\infty} a_i$ is convergent, which was to be proved. In fact, we can say a little more:
Find out which of the following series are alternating, which are convergent, and which
are absolutely convergent.
"' .
. l
3. '
£.. ( -1)' -. -
i=l i=l"' �·2 i=l I + I
"' 1 1-
4. I <-1)i -:{I s. I 2
i=l"' l + i
i=l
l
"'
i1 -
i=lI c ni1
I
9.
00
:L t)i ' (i 1)
1.
i=lI c-2)�)2i-:-i i=2 c
-
s. �
-
(- (
l. I
rri) � 00
10.
i=lf e
11.
i=lI ::.!_2
sin + cos
2 l
12. I
i=l
c-i)-i
\[ A = \lim_{n\to\infty} A_n = \sum_{i=1}^{\infty} a_i. \]
The approximation An � A is used in some of the most important applications, and
in all applications that use computers. As in all approximation processes, we are
better off if we can set a limit on the error. We shall now find ways to do this.
Given that $\sum_{i=1}^{\infty} a_i$ converges to a sum $A$, let $R_n = A - A_n$. Then
\[ R_n = \sum_{i=n+1}^{\infty} a_i, \]
and obviously
\[ \lim_{n\to\infty} R_n = 0. \]
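For alternating series, of the type treated in Theorem 2 of the preceding section,
it is easy to get an estimate of $R_n$. Let the series be
\[ \sum_{i=1}^{\infty} a_i = \sum_{i=1}^{\infty} (-1)^{i+1} b_i = b_1 - b_2 + b_3 - \cdots, \]
where $b_i = |a_i|$. Then
\[ R_n = \sum_{i=n+1}^{\infty} a_i = \sum_{i=n+1}^{\infty} (-1)^{i+1} b_i. \]
If $n$ is even, then
\[ R_n = (b_{n+1} - b_{n+2}) + (b_{n+3} - b_{n+4}) + \cdots \ge 0, \]
\[ R_n = b_{n+1} - (b_{n+2} - b_{n+3}) - (b_{n+4} - b_{n+5}) - \cdots \le b_{n+1}. \]
Therefore
\[ 0 \le R_n \le b_{n+1}. \]
If $n$ is odd, then
\[ R_n = -(b_{n+1} - b_{n+2}) - (b_{n+3} - b_{n+4}) - \cdots \le 0, \]
\[ R_n = -b_{n+1} + (b_{n+2} - b_{n+3}) + (b_{n+4} - b_{n+5}) + \cdots \ge -b_{n+1}. \]
Therefore
\[ -b_{n+1} \le R_n \le 0. \]
Theorem 1. Given $\sum_{i=1}^{\infty} a_i$. If (1) the series is alternating, (2) $\lim_{n\to\infty} a_n = 0$, and
(3) the sequence $|a_1|, |a_2|, \ldots$ is decreasing, then
\[ |R_n| \le |a_{n+1}|. \]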
That is, when you stop after a finite number of terms, the error is numerically no
larger than the first term that you omit. For example, take
\[ \sum_{i=1}^{\infty} (-1)^{i+1} \frac{1}{i^2} = 1 - \frac{1}{2^2} + \frac{1}{3^2} - \cdots. \]
By the alternating series test, this series converges. Let $A$ be its sum. Then
\[ A \approx 1 - \frac{1}{2^2} + \frac{1}{3^2} - \cdots + \frac{1}{9^2}, \]
and the error in the approximation is $\le 1/10^2 = 0.01$. This series does not converge
very fast. Next consider
\[ \sum_{i=0}^{\infty} (-1)^i \frac{1}{i!} = 1 - 1 + \frac{1}{2!} - \frac{1}{3!} + \cdots. \]
This series converges to a sum $A$. (It will turn out that $A = 1/e$.) We have
\[ A \approx \frac{1}{2!} - \frac{1}{3!} + \cdots + \frac{1}{10!}, \]
and the error is less than $1/11!$. This series converges very rapidly.
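The two error estimates can be checked numerically. This is only a sketch under stated assumptions: the helper `alt_partial` is a hypothetical name, and the exact sum $\pi^2/12$ of the first series is a known value assumed here for comparison (the text does not give it); the value $1/e$ for the second series is the one quoted above.

```python
import math

def alt_partial(term, start, stop):
    """Sum term(i) for i = start, ..., stop (inclusive)."""
    return sum(term(i) for i in range(start, stop + 1))

# First series: (-1)^(i+1) / i^2, stopped after the term 1/9^2.
approx_sq = alt_partial(lambda i: (-1) ** (i + 1) / i ** 2, 1, 9)
err_sq = abs(math.pi ** 2 / 12 - approx_sq)      # pi^2/12 is an assumed known sum

# Second series: (-1)^i / i!, stopped after the term 1/10!.
approx_fact = alt_partial(lambda i: (-1) ** i / math.factorial(i), 0, 10)
err_fact = abs(math.exp(-1) - approx_fact)

print(err_sq, 1 / 10 ** 2)                 # error vs. first omitted term
print(err_fact, 1 / math.factorial(11))    # error vs. first omitted term
```

In both cases the observed error should sit below the first omitted term, as Theorem 1 predicts; the contrast between the two bounds illustrates slow versus rapid convergence.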
If you reexamine the proof of Theorem 1, you will see that the method that we
used to get an estimate of the error was very much like the method that we used to
establish convergence in the first place, in the proof of the alternating series test.
This happens most of the time: that is, a proof of convergence usually gives an
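estimate of $R_n$. Consider, for example,
\[ \sum_{i=1}^{\infty} \frac{1}{i^2}. \]
We let
\[ A_n = \sum_{i=1}^{n} \frac{1}{i^2}, \]
and we observe that the sequence $A_1, A_2, \ldots$ is increasing. To show that it is bounded,
we compare the sum with an integral:
\[ A_n = \sum_{i=1}^{n} \frac{1}{i^2} < 1 + \int_1^n \frac{dx}{x^2}. \]
In the same way,
\[ R_n = \sum_{i=n+1}^{\infty} \frac{1}{i^2} < \int_n^{\infty} \frac{dx}{x^2}. \]
Since
\[ \int_n^{\infty} \frac{dx}{x^2} = \frac{1}{n}, \]
we conclude that
\[ R_n < \frac{1}{n} \]
for every $n$.
This is nowhere nearly so small as the estimate of error for the corresponding alternating
series. In fact, the positive series $\sum (1/i^2)$ converges very slowly.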
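A quick numerical check of the estimate $R_n < 1/n$ may be helpful. This sketch assumes the known value $\pi^2/6$ for the sum of $\sum 1/i^2$, which the text does not state; the function name `remainder` is hypothetical.

```python
import math

def remainder(n):
    """Return R_n = A - A_n for the series sum 1/i^2, taking A = pi^2/6 (assumed)."""
    a_n = sum(1 / i ** 2 for i in range(1, n + 1))
    return math.pi ** 2 / 6 - a_n

for n in (1, 10, 100, 1000):
    # The remainder stays below 1/n but is not much smaller, so convergence is slow.
    print(n, remainder(n), 1 / n)
```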
Similarly, Theorem 4 of Section 10.3 gives an estimate of the error for series
which are absolutely convergent.
Let
\[ R_n = \sum_{i=n+1}^{\infty} a_i. \]
Then
\[ |R_n| \le \sum_{i=n+1}^{\infty} |a_i|. \]
That is, the error in $\sum a_i$ is numerically no greater than the error in $\sum |a_i|$. To
prove this, we apply Theorem 4 to the series
\[ \sum_{i=n+1}^{\infty} a_i. \]
If we use the comparison theorem of Section 10.2, to establish the convergence
of a positive series, then any estimate of the remainder of the larger series
automatically is an estimate of the remainder of the smaller one. For example, we have
found that
\[ \sum_{i=1}^{\infty} \frac{1}{i^2} < \infty, \]
with $R_n < 1/n$ for every $n$. Since $0 < 1/(i^2 + 1) < 1/i^2$ for every $i$, it follows by the
comparison theorem that
\[ \sum_{i=1}^{\infty} \frac{1}{i^2 + 1} < \infty. \]
It also follows, for the remainder $R'_n$ in the new series, that
\[ R'_n < R_n, \]
and so
\[ R'_n < \frac{1}{n}. \]
This scheme always works, whenever we establish convergence by means of the
comparison theorem.
Each of the following series is convergent. In each case, get an estimate of the remainder
$R_n$, in the form $|R_n| \le \cdots$
i
I. i; (� ym
i=l 3}
2. i: (- �4)
i=l
i; co� 7Ti
00
3. I ( -l)i7T-i 4.
i=l i=l l
00
sin2 (2i - 1) 00 1
5. L 6.
i=l i2 i�l f3
00 ( - l)i+l
7. I 8.
i=l i-4
w ( - l)i w
1
9. L
i=l
- i0.9
10. L -:---
i=2 12.
1 n 1
10.5 Termwise Integration of Series. Power Series for Tan-1 and In 453
1 1.
co 1
12.
co
2
( --rl
i O (i!)2
� i=l e
co 1 co 1
13. 14.
�1 i(i + 1) i 2 i2(1 + i)
�
co 1 co
for each $x$ on $[a, b]$. Questions: (1) Does it follow that $f$ is continuous? (2) If $f$ is known
to be continuous, does it follow that
\[ \lim_{n\to\infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx \; ? \]
*18. Consider $\sum_{i=1}^{\infty} (-1)^{i+1} (1/i)$.
Show that by writing the terms of this series in a different
order (using each term once and only once) you can get a series $\sum a_i$ whose sum is 10.
19. Now reexamine your solutions of Problems 1 through 16. If you used any method other
than Theorem 1, in estimating the remainder in an alternating series, try using Theorem
1, and compare the new estimate with the old one. (The alternating series test usually
gives a good estimate, in the cases where it applies at all.)
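If a given series is convergent, for every $x$ on an open interval $(-r, r)$, then the series
defines a function $f$, on the same interval, and we write
\[ f(x) = \sum_{i=0}^{\infty} a_i x^i \qquad (-r < x < r). \]
The following theorem is fundamental:
Theorem A. Given
\[ f(x) = \sum_{i=0}^{\infty} a_i x^i \qquad (-r < x < r). \]
Then $f$ is continuous and differentiable on $(-r, r)$, and the derivative of the sum is
the sum of the derivatives. That is,
\[ f'(x) = \sum_{i=1}^{\infty} i a_i x^{i-1} \qquad (-r < x < r). \]
Theorem B. Given
\[ f(x) = \sum_{i=0}^{\infty} a_i x^i \qquad (-r < x < r). \]
Then, for every $x$ on $(-r, r)$,
\[ \int_0^x f(t)\,dt = \sum_{i=0}^{\infty} \int_0^x a_i t^i\,dt = \sum_{i=0}^{\infty} \frac{a_i}{i+1} x^{i+1}. \]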
As you might expect, the proofs are hard; they will be postponed until the end of
this chapter. But the theorems are easy to apply, and Theorem B gives the best method
of finding series for many functions. The method is as follows.
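We know that
\[ 1 + x + x^2 + \cdots + x^n + \cdots = \frac{1}{1-x} \qquad (-1 < x < 1). \]
Writing this backwards, we can express the function $1/(1 - x)$ as a power series.
Replacing $x$ by $-x$, we get
\[ \frac{1}{1+x} = 1 - x + x^2 - \cdots + (-1)^n x^n + \cdots \qquad (-1 < x < 1); \]
and replacing $x$ by $t^2$ in this series, we get
\[ \frac{1}{1+t^2} = 1 - t^2 + t^4 - \cdots + (-1)^n t^{2n} + \cdots \qquad (-1 < t < 1). \]
Theorem B says that the series on the right can be integrated a term at a time. Thus
\[ \int_0^x \frac{dt}{1+t^2} = \int_0^x dt - \int_0^x t^2\,dt + \cdots + (-1)^n \int_0^x t^{2n}\,dt + \cdots \qquad (-1 < x < 1), \]
\[ \int_0^x \frac{dt}{1+t^2} = x - \frac{x^3}{3} + \cdots + (-1)^n \frac{x^{2n+1}}{2n+1} + \cdots
   = \sum_{i=0}^{\infty} (-1)^i \frac{x^{2i+1}}{2i+1} \qquad (-1 < x < 1). \]
Since the integral on the left is $\mathrm{Tan}^{-1} x$, we have:
Theorem 1.
\[ \mathrm{Tan}^{-1} x = \sum_{i=0}^{\infty} (-1)^i \frac{x^{2i+1}}{2i+1} \qquad (-1 < x < 1). \]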
Granted that Theorem B is true, there is no need to test the convergence of the
series on the right; Theorem B tells us not only that the series has a sum, but also that
its sum is Tan-1 x. Note that the series includes only terms of odd degree. This
could have been predicted, because Tan-1 is an odd function, with Tan-1 ( -x) =
-Tan-1 x for every x.
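The series of Theorem 1 is easy to try out numerically. The following is a minimal sketch; the function name `arctan_series` and the choice of 20 terms are illustrative assumptions, and `math.atan` is used only as a reference value.

```python
import math

def arctan_series(x, terms):
    """Partial sum of sum_{i>=0} (-1)^i x^(2i+1) / (2i+1), intended for |x| < 1."""
    return sum((-1) ** i * x ** (2 * i + 1) / (2 * i + 1) for i in range(terms))

x = 0.5
print(arctan_series(x, 20), math.atan(x))
# Only odd powers appear, so the partial sums are odd functions of x.
print(arctan_series(-x, 20), -arctan_series(x, 20))
```

Because the series is alternating with decreasing terms for $0 < x < 1$, the first omitted term bounds the error, which is why a modest number of terms already agrees closely with the library value at $x = 0.5$.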
The same method can be used to get a series for the natural logarithm.
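Theorem 2.
\[ \ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots \qquad (-1 < x < 1). \]
Proof. We know that
\[ \ln(1+x) = \int_0^x \frac{dt}{1+t} = \int_0^x dt - \int_0^x t\,dt + \cdots + (-1)^i \int_0^x t^i\,dt + \cdots \]
\[ = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots + (-1)^i \frac{x^{i+1}}{i+1} + \cdots \]
\[ = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{x^i}{i} \qquad (-1 < x < 1). \]
Note that this method cannot be used to calculate the integral from 0 to 2, because the
series for $1/(1 + t)$ converges only for $|t| < 1$.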
The method that we have just been using can be applied so as to give answers,
in the form of series, for problems which up to now we could not have solved.
Consider
\[ \int_0^{-1/2} \frac{dx}{1 + x^4}. \]
In Chapter 6, this would have been an impossible problem. But now we can solve it,
by expressing the integrand as a power series, and integrating a term at a time. In
the series for $1/(1 + x)$, we replace $x$ by $x^4$. This gives
\[ \frac{1}{1 + x^4} = 1 - x^4 + x^8 - \cdots + (-1)^i x^{4i} + \cdots = \sum_{i=0}^{\infty} (-1)^i x^{4i}. \]
Therefore
\[ \int_0^{-1/2} \frac{dx}{1 + x^4} = \sum_{i=0}^{\infty} \int_0^{-1/2} (-1)^i x^{4i}\,dx
   = -\frac{1}{2} - \frac{1}{5}\left(-\frac{1}{2}\right)^5 + \frac{1}{9}\left(-\frac{1}{2}\right)^9 - \cdots \]
\[ = -\frac{1}{2} + \frac{1}{5 \cdot 2^5} - \frac{1}{9 \cdot 2^9} + \cdots. \]
This is an alternating series; the terms diminish numerically, and approach 0 as
$n \to \infty$. Therefore, if we use the first three terms as an approximation of the integral,
the error $E$ is numerically no larger than the fourth term, $1/(13 \cdot 2^{13})$,
which is quite small: $2^{10} = 1024$, $2^{13} = 8192$, and so $E < 10^{-5}$.
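The three-term approximation can be compared with a direct numerical integration. This is a sketch only: the composite Simpson rule below is an independent check chosen for illustration, not a method used in the text, and the names `three_term_approx` and `simpson` are hypothetical.

```python
def three_term_approx():
    """First three terms of the integrated series x - x^5/5 + x^9/9 at x = -1/2."""
    x = -0.5
    return x - x ** 5 / 5 + x ** 9 / 9

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with n (even) subintervals; works for b < a as well."""
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3

approx = three_term_approx()
reference = simpson(lambda x: 1 / (1 + x ** 4), 0.0, -0.5)
bound = 1 / (13 * 2 ** 13)   # the first omitted term
print(approx, reference, abs(approx - reference), bound)
```

The observed difference should fall below the first omitted term, in line with the alternating series estimate.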
1. Calculate $\mathrm{Tan}^{-1}\, 0.02$ to six decimal places, and explain how you know that the error in
your approximation is less than $5 \cdot 10^{-7}$.
2. Calculate
\[ \int_0^{1/10} \frac{dx}{1 + x^4} \]
to five decimal places, and explain how you know that the
error in your approximation is less than $5 \cdot 10^{-6}$.
3. Using the first term only, in the series for $\mathrm{Tan}^{-1}$, we get the approximation formula
\[ \mathrm{Tan}^{-1} x \approx x. \]
How might you explain and justify this approximation formula if you knew nothing
about infinite series?
4. Given
f(x) 1 + 6 •
x
c) Calculate numerically the sum of the first three terms of your series.
d) Get (by any method) an estimate of the error in the resulting approximation of the
integral.
5. Do the same four things, starting with $f(x) = 1/(1 + \sqrt{x})$, on the interval $[0, 0.49]$.
(Your infinite series will use powers of $\sqrt{x}$, but the same methods will apply, for the
same reasons.)
10.6 The Ratio Test for Absolute Convergence 457
6. Do the same four things, starting with $f(x) = 1/(1 + x)$, on the interval $[0, 0.2]$.
1
7. Do the same, starting with f (x) 5 2 , on the interval [O, 0.25].
1 + x 1
=
1
8. Do the same, starting with/(x) = 3, on [O, 1/2].
1 + x
9. Express in the form of a series:
rk [ f--=:_J
Jo i=oi + 1
dx (0 < k < 1).
10. Using the first term only, in the series for ln, we get the approximation formula
\[ \ln(1 + x) \approx x. \]
How might you explain and justify this formula, if you knew nothing about infinite
series?
11. Consider the function $f(x)$ defined by the series
\[ 1 + x + \frac{x^2}{2} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} + \cdots. \]
a) Express $f'(x)$ as a series.
b) Express $\int_0^x f(t)\,dt$ as a series.
The results that you get ought to enable you to guess what the function is.
*12. For each n, let
(0 � x � 1).
rl
)o
f(x) dx = J1 0
[limfn(x)] dx.
n-co
*13. Your answers in Problem 12 suggest that the functions $f_n$ behave rather peculiarly.
Investigate as follows:
a) For each $n$, let $\bar{x}_n$ be the point at which $f_n$ takes on its maximum value. Get a formula
for $\bar{x}_n$, and find $\lim_{n\to\infty} \bar{x}_n$.
b) For each $n$, let $y_n = f_n(\bar{x}_n)$. Get a formula for $y_n$, and find $\lim_{n\to\infty} y_n$.
c) Draw a sketch showing what the graph of $f_n$ looks like for $n = 1$, $n = 2$, and
$n \approx \infty$. Your sketch will throw some light on the results that you got in Problem 12.
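Consider a series $\sum a_i$, in which the terms may be positive or negative, but not equal
to 0. For each $i$, let
\[ r_i = \left| \frac{a_{i+1}}{a_i} \right|, \]
so that
\[ |a_{i+1}| = r_i |a_i|. \]
Theorem 1 (The ratio test). If $\lim_{i\to\infty} r_i = r < 1$, then $\sum a_i$ is absolutely convergent.
Proof. Let $s$ be any number such that $r < s < 1$. Then there is an $N$ such that
\[ i \ge N \Rightarrow r_i < s. \]
[Figure: a number line showing $0 < r < s < 1$, with $r_i$ to the left of $s$ for $i \ge N$.]
Since $|a_{N+1}| = r_N |a_N|$,
it follows that
\[ |a_{N+1}| < |a_N| s. \]
By induction,
\[ |a_{N+j}| < |a_N| s^j \quad \text{for every } j. \]
Therefore
\[ \sum_{i=0}^{\infty} |a_{N+i}| \le \sum_{i=0}^{\infty} |a_N| s^i = |a_N| \sum_{i=0}^{\infty} s^i = |a_N| \frac{1}{1 - s} < \infty. \]
It follows that
\[ \sum_{i=N}^{\infty} |a_i| < \infty, \]
and so
\[ \sum_{i=0}^{\infty} |a_i| = \sum_{i=0}^{N-1} |a_i| + \sum_{i=N}^{\infty} |a_i| < \infty. \]
What we are really using here is a comparison test between the series $\sum |a_i|$ and
a geometric series; the comparison does not necessarily work for the first few terms,
but it does start working after a certain point; and this is good enough to tell us what
we want to know.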
For example, consider the series $\sum (1/i!)$. Here
\[ a_i = \frac{1}{i!}, \qquad r_i = \frac{a_{i+1}}{a_i} = \frac{i!}{(i+1)!} = \frac{1}{i+1}, \]
and so $\lim_{i\to\infty} r_i = 0$. It follows that the series converges.
There are simple cases in which a series converges, but in which convergence
cannot be established by the ratio test. Consider
\[ \sum_{i=1}^{\infty} \frac{1}{i^2}. \]
Here
\[ r_i = \frac{a_{i+1}}{a_i} = \frac{i^2}{(i+1)^2}. \]
Therefore, while $r_i < 1$ for each $i$, we have
\[ \lim_{i\to\infty} r_i = \lim_{i\to\infty} \frac{1}{[1 + (1/i)]^2} = 1, \]
and so the ratio test does not apply. And Theorem 1 cannot be generalized to take
care of these cases, because if $r_i \to 1$ it may easily happen that the series diverges.
This happens for
\[ \sum_{i=1}^{\infty} \frac{1}{i} = \infty, \qquad r_i = \frac{i}{i+1} \to 1. \]
An even simpler case is
\[ \sum_{i=0}^{\infty} (-1)^i = 1 - 1 + 1 - 1 + \cdots. \]
Here
\[ r_i = |(-1)^{i+1}/(-1)^i| = 1 \quad \text{for every } i, \]
and the series diverges.
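The ratios $r_i$ for the examples above can be tabulated directly. This is a small illustrative sketch; the helper name `ratios` and the sample indices are assumptions made here.

```python
import math

def ratios(term, indices):
    """Return r_i = |a_{i+1} / a_i| for each i in indices."""
    return [abs(term(i + 1) / term(i)) for i in indices]

r_fact = ratios(lambda i: 1 / math.factorial(i), [1, 10, 100])   # a_i = 1/i!
r_square = ratios(lambda i: 1 / i ** 2, [1, 10, 100, 1000])      # a_i = 1/i^2
r_harmonic = ratios(lambda i: 1 / i, [1, 10, 100, 1000])         # a_i = 1/i

print(r_fact)      # tends to 0: the ratio test gives convergence
print(r_square)    # tends to 1: the test is silent, yet the series converges
print(r_harmonic)  # tends to 1: the test is silent, and the series diverges
```

The last two lists both creep up toward 1, which is why the limit of the ratios alone cannot distinguish the convergent series from the divergent one.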
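Theorem 2. Given a power series $\sum a_i x^i$, with $a_i \ne 0$ for every $i$. Suppose that
\[ \lim_{i\to\infty} \left| \frac{a_{i+1}}{a_i} \right| = L. \]
If $L = 0$, then the series is absolutely convergent for every $x$. If $L > 0$, then the
series is absolutely convergent for
\[ |x| < 1/L. \]
Proof. For $x \ne 0$, the ratio of successive terms of the series is
\[ r_i = \left| \frac{a_{i+1} x^{i+1}}{a_i x^i} \right| = |x| \cdot \left| \frac{a_{i+1}}{a_i} \right|. \]
Therefore
\[ \lim_{i\to\infty} r_i = |x| \cdot L. \]
If $L = 0$, then $r_i \to 0$, no matter what $x$ may be. If $L > 0$, then $\lim_{i\to\infty} r_i < 1$
whenever $|x| < 1/L$. In either case, the series converges absolutely.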
By the first half of Theorem 2, we conclude that
converges absolutely for every $x$. By the second half of Theorem 2, we see that
converges absolutely for lxl < 1. In each of these cases, the sum of the coefficients
forms a convergent series. But the theorem also applies in cases where the sum of the
coefficients diverges. Consider
\[ \sum_{i=1}^{\infty} i \pi^i x^i. \]
Here
\[ \lim_{i\to\infty} \left| \frac{a_{i+1}}{a_i} \right| = \lim_{i\to\infty} \frac{(i+1)\pi^{i+1}}{i \pi^i} = \pi. \]
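Suppose now that $\lim_{i\to\infty} r_i > 1$. Then there is an $N$ such that
\[ i \ge N \Rightarrow r_i > 1. \]
Therefore $|a_{i+1}| > |a_i|$ for $i \ge N$, and so after a certain point the sequence $|a_1|,
|a_2|, \ldots$ becomes an increasing sequence. Therefore $a_i$ cannot approach 0. This
observation enables us to add something to the conclusion of Theorem 2.
Theorem 3. Given a power series $\sum a_i x^i$, suppose that
\[ \lim_{i\to\infty} \left| \frac{a_{i+1}}{a_i} \right| = L. \]
If $L = 0$, then the series converges absolutely for every $x$. If $L > 0$, then the series
converges absolutely for $|x| < 1/L$ and diverges for $|x| > 1/L$.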
This theorem can be adapted to take care of cases in which some terms of the
series are equal to 0. For example,
\[ \sum_{i=0}^{\infty} (-1)^i x^{2i}. \]
Setting $x^2 = y$, we get
\[ \sum_{i=0}^{\infty} (-1)^i y^i, \]
which converges absolutely for $|y| < 1$ and diverges for $|y| > 1$. Therefore the given
series converges absolutely for $|x| < 1$ and diverges for $|x| > 1$. Similarly,
\[ \sum_{i=0}^{\infty} \frac{x^{2i+1}}{2^i} = x \sum_{i=0}^{\infty} \frac{x^{2i}}{2^i}. \]
Here
\[ \lim_{i\to\infty} \left| \frac{a_{i+1}}{a_i} \right| = \frac{1}{2}. \]
Therefore the series converges absolutely for $x^2 < 2$ and diverges for $x^2 > 2$.
Some more observations about Theorem 3 are in order.
1) The theorem applies only to the case in which $|a_{i+1}/a_i|$ approaches a limit. This
usually happens for series which are describable by simple formulas. But for series
in general it should be regarded as a remarkable accident. Suppose, for example, that
we start with
\[ \sum_{i=0}^{\infty} x^i = 1 + x + x^2 + \cdots. \]
Here $a_i = 1$ for every $i$, and so $r_i = 1$ for every $i$. We now divide $x^i$ by $i!$ for every
even $i$. This gives
\[ \sum_{i=0}^{\infty} b_i x^i = 1 + x + \frac{x^2}{2!} + x^3 + \frac{x^4}{4!} + \cdots. \]
The series still converges, for $|x| < 1$, but the ratio approaches no limit at all.
2) The theorem tells us that the series converges everywhere on the open interval
$(-1/L, 1/L)$, but it tells us nothing about what happens at the endpoints of the
interval. In fact, at the endpoints anything can happen. For example, $\sum_{i=1}^{\infty} (x^i/i^2)$
converges on $(-1, 1)$, and converges at both the endpoints. The series $\sum_{i=1}^{\infty} i x^i$
converges on the same interval, but converges at neither of the endpoints. The series
$\sum_{i=1}^{\infty} (x^i/i)$ converges on $(-1, 1)$, and converges at $x = -1$, but diverges at $x = 1$.
The series $\sum_{i=1}^{\infty} (-1)^i (x^i/i)$ converges on $(-1, 1)$, and converges at $x = 1$, but
diverges at $x = -1$. For this reason, to tell where the series converges, we have to
make separate tests at the endpoints.
3) Obviously every power series $\sum a_i x^i$ converges for $x = 0$; the sum is $a_0$. But
sometimes 0 is the only value of $x$ that gives convergence. Consider $\sum_{i=0}^{\infty} i!\,x^i$. For
every $x \ne 0$, we have
\[ r_i = |(i+1)!\,x^{i+1} / i!\,x^i| = (i+1)|x| \to \infty. \]
Therefore the series converges only for $x = 0$.
4) Finally, the results that we have been getting for power series suggest a conjecture.
In every case that we have investigated, the domain of convergence of $\sum a_i x^i$ has
turned out to be of one of the following types:
The question arises whether these are the only possibilities. For example, is there
a series $\sum a_i x^i$ whose domain of convergence is an interval whose midpoint is not the
origin? We shall see, as the theory develops, that the domain of convergence of
$\sum a_i x^i$ is always a set of one of the forms
( - oo, oo), (-a, a), [-a, a) , (-a, a], [-a, a], {O}.
For each of the following series, find the domain of convergence, remembering, of course,
to test the endpoints.
ro ro ro
1. _Li2xi 2. _Li3xi 3. _Li2x2i-1
i=l i=1 i=l
ro oo xi x2i
ro
7.
;� v2i + 1 8. I -= 9. .L(3i)3xi
i= Vi - I
2 i=l
00 ro co
00 00 ro
19. I (-l)i�
i=l i2 + 1
20. $\sum_{i=1}^{\infty} i(2x - 1)^i$ (Does the answer to this one contradict Theorem 3?)
21. $\sum_{i=1}^{\infty} \frac{e^i}{i^3} (x - 4)^i$ (Same query as for Problem 20.)
22. $\sum_{i=1}^{\infty} \frac{(x - 2)^i}{i}$ (Same query as for Problem 20.)
23. Show that $\sum_{i=1}^{\infty} (\sin i) x^i$ is absolutely convergent when $|x| < 1$.
*25. Show that there are infinitely many integers $i$ for which $\sin i > \tfrac{1}{2}$.
*26. Show that $\sum_{i=1}^{\infty} (\sin i) x^i$ is divergent when $|x| > 1$.
(The results of Problems 23 and 26 show that for this very irregular series, the domain
of convergence is still of one of the types described by Theorem 3.)
*27. You may have noticed that the number 1 has come up very often as an endpoint of our
domains of convergence. The following theorem helps to account for this:
Theorem. Let $p(i)$ and $q(i)$ be polynomials in $i$, of any degree, with $q(i)$ never equal to 0.
If $a_i = p(i)/q(i)$, then
\[ \sum a_i x^i \]
converges absolutely for $|x| < 1$, and diverges for $|x| > 1$.
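Theorem A of Section 10.5 asserts that power series can be differentiated a term at a
time. That is, if
\[ f(x) = \sum_{i=0}^{\infty} a_i x^i, \]
then
\[ f'(x) = \sum_{i=1}^{\infty} i a_i x^{i-1}. \]
Suppose now that $e^x$ can be expressed as a power series:
\[ e^x = f(x) = \sum_{i=0}^{\infty} a_i x^i \]
for some sequence of coefficients $a_0, a_1, \ldots$. On any open interval where this works,
we have
\[ f'(x) = \sum_{i=1}^{\infty} i a_i x^{i-1} = a_1 + 2a_2 x + \cdots + i a_i x^{i-1} + (i+1) a_{i+1} x^i + \cdots. \]
It must be true that $f'(x) = f(x)$, and $f(0) = 1$; and so we want to find a sequence
of coefficients $a_0, a_1, a_2, \ldots$ which gives these results for the series. This is easy: we
want
\[ (i+1) a_{i+1} = a_i \quad \text{for every } i, \]
which gives $f'(x) = f(x)$; and we want $a_0 = 1$, which gives $f(0) = 1$. Thus
\[ a_i = 1/i!. \]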
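This can be checked by induction. For $i = 0, 1, 2$, the formula $a_i = 1/i!$ holds true.
And
\[ a_i = \frac{1}{i!} \Rightarrow a_{i+1} = \frac{a_i}{i+1} = \frac{1}{(i+1)\,i!} = \frac{1}{(i+1)!}. \]
Thus, if $e^x$ has a power series expansion at all, the expansion must be
\[ e^x = \sum_{i=0}^{\infty} \frac{x^i}{i!}. \qquad (1) \]
This is not yet a proof of Eq. (1),
because we started off with an unproved assumption that $e^x$ had some power series
expansion. But now that we know what series to examine, it is very easy to show that
Eq. (1) holds. By the ratio test, the series on the right-hand side converges for every $x$.
It therefore defines a function $g$. Thus
\[ g(x) = \sum_{i=0}^{\infty} \frac{x^i}{i!} \qquad (-\infty < x < \infty). \]
Differentiating a term at a time, we get $g'(x) = g(x)$; and $g(0) = 1$. Let $\phi(x) = g(x)/e^x$. Then
\[ \phi'(x) = \frac{e^x g'(x) - g(x) e^x}{e^{2x}} = \frac{1}{e^x} [g'(x) - g(x)] = 0. \]
Therefore $\phi$ is a constant, and $\phi(x) = \phi(0)$ for every $x$. But $\phi(0) = 1$. Therefore
$g(x) = e^x$ for every $x$, and Eq. (1) holds.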
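The recursion $a_{i+1} = a_i/(i+1)$ gives a convenient way to sum the exponential series without computing factorials. The following is a minimal sketch; the function name `exp_series` and the term counts are illustrative assumptions, and `math.exp` serves only as a reference.

```python
import math

def exp_series(x, terms):
    """Partial sum of sum_{i>=0} x^i / i!, built from the recursion a_{i+1} = a_i / (i+1)."""
    total = 0.0
    term = 1.0                 # a_0 x^0 = 1
    for i in range(terms):
        total += term
        term *= x / (i + 1)    # a_{i+1} x^{i+1} = (a_i x^i) * x / (i+1)
    return total

print(exp_series(1.0, 20), math.e)
print(exp_series(-2.0, 40), math.exp(-2.0))
```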
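Now let $f(x) = \sin x$. Then $f'(x) = \cos x$ and $f''(x) = -\sin x$. Therefore
\[ f(0) = 0, \qquad f'(0) = 1, \]
and
\[ f''(x) = -f(x) \quad \text{for every } x. \]
Thus if
\[ \sin x = \sum_{i=0}^{\infty} a_i x^i = a_0 + a_1 x + \cdots, \]
then
\[ f'(x) = \sum_{i=1}^{\infty} i a_i x^{i-1}, \]
\[ f''(x) = \sum_{i=2}^{\infty} i(i-1) a_i x^{i-2} = 2a_2 + 3 \cdot 2 a_3 x + \cdots + i(i-1) a_i x^{i-2}
   + (i+1) i\, a_{i+1} x^{i-1} + (i+2)(i+1) a_{i+2} x^i + \cdots. \]
To get $f'' = -f$, we want
\[ (i+2)(i+1) a_{i+2} = -a_i \quad \text{for every } i; \]
and to get $f(0) = 0$ and $f'(0) = 1$, we want $a_0 = 0$ and
\[ a_1 = 1. \]
Then $a_i = 0$ for every even $i$, and
\[ a_3 = -\frac{a_1}{3 \cdot 2} = -\frac{1}{3!}, \]
and in general
\[ a_{2i+1} = \frac{(-1)^i}{(2i+1)!}. \]
To check this by induction, we note that
\[ a_{2i+1} = \frac{(-1)^i}{(2i+1)!} \Rightarrow a_{(2i+1)+2} = -\frac{a_{2i+1}}{[(2i+1)+1] \cdot [(2i+1)+2]} \]
\[ \Rightarrow a_{2(i+1)+1} = (-1)^i (-1) \cdot \frac{1}{(2i+1)!} \cdot \frac{1}{(2i+2)(2i+3)} = \frac{(-1)^{i+1}}{(2(i+1)+1)!}. \]
Therefore, if there is a series for the sine, the series must have the form
\[ g(x) = \sum_{i=0}^{\infty} (-1)^i \frac{x^{2i+1}}{(2i+1)!}. \]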
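By the ratio test, this series converges for every $x$. Let $h = g'$. Differentiating a term at a time, we find that
1) $g' = h$,
2) $h' = -g$,
3) $g(0) = 0$,
4) $h(0) = 1$.
It ought to be true that
\[ g(x) = \sin x, \qquad h(x) = \cos x. \]
If so, the function
\[ \phi(x) = [g(x) - \sin x]^2 + [h(x) - \cos x]^2 \]
must be equal to 0 for every $x$. And conversely, if $\phi(x) = 0$ for every $x$, it follows
that $g(x) = \sin x$ and $h(x) = \cos x$. Now
\[ \phi'(x) = 2[g(x) - \sin x][g'(x) - \cos x] + 2[h(x) - \cos x][h'(x) + \sin x] \]
\[ = 2[g(x) - \sin x][h(x) - \cos x] + 2[h(x) - \cos x][-g(x) + \sin x] \]
\[ = 0 \quad \text{for every } x. \]
Therefore $\phi$ is a constant. Since $\phi(0) = [0 - 0]^2 + [1 - 1]^2 = 0$, we have $\phi(x) = 0$ for every $x$.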
Theorem 3.
\[ \sin x = \sum_{i=0}^{\infty} (-1)^i \frac{x^{2i+1}}{(2i+1)!} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots. \]
By differentiation,
\[ \cos x = \sum_{i=0}^{\infty} (-1)^i \frac{(2i+1) x^{2i}}{(2i+1)!} = \sum_{i=0}^{\infty} (-1)^i \frac{x^{2i}}{(2i)!}. \]
Thus:
Theorem 4.
\[ \cos x = \sum_{i=0}^{\infty} (-1)^i \frac{x^{2i}}{(2i)!} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots. \]
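Theorems 3 and 4 can be tried out with a few lines of code. This is a sketch only; the function names `sin_series` and `cos_series` and the number of terms are assumptions made for illustration, with `math.sin` and `math.cos` as reference values.

```python
import math

def sin_series(x, terms):
    """Partial sum of sum_{i>=0} (-1)^i x^(2i+1) / (2i+1)!."""
    return sum((-1) ** i * x ** (2 * i + 1) / math.factorial(2 * i + 1)
               for i in range(terms))

def cos_series(x, terms):
    """Partial sum of sum_{i>=0} (-1)^i x^(2i) / (2i)!."""
    return sum((-1) ** i * x ** (2 * i) / math.factorial(2 * i)
               for i in range(terms))

x = 1.0
print(sin_series(x, 10), math.sin(x))
print(cos_series(x, 10), math.cos(x))
# The identity sin^2 + cos^2 = 1 should hold approximately for the partial sums.
print(sin_series(x, 10) ** 2 + cos_series(x, 10) ** 2)
```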
Obviously, the series that we have been developing in this section can be used
for calculating the values of the corresponding functions. In fact, this is the way
people arrived at the values that you find in the tables of exp, sin, and cos. And the
series can be adapted, in simple ways, to handle a variety of related problems. For
example, consider
\[ \int_0^{0.5} e^{x^2}\,dx. \]
If we could get a simple formula for a function $F$ such that
\[ F'(x) = e^{x^2}, \]
then the integral could be expressed as $F(0.5) - F(0)$. There is no such simple formula.
But we can express such an $F$ as an infinite series, in the following way. We know that
\[ e^x = \sum_{i=0}^{\infty} \frac{x^i}{i!} = 1 + x + \frac{x^2}{2!} + \cdots. \]
Therefore
\[ e^{x^2} = \sum_{i=0}^{\infty} \frac{x^{2i}}{i!} = 1 + x^2 + \frac{x^4}{2!} + \cdots. \]
Integrating a term at a time, we get a function
\[ F(x) = x + \frac{x^3}{3} + \frac{x^5}{5 \cdot 2!} + \cdots = \sum_{i=0}^{\infty} \frac{x^{2i+1}}{i!\,(2i+1)}. \]
Evidently
\[ F(0) = 0, \quad \text{and} \quad F'(x) = e^{x^2}. \]
Therefore
\[ \int_0^x e^{t^2}\,dt = F(x), \]
and so, using the series for $F$, we can calculate $F(\tfrac{1}{2})$ approximately, with an error as
small as we please.
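The series for $F$ can be summed numerically and compared with a direct numerical integration. This is a sketch under stated assumptions: the composite Simpson rule is an independent check chosen here for illustration, not a method of the text, and the names `F_series` and `simpson` are hypothetical.

```python
import math

def F_series(x, terms):
    """Partial sum of sum_{i>=0} x^(2i+1) / (i! (2i+1))."""
    return sum(x ** (2 * i + 1) / (math.factorial(i) * (2 * i + 1))
               for i in range(terms))

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3

series_value = F_series(0.5, 10)
reference = simpson(lambda x: math.exp(x * x), 0.0, 0.5)
print(series_value, reference)
```

Since all terms of the series are positive at $x = 0.5$, the partial sums increase toward the value of the integral, and ten terms already agree with the reference to many decimal places.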
Find a series for each of the following functions. In each case, name the interval on
2.
2
Problem
'n-'x
;e 0
3 "'3 "'
C for x 0
l"' 3 t3
� for x
1
= t e =
(e"'
0
-
for x
x
wheref is as in Problem
t
= =
r 0
17. = = ; for
for x =
J as in Problem21.
o
=
after finding the series, find an elementary formula for such a function f
26. Find a series for a function $f$ such that (1) $f'(x) = 2f(x)/x$ for every $x \ne 0$ and (2)
$f(0) = 0$.
27. Is there only one function satisfying the conditions of Problem 26? Why or why not?
28. Get a formula for $D^i x^i$, where $D^i$ denotes the $i$th derivative of the function.
29. Get a formula for $D^i x^j$, valid for $i < j$.
30. Do the same, for the case $i > j$.
*31. Given $f(x) = \sum_{i=0}^{\infty} a_i x^i$. Get a formula for $f^{(i)}(0)$. (Here $f^{(i)}$ denotes the $i$th derivative
of $f$.)
*32. Is it possible that there are two different power series for the same function, valid on
the same open interval $I$? That is, given
\[ f(x) = \sum_{i=0}^{\infty} a_i x^i = \sum_{i=0}^{\infty} b_i x^i \quad \text{on } I, \]
\[ (a + b)^n = a^n + n a^{n-1} b + \frac{n(n-1)}{2} a^{n-2} b^2 + \cdots
   + \frac{n(n-1) \cdots (n-i+1)}{i!} a^{n-i} b^i + \cdots + b^n. \]
Here the coefficient of $a^{n-i} b^i$ can be written more briefly as
\[ \binom{n}{i} = \frac{n!}{i!\,(n-i)!} = \frac{n(n-1) \cdots (n-i+1)}{i!}. \]
The induction proof of the binomial theorem depends on the identity
\[ \binom{n}{i-1} + \binom{n}{i} = \binom{n+1}{i}. \]
You may have seen this proved. In any case, we shall not stop to prove it now,
because the elementary form of the binomial theorem is a corollary of a more general
result which we shall prove presently.
We would like to generalize the familiar binomial formula
\[ (a + b)^n = \sum_{i=0}^{n} \binom{n}{i} a^{n-i} b^i \]
to take care of the case in which $n$ is not an integer. That is, we want a formula for
$(a + b)^k$, where $k$ is any real number. The following observations are obvious:
1) For $k = 0$, we have $(a + b)^k = 1$, and our problem is solved. We may therefore
assume that
\[ k \ne 0. \]
2) For the case of interest, in which $k$ is not an integer, the exponential $c^k$ is defined
only for $c > 0$. (See Section 4.9.) Therefore we must assume that
\[ a + b > 0. \]
3) For $a = b$, the problem has an immediate solution: $(a + b)^k = (2a)^k = 2^k a^k$.
We therefore may assume hereafter that
\[ a \ne b. \]
And we want to assume this, because the case $a = b$ does not fit the pattern that is
going to emerge.
4) Since $a \ne b$, one of the two numbers is the larger, and we may as well assume that
\[ a > b. \]
We let $x = b/a$, so that
\[ a + b = a(1 + x), \quad \text{and} \quad (a + b)^k = a^k (1 + x)^k. \]
If we had $|x| = |b/a| \ge 1$, then either $b \ge a$ or $b \le -a$; and these possibilities are
ruled out by conditions (2) and (4). Therefore $|x| < 1$, and our problem takes the
following form:
Our past experience with sin, cos, and exp suggests that we should investigate
the relation between f(x) = (1 + x)k and its derivatives, and use the results in the
investigation of the series. Now
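\[ f'(x) = k(1 + x)^{k-1}, \quad \text{so that} \quad (1 + x) f'(x) = k f(x). \]
Suppose that $f(x) = \sum_{i=0}^{\infty} a_i x^i$. Then
\[ f'(x) = \sum_{i=1}^{\infty} i a_i x^{i-1} = \sum_{i=0}^{\infty} (i+1) a_{i+1} x^i. \]
Therefore
\[ x f'(x) = \sum_{i=1}^{\infty} i a_i x^i. \]
Comparing the coefficients of $x^i$ on the two sides of $(1 + x) f'(x) = k f(x)$, we get
$(i+1) a_{i+1} + i a_i = k a_i$, that is, $a_{i+1} = \frac{k-i}{i+1} a_i$. Since $a_0 = f(0) = 1$, this gives
$a_1 = k$, $a_2 = k(k-1)/2$,
\[ a_3 = \frac{k-2}{2+1} a_2 = \frac{k(k-1)(k-2)}{3!}, \]
and in general, for $i > 1$,
\[ a_i = \frac{k(k-1) \cdots (k-i+1)}{i!}. \]
We denote the fraction on the right by the symbol $\binom{k}{i}$, just as in the case where $k$ is a
positive integer. The above formula then takes the form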
That is, the series that we have found is the only series that might work. To know
that our series does work, we need the following two theorems.
Then
\[ r_i = \left| \frac{k(k-1) \cdots (k-i+1)(k-i)}{(i+1)!} \cdot \frac{i!}{k(k-1) \cdots (k-i+1)} \cdot x \right|
   = \left| \frac{k-i}{i+1}\, x \right|. \]
Evidently
\[ \lim_{i\to\infty} r_i = |x|. \]
Therefore, by the ratio test, the series converges absolutely for $|x| < 1$, and diverges
for $|x| > 1$.
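\[ \sum_{i=0}^{\infty} \binom{k}{i} x^i = (1 + x)^k \qquad (|x| < 1). \]
Proof. Let $g$ be the function which is the sum of the series. We determined the
coefficients in such a way that
1) $g(0) = 1$;
2) $(1 + x) g'(x) = k g(x)$.
Let
\[ \phi(x) = \frac{g(x)}{(1 + x)^k}. \]
Then
\[ \phi'(x) = \frac{(1 + x)^k g'(x) - g(x)\,k(1 + x)^{k-1}}{(1 + x)^{2k}}
   = \frac{(1 + x) g'(x) - k g(x)}{(1 + x)^{k+1}} = 0. \]
Therefore $\phi$ is a constant. Since $\phi(0) = g(0) = 1$, we have $g(x) = (1 + x)^k$ for $|x| < 1$,
which was to be proved.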
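The binomial series can be summed numerically for a non-integer exponent. This is a minimal sketch; the function names `binom_coeff` and `binomial_series`, and the test values of $k$ and $x$, are assumptions chosen for illustration.

```python
def binom_coeff(k, i):
    """Generalized binomial coefficient k(k-1)...(k-i+1) / i! for real k."""
    c = 1.0
    for j in range(i):
        c *= (k - j) / (j + 1)
    return c

def binomial_series(k, x, terms):
    """Partial sum of sum_{i>=0} C(k, i) x^i, intended for |x| < 1."""
    return sum(binom_coeff(k, i) * x ** i for i in range(terms))

# Square root: (1 + 0.21)^(1/2) = 1.1
print(binomial_series(0.5, 0.21, 40), 1.21 ** 0.5)
# For a positive integer k the coefficients vanish past i = k,
# and the series reduces to the elementary binomial theorem.
print([binom_coeff(3, i) for i in range(6)])
```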
1. Write a series for v'�, and find out how many terms of the series you would need
to use, to calculate v'u, correct to three decimal places.
- -
2. .a;
Do the same, for v x + 1.
3. Do the same, for V1 x + 1.
4. Do the same, for V2x + 1.
5. Let n be a positive integer. Using the definition
n!
(n - i)!i!'
<1 + xr = i ( � ) xi
i=O l
"""
(The first and last terms on the right-hand side require a separate discussion. But
note that $\binom{n}{0} = \binom{n+1}{0}$, because both are equal to 1; and similarly that
$\binom{n}{n} = \binom{n+1}{n+1} = 1$.)
Since obviously $(1 + x)^1 = \binom{1}{0} + \binom{1}{1} x^1$, this gives an induction proof of the elementary
form of the binomial theorem.
Find a series for each of the following functions, and discuss for convergence. You need
not test for convergence at the endpoints.
x2
8. f(x) 9
j(X) =
v'I + x . .I
v 1 +x
x
10. f(x) = xVI + x 11. f(x) .a;
1 + x2
=
x t
20. f(x) 21. f(x) = ( "' dt
V 1 + x2 J v'1 + t2
= --
22. Find a function $f$ such that (1) $(1 + x^2) f'(x) = f(x)$ and (2) $f(0) = 1$. Then show that
the function that you found is the only function satisfying conditions (1) and (2).
23. Same question, for the conditions (1) $f'(x) \sec x = f(x)$ and (2) $f(0) = 1$.
10.9 Taylor Series 473
\[ \sum_{i=0}^{\infty} (-1)^i x^i = \frac{1}{1 + x} \qquad (|x| < 1). \]
If we let $x' = 1 + x$, $x = x' - 1$, then the above equation takes the form
\[ \frac{1}{x'} = \sum_{i=0}^{\infty} (-1)^i (x' - 1)^i \qquad (|x' - 1| < 1). \]
The series on the right-hand side is called a Taylor series, or a Taylor expansion of
the function $f(x) = 1/x$ about the point 1. A power series $\sum a_i x^i$, of the type that we have
been discussing so far, is called a Maclaurin series. Thus every Maclaurin series is a
Taylor series:
\[ \sum_{i=0}^{\infty} a_i x^i = \sum_{i=0}^{\infty} a_i (x - 0)^i, \]
which is a Taylor series, with $a = 0$. In this language, we may say that $f(x) = 1/x$
has no Maclaurin series, but it does have a Taylor expansion about the point 1.
Similarly, $f(x) = \ln x$ cannot have a Maclaurin series, because at $x = 0$ the
function approaches $-\infty$. But we do have a series for
\[ g(x) = \ln(1 + x) = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{x^i}{i} \qquad (|x| < 1). \]
Replacing $1 + x$ by $x$, we get
\[ \ln x = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{(x - 1)^i}{i} \qquad (|x - 1| < 1). \]
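then $f$ is continuous and differentiable on the interval $(a - r, a + r)$, and the derivative
of the sum is the sum of the derivatives. That is,
\[ f'(x) = \sum_{i=1}^{\infty} i a_i (x - a)^{i-1}. \]
For a Maclaurin series $f(x) = \sum a_i x^i$, differentiating $n$ times and setting $x = 0$ gives $f^{(n)}(0) = n!\,a_n$,
so that
\[ a_n = \frac{f^{(n)}(0)}{n!}. \]
An analogous formula holds for Taylor series:
\[ a_n = \frac{f^{(n)}(a)}{n!} \quad \text{for every } n. \]
Proof. The $n$th derivative of a function described by a formula $\phi(x)$ will be denoted
by $D^n \phi(x)$. We observe that
\[ D^n f(x) = n!\,a_n + \sum_{i=n+1}^{\infty} b_i (x - a)^{i-n}. \]
We don't care what form the $b_i$'s have, because every term of the sum on the right-hand
side has $(x - a)$ raised to a positive power, and we are about to set $x = a$.
This gives
\[ f^{(n)}(a) = n!\,a_n, \]
so that
\[ a_n = \frac{f^{(n)}(a)}{n!}. \]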
We have found that for some functions the use of Taylor series in place of
Maclaurin series is a necessity. For example, $1/x$ and $\ln x$ don't have any Maclaurin
expansions. In other cases a Taylor series may be preferable, even though the
Maclaurin expansion exists. The point is that $\sum a_i x^i$ usually converges rapidly when $x$
is close to 0, and more slowly when $x$ is larger. To take an extreme case, we know
that $\sin 10{,}000\pi = 0$, because 10,000 is even. Therefore it must be true that
\[ \sum_{i=0}^{\infty} (-1)^i \frac{(10{,}000\pi)^{2i+1}}{(2i+1)!} = 0. \]
But in waiting for the partial sums to get close to 0, we had better not be impatient.
In general, if we want to use a series to calculate a function numerically, we should
choose the "base point" $a$ as close as possible to the value of $x$ that we want to
substitute. Suppose, for example, that we have calculated $\ln 1.5 = 0.4055$. One
way to calculate $\ln 1.6$ would be to take $x = 1.6$ in the series
\[ \ln x = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{(x - 1)^i}{i}
   = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{(0.6)^i}{i}. \]
A better way is to use the Taylor series about the point 1.5:
\[ \ln x = \sum_{i=0}^{\infty} a_i (x - 1.5)^i. \]
Here $f(x) = \ln x$, so that $f^{(i)}(x) = (-1)^{i+1} (i-1)!/x^i$ for $i \ge 1$, and
\[ f^{(0)}(1.5) = 0.4055. \]
Therefore
\[ \ln x = 0.4055 + \sum_{i=1}^{\infty} \frac{(-1)^{i+1}}{i} \left( \frac{x - 1.5}{1.5} \right)^i, \]
\[ \ln 1.6 = 0.4055 + \sum_{i=1}^{\infty} \frac{(-1)^{i+1}}{i} \left( \frac{1}{15} \right)^i. \]
In each case the coefficients are given by the formula
\[ a_i = \frac{f^{(i)}(a)}{i!}. \]
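The advantage of a nearby base point shows up clearly in a numerical comparison. The sketch below is illustrative only: the function names are hypothetical, and it uses `math.log(1.5)` for the constant term instead of the four-place value 0.4055 quoted in the text, so that the comparison isolates the truncation error of the series.

```python
import math

def ln_about_1(x, terms):
    """Partial sum of the Taylor series of ln x about the point 1."""
    return sum((-1) ** (i + 1) * (x - 1) ** i / i for i in range(1, terms + 1))

def ln_about_1_5(x, terms, ln_1_5=math.log(1.5)):
    """Partial sum of the Taylor series of ln x about the point 1.5."""
    return ln_1_5 + sum((-1) ** (i + 1) / i * ((x - 1.5) / 1.5) ** i
                        for i in range(1, terms + 1))

exact = math.log(1.6)
for terms in (2, 5, 10):
    # Error with base point 1 versus error with base point 1.5.
    print(terms, abs(ln_about_1(1.6, terms) - exact),
          abs(ln_about_1_5(1.6, terms) - exact))
```

With the same number of terms, the expansion about 1.5 works with powers of 1/15 where the expansion about 1 works with powers of 0.6, which accounts for the much smaller error.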
For some of the functions in the first twelve problems below, it is a practical proceeding
to derive a general formula for $f^{(n)}(a)$, and use the formula to calculate the coefficients $a_i$
in the series $\sum a_i (x - a)^i$. In each such case, calculate the coefficients by this method. In
cases where the derivation of the general formula seems unreasonably difficult, merely
calculate the first three terms of the series.
then
and
LC; = - oo.
*16. Let $n_1, n_2, \ldots$ be a sequence of positive integers in which each positive integer appears
exactly once. That is, the numbers $n_1, n_2, \ldots$ are the integers 1, 2, 3, ... arranged in
some order. For each series $\sum_{i=1}^{\infty} a_i$, we can then form a "rearranged series" $\sum_{i=1}^{\infty} a_{n_i}$,
in which the same terms appear in some order. The following theorem is a sort of
"commutative law of addition" for positive series.
10.10 Taylor's Theorem. Estimates of Remainders 477
Prove this.
has a rearrangement which converges to 0. (Thus the "commutative law for infinite
sums" does not hold in general.)
*18. Show that for every number k there is a rearrangement of the above series which
converges to k.
Using the formula, we can write down a series. But there are three questions which
it is natural to ask:
1) For what values of $x$ does the series converge? (We recall that $\mathrm{Tan}^{-1} x$ is defined
for every $x$, but its series converges only for $-1 \le x \le 1$.)
2) Does the series converge to the function $f$ that we started with?
3) If we use a partial sum
\[ S_n(x) = \sum_{i=0}^{n} \frac{f^{(i)}(a)}{i!} (x - a)^i \]
as an approximation of $f(x)$, what is the error? For this, we need an estimate of the
"remainder function"
\[ R_n(x) = f(x) - S_n(x) = f(x) - \sum_{i=0}^{n} \frac{f^{(i)}(a)}{i!} (x - a)^i. \]
Partial answers to these questions are given by the following theorem.
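The proof is artificial, and hard to remember. We regard $x$ as a constant; and for
each $t$ we let
\[ F(t) = f(x) - \sum_{i=0}^{n} \frac{f^{(i)}(t)}{i!} (x - t)^i. \]
Here we have simply replaced $a$ by $t$ in the formula for $R_n(x)$. For $t = x$ we have
\[ F(x) = f(x) - \frac{f^{(0)}(x)}{0!} = f(x) - f(x) = 0. \]
For $t = a$ we have $F(a) = R_n(x)$. Differentiating with respect to $t$, we get
\[ F'(t) = -f'(t) - [f''(t)(x - t) - f'(t)]
   - \left[ \frac{f'''(t)}{2!} (x - t)^2 - \frac{f''(t)}{2!} \cdot 2(x - t) \right] - \cdots
   - \left[ \frac{f^{(n+1)}(t)}{n!} (x - t)^n - \frac{f^{(n)}(t)}{n!} \cdot n(x - t)^{n-1} \right]. \]
Here all terms cancel out, telescopically, except the first term in the last bracket;
and so
\[ F'(t) = -\frac{f^{(n+1)}(t)}{n!} (x - t)^n. \]
Now let
\[ G(t) = \frac{(x - t)^{n+1}}{(n + 1)!}, \]
so that
\[ G'(t) = \frac{-(x - t)^n}{n!}. \]
To the functions $F$ and $G$, on the interval between $a$ and $x$, we apply the parametric
mean-value theorem. (This is Theorem 2 of Section 9.2.) Since $F(x) = 0$ and $G(x) = 0$, it gives
\[ \frac{-F(a)}{-G(a)} = \frac{F'(\bar{x})}{G'(\bar{x})} \]
for some $\bar{x}$ between $a$ and $x$.
And
\[ \frac{F'(\bar{x})}{G'(\bar{x})} = f^{(n+1)}(\bar{x}), \quad \text{so that} \quad \frac{F(a)}{G(a)} = f^{(n+1)}(\bar{x}). \]
By definition of $G(a)$ and $F(a)$, we have
\[ R_n(x) = F(a) = f^{(n+1)}(\bar{x})\,G(a) = \frac{f^{(n+1)}(\bar{x})}{(n + 1)!} (x - a)^{n+1}, \]
which was to be proved.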
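The remainder formula gives a usable error bound as soon as the $(n+1)$st derivative can be bounded. The following sketch tests this for $f(x) = e^x$ with base point $a = 0$, an example chosen here for illustration because every derivative of $e^x$ is $e^x$; the function names and the bound $M = e^{\max(x, 0)}$ are assumptions of the sketch, not formulas from the text.

```python
import math

def taylor_exp(x, n):
    """S_n(x): the Taylor polynomial of e^x about 0, through the term x^n / n!."""
    return sum(x ** i / math.factorial(i) for i in range(n + 1))

def remainder_bound(x, n):
    """Bound M |x|^(n+1) / (n+1)!, where M bounds e^t for t between 0 and x."""
    m = math.exp(max(x, 0.0))
    return m * abs(x) ** (n + 1) / math.factorial(n + 1)

for x in (1.0, -1.0, 2.0):
    for n in (3, 6, 10):
        actual = abs(math.exp(x) - taylor_exp(x, n))
        print(x, n, actual, remainder_bound(x, n))
```

In each row the actual error should not exceed the bound, and the bound itself tends to 0 as $n$ grows, which is how the theorem is used to show that a series converges to the expected function.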
In some cases we can use this theorem to prove that a formal power series converges
to the expected function. For example, we may be able to find a number M
such that
\[ |f^{(n+1)}(\bar x)| \le M \]
for every n and every $\bar x$ between a and x. In such a case it follows that $R_n(x) \to 0$,
and $f(x)$ is the sum of its formal Taylor series. Most of the time, however, estimates
of $(n+1)$st derivatives are hard to come by. For example, the calculation of $f^{(n)}(x)$
is unmanageable for the function
\[ f(x) = \frac{1}{1 + x^3}, \]
even though we can easily see what the Maclaurin series is:
\[ f^{(3i)}(0) = (-1)^i (3i)!, \]
and that $f^{(n)}(0) = 0$ if n is not divisible by 3. But this does not give us any information
about $f^{(n)}(x)$ for other values of x.
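The remainder estimate can be checked numerically in a case where the bound M is easy to find. The sketch below assumes $f = \sin$, for which every derivative is bounded by M = 1; the function names and the sample point x = 2 are choices made here for illustration, not taken from the text.

```python
import math

def taylor_partial_sum(x, a, n):
    """S_n(x) for f = sin about a, using f^(i)(a) = sin(a + i*pi/2)."""
    return sum(math.sin(a + i * math.pi / 2) / math.factorial(i) * (x - a) ** i
               for i in range(n + 1))

def remainder_bound(x, a, n, M=1.0):
    """M |x - a|^(n+1) / (n+1)!, valid when |f^(n+1)| <= M between a and x."""
    return M * abs(x - a) ** (n + 1) / math.factorial(n + 1)

x, a = 2.0, 0.0
for n in (2, 5, 10, 15):
    error = abs(math.sin(x) - taylor_partial_sum(x, a, n))
    print(n, error, remainder_bound(x, a, n))
```

In each row the observed error should not exceed the bound, and both shrink rapidly as n grows, which is the behavior $R_n(x) \to 0$ described above.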
1 through 6. In at least six of the first twelve problems in Problem Set 10.9, it is easy to
get an estimate of $f^{(n)}(x)$, and then show by Taylor's Theorem that the series converges to
the given function. Identify these six cases, and carry out the process.
\[ z = a + bi, \]
where a and b are real numbers, and where i is some sort of number such that $i^2 = -1$.
Granted that there is such a number system, and that it obeys the same manipulative
rules as the real number system, the equation $i^2 = -1$ gives all that we need to
480 Infinite Series 10.11
$i^4 = (i^2)^2 = 1$;
$i^{10,001} = i$;
= -1.
Obviously $0 = 0 + 0i$ has no reciprocal. But if $a + bi \ne 0$, then $a + bi$ has
a reciprocal in the complex number system. To see this, note that if $a + bi \ne 0$,
\[ \frac{1}{a+bi} = \frac{1}{a+bi}\cdot\frac{a-bi}{a-bi} = \frac{a-bi}{a^2-(bi)^2} = \frac{a-bi}{a^2+b^2} = \frac{a}{a^2+b^2} + \frac{-b}{a^2+b^2}\,i = A + Bi. \]
This calculation begins with the assumption that $a + bi$ and $a - bi$ have reciprocals,
but once we know the answer, it is easy to check:
\[ (a+bi)(A+Bi) = (a+bi)\left(\frac{a}{a^2+b^2} + \frac{-b}{a^2+b^2}\,i\right) = \frac{(a+bi)(a-bi)}{a^2+b^2} = \frac{a^2+b^2}{a^2+b^2} = 1. \]
Therefore $A + Bi$ is the reciprocal of $a + bi$.
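The formula for A and B can be tried on a concrete number. This is a small sketch; the function name `reciprocal` and the sample value 3 + 4i are illustrative choices, and Python's built-in `complex` type is used only to check the product.

```python
def reciprocal(a, b):
    """Return (A, B) with 1/(a + bi) = A + Bi, for a + bi != 0."""
    d = a * a + b * b
    if d == 0:
        raise ZeroDivisionError("0 = 0 + 0i has no reciprocal")
    return a / d, -b / d

A, B = reciprocal(3.0, 4.0)
product = complex(3.0, 4.0) * complex(A, B)
print(A, B)       # real and imaginary parts of 1/(3 + 4i)
print(product)    # should be 1, up to rounding
```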
There are several ways to define the set of complex numbers, as a mathematical
system, and check their properties. One such method is explained in Appendix J.
Meanwhile we shall regard the complex numbers as known, and calculate with them,
using the familiar laws of algebra and the fact that $i^2 = -1$.
The conjugate of the complex number
\[ z = a + bi \]
is the number
\[ \bar z = a - bi. \]
10.11 The Complex Number System 481
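\[ \bar{\bar z} = z, \]
$z + \bar z$ is a real number,
Proofs.
\[ z = a + bi, \quad \bar z = a - bi, \quad \bar{\bar z} = a - (-b)i = a + bi = z; \]
\[ z + \bar z = a + bi + a - bi = 2a; \]
\[ z \cdot \bar z = (a + bi)(a - bi) = a^2 + b^2 = |z|^2; \]
\[ \bar z_1 + \bar z_2 = \overline{a_1 + b_1 i} + \overline{a_2 + b_2 i} = a_1 - b_1 i + a_2 - b_2 i = (a_1 + a_2) - (b_1 + b_2)i = \overline{z_1 + z_2}; \]
\[ z_1 z_2 = (a_1 + b_1 i)(a_2 + b_2 i) = a_1 a_2 - b_1 b_2 + (a_1 b_2 + a_2 b_1)i; \]
\[ \bar z_1 \bar z_2 = (a_1 - b_1 i)(a_2 - b_2 i) = a_1 a_2 - b_1 b_2 - (a_1 b_2 + a_2 b_1)i = \overline{z_1 z_2}. \]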
[Figure: the point $z = x + yi$ in the coordinate plane, with x and y marked on the axes.]
Thus real numbers $z = x$ fall on the x-axis; we shall think of this as the real axis.
And "pure imaginary numbers," of the form $z = iy$, fall on the y-axis; we shall think
of this as the imaginary axis. This explains the labels on the axes in the figure below.
[Figure: the point z and its conjugate $\bar z = x - yi$, with the real and imaginary axes labeled.]
Evidently $\bar z$ is the reflection of z across the x-axis. If you reflect twice, you get back
where you started; and this is the geometric meaning of the equation $\bar{\bar z} = z$. As the
figure suggests, $|z|$ is the distance to z from the origin; the reason is that
\[ |z| = \sqrt{x^2 + y^2}, \]
which gives the distance. More generally, $|z_1 - z_2|$ is the distance between $z_1$ and $z_2$.
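[Figure: two points $z_1$ and $z_2$ in the plane, with the distance between them.]
\[ \frac{1}{i} = -i, \quad \text{and} \quad \operatorname{Im} z = \frac{1}{2i}\,(z - \bar z). \]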
These formulas enable us to connect complex numbers with the geometry of our
coordinate plane. For example, the graph of the equation $|z| = 1$ is the circle with
center at the origin and radius 1; and the graph of the equation $|z - z_0| = a$ is the
circle with center at $z_0$ and radius a. The vertical line through the point $(1, 0) = 1 + 0 \cdot i$
is the graph of the equations
\[ x = 1 \iff \operatorname{Re} z = 1 \iff \tfrac12 (z + \bar z) = 1 \iff z + \bar z = 2. \]
In the following problem set, you will be asked to carry out a variety of such
processes. For short, we shall use the term C-equation to describe an equation in
which complex numbers are the only variables. Thus the vertical line discussed above
is the graph of the C-equation $z + \bar z = 2$; and a certain circle is the graph of the
C-equation $z\bar z = 4$.
I. (1 + i)4 2. (1 - i)4 3. v2
+
v2i
4. (-1 - _1 r (1 v3
- +-1 6. 3
v2 v2
i 5.
2 2 J G- �
1
J
1
v3
- +- 8. (�3 - �iJ 9. ( _1 +
(2 21
J v2 v2 J
- _1
7. i
19. 1 2i +
20. 2i 1 1
2 .
I + 3i
1
-
22.
i - 3i
23.
2 + v3 i
24. - 2 v3 i
25.
v3 + 2i
26. - v3
1
27.
i2 + i + 1
1
1 29. i2 1
2i
28.
i3 + j2 + i -1 - i + 1
30. 1)3
(i +
---
31. 11 +i 3i 33. 1 -- 3i
2i 1 +
2 + 2
32.
- i 2i
34. is + i4 + i3 + i +I
35.
i4 + i3 + i2 + i + I
36. Show that $|z^n| = |z|^n$, for every z.
37. Show that $\overline{z^n} = \bar z^{\,n}$, for every z.
38. Show that $1/\bar z = \overline{(1/z)}$, for every $z \ne 0$.
Sketch the graphs of the following C-equations.
39. Re z +Im z Re z - = I.
41. Re z =Im z. - 11 = I.
= I. 40. Im z
43. - -
42. I=
lz II < I. 44. lz II > I.
\[ (a + b)^n = \sum_{i=0}^{n} \binom{n}{i} a^{n-i} b^i \]
from a more general result, using the methods of calculus. But the methods of Section
10.8 do not, as they stand, apply in the complex domain. Show, by induction, that
\[ (u + v)^n = \sum_{i=0}^{n} \binom{n}{i} u^{n-i} v^i. \]
The second of these conditions also has a meaning if the $x_n$'s and x are complex
numbers, because the absolute values $|x_n - x|$ are real in any case. We use this idea
to define limits for sequences of complex numbers.
10.12 Sequences and Series of Complex Numbers. The Complex Exponential Function 485
\[ \lim_{n\to\infty} |z_n - z| = 0, \]
then
\[ \lim_{n\to\infty} z_n = z. \]
We can test for limits by examining the real and imaginary parts of the sequence
separately.
If the sequences $x_1, x_2, \ldots$ and $y_1, y_2, \ldots$ are convergent, then $z_1, z_2, \ldots$ is convergent,
and
\[ \lim z_n = \lim x_n + i \lim y_n. \]
Proof. Let
\[ x = \lim_{n\to\infty} x_n, \qquad y = \lim_{n\to\infty} y_n, \]
so that
\[ \lim (x_n - x) = \lim (y_n - y) = 0. \]
Theorem 2. If $\lim_{n\to\infty} (x_n + y_n i) = x + yi$, then $\lim_{n\to\infty} x_n = x$ and $\lim_{n\to\infty} y_n = y$.
If
\[ \lim_{n\to\infty} S_n = S, \]
For real series, we found that if $\sum |x_i|$ converges, then $\sum x_i$ also converges. The
same is true in the complex domain.
Then
and similarly
and
\[ \lim_{n\to\infty} \sum_{j=0}^{n} z_j = \lim_{n\to\infty} \sum_{j=0}^{n} (x_j + y_j i) = \lim_{n\to\infty} \sum_{j=0}^{n} x_j + i \lim_{n\to\infty} \sum_{j=0}^{n} y_j = A + Bi. \]
The simplicity of this theorem, and of its proof, are misleading: the theorem is
powerful. It gives immediately:
This is so because
\[ \sum_{j=0}^{\infty} \left| \frac{z^j}{j!} \right| = \sum_{j=0}^{\infty} \frac{|z|^j}{j!} < \infty \]
for every real number $|z|$. This enables us to extend the domain of the exponential
$e^x = \exp x$ to the entire complex plane:
For the case in which z is a pure imaginary number $i\theta$, we can express $e^z$ in
another form:
Now
and
Therefore
\[ e^{i\theta} = \sum_{k=0}^{\infty} \frac{(-1)^k \theta^{2k}}{(2k)!} + i \sum_{k=0}^{\infty} \frac{(-1)^k \theta^{2k+1}}{(2k+1)!}. \]
This gives:
Conversely, if $|z| = 1$, then $z = e^{i\theta}$ for some $\theta$. Here $z = x + yi$, $x = \cos\theta$, and
$y = \sin\theta$.
r = izl � 0.
To see this, we let
\[ w = \frac{z}{|z|}, \]
so that $|w| = 1$. Therefore
The expression $re^{i\theta}$ (or $r(\cos\theta + i \sin\theta)$) is called the polar form for z, because it
describes z in terms of the polar coordinates r, $\theta$ of the corresponding point.
For example, consider
z = 1 + 2i.
Here $r = |z| = \sqrt5$. Let $\theta$ be such that
\[ \cos\theta = \frac{1}{\sqrt5}, \qquad \sin\theta = \frac{2}{\sqrt5}. \]
Then
\[ w = \cos\theta + i \sin\theta = e^{i\theta} \]
and
\[ z = rw = r(\cos\theta + i \sin\theta) \]
in polar form.
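The polar form of 1 + 2i can be computed directly. The sketch below relies on `cmath.polar` from the Python standard library, which returns the pair (r, θ) with θ in the interval from −π to π; the variable names are illustrative.

```python
import cmath
import math

z = complex(1, 2)
r, theta = cmath.polar(z)        # r = |z|, theta = an amplitude of z

print(r, math.sqrt(5))                       # the modulus, two ways
print(math.cos(theta), 1 / math.sqrt(5))     # cos(theta) = 1/sqrt(5)
print(math.sin(theta), 2 / math.sqrt(5))     # sin(theta) = 2/sqrt(5)
print(r * cmath.exp(1j * theta))             # recovers 1 + 2i, up to rounding
```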
(The same is true for all complex numbers w, z. That is, $e^w \cdot e^z = e^{w+z}$. But we
are not yet in a position to prove it.)
Proof.
\[ e^{i\theta} \cdot e^{i\alpha} = (\cos\theta + i \sin\theta)(\cos\alpha + i \sin\alpha) \]
\[ = \cos\theta \cos\alpha - \sin\theta \sin\alpha + (\sin\theta \cos\alpha + \cos\theta \sin\alpha)i \]
\[ = \cos(\theta + \alpha) + i \sin(\theta + \alpha) \]
\[ = e^{i(\theta + \alpha)}. \]
In the polar form $re^{i\theta}$, r is called the modulus and $\theta$ is called the amplitude. It is
a slight abuse of language to speak of $\theta$ as the amplitude of $re^{i\theta}$, because while the
modulus is determined when the number z is named, the amplitude $\theta$ is not determined.
In fact, when we apply the exponent $i\theta$ to e, we get a periodic function $f(\theta)$.
v3
4. i s. v3 + i 6. -4 - 4i
2 + 2
7. In the complex domain, the sine and cosine are defined by the series
z2Hl
co
8. In the text, we expressed $e^{i\theta}$ in terms of $\sin\theta$ and $\cos\theta$. More generally, express $e^{iz}$
11. Express sin z and cos z in terms of the complex exponential function.
In Section 10.12 we found that every complex number could be expressed in polar
form, with
where
r = lzl.
And Theorem 6 said that for every $\theta$ and $\alpha$,
This gives us a rule for multiplying complex numbers in polar form: we multiply
the moduli and add the amplitudes. For
we have
To divide, we divide by the modulus of the divisor and subtract the amplitude.
These ideas give us a method for extracting roots of any order. We shall now see
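that the number 1 has three cube roots in the complex domain. If $z^3 = 1$, then $|z|^3 = 1$,
so that $|z| = 1$ and $z = e^{i\theta}$, and
\[ 3\theta = 0 + 2n\pi \]
for some n. For three successive values of n we get:
\[ n = 0, \quad \theta = 0, \quad z = z_1 = 1; \]
\[ n = 1, \quad \theta = \tfrac23\pi, \quad z = z_2 = \cos \tfrac23\pi + i \sin \tfrac23\pi; \]
\[ n = 2, \quad \theta = \tfrac43\pi, \quad z = z_3 = \cos \tfrac43\pi + i \sin \tfrac43\pi. \]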
Using other values of n, we would get repetitions of the same cube roots. Thus the
roots are
\[ z_1 = 1, \qquad z_2 = -\frac12 + \frac{\sqrt3}{2}\,i, \qquad z_3 = -\frac12 - \frac{\sqrt3}{2}\,i. \]
These cube roots could have been found by elementary methods, because
\[ z^3 - 1 = (z - 1)(z^2 + z + 1), \]
and the quadratic formula gives $z_2$ and $z_3$ as the roots of the equation $z^2 + z + 1 = 0$.
But for roots of higher order, and for numbers less simple than 1, the elementary
methods break down, and our new method still works. For example, $i = e^{i\pi/2}$.
Therefore the fourth roots of i are the numbers $e^{i\theta}$ for which
\[ 4\theta = \frac{\pi}{2} + 2n\pi, \quad \text{or} \quad \theta = \frac{1 + 4n}{8}\,\pi. \]
Four successive values of n give us
\[ \theta = \frac{\pi}{8}, \quad \frac{5\pi}{8}, \quad \frac{9\pi}{8}, \quad \frac{13\pi}{8}, \]
from which the roots $z_j = e^{i\theta_j}$ can be computed, by repeated applications of the
half-angle formulas for the sine and cosine.
In general, every complex number $z \ne 0$ has exactly n nth roots in the complex
domain, and the roots can be expressed in the form that we have been using.
\[ z = re^{i\theta} \ne 0 \]
\[ z_j = \sqrt[n]{r}\, e^{i\theta_j} \quad (j = 0, 1, \ldots, n - 1), \]
where
\[ \theta_j = (1/n)(\theta + 2j\pi) \quad (j = 0, 1, \ldots, n - 1). \]
To prove this, we need to investigate two things.
10.13 De Moivre's Theorem 491
\[ \alpha = \frac{\theta}{n} + \frac{j}{n} \cdot 2\pi. \]
Any n successive values of j give us n different values of $e^{i\alpha}$, but thereafter the values of
$e^{i\alpha}$ repeat themselves.
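For example, consider
\[ z = 1 + 2i, \quad n = 5. \]
Then $z = re^{i\theta}$, where
\[ r = \sqrt5, \quad \theta = \operatorname{Sin}^{-1} (2/\sqrt5). \]
The fifth roots of z are the numbers
\[ z_j = \sqrt[5]{r}\, e^{i\theta_j} = \sqrt[5]{\sqrt5}\, e^{i\theta_j} = 5^{1/10} e^{i\theta_j} \quad (j = 0, 1, 2, 3, 4), \]
where
\[ \theta_j = \tfrac15 (\theta + 2j\pi) = \frac{\theta}{5} + \frac{2j}{5}\,\pi \quad (j = 0, 1, 2, 3, 4). \]
Thus
\[ z_0 = 5^{1/10} e^{i\theta/5}, \]
\[ z_1 = 5^{1/10} e^{i(\theta + 2\pi)/5}, \]
\[ z_2 = 5^{1/10} e^{i(\theta + 4\pi)/5}, \]
and so on. Note that it is not easy to express these numbers in the form a + bi.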
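Although exact expressions in the form a + bi are awkward, decimal approximations of the roots are easy to obtain from the formula $z_j = \sqrt[n]{r}\,e^{i\theta_j}$. The following sketch uses an illustrative helper named `nth_roots` (not from the text) and checks the results by raising each root back to the nth power.

```python
import cmath
import math

def nth_roots(z, n):
    """The n nth roots r**(1/n) * e**(i*theta_j), with theta_j = (theta + 2*j*pi)/n."""
    r, theta = cmath.polar(z)
    return [r ** (1.0 / n) * cmath.exp(1j * (theta + 2 * j * math.pi) / n)
            for j in range(n)]

# The three cube roots of 1.
for root in nth_roots(1, 3):
    print(root)

# The five fifth roots of 1 + 2i, each checked by computing root**5.
for root in nth_roots(complex(1, 2), 5):
    print(root, root ** 5)
```

Each fifth root should have modulus $5^{1/10}$, and each fifth power should reproduce 1 + 2i up to rounding.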
De Moivre's theorem shows that C not only contains roots of all orders for all
real numbers, but also roots of all orders for all complex numbers. This means, in
particular, that any quadratic equation with coefficients in C can be solved in C. The
method follows the derivation of the familiar quadratic formula.
\[ az^2 + bz + c = 0 \quad (a \ne 0) \]
\[ \iff z^2 + \frac{b}{a}\,z = -\frac{c}{a} \]
\[ \iff z^2 + \frac{b}{a}\,z + \frac{b^2}{4a^2} = -\frac{c}{a} + \frac{b^2}{4a^2} \]
\[ \iff \left(z + \frac{b}{2a}\right)^2 = \frac{b^2 - 4ac}{4a^2}. \]
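If $b^2 - 4ac = 0$, this takes the form
\[ \left(z + \frac{b}{2a}\right)^2 = 0 \iff z + \frac{b}{2a} = 0, \]
and
\[ z = -b/2a \]
is the only root. If $b^2 - 4ac \ne 0$, then the complex number $(b^2 - 4ac)/4a^2$ has two
square roots $z_1$, $z_2$, and
\[ az^2 + bz + c = 0 \ \Rightarrow\ z + b/2a = z_1 \quad \text{or} \quad z + b/2a = z_2. \]
Therefore the roots are the numbers
\[ -\frac{b}{2a} + z_1, \qquad -\frac{b}{2a} + z_2. \]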
In fact, a much more general result holds: every polynomial equation
1. z4 + 1 0. 2. z6 + i 0.
-
= =
3. z3 + 8 = 0. 4. z2 + 2z i + 1 = 0.
5. z3 + z2 + z + 1 = 0. 6. z2 + z + i + t = 0.
7. $z^7 + z^6 + z^5 + z^4 + z^3 + z^2 + z + 1 = 0$.
8. $z^5 + z^4 + 2z^3 + 2z^2 + z + 1 = 0$.
9. We know that for each n, the number 1 has exactly n nth roots. Show that we can
always find one of these, say, $z_0$, so that the complete set of nth roots are the powers
$z_0, z_0^2, z_0^3, \ldots, z_0^n$ of $z_0$.
1) The domain of convergence of $\sum a_i x^i$ was always symmetric about zero, except
perhaps for the endpoints of an interval. That is, the domain of convergence always
turned out to be (a) 0 alone, (b) $(-\infty, \infty)$, or (c) an interval of one of the types
$(-r, r)$, $[-r, r]$, $[-r, r)$, $(-r, r]$.
2) The function $f(x) = 1/(1 + x^2)$ is defined for every x, and has derivatives
of all orders. Nevertheless, its series $\sum_{i=0}^{\infty} (-1)^i x^{2i}$ converges only on the interval
$(-1, 1)$. Here the series goes bad for reasons which seem unrelated to the properties
of the function which it represents.
We shall now find out why these things happen. The next two theorems are
modeled on theorems which are known in the real domain.
Proof. For each n, let $S_n = \sum_{i=0}^{n} z_i$, so that $\lim_{n\to\infty} S_n = S = \sum_{i=0}^{\infty} z_i$. Then
$\lim_{n\to\infty} z_n = \lim_{n\to\infty} (S_n - S_{n-1}) = S - S = 0$. (Note that this is exactly like the
old proof.)
\[ |y_n| \le b, \]
for every n. Therefore
Throughout this section, for each r > 0, $D_r$ denotes the interior of the circle
with center at the origin and radius r, in the complex plane. Here D stands for disk:
the interior of a circle is called an open disk. If we include the boundary circle, we
get the closed disk $\bar D_r = \{z \mid |z| \le r\}$.
Theorem 3. If a series $\sum_{i=0}^{\infty} a_i z^i$ converges for $z = z_0$, with $z_0 \ne 0$, and $0 < s < |z_0|$,
then the series converges at every point of $D_s$.
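The first step in the proof is to show that $\sum |a_i| s^i$ is convergent. For this purpose
we use a comparison test. We have
\[ |a_i| s^i = |a_i| \cdot \left| \frac{s}{z_0}\, z_0 \right|^i = |a_i| \cdot \left| \frac{s}{z_0} \right|^i \cdot |z_0|^i = |a_i z_0^i| \cdot \left| \frac{s}{z_0} \right|^i. \]
But we know that $\sum a_i z_0^i$ is convergent. Therefore $a_i z_0^i \to 0$, and so the numbers $a_i z_0^i$
form a bounded sequence: there is a number b such that $|a_i z_0^i| \le b$ for every i. Also
\[ \left| \frac{s}{z_0} \right| < 1, \]
because $s < |z_0|$. Therefore
\[ \sum_{i=0}^{\infty} |a_i| s^i = \sum_{i=0}^{\infty} |a_i z_0^i| \cdot \left| \frac{s}{z_0} \right|^i \le b \sum_{i=0}^{\infty} \left| \frac{s}{z_0} \right|^i = \frac{b}{1 - |s/z_0|} < \infty. \]
It follows that
\[ \sum_{i=0}^{\infty} |a_i z^i| \le \sum_{i=0}^{\infty} |a_i| s^i < \infty \quad (z \text{ in } D_s). \]
\[ S = \left\{ s \;\middle|\; \sum_{j=0}^{\infty} a_j z^j \text{ converges on } D_s \right\}. \]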
If S is unbounded, then the series is convergent for every z. In this case, we say that
the radius of convergence is oo. If S is bounded, then S has a least upper bound
sup S. (See page 243.) Let
r =sup S.
2) Given $z_0$, with $|z_0| > r$. Suppose that the series converges at $z = z_0$, and let s be
such that
\[ r < s < |z_0|. \]
Then the series converges on $D_s$; and this is impossible, because r is an upper bound
for such numbers s.
10.14 The Radius of Convergence. Differentiation of Complex Power Series 495
Note that while this theorem tells us what happens inside the circle $|z| = r$, and
what happens outside the circle, it tells us nothing about what happens on the circle.
This theorem clarifies the situation for real series $\sum a_i x^i$. Suppose that the series
converges for some $x \ne 0$. Then the complex series $\sum_{i=0}^{\infty} a_i z^i$ converges for some
$z \ne 0$ (namely, the same x). Let the radius of convergence be r. Then $\sum a_i z^i$ converges
for $|z| < r$ and diverges for $|z| > r$. Therefore, for real values of z, $\sum a_i x^i$
converges for $|x| < r$ and diverges for $|x| > r$.
The circular domain of convergence for complex power series also accounts for
the behavior of the series
\[ \sum_{i=0}^{\infty} (-1)^i x^{2i} = \frac{1}{1 + x^2}. \]
If this series converged for some x for which $|x| > 1$, then for the complex series
\[ \sum_{i=0}^{\infty} (-1)^i z^{2i} = \frac{1}{1 + z^2}, \]
we would have a radius of convergence r > 1. This is impossible, because the function
itself blows up at a point of the unit circle: $|i| = 1$, and for $z = i$ the denominator
on the right-hand side becomes 0. [Query: how do we know that the series is equal to
$1/(1 + z^2)$, for complex values of z?]
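The behavior described here is easy to observe numerically. The sketch below uses illustrative names and sample points chosen here: one complex point inside the unit circle, one real point outside it, and one point close to z = i.

```python
def partial_sum(z, n):
    """Sum of (-1)**i * z**(2*i) for i = 0..n."""
    return sum((-1) ** i * z ** (2 * i) for i in range(n + 1))

def f(z):
    return 1 / (1 + z * z)

inside = 0.5 + 0.5j     # |z| < 1: the partial sums settle down to f(z)
outside = 1.1           # real, |x| > 1: the partial sums grow without bound

print(abs(partial_sum(inside, 200) - f(inside)))
print([abs(partial_sum(outside, n)) for n in (10, 20, 40)])
print(abs(f(0.999j)))   # the function is already very large near z = i
```

On the real line nothing about $1/(1 + x^2)$ hints at trouble near x = 1.1; the obstruction is visible only at the complex point i.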
The derivative of a complex-valued function is defined by an obvious analogy
with the derivative of a real-valued function. That is,
\[ f'(z_0) = \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0}. \]
This is a complicated limiting process, because z may approach z0 from any direction
in the complex plane.
To be exact, the indicated limit means that for every $\epsilon > 0$ there is a $\delta > 0$ such that
\[ 0 < |z - z_0| < \delta \ \Rightarrow\ \left| \frac{f(z) - f(z_0)}{z - z_0} - f'(z_0) \right| < \epsilon. \]
Here the inequality $|z - z_0| < \delta$ allows z to lie anywhere in the interior of a circle
with center at $z_0$ and radius $\delta$. If $f'(z_0)$ exists, then we say that f is differentiable at $z_0$.
If f is differentiable at every point of an open disk containing $z_0$, then we say that f is
analytic at $z_0$. It is easy to see that if $f(z) = z^n$, then f is differentiable everywhere,
and therefore analytic everywhere, with
\[ f'(z) = nz^{n-1}. \]
The proof is exactly like the proof for $f(x) = x^n$. Similarly:
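The point that z may approach $z_0$ from any direction can be illustrated by computing difference quotients along several directions. This is a sketch with illustrative choices (the exponent n = 5, the point $z_0 = 0.3 - 0.7i$, and eight directions of approach); a numerical check of this kind suggests the limit but does not prove it.

```python
import cmath
import math

def difference_quotient(f, z0, h):
    """(f(z0 + h) - f(z0)) / h for a small complex increment h."""
    return (f(z0 + h) - f(z0)) / h

n = 5

def f(z):
    return z ** n

z0 = complex(0.3, -0.7)
exact = n * z0 ** (n - 1)       # the claimed derivative n * z0**(n-1)

for k in range(8):
    h = 1e-6 * cmath.exp(1j * k * math.pi / 4)   # eight directions of approach
    print(k, abs(difference_quotient(f, z0, h) - exact))
```

All eight printed differences should be small and of comparable size, whatever the direction of h.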
Theorem 5. If f is a polynomial, with
\[ f(z) = \sum_{j=0}^{n} a_j z^j, \]
then f is analytic everywhere, and
\[ f'(z) = \sum_{j=1}^{n} j a_j z^{j-1}. \]
Theorem 6. If f(z) has a power series $\sum_{j=0}^{\infty} a_j z^j$, converging in the open disk $D_r$,
then f is analytic in $D_r$, and
\[ f'(z) = \sum_{j=1}^{\infty} j a_j z^{j-1}. \]
Then
00
Therefore the series defines a function g(s), and by Theorem A of Section 10.5
it follows that
\[ g'(s) = \sum_{j=1}^{\infty} j |a_j| s^{j-1}, \]
\[ \sum_{j=1}^{\infty} j |a_j| s^{j-1} < \infty, \]
and so
\[ \lim_{n\to\infty} \sum_{j=n+1}^{\infty} j |a_j| s^{j-1} = 0. \]
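Now consider what we are trying to prove. The theorem says that
\[ \lim_{z \to z_0} \left[ \frac{1}{z - z_0} \left( \sum_{j=0}^{\infty} a_j z^j - \sum_{j=0}^{\infty} a_j z_0^j \right) \right] = \sum_{j=1}^{\infty} j a_j z_0^{j-1}, \]
for every $z_0$ in $D_r$. Simplifying the expression in brackets, and transposing the sum on
the right, we get the equivalent form
\[ \lim_{z \to z_0} \sum_{j=1}^{\infty} a_j \left( z^{j-1} + z^{j-2} z_0 + \cdots + z_0^{j-1} - j z_0^{j-1} \right) = 0. \]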
This is not as bad as it looks, because the limit of the jth term is 0 for each j: each
of the first j terms in the parentheses approaches the limit $z_0^{j-1}$, as $z \to z_0$, and so the
total expression in the parentheses approaches 0. Therefore if the sum were finite
our conclusion would follow: if
\[ S_n(z) = \sum_{j=1}^{n} a_j (\cdots), \]
then
\[ \lim_{z \to z_0} S_n(z) = 0 \quad \text{for each } n. \]
\[ R_n(z) = \sum_{j=n+1}^{\infty} a_j \left( z^{j-1} + z^{j-2} z_0 + \cdots + z_0^{j-1} - j z_0^{j-1} \right). \]
We know that $|z_0| < s$; and since we are discussing $\lim_{z \to z_0}$, we may assume that
$|z| < s$. Under these conditions,
\[ |R_n(z)| \le \sum_{j=n+1}^{\infty} |a_j|\, 2j s^{j-1}. \]
\[ \sum_{j=1}^{\infty} a_j (\cdots) = S_n(z) + R_n(z), \]
we have
\[ D \sum_{j=0}^{\infty} a_j z^j = \sum_{j=1}^{\infty} j a_j z^{j-1}, \]
on any open disk $D_r$ where the series on the left converges.
g'
n
=
Dt =
nr-1f'.
(The pattern of proof for the real domain works equally well here.)
8. Let f be analytic for every z, and let $g(z) = f(a + z)$. Show that $g'(z) = f'(a + z)$.
(Evidently this is another simple special case of the chain rule.)
9. Given $f(z) = \sum_{j=0}^{\infty} a_j z^j$ in $D_r$. Show that f has not only a first derivative in $D_r$, but also
derivatives $f^{(2)} = f''$, $f^{(3)}, \ldots$ of all orders.
10. Given $f(z) = \sum_{j=0}^{\infty} a_j z^j = 0$ for every z in $D_r$. Obviously $a_0 = 0$, because $f(0) = a_0$.
Show that $a_j = 0$ for every j.
11. Given $f(z) = \sum_{j=0}^{\infty} a_j z^j$ in $D_r$. Show that if $f'(z) = 0$ for every z, then f is a constant
function and is equal to $a_0$ for each z.
12. Given that $f(z) = \sum a_j z^j$, $g(z) = \sum b_j z^j$ in $D_r$. Show that if $f' = g'$, then $f - g$ is a
constant.
10.15 Integration and Differentiation of Real Power Series 499
13. Let $\phi(z) = \cos^2 z + \sin^2 z$. Show that $\phi(z) = 1$ for every z.
14. Given that $\cos^2 z + \sin^2 z = 1$ for every z, does it follow that the complex functions
cos z and sin z are bounded? Why or why not?
In Section 10.5 we stated (in Theorems A and B) that a Maclaurin series could be
differentiated and integrated a term at a time, on any interval ( -r, r) on which the
series converges. We shall now prove these theorems. The ideas that are needed to
do this may be easier to understand if we first show how they apply to the geometric
series
\[ f(x) = \sum_{i=0}^{\infty} x^i = \frac{1}{1 - x} \quad (-1 < x < 1). \]
\[ S_n(x) = \sum_{i=0}^{n} x^i, \qquad R_n(x) = \sum_{i=n+1}^{\infty} x^i, \]
so that
\[ f(x) = S_n(x) + R_n(x), \]
and
and
n ( " oo "
xi dx � J xi dx,
(
J
!�rr;, i� o i o
=
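If (3) holds, then we can take the limit in formula (2), getting
\[ \lim_{n\to\infty} \left[ \int_0^k f(x)\,dx - \int_0^k S_n(x)\,dx \right] = \lim_{n\to\infty} \left[ \int_0^k f(x)\,dx - \sum_{i=0}^{n} \int_0^k x^i\,dx \right] = \int_0^k f(x)\,dx - \sum_{i=0}^{\infty} \int_0^k x^i\,dx = 0. \]
This means that (1) holds.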
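\[ R_n(x) = \sum_{i=n+1}^{\infty} x^i = x^{n+1} \sum_{i=0}^{\infty} x^i = \frac{x^{n+1}}{1 - x}. \]
Therefore
\[ 0 \le R_n(x) \le \frac{k^{n+1}}{1 - k} \quad (0 \le x \le k). \]
Therefore
\[ \int_0^k R_n(x)\,dx \le \int_0^k \frac{k^{n+1}}{1 - k}\,dx = \frac{k^{n+2}}{1 - k}. \]
Therefore
\[ \lim_{n\to\infty} \int_0^k R_n(x)\,dx = 0, \]
which is what we wanted.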
What made this work was the fact that the functions $R_1, R_2, \ldots$ were squeezed
to 0 by a sequence of constants. We had
\[ M_n = \frac{k^{n+1}}{1 - k}, \qquad \int_0^k R_n(x)\,dx \le \int_0^k M_n\,dx = kM_n \to 0. \]
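The squeeze can be watched numerically for the geometric series. In the sketch below, the integral of $S_n$ is computed a term at a time, the integral of f is $-\ln(1 - k)$, and their difference (the integral of $R_n$) is compared with $kM_n = k^{n+2}/(1 - k)$. The value k = 0.5 and the function name are illustrative choices.

```python
import math

def integral_of_partial_sum(k, n):
    """Integral from 0 to k of S_n(x) = sum of x**i (i = 0..n), term by term."""
    return sum(k ** (i + 1) / (i + 1) for i in range(n + 1))

k = 0.5
exact = -math.log(1 - k)          # integral of 1/(1 - x) from 0 to k

for n in (1, 5, 10, 20):
    gap = exact - integral_of_partial_sum(k, n)   # integral of R_n from 0 to k
    bound = k ** (n + 2) / (1 - k)                # k * M_n
    print(n, gap, bound)
```

In each row the gap should lie between 0 and the bound, and both tend to 0.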
This is Theorem 7 of Section 10.1. It gives us the following result for series.
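Theorem 2. If $\sum a_i x^i$ is convergent on the interval $(-r, r)$, and $0 < k < r$, then
$\sum a_i k^i$ is absolutely convergent.
Proof. Let $x_1$ be any number such that $k < x_1 < r$. Then $\sum a_i x_1^i$ is convergent, so that
$a_i x_1^i \to 0$, and there is a bound b such that $|a_i x_1^i| \le b$ for every i. Let $s = k/x_1$, so that
$0 < s < 1$ and $k = x_1 s$.
Therefore
\[ \sum_{i=0}^{\infty} |a_i k^i| = \sum_{i=0}^{\infty} |a_i x_1^i|\, s^i \le \sum_{i=0}^{\infty} b s^i = \frac{b}{1 - s} < \infty. \]
Therefore $\sum a_i k^i$ is absolutely convergent, which was to be proved.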
Theorem 3. If $\sum a_i x^i$ is convergent on the interval $(-r, r)$, and $0 < k < r$, then the
remainders
\[ R_n(x) = \sum_{i=n+1}^{\infty} a_i x^i \]
are squeezed to 0, on the interval $[-k, k]$, by a sequence of constants. That is, there
is a sequence $M_1, M_2, \ldots$ of constants, such that
\[ \lim_{n\to\infty} M_n = 0, \]
and
\[ |R_n(x)| \le M_n \quad (-k \le x \le k). \]
Proof. We know by the preceding theorem that $\sum a_i k^i$ is absolutely convergent.
For each n, let
\[ M_n = \sum_{i=n+1}^{\infty} |a_i k^i|. \]
Then
\[ \lim_{n\to\infty} M_n = 0, \]
because
\[ M_n = \sum_{i=0}^{\infty} |a_i k^i| - \sum_{i=0}^{n} |a_i k^i|. \]
and
Therefore
$|R_n(x)| \le M_n$ for every n,
and so the remainders $R_n(x)$ are squeezed to 0, on the interval $[-k, k]$, by the constants
$M_1, M_2, \ldots$.
The ideas in this theorem are going to come up again, and so we need a briefer
language in which to describe them.
Definition. Let $R_1, R_2, \ldots$ be a sequence of functions on the interval $[a, b]$. If there
is a sequence $M_1, M_2, \ldots$ of positive constants, approaching 0, such that
\[ |R_n(x)| \le M_n, \]
for every x on $[a, b]$ and for every n, then we say that the functions $R_1, R_2, \ldots$
\[ R_n(x) = \sum_{i=n+1}^{\infty} a_i x^i, \]
\[ S_n(x) = \sum_{i=0}^{n} a_i x^i, \]
\[ S(x) = \sum_{i=0}^{\infty} a_i x^i, \]
\[ R_n(x) = S(x) - S_n(x). \]
In our new terminology, Theorem 3 takes the following form:
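Theorem 3'. If $\sum a_i x^i$ is convergent on $(-r, r)$, and $0 < k < r$, then
\[ \operatorname{U\,lim}_{n\to\infty} \sum_{i=0}^{n} a_i x^i = \sum_{i=0}^{\infty} a_i x^i \quad \text{on } [-k, k]. \]
tinuous. But when $f(x)$ is given only by a series $\sum a_i x^i$, we first need to show that f is
continuous, in order to conclude that the sum has an integral. Thus we need the
following two theorems.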
Theorem 4. If $f_n$ is continuous for each n, and $\operatorname{U\,lim}_{n\to\infty} f_n(x) = f(x)$ on $[a, b]$, then
f is continuous on $[a, b]$.
Proof. Take a fixed $x_0$, and let $\epsilon$ be any positive number. Let $M_1, M_2, \ldots$ be as in
the definition of U lim. Then there is an n such that $M_n < \epsilon/3$. Hereafter in the
proof, n is fixed. Since $f_n$ is continuous at $x_0$, there is a $\delta > 0$ such that
\[ |x - x_0| < \delta \ \Rightarrow\ |f_n(x) - f_n(x_0)| < \epsilon/3. \]
Since
\[ |f(x) - f_n(x)| \le M_n \quad \text{for every } x, \]
we have
\[ |f(x) - f_n(x)| < \epsilon/3, \]
\[ |x - x_0| < \delta \ \Rightarrow\ |f_n(x) - f_n(x_0)| < \epsilon/3, \]
\[ |f_n(x_0) - f(x_0)| < \epsilon/3. \]
By the triangular inequality,
\[ |a + b + c| \le |a| + |b| + |c|. \]
Therefore
\[ |x - x_0| < \delta \ \Rightarrow\ |f(x) - f(x_0)| < \epsilon/3 + \epsilon/3 + \epsilon/3 = \epsilon, \]
which was to be proved.
To conclude that f is continuous, it is not enough to know that $\lim_{n\to\infty} f_n(x) = f(x)$
for each x of $[a, b]$; we really need to know that $\operatorname{U\,lim}_{n\to\infty} f_n(x) = f(x)$ on $[a, b]$.
(See the following problem set, for an example showing this.) For power series, of
course, we know that $S_n(x)$ is continuous for each n, because $S_n(x)$ is a polynomial;
and we know that the sum of the series is not merely $\lim S_n(x)$ but also $\operatorname{U\,lim} S_n(x)$,
on every closed interval lying in the interval of convergence. This gives the following
theorem.
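A standard illustration of the difference between the two kinds of limit is the sequence $f_n(x) = x^n$ on [0, 1]. It is offered here as an assumption-labeled example and is not necessarily the one used in the problem set. Each $f_n$ is continuous, the pointwise limit is 0 for x < 1 and 1 at x = 1, and no sequence of constants tending to 0 can squeeze $|f_n - f|$, because $f_n$ takes the value 1/2 somewhere in [0, 1) for every n.

```python
def f_n(x, n):
    return x ** n

def f(x):
    """Pointwise limit of f_n on [0, 1]: 0 for x < 1, and 1 at x = 1."""
    return 1.0 if x == 1.0 else 0.0

def gap(x, n):
    return abs(f_n(x, n) - f(x))

for n in (1, 10, 100, 1000):
    x_n = 0.5 ** (1.0 / n)      # a point of [0, 1) at which f_n equals 1/2
    print(n, gap(0.5, n), gap(x_n, n))
```

At the fixed point x = 0.5 the gap tends to 0, but at the moving point $x_n$ it stays at 1/2, so any constants $M_n$ with $|f_n(x) - f(x)| \le M_n$ on [0, 1] would have to satisfy $M_n \ge 1/2$.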
Theorem 6. If $\sum_{i=0}^{\infty} a_i x^i$ converges on $(-r, r)$, and $|x| < r$, then
\[ \int_0^x \left[ \sum_{i=0}^{\infty} a_i t^i \right] dt = \sum_{i=0}^{\infty} \int_0^x a_i t^i\,dt. \]
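To prove the theorem, we let $S_n(t) = \sum_{i=0}^{n} a_i t^i$ and $R_n(t) = \sum_{i=n+1}^{\infty} a_i t^i$. Then for $0 < x < r$ we have
\[ |R_n(t)| \le M_n \quad (0 \le t \le x), \]
with constants $M_n$ as in Theorem 3. Therefore
\[ \left| \int_0^x R_n(t)\,dt \right| \le \int_0^x M_n\,dt = M_n |x|. \]
Since $\lim_{n\to\infty} M_n |x| = 0$, it follows that
\[ \lim_{n\to\infty} \int_0^x S_n(t)\,dt = \int_0^x \left[ \sum_{i=0}^{\infty} a_i t^i \right] dt. \]
But $S_n(t)$ is a finite sum, and can be integrated a term at a time. Therefore
\[ \int_0^x \left[ \sum_{i=0}^{\infty} a_i t^i \right] dt = \lim_{n\to\infty} \sum_{i=0}^{n} \int_0^x a_i t^i\,dt = \sum_{i=0}^{\infty} \int_0^x a_i t^i\,dt, \]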
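Term-by-term integration can be tried on the series $\sum (-1)^i t^{2i} = 1/(1 + t^2)$, whose integral from 0 to x is $\operatorname{Tan}^{-1} x$. The sketch below uses an illustrative function name and the sample point x = 0.6, with `math.atan` as the reference value.

```python
import math

def integrated_series(x, n):
    """Sum over i = 0..n of the integral from 0 to x of (-1)**i * t**(2*i) dt."""
    return sum((-1) ** i * x ** (2 * i + 1) / (2 * i + 1) for i in range(n + 1))

x = 0.6
for n in (2, 5, 10, 20):
    print(n, abs(integrated_series(x, n) - math.atan(x)))
```

The differences should decrease toward rounding error, as the theorem predicts for $|x| < 1$.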
Theorem 7. If $\sum_{i=0}^{\infty} a_i x^i$ converges on $(-r, r)$, then $\sum_{i=1}^{\infty} i a_i x^{i-1}$ converges on $(-r, r)$.
Proof. This is going to be very similar to the proof of Theorem 2. Let $x_1$ be any
number such that $|x| < x_1 < r$. Then $\sum a_i x_1^i$ is convergent; $\lim_{i\to\infty} a_i x_1^i = 0$; and
there is a bound b such that $|a_i x_1^{i-1}| \le b$ for every i. Let $s = |x|/x_1$, so that
\[ |x| = x_1 s, \]
\[ |i a_i x^{i-1}| = |i a_i x_1^{i-1} s^{i-1}| = |a_i x_1^{i-1}| \cdot i s^{i-1} \le b\, i s^{i-1}. \]
Therefore
\[ \sum_{i=1}^{\infty} |i a_i x^{i-1}| \le b \sum_{i=1}^{\infty} i s^{i-1}. \]
But s < 1, and so the series on the right-hand side converges, by the ratio test.
Therefore the series on the left-hand side converges. Therefore $\sum i a_i x^{i-1}$ converges.
It remains to show that the "derivative series" $\sum i a_i x^{i-1}$ really gives the derivative,
but this is easy.
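Theorem 8. If
\[ f(x) = \sum_{i=0}^{\infty} a_i x^i \quad \text{on } (-r, r), \]
then
\[ f'(x) = \sum_{i=1}^{\infty} i a_i x^{i-1}. \]
Proof. Let
\[ g(x) = \sum_{i=1}^{\infty} i a_i x^{i-1}. \]
Then, by Theorem 6,
\[ \int_0^x g(t)\,dt = \sum_{i=1}^{\infty} \int_0^x i a_i t^{i-1}\,dt = \sum_{i=1}^{\infty} a_i x^i = f(x) - a_0. \]
Therefore $f'(x) = g(x)$, because
\[ D \int_0^x g(t)\,dt = g(x). \]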
\[ f''(x) = \sum_{i=2}^{\infty} i(i - 1) a_i x^{i-2}, \]
\[ f^{(3)}(x) = \sum_{i=3}^{\infty} i(i - 1)(i - 2) a_i x^{i-3}, \]
and so on. Thus if $f(x)$ is represented by a power series, then f has an nth derivative
for every n. In a way this is good; it means that functions given by series are in some
respects manageable. But it also means that if a function f does not have infinitely
many derivatives, then f cannot be represented by a power series. Later you will see
that many such "irregular" functions can be represented by series of other kinds,
notably by so-called Fourier series, of the form
00
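Theorem 8 can be checked on the geometric series, where everything is known in closed form: $f(x) = 1/(1 - x)$ has derivative $1/(1 - x)^2$, and the derivative series is $\sum i x^{i-1}$. The function name and the sample point x = 0.5 below are illustrative choices.

```python
def derivative_series(x, n):
    """Sum of i * x**(i - 1) for i = 1..n: the geometric series differentiated term by term."""
    return sum(i * x ** (i - 1) for i in range(1, n + 1))

x = 0.5
exact = 1 / (1 - x) ** 2        # derivative of 1/(1 - x)

for n in (5, 10, 50, 200):
    print(n, abs(derivative_series(x, n) - exact))
```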
1. Let
\[ f(x) = \sum_{i=0}^{\infty} (-1)^i \frac{x^{2i}}{(2i)!}. \]
Calculate the series for $f''(x)$, and verify that $f''(x) = -f(x)$. [This must be true,
because $f(x) = \cos x$.]
�
i (2i + 1)! ·
off, and find out whether it is true that U liIDn ...... oof n(x) f(x) on [O, 1]. (This throws
=
differentiable, then f is differentiable, and f'(x) liIDn...... cx,/�(x) for every x between
=
-k and k.
(If this is true, then it furnishes a straightforward proof of Theorem 8, replacing
the proof using integrals.)
**16. Here we return to complex power series, as in Section 10.14. It is evident that if
$\sum a_i z^i$ converges on $D_r$, then $\sum |a_i z^i|$ converges on $D_r$; in fact, every time we have proved
convergence for a complex power series, we have first proved absolute convergence and
then used Theorem 3 of Section 10.12. It remains, however, to consider the question of
uniform convergence. Just as for sequences of real functions,
\[ \operatorname{U\,lim}_{n\to\infty} f_n(z) = f(z) \quad \text{on } D_s \]
if $|f_n(z) - f(z)|$ is squeezed to 0 by a sequence of constants. That is, the above U lim
relation holds if there is a sequence $M_1, M_2, \ldots$ of positive constants such that
\[ \lim_{n\to\infty} M_n = 0 \quad \text{and} \quad |f_n(z) - f(z)| \le M_n \ \text{for every } z \text{ in } D_s \text{ and every } n. \]
Theorem. If $\sum a_j z^j$ has r > 0 as its radius of convergence, and $0 < s < r$, then
\[ \operatorname{U\,lim}_{n\to\infty} \sum_{j=0}^{n} a_j z^j = \sum_{j=0}^{\infty} a_j z^j \quad \text{on } D_s. \]
11 Vector Spaces and Inner Products
To set up a coordinate system in three-dimensional space, we use the same scheme that
we used in a plane; the only difference is that we use three mutually perpendicular
lines instead of two. These are the x-, y-, and z-axes. On each of the axes we take a
coordinate system, in such a way that the origin O has coordinate 0. The plane containing
the x- and y-axes is called the xy-plane. Similarly for the yz- and xz-planes.
[Figure: the x-, y-, and z-axes, with the point $P_1$ shown as the corner opposite the origin of a rectangular parallelepiped.]
These are called the coordinate planes. In the figure we have indicated the position of
the point $P_1$ by drawing the rectangular parallelepiped which has the origin O and the
point $P_1$ as opposite corners, and sides parallel to the coordinate planes. We get
the coordinates of a point $P_1$, as before, by dropping perpendiculars to the coordinate
axes.
Here $M_1$, $M_2$, and $M_3$ are the feet of the perpendiculars from $P_1$ to the three axes.
If these points have coordinates $x_1$, $y_1$, $z_1$, on the respective axes, then $P_1$ is matched
with the triplet $(x_1, y_1, z_1)$, and we write
\[ P_1 = (x_1, y_1, z_1). \]
11.1 Cartesian Coordinate Systems in Three-Dimensional Space 509
In figures, we may label a point as P1(x1, y1, z1), to indicate that the given point has
the given coordinates.
The coordinate planes divide space into eight parts, called octants. The figure
above shows the first octant, consisting of all points of space for which all three
coordinates are $\ge 0$.
By two applications of the Pythagorean theorem, we see that each diagonal of a
rectangular parallelepiped with edges of lengths a, b, and c has length $\sqrt{a^2 + b^2 + c^2}$.
This means that for each point $P_1(x_1, y_1, z_1)$ we have
\[ OP_1^2 = x_1^2 + y_1^2 + z_1^2. \]
More generally, for any two points P1, P2, we have the distance formula given in the
following theorem.
[Figure: a rectangular parallelepiped with edges a, b, c and its diagonals.]
\[ P_1P_2 = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}. \]
Proof. Suppose first that $x_1 \ne x_2$, $y_1 \ne y_2$, and $z_1 \ne z_2$. Then $P_1$ and $P_2$ are opposite
corners of a rectangular parallelepiped. In the figure on the left below,
If some of the inequalities $x_1 \ne x_2$, $y_1 \ne y_2$, and $z_1 \ne z_2$ do not hold, then our
parallelepiped reduces to a rectangle, a segment, or a point, and the same distance
formula holds for simpler reasons.
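The distance formula translates directly into a short function. The name `distance` and the sample points are illustrative; points are represented as triples (x, y, z).

```python
import math

def distance(p1, p2):
    """Distance between two points of space given as (x, y, z) triples."""
    return math.sqrt(sum((v - u) ** 2 for u, v in zip(p1, p2)))

print(distance((0, 0, 0), (1, 2, 2)))   # sqrt(1 + 4 + 4) = 3.0
print(distance((1, 0, 0), (0, 2, 3)))   # sqrt(1 + 4 + 9) = sqrt(14)
print(distance((1, 2, 3), (1, 2, 3)))   # the degenerate case: 0.0
```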
We shall use this result to describe planes by equations. (See the figure on the
right below.) Given a plane E, suppose first that E does not pass through the origin,
and let $P_0 = (a, b, c)$ be the foot of the perpendicular from the origin to E. Let $P_1$ be
the point $(2a, 2b, 2c)$. Then $OP_0 = P_0P_1$; $P_0$ is the midpoint of the segment from O
to $P_1$; and E is the perpendicular bisecting plane of the segment. Therefore E is the set
of all points of space that are equidistant from O and $P_1$. That is, E is the graph of the
condition
[Figures: on the left, the rectangular parallelepiped with $P_1$ and $P_2$ as opposite corners; on the right, the plane E.]
The condition on the right says that the numbers A, B, and C are not all equal to zero;
and this is correct, because the point $P_0 = (a, b, c)$ is not the origin. An equation of
the above type is called a linear equation in x, y, and z. Thus we have shown that every
plane that does not pass through the origin is the graph of a linear equation in x, y,
and z. For planes through the origin, the same result holds. Let $P_0(a, b, c)$ be any
point of the line perpendicular to E through O, other than O itself. Let $P_1$ be the point
$(-a, -b, -c)$. Then E is the perpendicular bisecting plane of the segment from
$P_0$ to $P_1$, and so a point $P = (x, y, z)$ lies on E if and only if
\[ P_1P = P_0P \]
\[ \iff (x + a)^2 + (y + b)^2 + (z + c)^2 = (x - a)^2 + (y - b)^2 + (z - c)^2 \]
\[ \iff 2ax + a^2 + 2by + b^2 + 2cz + c^2 = -2ax + a^2 - 2by + b^2 - 2cz + c^2 \]
\[ \iff ax + by + cz = 0. \]
This has the same form $Ax + By + Cz + D = 0$, with D = 0, as it must be: the
origin lies in the plane E. In general:
Ax+By+Cz+ D = 0,
and the equation must be satisfied by the coordinates of the three given points.
Therefore the coefficients A, B, C, and D must satisfy the equations
A -B = 0. (1)- (2)
Setting B =A in (2) and (3), we get
6A+4C+ D = 0, (2')
5A+ 6C+ D = 0. (3')
A - 2C = 0, (2') - (3')
2x + 2y+z - 16 = 0.
This checks.
Note that any number different from 0 could have been used as A. There are some
cases, however, when this is not so. For example, the graph of the equation
y+z=l
is a plane, parallel to the x-axis. This plane is the graph of infinitely many different
equations, of the form
\[ ky + kz - k = 0 \quad (k \ne 0); \]
but x does not appear (with nonzero coefficient) in any of these equations.
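Coefficients of a plane through three given points can also be computed mechanically. The sketch below does not follow the elimination used in the text; it swaps in a cross-product shortcut, which yields a triple (A, B, C) perpendicular to two edge directions of the triangle. The function name and the three sample points are hypothetical; the points were chosen here so that they lie on the plane $2x + 2y + z - 16 = 0$ found above.

```python
from fractions import Fraction

def plane_through(p0, p1, p2):
    """Return (A, B, C, D) with Ax + By + Cz + D = 0 through three non-collinear points."""
    u = [Fraction(b) - Fraction(a) for a, b in zip(p0, p1)]   # direction p0 -> p1
    v = [Fraction(b) - Fraction(a) for a, b in zip(p0, p2)]   # direction p0 -> p2
    A = u[1] * v[2] - u[2] * v[1]
    B = u[2] * v[0] - u[0] * v[2]
    C = u[0] * v[1] - u[1] * v[0]
    if A == 0 and B == 0 and C == 0:
        raise ValueError("the three points are collinear")
    D = -(A * p0[0] + B * p0[1] + C * p0[2])
    return A, B, C, D

coefficients = plane_through((3, 3, 4), (2, 4, 4), (1, 4, 6))
print([int(t) for t in coefficients])   # [2, 2, 1, -16]
```

Any nonzero multiple of the result describes the same plane, which matches the remark that the equation of a plane is determined only up to a factor $k \ne 0$.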
1. Find the equation of the plane containing all points equidistant from the origin and the
point P0 = (2,6, 4).
2. Find the equation of the plane containing all points equidistant from $P_0 = (1, 0, 0)$ and
$P_1 = (0, 2, 3)$.
3. Find the equation of the plane containing all points equidistant from the planes
$x + y + z = 2$ and $x + y + z = -1$.
4. Find the equation of the plane through the points $P_0 = (1, 0, 1)$, $P_1 = (1, 1, 1)$, and
$P_2 = (-1, 2, 0)$.
5. Find the equation of the plane through the points $P_0 = (2, 1, 1)$, $P_1 = (-1, -1, 0)$, and
$P_2 = (0, 0, 3)$.
6. Find the equation of the plane through
7. The point (1, -1, 2) is the foot of the perpendicular from the origin to a plane E.
Find the equation for this plane.
9. Let $A = (1, 0, 0)$, let $B = (4, 0, 0)$, and let $K = \{P \mid 2AP = BP\}$. Find an equation
whose graph is K. What sort of figure is this?
Linear equations for planes are more useful if we know the geometric significance of
the coefficients in the equations. For this purpose, we need the idea of directed
distance on a line. Given a line L with a coordinate system, and two points P and Q
of L, with coordinates x1 and x2, we define the directed distance from P to Q as
\[ \overline{PQ} = x_2 - x_1. \]
11.2 Direction Cosines. The Directed Normal Form 513
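\[ \overline{PQ} + \overline{QR} = \overline{PR} \quad \text{for every } P, Q, R. \]
And since the distance PQ is equal to $|x_2 - x_1|$, it follows that
\[ \overline{PQ} = \begin{cases} PQ & \text{if } Q \text{ is in the positive direction from } P, \\ -PQ & \text{if } Q \text{ is in the negative direction from } P. \end{cases} \]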
This means that directed distances are determined if the positive direction on the line
L is known; we do not care where the origin is in the coordinate system.
[Figure: the origin O = (0, 0, 0) and the point P = (a, b, c).]
Consider now a directed line L, through the origin. Let P= (a, b, c) be a point
on the positive end of L, with OP= 1. Consider the angle between the positive end
of the x-axis and the positive end of L. In its own plane, this angle looks like this:
Note that the foot of the perpendicular really does have coordinate a on the x-axis;
in fact, this is the definition of the x-coordinate of P. If the angle has measure $\alpha$, then
really mean is that they are the measures of the angles between the positive end of L
and the positive ends of the axes.) The numbers
are called the direction cosines of L. Note that they determine not merely a line
through the origin but also a positive direction on the line. If the direction on the line
is reversed, then
Suppose now that P is any point $(x_1, y_1, z_1)$ of L, other than the origin. Let p be
the directed distance from O to P, relative to the given positive direction on L. If
p > 0, as above, then
$$x_1/p = \cos\alpha.$$
If p < 0, then the same reasoning, applied to the reversed direction, gives
$$-\cos\alpha = -x_1/p,$$
and $x_1/p = \cos\alpha$, as before. In the same way, we get
$$\frac{a}{p}x + \frac{b}{p}y + \frac{c}{p}z - p = 0,$$
or
$$x\cos\alpha + y\cos\beta + z\cos\gamma - p = 0.$$
If the plane passes through the origin, then p = 0, but the same result holds: the
$$\alpha \to \pi - \alpha, \quad \beta \to \pi - \beta, \quad \gamma \to \pi - \gamma,$$
and $p \to -p$. This changes all the signs in the equation
Theorem 3. Let E be a plane, and let N be a directed normal to E through the origin,
with direction angles $\alpha$, $\beta$, and $\gamma$. Then E is the graph of the equation
So far, we seem to have been talking about numbers which are hard to compute.
But it is easy to bring the discussion down to earth. Consider the following equation:
x + 2y + 3z + 4 = 0.
It is not reasonable to expect that an equation taken at random will be in the directed
normal form; after all, if a plane E is the graph of the equation
Ax + By + Cz + D = 0,
then E is also the graph of the equation
Theorem 4. If $\alpha$, $\beta$, $\gamma$ are the direction angles of a directed line L through the origin,
then
$$\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1.$$
516 Vector Spaces and Inner Products 11.2
Proof. Let P be the point of L for which the directed distance $\overline{OP}$ is 1. If P = (a, b, c),
then
$$a = \cos\alpha, \quad b = \cos\beta, \quad c = \cos\gamma.$$
Since
$$\overline{OP} = OP = 1 = \sqrt{a^2 + b^2 + c^2},$$
we have $a^2 + b^2 + c^2 = 1$; and the theorem follows. We also have a converse:
Theorem 5. If $a^2 + b^2 + c^2 = 1$, then there is a directed line whose direction cosines
are a, b, and c.
Proof. Let L be the line from the origin through the point P = (a, b, c), directed
positively from O to P. This does it.
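Theorems 4 and 5 can be checked numerically. The following is a minimal sketch, not part of the text: the function name `direction_cosines` is an illustrative choice, and the line is assumed to be directed from the origin toward the given point.

```python
import math

def direction_cosines(point):
    """Direction cosines of the line through the origin, directed toward `point`."""
    x, y, z = point
    length = math.sqrt(x * x + y * y + z * z)
    if length == 0:
        raise ValueError("the origin does not determine a direction")
    # Dividing by the distance OP moves P to the point at directed distance 1.
    return (x / length, y / length, z / length)

cosines = direction_cosines((1.0, 2.0, 3.0))
sum_of_squares = sum(c * c for c in cosines)   # Theorem 4 says this is 1
reversed_cosines = direction_cosines((-1.0, -2.0, -3.0))
print(cosines, sum_of_squares, reversed_cosines)
```

Reversing the direction on the line changes the sign of each direction cosine, which is what the last value illustrates.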
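This suggests that the equation
$$x + 2y + 3z + 4 = 0 \tag{1}$$
has the form
$$xk\cos\alpha + yk\cos\beta + zk\cos\gamma - pk = 0,$$
for some $k \neq 0$. By Theorem 4, we must then have $k^2 = 1^2 + 2^2 + 3^2 = 14$.
Taking $k = \sqrt{14}$, and dividing by k, we get
$$\frac{1}{\sqrt{14}}x + \frac{2}{\sqrt{14}}y + \frac{3}{\sqrt{14}}z + \frac{4}{\sqrt{14}} = 0. \tag{2}$$
Here
$$\cos\alpha = 1/\sqrt{14}, \quad \cos\beta = 2/\sqrt{14}, \quad \cos\gamma = 3/\sqrt{14}, \quad p = -4/\sqrt{14}.$$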
We have sketched the graph by plotting the intercepts, on the axes, and then completed
a triangle just as we did in the case where the intercepts were in the first octant. The
graph of (2) is the plane whose normal has the given direction cosines $1/\sqrt{14}$, $2/\sqrt{14}$,
$3/\sqrt{14}$, and which lies at a directed distance $p = -4/\sqrt{14}$ from the origin.
Taking $k = -\sqrt{14}$, we get
$$-\frac{1}{\sqrt{14}}x - \frac{2}{\sqrt{14}}y - \frac{3}{\sqrt{14}}z - \frac{4}{\sqrt{14}} = 0, \tag{3}$$
$$\cos\alpha' = -\frac{1}{\sqrt{14}} = -\cos\alpha, \quad \cos\beta' = -\frac{2}{\sqrt{14}} = -\cos\beta,$$
$$\cos\gamma' = -\frac{3}{\sqrt{14}} = -\cos\gamma, \quad p' = +\frac{4}{\sqrt{14}} = -p.$$
The same scheme works for any linear equation in x, y, z. This gives a converse
of Theorem 2:
Proof. Given
$$Ax + By + Cz + D = 0, \tag{1}$$
with A, B, and C not all equal to 0. Then $A^2 + B^2 + C^2 > 0$. Let
$$k = \sqrt{A^2 + B^2 + C^2},$$
and
$$a = \frac{A}{k}, \quad b = \frac{B}{k}, \quad c = \frac{C}{k}, \quad p = -\frac{D}{k}.$$
Dividing (1) by k, we get the equivalent equation
$$ax + by + cz - p = 0. \tag{2}$$
The graph of Eq. (2) is a plane E: the direction cosines of a directed normal to E
are a, b, and c; and the directed distance from the origin to E is p.
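The scheme in this proof is easy to carry out mechanically. Here is a minimal sketch, assuming the plane is given by its coefficients A, B, C, D; the function name `directed_normal_forms` is hypothetical, and the second form simply uses k with the opposite sign, as in Eq. (3) of the example above.

```python
import math

def directed_normal_forms(A, B, C, D):
    """Return both directed normal forms (a, b, c, p) of Ax + By + Cz + D = 0."""
    k = math.sqrt(A * A + B * B + C * C)
    if k == 0:
        raise ValueError("A, B, and C must not all be zero")
    forms = []
    for divisor in (k, -k):
        # Dividing by the divisor gives ax + by + cz - p = 0 with a^2 + b^2 + c^2 = 1.
        forms.append((A / divisor, B / divisor, C / divisor, -D / divisor))
    return forms

first, second = directed_normal_forms(1, 2, 3, 4)
print(first)    # direction cosines and p for k = +sqrt(14)
print(second)   # the same plane, with the normal reversed
```

The absolute value of p, in either form, is the distance from the origin to the plane.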
PROBLEM SET 11.2
1. Given x + y + z - 1 = 0. Write the two directed normal forms for the same plane,
and sketch.
6. The normal to E from the origin contains the point (2, 4, 6). The plane E contains the
point (1, 1, 1). Find the two directed normal forms of the equation of E. How far is E
from the origin?
7. Let K be the set of all points which are equidistant from A = (1, 0, 0), B = (0, 1, 0),
and C = (0, 0, 1). That is,
Prove that K is a line through the origin, and find a set of direction cosines for K (more
precisely, a set of direction cosines for a direction on K).
8. Let A, B, and C be three points which are all different, but collinear. Let
11. The normal to E from the origin lies in the xy-plane; and E contains the points (1, 1, 1)
and (-1, 2, 1). Discuss as in Problem 10.
12. The normal to E lies in the yz-plane; and E contains the points (2, 2, 1) and (1, 1, 2).
Discuss as in Problem 10.
13. Let E be the plane z = -1, and let K be the set of all points which are equidistant
from E and the point A = (1, 0, 0). What sort of figure is this? Sketch.
14. Let E be the plane z = -2 and let K be the set of all points which are equidistant from
E and the x-axis. What sort of figure is this? Sketch.
Following the pattern of Section 9.7, we identify the point P = (x, y, z) with the
directed segment $\overrightarrow{OP}$ from the origin to P; we denote the resulting vector as $\vec{P}$; and
for $P_1 = (x_1, y_1, z_1)$, $P_2 = (x_2, y_2, z_2)$, we define addition, scalar multiplication, and
inner product by the formulas
$$\vec{P}_1 + \vec{P}_2 = (x_1 + x_2, y_1 + y_2, z_1 + z_2),$$
$$\alpha\vec{P} = (\alpha x, \alpha y, \alpha z),$$
$$\vec{P}_1 \cdot \vec{P}_2 = x_1x_2 + y_1y_2 + z_1z_2.$$
To simplify the notation, however, we drop the arrows, and write P for $\vec{P}$. Thus we
have an inner-product space
with addition, scalar multiplication, and inner product defined by the formulas
The resulting system satisfies all the vector and inner-product laws of Section 9.7.
In the new notation (without the arrows) these are as follows.
$$O + P = P + O = P$$
for every P.
$$P + (-P) = (-P) + P = O.$$
In M.4 and M.5, 0 is the real number zero, and O is the zero vector. Thus M.4
says that the scalar product of the number 0 and any vector P is the zero vector; and
M.6 says that the scalar product of any number $\alpha$ and the zero vector is the zero vector.
It is understood that all sums and scalar products are also vectors, that is, elements
of $\mathcal{V}$. But we had better make this explicit:
CSM (Closure under scalar multiplication). For every V in $\mathcal{V}$ and every real number
$\alpha$, $\alpha V$ belongs to $\mathcal{V}$.
As in Section 9.7, any set $\mathcal{V}$, with operations satisfying the above laws, is called a
vector space. The space that we are dealing with at the moment, in which the vectors
are the triplets P = (x, y, z) of real numbers, is denoted by $\mathbf{R}^3$, and is called Cartesian
three-space. Thus
$$\mathbf{R}^3 = \{(x, y, z) \mid x, y, z \text{ in } \mathbf{R}\}.$$
The inner product that we have defined in R3 has the following properties:
$$\sqrt{x^2 + y^2 + z^2} = \sqrt{P \cdot P}.$$
Hereafter this number will be called the norm of P, and will be denoted by $\|P\|$. (The
double bars are a reminder that we are performing an operation on a vector rather
than a number.) For inner-product spaces in general, the formula $\sqrt{x^2 + y^2 + z^2}$
may not apply, but the expression $\sqrt{P \cdot P}$ always has a meaning, and so we use it as
our definition of the norm.
Theorem 1. In $\mathbf{R}^3$,
$$P_1 \cdot P_2 = \|P_1\| \cdot \|P_2\| \cos\theta,$$
where $\theta$ is the measure of the angle between the directed segments $\overrightarrow{OP_1}$ and $\overrightarrow{OP_2}$.
The proof is by definition of the norm, together with the law of cosines:
$$(P_1P_2)^2 = OP_1^2 + OP_2^2 - 2 \cdot OP_1 \cdot OP_2 \cos\theta;$$
$$(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2 = x_1^2 + y_1^2 + z_1^2 + x_2^2 + y_2^2 + z_2^2 - 2\|P_1\| \cdot \|P_2\| \cos\theta;$$
This is true because $\cos^2\theta \leq 1$.
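The inner product, the norm, and the angle formula of Theorem 1 can be tried out on particular vectors. This is a minimal sketch; the helper names `inner`, `norm`, and `angle_between` are illustrative and do not come from the text.

```python
import math

def inner(p, q):
    """Inner product x1*x2 + y1*y2 + z1*z2 of two triplets."""
    return sum(a * b for a, b in zip(p, q))

def norm(p):
    """The norm of P, defined as the square root of P . P."""
    return math.sqrt(inner(p, p))

def angle_between(p, q):
    """Solve P1 . P2 = ||P1|| ||P2|| cos(theta) for theta (both vectors nonzero)."""
    cosine = inner(p, q) / (norm(p) * norm(q))
    cosine = max(-1.0, min(1.0, cosine))   # guard against rounding just outside [-1, 1]
    return math.acos(cosine)

p1 = (1.0, 1.0, 0.0)
p2 = (1.0, 0.0, 0.0)
theta = angle_between(p1, p2)
print(math.degrees(theta))
```

Since the cosine of any angle lies between -1 and 1, the computed inner product can never exceed the product of the norms in absolute value.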
Following the pattern of Section 9.7, we let
i = (1,0,0), j = (0, 1, 0), k = (0, 0, 1),
so that for each P = (x, y, z) we have
P =xi + yj + zk.
In general, if $V = \alpha_1V_1 + \alpha_2V_2 + \cdots + \alpha_nV_n$, for some scalars $\alpha_1, \alpha_2, \ldots, \alpha_n$,
then V is a linear combination of the vectors $V_1, V_2, \ldots, V_n$. Thus every vector in $\mathbf{R}^3$
is a linear combination of i, j, and k.
Definition. A set $\{V_1, V_2, \ldots, V_n\}$ of vectors spans the vector space $\mathcal{V}$ if every V in $\mathcal{V}$
is a linear combination of the $V_i$'s. (Thus {i, j, k} spans $\mathbf{R}^3$.)
A set $\{V_1, V_2, \ldots, V_n\}$ is linearly dependent if there are scalars $\alpha_1, \alpha_2, \ldots, \alpha_n$,
not all equal to zero, such that
$$\alpha_1V_1 + \alpha_2V_2 + \cdots + \alpha_nV_n = 0.$$
Thus, in $\mathbf{R}^3$, every set of vectors of the form {P, i, j, k} is linearly dependent, because
for P = (x, y, z), we have P = xi + yj + zk, and so
$$P - xi - yj - zk = 0.$$
Here $\alpha_1 = 1$, $\alpha_2 = -x$, $\alpha_3 = -y$, and $\alpha_4 = -z$; and the numbers $\alpha_i$ are not all
equal to 0, because $\alpha_1 = 1$.
A set of vectors is linearly independent if it is not linearly dependent. Thus
$\{V_1, V_2, \ldots, V_n\}$ is linearly independent if
spans $\mathcal{V}$ and (2) the set is linearly independent. Thus we have:
Theorem 3. {i, j, k} is a basis for $\mathbf{R}^3$.
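Linear dependence of a finite set of triplets can be tested by machine. The sketch below uses Gaussian elimination with exact fractions, which is a standard technique but not one the text itself uses; the names `rank` and `is_linearly_independent` are illustrative.

```python
from fractions import Fraction

def rank(vectors):
    """Number of linearly independent vectors, found by row reduction."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    columns = len(rows[0]) if rows else 0
    r = 0
    for col in range(columns):
        pivot = next((k for k in range(r, len(rows)) if rows[k][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for k in range(len(rows)):
            if k != r and rows[k][col] != 0:
                factor = rows[k][col] / rows[r][col]
                rows[k] = [a - factor * b for a, b in zip(rows[k], rows[r])]
        r += 1
    return r

def is_linearly_independent(vectors):
    # The set is independent exactly when no vector is lost in the reduction.
    return rank(vectors) == len(vectors)

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
P = (2, 3, 5)
print(is_linearly_independent([i, j, k]))      # the basis of Theorem 3
print(is_linearly_independent([P, i, j, k]))   # a set of the form {P, i, j, k}
```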
Obviously the points of the xy-plane form a vector space in themselves; in fact,
this is the vector space that was discussed in Section 9.7. Indeed, all three of the
coordinate planes
form vector spaces. Such sets are called subspaces of $\mathbf{R}^3$. More generally:
Definition. Given a vector space $\mathcal{V}$ and a subset $\mathcal{V}'$. If $\mathcal{V}'$ also forms a vector space
(under the same definitions of addition and scalar multiplication), then $\mathcal{V}'$ is called
a subspace of $\mathcal{V}$.
Thus a subspace must satisfy all of the vector laws. But this is not as tedious to
check as one might think, because of the following theorem.
Theorem 4. Let $\mathcal{V}'$ be a subset of the vector space $\mathcal{V}$. If $\mathcal{V}'$ is closed under addition
and scalar multiplication, then $\mathcal{V}'$ is a subspace of $\mathcal{V}$.
Proof. Many of the laws can be checked all at once. Since A.1 and A.4 hold for all
vectors in $\mathcal{V}$, they automatically hold in $\mathcal{V}'$. The same is true for M.1 through M.6
and S.1 through S.5. Therefore the only things remaining to verify are A.2 and A.3.
$$[1 + (-1)]P = 1 \cdot P + (-1)P.$$
then
$$P_1 + P_2 = (x_1 + x_2)i + (y_1 + y_2)j,$$
so that the set $E_{xy}$ of all linear combinations of i and j is closed under addition.
Similarly, $\alpha P_1 = \alpha x_1 i + \alpha y_1 j$, and so $E_{xy}$ is closed under scalar multiplication.
Similarly for $E_{yz}$ and $E_{xz}$. In fact, a more general result holds:
Proof. Let E be such a plane. Then E is the graph of an equation of the form
By addition,
$$A(x_1 + x_2) + B(y_1 + y_2) + C(z_1 + z_2) = 0;$$
and this means that $P_1 + P_2$ is in E. Similarly, for every real number $\alpha$,
There is, however, a much better way to get this result, using vector-space
methods instead of using the results of the preceding section. Given the plane E
through 0, let
P0 = (A , B, C)
be any vector such that the line through O and $P_0$ is perpendicular to E. Then for
each $P \neq O$ in E, $\overrightarrow{OP}$ and $\overrightarrow{OP_0}$ are perpendicular, and so
$$P_0 \cdot P = \|P_0\| \cdot \|P\| \cos(\pi/2) = 0.$$
In coordinates, this says that
$$Ax + By + Cz = 0,$$
which we already knew. But when we describe E by a vector equation, using the
inner product, this suggests the following theorem:
Theorem 6. Let $\mathcal{V}$ be any inner-product space; let $V_0$ be any vector in $\mathcal{V}$, different
from 0; and let
$$\mathcal{V}' = \{V \mid V_0 \cdot V = 0\}.$$
Then $\mathcal{V}'$ is a subspace of $\mathcal{V}$.
Proof. We need to show that $\mathcal{V}'$ satisfies CA and CSM. If $V_1$ and $V_2$ are in $\mathcal{V}'$, then
$$V_0 \cdot V_1 = 0 = V_0 \cdot V_2.$$
Therefore
$$V_0 \cdot (V_1 + V_2) = V_0 \cdot V_1 + V_0 \cdot V_2,$$
by S.3. Therefore $V_0 \cdot (V_1 + V_2) = 0$, and $V_1 + V_2$ is in $\mathcal{V}'$. Similarly,
If you rewrite these formulas, in the forms that they take when $V_0$, $V_1$, and $V_2$
are vectors in R3, with
V0 = (A, B, C),
you will find that you are simply copying the proof of Theorem 5. (This is worth
going through, to see how it works.) Thus it may seem that nothing is new in Theorem
6 except the notation. But this is not true, because Theorem 6 and its proof work in
every vector space, including spaces of four dimensions, spaces of functions, and so on.
Thus when we proved Theorem 6, we found that the method used in proving Theorem
5 had nothing to do with any special properties of R3; it depended only on the inner
product space laws. From now on, easy generalizations of this kind will occur often.
We shall treat the vector laws (or the inner-product space laws) as basic assumptions,
like postulates in geometry, and any theorems that we derive from them will be
known to hold in every vector space (or every inner-product space).
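To see Theorem 6 at work outside of three-space, one can try it on quadruplets of numbers. The following is a minimal sketch with made-up sample vectors; the helper names are illustrative, and integer entries are used so that the tests for zero are exact.

```python
def inner(u, v):
    """Inner product of two n-tuples, by the same pattern as in three-space."""
    return sum(a * b for a, b in zip(u, v))

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def scale(alpha, v):
    return tuple(alpha * a for a in v)

v0 = (1, 2, 3, 4)                 # the fixed nonzero vector V0

def in_subspace(v):
    """Membership test for the set {V | V0 . V = 0}."""
    return inner(v0, v) == 0

v1 = (2, -1, 0, 0)
v2 = (0, 0, 4, -3)
print(in_subspace(v1), in_subspace(v2))
print(in_subspace(add(v1, v2)))       # closure under addition (CA)
print(in_subspace(scale(7, v1)))      # closure under scalar multiplication (CSM)
```

Nothing in the code refers to the number of coordinates, which reflects the point made above: the argument depends only on the inner-product laws.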
If a plane E does not contain the origin, then it never forms a subspace of $\mathbf{R}^3$,
because it does not contain 0. But we can still write a vector equation for E, because
$$P_0 \cdot P = a.$$
Each of the following is the equation of a plane. Convert each of them to the form
$P_0 \cdot P = a$, giving the value of a that you are using and the coordinates A, B, C of $P_0$.
1. z = x + y          2. z = x - y
3. z = -x - y         4. x = 3y - 4z
5. y = 4z - 3x        6. z = 4x + 3y
7. z = 1              8. x = 4 -
9. x/1 + y/2 + z/3 = 4        10. x/2 - y/4 + z/3 = 2
11. Let $V_1 = i + j$, $V_2 = j + k$, $V_3 = k$. Show that each of the basis vectors i, j, k is a
linear combination of $V_1$, $V_2$, $V_3$.
12. Now show that $\{V_1, V_2, V_3\}$ spans $\mathbf{R}^3$.
13. Now show that $\{V_1, V_2, V_3\}$ is linearly independent. (By definition, this means that
$\alpha_1V_1 + \alpha_2V_2 + \alpha_3V_3 = 0 \Rightarrow \alpha_1 = \alpha_2 = \alpha_3 = 0$. Problems 12 and 13, in combination,
Show that the following hold, in any inner-product space. (Each of them should be
derived from the inner-product space laws, with a reason given for each step .)
16. $V_1 \cdot (\alpha V_2) = \alpha(V_1 \cdot V_2)$
21. $(P + Q) \cdot (R + S) = P \cdot R + Q \cdot R + P \cdot S + Q \cdot S$ (Here P, Q, R, S are vectors,
of course.)
22. $(P + Q) \cdot (P - R) = P \cdot P + Q \cdot P - R \cdot P - Q \cdot R$
23. $(P + Q) \cdot (P - Q) = P \cdot P - Q \cdot Q$
24. $(P + Q) \cdot (P + Q) = P \cdot P + 2(P \cdot Q) + Q \cdot Q$
25. $(P - Q) \cdot (P - Q) = P \cdot P - 2(P \cdot Q) + Q \cdot Q$
26. $(P - Q) \cdot (Q - P) = -P \cdot P + 2(P \cdot Q) - Q \cdot Q$
27. Is it true that in $\mathbf{R}^3$, $(P \cdot Q)R = (Q \cdot R)P$? (Each side of this equation has a meaning,
28. Is the following true in any inner-product space? Theorem (?) If $P \cdot Q = 0$ for every
orthogonal if every two (different) vectors in the set are orthogonal. Verify that {i, j, k}
is an orthogonal set.
37. A set $\{V_1, V_2, \ldots, V_n\}$ is orthonormal if (1) the set is orthogonal, and (2) $\|V_i\| = 1$ for
each i. (Thus the basis {i, j, k} for $\mathbf{R}^3$ is orthonormal.) Given that $\{V_1, V_2, V_3\}$ is
orthonormal, express the inner product
��+��+��·��+��+��
in the simplest possible form.
*38. Let $\mathbf{R}^4$ be the set of all quadruplets P = (w, x, y, z) of real numbers. The set $\mathbf{R}^4$ forms
an inner-product space, under the obvious definitions of sum, scalar product, and inner
product. As always, $\|P\| = \sqrt{P \cdot P}$. Show that $(P_1 \cdot P_2)^2 \leq \|P_1\|^2 \cdot \|P_2\|^2$.
*39. Let P be the set of all polynomials with real coefficients. For
$$V = \sum_{i=0}^{n} a_ix^i, \quad W = \sum_{i=0}^{n} b_ix^i,$$
we define
$$V + W = \sum_{i=0}^{n} (a_i + b_i)x^i,$$
$$\alpha V = \sum_{i=0}^{n} \alpha a_ix^i,$$
$$V \cdot W = \sum_{i=0}^{n} a_ib_i.$$
Show that
$$(V \cdot W)^2 \leq \|V\|^2 \cdot \|W\|^2.$$
40. Does the vector space P of Problem 39 have a finite basis? If so, describe such a basis.
If not, explain why no such basis exists.
For each positive integer n, let $\mathbf{R}^n$ be the set of all n-tuples of real numbers. Thus
$$\mathbf{R}^n = \{(x_1, x_2, \ldots, x_n) \mid x_i \in \mathbf{R}\},$$
and $\mathbf{R}^n$ forms an inner-product space, under the obvious definitions of sum, scalar
product, and inner product. $\mathbf{R}^n$ is called Cartesian n-space. Let
$$B_n = \{E_1, E_2, \ldots, E_n\},$$
where
$$E_1 = (1, 0, 0, \ldots, 0), \quad E_2 = (0, 1, 0, \ldots, 0), \quad \ldots,$$
(In general, $E_i$ has 1 in the ith position, and 0's everywhere else.) The vectors $E_i$ span
the space $\mathbf{R}^n$, with
$$(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} x_iE_i.$$
They are linearly independent, since
$$\sum_{i=1}^{n} x_iE_i = 0 \Rightarrow (x_1, x_2, \ldots, x_n) = (0, 0, \ldots, 0)$$
is a subspace, and forms a plane, and so on, for any subset of Bn. But Rn has
many subspaces which are not obtainable in this way. For example, we have found
that in R3, any line or plane through the origin forms a subspace. To investigate
these other subspaces, we need to use bases other than the obvious basis Bn. Our
investigation of other bases will also be useful in other connections.
The key to the theory of bases is the following theorem.
Theorem 1. Let $\mathcal{V}$ be a vector space. Let A be a set of m vectors, and let B be a set
of n vectors, such that (1) A is linearly independent, and (2) B spans $\mathcal{V}$. Then $\mathcal{V}$ is
spanned by a set C consisting of (a) all the elements of A, and (b) exactly n - m
of the elements of B.
For example, we might have
$$\mathcal{V} = \mathbf{R}^3,$$
This has the desired form, using all elements of A and 3 - 2 = 1 element of B. To
see that C spans $\mathbf{R}^3$, we observe that
Therefore every basis element $E_1$, $E_2$, $E_3$ is a linear combination of the elements of C.
Therefore every vector in $\mathbf{R}^3$ is a linear combination of the elements of C, and C
spans $\mathbf{R}^3$.
We proceed to the general proof. First we list the elements of A in such a way
that the vectors which also belong to B come first. Thus
where $A_1, A_2, \ldots, A_i$ belong also to B, but $A_{i+1}, \ldots, A_m$ do not. Then B can be
described in the form
Here it cannot be true that all the numbers $\beta_j$ are equal to zero, because if so it
would follow that A is linearly dependent. It is a matter of notation, therefore, to
suppose that $\beta_1 \neq 0$. We can therefore solve for $B_1$, in the above equation, getting
$$B_1 = \frac{1}{\beta_1}(A_{i+1} - \alpha_1A_1 - \alpha_2A_2 - \cdots - \alpha_iA_i - \beta_2B_2 - \cdots - \beta_{n-i}B_{n-i}).$$
Now let
$$B' = \{A_1, A_2, \ldots, A_i, A_{i+1}, B_2, \ldots, B_{n-i}\}.$$
Every element of B (including $B_1$) is a linear combination of the elements of B'; and
B spans $\mathcal{V}$. Therefore B' spans $\mathcal{V}$. In m - i steps of this type, we get the desired
set C.
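The replacement process in this proof can be imitated by a short program. The sketch below is only an illustration under stated assumptions: vectors are tuples of integers or fractions, B is assumed to span the space, and the coefficients are found by Gaussian elimination, which the text does not use. The names `solve_combination` and `exchange` are hypothetical.

```python
from fractions import Fraction

def solve_combination(target, vectors):
    """Return coefficients c with sum(c[k] * vectors[k]) == target, or None."""
    dim, count = len(target), len(vectors)
    # One row per coordinate; one column per vector, plus the target column.
    rows = [[Fraction(v[r]) for v in vectors] + [Fraction(target[r])]
            for r in range(dim)]
    pivot_cols, r = [], 0
    for col in range(count):
        p = next((k for k in range(r, dim) if rows[k][col] != 0), None)
        if p is None:
            continue                      # free column: its coefficient stays 0
        rows[r], rows[p] = rows[p], rows[r]
        lead = rows[r][col]
        rows[r] = [x / lead for x in rows[r]]
        for k in range(dim):
            if k != r and rows[k][col] != 0:
                factor = rows[k][col]
                rows[k] = [x - factor * y for x, y in zip(rows[k], rows[r])]
        pivot_cols.append(col)
        r += 1
    if any(row[-1] != 0 for row in rows[r:]):
        return None                       # target is not in the span
    coeffs = [Fraction(0)] * count
    for k, col in enumerate(pivot_cols):
        coeffs[col] = rows[k][-1]
    return coeffs

def exchange(A, B):
    """Replace elements of the spanning set B, one at a time, by elements of A."""
    current = [tuple(b) for b in B]
    A = [tuple(a) for a in A]
    placed = {a for a in A if a in current}   # elements of A already in B
    for a in A:
        if a in placed:
            continue
        coeffs = solve_combination(a, current)
        # Since A is independent, some element outside A has a nonzero coefficient.
        k = next(k for k, c in enumerate(coeffs)
                 if c != 0 and current[k] not in placed)
        current[k] = a                    # solve for that element and replace it
        placed.add(a)
    return current

E1, E2, E3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)
C = exchange([(1, 1, 0), (0, 1, 1)], [E1, E2, E3])
print(C)
```

The resulting list has as many elements as B, and contains every element of A, which is the content of the theorem.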
Let us now check to see how this general scheme of proof applies to the above
example. We had
$$A = \{A_1, A_2\} = \{E_1 + E_2, E_2 + E_3\},$$
$$B = \{B_1, B_2, B_3\} = \{E_1, E_2, E_3\}.$$
Here m = 2, n = 3, and at the outset, i = 0. Also $A_1$ is a linear combination of the
elements of B, with
$$A_1 = E_1 + E_2.$$
This equation can be solved for $E_1$, giving $E_1$ as a linear combination of $A_1$ and $E_2$.
Therefore we can replace $E_1$ by $A_1$ in B, getting
$$B' = \{A_1, B_2, B_3\} = \{E_1 + E_2, E_2, E_3\}.$$
This completes step 1. Next we express $A_2$ as a linear combination
$$A_2 = E_2 + E_3.$$
This equation can be solved for $E_2$, giving $E_2$ as a linear combination of $A_2$ and $E_3$.
Therefore we can replace $E_2$ by $A_2$ in B', getting
$$\{A_1, A_2, B_3\} = \{E_1 + E_2, E_2 + E_3, E_3\}.$$
Theorem 2. Let $\mathcal{V}$ be a vector space. Let A be a set of m vectors, and let B be a
set of n vectors, such that (1) A is linearly independent, and (2) B spans $\mathcal{V}$. Then
$$m \leq n.$$
This follows from Theorem 1, because C has n elements, and contains all of A.
Theorem 3. If a vector space $\mathcal{V}$ has a basis with n elements, then every basis for $\mathcal{V}$
has exactly n elements.
Proof. Let B be a basis with n elements, and let A be any other basis, with m elements.
Then A is linearly independent, and B spans $\mathcal{V}$. By Theorem 2, $m \leq n$. But we also
know that B is linearly independent, and A spans $\mathcal{V}$. By Theorem 2, $n \leq m$. Therefore
n = m, which was to be proved.
If you review the conditions for a vector space, you will see that they are all
satisfied in the trivial case where $\mathcal{V}$ contains a zero vector 0 and nothing else. In
this case we define $\dim \mathcal{V} = 0$. That is,
$$\dim \{0\} = 0.$$
(Here the empty set is being regarded as a "basis" for {O}.) In a way it is a nuisance
to allow this case, but to rule it out would lead to worse nuisances in the long run.
Proof Let B be a basis, with n elements, and let A be any set of vectors, with m
elements, with m > n. If A were linearly independent, this would contradict Theorem
2. Therefore A is linearly dependent, which was to be proved.
B is a basis. If not, there is a vector $V_{n+1}$ which is not a linear combination of elements
of B. It follows that the larger set
Theorem 6. Let $\mathcal{V}$ be a vector space, and let $B = \{V_1, V_2, \ldots, V_m\}$ be a set which
spans $\mathcal{V}$. Then B contains a basis for $\mathcal{V}$.
Proof. Any set which spans $\mathcal{V}$ contains a basis, and every basis has n elements.
Theorem 8. Let $\mathcal{V}$ be an n-dimensional vector space, and let $\mathcal{V}'$ be a subspace of $\mathcal{V}$.
Then $\mathcal{V}'$ is finite-dimensional, and $\dim \mathcal{V}' \leq n$.
Proof. Let m be the largest number for which it is true that $\mathcal{V}'$ contains a linearly
independent set of m vectors. (By Theorem 4, there is such a largest number m, and
$m \leq n$.) Let
$$B = \{V_1, V_2, \ldots, V_m\}$$
be a linearly independent set in $\mathcal{V}'$. We assert that B spans $\mathcal{V}'$. (Proof. If not, there
is a vector $V_{m+1}$ in $\mathcal{V}'$ which is not a linear combination of elements of B, and it
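If
$$W = (y_1, y_2, \ldots, y_n) = \sum_{i=1}^{n} y_iE_i,$$
then
$$V \cdot W = \sum_{i=1}^{n} x_iy_i.$$
Thus, for linear combinations of the $E_i$'s, we have a simple formula for the inner
product:
$$\Big(\sum_{i=1}^{n} x_iE_i\Big) \cdot \Big(\sum_{i=1}^{n} y_iE_i\Big) = \sum_{i=1}^{n} x_iy_i.$$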
11.5 Orthonormal Bases 531
This formula does not hold for all bases. For example, the set
$$B = \{V_1, V_2, \ldots, V_n\}$$
is orthogonal if
$$V_i \cdot V_j = 0 \quad \text{for } i \neq j.$$
Thus $B_n = \{E_1, E_2, \ldots, E_n\}$ is an orthogonal set, but $\{E_1, E_1 + E_2\}$ is not, because
$$E_1 \cdot (E_1 + E_2) = 1 + 0 = 1 \neq 0.$$
If
$$\|V_i\| = 1 \quad \text{for each } i,$$
then $B = \{V_1, V_2, \ldots, V_n\}$ is normal. If B is both orthogonal and normal, then B is
orthonormal. Thus $B_n$ is orthonormal. Since $\|V_i\|^2 = V_i \cdot V_i$, we note that B is
orthonormal if and only if
$$V_i \cdot V_j = \begin{cases} 0 & \text{for } i \neq j, \\ 1 & \text{for } i = j. \end{cases}$$
Theorem 1. Every finite-dimensional inner-product space has an orthonormal basis.
In the sum on the right, the only nonzero term is $W_k \cdot W_k$, because W is an orthogonal
$$W_k \cdot V'_{n+1} = W_k \cdot V_{n+1} - a_k = a_k - a_k = 0.$$
Then
and
Therefore the set $\{W_1, W_2, \ldots, W_n, W_{n+1}\}$ forms an orthonormal basis.
Note that the pattern of this proof supplies us with a method of actually finding
an orthonormal basis, starting with a basis which is not necessarily orthonormal.
The proof gives a scheme for "orthonormalizing" a given basis, a step at a time.
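For example, in $\mathbf{R}^3$ let $\mathcal{V}$ be the subspace spanned by
$$B = \{V_1, V_2\} = \{E_1 + E_2, E_2 + E_3\}.$$
Then B is a basis for $\mathcal{V}$, but is neither orthogonal nor normal. We can get an orthonormal
basis for $\mathcal{V}$ by following the pattern of the proof of Theorem 1.
1) Let
$$W_1 = \frac{V_1}{\|V_1\|} = \frac{1}{\sqrt{2}}(E_1 + E_2).$$
Then $\|W_1\| = 1$.
Let
$$V_2' = V_2 - a_1W_1 = (E_2 + E_3) - \tfrac{1}{2}(E_1 + E_2) = -\tfrac{1}{2}E_1 + \tfrac{1}{2}E_2 + E_3,$$
where $a_1 = W_1 \cdot V_2 = 1/\sqrt{2}$.
The theory predicts that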
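$$W_1 \cdot V_2' = \frac{1}{\sqrt{2}}(E_1 + E_2) \cdot (-\tfrac{1}{2}E_1 + \tfrac{1}{2}E_2 + E_3) = \frac{1}{\sqrt{2}}(-\tfrac{1}{2} + \tfrac{1}{2} + 0) = 0.$$
Let
$$W_2 = V_2'/\|V_2'\|.$$
Since
$$\|V_2'\|^2 = V_2' \cdot V_2' = \tfrac{1}{4} + \tfrac{1}{4} + 1 = \tfrac{3}{2},$$
we have
$$\frac{1}{\|V_2'\|} = \sqrt{\tfrac{2}{3}},$$
and
$$W_2 = -\frac{1}{\sqrt{6}}E_1 + \frac{1}{\sqrt{6}}E_2 + \sqrt{\tfrac{2}{3}}\,E_3,$$
so that
$$\|W_2\|^2 = \tfrac{1}{6} + \tfrac{1}{6} + \tfrac{2}{3} = 1,$$
and $\|W_2\| = 1$, as it should be. Now $\{W_1, W_2\}$ is an orthonormal basis.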
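The step-at-a-time scheme used in this example can be written as a short routine. This is a minimal sketch in floating-point arithmetic, assuming the given vectors are linearly independent; the function name `orthonormalize` is illustrative.

```python
import math

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormalize(basis):
    """Orthonormalize a basis one vector at a time, as in the proof of Theorem 1."""
    result = []
    for v in basis:
        w = list(v)
        for u in result:
            a = inner(v, u)                      # the coefficient a_k = W_k . V
            w = [wi - a * ui for wi, ui in zip(w, u)]
        length = math.sqrt(inner(w, w))
        if length == 0:
            raise ValueError("the given vectors are linearly dependent")
        result.append([wi / length for wi in w])
    return result

# The example of the text: V1 = E1 + E2, V2 = E2 + E3.
W1, W2 = orthonormalize([(1, 1, 0), (0, 1, 1)])
print(W1)
print(W2)
print(inner(W1, W2), inner(W2, W2))
```

The printed vectors agree, up to rounding, with the values of W1 and W2 computed by hand above.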
Orthonormal bases are what we need to get a simple formula for the inner
product:
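$$\Big(\sum_{i=1}^{n} \alpha_iV_i\Big) \cdot \Big(\sum_{i=1}^{n} \beta_iV_i\Big) = \sum_{i=1}^{n} \alpha_i\beta_i.$$
Therefore
$$\alpha_iV_i \cdot \sum_{j=1}^{n} \beta_jV_j = \alpha_i\beta_i \quad \text{for each } i,$$
and so
$$\sum_{i=1}^{n} \alpha_iV_i \cdot \sum_{j=1}^{n} \beta_jV_j = \sum_{i=1}^{n} \alpha_i\beta_i.$$
Of course, for inner products of the form $V \cdot V = \|V\|^2$, we have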
Definition. In any inner-product space, the distance between two points P and Q is
$\|P - Q\|$.
The distance between P and Q may be denoted by d(P, Q), or simply PQ. An
orthonormal basis gives us a distance formula.
Proof. We have
$$d(P, Q) = \|P - Q\|, \quad P - Q = \sum_{i=1}^{n} (\alpha_i - \beta_i)V_i,$$
and
$$\|P - Q\|^2 = (P - Q) \cdot (P - Q).$$
Since the basis is orthonormal,
$$\|P - Q\|^2 = \sum_{i=1}^{n} (\alpha_i - \beta_i)^2.$$
5. Same question for $B = \{E_3, E_1 + E_2 + E_3, E_2 + E_3\} = \{V_1, V_2, V_3\}$.
6. In $\mathbf{R}^3$, find an orthonormal basis for $\mathcal{W} = \{V \mid V \cdot (E_1 + E_2 + E_3) = 0\}$.
8. Let $V_1 = 2E_2 + E_3$, $V_2 = 4E_1 + E_4$. Find vectors $V_3$, $V_4$ so that $\{V_1, V_2, V_3, V_4\}$ is
an orthogonal basis of $\mathbf{R}^4$.
9. Given $V_1 = E_1 - E_3 + E_4$, $V_2 = E_1 + E_2 + E_3$, proceed as in Problem 8.
10. In $\mathbf{R}^4$, find an orthonormal basis of the subspace
13. Suppose that $\{V_1, V_2, \ldots, V_n\}$ is orthogonal, but not necessarily orthonormal. Find
a formula for
14. Let $\{V_1, V_2, \ldots, V_n\}$ be an orthogonal set of nonzero vectors. Let $V = \sum \alpha_iV_i$ be any
linear combination of them. Show that
(This means that in an n-dimensional vector space, no orthogonal set of nonzero vectors
can have more than n elements. Thus, for example, in $\mathbf{R}^3$ there is no set of four concurrent
lines, every two of which are perpendicular.)
16. Let $\mathcal{W}$ and $\mathcal{X}$ be subspaces of a vector space $\mathcal{V}$. If every vector in $\mathcal{W}$ is orthogonal to
every vector in $\mathcal{X}$, then $\mathcal{W}$ and $\mathcal{X}$ are orthogonal subspaces, and we write $\mathcal{W} \perp \mathcal{X}$. Show
that if $\mathcal{W} \perp \mathcal{X}$, then $\dim \mathcal{W} + \dim \mathcal{X} \leq \dim \mathcal{V}$. Give an example to show that the
equality does not necessarily hold.
17. The following is a converse of Theorem 2.
Theorem (?) Let $B = \{V_1, V_2, \ldots, V_n\}$ be a basis for $\mathcal{V}$. If for all vectors $V = \sum \alpha_iV_i$,
$W = \sum \beta_iV_i$, we have
then B is orthonormal.
Is this true? Why or why not?
18. Show that if $\{V_1, V_2, \ldots, V_n\}$ is an orthonormal basis for $\mathcal{V}$, then for every V in $\mathcal{V}$,
$$V = \sum_{i=1}^{n} (V \cdot V_i)V_i.$$
That is, for $V = \sum \alpha_iV_i$, we always have $\alpha_i = V \cdot V_i$.
$$PR \leq PQ + QR.$$
The equality holds if the points are collinear, and Q is between P and R; and in every
other case, the strict inequality holds.
We propose to show that in any inner-product space, the same inequality holds
for distances. That is,
$$d(P, R) \leq d(P, Q) + d(Q, R), \tag{1}$$
for every P, Q, and R. Since distance was defined by the formula
$$|x + y| \leq |x| + |y|,$$
which is known to hold for both real and complex numbers. Obviously any general
proof of (2), for all inner-product spaces, must appeal to the definition of the norm:
$$\|A\| = \sqrt{A \cdot A}, \quad \|A\|^2 = A \cdot A. \tag{3}$$
Therefore the natural first step, in proving (2), is to restate it in terms of the definition
given in (3). In these terms,
$$\iff A \cdot A + 2A \cdot B + B \cdot B \leq A \cdot A + 2\|A\| \cdot \|B\| + B \cdot B$$
Here (4) automatically holds whenever $A \cdot B < 0$. But if (4) always holds, then
we must have
$$|A \cdot B| \leq \|A\| \cdot \|B\|. \tag{5}$$
(If (5) were false, for some A, B, then (4) would also be false, either for A, B or
for -A, B.) And Eq. (5) is obviously equivalent to
Formula (6) is called the Schwarz inequality. We shall now prove it.
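$$\alpha = \frac{1}{\|A\|}, \quad \text{so that } \|\alpha A\| = 1.$$
We then choose
$$\beta = \frac{1}{\alpha A \cdot B}, \quad \text{so that } \alpha A \cdot \beta B = 1.$$
$$(P - Q) \cdot (P - Q) \geq 0$$
$$\Rightarrow P \cdot P - 2P \cdot Q + Q \cdot Q \geq 0$$
$$\Rightarrow 1 - 2 + \|Q\|^2 \geq 0$$
$$\Rightarrow 1 \leq \|Q\|^2.$$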
The theorem follows. In the light of the discussion which led to the Schwarz inequality,
we also have the following:
All these are easy to check, on the basis of the definition $\|A\| = \sqrt{A \cdot A}$, the vector
laws, and the Schwarz inequality. For distance, we have
D.1. $d(P, Q) \geq 0$.
D.2. $d(P, Q) = 0 \Rightarrow P = Q$.
D.3. $d(P, Q) = d(Q, P)$.
D.4. $d(P, R) \leq d(P, Q) + d(Q, R)$.
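A numerical experiment is no substitute for the proofs, but it is a useful sanity check on the Schwarz inequality and on D.4. The sketch below tries both on randomly chosen vectors in four-space; the dimension, the sample size, the seed, and the small tolerance for rounding error are all arbitrary choices made for this illustration.

```python
import math
import random

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(inner(u, u))

def distance(p, q):
    """d(P, Q) = ||P - Q||."""
    return norm([a - b for a, b in zip(p, q)])

rng = random.Random(0)            # fixed seed, so that the run is repeatable
tolerance = 1e-9                  # allowance for floating-point rounding

def random_vector(n):
    return [rng.uniform(-10.0, 10.0) for _ in range(n)]

violations = 0
for _ in range(1000):
    a, b, c = random_vector(4), random_vector(4), random_vector(4)
    schwarz_ok = norm(a) * norm(b) + tolerance >= abs(inner(a, b))
    triangle_ok = distance(a, b) + distance(b, c) + tolerance >= distance(a, c)
    if not (schwarz_ok and triangle_ok):
        violations += 1
print(violations)
```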
On this basis, we shall define various types of mathematical systems which are
more general than inner-product spaces.
Addition and scalar multiplication are defined in the obvious way. We define
$$= \int_{-1}^{1} |f(x)|\,dx + \int_{-1}^{1} |g(x)|\,dx$$
$$= \|f\|_1 + \|g\|_1.$$
It should be emphasized that a normed vector space is not merely a vector space
in which a norm can be defined, but rather a linear space in which a norm has been
11.6 The Schwarz Inequality. More General Concepts of Norm and Distance 539
defined. Thus, in Examples 1 and 2 we defined two different norms $\|\ \|_u$ and $\|\ \|_1$
in the same vector space $[\mathcal{L}_1, +, \mathrm{sm}]$; and this gave us two different normed vector
spaces
$$[\mathcal{L}_1, +, \mathrm{sm}, \|\ \|_u], \quad [\mathcal{L}_1, +, \mathrm{sm}, \|\ \|_1].$$
In any normed vector space, we can define distance by means of the formula
Thus a metric space is a pair [S,d ], where d is a metric for S. It is evident that
metric spaces can arise in ways that have very little to do with vector spaces or with
norms. For example, S may be the surface of a sphere in R3; that is,
S = {(x, y, z) I x2 + y2 + z2 = 1},
and for each pair of points P, Q on S, the distance d(P, Q) may be the length of the
shortest arc on S, joining P and Q. It is not hard to see that this system forms a
metric space, that is, d satisfies D.1 through D.4. In fact, this is the metric space used
in navigation on the open sea, with arc length measured in nautical miles.
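The arc-length metric on the unit sphere is easy to compute: for unit vectors P and Q, the shortest arc joining them has length equal to the angle between them. The following is a minimal sketch; the helper `on_sphere`, which builds a point from a latitude and longitude in degrees, is an assumption made for this illustration, and distances are in radians on a sphere of radius 1.

```python
import math

def on_sphere(latitude_deg, longitude_deg):
    """Point of the unit sphere with the given latitude and longitude."""
    lat = math.radians(latitude_deg)
    lon = math.radians(longitude_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def arc_distance(p, q):
    """Length of the shortest arc on the unit sphere joining P and Q."""
    dot = sum(a * b for a, b in zip(p, q))
    dot = max(-1.0, min(1.0, dot))       # guard against rounding outside [-1, 1]
    return math.acos(dot)

P = on_sphere(0.0, 0.0)      # on the equator
Q = on_sphere(0.0, 90.0)     # on the equator, a quarter turn away
R = on_sphere(90.0, 0.0)     # the north pole
print(arc_distance(P, Q), arc_distance(Q, R), arc_distance(P, R))
```

For these three points each pair is a quarter of a great circle apart, so D.4 holds with room to spare.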
1. Show, by any method, that for any pair of pairs of real numbers $(x_1, x_2)$ and $(y_1, y_2)$
we have $(x_1^2 + x_2^2)(y_1^2 + y_2^2) \geq (x_1y_1 + x_2y_2)^2$.
2. Show, by any method, that for every pair of finite sequences $x_1, x_2, \ldots, x_n$, $y_1, y_2, \ldots,
y_n$ of real numbers,
3. Let E be a coordinate plane and, for each P = (x, y) and Q = (a, b), let
$$d(P, Q) = (x - a)^2 + (y - b)^2.$$
Thus d is the square of the usual distance. Does [E, d] form a metric space?
4. Same question, for $d(P, Q) = |x - a| + |y - b|$.
6. If [S, d] is a metric space and k > 0, does it follow that [S, kd] is a metric space?
7. The real number system R clearly forms a vector space. For each x in R, let $\|x\| = \sqrt{|x|}$.
Does this give a normed vector space?
8. Let S be the set of all airline passenger terminals in the world; and for each P, Q in S
let d(P, Q) be the minimum number of hours required to get from P to Q by a combination
of regularly scheduled flights. Is [S, d] a metric space?
9. Let $\mathcal{L}_1$ be the set of all continuous functions on the interval [-1, 1], as in Example 1
of the text. For each f, g in $\mathcal{L}_1$, let
$$d(f, g) = \int_{-1}^{1} \frac{|f(x) - g(x)|}{1 + |f(x) + g(x)|}\,dx.$$
(This is why the norm defined in Example 1 is called the uniform norm.)
11. Let $\mathcal{L}_1$ be the same as in Problem 10, with the norm
$$\|f\|_1 = \int_{-1}^{1} |f(x)|\,dx,$$
$$\lim_{n \to \infty} \|f_n\|_1 = 0 \Rightarrow \operatorname{U\,lim}_{n \to \infty} f_n = 0?$$
Is it true that
$$\operatorname{U\,lim}_{n \to \infty} f_n = 0 \Rightarrow \lim_{n \to \infty} \|f_n\|_1 = 0?$$
12. Let C0$[-\pi, \pi]$ be the set of all continuous functions f on the interval $[-\pi, \pi]$, with +
and sm defined as usual. Set $f \cdot g = \int_{-\pi}^{\pi} f(x)g(x)\,dx$, and verify that C0$[-\pi, \pi]$, with
on the interval $[-\pi, \pi]$. (Such functions are called trigonometric polynomials.)
Evidently $T_n$ forms a subspace of C0$[-\pi, \pi]$, and the set
spans $T_n$. Show that (a) B is orthogonal, and (b) B is a basis for $T_n$. Then find an
orthonormal basis for $T_n$. [Warning: This one is long. It is easier if you note that in
(a) you need not necessarily compute indefinite integrals; what you need, in each case,
is the definite integral, from $-\pi$ to $\pi$. The identities
$$\cos(A + B) - \cos(A - B) = -2\sin A\sin B,$$
are also useful.] Problem 15 of Problem Set 11.5 is useful at one stage.
12 Fourier Series
The idea of a projection is taken from elementary geometry. Let E be a plane in $\mathbf{R}^3$,
and let P be a point. To suit the terms of our later discussion, suppose that E passes
through the origin. Then the projection of P into E is the point Q which is the foot of
the perpendicular from P to E. (If P is in E, then the projection of P is P.) The following
facts are well known:
1) The line $\overleftrightarrow{PQ}$, through P and Q, is perpendicular to every line in E that contains Q.
(In fact, this is the definition of the statement that $\overleftrightarrow{PQ} \perp E$.)
2) If P is not in E, then there is one and only one point Q in E, satisfying (1).
3) If R is in E, then
$$(PQ)^2 + (QR)^2 = (PR)^2.$$
It follows immediately that:
542 Fourier Series 12.1
3) If R is in E, then
$$\|P - Q\|^2 + \|Q - R\|^2 = \|P - R\|^2.$$
Therefore
Theorem 1. Let $\mathcal{V}$ be any vector space, let $\mathcal{W}$ be any finite-dimensional subspace,
and let P be any vector in $\mathcal{V}$. Then there is one and only one vector Q in $\mathcal{W}$ such that
Proof. The easy part is to show that there is only one such Q. If
$$(P - Q) \cdot S = 0 \quad \text{and} \quad (P - Q') \cdot S = 0,$$
$$[(P - Q) - (P - Q')] \cdot S = 0,$$
$$(Q' - Q) \cdot S = 0,$$
$$(P - Q) \cdot E_1 = (P - Q) \cdot E_2 = 0,$$
and so
$$(P - Q) \cdot S = (P - Q) \cdot (x_1E_1 + x_2E_2) = 0$$
and let
$$Q = \sum_{i=1}^{n} \alpha_iW_i.$$
Then
$$Q \cdot W_j = \sum_{i=1}^{n} \alpha_iW_i \cdot W_j = \alpha_j\|W_j\|^2 = \alpha_j,$$
and so
$$(P - Q) \cdot W_j = P \cdot W_j - Q \cdot W_j = \alpha_j - \alpha_j = 0,$$
for each j. Therefore for every
$$S = \sum_{j=1}^{n} \beta_jW_j \in \mathcal{W},$$
we have
$$(P - Q) \cdot S = \sum_{j=1}^{n} \beta_j(P - Q) \cdot W_j = 0,$$
which was to be proved.
The point Q is called the projection of P into $\mathcal{W}$, and is denoted by Pr P (or by
$\mathrm{Pr}_{\mathcal{W}} P$, if there is any doubt about the subspace into which we are projecting). To
repeat:
Definition. Let $\mathcal{V}$ be any vector space, and let $\mathcal{W}$ be any finite-dimensional subspace
of $\mathcal{V}$. For each P in $\mathcal{V}$, Pr P (or $\mathrm{Pr}_{\mathcal{W}} P$) is the point Q of $\mathcal{W}$ such that
$$(P - Q) \cdot S = 0 \quad \text{for each } S \text{ in } \mathcal{W}.$$
Theorem 1 tells us that this definition defines something. And one of the ideas
in the proof of Theorem 1 is worth noting for future reference:
Theorem 2. If Pr P is the projection of P into $\mathcal{W}$, and $\{W_1, W_2, \ldots, W_n\}$ is an
orthonormal basis for $\mathcal{W}$, then
$$\mathrm{Pr}\,P = \sum_{i=1}^{n} (P \cdot W_i)W_i.$$
(In the proof of Theorem 1, we found that this sum satisfied the conditions for Q,
and that there is only one such Q.) It remains to show that conditions (3) and (4),
stated at the beginning of this section, hold on the basis of our general definition.
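The formula of Theorem 2 can be tried on a concrete case. The sketch below projects a point of three-space into the plane spanned by the orthonormal pair (1/√2)(E1 + E2) and (1/√6)(−E1 + E2 + 2E3), the pair obtained in the orthonormalization example of Section 11.5; the function name `project` and the choice of the point P are illustrative.

```python
import math

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(p, orthonormal_basis):
    """Pr P = sum of (P . W_i) W_i over an orthonormal basis of the subspace."""
    q = [0.0] * len(p)
    for w in orthonormal_basis:
        coefficient = inner(p, w)
        q = [qi + coefficient * wi for qi, wi in zip(q, w)]
    return q

w1 = (1 / math.sqrt(2), 1 / math.sqrt(2), 0.0)
w2 = (-1 / math.sqrt(6), 1 / math.sqrt(6), 2 / math.sqrt(6))

p = (1.0, 0.0, 0.0)
q = project(p, [w1, w2])
residual = [pi - qi for pi, qi in zip(p, q)]     # the vector P - Q
print(q)
print(inner(residual, w1), inner(residual, w2))  # both should be essentially 0
```

Because P − Q is orthogonal to both basis vectors, it is orthogonal to every vector of the subspace, which is the defining property of the projection.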
because Q R is
= - - -
- the following.
$$= A \cdot A + 2A \cdot B + B \cdot B$$
$$= \|A\|^2 + 0 + \|B\|^2.$$
As in the special case of R3, this immediately gives:
["Y, +, sm, · ],
where + and sm are defined as usual for spaces of functions, and the inner product is
defined by the formula
· g /(x)g(x) dx.
f = f:
For each positive integer n, let Tn be the set of all trigonometric polynomials of order
n, that is, the set of all functions of the form
n
g(x) = a0+ I [a; cos ix + b; sin ix].
i=l
{1; cos x, cos 2x,·. .. , cos nx; sin x, sin 2x, ... , sin nx}.
This set spans Tn. To verify that the set is orthogonal, we need to show that
and
J:11
sin ix sinjx dx =
,,
J:
cos ix cosjx dx = 0 for i ;;i!: j.
All of these answers can be calculated by brute force, but there are tricks that help.
By more straightforward calculations, we get
$$C_0 = \frac{1}{\sqrt{2\pi}},$$
$$C_i = \frac{1}{\sqrt{\pi}} \cos ix \quad (i > 0),$$
$$S_i = \frac{1}{\sqrt{\pi}} \sin ix \quad (i > 0).$$
Now let f be a function in $C^0[-\pi, \pi]$. By Theorem 2, the projection of f into the finite-dimensional subspace $\mathcal{W} = T_n$ is the vector
$$\mathrm{Pr}_n f = \sum_{i=0}^{n} (f \cdot C_i) C_i + \sum_{i=1}^{n} (f \cdot S_i) S_i.$$
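$$f \cdot C_0 = \int_{-\pi}^{\pi} f(x) \cdot \frac{1}{\sqrt{2\pi}}\,dx = \frac{1}{\sqrt{2\pi}} \int_{-\pi}^{\pi} f(x)\,dx,$$
$$f \cdot C_i = \frac{1}{\sqrt{\pi}} \int_{-\pi}^{\pi} f(x) \cos ix\,dx \quad (i > 0),$$
$$a_0 = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\,dx,$$
$$a_i = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos ix\,dx \quad (i > 0),$$
$$b_i = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin ix\,dx \quad (i > 0).$$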
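The coefficient formulas can be evaluated numerically for a concrete function. The sketch below assumes f(x) = x as the test function and uses a hypothetical midpoint-rule helper `integrate`; the function names are illustrative and do not come from the text.

```python
import math

def integrate(g, a, b, steps=2000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def fourier_coefficients(f, n):
    """Return (a0, [a_1..a_n], [b_1..b_n]) for f on [-pi, pi]."""
    a0 = integrate(f, -math.pi, math.pi) / (2 * math.pi)
    a = [integrate(lambda x, i=i: f(x) * math.cos(i * x), -math.pi, math.pi) / math.pi
         for i in range(1, n + 1)]
    b = [integrate(lambda x, i=i: f(x) * math.sin(i * x), -math.pi, math.pi) / math.pi
         for i in range(1, n + 1)]
    return a0, a, b

def projection(f, n):
    """Return Pr_n f as a Python function of x."""
    a0, a, b = fourier_coefficients(f, n)
    def pr(x):
        return a0 + sum(a[i - 1] * math.cos(i * x) + b[i - 1] * math.sin(i * x)
                        for i in range(1, n + 1))
    return pr

# Assumed test function: f(x) = x, an odd function.
a0, a, b = fourier_coefficients(lambda x: x, 3)
```

For this odd function the cosine coefficients come out (numerically) zero, and the sine coefficients are close to 2, -1, 2/3.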
It now seems reasonable to hope that $\mathrm{Pr}_n f$ is in some sense an approximation of f when n is large. That is,
$$(?) \quad n \approx \infty \implies f \approx \mathrm{Pr}_n f \quad (?).$$
If we judge the approximation by observing $\|f - \mathrm{Pr}_n f\|$, it is clear, at least, that we have done our best: Theorem 5 tells us that $\mathrm{Pr}_n f$ is the element g of $T_n$ which minimizes $\|f - g\|$. If our best was good enough, then we should have
A stronger conjecture is that the approximation is good even in the uniform norm:
It may be disturbing, at this stage, to observe that (2) cannot be true as stated:
$\mathrm{Pr}_n f(-\pi) = \mathrm{Pr}_n f(\pi)$, because all trigonometric polynomials have period $2\pi$. Therefore (2) cannot be true unless f has the same property. We shall see, however, that this is the only way that (2) can fail to hold:
Theorem A. If f has period $2\pi$, and f' is continuous, then $\lim_{n\to\infty} \|f - \mathrm{Pr}_n f\|_u = 0$.
tinuous function has a set of Fourier coefficients, and therefore has a Fourier series.
The question is under what conditions we can conclude that the Fourier series converges to the function. This question is complicated, and the situation is not yet
thoroughly understood by anybody; Theorem A, above, is the best of the simple
results.
Meanwhile, the successive projections $\mathrm{Pr}_1 f, \mathrm{Pr}_2 f, \ldots$ have two encouraging properties. First, each projection $\mathrm{Pr}_{n+1} f$ is simply a continuation of the preceding one $\mathrm{Pr}_n f$; to get $\mathrm{Pr}_{n+1} f$ from $\mathrm{Pr}_n f$ we merely add a term of the form
$$a_{n+1} \cos (n + 1)x + b_{n+1} \sin (n + 1)x,$$
leaving the preceding terms unchanged. Second, the error in the approximation $\mathrm{Pr}_n f \approx f$, as measured by $\|f - \mathrm{Pr}_n f\|$, is nonincreasing.
Theorem 6. If f is continuous on $[-\pi, \pi]$, then
$$\|f - \mathrm{Pr}_{n+1} f\| \leq \|f - \mathrm{Pr}_n f\|.$$
(Proof. Since $T_{n+1}$ contains $T_n$, the minimum distance from f to $T_n$ cannot be less than the minimum distance from f to $T_{n+1}$.)
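Theorem 6 can be illustrated numerically. The sketch below assumes the test function f(x) = x, whose Fourier coefficients are the standard values a_i = 0 and b_i = 2(-1)^(i+1)/i; the norm is approximated by a midpoint rule, and all names are hypothetical.

```python
import math

def partial_sum(n, x):
    """Pr_n f for f(x) = x: sine terms only, with b_i = 2(-1)^(i+1)/i."""
    return sum(2 * (-1) ** (i + 1) / i * math.sin(i * x) for i in range(1, n + 1))

def error_norm(n, steps=4000):
    """Approximate ||f - Pr_n f|| on [-pi, pi] by the midpoint rule."""
    h = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        x = -math.pi + (k + 0.5) * h
        total += (x - partial_sum(n, x)) ** 2
    return math.sqrt(total * h)

errors = [error_norm(n) for n in range(1, 7)]
```

The list `errors` should decrease as n grows, in line with the theorem; for this particular f the decrease is strict, because every b_i is different from zero.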
Note that the Fourier series for a function f depends only on the values of f on the interval $[-\pi, \pi]$. Therefore, when we set up the series for f(x) = x, what we are really dealing with is a discontinuous function, with period $2\pi$, whose graph looks like this:
[Figure: graph of the periodic extension of f(x) = x.]
Similarly, when we set up the series for $f(x) = x^2$, the series turns out to represent a periodic function whose graph is obtained by fitting together infinitely many parabolic arcs, like this:
[Figure: graph of the periodic extension of $f(x) = x^2$, with the x-axis marked at $-5\pi, -3\pi, -\pi, \pi, 3\pi, 5\pi$.]
Of course the periodic function that we get from f(x) = x is not continuous. But it turns out that this doesn't matter: if the graph of f is obtained by fitting together a finite number of continuous functions with continuous derivatives, then the Fourier series always converges to a function F; and F(x) = f(x) at every point where f is continuous.
[Figure: graph of such a function f, with a discontinuity at x = 0.]
At points where the "continuous pieces" of the function fail to fit together, as at x = 0 in the figure above, the series makes a compromise, and converges to the average of the lefthand value and the righthand value. Similarly, if $f(-\pi) \neq f(\pi)$, then
$$F(-\pi) = F(\pi) = \tfrac{1}{2}[f(-\pi) + f(\pi)].$$
Thus, for the function f in the preceding figure, the graph of the function F given by
the series looks like the figure below. Here
Throughout this problem set, it should be understood that $a_0, a_1, \ldots, b_1, b_2, \ldots$ are the Fourier coefficients of the function f, and that F is the function to which the series converges. In each case, the graph of f should be sketched; and F should be sketched also, in those cases in which F is different from f.
Compute the Fourier coefficients for each of the following functions.
for each i. Show that this happens for every odd function (f is odd if f(-x) = -f(x) for every x).
13. Similarly, show that if f is even (with f(-x) = f(x)), then the series uses cosines only.
14. Show that for each f in $C^0[-\pi, \pi]$, $\|f\|^2 \leq 2\pi \|f\|_u^2$.
(See Problem 10 of Problem Set 11.6.) Can (2) hold if f is not continuous?
12.2 Uniform Approximations by Trigonometric Polynomials 549
17. Working merely on the basis of the formulas for the Fourier coefficients of a continuous function f, give a geometric plausibility argument for the statement
$$\lim_{n\to\infty} a_n = \lim_{n\to\infty} b_n = 0.$$
(What does the graph of y = cos nx look like, when n is very large? How about the graph of y = f(x) cos nx?)
18. In Theorem 6, under what simple conditions does the equality hold?
The purpose of this section is to show that for every continuous function f, with period $2\pi$, and every $\varepsilon > 0$, there is a trigonometric polynomial
[Figure: graphs of $f + \varepsilon$, $f$, $\varphi$, and $f - \varepsilon$.]
The first clue to this situation is that the trigonometric polynomials form a bigger
system than one might think:
From this it follows immediately that $\sin^2 x, \cos^2 x, \ldots, \sin^n x, \cos^n x$ are trigonometric polynomials. For example,
$$\cos^2 x = \frac{1 + \cos 2x}{2} = \frac{1}{2} + \frac{1}{2} \cos 2x.$$
Consider now the function $g(t) = \cos^{2n} t$, which we now know to be a trigonometric polynomial. If n is large, then $\cos^{2n} t \approx 1$ only when $t \approx 0$; everywhere else, $\cos^{2n} t \approx 0$. Thus the graph looks something like the figure above, on the interval $[-\pi, \pi]$. Let $\delta$ be any number between 0 and $\pi$, and let
-0
i"0
n
Jn=
J -IT
cos2 t dt = cos2n t dt,
K ==
n
f 0
-o
n
·cos2 t dt.
Jn= l " n
cos2 t dt < (7T - o) cos2
n
0.
$$\int \cos^n x\,dx = \frac{1}{n} \cos^{n-1} x \sin x + \frac{n-1}{n} \int \cos^{n-2} x\,dx.$$
(See Problem 31 of Problem Set 6.5) The formula can be derived by integration by
parts. This gives a recursion formula for the definite integral:
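$$\int_{-\pi}^{\pi} \cos^n x\,dx = \frac{n-1}{n} \int_{-\pi}^{\pi} \cos^{n-2} x\,dx.$$
$$I_n = \int_{-\pi}^{\pi} \cos^{2n} x\,dx = \frac{2n-1}{2n} \int_{-\pi}^{\pi} \cos^{2(n-1)} x\,dx = \frac{2n-1}{2n}\, I_{n-1}.$$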
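Therefore
$$I_n = \frac{2n-1}{2n} \cdot \frac{2n-3}{2n-2} \cdot \frac{2n-5}{2n-4} \cdots \frac{1}{2} \int_{-\pi}^{\pi} \cos^0 x\,dx$$
$$= \frac{1 \cdot 3 \cdot 5 \cdots (2n-3)(2n-1)}{2 \cdot 4 \cdot 6 \cdots (2n-2)\,2n} \cdot 2\pi$$
$$= \frac{3}{2} \cdot \frac{5}{4} \cdots \frac{2n-1}{2n-2} \cdot \frac{1}{2n} \cdot 2\pi \geq \frac{\pi}{n}.$$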
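The recursion for I_n and the lower bound pi/n are easy to check numerically. This is a sketch only: `I_quad` approximates the integral of cos^(2n) x over [-pi, pi] by a midpoint rule, and `I_rec` applies the recursion starting from I_0 = 2 pi; both names are hypothetical.

```python
import math

def I_quad(n, steps=4000):
    """Midpoint-rule approximation of the integral of cos^(2n) x over [-pi, pi]."""
    h = 2 * math.pi / steps
    return h * sum(math.cos(-math.pi + (k + 0.5) * h) ** (2 * n) for k in range(steps))

def I_rec(n):
    """I_n from the recursion I_k = (2k - 1)/(2k) * I_(k-1), with I_0 = 2 pi."""
    value = 2 * math.pi
    for k in range(1, n + 1):
        value *= (2 * k - 1) / (2 * k)
    return value
```

The two computations should agree closely, and `I_rec(n)` should exceed pi/n for every n of 2 or more; for n = 1 the two quantities are equal.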
Therefore
('TT - ()) cos2" ()
- < -- n cos2n u.
Jn 'TT - ()
TT/n
Ji
= ·
In
Since $0 < \cos \delta < 1$, it follows that $J_n/I_n \to 0$. [In fact, $\sum_{i=1}^{\infty} (J_i/I_i)$ converges; the easiest way to see this is to recall that $\sum k\,i\,(x^2)^i$ converges for $0 < x < 1$.]
Now let f be any continuous function with period $2\pi$, and for each n let
$$\varphi_n(x) = \frac{\int_{-\pi}^{\pi} f(x + t) \cos^{2n} t\,dt}{\int_{-\pi}^{\pi} \cos^{2n} t\,dt}.$$
1
n
In i{J
� c/>,,(x) :::::: - f(x + t) cos2" t dt;
oo =:;>
1
n
i{J
:::::: oo =:;> cf>n(x) :::::: - f(x) cos2n t dt
In -!J
1
:::::: - f(x)I n = f(x).
In
$$\varphi_n(x) = \frac{1}{I_n} \int_{-\pi}^{\pi} f(x + t) \cos^{2n} t\,dt = \frac{1}{I_n} \int_{-\pi}^{\pi} f(t) \cos^{2n} (t - x)\,dt.$$
(The integrand has period $2\pi$, and so $\int_{-\pi}^{\pi}$ is unchanged if we slide the graph back and forth horizontally.) We know that $\cos^{2n} t$ is a trigonometric polynomial; say,
$$\cos^{2n} t = a_0 + \sum_{i=1}^{m} (a_i \cos it + b_i \sin it).$$
Therefore
m
Therefore
it ix
+ b; sin cos - b; cos it ix).
sin
i� [f:}a; it
+ cos + b; sin it)f(t) dt] ix cos
The coefficients here are complicated, but they are constants, and so the indicated
integral is a trigonometric polynomial.
.J..
�nX - () _
f':_" f (x t)+ cos2n t dt
then
f7:_1l cos2n t dt '
Step 1. Since f is continuous on $[-\pi, \pi]$, f is bounded on $[-\pi, \pi]$. Also f is periodic. Therefore there is an M such that
$$|f(t)| < M \quad \text{for every } t. \quad (1)$$
Step 2. Since f is continuous on $[-2\pi, 2\pi]$, it follows that f is uniformly continuous on $[-2\pi, 2\pi]$. And f is periodic. Therefore there is a $\delta > 0$ such that
L JI(x)J: t dt -J:/(x t) t dt I
=
7
cos2n + cos2n
=
I
1
n
/j + + t)/cos2ntdt
_!_ [
=
I
4MJ + �
n 2
f-o t dt] 0
cos2n
n
J
< 4M · n+ l . .=J" tdt cos2n
I I 2 -1I
n n
J €
= 4M- n+-.
In 2
Therefore there is an N such that
$$n \geq N \implies 4M \frac{J_n}{I_n} < \frac{\varepsilon}{2}.$$
We now have
$$\lim_{n\to\infty} \|f - \mathrm{Pr}_n f\| = 0.$$
Finally, we observe that for each n there is a point $x_n$ at which the error in the proposed approximation $f \approx \mathrm{Pr}_n f$ is actually equal to 0.
Prnf(xn).
Therefore
$$\int_{-\pi}^{\pi} [f(x) - \mathrm{Pr}_n f(x)] \cdot 1\,dx = 0.$$
At first, this theorem may seem almost like a joke, but it isn't. See the following
section.
Would Theorem 3 still have been true? (Either prove the theorem in the more general
form, with odd exponents allowed, or give an example to show that the more general
theorem is false.)
12. Show, by any method, that
[Hint: There is a quick method, on the basis of what you know now.]
13. Show that if f is as in Theorem 5, then
$$\lim_{n\to\infty} \int_{-\pi}^{\pi} |f(x) - \mathrm{Pr}_n f(x)|\,dx = 0.$$
15. Now show that the same result holds on every interval [a, b].
* 16. In Section 10.8 we proved the binomial theorem
(1 + xr = � C)
i
x
i
by the methods of calculus, in the real domain. Thus the proof in Section 10.8 does not
·show that
Let $f(x) = x^2$, on the interval $[-\pi, \pi]$; and extend the graph so as to get a function of period $2\pi$. (See the figure on p. 547.) For this function f, compute the function $\varphi_2(x)$ in the form of a trigonometric polynomial, using definite integrals as coefficients. You need not compute the integrals numerically.
20. Given a trigonometric polynomial
n
form � n?
Why or why not?
*21. Let f be a continuous function on [0, 1]. Show that for every $\varepsilon > 0$ there is a polynomial $p(x) = \sum_{i=0}^{n} a_i x^i$ such that
$$|f(x) - p(x)| < \varepsilon \quad (0 \leq x \leq 1).$$
The ideas suggested by (3) are the key to the situation: to prove convergence, we
first need to find out how the operations Prn are related to differentiation and
integration.
Theorem 1. If f has period $2\pi$, and f' is continuous, then
$$\mathrm{Pr}_n f' = (\mathrm{Pr}_n f)'.$$
That is, the projection of the derivative is the derivative of the projection.
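Proof. Let
$$\mathrm{Pr}_n f = a_0 + \sum_{i=1}^{n} (a_i \cos ix + b_i \sin ix),$$
$$\mathrm{Pr}_n f' = A_0 + \sum_{i=1}^{n} (A_i \cos ix + B_i \sin ix).$$
We need to show that $A_0 = 0$, $A_i = i b_i$, $B_i = -i a_i$. The Fourier coefficients for f are
$$a_i = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos ix\,dx \quad (i > 0),$$
$$b_i = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin ix\,dx.$$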
Similarly,
$$A_0 = \frac{1}{2\pi} \int_{-\pi}^{\pi} f'(x)\,dx,$$
$$A_0 = \frac{1}{2\pi} [f(\pi) - f(-\pi)] = 0.$$
$$D\left[a_0 + \sum_{i=1}^{\infty} (a_i \cos ix + b_i \sin ix)\right] = \sum_{i=1}^{\infty} (-i a_i \sin ix + i b_i \cos ix).$$
In fact, at this stage, we don't know that either of the indicated series converges to any function at all.
We now propose to find out how $\mathrm{Pr}_n$ is related to integration. Theorem 5 of Section 12.2 says that
$$\lim_{n\to\infty} \|f - \mathrm{Pr}_n f\| = 0,$$
What we want is
$$\left[\int_a^b g(x)\,dx\right]^2 \leq (b - a) \int_a^b [g(x)]^2\,dx.$$
Proof. Let $C^0[a, b]$ be the set of all continuous functions on [a, b]. Under the usual definitions of +, sm, and ·, $C^0[a, b]$ forms an inner-product space, and so the Schwarz inequality holds. In the inequality
$$(A \cdot B)^2 \leq \|A\|^2 \cdot \|B\|^2,$$
we take A = g and B = 1. This gives
$$\left[\int_a^b g(x) \cdot 1\,dx\right]^2 \leq \left[\int_a^b g(x) \cdot g(x)\,dx\right] \left[\int_a^b (1 \cdot 1)\,dx\right] = (b - a) \int_a^b [g(x)]^2\,dx.$$
This tells us that the integral of a function is small if the norm of the function is small. Applying this principle to the function $|f - \mathrm{Pr}_n f|$, we obtain the following theorem.
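The integral inequality above can be checked on an example. The sketch below assumes g(x) = sin x on the interval [0, 2] (an arbitrary choice) and a hypothetical midpoint-rule helper `integrate`.

```python
import math

def integrate(g, a, b, steps=2000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def schwarz_sides(g, a, b):
    """Return ([int g]^2, (b - a) * int g^2), the two sides of the inequality."""
    lhs = integrate(g, a, b) ** 2
    rhs = (b - a) * integrate(lambda x: g(x) ** 2, a, b)
    return lhs, rhs

lhs, rhs = schwarz_sides(math.sin, 0.0, 2.0)
const_lhs, const_rhs = schwarz_sides(lambda x: 1.0, 0.0, 2.0)
```

For g = sin the left side is strictly smaller; for a constant function the two sides agree, which is the case of equality in the Schwarz inequality.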
Then
$$\lim_{n\to\infty} M_n = 0.$$
12.3 Integration of Fourier Series. The Uniform Convergence Theorem 559
Proof
l J - Prnfllu= 0 .
lim l
In each case, the reason is that the differences $f(x) - \mathrm{Pr}_n f(x)$ are squeezed to 0 by a sequence of positive constants. Finally, all this can be restated in terms of the
formula
co
and the formulas for the Fourier coefficients ai and bi. This gives a third form of the
theorem:
Theorem 4''. Let f be a function with period $2\pi$ and a continuous derivative. Let
$$a_0 = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\,dx, \quad a_i = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos ix\,dx, \quad b_i = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin ix\,dx.$$
Then
Just as we found for power series, in Chapter 10, uniform convergence enables us
to integrate a term at a time. In general:
$$\lim_{n\to\infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx.$$
Theorem 6. If f has period $2\pi$ and f' is continuous, then the Fourier series for f can be integrated a term at a time, on any interval.
We found, at the beginning of this section, that $\mathrm{Pr}_n f' = (\mathrm{Pr}_n f)'$. If
$$f'(x) = \lim \mathrm{Pr}_n f',$$
it follows that the series for f can be differentiated a term at a time. But we have proved Theorem 4 only for functions with continuous derivatives; and so our convergence theorem applies to f' only when f'' is continuous. Hence the heavy hypotheses in the following theorem:
Theorem 7. If f has period $2\pi$ and f' and f'' are continuous, then the Fourier series for f can be differentiated a term at a time.
Suppose that we form a function of period $2\pi$ by fitting together the graphs of a
finite set of continuously differentiable functions, end to end. We reconcile the values
at common endpoints by defining
3) The series can be integrated a term at a time on any closed interval (even if the
interval contains points of discontinuity).
But it is not necessarily true that the series for a function of the Fourier type can
be differentiated a term at a time, even on an interval on which f, f', and f" are
continuous. This will be brought out in the problem set below. In working on these
problems, you should regard Theorem B as given.
f
Evidently has a Fourier series
i=l
Show that termwise differentiation of the series for f cannot give the Fourier series for f'.
4. Now calculate the Fourier series of f. (You did not need to do this, to solve Problem 3.)
5. Now verify, by a calculation, that for this series,
Jo('" [ao + i a; cos it + b; sin it] dt =Jor '"ao dt + i Jo("'(a; cos it + b; sin it) dt.
i=l i=l
=Ia�+ Ib�.
00 00
111112
i=O i=l
13 Linear Transformations, Matrices, and Determinants
where
$$E_1 = (1, 0, 0), \quad E_2 = (0, 1, 0), \quad E_3 = (0, 0, 1).$$
Now $f(E_1)$ must be a vector in $R^3$; and $B_3$ is a basis for $R^3$. Therefore we have
$$f(E_1) = a_{11}E_1 + a_{21}E_2 + a_{31}E_3$$
for some set of scalars $a_{11}, a_{21}, a_{31}$. [Here we are using double subscripts; $a_{i1}$ is the coefficient of $E_i$ in the expression for $f(E_1)$.] Similarly, $f(E_2)$ and $f(E_3)$ have the forms
$$f(E_2) = a_{12}E_1 + a_{22}E_2 + a_{32}E_3, \quad f(E_3) = a_{13}E_1 + a_{23}E_2 + a_{33}E_3.$$
If $f(E_1)$, $f(E_2)$, and $f(E_3)$ are known, then this determines f(P) for every P in $R^3$. The reason is that for
we must have
When P is described in this notation, we call P a column vector. Similarly, for
Note that the array on the right is a column vector; once the indicated additions are performed, there is only one entry in each row.
Now we define the operation of multiplication, of the column vector P by the matrix M, in such a way that f(P) = MP. That is:
Definition
13.1 Linear Transformations 565
$$MP = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = f(P).$$
There is a usable pattern in this multiplication: to get the entry y1 in the first row of
the product, we regard the first row of M as a vector, and form its inner product
with the column vector P. Similarly for the other rows.
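Let us examine an example. The matrix
$$M = \begin{bmatrix} 1 & 2 & 1 \\ -2 & 1 & 2 \\ 1 & -1 & 2 \end{bmatrix}$$
describes a linear transformation f, with
$$f(x_1, x_2, x_3) = \begin{bmatrix} 1 & 2 & 1 \\ -2 & 1 & 2 \\ 1 & -1 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} x_1 + 2x_2 + x_3 \\ -2x_1 + x_2 + 2x_3 \\ x_1 - x_2 + 2x_3 \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}.$$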
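The multiplication rule (inner product of each row of M with the column vector P) can be written out in a few lines. This is a minimal sketch; `mat_vec` is a hypothetical helper name, and the matrix is the one from the example above.

```python
def mat_vec(M, P):
    """Multiply matrix M (a list of rows) by column vector P, row by row."""
    return [sum(m * x for m, x in zip(row, P)) for row in M]

# The matrix of the example in this section.
M = [[1, 2, 1],
     [-2, 1, 2],
     [1, -1, 2]]

image_of_E1 = mat_vec(M, [1, 0, 0])   # first column of M
image_of_ones = mat_vec(M, [1, 1, 1])
```

Applying `mat_vec` to E_1 = (1, 0, 0) returns the first column of M, in agreement with the way the entries a_{i1} were defined.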
Two questions arise naturally here.
Problem 1. Given a particular point $(y_1, y_2, y_3)$, for what points P, if any, do we have $f(P) = (y_1, y_2, y_3)$? For example, for what points P is it true that
$$f(P) = (1, 2, 3)?$$
This amounts to solving the system
$$\begin{cases} x_1 + 2x_2 + x_3 = 1, & (1) \\ -2x_1 + x_2 + 2x_3 = 2, & (2) \\ x_1 - x_2 + 2x_3 = 3 & (3) \end{cases}$$
of linear equations in the unknowns $x_1, x_2, x_3$. Almost any method will do. We shall use a method which will be of theoretical importance later.
Step 1. Eliminate $x_1$ from (2) and (3), by adding twice (1) to (2) and subtracting (1) from (3). This gives a new system which is equivalent to the original system, in the sense that it has exactly the same solutions:
$$x_1 + 2x_2 + x_3 = 1, \quad (1)$$
$$5x_2 + 4x_3 = 4, \quad (2') = (2) + 2(1)$$
$$-3x_2 + x_3 = 2. \quad (3') = (3) - (1)$$
(The notations on the right indicate where the new equations come from.)
Step 2. Eliminate $x_2$ from (3'), by adding 3/5 of (2') to (3'). This gives the equivalent system
$$x_1 + 2x_2 + x_3 = 1, \quad (1)$$
1] [ ] [1]
X = H.
3
1
This means that
13 7
H -1 ; -r� ;
2
.
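The triangularization method of Steps 1 and 2 can be sketched in code. The version below is only an illustration under simplifying assumptions: it uses exact fractions, does no row interchanges, and assumes that every pivot it meets is nonzero; `solve_by_triangularization` is a hypothetical name.

```python
from fractions import Fraction

def solve_by_triangularization(M, y):
    """Solve M x = y by reducing to triangular form, then back-substituting.

    Assumes every pivot encountered is nonzero (no row interchanges).
    """
    n = len(M)
    # Augmented rows [coefficients | right-hand side], in exact arithmetic.
    A = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(M, y)]
    for col in range(n):
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            A[r] = [a - factor * p for a, p in zip(A[r], A[col])]
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (A[i][n] - s) / A[i][i]
    return x

M = [[1, 2, 1], [-2, 1, 2], [1, -1, 2]]
P = solve_by_triangularization(M, [1, 2, 3])
```

For the system of Problem 1 the first elimination pass reproduces equations (2') and (3'), and the result is P = (3/17, -4/17, 22/17).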
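Problem 2. Find, if possible, a set of formulas expressing $P = (x_1, x_2, x_3)$ in terms of $(y_1, y_2, y_3)$.
To do this, we need to get a "general solution" of the equation
$$\begin{bmatrix} 1 & 2 & 1 \\ -2 & 1 & 2 \\ 1 & -1 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix},$$
in which the $x_i$'s are expressed in terms of the $y_j$'s. This is only slightly more troublesome than Problem 1; we treat $(y_1, y_2, y_3)$ in exactly the same way as we treated (1, 2, 3) in Problem 1. The solution is
$$x_1 = \tfrac{4}{17}y_1 - \tfrac{5}{17}y_2 + \tfrac{3}{17}y_3,$$
$$x_2 = \tfrac{6}{17}y_1 + \tfrac{1}{17}y_2 - \tfrac{4}{17}y_3,$$
$$x_3 = \tfrac{1}{17}y_1 + \tfrac{3}{17}y_2 + \tfrac{5}{17}y_3.$$
$$M^{-1} = \begin{bmatrix} \tfrac{4}{17} & -\tfrac{5}{17} & \tfrac{3}{17} \\ \tfrac{6}{17} & \tfrac{1}{17} & -\tfrac{4}{17} \\ \tfrac{1}{17} & \tfrac{3}{17} & \tfrac{5}{17} \end{bmatrix}.$$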
$$f: R^n \to R^m$$
of the function at all. In such cases, there is no such thing as $f^{-1}(Q)$. Some points Q of $R^m$ may be equal to f(P) for more than one point P. In such cases, f is not
[1 1
invertible. An example of both these phenomena is furnished by the matrix
J
0
M= 0 0 .
0 0 0
Here
for example, the transformation f described by the matrix
$$M = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 5 & 8 \\ 1 & 4 & 7 \end{bmatrix}.$$
To simplify the notation, we use (x, y, z) for $(x_1, x_2, x_3)$ and (a, b, c) for $(y_1, y_2, y_3)$. We then have
$$x + 2y + 3z = a,$$
$$2x + 5y + 8z = b,$$
$$x + 4y + 7z = c.$$
Reducing this to triangular form, we get the equivalent system
$$x + 2y + 3z = a, \quad (1)$$
$$y + 2z = b - 2a, \quad (2)$$
$$0 = 3a - 2b + c. \quad (3)$$
Thus the equation
$$f(P) = MP = (a, b, c)$$
cannot have a solution unless 3a - 2b + c = 0. Let
$$E = \{(a, b, c) \mid 3a - 2b + c = 0\}.$$
Then E is a plane, and every point of E is = f(P) for some P. The reason is that as
long as (3) is satisfied, we can solve (2) and (1) in the forms
$$y = -2z + b - 2a, \quad (2')$$
$$x = -2y - 3z + a = z - 2b + 5a. \quad (1')$$
Here we can choose any z; if
$$P = (z - 2b + 5a, -2z + b - 2a, z),$$
then
$$f(P) = (a, b, c).$$
As in Section 3.1, the image of a function is defined to be the set of all values of
the function. If f is a function $A \to B$, then the image is denoted by f(A). More
generally, if A' is any subset of A, then
In the above example, the kernel is the solution set of the equation
To get the solution, we set a = b = 0 in equations (2') and (1'). Therefore P is in the kernel if P has the form
$$P = (z, -2z, z).$$
Using $\alpha$ for z (to fit the usual notation of scalar multiplication), we get
$$\mathrm{Ker}\, f = \{\alpha V_0\}, \quad V_0 = (1, -2, 1).$$
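The kernel and image found above can be checked directly. The sketch assumes that the matrix of this example has rows (1, 2, 3), (2, 5, 8), (1, 4, 7), which is the matrix consistent with equations (1'), (2') and with the kernel vector (1, -2, 1); the helper names are hypothetical.

```python
def mat_vec(M, P):
    """Multiply matrix M (a list of rows) by column vector P."""
    return [sum(m * x for m, x in zip(row, P)) for row in M]

# Assumed matrix of the example (consistent with the triangular system).
M = [[1, 2, 3],
     [2, 5, 8],
     [1, 4, 7]]

def in_image_plane(v):
    """Test the condition 3a - 2b + c = 0 that defines the plane E."""
    a, b, c = v
    return 3 * a - 2 * b + c == 0

def preimage(a, b, z):
    """The point P = (z - 2b + 5a, -2z + b - 2a, z) from (1') and (2')."""
    return [z - 2 * b + 5 * a, -2 * z + b - 2 * a, z]

V0 = [1, -2, 1]
```

Every value of f lands in the plane E, the vector V0 is sent to 0, and `preimage(a, b, z)` is sent to (a, b, 2b - 3a) for any choice of z.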
Therefore Ker f is a line. In other cases, the image may be an even smaller set, and the kernel even larger. For
$$M = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},$$
rn � �J[�J m
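The kernel is the yz-plane, because the equation
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$
gives the system
$$x = 0, \quad 0 = 0, \quad 0 = 0.$$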
1) As far as the ideas in this section are concerned, there is nothing special about $R^3$.
2) In the examples that we have discussed, the image and kernel of a linear trans
3) If you are acquainted with determinants, and with the process of solving linear
systems by Cramer's rule, then you may suspect that the method used above, by which
we convert the system to a triangular system, is naive or inefficient or both. But this
is not true. In computation, triangularization is about as good a method as any.
In each of the following problems, you are given a matrix M, describing a linear transformation f. If f turns out to be invertible, compute $f^{-1}$. If not, find the image and the kernel. Thus the answer to each of the first ten problems below should be in one of the forms
(a) $M^{-1} = [\cdots]$
or
(b) $f(R^3) = \{(a, b, c) \mid \cdots\}$, and $\mathrm{Ker}\, f = \{(x, y, z) \mid \cdots\}$.
[[ !]
· ·
, [H �] ]
I
I
.[H �] 2
. [�HJ 5
[: ; ] -3
-
+2
1
[-� -� -; 6.
4 2 -2
[h i] [! g �]
_ _
7 [H :J · [H i] 9
10.
[
a, ll 2
M=
"•]
aa1 Olll 2 Olll3
f3a1 f3a2 flaa
["" g l
where $a_{11}a_{22}a_{33} \neq 0$.
13. Same question, for
0
M= a21 ll22
ll31 ll32 ll33
with $a_{11}a_{22}a_{33} \neq 0$.
[g
14. Same question, for
]
ll12 "
u
M� 0 aa
J ,
ll31 0
with $a_{12}a_{23}a_{31} \neq 0$.
15. Same question, for
]
ll12
rn
"
u
M� 0 aa
0
� •
Proof. By Theorem 4 of Section 11.3, we merely need to show that these sets are closed under addition and scalar multiplication. For each f(P), f(Q), we have
$$f(P) + f(Q) = f(P + Q);$$
and for each f(P), $\alpha f(P) = f(\alpha P)$. Therefore $f(R^m)$ is a subspace. If P and Q belong to Ker f, then
$$f(P) = f(Q) = 0,$$
Theorem 2. If f and g are linear transformations $R^m \to R^n$, then so also are f + g and $\alpha f$.
Here the sum and scalar product are defined by the obvious conditions
Theorem 4. If $g: R^m \to R^n$ and $f: R^n \to R^p$ are linear, then the composite function
$$f(g): R^m \to R^p$$
is also linear:
f(g)
The verification is straightforward:
then
g(P) =
Adding by columns, to get the total coefficient of each E; on the right, we get
Describing P and f(P) as column vectors, and representing g by the matrix with a; 1 in
the ith row and jth column, we get
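$$\begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1m} \\ b_{21} & b_{22} & \cdots & b_{2m} \\ \vdots & \vdots & & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nm} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}.$$
Here the pattern of the operation is the same as in the case m = n = 3: to get $y_i$ in the column vector on the right, we regard the ith row of the matrix $M_g$ as a vector, and form its inner product with the column vector.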
Now if g and f are as in the preceding theorem, then each of the transformations g, f, and f(g) can be described by a matrix. Let these matrices be
$$M_g = [b_{ij}] \quad (n \text{ by } m),$$
$$M_f = [a_{jk}] \quad (p \text{ by } n),$$
$$M_{f(g)} = [c_{ik}] \quad (p \text{ by } m).$$
Here $[a_{ij}]$ is a shorthand for the matrix with the number $a_{ij}$ in the ith row and the jth column, and similarly for $[b_{jk}]$ and $[c_{ik}]$. We define the product of two matrices,
13.2 Composition of Linear Transformations and Multiplication of Matrices 573
$$f: R^n \to R^p,$$
$$f(g): R^m \to R^p,$$
with associated matrices $M_f, M_g, M_{f(g)}$. By definition,
$$M_f M_g = M_{f(g)}.$$
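We shall now get a formula for the product $M_f M_g$ of two matrices. The general formula looks complicated, but its pattern is easy to see by an examination of the case m = n = p = 2. Let the matrices of the transformations f, g, and f(g) be
$$M_f = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \quad M_g = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}, \quad M_{f(g)} = \begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix}.$$
Then
$$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = M_g \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} b_{11}x_1 + b_{12}x_2 \\ b_{21}x_1 + b_{22}x_2 \end{bmatrix},$$
and so
$$M_{f(g)} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = M_f M_g \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = M_f \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$$
$$= \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} a_{11}y_1 + a_{12}y_2 \\ a_{21}y_1 + a_{22}y_2 \end{bmatrix}$$
$$= \begin{bmatrix} a_{11}(b_{11}x_1 + b_{12}x_2) + a_{12}(b_{21}x_1 + b_{22}x_2) \\ a_{21}(b_{11}x_1 + b_{12}x_2) + a_{22}(b_{21}x_1 + b_{22}x_2) \end{bmatrix}$$
$$= \begin{bmatrix} (a_{11}b_{11} + a_{12}b_{21})x_1 + (a_{11}b_{12} + a_{12}b_{22})x_2 \\ (a_{21}b_{11} + a_{22}b_{21})x_1 + (a_{21}b_{12} + a_{22}b_{22})x_2 \end{bmatrix}$$
$$= \begin{bmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}.$$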
Therefore $M_f M_g$ is the 2 by 2 matrix in the last formula. The pattern of the operation is clear: to get the number $c_{ij}$, in the ith row and jth column of the product, we regard the ith row of $M_f$ and the jth column of $M_g$ as vectors, and compute their inner product. That is,
$$c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j}.$$
This is called the row by column rule of matrix multiplication. The same rule applies in the general case, and the only problem is that the formulas are complicated to write down. We have
$$y_i = \sum_{j=1}^{m} b_{ij}x_j \quad (i = 1, 2, \ldots, n).$$
Therefore
$$\cdots = \sum_{j=1}^{m} c_{kj}x_j \quad (k = 1, 2, \ldots, p),$$
where
$$c_{kj} = \sum_{i=1}^{n} a_{ki}b_{ij}.$$
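The row by column rule, and the fact that it represents composition, can be sketched as follows. The helper names `mat_mul` and `mat_vec` and the sample matrices are hypothetical, chosen only to illustrate the formula for c_{kj}.

```python
def mat_vec(M, P):
    """Multiply matrix M (a list of rows) by column vector P."""
    return [sum(m * x for m, x in zip(row, P)) for row in M]

def mat_mul(A, B):
    """Row by column rule: c[k][j] = sum over i of A[k][i] * B[i][j]."""
    return [[sum(A[k][i] * B[i][j] for i in range(len(B)))
             for j in range(len(B[0]))]
            for k in range(len(A))]

# Hypothetical 2 by 2 matrices playing the roles of M_f and M_g.
Mf = [[1, 2], [3, 4]]
Mg = [[0, 1], [1, 0]]
x = [5, 7]

product = mat_mul(Mf, Mg)
via_composition = mat_vec(Mf, mat_vec(Mg, x))   # f(g(x))
via_product = mat_vec(product, x)               # (M_f M_g) x
```

Both routes give the same vector, which is the content of the definition M_f M_g = M_{f(g)}.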
Thus
But the above formula for $c_{kj}$ says that $c_{kj}$ is the inner product of the kth row of $M_f$ and the jth column of $M_g$; these are the row vector and column vector
and their inner product is $c_{kj} = \sum_{i=1}^{n} a_{ki}b_{ij}$. For example, in the product
]
$c_{22}$ is the inner product of (-2, 0, 2) and (3, 2, 1). Therefore $c_{22} = -4$. The complete calculation gives the answer
10.
u
-4
4 .
-2
Definition. Given the linear transformations $f: R^m \to R^n$ and $g: R^m \to R^n$, with matrices $M_f$ and $M_g$. Then
$$M_f + M_g = M_{f+g}.$$
The sum is easy to calculate; the simplest possible idea works. Let
au ll12 a ln bu b12
ll21 ll22 a2n b21 b22
M, = Mg=
L�=l aux;
L�=l ll 2;X;
f(P) =
g(P) =
Therefore
Carry out the indicated operations, expressing each answer as a matrix (which may, of
course, turn out to be an n by 1 matrix, that is, a column vector).
[� iJ[: :J [� m: �] [; �][� �]
2 0
1. 2. 5 1 3.
8 0
[� m� !] [! m� i] [� m� !J
2 0 10 2 0
4. 5 1 5. 1 5 6. 1
8 0 0 8 1
7.
rn
0
2
0 urn 8. [3
f�]
0 9.
[! !J[l]
2
5
[! !][�] [� mi i] [� m�f �J
1 4 1
10. 0 11. 0 5 12. 0
f
0 0 6 0
n m� !] ff f G f
4 1 ·1 l
13. 5 0 14. 0 15. 0
6 0 0 0
[� n [! ff"" [l �m �]
1 1 0 1
16. 0 17. 0 18. 4 3
1 0
['� i][i �]
0 0
[l m� �] [� m: �]
1 0 0 1 1 0
19. 3 1 20. 1 3 21. 15 1 1
0 0 0 0 18 8 14 1 0
0 1
[ _l ][ -2
]
22.
[l mf i l]
4
1
3
0
23.
0
-6
2
0 O
-6
2
I -1
I
-1
4
4
[� J
1 0 0
24.
[� f 1J1
0
0 0
25.
0
0
0
1
0
O
0
1
26.
[� 0
1
0
1
0
0
0
0
0
27.
[� T 0
1
0
1
0
0
0
0
0
For each of the following four matrices, find the inverse $M^{-1}$ if there is an inverse; if
[ I 2 3
OJ [�
not, give the simplest reason that you can for concluding that no inverse exists.
0 0 2 3
[� �
7
4]
2 0 4 5 6 0 6 8
28. 29. 30. 31. [� �]
0 2 7 8 9 0 10 11 12
0 0 10 11 12 0 0 0 0
*32. Let
$$I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$
For how many 2 by 2 matrices
M = [� �]
is it true that $MM = I_2$? (It is easy to find two such "square roots" of $I_2$. The question is whether there are others, and if so, what they are.)
33. Let
$$0_2 = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}.$$
This matrix acts like 0, in that for every 2 by 2 matrix M, we have
$$0_2 + M = M + 0_2 = M.$$
or B 02?
= =
34. If A and B are 2 × 2 matrices and $AB = 0_2$, does it follow that $BA = 0_2$?
13.3 FORMAL PROPERTIES OF THE
ALGEBRA OF MATRICES. GROUPS AND RINGS
In the preceding problem set, you found that addition and multiplication of matrices
were analogous in some ways, but not in others, to addition and multiplication of
real and complex numbers. We shall now investigate the algebra of matrices systematically, and find out how far the analogy goes.
Throughout this section we shall be concerned only with square matrices. The set of all n by n matrices is denoted by $\mathcal{M}_n$. In our investigation of the formal properties of $\mathcal{M}_n$, under addition and multiplication, it will not be very useful to think
about square arrays of numbers; the ideas are much easier to see if we work with the
linear transformations f that the matrices represent.
Definition. $\mathcal{L}_n$ is the set of all linear transformations $f: R^n \to R^n$.
We found, in Theorem 3 of Section 13.2, that $\mathcal{L}_n$ forms a vector space, under addition and scalar multiplication. Therefore, in particular, we have:
C.1. $\mathcal{L}_n$ is closed under addition.
A pair $[\mathcal{L}, +]$ is called a group if the operation + satisfies C.1, A.1, A.2, and A.3. If A.4 is also satisfied, then $[\mathcal{L}, +]$ is called a commutative group. We can therefore sum up as follows:
Theorem 1. For each n, $[\mathcal{L}_n, +]$ is a commutative group.
We defined multiplication for matrices by composition of functions. That is,
$$M_f M_g = M_{f(g)},$$
by definition. We therefore need to investigate composition of functions in $\mathcal{L}_n$. For the sake of convenience, we shall denote the composite function f(g) by the notation $f \circ g$. Theorem 4 of Section 13.2 tells us that if f and g are linear: $R^n \to R^n$,
fog
Starting at any point w, we get to the same point z, no matter how the functions f, g, h are grouped. Therefore we have:
for every g.
This $f_1$ acts like the number 1, under our "multiplication." Obviously $f_1$ is the "identity" function, such that $f_1(P) = P$ for every P. We call $f_1$ the unit element.
So far, M.1 and M.2 are precisely analogous to C.1, A.1, and A.2; these conditions say the same things, about addition in one case and "multiplication" in the other. But the analogy now breaks down: not every linear function f has an inverse $f^{-1}$, and composition of functions is not, in general, commutative. (We have seen many examples of both of these.) But we do have:
A system satisfying all the conditions that we have mentioned so far is called a
ring. More precisely:
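$$[\mathcal{L}, +, \circ]$$
All this discussion carries over immediately to the set $\mathcal{M}_n$ of n by n matrices, since $M_f + M_g = M_{f+g}$ and $M_f M_g = M_{f \circ g}$. This gives:
Here $\circ$ is used to denote matrix multiplication. It is easy to see that the zero element of $\mathcal{M}_n$ is
$$\begin{bmatrix} 0 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix}$$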
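$$I_n = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix},$$
with 1's on the main diagonal and 0's everywhere else. There is a shorthand for this: we define
$$\delta_{ij} = \begin{cases} 1 & \text{for } i = j, \\ 0 & \text{for } i \neq j. \end{cases}$$
We then have
$$I_n = [\delta_{ij}].$$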
arrays of numbers, you will see that the use of the system $\mathcal{L}$ of linear transformations
offered great advantages; most of the proofs were easier to write down than even
one n by n matrix. In particular, a direct verification of the associativity of matrix
multiplication, using the formula for the product of two matrices, would be extremely
tedious.
A final remark, on the notation used in describing a group. In this section, the
group operation is denoted by +. This is partly because addition was what we meant,
in the case that we were discussing. Also it is customary to use the symbol + when the
operation is commutative. More generally, however, we can state the conditions for a
group as follows:
[G,*]
for every a.
A field is a system [F, +, ·] which satisfies all the conditions which were stated for the real number system in Section 1.1. Thus [F, +, ·] is a field if (1) [F, +, ·] is a ring, (2) multiplication is commutative, and (3) every $x \neq 0$ has an inverse $x^{-1}$, such that $x \cdot x^{-1} = 1$.
Obviously the real-number system furnishes examples of all the ideas that we
have been talking about in this section: [R, +, ·] is both a ring and a field, and
[R, +] is a group. But if the real-number system were the only algebraic system that
we were concerned with, there would be no advantage in using the terms group, ring,
and field. The advantage is in other connections: already we have been dealing with
vector spaces, which form groups (under addition), but do not form rings or fields;
and from now on, we shall be dealing with (a) groups which are not rings, (b) rings
which are not commutative, (c) commutative rings which are not fields, and so on.
To find our way around in this variety of algebraic systems, we need a language in
which we can explain briefly and clearly what sort of system we are dealing with at a
given moment.
6. Let G be the set of all real numbers of the form $a + b\sqrt{2}$, where a and b are rational. Discuss as in Problem 1.
7. In this problem, you may regard it as known that $\pi$ is not a root of any linear or quadratic equation with rational coefficients. Let G be the set of all real numbers of the form $a + b\pi$, where a and b are rational. Is [G, +, ·] a ring?
8. A permutation matrix is a square matrix with exactly one 1 in each row, exactly one 1 in each column, and 0's everywhere else. The set of all n by n permutation matrices is denoted by $P^n$. Show that $[P^2, \circ]$ is a group, and write a multiplication table for the group, in the form
0 A
12
A
set of all n x n upper triangular matrices by vn. Show [ vn, +] is a group, but [ vn, o]
is not.
group. (This is called the general linear group.) Then show that $[GL(n), +, \circ]$ is not a ring.
*13. Let GLU(n) be the set of all upper triangular n × n matrices $[a_{ij}]$, with $a_{11}a_{22} \cdots a_{nn} \neq 0$. Show that $[GLU(n), \circ]$ is a group, but $[GLU(n), +, \circ]$ is not a ring.
The determinant function assigns, to every square matrix, a real number. The
definition of this function begins as follows.
a11 a 12 ll1n
ll21 ll22 G2n
using exactly one element from each row and exactly one element from each column.
Thus the numbers
are all different. To each of these products we attach a + or − sign, according to a rule which will be stated presently. We then take the sum
of all terms which can be formed according to the above rules. This sum is called the determinant of the matrix, and is denoted by det M, or det $[a_{ij}]$. Thus, when we have explained how the sign is to be chosen for each term, we shall have a function
$$\det: \mathcal{M}_n \to R.$$
The rule for the signs takes time to explain and justify. With each term
where
$$I_n = \{1, 2, 3, \ldots, n\}.$$
Here $p(i) = j_i$; that is, p(i) is the column number of the element $a_{ij_i}$ that we chose from the ith row. We can describe such a function p by a diagram in the following
13.4 The Determinant Function 583
(1 2 3 . )
form:
n
p
· · ·
= jl j2 ja . • jfl ,
with the numbers i in the top line and the numbers p(i) below them. For example,
(13 12 43 24)
is the function under whose action
1H3, 2H1, 3 H4, 4H2.
A one-to-one function p: In---+ In is called a permutation. When permutations are
described in the two-line notation, the order of the columns does not matter; all that
matters is what is under what. For example,
\[ p = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 2 & 1 & 4 \end{pmatrix} \]
is a transposition; it interchanges 1 and 3. We denote this permutation by the
shorthand (13). In general, (ab) is the permutation which interchanges a and b.
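Here the word product is used in the sense of composition of functions. For
example, for
\[ p = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 1 & 4 & 2 \end{pmatrix}, \]
we can use the following transpositions:
1 2 3 4
(13)
3 2 1 4
(21)
3 1 2 4
(24)
3 1 4 2
We now have
\[ p = (24)(21)(13). \]
As always, for composition of functions, the operations are performed in the order
from right to left. Thus
\[ 1 \mapsto 3 \mapsto 3 \mapsto 3, \quad 2 \mapsto 2 \mapsto 1 \mapsto 1, \quad 3 \mapsto 1 \mapsto 2 \mapsto 4, \quad 4 \mapsto 4 \mapsto 4 \mapsto 2. \]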
In general, given
\[ p = \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\ j_1 & j_2 & j_3 & \cdots & j_n \end{pmatrix}, \]
we first take the transposition $(1\, j_1)$; this puts $j_1$ under 1, where we want it to be. To
the resulting sequence, we apply a transposition which puts $j_2$ in the second position,
and so on. A further example:
\[ p = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 2 & 5 & 4 & 1 & 7 & 3 & 6 \end{pmatrix}. \]
1 2 3 4 5 6 7
(12)
2 1 3 4 5 6 7
(15)
2 5 3 4 1 6 7
(34)
2 5 4 3 1 6 7
(31)
2 5 4 1 3 6 7
(37)
2 5 4 1 7 6 3
(36)
2 5 4 1 7 3 6
It is not claimed, in Theorem 1, that every p can be expressed in only one way as a
product of transpositions; and in fact this is not true. For example, the above
diagram gives
\[ p = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 2 & 5 & 4 & 1 & 7 & 3 & 6 \end{pmatrix} = (36)(37)(31)(34)(15)(12). \]
But it is also true that
\[ p = (36)(73)(16)(43)(57)(24)(16)(27)(14)(57), \]
which looks different, and uses ten transpositions instead of six. Nevertheless all such
expressions for a given p have a common property, now to be described.
A permutation p is called even if it can be expressed as the product of an even
number of transpositions; p is odd if p is the product of an odd number of
transpositions.
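The two factorizations above can be checked mechanically. The following is a minimal sketch, not part of the text: the helper names (`transposition`, `compose`, `product`, `is_even`) are hypothetical, and parity is tested by counting inversions, a standard equivalent criterion that the text itself does not use.

```python
from functools import reduce

def transposition(a, b, n):
    """Return the transposition (a b) on {1, ..., n} as a dict i -> p(i)."""
    p = {i: i for i in range(1, n + 1)}
    p[a], p[b] = b, a
    return p

def compose(p, q):
    """Return the composite p o q: apply q first, then p (right to left)."""
    return {i: p[q[i]] for i in q}

def product(pairs, n):
    """Compose transpositions written left to right, e.g. [(3, 6), (3, 7), ...]."""
    identity = {i: i for i in range(1, n + 1)}
    return reduce(compose, (transposition(a, b, n) for a, b in pairs), identity)

def is_even(p):
    """Parity via inversions (assumption: equivalent to the transposition count)."""
    keys = sorted(p)
    inversions = sum(1 for x, i in enumerate(keys)
                     for j in keys[x + 1:] if p[i] > p[j])
    return inversions % 2 == 0

# The permutation from the text, bottom row 2 5 4 1 7 3 6.
target = dict(zip(range(1, 8), [2, 5, 4, 1, 7, 3, 6]))
six = product([(3, 6), (3, 7), (3, 1), (3, 4), (1, 5), (1, 2)], 7)
ten = product([(3, 6), (7, 3), (1, 6), (4, 3), (5, 7),
               (2, 4), (1, 6), (2, 7), (1, 4), (5, 7)], 7)
print(six == target, ten == target, is_even(target))
```

Both products give the same permutation, and six and ten are both even numbers, which is the common property the text goes on to describe.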
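\[ f(x_1, x_2, \ldots, x_n) = \prod_{i<j} (x_i - x_j), \]
where the expression on the right is the product of all differences $x_i - x_j$ for which
i < j. (Analogously, we might use $\sum_{i<j}$ to denote the sum of the same differences.) For n = 4,
\[ f(x_1, x_2, x_3, x_4) = (x_1 - x_2)(x_1 - x_3)(x_1 - x_4)(x_2 - x_3)(x_2 - x_4)(x_3 - x_4). \]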
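Now consider what happens to f when we apply the transposition (i j), with i < j, thus
interchanging $x_i$ and $x_j$. The factors of f that involve $x_i$ or $x_j$ are of the following types:
1) $(x_r - x_i)$ and $(x_r - x_j)$, with r < i;
2) $(x_i - x_s)$ and $(x_s - x_j)$, with i < s < j;
3) $(x_i - x_t)$ and $(x_j - x_t)$, with t > j;
4) $(x_i - x_j)$.
When we apply the transposition (ij), interchanging $x_i$ and $x_j$, the effect is
1) $(x_r - x_i) \mapsto (x_r - x_j)$, $(x_r - x_j) \mapsto (x_r - x_i)$;
2) $(x_i - x_s) \mapsto (x_j - x_s) = -(x_s - x_j)$, $(x_s - x_j) \mapsto (x_s - x_i) = -(x_i - x_s)$;
3) $(x_i - x_t) \mapsto (x_j - x_t)$, $(x_j - x_t) \mapsto (x_i - x_t)$;
4) $(x_i - x_j) \mapsto (x_j - x_i) = -(x_i - x_j)$.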
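The rule for the signs is now as follows: the term $a_{1j_1} a_{2j_2} \cdots a_{nj_n}$ is given the sign + if the permutation
\[ p = \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\ j_1 & j_2 & j_3 & \cdots & j_n \end{pmatrix} \]
is even, and - if p is odd.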
In practice, when we have developed some of the theory, we shall never have to
make direct use of the above definition; and this is fortunate, because the definition
is even more tedious to handle than one might think. In order to form a term of
det $[a_{ij}]$, for an n by n matrix, we have to choose an element from each row, in such a
way as never to use the same column twice. Thus we have n possibilities to choose
from in the first row; there are then n - 1 possibilities in the second row; and so on.
Therefore the total number of terms is
\[ n(n - 1)(n - 2) \cdots 3 \cdot 2 \cdot 1 = n! \]
Therefore the determinant of an n by n matrix is the sum of n! terms. In particular,
for n = 20, the number of terms of det M is
\[ 20! = 2{,}432{,}902{,}008{,}176{,}640{,}000. \]
The number of seconds in a year is only 31,536,000. This is why nobody asks even an
electronic computer to calculate the determinants of large matrices by brute force.
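To make the term count concrete, here is a small sketch of the definition applied literally: one signed product for each permutation of the columns. The function names are hypothetical, and the sign is obtained by counting inversions, which is an assumption of this sketch (it agrees with the even/odd rule but is not how the text defines it).

```python
from itertools import permutations
from math import factorial

def sign(cols):
    """+1 for an even permutation of the columns, -1 for an odd one."""
    n = len(cols)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if cols[i] > cols[j])
    return -1 if inversions % 2 else 1

def det_by_definition(m):
    """Sum of n! signed products, one element from each row and each column."""
    total = 0
    for cols in permutations(range(len(m))):
        term = sign(cols)
        for row, col in enumerate(cols):
            term *= m[row][col]
        total += term
    return total

example = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]   # illustrative matrix, 3! = 6 terms
print(det_by_definition(example))
print(factorial(20))                            # number of terms for n = 20
```

The loop visits every permutation, so its running time grows like n!; this is usable for n = 3 or 4 and hopeless for n = 20.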
Nevertheless, the definition of the function det is usable conceptually, as the
basis of a theory which leads quickly to efficient techniques. In the rest of this section,
we shall begin to develop the portion of the theory which makes direct use of the idea
of odd and even permutations.
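Theorem 4. For each n, let $S_n$ be the set of all permutations $p: I_n \to I_n$. Then
$[S_n, \circ]$ is a group.
Proof. 1) The composite of two permutations of $I_n$ is a permutation of $I_n$.
2) Composition of functions is associative.
3) There is an identity
\[ e = \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\ 1 & 2 & 3 & \cdots & n \end{pmatrix}. \]
4) By Theorem 3, every p in $S_n$ has an inverse.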
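Theorem 5. In any group, the inverse of a product is the product of the inverses,
in reverse order.
Thus
\[ (a \circ b)^{-1} = b^{-1} \circ a^{-1}, \]
because
\[ (a \circ b) \circ (b^{-1} \circ a^{-1}) = a \circ (b \circ b^{-1}) \circ a^{-1} = a \circ a^{-1} = e. \]
Hereafter, we shall omit the operation sign $\circ$. For products of n factors, the theorem
says that
\[ (a_1 a_2 \cdots a_n)^{-1} = a_n^{-1} \cdots a_2^{-1} a_1^{-1}. \]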
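Theorem 6. For each p in $S_n$, p and $p^{-1}$ are either both even or both odd.
Proof. Express p as a product of transpositions:
\[ p = p_1 p_2 \cdots p_{k-1} p_k. \]
Every transposition is its own inverse. Therefore, by Theorem 5,
\[ p^{-1} = p_k p_{k-1} \cdots p_2 p_1. \]
If k is even, then p and $p^{-1}$ are both even. If not, p and $p^{-1}$ are both odd.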
The transpose of a matrix M is the matrix obtained by reflecting M across its
main diagonal. The transpose is denoted by $M^t$. Thus, in general,
\[ [a_{ij}]^t = [a_{ji}]. \]
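\[ p = \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\ j_1 & j_2 & j_3 & \cdots & j_n \end{pmatrix}, \qquad q = \begin{pmatrix} j_1 & j_2 & \cdots & j_n \\ 1 & 2 & \cdots & n \end{pmatrix} \]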
is even or odd. Obviously $q = p^{-1}$, and so p and q are both even or both odd. Therefore
the terms of det M and det $M^t$ have the same signs, and det M = det $M^t$.
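Theorem 8. If two rows of M are interchanged, then the determinant of the resulting
matrix is - det M.
For example,
\[ \det \begin{bmatrix} c & d \\ a & b \end{bmatrix} = -\det \begin{bmatrix} a & b \\ c & d \end{bmatrix}. \]
The reason is that when two rows are interchanged, this contributes exactly one
transposition to the permutation
\[ p = \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\ j_1 & j_2 & j_3 & \cdots & j_n \end{pmatrix}. \]
For example, if the first and third rows are interchanged, then the sign of the term
$a_{1j_1} a_{2j_2} \cdots a_{nj_n}$ in the new determinant is determined by the permutation
\[ q = \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\ j_3 & j_2 & j_1 & \cdots & j_n \end{pmatrix} = (j_1\, j_3) \circ p. \]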
1. Working directly from the definition of det, get an explicit formula for
\[ \det \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}. \]
2. Similarly, get a formula for
<let
["" ""] au
[1
0 0
1 0 OJ [ 0 0 2
0 00 00
4. 00 00 01 010
<let
0 5. � 0 001 �]
0
d� 6. det
7. [� 001 00
�·
0 4 �] 2
8. [� 000 007 400 !]
0
det
00 01 2
9. det
[! 00 07 00 �]
00 000 401
2
10. det [i � i � i] 0
1
� �
0
1
3
0
5
[H H �] �]
2 -6 1
0 0 0
12 det 0 0 4
[! : ]
0 -14 0
0 0 0
0 0 0 0 0
o o 6 1
0 -2 0 0 0 0
� o 4 o � 3 0 0 0 0 0
14 . det 0 15. det
0 0 0 0 -4 0
l
0 0 0 0 0 5
7
0 0 -6 0 0 0
1
6
�
2
5
3
4
4
3
5
2
6
1
0
1
1
0
0
0
0
0
0
0
0
0
1 2 3 4 5 6 0 0 0 1 0 0
16. det 17 . det
2 3 4 5 6 7 0 0 1 0 0 0
3 4 5 6 7 8 0 0 0 0 2 3
4 5 6 7 8 9 0 0 0 0 4 5
19. Let $B_n$ be the set of all odd permutations in $S_n$. Is $[B_n, \circ]$ a group?
20. Let $C_n$ denote the set of permutations in $S_n$ for which $j_1 = 1$. Is $[C_n, \circ]$ a group?
21. Let $D_n$ denote the set of permutations in $S_n$ such that either $j_1 = 1$ and $j_2 = 2$ or
$j_1 = 2$ and $j_2 = 1$. Is $[D_n, \circ]$ a group?
22. Can any general statement be made about the evenness or oddness of the following
permutation?
\[ p = \begin{pmatrix} 1 & 2 & 3 & \cdots & n-2 & n-1 & n \\ n & n-1 & n-2 & \cdots & 3 & 2 & 1 \end{pmatrix} \]
23. What can you say about the sign of the following permutation?
\[ q = \begin{pmatrix} 1 & 2 & 3 & \cdots & n-2 & n-1 & n \\ n & 1 & 2 & \cdots & n-3 & n-2 & n-1 \end{pmatrix} \]
24. Suppose that $[G, \circ]$ satisfies all the conditions for a group, except that some elements
of G may not have inverses. Let H be the set of all elements of G that have inverses.
Does it follow that $[H, \circ]$ is a group?
25. Find the roots of the equation
det [�: f: fJ � 0.
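\[ \det M^t = \det M \]
for every square matrix M; and we showed that if two rows of M are interchanged,
the effect is to change the sign of det M. These statements in combination give us the
following:
Theorem 1. If two rows of M are interchanged, then the determinant of the resulting
matrix is - det M. Similarly for columns.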
To get the second half of this theorem, we take the transpose, perform the
appropriate interchange of two rows, and take the transpose of the resulting matrix.
The first and third of these operations leave the determinant unchanged, and the
second one reverses the sign.
In fact, since det $M^t$ = det M, every theorem about rows automatically gives us
a theorem about columns.
Theorem 2. If M has two identical rows, then det M = 0. Similarly for columns.
The reason is that when the two identical rows are interchanged, nothing happens
to the determinant (or even to the matrix). Therefore det M = -det M, and
det M = 0.
The minor of an element $a_{ij}$, in a square matrix M, is the matrix that we get by
deleting the ith row and the jth column of M. The minor is denoted by $M_{ij}$, and its
determinant det $M_{ij}$ is denoted by $D_{ij}$.
It is easy to see that the sum of all terms of det M that include $a_{11}$ is $a_{11}D_{11}$:
\[ M = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ a_{31} & a_{32} & a_{33} & \cdots & a_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nn} \end{bmatrix} \]
13.5 Expansions by Minors 591
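Every term of det M that includes $a_{11}$ has the form $\pm a_{11} a_{2j_2} a_{3j_3} \cdots a_{nj_n}$;
here $a_{2j_2} a_{3j_3} \cdots a_{nj_n}$ is (except possibly for sign) a term of $D_{11} = \det M_{11}$. And
these two corresponding terms of det M and det $M_{11}$ have the same sign, because the
permutations
\[ \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\ 1 & j_2 & j_3 & \cdots & j_n \end{pmatrix}, \qquad \begin{pmatrix} 2 & 3 & \cdots & n \\ j_2 & j_3 & \cdots & j_n \end{pmatrix} \]
are both even or both odd.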
Proof. This is known for the case i = j = 1. We shall reduce the theorem to this case.
By a simple row transposition we mean an operation which interchanges two
consecutive rows of a matrix. Similarly for simple column transpositions. We assert
that $a_{ij}$ can be moved into the first column by j - 1 simple column transpositions:
\[ (c_{j-1} c_j), \ (c_{j-2} c_j), \ \ldots, \ (c_1 c_j). \]
Here $c_j$ denotes the jth column. For j = 5, the transpositions are $(c_4c_5)$, $(c_3c_5)$,
$(c_2c_5)$, $(c_1c_5)$. The new order of columns is
\[ c_5, c_1, c_2, c_3, c_4, c_6, \ldots, c_n. \]
Thus the fifth column becomes the first, and the other columns are in the same order,
among themselves, as they were before.
Similarly, we can then move $a_{ij}$ into the first row, by i - 1 simple row
transpositions. Let the new matrix be M'. Then
\[ \det M' = (-1)^{(i-1)+(j-1)} \det M = (-1)^{i+j} \det M. \]
But the sum of all terms of det M' that involve $a_{ij}$ is $a_{ij} \det M'_{11}$, where $M'_{11}$ is the
minor of $a_{ij}$ in M'; and $M'_{11}$ is $M_{ij}$, because our total operations on the rows and
columns of M did not disturb the order of the rows and columns of $M_{ij}$. Therefore
the sum of all terms of det M that involve $a_{ij}$ is
\[ (-1)^{i+j} a_{ij} \det M_{ij} = (-1)^{i+j} a_{ij} D_{ij}. \]
Every term of det M involves exactly one element of the jth column. Adding the above
sums over the jth column, we get
\[ \det M = \sum_{i=1}^{n} (-1)^{i+j} a_{ij} D_{ij}; \]
and the same reasoning, applied to the ith row, gives
\[ \det M = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} D_{ij}. \]
(At this stage you should check to see how these formulas apply to a 3 by 3 matrix,
using, say, the second row and the second column.)
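The suggested check can also be done by machine. The sketch below expands an illustrative 3 by 3 matrix (chosen for this example, not taken from the text) about its second row and its second column. The function names are hypothetical, and indices are 0-based, which leaves the sign $(-1)^{i+j}$ unchanged.

```python
def minor(m, i, j):
    """The matrix obtained by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Determinant by expansion about the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def expand_row(m, i):
    """Sum over j of (-1)^(i+j) a_ij D_ij, for a fixed row i."""
    return sum((-1) ** (i + j) * m[i][j] * det(minor(m, i, j))
               for j in range(len(m)))

def expand_col(m, j):
    """Sum over i of (-1)^(i+j) a_ij D_ij, for a fixed column j."""
    return sum((-1) ** (i + j) * m[i][j] * det(minor(m, i, j))
               for i in range(len(m)))

example = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]
print(det(example), expand_row(example, 1), expand_col(example, 1))
```

All three numbers agree, as the two expansion formulas require.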
If we multiply the elements of one column by the determinants of the minors of
some other column, with the appropriate signs, and add, we get 0:
\[ \sum_{i=1}^{n} (-1)^{i+j} a_{ik} D_{ij} = 0 \qquad (k \neq j). \]
The reason is this. Let M' be the matrix obtained by changing the jth column
so as to make it identical with the kth column of the given matrix M. Then the above
sum is the expansion of M' about the minors of its jth column. Therefore the sum is
det M'. But det M' is 0, because M' has two identical columns. Similarly for rows:
\[ \sum_{j=1}^{n} (-1)^{i+j} a_{kj} D_{ij} = 0 \qquad (k \neq i). \]
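Consider a system of three linear equations in three unknowns,
\[ a_{i1}x_1 + a_{i2}x_2 + a_{i3}x_3 = b_i \qquad (i = 1, 2, 3). \]
Let M be the matrix $[a_{ij}]$ of the system; let D = det M, and suppose that $D \neq 0$.
In the first equation, we multiply by $D_{11}$, in the second by $-D_{21}$, and in the third
by $D_{31}$. Then we add. The coefficient of $x_1$ in the sum is the expansion of D about
the first column, and the coefficients of $x_2$ and $x_3$ are 0. Thus
\[ Dx_1 = b_1D_{11} - b_2D_{21} + b_3D_{31}, \]
an equation in which $x_1$ is the only unknown. The sum on the right-hand side, in the
last equation, is easy to describe: it is the determinant $D_1$ of the matrix
\[ \begin{bmatrix} b_1 & a_{12} & a_{13} \\ b_2 & a_{22} & a_{23} \\ b_3 & a_{32} & a_{33} \end{bmatrix}. \]
Similarly,
\[ Dx_2 = -b_1D_{12} + b_2D_{22} - b_3D_{32}. \]
Here the sum on the right-hand side is the determinant $D_2$ of the matrix obtained by
replacing the second column of M by the $b_i$'s. The same scheme works for $x_3$. Therefore,
if $D \neq 0$, the system has one and only one solution, namely,
\[ x_1 = D_1/D, \quad x_2 = D_2/D, \quad x_3 = D_3/D. \]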
Obviously none of the above discussion depended on the condition n = 3. In general,
we have:
\[ M \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}. \]
Let D = det M. If $D \neq 0$, then the system has one and only one solution; and the
solution is given by the formula
\[ x_j = D_j/D, \]
where $D_j$ is the determinant of the matrix $M_j$ obtained by replacing the jth column
of M by the vector $(b_1, b_2, \ldots, b_n)$.
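Cramer's rule translates directly into a few lines of code. The following is a sketch under stated assumptions: the function names are hypothetical, the determinant is computed by expansion about the first row, exact rational arithmetic is used so that the check by substitution is exact, and the system shown is an illustrative one, not one of the problems in the text.

```python
from fractions import Fraction

def det(m):
    """Determinant by expansion about the first row (fine for small n)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def cramer(m, b):
    """Return [D_1/D, ..., D_n/D]; D_j uses M with column j replaced by b."""
    d = det(m)
    if d == 0:
        raise ValueError("det M = 0: Cramer's rule does not apply")
    solution = []
    for j in range(len(m)):
        m_j = [row[:j] + [b[i]] + row[j + 1:] for i, row in enumerate(m)]
        solution.append(Fraction(det(m_j), d))
    return solution

m = [[2, 1, -1], [1, 3, 2], [1, 0, 1]]   # illustrative system
b = [1, 2, 3]
x = cramer(m, b)
# Substitute back: each left-hand side should reproduce b exactly.
print([sum(m[i][j] * x[j] for j in range(3)) for i in range(3)])
```

Like the definition of det itself, this is a conceptual tool rather than an efficient method: it computes n + 1 determinants.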
Cramer's rule has the following consequence. A square matrix M is called
nonsingular if M has an inverse.
Theorem 9. If M is a square matrix, and $\det M \neq 0$, then M is nonsingular, and
its inverse is given by the formula
\[ M^{-1} = \left[ \frac{1}{D} (-1)^{i+j} D_{ji} \right]. \]
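Proof. Consider the system
\[ M \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}. \]
Here
\[ Dx_j = \sum_{i=1}^{n} (-1)^{i+j} y_i D_{ij}, \qquad x_j = \sum_{i=1}^{n} (-1)^{i+j} (1/D) D_{ij} y_i. \]
Here the y's form a column vector, and if $M' = [c_{ij}]$ is the matrix which gives the x's in terms of the y's, then
\[ x_i = \sum_{j=1}^{n} c_{ij} y_j. \]
To convert our previous formula for $x_j$ to this form, we interchange i and j on the
right, getting
\[ x_i = \sum_{j=1}^{n} (-1)^{i+j} (1/D) D_{ji} y_j. \]
The value of the sum on the right is unchanged when we use j as an index of summation.
Therefore
\[ M^{-1} = M' = [c_{ij}] = \left[ \frac{1}{D} (-1)^{i+j} D_{ji} \right], \]
which was to be proved.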
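The formula of Theorem 9 can be tried out as follows. This is a minimal sketch with hypothetical function names and an illustrative matrix; note the transposed indices, so that the entry in row i and column j uses the minor $D_{ji}$. The determinant of an empty matrix is taken to be 1, an assumption made only so that the 1 by 1 case works.

```python
from fractions import Fraction

def minor(m, i, j):
    """Delete row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Expansion about the first row; det of the empty matrix is taken as 1."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def inverse(m):
    """Entry (i, j) of the inverse is (1/D)(-1)^(i+j) D_ji."""
    d = det(m)
    if d == 0:
        raise ValueError("det M = 0: M is singular")
    n = len(m)
    return [[Fraction((-1) ** (i + j) * det(minor(m, j, i)), d)
             for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Ordinary matrix product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

m = [[2, 1, -1], [1, 3, 2], [1, 0, 1]]   # illustrative matrix with det 10
print(matmul(m, inverse(m)))
```

Multiplying M by the computed inverse, in either order, gives the identity matrix, which is the check by matrix multiplication asked for in the problems that follow.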
x + 2y + 3z = 4,
2x + 3y + 4z = 5,
3x + 4y + 5z = 6.
Carry out the multiplications, add, and solve for x. (Here you are not supposed to use
Cramer's rule; you should use the scheme used in deriving Cramer's rule.)
2. Similarly, solve for y in the system
x - 2 y + 3z = -4,
2x + 3y + 4z = -5,
3x - 4y + 5z = -6.
x - 2y - 3z = 4,
2x + 3y - 4z = -5,
-3x + 4y - 5z = 6.
Find the inverses of the following matrices, by a direct application of Theorem 9, and
check your answers by matrix multiplication.
4.
[� �]. 5.
[� �]. 6.
[! ]-2
4 .
G �]. [� !l [! �l
1 1
7. 8. 1 9. 0
0 0
1 0 0
[� �l [� �l
0 2 0
10. 11.
0 1 0 1
0 0 4 0
Find, by any method, the inverses of the following matrices. (You need not calculate
the determinants unless you need to, as a step in finding the inverse.)
0 0 0 0 0
[� �l [� !l [� !l
0 1 0 0 1 0
12. 13. 14.
1 0 0 0 0 0
0 0 0 1 0 1
0 0 -1 0 0 0
[� �] [� �]
-2 0 0 0 0 1
15. 0 0 0 16. 0 1 0
0 0 0 1 0 0
0 -5 0 0 0 0
·
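17. Suppose we form a 4 by 4 matrix by fitting together four 2 by 2 matrices, like this:
\[ M = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix}. \]
Let D = det M, and let $D_{ij} = \det M_{ij}$ for each i, j. Is it true that
\[ D = \det \begin{bmatrix} D_{11} & D_{12} \\ D_{21} & D_{22} \end{bmatrix}? \]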
18. Similarly, discuss the case in which a 2n by 2n matrix is formed by fitting together
four n by n matrices.
19. Similarly, discuss the case in which a 6 by 6 matrix is formed by fitting together nine
2 by 2 matrices.
We shall now show that when we apply to a square matrix the "triangularization"
process that we applied to systems of linear equations in Section 13.1, the determinant
of the matrix is unchanged.
Theorem 1. If one row of a square matrix is multiplied by a scalar, and the resulting
vector added to another row, the determinant of the matrix is unchanged. Similarly
for columns.
Proof. Suppose that the kth row of the matrix $M = [a_{ij}]$ is multiplied by $\alpha$ and added
to the ith row, giving a matrix M'. Expanding M' about the minors of the ith row,
we get
\[ \det M' = \sum_{j=1}^{n} (-1)^{i+j} (a_{ij} + \alpha a_{kj}) D_{ij} = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} D_{ij} + \alpha \sum_{j=1}^{n} (-1)^{i+j} a_{kj} D_{ij} = \det M + \alpha \cdot 0, \]
by Theorems 4 and 7 of Section 13.5. We get the other half of the theorem by taking
transposes, as in the proof of Theorem 1 of Section 13.5.
Iterations of this procedure constitute the most efficient scheme for computing
determinants; by appropriate row (or column) operations, we can introduce O's
into a particular row (or column), so that when we use an expansion by minors, only
one of the minors needs to be computed. Note that without these preliminaries, an
expansion by minors is not a short cut in computation, but merely a device for
systematizing our work; in an expansion by minors, the same number of terms appear
as under the original definition of the determinant; they have merely been sorted into
n sets of (n - 1) ! terms each.
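The row-operation scheme can be sketched in code as follows. This is an illustration, not the text's own procedure in detail: the function name is hypothetical, and instead of finishing with an expansion by minors the sketch carries the reduction all the way to triangular form and uses the standard fact (assumed here) that the determinant of a triangular matrix is the product of its diagonal entries. Row interchanges change the sign; adding a multiple of one row to another changes nothing.

```python
from fractions import Fraction

def det_by_row_ops(m):
    """Reduce to upper triangular form, tracking the sign of row interchanges."""
    a = [[Fraction(x) for x in row] for row in m]
    n = len(a)
    sign = 1
    for c in range(n):
        # Find a row at or below the diagonal with a nonzero entry in column c.
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)          # no pivot: the determinant is 0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            sign = -sign                # interchanging two rows reverses the sign
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            # Subtracting a multiple of row c leaves the determinant unchanged.
            a[r] = [x - factor * y for x, y in zip(a[r], a[c])]
    result = Fraction(sign)
    for i in range(n):
        result *= a[i][i]
    return result

print(det_by_row_ops([[2, 0, 1], [1, 3, -1], [0, 5, 4]]))
print(det_by_row_ops([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))
```

The work grows roughly like n cubed rather than n!, which is why this is the practical method for large matrices.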
Theorem 2. If the rows of a matrix M form a linearly dependent set, then det M = 0.
Similarly for the columns.
Proof. Let $M = [a_{ij}]$; let the rows be $r_i = (a_{i1}, a_{i2}, \ldots, a_{in})$; suppose that
\[ \sum_{i=1}^{n} \alpha_i r_i = 0, \]
for some set of numbers $\alpha_i$, not all equal to 0. Then some $r_i$, say, $r_1$, is a linear
combination of the others:
\[ r_1 = \sum_{i=2}^{n} \beta_i r_i. \]
13.6 Row and Column Operations 597
·
The converse is a little harder.
Theorem 3. If det M = 0, then the rows of M form a linearly dependent set (and so
also do the columns).
Proof. The proof is by induction. Obviously the theorem holds for 1 by 1 matrices.
We need to show that if it holds for (n - 1) by (n - 1) matrices, then it also holds for
n by n matrices.
Given an n by n matrix $M = [a_{ij}]$, with rows $r_1, r_2, \ldots, r_n$. If any row $r_i$ is the
zero vector, then the linear dependence of the set
\[ R = \{r_1, r_2, \ldots, r_n\} \]
is obvious. Therefore we may assume that $r_1 \neq 0$. We may also assume that $a_{11} \neq 0$,
since the linear dependence or independence of the rows is unaffected by permutations
of the columns.
Now consider the matrix M' whose rows form the set
Proof By the preceding two theorems, each of the following conditions is equivalent
to the next:
purpose, we shall use matrices of functions; and the first thing that we need to
understand is that for matrices of functions, the analogue of Theorem 4 is false. This can be
shown by a very simple example, as follows:
[l 2
M(x) = x 2x
2 2
x 2x
Here the columns are linearly dependent, obviously; but the rows are not: if
\[ \alpha_1 + \alpha_2 x + \alpha_3 x^2 = 0 \]
for every x,
for some real numbers $\alpha_1, \alpha_2, \alpha_3$, then $\alpha_1 = \alpha_2 = \alpha_3 = 0$, because the only
polynomial that vanishes for every x is the zero polynomial.
The simplest general test for linear independence of functions uses the determinant
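of a matrix of functions. Let $f_1, f_2, \ldots, f_n$ be functions on an interval I. The
Wronskian of the sequence $f_1, f_2, \ldots, f_n$ is the function
\[ W = \det \begin{bmatrix} f_1 & f_1' & f_1'' & \cdots & f_1^{(n-1)} \\ f_2 & f_2' & f_2'' & \cdots & f_2^{(n-1)} \\ \vdots & \vdots & \vdots & & \vdots \\ f_n & f_n' & f_n'' & \cdots & f_n^{(n-1)} \end{bmatrix}. \]
Thus
\[ W(x) = \det \left[ f_i^{(j-1)}(x) \right]; \]
the Wronskian matrix has the (j - 1)st derivative of $f_i$ in the ith row and jth column.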
Note that the Wronskian really depends on a sequence of functions, and not merely
on a set of functions; if the same functions are taken in a different order, the sign of W
may change. The notation W(x) is meant to emphasize that the Wronskian is a
function and not a number. The following is easy:
\[ D^j \sum_{i=1}^{n} \alpha_i f_i(x) = 0 \qquad (j = 1, 2, \ldots, n - 1). \]
(Here $D^j[\cdots]$ denotes the jth derivative.) Therefore
\[ \sum_{i=1}^{n} \alpha_i f_i^{(j)}(x) = 0 \qquad (j = 1, 2, \ldots, n - 1), \]
for every x, and the rows are linearly dependent.
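Here
\[ W(x) = \det \begin{bmatrix} \sin x & \cos x \\ \cos x & -\sin x \end{bmatrix}. \]
Here
\[ W(x) = \det \begin{bmatrix} \sin x & \cos x \\ \sin 2x & 2\cos 2x \end{bmatrix}. \]
Here W(0) = 0, but
\[ W(x) = \det \begin{bmatrix} xe^x & e^x + xe^x \\ x^2e^x & 2xe^x + x^2e^x \end{bmatrix} = \det \begin{bmatrix} xe^x & e^x \\ x^2e^x & 2xe^x \end{bmatrix} = xe^{2x} \det \begin{bmatrix} 1 & 1 \\ x & 2x \end{bmatrix} = x^2e^{2x}. \]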
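\[ f_1(x) = x^2, \qquad f_2(x) = \begin{cases} 2x^2 & \text{for } x \geq 0, \\ 3x^2 & \text{for } x \leq 0. \end{cases} \]
Then W(x) = 0 on the interval $[0, \infty)$, because $f_1$ and $f_2$ are linearly dependent on
$[0, \infty)$; and for the same reason, W(x) = 0 on $(-\infty, 0]$. Therefore W(x) = 0 for
every x. But $f_1$ and $f_2$ are linearly independent. Suppose that
\[ \alpha_1 f_1(x) + \alpha_2 f_2(x) = 0. \]
If the above equation holds for every x, then setting x = 1 and x = -1 we get
\[ \alpha_1 + 2\alpha_2 = 0, \qquad \alpha_1 + 3\alpha_2 = 0. \]
By subtraction, $\alpha_2 = 0$. It follows that $\alpha_1 = 0$.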
This indicates that the Wronskian can give proofs of linear dependence only for
special types of functions. Fortunately, these functions are of special interest and
importance, as we shall see.
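The example of $x^2$ and the piecewise function can be checked numerically. The sketch below uses hypothetical function names and hand-coded derivatives; it evaluates the Wronskian at integer points (so the arithmetic is exact) and then confirms that the two equations obtained at x = 1 and x = -1 have only the trivial solution.

```python
def f1(x):
    return x * x

def df1(x):
    return 2 * x

def f2(x):
    """2x^2 for x >= 0 and 3x^2 for x <= 0 (both formulas give 0 at x = 0)."""
    return 2 * x * x if x >= 0 else 3 * x * x

def df2(x):
    return 4 * x if x >= 0 else 6 * x

def wronskian(x):
    """W(x) = f1(x) f2'(x) - f2(x) f1'(x) for the pair (f1, f2)."""
    return f1(x) * df2(x) - f2(x) * df1(x)

# W vanishes at every sample point, on both sides of 0.
print([wronskian(x) for x in range(-5, 6)])

# Coefficients of alpha_1, alpha_2 in the equations at x = 1 and x = -1.
system_det = f1(1) * f2(-1) - f2(1) * f1(-1)
print(system_det)   # nonzero, so alpha_1 = alpha_2 = 0 is the only solution
```

So a vanishing Wronskian does not by itself prove linear dependence for functions of this kind.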
In the following problem set, you will be writing long strings of equations between
determinants, and it will be convenient to use $| \cdots |$ as an abbreviation for det $[\cdots]$.
For example,
\[ \begin{vmatrix} a & b \\ c & d \end{vmatrix} = \det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc. \]
X 1 x12 x13
X2 x22 x3
D4 2 0.
X3 x32 x33
=
x x2 x3
Here the numbers $x_1$, $x_2$, and $x_3$ are all different. How do you know that $D_4$ is a
polynomial of degree three? Express $D_4$ as a product of linear factors, not involving
determinants.
13.7 Linear Differential -Equations 601
2
x
1
x2
2
1 Xn 1
-
1 Xn
*25. Investigate for linear dependence: $\{e^{a_1x}, e^{a_2x}, e^{a_3x}, \ldots, e^{a_nx}\}$, where the $a_i$'s are all
different.
For Eq. (1), and for many others like it, we can give complete answers to all these
questions. (In fact, the only hard one is the third.) As a guide to what to try, we look
first at cases in which some of the solutions are obvious. The equation
\[ f'' - f = 0 \]
has the solutions
\[ f_1(x) = e^x, \qquad f_2(x) = e^{-x}, \]
because $D^2e^x = e^x$ and $D^2e^{-x} = e^{-x}$. And it is easy to see that if $f_1$ and $f_2$ are solutions,
then so also is $\alpha_1f_1 + \alpha_2f_2$, for every pair of scalars $\alpha_1$ and $\alpha_2$.
We had better make a note of this, more generally:
Theorem 1. Consider a differential equation
\[ f^{(n)} + a_{n-1}f^{(n-1)} + \cdots + a_1f' + a_0f = 0, \]
where the $a_i$'s are constants. Let $\mathcal{V}$ be the set of all solutions of the equation. Then $\mathcal{V}$
forms a vector space.
That is, $\mathcal{V}$ is closed under addition and scalar multiplication. This is trivial to
check. Note, however, that if the zero on the right is replaced by a nonzero function,
or even a nonzero constant k, the solutions of the resulting equation never form a
vector space. (If the sum of two solutions is a solution, then 2k = k, and k = 0.)
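The above example suggests that we try solutions of the form
\[ f(x) = e^{mx} \]
in the equation
\[ f'' + bf' + cf = 0. \tag{1} \]
Substituting, we get $(m^2 + bm + c)e^{mx} = 0$;
and since $e^{mx} \neq 0$ for every x, this is equivalent to the equation
\[ m^2 + bm + c = 0. \tag{2} \]
Equation (2) is called the auxiliary equation. There are three possibilities for its
solutions.
I. If $b^2 - 4c > 0$, then there are two roots
\[ m_1 = \frac{-b + \sqrt{b^2 - 4c}}{2}, \qquad m_2 = \frac{-b - \sqrt{b^2 - 4c}}{2}, \]
and both these roots are real.
II. If $b^2 - 4c = 0$, then there is only one root
\[ m_1 = -b/2, \]
and this is a real root of multiplicity 2, with
\[ m^2 + bm + c = (m - m_1)^2. \]
III. If $b^2 - 4c < 0$, then the roots are two conjugate complex numbers
\[ m_1 = \alpha + \beta i, \qquad m_2 = \alpha - \beta i. \]
In case I, the functions $e^{m_1x}$ and $e^{m_2x}$ are solutions, and they form a linearly independent set.
Therefore the solutions that we have found for case I include all the solutions, if
the following theorem is true:
Theorem 2. Let $\mathcal{V}$ be the space of solutions of the equation
\[ f'' + bf' + cf = 0. \]
Then dim $\mathcal{V}$ = 2.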
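The three cases can be organized as a small program. This is a sketch only: the function names are hypothetical, the pairs returned for cases II and III anticipate the solutions $xe^{m_1x}$, $e^{\alpha x}\cos\beta x$, $e^{\alpha x}\sin\beta x$ that are discussed in the remainder of this section, and the check that each function satisfies the equation is a rough numerical one based on central differences.

```python
import math

def solution_basis(b, c):
    """Return (case, g1, g2) for f'' + b f' + c f = 0, following cases I-III."""
    disc = b * b - 4 * c
    if disc > 0:
        root = math.sqrt(disc)
        m1, m2 = (-b + root) / 2, (-b - root) / 2
        return "I", lambda x: math.exp(m1 * x), lambda x: math.exp(m2 * x)
    if disc == 0:
        m1 = -b / 2
        return "II", lambda x: math.exp(m1 * x), lambda x: x * math.exp(m1 * x)
    alpha, beta = -b / 2, math.sqrt(-disc) / 2
    return ("III",
            lambda x: math.exp(alpha * x) * math.cos(beta * x),
            lambda x: math.exp(alpha * x) * math.sin(beta * x))

def residual(f, b, c, x, h=1e-4):
    """Approximate f'' + b f' + c f at x by central differences."""
    d1 = (f(x + h) - f(x - h)) / (2 * h)
    d2 = (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)
    return d2 + b * d1 + c * f(x)

for b, c in [(5, 6), (2, 1), (2, 5)]:
    case, g1, g2 = solution_basis(b, c)
    print(case, abs(residual(g1, b, c, 0.3)) < 1e-5, abs(residual(g2, b, c, 0.3)) < 1e-5)
```

The exact test `disc == 0` is reliable only for coefficients, such as integers, for which $b^2 - 4c$ is computed without rounding.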
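This is true, and will be proved in the following section. Meanwhile we shall use it.
In case II, we seem to have only one solution
\[ f_1(x) = e^{m_1x}; \]
and on the basis of Theorem 2, we need to find another one, $f_2$, such that $\{f_1, f_2\}$ is
linearly independent. We do not know how somebody first thought of trying
\[ f_2(x) = xe^{m_1x}, \]
but at any rate, it works. Since $m_1^2 + bm_1 + c = 0$ and $2m_1 + b = 0$, we have
\[ f_2'' + bf_2' + cf_2 = e^{m_1x}\left[ (m_1^2 + bm_1 + c)x + (2m_1 + b) \right] = 0; \]
and the Wronskian is
\[ W(x) = \det \begin{bmatrix} e^{m_1x} & m_1e^{m_1x} \\ xe^{m_1x} & (m_1x + 1)e^{m_1x} \end{bmatrix} = e^{2m_1x} \det \begin{bmatrix} 1 & m_1 \\ x & m_1x + 1 \end{bmatrix} = e^{2m_1x} \neq 0. \]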
Case III looks peculiar. Taken at face value, the roots of the auxiliary equation
give us
\[ f_1(x) = e^{(\alpha + \beta i)x} = e^{\alpha x}(\cos \beta x + i \sin \beta x), \]
\[ f_2(x) = e^{(\alpha - \beta i)x} = e^{\alpha x}(\cos \beta x - i \sin \beta x). \]
At the outset, we did not intend to get into the complex domain; but in the complex
domain, our formulas still make sense: if m is complex, then the function $f(x) = e^{mx}$
has the derivative $f'(x) = me^{mx}$.
Therefore $f_1$ and $f_2$ really are solutions. But at the moment we are interested only in
real solutions (in another sense), and so we take the real and imaginary parts separately,
getting
\[ g_1(x) = e^{\alpha x} \cos \beta x, \qquad g_2(x) = e^{\alpha x} \sin \beta x. \]
(Check that if a complex-valued function f is a solution, then its real and imaginary
parts are also solutions. This is easier than checking $g_1$ and $g_2$ by a brute-force
calculation.)
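\[ g_1'(x) = \alpha e^{\alpha x} \cos \beta x - \beta e^{\alpha x} \sin \beta x, \qquad g_2'(x) = \alpha e^{\alpha x} \sin \beta x + \beta e^{\alpha x} \cos \beta x. \]
Therefore
\[ W(x) = \det \begin{bmatrix} e^{\alpha x} \cos \beta x & \alpha e^{\alpha x} \cos \beta x - \beta e^{\alpha x} \sin \beta x \\ e^{\alpha x} \sin \beta x & \alpha e^{\alpha x} \sin \beta x + \beta e^{\alpha x} \cos \beta x \end{bmatrix} \]
\[ = e^{2\alpha x} \det \begin{bmatrix} \cos \beta x & \alpha \cos \beta x - \beta \sin \beta x \\ \sin \beta x & \alpha \sin \beta x + \beta \cos \beta x \end{bmatrix} \]
\[ = e^{2\alpha x} \det \begin{bmatrix} \cos \beta x & -\beta \sin \beta x \\ \sin \beta x & \beta \cos \beta x \end{bmatrix} = \beta e^{2\alpha x} \neq 0, \]
because $\beta \neq 0$.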
In case III, the linear combinations $f = k_1g_1 + k_2g_2$ can be put in a convenient form:
\[ f(x) = k_1e^{\alpha x} \cos \beta x + k_2e^{\alpha x} \sin \beta x \]
\[ = \sqrt{k_1^2 + k_2^2}\; e^{\alpha x} \left( \frac{k_1}{\sqrt{k_1^2 + k_2^2}} \cos \beta x + \frac{k_2}{\sqrt{k_1^2 + k_2^2}} \sin \beta x \right) \]
\[ = ke^{\alpha x} \cos \beta(x - x_0), \]
where
\[ k = \sqrt{k_1^2 + k_2^2}, \]
and $x_0$ is any number such that $\cos \beta x_0 = k_1/k$ and $\sin \beta x_0 = k_2/k$. Using t for x,
we get
\[ f(t) = ke^{\alpha t} \cos \beta(t - t_0), \]
which describes the motion of a particle along a line, with the position given as a
function of the time. This kind of motion is called damped oscillation. To get the
graph of f, we start with the "simple oscillating function" $\cos \beta t$; we move the graph
$t_0$ units to the right, so that $t_0$ acts like 0; and then we damp the function by multiplying
each value by $ke^{\alpha t}$. (For $\alpha < 0$, this damps the oscillations as $t \to \infty$; for $\alpha > 0$,
the oscillations are damped as $t \to -\infty$.)
Note that in our formula for f, the constants $\alpha$ and $\beta$ play a very different part
from k and $t_0$: $\alpha$ and $\beta$ are determined by the coefficients b and c in the differential
equation, while k and $t_0$ range arbitrarily.
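The passage from $(k_1, k_2)$ to $(k, x_0)$ can be sketched as follows. The function name and the sample constants are hypothetical; `math.atan2` is used to pick one angle $\beta x_0$ with the required cosine and sine, which is one of the many admissible choices of $x_0$.

```python
import math

def amplitude_phase(k1, k2, beta):
    """Return (k, x0) with cos(beta*x0) = k1/k and sin(beta*x0) = k2/k."""
    k = math.hypot(k1, k2)              # k = sqrt(k1^2 + k2^2)
    x0 = math.atan2(k2, k1) / beta      # one valid choice of x0
    return k, x0

k1, k2, alpha, beta = 3.0, 4.0, -0.5, 2.0   # illustrative constants
k, x0 = amplitude_phase(k1, k2, beta)

def combination(x):
    """The linear combination k1 g1 + k2 g2."""
    return math.exp(alpha * x) * (k1 * math.cos(beta * x) + k2 * math.sin(beta * x))

def damped_cosine(x):
    """The same function in the form k e^(alpha x) cos(beta (x - x0))."""
    return k * math.exp(alpha * x) * math.cos(beta * (x - x0))

print(k, max(abs(combination(x / 10) - damped_cosine(x / 10)) for x in range(-30, 31)))
```

The two expressions agree up to rounding error at every sample point, as the identity $\cos(u - v) = \cos u \cos v + \sin u \sin v$ guarantees.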
It is a fact that a solution f of the equation
\[ f'' + bf' + cf = 0 \]
is completely determined if f(x0) and f'(x0) are known, for some x0. This can be
verified by a calculation, for the three types of solutions that we have found, but the
theorem is best postponed until the next section, where we can give the "right proof."
Meanwhile, in the following problem set, you will find that such initial conditions
always determine an answer.
Case III, in which b2 - 4c < 0, and we get real solutions by making a detour
into complex variables, may seem peculiar, but it is case III that has the most
elementary application in physics: it describes the behavior of a vibrating spring.
This problem is as follows. Suppose that you hang a coiled steel spring from a rigid
support, like this:
The spring has a certain natural length L. If you hang an object of weight w to the
bottom end, the spring will be stretched by a distance s. It turns out experimentally
that if the weight w is not too great, then the ratio w/s is a constant k; that is, s = w/k;
the stretch is proportional to the weight. This statement is called Hooke's law.
The proportionality constant k depends on the physical properties of the spring; the
thicker and stiffer the spring, the larger k will be. This law, of course, applies only
within certain limits: if you hang a brick on the hairspring of a watch, the result will
not be an illustration of the law. Note, however, that the validity of the law for a
given spring and a given range of weights is capable of being tested by static experi
ments; and this is important, because we are about to deduce from Hooke's law first a
differential equation and then a law of motion.
If the spring is in equilibrium, when stretched to a length L + s, with a weight w
at the bottom, then the spring must be exerting a force of magnitude w = ks, upward,
to balance the force w exerted downward by gravity. Let us now set up a coordinate
system on the line which is the axis of the spring, in such a way that the origin is at
the equilibrium point for the given weight. In the figure below, we omit the spring
itself, to clarify the labeling.
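[Figure: the x-axis along the axis of the spring, directed downward, with the origin at the equilibrium point.]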
Suppose that the spring has been stretched to a point with coordinate x. Then two
forces are acting:
1) The force exerted by the spring. This is
\[ F_1 = -k(x + s), \]
because x + s is the total stretch. We use the minus sign because the x-axis is directed
downward.
2) The weight w. This counts positively, because weight acts downward, in the posi
tive direction on the x-axis. Therefore the total force is
F = -k(x + s) + w= -kx.
Now suppose that the weight is pulled down to a certain point x0 and then released.
Then the weight will bob up and down, with its position given as a function of the time.
For x = f(t), the velocity and acceleration are
\[ v(t) = f'(t), \qquad a(t) = f''(t). \]
By Newton's second law,
\[ F(t) = ma(t), \]
where m is the mass. The force represented by the weight is equal to the mass m times
the acceleration g of gravity. Thus w = mg, and m = w/g. This gives
\[ F(t) = \frac{w}{g} a(t). \]
But we know that
\[ F(t) = -kx. \]
Therefore the function f which describes the motion must satisfy the differential
equation
\[ \frac{w}{g} a(t) = -kf(t), \]
which can be written in the form
\[ f''(t) + \frac{kg}{w} f(t) = 0. \]
In each of the following eight problems, find the solution space of the given differential
equation, and then find the scalars which give the solution satisfying the initial conditions
on the right. In each of these cases, you should use the methods but not the results of this
13.8 The Dimension Theorem for the Space of Solutions 607
section of the text. That is, set up the auxiliary equation, solve it, and then use the root(s)
to get two solutions which form a linearly independent set.
9. A spring is such that an 8-lb weight stretches it 6 in. A 4-lb weight is attached, allowed
to reach equilibrium, then pulled 2 in. below the equilibrium point and released. What
happens? What is the period?
10. A spring is such that a 10-lb weight stretches it 18 in. A 1-lb weight is attached, allowed
to reach equilibrium, pushed 6 in. above the equilibrium point, and released. What
happens? What is the period?
11. You found, in Problem 27 of Problem Set 4.3, that the sine and cosine are the only
functionsf and g for which it is true that
f" + bf' + cf = 0;
and for i = 1, 2, 3, and every j, let
Y2 = (J21. Y22• · ·
. ), Ya = (y31, Ys2, · · . ),
which form a "3 by infinity matrix." Show that the rows of this matrix form a linearly
dependent set.
In the preceding section, we found that for every equation of the form
\[ f'' + bf' + cf = 0 \tag{1} \]
the solutions formed a linear space $\mathcal{V}$. In each case, we found solutions $f_1$, $f_2$ such
that $\{f_1, f_2\}$ is linearly independent. We shall now show that the linear combinations
\[ f = \alpha_1f_1 + \alpha_2f_2 \]
are the only solutions.
If f is a solution of (1), then
\[ f'' = -bf' - cf. \]
Here the right-hand side is differentiable, and so also is the left-hand side. Therefore
f has a third derivative; and repeating this reasoning, we see that f has derivatives of all orders.
It follows that for every solution of (1) we can write a "formal Taylor series"
\[ \sum_{i=0}^{\infty} a_i(x - a)^i = \sum_{i=0}^{\infty} \frac{f^{(i)}(a)}{i!} (x - a)^i. \]
We shall now show that f is real-analytic; that is, the Taylor series converges, for
every x, and its sum is the function f that we started with.
Theorem 1. If f is a solution of (1), then
\[ f(x) = \sum_{i=0}^{\infty} \frac{f^{(i)}(a)}{i!} (x - a)^i \qquad \text{for every } x. \]
In the proof, we shall use Taylor's theorem. (This is Theorem 1 of Section 10.10.
Note that we are now using it for the first time.) The theorem says that for each x,
the remainder
\[ R_n(x) = f(x) - \sum_{i=0}^{n} \frac{f^{(i)}(a)}{i!} (x - a)^i \]
is given by the formula
\[ R_n(x) = \frac{f^{(n+1)}(\bar{x})}{(n + 1)!} (x - a)^{n+1}, \]
for some $\bar{x}$ between a and x. We want to conclude that $R_n(x) \to 0$; and to do this,
we need to show that the numbers $f^{(n+1)}(\bar{x})$ cannot increase fast enough to overcome
the effect of the (n + 1)! in the denominator. It should be understood that x is fixed,
throughout the following discussion.
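Let M be a number such that $|f(t)| \leq M$ and $|f'(t)| \leq M$ for every t between a and x;
and let k be a number such that $k \geq 1$, $|b| \leq k$, and $|c| \leq k$. Then for every such t,
\[ |f''(t)| = |-bf'(t) - cf(t)| \leq kM + kM = 2kM. \]
Similarly,
\[ |f^{(3)}(t)| = |-bf''(t) - cf'(t)| \leq k \cdot 2kM + kM < (2k)^2M. \]
We now claim that
\[ |f^{(n+1)}(t)| \leq (2k)^nM \]
for every n.
This is known for n = 1 and n = 2. And if it holds for two successive integers
n - 2 and n - 1, then it holds for the next integer n:
\[ |f^{(n-1)}(t)| \leq (2k)^{n-2}M \ \text{ and } \ |f^{(n)}(t)| \leq (2k)^{n-1}M \]
\[ \Rightarrow \quad |f^{(n+1)}(t)| = |-bf^{(n)}(t) - cf^{(n-1)}(t)| \leq k\,|f^{(n)}(t)| + k\,|f^{(n-1)}(t)| \leq k(2k)^{n-1}M + k(2k)^{n-2}M < (2k)^nM. \]
Therefore
\[ |R_n(x)| \leq \frac{(2k)^nM}{(n + 1)!} |x - a|^{n+1}, \]
which obviously approaches 0; in fact, the expression on the right is the (n + 1)st
term of the series for
\[ \frac{M}{2k} e^{2k|x - a|}. \]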
It is not an accident that the series for f converges with the rapidity of an exponential
series: the solutions that we have found, so far, for our differential equation have been
combinations of exponentials, sines, and cosines; and we are about to find that these
are the only solutions.
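The convergence argument suggests a direct computation: the derivatives $f^{(n)}(a)$ follow from $f(a)$ and $f'(a)$ by the recurrence $f^{(n)} = -bf^{(n-1)} - cf^{(n-2)}$, and the partial sums of the Taylor series can then be compared with a known solution. The sketch below assumes the sample equation $f'' + 5f' + 6f = 0$ with $f(0) = 1$, $f'(0) = 0$, whose solution is $3e^{-2x} - 2e^{-3x}$; the function name and the number of terms are illustrative choices.

```python
import math

def taylor_solution(b, c, a, f_a, df_a, x, terms=40):
    """Sum the Taylor series about a, generating f^(n)(a) by the recurrence."""
    derivs = [f_a, df_a]
    while len(derivs) < terms:
        # f^(n)(a) = -b f^(n-1)(a) - c f^(n-2)(a)
        derivs.append(-b * derivs[-1] - c * derivs[-2])
    return sum(d * (x - a) ** i / math.factorial(i) for i, d in enumerate(derivs))

def exact(x):
    """Solution of f'' + 5f' + 6f = 0 with f(0) = 1, f'(0) = 0."""
    return 3 * math.exp(-2 * x) - 2 * math.exp(-3 * x)

approx = taylor_solution(5, 6, 0, 1, 0, 0.5)
print(approx, exact(0.5))
```

The two values agree to within rounding error, and the initial values $f(a)$, $f'(a)$ alone determine every coefficient of the series.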
Theorem 2. If $f'' + bf' + cf = 0$, and $f(a) = f'(a) = 0$ for some a, then f(x) = 0
for every x.
Proof. Since $f^{(n)} = -bf^{(n-1)} - cf^{(n-2)}$, it follows by induction that $f^{(n)}(a) = 0$
for every n, and so all the coefficients in the Taylor series are equal to 0. This gives
the result which was used without proof in the last problem set:
Theorem 3. If $f_1$ and $f_2$ are solutions of the equation $f'' + bf' + cf = 0$, and
$f_1(a) = f_2(a)$ and $f_1'(a) = f_2'(a)$ for some a, then $f_1 = f_2$.
Proof. Let $f = f_1 - f_2$. Then f is a solution, and
\[ f(a) = f'(a) = 0. \]
The dimension theorem is now easy:
Theorem 4 (The dimension theorem). Let $\mathcal{V}$ be the space of solutions of the equation
$f'' + bf' + cf = 0$. Then dim $\mathcal{V}$ = 2.
Proof. We found, in the preceding section, that every equation of this form has
two linearly independent solutions. Therefore dim $\mathcal{V} \geq 2$. It remains to show that
every three solutions $f_1, f_2, f_3$ form a linearly dependent set. Consider the matrix
for some scalars $\alpha_1, \alpha_2, \alpha_3$, not all equal to 0. Let
If $m_1$ is a root of the equation, with multiplicity $k_1$ (so that $(m - m_1)^{k_1}$ divides the
left-hand member), then the functions
\[ e^{m_1x}, \ xe^{m_1x}, \ \ldots, \ x^{k_1-1}e^{m_1x} \]
are solutions, and form a linearly independent set. If $\alpha + \beta i$, $\alpha - \beta i$ are a pair of
conjugate complex roots, with multiplicity $k_2$, then the functions
\[ e^{\alpha x} \cos \beta x, \ xe^{\alpha x} \cos \beta x, \ \ldots, \ x^{k_2-1}e^{\alpha x} \cos \beta x \]
and
\[ e^{\alpha x} \sin \beta x, \ xe^{\alpha x} \sin \beta x, \ \ldots, \ x^{k_2-1}e^{\alpha x} \sin \beta x \]
are solutions, and are linearly independent. Moreover, the total set of functions
obtained in this way forms a linearly independent set, and the number of elements
in the set is n, because n is the sum of the multiplicities of the roots of the auxiliary
equation. Therefore the dimension of the solution space is at least equal to n. As for
equations of order 2, it can be shown that all solutions of the equation are
real-analytic; and matrix theory then furnishes a proof that every set of n + 1 solutions
forms a linearly dependent set. It follows that the dimension of the solution space is
exactly n.
Thus the results follow the pattern that we found for equations of order 2.
However, to derive them, in a reasonably efficient and natural way, requires new
theoretical ideas, and, in particular, a new kind of algebraic formalism. This theory
is best postponed to a systematic course in differential equations.
Meanwhile we consider what happens, in a linear differential equation with
constant coefficients, when the 0 on the right is replaced by a function. Consider,
for example,
\[ f''(x) + 5f'(x) + 6f(x) = e^x. \tag{1} \]
We know how to find all solutions of the equation
\[ f''(x) + 5f'(x) + 6f(x) = 0. \tag{2} \]
If f and $f_0$ are any two solutions of (1), then by subtraction
\[ (f - f_0)'' + 5(f - f_0)' + 6(f - f_0) = 0. \]
Therefore the function $f - f_0$ is a solution of the reduced equation (2). This means
that if we can find one solution of (1), then we can express all solutions of (1) in the
form
\[ f(x) = \alpha_1e^{-2x} + \alpha_2e^{-3x} + f_0(x); \]
every solution of (1) has this form, because $f - f_0$ is of the form $\alpha_1e^{-2x} + \alpha_2e^{-3x}$.
If the function on the right is real-analytic, then there is a systematic scheme for
looking for solutions of a nonhomogeneous equation: we assume that
$$f(x) = \sum_{i=0}^{\infty} a_i x^i,$$
and solve for the coefficients $a_i$ one at a time. But if the function on the right is simple,
then the method of trial and error may work faster and lead to a simpler formula. In
the example above, we try
The set H does not form a subspace. A set of this kind is called a hyperplane. In
general, if $\mathscr{W}$ is a subspace of a linear space $\mathscr{V}$, and $f_0$ is any point of $\mathscr{V}$, then the set
$$H = \{f + f_0 \mid f \in \mathscr{W}\}$$
is called a hyperplane. Note that every subspace is automatically a hyperplane,
because $f_0$ may be zero. The term hyperplane is suggested by the language of geometry
in Cartesian 3-space. If $E$ is a plane through the origin, and $P_0$ is any point of $R^3$, then
the set
$$H = \{P + P_0 \mid P \in E\}$$
is a plane. (We use the prefix hyper because in vector spaces of higher dimension, the
dimension of a hyperplane may easily be greater than 2. The set H may, of course,
be of dimension 1 or 0; every line in R3 forms a hyperplane, under the above definition,
because every line through the origin forms a subspace. The same applies for a point,
although we rarely have any occasion to say so.)
Similar devices work for various other functions on the right in a nonhomogeneous
equation. For example, consider
which gives
$$-A\sin x - B\cos x + 5(A\cos x - B\sin x) + 6A\sin x + 6B\cos x = \sin x$$
$$\Leftrightarrow (-A - 5B + 6A)\sin x + (-B + 5A + 6B)\cos x = \sin x$$
$$\Leftrightarrow \begin{cases} 5A - 5B = 1, \\ 5A + 5B = 0. \end{cases}$$
This gives
$$f_0(x) = \tfrac{1}{10}(\sin x - \cos x),$$
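The coefficient matching above can be checked numerically. The following is a minimal sketch, assuming the trial function has the form $A\sin x + B\cos x$ and the equation is $f'' + 5f' + 6f = \sin x$ (both inferred from the displayed computation); the helper names `solve_2x2`, `f0`, and `residual` are illustrative only.

```python
import math

def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*A + a12*B = b1, a21*A + a22*B = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det

# System obtained by matching coefficients: 5A - 5B = 1 and 5A + 5B = 0.
A, B = solve_2x2(5.0, -5.0, 5.0, 5.0, 1.0, 0.0)

def f0(x):
    """Trial solution A sin x + B cos x (assumed form)."""
    return A * math.sin(x) + B * math.cos(x)

def residual(x):
    """f0'' + 5 f0' + 6 f0 - sin x, using the exact derivatives of f0."""
    d1 = A * math.cos(x) - B * math.sin(x)
    d2 = -A * math.sin(x) - B * math.cos(x)
    return d2 + 5.0 * d1 + 6.0 * f0(x) - math.sin(x)
```

If the coefficients are right, `residual(x)` should vanish (up to rounding) at every sample point, which is a quick way to catch a sign error in the trial-and-error method.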
For each of the following equations, find the space $\mathscr{W}$ of solutions. Answers should be
in the form
$$\mathscr{W} = \{\alpha_1 f_1 + \alpha_2 f_2 + \cdots + \alpha_n f_n\}.$$
1. $f^{(4)} + 4f^{(3)} + 6f'' + 4f' + f = 0$.
2. $f^{(4)} + f^{(3)} - 3f'' - 5f' - 2f = 0$.
3. $f^{(4)} + 2f^{(3)} - 3f'' - 4f' + 4f = 0$.
4. $f^{(4)} + 2f'' + f = 0$.
5. $f^{(3)} - f'' + f' - f = 0$.
6. $f^{(4)} + f^{(3)} - f' - f = 0$.
7. $f^{(5)} - 2f^{(3)} + f' = 0$.
For each of the following, find (a) the space $\mathscr{W}$ of solutions of the reduced equation, in
the same form as in the preceding problems, and (b) the hyperplane $H$ of solutions of the
given equation, in the form
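[Figure: two cylinders, one with a curve as base (left) and one with a region as base (right), each drawn with a lower base $B_1$ and an upper base $B_2$.]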
1) The figures above suggest that a cylinder is a bounded figure, with a lower
base $B_1$ and an upper base $B_2$. But this is merely a device for clarifying the meaning
of the pictures; according to our definition, cylinders are of infinite extent, in each of
two directions. In the same way, planes are unbounded, although we indicate them
in pictures by drawing parallelograms.
2) The base may be any set of points in a plane. If the base is a curve, as on the left
above, then the cylinder is a surface. If the base is a region, as on the right above,
then the cylinder is a solid. The definition applies to each of these cases in exactly the
same way. In each case, if the plane of the base is regarded as horizontal, then the
cylinder is the union of all vertical lines that intersect the base.
If the base is in the xy-plane, and is described by an equation in x and y, then the
same equation can be regarded as a description of the cylinder.
[Figure: the unit circle in the xy-plane (left), the same circle in perspective (center), and the first-octant portion of the cylinder (right).]
On the left above we show the unit circle in the xy-plane; this is the graph of
the equation $x^2 + y^2 = 1$. In the center above, we see the same figure in perspective,
as it appears when we are about to draw in a z-axis. In the three-dimensional figure
on the right above we see the portion of the cylinder that lies in the first octant. The
cylinder is the graph of the same equation $x^2 + y^2 = 1$; since the equation imposes
no restriction on $z$, the graph includes the vertical line through each of its points.
To be more precise, the circle is
$$\{(x, y) \mid x^2 + y^2 = 1\},$$
and the cylinder is
$$\{(x, y, z) \mid x^2 + y^2 = 1\}.$$
The relations among these figures deserve careful examination. At the left above,
the tangents to the circle at the y-intercepts are horizontal, that is, parallel to the x-axis.
This should be true also in the perspective drawings at the center and right. Hence
the dotted guide lines. Similarly, the tangents to the circle at the x-intercepts are
vertical, that is, parallel to the y-axis. This should also be true in the perspective
drawings; it is indicated by the dotted guide lines. Often a correctly drawn figure
looks peculiar, unless you analyze it in this way. For example:
[Figure: the arc $\{(x, y) \mid y = x^2,\ 0 \le x \le 1\}$ in the xy-plane, and the cylinder $\{(x, y, z) \mid y = x^2,\ 0 \le x \le 1\}$ having that arc as base.]
which is
$$\{(x, y, z) \mid x^2 + y^2 \le 1\}.$$
If we use the xz-plane or the yz-plane as the plane of the base, then the same
scheme works, in a similar way. For example:
[Figure: the segment $x + z = 1$, $0 \le x \le 1$, in the xz-plane, and the cylinder $\{(x, y, z) \mid x + z = 1,\ 0 \le x \le 1\}$.]
As usual, the figure is cut off at the ends, to clarify it in a pictorial sense. In its own
plane, the cylinder is an infinite strip, of width $\sqrt{2}$.
If we had used the entire line x + z = 1, y = 0 as base, then the cylinder would
have been the entire plane
$$\{(x, y, z) \mid x + z = 1\}.$$
This plane is parallel to the y-axis. Thus any plane parallel to one of the coordinate
axes can be described as a cylinder. In fact, for appropriate choice of the base plane,
any plane whatever can be regarded as a cylinder.
We have seen that cylindrical surfaces with their bases in the coordinate planes
are easy to describe by equations (if their bases are so describable). The next simplest
surfaces are the surfaces of revolution, whose areas we learned to compute in Section
7.5. Given a curve in, say, the yz-plane, we may rotate the curve about the y-axis.
This generates a surface.
14.1 Surfaces and Solids in R3 617
The cross sections of the surface, in planes parallel to the xz-plane, are all circles,
with their centers on the y-axis. If the generating curve is described by a function, say,
$$z = f(y) \ge 0,$$
then for each $y_0$, the cross section in the plane $y = y_0$ is the circle with center at
$(0, y_0, 0)$ and radius $f(y_0)$. Thus the cross section is the graph of the condition
$$y = y_0, \qquad x^2 + z^2 = [f(y_0)]^2.$$
Two important special cases are as follows:
$$z = \sqrt{a^2 - y^2}, \qquad x = 0.$$
This is a semicircle. We rotate about the y-axis. The surface of revolution is the graph
of the equation
$$x^2 + z^2 = \left[\sqrt{a^2 - y^2}\right]^2 = a^2 - y^2 \Leftrightarrow x^2 + y^2 + z^2 = a^2.$$
This is as it should be, because the surface of revolution is the sphere with center
at the origin and radius $a$; it is easy to see by the distance formula that the sphere
must be the graph of the equation
$$\sqrt{x^2 + y^2 + z^2} = a \Leftrightarrow x^2 + y^2 + z^2 = a^2.$$
in the yz-plane. When we rotate about the y-axis, we get a cone (that is, a conical
surface).
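Both special cases can be spot-checked numerically. The sketch below rotates a point $(0, y, z)$ of the generating curve about the y-axis and evaluates the equation that the surface is supposed to satisfy. The generating line $z = my$ for the cone is an assumption (it matches the conical equation $x^2 + z^2 = (my)^2$), and all function names are illustrative.

```python
import math

def rotate_about_y_axis(y, z, theta):
    """Rotate the point (0, y, z) of the yz-plane about the y-axis through angle theta."""
    return (z * math.sin(theta), y, z * math.cos(theta))

def semicircle(a, y):
    """Height z = sqrt(a^2 - y^2) of the semicircle of radius a, for -a <= y <= a."""
    return math.sqrt(a * a - y * y)

def sphere_residual(a, y, theta):
    """x^2 + y^2 + z^2 - a^2 at a point of the rotated semicircle."""
    px, py, pz = rotate_about_y_axis(y, semicircle(a, y), theta)
    return px * px + py * py + pz * pz - a * a

def cone_residual(m, y, theta):
    """x^2 + z^2 - (m*y)^2 at a point of the rotated line z = m*y (assumed generator)."""
    px, py, pz = rotate_about_y_axis(y, m * y, theta)
    return px * px + pz * pz - (m * py) ** 2
```

Both residuals should be zero up to rounding for every admissible $y$ and every angle, since each cross section in a plane $y = y_0$ is a circle of radius $f(y_0)$.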
618 Functions of Several Variables 14.1
As usual, the figure shows only the first octant. The conical surface is the graph of
the equation
$$x^2 + z^2 = (my)^2 \Leftrightarrow x^2 - m^2y^2 + z^2 = 0.$$
If we had taken a line through the origin and rotated it about one of the other
coordinate axes, we would have gotten an equation of one of the forms
$$\text{(a)}\ -m^2x^2 + y^2 + z^2 = 0, \qquad \text{(b)}\ x^2 + y^2 - m^2z^2 = 0.$$
[Figure: the cones (a) and (b).]
Each of the surfaces that we have investigated so far has been the graph of an
equation of the second degree in x, y, and z, that is, an equation of the form
$$Ax^2 + By^2 + Cz^2 + Dxy + Exz + Fyz + Gx + Hy + Iz + J = 0,$$
where the first six coefficients are not all equal to 0. Using the method of rotation of
axes in a plane, as in Section 8.4, we can find out what the plane cross sections of such
surfaces are like. Let $E_0$ be any plane, and let $N$ be the normal line to $E_0$ through the
origin. Let $F_0$ be the plane which contains $N$ and the z-axis, and let $L$ be the line in
which $F_0$ intersects the xy-plane. By a rotation of axes in the xy-plane, we can make
$L$ the $x'$-axis; the equations for this rotation are of the form
$$x = x'\cos\theta - y'\sin\theta \quad\text{and}\quad y = x'\sin\theta + y'\cos\theta.$$
In the new coordinate system, the equation of the surface that we started with has the
form
$$A'x'^2 + B'y'^2 + C'z^2 + D'x'y' + E'x'z + F'y'z + G'x' + H'y' + I'z + J = 0.$$
(Query: How do we know that the constant term is unchanged? And how do we
know that the first six coefficients are not all equal to 0?) In the x'z-plane, we now
perform another rotation of axes, in such a way that $N$ becomes the new $x''$-axis. The
equations for this rotation are of the form
$$x' = x''\cos\phi - z'\sin\phi, \qquad z = x''\sin\phi + z'\cos\phi;$$
and in the $x''$-, $y'$-, $z'$-coordinate system, the equation of our surface is still of the
second degree, for the same reason as before. The plane $E_0$ is the graph of an equation
of the form
$$x'' = k,$$
where $k$ is the distance between the origin and $E_0$. To get the equation of the
intersection of the surface with $E_0$, we should set $x'' = k$ in the equation of the surface.
This gives an equation of the second degree in y' and z'. By Theorem 2 of Section 8.4,
this means that every plane cross section of a second-degree surface is (a) a circle,
(b) a parabola, (c) an ellipse, (d) a hyperbola, (e) a point, (f) the empty set, (g) a line,
or (h) the union of two lines (either parallel or intersecting).
In particular, every plane cross section of a cone is a "conic section" of the sort
that we investigated in Chapter 8.
Sketch the graphs of the following, in the first octant only. All the equations are to be
regarded as equations in (x, y, z). For example, x+y=1 is the equation of a plane,
$x^2 + y^2 - 1 = 0$ is the equation of a cylindrical surface, and so on.
$$x^2 + y^2 + z^2 = a^2$$
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$$
is called an ellipsoid.
The sketch on the left shows the entire surface. On the right we show only the
part of the surface that lies in the first octant. Such partial sketches are much easier
to draw, and sometimes they are actually easier to interpret and to use.
14.2 The Quadric Surfaces 621
$$\frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 - \frac{x_0^2}{a^2} = k.$$
For $k > 0$ the cross section is an ellipse or a circle; for $k = 0$ the graph is the point
$(x_0, 0, 0)$; and for $k < 0$ the graph is empty. Similarly for $y = y_0$ or $z = z_0$.
2) Elliptic and circular cones. We found in the last section that the graph of the
equation
$$x^2 - m^2y^2 + z^2 = 0$$
$$\frac{x^2}{a^2} - y^2 + \frac{z^2}{b^2} = 0$$
[Figure: the cone, entire (left) and in the first octant (right).]
The figure on the left shows the entire cone, and the one on the right shows only the
portion that lies in the first octant. The cross section in the yz-plane is obviously a
pair of lines, because it is the graph of
$$x = 0, \qquad y^2 = \frac{z^2}{b^2}.$$
Similarly, the cross section in the xy-plane is a pair of lines, because it is the graph of
$$z = 0, \qquad y^2 = \frac{x^2}{a^2}.$$
$$y = y_0, \qquad \frac{x^2}{a^2} + \frac{z^2}{b^2} = y_0^2.$$
This is a point for $y_0 = 0$, and is a circle or an ellipse for $y_0 \ne 0$. Note that the cross
section in the plane $x = x_0$ ($x_0 \ne 0$) is the graph of
which is a hyperbola.
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1.$$
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 + \frac{z_0^2}{c^2} = k \ge 1.$$
Therefore all horizontal cross sections of the graph are ellipses. Rewriting in the form
$$\frac{x^2}{a^2k} + \frac{y^2}{b^2k} = 1,$$
we see that as $|z_0|$ increases the ellipses get bigger, but their shape does not change.
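The claim about size and shape can be made concrete: the ellipse in the plane $z = z_0$ has semi-axes $a\sqrt{k}$ and $b\sqrt{k}$, so their ratio is always $a/b$. A minimal sketch follows; the function name `horizontal_section` is illustrative and not part of the text.

```python
import math

def horizontal_section(a, b, c, z0):
    """Semi-axes of the ellipse cut from x^2/a^2 + y^2/b^2 - z^2/c^2 = 1 by the plane z = z0."""
    k = 1.0 + (z0 / c) ** 2          # k >= 1, as in the text
    return a * math.sqrt(k), b * math.sqrt(k)
```

Calling `horizontal_section(2.0, 3.0, 1.0, z0)` for increasing $|z_0|$ should return growing semi-axes whose ratio stays equal to $2/3$.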
The cross sections in the other coordinate planes are hyperbolas; they are the graphs
of the conditions
$$x = 0, \qquad \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1,$$
$$y = 0, \qquad \frac{x^2}{a^2} - \frac{z^2}{c^2} = 1.$$
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1 \Leftrightarrow \frac{y^2}{b^2} + \frac{z^2}{c^2} = \frac{x^2}{a^2} - 1.$$
Again we investigate cross sections. For $|x_0| < a$, the cross section in the plane
$x = x_0$ is empty; for $x_0 = \pm a$, the graph is a point; and for $x_0^2 > a^2$, the graph is an
ellipse (or a circle), being the graph of
$$x = x_0, \qquad \frac{y^2}{b^2} + \frac{z^2}{c^2} = \frac{x_0^2}{a^2} - 1 > 0.$$
The cross sections in the xz-plane and the xy-plane are obviously hyperbolas.
5) The hyperbolic paraboloid. This one is hard to visualize and hard to sketch. It is
the graph of the equation
$$cz = \frac{y^2}{b^2} - \frac{x^2}{a^2} \qquad (c \ne 0).$$
We give the sketch for the case $a = b = c = 1$. Thus the equation of the surface
becomes
$$z = y^2 - x^2.$$
Using these cross sections in a perspective drawing, we get the result shown
below.
For other values of a and b, we get hyperbolas of different shapes in the horizontal
cross sections. And when the sign of c is changed, the effect is to reflect the
surface across the xy-plane.
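Sketch the graphs of the following equations, and identify the surfaces.
1. $x^2 + \frac{y^2}{4} + \frac{z^2}{9} = 1$
2. $x^2 - \frac{y^2}{4} + \frac{z^2}{9} = 1$
3. $x^2 + \frac{y^2}{4} - \frac{z^2}{9} = 1$
4. $x^2 - \frac{y^2}{4} - \frac{z^2}{9} = 1$
5. $x^2 - \frac{y^2}{4} - \frac{z^2}{9} = 0$
6. $x^2 + \frac{y^2}{4} - \frac{z^2}{9} = 0$
7. $-x^2 + \frac{y^2}{4} - \frac{z^2}{9} = 0$
8. $x^2 - \frac{y^2}{4} - \frac{z^2}{9} = -1$
9. $z = x^2 - \frac{y^2}{4}$
10. $z^2 = x^2 + \frac{y^2}{4}$
11. $z = \frac{y^2}{4} - x^2$
12. $z^2 = \frac{y^2}{4} - x$
$$cz = \frac{y^2}{b^2} - \frac{x^2}{a^2} \qquad (c \ne 0).$$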
Let $(x_0, y_0)$ be any point of the xy-plane. For each $\alpha$, consider the path whose coordinate
functions are
$$x = x_0 + t\cos\alpha, \qquad y = y_0 + t\sin\alpha.$$
(In effect, we have set up a coordinate system on a line $L$ through $(x_0, y_0)$, in such a
way that $L$ becomes the t-axis.) Now consider the path in space defined by the coordinate
functions
$$x = x_0 + t\cos\alpha,$$
$$y = y_0 + t\sin\alpha,$$
$$z = \frac{1}{c}\left(\frac{y^2}{b^2} - \frac{x^2}{a^2}\right) = \frac{1}{b^2c}(y_0 + t\sin\alpha)^2 - \frac{1}{a^2c}(x_0 + t\cos\alpha)^2.$$
18. Find the volume of the region inside the graph of the equation
$$0 \le y \le 1, \qquad -y \le x \le y,$$
[This is the solid which lies (a) above the xy-plane, (b) below the hyperbolic paraboloid
$z = y^2 - x^2$, and (c) between the planes $y = 0$ and $y = 1$.]
20. Find the volume of the solid which lies between the planes $z = 0$ and $z = 1$, and
inside the one-sheeted hyperboloid
$$x^2 + \frac{y^2}{4} = z^2 + 1.$$
22. Find the volume of the solid which lies between the planes $z = 1$ and $z = 2$ and the
two-sheeted hyperboloid
$$z^2 = x^2 + y^2 + 1.$$
So far, most of the functions that we have been studying have been of the following
types.
1) Functions whose domains are sets of real numbers. In these cases, the domain
$D$ was usually an interval.
2) Functions of one vector space into another. These were always linear, and were
referred to as linear transformations.
$$f : D \to R,$$
[Figure: a region $D$ in the xy-plane, a point $P$ of $D$, and the value $z = f(P)$.]
14.3 Functions of Two Variables 627
Suppose that a rule is given under which to each point P of D there corresponds a
real number z. We then say that we have a function
$$f : D \to R.$$
As for a function defined on an interval, the limit of $f$ is defined, more generally,
for the case in which $P_0$ does not necessarily lie in $D$. In this case,
$$\lim_{P \to P_0} f(P) = L$$
means that
$$P \approx P_0 \ \text{and}\ P \ne P_0 \Rightarrow f(P) \approx L.$$
To make these ideas precise, we interpret $P \approx P_0$ to mean that $\|P - P_0\|$ is small,
and we interpret $f(P) \approx L$ to mean that $|f(P) - L|$ is small. This gives the following
definition:
Definition. Let $D$ be a region in an inner-product space $\mathscr{V}$, let $f$ be a function $D \to R$,
let $P_0$ be a point of $\mathscr{V}$ (not necessarily lying in $D$), and let $L$ be a real number. Suppose
that for every $\epsilon > 0$ there is a $\delta > 0$ such that
Theorem 1. Let $D$ be a region in a vector space $\mathscr{V}$, and let $f$ and $g$ be functions
$D \to R$. If
$$\lim_{P \to P_0} f(P) = L \quad\text{and}\quad \lim_{P \to P_0} g(P) = L',$$
then
$$\lim_{P \to P_0} [f(P) + g(P)] = L + L' \quad\text{and}\quad \lim_{P \to P_0} [f(P)g(P)] = LL'.$$
If $L' \ne 0$, then
$$\lim_{P \to P_0} \frac{f(P)}{g(P)} = \frac{L}{L'}.$$
Then
$$\lim_{P \to P_0} f(P) = L.$$
To see this, we merely need to examine the geometric meaning of the inequalities
on the left (preceding the $\Rightarrow$).
[Figure: the square $|x - x_0| < \delta$, $|y - y_0| < \delta$, with its inscribed circle.]
The inequalities $|x - x_0| < \delta$, $|y - y_0| < \delta$ hold inside the square, and the
inequality $\|P - P_0\| < \delta$ holds inside the inscribed circle. Therefore
$$\|P - P_0\| < \delta \Rightarrow |f(x, y) - L| < \epsilon;$$
and so
$$\lim_{P \to P_0} f(P) = L.$$
From this it follows, as for functions of one variable, that for functions of two
variables, continuity is preserved under composition of functions:
Theorem 3. Let $f$, $g$, and $h$ be continuous functions of two variables. For each
$P = (x, y)$, let
$$\phi(x, y) = f[g(x, y), h(x, y)].$$
Therefore
$$f(x, y) = \cos\frac{x^2y^2 - x^3y}{1 + x^2 + y^2}$$
$$f : D \to R,$$
take a fixed $y_0$, and consider the intersection of the graph with the plane $y = y_0$.
The cross section is a curve, and is the graph of a function $\phi_{y_0}$: for each $x$ for which
$(x, y_0)$ is in the domain of $f$, we have
Geometrically, $f_x(x_0, y_0)$ is the slope of the surface in the x-direction at the point
$(x_0, y_0)$.
Naturally, we can restate this in purely analytic terms, without any reference to
the geometry, and without mentioning a slice function. We can state:
Definition
$$f_x(x_0, y_0) = \lim_{x \to x_0} \frac{f(x, y_0) - f(x_0, y_0)}{x - x_0},$$
if such a limit exists.
The function $f_x$ is called the partial derivative of $f$ with respect to $x$.
Standard differentiation formulas give us partial derivatives very easily. For
example, given
This gives
The formal rule is simple: regard y as a constant, regard x as the dummy letter
defining the function, and differentiate.
We can equally well consider slice functions in the y-direction, setting $x = x_0$.
The derivative of the slice function is now the partial derivative of $f$ with respect to $y$.
More precisely:
Definition
$$f_y(x_0, y_0) = \lim_{y \to y_0} \frac{f(x_0, y) - f(x_0, y_0)}{y - y_0},$$
if such a limit exists.
The partial derivatives $f_x$ and $f_y$, once we get them, are also functions; and their
partial derivatives are defined in the same way. For example, consider
$$f_{xx}(x, y) = 6x + 2y.$$
Similarly
$$f_{yy}(x, y) = 2x + 12y^2.$$
Now $f_{xy}$ is the partial derivative of $f_x$ with respect to $y$. We have
$$f_{xy} = D_y(3x^2 + y^2 + 2xy) = 2y + 2x.$$
And similarly
$$f_{yx} = D_x(2xy + x^2 + 4y^3) = 2y + 2x.$$
Note that while $f_{xy}$ and $f_{yx}$ turned out to be the same function, they were not defined
in the same way, and they were not arrived at by the same process. Therefore the fact
that $f_{xy} = f_{yx}$, for this particular function $f$, must be due either to an accident or to a
nontrivial theorem.
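The two mixed partials can also be compared numerically, following the two different processes literally. The sketch below assumes $f(x, y) = x^3 + x^2y + xy^2 + y^4$, which is one function consistent with the first partials $3x^2 + y^2 + 2xy$ and $2xy + x^2 + 4y^3$ quoted above (an inference, since the defining formula is not shown here). Central differences stand in for the limits, and the helper names are illustrative.

```python
def f(x, y):
    """Assumed example function, chosen to match the partial derivatives in the text."""
    return x**3 + x**2 * y + x * y**2 + y**4

def partial_x(g, h=1e-3):
    """Central-difference approximation to the partial derivative of g with respect to x."""
    return lambda x, y: (g(x + h, y) - g(x - h, y)) / (2 * h)

def partial_y(g, h=1e-3):
    """Central-difference approximation to the partial derivative of g with respect to y."""
    return lambda x, y: (g(x, y + h) - g(x, y - h)) / (2 * h)

f_x = partial_x(f)
f_y = partial_y(f)
f_xy = partial_y(f_x)   # first differentiate in x, then in y
f_yx = partial_x(f_y)   # first differentiate in y, then in x
```

At any sample point, `f_xy` and `f_yx` should both be close to $2y + 2x$, even though they are computed in different orders.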
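Finally, some remarks on notation. Often, people write
$$\frac{\partial f}{\partial x}\ \text{for}\ f_x, \qquad \frac{\partial f}{\partial y}\ \text{for}\ f_y,$$
$$\frac{\partial^2 f}{\partial x^2}\ \text{for}\ f_{xx}, \qquad \frac{\partial^2 f}{\partial y^2}\ \text{for}\ f_{yy},$$
$$\frac{\partial^2 f}{\partial y\,\partial x}\ \text{for}\ f_{xy}, \qquad \frac{\partial^2 f}{\partial x\,\partial y}\ \text{for}\ f_{yx},$$
and so on. Note, in the last line, that in the symbols $f_{xy}$ and $f_{yx}$, the letters indicating
partial differentiation accumulate on the right; while in the symbols
$$\frac{\partial^2 f}{\partial y\,\partial x} \quad\text{and}\quad \frac{\partial^2 f}{\partial x\,\partial y}$$
the letters indicating partial differentiation accumulate on the left. Thus
$$\frac{\partial^3 f}{\partial x\,\partial y\,\partial y} = f_{yyx} \quad\text{and}\quad \frac{\partial^3 f}{\partial y\,\partial y\,\partial x} = f_{xyy}.$$
Note that in the $\partial$-notation, the symbols for higher derivatives look like "products"
of "factors" of the types $\partial/\partial x$, $\partial/\partial y$. Thus
$$f_{xyy} = \frac{\partial}{\partial y}\,\frac{\partial}{\partial y}\,\frac{\partial}{\partial x}\,f(x, y) = \frac{\partial^3 f}{\partial y\,\partial y\,\partial x}.$$
This is why the symbols accumulate on the left instead of the right.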
Citing the theorems of this section, at the points where you need them, show that each
of the following functions is continuous.
4. $f(x, y) = \frac{xy}{x^2 + y^2}$ $[(x, y) \ne (0, 0)]$
5. $f(x, y) = \sin(x^2y + y^2x)$
6. $f(x, y) = \frac{x^2 - y^2}{x^2 + y^2}$ $[(x, y) \ne (0, 0)]$
7. $f(x, y) = \frac{xy}{x^2 + y^2 + 1}$
8. $f(x, y) = \frac{\sin x}{y}$ $(y \ne 0)$
9. $f(x, y) = \frac{\cos y - 1}{x}$ $(x \ne 0)$
10. $f(x, y) = \frac{\sin xy}{x^2 + y^2}$ $[(x, y) \ne (0, 0)]$
11. $f(x, y) = \frac{\cos y}{x^2 + y^2}$ $[(x, y) \ne (0, 0)]$
Problems 12 through 22. For each of the functions $f$ given in Problems 1 through 11,
find $f_x$, $f_y$, $f_{xy}$, and $f_{yx}$.
23. Obviously the definition of $f$, in Problem 4, is valid only for $(x, y) \ne (0, 0)$. Is it
possible to give a separate definition of $f(0, 0)$, in such a way that the resulting function
is continuous? That is, is there any such thing as
$$(?)\quad \lim_{(x, y) \to (0, 0)} \frac{xy}{x^2 + y^2}\ ?$$
$$f(x, y) = \sum_{i=0}^{n}\sum_{j=0}^{n} a_{ij}x^iy^j.$$
*33. Write a complete proof of Theorem 3, showing that for every $\epsilon > 0$ there is a $\delta > 0$
such that ...
Given a function
$$f : D \to R$$
$$x = x_0 + t\cos\alpha, \qquad y = y_0 + t\sin\alpha.$$
For $\alpha = 0$, the slice is parallel to the x-axis, and so it ought to be true that $f_0 = f_x$.
And this is true: for $\alpha = 0$ we have $\cos\alpha = 1$, $\sin\alpha = 0$, and
$$f_0(x_0, y_0) = \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x, y_0) - f(x_0, y_0)}{\Delta x} = f_x(x_0, y_0).$$
Similarly, for $\alpha = \pi/2$ we have
$$\cos\alpha = 0, \qquad \sin\alpha = 1,$$
and
$$f_{\pi/2}(x_0, y_0) = f_y(x_0, y_0).$$
We now want a general formula for $f_\alpha$. As a guide to what we should be aiming at,
we consider first the simplest case, in which $f$ is linear, with
We shall now see that this formula holds under much more general conditions,
when $f$ is not necessarily linear, but is "approximately linear near $(x_0, y_0)$," in a sense
which we shall define presently.
We recall that if $f$ is a differentiable function of one variable, the difference
$$\Delta f = f(x_0 + \Delta x) - f(x_0)$$
can be expressed in the form
It is fairly easy to find out what form the formula has to take if it exists at all. If $f$ is
linear, with
$$f(x, y) = Ax + By + C,$$
then
$$\Delta f = Ax + By + C - (Ax_0 + By_0 + C) = A(x - x_0) + B(y - y_0) = A\,\Delta x + B\,\Delta y.$$
Here, as before,
$$A = f_x(x, y), \qquad B = f_y(x, y);$$
for a linear function, the partial derivatives are simply the coefficients of x and y.
This suggests that our expression for $\Delta f$ ought to take the form
$$f(x, y) = x^2 + xy + y^2.$$
Here
$$\phi(x) = f(x, y_0 + \Delta y),$$
on the interval from $x_0$ to $x_0 + \Delta x$.
14.4 Directional Derivatives and Differentiable Functions 637
The mean-value theorem says that there is an $\bar{x}$, between $x_0$ and $x_0 + \Delta x$, such
that
$$\psi(y) = f(x_0, y),$$
on the interval from $y_0$ to $y_0 + \Delta y$. Thus
$$\Delta f = f_x(\bar{x}, y_0 + \Delta y)\,\Delta x + f_y(x_0, \bar{y})\,\Delta y,$$
where $\bar{x}$ is between $x_0$ and $x_0 + \Delta x$, and $\bar{y}$ is between $y_0$ and $y_0 + \Delta y$.
We are now almost finished. For each $\Delta x$, $\Delta y$, let
Then
and
Therefore
$$\Delta f = f_x(x_0, y_0)\,\Delta x + f_y(x_0, y_0)\,\Delta y + E_1(\Delta x, \Delta y)\,\Delta x + E_2(\Delta x, \Delta y)\,\Delta y.$$
Note that
$$\lim_{(\Delta x, \Delta y) \to (0, 0)} E_1(\Delta x, \Delta y) = 0,$$
because $f_x$ and $f_y$ are continuous. Thus, if the partial derivatives of $f$ are continuous,
then $\Delta f$ is well approximated by the linear function $f_x(x_0, y_0)\,\Delta x + f_y(x_0, y_0)\,\Delta y$.
For functions of two or more variables, the idea of approximation by linear functions
is used as the definition of differentiability. To be exact:
Definition. Let $D$ be a domain in $R^2$, let $f$ be a function $D \to R$, and let $P_0 = (x_0, y_0)$
be a point of $D$. Suppose that there is a linear function
For functions of one variable, we defined the differential to be the linear function
Proof. By definition,
$$\Delta f = A\,\Delta x + B\,\Delta y + E_1\,\Delta x + E_2\,\Delta y$$
$$= f_x(x_0, y_0)\,\Delta x + f_y(x_0, y_0)\,\Delta y + E_1\,\Delta x + E_2\,\Delta y$$
$$= f_x(x_0, y_0)\,t\cos\alpha + f_y(x_0, y_0)\,t\sin\alpha + E_1 t\cos\alpha + E_2 t\sin\alpha.$$
Therefore
$$\frac{\Delta f}{t} = f_x(x_0, y_0)\cos\alpha + f_y(x_0, y_0)\sin\alpha + E_1(t\cos\alpha, t\sin\alpha)\cos\alpha + E_2(t\cos\alpha, t\sin\alpha)\sin\alpha,$$
and so
$$f_\alpha(x_0, y_0) = \lim_{t \to 0} \frac{\Delta f}{t} = f_x(x_0, y_0)\cos\alpha + f_y(x_0, y_0)\sin\alpha + 0\cdot\cos\alpha + 0\cdot\sin\alpha,$$
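The formula $f_\alpha = f_x\cos\alpha + f_y\sin\alpha$ can be compared against a difference quotient along the slice. The sketch below uses $f(x, y) = x^2 + xy + y^2$ as the sample function; the symmetric difference quotient is a numerical stand-in for the limit (an assumption of this sketch, not the definition itself), and the function names are illustrative.

```python
import math

def f(x, y):
    return x * x + x * y + y * y

def directional_formula(x0, y0, alpha):
    """f_x cos(alpha) + f_y sin(alpha), with f_x = 2x + y and f_y = x + 2y."""
    fx = 2 * x0 + y0
    fy = x0 + 2 * y0
    return fx * math.cos(alpha) + fy * math.sin(alpha)

def directional_numeric(x0, y0, alpha, t=1e-6):
    """Symmetric difference quotient of f along the line through (x0, y0) in direction alpha."""
    c, s = math.cos(alpha), math.sin(alpha)
    return (f(x0 + t * c, y0 + t * s) - f(x0 - t * c, y0 - t * s)) / (2 * t)
```

For $\alpha = 0$ and $\alpha = \pi/2$ the formula reduces to $f_x$ and $f_y$, which matches the two special cases of the directional derivative.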
In the first five problems in the following problem set, you are asked to
"verify directly" that certain functions are differentiable at certain points. In each
of these cases, you should go through an elementary calculation to express $\Delta f$ in the
form
Note that other choices of E1 and E2 would have worked just as well. For example,
Verify directly that each of the following nine functions is differentiable at the indicated
point.
10. Given $f(x, y) = \sqrt{x^2 + y^2}$, $(x_0, y_0) = (1, 1)$, get a general formula for $f_\alpha(x_0, y_0)$.
For which $\alpha$ does $f_\alpha(x_0, y_0)$ take on its maximum value? For which $\alpha$ do we get the
minimum value?
11. Same question, for $f(x, y) = x^2 - y^2$, $(x_0, y_0) = (1, 1)$.
12. Same question for $f(x, y) = xy$, $(x_0, y_0) = (1, 1)$.
13. Suppose that $f$ has a directional derivative $f_\alpha$ in every direction $\alpha$, at a point $(x_0, y_0)$.
Is it possible that $f_\alpha(x_0, y_0) > 0$ for every $\alpha$? Why or why not? (Try to answer this
one merely on the basis of the definition of $f_\alpha$, without appealing to Theorem 2.)
14. Show that if $f_\alpha(x_0, y_0) = 0$ for every $\alpha$, then $f_x(x_0, y_0) = f_y(x_0, y_0) = 0$.
15. Give an example to show that the following "Theorem" is false:
Theorem (?) "Given $f : D \to R$. If $f_x(x, y) = f_y(x, y) = 0$, for every $(x, y)$ in $D$, then
$f$ is a constant."
16. Show that the following theorem is true:
Theorem A. Given
$$z = f(x, y) \qquad (a < x < b,\ c < y < d).$$
If $f_x(x, y) = f_y(x, y) = 0$, for every $(x, y)$ in the given domain, then $f$ is a constant.
Here we are requiring that the domain be a rectangular region with sides parallel to
the x- and y-axes.
[Figure: the rectangular region $a < x < b$, $c < y < d$.]
14.5 The Chain Rule for Paths 641
17. Theorem A, stated in Problem 16, is artificially special; it does not apply, as it stands,
to domains like the following:
[Figure: three sample domains.]
Find a way of describing the property of D that is really needed in the proof of Theorem
A, and prove a theorem which uses your more general hypothesis.
In the preceding section, we defined the directional derivative $f_\alpha$ as the derivative of $f$
along a linear path
$$x = g(t) = x_0 + t\cos\alpha, \qquad y = h(t) = y_0 + t\sin\alpha,$$
and we found that if $f$ is differentiable, then
$$f_\alpha(x_0, y_0) = f_x(x_0, y_0)\cos\alpha + f_y(x_0, y_0)\sin\alpha.$$
This result can be generalized, so as to apply to derivatives along paths which are not
necessarily linear. Suppose that a path $P$ is defined by a pair of coordinate functions.
Strictly speaking, we should write
But it is easier to keep track if we use the letters x and y as the names of the coordinate
functions. Thus we write
Let $F : D \to R$ be a differentiable function, and suppose that the locus of the
path $P$ lies in $D$. We can then form the composite function
$$\phi(t) = F(x(t), y(t)),$$
and we have the following theorem:
y(t) = sin t.
Here the locus of the path $P$ is a circle. And
$$\phi'(t) = (2\cos t\sin t + \sin^2 t)(-\sin t) + (\cos^2 t + 2\sin t\cos t)\cos t,$$
which is the right answer.
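A numerical comparison makes "the right answer" checkable. The sketch below assumes $F(x, y) = x^2y + xy^2$ with $x(t) = \cos t$, $y(t) = \sin t$; this choice is inferred from the factors $2\cos t\sin t + \sin^2 t$ and $\cos^2 t + 2\sin t\cos t$ in the displayed formula, since the statement of the example is not shown here. Function names are illustrative.

```python
import math

def F(x, y):
    """Assumed example function, with F_x = 2xy + y^2 and F_y = x^2 + 2xy."""
    return x * x * y + x * y * y

def phi(t):
    """Composite function F(x(t), y(t)) along the circular path x = cos t, y = sin t."""
    return F(math.cos(t), math.sin(t))

def phi_prime_chain(t):
    """Derivative of phi computed by the chain rule for paths."""
    x, y = math.cos(t), math.sin(t)
    F_x = 2 * x * y + y * y
    F_y = x * x + 2 * x * y
    return F_x * (-math.sin(t)) + F_y * math.cos(t)

def phi_prime_numeric(t, h=1e-6):
    """Central-difference approximation to phi'(t), used only as a cross-check."""
    return (phi(t + h) - phi(t - h)) / (2 * h)
```

The two derivatives should agree to within the accuracy of the difference quotient at any $t$.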
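We proceed to the proof. Take a fixed $t_0$, and let
$$x_0 = x(t_0), \qquad y_0 = y(t_0),$$
$$\Delta x = x(t_0 + \Delta t) - x(t_0),$$
$$\Delta y = y(t_0 + \Delta t) - y(t_0).$$
In this notation,
$$\phi'(t_0) = \lim_{\Delta t \to 0} \frac{\Delta F}{\Delta t}.$$
(Re-examine the definition of $\phi$.) Now
Therefore
$$\phi'(t_0) = \lim_{\Delta t \to 0} \frac{\Delta F}{\Delta t}$$
$$= F_x(x_0, y_0)x'(t_0) + F_y(x_0, y_0)y'(t_0) + 0\cdot x'(t_0) + 0\cdot y'(t_0)$$
$$= F_x(x_0, y_0)x'(t_0) + F_y(x_0, y_0)y'(t_0).$$
$$\phi'(t) = F_x(x(t), y(t))\,x'(t) + F_y(x(t), y(t))\,y'(t),$$
which was to be proved.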
Briefly,
1. Given $F(x, y) = \cos xy$, $x(t) = t^2 + 1$, $y(t) = t^3$, and $\phi(t) = F(x(t), y(t))$, find $\phi'$.
Did you need to use Theorem 3?
2. Same question, for $F(x, y) = \sin xy$, $x(t) = t^2 + 1$, $y(t) = t^2 - 1$, $\phi(t) = F(x(t), y(t))$.
3. Same question, for $F(x, y) = 2xy$, $x(t) = \cos t$, $y(t) = \sin t$, $\phi(t) = F(x(t), y(t))$.
4. Same question, for $F(x, y) = x^2 + y^2$, $x(t) = \cos t$, $y(t) = \sin t$, $\phi(t) = F(x(t), y(t))$.
7. Given $F(x, y) = xy$, $x = x(t)$, $y = y(t)$, and $\phi(t) = F(x(t), y(t))$, find $\phi'$, using
Theorem 3. (This will give you a circuitous derivation of the formula for the derivative
of a product.)
8. Same question, for $F(x, y) = x/y$. (This will give you an equally circuitous derivation
of the formula for the derivative of a quotient.)
9. Same question, for $F(x, y) = x^y$, $x > 0$. This will give you the formula
This formula is easy to remember: first we differentiate as though the exponent were
a constant, then we differentiate as though the base were a constant, and then we add
the results.
10. Now derive the same differentiation formula, without using the theory developed in
this chapter, appealing only to the basic definition
(a > 0).
All the ideas which we developed in the last section, for functions of two variables,
can be generalized immediately for functions of any number of variables. Limits
and continuity have already been defined in the general case, in Section 14.3.
Following the pattern of Section 14.4, we say that a function of n variables is differentiable
at a point if the difference function is well approximated by a linear function,
in small neighborhoods of the point. The definition is as follows:
let
and let
$$\Delta f = f(P) - f(P_0).$$
and a set of $n$ functions $E_1, E_2, \ldots, E_n$, defined in a neighborhood of 0, such that
$$\Delta f = L(\Delta x_1, \Delta x_2, \ldots, \Delta x_n) + E_1(\Delta P)\,\Delta x_1 + E_2(\Delta P)\,\Delta x_2 + \cdots + E_n(\Delta P)\,\Delta x_n, \qquad (1)$$
and
$$\lim_{\Delta P \to 0} E_i(\Delta P) = 0 \qquad (2)$$
and so on. Sometimes it is convenient to write $f_1(P_0)$ for $f_{x_1}(P_0)$, and in general
$f_i(P_0)$ for $f_{x_i}(P_0)$; that is, $f_i$ is the derivative of $f$ with respect to the $i$th coordinate
in $R^n$. Thus, for $R^n = R^2$, $P = (x, y)$, we may write $f_1$ for $f_x$ and $f_2$ for $f_y$.
Just as in the preceding section, if a function is differentiable, then it has all its
first partial derivatives, and these are the coefficients in the linear approximation
$L(\Delta P)$:
Theorem 1. If $f$ is differentiable at $P_0$, with $\Delta f \approx L(\Delta P)$, then $f_i(P_0)$ is defined for
each $i$, and
The proof is just the same as for two variables: we take a fixed integer $k$, and set
$\Delta x_i = 0$ for $i \ne k$, so that $\Delta P = \Delta x_k$. Then
and
By a slight extension of the device that we used in the proof of the same theorem for
two variables, we write
In the first bracket, we regard x and y as constants, and apply the mean-value theorem
to the function
$$\phi(w) = f(w, x, y).$$
Then
$$\phi(w) - \phi(w_0) = \phi'(\bar{w})\,\Delta w,$$
where $\bar{w}$ is between $w_0$ and $w$. Since $\phi'(\bar{w}) = f_w(\bar{w}, x, y)$, we have
$$f(w, x, y) - f(w_0, x, y) = f_w(\bar{w}, x, y)\,\Delta w.$$
By two more such applications of the mean-value theorem, we get
$$\Delta f = f_w(\bar{w}, x, y)\,\Delta w + f_x(w_0, \bar{x}, y)\,\Delta x + f_y(w_0, x_0, \bar{y})\,\Delta y. \qquad (1)$$
Let
$$E_1(\Delta P) = f_w(\bar{w}, x, y) - f_w(w_0, x_0, y_0),$$
$$E_2(\Delta P) = f_x(w_0, \bar{x}, y) - f_x(w_0, x_0, y_0),$$
$$E_3(\Delta P) = f_y(w_0, x_0, \bar{y}) - f_y(w_0, x_0, y_0).$$
Then
$$\Delta f = f_w(P_0)\,\Delta w + f_x(P_0)\,\Delta x + f_y(P_0)\,\Delta y + E_1(\Delta P)\,\Delta w + E_2(\Delta P)\,\Delta x + E_3(\Delta P)\,\Delta y,$$
as in the definition of differentiability.
The chain rule for paths takes the same form as for two variables, and has the
same proof.
Theorem 3 (The chain rule for paths). Let $P$ be a path, with coordinate functions
$w(t)$, $x(t)$, $y(t)$, and with locus lying in the domain $D$ in $R^3$. Let $f$ be a function
$D \to R$, and for each $t$, let
$$\phi(t) = f(w(t), x(t), y(t)).$$
If $f$ and the three coordinate functions are differentiable, then $\phi$ is differentiable, and
$$\phi'(t) = f_w(w, x, y)w' + f_x(w, x, y)x' + f_y(w, x, y)y'.$$
That is,
$$\phi'(t) = f_w(w(t), x(t), y(t))\,w'(t) + f_x(w(t), x(t), y(t))\,x'(t) + f_y(w(t), x(t), y(t))\,y'(t)$$
for every $t$.
This automatically gives us a chain rule for composite functions in which w, x,
and y are functions of several variables. As in Theorem 3, let $D$ be a domain in $R^3$,
and let $f$ be a function $D \to R$. But now let $w$, $x$, and $y$ be functions of three variables,
defined in a domain $D'$. Suppose that for each point $(t, u, v)$ of $D'$, the point
$(w(t, u, v), x(t, u, v), y(t, u, v))$ lies in $D$. We then have a composite function
$$\phi : D' \to R,$$
defined by the formula
$$\phi(t, u, v) = f(w(t, u, v), x(t, u, v), y(t, u, v)),$$
14.6 Differentiable Functions of Many Variables 647
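and we want to find the partial derivatives $\phi_t$, $\phi_u$, and $\phi_v$. But this is not a new
problem, really: in calculating $\phi_t$, we regard $u$ and $v$ as constants, and this means that
we can calculate $\phi_t$ by means of the chain rule for paths. The only difference is that the
derivatives $w'(t)$, $x'(t)$, $y'(t)$ in Theorem 3 are now the partial derivatives $w_t(t, u, v)$,
$x_t(t, u, v)$, $y_t(t, u, v)$, and the final answer $\phi'(t)$ in Theorem 3 now becomes $\phi_t(t, u, v)$.
This gives the formula
$$\phi_t(t, u, v) = f_w(w, x, y)\,w_t(t, u, v) + f_x(w, x, y)\,x_t(t, u, v) + f_y(w, x, y)\,y_t(t, u, v).$$
This may be easier to remember in the $\partial$-notation. In this notation,
$$\frac{\partial\phi}{\partial t} = \frac{\partial f}{\partial w}\frac{\partial w}{\partial t} + \frac{\partial f}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial t}.$$
Theorem 4 (The chain rule). Let $f$ be a differentiable function of w, x, and y; and let
w, x, and y be differentiable functions of t, u, and v. Let
$$\frac{\partial\phi}{\partial t} = \frac{\partial f}{\partial w}\frac{\partial w}{\partial t} + \frac{\partial f}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial t},$$
$$\frac{\partial\phi}{\partial u} = \frac{\partial f}{\partial w}\frac{\partial w}{\partial u} + \frac{\partial f}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial u},$$
$$\frac{\partial\phi}{\partial v} = \frac{\partial f}{\partial w}\frac{\partial w}{\partial v} + \frac{\partial f}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial v}.$$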
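Here
$$\frac{\partial f}{\partial w} = 2w = 2(t + 2u + 3v),$$
$$\frac{\partial f}{\partial x} = 2x = 2(2t + 3u + 4v),$$
$$\frac{\partial f}{\partial y} = 2y = 2(3t + 4u + 5v),$$
$$\frac{\partial w}{\partial u} = 2, \qquad \frac{\partial x}{\partial u} = 3, \qquad \frac{\partial y}{\partial u} = 4.$$
$$\frac{\partial\phi}{\partial u} = \phi_u(t, u, v) = 58u + 40t + 76v,$$
as before.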
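The arithmetic of this example is easy to confirm. The sketch below assumes $f(w, x, y) = w^2 + x^2 + y^2$ with $w = t + 2u + 3v$, $x = 2t + 3u + 4v$, $y = 3t + 4u + 5v$; these are inferred from the displayed partial derivatives, since the statement of the example is not shown here. It evaluates $\partial\phi/\partial u$ by the chain rule and compares it with the closed form $58u + 40t + 76v$.

```python
def w(t, u, v):
    return t + 2 * u + 3 * v

def x(t, u, v):
    return 2 * t + 3 * u + 4 * v

def y(t, u, v):
    return 3 * t + 4 * u + 5 * v

def phi_u_chain(t, u, v):
    """Chain rule: f_w * w_u + f_x * x_u + f_y * y_u, with f = w^2 + x^2 + y^2 (assumed)."""
    return 2 * w(t, u, v) * 2 + 2 * x(t, u, v) * 3 + 2 * y(t, u, v) * 4

def phi_u_closed(t, u, v):
    """Closed form quoted in the text."""
    return 58 * u + 40 * t + 76 * v
```

With integer inputs the two expressions agree exactly, because both are the same linear function of $t$, $u$, and $v$.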
We shall now generalize the idea of the directional derivative $f_\alpha$ in such a way that it
applies to functions of any number of variables. Given a region $D$ in $R^n$, and a
differentiable function $f : D \to R$, let $P_0$ be any point of $D$, and let $V$ be any vector in
$R^n$, with $\|V\| = 1$. (A vector of norm 1 will be called a direction in $R^n$.) The derivative
of $f$ in the direction $V$, at the point $P_0$, is defined to be
Thus we have
Theorem 1. If $f$ is differentiable in $D$, then $f$ has a directional derivative at every
point of $D$, in every direction. If $V$ is a unit vector $(c_1, c_2, \ldots, c_n)$, then
(You should check that for the case $n = 2$, our definition of the directional
derivative, and the formula given in Theorem 1, agree with the definition and formula
given in Section 14.4.)
The gradient of a differentiable function, at a point $P_0$, is the vector whose
components are the partial derivatives of $f$ at $P_0$. The gradient vector is denoted by
$\operatorname{grad} f$. Thus if $f$ is a differentiable function $D \to R$, with real numbers as its values,
where the $f_i$'s are the first partial derivatives of $f$. That is,
which is a vector in $R^2$, as it should be, for each point $P = (x, y)$.
The definition of the gradient may seem arbitrary, but it is not; the gradient has
a geometric meaning, now to be explained. First we observe that for each unit vector
$V$, with
$$V = (c_1, c_2, \ldots, c_n), \qquad \|V\|^2 = c_1^2 + c_2^2 + \cdots + c_n^2 = 1,$$
$$f_V = (\operatorname{grad} f)\cdot V.$$
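Theorem 2. If $f$ is differentiable at $P$, then (1) the direction of $\operatorname{grad} f(P)$ is the direction
which gives the maximum value of the directional derivative $f_V(P)$, and (2) the norm
of $\operatorname{grad} f(P)$ is the maximum value of $f_V(P)$.
Proof. Let $G = \operatorname{grad} f(P)$, and let
\[ V = \frac{1}{\|G\|}\,G, \]
so that $V = (c_1, c_2, \ldots, c_n)$,
where
\[ c_i = \frac{f_i(P)}{\|G\|}. \]
Then
\[ f_V(P) = (\operatorname{grad} f(P)) \cdot V = G \cdot V = G \cdot \left(\frac{G}{\|G\|}\right) = \frac{1}{\|G\|}(G \cdot G) = \|G\| = \|\operatorname{grad} f(P)\|. \]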
Thus the directional derivative, in the direction of the gradient, is the norm of the
gradient. And this is the direction which maximizes the directional derivative: if W
is any unit vector, then we know by the Schwarz inequality (Theorem 1 of Section 11.6)
that
\[ (G \cdot W)^2 \le \|G\|^2\,\|W\|^2 = \|G\|^2, \]
and so
\[ f_W(P) = G \cdot W \le \|G\| = f_V(P). \]
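Theorem 2 can be illustrated numerically. The sketch below uses an illustrative function of our own choosing, $f(x, y) = x^2 + 3y$, samples many unit vectors $W$, and compares the largest value of $G \cdot W$ with $\|G\|$.

```python
import math

def grad_f(x, y):
    # Gradient of the illustrative function f(x, y) = x**2 + 3*y (our choice).
    return (2.0*x, 3.0)

def directional_derivative(x, y, angle):
    # f_W(P) = grad f(P) . W for the unit vector W = (cos angle, sin angle).
    gx, gy = grad_f(x, y)
    return gx*math.cos(angle) + gy*math.sin(angle)

def max_over_directions(x, y, samples=3600):
    # Largest directional derivative over a fine sample of directions.
    return max(directional_derivative(x, y, 2*math.pi*k/samples)
               for k in range(samples))

gx, gy = grad_f(1.0, 2.0)
print(max_over_directions(1.0, 2.0), math.hypot(gx, gy))
```

The sampled maximum never exceeds the norm of the gradient, and it comes close to it because some sampled direction lies near $G/\|G\|$.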
A continuous function $D \to \mathcal{V}$, where $D$ is a region in a Cartesian space $R^n$ and
$\mathcal{V}$ is a vector space, is called a vector field. We ordinarily draw the graphs of
14.8 Interior Local Maxima and Minima 651
\[ f(x, y) = x^2 + y^2, \qquad \operatorname{grad} f(x, y) = (2x, 2y), \]
we can indicate the vector field $\operatorname{grad} f$ by drawing sample vectors in the xy-plane,
like this:
[Figure: sample vectors of $\operatorname{grad} f$ in the xy-plane.]
At each point $P = (x, y)$, the direction of $\operatorname{grad} f(P)$ is the direction of the ray from
the origin through $P$, and the length $\|\operatorname{grad} f(P)\|$ is twice the distance from the origin
to $P$. At the origin, the gradient vector vanishes. Such a point is called a singularity
of a vector field.
9. $f(x, y) = \sqrt{1 - \dfrac{x^2}{4} - \dfrac{y^2}{4}}$
10. $f(x, y) = x^2 + y$
11. $f(x, y) = (x^2 + y^2)^2$
12. $f(x, y) = \dfrac{1}{x^2 + y^2 + 1}$
13. $f(x, y) = 4y^2 - x^2$
For functions of one variable, defined on a closed interval, we had two kinds of
maxima. In the figure on the left below, the maximum occurs at the endpoint $b$; at
$x_1$ the function has a local maximum, but not a maximum, because $f(x_1) < f(b)$.
In the figure on the right, the function has a maximum at $x_1$; this is an interior
maximum, and so $f'(x_1)$ must be 0.
[Figure: two graphs; on the left the maximum occurs at the endpoint $b$, on the right at an interior point $x_1$.]
One of the simplest theorems for functions of one variable was the following:
The proof is simple. Since $f''(x_0) < 0$, and $f''$ is continuous, it follows that there
is a neighborhood $(x_0 - \delta, x_0 + \delta)$ of $x_0$ such that $f''(x) < 0$ on
$(x_0 - \delta, x_0 + \delta)$.
Therefore
\[ f'(x) > 0 \quad \text{for } x_0 - \delta < x < x_0 \]
and
\[ f'(x) < 0 \quad \text{for } x_0 < x < x_0 + \delta. \]
The same proof proves the following theorem, which is going to be more useful:
Let $D$ be a set of points in the xy-plane. If $P_0$ is a point of $D$, and $D$ contains a
neighborhood of $P_0$ (for some $\delta$), then $P_0$ is called an interior point of $D$.
Thus $P_0$ is an interior point of $D$ if $P_0$ lies in $D$, with at least a little room to spare.
Consider, for example,
\[ D = \{(x, y) \mid x^2 + y^2 \le 1\}. \]
Here $D$ consists of the unit circle, plus its interior. If $OP_0 < 1$, as in the figure, then
$P_0$ is an interior point; if we let
\[ \delta = 1 - OP_0, \]
then $N(P_0, \delta)$ lies in $D$.
This works, no matter how close $P_0$ may be to the circle, as long as $P_0$ isn't actually on
the circle; no matter how small the positive number $1 - OP_0$ may be, we can use it
as our positive $\delta$. On the other hand, if $OP_1 = 1$, so that $P_1$ is on the circle, then $P_1$
is not an interior point of $D$; no matter how small we take $\delta$, the neighborhood
$N(P_1, \delta)$ contains points outside of $D$.
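The choice $\delta = 1 - OP_0$ can be tried out on a computer. The following is only a sketch: it samples finitely many points of the neighborhood rather than proving anything, and the helper names are ours.

```python
import math

def in_unit_disk(p):
    # D = {(x, y) | x^2 + y^2 <= 1}
    x, y = p
    return x*x + y*y <= 1.0

def interior_radius(p):
    # delta = 1 - OP0; positive exactly when P0 lies strictly inside the circle.
    return 1.0 - math.hypot(p[0], p[1])

def neighborhood_in_disk(p, delta, samples=360):
    # Sample points at distance 0.999*delta from p and test membership in D.
    return all(
        in_unit_disk((p[0] + 0.999*delta*math.cos(2*math.pi*k/samples),
                      p[1] + 0.999*delta*math.sin(2*math.pi*k/samples)))
        for k in range(samples))

p0 = (0.6, 0.7)
delta = interior_radius(p0)
print(delta > 0, neighborhood_in_disk(p0, delta))
print(neighborhood_in_disk((1.0, 0.0), 0.01))  # a point on the circle
```

For the point on the circle, even a small $\delta$ produces sample points outside $D$, in agreement with the discussion above.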
$f: D \to R$,
(At this point you may want to review the definition of slice functions, at the
beginning of Section 14.3.) Obviously, however, the vanishing of the partial
derivatives $f_x$ and $f_y$ is not enough to guarantee an ILMax; we might have a minimum or
14.8 Interior Local Maxima and Minima, for Functions of Two Variables. Level Curves 655
\[ f(x, y) = x^2 - y^2, \]
we have
\[ f_x(0, 0) = f_y(0, 0) = 0, \]
\[ \phi_0(t) = f(t, 0) = t^2 \]
has a minimum at 0, and the slice function
the level curves are circles with center at the origin, as shown above. For each
$k > 0$, the level curve on which $f(x, y) = k$ is the circle with center at the origin and
radius $\sqrt{k}$. The origin is a singular point of this family of curves; and this is the point
at which the function takes on its obvious minimum value 0.
For the function
\[ f(x, y) = x^2 - y^2, \]
there is no maximum and no minimum. Since $f$ is defined for every $x$ and $y$, any Max
or Min would have to be an ILMax or ILMin; at any such point, both the partial
derivatives $f_x$ and $f_y$ would have to vanish; $f_x$ and $f_y$ vanish simultaneously only at
(0, 0), and at (0, 0) the function has a saddle point. The level curves for this function
look like this:
[Figure: level curves of $f(x, y) = x^2 - y^2$, with the lines $y = x$ and $y = -x$.]
The level curve on which $f(x, y) = 0$ is the union of two lines, which intersect each other at the
origin, where $f$ has a saddle point. These examples are typical of the way level curves
behave in simple cases.
Even if each of the slice functions in the x- and y- directions has an ILMax at a
point (x0, y0), we may still have a saddle point, on which a man in the saddle would
be facing in some third direction. Consider
\[ \phi_{\pi/2}(t) = -\tfrac{1}{2}t^2. \]
But for $\alpha = 3\pi/4$ we have
\[ \cos\alpha = -\frac{1}{\sqrt{2}}, \qquad \sin\alpha = \frac{1}{\sqrt{2}}, \]
1 t2 1t2
42 42
t2 --
t2 1 2
- =-t ,
2 4 4
which has a minimum at 0.
Thus, if we want to infer that $f$ has an ILMax at $(x_0, y_0)$, we need to consider
every direction $\alpha$, and examine all the slice functions
\[ \phi_\alpha'(t) = f_x(x_0 + t\cos\alpha,\ y_0 + t\sin\alpha)\cos\alpha + f_y(x_0 + t\cos\alpha,\ y_0 + t\sin\alpha)\sin\alpha; \]
here we are using the chain rule. Applying the chain rule again, to each term, we get
The reason is that for every $\alpha$, $\phi_\alpha(0)$ is the maximum value of $\phi_\alpha$ on the interval
$(-\delta, \delta)$. It follows that $f(x_0, y_0)$ is the maximum value of $f$ in the $\delta$-neighborhood
of $(x_0, y_0)$.
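Writing $c = \cos\alpha$ and $s = \sin\alpha$, and completing the square, we get
\[ \phi_\alpha''(t) = f_{xx}\left[c^2 + 2cs\,\frac{f_{xy}}{f_{xx}} + s^2\,\frac{f_{yy}}{f_{xx}}\right] \]
\[ = f_{xx}\left[\left(c + s\,\frac{f_{xy}}{f_{xx}}\right)^2 - s^2\,\frac{f_{xy}^2}{f_{xx}^2} + s^2\,\frac{f_{yy}}{f_{xx}}\right] \]
\[ = f_{xx}\left[\left(c + s\,\frac{f_{xy}}{f_{xx}}\right)^2 + s^2\,\frac{f_{xx}f_{yy} - f_{xy}^2}{f_{xx}^2}\right]. \]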
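The algebra of completing the square is easy to get wrong, so a numerical check may be reassuring. In the sketch below, $c = \cos\alpha$, $s = \sin\alpha$, and the three second partials are hypothetical sample values chosen so that $f_{xx} < 0$ and $f_{xx}f_{yy} - f_{xy}^2 > 0$.

```python
import math

def phi_second(fxx, fxy, fyy, alpha):
    # Second derivative of the slice function in the direction alpha.
    c, s = math.cos(alpha), math.sin(alpha)
    return fxx*c*c + 2*c*s*fxy + s*s*fyy

def phi_second_completed(fxx, fxy, fyy, alpha):
    # The same quantity after completing the square (requires fxx != 0).
    c, s = math.cos(alpha), math.sin(alpha)
    return fxx*((c + s*fxy/fxx)**2 + s*s*(fxx*fyy - fxy**2)/fxx**2)

# Hypothetical values with fxx < 0 and fxx*fyy - fxy**2 > 0.
fxx, fxy, fyy = -2.0, 0.5, -1.0
for k in range(8):
    alpha = k*math.pi/4
    print(round(phi_second(fxx, fxy, fyy, alpha), 12),
          round(phi_second_completed(fxx, fxy, fyy, alpha), 12))
```

Both columns agree, and every value is negative: the bracket is a sum of two nonnegative terms that cannot vanish together, multiplied by the negative number $f_{xx}$.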
Suppose now that at the point $(x_0, y_0)$ we have
\[ f_{xx} < 0, \qquad f_{xx}f_{yy} - f_{xy}^2 > 0. \]
We are assuming that all the partial derivatives that we are dealing with are
continuous. It follows that the same inequalities hold in the $\delta$-neighborhood of $(x_0, y_0)$,
for some $\delta > 0$. Thus for $|t| < \delta$ we have
then we have
\[ \phi_\alpha'(0) = 0 \quad \text{for every } \alpha. \]
Therefore, by Theorem A, f has an ILMax at (x0, y0). We sum all this up in the
following theorem:
Not only the proof of this theorem, but also the theorem itself, is hard to read
and hard to remember. This is typical of what you can expect from now on: when we
pass from one variable to two or more, the calculus takes on a higher order of
difficulty.
The following is a corollary of Theorem 3:
Theorem 4. Suppose that $f$ has continuous second partial derivatives in a
neighborhood of $(x_0, y_0)$. If
\[ f_x(x_0, y_0) = f_y(x_0, y_0) = 0, \tag{1} \]
\[ f_{xx}(x_0, y_0) > 0, \tag{2} \]
and
Then
Theorem 5. If $f_{xy}^2 - f_{xx}f_{yy} > 0$ at $P_0$, then $f$ has neither an ILMax nor an ILMin
at $P_0$.
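The tests of this section can be collected into a small routine. This is a sketch under stated assumptions: the inputs are the second partials at a point where $f_x = f_y = 0$; the ILMax case uses the conditions $f_{xx} < 0$, $f_{xx}f_{yy} - f_{xy}^2 > 0$ from the derivation above; and for the ILMin case we assume that the remaining hypothesis of Theorem 4 is again $f_{xx}f_{yy} - f_{xy}^2 > 0$. The function name is ours.

```python
def classify_critical_point(fxx, fyy, fxy):
    """Classify a point where f_x = f_y = 0, given the second partials there."""
    disc = fxx*fyy - fxy**2
    if disc > 0 and fxx < 0:
        return "ILMax"
    if disc > 0 and fxx > 0:
        return "ILMin"          # assumed form of Theorem 4
    if disc < 0:
        return "neither"        # Theorem 5: f_xy^2 - f_xx f_yy > 0
    return "no conclusion"      # disc == 0: these tests do not decide

print(classify_critical_point(2, 2, 0))    # f = x^2 + y^2 at the origin
print(classify_critical_point(2, -2, 0))   # f = x^2 - y^2 at the origin
print(classify_critical_point(-2, -2, 0))  # f = -x^2 - y^2 at the origin
```

When the discriminant vanishes, the routine reports no conclusion; in such cases one has to examine slice functions directly.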
Investigate the following functions for interior local maxima and minima. Not all of
these problems can be worked by straightforward applications of the theorems in Section
14.8; you may need to examine slice functions, or use other elementary methods.
take on its minimum value? What is the minimum value of the function?
13. Consider the ellipsoid
x2 + y2/4 + z2/9 = 1 .
y2 z2
x2 + + -;;=l.
4
\[ x^2 + \frac{y^2}{4} + \frac{z^2}{4} = 1. \]
16. Let $A_1 = (0, 0)$, $A_2 = (1, 2)$, and $A_3 = (2, 1)$. For each $P = (x, y)$, let
You recall that in Section 3.7 we gave a preliminary intuitive definition of the definite
integral of a continuous function over a closed interval. Here the $A_i$'s are areas, in
the elementary geometric sense, so that $A_i \ge 0$ for every $i$. To get the integral, we
count areas above the x-axis positively, and areas below the x-axis negatively.
14.9 Double Integrals, Intuitively Considered 661
Later, in Section 7.2, we gave a new definition of the integral, as the limit of the
sample sums of the function as the mesh of the net approaches 0:
provided, of course, that such a limit exists. The new definition was necessary for
two reasons. First, we needed it to clarify the underlying theory. Second, we wanted
to use the definite integral to solve problems which did not, at the outset, look like
area problems at all. For example, to calculate arc lengths, surface areas, volumes,
and moments, we regarded them as limits of sample sums, as the mesh approaches
zero. Thus our second definition of the definite integral was not only more exact but
also more widely applicable.
We shall follow the same scheme with multiple integrals, first giving an intuitive
definition, and then reformulating it when the need arises (which will be soon).
Suppose that we have given a nonnegative continuous function
\[ f: D \to R, \]
defined in a domain $D$ in the xy-plane. (See figure on the left below.)
[Figure: the domain $D$ in the xy-plane, and the solid lying over it.]
The expression
\[ \iint_D f(P)\,dA \]
denotes the volume of the region lying above the xy-plane and below the graph of $f$.
This is called the integral of $f$ over $D$. Thus the integral is the volume of the solid
for each $x_0$ from $a$ to $b$ we can compute, somehow, the area of the cross section in
the plane $x = x_0$. If for each such $x_0$ we let $A(x_0)$ be the area of this cross section,
then the volume of our solid $S$ is
\[ vS = \int_a^b A(x)\,dx. \]
This method works for many solids whose volumes are not given by standard
formulas. Consider the following.
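[Figure: the region $R$ under the curve $y = \sqrt{x}$ (left) and the solid $S$ (right).]
\[ R = \{(x, y) \mid 0 \le x \le 1,\ 0 \le y \le \sqrt{x}\}, \]
in the xy-plane. For each $x$, we join the point $(x, 0)$ to the point $(x, \sqrt{x})$, by a segment.
On each such segment we set up an isosceles right triangle, as shown above on the
right. Let $S$ be the union of all these triangles (including, of course, their interiors).
For each $x$, the area of the triangle at $x$ is
\[ A(x) = \tfrac{1}{2}\sqrt{x}\,\sqrt{x} = \tfrac{1}{2}x. \]
Therefore the volume is
\[ \int_0^1 A(x)\,dx = \tfrac{1}{2}\int_0^1 x\,dx = \tfrac{1}{2}\left[\tfrac{1}{2}x^2\right]_0^1 = \tfrac{1}{4}. \]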
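The method of cross sections translates directly into a computation. The sketch below approximates the integral of $A(x)$ by the midpoint rule (our choice of numerical method) for the triangular cross sections with legs of length $\sqrt{x}$.

```python
import math

def cross_section_area(x):
    # Isosceles right triangle with legs of length sqrt(x): area = x/2.
    leg = math.sqrt(x)
    return 0.5*leg*leg

def volume_by_slices(area, a, b, n=1000):
    # Midpoint-rule approximation of the integral of area(x) from a to b.
    h = (b - a)/n
    return sum(area(a + (i + 0.5)*h) for i in range(n))*h

print(volume_by_slices(cross_section_area, 0.0, 1.0))
```

Because $A(x)$ is linear here, the midpoint rule reproduces the value $\tfrac{1}{4}$ up to rounding.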
Here A(x) was computable by an elementary formula, because the cross sections
for constant x were triangular. But no matter what method you use to compute
A(x), you can still find the volume by integrating A(x) between the appropriate
limits. In particular, you can use the method when A(x) is itself computed as a definite
integral.
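\[ D = \{(x, y) \mid 0 \le x \le 1,\ 0 \le y \le 1 - x\}. \]
For each $(x, y)$ in $D$, let
\[ f(x, y) = x^2 + y^3. \]
We want to find the volume of the solid lying above $D$ and below the graph of $f$.
Now for each $x_0$, the cross section in the plane $x = x_0$ looks like the drawing on the
left below. Therefore the area of the cross section at $x_0$ is
\[ A(x_0) = \int_0^{1 - x_0} (x_0^2 + y^3)\,dy = \left[x_0^2 y + \tfrac{1}{4}y^4\right]_0^{1 - x_0} = x_0^2(1 - x_0) + \tfrac{1}{4}(1 - x_0)^4 = x_0^2 - x_0^3 + \tfrac{1}{4}(1 - x_0)^4. \]
Dropping the subscript we get
\[ A(x) = x^2 - x^3 + \tfrac{1}{4}(1 - x)^4. \]
Therefore the volume is
\[ \iint_D f(P)\,dA = \int_0^1 A(x)\,dx = \left[\tfrac{1}{3}x^3 - \tfrac{1}{4}x^4 - \tfrac{1}{4}\cdot\tfrac{1}{5}(1 - x)^5\right]_0^1 = \left[\tfrac{1}{3} - \tfrac{1}{4}\right] - \left[-\tfrac{1}{20}\right] = \tfrac{2}{15}. \]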
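The same volume can be approximated by carrying out both integrations numerically. This is a rough sketch using the midpoint rule twice; the helper names are ours.

```python
def midpoint_integral(g, a, b, n=200):
    # Midpoint-rule approximation of the integral of g over [a, b].
    h = (b - a)/n
    return sum(g(a + (i + 0.5)*h) for i in range(n))*h

def cross_section_area(x):
    # A(x) = integral of x^2 + y^3 for y from 0 to 1 - x.
    return midpoint_integral(lambda y: x**2 + y**3, 0.0, 1.0 - x)

volume = midpoint_integral(cross_section_area, 0.0, 1.0)
print(volume, 2/15)
```

The two printed values agree to several decimal places.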
The method works more generally. Suppose that we have a region D in the
xy-plane, lying between the graphs of two functions, as on the right below.
[Figure: a region $D$ lying between the graphs of two functions.]
\[ \iint_D F(x, y)\,dA. \]
This is the volume of the solid lying above D and below the graph of F. For each x,
the cross-sectional area is
\[ A(x) = \int_{f(x)}^{g(x)} F(x, y)\,dy. \]
Here x is being held constant, and we are integrating from f (x) to g(x). But A(x),
once you get it, is a function. Therefore the total volume is
This takes a very simple form when D is a rectangular region defined by inequalities of
the form
\[ a \le x \le b, \qquad c \le y \le d. \]
Here
Of course, we could equally well have used cross sections for constant y. This
would give a different cross-sectional area function
\[ B(y) = \int_a^b F(x, y)\,dx; \]
and we would have
The expressions
The same phenomenon occurs, in a less simple form, if the domain is not
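rectangular. In the figure above, the domain $D$ can equally well be described by the
inequalities
\[ 0 \le x \le 1, \qquad 0 \le y \le 1 - x^2, \tag{1} \]
or
\[ 0 \le y \le 1, \qquad 0 \le x \le \sqrt{1 - y}, \tag{2} \]
\[ \iint_D F(x, y)\,dA = \int_0^1 \int_0^{\sqrt{1 - y}} F(x, y)\,dx\,dy. \]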
Therefore the two iterated integrals must have the same value.
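The agreement of the two iterated integrals can be observed numerically. In the sketch below the integrand $F(x, y) = x + y$ is our own illustrative choice, and $D$ is the region $0 \le x \le 1$, $0 \le y \le 1 - x^2$ described in two ways.

```python
import math

def midpoint_integral(g, a, b, n=400):
    # Midpoint-rule approximation of the integral of g over [a, b].
    h = (b - a)/n
    return sum(g(a + (i + 0.5)*h) for i in range(n))*h

def F(x, y):
    # Illustrative integrand (our choice, not from the text).
    return x + y

# Description (1): 0 <= x <= 1, 0 <= y <= 1 - x^2; integrate in y first.
dy_dx = midpoint_integral(
    lambda x: midpoint_integral(lambda y: F(x, y), 0.0, 1.0 - x*x), 0.0, 1.0)

# Description (2): 0 <= y <= 1, 0 <= x <= sqrt(1 - y); integrate in x first.
dx_dy = midpoint_integral(
    lambda y: midpoint_integral(lambda x: F(x, y), 0.0, math.sqrt(1.0 - y)), 0.0, 1.0)

print(dy_dx, dx_dy)
```

For this integrand a hand calculation gives $31/60$ for either order.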
\[ \iint_D F(x, y)\,dA \]
as an iterated integral in two different ways, evaluate both of your iterated integrals, and
check by observing that they ought to have the same value.
3. $D$: $0 \le x \le 2$, $x^3 \le y \le 8$; $F(x, y) = x^2 + y$
4. $D$: $0 \le x \le 1$, $x^2 \le y \le x$; $F(x, y) = x + y$
5. $D$: $0 \le x \le 1$, $x \le y \le 1$; $F(x, y) = x^3y^3$
6. $D$: $0 \le y \le 1$, $y \le x \le 1$; $F(x, y) = x^2 + y^2$
7. $D$: $-1 \le x \le 1$, $0 \le y \le 1 - x^2$; $F(x, y) = xy$
10. Let
Find / ( 0<:) .
'
/(0<:) =
ff </>(x,y)dxdy.
/(<X) =
rld </>(x, y) dxdy.
13. Let $\phi$ be a positive function, with continuous first and second partial derivatives. Get
the simplest formula that you can for
fld z
tl
</>xy(X,y) dxdy.
The unit sphere with center at the origin is the graph of the equation r2 + z2 = 1.
(See the figure on the left below.)
Recalling the familiar formulas giving $x$ and $y$ in terms of $r$ and $\theta$, we see that
rectangular and cylindrical coordinates are related by the formulas
\[ x = r\cos\theta, \qquad y = r\sin\theta, \qquad z = z, \qquad x^2 + y^2 = r^2. \]
[Figure: the unit sphere $r^2 + z^2 = 1$ (left) and a point with cylindrical coordinates $(r, \theta, z)$ (right).]
As for polar coordinates in the plane, these formulas work in only one direction:
when $r$ and $\theta$ are named, $x$ and $y$ are determined, but when $x$ and $y$ are named, there
are two possibilities for $r$ and infinitely many possibilities for $\theta$. (See figure on the
right above.)
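The one-directional nature of these formulas is easy to see in a short computation. The sketch below converts three different triples $(r, \theta, z)$ and finds that they name the same point; the function names are ours.

```python
import math

def cylindrical_to_rectangular(r, theta, z):
    # x = r cos(theta), y = r sin(theta), z = z
    return (r*math.cos(theta), r*math.sin(theta), z)

def close(p, q, tol=1e-12):
    # Compare two points coordinate by coordinate.
    return all(abs(a - b) < tol for a, b in zip(p, q))

p = cylindrical_to_rectangular(2.0, math.pi/3, 5.0)
# The same point from a negative r, and from theta shifted by a full turn.
q = cylindrical_to_rectangular(-2.0, math.pi/3 + math.pi, 5.0)
s = cylindrical_to_rectangular(2.0, math.pi/3 + 2*math.pi, 5.0)
print(p, close(p, q), close(p, s))
```

Going the other way requires a choice: $r = \pm\sqrt{x^2 + y^2}$, and $\theta$ is determined only up to multiples of $2\pi$.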
Suppose now that we have given a domain D in the plane z = 0. The plane
$z = 0$ may be regarded as the xy-plane or the $r\theta$-plane; sometimes we shall refer to
it simply as the base plane. Suppose that we have given a continuous function
$f: D \to R$. If we describe a point $P$ of $D$ by its polar coordinates $(r, \theta)$, then we have
\[ z = f(P) = f(r, \theta). \]
\[ \iint_D f(P)\,dA, \]
Given a domain $D$ in the base plane. By a net over $D$ we mean a finite collection
\[ N: D_1, D_2, \ldots, D_n \]
of regions such that (1) $D$ is the union of the $D_i$'s, (2) each $D_i$ has an area (i.e., is
measurable, in the sense defined in Appendix G), and (3) if $D_i$ intersects $D_j$, then the
area of the intersection is 0. The sets $D_i$ are called the cells of the net. The figure
indicates, at long last, why we use the word net in integration theory.
By the diameter of a set $D_i$ we mean the supremum of the distances between its
points. The diameter is denoted by $\delta D_i$. Thus
\[ \delta D_i = \sup\{PQ \mid P, Q \text{ in } D_i\}. \]
Note that if $D_i$ is a circular region, then $\delta D_i$ is the diameter of $D_i$ in the elementary
sense. The mesh of the net $N$ is the greatest of the diameters of the cells of the net.
The mesh is denoted by $|N|$. Thus $|N| = \operatorname{Max}\{\delta D_i\}$.
14.10 Cylindrical Coordinates in Space. The Definition of the Integral 669
is a sequence
of points, where $P_i$ belongs to $D_i$ for each $i$ (see figure at the right above).
For each $i$, let $\Delta A_i$ be the area of $D_i$. A sample sum of $f$ over the net $N$ is a sum
of the form
\[ \sum_{i=1}^{n} f(P_i)\,\Delta A_i. \]
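To make the idea of a sample sum concrete, here is a small sketch for the simplest kind of net: the unit square cut into $n^2$ congruent square cells, with the center of each cell as sample point. The domain, the net, and the function $f(x, y) = x^2 + y$ are all our own illustrative choices.

```python
import math

def sample_sum_on_square(f, n):
    """Sample sum of f over the unit square, using an n-by-n net of square cells.

    Each cell has area 1/n^2; the sample point is the center of the cell.
    Returns the sum together with the mesh (the diagonal of a cell).
    """
    side = 1.0/n
    area = side*side
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += f((i + 0.5)*side, (j + 0.5)*side)*area
    mesh = side*math.sqrt(2.0)
    return total, mesh

for n in (1, 10, 100):
    print(n, sample_sum_on_square(lambda x, y: x*x + y, n))
```

As the mesh shrinks, the sample sums settle down near $5/6$, which is the value of the integral of $x^2 + y$ over the unit square.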
We are now finally ready to give our definition of the double integral. By definition,
We recall that if $f$ is continuous on the closed interval $[a, b]$, then $f$ is integrable
on $[a, b]$. We want to state an analogous theorem for functions of two variables.
It would hardly do to restrict ourselves to "two-dimensional closed intervals"
$a \le x \le b$, $c \le y \le d$. On the other hand, we cannot allow all sets $D$ in the
xy-plane as domains, because continuous functions on some domains may not even be
bounded. (Examples?) What is needed here is the following:
Definition. A point $P$ is a limit point of a set $D$ if every neighborhood $U(P, \delta)$ of $P$
contains a point of $D$ other than $P$.
Definition. A set D is closed if it contains all its limit points.
Thus a closed interval is closed, but an open interval is not; the region
\[ D = \{(x, y) \mid x^2 + y^2 \le 1\} \]
is closed, but the region
\[ D' = \{(x, y) \mid x^2 + y^2 < 1\} \]
is not.
We recall that a set D in a plane is bounded if it lies in the interior of some circle
(or, equivalently, if it lies in the interior of some rectangle). We can now finally
state our theorem:
Theorem 1. Let D be a closed, bounded, measurable set in the xy-plane, and let f be
a function which is continuous on $D$. Then $f$ is integrable on $D$.
You may be able to convince yourself of this, for positive functions, by thinking of
the integral as a volume, and thinking of the sample sums as approximations of the
volume; the idea is that we can approximate the volume as closely as we please,
by cutting up the base domain into sufficiently small pieces. If the function is negative
somewhere, then we need to use volumes with signs attached, but the idea is much
the same. But a mathematical proof that all this works is far beyond the scope of
this book, and we make no attempt to present one.
Meanwhile we assume that the theorem is true, and return to the problem of
integration in cylindrical coordinates. For the sake of simplicity, we consider first a
domain of the type
\[ D = \{(r, \theta) \mid a \le r \le b,\ \alpha \le \theta \le \beta\}. \]
This is the polar equivalent of a rectangular region.
The first step in the calculation of the integral is to set up a net $N_r$ on the interval
$[a, b]$ and a net $N_\theta$ on the interval $[\alpha, \beta]$. Thus we have
Then
and
\[ r_i^2 - r_{i-1}^2 = r_i^2 - (r_i^2 - 2r_i\,\Delta r_i + \Delta r_i^2) = 2r_i\,\Delta r_i - \Delta r_i^2. \]
Therefore
\[ \Delta A_{ij} = \tfrac{1}{2}\Delta\theta_j\,[r_i^2 - r_{i-1}^2] = \tfrac{1}{2}\Delta\theta_j\,[2r_i\,\Delta r_i - \Delta r_i^2] = r_i\,\Delta r_i\,\Delta\theta_j - \tfrac{1}{2}\Delta r_i^2\,\Delta\theta_j. \]
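The two expressions for the area of a polar cell can be compared numerically. In the sketch below the cell lies between the radii $r - \Delta r$ and $r$ and subtends the angle $\Delta\theta$; the sample values are arbitrary.

```python
import math

def polar_cell_area_exact(r_prev, r, d_theta):
    # Area of the cell between radii r_prev and r, subtending the angle d_theta.
    return 0.5*d_theta*(r**2 - r_prev**2)

def polar_cell_area_expanded(r, d_r, d_theta):
    # The expanded form: r*dr*dtheta - (1/2)*dr^2*dtheta.
    return r*d_r*d_theta - 0.5*d_r**2*d_theta

r, d_r, d_theta = 2.0, 0.1, math.pi/12
print(polar_cell_area_exact(r - d_r, r, d_theta),
      polar_cell_area_expanded(r, d_r, d_theta))
```

For small $\Delta r$ the second term is small compared with the first, so that $\Delta A_{ij}$ is approximately $r_i\,\Delta r_i\,\Delta\theta_j$.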
In each cell $D_{ij}$ of the net we pick the sample point $P_{ij} = (r_i, \theta_j)$. We now form the
sample sum
\[ \lim_{|N| \to 0} \Sigma = \iint_D f(P)\,dA. \]
Let D' be the rectangular region shown in the figure. That is,
Theorem 2. Let
Let us try this out in a simple case in which we know the answer. Consider the
hemisphere under the graph of
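\[ z = f(x, y) = \sqrt{1 - x^2 - y^2}. \]
In cylindrical coordinates,
\[ z = f(r, \theta) = \sqrt{1 - r^2}. \]
Let
\[ D = \{(x, y) \mid x^2 + y^2 \le 1\} = \{(r, \theta) \mid r^2 \le 1\}. \]
\[ \int \sqrt{1 - r^2}\;r\,dr = \left\{-\tfrac{1}{2}\cdot\tfrac{2}{3}(1 - r^2)^{3/2} + C\right\}. \]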
Therefore
Therefore
This is right, because the volume of the whole sphere is $\frac{4\pi}{3}\cdot 1^3 = \frac{4\pi}{3}$.
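The hemisphere can also be checked by machine. The sketch below assumes the polar form of the integral, with integrand $\sqrt{1 - r^2}\,r$ as in the antiderivative computed above; since this integrand does not depend on $\theta$, the $\theta$-integration contributes only a factor of $2\pi$.

```python
import math

def hemisphere_volume(n_r=2000):
    # Midpoint rule in r for the integrand sqrt(1 - r^2) * r on [0, 1],
    # multiplied by 2*pi for the integration in theta.
    h = 1.0/n_r
    r_integral = sum(math.sqrt(1.0 - ((i + 0.5)*h)**2)*(i + 0.5)*h
                     for i in range(n_r))*h
    return 2.0*math.pi*r_integral

print(hemisphere_volume(), 2.0*math.pi/3.0)
```

The approximation agrees with $2\pi/3$, half the volume of the unit sphere, to about four decimal places.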
\[ \iint_D (x^2 + y^2)^{7/2}\,dy\,dx. \]
\[ \iint_D \sqrt{1 + x^2 + y^2}\,dy\,dx. \]
3. Find the volume of the solid which lies under the paraboloid $z = x^2 + y^2$ and over
the interior of the cardioid $r = 1 - \sin\theta$.
4. Find the volume of the solid lying inside the cylinder $x^2 + y^2 \le 1$ and inside the
sphere $x^2 + y^2 + z^2 = 4$.
5. Find the volume of the solid lying inside the cylinder $x^2 + y^2 \le 1$ and inside the
ellipsoid
\[ \frac{x^2}{4} + \frac{y^2}{4} + z^2 = 1. \]
6. Find
\[ \int_{-1}^{1} \int_{-\sqrt{1 - x^2}}^{\sqrt{1 - x^2}} (x^2 + y^2)^{10}\,dy\,dx. \]
7. Let D be the circular region with center at (0, 1) and radius 1 , in the xy-plane. Find
x
ffv 2
D
x + y2
dydx.
y
ffvx2
D
+ y2
dydx.
9. Let S be the part of the disk D lying in the half-plane {(x, y) Ix ;;; O}. Find
xy
ff s
x2 + y2
dydx.
10. Find
f1 i'h-x• sin7 cos7 x y
-1 - y l-x• (X2 + 2)3/2 dydx.
Y
11. Find
12 iy4-x• 2 x2 - Y.
- dydx.
0 0
- x2 + y2
-
We recall, from Section 7.6, the definitions of moments and centroids for finite
systems of point masses in a coordinate plane. Suppose that we have given a set of
particles $P_1, P_2, \ldots, P_n$, with masses $m_1, m_2, \ldots, m_n$, at the points $(x_1, y_1)$,
$(x_2, y_2), \ldots, (x_n, y_n)$. The moment of the system about the y-axis is defined to be
\[ M_y = \sum_{i=1}^{n} m_i x_i, \]
If
then the point $(\bar{x}, \bar{y})$ is called the centroid of the system. By easy calculations we get
\[ \bar{x} = \frac{1}{m}\sum_{i=1}^{n} m_i x_i, \qquad \bar{y} = \frac{1}{m}\sum_{i=1}^{n} m_i y_i \qquad \left(m = \sum_{i=1}^{n} m_i\right). \]
It is easy to see that if the axes are translated, the centroid is unchanged: for
\[ \bar{x}' = \frac{1}{m}\sum_{i=1}^{n} m_i x_i' = \bar{x} - \frac{h}{m}\sum_{i=1}^{n} m_i = \bar{x} - h, \]
\[ \bar{y}' = \frac{1}{m}\sum_{i=1}^{n} m_i y_i' = \bar{y} - k, \]
14.11 Moments and Centroids of Nonhomogeneous Bodies 675
so that in the new coordinate system we get the same centroid as before. Similarly,
if we reverse the direction of the x-axis, or the y-axis, or both, we get the same
centroid as before. Finally, we observe that the centroid is unchanged if we rotate
the axes through an angle of measure $\theta$. We have
Thus the old coordinate system and the new one give us the same point as centroid.
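The invariance of the centroid under translation can be seen in a small computation. The system of particles below is hypothetical, and the translation replaces each $(x, y)$ by $(x - h, y - k)$.

```python
def centroid(particles):
    # particles: list of (mass, x, y); returns (x_bar, y_bar).
    m = sum(mass for mass, _, _ in particles)
    x_bar = sum(mass*x for mass, x, _ in particles)/m
    y_bar = sum(mass*y for mass, _, y in particles)/m
    return (x_bar, y_bar)

# Hypothetical system of three particles.
system = [(1.0, 0.0, 0.0), (2.0, 3.0, 0.0), (3.0, 0.0, 4.0)]
h, k = 5.0, -2.0
shifted = [(mass, x - h, y - k) for mass, x, y in system]

x_bar, y_bar = centroid(system)
print((x_bar, y_bar), centroid(shifted))
```

The new coordinates of the centroid are $(\bar{x} - h, \bar{y} - k)$, which are the new coordinates of the same point.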
Suppose now that we have a thin rod, lying on an interval [a, b] on the x-axis.
We do not suppose that its mass per unit length is constant. But in any case there
is a function f which gives, for each x, the mass of the part of the rod that lies on
the interval $[a, x]$. If $f$ has a continuous derivative $f'$, then
A function $\rho$ which behaves in the way that we have just observed for $f'$ is called a
density function for the rod. That is:
Definition. Given a rod on $[a, b]$. For $a \le x_1 < x_2 \le b$, let $m(x_1, x_2)$ be the mass
of the part of the rod that lies on $[x_1, x_2]$. A density function for the rod is a function
$\rho$ such that
\[ m(x_1, x_2) = \int_{x_1}^{x_2} \rho(x)\,dx. \]
It follows, of course, that the total mass $m$ of the rod is $m(a, b) = \int_a^b \rho(x)\,dx$.
And the definition agrees with our intuitive notion of what density at a point ought
to mean. If $\rho(x_0)$ is the density at $x_0$, then it ought to be true that
\[ \frac{m(x_0, x_0 + \delta)}{\delta} \approx \rho(x_0) \]
when $\delta \approx 0$.
Here the lefthand side is the average mass per unit length on the interval $[x_0, x_0 + \delta]$,
and this ought to be approximately $\rho(x_0)$ when $\delta \approx 0$. And this is true, at any point
where $\rho$ is continuous:
by the general formula for the derivative of the integral. We assume hereafter that $\rho$
is a continuous function. Let us take a net $N: x_0, x_1, \ldots, x_n$ over $[a, b]$, and form
the sum
\[ \sum_{i=1}^{n} x_i\,\rho(x_i)\,\Delta x_i. \]
This sum is the moment, about the origin, of a finite system of particles of mass
0, is $\int_a^b x\rho(x)\,dx$. This integral is defined to be the moment of the rod about the origin.
Thus
\[ M_0 = \int_a^b x\rho(x)\,dx. \]
More generally, the moment about the point $x = k$ is
\[ M_{\bar{x}} = 0. \]
It is easy to calculate that
\[ \bar{x} = \frac{\int_a^b x\rho(x)\,dx}{\int_a^b \rho(x)\,dx}. \]
By the definition of the density function, the integral in the denominator is the total
mass $m$ of the system. Thus, briefly,
\[ \bar{x} = \frac{1}{m}\int_a^b x\rho(x)\,dx. \]
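Suppose, for example, that the rod lies on the interval $[0, 2]$, and that the density
is proportional to the distance from the origin. Here we have
\[ \rho(x) = kx, \qquad m = \int_0^2 kx\,dx = 2k, \]
\[ \bar{x} = \frac{1}{m}\int_0^2 x \cdot kx\,dx = \frac{1}{2k}\cdot k\left[\tfrac{1}{3}x^3\right]_0^2 = \tfrac{1}{2}\cdot\tfrac{1}{3}\cdot 8 = \tfrac{4}{3}. \]
This is greater than one, as it should be.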
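The formula for $\bar{x}$ is simple to evaluate numerically. The sketch below uses the midpoint rule (our choice) for both integrals; the constant of proportionality $k$ is arbitrary, since it cancels.

```python
def midpoint_integral(g, a, b, n=1000):
    # Midpoint-rule approximation of the integral of g over [a, b].
    h = (b - a)/n
    return sum(g(a + (i + 0.5)*h) for i in range(n))*h

def rod_centroid(density, a, b):
    # x_bar = (integral of x*rho(x)) / (integral of rho(x)) over [a, b].
    mass = midpoint_integral(density, a, b)
    moment = midpoint_integral(lambda x: x*density(x), a, b)
    return moment/mass

k = 3.0  # any positive constant of proportionality; it cancels out
print(rod_centroid(lambda x: k*x, 0.0, 2.0))
```

The result is close to $4/3$, to the right of the midpoint of the rod, where the rod is denser.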
For each subregion $D_i$ of $D$ let $m(D_i)$ be the mass of the portion of the plate that
lies in $D_i$. Following the analogy of the rod, we define a density function for the
plate to be a function $\rho$ such that
\[ m(D_i) = \iint_{D_i} \rho(P)\,dA \]
for every $D_i$ lying in $D$. In particular, $D_i$ may be all of $D$; and in this case the total
mass of the plate is
\[ m = m(D) = \iint_D \rho(P)\,dA. \]
\[ N: D_1, D_2, \ldots, D_n \]
over $D$; we take a sample $P_1, P_2, \ldots, P_n$ of $N$, with $P_i = (x_i, y_i)$; and we form the
sum
\[ \sum_{i=1}^{n} x_i\,\rho(x_i, y_i)\,\Delta A_i, \]
where $\Delta A_i$ is the area of $D_i$. This sum is the moment about the y-axis of a system of
particles of mass $\rho(x_i, y_i)\,\Delta A_i$, with x-coordinates $x_i$. As the mesh of the net
approaches zero, these sums approach the limit
\[ \iint_D x\rho(x, y)\,dA. \]
By definition, the moment of the plate about the y-axis is this integral. More generally,
the moment about the line $x = k$ is
and similarly,
\[ m = \iint_D \rho(x, y)\,dA, \]
In the preceding discussion, we have assumed for the sake of simplicity that the
density is continuous, so that we don't need to worry about whether our integrals
exist. In some very simple cases, however, the density is not continuous. Suppose,
for example, that we take a rod of unit length, with constant density 1, and another
rod of unit length, with constant density 2, and lay them end to end.
[Figure: a rod on the interval $[0, 2]$, with $\rho = 1$ on the left half and $\rho = 2$ on the right half.]
Thus $\rho(x) = 1$ for $0 \le x < 1$, and $\rho(x) = 2$ for $1 < x \le 2$. At the midpoint 1,
we split the difference, and take $\rho(1) = \tfrac{3}{2}$. We now have a discontinuous density
[Figure: graph of $y = \rho(x)$.]
\[ \int_0^2 \rho(x)\,dx = 1 + 2 = 3, \]
which is equal to the mass, as it should be. The function $x\rho(x)$ looks like this:
[Figure: graph of $y = x\rho(x)$.]
and
\[ M_0 = \int_0^2 x\rho(x)\,dx = \tfrac{1}{2} + \tfrac{1}{2}(2 + 4) = \tfrac{7}{2}. \]
Therefore
\[ \bar{x} = \frac{1}{m}M_0 = \tfrac{1}{3}\cdot\tfrac{7}{2} = \tfrac{7}{6}. \]
This is the right answer. If we assume that the masses of the two halves of the rod are
concentrated at their centroids, then we get two particles, of masses 1 and 2, at the
points $\tfrac{1}{2}$ and $\tfrac{3}{2}$.
Here
\[ M_0 = \tfrac{1}{2}\cdot 1 + \tfrac{3}{2}\cdot 2 = \tfrac{7}{2}, \qquad m = 3, \]
and
\[ \bar{x} = \tfrac{1}{3}\cdot\tfrac{7}{2} = \tfrac{7}{6}, \]
as before.
This illustrates the way in which our formulas work, for discontinuous density
functions. The general theory, however, is hard, and we make no attempt to discuss
it here. Meanwhile the above example shows that some very simple physical situations
lead naturally to discontinuous functions.
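For a rod made of pieces of constant density, the mass and the moment can be computed exactly, piece by piece. The sketch below does this with exact fractions for the two-piece rod of the example; the value assigned to $\rho$ at the single point 1 plays no part in the computation.

```python
from fractions import Fraction

def piece_mass_and_moment(rho, a, b):
    # For constant density rho on [a, b]:
    # mass = rho*(b - a), moment about 0 = rho*(b^2 - a^2)/2.
    a, b, rho = Fraction(a), Fraction(b), Fraction(rho)
    return rho*(b - a), rho*(b*b - a*a)/2

pieces = [(1, 0, 1), (2, 1, 2)]  # (density, left end, right end)
mass = sum(piece_mass_and_moment(*p)[0] for p in pieces)
moment = sum(piece_mass_and_moment(*p)[1] for p in pieces)
print(mass, moment, moment/mass)
```

The three printed values are the mass, the moment $M_0$, and the centroid $\bar{x}$.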
1. A thin rod occupies the interval [2, 4]. Its density is proportional to the distance from
the origin. Find the centroid.
2. A thin rod occupies the interval [1, 2]. Its density is proportional to the square root
of the distance from the origin. Find the centroid.
3. A thin plate occupies the unit disk with center at the origin. Its density is proportional
to $e^{-(x^2 + y^2)}$. Find the centroid.
4. A thin plate occupies the unit disk with center at the origin. Its density is proportional
to $\sqrt{1 + x^2 + y^2}$. Find the centroid.
5. A thin plate occupies the righthand half ($x \ge 0$) of the unit disk with center at the
origin. Its density is proportional to the distance from the origin. Find the centroid.
6. Same question, where the density is proportional to the square of the distance from the
origin.
7. A thin plate occupies the interior of the cardioid $r = 1 - \sin\theta$. Its density is constant,
say, = 1. Find the centroid. (The computation is long, even if the appropriate
short-cuts are used.)
8. A function $f$ is defined by the conditions
\[ f(x) = \begin{cases} x^2 & \text{for } 0 \le x \le 1, \\ (x - 2)^2 & \text{for } 1 \le x \le 2. \end{cases} \]
Find $\int_0^2 f(x)\,dx$.
9. A function/ is defined by the condition
for 0 � x � 1 ,
f(x) =
{; = �2 for 1 � x � 2.
Find S�f(x)dx.
10. A thin plate occupies the square region whose corners are (0, 0), (1, 1), (2, 0), and
(1, -1). Its density is proportional to the distance from the y-axis. Find the centroid.
11. A thin plate occupies a triangular region with vertices (0, 0), (1, 1), and (1, -1). Its
density is proportional to the distance from the x-axis. Find the centroid.
12. Given a thin plate, occupying a region $D$, with density function $\rho$. The moment of
inertia of the plate about the point $P_0 = (x_0, y_0)$ is defined to be
Suppose that the plate occupies the unit circle with center at the origin, and that the
density is constant. Find the moment of inertia about the origin.
13. Under the conditions of Problem 12, find out which point $P_0$ gives the minimum value
of the moment of inertia.
14. The moment of inertia of a thin plate about the line $x = x_0$ is
The moment of inertia $I_{y=y_0}$ about the line $y = y_0$ is defined similarly. A thin plate
occupies the righthand half of the unit disk with center at the origin, and its density is
proportional to the distance from the origin. Find $I_{x=0}$ and $I_{y=0}$.
15. Given a thin plate, with density function $\rho$, on a domain $D$. For what point $P_0$ does the
moment of inertia $I_{P_0}$ take on its minimum value?
Suppose that we have given a path $P: I \to D$, where $I$ is a closed interval $[a, b]$ and
$D$ is a region in a coordinate plane.
14.12 Line Integrals 681
[Figure: a path from $P(a)$ to $P(b)$.]
for each $t$, and suppose that $f$ and $g$ have continuous derivatives. Let $F$ and $G$ be
continuous functions defined on $D$. The line integral $\int_P F\,dx + G\,dy$, of $F$ and $G$
over the path $P$, is defined as follows.
Let
\[ x_i = f(t_i), \qquad y_i = g(t_i), \qquad \Delta x_i = x_i - x_{i-1}, \qquad \Delta y_i = y_i - y_{i-1}. \]
Then
\[ \int_P F\,dx + G\,dy = \lim_{|N| \to 0} \sum_{i=1}^{n} [F(x_i, y_i)\,\Delta x_i + G(x_i, y_i)\,\Delta y_i]. \]
We need, of course, to show that the limit exists. We shall do this by deriving a
formula for the limit, as follows:
Theorem 1. If $F$ and $G$ are continuous, and the coordinate functions $f$ and $g$ of the
path $P$ have continuous derivatives, then
\[ \int_P F\,dx + G\,dy = \int_a^b [F(f(t), g(t))f'(t) + G(f(t), g(t))g'(t)]\,dt. \]
\[ \phi(t) = F(f(t), g(t))f'(t) \]
over the net $N$; the only trouble is that we have substituted two different sample
points in two different places in the formula for $\phi$. But in the limit, this does not
matter. (See Appendix I, where a very similar case is discussed in detail.) Therefore
\[ \lim_{|N| \to 0} \sum_{i=1}^{n} F(x_i, y_i)\,\Delta x_i = \int_a^b \phi(t)\,dt = \int_a^b F(f(t), g(t))f'(t)\,dt. \]
\[ \lim_{|N| \to 0} \sum_{i=1}^{n} G(x_i, y_i)\,\Delta y_i = \int_a^b G(f(t), g(t))g'(t)\,dt; \]
and from this the theorem follows.
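The definition by sums and the formula of Theorem 1 can be compared on an example. Everything in the sketch below is our own illustrative choice: the path $P(t) = (t, t^2)$ on $[0, 1]$, and the functions $F(x, y) = y$, $G(x, y) = x$.

```python
def path(t):
    # Illustrative path P(t) = (f(t), g(t)) = (t, t^2) on [0, 1] (our choice).
    return (t, t*t)

def F(x, y):
    return y

def G(x, y):
    return x

def line_integral_by_sums(n=2000, a=0.0, b=1.0):
    # Sum of F(x_i, y_i)*dx_i + G(x_i, y_i)*dy_i over a uniform net of [a, b].
    total = 0.0
    x_prev, y_prev = path(a)
    for i in range(1, n + 1):
        x, y = path(a + (b - a)*i/n)
        total += F(x, y)*(x - x_prev) + G(x, y)*(y - y_prev)
        x_prev, y_prev = x, y
    return total

def line_integral_by_formula(n=2000, a=0.0, b=1.0):
    # Midpoint rule for F(f, g) f' + G(f, g) g', with f'(t) = 1 and g'(t) = 2t.
    h = (b - a)/n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5)*h
        x, y = path(t)
        total += (F(x, y)*1.0 + G(x, y)*2.0*t)*h
    return total

print(line_integral_by_sums(), line_integral_by_formula())
```

Here the integrand in the formula is $t^2 + 2t^2 = 3t^2$, so the exact value is 1; the sum over the net approaches the same value as the mesh decreases.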
For example, we might have
Line integrals have the following quite natural physical interpretation. We regard
the path $P: I \to D$ as a description of the motion of a particle in the plane, during
the time interval $a \le t \le b$. Suppose that at each point $(x, y)$ of $D$ there is a resisting
force $R(x, y) = F(x, y)\mathbf{i} + G(x, y)\mathbf{j}$.
Here $R$ is a vector, with components $F$ and $G$ in the x- and y-directions, and the
indicated addition is vector addition. As the particle moves from $P(t_{i-1})$ to $P(t_i)$, the work
should be approximately
\[ W = \int_P F\,dx + G\,dy, \]
\[ \sum_{i=1}^{n} R(Q_i) \cdot \Delta P_i, \]
where $Q_i = P(t_i) = (x_i, y_i)$. Therefore the line integral depends merely on the
\[ \int_P R \cdot dP. \]
is an exact differential. This is the case in which there is a function $\phi$ such that
\[ \phi_x = F, \qquad \phi_y = G, \]
Here, by the chain rule for paths, the integrand is the derivative of the function
In such a case, we may describe the line integral merely by using the endpoints of
the path as limits of integration, writing
\[ \int_{(c, d)}^{(c', d')} \phi_x\,dx + \phi_y\,dy \]
for
Calculate the line integral $\int_P R \cdot dP$, by any method, for the following functions $R$
and $P$.
In this book, the use of logical symbols is held to a minimum, on the ground that
words are usually easier to read. But the symbolism explained below will at some
points be convenient in the text, and is even more useful in notebooks and on
blackboards.
We explained in Chapter 1 that $\Leftrightarrow$ means "is equivalent to." Thus
\[ \{x \mid P(x)\} \]
denotes the set of all objects $x$ such that $P(x)$ is true. That is, $\{x \mid P(x)\}$ is the solution
set of the open sentence $P(x)$. Thus the closed interval from 0 to 4 is
\[ [0, 4] = \{x \mid 0 \le x \le 4\}, \]
and the open interval from 1 to 2 is
\[ A \subset B \]
\[ B \supset A. \]
688 Appendix A
\[ x \in A. \]
This is read "x belongs to A." The denial of this statement is indicated by a diagonal
stroke. That is,
\[ x \notin A \]
means that $x$ does not belong to $A$. The union of $A$ and $B$ is denoted by
\[ A \cup B. \]
Thus
\[ A \cup B = \{x \mid x \in A \text{ or } x \in B\}. \]
The formula $A \cup B$ is read "A cup B." The intersection of $A$ and $B$ is denoted by
\[ A \cap B. \]
Thus
\[ A \cap B = \{x \mid x \in A \text{ and } x \in B\}. \]
The formula $A \cap B$ is read "A cap B." The difference
\[ A - B \]
of $A$ and $B$ is the set of all elements of $A$ that are not elements of $B$. To write $A - B$,
we need not suppose that $B \subset A$. For example, if $A = [0, 2]$ and $B = [1, 4]$, then
\[ A - B = [0, 1). \]
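Readers who program may find it helpful to see the same operations on finite sets. The sketch below uses small finite sets of integers in place of intervals; the particular sets are our own.

```python
A = {0, 1, 2}
B = {1, 2, 3, 4}

union = A | B          # {x | x in A or x in B}
intersection = A & B   # {x | x in A and x in B}
difference = A - B     # elements of A that are not elements of B

print(union, intersection, difference)
print(3 in A)                            # membership: does 3 belong to A?
print({1, 2} - B == set(), {1, 2} <= B)  # an empty difference goes with inclusion
```

As in the text, forming the difference of two sets does not require that the second be contained in the first.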
These sets are intervals, of course, described in the notation of Chapter 1. The
empty set is denoted by $\{\ \}$. Thus
\[ A \cap B = \{\ \} \]
\[ A - B = \{\ \} \]
means that $A \subset B$. When we write
we mean that the equation on the left holds true for every x. Similarly,
\[ x^2 - y^2 = (x - y)(x + y) \quad \forall_{x, y} \]
means that the equation holds true for every $x$ and $y$. The symbol "$\forall$" is read "for
every." More informally, we may write
Used in moderation, this symbolism is a convenience. But its use can be over
done; and it takes practice to read formulas like
This says that between any two real numbers there is a third.
This symbolism is introduced merely as a scheme of abbreviations of English
words and phrases. This book makes no attempt to deal with symbolic logic; and its
use of the "theory of sets" is entirely intuitive. But the shorthand of logic is useful
simply as a shorthand; the point is that we are more likely to say what we mean if we
have a quick and easy way to do so.
Algebraic Operations
Appendix B with Limits of Functions
In Section 3.4 a number of theorems on limits were stated without proofs. Here we
give the proofs. First we recall some of the results of Section 3.4.
Theorem 1. If limX---+.,.f(x) =L, then lim.,-.,.[f(x) - L] = 0.
Theorem 2. If lim.,-.,0 [f(x) - L] = 0, then lim.,-.,J(x) =L.
(These were Theorems 2 and 3 of Section 3.4.)
Theorem 3. If lim.,-.,J(x) = 0 and limo:---+xo g(x) = 0, then
690
Algebraic Operations with Limits of Functions 691
By Theorem 3,
$$\lim_{x \to x_0} [f(x) + g(x) - (L + L')] = 0.$$
By Theorem 2,
$$\lim_{x \to x_0} [f(x) + g(x)] = L + L'.$$
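Proof. Let
$$\lim_{x \to x_0} f(x) = L.$$
Let $\epsilon$ be any positive number, and let $\delta$ be as in the definition of a limit.
[Figure: graph of f near $x_0$, showing the band between $L - \epsilon$ and $L + \epsilon$ over the interval from $x_0 - \delta$ to $x_0 + \delta$.]
Thus
$$0 < |x - x_0| < \delta \;\Rightarrow\; L - \epsilon < f(x) < L + \epsilon.$$
The $\delta$ that we now have is the $\delta$ that we wanted. We let M be the larger of the
numbers $|L + \epsilon|$, $|L - \epsilon|$.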
$$\lim_{x \to x_0} [f(x)g(x)] = 0.$$
Proof. Let a positive number $\epsilon$ be given. Take $\delta_1 > 0$ and $M > 0$ such that
(
then (1) and (3) both hold, and so (2) and (4) both hold. Therefore
g(x) < M,
Ix - x01
0 < < O =>
f(x) < �.
Since
we have
In each of these brackets, the first factor approaches 0 as $x \to x_0$, and the second
factor is locally bounded at $x_0$. Therefore
and
$$\lim_{x \to x_0} [(g(x) - L')L] = 0.$$
Therefore the sum of the two bracketed expressions approaches 0, which was to be
proved.
Roughly speaking, a function f is locally bounded away from 0 at $x_0$ if $f(x)$ is not
very close to 0 when x is close to $x_0$ and different from $x_0$.
Definition. Suppose that there are numbers $\epsilon > 0$ and $\delta > 0$ such that
$$0 < |x - x_0| < \delta \;\Rightarrow\; |f(x)| > \epsilon.$$
Then f is locally bounded away from 0 at $x_0$.
[Figure: two graphs of functions near $x_0$.]
Note that, if $f(x)$ is never 0 when $x \neq x_0$ and x is close to $x_0$, it does not follow
that f is locally bounded away from 0 at $x_0$. The situation shown in the figure on the
right above can easily occur. Here f is undefined at $x_0$, $f(x) \neq 0$ for $x \neq x_0$, and
$\lim_{x \to x_0} f(x) = 0$. In this case f is not locally bounded away from 0 at $x_0$.
Theorem 8. If $\lim_{x \to x_0} f(x) = L$, and $L \neq 0$, then f is locally bounded away from
0 at $x_0$.
Proof. Suppose first that $L > 0$. Let $\epsilon = L/2$, and let $\delta$ be as in the definition of a
limit.
Then
Thus
$$0 < |x - x_0| < \delta \;\Rightarrow\; f(x) > \epsilon \;\Rightarrow\; |f(x)| > \epsilon.$$
Suppose now that $L < 0$. Then $-L > 0$, and $\lim_{x \to x_0} [-f(x)] = -L$. Therefore
$-f$ is locally bounded away from 0 at $x_0$. Therefore so also is f. (Look again at
the definition: it is a statement about $|f|$.)
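Theorem 9. If f is locally bounded away from 0 at $x_0$, then $1/f$ is locally bounded at $x_0$.
Proof. We know that there are positive numbers $\epsilon$ and $\delta$ such that
$$0 < |x - x_0| < \delta \;\Rightarrow\; |f(x)| > \epsilon.$$
Therefore
$$0 < |x - x_0| < \delta \;\Rightarrow\; \frac{1}{|f(x)|} < \frac{1}{\epsilon} \;\Rightarrow\; \left|\frac{1}{f(x)}\right| < \frac{1}{\epsilon}.$$
Therefore $1/f$ is locally bounded at $x_0$; we use the $\delta$ that was given for f, and we use
the bound $M = 1/\epsilon$.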
Theorem 10. If $\lim_{x \to x_0} f(x) = L$, and $L \neq 0$, then
$$\lim_{x \to x_0} \left[\frac{1}{f(x)} - \frac{1}{L}\right] = 0.$$
Now
$$\frac{1}{f(x)} - \frac{1}{L} = \frac{L - f(x)}{L f(x)} = [L - f(x)] \cdot \frac{1}{L f(x)}.$$
The bracket on the left approaches 0. The fraction on the right is locally bounded at
$x_0$. Therefore the product approaches 0.
Theorem 11. If $\lim_{x \to x_0} f(x) = L$, $\lim_{x \to x_0} g(x) = L'$, and $L' \neq 0$, then
$$\lim_{x \to x_0} \frac{f(x)}{g(x)} = \frac{L}{L'}.$$
Proof by Theorems 7 and 10.
Appendix C  Algebraic Operations with Limits of Sequences
Definition. Given a sequence $a_1, a_2, \ldots$ and a number L. Suppose that for every
$\epsilon > 0$ there is an integer N such that
$$n > N \;\Rightarrow\; |a_n - L| < \epsilon.$$
Then
$$\lim_{n \to \infty} a_n = L.$$
Therefore
$$n > N \;\Rightarrow\; |(a_n - L) - 0| < \epsilon.$$
Since for every $\epsilon > 0$ there is such an N, it follows that $\lim_{n \to \infty} (a_n - L) = 0$.
Note that statements of the form $n > N$ are playing exactly the same part as
statements of the form $0 < |x - x_0| < \delta$.
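The role of N in the definition of the limit of a sequence can be spot-checked numerically. The sketch below assumes a hypothetical example sequence $a_n = 1 + 1/n$ with limit $L = 1$, which does not come from the text; for it, any integer $N \ge 1/\epsilon$ works. A finite check of this kind illustrates the definition but proves nothing.

```python
import math


def a(n):
    """Hypothetical example sequence a_n = 1 + 1/n, whose limit is L = 1."""
    return 1.0 + 1.0 / n


def find_N(eps):
    """Return an integer N such that n > N implies |a_n - 1| < eps (here 1/n < eps)."""
    return math.ceil(1.0 / eps)


def check(eps, how_many=1000):
    """Spot-check n > N => |a_n - L| < eps on finitely many n beyond N."""
    N = find_N(eps)
    return all(abs(a(n) - 1.0) < eps for n in range(N + 1, N + 1 + how_many))


for eps in (0.5, 0.01, 0.001):
    print(eps, find_N(eps), check(eps))
```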
Theorem 2. If $\lim_{n \to \infty} (a_n - L) = 0$, then $\lim_{n \to \infty} a_n = L$.
Therefore the sequence $a_{N+1}, a_{N+2}, \ldots$ is bounded. (The larger of the numbers
$|L - \epsilon|$, $|L + \epsilon|$ is a bound.)
And the finite sequence $a_1, a_2, \ldots, a_N$ is bounded. We now get a bound M for the
entire sequence $a_1, a_2, \ldots$: let M be the largest of the numbers $|L - \epsilon|$, $|L + \epsilon|$,
$|a_1|, |a_2|, \ldots, |a_N|$.
Theorem 6. If $\lim_{n \to \infty} a_n = 0$, and $b_1, b_2, \ldots$ is bounded, then
$$\lim_{n \to \infty} a_n b_n = 0.$$
Definition. Suppose that there are numbers $\epsilon > 0$ and N such that
Theorem 9. If $a_1, a_2, \ldots$ is bounded away from 0, and $a_n \neq 0$ for each n, then the
sequence $1/a_1, 1/a_2, \ldots$ is bounded.
$$\lim_{n \to \infty} \frac{1}{a_n} = \frac{1}{L}.$$
Note that we must require that $a_n \neq 0$ for each n; otherwise the sequence of
reciprocals is not defined.
Theorem 11. If $\lim_{n \to \infty} a_n = L$, $\lim_{n \to \infty} b_n = L' \neq 0$, and $b_n \neq 0$ for every n, then
$$\lim_{n \to \infty} \frac{a_n}{b_n} = \frac{L}{L'}.$$
At the beginning of Section 4.4, we gave numerical examples of the use of the approximation
$\Delta f \approx df$, and we found that when we checked our approximate answers
against the exact answers, the approximations looked good. But numerical approximation
methods are important precisely in those cases where their accuracy cannot be
checked in this way: if you can find the exact answer, then you use it; you don't get
an inexact answer to compare with it. This brings up the problem of setting a limit
on the error that results when you use $df$ in place of $\Delta f$. The solution of this problem is
as follows. We have
[Figure: two graphs of f over the interval from $x_0$ to $x_0 + \Delta x$; the second also marks an intermediate point $\bar{x}$.]
Applying the mean-value theorem (MVT) to the function f, on the interval from $x_0$
to $x_0 + \Delta x$, we conclude that
$$\frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} = f'(\bar{x}),$$
for some $\bar{x}$ between $x_0$ and $x_0 + \Delta x$. (For $x_0 < x_0 + \Delta x$, as in the figure, this result
follows. You should check that it also follows when $x_0 + \Delta x < x_0$. Hereafter we
shall assume that $\Delta x > 0$. The case $\Delta x < 0$ needs to be checked separately.)
Therefore
$$\Delta f = f'(\bar{x})\,\Delta x, \qquad x_0 < \bar{x} < x_0 + \Delta x,$$
and
$$\Delta f - df = f'(\bar{x})\,\Delta x - f'(x_0)\,\Delta x = [f'(\bar{x}) - f'(x_0)]\,\Delta x.$$
698 Appendix D
We now apply MVT to the function $f'$, on the interval $[x_0, \bar{x}]$. MVT tells us that
$$\frac{f'(\bar{x}) - f'(x_0)}{\bar{x} - x_0} = f''(x'),$$
for some $x'$ between $x_0$ and $\bar{x}$. Therefore
$$f'(\bar{x}) - f'(x_0) = f''(x')(\bar{x} - x_0),$$
and
$$\Delta f - df = f''(x')(\bar{x} - x_0)\,\Delta x.$$
Now $|\bar{x} - x_0| \le |\Delta x|$. Therefore $|\Delta f - df| \le |f''(x')|\,\Delta x^2$, where $x'$ is between
$x_0$ and $x_0 + \Delta x$.
It often happens that we can find a bound M for the numbers $|f''(x)|$, for x
between $x_0$ and $x_0 + \Delta x$. If so, we can conclude that
$$|\Delta f - df| \le M\,\Delta x^2.$$
For example, if $f(x) = \sin x$, then $f''(x) = -\sin x$, and $|f''(x)| \le 1$, no matter
what x may be. Before giving further examples, let us write down the theorem that
we have proved:
Theorem 1. Suppose that f has a second derivative $f''$, and that $|f''(x)| \le M$, for
every x between $x_0$ and $x_0 + \Delta x$. Then
$$|\Delta f - df| \le M\,\Delta x^2.$$
Let us see how this applies to Example 1 of Section 4.4. Here we had
$$f(x) = \sqrt{x}, \qquad x_0 = 25, \qquad \Delta x = 0.4.$$
Now
$$f(x) = x^{1/2}, \qquad f'(x) = \tfrac{1}{2}x^{-1/2}, \qquad f''(x) = -\tfrac{1}{4}x^{-3/2}.$$
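The bound $|\Delta f - df| \le |f''(x')|\,\Delta x^2$ derived above can be checked numerically for the square-root example with $x_0 = 25$ and $\Delta x = 0.4$. The sketch below is not the book's own computation; it assumes that $|f''|$ is largest at the left endpoint $x_0$ (true here because $|f''(x)| = \tfrac{1}{4}x^{-3/2}$ decreases for $x > 0$), and the variable names are hypothetical.

```python
import math


def f(x):
    return math.sqrt(x)


def f_prime(x):
    return 0.5 / math.sqrt(x)


def f_second(x):
    return -0.25 * x ** -1.5


x0, dx = 25.0, 0.4

delta_f = f(x0 + dx) - f(x0)   # exact change in f
df = f_prime(x0) * dx          # linear approximation to that change

# |f''| is decreasing for x > 0, so on [x0, x0 + dx] its largest value is at x0.
M = abs(f_second(x0))          # equals 1/500
bound = M * dx ** 2            # the bound M * (delta x)^2

print("actual error:", abs(delta_f - df))
print("error bound: ", bound)
```

The actual error comes out smaller than the bound, as the inequality requires; the bound is not claimed to be sharp.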
In Section 4.5 we stated the following theorem, with only rough indications of proof.
Theorem. The composition of two continuous functions is continuous. That is, if
$$\lim_{x \to x_0} g(x) = g(x_0) = u_0, \tag{1}$$
and
$$\lim_{u \to u_0} f(u) = f(u_0), \tag{2}$$
then
$$\lim_{x \to x_0} f(g(x)) = f(g(x_0)). \tag{3}$$
Proof. Let $\epsilon$ be any positive number. By (2), there is a number $\delta_1$ such that
$$|u - u_0| < \delta_1 \;\Rightarrow\; |f(u) - f(u_0)| < \epsilon.$$
Now take $\delta_1$ as $\epsilon$, in the definition of hypothesis (1). By (1), there is a $\delta > 0$ such that
$$|x - x_0| < \delta \;\Rightarrow\; |g(x) - g(x_0)| < \delta_1.$$
This is the $\delta$ that we wanted: we have
$$|x - x_0| < \delta \;\Rightarrow\; |g(x) - g(x_0)| < \delta_1 \;\Rightarrow\; |f(g(x)) - f(g(x_0))| < \epsilon,$$
The Continuity of Continuous Functions 701
In fact,
$$x \neq 1 \;\Rightarrow\; g(x) = 1 \;\Rightarrow\; f(g(x)) = 2,$$
and so
$$\lim_{x \to 1} f(g(x)) = 2.$$
Appendix F  The Error in Simpson's Rule
The results of our calculations in Section 4.8 and in later sections suggest two questions:
2) In a particular computation, how can we tell how good the approximation is?
That is, how can we determine a bound for the error?
These questions have the following answer. In the theorem below, $f^{(4)}$ denotes
the fourth derivative of f. Thus $f^{(1)}$ is $f'$, $f^{(2)} = f''$, $f^{(3)} = Df'' = f'''$, and $f^{(4)} = Df^{(3)}$. As usual,
$$y_0 = f(-k), \qquad y_1 = f(0), \qquad y_2 = f(k).$$
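Theorem 1. If f has a fourth derivative, on the interval $[-k, k]$, then the error in
Simpson's rule is equal to
$$E(k) = -\frac{k^5}{90} f^{(4)}(\bar{x}),$$
that is,
$$\int_{-k}^{k} f(x)\,dx - \frac{k}{3}(y_0 + 4y_1 + y_2) = -\frac{k^5}{90} f^{(4)}(\bar{x}) \qquad (-k < \bar{x} < k).$$
Consequently, if $|f^{(4)}(x)| \le M$ for $-k \le x \le k$, then
$$|E(k)| \le \tfrac{1}{90} k^5 M.$$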
The latter is the statement which is most convenient to apply. Before proceeding to
the proof of the theorem, let us look at an application of it.
$$f'''(x) = -6x^{-4},$$
Since $f^{(4)}$ decreases as x increases, its maximum value on the interval [1, 2] is $f^{(4)}(1) = 24$.
Therefore $|f^{(4)}(x)| \le 24$ $(1 \le x \le 2)$. Therefore, if we cut up the interval
[1, 2] into 2n parts, each of length k, we have
$$|E(k)| \le \tfrac{1}{90} k^5 \cdot 24.$$
We want
/E(k)/ < 5 10-s, ·
for the fifth decimal place in our approximation to be correct. Thus we want to take
k such that
9�k5 24 < 5 10-s,
• ·
or
ks < H . 5 . 10-s.
Arithmetically, this reduces to
ks<\"-. 10-s,
which surely holds if k = 0.05. Therefore E(0.05) < 5 10-s. ·
This example was selected for its simplicity. For most functions, the calculation
of fourth derivatives is tedious.
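The estimate can be tried out numerically for $k = 0.05$. The sketch below rests on one assumption that the surviving text does not state outright: that the integrand is $f(x) = 1/x$ on [1, 2], which is consistent with $f'''(x) = -6x^{-4}$ and with the maximum value 24 of the fourth derivative. It applies Simpson's rule on each double strip of width 2k, adds the results, and compares the total error with the sum of the per-strip bounds $k^5 \cdot 24/90$. The function name `simpson` is hypothetical.

```python
import math


def simpson(f, a, b, k):
    """Composite Simpson's rule on [a, b] using subintervals of length k."""
    steps = round((b - a) / k)
    if steps % 2 != 0:
        raise ValueError("need an even number of subintervals")
    total = 0.0
    for j in range(0, steps, 2):
        x_left = a + j * k
        y0, y1, y2 = f(x_left), f(x_left + k), f(x_left + 2 * k)
        total += (k / 3.0) * (y0 + 4.0 * y1 + y2)
    return total


def f(x):
    # Assumed integrand; its exact integral over [1, 2] is ln 2.
    return 1.0 / x


k = 0.05
approx = simpson(f, 1.0, 2.0, k)
error = abs(math.log(2.0) - approx)

# Each double strip contributes at most k^5 * 24 / 90, and there are
# n = 1 / (2k) double strips on [1, 2].
n = round(1.0 / (2 * k))
bound = n * k ** 5 * 24.0 / 90.0

print("approximation:", approx)
print("actual error: ", error)
print("error bound:  ", bound)
```

The actual error falls well inside the bound, because the bound uses the largest value of the fourth derivative on every strip.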
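We proceed to the proof of Theorem 1. Let F be any function such that
$$F' = f.$$
(How do we know that there is such a function?) Then
$$E(k) = \int_{-k}^{k} f(x)\,dx - \frac{k}{3}(y_0 + 4y_1 + y_2) = F(k) - F(-k) - \frac{k}{3}[f(-k) + 4f(0) + f(k)],$$
and, differentiating with respect to k,
$$E'(k) = \tfrac{2}{3}[f(k) + f(-k)] - \tfrac{4}{3}f(0) - \frac{k}{3}[f'(k) - f'(-k)],$$
$$E''(k) = \tfrac{1}{3}[f'(k) - f'(-k)] - \frac{k}{3}[f''(k) + f''(-k)],$$
$$E'''(k) = -\frac{k}{3}[f'''(k) - f'''(-k)].$$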
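We need all of these formulas, not just the last. It is easy to check that
$$E(0) = E'(0) = E''(0) = 0.$$
Let
$$G(t) = E(t) - \frac{E(k)}{k^5}\,t^5.$$
On the interval [0, k] we apply the mean-value theorem (MVT) to the function G.
This gives
$$G'(x_1) = 0, \qquad 0 < x_1 < k.$$
We next apply MVT to the function $G'$, on the interval $[0, x_1]$. This gives
$$G''(x_2) = 0, \qquad 0 < x_2 < x_1.$$
Applying MVT once more, to $G''$ on the interval $[0, x_2]$, we get
$$G'''(x_3) = 0, \qquad 0 < x_3 < x_2.$$
By a straightforward calculation,
$$G'''(t) = E'''(t) - (60t^2)\,\frac{E(k)}{k^5} = -\frac{t}{3}[f'''(t) - f'''(-t)] - 60t^2\,\frac{E(k)}{k^5}.$$
Setting $t = x_3$ and solving for $E(k)$, we get
$$E(k) = -\frac{k^5}{180} \cdot \frac{f'''(x_3) - f'''(-x_3)}{x_3} = -\frac{k^5}{90} \cdot \frac{f'''(x_3) - f'''(-x_3)}{2x_3}.$$
We now apply MVT for the last time. By MVT there is an $\bar{x}$, between $-x_3$ and $x_3$,
such that the second fraction on the right is equal to $f^{(4)}(\bar{x})$. This gives
$$E(k) = -\frac{k^5}{90} f^{(4)}(\bar{x}) \qquad (-k < \bar{x} < k),$$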
If you reexamine Section 2.10, you will see that at the end of the section we were in a
peculiar position: we had gotten an answer for the area under the graph of y = kx2,
from x = a to x = b, but we were not in a position to prove it, because we had no
definition of area. The trouble, however, is easy to remedy. For the sake of simplicity,
consider first the case in which R is the region under the graph of y = x2, from
x = 0 to x = h. In Section 2.10, we proved the following two things:
1) There is a sequence $R_1, R_2, \ldots$ of polygonal regions containing R, with areas
$A_1, A_2, \ldots$, such that
$$\lim_{n \to \infty} A_n = L.$$
(Here $R_n$ was the union of the outer rectangles, $A_n$ was $(h^3/3)(1 + 1/n)(1 + 1/(2n))$,
and L was $h^3/3$.)
2) There is a sequence $R_1', R_2', \ldots$ of polygonal regions lying in R, with areas
$A_1', A_2', \ldots$, such that $\lim_{n \to \infty} A_n'$ is the same number L.
(Here $R_n'$ was the union of the inner rectangles, $A_n'$ was $(h^3/n^3) \sum_{i=1}^{n} (i - 1)^2$, and L was
$h^3/3$, as in condition 1.)
These ideas can be used to give a definition of area, in the following way.
Definition. Let R be a region in the plane. If R satisfies conditions (1) and (2),
then R is said to be measurable, and the number L is called its area.
Under this definition, the plane regions discussed in Chapter 2 are measurable,
and their areas are the numbers that we computed. The same conclusion follows
whenever we compute an area by means of a definite integral. In Section 7.8 we
showed that every continuous function is integrable. This gives the following:
Theorem. Let f be continuous and nonnegative on [a, b], and let R be the region
under the graph of f. Then R is measurable, and the area of R is
$$A = \int_a^b f(x)\,dx.$$
Proof. Take a sequence of nets
$$N_1, N_2, \ldots$$
over [a, b], with $|N_i| \to 0$. For each i, let $A_i$ be the upper sum $S(N_i)$ and let $A_i'$ be the
lower sum $s(N_i)$. Then $A_i$ is the area of a polygonal region containing R, and $A_i'$ is
the area of a polygonal region lying in R, as in the definition of a measurable set.
And
$$A_i' \le \int_a^b f(x)\,dx \le A_i.$$
The sequences $A_1, A_2, \ldots$ and $A_1', A_2', \ldots$ have the same limit, namely, the integral.
Therefore R is measurable, and its area is the integral.
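For the region under $y = x^2$ from 0 to h, the two sequences of areas in conditions (1) and (2) can be computed directly. The sketch below uses n rectangles of equal width $h/n$; the function names and the value $h = 2$ are hypothetical choices made only for illustration.

```python
def upper_sum(h, n):
    """Total area of the n outer rectangles for y = x^2 on [0, h]."""
    width = h / n
    return sum(width * (i * width) ** 2 for i in range(1, n + 1))


def lower_sum(h, n):
    """Total area of the n inner rectangles for y = x^2 on [0, h]."""
    width = h / n
    return sum(width * ((i - 1) * width) ** 2 for i in range(1, n + 1))


h = 2.0
L = h ** 3 / 3.0
for n in (10, 100, 1000):
    print(n, lower_sum(h, n), L, upper_sum(h, n))
```

The inner and outer areas squeeze $h^3/3$ from both sides, and their difference is $h^3/n$, which approaches 0.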
This theorem can be extended so as to apply to the region between the graphs of
two continuous functions.
It might seem that we could simplify the preceding discussion by defining the area
to be the integral, in the first place. But this will not work. The point is that some
regions can be represented in many different ways as the regions between the graphs
of two continuous functions. Different directions for the axes give different limits of
integration, and also different integrands, even for so simple a figure as an ellipse.
In the theory that we have just developed, we know that all the resulting integrals
give the same answer, because they all give the right answer for the area of the region.
But if we defined the area to be the integral, we would have the problem of showing,
by the methods of calculus, that all the integrals have the same value, and this would
be hard.
Appendix H  Proof of the Northeast Theorem
and
$$\lim_{t \to \infty} \frac{g'(t)}{f'(t)} = L, \tag{2}$$
then
$$\lim_{t \to \infty} \frac{g(t)}{f(t)} = L.$$
To start the proof, we first observe that since $g'(t)/f'(t) \to L$ as $t \to \infty$, we must
have $f'(t) \neq 0$ when t is sufficiently large, say, for $t \ge t_0$. Since we are taking the
limit as $t \to \infty$, we may regard $t_0$ as the initial point of the path. Since $g'(t)/f'(t) \to L$,
the function $g'/f'$ must be bounded on some interval $[t_1, \infty)$ $(t_1 \ge t_0)$. The reason is
that for every $\epsilon > 0$, we have
$$L - \epsilon < \frac{g'(t)}{f'(t)} < L + \epsilon,$$
for $t \ge$ a certain $t_1$. Therefore $g'/f'$ is bounded on the interval $[t_1, \infty)$. We now take
$t_1$ as the initial point of the path.
As a further simplification, we translate the point (j(t1), g(t1)) to the origin,
replacing f(t) and g(t) by F(t) = f(t) - f(t0), G(t) = g(t) - g(t0), where c � t0•
Obviously $G'/F'$ is bounded, and
$$\lim_{t \to \infty} \frac{G'(t)}{F'(t)} = L,$$
because $F' = f'$ and $G' = g'$. And if we can prove that
Therefore it will be sufficient to prove the theorem in the following special form.
Theorem A. Let F and G be differentiable functions on the interval $[t_1, \infty)$, such that
$$\lim_{t \to \infty} \frac{G'(t)}{F'(t)} = L. \tag{4}$$
Then
$$\lim_{t \to \infty} \frac{G(t)}{F(t)} = L. \tag{8}$$
[Figure: the path drawn as the graph $y = \phi(x)$, where $x = F(t)$; the slope is $m = \phi'(x) = G'(t)/F'(t)$.]
Evidently
$$\frac{G(t)}{F(t)} = \frac{\phi(x)}{x} \qquad (x = F(t)).$$
Theorem B (The Northeast theorem, rectangular form). Let $\phi$ be a function on the
interval $[0, \infty)$ such that
$$\phi(0) = 0, \tag{9}$$
$$\phi' \text{ is bounded}, \tag{10}$$
$$\lim_{x \to \infty} \phi(x) = \infty, \tag{11}$$
and
$$\lim_{x \to \infty} \phi'(x) = L. \tag{12}$$
Then
$$\lim_{x \to \infty} \frac{\phi(x)}{x} = L. \tag{13}$$
Step 1.
$$\lim_{x \to \infty} \frac{\phi(x^2) - \phi(x)}{x^2 - x} = L.$$
Proof. By the mean-value theorem (MVT), for each x there is an $\bar{x}$, between x and $x^2$,
such that
$$\frac{\phi(x^2) - \phi(x)}{x^2 - x} = \phi'(\bar{x}).$$
As $x \to \infty$, $\bar{x} \to \infty$, and so $\phi'(\bar{x}) \to L$. Therefore the fraction on the left also $\to L$.
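Step 2.
$$\lim_{x \to \infty} \left[\frac{\phi(x^2) - \phi(x)}{x^2 - x} - \frac{\phi(x^2)}{x^2}\right] = 0.$$
Proof.
$$\frac{\phi(x^2) - \phi(x)}{x^2 - x} - \frac{\phi(x^2)}{x^2} = \frac{x\phi(x^2) - x\phi(x) - x\phi(x^2) + \phi(x^2)}{x^2(x - 1)} = -\frac{1}{x - 1}\left[\frac{\phi(x)}{x} - \frac{\phi(x^2)}{x^2}\right].$$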
$$\frac{\phi(x)}{x} = \phi'(\bar{x}).$$
Since $\phi'$ is bounded, it follows that $\phi(x)/x$ is bounded. Therefore $\phi(x^2)/x^2$ is bounded.
Since $-1/(x - 1) \to 0$, it follows that
$$-\frac{1}{x - 1}\left[\frac{\phi(x)}{x} - \frac{\phi(x^2)}{x^2}\right] \to 0.$$
Step 3.
$$\lim_{x \to \infty} \frac{\phi(x^2)}{x^2} = L,$$
by Steps 1 and 2. From Step 3 it follows immediately that $\phi(x)/x \to L$, which was
to be proved.
Appendix I  Proof of the Formula for Path Length
Here we complete the proof of Theorem 1 of Section 9.6, which asserts that the
length of a path is given by the formula
$$s = \int_a^b \sqrt{f'(t)^2 + g'(t)^2}\,dt,$$
where f and g are the coordinate functions and $f'$ and $g'$ are continuous. The notation
is that of Section 9.6. By definition,
$$s = \lim_{|N| \to 0} \sum_{i=1}^{n} P_{i-1}P_i;$$
we know that
$$\sum_{i=1}^{n} P_{i-1}P_i = \sum_{i=1}^{n} \sqrt{f'(\bar{t}_i)^2 + g'(\hat{t}_i)^2}\,\Delta t_i,$$
Proof. 1) Since $f'$ and $g'$ are continuous on [a, b], so also is the function $f'^2 + g'^2$.
Let M be such that
$$< \frac{\epsilon}{b - a} \sum_{i=1}^{n} \Delta t_i = \frac{\epsilon}{b - a}(b - a) = \epsilon.$$
Since the absolute value of the sum is less than or equal to the sum of the absolute
values, it follows that $\delta$ satisfies the conditions of the lemma.
Appendix J  A Method for Constructing the Complex Numbers
In Section 10.11 the complex numbers were presented as a formal system of symbols
$a + bi$, with $i^2 = -1$. We shall now define a mathematical system of this kind, and
show that it has the properties that we want. There are various ways to do this.
The following method has the advantage of copying the pattern of the manipulative
processes that we would be using anyway. It has the further advantage of introducing
ideas that will be useful later, in modern algebra.
Let P(x) be the set of all polynomials $p(x) = \sum_{i=0}^{n} a_i x^i$. In P(x) we can add and
multiply. We know that in P(x), these operations obey the CAD laws, that is, they
are commutative, associative, and distributive:
$$pq = qp, \qquad p + q = q + p,$$
$$p(qr) = (pq)r, \qquad p + (q + r) = (p + q) + r,$$
$$p(q + r) = pq + pr.$$
These follow immediately from the corresponding laws for the real numbers p(x)
which are the values of our polynomial functions.
Two polynomials p, q will be called congruent modulo $1 + x^2$ if their difference
is a multiple of $1 + x^2$. We then write
Proof. We know that these laws hold in P(x). Therefore, under our definitions of
addition and multiplication in C, we have
$$\bar{p} + \bar{q} = \overline{p + q} = \overline{q + p} = \bar{q} + \bar{p};$$
and
For example,
$$p(x) = 7x^7 - 5x^3 + 6x^2 - 3 \equiv -2x - 9.$$
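The reduction in this example is mechanical, and it can be sketched in a few lines. The code below is an illustration, not part of the original construction: a polynomial is stored as a list of coefficients, with `coeffs[j]` the coefficient of $x^j$, and each power $x^{2m}$ is replaced by $(-1)^m$. The function names `reduce_mod` and `multiply` are hypothetical.

```python
def reduce_mod(coeffs):
    """Reduce a polynomial modulo 1 + x^2.

    coeffs[j] is the coefficient of x^j. Returns (a, b) such that the
    polynomial is congruent to a + b*x, using x^2 = -1 repeatedly.
    """
    a, b = 0, 0
    for j, c in enumerate(coeffs):
        sign = -1 if (j // 2) % 2 else 1   # x^(2m) is congruent to (-1)^m
        if j % 2 == 0:
            a += sign * c                  # even powers reduce to constants
        else:
            b += sign * c                  # odd powers reduce to multiples of x
    return a, b


def multiply(p, q):
    """Ordinary polynomial multiplication on coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


# p(x) = 7x^7 - 5x^3 + 6x^2 - 3
p = [-3, 0, 6, -5, 0, 0, 0, 7]
print(reduce_mod(p))

# (1 + 2x)(3 + 4x), reduced, matches the complex product (1 + 2i)(3 + 4i).
print(reduce_mod(multiply([1, 2], [3, 4])), (1 + 2j) * (3 + 4j))
```

The second line of output shows that multiplying polynomials and then reducing agrees with ordinary complex multiplication, which is the point of the construction.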
In fact, the system C that we have just defined has all the properties of the number
system that we wanted. To describe it in the familiar notation, we denote each
congruence class $\overline{p(x)}$ by the formal expression p(i), in which x is replaced by i. Thus,
$$-7i + 5i - 6 - 3 = -9 - 2i.$$
Here we have simplified by substituting $-1$ for $i^2$, and this is right; since
$$x^2 \equiv -1,$$
we have
$$\overline{x^2} = i^2 = -1;$$
any congruence between two polynomials p(x) and q(x) gives an equation between
their congruence classes p(i) and q(i). And our number system satisfies the conditions
for a field, given in Chapter 1: the CAD laws hold; there are numbers 0 and 1,
such that if
$$z = a + bi,$$
then
$$0 \cdot z = 0, \quad \text{and} \quad 1 \cdot z = z;$$
and for each z there is a number $-z$ such that
$$z + (-z) = 0.$$
Finally, every $z \neq 0$ has a reciprocal. To prove this, we first observe that
$$a + bi = 0 \;\Rightarrow\; a = b = 0.$$
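The reason is that
$$a + bi = 0 \;\Leftrightarrow\; a + bx \equiv 0 \pmod{1 + x^2} \;\Leftrightarrow\; a + bx = r(x)(1 + x^2).$$
Here r(x) must be 0, because otherwise $r(x)(1 + x^2)$ would be of degree $\ge 2$. Since
$r = 0$, $a + bx$ is the zero polynomial, and so $a = b = 0$. Similarly,
$$a - bi = 0 \;\Rightarrow\; a = b = 0,$$
and so
$$a + bi \neq 0 \;\Rightarrow\; a - bi \neq 0, \text{ and } a^2 + b^2 > 0.$$
$$\frac{1}{a + bi} = \frac{1}{a + bi} \cdot \frac{a - bi}{a - bi} = \frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2}\,i.$$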
To sum up:
Theorem 4. C is a field.
Note that when we passed from P(x) to C, by forming congruence classes modulo
$1 + x^2$, the algebraic character of the system changed: in P(x), only the constant
polynomials $p(x) = a \neq 0$ have reciprocals; but every congruence class p(i) =
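$$f(x, y) = \frac{x^2 - y^2}{x^2 + y^2}.$$
Here
$$\lim_{x \to 0} \lim_{y \to 0} \frac{x^2 - y^2}{x^2 + y^2} = \lim_{x \to 0} \frac{x^2}{x^2} = 1,$$
and
$$\lim_{y \to 0} \lim_{x \to 0} \frac{x^2 - y^2}{x^2 + y^2} = \lim_{y \to 0} \frac{-y^2}{y^2} = -1.$$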
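The disagreement between the two iterated limits can be seen numerically. In the sketch below the inner limit is approximated crudely by substituting a very small value (`tiny`, a hypothetical parameter) for the inner variable; this is an illustration of the computation above, not a proof.

```python
def f(x, y):
    return (x * x - y * y) / (x * x + y * y)


def inner_limit_in_y(x, tiny=1e-12):
    """Approximate lim_{y->0} f(x, y) for a fixed x != 0 by taking y tiny."""
    return f(x, tiny)


def inner_limit_in_x(y, tiny=1e-12):
    """Approximate lim_{x->0} f(x, y) for a fixed y != 0 by taking x tiny."""
    return f(tiny, y)


points = [10.0 ** -k for k in range(1, 6)]
first = [inner_limit_in_y(x) for x in points]    # values near 1
second = [inner_limit_in_x(y) for y in points]   # values near -1
diagonal = [f(t, t) for t in points]             # exactly 0 along y = x

print(first)
print(second)
print(diagonal)
```

Along the line $y = x$ the function is identically 0, a third value, so the double limit at the origin cannot exist.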
This sort of thing cannot happen, however, if f is continuous in D and the double
limit exists. That is, we have the following theorem:
$$\lim_{(x,y) \to (x_0,y_0)} f(x, y) = L,$$
then
$$\lim_{x \to x_0} \lim_{y \to y_0} f(x, y) = \lim_{y \to y_0} \lim_{x \to x_0} f(x, y) = L.$$
Proof. If $f(x_0, y_0)$ is defined at all, then $f(x_0, y_0)$ must be L. If $f(x_0, y_0)$ is not defined,
we define it to be L. Thus we may assume that f is continuous in a neighborhood of
$(x_0, y_0)$, including $(x_0, y_0)$.
718 Appendix K
In (1), all we are saying is that if f is continuous (as a function of two variables)
then the slice functions, for each fixed x, are also continuous. This may be easier
to keep straight if we rewrite (1) in the form
$$\lim_{y \to y_0} f(a, y) = f(a, y_0), \tag{1'}$$
which reminds us that x is fixed as $y \to y_0$. Equations (2) and (3) follow by repeated
applications of the same principle.
Since in some cases the two iterated limits are different, we always have to
investigate, in the cases where we need to know that they are the same. One such
case comes up when we consider the "mixed partial derivatives" $f_{xy}$ and $f_{yx}$. If we
write in full the definitions of $f_{xy}(x_0, y_0)$ and $f_{yx}(x_0, y_0)$, we see that they are iterated
limits of the same function:
$$f_{xy}(x_0, y_0) = \lim_{\Delta y \to 0} \frac{f_x(x_0, y_0 + \Delta y) - f_x(x_0, y_0)}{\Delta y}$$
$$= \lim_{\Delta y \to 0} \frac{1}{\Delta y}\left[\lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x, y_0 + \Delta y) - f(x_0, y_0 + \Delta y)}{\Delta x} - \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x, y_0) - f(x_0, y_0)}{\Delta x}\right]$$
Let us now investigate the function F. This function can be regarded as the
difference of two values of the function
$$\psi(y) = f_x(\bar{x}, y);$$
we have
$$F(\Delta x, \Delta y) = [\psi(y_0 + \Delta y) - \psi(y_0)]\,\Delta x.$$
By MVT,
$$\psi(y_0 + \Delta y) - \psi(y_0) = \psi'(\bar{y})\,\Delta y,$$
where $\bar{y}$ is between $y_0$ and $y_0 + \Delta y$. This gives
$$F(\Delta x, \Delta y) = f_{xy}(\bar{x}, \bar{y})\,\Delta y\,\Delta x.$$
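The relation between F and the mixed partial derivative can be checked numerically. The sketch below makes two assumptions that are not spelled out in the surviving text: that F is the second difference $f(x_0 + \Delta x, y_0 + \Delta y) - f(x_0, y_0 + \Delta y) - f(x_0 + \Delta x, y_0) + f(x_0, y_0)$, which is what the iterated limits for $f_{xy}$ suggest, and a hypothetical test function $f(x, y) = x^3 y^2$, for which $f_{xy} = 6x^2 y$.

```python
def f(x, y):
    """Hypothetical smooth example: f(x, y) = x^3 y^2, so f_xy = 6 x^2 y."""
    return x ** 3 * y ** 2


def second_difference(f, x0, y0, dx, dy):
    """F(dx, dy), assumed here to be the second difference of f at (x0, y0)."""
    return (f(x0 + dx, y0 + dy) - f(x0, y0 + dy)
            - f(x0 + dx, y0) + f(x0, y0))


x0, y0 = 1.0, 2.0
exact = 6.0 * x0 ** 2 * y0   # f_xy(1, 2) = 12

for h in (1e-1, 1e-2, 1e-3, 1e-4):
    quotient = second_difference(f, x0, y0, h, h) / (h * h)
    print(h, quotient, abs(quotient - exact))
```

The quotient $F/(\Delta x\,\Delta y)$ approaches 12 as the increments shrink together, which is what continuity of $f_{xy}$ leads one to expect.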
Suppose now that $f_{xy}$ is continuous. We then have
[Since $|\bar{x} - x_0| < \Delta x$ and $|\bar{y} - y_0| < \Delta y$, we must have $(\bar{x}, \bar{y}) \to (x_0, y_0)$ as
$(\Delta x, \Delta y) \to (0, 0)$.]
Let us now take stock. We had
1) $f_{xy}(x_0, y_0) = \lim_{\Delta y \to 0} \lim_{\Delta x \to 0} \dfrac{1}{\Delta x\,\Delta y} F(\Delta x, \Delta y)$ (by definition).
exists, and is equal to $f_{xy}(x_0, y_0)$. This is what we have just proved. Suppose now
that $f_{yx}$ is also defined in a neighborhood of $(x_0, y_0)$, and is continuous. Then
Since the double limit in (2) exists, the iterated limit in (3) must be equal to it.
Therefore $f_{xy}(x_0, y_0) = f_{yx}(x_0, y_0)$. Thus we have proved the following theorem:
Theorem 2. If $f_{xy}$ and $f_{yx}$ exist and are continuous, in a domain D, then $f_{xy} = f_{yx}$.
Proof
Therefore
$$(f_{xyx})_y = (f_{xxy})_y,$$
and $f_{xyxy} = f_{xxyy}$, which was to be proved.
Warning: It is not true that if $f_{xy}$ exists and is continuous, then $f_{yx}$ also exists and
is the same. To see this, let
$$f(x, y) = \phi(y),$$
Some of the definitions that we have used in Chapter 14, and the hypotheses of some
of our theorems, may seem needlessly strong. In fact, they are not. The theory of
functions of two variables includes some rather odd and unexpected phenomena;
and if we want to draw simple conclusions, we need to use hypotheses sufficiently
strong to rule out the oddities. Some of these are as follows.
1) all the slice functions f(x, y0) (with y held constant) are continuous, and
2) all the slice functions f (x0, y) (with x held constant) are continuous, but
3) f is not continuous.
Proof. In the first quadrant of the xy-plane we take an infinite sequence $D_1, D_2, \ldots$
of circular disks, not intersecting each other, with radii approaching 0, and approaching
the origin as a limit.
[Figure: on the left, the disks $D_i$ centered on the line $y = x$; on the right, a "blister" of height 1 over a disk.]
As indicated in the figure on the left, we take these disks with their centers on the
line $y = x$, in such a way that no horizontal or vertical line intersects more than one
of them. This is easy to arrange, because we can make the disks as small as we want.
If (x, y) lies in none of the disks $D_i$, then we define f(x, y) to be 0. Over each disk
$D_i$, the graph of f is a "blister" of height 1, shown on the right.
Obviously f is not continuous at (0, 0). But all the slice functions $\phi(x) = f(x, y_0)$
are continuous. Since no horizontal line intersects more than one of the disks $D_i$,
it follows that the graph of $\phi$ looks, at worst, like the graph shown below.
722 Appendix L
[Figure: two possible graphs of $\phi$; in the one on the left the maximum value is 1.]
On the left we see what happens if the line $y = y_0$ passes through the center of a disk.
If this doesn't happen, then the maximum of $\phi$ is smaller; and of course $\phi(x)$ may
be 0 for every x. Similarly for the slice functions for constant x.
Here, of course, the slice function
$$\phi_{\pi/4}(t) = f\left(\frac{t}{\sqrt{2}}, \frac{t}{\sqrt{2}}\right)$$
is not continuous. Its graph is shown below. But even if a function f has slice functions
Proof. Consider the parabola $y = x^2$, in the xy-plane. Between the parabola and the
x-axis we take a sequence $D_1, D_2, \ldots$ of circular disks, with radii approaching 0,
approaching the origin as a limit. On each disk, the graph of f is a blister of height 1,
as in Example 1; everywhere else, $f(x, y) = 0$.
Possible Peculiarities of Functions of Two Variables 723
As before, f is not continuous. But all the slice functions are. The reason is that
no line L intersects more than a finite number of the disks $D_i$:
[Figure: three positions of a line L relative to the parabola and the disks.]
If L does not pass through the origin (as on the left above) or if L passes through the
origin and has negative slope (as in the center), this conclusion is trivial. The interesting
case is shown on the right. Here L passes through (0, 0) and has positive slope.
Near the origin, in the first quadrant, the line lies above the parabola and the disks
lie below it. Therefore L cannot intersect infinitely many disks. Therefore the slice
functions defined along any line L are continuous.
Our next peculiar function is going to be continuous. We recall that in Section
14.8 we proved the following theorem:
Theorem A. Given a function f, defined in a neighborhood of $(x_0, y_0)$. For each $\theta$, let
$$\phi_\theta(t) = f(x_0 + t\cos\theta,\ y_0 + t\sin\theta).$$
Suppose that $\phi_\theta'(0) = 0$ for every $\theta$; and suppose that there is a number $\delta > 0$
such that for $|t| < \delta$ we have $\phi_\theta''(t) < 0$ for every $\theta$. Then f has an ILMax at $(x_0, y_0)$.
The reason is that for each $\theta$, $\phi_\theta(0)$ is the maximum value of $\phi_\theta$ on the interval
$(-\delta, \delta)$. See the figure below.
[Figure: graph of $\phi_\theta$, with $\phi_\theta'(0) = 0$ and $\phi_\theta''(t) < 0$ for $-\delta < t < \delta$.]
Here it is essential that there be a single number $\delta > 0$ which works for every $\theta$.
? Theorem B? Given a function f, defined in a neighborhood of $(x_0, y_0)$. For each $\alpha$, let
Example 3. There is a continuous function f such that (1) every slice function
through the origin has an ILMax at the origin, but (2) f does not have an ILMax
at the origin.
This is similar to Example 2. As before, we take a sequence of disks lying under a
parabola. We define f(x, y) to be 0 everywhere except on the disks. But this time,
we take the blister over the ith disk $D_i$ in such a way that its height is $1/i$. Now our
function f is continuous.
As before, no line L in the xy-plane intersects more than a finite number of the disks.
Therefore every slice function
$$\phi_\alpha(t) = f(t\cos\alpha,\ t\sin\alpha)$$
is equal to 0 in a neighborhood of 0. That is, for each $\alpha$ there is a $\delta_\alpha > 0$ such that
$\phi_\alpha(t) = 0$ for $|t| < \delta_\alpha$. Therefore every $\phi_\alpha$ has an ILMax at 0. But obviously f does
not have an ILMax at (0, 0); $f(0, 0) = 0$, but every neighborhood of (0, 0) contains
a disk $D_i$, on which $f(x, y) > 0$.
The trouble here is that while for every $\alpha$ there is a $\delta_\alpha$ with the desired property,
there is no one $\delta$ which works for every $\alpha$. If $\alpha > 0$ and $\alpha \approx 0$, then $\delta_\alpha \approx 0$; and so
$\inf\{\delta_\alpha\} = 0$. If you reread the proof of Theorem 3, Section 14.8, you will see how
this trouble was avoided: using the continuity of $f_{xy}$, $f_{xx}$, and $f_{yy}$, we found a single
$\delta > 0$ which worked for every $\alpha$. Thus the proof of Theorem 3 was not merely
complicated in a technical way but was also subtle, in a way which is not likely to be
understood unless we re-examine the proof in the light of Example 3.
APPENDIX M  Maxima and Minima, for Functions of Two Variables
Here we give a brief sample of the way the theory of continuous functions of one
variable can be extended so as to apply to functions $f: D \to R$, where D is a domain
in a Cartesian space $R^n$ $(n > 1)$.
Theorem 1. Let D be a closed rectangular region in $R^2$, defined by the inequalities
$a \le x \le b$ and $c \le y \le d$. Let f be a continuous function $D \to R$. Then f is
bounded above.
Proof Suppose that f is not bounded above. We shall show that this leads to a
contradiction.
The region D is the union of four closed rectangular regions, shown in the figure
on the left below. These will be called quarters of D. These are like the "halves" of an
interval [a, b], as defined in Section 5.6; and they are going to be used in exactly the
same way.
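[Figure: on the left, the region D over the interval from a to b, divided at $(a + b)/2$ into quarters; on the right, a region $D_i$ with its projections $I_i$ and $J_i$ on the axes.]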
Following the pattern of Section 5.6, we say that a closed rectangular region D' is
good if f is bounded on D'; and D' is bad if f is unbounded on D'. We are assuming
that the given D is bad. It follows that one of the quarters of D must be bad. (Why?
See Lemma 1 on page 240.) Let $D_1$ be a bad quarter of D. Similarly, let $D_2$ be a bad
quarter of $D_1$. Proceeding in this way, we get a sequence
$$D_1, D_2, \ldots$$
of closed rectangular regions, each of which is bad, such that for each i, $D_{i+1}$ is a
quarter of $D_i$. As indicated in the figure on the right above, let $I_i$ and $J_i$ be the closed
intervals which are the projections of $D_i$ onto the x- and y-axes. Then $I_1, I_2, \ldots$
is a nested sequence. By the Nested Interval Postulate (NIP) there is an $\bar{x}$ which lies
on each interval $I_i$. Similarly, there is a $\bar{y}$ which lies on every interval $J_i$. It follows
that the point $P = (\bar{x}, \bar{y})$ lies in every region $D_i$.
But f is continuous at P. Thus for every $\epsilon > 0$ there is a $\delta > 0$ such that
that is, f is bounded on the circular disk with center at P and radius $\delta$. But this
circular disk contains some $D_i$, because P lies in all the $D_i$'s, and the height and width
of $D_i$ both $\to 0$ as $i \to \infty$. Therefore $D_i$ must be good for some i, which contradicts
our hypothesis.
Theorem 2. If f is continuous on a closed rectangular region D, then f has a maximum
value on D.
The proof is exactly like the proof of Theorem 3 of Section 5.6. Let $k = \sup f$.
If $k = f(P)$ for some P, then f has its maximum value at P. If $f(P) < k$ for every P
in D, let
$$g(P) = \frac{1}{k - f(P)}.$$
Then g is continuous on D, but is not bounded above; and this contradicts Theorem 1.
As before, the existence of maxima gives, as a corollary, the existence of minima:
Theorem 3. If f is continuous on a closed rectangular region D, then f has a minimum
value on D.
(Proof. Any maximum value of $-f$ is a minimum value of f.)
The same scheme works for continuous functions defined on an "n-dimensional
interval"
$$D = \{(x_1, x_2, \ldots, x_n) \mid a_i \le x_i \le b_i \text{ for } i = 1, 2, \ldots, n\}.$$
We use a subdivision process just as in $R^1$ and $R^2$, dividing our "interval," at each
stage, into $2^n$ parts.
APPENDIX N  An Exact Definition of the Idea of a Function
We shall now approach our new definition in the following two steps.
Step 1. We regard the function as being indistinguishable from its graph, so that the
function f becomes a set of points P, in a coordinate plane. (In this case, the function
is a parabola.)
$$P = (x, x^2)$$
The graph now becomes a collection of ordered pairs of real numbers, namely,
$$f = \{(x, x^2)\}.$$
This collection of ordered pairs has the property that each real number x is the first
term of exactly one ordered pair (x, y) in the set. (This is because the graph intersects
every vertical line in exactly one point.)
This final description of f, as a collection of ordered pairs $\{(x, x^2)\}$, can be
generalized to apply to functions of any kind, on any domain. The final definition is
as follows.
Definition. Let A and B be sets. Let f be a collection of ordered pairs (a, b). Suppose
that
1) if (a, b) belongs to f, then a belongs to A and b belongs to B, and
2) every element a of A is the first term of exactly one pair belonging to f.
Then f is a function of A into B, and we write
$$f: A \to B.$$
For each a in A, f(a) denotes the second term of the ordered pair whose first term is a.
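For finite sets, the two conditions of this definition can be checked directly. The sketch below is only an illustration: the function names `is_function` and `value` and the sample sets are hypothetical, and a finite collection of pairs stands in for the graph $\{(x, x^2)\}$.

```python
def is_function(pairs, A, B):
    """Check conditions (1) and (2) of the definition for a finite set of pairs."""
    # 1) every pair (a, b) has a in A and b in B
    if not all(a in A and b in B for a, b in pairs):
        return False
    # 2) every element of A is the first term of exactly one pair
    first_terms = [a for a, _ in pairs]
    return all(first_terms.count(a) == 1 for a in A)


def value(pairs, a):
    """f(a): the second term of the ordered pair whose first term is a."""
    return next(b for first, b in pairs if first == a)


A = {-2, -1, 0, 1, 2}
B = {0, 1, 4}
squares = {(x, x * x) for x in A}           # a finite piece of the graph of x^2
not_a_function = {(0, 0), (1, 1), (1, 4)}   # 1 is the first term of two pairs

print(is_function(squares, A, B), value(squares, -2))
print(is_function(not_a_function, {0, 1}, B))
```

The second collection fails condition (2), which corresponds to a vertical line meeting the graph in two points.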
$$f = \mathrm{Sin}^{-1} = \left\{(x, y) \;\middle|\; -1 \le x \le 1,\ -\frac{\pi}{2} \le y \le \frac{\pi}{2},\ x = \sin y\right\}.$$
Here again the idea is that the function is defined to be its graph, and the graph is
regarded as a set of ordered pairs of real numbers. Note, however, that our general
Table 1
Natural Trigonometric Functions

Angle  Radians  Sine   Cosine  Tangent    Angle  Radians  Sine   Cosine  Tangent
 0°    0.000    0.000  1.000   0.000
 1°    0.017    0.017  1.000   0.017      46°    0.803    0.719  0.695   1.036
 2°    0.035    0.035  0.999   0.035      47°    0.820    0.731  0.682   1.072
 3°    0.052    0.052  0.999   0.052      48°    0.838    0.743  0.669   1.111
 4°    0.070    0.070  0.998   0.070      49°    0.855    0.755  0.656   1.150
 5°    0.087    0.087  0.996   0.087      50°    0.873    0.766  0.643   1.192
 6°    0.105    0.105  0.995   0.105      51°    0.890    0.777  0.629   1.235
 7°    0.122    0.122  0.993   0.123      52°    0.908    0.788  0.616   1.280
 8°    0.140    0.139  0.990   0.141      53°    0.925    0.799  0.602   1.327
 9°    0.157    0.156  0.988   0.158      54°    0.942    0.809  0.588   1.376
10°    0.175    0.174  0.985   0.176      55°    0.960    0.819  0.574   1.428
11°    0.192    0.191  0.982   0.194      56°    0.977    0.829  0.559   1.483
12°    0.209    0.208  0.978   0.213      57°    0.995    0.839  0.545   1.540
13°    0.227    0.225  0.974   0.231      58°    1.012    0.848  0.530   1.600
14°    0.244    0.242  0.970   0.249      59°    1.030    0.857  0.515   1.664
15°    0.262    0.259  0.966   0.268      60°    1.047    0.866  0.500   1.732
16°    0.279    0.276  0.961   0.287      61°    1.065    0.875  0.485   1.804
17°    0.297    0.292  0.956   0.306      62°    1.082    0.883  0.469   1.881
18°    0.314    0.309  0.951   0.325      63°    1.100    0.891  0.454   1.963
19°    0.332    0.326  0.946   0.344      64°    1.117    0.899  0.438   2.050
20°    0.349    0.342  0.940   0.364      65°    1.134    0.906  0.423   2.145
21°    0.367    0.358  0.934   0.384      66°    1.152    0.914  0.407   2.246
22°    0.384    0.375  0.927   0.404      67°    1.169    0.921  0.391   2.356
23°    0.401    0.391  0.921   0.424      68°    1.187    0.927  0.375   2.475
24°    0.419    0.407  0.914   0.445      69°    1.204    0.934  0.358   2.605
25°    0.436    0.423  0.906   0.466      70°    1.222    0.940  0.342   2.747
26°    0.454    0.438  0.899   0.488      71°    1.239    0.946  0.326   2.904
27°    0.471    0.454  0.891   0.510      72°    1.257    0.951  0.309   3.078
28°    0.489    0.469  0.883   0.532      73°    1.274    0.956  0.292   3.271
29°    0.506    0.485  0.875   0.554      74°    1.292    0.961  0.276   3.487
30°    0.524    0.500  0.866   0.577      75°    1.309    0.966  0.259   3.732
31°    0.541    0.515  0.857   0.601      76°    1.326    0.970  0.242   4.011
32°    0.559    0.530  0.848   0.625      77°    1.344    0.974  0.225   4.332
33°    0.576    0.545  0.839   0.649      78°    1.361    0.978  0.208   4.705
34°    0.593    0.559  0.829   0.675      79°    1.379    0.982  0.191   5.145
35°    0.611    0.574  0.819   0.700      80°    1.396    0.985  0.174   5.671
36°    0.628    0.588  0.809   0.727      81°    1.414    0.988  0.156   6.314
37°    0.646    0.602  0.799   0.754      82°    1.431    0.990  0.139   7.115
38°    0.663    0.616  0.788   0.781      83°    1.449    0.993  0.122   8.144
39°    0.681    0.629  0.777   0.810      84°    1.466    0.995  0.105   9.514
40°    0.698    0.643  0.766   0.839      85°    1.484    0.996  0.087   11.43
41°    0.716    0.656  0.755   0.869      86°    1.501    0.998  0.070   14.30
42°    0.733    0.669  0.743   0.900      87°    1.518    0.999  0.052   19.08
43°    0.750    0.682  0.731   0.933      88°    1.536    0.999  0.035   28.64
44°    0.768    0.695  0.719   0.966      89°    1.553    1.000  0.017   57.29
45°    0.785    0.707  0.707   1.000      90°    1.571    1.000  0.000
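Entries of Table 1 can be spot-checked with a few lines of Python. This is only a sketch: the function name `trig_row` is hypothetical, and it always formats to three decimal places, whereas the printed table switches to four significant figures for the largest tangents.

```python
import math

def trig_row(deg):
    """Return (radians, sine, cosine, tangent) for an angle in degrees as 3-decimal strings."""
    rad = math.radians(deg)
    # The tangent is undefined at 90 degrees, so that entry is left blank.
    tan = "" if deg == 90 else f"{math.tan(rad):.3f}"
    return (f"{rad:.3f}", f"{math.sin(rad):.3f}", f"{math.cos(rad):.3f}", tan)

for deg in (0, 30, 45, 60, 90):
    print(f"{deg:>2}°", *trig_row(deg))
```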
Table 2
Exponential Functions
Table 3
Natural Logarithms of Numbers
  n     log_e n       n     log_e n       n     log_e n
 0.0*      -         4.5    1.5041       9.0    2.1972
 0.1    7.6974       4.6    1.5261       9.1    2.2083
 0.2    8.3906       4.7    1.5476       9.2    2.2192
 0.3    8.7960       4.8    1.5686       9.3    2.2300
 0.4    9.0837       4.9    1.5892       9.4    2.2407

 1.3    0.2624       5.8    1.7579       13     2.5649
 1.4    0.3365       5.9    1.7750       14     2.6391

 4.2    1.4351       8.7    2.1633
 4.3    1.4586       8.8    2.1748
 4.4    1.4816       8.9    2.1861
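The surviving entries of Table 3 can be reproduced with `math.log`. The entries for n < 1 appear to be tabulated with 10 added (for instance, 7.6974 - 10 is approximately the natural logarithm of 0.1); that convention is inferred from the numbers rather than stated in the surviving text, and the function name `log_table_entry` is hypothetical.

```python
import math

def log_table_entry(n):
    """Natural logarithm of n to 4 places; for 0 < n < 1 return ln(n) + 10 (assumed convention)."""
    value = math.log(n)
    if n < 1:
        value += 10  # assumed: entries below 1 are printed with 10 added
    return round(value, 4)

for n in (0.1, 0.4, 4.5, 9.0, 14):
    print(n, log_table_entry(n))
```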
Selected Answers
1. x 3 = 3. y 4, x � 0
=
5. x2 + y2 + 4x - 4y + 4 = 0 7. y 2x, x � - 1
=
9. x2 - 2x + y2 - 3 0 = 21. b) y x =
23. x = t
9. crosses x-axis at (0, 0), (2, 0), (-2, 0); tangent horizontal where $x = \pm 2/\sqrt{3}$; slope at (0, 0) = -4; y > 0 if -2 < x < 0 or 2 < x; y < 0 if x < -2 or 0 < x < 2
1. 14 3. 50
5. 33 7. b3 + 2c3 + b4 + 2c4 + b5 + 2c5
k
9. m7 + (m + 1)7 + + n7 11. �>2
i=3
· · ·
1 •
itXQ
3 3 •
_4,Q
3 5 •
!lQ
3
7. t 9. t 11. t
(n � > also works. Note that you were not asked to find the smallest possible n. )
17. a) n > 98 b) n > -
1
€
-2
1
1. 70x9 - 8x7 3.
+ 1)2
(x
-2y3 - 3 -1
5. 7
(y3 - 3)2 . (x - 1)2
9. 3(1 + x)2 11. a) 3ay + 2x b) 3xy + 3a 2
13. $2(x^2 - 1)/(x^2 + x + 1)^2$   15. $4x^3 + 6x^2 + 2x$
17. $1/(2\sqrt{x + 1})$   19. $-x/\sqrt{1 - x^2}$
-1 1
21. 23. ---:====
2.Y v(1 - x2)s
�
2x + 3 -3x + 1
1. --;===== 3
2Y(x + l)(x + 2) . (x + l)s
x -1
5. 712(x3 + x2 - x + 7)711(3x2 + 2x - 1) 7. -:===
.Yx(x - 2)
9. $(-3x^2 + 4x + 1)/\bigl(2\sqrt{x - 1}\,(x^2 + 1)^2\bigr)$   11. $-1/\bigl(\sqrt{1 - x}\,\sqrt{(1 + x)^3}\bigr)$
13. 1 if x > 0, -1 if x < 0   15. $3(2x^3 y - 3x^2 y^2)(x^3 y^2 - x^2 y^3)^2$
19. f(3x2 + l)(x3 + x)l/2 21. y11211
23. $\tfrac{5}{2}(x^2 + 3x + 1)^{3/2}(2x + 3)$   25. b) $\tfrac{1}{3}x^{-2/3}$
27. $\tfrac{p}{q}\,x^{(p/q) - 1}$
1. a) $x^2/2$  b) $-x^2/2$  c) $\tfrac{1}{2}x|x|$  d) $x$  e) $-x$  f) $|x|$
3. a) $x$  b) $-x$  c) $|x|$  d) 1  e) -1  f) sig x
1. t
bll 1 1 b101 1
3. a) U b) - (b11 - a11) c) ( _ alOl) d) -- (x n+l - an+l)
101 1
.
11
_
n +
x4 x2
5. a) -! b)- + - -x 7. a) vs - v2 b) 0
4 2
t2 t5 2 1
1. ft2+4t+4 3. -2 +2t+3 5. +t - 20
20
t3 2
7. -Vl- x2 9. 2v/ - 2v2+ 5 1i.3+t+3
-1
l3. 3(1 + t3) +
7
15.
20
, a(t) -gin the time interval
[ ]
0,
20
6
=
g g
1. $x^2$, $2x$   3. $x^4 - x$, $4x^3 - 1$
4x7
5. VI+x8, 7. v:;,
VI+ x8
I -2x -
5
I
9. II. -vx(x2+I), -2xvx -
-
J
I+ x I
I7. -- ' -;::==-;:==:::::; I9. 2VI+(2x)8
I- x VI+x V(I - x)3
9. a) -I b) -cos 0
9. $2\cos 2x$   11. 0   13. $-1/(1 + \sin x)$
7. 1
1. $g(x) = \sin x$, $f(u) = u^2$, $f'(u) = 2u$, $g'(x) = \cos x$, $f'(g) = 2\sin x$, $\varphi'(x) = 2\sin x\cos x$
3. $g(x) = \sin x + \cos x$, $f(u) = u^2$, $f'(u) = 2u$, $g'(x) = \cos x - \sin x$, $f'(g) = 2(\sin x + \cos x)$, $\varphi'(x) = 2\cos 2x$
5. $g(x) = 2x$, $f(u) = \tan u$, $f'(u) = \sec^2 u$, $g'(x) = 2$, $f'(g) = \sec^2 2x$, $\varphi'(x) = 2\sec^2 2x$
7. $g(x) = 1 - x^2$, $f(u) = \sqrt{u}$, $f'(u) = 1/(2\sqrt{u})$, $g'(x) = -2x$, $f'(g) = 1/(2\sqrt{1 - x^2})$, $\varphi'(x) = -x/\sqrt{1 - x^2}$
9. $g(x) = 1 + x$, $f(u) = u^{1/3}$, $f'(u) = \tfrac{1}{3}u^{-2/3}$, $g'(x) = 1$, $f'(g) = \tfrac{1}{3}(1 + x)^{-2/3}$, $\varphi'(x) = \tfrac{1}{3}(1 + x)^{-2/3}$
11. $g(x) = \cos x$, $f(u) = \int_0^u (t^2 + 1)\,dt$, $f'(u) = u^2 + 1$, $g'(x) = -\sin x$, $f'(g) = \cos^2 x + 1$, $\varphi'(x) = -\sin x\,(\cos^2 x + 1)$
13. $-3x^5\cos(x^6)$
15. $g(x) = x^3$, $f(u) = \sin u$, $f'(u) = \cos u$, $g'(x) = 3x^2$, $f'(g) = \cos x^3$, $\varphi'(x) = 3x^2\cos x^3$
17. $\varphi'(x) = \cos x\,\sqrt{1 + \sin^2 x}$
19. 0   21. 0
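The chain-rule answers above can be sanity-checked numerically. The sketch below compares a central-difference estimate of the derivative of the composite function with the closed form given in answers 1, 3, 5, and 15. The composite functions are read off from the listed g and f; the helper names, step size, and sample points are assumptions made for the example.

```python
import math

def derivative(func, x, h=1e-6):
    """Central-difference approximation to func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

# answer number -> (composite function f(g(x)), claimed derivative)
cases = {
    1: (lambda x: math.sin(x) ** 2, lambda x: 2 * math.sin(x) * math.cos(x)),
    3: (lambda x: (math.sin(x) + math.cos(x)) ** 2, lambda x: 2 * math.cos(2 * x)),
    5: (lambda x: math.tan(2 * x), lambda x: 2 / math.cos(2 * x) ** 2),
    15: (lambda x: math.sin(x ** 3), lambda x: 3 * x ** 2 * math.cos(x ** 3)),
}

def max_error(phi, dphi, points=(0.1, 0.3, 0.5)):
    """Largest gap between the numerical and the claimed derivative on the sample points."""
    return max(abs(derivative(phi, x) - dphi(x)) for x in points)

for number, (phi, dphi) in cases.items():
    print(number, max_error(phi, dphi))
```

Small errors here are consistent with the answers, though a numerical check of this kind is evidence rather than proof.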
7. $(3x^2 + 1)\cos(x^3 + x)$   9. $-\dfrac{1}{2\sqrt{x}}\sin\sqrt{x}$   11. $\tfrac{1}{2}\sec^2\dfrac{x - 1}{2}$
13. 1
15. a) $2\sec^2 x\tan x$  b) $2x\sec x^2\tan x^2$   17. $-2\sin 2x$
41. 0 43. 0
1. $1/\sqrt{2x - x^2}$   3. $1/(2 + 2x + x^2)$   5. 1
7. $2x$   9. $-x/\bigl(|x|\sqrt{1 - x^2}\bigr)$   11. $2x/(2 + 2x^2 + x^4)$
13. $2/\bigl(x\sqrt{x^4 - 1}\bigr)$   15. $-1/\sqrt{1 - x^2}$   17. $1/\bigl(x\sqrt{x^2 - 1}\bigr)$
x 1
19. 21 . 23.
.Y1 +x2
(1 + x2)a/2 (1 _
x2)a;2
1 2x 77"
25. 27. 1. 3
Y -x - x2 .Y1 - x4
33. $1 - 1/\sqrt{2}$   35. $f^{-1}(x) = \sqrt{1 - x^2}$, $0 \le x \le 1$
37. $\pi/3$
1. $\log_e x + 1$, $1/x$   3. $(1 + 2x)e^{2x}$, $(4 + 4x)e^{2x}$
5. $(\sin x + \cos x)e^x$, $2e^x\cos x$   7. $2/x$, $-2/x^2$
9. $500/x$, $-500/x^2$   11. $(\log_e 10)\,10^x$, $(\log_e 10)^2\,10^x$
13. 0, 0   15. $(2x + x^2)e^x$, $(2 + 4x + x^2)e^x$
17. $(2x^2 + 1)e^{x^2}$, $(4x^3 + 6x)e^{x^2}$   19. $-1/(1 - x)$, $-1/(1 - x)^2$
21. $e^{x-1}$, $e^{x-1}$   23. $\tan x$, $\sec^2 x$
25. $\cot x$, $-\csc^2 x$   27. $\sec x$, $\sec x\tan x$
29. $-\csc x$, $\csc x\cot x$
1. $(2/x)\ln x$   3. $2x/(x^2 + 1)$   5. $2x\exp x^2$
7. $2x$   9. $\cos x\,\exp\sin x$   11. $(\ln x + 1)\exp(x\ln x)$
27. 2 29. 2
23. $4\cosh^3 x - 3\cosh x$   31. $\sqrt{1 + x^2}$   33. $2/\sqrt{1 + 4x^2}$
39. $2x/\sqrt{x^4 - 1}$   43. $x = \ln 2$   45. $x = \ln(2y)$
5. increasing on $[-2, -1]$ and $[1, 2]$, decreasing on $[-1, 1]$
7. increasing on $[-\pi, -3\pi/4]$, $[-\pi/4, \pi/4]$, and $[3\pi/4, 5\pi/4]$, decreasing on $[-3\pi/4, -\pi/4]$ and $[\pi/4, 3\pi/4]$
13. increasing on $[-2, -1]$ and $[0, 1]$, decreasing on $[-1, 0]$ and $[1, 2]$
15. increasing on $[-1, 0]$, decreasing on $[0, 1]$
17. increasing on $[-\pi, -\pi/2]$ and $[\pi/2, \pi]$, decreasing on $[-\pi/2, \pi/2]$
19. increasing on $[\ln 2, 2]$, decreasing on $[0, \ln 2]$
1. local maxima $-\pi$, $\pi/2$; local minima $-\pi/2$, $\pi$; maximum $\pi/2$; minimum $-\pi/2$;
inflection point 0; image $[-1, 1]$; concave upward $[-\pi, 0]$; concave downward $[0, \pi]$
3. local maximum 1; local minima -2, 2; maximum 1; minima -2, 2; inflection points
-l/v'3, l/v'3; image [i-, lJ; concave upward [-2, -l/v'3] and [l/v'3, 2]; concave
downward [ -l/v'3, l/v'3]
5. local maxima -1, 2; local minima -2, 1; maxima -1, 2; minima -2, 1; inflection
point 0; image $[-2, 2]$; concave upward $[0, 2]$; concave downward $[-2, 0]$
7. local maxima $-3\pi/4$, $\pi/4$, $\pi$; local minima $-\pi$, $-\pi/4$, $3\pi/4$; maxima $-3\pi/4$, $\pi/4$;
minima $-\pi/4$, $3\pi/4$; inflection points $-\pi/2$, 0, $\pi/2$; image $[-1, 1]$; concave upward
$[-\pi/2, 0]$ and $[\pi/2, \pi]$; concave downward $[-\pi, -\pi/2]$ and $[0, \pi/2]$
9. local maximum 1; local minimum 0; maximum 1; minimum 0; inflection points none;
image $[1, e - 2]$; concave upward $[0, 1]$; concave downward { }
11. local maxima 0, $2\pi$; local minimum $\pi$; maxima 0, $2\pi$; minimum $\pi$; inflection points
$\pi/2$, $3\pi/2$; image $[-1, 1]$; concave upward $[\pi/2, 3\pi/2]$; concave downward $[0, \pi/2]$
and $[3\pi/2, 2\pi]$
13. local maxima -1, 1; local minima -2, 0, 2; maxima -1, 1; minima -2, 2; inflection
-v'i, + v'k; image[-8, 1]; concave upward [-v'i, v'l]; concave downward
points
[-2, -v'l] and [v'i, 2]
15. local maximum O; local minima -1, 1; maximum O; minima -1, l; inflection points
_
{fl; image [t,
{! % ,
l]; concave upward [-1, _
{ff] and [{If, l]; concave
lim f(x) = - oo
x--2-
3. maxima none; minima none; local maximum t; concave upward (-oo, -2) and
(3, oo); concave downward (-2, 3); inflection points none;
limf(x) = 0, lim f(x) = 0, Jim f(x) = -oo, Jim f(x) = oo,
:2:-+00 x---oo X--..-2+ X-+-2-
5. maxima none; minima none; local maxima none; concave upward $(-\infty, 0)$ and $(0, \infty)$;
concave downward { }; inflection points none;
$\lim_{x\to\infty} f(x) = 0$, $\lim_{x\to-\infty} f(x) = 0$, $\lim_{x\to 0^+} f(x) = \infty$, $\lim_{x\to 0^-} f(x) = \infty$
7. maxima none; minimum 0; local maxima none; concave upward $[-1/\sqrt{3}, 1/\sqrt{3}]$;
concave downward $(-\infty, -1/\sqrt{3}]$ and $[1/\sqrt{3}, \infty)$; inflection points $-1/\sqrt{3}$, $1/\sqrt{3}$;
$\lim_{x\to\infty} f(x) = 1$, $\lim_{x\to-\infty} f(x) = 1$
9. maxima t; minima none; local maxima t; no local minima; concave upward (-oo, O]
and [l, oo); concave downward [O, 1]; inflection points 0, 1;
Iim f(x) = 0, lim/(x) = 0, Jim f(x) = !; Jim f(x) = !
X---+--00 X-+CO X--+-1- x.-.-1+
11. no maxima; no minima; no local maxima; no local minima; concave upward ( - oo, -1)
and [o, -fit]; concave downward ( -1, O] and [-fit, oo); inflection points 0, -fit;
lim /(x) = 1, Iim f(x) = 1, Jim f(x) = oo, Jim f(x) = - oo
X-+-co x_.co X-+-1- x--..-1 +
1. a 2
3. 2a2
2 2v'2
7 x =-a
· v3 •
y = --
v3
a 9 128 in.3
·
w2 k3
13. 15
8 • 1728
1 17'
19. 2 21. - 23. 2 +217'k, k an integer
{1 3
25. -
{13
4 '17'r3
35. 39. (3 +2v'2)r2
3v'3
1. t 3. t 5. 1 7. l/v'3
9. h/d 11. v2 13. 4x3 + 4.ff' = 0
1. $\sqrt{2}$   3. 2
1 . 3 •
5. (sin x + x cos x)e"' sm"' 7. e"'
3x2 4x
9. 3x2 cos2 x
11. cp(u) eu214, cp'(u) (u/2)eu2/4, cp' (g) xe"'2
= = =
'
13. cp(�) eu sin u, cp' (u) (sin u + u cos u)eu sin u, cp (g)
=
= = (sin x + x cos x)e"' sin'"
15. cp(u) = e"312, cp'(u) fu112eu312, cp'(g)
= i x e"'3 =
cos x
17. cp(u) = v'� cp'(u) = - �, cp'(g)
v'1 - u
=
sin x
1
19. cos (t)/e1 21. 2t3 23. -
xe"'
1
2 5. cp (u) = _ / , cp'(u) =
'
cp (g)
-u
,
-tan t cos3 t -sin t cos2 t
= =
+ u2 (1 + u2)3/2
·
v1
6 6t5
27. cp(u) (Tan-1 u)6, cp' (u) -- (Tan-1 u)5 , cp' (g)
+ u2 1 + (tan t)5
= = = ----
/ (!)
2
= -(1 - 14)114,f;(t) !:(4t4) (1 - t4)-3/4
=
-t3/f{ =
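25. $\{(e^x - e^{-x})^3 - e^x + e^{-x} + C\}$   27. $\{\tfrac{1}{3}\operatorname{Sin}^{-1}(x^3/\sqrt{2}) + C\}$
29. $\{\tfrac{1}{3\sqrt{2}}\operatorname{Sin}^{-1}(\sqrt{2}\,x^3) + C\}$   31. $\{\tfrac{1}{2}(\operatorname{Sin}^{-1} x)^2 + C\}$
45. $\{\tfrac{1}{4}\ln^4 x + C\}$
47. $\{\tfrac{1}{4}\ln^2|x^2 + 2x| + C\}$ on any interval where $x^2 + 2x \ne 0$
49. $\{\tfrac{1}{2}\operatorname{Sec}^{-1}(x^2) + C\}$   51. $\{x\operatorname{Sin}^{-1} x + C\}$
53. $\{x\operatorname{Sin}^{-1} x + \sqrt{1 - x^2} + C\}$   55. $\{x\operatorname{Cos}^{-1} x - \sqrt{1 - x^2} + C\}$
57. $\{\tfrac{1}{2}\ln(1 + e^{2u}) + C\}$   59. $\{\tfrac{1}{2}(e^{2u} - \ln(1 + e^{2u})) + C\}$
61. $\{\tfrac{1}{2}\ln(1 + x^2) + C\}$   63. $\{\tfrac{1}{2}(\operatorname{Sin}^{-1} z + z\sqrt{1 - z^2}) + C\}$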
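An antiderivative can always be checked by differentiating it. As a sketch, the Python below differentiates answers 53 and 55 numerically; the results agree with the inverse sine and inverse cosine respectively, which suggests those were the integrands (the problem statements themselves are not reproduced here, so that reading is an assumption). The function names are hypothetical.

```python
import math

def derivative(func, x, h=1e-6):
    """Central-difference approximation to func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

def answer_53(x):
    """x Sin^-1 x + sqrt(1 - x^2), with the constant C taken as 0."""
    return x * math.asin(x) + math.sqrt(1 - x * x)

def answer_55(x):
    """x Cos^-1 x - sqrt(1 - x^2), with the constant C taken as 0."""
    return x * math.acos(x) - math.sqrt(1 - x * x)

for x in (-0.5, 0.0, 0.3):
    print(x, derivative(answer_53, x) - math.asin(x), derivative(answer_55, x) - math.acos(x))
```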
1.
{( -1 -2v�) + C 3.
{ x. +c
}
(1+v x)2 } a2va2 .;._ x2
5. {x-2 In (ve"+l +1)+c}
7. {t(1+ \o/�)2 -6 d+ \o/�)+3 In Jl+ \o/;I+c}
9. {t(v'vx+1) 3 -4Vv1x+1+c} 11. {vz2 -1+l(vz2-1)3+c}
35. $\{\tan x + \sec x + C\}$
37. $\{\tfrac{1}{\sqrt{2}}\ln|\sec(\theta + \pi/4) + \tan(\theta + \pi/4)| + C\}$
3. ln(v'2+1) 5. j(5312 - 1)
7. a) $\tfrac{1}{2}(e - 1/e)$  b) $\tfrac{1}{2a}(e^a - e^{-a})$
5. $4\pi(a + k)k^2$
13. $10\sqrt{2}\,\pi^2$
(a b
1. (x' y) = ; ' n 5.
1T
J
ac2
(b c (3b - 4a))
7. trc(Vb2 c2 v'(a - b)2
_
+ c2) 9· (x, y)
_
+ +
2' 6 (b - a)
=
1. tr/4 3. t 5. 2
7. 10,000   9. $\infty$   11. $\infty$
31. $x = (a + b)\cos\theta + b\cos\Bigl(\dfrac{a + b}{b}\,\theta\Bigr)$, $y = (a + b)\sin\theta + b\sin\Bigl(\dfrac{a + b}{b}\,\theta\Bigr)$
1. 0 3. 0 5. 0 9. 1 11. 0
13. e -2 15. 0 17.e 19. 0 21. 0
k 3
23. - co 27. 1 29. e 31. 1 33. e-
1. $y = 2$   3. $x + y = \sqrt{2}$
7. $y = x^3$   9. $(x^2 + y^2)^2 = 2xy$
13. $2x + y^2 - 1 = 0$   19. $x - y = 1$
29. $(x^2 + y^2)^2 = a^2(x^2 - y^2)$   31. $r = 1/(1 + \sin\theta)$
33. $3r^2 - 16r\cos\theta - 16r\sin\theta + 32 = 0$
1. 3rr/2 3. rr/4 5. t
9. 2 11. 2 13. i(e8" - 1)
1. $\pi$   3. 3   5. $2\pi$
7. i((l + i-)3/2 - 1) 9. tv's +t1n(2 +1v's) 1t. !
1. maximum at $x = 0$, $K = 2$
3. maximum at $x = (45)^{-1/4}$, $K = 5^{3/2}(45)^{-1/4}\,6^{-1/2}$; minimum at $x = -(45)^{-1/4}$,
$K = -5^{3/2}(45)^{-1/4}\,6^{-1/2}$
5. $1/a$, $1/b$
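The curvature values in answers 1 and 3 are consistent with the curves y = x^2 and y = x^3. That identification is an assumption, since the problem statements are not reproduced here. Under that assumption, the sketch below evaluates the signed curvature K = y'' / (1 + y'^2)^(3/2) and compares it with the stated values; all names in the code are hypothetical.

```python
def curvature(d1, d2):
    """Signed curvature K = y'' / (1 + y'^2)^(3/2), given y' (d1) and y'' (d2)."""
    return d2 / (1 + d1 * d1) ** 1.5

def k_parabola(x):
    """Curvature of the assumed curve y = x^2."""
    return curvature(2 * x, 2)

def k_cubic(x):
    """Curvature of the assumed curve y = x^3."""
    return curvature(3 * x * x, 6 * x)

x_star = 45 ** -0.25                          # claimed location of the maximum
k_star = 5 ** 1.5 * 45 ** -0.25 * 6 ** -0.5   # claimed maximum curvature

print(k_parabola(0.0), k_cubic(x_star), k_star)
```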
1. 0 3. 0 5. 0
7. 0   9. $1/e$   11. $\infty$
13. $-\infty$   15. 1   17. 2
19. $\infty$   21. converges   23. 2
25. $\ln 2$   27. $e - 1$   29. 1
31. 0   33. $\infty$   35. converges
7. $\pi/(1 + \pi)$   9. convergent   11. convergent
13. convergent 15. not convergent 17. convergent
19. not convergent 21. convergent 23. convergent
25. convergent 27. not convergent 29. not convergent
00
5. $|R_n| \le 1/n$
1 1
7. IR..I � 9. !Rn! �
<n + 1)4 (n + 1)·9
11. $|R_n| \le 1/n$ (This is the estimate which is easiest to derive. Much better estimates are possible.)
13. $|R_n| \le 1/n$   15. $|R_n| \le 1/(2n^2)$
1. 0.019997
.
2 /2 +)
i�l (-1)• 2(0 49)<i
00 00
5. a) /(x) c)
i + 2
=
i
oo ro
2(0.25)<5i+2)/2
7. a)/(x) L (-l)i x5if2 b) !Rn(x)I � (0.25)5<n+l>/2 c) L (-1)' --
.
. --
i=l 51 + 2
=
i=O
oo xi+ 2 co xi+ 3
I (-1Y . - , ( -1, 1 1 I c-1)i - . - , c-1, 11
l +1 l +1
3.
i=O
i.
i =O
-
22i+ix2H1 oo x2i+l
i -l)i 2i+1
� (-l) . , (- oo, oo )
oo
7 ( 2
(2i + 1) ! ' - oo' oo) ( (2i +1)!
5.
i i�
x2i co x3i +3
9. ( 1) 11. I-.- c- oo , oo
oo
i � - 3)2i(2i)!'
(- oo, oo
\ ) i=O ,I.
, )
x2i+1
15. Ic-1)i .
co
[-1, 11
i=O c 2l + n2,
x2i+l
19. (-l)i
oo
�
i (2i + 1)(2i + l)! ' - oo, oo) (
22ix2i+3 x3 co 22ix2i
i .I (-l)i
co
21. . 1 + -, (- oo, ) 23. I c-1Y . ,, c- oo , oo )
(2I ) 2 c2l )
- oo
i=O i=O
f(x) = e"'l2
• •
n+ 1 (n + 1)! n n!
5
·
( ) i =(n+l- i)!i!' ( )
i- 1 (n- i +1)! ( i - 1)!
x2i+1 x2i +l
17. 2113
oo J.
�()
co k
19. 2k on -v'2, v'2)
� (; ) ( - v2, v2)
_ _
i 2 i(2i + 1)
on
i i 2i
2 (2i +1) (
-i xi+2
23. f(x) = e•in(x)
00
21. _2 ( ) (-1, 1)
+2
. -:-- on
i=O l l
(-l)m
an = 0 if n is an if n= 2m + 1
(2m + i) !
I. even,
=
3. a0 = 0, a1 = 1, a2 = 0
( -l)m .
an = 0 if n is a11 = ( 1f n= 2m +1
Zm + l)
5. even,
1
7. an= f
n.
9. a0 =In 2, a1 = !, a2 = -!
. . (-l)m-1.
11. an = 0 1f n IS odd, an = if n = 1m, m > 0 (a0 = 0)
1m
1. $|R_n(x)| \le \dfrac{|x|^{n+1}}{(n + 1)!}$
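A remainder bound of this shape holds for any function whose derivatives are all bounded by 1 in absolute value. The sketch below checks it for the sine function expanded about 0. The choice of sine is an assumption, since the problem statement is not reproduced here, and the helper names are hypothetical.

```python
import math

def taylor_sin(x, n):
    """Taylor polynomial of sin about 0, keeping every term of degree <= n."""
    total = 0.0
    for k in range(1, n + 1, 2):  # only odd powers appear in the sine series
        total += (-1) ** (k // 2) * x ** k / math.factorial(k)
    return total

def remainder_bound(x, n):
    """The bound |x|^(n+1) / (n+1)! from the answer above."""
    return abs(x) ** (n + 1) / math.factorial(n + 1)

for n in range(8):
    actual = abs(math.sin(2.0) - taylor_sin(2.0, n))
    print(n, actual, remainder_bound(2.0, n))
```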
1. -4 3. -1 5. -1
9. 1 11. 2i
2 i
13. 3 +4i 15. -9 +sv'3; 17. 5-5
1 2i 1 3i 2 v'3 i
19. 5 -5 21. - 23. 7 - - -
10 10 7
v'3 2i
25. 7 27. -i 29. i
7
8 i
31. i 33. 5 +5 35. 1
5. 1ei"16
1 i 1 i
-- 1 i 1 - i
1. z= --= + ----=' ----= - - ----= + -- ' - ----= ----=
3. $z = -2,\ 1 + \sqrt{3}\,i,\ 1 - \sqrt{3}\,i$
5. $z = -1,\ i,\ -i$
1 i 1 i 1 1 1 i
7· z= l , i, -i, + - + -
v2 v'2 v2 - v2' v'2 v2' v'2 - v2
- '
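The three values in answer 3 are the cube roots of -8. Assuming that is what the problem asks for (the statement is not reproduced here), the roots can be recomputed from the polar form with Python's `cmath` module. The function name `nth_roots` is hypothetical.

```python
import cmath
import math

# Values listed in answer 3.
listed = [-2, 1 + math.sqrt(3) * 1j, 1 - math.sqrt(3) * 1j]

def nth_roots(w, n):
    """All n-th roots of the complex number w, computed from its polar form."""
    r, theta = cmath.polar(w)
    return [cmath.rect(r ** (1 / n), (theta + 2 * math.pi * k) / n) for k in range(n)]

for z in nth_roots(-8, 3):
    print(z, min(abs(z - w) for w in listed))
```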
1. $e^z$   3. $\cos z$   5. $2z\,e^{z^2}$
(2i +l)x2i
5.
GO (i + l)Xi
.L
i=O
.,
I.
7. .�
£.-
i=O
'+l
( -1)'
(2 I")I•
13. Jim /n(O) = 1,Iim fn(x) = oo if 0 < x � 1,U lim does not exist
n--+ oo n--+ co
1. x + 3y + 2z = 14 3. x + y +z = t 5. 5x - Sy + z = 3
7. $x - y + 2z = 6$   9. The figure is a sphere with equation $x^2 + y^2 + z^2 = 4$.
1
i. � +L +_:_ __l_ _:_ + - -
v'3 v' v'3 v'3=0
3 '
-�
v'3
_ _l'_
v'3 v'3 v'3=0
_
x 2y 2z x 2y 2z
3. - + - +- - 1 = 0 ' - - - - - - + 1 = 0
3 3 3 3 3 3
2 2
5 __!__ + Y + � +� = 0 - __!__ - Y - � - � = 0
· �l � �l �1 ' �1 �1 �1 �l
7
·
(:3' :3' :3)
x 2y -x 2y 3
11. - 3 =0 -
- +- - -- +- =0
v'5 v'5 v'5 ' Vs Vs Vs
x y z
xi +yj+zk =2 (V1 +V2) + 2 (V2+ V3)+2. (Vi + V3)
1. $\alpha_1 = 1$, $\alpha_2 = -2$, $\alpha_3 = 1$   3. $\alpha_1 = 4$, $\alpha_2 = -5$, $\alpha_3 = 2$, $\alpha_4 = -1$
5. $\{E_1 - E_2,\ 2E_2 + E_3\}$   7. $\{E_1 + E_2,\ E_2,\ E_3 + E_4,\ E_4\}$
9. $\{2E_1 - 2E_2 - E_3\}$
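1. $\Bigl\{\tfrac{1}{\sqrt{3}}(E_1 + E_2 + E_3),\ \tfrac{1}{\sqrt{6}}(-E_1 - E_2 + 2E_3)\Bigr\}$
3. $\Bigl\{\tfrac{1}{\sqrt{3}}(E_1 + E_2 + E_3),\ \tfrac{1}{\sqrt{6}}(2E_1 - E_2 - E_3),\ \tfrac{1}{\sqrt{2}}(E_2 - E_3)\Bigr\}$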
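Sets like these are easy to test for orthonormality. The sketch below assumes that E1, E2, E3 denote the standard basis vectors of R^3 with the usual dot product (an assumption; the problem statements are not reproduced here). The helper names are hypothetical.

```python
import math

def dot(u, v):
    """Standard dot product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))

def scale(c, u):
    """Multiply the vector u by the scalar c."""
    return tuple(c * a for a in u)

def is_orthonormal(vectors, tol=1e-12):
    """True if every vector has length 1 and distinct vectors are perpendicular."""
    for i, u in enumerate(vectors):
        for j, v in enumerate(vectors):
            expected = 1.0 if i == j else 0.0
            if abs(dot(u, v) - expected) > tol:
                return False
    return True

s2, s3, s6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
answer_1 = [scale(1 / s3, (1, 1, 1)), scale(1 / s6, (-1, -1, 2))]
answer_3 = [scale(1 / s3, (1, 1, 1)), scale(1 / s6, (2, -1, -1)), scale(1 / s2, (0, 1, -1))]

print(is_orthonormal(answer_1), is_orthonormal(answer_3))
```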
1T2
13. /(x) = x2 - 3
2 . 7T2 (-l)i4 2
1. a0= 0, ai =0, bi= -: (-1)'+1 3 . a0 = - a·i = j2 bi = - (-l)i+l
l i
--
3 ' '
5• a·=
i Ob·=
' i E_ (-l)Hl (i37T'2)
.
1T 1 1
7. ao =-4, ai= -;z- ((-l)i - 1), bi=-: (-l)i+1
l 1T l
9.
1T 1
a0= -4 + -2, ai
( 1 ) ((-1)'. - 1), bi=-:1 (-1)•+. 1 +,1 ((-1)'.-1)
-;z-
l 1T l l1T
=
1T 1 . 3 · 1
11. a0=-4 j ((-1)'- 1) ' b·=-(-1)'
a·=- +
' i 21T i
i
1
7. a0= 1T ( -2 + e" + e-")
2
1
ai= . (-2 + (e" + e-")(-l)i)
7r(l + 12)
i
b·= ( +2 + (e" + e-")(-I)i+l)
' 7r(l + i2)
·[l -u ·[� �]
PROBLEM SET 13.1
-4 0
1. M-• � 3 3. M-• �
3
-2 -1
5. $f(\mathbf{R}^3) = \{(a, b, c) \mid b = 3a,\ c = -2a\}$
Ker $f = \{(x, y, z) \mid 2x + y - z = 0\}$
7. $f(\mathbf{R}^3) = \{(a, b, c) \mid 3b = 2c\}$
�H
-3 -I
Ker f = {(x , y, z) I x + 2y = 0, z = o} 9. M-• � 3 -1
0 2
!
PROBLEM SET 13.2
3. [� ;]
LP �]
15 15
5. [4741 5825 ·�] 7 UJ
9. L�J 11. [l 64 !]5
3.9. 56
7.13. 112-2. 17. -2-2
1. G11G22 - G12G21 G33 G44 5. 1
PROBLEM SET 13.5
1. $D_{11} = -1$, $-D_{21} = 2$, $D_{31} = -1$   3. z =
-
5047
5. [� :] 7.
[� _!]
[i !]
0 0
5
[! H] 11.
0
-
2 +
•. 0 1
2 0 -1
+
0 0 0 0
3
0 0 0 0
2
15. 1
0 0 0 0
5
-1 0 0 0 0
1
0 0 0 0
4
+ +
5. "fl/' = { IX1e"' 1X2 cos x + IX3 sin x} 7. "fl/' = { IX1 + 1X2e"' + IX3Xe"' + 1X4e-X + IX5Xe- "'}
+
9. $H = \{\alpha_1\sin x + \alpha_2\cos x - \tfrac{1}{2}x\cos x\}$   11. $H = \{\alpha_1\sin x + \alpha_2\cos x + \tfrac{1}{2}x\sin x\}$
13. $H = \{\alpha_1\sin x + \alpha_2\cos x + x\}$
1
13. hyperbolic paraboloid   19. 3   21. $8\pi$
2x 3 2y3 -4x3y3
13. fx= v'x 4 + y4 + /v = ,/xv= (x4 =/vx
1 , v'x 4 + y 4 + 1 + y 4 + 1 )312
y(y2 _
x2) x(x2 _
y2) .,...y4 + 6x2y2 x4
.fv = (x2 .fxv= =fvx
_
1. A = 1, B = 2, E1 = Ay, E2= 0
3. A = 1, B = 2, E1 = Ay2, E2 = Ay + 2Ax
7. A = 2, B = 2, E1 = Ax, E2 = Ay
544 1376
1. 35 3.
21
9 4
7. 0 ·
27
7. 0 11. 0
28
1. 9 3. x = 0, ji = 0
5. x = 3/27r, ji = 0 7. x = 0, ji = - t
3
9. -11/6 11. x =- y-=o 13. (0, 0)
4'
1. 0 3. 0 5. 3/7
9
7. 0 . 0 11. 3
Index