
Divergence and Curl

E. L. Lady

(Last revised October 7, 2004)

A vector field F defined on a certain region in n-dimensional Euclidean space consists of an
n-dimensional vector defined at every point in this region. (I. e. F is a function which assigns a vector
in $\mathbb{R}^n$ to every point in the given region.)

In the case of three-dimensional space, a vector field would have the form

F = P (x, y, z) i + Q(x, y, z) j + R(x, y, z) k ,

where P , Q , and R are scalar functions.

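The usual way to think of differentiation in this situation is to think of the derivative of F as
being given by the Jacobian matrix

\[
J = \begin{pmatrix}
\dfrac{\partial P}{\partial x} & \dfrac{\partial P}{\partial y} & \dfrac{\partial P}{\partial z} \\[1.5ex]
\dfrac{\partial Q}{\partial x} & \dfrac{\partial Q}{\partial y} & \dfrac{\partial Q}{\partial z} \\[1.5ex]
\dfrac{\partial R}{\partial x} & \dfrac{\partial R}{\partial y} & \dfrac{\partial R}{\partial z}
\end{pmatrix} .
\]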

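If one writes x = (x, y, z) and $\Delta\mathbf{x} = (x, y, z) - (x_0, y_0, z_0)$, and $\Delta\mathbf{F} = \mathbf{F}(x, y, z) - \mathbf{F}(x_0, y_0, z_0)$, and if
one thinks of ∆x and ∆F as being column vectors rather than row vectors, then one has

\[
\Delta\mathbf{F} =
\begin{pmatrix} \Delta P \\ \Delta Q \\ \Delta R \end{pmatrix}
\approx
\begin{pmatrix}
\dfrac{\partial P}{\partial x} & \dfrac{\partial P}{\partial y} & \dfrac{\partial P}{\partial z} \\[1.5ex]
\dfrac{\partial Q}{\partial x} & \dfrac{\partial Q}{\partial y} & \dfrac{\partial Q}{\partial z} \\[1.5ex]
\dfrac{\partial R}{\partial x} & \dfrac{\partial R}{\partial y} & \dfrac{\partial R}{\partial z}
\end{pmatrix}
\begin{pmatrix} \Delta x \\ \Delta y \\ \Delta z \end{pmatrix}
= J\,\Delta\mathbf{x}
\]

in the same way that, with a function f(x) of one variable, one has $\Delta f \approx f'(x_0)\,\Delta x$.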

One can also think of the Jacobian matrix (or, better still, the linear transformation corresponding
to the matrix J ) as a way of getting the directional derivative of F in the direction of a unit vector u:

\[
\lim_{h \to 0} \frac{\mathbf{F}(\mathbf{x} + h\mathbf{u}) - \mathbf{F}(\mathbf{x})}{h} = J\mathbf{u},
\]

where in computing the matrix product here we think of u as a column vector, i. e. a 3 × 1 matrix.

It is useful to notice that the first column of J gives the directional derivative of F in the
direction i , the second column is the directional derivative in the direction j, and the third column is
the directional derivative in the k direction.
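
As a rough numerical illustration of ∆F ≈ J ∆x, the following Python sketch estimates the Jacobian matrix by central differences, one column per coordinate direction, and compares J ∆x with the actual change in F. The sample field, the base point, and the step sizes are hypothetical choices made only for this illustration.

```python
import math

def jacobian(F, point, h=1e-6):
    """Estimate the Jacobian of F at `point`; entry [r][c] is dF_r/dx_c."""
    n = len(point)
    columns = []
    for c in range(n):
        up, down = list(point), list(point)
        up[c] += h
        down[c] -= h
        # Column c is the directional derivative of F along the c-th axis.
        columns.append([(a - b) / (2 * h) for a, b in zip(F(*up), F(*down))])
    return [[columns[c][r] for c in range(n)] for r in range(n)]

def F(x, y, z):
    # Hypothetical sample field, chosen only for illustration.
    return (x * y, y * z, math.sin(x) + z * z)

p0 = (1.0, 2.0, 0.5)
J = jacobian(F, p0)

dx = (1e-3, -2e-3, 1.5e-3)                      # a small displacement
p1 = tuple(a + d for a, d in zip(p0, dx))
dF = [a - b for a, b in zip(F(*p1), F(*p0))]    # actual change in F
J_dx = [sum(J[r][c] * dx[c] for c in range(3)) for r in range(3)]

print("dF   =", dF)
print("J dx =", J_dx)
```

The two printed vectors should agree up to terms of second order in the displacement.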

Divergence

The divergence of the vector field F, often denoted by ∇ • F, is the trace of the Jacobian
matrix for F, i. e. the sum of the diagonal elements of J . Thus, in three dimensions,

\[
\nabla \cdot \mathbf{F} = \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} + \frac{\partial R}{\partial z} .
\]

Now the concept of the trace is surprisingly useful in matrix theory, but in general it is also a very
difficult concept to interpret in any meaningful intuitive way. Thus it is not surprising that the
divergence seems a rather mysterious concept in vector analysis.

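Example. Let $\mathbf{F}(x, y) = 3e^{5x} \cos 3y\,\mathbf{i} - 5e^{5x} \sin 3y\,\mathbf{j} = e^{5x}(3 \cos 3y\,\mathbf{i} - 5 \sin 3y\,\mathbf{j})$. Then

\[
\nabla \cdot \mathbf{F} = 15e^{5x} \cos 3y - 15e^{5x} \cos 3y = 0 .
\]

More generally, if g(x, y) is any function with continuous second partial derivatives and

\[
\mathbf{F}(x, y) = \frac{\partial g}{\partial y}\,\mathbf{i} - \frac{\partial g}{\partial x}\,\mathbf{j},
\quad\text{then}\quad
\nabla \cdot \mathbf{F} = \frac{\partial^2 g}{\partial x\,\partial y} - \frac{\partial^2 g}{\partial y\,\partial x} = 0 .
\]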
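
The vanishing divergence in this Example can be spot-checked numerically. The sketch below is only an illustration: the helper `divergence_2d`, the step size, and the sample points are assumptions, and the partial derivatives are approximated by central differences rather than computed exactly.

```python
import math

def divergence_2d(P, Q, x, y, h=1e-5):
    """Central-difference estimate of dP/dx + dQ/dy at (x, y)."""
    dP_dx = (P(x + h, y) - P(x - h, y)) / (2 * h)
    dQ_dy = (Q(x, y + h) - Q(x, y - h)) / (2 * h)
    return dP_dx + dQ_dy

def P(x, y):
    return 3 * math.exp(5 * x) * math.cos(3 * y)

def Q(x, y):
    return -5 * math.exp(5 * x) * math.sin(3 * y)

# Arbitrary sample points; the divergence should be (numerically) zero at each.
samples = [(0.0, 0.0), (0.2, -0.7), (-0.5, 1.3)]
values = [divergence_2d(P, Q, x, y) for x, y in samples]
for point, value in zip(samples, values):
    print(point, value)
```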

In trying to understand any relationship involving differentiation, it is usually most enlightening to
start with the case where the derivative is constant. For instance in studying the relationship between
distance, time, and velocity, one begins in high school with the case where velocity is constant.

The principle of Taylor Series expansions shows that in a small neighborhood of a point $(x_0, y_0)$ in
two-dimensional space, it will usually be true for a vector field F = P i + Q j that

\[
\mathbf{F}(x, y) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j} \approx \mathbf{c} + (a_1 x + a_2 y)\,\mathbf{i} + (b_1 x + b_2 y)\,\mathbf{j},
\]

where c is a constant vector and

\[
a_1 = \frac{\partial P}{\partial x}(x_0, y_0), \quad
a_2 = \frac{\partial P}{\partial y}(x_0, y_0), \quad
b_1 = \frac{\partial Q}{\partial x}(x_0, y_0), \quad
b_2 = \frac{\partial Q}{\partial y}(x_0, y_0) .
\]

(More precisely,

\[
\begin{aligned}
P(x, y) &\approx P(x_0, y_0) + \frac{\partial P}{\partial x}(x_0, y_0)\,(x - x_0) + \frac{\partial P}{\partial y}(x_0, y_0)\,(y - y_0) \\
Q(x, y) &\approx Q(x_0, y_0) + \frac{\partial Q}{\partial x}(x_0, y_0)\,(x - x_0) + \frac{\partial Q}{\partial y}(x_0, y_0)\,(y - y_0)
\end{aligned}
\]

in a sufficiently small neighborhood of $(x_0, y_0)$. Actually, this will be true provided that P and Q are
differentiable at $(x_0, y_0)$, even if they don’t have Taylor Series expansions.)

Now $a_2$ and $b_1$ don’t affect ∇ • F , so to get an intuitive sense of how divergence works in the
plane, it makes sense to look at some examples of vector fields F = ax i + by j, where a and b are
constants. Then ∇ • F = a + b .

Here are some pictures. (For future reference, the values for ∇ × F are also given.)

Example 1. $\mathbf{F} = x\,\mathbf{i} + .5y\,\mathbf{j} = \nabla(.5x^2 + .25y^2)$; ∇ • F = 1.5; ∇ × F = 0.
[Figure: plot of the field F in the xy-plane.]

Example 2. $\mathbf{F} = -\tfrac{1}{3}x\,\mathbf{i} - \tfrac{1}{3}y\,\mathbf{j} = \nabla(-\tfrac{1}{6}x^2 - \tfrac{1}{6}y^2)$; ∇ • F = −2/3; ∇ × F = 0.
[Figure: plot of the field F in the xy-plane.]

Example 3. $\mathbf{F} = \tfrac{1}{2}x\,\mathbf{i} - \tfrac{1}{2}y\,\mathbf{j} = \nabla(\tfrac{1}{4}x^2 - \tfrac{1}{4}y^2)$; ∇ • F = 0; ∇ × F = 0.
[Figure: plot of the field F in the xy-plane.]
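
The values of ∇ • F and ∇ × F quoted for Examples 1, 2 and 3 can be reproduced with a few lines of Python. This is only a sketch: the helper name `div_and_curl`, the sample point, and the step size are assumptions made for the illustration, and the curl is reported through its k-component ∂Q/∂x − ∂P/∂y.

```python
def div_and_curl(F, x, y, h=1e-5):
    """Return (div F, k-component of curl F) at (x, y) by central differences."""
    Px = (F(x + h, y)[0] - F(x - h, y)[0]) / (2 * h)
    Py = (F(x, y + h)[0] - F(x, y - h)[0]) / (2 * h)
    Qx = (F(x + h, y)[1] - F(x - h, y)[1]) / (2 * h)
    Qy = (F(x, y + h)[1] - F(x, y - h)[1]) / (2 * h)
    return Px + Qy, Qx - Py

examples = {
    "Example 1": lambda x, y: (x, 0.5 * y),
    "Example 2": lambda x, y: (-x / 3, -y / 3),
    "Example 3": lambda x, y: (0.5 * x, -0.5 * y),
}

# The fields are linear, so any sample point gives the same answer.
results = {name: div_and_curl(F, 0.7, -0.4) for name, F in examples.items()}
for name, (div, curl_k) in results.items():
    print(name, "div =", round(div, 6), "curl_k =", round(curl_k, 6))
```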

Consider now the special case of a vector field F with constant direction. Then we can write
F(x, y) = f (x, y) u, where u is a constant vector with magnitude 1. A calculation then shows that

\[
\nabla \cdot \mathbf{F} = \nabla f \cdot \mathbf{u} = D_{\mathbf{u}}(f) ,
\]

where $D_{\mathbf{u}}(f)$ denotes the directional derivative of f (x, y) in the direction u. Assume that f (x, y) ≥ 0
at the particular point of interest. Then f (x, y) = ||F(x, y)||. Since F is parallel to u, this shows that

Proposition. If F(x, y) is a vector field with constant direction u, then ∇ • F is the rate of increase
in ||F|| in the direction u.

(Note that if f (x, y) < 0 then the direction of F is −u and ||F|| = −f (x, y), so that
$\nabla \cdot \mathbf{F} = D_{\mathbf{u}}(f) = D_{-\mathbf{u}}(-f)$ is still the rate at which ||F|| is changing as one moves in the direction
of F.)

The principle here is equally valid in three dimensions or even in higher dimensional spaces.

Example 4. A field with constant direction and negative divergence: F = a(y − x) i; ∇ • F = −a; ∇ × F = −a k. F = 0 along the line ℓ.
[Figure: plot of the horizontal field F in the xy-plane, together with the line ℓ along which F = 0.]

On the other hand, consider a vector field U in two dimensions with constant magnitude.
Suppose, say, that ||U(x, y)|| = 1 everywhere. Then basic trigonometry shows that if γ(x, y) is the
angle that U(x, y) makes to the horizontal, i. e. γ is the angle between U and i , then

U(x, y) = cos γ(x, y) i + sin γ(x, y) j .

[Figure: the unit vector U, with ||U|| = 1, drawn at the angle γ to the direction i.]

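We then see that

\[
\begin{aligned}
\nabla \cdot \mathbf{U} &= \frac{\partial}{\partial x}(\cos\gamma) + \frac{\partial}{\partial y}(\sin\gamma) \\
&= -\sin\gamma\,\frac{\partial\gamma}{\partial x} + \cos\gamma\,\frac{\partial\gamma}{\partial y} \\
&= (-\sin\gamma\,\mathbf{i} + \cos\gamma\,\mathbf{j}) \cdot \Bigl(\frac{\partial\gamma}{\partial x}\,\mathbf{i} + \frac{\partial\gamma}{\partial y}\,\mathbf{j}\Bigr) \\
&= (-\sin\gamma\,\mathbf{i} + \cos\gamma\,\mathbf{j}) \cdot \nabla\gamma .
\end{aligned}
\]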

Thus ∇ • U is the directional derivative of γ(x, y) in the direction − sin γ i + cos γ j. This shows that

Proposition. If U(x, y) is a vector field in the plane with constant magnitude 1, then ∇ • U(x, y)
equals the rate at which U turns as (x, y) moves at unit speed in a direction which is 90°
counterclockwise to the direction of U(x, y).
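
A small numerical check of this Proposition may help. In the sketch below the angle function `gamma` is a hypothetical choice made only for illustration; the code compares a central-difference estimate of ∇ • U with the directional derivative of γ in the direction − sin γ i + cos γ j.

```python
import math

def gamma(x, y):
    # Hypothetical angle function, chosen only for illustration.
    return 0.3 * x + x * y

def grad_gamma(x, y):
    # Exact gradient of the gamma above.
    return (0.3 + y, x)

def U(x, y):
    g = gamma(x, y)
    return (math.cos(g), math.sin(g))

def divergence(F, x, y, h=1e-5):
    """Central-difference estimate of the divergence of a planar field."""
    dP_dx = (F(x + h, y)[0] - F(x - h, y)[0]) / (2 * h)
    dQ_dy = (F(x, y + h)[1] - F(x, y - h)[1]) / (2 * h)
    return dP_dx + dQ_dy

x0, y0 = 0.8, -0.5
g0 = gamma(x0, y0)
rotated = (-math.sin(g0), math.cos(g0))   # U turned 90 degrees counterclockwise
gx, gy = grad_gamma(x0, y0)
turning_rate = rotated[0] * gx + rotated[1] * gy
div_U = divergence(U, x0, y0)

print("div U        =", div_U)
print("turning rate =", turning_rate)
```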

To get a clearer idea of what this means, look at a curve C orthogonal to the vector field, i. e. the
tangent to C is perpendicular to U(x, y) at every point (x, y) on C . Then what we see for a vector
field U of constant magnitude is that as we move along the orthogonal curve in a direction 90°
counterclockwise from the direction of U, in case of positive divergence the vector field U will be
turning counterclockwise, and consequently C will be curving in the direction opposite to the
direction of U. For a field with constant magnitude and negative divergence, on the other hand, as
one moves along the curve in the direction 90° counterclockwise from the direction of U, U will be
turning clockwise, so C will be curving in the same direction as that in which U is pointing.

[Figure: Vector fields with constant magnitude, each drawn along an orthogonal curve. Left panel: positive divergence. Right panel: negative divergence.]

Vector fields having constant magnitude are unusual. For instance, linear vector fields (see below)
never do. Certainly we can contrive such a field simply by using any differentiable function γ(x, y)
and the formula U(x, y) = cos γ i + sin γ j or by setting U(x, y, z) = F(x, y, z)/ ||F|| , for any vector
field F which is never zero in the region of interest. One of the more natural examples of this sort with
constant magnitude 1 is

\[
\mathbf{F}(x, y, z) = \frac{x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}}{r}
\quad\text{where}\quad r = \sqrt{x^2 + y^2 + z^2} .
\]

It is easily seen that F = ∇g , where $g(x, y, z) = \sqrt{x^2 + y^2 + z^2}$.

One can show that ∇ • F = 2/r. In fact, $\mathbf{F} = r^{-1}\mathbf{r}$, where $\mathbf{r} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$, and so

\[
\nabla \cdot \mathbf{F} = \nabla(r^{-1}) \cdot \mathbf{r} + r^{-1}\,\nabla \cdot \mathbf{r} = -r^{-3}\,\mathbf{r} \cdot \mathbf{r} + 3r^{-1} = 2r^{-1} .
\]
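
The value 2/r for the divergence of the unit radial field can be spot-checked numerically. The following is a minimal sketch; the helper `divergence_3d`, the sample point, and the step size are assumptions made only for the illustration.

```python
import math

def unit_radial(x, y, z):
    """The field (x i + y j + z k) / r, which has magnitude 1 away from the origin."""
    r = math.sqrt(x * x + y * y + z * z)
    return (x / r, y / r, z / r)

def divergence_3d(F, p, h=1e-5):
    """Central-difference estimate of the divergence of F at the point p."""
    total = 0.0
    for i in range(3):
        up, down = list(p), list(p)
        up[i] += h
        down[i] -= h
        total += (F(*up)[i] - F(*down)[i]) / (2 * h)
    return total

p = (1.0, -2.0, 2.0)                      # a sample point with r = 3
r = math.sqrt(sum(c * c for c in p))
numeric = divergence_3d(unit_radial, p)

print("numerical divergence =", numeric)
print("2 / r                =", 2 / r)
```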

Since we can write any vector field F as f (x, y)U(x, y) where f (x, y) = ||F(x, y)|| and
$\mathbf{U}(x, y) = \mathbf{F}(x, y)/\|\mathbf{F}\|$, we can apply the product rule to get

\[
\nabla \cdot \mathbf{F} = (\nabla f) \cdot \mathbf{U} + f\,\nabla \cdot \mathbf{U} .
\]

In words, then,

Proposition. The divergence of a two-dimensional vector field F(x, y) is the sum of the following
two terms:
(1) The rate at which ||F|| is increasing when (x, y) moves at unit speed in the direction of F (this
is negative, of course, if ||F|| is decreasing);
(2) the product of ||F|| and the rate at which F is turning as (x, y) moves in the direction 90°
counterclockwise to the direction of F at unit speed.
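
The two terms of this Proposition can be computed separately and compared with the divergence itself. The sketch below does this by central differences for a hypothetical field, chosen only because it is nonzero near the sample point; the helper names are likewise assumptions.

```python
import math

def F(x, y):
    # Hypothetical field, nonzero near the sample point.
    return (x * x + 1.0, x * y + 2.0)

def magnitude(x, y):
    return math.hypot(*F(x, y))

def U(x, y):
    P, Q = F(x, y)
    n = math.hypot(P, Q)
    return (P / n, Q / n)

def partial(g, x, y, axis, h=1e-5):
    """Central-difference partial derivative of a scalar function g."""
    if axis == 0:
        return (g(x + h, y) - g(x - h, y)) / (2 * h)
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

def divergence(V, x, y):
    return (partial(lambda s, t: V(s, t)[0], x, y, 0)
            + partial(lambda s, t: V(s, t)[1], x, y, 1))

x0, y0 = 1.0, 0.5
u = U(x0, y0)
# Term (1): rate of change of ||F|| in the direction of F, i.e. (grad f) . U
growth = partial(magnitude, x0, y0, 0) * u[0] + partial(magnitude, x0, y0, 1) * u[1]
# Term (2): ||F|| times the turning rate, i.e. f * div U
turning = magnitude(x0, y0) * divergence(U, x0, y0)
total = divergence(F, x0, y0)

print("term (1) + term (2) =", growth + turning)
print("div F               =", total)
```

For this particular field ∇ • F = 3x, so both printed numbers should be close to 3 at the chosen point.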

There is a more sophisticated and more conceptual way of deriving this result, namely by making
a change of coordinates. Of course one has to know the fact that a change of coordinates will not
change the divergence. One should also remember that this change of coordinates has its limitations as a
practical tool, since except for linear fields (see below), it will only simplify the Jacobian matrix at
one particular point.

The key to the change in coordinates comes from a fundamental fact about the derivatives of
vectors. Namely, the derivative $d\mathbf{v}(t)/dt$ of a vector v(t) at a given point can be decomposed into two
components, one in the direction of the vector, which shows the rate at which the magnitude of v is
changing, and the other orthogonal to v(t), which points in the direction towards which v is turning
and whose magnitude is the product of ||v|| and the rate at which v is turning.

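Thus the first component of the directional derivative of a vector field F in a given direction at a
given point gives the rate at which the magnitude of F is changing and the second component is
determined by the rate at which F turns as one moves in the given direction at unit speed. Recall
that if F = P i + Q j and we denote the directional derivatives of F in the i and j directions by
$D_{\mathbf{i}}(\mathbf{F})$ and $D_{\mathbf{j}}(\mathbf{F})$, then

\[
\begin{aligned}
D_{\mathbf{i}}(\mathbf{F}) &= \frac{\partial P}{\partial x}\,\mathbf{i} + \frac{\partial Q}{\partial x}\,\mathbf{j} \\
D_{\mathbf{j}}(\mathbf{F}) &= \frac{\partial P}{\partial y}\,\mathbf{i} + \frac{\partial Q}{\partial y}\,\mathbf{j} .
\end{aligned}
\]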

This makes it useful to choose an orthogonal coordinate system so that at a given point $(x_0, y_0)$,
i is in the direction of F. As usual, we make j in the direction 90° counterclockwise from the
direction of i . Then ∂P/∂x is the first component of the directional derivative of F in the direction
of F, and thus gives the rate at which ||F|| changes as we move in the direction of F. And ∂Q/∂y is the
second component of the directional derivative of F in the direction 90° counterclockwise from F, and thus
equals the product of ||F|| and the rate at which F turns as we move in the direction 90°
counterclockwise to F. Since ∇ • F = ∂P/∂x + ∂Q/∂y, we thus recover the characterization of the
divergence of a two-dimensional field given in the preceding Proposition.

In general, the Proposition should not be looked at as a practical method for computing ∇ • F,
since in most cases the original definition is easily used for calculation. Instead, it is a way of
attempting to see the intuitive conceptual meaning of ∇ • F. However in certain cases, the
differentiation required by the original formula is a slight nuisance. For instance, consider the vector
field

\[
\mathbf{F}(x, y) = \frac{y\,\mathbf{i} - x\,\mathbf{j}}{r^n},
\quad\text{where}\quad r = \sqrt{x^2 + y^2} .
\]

Here we can easily see from the Proposition above that
∇ • F = 0 since F is tangent to the circles $x^2 + y^2 = \text{const}$ around the origin and ||F|| is constant as
one moves around such circles, so that the directional derivative of ||F|| as (x, y) moves in the
direction of F is zero. Furthermore, the direction of F does not change when (x, y) moves in a
direction perpendicular to F, i. e. in the direction of the radial lines from the origin. As a
computational check, with $P = y/r^n$, $Q = -x/r^n$, then, using the fact that ∂r/∂x = x/r ,
∂r/∂y = y/r , one finds that

\[
\begin{aligned}
\frac{\partial P}{\partial x} &= \frac{\partial}{\partial x}\Bigl(\frac{y}{r^n}\Bigr) = \frac{-nxy}{r^{n+2}} \\
\frac{\partial Q}{\partial y} &= \frac{\partial}{\partial y}\Bigl(\frac{-x}{r^n}\Bigr) = \frac{nxy}{r^{n+2}}
\end{aligned}
\]

so that indeed ∇ • F = ∂P/∂x + ∂Q/∂y = 0.
One should also note that the Proposition breaks down at points $(x_0, y_0)$ where $\mathbf{F}(x_0, y_0) = 0$,
since at those points the direction of F is not well defined. In some cases, this is only an apparent
difficulty. If one writes F(x, y) = f (x, y)u(x, y), where u(x, y) is a unit vector, and if
$\mathbf{F}(\mathbf{x}_0) = \mathbf{F}(x_0, y_0) = 0$, then one may be able to get away with setting $\mathbf{u}(\mathbf{x}_0) = \lim_{\mathbf{x} \to \mathbf{x}_0} \mathbf{u}(\mathbf{x})$. But in
some important cases this limit will not exist. Two examples are the fields F(x, y) = x i + y j and
F(x, y) = −y i + x j, with $\mathbf{x}_0 = (0, 0)$ for both examples. In cases like this, however, one ought to at
least get away with using the formula $\nabla \cdot \mathbf{F}(\mathbf{x}_0) = \lim_{\mathbf{x} \to \mathbf{x}_0} \nabla \cdot \mathbf{F}(\mathbf{x})$.

Since the characterization of divergence for two-dimensional vector fields given by the preceding
Proposition is so nicely conceptual and coordinate-free, it seems only natural to hope that if one can
find the appropriate way to state it, then it will continue to be true for three-dimensional fields as
well. Or at least there should be some analogous result for three dimensions. This is pretty much true,
as one will see at the end of the article.

The Jacobian matrix for a vector field does not tell us the magnitude or direction of the field,
since what it describes is the way the field changes. However recall our earlier observation that the
directional derivative of a field F in any direction u consists of two components, one (in the direction
of the field) telling us the rate at which ||F|| changes as one moves at unit speed in the direction u,
and the other (orthogonal to the first) telling us the rate at which F turns.

If F has constant magnitude, then the first of these components (the one in the direction of F) of
the directional derivative in any direction will be 0 . And if F has constant direction, then the second
component (the one orthogonal to F) will be 0.

Applying this to the directional derivatives $D_{\mathbf{i}}(\mathbf{F})$ and $D_{\mathbf{j}}(\mathbf{F})$, one sees that

A vector field F has constant magnitude if and only if at every point the columns of the
Jacobian matrix are orthogonal to F.
A vector field F has constant direction if and only if at every point the columns of the Jacobian
matrix are parallel to F.

Why Is It Called Divergence?

Let’s go back to the simple example of a horizontal vector field F = P (x, y) i and suppose that this
vector field F represents the velocity of a moving fluid. Now construct a small box in the xy-plane
with horizontal and vertical sides.

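[Figure: a small box in the xy-plane with horizontal width a and vertical height b, with the field F pointing to the right.]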

Suppose in particular that ∇ • F = ∂P/∂x = 5 . Let a denote the horizontal width of the box, as
indicated. If we assume that F is directed toward the right (i. e. that P is positive), then the
rate at which the fluid is flowing out of the right end of the box is 5a more than the rate at which it is
entering the left end. If, on the other hand, P is negative, then fluid flows out of the left end of the
box at a rate 5a more than it flows into the right end. In either case, if the box has height b , this
means that there is 5ab more fluid flowing out of the box per unit time than there is fluid flowing in.

On the other hand, if we were to have a vertical vector field Q j representing the flow of a fluid,
and if ∂Q/∂y were constant, say ∂Q/∂y = 8 , this would mean that the velocity at which fluid is leaving
through the top of the box (assuming that Q is positive) would exceed the velocity at
which it enters through the bottom by 8b , and there would be 8ab more fluid flowing out of the box
than fluid flowing in.

In either case, the difference between the amount of fluid leaving the box and the amount entering
per unit time is equal to the divergence of the vector field times the area of the box.

But we can write a general vector field F in the two-dimensional case as the sum of a horizontal
field P i and a vertical one Q j. And the divergence of F is the sum of the divergence of the
horizontal and vertical components:

∇ • F = ∇ • (P i) + ∇ • (Q j) .

We conclude that

Lemma. If the velocity of a fluid in the plane is given by a vector field F = P i + Q j such that
∂P/∂x and ∂Q/∂y are constant, then for any rectangular box with horizontal and vertical sides, the amount of
fluid inside the box will decrease per unit time at a rate equal to the divergence of F times the area of
the box. (It will increase if the divergence is negative.)

Now if one looks at an arbitrary differentiable two-dimensional vector field, then in a sufficiently
small neighborhood of a given point $(x_0, y_0)$, the partial derivatives ∂P/∂x and ∂Q/∂y will be very nearly
constant. Thus the conclusion of the Lemma above will be true to a very close degree of
approximation for any sufficiently small box constructed about $(x_0, y_0)$. And one can in fact see that
the divergence of F will equal the limit of the ratio between the rate at which fluid flows out of the
box to the area of the box, as the size of the box shrinks to zero.

But since any reasonably-shaped region can be closely approximated by dividing it up into tiny
boxes, one gets the following standard characterization of divergence.

Theorem. Suppose that a two-dimensional vector field F represents the velocity of a fluid at a
particular moment in time. Consider small regions surrounding a given point (x0 , y0 ) and take the
ratio of the rate at which fluid flows out of such a region (taken as negative if fluid is flowing inward)
to the area of the region. Then the divergence of F at the point (x0 , y0 ) is the limit of this ratio as
the region around (x0 , y0 ) shrinks to the single point (x0 , y0 ).

An alternate way of stating this may be easier to grasp intuitively.

Theorem. Suppose that a two-dimensional vector field F represents the velocity of a fluid at a
particular moment in time. Suppose that a region Ω in the plane is small enough so that the
divergence ∇ • F does not change very much over the region Ω. Then the rate at which fluid is
flowing out of Ω is approximately equal to the product of the area of Ω and the divergence of F at
any point within Ω.
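
The limiting ratio described in these two Theorems can be watched numerically. The sketch below computes the rate of outflow across the four sides of a square box by a midpoint rule and divides by the area, for boxes of decreasing size. The velocity field, the centre point, and the box sizes are hypothetical choices made only for illustration.

```python
import math

def F(x, y):
    # Hypothetical velocity field, for illustration only.
    return (x * x + y, math.sin(x) * y)

def exact_divergence(x, y):
    # dP/dx + dQ/dy for the field above.
    return 2 * x + math.sin(x)

def outward_flux_rate(F, x0, y0, a, b, n=200):
    """Midpoint-rule flux of F out of the a-by-b box centred at (x0, y0)."""
    left, right = x0 - a / 2, x0 + a / 2
    bottom, top = y0 - b / 2, y0 + b / 2
    flux = 0.0
    for k in range(n):
        y = bottom + (k + 0.5) * b / n
        flux += (F(right, y)[0] - F(left, y)[0]) * b / n    # right and left sides
        x = left + (k + 0.5) * a / n
        flux += (F(x, top)[1] - F(x, bottom)[1]) * a / n    # top and bottom sides
    return flux

x0, y0 = 0.6, 1.1
sizes = (0.4, 0.1, 0.025)
ratios = [outward_flux_rate(F, x0, y0, s, s) / (s * s) for s in sizes]

for s, ratio in zip(sizes, ratios):
    print("box side", s, "flux / area =", ratio)
print("divergence at the centre =", exact_divergence(x0, y0))
```

As the box shrinks, the printed ratios should approach the divergence at the centre of the box.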

It may be enlightening to compare the relationship between divergence and the rate of dissipation
of fluid out of a region to the relationship between density and mass. If we are looking at the surface
of a solid (perhaps metal) plate with varying thickness or density of material, and if M (Ω) represents
the mass of that portion of the plate within a region Ω and µ(x) is the density of the plate (in grams
per square centimeter, for instance) at a point x, then we know that if Ω is small enough so that the
density does not change a whole lot within Ω, then M (Ω) ≈ A(Ω)µ, where A(Ω) is the area and µ is
the density at any point within Ω.

We see then that the relationship between divergence and dissipation of fluid from a region is
exactly the same as the relationship between density and mass.

This suggests the following theorem, which in fact is easy to prove using the preceding results:

Theorem. Suppose that a two-dimensional vector field F represents the velocity of a fluid at a
particular moment in time. Let Ω be a finite region in the plane. Then the rate at which fluid flows
out of Ω is given by

\[
\iint_{\Omega} \nabla \cdot \mathbf{F} \; dx\,dy .
\]

(If the integral is negative, then of course fluid is accumulating in the region rather than dissipating.)

How does one mathematically describe “the rate at which fluid is flowing out of a region Ω?”

If Ω is bounded by a simple closed curve C (i. e. one which is connected and does not intersect
itself), let n denote the unit outward normal to this curve. I. e. at a point x on C , n(x) is a unit
vector perpendicular to C (i. e. to the tangent vector to C ) and pointing away from Ω. Then the rate
at which fluid is flowing out of Ω at the point x is given by the product of ||F|| and the cosine of the
angle between F(x) and n(x). (Thus the rate of flow is negative if the angle is obtuse, i. e. if F is
directed toward the interior of Ω.) Since n is a unit vector, this product is given by F(x) • n(x).

The total flow outward from Ω will then be given by the integral of F • n over the curve C . Thus
the preceding Theorem can be restated as follows:

Theorem. Suppose that a two-dimensional vector field F represents the velocity of a fluid at a
particular moment in time. Let Ω be a region bounded by the simple closed curve C and let n denote
the unit outward normal to the curve C . Then

\[
\oint_{C} \mathbf{F}(\mathbf{x}) \cdot \mathbf{n}(\mathbf{x}) \; ds = \iint_{\Omega} \nabla \cdot \mathbf{F} \; dx\,dy .
\]

(Here ds is the differential corresponding to arc length on C .)

This is the two-dimensional case of the Divergence Theorem.

This two-dimensional case can also be proved by means of Green’s Theorem. Parametrize the
curve by functions x(t) and y(t) such that $\sqrt{x'(t)^2 + y'(t)^2} = 1$ . (This is easy to do in principle,
although often difficult in practice because the calculation can be nasty.) Then (possibly after a
reversal of sign) $x'(t)\,\mathbf{i} + y'(t)\,\mathbf{j}$ is the unit tangent vector to C and so $y'(t)\,\mathbf{i} - x'(t)\,\mathbf{j}$ is the unit
outward normal (rotated clockwise from the unit tangent). Let F(x, y) = P (x, y) i + Q(x, y) j. Then

\[
\begin{aligned}
\oint_{C} \mathbf{F}(\mathbf{x}) \cdot \mathbf{n}(\mathbf{x}) \; ds
&= \int_{t_0}^{t_1} \bigl( P(x, y)\,y'(t) - Q(x, y)\,x'(t) \bigr) \, dt \\
&= \oint_{C} -Q \, dx + P \, dy ,
\end{aligned}
\]

and by Green’s Theorem this equals

\[
\iint_{\Omega} \Bigl( \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} \Bigr) dx\,dy = \iint_{\Omega} \nabla \cdot \mathbf{F} \; dx\,dy .
\]
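
As a sanity check on the two-dimensional Divergence Theorem, the following Python sketch evaluates both sides numerically on the unit disk. The field F = x³ i + y³ j is a hypothetical choice made only for illustration; the boundary is parametrized at unit speed by x(t) = cos t, y(t) = sin t, so that the outward normal is y′(t) i − x′(t) j as in the argument above. Both integrals are approximated by midpoint rules.

```python
import math

def F(x, y):
    # Hypothetical field with divergence 3x^2 + 3y^2.
    return (x ** 3, y ** 3)

def div_F(x, y):
    return 3 * x * x + 3 * y * y

def boundary_flux(n_steps=2000):
    """Integral of F . n ds around the unit circle, with n = y'(t) i - x'(t) j."""
    total = 0.0
    dt = 2 * math.pi / n_steps
    for k in range(n_steps):
        t = (k + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        dx_dt, dy_dt = -math.sin(t), math.cos(t)
        P, Q = F(x, y)
        total += (P * dy_dt - Q * dx_dt) * dt
    return total

def area_integral(n_r=400, n_theta=400):
    """Integral of div F over the unit disk, in polar coordinates."""
    total = 0.0
    dr = 1.0 / n_r
    dtheta = 2 * math.pi / n_theta
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            total += div_F(r * math.cos(theta), r * math.sin(theta)) * r * dr * dtheta
    return total

flux = boundary_flux()
area = area_integral()
print("flux through the circle   =", flux)
print("integral of div F on disk =", area)
```

For this field both quantities equal 3π/2 exactly, so the two printed values should agree closely.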

All this reasoning (except for the use of Green’s Theorem) works just as well in three-dimensional
space. One needs to consider a three-dimensional region T , whose boundary will be a surface S
rather than a curve. One gets the three-dimensional theorem which is generally referred to as the
Divergence Theorem.

Divergence Theorem. Suppose that a three-dimensional vector field F represents the velocity of a
fluid at a particular moment in time. Let T be a finite region in three-space bounded by a simple
closed surface S . Then

\[
\iint_{S} \mathbf{F} \cdot \mathbf{n} \; d\sigma = \iiint_{T} \nabla \cdot \mathbf{F} \; dx\,dy\,dz .
\]

The Laplacian and the Heat Equation.

The divergence of the gradient field for a function g , i. e. ∇ • ∇g , is called the Laplacian of the
function. This is relevant to the study of the flow of heat in a solid (usually metal).

It is worth noting that for a gradient field F = ∇g in the plane, F is perpendicular to the level
curves g(x, y) = const, and thus the rate at which F is turning as (x, y) moves in a direction perpendicular
to F is the same as the rate at which the tangent vector to the level curve turns as one moves at unit
speed along the level curve. This is, up to sign, the curvature of the level curve. (By definition,
curvature is always positive.)

[Figure: a level curve g = const with the gradient vector ∇g drawn perpendicular to it.]

The Laplacian of a planar function g(x, y) is given as follows:

(1) If ∇g points away from the direction in which the level curve for g curves at the given
point, then ∇ • ∇g is the sum of the rate at which ||∇g|| changes as one moves in the direction
of ∇g plus the product of ||∇g|| times the curvature of the level curve for g at the given point.
(2) If ∇g points in the same direction towards which the level curve is curving, then ∇ • ∇g is
the rate at which ||∇g|| changes as one moves in the direction of ∇g minus the
product of ||∇g|| and the curvature of the level curve.

For reasons which I’m not completely sure about, functions g with ∇ • ∇g = 0 are often called
harmonic.

Note that any linear function f (x, y) = ax + by is certainly harmonic, as are the functions $x^2 - y^2$
and xy . And by a calculation done above, the function $f(x, y) = \tan^{-1}(y/x)$ is harmonic, since
$\nabla f = -(y\,\mathbf{i} - x\,\mathbf{j})/r^2$, and we have seen that the vector fields $(y\,\mathbf{i} - x\,\mathbf{j})/r^n$ all have divergence 0. A
very important set of harmonic functions consists of those of the form $f(x, y) = e^{ax} \sin ay$ and
$f(x, y) = e^{ax} \cos ay$.

Furthermore, if $u_i(x, y)$ are harmonic functions, then any linear combination $\sum c_i u_i(x, y)$ is also
harmonic. In fact, this is true (at least under reasonable restrictions) even when $\sum c_i u_i$ is an infinite
series.
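
The harmonic functions listed above can be spot-checked with a finite-difference Laplacian. The five-point formula used below is a standard approximation and not something taken from this article; the sample point, the step size, the constant a, and the non-harmonic comparison function x² + y² are assumptions made only for the illustration.

```python
import math

def laplacian(g, x, y, h=1e-3):
    """Five-point finite-difference estimate of g_xx + g_yy at (x, y)."""
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h)
            - 4 * g(x, y)) / (h * h)

a = 2.0  # hypothetical constant for the exponential example

candidates = {
    "x^2 - y^2": lambda x, y: x * x - y * y,
    "xy": lambda x, y: x * y,
    "arctan(y/x)": lambda x, y: math.atan(y / x),
    "e^(ax) sin(ay)": lambda x, y: math.exp(a * x) * math.sin(a * y),
    "x^2 + y^2 (not harmonic)": lambda x, y: x * x + y * y,
}

# Sample point with x > 0, so that arctan(y/x) is smooth there.
values = {name: laplacian(g, 0.9, 0.4) for name, g in candidates.items()}
for name, value in values.items():
    print(name, "->", value)
```

The first four values should be close to zero, while the last should be close to 4.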

Based on the reasoning given above, we can see what it means for a planar function to be
harmonic by looking at its gradient field and level curves. One can distinguish two cases: Case 1 is
when ||∇g|| increases when one moves away from the point (x, y) in the direction of increasing g
(i. e. the direction of ∇g ). And Case 2 is the case when ||∇g|| decreases when one moves in the direction
of ∇g . In Case 1, in order that ∇ • ∇g(x, y) = 0 , the level curve through (x, y) must curve towards
the direction of ∇g and furthermore the product of ||∇g|| times the curvature of the level curve must
equal the rate of increase of ||∇g|| when (x, y) moves away in the direction of ∇g . And in Case 2, the
level curve at (x, y) must curve away from the direction of ∇g and the product of ||∇g|| and the
curvature of the level curve must be the negative of the rate of increase of ||∇g|| when (x, y) moves in
the direction of ∇g .

The Heat Equation. Although heat is energy rather than mass, it flows in a way that makes it
very much like a fluid in terms of mathematical structure. Heat flows from one point to another when
there is a temperature difference between the two points, and the rate of flow is proportional to the
temperature difference, but in the opposite direction (from a point with high temperature to one with
lower). Thus if we look at a metal plate of homogeneous material and constant thickness, and let
u(x, y) be the temperature at a point (x, y), then heat flow is proportional to −∇u . (It is traditional
in texts on partial differential equations to use the variable u for the function being considered.)

As we have mentioned, for a vector field F representing the velocity of a fluid in the plane,
−∇ • F(x, y) gives the rate at which the fluid accumulates at (x, y). More precisely, we look at a
small region Ω surrounding (x, y), and look at the amount of fluid accumulating in Ω in a unit time
interval, and take the ratio of this to the area of Ω. If we then take the limit as we shrink the
region Ω down to the single point (x, y), this value will be −∇ • F.

Now when the “fluid” in question is heat, and the plane corresponds to a metal plate of constant
thickness and homogeneous material, then the rate of accumulation of heat is proportional to the rate
of temperature increase, i. e. ∂u/∂t. Thus the changing temperature in a metal plate will be governed by
the heat equation

\[
a\,\frac{\partial u}{\partial t} = \nabla \cdot \nabla u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} ,
\]

where a is a positive constant of proportionality.

Now if one leaves the plate alone, then eventually the temperature will stabilize, so that
∂u/∂t = 0 . (In principle, it takes infinitely long before the temperature completely stabilizes, but for
practical purposes it usually happens fairly quickly.) Now one might think that if the temperature
stabilizes, this would mean that the plate would have a constant temperature all over. But this will
not be the case if heat is being continually applied (or removed) at points around the boundary of the
plate. For instance, one edge of the plate might be submerged in a bucket of ice water, thus giving it a
temperature of exactly 0° Celsius, and the other edge in a flame or furnace.

Thus we have the steady-state heat equation, also known as Laplace’s Equation,

\[
\nabla \cdot \nabla u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 ,
\]

which describes the temperature distribution in a metal plate where points on the boundary are held
at prescribed temperatures and the temperature distribution over the plate has stabilized.

This is a partial differential equation for the unknown function u(x, y). The problem with this
equation is not that it’s difficult to find solutions. In fact, there are infinitely many solutions to the
equation and many of them are quite well known. (These are the functions that we’ve called
harmonic.) What is challenging is to find a solution that will take the prescribed values on the
boundary of the plate. This sort of problem is known as a boundary value problem.
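
To give a feeling for what a boundary value problem looks like in practice, here is a minimal Python sketch that solves Laplace’s Equation on a square plate by Jacobi relaxation: each interior grid value is repeatedly replaced by the average of its four neighbors. This numerical method is not discussed in this article and is brought in only as an illustration; the grid size, the number of sweeps, and the boundary temperatures (0 on one edge, 100 on the opposite edge, varying linearly along the other two) are all hypothetical choices.

```python
def solve_laplace(boundary, n, sweeps=500):
    """Jacobi relaxation on an (n+1) x (n+1) grid over the unit square.

    boundary(x, y) gives the prescribed temperature on the edges.
    """
    h = 1.0 / n
    u = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(n + 1):
            if i in (0, n) or j in (0, n):
                u[i][j] = boundary(i * h, j * h)    # fixed edge temperatures
    for _ in range(sweeps):
        new = [row[:] for row in u]
        for i in range(1, n):
            for j in range(1, n):
                # Discrete form of u_xx + u_yy = 0: average of the four neighbors.
                new[i][j] = 0.25 * (u[i + 1][j] + u[i - 1][j]
                                    + u[i][j + 1] + u[i][j - 1])
        u = new
    return u

# Edge x = 0 held at 0 degrees, edge x = 1 at 100 degrees,
# with a linear profile along the other two edges.
u = solve_laplace(lambda x, y: 100.0 * x, n=10)
centre = u[5][5]
print("temperature at the centre of the plate:", centre)
```

With these particular boundary values the exact steady state is the harmonic function u = 100x, so the value at the centre should come out close to 50.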

Curl

Unlike divergence, curl is something that only exists in three-dimensional space. It is usually defined
by way of the cross product, and the cross product does not exist in the plane or in four-dimensional
space.

In fact, defining curl as a vector is, in the language of computer programming, a “clever hack.”
Concepts such as curl, angular momentum, and torque really should be second order tensors
(i. e. 3 × 3 matrices). But these particular matrices are skew-symmetric, i. e. they are 0 on the diagonal
and the half below the diagonal is the mirror of the half above, except with change of sign. Thus the
matrix has only three distinct entries, and these can be used as the components of a vector.

Although defining the curl as a cross product works well on the symbolic level in several respects,
it can also be misleading if taken too seriously. For instance, we know that the cross product of two
vectors is perpendicular to each of them. Therefore it is plausible to conclude that ∇ × F will be
perpendicular to F. However this is often not true. For instance if F = z i + x j + y k, then
∇ × F = i + j + k, and this is perpendicular to F only at points on the plane z + x + y = 0 .
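
The example F = z i + x j + y k is easy to check by machine. The sketch below estimates the curl by central differences and evaluates its dot product with F at two sample points, one off the plane z + x + y = 0 and one on it. The helper `curl` and the sample points are assumptions made only for this illustration.

```python
def curl(F, p, h=1e-5):
    """Central-difference estimate of the curl of F at the point p."""
    def d(component, var):
        up, down = list(p), list(p)
        up[var] += h
        down[var] -= h
        return (F(*up)[component] - F(*down)[component]) / (2 * h)
    # (dR/dy - dQ/dz, dP/dz - dR/dx, dQ/dx - dP/dy)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def F(x, y, z):
    return (z, x, y)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

off_plane = (1.0, 2.0, 3.0)     # x + y + z = 6
on_plane = (1.0, 1.0, -2.0)     # x + y + z = 0

curl_off = curl(F, off_plane)
dot_off = dot(curl_off, F(*off_plane))
dot_on = dot(curl(F, on_plane), F(*on_plane))

print("curl F           =", curl_off)
print("curl F . F (off) =", dot_off)
print("curl F . F (on)  =", dot_on)
```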

If we think of a vector field as representing the velocity of a fluid, then the curl corresponds
roughly to the extent to which the fluid is swirling at a particular point. One tends to
think of a vector field with a non-zero curl at a particular point as looking in a neighborhood of the
point somewhat like the picture below.

[Figure: a planar vector field circulating counterclockwise around a circle.]

In fact, as one looks at this planar vector field as the reference point moves around
the circle, one notices that what happens is that as one moves
in the direction of the field and the direction of increasing x
(look at the bottom of the circle), as the vector rotates
counterclockwise, the j-component increases (or becomes
less negative), i. e. the direction of the vector moves upwards.
And as one moves in the direction of increasing y , the
i-component decreases (or becomes more negative). Thus we have ∂Q/∂x and −∂P/∂y both positive, thus
giving a positive value to ∂Q/∂x − ∂P/∂y, which corresponds in this planar example to ∇ × F.
This seems to give a bit of intuitive significance to the formula for curl; however, it is a bit
simplistic in at least two ways. First of all, it is not only the direction of F that contributes to the
curl. Surely if the magnitude of F is decreasing rapidly enough, for instance, then the magnitude of
the x-coordinate and y-coordinate of F will also decrease, regardless of which direction the vector is
turning. Secondly, in this example F turns as we move in the same direction as F. The analysis
breaks down if the turning of F happens when we move in a direction roughly perpendicular to F, as
in Examples 1 and 2. In Example 1 we see that what happens is that as we move around the circle
and x increases, the vector turns in a negative direction (clockwise), but y at first increases and then
decreases. And in fact, in Examples 1 and 2, ∇ × F = 0 .

Thus it is incorrect to think that a non-zero curl corresponds to a twisting of the field.

Example 5. F = −ay i + ax j; ∇ • F = 0; ∇ × F = 2a k.
[Figure: plot of the field F in the xy-plane, circulating around the origin.]

However we will see that a turning of the vector field when one moves in the direction of the field
is indeed one of the things that contributes to curl. In fact, the fundamental and archetypical example
of a planar vector field with non-zero curl is the field
F = −ay i + ax j
"#
0 −a
(Example 5). This is a linear field and its Jacobean matrix is . This represents the velocity
a 0
of a point in the plane if the entire plane is rotated with an angular speed a. (One can think of a
wooden disk being rotated, or an old-fashioned phonograph record on a turntable.) We have
∇ × F = 2a k. Since the entire plane is being rotated, and the axis of rotation is k, this makes sense.
It’s important to remember, though, that curl is something that happens at an individual point, not
on the plane as a whole or merely at the origin. So one should consider that if one is on a rotating
disk, and is walking away from the origin, one will experience a twisting, since the foot which is
further away from the origin will be moving slightly faster than the other one. And in fact, if one
stands on the rotating disk during the time interval in which the disk makes a complete revolution, in
addition to traveling around the disk, one’s body will also be rotated through 360◦ with respect to an
external frame of reference. By the time the disk has made a complete revolution, one will have faced
all four compass points. (Or, if one prefers to think of F as the velocity of a fluid, one can think of a
person trapped in a whirlpool.)

A three-dimensional analog for the field in Example 5 is the field

F(x, y, z) = (−cy + bz) i + (cx − az) j + (−bx + ay) k .

This turns out to be the velocity vector for a rotation around the axis a i + b j + c k. And
∇ × F = 2(a i + b j + c k). I will come back to this example later, but it is easy to verify that for all x,
F(x) is perpendicular to a i + b j + c k and perpendicular to x. (Except that F(x) = 0 if the position
vector x is parallel to a i + b j + c k.)

These rotational examples are fundamental and yet also somewhat misleading. The non-zero curl
here is not merely a consequence of the fact that the vector field rotates around the origin. To see
this, change Example 5 a little, letting

G(x, y) = r^n F ,

where F is the vector field F = −ay i + ax j of Example 5, r = √(x² + y²), and n is an integer. The
differentiation here may seem awkward because of the square root, but we can note that r^n increases
(or decreases, if n < 0 ) most rapidly in the direction radially away from the origin, and the rate of
increase is d/dr (r^n) = n r^(n−1). Thus ∇(r^n) = n r^(n−1) u_r, where u_r = (x i + y j)/r . Note that
since F and u_r are perpendicular and ||F|| = |a|r , we have u_r × F = ar k. Thus

\[
\nabla \times \mathbf{G} = \nabla(r^n) \times \mathbf{F} + r^n\, \nabla \times \mathbf{F}
 = n r^{n-1}\, \mathbf{u}_r \times \mathbf{F} + 2a r^n\, \mathbf{k}
 = a n r^n\, \mathbf{k} + 2a r^n\, \mathbf{k} = (n+2)\, a r^n\, \mathbf{k} .
\]

In particular, if n = −2 then ∇ × G = 0. (Note that if n is negative, then G is discontinuous
at (0, 0). In particular, for this case one should not try to apply Stokes’ Theorem for any region
containing the origin.)
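The formula ∇ × G = (n + 2) a r^n k can be checked numerically. The following is only a sketch: the helper names (G, curl_k, predicted), the sample point and the step size h are illustrative choices, not part of the original argument, and the curl is estimated by central differences rather than computed exactly.

```python
import math

def G(x, y, a=1.0, n=-1):
    """Planar field G = r**n * (-a*y, a*x), returned as the pair (P, Q)."""
    rn = math.hypot(x, y) ** n
    return (-a * y * rn, a * x * rn)

def curl_k(field, x, y, h=1e-5):
    """k-component of the curl, dQ/dx - dP/dy, estimated by central differences."""
    dQ_dx = (field(x + h, y)[1] - field(x - h, y)[1]) / (2 * h)
    dP_dy = (field(x, y + h)[0] - field(x, y - h)[0]) / (2 * h)
    return dQ_dx - dP_dy

def predicted(x, y, a, n):
    """The closed form (n + 2) * a * r**n derived in the text."""
    return (n + 2) * a * math.hypot(x, y) ** n

if __name__ == "__main__":
    for n in (-3, -2, -1, 0, 2):
        estimate = curl_k(lambda x, y: G(x, y, a=1.5, n=n), 1.2, 0.5)
        print(n, estimate, predicted(1.2, 0.5, 1.5, n))
```

For n = −2 the estimate should come out at zero up to rounding, even though the field still circulates around the origin.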

Example 6. Let F(x, y, z) = ay i , where a is a constant. Then ∇ × F = −a k.

This is a horizontal vector field in the plane, and ∇ × F gives us the rate at which F changes as
we move vertically. Although this is not a rotation, the non-zero curl corresponds to a twisting (or
shearing) effect. If we were to walk in the y direction through this force field, we would feel a twisting
effect, since, assuming that a > 0 , the force on our forward foot would be a little greater than that on
the other foot.

Example 6. F = ay i, with ∇ • F = 0 and ∇ × F = −a k.
[Figure: arrows of F in the xy-plane.]

Or if we imagine F as representing the flow of water in a river, and if we were to move a boat in
the y direction, we can see that the river would constantly be trying to turn the boat towards the
x direction, since the force on the bow of the boat would be greater than that on the stern.

However one should not consequently make the simplistic assumption that somehow curl can be
equated to torque. Unfortunately, the calculations just don’t work out. Furthermore, consider the
planar vector field F = ay i + ax j (Example 7). We see that ∇ × F = 0. However if this field
represents the velocity of a current and if we move through this current with a boat (presumably
longer than it is wide) in the j direction then the current will be pushing the boat more strongly at
the bow (front) than at the stern (rear), and consequently will be exerting a torque, trying to turn the
boat clockwise towards the i direction. On the other hand, if we move the boat in the i direction, then the
current will attempt to turn it counterclockwise towards the j direction.

Example 7. F = ay i + ax j = ∇(axy), with ∇ • F = 0 and ∇ × F = 0.
[Figure: arrows of F in the xy-plane.]

This shows that for a vector field F there is not a simple relationship between ||∇ × F|| and the
torque exerted on an object placed in the field F in terms of the area (or volume) of that object.

To understand curl more systematically, start by considering a vector field with constant direction:
F(x, y, z) = f (x, y, z)u ,

where u is a constant unit vector. Then

∇ × F = ∇f × u .

We see then that the curl of F is perpendicular to both the direction of F and to the gradient of f
(i. e. the gradient of ||F|| , assuming that f > 0 ). The curl is zero if the gradient of f is parallel to u,
i. e. to F. And
||∇ × F|| = ||∇f × u|| = ||∇f || sin ϕ = ||∇f || cos ψ ,
where ϕ is the angle between ∇f and F and ψ is the complementary angle (in the plane of ∇f and
F). In other words, in the case of a field of constant direction, ∇ × F measures the rate of change of
||F|| in a direction perpendicular to F.

There are, of course, only two directions perpendicular to both ∇f and F (assuming that these
two are not parallel), and if w is a vector representing one of these directions, then −w represents the
other. The correct choice for the direction of ∇ × F (or, in the planar case, for the sign of the curl, if
one thinks of the curl as always being a multiple of k) will be determined by the right-hand rule. For
a field in the plane, ∇ × F will be a positive multiple of k when ∇f is in a clockwise direction
from u, i. e. f increases in a direction clockwise from u (but not necessarily perpendicular to it).
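As a check against a field met earlier (a worked instance added for illustration), the field F = ay i of Example 6 has constant direction u = i with f = ay, so that ∇f = a j and

\[
\nabla \times \mathbf{F} = \nabla f \times \mathbf{u} = a\,\mathbf{j} \times \mathbf{i} = -a\,\mathbf{k} ,
\]

in agreement with the value stated in Example 6. For a > 0 the gradient a j lies counter-clockwise from u = i, and the curl is accordingly a negative multiple of k.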

On the other hand, consider a vector field in the plane with constant magnitude 1:

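U(x, y) = cos γ i + sin γ j ,

where γ(x, y) is the angle between U and i . Then

\[
\begin{aligned}
\nabla \times \mathbf{U}
 &= \Bigl(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\Bigr)\mathbf{k}
  = \Bigl(\frac{\partial}{\partial x}(\sin\gamma) - \frac{\partial}{\partial y}(\cos\gamma)\Bigr)\mathbf{k} \\
 &= \Bigl(\cos\gamma\,\frac{\partial\gamma}{\partial x} + \sin\gamma\,\frac{\partial\gamma}{\partial y}\Bigr)\mathbf{k} \\
 &= \Bigl((\cos\gamma\,\mathbf{i} + \sin\gamma\,\mathbf{j}) \bullet
     \bigl(\frac{\partial\gamma}{\partial x}\,\mathbf{i} + \frac{\partial\gamma}{\partial y}\,\mathbf{j}\bigr)\Bigr)\mathbf{k} \\
 &= (\mathbf{U} \bullet \nabla\gamma)\,\mathbf{k} .
\end{aligned}
\]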

Since U is by assumption a unit vector, the scalar in parentheses here is the directional derivative of
γ in the direction of U.

I. e. for a vector field U in the plane with constant magnitude 1, ∇ × U shows the rate at which U
is turning as (x, y) moves in the direction of U.

Since this theorem is so simple, it is very tempting to believe that it would also hold for vector
fields in three dimensions. However this is not the case. Consider the following example.

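Example 8.
\[
\mathbf{F}(x, y, z) = \frac{-y\,\mathbf{i} + x\,\mathbf{j} + \sqrt{3}\,r\,\mathbf{k}}{2r}
 = \frac{-y\,\mathbf{i} + x\,\mathbf{j}}{2r} + \frac{\sqrt{3}}{2}\,\mathbf{k} ,
\]
where r = √(x² + y²) is the distance from the z-axis, so that F does not depend on z. We have
||F|| = 1 . Furthermore, the Jacobian matrix for F, and hence also the curl, are the same as for the
planar field ½ G, where G = (−y i + x j)/r , which was considered immediately after Example 5. We
found that ∇ × G = k/r . Thus
\[ \nabla \times \mathbf{F} = \frac{\mathbf{k}}{2r} . \]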

Since G is a planar vector field with constant magnitude 1, the Theorem then tells us that the
directional derivative of G in the direction of −y i + x j is 1/r , so the directional derivative of F in
this direction is 1/2r . Since F has constant magnitude 1, this is the same as the rate at which F is
turning when (x, y) moves in this direction. Now F is at an angle of 30◦ to k and F does not
change when z increases. From this we can see that the directional derivative of F in the
direction of F is 1/2 the directional derivative in the direction of −y i + x j. Thus for this
three-dimensional case of a vector field with constant magnitude 1, the curl is not the same as the rate
at which F turns as (x, y) moves in the direction of F.
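The two numbers compared in Example 8 can be estimated numerically. This is a rough sketch under stated assumptions: r is taken to be the distance from the z-axis, √(x² + y²), which is what ||F|| = 1 requires; the function names, the sample point p and the step size h are illustrative; and derivatives are approximated by central differences. Since F has constant magnitude 1, the length of its directional derivative along a unit vector is the rate at which F turns in that direction.

```python
import math

def F(x, y, z):
    """The unit field of Example 8, with r the distance from the z-axis."""
    r = math.hypot(x, y)
    return (-y / (2 * r), x / (2 * r), math.sqrt(3) / 2)

def directional_derivative(field, p, d, h=1e-5):
    """Central-difference derivative of field at p along the unit vector d."""
    fwd = field(*(pi + h * di for pi, di in zip(p, d)))
    bwd = field(*(pi - h * di for pi, di in zip(p, d)))
    return tuple((f - b) / (2 * h) for f, b in zip(fwd, bwd))

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def curl(field, p, h=1e-5):
    """Curl assembled from the three columns of the (estimated) Jacobian."""
    dx = directional_derivative(field, p, (1, 0, 0), h)
    dy = directional_derivative(field, p, (0, 1, 0), h)
    dz = directional_derivative(field, p, (0, 0, 1), h)
    return (dy[2] - dz[1], dz[0] - dx[2], dx[1] - dy[0])

p = (2.0, 0.0, 1.0)                      # a sample point with r = 2
curl_size = norm(curl(F, p))             # the text predicts 1/(2r)
turning_along_F = norm(directional_derivative(F, p, F(*p)))  # the text predicts 1/(4r)
```

At this sample point the two quantities differ by a factor of 2, which is the discrepancy the example is pointing out.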

In fact, a close look at this example reveals that it would be impossible for a theorem to hold
stating that ∇ × F is the rate at which F turns when (x, y, z) moves in the direction of F. This is
because ∇ × F is completely determined by the Jacobian matrix J for the field F at the particular
point (x, y, z), whereas the suggested theorem is stated in terms of the direction of F. And J does not
tell us either the direction or the magnitude of F; it only tells us about how F changes.

In general, a planar vector field can be written as a product F(x, y) = f (x, y)U(x, y), where
f (x, y) is a scalar function and U(x, y) is a vector field with constant magnitude 1. Then

(∗) ∇ × F = (∇f) × U + f (∇ × U) .

Theorem A. For a general vector field F in two dimensional space, ∇ × F will be the product of k
with the sum of the following two terms:
(1) The rate at which ||F|| changes as one moves at unit speed in a direction 90° clockwise to F;
(2) The product of ||F|| and the rate at which F turns as one moves in the direction of F at unit
speed.

One can also easily derive this result by using a change of coordinates. Fix a point (x0 , y0 ) and
introduce a new coordinate system so that i is in the direction of F at (x0 , y0 ). Then if
F(x, y) = P i + Q j, the directional derivative of F in the i direction is (∂P/∂x) i + (∂Q/∂x) j.
Recall that the first component here, which is in the direction of F, gives the rate at which ||F|| is
changing as we move in the direction of F, and the second component, viz. ∂Q/∂x (x0 , y0 ), equals the
product of ||F(x0 , y0 )|| and the rate at which F is turning as one moves in the direction
of F(x0 , y0 ). On the other hand, the directional derivative of F in the − j direction is given by
−(∂P/∂y) i − (∂Q/∂y) j. The component −(∂P/∂y) i is in the direction of F and thus −∂P/∂y is the
rate of change of ||F|| as one moves away from (x0 , y0 ) in the − j direction, i. e. the direction
90° clockwise to F(x0 , y0 ). But ∇ × F = (∂Q/∂x − ∂P/∂y) k. Thus the theorem follows.

The Theorem is not meant as a practical method for computing ∇ × F, since the basic formula is
already quite simple to use in most cases. Instead, it is a way of trying to find the conceptual intuitive
meaning of ∇ × F. However there are a few cases where this approach is slightly simpler than doing
the differentiations. For instance, consider the planar vector field

\[ \mathbf{F} = \frac{x\,\mathbf{i} + y\,\mathbf{j}}{r^n} \]
with r = √(x² + y²). Since this field is directed radially away from the center, the direction of F(x, y)
does not change as (x, y) moves in the direction of F. Furthermore, ||F|| is constant on the circles
x² + y² = const, hence the directional derivative of ||F|| is zero in the direction perpendicular to the
direction of F. Thus we see that ∇ × F = 0.
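For comparison, here is the direct differentiation that the Theorem lets one avoid (a verification added for illustration, writing P = x r^{-n} and Q = y r^{-n} and using ∂r/∂x = x/r, ∂r/∂y = y/r):

\[
\frac{\partial Q}{\partial x} = y \cdot (-n)\, r^{-n-1} \cdot \frac{x}{r} = -n\,xy\,r^{-n-2},
\qquad
\frac{\partial P}{\partial y} = x \cdot (-n)\, r^{-n-1} \cdot \frac{y}{r} = -n\,xy\,r^{-n-2},
\]

so that ∂Q/∂x − ∂P/∂y = 0 away from the origin, as the Theorem predicts.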

Also look again at the field

G = r^n (−ay i + ax j)

which we considered immediately after Example 5. This field is tangent to the circles x² + y² = const,
and thus the rate at which G turns as (x, y) moves in the direction of G equals the curvature of the
circle around the origin through (x, y), namely 1/r . Since ||G|| = a r^(n+1) , the second summand
indicated in the Theorem equals a r^(n+1) (1/r) k = a r^n k. Furthermore, the directional derivative
of ||G|| as one moves in a direction 90° clockwise to G, i. e. along a radial line away from the center, is
a(n + 1) r^n , so the first summand indicated in the Theorem equals a(n + 1) r^n k. Thus one gets
∇ × G = (n + 2) a r^n k, as previously calculated.

This is not the end of the discussion of curl, but the rest will have to be postponed until I talk
about some topics in linear algebra.

Eigenvectors and Eigenvalues.

As mentioned, in a small neighborhood of a point (x0 , y0 ), a vector field in the plane can be
closely approximated by one of the form

F(x, y) = P (x, y) i + Q(x, y) j = C + (a1 x + a2 y) i + (b1 x + b2 y) j,

where C is a constant vector. The Jacobian matrix corresponding to this field is
\[ J = \begin{pmatrix} a_1 & a_2 \\ b_1 & b_2 \end{pmatrix} . \]

The nature of this matrix, and consequently the behavior of the original vector field within a small
neighborhood, can be understood in terms of the eigenvalues and eigenvectors.

To say that a non-zero vector v is an eigenvector for a matrix A with corresponding
eigenvalue c is to say that Av = cv . Any non-zero multiple of an eigenvector is also an eigenvector
with the same eigenvalue, so that an eigenvector really corresponds more to a direction than to a
particular vector.

As examples of eigenvectors, we can notice that the matrix
\[ \begin{pmatrix} 9 & -2 \\ -2 & 6 \end{pmatrix} \]
has the eigenvectors i + 2 j and 2 i − j with corresponding eigenvalues 5 and 10. In fact,
\[
\begin{pmatrix} 9 & -2 \\ -2 & 6 \end{pmatrix}\begin{pmatrix} 1 \\ 2 \end{pmatrix}
 = \begin{pmatrix} 5 \\ 10 \end{pmatrix} = 5 \begin{pmatrix} 1 \\ 2 \end{pmatrix},
\qquad
\begin{pmatrix} 9 & -2 \\ -2 & 6 \end{pmatrix}\begin{pmatrix} 2 \\ -1 \end{pmatrix}
 = \begin{pmatrix} 20 \\ -10 \end{pmatrix} = 10 \begin{pmatrix} 2 \\ -1 \end{pmatrix} .
\]

Likewise the matrix
\[ \begin{pmatrix} 5 & 1 \\ 1 & 5 \end{pmatrix} \]
has the eigenvectors i + j and i − j with corresponding eigenvalues 6 and 4.
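Claims of this kind are easy to check mechanically. The sketch below uses exact rational arithmetic; the helper names mat_vec and eigenvalue_for are illustrative and not taken from the text. It only tests whether a given vector is an eigenvector and does not search for eigenvectors.

```python
from fractions import Fraction

def mat_vec(m, v):
    """Product of a square matrix (list of rows) with a column vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

def eigenvalue_for(m, v):
    """Return c if m v == c v exactly, otherwise None. v must be non-zero."""
    w = mat_vec(m, v)
    k = next(i for i, vi in enumerate(v) if vi != 0)  # first non-zero entry of v
    c = Fraction(w[k], v[k])
    return c if all(wi == c * vi for wi, vi in zip(w, v)) else None

A = [[9, -2], [-2, 6]]
B = [[5, 1], [1, 5]]

if __name__ == "__main__":
    for matrix, vector in ((A, [1, 2]), (A, [2, -1]), (B, [1, 1]), (B, [1, -1]), (A, [1, 0])):
        print(matrix, vector, eigenvalue_for(matrix, vector))
```

The last pair, the matrix A with the vector i, is included as a vector that is not an eigenvector, for which the function returns None.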

It is shown in linear algebra that the eigenvalues for an n × n matrix are the roots of a certain
polynomial of degree n. It is also a fact that every polynomial of odd degree must have at least one
real root, since its graph must cross the x-axis.

It is a theorem in linear algebra that every 3 × 3 matrix, or for that matter, every n × n matrix
for odd n, has at least one real eigenvector and corresponding eigenvalue.

Counting Eigenvectors. We have already mentioned that any multiple of an eigenvector is also an
eigenvector, for the same eigenvalue. When we are counting eigenvectors, we don’t want multiples to
count as separate. On the other hand, it is possible for certain matrices that two vectors v1 and
v2 which are not multiples of each other are both eigenvectors corresponding to the same eigenvalue.
This happens, for instance, with the matrix
\[ \begin{pmatrix} a & 0 & 0 \\ 0 & a & 0 \\ 0 & 0 & 1 \end{pmatrix} , \]
where both i and j are eigenvectors
corresponding to the same eigenvalue a. In this case, it is easy to see that any non-zero combination
rv1 + sv2 is also an eigenvector corresponding to the same eigenvalue. We don’t want to count these
infinitely many eigenvectors separately, so in counting we indicate that this matrix has two linearly
independent eigenvectors corresponding to the given eigenvalue, but no more.

An n × n matrix can have at most n linearly independent eigenvectors, since more than n vectors
in Rn cannot be linearly independent. Also, it can be proven (fairly easily) that any set of eigenvectors
corresponding to distinct eigenvalues is always linearly independent. So it is only when two or more
eigenvectors correspond to the same eigenvalue that the issue of linear independence arises.

It is known from linear algebra that a symmetric n × n matrix (see below) always has n linearly
independent eigenvectors. (It is possible that this can also happen for matrices which are not
symmetric.)

If an n × n matrix has n linearly independent eigenvectors, then the trace of the matrix is the
sum of the corresponding eigenvalues.

Thus in a lot of cases we will be able to interpret the divergence at a particular point of a vector
field in terms of the eigenvalues of the Jacobean matrix.

In terms of vector fields, consider a planar vector field F with corresponding Jacobian matrix J
at a point (x0 , y0 ) and suppose v is an eigenvector for J with corresponding eigenvalue c. Let us
suppose that v is fairly small (as we may, since only the direction is crucial). Now if (x, y) is another
point such that (x, y) − (x0 , y0 ) = ∆x = v , then J ∆x = c ∆x, so that
\[
\mathbf{F}(x, y) - \mathbf{F}(x_0, y_0) = \Delta\mathbf{F}(x_0, y_0) \approx J\,\Delta\mathbf{x}
 = c \begin{pmatrix} x - x_0 \\ y - y_0 \end{pmatrix} .
\]

The corresponding equation in three dimensions, or for that matter in space of any dimensionality, is
equally valid.

We want to define the concept of an eigenvector for a vector field F in such a way that an
eigenvector for F is the same as an eigenvector for the Jacobean matrix. This can be accomplished by
use of the directional derivative. Unfortunately, there’s a slight technicality in that directional
derivatives are normally only defined in terms of a unit vector in a given direction, but it’s
inconvenient to require that eigenvectors be unit vectors.

Definition. A vector v is an eigenvector for a vector field F at a point x if the directional
derivative for F in the direction of v at the point x is a multiple of v . If u = v/ ||v|| and
D_u F = c u, then c is called the eigenvalue corresponding to v .

Note that for a linear vector field

F(x, y) = P (x, y) i + Q(x, y) j = F(0, 0) + (a1 x + a2 y) i + (b1 x + b2 y) j ,

the Jacobian matrix J is constant. From the above, this means that the vector field F looks the
same, no matter what point we look at, except for the summand F(x0 , y0 ):
\[
\mathbf{F}(x, y) = \mathbf{F}(x_0, y_0)
 + \begin{pmatrix} a_1 & a_2 \\ b_1 & b_2 \end{pmatrix}\begin{pmatrix} x - x_0 \\ y - y_0 \end{pmatrix}
 = \mathbf{F}(x_0, y_0)
 + \begin{pmatrix} a_1 (x - x_0) + a_2 (y - y_0) \\ b_1 (x - x_0) + b_2 (y - y_0) \end{pmatrix} .
\]

Thus for a linear field we can get the general idea by looking at F(x0 , y0 ) where (x0 , y0 ) is the origin.
Furthermore, it makes it a lot easier to think about it if we replace F by the vector
field F − F(x0 , y0 ), which has the same Jacobian. In other words, often we might as well consider the
case where F(x0 , y0 ) = F(0, 0) = 0.

Note that if F is a linear vector field with F(0, 0) = 0, then all the vectors making up F are linear
combinations of the columns of the Jacobian matrix J :
\[
\mathbf{F}(x, y) = \begin{pmatrix} a_1 & a_2 \\ b_1 & b_2 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}
 = x \begin{pmatrix} a_1 \\ b_1 \end{pmatrix} + y \begin{pmatrix} a_2 \\ b_2 \end{pmatrix} .
\]

If every two-dimensional vector is an eigenvector for J , this says that
\[ J = \begin{pmatrix} c & 0 \\ 0 & c \end{pmatrix} , \]
which is to say that for all points (x, y), F(x, y) = cx i + cy j = cx, in other words, the vector
field F is radially directed away from the origin (or towards it if c is negative).

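Otherwise, for a linear vector field F with F(0, 0) = 0 , we see that x = x i + y j is an eigenvector
of J , with corresponding eigenvalue c, if
\[
\mathbf{F}(x, y) = \mathbf{F}(0, 0) + J \begin{pmatrix} x \\ y \end{pmatrix}
 = \mathbf{0} + \begin{pmatrix} cx \\ cy \end{pmatrix} = \begin{pmatrix} cx \\ cy \end{pmatrix} .
\]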

In other words, for a linear vector field with F(0, 0) = 0 , the eigenvectors for J correspond to those
directions such that when (x, y) lies in that direction from the origin, F(x, y) is directed radially away
from (or towards) the origin.

For a linear vector field F with F(0, 0) = 0 , the eigenvectors can be recognized as those vectors
in F which lie on straight lines through the origin, and also those position vectors x (aside from
the origin) such that F(x) = 0.

Thus if J has two linearly independent eigenvectors, then when we look at the vector field F, we
will see that some of the vectors in the field form two lines emanating from the origin (or directed
toward the origin), and all the other vectors making up F will be pointing in at least slightly skewed
directions. In fact, since for a linear vector field the Jacobean matrix is the same at all points, we see
that corresponding to a planar vector field F with two linearly independent eigenvectors v and w ,
there will be two key directions, the directions of v and w . The difference between any two
vectors F(x, y) along a straight line with the direction of v will be a multiple of v and likewise
for w . In particular, if a straight line in the direction v contains (0, 0), or any point (x0 , y0 ) with
F(x0 , y0 ) in the direction of v , then F(x, y) will be in the direction of v for all the points on that
line. (And likewise, of course, for lines in the direction w .)

In Example 3, i is an eigenvector corresponding to the eigenvalue 1/2 and j is an eigenvector
corresponding to −1/2 . In Example 7, where the Jacobian matrix has rows (0, a) and (a, 0), i + j is an
eigenvector corresponding to the eigenvalue a and i − j is an eigenvector corresponding to the
eigenvalue −a .

It is shown in linear algebra that every n × n matrix has at least one eigenvalue and corresponding
eigenvector, provided that we allow complex numbers as eigenvalues and entries in eigenvectors.
However since we are concerned here only with real numbers, it is possible that there may be no (real)
eigenvector for a matrix J . The most standard example of this is a matrix corresponding to a linear
transformation which rotates vectors through an angle α , combined with scaling by a factor m. We
have
\[ J = \begin{pmatrix} m\cos\alpha & -m\sin\alpha \\ m\sin\alpha & m\cos\alpha \end{pmatrix} \]
and
F = (xm cos α − ym sin α) i + (xm sin α + ym cos α) j + F(0, 0) .

If F is a vector field which has J as its Jacobian matrix and if F(0, 0) = 0 , then at every
point (x, y), F(x, y) is turned counter-clockwise at an angle α from the radius vector x i + y j and, if
m > 0 , then ||F|| = mr , where r = √(x² + y²). In this case, we have ∇ • F = 2m cos α and
∇ × F = 2m sin α k.
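The absence of real eigenvectors can also be read off from the characteristic polynomial (a standard linear-algebra computation, supplied here for illustration). For a 2 × 2 matrix the eigenvalues are the roots of λ² − (trace) λ + (determinant), which for this J is

\[
\lambda^2 - 2m\cos\alpha\,\lambda + m^2 = 0,
\qquad
\text{with discriminant}\quad 4m^2\cos^2\alpha - 4m^2 = -4m^2\sin^2\alpha .
\]

The discriminant is negative whenever m ≠ 0 and sin α ≠ 0, so in that case there are no real eigenvalues. When sin α = 0 the matrix is ±m times the identity and every non-zero vector is an eigenvector.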

The matrix
\[ \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} , \]
in particular, represents a rotation of 90° clockwise (the case m = 1, α = −90°), and so has no
(real) eigenvectors and eigenvalues. This matrix is the Jacobian matrix for the vector field
G = y i − x j + G(0, 0) .

This may be a good moment to point out that when using the term “rotate,” it is easy to get
confused between two, or perhaps three, different things. A linear vector field is completely
determined by its Jacobian matrix J , but the vector field is not the same thing as the matrix, as one
sees with the example G = ay i − ax j , where
\[ J = \begin{pmatrix} 0 & a \\ -a & 0 \end{pmatrix} . \]
This vector field G gives the
velocity vector for a rotation of the plane with an angular velocity of a. On the other hand, J is the
matrix of a linear transformation that rotates vectors through 90° , combined with an expansion by a
factor of a. This represents the fact that if we move away from a point x by a displacement ∆x, then
∆G will be obtained by rotating ∆x through an angle of 90◦ and multiplying its magnitude by a.
Since G is linear and G(0, 0) = 0 , this also represents the fact that we can obtain G(x) by rotating
the position vector x by 90◦ and scaling by a. It is the scalar a, not the angle 90◦ , that gives the
angular speed of the rotation of the plane for which G is the velocity vector. (It will also be true that
the vector G(x, y) will be turning, or one might sometimes say rotating, when the point (x, y) moves.)

Although eigenvectors are by definition non-zero, 0 is an allowable eigenvalue. If v is an eigenvector
corresponding to the eigenvalue 0, then J v = 0 . Thus an n × n matrix J has 0 as one of its
eigenvalues if and only if J is a singular matrix. From linear algebra, we know that this is the case
when the determinant of J is zero. There will then be a line through the origin (in the direction v )
along which F is constant (thus F = 0 on this line in the special case F(0, 0) = 0 ).

Two by Two Matrices. In the case of a planar linear field for which there is an eigenvector v
corresponding to the eigenvalue 0, there are now two possibilities. If there are any points (x, y) where
F(x, y) is not parallel to v , let w = F(x, y) for such a point. Then v and w are linearly independent,
so that every vector in R² , in particular the position vector for any point, is a linear combination of v
and w . Since F(x, y) remains constant when moving in the direction v , we can see that F must be
parallel to w in the whole plane (except along the line where it is 0). In particular, F( i) and F( j)
will be multiples of w , which is to say that the two columns of J are both multiples of w .
Furthermore, F(w) will be a multiple of w . (F(w) ≠ 0 , otherwise F(x) = 0 for every x ∈ R² .) Thus
w will be a second eigenvector for J , with the corresponding eigenvalue being non-zero.

If, on the other hand, F ≠ 0 and F(v) = 0 and the entire vector field F is parallel to v , then
F(F(x)) = 0 for every x ∈ R² , so that J² is the zero matrix. F can have no second eigenvector (or
eigenvalue) since if w were such an eigenvector, then F(w) = cw where c ≠ 0 (otherwise it would
follow that F = 0, since F(v) = 0 ), but
0 = F(F(w)) = F(cw) = cF(w) = c² w ,
a contradiction.

In either case, if F is a linear planar vector field and F(0) = 0 and F has an eigenvector
corresponding to the eigenvalue 0, then F has constant direction.

Restated in terms of matrices rather than vector fields, we have the following:

Two by Two Theorem. For a non-zero 2 × 2 matrix J , there are three possibilities.
(1) Neither of the two columns of J is a multiple of the other.
(2) The two columns are multiples of each other or one column is zero. Furthermore, if w is a
non-zero column, then J w ≠ 0. In this case, w is an eigenvector for J corresponding to a non-zero
eigenvalue. And J has a second eigenvector corresponding to the eigenvalue 0.
(3) The two columns of J are multiples of each other (or one of them is 0) and if w is either of
these columns then J w = 0 . In this case, w (if not zero) is the only eigenvector for J and
corresponds to the eigenvalue 0. Furthermore, in this case J² is the zero matrix.
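The three cases can be told apart mechanically. The sketch below assumes a 2 × 2 matrix given as two rows of exact numbers (integers or fractions); the function name two_by_two_case is illustrative. It uses the fact, noted in the text, that Case (1) is the non-singular case, and otherwise applies J to a non-zero column w.

```python
def two_by_two_case(J):
    """Return 1, 2 or 3 according to the Two by Two Theorem.

    J is ((a, b), (c, d)), so its columns are (a, c) and (b, d).
    """
    (a, b), (c, d) = J
    if a == b == c == d == 0:
        raise ValueError("the theorem is stated for non-zero matrices")
    if a * d - b * c != 0:
        return 1  # columns independent: J is non-singular
    # Columns are dependent; pick a non-zero column w and test whether J w = 0.
    w = (a, c) if (a, c) != (0, 0) else (b, d)
    Jw = (a * w[0] + b * w[1], c * w[0] + d * w[1])
    return 3 if Jw == (0, 0) else 2

if __name__ == "__main__":
    print(two_by_two_case(((0, -1), (1, 0))))   # a rotation matrix
    print(two_by_two_case(((-2, 2), (0, 0))))   # rows (-a, a), (0, 0) with a = 2
    print(two_by_two_case(((0, 3), (0, 0))))    # rows (0, a), (0, 0) with a = 3
```

With floating-point entries the exact comparisons with zero would need to be replaced by a tolerance.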

Case (1) (the case where J is a non-singular matrix) could be divided into still further subcases,
but it is Cases (2) and (3) that we’re really interested in at the moment.

The Jacobian matrix for a planar vector field with either constant direction or constant magnitude
will fall under Case (2) or Case (3). (However fields with constant magnitude are never linear, apart
from constant fields.)

Examples 1, 2, and 3 are all examples of Case (1) vector fields.

Example 4, with
\[ J = \begin{pmatrix} -a & a \\ 0 & 0 \end{pmatrix} , \]
is an example of Case (2). The eigenvector i corresponds to the
non-zero eigenvalue −a and i + j is an eigenvector corresponding to the eigenvalue 0. On the line ℓ
through the origin with equation y = x, F is 0. More generally, any time we move away from a given
point in the direction i + j, F does not change. We have ∇ • F = −a.

Example 6, with Jacobian matrix
\[ \begin{pmatrix} 0 & a \\ 0 & 0 \end{pmatrix} \]
(for a ≠ 0 ), is an example of Case (3). The only
eigenvector is a i (or any multiple of it, in particular i ), corresponding to the eigenvalue 0.

Trajectories. The same ideas we have been using for linear vector fields can be applied to look at
any vector field F at any point x0 , if we remember that we are getting information about a very good
approximation to F in a small neighborhood of x0 . For eigenvectors to be visually apparent, though,
one really needs to look at the field F − F(x0 ).

The best way to see the field visually may be to look at the trajectories or integral curves for
the field through a given point. These are the curves x(t) whose tangent vectors belong to the
field F, i. e. d/dt x(t) = F(x(t)). (This is the curve that will be followed by a cork placed in the
field, if the field is two-dimensional and represents the surface of a moving stream.)
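A trajectory can be approximated numerically by stepping the equation d/dt x(t) = F(x(t)) forward in time. The sketch below uses the classical fourth-order Runge-Kutta step, which is one common choice and not something the text prescribes; the function names, the step size and the starting point are illustrative. It is applied to the rotation field of Example 5 with a = 1, whose trajectories should be circles about the origin.

```python
import math

def rotation_field(p, a=1.0):
    """The field of Example 5, F = -a*y i + a*x j, at the point p = (x, y)."""
    x, y = p
    return (-a * y, a * x)

def rk4_step(field, p, dt):
    """One classical Runge-Kutta step for dx/dt = field(x)."""
    def shifted(q, k, s):
        return tuple(qi + s * ki for qi, ki in zip(q, k))
    k1 = field(p)
    k2 = field(shifted(p, k1, dt / 2))
    k3 = field(shifted(p, k2, dt / 2))
    k4 = field(shifted(p, k3, dt))
    return tuple(pi + dt / 6 * (c1 + 2 * c2 + 2 * c3 + c4)
                 for pi, c1, c2, c3, c4 in zip(p, k1, k2, k3, k4))

def trajectory(field, start, dt, steps):
    """List of points on the approximate integral curve through start."""
    points = [start]
    for _ in range(steps):
        points.append(rk4_step(field, points[-1], dt))
    return points

# Roughly one full revolution (total time 6.28, a little less than 2*pi).
path = trajectory(rotation_field, (1.0, 0.0), dt=0.01, steps=628)
```

Plotting the points of path, or printing their distances from the origin, shows the cork staying on the unit circle to within the accuracy of the method.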

We’ve seen that for a linear vector field F with F(0) = 0, the eigenvectors can be recognized as
those vectors F which lie along straight lines through the origin, as well as those position vectors x
such that F(x) = 0 . For a non-linear field, the difference is that one should look for trajectories
instead of straight lines. Furthermore, since the Jacobean matrix J is not constant, one cannot
assume that the point x0 of interest is the origin.

The eigenvectors for F(x) − F(x0 ), at the point x0 will correspond to the trajectories
for F − F(x0 ) that go through x0 . (One also needs to include here curves through x0 on which
F(x) − F(x0 ) = 0 .)

(Strictly speaking, according to the definition given, one could object that a curve x(t) going
through x0 cannot be considered a trajectory at x0 = x(t0 ) since it doesn’t make sense to say that
x′(t0 ) = F(x) − F(x0 ) at x = x0 , because F(x) − F(x0 ) is zero there, hence no direction is specified.
But visually, a curve x(t) will be seen to be a trajectory for F at x0 if for points on the curve very
close to x0 , the tangent vector to the curve is headed directly towards, or directly away from, x0 . To
express this more formally, the condition required is that if x = x(t) is very close to x0 = x(t0 ), so
that ∆x = x − x0 is small, then the tangent vector to the curve at x should be in the same direction
as ∆x. But the tangent vector to the curve at x is F(x) − F(x0 ), since the curve is a trajectory
for F − F(x0 ). So this is the same as saying that ∆x is an eigenvector for F − F(x0 ), and also that
for this specific x, F(x) − F(x0 ) is an eigenvector for F − F(x0 ). This little technicality also explains
how it is possible for more than one trajectory for F − F(x0 ) to go through the point x0 .)

It is fairly easy to see these in the examples shown in the preceding graphics, since it is easy to
visualize the trajectories. For instance, in Example 7, the only trajectories which pass through the
origin are the straight lines with slopes of ±1 , corresponding to the eigenvectors i + j and i − j at
the origin. (The Jacobian matrix is
\[ \begin{pmatrix} 0 & .4 \\ .4 & 0 \end{pmatrix} . ) \]

If we look at a vector field as representing the velocity of a moving fluid, then what this says is
that the eigenvectors at a point x0 correspond, roughly speaking, to little streams within the current
that flow either directly towards or directly away from x0 , and the corresponding eigenvalues
correspond to the speeds of these streams, except that, contrary to the usual usage, we here use the
word “speed” with the understanding that it may take negative as well as positive values. An
eigenvector corresponding to the eigenvalue 0 would correspond to a line (or curve, unless the scope of
our vision is totally microscopic) going through the point in question where the fluid is absolutely
motionless. (Think of a vertical curve climbing through the eye of a hurricane, for example.)

If there are three linearly independent eigenvectors for the field (or two, in the planar case), then
the divergence of the field is the sum of the corresponding eigenvalues. With the interpretation we
have given, it is thus easy to see the fact that the divergence equals the rate at which fluid disappears
(or accumulates, if the divergence is negative) at the given point.

We mentioned earlier that the canonical example of a matrix with no real eigenvectors is
\[ \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} . \]

This is the matrix for the linear transformation which rotates each vector counterclockwise through an
angle of α . It is the Jacobean matrix for the planar linear vector field

F = (x cos α − y sin α) i + (x sin α + y cos α) j

(Example 9). If 0 < α < π/2 , as in the picture, then we see that F is the velocity vector for the
motion of a fluid which is swirling around the origin and spiraling outward. We have ∇ • F = 2 cos α ,
corresponding to the fact that fluid is moving outward away from the origin, even though there are
no trajectories showing movement directly away from the origin.

Example 9. F = a(x cos α − y sin α) i + a(x sin α + y cos α) j, with ∇ • F = 2a cos α and
∇ × F = 2a sin α k.
[Figure: arrows of F in the xy-plane.]

Curl and Skew-symmetric Matrices

As already mentioned, curl is a phenomenon that occurs in three-dimensional space. The
two-dimensional fields we have looked at so far are enlightening up to a point, but they really only
partially address the concept. In particular, so far we have no insight whatsoever into the significance
of the direction of the vector ∇ × F.

To penetrate this mystery we will need a few simple concepts from matrix theory. The (main)
diagonal of a matrix consists of the entries running diagonally from the upper left corner to the
lower right corner:
\[ \begin{pmatrix} D & * & * \\ * & D & * \\ * & * & D \end{pmatrix} . \]
The transpose A^tr of a matrix A is its mirror image if one lays a mirror along the main diagonal.
(Another way of saying this is that the columns of the transpose are the same as the rows of the
original matrix.) A matrix is symmetric if it is the same as its transpose, i. e. if the entries above the
diagonal are the mirror image of the ones below. And a matrix is skew-symmetric if it is the
negative of its transpose, i. e. the entries above the main diagonal are the mirror image of the ones
below except for a change in sign. (The diagonal entries of a skew-symmetric matrix must be 0.)

For example, if
\[
A = \begin{pmatrix} 1 & 2 & 3 \\ 5 & 6 & 7 \\ 10 & 11 & 12 \end{pmatrix},
\quad\text{then}\quad
A^{tr} = \begin{pmatrix} 1 & 5 & 10 \\ 2 & 6 & 11 \\ 3 & 7 & 12 \end{pmatrix} .
\]
The matrix
\[ \begin{pmatrix} 7 & 4 & 5 \\ 4 & 8 & 6 \\ 5 & 6 & 10 \end{pmatrix} \]
is symmetric and
\[ \begin{pmatrix} 0 & 6 & 7 \\ -6 & 0 & 8 \\ -7 & -8 & 0 \end{pmatrix} \]
is skew-symmetric.

It is easy to see that the sum of a matrix and its transpose will be symmetric and the difference of
a matrix and its transpose will be skew-symmetric. Any n × n matrix can be written as the sum of a
symmetric matrix and a skew symmetric one, since we have
\[ A = \tfrac{1}{2}\,(A + A^{tr}) + \tfrac{1}{2}\,(A - A^{tr}) . \]
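The decomposition is easy to carry out by hand or by machine. The following sketch applies it to the 3 × 3 matrix A used as the transpose example; the function names transpose and symmetric_and_skew_parts are illustrative, and matrices are assumed to be square lists of rows.

```python
def transpose(m):
    """Rows of the result are the columns of m."""
    return [list(col) for col in zip(*m)]

def symmetric_and_skew_parts(m):
    """Return (S, K) with S = (m + m^tr)/2 symmetric and K = (m - m^tr)/2 skew-symmetric."""
    t = transpose(m)
    n = len(m)
    sym = [[(m[i][j] + t[i][j]) / 2 for j in range(n)] for i in range(n)]
    skew = [[(m[i][j] - t[i][j]) / 2 for j in range(n)] for i in range(n)]
    return sym, skew

A = [[1, 2, 3], [5, 6, 7], [10, 11, 12]]
S, K = symmetric_and_skew_parts(A)

if __name__ == "__main__":
    for row in S:
        print(row)
    for row in K:
        print(row)
```

Adding S and K entry by entry recovers A, and the diagonal of K is zero, as it must be for a skew-symmetric matrix.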

Suppose that at a particular point the Jacobian matrix for a vector field F is
\[
J = \begin{pmatrix}
 a_{11}(\mathbf{x}) & a_{12}(\mathbf{x}) & a_{13}(\mathbf{x}) \\
 a_{21}(\mathbf{x}) & a_{22}(\mathbf{x}) & a_{23}(\mathbf{x}) \\
 a_{31}(\mathbf{x}) & a_{32}(\mathbf{x}) & a_{33}(\mathbf{x})
\end{pmatrix} .
\]
Then
∇ × F = (a32 (x) − a23 (x)) i − (a31 (x) − a13 (x)) j + (a21 (x) − a12 (x)) k .
Then what we notice is that ∇ × F = 0 if and only if J is a symmetric matrix.

We have seen that a 2 × 2 skew-symmetric matrix
\[ \begin{pmatrix} 0 & a \\ -a & 0 \end{pmatrix} \]
(for a ≠ 0 ) is the Jacobian matrix
corresponding to a rotation of the plane with angular velocity a k and has no real eigenvectors. We
will see that a 3 × 3 skew-symmetric matrix has exactly one real eigenvector, and this corresponds to
the eigenvalue 0.

Now an arbitrary 3 × 3 skew-symmetric matrix looks like
\[
J = \begin{pmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0 \end{pmatrix} .
\]
This is the Jacobean matrix for the linear vector field
\[
\mathbf{F} = \mathbf{F}(0,0,0) + (-cy + bz)\,\mathbf{i} + (cx - az)\,\mathbf{j} + (-bx + ay)\,\mathbf{k} .
\]
We will look at the case where the constant term F(0, 0, 0) is 0. We see that ∇ × F = 2(a i + b j + c k).
Now look at F(x) for an arbitrary point x = x i + y j + z k. Calculation shows
\[
\mathbf{F}(x, y, z) = (-cy + bz)\,\mathbf{i} + (cx - az)\,\mathbf{j} + (-bx + ay)\,\mathbf{k}
 = \tfrac{1}{2}\,(\nabla \times \mathbf{F}) \times \mathbf{x} .
\]
From this we see two things: (1) ∇ × F is an eigenvector for J corresponding to the eigenvalue 0;
(2) at each point x, the vector F(x) is perpendicular both to ∇ × F and to the radius
vector x. Thus under the assumption that F(0, 0, 0) = 0, the vector field F is the velocity field of a
rotation of three-space around the axis through the origin in the direction a i + b j + c k with an
angular speed of ||a i + b j + c k||.
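The two observations above are easy to check numerically. The following is only a sketch in plain Python; the values chosen for a, b, c and for the point x are arbitrary, and the helper names are illustrative. It verifies that J x equals (a i + b j + c k) × x, that this vector is perpendicular to both the axis and x, and that the axis vector is sent to 0 by J.

```python
def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def mat_vec(m, v):
    """Product of a 3 x 3 matrix (list of rows) with a column vector."""
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))


def skew_matrix(a, b, c):
    """The general 3 x 3 skew-symmetric matrix written as in the text."""
    return [[0, -c, b], [c, 0, -a], [-b, a, 0]]


a, b, c = 1, 2, 3            # arbitrary illustrative values
J = skew_matrix(a, b, c)
omega = (a, b, c)            # half of the curl, 2(a i + b j + c k)
x = (4, -1, 2)               # arbitrary point

field = mat_vec(J, x)        # F(x) when F(0, 0, 0) = 0
print("J x       =", field)
print("omega x x =", cross(omega, x))
print("J omega   =", mat_vec(J, omega))
print("F . omega =", dot(field, omega), "  F . x =", dot(field, x))
```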

It is very tempting to believe that if F is a vector field with Jacobean matrix J , and we
decompose J into the sum of a symmetric and a skew-symmetric matrix, then there would exist a
corresponding way of writing F as a sum of two fields, one of which is the field of velocity vectors for
a rotation and the other is a field with three mutually orthogonal eigenvectors at every point. This is
certainly easily done in the case that F is linear, but in general it is not feasible, because it is usually
not possible to find a vector field with pre-assigned Jacobean matrix. This is the problem of finding
three functions P , Q , and R whose gradients are the rows of the given matrix, and is not possible
unless certain compatibility conditions are satisfied. (See my article on integrating vector fields.)

The best one can say then is that in a sufficiently small neighborhood of a point of interest a vector
field F can be reasonably well approximated by a linear field, and therefore it will look pretty much
as though it is the sum of two fields, one of which has three mutually orthogonal eigenvectors and has
zero curl, and the other of which represents the velocity vectors (within the given neighborhood) of a
rotation of all of three space around the axis ∇ × F with an angular speed of ½ ||∇ × F|| .

It seems to me that this is the best intuitive interpretation that one can give for the concept of
curl. However, such a decomposition of even a linear vector field with F(0) = 0 into the sum of a field
with zero curl and one which corresponds to a rotation of 3-space is often not at all apparent visually.

Consider again Example 9:

F(x, y) = (x cos α − y sin α) i + (x sin α + y cos α) j ,

where α is a constant. We saw that this linear field is the velocity vector for the motion of a fluid
which is swirling around the origin and spiralling outward. We can break F up into the sum of
two terms, one having a symmetric Jacobean matrix and the other a skew-symmetric one. Namely,
F1 = (x i + y j) cos α , and F2 = (−y i + x j) sin α . The first field represents the velocity vectors of a
fluid streaming radially away from the origin, and the second the velocity vector for a rotation of the
plane around the origin with an angular velocity of sin α k. In terms of this decomposition, it makes
perfectly good sense that ∇ • F = 2 cos α and ∇ × F = 2 sin α k.

Now look again at Example 6, F = ay i . When we first looked at it, it seemed a little surprising
that it had non-zero curl: ∇ × F = −a k. Now write F = F1 + F2 , where F1 (x, y) = ½(ay i + ax j),
and F2 = ½(ay i − ax j). The Jacobean matrix for F1 is symmetric and the one for F2 is
skew-symmetric. F1 is actually Example 7, multiplied by ½, the velocity vector for a current that
contains one substream streaming directly toward the origin at a 45° angle and a speed of a, and a
perpendicular one streaming directly away from the origin at the same speed. F2 on the other hand is
the velocity vector for a clockwise rotation of the plane with an angular speed of a/2, i. e. half the
magnitude of the curl (for a > 0 the angular velocity is −(a/2) k). But even
after one knows this, it seems rather hard to look at F and see F1 and F2 .

A Vector Field with No Curl is a Gradient.

(1) For a vector field F defined in a given region (in two or three dimensional space),
∇ × F = 0 if and only if each point x in the region is surrounded by some subregion in which F
is a gradient field, i. e. there exists a function g in that subregion such that F = ∇g in that
subregion.
(2) If ∇ × F = 0 for a vector field F at a point x0 in Rn , then F has n mutually orthogonal
eigenvectors at x0 .
(3) If x0 is a critical point for a twice-differentiable function g in Rn , then g has a maximum
at x0 if all the eigenvalues of ∇g at x0 are strictly negative, and a minimum if all the
eigenvalues are strictly positive. If some of the eigenvalues are strictly positive and some are
strictly negative (including the possibility that others are zero), then g has a saddle point at x0 .

Statement (1) is discussed and proved in my article on integrating vector fields.

[Figure: a ring-shaped region winding around a point marked "?", divided into a left-hand and a
right-hand subregion that meet along a top boundary line and a bottom boundary line.]

At first glance, the way this first statement is stated, in terms of subregions, seems a bit odd.
One would think that one could take all these subregions, and the corresponding functions defined in
them, and paste them together to get one function g defined on the whole original region such that
F = ∇g . In fact, this is most often the case. But a problem occasionally occurs when the original
region winds around a discontinuity for F. The gradient ∇g determines g only up to a constant
summand, and there may be a difficulty in consistently choosing this constant in a way that makes g
continuous throughout the whole original region. One can get a situation similar to the one in the
picture. One may be able to make the functions defined in the left-hand and right-hand regions agree
where they meet along the top boundary line, but they may then be inconsistent along the bottom
boundary. (The ? in the middle indicates a point of discontinuity of the vector field.)

The archetypical example of this is the vector field
\[
\mathbf{F} = \frac{-y\,\mathbf{i} + x\,\mathbf{j}}{x^2 + y^2} .
\]
In either the left half-plane or right half-plane, one sees that F(x, y) is the gradient of the function
tan⁻¹(y/x). But tan⁻¹(y/x) is discontinuous along the y-axis. One can easily adjust tan⁻¹(y/x), by
adding suitable constants, to get a function g continuous in the upper half-plane or lower half-plane
with F = ∇g , or in fact in any region which does not wind around the origin. What we need is a
function g(x, y) so that if we set θ = g(x, y), then the point (x, y) has polar coordinates
r = √(x² + y²) and θ . But one cannot define θ
in a continuous manner in a region which winds around the origin without having a discontinuous
break in θ along at least one line, because as the point (x, y) circles the origin to return to an original
starting point, θ will have increased by 2π , thus producing an inconsistency. (Well, if one wants to be
really weird in the way one defines g , then one could make the discontinuities occur along some curve
through the origin other than a straight line. But the point is, there have to be discontinuities.)
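The inconsistency of 2π can be seen numerically by integrating F around a circle that winds once around the origin. The following is a sketch in plain Python, taking F = (−y i + x j)/(x² + y²), the gradient of tan⁻¹(y/x); the function names and the midpoint-rule discretization are choices made for this illustration only. If F were the gradient of a single continuous function g on the whole punctured plane, the integral around a closed path would have to be 0.

```python
import math


def field(x, y):
    """F = (-y i + x j) / (x^2 + y^2), defined away from the origin."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)


def circulation(radius, steps=2000):
    """Midpoint-rule approximation of the line integral of F . dr taken
    counterclockwise once around the circle of the given radius."""
    total = 0.0
    dt = 2 * math.pi / steps
    for k in range(steps):
        t = (k + 0.5) * dt
        x, y = radius * math.cos(t), radius * math.sin(t)
        dx, dy = -radius * math.sin(t) * dt, radius * math.cos(t) * dt
        p, q = field(x, y)
        total += p * dx + q * dy
    return total


for radius in (0.5, 1.0, 3.0):
    print(radius, circulation(radius), 2 * math.pi)
```

The value does not depend on the radius: each trip around the origin increases θ by 2π.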

Statement (2) is a direct consequence of a theorem in linear algebra that an n × n matrix A is
symmetric if and only if it has n mutually orthogonal eigenvectors. Part of this is not difficult to
prove. It’s easy to see that if an n × n matrix A is symmetric, then for any two n-dimensional
column vectors v and w , v • Aw = w • Av . If now v and w are eigenvectors for A with
corresponding eigenvalues λ and μ, then we get
μ v • w = v • Aw = w • Av = λ v • w .
If λ ≠ μ, it then follows that v • w = 0 , i. e. v and w are orthogonal.

However it is unfortunately not very easy to prove in general that an n × n symmetric matrix has
n linearly independent eigenvectors. But in the 2 × 2 case, it’s easy to see that a matrix
\[
A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}
\]
will have two real eigenvectors provided that b and c have the same sign (for instance if b = c,
making A symmetric). In fact, in linear algebra it is known that the eigenvalues for A are the
solutions to the equation (x − a)(x − d) − bc = 0 . Now the graph of y = (x − a)(x − d) is a parabola
directed upwards, and this intersects the x-axis at x = a and x = d. If b and c have the same sign
(or are both zero) then bc is non-negative, and so the parabola y = (x − a)(x − d) intersects the
horizontal line y = bc, and intersects it at two points, except in the case a = d and b = c = 0 .
Therefore A will have two distinct real eigenvalues and therefore two real eigenvectors. (In the
exceptional case a = d and b = c = 0 , we have
\[
A = \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} ,
\]
so A has only the single eigenvalue a, but every vector in R² is an eigenvector.)
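The parabola argument amounts to solving a quadratic whose discriminant is (a − d)² + 4bc. Here is a sketch in plain Python (the function names are hypothetical, chosen for this example) that computes the two real eigenvalues of a 2 × 2 matrix with bc ≥ 0 and checks, for a symmetric example, that the corresponding eigenvectors are orthogonal.

```python
import math


def eigenvalues_2x2(a, b, c, d):
    """Real roots of (x - a)(x - d) - bc = 0, smallest first."""
    disc = (a - d) ** 2 + 4 * b * c     # non-negative whenever bc >= 0
    if disc < 0:
        raise ValueError("no real eigenvalues")
    root = math.sqrt(disc)
    return ((a + d - root) / 2, (a + d + root) / 2)


def eigenvector_2x2(a, b, c, d, lam):
    """A non-zero vector v with (A - lam I) v = 0, for an eigenvalue lam."""
    if b != 0:
        return (b, lam - a)
    if c != 0:
        return (lam - d, c)
    # Diagonal matrix: the coordinate axes are eigenvectors.
    return (1.0, 0.0) if lam == a else (0.0, 1.0)


a, b, c, d = 2.0, 1.0, 1.0, 2.0          # a symmetric example, b = c
lam1, lam2 = eigenvalues_2x2(a, b, c, d)
v1 = eigenvector_2x2(a, b, c, d, lam1)
v2 = eigenvector_2x2(a, b, c, d, lam2)
dot = v1[0] * v2[0] + v1[1] * v2[1]
print("eigenvalues:", lam1, lam2)
print("eigenvectors:", v1, v2, "dot product:", dot)
```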

Now let’s look at statement (3). Let g(x, y) be a differentiable function of two variables. We will
assume that g is continuously twice differentiable, meaning that the second partials ∂²g/∂x²,
∂²g/∂x∂y, and ∂²g/∂y² exist and are continuous. Now ∇g = (∂g/∂x) i + (∂g/∂y) j, and the Jacobean
matrix for ∇g is
\[
J = \begin{pmatrix}
\dfrac{\partial^2 g}{\partial x^2} & \dfrac{\partial^2 g}{\partial x\,\partial y} \\[2ex]
\dfrac{\partial^2 g}{\partial x\,\partial y} & \dfrac{\partial^2 g}{\partial y^2}
\end{pmatrix} .
\]

Because this matrix is symmetric, it has two linearly independent eigenvectors.

Now let x0 be a critical point for g , i. e. a point where ∇g = 0 . We have seen that if a vector
field F in n-dimensional space with F(x0 ) = 0 has n linearly independent eigenvectors at x0 and all
the corresponding eigenvalues at x0 are strictly positive, then F will be directed away from x0
throughout some neighborhood of x0 . But if ∇g is directed away from x0 for all points near x0 , this
says that g is increasing when we move in any direction away from x0 , which shows that g has a
minimum at x0 .

And if on the other hand all the eigenvalues are strictly negative, then throughout some
neighborhood of x0 , ∇g will be directed toward x0 . This says that g is decreasing as (x, y) moves
away from x0 in any direction, so g has a maximum at x0 .

However if there are two non-zero eigenvalues with opposite signs, then g is sometimes increasing
and sometimes decreasing as (x, y) moves away from x0 , so that x0 is a saddle point for g .

Going back to the 2-dimensional case, write A = ∂²g/∂x², C = ∂²g/∂y², and B = ∂²g/∂x∂y. Then
AC − B² is the determinant of the Jacobean matrix for ∇g . But it is known from linear algebra that
the determinant of a matrix equals the product of the eigenvalues. Therefore AC − B² will be strictly
positive if both eigenvalues for the Jacobean matrix are non-zero and have the same sign. If x0 is a
critical point for g , this then indicates that g has either a maximum or a minimum at x0 . We can
determine the sign of the eigenvalues, and thus determine whether g has a maximum or a minimum
at x0 , by looking at the sum of the two eigenvalues, i. e. at ∇ • ∇g = A + C . Furthermore, if
AC − B² > 0 , then necessarily A and C have the same sign, so we can say that g has a maximum
at x0 if A < 0 and a minimum if A > 0 .

And AC − B² will be strictly negative if the eigenvalues are non-zero with opposite signs,
indicating that x0 is a saddle point for g .

Finally, if AC − B² = 0 , this indicates that J has an eigenvector corresponding to the
eigenvalue 0. In this case, the behavior of the function is too delicate to figure out on the basis of the
information in the Jacobean. For instance, consider the two functions g1 (x, y) = x² + y³ and
g2 (x, y) = x² + y⁴ . The gradients are ∇g1 = 2x i + 3y² j and ∇g2 = 2x i + 4y³ j. For both functions,
the only critical point is at (0, 0). The Jacobeans are
\[
J_1 = \begin{pmatrix} 2 & 0 \\ 0 & 6y \end{pmatrix} ,
\qquad
J_2 = \begin{pmatrix} 2 & 0 \\ 0 & 12y^2 \end{pmatrix} .
\]
At the critical point (0, 0), J1 and J2 are equal and i is an eigenvector corresponding to the
eigenvalue 2 and j is an eigenvector corresponding to the eigenvalue 0. But it is evident that (0, 0) is
a minimum for x² + y⁴ and is neither a maximum nor a minimum for x² + y³ (and thus by definition
a saddle point, although the graph looks nothing like a saddle).
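The test described above, based on the sign of AC − B² and then the sign of A, can be written as a few lines of Python. This is only a sketch; the function name and the returned labels are invented for the illustration. Here A, B, C stand for the second partials ∂²g/∂x², ∂²g/∂x∂y, ∂²g/∂y² evaluated at a critical point.

```python
def classify_critical_point(A, B, C):
    """Classify a critical point of g(x, y) from its second partials there.

    A = g_xx, B = g_xy, C = g_yy.  The determinant AC - B^2 is the product
    of the two eigenvalues of the Jacobean matrix of grad g.
    """
    det = A * C - B * B
    if det > 0:
        # Both eigenvalues have the same sign, which is the sign of A.
        return "maximum" if A < 0 else "minimum"
    if det < 0:
        # Eigenvalues of opposite sign.
        return "saddle"
    # A zero eigenvalue: the second partials do not decide the question.
    return "inconclusive"


print(classify_critical_point(2, 0, 2))     # x^2 + y^2 at the origin
print(classify_critical_point(-2, 0, -2))   # -x^2 - y^2 at the origin
print(classify_critical_point(2, 0, -2))    # x^2 - y^2 at the origin
print(classify_critical_point(2, 0, 0))     # x^2 + y^3 or x^2 + y^4 at the origin
```

The last call returns the same answer for both x² + y³ and x² + y⁴, which is exactly the point of the example: the two functions have the same Jacobean at (0, 0) but behave differently there.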

What sort of vector field has both zero curl and zero divergence? Answer: The gradient
field for a harmonic function. (See, for instance, Examples 1, 2, 3, 7, and 11.)

A vector field in n-space with zero curl and zero divergence at a particular point can be
characterized by the fact that at the given point it has n orthogonal eigenvectors, and the sum of the
corresponding eigenvalues is 0.

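A planar vector field which has zero curl and zero divergence at a certain point will have a
Jacobean matrix at that point of the form
\[
\begin{pmatrix} a & b \\ b & -a \end{pmatrix} .
\]
The eigenvalues for this matrix are the solutions to the equation (x² − a²) − b² = 0 ,
i. e. ±√(a² + b²). Corresponding eigenvectors are
\[
\begin{pmatrix} -b \\ a - \sqrt{a^2 + b^2} \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} -b \\ a + \sqrt{a^2 + b^2} \end{pmatrix} ,
\]
which are orthogonal to each other. (Here, of course,
\[
a = \frac{\partial P}{\partial x} = -\frac{\partial Q}{\partial y}
\quad\text{and}\quad
b = \frac{\partial Q}{\partial x} = \frac{\partial P}{\partial y} .)
\]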

The Maximum Principle for Harmonic Functions. If g(x, y) is a harmonic function, then
∇ • ∇g = 0 . This shows that the two eigenvalues for ∇g have opposite signs, and this in turn
indicates that any critical point for g is a saddle point.

Now consider a set Ω which is bounded and closed (i. e. includes all its boundary points). It is
known from a theorem in topology that, as a continuous function, g must have a maximum (and also
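a minimum) somewhere in Ω. If this maximum or minimum occurred at an interior point of Ω, that
point would be a critical point for g ; but we have seen that a maximum or minimum cannot occur at a
critical point for g . Therefore it must occur at a boundary point. This gives us the Maximum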
Principle, which turns out to be extremely useful: The points within a closed bounded set Ω
where a harmonic function g takes its maximum and minimum must lie on the boundary
of that set.
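As an illustration (not a proof), one can sample a harmonic function on a grid inside a closed disk and see where its largest and smallest sampled values occur. The sketch below, in plain Python, uses the harmonic function g(x, y) = x² − y² on the closed unit disk; the function, the grid size and the helper name `extremes_on_disk` are choices made for this example.

```python
import math


def extremes_on_disk(g, n=40):
    """Sample g on a square grid restricted to the closed unit disk.

    Returns ((max_value, max_point), (min_value, min_point)).
    """
    best_max = None
    best_min = None
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            if i * i + j * j > n * n:      # grid point outside the disk
                continue
            x, y = i / n, j / n
            value = g(x, y)
            if best_max is None or value > best_max[0]:
                best_max = (value, (x, y))
            if best_min is None or value < best_min[0]:
                best_min = (value, (x, y))
    return best_max, best_min


def harmonic(x, y):
    """g = x^2 - y^2, for which g_xx + g_yy = 2 - 2 = 0."""
    return x * x - y * y


(max_value, max_point), (min_value, min_point) = extremes_on_disk(harmonic)
print("max", max_value, "at", max_point, "distance from center", math.hypot(*max_point))
print("min", min_value, "at", min_point, "distance from center", math.hypot(*min_point))
```

Both extreme sample points lie at distance 1 from the center, i. e. on the boundary circle.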

Finally, note that if, say, a 3 × 3 matrix J has three mutually orthogonal eigenvectors, this in
turn means that by means of a rotation of coordinates, J can be put in the form
\[
\begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix} .
\]
(In studying curl, it is not good to make any change of coordinates other than combinations
of translations and rotations, i. e. rigid motions.) Conceptually, this is very enlightening in
understanding the nature of vector fields with zero curl. However one should remember that a specific
change of coordinates will usually only simplify the form of the Jacobean matrix at a single point.

Some Non-linear Vector Fields

We have seen that a linear planar vector field F has constant direction if and only if the Jacobean
matrix J has 0 as an eigenvalue, or equivalently, J is a singular matrix. (Except for the trivial case
where F = 0 everywhere. In this case, J is certainly singular, since J = 0 , but F has no direction
at all.) More generally, in the case of a field in 3-space or higher dimensional space, a linear vector
field F will have constant direction if and only if the Jacobean matrix J has rank one.

However the same need not be true if the vector field F is not linear. For instance, the Jacobean
for the planar vector field ax i + ax² j in Example 10 is
\[
\begin{pmatrix} a & 0 \\ 2ax & 0 \end{pmatrix} ,
\]
which is singular at every point.
The vector j is always an eigenvector, and corresponds to the eigenvalue 0. (Notice that if (x, y)
moves in the direction of j, i. e. the direction of increasing or decreasing y , then F does not change.)
But F is in the direction i + x j, which is not constant. At a point (x, y), the second eigenvector
for J is the vector i + 2x j, with corresponding eigenvalue a.


Example 10

F = ax i + ax² j
∇ • F = a
∇ × F = 2ax k

[Figure: plot of the vector field of Example 10 in the xy-plane.]

As another example, consider a planar vector field U with constant magnitude 1. As (x, y) moves,
U will change direction but not length, so we expect its directional derivatives to all be perpendicular
to it. Thus if w is the unit vector perpendicular to U turned in, say, the positive (counterclockwise)
direction from U, then the directional derivative Dw (U) will be a multiple (possibly negative) of w .
Thus w will be an eigenvector for U and the corresponding eigenvalue will indicate how fast U turns
as (x, y) moves in the direction w .

Furthermore, the two columns of the Jacobean matrix for U, which are the directional derivatives
of U in the i and j directions, will also be multiples of w , and hence are multiples of each other,
unless one of them is zero. So this puts us in Case (2) or Case (3) of the Two by Two Theorem, so
that U also has an eigenvector whose corresponding eigenvalue is 0. Since ∇ • U equals the sum of
the two eigenvalues, this gives a second (although certainly not easier!) proof of the fact that ∇ • U
equals the rate at which U turns as one moves in the direction 90° counterclockwise from U.

We can also see this computationally. We have seen that U has the form
U(x, y) = cos γ(x, y) i + sin γ(x, y) j .
The Jacobean matrix for this non-linear vector field is
\[
J = \begin{pmatrix}
-\sin\gamma\,\dfrac{\partial\gamma}{\partial x} & -\sin\gamma\,\dfrac{\partial\gamma}{\partial y} \\[2ex]
\cos\gamma\,\dfrac{\partial\gamma}{\partial x} & \cos\gamma\,\dfrac{\partial\gamma}{\partial y}
\end{pmatrix} .
\tag{??}
\]
The vector field U, and consequently the matrix J , are thus functions of a single parameter γ(x, y).
Thus the directional derivative of U will be zero when (x, y) moves in a direction perpendicular
to ∇γ , i. e. in the direction (∂γ/∂y) i − (∂γ/∂x) j. Thus (∂γ/∂y) i − (∂γ/∂x) j is an eigenvector
for J corresponding to the eigenvalue 0 .

But despite having 0 as an eigenvalue, U is very definitely not a field with constant direction,
except in the uninteresting case when γ(x, y) is constant.

We see that the two columns of J are multiples of − sin γ i + cos γ j, which is the direction
perpendicular to U. Thus for any vector v , J v is always a multiple of − sin γ i + cos γ j, and this
tells us that this is the only possible eigenvector for J corresponding to a non-zero eigenvalue. (See
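the Case (2) of the Two by Two Theorem above.) Checking, we see that
\[
J \begin{pmatrix} -\sin\gamma \\ \cos\gamma \end{pmatrix}
= \begin{pmatrix}
-\sin\gamma\,\dfrac{\partial\gamma}{\partial x} & -\sin\gamma\,\dfrac{\partial\gamma}{\partial y} \\[2ex]
\cos\gamma\,\dfrac{\partial\gamma}{\partial x} & \cos\gamma\,\dfrac{\partial\gamma}{\partial y}
\end{pmatrix}
\begin{pmatrix} -\sin\gamma \\ \cos\gamma \end{pmatrix}
= \bigl( (-\sin\gamma\,\mathbf{i} + \cos\gamma\,\mathbf{j}) \cdot \nabla\gamma \bigr)
\begin{pmatrix} -\sin\gamma \\ \cos\gamma \end{pmatrix}
\]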
so that − sin γ i + cos γ j is an eigenvector for J with corresponding eigenvalue
(− sin γ i + cos γ j) • ∇γ , which is the rate at which U is turning as (x, y) moves in the direction
− sin γ i + cos γ j. This confirms what we figured out above with the more conceptual approach.
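Both eigenvector claims can be checked numerically for a specific angle function. The sketch below, in plain Python, uses γ(x, y) = xy + x, which is an arbitrary choice made only for this illustration, together with its gradient computed by hand. It builds the Jacobean matrix of U = cos γ i + sin γ j from the formula above and applies it to w = − sin γ i + cos γ j and to (∂γ/∂y) i − (∂γ/∂x) j.

```python
import math


def gamma(x, y):
    return x * y + x              # illustrative angle function (an assumption)


def grad_gamma(x, y):
    return (y + 1.0, x)           # gradient of the gamma above


def jacobian(x, y):
    """Jacobean matrix of U = cos(gamma) i + sin(gamma) j at (x, y)."""
    g = gamma(x, y)
    gx, gy = grad_gamma(x, y)
    return [[-math.sin(g) * gx, -math.sin(g) * gy],
            [math.cos(g) * gx, math.cos(g) * gy]]


def mat_vec(m, v):
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])


x0, y0 = 0.3, 0.7                 # arbitrary sample point
J = jacobian(x0, y0)
g0 = gamma(x0, y0)
gx, gy = grad_gamma(x0, y0)

w = (-math.sin(g0), math.cos(g0))        # unit vector perpendicular to U
turn_rate = w[0] * gx + w[1] * gy        # w . grad(gamma)
Jw = mat_vec(J, w)
J_null = mat_vec(J, (gy, -gx))           # direction perpendicular to grad(gamma)

print("J w           =", Jw)
print("turn_rate * w =", (turn_rate * w[0], turn_rate * w[1]))
print("J (gy, -gx)   =", J_null)
```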

One can note that there is an exceptional case here if − sin γ i + cos γ j is perpendicular to ∇γ
(and therefore ∇γ is parallel to U), which says that the directional derivative of γ in the direction
perpendicular to U is 0. In this case, − sin γ i + cos γ j is a multiple of (∂γ/∂y) i − (∂γ/∂x) j, so
one sees that J (− sin γ i + cos γ j) = 0. So this is Case (3) of the Two by Two Theorem. As
previously seen, − sin γ i + cos γ j is then the only eigenvector for J (since (∂γ/∂y) i − (∂γ/∂x) j
is a multiple of it).
This tells us not much of anything new about U, but it’s interesting to see how everything fits
together.

Example 8 Redux. It’s interesting now to go back and look at Example 8 again:
\[
\mathbf{F} = \frac{-y\,\mathbf{i} + x\,\mathbf{j}}{2\sqrt{x^2 + y^2}} + \frac{\sqrt{3}}{2}\,\mathbf{k} .
\]

At the time, we observed that we should not expect to be able to prove a theorem showing that for a
vector field like F in three dimensions with constant magnitude 1, ∇ × F equals the rate that F
turns when (x, y) moves in the direction of F, because J , which determines ∇ × F, does not tell us
the direction of F.

However Example 8 is far from a typical example of a three-dimensional field with constant
magnitude. We have observed that the Jacobean matrix for F is the same as for the planar field
\[
\mathbf{G}_1 = \frac{-y\,\mathbf{i} + x\,\mathbf{j}}{2\sqrt{x^2 + y^2}} ,
\]
except for an added row and added column of zeros. We can write
G1 = ½(cos γ i + sin γ j), where γ = θ + π/2 and θ is the usual polar coordinate. Since π/2 is a
constant, ∇γ = ∇θ . In fact, ∇γ(x, y) = ∇θ(x, y) = (2/r) G1 (x, y), where r = √(x² + y²), so ∇γ is
parallel to G1 and we are in the exceptional case for vector fields of the form cos γ i + sin γ j, as
discussed above. We can use formula (??) to get
\[
J = \frac{1}{2r^3}
\begin{pmatrix} xy & -x^2 & 0 \\ y^2 & -xy & 0 \\ 0 & 0 & 0 \end{pmatrix} .
\]

The two non-zero columns of J are both multiples of −x i − y j, which is the direction in which G1
turns when (x, y) moves and which is also an eigenvector for J corresponding to the eigenvalue 0.

We know that for vector fields of constant magnitude the columns of the Jacobean matrix are
orthogonal to the field. And G1 = (−y i + x j)/2r is in fact perpendicular to the columns of its
Jacobean matrix J . But what makes Example 8 possible is that the vector (−y i + x j)/2r + a k is
also perpendicular to these columns, for any scalar a.

In the typical case of a vector field in three dimensions with constant magnitude 1, on the other
hand, where two of the three columns of the Jacobean matrix are linearly independent, the direction
of the field is in fact determined up to a plus-and-minus sign by the Jacobean matrix. In fact, if v1
and v2 are two columns of the Jacobean matrix which are not multiples of each other, then the vector
field, since it has constant magnitude, is perpendicular to v1 and v2 and thus is parallel to v1 × v2 .
This means that one can’t rule out the possibility that there is a nice theorem relating ∇ × F to the
rate at which F turns, except in those exceptional cases where the Jacobean matrix has rank one.
But I doubt that such a theorem exists.

It may be interesting to compare the field 2G1 = (−y i + x j)/r above, and the linear vector field
G0 = −y i + x j (Example 5), which we have seen gives the velocity vectors for a rotation of the entire
plane with an angular velocity of one radian per unit time, with the non-linear vector field
\[
\mathbf{G}_2 = \frac{-y\,\mathbf{i} + x\,\mathbf{j}}{x^2 + y^2}
 = \frac{-y\,\mathbf{i} + x\,\mathbf{j}}{r^2}
 = \nabla \tan^{-1}\!\left(\frac{y}{x}\right) .
\]
All three of these vector fields represent the velocity vectors of circular motion around the origin, but
their behavior with respect to curl and especially eigenvectors is very different. The angular
velocity for the vector field (−y i + x j)/rⁿ is r⁻ⁿ k. We have seen earlier that the curl is
(−n + 2) r⁻ⁿ k, so that
\[
\begin{aligned}
\nabla \times \mathbf{G}_0 &= \nabla \times r^{0}\,(-y\,\mathbf{i} + x\,\mathbf{j}) = 2\,\mathbf{k} \\
\nabla \times 2\mathbf{G}_1 &= \nabla \times r^{-1}(-y\,\mathbf{i} + x\,\mathbf{j}) = \frac{\mathbf{k}}{r} \\
\nabla \times \mathbf{G}_2 &= \nabla \times r^{-2}(-y\,\mathbf{i} + x\,\mathbf{j}) = \mathbf{0} .
\end{aligned}
\]
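The curl formula (2 − n) r⁻ⁿ k can be checked numerically with central differences. The following is a sketch in plain Python; the step size h, the sample point and the function names are choices made for this example. It estimates the k component ∂Q/∂x − ∂P/∂y for the three fields with n = 0, 1, 2.

```python
import math


def swirl(n):
    """Return the planar field (-y i + x j) / r^n as a Python function."""
    def field(x, y):
        r = math.hypot(x, y)
        return (-y / r ** n, x / r ** n)
    return field


def curl_k(field, x, y, h=1e-5):
    """Central-difference estimate of dQ/dx - dP/dy at (x, y)."""
    dq_dx = (field(x + h, y)[1] - field(x - h, y)[1]) / (2 * h)
    dp_dy = (field(x, y + h)[0] - field(x, y - h)[0]) / (2 * h)
    return dq_dx - dp_dy


x0, y0 = 0.6, 0.8                      # a sample point with r = 1
r0 = math.hypot(x0, y0)
for n in (0, 1, 2):
    estimate = curl_k(swirl(n), x0, y0)
    exact = (2 - n) * r0 ** (-n)
    print(n, estimate, exact)
```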

Above, we have computed the Jacobean matrices for the vector fields G0 and 2G1 and seen that
the first has no eigenvectors and the second has exactly one.

If J is the Jacobean matrix for G2 at a point x, then J x is not the field vector G2 (x); in fact
J x = −G2 (x), which, like G2 (x) itself, is perpendicular to the position vector x. (Since G2 is not
linear, the vector J x is actually irrelevant to the situation.) Since ∇ × G2 = 0, we know that the
Jacobean matrix J at every point will be symmetric. Linear algebra then tells us that J will have two
orthogonal eigenvectors.

Example 11

G2 = (−ay i + ax j)/(x² + y²) = a ∇ tan⁻¹(y/x)
∇ • G2 = 0
∇ × G2 = 0

[Figure: plot of the vector field of Example 11 in the xy-plane, circulating counterclockwise around the origin.]

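Routine calculation (for G2 = (−y i + x j)/r² , i. e. with a = 1) produces
\[
J = \begin{pmatrix}
\dfrac{\partial P}{\partial x} & \dfrac{\partial P}{\partial y} \\[2ex]
\dfrac{\partial Q}{\partial x} & \dfrac{\partial Q}{\partial y}
\end{pmatrix}
= \frac{1}{r^4} \begin{pmatrix} 2xy & y^2 - x^2 \\ y^2 - x^2 & -2xy \end{pmatrix}
= \frac{1}{r^4} \begin{pmatrix} 2xy & (y + x)(y - x) \\ (y + x)(y - x) & -2xy \end{pmatrix} .
\]

The eigenvectors for J turn out to be (x − y) i + (x + y) j and (x + y) i + (y − x) j. (The fact that
these are perpendicular to each other is predictable by linear algebra because the matrix J is
symmetric.) The corresponding eigenvalues are −1/r² and +1/r² respectively. As verification, we’ll
look at the calculation for the first eigenvector:
\[
\frac{1}{r^4}
\begin{pmatrix} 2xy & (y + x)(y - x) \\ (y + x)(y - x) & -2xy \end{pmatrix}
\begin{pmatrix} x - y \\ x + y \end{pmatrix}
= \frac{1}{r^4}
\begin{pmatrix} (x - y)\,(2xy - (x + y)^2) \\ (x + y)\,(-(x - y)^2 - 2xy) \end{pmatrix}
= -\frac{x^2 + y^2}{r^4} \begin{pmatrix} x - y \\ x + y \end{pmatrix}
= -\frac{1}{r^2} \begin{pmatrix} x - y \\ x + y \end{pmatrix} .
\]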

It is enlightening to write the eigenvectors here in terms of the radial and tangential vectors for the
circle. We have

(x − y) i + (x + y) j = (x i + y j) + (−y i + x j )
(x + y) i + (y − x) j = (x i + y j) − (−y i + x j ) .

To interpret this intuitively, imagine moving away from a point (x, y) in the direction of the
eigenvector (x − y) i + (x + y) j, so that

∆x = a( (x − y) i + (x + y) j ) = a(x i + y j ) + a(−y i + x j ) ,

where a is a small real number. Then one is moving farther from the origin (assuming that a is
positive) by a distance ra and then a distance of ra in a direction tangential to the circle, which is
roughly equivalent, if a is small enough, to moving along the circle itself.

Now for the movement away from the origin, the vector G2 (x, y) keeps the same direction, but
gets a little shorter: for this move, ∆ ||G2 || = −a/r and ∆G2 = −a(−y i + x j)/r² . (The vector
−y i + x j is in the direction of G2 but is not a unit vector, hence the extra factor of r in the
denominator here.) And for the move in the tangential direction, ||G2 || stays the same, but G2 turns
through an angle of a, so that its tip moves a distance of about a/r . (One can see this if one
remembers that G2 is perpendicular to the radius vector.) Since G2 is turning but not changing
magnitude during this tangential relocation, the derivative for the motion of G2 is directed toward the
center of the circle, and one sees that for this move, one has ∆G2 ≈ −(a/r²)(x i + y j).

Putting all this together, one gets that
\[
\Delta\mathbf{G}_2 \approx \frac{-a\,(-y\,\mathbf{i} + x\,\mathbf{j})}{r^2}
 + \frac{-a\,(x\,\mathbf{i} + y\,\mathbf{j})}{r^2}
 = -\frac{1}{r^2}\,\Delta\mathbf{x} ,
\]
again confirming that ∆x is one of the eigenvectors for G2 .
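The relation ∆G2 ≈ −∆x/r² can be tested directly by evaluating G2 at two nearby points. The sketch below is plain Python, with G2 = (−y i + x j)/(x² + y²); the sample point (1, 2) and the small step a = 10⁻⁶ are arbitrary choices for this illustration. It divides each component of ∆G2 by the corresponding component of ∆x.

```python
def g2(x, y):
    """G2 = (-y i + x j) / (x^2 + y^2)."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)


def delta_ratio(x, y, a=1e-6):
    """Componentwise ratio of Delta G2 to Delta x, where
    Delta x = a((x - y) i + (x + y) j)."""
    dx = (a * (x - y), a * (x + y))
    before = g2(x, y)
    after = g2(x + dx[0], y + dx[1])
    dg = (after[0] - before[0], after[1] - before[1])
    return (dg[0] / dx[0], dg[1] / dx[1])


x0, y0 = 1.0, 2.0                      # here r^2 = 5
ratios = delta_ratio(x0, y0)
print("Delta G2 / Delta x, by component:", ratios)
print("-1 / r^2 =", -1.0 / (x0 * x0 + y0 * y0))
```

Both component ratios come out close to −1/r², so ∆G2 is (approximately) a negative multiple of ∆x, as the geometric argument predicts.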

On a purely qualitative level, what we see here is that the crucial difference between G2 and G0
is that for G2 , when (x, y) moves radially away from the origin, the corresponding ∆G2 is a vector
tangential to the circle and in the clockwise direction, so that in this case ∆G2 is at a clockwise angle
to ∆x, whereas when (x, y) moves in a direction tangential to the circle in a counter-clockwise
direction, G2 turns and the corresponding ∆G2 is a vector directed radially towards the origin, so
that in this case ∆G2 is at a counterclockwise angle to ∆x. From this, it seems reasonable it should
be possible to construct a vector which is an appropriate linear combination of a motion radially away
from the center and a motion tangential to the circle in the counter-clockwise direction so that if ∆x
is given by this motion, then the corresponding ∆G2 should be exactly opposite to ∆x. This vector
will be an eigenvector for G2 , and by an analogous process, one can construct a second eigenvector,
which will be perpendicular to it. (Of course since G2 is a non-linear vector field, one ought to also
choose ∆x to be small.)

On the other hand, in the case of G0 = −y i + x j one finds that whether ∆x is radially directed
away from the origin or tangential to the circle, multiplication of ∆x by the Jacobean matrix
corresponding to G0 will produce a vector ∆G0 which is rotated 90° counter-clockwise from ∆x.
Thus there is no possibility of combining vectors in these two directions to obtain a vector ∆x which
is not rotated when multiplied by the Jacobean matrix for G0 , which is precisely what would be
required to have an eigenvector.

In light of the theorem that a vector field with zero curl is a gradient (at least within some
neighborhood), it is tempting to reason that it is obvious that the field F = ay i of Example 6 cannot
have zero curl (as in fact it doesn’t). For imagine that F = ∇g for some function g(x, y). Since
F = ∇g is a field in the i direction, this would mean that g(x, y) does not increase when (x, y)
moves in the j direction. But then it seems impossible that ∇g could increase when (x, y) moves in
the j direction, as is the case for the field F.

Although the intuitive basis for this reasoning is sound, the reasoning itself is simplistic. For
consider the previously discussed vector field

\[
\mathbf{F} = \frac{-y\,\mathbf{i} + x\,\mathbf{j}}{x^2 + y^2} = \nabla g
\]

where

\[
g(x, y) = \tan^{-1}\!\left(\frac{y}{x}\right)
\]

(see Example 11). Since ∇g is tangent to the circles x2 + y 2 = const, the function g(x, y) does not
change when (x, y) moves in a direction radially towards or away from the origin. (This is obvious in
any case, since movement in such a direction does not change y/x.) On the other hand, || ∇g || = 1/r ,
so that ∇g decreases in magnitude when one moves in a direction radially away from the origin,
despite the fact that g(x, y) does not change.

In fact, from Theorem A above about the curl of planar vector fields, the fact that ||F|| = 1/r and
∇ × F = 0 and that F turns at a rate of 1/r as (x, y) moves in the direction of F (i. e. tangent to
the circle through (x, y) around the origin) means that necessarily F must decrease at a rate of 1/r²
when (x, y) moves in a direction perpendicular to F, i. e. radially away from the origin, and in fact,
\[
\frac{d}{dr}\,\|\mathbf{F}\| = \frac{d}{dr}\left(\frac{1}{r}\right) = -\frac{1}{r^2} .
\]
Because it does seem strange that ∇g can decrease when (x, y) moves in a direction that does not
change g , it’s worth looking at this situation more closely to see how this can happen. Here’s a picture.

[Figure: two concentric circular arcs around the origin in the xy-plane. The inner arc joins the
points a0 and c0 , the outer arc joins a1 and c1 ; a0 and a1 lie on one radial line, c0 and c1 on
another.]

Consider points a0 = (a0 , b0 ) and a1 = (a1 , b1 ) on the same radial line and with distances r0 and
r1 from the origin. And likewise points c0 = (c0 , d0 ) and c1 = (c1 , d1 ) on a different radial line. Now
g(x, y) = tan⁻¹(y/x) does not change between a0 and a1 or between c0 and c1 . Furthermore, a
basic theorem for line integrals (the analogue of the Fundamental Theorem of Calculus) says that
\[
\int_{\mathbf{a}_0}^{\mathbf{c}_0} \nabla g \cdot d\mathbf{r}
 = g(\mathbf{c}_0) - g(\mathbf{a}_0)
 = g(\mathbf{c}_1) - g(\mathbf{a}_1)
 = \int_{\mathbf{a}_1}^{\mathbf{c}_1} \nabla g \cdot d\mathbf{r} .
\]

So the integral over the circular path from a0 to c0 is the same as the integral over the path from a1
to c1 . But the path further away from the origin is longer. This certainly seems to suggest that ∇g is
smaller on the second path (as, in fact, in this example it certainly is).

To make this more convincing, notice also that the line integrals are taken over paths where ∇g is
in the direction of the tangent vector to the path. Therefore the line integrals reduce to ordinary
integrals
\[
\int_{s_0}^{s_1} \|\nabla g(x(s), y(s))\| \, ds
\]

where we have parametrized the two curves with respect to the arc-length variable s.

What we see in this example is that, since the two trajectory curves (the two circular arcs) are
curved, the orthogonal curves (the level curves for g , which in this example are the radial lines from
the origin) spread out as we move away from the first trajectory along the level curve in the direction
opposite to the direction of curvature for this trajectory. And so the outside trajectory is longer, and
it is a greater distance from a1 to c1 than from a0 to c0 . But the function g changes by the same
amount in both cases. So it makes sense that ∇g is smaller on the outside curve. (This inference
would not make any sense were it not for the fact that the direction of ∇g is parallel to the curve. In
this specific example, of course, ||∇g|| is constant along the two trajectory curves, so the two integrals
reduce to (s1 − s0 ) ||∇g(x, y)||, making it very easy to see that for the two integrals to be equal, ||∇g||
must be smaller on the outside curve than on the inside.)
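Here is a numerical version of this comparison, as a sketch in plain Python. The radii 1 and 2 and the angles 0.2 and 1.1 of the two radial lines are arbitrary values chosen for the illustration, and the function names are invented. The line integral of ∇g = (−y i + x j)/(x² + y²) is computed over the inner and outer arcs between the same two radial lines.

```python
import math


def grad_g(x, y):
    """Gradient of g = arctan(y / x), i.e. (-y i + x j) / (x^2 + y^2)."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)


def arc_integral(radius, t0, t1, steps=2000):
    """Midpoint-rule line integral of grad g . dr along the circular arc of
    the given radius from polar angle t0 to polar angle t1."""
    total = 0.0
    dt = (t1 - t0) / steps
    for k in range(steps):
        t = t0 + (k + 0.5) * dt
        x, y = radius * math.cos(t), radius * math.sin(t)
        p, q = grad_g(x, y)
        # dr = (-radius sin t, radius cos t) dt along the arc
        total += (p * -radius * math.sin(t) + q * radius * math.cos(t)) * dt
    return total


t0, t1 = 0.2, 1.1                       # polar angles of the two radial lines
inner = arc_integral(1.0, t0, t1)       # arc from a0 to c0
outer = arc_integral(2.0, t0, t1)       # arc from a1 to c1
print("inner integral:", inner, " arc length:", 1.0 * (t1 - t0))
print("outer integral:", outer, " arc length:", 2.0 * (t1 - t0))
```

The two integrals agree (both equal the change in θ), while the outer arc is twice as long, which is consistent with ||∇g|| being half as large there.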

It’s worth seeing how the reasoning we have looked at applies to the example g(x, y) = x² + y² ,
which is actually described by the same picture that we have used for tan⁻¹(y/x). In this case, the
trajectories (or integral curves) determined by ∇g(x, y), which is given by

∇g = 2x i + 2y j ,

are the radial lines through the origin, and thus have curvature 0, and the level curves are the circles
centered at the origin. So the roles of the points in the picture are now switched. We will want to
consider the line integrals over the radial lines,
\[
\int_{\mathbf{a}_0}^{\mathbf{a}_1} \nabla g \cdot d\mathbf{r}
\quad\text{and}\quad
\int_{\mathbf{c}_0}^{\mathbf{c}_1} \nabla g \cdot d\mathbf{r} .
\]
Although the trajectories here (the radial lines) have zero curvature, they are slanted, and the level
curves, the circles, do spread out as we move along the radial lines away from the origin. Despite this
fact, the distance from a0 to a1 along one radial line is the same as the distance from c0 to c1 along
the other. We therefore surmise, as is readily apparent, that ||∇g|| does not change when (x, y) moves
along a circle centered at the origin.

Examples can be misleading, because they often have special properties that are not true in the
general case. If we did not already know that a gradient field ∇g has to decrease in magnitude as we
move away from a point in the direction opposite to the direction of curvature of the trajectory, the
reasoning just given would be considerably less than totally convincing, since there are too many
possibilities it doesn’t take into account. In my own mathematical investigations, I almost always
start by looking at examples. But when I notice a certain phenomenon which holds for all the
examples I have found, I then ask myself what it is about these examples that makes this phenomenon
occur, and then try to see whether I can construct another example where it does not occur. If I am
repeatedly unsuccessful at constructing a counter-example, then I ask myself what the stumbling
blocks are that I constantly run up against. If I can identify these stumbling blocks, then it is possible
that I have found a proof that the phenomenon in question is true in general. (More often, though, I
am successful in constructing the counter-example, and therefore have to kiss the theorem goodbye.
But this may enable me to see how to change the theorem I was trying to prove in order to obtain one
that is in fact valid.)

Green’s Theorem and Stokes’ Theorem.

Although we need 3-space to define curl, we can think of a vector field
F(x, y) = P(x, y) i + Q(x, y) j in the plane as having a curl ∇ × F in the k direction. I. e.

    ∇ × (P i + Q j) = (∂Q/∂x − ∂P/∂y) k.

As was the case for divergence, we can see the idea of curl most clearly if we start by looking at
linear vector fields F(x, y) = (a1 x + a2 y) i + (b1 x + b2 y) j. We may further simplify by supposing
that a1 and b2 are 0, since these will not contribute to ∇ × F.

Consider then a planar vector field F(x, y) = P (x, y) i + Q(x, y) j = py i + qx j, where p and q are
scalar constants. Now look at a rectangle with sides parallel to the axes as shown below.

[Figure: the rectangle in the xy-plane with corners (a0, b0), (a1, b0), (a1, b1) and (a0, b1), together with the vector field F = py i + qx j.]

Here ∇ × F = (q − p) k. Then (b1 − b0) ∂P/∂y = (b1 − b0)p is the amount that the horizontal
component P of F increases (or decreases, in the negative case) between the bottom of the rectangle
and the top. And (a1 − a0) ∂Q/∂x = (a1 − a0)q is the amount that the vertical component Q of F
increases between the left side and the right.
If we consider F as denoting the velocity of a fluid, and if we adopt the convention that positive
flow is counter-clockwise, then since py i = pb1 i is the horizontal component of F along the top of the
rectangle, it seems plausible to say that −(a1 − a0)P(x, b1) = −(a1 − a0)pb1 is a measure of the flow
along the top of the rectangle. (I don’t know of an easy conceptual definition of the word flow, but I
hope it becomes clear in context.) Likewise (a1 − a0)pb0 equals the flow along the bottom edge. Thus
−(a1 − a0)(b1 − b0)p equals the flow along the top and bottom edges of the rectangle. Likewise, since
qx j is the vertical component of F, one sees that (a1 − a0)(b1 − b0)q equals the flow along the
left-hand and right-hand sides. (This is because a1 q(b1 − b0) is the flow along the right-hand side and
−a0 q(b1 − b0) is the flow along the left.) Thus (q − p)(a1 − a0)(b1 − b0) equals the total flow around
the rectangle. This is generally called the circulation of the vector field F.
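
The edge-by-edge bookkeeping above is easy to get wrong by a sign, so here is a small sketch that adds up the four edge flows for the linear field F = py i + qx j and compares the total with (q − p) times the area of the rectangle. The function name and the sample numbers are arbitrary illustrative choices.

```python
def circulation_linear(p, q, a0, a1, b0, b1):
    """Counter-clockwise flow of F = p*y i + q*x j around the rectangle, edge by edge."""
    bottom = (a1 - a0) * p * b0      # traversed left to right, where F . i = p*b0
    right = (b1 - b0) * q * a1       # traversed upward, where F . j = q*a1
    top = -(a1 - a0) * p * b1        # traversed right to left, hence the minus sign
    left = -(b1 - b0) * q * a0       # traversed downward, hence the minus sign
    return bottom + right + top + left

p, q = 2, 5
a0, a1, b0, b1 = 1, 4, -1, 3
print(circulation_linear(p, q, a0, a1, b0, b1))   # 36
print((q - p) * (a1 - a0) * (b1 - b0))            # 36
```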

More generally, if we assume merely that F = P i + Q j, where P (x, y) and Q(x, y) are
differentiable functions, then we see that the flow along the top edge of the rectangle considered above
will be given by

    − ∫_{a0}^{a1} P(x, b1) dx

and the flow along the bottom edge by

    ∫_{a0}^{a1} P(x, b0) dx.

Furthermore, we see that for given x,


    P(x, b1) − P(x, b0) = ∫_{b0}^{b1} (∂P/∂y) dy.

Thus the total flow along the top and bottom edges will be given by

    − ∫_{a0}^{a1} P(x, b1) dx + ∫_{a0}^{a1} P(x, b0) dx = − ∫_{a0}^{a1} ∫_{b0}^{b1} (∂P/∂y) dy dx.

The same reasoning shows that the flow along the left-hand and right-hand edges will be given by

    ∫_{a0}^{a1} ∫_{b0}^{b1} (∂Q/∂x) dy dx.

Thus the total flow around the rectangle will be given by


    ∫_{a0}^{a1} ∫_{b0}^{b1} (∂Q/∂x − ∂P/∂y) dy dx.

This is Green’s Theorem, as applied to the case of a rectangle.
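
As a sanity check on the rectangle case, one can compare the two sides numerically for a field that is not linear. The sketch below assumes the sample field P = xy², Q = x²y + sin x and the rectangle 0 ≤ x ≤ 2, 1 ≤ y ≤ 3; these choices, the midpoint rule, and the finite-difference step are illustrative and not part of the argument above.

```python
import math

def P(x, y):
    return x * y * y

def Q(x, y):
    return x * x * y + math.sin(x)

def midpoint(f, lo, hi, n=200):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

def circulation(a0, a1, b0, b1):
    """Counter-clockwise flow around the rectangle, one edge at a time."""
    bottom = midpoint(lambda x: P(x, b0), a0, a1)
    right = midpoint(lambda y: Q(a1, y), b0, b1)
    top = -midpoint(lambda x: P(x, b1), a0, a1)
    left = -midpoint(lambda y: Q(a0, y), b0, b1)
    return bottom + right + top + left

def curl_k(x, y, h=1e-5):
    """Central-difference estimate of dQ/dx - dP/dy."""
    dQdx = (Q(x + h, y) - Q(x - h, y)) / (2 * h)
    dPdy = (P(x, y + h) - P(x, y - h)) / (2 * h)
    return dQdx - dPdy

def double_integral(a0, a1, b0, b1):
    """Iterated midpoint-rule integral of curl_k over the rectangle."""
    return midpoint(lambda x: midpoint(lambda y: curl_k(x, y), b0, b1), a0, a1)

line = circulation(0.0, 2.0, 1.0, 3.0)
area = double_integral(0.0, 2.0, 1.0, 3.0)
print(line, area)   # the two values agree to several decimal places
```

For this particular field ∂Q/∂x − ∂P/∂y = cos x, so both numbers should be close to 2 sin 2.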

It is tempting to say that for a planar vector field F = P i + Q j, ∂Q/∂x − ∂P/∂y = ||∇ × F||. However
this may not be quite correct because the left-hand side is possibly negative. What is true is that
∂Q/∂x − ∂P/∂y is the k-coordinate of ∇ × F, and the correct way of expressing this is by the equation

    ∂Q/∂x − ∂P/∂y = k • ∇ × F.

Thus Green’s Theorem, for the case of a rectangle, states that the total flow of a vector field F
around a rectangle as described above is given by

    ∫_{a0}^{a1} ∫_{b0}^{b1} k • ∇ × F dy dx.

The proof given can easily be generalized to prove the general statement of Green’s Theorem.
Namely, if F(x, y) is a planar vector field defined in a portion of the plane containing a region Ω
bounded by a simple closed curve C , then

    ∬_Ω k • ∇ × F dx dy = ∮_C F • dr.

I have given the proof (which is essentially the same as that found in many standard texts) in my
article on Green’s Theorem.

Vector Fields in Three-space with Constant Magnitude

We can recall that in the two-dimensional case, we were able to understand a vector field of
constant magnitude by using polar coordinates. In three-space, we can attempt to use spherical
coordinates, although with less success.

Consider a vector field F with constant magnitude 1. Let β be the angle between F and k. We
can write F as the sum of a vertical component and a component in the xy-plane. Since F is a unit
vector, clearly the vertical component equals cos β k. Let γ be the angle between i and the
horizontal component of F and let Fh be a unit vector in the xy-plane in the same direction as this
planar component. Thus
Fh = cos γ i + sin γ j .

Since F and Fh are unit vectors, we see that the magnitude of the component of F in the xy-plane
is sin β . (By standard convention, we choose 0 ≤ β ≤ π so that sin β ≥ 0 .) Thus we have

F = (sin β) Fh + (cos β) k = (sin β)(cos γ i + sin γ j) + cos β k .



[Figure: the unit vector F drawn in xyz-space. F makes the angle β with k; its vertical component is (cos β) k, and its horizontal component (sin β) Fh lies in the xy-plane and makes the angle γ with i.]


Since the formulas now get rather complicated, we will resort to the standard notations βx = ∂β/∂x,
γy = ∂γ/∂y, etc. We get

        [ βx cos β cos γ − γx sin β sin γ    βy cos β cos γ − γy sin β sin γ    βz cos β cos γ − γz sin β sin γ ]
    J = [ βx cos β sin γ + γx sin β cos γ    βy cos β sin γ + γy sin β cos γ    βz cos β sin γ + γz sin β cos γ ]
        [ −βx sin β                          −βy sin β                          −βz sin β                       ]

Note that the directional derivative of F will be zero in a direction that is perpendicular to both ∇β
and ∇γ. Thus ∇β × ∇γ is an eigenvector for J corresponding to the eigenvalue 0. From the
geometry it is apparent that if either β or γ changes, then F will change, so that ∇β × ∇γ is in fact
the only possible eigenvector for J corresponding to the eigenvalue 0. (There is an exceptional and
somewhat tricky special case when β is 0 or π. We ignore this for the moment, as well as the
exceptional cases when ∇β or ∇γ is 0 or the two are parallel to each other.)

To clarify the pattern, let us use block notation to write


    J = [ βx w1   βy w1   βz w1 ] + (sin β) [ γx w2   γy w2   γz w2 ],

where w1 and w2 are the column vectors

    w1 = (cos β cos γ, cos β sin γ, − sin β)ᵀ   and   w2 = (− sin γ, cos γ, 0)ᵀ.

We can now note that all three columns of J are linear combinations of the two unit vectors w1
and w2 which are, predictably, orthogonal to F, and are also orthogonal to each other. Thus any
possible eigenvector with non-zero eigenvalue will be a linear combination of these two.

For an arbitrary vector v = a i + b j + c k, we get


    J v = [ βx w1 + γx sin β w2    βy w1 + γy sin β w2    βz w1 + γz sin β w2 ] v

        = (∇β • v) w1 + sin β (∇γ • v) w2.
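
The formula for J v can be checked against a finite-difference derivative. This is only a sketch: the particular functions chosen for β and γ, the sample point, and the helper names are assumptions made for illustration.

```python
import math

def beta(x, y, z):
    return 1.0 + 0.3 * x + 0.2 * y * z      # arbitrary smooth choice

def gamma(x, y, z):
    return 0.5 * x * y - 0.4 * z            # arbitrary smooth choice

def F(x, y, z):
    """Unit field F = sin(beta)(cos(gamma) i + sin(gamma) j) + cos(beta) k."""
    b, g = beta(x, y, z), gamma(x, y, z)
    return (math.sin(b) * math.cos(g), math.sin(b) * math.sin(g), math.cos(b))

def directional(f, p, v, h=1e-6):
    """Central-difference derivative of f at p along v (f scalar- or tuple-valued)."""
    plus = f(*[s + h * t for s, t in zip(p, v)])
    minus = f(*[s - h * t for s, t in zip(p, v)])
    if isinstance(plus, tuple):
        return tuple((a - c) / (2 * h) for a, c in zip(plus, minus))
    return (plus - minus) / (2 * h)

def predicted_Jv(p, v):
    """(grad beta . v) w1 + sin(beta) (grad gamma . v) w2, as in the text."""
    b, g = beta(*p), gamma(*p)
    w1 = (math.cos(b) * math.cos(g), math.cos(b) * math.sin(g), -math.sin(b))
    w2 = (-math.sin(g), math.cos(g), 0.0)
    db = directional(beta, p, v)             # grad beta . v
    dg = directional(gamma, p, v)            # grad gamma . v
    return tuple(db * a + math.sin(b) * dg * c for a, c in zip(w1, w2))

p = (0.3, -0.2, 0.5)
v = (1.0, 2.0, -1.0)
print(directional(F, p, v))   # J v computed directly from F
print(predicted_Jv(p, v))     # the same vector, up to finite-difference error
```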

If we choose v as a unit vector, we can rewrite this as

J v = Dv (β) (cos β (cos γ i + sin γ j) − sin β k) + Dv (γ) sin β (− sin γ i + cos γ j) .

I don’t see any way of getting anything useful out of this.

However consider the special case of a vector field U(x, y, z) with constant magnitude 1 and such
that ∇ × U = 0 . Then we know that, at least within a sufficiently small neighborhood, U is the
gradient field for some function g(x, y, z). Thus U is orthogonal to the level surface
g(x, y, z) = const. Since we are assuming that U has constant magnitude 1, we see that U is in fact
the unit normal vector for such a level surface. And in this case, we can get some additional insight by
means of differential geometry.

If v is any vector, then the fact that U has constant magnitude 1 implies that the directional
derivative of U at a point (x0 , y0 , z0 ) in the direction of v , which will be the vector J v , will be
perpendicular to U, and thus will be tangent to the level surface. Furthermore, J v will point in the
direction in which U(x, y, z) turns as (x, y, z) moves away from (x0 , y0 , z0 ) in the direction v , and its
magnitude will show the speed of this turning.

The fact that all the vectors J v lie in a plane indicates that we are in the situation analogous to
Case 2 or Case 3 of the Two-by-Two Theorem for 2 × 2 matrices. In other words, the 3 × 3 matrix J
is a singular matrix and has at most two eigenvectors corresponding to non-zero eigenvalues, and these
eigenvectors are all parallel to the tangent plane to the level surface at the given point.

But we have seen earlier that the assumption that ∇ × U = 0 implies that the 3 × 3 Jacobian
matrix J corresponding to the derivative of U is symmetric and thus has three mutually orthogonal
eigenvectors. Thus there must be a third eigenvector perpendicular to the tangent plane of the level
surface and with 0 as the corresponding eigenvalue. Since all vectors perpendicular to the tangent
plane are multiples of each other, this shows that U itself is an eigenvector for the Jacobean
matrix J corresponding to the eigenvalue 0.

It is worth recording this conclusion as a proposition.

Proposition. If a vector field U(x, y, z) in three-space is the gradient field for a function g(x, y, z)
and has constant magnitude 1, then at any point (x0 , y0 , z0 ), U(x0 , y0 , z0 ) is an eigenvector for the
Jacobean matrix J (x0 , y0 , z0 ) corresponding to the eigenvalue 0. Furthermore, J has two more
eigenvectors which are parallel to the tangent plane to the level surface g(x, y, z) = const at the
point (x0 , y0 , z0 ).
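
The simplest illustration of the Proposition is the distance function g = √(x² + y² + z²), whose gradient is the unit radial field and whose level surfaces are spheres. The sketch below, which assumes this particular g and uses a finite-difference Jacobian, checks that U is sent to 0 by J and that a tangent vector is an eigenvector.

```python
import math

def U(x, y, z):
    """Unit gradient field of g = sqrt(x^2 + y^2 + z^2)."""
    r = math.sqrt(x * x + y * y + z * z)
    return (x / r, y / r, z / r)

def jacobian(field, p, h=1e-6):
    """3x3 Jacobian by central differences; J[i][j] = d(field_i)/d(x_j)."""
    J = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        hi, lo = list(p), list(p)
        hi[j] += h
        lo[j] -= h
        fp, fm = field(*hi), field(*lo)
        for i in range(3):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

def matvec(J, v):
    return tuple(sum(J[i][j] * v[j] for j in range(3)) for i in range(3))

p = (1.0, 2.0, 2.0)                   # a point on the sphere of radius 3
J = jacobian(U, p)
tangent = (2.0, -1.0, 0.0)            # orthogonal to p, hence tangent to the sphere
print(matvec(J, U(*p)))               # approximately (0, 0, 0): eigenvalue 0
print(matvec(J, tangent))             # approximately tangent / 3: eigenvalue 1/3
```

For the sphere every tangent vector is an eigenvector, with eigenvalue 1/r; for a general level surface only two tangent directions have this property.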

Now consider a curve on the surface through a point (x0 , y0 , z0 ) and let v be the unit tangent
vector to this curve. Any curve on the level surface will have curvature arising from two different
causes: first, the curvature that it may have as seen by someone actually on the surface itself and
looking at it from that perspective, and second the curvature that it unavoidably has because it is on
a curved surface. This second curvature, at a point (x0 , y0 , z0 ) on the surface, is called the normal
curvature of the surface at the point.

In other words, if one is building a road on the surface through the point (x0 , y0 , z0 ), then one
may do one’s very best to make the road straight, but even so it will have some curvature, because it
is on a curved surface. This unavoidable curvature when the road goes in a direction v is what we
mean by the normal curvature of the surface in that direction.

To give the definition in more practical, although possibly a little less intuitive fashion, at the
given point (x0 , y0 , z0 ) on the surface, and for a given direction v , one should construct the plane
through the given point and going through both v and the normal vector U(x0 , y0 , z0 ) to the surface
at this point. This plane will then cut the surface in a curve, and the curvature of this curve is what is
meant by the normal curvature of the surface at the given point in the given direction v . (If one does
this on a sphere, then the curve thereby constructed will be a great circle. This is in fact the most
practical way of describing what is meant by a great circle.)

I am belaboring this issue, because the concept of the normal curvature of a surface at a given
point in a given direction is difficult to master if one doesn’t know differential geometry, but it is
essential for gaining an intuitive understanding of the divergence of a vector field with constant
magnitude and zero curl in three dimensions.

Now watch what happens to the normal vector U(x, y, z) as (x, y, z) moves away from a
point (x0 , y0 , z0 ) along a normal curve in the level surface with unit tangent vector v . U turns in the
direction of its directional derivative Dv (U), which is perpendicular to U and therefore is in the
tangent plane. Now this directional derivative can be written as the sum of two components: one of
which is in the direction of v and the second of which is orthogonal to v .

Now the first component, which shows the rate at which U is turning towards v , is the rate at
which the projection of U onto the plane of the normal curve (i. e. the plane determined by v
and U) turns, and this will be the same as the rate at which the tangent vector to the normal curve is
turning, since U and the tangent vector are perpendicular. Thus the magnitude of the first
component of Dv (U) is the same as the curvature of the normal curve.

The second component of Dv (U), orthogonal to the normal curve, shows the rate at which U is
turning away from the plane of the normal curve, due to the fact that the surface is, we might say,
tilting sideways as we move along the curve.

Intuitively, one can think of this in terms of driving a car on a mountain at a place which has not
been smoothed out into a road. Generally, when one drives in a particular direction the car tips
forward or backward in the direction one is driving, but also tilts from side to side in the
perpendicular direction. The normal curvature of the road corresponds only to the forward or
backward tipping. (If the car has the sort of antenna that sticks straight up from the roof, then this
antenna would be in the direction of the normal vector U.)

Theorem. Suppose that the gradient field U = ∇g for a function g(x, y, z) has constant
magnitude 1. Let v and w be two orthogonal unit vectors tangent to the level surface
g(x, y, z) = const at a point (x0, y0, z0) on the surface. Let J0 be the Jacobian matrix J(x0, y0, z0) for
the derivative of U at the point (x0 , y0 , z0 ). Then
J0 v = a11 v + a12 w

J0 w = a21 v + a22 w
where

(1) a11 is the normal curvature of the level surface g = const at the point (x0 , y0 , z0 ) in the
v direction and a22 is the normal curvature in the w direction. Here, contrary to the usual
convention, one should interpret the curvature as a signed number which is positive if the
normal curve curves in the direction away from U and negative if it curves in the direction
towards U.
(2) Furthermore, a12 is the rate at which the surface tilts toward the w direction as one moves
on it away from (x0 , y0 , z0 ) along the normal curve in the v direction at unit speed, and a21
is the rate at which the surface tilts toward the v direction as one moves along the normal
curve in the w direction.
(3) Furthermore, a12 = a21 .
(4) The divergence of U equals a11 + a22 .

proof: The normal curve in the direction v to the level surface has v as its tangent vector.

J v is the directional derivative Dv (U) of U as (x, y, z) moves away from (x0, y0, z0) at unit speed in
the direction v, i. e. along the normal curve through (x0, y0, z0) in the direction v. Now since U has
constant magnitude, this directional derivative is perpendicular to U and gives the rate at which U is
turning as (x, y, z) moves along the normal curve at unit speed. Since Dv (U) is perpendicular to U, it
lies in the tangent plane to the level surface, and thus can be broken up into two orthogonal
components, one in the direction of v and one perpendicular to it: Dv (U) = a11 v + a12 w.

Now a11 gives the rate at which U turns towards v as (x, y, z) moves away from (x0, y0, z0) along
the normal curve in the direction v. Since v and U are perpendicular and in the same plane, this is
also the rate at which the unit tangent vector to the normal curve turns as we move along the curve
away from (x0, y0, z0). Since v is the unit tangent vector to the normal curve at that point, this equals
the curvature of the normal curve at (x0, y0, z0), except that it is negative if U is turning away
from v (towards −v), i. e. if the normal curve is curving towards U.

On the other hand, a12 is the rate at which U turns towards w as (x, y, z) moves along the normal
curve in the v direction. Since U is perpendicular to the level surface, it is more or less self-evident
that this represents the rate at which the surface is tilting.

With respect to the orthonormal vector basis v, w, U, the Jacobean matrix for the derivative of
U is

    [ a11  a12  ∗ ]
    [ a21  a22  ∗ ]
    [ 0    0    0 ]

The divergence of U is the trace of this matrix, i. e. a11 + a22.

Since U is a gradient field, it is known that this matrix is symmetric. Thus a12 = a21 .
(Furthermore, this shows that the unspecified entries in the matrix indicated by asterisks must both
be 0.)

Restating this more verbally, we get:

Theorem. If U is the gradient field of a function g(x, y, z) and U has constant magnitude 1, then
at any given point (x0 , y0 , z0 ), the divergence of U is the sum of the curvatures of any two orthogonal
normal curves on the level surface g(x, y, z) = const through the point (x0 , y0 , z0 ), provided that one
treats the curvature as negative in case the normal curve curves towards (rather than away from) U.
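
Two familiar surfaces make the theorem concrete. For a sphere of radius r every normal curve is a great circle with curvature 1/r, so the divergence of the outward unit normal field should be 2/r. For a cylinder of radius ρ the two orthogonal normal curves can be taken to be a circle (curvature 1/ρ) and a straight line (curvature 0), giving 1/ρ. The sketch below checks both by finite differences; the sample points and function names are illustrative assumptions.

```python
import math

def divergence(field, p, h=1e-5):
    """Trace of the Jacobian of `field` at p, by central differences."""
    total = 0.0
    for i in range(3):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        total += (field(*hi)[i] - field(*lo)[i]) / (2 * h)
    return total

def sphere_normal(x, y, z):
    """Unit gradient of g = sqrt(x^2 + y^2 + z^2); level surfaces are spheres."""
    r = math.sqrt(x * x + y * y + z * z)
    return (x / r, y / r, z / r)

def cylinder_normal(x, y, z):
    """Unit gradient of g = sqrt(x^2 + y^2); level surfaces are cylinders."""
    rho = math.hypot(x, y)
    return (x / rho, y / rho, 0.0)

print(divergence(sphere_normal, (1.0, 2.0, 2.0)))     # approximately 2/3, since r = 3
print(divergence(cylinder_normal, (3.0, 4.0, 7.0)))   # approximately 1/5, since rho = 5
```

Both values are positive because in each case the normal curves bend away from the outward normal, in agreement with the sign convention of the theorem.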

As noted in a Proposition above, the Jacobean matrix for the derivative of a vector field U = ∇g
with constant magnitude 1 has at any given point U itself as an eigenvector, corresponding to the
eigenvalue 0, along with two other eigenvectors which are orthogonal to each other and lie in the
tangent plane to the level surface g(x, y, z) = const. Let us choose v and w as these two
eigenvectors. The discussion above shows that to say that v is an eigenvector for J is thus to say
that when (x, y, z) moves away from the point (x0 , y0 , z0 ) in the v direction, then the normal vector
U(x, y, z) to the surface turns in the direction v . The eigenvalue corresponding to v will be negative
in the case that the normal curve through (x0 , y0 , z0 ) in the direction v curves toward U and will be
positive if the normal curve curves away from the direction of U.

In terms of my previous analogy, the eigenvectors v and w correspond to the only two directions
in which one can drive on the mountain and have the car tip only up or down in the direction one is
driving without also tilting sideways.

Finally, there is a fact from differential geometry which I will not prove, namely that at any given
point, the normal curves in the directions of the two eigenvectors of the Jacobean matrix will have the
maximum and minimum normal curvatures of all normal curves through that point.

To summarize, we get the following theorem:

Theorem. If a three-dimensional vector field F is the gradient field of a function g(x, y, z), and if F
has constant magnitude, then the divergence of F at a point (x0, y0, z0) is in absolute value the
product of ||F|| and the sum (or possibly difference) of the maximal and minimal normal curvatures
of the level surface g(x, y, z) = const at that point.

(One needs to take the difference rather than the sum of the two curvatures in case one normal
curve curves towards F and the other curves away from it, in which case the surface will look vaguely
like a saddle point at that point. The difficulty arises because I have attempted to state the theorem
in terms that respect the standard convention that curvature is always positive. If one adopts the
convention that I have used in most of the discussion, where curvature is a signed number, then the
statement of the theorem is much more straightforward.)

The Bottom Line on Divergence.

From what we have seen for the special case where F is a gradient field, one can see the
significance in general of ∇ • F for any differentiable vector field F. What we see is that the
divergence of a vector field shows the rate at which that vector field is, um, diverging. And this
happens in two ways: positive divergence will occur when F(x, y, z) increases in magnitude when
(x, y, z) moves in the direction of the field, and also when the vector field “fans out” as it were, that is
to say when F(x, y, z) turns towards the direction of the movement of (x, y, z) as (x, y, z) moves in a
direction perpendicular to the direction of F(x0 , y0 , z0 ). If one thinks of the field as giving the velocity
of a fluid, this would mean that as one moves away from a given point P0 in a direction perpendicular
to the velocity of the fluid at that point, the nearby streams splay out from the direction of the fluid
through P0 , somewhat the way water sprays out from a nozzle that is set on “spray.”

If the verbal expression of this seems unclear, the technical discussion below should clarify it. We
simply duplicate the same reasoning used above in the special case where F was a gradient. As in
that special case, the crucial fact is that (as shown in an article on my web site) the directional
derivative of a vector field F in any given direction can be decomposed into two orthogonal
components, one of which is parallel to the vector field F and shows the rate at which the
magnitude of F is increasing as one moves in the given direction, and the other is orthogonal to F
and whose magnitude is the product of the magnitude of F and the rate at which F is turning as one
moves in the given direction.

Consider a vector field F(x, y, z). Change coordinates so that i is in the direction F(x0 , y0 , z0 ).
Suppose that the Jacobean matrix for F at the point (x0, y0, z0) is

        [ a11  a12  a13 ]
    J = [ a21  a22  a23 ]
        [ a31  a32  a33 ]

Then
∇ · F = a11 + a22 + a33 = i · D i (F) + j · D j (F) + k · D k (F).
The directional derivative of F in the direction i is the vector

D i (F) = a11 i + a21 j + a31 k .

Since the direction of i is the same as the direction of F, a11 is the rate at which ||F|| increases (or
decreases, if a11 is negative) as (x, y, z) moves away from (x0 , y0 , z0 ) in the direction of i (i. e. the
direction of F). On the other hand, a21 j + a31 k, which is the component of D i (F) perpendicular to
the direction of F, points in the direction in which F turns as (x, y, z) moves in the i direction and
its magnitude, √(a21² + a31²), is the product of ||F|| and the rate at which F turns. This second
component of D i (F) is not relevant to the divergence.

Now the directional derivative of F in the j-direction is D j (F) = a12 i + a22 j + a32 k. Since i is
in the direction of F, the component a12 i , which does not contribute to the divergence, is in the
direction of F and indicates the rate at which ||F|| changes as (x, y, z) moves in the j-direction. And
a22 j + a32 k points in the direction in which F is turning as (x, y, z) moves away from (x0 , y0 , z0 ) in
the j-direction, and its magnitude, i. e. √(a22² + a32²) (here we use the fact that j and k are
orthogonal), gives the speed of this turning multiplied by ||F||.

Now we see that this vector corresponding to the turning is the sum of two components: a22 j and
a32 k. The first of these is determined by the turning of F towards the j direction, which is the
direction the point (x, y, z) is moving in. This is the component which is significant for the
divergence. On the other hand, the component a32 k, which does not contribute to the divergence, is
determined by the rate at which F turns toward the k direction as (x, y, z) moves in the j direction.
In the language used previously, a22 is the product of ||F|| and the rate at which F tips forwards or
backwards as (x, y, z) moves in the j-direction, and a32 is the product of ||F|| and the rate at which F
tilts sideways as (x, y, z) moves in that direction.

Likewise, a33 is determined by the rate at which F tips forwards or backwards as (x, y, z)
moves in the k-direction, and a23 shows the rate at which F tilts sideways. We see that the tipping
of F forwards or backwards as (x, y, z) moves in the j and k directions contributes to the
divergence, but the sidewise tilting of F does not.

We can sum this all up as a theorem.

Theorem. Consider a vector field F(x, y, z) at a point (x0 , y0 , z0 ). Let u be a unit vector in the
direction of F(x0 , y0 , z0 ) (i. e. u = F/||F||) and let v and w be unit vectors orthogonal to
F(x0 , y0 , z0 ) and to each other. Then the divergence of F at the point (x0 , y0 , z0 ) is the sum of the
rate at which ||F|| is increasing as (x, y, z) moves away from (x0 , y0 , z0 ) in the u direction (taken as
negative if ||F|| decreases rather than increases) and the product of ||F|| with the sum of the rates at
which F tips toward the direction of the movement of (x, y, z) as (x, y, z) moves away from
(x0, y0, z0) in the v and w directions (with the convention that these rates should be taken as negative
if F tips away from the direction of movement of (x, y, z) rather than towards it).
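
The statement of the theorem can be tested on a field that is neither a gradient nor of constant magnitude. In the sketch below the sample field F, the sample point, and the way v and w are chosen are all illustrative assumptions; "stretch" is the rate of change of ||F|| in the u direction, and "tip" is the rate at which the unit vector F/||F|| turns toward the direction of movement.

```python
import math

def F(x, y, z):
    return (x * y + 1.0, y * y - z, math.sin(x) + 2.0 * z)   # div F = 3y + 2

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def unit(a):
    n = norm(a)
    return tuple(s / n for s in a)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def directional(f, p, v, h=1e-6):
    """Central-difference derivative at p along v of a tuple-valued function f."""
    plus = f(*[s + h * t for s, t in zip(p, v)])
    minus = f(*[s - h * t for s, t in zip(p, v)])
    return tuple((a - c) / (2 * h) for a, c in zip(plus, minus))

def divergence_from_theorem(p):
    u = unit(F(*p))
    v = unit(cross(u, (0.0, 0.0, 1.0)))    # assumes F(p) is not vertical
    w = cross(u, v)                        # completes an orthonormal triple
    magnitude = lambda x, y, z: (norm(F(x, y, z)),)
    direction = lambda x, y, z: unit(F(x, y, z))
    stretch = directional(magnitude, p, u)[0]
    tip_v = dot(v, directional(direction, p, v))
    tip_w = dot(w, directional(direction, p, w))
    return stretch + norm(F(*p)) * (tip_v + tip_w)

p = (0.5, 1.0, -0.3)
print(divergence_from_theorem(p))   # approximately 5.0
print(3 * p[1] + 2)                 # 5.0, the divergence computed directly
```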

The Bottom Line for the Curl of a Vector Field with Constant Magnitude.

Now look at curl in terms of its basic definition. Consider a vector field U in three-space with constant
magnitude 1 and fix a point P0 = (x0, y0, z0). Change to a new coordinate system in which i is
parallel to U(x0, y0, z0) and j points in the direction in which U is turning when (x, y, z) moves
away from P0 in the direction U. The direction of k will then be determined by the right-hand rule.
Because U has constant magnitude, the directional derivatives of U at P0 are orthogonal
to U(x0 , y0 , z0 ), and thus orthogonal to i , which means that the Jacobean matrix J for U at P0
will have zeros in the first row. (It then follows that J has an eigenvector corresponding to the
eigenvalue 0.) Furthermore, the directional derivative of U in the i direction will point to the
direction that U turns in when (x, y, z) moves in the i direction, i. e. the first column will be a
multiple of j, i. e. is zero except in the second entry.

Thus the Jacobean matrix for U at the point P0 will look like

        [ 0    0    0   ]
    J = [ a21  a22  a23 ]
        [ 0    a32  a33 ]

where a21 is the speed at which U turns as (x, y, z) moves away from P0 in the i direction, i. e. the
direction of U. Then
∇ × U = (a32 − a23 ) i + a21 k .
Thus the curl for a vector field U which has constant magnitude 1 is the sum of two vector
components. The first component is in the direction of the axis around which U(x, y, z) turns as
(x, y, z) moves away from (x0, y0, z0) in the direction of U (i. e. the direction i) at unit speed, and
its magnitude is the speed of this turning. This is what we found in the two-dimensional case. But for
the three dimensional case, there is a second component to the curl in the direction of U itself. This
second component, (a32 − a23 ) i , is the curl of the two-dimensional field which is the projection of U
onto the plane perpendicular to U(x0 , y0 , z0 ). It vaguely (very vaguely) indicates the extent to which
the tip of U swirls (“circulates” would be the technically accurate word) as (x, y, z) moves
around (x0 , y0 , z0 ) in the plane perpendicular to U(x0 , y0 , z0 ).

The intuitive significance of this second component is not as nice as one might hope for, but it seems
that it’s the best one can get.
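
The two-part description of the curl can be restated without the special coordinates: the part of ∇ × U perpendicular to U should equal U × DU(U), where DU(U) = J U is the derivative of U in its own direction (in the adapted coordinates this is i × a21 j = a21 k), and the part along U has coefficient U • (∇ × U). The sketch below checks the first statement numerically; the sample unit field and the helper names are assumptions chosen for illustration.

```python
import math

def U(x, y, z):
    """A sample unit field built from two angle functions (an arbitrary choice)."""
    b = 1.0 + 0.3 * x + 0.2 * y * z
    g = 0.5 * x * y - 0.4 * z
    return (math.sin(b) * math.cos(g), math.sin(b) * math.sin(g), math.cos(b))

def jacobian(field, p, h=1e-6):
    """3x3 Jacobian by central differences; J[i][j] = d(field_i)/d(x_j)."""
    J = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        hi, lo = list(p), list(p)
        hi[j] += h
        lo[j] -= h
        fp, fm = field(*hi), field(*lo)
        for i in range(3):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def curl_parts(p):
    """Return (curl minus its part along U, U x D_U(U), U . curl) at the point p."""
    J = jacobian(U, p)
    u = U(*p)
    curl = (J[2][1] - J[1][2], J[0][2] - J[2][0], J[1][0] - J[0][1])
    turning = tuple(dot(row, u) for row in J)          # D_U(U) = J u
    along = dot(u, curl)
    across = tuple(c - along * s for c, s in zip(curl, u))
    return across, cross(u, turning), along

across, expected, along = curl_parts((0.3, -0.2, 0.5))
print(across)      # component of the curl perpendicular to U
print(expected)    # U x D_U(U): approximately the same vector
print(along)       # coefficient of the component of the curl along U
```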
