
Chapter 1

Describing the Physical World: Vectors & Tensors

It is now well established that all matter consists of elementary particles¹ that interact through
mutual attraction or repulsion. In our everyday life, however, we do not see the elemental nature
of matter; we see continuous regions of material such as the air in a room, the water in a glass or
the wood of a desk. Even in a gas, the number of individual molecules in macroscopic volumes is
enormous, $\approx 10^{19}$ cm$^{-3}$. Hence, when describing the behaviour of matter on scales of microns or
above it is simply not practical to solve equations for each individual molecule. Instead, theoretical
models are derived using average quantities that represent the state of the material.
In general, these average quantities will vary with position in space, but an important concept to
bear in mind is that the fundamental behaviour of matter should not depend on the particular
coordinate system chosen to represent the position. The consequences of this almost trivial observation
are far-reaching and we shall find that it dictates the form of many of our governing equations. We
shall always consider our space to be three-dimensional and Euclidean² and we describe position in
space by a position vector, r, which runs from a specific point, the origin, to our chosen location.
The exact coordinate system chosen will depend on the problem under consideration; ideally it
should make the problem as “easy” as possible.

1.1 Vectors
A vector is a geometric quantity that has a “magnitude and direction”. A more mathematically
precise, but less intuitive, definition is that a vector is an element of a vector space. Many physical
quantities are naturally described in terms of vectors, e.g. position, velocity and acceleration, force.
The invariance of material behaviour under changes in coordinates means that if a vector represents
a physical quantity then it must not vary if we change our coordinate system. Imagine drawing a
line that connects two points in a two-dimensional (Euclidean) plane: that line remains unchanged
whether we describe it as “x units across and y units up from the origin” or “r units from the origin
in the θ direction”. Thus, a vector is an object that exists independent of any coordinate system,
but if we wish to describe it we must choose a specific coordinate system and its representation in
that coordinate system (its components) will depend on the specific coordinates chosen.
¹ The exact nature of the most basic unit is, of course, still debated, but the fundamental discrete nature of matter
is not.
² We won't worry about relativistic effects at all.

1.1.1 Cartesian components and summation convention
The fact that we have a Euclidean space means that we can always choose a Cartesian coordinate
system with fixed orthonormal base vectors, e1 = i, e2 = j and e3 = k. For a compact notation, it
is much more convenient to use the numbered subscripts rather than different symbols to distinguish
the base vectors. Any vector quantity a can be written as a sum of its components in the direction
of the base vectors
a = a1 e1 + a2 e2 + a3 e3 ; (1.1)
and the vector a can be represented via its components (a1 , a2 , a3 ); and so, e1 = (1, 0, 0), e2 =
(0, 1, 0) and e3 = (0, 0, 1). We will often represent the components of vectors using an index, i.e.
(a1 , a2 , a3 ) is equivalent to aI , where I ∈ {1, 2, 3}. In addition, we use the Einstein summation
convention in which any index that appears twice represents a sum over all values of that index
$$ \mathbf{a} = \sum_{J=1}^{3} a^J \mathbf{e}_J = a^J \mathbf{e}_J. \qquad (1.2) $$

Note that we can change the (dummy) summation index without affecting the result
$$ \sum_{J=1}^{3} a^J \mathbf{e}_J = a^J \mathbf{e}_J = \sum_{K=1}^{3} a^K \mathbf{e}_K = a^K \mathbf{e}_K. $$

The summation is ambiguous if an index appears more than twice and such terms are not allowed.
For clarity later, an upper case index is used for objects in a Cartesian (or, in fact, any orthonormal)
coordinate system and, in general, we will insist that summation can only occur over a raised index
and a lowered index for reasons that will hopefully become clear shortly.
It is important to recognise that the components of a vector aI do not actually make sense unless
we know the base vectors as well. In isolation the components give you distances but not direction,
which is only half the story.
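
As an illustration of the summation convention, the following minimal sketch evaluates the sum in (1.2) twice: once as an explicit loop over the repeated index and once with NumPy's `einsum`, whose index strings mimic the convention. The component values and the variable names (`a_comp`, `E`) are illustrative assumptions, not part of the text.

```python
import numpy as np

# Rows of E are the Cartesian base vectors e_1, e_2, e_3.
E = np.eye(3)
# Illustrative components a^J (values chosen arbitrarily).
a_comp = np.array([2.0, -1.0, 0.5])

# Explicit sum over the repeated index J: a = sum_J a^J e_J.
a_explicit = sum(a_comp[J] * E[J] for J in range(3))

# The same contraction with einsum: the repeated label J is summed over.
a_einsum = np.einsum("J,Jk->k", a_comp, E)

# Renaming the dummy index from J to K does not change the result.
a_renamed = np.einsum("K,Kk->k", a_comp, E)
```

Because the base vectors are the rows of the identity matrix, the assembled vector simply reproduces the components, which is why components and vector are so easily conflated in Cartesian coordinates.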

1.1.2 Curvilinear coordinate systems


For a complete theoretical development, we shall consider general coordinate systems³. Unfortunately,
the use of general coordinate systems introduces considerable complexity because the lines on
which individual coordinates are constant are not necessarily straight lines nor are they necessarily
orthogonal to one another. A consequence is that the base vectors in general coordinate systems
are not orthonormal and vary as functions of position.

1.1.3 Tangent (covariant base) vectors


The position vector $\mathbf{r} = x^K \mathbf{e}_K$ is a function of the Cartesian coordinates, $x^K$, where $x^1 = x$, $x^2 = y$
and $x^3 = z$. Note that the Cartesian base vectors can be recovered by differentiating the position
vector with respect to the appropriate coordinate
$$ \frac{\partial \mathbf{r}}{\partial x^K} = \mathbf{e}_K. \qquad (1.3) $$
³ For our purposes, a coordinate system is a set of independent scalar variables that can be used to describe
any position in the entire Euclidean space.
In other words the derivative of position with respect to a coordinate returns a vector tangent to the
coordinate direction; a statement that is true for any coordinate system. For a general coordinate
system, ξ i , we can write the position vector as

r(ξ 1 , ξ 2 , ξ 3) = xK (ξ i )eK , (1.4)

because the Cartesian base vectors are fixed. Here the notation xK (ξ i ) means that the Cartesian
coordinates can be written as functions of the general coordinates, e.g. in plane polars x(r, θ) =
r cos θ, y(r, θ) = r sin θ, see Example 1.1. Note that equation (1.4) is the first time in which we use
upper and lower case indices to distinguish between the Cartesian and general coordinate systems.
A tangent vector in the ξ 1 direction, t1 , is the difference between two position vectors associated
with a small (infinitesimal) change in the ξ 1 coordinate

t1 (ξ i ) = r(ξ i + dξ 1 ) − r(ξ i ), (1.5)

where dξ 1 represents the small change in the ξ 1 coordinate direction, see Figure 1.1.

Figure 1.1: Sketch illustrating the tangent vector $\mathbf{t}_1(\xi^i)$ corresponding to a small change $d\xi^1$ in the
coordinate $\xi^1$. The tangent lies along a line of constant $\xi^2$ in two dimensions, or a line of constant
$\xi^2$ and $\xi^3$ in three dimensions.

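Assuming that $\mathbf{r}$ is differentiable and Taylor expanding the first term in (1.5) demonstrates that
$$ \mathbf{t}_1 = \mathbf{r}(\xi^i) + \frac{\partial \mathbf{r}}{\partial \xi^1}\, d\xi^1 - \mathbf{r}(\xi^i) + O\big((d\xi^i)^2\big), $$
which yields
$$ \mathbf{t}_1 = \frac{\partial \mathbf{r}}{\partial \xi^1}\, d\xi^1, $$
if we neglect the (small) quadratic and higher-order terms. Note that exactly the same argument
can be applied to increments in the $\xi^2$ and $\xi^3$ directions and because $d\xi^i$ are scalar lengths, it follows
that
$$ \mathbf{g}_i = \frac{\partial \mathbf{r}}{\partial \xi^i} $$
is also a tangent vector in the $\xi^i$ direction, as claimed. Hence, using equation (1.4), we can compute
tangent vectors in the general coordinate directions via
$$ \mathbf{g}_i = \frac{\partial \mathbf{r}}{\partial \xi^i} = \frac{\partial x^K}{\partial \xi^i}\, \mathbf{e}_K. \qquad (1.6) $$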
We can interpret equation (1.6) as defining a local linear transformation between the Cartesian
base vectors and our new tangent vectors g i . The transformation is linear because g i is a linear
combination of the vectors eK , which should not be a surprise because we explicitly neglected the
quadratic and higher terms in the Taylor expansion. The transformation is local because, in general,
the coefficients will change with position. The coefficients of the transformation can be written as
entries in a matrix, M, in which case equation (1.6) becomes
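$$ \begin{pmatrix} \mathbf{g}_1 \\ \mathbf{g}_2 \\ \mathbf{g}_3 \end{pmatrix} = \overbrace{\begin{pmatrix} \frac{\partial x^1}{\partial \xi^1} & \frac{\partial x^2}{\partial \xi^1} & \frac{\partial x^3}{\partial \xi^1} \\ \frac{\partial x^1}{\partial \xi^2} & \frac{\partial x^2}{\partial \xi^2} & \frac{\partial x^3}{\partial \xi^2} \\ \frac{\partial x^1}{\partial \xi^3} & \frac{\partial x^2}{\partial \xi^3} & \frac{\partial x^3}{\partial \xi^3} \end{pmatrix}}^{\mathsf{M}} \begin{pmatrix} \mathbf{e}_1 \\ \mathbf{e}_2 \\ \mathbf{e}_3 \end{pmatrix}. \qquad (1.7) $$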

Provided that the transformation is non-singular (the determinant of the matrix M is non-zero)
the tangent vectors will also be a basis of the space and they are called covariant base vectors
because the transformation preserves the tangency of the vectors to the coordinates. In general,
the covariant base vectors are neither orthogonal nor of unit length. It is also important to note
that the covariant base vectors will usually be functions of position.
Example 1.1. Finding the covariant base vectors for plane polar coordinates
A plane polar coordinate system is defined by the two coordinates ξ 1 = r, ξ 2 = θ such that
x = x1 = r cos θ and y = x2 = r sin θ.
Find the covariant base vectors.
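Solution 1.1. The position vector is given by
$$ \mathbf{r} = x^1 \mathbf{e}_1 + x^2 \mathbf{e}_2 = r\cos\theta\, \mathbf{e}_1 + r\sin\theta\, \mathbf{e}_2 = \xi^1 \cos\xi^2\, \mathbf{e}_1 + \xi^1 \sin\xi^2\, \mathbf{e}_2, $$
and using the definition (1.6) gives
$$ \mathbf{g}_1 = \frac{\partial \mathbf{r}}{\partial \xi^1} = \cos\xi^2\, \mathbf{e}_1 + \sin\xi^2\, \mathbf{e}_2, \quad\text{and}\quad \mathbf{g}_2 = \frac{\partial \mathbf{r}}{\partial \xi^2} = -\xi^1 \sin\xi^2\, \mathbf{e}_1 + \xi^1 \cos\xi^2\, \mathbf{e}_2. $$
Note that $\mathbf{g}_1$ is a unit vector,
$$ |\mathbf{g}_1| = \sqrt{\mathbf{g}_1\cdot\mathbf{g}_1} = \sqrt{\cos^2\xi^2 + \sin^2\xi^2} = 1, $$
but $\mathbf{g}_2$ is not, $|\mathbf{g}_2| = \xi^1$. The vectors are orthogonal, $\mathbf{g}_1\cdot\mathbf{g}_2 = 0$, and are related to the standard
orthonormal polar base vectors via $\mathbf{g}_1 = \mathbf{e}_r$ and $\mathbf{g}_2 = r\,\mathbf{e}_\theta$.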
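
The result of Example 1.1 can be checked numerically. The sketch below compares the analytic covariant base vectors with a central-difference approximation to the definition (1.6); the function names, the sample point and the step size `h` are assumptions made for illustration only.

```python
import numpy as np

def position(xi):
    """Cartesian components of the position vector for plane polars (r, theta)."""
    r, theta = xi
    return np.array([r * np.cos(theta), r * np.sin(theta)])

def covariant_basis_polar(r, theta):
    """Matrix whose rows are g_1 and g_2 in Cartesian components (Example 1.1)."""
    g1 = np.array([np.cos(theta), np.sin(theta)])
    g2 = np.array([-r * np.sin(theta), r * np.cos(theta)])
    return np.vstack([g1, g2])

def covariant_basis_fd(xi, h=1e-6):
    """Central-difference approximation to g_i = d r / d xi^i."""
    xi = np.asarray(xi, dtype=float)
    rows = []
    for i in range(len(xi)):
        step = np.zeros_like(xi)
        step[i] = h
        rows.append((position(xi + step) - position(xi - step)) / (2.0 * h))
    return np.vstack(rows)

r0, theta0 = 2.0, 0.7  # arbitrary sample point
M_exact = covariant_basis_polar(r0, theta0)
M_fd = covariant_basis_fd([r0, theta0])
```

At the sample point the two matrices agree to within the finite-difference error, and the rows of `M_exact` have lengths 1 and `r0` and are mutually orthogonal, as found in the solution.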

1.1.4 Contravariant base vectors


The fact that the covariant basis is not necessarily orthonormal makes life somewhat awkward. For
orthonormal systems we are used to the fact that when $\mathbf{a} = a^K \mathbf{e}_K$, then unique components can be
obtained via a dot product⁴,
$$ \mathbf{a}\cdot\mathbf{e}^I = a^K \mathbf{e}_K\cdot\mathbf{e}^I = a^I, \qquad (1.8) $$
where the last equality is a consequence of the orthonormality. In our covariant basis, we would
write $\mathbf{a} = a^k \mathbf{g}_k$, so that
$$ \mathbf{a}\cdot\mathbf{g}_i = a^k \mathbf{g}_k\cdot\mathbf{g}_i, \qquad (1.9) $$
but no further simplification can be made. Equations (1.9) are a linear system of simultaneous
equations that must be solved in order to find the values of $a^k$, which is considerably more effort
than using an explicit formula such as (1.8). The explicit formula (1.8) arises because the matrix
$\mathbf{e}_K\cdot\mathbf{e}^I$ is diagonal, which means that the equations decouple and are no longer simultaneous.
⁴ The dot or scalar product is an operation on two vectors that returns a unique scalar: the product of the lengths
of the two vectors and the cosine of the angle between them. In the present context we only need to know that
for orthogonal vectors the dot product is zero and the dot product of a unit vector with itself is one, see §1.3.1 for
further discussion.
We can, however, recover most of the nice properties of an orthonormal coordinate system if we
define another set of base vectors that are each orthogonal to two of the covariant base vectors and
have unit length when projected in the direction of the remaining covariant base vector. In other
words, the new vectors $\mathbf{g}^i$ are defined such that
$$ \mathbf{g}^i\cdot\mathbf{g}_j = \delta^i_j \equiv \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{otherwise}, \end{cases} \qquad (1.10) $$
where the object $\delta^i_j$ is known as the Kronecker delta. In orthonormal coordinate systems the two
sets of base vectors coincide; for example, in our global Cartesian coordinates $\mathbf{e}^I \equiv \mathbf{e}_I$.
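We can decompose $\mathbf{g}^i$ into its components in the Cartesian basis, $\mathbf{g}^i = g^i_K \mathbf{e}^K$, where we have
used the raised index on the base vectors for consistency with our summation convention. Note
that $g^i_K$ is thus defined to be the $K$-th Cartesian component of the $i$-th contravariant base vector.
From the definition (1.10) and (1.6)
$$ \left(g^i_K \mathbf{e}^K\right)\cdot\left(\frac{\partial x^L}{\partial \xi^j}\mathbf{e}_L\right) = g^i_K \frac{\partial x^L}{\partial \xi^j}\, \mathbf{e}^K\cdot\mathbf{e}_L = g^i_K \frac{\partial x^L}{\partial \xi^j}\, \delta^K_L = g^i_L \frac{\partial x^L}{\partial \xi^j} = \delta^i_j. \qquad (1.11) $$
Note that we have used the "index-switching" property of the Kronecker delta to write $g^i_K \delta^K_L = g^i_L$,
which can be verified by writing out all terms explicitly.
Multiplying both sides of equation (1.11) by $\partial \xi^j / \partial x^K$ yields
$$ g^i_L \frac{\partial x^L}{\partial \xi^j} \frac{\partial \xi^j}{\partial x^K} = \delta^i_j \frac{\partial \xi^j}{\partial x^K} = \frac{\partial \xi^i}{\partial x^K}; $$
and from the chain rule
$$ \frac{\partial x^L}{\partial \xi^j} \frac{\partial \xi^j}{\partial x^K} = \frac{\partial x^L}{\partial x^K} = \delta^L_K, $$
because the Cartesian coordinates are independent. Hence,
$$ g^i_L \delta^L_K = g^i_K = \frac{\partial \xi^i}{\partial x^K}, $$
and so the new set of base vectors are
$$ \mathbf{g}^i = \frac{\partial \xi^i}{\partial x^K}\, \mathbf{e}^K. \qquad (1.12) $$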
The equation (1.12) defines a local linear transformation between the Cartesian base vectors and
the vectors g i . In a matrix representation, equation (1.12) is
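$$ \begin{pmatrix} \mathbf{g}^1 \\ \mathbf{g}^2 \\ \mathbf{g}^3 \end{pmatrix} = \overbrace{\begin{pmatrix} \frac{\partial \xi^1}{\partial x^1} & \frac{\partial \xi^1}{\partial x^2} & \frac{\partial \xi^1}{\partial x^3} \\ \frac{\partial \xi^2}{\partial x^1} & \frac{\partial \xi^2}{\partial x^2} & \frac{\partial \xi^2}{\partial x^3} \\ \frac{\partial \xi^3}{\partial x^1} & \frac{\partial \xi^3}{\partial x^2} & \frac{\partial \xi^3}{\partial x^3} \end{pmatrix}}^{\mathsf{M}^{-T}} \begin{pmatrix} \mathbf{e}^1 \\ \mathbf{e}^2 \\ \mathbf{e}^3 \end{pmatrix}, \qquad (1.13) $$
and we see that the new transformation is the inverse transpose⁵ of the linear transformation that
defines the covariant base vectors (1.6). For this reason, the vectors $\mathbf{g}^i$ are called contravariant base
vectors.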
Example 1.2. Finding the contravariant base vectors for plane polar coordinates
For the plane polar coordinate system defined in Example 1.1, find the contravariant base vectors.
Solution 1.2. The contravariant base vectors are defined by equation (1.12) and in order to use that
equation directly, we must express our polar coordinates as functions of the Cartesian coordinates

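$$ r = \xi^1 = \sqrt{x^1 x^1 + x^2 x^2}, \quad\text{and}\quad \tan\theta = \tan\xi^2 = \frac{x^2}{x^1}, $$
and then we can compute
$$ \frac{\partial \xi^1}{\partial x^1} = \cos\xi^2, \quad \frac{\partial \xi^1}{\partial x^2} = \sin\xi^2, \quad \frac{\partial \xi^2}{\partial x^1} = -\frac{\sin\xi^2}{\xi^1} \quad\text{and}\quad \frac{\partial \xi^2}{\partial x^2} = \frac{\cos\xi^2}{\xi^1}. $$
Thus, using the transformation (1.13), we have
$$ \mathbf{g}^1 = \frac{\partial \xi^1}{\partial x^1}\mathbf{e}^1 + \frac{\partial \xi^1}{\partial x^2}\mathbf{e}^2 = \cos\xi^2\, \mathbf{e}_1 + \sin\xi^2\, \mathbf{e}_2 = \mathbf{g}_1, $$
where we have used the fact that $\mathbf{e}^I = \mathbf{e}_I$, and also
$$ \mathbf{g}^2 = \frac{\partial \xi^2}{\partial x^1}\mathbf{e}^1 + \frac{\partial \xi^2}{\partial x^2}\mathbf{e}^2 = -\frac{\sin\xi^2}{\xi^1}\mathbf{e}_1 + \frac{\cos\xi^2}{\xi^1}\mathbf{e}_2 = \frac{1}{(\xi^1)^2}\, \mathbf{g}_2. $$

We can now easily verify that $\mathbf{g}^i\cdot\mathbf{g}_j = \delta^i_j$.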


An alternative (and often easier) approach is to find the contravariant base vectors by finding
the inverse transpose of the matrix M that defines the covariant base vectors and using equation
(1.13).
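
The alternative approach can be sketched numerically for plane polars: build the matrix M of Example 1.1, take its inverse transpose, and compare the rows with the analytic result of Example 1.2. The sample point and variable names below are illustrative assumptions.

```python
import numpy as np

r0, theta0 = 2.0, 0.7  # arbitrary sample point
c, s = np.cos(theta0), np.sin(theta0)

# Rows of M are the covariant base vectors g_1, g_2 from Example 1.1.
M = np.array([[c, s],
              [-r0 * s, r0 * c]])

# Rows of the inverse transpose are the contravariant base vectors g^1, g^2.
M_inv_T = np.linalg.inv(M).T

# Analytic contravariant base vectors from Example 1.2.
g_up_1 = np.array([c, s])
g_up_2 = np.array([-s, c]) / r0

# Entry (i, j) of this product is g^i . g_j, which should equal delta^i_j.
duality = M_inv_T @ M.T
```

The product `duality` is the identity matrix, which is the matrix statement of the defining property (1.10).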

1.1.5 Components of vectors in covariant and contravariant bases


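We can find the components of a vector $\mathbf{a}$ in the covariant basis by taking the dot product with the
appropriate contravariant base vectors
$$ \mathbf{a} = a^k \mathbf{g}_k, \quad\text{where}\quad a^i = \mathbf{a}\cdot\mathbf{g}^i = a^k \mathbf{g}_k\cdot\mathbf{g}^i = a^k \delta^i_k = a^i. \qquad (1.14) $$
Similarly components of the vector $\mathbf{a}$ in the contravariant basis are given by taking the dot product
with the appropriate covariant base vectors
$$ \mathbf{a} = a_k \mathbf{g}^k, \quad\text{where}\quad a_i = \mathbf{a}\cdot\mathbf{g}_i = a_k \mathbf{g}^k\cdot\mathbf{g}_i = a_k \delta^k_i = a_i. \qquad (1.15) $$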
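⁵ That the inverse matrix is given by
$$ \mathsf{M}^{-1} = \begin{pmatrix} \frac{\partial \xi^1}{\partial x^1} & \frac{\partial \xi^2}{\partial x^1} & \frac{\partial \xi^3}{\partial x^1} \\ \frac{\partial \xi^1}{\partial x^2} & \frac{\partial \xi^2}{\partial x^2} & \frac{\partial \xi^3}{\partial x^2} \\ \frac{\partial \xi^1}{\partial x^3} & \frac{\partial \xi^2}{\partial x^3} & \frac{\partial \xi^3}{\partial x^3} \end{pmatrix}, $$
can be confirmed by checking that $\mathsf{M}\mathsf{M}^{-1} = \mathsf{M}^{-1}\mathsf{M} = \mathsf{I}$, the identity matrix. Alternatively, the relationship follows
directly from equation (1.11) written in matrix form.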
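In fact, we can obtain the components of a general vector in either the covariant or contravariant
basis directly from the Cartesian coordinates. If $\mathbf{a} = a^K \mathbf{e}_K = a_K \mathbf{e}^K$, then the components in the
covariant basis associated with the curvilinear coordinates $\xi^i$ are
$$ a^i = \mathbf{a}\cdot\mathbf{g}^i = a^K \mathbf{e}_K\cdot\mathbf{g}^i = a^K \frac{\partial \xi^i}{\partial x^J}\, \mathbf{e}^J\cdot\mathbf{e}_K = a^K \frac{\partial \xi^i}{\partial x^J}\, \delta^J_K = \frac{\partial \xi^i}{\partial x^K}\, a^K, $$
a contravariant transform. Similarly, the components of the vector in the contravariant basis may
be obtained by covariant transform from the Cartesian components and so
$$ a_i = \frac{\partial x^K}{\partial \xi^i}\, a_K \qquad (1.16a) $$
and
$$ a^i = \frac{\partial \xi^i}{\partial x^K}\, a^K. \qquad (1.16b) $$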
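
Equations (1.16a,b) can be exercised numerically in plane polars. The sketch below computes both sets of components from arbitrary Cartesian components and then reassembles the vector from each set; the sample values and variable names are assumptions for illustration.

```python
import numpy as np

r0, theta0 = 2.0, 0.7  # arbitrary sample point
c, s = np.cos(theta0), np.sin(theta0)

M = np.array([[c, s], [-r0 * s, r0 * c]])  # rows: covariant base vectors g_i
M_inv_T = np.linalg.inv(M).T               # rows: contravariant base vectors g^i

a_cart = np.array([1.5, -0.4])             # Cartesian components (a^K = a_K)

a_up = M_inv_T @ a_cart    # a^i = (d xi^i / d x^K) a^K, equation (1.16b)
a_down = M @ a_cart        # a_i = (d x^K / d xi^i) a_K, equation (1.16a)

# Reassemble the vector: a = a^i g_i and a = a_i g^i.
a_from_up = a_up @ M
a_from_down = a_down @ M_inv_T
```

Both reassembled vectors reproduce `a_cart`, even though `a_up` and `a_down` differ from each other and from the Cartesian components.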

1.1.6 Invariance of vectors: (significance of index position)
Having established the need for two different types of transformations in curvilinear coordinate
systems, we are now in a position to consider the significance of the raised and lowered indices
in our summation convention. We shall insist that for an index to be lowered the object must
transform covariantly under a change in coordinates and for an index to be raised the object must
transform contravariantly under a change in coordinates⁶. An important exception to this rule
is the coordinates themselves: $\xi^i$ represents the three scalar coordinates, e.g. in spherical polar
coordinates $\xi^1 = r$, $\xi^2 = \theta$ and $\xi^3 = \phi$; $\xi^i$ are not the components of a vector and do not obey
contravariant transformation rules. Equation (1.16a) demonstrates that the components of a vector
in the contravariant basis are indeed covariant, justifying the lowered index, and equation (1.16b)
provides similar justification for the contravariance of components in the covariant basis. We shall
now demonstrate that these transformation properties also follow directly from the requirement
that a physical vector should be independent of the coordinate system.
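⁶ The logic for the choice of index location is the position of the generalised coordinate in the partial derivative
defining the transformation:
$$ \mathbf{g}_i = \frac{\partial x^K}{\partial \xi^i}\, \mathbf{e}_K \ \text{(lowered index)}, \qquad \mathbf{g}^i = \frac{\partial \xi^i}{\partial x^K}\, \mathbf{e}^K \ \text{(raised index)}. $$
Consider a vector $\mathbf{a}$, which can be written in the covariant or contravariant basis
$$ \mathbf{a} = a^i \mathbf{g}_i = a_i \mathbf{g}^i. \qquad (1.17) $$
We now consider a change in coordinates from $\xi^i$ to another general coordinate system $\chi^{\bar{i}}$. It will
be of vital importance later on to know which index corresponds to which coordinate system so
we have chosen to add an overbar to the index to distinguish components associated with the two
coordinate systems, $\xi^i$ and $\chi^{\bar{i}}$. The covariant base vectors associated with $\chi^{\bar{i}}$ are then
$$ \mathbf{g}_{\bar{i}} \equiv \frac{\partial \mathbf{r}}{\partial \chi^{\bar{i}}} = \frac{\partial x^J}{\partial \chi^{\bar{i}}}\, \mathbf{e}_J = \frac{\partial \xi^k}{\partial \chi^{\bar{i}}} \frac{\partial x^J}{\partial \xi^k}\, \mathbf{e}_J = \frac{\partial \xi^k}{\partial \chi^{\bar{i}}}\, \mathbf{g}_k; \qquad (1.18) $$
and the transformation between $\mathbf{g}_{\bar{i}}$ and $\mathbf{g}_k$ is of the same (covariant) type as that between $\mathbf{g}_i$ and $\mathbf{e}_K$
in equation (1.6). The transformation is covariant because the "new" coordinate is the independent
variable in the partial derivative (it appears in the denominator). In our new basis, the vector
$\mathbf{a} = a^{\bar{i}} \mathbf{g}_{\bar{i}}$ and because $\mathbf{a}$ must remain invariant
$$ \mathbf{a} = a^{\bar{i}} \mathbf{g}_{\bar{i}} = a^i \mathbf{g}_i. $$
Using the transformation of the base vectors (1.18) to replace $\mathbf{g}_{\bar{i}}$ gives
$$ a^{\bar{i}} \frac{\partial \xi^k}{\partial \chi^{\bar{i}}}\, \mathbf{g}_k = a^{\bar{i}} \mathbf{g}_{\bar{i}} = a^k \mathbf{g}_k \quad\Rightarrow\quad a^k = \frac{\partial \xi^k}{\partial \chi^{\bar{i}}}\, a^{\bar{i}}. \qquad (1.19) $$
Hence, the components of the vector must transform contravariantly because multiplying both sides
of equation (1.19) by the inverse transpose transformation $\partial \chi^{\bar{j}} / \partial \xi^k$, and relabelling the free index, gives
$$ a^{\bar{i}} = \frac{\partial \chi^{\bar{i}}}{\partial \xi^k}\, a^k. \qquad (1.20) $$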
This transformation is contravariant because the “new” coordinate is the dependent variable in the
partial derivative (it appears in the numerator).
A similar approach can be used to show that the components in the contravariant basis must
transform covariantly in order to ensure that the vector remains invariant. Thus, the use of our
summation convention ensures that the summed quantities remain invariant under coordinate
transformations, which will be essential when deriving coordinate-independent physical laws.

Interpretation
The fact that base vectors and vector components must transform differently for the vector to
remain invariant is actually quite obvious. Consider a one-dimensional Euclidean space in which
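$\mathbf{a} = a^1 \mathbf{g}_1$. If the base vector is rescaled⁷ by a factor $\lambda$ so that $\mathbf{g}_{\bar{1}} = \lambda \mathbf{g}_1$ then to compensate the
component must be rescaled by the factor $1/\lambda$: $a^{\bar{1}} = \frac{1}{\lambda} a^1$. Note that for a $1 \times 1$ transformation
matrix with entry $\lambda$, the inverse transpose is $1/\lambda$.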

1.1.7 Orthonormal coordinates


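If the coordinates are orthonormal then, by construction, there is no distinction between the
covariant and contravariant basis, $\mathbf{g}_i = \mathbf{g}^i$. Using equations (1.6) and (1.12), we see that
$$ \mathbf{g}_i = \frac{\partial x^K}{\partial \xi^i}\, \mathbf{e}_K = \mathbf{g}^i = \frac{\partial \xi^i}{\partial x^K}\, \mathbf{e}_K, $$
and so
$$ \frac{\partial x^K}{\partial \xi^i} = \frac{\partial \xi^i}{\partial x^K}. \qquad (1.21) $$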
Hence, the covariant and contravariant transformations are identical in orthonormal coordinate
systems, which means that there is no need to distinguish between raised and lowered indices. This
simplification is adopted in many textbooks and the convention is to use only lowered indices. When
working with orthonormal coordinates we will also adopt this convention for simplicity, but we must
always make sure that we know when the coordinate system is orthonormal. It is for this reason
that we have adopted the convention that upper case indices are used for orthonormal coordinates.
⁷ In one dimension all we can do is rescale the length, although the scaling can vary with position.
If the coordinate system is not known to be orthonormal, we will use lower case indices and must
distinguish between the covariant and contravariant transformations.
Condition (1.21) implies that

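$$ \frac{\partial x^K}{\partial \xi^j} \frac{\partial x^K}{\partial \xi^i} = \frac{\partial x^K}{\partial \xi^j} \frac{\partial \xi^i}{\partial x^K} = \delta^i_j. \qquad (1.22) $$

In the matrix representation, equation (1.22) is

$$ \mathsf{M}\mathsf{M}^T = \mathsf{I} \quad\Rightarrow\quad \mathsf{M}^T\mathsf{M} = \mathsf{I}, $$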

where I is the identity matrix. In other words the components of the transformation form an
orthogonal matrix. It follows that (all) orthonormal coordinates can only be generated by an
orthogonal transformation from the reference Cartesians. This should not be a big surprise: any
other transform will change the angles between the base vectors or their relative lengths which
destroys orthonormality. The argument is entirely reversible: if either the covariant or contravariant
transform is orthogonal then the two transforms are identical and the new coordinate system is
orthonormal.
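
The reversibility of the argument can be illustrated with two bases at the same point: the covariant polar basis of Example 1.1, which is orthogonal but not orthonormal, and the unit polar basis. The helper `is_orthogonal`, the sample point and the variable names are assumptions introduced for this sketch.

```python
import numpy as np

r0, theta0 = 2.0, 0.7  # arbitrary sample point
c, s = np.cos(theta0), np.sin(theta0)

# Rows are the covariant polar base vectors g_1 = e_r and g_2 = r e_theta.
M_polar = np.array([[c, s], [-r0 * s, r0 * c]])
# Rows are the unit vectors e_r and e_theta, an orthonormal basis.
M_unit = np.array([[c, s], [-s, c]])

def is_orthogonal(M, tol=1e-12):
    """True when M M^T equals the identity to within tol."""
    return np.allclose(M @ M.T, np.eye(M.shape[0]), atol=tol)

# For an orthogonal matrix the contravariant transform (inverse transpose)
# coincides with the covariant transform (the matrix itself).
same_transform_unit = np.allclose(np.linalg.inv(M_unit).T, M_unit)
same_transform_polar = np.allclose(np.linalg.inv(M_polar).T, M_polar)
```

Only `M_unit` passes both checks; for `M_polar` the second base vector has length `r0`, so the two transforms differ.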

An aside
Further intuition for the reason why the covariant and contravariant transformations are identical
when the coordinate transform is orthogonal can be obtained as follows. Imagine that we have a
general linear transformation represented as a matrix $\mathsf{M}$ that acts on vectors such that components
in the fixed Cartesian coordinate system $\mathbf{p} = p^K \mathbf{e}_K$ transform as follows

$$ \tilde{p}^K = M^K_J\, p^J. $$

Note that the index $K$ does not have an overbar because $\tilde{p}^K$ is a component in the fixed Cartesian
coordinate system, $\mathbf{e}_K$. The transformation can, of course, also be applied to the base vectors of
the fixed Cartesian coordinate system $\mathbf{e}_I$,

$$ [\tilde{\mathbf{e}}_I]^K = M^K_J\, [\mathbf{e}_I]^J, $$

where $[\ ]^K$ indicates the $K$-th component of the base vector. Now, $[\mathbf{e}_I]^J = \delta^J_I$ and it follows that

$$ [\tilde{\mathbf{e}}_I]^K = M^K_I, $$

which allows us to define the operation of the matrix components on the base vectors directly
because
$$ \tilde{\mathbf{e}}_I = [\tilde{\mathbf{e}}_I]^K \mathbf{e}_K = M^K_I\, \mathbf{e}_K. \qquad (1.23a) $$
Thus the operation of the transformation on the components is the transpose of its operation on
the base vectors⁸. We could write the new base vectors as $\mathbf{e}_{\tilde{I}} = M^K_{\tilde{I}}\, \mathbf{e}_K$ to be consistent with our
previous notation, but this will probably lead to more confusion in the current exposition.
⁸ This statement also applies to general bases.
Now consider a vector $\mathbf{a}$ that must remain invariant under our transformation. Let the vector
$\tilde{\mathbf{a}}'$ be the vector with the same numerical values of its components as $\mathbf{a}$ but with transformed base
vectors, i.e. $\tilde{\mathbf{a}}' = a^K \tilde{\mathbf{e}}_K$. Thus, the vector $\tilde{\mathbf{a}}'$ will be a transformed version of $\mathbf{a}$. In order to ensure
that the vector remains unchanged under transformation we must apply the appropriate inverse
transformation to $\tilde{\mathbf{a}}'$ relative to the new base vectors, $\tilde{\mathbf{e}}_I$. In other words, the transformation of the
coordinates must be
$$ \tilde{a}^K = [M^{-1}]^K_J\, \tilde{a}'^J = [M^{-1}]^K_J\, a^J, \qquad (1.23b) $$
where we have used the fact that $\tilde{a}'^J = a^J$ by definition. Using the two transformation equations
(1.23a,b) we see that

$$ \tilde{a}^K \tilde{\mathbf{e}}_K = [M^{-1}]^K_J\, a^J M^L_K\, \mathbf{e}_L = [M^{-1}]^K_J M^L_K\, a^J \mathbf{e}_L = \delta^L_J\, a^J \mathbf{e}_L = a^J \mathbf{e}_J, $$

as required.
Thus, we have the two results: (i) a general property of linear transformations is that the matrix
representation of the transformation of vector components is the transpose of the matrix
representation of the transformation of base vectors; (ii) in order for a vector to remain invariant, its
components must undergo the inverse of the coordinate transformation. Thus,
the transformations of the base vectors and the coordinates coincide when the inverse transform is
equal to its transpose, i.e. when the transform is orthogonal.
If that all seems a bit abstract, then hopefully the following specific example will help make the
ideas a little more concrete.

Example 1.3. Equivalence of covariant and contravariant transformations under affine transformations
Consider a two-dimensional Cartesian coordinate system with base vectors $\mathbf{e}_I$. A new coordinate
system with base vectors $\mathbf{e}_{\bar{I}}$ is obtained by rotation through an angle $\theta$ in the anticlockwise direction
about the origin. Derive the transformations for the base vectors and components of a general vector
and show that they are the same.

Solution 1.3. The original and rotated bases are shown in Figure 1.2(a) from which we determine
that the new base vectors are given by

$$ \mathbf{e}_{\bar{1}} = \cos\theta\, \mathbf{e}_1 + \sin\theta\, \mathbf{e}_2 \quad\text{and}\quad \mathbf{e}_{\bar{2}} = -\sin\theta\, \mathbf{e}_1 + \cos\theta\, \mathbf{e}_2. $$

Figure 1.2: (a) The base vectors $\mathbf{e}_{\bar{I}}$ are the Cartesian base vectors $\mathbf{e}_I$ rotated through an angle $\theta$
about the origin. (b) If the coordinates of the position vector $\mathbf{p}$ are unchanged it is also rotated by
$\theta$ to $\mathbf{p}'$.
Consider a position vector $\mathbf{p} = p^I \mathbf{e}_I$ in the original basis. If we leave the coordinates unchanged
then the new vector $\mathbf{p}' = p^I \mathbf{e}_{\bar{I}}$ is the original vector rotated by $\theta$, see Figure 1.2(b).
We must therefore rotate the position vector $\mathbf{p}'$ through an angle $-\theta$ relative to the fixed basis
$\mathbf{e}_{\bar{I}}$, but this is actually equivalent to a positive rotation of the base vectors. Hence the transforms
for the components of the vector and the base vectors are the same:

$$ p^{\bar{1}} = \cos\theta\, p^1 + \sin\theta\, p^2 \quad\text{and}\quad p^{\bar{2}} = -\sin\theta\, p^1 + \cos\theta\, p^2. $$
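
A numerical sketch of Example 1.3 follows. It uses a single matrix `Q`, whose rows are the rotated base vectors, both to build the new basis and to transform the components; the angle, the components of `p` and all variable names are arbitrary illustrative choices.

```python
import numpy as np

theta = 0.3  # arbitrary rotation angle
c, s = np.cos(theta), np.sin(theta)

# Rows are the rotated base vectors in the original Cartesian components.
Q = np.array([[c, s], [-s, c]])

p = np.array([1.0, 2.0])  # components p^I in the original basis

# Keeping the components but swapping to the rotated base vectors gives p'.
p_prime = p @ Q
# Rotating p anticlockwise by theta gives the same vector.
R = np.array([[c, -s], [s, c]])
p_rotated = R @ p

# Components of the unchanged vector p in the rotated basis use the same Q.
p_bar = Q @ p
# Reassembling from the new components and new base vectors recovers p.
p_rebuilt = p_bar @ Q
```

The matrix `Q` appears in both roles, which is the numerical counterpart of the statement that the two transforms coincide for a rotation.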

1.2 Tensors
Tensors are geometric objects that have magnitude and zero, one or many associated directions,
but are linear in character. A more mathematically precise definition is to say that a tensor is a
multilinear map or alternatively an element of a tensor product of vector spaces, which is somewhat
tautological and really not helpful at this point. The order (or degree or rank) of a tensor is
the number of associated directions and so a scalar is a tensor of order zero and a vector is a
tensor of order one. Many quantities in continuum mechanics such as strain, stress, diffusivity and
conductivity are naturally expressed as tensors of order two. We have already seen an example
of a tensor in our discussion of vectors: linear transformations from one set of vectors to another,
e.g. the transformation from Cartesian to covariant base vectors, are second-order tensors. If the
vectors represent physical objects, then they must not depend on the coordinate representation
chosen. Hence, the linear transformation must also be independent of coordinates because the
same vectors must always transform in the same way. We can write our linear transformation in a
coordinate-independent manner as
a = M(b), (1.24)
and the transformation M is a tensor of order two. In order to describe M precisely we must
pick a specific coordinate system for each vector in equation (1.24). In the global Cartesian basis,
equation (1.24) becomes
aI eI = M(bJ eJ ) = bJ M(eJ ), (1.25)
because it is a linear transformation. We now take the dot product with eK to obtain

aI eK ·eI = bJ eK ·M(eJ ) ⇒ aK = bJ eK ·M(eJ ),

where the dot product is written on the left to indicate that we are taking the dot product after
the linear transformation has operated on the base vector eJ . Hence, we can write the operation of
the transformation on the components in the form

aI = MIJ bJ , (1.26)

where MIJ = eI ·M(eJ ). Equation (1.26) can be written in a matrix form to aid calculation
    
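$$ \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} = \begin{pmatrix} M_{11} & M_{12} & M_{13} \\ M_{21} & M_{22} & M_{23} \\ M_{31} & M_{32} & M_{33} \end{pmatrix} \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}. $$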

The quantity $M_{IJ}$ represents the component of the transformed vector in the $I$-th Cartesian
direction if the original vector is of unit length in the $J$-th direction. Hence, the quantity $M_{IJ}$ is
meaningless without knowing the coordinate system associated with both I and J.
In fact, there is no need to choose the same coordinate system for I and J. If we write the
vector a in the covariant basis, equation (1.25) becomes

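$$ a^i \mathbf{g}_i = b_J\, \mathcal{M}(\mathbf{e}_J). $$

Taking the dot product with the appropriate contravariant base vector gives

$$ a^k = b_J\, \mathbf{g}^k\cdot\mathcal{M}(\mathbf{e}_J) = b_J \frac{\partial \xi^k}{\partial x^K}\, \mathbf{e}_K\cdot\mathcal{M}(\mathbf{e}_J), $$
which means that
$$ a^k = M^k_{\ J}\, b_J = \frac{\partial \xi^k}{\partial x^K} M_{KJ}\, b_J \quad\Rightarrow\quad M^k_{\ J} = \frac{\partial \xi^k}{\partial x^K} M_{KJ}. $$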
In other words the components of each (column) vector corresponding to a fixed second index in
a coordinate representation of M must obey a contravariant transformation if the associated basis
undergoes a covariant transform, i.e. the behaviour is exactly the same as for the components of a
vector.
If we now also represent the vector b in the covariant basis, equation (1.25) becomes

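$$ a^i \mathbf{g}_i = b^j\, \mathcal{M}(\mathbf{g}_j). $$

Taking the dot product with the appropriate contravariant base vector gives
$$ a^k = b^j\, \mathbf{g}^k\cdot\mathcal{M}(\mathbf{g}_j) = b^j \frac{\partial \xi^k}{\partial x^K}\, \mathbf{e}_K\cdot\mathcal{M}\!\left(\frac{\partial x^J}{\partial \xi^j}\mathbf{e}_J\right) = b^j \frac{\partial \xi^k}{\partial x^K} \frac{\partial x^J}{\partial \xi^j}\, \mathbf{e}_K\cdot\mathcal{M}(\mathbf{e}_J), $$

on using the linearity of the transformation. Hence,

$$ a^k = M^k_{\ j}\, b^j = \frac{\partial \xi^k}{\partial x^K} M_{KJ} \frac{\partial x^J}{\partial \xi^j}\, b^j \quad\Rightarrow\quad M^k_{\ j} = \frac{\partial \xi^k}{\partial x^K} M_{KJ} \frac{\partial x^J}{\partial \xi^j}, $$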

and the components of each (row) vector associated with a fixed first index in a coordinate
representation of M undergo a covariant transformation when the associated basis undergoes a covariant
transform, i.e. the “opposite behaviour” to the components of a vector. The difference in behaviour
between the two indices of the components of the linear transformation arises because one index
corresponds to the basis of the “input” vector, whereas the other corresponds to the basis of the
“output” vector. There is a sum over the second (input) index and the components of the vector b
and in order for this sum to remain invariant the transform associated with the second index must
be the opposite to the components of the vector b, in other words the same as the transformation
of the base vectors of that vector.
The obvious relationships between components can easily be deduced when we represent our
vectors in the contravariant basis,

$$ a^i = M^{ij} b_j, \qquad a_i = M_{ij} b^j, \qquad a_i = M_i^{\ j} b_j. \qquad (1.27) $$

Many books term $M^{ij}$ a contravariant second-order tensor; $M_{ij}$ a covariant second-order tensor
and $M^i_{\ j}$ a mixed second-order tensor, but they are simply representations of the same
coordinate-independent object in different bases. Another more modern notation is to say that $M^{ij}$ is a type
(2, 0) tensor, $M_{ij}$ is type (0, 2) and $M^i_{\ j}$ is a type (1, 1) tensor, which allows the distinction between
mixed tensors of orders greater than two.
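
The mixed components derived above can be checked numerically in plane polars. In the sketch below the Cartesian components `T_cart`, the vector `b_cart` and the sample point are arbitrary assumptions; the name `T_cart` is used instead of M only to avoid confusion with the basis matrix of equation (1.7).

```python
import numpy as np

r0, theta0 = 2.0, 0.7  # arbitrary sample point
c, s = np.cos(theta0), np.sin(theta0)

dx_dxi = np.array([[c, -r0 * s], [s, r0 * c]])  # entry [J, j] = d x^J / d xi^j
dxi_dx = np.linalg.inv(dx_dxi)                  # entry [k, K] = d xi^k / d x^K

T_cart = np.array([[2.0, 1.0], [0.5, 3.0]])     # Cartesian components M_KJ (arbitrary)
b_cart = np.array([1.0, -2.0])
a_cart = T_cart @ b_cart                        # a_I = M_IJ b_J, equation (1.26)

# Mixed components: M^k_j = (d xi^k / d x^K) M_KJ (d x^J / d xi^j).
T_mixed = dxi_dx @ T_cart @ dx_dxi

# Components of a and b in the covariant basis (contravariant transform).
a_up = dxi_dx @ a_cart
b_up = dxi_dx @ b_cart
```

Applying `T_mixed` to `b_up` reproduces `a_up`, so the same linear map is being represented, only in a different basis.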
1.2.1 Invariance of second-order tensors
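Let us now consider a general change of coordinates from $\xi^i$ to $\chi^{\bar{i}}$. Given that
$$ a^i = M^i_{\ j}\, b^j, \qquad (1.28a) $$
we wish to find an expression for $M^{\bar{i}}_{\ \bar{j}}$ such that⁹
$$ a^{\bar{i}} = M^{\bar{i}}_{\ \bar{j}}\, b^{\bar{j}}. \qquad (1.28b) $$
Using the transformation rules for the components of vectors (1.19) it follows that (1.28a)
becomes
$$ \frac{\partial \xi^i}{\partial \chi^{\bar{n}}}\, a^{\bar{n}} = M^i_{\ j} \frac{\partial \xi^j}{\partial \chi^{\bar{n}}}\, b^{\bar{n}}. $$
We now multiply both sides by $\partial \chi^{\bar{m}} / \partial \xi^i$ to obtain
$$ \frac{\partial \chi^{\bar{m}}}{\partial \xi^i} \frac{\partial \xi^i}{\partial \chi^{\bar{n}}}\, a^{\bar{n}} = \delta^{\bar{m}}_{\bar{n}}\, a^{\bar{n}} = a^{\bar{m}} = \frac{\partial \chi^{\bar{m}}}{\partial \xi^i} M^i_{\ j} \frac{\partial \xi^j}{\partial \chi^{\bar{n}}}\, b^{\bar{n}}. $$
Comparing this expression to equation (1.28b) it follows that
$$ M^{\bar{i}}_{\ \bar{j}} = \frac{\partial \chi^{\bar{i}}}{\partial \xi^n} M^n_{\ m} \frac{\partial \xi^m}{\partial \chi^{\bar{j}}}, $$
and thus we see that covariant components must transform covariantly and contravariant
components must transform contravariantly in order for the invariance properties to hold. Similarly, it
can be shown that
$$ M^{\bar{i}\bar{j}} = \frac{\partial \chi^{\bar{i}}}{\partial \xi^n} \frac{\partial \chi^{\bar{j}}}{\partial \xi^m} M^{nm}, \quad\text{and}\quad M_{\bar{i}\bar{j}} = \frac{\partial \xi^n}{\partial \chi^{\bar{i}}} \frac{\partial \xi^m}{\partial \chi^{\bar{j}}} M_{nm}. \qquad (1.29) $$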
An alternative definition of tensors is to require that they are sets of index quantities (multi-
dimensional arrays) that obey these transformation laws under a change of coordinates.
The transformations can be expressed in matrix form, but we must distinguish between the
covariant and contravariant cases. We shall write M♭ to indicate a matrix where all components
transform covariantly and M♯ for the contravariant case. We define the transformation matrix F to
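have the components
$$ \mathsf{F} = \begin{pmatrix} \frac{\partial \chi^{\bar{1}}}{\partial \xi^1} & \frac{\partial \chi^{\bar{1}}}{\partial \xi^2} & \frac{\partial \chi^{\bar{1}}}{\partial \xi^3} \\ \frac{\partial \chi^{\bar{2}}}{\partial \xi^1} & \frac{\partial \chi^{\bar{2}}}{\partial \xi^2} & \frac{\partial \chi^{\bar{2}}}{\partial \xi^3} \\ \frac{\partial \chi^{\bar{3}}}{\partial \xi^1} & \frac{\partial \chi^{\bar{3}}}{\partial \xi^2} & \frac{\partial \chi^{\bar{3}}}{\partial \xi^3} \end{pmatrix}, \quad\text{or}\quad F^{\bar{i}}_{\ j} = \frac{\partial \chi^{\bar{i}}}{\partial \xi^j}; $$
and then from the chain rule and independence of coordinates
$$ \mathsf{F}^{-1} = \begin{pmatrix} \frac{\partial \xi^1}{\partial \chi^{\bar{1}}} & \frac{\partial \xi^1}{\partial \chi^{\bar{2}}} & \frac{\partial \xi^1}{\partial \chi^{\bar{3}}} \\ \frac{\partial \xi^2}{\partial \chi^{\bar{1}}} & \frac{\partial \xi^2}{\partial \chi^{\bar{2}}} & \frac{\partial \xi^2}{\partial \chi^{\bar{3}}} \\ \frac{\partial \xi^3}{\partial \chi^{\bar{1}}} & \frac{\partial \xi^3}{\partial \chi^{\bar{2}}} & \frac{\partial \xi^3}{\partial \chi^{\bar{3}}} \end{pmatrix}, \quad\text{or}\quad [F^{-1}]^i_{\ \bar{j}} = \frac{\partial \xi^i}{\partial \chi^{\bar{j}}}. $$
If $\overline{\mathsf{M}}^{\sharp}$ and $\overline{\mathsf{M}}^{\flat}$ represent the matrices of transformed components then the transformation laws (1.29)
become
$$ \overline{\mathsf{M}}^{\sharp} = \mathsf{F}\,\mathsf{M}^{\sharp}\,\mathsf{F}^T, \quad\text{and}\quad \overline{\mathsf{M}}^{\flat} = \mathsf{F}^{-T}\,\mathsf{M}^{\flat}\,\mathsf{F}^{-1}. \qquad (1.30) $$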
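
The equivalence of the index form (1.29) and the matrix form (1.30) can be checked with a short sketch. The matrix `F` below is an arbitrary invertible matrix standing in for the Jacobian at one point, and the component arrays are likewise arbitrary assumptions in two dimensions.

```python
import numpy as np

F = np.array([[2.0, 1.0], [0.5, 1.5]])        # F[i, j] = d chi^i / d xi^j (arbitrary)
F_inv = np.linalg.inv(F)                       # F_inv[i, j] = d xi^i / d chi^j

M_sharp = np.array([[1.0, 2.0], [3.0, 4.0]])   # contravariant components M^{nm}
M_flat = np.array([[0.5, -1.0], [2.0, 1.0]])   # covariant components M_{nm}

# Matrix forms, equation (1.30).
M_sharp_bar = F @ M_sharp @ F.T
M_flat_bar = F_inv.T @ M_flat @ F_inv

# Index forms, equation (1.29), written with explicit repeated indices.
M_sharp_idx = np.einsum("in,jm,nm->ij", F, F, M_sharp)
M_flat_idx = np.einsum("ni,mj,nm->ij", F_inv, F_inv, M_flat)

# A scalar built from contravariant components u^i, v^j and M_ij is invariant.
u = np.array([1.0, -1.0])
v = np.array([0.5, 2.0])
scalar = u @ M_flat @ v
scalar_bar = (F @ u) @ M_flat_bar @ (F @ v)
```

The last two lines show why the laws take this form: the transformed components combine with the transformed vector components (1.20) to give the same scalar.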
⁹ This is a place where the use of overbars makes the notation look cluttered, but clarifies precisely which coordinate
system is associated with each index. This notation also allows the representation of components in the two different
coordinate systems, so-called two-point tensors, e.g. $M^{\bar{i}}_{\ j}$, which will be useful.
1.2.2 Cartesian tensors
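If we restrict attention to orthonormal coordinate systems, then the transformation between
coordinate systems must be orthogonal¹⁰ and we do not need to distinguish between covariant and
contravariant behaviour. Consider the transformation from our Cartesian basis $\mathbf{e}_I$ to another
orthonormal basis $\mathbf{e}_{\bar{I}}$. The transformation rules for components of a tensor of order two become

$$ M_{\bar{I}\bar{J}} = \frac{\partial x^N}{\partial x^{\bar{I}}} \frac{\partial x^M}{\partial x^{\bar{J}}} M_{NM}. $$
The transformation between components of two vectors in the different bases is given by

$$ a_{\bar{I}} = \frac{\partial x^{\bar{I}}}{\partial x^K}\, a_K = \frac{\partial x^K}{\partial x^{\bar{I}}}\, a_K, $$
which can be written in the form
$$ a_{\bar{I}} = Q_{\bar{I}K}\, a_K, \quad\text{where}\quad Q_{\bar{I}K} = \frac{\partial x^{\bar{I}}}{\partial x^K} = \frac{\partial x^K}{\partial x^{\bar{I}}}, $$
and the components $Q_{\bar{I}K}$ form an orthogonal matrix. Hence the transformation property of a
(Cartesian) tensor of order two can be written as

$$ M_{\bar{I}\bar{J}} = Q_{\bar{I}N}\, M_{NM}\, Q_{\bar{J}M} $$

or in matrix form
$$ \overline{\mathsf{M}} = \mathsf{Q}\,\mathsf{M}\,\mathsf{Q}^T. \qquad (1.31) $$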
In many textbooks, equation (1.31) is defined to be the transformation rule satisfied by a (Cartesian)
tensor of order two.
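As a concrete illustration of (1.31), the sketch below rotates the components of a second-order Cartesian tensor and confirms that the index and matrix forms agree. It assumes NumPy; the rotation angle and the matrix `M` are arbitrary values chosen for the example.

```python
import numpy as np

theta = 0.3  # arbitrary rotation angle about the x_3 axis (illustrative)
c, s = np.cos(theta), np.sin(theta)

# Q[I, K] relates components in the rotated basis to those in the original one.
Q = np.array([[c, s, 0.0],
              [-s, c, 0.0],
              [0.0, 0.0, 1.0]])
assert np.allclose(Q @ Q.T, np.eye(3))  # Q is orthogonal

M = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 0.5],
              [0.0, 0.5, 1.0]])

# Index form: M_bar[I, J] = Q[I, N] M[N, M] Q[J, M].
M_bar_index = np.einsum("in,nm,jm->ij", Q, M, Q)
# Matrix form (1.31).
M_bar_matrix = Q @ M @ Q.T
assert np.allclose(M_bar_index, M_bar_matrix)

# Scalar invariants such as the trace and determinant are unchanged.
assert np.isclose(np.trace(M_bar_matrix), np.trace(M))
assert np.isclose(np.linalg.det(M_bar_matrix), np.linalg.det(M))
```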

1.2.3 Tensors vs matrices


There is a natural relationship between tensors and matrices because, as we have seen, we can write
the components of a second-order tensor in a particular coordinate system as a matrix. It is often
helpful to think of a tensor as a matrix when working with it, but the two concepts are distinct.
A summary of all the above is that a tensor is a geometric object that does not depend on any
particular coordinate system and expresses a linear relationship between other geometric objects.

1.3 Products of vectors: scalar, vector and tensor


1.3.1 Scalar product
We have already used the scalar or dot product of two vectors, and the discussion here is included
only for completeness. The scalar product is the product of two vectors that returns a unique scalar:
the product of the lengths of the vectors and the cosine of the angle between them. Thus far, we
have only used the dot product to define orthonormal sets of vectors.
If we represent two vectors $\mathbf{a}$ and $\mathbf{b}$ in the co- and contravariant bases, then
\[
\mathbf{a}\cdot\mathbf{b} = (a_i\, \mathbf{g}^i)\cdot(b^j\, \mathbf{g}_j),
\]
10
Although the required orthogonal transformation may vary with position, as is the case in plane polar coordinates.
and so
\[
\mathbf{a}\cdot\mathbf{b} = a_i b^j\, \mathbf{g}^i\cdot\mathbf{g}_j = a_i b^j \delta^i_j = a_i b^i.
\]
An alternative decomposition demonstrates that
\[
\mathbf{a}\cdot\mathbf{b} = a^i b_i,
\]
and we note that the scalar product is invariant under coordinate transformation, as expected. In orthonormal coordinate systems, there is no distinction between co- and contravariant bases and so $\mathbf{a}\cdot\mathbf{b} = a_K b_K$.
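The need to pair covariant with contravariant components shows up as soon as the basis is not orthonormal. The following minimal sketch assumes NumPy and uses a hypothetical skew basis `G`, chosen only for illustration; it compares the mixed contractions with the Cartesian dot product.

```python
import numpy as np

# Hypothetical skew (non-orthogonal) basis: column i of G holds the Cartesian
# components of the covariant base vector g_i.
G = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 2.0]])
# Column i of G_dual holds the contravariant base vector g^i,
# so that g^i . g_j = delta^i_j.
G_dual = np.linalg.inv(G).T
assert np.allclose(G_dual.T @ G, np.eye(3))

a = np.array([1.0, -2.0, 0.5])  # Cartesian components of a
b = np.array([0.3, 4.0, -1.0])  # Cartesian components of b

a_lower, b_lower = G.T @ a, G.T @ b            # a_i = a . g_i
a_upper, b_upper = G_dual.T @ a, G_dual.T @ b  # a^i = a . g^i

dot_cartesian = a @ b
dot_mixed_1 = a_lower @ b_upper  # a_i b^i
dot_mixed_2 = a_upper @ b_lower  # a^i b_i
dot_naive = a_lower @ b_lower    # a_i b_i: not the scalar product here

assert np.isclose(dot_mixed_1, dot_cartesian)
assert np.isclose(dot_mixed_2, dot_cartesian)
assert not np.isclose(dot_naive, dot_cartesian)
```

Summing products of two sets of covariant components only reproduces the scalar product when the basis is orthonormal, which is the special case noted at the end of the section.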

1.3.2 Vector product


The vector or cross product is a product of two vectors that returns a unique vector that is orthogonal to both vectors. In orthonormal coordinate systems, the vector product is defined by
\[
\mathbf{a}\times\mathbf{b} = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} \times \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}
= \begin{pmatrix} a_2 b_3 - b_2 a_3 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{pmatrix}.
\]
In order to represent the vector product with index notation it is convenient to define a quantity known as the alternating, or Levi-Civita, symbol $e_{IJK}$. In orthonormal coordinate systems the components of $e_{IJK}$ are defined by
\[
e^{IJK} = e_{IJK} = \begin{cases}
0 & \text{when any two indices are equal;} \\
+1 & \text{when } I, J, K \text{ is an even permutation of } 1, 2, 3; \\
-1 & \text{when } I, J, K \text{ is an odd permutation of } 1, 2, 3.
\end{cases} \tag{1.32}
\]
e.g.
\[
e_{112} = e_{122} = 0, \quad e_{123} = e_{312} = e_{231} = 1, \quad e_{213} = e_{132} = e_{321} = -1.
\]
Strictly speaking eIJK thus defined is not a tensor because if the handedness of the coordinate system
changes then the sign of the entries in eIJK should change in order for it to respect the appropriate
invariance properties; such objects are sometimes called pseudo-tensors. We could ensure that eIJK
is a tensor by restricting our definition to right-handed (or left-handed) orthonormal systems, which
will be the approach taken in later chapters.
The vector product of two vectors $\mathbf{a}$ and $\mathbf{b}$ in orthonormal coordinates is
\[
[\mathbf{a}\times\mathbf{b}]_I = e_{IJK}\, a_J b_K, \tag{1.33}
\]
which can be confirmed by writing out all the components. In addition, the relationship between the Cartesian base vectors $\mathbf{e}_I$ can be expressed as a vector product using the alternating tensor
\[
\mathbf{e}_I \times \mathbf{e}_J = e_{IJK}\, \mathbf{e}_K. \tag{1.34}
\]
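Equations (1.32) to (1.34) translate directly into array operations. The sketch below, assuming NumPy, builds the alternating symbol as a 3×3×3 array (with indices running from 0 rather than 1) and checks it against NumPy's built-in cross product; the vectors `a` and `b` are arbitrary.

```python
import numpy as np

# Alternating symbol (1.32), stored with zero-based indices.
e = np.zeros((3, 3, 3))
e[0, 1, 2] = e[1, 2, 0] = e[2, 0, 1] = 1.0   # even permutations
e[0, 2, 1] = e[2, 1, 0] = e[1, 0, 2] = -1.0  # odd permutations

a = np.array([1.0, 2.0, 3.0])
b = np.array([-1.0, 0.5, 4.0])

# Index form (1.33): [a x b]_I = e_IJK a_J b_K.
cross_index = np.einsum("ijk,j,k->i", e, a, b)
assert np.allclose(cross_index, np.cross(a, b))

# Base-vector relation (1.34): e_I x e_J = e_IJK e_K.
basis = np.eye(3)  # row I holds the Cartesian base vector e_I
for i in range(3):
    for j in range(3):
        assert np.allclose(np.cross(basis[i], basis[j]), e[i, j] @ basis)
```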

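Let us now consider the case of general coordinates: the cross product between covariant base vectors is given by
\[
\mathbf{g}_i \times \mathbf{g}_j = \frac{\partial x^I}{\partial\xi^i}\mathbf{e}_I \times \frac{\partial x^J}{\partial\xi^j}\mathbf{e}_J
= \frac{\partial x^I}{\partial\xi^i}\frac{\partial x^J}{\partial\xi^j}\, \mathbf{e}_I\times\mathbf{e}_J
= \frac{\partial x^I}{\partial\xi^i}\frac{\partial x^J}{\partial\xi^j}\, e_{IJK}\, \mathbf{e}_K.
\]
The expression on the right-hand side corresponds to the first two indices of the alternating tensor undergoing a covariant transformation so that
\[
\mathbf{g}_i \times \mathbf{g}_j = e_{ijK}\, \mathbf{e}_K.
\]
If we now transform the third index covariantly we must transform the base vector contravariantly so that
\[
\mathbf{g}_i \times \mathbf{g}_j = \epsilon_{ijk} \frac{\partial\xi^k}{\partial x^K}\mathbf{e}_K = \epsilon_{ijk}\, \mathbf{g}^k; \tag{1.35}
\]
where
\[
\epsilon_{ijk} \equiv \frac{\partial x^I}{\partial\xi^i}\frac{\partial x^J}{\partial\xi^j}\frac{\partial x^K}{\partial\xi^k}\, e_{IJK}.
\]
A similar argument shows that
\[
\mathbf{g}^i \times \mathbf{g}^j = \epsilon^{ijk}\, \mathbf{g}_k,
\quad\text{where}\quad
\epsilon^{ijk} \equiv \frac{\partial\xi^i}{\partial x^I}\frac{\partial\xi^j}{\partial x^J}\frac{\partial\xi^k}{\partial x^K}\, e^{IJK}.
\]
If we decompose the vectors $\mathbf{a}$ and $\mathbf{b}$ into the contravariant basis we have
\[
\mathbf{a}\times\mathbf{b} = (a_i\, \mathbf{g}^i)\times(b_j\, \mathbf{g}^j) = a_i b_j\, \mathbf{g}^i\times\mathbf{g}^j = a_i b_j\, \epsilon^{ijk}\, \mathbf{g}_k.
\]
Thus, if we decompose the vector product into the covariant basis we have the following expression for the components
\[
[\mathbf{a}\times\mathbf{b}]^k = \epsilon^{ijk} a_i b_j, \quad\text{or}\quad [\mathbf{a}\times\mathbf{b}]^i = \epsilon^{ijk} a_j b_k.
\]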
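The transformed alternating tensors can be evaluated explicitly for a given basis. The following sketch assumes NumPy and a hypothetical skew basis `G`, chosen for illustration. It confirms that the covariant components are the Cartesian symbol scaled by the determinant of the Jacobian, that (1.35) holds, and that the component formula reproduces the Cartesian cross product.

```python
import numpy as np

e = np.zeros((3, 3, 3))
e[0, 1, 2] = e[1, 2, 0] = e[2, 0, 1] = 1.0
e[0, 2, 1] = e[2, 1, 0] = e[1, 0, 2] = -1.0

# Hypothetical Jacobian: G[I, i] = d(x^I)/d(xi^i), so column i is g_i.
G = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 2.0]])
G_inv = np.linalg.inv(G)  # G_inv[k, K] = d(xi^k)/d(x^K), so row k is g^k
jac = np.linalg.det(G)    # determinant of the Jacobian

# Transform all three indices of the Cartesian symbol.
eps_lower = np.einsum("ai,bj,ck,abc->ijk", G, G, G, e)
eps_upper = np.einsum("ia,jb,kc,abc->ijk", G_inv, G_inv, G_inv, e)
assert np.allclose(eps_lower, jac * e)
assert np.allclose(eps_upper, e / jac)

# Equation (1.35): g_i x g_j = eps_ijk g^k.
for i in range(3):
    for j in range(3):
        assert np.allclose(np.cross(G[:, i], G[:, j]), eps_lower[i, j] @ G_inv)

# Components of a x b in the covariant basis: [a x b]^k = eps^{ijk} a_i b_j.
a = np.array([1.0, -2.0, 0.5])
b = np.array([0.3, 4.0, -1.0])
a_lower, b_lower = G.T @ a, G.T @ b
cross_upper = np.einsum("ijk,i,j->k", eps_upper, a_lower, b_lower)
assert np.allclose(G @ cross_upper, np.cross(a, b))
```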

1.3.3 Tensor product


The tensor product is a product of two vectors that returns a second-order tensor. It can be motivated by the following discussion. Recall that equation (1.24) can be written in the form
\[
a_i = M_{ij}\, b^j, \quad\text{where}\quad M_{ij} = \mathbf{g}_i\cdot\mathbf{M}(\mathbf{g}_j).
\]
The components $M_{ij}$ correspond to the representation of the tensor with respect to a basis, but which basis? We shall define the basis to be that formed from the tensor product of pairs of base vectors: $\mathbf{g}^i \otimes \mathbf{g}^j$, where the symbol $\otimes$ is used to denote the tensor product. Hence, we can represent a tensor in the different forms
\[
\mathbf{M} = M_{ij}\, \mathbf{g}^i\otimes\mathbf{g}^j = M_{IJ}\, \mathbf{e}_I\otimes\mathbf{e}_J = M^i_{\ j}\, \mathbf{g}_i\otimes\mathbf{g}^j \cdots,
\]
which is analogous to representing vectors in the different forms
\[
\mathbf{a} = a_I\, \mathbf{e}_I = a^i\, \mathbf{g}_i = a_i\, \mathbf{g}^i.
\]
Returning to equation (1.24), we have
\[
\mathbf{a} = \mathbf{M}(\mathbf{b}) \quad\Rightarrow\quad \mathbf{a} = (M_{ij}\, \mathbf{g}^i\otimes\mathbf{g}^j)(\mathbf{b}),
\]
and because $M_{ij}$ are just coefficients it follows that $\mathbf{g}^i\otimes\mathbf{g}^j$ are themselves tensors of second order[11]. Decomposing $\mathbf{a}$ and $\mathbf{b}$ into the contravariant and covariant bases respectively gives
\[
a_i\, \mathbf{g}^i = (M_{ij}\, \mathbf{g}^i\otimes\mathbf{g}^j)(b^n\, \mathbf{g}_n) = M_{ij}\, b^n\, (\mathbf{g}^i\otimes\mathbf{g}^j)(\mathbf{g}_n), \tag{1.36}
\]


11
You should think carefully to convince yourself that this is true.
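but from equation (1.27)
\[
a_i = M_{ij}\, b^j \quad\Rightarrow\quad a_i\, \mathbf{g}^i = M_{ij}\, b^j\, \mathbf{g}^i = M_{ij}\, b^n \delta^j_n\, \mathbf{g}^i. \tag{1.37}
\]
Equating (1.36) and (1.37), it follows that the tensor product acting on the base vectors is
\[
(\mathbf{g}^i\otimes\mathbf{g}^j)(\mathbf{g}_n) = \delta^j_n\, \mathbf{g}^i = (\mathbf{g}^j\cdot\mathbf{g}_n)\, \mathbf{g}^i. \tag{1.38}
\]
Equation (1.38) motivates an alternative definition of the tensor product, which in the case of the product of two vectors is also called the dyadic or outer product. The dyadic product of two vectors $\mathbf{a}$ and $\mathbf{b}$ is a second-order tensor that is defined through its action on an arbitrary vector $\mathbf{v}$
\[
(\mathbf{a}\otimes\mathbf{b})(\mathbf{v}) = (\mathbf{b}\cdot\mathbf{v})\, \mathbf{a}. \tag{1.39}
\]
Consider the tensor $\mathbf{T} = \mathbf{a}\otimes\mathbf{b}$; the components of that tensor are given by
\[
T_{ij} = \mathbf{g}_i\cdot\mathbf{T}(\mathbf{g}_j) = \mathbf{g}_i\cdot(\mathbf{b}\cdot\mathbf{g}_j)\,\mathbf{a} = (\mathbf{g}_i\cdot\mathbf{a})(\mathbf{b}\cdot\mathbf{g}_j) = a_i b_j,
\]
which gives a third (equivalent) definition of the tensor product. Alternatively, we can write
\[
T^{i}_{\ j} = a^i b_j \quad\text{or}\quad T^{ij} = a^i b^j \quad\text{or}\quad T_{i}^{\ j} = a_i b^j. \tag{1.40}
\]
Note that we can write
\[
\mathbf{T} = \mathbf{a}\otimes\mathbf{b} = a_i\, \mathbf{g}^i \otimes b_j\, \mathbf{g}^j = a_i b_j\, \mathbf{g}^i\otimes\mathbf{g}^j = T_{ij}\, \mathbf{g}^i\otimes\mathbf{g}^j,
\]
again demonstrating that $T_{ij}$ are the components of the tensor $\mathbf{T}$ in the basis $\mathbf{g}^i\otimes\mathbf{g}^j$.
The tensor product can easily be extended to tensors of arbitrary order via its component-wise definition, e.g.
\[
T^{i\,jk}_{mn} = A^{i}_{mn}\, B^{jk}.
\]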
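In Cartesian components the dyadic product is the familiar outer product of two arrays. The following sketch, assuming NumPy and arbitrary illustrative values, checks the defining property (1.39) and builds a higher-order product component-wise.

```python
import numpy as np

rng = np.random.default_rng(1)
a, b, v = rng.normal(size=(3, 3))  # three arbitrary Cartesian vectors

# Dyadic product: T_IJ = a_I b_J.
T = np.outer(a, b)

# Defining action (1.39): (a (x) b)(v) = (b . v) a.
assert np.allclose(T @ v, (b @ v) * a)

# Component-wise extension to higher order: T[i, j, k, m, n] = A[i, m, n] * B[j, k].
A = rng.normal(size=(3, 3, 3))
B = rng.normal(size=(3, 3))
T5 = np.einsum("imn,jk->ijkmn", A, B)
assert T5.shape == (3, 3, 3, 3, 3)
assert np.isclose(T5[0, 1, 2, 1, 0], A[0, 1, 0] * B[1, 2])
```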

1.4 The metric tensor


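We can reinterpret the Kronecker delta as a tensor with components in the global Cartesian coordinate system, so that
\[
\delta_{IJ} = \delta^{IJ} = \delta^{I}_{J} \equiv \begin{cases} 1, & \text{if } I = J, \\ 0, & \text{otherwise.} \end{cases}
\]
If we transform the Kronecker delta into the general coordinate system $\xi^i$, then we obtain the so-called metric tensors
\[
g_{ij} \equiv \frac{\partial x^I}{\partial\xi^i}\frac{\partial x^J}{\partial\xi^j}\, \delta_{IJ} = \frac{\partial x^I}{\partial\xi^i}\frac{\partial x^I}{\partial\xi^j} = \mathbf{g}_i\cdot\mathbf{g}_j, \tag{1.41a}
\]
\[
g^{ij} \equiv \frac{\partial\xi^i}{\partial x^I}\frac{\partial\xi^j}{\partial x^J}\, \delta^{IJ} = \frac{\partial\xi^i}{\partial x^I}\frac{\partial\xi^j}{\partial x^I} = \mathbf{g}^i\cdot\mathbf{g}^j, \tag{1.41b}
\]
\[
g^{i}_{j} \equiv \frac{\partial\xi^i}{\partial x^I}\frac{\partial x^J}{\partial\xi^j}\, \delta^{I}_{J} = \frac{\partial\xi^i}{\partial x^I}\frac{\partial x^I}{\partial\xi^j} = \delta^{i}_{j}. \tag{1.41c}
\]
We note that all the tensors are symmetric, $g_{ij} = g_{ji}$, and that the mixed metric tensor $g^{i}_{j}$ is invariant and equal to $\delta^{i}_{j}$ in all coordinate systems. From equations (1.41a,b) we deduce that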

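\[
g^{ij} g_{jk} = \frac{\partial\xi^i}{\partial x^I}\frac{\partial\xi^j}{\partial x^I}\frac{\partial x^J}{\partial\xi^j}\frac{\partial x^J}{\partial\xi^k}
= \frac{\partial\xi^i}{\partial x^I}\, \delta^{J}_{I}\, \frac{\partial x^J}{\partial\xi^k}
= \frac{\partial\xi^i}{\partial x^I}\frac{\partial x^I}{\partial\xi^k} = \delta^{i}_{k}, \tag{1.42}
\]
in other words if we represent the components $g_{ij}$ as a matrix, we can find the components $g^{ij}$ by taking the inverse of that matrix, i.e.
\[
g^{ij} = \frac{D^{ij}}{g}, \tag{1.43}
\]
where $D^{ij}$ is the matrix of cofactors of the matrix of coefficients $g_{ij}$ and $g$ is the determinant of the matrix $g_{ij}$
\[
g \equiv |g_{ij}| = \left|\frac{\partial x^I}{\partial\xi^j}\right|^2 \quad\text{and}\quad |g^{ij}| = \left|\frac{\partial\xi^i}{\partial x^J}\right|^2 = \frac{1}{g}, \tag{1.44}
\]
which demonstrates that the determinants are positive.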
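The relations (1.41) to (1.44) can be verified for a specific coordinate system. The sketch below assumes NumPy and uses spherical polar coordinates (r, θ, φ) as an illustrative choice that is not worked in the text; the function name `jacobian` and the sample point are hypothetical.

```python
import numpy as np

def jacobian(r, theta, phi):
    """J[I, i] = d(x^I)/d(xi^i) for x = r sin(theta) cos(phi), etc."""
    st, ct = np.sin(theta), np.cos(theta)
    sp, cp = np.sin(phi), np.cos(phi)
    return np.array([
        [st * cp, r * ct * cp, -r * st * sp],
        [st * sp, r * ct * sp, r * st * cp],
        [ct, -r * st, 0.0],
    ])

r, theta, phi = 2.0, 0.7, 1.1  # arbitrary sample point
J = jacobian(r, theta, phi)
J_inv = np.linalg.inv(J)       # J_inv[i, I] = d(xi^i)/d(x^I)

g_lower = J.T @ J              # (1.41a): g_ij = g_i . g_j
g_upper = J_inv @ J_inv.T      # (1.41b): g^ij = g^i . g^j

# The spherical polar metric is diagonal: diag(1, r^2, r^2 sin^2(theta)).
assert np.allclose(g_lower, np.diag([1.0, r**2, (r * np.sin(theta))**2]))

# (1.42): the two metric tensors are matrix inverses of one another.
assert np.allclose(g_upper @ g_lower, np.eye(3))

# (1.44): g is the squared Jacobian determinant and |g^ij| = 1/g.
g = np.linalg.det(g_lower)
assert np.isclose(g, np.linalg.det(J)**2)
assert np.isclose(np.linalg.det(g_upper), 1.0 / g)
```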

1.4.1 Properties of the metric tensor


Line elements
We have already seen that an infinitesimal line element corresponding to a change $d\xi^i$ in the $\xi^i$ coordinate is given by
\[
\mathbf{t}_i = d\mathbf{s}_i = \frac{\partial\mathbf{r}}{\partial\xi^i}\, d\xi^i = \mathbf{g}_i\, d\xi^i \quad\text{(not summed)}.
\]
Hence, a general infinitesimal line element corresponding to changes in all coordinates is given by
\[
d\mathbf{r} = \mathbf{g}_i\, d\xi^i, \quad\text{(summed over } i\text{)},
\]
and its length squared is
\[
ds^2 = d\mathbf{r}\cdot d\mathbf{r} = \mathbf{g}_i\, d\xi^i\cdot\mathbf{g}_j\, d\xi^j = g_{ij}\, d\xi^i d\xi^j.
\]
Thus, the infinitesimal length change due to increments $d\xi^i$ in general coordinates is given by $\sqrt{g_{ij}\, d\xi^i d\xi^j}$, which explains the use of the term metric to describe the tensor $g_{ij}$.

Surface elements
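Any surface on which $\xi^1$ is constant is spanned by the covariant base vectors $\mathbf{g}_2$ and $\mathbf{g}_3$ and so an infinitesimal area within that surface is given by
\[
dS_{(1)} = |d\mathbf{s}_2 \times d\mathbf{s}_3| = |\mathbf{g}_2\times\mathbf{g}_3|\, d\xi^2 d\xi^3,
\]
and
\[
|\mathbf{g}_2\times\mathbf{g}_3| = |\epsilon_{231}\, \mathbf{g}^1| = |\epsilon_{231}|\,|\mathbf{g}^1|
= \left|\frac{\partial x^I}{\partial\xi^2}\frac{\partial x^J}{\partial\xi^3}\frac{\partial x^K}{\partial\xi^1}\, e_{IJK}\right| \sqrt{\mathbf{g}^1\cdot\mathbf{g}^1}. \tag{1.45}
\]
The term between the modulus signs is the determinant[12] of the matrix with components $\partial x^I/\partial\xi^j$, so from the definition of the determinant of the metric tensor (1.44),
\[
dS_{(1)} = \sqrt{g\, g^{11}}\, d\xi^2 d\xi^3.
\]
Hence, an element of area in the surface on which $\xi^i$ is constant is
\[
dS_{(i)} = \sqrt{g\, g^{ii}}\, d\xi^j d\xi^k, \quad (i \neq j \neq k \text{ and } i \text{ not summed}).
\]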
12
It is the scalar triple product of the three rows of the matrix.
Volume elements
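An infinitesimal parallelepiped spanned by the vectors $d\mathbf{s}_i$ has volume given by
\[
dV = |d\mathbf{s}_1\cdot(d\mathbf{s}_2\times d\mathbf{s}_3)| = |\mathbf{g}_1\cdot(\mathbf{g}_2\times\mathbf{g}_3)|\, d\xi^1 d\xi^2 d\xi^3 = |\mathbf{g}_1\cdot\epsilon_{231}\, \mathbf{g}^1|\, d\xi^1 d\xi^2 d\xi^3.
\]
Now
\[
|\mathbf{g}_1\cdot\epsilon_{231}\, \mathbf{g}^1| = |\epsilon_{231}| = \sqrt{g},
\]
using the same argument as in equation (1.45). Thus, the volume element is given by
\[
dV = \sqrt{g}\, d\xi^1 d\xi^2 d\xi^3, \tag{1.46}
\]
which is, of course, equivalent to the standard expression for change of coordinates in multiple integrals
\[
dV = \left|\frac{\partial(x^1, x^2, x^3)}{\partial(\xi^1, \xi^2, \xi^3)}\right| d\xi^1 d\xi^2 d\xi^3.
\]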
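The surface and volume elements can be tested by integrating them numerically for a shape whose area and volume are known. The following sketch assumes NumPy and spherical polar coordinates (r, θ, φ), for which the metric determinant satisfies √g = r² sin θ and g¹¹ = 1; it uses a simple midpoint rule, and the grid size and radius are arbitrary.

```python
import numpy as np

def sqrt_g(r, theta):
    """sqrt(g) for spherical polars (r, theta, phi): r^2 sin(theta)."""
    return r**2 * np.sin(theta)

n = 200   # number of midpoint cells in each direction (illustrative)
R = 2.0   # radius of the ball

r_mid = (np.arange(n) + 0.5) * R / n
theta_mid = (np.arange(n) + 0.5) * np.pi / n
dr, dtheta = R / n, np.pi / n

# Volume element (1.46): dV = sqrt(g) dr dtheta dphi.
# Nothing depends on phi, so that integral contributes a factor 2*pi.
rr, tt = np.meshgrid(r_mid, theta_mid, indexing="ij")
volume = 2.0 * np.pi * np.sum(sqrt_g(rr, tt)) * dr * dtheta

# Surface element on r = R: dS_(1) = sqrt(g g^{11}) dtheta dphi with g^{11} = 1.
g_11_upper = 1.0
area = 2.0 * np.pi * np.sum(np.sqrt(sqrt_g(R, theta_mid)**2 * g_11_upper)) * dtheta

assert abs(volume - 4.0 / 3.0 * np.pi * R**3) < 1e-2
assert abs(area - 4.0 * np.pi * R**2) < 1e-2
```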

Index raising and lowering


We conclude this brief section by noting one of the most useful properties of the metric tensor. For a vector $\mathbf{a} = a_j\, \mathbf{g}^j$, we can find the components in the covariant basis by taking the dot product
\[
a^i = \mathbf{a}\cdot\mathbf{g}^i = a_j\, \mathbf{g}^j\cdot\mathbf{g}^i = a_j\, g^{ji}.
\]
Thus the contravariant metric tensor can be used to raise indices and similarly the covariant metric tensor can be used to lower indices
\[
a_i = a^j g_{ji}.
\]

1.5 Vector and tensor calculus


1.5.1 Covariant differentiation
We have already established that the derivative of the position vector $\mathbf{r}$ with respect to the coordinate $\xi^i$ gives a tangent vector $\mathbf{g}_i$. If we wish to express the rate of change of a vector field in the direction of a coordinate then we also need to be able to calculate
\[
\frac{\partial\mathbf{a}}{\partial\xi^i} \equiv \mathbf{a}_{,i},
\]
where we shall use a comma to indicate differentiation with respect to the $i$-th general coordinate. Under a change of coordinates from $\xi^i$ to $\chi^{\bar{i}}$, the derivative becomes
\[
\frac{\partial\mathbf{a}}{\partial\chi^{\bar{i}}} = \frac{\partial\xi^j}{\partial\chi^{\bar{i}}}\frac{\partial\mathbf{a}}{\partial\xi^j},
\]
so it transforms covariantly. If we decompose the vector into the covariant basis we have that
\[
\mathbf{a}_{,j} = \left(a^i\, \mathbf{g}_i\right)_{,j} = a^i_{,j}\, \mathbf{g}_i + a^i\, \mathbf{g}_{i,j}, \tag{1.47}
\]
by the product rule.

by the product rule.


Vector equations are often written by just writing out the components (in an assumed basis)
\[
\mathbf{v} = \mathbf{u} \quad\Rightarrow\quad v_I\, \mathbf{e}_I = u_I\, \mathbf{e}_I \quad\Rightarrow\quad v_I = u_I,
\]
where formally the last equation is obtained via dot product with a suitable base vector. In Cartesian components we can simply take derivatives of the component equations with respect to the coordinates because the base vectors do not depend on the coordinates
\[
u_I = v_I \quad\text{leads to}\quad u_{I,J} = v_{I,J} \quad\text{because}\quad (u_I\, \mathbf{e}_I)_{,J} = u_{I,J}\, \mathbf{e}_I.
\]
In other words, the components of the differentiated vector equation are simply the derivatives of the components of the original equation.
In a general coordinate system, the base vectors are not independent of the coordinates and so the second term in equation (1.47) cannot be neglected. The vector equation
\[
\mathbf{a} = \mathbf{b} \quad\Rightarrow\quad a^i\, \mathbf{g}_i = b^i\, \mathbf{g}_i \quad\Rightarrow\quad a^i = b^i,
\]
where the last equation is now obtained by taking the dot product with the contravariant base vectors. However, in general coordinates,
\[
a^i = b^i \quad\text{does not (directly) lead to}\quad a^i_{,j} = b^i_{,j}.
\]
Although the final equation is (obviously) true from direct differentiation of each component the statement is not coordinate-independent because it does not obey the correct transformation properties for a mixed second-order tensor.
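In fact, the derivatives of the base vectors are given by
\[
\mathbf{g}_{i,j} = \frac{\partial^2\mathbf{r}}{\partial\xi^j\partial\xi^i} = \mathbf{r}_{,ij} = \mathbf{r}_{,ji},
\]
assuming symmetry of partial differentiation. If we decompose the position vector into the fixed Cartesian coordinates, we have
\[
\mathbf{g}_{i,j} = \frac{\partial^2 x^K}{\partial\xi^j\partial\xi^i}\, \mathbf{e}_K,
\]
which can be written in the contravariant basis as
\[
\mathbf{g}_{i,j} = \Gamma_{ijk}\, \mathbf{g}^k,
\]
where
\[
\Gamma_{ijk} = \mathbf{g}_{i,j}\cdot\mathbf{g}_k = \frac{\partial^2\mathbf{r}}{\partial\xi^j\partial\xi^i}\cdot\mathbf{g}_k = \frac{\partial^2 x^K}{\partial\xi^i\partial\xi^j}\frac{\partial x^K}{\partial\xi^k}. \tag{1.48}
\]
The symbol $\Gamma_{ijk}$ is called the Christoffel symbol of the first kind. Thus, we can write
\[
\mathbf{a}_{,j} = a^i_{,j}\, \mathbf{g}_i + a^i\, \Gamma_{ijk}\, \mathbf{g}^k = a^i_{,j}\, \mathbf{g}_i + a^i\, \Gamma_{ijk}\, g^{kl}\, \mathbf{g}_l = \left(a^l_{,j} + \Gamma^{l}_{ij}\, a^i\right)\mathbf{g}_l,
\]
where
\[
\Gamma^{l}_{ij} = g^{kl}\, \Gamma_{ijk} = g^{kl}\, \mathbf{g}_k\cdot\mathbf{g}_{i,j} = \mathbf{g}^l\cdot\mathbf{g}_{i,j}, \tag{1.49}
\]
and $\Gamma^{l}_{ij}$ is called the Christoffel symbol of the second kind.
Finally, we obtain
\[
\mathbf{a}_{,j} = a^k|_j\, \mathbf{g}_k, \quad\text{where}\quad a^k|_j = a^k_{,j} + \Gamma^{k}_{ij}\, a^i,
\]
and the expression $a^k|_j$ is the covariant derivative of the component $a^k$. In many books the covariant derivative is represented by a subscript semicolon $a^k_{;j}$ and in some it is denoted by a comma, just to confuse things. The fact that $\mathbf{a}_{,j}$ is a covariant vector and that $\mathbf{g}_k$ is a covariant vector means that the covariant derivative $a^k|_j$ is a (mixed) second-order tensor[13]. Thus when differentiating equations expressed in components in a general coordinate system we should write
\[
a^i = b^i \quad\Rightarrow\quad a^i|_j = b^i|_j.
\]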
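If we had decomposed the vector $\mathbf{a}$ into the contravariant basis then
\[
\mathbf{a}_{,j} = a_{i,j}\, \mathbf{g}^i + a_i\, \mathbf{g}^i_{,j},
\]
and from equation (1.12),
\[
\mathbf{g}^k = \frac{\partial\xi^k}{\partial x^K}\, \mathbf{e}^K \quad\Rightarrow\quad
\frac{\partial x^I}{\partial\xi^k}\, \mathbf{g}^k = \frac{\partial x^I}{\partial\xi^k}\frac{\partial\xi^k}{\partial x^K}\, \mathbf{e}^K = \delta^{I}_{K}\, \mathbf{e}^K = \mathbf{e}^I. \tag{1.50}
\]
We differentiate equation (1.50) with respect to $\xi^j$, remembering that $\mathbf{e}^I$ is a constant base vector, to obtain
\[
\mathbf{g}^k_{,j}\, \frac{\partial x^I}{\partial\xi^k} + \mathbf{g}^k\, \frac{\partial^2 x^I}{\partial\xi^j\partial\xi^k} = 0,
\]
\[
\Rightarrow\quad \mathbf{g}^k_{,j}\, \frac{\partial x^I}{\partial\xi^k}\frac{\partial\xi^i}{\partial x^I} = -\mathbf{g}^k\, \frac{\partial\xi^i}{\partial x^I}\frac{\partial^2 x^I}{\partial\xi^j\partial\xi^k}
\quad\Rightarrow\quad \mathbf{g}^i_{,j} = -\left(\mathbf{g}_{k,j}\cdot\mathbf{g}^i\right)\mathbf{g}^k = -\Gamma^{i}_{kj}\, \mathbf{g}^k.
\]
Thus the derivative of the vector $\mathbf{a}$ when decomposed into the contravariant basis is
\[
\mathbf{a}_{,j} = a_{i,j}\, \mathbf{g}^i - \Gamma^{i}_{jk}\, a_i\, \mathbf{g}^k = \left(a_{k,j} - \Gamma^{i}_{jk}\, a_i\right)\mathbf{g}^k;
\]
and finally, we obtain
\[
\mathbf{a}_{,j} = a_i|_j\, \mathbf{g}^i, \quad\text{where}\quad a_i|_j = a_{i,j} - \Gamma^{k}_{ji}\, a_k,
\]
which gives the covariant derivative of the covariant components of a vector.
In Cartesian coordinate systems, the base vectors are independent of the coordinates, so $\Gamma_{IJK} \equiv 0$ for all $I, J, K$ and the covariant derivative reduces to the partial derivative:
\[
a^I|_K = a_I|_K = a_{I,K}.
\]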
Therefore, in most cases, when generalising equations derived in Cartesian coordinates to other
coordinate systems we simply need to replace the partial derivatives with covariant derivatives.
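The role of the Christoffel symbols can be seen numerically in plane polar coordinates. The sketch below assumes NumPy and evaluates definition (1.49) by central finite differences of the base vectors; all function names are hypothetical. It then takes a constant Cartesian vector field, whose partial derivatives of components are non-zero in polars, and confirms that its covariant derivative vanishes.

```python
import numpy as np

def covariant_basis(xi):
    """Columns are g_1, g_2 for plane polars xi = (r, theta)."""
    r, theta = xi
    return np.array([[np.cos(theta), -r * np.sin(theta)],
                     [np.sin(theta), r * np.cos(theta)]])

def christoffel(xi, h=1e-6):
    """gamma[l, i, j] = Gamma^l_{ij} = g^l . g_{i,j}, by central differences."""
    xi = np.asarray(xi, dtype=float)
    G_dual = np.linalg.inv(covariant_basis(xi))  # row l holds g^l
    gamma = np.zeros((2, 2, 2))
    for j in range(2):
        step = np.zeros(2)
        step[j] = h
        dG = (covariant_basis(xi + step) - covariant_basis(xi - step)) / (2 * h)
        gamma[:, :, j] = G_dual @ dG  # column i of dG is g_{i,j}
    return gamma

def components(xi):
    """Contravariant components a^i of the constant Cartesian field a = e_1."""
    return np.linalg.inv(covariant_basis(xi)) @ np.array([1.0, 0.0])

def derivatives(xi, h=1e-6):
    """Return (a^k_{,j}, a^k|_j) as 2x2 arrays indexed [k, j]."""
    xi = np.asarray(xi, dtype=float)
    partial = np.zeros((2, 2))
    for j in range(2):
        step = np.zeros(2)
        step[j] = h
        partial[:, j] = (components(xi + step) - components(xi - step)) / (2 * h)
    correction = np.einsum("kij,i->kj", christoffel(xi, h), components(xi))
    return partial, partial + correction

xi0 = np.array([2.0, 0.7])  # arbitrary point (r, theta)
gamma = christoffel(xi0)
partial, covariant = derivatives(xi0)

# Known non-zero symbols in plane polars: Gamma^r_{theta theta} = -r and
# Gamma^theta_{r theta} = Gamma^theta_{theta r} = 1/r.
assert np.isclose(gamma[0, 1, 1], -xi0[0], atol=1e-6)
assert np.isclose(gamma[1, 0, 1], 1.0 / xi0[0], atol=1e-6)
assert np.isclose(gamma[1, 1, 0], 1.0 / xi0[0], atol=1e-6)

# The field is constant, so its covariant derivative vanishes even though
# the partial derivatives of its polar components do not.
assert np.allclose(covariant, 0.0, atol=1e-6)
assert not np.allclose(partial, 0.0, atol=1e-6)
```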
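The partial derivative of a scalar already exhibits covariant transformation properties because
\[
\frac{\partial\phi}{\partial\chi^{\bar{i}}} = \frac{\partial\xi^j}{\partial\chi^{\bar{i}}}\frac{\partial\phi}{\partial\xi^j};
\]
and the partial derivative of a scalar coincides with the covariant derivative
\[
\phi_{,i} = \phi|_i.
\]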
The covariant derivative of higher-order tensors can be constructed by considering the covariant
derivative of invariants and insisting that the covariant derivative obeys the product rule.
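13
Thus, the covariant derivative exhibits tensor transformation properties, which the partial derivative does not because
\[
\frac{\partial a^{\bar{i}}}{\partial\chi^{\bar{j}}} = \frac{\partial\xi^k}{\partial\chi^{\bar{j}}}\frac{\partial}{\partial\xi^k}\left(\frac{\partial\chi^{\bar{i}}}{\partial\xi^l}\, a^l\right)
= \frac{\partial\xi^k}{\partial\chi^{\bar{j}}}\frac{\partial\chi^{\bar{i}}}{\partial\xi^l}\frac{\partial a^l}{\partial\xi^k}
+ \frac{\partial\xi^k}{\partial\chi^{\bar{j}}}\frac{\partial^2\chi^{\bar{i}}}{\partial\xi^k\partial\xi^l}\, a^l,
\]
and the presence of the second term violates the tensor transformation rule.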
1.5.2 Vector differential operators
Gradient
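The gradient of a scalar field $f(\mathbf{x})$ is denoted by $\nabla f$, or $\mathrm{grad} f$, and is defined to be a vector field such that
\[
f(\mathbf{x} + d\mathbf{x}) = f(\mathbf{x}) + \nabla f\cdot d\mathbf{x} + o(|d\mathbf{x}|); \tag{1.51}
\]
i.e. the gradient is the vector that expresses the change in the scalar field with position.
Letting $d\mathbf{x} = t\mathbf{h}$, dividing by $t$ and taking the limit as $t \to 0$ gives the alternative definition
\[
\nabla f\cdot\mathbf{h} = \lim_{t\to 0}\frac{f(\mathbf{x} + t\mathbf{h}) - f(\mathbf{x})}{t} = \left.\frac{d}{dt} f(\mathbf{x} + t\mathbf{h})\right|_{t=0}. \tag{1.52}
\]
Here, the derivative is the directional derivative in the direction $\mathbf{h}$.
If we decompose the vectors into components in the global Cartesian basis equation (1.52) becomes
\[
[\nabla f]_K\, h_K = \frac{\partial f}{\partial x_K}\left.\frac{d}{dt}(x_K + t h_K)\right|_{t=0} = \frac{\partial f}{\partial x_K}\, h_K,
\]
and so because $\mathbf{h}$ is arbitrary, we can define
\[
\nabla f = [\nabla f]_K\, \mathbf{e}_K = \frac{\partial f}{\partial x_K}\, \mathbf{e}_K, \tag{1.53}
\]
which should be a familiar expression for the gradient. Relative to the general coordinate system $\xi^i$ equation (1.53) becomes
\[
\nabla f = \frac{\partial f}{\partial\xi^i}\frac{\partial\xi^i}{\partial x_K}\, \mathbf{e}_K = \frac{\partial f}{\partial\xi^i}\, \mathbf{g}^i = f_{,i}\, \mathbf{g}^i, \tag{1.54}
\]
so we can write the vector differential operator $\nabla = \mathbf{g}^i \frac{\partial}{\partial\xi^i}$. Note that because the derivative transforms covariantly and the base vectors transform contravariantly the gradient is invariant under coordinate transformation.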

Example 1.4. Gradient in standard coordinate systems


Find the gradient in a plane polar coordinate system.

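Solution 1.4. The gradient of a scalar field $f$ is simply
\[
\nabla f = \mathbf{g}^i\, \frac{\partial f}{\partial\xi^i}.
\]
In plane polar coordinates,
\[
\mathbf{g}^1 = \cos\xi^2\, \mathbf{e}_1 + \sin\xi^2\, \mathbf{e}_2, \qquad
\mathbf{g}^2 = -\frac{\sin\xi^2}{\xi^1}\, \mathbf{e}_1 + \frac{\cos\xi^2}{\xi^1}\, \mathbf{e}_2,
\]
so
\[
\nabla f = \left(\cos\xi^2\, \mathbf{e}_1 + \sin\xi^2\, \mathbf{e}_2\right)\frac{\partial f}{\partial\xi^1}
+ \left(-\frac{\sin\xi^2}{\xi^1}\, \mathbf{e}_1 + \frac{\cos\xi^2}{\xi^1}\, \mathbf{e}_2\right)\frac{\partial f}{\partial\xi^2},
\]
which can be written in the (hopefully) familiar form
\[
\nabla f = \frac{\partial f}{\partial r}\, \mathbf{e}_r + \frac{1}{r}\frac{\partial f}{\partial\theta}\, \mathbf{e}_\theta.
\]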
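The result of Example 1.4 can be checked against a gradient computed directly in Cartesian coordinates. The sketch below assumes NumPy; the test function f = x²y and the sample point are arbitrary choices, and the coordinate derivatives are approximated by central differences.

```python
import numpy as np

def f_cartesian(x, y):
    return x**2 * y  # illustrative scalar field

def f_polar(r, theta):
    return f_cartesian(r * np.cos(theta), r * np.sin(theta))

r, theta, h = 2.0, 0.7, 1e-6

# Partial derivatives f_{,i} with respect to the polar coordinates.
df_dr = (f_polar(r + h, theta) - f_polar(r - h, theta)) / (2 * h)
df_dtheta = (f_polar(r, theta + h) - f_polar(r, theta - h)) / (2 * h)

# Contravariant base vectors of plane polars in Cartesian components.
g_up_1 = np.array([np.cos(theta), np.sin(theta)])
g_up_2 = np.array([-np.sin(theta), np.cos(theta)]) / r

# General form (1.54): grad f = f_{,i} g^i.
grad_general = df_dr * g_up_1 + df_dtheta * g_up_2

# Familiar form with unit vectors e_r = g^1 and e_theta = r g^2.
e_r, e_theta = g_up_1, r * g_up_2
grad_familiar = df_dr * e_r + (df_dtheta / r) * e_theta

# Direct Cartesian gradient of x^2 y.
x, y = r * np.cos(theta), r * np.sin(theta)
grad_exact = np.array([2 * x * y, x**2])

assert np.allclose(grad_general, grad_exact, atol=1e-6)
assert np.allclose(grad_familiar, grad_exact, atol=1e-6)
```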
Gradient of a Vector Field
The gradient of a vector field $\mathbf{F}(\mathbf{x})$ is a second-order tensor $\nabla\mathbf{F}(\mathbf{x})$, also often written as $\nabla\otimes\mathbf{F}$, that arises when the vector differential operator is applied to a vector
\[
\nabla\otimes\mathbf{F} = \mathbf{g}^i\otimes\mathbf{F}_{,i} = F^k|_i\, \mathbf{g}^i\otimes\mathbf{g}_k = F_k|_i\, \mathbf{g}^i\otimes\mathbf{g}^k.
\]
Note that we have used the covariant derivative because we are taking the derivative of a vector decomposed into the covariant or contravariant basis.

Divergence
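The divergence of a vector field is the trace or contraction of the gradient of the vector field. It is formed by taking the dot product of the vector differential operator $\nabla$ with the vector field:
\[
\mathrm{div}\,\mathbf{F} = \nabla\cdot\mathbf{F} = \mathbf{g}^i\cdot\mathbf{F}_{,i} = \mathbf{g}^i\cdot\frac{\partial}{\partial\xi^i}\left(F^j\, \mathbf{g}_j\right) = \mathbf{g}^i\cdot F^j|_i\, \mathbf{g}_j = \delta^{i}_{j}\, F^j|_i = F^i|_i,
\]
so that
\[
\mathrm{div}\,\mathbf{F} = F^i|_i = F^i_{,i} + \Gamma^{i}_{ji}\, F^j = F^i_{,i} + \Gamma^{i}_{ij}\, F^j. \tag{1.55}
\]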
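Formula (1.55) can be compared with the Cartesian divergence for a concrete field. The following sketch assumes NumPy and plane polar coordinates, for which the contracted Christoffel symbols are 1/r for the r direction and zero for the θ direction, and √g = r. The field and sample point are arbitrary, and the second expression computed is the equivalent form involving √g that the text derives later as (1.58).

```python
import numpy as np

def field_cartesian(x, y):
    # Illustrative field with Cartesian divergence 3x - y.
    return np.array([x**2 - x * y, x * y + x**2])

def components(r, theta):
    """Contravariant polar components F^i, defined by F = F^i g_i."""
    G = np.array([[np.cos(theta), -r * np.sin(theta)],
                  [np.sin(theta), r * np.cos(theta)]])  # columns g_1, g_2
    x, y = r * np.cos(theta), r * np.sin(theta)
    return np.linalg.solve(G, field_cartesian(x, y))

r, theta, h = 2.0, 0.7, 1e-6

dF1_dr = (components(r + h, theta)[0] - components(r - h, theta)[0]) / (2 * h)
dF2_dtheta = (components(r, theta + h)[1] - components(r, theta - h)[1]) / (2 * h)
F1 = components(r, theta)[0]

# Form (1.55): F^i_{,i} + Gamma^i_{ij} F^j, with Gamma^i_{i1} = 1/r and
# Gamma^i_{i2} = 0 in plane polars.
div_christoffel = dF1_dr + dF2_dtheta + F1 / r

# Equivalent form (1/sqrt(g)) (sqrt(g) F^i)_{,i} with sqrt(g) = r.
def weighted_F1(rr):
    return rr * components(rr, theta)[0]

d_weighted = (weighted_F1(r + h) - weighted_F1(r - h)) / (2 * h)
div_sqrt_g = (d_weighted + r * dF2_dtheta) / r

x, y = r * np.cos(theta), r * np.sin(theta)
div_exact = 3 * x - y

assert np.isclose(div_christoffel, div_exact, atol=1e-6)
assert np.isclose(div_sqrt_g, div_exact, atol=1e-6)
```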

Curl
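The curl of a vector field is the cross product of our vector differential operator $\nabla$ with the vector field
\[
\mathrm{curl}\,\mathbf{F} = \nabla\times\mathbf{F} = \mathbf{g}^i\times\frac{\partial\mathbf{F}}{\partial\xi^i} = \mathbf{g}^i\times\frac{\partial(F_j\, \mathbf{g}^j)}{\partial\xi^i} = \mathbf{g}^i\times F_j|_i\, \mathbf{g}^j = \epsilon^{ijk}\, F_j|_i\, \mathbf{g}_k.
\]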

Laplacian
The Laplacian of a scalar field f is defined by

∇2 f = ∆f = ∇·∇f.

Thus in general coordinates
\[
\nabla^2 f = \mathbf{g}^j\cdot\frac{\partial}{\partial\xi^j}\left(\mathbf{g}^i\, \frac{\partial f}{\partial\xi^i}\right) = \mathbf{g}^j\cdot f_{,i}|_j\, \mathbf{g}^i = g^{ij}\, f_{,i}|_j = g^{ij}\, f|_{ij},
\]
because the partial and covariant derivatives are the same for scalar quantities. Now $g^{ij}|_j$ is a tensor (contravariant, first order), which means that we know how it transforms under change in coordinates, but in Cartesian coordinates $g^{ij}|_j$ becomes $\delta^{IJ}_{,J} = 0$, so $g^{ij}|_j = 0$ in all coordinate systems[14]. Thus,
\[
\nabla^2 f = \left(g^{ij} f_{,i}\right)|_j = \frac{1}{\sqrt{g}}\frac{\partial}{\partial\xi^j}\left(\sqrt{g}\, g^{ij}\, \frac{\partial f}{\partial\xi^i}\right).
\]
The last equality is not obvious and is proved in the next section, equation (1.58).
14
This is a special case of Ricci’s Lemma.
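The final expression for the Laplacian is convenient to evaluate because it needs only the metric, not the Christoffel symbols. The sketch below assumes NumPy and plane polar coordinates, where √g = r and the contravariant metric is diag(1, 1/r²); the test function f = x³, with Cartesian Laplacian 6x, and the step size are arbitrary choices, and all function names are hypothetical.

```python
import numpy as np

def sqrt_g(xi):
    return xi[0]  # sqrt(g) = r in plane polars

def g_upper(xi):
    return np.diag([1.0, 1.0 / xi[0]**2])  # contravariant metric g^{ij}

def f(xi):
    r, theta = xi
    return (r * np.cos(theta))**3  # f = x^3 written in polar coordinates

def weighted_flux(xi, h):
    """sqrt(g) g^{ij} f_{,i} as a vector indexed by j (central differences)."""
    grad = np.array([(f(xi + h * d) - f(xi - h * d)) / (2 * h) for d in np.eye(2)])
    return sqrt_g(xi) * (g_upper(xi) @ grad)

def laplacian(xi, h=1e-4):
    """(1/sqrt(g)) d/dxi^j ( sqrt(g) g^{ij} df/dxi^i )."""
    xi = np.asarray(xi, dtype=float)
    total = 0.0
    for j, d in enumerate(np.eye(2)):
        total += (weighted_flux(xi + h * d, h)[j] - weighted_flux(xi - h * d, h)[j]) / (2 * h)
    return total / sqrt_g(xi)

xi0 = np.array([2.0, 0.7])  # arbitrary point (r, theta)
lap = laplacian(xi0)
lap_exact = 6.0 * xi0[0] * np.cos(xi0[1])  # Cartesian result 6x

assert abs(lap - lap_exact) < 1e-4
```

In plane polars this reduces to the familiar form with a radial term (1/r) ∂/∂r (r ∂f/∂r) and an angular term (1/r²) ∂²f/∂θ².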
1.6 Divergence theorem
The divergence theorem is a generalisation of the fundamental theorem of calculus into higher dimensions. Consider the volume given by $a \le \xi^1 \le b$, $c \le \xi^2 \le d$ and $e \le \xi^3 \le f$. If we want to calculate the outward flux of a vector field $\mathbf{F}(\mathbf{x})$ from this volume then we must calculate
\[
\iint \mathbf{F}\cdot\mathbf{n}\, dS,
\]
where $\mathbf{n}$ is the outer unit normal to the surface and $dS$ is an infinitesimal element of the surface area.
From section 1.4.1, we know that on the faces $\xi^1 = a, b$,
\[
dS = \sqrt{g\, g^{11}}\, d\xi^2 d\xi^3,
\]
and a normal to the surface is given by the contravariant base vector $\mathbf{g}^1$, with length $|\mathbf{g}^1| = \sqrt{\mathbf{g}^1\cdot\mathbf{g}^1} = \sqrt{g^{11}}$. On the face $\xi^1 = a$ the vector $\mathbf{g}^1$ is directed into the volume (it is oriented along the line of increasing coordinate) and on $\xi^1 = b$, $\mathbf{g}^1$ is directed out of the volume, so the net outward flux through these faces is
\[
\begin{aligned}
&\int_c^d\!\!\int_e^f \mathbf{F}(a, \xi^2, \xi^3)\cdot\left(-\mathbf{g}^1/\sqrt{g^{11}}\right)\sqrt{g\, g^{11}}\, d\xi^2 d\xi^3
+ \int_c^d\!\!\int_e^f \mathbf{F}(b, \xi^2, \xi^3)\cdot\left(+\mathbf{g}^1/\sqrt{g^{11}}\right)\sqrt{g\, g^{11}}\, d\xi^2 d\xi^3 \\
&\qquad = \int_c^d\!\!\int_e^f \left[ F^1(b, \xi^2, \xi^3)\sqrt{g(b, \xi^2, \xi^3)} - F^1(a, \xi^2, \xi^3)\sqrt{g(a, \xi^2, \xi^3)} \right] d\xi^2 d\xi^3.
\end{aligned}
\]
From the fundamental theorem of calculus, the integral becomes
\[
\int_c^d\!\!\int_e^f\!\!\int_a^b \left(F^1\sqrt{g}\right)_{,1}\, d\xi^1 d\xi^2 d\xi^3,
\]
and using similar arguments, the total outward flux from the volume is given by
\[
\iint \mathbf{F}\cdot\mathbf{n}\, dS = \int_a^b\!\!\int_c^d\!\!\int_e^f \left(F^i\sqrt{g}\right)_{,i}\, d\xi^1 d\xi^2 d\xi^3.
\]
From section 1.4.1 we also know that an infinitesimal volume element is given by $dV = \sqrt{g}\, d\xi^1 d\xi^2 d\xi^3$, so
\[
\iint \mathbf{F}\cdot\mathbf{n}\, dS = \iiint \frac{1}{\sqrt{g}}\left(F^i\sqrt{g}\right)_{,i}\sqrt{g}\, d\xi^1 d\xi^2 d\xi^3 = \iiint \frac{1}{\sqrt{g}}\left(F^i\sqrt{g}\right)_{,i}\, dV.
\]
g g
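The volume integrand is
\[
\frac{1}{\sqrt{g}}\left(F^i\sqrt{g}\right)_{,i} = F^i_{,i} + \frac{1}{\sqrt{g}}\, F^i \left(\sqrt{g}\right)_{,i},
\]
but
\[
\frac{1}{\sqrt{g}}\left(\sqrt{g}\right)_{,i} = \frac{1}{2g}\frac{\partial g}{\partial\xi^i} = \frac{1}{2g}\frac{\partial g}{\partial g_{jk}}\frac{\partial g_{jk}}{\partial\xi^i},
\]
by the chain rule. From equations (1.42) and (1.44)
\[
g_{jk}\, g^{kl} = g_{jk}\frac{D^{kl}}{g} = \delta^{l}_{j} \quad\Rightarrow\quad g_{jk}\, D^{kl} = g\, \delta^{l}_{j}
\quad\Rightarrow\quad \delta^{l}_{j}\frac{\partial g}{\partial g_{jk}} = \frac{\partial g}{\partial g_{lk}} = D^{kl} = g\, g^{kl} = g\, g^{lk}, \tag{1.56}
\]
which means that
\[
\frac{1}{\sqrt{g}}\left(\sqrt{g}\right)_{,i} = \frac{1}{2g}\, g\, g^{jk}\frac{\partial g_{jk}}{\partial\xi^i}
= \frac{1}{2}\, g^{jk}\left[\mathbf{g}_{j,i}\cdot\mathbf{g}_k + \mathbf{g}_j\cdot\mathbf{g}_{k,i}\right]
= \frac{1}{2}\left[\mathbf{g}_{j,i}\cdot\mathbf{g}^j + \mathbf{g}_{k,i}\cdot\mathbf{g}^k\right]
= \frac{1}{2}\left[\Gamma^{j}_{ji} + \Gamma^{k}_{ki}\right] = \Gamma^{j}_{ji}, \tag{1.57}
\]
from the definition of the Christoffel symbol (1.49). Thus the volume integrand becomes
\[
\frac{1}{\sqrt{g}}\left(F^i\sqrt{g}\right)_{,i} = F^i_{,i} + F^i\, \Gamma^{j}_{ji} = F^i|_i, \tag{1.58}
\]
and so,
\[
\iint \mathbf{F}\cdot\mathbf{n}\, dS = \iiint \left(F^i_{,i} + F^i\, \Gamma^{j}_{ji}\right) dV = \iiint F^i|_i\, dV = \iiint \mathrm{div}\,\mathbf{F}\, dV,
\]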

using the definition of the divergence (1.55). In fact, one should argue that the definition of the
divergence is chosen so that it is the quantity that appears within the volume integral in the
divergence theorem. The form of the theorem that we shall use most often is
\[
\iint F^i n_i\, dS = \iint F_i n^i\, dS = \iiint F^i|_i\, dV = \iiint \frac{1}{\sqrt{g}}\left(F^i\sqrt{g}\right)_{,i}\, dV. \tag{1.59}
\]
You might argue that the proof above is only valid for simple parallelepipeds in our general coordinate system, but we can extend the result to more general volumes by breaking the volume up into regions, each of which can be described by a parallelepiped in a general coordinate system.

1.7 The Stokes Theorem


For completeness, we also state, but do not prove the Stokes theorem, which states that for any
vector v, the integral of its tangential component around a closed path is equal to the flux of the
curl of the vector through any surface bounded by the path:
\[
\oint_C \mathbf{v}\cdot d\mathbf{r} = \iint_A (\mathrm{curl}\,\mathbf{v})\cdot d\mathbf{S}. \tag{1.60}
\]

In general tensor form, equation (1.60) becomes


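\[
\oint_C \mathbf{v}\cdot\mathbf{g}_i\, d\xi^i = \oint_C v_i\, d\xi^i = \iint_A \epsilon^{ijk}\, v_j|_i\, \mathbf{g}_k\cdot d\mathbf{S} = \iint_A \epsilon^{ijk}\, v_j|_i\, n_k\, dS, \tag{1.61}
\]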

where the vector area is dS = n dS, and n is a unit vector normal to the surface. Again we can
make the argument that the definition of the curl is chosen so that this relationship is satisfied.
You may be interested to know that an entire theory (differential forms) can be constructed in
which the Stokes theorem is a fundamental result
\[
\int_{\partial C} \omega = \int_{C} d\omega,
\]

where ω is an n-form and dω is its differential. The theory encompasses all known integral theorems
because the appropriate n-dimensional differential forms coincide with the divergence and curl and
the theory provides a rational extension to higher dimensions.
