SIAM REVIEW
Vol. 43, No. 4, pp. 694–706
© 2001 Society for Industrial and Applied Mathematics

Matrix Exponentials—Another Approach∗
William A. Harris, Jr.†
Jay P. Fillmore‡
Donald R. Smith‡
Abstract. The exponential of a matrix and the spectral decomposition of a matrix can be computed
knowing nothing more than the eigenvalues of the matrix and the Cayley–Hamilton theorem. The arrangement of the ideas in this paper is simple enough to be taught to beginning
students of ODEs.
Key words. matrix exponential, Vandermonde matrix, spectral decomposition
AMS subject classifications. 15-01, 15A21, 34-01
PII. S0036144599362406
Introduction. William Harris (born Dec. 18, 1930; died1 Jan. 8, 1998) is certainly
missed by all who knew him. He was frequently a visiting scholar at UCSD, most
recently in the fall of 1997. Smith met Harris in 1975 because of their common interest
in ODEs. Fillmore (whose main interest is geometry) did not meet Harris until fall
1997.
Harris possessed a great enthusiasm for ODEs and would frequently exchange
insights on various aspects of the subject with colleagues. During his final month at
UCSD, Harris gave Fillmore and Smith two short lectures on a nice way of computing
the exponential of a matrix. During his last year Harris also lectured and spoke
on these ideas at other locations, including Seattle, Rolla, and Los Angeles. Like
Harris, we feel that these ideas are simple and elegant and suitable for presentation to
undergraduates in an elementary course. In this spirit, we have recorded what Harris
told us about computing matrix exponentials, and we present that information here.
(See Moler and Van Loan (1978) for 19 other approaches.)
The paper is arranged as follows:
• Section 1 gives a description of the method. We start with the well-known fact that the entries of $e^{At}$ satisfy the ODE $c(d/dt)u = 0$, where $c(\lambda)$ is the characteristic polynomial of $A$, and we proceed to an arrangement of all the information that is simple and easy to use.
• Section 2 gives an example using the rudiments of Mathematica. We include just enough so that a beginning student can do some interesting computations. Several known results flow unexpectedly without effort from the development of section 1.
• Section 3 gives the spectral decomposition in the case of distinct eigenvalues.
• Section 4 discusses the case of repeated roots. The fact that the generalized (or "confluent") Vandermonde matrix is the change-of-basis matrix that brings the companion matrix of a characteristic polynomial into Jordan canonical form was well known early in this century but is now largely forgotten. Here we give a proof using differential equations and, in a remark, a proof that is strictly algebraic. (We leave it to the reader to judge which proof is simpler.)
• Section 5 gives another example, illustrating the sensitivity of the matrix exponential to changes in the data.
• Section 6 comments on other classical as well as recent methods for exponentiating matrices.

∗ Received by the editors October 6, 1999; accepted for publication (in revised form) June 5, 2001; published electronically October 31, 2001.
http://www.siam.org/journals/sirev/43-4/36240.html
† This author is deceased. Former address: Department of Mathematics, University of Southern California, Los Angeles, CA 90089.
‡ Department of Mathematics, University of California at San Diego, La Jolla, CA 92093 (jfillmore@ucsd.edu, drsmith@ucsd.edu).
1 See In Memoriam: William Ashton Harris Jr. (1930–1998), AMS Notices, 45 (1998), p. 684.
1. The Exponential. We define $e^{At}$ as the $n \times n$ matrix solution of the initial value problem

(1.1)   $\dfrac{dX}{dt} = AX \quad \text{for } -\infty < t < \infty, \qquad X = I \ \text{at } t = 0,$

where $I$ is the $n \times n$ identity matrix. Our goal is to compute $e^{At}$.
We shall often use the symbol $D = d/dt$ to denote differentiation with respect to an independent variable. From (1.1), $e^{At}$ satisfies

$D^k e^{At} = A^k e^{At}$

for every nonnegative integer $k = 0, 1, 2, \dots$ and, by linearity,

(1.2)   $p(D)e^{At} = p(A)e^{At}$

for every polynomial $p$. As in (1.2), polynomial operators are applied to a matrix entrywise.
Let $c$ be the characteristic polynomial of $A$,

(1.3)   $c(\lambda) = \det(\lambda I - A) = \lambda^n - c_1\lambda^{n-1} - c_2\lambda^{n-2} - \cdots - c_n,$

with coefficients $c_1 = \operatorname{tr} A, \dots, c_n = (-1)^{n+1}\det A$. The Cayley–Hamilton theorem2 asserts that

(1.4)   $c(A) = 0$

in $\mathbb{C}^{n\times n}$. Hence, using the polynomial $p(\lambda) = c(\lambda)$ in (1.2), we have $c(D)e^{At} = 0$, so each entry $u(t)$ of $e^{At}$ is a solution of the scalar $n$th-order linear constant-coefficient homogeneous differential equation

(1.5)   $c(D)\,u(t) = 0.$
2 A simple proof is based on the formula of Cramer: $(\lambda I - A)\cdot\operatorname{adj}(\lambda I - A) = c(\lambda)I$. Viewing this as the product of two polynomials in the commuting indeterminate $\lambda$ and with matrix coefficients, it is legitimate to set $\lambda = A$. For details, see p. 395 of Broida and Williamson (1989).
The Wronski matrix $W[y;t]$ for $n$ smooth, complex-valued functions $y_1, y_2, \dots, y_n$ of a real variable is the $n \times n$ matrix

(1.6)   $W[y;t] := \begin{pmatrix} y_1(t) & y_2(t) & \cdots & y_n(t) \\ y_1'(t) & y_2'(t) & \cdots & y_n'(t) \\ \vdots & \vdots & \ddots & \vdots \\ y_1^{(n-1)}(t) & y_2^{(n-1)}(t) & \cdots & y_n^{(n-1)}(t) \end{pmatrix}.$
If each of the functions $z_1, z_2, \dots, z_n$ is a (constant coefficient) linear combination of $y_1, y_2, \dots, y_n$, we have the matrix equation

(1.7)   $[\,z_1\ z_2\ \cdots\ z_n\,] = [\,y_1\ y_2\ \cdots\ y_n\,]\,C,$

where $C$ is a complex $n \times n$ matrix with constant entries. This equation relates the first rows of the respective two Wronski matrices $W[z;t]$ and $W[y;t]$, and then, by successive differentiations of (1.7), we have

(1.8)   $W[z;t] = W[y;t]\,C.$
Now let y1 (t), y2 (t), . . . , yn (t) and z1 (t), z2 (t), . . . , zn (t) be any two bases for the
n-dimensional vector space (over C) of solutions of the nth-order linear differential
equation (1.5). The Wronski matrices are now invertible matrices3 (see section 8.12
of Apostol (1997)). Relation (1.8) evaluated at $t = 0$ determines the constant matrix $C = W[y;0]^{-1}W[z;0]$, and then (1.8) yields

(1.9)   $W[z;t]\,W[z;0]^{-1} = W[y;t]\,W[y;0]^{-1}.$
The resulting common matrix value appearing on both sides of (1.9) is recognized as
the Wronski matrix
$W[\varphi;t] = \begin{pmatrix} \varphi_1(t) & \varphi_2(t) & \cdots & \varphi_n(t) \\ \varphi_1'(t) & \varphi_2'(t) & \cdots & \varphi_n'(t) \\ \vdots & \vdots & \ddots & \vdots \\ \varphi_1^{(n-1)}(t) & \varphi_2^{(n-1)}(t) & \cdots & \varphi_n^{(n-1)}(t) \end{pmatrix}$
of the principal solutions ϕ1 (t), ϕ2 (t), . . . , ϕn (t), these latter being the special solutions
of (1.5) satisfying the initial conditions contained in the following matrix equation for
this Wronski matrix evaluated at t = 0:
$\begin{pmatrix} \varphi_1(0) & \varphi_2(0) & \cdots & \varphi_n(0) \\ \varphi_1'(0) & \varphi_2'(0) & \cdots & \varphi_n'(0) \\ \vdots & \vdots & \ddots & \vdots \\ \varphi_1^{(n-1)}(0) & \varphi_2^{(n-1)}(0) & \cdots & \varphi_n^{(n-1)}(0) \end{pmatrix} = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}.$
In particular, the unique principal solutions for (1.5) may be obtained as
(1.10)   $[\,\varphi_1(t)\ \cdots\ \varphi_n(t)\,] = [\,y_1(t)\ \cdots\ y_n(t)\,]\,W[y;0]^{-1}$
in terms of arbitrary fundamental solutions y1 (t), . . . , yn (t), where [ y1 (t) · · · yn (t) ]
denotes a 1 × n row matrix.
3 In an elementary course one may also mention Abel's result for such a Wronski matrix $W[y;t]$, according to which the determinant of $W[y;t]$ (i.e., the Wronskian) is a solution of the first-order differential equation $du/dt = (\operatorname{tr} A)\,u$.
Each entry of $e^{At}$ is a solution of the $n$th-order differential equation (1.5) and is hence a unique linear combination (with constant coefficients) of any given $n$ fundamental solutions $y_1(t), \dots, y_n(t)$. It follows that the array $e^{At}$ can be written as a linear combination

(1.11)   $e^{At} = y_1(t)F_1 + y_2(t)F_2 + \cdots + y_n(t)F_n = [\,y_1(t)\ y_2(t)\ \cdots\ y_n(t)\,] \begin{pmatrix} F_1 \\ F_2 \\ \vdots \\ F_n \end{pmatrix}$

for unique $n \times n$ constant matrices $F_i \in \mathbb{C}^{n\times n}$ ($i = 1, 2, \dots, n$). The column of $F$'s on the right side of (1.11) is not interpreted as an $n^2 \times n$ block matrix,4 but rather as an $n \times 1$ array with the $i$th entry being $F_i$ (for $i = 1, 2, \dots, n$).
Evaluating successive derivatives of (1.11) at $t = 0$ yields

$y_1^{(i)}(0)F_1 + \cdots + y_n^{(i)}(0)F_n = A^i \quad \text{for } i = 0, 1, \dots, n-1,$

from which we obtain the formal matrix equation

(1.12)   $\begin{pmatrix} y_1(0) & y_2(0) & \cdots & y_n(0) \\ y_1'(0) & y_2'(0) & \cdots & y_n'(0) \\ \vdots & \vdots & \ddots & \vdots \\ y_1^{(n-1)}(0) & y_2^{(n-1)}(0) & \cdots & y_n^{(n-1)}(0) \end{pmatrix} \begin{pmatrix} F_1 \\ F_2 \\ \vdots \\ F_n \end{pmatrix} = \begin{pmatrix} I \\ A \\ \vdots \\ A^{n-1} \end{pmatrix}$

or, in terms of the Wronski matrix,

$W[y;0] \begin{pmatrix} F_1 \\ F_2 \\ \vdots \\ F_n \end{pmatrix} = \begin{pmatrix} I \\ A \\ \vdots \\ A^{n-1} \end{pmatrix}.$

Hence,

(1.13)   $\begin{pmatrix} F_1 \\ F_2 \\ \vdots \\ F_n \end{pmatrix} = W[y;0]^{-1} \begin{pmatrix} I \\ A \\ \vdots \\ A^{n-1} \end{pmatrix}.$

The $F_i$ in (1.11) are polynomials in $A$ and may be obtained by (1.13).
4 If $V$ denotes the $n$-dimensional space of solutions to (1.5), then the space of solutions of $c(D)X = 0$ ($X$ an $n \times n$ matrix) is the free right module $V \otimes_{\mathbb{C}} \mathbb{C}^{n\times n}$ of rank $n$ over $\mathbb{C}^{n\times n}$. The row on the right side of (1.11) is $[\,y_1(t)I\ \cdots\ y_n(t)I\,]$, which is a consequence of the injection $V \longrightarrow V \otimes_{\mathbb{C}} \mathbb{C}^{n\times n}$ given by $y \mapsto y \otimes I$.
Inserting (1.13) into (1.11) yields a formula for $e^{At}$:

(1.14)   $e^{At} = [\,y_1(t)\ y_2(t)\ \cdots\ y_n(t)\,]\,W[y;0]^{-1} \begin{pmatrix} I \\ A \\ \vdots \\ A^{n-1} \end{pmatrix}$

or, more simply in terms of the principal solutions (1.10),

(1.15)   $e^{At} = [\,\varphi_1(t)\ \varphi_2(t)\ \cdots\ \varphi_n(t)\,] \begin{pmatrix} I \\ A \\ \vdots \\ A^{n-1} \end{pmatrix}.$
Remark 1. We thus have two representations for $e^{At}$: $a(bc)$ from (1.11) and (1.13), and $(ab)c$ from (1.15). These are connected by $abc$ from (1.14). One needs to compute the inverse $W[y;0]^{-1}$ of only one $n \times n$ constant matrix. The representation $a(bc)$ uses an arbitrary choice of the $n$ fundamental solutions $y_j(t)$, and the representation $(ab)c$ uses the $n$ unique principal fundamental solutions determined by

(1.16)   $c(D)\varphi_j(t) = 0$

subject to the $n^2$ initial conditions

$\varphi_j^{(i-1)}(0) = \delta_{ij}$

for $i, j = 1, 2, \dots, n$ (i.e., the matrix initial condition $W[\varphi;0] = I$).
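As a computational companion to Remark 1, the whole of (1.14) fits in a few lines of Mathematica. The following is a minimal sketch of ours (not from the paper; the name matrixExpViaWronski and its argument conventions are our own), anticipating the worked session of section 2:

matrixExpViaWronski[a_, y_, t_] :=
  Module[{n = Length[a], w, powers},
    w = Table[D[y, {t, k}], {k, 0, n - 1}];           (* Wronski matrix (1.6) *)
    powers = Table[MatrixPower[a, k], {k, 0, n - 1}]; (* I, A, ..., A^(n-1) *)
    Simplify[(y . Inverse[w /. t -> 0]) . powers]]    (* formula (1.14) *)

Here a is the matrix, y is a list of $n$ fundamental solutions of (1.5), and t is the time symbol; y . Inverse[w /. t -> 0] is the row of principal solutions (1.10), and dotting it with the list of powers of a realizes (1.15).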
Remark 2. Beyond their fixed initial conditions, the principal solutions ϕj (t)
depend only on the characteristic polynomial c(λ) = det(λI − A) and hence remain
valid for all A with the same c(λ).
Remark 3. One could replace the differential equation (1.1) defining $e^{At}$ and use instead the Maclaurin series $\sum_{k=0}^{\infty} A^kt^k/k!$. Then the fact that (1.15) involves the principal solutions (1.10) for (1.5) is ascertained directly from the Cayley–Hamilton theorem (1.3)–(1.4), according to which $A^n$ is a linear combination of $I, A, A^2, \dots, A^{n-1}$. In this case the (converging) Maclaurin series implies directly

(1.17)   $e^{At} = x_1(t)I + x_2(t)A + x_3(t)A^2 + \cdots + x_n(t)A^{n-1} = [\,x_1(t)\ x_2(t)\ \cdots\ x_n(t)\,] \begin{pmatrix} I \\ A \\ \vdots \\ A^{n-1} \end{pmatrix}$

for suitable complex-valued functions $x_1(t), \dots, x_n(t)$. Note that the Maclaurin series and each of its derivatives converge and yield, in particular, at $t = 0$,

$x_j^{(i-1)}(0) = \delta_{ij} \quad \text{for } i, j = 1, 2, \dots, n.$
Similarly, the Maclaurin series also implies that $e^{At}$ satisfies the differential equation of (1.2), which, with (1.4) and (1.17), yields

$c(D)x_j(t) = 0 \quad \text{for } j = 1, 2, \dots, n.$

It follows by (1.16) and a standard differential equation uniqueness theorem that the present functions $x_j(t)$ of (1.17) coincide with the previous principal solutions $\varphi_j(t)$. As an aside, we note that some references (e.g., Chapter 14 of Bronson (1989)) factor suitable powers of $t$ from the principal solutions and cast (1.17) in the form

$e^{At} = \alpha_1(t)I + \alpha_2(t)\,tA + \alpha_3(t)\,t^2A^2 + \cdots + \alpha_n(t)\,t^{n-1}A^{n-1},$

with $x_j(t) = \alpha_j(t)\,t^{j-1}$. In this case the $\alpha_j(t)$ are not principal solutions.
2. An Example. We illustrate a Mathematica computation for the $3 \times 3$ matrix

(2.1)   $a = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 12 & -16 & 7 \end{pmatrix}.$

This matrix5 a and the identity matrix are entered into Mathematica as

a := {{0, 1, 0}, {0, 0, 1}, {12, -16, 7}};
id := IdentityMatrix[3];
To MatrixForm[a] Mathematica responds with

$\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 12 & -16 & 7 \end{pmatrix}$
The eigenvalues of a are obtained by Eigenvalues[a]
{2, 2, 3}
Now, the characteristic polynomial is $c(\lambda) = (\lambda - 2)^2(\lambda - 3) = \lambda^3 - 7\lambda^2 + 16\lambda - 12$. In this case, fundamental solutions of the characteristic differential equation $x''' - 7x'' + 16x' - 12x = 0$ are simple enough to insert into Mathematica "by hand":

y[t_] = {Exp[2*t], t*Exp[2*t], Exp[3*t]}

The Wronski matrix is computed from y[t] and the derivatives of y[t] as in (1.6):

w[t_] = {y[t], D[y[t], {t, 1}], D[y[t], {t, 2}]}
Then MatrixForm[w[t]] yields

(2.2)   $\begin{pmatrix} e^{2t} & te^{2t} & e^{3t} \\ 2e^{2t} & e^{2t} + 2te^{2t} & 3e^{3t} \\ 4e^{2t} & 4e^{2t} + 4te^{2t} & 9e^{3t} \end{pmatrix}$
5 One often uses lowercase letters to denote variables in Mathematica. The immediate output of
a Mathematica command is suppressed by following the command with a semicolon.
The principal solutions of $x''' - 7x'' + 16x' - 12x = 0$ are obtained as in (1.10) from

temp1 = y[t].Inverse[w[0]]

(The lowered dot denotes matrix multiplication in Mathematica.) The "row vector" of principal solutions is now assigned to temp1:

$\{-3e^{2t} + 4e^{3t} - 6te^{2t},\ \ 4e^{2t} - 4e^{3t} + 5te^{2t},\ \ -e^{2t} + e^{3t} - te^{2t}\}$
Now, create a "column vector" whose elements are the identity, a, and a², as appears on the right side of (1.15):

ee[u_, x_] = {u, x, x.x}
temp2 = ee[id, a]

The indeterminate u is needed to obtain the identity matrix by substitution. Then temp2, written as nine rows of three, contains

{{{1, 0, 0}, {0, 1, 0}, {0, 0, 1}}, {{0, 1, 0}, {0, 0, 1}, {12, -16, 7}},
{{0, 0, 1}, {12, -16, 7}, {84, -100, 33}}}
Finally, we obtain the matrix exponential $e^{At}$ using (1.15):

MatrixForm[Simplify[temp1.temp2]]

$\begin{pmatrix} e^{2t}(-3 + 4e^t - 6t) & e^{2t}(4 - 4e^t + 5t) & e^{2t}(-1 + e^t - t) \\ 12e^{2t}(-1 + e^t - t) & e^{2t}(13 - 12e^t + 10t) & e^{2t}(-3 + 3e^t - 2t) \\ 12e^{2t}(-3 + 3e^t - 2t) & 4e^{2t}(9 - 9e^t + 5t) & e^{2t}(-8 + 9e^t - 4t) \end{pmatrix}$

Mathematica has its own matrix exponential, invoked by MatrixForm[Simplify[MatrixExp[a*t]]]. This yields the same answer, as can be checked by subtraction (e.g., Simplify[MatrixExp[a*t] - temp1.temp2] yields the zero matrix).
3. Spectral Decomposition. As a by-product of the present approach, one gains
an elementary development of the spectral decomposition of A. The approach is
sufficiently simple that it can be easily presented to undergraduate students. We
illustrate the argument for the semisimple case in which A is diagonalizable with
distinct eigenvalues. The characteristic polynomial (1.3) factors as
(3.1)   $c(x) = (x - \lambda_1)(x - \lambda_2)\cdots(x - \lambda_n)$

with $\lambda_i \neq \lambda_j$ for $i \neq j$, where now we have replaced the variable $\lambda$ of (1.3) with $x$. The functions

(3.2)   $e^{\lambda_j t} \quad \text{for } j = 1, 2, \dots, n$

provide $n$ fundamental solutions for (1.5), so these functions may be used for the $y_j(t)$ in (1.11) to write

(3.3)   $e^{At} = e^{\lambda_1 t}E_1 + e^{\lambda_2 t}E_2 + \cdots + e^{\lambda_n t}E_n,$
where in this special case $E_j$ plays the role of the previous constant matrix $F_j$. Now (1.12) becomes

(3.4)   $\begin{pmatrix} 1 & 1 & \cdots & 1 \\ \lambda_1 & \lambda_2 & \cdots & \lambda_n \\ \vdots & \vdots & \ddots & \vdots \\ \lambda_1^{n-1} & \lambda_2^{n-1} & \cdots & \lambda_n^{n-1} \end{pmatrix} \begin{pmatrix} E_1 \\ E_2 \\ \vdots \\ E_n \end{pmatrix} = \begin{pmatrix} I \\ A \\ \vdots \\ A^{n-1} \end{pmatrix},$

where here the Wronski matrix is a Vandermonde matrix. Solving (3.4) as in (1.13) yields

(3.5)   $E_j = e_j(A) \quad \text{for } j = 1, 2, \dots, n,$

where $e_j(x)$ is a polynomial. In fact, Cramer's rule gives that $e_j(x)$ is the solution of

(3.6)   $\begin{pmatrix} 1 & 1 & \cdots & 1 \\ \lambda_1 & \lambda_2 & \cdots & \lambda_n \\ \vdots & \vdots & \ddots & \vdots \\ \lambda_1^{n-1} & \lambda_2^{n-1} & \cdots & \lambda_n^{n-1} \end{pmatrix} \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{pmatrix} = \begin{pmatrix} 1 \\ x \\ \vdots \\ x^{n-1} \end{pmatrix},$

which is

(3.7)   $e_j(x) = \frac{(x - \lambda_1)\cdots(x - \lambda_{j-1})(x - \lambda_{j+1})\cdots(x - \lambda_n)}{(\lambda_j - \lambda_1)\cdots(\lambda_j - \lambda_{j-1})(\lambda_j - \lambda_{j+1})\cdots(\lambda_j - \lambda_n)}$

based on the $n$ distinct values $\lambda_1, \lambda_2, \dots, \lambda_n$. Relation (3.5) with (3.7) is a familiar representation for the polynomial dependence of $E_j$ on $A$, obtained here by employing the special fundamental solutions (3.2) in (1.11). Note that we have the identity

(3.8)   $1 = e_1(x) + e_2(x) + \cdots + e_n(x),$

which comes from the top row of (3.6).
Relations (3.1) and (3.7) show that the characteristic polynomial $c(x)$ divides the product $e_i(x)e_j(x)$ for $i \neq j$. Hence (3.5) and the Cayley–Hamilton theorem (1.4) give

(3.9)   $E_iE_j = 0$ (in $\mathbb{C}^{n\times n}$) for $i \neq j$.

Analogous to (3.8) it also holds (put $t = 0$ in (3.3)) that

(3.10)   $I = E_1 + E_2 + \cdots + E_n$ (in $\mathbb{C}^{n\times n}$),

from which (multiply both sides of (3.10) by $E_j$ and use (3.9))

$E_j^2 = E_j \quad \text{for } j = 1, 2, \dots, n.$

Thus each $E_j$ in (3.3) is a projection. Moreover, with (3.7), $c(x)$ also divides the product $(x - \lambda_j)e_j(x),$
so, again by (3.5) and Cayley–Hamilton,

$(A - \lambda_jI)E_j = 0$ (in $\mathbb{C}^{n\times n}$).

Hence $AE_j = \lambda_jE_j$, or

$A(E_jv) = \lambda_j(E_jv) \quad \text{for all } v \in \mathbb{C}^{n\times 1}$

for each $j = 1, 2, \dots, n$.
Thus in the present case of $n$ distinct eigenvalues, (3.10) leads to the spectral decomposition

$A = AE_1 + AE_2 + \cdots + AE_n,$

where $E_j = e_j(A)$ is a projection onto the eigenspace of $A$ associated with the eigenvalue $\lambda_j$.
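The projections are equally easy to compute in Mathematica. The following fragment is a minimal sketch of ours (not from the paper; the names b, lam, and proj are our own); it builds $E_j = e_j(A)$ from (3.7) for a $2 \times 2$ matrix with distinct eigenvalues and checks (3.10), (3.9), and (3.3):

b = {{1, 2}, {3, 2}};
lam = Eigenvalues[b];                      (* {4, -1}: distinct eigenvalues *)
proj[j_] := Module[{id = IdentityMatrix[Length[b]]},
  Fold[Dot, id,                            (* the product (3.7), evaluated at b *)
    Table[(b - lam[[k]] id)/(lam[[j]] - lam[[k]]),
      {k, Delete[Range[Length[lam]], j]}]]];
proj[1] + proj[2] == IdentityMatrix[2]     (* (3.10): True *)
proj[1].proj[2] == {{0, 0}, {0, 0}}        (* (3.9): True *)
Simplify[MatrixExp[b*t] - (Exp[lam[[1]]*t] proj[1] + Exp[lam[[2]]*t] proj[2])]
                                           (* (3.3): the zero matrix *)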
4. Remarks on the Case of Repeated Roots. We include a few remarks on the case of repeated eigenvalues. Let

$c(x) = (x - \lambda_1)^{n_1}(x - \lambda_2)^{n_2}\cdots(x - \lambda_r)^{n_r}$

be the characteristic polynomial of a matrix or of a linear ODE. For each $\rho = 1, 2, \dots, r$, the characteristic root $\lambda_\rho$ has multiplicity $n_\rho$. The degree of the polynomial is $n_1 + n_2 + \cdots + n_r = n$.
A basis of solutions to $c(D)u = 0$ is made up of the functions

$\frac{1}{j!}\,t^je^{\lambda_\rho t}$

for $0 \le j \le n_\rho - 1$ and $\rho = 1, 2, \dots, r$. (For the matrix (2.1) of section 2, where $c(x) = (x-2)^2(x-3)$, this basis is $e^{2t}$, $te^{2t}$, $e^{3t}$.) The resulting Wronski matrix will lead to the generalized Vandermonde matrix.
The generalized (or "confluent") Vandermonde matrix $T$ consists of $r$ blocks of size $n \times n_\rho$ ($\rho = 1, 2, \dots, r$), each block being of the form (shown here with $\rho$ suppressed, writing $n = n_\rho$ and $\lambda = \lambda_\rho$)

$\begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ \lambda & 1 & 0 & \cdots & 0 \\ \lambda^2 & 2\lambda & 1 & \cdots & 0 \\ \lambda^3 & 3\lambda^2 & 3\lambda & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \lambda^{n-1} & (n-1)\lambda^{n-2} & \binom{n-1}{2}\lambda^{n-3} & \cdots & 1 \end{pmatrix},$

the $\mu\nu$th entry being $\binom{\mu-1}{\nu-1}\lambda^{\mu-\nu}$. It turns out that this matrix $T$ is the change-of-basis matrix that brings the companion matrix (cf. (1.3))
(4.1)   $A = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ c_n & c_{n-1} & c_{n-2} & \cdots & c_2 & c_1 \end{pmatrix}$

into Jordan canonical form $J$:

(4.2)   $AT = TJ.$
This pretty fact can be found (without proof) on p. 60 of Turnbull and Aitken
(1961). A short algebraic proof is included at the end of this paper, but here we give
a derivation in the spirit of differential equations.
Let $\varphi$ be the $1 \times (n_1 + n_2 + \cdots + n_r)$ principal matrix (which we write in terms of $r$ blocks of size $1 \times n_\rho$, $\rho = 1, 2, \dots, r$, with only the first block shown)

$\varphi = \left[\ e^{\lambda_1 t} \quad te^{\lambda_1 t} \quad \frac{t^2}{2!}e^{\lambda_1 t} \quad \cdots \quad \frac{t^{n_1-1}}{(n_1-1)!}e^{\lambda_1 t} \quad (r-1 \text{ additional blocks}) \ \right].$
Entrywise differentiation of the matrix $\varphi(t)$ gives directly

$D\varphi(t) = \varphi(t) \begin{pmatrix} J_1 & & & \\ & J_2 & & \\ & & \ddots & \\ & & & J_r \end{pmatrix}$

with blocks

$J_\rho = \begin{pmatrix} \lambda_\rho & 1 & 0 & \cdots & 0 \\ 0 & \lambda_\rho & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & \lambda_\rho \end{pmatrix}$

for $\rho = 1, 2, \dots, r$. That is, $D\varphi(t) = \varphi(t)J$, where the constant matrix $J$ on the right is a matrix of Jordan blocks. Now differentiate $D\varphi(t) = \varphi(t)J$ repeatedly to obtain the rows of

$D \begin{pmatrix} \varphi \\ D\varphi \\ D^2\varphi \\ \vdots \\ D^{n-1}\varphi \end{pmatrix} = \begin{pmatrix} \varphi \\ D\varphi \\ D^2\varphi \\ \vdots \\ D^{n-1}\varphi \end{pmatrix} J$

or, in terms of the Wronski matrix,

(4.3)   $DW[\varphi;t] = W[\varphi;t]\,J.$
Return to $W[\varphi;t]$ and differentiate it again, writing the result differently now as a product with a suitable constant matrix on the left,

$D \begin{pmatrix} \varphi \\ D\varphi \\ D^2\varphi \\ \vdots \\ D^{n-2}\varphi \\ D^{n-1}\varphi \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ * & * & * & \cdots & * \end{pmatrix} \begin{pmatrix} \varphi \\ D\varphi \\ D^2\varphi \\ \vdots \\ D^{n-2}\varphi \\ D^{n-1}\varphi \end{pmatrix},$

where we would like to have $D^n\varphi$ for the last entry on the right side. Since every entry of $\varphi$, and consequently of $W[\varphi;t]$, is a solution of $c(D)u = 0$, we have $(D^n - c_1D^{n-1} - c_2D^{n-2} - \cdots - c_n)u = 0$ and so $D^nu = (c_1D^{n-1} + c_2D^{n-2} + \cdots + c_n)u$. Hence we may fill in $[\,c_n\ c_{n-1}\ \cdots\ c_2\ c_1\,]$ for the bottom row of the constant matrix on the right side, from which, with (4.1),

(4.4)   $DW[\varphi;t] = A\,W[\varphi;t].$
Thus, by (4.3)–(4.4), the companion matrix $A$ of $c(x)$ is related to the Jordan matrix $J$ by

(4.5)   $A\,W[\varphi;t] = W[\varphi;t]\,J.$

But note that the generalized Vandermonde matrix $T$ is just $W[\varphi;0]$, so putting $t = 0$ in the previous equation gives the asserted result (4.2), $AT = TJ$, which shows that $T$ is the change-of-basis matrix that brings the companion matrix into the Jordan canonical form $J$.
In the example discussed earlier with companion matrix (2.1) and Wronski matrix (2.2), one can check that

$\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 12 & -16 & 7 \end{pmatrix} \begin{pmatrix} e^{2t} & te^{2t} & e^{3t} \\ 2e^{2t} & (1+2t)e^{2t} & 3e^{3t} \\ 4e^{2t} & (4+4t)e^{2t} & 9e^{3t} \end{pmatrix} \equiv \begin{pmatrix} e^{2t} & te^{2t} & e^{3t} \\ 2e^{2t} & (1+2t)e^{2t} & 3e^{3t} \\ 4e^{2t} & (4+4t)e^{2t} & 9e^{3t} \end{pmatrix} \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}$

for all $t$, in agreement with (4.5).
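The same check takes a few lines in Mathematica (our fragment, not from the paper; the names t0 and j are our own). Here t0 is the Wronski matrix (2.2) evaluated at $t = 0$, which is exactly the generalized Vandermonde matrix $T$ of this example:

a = {{0, 1, 0}, {0, 0, 1}, {12, -16, 7}};
t0 = {{1, 0, 1}, {2, 1, 3}, {4, 4, 9}};  (* T = W[phi;0], i.e., (2.2) at t = 0 *)
j = {{2, 1, 0}, {0, 2, 0}, {0, 0, 3}};   (* a 2 x 2 Jordan block for the double root 2, a 1 x 1 block for 3 *)
a.t0 == t0.j                             (* (4.2): True *)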
This change-of-basis property of the generalized Vandermonde matrix can be proved algebraically as follows. Let $A$ be the companion matrix (4.1) of $c(x)$ as before. Repeatedly differentiate the identity

$A \begin{pmatrix} 1 \\ x \\ x^2 \\ \vdots \\ x^{n-2} \\ x^{n-1} \end{pmatrix} = \begin{pmatrix} 1 \\ x \\ x^2 \\ \vdots \\ x^{n-2} \\ x^{n-1} \end{pmatrix} x - \begin{pmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix} c(x)$

with respect to the indeterminate $x$ and set $x = \lambda_\rho$. (The derivatives of $c(x)$ of order less than $n_\rho$ vanish at $\lambda_\rho$, since $\lambda_\rho$ is a root of multiplicity $n_\rho$.) This will yield the $n \times n_\rho$ block of the generalized Vandermonde matrix.
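To spell out the first steps of this differentiation (our elaboration of the argument), write $v(x) = (1, x, \dots, x^{n-1})^T$ and $e_n = (0, \dots, 0, 1)^T$, so that the identity reads $Av(x) = v(x)\,x - e_n\,c(x)$. Differentiating $j$ times by the Leibniz rule and setting $x = \lambda_\rho$, where $c(\lambda_\rho) = c'(\lambda_\rho) = \cdots = c^{(n_\rho-1)}(\lambda_\rho) = 0$, gives, for $1 \le j \le n_\rho - 1$,

$A\,\frac{v^{(j)}(\lambda_\rho)}{j!} = \lambda_\rho\,\frac{v^{(j)}(\lambda_\rho)}{j!} + \frac{v^{(j-1)}(\lambda_\rho)}{(j-1)!},$

which is exactly the Jordan-chain relation among the columns $v^{(j)}(\lambda_\rho)/j!$ of the $\rho$th block; indeed the entries $\binom{\mu-1}{\nu-1}\lambda^{\mu-\nu}$ of the block are precisely the components of $v^{(\nu-1)}(\lambda)/(\nu-1)!$.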
This algebraic proof is tantamount to differentiating solutions to the differential
equations with respect to the λρ . Such a procedure can be made rigorous for multiple
roots.
5. Another Example. It has been known since the early 1960s, with the pioneering work of J. H. Wilkinson (cf. the account in Wilkinson and Reinsch (1971)), that various problems involving repeated (multiple) roots and/or eigenvalues are in general ill conditioned. In particular, in such cases the matrix exponential can be quite sensitive to perturbations in the data. To illustrate this phenomenon we briefly consider a slightly modified version of the Mathematica calculation of section 2 for the $3 \times 3$ matrix (2.1). We now decrease the $(3,1)$ entry by a small amount, 0.0001, from 12 to 11.9999, so the matrix (2.1) is replaced with

(5.1)   $B = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 11.9999 & -16 & 7 \end{pmatrix}.$
The previous eigenvalues $\{2, 2, 3\}$ of (2.1) are replaced for (5.1) with the (approximate) values $\{1.99005,\ 2.01005,\ 2.9999\}$, so the eigenvalues are now distinct.6 The Mathematica calculation of section 2 based on the algorithm of section 1 easily computes for $e^{Bt}$ the following approximate results, with entries listed by columns to avoid clutter:

$\mathrm{col}_1(e^{Bt}) \approx \begin{pmatrix} 4.0017e^{2.9999t} + 298.538e^{1.99005t} - 301.54e^{2.01005t} \\ 12.0047e^{2.9999t} + 594.105e^{1.99005t} - 606.11e^{2.01005t} \\ 36.0129e^{2.9999t} + 1182.3e^{1.99005t} - 1218.31e^{2.01005t} \end{pmatrix},$

(5.2)

$\mathrm{col}_2(e^{Bt}) \approx \begin{pmatrix} -4.0017e^{2.9999t} - 248.039e^{1.99005t} + 252.04e^{2.01005t} \\ -12.0047e^{2.9999t} - 493.609e^{1.99005t} + 506.614e^{2.01005t} \\ -36.0129e^{2.9999t} - 982.306e^{1.99005t} + 1018.32e^{2.01005t} \end{pmatrix},$

$\mathrm{col}_3(e^{Bt}) \approx \begin{pmatrix} 1.0004e^{2.9999t} + 49.5092e^{1.99005t} - 50.5096e^{2.01005t} \\ 3.0011e^{2.9999t} + 98.5257e^{1.99005t} - 101.527e^{2.01005t} \\ 9.003e^{2.9999t} + 196.071e^{1.99005t} - 204.074e^{2.01005t} \end{pmatrix}.$
The matrix eBt can also be computed in this case by the standard elementary method
based on diagonalizing the matrix, and the results agree with the present answer (5.2).
One sees with (5.2) that a small change in the data from (2.1) to (5.1) results in large
changes for the matrix exponential as compared with the results of section 2.
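Repeating the perturbed computation is straightforward; a minimal Mathematica fragment (ours, not from the paper; here b stands for the matrix B of (5.1)) is

b = {{0, 1, 0}, {0, 0, 1}, {11.9999, -16, 7}};
Eigenvalues[b]                 (* approximately {2.9999, 2.01005, 1.99005} *)
MatrixForm[MatrixExp[b*t]]     (* exhibits the large coefficients of (5.2) *)

The two nearly equal exponents 1.99005 and 2.01005 carry large coefficients of opposite sign, and this near-cancellation is the sensitivity visible in (5.2).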
This example (5.1)–(5.2) provides a nice point of contact with the rich subject
of numerical linear algebra. The thrust of the present paper is a conceptual understanding of the algorithm described in section 1. We do not intend to provide “the”
ultimate algorithm for computing matrix exponentials—that is best left to numerical
analysts, who guide us very well through the subtleties of numerical linear algebra,
where one can often produce examples for which an algorithm may not be suited in
finite precision arithmetic due to extreme sensitivity to changes in data.
6. Concluding Remarks. The classical paper of Moler and Van Loan (1978) treated 19 ways to compute matrix exponentials. The algorithm of Putzer (cf. section 9.12 of Apostol (1997)) can also be used to calculate matrix exponentials in terms of polynomials in $A$, but without the simplicity and elegance of the method discussed here, which can be appreciated by beginning students of differential equations. Others (cf. Ziebur (1970), Fulmer (1975), Leonard (1996), and Elaydi and Harris (1998)) have discussed portions of the method described here, but these references did not develop it with the ease of use shown to us by Harris.
Acknowledgments. Fillmore and Smith thank Robert Sacker, Louis Grimm, and
the referees for their helpful comments on this paper, including the valuable suggestion
of one referee to mention the sensitivity of the matrix exponential to changes in the
data along with consequent pitfalls when using finite precision arithmetic.
6 A change of +0.0001 produces complex eigenvalues, which leads to a cluttered display, but the effect of either change is comparable.
REFERENCES
T. M. Apostol (1997), Linear Algebra—A First Course, with Applications to Differential Equations, Wiley, New York.
J. G. Broida and S. G. Williamson (1989), Comprehensive Introduction to Linear Algebra,
Addison-Wesley, Reading, MA.
R. Bronson (1989), 2500 Solved Problems in Differential Equations, 2nd ed., Schaum's Solved Problems Series, McGraw-Hill, New York.
S. N. Elaydi and W. A. Harris, Jr. (1998), On the computation of $A^n$, SIAM Rev., 40, pp. 965–971.
E. P. Fulmer (1975), Computation of the matrix exponential, Amer. Math. Monthly, 82, pp.
156–159.
I. E. Leonard (1996), The matrix exponential, SIAM Rev., 38, pp. 507–512.
C. Moler and C. Van Loan (1978), Nineteen dubious ways to compute the exponential of a
matrix, SIAM Rev., 20, pp. 801–836.
H. W. Turnbull and A. C. Aitken (1961), An Introduction to the Theory of Canonical Matrices,
Dover, New York.
J. H. Wilkinson and C. Reinsch (1971), Linear Algebra, Handbook for Automatic Computation
2, Springer-Verlag, Berlin.
A. D. Ziebur (1970), On determining the structure of $A$ by analyzing $e^{At}$, SIAM Rev., 12, pp. 98–102.