SIAM REVIEW, Vol. 43, No. 4, pp. 694–706. © 2001 Society for Industrial and Applied Mathematics. PII S0036144599362406.

Matrix Exponentials—Another Approach*

William A. Harris, Jr.†, Jay P. Fillmore‡, and Donald R. Smith‡

Abstract. The exponential of a matrix and the spectral decomposition of a matrix can be computed knowing nothing more than the eigenvalues of the matrix and the Cayley–Hamilton theorem. The arrangement of the ideas in this paper is simple enough to be taught to beginning students of ODEs.

Key words. matrix exponential, Vandermonde matrix, spectral decomposition

AMS subject classifications. 15-01, 15A21, 34-01

* Received by the editors October 6, 1999; accepted for publication (in revised form) June 5, 2001; published electronically October 31, 2001. http://www.siam.org/journals/sirev/43-4/36240.html
† This author is deceased. Former address: Department of Mathematics, University of Southern California, Los Angeles, CA 90089.
‡ Department of Mathematics, University of California at San Diego, La Jolla, CA 92093 (jfillmore@ucsd.edu, drsmith@ucsd.edu).

Introduction. William Harris (born Dec. 18, 1930, died Jan. 8, 1998; see In Memoriam: William Ashton Harris Jr. (1930–1998), AMS Notices, 45 (1998), p. 684) is certainly missed by all who knew him. He was frequently a visiting scholar at UCSD, most recently in the fall of 1997. Smith met Harris in 1975 because of their common interest in ODEs. Fillmore (whose main interest is geometry) did not meet Harris until fall 1997. Harris possessed a great enthusiasm for ODEs and would frequently exchange insights on various aspects of the subject with colleagues. During his final month at UCSD, Harris gave Fillmore and Smith two short lectures on a nice way of computing the exponential of a matrix. During his last year Harris also lectured and spoke on these ideas at other locations, including Seattle, Rolla, and Los Angeles. Like Harris, we feel that these ideas are simple and elegant and suitable for presentation to undergraduates in an elementary course. In this spirit, we have recorded what Harris told us about computing matrix exponentials, and we present that information here. (See Moler and Van Loan (1978) for 19 other approaches.)

The paper is arranged as follows:

• Section 1 gives a description of the method. We start with the well-known fact that the entries of e^{At} satisfy the ODE c(d/dt)u = 0, where c(λ) is the characteristic polynomial of A, and we proceed to an arrangement of all the information which is simple and easy to use.

• Section 2 gives an example using the rudiments of Mathematica. We include just enough so that a beginning student can do some interesting computations. Several known results flow unexpectedly without effort from the development of section 1.

• Section 3 gives the spectral decomposition in the case of distinct eigenvalues.

• Section 4 discusses the case of repeated roots. The fact that the generalized (or "confluent") Vandermonde matrix is the change-of-basis matrix which brings the companion matrix of a characteristic polynomial into Jordan canonical form was well known early in this century but is now largely forgotten. Here we give a proof using differential equations and, in a remark, a proof that is strictly algebraic. (We leave it to the reader to judge which proof is simpler.)

• Section 5 gives another example, illustrating the sensitivity of the matrix exponential to changes in the data.
• Section 6 comments on other classical as well as recent methods for exponentiating matrices.

1. The Exponential. We define e^{At} as the n × n solution of the initial value problem

(1.1)    dX/dt = AX for −∞ < t < ∞,    X = I at t = 0,

where I is the n × n identity matrix. Our goal is to compute e^{At}. We shall often use the symbol D = d/dt to denote differentiation with respect to an independent variable. From (1.1), e^{At} satisfies D^k e^{At} = A^k e^{At} for every nonnegative integer k = 0, 1, 2, . . . and, by linearity,

(1.2)    p(D) e^{At} = p(A) e^{At} for every polynomial p.

As in (1.2), polynomial operators are applied to a matrix entrywise. Let c be the characteristic polynomial of A,

(1.3)    c(λ) = det(λI − A) = λ^n − c_1 λ^{n−1} − c_2 λ^{n−2} − ··· − c_n,

with coefficients c_1 = tr A, . . . , c_n = (−1)^{n+1} det A. The Cayley–Hamilton theorem (see Footnote 2) asserts that

(1.4)    c(A) = 0 in C^{n×n}.

Hence, using the polynomial p(λ) = c(λ) in (1.2), we have c(D) e^{At} = 0, so each entry u(t) of e^{At} is a solution of the scalar nth-order linear constant-coefficient homogeneous differential equation

(1.5)    c(D) u(t) = 0.

Footnote 2. A simple proof is based on the formula of Cramer: (λI − A) × adjugate(λI − A) = c(λ)I. Viewing this as the product of two polynomials in the commuting indeterminate λ and with matrix coefficients, it is legitimate to set λ = A. For details, see p. 395 of Broida and Williamson (1989).

The Wronski matrix W[y; t] for n smooth, complex-valued functions y_1, y_2, . . . , y_n of a real variable is the n × n matrix

(1.6)    W[y; t] := [ y_1(t)          y_2(t)          ···  y_n(t)
                      y_1′(t)         y_2′(t)         ···  y_n′(t)
                      ...             ...                  ...
                      y_1^{(n−1)}(t)  y_2^{(n−1)}(t)  ···  y_n^{(n−1)}(t) ].

If each of the functions z_1, z_2, . . . , z_n is a (constant-coefficient) linear combination of y_1, y_2, . . . , y_n, we have the matrix equation

(1.7)    [ z_1  z_2  ···  z_n ] = [ y_1  y_2  ···  y_n ] C,

where C is a complex n × n matrix with constant entries. This equation relates the first rows of the respective two Wronski matrices W[z; t] and W[y; t], and then, by successive differentiations of (1.7), we have

(1.8)    W[z; t] = W[y; t] C.

Now let y_1(t), y_2(t), . . . , y_n(t) and z_1(t), z_2(t), . . . , z_n(t) be any two bases for the n-dimensional vector space (over C) of solutions of the nth-order linear differential equation (1.5). The Wronski matrices are now invertible matrices (see Footnote 3 and section 8.12 of Apostol (1997)). Relation (1.8) evaluated at t = 0 determines the constant matrix C = W[y; 0]^{−1} W[z; 0], and then (1.8) yields

(1.9)    W[z; t] W[z; 0]^{−1} = W[y; t] W[y; 0]^{−1}.

The resulting common matrix value appearing on both sides of (1.9) is recognized as the Wronski matrix W[ϕ; t] of the principal solutions ϕ_1(t), ϕ_2(t), . . . , ϕ_n(t), these latter being the special solutions of (1.5) satisfying the initial conditions collected in the matrix equation W[ϕ; 0] = I, that is, ϕ_j^{(i−1)}(0) = δ_{ji} for i, j = 1, 2, . . . , n. In particular, the unique principal solutions for (1.5) may be obtained as

(1.10)    [ ϕ_1(t)  ···  ϕ_n(t) ] = [ y_1(t)  ···  y_n(t) ] W[y; 0]^{−1}

in terms of arbitrary fundamental solutions y_1(t), . . . , y_n(t), where [ y_1(t) ··· y_n(t) ] denotes a 1 × n row matrix.
Footnote 3. In an elementary course one may also mention Abel's result for such a Wronski matrix W[y; t], according to which the determinant of W[y; t] (i.e., the Wronskian) is a solution of the first-order differential equation du/dt = (tr A)u.

Each entry of e^{At} is a solution of the nth-order differential equation (1.5) and is hence a unique linear combination (with constant coefficients) of any given n fundamental solutions y_1(t), . . . , y_n(t). It follows that the array e^{At} can be written as a linear combination

(1.11)    e^{At} = y_1(t) F_1 + y_2(t) F_2 + ··· + y_n(t) F_n = [ y_1(t)  y_2(t)  ···  y_n(t) ] [ F_1 ; F_2 ; ··· ; F_n ]

for unique n × n constant matrices F_i ∈ C^{n×n} (i = 1, 2, . . . , n), where [ F_1 ; F_2 ; ··· ; F_n ] denotes the column whose ith entry is F_i. This column on the right side of (1.11) is not interpreted as an n^2 × n block matrix (see Footnote 4), but rather as an n × 1 array with ith entry F_i (for i = 1, 2, . . . , n). Evaluating successive derivatives of (1.11) at t = 0 yields

    y_1^{(i)}(0) F_1 + ··· + y_n^{(i)}(0) F_n = A^i for i = 0, 1, . . . , n − 1,

from which we obtain the formal matrix equation

(1.12)    [ y_1(0)          y_2(0)          ···  y_n(0)
            y_1′(0)         y_2′(0)         ···  y_n′(0)
            ...             ...                  ...
            y_1^{(n−1)}(0)  y_2^{(n−1)}(0)  ···  y_n^{(n−1)}(0) ] [ F_1 ; F_2 ; ··· ; F_n ] = [ I ; A ; ··· ; A^{n−1} ]

or, in terms of the Wronski matrix,

    W[y; 0] [ F_1 ; F_2 ; ··· ; F_n ] = [ I ; A ; ··· ; A^{n−1} ].

Hence,

(1.13)    [ F_1 ; F_2 ; ··· ; F_n ] = W[y; 0]^{−1} [ I ; A ; ··· ; A^{n−1} ].

The F_i in (1.11) are polynomials in A and may be obtained by (1.13).

Footnote 4. If V denotes the n-dimensional space of solutions to (1.5), then the space of solutions of c(D)X = 0 (X an n × n matrix) is the free right module V ⊗_C C^{n×n} of rank n over C^{n×n}. The row on the right side of (1.11) is [ y_1(t)I ··· y_n(t)I ], which is a consequence of the injection V → V ⊗_C C^{n×n} given by y ↦ y ⊗ I.

Inserting (1.13) into (1.11) yields a formula for e^{At}:

(1.14)    e^{At} = [ y_1(t)  y_2(t)  ···  y_n(t) ] W[y; 0]^{−1} [ I ; A ; ··· ; A^{n−1} ]

or, more simply in terms of the principal solutions (1.10),

(1.15)    e^{At} = [ ϕ_1(t)  ϕ_2(t)  ···  ϕ_n(t) ] [ I ; A ; ··· ; A^{n−1} ].
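For concreteness, here is a small worked instance of (1.15); this illustration is our own addition and is not part of the original development. Take n = 2 and any A with characteristic polynomial c(λ) = (λ − 1)(λ − 2) = λ^2 − 3λ + 2. The fundamental solutions e^{t} and e^{2t} give the principal solutions determined by ϕ_1(0) = 1, ϕ_1′(0) = 0 and ϕ_2(0) = 0, ϕ_2′(0) = 1, namely

    ϕ_1(t) = 2e^{t} − e^{2t},    ϕ_2(t) = e^{2t} − e^{t},

so (1.15) gives

    e^{At} = (2e^{t} − e^{2t}) I + (e^{2t} − e^{t}) A

for every 2 × 2 matrix A with this characteristic polynomial; for A = diag(1, 2) the right side reduces to diag(e^{t}, e^{2t}), as it must.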
Remark 1. We thus have two representations for e^{At}: a(bc) from (1.11) and (1.13), and (ab)c from (1.15). These are connected by abc from (1.14). One needs to compute the inverse W[y; 0]^{−1} of only one n × n constant matrix. The representation a(bc) uses an arbitrary choice of the n fundamental solutions y_j(t), and the representation (ab)c uses the n unique principal fundamental solutions determined by

(1.16)    c(D) ϕ_j(t) = 0 subject to the n^2 initial conditions ϕ_j^{(i−1)}(0) = δ_{ji} for i, j = 1, 2, . . . , n

(i.e., the matrix initial condition W[ϕ; 0] = I).

Remark 2. Beyond their fixed initial conditions, the principal solutions ϕ_j(t) depend only on the characteristic polynomial c(λ) = det(λI − A) and hence remain valid for all A with the same c(λ).

Remark 3. One could replace the differential equation (1.1) defining e^{At} and use instead the Maclaurin series Σ_{k=0}^{∞} A^k t^k / k!. Then the fact that (1.15) involves the principal solutions (1.10) for (1.5) is ascertained directly from the Cayley–Hamilton theorem (1.3)–(1.4), according to which A^n is a linear combination of I, A, A^2, . . . , A^{n−1}. In this case the (converging) Maclaurin series implies directly

(1.17)    e^{At} = x_1(t) I + x_2(t) A + x_3(t) A^2 + ··· + x_n(t) A^{n−1} = [ x_1(t)  x_2(t)  ···  x_n(t) ] [ I ; A ; ··· ; A^{n−1} ]

for suitable complex-valued functions x_1(t), . . . , x_n(t). Note that the Maclaurin series and each of its derivatives converge and yield, in particular, at t = 0, x_j^{(i−1)}(0) = δ_{ji} for i, j = 1, 2, . . . , n. Similarly, the Maclaurin series also implies that e^{At} satisfies the differential equation of (1.2), which, with (1.4) and (1.17), yields c(D) x_j(t) = 0 for j = 1, 2, . . . , n. It follows by (1.16) and a standard differential equation uniqueness theorem that the present functions x_j(t) of (1.17) coincide with the previous principal solutions ϕ_j(t). As an aside, we note that some references (e.g., Chapter 14 of Bronson (1989)) factor suitable powers of t from the principal solutions and cast (1.17) in the form

    e^{At} = α_1(t) I + α_2(t) t A + α_3(t) t^2 A^2 + ··· + α_n(t) t^{n−1} A^{n−1},

with x_j(t) = α_j(t) t^{j−1}. In this case the α_j(t) are not principal solutions.

2. An Example. We illustrate a Mathematica computation for the 3 × 3 matrix

(2.1)    a = [ 0    1   0
               0    0   1
               12  −16  7 ].

This matrix a, along with the identity matrix, is entered into Mathematica as (see Footnote 5)

    a := {{0, 1, 0}, {0, 0, 1}, {12, -16, 7}};
    id := IdentityMatrix[3];

To MatrixForm[a] Mathematica responds with

    0    1   0
    0    0   1
    12  −16  7

The eigenvalues of a are obtained by

    Eigenvalues[a]
    {2, 2, 3}

Footnote 5. One often uses lowercase letters to denote variables in Mathematica. The immediate output of a Mathematica command is suppressed by following the command with a semicolon.

Now, the characteristic polynomial is c(λ) = (λ − 2)^2 (λ − 3) = λ^3 − 7λ^2 + 16λ − 12. In this case, fundamental solutions of the characteristic differential equation x′′′ − 7x′′ + 16x′ − 12x = 0 are simple enough to insert into Mathematica "by hand":

    y[t_] = {Exp[2*t], t*Exp[2*t], Exp[3*t]}

The Wronski matrix is computed from y[t] and the derivatives of y[t] as in (1.6):

    w[t_] = {y[t], D[y[t], {t, 1}], D[y[t], {t, 2}]}

Then MatrixForm[w[t]] yields

(2.2)    [ e^{2t}    t e^{2t}               e^{3t}
           2e^{2t}   e^{2t} + 2t e^{2t}     3e^{3t}
           4e^{2t}   4e^{2t} + 4t e^{2t}    9e^{3t} ].

The principal solutions of x′′′ − 7x′′ + 16x′ − 12x = 0 are obtained as in (1.10) from

    temp1 = y[t].Inverse[w[0]]

(The lowered dot denotes matrix multiplication in Mathematica.) The "row vector" of principal solutions is now assigned to temp1:

    {−3e^{2t} + 4e^{3t} − 6t e^{2t},  4e^{2t} − 4e^{3t} + 5t e^{2t},  −e^{2t} + e^{3t} − t e^{2t}}

Now, create a "column vector" whose elements are the identity, a, and a^2, as appears on the right side of (1.15):

    ee[u_, x_] = {u, x, x.x}
    temp2 = ee[id, a]

The indeterminate u is needed to obtain the identity matrix by substitution. Then temp2, written as nine rows of three, contains

    {{{1, 0, 0}, {0, 1, 0}, {0, 0, 1}},
     {{0, 1, 0}, {0, 0, 1}, {12, -16, 7}},
     {{0, 0, 1}, {12, -16, 7}, {84, -100, 33}}}

Finally, we obtain the matrix exponential e^{At} using (1.15):

    MatrixForm[Simplify[temp1.temp2]]

    [ e^{2t}(−3 + 4e^{t} − 6t)      e^{2t}(4 − 4e^{t} + 5t)       e^{2t}(−1 + e^{t} − t)
      12e^{2t}(−1 + e^{t} − t)      e^{2t}(13 − 12e^{t} + 10t)    e^{2t}(−3 + 3e^{t} − 2t)
      12e^{2t}(−3 + 3e^{t} − 2t)    4e^{2t}(9 − 9e^{t} + 5t)      e^{2t}(−8 + 9e^{t} − 4t) ]

Mathematica has its own matrix exponential, invoked by

    MatrixForm[Simplify[MatrixExp[a*t]]]

This yields the same answer, as can be checked by subtraction.
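The steps of section 2 can be collected into a single reusable routine. The following sketch is our own addition and is not from the paper; the name matExpViaWronski and its internal helpers are our own choices, and it assumes exact (symbolic or integer) matrix entries so that repeated eigenvalues are detected exactly.

    matExpViaWronski[a_?MatrixQ, t_] :=
      Module[{n = Length[a], evals, y, w, phi, powers, s},
        evals = Eigenvalues[a];
        (* one fundamental solution s^j Exp[lambda s] for each repetition of lambda *)
        y = Flatten[Table[s^j Exp[First[pair] s],
              {pair, Tally[evals]}, {j, 0, Last[pair] - 1}]];
        w = Table[D[y, {s, k}], {k, 0, n - 1}];            (* Wronski matrix W[y; s] as in (1.6) *)
        phi = (y /. s -> t).Inverse[w /. s -> 0];          (* principal solutions, as in (1.10)  *)
        powers = Table[MatrixPower[a, k], {k, 0, n - 1}];  (* I, A, ..., A^(n-1)                 *)
        Simplify[Sum[phi[[k]] powers[[k]], {k, 1, n}]]     (* formula (1.15)                     *)
      ]

For the matrix a of (2.1), matExpViaWronski[a, t] reproduces the display above, and matExpViaWronski[a, t] - MatrixExp[a t] simplifies to the zero matrix.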
3. Spectral Decomposition. As a by-product of the present approach, one gains an elementary development of the spectral decomposition of A. The approach is sufficiently simple that it can be easily presented to undergraduate students. We illustrate the argument for the semisimple case in which A is diagonalizable with distinct eigenvalues. The characteristic polynomial (1.3) factors as

(3.1)    c(x) = (x − λ_1)(x − λ_2) ··· (x − λ_n) with λ_i ≠ λ_j for i ≠ j,

where now we have replaced the variable λ of (1.3) with x. The functions

(3.2)    e^{λ_j t} for j = 1, 2, . . . , n

provide n fundamental solutions for (1.5), so these functions may be used for y_j(t) in (1.11) to write

(3.3)    e^{At} = e^{λ_1 t} E_1 + e^{λ_2 t} E_2 + ··· + e^{λ_n t} E_n,

where in this special case E_j plays the role of the previous constant matrix F_j. Now (1.12) becomes

(3.4)    [ 1          1          ···  1
           λ_1        λ_2        ···  λ_n
           ...        ...             ...
           λ_1^{n−1}  λ_2^{n−1}  ···  λ_n^{n−1} ] [ E_1 ; E_2 ; ··· ; E_n ] = [ I ; A ; ··· ; A^{n−1} ],

where here the Wronski matrix is a Vandermonde matrix. Solving (3.4) as in (1.13) yields

(3.5)    E_j = e_j(A) for j = 1, 2, . . . , n,

where e_j(x) is a polynomial. In fact, Cramer's rule gives that e_j(x) is the solution of

(3.6)    [ 1          1          ···  1
           λ_1        λ_2        ···  λ_n
           ...        ...             ...
           λ_1^{n−1}  λ_2^{n−1}  ···  λ_n^{n−1} ] [ e_1 ; e_2 ; ··· ; e_n ] = [ 1 ; x ; ··· ; x^{n−1} ],

which is

(3.7)    e_j(x) = [ (x − λ_1) ··· (x − λ_{j−1})(x − λ_{j+1}) ··· (x − λ_n) ] / [ (λ_j − λ_1) ··· (λ_j − λ_{j−1})(λ_j − λ_{j+1}) ··· (λ_j − λ_n) ]

based on the n distinct values λ_1, λ_2, . . . , λ_n. Relation (3.5) with (3.7) is a familiar representation for the polynomial dependence of E_j on A, obtained here by employing the special fundamental solutions (3.2) in (1.11). Note that we have the identity

(3.8)    1 = e_1(x) + e_2(x) + ··· + e_n(x),

which comes from the top row of (3.6). Relations (3.1) and (3.7) show that the characteristic polynomial c(x) divides the product e_i(x) e_j(x) for i ≠ j. Hence (3.5) and the Cayley–Hamilton theorem (1.4) give

(3.9)    E_i E_j = 0 (in C^{n×n}) for i ≠ j.

Analogous to (3.8) it also holds (put t = 0 in (3.3)) that

(3.10)    I = E_1 + E_2 + ··· + E_n (in C^{n×n}),

from which (multiply both sides of (3.10) by E_j and use (3.9)) E_j^2 = E_j for j = 1, 2, . . . , n. Thus each E_j in (3.3) is a projection. Moreover, with (3.7), c(x) also divides the product (x − λ_j) e_j(x), so, again by (3.5) and Cayley–Hamilton, (A − λ_j I) E_j = 0 (in C^{n×n}). Hence A E_j = λ_j E_j, or A(E_j v) = λ_j (E_j v) for all v ∈ C^{n×1}, for each j = 1, 2, . . . , n. Thus in the present case of n distinct eigenvalues, (3.10) leads to the spectral decomposition

    A = A E_1 + A E_2 + ··· + A E_n = λ_1 E_1 + λ_2 E_2 + ··· + λ_n E_n,

where E_j = e_j(A) is a projection onto the eigenspace of A associated with the eigenvalue λ_j.
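To make this concrete, here is a short Mathematica check of (3.5)–(3.10); the sketch is our own addition, the test matrix b and the helper name proj are arbitrary choices, and the eigenvalues must be distinct for (3.7) to apply.

    b = {{1, 1, 0}, {0, 2, 1}, {0, 0, 3}};          (* an example matrix with eigenvalues 1, 2, 3 *)
    lams = Eigenvalues[b]; n = Length[b];
    proj[j_] := Dot @@ Table[(b - lams[[k]] IdentityMatrix[n])/(lams[[j]] - lams[[k]]),
        {k, Complement[Range[n], {j}]}]              (* E_j = e_j(B), cf. (3.5) and (3.7) *)
    projs = Table[proj[j], {j, n}];
    {Total[projs], projs[[1]].projs[[2]], Sum[lams[[j]] projs[[j]], {j, n}]}
    (* returns the identity matrix, the zero matrix, and b itself,
       as (3.10), (3.9), and the spectral decomposition predict *)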
4. Remarks on the Case of Repeated Roots. We include a few remarks on the case of repeated eigenvalues. Let

    c(x) = (x − λ_1)^{n_1} (x − λ_2)^{n_2} ··· (x − λ_r)^{n_r}

be the characteristic polynomial of a matrix or of a linear ODE. For each ρ = 1, 2, . . . , r, the characteristic root λ_ρ has multiplicity n_ρ. The degree of the polynomial is n_1 + n_2 + ··· + n_r = n. A basis of solutions to c(D)u = 0 is made up of the functions

    (1/j!) t^j e^{λ_ρ t} for 0 ≤ j ≤ n_ρ − 1 and ρ = 1, 2, . . . , r.

The resulting Wronski matrix will lead to the generalized Vandermonde matrix.

The generalized (or "confluent") Vandermonde matrix T consists of r blocks of size n × n_ρ (ρ = 1, 2, . . . , r), each block being of the form (with ρ suppressed in λ = λ_ρ)

    [ 1          0              0                  ···
      λ          1              0                  ···
      λ^2        2λ             1                  ···
      λ^3        3λ^2           3λ                 ···
      ...        ...            ...
      λ^{n−1}    (n−1)λ^{n−2}   C(n−1, 2)λ^{n−3}   ··· ],

the µνth entry being the binomial coefficient C(µ−1, ν−1) times λ^{µ−ν}. It turns out that this matrix T is the change-of-basis matrix which brings the companion matrix (cf. (1.3))

(4.1)    A = [ 0    1        0        ···  0    0
               0    0        1        ···  0    0
               ...  ...      ...           ...  ...
               0    0        0        ···  0    1
               c_n  c_{n−1}  c_{n−2}  ···  c_2  c_1 ]

into Jordan canonical form J:

(4.2)    A T = T J.

This pretty fact can be found (without proof) on p. 60 of Turnbull and Aitken (1961). A short algebraic proof is included at the end of this paper, but here we give a derivation in the spirit of differential equations.

Let ϕ be the 1 × (n_1 + n_2 + ··· + n_r) row matrix (which we write in terms of r blocks of size 1 × n_ρ, ρ = 1, 2, . . . , r, with only the first block shown)

    ϕ = [ e^{λ_1 t}   t e^{λ_1 t}   (t^2/2!) e^{λ_1 t}   ···   (t^{n_1−1}/(n_1−1)!) e^{λ_1 t}   (r − 1 additional blocks) ].

Entrywise differentiation of the matrix ϕ(t) gives directly

    Dϕ(t) = ϕ(t) diag(J_1, J_2, . . . , J_r) with blocks

    J_ρ = [ λ_ρ  1    0    ···  0
            0    λ_ρ  1    ···  0
            ...  ...  ...       ...
            0    0    0    ···  1
            0    0    0    ···  λ_ρ ]  for ρ = 1, 2, . . . , r.

That is, Dϕ(t) = ϕ(t)J, where the constant matrix J on the right is a matrix of Jordan blocks. Now differentiate Dϕ(t) = ϕ(t)J repeatedly to obtain the rows of

    D [ ϕ ; Dϕ ; D^2 ϕ ; ··· ; D^{n−1} ϕ ] = [ ϕ ; Dϕ ; D^2 ϕ ; ··· ; D^{n−1} ϕ ] J

or, in terms of the Wronski matrix,

(4.3)    D W[ϕ; t] = W[ϕ; t] J.

Return to W[ϕ; t] and differentiate it again, writing the result differently now as a product with a suitable constant matrix on the left,

    D [ ϕ ; Dϕ ; D^2 ϕ ; ··· ; D^{n−2} ϕ ; D^{n−1} ϕ ] = [ 0  1  0  ···  0
                                                           0  0  1  ···  0
                                                           ...           ...
                                                           0  0  0  ···  1
                                                           ∗  ∗  ∗  ···  ∗ ] [ ϕ ; Dϕ ; D^2 ϕ ; ··· ; D^{n−1} ϕ ],

where the bottom row, marked ∗, must still be filled in so that the last entry D^n ϕ is reproduced on the right side. Since every entry of ϕ, and consequently of W[ϕ; t], is a solution of c(D)u = 0, we have (D^n − c_1 D^{n−1} − c_2 D^{n−2} − ··· − c_n)u = 0 and so D^n u = (c_1 D^{n−1} + c_2 D^{n−2} + ··· + c_n)u. Hence we may fill in [ c_n  c_{n−1}  ···  c_2  c_1 ] for the bottom row of the constant matrix on the right side, from which, with (4.1),

(4.4)    D W[ϕ; t] = A W[ϕ; t].

Thus, by (4.3)–(4.4), the companion matrix A of c(x) is related to the Jordan matrix J by

(4.5)    A W[ϕ; t] = W[ϕ; t] J.

But note that the generalized Vandermonde matrix T is just W[ϕ; 0], so putting t = 0 in the previous equation gives the asserted result (4.2): A T = T J, which shows that T is the change-of-basis matrix that brings the companion matrix into the Jordan canonical form J.

In the example discussed earlier with companion matrix (2.1) and Wronski matrix (2.2), one can check that

    [ 0 1 0 ; 0 0 1 ; 12 −16 7 ] · W(t) ≡ W(t) · [ 2 1 0 ; 0 2 0 ; 0 0 3 ],
    where W(t) = [ e^{2t}  t e^{2t}  e^{3t} ; 2e^{2t}  (1+2t)e^{2t}  3e^{3t} ; 4e^{2t}  (4+4t)e^{2t}  9e^{3t} ],

for all t, in agreement with (4.5).
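This check is easy to reproduce in Mathematica; the short sketch below is our own addition, reusing the matrix a and the fundamental solutions of section 2 (the names wr and jord are ours).

    a = {{0, 1, 0}, {0, 0, 1}, {12, -16, 7}};
    phi[t_] = {Exp[2 t], t Exp[2 t], Exp[3 t]};
    wr[t_] = {phi[t], D[phi[t], t], D[phi[t], {t, 2}]};   (* Wronski matrix (2.2) *)
    jord = {{2, 1, 0}, {0, 2, 0}, {0, 0, 3}};             (* the Jordan matrix J of the check above *)
    Simplify[a.wr[t] - wr[t].jord]                        (* zero matrix: (4.5) holds for all t *)
    wr[0]                                                 (* the generalized Vandermonde matrix T *)
    a.wr[0] == wr[0].jord                                 (* (4.2): A T = T J, returns True *)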
        xn−2   xn−2  0 xn−1 xn−1 1 with respect to the indeterminant x and set x = λρ . (Higher derivatives of c(x) vanish at λρ since the roots are repeated.) This will yield the n × nρ block of the generalized Vandermonde matrix. This algebraic proof is tantamount to differentiating solutions to the differential equations with respect to the λρ . Such a procedure can be made rigorous for multiple roots. 5. Another Example. It has been known since the early 1960s with the pioneering work of J. H. Wilkinson (cf. the account in Wilkinson and Reinsch (1971)) that various problems involving repeated (multiple) roots and/or eigenvalues are in general ill conditioned. In particular, in such cases the matrix exponential can be quite sensitive to perturbations in the data. To illustrate this phenomenon we briefly consider a slightly modified version of the Mathematica calculation of section 2 for the 3 × 3 matrix (2.1). We now decrease the (3, 1) entry by a small amount, 0.0001, from 12 to 11.9999, so the matrix (2.1) is replaced with   0 1 0 B= (5.1) 0 0 1. 11.9999 −16 7 MATRIX EXPONENTIALS—ANOTHER APPROACH 705 The previous eigenvalues {2, 2, 3} of (2.1) are replaced for (5.1) with the (approximate) values {1.99005, 2.01005, 2.9999}, so the eigenvalues are now distinct.6 The Mathematica calculation of section 2 based on the algorithm of section 1 easily computes for eBt the following approximate results with entries listed by columns to avoid clutter:   4.0017e2.9999t + 298.538e1.99005t − 301.54e2.01005t col1 (eBt ) ≈  12.0047e2.9999t + 594.105e1.99005t − 606.11e2.01005t  , 36.0129e2.9999t + 1182.3e1.99005t − 1218.31e2.01005t (5.2)  −4.0017e2.9999t − 248.039e1.99005t + 252.04e2.01005t col2 (eBt ) ≈  −12.0047e2.9999t − 493.609e1.99005t + 506.614e2.01005t  , −36.0129e2.9999t − 982.306e1.99005t + 1018.32e2.01005t   1.0004e2.9999t + 49.5092e1.99005t − 50.5096e2.01005t col3 (eBt ) ≈  3.0011e2.9999t + 98.5257e1.99005t − 101.527e2.01005t  . 9.003e2.9999t + 196.071e1.99005t − 204.074e2.01005t  The matrix eBt can also be computed in this case by the standard elementary method based on diagonalizing the matrix, and the results agree with the present answer (5.2). One sees with (5.2) that a small change in the data from (2.1) to (5.1) results in large changes for the matrix exponential as compared with the results of section 2. This example (5.1)–(5.2) provides a nice point of contact with the rich subject of numerical linear algebra. The thrust of the present paper is a conceptual understanding of the algorithm described in section 1. We do not intend to provide “the” ultimate algorithm for computing matrix exponentials—that is best left to numerical analysts, who guide us very well through the subtleties of numerical linear algebra, where one can often produce examples for which an algorithm may not be suited in finite precision arithmetic due to extreme sensitivity to changes in data. 6. Concluding Remarks. The classical paper of Moler and Van Loan (1978) treated 19 ways to compute matrix exponentials. The algorithm of Putzer (cf. section 9.12 of Apostol (1997)) can also be used to calculate matrix exponentials in terms of polynomials in A but without the simplicity and elegance of the method discussed here. This can be appreciated by beginning students of differential equations. Others (cf. 
6. Concluding Remarks. The classical paper of Moler and Van Loan (1978) treated 19 ways to compute matrix exponentials. The algorithm of Putzer (cf. section 9.12 of Apostol (1997)) can also be used to calculate matrix exponentials in terms of polynomials in A, but without the simplicity and elegance of the method discussed here, which can be appreciated by beginning students of differential equations. Others (cf. Ziebur (1970), Fulmer (1975), Leonard (1996), and Elaydi and Harris (1998)) have discussed portions of the method described here, but these references did not develop this method with the ease of use shown to us by Harris.

Acknowledgments. Fillmore and Smith thank Robert Sacker, Louis Grimm, and the referees for their helpful comments on this paper, including the valuable suggestion of one referee to mention the sensitivity of the matrix exponential to changes in the data along with consequent pitfalls when using finite precision arithmetic.

REFERENCES

T. M. Apostol (1997), Linear Algebra—A First Course, with Applications to Differential Equations, Wiley, New York.
J. G. Broida and S. G. Williamson (1989), Comprehensive Introduction to Linear Algebra, Addison-Wesley, Reading, MA.
R. Bronson (1989), 2500 Solved Problems in Differential Equations, 2nd ed., Schaum's Solved Problems Series, McGraw-Hill, New York.
S. N. Elaydi and W. A. Harris, Jr. (1998), On the computation of A^n, SIAM Rev., 40, pp. 965–971.
E. P. Fulmer (1975), Computation of the matrix exponential, Amer. Math. Monthly, 82, pp. 156–159.
I. E. Leonard (1996), The matrix exponential, SIAM Rev., 38, pp. 507–512.
C. Moler and C. Van Loan (1978), Nineteen dubious ways to compute the exponential of a matrix, SIAM Rev., 20, pp. 801–836.
H. W. Turnbull and A. C. Aitken (1961), An Introduction to the Theory of Canonical Matrices, Dover, New York.
J. H. Wilkinson and C. Reinsch (1971), Linear Algebra, Handbook for Automatic Computation 2, Springer-Verlag, Berlin.
A. D. Ziebur (1970), On determining the structure of A by analyzing e^{At}, SIAM Rev., 12, pp. 98–102.