Bauer–Fike theorem

In mathematics, the Bauer–Fike theorem is a standard result in the perturbation theory of the eigenvalues of a complex-valued diagonalizable matrix. In substance, it gives an absolute upper bound on the deviation of an eigenvalue of the perturbed matrix from a suitably chosen eigenvalue of the exact matrix. Informally, it says that the sensitivity of the eigenvalues is bounded by the condition number of the matrix of eigenvectors.

The setup

In what follows we assume that:

$A \in \mathbb{C}^{n,n}$ is a diagonalizable matrix;
$V \in \mathbb{C}^{n,n}$ is the non-singular eigenvector matrix such that $A = V \Lambda V^{-1}$, where $\Lambda$ is a diagonal matrix.
If $X \in \mathbb{C}^{n,n}$ is invertible, its condition number in the $p$-norm is denoted by $\kappa_p(X)$ and defined by:

\kappa _{p}(X)=\|X\|_{p}\left\|X^{-1}\right\|_{p}.
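
As a small worked illustration (the matrix below is chosen here for convenience and does not come from the source), take $p = \infty$, for which the induced norm is the largest absolute row sum. Then:

V={\begin{pmatrix}1&1\\0&1\end{pmatrix}},\qquad V^{-1}={\begin{pmatrix}1&-1\\0&1\end{pmatrix}},\qquad \kappa _{\infty }(V)=\|V\|_{\infty }\left\|V^{-1}\right\|_{\infty }=2\cdot 2=4.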

Theorem (Friedrich L. Bauer, C. T. Fike – 1960)

Let $\mu$ be an eigenvalue of $A + \delta A$. Then there exists $\lambda \in \sigma(A)$ such that:

|\lambda -\mu |\leq \kappa _{p}(V)\|\delta A\|_{p}
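
The statement can be checked numerically on a small case. The following is a minimal sketch, assuming a hand-picked 2×2 matrix, the infinity-norm ($p = \infty$), and helper functions (`eig2`, `inv2`, `norm_inf`) that are written here for illustration only; none of these choices come from the source.

```python
import cmath

def eig2(M):
    """Eigenvalues of a 2x2 matrix from its characteristic polynomial."""
    (a, b), (c, d) = M
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return [(tr - disc) / 2, (tr + disc) / 2]

def inv2(M):
    """Inverse of a non-singular 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def norm_inf(M):
    """Induced infinity-norm: the largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in M)

# Illustrative data (an assumption, not from the source): A = V diag(1, 3) V^{-1}.
A = [[1.0, 2.0], [0.0, 3.0]]
V = [[1.0, 1.0], [0.0, 1.0]]      # columns are eigenvectors of A
dA = [[0.0, 0.0], [0.01, 0.0]]    # the perturbation delta A

A_pert = [[A[i][j] + dA[i][j] for j in range(2)] for i in range(2)]

kappa = norm_inf(V) * norm_inf(inv2(V))   # kappa_inf(V)
bound = kappa * norm_inf(dA)              # right-hand side of the theorem

# Distance from each perturbed eigenvalue mu to the nearest eigenvalue of A.
deviations = [min(abs(lam - mu) for lam in eig2(A)) for mu in eig2(A_pert)]

for dev in deviations:
    print(f"deviation {dev:.5f} <= bound {bound:.5f}: {dev <= bound}")
```

Any other induced $p$-norm could be used in place of the infinity-norm; it was chosen here only because it is easy to evaluate without a linear algebra library.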

Proof

If $\mu \in \sigma(A)$, we can choose $\lambda = \mu$ and the claim holds trivially, since the left-hand side vanishes while the right-hand side is non-negative.

So suppose $\mu \notin \sigma(A)$. Since $\mu$ is an eigenvalue of $A + \delta A$, we have $\det(A + \delta A - \mu I) = 0$ and so

{\begin{aligned}0&=\det(A+\delta A-\mu I)\\&=\det(V^{-1})\det(A+\delta A-\mu I)\det(V)\\&=\det \left(V^{-1}(A+\delta A-\mu I)V\right)\\&=\det \left(V^{-1}AV+V^{-1}\delta AV-V^{-1}\mu IV\right)\\&=\det \left(\Lambda +V^{-1}\delta AV-\mu I\right)\\&=\det(\Lambda -\mu I)\det \left((\Lambda -\mu I)^{-1}V^{-1}\delta AV+I\right)\\\end{aligned}}

However, our assumption $\mu \notin \sigma(A)$ implies that $\det(\Lambda - \mu I) \neq 0$, and therefore we can write:

\det \left((\Lambda -\mu I)^{-1}V^{-1}\delta AV+I\right)=0.

This reveals $-1$ to be an eigenvalue of

(\Lambda -\mu I)^{-1}V^{-1}\delta AV.

Since every induced $p$-norm is a consistent matrix norm, we have $|\lambda| \leq \|M\|_p$ for every eigenvalue $\lambda$ of any square matrix $M$. Applied to the eigenvalue $-1$ of the matrix above, this gives:

|-1|=1\leq \left\|(\Lambda -\mu I)^{-1}V^{-1}\delta AV\right\|_{p}\leq \left\|(\Lambda -\mu I)^{-1}\right\|_{p}\left\|V^{-1}\right\|_{p}\|V\|_{p}\|\delta A\|_{p}=\left\|(\Lambda -\mu I)^{-1}\right\|_{p}\ \kappa _{p}(V)\|\delta A\|_{p}

But $(\Lambda - \mu I)^{-1}$ is a diagonal matrix, the $p$-norm of which is easily computed:

\left\|\left(\Lambda -\mu I\right)^{-1}\right\|_{p}\ =\max _{\|\mathbf {x} \|_{p}\neq 0}{\frac {\left\|\left(\Lambda -\mu I\right)^{-1}\mathbf {x} \right\|_{p}}{\|\mathbf {x} \|_{p}}}=\max _{\lambda \in \sigma (A)}{\frac {1}{|\lambda -\mu |}}\ ={\frac {1}{\min _{\lambda \in \sigma (A)}|\lambda -\mu |}}

whence:

\min _{\lambda \in \sigma (A)}|\lambda -\mu |\leq \ \kappa _{p}(V)\|\delta A\|_{p}.

Theorem (Friedrich L. Bauer, C. T. Fike – 1960) (alternative statement)

The theorem can also be reformulated to better suit numerical methods. In practical eigensystem problems, one often has an exact matrix $A$ but knows only an approximate eigenvalue-eigenvector couple, $({\tilde {\lambda }},{\tilde {\mathbf {v} }})$, and needs to bound the error. The following version is useful in that situation.

Let $({\tilde {\lambda }},{\tilde {\mathbf {v} }})$ be an approximate eigenvalue-eigenvector couple with ${\tilde {\mathbf {v} }}\neq \mathbf {0}$, and let $\mathbf {r} =A{\tilde {\mathbf {v} }}-{\tilde {\lambda }}{\tilde {\mathbf {v} }}$ be its residual. Then there exists $\lambda \in \sigma(A)$ such that:

\left|\lambda -{\tilde {\lambda }}\right|\leq \kappa _{p}(V){\frac {\|\mathbf {r} \|_{p}}{\left\|\mathbf {\tilde {v}} \right\|_{p}}}
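
The residual form lends itself to a direct numerical check. Below is a minimal sketch, assuming the infinity-norm and a hand-picked triangular matrix whose eigenvalues and eigenvector matrix are known exactly; the data and the names `lam_approx` and `v_approx` are illustrative assumptions, not part of the source.

```python
def norm_inf_mat(M):
    """Induced infinity-norm of a matrix: the largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in M)

def norm_inf_vec(x):
    """Infinity-norm of a vector: the largest absolute entry."""
    return max(abs(t) for t in x)

# Illustrative data (an assumption): A is upper triangular, so its
# eigenvalues are the diagonal entries 1 and 3.
A = [[1.0, 2.0], [0.0, 3.0]]
spectrum = [1.0, 3.0]
V = [[1.0, 1.0], [0.0, 1.0]]        # columns are eigenvectors of A
V_inv = [[1.0, -1.0], [0.0, 1.0]]   # inverse of V, written out by hand

# A rough guess at the eigenpair belonging to the eigenvalue 3.
lam_approx = 2.9
v_approx = [1.0, 0.9]

# Residual r = A v~ - lambda~ v~.
Av = [sum(A[i][j] * v_approx[j] for j in range(2)) for i in range(2)]
r = [Av[i] - lam_approx * v_approx[i] for i in range(2)]

kappa = norm_inf_mat(V) * norm_inf_mat(V_inv)
bound = kappa * norm_inf_vec(r) / norm_inf_vec(v_approx)

# True distance from the approximate eigenvalue to the spectrum of A.
error = min(abs(lam - lam_approx) for lam in spectrum)
print(f"error {error:.4f} <= bound {bound:.4f}: {error <= bound}")
```

In practice the spectrum is unknown and only the right-hand side can be evaluated; the exact eigenvalues appear here solely to confirm that the inequality holds.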

Proof

We can suppose ${\tilde {\lambda }}\notin \sigma (A)$ (otherwise, we can choose $\lambda ={\tilde {\lambda }}$ and the theorem is proven, since the left-hand side of the bound is then zero). Then $(A-{\tilde {\lambda }}I)^{-1}$ exists, so we can write:

{\tilde {\mathbf {v} }}=\left(A-{\tilde {\lambda }}I\right)^{-1}\mathbf {r} =V\left(D-{\tilde {\lambda }}I\right)^{-1}V^{-1}\mathbf {r}

since $A$ is diagonalizable (here $D$ denotes the diagonal matrix $\Lambda$ of the setup); taking the $p$-norm of both sides, we obtain:

\left\|{\tilde {\mathbf {v} }}\right\|_{p}=\left\|V\left(D-{\tilde {\lambda }}I\right)^{-1}V^{-1}\mathbf {r} \right\|_{p}\leq \|V\|_{p}\left\|\left(D-{\tilde {\lambda }}I\right)^{-1}\right\|_{p}\left\|V^{-1}\right\|_{p}\|\mathbf {r} \|_{p}=\kappa _{p}(V)\left\|\left(D-{\tilde {\lambda }}I\right)^{-1}\right\|_{p}\|\mathbf {r} \|_{p}.

But, since $(D-{\tilde {\lambda }}I)^{-1}$ is a diagonal matrix, its $p$-norm is easily computed:

\|(D-{\tilde {\lambda }}I)^{-1}\|_{p}=\max _{\|\mathbf {x} \|_{p}\neq 0}{\frac {\|(D-{\tilde {\lambda }}I)^{-1}\mathbf {x} \|_{p}}{\|\mathbf {x} \|_{p}}}=\max _{\lambda \in \sigma (A)}{\frac {1}{\left|\lambda -{\tilde {\lambda }}\right|}}={\frac {1}{\min _{\lambda \in \sigma (A)}\left|\lambda -{\tilde {\lambda }}\right|}}

whence:

\min _{\lambda \in \sigma (A)}\left|\lambda -{\tilde {\lambda }}\right|\leq \kappa _{p}(V){\frac {\|\mathbf {r} \|_{p}}{\left\|{\tilde {\mathbf {v} }}\right\|_{p}}}.

The Bauer–Fike theorem, in both versions, yields an absolute bound. The following corollary, which requires the non-singularity of $A$ in addition to all the hypotheses of the Bauer–Fike theorem, is useful whenever a relative bound is needed.

Corollary

Let $\mu$ be an eigenvalue of $A + \delta A$. Then there exists $\lambda \in \sigma(A)$ such that:

{\frac {|\lambda -\mu |}{|\lambda |}}\leq \kappa _{p}(V)\left\|A^{-1}\delta A\right\|_{p}

Note. $\|A^{-1}\delta A\|_{p}$ can be formally viewed as the "relative variation of $A$", just as $|\lambda -\mu |/|\lambda |$ is the relative variation of $\lambda$.
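
The relative bound can be checked in the same way as the absolute one. The following is a minimal sketch, assuming the infinity-norm and a hand-picked non-singular 2×2 matrix; the data and the helpers (`eig2`, `inv2`, `matmul2`, `norm_inf`) are illustrative assumptions rather than anything taken from the source.

```python
import cmath

def eig2(M):
    """Eigenvalues of a 2x2 matrix from its characteristic polynomial."""
    (a, b), (c, d) = M
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return [(tr - disc) / 2, (tr + disc) / 2]

def inv2(M):
    """Inverse of a non-singular 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul2(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def norm_inf(M):
    """Induced infinity-norm: the largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in M)

# Illustrative data (an assumption): A is non-singular with eigenvalues 1 and 3.
A = [[1.0, 2.0], [0.0, 3.0]]
V = [[1.0, 1.0], [0.0, 1.0]]      # columns are eigenvectors of A
dA = [[0.0, 0.0], [0.01, 0.0]]    # the perturbation delta A

A_pert = [[A[i][j] + dA[i][j] for j in range(2)] for i in range(2)]

kappa = norm_inf(V) * norm_inf(inv2(V))
bound = kappa * norm_inf(matmul2(inv2(A), dA))   # kappa(V) * ||A^{-1} dA||

# Smallest relative distance |lambda - mu| / |lambda| for each perturbed mu.
rel_devs = [min(abs(lam - mu) / abs(lam) for lam in eig2(A))
            for mu in eig2(A_pert)]

for dev in rel_devs:
    print(f"relative deviation {dev:.5f} <= bound {bound:.5f}: {dev <= bound}")
```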

Proof

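Since $\mu$ is an eigenvalue of $A + \delta A$, there is a non-zero vector $\mathbf{v}$ with $(A + \delta A)\mathbf{v} = \mu \mathbf{v}$. As $\det(A) \neq 0$, multiplying this equation by $-A^{-1}$ from the left gives: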

-A^{-1}(A+\delta A)\mathbf {v} =-\mu A^{-1}\mathbf {v} .

If we set:

{\tilde {A}}=\mu A^{-1},\qquad {\widetilde {\delta A}}=-A^{-1}\delta A

then we have:

\left({\tilde {A}}+{\widetilde {\delta A}}-I\right)\mathbf {v} =\mathbf {0}

which means that ${\tilde {\mu }}=1$ is an eigenvalue of ${\tilde {A}}+{\widetilde {\delta A}}$, with $\mathbf{v}$ an eigenvector. Now, the eigenvalues of ${\tilde {A}}$ are ${\frac {\mu }{\lambda _{i}}}$, where the $\lambda _{i}$ are the eigenvalues of $A$, while its eigenvector matrix is the same as that of $A$. Applying the Bauer–Fike theorem to the matrix ${\tilde {A}}+{\widetilde {\delta A}}$ and to its eigenvalue ${\tilde {\mu }}=1$, we obtain:

\min _{\lambda \in \sigma (A)}\left|{\frac {\mu }{\lambda }}-1\right|=\min _{\lambda \in \sigma (A)}{\frac {|\lambda -\mu |}{|\lambda |}}\leq \kappa _{p}(V)\|A^{-1}\delta A\|_{p}

Remark

If $A$ is normal, $V$ can be chosen to be a unitary matrix, so that $\|V\|_{2}=\|V^{-1}\|_{2}=1$ and hence $\kappa _{2}(V)=1$.

The Bauer–Fike theorem then becomes:

\exists \lambda \in \sigma (A):\quad |\lambda -\mu |\leq \|\delta A\|_{2}

Or in alternate formulation:

\exists \lambda \in \sigma (A):\quad \left|\lambda -{\tilde {\lambda }}\right|\leq {\frac {\|\mathbf {r} \|_{2}}{\left\|{\tilde {\mathbf {v} }}\right\|_{2}}}

which holds in particular if $A$ is a Hermitian matrix. In this case, however, a much stronger result holds, known as Weyl's theorem on eigenvalues. In the Hermitian case one can also restate the Bauer–Fike theorem as saying that the map $A \mapsto \sigma(A)$, which sends a matrix to its spectrum, is a non-expansive function with respect to the Hausdorff distance on the set of compact subsets of $\mathbb{C}$.
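
The normal case, where the bound reduces to $\|\delta A\|_{2}$, can be illustrated with a real symmetric matrix. The following is a minimal sketch, assuming hand-picked 2×2 data and two helpers (`eig2`, `norm2`) written here for illustration; the perturbation is deliberately not symmetric, since only $A$ needs to be normal.

```python
import cmath
import math

def eig2(M):
    """Eigenvalues of a 2x2 matrix from its characteristic polynomial."""
    (a, b), (c, d) = M
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return [(tr - disc) / 2, (tr + disc) / 2]

def norm2(M):
    """Spectral norm of a real 2x2 matrix: square root of the largest
    eigenvalue of the Gram matrix M^T M."""
    gram = [[sum(M[k][i] * M[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]
    return math.sqrt(max(e.real for e in eig2(gram)))

# Illustrative data (an assumption): A is real symmetric, hence normal,
# with eigenvalues 1 and 3.
A = [[2.0, 1.0], [1.0, 2.0]]
dA = [[0.0, 0.01], [0.0, 0.0]]

A_pert = [[A[i][j] + dA[i][j] for j in range(2)] for i in range(2)]

bound = norm2(dA)   # kappa_2(V) = 1, so the bound is just ||dA||_2

# Distance from each perturbed eigenvalue mu to the nearest eigenvalue of A.
deviations = [min(abs(lam - mu) for lam in eig2(A)) for mu in eig2(A_pert)]

for dev in deviations:
    print(f"deviation {dev:.5f} <= bound {bound:.5f}: {dev <= bound}")
```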

References

F. L. Bauer and C. T. Fike. Norms and exclusion theorems. Numer. Math. 2 (1960), 137–141.
S. C. Eisenstat and I. C. F. Ipsen. Three absolute perturbation bounds for matrix eigenvalues imply relative bounds. SIAM J. Matrix Anal. Appl. 20 (1998), no. 1, 149–158.