Krylov Methods
Motivation
Every method performs better for some classes of problems ...
Direct methods
Jacobi/Gauss-Seidel
Krylov methods
Multigrid methods
Fast multipole methods
but their features can be combined (hybrid, preconditioning)
The advantages of Krylov methods depend on what they are compared with
For the simple iteration $x_{k+1} = x_k + r_k$ with residual $r_k := b - Ax_k$, the residuals satisfy $r_j = (I-A)^j r_0$, hence

$$x_k = x_0 + r_0 + r_1 + \cdots + r_{k-1} = x_0 + r_0 + (I-A)r_0 + \cdots + (I-A)^{k-1}r_0 \in x_0 + \mathrm{span}\left\{r_0, Ar_0, \ldots, A^{k-1}r_0\right\},$$

i.e.

$$x_k = x_0 + \sum_{j=0}^{k-1} c_j A^j r_0 \in x_0 + \mathcal{K}_k(A, r_0).$$
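A quick numerical sanity check of this containment (a minimal numpy sketch; the matrix, right-hand side and iteration count are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 8, 4
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))   # I - A is a contraction here
b = rng.standard_normal(n)

x0 = np.zeros(n)
x = x0.copy()
for _ in range(k):                                   # simple iteration x <- x + r
    x = x + (b - A @ x)

r0 = b - A @ x0
K = np.column_stack([np.linalg.matrix_power(A, j) @ r0 for j in range(k)])

# x_k - x_0 should lie (numerically) in the span of [r0, A r0, ..., A^{k-1} r0]
coeffs, *_ = np.linalg.lstsq(K, x - x0, rcond=None)
print("distance of x_k - x_0 to K_k(A, r0):", np.linalg.norm(K @ coeffs - (x - x0)))
```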
Arnoldi process
There always exists an orthogonal matrix $U$ ($U^TU = I_n$) such that

$$U^TAU =: H = \begin{pmatrix} h_{1,1} & h_{1,2} & \cdots & h_{1,n} \\ h_{2,1} & h_{2,2} & \ddots & \vdots \\ & \ddots & \ddots & h_{n-1,n} \\ & & h_{n,n-1} & h_{n,n} \end{pmatrix}$$
Its columns can be computed recursively: with $h_{j,k} := u_j^TAu_k$,

$$\tilde u_2 := Au_1 - u_1h_{1,1}, \qquad h_{2,1} := \|\tilde u_2\|_2, \qquad u_2 := \tilde u_2/h_{2,1},$$

and in general

$$\tilde u_{k+1} := Au_k - \sum_{j=1}^{k} u_jh_{j,k}, \qquad h_{k+1,k} := \|\tilde u_{k+1}\|_2, \qquad u_{k+1} := \tilde u_{k+1}/h_{k+1,k}.$$
After $k$ steps this yields the Arnoldi relation

$$AU_k = U_kH_k + \tilde u_{k+1}e_k^T, \qquad U_k^TU_k = I_k, \qquad \mathrm{Im}[U_k] = \mathcal{K}_k(A, u_1),$$

or equivalently $AU_k = U_{k+1}\bar H_k$ with $\bar H_k \in \mathbb{R}^{(k+1)\times k}$ upper Hessenberg.
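A minimal dense implementation of this recurrence (modified Gram-Schmidt variant; function name, test matrix and dimensions are illustrative, not from the slides):

```python
import numpy as np

def arnoldi(A, r0, k):
    """k Arnoldi steps with modified Gram-Schmidt.
    Returns U (n x (k+1), orthonormal columns) and H ((k+1) x k, upper
    Hessenberg) such that A @ U[:, :k] = U @ H."""
    n = len(r0)
    U = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    U[:, 0] = r0 / np.linalg.norm(r0)
    for j in range(k):
        w = A @ U[:, j]
        for i in range(j + 1):              # orthogonalize against u_1, ..., u_{j+1}
            H[i, j] = U[:, i] @ w
            w -= H[i, j] * U[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] == 0.0:              # lucky breakdown: invariant subspace found
            return U[:, :j + 1], H[:j + 2, :j + 1]
        U[:, j + 1] = w / H[j + 1, j]
    return U, H

# check the Arnoldi relation and orthonormality on a random example
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 50))
U, H = arnoldi(A, rng.standard_normal(50), 10)
print(np.linalg.norm(A @ U[:, :10] - U @ H))      # ~ 1e-14
print(np.linalg.norm(U.T @ U - np.eye(11)))       # ~ 1e-15
```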
Galerkin condition (FOM): look for $x_k - x_0 = U_ky \in \mathcal{K}_k(A, r_0)$ such that

$$U_k^T(b - Ax_k) = 0,$$

which amounts to solving the small Hessenberg system

$$\begin{pmatrix} h_{1,1} & h_{1,2} & \cdots & h_{1,k} \\ h_{2,1} & h_{2,2} & \ddots & \vdots \\ & \ddots & \ddots & h_{k-1,k} \\ & & h_{k,k-1} & h_{k,k} \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_k \end{pmatrix} = \begin{pmatrix} \|r_0\|_2 \\ 0 \\ \vdots \\ 0 \end{pmatrix}.$$
Minimum residual (GMRES): since $b - Ax_k \in \mathrm{Im}[U_{k+1}]$,

$$\|b - Ax_k\|_2 = \|U_{k+1}^Tr_0 - U_{k+1}^TAU_ky\|_2,$$

which is minimized by the least-squares solution of

$$\begin{pmatrix} h_{1,1} & h_{1,2} & \cdots & h_{1,k} \\ h_{2,1} & h_{2,2} & \ddots & \vdots \\ & \ddots & \ddots & h_{k-1,k} \\ & & h_{k,k-1} & h_{k,k} \\ & & & h_{k+1,k} \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_k \end{pmatrix} \approx \begin{pmatrix} \|r_0\|_2 \\ 0 \\ \vdots \\ 0 \end{pmatrix}.$$
One can prove that $\|b - Ax_k\|_2 \le \|b - Ax_{k-1}\|_2$, but the iteration may stall ...
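A minimal dense sketch of GMRES built directly on this least-squares formulation (without the Givens-rotation updating used in practice; matrix, sizes and step count are illustrative):

```python
import numpy as np

def gmres_ls(A, b, x0, k):
    """Sketch of GMRES: k Arnoldi steps, then solve the small (k+1) x k
    Hessenberg least-squares problem for y and set x_k = x0 + U_k y."""
    r0 = b - A @ x0
    beta = np.linalg.norm(r0)
    n = len(b)
    U = np.zeros((n, k + 1)); H = np.zeros((k + 1, k))
    U[:, 0] = r0 / beta
    for j in range(k):
        w = A @ U[:, j]
        for i in range(j + 1):
            H[i, j] = U[:, i] @ w
            w -= H[i, j] * U[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        U[:, j + 1] = w / H[j + 1, j]
    rhs = np.zeros(k + 1); rhs[0] = beta         # ||r0||_2 e_1
    y, *_ = np.linalg.lstsq(H, rhs, rcond=None)  # min_y || rhs - H y ||_2
    return x0 + U[:, :k] @ y

rng = np.random.default_rng(2)
n = 100
A = np.eye(n) + 0.3 * rng.standard_normal((n, n)) / np.sqrt(n)
b = rng.standard_normal(n)
x = gmres_ls(A, b, np.zeros(n), 30)
print(np.linalg.norm(b - A @ x))                 # small residual after 30 steps
```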
GMRES vs FOM
Denote the residual norms $\|b - Ax_k\|_2$ of FOM and GMRES by $\rho_k^F$ and $\rho_k^G$, respectively. Then

$$\left(\rho_k^G\right)^{-2} = \left(\rho_k^F\right)^{-2} + \left(\rho_{k-1}^G\right)^{-2},$$

and hence $\rho_k^F \ge \rho_k^G$.
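A small numerical check of this identity, repeating the Arnoldi loop from the sketch above (test matrix and dimensions are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 60, 12
A = np.eye(n) + 0.5 * rng.standard_normal((n, n)) / np.sqrt(n)
b = rng.standard_normal(n)

beta = np.linalg.norm(b)                       # x0 = 0, so r0 = b
U = np.zeros((n, m + 1)); H = np.zeros((m + 1, m))
U[:, 0] = b / beta
for j in range(m):
    w = A @ U[:, j]
    for i in range(j + 1):
        H[i, j] = U[:, i] @ w
        w -= H[i, j] * U[:, i]
    H[j + 1, j] = np.linalg.norm(w)
    U[:, j + 1] = w / H[j + 1, j]

rho_G = [beta]                                 # GMRES residual norms, rho_0^G = ||r0||
rho_F = []                                     # FOM residual norms
for k in range(1, m + 1):
    e1 = np.zeros(k + 1); e1[0] = beta
    yF = np.linalg.solve(H[:k, :k], e1[:k])
    rho_F.append(abs(H[k, k - 1] * yF[-1]))    # FOM residual: |h_{k+1,k} * y_k|
    yG, *_ = np.linalg.lstsq(H[:k + 1, :k], e1, rcond=None)
    rho_G.append(np.linalg.norm(e1 - H[:k + 1, :k] @ yG))

for k in range(1, m + 1):                      # the two columns should agree
    print(k, rho_G[k] ** -2, rho_F[k - 1] ** -2 + rho_G[k - 1] ** -2)
```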
Stalling
Consider the system

$$\begin{pmatrix} 0 & & & 1 \\ 1 & 0 & & \\ & \ddots & \ddots & \\ & & 1 & 0 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}.$$

Then the residual norm histories are

$$\rho^G = (1, \ldots, 1, 0), \qquad \rho^F = (\infty, \ldots, \infty, 0),$$

i.e. GMRES makes no progress until step $n$, and the FOM iterates do not even exist before step $n$ (the projected Hessenberg matrices are singular).
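This stalling is easy to reproduce numerically; the sketch below computes the GMRES residual norm at step $k$ directly as the distance from $b$ to $A\,\mathcal{K}_k(A, r_0)$ (the small dense size is only for illustration):

```python
import numpy as np

n = 8
A = np.zeros((n, n))
A[1:, :-1] = np.eye(n - 1)        # ones on the subdiagonal
A[0, -1] = 1.0                    # cyclic shift: A e_j = e_{j+1 (mod n)}
b = np.zeros(n); b[0] = 1.0       # right-hand side e_1, x0 = 0

for k in range(1, n + 1):
    K = np.column_stack([np.linalg.matrix_power(A, j) @ b for j in range(k)])
    y, *_ = np.linalg.lstsq(A @ K, b, rcond=None)
    print(k, np.linalg.norm(b - A @ K @ y))   # stays at 1 for k < n, ~0 at k = n
```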
Lanczos process
For $A = A^T$ there always exists an orthogonal matrix $U$ ($U^TU = I_n$) such that

$$U^TAU =: T = \begin{pmatrix} \alpha_1 & \beta_2 & & \\ \beta_2 & \alpha_2 & \ddots & \\ & \ddots & \ddots & \beta_n \\ & & \beta_n & \alpha_n \end{pmatrix}$$

is tridiagonal, and the Arnoldi recurrence reduces to a three-term recurrence:

$$\tilde u_{k+1} := Au_k - u_k\alpha_k - u_{k-1}\beta_k, \qquad \alpha_k := u_k^TAu_k, \qquad \beta_{k+1} := \|\tilde u_{k+1}\|_2, \qquad u_{k+1} := \tilde u_{k+1}/\beta_{k+1}.$$

The projected $(k+1)\times k$ system then becomes tridiagonal:

$$\begin{pmatrix} \alpha_1 & \beta_2 & & \\ \beta_2 & \alpha_2 & \ddots & \\ & \ddots & \ddots & \beta_k \\ & & \beta_k & \alpha_k \\ & & & \beta_{k+1} \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_k \end{pmatrix} \approx \begin{pmatrix} \|r_0\|_2 \\ 0 \\ \vdots \\ 0 \end{pmatrix}.$$
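A minimal sketch of this three-term recurrence, without the re-orthogonalization discussed later (test matrix and sizes are illustrative):

```python
import numpy as np

def lanczos(A, r0, k):
    """k Lanczos steps for symmetric A (no re-orthogonalization).
    Returns U (n x (k+1)), alpha (k) and beta (k+1, beta[0] unused)."""
    n = len(r0)
    U = np.zeros((n, k + 1))
    alpha = np.zeros(k)
    beta = np.zeros(k + 1)
    U[:, 0] = r0 / np.linalg.norm(r0)
    for j in range(k):
        w = A @ U[:, j]
        alpha[j] = U[:, j] @ w
        w -= alpha[j] * U[:, j]
        if j > 0:
            w -= beta[j] * U[:, j - 1]
        beta[j + 1] = np.linalg.norm(w)
        U[:, j + 1] = w / beta[j + 1]
    return U, alpha, beta

rng = np.random.default_rng(4)
B = rng.standard_normal((200, 200)); A = B + B.T       # symmetric test matrix
U, alpha, beta = lanczos(A, rng.standard_normal(200), 20)
T = np.diag(alpha) + np.diag(beta[1:20], 1) + np.diag(beta[1:20], -1)
print(np.linalg.norm(U[:, :20].T @ A @ U[:, :20] - T))  # small (equals T_k in exact arithmetic)
```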
Work and storage after $k$ steps (problem size $n$; each method needs one product $Ax$ per step):

             Arnoldi   Lanczos   Conj. Grad.
  orthog.    2k^2 n    9kn       10kn
  storage    kn        3n        4n

This clearly shows the need for other approaches for unsymmetric $A$ if $k$ gets large:
Partial orthogonalization (IOM, ...)
Restarts (FOM(m), GMRES(m), ... )
Consider $A^TAx = A^Tb$, or

$$\begin{pmatrix} I & A \\ A^T & 0 \end{pmatrix} \begin{pmatrix} r \\ x \end{pmatrix} = \begin{pmatrix} b \\ 0 \end{pmatrix}.$$
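For example, conjugate gradients (CG, see the recap below) can be applied to the normal equations using only products with $A$ and $A^T$, never forming $A^TA$ explicitly; a sketch with SciPy (matrix and sizes are illustrative):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(5)
m, n = 120, 80
A = rng.standard_normal((m, n))          # unsymmetric (here even rectangular)
b = rng.standard_normal(m)

# A^T A is symmetric positive definite, so plain CG applies; the operator
# below only ever multiplies by A and A^T.
normal_op = LinearOperator((n, n), matvec=lambda x: A.T @ (A @ x))
x, info = cg(normal_op, A.T @ b)
print(info, np.linalg.norm(A.T @ (b - A @ x)))   # normal-equations residual
```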
For unsymmetric $A$ one can instead build two bases $V$ and $W$ with $W^TV = I_n$ (two-sided Lanczos) such that

$$W^TAV =: T = \begin{pmatrix} \alpha_1 & \gamma_2 & & \\ \beta_2 & \alpha_2 & \ddots & \\ & \ddots & \ddots & \gamma_n \\ & & \beta_n & \alpha_n \end{pmatrix}$$

is tridiagonal.
After $k$ steps one has

$$AV_k = V_kT_k + \tilde v_{k+1}e_k^T, \qquad A^TW_k = W_kT_k^T + \tilde w_{k+1}e_k^T,$$

$$W_k^TAV_k = T_k, \qquad W_k^TV_k = I_k.$$
The recursion can be continued as long as $\beta_{k+1}\gamma_{k+1} = \tilde w_{k+1}^T\tilde v_{k+1} \neq 0$. Since

$$\mathcal{K}_k(T, e_1) = \mathrm{Im}\begin{pmatrix} I_k \\ 0 \end{pmatrix} \quad\text{and}\quad \mathcal{K}_k(T^T, e_1) = \mathrm{Im}\begin{pmatrix} I_k \\ 0 \end{pmatrix},$$

we have that

$$\mathcal{K}_k(A, v_1) = \mathrm{Im}[V_k], \qquad \mathcal{K}_k(A^T, w_1) = \mathrm{Im}[W_k],$$

because (using $A = VTW^T$ and $A^T = WT^TV^T$)

$$\left[Ve_1,\; (VTW^T)Ve_1,\; \ldots,\; (VT^{k-1}W^T)Ve_1\right] = V\left[e_1,\; Te_1,\; \ldots,\; T^{k-1}e_1\right],$$

$$\left[We_1,\; (WT^TV^T)We_1,\; \ldots,\; (W(T^T)^{k-1}V^T)We_1\right] = W\left[e_1,\; T^Te_1,\; \ldots,\; (T^T)^{k-1}e_1\right].$$
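A minimal sketch of the two-sided Lanczos recurrence behind these relations (no look-ahead, so it simply aborts on a serious breakdown; the normalization $\beta_{k+1}\gamma_{k+1} = \tilde w_{k+1}^T\tilde v_{k+1}$ and all test data are illustrative choices):

```python
import numpy as np

def bi_lanczos(A, v1, w1, k):
    """Two-sided Lanczos sketch: builds V, W with W^T V = I and W^T A V
    tridiagonal (diagonal alpha, subdiagonal beta, superdiagonal gamma)."""
    n = len(v1)
    V = np.zeros((n, k + 1)); W = np.zeros((n, k + 1))
    alpha = np.zeros(k); beta = np.zeros(k + 1); gamma = np.zeros(k + 1)
    V[:, 0] = v1 / np.linalg.norm(v1)
    W[:, 0] = w1 / (w1 @ V[:, 0])                 # enforce w_1^T v_1 = 1
    for j in range(k):
        Av = A @ V[:, j]
        alpha[j] = W[:, j] @ Av
        v = Av - alpha[j] * V[:, j]
        w = A.T @ W[:, j] - alpha[j] * W[:, j]
        if j > 0:
            v -= gamma[j] * V[:, j - 1]
            w -= beta[j] * W[:, j - 1]
        s = w @ v
        if s == 0.0:                              # serious breakdown (needs look-ahead)
            raise RuntimeError("serious breakdown")
        beta[j + 1] = np.sqrt(abs(s))
        gamma[j + 1] = s / beta[j + 1]
        V[:, j + 1] = v / beta[j + 1]
        W[:, j + 1] = w / gamma[j + 1]
    return V, W, alpha, beta, gamma

rng = np.random.default_rng(6)
A = rng.standard_normal((100, 100))
V, W, alpha, beta, gamma = bi_lanczos(A, rng.standard_normal(100), rng.standard_normal(100), 15)
T = np.diag(alpha) + np.diag(beta[1:15], -1) + np.diag(gamma[1:15], 1)
print(np.linalg.norm(W[:, :15].T @ V[:, :15] - np.eye(15)))   # bi-orthogonality
print(np.linalg.norm(W[:, :15].T @ A @ V[:, :15] - T))        # tridiagonal projection
```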
Galerkin condition
Look for $x_k - x_0 \in \mathcal{K}_k(A, r_0)$ such that $b - Ax_k \perp \mathcal{K}_k(A^T, w_1)$. Therefore

$$x_k - x_0 = V_ky, \qquad W_k^T(b - Ax_k) = 0,$$

so we solve recursively

$$\begin{pmatrix} \alpha_1 & \gamma_2 & & \\ \beta_2 & \alpha_2 & \ddots & \\ & \ddots & \ddots & \gamma_k \\ & & \beta_k & \alpha_k \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_k \end{pmatrix} = \begin{pmatrix} \|r_0\|_2 \\ 0 \\ \vdots \\ 0 \end{pmatrix}.$$
Alternatively, since $b - Ax_k \in \mathrm{Im}[V_{k+1}]$, one can minimize the projected residual

$$\|W_{k+1}^T(b - Ax_k)\|_2 = \|W_{k+1}^Tr_0 - W_{k+1}^TAV_ky\|_2,$$

i.e. solve the $(k+1)\times k$ tridiagonal least-squares problem

$$\begin{pmatrix} \alpha_1 & \gamma_2 & & \\ \beta_2 & \alpha_2 & \ddots & \\ & \ddots & \ddots & \gamma_k \\ & & \beta_k & \alpha_k \\ & & & \beta_{k+1} \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_k \end{pmatrix} \approx \begin{pmatrix} \|r_0\|_2 \\ 0 \\ \vdots \\ 0 \end{pmatrix}.$$
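BiCG realizes the Galerkin condition above with short recurrences, using one product with $A$ and one with $A^T$ per step; for example with SciPy's implementation (test matrix illustrative):

```python
import numpy as np
from scipy.sparse.linalg import bicg

rng = np.random.default_rng(7)
n = 300
A = 2 * np.eye(n) + 0.5 * rng.standard_normal((n, n)) / np.sqrt(n)   # unsymmetric
b = rng.standard_normal(n)

# bicg only needs matrix-vector products with A and A^T
x, info = bicg(A, b)
print(info, np.linalg.norm(b - A @ x))
```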
Variants
Avoid transposes via inner products: $(x, A^Ty) = (Ax, y)$
Factorize $T = LU$ to get coupled two-term recurrences rather than three-term recurrences
Re-orthogonalize to compensate for loss of bi-orthogonality
Apply look-ahead to avoid breakdowns
Restart rather than look-ahead or re-orthogonalization
Block versions for all of the above
Loss of orthogonality
The orthogonalization process $\tilde u_{k+1} := Au_k - \sum_{j=1}^{k}u_jh_{j,k}$ under finite precision yields

$$AU_k = U_kH_k + \tilde u_{k+1}e_k^T + E_k, \qquad \|E_k\| \approx \epsilon\,\|A\|,$$

with $\epsilon$ the machine precision, so the computed basis gradually loses orthogonality.
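The effect is most dramatic for the Lanczos three-term recurrence: once a Ritz value converges, the computed basis loses orthogonality almost completely, which is what the re-orthogonalization variant above compensates for. A small illustration (synthetic spectrum with a few dominant eigenvalues; all choices illustrative):

```python
import numpy as np

rng = np.random.default_rng(8)
n, k = 400, 40
eigvals = np.concatenate([np.linspace(1.0, 10.0, n - 3), [1e2, 1e3, 1e4]])
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = (Q * eigvals) @ Q.T                      # symmetric matrix with known spectrum

U = np.zeros((n, k + 1)); alpha = np.zeros(k); beta = np.zeros(k + 1)
U[:, 0] = rng.standard_normal(n); U[:, 0] /= np.linalg.norm(U[:, 0])
for j in range(k):                           # plain Lanczos, no re-orthogonalization
    w = A @ U[:, j] - (beta[j] * U[:, j - 1] if j > 0 else 0.0)
    alpha[j] = U[:, j] @ w
    w -= alpha[j] * U[:, j]
    beta[j + 1] = np.linalg.norm(w)
    U[:, j + 1] = w / beta[j + 1]
    if (j + 1) % 10 == 0:
        Uj = U[:, :j + 2]
        print(j + 1, np.linalg.norm(Uj.T @ Uj - np.eye(j + 2)))  # orthogonality loss grows
```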
Breakdowns
Arnoldi stops when $h_{k+1,k} = 0$, implying $Ax_k = b$ (solved)
Lanczos stops when $\beta_{k+1} = 0$, implying $Ax_k = b$ (solved)
Two-sided Lanczos: when $\tilde v_{k+1} = 0$, choose a new $v_{k+1} \perp V_k$, e.g. $v_{k+1} = w_{k+1}$
when $\tilde w_{k+1}^T\tilde v_{k+1} = 0$ (with $\tilde v_{k+1}, \tilde w_{k+1} \neq 0$), there is no bi-orthogonality: serious breakdown
Recap solving Ax = b
Choose $x_0$, $r_0 := b - Ax_0$ and look for $x_k - x_0 \in \mathcal{K}_k(A, r_0)$.
The methods differ in the condition imposed on the residual and in their cost for $k$ steps:

  $Ax_k - b \perp \mathcal{K}_k(A, r_0)$:  FOM, $O(k^2n)$;  CG if $A = A^T \succ 0$, $O(kn)$
  minimal $\|Ax_k - b\|_2$:  GMRES, $O(k^2n)$;  MINRES if $A = A^T$, $O(kn)$
  $Ax_k - b \perp \mathcal{K}_k(A^T, w_1)$:  BICG, $O(kn)$
Eigenvalue problems
The Bauer-Fike theorem applied to

$$AU_k - U_kH_k = h_{k+1,k}u_{k+1}e_k^T$$

yields (for $X^{-1}AX = \Lambda$)

$$\forall j\ \exists i: \quad |\lambda_j(H_k) - \lambda_i(A)| \le |h_{k+1,k}|\,\kappa(X),$$

i.e. each eigenvalue of $H_k$ approximates well some eigenvalue of $A$.
For $A = A^T$, $\kappa(X) = 1$.
Breakdown ($h_{k+1,k} = 0$) is good, since then $\lambda_j(H_k) = \lambda_i(A)$.
Improved bounds exist if we know something about $\Lambda(A)$.
Eigenvalue convergence
Note that Krylov spaces are related to the power method
$$\mathcal{K}_k(A, r) := \mathrm{Im}\left[r, Ar, \ldots, A^{k-1}r\right]$$

and that under exact arithmetic

$$\mathcal{K}_k(A - cI, r) = \mathcal{K}_k(A, r).$$

For Arnoldi, $\Lambda(H_k)$ should converge to the outer spectrum of $A - cI$.
The same holds for Lanczos and $\Lambda(T_k)$ (real spectrum if $A = A^T$).
In practice one often converges to outer eigenvalues
but the full story is more complex ...
Implicit restarting
One step of the shifted QR algorithm,

$$H_k - \mu I = Q\,R, \qquad \tilde H_k := Q^TH_kQ,$$

(and a deflation) deletes the undesired eigenvalue from $\tilde H_k$ and yields a new starting vector $\tilde r := (A - \mu I)r$ such that

$$\mathcal{K}_{k-1}(A, \tilde r) := \mathrm{Im}\left[\tilde r, A\tilde r, \ldots, A^{k-2}\tilde r\right]$$

yields $\tilde H_{k-1}$ as projected matrix.
Choose shifts $\mu_i$ recursively,

$$\tilde r := \left[\prod_i(A - \mu_iI)\right]r = p(A)\,r,$$

such that $p(s)$ filters the undesired eigenvalues out of the complex plane.
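This shift-and-filter strategy is what the implicitly restarted Arnoldi method in ARPACK implements; SciPy exposes it as scipy.sparse.linalg.eigs. A usage sketch (diagonal test matrix with four well-separated outer eigenvalues; all parameters illustrative):

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import eigs

n = 10000
d = np.concatenate([np.linspace(0.0, 1.0, n - 4), [5.0, 6.0, 7.0, 8.0]])
A = diags(d)                        # large sparse matrix with 4 outer eigenvalues

# implicitly restarted Arnoldi: k wanted eigenvalues, ncv basis vectors,
# shifts taken from the unwanted Ritz values at every restart
vals, vecs = eigs(A, k=4, which="LM", ncv=20)
print(np.sort(vals.real))           # approximately 5, 6, 7, 8
```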
Rational approximation
Let

$$R(s) := c\,(sI_n - A)^{-1}b, \qquad \hat R(s) := \hat c\,(sI_k - \hat A)^{-1}\hat b.$$

To obtain

$$R(s) - \hat R(s) = O\!\left((s - \sigma)^{2k}\right)$$

one chooses

$$\hat A = W_k^TAV_k, \qquad \hat b = W_k^Tb, \qquad \hat c = cV_k,$$

where

$$\tilde A := (A - \sigma I)^{-1}, \qquad \mathrm{Im}[V_k] = \mathcal{K}_k(\tilde A, \tilde Ab), \qquad \mathrm{Im}[W_k] = \mathcal{K}_k(\tilde A^T, \tilde A^Tc^T).$$
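A small dense sketch of the resulting moment matching at $s = \sigma$. It uses explicit inverses and separately orthonormalized Krylov bases instead of the bi-orthogonal Lanczos bases, so the factor $(W_k^TV_k)^{-1}$ appears explicitly; all test data are illustrative:

```python
import numpy as np

rng = np.random.default_rng(10)
n, k, sigma = 30, 3, 0.0
A = -(2 * np.eye(n) + rng.standard_normal((n, n)) / np.sqrt(n))   # invertible test matrix
b = rng.standard_normal(n)
c = rng.standard_normal(n)

At = np.linalg.inv(A - sigma * np.eye(n))            # \tilde A = (A - sigma I)^{-1}

# orthonormal bases for K_k(At, At b) and K_k(At^T, At^T c^T)
V, _ = np.linalg.qr(np.column_stack([np.linalg.matrix_power(At, j + 1) @ b for j in range(k)]))
W, _ = np.linalg.qr(np.column_stack([np.linalg.matrix_power(At.T, j + 1) @ c for j in range(k)]))

# oblique (Petrov-Galerkin) projection; W^T V is not the identity here,
# so it shows up explicitly instead of the normalization used on the slide
E = W.T @ V
Ah = np.linalg.solve(E, W.T @ A @ V)
bh = np.linalg.solve(E, W.T @ b)
ch = c @ V

# the first 2k moments of R(s) = c (sI - A)^{-1} b at s = sigma should match
Ath = np.linalg.inv(Ah - sigma * np.eye(k))
for j in range(2 * k):
    print(j, c @ np.linalg.matrix_power(At, j + 1) @ b,
             ch @ np.linalg.matrix_power(Ath, j + 1) @ bh)
```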
Conclusion
Krylov subspace algorithms are used to
solve large-scale systems of equations
find eigenvalues, generalized eigenvalues and singular values of large-scale matrix problems
approximate the exponential or high-degree rational functions by lower-degree ones
There are numerous variants and sophisticated algorithms
It is still a very active area of research