0% found this document useful (0 votes)
1 views161 pages

A Multigrid Tutorial

The document is a tutorial on multigrid methods, which are efficient iterative techniques for solving problems with many variables and scales. It discusses the concepts of local versus global processing, the convergence of iterative methods, and the advantages of multiscale approaches in reducing computational complexity. Additionally, it covers damping techniques and provides examples of 1D and 2D model problems, including finite-difference discretization and Gauss-Seidel relaxation.

Uploaded by

zh.xiaoluo
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
1 views161 pages

A Multigrid Tutorial

The document is a tutorial on multigrid methods, which are efficient iterative techniques for solving problems with many variables and scales. It discusses the concepts of local versus global processing, the convergence of iterative methods, and the advantages of multiscale approaches in reducing computational complexity. Additionally, it covers damping techniques and provides examples of 1D and 2D model problems, including finite-difference discretization and Gauss-Seidel relaxation.

Uploaded by

zh.xiaoluo
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 161

A Multigrid Tutorial

Irad Yavneh
Faculty of Computer Science
Technion – Israel Institute of Technology
irad@cs.technion.ac.il

1
2
3
4
Some Relevant Books:
1. W.L. Briggs, V.E. Henson, and S.F. McCormick: “A Multigrid
Tutorial”, 2nd ed., SIAM, 2000.
2. U. Trottenberg, C.W. Oosterlee, and A. Schueller:
“Multigrid”, Academic Press, 2001.
3. Brandt, A., “1984 Guide with Applications to Fluid
Dynamics”, GMD-Studie Nr. 85, 1984.
4. Hackbusch, W., “Multigrid Methods and Applications”,
Springer, Berlin, 1985.
5. W. Hackbusch and U Trottenberg eds.: “Multigrid Methods”,
Springer-Verlag, Berlin, 1982.
6. Wienands, R., and Joppich, W., “Practical Fourier Analysis
for Multigrid Methods”, Chapman & Hall/CRC, 2004.

5
What’s it about?

A framework of efficient iterative


methods for solving problems with many
variables and many scales.

6
Basic Concepts: Local vs. Global processing.

Imagine a large number of soldiers in the Queen’s


guard who need to be arranged in a straight line and
at equal distances from each other.
The two soldiers at the ends of the line are fixed.
Suppose we number the soldiers 0 to N , and that the
length of the entire line is L.

8
Initial Position

9
Final Position

10
Global processing. Let soldier number j stand on the
line connecting soldier 0 to soldier N at a distance jL/N
from soldier number 0.

11
12
This method solves the problem directly, but it
requires a high degree of sophistication: recognition
of the extreme soldiers and some pretty fancy
arithmetic.

13
Local processing (iterative method). Suppose that the
inner soldiers’ initial position is x (0) = ( x1 , x2 , , xN −1 )(0) .
Then repeat for i=1,2,…: Let each soldier j, j=1,…N-1 at
iteration i move to the point midway between the
locations of soldier j-1 and soldier j+1 at iteration i-1:

xj(i )
=
2
(
1 (i −1)
x j −1 + x (ji+−11) )

This is an iterative process. Each iteration brings us


closer to the solution(?). The arithmetic is trivial.

14
A step in the right direction

15
16
Slow convergence

17
18
Fast convergence

19
20
Slow convergence

21
Local solution: damping

22
Local solution: damping

23
Local solution: damping

24
Local solution: damping

25
The multiscale idea: Employ the local processing with
simple arithmetic. But do this on all the different
scales.

26
27
Large scale

28
Large scale

29
Intermediate scale

30
Intermediate scale

31
Small scale

32
33
HOW MUCH DO WE SAVE?
Analysis of the Jacobi iterative process
Matrix representation:

x (i ) = Sx (i −1)
where
0 1 
1 0 1 
 
 1 0 1 
1 
S=     
2
 1 0 1 
 
 1 0 1
34

 1 0

This matrix S has N - 1 linearly independent eigenvectors,
vk, and corresponding real eigenvalues, λk

S v k = λk v k .
Since vk span the space ℜ N −1 , any initial configuration of
the soldiers can be written as a linear combination:
N −1
x (0 )
= ∑ ck v k

k =1

with some coefficients, ck.

35
Hence, we obtain after m iterations:

x ( m ) = Sx ( m −1) = S 2 x ( m − 2 ) =
 = S m x (0 ) = S m ∑ ck v k = ∑ ck λmk v k
k k

Conclusion:

lim x( m)
→ 0 if λk < 1, k = 1,
 , N −1
m →∞

The iteration converges if the spectral radius, ρ, of


the iteration matrix, S, is smaller than 1.

36
Observation: the eigenvectors and eigenvalues of the
matrix S are given by
 jkπ 
=
vk { }
=
v kj sin 
 N 
 ,=j 1, , N − 1,

 kπ 
λk = cos  ,
 N 
with k = 1, …, N –1.
Proof: Using the trigonometric identity,
1 ( j − 1) kπ
+
( j + 1) kπ 
=
kπ jkπ
 sin sin  cos sin ,
2 N N  N N

and the fact that sin 0 = sin π = 0 , we obtain by


substitution, S v = λk v .
k k

37
38
Note: since | λk | < 1, the method converges. But, for
some eigenvectors, | λk | is close to 1, so convergence is
slow. In particular, for kπ/N << 1, we have,

 kπ  1  kπ 
2

=λk cos   ≈ 1−   .
 N  2 N 
For k =1 we obtain
m 1 π 
 1 π  
2
2
− m 
λ ≈ 1 −    ≈ e
1
m 2 N
.
 2  N  

Conclusion: O(N 2) iterations are required to reduce such


an error by an order of magnitude.
39
40
How much work do we save?
Jacobi’s method requires about N 2 iterations and N 2 *
N = N 3 operations to improve the accuracy by an order
of magnitude.
The multiscale approach solves the problem in about
Log2(N) iterations (whistle blows) and only about N
operations.
Example: for N = 1000 we require about:
10 iterations and 1000 operations
instead of about
1,000,000 iterations and 1,000,000,000 operations
41
The catch: in less trivial problems, we cannot
construct appropriate equations on the large
scales without first propagating information
from the small scales.
Skill in developing efficient multilevel
algorithms is required for:
1. Choosing a good local iteration.
2. Choosing appropriate coarse-scale
variables.
3. Choosing inter-scale transfer operators.
4. Constructing coarse-scale approximations
to the fine-scale problem.

44
Damping
Recall: the eigenvectors and eigenvalues of the
iteration matrix S are given by

 jkπ 
{ }
v k = v kj = sin  , j = 1,  , N − 1,
 N 
 kπ 
λk = cos ,
 N 
with k = 1, …, N –1.

Note that convergence is also slow for k / N ≈ 1.

45
This slow convergence can be overcome by damping:
(i )
x j = (1 − ω ) x j (i −1)

2
(
1 (i −1)
x j −1 + x (ji+−11) , )
where ω is a parameter.
(i ) ( i −1)
Then, x = Sω x , where
Sω = (1 − ω )I + ω S.
Note: vk are eigenvectors of Sω . The corresponding
eigenvalues are now λ(kω ) = 1 − ω + ωλk = 1 − ω (1 − λk ).

For 0 < ω ≤ 1, we have convergence, k


(ω )
λ < 1.
46
Definition:
Eigenvectors vk with 1 ≤ k < N / 2 are called
smooth (low-frequency).
Those with N / 2 ≤ k < N are called rough or
oscillatory (high-frequency).

 kπ 
Recall that λk = cos  , so for rough
eigenvectors,  N 

λk ≤ 0.

47
Exercise: Find 0 < ω < 1 which yields optimal
convergence for the set of rough modes for
arbitrary N:

ω : sup Nmax λk(ω ) = min!,


N ≤k < N
2

i.e.,
ω: sup 1 − ω + ωλ =
min!,
λ∈( −1,0]

What is then the bound on the convergence


(ω )
factor, λk , maximized over the rough modes?

48
1D Model Problem
Find u which satisfies:

= ′′ ( x )
Lu u= f ( x ) , x ∈ ( 0, 1) , (1)

u ( 0 ) = u0 ,
u (1) = u1.

In the particular case where f = 0, the solution is a


straight line that connects u0 with u1.

49
Discrete approximation: Since closed-form solutions
exist only for a small number of differential equations,
we solve such equations approximately by a discrete
approximation.

Define a grid: divide the domain (0,1) into N intervals.


Assume for simplicity a uniform grid of mesh-size
h=1/N.

50
Finite-difference discretization; examples:
Forward differences:
u (x + h ) − u (x )
u′ = + O (h ).
h
Backward differences:
u (x ) − u (x − h )
u′ = + O (h ).
h
Central differences:
u (x + h ) − u (x − h )
u′ =
2h
+ O h2 . ( )
Second derivative:
u ( x − h ) − 2u ( x ) + u ( x + h )
u ′′( x ) =
h 2
+ O( )
h 2
. (2)
Derivation: by the Taylor theorem

51
We can thus approximate the differential
equation by a set of algebraic difference
equations:

u − 2u + u h h h
Lu = h h i +1
2
i i −1
fi ,h

h
=i 1, , N − 1,

u0h = u0 ,
u Nh = u1.
52
In matrix form:
 −2 1   u1h 
1   h 
−2 1  u2 
1  
       =
h    h 
2

 1 −2 1 u N − 2 
 −2  uh 
1  N −1 

 f1h − u0h / h 2 
 h 
f
 2 
  .
 h 
 N −2
f 
 f h − u h / h2 
 N −1 1 

This is a tridiagonal system of equations that is


essentially as easy to solve as the soldier problem
53
2D Model Problem
Find u which satisfies:

Lu = u xx + u yy = f ( x, y ), (x, y ) ∈ Ω, (4)

u = g ( x, y ) , (x, y ) ∈ ∂Ω.
This is the 2D Poisson equation, with Dirichlet boundary
conditions. It is an elliptic partial differential equation
which appears in many models.

54
Ωh
55
Discrete approximation
Define a grid: Ω h ⊂ Ω (assumed to be uniform for
simplicity, with mesh interval h).

Let uh, gh and f h denote discrete approximations to u, g


and f defined at the nodes of the grid.

Plug (2) for uxx, and the analogous approximation for uyy
into (4), obtaining:

56
Lhuih, j = (5)
uih−1, j − 2uih, j + uih+1, j uih, j −1 − 2uih, j + uih, j +1
+ =
fi ,hj in Ω h
h2 h2

=
u h g h on ∂ h Ω h

This yields a nonsingular linear system of equations for


uih, j (the discrete operator satisfies a maximum
principle.)

We consider solving this system by the classical


approach of Gauss-Seidel relaxation.
57
Gauss-Seidel (GS) Relaxation:
1. Choose initial guess, ~ h
u .
2. Repeat until some convergence criterion is satisfied
{
Scan all variables in some prescribed order, and
change each variable u ~ h in turn so as to satisfy
i, j
the (i,j)th equation.
}

58
Observation: GS is a local process, because only near
neighbors appear in each equation. Hence, it may be
efficient for eliminating errors which can be detected
locally. But large-scale (“smooth”) errors are
eliminated very slowly.

(The difference between GS and Jacobi is that old


neighboring values are used in Jacobi, while the most
updated values are used in GS.)

59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
Key Observation re-worded: Relaxation cannot be
generally efficient for reducing the error (i.e., the
difference vector u ~ h − u h ). But relaxation may be
extremely efficient for smoothing the error relative
to the grid.

Practical conclusion:
1. A smooth error can be approximated well on a
coarser grid.
2. A coarser grid implies less variables, hence less
computation.
3. On the coarser grid the error is no longer as
smooth relative to the grid, so relaxation may once
81
again be efficient.
Grid-refinement algorithm
Define a sequence of progressively finer grids all
covering the full domain. Then,

1. Define and solve the problem on the coarsest grid,


say by relaxation.
2. Interpolate the solution to the next-finer grid.
Apply several iterations of relaxation.
3. Interpolate the solution to the next-finer grid and
continue in the same manner…

Does this method converge fast?

82
1D Model Problem Revisited
Fine-grid (h) difference equation:
u h
− 2u h
+ u h
= Lh u h =
i +1 i i −1
fi h ,
h2 (6)
= i 1, N − 1,

u0h = u0 ,
u Nh = u1.

The eigenvectors of Lh (like those of the Jacobi


relaxation operation) are Sine-function “waves”:

vk (sin kπ / N , sin jkπ / n, sin( N − 1)kπ / N )T (7)

83
Aliasing

84
Smooth waves—with k << N—have wavelengths large
compared to h. Hence they can be approximated well
on the coarse grids. But non-smooth eigenvectors alias
with smooth components on the coarse grids.
Since the right-hand side, f h, will generally have some
non-smooth components, these will be “interpreted” as
smooth components by the coarse grids, resulting in a
smooth error.
Hence, when we interpolate a coarse-grid solution to
the fine grid, we still have smooth errors in this
solution. These cannot be corrected efficiently by
relaxation.

85
Errors:
There is an important distinction here between the
discretization error:
u − uh,
and the algebraic error:
~h ,
uh − u

~
Where u
h
is our current approximation to u .
h

86
Note: Neither the solution, uh, nor the discretization
error are smoothed by relaxation, only the algebraic
error. Hence, we formulate our problem in terms of
this error.
Denote ~h.
vh = u h − u

Recall Lh u h = f h .

Subtract L u
h~ from both sides, and use the linearity
h

of Lh to obtain:
Lh v h = f h ~h ≡ r h
− Lh u (8)
87
As we have seen, we need to smooth the error vh on the
fine grid first, and only then solve the coarse-grid
problem. Hence, we need two types of integrid
transfer operations:

1. A Restriction (fine-to-coarse) operator: I hH .


2. A Prolongation (coarse-to-fine) operator: I Hh .

For restriction we can often use simple injection, but


full-weighted transfers are preferable.
For prolongation linear interpolation (bi-linear in 2D) is
simple and usually effective.

88
Two-grid Algorithm
• Relax several times on grid h, obtaining u~ h with a
smooth corresponding error.
h~h
• Calculate the residual: r h
= f h
− L u .

• Solve approximate error-equation on the coarse


grid:
LH v H = f H ≡ I hH r h .

• Interpolate and add correction:


~h ← u
u ~h + I h vH .
H

• Relax again on grid h.


Multi-grid is obtained by recursion.
89
Multi-grid Cycle V (ν 1 ,ν 2 )

Let u approximate v , u approximate the error on


2h 2h 4h

grid 2h, etc. Relax on L u = f v times


h h h
1

Set f 2h
= I h2 h f( h
− Lh u h , ) u 2h = 0
Relax on L2 h u 2 h = f 2h
v1 times
Set f 4h
= I 24hh f( 2h
− L2 h u 2 h , ) u 4h = 0
Relax on L4 h u 4 h = f 4h
v1 times
Set f 8h
= I 48hh f( 4h
− L4 h u 4 h , ) u 8h = 0

Solve LMh − u Mh = f Mh


Correct u 4 h ← u 4 h + I 84hh u 8 h
Relax on L4 h u 4 h = f 4h
v2 times
Correct u 2 h ← u 2 h + I 42hh u 4 h
Relax on L2 h u 2 h = f 2h
v2 times
Correct u h ← u h + I 2hh u 2 h
90 Relax on Lh u h = f h
v2 times
91
Multigrid vs. Relaxation
92
Remarks:
1. Simple recursion yields a V cycle. More generally,
we can choose a cycle index γ, and define a γ–cycle
recursively as follows: Relax; transfer to next
coarser grid; perform γ γ–cycles; interpolate and
correct; Relax. (On the coarsest grid define the γ–
cycle as an exact solution).
2. The best number of pre-relaxation + post-
relaxation sweeps is normally 2 or 3.
3. The boundary conditions for all coarse-grid
problems is zero (because the coarse-grid variable
is the error). The initial guess for the coarse-grid
solution must be zero.

93
Local Mode Analysis (LMA)
We would like to obtain a quantitative prediction
of the convergence behavior of the multigrid (or
at least two-grid) cycle.
This is important for debugging, choosing
parameters, etc.
We first derive the iteration matrix of the two-
grid cycle. That is, the matrix Sh,

v h
after =S v
h h
before ,

where before and after are the algebraic errors


h h
v v
before and after the two-grid cycle.

94
Notation

Rh – Relaxation matrix
Lh – Fine-grid matrix, LH – Coarse-grid matrix
I hH – Prolongation matrix, I hH– Restriction matrix
Ih – Fine-grid identity matrix
ν1 – # Relaxations before CGC
ν2 – # Relaxations after CGC

Two-grid matrix

=S h
(R )
h
ν2
(I h
−I h
H (L )
H
−1
H
I L
h
h
) (R )
h
ν1

95
For a given problem, we can compute the norm of
Sh and determine the convergence behavior of the
two-grid algorithm, which often provides a relevant
approximation of the multigrid performance.

However, this requires use of a computer, and it is


only moderately useful for algorithm development.

We can in fact obtain a useful quantitative


approximate prediction by means of a local
(Fourier) analysis.

96
Fourier Analysis
Consider for simplicity the 1D problem.
Instead of fixed boundary conditions we assume
periodicity (or an infinite grid). Also, we assume
our operator, Lh, to have constant coefficients.
Hence, every element of the corresponding
matrix, denoted by A, satisfies:
A i , j = A i −1, j −1 (mod N ) .
That is, every row of A is identical to the
previous row, modulo N, shifted one place
forward.
We next compute the eigenvectors and
eigenvalues of matrices of this type.
97
Observation: any matrix A representing a
constant-coefficient discretization + periodicity
can be written as a polynomial in the cyclic
forward shift matrix,
 0 1 0  0 0 
 
 0 0 1 0  0 
 0 0 0 1  0 
ω =  .
  

 0 0 0  0 1 

 1 0 0  0 0 

Here,
ω i , j = δ i +1, j (mod N ) .
98
Proof.
By induction, for any integer p,

(ω )
p
i, j = δ i + p , j (mod N ) .

That is, ω p is a cyclic forward shift by p


places. Hence we have
N −1
A = ∑ A0, jω j .
j =0

99
Observation: If P is a polynomial and B is a square
matrix with eigenvector v and corresponding
eigenvalue λ, then v is also an eigenvector of P(B)
with corresponding eigenvalue P(λ).
Proof:
m
Suppose P (B) = ∑ ci B i .
i =0
Observe that, by induction, B i v = λi v.
Hence,
m
P ( B) v
= ∑
= cλ v
i =0
i
i
P (λ ) v

100
Conclusion: Since any matrix A of the type we are
considering is a polynomial in ω, we only need to
compute the eigenvalues and eigenvectors of ω.
The corresponding eigenvalues of A will then be
easy to compute.

Let λk denote an eigenvalue of ω, and let vk


denote the corresponding eigenvector. We have :

101
 v 0k   v1k   v 0k 
 k   k   k 
 v1   v2   v1 
        
   k   
ω = k
vj  =v j +1  λk  v j 
k

        
 k   k   k 
 v N −2   v N −1   v N −2 
 vk   vk   vk 
 N −1   0   N −1 

Hence,

102
v1k = λk v 0k ,
v 2k λ=
= v
k 1
k
λ 2 k
k v0 ,


v kj = λkj v 0k ,

=v 0k λ=
k v k
N −1 λk
N
v k
0.

Thus, the N eigenvalues are the N th roots


of 1:

λk = 1 1/ N
=e i 2πk / N
.

103
The N eigenvectors are

 1   1 
   
λ
 k   e ik 2π / N

 λk2   e 2ik 2π / N 
   
=v k
=      .
 λkj   e jik 2π / N 
   
     
 λ N −1   e( N −1) ik 2π / N 
 k   

Here, k is any integer. But note that if k1 = k2 (mod N)


then the eigenvalues corresponding to k1 and k2 alias.
Thus, k runs over any N consecutive integers.

104
Consider now a grid of N equally spaced points
over a domain of size 2π. The grid-points are
located at
2πj
x j = hj = , j = 0,, N − 1,
N
where h = 2π / N is the mesh-size.
The eigenvectors can be considered as grid-
based functions (Fourier Components):
ϕ k (x j ) = e .
ikx j

Here, k can be any integer, but note that


ϕk aliases with ϕk+N . We therefore
restrict:  N N
k ∈ − + 1,  .
105  2 2
Additional Remarks.
1. Domain length 2L: replace x by πx/L.
2. Nonperiodic, with function vanishing at
endpoints: Antisymmetric continuation
(sine series).
3. The wavelength is l=2π/k.
4. For multigrid analysis we define
2π k
θ = hk = , −π < θ ≤ π.
N

Then, ( j)
ϕθ =x e= e
iθ x j / h
iθ j
.
The wavelength is 2πh/θ.
108
Fourier Analysis
Local mode (Fourier) analysis is the main tool
used for practical analysis of multigrid solvers.
Though it is rigorously justified only for rather
special situations, it is useful for quantitative
predictions in a wide set of circumstances.
The underlying assumption is that small subsets
comprised of one or a few Fourier components
of the form (in d dimensions) ϕ = eiθ  j ,
θ = (θ1 , θ 2 , , θ d ) ,
x x x 
j =  1 , 2 , , d ,
 h1 h2 hd 
are invariant under operations of the common
multigrid components.
109
The Symbol
The symbol is a generalization of the eigenvalue.
The symbol of an operator L is denoted by L̂ .
When the Fourier mode is an eigenfunction, it is
defined by:
Leiθ  j = Lˆ (θ ) eiθ  j .

Examples (1D):
Lu = u xx ,
θ2
Leiθ x / h = − eiθ x / h ,
h2
θ2
⇒ Lˆ =
− 2
h
110
Suppose we discretize L by
u hj +1 − 2u hj + u hj −1
Lh u h = 2
.
Then, h
eiθ − 2 + e − iθ iθ x / h
Lh eiθ x / h = 2
e .
h
So,
2 θ 
L (θ ) =2 ( cos (θ ) − 1) =
2 4
ˆh
− 2 sin   .
h h 2

Truncation error:
θ2
4  θ   θ 4

Lˆ (θ ) − Lˆh (θ ) =
− 2 + 2 sin 2   = O 2 .
h h 2 h 

111
The symbol of relaxation
Consider the discretized equation
Lh u h = f h .
A pointwise relaxation can often be written as

Lh + uafter
h
+ Lh −ubefore
h
=
fh,

where ubefore denotes the old approximation to u h


h

(before the relaxation step), and after denotes


h
u
the new approximation.
Thus, the relaxation is characterized by the
splitting: =
Lh Lh + + Lh − .
112
Examples:
For 1D Poisson,
u hj +1 − 2u hj + u hj −1
Lh u h = 2
,
h

we obtain for Jacobi relaxation,


−2 uafter
h
( ) ( h
ubefore ) (
+ ubefore
h
)
=Lh + uafter
h
(j
=
h 2
j
, L)
h+ h
ubefore
j
( ) j +1

h 2
j −1
,

and for Gauss-Seidel relaxation,

(u )
h
(
− 2 uafter
h
) (u ) h

(L ) ( )
after
j −1 before j +1
h+ h+ h
=
h j
uafter 2
, L ubefore 2
,
j h j h
113
Let v= − denote the algebraic error
h h h
before ubefore u
before the relaxation, and let v= −
h h h
after u after u
denote the new error. Then,

Lh + vafter
h
+ Lh − vbefore
h
=
0.

Now, consider an error that is a single Fourier


component (assumed to be an eigenfunction of
Lh ± ),

iθ x / h iθ x / h
=v h
before A=
θe , v h
after Aθ e .

114
The relaxation operator, Rh, is defined by
h
vafter = R h vbefore
h
.
We obtain that the symbol of Rh is:
A Lˆh − (θ )
Rˆ h (θ ) = θ = − h + .
Aθ L (θ )
ˆ

Examples: for Jacobi relaxation we have


iθ − iθ
−2 e + e
Lˆh + (θ ) =
= 2
, ˆh − (θ )
L 2
,
h h

So the symbol of Jacobi relaxation is


iθ − iθ
e + e
=
Rˆ Jac
h
(θ ) = cos (θ ) .
2
115
For damped Jacobi relaxation we have

Aθ = 1 − ω + ω Rˆ Jac
h
(θ )  Aθ
= 1 − ω + ω cos (θ )  Aθ .

So the symbol of damped Jacobi relaxation is

Rˆωh Jac (θ ) =1 − ω + ω cos (θ ) .

116
For Gauss-Seidel relaxation we have
(v h
before ) j +1
(
− 2 vafter
h
) + (v )
j
h
after j −1
=
0.
Substituting the Fourier function yields

Aθ eiθ ( j +1) − 2 Aθ eiθ j + Aθ eiθ ( j −1) =


 Aθ eiθ − 2 Aθ + Aθ e − iθ  eiθ j =
0.

Hence, eiθ
Aθ = − iθ
Aθ .
2−e
The symbol of Gauss-Seidel relaxation is
therefore e iθ
RˆGS (θ ) =
h
− iθ
.
117 2−e
Aliasing Revisited
Note that
i (θ ± 2π ) x j / h i (θ ± 2π ) j
e = e=
± i 2π j iθ j iθ j iθ x j / h
e e= e= e .

iθ x / h i (θ ± 2π ) x / h
That is, Fourier modes e and e alias
with each other on grid h.
iθ x / h i 2θ x / 2 h
On grid 2h the component e becomes e .
That is, its frequency relative to the grid is
doubled.

118
Aliasing Revisited
iθ x / h
Fourier modes e and ei(θ ± 2π ) x / h alias with
each other on grid h.
iθ x / h
On grid 2h the component e becomes ei 2θ x / 2. h
That is, its frequency relative to the grid is
doubled.
and e (
iθ x / h i θ ±π ) x / h
Thus, the fine-grid components e
alias when sampled on grid 2h.
Conclusion: the coarse grid 2h resolves only
about ½ of the fine-grid frequencies – those in
the range θ ≤ π / 2 .

119
Smoothing analysis
We simplify the analysis of the two-grid algorithm
by making the following approximation:
1. The coarse-grid correction eliminates all smooth
fine-grid components, those with θ ≤ π / 2 .
2. The coarse-grid correction has no effect on the
rough fine-grid error components, with θ > π / 2 .

With these simplifications we can predict


approximately the convergence rate per fine-grid
relaxation sweep of the two-grid cycle by computing
the smoothing factor defined below.
120
The smoothing factor
Let Rˆ denote the symbol of a relaxation
h

operator whose eigenvectors are Fourier


components. Then the smoothing factor is
defined by

µ = max Rˆ h (θ ) .
π / 2≤ θ ≤π

121
Example:
The smoothing factor of Gauss-Seidel relaxation is
µ = max RˆGS
h
(θ )
π / 2≤ θ ≤π

eiθ
= max
π / 2≤ θ ≤π 2 − e − iθ

1
= max
π / 2≤ θ ≤π 2 − cos (θ ) + i sin (θ )

1 1
=max .
π / 2≤ θ ≤π
( 2 − cos (θ ) ) + sin 2 (θ )
2
5

The maximum is obtained at θ = ±π / 2 .

122
The smoothing factor of Jacobi relaxation
with no damping is 1, because cos ( ±π ) =
1.
Jacobi relaxation does not smooth errors at
the highest frequencies, only changes their
signs.

What is the smoothing factor of damped


Jacobi? What is the optimal damping? That is,
what is the ω which minimizes µ of damped
Jacobi? What is the corresponding µ?

123
Higher dimensions
The Fourier analysis can be generalized to d
dimensions.

xj ( )
x (j11 ) , x (j22 ) , , x (jdd ) , θ
= (θ1 ,θ 2 , ,θ d ) ,
where
=x (jkk ) j=
k hk , k 1, , d .

124
Higher dimensions

The Fourier components are

 d (k )   d 

i ∑θ k x jk / hk  i θ k jk 
ϕ ( x ) e= e
   
= =
θ j
 k 1=  k 1 
,

Where hk is the mesh-size in the kth coordinate.

125
The smoothing factor (for standard coarsening) is
now defined as in the 1D case, with

θ = max ( θ1 , , θ d ) .
def

Example (2D)
The symbol of Gauss-Seidel Relaxation for the
Poisson problem is iθ1 iθ 2
e + e
Rˆ (θ ) = − iθ1 − iθ 2
.
4−e −e
The smoothing factor µ is the maximum of R̂ (θ )
over all θ for which the absolute value of at least
one component is at least π/2.

126
( −π , π ) θ Domain (π , π )

( 0, 0 )

( −π , −π ) (π , −π )

The shaded area marks the part of the θ domain that


is ignored when computing the smoothing factor.

127
The smoothing factor is most easily computed
approximately by a small computer program. For
Gauss-Seidel relaxation of the 5-point Laplacian,
the smoothing factor is found to be ½.

Conclusion: a V(2,1) cycle is expected to reduce


the error approximately by a factor 8 per cycle.

128
Exercise:
Write a MATLAB function that computes approximately
the smoothing factor, µ . The input should be two small
matrices representing the stencils of Lh + and Lh −, and also
a damping parameter, ω. The function should compute the
symbol of the relaxation, and maximize its absolute value
over a discrete subset spanning the high frequencies,
 π π  π π
(θ1 , θ 2 ) ⊂ [ −π , π ] × [ −π , π ] \  −
,  × − ,  .
 2 2  2 2
The resolution of the θ ’s should also be a parameter (run
with 65 by 65). Use the program to verify the analytical
results of the smoothing analysis for Jacobi relaxation
(presented later). Verify also that the smoothing factor of
point Gauss-Seidel for the 5-point Laplacian is 0.5.

129
For example, for Gauss-Seidel the input is
0 0 0 0 1 0 
Lh + 1 −4
= 0  , Lh − 0 0 1  ,
=
  
 0 1 0   0 0 0 

whereas for Jacobi it is


0 0 0 0 1 0 
Lh +  0 −4
= 0  , Lh − 1 0 1  .
=
  
 0 0 0   0 1 0 

The symbol of the relaxation is then given by

 Lˆh − 
R =1 − ω + ω  −
h
.
 Lˆh + 
130
Ellipticity and h-Ellipticity
Multigrid methods are particularly useful and often
straightforward for elliptic problems. But the “on-off”
definition of ellipticity is inadequate for numerical
purposes, and a quantitative measure of “ellipticity” of
the discrete operator is important. This is given by the
h-ellipticity measure, Eh, defined by

min Lˆh (θ )
E h
(L ) =
h π / 2≤ θ ≤π

max Lˆ (θ ) h
,
π / 2≤ θ ≤π

where Lˆh is the symbol of Lh.

131
We say that a discrete operator, Lh, is h-elliptic if Eh
is bounded away from zero. Generally, for ordinary
(i.e., local) relaxation methods, larger Eh corresponds
to better “error-smoothability” by local processing.
Basically, a large Eh implies that all high-frequency
errors generate relatively large residuals that are
roughly of the same size. If Eh is small, then there
are some error components whose residuals are
relatively small. Such components cannot be detected
locally, and hence they cannot be reduced efficiently
by a local relaxation.
Remark: an elliptic PDO may have a discretization
that is not h-elliptic, while a nonelliptic PDO might
have one that is.
132
Examples:
Consider three different stencils for the Laplacian .

⋅ 1 ⋅
1  .
=Lh
1 −4 1 (1)
h2  
 ⋅ 1 ⋅ 
Here,
1
Lˆh (θ ) =2 ( 2 cos θ1 + 2 cos θ 2 − 4 ) =
h
4
( )
− 2 sin 2 (θ1 / 2 ) + sin 2 (θ 2 / 2 ) .
h

Hence, E =h
4 O (1) , and we say that Lh is h-
1/=
elliptic.
133
Next, choose
1 ⋅ 1
1 
L=
h
2 
⋅ −4 ⋅  . (2)
2h
1 ⋅ 1

Here,
2
= Lˆh (θ ) 2 (
cos θ1 cos θ 2 − 1) ,
h
and
E h = 0.

Conclusion: if this discretization is used, the error


cannot be smoothed efficiently by local relaxation.

134
Finally, choose

⋅ ⋅ 1 ⋅ ⋅
⋅ ⋅ ⋅ ⋅ ⋅ 
1 
L =
h
1 ⋅ −4 ⋅ 1 .
4h 
2
 (3)
⋅ ⋅ ⋅ ⋅ ⋅
 ⋅ ⋅ 1 ⋅ ⋅ 
Here,
1
Lˆh (θ ) = 2 ( 2 cos 2θ1 + 2 cos 2θ 2 − 4 ) =
4h
4
(
− 2 sin 2 θ1 + sin 2 θ 2 .
h
)

Again E h = 0. Note that in the last two examples,


the Red-Black mode could not be smoothed.

135
Smoothing and h-ellipticity
h-ellipticity is a necessary and sufficient condition for
the existence of pointwise smoothers based on a local
splitting,
=
Lh Lh + + Lh − ,
where Lh + is comprised of the coefficients of variables
already relaxed, while Lh - is comprised of the
coefficients of unrelaxed variables.
Obviously, E ( L ) = 0 implies Lˆh (θ ∗ ) = 0 for some high
h h

frequency θ *, and therefore

( ) ( )
Lh + θ ∗ = − Lh − θ ∗ .
136
For L (
ˆh + θ *
)≠ 0 , we get

Lˆ h−
( )
θ *
ˆ θ*
=
R ( ) = 1.
Lˆ h
( )
+
θ *

Thus the smoothing factor is at best 1.

137
Exercise:
Show that optimally damped Jacobi relaxation yields
for symmetric constant-coefficient operators, a
smoothing factor
1− Eh
µ= ,
1+ E h

where Eh is the h-ellipticity measure. (Note that the


symmetry of the operators implies that the symbols
are real).
Hint: without loss of generality, assume that Lh is
normalized such that its diagonal is the identity matrix.
Then, write the damped Jacobi relaxation matrix in
terms of Lh .
138
For non-symmetric stencils we can obtain the bound

µ ≤
1− Eh( )
2

< 1.
1 + (E )
h 2

By means of a distributive relaxation. However, in


practice we can usually find far simpler and more
effective smoothers.

139
Anisotropic Diffusion
Let = uss + ε utt .
Lu
Let φ be the angle between (s,t) and the grid-aligned
coordinate system, (x,y). Hence,

( ) ( )
Lu = C 2 + ε S 2 u xx + 2 (1 − ε ) CSu xy + ε C 2 + S 2 u yy ,
with
= =
C cos (φ ) , S sin φ .

140
Anisotropic Diffusion
We discretize L using the stencil

 1
(1 − ε ) CS 
1
 − 2 (1 − ε ) CS εC + S
2 2

2
1  
=
Lh
 C 2
+ ε S 2
−2 (1 + ε ) C + ε S .
2 2

h2  
1 1
 (1 − ε ) CS εC2 + S2 − (1 − ε ) CS 
 2 2 

141
Anisotropic Diffusion: The Aligned case
In the previous examples, of the Laplacian operator, we
were able to find a discretization that would give us a
good h-ellipticity measure. But suppose that the
differential operator is

L =ε ∂xx + ∂ y y

142
A 2nd order discretization for the aligned linear
diffusion operator is again obtained by the standard
five-point stencil:
. 1 .
− 2(1 + ε ) ε  .
1 
L = 2 ε
h

h
 . 1 . 
Here,
2  (
 2 ε cos (θ1 ) + cos (θ 2 ) − 2 (1 + ε ) ) 
1
=Lˆh 
h
4
− 2 ε sin 2 (θ1 2 ) + sin 2 (θ 2 2 )  .
=
143
h
Setting (θ1 ,θ 2 ) = (π 2 ,0) vs. (π , π ), for example, we
obtain
= E h O ( ε ) , ε → 0, hence small h-ellipticity.
Indeed, all error components which have high-
frequency oscillations only in the x direction generate
relatively small residuals. Such errors cannot be
reduced efficiently by local relaxation.
For example, the symbol of the Gauss-Seidel relaxation
for this operator is iθ iθ
ε e + e
Rˆ (θ ) =
1 2

(9)
2 (1 + ε ) − ε e − e
1− iθ − iθ
2

setting θ = (π ,0 ) , for example we get


1− ε
Rˆ (π , 0 ) =
1 + 3ε
=− ( )
1 4ε + O ε 2 , ε → 0 (10)
Thus the smoothing factor is very poor for small ε as
expected.
144
Treating Aligned Anisotropy
There are two general approaches for handling
anisotropic operators. One method is to employ line-
relaxation in the direction of the strong coupling (i.e.,
for which the coefficient is relatively large – the y
direction in this example.) This means that we relax
simultaneously a full line of variables for each
gridpoint index i in the x direction.
In our Gauss-Seidel example, the resulting relaxation
symbol is
εe iθ
Rˆ (θ ) =
1

2(1 + ε ) − εe − iθ 1
−e iθ 2
−e − iθ 2

145
Now, R (θ ) is maximized over the high frequencies for
θ = (π 2,0 ) , yielding R(π 2,0) = 1 5, for ε → 0 , which
implies very good smoothing.

The drawback of this approach is that for each y-line


we must solve a tri-diagonal system of equations. We
can do this by the usual Gaussian elimination or by a 1D
multigrid cycle.

146
An alternative approach for anisotropic operators is
partial coarsening (or semi-coarsening). In this
approach we use the usual relaxation, but we only
coarsen in the direction in which the error is smoothed
efficiently (y in our example).

Thus, the relaxation symbol remains as usual, but the


coarse grid resolves more components, and the
definition of the smoothing factor changes
accordingly.

147
In our example, the smoothing factor is given by

µ = maxπ 2 ≤θ 2 ≤π Rˆ (θ ) ,

where R̂ (θ ) is the usual Gauss-Seidel symbol (9). Given


the range of θ, the maximum (for small ε) is now
obtained at θ = ( 0, π / 2 ) , yielding,

i +ε 1
=Rˆ (θ )  → ,
2 (1 + ε ) − ε − i ε →0
5

again implying good smoothing properties.

148
The General Rule of Block Relaxation
The general rule is that local (point) relaxation only
smoothes the error efficiently in the direction of the
strongest coupling. If we relax the strongly-coupled
variables simultaneously (block relaxation), then the
relaxation will smooth well also in the direction of the
second-strongest coupling. Thus, we can use block-
relaxation (line, plane, etc.) to regain full multigrid
efficiency.
Alternatively, we can refrain from coarsening in the
directions along which the error is not smoothed. The
ultimate form of this is Algebraic Multigrid (AMG).

149
Both techniques can be used simultaneously. For
example, if we do not know a priori the direction of
strong coupling, then we can use line relaxation in one
direction while coarsening only in the other. This can
be generalized to higher dimensions.

Alternatively, we can use line relaxation in each of the


directions (alternating), but then the generalization to
higher dimensions is more cumbersome.

150
Conclusions
• With proper care and insight, multigrid methods are a
highly efficient tool for the iterative solution of
problems arising from the discretization of elliptic
PDE
• Honing your multigrid algorithm to make it work for
your particular problem may be difficult.
• A more general and robust approach – that comes with
the cost of heavier machinery – is Algebraic Multigrid
Methods, introduced in the next tutorial.

151
Algebrization of Multigrid
There are many problems for which multigrid is suitable
in principle but cannot be applied in a straightforward
way. For example,
1. Unstructured grids and complex geometries
2. Non-PDE applications
Such situations require algebraic multigrid methods.
The multigrid components can be expressed as matrices.
Consider, for example, the 1D model problem using
linear interpolation and full-weighted residual transfers.

152
− 2 1 
 1 −2 1 
 
 1 −2 1 
1  
Lh = 2   
h
 1 −2 1 
 
 1 −2 1 

 1 − 2 

1 
2 
 
1 1 
 
 2 
 1 1 
1 
( )
I Hh = 2 I hH
T
= 
2
 
1 1 
 
 2 
 1 1
 
 2
 
153  1
Given the fine-grid matrix, Lh, and the
prolongation and restriction matrices, I Hh , and I hH ,
how should we define the coarse-grid matrix, LH ?

The coarse grid should be able to correct smooth


errors. We use the following (algebraic)
definition of smoothness: An error before is
h
v
smooth if it is in the range of interpolation. That
is, if there exists some coarse-grid function, w H,
such that

v h
before =I w .
h
H
H

154
The error after the coarse-grid correction is
given by

v h
after =C v h h
before

where

C = I −I
h h h
H (L )H −1
I hH Lh .

Plugging in our smooth error we obtain:

155
h
vafter = C h vbefore
h

[
= I −I h h
H (L )
H −1
]
I hH Lh I Hh w H

= I[ h
H −I h
H (L )
H −1
I hH Lh I Hh w H ]
=I h
H [I H
− L ( )
H −1 H
I LI
h
h h
H ]w H
.

In order to annihilate the error we must choose the


Petrov-Galerkin coarse-grid operator:

L H
=I L I .
H
h
h h
H
156
For symmetric problems especially, the preferred choice
for the restriction is the transpose of the prolongation.
Along with the Galerkin coarse-grid operator this yields
so-called variational coarsening, which arises naturally in
finite-element formulations.

It remains only to define the prolongation (and,


implicitly, the set of variables which defines the coarse
grid). The prolongation operator should produce good
approximate fine-grid values from given coarse-grid
values. Therefore, I Hh needs to be determined using Lh
When used with appropriate coarse grids, such methods
yield fast and robust algebraic solvers.

157
For tridiagonal matrices in 1D the different algebraic
methods become the same: an exact multigrid solver
that is equivalent to cyclic reduction

If the fine-grid equations are

ai ui −1 + bi ui + ci ui +1 = f i ,

I = 1,…, n – 1, with a1 = cn −1 ≡ 0 , we choose the


prolongation matrix to be

158
 − bc1 
 1 
 1 
 a3 
 − b3 − bc33 
 1 
 
 − a5

 b5

I Hh = ... 
 
 − bcnn−−55 
 1 
 
 − an−3
− bcnn−−33 
 bn−3

 1 
 an −1 

 − bn−1  
159
Furthermore, we let I = (I ) and employ Galerkin
H h T
h H
coarsening. For smoothing we use half-Red-Black
relaxation. That is, before restricting residuals we
relax only on odd-indexed gridpoints, and after the
coarse-grid correction only on even-indexed points.

Theorem: the two-level cycle is an exact solver.


Furthermore, the coarse-grid equations are
tridiagonal. Hence, the multigrid cycle is an exact
solver.

160
Algebraic Multigrid (AMG)

Introduced by Brandt et al. (1983) and developed by Ruge


and Stueben.
AMG takes the algebrization of multigrid to the limit.
Here, a relaxation method is chosen (usually, point Gauss-
Seidel), and then the coarse-grid variables are chosen by
a heuristic graph algorithm such that each fine-grid
variable depends strongly on one or more coarse-grid
variable (i.e., with relatively large coefficient).
AMG enables us to handle unstructured and non-PDE
problems.

161
An Abstract View of Algeraic Multigrid Methods

Consider the linear system

Au = f .
Suppose we partition the variables, ui , into F variables
and C variables, and permute the equations and variables
to produce the following partitioning of the system:

 AFF AFC   u F   f F 
Au =     .
 ACF ACC   uC   f C 

162
An Abstract View of AMG

Given an approximate solution, , define the error as


u
v= u − u.
Then, the partitioned equation for the error is

 AFF AFC   vF   rF 
Av =    ,
 ACF ACC   vC   rC 
where

rF =f F − AFF u F − AFC uC ,


f C ACF u F − ACC uC .
rC =−
163
An Abstract View of AMG

The upper block yields

AFF vF= rF − AFC vC ,


⇒= −1
vF AFF ( rF − AFC vC ) .
Plugging this into the lower block yields

−1
ACF AFF ( rF − AFC vC ) + ACC vC =
rC ,
( −1
⇒ ACC − ACF AFF )
AFC vC = −1
rC − ACF AFF rF .

164
An Abstract View of AMG

Conclusion: the “ideal” prolongation and restriction are

 − AFF
−1

( )
AFC −1
P=  , R= − ACF AFF I ,
 I 

with the coarse-grid operator given by

−1
=
AC =
RAP ACC − ACF AFF AFC .

165
An Abstract View of AMG

In particular, it is straightforward to verify that the two-


level solution is exact in this case, provided that either a
pre-relaxation or a post-relaxation eliminates rF .
(If this is done by post-relaxation, only uF should be
relaxed.)
The problem: AFF-1 is not sparse, and therefore, neither are
P and R. Therefore, we generally look for good sparse
approximations.
One exception is tri-diagonal systems, where AFF is diagonal.
In this case the multigrid V-cycle with the appropriate
prolongation and restriction, and with relaxation only on uF
is166
an exact solver, equivalent to total reduction.

You might also like