Conservative Iterative Methods For Implicit Discre
Conservative Iterative Methods For Implicit Discre
Conservative Iterative Methods For Implicit Discre
1
Centre for mathematical sciences, Lund University, Lund, Sweden.
email: philipp.birken@na.lu.se
viktor.linders@math.lu.se
Abstract
Conservation properties of iterative methods applied to implicit finite
volume discretizations of nonlinear conservation laws are analyzed. It
is shown that any consistent multistep or Runge-Kutta method is glob-
ally conservative. Further, it is shown that Newton’s method, Krylov
subspace methods and pseudo-time iterations are globally conserva-
tive while the Jacobi and Gauss-Seidel methods are not in general. If
pseudo-time iterations using an explicit Runge-Kutta method are ap-
plied to a locally conservative discretization, then the resulting scheme
is also locally conservative. However, the corresponding numerical flux
can be inconsistent with the conservation law. We prove an extension
of the Lax-Wendroff theorem, which reveals that numerical solutions
based on these methods converge to weak solutions of a modified con-
servation law where the flux function is multiplied by a particular con-
stant. This constant depends on the choice of Runge-Kutta method
but is independent of both the conservation law and the discretiza-
tion. Consistency is maintained by ensuring that this constant equals
unity and a strategy for achieving this is presented. Experiments show
that this strategy improves the convergence rate of the pseudo-time
iterations.
1 Introduction
Conservation laws arise ubiquitously in the modelling of physical phenomena
and their discretizations have been the subject of intense study; see e.g. [17,
1
Chapter 1] and [18, Chapter 1]. They derive their name from the fact
that they describe the conservation of some quantities of interest over time.
In computational fluid dynamics (CFD), these quantities are usually mass,
momentum and energy. For convenience we always refer to the conserved
quantity as the ”mass” in this paper.
Many numerical schemes have been designed to mimic mass conserva-
tion. Throughout, we refer to such schemes as globally conservative. As
special cases, locally conservative schemes dictate that local variations in
the mass propagate from one computational cell to neighboring ones with-
out ”skipping” any cells (typically referred to as ”conservative schemes” in
the literature). In particular, finite volume methods are designed in part
upon this principle, although many other schemes possess the same prop-
erty [13]. A major result on the topic is the Lax-Wendroff theorem, see
e.g. [16], which provides sufficient conditions for numerical solutions of con-
vergent and locally conservative discretizations to converge to weak solutions
of the corresponding conservation law.
For stiff problems, e.g. the simulation of wall bounded viscous flows, [23],
implicit time stepping methods are used. Such discretizations result in sys-
tems of linear or nonlinear equations. In the context of fluid flow simulations,
these systems are sparse and very large. Their solutions are approximated
using iterative methods, see e.g. [1, 2, 4–8, 10–12] and the references therein.
It is natural to ask whether such approximate solutions satisfy the con-
servation properties upon which the discretizations are based, and in partic-
ular, whether a Lax-Wendroff type result is available. It is typically claimed
that this is not the case, e.g. in [15], where a version of a Newton-Jacobi
iteration was noted to violate global conservation. In this paper we set out
to answer this question in greater detail for some well-known (families of)
iterative methods. We consider conservative implicit space-time discretiza-
tions whose solutions in each time step are approximated by a fixed number
of iterations with different iterative methods.
After introducing relevant concepts and definitions in section 2, we in-
vestigate the global conservation of iterative methods in section 3. It is
shown that Newton’s method, Krylov subspace methods and pseudo-time
iterations using Runge-Kutta (RK) methods are globally conservative if the
initial guess has correct mass. However, the Jacobi and Gauss-Seidel meth-
ods in general are not. In section 4 we focus on pseudo-time iterations using
explicit RK methods. We show that such methods are locally conservative.
By fixing the number of iterations and considering the limit of infinitesimal
space-time increments, an extension of the Lax-Wendroff theorem is given.
It turns out that the numerical solution in general converges to a weak solu-
tion of a modified conservation law for which the flux function is multiplied
by a particular constant. An expression for this modification constant is
given that only depends on the choice of Runge-Kutta method and the se-
lected pseudo-time steps. A technique for ensuring that the constant equals
2
one is presented, thereby ensuring consistency of the resulting scheme. Ex-
periments indicate that the new technique improvess the convergence rate
of the pseudo-time iterations. Numerical tests corroborate these findings
and further suggest that the results hold for systems of conservation laws in
multiple dimensions. Conclusions are drawn in section 5.
Throughout the paper we separate quantities that belong to a spatial
discretization from those that do not. Any vector living on a grid with
m cells is denoted by a lower case bold letter, e.g. u ∈ Rm . Similarly,
any matrix operating on such a vector is represented by a bold upper case
letter, e.g. A ∈ Rm×m . In Section 4, we consider s-stage Runge-Kutta
methods. Vector quantities spanning these stages are denoted by bold and
underlined lower case letters, e.g. b ∈ Rs . Matrices operating on the stages
are represented by bold and underlined upper case letters e.g. A ∈ Rs×s .
ut + ∇ · f (u) = 0, t ∈ (t0 , te ], x ∈ Ω ⊂ Rd ,
(1)
u(x, t0 ) = u0 (x),
possibly with additional boundary conditions. Here, the kth component uk
of the vector u ∈ Rq represents the concentration Rof a quantity, i.e. the total
amount of that quantity in a domain is given by Ω uk dx. The conservation
law states that this amount is only changed by flow across the boundary of
the domain. Assuming that the net flow across the boundary is zero, we
thus have for all t ∈ (t0 , te ] that
Z Z
u(x, t)dx = u0 (x)dx. (2)
Ω Ω
In this paper we focus on scalar conservation laws in 1D, i.e. the case
when d = q = 1 in (1). Numerical experiments in section 4 suggest that the
results of the paper can be generalized to systems in multiple dimensions,
however this task is deferred for future work. Further, we restrict our anal-
ysis to the Cauchy problem for (1), or the problem with periodic boundary
conditions. Thus, the conservation law of interest becomes
ut + fx = 0, t ∈ (t0 , te ], x ∈ Ω,
(3)
u(x, t0 ) = u0 (x),
3
A very successful line of research in computational fluid dynamics is to
construct numerical methods that respect (2) on a discrete level. Herein,
we predominantly consider finite volume methods that utilize the implicit
Euler method as temporal discretization. Let the computational grid be
given by (xi , tn ) = (i∆x, n∆t) with n ≥ 0. This grid may be either infinite
or finite and periodic in space, i.e. xi+m = xi for some positive integer m.
We consider discretizations on the form
un+1 − uni 1 ˆ
i
+ fi+ 1 − fˆi− 1 = 0. (4)
∆t ∆x 2 2
Here, uni approximates the solution u(x, tn ) of (3) in the computational cell
Ωi . The notation fˆi+ 1 (u) is used as a placeholder for a generic numerical
2
flux of the form fˆ(un+1 , . . . , un+1 ), where p, q ∈ N determine the bandwidth
i−p i+q
of the stencil. Multiplying (4) by ∆x and summing over all i reveals that
X X
∆x un+1
i = ∆x uni . (5)
i i
Comparing this with (2) shows that the finite volume scheme discretely
mimics conservation of mass.
4
limit of vanishing ∆x and ∆t, the theorem provides sufficient conditions for
u to be a weak solution of the conservation law (1) [17, Chapter 12]. More
precisely, consider a sequence of grids (∆xℓ , ∆tℓ ) such that ∆xℓ , ∆tℓ → 0
as ℓ → ∞. Let Uℓ (x, t) denote the piecewise constant function that takes
the solution value uni in (xi , xi+1 ] × (tn−1 , tn ] on the ℓth grid. We make the
following assumptions:
Assumption 1.
1. There is a function u(x, t) such that over every bounded set Ω = [a, b]×
[0, T ] in x-t space,
ut = f̂ (u), t ∈ (t0 , te ],
(6)
u(t0 ) = u0 ,
with u(t), f̂ ∈ Rm .
5
Definition 3. A discretization f̂ = (fˆ1 (u), . . . , fˆm (u))⊤ is globally conser-
vative if
Xm
|Ωi |fˆi (u) = 0.
i=1
6
Consistent multistep methods satisfy as = 1 and a0 + · · · + as−1 = −1.
Hence, global conservation is ensured.
Similarly, all Runge-Kutta (RK) methods are globally conservative. Con-
sider an s-stage RK method
s
X
n+1 n
u = u + ∆t bj k j ,
j=1
s
! (8)
X
n
kj = f̂ u + ∆t aj,l kl , j = 1, . . . , s,
l=1
where aj,l and bj are given by the method. Since f̂ is globally conservative
it follows from Definition 3 that the mass of kj is zero and therefore that
m
X m
X s
X m
X m
X
|Ωi |un+1
i = |Ωi |uni + ∆t bj |Ωi |kji = |Ωi |uni .
i=1 i=1 j=1 i=1 i=1
satisfies
m
X m
X
|Ωi |xi = |Ωi |bi .
i=1 i=1
7
Proof. Multiplying the ith element of (10) by |Ωi | and summing over all i
gives
m
X m
X m
X m
X ′
′
|Ωi |bi = |Ωi | I + αf̂ x = |Ωi |xi + α |Ωi | f̂ x .
i i
i=1 i=1 i=1 i=1
8
also has zero mass. Thus, the mass of u(k+1) is the same as that of u(k) ,
which by assumption is the same as that of un . Hence, Newton’s method is
globally conservative.
In large scale practical applications, each linear system in (12) is solved
approximately using other iterative methods. In the following, we consider
a generic linear system
(I − αA)u = b, (13)
that may either represent a Newton iteration or a linear discretization of a
conservation law. In line with Lemma 2, it is assumed that the mass of Ay
is zero for any vector y ∈ Rm .
By assumption, the matrix-vector product Au(k) has zero mass, hence the
final term vanishes. Further, if u(k) has the same mass as b, then the
second term on the right-hand side vanishes. By induction it follows that
the Richardson iteration is globally conservative.
9
this into (14), multiplying by I − αD, then adding and subtracting αDu(k) ,
gives after some rearrangement
Thus, u(k+1) has correct mass, at least as long as αa 6= 1. Hence, the Jacobi
iteration is conservative in this special case.
10
By assumption, the final term vanishes. However, the second term on the
right-hand side is in general non-zero. Thus, the Gauss-Seidel method is not
globally conservative and the conservation error is given by
m
X m
X (k+1) (k)
α |Ωi | ai,j (uj − uj ).
i=1 j=i+1
However, note that for problems where A is lower triangular the Gauss-
Seidel iteration is indeed conservative. In fact, for such problems the method
is by definition exact.
The final term vanishes by assumption. Note that the mass of r (0) is zero if
the mass of u(0) equals that of b. Thus, the second term on the right-hand
side also vanishes. All Krylov subspace methods are therefore conservative
if the initial guess is chosen to have the same mass as the right-hand side.
Typically a preconditioner P −1 will pre- or post-multiply A in Krylov
subspace methods. We will not delve into this subject here but merely
observe that if y and P −1 y have the same mass for any choice of y, then
the above analysis applies without modification.
11
3.4 Multigrid methods
Multigrid methods combine two methods, namely a smoother and a coarse
grid correction (CGC). The idea is to separate the residual into high and
low frequency components. As a smoother, an iterative method is applied,
designed to effectively damp the high frequency components.
The role of the CGC is to remove the low frequency components of
the residual. The residual is mapped to a coarser grid using a restriction
operator, R. Smoothing is then applied and a prolongation operator P
is used to reconstruct the residual on the fine grid. Finally, the fine grid
solution is updated by adding the correction to the previous iterate. The
procedure can be applied to a hirearchy of grids, thereby resulting in a
multigrid method. If we consider only two grid levels and assume that the
system is solved exactly on the coarser one, then the coarse grid correction
for the linear problem (13) has the form
u(k+1) = (I − P (I − αA)−1
ℓ−1 R(I − αA)ℓ )u
(k)
+ P (I − αA)−1
ℓ−1 Rb. (16)
Here, the notation (·)ℓ and (·)ℓ−1 denote matrices operating on the fine and
the coarse grid respectively.
The easiest way to make a multigrid method globally conservative is
to choose its components to be globally conservative. The smoother can
be any of the globally conservative iterations discussed so far. By Lemma
3, the inverted matrices in (16) are mass conserving. Thus, it remains to
choose globally conservative restriction and prolongation operators R and
P . This is achieved using agglomeration. The restriction is performed
by agglomerating a number of neighboring cells by constructing a volume
weighted average of fine grid values. Conversely, the prolongation is done
by injecting coarse grid values at multiple fine grid points. In fact, this is
the standard choice of multigrid method in CFD since it results in faster
convergence than other alternatives. For details, see [3].
12
of equations whose solutions are once again approximated using iterative
methods [5]. Whether the resulting approximation is conservative depends
on the choice of method. Any one of the conservative methods discussed so
far can be applied in principle. We will return to pseudo-time iterations in
Section 4, where more details are provided.
13
(GM) GMRES without restarts or preconditioning.
(CGC) A two-level coarse grid correction.
(H) Pseudo-time iterations using Heun’s method with ∆τ = 0.5.
As prolongation and restriction operators for the CGC, agglomeration gives
1 1
1 1 1 ⊤
R= , P = 2R .
2 ..
.
The same methods are used directly on the advection problem (18) with un
as initial guess.
First we consider very coarse discretizations with ∆x = ∆t = 0.5 for both
problems. A single iteration is used with each method and the mass error
and residual is computed after one time step. The results are shown in Table
1. Here, any error smaller than 10−15 is denoted as zero. The Richardson
iteration, GMRES, CGC and the pseudo-time iterations are conservative for
both problems. Similarly, the exact Newton method (Exact) is conservative.
As expected, the Jacobi method is conservative for the advection problem
but not for Burgers’ equation. Gauss-Seidel is not conservative for any of
the two problems.
Table 1: Mass errors and residuals after one time step with ∆t = ∆x = 0.5
for the advection and Burgers’ equations using a single iteration. Computed
values smaller than 10−15 are denoted by 0.
14
10 -6
0.01 20
0 15
-0.01 10
5
-0.02
0
-0.03
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6
Figure 1: Mass error for iterative methods applied to the advection equation
(18) and Burgers’ equation (19).
the previous test is that Gauss-Seidel has a non-detectible mass error for
Burger’s equation. This is in line with our expectations since the Jacobian
only has a single nonzero element above the main diagonal.
15
Several different methods are available for iterating in pseudo-time [5, 28].
Herein, we use an explicit s-stage Runge-Kutta method (ERK). Let (A, b, c)
denote the coefficient matrix and vectors of the ERK method. We denote the
(k) (k+1)
kth pseudo-time iterate by ui . The subsequent iterate ui is computed
(k)
from ui as
Xs
(k+1) (k) (k)
ui = ui − ∆τk bj gi U j , (21)
j=1
(k)
where the stage vectors U j , j = 1, . . . , s have elements
j−1
X
(k) (k)
Ujι = u(k)
ι − ∆τk aj,l gι U l , ι = i − p, . . . , i + q. (22)
l=1
(N )
where the flux ĥi+ 1 is given by
2
N −1 N −1
!
(N )
X Y (k)
ĥi+ 1 = µk b⊤ (I + µk A)−1 φ(−µl ) f̂ i+ 1 (25)
2 2
k=0 l=k+1
(k)
⊤
(k) (k)
and f̂ i+ 1 = fˆi+ 1 U 1 , . . . , fˆi+ 1 U s .
2 2 2
16
Remark 1. The product in (25) is empty when k = N − 1. To handle this
case we use the convention
N
Y −1
φ(−µl ) = 1.
l=N
(N )
where ĥi+ 1 , given by (25), is a numerical flux consistent with the flux
2
c(µ0 , . . . , µN −1 )f , with
N
Y −1
c(µ0 , . . . , µN −1 ) = 1 − φ(−µl ). (27)
l=0
Proof. Firstly, the locally conservative form (26) follows from setting un+1 i =
(N )
ui in (24). Secondly we recall that the numerical flux fˆi+ 1 depends on
2
p + q + 1 parameters, e.g. fˆ 1 (w) = fˆ(wi−p , . . . , wi+q ). Inspecting the flux
i+ 2
(N )
ĥi+ 1 we see that it is additionally dependent on uni so that we may write
2
(N )
ĥi+ 1 (w) = ĥN (wi−p , . . . , wi+q ; uni ). To establish the consistency we must
2
therefore show that ĥN (u, . . . , u; u) = cf (u). To do this, we first note from
(20) that gi (u, . . . , u; u) = 0 due to the consistency of fˆi+ 1 . Using the fact
2
that A is lower triangular for any ERK method, it follows from (21) and (22)
(k) (k)
that Ujι = ui = u for every j, k and ι = i − p, . . . , i + q. Consequently,
(k)
the vector f̂ i+ 1 becomes
2
k
f̂ (u, . . . , u; u) = fˆ(u, . . . , u), . . . , fˆ(u, . . . , u) = fˆ(u, . . . , u)1 = f (u)1,
17
where the consistency of fˆ has been used in the final equality. Inserting this
(N )
into the flux function ĥi+ 1 in (25) and using (23) gives
2
"N −1 N −1
#
(N )
X Y
⊤ −1
ĥi+ 1 = µk b (I + µk A) 1 φ(−µl ) f (u)
2
k=0 l=k+1
"N −1 N −1
#
X Y
= [1 − φ(−µk )] φ(−µl ) f (u)
k=0 l=k+1
N
" −1
−1 N N −1
#
X Y Y
= φ(−µl ) − φ(−µl ) f (u)
k=0 l=k+1 l=k
" N −1
#
Y
= 1− φ(−µl ) f (u)
l=0
= c(µ0 , . . . , µN −1 )f (u),
where we have utilized the fact that the sum in the third equality is telescop-
(N )
ing. Finally, note that ĥi+ 1 is formed by a linear combination of evaluations
2
of fˆi+ 1 and is therefore Lipschitz continuous.
2
(N )
The fact that ĥi+ 1 is inconsistent with f when c 6= 1 is a manifestation
2
of the error associated with the pseudo-time iterations. This has important
implications on the convergence of the resulting scheme. What follows is an
extension of the Lax-Wendroff theorem (see [17, Chapter 12]) pertaining to
ERK pseudo-time iterations:
Theorem 6. Consider a sequence of grids (∆xℓ , ∆tℓ ) such that ∆xℓ , ∆tℓ →
(0)
0 as ℓ → ∞. Fix N independently of ℓ, set ui = uni and terminate the
pseudo-time iterations after N steps. Let ∆τk,ℓ/∆tℓ = µk,ℓ = µk be constants
independent of ℓ for each k = 0, . . . , N −1. Suppose that the numerical flux fˆ
in (4) is consistent with f and that Assumption 1 is satisfied. Then, u(x, t)
is a weak solution of the conservation law
N
Y −1
ut + c(µ0 , . . . , µN −1 )fx = 0, c(µ0 , . . . , µN −1 ) = 1 − φ(−µl ). (28)
l=0
Proof. We follow the proof of the Lax-Wendroff theorem given in [17, Chap-
ter 12] with changes and details added where necessary. Throughout the
proof we let Uℓ (x, t) denote the piecewise constant function that takes the
solution value uni in (xi , xi+1 ] × (tn−1 , tn ] on the ℓth grid. Similarly, for
18
(k)
j = 1, . . . , s we let Ujℓ (x, t) be the piecewise constant function that takes
(k)
the value Uji in (xi , xi+1 ] × (tn−1 , tn ].
The discretization (26) can equivalently be expressed as
(N ) (N )
∆xℓ [Uℓ (xi , tn+1 ) − Uℓ (xi , tn )] + ∆tℓ ĥi+ 1 − ĥi− 1 = 0. (29)
2 2
At this point we will use summation by parts on the sums in (30). Recall
that for two sequences (ai ) and (bi ), the summation by parts formula can
be expressed as
M
X M
X
aj (bj − bj−1 ) = aM bM − a1 b0 − (aj+1 − aj )bj .
j=1 j=1
Applied to the n-sum in the first term and to the i-sum in the second term
in (30), this results in
"∞ ∞
X X ϕ(xi , tn ) − ϕ(xi , tn−1 )
∆xℓ ∆tℓ Uℓ (xi , tn )
∆tℓ
n=1 i=−∞
∞ X ∞ #
X ϕ(xi+1 , tn ) − ϕ(xi , tn ) (N )
+ ĥi+ 1 (31)
∆xℓ 2
n=0 i=−∞
∞
X
= −∆xℓ ϕ(xi , 0)Uℓ (xi , 0).
i=−∞
Here we have used the fact that ϕ has compact support in order to eliminate
all boundary terms except the one at n = 0.
We now let ℓ → ∞ and investigate the convergence of the terms arising
in (31). The first and third terms are identical to those in the proof of
19
the Lax-Wendroff theorem [17, Chapter 12]. Thus, it can immediately be
concluded that the first term converges to
Z ∞Z ∞
ϕt u(x, t)dxdt
0 −∞
∞ X ∞
X ϕ(xi+1 , tn ) − ϕ(xi , tn )
∆xℓ ∆tℓ
n=0 i=−∞
∆xℓ
N −1 N −1
!
X Y (k)
⊤ −1
µk b (I + µk A) φ(−µl ) f̂ i+ 1 .
2
k=0 l=k+1
in order to obtain
∞ X ∞
X ϕ(xi+1 , tn ) − ϕ(xi , tn )
∆xℓ ∆tℓ
∆xℓ
n=0 i=−∞
"N −1 N −1
!#
X Y
µk b⊤ (I + µk A)−1 1 φ(−µl ) f (Uℓ (xi , tn ))
k=0 l=k+1
∞ ∞
X X ϕ(xi+1 , tn ) − ϕ(xi , tn )
+∆xℓ ∆tℓ
∆xℓ
n=0 i=−∞
N −1 N −1
!
X Y (k)
⊤ −1
µk b (I + µk A) φ(−µl ) f̂ i+ 1 − f (Uℓ (xi , tn ))1 .
2
k=0 l=k+1
20
The bracketed part of this expression evaluates to c(µ0 , . . . , µN −1 ), as in the
proof of Theorem 5. Thus, the first half of this expression equals
∞ X
∞
X ϕ(xi+1 , tn ) − ϕ(xi , tn )
c(µ0 , . . . , µN −1 )∆xℓ ∆tℓ f (Uℓ (xi , tn )).
∆xℓ
n=0 i=−∞
(k)
vanishes in the limit ℓ → ∞. Since Uℓ and Ujℓ are both constant in
(xi , xi+1 ] × (tn−1 , tn ], we can rewrite (32) as
∞ X
X ∞ Z xi+1 Z tn
ϕx (x, tn )
n=0 i=−∞ xi tn−1
N −1 N −1
!
X Y (k)
⊤ −1
µk b (I + µk A) φ(−µl ) f̂ i+ 1 (x, t) − f (Uℓ (x, t))1 .
2
k=0 l=k+1
(k)
Here, f̂ i+ 1 (x, t) is placeholder notation for the vector whose jth element is
2
(k) (k)
fˆi+ 1 Ujℓ (x − p∆x, t), . . . , Ujℓ (x + q∆x, t) .
2
21
tends to zero for almost every x. By the Cauchy-Schwarz and triangle
inequalities, (33) is bounded by
N −1 N −1
!
X Y (k)
⊤ −1
kµk b (I + µk A) k |φ(−µl )| f̂ i+ 1 (x, t) − f (Uℓ (x, t))1 ,
2
k=0 l=k+1
(34)
where the Euclidean norm in Rs is used. The only term in (34) that depends
(k)
on ℓ is f̂ i+ 1 (x, t) − f (Uℓ (x, t))1 and it therefore suffices to show that
2
this term vanishes for almost every x.
Recall that f (Uℓ ) = fˆi+ 1 (Uℓ , . . . , Uℓ ) by consistency. A standard norm
2
inequality gives
(k)
f̂ i+ 1 (x, t) − f (Uℓ (x, t))1
2
√
(k) (k)
ˆ
≤ s max fi+ 1 Ujℓ (x − p∆x, t), . . . , Ujℓ (x + q∆x, t)
1≤j≤s 2
The second of these terms appear in the proof of the Lax-Wendroff theorem
[17, Chapter 12] and vanishes in the limit for almost every x due to the
bounded total variation of Uℓ . Further, from (22) and (21) and the fact
(0) (k)
that ui = uni it follows that for every j = 1, . . . , s and k ≥ 0, Ujι = unι
for each ι = i − p, . . . , i + q in the limit of vanishing ∆τℓ . Consequently,
(k)
Ujℓ (·, t) − Uℓ (·, t) vanishes identically as ℓ → ∞.
22
A few remarks about Theorem 6 are in place: First, we demand from a
useful iterative method that it converges to the correct solution as N → ∞.
Thus, the pseudo-time steps should be chosen in a way that ensures that
c → 1, or equivalently,
N
Y −1
φ(−µl ) → 0 as N → ∞. (35)
l=0
un+1 − uni 1
i
+ un+1
i − un+1
i−1 = 0, i = 1, . . . , m.
∆t ∆x
Throughout the experiments the temporal and spatial increments are chosen
to be equal: ∆t = ∆x.
We validate Theorem 6 by studying the convergence of the numerical
scheme to the solution of the original advection problem (18) as well as to
the solution of the modified version ut + cux = 0. As pseudo-time iteration
we use the explicit Euler method, Heun’s method and the third order strong
23
stability preserving RK method SSPRK3 [26]. Theorem 6 predicts that
these methods respectively will modify the propagation speed by the factor
N
Y −1
c(µ0 , . . . , µN −1 ) = 1 − (1 − µl ),
l=0
N −1
Y µ2
c(µ0 , . . . , µN −1 ) = 1 − 1 − µl + l ,
2
l=0
N −1
Y µ2 µ3
c(µ0 , . . . , µN −1 ) = 1 − 1 − µl + l − l .
2 6
l=0
24
10 -1 10 0
10 -2 10 -1
10 -3 10 -2
10 -4 10 -3
10 -5 10 -4
10 -6
10 -6 10 -4 10 -2 10 -6 10 -4 10 -2
1
(a) Constant pseudo-time steps, µl = 20 . (b) Variable pseudo-time steps, µl = 2−l .
contribution to the speed from dispersion errors built into the discretization;
see e.g. [19–22, 29, 30] for details and remedies.
In the next experiment we verify that c depends on N as predicted by
Theorem 6 by measuring the propagation speed c̄ of the numerical solution
while varying N . To measure c̄ we track the x-coordinate of the maximum
of the propagating pulse through time. To get accurate measurements we
extend the computational domain to x ∈ (−1/5, 1/5], t ∈ (0, 6].
For this experiment we fix ∆x = 0.003 and set µl = µ to be fixed for each
l = 0, . . . , N − 1. The measured propagation speed error 1 − c̄ for different
µ are shown in Fig. 3a for Heun’s method and in Fig. 3b for SSPRK3. The
solid lines show the theoretically predicted error,
1 − c(µ0 , . . . , µN −1 ) = φ(−µ)N .
The measured and theoretical speed errors agree well for µ = 0.05 and
µ = 0.2. For µ = 0.5 a discrepancy between theory and measurement is
seen for small errors. This suggests that the dispersion error intrinsic to the
finite volume scheme is starting to dominate the propagation speed error.
25
10 0 10 0
10 -2
10 -2
10 -4
10 -4
0 5 10 15 20 0 5 10 15 20
Figure 3: Proparagion speed error vs number of iterations per time step for
the linear advection problem.
26
10 -1 10 -1
10 -2 10 -2
10 -3 10 -3
10 -4 10 -4
10 -6 10 -4 10 -2 10 -6 10 -4 10 -2
2 − u2r 1
s= = ,
ul − ur 2
where ul,r denote left and right states of the discontinuity respectively. The
shock speed of the modified conservation law is instead c/2.
We use the exact same grids and pseudo-time iterations as for the trian-
gular shock and measure the L2 error with respect to the exact and modified
27
0.6
0.5
0.4
0.3
0.2
0.1
-0.1
0 0.2 0.4 0.6 0.8 1
Figure 5: Numerical solutions and predicted shock locations (crosses) for the
solution of Burgers’ equation (19) using different numbers of pseudo-time
iterations, N . The dotted line indicates the exact solution. The dashed line
shows the initial data.
conservation laws. The results are shown in Fig. 4b. Convergence is once
again seen towards the modified equation, thus verifying that Theorem 6
predicts the propagation speed error correctly despite the added boundary
condition.
Next we extend the time domain to t ∈ (0, 1], fix ∆x = ∆t = 4∆τ =
1/100 and vary the number of iterations, N . The computed solutions using
N = 1, N = 3 and N = 12 are shown in Fig. 6 together with the predicted
shock locations (dashed lines). Once again, there is good agreement between
prediction and experiment, although numerical dissipation smears the shock
fronts somewhat.
As mentioned previously, the shock speed error can be eliminated entirely
for the explicit Euler method by noting that µ = 1 is a root of the stability
polynomial 1 − µ. To highlight the effect of this, we introduce two strategies
for choosing the pseudo-time steps:
Strategy 1: Use N = 12 iterations with ∆τ0,...,11 = ∆t/4.
Strategy 2: Use N = 9 iterations with ∆τ0 = ∆t and ∆τ1,...,8 = ∆t/4.
28
1.2
0.8
0.6
0.4
0.2
-0.2
0 0.2 0.4 0.6 0.8 1
The first strategy is the same that we have used in the experiments so far.
The second one ensures that c = 1 by taking a large initial pseudo-time
step. Note that both strategies correspond to integration in pseudo-time to
the same point; τ = 3∆t.
The relative residuals of the pseudo-time iterates in the first physical time
step are shown in Fig. 7 for the two strategies. The large initial pseudo-time
step in Strategy 2 results in a considerably greater residual reduction than
the corresponding iteration using Strategy 1. Interestingly, subsequent
iterations yield faster convergence of the residual using Strategy 2 as seen
by the steeper gradient. This suggests that the incorrect shock speed makes
the dominant and slowest converging contribution to the residual for this
problem when using Strategy 1. We also conclude that the point to which
we march in pseudo-time, here τ = 3∆t, have less of an impact on the
convergence than the choice of pseudo-time steps used to reach this point.
29
10 0
10 -1
10 -2
0 2 4 6 8 10
posed on the domain (x, y) ∈ (−5, 15] × (−5, 5]. Here, ρ, u, v, E and p re-
spectively denote density, horizontal and vertical velocity components, total
energy per unit mass and pressure. The pressure is related to the other
variables through the equation of state
p = (γ − 1)ρe,
where γ = 1.4 and e = E − (u2 + v 2 )/2 is the internal energy density. The
domain is taken to be periodic in both spatial coordinates. The setting is
30
the isentropic vortex problem [25] with initial conditions
1
γ−1
1 − ǫ2 (γ − 1)M∞ 2
ρ0 = exp (r) ,
8π 2
ǫy
u0 =1− exp (r/2),
2π
ǫx
v0 = exp (r/2),
2π
ργ0
p0 = 2
,
γM∞
31
Figure 8: Computed density of the isentropic vortex at t = 10 using Strat-
egy 1 and Strategy 2. Dashed lines mark the vortex locations predicted
by Theorem 6.
32
10 -1
10 0
10 -2
10 -3
10 -1
10 -4
10 -2 10 -1 10 0 0 2 4 6 8
Figure 9: (a) L2 -error computed with respect to the exact solution using
Strategy 1 (S1) and Strategy 2 (S2), and with respect to the modified
conservation law (S1 mod). (b) Relative residuals at each iteration for 200
physical time steps.
factors than the propagation speed for this particular problem. Nonetheless,
a correct propagation speed is visibly very beneficial, here with a drop in
relative residual of more than an order of magnitude.
33
ing on the problem solved, this modification can alter both the propagation
speed and amplitude of the numerical solution if the constant differs from
unity. We present a strategy for ensuring that the constant equals one and
show numerically that the strategy results in faster convergence. Experi-
ments suggest that the results hold even for systems of conservation laws in
multiple dimensions.
References
[1] F. Bassi, A. Ghidoni, and S. Rebay, Optimal Runge–Kutta
smoothers for the p-multigrid discontinuous Galerkin solution of the
1D Euler equations, Journal of Computational Physics, 230 (2011),
pp. 4153–4175.
[2] F. Bassi and S. Rebay, GMRES discontinuous Galerkin solution
of the compressible Navier-Stokes equations, in Discontinuous Galerkin
Methods, Springer, 2000, pp. 197–208.
[3] P. Birken, Numerical methods for the unsteady compressible Navier–
Stokes equations, Habilitation thesis, University of Kassel, Kassel,
(2012).
[4] , Numerical Methods for Unsteady Compressible Flow Problems,
Chapman and Hall/CRC (to appear), 2021.
[5] P. Birken, J. Bull, and A. Jameson, Preconditioned smoothers
for the full approximation scheme for the RANS equations, Journal of
Scientific Computing, 78 (2019), pp. 995–1022.
[6] P. Birken, G. Gassner, M. Haas, and C.-D. Munz, Precondition-
ing for modal discontinuous Galerkin methods for unsteady 3D Navier–
Stokes equations, Journal of Computational Physics, 240 (2013), pp. 20–
35.
[7] P. Birken, G. J. Gassner, and L. M. Versbach, Subcell finite
volume multigrid preconditioning for high-order discontinuous Galerkin
methods, International Journal of Computational Fluid Dynamics, 33
(2019), pp. 353–361.
[8] P. Birken and A. Jameson, On nonlinear preconditioners in
Newton–Krylov methods for unsteady flows, International journal for
numerical methods in fluids, 62 (2010), pp. 565–573.
[9] P. Birken, A. Meister, S. Ortleb, and V. Straub, On stability
and conservation properties of (s)EPIRK integrators in the context of
34
discretized PDEs, in XVI International Conference on Hyperbolic Prob-
lems: Theory, Numerics, Applications, Springer, 2016, pp. 617–629.
[10] P. Birken, J. D. Tebbens, A. Meister, and M. Tůma, Precon-
ditioner updates applied to CFD model problems, Applied Numerical
Mathematics, 58 (2008), pp. 1628–1641.
[11] D. S. Blom, P. Birken, H. Bijl, F. Kessels, A. Meister, and
A. H. van Zuijlen, A comparison of Rosenbrock and ESDIRK meth-
ods combined with iterative solvers for unsteady compressible flows, Ad-
vances in Computational Mathematics, 42 (2016), pp. 1401–1426.
[12] B. Cockburn, G. Kanschat, and D. Schötzau, A locally con-
servative LDG method for the incompressible Navier–Stokes equations,
Mathematics of Computation, 74 (2005), pp. 1067–1095.
[13] T. C. Fisher and M. H. Carpenter, High-order entropy stable finite
difference schemes for nonlinear conservation laws: Finite domains,
Journal of Computational Physics, 252 (2013), pp. 518–557.
[14] A. Jameson, W. Schmidt, and E. Turkel, Numerical solution of
the Euler equations by finite volume methods using Runge Kutta time
stepping schemes, in 14th fluid and plasma dynamics conference, 1981,
p. 1259.
[15] C. Junqueira-Junior, L. C. Scalabrin, E. Basso, and J. L. F.
Azevedo, Study of conservation on implicit techniques for unstructured
finite volume Navier–Stokes solvers, Journal of Aerospace Technology
and Management, 6 (2014), pp. 267–280.
[16] P. Lax and B. Wendroff, Systems of conservation laws, tech. rep.,
LOS ALAMOS NATIONAL LAB NM, 1959.
[17] R. J. LeVeque, Numerical methods for conservation laws, vol. 3,
Springer, 1992.
[18] , Finite volume methods for hyperbolic problems, vol. 31, Cam-
bridge university press, 2002.
[19] V. Linders, M. H. Carpenter, and J. Nordström, Accurate
solution-adaptive finite difference schemes for coarse and fine grids,
Journal of Computational Physics, 410 (2020), p. 109393.
[20] V. Linders, M. Kupiainen, S. H. Frankel, Y. Delorme, and
J. Nordstrom, Summation-by-parts operators with minimal disper-
sion error for accurate and efficient flow calculations, in 54th AIAA
Aerospace Sciences Meeting, 2016, p. 1329.
35
[21] V. Linders, M. Kupiainen, and J. Nordström, Summation-by-
parts operators with minimal dispersion error for coarse grid flow cal-
culations, Journal of Computational Physics, 340 (2017), pp. 160–176.
[22] V. Linders and J. Nordström, Uniformly best wavenumber approx-
imations by spatial central difference operators, Journal of Computa-
tional Physics, 300 (2015), pp. 695–709.
[23] W. L. Miranker, Numerical methods of boundary layer type for stiff
systems of differential equations, Computing, 11 (1973), pp. 221–234.
[24] Y. Saad, Iterative methods for sparse linear systems, SIAM, 2003.
[25] C.-W. Shu, Essentially non-oscillatory and weighted essentially non-
oscillatory schemes for hyperbolic conservation laws, in Advanced nu-
merical approximation of nonlinear hyperbolic equations, Springer,
1998, pp. 325–432.
[26] C.-W. Shu and S. Osher, Efficient implementation of essentially
non-oscillatory shock-capturing schemes, Journal of computational
physics, 77 (1988), pp. 439–471.
[27] V. Straub, S. Ortleb, P. Birken, and A. Meister, Adopting
(s)EPIRK schemes in a domain-based IMEX setting, in AIP Conference
Proceedings, vol. 1863, AIP Publishing LLC, 2017, p. 410008.
[28] R. C. Swanson, E. Turkel, and C.-C. Rossow, Convergence ac-
celeration of Runge–Kutta schemes for solving the Navier–Stokes equa-
tions, Journal of Computational Physics, 224 (2007), pp. 365–388.
[29] C. K. Tam and J. C. Webb, Dispersion-relation-preserving finite dif-
ference schemes for computational acoustics, Journal of computational
physics, 107 (1993), pp. 262–281.
[30] L. N. Trefethen, Group velocity in finite difference schemes, SIAM
review, 24 (1982), pp. 113–136.
[31] G. Wanner and E. Hairer, Solving ordinary differential equations
II, Springer Berlin Heidelberg, 1996.
A Proof of Lemma 4
The purpose of this appendix is to provide a detailed proof of Lemma 4.
36
Proof. Note first that (21)-(22) can be equivalently written in the form
(k) (k)
Ui = ui 1 − ∆τk Ag i (U (k) ),
(k+1) (k)
i = . . . , −1, 0, 1, . . . , (40)
ui = ui − ∆τk b⊤ g i (U (k) ),
(k) (k) ⊤
where 1 = (1, . . . , 1)⊤ ∈ Rs . Here, we use the notation g i (U k ) = gi (U 1 ), . . . , gi (U s ) .
Suppose that for some N ≥ 1 the relation
(N )
ui − uni 1 (N ) (N )
+ ĥi+ 1 − ĥi− 1 = 0
∆t ∆x 2 2
holds. We will show that it also holds for N + 1. From (20), (24) and (40)
it follows that
" (N ) #
(N ) (N ) U i − uni 1 1 (N ) (N )
U i = ui 1 − ∆τN A + f̂ i+ 1 − f̂ i− 1
∆t ∆x 2 2
∆t (N ) (N )
= uni 1 − ĥi+ 1 − ĥi− 1 1
∆x 2 2
(N ) ∆τN (N ) (N )
− µN AU i + µN A1uni − A f̂ i+ 1 − f̂ i− 1 .
∆x 2 2
(N )
Solving for U i gives
(N ) ∆t (N ) (N )
Ui = uni 1 − ĥi+ 1 − ĥi− 1 (I + µN A)−1 1
∆x 2 2
∆t (N ) (N )
− µN (I + µN A)−1 A f̂ i+ 1 − f̂ i− 1 .
∆x 2 2
37
In the last equality we have used the fact that
I − µN (I + µN A)−1 A = (I + µN A)−1 [(I + µN A) − µN A] = (I + µN A)−1 .
(41)
(N ) (N +1)
Inserting the above expression for g i U into ui as given in (40)
leads to
(N +1) (N )
ui = ui − ∆τN b⊤ g i U (N )
n ∆t (N ) (N )
= ui − ĥi+ 1 − ĥi− 1
∆x 2 2
∆τN ⊤ −1 (N ) (N ) (N ) (N )
− b (I + µN A) f̂ i+ 1 − f̂ i− 1 − ĥi+ 1 − ĥi− 1 1
∆x 2 2 2 2
∆t (N ) (N )
= uni − [1 − µN b⊤ (I + µN A)−1 1] ĥi+ 1 − ĥi− 1
∆x 2 2
∆t (N ) (N )
− µN b⊤ (I + µN A)−1 f̂ i+ 1 − f̂ i− 1
∆x 2 2
∆t (N ) (N )
n (N ) (N ) ⊤ −1
= ui − φ(−µN ) ĥi+ 1 − ĥi− 1 + µN b (I + µN A) f̂ i+ 1 − f̂ i− 1 .
∆x 2 2 2 2
Rearranging, using the induction hypothesis and the expression (25) for ĥN
results in
(N +1)
ui − uni
0=
∆t
" N −1 N −1
!
1 X Y (k) (k)
+ φ(−µN ) µk b⊤ (I + µk A)−1 φ(−µl ) f̂ i+ 1 − f̂ i− 1
∆x 2 2
k=0 l=k+1
(N ) (N )
i
⊤
+ µN b (I + µN A)−1 f̂ i+ 1 − f̂ i− 1
2 2
(N +1) N N
!
ui − uni 1 X
⊤ −1
Y (k) (k)
= + µk b (I + µk A) φ(−µl ) f̂ i+ 1 − f̂ i− 1
∆t ∆x 2 2
k=0 l=k+1
(N +1)
ui − uni 1 (N +1) (N +1)
= + ĥi+ 1 − ĥi− 1 .
∆t ∆x 2 2
38
(0)
Solving for U i gives
(1)
Here we have once again used (41) in the final equality. Thus, ui can be
evaluated using (40) as
(0)
(1) ∆τ0 ⊤ (0) ∆t (1) (1)
ui = uni − b (I + µ0 A)−1 f̂ i+ 1 − f̂ i− 1 = uni − ĥi+ 1 − ĥi− 1 .
∆x 2 2 ∆x 2 2
39