Hannah Morgan
Abstract
Implicit difference schemes for nonlinear PDEs, such as the Korteweg-de Vries
(KdV) equation, require large systems of equations to be solved at each timestep,
leading to long computation times. A group of finite difference schemes for KdV is given
here, and a parallel algorithm is presented and implemented for one such scheme.
A computational model is compared to experimental results, and the performance and
scalability of the algorithm are discussed.
1 Introduction
The Korteweg-de Vries (KdV) equation, given by (1), is a nonlinear PDE first introduced in
[1] in 1895 to model low amplitude water waves in shallow, narrow channels like canals.

u_t + 6 u u_x + u_xxx = 0        (1)
It has since been applied to many other problems in physics and engineering including plasma
physics. KdV has been extensively studied both theoretically and numerically. Many finite
difference schemes have been derived for the equation including an explicit one by Zabusky
and Kruskal in [2], who discovered the existence of solitons, and an implicit one from Goda
in [3]. Spectral methods (e.g. [4] and [5]) and finite element methods (e.g. [6], [7], and [8])
have also been used to solve KdV. The accuracy of a given method is simple to measure
since KdV has a smooth analytical solution given by (2).
"√ #
c c
u(t, x) = sech2 (x − ct − a) (2)
2 2
With the rise of parallel numerical analysis, some work has been done to effectively im-
plement certain methods to solve KdV. For example, an intrinsically parallel finite difference
scheme was developed in [9] and a pseudospectral method was implemented in parallel in
[10]. In this paper, we discuss implementation techniques of a finite difference scheme for
KdV and use the block inverse algorithm, a parallel numerical algorithm given in [11], in an
attempt to improve the computation time needed to solve KdV.
2.1 Formulation
To obtain a finite difference method, we first write the first order Taylor series where x_{i+1} =
x_i + h and x_{i−1} = x_i − h:

y(x_{i+1}) = y(x_i) + h y′(x_i) + O(h²)
y(x_{i−1}) = y(x_i) − h y′(x_i) + O(h²)
Using the equations above, we can write a forward time and backward space scheme:

u_t = (u_{i+1,j} − u_{i,j}) / ∆t

f(u)_x = (f(u_{i,j}) − f(u_{i,j−1})) / ∆x
Now we can write a finite difference method for the nonlinear advection equation
u_t + f(u)_x = 0 and, given some initial data u(0, j) = g for j = 0, . . . , n, solve it using:

u_{i+1,j} = u_{i,j} − (∆t/∆x) (f(u_{i,j}) − f(u_{i,j−1}))
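As a minimal illustration (not taken from the paper's code, and assuming the flux f(u) = u²/2 and a periodic grid), one step of this update can be written in Octave as:

% One step of the upwind update above; the flux f(u) = u^2/2 and the
% periodic boundary are assumptions made for this illustration only.
dx = 0.5;  dt = 0.05;
x  = 0:dx:100;
u  = exp(-0.01*(x - 50).^2);                 % sample initial data
f  = u.^2/2;
u  = u - (dt/dx)*(f - circshift(f, 1));      % u_{i+1,j} = u_{i,j} - (dt/dx)(f_{i,j} - f_{i,j-1})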
Now we will write the third order Taylor series where x_{i−2} = x_i − 2h and x_{i−3} = x_i − 3h,
and look for coefficients a, b, c, d such that

a y(x_{i−3}) + b y(x_{i−2}) + c y(x_{i−1}) + d y(x_i) = h³ y′′′(x_i) + O(h⁴).

Matching terms of the Taylor expansions gives the system:

a + b + c + d = 0
−3a − 2b − c = 0
9a + 4b + c = 0
−27a − 8b − c = 6
Solving, we find a = −1, b = 3, c = −3, d = 1. Finally, we get:

y′′′(x_i) = (−y(x_{i−3}) + 3y(x_{i−2}) − 3y(x_{i−1}) + y(x_i)) / h³
Now we have a scheme for u_xxx:

u_xxx = (u_{i,j} − 3u_{i,j−1} + 3u_{i,j−2} − u_{i,j−3}) / ∆x³
Using Taylor series to rewrite each term, we can write a formulation for the KdV equation
u_t + f(u)_x + u_xxx = 0 as:

u_{i+1,j} = u_{i,j} − (∆t/∆x)(f(u_{i,j}) − f(u_{i,j−1})) − (∆t/∆x³)(u_{i,j} − 3u_{i,j−1} + 3u_{i,j−2} − u_{i,j−3})
2.2 Implementation
We will implement our method in several different ways in Octave. The first implementation uses
the built-in function filter to compute finite differences. Then we will write a matrix
formulation of the equation above and solve the problem using difference matrices in Octave.
Finally, we will increase the efficiency of our code by using sparse matrices.
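As background (this illustration is not from the original text): with a = 1, the Octave built-in filter(b, a, x) returns y(n) = b(1)x(n) + b(2)x(n−1) + · · ·, so it applies a backward finite difference stencil across a vector. For example:

% With a = 1, filter(b, 1, u) applies the stencil b to u and its lagged
% copies, producing backward first and third differences here.
u = (1:6).^2;                  % 1 4 9 16 25 36
filter([1 -1], 1, u)           % first differences: 1 3 5 7 9 11
filter([1 -3 3 -1], 1, u)      % third differences: 1 1 0 0 0 0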
We can then solve the KdV equation using the time-stepping scheme:

u = u - (dt/dx)*filter(b, a, f) - (dt/(dx*dx*dx))*filter(d, c, u)

A complete Octave code to solve KdV using filter is given below; the resulting plot is shown in Figure 1.

Figure 1: Resulting plot of KdV solution with Gaussian initial data using filter with
dt = 0.05, dx = 0.9, and nts = 40.
% Parameters
dx=.9;
dt=.05;
nts=40;
a=[1];
b=[1 -1];
c=[1];
d=[1 -3 3 -1];
r=1:dx:100;
% Gaussian initial data:
u0=1*exp(-(.05*(r-50)).^2);
u=u0;
% Time-stepping loop
for n=1:nts
  f=3*(u.*u);      % f(u) = 3u^2, as in the matrix formulation below
  u=u - (dt/dx)*filter(b,a,f) - (dt/(dx*dx*dx))*filter(d,c,u);
end
plot(r,u0,r,u);
For the matrix implementation, the backward differences above can be computed using the difference matrix given by:

D = [  1    0    0   ···   0
      −1    1    0   ···   0
       0   −1    1   ···   0
                ···
       0    0   ···  −1    1 ]
Then, to calculate f(u)_x = (f(u_{i,j}) − f(u_{i,j−1})) / ∆x, we can write:

f(u)_x = (1/∆x) D [ f(u_{i,1}), . . . , f(u_{i,n}) ]ᵀ
Similarly, we can use a matrix formulation to write y′′′ = (1/h³) D³ y. Then for u_xxx we have:

u_xxx = (1/∆x³) D³ [ u_{i,1}, . . . , u_{i,n} ]ᵀ,   where

D³ = [  1    0    0    0   ···   0
       −3    1    0    0   ···   0
        3   −3    1    0   ···   0
       −1    3   −3    1   ···   0
                      ···
        0    0   ···   3   −3    1 ]
Altogether, we have a time-stepping scheme:

[ u_{i+1,1}, . . . , u_{i+1,n} ]ᵀ = [ u_{i,1}, . . . , u_{i,n} ]ᵀ − (∆t/∆x) D [ f(u_{i,1}), . . . , f(u_{i,n}) ]ᵀ − (∆t/∆x³) D³ [ u_{i,1}, . . . , u_{i,n} ]ᵀ
In our code, however, u is a row vector, so we take the transpose of the matrices defined
above to compute finite differences. For the first difference, then, we have:
f(u)_x = (1/∆x) [ f(u_{i,1}), . . . , f(u_{i,n}) ] Dᵀ,   where

Dᵀ = [ 1  −1   0  ···   0
       0   1  −1  ···   0
              ···
       0   0  ···   1  −1
       0   0  ···   0   1 ]
In Octave, we populate the first order difference matrix fod and the third order tod and
solve KdV using the time stepping scheme:
u = u - ((dt/dx)*(3*(u .* u))*fod) - ((dt/(dx*dx*dx))*u*tod)
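One way to populate these (transposed) difference matrices, shown here as an illustrative sketch rather than the paper's original listing, is:

% Illustrative construction of the dense transposed difference matrices
% fod = D' and tod = (D^3)'; nr is assumed to be the number of grid points.
nr  = length(r);
D   = eye(nr) - diag(ones(nr-1,1), -1);   % backward-difference matrix
fod = D';                                 % 1 on the diagonal, -1 on the first superdiagonal
tod = (D^3)';                             % bands 1, -3, 3, -1 on and above the diagonal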
The full matrices can then be converted to sparse storage with:

fod = sparse(fod);
tod = sparse(tod);

However, we need not construct these full matrices fod and tod at all. Instead, we will initially
create the sparse difference matrices we need to compute finite differences. The command
S = sparse(i, j, s) uses the vectors i, j, and s (all of the same length) to generate a sparse
matrix such that

S(i(k), j(k)) = s(k)

for k = 1, . . . , length(i). A program for creating the sparse difference matrices fod and tod
is given below, producing output identical to the filter version above.
nr=length(r);       % number of spatial grid points
kc=0;
% main diagonal: entries equal to 1
for k=1:nr
kc=kc+1;
rindx(kc)=k;
cindx(kc)=k;
val(kc)=1;
end
% first superdiagonal: entries equal to -1
for k=1:(nr-1)
kc=kc+1;
rindx(kc)=k;
cindx(kc)=k+1;
val(kc)=-1;
end
fod=sparse(rindx, cindx, val);
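The construction of tod follows the same pattern; the sketch below (not the original listing) assumes the band structure derived above, with 1 on the diagonal and −3, 3, −1 on the first three superdiagonals:

% Sketch of building tod with sparse(i, j, s); the band values follow the
% transpose of D^3 used in the matrix formulation above.
offs  = [0 1 2 3];
coef  = [1 -3 3 -1];
rindx = []; cindx = []; val = [];
for m = 1:4
  k = 1:(nr - offs(m));
  rindx = [rindx, k];
  cindx = [cindx, k + offs(m)];
  val   = [val, coef(m)*ones(1, nr - offs(m))];
end
tod = sparse(rindx, cindx, val, nr, nr);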
2.3 Results
We use the analytical solution (2) to KdV as initial data with varying parameters.
Because the number of time steps will vary depending on our choice of dt, we modify the
parameters of our code:
% Parameters:
dx=.01;
dt=.01;
T=2; % total time
nts=T/dt;
We calculate the maximum relative error of our computed value for u using the code:
max(abs(u-exact)./exact)
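For reference, the exact solution can be evaluated on the grid as follows (a sketch with assumed variable names; c = 0.5 matches Table 1, and the phase shift a is only an example value):

% Evaluate the analytical solution (2) at time T on the grid r and
% compare it with the computed solution u.
c = 0.5;  a = 0;                        % example parameter values
exact = (c/2) * sech( (sqrt(c)/2) * (r - c*T - a) ).^2;
max(abs(u - exact) ./ exact)            % maximum relative error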
Table 1: Maximum relative and absolute error for KdV with c = 0.5, T = 1, and dt = 0.01.
Figure 2: KdV after T = 1 with dt = 0.01 and dx = 1.
2. Hopscotch Method
The Hopscotch Method is a mixed implicit/explicit scheme given by:

u_i^{j+1} = u_i^j − 3(∆t/∆x)(f_{i+1}^j − f_{i−1}^j) − (∆t/(2(∆x)³))(u_{i+2}^j − 2u_{i+1}^j + 2u_{i−1}^j − u_{i−2}^j)        (7)

u_i^{j+1} = u_i^j − 3(∆t/∆x)(f_{i+1}^{j+1} − f_{i−1}^{j+1}) − (∆t/(2(∆x)³))(u_{i+2}^{j+1} − 2u_{i+1}^{j+1} + 2u_{i−1}^{j+1} − u_{i−2}^{j+1})        (8)

where f = u²/2, and (7) is used to calculate the points with (i + j) even, (8) those with (i + j) odd.
3. Goda

The implicit scheme proposed by Goda in [3] is:

(1/∆t)(u_i^{j+1} − u_i^j) + (1/∆x)( u_{i+1}^{j+1}(u_i^j + u_{i+1}^j) − u_{i−1}^{j+1}(u_i^j + u_{i−1}^j) ) + (1/(2(∆x)³))( u_{i+2}^{j+1} − 2u_{i+1}^{j+1} + 2u_{i−1}^{j+1} − u_{i−2}^{j+1} ) = 0,

where the quasi-pentadiagonal system of equations indicated below must be solved at each
time step.

x  x  x  x  x
   x  x  x  x  x
      x  x  x  x  x
         x  x  x  x  x
            x  x  x  x  x
               x  x  x  x  x
                  x  x  x  x  x
5. M. Kruskal
Kruskal suggested the following scheme for u_t + u_xxx = 0:

(u_i^{j+1} − u_i^j) / ∆t = (u_{i−1}^{j+1} − 3u_i^{j+1} + 3u_{i+1}^{j+1} − u_{i+2}^{j+1}) / (2(∆x)³) + (u_{i−2}^j − 3u_{i−1}^j + 3u_i^j − u_{i+1}^j) / (2(∆x)³)
and used the following to solve KdV:
(u_i^{j+1} − u_i^j)/∆t + (3θ/(4∆x))( (u_{i+1}^{j+1})² − (u_{i−1}^{j+1})² + (u_{i+1}^j)² − (u_{i−1}^j)² )
  + ((1−θ)/(2∆x))( u_i^{j+1}(u_{i+1}^{j+1} − u_{i−1}^{j+1}) − u_i^j(u_{i+1}^j − u_{i−1}^j) )
  + (u_{i−1}^{j+1} − 3u_i^{j+1} + 3u_{i+1}^{j+1} − u_{i+2}^{j+1}) / (2(∆x)³)
  + (u_{i−2}^j − 3u_{i−1}^j + 3u_i^j − u_{i+1}^j) / (2(∆x)³) = 0.

Experimentally, it was found that θ = 2/3 gave the best results.
6. Split Step Fourier Method by Tappert
Note that for this and the next method we will use the superscript m to represent the
time variable and the subscript n for the space variable, since both use the imaginary
number, i. A Fourier Method for KdV can be derived by making the change of variables
Y = (x + p)π/p, where p is half the length of the interval of calculation, to normalize the
spatial period to [0, 2π], and discretizing by N points so that ∆Y = 2π/N. KdV then becomes:

u_t + 6(π/p) u u_Y + (π³/p³) u_YYY = 0.

The transformation into discrete Fourier space for k = −N/2, . . . , N/2 − 1 is given by:

û(k, t) = F u = (1/√N) Σ_{j=1}^{N−1} u(j∆Y, t) e^{−2πijk/N}

u(j∆Y, t) = F⁻¹ û = (1/√N) Σ_k û(k, t) e^{2πijk/N}.
The Split Step Fourier Method uses a finite difference method to solve u_t + 6(π/p) u u_Y = 0,
followed by the fast Fourier transform to advance the solution using the linear term u_YYY, as described below.

ũ_n^{m+1} = u_n^m − (∆t π / (8 ∆Y p)) ( 8(u_{n+1}^{m+1})² − 8(u_{n−1}^{m+1})² − (u_{n+2}^{m+1})² + (u_{n−2}^{m+1})²
                                         + 8(u_{n+1}^m)² − 8(u_{n−1}^m)² − (u_{n+2}^m)² + (u_{n−2}^m)² )
followed by the fast Fourier transform:

u(Y_j, t + ∆t) = F⁻¹( e^{i k³ π³ ∆t / p³} F ũ(Y_j, t) ).
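As an illustration of this linear substep, here is a short Octave sketch (the variable names ut, p, and dt are assumptions, and fft/ifft are used in place of the unitary transform written above):

% Advance the linear part u_t + (pi^3/p^3) u_YYY = 0 exactly in Fourier space.
N    = length(ut);                 % ut holds the intermediate solution on the 2*pi-periodic grid
k    = [0:N/2-1, -N/2:-1];         % wavenumbers in fft ordering
unew = real(ifft(exp(1i * k.^3 * (pi^3/p^3) * dt) .* fft(ut)));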
            ∆x       ∆t       L∞        time
Zabusky     0.1739   0.002    0.00469   28
Hopscotch   0.2      0.003    0.00472   23
Goda        0.1      0.002    0.00492   244
Proposed    0.16     0.125    0.00173   7
Kruskal     0.08     0.04     0.00453   24
Tappert     0.3125   0.004    0.00494   63
Fornberg    0.625    0.0096   0.00113   12

Figure 3: Computation time in seconds for tolerance < 0.005 after T = 1.0 and wave amplitude 1.
            ∆x       ∆t        L∞         time
Zabusky     0.08     0.0019    0.00930    591
Hopscotch   0.1      0.0005    0.00994    272
Goda        0.04     0.00025   0.001282   3568
Proposed    0.1      0.1       0.00332    23
Kruskal     0.04     0.011     0.00952    163
Tappert     0.156    0.002     0.00943    271
Fornberg    0.3125   0.0042    0.00474    40

Figure 4: Computation time in seconds for tolerance < 0.01 after T = 1.0 and wave amplitude 2.
Various tests were performed to compare the accuracy and computation time of the methods
described above. A selection of the results is included here in Figures 3, 4, and 5.
            ∆x        ∆t        L∞        time
Zabusky     0.12      0.00066   0.00165   349
Hopscotch   0.13      0.001     0.00142   283
Goda        0.1       0.0005    0.00165   2764
Proposed    0.1       0.14      0.00148   19
Kruskal     0.08      0.015     0.00145   106
Tappert     0.15625   0.005     0.00186   322
Fornberg    0.625     0.0148    0.00107   24
Figure 5: Computation time in seconds for tolerance < 0.002 after T = 3.0 with two inter-
acting solitons as initial conditions with amplitudes 0.5 and 1
Let L be an n × n lower triangular matrix with bandwidth w, and choose k and s so that
n = ks and w ≤ s. Writing L as a block matrix with s × s blocks, only the diagonal blocks L_i
and the first subdiagonal blocks R_i are nonzero:

L = [ L_1
      R_1   L_2
            R_2   L_3
                  ···   ···
                        R_{k−1}   L_k ]

Assume that each L_i is invertible and let D be the n × n block diagonal matrix where
D_i = L_i⁻¹, i = 1, . . . , k.
Now, instead of the original system Lx = f, we will solve DLx = Df. To do so, we multiply through
by D and write f as a block vector, shown below.
DL = [ I
       G_1   I
             G_2   I
                   ···   ···
                         G_{k−1}   I ]        Df = [ D_1 f_1, D_2 f_2, . . . , D_k f_k ]ᵀ = [ b_1, b_2, . . . , b_k ]ᵀ

where G_i = L_{i+1}⁻¹ R_i. Because L has bandwidth w ≤ s, only the last w columns of each R_i are
nonzero, and the same is true of G_i. Writing R̂_i and Ĝ_i for these w nonzero columns, the G_i are
determined by

L_{i+1} Ĝ_i = R̂_i,    i = 1, . . . , k − 1.        (11)

Let M_i be the top s − w rows of Ĝ_i and H_i the bottom w rows, and split each b_i and x_i the same way:

Ĝ_i = [ M_i ]        b_i = [ u_i ]        x_i = [ y_i ]
      [ H_i ]              [ v_i ]              [ z_i ]

where u_i and y_i have s − w entries and v_i and z_i have w entries.
Now we have simple equations for solving the system DLx = b. The first block of x is immediate:

y_1 = u_1,    z_1 = v_1,

and the subsequent blocks are given by:

z_{i+1} = v_{i+1} − H_i z_i,    i = 1, . . . , k − 1        (12)
y_{i+1} = u_{i+1} − M_i z_i,    i = 1, . . . , k − 1.       (13)
The zi ’s in equation (12) above will be computed sequentially, but equation (13) gives us an
opportunity to compute the yi ’s in parallel. Note that there is a possibility for parallelism
in (12) using the parallel prefix algorithm as suggested in [13] if w is large enough.
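To make the forward solve concrete, here is a serial Octave sketch (the cell-array layout of the blocks is an assumption; this is not the MPI implementation described later):

function x = block_forward_solve(Lblk, Rblk, fblk, k, s, w)
  % Lblk{i}: s-by-s diagonal blocks of L; Rblk{i}: subdiagonal blocks;
  % fblk{i}: blocks of the right-hand side (column vectors).
  for i = 1:k
    b{i} = Lblk{i} \ fblk{i};                   % b_i = L_i^{-1} f_i (independent solves)
  end
  for i = 1:k-1
    Ghat{i} = Lblk{i+1} \ Rblk{i}(:, s-w+1:s);  % equation (11); only the last w columns of R_i are nonzero
  end
  y{1} = b{1}(1:s-w);  z{1} = b{1}(s-w+1:s);
  for i = 1:k-1
    z{i+1} = b{i+1}(s-w+1:s) - Ghat{i}(s-w+1:s, :) * z{i};  % sequential, equation (12)
    y{i+1} = b{i+1}(1:s-w)   - Ghat{i}(1:s-w, :)   * z{i};  % parallelizable, equation (13)
  end
  x = [];
  for i = 1:k
    x = [x; y{i}; z{i}];
  end
end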
Conversely, let U be an upper triangular matrix with bandwidth w. Let k and s be as
before, so that n = ks and w ≤ s, and write U as a block matrix as shown below.
U = [ U_1   R_1
            U_2   R_2
                  ···   ···
                        U_{k−1}   R_{k−1}
                                  U_k ]

Assume again that each U_i is invertible and let D be the block diagonal matrix where
D_i = U_i⁻¹, i = 1, . . . , k.
Now, instead of Ux = f, we will solve DUx = Df following the same steps as before, shown below.

DU = [ I   G_1
           I   G_2
               ···   ···
                     I   G_{k−1}
                         I ]        Df = [ D_1 f_1, D_2 f_2, . . . , D_k f_k ]ᵀ = [ b_1, b_2, . . . , b_k ]ᵀ
U_i G_i = R_i,    i = 1, . . . , k − 1        (14)
U_i b_i = f_i,    i = 1, . . . , k.        (15)

Here, since U_i is upper triangular for all i, we will use backward substitution to solve (14)
and (15).
Note that now the last s − w columns of R_i are zero for all i and that the same must be
true of G_i. We will again decompose the problem and only solve for the nonzero entries of G_i,
denoted Ĝ_i. Let M_i be the top w rows of Ĝ_i and H_i the bottom s − w rows of Ĝ_i, and write
b_i in terms of u_i and v_i, and x_i in terms of y_i and z_i, similarly:

Ĝ_i = [ M_i ]        b_i = [ u_i ]        x_i = [ y_i ]
      [ H_i ]              [ v_i ]              [ z_i ]

where u_i and y_i have w entries and v_i and z_i have s − w entries.
Now we can solve the system DUx = b backwards. The last block of x is immediate:

y_k = u_k,    z_k = v_k,

and the remaining blocks are given by:

y_i = u_i − M_i y_{i+1},    i = 1, . . . , k − 1        (16)
z_i = v_i − H_i y_{i+1},    i = 1, . . . , k − 1.       (17)
This time, the yi ’s in equation (16) above will be computed sequentially, but equation (17)
lets us compute the zi ’s in parallel.
3.3 Implementation
An implicit finite difference scheme for KdV, proposed by Goda in [3], is given by:

(1/∆t)(u_i^{j+1} − u_i^j) + (1/∆x)( u_{i+1}^{j+1}(u_i^j + u_{i+1}^j) − u_{i−1}^{j+1}(u_i^j + u_{i−1}^j) ) + (1/(2(∆x)³))( u_{i+2}^{j+1} − 2u_{i+1}^{j+1} + 2u_{i−1}^{j+1} − u_{i−2}^{j+1} ) = 0
which can be written as the pentadiagonal system:

[ 1  c  d                   ]  [ u_1^{j+1} ]     [ u_1^j ]
[ b  1  c  d                ]  [ u_2^{j+1} ]     [ u_2^j ]
[ a  b  1  c  d             ]  [    ···    ]  =  [  ···  ]
[       ···  ···            ]  [    ···    ]     [  ···  ]
[          a  b  1  c  d    ]  [    ···    ]     [  ···  ]
[             a  b  1  c    ]  [    ···    ]     [  ···  ]
[                a  b  1    ]  [ u_n^{j+1} ]     [ u_n^j ]
where

a = −∆t / (2(∆x)³)
b = ∆t / (∆x)³ − (∆t/∆x)(u_i^j + u_{i−1}^j)
c = −∆t / (∆x)³ + (∆t/∆x)(u_i^j + u_{i+1}^j)
d = ∆t / (2(∆x)³).
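For illustration only, a serial Octave sketch that assembles this system and takes one implicit step might look as follows (the boundary rows are simply truncated here, and the variable names are assumptions; this is not the author's C++/MPI code):

% Assemble the pentadiagonal matrix from the coefficients above and take
% one step u^j -> u^{j+1}; u is assumed to be a column vector of grid values.
n   = length(u);
r3  = dt/(2*dx^3);
im1 = 2:n;  ip1 = 1:n-1;
rows = [3:n,   im1,   1:n,   ip1,   1:n-2];
cols = [1:n-2, im1-1, 1:n,   ip1+1, 3:n  ];
vals = [-r3*ones(1,n-2), ...
        (2*r3 - (dt/dx)*(u(im1) + u(im1-1)))', ...
        ones(1,n), ...
        (-2*r3 + (dt/dx)*(u(ip1) + u(ip1+1)))', ...
        r3*ones(1,n-2)];
A = sparse(rows, cols, vals, n, n);
u = A \ u;                              % u now holds u^{j+1}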
The algorithm described above was implemented in C++ using MPI, where A was set to
be the finite difference matrix of Goda's scheme described above. We chose Goda's scheme
because [12] found it to have prohibitively large computation times in some cases, while the
scheme has truncation error O(∆t) + O((∆x)²) and is unconditionally stable.
Large matrices were stored as 3 × m arrays, where m is the number of stored elements
and each column of the array holds the row index, column index, and value of one element
of the matrix. Parallel factorization of banded matrices has been studied in [14], [15], and
[16], but here A was factored by one processor using a modified Doolittle's algorithm for a
pentadiagonal matrix. The problem was then solved as described above using the block
inverse algorithm with k processors. The details of the algorithm for Ly = f are presented
in Algorithm 1 below and solving Ux = y follows similarly.
Figure 6: Total storage used for n = 100,000.

In total, the program uses 10k + 8 integers and (29s + 3 + n)k + 33n − 36 + (k − 1)s doubles,
so that the total number of bytes used is:

4[10k + 8] + 8[(29s + 3 + n)k + 33n − 36 + (k − 1)s].
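As a quick illustration of this formula (an illustrative evaluation, with s = n/k as in the experiments below):

% Evaluate the storage estimate for the fixed problem size n = 100,000.
n = 100000;
for k = [4 8 10 16]
  s = n / k;
  bytes = 4*(10*k + 8) + 8*((29*s + 3 + n)*k + 33*n - 36 + (k - 1)*s);
  printf("k = %2d: %.1f MB\n", k, bytes/1e6);
end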
For a fixed problem size, then, the amount of storage scales linearly with k, as shown in
Figure 6. Let T1 , T2 , T3 , T4 , T5 , Tsend , Trec be the amount of time it takes to perform
an assignment, multiplication, division, addition, subtraction, MPI send, and MPI receive,
respectively. We break each timestep into a series of “tasks” that we time separately and the
amount of computation for each is given below. P0 and Pk−1 are handled separately from
the other processors since they behave differently.
[Table: per-task operation counts for P0, the interior processors Pi, and Pk−1, expressed in terms of T1, . . . , T5, Tsend, and Trec.]
Figure 7: Total computation time on k = 4, 8, 10, 16 processors.

The problem was solved on the domain [0, 40) with initial data centered at x = 20. The problem
size was fixed with n = 100,000 grid points so that ∆x = 0.0004, and the scheme was run for ten
time steps with ∆t = 0.1. It was tested on k = 4, 8, 10, and 16 processors so that each processor
solved an s × s system where s = 25,000, 12,500, 10,000, and 6,250, respectively. We can see in
Figure 7 that, for this problem, increasing the processor count and using the block inverse
algorithm is not a particularly effective way to speed up the computation. We hypothesize why
in the following discussion.
Figure 8 shows in detail the block inverse algorithm on four and eight processors during
one timestep. The computation has been broken down into subtasks, each denoted by a
different color. For each task, the size of the bar is the length of computation. A description
of the computation is given below.

Figure 8: Computation time for one timestep on four processors (left) and eight (right).

Figure 9: Total computation time on k = 4, 8, 10, 16 processors.
1. During the “setup” of the problem (dark blue in Figure 8), P0 factors A and collects
and distributes blocks to P1 , . . . , Pk−1 .
2. All processors except for P0 solve for Gi using equation (11). P0 starts the next step.
4. Processor Pi waits for zi−1 from its neighbor Pi−1, computes zi by (12) and sends it to
Pi+1. P0 immediately computes z1 and sends it to P1.
6. The “setup” task resets the array used to store fi so that code can be repeated.
7. P0 , . . . , Pk−2 solve for Gi using (14). Pk−1 starts the next step.
9. Processor Pi waits for yi+1 from Pi+1, computes yi using (16) and sends it to Pi−1. Pk−1
immediately computes its block and sends it to Pk−2.
Figure 8 shows that most of the computation time is used factoring and distributing
pieces of the matrix A. After each timestep, Goda's scheme updates the difference matrix
A, so factoring at each timestep is unavoidable. This computation cannot be reduced by
increasing the number of processors; instead, adding processors lengthens the task by requiring
additional sends, as shown in Figure 9.

Figure 10: Forward block inverse algorithm on k = 4, 8, 10, 16 processors.

Figure 11: Computation time for problem size n = 4,000 (left) and n = 100,000 (right).
If we restrict ourselves to solving Ly = f , the forward substitution part of our program
and steps 2 through 5 above, we see that the block inverse algorithm works well in practice.
Figure 10 shows clearly that speedup is gained by adding processors for a fixed problem size.
It is also worth mentioning that the size of the problem influences the effectiveness of
the algorithm. We ran the program again on a smaller grid size n = 4, 000 and found that
the algorithm performs better on larger problems. Figure 11 shows the computation time
for the forward part of the algorithm for both problem sizes. Each task can be done in less
time using more processors (because the block matrix size s is smaller), except for the tasks
involving communication. For the larger problem, the communication costs incurred by using
more processors are offset by the larger amount of computation per processor.
Finally, Figure 12 shows the computation time over all ten timesteps on P1 and P2 when
k = 4. Here we can see that not much variance occurs between processors or over timesteps.
Figure 12: Computation time over 10 timesteps.
4 Conclusion
In this paper, an attempt was made at parallelizing a finite difference scheme for the KdV
equation. The block inverse algorithm that was used does not appear to be a good candidate
for this problem. While we have not accomplished our original goal, this study has exposed
some of the difficulties with solving nonlinear PDEs in parallel that should be investigated
further. We have shown, however, that the algorithm would work well with a linear PDE
such as the linearized KdV equation.
References
[1] Korteweg, Diederik Johannes, and Gustav De Vries. “XLI. On the change of form of long
waves advancing in a rectangular canal, and on a new type of long stationary waves.” The
London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 39.240
(1895): 422-443.
[2] Zabusky, Norman J., and Martin D. Kruskal. “Interaction of solitons in a collisionless
plasma and the recurrence of initial states.” Phys. Rev. Lett 15.6 (1965): 240-243.
[3] Goda, Katuhiko. “On stability of some finite difference schemes for the Korteweg-de Vries
equation.” Journal of the Physical Society of Japan 39.1 (1975): 229-236.
[4] Abe, Kanji, and Osamu Inoue. “Fourier expansion solution of the Korteweg-de Vries
equation.” Journal of Computational Physics 34.2 (1980): 202-210.
[5] Fornberg, Bengt, and G. B. Whitham. “A numerical and theoretical study of certain
nonlinear wave phenomena.” Philosophical Transactions of the Royal Society of London
A: Mathematical, Physical and Engineering Sciences 289.1361 (1978): 373-404.
[6] Arnold, Douglas N., and Ragnar Winther. “A superconvergent finite element method for
the Korteweg-de Vries equation.” Mathematics of Computation 38.157 (1982): 23-36.
[7] Bona, Jerry L., Vassilios A. Dougalis, and Ohannes A. Karakashian. “Fully discrete
Galerkin methods for the Korteweg-de Vries equation.” Computers & Mathematics with
Applications 12.7 (1986): 859-884.
[8] Winther, Ragnar. “A conservative finite element method for the Korteweg-de Vries equa-
tion.” Mathematics of Computation (1980): 23-43.
[9] Qu, Fu-li, and Wen-qia Wang. “Alternating segment explicit-implicit scheme for nonlinear
third-order KdV equation.” Applied Mathematics and Mechanics 28 (2007): 973-980.
[10] Guo, Jinhua, and Thiab R. Taha. “Parallel implementation of the split-step and the
pseudospectral methods for solving higher KdV equation.” Mathematics and Computers
in Simulation 62.1 (2003): 41-51.
[11] Schendel, Udo. Introduction to numerical methods for parallel computers. Prentice Hall
PTR, 1984.
[12] Taha, Thiab R., and Mark J. Ablowitz. “Analytical and numerical aspects of certain
nonlinear evolution equations III. Numerical, Korteweg-de Vries equation.” Journal of
Computational Physics 55.2 (1984): 231-253.
[14] Quintana-Ortí, Gregorio, et al. “SuperMatrix for the Factorization of Band Matrices.
FLAME Working Note 27.” (2007).
[15] Gupta, Anshul, et al. “The design, implementation, and evaluation of a symmetric
banded linear solver for distributed-memory parallel computers.” ACM Transactions on
Mathematical Software (TOMS) 24.1 (1998): 74-101.
[16] Hajj, Ibrahim N., and Stig Skelboe. “A multilevel parallel solver for block tridiagonal
and banded linear systems.” Parallel Computing 15.1 (1990): 21-45.
[17] Hammack, Joseph L., and Harvey Segur. “The Korteweg-de Vries equation and water
waves. Part 2. Comparison with experiments.” Journal of Fluid Mechanics 65.02 (1974):
289-314.
[18] Hammack, Joseph L., and Harvey Segur. “The Korteweg-de Vries equation and water
waves. Part 3. Oscillatory waves.” Journal of Fluid Mechanics 84.02 (1978): 337-358.
[19] Bona, J., et al. “Conservative, discontinuous Galerkin-methods for the generalized
Korteweg-de Vries equation.” Mathematics of Computation 82.283 (2013): 1401-1432.
[20] Bona, J. L., et al. “Conservative, high-order numerical schemes for the generalized
Korteweg-de Vries equation.” Philosophical Transactions of the Royal Society of London.
Series A: Physical and Engineering Sciences 351.1695 (1995): 107-164.
[21] Dutykh, Denys, Marx Chhay, and Francesco Fedele. “Geometric numerical schemes for
the KdV equation.” Computational Mathematics and Mathematical Physics 53.2 (2013):
221-236.
[22] Craig, Walter, and Catherine Sulem. “Numerical simulation of gravity waves.” Journal
of Computational Physics 108.1 (1993): 73-83.
[23] Kolebaje, Olusola Tosin, and Oluwole Emmanuel Oyewande. “Numerical solution of the
Korteweg-de Vries equation by finite difference and adomian decomposition method.”
International Journal of Basic and Applied Sciences 1.3 (2012): 321-335.
[24] Scott, L. Ridgway, Terry Clark, and Babak Bagheri. Scientific parallel computing. Vol.
146. Princeton: Princeton University Press, 2005.