PCFD 05
PCFD 05
PCFD 05
in a Flow Field
Ying Xua , J. M. McDonoughb and K. A. Tagavia
a
Ying Xu et al.
field models with convection. Even such formulations still have the drawback of being
very CPU intensive. Hence, it is necessary to apply parallelization to phase-field model
algorithms in order to decrease wall-clock time to within practical limits.
We introduce the phase-field model with convection in the first part of this paper, followed by numerical solutions corresponding to growth of dendrites in a supercooled melt
of nickel. Finally we discuss the approach to parallelization and the speedups obtained.
Parallelization via both OpenMP and MPI has been implemented on a symmetric multiprocessor; the latter is found to be significantly more effective but required considerably
more programming effort.
2. GOVERNING EQUATIONS OF PHASE-FIELD MODEL WITH CONVECTION
In this section we introduce the equations of the phase-field model including effects
of convective transport on macroscopic scales. The difference in length scales between
dendrites and flow field make it unreasonable to implement a dimensionless form of the
governing equations since there is no single appropriate length scale. Therefore, dimensional equations are employed. Boundary and initial conditions required to formulate a
well-posed mathematical problem are also prescribed.
The coupled 2-D NavierStokes equations and phase-field model are
ux + v y = 0 ,
(1a)
()
1
px +
u + X1 () ,
0
0
(1b)
1
()
[(, T ) L ]
py +
v + X2 ()
g,
0
0
0
(1c)
ut + (u2 )x + (uv)y =
vt + (uv)x + (v 2 )y =
2
300 L0
(( ))
()(Tm T )
M
Tm M
0 0
()T + Y (, T ) ,
aM
2
k
30L0 () D
Tt + (uT )x + (vT )y =
T +
(( ))
0 cp ()
2
cp ()
Dt
t + (u)x + (v)y =
+ W (u, v, , T ) ,
(1d)
(1e)
buoyancy force term in Eq. (1c), density is also a function of temperature. We express
the density, dynamic viscosity and specific heat in the forms:
(, T ) = S + P ()[L S + L (T Tm )] ,
() = S + P ()(L S ) ,
cp () = cpS + P ()(cpL cpS ) ,
where subscripts S and L denote solid and liquid respectively, and is the coefficient of
thermal volumetric expansion. () is a double-well potential, and P () is a polynomial
introduced to denote the interface; these functions are commonly given as polynomials in
:
() = 2 (1 )2 ,
P () = 65 154 + 103 .
Other parameters in Eqs. (1) are
2 = 6 2 ,
0 Tm
a= ,
6 2
M=
0 L0
,
Tm k
where 2 is the positive gradient coefficient related to the interfacial thickness in such
a way that as 0, the phase-field model approaches the modified Stefan model; k
is the kinetic coefficient, and M is the mobility which is related to the inverse of kinetic
coefficient; a is a positive parameter occuring in the double-well potential which is related
to the surface tension . Kinetic coefficient k is a microscopic physical parameter reflecting kink density at steps in solid thickness and the atom exchange rate at each kink, as
explained in Chernov [5] and Ookawa [6]. It is a function of temperature and orientation,
and also depends on the material; the kinetic coefficient of metal is the highest among
all materials. Linear variation of kinetic coefficient might be reasonable for a molecularly
rough interface; however, it strongly depends on orientation of the interface for facetted
interfaces, as shown by Langer [7]. For simplicity, the kinetic coefficient is usually assumed to be a constant as we do herein, but it is difficult to determine the value of this
constant either theoretically or experimentally.
The forcing terms in Eqs. (1) take the forms
X1 () =
2
x 12 xx + 21 2 xy + 22 yy ,
0
(5a)
X2 () =
2
y 12 xx + 21 2 xy + 22 yy ,
0
(5b)
Y (, T ) =
30 p
()[L S + L (T Tm )] ,
0 M
W (u, v, , T ) =
(5c)
() 2
2
2ux + vy2 + (uy + vx )2 +
12 2x 22 2y (vy ux )
0 cp ()
40 cp ()
2
vx 1 2 2x + 22 x y + uy 12 x y + 1 2 2y .
20 cp ()
(5d)
Ying Xu et al.
(6a)
to represent anisotropy in the interfacial energy and kinetics for a crystal of cubic symmetry with anisotropy strength m ; m determines the mode of symmetry of the crystal;
= arctan(y /x ) is the angle between the interface normal and the crystal axis, and
denotes the angle of the symmetry axis with respect to the x-axis. The forcing terms
X1 and X2 represent the effects introduced by the solidification process on the flow field;
they are effective only over the interface region. Y (, T ) is a modification to the Stefan
condition caused by the buoyancy force term; W (u, v, , T ) is viscous dissipation in the
energy equation, and it can usually be neglected since it is small compared with other
dissipative terms in that equation.
On the domain [0, l] [0, l], prescribed boundary conditions are
u=0
u=U
v0
p
=0
n
=0
n
T
=0
n
on \{(x, y)|y = l} ,
on {(x, y)|y = l} ,
on ,
on 0 ,
on ,
on .
on ,
in 0 {(x, y) | |x| + |y| lc , (x, y) [lc , lc ]2 } ,
where lc is one half the length of the diagonal of a 45 -rotated square in the center of the
domain.
3. NUMERICAL METHODS AND RESULTS
The governing equations (1be) are four coupled nonlinear parabolic equations in conserved form. We apply a projection method due to Gresho [8] to solve the momentum
equations while preserving the divergence-free constraint (1a). The Shumann filter [9] is
applied to solutions of the momentum equations (1b), (1c) prior to projection to remove
aliasing due to under resolution. Since time-splitting methods are efficient for solving
multi-dimensional problems by decomposing them into sequences of 1-D problems, a form Douglas & Gunn [10] procedure is applied to the current model. Quasilinearization
of Eqs. (1bd) is constructed by FrechetTaylor expansion in -form as described by
Ames [11] and Bellman and Kalaba [12].
The computations are performed on a square domain [0, 63 cm] [0, 63 cm]. Initially, the currently-used material, nickel, is supercooled by an amount T = Tm
T = 224K, and freezing begins from a small, rotated square with half-diagonal length
lc = 1.89 cm in the center of the domain. The numerical spatial and time step sizes are
x = y = 0.315 cm, t = 106 s, respectively, and the length scale for the interfacial
thickness is = 0.105 cm. The kinetic coeffient k is chosen to be 2.85 m/s K.
Figure 1 displays the velocity field at t = 100.1s. Lid-driven-cavity flow is introduced
at t = 100 s with U = 1cm/s. Dendrite shape evolves from the initial rotated square to
an approximate circle as shown in Fig. 1. We observe that velocity near the solid-liquid
interface is greater than that nearby, a direct result of the forcing terms X1 and X2 in
momentum equations. We also have found that the flow field has a significant effect on
the growth rate of dendrites although this is not evident from Fig. 1; in particular, the
growth rate is decreased by the flow field.
Ying Xu et al.
Dome is 64, and in the current study each processor is used to compute one part of the
whole domain. For parallelization studies the grid is set at 201 201 points corresponding
to the domain size 63 cm 63 cm. The procedure for parallelizing two-step Douglas &
Gunn time-splitting with MPI is to compute different parts of the domain on different
processors, i.e., simply a crude form of domain decomposition. In particular, we divide the
domain into n equal pieces along the separate directions corresponding to each split step,
where n is the number of processors being used. That is, we first divide the domain in
the x direction during the first time-splitting step, and then in the y direction during the
second step. Therefore, transformations of data between each processor are required during the two steps of time-splitting. The sketch of this is shown in Fig. 2. Moreover, data
transformations are also needed between adjacent boundaries of each processor. Since
communication between processors increases with increasing number of processors for a
fixed number of grid points, parallel performance is expected to decrease in such a case,
resulting in only a sub-linear increase of speed-up for MPI.
processor
processor n
n
processor 2
y
processor 1
x
x
splitting step I
splitting step II
sub-linearly for both OpenMP and MPI. Moreover, the speed-up performance of MPI is
better than that of OpenMP. The curve for OpenMP in Fig. 3 also suggests that the
speed-up factor attains its maximum at a number of processors only slightly beyond 32
for the present problem. Moreover, it is clear that parallel efficiency is quite low for
OpenMP already by 16 processors, so this is possibly the maximum number that should
be used. It should also be mentioned that even though the MPI implementation has
not yet been completely optimized, the CPU time of MPI runs is somewhat less than
that for OpenMP. Better performance could be achieved if further optimization of MPI is
applied within the context of the current algorithm. Such optimization might include use
of nonblocking communication, sending noncontiguous data using pack/unpack functions,
decreasing unnecessary blocking, and optimizing the number of Douglas & Gunn split line
solves simultaneously sent to each processor. The last of these can significantly alter the
communication time to compute time tradeoff.
40
theoretical speed-up
MPI results
OpenMP results
Speedup
30
20
10
0
0
10
20
30
40
Number of Processors
Ying Xu et al.
for our current problem are high, MPI is recommended for such problems. However, the
implementation of MPI is more difficult than that of OpenMP. For example, programming
effort for the current problem using MPI was approximately 100 times greater than that
using OpenMP.
6. ACKNOWLEDGEMENTS
This work is supported by Center for Computational Science of University of Kentucky.
We are also grateful to the University of Kentucky Computing Center for use of their HP
SuperDome for all the computations.
REFERENCES
1. D. M. Anderson, G. B. McFadden, and A. A. Wheeler. A phase-field model of solidification with convection. Physica D, 135:175194, 2000.
2. D. M. Anderson, G. B. McFadden, and A. A. Wheeler. A phase-field model with
convection: sharp-interface asymptotics. Physica D, 151:305331, 2001.
3. C. Beckermann, H.-J. Diepers, I. Steinbach, A. Karma, and X. Tong. Modeling melt
convection in phase-field simulations of solidification. J. Comput. Phys., 154:468496,
1999.
4. Nabeel Al-Rawahi and Gretar Tryggvason. Numerical simulation of dendritic solidification with convection: Two-dimensional geometry. J. Comput. Phys., 180:471496,
2002.
5. A. A. Chernov. Surface morphology and growth kinetics. In R. Ueda and J. B. Mullin,
editors, Crystal Growth and Characterization, pages 3352. North-Holland Publishing
Co., Amsterdam, 1975.
6. A. Ookawa. Physical interpretation of nucleation and growth theories. In R. Ueda and
J. B. Mullin, editors, Crystal Growth and Characterization, pages 519. North-Holland
Publishing Co., Amsterdam, 1975.
7. J. S. Langer. Models of pattern formation in first-order phase transitions. In G. Grinstein and G. Mazenko, editors, Directions in Condensed Matter Physics, pages 164
186. World Science, Singapore, 1986.
8. P. M. Gresho. On the theory of semi-implicit projection methods for viscous incompressible flow and its implementation via a finite element method that also introduces
a nearly consistent mass matrix. part 1: Theory. Int. J. Numer. Meth. Fluids, 11:587
620, 1990.
9. F. G. Shuman. Numerical method in weather prediction: smoothing and filtering.
Mon. Weath. Rev., 85:357361, 1957.
10. J. Douglas Jr. and J. E. Gunn. A general formulation of alternating direction methods,
part 1. parabolic and hyperbolic problems. Numer. Math., 6:428453, 1964.
11. W. F. Ames. Numerical Methods for Partial Differential Equations. Academic Press,
New York, NY, 1977.
12. R. E. Bellman and R. E. Kalaba. Quasilinearization and Nonlinear Boundary-Value
Problems. American Elsevier Publishing Company, Inc., New York, NY, 1965.