Introduction
The MPDATA advection scheme introduced in
has grown into a family of numerical algorithms for geosciences and
beyond see for example. MPDATA stands for
multi-dimensional positive-definite advection transport
algorithm. It is a finite-difference/finite-volume
algorithm for solving the generalised transport equation
∂t(Gψ)+∇⋅(Guψ)=GR.
Equation () describes the advection of a scalar field ψ
in a flow with velocity u. The field R on the
right-hand side (rhs) is a total of source/sink terms. The scalar
field G can represent the fluid density, the Jacobian of coordinate
transformation or their product and satisfies the equation
∂t(G)+∇⋅(Gu)=0.
In the homogeneous case (R≡0), MPDATA is at least
second-order-accurate in space and time, conservative and non-linearly
stable.
The history of MPDATA spans 3 decades: –, and
is widely documented in the literature – see
,
and for reviews. Notwithstanding, from the
authors' experience the software engineering aspects still overshadow
the benefits of MPDATA. To facilitate the use of MPDATA schemes,
hereby we present a new implementation of the MPDATA family of
algorithms for regular structured grids – libmpdata++.
In the development of libmpdata++ we strive to comply with
the best sought-after practices among the scientific community
, in particular with the paradigm of
maximising code reuse. This paradigm is embodied in the
“open source computational libraries – the main foundation
upon which academic and also a significant part of industrial
computational research rests” .
The libmpdata++ has been developed in C++, making extensive use of
object-oriented programming (OOP) and template programming. The
primary goals when designing libmpdata++ were to maintain
strict separation of concerns and to reproduce within the code the
mathematical “blackboard abstractions” used for documenting
numerical algorithms. The adopted design contributes to the
readability, maintainability and conciseness of the code. The current
development of libmpdata++ is an extension of the research on
OOP implementation of the basic MPDATA scheme presented in
.
The goal of this article is twofold: first, to document the library
interface by providing usage examples and, second, to validate the
correctness of the implementation by verifying the results against
published benchmarks.
The structure of the paper is as follows. Section
outlines the library design. The four sections that follow correspond
to four types of equation systems solved by the implemented
algorithms, namely homogeneous advective transport, inhomogeneous
transport, transport with prognosed velocity, and systems featuring
elliptic pressure equation. Each of these sections outlines the
implemented algorithms, describes the library interface and provides
usage examples. Each example is accompanied with a definition of the
solved problem, description of the program code and discussion of the
results. An index of libmpdata++ options
documented in the article is provided in Appendix A.
The paper structure reflects the solver inheritance hierarchy in
libmpdata++. All features discussed in preceding sections
apply to the one that follows. The set of discussed problems was
selected to match the tutorial structure of the paper. The
presentation begins with simple examples focusing on the basic library
interface.
Subsequent examples use increasingly more complicated
cases with the most complex reflecting potential for
applications to cloud dynamics .
The library and programs used to generate all results presented in the paper
are released as free and open-source software – see the
section on code availability at the end of the paper.
Library design
Dependencies and supported platforms
The libmpdata++ package is a header-only C++ library.
It is built upon the Blitz++ array containers.
We refer the reader to the Blitz++ documentation for
description of the Blitz++ interface,
to which the user is exposed while working with libmpdata++.
The libmpdata++ core also depends on several components of the
Boost library
collection, however these are used internally only.
Output handlers included in the library depend additionally on
gnuplot-iostream and
HDF5, but
their use is optional.
Example programs discussed within this article require
gnuplot,
ParaView,
Python, and the following Python packages: h5py,
matplotlib
and scipy.
The library code requires a C++11-compliant compiler.
In the current development workflow, we employ continuous integration
on Linux with GNU g++ and
LLVM clang++ compilers and on Apple OSX
with the Apple clang++ compiler.
Consequently, these are considered the supported platforms.
Components
Inheritance diagram of classes mentioned in the paper.
Classes defined within libmpdata++ have their names
surrounded with black frames. The coupled_harmosc
class is an example of a user-defined class
defined out of the library tree. The
solid black lines show the inheritance relations. The
output label depicts any of the output handlers
available in libmpdata++.
Components of the library are grouped as follows.
Solvers:
mpdata intended for solving homogeneous transport
problems (Sect. )
mpdata_rhs extending the above with rhs term
handling (Sect. )
mpdata_rhs_vip adding prognosed-velocity
support (Sect. )
mpdata_rhs_vip_prs further extending the above
with elliptic pressure equation solvers
(Sect. ).
Output handlers:
gnuplot offering direct communication with the
gnuplot program with no intermediate output files
hdf5 offering basic HDF5 output compatible with
netCDF
readers
hdf5_xdmf implementing the eXtensible Data Model
and Format standard
supported for instance by the ParaView visualisation tool.
Boundary conditions:
cyclic implementing periodic boundaries
open giving zero-divergence condition on domain
edges
polar applicable with spherical coordinates.
Concurrency handlers:
serial for single-thread operation
cxx11_thread for multi-threading using C++11 Thread support library
boost_thread for multi-threading using
Boost.Thread
openmp for multi-threading using OpenMP
threads that defaults to openmp if
supported by the compiler and falls back to
boost_thread otherwise.
Performing integration with libmpdata++ requires choosing one
of the solvers, one output handler, one boundary condition per each
domain edge and one concurrency handler.
The inheritance diagram in Fig. shows relationships
between libmpdata++ solvers defined within the library.
The diagram includes as well an example user-defined class
coupled_harmosc defined out of the library tree.
The mpdata solver is displayed at the top, as it is the
base class for all other classes.
Computational domain and grid
The arrangement of the computational domain used in
libmpdata++ is shown in Fig. . The initial
condition for the dependent variable ψ is assumed to be known in
nx×ny data points. The outermost data points are located at
the boundaries of the domain.
The dual, staggered Arakawa-C grid used
in libmpdata++ is shown in Fig. . In this
spatial discretisation approach, the cell-mean values of the scalar
fields ψ, and G reside in the centres of computational cells,
– corresponding to the data points of the primary grid in
Fig. – whereas the components of the velocity field
u are specified at the cell edges of the dual grid in Fig. 3.
Schematic of a 2-D computational domain. Bullets mark the
data points for the dependent variable ψ
in Eq. (), solid lines depict edges of primary grid and
dashed lines mark edges of dual grid in Fig. .
Error and progress reporting
There are several error-handling mechanisms used within
libmpdata++.
First, there are sanity checks within the code implemented using
static_assert() calls. These are reported during
compilation, for instance when invalid values of compile-time
parameters are supplied.
Second, there are available numerous run-time sanity checks,
implemented using assert() calls. These are often
time-consuming and are not intended to be executed in production runs.
To disable them, one needs to compile the program using
libmpdata++ with the -DNDEBUG compiler flag.
Examples of such checks include detection of NaN values
within the model state variables, which may be useful to trace origins
of numerical instability problems.
Third, the user may chose to activate the Blitz++ debug mode
that enables run-time array range checks. Activating Blitz++
debug mode requires compiling the program using libmpdata++
with the -DBZ_DEBUG flag and linking with
libblitz.
Finally, libmpdata++ reports run-time errors by throwing
std::runtime_error exceptions.
Simulation progress is communicated to the user by continuously
updating the process threads' names with the percentage of work
completed (can be observed e.g. by invoking top -H).
A schematic of a 2-D Arakawa-C grid. Bullets denote the
cell centres and dashed lines denote the cell walls
corresponding to the dual grid in Fig. .
Advective transport
The focus of this section is on the advection algorithm used within
libmpdata++. Section provides a short
introduction to the implemented MPDATA scheme.
Section describes the library interface needed for
the homogeneous transport cases. The following Sects. – show examples of usage of libmpdata++
along with the references to other MPDATA benchmarks.
Implemented algorithms
This subsection is intended to provide the reader with an outline of
selected MPDATA features that correspond to the options presently
available in libmpdata++. For the full derivation of the
scheme and its options see the reviews in
and
, whereas for an extended discussion of
stability, positivity and convexity see
.
In the present implementation, it is assumed that G is constant in
time. Consequently, the governing homogeneous transport
Eq. () can be written as
∂tψ+1G∇⋅(Guψ)=0.
This particular form is solved by the mpdata solver of
libmpdata++.
The following paragraphs will focus on the algorithms used for
handling Eq. (). The rules for applying source and sink
terms are presented in Sect. .
Basic MPDATA
MPDATA is an, at least, second-order-accurate iterative scheme in
which all iterations take the form of a first-order-accurate
donor-cell pass alias upwind, upstream;
cf.Sect. 20.1.3. For the
one-dimensional case, after
the discretisation in space (subscripts i) and time
(superscripts n), the donor-cell pass applied to
Eq. () yields
ψin+1=ψin-1Gi[F(ψin,ψi+1n,Gi+1/2,ui+1/2n+1/2)-F(ψi-1n,ψin,Gi-1/2,ui-1/2n+1/2)].
The flux function F is defined as
F(ψL,ψR,G,u)≡[u]+ψL+[u]-ψRGΔtΔx,
where [u]+≡max(u,0) and [u]-≡min(u,0).
In the case of a time-varying velocity field, the velocity components
are evaluated at an intermediate time level denoted by the n+1/2
superscript in Eq. (). Association of the velocity
components with dual-cell edges is denoted by fractional indices
i+1/2 and i-1/2; see Fig. .
Hereafter, GuΔtΔx is written compactly as GC,
where C denotes the Courant number. GC is referred to as the
advector, while the scalar field ψ as the advectee – the
nomenclature adopted after .
Evaluation of Eq. () concludes the first pass of MPDATA.
To compensate for the implicit diffusion of the donor-cell pass, the
subsequent passes of MPDATA reuse Eqs. () and (),
but with ψ replaced with the result of the preceding pass and
u replaced with the “anti-diffusive” pseudo-velocity. The
pseudo-velocity is analytically derived by expanding
Eq. () in the second-order Taylor series about spatial
point i and time level n, and representing the leading,
dissipative truncation error as an advective flux; see
for a derivation. A single corrective pass
ensures second-order accuracy in time and space. Subsequent
corrective passes decrease the amplitude of the leading error, within
second-order accuracy. The one-dimensional formula for the basic
antidiffusive advector is written as
GCi+1/2k+1=GCi+1/2k-GCi+1/2k20.5(Gi+1+Gi)ψi+1k-ψikψi+1k+ψik,
where k numbers MPDATA passes. For k=1, Ck is the
flow-velocity-based Courant number, whereas for k>1, Ck is the
pseudo-velocity-based Courant number. The number of corrective passes
can be chosen within libmpdata++.
The library features two implementations of the donor-cell algorithm
defined by Eqs. () and ().
The default one is a “straightforward” summation.
The alternative, more resource-intensive,
is the compensated summation algorithm of which
reduces round-off error arising when summing numbers of different
magnitudes.
Third-order-accurate variant
Accounting for third-order terms in the Taylor series expansion while
deriving the pseudo-velocity improves the accuracy of MPDATA. When
G≡1, u=constant and three or more corrective passes are
applied, the procedure ensures third-order accuracy in time and space.
The formulæ for the third-order scheme, derived
analytically in , can be found
in Eq. 36.
Divergent-flow variant
In case of a divergent flow, the pseudo-velocity formulæ are
augmented with an additional term proportional to the flow divergence.
This additional term is implemented in libmpdata++ following
Sect. 3.2(3).
Non-oscillatory option
Solutions obtained with the basic MPDATA are sign-preserving, and thus
non-oscillatory near zero. Generally, however, they feature dispersive
ripples characteristic of higher-order numerical schemes. These can
be suppressed by limiting the pseudo-velocities, in the spirit of
flux-corrected transport. Application of the limiters reduces
somewhat the accuracy of the scheme
, yet this loss is generally
outweighed by ensuring non-oscillatory (or ripple-free) solutions.
Noteworthy, because MPDATA is built upon the donor-cell scheme
characterised by small phase error, the non-oscillatory corrections
have to deal with errors in signal amplitude only. The
non-oscillatory option is a default option within the
libmpdata++. For the derivation and further discussion of
the multi-dimensional non-oscillatory option see
.
Variable-sign scalar fields
The basic MPDATA formulation assumes that the advected field ψ is
exclusively either non-negative or non-positive. In particular, this
assumption is evident in the ψ-fraction factor
ψi+1k-ψikψi+1k+ψik of
Eq. (), which can become unbounded in case of
a variable-sign field. The libmpdata++ library includes
implementations of two MPDATA options intended for simulating
advection of variable-sign field.
The first method replaces ψ with |ψ| in all ψ-fraction
factors that enter the pseudo-velocity expressions. This approach is
robust but it reduces the solution quality where ψ crosses
through zero; see Sect. 3.2(4) in
.
The default method, is the “infinite-gauge” variant of the
algorithm, a generalised one-step Lax–Wendroff (linear, oscillatory)
limit of MPDATA at infinite constant background, discussed in
Sect. 4.2. In practice, the
infinite-gauge option of MPDATA is used with the non-oscillatory
enhancement.
Library interface
Compile-time parameters
Compile-time parameters include number of dimensions, number of
equations and algorithm options. Most of the compile-time parameters
are declared by defining integer constants within the compile-time
parameter structure. Listing depicts a minimal
definition that inherits from the ct_params_default_t
structure containing default values for numerous parameters.
Example definition of compile-time parameters structure.
All solvers expect a structure with compile-time parameters as their
first template parameter, as exemplified in Listing .
Example alias declaration combining solver- and
compile-time parameters choice.
Choosing library components
The library components listed in Sect. are chosen
through template parameters. First, the solver is equipped with an
output mechanism by passing the solver type as a template parameter to
the output type, as exemplified in Listing . The
output classes inherit from solvers.
Example alias declaration of an output mechanism.
Second, the concurrency handlers expect solver class (equipped with output) as the first template parameter.
Subsequent template parameters control
boundary condition types on each of the domain edges (see Listing ).
Example alias declaration of a concurrency handler.
Run-time parameters
Run-time parameters include the grid size, number of MPDATA passes and
output file name. The list of applicable run-time parameters is
defined by fields of the rt_params_t structure. This
structure is defined within each solver and extended when equipping
the solver with an output mechanism. The concurrency handlers expect
an instance of the run-time parameters structure as their constructor
argument. Example code depicting how to set the run-time parameters
and then instantiate a concurrency handler is presented in
Listing .
Example run-time parameter structure declaration followed
by a concurrency handler instantiation.
Public methods
The concurrency handlers act as controlling logic for the other
components and, hence, the user is exposed to the public interface of
these handlers only.
Listing contains signatures of methods implemented by
each of the concurrency handlers.
Signatures of all the methods within libmpdata++
application programming interface.
The advectee() is an accessor method for the advected
scalar fields. It can be used for setting the initial condition as
well as for examining the solver state. It expects an index of the
requested advectee as the argument (advected scalar fields are
numbered from zero). This provides choice between different advected
variables. The returned blitz::Array is zero-base indexed
and has the same size as the computational grid (set with the
grid_size field of the run-time parameters structure, see
Listing ).
The advector() method allows accessing the components of
the vector field of Courant numbers multiplied by the G factor (i.e.
a Jacobian of coordinate transformation, a fluid density field or
their product). The argument selects the vector field components
numbered from zero. The size of the returned array depends on the
component. It equals the grid size in all but the selected dimension
in which it is reduced by 1 (i.e. nx×(ny-1) for the “y”
component and so forth; cf. Fig. ).
The g_factor() is an accessor method for the G field.
The returned array has the same size as the one returned by
advectee(). The default value is set to G≡1 (for
details, see Sect. ).
The advance() method launches the time-stepping logic of
the solver advancing the solution by the number of time steps given as
argument.
The panic_ptr() method returns a pointer to a Boolean
variable that if set to true will cause the solver to stop the
computations after the currently computed time step. This method may
be used, for instance, to implement signal handling within programs
using libmpdata++.
All multi-dimensional arrays used in libmpdata++ use the
default Blitz++ “row-major” memory layout with the last
dimension varying fastest. Domain decomposition for parallel
computations is done over the first dimension only.
Basic example
The source code presented in this subsection is intended to serve as
a minimal complete example on how to use libmpdata++. In
other examples presented throughout the paper, only the fragments of
code that differ significantly from the minimal example will be
presented.
Simulation results generated by the code in
Listing .
A usage example of libmpdata++. The listing
contains the code needed to generate Fig. .
The example consists of an elemental transport problem for
a one-dimensional, variable-sign field advected with a constant
velocity. The simulation results using code in Listing are
shown in Fig. . Spatial and temporal directions are
depicted on the abscissa and ordinate, respectively. Cell-mean values
of the transported field are shown on the applicate and are presented
in compliance with the assumption of data points representing
grid-cell means of the transported field.
The code in Listing begins with three include statements that
reflect the choice of the library components: solver, concurrency
handler and output mechanism. All compile-time parameters are grouped
into a structure passed as a template parameter to the solver. Here,
this structure is named ct_params_t and inherits from
ct_params_default_t what results in assigning default
values to parameters not defined within the inheriting class. The
solvers expect the structure to contain a type real_t which
controls the floating point format used. The two constants that do
not have default values and need to be explicitly defined are
n_dims and n_eqns. They control the
dimensionality of the problem and the number of equations to be
solved, respectively.
Choice between different solver types, output mechanisms and
concurrency handlers is done via type alias declaration. Here, the
basic mpdata solver is chosen which is then equipped with
the gnuplot output mechanism. All output classes expect
a solver class as their first template parameter, which is used to
define the parent class (i.e. output classes inherit from solvers).
Classes representing concurrency handlers expect the output class and
the boundary conditions as their template parameters. In the example,
a basic serial handler is used and open boundary conditions on both
ends of the domain are chosen.
The choice of run-time parameters is done by assigning values to the
member fields of the rt_params_t structure defined within
the solver class and augmented with additional fields by the output
class. In this example, the instance of rt_params_t
structure is named p, the grid size is set to 101 points and
the output is set to be done every 20 time steps. An instance of the
rt_params_t structure is expected as the constructor
parameter for concurrency handlers.
The grid step dx is set to 0.1 and the number of time steps
to 100. Initial values of the Courant number and the transported
scalar fields are set by assigning them to the arrays returned by the
advector() and advectee() methods. In this
example, the Courant number equals 0.5 and the advected shape is
described by the Witch of Agnesi formula y(x)=8a3/(x2+4a2) with
the coefficient a=0.5. Initial shape is centred in the middle of
the computational domain and is shifted downwards by 0.5.
Finally, the actual integration is performed by calling the
advance() method with the number of time steps as argument.
Example: advection scheme options
The following example is intended to present MPDATA advection scheme
options described in Sect. . The way of choosing
different options is discussed, and the calling sequence of the
library interface is shown for the case of advecting multiple scalar
fields.
The example consists of transporting two boxcar signals with different
MPDATA options. In all tests, the first signal extends from 2 to 4
and the second signal extends from -1 to 1, to observe the solution
for fixed-sign and variable-sign signals.
Listing shows the compile-time parameters
structure fields common to all cases presented within this example.
The number of dimensions is set to 1 and the number of equations to
solve is set to 2. Consistent with Listing from the
basic example, p shown in
Listing is an instance of rt_params_t
structure with run-time parameters of the simulation. Setting the
outfreq field to the number of time steps results in plotting
the initial condition and the final state. The outvars field
contains a map with a structure containing a variable name, here left
empty, and unit defined for each of the advected scalar fields.
Listing shows how to set initial values to multiple
scalar fields using the advectee() method with an integer
argument specifying the index of the equation in the solved system.
Compile-time parameters for the example presented in Sect. .
Run-time parameters for the example presented in Sect. .
Initial condition and velocity field for the example presented in
Sect. .
Variable-sign scalar fields
The libmpdata++ library is equipped with two options for
handling variable-sign fields; recall the discussion in Sect. . The option using absolute values is named
abs, whereas the “infinite-gauge” option is dubbed
iga. The option flags are defined in the opts
namespace. The option choice is made by defining the opts
field of the compile-time parameters structure, in analogy to
n_dims or n_eqns.
Result of the simulation with the advection scheme option
for variable-sign signal set to absolute value;
cf. Listing .
Advection scheme options for Fig. ,
variable-sign option is set to absolute value.
In the first test, the choice of handling variable-sign signal is set
to abs (Listing ). Figure shows
the result of simulation with parameters set in
Listings , ,
and . The final signal shows dispersive ripples
characteristic of higher-order schemes. It is also evident that the
ripple magnitude depends on the constant background, a manifestation
of the scheme non-linearity. Furthermore, the final variable-sign
signal features a bogus saddle point at the zero crossings
(cf. Sect. ), and this can be eliminated by using
the infinite-gauge (alias iga) option.
Listing shows how to choose the iga option.
Figure shows the result of the simulation with parameters
set in Listings , ,
and . Although iga evinces
more pronounced oscillations, their magnitudes do not depend on the
constant background. This, together with the robust behaviour of
iga when crossing zero, substantiates the discussion of
Sect. on iga amounting to a linear
limit of MPDATA.
Third-order-accurate variant
Choosing third-order variant enhances the accuracy of the scheme when
used with more than two passes of MPDATA or with iga; recall
Sect. . Option tot enables the third-order
variant of the MPDATA scheme. Figure shows the result of
the same test as in Figs. and but with
MPDATA options set as in Listing . The resulting
signal is evidently more accurate and symmetric, but the oscillations
are still present.
As in Fig. but with variable-sign option
set to “infinite-gauge”; cf. Listing .
Advection scheme options for Fig. ,
variable-sign option is set to “infinite-gauge”.
As in Fig. but with variable-sign option set
to “infinite-gauge” and third-order-accurate variant;
cf. Listing .
Advection scheme options for Fig. ,
variable-sign option is set to “infinite-gauge” and
third-order accuracy variant is chosen.
Non-oscillatory option
To eliminate oscillations apparent in the preceding tests, the
non-oscillatory (fct) option (Sect. ) needs
to be chosen. This option can be used together with all other MPDATA
options, such as basic scheme, variable-sign signals (abs or
iga) and the third-order-accurate variant (tot).
Here, fct is selected together with iga;
cf. Listing . This is the default setting, i.e. when
inheriting from the default parameters structure, and not overriding
the opts setting, as illustrated in Listing .
Figure shows the corresponding results. The
solutions for both fixed-sign and variable-sign signals have
indistinguishable profiles and all of the dispersive ripples have been
suppressed.
As in Fig. but with options set to
infinite-gauge treatment of variable-sign signal and flux
corrections; cf. Listing .
Advection scheme options for Fig. ,
variable-sign option is set to “infinite-gauge” and
non-oscillatory option is enabled. This is the default setting
in libmpdata++.
As in Fig. but with options set to
infinite-gauge treatment of variable-sign signal,
non-oscillatory option and third-order accuracy variant;
cf. Listing .
Advection scheme options for
Fig. , variable-sign option is set to
“infinite-gauge”, non-oscillatory option is enabled and
third-order accuracy variant is chosen.
To further enhance the accuracy of the solution, fct and
iga can be combined with the tot variant;
cf. Listing . The corresponding result is shown
in Fig. . Enabling the third-order-accurate
variant improves the symmetry of the solution, as compared to the
results presented in Fig. .
Example: convergence tests in 1-D
In this subsection the convergence test originated in
is used to quantify the
accuracy of various MPDATA options.
The test consists of a series of one-dimensional simulations with
Courant numbers
C∈(0.05,0.1,0.15,0.2,…,0.85,0.9,0.95)
and grid increments
Δx∈Δxm20,Δxm21,Δxm22,Δxm23,Δxm24,Δxm25,Δxm26,Δxm27,
where Δxm=1 is the maximal increment. The series amounts
to 152 simulations for each option. In each simulation, the number of
time steps NT and the number of grid cells NX is adjusted so that
the total time T and total length of the domain X remain constant.
The domain size X=44Δxm and simulation time T=1 are
selected. The advective velocity is set to u=Δxm/T=1.
In each simulation, a Gaussian profile
ψex(x)t=0=1σ2πexp-(x-x0)22σ2
is advected, and the result of the simulation is compared with the
exact solution ψex. The initial profiles and the exact
solutions are calculated by analytically integrating
function () over the grid-cell extents, to comply with the
inherent MPDATA assumption of a data point representing the grid-cell
mean of transported field. The dispersion parameter of the initial
profile () is set to σ=1.5Δxm, while the
profile is centred in the middle of the domain x0=0.5X.
As a measure of accuracy, a truncation-error function is introduced
err(C,Δx)≡1T∑i=1NX[ψex(xi)-ψ(xi)]2/NXt=T.
The results of the convergence test for the generic
first-order-accurate donor-cell scheme, the basic MPDATA and its
third-order-accurate variant are shown in
Fig. a–c. Each figure
displays, in polar coordinates, the base-two logarithm of the
truncation-error function () for the entire series of 152
simulations. The radius and angle, respectively,
r=ln2ΔxΔxm+8,ϕ=Cπ2,
indicate changes in grid increment and Courant number. Thus, closer
to the origin are simulation results for finer grids, closer to the
abscissa are points for small Courant numbers, and closer to the
ordinate are points with Courant numbers approaching unity. The
contour interval of dashed isolines and of the colour map is set to 1,
corresponding to error reduction by the factor of 2. Lines of
constant grid-cell size and constant Courant number are overlaid with
white contours.
The result of the convergence test (a) for the
donor-cell scheme, (b) for the basic MPDATA and
(c) for the third-order-accurate variant.
As in Fig. (a) for three
passes of MPDATA, (b) for two passes with non-oscillatory
option, (c) for infinite-gauge option, and (d) for
infinite-gauge with third-order-accurate variant.
The figures contain information on the convergence rate of MPDATA
options. When moving along the lines of constant Courant number
towards the origin, thus increasing the spatial and temporal
resolution, the number of crossed dashed isolines determines the order
of the scheme; cf. Sect. 8.1 in
. Therefore, the results in
Fig. a–c attest to the first-, second- and
third-order asymptotic convergence rates, respectively. Furthermore,
the shape of dashed isolines conveys the dependency of the solution
accuracy on the Courant number. In particular, they show that at
fixed spatial resolution the solution accuracy increases with the
Courant number. Moreover, as the order of the convergence increases
the isolines become more circular indicating more isotropic solution
accuracy in the Courant number.
Figure b reproduces the solution in Fig. 1 of
and, thus, verifies the
libmpdata++ implementation. For further verification
Fig. a and b shows results of the convergence
test for (i) three-pass MPDATA (run-time solver parameter
n_iters = 3), and (ii) for two-pass MPDATA with
fct option. These results reproduce Figs. 2 and 3 from
. Noteworthy, an interesting
feature of Fig. a is the groove of the
third-order convergence rate formed around ϕ=45∘,
characteristic of MPDATA with three or more passes
. Next, comparing
Fig. b with Fig.b shows that
the price to be paid for an oscillation-free result is a reduction in
the convergence rate from 2 to ∼1.8; Sect. 4
in.
Figure c and d documents original results for
the convergence test applied to the “infinite-gauge” limit of
MPDATA. In particular, Fig. c shows that
iga is as accurate as three-pass MPDATA, cf. Sect. 4
in, whereas
Fig. d reveals that the third-order-accurate
iga is more anisotropic in Courant number than the
third-order-accurate standard MPDATA in Fig. c.
The convergence test results for the default setting of
libmpdata++ (iga plus fct) are not shown,
because they resemble results from Fig. b with
somewhat enhanced accuracy for well-resolved fields (i.e. small
grid cells).
The results of the example presented in Sect. ; only a quarter of the
domain, centred over the cone's initial location, is shown.
Abscissa and ordinate mark the spatial dimensions.
Colours correspond to the amplitude of the advected field.
Panel (a) shows initial condition of
Sect. , (b) results for basic MPDATA with
fct, (c) for MPDATA with three passes with
fct and tot and (d) for the default
setting of libmpdata++ (iga and fct).
Example: rotating cone in 2-D
This example introduces the libmpdata++ programming interface for
two-dimensional simulations with the velocity field varying in space.
Test results are compared with published MPDATA benchmarks. The
example is based on the classical solid-body rotation test
. The current setup follows
. The initial condition
features a cone centred around the point (x0,y0)=(50Δx,75Δy). The grid interval is Δx=Δy=1, and the
domain size is 100Δx×100Δy – thus containing
101×101 data points (cf. Fig. ). The height of
the cone is set to 4, the radius to 15Δx, and the background
level to 1. The flow velocity is specified as (u,v)=ωy-yc,-(x-xc), where angular velocity ω=10-1 and (xc,yc) denotes coordinates of the domain centre.
With time interval Δt=0.1, one full rotation requires 628
time steps. The total integration time corresponds to six full
rotations.
Compile-time parameter settings for the rotating-cone
test.
Run-time parameter responsible for setting the number of
MPDATA passes in Fig. c.
Implementation of the setup using the libmpdata++ interface
begins with definition of the compile-time parameters structure. The
test features a single scalar field in a two-dimensional space, what
is reflected in the values of n_dims and
n_eqns set in Listing . In one of the test
runs, the number of MPDATA passes (n_iters) is set to 3,
instead of the default value of 2. Corresponding field of run-time
parameters structure is shown in Listing . During
instantiation of the concurrency handler, four boundary-condition
settings (two per each dimension) are passed as template arguments.
In this example, open boundary conditions (bcond::open) are
set in both dimensions – see Listing .
The choice of the threads concurrency handler in
Listing results in multi-threaded calculations – using
OpenMP if the compiler supports it, or using Boost.Thread
otherwise. The number of computational subdomains (and hence threads)
is controlled by the OMP_NUM_THREADS environment
variable, regardless if OpenMP or Boost.Thread implementation
is used. The default is to use all CPUs/cores available in the
system. Notably, replacing concurr::serial from the
previous examples with concurr::threads is the only
modification needed to enable domain decomposition via shared-memory
parallelism.
Concurrency handler instantiation for the rotating-cone
test.
The way the initial condition and the velocity field are set is shown
in Listing . The Courant number components are specified
using calls to the advector() method with the argument
defining the component index.
Initial condition for the rotating-cone test.
The initial condition is displayed in Fig. a,
and the results after total integration time are shown in Fig. b–d.
All plots are centred around cone's initial location and show
only a quarter of the computational domain.
The isolines of the advected cone are plotted with 0.25 intervals.
The results in Fig. b and c were obtained with the
fct and the three-pass tot + fct MPDATA,
respectively, whereas Fig. d shows
test results for the default setting
of libmpdata++.
These results match those presented in
Fig. 1
and Fig. 4 and Table 1.
In particular, the rms errors – defined on the rhs of Eq. () –
are 0.37×10-3, 0.11×10-3 and 0.27×10-3
for the fct, three-pass tot fct and
the default libmpdata++ options, respectively.
Example: revolving sphere in 3-D
This example extends Sect. to three spatial dimensions.
It exemplifies how to specify a three-dimensional setup using
libmpdata++. Furthermore, the option is described for saving
the simulation results to HDF5 files with XDMF
annotations.
The setup follows :
the domain size is 100×100×100,
with a uniform grid consisting of 59 grid points in each direction.
The time step is 0.036π.
The initial condition is a sphere of radius 15 centred
around the point (x0,y0,z0)=(50-25/3,50+25/3,50+25/3) with a constant density of 4.
The sphere is rotating with constant angular velocity Ω=ω/3(1,1,1) of magnitude ω=0.1. The components
of the advecting velocity field are (u,v,w)=(-Ωz(y-yc)+Ωy(z-zc),Ωz(x-xc)-Ωx(z-zc),-Ωy(x-xc)+Ωx(y-yc)), where the
coordinates of the rotation centre are (xc,yc,zc)=(50,50,50).
The test lasts for one revolution which takes 556 time steps.
Specifying the 3-D setup with the libmpdata++ programming
interface calls starts by setting the n_dims field to 3 (Listing ). Listing shows the choice of
recommended three-dimensional output handler hdf5_xdmf.
This results in output consisting of HDF5 files with
XDMF annotation that can be viewed, for example, with the
ParaView visualisation software. This output is saved in
a directory specified by the outdir field of the run-time
parameters; see Listing .
Compile time parameter setting for the revolving-sphere
test.
Alias declaration of an output mechanism for the
revolving-sphere test.
Run-time parameters field specifying output directory for
the revolving-sphere test.
Figure a shows the initial condition,
Fig. b shows the results
after one revolution for the default libmpdata++ options.
The grey volume is composed of dual-grid cells
(Sect. ) encompassing data points with cell-mean values
of density greater than or equal to 1.
Obtained results can be compared with those presented in
Figs. 9–13 and Table 4.
In particular, for the default libmpdata++ setting,
the rms error is 2.8×10-3,
and it compares favourably with the L2 norm in their Table 4.
The results of the example presented in Sect. .
The whole computational domain is shown.
The grey volume encompasses data points with values of density greater or equal to 1.
Panel (a) shows initial condition, (b) results for the default libmpdata++ options.
Example: 2-D advection on a sphere
This subsection concludes homogeneous transport examples with a 2-D
solid-body rotation test on a spherical surface
. The purpose of this example is to
present methods for setting up the simulations in spherical
coordinates.
Following only the case when the
initial field rotates over the poles is presented. The initial
condition is a cone centred around the point (3π/2,0) with height
and radius equal to 1 and 7π/64, respectively. The wind field is
given by
u=-Usinϕcosλ,v=Usinλ,
where λ and ϕ denote respectively longitude and latitude,
and U=π/128. The computational domain [0,2π]×[-π/2,π/2] is resolved with 128×64 grid increments
Δλ=Δϕ and is shifted by 0.5Δϕ so
that there are no data points on the poles. The test is run for 5120
time steps corresponding to one revolution around the globe.
The advection equation in spherical coordinates has the form of the
generalised transport Eq. () with the Jacobian of coordinate
transformation
G=cosϕ.
In order to solve the generalised transport equation with G≢1 the nug option has to be set; see
Listing .
Compile-time parameter field for the example presented in
Sect. .
Boundary conditions in this example incorporate principles of
differential geometry cf. chapter XIV in in the
classical spherical latitude–longitude framework
. They are cyclic
(bcond::cyclic) in the zonal direction, whereas in the
meridional direction they represent two degenerated charts (of the
atlas composed of three) defining differentiation of dependent
variables in vicinity of the poles (bcond::polar;
Listing ). The setting of G is done using the
g_factor() accessor method as shown in
Listing ; note the shift in latitude by Δϕ/2.
Concurrency handler for the example presented in Sect. .
The Jacobian setting for the example presented in Sect. .
The results of the example presented in Sect. .
The plots are centred over the cone's initial location and show the advected field
plotted in spherical coordinates.
Colours mark the amplitude of the advected field.
Panel (a) shows
the initial condition, (b) results for the default
libmpdata++ options and (c) results for the
three-pass MPDATA with fct and tot.
The initial condition for the test is plotted in
Fig. a, whereas the results are displayed in
Fig. b and c. All figures use
orthographic projection, with the perspective centred at the initial
condition (the true solution), with the contour interval 0.1.
Figure b shows the result for the default
libmpdata++ options. There is a visible deformation in the
direction of motion, consistent with earlier Cartesian rotational
tests. The result in Fig. c, obtained using three
passes of MPDATA with fct and tot, shows reduced
deformation and reproduces Fig. 6 in
.
Error norms were calculated following
Eqs. 24a–e
to take into account the effects of coordinate transformation.
For instance, the “energy” conservation error (their ERR2) is -0.066 for the default libmpdata++ setting
and -0.11 for the three-pass MPDATA with tot and fct,
which agrees with the values presented in
Table 1.
Inhomogeneous advective transport
Implemented algorithms
As of the current release, libmpdata++ provides three ways of
handling source terms in the inhomogeneous extension of
Eq. ():
∂tψ+1G∇⋅(Guψ)=R.
The available time integration schemes include the two variants of
the first-order-accurate Euler-forward scheme (hereafter referred to
as euler_a and euler_b) and the
second-order-accurate Crank–Nicolson scheme (trapez). The
Euler schemes are implemented to account for parameterised forcings
(e.g. due to cloud microphysics), whereas the Crank–Nicolson scheme
is standard for basic dynamics (e.g. pressure gradient, Coriolis and
buoyancy forces). In both Euler schemes, while calculating the solver
state at the time level n+1, the right-hand-side at the time level
n is only needed. In the euler_a option
(Eq. ), the source terms are computed and applied
standardly after the advection:
ψn+1=ADV(ψn)+ΔtRn.
In the euler_b option (Eq. ), the source terms
are computed and applied arguably in the Lagrangian spirit;
Sect. 3.2 in before the advection
ψn+1=ADV(ψn+ΔtRn).
In the trapez option (Eq. ), half of the source
terms are computed and applied as in the euler_a and half as
in the euler_b arguably in the spirit of the
Lagrangian trapezoidal rule; Sect. 2.2
in:
ψn+1=ADV(ψn+0.5ΔtRn)+0.5ΔtRn+1.
Library interface
The logic for handling source terms is implemented in the
mpdata_rhs solver that inherits from the mpdata
class (Fig. ). Consequently, all options discussed in
the preceding section apply. The choice of the source-term
integration scheme is controlled by the rhs_scheme
compile-time parameter with the valid values of euler_a,
euler_b or trapez.
The user is expected to provide information on the source terms by
defining a derived class of mpdata_rhs with the
update_rhs() method overloaded. The update_rhs()
signature is given in Listing , whereas the usage
example is given in Sect. . The method is called by
the solver with the following arguments:
a vector of arrays rhs storing the source terms for
each equation of the integrated system,
a floating-point value dt with the time-step value,
an integer number at indicating if the source terms are
to be computed at time level n (if at =0) or n+1
(if at =1).
Signature of the method used for defining source terms.
Calculation of forcings at the n+1 time level is needed if
the rhs_scheme=trapez option is chosen. The case of at
equal to zero is used in the Euler schemes and in the very first time
step when using the trapez option (i.e. once per
simulation). When the trapez option is used, the dt
passed to the update_rhs() method equals half of the original
time step.
The update_rhs() method is expected to first call
parent_t::update_rhs() to zero out the source and sink
terms stored in rhs. Later, it is expected to calculate the
rhs terms in a given time step by summing all sources and
sinks and “augment assign” them to the rhs field (e.g.
using the += operator).
All elements of the rhs vector corresponding to subsequent
equations in the system are expected to be modified in a single
update_rhs() call.
Example: translating oscillator
The purpose of this example is to show how to include rhs terms in
libmpdata++ by creating a user-defined class out of the
library tree.
A system of two one-dimensional advection equations,
∂tψ+∂x(uoψ)=ωϕ,∂tϕ+∂x(uoϕ)=-ωψ,
represents a harmonic oscillator translating with uo=constant; see Sect. 4.1 in for
a discussion. Applying the
trapezoidal rule to integrate the PDE (partial differential equation) system () leads to
the following system of coupled implicit algebraic equations:
ψin+1=ψi∗+0.5Δtωϕin+1,ϕin+1=ϕi∗-0.5Δtωψin+1,
where ψi∗ and ϕi∗ stand for
ψi∗=MPDATAψin+0.5Δtωϕin,C,ϕi∗=MPDATAϕin-0.5Δtωψin,C.
Substituting in Eq. () ψin+1 with
ϕin+1 and vice versa and then regrouping leads to
ψin+1=ψi∗+0.5Δtωϕi∗1+(0.5Δtω)2,ϕin+1=ϕi∗-0.5Δtωψi∗1+(0.5Δtω)2.
Implementation of forcing terms prescribed in Eq. () is
presented in Listing . A new solver
coupled_harmosc is defined by inheriting from the
mpdata_rhs class. A member field omega is defined
to store the frequency of oscillations.
Simulation results of the example presented in Sect. .
Abscissa marks the spatial dimension and ordinate represents the
oscillator amplitude.
The oscillator state is plotted every 20 time steps.
Definition of the solver used in the example presented in
Sect. .
The rhs terms are defined for both variables, ix::psi and
ix::phi, within the update_rhs() method. The method
implements both implicit and explicit formulæ, the two cases are
switched by the at argument. Defining forcings for both
n and n+1 cases allows using this class with both
euler and trapez options. The current state of the
model is obtained via a call to the state() method. Note how
the formulæ defined in update_rhs() in case (1) loosely
resemble the mathematical notation presented in Eq. ().
The 0.5 is absent because the Δt passed as argument in
trapez option is already divided by 2.
Next, the rt_params_t structure is augmented (by inheriting
from parent's rt_params_t) with the omega. Lastly,
the coupled_harmosc constructor is defined. Within it, the
choice of the omega is handled by copying its value from the
p.omega to omega member field and then checking if
the user has altered the default value of 0.
Compile-time parameter structure for the example presented in
Sect. .
For inhomogeneous transport, the rhs_scheme within the
ct_params_t structure needs to be defined. In this example
it is set to trapez (Listing ). MPDATA
advection scheme options are set to default by inheriting from the
ct_params_t_default structure. The structure ix
allows calling advected variables by their labels, phi and
psi, rather than integer numbers. Lastly, when defining the
rt_params_t structure a value is assigned to the member
field p.omega; see Listing .
Run-time parameter structure for the example presented in
Sect. .
In the present example, the initial condition for ψ is defined as
ψ(x)=0.5[1+cos(2πx/100)] for x∈(50,150) and zero
elsewhere. The initial condition for ϕ is set to zero.
The result of 1400 s of simulated time is shown in
Fig. . Note that the solutions for both ψ and
ϕ remain in phase and feature no substantial amplitude error.
This contrasts with calculations using Euler-forward schemes (not shown).
In particular, at the end of the simulation, the rms error is 1×10-7 and
1×10-18 for the analogous experiment with uo≡0 (not shown).
Transport with prognosed velocity
Implemented algorithms
Whenever the velocity field changes in time, the second-order accuracy
of the solution at n+1 requires
an estimate of the advector at n+1/2.
This is provided by linear extrapolations from n and n-1 values
. Furthermore, when the
velocity is a dependent variable of the model,
Eq. () embodies equations of motion. Then the
velocity (or momentum) components are treated as advected scalars
(i.e. advectees) and are predicted at the centres of the dual-grid
cells (Fig. ). The advector field is then interpolated
linearly to the centres of the cell walls.
Library interface
The algorithms for interpolating in space and extrapolating in time
the advector field from the model variables are defined in the
mpdata_rhs_vip class and all user-created solvers with
time-varying velocity must inherit from this class.
The transported fields may represent either velocity or momenta. In
the latter case the prognosed velocity components are calculated as
ratios of two advectee fields (e.g. momentum components and density).
The index of the advectee that forms the common denominator for all
velocity components should be assigned to vip_den. The
vip_i, vip_j and vip_k store the
index of the advected fields appearing in the numerators for each
velocity component. These velocity components are then used to
calculate the advector field. In cases when the velocity components
are model variables (as in the example of Sect. ), the
common denominator is redundant and the value -1 should be assigned to
vip_den.
For systems where numerators and denominators can uniformly approach
zeros, the vip_eps value is defined to prevent divisions
by zero. Then, if the denominator at a given grid-point is less than
the vip_eps, the resulting advector is set to zero
therein. The default setting (represented with vip_eps set to 0)
gives no protection from divisions by zeros. Any user-defined vip_eps > 0 activates the above algorithm.
The vip_i, vip_j, vip_k and
vip_den are expected to be members of the compile-time
parameters structure ct_params_t of the
mpdata_rhs_vip class. The vip_eps value is
a run-time parameter.
As of the current release, the prognosed-velocity features of
libmpdata++ are implemented for constant G≡1 only.
Example: 1-D shallow-water system
The aim of this example is to show how to define simulations with
prognosed velocity field. The necessary compile-time and run-time
parameters as well as the user-defined class with source and sink
terms are discussed. The obtained results are compared with the
analytical solution and a published MPDATA benchmark.
The idealised system of 1-D inviscid shallow-water equations is
considered, with both the surface friction and background rotation
neglected. The simulated physical scenario is a slab-symmetric
parabolic drop spreading under gravity; see for
a general context and for the bespoke
analytical solutions. The corresponding governing equations take the
dimensionless form
∂th+∂x(uh)=0,∂t(uh)+∂x(uuh)=-h∂xh,
where h is a normalised depth of the fluid layer and u is
a normalised velocity. Following the
selected velocity scale is uo=(gho)1/2,
where ho is the initial height of the drop and g denotes
the gravitational acceleration. The characteristic timescale is
to=a/uo, where a denotes the initial half-width of the
drop. At the initial time a drop is confined to |x|≤1 and
centred about x=0,
h(x,t=0)=1-x2,for|x|≤10,for|x|>1.
The time step is set to 0.01 and the grid spacing is set to 0.05. The
crux of the test is a synchronous solution for the depth and momentum
near the drop edge that accurately diagnoses the velocity.
The definition of the rhs terms for Sect. is presented in
Listing . Only the method for calculating the forcing
terms is shown; for the full out-of-the-library-tree definition of
source-terms see Listing . As in Listing ,
the definition in Listing attempts to follow the
mathematical notation. Because of the use of the grad
function, the nabla namespace is included.
Method for calculating source and sink terms in the example presented in
Sect. .
Compile-time parameters for the example presented in Sect. .
Listing specifies the compile-time parameters
structure. Because fluid flow in this example is divergent the
opts::dfl correction is enabled; cf. Sect. .
The rhs_scheme is set to
trapez. Within the ix structure,
the equation indices are assigned. Furthermore, the recipe for
calculating the velocity is defined by assigning the indices to
vip_i and vip_den. Lack of the rhs terms is
specified by toggling the nth bit of the hints_norhs field,
where n is the index of the homogeneous equation. This prevents the
unnecessary summation of zeros.
Listing shows the run-time parameters structure. The
value of gravitational acceleration p.g is set to 1 to
follow the dimensionless notation of Eq. (), and the
vip_eps is set arbitrarily to 10-8.
Run-time parameters for the example presented in Sect. .
The results of the test are plotted in Fig. .
Figure a shows the initial condition (black) and the
analytical solution for t=3 (blue). Solid lines mark the fluid depth
and the dashed line the velocity. The remaining two panels show
numerical results at t=3 for different MPDATA options
(red) plotted over the top panel. Figure b shows the
solution with options abs and fct, whereas
Fig. c shows the solution obtained with options
iga and fct.
All presented results are free of apparent artefacts near the drop
edge. The abs+fct in the central panel compares
well with Fig. 7b in , whereas
the iga+fct solution in the bottom panel closely
reproduces the analytical result.
The rms error, on the rhs of Eq. (), at the end of the simulation is
5.77×10-4 for abs+fct and
1.87×10-4 for iga+fct options.
Simulation results of the example presented in Sect. . Solid lines
represent fluid height and dashed lines represent fluid velocity.
Initial condition is plotted in black, analytical solution in blue
and test results in red. (a) shows the initial condition
and analytical solution at t=3. (b) and (c) show numerical results plotted over (a) obtained with
options abs + fct and
iga + fct, respectively.
Example: 2-D shallow-water system
The 2-D shallow-water test discussed here is an original
axis-symmetric extension of the 1-D slab-symmetric test in
Sect. . The corresponding dimensionless equations take
the form
∂th+∂x(uh)+∂y(vh)=0,∂t(uh)+∂x(uuh)+∂y(vuh)=-h∂xh,∂t(vh)+∂x(uvh)+∂y(vvh)=-h∂yh.
As in the 1-D case, h stands for the fluid height and u and v are the
velocity components in x and y directions, respectively. Again,
the initial condition consists of a parabolic drop centred at the
origin and confined to x2+y2≤1,
h(x,y,t=0)=1-x2-y2,forx2+y2≤10,forx2+y2>1.
Following the method presented by and
, the analytical solution of the
system () can be obtained as
h(x,y,t)=1λ2-x2+y2λ4,u(x,t)=xλ˙λ,v(y,t)=yλ˙λ.
Here λ(t) is half-width of the drop, evolving according to
λ(t)=2t2+1
and λ˙≡dλ/dt is the velocity of the leading
edge.
Figure shows a perspective display of drop height at
t=3, whereas Fig. shows the profiles of velocity and
height of the drop. Similarly to Fig. , the top panel
shows the initial condition (black) and analytical solution for t=3
(blue). Central and bottom panels show corresponding numerical
results at t=3 (red). Solid lines represent the fluid height and
the dashed lines the velocity. The central panel shows the solution
with options abs and fct, whereas the bottom
panel shows the solution with options iga and
fct. As in the 1-D case, the velocity field near the drop
edge is regular and the iga+fct result closely
follows the analytical solution.
The rms error for abs and fct
equals 1.60×10-4 and for abs and iga 0.70×10-4;
see for a discussion.
Drop height at t=3 of the example presented in Sect. .
Systems with elliptic pressure
equation
Implemented algorithms
The libmpdata++ library includes an implicit representation
of pressure gradient terms for incompressible fluid equations. This
necessitates the solution of an elliptic Poisson problem for pressure.
The elliptic problem is solved after applying all explicitly known
forcings to ensure a non-divergent velocity field at the end of each
time step. As of the current release, the library is equipped with
the minimal- and conjugate-residual variational iterative solvers.
For the derivation of used schemes and further discussion of the
elliptic problem see ,
and references therein.
The same as in Fig. but for a cross section of the
two-dimensional case.
Library interface
The methods for solving the elliptic problem are implemented in the
mpdata_rhs_vip_prs class (Fig. ). This
class inherits from the mpdata_rhs_vip class. Therefore,
the way to specify other source terms as well as the time-varying velocity
field remains unchanged.
The choice of elliptic solver is controlled by setting the
compile-time parameter prs_scheme to mr and
cr for the minimal-residual and conjugate-residual solver,
respectively. The iterations within the elliptic solver stop when the
divergence of the velocity field is lower than a threshold tolerance
set by a run-time parameter prs_tol
cf..
Example: Boussinesq convection
The goal of this example is to show the user interface for simulations
featuring an elliptic pressure equation. The governing PDE system
consists of momentum, potential temperature, and mass-continuity
equations for an ideal, 2-D, incompressible Boussinesq fluid
∂tv+∇⋅(v⊗v)=-∇π-gθ′θo,∂tθ+∇⋅(vθ)=0,∇⋅v=0.
Here, v=(u,w) denotes the velocity field, π is the
pressure perturbation about the hydrostatic reference state normalised
by the reference density ρo constant in the Boussinesq
model and g is the gravitational acceleration.
Furthermore, θ′ represents the potential
temperature perturbation about the reference state θo=constant, and ⊗ denotes the tensor product.
Combining the velocity prediction from the momentum equation,
according to Eq. (), with the mass continuity
Eq. () leads to the elliptic Poisson problem:
-1ρo∇⋅ρov^-0.5Δt∇π=0,
where v^ is the velocity field after the advection
summed with all the explicitly known source terms at time level n+1,
namely buoyancy in this example.
In Eq. () the pressure perturbation field π
is unknown, and it needs to be adjusted such that the final velocity
field v^-0.5Δt∇π satisfies the mass
continuity Eq. (). Denoting 0.5Δtπ as
ϕ allows symbolising Eq. () using standard
notation for linear sparse problems
:
L(ϕ)-R=0.
The setup of the test follows
. It consists of a circular
potential temperature anomaly of radius 250 m, embedded in
a neutrally stratified quiescent environment, with θo=300K, in the domain resolved with 200×200
grid cells of the size dx=dy=10 m. The
initial anomaly θ′=0.5K is centred in the
horizontal, 260 m above the lower boundary.
The value of g is set to 9.81 ms-1.
The time step is set to Δt=0.75 s and the simulation takes 800
time steps.
Compile-time parameters for the example presented in Sect. .
Listing shows the compile-time parameters
structure. The time integration scheme for the buoyancy forcing is
set to trapez, as the user has a choice of the algorithm.
However, as of the current release, the elliptic problem formulation
requires forcings to be independent of velocity if handled using the
trapez scheme. The implicit pressure gradient terms are
always integrated with the trapezoidal rule (), regardless
of the rhs_scheme setting. In Listing
the elliptic solver option is set to the conjugate-residual scheme
cr. The vip_den is set to -1, because here
the velocity components are the model kinematic variables; cf. the
discussion in second paragraph of Sect. .
The convergence threshold of the elliptic solver,
∇⋅(v)≤ε, is set to 10-7 via the
run-time parameter prs_tol (Listing ).
Run-time parameter field setting the accuracy of the pressure
solver.
Listing shows the buoyancy forcing definition.
Method for calculating source and sink terms in the example presented in
Sect. .
The results of the example presented in Sect. .
Abscissa and ordinate mark the spatial dimensions.
Colours correspond to potential temperature values.
Panel (a) shows results from the 200th, (b) from the
600th and (c) from the 800th time step.
The evolved θ fields after 200, 600 and 800 time steps
are shown in Fig. a–c. These results correspond to plots from Fig. 3
in and illustrate that
libmpdata++ captures the interfacial instabilities and sharp
gradients, including small turbulent structures in
Fig. c. Yet, the solutions contain small
(imperceptible in the plots) under- and overshoots, developing at the
rate of Δθ/Δt∼Δtθo∇⋅(v). These oscillations depend on
the magnitude of the residual errors, ∇⋅v≠0,
controlled by the convergence threshold prs_tol. For
substantiation, Table displays the magnitude of such
spurious extrema δθmax – defined as the larger
from the maximal magnitudes of normalised under- and overshoots with
respect to their initial values – against prs_tol at the
time of Fig. c. Note that
δθmax is bounded by prs_tol(× 800Δt).
The conservation errors for θ′ and (θ′)2 are
defined as
err1=∑θ′-∑θo′∑θo′100%,
err2=∑(θ′)2-∑(θo′)2∑(θo′)2100%,
where θo′ indicates the initial perturbation and ∑ stands for
summing over the whole computational domain.
At the end of the simulation err1≈1×10-11 is orders of magnitude smaller
than in semi-Lagrangian calculations of ,
whereas err2=-14 matches their value, reflecting the implicit LES (ILES) property
of non-oscillatory numerics; see ,
and references therein.
Maximal spurious extrema of the θ field after 800 time steps for
various values of the convergence threshold prs_tol.
prs_tol
10-5
10-7
10-9
δθmax
3×10-4
8×10-6
1×10-7
Remarks
In this paper the first release of libmpdata++ was
introduced. Versatility of the user interface as well as the
correctness of the implementation were illustrated with a series of
examples with increasing degree of physical, mathematical and
programming complexity. Starting from elementary advection in
the Cartesian domain, through passive advection on the sphere, through
slab- and axis-symmetric water drop spreading under gravity, to
buoyant convection in an incompressible Boussinesq fluid, the
accompanying discussions included code snippets, description of the
user interface and comparison with previously published benchmarks.
Our priority in the development of libmpdata++ is the
researcher productivity. In case of scientific software such as
libmpdata++, the researchers are both users and developers of
the library. The adherence to the principle of separation of concerns
and employment of programming techniques that promote code conciseness
– e.g. the current release consists of less than 10k lines of code –
contribute to the developers' productivity. The user productivity is
amplified by ensuring that the release of the library is accompanied
with example-rich documentation. Both the users and developers
benefit from the free/libre open-source software release of the
library.
Our experience with the current version of libmpdata++
indicates that the embraced object-oriented techniques and modular design of the library
generally do not come as a trade-off for performance.
On small grids, however, there is a noticeable overhead
compared to the original Fortran implementation.
For example, in serial runs, up to 5-times longer execution times
were measured for the 3-D revolving-sphere tests
discussed in Sect. (593 grid).
The relative performance improves with increasing grid size,
reaching execution times on a par with the original Fortran implementation
on the (6×59)3 grid.
On the other hand, the separation of concerns obtained with the object-oriented
design of the library allowed equipping the code with the multi-threading mechanism,
without any substantial changes in the numerics code.
Noteworthy, for all 2-D and 3-D examples presented in the paper,
a minimum of fivefold speed-up is obtained when executing on six threads.
The library is in active development and improvements in performance are expected.
Furthermore, equipping the library with distributed-memory parallelisation
is planned for a subsequent release.