libmpdata + + 1 . 0 : a library of parallel MPDATA solvers for systems of generalised transport equations

This paper accompanies the first release of libmpdata++, a C++ library implementing the multi-dimensional positive-definite advection transport algorithm (MPDATA) on regular structured grid. The library offers basic numerical solvers for systems of generalised transport equations. The solvers are forward-in-time, conservative and nonlinearly stable. The libmpdata++ library covers the basic second-order-accurate formulation of MPDATA, its thirdorder variant, the infinite-gauge option for variable-sign fields and a flux-corrected transport extension to guarantee non-oscillatory solutions. The library is equipped with a nonsymmetric variational elliptic solver for implicit evaluation of pressure gradient terms. All solvers offer parallelisation through domain decomposition using shared-memory parallelisation. The paper describes the library programming interface, and serves as a user guide. Supported options are illustrated with benchmarks discussed in the MPDATA literature. Benchmark descriptions include code snippets as well as quantitative representations of simulation results. Examples of applications include homogeneous transport in one, two and three dimensions in Cartesian and spherical domains; a shallow-water system compared with analytical solution (originally derived for a 2-D case); and a buoyant convection problem in an incompressible Boussinesq fluid with interfacial instability. All the examples are implemented out of the library tree. Regardless of the differences in the problem dimensionality, right-hand-side terms, boundary conditions and parallelisation approach, all the examples use the same unmodified library, which is a key goal of libmpdata++ design. The design, based on the principle of separation of concerns, prioritises the user and developer productivity. The libmpdata++ library is implemented in C++, making use of the Blitz++ multi-dimensional array containers, and is released as free/libre and open-source software.


Introduction
The MPDATA advection scheme introduced in Smolarkiewicz (1983) has grown into a family of numerical algorithms for geosciences and beyond (see for example Grabowski and Smolarkiewicz, 2002;Cotter et al., 2002;Smolarkiewicz and Szmelter, 2009;Ortiz and Smolarkiewicz, 2009;Hyman et al., 2012;Charbonneau and Smolarkiewicz, 2013).MPDATA stands for multidimensional positive-definite advection transport algorithm1 .It is a finite-difference/finite-volume algorithm for solving the generalised transport equation (1) Equation ( 1) describes the advection of a scalar field ψ in a flow with velocity u.The field R on the right-hand side (rhs) is a total of source/sink terms.The scalar field G can represent the fluid density, the Jacobian of coordinate transformation or their product and satisfies the equation (2) In the homogeneous case (R ≡ 0), MPDATA is at least second-order-accurate in space and time, conservative and non-linearly stable.
The history of MPDATA spans 3 decades: Smolarkiewicz (1984)- Kühnlein et al. (2012), Smolarkiewicz et al. (2014) and is widely documented in the literature -see Smolarkiewicz and Margolin (1998), Smolarkiewicz (2006) and Prusa et al. (2008) for reviews.Notwithstanding, from the authors' experience the software engineering aspects still overshadow the benefits of MPDATA.To facilitate the use of MP-DATA schemes, hereby we present a new implementation of the MPDATA family of algorithms for regular structured grids -libmpdata++.
In the development of libmpdata++ we strive to comply with the best sought-after practices among the scientific community (Wilson et al., 2014), in particular with the paradigm of maximising code reuse.This paradigm is embodied in the "open source computational libraries -the main foundation upon which academic and also a significant part of industrial computational research rests" (Bangerth and Heister, 2013).
The libmpdata++ has been developed in C++2 , making extensive use of object-oriented programming (OOP) and template programming.The primary goals when designing libm-pdata++ were to maintain strict separation of concerns and to reproduce within the code the mathematical "blackboard abstractions" used for documenting numerical algorithms.The adopted design contributes to the readability, maintainability and conciseness of the code.The current development of libmpdata++ is an extension of the research on OOP implementation of the basic MPDATA scheme presented in Arabas et al. (2014).
The goal of this article is twofold: first, to document the library interface by providing usage examples and, second, to validate the correctness of the implementation by verifying the results against published benchmarks.
The structure of the paper is as follows.Section 2 outlines the library design.The four sections that follow correspond to four types of equation systems solved by the implemented algorithms, namely homogeneous advective transport, inhomogeneous transport, transport with prognosed velocity, and systems featuring elliptic pressure equation.Each of these sections outlines the implemented algorithms, describes the library interface and provides usage examples.Each example is accompanied with a definition of the solved problem, description of the program code and discussion of the results.An index of libmpdata++ options documented in the article is provided in Appendix A.
The paper structure reflects the solver inheritance hierarchy in libmpdata++.All features discussed in preceding sections apply to the one that follows.The set of discussed problems was selected to match the tutorial structure of the paper.The presentation begins with simple examples focusing on the basic library interface.Subsequent examples use increas-ingly more complicated cases with the most complex reflecting potential for applications to cloud dynamics (Grabowski and Smolarkiewicz, 2002).
The library and programs used to generate all results presented in the paper are released as free and open-source software -see the section on code availability at the end of the paper.

Dependencies and supported platforms
The libmpdata++ package is a header-only C++ library.It is built upon the Blitz++3 array containers.We refer the reader to the Blitz++ documentation (Veldhuizen, 2006) for description of the Blitz++ interface, to which the user is exposed while working with libmpdata++.The libmpdata++ core also depends on several components of the Boost 4 library collection, however these are used internally only.Output handlers included in the library depend additionally on gnuplotiostream 5 and HDF5 6 , but their use is optional.Example programs discussed within this article require gnuplot 7 , Par-aView 8 , Python, and the following Python packages: h5py 9 , matplotlib 10 and scipy 11 .
The library code requires a C++11-compliant compiler.In the current development workflow, we employ continuous integration on Linux with GNU g++ 12 and LLVM clang++ 13 compilers and on Apple OSX with the Apple clang++ 14 compiler.Consequently, these are considered the supported platforms.

Components
Components of the library are grouped as follows. -Solvers: -mpdata intended for solving homogeneous transport problems (Sect.3) -mpdata_rhs extending the above with rhs term handling (Sect.4) -mpdata_rhs_vip adding prognosed-velocity support (Sect.5) Classes defined within libmpdata++ have their names surrounded with black frames.The coupled_harmosc class is an example of a user-defined class defined out of the library tree.The solid black lines show the inheritance relations.The output label depicts any of the output handlers available in libmpdata++.
-Output handlers: -gnuplot offering direct communication with the gnuplot program with no intermediate output files -hdf5 offering basic HDF5 output compatible with netCDF 15 readers -hdf5_xdmf implementing the eXtensible Data Model and Format 16 standard supported for instance by the ParaView visualisation tool.
-Boundary conditions: -cyclic implementing periodic boundaries -open giving zero-divergence condition on domain edges -polar applicable with spherical coordinates.
-Concurrency handlers: -serial for single-thread operation -cxx11_thread for multi-threading using C++11 Performing integration with libmpdata++ requires choosing one of the solvers, one output handler, one boundary condition per each domain edge and one concurrency handler.The inheritance diagram in Fig. 1 shows relationships between libmpdata++ solvers defined within the library.The diagram includes as well an example user-defined class cou-pled_harmosc defined out of the library tree.The mpdata solver is displayed at the top, as it is the base class for all other classes.

Computational domain and grid
The arrangement of the computational domain used in libm-pdata++ is shown in Fig. 2. The initial condition for the dependent variable ψ is assumed to be known in nx × ny data points.The outermost data points are located at the boundaries of the domain.
The dual, staggered Arakawa-C grid (Arakawa and Lamb, 1977) used in libmpdata++ is shown in Fig. 3.In this spatial discretisation approach, the cell-mean values of the scalar fields ψ, and G reside in the centres of computational cells,corresponding to the data points of the primary grid in Fig. 2 -whereas the components of the velocity field u are specified at the cell edges of the dual grid in Fig. 3.

Error and progress reporting
There are several error-handling mechanisms used within libmpdata++.1), solid lines depict edges of primary grid and dashed lines mark edges of dual grid in Fig. 3.
First, there are sanity checks within the code implemented using static_assert() calls.These are reported during compilation, for instance when invalid values of compile-time parameters are supplied.
Second, there are available numerous run-time sanity checks, implemented using assert() calls.These are often time-consuming and are not intended to be executed in production runs.To disable them, one needs to compile the program using libmpdata++ with the -DNDEBUG compiler flag.Examples of such checks include detection of NaN values within the model state variables, which may be useful to trace origins of numerical instability problems.
Third, the user may chose to activate the Blitz++ debug mode that enables run-time array range checks.Activating Blitz++ debug mode requires compiling the program using libmpdata++ with the -DBZ_DEBUG flag and linking with libblitz.
Simulation progress is communicated to the user by continuously updating the process threads' names with the percentage of work completed (can be observed e.g. by invoking top -H).

Advective transport
The focus of this section is on the advection algorithm used within libmpdata++.Section 3.1 provides a short introduction to the implemented MPDATA scheme.Section 3.2 describes the library interface needed for the homogeneous transport cases.The following  show examples of usage of libmpdata++ along with the references to other MPDATA benchmarks.
Bullets denote the cell centres and dashed lines denote the cell walls corresponding to the dual grid in Fig. 2.

Implemented algorithms
This subsection is intended to provide the reader with an outline of selected MPDATA features that correspond to the options presently available in libmpdata++.For the full derivation of the scheme and its options see the reviews in Smolarkiewicz and Margolin (1998) and Smolarkiewicz (2006), whereas for an extended discussion of stability, positivity and convexity see Smolarkiewicz and Szmelter (2005).
In the present implementation, it is assumed that G is constant in time.Consequently, the governing homogeneous transport Eq. ( 1) can be written as This particular form is solved by the mpdata solver of libm-pdata++.
The following paragraphs will focus on the algorithms used for handling Eq. ( 3).The rules for applying source and sink terms are presented in Sect. 4.

Basic MPDATA
MPDATA is an, at least, second-order-accurate iterative scheme in which all iterations take the form of a first-orderaccurate donor-cell pass (alias upwind, upstream; cf.Press et al., 2007, Sect. 20.1.3).For the one-dimensional 17 case, after the discretisation in space (subscripts i) and time (su-perscripts n), the donor-cell pass applied to Eq. (3) yields The flux function F is defined as where [u] + ≡ max(u, 0) and [u] − ≡ min(u, 0).In the case of a time-varying velocity field, the velocity components are evaluated at an intermediate time level denoted by the n + 1/2 superscript in Eq. ( 4).Association of the velocity components with dual-cell edges is denoted by fractional indices i + 1/2 and i − 1/2; see Fig. 3.
Hereafter, Gu t x is written compactly as GC, where C denotes the Courant number.GC is referred to as the advector, while the scalar field ψ as the advectee -the nomenclature adopted after Randall (2013).
Evaluation of Eq. ( 4) concludes the first pass of MP-DATA.To compensate for the implicit diffusion of the donorcell pass, the subsequent passes of MPDATA reuse Eqs. ( 4) and ( 5), but with ψ replaced with the result of the preceding pass and u replaced with the "anti-diffusive" pseudovelocity.The pseudo-velocity is analytically derived by expanding Eq. ( 4) in the second-order Taylor series about spatial point i and time level n, and representing the leading, dissipative truncation error as an advective flux; see Smolarkiewicz (1984) for a derivation.A single corrective pass ensures second-order accuracy in time and space.Subsequent corrective passes decrease the amplitude of the leading error, within second-order accuracy.The one-dimensional formula for the basic antidiffusive advector is written as where k numbers MPDATA passes.For k = 1, C k is the flowvelocity-based Courant number, whereas for k > 1, C k is the pseudo-velocity-based Courant number.The number of corrective passes can be chosen within libmpdata++.
The library features two implementations of the donorcell algorithm defined by Eqs.(4) and ( 5).The default one is a "straightforward" summation.The alternative, more resource-intensive, is the compensated summation algorithm of Kahan (1965) which reduces round-off error arising when summing numbers of different magnitudes.

Third-order-accurate variant
Accounting for third-order terms in the Taylor series expansion while deriving the pseudo-velocity improves the accuracy of MPDATA.When G ≡ 1, u = constant and three or more corrective passes are applied, the procedure ensures third-order accuracy in time and space.The formulae for the third-order scheme, derived analytically in Margolin and Smolarkiewicz (1998), can be found in Smolarkiewicz and Margolin (1998, Eq. 36).

Divergent-flow variant
In case of a divergent flow, the pseudo-velocity formulae are augmented with an additional term proportional to the flow divergence.This additional term is implemented in libmpdata++ following Smolarkiewicz and Margolin (1998, Sect. 3.2(3)).

Non-oscillatory option
Solutions obtained with the basic MPDATA are signpreserving, and thus non-oscillatory near zero.Generally, however, they feature dispersive ripples characteristic of higher-order numerical schemes.These can be suppressed by limiting the pseudo-velocities, in the spirit of flux-corrected transport.Application of the limiters reduces somewhat the accuracy of the scheme (Smolarkiewicz and Grabowski, 1990), yet this loss is generally outweighed by ensuring nonoscillatory (or ripple-free) solutions.Noteworthy, because MPDATA is built upon the donor-cell scheme characterised by small phase error, the non-oscillatory corrections have to deal with errors in signal amplitude only.The non-oscillatory option is a default option within the libmpdata++.For the derivation and further discussion of the multi-dimensional non-oscillatory option see Smolarkiewicz and Grabowski (1990).

Variable-sign scalar fields
The basic MPDATA formulation assumes that the advected field ψ is exclusively either non-negative or non-positive.In particular, this assumption is evident in the ψ-fraction factor 6), which can become unbounded in case of a variable-sign field.The libmpdata++ library includes implementations of two MPDATA options intended for simulating advection of variable-sign field.
The first method replaces ψ with |ψ| in all ψ-fraction factors that enter the pseudo-velocity expressions.This approach is robust but it reduces the solution quality where ψ crosses through zero; see Sect.3.2(4) in Smolarkiewicz and Margolin (1998).
The default method, is the "infinite-gauge" variant of the algorithm, a generalised one-step Lax-Wendroff (linear, oscillatory) limit of MPDATA at infinite constant background, discussed in Smolarkiewicz (2006, Sect. 4.2).In practice, the infinite-gauge option of MPDATA is used with the nonoscillatory enhancement.

Compile-time parameters
Compile-time parameters include number of dimensions, number of equations and algorithm options.Most of the compile-time parameters are declared by defining integer constants within the compile-time parameter structure.Listing 1 depicts a minimal definition that inherits from the ct_params_default_t structure containing default values for numerous parameters.All solvers expect a structure with compile-time parameters as their first template parameter, as exemplified in Listing 2.

Choosing library components
The library components listed in Sect.2.2 are chosen through template parameters.First, the solver is equipped with an output mechanism by passing the solver type as a template parameter to the output type, as exemplified in Listing 3. The output classes inherit from solvers.Second, the concurrency handlers expect solver class (equipped with output) as the first template parameter.Subsequent template parameters control boundary condition types on each of the domain edges (see Listing 4).

Run-time parameters
Run-time parameters include the grid size, number of MP-DATA passes and output file name.The list of applicable run-time parameters is defined by fields of the rt_params_t structure.This structure is defined within each solver and extended when equipping the solver with an output mechanism.The concurrency handlers expect an instance of the run-time parameters structure as their constructor argument.Example code depicting how to set the run-time parameters and then instantiate a concurrency handler is presented in Listing 5.

Public methods
The concurrency handlers act as controlling logic for the other components and, hence, the user is exposed to the public interface of these handlers only.
Listing 6 contains signatures of methods implemented by each of the concurrency handlers.The advectee() is an accessor method for the advected scalar fields.It can be used for setting the initial condition as well as for examining the solver state.It expects an index of the requested advectee as the argument (advected scalar fields are numbered from zero).This provides choice between different advected variables.The returned blitz::Array is zero-base indexed and has the same size as the computational grid (set with the grid_size field of the run-time parameters structure, see Listing 5).
The advector() method allows accessing the components of the vector field of Courant numbers multiplied by the G factor (i.e. a Jacobian of coordinate transformation, a fluid density field or their product).The argument selects the vector field components numbered from zero.The size of the returned array depends on the component.It equals the grid size in all but the selected dimension in which it is reduced by 1 (i.e.nx × (ny − 1) for the "y" component and so forth; cf.Fig. 3).
The g_factor() is an accessor method for the G field.The returned array has the same size as the one returned by advectee().The default value is set to G ≡ 1 (for details, see Sect. 3.8).
The advance() method launches the time-stepping logic of the solver advancing the solution by the number of time steps given as argument.
The panic_ptr() method returns a pointer to a Boolean variable that if set to true will cause the solver to stop the computations after the currently computed time step.This method may be used, for instance, to implement signal handling within programs using libmpdata++.
All multi-dimensional arrays used in libmpdata++ use the default Blitz++ "row-major" memory layout with the last dimension varying fastest.Domain decomposition for parallel computations is done over the first dimension only.

Basic example
The source code presented in this subsection is intended to serve as a minimal complete example on how to use libm-pdata++.In other examples presented throughout the paper, only the fragments of code that differ significantly from the minimal example will be presented.
The example consists of an elemental transport problem for a one-dimensional, variable-sign field advected with a constant velocity.The simulation results using code in Listing 7 are shown in Fig. 4. Spatial and temporal directions are depicted on the abscissa and ordinate, respectively.Cellmean values of the transported field are shown on the applicate and are presented in compliance with the assumption of data points representing grid-cell means of the transported field.The code in Listing 7 begins with three include statements that reflect the choice of the library components: solver, concurrency handler and output mechanism.All compile-time parameters are grouped into a structure passed as a template parameter to the solver.Here, this structure is named ct_params_t and inherits from ct_params_default_t what 1012 A. Jaruga et al.: libmpdata++: MPDATA solver library in C++ results in assigning default values to parameters not defined within the inheriting class.The solvers expect the structure to contain a type real_t which controls the floating point format used.The two constants that do not have default values and need to be explicitly defined are n_dims and n_eqns.They control the dimensionality of the problem and the number of equations to be solved, respectively.
Choice between different solver types, output mechanisms and concurrency handlers is done via type alias declaration.Here, the basic mpdata solver is chosen which is then equipped with the gnuplot output mechanism.All output classes expect a solver class as their first template parameter, which is used to define the parent class (i.e.output classes inherit from solvers).
Classes representing concurrency handlers expect the output class and the boundary conditions as their template parameters.In the example, a basic serial handler is used and open boundary conditions on both ends of the domain are chosen.
The choice of run-time parameters is done by assigning values to the member fields of the rt_params_t structure defined within the solver class and augmented with additional fields by the output class.In this example, the instance of rt_params_t structure is named p, the grid size is set to 101 points and the output is set to be done every 20 time steps.An instance of the rt_params_t structure is expected as the constructor parameter for concurrency handlers.
The grid step dx is set to 0.1 and the number of time steps to 100.Initial values of the Courant number and the transported scalar fields are set by assigning them to the arrays returned by the advector() and advectee() methods.In this example, the Courant number equals 0.5 and the advected shape is described by the Witch of Agnesi formula y(x) = 8a 3 /(x 2 + 4a 2 ) with the coefficient a = 0.5.Initial shape is centred in the middle of the computational domain and is shifted downwards by 0.5.Finally, the actual integration is performed by calling the advance() method with the number of time steps as argument.

Example: advection scheme options
The following example is intended to present MPDATA advection scheme options described in Sect.3.1.The way of choosing different options is discussed, and the calling sequence of the library interface is shown for the case of advecting multiple scalar fields.
The example consists of transporting two boxcar signals with different MPDATA options.In all tests, the first signal extends from 2 to 4 and the second signal extends from −1 to 1, to observe the solution for fixed-sign and variable-sign signals.Listing 8 shows the compile-time parameters structure fields common to all cases presented within this example.The number of dimensions is set to 1 and the number of equations to solve is set to 2. Consistent with Listing 7 from the basic example, p shown in Listing 9 is an instance of rt_params_t structure with run-time parameters of the simulation.Setting the outfreq field to the number of time steps results in plotting the initial condition and the final state.The outvars field contains a map with a structure containing a variable name, here left empty, and unit defined for each of the advected scalar fields.Listing 10 shows how to set initial values to multiple scalar fields using the advectee() method with an integer argument specifying the index of the equation in the solved system.

Variable-sign scalar fields
The libmpdata++ library is equipped with two options for handling variable-sign fields; recall the discussion in Sect.3.1.5.The option using absolute values is named abs, whereas the "infinite-gauge" option is dubbed iga.The option flags are defined in the opts namespace.The option choice is made by defining the opts field of the compile-time parameters structure, in analogy to n_dims or n_eqns.enum { opts = opts::abs }; Listing 11.Advection scheme options for Fig. 5, variable-sign option is set to absolute value.
In the first test, the choice of handling variable-sign signal is set to abs (Listing 11). Figure 5 shows the result of simulation with parameters set in Listings 8, 9, 10 and 11.The final signal shows dispersive ripples characteristic of higher-order schemes.It is also evident that the ripple magnitude depends on the constant background, a manifestation of the scheme non-linearity.Furthermore, the final variablesign signal features a bogus saddle point at the zero crossings (cf.Sect.3.1.5),and this can be eliminated by using the infinite-gauge (alias iga) option.Listing 12 shows how to choose the iga option.Figure 6 shows the result of the simulation with parameters set in Listings 8, 9, 10 and 12.Although iga evinces more pronounced oscillations, their magnitudes do not depend on the constant background.This, together with the robust behaviour of iga when crossing zero, substantiates the discussion of Sect.3.1.5on iga amounting to a linear limit of MPDATA.

Third-order-accurate variant
Choosing third-order variant enhances the accuracy of the scheme when used with more than two passes of MPDATA or with iga; recall Sect.3.1.2.Option tot enables the thirdorder variant of the MPDATA scheme.Figure 7 shows the result of the same test as in Figs. 5 and 6 but with MPDATA options set as in Listing 13.The resulting signal is evidently more accurate and symmetric, but the oscillations are still present.enum { opts = opts::iga }; Listing 12. Advection scheme options for Fig. 6, variable-sign option is set to "infinite-gauge".

Non-oscillatory option
To eliminate oscillations apparent in the preceding tests, the non-oscillatory (fct) option (Sect.3.1.4)needs to be chosen.This option can be used together with all other MPDATA options, such as basic scheme, variable-sign signals (abs or iga) and the third-order-accurate variant (tot).
Here, fct is selected together with iga; cf.Listing 14.This is the default setting, i.e. when inheriting from the default parameters structure, and not overriding the opts setting, as illustrated in Listing 7. Figure 8 shows the corresponding results.The solutions for both fixed-sign and variable-sign signals have indistinguishable profiles and all of the dispersive ripples have been suppressed.
To further enhance the accuracy of the solution, fct and iga can be combined with the tot variant; cf.Listing 15.The corresponding result is shown in Fig. 9. Enabling the thirdorder-accurate variant improves the symmetry of the solution, as compared to the results presented in Fig. 8.

Example: convergence tests in 1-D
In this subsection the convergence test originated in Smolarkiewicz and Grabowski (1990)  Listing 13.Advection scheme options for Fig. 7, variable-sign option is set to "infinite-gauge" and third-order accuracy variant is chosen.
and grid increments where x m = 1 is the maximal increment.The series amounts to 152 simulations for each option.In each simulation, the number of time steps N T and the number of grid cells NX is adjusted so that the total time T and total length of the domain X remain constant.The domain size X = 44 x m and simulation time T = 1 are selected.The advective velocity is set to u = x m /T = 1.
In each simulation, a Gaussian profile is advected, and the result of the simulation is compared with the exact solution ψ ex .The initial profiles and the exact solutions are calculated by analytically integrating function ( 7) over the grid-cell extents, to comply with the inherent MP-DATA assumption of a data point representing the grid-cell mean of transported field.The dispersion parameter of the initial profile ( 7) is set to σ = 1.5 x m , while the profile is centred in the middle of the domain x 0 = 0.5X.As a measure of accuracy, a truncation-error function is introduced Listing 14. Advection scheme options for Fig. 8, variable-sign option is set to "infinite-gauge" and non-oscillatory option is enabled.This is the default setting in libmpdata++.
The results of the convergence test for the generic first-orderaccurate donor-cell scheme, the basic MPDATA and its thirdorder-accurate variant are shown in Fig. 10a-c.Each figure displays, in polar coordinates, the base-two logarithm of the truncation-error function (8) for the entire series of 152 simulations.The radius and angle, respectively, indicate changes in grid increment and Courant number.Thus, closer to the origin are simulation results for finer grids, closer to the abscissa are points for small Courant numbers, and closer to the ordinate are points with Courant numbers approaching unity.The contour interval of dashed isolines and of the colour map is set to 1, corresponding to error reduction by the factor of 2. Lines of constant grid-cell size and constant Courant number are overlaid with white contours.
The figures contain information on the convergence rate of MPDATA options.When moving along the lines of constant Courant number towards the origin, thus increasing the spatial and temporal resolution, the number of crossed dashed isolines determines the order of the scheme; cf.Sect.8.1 in Margolin and Smolarkiewicz (1998).Therefore, the results in Fig. 10a-c attest to the first-, second-and third-order asymptotic convergence rates, respectively.Furthermore, the shape of dashed isolines conveys the dependency of the solution accuracy on the Courant number.In particular, they show that at Listing 15.Advection scheme options for Fig. 9, variable-sign option is set to "infinite-gauge", non-oscillatory option is enabled and third-order accuracy variant is chosen.
fixed spatial resolution the solution accuracy increases with the Courant number.Moreover, as the order of the convergence increases the isolines become more circular indicating more isotropic solution accuracy in the Courant number.Figure 10b reproduces the solution in Fig. 1 of Smolarkiewicz and Grabowski (1990) and, thus, verifies the libmpdata++ implementation.For further verification Fig. 11a and b shows results of the convergence test for (i) three-pass MPDATA (run-time solver parameter n_iters = 3), and (ii) for two-pass MPDATA with fct option.These results reproduce Figs. 2 and 3 from Smolarkiewicz and Grabowski (1990).Noteworthy, an interesting feature of Fig. 11a is the groove of the third-order convergence rate formed around φ = 45 • , characteristic of MPDATA with three or more passes (Margolin and Smolarkiewicz, 1998).Next, comparing Fig. 11b with Fig. 10b shows that the price to be paid for an oscillation-free result is a reduction in the convergence rate (from 2 to ∼ 1.8; Sect. 4 in Smolarkiewicz and Grabowski, 1990).
Figure 11c and d documents original results for the convergence test applied to the "infinite-gauge" limit of MP-DATA.In particular, Fig. 11c shows that iga is as accurate as three-pass MPDATA, (cf.Sect. 4 in Smolarkiewicz and Clark, 1986), whereas Fig. 11d reveals that the third-orderaccurate iga is more anisotropic in Courant number than the third-order-accurate standard MPDATA in Fig. 10c.The convergence test results for the default setting of libm-pdata++ (iga plus fct) are not shown, because they resemble results from Fig. 11b with somewhat enhanced accuracy for well-resolved fields (i.e.small grid cells).classical solid-body rotation test (Molenkamp, 1968).The current setup follows Smolarkiewicz and Margolin (1998).

Example
The initial condition features a cone centred around the point (x 0 , y 0 ) = (50 x, 75 y).The grid interval is x = y = 1, and the domain size is 100 x × 100 y -thus containing 101×101 data points (cf.Fig. 2).The height of the cone is set to 4, the radius to 15 x, and the background level to 1.The flow velocity is specified as (u, v) = ω (y − y c , −(x − x c )), where angular velocity ω = 10 −1 and (x c , y c ) denotes coordinates of the domain centre.With time interval t = 0.1, one full rotation requires 628 time steps.The total integration time corresponds to six full rotations.The choice of the threads concurrency handler in Listing 18 results in multi-threaded calculations -using OpenMP if the compiler supports it, or using Boost.Thread otherwise.The number of computational subdomains (and hence threads) is controlled by the OMP_NUM_THREADS environment variable, regardless if OpenMP or Boost.Thread implementation is used.The default is to use all CPUs/cores available in the system.Notably, replacing concurr::serial from the previous examples with concurr::threads is the only modification needed to enable domain decomposition via shared-memory parallelism.
The way the initial condition and the velocity field are set is shown in Listing 19.The Courant number components are specified using calls to the advector() method with the argument defining the component index.
The initial condition is displayed in Fig. 12a, and the results after total integration time are shown in Fig. 12b-d All plots are centred around cone's initial location and show only a quarter of the computational domain.The isolines of the advected cone are plotted with 0.25 intervals.The results in Fig. 12b and c were obtained with the fct and the three-pass tot + fct MPDATA, respectively, whereas Fig. 12d shows test results for the default setting of libmpdata++.These results match those presented in Smolarkiewicz and Margolin (1998, Fig. 1) and Smolarkiewicz and Szmelter (2005, Fig. 4 and Table 1).In particular, the rms errorsdefined on the rhs of Eq. ( 8) -are 0.37 × 10 −3 , 0.11 × 10 −3 and 0.27 × 10 −3 for the fct, three-pass tot fct and the default libmpdata++ options, respectively.

Example: revolving sphere in 3-D
This example extends Sect.3.6 to three spatial dimensions.It exemplifies how to specify a three-dimensional setup using libmpdata++.Furthermore, the option is described for saving the simulation results to HDF5 files with XDMF annotations.
The setup follows Smolarkiewicz and Szmelter (2005): the domain size is 100×100×100, with a uniform grid consisting of 59 grid points in each direction.The time step is 0.036π .The initial condition is a sphere of radius 15 centred around the point (x 0 , y 0 , z 0 ) = (50 − 25/ Figure 13a shows the initial condition, Fig. 13b shows the results after one revolution for the default libmpdata++ options.The grey volume is composed of dual-grid cells (Sect.2.3) encompassing data points with cell-mean values of density greater than or equal to 1.
Obtained results can be compared with those presented in Smolarkiewicz and Szmelter (2005, Figs. 9-13 and Table 4).In particular, for the default libmpdata++ setting, the rms error is 2.8 × 10 −3 , and it compares favourably with the L 2 norm in their Table 4.

Example: 2-D advection on a sphere
This subsection concludes homogeneous transport examples with a 2-D solid-body rotation test on a spherical surface (Williamson and Rasch, 1989).The purpose of this exam- ple is to present methods for setting up the simulations in spherical coordinates. 18ollowing Smolarkiewicz and Rasch (1991) only the case when the initial field rotates over the poles is presented.The initial condition is a cone centred around the point (3π/2, 0) with height and radius equal to 1 and 7π/64, respectively.
The wind field is given by where λ and φ denote respectively longitude and latitude, and U = π/128.The computational domain [0, 2π ] × [−π/2, π/2] is resolved with 128×64 grid increments λ = φ and is shifted by 0.5 φ so that there are no data points on the poles.The test is run for 5120 time steps corresponding to one revolution around the globe.
The advection equation in spherical coordinates has the form of the generalised transport Eq. ( 1) with the Jacobian of coordinate transformation G = cos φ. (11) In order to solve the generalised transport equation with G ≡ 1 the nug option has to be set; see Listing 23.Boundary conditions in this example incorporate principles of differential geometry (cf.chapter XIV in Maurin, 1980) in the classical spherical latitude-longitude framework (Szmelter and Smolarkiewicz, 2010).They are cyclic (bcond::cyclic) in the zonal direction, whereas in the meridional direction they represent two degenerated charts (of the atlas composed of three) defining differentiation of dependent variables in vicinity of the poles (bcond::polar; Listing 24).The setting of G is done using the g_factor() accessor method as shown in Listing 25; note the shift in latitude by φ/2.The initial condition for the test is plotted in Fig. 14a, whereas the results are displayed in Fig. 14b and c ures use orthographic projection, with the perspective centred at the initial condition (the true solution), with the contour interval 0.1.Figure 14b shows the result for the default libmpdata++ options.There is a visible deformation in the direction of motion, consistent with earlier Cartesian rotational tests.The result in Fig. 14c, obtained using three passes of MPDATA with fct and tot, shows reduced deformation and reproduces Fig. 6 in Smolarkiewicz and Rasch (1991).Error norms were calculated following Smolarkiewicz and Rasch (1991, Eqs. 24a-e)  coordinate transformation.For instance, the "energy" conservation error (their ERR2) is −0.066 for the default libm-pdata++ setting and −0.11 for the three-pass MPDATA with tot and fct, which agrees with the values presented in Smolarkiewicz and Rasch (1991, Table 1).

Implemented algorithms
As of the current release, libmpdata++ provides three ways of handling source terms in the inhomogeneous extension of Eq. ( 3): The available time integration schemes include the two variants of the first-order-accurate Euler-forward scheme (hereafter referred to as euler_a and euler_b) and the secondorder-accurate Crank-Nicolson scheme (trapez).The Euler schemes are implemented to account for parameterised forcings (e.g.due to cloud microphysics), whereas the Crank-Nicolson scheme is standard for basic dynamics (e.g.pressure gradient, Coriolis and buoyancy forces).In both Euler schemes, while calculating the solver state at the time level n+1, the right-hand-side at the time level n is only needed.In the euler_a option (Eq.13), the source terms are computed and applied standardly after the advection: In the euler_b option (Eq.14), the source terms are computed and applied (arguably in the Lagrangian spirit; Sect.3.2 in Smolarkiewicz and Szmelter, 2009) before the advection In the trapez option (Eq.15), half of the source terms are computed and applied as in the euler_a and half as in the euler_b (arguably in the spirit of the Lagrangian trapezoidal rule; Sect.2.2 in Smolarkiewicz and Szmelter, 2009):

Library interface
The logic for handling source terms is implemented in the mpdata_rhs solver that inherits from the mpdata class (Fig. 1).Consequently, all options discussed in the preceding section apply.The choice of the source-term integration scheme is controlled by the rhs_scheme compile-time parameter with the valid values of euler_a, euler_b or trapez.
The user is expected to provide information on the source terms by defining a derived class of mpdata_rhs with the update_rhs() method overloaded.The update_rhs() signature is given in Listing 26, whereas the usage example is given in Sect.4.3.The method is called by the solver with the following arguments: -a vector of arrays rhs storing the source terms for each equation of the integrated system, -a floating-point value dt with the time-step value, -an integer number at indicating if the source terms are to be computed at time level n (if at = 0) or n + 1 (if at = 1).
Calculation of forcings at the n + 1 time level is needed if the rhs_scheme=trapez option is chosen.The case of at equal to zero is used in the Euler schemes and in the very first time step when using the trapez option (i.e.once per simulation).When the trapez option is used, the dt passed to the update_rhs() method equals half of the original time step.
The update_rhs() method is expected to first call par-ent_t::update_rhs() to zero out the source and sink terms stored in rhs.Later, it is expected to calculate the rhs terms in a given time step by summing all sources and sinks and "augment assign" them to the rhs field (e.g. using the + = operator).
All elements of the rhs vector corresponding to subsequent equations in the system are expected to be modified in a single update_rhs() call.

Example: translating oscillator
The purpose of this example is to show how to include rhs terms in libmpdata++ by creating a user-defined class out of the library tree.
A system of two one-dimensional advection equations, represents a harmonic oscillator translating with u o = constant; see Sect.4.1 in Smolarkiewicz (2006) for a discussion. 19Applying the trapezoidal rule to integrate the PDE (partial differential equation) system ( 16) leads to the following system of coupled implicit algebraic equations: where ψ * i and φ * i stand for Substituting in Eq. ( 17) ψ n+1 i with φ n+1 i and vice versa and then regrouping leads to Implementation of forcing terms prescribed in Eq. ( 20) is presented in Listing 27.A new solver coupled_harmosc is defined by inheriting from the mpdata_rhs class.A member field omega is defined to store the frequency of oscillations.The rhs terms are defined for both variables, ix::psi and ix::phi, within the update_rhs() method.The method implements both implicit and explicit formulae, the two cases are switched by the at argument.Defining forcings for both n and n + 1 cases allows using this class with both euler and trapez options.The current state of the model is obtained via a call to the state() method.Note how the formulae defined in update_rhs() in case (1) loosely resemble the mathematical notation presented in Eq. ( 20).The 0.5 is absent because the Next, the rt_params_t structure is augmented (by inheriting from parent's rt_params_t) with the omega.Lastly, the coupled_harmosc constructor is defined.Within it, the choice of the omega is handled by copying its value from the p.omega to omega member field and then checking if the user has altered the default value of 0. For inhomogeneous transport, the rhs_scheme within the ct_params_t structure needs to be defined.In this example it is set to trapez (Listing 28).MPDATA advection scheme options are set to default by inheriting from the ct_params_t_default structure.The structure ix allows calling advected variables by their labels, phi and psi, rather than integer numbers.Lastly, when defining the rt_params_t structure a value is assigned to the member field p.omega; see Listing 29.
In the present example, the initial condition for ψ is defined as ψ(x) = 0.5[1+cos(2π x/100)] for x ∈ (50, 150) and zero elsewhere.The initial condition for φ is set to zero.
The result of 1400 s of simulated time is shown in Fig. 15.Note that the solutions for both ψ and φ remain in phase and feature no substantial amplitude error.This contrasts with calculations using Euler-forward schemes (not shown).
In particular, at the end of the simulation, the rms error is 1 × 10 −7 and 1 × 10 −18 for the analogous experiment with u o ≡ 0 (not shown).

Implemented algorithms
Whenever the velocity field changes in time, the secondorder accuracy of the solution at n+1 requires an estimate of the advector at n + 1/2.This is provided by linear extrapolations from n and n − 1 values (Smolarkiewicz and Margolin, 1998).Furthermore, when the velocity is a dependent variable of the model, Eq. ( 12) embodies equations of motion.Then the velocity (or momentum) components are treated as advected scalars (i.e.advectees) and are predicted at the centres of the dual-grid cells (Fig. 3).The advector field is then interpolated linearly to the centres of the cell walls.

Library interface
The algorithms for interpolating in space and extrapolating in time the advector field from the model variables are defined in the mpdata_rhs_vip class and all user-created solvers with time-varying velocity must inherit from this class.
The transported fields may represent either velocity or momenta.In the latter case the prognosed velocity components are calculated as ratios of two advectee fields (e.g.momentum components and density).The index of the advectee that forms the common denominator for all velocity components should be assigned to vip_den.The vip_i, vip_j and vip_k store the index of the advected fields appearing in the numerators for each velocity component.These velocity components are then used to calculate the advector field.In cases when the velocity components are model variables (as in the example of Sect.6.3), the common denominator is redundant and the value −1 should be assigned to vip_den.
For systems where numerators and denominators can uniformly approach zeros, the vip_eps value is defined to prevent divisions by zero.Then, if the denominator at a given grid-point is less than the vip_eps, the resulting advector is set to zero therein.The default setting (represented with vip_eps set to 0) gives no protection from divisions by zeros.Any user-defined vip_eps > 0 activates the above algorithm.
The vip_i, vip_j, vip_k and vip_den are expected to be members of the compile-time parameters structure ct_params_t of the mpdata_rhs_vip class.The vip_eps value is a run-time parameter.
As of the current release, the prognosed-velocity features of libmpdata++ are implemented for constant G ≡ 1 only.

Example: 1-D shallow-water system
The aim of this example is to show how to define simulations with prognosed velocity field.The necessary compile-time and run-time parameters as well as the user-defined class with source and sink terms are discussed.The obtained results are compared with the analytical solution and a published MPDATA benchmark.
The idealised system of 1-D inviscid shallow-water equations is considered, with both the surface friction and background rotation neglected.The simulated physical scenario is a slab-symmetric parabolic drop spreading under gravity; see Frei (1993) for a general context and Schär and Smolarkiewicz (1996) for the bespoke analytical solutions.The corresponding governing equations take the dimensionless form where h is a normalised depth of the fluid layer and u is a normalised velocity.Following Schär and Smith (1993) the selected velocity scale is u o = (gh o ) 1/2 , where h o is the initial height of the drop and g denotes the gravitational acceleration.The characteristic timescale is t o = a/u o , where a denotes the initial half-width of the drop.At the initial time a drop is confined to |x| ≤ 1 and centred about x = 0, The time step is set to 0.01 and the grid spacing is set to 0.05.The crux of the test is a synchronous solution for the depth and momentum near the drop edge that accurately diagnoses the velocity.The definition of the rhs terms for Sect.5.3 is presented in Listing 30.Only the method for calculating the forcing terms is shown; for the full out-of-the-library-tree definition of source-terms see Listing 27.As in Listing 27, the definition in Listing 30 attempts to follow the mathematical notation.Within the ix structure, the equation indices are assigned.Furthermore, the recipe for calculating the velocity is defined by assigning the indices to vip_i and vip_den.Lack of the rhs terms is specified by toggling the nth bit of the hints_norhs field, where n is the index of the homogeneous equation.This prevents the unnecessary summation of zeros.
Listing 32 shows the run-time parameters structure.The value of gravitational acceleration p.g is set to 1 to follow the dimensionless notation of Eq. ( 21 numerical results 21 at t = 3 for different MPDATA options (red) plotted over the top panel.Figure 16b shows the solution with options abs and fct, whereas Fig. 16c shows the solution obtained with options iga and fct.
All presented results are free of apparent artefacts near the drop edge.The abs+fct in the central panel compares well with Fig. 7b in Schär and Smolarkiewicz (1996), whereas the iga+fct solution in the bottom panel closely reproduces the analytical result.The rms error, on the rhs of Eq. ( 8), at the end of the simulation is 5.77 × 10 −4 for abs+fct and 1.87 × 10 −4 for iga+fct options. 21Similar to advector field evaluation discussed in Sec.5.2 the vip_eps value was used as cut-off value to prevent divisions by zero when calculating velocity field.

Example: 2-D shallow-water system
The 2-D shallow-water test discussed here is an original axis-symmetric extension of the 1-D slab-symmetric test in Sect.5.3.The corresponding dimensionless equations take the form As in the 1-D case, h stands for the fluid height and u and v are the velocity components in x and y directions, respectively.Again, the initial condition consists of a parabolic drop centred at the origin and confined to Following the method presented by Frei (1993) and Schär and Smolarkiewicz (1996), the analytical solution of the system ( 23) can be obtained as Here λ(t) is half-width of the drop, evolving according to and λ ≡ dλ/dt is the velocity of the leading edge.
Figure 17 shows a perspective display of drop height at t = 3, whereas Fig. 18 shows the profiles of velocity and height of the drop.Similarly to Fig. 16, the top panel shows the initial condition (black) and analytical solution for t = 3 (blue).Central and bottom panels show corresponding numerical results at t = 3 (red).Solid lines represent the fluid height and the dashed lines the velocity.The central panel shows the solution with options abs and fct, whereas the bottom panel shows the solution with options iga and fct.As in the 1-D case, the velocity field near the drop edge is regular and the iga+fct result closely follows the analytical solution.The rms error for abs and fct equals 1.60 × 10 −4 and for abs and iga 0.70 × 10 −4 ; see Jarecka et al. (2015) for a discussion.
6 Systems with elliptic pressure equation

Implemented algorithms
The libmpdata++ library includes an implicit representation of pressure gradient terms for incompressible fluid equations.This necessitates the solution of an elliptic Poisson problem for pressure.The elliptic problem is solved after applying all explicitly known forcings to ensure a non-divergent velocity field at the end of each time step.As of the current release, the library is equipped with the minimal-and conjugateresidual variational iterative solvers.For the derivation of used schemes and further discussion of the elliptic problem see Smolarkiewicz and Margolin (1994), Smolarkiewicz and Szmelter (2011) and references therein.

Library interface
The methods for solving the elliptic problem are implemented in the mpdata_rhs_vip_prs class (Fig. 1).This class inherits from the mpdata_rhs_vip class.Therefore, the way to specify other source terms as well as the time-varying velocity field remains unchanged.The choice of elliptic solver is controlled by setting the compile-time parameter prs_scheme to mr and cr for the minimal-residual and conjugate-residual solver, respectively.The iterations within the elliptic solver stop when the divergence of the velocity field is lower than a threshold tolerance set by a run-time parameter prs_tol (cf.Smolarkiewicz et al., 1997).

Example: Boussinesq convection
The goal of this example is to show the user interface for simulations featuring an elliptic pressure equation.The governing PDE system consists of momentum, potential temperature, and mass-continuity equations for an ideal, 2-D, in- compressible Boussinesq fluid Here, v = (u, w) denotes the velocity field, π is the pressure perturbation about the hydrostatic reference state normalised by the reference density ρ o constant in the Boussinesq model and g is the gravitational acceleration.Furthermore, θ represents the potential temperature perturbation about the reference state θ o = constant, and ⊗ denotes the tensor product.
Combining the velocity prediction from the momentum equation, according to Eq. ( 15), with the mass continuity Eq. ( 29) leads to the elliptic Poisson problem: where v is the velocity field after the advection summed with all the explicitly known source terms at time level n + 1,

Remarks
In this paper the first release of libmpdata++ was introduced.Versatility of the user interface as well as the correctness of the implementation were illustrated with a series of examples with increasing degree of physical, mathematical and programming complexity.Starting from elementary advection in the Cartesian domain, through passive advection on the sphere, through slab-and axis-symmetric water drop spreading under gravity, to buoyant convection in an incompressible Boussinesq fluid, the accompanying discussions included code snippets, description of the user interface and comparison with previously published benchmarks.
Our priority in the development of libmpdata++ is the researcher productivity.In case of scientific software such as libmpdata++, the researchers are both users and developers of the library.The adherence to the principle of separation of concerns and employment of programming techniques that promote code conciseness -e.g. the current release consists of less than 10k lines of code -contribute to the developers' productivity.The user productivity is amplified by ensuring that the release of the library is accompanied with examplerich documentation.Both the users and developers benefit from the free/libre open-source software release of the library.
Our experience with the current version of libmpdata++ indicates that the embraced object-oriented techniques and modular design of the library generally do not come as a trade-off for performance.On small grids, however, there is a noticeable overhead compared to the original Fortran implementation.For example, in serial runs, up to 5-times longer execution times were measured for the 3-D revolving-sphere tests discussed in Sect.3.7 (59 3 grid).The relative performance improves with increasing grid size, reaching execution times on a par with the original Fortran implementation on the (6 × 59) 3 grid.On the other hand, the separation of concerns obtained with the object-oriented design of the library allowed equipping the code with the multi-threading mechanism, without any substantial changes in the numerics code.Noteworthy, for all 2-D and 3-D examples presented in the paper, a minimum of fivefold speed-up is obtained when executing on six threads.The library is in active development and improvements in performance are expected.Furthermore, equipping the library with distributed-memory parallelisation is planned for a subsequent release.1.5, 3.4.1, 3.5, 3.6, 5.3, 5.4 Using absolute values in "pseudo-velocity" formulation.(One of the two possible options for handling variable-sign signals.)2, 3.4.2, 3.5, 3.6, 3.7 Accounting for third-order terms in "pseudo-velocity" formulae.

pfc
Protecting from divisions by zero in ψ-fraction factors (as the last term in Eq. ( 6)) by conditionally assigning zeros to all grid points for which the denominator equals zero.The default is to augment the denominator with a small positive number (∼ 10 −7 for single precision and ∼ 10 −16 for double precision).The default behaviour requires the signal to be non-negative unless iga or abs is selected.

Figure 1 .
Figure 1.Inheritance diagram of classes mentioned in the paper.Classes defined within libmpdata++ have their names surrounded with black frames.The coupled_harmosc class is an example of a user-defined class defined out of the library tree.The solid black lines show the inheritance relations.The output label depicts any of the output handlers available in libmpdata++.

Figure 2 .
Figure2.Schematic of a 2-D computational domain.Bullets mark the data points for the dependent variable ψ in Eq. (1), solid lines depict edges of primary grid and dashed lines mark edges of dual grid in Fig.3.
Example definition of compile-time parameters structure.
Example alias declaration of a concurrency handler.

Figure 4 .
Figure 4. Simulation results generated by the code in Listing 7.

GeosciFigure 5 .
Figure 5. Result of the simulation with the advection scheme option for variable-sign signal set to absolute value; cf.Listing 11.

Figure 8 .
Figure 8.As in Fig. 5 but with options set to infinite-gauge treatment of variable-sign signal and flux corrections; cf.Listing 14.

Figure 9 .
Figure9.As in Fig.5but with options set to infinite-gauge treatment of variable-sign signal, non-oscillatory option and third-order accuracy variant; cf.Listing 15.

Figure 10 .
Figure 10.The result of the convergence test (a) for the donorcell scheme, (b) for the basic MPDATA and (c) for the third-orderaccurate variant.

Figure 11 .Figure 12 .
Figure 11.As in Fig. 10 (a) for three passes of MPDATA, (b) for two passes with non-oscillatory option, (c) for infinite-gauge option, and (d) for infinite-gauge with third-order-accurate variant.
Compile-time parameter settings for the rotating-cone test.p.n_iters = 3; Listing 17. Run-time parameter responsible for setting the number of MPDATA passes in Fig. 12c.Implementation of the setup using the libmpdata++ interface begins with definition of the compile-time parameters structure.The test features a single scalar field in a twodimensional space, what is reflected in the values of n_dims and n_eqns set in Listing 16.In one of the test runs, the number of MPDATA passes (n_iters) is set to 3, instead of the default value of 2. Corresponding field of run-time parameters structure is shown in Listing 17.During instantiation of the concurrency handler, four boundary-condition settings (two per each dimension) are passed as template arguments.In this example, open boundary conditions (bcond::open) are set in both dimensions -see Listing 18.
a constant density of 4. The sphere is rotating with constant angular velocity = ω/ √ 3(1, 1, 1) of magnitude ω = 0.1.The components of the advecting velocity field are (u, v, w) = (− z (y −y c )+ y (z−z c ), z (x −x c )− x (z−z c ), − y (x−x c )+ x (y−y c )), where the coordinates of the rotation centre are (x c , y c , z c ) = (50, 50, 50).The test lasts for one revolution which takes 556 time steps.Specifying the 3-D setup with the libmpdata++ programming interface calls starts by setting the n_dims field to 3 (Listing 20).Listing 21 shows the choice of recommended three-dimensional output handler hdf5_xdmf.This results in output consisting of HDF5 files with XDMF annotation that can be viewed, for example, with the ParaView visualisation software.This output is saved in a directory specified by the outdir field of the run-time parameters; see Listing 22. enum { n_dims = 3 }; Listing 20.Compile time parameter setting for the revolving-sphere test.using slv_out_t = output::hdf5_xdmf<slv_t>; Listing 21.Alias declaration of an output mechanism for the revolving-sphere test.p.outdir = dir_name; Listing 22. Run-time parameters field specifying output directory for the revolving-sphere test.

Figure 13 .
Figure 13.The results of the example presented in Sect.3.7.The whole computational domain is shown.The grey volume encompasses data points with values of density greater or equal to 1. Panel (a) shows initial condition, (b) results for the default libm-pdata++ options.

Figure 14 .
Figure 14.The results the example presented in Sect.3.8.The plots are centred over the cone's initial location and show the advected field plotted in spherical coordinates.Colours mark the amplitude of the advected field.Panel (a) shows the initial condition, (b) results for the default libmpdata++ options and (c) results for the three-pass MPDATA with fct and tot.
to take into account the effects of www.geosci-model-dev.net/8

Figure 15 .
Figure 15.Simulation results of the example presented in Sect.4.3.Abscissa marks the spatial dimension and ordinate represents the oscillator amplitude.The oscillator state is plotted every 20 time steps.

Figure 16 .
Figure 16.Simulation results of the example presented in Sect.5.3.Solid lines represent fluid height and dashed lines represent fluid velocity.Initial condition plotted in black, analytical solution in blue and test results in red.(a) shows the initial condition and analytical solution at t = 3.(b) and (c) show numerical results plotted over (a) obtained with options abs + fct and iga + fct, respectively.

Figure 18 .
Figure 18.The same as in Fig. but for a cross section of the two-dimensional case.

Figure 19 .
Figure 19.The results of the example presented in Sect.6.3.Abscissa and ordinate mark the spatial dimensions.Colours correspond to potential temperature values.Panel (a) shows results from the 200th, (b) from the 600th and (c) from the 800th time step.
dfl 3.1.3,5.3, 5.4 Augmenting the "pseudo-velocity" formulae with a term proportional to flow divergence.(To be used with divergent flows only.)fct 3.1.4,3.4.3,3.5, 3.6 Non-oscillatory option of MPDATA.MPDATA algorithm at infinite constant background.(One of the two possible options for handling variablesign signals.)

2015 A. Jaruga et al.: libmpdata++: MPDATA solver library in C++
is used to quantify the accuracy of various MPDATA options.

2015 1022 A. Jaruga et al.: libmpdata++: MPDATA solver library in C++ t
Definition of the solver used in the example presented in Sect.4.3.passed as argument in trapez option is already divided by 2.

Table 1 .
Maximal spurious extrema of the θ field after 800 time steps for various values of the convergence threshold prs_tol.

Table A2 .
Fields of run-time parameter structures.