A standard test case suite for two-dimensional linear transport on the sphere: results from a collection of state-of-the-art schemes

. Recently, a standard test case suite for 2-D linear transport on the sphere was proposed to assess important aspects of accuracy in geophysical ﬂuid dynamics with a “minimal” set of idealized model conﬁgura-tions/runs/diagnostics. Here we present results from 19 state-of-the-art transport scheme formulations based on ﬁnite-difference/ﬁnite-volume methods as well as emerging (in the context of atmospheric/oceanographic sciences) Galerkin methods. Discretization grids range from traditional regular latitude–longitude grids to more isotropic domain discretizations such as icosahedral and cubed-sphere tessellations of the sphere. The schemes are evaluated using a wide range of diagnostics in idealized ﬂow environments. Accuracy is assessed in single- and two-tracer conﬁgurations using conventional error norms as well as novel diagnostics designed for climate and climate–chemistry applications. In addition, algorithmic considerations that may be important for computational efﬁciency are reported on. The latter is inevitably computing platform dependent. The ensemble of results from a wide variety of schemes presented here helps shed light on the ability of the test case suite diagnostics and ﬂow settings to discriminate between algorithms and provide insights into accuracy in the context of global atmospheric/ocean modeling. A library of benchmark results is provided to facilitate scheme intercomparison and model development. Simple software and data sets are made available to facilitate the process of model evaluation and scheme intercomparison.


Introduction
Historically, the regular latitude-longitude grid has been the preferred discretization grid in global atmosphere models primarily due to desirable properties such as grid orthogonality and simple data structure. It also trivially lends itself to operations such as zonal/meridional averaging routinely Published by Copernicus Publications on behalf of the European Geosciences Union.
applied in global data analysis. Primarily triggered by requirements for efficient domain decomposition and minimal data movement between decomposition patches in massive parallel compute environments, there has been a significant effort to formulate atmospheric models on more isotropic grids. Other motivations for alternative tessellations of the sphere are the design of models with mesh refinement capability, possibly with smoothly varying transitions between coarse and fine resolution. This has triggered a renewed interest in developing fluid flow solvers for non-traditional spherical grids. A natural first step in model development is to design schemes that solve the continuity equation, also referred to as transport schemes or advection schemes. Numerous new algorithms have been developed within the last 10 yr or so. These encompass finite-volume, finite-difference, and Galerkin-based methods.
Despite the growing amount of research in transport scheme algorithms, the "mandatory" idealized testing of such algorithms on the sphere is surprisingly little standardized. In fact, the only standardized test in global transport scheme development is the solid-body advection test of the widely used shallow-water test case (cf. Williamson et al., 1992). Specific guidelines for the computation of error norms and plotting (contour interval and projection) are given in Williamson et al. (1992). However, resolution and other transport model settings are not specified. In the literature modelers do not always chose the same resolution and model settings, which can make it difficult to compare schemes. Even contour plotting of solutions varies across publications despite the specific guidelines of Williamson et al. (1992). Said colloquially, "apples-to-apples" comparison is not always straightforward despite the simplicity of the test (i.e., an analytical solution is known). This, among other things, motivated  to propose a standard test case suite with specific guidelines for resolution and other transport model settings. To facilitate this process further, we provide scripts for contour plots. Model developers are encouraged to use those scripts so that contour plots from different modeling groups can easily be compared.
More challenging global idealized tests have been developed since the efforts of Williamson et al. (1992) such as the highly deformational (moving) vortices on the sphere (Nair and Machenhauer, 2002;Nair and Jablonowski, 2008) and the "boomerang" flows of Nair and Lauritzen (2010). Despite the high degree of deformation in the (moving) vortex test problem, in particular when simulated beyond the original specification of simulation length (Kent et al., 2012;Pudykiewicz, 2011), it has an analytical solution. The "boomerang" flows, on the other hand, do not have easily accessible analytical solutions until the end of the specified simulation time. Contrary to most idealized tracer transport test cases, Nair and Lauritzen (2010) proposed a divergent wind field so that the modeler is forced to consider the coupling between air density and tracer mass (at least when using finite-volume type schemes), which is a necessary step in developing a transport scheme for realistic atmospheric/oceanographic applications.
The idealized transport scheme testing discussed above assesses simulation accuracy in a single-tracer setup. For a range of climate and climate-chemistry problems, it is also considered important that schemes do not disrupt pre-existing functional relations in unphysical ways (e.g., Thuburn and Mclntyre, 1997). For example, long-lived trace species in the stratosphere are known to be functionally related (Plumb, 2007), and the simulation of cloud-aerosol interactions depends on accurate preservation of relations between tracers (Ovtchinnikov and Easter, 2009). Based on the "boomerang flow", Lauritzen and Thuburn (2012) proposed an idealized test to assess how well schemes maintain a nonlinear relation between two tracers. The amount of mixing, essentially introduced by truncation errors, was quantified using novel mixing diagnostics.
In an attempt to standardize scheme testing under idealized flow settings as well as to reduce the number of tests while still assessing a wide range of aspects of accuracy considered important for geophysical applications, LSPT2012 proposed a "minimal" test case suite with specific guidelines on resolution, time step, and accuracy diagnostics. In LSPT2012 it was assumed that model developers have already tested their schemes under simpler flow conditions such as solid-body flows. Similarly, LSPT2012 did not ask modelers to report on more specialized test cases that may be useful to study certain, perhaps more specialized, aspects of accuracy. For example, by running well-known deformational test cases out further in time (Pudykiewicz, 2011), one can study the downscale cascade from near grid scale to the sub-grid scale (Kent et al., 2012). Similar tests, such as many solid-body revolutions of a large constant plateau spanning many cells, can be used for "tuning" shape-preserving filters so that the peak tracer abundance does not decay linearly (if applicable) despite the initial plateau and analytic solution being very well resolved (Appendix A16).
It is the purpose of this paper to provide a library of benchmark results for the LSPT2012 standard test case suite. The data were provided by the participants of the 2011 workshop on transport schemes held at the National Center for Atmospheric Research (NCAR) in Boulder (Colorado, USA), 30-31 March 2011. The large ensemble of schemes that participated in this intercomparison may help shed light on how the different tests and diagnostics discriminate between schemes and expose particular types of numerical errors. A list of schemes that participated in this intercomparison and the accompanying scheme acronyms are given in Table 1.
In this study grid spacings are quantified in terms of average resolution at the Equator irrespective of the discretization grid. Schemes are compared using this definition of horizontal resolution. If the reader is interested in schemes for meshrefinement applications, for example, only a subset of the schemes and grids presented here will have that capability.

Scheme
Full scheme name Documentation Implementation grid Formal acronym order CAM-FV Community Atmosphere Model - Lin and Rood (1996) Regular latitude-longitude 2 Finite-Volume Lin (2004) CAM-SE Community Atmosphere Model - Dennis et al. (2012) Gnomonic cubed-sphere 4 Spectral Elements Neale et al. (2010); Guba et al. (2013) (quadrature grid) CCSRG Conservative cascade scheme for  Reduced latitude-longitude 3 the reduced grid Tolstykh and Shashkin (2012) CLAW Wave propagation algorithm LeVeque (2002) Two-patch sphere grid 2 on mapped grids CSLAM Conservative Semi-Lagrangian Lauritzen et al. (2010) Gnomonic cubed-sphere 3 Multi-tracer scheme Erath et al. (2013) FARSIGHT Departure-point interpolation White and Dongarra (2011) Gnomonic cubed-sphere 2 scheme with a global mass fixer HEL Hybrid Eulerian Lagrangian Kaas et al. (2013) Gnomonic cubed-sphere 3 HEL-ND HEL -Non-Diffusive Kaas et al. (2013) Gnomonic cubed-sphere 3 HOMME High-Order Methods Dennis et al. (2012) Gnomonic cubed-sphere 4 and 7 Modeling Environment Guba et al. (2013)  In other words, it is up to the reader to extract information for specific applications as only uniform resolution or nonmesh-refined grids are considered here. The paper is organized as follows. In Sect. 2 the schemes are briefly introduced by discussing discretization categories/methods and grouping the respective schemes into these categories. In addition to the basic discretization concepts, this includes discussion of shape-preserving (sp) limiter used (if applicable) and air-tracer mass coupling. Specific details on time step, native grid resolutions used to match test case specifications on resolution, viscosity coefficients (if applicable), etc. are given in the Appendix for each scheme. The results for the LSPT2012 test case suite are presented in Sect. 3. It has been a challenge to present effectively and concisely the results graphically given the large number of schemes. We have found it most effective to depict most of the data in histogram format. The complete histogram data sets are made available as supplemental material for the interested reader. Scripts and data to produce convergence plots (Figs. 1 and 2), filament diagnostic plot (Fig. 5), contour plots (Figs. 7,8,9,10), scatterplots (Figs. 11,12,13 and 14), and histogram plots (Figs. 3,4,6,15 and 16) are also made available in the Supplement. Conclusions and a brief summary of results are provided in Sect. 4.

Transport equation forms and discretization categories
The continuity equation for a passive and inert scalar φ can be written in various forms such as flux form or advective form. The choice of the form of the continuous transport equation from which the discretized scheme is derived obviously depends on the numerical method. Below we define the different categories of discretizations for the schemes that participated in this intercomparison. The high-level categories are as follows: A brief description of the transport schemes that participated in this intercomparison is given within the category each scheme has been assigned to. Below, the scheme descriptions are grouped according to scheme category irrespective of discretization grid. For in-depth details on the algorithms, we refer to their respective publications; specific scheme configurations used in this intercomparison are given in the Appendix.

Flux-form finite volume
Typically, flux-form Eulerian or flux-form semi-Lagrangian schemes are based on the form where ρ is the fluid density, V the flow velocity vector, and φ the tracer mixing ratio per unit mass. In finite-volume schemes the equation of motion is integrated over a control volume. Similarly, the equation for air density is given by For Eulerian finite-volume schemes, Eq. (1) is integrated in space over a stationary (Eulerian) control grid volume/cell A k and in time over one time step t, and usually the divergence theorem is applied. After re-arranging terms the discretized continuity equation can be written as where n is the time-level index, A k the area of an Eulerian grid cell A k , and ∂A k the boundary of A k for whichn is the outward pointing normal vector. The physical interpretation of the last term on the right-hand side of Eq. (3) is basically the flux of mass through all cell walls in one time step. This term is also referred to as the flux divergence. In one dimension the flux divergence is the difference between the flux of mass through the left and right wall of the control volume. Mass conservation is therefore achieved by evaluating the flux through a cell wall shared by two control volumes in an identical way. In that case, the amount of mass flowing into a control volume through a cell wall will be exactly balanced by the outflow through the face shared by the neighboring control volume. Hence any reasonable approximation to the flux will trivially lead to conservation of tracer mass. There are several approaches to approximating the flux divergence, and they are discussed in separate sub-sections below in the context of the schemes that participated in this intercomparison project. Before that, however, we briefly discuss the coupling between air and tracer mass that is inherent in most finite-volume discretizations when the analytical solution for ρ is not known.
Finite-volume schemes based on Eq. (1) use tracer mass ρφ and not mixing ratio φ as the prognostic variable. Hence ρ must be solved for as well: It is considered important that the coupling between air mass and tracer mass is "free-stream preserving" (also referred to as "consistent tracer transport" in the literature). This means that the discretization scheme for Eq. (1) reduces to the discretization scheme for Eq. (2) for φ = 1 as it trivially does in continuous space. Note that free-stream preserving does not necessarily mean that the spatial and temporal discretization schemes for ρ and ρφ are identical. In fact, it is common practice to solve the tracer transport Eq. (1) on longer time steps than the air density Eq. (2) since tracer transport schemes are usually only limited by the advective CFL (Courant-Friedrichs-Lewy) or Lipschitz criterion (Pudykiewicz et al., 1985;Kuo and Williams, 1990) rather than the more restrictive CFL condition usually imposed on the continuity equation for air by gravity and/or sound waves. For such an approach the flux of tracer mass F through a cell wall is computed as where m is the number of sub-steps in time, F (i/m) the "background" flux of air mass through the cell wall in one sub-step t ∈ [ n + i−1 m t, n + i m t], and φ the average mixing ratio over the tracer time step, t ∈ [n t, (n + 1) t]. Note that the mixing ratio, φ , is averaged over several sub-steps in which the air density is updated. For a graphical illustration of Eq. (6), see Fig. 8.19 in Lauritzen et al. (2011b). The technique described by Eq. (6) is usually referred to as "sub-cycling", although more appropriate terminology may be "super-cycling" of tracer fluxes with respect to air mass flux.
It is worth noting that Eq. (6) with m = 1 constitutes a form of linearization of the flux where non-linear coupling between tracer mixing ratio and air mass is neglected. For example, assume that tracer mixing ratio φ(x, y) is represented through a higher order polynomial of degree K and similarly for air density ρ(x, y), where x and y denote the longitudinal and latitudinal directions, respectively. Then the flux through the cell wall involves φ(x, y) · ρ(x, y) (Dukowicz and Baumgardner, 2000), which would require integrating a polynomial of degree K 2 . Instead, the flux is approximated by Geosci. Model Dev., 7, 105-145, 2014 www.geosci-model-dev.net/7/105/2014/ which eliminates the non-linear interaction between nonconstant terms in the polynomials of φ and ρ. This simplification reduces the order of the polynomial: instead of having to integrate a polynomial of degree K 2 , only integration of polynomials of degree K is needed. For most applications it is important that the prognosis of mixing ratio φ does not introduce spurious oscillations and/or unphysical values such as negative mixing ratios. Schemes that guarantee "physical" solutions in this sense are referred to as "shape-preserving" (sp). The enforcement of shape preservation in flux-form schemes can be done by adjusting the fluxes. A very popular algorithm is FCT (flux-corrected transport by Zalesak, 1979) where a monotone low-order flux is blended with the non-monotone higher order flux to provide a shape-preserving solution. Another approach that can be used in the context of a flux-form discretization is to ensure that the reconstruction function, which is usually an integral part of a finite-volume scheme, is constrained so that it does not introduce new extrema or expand the range of the cell-averaged values. This method is referred to as slope-limiting (e.g., van Leer, 1977). For an overview of shape-preserving filters used for the schemes in this intercomparison, see Table 2. The following subsections provide brief descriptions of the models that fall into the fluxform finite-volume category.

Taylor series approach
The scheme of Skamarock and Gassmann (2011), here referred to as MPAS as it was implemented in the "Model for Prediction Across Scales" framework Ringler et al., 2011), is a generalization of onedimensional Taylor series approximations to the flux operators (Wicker and Skamarock, 2002;Hundsdorfer et al., 1995) for a Voronoi tessellation of the sphere. Specifically, these operators are generalizations of third-and fourth-order operators currently used in atmospheric models employing regular, orthogonal rectangular meshes as, for example, the Weather Research and Forecasting (WRF) model, which is documented in Skamarock and Klemp (2008). Two-dimensional least-squares-fit polynomials are used to evaluate the higher order spatial derivatives needed to cancel the leading-order truncation error terms of the standard second-order centered formulation. As in Wicker and Skamarock (2002), the third-order formulation is equivalent to the fourth-order formulation plus an additional diffusion term that is proportional to the Courant number (CN). An optimal value for the coefficient scaling this diffusion term based on qualitative evaluation of results from other tests is used (see Skamarock and Gassmann, 2011).
The time stepping is based on a three-stage Runge-Kutta method. Hence the flux operators are evaluated at three intermediate time levels for a full tracer time step update. Shape preservation is obtained by applying the FCT limiter/filter during the final Runge-Kutta stage within a given time step. Tracer and air mass coupling is through super-cycling.

"Swept-area" approach
An alternative and perhaps more physically intuitive approach to approximating the flux divergence is to trace the area that is "swept" through an Eulerian cell wall in one time step -hence the name "swept area" approach, also referred to as incremental remapping method (Dukowicz and Baumgardner, 2000), or semi-Lagrangian flux-form finitevolume method (Lin and Rood, 1996). These methods are usually based on Euler forward time differencing (two-timelevel schemes). Several schemes in this intercomparison are based on that approach, and they differ in area approximation, reconstruction method, and implementation grid (for a detailed discussion on area approximations and reconstruction methods, see, for example, Lauritzen et al., 2011b). Unless stated otherwise the schemes based on "swept areas" use the super-cycling technique for coupling tracer and air mass.
The most rigorous approach in this intercomparison, in terms of area approximation, is the Simplified Flux-Form CSLAM scheme (SFF-CSLAM, Lauritzen et al., 2011a;Ullrich et al., 2013). For each cell the flux areas are approximated by tracing the end points (vertices) of each cell face upstream. The upstream translation of these points and the face vertices can be connected with straight lines (e.g., Harris et al., 2010) or parabola (in the latter case also the midpoint of the cell faces is traced upstream; Ullrich et al., 2013) to define the swept area (aka flux area). This area will by definition be swept through the cell wall in one time step and hence can be used to approximate the mass fluxes in and out of control volumes by integrating reconstruction functions of tracer mass over the swept areas. The "Simplified" in the SFF-CSLAM scheme acronym refers to the simplification introduced by Hirt et al. (1974), in which the flux integral is simplified so that only the sub-grid-scale reconstruction immediately upstream of the cell edge is used even though the flux area may overlap more than one Eulerian cell. As discussed in Lauritzen et al. (2011a), this simplification may lead to some cancellation of errors for sufficiently small CNs. The integration of the flux region in SFF-CSLAM is performed via fourth-order Gaussian quadrature of third-and fourthorder accurate reconstruction polynomial functions  referred to as SFF-CSLAM3 and SFF-CSLAM4, respectively. Shape preservation in SFF-CSLAM is enforced by reconstruction function-limiting (slope-limiting); more specifically the maxima and minima are identified within each element, and the reconstruction function is scaled to fit within the minimum and maximum of the neighboring cell-average values (Barth and Jespersen, 1989). Since simplified flux-area integration is used, reconstruction functions are effectively extrapolated in the parts of the flux areas (if any) that are not limited to the immediate upstream cell with which the control volume shares a face. Since slope-limiting Table 2. A list of shape-preserving filter information: scheme acronym (first column), scheme category (second column), filter category (third column), whether the scheme is strictly shape-preserving in terms of not expanding the range of the initial data (fourth column), and the reason for non-shape-preservation (if applicable, fifth column). is only enforced within each Eulerian cell and not throughout the flux area, SFF-CSLAM is not strictly shape-preserving but only approximately so. SFF-CSLAM could be rendered strictly shape-preserving by using FCT, possibly at the expense of increased computational cost. A further simplification to SFF-CSLAM is to approximate the swept area with just one degree of freedom instead of two or three as described above. For example, one may use just one velocity vector at the center of each edge to trace the flux area so that the swept area is a rhomboid instead of a quadrilateral with straight (Miura, 2007) or curved edges . This approach is taken in the transport scheme implemented in the Icosahedral Nonhydrostatic Model (ICON); ICON is currently being developed in a joint effort by the Max Planck Institute for Meteorology (MPI-M) and the German Weather Service (DWD). The scheme is referred to as ICON-FFSL (Flux-Form Semi-Lagrangian). The swept area approximation in ICON-FFSL is first-order in space and second-order in time.
The simplified flux integration, as used in SFF-CSLAM, is also applied in ICON-FFSL. Hence the maximum stable CN is limited; the theoretical stable CN limitation for linear reconstruction functions is 0.5 (Fig. 3 middle; Lauritzen et al., 2011a). However, in practice ICON is stable up to CN of approximately 0.8. The reconstruction polynomial is firstorder (linear), and the coefficients are estimated using a conservative and weighted least squares reconstruction method (Ollivier-Gooch and van Altena, 2002). Shape preservation in ICON-FFSL is obtained by using FCT, and tracer-airmass coupling is through "super-cycling".
A similar approach has been taken in the scheme of SLFV-SL developed at LMD (Laboratoire de Météorologie Dynamique, Paris, France) for a hexagonal icosahedral gridbased model. It uses simplified swept areas with simplified integration of linear reconstruction functions as in ICON-FFSL. Contrary to ICON-FFSL, the SLFV schemes base their reconstruction on averaging six gradients (or five for the pentagons) rather than a least-squares fit. SLFV-SL uses a Geosci. Model Dev., 7, 105-145, 2014 www.geosci-model-dev.net/7/105/2014/ slope limiter for shape preservation -more precisely, a multidimensional extension of the Van Leer-type slope limiter discussed in Dukowicz and Kodis (1987). LMD also presented another scheme, SLFV-ML, which is similar to SLFV-SL, but instead of forward Euler the Runge-Kutta third-order total variational diminishing (TVD) time-integration method is used (e.g., Nair et al., 2005). For details on the SLFV schemes, see Appendix A14.

Wave-propagation algorithm
Related to the "swept area" approaches described above, in the sense that this algorithm has some conceptual similarities, is the wave-propagation algorithm of LeVeque (2002). The specific version of this algorithm is referred to as CLAW as it is implemented in the general Clawpack package (LeVeque, 2006). The wave-propagation algorithm can be viewed as a scheme that propagates information (i.e., waves) first in a direction normal to a given cell interface, and then in a direction transverse to this interface effectively approximating "swept area" fluxes (see, e.g., Fig. 5.22 in Durran, 2010). CLAW is based on a first-order donor cell upwind method (first-order waves) composed of one-dimensional flux-divergence operators with "correction" terms to take into account traverse flow of waves and/or higher order waves. CLAW used here is formally second-order accurate. A TVD monotonized central-difference limiter (LeVeque, 2002;van Leer, 1977) is used for shape preservation, but other TVD type flux limiters can also be applied. Clawpack supports the advective and flux form of the transport equation. The version of CLAW used here is based on the advective form. For non-divergent winds the average normal velocity at mesh cell edges is obtained by differencing a stream function evaluated at mesh cell corners. Consequently, a constant density field in a non-divergent flow is preserved in the discretized CLAW scheme based on the advective form. Clawpack is not strictly a transport code, but is designed to solve more general non-linear hyperbolic problems. The problems presented here are ideally suited for AMRClaw, the spatially adaptive version of Clawpack (http://www.clawpack.org).

Dimensional splitting approach
Instead of approximating swept area fluxes rigorously in two dimensions, one may take an operator split approach, which has been successfully applied for orthogonal (Lin and Rood, 1996) and quasi-orthogonal grids (Putman and Lin, 2007). The advantage of such an approach is that only one-dimensional operators are needed. The formal accuracy, however, is limited to second-order with the splitting. The Lin and Rood (1996) scheme is used in NCAR's Community Atmosphere Model Finite-Volume version (CAM-FV, Neale et al., 2010) and implemented on a regular latitude-longitude grid. The transport scheme in CAM-FV applies successive applications of first-order advection and PPM (piecewise parabolic method; Colella and Woodward, 1984) flux-divergence operators that are carefully combined to minimize splitting errors. To render CAM-FV, approximately shape-preserving slope limiters and curvature limiters are applied in the one-dimensional PPM reconstructions. Since the limiters are applied to the PPM operators that are one-dimensional, over-and undershoots are only eliminated along coordinate directions and not in the transverse direction. Hence, CAM-FV is only approximately shapepreserving. Air-tracer coupling is through "super-cycling". For a stability analysis of the Lin and Rood (1996) scheme, see Lauritzen (2007).
Another dimensionally split transport scheme in Eulerian flux form that participated in this intercomparison is an improved version (Prather et al., 2008) of the original secondorder moment (SOM) scheme (Prather, 1986), which is here referred to as UCISOM (UC Irvine Second-Order Moments scheme). It applies the same operators/algorithm in all coordinate directions (via dimensional splitting) and hence is trivially extensible to three dimensions. In addition to the one prognostic variable (cell-averaged tracer mass) that all the schemes discussed so far use, the SOM method carries five prognostic variables. The extra forecasted variables are moments of the tracer distribution. The UCISOM scheme has been implemented on a regular latitude-longitude grid and on an equiangular gnomonic cubed-sphere (referred to as UCISOM and UCISOM-CS, respectively).

(Semi-)Lagrangian finite volume
A (semi-)Lagrangian finite-volume scheme is typically based on the form

D Dt
where D/Dt is the total or material derivative and A(t) is a Lagrangian volume for which, by definition, there is no flux of mass across its boundaries. Lagrangian and semi-Lagrangian finite-volume schemes are also referred to as cell-integrated schemes (Nair and Machenhauer, 2002). In semi-Lagrangian finite-volume schemes, the same Lagrangian areas are only traced/retained for one time step, whereas for fully Lagrangian schemes the cells move with the flow throughout the integration or at least for multiple time steps. Each sub-category of (semi-)Lagrangian finitevolume schemes is discussed in a separate section below. Conservation of mass in (semi-)Lagrangian finite-volume schemes is based on the physical constraint that the integral of mass over the Lagrangian areas at time level n and n + 1 must match. This physical constraint is more rigorous than the requirement for mass conservation in flux-form schemes, for which any flux leads to mass conservation as long as identical fluxes with opposite signs are used for each cell face.
Contrary to flux-form schemes, the reconstruction functions must integrate to the cell-averaged value in each Eulerian control volume, and the Lagrangian areas must span the entire domain without cracks or overlap between them. For a fuller discussion, see Lauritzen et al. (2011b) and Erath et al. (2013).
Since (semi-)Lagrangian finite-volume schemes trace Lagrangian volumes rather than fluxes through cell walls, shape preservation cannot be ensured using FCT and FCT-type limiters. Shape preservation in semi-Lagrangian finite volume (not flux form) can be accomplished via slope-limiting where the reconstruction function is limited to avoid spurious under-and overshoots.

Fully two-dimensional semi-Lagrangian finite volume
The Conservative Semi-Lagrangian Multi-tracer (CSLAM) scheme, which has been implemented in NCAR's High-Order Methods Modeling Environment (HOMME; Erath et al., 2012), is based on upstream tracing of cells and subsequent integration over overlap areas between the Lagrangian cell and Eulerian grid cells. Specifically, the vertices of the Eulerian grid control volumes/cells are traced upstream and connected with straight lines to define the upstream Lagrangian area. Note that it is essential for mass conservation that the upstream areas collectively span the entire domain and that the reconstruction function integrates to the cellaveraged value within each Eulerian grid cell (Erath et al., 2013). Mass conservation in flux-form schemes is not subject to these constraints. The CSLAM scheme may also be cast in flux form (Harris et al., 2010) to produce schemes that are identical even when the slope limiter for shape preservation is applied. Note that casting the scheme in flux form allows for flux limiters such as FCT that can obviously not be used in the Lagrangian form (e.g., Lauritzen et al., 2011b). Since CSLAM integrates over fewer overlap areas than its fluxform version, it is more efficient in its Lagrangian form. The CSLAM version used in this comparison was implemented on an equiangular gnomonic cubed-sphere grid. CSLAM uses fully two-dimensional polynomial-based reconstruction functions of degree two for air density ρ(x, y) and tracer mixing ratio φ(x, y). Shape preservation is obtained with fully two-dimensional slope-limiting (Barth and Jespersen, 1989). Integration over overlap areas on the cubed-sphere is performed via line integrals in gnomonic cubed-sphere coordinates. In Lauritzen et al. (2010) line integrals along coordinate lines were computed using exact line-integral formulas (Ullrich et al., 2009). However, it was later found that these may become ill-conditioned at high resolution: switching to Gaussian quadrature makes the algorithm robust but at the cost of mass conservation unless mass consistency is enforced locally using the consistency enforcement algorithm by Erath et al. (2013), which does not affect the locality and efficiency of the CSLAM algorithm. The coupling between ρφ and φ is by using the following reconstruction function for tracer mass in each Eulerian control volume: (Appendix B of Nair and Lauritzen, 2010) where (·) refers to the cell-averaged value. Note that for φ(x, y) = 1 Eq. (9) reduces to the reconstruction function for ρ, and hence Eq. (9) is free-stream preserving. Also, the higher order terms in the product ρ(x, y) · φ(x, y) have been eliminated so that the reconstruction function for tracer mass is of degree two. One could also simply use a reconstruction function based on tracer mass ρφ instead of reconstructing ρ and φ separately. However, shape preservation should only be applied to φ as φ is conserved following parcel trajectories and not tracer mass ρφ. Hence the separation of ρ and φ in the reconstruction step is preferable.

Flow-dependent dimensional splitting
Instead of approximating the upstream area with a fully twodimensional approach, it may be approximated using a dimensionally split approach. This is similar to splitting for Eulerian fluxes. However, the dimensional splitting is not along coordinate axes but along Lagrangian translations of coordinate axes. Hence we refer to this approach as flow-dependent dimensional splitting. The upstream area is then effectively approximated using line segments that are parallel to the coordinate axes (see, for example, Fig. 2 in Lauritzen et al., 2006) so that the two-dimensional remapping problem is cast into one-dimensional "sweeps" (one sweep along a coordinate axis and one sweep along the upstream translation of the other coordinate axis); such schemes are referred to as cascade schemes and were originally introduced by Purser and Leslie (1991) for non-conservative semi-Lagrangian interpolation. Later, conservative versions of the cascade method were proposed, e.g., the conservative cascade scheme (CCS; . In each cascade sweep, PPM-based operators (similarly to CAM-FV) are used. A scheme based on CCS and implemented on the reduced latitude-longitude grid (for details on the reduced latitudelongitude grid used here, see Fadeev, 2013) participated in this intercomparison and is referred to as CCSRG (Tolstykh and Shashkin, 2012). The version of CCSRG used here does not have a limiter implemented. Tracer-mass coupling is based on reconstructing tracer mass, ρφ, and not on the reconstruction of mixing ratio and density separately.

Lagrangian finite volume
A scheme for which the Lagrangian areas are retained for longer than one time step is the trajectory-tracking scheme (Dong and Wang, 2012) based on tracking interfaces (TTS-I, Dong and Wang, 2013). The advantage of tracing interfaces is that large gradients or even discontinuities are preserved.
Geosci. Model Dev., 7, 105-145, 2014 www.geosci-model-dev.net/7/105/2014/ The initial grid in TTS-I is based on polygons generated by using a spherical centroidal Voronoi tessellation (Du et al., 1999;Ringler et al., 2008;Ju et al., 2011), where the density function that controls the distribution of polygons is set to unity. The polygons are then traced throughout the integration. Due to the large deformation of the background flow, the edges of the polygons will inevitably cross. To avoid this ill-conditioned problem, a novel curvature-guard algorithm (CGA) has been developed that splits and merges edges according to deformation criteria. The details are explained in Dong and Wang (2013). For the computation of diagnostics, the fields are mapped to a regular latitude-longitude grid (which is also done for coupling with physical parameterizations). This mapping is first-order, mass-conservative and shape-preserving. Note that the prognostic fields are always retained in Lagrangian space, so the mapping is only for computing diagnostics (and tendencies from the physical parameterizations). Coupling between tracer mass and air mass is trivial since the scheme retains Lagrangian volumes for tracer mass and air mass throughout the integration.

Hybrid Eulerian-Lagrangian
An alternative approach is to retain both a fully Lagrangian and Eulerian representation of all prognostic variables as done in the hybrid Eulerian-Lagrangian (HEL) scheme (Kaas et al., 2013). In HEL the Lagrangian solution, based on tracing Lagrangian parcels (effectively solving Dφ/Dt = 0), is used to nudge the Eulerian solution toward the Lagrangian solution that exactly preserves tracer correlations and tracks gradients very accurately. In the Lagrangian solution, mixing between neighboring parcels is done using directionally biased diffusion based on the local deformation rate of the flow. The mixing is introduced to prevent long-term development of unresolvable deformation into parcel filaments, which one may also describe as aliasing in Lagrangian space. The Eulerian solution is simply a first-order forecast; in this case, a first-order version of CSLAM is used. Hence HEL is categorized under finite-volume semi-Lagrangian schemes, and the Lagrangian parcel part of the algorithm is viewed as a shape-preserving limiter in the context of this intercomparison. Lagrangian parcel values are used to nudge the shapepreserving low-order Eulerian solution using an algorithm that ensures mass conservation and shape preservation. For comparison the scheme has also been run in an aliased, and therefore unphysical, setup without the directional diffusion (abbreviated HEL-ND; No Diffusion); thus, the Lagrangian parcels retain their initial values throughout the simulation. If using exact trajectories, HEL-ND has no errors at the end of the simulation since the parcels will have returned to their initial position without altering their initial value. In the test cases presented here, the trajectories are not exact, and the error norms are therefore non-zero. Note that this is not the case for HEL since the mixing/diffusion is irreversible.
The scheme uses the same coupling between fluid density and tracer mass as the CSLAM scheme, although the nudging of Eulerian cell averages is done separately for density and mass, but constrained by monotonicity in tracer mixing ratio. The HEL scheme is general in the sense that any shapepreserving and mass-conservative scheme can be used for the Eulerian forecast. The HEL scheme has also been tested successfully in a dynamic shallow water model with strongly varying surface topography (Kaas et al., 2013).

(Semi-)Lagrangian grid point
Some schemes, such as traditional grid-point semi-Lagrangian schemes, are based on the advective form of the continuity equation for mixing ratio φ, The FARSIGHT scheme (White and Dongarra, 2011) is based on Eq. (10) and discretized on an equiangular gnomonic cubed-sphere grid. It is an upstream semi-Lagrangian scheme that computes departure points for each grid point using backward trajectories based on numerical derivatives of the wind field at the later time. The scheme then sets φ at each grid point to the interpolated value (thirdorder for FARSIGHT) at its departure point. The scheme allows for long time steps as long as the trajectory algorithm converges (Lipschitz criterion). FARSIGHT performs best at Courant numbers of 10-20 and has large errors at low Courant numbers (White and Dongarra, 2011). Schemes based on Eq. (10) are usually not inherently mass conservative, and it is common practice to apply global mass fixers that "ad hoc" restore global mass conservation. FARSIGHT uses a global mass fixer that also locally constrains the mixing value to remain within a predefined interval. Hence the scheme is not necessarily locally shape-preserving. The parallel implementation uses dynamic communication to allow arbitrarily fine domain decomposition regardless of time step. However, it does incur the expense of a global synchronization at each time step, and the mass fixer uses global reductions. For this class of schemes, free-stream preservation is trivial since a constant φ will remain constant throughout the simulation and ρ does not appear in the transport Eq. (10).
The spectral bicubic interpolation scheme (SBC, Enomoto, 2008) is a traditional semi-Lagrangian grid-point scheme in Eq. (10) based on spectral transforms on a latitude-longitude Gaussian grid (Ritchie, 1987). The zonal, meridional, and cross derivatives are calculated using the spectral transform method and are then fed into the bicubic interpolation formula providing a fully two-dimensional interpolant (no directionally splitting that is commonly applied in traditional semi-Lagrangian schemes). The number of zonal grid points is about twice the truncation wave number (linear Gaussian grid) rather than about three times (quadratic Gaussian grid) since the nonlinear terms are hidden in the interpolation (Côté and Staniforth, 1988). The linear Gaussian grid (thus larger truncation wave number) gives better accuracy for the same number of grid points, especially at low resolutions.
Trajectories are computed using the traditional method based on bilinear interpolation along great circles (Staniforth and Côté, 1991). A two-time-level scheme (Temperton and Staniforth, 1987) is implemented for efficiency. It is confirmed that the two-time-level scheme gives exactly the same results as the three-time-level scheme used by Enomoto (2008). The time extrapolation is not used since the wind fields are known analytically at any time t. Time integration is conducted in spectral space with the unlimited scheme. In physical space, it is conducted with the shape-preserving scheme.
This scheme does not formally conserve mass and is not inherently shape-preserving although the interpolation itself is very accurate; overshoots and undershoots are much smaller compared to traditional quasi-cubic interpolation (Ritchie et al., 1995). A simple global mass fix scheme based on a variational formulation by Sun and Sun (2004) is used. Shape preservation is enforced by a quasi-monotone scheme by Nair et al. (1999). The quasi-monotone scheme is an improved version of Sun et al. (1996) that applies the Bermejo and Staniforth (1992) filter.

Lagrangian parcel methods
Instead of periodically (every time step for FARSIGHT and SBC) remapping between a Lagrangian and Eulerian mesh, one may also trace the Lagrangian parcels throughout the integration (e.g., Chorin and Marsden, 2000;Cottet and Koumoutsakos, 2000) similar to the Lagrangian finitevolume method described above (TTS-I). This method is referred to as the Lagrangian particle method, and its implementation in this intercomparison will be referred to as LPM (Bosler, 2013). Apart from different remapping to Eulerian grids, LPM is similar to HEL without diffusion (i.e., HEL-ND). Obviously, any set of parcels can be traced. LPM traces quadrilaterals of a cubed-sphere mesh or the triangles of an icosahedral triangular mesh by both tracing the centers and vertices of the control volumes. The parcel trajectories are computed using a fourth-order Runge-Kutta method.

Series-expansion methods
Transport scheme algorithms in which the solution is projected onto a set of basis functions through a minimization procedure are broadly referred to as series-expansion methods as for example explained in Durran (2010). The spectral transforms used in the SBC scheme are also based on series expansions (global). However, since the expansions are only used to provide gradients for the Lagrange interpolant, the SBC scheme is not categorized as a series-expansion scheme. In this intercomparison one scheme (with several variants) under this category participated and is referred to as HOMME (High-Order Methods Modeling Environment). HOMME is a dynamical core framework that currently accommodates spectral element Dennis et al., 2005), discontinuous Galerkin methods (Nair, 2005;Nair et al., 2009), and finite-volume methods (Erath et al., 2012) on conforming quadrilateral grids on a sphere. A gnomonic cubed-sphere grid defines the elements, and each element is populated with Gauss-Lobatto-Legendre nodes for integral evaluations used in the transport operators.
The HOMME version used here is a continuous Galerkin finite element method that relies on globally continuous polynomial basis functions of order p (here with p = 3 and p = 6). Although HOMME has the capability to solve the transport equation in advective form, it is solved in flux form (one equation for ρφ and one for ρ) for exact conservation. A compatible discretization method is used that guarantees mass conservation (Taylor and Fournier, 2010). Time stepping in HOMME is via an explicit three-stage strong stability preserving Runge-Kutta method. For shape preservation φ = (ρφ) ρ is recovered after each Euler time step in the Runge-Kutta method. The quasi-monotone limiter (shapepreserving filter) for φ is based on an optimization problem with equality and inequality constraints (Taylor et al., 2009;Guba et al., 2013).
There is a significant dependency of the simulation quality on the choice of the fourth-order hyperviscosity coefficient for low-resolution simulations with HOMME. For specific choices used in HOMME, see Appendix A8. HOMME has been incorporated as a dynamical core option in NCAR's Community Atmosphere Model (CAM, Evans et al., 2013). The configuration using the HOMME spectral element dynamical core in CAM is referred to as CAM-SE (Dennis et al., 2012). The test case suite was also run with CAM-SE (equivalent to HOMME-p3) but using the fourth-order hyperviscosity coefficients for climate simulation in CAM (see Appendix A2 for details).

Results
In this section the results for the transport schemes that participated in this comparison are presented and discussed. Horizontal resolutions are specified in terms of average grid spacing at the Equator. The test case suite works with three resolutions λ : 1.5 • , 0.75 • , and λ m (the latter is scheme dependent and defined in Sect. 3.2), where λ denotes the longitude. The identical grid spacing is also selected for the latitudinal direction. The native grid parameters corresponding to these three average grid spacings at the Equator can be found in the Appendix for the respective schemes. In addition, we make extensive use of CNs, which are also specified in terms of λ, so local CNs may differ from the "global" Geosci. Model Dev., 7, 105-145, 2014 www.geosci-model-dev.net/7/105/2014/ CN for schemes implemented on non-isotropic grids. Again the reader is referred to the scheme-specific Appendix to find time steps t corresponding to specific CNs at any of the three resolutions. Data used to make histograms are available as supplemental material. The test case specification consists of two analytical flow fields (one non-divergent and one divergent) designed to deform initially well-resolved initial conditions into thin filaments half way through the simulation (t = T /2, where T is the period). Thereafter the deformational part of the flow reverses so that the tracer distributions return to their initial condition at t = T . The deformational flow is superimposed on a constant zonal flow to challenge the schemes further and to guarantee that errors do not cancel when the deformational flow reverses. The initial conditions are based on distributions ranging from infinitely smooth surfaces to discontinuous slotted cylinders. The distributions are placed into the western and eastern hemisphere, respectively, so that model developers can investigate the symmetry of the computed solutions. A series of diagnostics are used to assess various aspects of accuracy. For specific details on the test suite setup, we refer to LSPT2012. Not all models provided a complete data set and/or ran the suite exactly complying with the test case specifications. When data are missing or non-existent in histograms, the value is set to −1. In scatterplots it will be clearly marked "NO DATA" if the data are missing. If modelers have diverted slightly from the exact test case descriptions, it will be noted in the text. We have chosen not to exclude models that did not submit a complete data set as the data they did submit do, in our opinion, provide meaningful insights. It should be noted that for schemes that are inherently shape-preserving (HEL, LPM, TTS), i.e., for schemes for which there does not exist an unlimited version, the unlimited data are marked as "NO DATA" or "−1".
The tests are grouped into six categories assessing the following: 1. numerical order of convergence using smooth Gaussian hills initial conditions, 2. "minimal" resolution using cosine bell initial conditions, 3. ability of the transport scheme to preserve filaments using cosine bells, 4. ability of the transport scheme to transport "rough" distributions using slotted cylinder initial conditions, 5. ability of the transport scheme to preserve pre-existing functional relations between tracers, 6. ability of the transport scheme to deal with divergent flows .
These topics are discussed in separate sections below.

Numerical convergence rates: Gaussian hills
The goal of this test is to estimate numerical convergence rates for the normalized error norms i , which are referred to as K un i for the unlimited scheme and (if applicable) K sp i for the shape-preserving version of the scheme, where i = 2, ∞. Gaussian hills and the non-divergent flow field are used for the initial conditions. Normalized error norms are computed after one period (T ) when the analytical solution is readily available. The initial condition is infinitely smooth (C ∞ ) so that the smoothness of the initial condition is not a limiting factor for numerical convergence rates. With C 1 initial conditions, for example, one cannot necessarily expect to achieve numeral convergence rates matching the formal order of accuracy for higher order schemes (see, for example, Harris et al., 2010). The meridional component of the velocity field v is not infinitely smooth at the poles. However, since all fields are constant at the poles (and in the vicinity of the poles) and since all metrics are based on mixing ratio φ and not tracer mass, this lack of smoothness in the derivative of v has not been found to influence the results. Hence this setup was designed to assess "optimal" convergence rates given the smoothness of the initial condition and v.
The numerical convergence rates are computed using a least-squares linear regression of the form where K i denotes constants for the resolution range approximately 3 • to 0.3 • (a Gnuplot script was made available as supplemental material in LSPT2012 to perform the leastsquares regression). Note that the resolution range has deliberately been chosen to include a range [ λ, 3 • ], where λ > 0.1 • . With the 3 • grid spacing, the mixing ratio distributions may be marginally resolved. The main interest is not asymptotic convergence rates, which should be close to the theoretical convergence rate, but rather the effect of marginally resolved features in the convergence rate computations.
Convergence plots for i , i = 2, ∞, for the unlimited and shape-preserving versions of the schemes are given in Figs. 1 and 2. The schemes have been grouped according to implementation grid. An accompanying histogram (Fig. 3, middle) depicts the convergence rate for i , i = 2, ∞. The ordering of the data in the histogram will become clear as we discuss "minimal" resolution in the next section. For the convergence study the CN is held fixed. The labels on the convergence plots and histograms include the CN appended to the scheme acronym.
The histogram graphically depicts the range of convergence rates represented by the ensemble of models. They span from first-order convergence rates to sixth-order for the unlimited schemes. Hence, the ensemble of models that participated in this intercomparison span a significant range of formal accuracies. Several observations are made regarding . 1. Convergence plots for 2 (first and third rows) and ∞ (second and fourth rows) for the (first column) and shape-preserving (second column) versions of schemes based on cubed two-patch grids. Optimal convergence rates are based on linear least-square regressions to th in grey lines on each plot show slopes of second-and third-order convergence (top and bott ctively). Initial conditions are the infinitely smooth Gaussian hills and the normalized error no puted at time t = T . 72 Fig. 1. Convergence plots for 2 (first and third rows) and ∞ (second and fourth rows) for the unlimited (first column) and shape-preserving (second column) versions of schemes based on cubed-sphere and two-patch grids. Optimal convergence rates are based on linear least-square regressions to this data. Thin grey lines on each plot show slopes of second-and third-order convergence (top and bottom, respectively). Initial conditions are the infinitely smooth Gaussian hills, and the normalized error norms are computed at time t = T .
"optimal" convergence rates and will be discussed in separate sections below.

Reaching asymptotic convergence
Together with the absolute errors that will be commented on in the discussion of "minimal" resolution, perhaps the most striking observation to be made regarding the convergence plots (Figs. 1 and 2) is that models transition from sub-optimal convergence to asymptotic convergence rates at different resolutions. Some models converge at full order for 2 already at the lower end of the resolution range for which we assess numerical convergence (e.g., CAM-FV, CCSRG, CSLAM, HEL, HEL-ND, FARSIGHT-CN10.4, SBC, SFF-CSLAM3, UCISOM, UCISOM-CS), whereas other schemes reach "optimal" convergence rates at finer resolutions (e.g., CLAW, CAM-SE, FARSIGHT-CN1.0, HOMME, SFF-CSLAM4, SLFV-ML/SL, MPAS, ICON). Common for the schemes that converge asymptotically throughout the resolution range is that they converge at rates equal to or less than two, K 2 ≤ 2, except for the thirdorder CSLAM, SFF-CSLAM3, and CCSRG schemes that  Fig. 1 but for schemes defined on a regular latitude-longitude grid (rows 1 and 2) and icosahedral/Voronoi meshes (rows 3 and 4). Note that the LPM scheme was run with fixed time step and not with fixed Courant number; therefore no CN value is appended to the LPM label. For easier comparison the y axes are identical on all optimal convergence figures.
converge asymptotically already at approximately 3 • resolution. Other higher order schemes that are formally third-order (MPAS), fourth-order (HOMME-p3, SFF-CSLAM4), and seventh-order (HOMME-p6) do not converge at the asymptotic rate at the lower end of the resolution range. The effect of hyperviscosity coefficient on convergence rates for spectral element advection can be observed by comparing CAM-SE and HOMME-p3 (Fig. 1). Another fact contributing to the discrepancy is the fact that in CAM-SE the transport test is implemented using the offline_dyn option for which the winds are held fixed throughout the tracer time step, whereas in HOMME the winds are updated at every Runge-Kutta step.

Shape-preserving filters and convergence rates
When examining the histograms for "optimal" convergence rates for 2 and ∞ (Fig. 3 middle and lower, respectively), it is immediately apparent (with the exception of CLAW, FAR-SIGHT, SBC), and not surprising, that shape-preserving filters reduce convergence rates. The most striking reductions in K 2 are for the higher order schemes such as HOMME-p6, HOMME-p3, and SFF-CSLAM4 for which the convergence rates are reduced by four, two, and two, respectively. The formally third-order schemes CSLAM, MPAS, SFF-CSLAM3 see reduction of convergence rates of about 0.5. Schemes that are approximately second-order accurate are less affected (in an absolute sense) by shape-preserving filters. The observations made for K 2 also hold in a qualitative sense for K ∞ . We also note that a posteriori shape-preservation filters/fixers do not affect convergence rates (FARSIGHT and SBC).

Time step and convergence
LSPT2012 encouraged modelers to provide data for different CNs (the CN here refers to the maximum zonal CN; see Eq. 24 in LSPT2012), especially for schemes allowing for long time steps (CN > 1) such as (semi-)Lagrangian schemes. CSLAM, for example, was run with CN = 1.0 and CN = 5.5. It is observed that as the time step is reduced with CSLAM, the absolute errors increase since an increased number of remappings implies increased spatial errors until the distribution can be represented by the polynomial reconstruction functions (Fig. 1, row 1 and 2). Since the CSLAM scheme was run with semi-analytic trajectories, temporal errors (due to trajectory computations) are minimal. The asymptotic convergence rates for CSLAM are not affected by time step in this setup. Similar observations are made for the CCSRG. The SBC scheme is also a semi-Lagrangian scheme, and, contrary to the CSLAM setup, inexact trajectories were used. At lower resolutions the spatial errors dominate so the absolute errors increase with a decreased time step (similar to CSLAM). However, at high resolution the temporal errors start to dominate the standard error norms; with CN = 1 SBC solutions become more accurate than the CN = 5.5 solutions when the resolution is finer than approximately λ = 0.375 • . In other words, the temporal errors start to dominate as the distributions are very well represented by the basis functions used in SBC at high resolution.
The Eulerian scheme used in the ICON model was run at CN = 0.2 and CN = 0.6 ( Fig. 2, row 3 and 4). Contrary to the semi-Lagrangian schemes, the solutions achieved with longer time steps have larger errors throughout the resolution range. Since ICON-FFSL is based on a low-order spatial reconstruction function, it is unlikely that the error is dominated by time-truncation errors throughout the resolution range. Rather it appears more likely that the larger CN errors are due to the simplified flux approximation for which errors increase with larger CNs due to more of the flux-area integrations being based on extrapolation of reconstruction functions.
For CAM-FV it is observed that the large CN solution (CN = 1.2) has smaller absolute errors than the CN = 0.2 simulations (Fig. 2, row 1 and 2). Although the splitting errors in the dimensionally split CAM-FV scheme increase with CN, these errors do not dominate for this test case setup. Semi-analytic "trajectories" were used (analytic wind evaluations at n + 1/2 were used in the simulations), so, as for CSLAM, the temporal errors due to "trajectories" can be expected to be small. In conclusion, the absolute errors for the two-time-level CAM-FV solutions are dominated by spatial errors (number of flux evaluations increases with decreased CN).

"Minimal" resolution λ m : cosine bells
Rather than assessing convergence rates, this test focuses on absolute errors. In other words, we ask at what resolution modelers need to run their model to achieve a certain solution quality. The solution quality is quantified in terms of the 2 error norm for solutions using the same non-divergent flow field as above but with less smooth (C 1 ) initial conditions. A less smooth initial condition is chosen to challenge the schemes with a more realistic (in terms of smoothness) initial condition compared to the infinitely smooth Gaussian hills. This is similar to the setup used in Williamson et al. (1992) where both the advection test and shallow water topography (test 5) use C 1 functions for mass distribution and surface height, respectively.
Basically, the modelers repeated the numerical convergence test (Sect. 3.1) with cosine bell initial conditions. The "minimum" resolution is defined as the resolution (specified in terms of average grid spacing at the Equator) for which the normalized 2 error norm is approximately 0.033. This threshold was chosen based on CSLAM experiments for which the filaments were resolved in the sense that asymptotic convergence is reached; for CSLAM-CN5.5 asymptotic convergence with cosine bell initial conditions is reached at approximately λ = 1.5 • for which 2 ≈ 0.033. The minimum resolution is estimated from a convergence plot (see Fig. 4 in LSPT2012) and should be computed without and (if applicable) with shape-preserving filters. The "minimal" resolution used in the remainder of the test case suite should be λ m for the unlimited scheme.
The "minimal" resolutions for the different schemes are depicted graphically in the histogram in Fig. 3 (top row). First of all, the λ m range is from approximately 1/10 • to over 2 • resolution. This is a remarkable difference in resolution to achieve the same "quality" solution. In Fig. 3 the same ordering is used for the histograms, making it easier to compare "optimal" convergence rates visually with "minimal" resolutions. The histograms for K i do not constitute a monotonically increasing quantity going from left to right in the histogram plots. In other words, high-order convergence rates do not necessarily result in coarser "minimal" resolutions or vice versa; in fact there seems to be no clear correlation between K i and λ m in the resolution range considered here. This is perhaps even more apparent in the "scatterlike" plot in Fig. 4. In fact, some of the schemes that are among the best performing schemes regarding λ m (e.g., UCISOM, LPM) perform poorly in terms of convergence rate. Had the test been run in a (high-resolution) asymptotic convergent regime, K i and λ m would most likely be inversely related. However, as mentioned the test is designed to challenge schemes near the resolution limit rather than      stogram of minimal resolution ∆λ m (upper), K i , i = 2, ∞ which are the "optim s for 2 (middle) and ∞ (lower), for the unlimited ("un", red) and shape-prese sions of the schemes. The histogram is ordered monotonically according to ∆λ m hemes so that ∆λ decreases from left to right. For schemes for which unlimited re ∆λ m for the shape-preserving scheme is used for the purpose of ordering (scheme -FV, HEL, HEL-ND, UCISOM, UCISOM-CS) and a placeholder value of −1 i s. Note that the LPM scheme was run with fixed time-step and not with fixed Cour o CN value is appended to the LPM label. 74 Fig. 3. Histogram of minimal resolution λ m (upper), K i , i = 2, ∞, which are the "optimal" convergence rates for 2 (middle) and ∞ (lower), for the unlimited ("un", red) and shape-preserving ("sp", green) versions of the schemes. The histogram is ordered monotonically according to λ m for the unlimited schemes so that λ decreases from left to right. For schemes for which unlimited results are not available, λ m for the shape-preserving scheme is used for the purpose of ordering (schemes concerned are CAM-FV, HEL, HEL-ND, UCISOM, UCISOM-CS), and a placeholder value of −1 is used in all histograms. Note that the LPM scheme was run with fixed time step and not with a fixed Courant number; therefore no CN value is appended to the LPM label.
focusing on resolutions for which the spatial distributions of tracers are well-resolved. Shape-preserving filters (with the exception of CLAW, FARSIGHT and SBC) reduced "optimal" convergence rates. The effect of shape-preserving filters on the "minimal" resolution seems to go both ways (Fig. 3, top). That is, some schemes increase accuracy ( λ m increases) when the shapepreserving filter is used (most notably with MPAS, ICON, SBC, CLAW), whereas other schemes experience a decrease (HOMME-p3, HOMME-p6, CSLAM). It is noted that ICON/MPAS and CLAW use FCT and TVD-type flux limiters, whereas CSLAM uses a slope limiter. Results that contrast unlimited and shape-preserving "minimal" resolutions are not available for CAM-FV, CCSRG, UCISOM, and UCISOM-CS since only shape-preserving data are available for those models.
In general it is also noted that the "minimal" resolutions for schemes defined on icosahedral/Voronoi grids have finer λ m than schemes defined on cubed-sphere and regular latitude-longitude grids. That said, since the measure of sion Paper | Discussion Paper | Discussion Paper | Discussion Paper | catter-like" plot of the data shown as histograms in Fig. 3 upper and middle rows. Each scheme nted by a point on the plot with (x, y) coordinates (∆λ m , K 2 ). For clarity each point is not lah scheme acronym. The purpose of this Figure is to show that there is not necessary a correlation "optimal" convergence rate and "minimal" resolution. 75 Fig. 4. "Scatter-like" plot of the data shown as histograms in Fig. 3 upper and middle rows. Each scheme is represented by a point on the plot with (x, y) coordinates ( λ m , K 2 ). For clarity each point is not labeled with a scheme acronym. The purpose of this figure is to show that there is not necessarily a correlation between "optimal" convergence rate and "minimal" resolution. resolution is the average resolution at the Equator, the regular latitude-longitude grids have more degrees of freedom than cubed-sphere and icosahedral grid-based models. In this discussion we have not considered how amenable spherical grids and schemes are to mesh-refinement applications.

"Filament" preservation diagnostic f : cosine bells
All tests above were based on traditional error norms computed at time t = T when the flow, in the absence of any numerical errors, has advected the distributions back to their initial position and shape. As discussed in Lauritzen and Thuburn (2012), the first half of the simulation, where relatively well-resolved features collapse in scale (at t = T /2 the initial condition cosine bells have been deformed into thin filaments), is typical for atmospheric flow. The second half of the simulation (t ∈ [T /2, T ]) does not resemble typical observed flow patterns, but it is very convenient for obtaining an analytical solution under complex flow conditions. Partly motivated by that, a series of diagnostics were developed for which an analytical solution is not needed, and one can thereby assess accuracy at any point in time. For example, before the "unphysical" flow reversal at t = T /2, one could expect the schemes to be most challenged at least for semi-Lagrangian and Eulerian schemes.
One such diagnostic is the filament diagnostic that is designed to diagnose how well the thin filaments that develop at t = T /2 are preserved. It takes advantage of the fact that in continuous space the area spanned by tracer values larger than some threshold value is conserved for a non-divergent flow field. The filament diagnostic, f (for a mathematical definition of f , see LSPT2012), is designed to quantify how well filaments are preserved in terms of how well a scheme preserved the total area for which φ is larger than a threshold value τ .
The cosine bell initial conditions are chosen for this test as they are quasi-smooth (but not infinitely smooth) and have mixing ratio values that span the entire range from the background value of 0.1 to the peak value, φ = 1.0. Slottedcylinder initial conditions, for example, only have two values, and simulations using that initial conditions would therefore not give information on how well the scheme maintains continuous and varying gradients.
The perfect scheme will have f close to 100 for all values of τ . We say "close" to 100 and not exactly equal to 100 since for Eulerian/semi-Lagrangian schemes that use a fixed grid one would need to truncate the exact Lagrangian solution (for which f = 100 for all t) to the fixed Eulerian grid for the computation of f ; however, that truncation error is likely orders of magnitude below the numerical truncation errors (numerical diffusion and dispersion errors) introduced by the scheme itself. For fully Lagrangian schemes based on parcels, this test forces modelers to define areas associated with the Lagrangian parcels. Cell-integrated Lagrangian schemes that track cells throughout the integration can test how well the scheme preserves areas.
As explained in LSPT2012, a highly diffusive scheme tends to increase f for lower threshold values τ (except τ = 0.1 for which f decreases) and decrease f for higher values of τ (see Fig. 6a in LSPT2012). In other words, when the base of the cosine bells is diffused, more area is covered by lower values of φ and less area is covered with higher (near peak) values of φ.
The filament diagnostic gives insight into how gradients are distorted in terms of the ability to preserve the area of the domain in which the mixing ratio is larger than the threshold value τ . If the f (τ ) curve is smooth and monotonically decreasing as a function of τ , the schemes diffusive characteristics are smooth and continuous. Schemes that tend to steepen gradients will spuriously force f (τ ) > 100 for relatively large τ values. Schemes that make use of "ad hoc" fixers (that also alter gradients) may produce an oscillatory f (τ ) curve. Figure 5 shows the filament preservation diagnostic f (at t = T /2) using the cosine bell initial condition for the unlimited and (if applicable) limited/filtered schemes at resolutions 1.5 • and 0.75 • . Results for f at the "minimal" resolution λ m are not shown although requested in LSPT2012. As for the convergence plots, data have been arranged according to discretization grid. We also show a "minimal-τ " filament preservation diagnostics as histograms in Fig. 6. That is, the y axis on the histogram is the τ value for which f is 80; this τ value is referred to as τ m and computed by solving Filament preservation diagnostic f (τ ) at 1.5 • (first column) and 0.75 • (second column) resolution, respectively, for the unlimited (thick lines) and shape-preserving (thin lines) versions of the schemes. Note that TTS-I, LPM, HEL, and HEL-ND are inherently shape-preserving and therefore only have "unlimited" data displayed. The LPM scheme was not run with fixed CN. The CN for 1.5 • and 0.75 • is 1.08 and 2.0, respectively. 76 Fig. 5. Filament preservation diagnostic f (τ ) at 1.5 • (first column) and 0.75 • (second column) resolution, respectively, for the unlimited (thick lines) and shape-preserving (thin lines) versions of the schemes. Note that TTS-I, LPM, HEL, and HEL-ND are inherently shapepreserving and therefore only have "unlimited" data displayed. The LPM scheme was not run with fixed CN. The CN for 1.5 • and 0.75 • is 1.08 and 2.0, respectively. the solution to Eq. (12) is not multivalued for the data considered here. For example, if τ m = 0.6, then 80 % of the area associated with mixing ratios larger than 0.6 is preserved. In other words, the larger τ m is, the better the scheme preserves the "peaks" of the cosine bells. The histogram in Fig. 6 is mainly shown to investigate visually if there is a relationship between "minimal resolution" and τ m . Had there been a simple linear relationship, the values of τ m would decrease/increase from the left to right in the histogram. As for the numerical convergence rates (Fig. 4), there is no simple relationship indicating that f measures other aspects of accuracy than λ m . That said, there is a tendency of increased τ m from left to right with some outliers. For example, UCISOM-CN1.0/5.5 performs exceptionally well compared to the schemes with similar "minimal resolution". Similarly, but in a opposite sense, HEL-CN1.0 performs worse than its "neighbors" in the histogram.
Perhaps more interesting in the context of f is to focus on the shape of f as a function of τ . First of all, the more diffusive schemes tend to collapse toward a straight line with negative slope for τ approximately in [0.2 : 0.8], whereas the less diffusive schemes tend toward a straight line with no or small negative slope. The smoothness of the f curve may indicate non-physical "ad hoc" fixers or anti-diffusive aspects er | Discussion Paper | Discussion Paper | Discussion Paper |   Fig. 6. A histogram of threshold value τ for which the filament preservation diagnostic f (τ ) is approximately 80.0 at resolution 1.5 • for the unlimited (red) and shape-preserving (green) versions of the schemes. Above each column the value of τ is written (if τ = −1 there is no data for that scheme configuration). 77 Fig. 6. A histogram of threshold value τ for which the filament preservation diagnostic f (τ ) is approximately 80.0 at resolution 1.5 • for the unlimited (red) and shape-preserving (green) versions of the schemes. Above each column the value of τ is written (if τ = −1, there are no data for that scheme configuration).
of a scheme. For example, the FARSIGHT scheme uses an "ad hoc" fixer for mass conservation and shape preservation. The f curves, in particular for FARSIGHT-CN10.4, are oscillatory and non-monotone. The SFF-CSLAM4 and CAM-FV0.2 schemes have a rather wide range of τ values (approximately τ ∈ [0.6 : 0.8]) for which f exceeds 100.0, which is likely due to steepening of gradients. In conclusion, there are indications that this metric is most useful for testing schemes employing "ad hoc" fixers or schemes with antidiffusive terms or other mechanisms that may steepen gradients. Note that this metric will not capture if the location of the filaments is incorrect (phase errors).

Transport of "rough" distribution: slotted cylinder
To assess how schemes perform with a rough (discontinuous) initial condition, we show contour plots of solutions at t = T /2 for slotted-cylinder initial conditions and the same non-divergent flow as used in all tests above. The slotted cylinder has been used extensively in the solid-body advection test case to demonstrate that shape-preserving limiters effectively eliminate spurious grid-scale oscillations. Contrary to traditional specifications of the slotted-cylinder initial condition, we have chosen to overlay it by a background value of φ = 0.1 instead of a zero background value. Again, this is motivated by typical conditions found in the atmosphere where structures in tracer distributions frequently overlay some smooth background distribution. In that case, positivity preserving limiters will not eliminate undershoots near the discontinuity. Contour plots for mixing ratio at t = T /2 based on slotted cylinder initial conditions are shown in Figs. 7, 8, 9, and 10 (again, data are grouped according to the discretization grid). In the LSPT2012 test case, specification modelers were asked to report on conventional error norms (at t = T ) in addition to showing contour plots (at t = T /2). Here we have chosen not to depict/list the conventional error norms as we did not find any qualitative insights that were not visible in the contour plots (the error norms are available in the supplemental material for the interested reader). So in the interest of reducing the number of figures/tables, errors 2 , and ∞ , as well as the minimum and maximum norms, are not shown.
All contour plots use the same coloring scale and contour interval making it straightforward to compare schemes visually. It is immediately apparent, most notably in the areas away from the slotted cylinders where the field should be constant, whether a scheme is not strictly shape-preserving (light blue contour filling). Almost all unlimited schemes show "ripples" in this area. Similarly, overshoots over the slotted cylinders are immediately visible (dark red contour filling). The wavelength of the spurious oscillations is related to the formal order of the schemes. For example, the oscillations for HOMME-p6 have a much shorter wavelength than those observed for ICON. This test, however, was specifically designed to assess whether shape-preserving filters truly eliminate undershoots and overshoots while still preserving extrema. Finite-volume scheme based on rigorous flux computations and/or FCT limiters completely eliminate undershoots/overshoots (CSLAM, ICON, MPAS). For schemes based on simplified fluxes and not using FCT limiters, small undershoots are visible (SFF-CSLAM3/4, CAM-FV1.2). The UCISOM scheme has a strictly shape-preserving limiting option. However, to avoid "excessive" diffusion, the limiter has been relaxed, which explains the undershoots with that scheme. If a scheme shows ripples with a strictly shape-preserving filter, then it may be due to inconsistent coupling between the air mass and tracer mass fields when the mixing ratio is extracted. For example, a scheme that is not "free-stream"-preserving will suffer from this deficiency.
The ability of the scheme to preserve the "plateau" of the slotted cylinders seems to be closely related to "minimal" resolution in a qualitative sense except for the UCI-SOM and HEL scheme that perform better than would be expected from their λ m ranking. Not surprisingly the more . Contour plot of φ at t = T /2 using "rough" initial condition at approximately 1.5 • (co 2) and 0.75 • (columns 3 and 4) resolution without (columns 1 and 3) and with (columns pe-preserving filter for a subset of transport schemes implemented on a cubed-sphere grid e acronym is shown in the lower left corner of each plot. 78 Fig. 7. Contour plot of φ at t = T /2 using "rough" initial condition at approximately 1.5 • (columns 1 and 2) and 0.75 • (columns 3 and 4) resolution without (columns 1 and 3) and with (columns 2 and 4) shape-preserving filter for a subset of transport schemes implemented on a cubed-sphere grid. The scheme acronym is shown in the lower left corner of each plot. diffusive schemes that have a smaller λ m also diffuse the slotted cylinders. The pure Lagrangian schemes obviously maintain the discontinuities in the slotted cylinder better than the Eulerian/semi-Lagrangian schemes.

Preservation of pre-existing functional relation: cosine bells and correlated cosine bells
All known tests for linear transport on a sphere consider aspects of accuracy in a single-tracer setup. As discussed in detail in Lauritzen and Thuburn (2012), the accuracy with which schemes maintain relations between tracers is of significant interest in chemistry-climate and climate model-. 8. Same as Fig. 7 for the remaining scheme defined on a cubed-sphere grid and two-patch gr LAW). For plots showing "CONSTANT FIELD -VALUE IS 0.1" no data are available. ing. To assess how well interrelated tracers are simulated in an idealized setup, we use the same flow field as before. Two cosine bell distributions, with mixing ratio χ and accompanying "correlated" mixing ratio ξ , are advected separately. The latter is related to the former initial condition through a non-linear (polynomial) relation (black curve on the scatterplots; Figs. 11,12,13,and 14). For any Eulerian or semi-Lagrangian scheme known to the authors, scatter points will deviate from the pre-existing functional curve as the simulation progresses. In a purely Lagrangian scheme with no explicitly added mixing (for example, contour surgery) where parcels are traced throughout the simulation, any relation between tracers is maintained, and hence the scatter points are stationary in the correlation plots. The way in which scatter points deviate from the polynomial curve has consequences for the physical realizability of the mixing introduced by the scheme. When mixing occurs in the atmosphere, scatter points (for example, located in Geosci. Model Dev., 7, 105-145 two different air masses) will move toward each other along straight lines in the scatterplot. These lines are called mixing lines. The area spanned by all possible mixing lines is referred to as the "convex hull" and is the bow-shaped area on the scatterplots. If the scatter point moves into any area that is not the convex hull, the mixing that the scheme introduces is unphysical unmixing. Following Lauritzen and Thuburn (2012), this unmixing is categorized into two types (for graphical illustration see Fig. C1 in LSPT2012) -rangepreserving unmixing that is unmixing within the range of the range of the initial condition. Note that in the scatterplots (Figs. 11,12,13,and 14), only the upper part of the rangepreserving unmixing area is marked with solid black lines; the triangular area below the convex hull also belongs to the range-preserving unmixing area and "overshooting" that is the remaining area on the scatterplot. When scatter points shift into the convex hull, the mixing is categorized as "real" mixing.
Associated with each area are mixing diagnostics that quantify the mixing in terms of normalized distances from the pre-existing functional curve (Fig. B1 in LSPT2012): r for "real mixing", u for range-preserving unmixing, and o for overshooting (for definitions of i , i = "r", "u", and "o", see Lauritzen and Thuburn, 2012). Following LSPT2012 the i is computed half way through the simulation, t = T /2, when the initial distributions are most deformed. As for the filament diagnostic f , the mixing diagnostics i do not require any knowledge of the analytical solution. In fact, Geosci. Model Dev., 7, 105-145, 2014 www.geosci-model-dev.net/7/105/2014/ r plots (for subset of cubed-sphere models) at t = T /2 for the cosine bell tial conditions for χ and ξ, respectively. First and third columns are for econd and fourth columns are for the shape-preserving schemes. The firs ions at ∆λ ≈ 1.5 • and the last two columns are for ∆λ ≈ 0.75 • . The sche wer left corner of each scatter plot with the maximum Courant number (C eme acronym the mixing diagnostics ("real" mixing r , range-preserving Fig. 11. Scatterplots (for subset of cubed-sphere models) at t = T /2 for the cosine bell and correlated cosine bell initial conditions for χ and ξ , respectively. First and third columns are for the unlimited schemes, and second and fourth columns are for the shape-preserving schemes. The first two columns are for simulations at λ ≈ 1.5 • , and the last two columns are for λ ≈ 0.75 • . The scheme acronym is shown in the lower left corner of each scatterplot with the maximum Courant number (CN) appended. Above the scheme acronym the mixing diagnostics ("real" mixing r , range-preserving unmixing u , overshooting o ) are given. me as Fig. 11 for the remaining cubed-sphere models.
84 Fig. 13. Same as Fig. 11 for models defined on a regular latitude-longitude grid.

85
Fig. 14. Same as Fig. 11 for models defined on an icosahedral/Voronoi mesh. and contrary to the filament diagnostic, which relied on the wind field being non-divergent, the mixing diagnostics can be applied in any flow setting and is hence more generally applicable. For a three-dimensional extension of this test case, see Kent et al. (2013).
The values for the mixing diagnostics for each scheme are shown in the lower left corner of each scheme's scatterplot. The mixing data are also shown in histogram format in Fig. 15, where i has been normalized with CSLAM values to provide a reference. Before discussing the quantification of the mixing, it is insightful to analyze the scatter data qualitatively.

1.5°)
Mixing diagnostics at resolution ∆λ m histogram of mixing diagnostics (stacked) at resolutions ∆λ ≈ 1.5 • (upper), nd ∆λ ≈ ∆λ m (lower). The ordering is according to minimal resolution ∆λ m fo ted schemes (see Fig. 3 first row). Above each scheme acronym there are two colu lumn is for the unlimited scheme and the right column contains data for the shape the scheme (if applicable). The height of each colored column (green r , yellow between i , i ∈ ["r", "u", "o"] for the scheme in question normalized by the i ∆λ = 1.5 • . Note that the y axis scale are different. The stacked histograms for SL eed the plotting range. If no data are available the mixing data are negative (altho bmitted for FARSIGHT there are some mixing diagnostics given in White an numerical values for i are listed on the scatter plots in Figs. 11, 12, 13 and 14.   Fig. 15. A histogram of mixing diagnostics (stacked) at resolutions λ ≈ 1.5 • (upper), λ ≈ 0.75 • (middle), and λ ≈ λ m (lower). The ordering is according to minimal resolution λ m for the respective unlimited schemes (see Fig. 3 first row). Above each scheme acronym there are two columns of data. The left column is for the unlimited scheme, and the right column contains data for the shape-preserving version of the scheme (if applicable). The height of each colored column (green r , yellow u , red o ) is the ratio between i , i ∈ ["r", "u", "o"] for the scheme in question normalized by the i for CSLAM (CN5.5) at λ = 1.5 • . Note that the y axis scale are different. The stacked histograms for SLFV-ML and CLAW exceed the plotting range. If no data are available, the mixing data are negative (although i data were not submitted for FARSIGHT, there are some mixing diagnostics given in White and Dongarra, 2011). The numerical values for i are listed in the scatterplots in Figs. 11, 12, 13 and 14.

Scatter shape
Scatter points located near the lower right corner of the convex hull (χ , ξ ) = (1.0, 0.1) are the mixing ratio values making up the extrema of the cosine bells and correlated cosine bells. The opposite extreme of the convex hull (upper left corner (χ , ξ ) = (0.1, 0.892)) contains the majority of the data points as that is where the background value is located on the scatterplot. Obviously, diffusive schemes will damp the extrema, which, in terms of the scatterplot, cause scatter points to shift toward the background scatter point value (0.1, 0.892) and away from the lower right corner of the convex hull. This is particularly apparent in almost all lowresolution ( λ ≈ 1.5 • ) scatterplots for the shape-preserving version of the schemes in Figs. 11-14 (second column). Considering finite-volume schemes at λ ≈ 1.5 • , it is observed that the scatter points make up a bow shape (except CSLAM-CN5.5). In addition to all being finite-volumebased schemes, shape preservation is enforced either through FCT or by constraining the reconstruction function. When the resolution is increased to 0.75 • (fourth column in Figs. 11  and 14), most of these schemes no longer have a bow-shaped scatter, but the lower boundary is curved so that the scatter points "track"/follow the pre-existing functional curve much more closely with the majority of the scatter points inside the convex hull. Some schemes (FARSIGHT-CN1.0 at 0.75 • , ICON-CN0.6 at 0.75 • , MPAS-CN0.8 at 0.75 • , SBC-1.0 at 1.5 • ) tend to lift the tail of the scatter data indicating that some steepening of the gradients is taking place.
If a scheme is not shape-preserving, scatter points may shift outside the convex hull either into the range-preserving unmixing or overshooting area. Probably the most detrimental type of unmixing is overshooting unmixing or equivalently range-expanding unmixing, which in this experimental setup is manifested by scatter points shifting beyond the upper left corner of the convex hull into the overshooting area. If a scheme is shape-preserving, no scatter points will be shifted into the overshooting unmixing area. In other words, the scheme is guaranteed not to expand the range of the initial condition mixing ratios. Note that non-zero background values have been chosen for χ so that a positivity-preserving limiter (positive definite) will not prevent undershooting. That said, a scheme may still exhibit non-shape-preserving behavior inside the range of the initial conditions that will not be accounted for in o but rather in u . As expected, all unlimited versions of the schemes show overshooting mixing of varying amounts. For all the finite-volume schemes, the scatter points in the overshooting mixing category seem to gather around the extension of the straight line making up the lower boundary of the convex hull, almost as an extension of the convex hull shape towards the upper left corner of the scatterplot. The FARSIGHT and CLAW schemes result in a much different shape that differs from an "extension" of the convex hull shape.
As alluded to above, it is immediately visible in the scatterplots if the "shape-preserving" versions of the schemes are strictly shape-preserving. For example, CAM-FV has slight overshooting mixing even though the dimensionally split application of one-dimensional operators is strictly shapepreserving. The overshooting/undershooting occurs since shape-preservation is not guaranteed in the direction traverse to the coordinate directions.
As can be proven mathematically, only schemes that are monotone according to the definition by Harten et al. (1987) will guarantee that no range-preserving unmixing occurs (Thuburn and Mclntyre, 1997). Unfortunately only firstorder schemes are monotone according to this definition (Godunov, 1959). In all schemes where the diffusive error is not dominating, we indeed see that the shape-preserving schemes produce range-preserving unmixing.

Quantification of mixing
The quantification of mixing, i , i = "r", "u", "o", is depicted in Fig. 15 using a histogram. The purpose of this figure is to show how i varies among schemes at the resolutions 1.5 • and 0.75 • as well as to observe how shapepreserving filters affect i for each individual scheme. The histogram is ordered according to "minimal" resolution λ m (see Sect. 3.2) from high value of λ m (left) to low value of λ m (right). The numerical value of i is normalized with i for CSLAM with CN5.5 ( (un) i (CSLAM)) at resolution λ ≈ 1.5. The reason for a graphical representation of "normalized" data, i / (un) i (CSLAM), rather than i is to give the reader a reference for the amount of mixing. The mixing diagnostic is relatively new, and numerical values of i may be less meaningful to the reader than normalized data. The actual values of i for a particular scheme can be found in the scatterplots (Figs. 11,12,13,and 14). Schemes with no data are listed with i = −1. Note that the spread among the schemes for i / (un) i (CSLAM) spans a large range (for example, at 1.5 • the total mixing is more than 20 times the CSLAM reference mixing).
To show the large amount of data concisely, the histograms are stacked so that the total height of each rectangle is total normalized mixing, r (un) and the colors show the breakdown into the different categories of mixing. For example, the histogram for CSLAM-CN5.5 (unlimited) is of exactly height three, and each colored section is height one.
The specific choice of CSLAM for the normalization is motivated by the "minimal" resolution. That is, at λ ≈ 1.5 • the filaments are marginally resolved for CSLAM-CN5.5. The CSLAM scheme performs, in general, a little above average compared to the other schemes in this collection, and it is therefore more suitable for reference purposes than, for example, the "best" or "worst" performing schemes. In addition it is based on a traditional finite-volume approach and hence is a suitable benchmark for schemes based on emerging numerical methods and untraditional designs that wish to compare with "traditional" transport formulations. Nevertheless it is noted that similar "traditional" schemes could also have been used for this purpose.

Mixing diagnostics at fixed resolution and "minimal" resolution
An apparent first question about the histograms in Fig. 15 (upper and middle) is whether the amount of "real" mixing for the unlimited schemes decreases with increased "minimal" resolution ( λ m ), which is used for the ordering of the mixing data. In general that is the case; there is a general trend for a monotonic decrease in r going from left Geosci. Model Dev., 7, 105-145, 2014 www.geosci-model-dev.net/7/105/2014/ to right in the two histograms shown in Fig. 15. While the relation between "minimal" resolution and r is, in general, as expected (the higher λ m the smaller r ), it is perhaps more interesting to focus on the schemes that do not follow this trend and potentially provide insights that λ m does not. Perhaps the biggest outlier in this ensemble is UCISOM-CN5.5, which has at least one order of magnitude less real mixing and unmixing compared to schemes with similar "minimal" resolution. Another "outlier" is the unlimited HOMME-p6-CN0.13, which has higher levels of r and u than schemes with similar "minimal" resolutions, which is due to spurious grid-scale oscillations. HEL, like UCISOM, is an outlier, and it clearly shows that HEL was specifically designed to minimize numerical mixing as the mixing diagnostics are much smaller than for schemes with similar "minimal" resolutions.
In the last row of Fig. 15, the normalized mixing diagnostics at the "minimal" resolution for the respective schemes are shown. Had λ m been a proxy for mixing, all histograms would have had the same height. Here the outliers described above are very apparent. This shows that the amount of numerical mixing varies significantly even though the 2 error norms are the same. This behavior was well described by Thuburn and Mclntyre (1997): "Shaping two tracer fields the same way does not imply shaping them the right way". In other words, the mixing diagnostics emphasize a different aspect of accuracy than normalized error norms (in this case specified with λ m ).

Effect of shape-preserving filter on mixing
For all schemes o is zero or close to zero when the shapepreserving filter is applied (as expected). With the exception of SBC-CN5.2 (at λ = λ m ), all schemes see a reduction in u when using a shape-preserving limiter. Shapepreserving limiters usually degrade conventional error norms compared to the unlimited scheme. On the contrary, the "unmixing" diagnostic, which accounts for spurious unmixing, improves.
The effect of shape-preserving filters on "real" mixing varies among the schemes. Some schemes see a reduction in r , and some see an increase in "real" mixing compared to the unlimited versions of the schemes.

Divergent flow experiment
Here we repeat the experiment described in Sect. 3.2 but replace the non-divergent wind field (used in all prior tests) with the divergent wind field defined in LSPT2012. All other settings are the same: time step, cosine bell initial conditions, etc. The purpose of this test case is to have modelers demonstrate that their scheme is well-behaved also for divergent flow fields. For some classes of schemes, such as finite-volume schemes, the coupling between air mass and tracer mass must be considered in divergent flow settings (see, for example, Sect. 2.1 in Nair and Lauritzen, 2010). Hence this test case forces the modeler to consider such coupling that may otherwise not be considered when the flow is non-divergent. That said, even for the non-divergent flow field, the non-preservation of a constant mixing ratio could be a result of inconsistent coupling between air and tracer mass (at least for finite-volume type schemes). In addition to assessing the consistency of the coupling, that accuracy of the coupling between air and tracer mass is assessed.
Normalized error norms ( 2 , ∞ , φ min , φ max ) at λ ≈ 1.5 resolutions are shown in the histogram in Fig. 16. The minimum (φ min ) and maximum norms (φ max ) are defined in LSPT2012. Although LSPT2012 also requested these error norms at λ ≈ 0.75 and λ ≈ λ m , we did not find intriguing insights by analyzing these data, and for brevity the histograms for this data are omitted (the data are available in the supplemental material). Except for CAM-FV and FAR-SIGHT, the divergent data are ordered similarly to λ m in terms of magnitude (Fig. 3, top). Note that schemes based on FCT limiting in general improve accuracy when shape preservation is enforced, whereas schemes based on reconstruction limiting degrade the error norms.

Algorithmic considerations
General properties of the algorithms are given in Table 3. First of all, the width of the computation halo used to update cell/grid-point value is listed. For example, if only the immediate neighboring cell-average or grid-point values are used, the width of the halo is one. This width should give an indication of message sizes in parallel computing environments. The number of communications needed per time step is indicated through the number of stages used in the scheme. The minimum number of communications needed to complete a simulation can, in general, be deduced from the stable time step limitations of the scheme. Here that is specified in terms of maximum Courant number. For schemes that are not Courant-number-limited but rather limited by the shear of the flow, we list "Lipschitz", which refers to the criterion for stability for many trajectory algorithms in (semi-)Lagrangian schemes. To indicate possible multi-tracer efficiency, it is also listed what parts of the algorithm can be reused for each additional tracer. Of course for a given number of tracers, the efficiency is dependent on all parameters in this table and not just on the amount of information that can be reused.

Summary and conclusions
Results from a wide range of schemes that have exercised a recently proposed test case suite  are presented and analyzed. It is the purpose of this paper to provide a catalog of results for an ensemble of state-of-theart transport schemes for global atmosphere/ocean modeling as well as to investigate what aspects of accuracy different   Histogram of normalized error norms ( 2 , ∞ , φ min , φ max in first, second, third, and fourth row, respectively) for the divergent flow field test case for the unlimited ("un") and shape-preserving ("sp") versions of the schemes, respectively, at λ ≈ 1.5. The ordering is according to minimal resolution λ m (see Fig. 3 first row). The value "−1" indicates that no data are available. The appended CNs are for the non-divergent flow field (for consistency with the other histograms); this test was run with the same time step as for the non-divergent flow tests. However, the maximum velocities are smaller than for the non-divergent flow, and hence the actual CNs are smaller.

Numerical order of convergence (Gaussian hills initial condition)
For infinitely smooth initial conditions, convergence data are examined in the resolution range [3 • , 0.3 • ]. This range was deliberately chosen so that the fields may only be marginally resolved at the low resolution end of this resolution range. It was observed how different schemes converge throughout the resolution range at their formal convergence rate and how other schemes reach asymptotic convergence rates at higher resolutions. The effect of shape-preserving filters on convergence rates was also examined. The convergence rates and effect of shape-preserving filters varied significantly among the schemes that participated in this intercomparison. The greatest reductions in convergence rates were seen for formally high-order schemes for which rates dropped by several orders to about second-order.

"Minimal" resolution (cosine bell initial condition)
To assess absolute errors and to challenge the schemes with a slightly less smooth initial condition (C 1 ), modelers were asked which resolution was needed to provide solutions at a certain level of accuracy (defined in terms of a root mean square error norm). This resolution was referred to as "minimal resolution" ( λ m ). The range of λ varied from approximately 0.1 • to more than 2 • . The schemes have been ordered according to increasing λ m when other accuracy diagnostics were depicted as histograms. Doing that with convergence rates showed no clear relationship between λ m and numerical convergence rates. In fact some of the lowest order schemes performed best with respect to λ m .

Ability of the transport scheme to preserve filaments
The filament diagnostic f (τ ) was introduced to quantify how well thin filaments are preserved. This diagnostic requires the flow to be non-divergent since it relies on the fact that, for a non-divergent flow field, the area of the sphere for which the mixing ratio distribution is above a threshold value τ is invariant. Measure f quantifies how much of the initial condition area, for which the mixing ratio φ is larger than τ , is preserved. By plotting f as a function of τ , one can examine how gradients are diffused or steepened and how uniform that damping of gradients is. This test was found particularly useful to identify how some filters, and limiters tend to perturb gradients non-monotonically (e.g., "ad hoc" and "a posteriori" filters/limiters).

Ability of the transport scheme to transport "rough" distributions
Discontinuous initial conditions were used to expose shapepreserving limiters as most unlimited schemes produce significant unphysical oscillations (under-and overshoots). Contour plots were shown for all schemes to compare schemes easily and visually. Note that the same contour interval and coloring is used for all schemes! This test exposes any non-shape-preservation in filters intended to enforce shape preservation and how the infinite gradients become finite. It is also directly visible how diffusive the scheme is.

Ability of the transport scheme to preserve pre-existing functional relations between tracers
This test is used to assess how schemes perturb a pre-existing non-linear functional relation between tracers and quantifies the mixing that the scheme introduces. The mixing is classified into different categories to quantify the amount of physical realizable mixing and spurious unmixing. The shapes of the scatterplots were examined, and large differences between the schemes have been discovered. Also shape-preserving limiters affect the scatter shape in different ways. It was observed that minimal resolution λ m is not necessarily a good proxy for how well a scheme maintains pre-existing functional relations between tracers. From the results it is quite clear that the mixing diagnostics measure a different aspect of accuracy compared to conventional error norms. In particular, they may be used assess if a shapepreserving filter makes the solution more physically realizable (overshooting unmixing should be exactly zero; rangepreserving unmixing should decrease) and how much real mixing the filter introduces.

Ability of transport scheme to deal with divergent flows
To force the modeler to consider density of air and tracer mass coupling (at least for finite-volume type schemes), a divergent flow field is considered. There are no explicit diffusion parameters in the CAM-FV transport scheme. However, there is implicit diffusion from the PPM algorithm used with the Lin-Rood scheme (Lin and Rood, 1996). CAM-FV also makes use of a filling algorithm to ensure positivity.

A2 CAM-SE
The resolution in CAM-SE is specified through the number of elements (NE) in each coordinate direction on one cubedsphere panel and the number of quadrature points (NP) in each coordinate direction of an element. The average resolution (in degrees) near the Equator is .
The idealized test cases are implemented in CAM-SE using the offline_dyn option. In that configuration the winds are constant throughout the Runge-Kutta time stepping and not updated at every stage (as is done in HOMME).

A3 CCSRG
CCSRG is implemented on a latitude-longitude reduced grid. The presented CCSRG results are obtained on the grids with 20 % reduction (20 % fewer points than on a regular latitude-longitude grid with the same resolution at the Equator). The grids are constructed with the algorithm of Fadeev (2013). The grid reduction starts from approximately 45 • N/S (see Tolstykh and Shashkin, 2012, for grid statistics and pictures). Semi-analytical trajectories  are used. For the 1.5 • and 0.75 • resolutions, a nondimensional time step of T /110 and T /220, respectively, is used for the CN ≈ 5.7 simulations. The time steps T /600 and T /1200, respectively, are used for CN ≈ 1.0 runs.

A4 CLAW
The sphere grid used for the computations is described in Calhoun et al. (2008) and is based on a novel mapping Geosci. Model Dev., 7, 105-145, 2014 www.geosci-model-dev.net/7/105/2014/ that transforms a single logically rectangular uniform Cartesian grid to the sphere. Our grid is similar to the cubed-sphere grid in that it is made up of N ×N grid patches stretched to fit the sphere. Whereas the cubed-sphere uses six square patches, our grid consists of two square patches, one for each hemisphere, as shown in Fig. A1. For all tests, we used 2N × N grids with resolutions in the range N = (30,60,120,240,480,960), corresponding to angles For the tests involving a minimum effective angle, we used N = 640 (λ eff = 0.28125) for the shape preserving case and N = 960 (λ eff = 0.1875) for the unfiltered case. To generate the sphere grid, we map the computational domain , 1] using a simple mapping T (ξ, η) described in Calhoun et al. (2008). The resulting finite volume mesh cells are nearly uniform in size. The computational mesh width for a given resolution is x = y = 2/N. Clawpack uses a variable time-stepping scheme and chooses time steps based on a maximum wave speed, cell area and a desired CN number. After each time step, a maximum CN number α max is computed as where A ij is the area of mesh cell ij , t the time step just taken, u ij and v ij speeds at the x and y faces of mesh cell ij , and x the (constant) computational mesh width. Under the assumption that the wave speeds do not change dramatically from one time step to the next, we can satisfy a desired CFL conditionᾱ in the next time step by choosing a new t as For the results presented here, we setᾱ = 0.95. Clawpack does not make use of any explicit diffusion parameters or artificial viscosity.
The Fortran code and Python scripts for running the benchmark examples, and Matlab scripts for visualization can all be downloaded from the author's web page (http://math.boisestate.edu/~calhoun/www_personal/ research/NCAR_workshop/).

A5 CSLAM
CSLAM is implemented on an equiangular cubed-sphere grid. The average resolution at the Equator is given by where N c × N c is the number of control volumes on each face/panel of the cube. Semi-analytical trajectories are used . The diagnostics do not change significantly when using non-analytic trajectories (C. Erath, personal communication, 2013). For the 1.5 • and 0.75 • resolutions, a non-dimensionless time step of T /120 and T /240 was used for the CN ≈ 5.5 simulations, respectively. For CN ≈ 1.0 runs, the time steps were T /600 and T /1200, respectively. The shape-preserving filter is the fully twodimensional limiter by Barth and Jespersen (1989) that scales the fully two-dimensional reconstruction polynomial of degree two so that its extrema are within the range of the surrounding cell-averaged values.

A6 FARSIGHT
See White and Dongarra (2011) for scheme details.

A7 HEL(-ND)
HEL and HEL-ND use the same settings as for CSLAM. The filter parameters are the same in HEL and HEL-ND: both are run without filters in the underlying first-order version of CSLAM. The number of Lagrangian parcels are equal to the number of grid cells, and the parcels "survive" for the total duration of the simulation. They are initialized at the grid cell centers with the same area and value as the corresponding Eulerian grid cell.

A8 HOMME
HOMME and CAM-SE use the same numerical model with only a difference in the choice of order p = NP−1 of polynomial basis functions, hyperviscosity coefficient ν, and hyperviscosity scaling η. The resolution is obtained via Eq. (A1). For HOMME simulations, we choose p = 3 because of its common use (see CAM-SE default parameters) and p = 6 to demonstrate performance for the higher order scheme. If one uses NE as in Eq. (A1) for the p = 3 setting, then NE /2 for p = 6 corresponds to the equal equatorial resolutions in both cases.
The fully collocated formulation of the spectral element method used in HOMME and CAM-SE has a grid-scale computational mode that must be controlled with some type of stabilization (Ainsworth and Wajid, 2009). Here for stabilization we use well-tested hyperviscosity (Dennis et al., 2012). In practice, hyperviscosity coefficient ν is tuned for one resolution λ 0 . Then for other resolutions the hyperviscosity coefficient is calculated similarly to Eq. (A2). Note that ν is not tuned for every single simulation in this study. In more detail, after p is defined, we specify scaling η and whether the shape-preserving limiter is used. For the reasons explained below (Sect. A8.1), if the limiter is off, we set η = p + 1. Limited simulations are configured with η = 3.0 for p = 3 and η = 4.0 for p = 6. Next, the best ν 0 is chosen for one simulation with resolution λ 0 . For this, we use standard errors, mixing diagnostics, and filament preservation diagnostics. Finally, for any given resolution λ, Contrary to the CAM-SE setup, the winds are updated in time at each stage of the Runge-Kutta time stepping.

A8.1 More on hyperviscosity scaling
In case of tracer advection, different amounts of artificial dissipation affect performance of the scheme in various ways. For example, with η = p + 1, the theoretical spatial convergence order is p + 1. If η < p + 1, convergence rates are expected to be of the order of η. Bigger amounts of hyperviscosity raise standard errors but improve preservation of preexisting functional relations and filament preservation diagnostics to a certain degree. It is natural to choose η = p+1 to recover the higher order method and demonstrate its properties; to explore the scheme in applications, smaller values of η should be used. In addition, the use of the shape-preserving limiter leads to smaller orders of spatial convergence (Guba et al., 2013). Therefore, for the unlimited simulations we set η = p + 1 to maintain characteristics of the higher order method. For the limited simulations, we take η = 3.0 for p = 3 and η = 4.0 for p = 6. We call the former "convergence regime" and the latter "mixing regime". Chosen parameters are summarized in Table A1.

A9 ICON-FFSL
The ICON grid is derived from a spherical icosahedron that is made up of 20 equilateral spherical triangles. This base grid is further refined in a multi-step procedure, until the desired resolution is reached. In a first step, the root division step, the edges of each base triangle are divided into n equal sections (termed Rn). Connecting the new edge points by great circle arcs yields n 2 spherical triangles within the original triangle. This step is followed by k bisection steps (termed Bk), where each triangle is consecutively subdivided into four smaller triangles. This results in a so-called RnBk grid. The intermediate grids and the final grid are further optimized using spring dynamics (Tomita et al., 2001), with the spring coefficient set to β = 0.9. For a given resolution RnBk, the total number of cells can be computed from n c = 20 n 2 4 k .
The average resolution at the Equator was computed as follows: where x is the average distance between neighboring cell centers and r e is the earth radius. In Table A2 the applied grids are listed together with their effective resolutions and applied time steps. The wind vector used to define the swept flux areas is computed by evaluating the analytical wind vector at the center of the cell side at time n + 1/2.

A10 LPM
The Lagrangian particle method relies on the flow map, x(α, t), giving the trajectory of fluid particles, where α is a Lagrangian parameter, t time, and x position (Chorin and Marsden, 2000;Cottet and Koumoutsakos, 2000). The flow map satisfies where u is the given fluid velocity, and the scalar is advected along particle trajectories, The sphere is represented as a union of disjoint panels, S = ∪ N i=1 P i . We present results in which the panels are either the quadrilaterals of a cubed-sphere mesh, or the triangles of an icosahedral triangular mesh. The mesh corresponds to a discretization of the Lagrangian parameter. The scheme tracks two sets of particles, at the centers and vertices of the panels, indexed by j = 1, . . . , M + N, where N is the number of panels and M the number of vertices. Each particle has a Lagrangian parameter value, α j , position, x j (t), and scalar value, φ j . We employ Cartesian coordinates for the Lagrangian parameter and position. The particles are advected in the flow, using fourth-order Runge-Kutta, with initial condition x j (0) = α j . The total scalar is computed by where φ i is the scalar value at the center of panel P i and A i is its area. To maintain accuracy, a remeshing scheme is applied at regular intervals. At a remeshing step, say t = t rm , new particle data are defined, ( x j , α j , φ j ), where x j is a grid point on either the cubed-sphere or icosahedral mesh, α j the corresponding Lagrangian parameter satisfying Geosci. Model Dev., 7, 105-145, 2014 www.geosci-model-dev.net/7/105/2014/ x j = x( α j , t rm ), and φ j = φ( α j , 0) the scalar value. To determine α j , the panel of the distorted mesh containing x j is located and α j is computed from the data in that panel by linear interpolation. Results reported here remesh every 20 time steps. The scheme is under development, and further details will be reported in Bosler (2013). Note that the remeshing scheme interpolates the Lagrangian parameter rather than the scalar. Hence LPM avoids introducing overshoots and undershoots in the scalar, and there is no artificial mixing (the error norms φ max and φ min are zero throughout all test cases, and the mixing errors for test case 5 are also zero).
Note also that mesh size is not well-defined since the particles are moving, so instead we report the average angular variation α in the Lagrangian parameter. Discretizations with N = 5120, 20 480, 81 920, 98 304 correspond to α = 4.33 • , 2.16 • , 1.08 • , 0.65 • . The time step t = 0.0125 was used for all computations; this value ensures that the time discretization error is smaller than the spatial discretization error. Using the test case CN definition with α, we have CN = 0.54, 1.08, 2.16, 3.59.

A11 MPAS
MPAS  uses the transport scheme described in Skamarock and Gassmann (2011) implemented on spherical centroidal Voronoi meshes . The meshes used in these tests are generated by subdividing icosahedral meshes. That is, the Voronoi meshes are composed of hexagons plus 12 pentagons. The scheme uses a third-order Runge-Kutta time integration scheme and a finite-volume flux divergence calculation using Eq. (11) in Skamarock and Gassmann (2011) with the upwinding parameter β = 0.25. It uses the FCT shape-reserving limiter described in Zalesak (1979); no additional explicit diffusion is used in these tests. The Voronoi meshes described as 1.5 • , 0.75 • , and 0.67 • refer to the average cell-center spacing relative to an arc length at the Equator, and these meshes use 21 506, 86 018, and 107 522 cells, respectively, to tile the sphere. The tests are performed using CN ∼ 0.8, which corresponds to 768, 1536, and 1800 time steps to complete the test-case integrations on the 1.5 • , 0.75 • and 0.67 • meshes; for reference this corresponds to time steps of 1350 s, 675 s and 576 s on the earth radius sphere. For the divergent flow test case, second-order centered fluxes are used for density.

A12 SBC
The SBC scheme is implemented on a regular latitudelongitude grid where the number of zonal grid point is nx = Thus, the linear grid is used (rather than the quadratic grid where ntrunc = nx/3 − 1).

A13 SFF-CSLAM
SFF-CSLAM uses an equiangular gnomonic cubed-sphere projection. The scheme is available for either a third-order or fourth-order reconstruction, in both cases using a finitevolume stencil of width 5. The 1.5 • and 0.75 • grids correspond to 60 × 60 and 120 × 120 elements per cubed-sphere panel. The equivalent resolution runs at 1.05 • (fourth-order reconstruction) and 0.92 • (third-order reconstruction) correspond to 86 × 86 and 98 × 98 elements per cubed-sphere panel. The time steps at 1.5 • and 0.75 • (at CN 0.8) are T /720 and T /1440, respectively. As with CSLAM, the Barth and Jespersen (1989) filter was used for positivity preservation. No additional diffusive terms were added.

A14.1 Spherical grid generation
The schemes SLFV-SL and SLFV-ML are implemented on a spherical icosahedral-hexagonal grid (Sadourny et al., 1968). We start with a spherical icosahedron, consisting of 20 equilateral spherical triangles. To achieve the desired resolution the edges of these 20 spherical triangles are divided into N equal parts. Connecting these new points with great circle arcs results in 20N 2 spherical triangles. To construct the dual grid of the spherical triangular grid, we connect the centroids of the triangles with great circle arcs. The resulting dual grid consists of spherical hexagons except 12 pentagons corresponding to the 12 starting points of the spherical icosahedron. The total number of grid cells for resolution N is N R = 10 N 2 + 2. For the resulting dual grid, the centroids of grid cells do not coincide with the vertices of the spherical triangular grid. Indeed the cell-averaged value of a function is a second-order accurate approximation of its point-wise value taken at the cell centroid. This motivates one to employ some grid adjustment or grid optimization to design higher order finite volume schemes. Instead of using any sophisticated optimization (for instance spring dynamics or Lloyd's algorithm), we use centroids of the grid cells as our computational points and adjust the triangular mesh accordingly. In Table A3. Icosahedral resolution N , average grid spacing at the Equator λ ave and time step t used for schemes SLFV-SL and SLFV-ML. fact, this grid correction is equivalent to a single step Lloyd's optimization.
For a unit sphere, the length of a basic spherical triangle is ω = 1.1071. The arc length at a resolution N is calculated as ω N . The average grid spacing at the Equator λ is calculated as We presented results of all the test cases for fixed maximum Courant number (CN = 0.8). Table A3 lists the icosahedral resolution N , average grid spacing at the Equator λ and time step of the simulation t.
The wind vector used to approximate the flux area is computed by evaluating the analytical wind field at the midpoint of the cell side at time t = n × t. For shape preservation, SLFV-SL and SLFV-ML employ a multi-dimensional extension of Van Leer-type slope limiter discussed in Dukowicz and Kodis (1987).

A14.2 SLFV scheme description
Since there is currently no publication documenting the SLFV schemes, a brief description is given here. The schemes are based on the flux-form continuity Eq. (1) integrated over a control volume : Here ρφ is the average of ρφ over a control volume , the boundary of the control volume and A( ) the area of the control volume.

SLFV-SL
In Eq. (A10), decomposing the boundary into N k edges and integrating Eq. (A10) with respect to time, one gets Here ρφ i is the value of ρφ averaged in time from t to t + t and over the ith edge composing . V n+ 1 2 k,i is the velocity field at time t n + t 2 evaluated at the midpoint r k,i of the ith edge of cell k. Approximation Eq. (A12) is second-order accurate in time and in space. Finally, φ is approximated in a semi-Lagrangian fashion: This approximation is second-order accurate in space and time (Miura, 2007). A similar formula is used for ρ . In practice we use V n k,i instead of V n+ 1 2 k,i , which introduces some temporal error for a time-varying velocity field.

SLFV-ML
In Eq. (A10), decomposing the boundary into N k edges, we get a semi-discrete equation: Here ρ i , φ i and V i are the values of ρ, φ and velocity vector V over ith edge of at time t. We evaluate these edge quantities at the midpoint of the corresponding edge to get a second-order spatial approximation at time t.
The semi-discrete Eq. (A14) is then marched forward in time using the Runge-Kutta third-order total variational diminishing time integration scheme. This choice of time integration helps to damp the unphysical oscillation due to time discretization.

A14.3 Linear reconstruction
To evaluate the right-hand side of Eqs. (A13) and (A14), we define a linear reconstruction of ρ and φ in each control volume: ρ k (r) = ρ + ∇ k ρ · (r − r k ) , respectively, where r k is the centroid of the kth control volume. Indeed the area average of a quantity coincides with the value of that quantity at the centroid of the control volume, with second-order accuracy in space. As a consequence φ ρφ ρ with the same accuracy.
To compute the discrete gradient ∇ k φ of any scalar field φ for a cell k, we work in the plane tangent to the cell centroid r k . Vectors in the tangent plane are decomposed on a local basis (e x , e y ) pointing west and north. We project the centroids of the neighboring cells r k, i to the tangent plane from the point diagonally opposite to r k . The projected centroids define five or six triangles r k , P (r k, i ), P (r k, i+1 ) , for each of which we compute a gradient ∇ k,i φ defined by its components ∇ x k,i φ and ∇ y k,i φ in local x and y directions, which we obtain by solving where d i = P (r k, i ) − r k is the position vector of the projected neighboring centroid r k, i relative to r k , and φ i (resp. φ 0 ) is the value of the scalar field φ at r k, i (resp. r k ). The gradients ∇ k,i φ are then averaged to get ∇ k φ. We have verified that this yields a first-order approximation of the gradient on non-optimized grids.

A14.4 Slope limiting
In general this gradient construction will not lead to a positivity-preserving scheme. For this we use a multidimensional extension of Van Leer-type slope limiter (Dukowicz and Kodis, 1987). In Eqs. (A15)-(A16) we replace the gradient ∇φ by a modified gradient∇ k φ = α k ∇ k φ. The limiting coefficient α k is determined for each cell k such as to enforce local monotonicity. Dukowicz and Kodis (1987) show that a possible choice of α k is where For each edge entering the sum on the right-hand side of Eq. (A12) (resp. Eq. A14), the reconstruction used to evaluate φ t, r k,i − V n+ 1 2 k,i t 2 (resp. φ i ) is the one based on the control volume situated upwind to the edge. We present results obtained with a CN equal to 0.8, but the scheme seems to work up to maximum CFL equal to 1.0.

A15 TTS-I
The TTS-I scheme operates on a fully Lagrangian mesh. The initial grid is a centroidal Voronoi tessellation of the sphere, and its resolution is given in terms of number of polygons.
The Voronoi grid is then deformed by the flow and modified by a curvature-guard algorithm (CGA) that splits and merges edges according to deformation criteria. The specific configuration of the CGA is given in Table 1 in Dong and Wang (2013). For display and computation of diagnostics (and coupling with physical parameterizations in full model setup), a regular latitude-longitude grid is used. For the experiments two resolution configurations are chosen for the two meshes. For λ ≈ 1.5 • and λ ≈ 0.75 • , the number of polygons on the initial Voronoi grid is 10 000 and 20 000, respectively. The associated regular latitude-longitude grid spacings are 1.5 • and 0.75 • . A non-dimensionless time step of T /300 and T /600 was used for the coarser and higher resolutions, respectively. Trajectories are computed using fourth-order Runge-Kutta integration.

A16 UCISOM(-CS)
UCISOM uses a regular latitude-longitude grid, and UCISOM-CS uses a gnomonic cubed sphere with resolution defined as in Eq. (A5). The CN ≈ 5.5 simulations use non-dimensional time steps t = 5/T where T = 120 and T = 240 for 1.5 • and 0.75 • resolutions, respectively; for CN ≈ 1.0, the time steps are T = 624 and T = 1248; and for CN ≈ 0.8, the time steps are T = 780 and T = 1560. The mass flux across grid edges is integrated exactly in latitude or longitude from the equations for the regular latitudelongitude grid, and with nine-point Romberg integration for the cubed-sphere grid (preserves mass convergence in each grid cell to single-precision accuracy or better). The flux over each time step is integrated analytically from the equations. UCISOM uses a single forward time step for any CN value, and is thus only first-order accurate in time (i.e., forward Euler). The rate of convergence with increasing resolution (Figs. 1-2) is actually the convergence with time step, as the errors at differing spatial resolution runs at the same time step (i.e., differing CN) are similar.
UCISOM can be run with a range of shape-preserving options (limiters L), and these range from no action (L = 0, allows for transport of negative tracer in some circumstances), positive definite (L = 1, tracer moments adjusted before transport to ensure no negative tracer anywhere within the cell along the transported direction), monotonic within a cell (L = 2, before transport, the moments in the transported dimension are adjusted to have a positive-definite, monotonic distribution), and the most rigorous shape preserving (L = 3, the moments in the daughter cell are adjusted after transport to ensure that the tracer distribution throughout the cell is limited by the minimum and maximum tracer values using the moments in the parent cell(s)). All calculations shown here used L = 2.
While limiter L = 3 can eliminate ripples on both sides of sharp gradients, it leads to a linear decline of the peak tracer abundance, even when the tracer has a large constant plateau that is resolved by many cells, as in the slotted cylinder. A simple test of UCISOM is done with a constant plateau of tracer value (110 tracer units in 12 × 12 cells) embedded in a background (10 units on a cylinder of circumference 32 cells) and moving diagonally round the cylinder. After several rotations, the tracer distribution stabilizes (along with the ripples and 2 error) at a preferred shape and then evolves very slowly. Overshoot ripples in the tracer plateau are +12 % for L = 0 or L = 1, 5 % for L = 2, and < 0.2% for L = 3. (Treatment of cross term moments, xy, produces some ripples.) Undershoot ripples in the background near the plateau are −8 %, −2.5 % and < 0.2 %, respectively. Only with L = 3 the entire 12 × 12 block decays uniformly, −1 % per revolution for CN ≈ 1. The cases in this paper are equivalent to many revolutions in this test case, and results for UCISOM look like some of the worst cases in Fig. 5, with peak tracer < 0.8. After results were completed for this study, a variant of L = 3 was tested, whereby the minimum-maximum criterion for the daughter cell is relaxed: the tracer is allowed to overshoot the parent min-max by a percentage. For large allowances (i.e., 3 %) the L = 3 case begins to look like L = 2 with +4 % and −1 % ripples, no decay of the plateau values, and no increase in 2 error over successive rotations. For small overshoot allowance (0.2 %), however, we regain some of the desired properties (i.e., the ripples are smaller, +2.5 % and −1 %), but the plateau tracer does not decay. In general the 2 errors are similar for L = 0, 1, 2, but increase for L = 3 except for CN < 0.2.