On numerical broadening of particle-size spectra: a condensational growth study using PyMPDATA 1.0

. This work discusses the numerical aspects of representing the condensational growth of particles in models of aerosol systems such as atmospheric clouds. It focuses on the Eulerian modelling approach, in which ﬁxed-bin discretisation is used for the probability density function describing the particle-size spectrum. Numerical diffusion is inherent to the employment of the ﬁxed-bin discretisation for solving the arising transport problem (advection equation describing size spectrum evolution). The focus of this work is on a technique for reducing the numerical diffusion in solutions based on the upwind scheme: the multidimensional positive deﬁnite


Background
There exist two contrasting approaches for modeling the evolution of cloud droplet size spectrum (see Grabowski, 2020, for a review): Eulerian (fixed bin) and the Lagrangian (particle based or moving bin). The Lagrangian approach has the advantages of: (i) simplicity of formulation (no need to define particle-level properties and processes as gridded continuous fields), (ii) lack of discretization-related artifacts such as numerical diffusion associated with solving PDEs, (iii) facilitation of tracking 60 multiple particle attributes such as the amount of solute required for modeling activation. On the other hand, there are inherent challenges in using the particles-based framework: (i) ensuring proper sampling of physical and parameter space, (ii) handling load-balancing in distributed memory environments, (iii) solvability of resultant stiff ODE systems. Overall, while the Lagrangian methods are the focus of active research and development (Grabowski et al., 2019), the Eulerian schemes have been predominantly used in large scale modeling (Khain et al., 2015), due to their consistency with the fluid advection dynamics 65 description and due to robust algorithms for representing particle collisions.
Following Liu et al. (1997) and Morrison et al. (2018), the earliest documented study employing Eulerian numerics for condensational growth of a continuous size distribution representing a population of particles is that of Kovetz and Olund (1969) (whereas several earlier works starting with Howell (1949) utilized the Lagrangian approaches). The numerical scheme proposed in Kovetz and Olund (1969, eq. (10)) resembles an upwind algorithm being explicit in time and orienting the finite-70 difference stencil differently for condensation and evaporation.
Likely one of the first discussions of numerical broadening of the spectrum can be found in Brown (1980) where the numerical scheme from Kovetz and Olund (1969) was improved in several ways, including sampling of the drop growth rate at the bin boundaries (as is done here). Their study also covers quantification of the error of the method by comparisons to analytic solutions. 75 In Tsang and Brock (1982), the authors point out that upwind differencing is not suitable for aerosol growth calculations for its unacceptable numerical diffusion. Noteworthy, the study includes considerations of the Kelvin effect of surface tension on the drop growth (not considered herein).
The first mention of application of MPDATA for the problem of condensational growth can be found already in Smolarkiewicz (1984) where the problem is given as an example where the divergent-flow option of the algorithm may be applicable 80 (see sect. 2.6 below).
In Tsang and Korgaonkar (1987), which is focused on the evaporation of an "aerosol cloud", MPDATA is used as a predictor step followed by a corrective step using a Galerkin finite element solver. In two subsequent studies from the same group Rao, 1988, 1990), MPDATA is compared with other algorithms in terms of mass conservation, number conservation and computational cost. In Tsang and Rao (1988), basic 3-iteration MPDATA is used, although interestingly it is noted there that 85 "If the antidiffusion velocities are increased by some factor between 1.04 and 1.08, use of [corrective iteration] only once can reduce 50% of the computing time [...] without much sacrifice of accuracy". In conclusions, the authors praise MPDATA for providing narrow size distributions. At the same time, it is pointed out that MPDATA performs worse than upwind in terms of mass conservation or mean radius prediction accuracy.
Chapter 5 focusing on the methods of solving the dynamic equation describing aerosol spectrum evolution. The basic variant of MPDATA (Smolarkiewicz, 1983) is presented with an outline of its derivation.
In Kostoglou and Karabelas (1995) and Dhaniyala and Wexler (1996) the authors mention that MPDATA has the potential to reduce errors in size computations. The latter work lists high computational cost among drawbacks in using the algorithm that led to discarding the scheme from the presented comparison.

95
In Morrison et al. (2018), a comparison of different numerical schemes for condensational growth problem is performed.
Both fixed-, and moving-bin approaches are compared, including the non-oscillatory variant of MPDATA (referred to as MPDG therein). MPDATA is reported to produce significant numerical diffusion and spectral broadening relative to all other methods.
Intriguingly, as can be seen in Fig. 7 therein, the broad spectrum in the results obtained with MPDATA appears already at the very beginning of the simulations presented (at the altitude of 20m out of 520m of simulated displacement of an air parcel).

100
Overall, the discussion in Morrison et al. (2018), focuses on the issue of spectral broadening from vertical numerical diffusion highlighting that, in principle, the problem is a four-dimensional transport problem (three spatial dimensions and the spectral dimension).
In Wei et al. (2020), MPDATA is employed for integrating droplet spectrum evolution for comparison with a Lagrangian scheme. The work concludes that the spurious broadening of the spectrum cannot be alleviated even with a grid composed of 105 2000 size bins.
Noteworthy, none of the works mentioned above discussed coordinate transformations to non-linear grid layouts with MP-DATA (a discussion of handling non-uniform mesh with upwind scheme can be found in Li et al., 2017, Appendix A). Wei et al. (2020) and Morrison et al. (2018) are the only works mentioning other than basic flavor of the scheme, yet only the non-oscillatory option was considered. Herein, the applicability of multiple variants of MPDATA and their combinations is 110 expounded highlighting their robustness for solving the condensational growth problem.

Governing equations
To describe the conservation of particle number N under the evolution of the particle size spectrum n p (p) = dN dp (n denoting number density as a function of particle size parameter p such as radius or volume), one may take the one-dimensional continuity equation (i.e., Liouville equation expressing the conservation of probability, for discussion see Hulburt and Katz, 1964), 115 in a generalized coordinate system: where G ≡ G(x) represents the coordinate transformation from p to x, x being an equidistant mesh coordinate used in the numerical solution; n p ≡ n p (p(x)) being number density function and u ≡ u(x) denoting the pace of particle growth in the chosen coordinate x. The coordinate transformation term G may play a twofold role in this context. First, there is a degree of 120 freedom in the choice of the particle-size parameter used as the coordinate (i.e., the argument p of the density function n(p)).
For the chosen coordinates p ∈ [r, s ∼ r 2 , v ∼ r 3 ], the appropriate distributions will be n r (r), n s (s) and n v (v) where s = 4πr 2 and v = 4/3 πr 3 denote particle surface and volume, respectively. The size spectrum n p (p) in a given coordinate is related with n r (r) via the following relation of measures: n p (p)dp = n r (r)dr so the total number N = n r dr is conserved.
Second, there is also a degree of freedom in the choice of the grid layout p(r(x)), that is how the parameters r, s or v are 125 discretized to form the equidistant grid in x. This can be used, for instance, to define a mass-doubling grid layout (x = ln 2 (r 3 )) as used in Morrison et al. (2018) and herein.
Combining the two transformations results in the following definition of G: G ≡ dp(r)/dx(r) = dp dx (1.2) which defines the transformation from the coordinate p of the density function to the numerical mesh coordinate x. For further 130 discussion of the coordinate transformation approaches in the embraced framework (including multi-dimensional setting), see Smolarkiewicz and Clark (1986) and Smolarkiewicz and Margolin (1993).

Upwind discretization
The numerical solution of equation (1.1) will be obtained by discretizing space and time as follows: x = i · ∆x and t = n · ∆t. Henceforth, ψ n i and G i denote the discretized number density n p and the discretized coordinate transformation term, 135 respectively. The dimensionless advective field is denoted by GC = dp dx u∆t/∆x, where C stands for the Courant number, i.e. the velocity in terms of temporal and spatial grid increments. A staggered grid is employed what warrants introduction of fractional indexing for vector fields, i.e.: GC i+1/2 ≡ (GC)| i+1/2 in the case of the discretisation of the product GC. To solve the equation numerically, a finite difference form of the differential operators is introduced embracing the so-called upwind approach (dating back at least to Courant et al., 1952, eq. 16 therein): where the introduced flux function F defines the flux of ψ across grid-cell boundary as a function of the values of ψ L and ψ R to the left and right of the boundary, respectively and the value of GC at the boundary. Hereinafter a shorthand notation

Test case and upwind solution
The test case is based on Figure 3 from East (1957) -one of the early papers on the topic of cloud droplet spectral broadening.

150
The case considers the growth of a population of cloud droplets through condensation in the equilibrium supersaturation limit, where: with ξ = ξ 0 (S − 1) being an approximately constant factor proportional to supersaturation (S − 1). The parameter ξ 0 is set to 100 µm 2 s −1 to match the results from East (1957).

155
For the initial number density distribution function, an idealized fair-weather cumulus droplet size spectrum is modeled with a lognormal distribution: with parameters: r 0 = 7 µm, n 0 = 465 cm −3 and κ = 22.
For the boundary conditions (implemented using halo grid cells), extrapolation is applied for G, while both ψ and GC are 160 set to zero within the halo.
Analytical solution to eq. (1.1) is readily obtainable forṙ = ξ/r and for any initial size distribution. Noting that introducing x = r 2 coordinates, the transport equation (1.1) becomes a constant-coefficient advection equation, the problem reduces to translation of the signal in x by 2ξt. Cast in the r coordinate, the solution can be expressed as (Kovetz, 1969): wherer =r(r, t) = r 2 − 2ξt.
The upper panels in Figures 1 and 2 depict the droplet size spectrum evolution through condensational growth from an initial liquid water mixing ratio of M 0 = 1 g kg −1 under supersaturation S − 1 = 0.075%.
Two grid layout (x) and size parameter (p) choices are depicted. Both panels in Fig. 1 present simulation carried out with density function coordinate p = r 2 and discretized on a mass-doubling grid (x = ln 2 (r 3 )), whereas both panels in Fig. 2 present 170 simulation results obtained with x = r and p = r. In both cases, the timestep is set to ∆t = 1 3 s, the domain range is (1; 26) µm, there are 75 grid cells. Such settings corresponds to GC ≈ 0.26 in first layout, where p = r 2 is used, and variable Courant number approximately in the range of (0.03; 0.07) in second layout, where p = r is used.
The snapshots are depicted at times where the liquid water mixing ratio of the analytical solution obtains values of 1, 4 and 10 g kg −1 (assuming air density of 1 kg m −3 ). In both Figure 1 and 2, the upper panels display the number density and the 175 bottom panel show the normalized mass density. The bottom panels thus depict the same quantities as Fig. 3 in East (1957).
The normalized mass density of bin i is evaluated as 4/3πρ l m (l=3) i /M by calculating the third statistical moment of the number distribution n r (p) with the formula:  East (1957). Numerical solution was obtained in the following coordinate transformation: p = r 2 ; where r 1 , r 2 are the boundaries of i-th bin, and ψ i is the value of n p associated with the bin (i.e., n p is assumed to be bin-wise 180 constant; note that the dimension of n p depends on the choice of p). The normalization factor M is the mixing ratio (e.g.,  As can be seen in both the number-and mass-density plots in Figs. 1 and 2, solutions obtained with the upwind scheme are characterized by significant drop in the peak value and spectral broadening, with respect to the analytical solution -both manifesting the numerical diffusion.

190
The broadening and the drop in the peak value is less pronounced in Fig. 2 where the linear grid increases the resolution in the large-particle region of the spectrum. ψ n i+1 and ψ n i−1 and substituted into the numerical upwind scheme, in which the flux function (1.4) is expressed using moduli (e.g., Crowley, 1968, eq. (12)): resulting in: which is further transformed by employing a time derivative of both sides of the original advection equation x ψ to substitute the second-order time derivative with spatial derivative (Cauchy-Kowalevski procedure, see Toro, 1999) leading to the sought modified equation (Roberts and Weiss, 1966, eq. 2.9): The above analysis depicts that the employment of the numerical scheme (1.3) results in a solution of a modified equation

Antidiffusive velocity and iterative corrections
The problem of numerical diffusion can be addressed by introducing the so called "antidiffusive velocity" (Smolarkiewicz, 215 1983). To this end, the Fickian flux can be cast in the form of the advective flux -an approach dubbed pseudo-velocity technique in the context of advection-diffusion simulations (Lange, 1973(Lange, , 1978 or hyperbolic formulation of diffusion (Cristiani, 2015, discussion of eq. (4) therein), and discussed in detail in Smolarkiewicz and Clark (1986, sect. 3.2): In Smolarkiewicz (1983Smolarkiewicz ( , 1984, it was proposed to apply the identity (2.4) to equation ( Accordingly, the basic antidiffusive field GC (k) is defined as follows (with > 0 being an arbitrary small constant used to 225 prevent from divisions by zero): (2.5) where k is the iteration number, GC (1) ≡ GC and

245
Such gauge transformation changes the corrective iterations of the basic algorithm as follows (replacing eqs. (2.6) and (1.4) what is symbolized with ): Noting that the amplitude of the diffusive flux (2.4) is inversely proportional to the amplitude of the signal, such gauge choice 250 decreases the amplitude of the truncation error (see Smolarkiewicz and Clark (1986, p. 408  However, in each case negative values are observed (non-physical in case of the considered problem).

255
Consequently, for the problem at hand, it is effectively essential to combine it with the monotonicity-preserving nonoscillatory option outlined in the next section.

Non-oscillatory option
In Smolarkiewicz and Grabowski (1990), extension of the MPDATA algorithm was introduced that makes the solution monotonicity preserving and precludes appearance of negative values in the discussed solution of droplet size spectrum evolution.

260
The trade-off is that the order of the algorithm is reduced (see Appendix A).
The non-oscillatory option (later referred to as "non-osc" herein) modifies the algorithm in such way:

275
An alternative approach to the iterative procedure was introduced in Beason and Margolin (1988); Margolin and Smolarkiewicz (1998) and further discussed in Margolin and Shashkov (2006), where the contributions of multiple corrective iterations of MPDATA were analytically summed leading to a new two-pass scheme dubbed DPDC (double-pass donor cell), featuring the following form of the antidiffusive GC field: (2.14) 280 with A i+ 1 2 defined in eq. (2.6). Note that only one corrective iteration is performed with the DPDC variant. As in the case of the infinite gauge variant of MPDATA (section 2.3), the above formulation does not guarantee monotonicity of the solution. Herein an example simulation combining the DPDC, the non-oscillatory and infinite-gauge variants is presented in Figure 6 depicting how the solution is improved over that in Figure 5.

GC
As pointed out in section 5.1 in Smolarkiewicz (1984), this option has the potential of improving results for the problem of the evolution of the droplet size distribution (personal communication with William Hall cited therein). This is due to the drop growth velocity defined by eq. (1.5) being dependent on radius (hence divergent given the one-dimensional problem). Yet, applying adequate coordinate transformation (i.e., p = r 2 ), the drop growth velocity in the transformed coordinates becomes 295 constant (see section 1.5 above and Hall (see, e.g. 1980, sec. 3b)). However, in simulations using the presented setup (for p = r 2 ; not shown), only insignificant changes in the signal occurring when the divergent-flow option was used were observed.
However, the problem considered herein does not include, for instance, the surface tension influence on the drop growth rate.

Third order terms
Another possible improvement to the algorithm comes from the inclusion of the third-order terms in the modified equation 300 analysis, which leads to following form of the antidiffusive velocity (Margolin and Smolarkiewicz, 1998): Noteworthy, discussion of higher-order variants of MPDATA was carried forward in Kuo et al. (1999) andWaruszewski et al. (2018). In the latter case, the focus was placed on accounting for coordinate transformation and variable velocity in the derivation of antidiffusive velocities leading to a fully third-order accurate scheme.

A "best" combination of options
The MPDATA variants presented in the preceding sections can be combined together. In Figure 8, results obtained with upwind scheme and the basic two-pass MPDATA are compared with those obtained with a powerful combination of three iterations, third-order-terms, infinite-gauge and non-oscillatory options hereinafter referred to as the "best" variant (for the problem at hand).  Table 1. Relative dispersion of the discretized (using grid setup as in Fig. 1) analytical solution taken for five selected times.

Results and discussion
In the following subsections, the influence of MPDATA algorithm variant choice on the resultant spectrum broadness and computational cost is analysed using the example simulation setup used above (i.e., in all figures except Fig. 2, see section 1.5 for test case definition).
Analysis of the scheme solution convergence with changing resolution and Courant number is presented in Appendix A.

Quantification of numerical broadening
The relative dispersion, defined as the ratio of standard deviation σ to the mean µ of the distribution, is a parameter commonly used to describe the width of the spectrum (e.g. Chandrakar et al., 2018).
The calculated dispersion ratio over all bins takes form: where m i is defined in (1.8) and N is the conserved total number of particles (equal to i m (l=0) i ). To quantify the effect of numerical diffusion on the broadness of the resultant spectrum, the following parameter is introduced based on the numerical and analytical solutions (hereinafter reported in percentages): Table 1 depicts the gradual narrowing of the spectrum under undisturbed adiabatic growth.

330
Left panel in Fig. 9 provides values of the R disp parameter evaluated at six selected timesteps corresponding to M = 1, 2, 4, 6, 8, 10 g kg −1 . Although numerical broadening is inherent to all employed schemes, and grows in time for all considered variants, the scale of the effect is significantly reduced when using MPDATA.
In particular, a tenfold decrease in numerical broadening as quantified using R disp is observed comparing upwind and "best" variant considered herein.

335
While outside of the scope of the present study, it is worth noting that in simulations combining spectral growth with transport in physical space, the numerical broadening associated with the spatial advection also contributes to the numerical broadening effect (see Pardo et al., 2020, and references therein).

Notes on conservativeness
Due to the formulation of the problem as number conservation and discretization of the evolution equation using fixed bins, 340 even though the numerical scheme is conservative (up to subtle limitations outlined below), evaluation of other statistical moments of the evolved distribution from the number density introduces an inherent discrepancy from the analytical results (for a discussion on multi-moment formulation of the problem, see e.g. Liu et al., 1997).
In order to quantify the discrepancy in the total mass between the discretized analytical solution and the numerically integrated spectrum, the following ratio is defined using moment evaluation formula (1.8): Right panel in Fig. 9 depicts the values of the above-defined ratio computed for spectra obtained with different variants of MPDATA discussed herein. The departures from analytically-derived values are largest for the upwind scheme (up to ca. 5%), and oscillate around 0 with amplitude of the order of 1% for most of the MPDATA solutions.

350
The consequences of mass conservation inaccuracies in the fixed-bin particle size spectrum representation may not be as severe as in, e.g. dynamical core responsible for transport of conserved scalar fields. The outlined discrepancies may be dealt with by calculating the change in mass during a time step from condensation, then using it in vapor and latent heat budget calculations so the total mass and energy in the modeled system are conserved.
The problem embodied in equation (1.1) is the conservation of number of particles and the embraced algorithm (1.3)-

355
(1.4) is conservative (up to numerical precision) for G = 1. However, the formulation of the donor cell scheme ψ n+1 = ψ n + G −1 on the staggered grid with G = 1, for example due to employment of non-identity coordinate transformations implies that even though the influx and outflux across boundary of adjacent cells is equal, discretization of G i at cell centers limits the level of accuracy in number conservation.
The total number of particles in the system may diverge from the analytical expected value even for the initial condition 360 depending on the employed discretization approach. In the present work, the probability density function is probed at cell centers effectively assuming piecewise-constant number density function. An alternative approach is to discretise the initial probabilities by assigning to ψ i the values of (φ i+ 1 2 − φ i−1/2 )/(r i+ 1 2 − r i−1/2 ) where φ is the cumulative distribution.  after normalization with respect to the values pertinent to upwind runs. The "best" variant is roughly ten times more costly than the upwind scheme. The table includes analogous measurements reported in earlier studies on MPDATA, where available (see Table caption for comments).

Computational cost
As can be seen from the table, the infinite gauge option not only improves result, but simplifies equation, making numerics 370 faster. Three-pass MPDATA with third order terms included is slightly faster than the variant with both the infinite-gauge and non-oscillatory options enabled.
Although the discussed problem is one-dimensional, its computationally efficient and accurate solution is essential, as it typically needs to be solved at every time step and grid point of a three-dimensional cloud model.  Smolarkiewicz (1983) for two-dimensional problem. Column labeled SS05 correspond to data reported in Smolarkiewicz and Szmelter (2005) for a 3D finite-volume advection on unstructured grid. Column SR91 includes values from Smolarkiewicz and Rasch (1991), MSS00 corresponds to data from Margolin et al. (2000), both reported for two-dimensional problems. The study was focused on the MPDATA family of numerical schemes that iteratively apply the upwind algorithm reducing the numerical diffusion while maintaining the salient features of the underlying upwind scheme such as conservativeness and positive-definiteness.
Several options introduced to MPDATA following its original formulation were explored here in the context of condensational growth problems. This included the procedure to introduce coordinate transformations (e.g., to a mass-doubling grid) 380 and the variants of MPDATA including: infinite-gauge, non-oscillatory, DPDC and third-order-terms options.
In literature, the derivation and discussion of MPDATA variants is spread across numerous research papers published across almost four decades, and in most cases focused on multidimensional hydrodynamics applications. It seems that this contributed to an apparent detachment of condensational-growth applications from the benefits of continuous research on MPDATA. It was the aim of this study, to highlight the developments that followed the original formulation of the algorithm, and to highlight 385 their applicability to the problem. To this end, it was shown that combination of such features of MPDATA as the infinite-gauge, non-oscillatory and third-order-terms options, together with application of multiple corrective iterations offer a robust scheme that grossly outperforms the almost quadragenarian basic MPDATA. https://github.com/atmos-cloud-sim-uj/PyMPDATA/tree/master/PyMPDATA_examples/Olesik_et_al_2020.
The electronic supplement to this manuscript contains a snapshot of the project code repository archived at the day of submission. The 1.0 release stated in the title is going to accompany the final version of the article.

Appendix A: Convergence analysis
To assess the spatial and temporal convergence of the numerical solutions presented above, a convergence test originating from Smolarkiewicz and Grabowski (1990) is used. For the analysis the following truncation-error L 2 measure is used (e.g., 400 Smolarkiewicz , 1984): As a side note, it is worth pointing out that for the chosen coordinates p = r 2 , x = r 2 , the coordinate transformation term is equal to identity, so there is no need for including the G factor into the computed error measures. In general case, convergence will depend on the grid choice and to account for that one may used a modified measure as given in Smolarkiewicz and Rasch 405 (1991, eq. 24 ).
As proposed in Smolarkiewicz and Grabowski (1990), visualization of the results is carried out on polar plots with radius ρ and angle φ coordinates defined as follows: where ρ was shifted by a constant so that the highest resolution grid corresponds to ρ = 1. Smolarkiewicz and Grabowski (1990), Figs. 8.1-8.2 Margolin and Smolarkiewicz (1998) and Figs. 10-11 Jaruga et al. (2015. The chosen color increments correspond to the error reduction by a factor of 2, the warmer the color, the larger the error. The 415 small gray points behind the isolines represent points for which the error value was calculated. When moving along the lines of constant Courant number, increasing the space and time discretization, number of crossed dashed isolines indicate the order of convergence. For the considered problem, it can be seen that the upwind scheme (Fig. A1) has convergence of the first order (one isoline is crossed when spatial discretization increases by one order); MPDATA scheme ( Fig. A2) of the second order and MPDATA with 3 iterations (Fig. A6) is of the third order.  Figure A8. Convergence plot for the three-pass infinite gauge non-oscillatory MPDATA with third order term corrections (cf. Fig. 8).