Collection/Aggregation in a Lagrangian cloud microphysical model: Insights from column model applications using LCM1D (v0.9)

Lagrangian cloud models (LCMs) are considered the future of cloud microphysical modeling. However, LCMs are 1 computationally expensive due to the typically high number of simulation particles (SIPs) necessary to represent microphysical 2 processes such as collection/aggregation successfully. In this study, the representation of collection/aggregation is explored in 3 one-dimensional column simulations, allowing for the explicit consideration of sedimentation, complementing the authors’ 4 previous study on zero-dimensional collection in a single grid box. Two variants of the Lagrangian probabilistic all-or-nothing 5 (AON) collection algorithm are tested that mainly differ in the assumed spatial distribution of the droplet ensemble: The first 6 variant assumes the droplet ensemble to be well-mixed in a predefined three-dimensional grid box (WM3D), while the second 7 variant considers explicitly the vertical coordinate of the SIPs, reducing the well-mixed assumption to a two-dimensional, 8 horizontal plane (WM2D). Since the number of calculations in AON depends quadratically on the number of SIPs, an approach 9 is tested that reduces the number of calculations to a linear dependence (so-called linear sampling). All variants are compared 10 to established Eulerian bin model solutions. Generally, all methods approach the same solutions, and agree well if the methods 11 are applied with sufficiently high accuracy (foremost the number of SIPs, timestep, vertical grid spacing). However, it is found 12 that the rate of convergence depends on the applied model variant. The dependence on the vertical grid spacing can be reduced 13 if AON WM2D is applied. The study also shows that the AON simulations with linear sampling, a common speed-up measure, 14 converges slower, as smaller timesteps are required to reach convergence compared to simulations with a quadratic dependence 15 on the number of SIPs. Most importantly, the study highlights that results generally require a smaller number of SIPs per grid 16 box for convergence than previous box simulations indicated. The reason is the ability of sedimenting SIPs to interact with 17 an effectively larger ensemble of particles when they are not restricted to a single grid box. Since sedimentation is considered 18 in most commonly applied three-dimensional models, the results indicate smaller computational requirements for successful 19 simulations than previously assumed, encouraging a wider use of LCMs in the future. 20 1 https://doi.org/10.5194/gmd-2019-343 Preprint. Discussion started: 11 February 2020 c © Author(s) 2020. CC BY 4.0 License.

2.1 Basic relations and definitions 1 We use a column with nz grid boxes (GBs). Each GB has the volume ∆V and a height of ∆z. The total column height is thus 2 Lz = nz × ∆z.
(1) 3 We define that the GB k with 1 ≤ k ≤ nz extends from z k−1 to z k := k × ∆z, hence the GB with k = 1 is the lowest GB. 4 The horizontal area of the column is given by 5 ∆A = ∆V /∆z.
(2) 6 The droplets are assumed to be spherical with a density of ρ w = 1000 kg/m 3 and the mass-size relation is simply given by 7 m = 4 3 πr 3 ρ w .
(3) 8 Following Gillespie (1972) and Shima et al. (2009), the probability P W M 3D ij that one droplet with mass m i coalesces with 9 one droplet with mass m j inside a small volume δV within a short time interval δt is given by Here and in the following, index p refers to any single bin or SIP. If we want to stress that the combination of two SIPs or bins 1 matters, we use indices i and j. Index k is used for altitude and l for the order of the moments by convention.
the width of the mass bins and hence on κ, as well as the other parameters of the prescribed DSD. A change of the "system 1 size" ∆V does not change the number of SIPs, but simply leads to a rescaling of the SIP weights ν i . For exponential DSD 2 given above, around 3 N SIP,GB = 5 × κ (16) 4 SIPs are initialised (the scaling factor depends on the width of DSD and the choice of the lower cut-off threshold). Finally note 5 that if the DSD is prescribed in a specific GB, the position z p of each SIP is randomly chosen from [z k , z k+1 ]. Furthermore, δt 6 and δV of the conceptual model take the values ∆t and ∆V in the numerical models. 7 2.2 Eulerian column model 8 Eulerian column models have been widely employed in cloud physics and the present bin implementation is conceptually 9 similar to previous ones (e.g. Prat and Barros, 2007; Stevens and Seifert, 2008;Hu and Srivastava, 1995). We use exponentially 10 increasing bin sizes as defined in Eq. 13. The smallest mass m bb,0 is chosen suitably small (corresponding roughly to a droplet 11 radius of 1 µm), and the grid resolution parameter s sufficiently large (4 by default), i.e. the mass doubles every four bins. 12 The variable g ln m = 1 3 g ln r will be discretized in mass space and used as a prognostic variable. The droplet mass concentra- 13 tion in each bin p and height k is given by g p,k × d ln m and approximates  18 For its numerical solution, two different positive definite advection algorithms have been used. The first option is the classical 19 first-order upwind scheme (known for its inherent numerical diffusivity). For w sed ≥ 0, it is simply given by 20 g p,k (t + ∆t) = g p,k (t) + ∆t ∆z w sed (m bb,p )(g p,k+1 (t) − g p,k (t)). 21 The above equation is solved independently for each bin p, where w sed is evaluated at the arithmetic bin centerm bb,p = 22 0.5 (m bb,p+1 + m bb,p ) 1 . A second (better) option is the popular MPDATA algorithm, which is an iterative solver based on 23 the upwind scheme, yet drastically reduces its diffusivity (Smolarkiewicz, 1984(Smolarkiewicz, , 2006. By default, MPDATA is employed. 24 Irrespective of the chosen advection solver, the prediction of the "new" g p,k depends on g p,k and g p,k+1 (i.e. the GB above 25 the one of interest). For the prediction of g p,nz at the model top, it is necessary to prescribe some value g p,nz+1 which defines 26 the upper boundary condition (this is detailed in section 2.4).

27
If the prescribed ∆t is too large and the Courant-Friedrichs-Levy (CFL) criterion ∆t ∆z w sed (m bb,p ) ≤ r CF L < 1 is violated, 28 subcyling is introduced. As w sed (m bb,p ) does not change over the course of a simulation, the (bin-dependent) number of 29 subcycles n subc,p is determined in the beginning, such that r CF L = 0.5 holds for the reduced timestep ∆t n subc,p .

30
After one call of the Bott algorithm, n subc,p calls of the selected advection algorithm with reduced time step ∆t n subc,p follow 1 for each bin p. Unlike to Eulerian methods, sedimentation in a Lagrangian approach is independent of the chosen mesh and the time step is 10 not restricted by numerical reasons. If z p becomes negative at some point in time, the SIP crossed the lower boundary and is 11 removed.

12
For the collection process, it assumed that each SIP belongs to a certain GB k obeying z k−1 ≤ z p < z k and that the real 13 droplets of each SIP are well-mixed in the GB volume (WM3D). The collection process is treated with the probabilistic AON 14 algorithm. In the regular version (see section 2.3.1), AON is called for each GB and accounts for all possible collisions among 15 any two SIPs of the same GB. By construction, the information on the vertical position is irrelevant inside the regular AON, 16 and is only used in the SIP-to-GB assignment. 17 In the version with explicit overtakes (WM2D, see section 2.3.2), for any two SIPs (of the whole column) it is checked if 18 the higher SIP (i.e. with larger z p ) overtakes the lower SIP within the current time step. This may have several advantages: 19 First, only 2D well-mixedness in a horizontal plane is assumed and possible size sorting effects within a GB are accounted 20 for. Moreover, in Lagrangian methods the time step is not restricted by the CFL criterion and the largest SIPs may travel 21 through more than one GB. In the classical approach, such a SIP can only collect SIPs from the GB where it was present in the 22 beginning of the time step. In the second approach, collections can also occur across GB boundaries (see section 2.3.2).

23
In the remainder of this paper, the classical approach is referred to as "3D Well-Mixed" (WM3D) AON and the new approach 24 as AON-WM2D. Figure 2 sketches how the SIP properties (location, weighting factor, sedimentation speed) are interpreted in 25 either approach. For simplicity, a single GB with one SIP pair is displayed. 26 AON is probabilistic and an individual realisation does usually not reproduce the mean state as predicted by deterministic 27 methods like Eulerian approaches. The extent of deviations from the mean state is exemplified in Fig. 15    Here we basically repeat the AON description of U2017 (their section 2.5).  if pcrit > 1 then 12:

13:
{can occur when νi and νj differ strongly and be regarded as special case; see text for further explanation} 14: assume νi < νj, otherwise swap i and j in the following lines 15: {pcrit > 1 is equivalent to ν coll > νi} 16: {transfer ν coll droplets with µj from SIP j to SIP i, allow multiple collections in SIP i, i.e. one droplet of SIP i collects more than one droplet of SIP j.} 17: SIP i collects ν coll droplets from SIP j and distributes them on νi droplets: µi = (νi µi + ν coll µj)/νi 18: SIP j loses ν coll droplets to SIP i: νj = νj − ν coll 19: else if pcrit >rand() then 20: 21: assume νi < νj, otherwise swap i and j in the following lines 22: {transfer νi droplets with µj from SIP j to SIP i}  Figure 3 illustrates how a collection between two SIPs is treated. SIP i is assumed to represent fewer droplets than SIP j, 1 i.e. ν i < ν j . Each real droplet in SIP i collects one real droplet from SIP j . Hence, SIP i contains ν i = 4 droplets, now with 2 mass µ i + µ j = 15. SIP j now contains ν j − ν i = 8 − 4 = 4 droplets with mass µ j = 9. Following Eq. (7), only ν coll = 2 pairs 3 of droplets would, however, merge in reality. The idea behind this probabilistic AON is that such a collection event is realised 4 only under certain circumstances in the model, namely such that the expectation values of collection events in the model and 5 in the real world are the same. This is achieved if a collection event occurs with probability

RANDOM SINGLE COLLECTION
in the model. Then, the average number of collections in the model, 9 is equal to ν coll as in the real world. A collection event between two SIPs occurs if p crit >rand(). The function rand() provides 10 uniformly distributed random numbers ∈ [0, 1]. Noticeably, no operation on a specific SIP pair is performed if p crit <rand().

11
The treatment of the special case ν coll /ν i > 1 needs some clarification. This case is regularly encountered when SIPs with 12 large droplets and small ν i collect small droplets from a SIP with large ν j . The large difference in droplet masses µ led to 13 large kernel values and high ν coll with ν i < ν coll < ν j . [. . . ] If p crit > 1, we allow multiple collections, as each droplet in 14 SIP i is allowed to collect more than one droplet from SIP j. In total, SIP i collects ν coll droplets from SIP j and distributes presented here these aspects are not relevant and thus omitted. 26 The current implementation differs in several aspects from the version in Shima  collections (MC) are differently treated. For p crit = (ν coll /ν i ) > 1, either p crit ν i or p crit ν i droplets of SIP j merge with 32 ν i droplets of SIP i depending on the probability p crit − p crit . This maintains the integer property of the SIP weights. As the 33 latter feature is not required in our approach, we deterministically merge p crit ν i = ν coll droplets from SIP j with ν i droplets 1 of SIP i. This is computationally more efficient than the integer-preserving implementation. Test simulations showed that both 2 MC treatments produce similar results.  Unlike to the classical case where 3D well-mixedness has to be assumed, droplets of a SIP are now assumed to be well 8 mixed on the x-y-plane at z = z p within the GB (horizontally well-mixed instead of the traditional isotropic assumption) and 9 represent a "concentration" of n 2D = ν/δA (units L −2 , where L is a length scale). We introduce an adapted kernel definition 10 where the relative velocity term |w sed,i − w sed,j | is dropped from Eq. 5: The AON algorithm is split into two steps:

18
Analogous to the classical implementation, a collection in the model is performed with a probability ν coll /ν i and SIP i 19 may collect ν i from SIP j (in this step i and j are chosen, such that ν i < ν j ). 20 Similarly to the WM3D version, it happens that ν coll is larger than ν i and multiple collections should be considered in the 21 algorithm.

22
Specifically to WM2D, it is also possible that a SIP interacts with other SIPs located not only in one but several GBs.

23
Accordingly, it is not only necessary to check overtakes of other SIPs in the original GB (more specifically, SIPs that lie in the 24 same GB at time t), but also the SIPs that are located underneath, depending on the prescribed time step. In a Lagrangian model, {Sort SIPs by position, the highest SIP will be the first SIP.}

6:
Sort SIPs by position, such that zi(t) ≥ zj(t) for i < j 7: {Check for overtakes} 8: for i = 1, NSIP,tot − 1 do 9: for j = i + 1, NSIP,tot do if zi(t + ∆t) ≥ zj(t + ∆t) then 14: proceed with next SIP j {no overtake occured as SIP i is still above SIP j at t + ∆t} 15: end if 16: {the above conditions guarantee that the following code is executed iff SIP i overtakes SIP j} 17: Compute ν coll according to Eq. 24 { instead of Eq. 7 as in the WM3D version} 18: {all the following operations are identical to the WM3D version and accompanying explanations are removed} 19: νnew = min(νi, νj) 20: pcrit = ν coll /νnew 21: if pcrit > 1 then 22: assume νi < νj, otherwise swap i and j in the following lines For the smallest SIPs, which often travel only a small distance inside a GB, the list of SIPs that may be overtaken is com-5 mensurately small and overtakes have to be checked for a fraction of SIPs of the GB only (that means the actual computational 6 work is smaller than in the regular version). On the other hand, imagine the largest SIPs travel through three GBs, then over-7 takes have to be tested for roughly three times more SIPs than in the regular version. Moreover, testing for overtakes (step 1) 8 is computationally less demanding than calculating the potential collections (step 2). In WM3D we have always the workload 9 of step 2 for all tested combinations, whereas in WM2D only the cheaper step 1 is executed in case of no overtake.

10
Besides the weaker assumption of 2D well-mixedness, the present approach is actually more intuitive (even though it may 11 first be regarded counter-intuitive by those who are familiar with traditional Eulerian grid-based approaches). Moreover, this 12 approach complies better with the Lagrangian paradigm of a grid-free description (the present approach is independent of nz 13 and ∆z, yet some horizontal "mixing area" ∆A has to defined, over which the droplets of a SIP are assumed to be dispersed).
14 For more sophisticated kernels, including, e.g., turbulence enhancement, the present approach may not be adopted easily 15 as the driving mechanism for collisions to occur in the current model is differential sedimentation (see also discussions on 16 cylindrical vs. spherical formulations of kernels in (Saffman and Turner, 1956) and Wang et al. (1998Wang et al. ( , 2005). 17 Finally, we shortly summarize the differences between the WM2D and WM3D approach. The standard kernel K W M 3D as (unit T ) to obtain n coll (see Eq. 8). Since SIPs represent droplet concentrations of n i = ν i /δV and n j = ν j /δV , Eq. 7 follows. 21 In the WM2D approach, the kernel K W M 2D as given by Eq. 23 has units L 2 . Multiplying it by "2D" concentrations n 2D,i 22 and n 2D,j (units L −2 ) one obtains the collected 2D concentration n 2D,coll (units L −2 ). Since SIPs represent "2D" droplet 23 concentrations of n 2D,i = ν i /δA and n j = ν 2D,j /δA, Eq. 24 follows. A collection can only occur, if a larger droplet (or SIP) i 24 overtakes a smaller droplet (or SIP) j. First, z i > z j and w sed,i > w sed,j must hold and second the overtake time ∆t OT := One can define the overtake probability p OT being 0 for ∆t OT > δt  In both models, also a non-zero influx at the model top can be prescribed. One variant is to use periodic boundary conditions. 19 In the Lagrangian approach this is done by increasing the height z p of affected SIPs by Lz, once their height drops below 0. 20 In the Eulerian model, g p,nz+1 is identified with g p,1 . A second non-zero influx variant is a prescribed size distribution that 21 is advected into the domain with its respective fall speed. In the bin model, the prescribed DSD simply defines the g i,nz+1 - 3.2 Box model emulation simulations    from the noSedi, those simulations will be referred to as "full"). Moreover, the regular AON-WM3D version uses a quadratic 1 sampling of SIP combinations (referred to as "QuadSamp"). surprisingly well for large time steps; a fact that was already shown with the AON box model (see Fig. 18 of U2017).

6
Next, we discuss the sensitivity to more physical and numerical parameters. We found that convergence is usually more 7 easily reached for higher moments than for λ 0 (not shown). Hence in the following, we confine our analysis to the most 8 "critical" quantity, and Fig. 7 displays the λ 0 -evolution for various sensitivity experiments. Even though we analyse the results 9 in some detail, we want to mention that the observed differences are in principle not substantial. In fact, results differ often 10 much more due to a different collection kernel or slightly varied initial DSDs (see section 3.2.4). Nevertheless, the analysis 11 will help to understand more deeply how collection works in an LCM with AON. This pronounced effort is justified, as 14 In a first simple step, we vary nz (see first row of Fig. 7), which changes two aspects of the numerical setup. The number 15 of GBs over which interactions can occur and secondly the height of the column. This implicitly changes the time it takes 16 for SIPs to fall through the total column and hence changes the "recycling" time scale L z /w sed . Together with nz, nr inst is 17 varied such that nz ×nr inst is always 1000. Accordingly, all simulation results are averaged over the same number of GBs and 18 we avoid that simulations with smaller nz produce noisier data. In the noSedi-simulations (panel a), the moment evolution is 19 not affected by varying (nz, nr inst ). This is trivial, as in any case the average is taken over 1000 independent GBs. At least, 20 these results demonstrate that averaging over that many GBs suffices by far to produce robust averages. In the full simulations (panel b), the λ 0 -decrease is more pronounced and the various setups produce nearly identical results (except for the case with 1 nz = 2, which is in between the other full simulations and the noSedi simulations). From this finding alone one may argue that 2 the collection process is more efficient in LCM1D than in LCM0D. numerical approach by using too few computational particles. We believe that this hinders the formation of lucky droplets and 8 fewer droplets get collected (hence λ 0 is larger for smaller κ). Another more technical explanation is that the ν p -distribution 9 of the SIP ensemble is such that the formation of lucky SIPs is not supported. Ideally, there is a reservoir of SIPs with small 10 ν-values which can become lucky SIPs. There might be too few SIPs with small ν for small κ.

25
This elucidates that convergence is improved once some process exchanges SIPs between GBs, may it be for physical reasons     faster. Relative to the regular WM3D, N comb of WM2D is at any time smaller. In the beginning of the simulation, possible 1 overtakes occur among relatively few SIPs; much fewer on average than there are in a GB, hence the total N comb is around a 2 factor 60 smaller (in the first 20 minutes; 9.44 · 10 7 vs. 1.49 · 10 6 ). Even towards the end of the simulation, many SIPs are still 3 small and travel through a small fraction of the GB. Only few SIPs grow to rain drop size and travel distances of order ∆z. The 4 table shows that the total (time-integrated) N comb is more than a factor 12 smaller for WM2D than for WM3D (2.30 · 10 7 vs. the number of SIPs. For a simplified presentation, we limit ourselves to the WM3D versions with QuadSamp and LinSamp 1 and assumed converged simulation results and no limiter events. Moreover, we assume that an increase of N SIP leads to an 2 uniform decrease of all SIP weights ν p .
6 Accordingly,    The further settings are nz = 400, ∆z = 10 m, ∆t = 10 s, nr inst = 20, κ = 40. Figure 15 shows the temporal evolution of 26 the mean diameter and the moments λ 0 , λ 1 and λ 2 . Due to the influx condition, the total mass increases during the first 10 min-27 utes, barely visible in the third panel. During this period, however, collection is already efficiently reducing the droplet number.

28
This is accompanied by an increase of the mean diameter and radar reflectivity. Soon after, the first droplets reach the surface,  SIPs becomes rain drops eventually (see e.g. Fig. 4 in U2017) and hence the SIP number is substantially smaller in the lower 4 half. There, each GB is populated roughly by 10 SIPs. Despite this rather small value, convergence in DN C and Z seems to 5 be ubiquitous.  water from the column. Due to this wash-out effect, the rain drops cannot grow that large any longer and the precipitation 1 mode peaks at smaller sizes at later times. Overall, the agreement between the three model versions is remarkable given the 2 completely different numerics of the Eulerian and Lagrangian approach.  It follows that the results of the AON WM2D version should be independent of ∆z. Moreover, the AON-WM3D variant can 12 be used to determine if the size (more specifically the depth) of the well-mixed volume is a crucial parameter. In bin models in 13 general, the latter effect could not easily be singled out as sedimentation numerics also change with ∆z.
14 Figure 18 depicts the evolution of λ 0 and λ 2 for ∆z ranging from 2 m to 100 m. As expected, the AON WM2D simulations  In this section, the 4 km deep column is initially devoid of droplets and a time-constant influx of a DSD with r 0 = 16.9 µm and 6 LW C = 6 g/m 3 is prescribed. As in the box model emulation setup, the according DNC is 297 cm −3 .  lower domain half. Nevertheless, the LCM1D results seem to be converged. SIPs at those altitudes are large (D mean > 400 µm) 1 and fall fast, which fosters a strong SIP exchange across GBs and is beneficial to convergence (see section 3.2).
2 Figure 20 shows the temporal evolution of the mean diameter, column-averaged DN C and Z. Within the first 10 minutes, 3 DNC increases quickly. Soon after, collection becomes effective and DNC reaches a quasi steady state. The radar reflectivity 4 increases within the first 60 minutes and then also reaches a quasi steady state. The only discrepancy between the various 5 models are slightly larger DNC-values with LCM1D. The reason for this is elucidated next. of Lagrangian and Eulerian approaches to advect a droplet ensemble due to sedimentation is tested first -neglecting the 6 influence of collection. Since numerical diffusion is inherent to any Eulerian advection problem, i.e., also sedimentation, its 7 impact might impede any conclusions drawn from the collection simulations. However, by using an appropriate advection 8 scheme (MPDATA, Smolarkiewicz, 1984), numerical diffusion can be reduced to an acceptable degree in the sense that the 9 present simulations focus on the differences driven by collection numerics.

10
To bridge the gap to U2017, the behavior of box model simulations is emulated in the column model. This is done by 11 initialising each GB of the column with the same droplet size distribution and applying cyclic boundary conditions at the 12 surface and the top. By using this framework, we were able to show that sedimentation increases the model convergence technique.

20
A generally good agreement of the LCM results with the bin reference has been found for all AON variants. However, 21 they reveal distinct differences in their numerical and computational requirements. LinSamp demands a shorter timestep than 22 QuadSamp as a result of the upscaled collection probabilities to avoid SIPs with a zero (or even negative) weighting factor.

23
And indeed, fully coupled LCM applications with AON and LinSamp are reported to require a relatively short timestep to 24 reach convergence (e.g., Dziekan et al., 2019). Accordingly, these strong restrictions on the timestep might cancel out the 25 computational benefit gained by the reduced number of SIP combinations that need to be tested in LinSamp. This indicates 26 that the simpler QuadSamp might be a valuable alternative to LinSamp as long as the number of SIPs is not prohibitively high. 27 We further compared the computational requirements for the WM2D and WM3D implementations of AON. We found that 28 WM2D requires to check for overtakes in the entire column, not only in the GB in which the SIP is located, as is the case for 29 WM3D. However, this seeming disadvantage is turned into an advantage, since only a minority of SIPs overtakes other SIPs.

30
Accordingly, the overall number of calculations necessary for the application of WM2D is reduced compared to WM3D. The 31 physical reason for this effect is the typical bimodal structure of droplet spectra, which consist of only a few large droplets that 32 sediment and collect other droplets efficiently, while the remaining droplets are usually too small to sediment and collect other 33 droplets. Finally, we applied these approaches to two more realistic column cases. While both cases use a prescribed inflow of droplets 1 from the top, the first case is initialised with a linearly increasing liquid water content, and the second case is completely devoid 2 of any initial droplets. Overall, the agreement of AON-WM3D, AON-WM2D, and the bin references is remarkable. Only in 3 the second case, which is designed to be heavily prone to size-sorting, a dependence on the vertical grid spacing is detectable 4 for WM3D and the bin reference, which both assume the droplets to be well-mixed within a GB, while the WM2D results are 5 found to be completely independent of the vertical grid spacing.

6
All in all, this study has shown that the representation of collection in LCMs using AON with WM3D and WM2D reproduces

28
A. Bott. A flux method for the numerical solution of the stochastic collection equation: Extension to two-dimensional particle distributions.