A refined statistical cloud closure using double-Gaussian probability density functions

We introduce a probability density function (PDF)-based scheme to parameterize cloud fraction, average liquid water and liquid water flux in large-scale models, that is developed from and tested against large-eddy simulations and observational data. Because the tails of the PDFs are crucial for an appropriate parameterization of cloud properties, we use a double-Gaussian distribution that is able to represent the observed, skewed PDFs properly. Introducing two closure equations, the resulting parameterization relies on the first three moments of the subgrid variability of temperature and moisture as input parameters. The parameterization is found to be superior to a single-Gaussian approach in diagnosing the cloud fraction and average liquid water profiles. A priori testing also suggests improved accuracy compared to existing double-Gaussian closures. Furthermore, we find that the error of the new parameterization is smallest for a horizontal resolution of about 5–20 km and also depends on the appearance of mesoscale structures that are accompanied by higher rain rates. In combination with simple autoconversion schemes that only depend on the liquid water, the error introduced by the new parameterization is orders of magnitude smaller than the difference between various autoconversion schemes. For the liquid water flux, we introduce a parameterization that is depending on the skewness of the subgrid variability of temperature and moisture and that reproduces the profiles of the liquid water flux well.


Introduction
The cloud fraction and the average liquid water in a given volume depend on the variability of temperature and moisture within that volume.If subgrid variability is not taken into account at all, the grid volume is either entirely subsaturated or entirely saturated.To overcome this problem, diagnostic relative humidity schemes have been developed, for example by Smagorinsky (1960) and Sundqvist et al. (1989) who parameterized partial cloud fraction as a function of relative humidity with a certain critical relative humidity at which a partial cloud cover first appears.This kind of parameterization has been developed further by implementing secondary predictors like condensate content (e.g., Xu and Randall, 1996) or vertical velocity (e.g., Slingo, 1987).
Another approach in diagnosing cloud fraction is based on one-dimensional probability density functions (PDFs) of the subgrid variability in temperature and moisture 1 .Assuming a single-Gaussian PDF, these schemes go back to Sommeria and Deardorff (1977) and Mellor (1977) and need not only the grid-box mean temperature and moisture but also the standard deviations as input parameters.Because the success of such schemes crucially depends on the ability to quantify the tails of the distribution (Bougeault, 1982a), further studies additionally took into account the skewness of the distribution which lead to the use of, for example, double-Gaussian (Lewellen and Yoh, 1993;Larson et al., 2001a), Gamma (Bougeault, 1982b) or Beta (Tompkins, 2002) distributions.Perraud et al. (2011) tested several of this distributions against model data and found that the double-Gaussian distribution gives best results.
Compared to relative humidity schemes, PDF-based schemes typically need more and higher moments as input parameters.While the first two moments are commonly available in numerical weather prediction (NWP) models and general circulation models (GCMs), there are ongoing efforts to develop higher-order closure boundary layer models which include an estimate of the third moment, that is, the skewness (Gryanik and Hartmann, 2002;Gryanik et al., 2005;Mironov, 2009;Machulskaya and Mironov, 2013).Apart from this disadvantage, PDF schemes have several advantages over relative humidity schemes.In PDF schemes, the shape of the PDF is parameterized but the variables aimed for, such as cloud fraction and average liquid water, are derived directly from this PDF.Therefore, the variables are calculated consistently from the assumed PDF.Also, numerical models that ignore subgrid variability are known to encounter systematic errors in cloud and radiative properties (Pincus and Klein, 2000;Rotstayn, 2000;Larson et al., 2001b).To tackle this issue, the knowledge of the subgrid PDF is essential.Furthermore, PDF schemes can potentially be used in a wide range of cloud regimes.Other than for relative humidity schemes, no trigger functions to switch from one regime (and its according parameterization) to another regime are needed and artificial distinctions can be avoided.
As a further development from one-dimensional PDFs, joint PDFs have been introduced recently (e.g., by Larson et al., 2002).In joint-PDF schemes the variability of temperature and moisture are usually not summarized in one variable and the distribution of the vertical velocity can be added as further input.Because the vertical velocity is taken into account, the liquid water flux can be derived consistently from the joint PDF.This advantage has to be paid for by the prediction or diagnosis of several more moments and correlations among temperature, humidity and vertical velocity (e.g., Larson et al., 2002, used 19 parameters instead of 5 for a double-Gaussian distribution).Hence joint-PDF schemes are much more computational expensive than one-dimensional PDF schemes and their usage in operational NWP models or GCMs is challenging with todays computational power.
We therefore step back to one-dimensional PDF schemes and focus on improving the double-Gaussian PDF scheme to diagnose subgrid cloud fraction and average liquid water.The formulation follows Larson et al. (2001a) and is developed from and tested against large-eddy simulations (LES) as well as aircraft measurements.In Sect.2, the LES model, the case studies the model is applied to and the observational data set are described.The use and construction of a double-Gaussian PDF, the refined closure equations and the parameterization of the liquid water flux are introduced in Sect.3. Next, in Sect.4, we perform a priori testing of the new cloud closure with LES data as input to examine the parameterization's behaviour under idealized conditions, that is, excluding the interplay with other model components as would be done with a posteriori testing in an NWP model or a GCM.In the following Sects.5 and 6, the error dependence of the parameterization on domain size and the role of mesoscale structures are discussed and the introduced cloud closure is extended to the diagnosis of the autoconversion rate.Finally, in Sect.7, we give some concluding remarks.

Large-eddy simulations
The LES model used in this study is the University of California, Los Angeles LES (UCLA-LES; Stevens et al., 2005;Stevens, 2007) with one major difference to previous work, that is, the time stepping is done with a third-order Runge-Kutta scheme instead of the former leapfrog scheme.Prognostic equations for each of the following variables are solved: the three components of the velocity, the total water mixing ratio, the liquid water potential temperature, the mass mixing ratio of rain water and the mass specific number of rain-water drops.Considering only warm clouds, we use the double-moment bulk microphysical scheme from Seifert and Beheng (2001).Subgrid fluxes are modelled with the Smagorinsky-Lilly model.
For our study, we adapt the UCLA-LES to four different case studies which span over a range of different cloud regimes.Shallow cumulus over ocean (RICO 2 ; see Rauber et al., 2007) and over land (ARM 3 ; see Brown et al., 2002)  are considered as well as stratocumulus (DYCOMS 4 ; see Stevens et al., 2003) and the transition from stratocumulus to cumulus (ASTEX 5 ; see Albrecht et al., 1995).Domain sizes and resolutions of the different LES cases are given in Table 1.

ARM
The LES setup of the ARM case follows that of the sixth intercomparison project, performed as part of the GCSS 6 program and described by Brown et al. (2002).

ASTEX
The setup of the LES study for the ASTEX case is similar to that proposed by the Euclipse ASTEX Lagrangian model intercomparison case (van der Dussen et al., 2013).The initial profiles are identical to the first GCSS ASTEX "A209" modelling intercomparison case and the model is forced by time-varying sea surface temperature and divergence taken from Bretherton et al. (1999).

DYCOMS
For the LES setup of DYCOMS, we follow the DYCOMS-II RF01 setup of the eighth case study conducted under the auspices of the GCSS boundary layer cloud working group and described by Stevens et al. (2005).

RICO
The initial data and the large-scale forcing for the standard RICO simulations are based on the precipitating shallow cumulus case that was constructed by the GCSS boundary layer working group and described by van Zanten et al. (2011).A modified moister version, which differs from the standard setup only by a moister initial profile, was first used by Stevens and Seifert (2008), to which we refer for a detailed setup of the case.The moister initial condition leads to higher rain rates compared to the standard case and subsequently to mesoscale organization of the cloud field due to the formation of cold pools mainly caused by evaporation of rain in the sub-cloud layer (Seifert and Heus, 2013).
Unless stated otherwise, we refer to our standard RICO setup with nx = 512 when analysing LES data from the RICO case.The three RICO cases on the right hand side in Table 1 are equal to the LES runs R01, M01 and M01 big of Seifert and Heus (2013).For this study they are solely used in Sect. 5 when discussing the error dependence on domain size and the role of mesoscale structures.

Observational data
To be able to test our parameterization against observational data, we used RICO field campaign data (Rauber et al., 2007).This data set includes airborne measurements obtained from the NSF/NCAR Research Aviation Facility C-130Q Hercules aircraft (Tail Number N130AR) at 25 Hz.Besides the static pressure and the ambient temperature, the water vapor mixing ratio measured with a Lyman-alpha hygrometer as well as the liquid water content measured with a Gerber PV-100 probe were used in this study.Because the temperature sensor is susceptible to wetting during cloud penetrations, periods of cloud presence were defined by a threshold value of 10 cloud droplets (3 to 45 µm diameters) per cm 3 and in-cloud temperature was measured by a radiometric temperature sensor that is not sensitive to wetting.In 17 research flights (RF01 to RF13 and RF16 to RF19) all available five-minutes intervals at moderate height (pressure > 600 hPa) and with relatively constant pressure (standard deviation < 1 hPa) were selected and analysed.(Note that, unfortunately, during research flight 14 and 15 the Lymanalpha hygrometer was out of service, so no analysis of these flights is possible.) 3 Introducing a refined cloud closure 3.1 Data analysis: the double-Gaussian PDF For diagnosing the cloud fraction and the average liquid water, Perraud et al. (2011) show that the temperature variability should not be neglected relative to the humidity variability.We therefore follow Sommeria and Deardorff (1977), Mellor (1977) and Lewellen and Yoh (1993) and define the extended liquid water mixing ratio, s(q t , T l ), by s = q t − q s (T l ) where q t is the total water mixing ratio, q s (T l ) is the saturation mixing ratio at a given value of the liquid water temperature T l = θ l T /θ and (∂q s /∂T ) T =T l = Lq s (T l )/(R v T 2 l ) is the slope of the saturation mixing ratio at T = T l .T is the temperature, θ the potential temperature, θ l the liquid water potential temperature, L the latent heat of vaporization, c p the specific heat at constant pressure and R v the gas constant for water vapor.The extended liquid water mixing ratio takes into account the temperature variability as well as the humidity variability and is a measure of subsaturation if s is negative.For s > 0, s is approximately equal to the liquid water mixing ratio, q l .Note that the ratio of the mean of s, s, to the standard deviation of s, σ , can be approximated by the normalized saturation deficit, Q 1 , which is defined as the bulk value of s, s bu = s(q t , T l ), divided by σ (Lewellen and Yoh, 1993, ζ therein).
If the PDF of s is known for each grid box in an NWP model or a GCM, the cloud fraction and the average liquid water can be calculated by integration over the PDF of s (see Eqs. 8 and 9 for the formulation of the integral).As this is not the case, and only the first moments of the PDF of s can usually be predicted in large-scale models, we are using high-resolution LES and observational data to investigate the behaviour of the distribution of s on the subgrid scale of an NWP model or a GCM.
Considering the distribution of s from each model level in the LES data over the whole domain, we find that the PDF of s can be highly skewed in the cloud layer with positive skewness for shallow cumulus and negative skewness for stratocumulus (Fig. 1).For shallow cumulus, cloud formation is driven by surface heat fluxes that initiate few but strong updrafts in a slowly descending environment.Therefore the PDF of s is positively skewed with the moist tail representing the (cloudy) updrafts.In contrast, stratocumulus is driven by radiative and evaporative cooling at cloud top.Hence noncloudy downdrafts emerge in a dry tail of the PDF of s and the PDF tends to be skewed negatively (Helfand and Kalnay, 1983;Moeng and Rotunno, 1990).Consequently for both the shallow cumulus regime and the stratocumulus regime, the success of a scheme diagnosing the cloud fraction and the average liquid water depends crucially on its ability to quantify the tail of the distribution.
Following Larson et al. (2001a), we choose to represent the PDF of s by a double-Gaussian distribution which can represent skewed distributions and is able to reproduce the tail.The double-Gaussian distribution is quite popular (Larson et al., 2001a;Perraud et al., 2011) because the two single-Gaussian distributions that the double-Gaussian distribution is composed of can be interpreted physically as the updrafts and their slowly descending environment in case of a cumulus regime (Neggers et al., 2009) or as the downdrafts and their well-mixed environment in the case of a stratocumulus regime.In both regimes the dominant mode of the PDF of s is associated with the well-mixed environment and assumed to be Gaussian distributed.The tail of the PDF is represented in a secondary mode and is associated with the thermal updrafts in shallow cumulus and the negatively buoyant downdrafts in stratocumulus (Fig. 1).This secondary mode is also assumed to be Gaussian distributed.The choice of the double-Gaussian PDF is further supported by direct numerical simulations (DNS) of an evaporatively driven cloud top, where scales between a few millimeters and a few meters are resolved (Mellado et al., 2010).Consistently with the physical interpretation in terms of the large-scale updraft and downdraft flow structure presented above, agreement between the LES and the DNS data (Fig. 2) indicates that the non-Gaussianity is quite insensitive to the details of the small scales, since DNS resolves them and LES parameterizes them.Therefore, the skewed shape of the PDF seems to be related to the fact that buoyancy is one of the main forcing mechanisms, which is often the case when clouds are present in the system.Using a double-Gaussian distribution, the PDF of s is written as where P 1 and P 2 are single-Gaussian distributions and s 1 , s 2 , σ 1 and σ 2 are the mean and the standard deviation of the two single-Gaussian distributions.The relative weights a and (1 − a) can be interpreted as the corresponding area fractions (see Appendix).By convention and without loss of generality, we choose s 1 > s 2 .With five parameters to determine the PDF, the double-Gaussian distribution is highly flexible on the one hand.On the other hand, operational NWP models or GCMs are not able to predict five moments of the distribution of s.Therefore closure assumptions will have to be chosen carefully (see Sect. 3.2).
In order to be able to analyse the LES data and the observational data in terms of the closure equations, we next aim to find the best fit of a double-Gaussian distribution to the PDF of s for each level of our LES data set and each five-minute interval in the observational data set.Because the skewness of the distribution is a crucial parameter in our closure, we establish an additional constraint for the fit which retains the skewness of the given PDF for the fitted double-Gaussian distribution.Instead of varying the five parameters of the double-Gaussian distribution (a, s 1 , s 2 , σ 1 , σ 2 ) like Larson et al. (2001a) did, we express s 1 as a function of a, s 2 , σ 1 , σ 2 and the mean, the standard deviation and the skewness of the given PDF, s, σ and sk, using the definition of the third standardized moment of a double-Gaussian distribution: (Lewellen and Yoh, 1993;Larson et al., 2001a).The values of s, σ and sk are obtained from the LES data or the observational data to evaluate the above equation and hence four parameters are left to be fitted (a, s 2 , σ 1 , σ 2 ).To calculate the best skewness-retaining fit for each level of the LES data and each five-minute interval in the observational data set, we first do χ 2 -tests in the relevant region of the parameter space.Because this procedure gets computationally expensive easily (at least if four parameters are to be fitted like it is done here), we only search for a coarse estimation of the best fit for the four parameters and then use this best fit as input for the Nelder-Mead downhill simplex method (Press et al., 1992) to find the actual minimum.In Fig. 1 two examples of the distribution of s in a cloud layer of the LES data, one with positive skewness and one with negative skewness, are shown together with their best skewness-retaining double-Gaussian fit.

Closure equations
Even if we assume that the first three moments of the PDF of s are readily available from an NWP model or a GCM, for example, from a higher-order closure boundary layer model, the number of parameters has to be reduced from five to three, that is, two closure equations are necessary.Larson et al. (2001a) suggested with α = 2.0 and γ = 0.6 and s 1 > s 2 by convention.
Analysing the different LES cases by fitting a double-Gaussian distribution to the (normalized) PDF of s for each vertical level as described in Sect.3.1, we obtain σ 1 /σ and σ 2 /σ and plot them as a function of sk (Fig. 3a and b).It is noted that high σ 1 /σ or σ 2 /σ values (> 1.5) at sk = 0 are an artifact of a double-Gaussian distribution being fitted to a distribution that is not skewed.In this case a single-Gaussian distribution might represent the given distribution well.So if a approaches 0.0 (or 1.0) during the fitting procedure, the second (or first) single-Gaussian distribution of the double-Gaussian distribution might fit the given distribution so well that the termination criteria for the fitting procedure is reached, independent of the shape of the first (or second) single-Gaussian distribution which is essentially irrelevant because of its small amplitude.Therefore, for sk = 0 particularly high or low values of σ 1 /σ or σ 2 /σ can be ignored, when evaluating the closure equations.Because we defined s 1 > s 2 , large values for σ 1 /σ represent the cloudy tail in shallow cumulus, where sk has high positive values.In stratocumulus, where the skewness is negative, large values for σ 2 /σ represent the non-cloudy part of the cloud layer.Larson et al. (2001a) analysed observational data from the ASTEX campaign and found only very few measurements of high positive skewness.They therefore suggested an antisymmetric behaviour for σ 1 /σ and σ 2 /σ depending on sk (Fig. 3a and b).In contrast, we find from the different LES case studies that in the cumulus regime σ 1 /σ has higher values than σ 2 /σ in the stratocumulus regime.
This broken antisymmetric behaviour is consistent with the physical understanding that cloudy updrafts in shallow cumulus are more vigorous than non-cloudy downdrafts in stratocumulus.For the shallow cumulus cloud cores the upper limit is a moist adiabatic ascent, while the stratocumulus downdrafts are only initially cloudy and as soon as they become cloud-free follow a dry adiabatic descent.Because the downdrafts are not exactly the reversed process of the updrafts, the tails of the PDFs of s are different for both cloud regimes, that is, the tails are heavier in the cumulus regime than in the stratocumulus regime.
Using the s 1 > s 2 convention, we suggest a refinement of the parameterization of Larson et al. (2001a) using a modified set of closure equations (Fig. 3a and b) with α = 2.0 adopted from Larson et al. (2001a).Fitting the parameters γ n with a simple least square fit to the LES data sets of DYCOMS and RICO, we find best agreement for γ 1 = 0.73, γ 2 = 0.46, γ 3 = 0.78 and γ 4 = 0.73.By fitting the closure equations to only the two data sets of DYCOMS and RICO, which cover the stratocumulus and the cumulus type cloud regime, respectively, the LES data sets of ASTEX and ARM remain as independent test data sets.This generic division in training and test data aims to permit a later comparison between the error of this new parameterizations and other parameterizations from the literature (see Sect. 4).The main difference between this set of closure equations and the one from Larson et al. (2001a) is the dependence of σ 1 /σ on sk for (large) positive values of skewness, that is, for the shallow cumulus regime.The new parameterization is also supported by observational aircraft data from the RICO campaign (Fig. 3c).Compared to the simulated RICO case, the skewness from the observational data does not reach values as high as the skewness from the LES.This might be due to the sampling strategy of the observational data with the aircraft.The RICO project was targeting for early stage growing shallow cumulus towers from initiation to early rain formation.The statistics for the observational data set is therefore biased toward those types of clouds and away from fully developed, later stage imodal PDF of s from the RICO case at cloud base with sk = 0.04.While the LES data shows l distribution and the double-Gaussian fit is able to capture this shape, the two parameterizations n assuming a single-Gaussian distribution for vanishing skewness.For further explanation of the Fig. 1. 27 Fig. 4. Bimodal PDF of s from the RICO case at cloud base with sk = 0.04.While the LES data shows a bimodal distribution and the double-Gaussian fit is able to capture this shape, the two parameterizations coincide in assuming a single-Gaussian distribution for vanishing skewness.For further explanation of the legend see Fig. 1. clouds (A. Schanot, personal communication, 2012) while with LES all stages of the life cycle of such clouds are modeled.Therefore in the observational data regions with particularly high s are undersampled.Nevertheless, the few data points with high skewness obtained from observational data of the RICO campaign fit well into the range of values found from LES and align rather with the introduced closure equations than with the ones from Larson et al. (2001a).
A difficulty in the parameterization of Larson et al. (2001a) as well as in the new parameterization is the treatment of distributions that are characterized by sk ≈ 0. Both sets of closure equations are constructed such that at sk = 0 the normalized standard deviations σ 1 /σ = σ 2 /σ = 1, that is for the closure equations the double-Gaussian distribution collapses to a single-Gaussian distribution as the skewness vanishes.In the LES data in the range of sk ≈ 0, distributions that match a single-Gaussian distribution occur as well as bimodal double-Gaussian distributions, where the two modes balance in a way that the skewness almost vanishes (Fig. 4).The latter distributions often appear in the cumulus regimes at cloud base and are characterized by σ 1 /σ ≈ σ 2 /σ < 1 (Fig. 3).Though the bimodal distributions with zero skewness cannot be captured adequately by the closure equations, the induced error is relatively small and will be discussed again in Sect. 4.
Knowing the first three moments of the distribution of s for a certain model level, σ 1 and σ 2 can now be calculated via the closure equations (Eq.4), while a, s 1 and s 2 are obtained from the definition of the first three moments of a double-Gaussian distribution (Larson et al., 2001a, Eqs. 22-24 therein): where Eq. ( 5) may be solved numerically for a.Alternatively, to avoid an iterative solution for a more computational efficient implementation in a GCM or an NWP model, an (e.g., polynomial or matched asymptotics) approximation of a as a function of sk can be used.For the present analysis however, we solve for a numerically using a simple bisection method with an accuracy of 10 −6 which typically took about 30 iterations.
Comparing the parameterized distribution of s to the original LES data in all four case studies (Fig. 1, ASTEX and ARM not shown), we find that the new parameterization is able to represent the differences in the distribution of s in a shallow cumulus regime as well as a stratocumulus regime and therefore represents the tails in a shallow cumulus regime better than the parameterization by Larson et al. (2001a).
Having determined a double-Gaussian PDF of s, the cloud fraction, C, and the average liquid water of a large-scale grid box, q l , are found by integration: Note that for the introduced parameterization the normalized parameters of the double-Gaussian PDF (a, (s 1 − s)/σ , (s 2 −s)/σ , σ 1 /σ , σ 2 /σ ) only depend on sk (Eqs.4-7).Therefore with Q 1 ≈ s/σ , Eqs. ( 8) and ( 9) can be rearranged such that C and the normalized average liquid water, q l /σ , are functions of sk and Q 1 only.In contrast to the cloud fraction and the average liquid water, the liquid water flux cannot be found analytically by taking only s into account, but it also depends on the vertical velocity, w.Instead of using a joint PDF of s and w, we are here heading for a more straightforward way following Cuijpers and Bechtold (1995).They determined the liquid water flux, w q l , from the flux of s, w s , by where C is the cloud fraction.F is a proportionality constant that for C < 1.0 can be interpreted as a measure of which part of the joint PDF of w and s is found in the cloudy part of the domain.Therefore, lim C→1.0 F = 1.0.Using coarse resolution LES data of shallow cumulus and stratocumulus cases, Cuijpers and Bechtold (1995) found a dependence of F on the normalized saturation deficit, Q 1 , and sk with the dependence on sk most notable near cloud base where sk is close to zero.Nevertheless, they suggest that F is described fairly well as a function of Q 1 only, giving F = exp(−1.4Q 1 ) for Q 1 ≤ 0 and F = 1.0 for Q 1 > 0. Using Eq. ( 10), we find from the different LES cases a dependence of F on both Q 1 and sk (Fig. 5, ARM and DYCOMS not shown).Using our training data sets (RICO, Fig. 5a, and DYCOMS), we propose with a = 1.5 and b = 0.25 for a new parameterization.The proposed parameterization seems to be appropriate also for the testing data sets (ARM and ASTEX, Fig. 5b).Because this new parameterization is too sensitive to high sk for Q 1 < −4.0 and therefore gives unreasonable values at a thin layer near cloud top, we limit their range of application to Q 1 ≥ −4.0.We find Q 1 < −4.0only in a thin layer at cloud top, where the liquid water flux is close to zero.A similar unreasonable behaviour is found for the parameterization of Cuijpers and Bechtold (1995) and we will therefore apply the same limit to both parameterizations when testing it in the following with LES data.In a GCM or an NWP model the cloud top behaviour is very sensitive to the interplay of the cloud parameterization and the boundary layer scheme.Therefore a meaningful validation of the cloud top behaviour should be done in such a model with all feedbacks present.However, as a first attempt w q l = 0 for Q 1 < −4.0 might be sufficient.

A priori testing of the cloud closure
Having introduced a new set of closure equations for σ 1 /σ , σ 2 /σ and F (Eq. 4 and Eq.11, respectively), we now analyse the quality of the new parameterizations with a priori a) RICO testing in LES and by comparing the introduced parameterizations with parameterizations from the literature.Note that the usefulness of a priori testing is in the assessment of validity and accuracy of the parameterizations assumptions (see e.g., Pope, 2000, p. 601).To decide which parameterization is most useful in a certain NWP model or GCM a comparison based on a posteriori testing has still to be done.In Fig. 6, the new parameterization and the parameterization of Larson et al. (2001a) are shown compared to the LES data of the ASTEX case, which is one of the testing data sets.We focus on the cumulus part of the ASTEX case  (positive sk and negative Q 1 ) because the main differences between these two parameterizations are found for the cumulus regime.For stratocumulus the two parameterizations differ only marginally.
For high positive skewness it is found that the new parameterization reproduces the LES data better than the parameterization of Larson et al. (2001a) which overestimates C and q l for a given Q 1 .Remember that zero skewness for the closure equations equals the case of a single-Gaussian distribution of s (like assumed in Sommeria and Deardorff, 1977;Mellor, 1977), while in the LES data bimodal distributions occur as well.In this case and with increasing normalized saturation deficit (which at cloud base corresponds to increasing height), the parameterizations first overestimate and later underestimate the cloud fraction.For the normalized average liquid water the effect is less relevant (see also Fig. 7).
To give an estimate of the error of the different parameterizations, the profiles of C, q l and w q l from the LES test data sets are compared with the results of the different parameterizations (Fig. 7a, b and c).The new parameterization is able to reproduce the profiles of C and q l in the shallow cumulus layer better than the parameterization using the closure equations from Larson et al. (2001a).Both cloud schemes are clearly superior to a single-Gaussian cloud closure, which severely underestimates q l and C and in particular is hardly able to diagnose any liquid water between cloud base and cloud top in the shallow cumulus layer.For the stratocumulus layer, the three parameterizations do not differ noticeably.A distinct difference between testing error (as in ASTEX; Fig. 7a and b) and training error (as in RICO; Fig. 7d and e) is not found.
For the profiles of w q l , Eq. ( 10) is used with F parameterized like suggested for the new parameterization.For comparison the parameterization by Cuijpers and Bechtold (1995) using an exponential fit of F that only depends on Q 1 is also shown in Fig. 7c for the ASTEX case.The new parameterization is able to reproduce the shape of the profiles of w q l as well as their absolute values.Again, for stratocumulus the two parameterization do not differ noticeably.
To estimate the effect of C in the new parameterization, C used in Eq. ( 10) has either been taken from the original LES data or from the new parameterization.It is shown that C has a minor influence on the profile compared to the difference between the two different parameterizations of F .At the top of the cumulus layer for both the test data set ASTEX and the training data set RICO the new parameterization underestimates w q l .Note again that for a shallow layer with Q 1 < −4.0 at cloud top the parameterizations of the liquid water flux are not valid while the liquid water flux is close to zero.
For a more quantitative analysis, the errors of the different parameterizations are summarized for the testing data sets in Table 2 and for the training data sets in Table 3.The different error metrics used are the mean absolute error, l 1 , the root mean square error, RMSE, the maximum absolute error, l ∞ and the bias.Their computation formulas are given in the caption of Table 2.For the cloud fraction and the average liquid water, the proposed closure equations are fit to the LES data sets of RICO and DYCOMS.Therefore the new parameterization is optimized for RICO and DYCOMS and the error given for the new parameterization for those cases is a training error which is potentially lower than the error of an independent test data set.Nevertheless, we do not find a perceptible higher error for the test data sets ASTEX and ARM compared to the training data sets RICO and DYCOMS.
For all four LES data sets, the single-Gaussian parameterization performs poorly compared to the other two parameterizations which are based on double-Gaussian distributions.
and i being a index for different vertical levels and output time steps.Values shown are averages over the last three output time steps of the LES data, where clouds are present, and over all vertical levels, where either x LES,i or x para.,i are nonzero.To calculate w q l para., C LES has been used in Eq. ( 10).Smallest errors are printed in bold, largest in typewriter.Note that the parameterizations of w q l is only valid for Q 1 ≥ −4.0, while C and q l are calculated over the whole range of Q 1 .
Table 3. Errors of the different parameterizations for the training data sets, RICO and DYCOMS.For further description of the abbreviations and error measures please see Table 2.
Though the double-Gaussian parameterizations are restricted to their double-Gaussian families by the respective closure equations, both double-Gaussian families are able to represent skewed distributions while a single-Gaussian distribution is not skewed.Therefore the double-Gaussian families are able to represent both cumulus and stratocumulus.For stratocumulus the absolute values of skewness are less than for cumulus, therefore the difference in the errors between the single-Gaussian and the double-Gaussian parameterizations is smaller.
Comparing the two parameterizations based on double-Gaussian distributions, the new parameterization matches the LES data better than the parameterization by Larson et al. (2001a) for ASTEX, whereas for ARM the two parameterizations have similar error magnitudes (Table 2).This is reasonable, because the closure equations have most notably been changed for high positive skewness which frequently occurs in ASTEX but is rather scarce for ARM.The same effect can also be found in the training error (Table 3).While a lower error of the new parameterization compared to the error of the parameterization of Larson et al. (2001a) is found for RICO (where high positive skewness occurs frequently), similar error magnitudes are found for DYCOMS (where the skewness is small).
For the liquid water flux, the error of the parameterization can be reduced distinctly by the new parameterization compared to the parameterization of Cuijpers and Bechtold (1995).The new parameterization depends on Q 1 as well as on sk while the parameterization of Cuijpers and Bechtold (1995) is only dependent on Q 1 .The additional dependence of the new parameterization on sk enables a more precise estimation of F which reduces the error in all four LES cases.mesoscale structures NWP models approach resolutions of only a few kilometers (e.g., Baldauf et al., 2011) which is considerably less than the domain sizes of all our LES cases.Hence, the question arises if the introduced PDF scheme is still applicable at such resolutions.We therefore investigate the dependence of the error of the new parameterizations on the domain size considered.To do so the domain of the four different RICO simulations has been divided into subdomains, the RMSE and the bias have been calculated in each subdomain and then averaged over all subdomains of the same size.These subdomains in our analysis of the LES data correspond to the grid spacing of an NWP or mesoscale model.The RICO simulations used differ in their overall domain size as well as in the initial humidity profiles of the simulations, giving "standard RICO" and "moist RICO" simulations (see Sect. 2.1.4).
For subdomain sizes smaller than 5 km, the RMSE increases rapidly with decreasing subdomain size for both standard and moist RICO simulations (Fig. 8).This rapid increase is probably due to the subdomain size approaching the size of individual cloud structures (i.e., larger cumulus clouds).When these two scales converge, the variability increases rapidly and a continuous, smooth distribution like the proposed family of double-Gaussian PDFs cannot appropriately represent the shapes of the subdomain PDFs.This results in a larger spread of the LES data around the closure equations and consequently in an increasing RMSE with decreasing subdomain size.The increasing RMSE can be interpreted such that the PDF-based, deterministic scheme becomes inappropriate at such small scales and one would have to use a stochastic approach instead.
With standard initial conditions, rain rates are small and no mesoscale structures develop, that is, the cloud field remains random.Then, for subdomain sizes larger than 10 km the RMSE is small, being around 0.005 and 0.001 g kg −1 for cloud fraction and liquid water, respectively.With moist initial conditions, precipitation appears more readily and mesoscale structures, that is, cloud streets, mesoscale arcs and cold pools, develop from 20 h onwards as discussed by Seifert and Heus (2013).In these moist cases and with subdomain sizes larger than 10 km, the cloud fraction as well as the liquid water are mostly overestimated by the double-Gaussian parameterization (positive bias).The RMSE amounts to about 0.017 and 0.04 g kg −1 for cloud fraction and liquid water, respectively, which for each variable corresponds to roughly 10 % of their respective maximum values.With decreasing subdomain size the RMSE for the moist RICO simulations decreases until the subdomain size reaches 5-10 km.At such subdomain sizes the RMSE is similar for standard and moist RICO simulations.For the moist RICO simulations and large subdomain sizes, the PDFs of s have comparatively longer tails with few very high values of s.This different shape emerges from the more localized but more intense convection and the large cloud free cold pool areas in the moist RICO case.The parameterized double-Gaussian PDF, which is fitted to nonorganized random cloud fields with small rain rates, is not able to capture the longer tails of the distributions of s adequately.Therefore, for a given skewness the normalized variance σ 1 /σ is underestimated for moist RICO simulations with mesoscale structures.
The discussed error dependence on the domain size and the investigation of the moist RICO case show, on the one hand, that even with a perfect knowledge of the first three moments of the PDF of s it remains challenging to construct a parameterization which is truly scale adaptive.On the other hand, the statistics of the cloud field at small scales seems to be independent enough from the mesoscale structures and higher rain rates to make the PDF scheme useful for a broader range of cloud regimes than the original LES data set used for the parameterization.Taking into account both the increasing error at very small subdomain sizes and the difficulties of the scheme to represent cloud properties in the moist RICO case, we conclude that the proposed scheme is most appropriate for NWP models or GCMs with horizontal resolution of about 5-20 km.
For the liquid water flux, the new parameterization does not depend explicitly on a certain family of PDFs but the factor F is directly parameterized and depends on Q 1 and sk.With this parameterization the error of the liquid water flux seems to be less dependent on the development of mesoscale structures and higher rain rates, possibly because there is no direct dependence of the parameterization on the shape of the PDF of s.A dependence of the error of the liquid water flux on the subdomain size is found in accordance with the error of the cloud fraction and average liquid water.

Extension to autoconversion rate
Autoconversion of cloud droplets to rain drops is a key process in the formation of precipitation in warm clouds.Besides the cloud fraction, the average liquid water and the liquid water flux discussed above, the autoconversion rate is another variable that depends among others on the variability of the liquid water mixing ratio (e.g., Pincus and Klein, 2000).In simple autoconversion schemes (e.g., Kessler, 1969;Sundqvist, 1978), other dependencies are neglected and the autoconversion rate only depends on the liquid water mixing ratio.With this simplification the autoconversion rate can also be handled by PDF-based schemes.
Following Kessler (1969, K69) and replacing the liquid water mixing ratio with the extended liquid water mixing ratio, the autoconversion rate, A K69 , is given as a) RMSE of C b) RMSE of q l c) RMSE of w q l d) bias of C e) bias of q l f) bias of w q l  where H is the Heaviside step function, s crit = 0.5 g kg −1 is a critical threshold below which no autoconversion occurs and k is a rate constant set to k = 10 −3 s −1 .
Alternatively, Khairoutdinov and Kogan (2000, KK00) suggested a parameterization based on data from a single large-eddy simulation using spectral bin microphysics, that is, resolving the drop size distribution explicitly.They found that a good fit to the bulk autoconversion rate is with c 1 = ( 5.829×10 6 N c ) c 2 and c 2 = 1.89.Within the factor c 1 , they introduced a dependence on the number of cloud droplets, N c .Because N c in UCLA-LES is assumed to be constant throughout a simulation, c 1 can be treated as constant in this study.
For both autoconversion schemes, K69 and KK00, the domain-averaged autoconversion rate, A, is then found by integration over the PDF of s: with s 0 = s crit for Kessler (1969) and s 0 = 0 g kg −1 for Khairoutdinov and Kogan (2000).While the integral can be solved analytically for Kessler (1969), this is not possible for the scheme of Khairoutdinov and Kogan (2000) because the exponent of s, c 2 , is not a natural number.Seifert and Beheng (2001, SB01) derived an explicit equation for the autoconversion rate which is formulated using Long's piecewise polynomial collection kernel and a universal function that is estimated by numerically solving the stochastic collection equation.Doing so they arrived at with k au = 6.808 × 10 18 m 3 kg −1 s −1 and k τ = 1 + au (τ ) (1−τ ) 2 .Here ρ 0 is the base state density depending on height and au (τ ) is a universal function depending on the internal timescale, τ = 1 − q l /(q l + q r ), designed to take into account the broadening of the droplet spectrum with time.Note that q r is the rain water content which is not included in q l .This dependence on the internal timescale makes it impossible to integrate A SB01 according to Eq. ( 14) as long as the PDF of τ is unknown in terms of the PDF of s which would require the use of a joint PDF or even the introduction of time correlations to the problem.Nevertheless, as the SB01 autoconversion rate is expected to give more realistic results than the www.geosci-model-dev.net/6/1641/2013/simple autoconversion schemes described above, the SB01 autoconversion rate is used as a reference to be compared to the other autoconversion schemes.In our study the full 4-D field of τ is, of course, known from LES and a compensatory factor for k τ can be determined for each level and each time step individually by solving = k τ,LES (z, t) s 4 x i , y j , z, t H (s) for k τ,LES .Here nx is the number of LES grid boxes in each horizontal direction.Then the ability of the new double-Gaussian parameterization to be used in combination with the SB01 autoconversion rate can be tested using k τ,LES : Note that for the use in an NWP model or a GCM, k τ,LES would have to be estimated by some other method and that k τ,LES is not equal to a horizontal mean of k τ .
From Fig. 9 showing the different autoconversion rates for the ASTEX case, it is apparent that the profiles of the autoconversion rate differ substantially both in shape and by several orders of magnitude in absolute value among the different parameterizations of the autoconversion rate (K69, KK00, SB01).While the single-Gaussian cloud closure only captures the stratocumulus type cloud layer around 2100 m, the new double-Gaussian cloud closure is additionally able to diagnose the autoconversion rate quite accurately for the cumulus layer.The same results hold for the other three LES cases (not shown).
Using k τ,LES as described above, the new double-Gaussian cloud closure is able to reproduce the profile of the SB01 autoconversion rate well for most heights.This is remarkable because A SB01 is proportional to the 4th moment of s which makes A SB01 especially sensitive to errors introduced by the cloud closure.Nevertheless, at the cloud top of the stratocumulus layer the new double-Gaussian cloud closure overestimates the SB01 autoconversion rate.This overestimation might be related to the difficulties of LES in resolving the strong gradients that occur at a stratocumulus cloud top.
Using the closure equations of Larson et al. (2001a) (as it is done exemplary with the KK00 parameterization in Fig. 9) compared to using the new closure equations gives small and probably negligible differences in the cumulus layer.
Overall the double-Gaussian PDF scheme is successful in capturing the effect of the sub-grid variability on the autoconversion rate, which is crucial for the representation in the cumulus layer.Nevertheless, the uncertainty due to the choice of the autoconversion scheme itself remains.Especially the K69 scheme leads to a strong overestimation compared to KK00 and SB01, but also KK00 shows a much higher autoconversion rate in the lowest part of the cumulus cloud layer compared to SB01.24 h); in the moist RICO cases mesoscale structures develop, while in the standard cases the clou random.

Conclusions
We introduce a refined statistical cloud closure using double-Gaussian PDFs.Following the work of Larson et al. (2001a), who provided an elegant framework for a diagnostic parameterization of the cloud fraction and the average liquid water, we modified their parameterization especially in the case of strong positive skewness of the distribution of the extended liquid water mixing ratio, s, that is, for shallow cumulus clouds.The introduced double-Gaussian closure is based on different LES case studies and is supported by observational data from aircraft measurements in shallow cumulus.It is relying on the first three moments of s as input parameters and is shown to be superior in diagnosing the cloud fraction and average liquid water profiles compared to a single-Gaussian approach that only needs the first two moments of s for input.A priori testing also suggests improved accuracy compared to existing double-Gaussian closures.
For the liquid water flux, we introduce a new parameterization of the factor F which is relating the liquid water flux to the flux of s.With F depending on the skewness of the distribution of s and the normalized saturation deficit, the new parameterization is able to reproduce the shape of the profiles of the liquid water flux better than when the dependence of the skewness is not retained.
The dependence of the error of the parameterization on the domain size and the appearance of mesoscale structures has also been tested a priori with LES.Below a domain size of about 5 km the error of the parameterization of the cloud fraction, the average liquid water and the liquid water flux is increasing rapidly with decreasing domain size.If mesoscale structures occur that are accompanied by higher rain rates and the domain size is chosen large enough to include these mesoscale structures, the error of the parameterization of the cloud fraction and the liquid water is larger than without the occurrence of mesoscale structures.Considering the liquid water flux, the error of the parameterization seems to be insensitive to the occurrence of mesoscale structures.
Finally, the cloud scheme has been applied to diagnose the autoconversion rate.Using autoconversion schemes of different complexity, the new parameterization is able to reproduce profiles of the autoconversion rate adequately.The differences between the various autoconversion schemes are much larger than the error introduced by the double-Gaussian closures.
As a next step, a posteriori testing of the introduced parameterization in a NWP model or a GCM that diagnoses or predicts the first three moments of s, for example, from a higher-order closure boundary layer model (Machulskaya and Mironov, 2013), is essential to decide which parameterization is most useful in the chosen NWP model or GCM.However, such a analysis is beyond the scope of this study and therefore left for further research.
Fig. 1.PDF of s for a specific height in the cloud layer.Furthermore, the corresponding best skewness-re double-Gaussian fit (DG-Fit) and the resulting PDF when using the closure equations from Larson et al. ( (Eq. 3) and the introduced closure equations (Eq.4) are shown.It is ∆s = s−s.The black, dashed line in the saturation value (s=0).

25Fig. 1 .
Fig. 1.PDF of s for a specific height in the cloud layer.Furthermore, the corresponding best skewness-retaining double-Gaussian fit (DG-Fit) and the resulting PDF when using the closure equations from Larson et al. (2001a) (Eq. 3) and the introduced closure equations (Eq.4) are shown.It is s = s − s.The black, dashed line indicates the saturation value (s=0).
from LES data of DYCOMS and from a DNS study.Both PDFs are calculated at a height level d top where the variance of horizontal winds are at their respective maximum.The DNS data local study of turbulent mixing at cloud top, due solely to evaporative cooling (Mellado et al., σ 1 b) LES data for σ 2 c) observational data for σ 1 ta of ARM, ASTEX, RICO and DYCOMS and observational data from the RICO campaign losure equations from Larson et al. (2001a) (dashed line) and the new closure equations (solid the new closure equations are fitted to the LES data of RICO and DYCOMS rather than to all tudies.The grey shading in (c) corresponds to two times the standard deviation from the four).The legend in (a) also applies to (b).

Fig. 2 .
Fig. 2.PDF of s from LES data of DYCOMS and from a DNS study.Both PDFs are calculated at a height level close to the cloud top where the variance of horizontal winds are at their respective maximum.The DNS data corresponds to a local study of turbulent mixing at cloud top, due solely to evaporative cooling(Mellado et al., 2010).

Fig. 3 .
Fig. 3. LES data of ARM, ASTEX, RICO and DYCOMS and observational data from the RICO campaign along with the closure equations from Larson et al. (2001a) (dashed line) and the new closure equations (solid line).Note that the new closure equations are fitted to the LES data of RICO and DYCOMS rather than to all available case studies.The grey shading in (c) corresponds to two times the standard deviation from the four LES cases in (a).The legend in (a) also applies to (b).

Fig. 5 .Fig. 5 .
Fig. 5. New parameterization of F (dashed lines) as a function of the normalized saturati skewness along with the parameterization ofCuijpers and Bechtold (1995, CB95)  and the LE

Fig. 6 .
Fig.6.The parameterizations (dashed lines) as a function of the normalized saturation deficit and the skewness shown together with the LES data of the ASTEX case (crosses).

29Fig. 6 .
Fig.6.The parameterizations (dashed lines) as a function of the normalized saturation deficit and the skewness shown together with the LES data of the ASTEX case (crosses).

Fig. 7 .Fig. 7 .
Fig. 7. Profiles of cloud fraction, average liquid water and the liquid water flux from LES cases ASTEX (testing dataset) after 25 h and RICO (training dataset) after 36 h of simulation.For the liquid water flux, C used in Eq. (10) has either been taken from the original LES data (C LES) or from the new parameterization (C new).The legend in (a) also applies to(b,d,e), the legend in (c) also applies to (f).Note the logarithmic scale on the x-axis in (a) and (b).

Fig. 8 .Fig. 9 .
Fig. 8. Dependence of the error of the parameterized cloud fraction, liquid water and liquid water flux on the domain size.Shown are different simulations of the RICO case (average error over two output time steps after 24 h); in the moist RICO cases mesoscale structures develop, while in the standard cases the cloud field remains random.

Fig. 8 .
Fig. 8. Dependence of the error of the parameterized cloud fraction, liquid water and liquid water flux on the domain size.Shown are simulations of the RICO case (average error over two output time after 24 h); in the moist RICO cases mesoscale structures develop, while in the standard cases the cloud field remains random.

Fig. 9 .
Fig.9.Profile of the autoconversion rate in ASTEX after 25 h of simulation.Note the logari the x-axis.Notation: LES: autoconversion rate calculated using the full 3D-field of LES dat Gaussian parameterization, DG: double-Gaussian parameterization using the new closure equat double-Gaussian parameterization using the closure equations fromLarson et al. (2001a). 31

Fig. 9 .
Fig.9.Profile of the autoconversion rate in ASTEX after 25 h of simulation.Note the logarithmic scale on the x axis.Notation: LES: autoconversion rate calculated using the full 3-D field of LES data, SG: single-Gaussian parameterization, DG: double-Gaussian parameterization using the new closure equations, DG L01: double-Gaussian parameterization using the closure equations fromLarson et al. (2001a).

Table 1 .
Overview of the different LES cases used in this study.The four cases on the left hand side are used to develop (DYCOMS and RICO) and test (ARM and ASTEX) the parameterizations introduced in this study.The three cases on the right hand side are solely used in the Sect. 5.
nx: number of grid points in each horizontal direction, L: horizontal domain size, H : vertical domain size, x: horizontal resolution, z: vertical resolution, t: length of simulation.

Table 2 .
Errors of the different parameterizations for the testing data sets, ASTEX and ARM.