Evaluation of a Quasi-steady state approximation of the cloud Droplet Growth Equation (QDGE) scheme for aerosol activation in global models using multiple aircraft data over both continental and marine environments

This research introduces a numerically efficient aerosol activation scheme and evaluates it by using stratus and stratocumulus cloud data sampled during multiple aircraft campaigns in Canada, Chile, Brazil, and China. The scheme employs a Quasi-steady state approximation of the cloud Droplet Growth Equation (QDGE) to efficiently simulate aerosol activation, the vertical profile of supersaturation, and the activated cloud droplet number concentration (CDNC) near the cloud base. The maximum supersaturation values calculated using the QDGE scheme were compared with multiple parcel model simulations under various aerosol and environmental conditions. The differences are all below 0.18 %, indicating good performance and accuracy of the QDGE scheme. We evaluated the QDGE scheme by specifying observed environmental thermodynamic variables and aerosol information from 31 cloud cases as input and comparing the simulated CDNC with cloud observations. The average of the mean relative error ($\overline{\mathrm{MRE}}$) of the simulated CDNC for cloud cases in each campaign ranges from 17.30 % in Brazil to 25.90 % in China, indicating that the QDGE scheme successfully reproduces observed variations in CDNC over a wide range of different meteorological conditions and aerosol regimes. Additionally, we carried out an error analysis by calculating the Maximum Information Coefficient (MIC) between the mean relative error (MRE) and input variables for the individual campaigns and all


Introduction
Aerosols play an important role in affecting the radiation balance of the earth-atmosphere system by scattering and absorbing shortwave radiation and altering cloud reflectivity and lifetime (Twomey, 1974, 1977; Ghan, 2013; Forster et al., 2016; Ramaswamy et al., 2019; Wang et al., 2020). Aerosol-cloud interactions remain one of the largest sources of climate modeling uncertainty (IPCC AR6, Forster et al., 2021).
Aerosol-cloud interactions are largely driven by the activation of aerosols to form cloud droplets. The addition of activated aerosol to existing clouds can directly change the concentration and size of cloud droplets and thereby affect the microphysical properties and radiative forcing of the clouds. Aerosol activation is controlled by rapid and nonlinear aerosol and cloud microphysical processes (Meskhidze et al., 2005), which have not yet been explicitly resolved in climate models (Fountoukis et al., 2007; Kang et al., 2015). Nenes et al. (2001) pointed out that the cloud droplet activation process is subject to kinetic limitations, including inertial, evaporation, and deactivation mechanisms, which further add to the complexity of aerosol activation.
Early parameterizations of aerosol activation in climate models were based on observations and derived through parameter fitting, using the aerosol number or mass concentration or other Cloud Condensation Nuclei (CCN) proxies (e.g., sulfate mass) to empirically determine the activated CDNC (Jones et al., 1994; Boucher and Lohmann, 1995; Jones and Slingo, 1996; Lohmann, 1997; Kiehl et al., 2000; Menon et al., 2002). Although these parameterizations have the advantages of convenience and low computational burden (Fountoukis et al., 2007), substantial uncertainties result from limited spatiotemporal representativeness and unresolved variations in aerosol properties (Meskhidze et al., 2005). In the past two decades, physically based parameterization schemes of aerosol activation have emerged (Abdul-Razzak and Ghan, 2000; Cohard et al., 2000; Fountoukis and Nenes, 2005; Ming et al., 2006; Kivekäs et al., 2008; Khvorostyanov and Curry, 2009; Shipway and Abel, 2010; Zhang et al., 2015). These schemes are based on Köhler theory and are used in climate models to parameterize aerosol activation near the cloud base. As Köhler theory fundamentally describes the process by which water vapor condenses and forms liquid cloud droplets, it can be applied to a wide range of atmospheric conditions and aerosol pollution levels. However, considerable approximations of Köhler theory are employed for application in climate models, which leads to potential biases in comparison with results from more rigorous and accurate simulations of cloud droplet growth with adiabatic parcel models (e.g. Ghan et al., 2011). The ongoing increase in computing power (Herrington and Reed, 2020) reduces the need for cost-saving approximations in climate models. In the following, we introduce a Quasi-steady state approximation of the cloud Droplet Growth Equation (QDGE) that provides an efficient alternative to parameterizations of the activated CDNC in climate models.
Parameterization schemes of aerosol activation have often been evaluated using adiabatic parcel model simulations. These models explicitly solve aerosol activation and droplet growth processes by mimicking the vertical uplift of an air parcel containing a specified number of aerosol particles, predicting changes in temperature, humidity/supersaturation, activation of aerosols, and droplet growth from the cloud base upward. When identically specified aerosols are used, the results of a parcel model can serve as a benchmark to evaluate parameterizations. This approach has been used extensively to evaluate activation schemes (Table 1). Alternatively, a less commonly used approach is to evaluate parameterizations by conducting a "closure experiment", that is, to carry out a parameterized calculation by specifying observed aerosol concentrations and environmental thermodynamic conditions, and then compare the calculated and observed CDNC (e.g. Snider and Brenguier, 2000; Guibert et al., 2003; Fountoukis and Nenes, 2005; Kivekäs et al., 2008).
Though some parameterizations have been evaluated based on comparisons of simulated and observed CDNC from aircraft campaigns, mostly regional data sets were used, for very specific meteorological conditions and pollution levels. It is essential to select a wide range of cloud data for different atmospheric conditions and pollution levels to arrive at meaningful conclusions for global climate model simulations.
In this study, we introduce the QDGE scheme and evaluate it by using cloud data from multiple aircraft campaigns in four different regions around the world, covering marine and continental conditions. This paper is organized as follows. The next section describes the QDGE scheme, and Sect. 3 summarizes the data and method used for the closure experiment and the evaluation. Section 4 presents the results of the closure experiment and analyzes the sources of simulation errors, followed by conclusions and discussion in Sect. 5.

Scheme description
Aerosol particles that are suspended in a parcel of air activate and grow into cloud droplets by condensation of water vapor if supersaturation with respect to water exceeds a critical value. In stratus and convective clouds, aerosol activation is particularly efficient in the vicinity of the cloud base, where supersaturation typically reaches its local maximum. Although observations provide evidence that aerosol activation is not limited to the region near the cloud base, this is omitted in the aerosol activation scheme described here, similar to most parcel models and parameterizations.
In order to determine the portion of the aerosols that activates and forms cloud droplets, a numerically efficient solution of the condensational droplet growth equation (e.g. Seinfeld and Pandis, 2016) is employed to simulate the growth of an ensemble of aerosol particles near the cloud base. The water vapor saturation ratio and the number of activated cloud droplets above the cloud base are simulated by assuming an air parcel containing aerosols from below the cloud base, which ascends vertically to produce supersaturated conditions above the cloud base. The vertical velocity of the air parcel, w (in m s⁻¹), is either specified or parameterized.

The change in wet aerosol particle radius (in m) by condensation of water vapor, as a function of the environmental water vapor saturation ratio (S; e.g. Emanuel, 1994), is given in the scheme by Eq. (1). The equilibrium water vapor saturation ratio directly over the surface of the particle, which appears in Eq. (1), is obtained from κ-Köhler theory (Petters and Kreidenweis, 2007; Eq. 2). The parameters in Eqs. (1) and (2) account for thermodynamic conditions in the cloud and physicochemical properties of the aerosol particles and droplets (Appendix A).

As described below, the QDGE scheme solves Eqs. (1) and (2) in combination with energy and moisture budgets to calculate changes in S driven by thermodynamic processes. For instance, the thermodynamic equations underlying the QDGE scheme can be used to obtain the temporal evolution of S in the air during adiabatic ascent near cloud base (Ghan et al., 2011; Eq. 3). The parameters in Eq. (3) are weak functions of temperature and pressure, and the liquid water mixing ratio in Eq. (3) is related to the activated particle size distribution (Appendix A).
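For orientation, relations of this kind are commonly written in the forms sketched below. The notation is an assumption made here for illustration (r, r_d, G, S_eq, A, α, γ, and W are not necessarily the symbols or groupings used in this paper); the forms follow the standard condensational growth equation, the κ-Köhler relation of Petters and Kreidenweis (2007), and the supersaturation budget used by Ghan et al. (2011).

```latex
% Assumed notation: r = wet radius, r_d = dry radius, G = growth coefficient,
% S = ambient saturation ratio, S_eq = equilibrium saturation ratio over the droplet,
% w = vertical velocity, W = liquid water mixing ratio.
\begin{align}
\frac{dr}{dt} &= \frac{G}{r}\,\bigl(S - S_{eq}\bigr), \\
S_{eq} &= \frac{r^{3} - r_{d}^{3}}{r^{3} - r_{d}^{3}\,(1-\kappa)}\,
          \exp\!\left(\frac{A}{r}\right),
\qquad A = \frac{2\,\sigma\,M_{w}}{R\,T\,\rho_{w}}, \\
\frac{dS}{dt} &= \alpha\,w - \gamma\,\frac{dW}{dt}.
\end{align}
```

In this form, the first term of the last relation is the source of supersaturation from adiabatic cooling during ascent, and the second term is the sink from condensation onto growing droplets.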
Theoretically, each growing aerosol particle competes with the others for the water vapor in the environment: the particle size increases according to Eq. (1) and affects the environmental supersaturation through Eq. (3). Eqs. (1-3) are coupled in a complex manner and thus hardly have an analytical solution.

The parameters in these equations depend on the aerosol hygroscopicity (κ), the surface tension of the solution/air interface (which is approximated by the surface tension of water here), the density of water, the molecular weight of water, the universal gas constant, the temperature, the dry aerosol particle radius, the saturation vapor pressure, the latent heat of vaporization, the modified thermal conductivity of air accounting for non-continuum effects, and the modified diffusivity of water vapor in air accounting for non-continuum effects (Seinfeld and Pandis, 2016). Petters and Kreidenweis (2007) and Kreidenweis et al. (2008) provided tabulated values of the hygroscopicity parameter for a variety of chemical compounds, based on laboratory data and modeling. They found that parameterized water contents are often within experimental uncertainty. However, the accuracy of this approach tends to decrease with decreasing aerosol water content. In particular, simulations of highly concentrated, non-ideal aqueous solutions with strong electrostatic interactions between ions with the Aerosol Inorganic Model (AIM; Wexler and Clegg, 2002; http://www.aim.env.uea.ac.uk/aim/aim.html) give evidence for systematically different results at low aerosol water contents for some compounds (Kreidenweis et al., 2008). In order to reduce biases at low relative humidity, the original method was extended in the QDGE scheme to account for variations in κ with relative humidity. Specifically, piecewise-linear relationships between κ and aerosol water activity for different chemical components were determined based on results from AIM.

Direct numerical solutions of Eq. (1) are applicable but computationally expensive. For example, Eq. (3) indicates that the balance between the enhancement of S due to the uplift of the air parcel and the reduction of S due to the condensational growth of activated particles depends on the aerosol size distribution and chemical composition, which leads to a highly non-linear variation of S with time/height in the ascending parcel of cloud air. The condensational growth is also non-linearly related to the environmental conditions and aerosol properties (Eqs. 1 and 2). Typically, time steps much shorter than 1 second are required to numerically solve these equations, which implies computational expenses that would prohibit applications in climate models (Khain et al., 2015). For instance, adiabatic ascending parcel models (e.g. Chen et al., 2016; Peng et al., 2005) that numerically solve Eqs. (1-3) require a very high time resolution, typically with a time step of 10⁻³ to 10⁻⁴ seconds. The parcel model results are regarded as the most accurate numerical solution and can be used as the benchmark to verify the parameterization of activation and condensation processes.
In large-scale stratus clouds, the maximum supersaturation (usually less than 0.2 %) occurs about 100 m above the cloud base, that is, the rate of change is about 0.002 % m⁻¹. A similar conclusion can be derived from the change of supersaturation and temperature (combined with a lapse rate of atmospheric temperature; Pandis et al., 1990). Therefore, it is reasonable to assume a scale of several seconds (or meters) at which the supersaturation is approximately constant in the air parcel. In this study, we use a Quasi-steady state approximation to solve the Droplet Growth Equation (QDGE), which assumes that S is approximately constant in Eq. (1). Eq. (1) can then be conveniently expressed as Eq. (4) for the time period covered by one sub-time step (roughly several tens of seconds in a climate model), with variable substitutions for particle size (half the squared wet radius) and for time (scaled by |S − 1|), and with parameters that are given by Eqs. (5-7).

In the QDGE aerosol activation scheme, numerical efficiency is achieved by using pre-calculated solutions of Eq. (4), which are provided in the form of lookup tables (LUTs) for different values of the parameters. The S-dependent parameters are determined through an iterative procedure, for each time step and vertical level near cloud base, as described in the following.
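The effect of holding the supersaturation fixed over a sub-step can be illustrated with a deliberately simplified sketch. It neglects the curvature and solute terms of the equilibrium saturation ratio, so the growth law reduces to r dr/dt = G (S − 1), and it is not the lookup-table solution used by the QDGE scheme. The function name and the growth coefficient value are assumptions chosen for illustration only.

```python
import math


def grow_radius_constant_s(r0, s, g_coef, dt):
    """Advance a wet radius r0 (m) over dt (s) with supersaturation held fixed.

    s is the fractional supersaturation S - 1 (e.g. 0.002 for 0.2 %).
    g_coef is an assumed growth coefficient (m^2 s^-1).
    With s constant, x = r^2 / 2 changes linearly in time: x(t + dt) = x(t) + g_coef * s * dt.
    """
    x_new = 0.5 * r0 ** 2 + g_coef * s * dt
    # Clip at zero so that strong evaporation (s < 0) cannot yield a negative r^2.
    return math.sqrt(max(2.0 * x_new, 0.0))


# Illustrative values only: a 1 micrometre droplet at 0.2 % supersaturation for 5 s.
r_new = grow_radius_constant_s(1.0e-6, 0.002, 1.0e-10, 5.0)
```

Because the size variable r²/2 evolves linearly when S is constant, a single evaluation replaces the many sub-second steps that an explicit integration would need over the same interval.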
The major steps of the QDGE scheme to calculate the aerosol activation are shown in Fig. 1. A vertical grid with sub-levels is employed in the QDGE scheme near cloud base, obtained by dividing the grid spacing of the atmospheric host model into equal sub-level spacings (Fig. 1a-b). Calculations are only performed for the first host model grid layer above the cloud base, with typical sub-level spacings of about 1-10 m, to ensure that the supersaturation maximum (Smax) is captured and sufficiently well resolved in model applications of the aerosol activation scheme. The local approximation with constant S applies in each sub-level, and a vertical profile of S is eventually obtained within the host model grid (Fig. 1c). The iterative calculation to obtain S at each sub-level is described below.

The growth calculations are performed for a sub-ensemble of aerosol particles, which are selected from the full dry aerosol size distribution at regular size intervals. The number of intervals is on the order of 5-20, and the simulated particle size range covers Aitken and accumulation mode aerosols, expressed in terms of a dimensionless particle size parameter, ln(r/r0), with r0 = 10⁻⁶ m. In this study, we set the number of intervals to 6 for the closure experiment, meaning that 6 discrete aerosol particle sizes are used. Sizes of other particles in the continuous aerosol size distribution are obtained from linear interpolation between the sizes of the particles in the discrete 6-member sub-ensemble.
In each sub-level, the supersaturation (i.e. Si in Fig. 1b) and the S-dependent parameters in Eq. (4) are obtained through an iterative calculation, which explicitly requires the conservation of mass and energy. The flow chart of the iterative calculation is shown in Fig. 2.

Figure 2. Flow chart of the iterative calculation for the sub-grid supersaturation Si, where I is the number of iterations. PMSD is the particle mass-size distribution. Total water mass mixing ratio, rt, and liquid water static energy, h, are conserved. (Flow chart elements: "rt and h in ith sub-grid (Eqs. 10 and 11)"; "Initial Sest"; decision "I = Imax" with Yes/No branches; "Si = Scal"; "Update Sest using Scal and a bisectional method".)
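The iteration outlined in Fig. 2 can be sketched as a bisection search for the value of Sest at which the recalculated value Scal agrees with the estimate. This is a minimal sketch under stated assumptions: `solve_supersaturation` and `toy_s_cal` are hypothetical names, and `toy_s_cal` is a made-up stand-in for the full sequence of growth, PMSD integration, and budget calculations that produces Scal in the scheme.

```python
def solve_supersaturation(s_cal, lo, hi, tol=1e-8, max_iter=60):
    """Find s_est in [lo, hi] with s_cal(s_est) == s_est using bisection.

    s_cal maps an estimated saturation ratio to the recalculated one.
    The bracket [lo, hi] must enclose a sign change of s_cal(s) - s.
    """
    f_lo = s_cal(lo) - lo
    f_hi = s_cal(hi) - hi
    if f_lo * f_hi > 0.0:
        raise ValueError("bracket does not enclose a solution")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = s_cal(mid) - mid
        if abs(f_mid) < tol or (hi - lo) < tol:
            return mid
        if f_lo * f_mid <= 0.0:
            hi = mid            # solution lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # solution lies in the upper half
    return 0.5 * (lo + hi)


def toy_s_cal(s_est):
    # Hypothetical response: a larger estimate drives more condensation,
    # which lowers the recalculated saturation ratio.
    return 1.004 - 0.5 * (s_est - 1.0)


s_i = solve_supersaturation(toy_s_cal, 1.0, 1.01)
```

Capping the loop with `max_iter` mirrors the role of the maximum iteration count Imax in Fig. 2: the cost per sub-level stays bounded even if the tolerance is not reached.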
At the beginning of an iteration, an initial value of the supersaturation (Sest) is specified ("best guess" estimate) and Eq. (4) is integrated over the sub-time step to obtain a first estimate of the particle wet sizes at the next sub-level. Next, an integration over the particle mass-size distribution (PMSD) yields a first estimate of the liquid water mixing ratio at that sub-level (Fig. 2). The subsequent calculations are based on the total water mass mixing ratio, rt, and the liquid water static energy, h, in the ascending parcel of air, as defined by Eqs. (8) and (9). These definitions involve the water vapor mass mixing ratio, the temperature, the gravitational acceleration, and the heat capacity of dry air at constant pressure. The values of rt and h at the lower and upper boundaries of the current host model grid are first calculated using Eqs. (8) and (9). Then, rt and h in the ith sub-level are obtained by linear interpolation (Eqs. 10 and 11).

Knowing rt and h, the water vapor mass mixing ratio and the temperature in the ith sub-level are derived from Eqs. (8) and (9), using the first estimate of the liquid water mixing ratio described above. Subsequently, the supersaturation is calculated based on the standard definition of the water vapor saturation ratio (Eq. 12), which involves the constant ε ≡ 0.622 and the saturation water vapor mass mixing ratio in the air parcel. The calculated supersaturation (Scal) after each iteration is compared to the estimate (Sest). An improved estimate of Sest is then determined using a bisection method that minimizes the difference between the available estimates of S through iteration, as shown in Fig. 2.

In each grid of the host model, the dry aerosol number-size distribution is represented as particle numbers at regular size intervals, given by the simulated particle size range divided by the number of size bins. The particle size range covers both Aitken and accumulation modes and is expressed in terms of a dimensionless particle size parameter, ln(r/r0), with r0 = 10⁻⁶ m. In this study, we set the number of size bins to 6. The continuous aerosol size distribution (such as in Fig. 1e) can be obtained from linear interpolation using the particle numbers in the 6 discrete size bins.

Currently, only adiabatic processes are considered in each sub-level. Therefore, the total water mass mixing ratio (rt) and liquid water static energy (h) are conserved as the parcel ascends from one sub-level to the next. However, energy and moisture profiles in clouds may be affected by entrainment processes. We therefore additionally consider the effect of entrainment on rt and h above the cloud base by calculating modified values of the total water mass mixing ratio and liquid water static energy that account for the entrainment of air at a specified entrainment rate. These modified values can be used to replace rt and h in Eqs. (10) and (11) when entrainment occurs.
Note that Eq. (3) can only be used for an adiabatic processes and does not work if there is entrainment or radiative cooling of the air, e.g. the formation of cloud droplets in radiation fog. In contrast, the QDGE scheme is much more general, as outlined above. The QDGE scheme can be easily modified for simulations of entrainment and radiation fog if required.
Finally, the maximum value of the simulated vertical profile of the water vapor saturation ratio, Smax (Fig. 1c), is selected and used to diagnose the critical particle size, which separates activated from non-activated particles, by requiring that the critical saturation ratio of the particle equals Smax (Fig. 1d). Particles with sizes that are equal to or greater than the critical size (the dry size corresponding to Smax) are assumed to be activated. Consequently, the cloud droplet number concentration (CDNC) is obtained by integrating the activated particle size distribution accordingly (Fig. 1e).
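This diagnostic step can be sketched as follows. The sketch uses the approximate κ-Köhler relation between critical saturation ratio and dry diameter from Petters and Kreidenweis (2007), which holds for sufficiently hygroscopic particles (κ larger than about 0.2), rather than the scheme's own solution of Eq. (2). The function names, the constant values, and the bin numbers are assumptions for illustration, and number is taken to be uniform in the logarithm of radius within each bin.

```python
import math

SIGMA_W = 0.072   # surface tension of water (J m^-2), assumed value
M_W = 0.018015    # molecular weight of water (kg mol^-1)
RHO_W = 1000.0    # density of water (kg m^-3)
R_GAS = 8.314     # universal gas constant (J mol^-1 K^-1)


def critical_dry_radius(s_max, kappa, temp_k):
    """Dry radius (m) whose critical saturation ratio equals s_max (s_max > 1)."""
    a = 4.0 * SIGMA_W * M_W / (R_GAS * temp_k * RHO_W)
    d_dry = (4.0 * a ** 3 / (27.0 * kappa * math.log(s_max) ** 2)) ** (1.0 / 3.0)
    return 0.5 * d_dry


def activated_number(bin_edges, bin_numbers, r_crit):
    """Sum the number in all bins above r_crit; split the bin that contains it."""
    total = 0.0
    for lo, hi, n in zip(bin_edges[:-1], bin_edges[1:], bin_numbers):
        if r_crit <= lo:
            total += n
        elif r_crit < hi:
            total += n * math.log(hi / r_crit) / math.log(hi / lo)
    return total


# Hypothetical input: six bins on the QDGE radius grid (m) with made-up numbers (cm^-3).
edges = [0.050e-6, 0.088e-6, 0.155e-6, 0.274e-6, 0.483e-6, 0.851e-6, 1.500e-6]
numbers = [100.0, 80.0, 40.0, 10.0, 2.0, 1.0]
r_c = critical_dry_radius(1.0015, 0.6, 298.0)
cdnc = activated_number(edges, numbers, r_c)
```

A higher Smax lowers the critical dry radius, so more of the size distribution is counted as activated.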
The QDGE scheme calculates the activated particle number near the cloud base. Above cloud base, a uniform vertical profile of the cloud droplet number mixing ratio, equal to the value at cloud base, is assumed, in good agreement with observations and detailed simulations using cloud-resolving models (Gerber et al., 2008; Slawinska et al., 2012; Jarecka et al., 2013). The entrainment rate can also be set to account for the effect of entrainment on the vertical profile if necessary.

Comparison with a parcel model
In this subsection, we examine the performance of the QDGE scheme by comparing it with parcel model results for a series of experiments as described in Ghan et al. (2011).
The parcel model numerically solves the droplet growth equations with high accuracy, by representing aerosol size distributions with finely discretized bins and using a very short time step to trace the variation of supersaturation with time/height (Ghan et al., 2011).
For the comparisons, we assume a tri-modal lognormal size distribution (Whitby, 1978).

In contrast to the QDGE scheme, the four activation schemes considered by Ghan et al. (2011) are based on parameterized and simplifying assumptions about the physical processes involved in the formation of cloud droplets, using the vertical grid of the host model. Therefore, the QDGE scheme can, in general, be used for a broader range of environmental and aerosol conditions than these schemes. Although the QDGE scheme mimics the parcel model well, it is numerically more efficient. Typically, a parcel model simulation takes several minutes, while the QDGE scheme takes only 0.1 seconds for the same case using a single core of an Intel Xeon E5-2660 v2.
One more advantage of the QDGE scheme is its potential scale adaptivity for different vertical grids. The accuracy of the simulated supersaturation profile increases with the specified number of sub-levels and number of iterations. Therefore, as supercomputer capabilities for climate model simulations improve, the QDGE scheme will provide a more accurate solution for the activation process and can easily adapt to the accuracy requirements of high-resolution GCMs in the future.
An earlier version of the QDGE scheme has been successfully used for simulations with the 5th generation of the Canadian atmospheric global climate model (CanAM5). It is currently being tested in additional models. The QDGE aerosol activation scheme has previously been used to assess Arctic indirect radiative forcing (Arora et al., 2015) and to determine the sensitivity of Arctic clouds to changes in future surface seawater dimethyl sulfide concentrations (Mahmood et al., 2019).

Campaign description
The worldwide cloud data used for the evaluation were sampled during four aircraft campaigns. The locations and instrument information of the four campaigns are shown in Fig. 4 and Table 2. The Canada (CAN) campaign provided marine stratus cloud data observed during the Radiation, Aerosol and Cloud Experiment (RACE) in fall 1995 off the coast of Nova Scotia, Canada (Peng et al., 2002). The Chile (CL) campaign provided marine stratocumulus cloud data observed during the VAMOS Ocean-Cloud-Atmosphere-Land Study Regional Experiment (VOCALS-REx), for near-climatological atmospheric conditions off northern Chile and southern Peru (Wood et al., 2011). The Brazil (AMA) campaign provided continental stratus cloud data observed in Manaus, Brazil during the Green Ocean Amazon (GoAmazon2014/5) Experiment (Martin et al., 2016). The China (CN) campaign provided polluted continental stratus cloud data sampled in Beijing, China by the Beijing Weather Modification Office (Liu et al., 2020). These worldwide datasets comprise continental (CN and AMA), coastal (CAN), and marine (CL) meteorological conditions. Additionally, they cover different levels of human influence on clouds, with the observed mean aerosol number concentration within 100 m below the cloud base ranging from 282 cm⁻³ to 1350 cm⁻³.

Aerosol and cloud measuring instruments utilized in the four campaigns are briefly presented in Table 2. The observed variables mainly include the CDNC, the cloud liquid water content (LWC), the aerosol number-size distribution, the chemical composition of aerosol, and atmospheric condition parameters. For the measurement of the CDNC, the forward scattering spectrometer probe (FSSP) was used in the CAN campaign, the cloud, aerosol, and precipitation spectrometer (CAS) was used in the CL campaign, and the fast cloud droplet probe (FCDP) was used in the AMA and CN campaigns.
Although FCDP, FSSP, or CAS can observe cloud droplets with a particle size up to 150 μm, we only integrated the number of droplets with a particle size of 2 to 30 μm to derive the CDNC, because cloud droplets larger than 30 μm are subject to collision-coalescence, and droplets smaller than 2 μm may be deactivated by evaporation (Fountoukis and Nenes, 2005). For the measurements of the LWC, the King hot-wire probe was used in all campaigns, and the Johnson-Williams probe was also available as an alternative option in GoAmazon2014/5. In terms of the aerosol observations, all four campaigns utilized an onboard passive cavity aerosol spectrometer probe (PCASP), and some flights during the CAN campaign used the atmospheric solids analysis probe (ASAP), providing aerosol number concentrations in multiple size bins roughly from 0.1 to 3 μm. We integrated the number of particles within the detected size range to determine the total aerosol number concentration. In the CAN, AMA, and CL campaigns, the mass concentrations of aerosol chemical species, including and Twin Otter) carried out observations (Wood et al., 2011). In order to ensure data integrity and consistency for aerosol number-size distribution and chemical composition measurements in the subsequent analysis, we only selected data from the Gulfstream-1 flights. The atmospheric condition parameters (temperature (T), pressure (P), relative humidity (RH), and vertical velocity (w)) were mainly observed by the airborne integrated meteorological measurement system (AIMMS) in all campaigns. For the CL campaign, vertical velocity data were not available from the Gulfstream-1 flights; thus we used the observed w data from the Twin Otter flights that occurred simultaneously with the Gulfstream-1 flights. Some meteorological variables that are required by the QDGE scheme, such as rt and h, were not available from the aircraft observations. Therefore, we calculated these based on other variables (Sect. 3.2.4).
Detailed descriptions of the aforementioned observational instruments and data quality control procedures can be obtained from the relevant publications for the different aircraft campaigns (Li et al., 1998; Peng et al., 2002; Wood et al., 2011; Kleinman et al., 2012; Martin et al., 2016, 2017; Wang et al., 2020).

Data processing for closure experiment

Data extraction
The flow chart of data extraction and processing is shown in Fig. 5. In the first step, we screened the observational data to obtain suitable cloud cases fulfilling the following conditions (Step 1 in Fig. 5). First, we selected cloud cases with a continuous profile with T > 0 °C and LWC ≥ 0.05 g m⁻³ in each layer, and identified the height of the cloud base (see Fig. B1). Second, we checked whether the LWC near the cloud base approximately satisfies the wet adiabatic assumption, that is, whether it is nearly free from entrainment. As shown in Fig. B1, we plotted the observed and the adiabatic LWC profiles; the latter were calculated by assuming that the LWC increases linearly with the height above cloud base, at a rate given by the adiabatic liquid water lapse rate, which is a function of temperature (Brenguier, 1991). For liquid clouds, the value of this lapse rate varies from 0.5 × 10⁻³ to 3.0 × 10⁻³ g m⁻⁴ (Peng et al., 2002). For the cases shown in Fig. B1, it ranges from 0.6 × 10⁻³ to 2.8 × 10⁻³ g m⁻⁴. The mean lapse rate in each cloud case is shown in Table B2. Considering that the entrainment rate was set to 1.0 × 10⁻³ m⁻¹ (weak entrainment; Barahona and Nenes, 2007) when running the QDGE scheme in order to be close to the real atmosphere, we identified the nearly adiabatic part of each cloud case (the height range marked in Fig. B1) to obtain the observed cloud properties for evaluating the simulation. Third, we excluded the impact of collision-coalescence in the selected cloud cases by ensuring that the water content of cloud droplets with sizes greater than 30 μm was less than 0.05 g m⁻³. Finally, we checked that each cloud case has an aerosol number concentration larger than the CDNC. Ultimately, 31 eligible cloud cases were selected, as shown in Fig. B1. Table B2 lists the observed data in the selected cloud cases; in-cloud variables were averaged over the adiabatic part of each cloud case, and aerosol variables were averaged within 100 m below the cloud base.
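The adiabatic check in the second screening condition can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, the sample values are made up, and no threshold for "nearly adiabatic" is implied because the text does not state one.

```python
def adiabatic_lwc(height_above_base_m, lapse_rate_g_m4):
    """Adiabatic liquid water content (g m^-3), linear in height above cloud base."""
    return lapse_rate_g_m4 * height_above_base_m


def adiabatic_fraction(lwc_obs_g_m3, height_above_base_m, lapse_rate_g_m4):
    """Ratio of observed to adiabatic LWC; values near 1 suggest little entrainment."""
    lwc_ad = adiabatic_lwc(height_above_base_m, lapse_rate_g_m4)
    if lwc_ad <= 0.0:
        raise ValueError("height and lapse rate must be positive")
    return lwc_obs_g_m3 / lwc_ad


# Made-up sample: 0.18 g m^-3 observed 100 m above cloud base,
# with a lapse rate of 2.0e-3 g m^-4 (inside the range quoted for liquid clouds).
frac = adiabatic_fraction(0.18, 100.0, 2.0e-3)
```

Plotting this ratio against height is one way to see where the observed profile departs from the adiabatic one.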

As shown in Step 2 of Fig. 5, we classified the data samples of each cloud case into cloudy and clear conditions by using the following criteria. Data sampled inside the cloud (cloudy condition) require that LWC ≥ 0.05 g m⁻³, CDNC > 10 cm⁻³, and RH ≥ 99.5 %, and data sampled outside the cloud (clear condition) require that LWC < 0.05 g m⁻³, aerosol number concentration > 10 cm⁻³, and RH < 99.5 %.
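The classification in Step 2 can be sketched as a simple rule. The thresholds are taken from the criteria above; which variable each threshold applies to is an interpretation made here (liquid water content in g m⁻³, droplet and aerosol number concentrations in cm⁻³, relative humidity in %), and the function name and the "unclassified" category are hypothetical.

```python
def classify_sample(lwc, cdnc, n_aerosol, rh):
    """Label one data sample as 'cloudy', 'clear', or 'unclassified'.

    lwc: liquid water content (g m^-3, assumed unit)
    cdnc: cloud droplet number concentration (cm^-3)
    n_aerosol: aerosol number concentration (cm^-3, assumed variable)
    rh: relative humidity (%)
    """
    if lwc >= 0.05 and cdnc > 10.0 and rh >= 99.5:
        return "cloudy"
    if lwc < 0.05 and n_aerosol > 10.0 and rh < 99.5:
        return "clear"
    # Samples that satisfy neither set of criteria are left out of both groups.
    return "unclassified"


labels = [
    classify_sample(0.20, 150.0, 30.0, 100.0),  # in-cloud sample
    classify_sample(0.00, 0.0, 400.0, 85.0),    # clear-air sample
    classify_sample(0.02, 5.0, 400.0, 99.8),    # mixed signals
]
```

Requiring all three conditions at once keeps cloud-edge samples with mixed signals out of both the cloudy and the clear groups.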
During each flight, the sampling along the horizontal flight track was continuous, which allowed us to better characterize the atmospheric conditions inside or outside the cloud. In all 31 selected cloud cases, we were able to extract data samples at several levels above the cloud base (usually 4, at least 2) along horizontal flight tracks, and calculated the mean value of each observed variable along the horizontal track at each level. These level means were then extended to the vertical model levels (the interfaces of the vertical layers in the model) for running the QDGE scheme, which is Step 3 in Fig. 5.
The extension followed these rules. The profiles of the meteorological variables in clear conditions were extended downwards to the surface by using the hydrostatic equation and the ideal gas law, then extended to the top by linear extrapolation, and interpolated between the lowest and highest sampled levels. The aerosol mass and number profiles were extended to the surface and to the top by linear extrapolation and interpolated between the lowest and highest sampled levels.
was filled between 1 and by linear interpolation.
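The linear interpolation and extrapolation rules above can be illustrated with a small helper. This is a minimal sketch, assuming ascending observation heights; the name `extend_profile` is hypothetical, and the hydrostatic extension used for the meteorological variables below the lowest level is not covered here.

```python
def extend_profile(z_obs, x_obs, z_model):
    """Map level-mean observations onto model levels.

    Values between the lowest and highest observed levels are linearly
    interpolated; values below or above are linearly extrapolated from the
    nearest pair of observed levels. z_obs must be strictly ascending.
    """
    if len(z_obs) < 2 or len(z_obs) != len(x_obs):
        raise ValueError("need at least two observed levels of matching length")
    result = []
    for z in z_model:
        if z <= z_obs[0]:
            i = 0                      # extrapolate from the lowest segment
        elif z >= z_obs[-1]:
            i = len(z_obs) - 2         # extrapolate from the highest segment
        else:
            # last segment whose lower bound does not exceed z
            i = max(j for j in range(len(z_obs) - 1) if z_obs[j] <= z)
        slope = (x_obs[i + 1] - x_obs[i]) / (z_obs[i + 1] - z_obs[i])
        result.append(x_obs[i] + slope * (z - z_obs[i]))
    return result
```

Linear extrapolation can produce negative values far from the sampled levels, so a real application would probably need a lower bound for concentrations.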
For each cloud case, the data samples in the clear air were used to obtain aerosol-related input information for the model simulations (number and mass concentrations of aerosol components in different particle size sections) and the profiles of meteorological parameters. The data samples in cloudy conditions were used to obtain the vertical velocity and as input for the model, and to provide the measured CDNC for comparison with the model results and closure verification. Here, the simulated values at the boundaries of the host model grid were converted to the initial quantities required by the QDGE scheme (Fig. 2 and Eqs. 8-12). These are Steps 4, 5, and 6, as shown in Fig. 5.

Figure 5. A flow chart schematically showing the data extraction and processing for this work. Starting from the cloud cases: Step 1, screen suitable clouds; Step 2, classify the samples; Step 3, extract and extend the data; Step 4, aerosol data for input; Step 5, vertical velocity for input; Step 6, meteorological input.

Aerosol data for input
In each of the cloud cases from the different aircraft campaigns, the aerosol number concentrations in the observed size bins (the number of size bins detected in the observations is given in Table 2) sampled by ASAP or PCASP were categorized in 13, 15, or 30 bins. The size-resolved aerosol number concentrations were subsequently interpolated to a common particle size distribution (PSD) with 6 prescribed size sections for model input based on the following method (as depicted in Fig. 6). First, we used the aerosol number concentration in each size bin of the PCASP (or ASAP) data to fit a continuous PSD using cubic spline interpolation (Fig. 6b). Second, we integrated the fitted PSD to obtain the aerosol number concentration in each of the 6 aerosol size sections (k = 1, …, 6) employed by the QDGE scheme (the dry aerosol particle radius boundaries are at 0.050, 0.088, 0.155, 0.274, 0.483, 0.851, and 1.500 μm, as shown in Fig. 6c). With this method, the total aerosol number obtained by integration over the 6 QDGE sections was slightly different from the observed total aerosol number because of the PSD fitting. We therefore scaled the fitted aerosol number concentrations by the observed total aerosol number to ensure the conservation of total number concentration (i.e., the total integrated over the QDGE sections in Fig. 6c is the same as the aerosol number integrated over the observed PSD in Fig. 6a). Finally, the PSD of the aerosol number concentration in 6 sections (Fig. 6c) was used as input to the QDGE scheme.
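The rebinning and renormalisation steps can be sketched as follows. To keep the example short and free of third-party dependencies, it swaps the cubic-spline fit for a simpler assumption, namely that the number in each observed bin is spread uniformly in log-radius; only the final rescaling to the observed total mirrors the conservation step described above. The function name `rebin_number` and the example bin edges are hypothetical.

```python
import math

# Dry-radius section boundaries (micrometres) quoted in the text.
QDGE_EDGES_UM = [0.050, 0.088, 0.155, 0.274, 0.483, 0.851, 1.500]


def rebin_number(obs_edges, obs_counts, new_edges=QDGE_EDGES_UM):
    """Redistribute binned number concentrations onto new size sections.

    Assumes a piecewise-constant dN/dlog(r) inside each observed bin
    (a stand-in for the spline fit), then rescales the result so that the
    total number equals the observed total.
    """
    if len(obs_edges) != len(obs_counts) + 1:
        raise ValueError("obs_edges must have one more entry than obs_counts")
    new_counts = [0.0] * (len(new_edges) - 1)
    for lo, hi, n in zip(obs_edges[:-1], obs_edges[1:], obs_counts):
        width = math.log(hi / lo)
        for k in range(len(new_counts)):
            a = max(lo, new_edges[k])
            b = min(hi, new_edges[k + 1])
            if b > a:  # overlap between observed bin and target section
                new_counts[k] += n * math.log(b / a) / width
    total_new = sum(new_counts)
    if total_new <= 0.0:
        raise ValueError("observed bins do not overlap the target sections")
    scale = sum(obs_counts) / total_new  # enforce conservation of total number
    return [c * scale for c in new_counts]
```

Because of the final rescaling, any observed particles outside the 0.050 to 1.500 μm range are redistributed proportionally over the six sections in this sketch.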

to the observations (the asterisks refer to the observations that were derived from (a)), and (c) aerosol number concentration in 6 size sections, as prescribed in model simulations with the QDGE scheme.
For each of the CAN, AMA, and CL campaigns, the AMS provided measurements of chemical components over the entire campaign, providing concentrations of where is the density of each component , and they are 1725, 1769, 1527, 1900, and 1400 kg m$^{-3}$ for NH$_4$NO$_3$, (NH$_4$)$_2$SO$_4$, NH$_4$Cl, , and , respectively (Ferek et al., 1998; Nakao et al., 2013). Consequently, we can obtain the mass concentration (unit kg cm$^{-3}$) of each component in section following this equation: where is the median radius of section .

Since no AMS data are available for the CN campaign, we assumed the mass fractions of the different chemical components according to contemporaneous measurements in Beijing, China (Zhou et al., 2019; Li et al., 2020), as shown in Table B3.
Under the assumption of an aerosol density of 1600 kg m$^{-3}$ (Levy Zamora et al., 2019), the mass concentration of each component in each section for the CN campaign can be obtained from Eq. (21).
Finally, we obtained the number concentration of total aerosol and the mass concentration of each chemical component from PCASP/ASAP and AMS measurements in each cloud case and calculated aerosol number and mass concentrations in 6 prescribed size sections following the above procedures (Step 4 in Fig. 5). We then used the aerosol information as input to drive the QDGE scheme.

Vertical velocity for input
The averaged updraft velocity ($w^{+}$) and sub-grid vertical velocity ($w_{sub}$) obtained from the observed vertical velocity ($w$) samples in clouds were used to calculate the input vertical velocity ($w^{+} + w_{sub}$) for running the QDGE scheme (Step 5 in Fig. 5).
The updraft velocity is a key variable for parameterizing aerosol activation. Peng et al. (2005) pointed out that using a characteristic value of the vertical velocity distribution (0.8 times the standard deviation of the distribution) is a good approximation for simulating the nucleated cloud droplet number of marine stratus when running the parcel model. Meskhidze et al. (2005) also gave a method to calculate $w^{+}$, which had the optimal closure for cumulus and stratocumulus clouds. Here, we derived a universal method for calculating $w^{+}$ in stratus and stratocumulus based on the above two studies.
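According to Meskhidze et al. (2005), the averaged updraft velocity ($w^{+}$) can be calculated from the probability density function (PDF) of $w$, $p(w)$:

$$w^{+} = \frac{\int_{0}^{\infty} w\,p(w)\,dw}{\int_{0}^{\infty} p(w)\,dw}. \qquad (23)$$

For the normal PDF with the mean velocity $w_0$ and standard deviation $\sigma$, $w^{+}$ can be represented as

$$w^{+} = w_0 + \frac{\sigma}{\sqrt{2\pi}}\,\frac{\exp\left(-\frac{w_0^{2}}{2\sigma^{2}}\right)}{1-\Phi\left(-\frac{w_0}{\sigma}\right)}, \qquad (24)$$

where $\Phi(x)$ is the cumulative distribution function of the standard normal PDF, which can be represented by the error function (erf):

$$\Phi(x) = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{x}{\sqrt{2}}\right)\right]. \qquad (25)$$

In particular, when $w_0 = 0$,

$$w^{+} = \sqrt{\frac{2}{\pi}}\,\sigma \approx 0.8\,\sigma, \qquad (26)$$

which is consistent with the characteristic velocity pointed out by Peng et al. (2005) and used for assessing cloud droplet closure for stratocumulus clouds sampled in the CAN campaign.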
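As a numerical check on the statement that a zero-mean normal distribution gives a characteristic updraft of about 0.8 times the standard deviation, the sketch below evaluates the mean of the positive part of a normal velocity distribution, i.e. the mean of $w$ conditional on $w > 0$. The conditional-mean form is an assumption inferred from the 0.8σ limit quoted above, and the function names are hypothetical.

```python
import math


def std_normal_cdf(x):
    """Cumulative distribution function of the standard normal, via erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mean_updraft(w0, sigma):
    """Mean of w over the updraft part (w > 0) of a normal PDF N(w0, sigma^2).

    Uses the truncated-normal mean: w0 + sigma * phi(a) / (1 - Phi(a)),
    with a = -w0 / sigma.
    """
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    a = -w0 / sigma
    pdf_at_a = math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)
    updraft_fraction = 1.0 - std_normal_cdf(a)
    return w0 + sigma * pdf_at_a / updraft_fraction
```

For `w0 = 0` the result reduces to `sigma * sqrt(2 / pi)`, which is roughly 0.798 `sigma`. For strongly negative `w0` the updraft fraction approaches zero and the expression becomes numerically fragile, so this sketch is only meant for the weak mean motions typical of stratiform cloud.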
A sub-grid vertical velocity ($w_{sub}$) is needed for the QDGE scheme, and it can be derived from the square root of the turbulent kinetic energy ($TKE$) following Morrison and Pinto (2005) (Eq. 27), where the $TKE$ can be calculated according to its definition, which is half the sum of the variances (squares of the standard deviations) of the velocity components:

$$TKE = \frac{1}{2}\left(\overline{(u')^{2}} + \overline{(v')^{2}} + \overline{(w')^{2}}\right). \qquad (28)$$

In this study, we assume that no horizontal movement occurs in cloud during the horizontal flight tracks, that is, $\overline{(u')^{2}} = \overline{(v')^{2}} = 0$ and $\overline{(w')^{2}} = \sigma^{2}$. Therefore, the sub-grid vertical velocity can be expressed in terms of $\sigma$ alone (Eq. 29).

If the observed $w$ in each selected cloud case obeyed the normal distribution, we could readily calculate the input vertical velocity ($w^{+} + w_{sub}$) following Eqs. (24) and (29) for running the QDGE scheme. We checked the normality of the $w$ distribution by drawing a quantile-quantile (Q-Q) plot using the observed $w$ values along the horizontal flight track of the cloud case, taking CN01 as an example in Fig. 7. The linearity of the Q-Q plot of the observed $w$ samples against a standard normal distribution indicates that the $w$ data do indeed follow the normal distribution. In the four campaigns of this study, 4 cloud cases in CN, 2 cases in CAN, 5 cases in AMA, and 3 cases in CL have enough data samples to obtain the PDF of $w$ (Table 2), as plotted for checking the normality of the $w$ distribution in Fig. B2.
However, the PDF in two of the CAN cloud cases does not conform to the normal distribution very well (panels (5) and (6) of Fig. B2). We therefore used the mean and standard deviation of the $w$ distribution in Peng et al. (2005) (Table B2).
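The Q-Q check described above can also be summarised numerically. The sketch below pairs the sorted samples with standard-normal quantiles and reports the correlation of the resulting points; a value close to 1 corresponds to the straight-line appearance of a Q-Q plot. The plotting positions `(i + 0.5) / n` and the use of a correlation coefficient as a summary are assumptions of this illustration, not steps reported for the paper, and the function names are hypothetical.

```python
from statistics import NormalDist, mean


def qq_points(samples):
    """Return (theoretical standard-normal quantiles, sorted samples)."""
    n = len(samples)
    ordered = sorted(samples)
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return theoretical, ordered


def qq_correlation(samples):
    """Pearson correlation of the Q-Q points; near 1 suggests normality."""
    theoretical, ordered = qq_points(samples)
    mt, mo = mean(theoretical), mean(ordered)
    cov = sum((t - mt) * (o - mo) for t, o in zip(theoretical, ordered))
    var_t = sum((t - mt) ** 2 for t in theoretical)
    var_o = sum((o - mo) ** 2 for o in ordered)
    return cov / (var_t * var_o) ** 0.5
```

A visual inspection of the plot remains useful, because a single correlation value can hide systematic curvature in the tails.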

Meteorological input
Some meteorological variables ( , , , and ) can be obtained from AIMMS measurements directly, whereas others ( , , and ℎ) need to be calculated from the available variables (Step 6 in Fig. 5). We obtained by the following equation: Then, and ℎ can be obtained by Eqs. (10) and (11) from and other available variables. All meteorological variables were extracted and interpolated to model levels, as described in Sect. 3.2.1. The profiles of the measured meteorological variables served as the initial state to drive the QDGE scheme.

Determination of Nsub
As mentioned in Sect. 2.1, the QDGE scheme simulates vertical profiles of supersaturation ($S$) to determine the maximum supersaturation ($S_{max}$) on a vertical sub-grid with spacing $\Delta z / N_{sub}$, where $\Delta z$ is the grid size of the atmospheric host model. The accuracy of the simulated supersaturation profile generally increases with $N_{sub}$, although large values of $N_{sub}$ imply a higher computational burden. For applications of the QDGE scheme in atmospheric models, it is therefore important to determine an optimal value of $N_{sub}$ that yields sufficiently accurate supersaturation profiles at acceptable cost. Figure 8a plots the vertical profiles of $S$ simulated by the QDGE scheme with different $N_{sub}$ values for the cloud case CN01. The results show that each profile with $N_{sub} \geq 3$ produces a well-defined maximum of $S$ ($S_{max}$), which approaches a stable value as $N_{sub}$ is further increased. All cases seem to converge to a value similar to that obtained with $N_{sub} = 150$, as plotted in Fig. 8a. Figure 8b shows the variation of $S_{max}$ with increasing $N_{sub}$ for all cloud cases in the four campaigns.
Overall, $S_{max}$ fluctuates dramatically for $N_{sub} < 10$, but plateaus when $N_{sub}$ is greater than 60 (10 for CAN). Results obtained for $N_{sub} = 150$ and $N_{sub} = 60$ are similar. The mean relative error and correlation coefficient between $S_{max}$ with $N_{sub} = 150$ and that with $N_{sub} = 60$ are 1.97 % and 0.9997, respectively. Therefore, we used $N_{sub} = 60$ in this study ($N_{sub} = 10$ for CAN). Further discussion regarding the selection of $N_{sub}$ is provided in Sect. 5.
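The kind of convergence test described above can be written as a short search for the smallest number of sub-levels whose result, and that of every finer setting, stays within a tolerance of a high-resolution reference. This is a sketch under stated assumptions: the function name, the tolerance, and the numbers in the usage example are hypothetical and are not values from the paper.

```python
def smallest_converged(results, reference_key, tol=0.02):
    """Pick the smallest sub-level count that has converged.

    results       : dict mapping number of sub-levels -> simulated value
    reference_key : key of the high-resolution run used as reference
    tol           : allowed relative deviation from the reference

    Returns the smallest key n such that every run with at least n
    sub-levels lies within tol of the reference, or None if there is none.
    """
    reference = results[reference_key]
    for n in sorted(results):
        finer = [m for m in results if m >= n]
        if all(abs(results[m] - reference) <= tol * abs(reference) for m in finer):
            return n
    return None
```

For example, with hypothetical values `{3: 0.30, 10: 0.26, 30: 0.245, 60: 0.241, 150: 0.240}` and a 2 % tolerance, the helper selects 60. Requiring all finer settings to agree guards against choosing a coarse setting that matches the reference only by coincidence.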

Statistical parameters for evaluation and error analysis
The QDGE scheme simulates the CDNC in each cloud case, based on . Note that is not exactly the same as here, because we take wet particles with sizes between 2 and 30 μm for comparison with the observations. Considering that aerosol activation is particularly efficient in the vicinity of the cloud base in stratus and convective clouds, the QDGE scheme only calculates the CDNC at the cloud base (Sect. 2.1). Here, we considered the effect of weak entrainment on the vertical profile of the cloud droplet number mixing ratio in order to be close to the real cloud base in the atmosphere (Sect. 3.2.1). Therefore, we evaluated the performance of the QDGE scheme by comparing the simulated CDNC with the vertically averaged value of the observed CDNC in the nearly adiabatic part of the cloud (Fig. B1; Sect. 3.2.1), given by

$$\overline{CDNC}_{obs} = \frac{1}{n}\sum_{i=1}^{n} CDNC_{obs,i},$$

where $n$ is the number of samples in the nearly adiabatic part of the cloud, and $CDNC_{obs,i}$ is the observed CDNC at height $i$.
Correspondingly, the mean bias (MB) and mean relative error (MRE) of each cloud case can be calculated as follows: where the MRE of each cloud case will also be used for the subsequent error analysis.
To evaluate the overall accuracy of the QDGE scheme, we also calculated the mean values of the simulated CDNC, the observed CDNC, MB, and MRE over the cloud cases in each campaign, denoted by overbars (e.g. $\overline{MB}$ and $\overline{MRE}$). In addition, the coefficient of determination $R^{2}$ ($R$ is the Pearson correlation coefficient) between the simulated and observed CDNC in each campaign was calculated.
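A per-case bias and relative error in percent can be sketched as below. The exact normalisation used for MB and MRE is not recoverable from the text here, so this example assumes that both are averaged over the individual observed samples and normalised by each observation; the function names are hypothetical.

```python
def mean_observed(obs):
    """Vertical average of the observed values in the nearly adiabatic layer."""
    return sum(obs) / len(obs)


def case_errors(sim, obs):
    """Return (MB, MRE) in percent for one cloud case.

    sim : single simulated CDNC value for the case
    obs : observed CDNC values at the sampled heights (all > 0)

    Assumed definitions (not confirmed by the source):
      MB  = mean of (sim - obs_i) / obs_i
      MRE = mean of |sim - obs_i| / obs_i
    """
    n = len(obs)
    mb = 100.0 * sum((sim - o) / o for o in obs) / n
    mre = 100.0 * sum(abs(sim - o) / o for o in obs) / n
    return mb, mre
```

With this convention MRE is never smaller than the absolute value of MB, because positive and negative deviations cancel in MB but not in MRE.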
To quantify the contributions of different physical variables to errors in the CDNC simulated with the QDGE scheme, we calculated the Maximum Information Coefficient (MIC) (Reshef et al., 2011), which provides a measure of the strength of the relationship between each input variable and MRE. MIC can capture the association between an input variable and MRE for different types of relationships, such as linear, exponential, and many more complex functional relationships (Reshef et al., 2011). There is no need to standardize the data before the MIC calculation, and the calculations have low computational complexity and high robustness. However, it should be noted that the association here does not refer to a specific correlation, such as a temporal or spatial correlation, or a positive or negative correlation, but to the strength of a relationship between the variable and MRE. The MIC value is always between 0 and 1. The higher the MIC value, the stronger the association between the input variable and MRE, that is, the more significantly the input variable contributes to MRE. Here, we calculated the MIC with the minepy package in Python (Albanese et al., 2018), using the default parameter settings suggested by the code developers. Varying these parameters had an insignificant effect on the relative importance of the variables for MRE.
We calculated the MIC between MRE and each of the following input variables: the relative humidity, the mean vertical velocity ($w^{+}$), and the sub-grid vertical velocity ($w_{sub}$) to represent environmental and dynamic conditions; the total aerosol number as a proxy of pollution level; and the hygroscopicity of the aerosol weighted by composition volume fraction and the effective radius of the aerosol PSD to represent the chemical and size properties of the aerosol. Here, the hygroscopicity and the effective radius are defined as: where , the hygroscopicity of component , accounts for variations with relative humidity in the QDGE scheme (Appendix A). represents the middle radius of each particle size bin observed by PCASP or ASAP (see Table 2). For the MIC calculation, the values of the input variables derived from observations are listed in Table B2 for each cloud case.
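The two aerosol descriptors used in the MIC analysis can be illustrated as follows. The sketch assumes a volume-fraction-weighted mean for the bulk hygroscopicity, as stated in the text, and the usual ratio of the third to the second moment of the size distribution for the effective radius; the latter is an assumption about the definition, and all names and example values are hypothetical.

```python
def volume_weighted_kappa(volume_fractions, kappas):
    """Bulk hygroscopicity as the volume-fraction-weighted mean of components."""
    total = sum(volume_fractions)
    if total <= 0.0:
        raise ValueError("volume fractions must sum to a positive value")
    weighted = sum(f * k for f, k in zip(volume_fractions, kappas))
    return weighted / total


def effective_radius(radii, numbers):
    """Effective radius of a binned PSD: sum(N r^3) / sum(N r^2).

    radii   : middle radius of each size bin
    numbers : number concentration in each size bin
    """
    third = sum(n * r ** 3 for r, n in zip(radii, numbers))
    second = sum(n * r ** 2 for r, n in zip(radii, numbers))
    if second <= 0.0:
        raise ValueError("size distribution is empty")
    return third / second
```

Dividing by the sum of the volume fractions makes the first helper tolerant of fractions that do not add up exactly to one, for instance when a minor component is not measured.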

Closure experiment
The results of the closure experiment are shown in Fig. 9. Almost all simulated CDNC values fall within 30 % of the mean observations in the clouds. $R^{2}$ is above 0.94 for all campaigns, which indicates good agreement between simulation and observation. For the four campaigns covering marine to continental conditions, the $\overline{MRE}$ values are all below 26 % and the $\overline{MB}$ values are within ±20 %. The AMA campaign produces the best agreement between model results and observations, with an $\overline{MRE}$ value of 17.30 %. The CN campaign produces the poorest agreement, with an $\overline{MRE}$ value of 25.90 %.
However, cloud droplet number concentrations are underestimated for all cloud cases of the CL campaign (Fig. 9c; mean bias of −19.36 %), which may be related to the high activation ratio (the ratio of CDNC to the total aerosol number, see Table B2) in this region. The activation ratios in all CL cases are higher than 60 %, suggesting that the marine environment is favorable for more aerosol particles to be activated. If particles smaller than the detection limit of PCASP (about 10 nm) are activated, this could lead to an underestimation of the simulated CDNC in the CL campaign.

In order to provide further context, we compare the $\overline{MRE}$ values of this study with those of previous studies that used different aerosol activation parameterizations and aircraft measurements, as shown in Table 3. The $\overline{MRE}$ values are relatively high for the early parameterizations, generally around 50 %. In the last two decades, the performance of physically-based parameterizations has improved significantly, as is evident from a reduction of the $\overline{MRE}$ to about 30 %. For instance, one of the schemes (Fountoukis and Nenes, 2005) achieved remarkable closure (with an $\overline{MRE}$ of 13.5 %) for continental cumuliform/stratus. In this study, the QDGE scheme performs decently (the $\overline{MRE}$ values are all below 26 %) in four different regions, indicating that the scheme is suitable for simulations of cloud droplet number concentrations over a wide range of different meteorological conditions and different levels of aerosol pollution.

Table 3.
Parameterization | $\overline{MRE}$ (%) | Cloud type | Location | Reference
Flossmann et al. (1985) | ~50.00 | Continental stratocumulus | North of England | Hallberg et al. (1997)
UWyo parcel model a | <50.00 | Marine stratocumulus | Tenerife, Spain | Brenguier, 2000
Fountoukis and Nenes; Nenes and Seinfeld (2003) | ~30.00 | Coastal stratus | Monterey, California, USA | Meskhidze et al. (2005)
Fountoukis and Nenes (2005)

Error analysis
Although the performance of the QDGE scheme is good in the different aircraft campaigns, it is useful to analyze the sources of biases in the simulations. Following the procedures described in Sect. 3.3, we calculated the Maximum Information Coefficient (MIC) between MRE and the input variables of the QDGE scheme, including aerosol properties (hygroscopicity and effective radius), thermodynamic state (relative humidity), pollution level (total aerosol number), and atmospheric dynamic conditions ($w^{+}$ and $w_{sub}$), as listed in Table B2. The MIC values for all cloud cases and for each campaign are shown in Table 4.
For almost all campaigns, the aerosol number concentration and the hygroscopicity have the most significant impacts on MRE. This is consistent with the droplet growth equation, according to which the variation of supersaturation with height is essentially determined by the competition between the production of supersaturation by adiabatic cooling and the reduction in supersaturation from condensational growth of the particles; the latter mainly depends on the number and solubility of the aerosol particles. In detail, the aerosol number concentration has a greater impact on MRE in marine regions (CAN and CL), whereas the hygroscopicity is more significant in continental regions (CN and AMA). In marine regions, where the aerosol number concentration is relatively low (Table 2), a small fluctuation in it can cause noticeable changes in the simulated maximum supersaturation and CDNC, which makes MRE more sensitive to the aerosol number concentration. In continental areas, however, the aerosol number concentration is relatively high, and the change in hygroscopicity becomes more important to MRE.
The atmospheric humidity and the dry size of the aerosol particles also have non-negligible impacts on MRE. Both affect the hygroscopic growth of aerosol particles and the reduction in supersaturation. Overall, the atmospheric dynamic conditions have the least significant impact on MRE, which may be attributed to their weak variation in stratus and stratocumulus clouds (Table B2).

The MIC values also help to explain the relatively poor simulation performance for some campaigns. The chemical properties of the aerosol, which affect the hygroscopicity, are very important for the simulation in the continental region, but the CN campaign lacks AMS data and we applied the same chemical composition to all cloud cases, based on earlier measurements in this region (Sect. 3.2.2). Given the importance of the chemical properties, simultaneous measurements of chemical components probably would have helped to enhance the accuracy of the simulated CDNC for the CN campaign. Another possible cause of biases in the simulated CDNC for the CN campaign is a much larger standard deviation of observed (see Table 2) than that of other campaigns, which could be responsible for the error in the simulated CDNC. However, it should be noted that although the CAN campaign is characterized by the presence of coastal clouds and smaller variations in , its MRE is higher than that of the AMA campaign, which may be related to the application of a uniform updraft velocity in the simulations for the CAN campaign (Sect. 3.2.3 and Table B2).

Overall, the errors in the simulated CDNC are largely related to missing observational data (as in the CN and CAN campaigns); the analysis of MIC and error sources presented here could provide a good basis for developing and improving measurement strategies in future aircraft campaigns.
In this paper, we introduce a numerically efficient aerosol activation scheme, which calculates the maximum cloud supersaturation and cloud droplet number concentration (CDNC) by employing a Quasi-steady state approximation of the cloud Droplet Growth Equation (QDGE). The QDGE scheme utilizes look-up tables and an iterative calculation for solving the sub-level variation of supersaturation and deriving the maximum supersaturation and the activated particle number-size distribution, for efficient applications of the scheme in the large-scale grid of climate models. Comparison between the results of the QDGE scheme and a parcel model shows that biases in the maximum supersaturation under different environmental and aerosol conditions are within 0.18 % (with an average of 0.05 %), indicating the decent accuracy and reasonable performance of the QDGE scheme.
We evaluated the simulated CDNC with worldwide cloud data sampled during four aircraft campaigns, covering a wide range of different meteorological conditions and different levels of aerosol pollution. The aerosol information, updraft velocity, and meteorological conditions were carefully extracted from aircraft measurements and applied to drive the QDGE scheme. The simulated CDNC is compared with the corresponding observations in the nearly adiabatic part of the cloud to evaluate the performance of the scheme. The average values of the mean relative error and the mean bias in the four campaigns are all within 26 % and ±20 %, respectively, indicating that the QDGE scheme can reasonably simulate the activated CDNC on a regional or global scale. We also investigated the potential sources of error in the simulated CDNC and found that the magnitude of the mean relative error is more closely related to the aerosol number concentration in marine regions and to aerosol hygroscopicity in continental regions than to other variables in the simulation.
Several points are worth mentioning for future work, as the QDGE scheme can be further optimized in several aspects.
First, $N_{sub} = 60$ generates reasonably good results in four different regions in this study, but this number is rather high and the computation would be too demanding for general circulation models. Second, the iterative calculation used to derive the supersaturation at each sub-grid level can be computationally expensive. Therefore, both an adjustment of $N_{sub}$ and an optimization of the iteration would be necessary before the QDGE scheme is applied in a climate model. Last, we also want to extend the evaluation of the QDGE scheme against parcel model simulations, to further identify the sources of error related to the approximations in the scheme. This work will be considered in future studies.

Appendix A: Parameters
The parameters , , , , and in Eqs. (1-3) are given by where is the aerosol hygroscopicity, is the surface tension of the solution/air interface (which is approximated by the surface tension of water here), is the density of water, is the molecular weight of water, is the universal gas constant, is the temperature, is the dry aerosol particle radius, * is the saturation vapor pressure, is the latent heat of vaporization, ′ is the modified thermal conductivity of air accounting for non-continuum effects, ′ is the modified diffusivity of water vapor in air accounting for non-continuum effects (Seinfeld and Pandis, 2016), is the gravitational acceleration, is the molecular weight of dry air, is the atmospheric pressure, and is the specific heat capacity of dry air at constant pressure.

Petters and Kreidenweis (2007) and Kreidenweis et al. (2008) proposed the parameter $\kappa$ for representing the hygroscopicity of aerosol and provided tabulated values of $\kappa$ for a variety of chemical compounds, based on laboratory data and modeling. Aerosol water contents (the ratio of the wet aerosol volume to the dry aerosol volume) parameterized with $\kappa$ are often within experimental uncertainty. However, the accuracy of this approach tends to decrease with decreasing aerosol water content, that is, at low relative humidity. In particular, Kreidenweis et al. (2008) also evaluated the calculated aerosol water content based on simulations of highly concentrated, non-ideal aqueous solutions with strong electrostatic interactions between ions with the Aerosol Inorganic Model (AIM; Wexler and Clegg, 2002), which gives evidence for systematically different results from a rigorous thermodynamic model at low aerosol water contents for some compounds.
In order to reduce biases at low relative humidity, the original method was extended to account for variations in $\kappa$ with relative humidity in the QDGE scheme. Specifically, piecewise-linear relationships between $\kappa$ and aerosol water activity for different chemical components were determined based on results from AIM.

Table B1. Aerosol distribution and property parameters, referring to Whitby (1978) and Ghan et al. (2011b).

Code and data availability. The version of the QDGE scheme used to produce the results in this paper, as well as the input data and scripts to run the model and the data to produce the key plots for the simulations, are archived on Zenodo and can be accessed at https://doi.org/10.5281/zenodo.4841035 (Wang et al., 2021).

Appendix B: Tables and
Author contribution. HW processed all data, conducted all simulations and analyses, and wrote the manuscript. YP led the work, designed the experiment, and refined the manuscript. KS developed the initial model version of the QDGE scheme, provided a summary of the approach, and contributed to the writing of the manuscript. YY, WZ, and DZ helped with the data usage in the China campaign and refined the manuscript.

Competing interests. The authors declare that they have no conflict of interest.