The Earth Model Column Collaboratory (EMC²) v1.1: An Open-Source Ground-Based Lidar and Radar Instrument Simulator and Subcolumn Generator for Large-Scale Models

,


Introduction
The representation of cloud processes in models is continuously advancing, conceptually, and in the level of details and complexity implemented in the micro-and macro-physical schemes (e.g., Lin et al., 2019;Cesana et al., 2019). These improvements are reflected in the accuracy of the resulting model output (e.g., Klein et al., 2013;Lin et al., 2019;Myers et al., 2021;Wang 10 et al., 2019), yet results still show large inter-model variability (e.g., Zelinka et al., 2020). This variability results from, among other sources, model weaknesses concerning atmospheric processes such as cloud geometrical and optical thicknesses (e.g., Cesana and Waliser, 2016;Klein et al., 2013), formation and transition of marine stratocumulus clouds (e.g., Rémillard and Tselioudis, 2015;Lin et al., 2014;Cesana et al., 2019), and the formation and maintenance of supercooled water (e.g., Cesana et al., 2012;Tan and Storelvmo, 2016;Silber et al., 2019b). 15 Meaningful model evaluation benefits from a direct ("apples-to-apples") comparison with observations. For the evaluation of cloud representation, model output is often compared with active remote-sensing measurements from instrumentation such as lidars and radars, which provide information on the spatial structure of clouds and some direct indications about active microphysical processes. However, in these profiling cases, performing model evaluation is challenging because of observational detectability constraints (e.g., signal extinction), and lack of retrievals of some microphysical quantities by these instruments, 20 for example, hydrometeor number concentration or water content. In addition, spatial resolution differences between a model's output and an observing instrument's measurement resolution present an additional difficulty.
To bridge the gap between large-scale models such as weather or climate models and active remote-sensing observations, instrument simulators with different purposes have been developed to estimate observed parameters using model output. For example, the Cloud-resolving model Radar SIMulator (CR-SIM; Oue et al., 2020) was developed to emulate zenith-pointing 25 and scanning radar and lidar variables using high-resolution model output, with considerations of hydrometeor shape and the resulting scattering calculations (see also Mech et al., 2020). The Cloud Feedback Model Intercomparison Project Observational Simulator Package (COSP; Bodas-Salcedo et al., 2011;Swales et al., 2018), on the other hand, was developed to operate over large-scale model output targeting satellite data as observational constraints, although expansions for the emulation of ground-based radars and lidars have been developed (e.g., Zhang et al., 2018;Kuma et al., 2020). Because of the demanding 30 computation associated with the emulation of satellite measurements, COSP is typically implemented on-line into models' code to facilitate output.
To account for spatial resolution discrepancies, which are typically accentuated in the case of large-scale models due to their largely coarser resolution, some model evaluation studies and forward simulators emulate a higher spatial resolution by generating subcolumns (e.g., Bodas-Salcedo et al., 2008Chepfer et al., 2008;Klein and Jakob, 1999;Lamer, 2019;Stephens et al., 2010;Swales et al., 2018;Webb et al., 2001). Statistics calculated using multiple generated subcolumns (faithful to the processed model's physics) can be directly compared with the associated observations, thereby mitigating spatial resolution 5 biases and errors.
Here we present the Earth Model Column Collaboratory (EMC 2 ), an open-source ground-based lidar and radar simulator and subcolumn generator, which is designed to operate over large-scale model output while being faithful to the physics implemented in models' microphysics or radiation schemes but can also be applied to high-resolution model output. The software is written in Python, allowing quick installation and providing customizable operation (for scattering calculations, etc.). In 10 section 2 we describe the subcolumn generator, the forward calculations, and classification routines currently implemented in EMC 2 . Section 3 contains a brief description of the Python code and software. In section 4 we demonstrate its use by comparing observations of a case study of highly-supercooled drizzling cloud over West Antarctica (see Silber et al., 2019a) with application of EMC 2 to ouputs from a large-eddy simulation (LES) and the NASA Goddard Institute for Space Studies (GISS) ModelE3 climate model (see Cesana et al., 2019) in single-column model (SCM) mode. Finally, section 5 provides a summary 15 of the code features and case study demonstration.

EMC Description
The following simulator description assumes large-scale model convective and/or stratiform cloud scheme outputs containing four hydrometeor classes: cloud water (cl), cloud ice (ci), rain (precipitating liquid; pl), and snow (precipitating ice; pi).
While these four hydrometeor classes are widely used in various microphysics schemes, we note that EMC 2 can be generally 20 adapted to cases in which fewer or additional classes are used. We note that all the acronyms, abbreviations, and symbols used throughout this study are listed in Appendix A.
The subcolumn allocation and forward calculations in EMC 2 can be performed using three different approaches: 1. Radiation scheme approach, which largely treats hydrometeor fraction in a generalized manner and utilizes bulk (hydrometeor population) scattering lookup tables (LUTs) generated using specific size distribution assumptions. 25 2. Microphysics scheme approach, which includes full integration of single-particle scattering LUTs using hydrometeor particle size distributions prognosed by models with consideration of sub-grid hydrometeor class fraction variability assumptions.
3. Empirical non-bespoke approach, which implements empirical formulation from the literature in the forward calculations and can be used with either of the hydrometeor fraction treatment methods (microphysics or radiation) in the subcolumn 30 generator.
The bespoke radiation and microphysics approaches follow the assumptions and general logic implemented in large-scale model radiation and microphysics schemes, respectively. While our main focus throughout this manuscript will be on these two approaches in EMC 2 , we will briefly present the empirical approach in sect. 2.4 and present some of its EMC 2 -processed output in sect. 4.
We note that the detailed description below of the radiation and microphysics approaches is congruent with the current 5 implementation of these approaches in the GISS ModelE3 climate model. However, the core of these approaches is similar in other climate and Earth system models (ESMs), and EMC 2 can be easily adapted to fit specific variations in a model assumptions (see sect. 3). For example, the microphysics approach currently operates only on stratiform microphysics scheme output using a two-moment bulk scheme (Gettelman and Morrison, 2015, hereafter MG2) that has been implemented in climate models such as the Community Earth System Model version 2 (CESM2) Community Atmosphere Model version 6 (CAM6) 10 (Danabasoglu et al., 2020), the Energy Exascale Earth System Model (E3SM; Golaz et al., 2019), and ModelE3.

Allocation of Hydrometeors to Subcolumns
Prior to the radar and lidar forward calculations, EMC 2 generates subcolumns for each model output column. These subcolumns emulate a higher model spatial resolution, which partially reconciles the locality of ground-based measurements and allows a more robust statistical model evaluation. Subcolumns are generated and populated with hydrometeors from the top-15 down with maximum random overlap between liquid and ice phases (henceforth, the overlap rule; see also Fan et al., 2011) using a similar approach to that described by Lamer (2019, ch. 6). EMC 2 translates hydrometeor fractions in the model grid to a binary set of hydrometeor-containing and hydrometeor-free subcolumn bins. That is, given a specified number of subcolumns (N s ; determined by the user), the total number of hydrometeor-filled subcolumn bins at model level h and time step t is equal to the rounded value of N s × f hyd (h, t), where f hyd is the volume fraction of a hydrometeor class (e.g., f cl , f pi ) or a generalized 20 hydrometeor fraction (f gen ) used in the model radiation scheme, at the same model level and time step. Here and henceforth, we assume for simplicity an SCM output (no horizontal coordinate dimensions), even though EMC 2 can generally operate not only on SCM simulation output but also on global simulation output.
2. Stratiform cloud hydrometeors are allocated to subcolumn bins unoccupied by convective cloud hydrometeors while noting that cl and ci are allocated simultaneously to consistently follow the overlap rule. At model level h, stratiform clouds are first randomly allocated to subcolumn bins with overlying stratiform clouds (at level h + 1), followed, if necessary, by random allocation to subcolumns with cloud-free bins directly above at level h+1. This order of processing 30 where clouds are preferentially extended vertically conforms with assumptions that are often implemented in large-scale model radiative transfer calculations. In the case where overlying stratiform hydrometeors exist, the overlying layer phase is not a factor of consideration, such that a subcolumn bin containing ci may be located right above a subcolumn bin containing cl or both cl and ci (mixed-phase), and vice versa.
3. Convective and stratiform precipitating hydrometeors (pl and pi) are allocated to subcolumns without convective-stratiform no-overlap restrictions, i.e., convective and stratiform precipitation may co-exist in a single subcolumn bin while complying with the overlap rule. Similar to the stratiform cloud allocation, precipitation is allocated with maximum random 5 overlap. If some subcolumn bins are still to be populated with precipitating hydrometeors, these hydrometeors are randomly allocated to cloudy grid cells of the same type (convective or stratiform), followed by random allocation of hydrometeors to cloud-free subcolumn bins.
Once the subcolumns are populated with hydrometeors, per hydrometeor class except for stratiform cl in the case of the microphysics approach, hydrometeor mixing ratio is set in every hydrometeor-containing subcolumn bin by q hyd (s, h, t) = 10q hyd (h, t)/f hyd (h, t), where q hyd andq hyd designate the mixing ratio of a hydrometeor class (e.g., q cl , q ci ) in subcolumn bin s and a corresponding model grid cell mean, respectively. In the case of cl when using the microphysics approach, at every model level h and time step t, q cl (s, h, t) is randomly set in cl-containing subcolumn bins such that it would comply with the sub-grid variability gamma distribution described by Morrison and Gettelman (2008, eq. 8) while adjusting the values in the last cl-containing subcolumn bin such that hydrometeor mass is conserved (as in the case of other hydrometeor classes), i.e., We note that sub-grid scale variability of cloud water in ModelE3 is tied to the sub-grid scale variability of moisture rather than set at a fixed value as in Morrison and Gettelman (2008).
In stratiform hydrometeor-containing subcolumn bins, hydrometeor number concentration is set for every hydrometeor class where N hyd andN hyd designate the number concentration of a hydrometeor class in 20 subcolumn bin s and a corresponding model grid cell, respectively. EMC 2 assumes that convective schemes do not diagnosē N hyd , and hence, this information is currently not produced by the simulator.

Microphysics Approach
In the microphysics approach (currently applicable only to stratiform clouds), per hydrometeor diameter D, EMC 2 utilizes 25 q hyd and N hyd to calculate the hydrometeor size distribution, φ hyd (D, s, h, t), fully consistent with the MG2 scheme (see Morrison et al., 2009, eq. 1-3). Using LUTs containing full Mie calculations for spheres (following Bohren and Huffman, 1983, Appendix A) of single particle extinction and backscatter efficiencies at lidar operating wavelength λ l Q e hyd (D, λ l ) and Q bs hyd (D, λ l ), respectively , the lidar particulate extinction cross-section α p hyd (s, h, t) and backscatter cross-section where we use the trapezoidal rule for discrete integration over a series of D values, which can be unevenly spaced by ∆D i,i+1 = 5 D i+1 − D i , while noting that D 1 = D min and D N D = D max , where N D is the number of diameters for which Q e hyd , Q bs hyd , and φ hyd are calculated. In the case of the Mie calculations currently available in EMC 2 , D 1 = 0.1 µm, D N D = 1 cm, and  Warren and Brandt (2008). The Maxwell-Garnet equation (Bohren and Battan, 1980, eq. 10 1) for a mixture of ice and air is used to calculate the effective m hyd for ci and pi based on the ice densities implemented in EMC 2 for ModelE3 (Table 1) relative to bulk ice density of 917 kg/m 3 .
The total α p and β p α ptot (s, h, t) and β ptot (s, h, t), respectively are calculated as the sum of each of these variables for cl, ci, pl, and pi. The lidar linear depolarization ratio (LDR) is estimated by weighting fixed LDR values (per hydrometeor class; see Table 1) with the relative contribution of β p hyd to β ptot .

15
The cumulative optical thickness (from the surface upward) at the base of a given subcolumn bin, τ hyd , is calculated by where ∆z(h, t) denotes the geometrical thickness of model level h at time step t and τ hyd (s, h = 1, t) = 0. The total integrated optical thickness, τ tot (s, h, t), being the sum of τ hyd for cl, ci, pl, and pi, is used to estimate the level of full lidar signal attenuation (received signal not detectable by the simulated instrument), the value of which can be used to constrain comparisons between model output forward calculations and observations. Lidar signal extinction at visible wavelengths typically occurs at an optical thickness of 3-5 (e.g., Sokolowsky et al., 2020, fig. 4), and hence, EMC 2 assumes by default that the lidar signal 5 is extinct at a level where τ tot = 4. We note that EMC 2 allows calculating τ hyd from the top-down (i.e., at the top of a given model layer), thereby enabling simulation of airborne and spaceborne lidar measurements.
In some cases, the observational dataset only consists of measurements made by elastic lidars that do not allow direct retrieval of α p and β p . In these cases, observations can be compared with a diagnostic attenuated β ptot , β ptot,att , which is often referred to as the normalized relative backscatter (NRB; Campbell et al., 2002) after several lidar signal corrections are applied. β ptot,att 10 is calculated here by where β m and T 2 m are the molecular backscatter cross-section and the two-way transmittance calculated following Penndorf (1957), and η is the multiple scattering coefficient, the value of which is assumed to be equal to 1 by default, but can be manually set to other fixed values based on the physical assumptions made or certain empirical results (e.g., Winker, 2003).

Radiation Approach
With the radiation approach, applicable to both stratiform and convective cloud scheme output, forward calculations rely on bulk scattering LUTs. Therefore, this approach is more than two orders of magnitude faster than the microphysics approach, thereby rendering EMC 2 more suitable for the analysis of large model output datasets. Using this approach, a geometric crosssectional area for each hydrometeor-bearing subcolumn bin is first calculated assuming geometric scatterers: where ρ a is the density of air, ρ b is the bulk density of the hydrometeor class phase (1000 and 917 kg/m 3 is the case of liquid and solid water, respectively), and r e hyd is the effective radius of a hydrometeor class in the model grid cell. The α p hyd and β p hyd are then calculated by where Q e,vol hyd (λ l ) and Q bs,vol hyd (λ l ) represent in this case bulk efficiencies per unit volume taken from LUTs, in which they are provided as function of r e hyd . In Eq. 6a and 6b, however, Q e,vol hyd and Q bs,vol hyd are functions of the adjusted effective radius, r * e hyd , which equals r e hyd in all hydrometeor classes and cloud types except for stratiform cloud ice and snow. In these two cases, r * e hyd = r e hyd Φ * hyd and Φ * hyd = Φ hyd ρ hyd where ρ hyd is the hydrometeor class density (see Table 1), and Φ hyd is a constant fluffiness factor. Φ hyd is used such that a value of 0 represents an equivalent mass bulk sphere as in Gettelman and Morrison (2015) while a value of 1 represents a fluffy sphere with an equivalent maximum dimension. In the case of ModelE3, for example, Φ hyd is set by default to an intermediate value of 0.5, and generally serves as one of many 5 tuning parameters The default Q e,vol hyd and Q bs,vol hyd in eq. 6a and 6b (respectively) implemented in EMC 2 were calculated using single particle full Mie calculations in the case of liquid hydrometeors, and single particle scattering LUTs for a severely roughened 8-column ice aggregate (Yang et al., 2013) in the case of solid hydrometeors. These ice aggregate scattering calculations have been shown by Holz et al. (2016) to enable a closure between infrared Moderate-Resolution Imaging Spectroradiometer 10 (MODIS; Platnick et al., 2003) and visible Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP; Winker et al., 2009) satellite ice optical thickness retrievals, and were included in the MODIS collection 6 (C6) cloud product (Platnick et al., 2017). In order to calculate the Q e,vol hyd and Q bs,vol hyd LUTs, we assumed the same gamma distribution parameters as those implemented in the C6 dataset (see Hansen, 1971, eq. 1), consistent with the bulk LUTs utilized by ModelE3's radiation scheme.

15
Following the calculations of α p hyd and β p hyd , the total variables α ptot and β ptot as well as τ hyd , τ tot , and β ptot,att are calculated as in the microphysics approach (sect. 2.2.1).

Microphysics Approach
When the microphysics approach is selected in EMC 2 , the first three radar moments are calculated for each hydrometeor class in 20 every hydrometeor-bearing subcolumn bin; that is, the equivalent reflectivity factor (Z e hyd ), the mean Doppler velocity (V D hyd ) and the Doppler spectral width (σ D hyd ) as well as total radar moment variables Z etot , V Dtot , and σ Dtot . Full Mie calculation LUTs for the emulated radars (Table 2) are first used to calculate β p hyd at the radar operating wavelength λ r following eq. 2b.
The m hyd (λ r ) used for liquid in the Mie calculations can be taken from Segelstein (1981 , Table 1)

25
Z e hyd in every hydrometeor-containing subcolumn bin is then calculated (in linear units) using (Doviak and Zrnić, 1993, eq. 4.33) where |K w | 2 is the dielectric factor for water used in the raw radar observational processing (see Table 2). Using the resultant Z e hyd , V D hyd is then calculated by implementing the hydrometeor class terminal velocities parametrization used in the MG2 30 scheme (cf. Morrison and Gettelman, 2008, Table 2). In the calculation of V D hyd we neglect the model grid cell vertical wind, w, which predominantly has little impact on the V D hyd value. Finally, σ D hyd is calculated, while we note that beamwidth and turbulent broadening (e.g., Chen et al., 2018) are omitted from this calculation, but will be added in future work.
To allow a valid comparison between the forward calculations and observations, signal attenuation is considered in the calculation of the attenuated Z etot , Z etot,att : where Y gas and Y hydtot are the one-way integrated attenuation at the base of the subcolumn bin (in dB) due to atmospheric gases (O 2 and H 2 O; see Ulaby et al., 1981, sect. 5.3-5.5) and all hydrometeors, respectively. Y hyd tot is calculated using where α ptot is determined by summing α p hyd based on eq. 2a calculated at λ r over all hydrometeor classes while setting The vertical profile of the minimum detectable equivalent reflectivity factor, Z emin , is calculated by where z(h, t) is the height at the base of model level h (in meters) at time step t and Z emin (z = 1000 m) is the minimum detectable signal at 1 km (using the highest sensitivity mode; Table 2). When compared with observations, subcolumn bins where Z etot,att (s, h, t) < Z emin (h, t) can be treated as returned signal below the radar noise floor, and hence, are effectively 15 considered hydrometeor-free.

Radiation Approach
When the radiation approach is selected in EMC 2 , forward radar calculations using bulk LUTs are limited to the zeroth radar moment (Z e ) due to a set of limitations: 1. Large-scale model radiation schemes are not informed with hydrometeor fall velocities. Moreover, fall velocity parametrizations in microphysics schemes do not necessarily fully overlap with the hydrometeor size and shape assumptions imple-5 mented in radiation schemes.  3. The total radar moments include combinations of the different hydrometeor class mixing ratios, and hence, cannot be determined using a single set of bulk LUTs per hydrometeor class.

Noting that EMC
Thus, EMC 2 calculates only the Z e hyd , Z etot , and Z etot,att . Eqs. 6b and 7 are used to calculate Z e hyd with bulk scattering LUTs for λ r based on full single particle Mie scattering calculations in the case of liquid hydrometeors and single particle scattering calculations for the Yang et al. (2013) 8-column aggregate at a temperature of 270 K (see Ding et al., 2017) in the 15 case of solid hydrometeors. Similar to the implementation of the radiation approach in the forward lidar calculations (sect.

2.2.2)
, r e hyd considers the fluffiness factor in the case of solid (ice) hydrometeors. Finally, Z etot , and Z etot,att are calculated similar to the microphysics approach using eq. 8 and 9 with α p hyd (and α ptot ) calculated using eq. 6a.

20
Similar to the radiation approach, the empirical approach can be applied to both convective and stratiform cloud scheme outputs, and is limited to the zeroth moment in the case of forward calculations of radar variables. With this approach, empirical relationships between water content and instrument variables are used, similar to the forward simulator described by Lamer Table 2). The lidar α p hyd is calculated in hydrometeor-bearing subcolumn bins using the formulation in Heymsfield et al. (2014, eq. 9 and 9d) and Hu et al. (2007, eq. 3) for ice and liquid hydrometeors, respectively, both of which were 25 derived based on lidar measurements at λ l = 532 nm. The lidar β p hyd is then calculated using the retrieved α p hyd while assuming a constant lidar ratio (the extinction to backscatter ratio) per hydrometeor class (see Table 1). We note that lidar operating wavelengths are not considered in these forward calculations due to the scarcity and the limited ability to validate these empirical parametrizations for different regions and cloud regimes.
In the forward radar calculations, Z e hyd is estimated for the cloud water and rain hydrometeor classes using Fox and Illing-30 worth (1997, eq. 11) and Hagen and Yuter (2003, sect. 4), respectively. Here we implicitly assume that the reflectivity factor (Z) is equivalent to Z e , and hence, the radar wavelength has no weight over the calculation. Z e hyd for cloud ice and snow is estimated with the parametrization in Hogan et al. (2006, Table 2) using different parameters for the Ka and W radar bands. For the calculation of Z etot,att , Y hydtot at the model level base is estimated (in dB) following Doviak and Zrnić (1993, eq. 3.17c): where ρ w = 1000kg/m 3 is the water density, K m = (m 2 hyd −1)/(m 2 hyd +2), LW C designates the sum of the cloud water and precipitating liquid (rain) water content (in g/m 3 ), and Y hydtot (s, h = 1, t) = 0. Signal attenuation due to ice hydrometeors is 5 neglected here due to its typical significantly weaker influence relative to liquid hydrometeors, as well as the large uncertainties associated with the irregular shape of ice particles (see Doviak and Zrnić, 1993, ch. 3.3).

Hydrometeor Classifications
Once the total lidar and/or radar variables are calculated, EMC 2 can be used to classify the subcolumn simulator output.
Currently, EMC 2 incorporates three hydrometeor classification methods: the radar-sounding cloud and precipitation detection 10 and classification, the modified fixed lidar variable threshold phase classification, and the COSP lidar simulator emulator (henceforth referred to as the COSP emulator).
The radar-sounding cloud and precipitation detection method (Silber et al., 2021a; see also Vassel et al., 2019) emulates the combined use of relative humidity with respect to water (RH) sounding measurements for the detection of liquid-bearing clouds and radar echoes (received signal above the radar noise floor) for the detection of precipitating hydrometeors. In the case 15 of large-scale model output, RH below 100% in a model grid cell does not necessarily indicate a lack of cloud water, because of implemented assumptions concerning the sub-grid distribution of cloud water content (e.g., Smith, 1990). Therefore, the approach most consistent with the observational method is to simply use the cloud water-bearing subcolumn bins (see sect. 2.1) to classify the subcolumn bin as "cloud". "Precipitation" are those subcolumns bins in which Z etot,att ≥ Z emin . Subcolumn bins that can be classified as both "cloud" and "precipitation" are set as "mixed". We note that at temperatures below 0 • C, 20 the "mixed" classification type becomes more likely to represent a mixed-phase cloud with decreasing temperatures, but in general, may represent bins containing only liquid hydrometeors.
The modified fixed lidar variable threshold phase classification method is similar to previous studies that incorporated fixed LDR and β ptot thresholds to classify hydrometeor-bearing air volumes to "liquid" and "ice-only" using lidar measurements (e.g., Shupe, 2007; Thorsen and Fu, 2015). By default, however, EMC 2 includes two additional "undefined" classes that cover 25 intermediate regions in the LDR-β ptot , such that a subcolumn bin classified as "undefined1" has a higher probability that it includes some amount of liquid water while "undefined2" is more likely to only contain ice hydrometers (see sect. 4.3 for discussion and illustration of the default thresholds). The notion behind the addition of these two "undefined" classes is the fixed-threshold method limitations that could originate in: 1. Drizzle-or rain-bearing air-volumes, which may produce moderate β ptot and LDR on the order of 0.1, especially when 30 ice hydrometeors are present in the same air volumes (e.g., Derr et al., 1976;Sassen, 2003;Silber et al., 2019a). cannot be objectively translated to the model output domain to enable a direct comparison between the observations and the simulator output. Therefore, the modified fixed threshold routine, which largely agrees with existing measurements yet acknowledges both model and observational uncertainties may allow better comparisons to be made.
The emulator of the COSP lidar simulator follows the same equations and logic of the on-line lidar simulator (Cesana and Chepfer, 2013) implemented in numerous climate models. Note that, unlike the on-line COSP lidar simulator, this emulator 15 operates using the model vertical levels and does not interpolate the model output onto an evenly-spaced vertical grid. As with all other EMC 2 forward calculations and classification routines, this emulator can operate using a top-down viewing approach thus providing a bridge between the COSP simulator and EMC 2 .
Finally, EMC 2 also includes internal functions to calculate mass or frequency phase ratios using each of these hydrometeor classification methods, providing metrics to compare model output with observations or with outputs from other models. 20 3 Software Description EMC 2 depicts a workflow for comparing forward calculated radar or lidar variables generated from large-scale model output with radar or lidar measurements. Fig. 1 shows a flowchart example of this workflow for using EMC 2 to compare ModelE3's output with high spectral resolution lidar (HSRL) measurements. The workflow starts with the Model class incorporated within the emc2.core module. The Model class contains model output field namelists and default hydrometeor parameters 25 (Table 1). Using Python's class inheritance, EMC 2 allows the creation of a custom class specifying a given model's namelists and parameters (a ModelE3 class in this example), which ensures that the model output can be standardized and used by the other modules in EMC 2 . Once loaded through the Model class internal methods, model output data are stored within the Model object using the xarray dataset module (Hoyer and Hamman, 2017). 30 The Model object is then input to the subcolumn generator (sect. 2.1). The results of the subcolumn generator are stored in the xarray dataset contained within the Model object. Here, we introduce the Instrument class. Similar to the Model class, the Instrument class contains relevant information about the instrument being simulated (some of which is listed in Table 2) as well as the single-particle and bulk scattering calculation LUTs. Currently, zenith-pointing instrument properties and scattering calculation LUTs are available for various lidars and radars operated by the Department of Energy (DOE) Atmospheric Radiation Measurement (ARM) climate research facility. That is, the elastic micropulse lidar (MPL) and HSRL, both operating at a wavelength of 532 nm (e.g., Flynn et al., 2007;Eloranta, 2005), the 910 nm CL31 ceilometer (Morris, 2016), 5 the 1064 nm HSRL elastic channel (see Razenkov and Eloranta, 2018), the elastic channel of ARM's Raman lidar operating at 355 nm (e.g., Newsom, 2009), the X-band scanning ARM cloud radar (XSACR; Widener et al., 2012b), the Ka-band ARM zenith radar (KAZR; Widener et al., 2012a), and the W-band ARM cloud radar (WACR and M-WACR; Widener and Johnson, 2006). Similar to the Model class, the Instrument class allows EMC 2 to be tailored to other radars and lidars deployed at 13 https://doi. org/10.5194/gmd-2021-194 Preprint. Discussion started: 12 August 2021 c Author(s) 2021. CC BY 4.0 License. different sites, which does not confine the analysis of measurements and model output to specific ARM instruments or sites, as long as the required parameters and suitable scattering LUTs are provided. Thus, the various variables mentioned in the previous section (e.g., Φ hyd , ρ hyd , LDR) as well as the LUTs, can be easily specified and set to match configurations and assumptions implemented in different large-scale models, as well as complex scattering models more commonly implemented in cloud-resolving and LES models. EMC 2 incorporates a suite of unit tests for each function using the pytest testing tool (https://pytext.readthedocs.io/) to inspect the integrity and functionality of the code. These unit tests are combined with continuous code integration using TravisCI integration service (https://travis-ci.com/), which runs the unit tests every time a developer submits a pull request on GitHub. If the unit test passes with the developer's changes to the code, then the changes are approved to be a part of EMC 2 . 20 Documentation is also automatically generated from the metadata strings in each subroutine to ensure that each part of the code is well documented.

Case Study Example: Highly Supercooled Antarctic Cloud
To demonstrate the application of EMC 2 and its output using the different forward calculation approaches, here we describe and analyze a Lagrangian LES case study (Silber et al., 2019a) adjusted for running and testing the ModelE3 climate model 25 (as well as other climate models) in SCM mode.

Case Description
As described by Silber et al. (2019a), the stratiform cloud event that we compare with model simulations was observed over  fig. 2b; note that a surface-based temperature inversion is common during the austral winter resulting in smaller TOA radiation fluxes). Over McMurdo Station, a decoupled persistent mixed-phase cloud with temperatures as low as -29 • C was observed for ∼39 hours. The observed cloud was nearly continuously precipitating ice particles and was also drizzling for more than 7 hours, concluded from a comprehensive analysis of sounding, HSRL, and KAZR measurements (see Silber et al., 2019a).

Comparison Between Observations and ModelE3 SCM Using EMC 2
The following figures show the EMC 2 forward calculation results using the DHARMA LES simulation and ModelE3's configuration Tun3, one of four configurations of ModelE3 derived in part via a machine learning parameter tuning approach that will be described in a manuscript in preparation, to be included in the Coupled Model Intercomparison Project Phase 6 (CMIP6).
The SCM using this configuration was able to generate a cloud-top inversion and turbulent layer via cloud-top radiative cooling, 5 and produced the best agreement with the observations and the LES relative to the other three configurations (see Appendix B). Whereas here we examine application of EMC 2 to a single ModelE3 configuration in SCM mode in a case where we can also compare with LES, we note that EMC 2 is designed to enable detailed evaluation of atmospheric thermodynamic profile and cloud properties extracted from global simulations of ModelE3 configurations and other climate models against long-term datasets at fixed sites in future dedicated work.

10
The left panels in Fig. 3 show the mixing ratios of the four hydrometeor classes evolving through the simulated SCM case study. These depicted mixing ratios are the output of the SCM's stratiform cloud scheme, but because this simulation does not generate any convective hydrometeors (as expected), they also represent the total mixing ratios. Cloud water mass (top panel) dominates over the other hydrometeor classes in hydrometeor-bearing model grid cells through much of the simulation, even at lower levels in which the cloud water fraction is rather low (fig. 3, right panels). Rain (effectively drizzle) is produced by the 15 model as well but has a relatively smaller mass compared with that of snow generated. This reduced amount of rainwater in the SCM simulation relative to rainwater dominance in the simulations of Silber et al. (2019a) is largely the result of the different autoconversion parameterization schemes implemented in ModelE3 (Seifert and Beheng, 2001) relative to that implemented by default in the DHARMA LES (Khairoutdinov and Kogan, 2000), which produces significantly smaller rain mass mixing ratios in this case (not shown), contrary to some previous studies (e.g., Heiblum et al., 2016;Xiao et al., 2021). Understanding 20 the source of this differing autoconversion parameterization behavior requires a dedicated study that is a beyond the scope of this manuscript.

Figs. 4 and 5 depict the HSRL and KAZR variables observed during a single hour over McMurdo Station and simulated
with EMC 2 using the DHARMA LES three-dimensional output at the end of that simulation (without using the subcolumn 25 generator), and by applying each of the three approaches on ModelE3 configuration Tun3 SCM output for 05:00 UTC. Because our goal in this section is to demonstrate that EMC 2 can reasonably match cloud observations given comparable input, we present the EMC 2 -processed SCM output 4 hours into the simulation when cloud top heights are similar to observed instead of the end of the simulation (the SCM develops the supercooled cloud faster than the baseline LES; see Appendix B). When evaluating the processed model output against the observations, we essentially exchange temporal resolution with spatial  The processed LES output exhibits generally good agreement with the observations, evident by the comparable lidar and radar variable values and their horizontal variability (Figs. 4 and 5), the vertical cloud structure and boundaries, as well as the full lidar signal attenuation near cloud top (Fig. 4). A multi-layer cloud structure developed by the LES is suggested by the 5 intermittent breaks in the large β ptot and α ptot values. This multi-layer structure is also indicated by the lidar observations (Fig.   4), and was comprehensively discussed by Silber et al. (2019a).
Using the microphysics approach, the SCM sub-grid variability is more pronounced relative to the radiation approach owing to the occurrence of cloud water combined with the implementation of its sub-grid variability as defined in the MG2 micro- dimensional output processed using EMC 2 microphysics (without the subcolumn generator), ModelE3 EMC 2 output using the microphysics approach, EMC 2 output using the radiation approach, and EMC 2 output using the empirical approach with the microphysics cloud fractions for the subcolumn generator (see sect. 2.1). The DHARMA LES output corresponds to 10:00 UTC (arrival of cloud field at McMurdo Station; see sect. 4.2), while ModelE3's EMC 2 panels depict the SCM output for 05:00 UTC (see text) processed with EMC 2 using 100 subcolumns.
A full lidar signal attenuation mask of τtot > 4 (τtot is the total accumulated optical thickness) is applied to the plotted simulated data   fig. 4, but with the KAZR attenuated equivalent reflectivity factor (Ze tot,att ; top row), mean Doppler velocity (VD tot ; middle row), and spectrum width (σD tot ; bottom row). Note that only Ze tot,att is calculated in EMC 2 's radiation and empirical approaches (see sects. 2.3.2 and 2.4). A radar signal-to-noise ratio (SNR) mask of Ze min > Ze tot (Ze min is the minimum detectable signal on each model level) is applied to the plotted data (hatched areas). which leads to full lidar signal attenuation, as also indicated by the HSRL measurements ( fig. 4). Full lidar signal attenuation does not occur in this case using the radiation approach because of the uniform distribution of cloud water between subcolumns commensurate with τ tot values just below the full attenuation threshold of 4 (not shown). The subcolumns with full lidar signal attenuation in the microphysics approach call for the use of radar measurements for cloud-top detection (e.g., fig. 5 discussed below).   The EMC 2 lidar output also highlights a few model weaknesses. For example, using the microphysics approach, cloud base (at a height of ∼1.0 km) is highly variable and may extend nearly down to the surface, which is in contradiction to the observations, where cloud base variability is on the order of a few hundred meters throughout the depicted period ( fig. 4).
Using the radiation approach, on the other hand, breaks in hydrometeor cover are seen below the fully overcast layer, as a 21 https://doi. org/10.5194/gmd-2021-194 Preprint. Discussion started: 12 August 2021 c Author(s) 2021. CC BY 4.0 License. result of the generalized hydrometeor fraction used in the radiative transfer calculations, which can be lower or higher than the associated hydrometeor class fraction (e.g., fig. 3, right). Lesser (greater) generalized hydrometeor fractions relative to the associated hydrometeor class fraction therefore implies greater (smaller) subcolumn bin mixing ratios (see eq. 1), and hence, enhanced (diminished) associated β p hyd and α p hyd values. Thus, while the occurrence of low-level cloud water-bearing bins produces full attenuation of the simulated lidar signal in some subcolumns using the microphysics approach ( fig. 4), the smaller 5 α p hyd ensuing from the larger low-level generalized cloud fraction relative to the cloud water fraction (see fig. 3, right) causes no low-level (or any level in this case) signal extinction when the radiation approach is used (see fig. 4). Note that in both the microphysics and radiation approaches the subcolumn representation of hydrometeor mass remains consistent with the model output variables, i.e., eq. 1 holds for each hydrometeor class.
While EMC 2 's processing using the microphysics and radiation approaches is largely consistent with the observations, 10 LES, and each other (figs. 4 and 6), the empirical approach gives less consistent results. The β ptot and α ptot at heights where cloud water is the dominating hydrometeor are reasonably represented. However, the empirical β ptot and α ptot in the lowerelevation ice-dominated regions are significantly smaller relative to the observations, whereas the resulting LDR values below the overcast cloud layer match the observations and LES slightly better. The Z etot,att calculated using the empirical approach is underestimated at low levels ( fig. 5). These deviations of the empirical approach relative to the microphysics and radiation 15 approaches are strongly influenced by the deficient consideration of a given model physics, as well as the limited flexibility of the empirical parametrizations, which were specifically derived for certain geographical regions and/or conditions. Thus, while some assumptions made in the development of the empirical parameterizations may overlap with ModelE3 phyiscs in this particular case, others may not.
The microphysics and radiation approaches exhibit Z etot,att values that are too large, especially at higher levels (figs. 5 20 and 6). EMC 2 's radar processing using the microphysics approach provides the V Dtot and σ Dtot variables in addition to the Z etot,att variable. As indicated from fig. 5, both of these calculated variables show grossly reasonable correspondence between the observations and the SCM, except at heights above ∼0.8 km, where V Dtot values show large deviations ( fig. 6). These deviations are mainly the result of relatively fast fall velocities (see Table 1) and the dominance of large snow hydrometeors over Z etot,att at these levels (not shown). Congruent with the description above of the calculated radar and lidar variables, the radar-sounding and modified fixed lidar threshold methods show the domination of liquid hydrometeor classes above ∼1.0 km, the prevalence of precipitating hydrometeors at lower levels (with lower occurrence using the radiation scheme approach), and the occasional full attenuation of the 30 simulated lidar signal in the case of the lidar-based classification method. The COSP emulator detects a clear liquid-bearing subcolumn bin signal at cloud tops. However, because the cloud-top layer is highly reflective, generating large lidar scattering ratio values (the ratio of total to molecular attenuated backscatter), most of the underlying layers are either classified as "undefined" or generate signals too weak to be detected.  Using each of the three classification methods, phase ratio statistics can be generated with EMC 2 offering a method for analyzing the SCM simulation. Fig. 8 portrays the temporal evolution of the SCM simulation from the view of the simulated instruments and classification methods using the microphysics or radiation approaches. When radar-sounding and fixed lidar threshold methods are applied while using the microphysics approach, the evolution of the simulated cloud appears selfconsistent between the two methods and generally follows the prototypical appearance of nearly continuously precipitating 5 23 https: //doi.org/10.5194/gmd-2021-194 Preprint. Discussion started: 12 August 2021 c Author(s) 2021. CC BY 4.0 License. liquid-bearing cloud layers with weakly varying cloud base height (e.g., de Boer et al., 2011;Fridlind and Ackerman, 2018;Silber et al., 2021a). Here, the radar-sounding skill is associated with the diminished cloud water fraction relative to the large fraction of the other hydrometeor classes ( fig. 3, right) and the method's ability to correctly detect cloud water layers. The capabilities of this method should nonetheless be considered carefully when directly compared with observations because realistic sounding profiles typically lack the fine temporal resolution emulated by EMC 2 here, and while in-situ observations 5 can provide a robust characterization of liquid-bearing cloud layers, they can also produce sporadic false positive or negative cloud detections (e.g., Silber et al., 2020, fig S1;Vassel et al., 2019). In the case of the modified fixed lidar threshold method, the low-level phase ratio skill originates from the hydrometeor-bearing subcolumn bins being largely classified as "ice". This classification decision is the result of the low β ptot and moderate-to-high LDR (e.g., fig. 6) produced by the prevalence of ice hydrometeors relative to cloud water ( fig. 3, right) and the greater mass (and likely volume due to the spherical representation) 10 of these hydrometeors relative to rain ( fig. 3, left).
The COSP emulator using the microphysics approach with a top-down view is consistent with the example in fig. 7, in which hydrometeor detection is limited by the optically-thick and highly reflective cloud-top region, resembling observational retrievals (e.g., Cesana and Chepfer, 2013;.

15
Using the radiation approach, phase ratios more frequently show sharper transitions between the extreme values (all liquid or all non-liquid) stemming from the utilization of the generalized hydrometeor fraction. In the case of the radar-sounding method, for example, there is a distinct dominance of liquid-bearing bins, which only require any amount of cloud water to be classified as such. This dominance originates in the common occurrence of some cloud water mass mixing ratios in model grid cells throughout the SCM simulation ( fig. 3, left) combined with the implementation of the generalized hydrometeor fraction, 20 which necessarily increases the overlap between cloud water and other hydrometeor classes. The limited number of model grid cells in which the subcolumn bins exhibit a more mixed behavior are the result of the randomized component of the subcolumn generator, which does not necessarily require overlap between cloud and precipitating hydrometeors (see sect. 2.1). Based on these classification results, we suggest that the radar-sounding method could lead to hydrometeor classification favoring liquid-bearing classes when the radiation approach or similar model output with a generalized hydrometeor fraction is used. 25 The modified fixed lidar threshold method is relatively consistent with the microphysics approach ( fig. 8), even though some times (mainly around 06:00 and 10:00 UTC) are characterized by greater (smaller) relative liquid occurrence at lower (higher) model levels (phase ratio values closer to 0.5). These phase ratio differences relative to the microphysics approach are the result of the convolution of the generalized hydrometeor fraction and its deviation relative to the cloud water fraction ( fig. 3, right).
Using a top-down view, the COSP emulator agrees with both its application using the microphysics approach as well as 30 the COSP output from the on-line simulator implemented in ModelE3, which utilizes the radiation approach ( fig. 8. Unlike the on-line simulator, the emulator detects some "undefined" hydrometeors at low levels (down to the surface), which can be explained by the lack of model interpolation onto a uniform vertical grid and/or small differences in the compiled simulator code related to signal attenuation (e.g., the accumulation of optical thickness). bins) of ModelE3 configuration Tun3 SCM simulation using the microphysics (left) and radiation (right) approaches. The classification methods used to calculate the phase ratios are (from top to bottom) the radar-sounding method, the modified fixed lidar variable threshold method, the COSP lidar simulator emulator (top-down view), and the on-line COSP lidar simulator implemented in ModelE3, which is only processed using ModelE3's radiation approach. In this figure, the "mixed" class of the radar-sounding method is counted as liquid, while the "undefined" classes in the other two methods are treated as "non-liquid" even though in some cases they are more likely to be liquid-bearing (e.g., the "undefined1" class in the modified fixed lidar variable threshold method). Fig. 8 demonstrates the sensitivity of phase ratio statistics to the classification method, the viewing direction of the examined instrument, and the method by which "liquid" and "non-liquid" or "ice" classes are being counted. It shows that the use of forward simulators alone is not a guarantee for an "apples-to-apples" comparison, which requires matching processing steps to ensure its robustness.

5
EMC 2 provides an easy to use and flexible framework for the analysis of large-scale model output and its direct comparison with ground-based observations via the generation of subcolumns intended to explicitly represent a sub-grid scale variability, 25 https: //doi.org/10.5194/gmd-2021-194 Preprint. Discussion started: 12 August 2021 c Author(s) 2021. CC BY 4.0 License. and the simulation of ground-based (and air-or space-borne) radars and lidars. EMC 2 's framework is already tailored to the MG2 2-moment microphysics while using single-particle scattering LUTs and has the proper infrastructure required for it to be customized to other similar schemes, as well as high-resolution model output. EMC 2 's option for using radiation scheme logic in the subcolumn generator and simulator enables direct comparison with other on-line active instrument simulators (e.g., the COSP lidar simulator) with a bottom-up or top-down view option that can bridge between different methodologies by 5 evaluating differences between the outputs resulting from their implementation.
Because it is generally designed to emulate ground-based systems, EMC 2 is suitable for the evaluation of column output extracted from global simulations against long-term ground-based datasets. The general adaptability of the software code to other climate models and instruments via the model and instrument Python classes renders EMC 2 as a flexible framework to enable consistent and reproducible post-processing methods and evaluation across multiple models.

10
An AWARE case study was used to illustrate the application of EMC 2 to LES and SCM simulations of a highly supercooled Antarctic cloud, including the utility of the program for hydrometeor classification using radar-sounding, lidar variable thresholds, and COSP emulator methods. The ModelE3 SCM using configuration Tun3 showed general agreement with the observations at the examined simulation time as well as with the baseline DHARMA LES used to develop this case study (see Silber et al., 2019a). The LES output can be processed with EMC 2 after a few adaptations made only to an inherited Model 15 class (see sect. 3). Thus, although it was developed for large-scale models, EMC 2 can also be used to compare cloud resolving or LES models with observations (as shown for DHARMA). EMC 2 also allows the implementation of advanced scattering model calculations in the forward calculations via customized LUTs that could be matched to some scattering assumptions made by models, for example, the implementation of the MODIS C6 calculations in both ModelE3's radiation scheme and EMC 2 . 20 The AWARE case study presented here is suitable for simulation by any global model in SCM mode (see input file repository specified under code and data availability). Case study observations, as well as ModelE3 SCM and DHARMA LES inputs and outputs from EMC 2 used to produce all examples above are also provided for step-by-step illustration (see code and data availability). We plan that additional case study examples will similarly be provided to illustrate results under differing cloud regimes. 25 Planned future additions to EMC 2 include an extension to ground-based scanning radars, a Mie scattering calculator, spectral broadening estimates for the radar simulator, and a multiple-scattering model for the lidar simulator, all of which will be configured for consistency with model physics and output fields. We invite the community to take advantage of the framework provided by EMC 2 and to contribute to its further development and applications.
Appendix A: Lists of acronyms, abbreviations, and symbols   and Phys maintain substantial amounts of LWP in this highly supercooled cloud case study and agree with both the depicted 5 DHARMA LES output and the observed LWP retrievals (see Silber et al., 2019a). Moreover, the SCM using each of these two configurations is able to develop a cloud-top inversion and TKE, both of which are driven by radiative cooling of cloud water, consistent with the DHARMA LES (see A1, bottom panels) and various polar cloud observations (e.g., Morrison et al., 2012;Silber et al., 2019a). All of ModelE3's configurations except for Tun1, which generates the largest amount of ice, are within 29 https://doi.org/10. 5194/gmd-2021-194 Preprint. Discussion started: 12 August 2021 c Author(s) 2021. CC BY 4.0 License. Figure A2. Mean profiles of observed (time-averaged) and simulated (subcolumn-averaged) HSRL and KAZR variables calculated using EMC 2 microphysics (top) and radiation (bottom) approaches using the four ModelE3 configurations (see legend), without applying a lidar and radar extinction or SNR masks, respectively. the range of IWP estimated retrieval uncertainty. Out of those three model configurations, Tun3 is closest to the retrieved IWP and best matches the cloud formation and evolution in the LES.
Because the supercooled cloud is developed faster in the SCM than in the LES model (see A1, top and middle), we choose to demonstrate EMC 2 using the SCM output corresponding to 05:00 UTC. Fig A2 shows the mean lidar and radar variable profiles at that model time step from EMC 2 for each of the four model configurations together with the time-averaged observed 5 profiles. Using either the microphysics or radiation approach, configurations Tun3 and Phys best match the observations. Most of the radar and lidar variables calculated using these two configurations are largely consistent with each other, but overall, the Tun3 SCM output shows the best agreement with the mean observed profiles.
To summarize, two of ModelE3's final four configurations, that is, configurations Tun3 and Phys, show reasonable agreement with both the observed quantities as well as the LES output variables. Because configuration Tun3 performs slightly better, we focus on this model configuration for detailed comparison with LES and observations.

5
Code and data availability. The most recent EMC 2 code is available on GitHub at https://github.com/columncolab/EMC2/. The EMC 2