The Cloud-resolving model Radar SIMulator (CR-SIM) Version 3.3: description and applications of a virtual observatory

Ground-based observatories use multisensor observations to characterize cloud and precipitation properties. One of the challenges is how to design strategies to best use these observations to understand these properties and evaluate weather and climate models. This paper introduces the Cloud-resolving model Radar SIMulator (CR-SIM), which uses output from high-resolution cloud-resolving models (CRMs) to emulate multiwavelength, zenith-pointing, and scanning radar observables and multisensor (radar and lidar) products. CR-SIM allows for direct comparison between an atmospheric model simulation and remote-sensing products using a forward-modeling framework consistent with the microphysical assumptions used in the atmospheric model. CR-SIM has the flexibility to easily incorporate additional microphysical modules, such as microphysical schemes and scattering calculations, and expand the applications to simulate multisensor retrieval products. In this paper, we present several applications of CR-SIM for evaluating the representativeness of cloud microphysics and dynamics in a CRM, quantifying uncertainties in radar–lidar integrated cloud products and multi-Doppler wind retrievals, and optimizing radar sampling strategy using observing system simulation experiments. These applications demonstrate CR-SIM as a virtual observatory operator on high-resolution model output for a consistent comparison between model results and observations to aid interpretation of the differences and improve understanding of the representativeness errors due to the sampling limitations of the ground-based measurements. CR-SIM is licensed under the GNU GPL package and both the software and the user guide are publicly available to the scientific community.


Introduction
Ground-based observatories offer an integrated view of cloud and precipitation systems complementary to that available from satellites with excellent vertical resolution, especially in the boundary layer, and an accompanying description of the large-scale forcing. Currently, a number of observatories are continuously operated in different climate regimes (Illingworth et al., 2007;Löhnert et al., 2015;Stevens et al., 2016;Mather et al., 2016) with evolving measurement capabilities. In the beginning, zenith-pointing cloud radars, lidars, and radiometers provided the primary cloud and precipitation measurements. Recently, the need to characterize the mesoscale organization of clouds and precipitation over a larger domain has heightened the sophistication and complexity of these observatories to go beyond single, one-dimensional profiling measurements. For example, the US Department of Energy (DOE) Atmospheric Radiation Measurement (ARM) observatories offer observations from distributed networks of profiling and scanning radars, lidars, and radiometers (Turner and Ellingson, 2016;. Multiparametric information from profiling and scanning radars, lidars, and radiometers has been used to retrieve cloud microphysical and kinematic properties, such as ice particle properties (e.g., Zhang et al., 2014;Kneifel et al., 2015Kneifel et al., , 2016Matrosov et al., 2017;Von Lerber et al., 2017), and verti-M. Oue et al.: The Cloud-resolving model Radar SIMulator (CR-SIM) Version 3.3 cal velocities Giangrande et al., 2016). However, the comparison between the retrieved observables (e.g., ice water content from radar reflectivity) and modelproduced parameters often involves large uncertainties. Several factors, not limited to the nature of ground-based observations, challenge model evaluation using these observations. In many cases, the retrieval algorithms are based on statistical estimation of ill-posed inverse problems, and the results may not capture well the observed variability of natural data because of limitations from assumptions embedded in the retrieval algorithms (e.g., Szyrmer et al., 2012;Szyrmer and Zawadzki, 2014). Furthermore, determining critical parameters for model evaluation such as the cloud fraction profile requires complementary, synergistic observations from both radar and lidar. One such example is the Active Remote Sensing of Clouds (ARSCL; Clothiaux et al., 2001) product that combines radar and lidar data to determine the hydrometeor height distributions. Other examples of critical parameters that require a multisensor approach include cloud and precipitation classification schemes (Illingworth et al., 2007) and hydrometeor phase classification (e.g., Shupe, 2007;Luke et al., 2010;Lamer et al., 2018). So, how do we best compare such products developed using multiple sensors with different capabilities (i.e., sensitivity) with numerical model output? Further, challenges may arise from the sampling strategy used to obtain the observations. For example, a recent study by Oue et al. (2016) has shown that zenith-pointing observations from one location are inadequate to provide reliable cloud fraction profile estimates in a cumulus field. A similar investigation on 3D wind retrievals in deep convection using multi-Doppler radar techniques highlights similar deficiencies of our current observing systems (Oue et al., 2019a). How do we best quantify the measurement uncertainty introduced by the observational strategy?
In this paper, we introduce the Cloud-resolving model Radar SIMulator (CR-SIM), which has been continuously developed over the last 5 years to facilitate modelobservation comparisons. CR-SIM applies forward simulators to atmospheric model output to simulate ground-based measurements. These simulations may be used (1) to compare directly to the measurements, which provides an applesto-apples comparison of the observed variables, or (2) as input to retrieval algorithms to assess the retrieval methodology or sampling strategy using the original atmospheric model output as "truth". In this study, the CR-SIM architecture and capabilities are presented along with a series of forward simulations that emphasize its capabilities. In particular, we highlight the applications of CR-SIM in investigations of observational uncertainties. Although accurate estimation of uncertainties in the retrieval products (e.g., ice water content, liquid water content, vertical velocity) is challenging, forward simulators allow us to emulate the observational retrieval products accounting for known error sources to understand the exact impacts of those error sources on the products by comparisons with the truth, which is usually the input model data. Observing system simulation experiments (OSSEs) take advantage of forward modeling to produce simulated measurements. The understanding from OSSEs help us to (i) evaluate the model simulations using the observations while accounting for the observation limitations, (ii) estimate uncertainties in retrieval techniques used, (iii) propose new retrieval techniques accounting for the uncertainties, and (iv) optimize new observation system strategies. This study demonstrates the application of the CR-SIM forward simulator in several OSSEs in which ARM multisensor products, such as cloud locations and vertical velocity, are evaluated by considering limitations inherently imposed by the nature of the observations. (A list of acronyms is provided in Appendix B for easy reference.)

Forward simulators
Forward simulators have been widely used to design observing systems and to provide an alternative path to modelobservation comparisons by transforming the model geophysical quantities into remote-sensing observables. There are several sophisticated radar simulators which have been developed for specialized applications of interest. For example, Snyder et al. (2017a, b) simulate polarimetric radar characteristics of a supercell using radar forward simulators to understand the contribution of microphysical characteristics to the polarimetric properties and their wavelength dependency. They account for the water fraction of solid ice particles to realistically simulate differential reflectivity (Z DR ) columns, specific differential phase (K DP ) columns, and copolar correlation coefficient (ρ hv ) rings in supercells. A cloud radar simulator developed by Zhang et al. (2018) is designed to simulate vertically pointing cloud radar reflectivity (e.g., Ka-and W-band radars) using global climate model (GCM) output, which is beneficial for comparison of datasets at different scales (cloud-scale observational data versus globalscale data). Matsui et al. (2019) simulate polarimetric precipitation radar-based hydrometeor classification, vertical velocity, and rain rate from CRM output to examine uncertainties in the retrieval algorithms and model microphysical parameterizations using the POLArimetric Radar Retrieval and Instrument Simulator (POLARRIS). The uncertainties are attributed to assumptions of hydrometeor particle size distribution, density, axis ratio, and/or canting angle. Lamer et al. (2018) developed a GCM-oriented groundobservation forward simulator ((GO) 2 -SIM), which is a comprehensive radar-lidar simulator for the GCMs that emulates radar Doppler spectra moments, lidar backscatter, and depolarization, and it provides synthetic estimates of mixed-phase cloud occurrence in the GCMs that are comparable to those estimated from observations using the same methodology.
CR-SIM can simulate the ideal multiwavelength radar and lidar observables and multisensor integrated products. The zenith-pointing and scanning radar observables include radar reflectivity, Doppler velocity, spectrum width, and polarimetric fields. Zenith-pointing lidar observables include lidar backscatter and extinction coefficient. The idea behind CR-SIM is to have a forward-model operator that provides idealized radar and lidar observables (i.e., actual observations after perfect quality control and correction for the total (twoway) attenuation) on the same grid as in the CRMs or largeeddy simulations (LESs) to facilitate model-observation comparisons. Further, the design is flexible enough to be coupled with different microphysical schemes and different scattering methods (e.g., T-matrix method by Mishchenko, 2000; discrete dipole approximation by Yurkin and Hoekstra, 2011).
The CR-SIM forward simulator is tailored to compute radar and lidar observables by integrating scattering properties over the discrete particle size distributions (PSDs) using a constant bin size for each hydrometeor (Table 1), based on the microphysical scheme used in the CRM or LES. The environmental variables are obtained from a mandatory set of model output variables consisting of pressure, temperature, dry air density, and height above sea level. The single-scattering properties are calculated using the T-matrix method and packaged as look-up tables (LUTs) in CR-SIM. The simulated idealized radar and lidar variables are provided at each model grid box and can be easily compared with real observations.

Scattering properties
The LUTs consist of the complex scattering amplitudes S ij of the 2 × 2 scattering matrix for single, nonspherical particles at fixed orientations with equally spaced particle sizes computed using the T-matrix code of Mishchenko and Travis (1998) and Mishchenko (2000). Following Ryzhkov et al. (2011), we assume that the scattering characteristics of arbitrarily oriented particles can be expressed by the scattering amplitudes f a and f b corresponding to the principal axes of spheroid when the electric vector of the illuminating electromagnetic wave is directed along the spheroid axes a and b, respectively. Each hydrometeor species is characterized by its equivalent volume diameter, D = (ab 2 ) 1/3 , where a is the symmetry axis of the spheroid, and b is its transverse axis. In this convection, an oblate hydrometeor has a < b.
The scattering amplitudes f a and f b correspond to S vv and −S hh computed by the Mishchenko (2000) T-matrix code, respectively (f a ≡ S vv ; f b ≡ −S hh ). The minus sign in the expression for f b is to account for switching from the forwardscattering convention used for the amplitude and phase matrix components in the T-matrix code to the backscatteralignment convention used in the definitions of polarimetric radar measurements except for forward-scattering parameters (attenuation and phase shift).
The LUTs of scattering properties for single particles are constructed for each hydrometeor class corresponding to the CRM or LES simulation output (e.g., rain drop, snowflakes, cloud droplet, ice crystal, graupel) as a function of particle phase, bulk density, aspect ratio, size, and temperature. We used fixed values of these parameters depending on the "scattering type", which must be assigned to the corresponding hydrometeor class. The bulk density used is as parameterized in the selected microphysics scheme, assuming spheroid particle shapes. For each hydrometeor class, the complex scattering amplitudes are calculated for 91 elevation angles from 0 to 90 • with a spacing of 1 • ; five radar frequencies at 3 GHz (S band), 5.5 GHz (C band), 9.5 GHz (X band), 35 GHz (Ka band), and 94 GHz (W band); different temperature ranges for the liquid hydrometeors; different particle densities for solid hydrometeors; and a few different values of particle aspect ratio. The scattering types built in the current LUTs and their parameter settings are presented in Table 2. For lidar scattering properties, the single-particle extinction σ α and backscattering cross section σ β for spherical cloud droplets and cloud ice are calculated using the BHMIE Mie code (Bohren and Huffman, 1998). CR-SIM operates for observables from the ceilometer (wavelength of 905 nm) and micro pulse lidar (MPL, wavelengths of 353 and 532 nm).
Although most of the parameters related to hydrometeor particles (e.g., particle bulk density, size) required in the scattering calculations can be obtained from the prognostic and diagnostic variables from the CRM or LES, aspect ratios and canting angles must be assumed in the simulator and as such are prescribed by the users. All liquid and ice hydrometeors are assumed to be oblate spheroids, except cloud droplets. Raindrops are represented as oblate spheroids with a sizedependent aspect ratio following an empirical equation as a function of particle diameter, based either on Brandes et al. (2002) or Andsager et al. (1999). A fixed aspect ratio is used for each solid hydrometeor category, but for graupel and hail the empirical expression proposed by Ryzhkov et al. (2011) is also available. For all model hydrometeors, the radar polarimetric variables (which depend on particle orientation) are calculated using complex scattering amplitudes from the pre-built LUTs, assuming a mean particle canting angle of 0 • (Ryzhkov, 2001) with a choice of the particle orientation distribution. The possible choices for the distribution of particle orientations are fully (three-dimensional) random orientation, random orientation in the horizontal plane, and a two-dimensional axisymmetric Gaussian distribution of orientations. In this paper, for all simulations, we used aspect ratios proposed by Brandes et al. (2002) for rain drops, 0.2 for cloud ice, 0.6 for snow, those by Ryzhkov et al. (2011) for graupel and hail, and the two-dimensional axisymmetric Gaussian distribution for all hydrometeor species.
A hydrometer in CR-SIM is either pure liquid or a mixture of ice and air. The composition of particles within a volume is represented in the scattering computations by an appropriate selection of the dielectric constant for different hydrometeor types. The dielectric constant of liquid particles is frequency dependent and temperature dependent (Ray, 1972). Table 1. The minimum and maximum sizes and bin spacing of simulated particles for each hydrometeor category from the bulk microphysics schemes except the P3 scheme. "Particle size" here refers to the maximum particle dimension.

Category
Minimum size (µm) Maximum size (µm) Bin spacing (µm) Ice phase hydrometeors are assumed to be composed of ice and air in an ice matrix, and their effective dielectric constant is computed using the Maxwell Garnet mixing formula (Maxwell Garnet, 1904). The output radar reflectivity (Z hh ) for all hydrometeor species is the equivalent radar reflectivity, for which the computations adopt a dielectric factor of 0.92 for all hydrometeor species. This choice of the dielectric factor ensures a convention that the definition of radar reflectivity reduces to the following form: Z = N (D)D 6 dD for (spherical) liquid particles, where D is the droplet diameter and N (D) is the droplet size distribution function. The LUTs of scattering properties currently incorporated into CR-SIM were created using the T-matrix method where solid phase hydrometeors are represented as dielectrically dry oblate spheroids. Although these assumptions are rather simple compared to some other radar simulators that take into account complex electromagnetic scattering by mixedphase hydrometeors or ice hydrometeors with possibly irregular shapes (e.g., Snyder et al., 2017a, b;Jiang et al., 2019), such complex electromagnetic scatters can be easily incorporated by adding LUTs of their scattering properties from different scattering calculation methods (e.g., Kneifel et al., 2017;Leinonen and Moisseev, 2015;Leinonen and Szyrmer, 2015;Lu et al., 2016) without any structural change to the code.

Calculations of radar and lidar observables
The PSD for each hydrometeor species is produced based on the model microphysics scheme. The incorporated microphysics schemes and corresponding CRM currently available in CR-SIM are listed in Table 3. CRMs coupled with bulk moment microphysics (i.e., single and multi-moments) prognose mixing ratio and/or the number concentration of each hydrometeor species. Using these parameters, the PSD is determined in combination with size distribution assumptions used in the microphysics scheme. Bin microphysics schemes explicitly calculate the evolution of the PSDs. Radar moment observables are computed by integrating scattering properties (see Appendix A) from the LUTs over the discrete PSD for each hydrometeor type following Ryzhkov et al. (2011) while accounting for an orientation distribution as described in Sect. 2.1. They are then integrated over all simulated hydrometeor species to produce a unique value for each observable at each grid box (see Tatarevic et Table 4. Particle fall velocity, which is used for Doppler velocity and spectrum width computations, is parameterized as a function of particle diameter in the same manner as in the selected microphysics scheme. The fall velocity size relationship (V f (D)) for each hydrometeor species is specified in the following form: where f c = (ρ surf /ρ) k is the correction factor for air density with exponent k. The values for the prefactor a v , the exponent b v , and the exponent k vary according to the microphysics scheme and do not depend on particle orientation. The air density at sea level, ρ surf , is computed for the first model level. The values of coefficients and detailed descriptions concerning the microphysics schemes are found in Tatarevic et al. (2019). The versatility of the CR-SIM computational capabilities is briefly demonstrated by the following examples.
- Figure 1 shows an example of S-band (3 GHz) radar observables from CR-SIM for a mesoscale convective system (MCS) observed on 20 May 2011, during the Midlatitude Continental Convective Clouds Experiment (MC3E; Jensen et al., 2016). The convective system was simulated using the Weather Research Forecasting (WRF) model (Skamarock et al., 2008) with the Morrison 2-moment microphysics scheme, a horizontal resolution of 0.5 km, and the vertical resolution of approximately 0.25 km.
-CR-SIM includes a computation of the Doppler power spectra by introducing the method used in Kollias et al. (2014). Figure 2 shows examples of the Doppler spectra and its moments at S band for a WRF-simulated cumulus field. In the figure, a pulse repetition frequency (PRF) of 600 Hz is used, the noise power at 1 km is −40 dB, and the number of Doppler velocity bins is 256.
-CR-SIM also includes forward simulators for the ceilometer (wavelength of 905 nm) and ground-based micro pulse lidar (wavelengths of 532 and 353 nm). The lidar observables are computed for cloud ice and cloud droplet species (see Table 5 and Appendix A). Figure 3 shows an example of profiles calculated for lidar observables for a cumulus case from the LES ARM Symbiotic Simulation and Observation project (LASSO;  using WRF coupled with the Morrison 2-moment microphysics scheme. In this simulation, typical profiles are presented for aerosol backscatter (β aero ), extinction coefficient (α ext_aero ), and molecular backscatter (β mol ) based on Spinhirne (1993). As expected, the lidar backscatter is significantly attenuated by cloud droplets, but the very high lidar backscatter at the interface between air and cloud can be used to detect cloud base height.

Instrument model
An instrument model is used to postprocess the CR-SIM results to account for the total (two-way) attenuation in a pathway and the effects of technological specifications on the observables, such as sampling volume and detector sensitivity. Milbrandt and Yau multi-M scheme (Milbrandt and Yau, 2005a, b) Thompson 1-and 2-M scheme (Thompson et al., 2008) Predicted particle properties (P3) scheme  Spectral bin microphysics (Fan et al., 2012) ICOsahedral Non-hydrostatic general circulation model (ICON) Seifert and Beheng 2-M scheme (Seifert and Beheng, 2006;Seifert, 2008) Regional Atmospheric Modeling System (RAMS) 2-M scheme (Cotton et al., 2003) System for Atmospheric Modeling Tel Aviv University 2-M bin microphysics (Tzivion et al., 1987;Feingold et al., 1996) Morrison 2-M scheme (Morrison et al., 2005) The standard output of CR-SIM consists of synthetic profiling radar and lidar observations at each grid box of the model domain and synthetic scanning radar observations for a radar positioned at a user-specified location inside the model domain. The output synthetic fields are artifact free, with no propagation (e.g., total attenuation in a pathway) or instrument sampling effects (e.g., antenna beamwidth, range-gate spacing). This approach is based on the notion that the real observations used for comparison against the synthetic simulated observables will have undergone rigorous postprocessing that mitigate attenuation effects to the extent possible, e.g., velocity folding. However, the user can emulate the true observations of a scanning radar while selecting the placement of a radar or a network of radars within the model domain; thus, imposing a volume coverage pattern (VCP) scan strategy. In this case, the idealized, standard CR-SIM output at the model grid resolution can be further used as input into a radar instrument model that is written specifically for the postprocessing of the CR-SIM radar simulations. The radar instrument model accounts for radar distance to the target, elevation as provided by the VCP, pulse length, range resolution, antenna beamwidth, and receiver noise, and it outputs the radar observables in radar polar coordinates. For the calculation of elevation angles, the Earth's surface is assumed to be flat, which is an acceptable assumption for general radar observation ranges (<∼ 90 km for the vertical model grid spacing of > 0.5 km). Gaussian functions are used as the antenna-weighting the range-weighting functions to estimate the contribution of the model grid observables to the radar polar coordinate system observables. Depending on the azimuthal resolution and the antenna beamwidth, this instrument model also accounts for the radar sampling resolution. Radar reflectivity attenuated for hydrometeors can be computed using the integrated specific attenuation for hydrometeors along a radar beam path. The total hydrometeor attenuation (A tot ) at each grid box is then equal to twice the sum of the specific attenuation for all simulated hydrometeors (A h ) along a radar beam path from the location of the radar to a distance of the target at r in kilometers: where A tot is in decibel (dB) and A h in decibel per kilometer (dB km −1 ). Here, gaseous attenuation was not included. The observed reflectivity Z obs hh (logarithmic scale) is computed by subtracting A tot from Z hh on a logarithmic scale: Like Z hh , the attenuated differential reflectivity Z obs DR on a logarithmic scale is calculated as where A DP represents specific differential attenuation in decibel per kilometer (dB km −1 ). The minimum detectable reflectivity Z MIN (logarithmic) is applied with a constant C: where r is the radial distance in kilometers, and the constant C represents the minimum detectable signal at r = 1 km for the pulse length selected by the user. Figure 4 shows simulated range-height indicator (RHI) measurements at C and X bands (5.5 and 9 GHz, respectively) accounting for Z MIN , total hydrometeor attenuation, and the radar range-gate sampling volume for convective cells associated with an MCS observed on 20 May 2011 during MC3E. The input convective system simulation data are the same as Fig. 1. The instrument specifications used for the RHI simulations are for the X-band radar, a beamwidth of 1 • , range-gate spacing of 50 m, and a constant C of −50 dBZ Table 4. Computed radar variables. Units of the variables stored in the output files are provided in parenthesis.

Variable Description
Z hh Radar reflectivity factor at horizontal polarization (dBZ) Z vv Radar reflectivity factor at vertical polarization (dBZ) Z vh Cross-polarization radar reflectivity factor (dBZ) Z DR Differential reflectivity, defined as the ratio between the fraction of horizontally polarized backscattering and vertically polarized backscattering (dB) LDR h Linear depolarization ratio, defined as the ratio of the power backscattered at vertical polarization to the power backscattered at horizontal polarization for a horizontally polarized field (dB) K DP Specific differential phase, defined as the backward propagation phase difference between the horizontally and vertically polarized waves at a specific distance ( • km −1 ) δ Differential backscatter phase, defined as the difference between the phases of horizontally and vertical polarized components of the wave caused by backscattering from the objects in the radar resolution volume, computed based on Trömel et al. (2013) A h Specific attenuation at horizontal polarization, or, for horizontally polarized waves, represented by forward scattering amplitudes (dB km −1 ) A v Specific attenuation at vertical polarization, or, for vertically polarized waves, represented by forward scattering amplitudes (dB km −1 ) A DP Specific differential attenuation, defined as the difference between the specific attenuations for horizontally and vertically polarized waves (dB km −1 )  Variable Description β hydro , β aero , β mol Backscatter (sr −1 m −1 ) for cloud droplets and cloud ice (β hydro ), aerosols (β aero ), and air molecules (β mol ) β hydro_atten , β aero_atten , Attenuated backscatter (sr −1 m −1 ) for cloud droplets and cloud ice (β hydro_atten ), aerosols (β aero_atten ), β mol_atten and air molecules (β mol_atten ) α ext_hydro , α ext_aero Extinction coefficient (m −1 ) for cloud droplets and cloud ice (α ext_hydro ) and aerosols (α ext_aero ) β total Total backscatter (sr −1 m −1 ) (defined as β total = β hydro + β aero + β mol ) β total_atten Attenuated total backscatter (sr −1 m −1 ) (defined as β total_atten = β hydro_atten + β aero_atten + β mol_atten ) S Lidar ratio (defined as S = α ext_hydro /β hydro ) Figure 1. Radar observables produced by CR-SIM for a mesoscale convective system. The system was simulated using WRF with the Morrison 2-moment microphysics scheme at 1.8 km altitude. Shown are horizontal cross sections of (a) total hydrometeor content and (b) vertical air velocity from the WRF simulation. CR-SIM produces the following parameters for a scanning S-band (3 GHz) radar located at the center of the domain: (c) Z hh , (d) Z DR , (e) K DP , (f) radar antenna elevation angle, (g) Doppler velocity, and (h) spectrum width at 12:18:00 UTC.
for the Z MIN calculation. The C-band radar specifications are a beamwidth of 1 • , range-gate spacing of 120 m, and a constant C of −35 dBZ for the Z MIN calculation. These specifications follow the X-band scanning ARM precipitation radar (X-SAPR) and C-band scanning ARM precipitation radar (C-SAPR) configurations at the ARM Southern Great Plains (SGP) site during MC3E. The results are reasonable, showing strong attenuation in Z hh and Z DR by rain at X band and relatively less at C band. The simulated K DP at X band is approximately 1.6 times larger than that at C band because of the wavelength dependency.

Code features
CR-SIM is written in Fortran 95 standard including all GNU extensions and parallelized with OpenMP. The input to CR-SIM is the output from a CRM or LES in NetCDF format. The output of CR-SIM is in NetCDF format and includes simulated observables for each hydrometeor species, and one total for all the hydrometeors. These features allow users to understand the contributions of each hydrometeor species to the radar observables for a sophisticated evaluation of microphysics schemes. The code includes various microphysics schemes as shown in Table 3. The code structure supports different CRMs or LESs; flexible microphysics package ex-  tensions; and diverse assumptions such as particle shape, density, and PSD for different hydrometeor categories in the models as well as inclusion of scattering properties computed by different methods. The CR-SIM runtime depends on the computing power, number of threads used, simulation domain size, and the number of cloudy grid boxes. The simulations presented in this article were run on a computer having 500 GB memory and 24 processors each with 12 cores. For the MCS case in Fig. 1, having a simulation domain size of 667 × 667 × 12 (5.3 M grid points), the runtime is approximately 270 s using 16 threads.
The code has been released under GNU General Public License, and both the software and a detailed user guide are publicly available online (Tatarevic et al., 2019).

Sample applications of CR-SIM
In this section, several applications of CR-SIM are presented that highlight its capabilities. These applications are (i) a comparison of observed and modeled cloud fraction profiles (CFPs), (ii) a quantification of uncertainty in the estimate of domain-averaged CFP, (iii) an evaluation of a novel retrieval technique for the estimation of cloud fraction, (iv) an investigation of the impacts of limitations imposed by the nature of observations themselves on multi-Doppler wind retrieval techniques, and (v) an optimization of a new radar observation strategy for multi-Doppler wind retrievals. Figure 5 shows a flow diagram of our application processes. First, the forward simulator produces idealized observables at each model grid box (the "Output 1" box in Fig. 5). In the second step, an instrument model is used to account for the instrument features (as described in Sect. 2.3). Third, the output from the instrument model ("Output 2") is then used to retrieve the CFP (the retrieval model and "Output 3") for a direct comparison and, most importantly, for a quantification of the uncertainties in the cloud fraction estimates and evaluation of the new retrieval technique (applications i-iii). On the other hand, the output from the instrument model is also used as an input for multi-Doppler wind Figure 4. Examples of C-and X-band (5.5 and 9 GHz, respectively) RHI scans with a beamwidth of 1 • produced using CR-SIM for a convective cell in a mesoscale convective system (MCS). The simulation uses WRF with the Morrison 2-moment microphysics scheme for an MCS on 20 May 2011 during MC3E. Shown are variables at X-and C-band frequencies 15-35 km from the radar as a function of height at 12:18:00 UTC: (top raw) Z hh , (middle raw) Z DR , and (bottom raw) K DP . The figure shows (a) C-band variables without attenuation, (b) X-band variables with attenuation from a radar 24 km from the convective cell, (c) C-band variables with attenuation from a radar 24 km away, and (d) C-band variables from a radar located 59 km away with attenuation. retrieval model to investigate the uncertainty of the retrieval method and to optimize the new radar observation strategy (applications iv and v, with "Output 3"). The final step consists of a comparison of the retrieved quantities using a multi-Doppler wind retrieval against the input CRM or LES data and a quantitative estimation of uncertainties attributed to the observation limitations and the retrieval algorithms. In the following subsections, we briefly describe and summarize the findings of the studies using CR-SIM.

Comparison of observed and modeled cloud fraction profiles
Measurements of the CFP are important to quantify the impact of shallow cumulus clouds on the grid-scale meteorological state, because the fractional cloudiness of a grid box has an impact on the radiative transfer (e.g., Albrecht, 1981;Larson et al., 2001) and the vertical cumulus mass flux (e.g., de Roode and Bretheton, 2003;van Stratum et al., 2014). Zenith-profiling cloud radar and lidar measurements traditionally have been used to provide CFP estimates (e.g., Hogan et al., 2001;Kollias et al., 2009;Remillard et al., 2013;Angevine et al., 2018). Typically, the profiling radar and lidar observations are combined synergistically to provide a hydrometeor mask such as those described in ARSCL (Clothiaux et al., 2000) and the CloudNet target classification (Illingworth et al., 2007). This approach takes advantage of the radar and lidar capabilities and maximizes our ability to detect thin cloud layers. However, the performance of the combined radar and lidar algorithm degrades at heights where the lidar observations are unavailable due to complete signal attenuation. These attenuation effects are naturally not represented in model output and thus may lead to large disagreements between observations and models. Our focus is to use CR-SIM to generate a synthetic AR-SCL product that is directly comparable to the ARSCL generated using measurements from the Ka-band ARM Zenithpointing Radar (KAZR), ceilometer, and MPL. This analysis uses a shallow cumulus cloud field over ARM SGP site simulated by the LASSO project. The simulation is for the 27 June 2015 case and uses WRF-LES coupled with the Morrison 2moment microphysics scheme (Morrison et al., 2005). The horizontal and vertical resolutions are 100 and 20 m, respectively, and the horizontal dimension of the simulation domain is 14.4 km. First, the KAZR, ceilometer, and MPL measurements from the ARM SGP Central Facility are simulated using the CR-SIM forward simulator. The simulation output corresponds to the box "Output 2" in Fig. 5. Simulated KAZR reflectivity accounts for hydrometeor attenuation (Z obs hh ) and radar sensitivity (Z MIN ) as described by Eqs. (2, 3, and 5). The attenuated MPL hydrometeor backscatter (β hydro_atten ) is obtained by subtracting β aero_atten and β mol_atten from β total_atten , since the MPL total backscatter includes aerosol backscatter and molecular backscatter (see Table 5). The value obtained for β hydro_atten is below noise level if less than the unattenuated background scatter (β aero + β mol ), which is used in this simulation as the minimum detectable MPL backscatter value. The ceilometer-detected first cloud base is estimated at each grid column following O'Connor et al. (2004). Using the simulated observables, we estimate cloud locations as provided by ARSCL ("Output 3" in Fig. 5). A grid box where either KAZR Z obs hh or MPL β hydro_atten has a detectable value is indexed as a "cloudy" grid box, and grid boxes below the simulated ceilometer first-cloud base are indexed as "clear".
An example of an ARSCL simulation is shown in Fig. 6 that uses the LASSO LES data as an input. The WRF simulation shows cumulus clouds below 5 km and cirrus clouds covering the entire domain at 12-14 km. In Fig. 6b-d, the limitation of each instrument is represented in the forward simulations. The simulated KAZR measurements can detect cumulus cloud layers but cannot detect cirrus clouds, due to their low reflectivity value (lower than Z MIN ). Instead, the cirrus clouds can be detected by the simulated MPL measurements. However, the cirrus clouds may be missed by both radar and lidar measurements when cumulus clouds are present, because the MPL signal becomes fully attenuated by the low-level clouds. Figure 6f and g show the domainaveraged CFPs from the LES hydrometeor mixing ratio and from the simulated ARSCL, which assumes the ARM instruments are located at every grid column (as shown in Fig. 6e). Comparison between the two CFP plots suggests that the ARSCL for this LASSO case underestimates cirrus CFPs by 20 %, likely due to lidar beam attenuation by lower-level cumulus clouds that have a horizontal fraction of 20 %.

Uncertainty quantification of domain-averaged cloud fraction profile estimates
The ARSCL product is usually integrated for 1-3 h to provide an average CFP estimate for that time period. These CFP estimates are often compared with the model domainaveraged CFPs. However, the spatially heterogeneous distribution of the shallow cumulus clouds (Wood and Field, 2011) raises questions regarding the ability of short-term (1-3 h) zenith-profiling observations to provide adequate sampling of the cloud field. Uncertainties in the profiling measurement of cloud fractions are introduced by the limited sampling of a highly heterogeneous cloud field. We investigate these uncertainties as a function of the number of profiling sites and integration time using the CR-SIM virtual observations and the WRF LES simulation presented in the previous application. The WRF LES output is saved every 10 min, and CR-SIM is run for each output file. In this analysis, we assume that no cloud evolution occurs within a 10 min interval. Figure 7a and b show the domain-averaged CFP from the simulated ARSCL and directly from the WRF LES using a cloud water content threshold of 0.01 g kg −1 . The colors indicate different integration periods. Note that the WRF dataset in this analysis is for a shallow convective case at SGP on 11 June 2016, different from the one used in Fig. 6, which has higher cumulus cloud tops. The simulated ARSCL CFP is in good agreement with the WRF CFP for each integrated period (compare Fig. 7a and b), indicating that uncertainties attributed to observation limitations (e.g., sensitivity and attenuation) are small. Thus, the limited spatial and/or temporal sampling is the major error source to consider when comparing the profiling measurement derived CFP with the domain-averaged WRF CFP.
To emulate vertical profiling measurements, we sampled data as follows. First, observation sites are randomly selected within the horizontal domain. Second, for each snapshot of the simulation, clouds over the observation sites are sampled as if the clouds are frozen in time and advected by the mean environmental wind. Thus, the columns are sampled along the direction of the horizontal wind over the advected distance (i.e., horizontal wind speed ×10 min), where the en-  vironmental horizontal wind at each snapshot is the mean horizontal wind across the simulation domain within the cumulus cloud layer (i.e., the layer between the mean cumulus cloud base and the maximum cumulus cloud top). Last, the CFP is estimated by varying both the number of observation sites and the integration period. Figure 7c-h show the comparisons of the WRF CFPs and the simulated ARSCL CFPs for different number of observation sites (top row, c-e) and integration periods (bottom row, f-h). The center of the integration period is 21:00 UTC. Blue lines in each panel represent the simulated ARSCL CFPs integrated over time from each selected observation site for the period indicated, the red line represents the mean AR-SCL CFP averaged over the sites, and the black line represents the domain-averaged WRF CFP integrated over the indicated period. Each panel shows that the CFPs at a single site (blue lines) have large uncertainties even though they are integrated over long periods, ranging from 30 to 180 min. Those uncertainties are reduced when averaging the CFP profiles across the different sites; consequently, the mean CFP (red line) becomes closer to the domain-averaged WRF CFP (black lines). However, it also becomes evident that a small number of observation sites (Fig. 7c) may not be adequate to estimate the true CFP. Figure 7i and j show the root mean square error (RMSE) and mean absolute percentage error (MAPE) of the simulated ARSCL CFPs as a function of the number of observation sites and the integration time. Both plots show that the uncertainty can be reduced by increasing the number of observation sites and the length of the integration period. The RMSE dramatically decreases to 0.005-0.01 (30 %-50 % in MAPE) when we use four observation sites and 120 min integration. The rate of improvement of CFP, by further increasing the number of sites and integration period, is smaller; the error values slowly decrease until the RMSE and MAPE plateau at 0.002 % and 15 %, respectively. However, establishing more than 10 observational sites in such a small domain is probably impractical. At the SGP site, five Doppler lidar profiling measurements have already been operating over a 90 km × 90 km domain. These measurements can be effectively used to estimate cloud fraction without much uncertainty when clouds are homogeneously distributed over the domain.

Evaluation of a new CFP estimation technique using scanning cloud radar
Forward-radar simulators can be used to evaluate retrieval techniques. We introduce an application for estimating CFP using scanning cloud radar (SCR) measurements presented in Oue et al. (2016). As analyzed in the previous section, profiling radar measurements may produce large uncertainties in CFP estimates. On the other hand, SCRs conduct observations over a much larger domain than zenith-profiling cloud radars such as KAZR (e.g., Lamer et al., 2013;Ewald et al., 2015). Although the SCRs are widely and routinely used to observe 3D cloud fields, the application of SCRs to study shallow cumuli is not straightforward. One of the most significant limitations of the SCR observations is related to the radar sensitivity. Since shallow cumuli over land typically have low reflectivities, the strong reduction in SCR sensitivity with range creates the illusion of the atmosphere being cloudier closer to the radar location (e.g., Lamer and Kollias, 2015). This limitation can introduce uncertainties in the cloud fraction estimates. Oue et al. (2016) addressed uncertainties of radar-estimated CFPs due to the nature of the profiling and scanning radar techniques using CR-SIMgenerated observations. Figure 8a shows horizontal cross sections of WRFsimulated water content for a shallow convection case (9 June 2015; Oue et al., 2016) from LASSO. Figure 8b shows the CR-SIM simulation of the Ka band (35 GHz) Z hh accounting for the minimum detectable reflectivity, Z MIN , of the cross-wind RHI (CWRHI; Kollias et al., 2014) scans from Eq. (5) using C = −50 dBZ. In the CR-SIM analysis, the radar was located along the vertical line in Fig. 8b, and CWRHI scans were performed along the east-west direction, while the clouds were assumed to move along the northsouth direction. These figures suggest that the Z hh from the CWRHI scans cannot capture the clouds with lower water contents that are located far from the radar. This can affect cloud fraction estimates. Because the "true" cloud fraction is estimated from the original model cloud field and thus is known, the CR-SIM runs in different configurations can be used to establish the best method to estimate the cloud fraction while accounting for limitations inherent to the nature of radar measurements. Oue et al. (2016) use the cumulative distribution function (CDF) of the observed Z hh to define the size of the horizontal domain at each height needed to obtain the best estimate of the domain-averaged CFP. The horizontal domain size as a function of height corresponds to a distance from the radar where Z MIN is equal to a CDF value of 10 %. Figure 8c shows CFPs using a CDF of 10 % when changing the integration time of the CWRHI, and Fig. 8d shows the RMSE of the estimated CFPs as a function of integration time, adapted from Oue et al. (2016). The figure suggests that the 35 min or longer of CWRHI measurements provide the realistic domain-averaged CFP.

Investigation of impacts of observation limitations on multi-Doppler radar wind retrievals
Estimation of vertical air motion is essential to understand the dynamics and microphysics of deep convective clouds (e.g., Jorgensen and LeMone, 1989;Wang et al., 2019), evaluate CRM and LES results (e.g., Varble et al., 2014;Fan et al., 2017), and improve convective parameterization in GCMs (e.g., Donner et al., 2001). Multi-Doppler radar techniques have been applied to understand the dynamics and microphysics of the deep convective clouds in different cli- mate regimes (e.g., Friedrich and Hagen, 2004;Collis et al., 2013;Oue et al., 2014). However, the multi-Doppler radar retrievals are not straightforward with potential uncertainties from multiple aspects (e.g., Clark et al., 1980;Bousquet et al., 2008;Potvin et al., 2012). CR-SIM can be used to investigate the impacts of different error sources on the retrieved wind fields. Oue et al. (2019a) investigated the impacts of the radar VCP for the plan position indicator (PPI) and the observation period on uncertainties in multi-Doppler radar wind retrievals using CR-SIM. They also investigated how the uncertainties attributed to the VCP period can be reduced using the advection-correction technique proposed by Shapiro et al. (2010). The advection correction scheme allows for trajectories of multiple individual clouds, performs smooth grid-boxby-grid-box corrections of cloud locations, and takes into account changes in cloud shape with time by using PPI scans at two times. We summarize their findings, particularly regarding the impacts of radar VCP and period on multi-Doppler radar retrievals. Figure 9 shows a diagram of the analysis process. The input model data are a WRF simulation using the Morrison 2moment microphysics scheme for an MCS observed on 20 May 2011, during the MC3E field campaign at the ARM SGP site. The horizontal resolution is 500 m; the vertical resolution varies from approximately 30 m near the surface to 260 m at 2 km -above which the resolution remains approximately constant -and the simulation output is saved every 20 s. Measurements from the three X-band scanning ARM precipitation radars (X-SAPR) at the SGP site are simulated using CR-SIM. The CR-SIM-simulated radar reflectivity and Doppler velocity at the model grid are converted into the radar polar coordinates with two different VCPs for each radar: (1) 21 elevation angles ranging from 0.5 to 45 • (VCP1, same as the X-SAPR scan strategy during MC3E) and (2) 60 elevation angles ranging from 0.5 to 59.5 • with a 1 • increment (VCP2). For the both VCPs, the beamwidth is 1 • , the range-gate spacing is 50 m, and the maximum range is 40 km. The simulated radar reflectivity and Doppler velocity in polar coordinates were used as an input to the 3D-Var Figure 9. A diagram of an Observing System Simulation Experiment study to investigate the impacts of radar volume coverage pattern (VCP) on a multi-Doppler radar wind retrieval. multi-Doppler radar wind retrieval algorithm developed by  to estimate the 3D wind field for a domain of 50 km × 50 km × 10 km with horizontal and vertical grid spacings of 0.25 km.
The convective mass flux (MF) is estimated at each height as where UF is updraft fraction over the horizontal slice of the domain, w is mean vertical velocity over the updraft area, and ρ d is dry air density averaged over the domain. Figure 10 shows comparisons of convective mass flux profiles between simulated multi-Doppler radar retrievals and WRF output for two minimum updraft thresholds of 5 m s −1 (MF 5 ) and 10 m s −1 (MF 10 ). First, we applied the wind retrieval technique to a snapshot of the forward-model output to bypass the instrument model and examine the uncertainty in the retrieval model (3FullGrid). Figure 10a shows MF profiles from the 3FullGrid simulation (red line) and from the WRF snapshot (black line), with 2 min average (dark gray line) and 5 min average (light gray line). The 3FullGrid MF profile is in good agreement with the WRF output, indicating that the uncertainty in the retrieval model is small; although, it does underestimate the maximum MF for the updraft threshold of 5 m s −1 by 0.05 kg m −2 s −1 (10 % of the true MF) at 5.3 km. Figure 10b and c show MF profiles (MF 5 and MF 10 ) obtained from simulated retrievals while considering the effects of VCP (VCP1 and VCP2) and averaging period ("snap" (instantaneous), 2 min, and 5 min averages). For both VCP1 (Fig. 10b) and VCP2 (Fig. 10c), the snapshot and 2 min VCP simulations have similar MF estimates for both sets of MF 5 and MF 10 curves, indicating that a 2 min average is sufficient to capture features available from an instantaneous scan. However, the accuracy of these estimates varies with MF profile and VCP. The MF 10 estimates for both VCPs systematically underestimate the maximum values occurring between 4.5 and 6.5 km by about 0.5 kg m −2 s −1 (20 %). The performance of the MF 5 estimates for VCP snap and 2 min have strong variations with height. For VCP1 (the less dense scan pattern), MF 5 follows the WRF snapshot below 4.5 km with close agreement between 3 and 4.5 km; however, MF is underestimated around its maximum MF by about 0.075 kg m −2 s −1 (15 %) and is overestimated below 3 km and above 7 km. The denser scan pattern for VCP2 provides a dramatic improvement around the maximum and above 6 km, but it still shows overestimations below 3 km and above 7 km. Uncertainties are often increased for the VCP simulations when the averaging period is extended to 5 min. For the 5 min VCPs, MF 10 estimates for both VCP1 and VCP2 around the maximum are further underestimated, while the MF 5 estimate for VCP2 is further overestimated above 6 km. Other estimates below this height for VCP2 and for all heights for VCP1 are mostly unchanged. These results suggest that the VCP elevation strategy and sampling time extended to 5 min have a significant impact on the updraft properties retrieved at higher altitudes. This is due to the density of data sampled by the VCPs, where greater density particularly improves MF 5 around its maximum, and the deformation of cloud structures within longer sampling periods (exceeding 2 min) that causes uncertainties in the mass continuity assumption.
The rapid volume scan of less than 5 min required in the retrieval of the high-quality vertical velocities is challenging for conventional scanning radars. Most of the improvements required in the sampling strategy of the observing system (higher maximum elevation angle, higher density elevation angles, and rapid VCP time period) can be accomplished using rapid scan radar systems such as the Doppler on Wheels mobile radars (DOW; Wurman, 2001) or phased array radar systems (e.g., Kollias et al., 2018).

Evaluation of new radar observation strategies
CR-SIM can also be used to examine the performances of new remote-sensing systems and help to choose the most appropriate observation strategy for a new field campaign. Figure 4c and d show the performance of C-band (5.5 GHz) RHI measurements when the radar is located at 24 and 59 km away from the target convective clouds. As expected, the RHI from the greater distance provides the radar observables at lower resolution and includes more attenuation when precipitation clouds are located between the target and the radar. Oue et al. (2019a) investigate the impact of radar data sampling on the multi-Doppler radar wind retrievals for the MCS by an OSSE using CR-SIM. The addition of data from a Doppler radar to form a triple-Doppler radar retrieval, shown in Sect. 3.4, cannot significantly improve the updraft retrievals if the added radar VCP has inferior spatial resolution. Oue et al. (2019a) also show that the updraft retrievals in a limited area around the center of the domain, where data density from the three radars is higher than other areas, produced better results than those in the entire domain. The insights obtained from these OSSEs are beneficial for decisionmaking regarding radar observation strategies for a field campaign, such as the number of radars required and their locations. For example, Kollias et al. (2018) used CR-SIM to examine how phased array radars improve multi-Doppler radar wind retrievals compared to scanning radars for MCSs.

Summary
We present a recently developed comprehensive forward simulator for radar and lidar, CR-SIM, which is suitable for simulating complex, ground-based observational configurations and their synthetic products. CR-SIM can simulate multiwavelength, zenith-pointing and scanning radar observables (radar reflectivity, Doppler velocity, polarimetric fields, radar Doppler spectrum), lidar observables, and multisensor integrated products. The primary idea behind the simulator is to directly compare remote-sensing observations with simulated measurements based on the CRM or LES output, maintaining consistency with the microphysics scheme used in the model. CR-SIM incorporates microphysical and scattering properties independently, so that uncertainties related to microphysical assumptions are separated from uncertainties related to scattering model. This configuration allows CR-SIM to be easily expanded, either by adding microphysical schemes or new scattering classes.
One feature of CR-SIM is that it produces both radar and lidar observables for all the CRM grid boxes while accounting for elevation angles relative to a radar location, similar to Snyder et al. (2017a, b). Other radar simulators also account for radar geometry characteristics such as beamwidth and radar range resolution to simulate scatterers within the radar resolution volume (e.g., Capsoni et al., 2001;Caumont et al., 2006;Cheong et al., 2008). Instead, here, the radar sampling characteristics (e.g., antenna beamwidth, range-bin spacing, total attenuation, sensitivity) are accounted for in our postprocessing instrument model. This feature facilitates configuring any desirable observational setup with a varying number of profiling or scanning sensors from a single model simulation. The CR-SIM multisensor simulations include multiwavelength radars and lidars that allow for simulation of sophisticated virtual products such as ARSCL and 3D-Var multi-Doppler-based wind retrievals. The CR-SIM applications shown in this paper emphasize the value of applying it to high-resolution model output to emulate the sampling by the ground-based observatories. CR-SIM's coupling of CRM microphysical parameterizations with scattering models facilitates improved evaluation of model performance by enabling robust comparisons between model-simulated clouds and observables from radar and lidar while accounting for instrument characteristics and observation limitations.
CR-SIM is easily expanded to include additional microphysical schemes, new scattering classes, scattering calculations, and other applications to simulate multisensor products (e.g., multi-Doppler wind retrievals, ARSCL). At this stage, all ice hydrometeors (e.g., snow, ice, graupel, hail) are modeled as dielectrically dry spheroids. The LUTs of scattering properties incorporated into the current CR-SIM were created using the T-matrix method based on assumptions regarding ice particle composition and shape. More single-scattering properties from other scattering calculation methods can be incorporated by adding LUTs. Moreover, the gaseous attenuation will be considered in the future, as gaseous attenuation can be significant in the millimeterwavelength radar measurements, and elevation angles will be corrected for the Earth's curvature. The analyses presented here serve as a reference to the CR-SIM package and illustrate numerous applications related to sampling uncertainty, sampling optimization, retrieval uncertainty, and comparison between models and observations. . ] represent the real and imaginary parts of the complex number, respectively; and |. . .| refers to the magnitude of the value between the single bars. λ is the radar wavelength in millimeters, and |K w | 2 is the dielectric factor (the value for water = 0.92 is used for all hydrometeor species in CR-SIM). The scattering amplitudes are given in millimeters. N i (D) defines the particle size distribution in terms of the number of particles per unit volume of air and unit bin size (in m −3 mm −1 ), with the bin equivolume diameter D (in mm). Both the bin fall velocity, V F i , and vertical air velocity, w, are given in meters per second. The elevation and azimuth angles are denoted θ and ϕ, respectively, and u and v are the two components of horizontal wind. The coefficients A 1i -A 7i are the angular moments for one of the three horizontal orientation expressions taken from Ryzhkov et al. (2011).
A 4i V f i (D) N i (D) dD mm 6 m 3 m s ,  (Bohren and Huffman, 1998), the single particle extinction σ α and backscattering cross sections σ β are computed for a lidar wavelength (i.e., 905, 532, and 353 nm). The value of refractive index of water used in the calculations is 1.327 + i0.672 × 10 6 (Hale and Querry, 1973). The attenuated backscatter, β atten , at a distance z can be written as where β true is the true backscatter at height z, and α ext is the extinction coefficient:  Author contributions. MO and PK designed the OSSEs and MO carried them out. AT developed the radar simulator code, and MO, DW, and KY contributed to evolve, improve, and optimize the code. MO prepared the article with contributions from all co-authors.