Using Radar Observations to Evaluate 3D Radar Echo Structure Simulated by a Global Model

The Energy Exascale Earth System Model (E3SM) developed by the Department of Energy has a goal of addressing challenges in understanding the global water cycle. Success depends on correct simulation of cloud and precipitation elements. However, lack of appropriate evaluation metrics has hindered the accurate representation of these elements in general circulation models. We derive metrics from the three-dimensional data of the ground-based Next Generation Radar (NEXRAD) network over the U.S. to evaluate both horizontal and vertical structures of precipitation elements. We coarsened the resolution of the radar observations to be consistent with the model resolution and improved the coupling of the Cloud Feedback Model Intercomparison Project Observation Simulator Package (COSP) and E3SM Atmospheric Model Version 1 (EAMv1) to obtain the best possible model output for comparison with the observations. Three warm seasons (2014-2016) of EAMv1 simulations of 3D radar reflectivity features at an hourly scale are evaluated. A general agreement in domain-mean radar reflectivity intensity is found between EAMv1 and NEXRAD below 4 km altitude; however, the model underestimates reflectivity over the central United States, which suggests that the model does not capture the mesoscale convective systems that produce much of the precipitation in that region. The shape of the model-estimated histogram of subgrid-scale reflectivity is improved by correcting the microphysical assumptions in COSP. The model severely underestimates radar reflectivity at upper levels (the simulated echo top height is about 4 km lower than in observations), and this result is not changed by tuning any single physics parameter.


Introduction
Clouds and precipitation play a major role in Earth's budgets of energy, water, and momentum. However, the correct simulation of 3D structures of clouds and precipitation has been challenging in general circulation models (GCMs) (Trenberth et al., 2007; Randall et al., 2007; Eden and Widmann, 2012), partially because model grid spacings generally do not adequately resolve the cloud-structure details important to these budgets. In addition, the lack of appropriate evaluation metrics hinders the evaluation of GCMs. Over the continental U.S. (CONUS), the detailed 3D radar reflectivity field (indicating the 3D distribution of precipitation particles) is observed by the ground-based Next Generation Radar (NEXRAD) network of S-band weather radars (Zhang et al., 2011). In this study, we use the mosaic of NEXRAD observations called Gridded Radar Data (GridRad) developed by Homeyer and Bowman (2017), which has a horizontal resolution of 4 km, a vertical resolution of 1 km (24 levels), and an update cycle of 1 hour. These resolutions are coarser than those of the native data, but in order to compare these data appropriately with the output of the global model used here, we further coarsen the horizontal resolution, as described in Section 2.
The Energy Exascale Earth System Model (E3SM) is an ongoing effort of the Department of Energy (DOE) to advance the next generation of climate modeling (Bader et al., 2014). The E3SM Atmospheric Model Version 1 (EAMv1) is a descendant of the National Center for Atmospheric Research (NCAR) Community Atmosphere Model version 5.3 (CAM5.3; Neale et al., 2012). However, it has evolved substantially in coding, performance, resolution, physical processes, and testing and development procedures (Rasch et al., 2019). Previous model evaluation has focused on the long-term climatological properties of certain cloud fields, surface precipitation, and water conservation on the global scale (e.g., Qian et al., 2018; Xie et al., 2018; Zhang et al., 2018; Lin et al., 2019). Evaluations of the vertical structures of cloud and precipitation elements have used vertically pointing radar observations obtained during field campaigns (Zhang et al., 2019). However, these tests lacked evaluation of fully 3D cloud and precipitation structure over large regions of the globe and over long time periods.

For this study, we have built data processing techniques to evaluate the EAMv1 simulation of the 3D radar reflectivity field at its default setting of 1° grid spacing and 72 vertical layers at an hourly time scale. Our goal is to provide a comprehensive evaluation of both the horizontal pattern and the vertical structure of cloud and precipitation. We use radar observations obtained from NEXRAD over the CONUS for three years (2014-2016). In order to directly compare the model results with NEXRAD, we have implemented the Cloud Feedback Model Intercomparison Project (CFMIP) Observation Simulator Package (COSP) (Bodas-Salcedo et al., 2011) in EAMv1 and improved its coupling to the model. We restrict the evaluation to the warm season. Over the CONUS, the warm season is dominated by convective processes, which are very different from the more widespread frontal cloud systems of cold-season precipitation. As discussed by Iguchi et al. (2018), precipitating ice particles have large variation in habits and scattering properties, and the effects of non-Rayleigh scattering and multiple scattering by large precipitating ice particles could introduce large uncertainty into simulating the cold-season radar reflectivity field. To avoid this uncertainty, we examine only the warm season of the three years from 2014 to 2016.
Section 2 details how we account for differences between the modeled and observed datasets, specifically (1) the horizontal and vertical resolutions of EAMv1 (1°, 72 vertical levels) and NEXRAD (4 km horizontally, 1 km vertically) and (2) the minimum detectable limits of the model and NEXRAD. The remainder of this paper is organized as follows: Section 2 describes the model, the GridRad dataset, the COSP simulator, and the step-by-step methodology of data processing. Section 3 presents the model evaluation results and tests of the sensitivity to physics parameters. Section 4 provides synthesis and conclusions.

EAMv1 Description and Configuration
EAMv1's dynamical core and physics parameterizations are described in detail by Rasch et al. (2019). The continuous Galerkin spectral finite element method solves the primitive equations on a cubed-sphere grid (Dennis et al., 2012; Taylor & Fournier, 2010). Tracer transport on the cubed sphere is handled using a variant of the semi-Lagrangian vertical coordinate system of Lin (2004). The method locally conserves air mass, trace constituent mass, and moist total energy (Taylor, 2011).
Turbulence, shallow cumulus clouds, and cloud macrophysics are parameterized with the Cloud Layers Unified By Binormals (CLUBB) parameterization (Golaz et al., 2002; Larson, 2017). Deep convection is based upon the formulation originally described in Zhang and McFarlane (1995, hereafter ZM), with modifications by Neale et al. (2008) and Rasch (2008). Stratiform clouds are represented with the "Morrison and Gettelman version 2" (MG2) two-moment bulk microphysics parameterization (Gettelman and Morrison, 2015). Aerosol microphysics and interactions with stratiform clouds are treated with an updated and improved version of the four-mode Modal Aerosol Module (MAM4; Liu et al., 2016; Wang et al., 2020).
The EAMv1 used in this study has 30 spectral elements (ne30), which corresponds to approximately 1° horizontal grid spacing, and the total number of grid columns is 48,602. Vertically, there are 72 layers using the pressure-based terrain-following coordinate. The simulation is run for the time period from 1 January 2014 to 1 October 2016. We use a dynamics time step of 30 min and a cloud microphysics time step of 5 min. The large-scale circulation in the simulation is constrained using the nudging technique (Zhang et al., 2014; Ma et al., 2015; Lin et al., 2016), so that the model simulations are constrained by realistic large-scale forcing. Specifically, horizontal winds (U, V components) are nudged towards the ERA-Interim reanalysis data (Dee et al., 2011) with a relaxation time scale of 6 hours. Nudging is applied to all grid boxes at each time step, with the nudging tendency calculated using the model state and the linearly interpolated ERA-Interim data (Sun et al., 2019).
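The nudging described above can be illustrated with a minimal sketch. It assumes the common Newtonian relaxation form, in which the wind tendency is proportional to the difference between the model state and the time-interpolated reanalysis; the exact formulation in EAMv1 is not given here, and the function names and the 6-hourly reanalysis spacing used in the example are hypothetical choices for illustration.

```python
# Illustrative sketch only: Newtonian relaxation is an assumed form of the
# nudging tendency; function names are hypothetical, not EAMv1 code.

TAU_S = 6 * 3600.0  # relaxation time scale of 6 hours, in seconds


def interp_reference(t, t0, ref0, t1, ref1):
    """Linearly interpolate a reanalysis value between times t0 and t1 (seconds)."""
    w = (t - t0) / (t1 - t0)
    return (1.0 - w) * ref0 + w * ref1


def nudging_tendency(u_model, u_ref, tau_s=TAU_S):
    """Tendency (m s^-2) that relaxes the model wind toward the reference wind."""
    return -(u_model - u_ref) / tau_s


# Example: reanalysis wind of 10 m/s at 0 h and 14 m/s at 6 h (assumed spacing),
# model wind of 12 m/s at 1.5 h into the interval.
u_ref = interp_reference(1.5 * 3600.0, 0.0, 10.0, 6 * 3600.0, 14.0)
tendency = nudging_tendency(12.0, u_ref)
```

With these illustrative numbers the interpolated reference wind is 11 m/s, so the tendency is negative and pulls the model wind toward the reanalysis.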
To facilitate the comparison with observations, model outputs are regridded to the geographic coordinate system with a horizontal grid spacing of 100 km, and the vertical coordinate is converted to the above mean surface level height in meters. By default, all the regridding processes in this study are based on the Earth System Modeling Framework (ESMF) Python Regridding Interface (https://www.earthsystemcog.org/projects/esmpy/) using bilinear interpolation.

COSP Radar Simulator
Products retrieved from space-borne satellites and ground-based radars, such as cloud water content and effective particle size (e.g., Randel et al., 1996; Wang et al., 2015; Tian et al., 2016; Um et al., 2018), are often treated as the ground truth for model evaluation (e.g., Fan et al., 2017; Han et al., 2019). However, the retrieved products often have large uncertainty (Stephens and Kummerow, 2007). To allow the comparison of model results with direct measurements from 3D scanning radars (ground based or satellite borne), the CFMIP Observation Simulator Package (COSP) was developed to convert GCM output into pseudo-observations using forward calculation (Bodas-Salcedo et al., 2011; Swales et al., 2018; Zhang et al., 2010).
The COSP consists of three steps, as detailed in Zhang et al. (2010). The first step is to generate subgrid-scale distributions of cloud and precipitation, which is done by using the Subgrid Cloud Overlap Profile Sampler (SCOPS; Klein and Jakob, 1999; Webb et al., 2001) and SCOPS for precipitation (SCOPS_PREC), respectively. Each GCM grid box is divided into 50 subcolumns in this study. A detailed description of SCOPS and SCOPS_PREC can be found in Zhang et al. (2010). Then, the radar signals are calculated by the QuickBeam code (Haynes and Stephens, 2007) using the column distribution of cloud and precipitation. Finally, the grid-box mean radar reflectivity is calculated by linear averaging (i.e., the reflectivity values [in dBZ] are converted to Z values [mm⁶ m⁻³] to calculate the mean Z, and the mean Z is then converted back to dBZ). In addition to this averaging, all other processing of radar reflectivity data from the model and NEXRAD in this study uses the linearized Z values, including horizontal averaging, vertical interpolation, and the calculation and comparison of mean values.
The COSP version 1.4 used in this study has no scientific difference from version 2.0 (Song et al., 2018; Swales et al., 2018). The most important change we made was to modify the microphysics assumptions used for the radar reflectivity calculation regarding hydrometeor density, size distribution, etc., making those assumptions consistent with those used in the MG2 cloud microphysics scheme that is used in E3SM. Those changes are documented in detail in Table 1. We use a horizontally homogeneous cloud condensate distribution within the model grid element and a maximum-random overlap scheme for cloud occurrence (Hillman et al., 2018).
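The linear averaging in the final step can be sketched as follows. This is a minimal illustration using the standard relation dBZ = 10 log10(Z), not the COSP code itself; the function names are hypothetical, and skipping missing (NaN) subcolumns is an assumption made for the example.

```python
import numpy as np


def dbz_to_z(dbz):
    """Convert reflectivity from dBZ to linear Z (mm^6 m^-3)."""
    return 10.0 ** (np.asarray(dbz, dtype=float) / 10.0)


def z_to_dbz(z):
    """Convert linear Z (mm^6 m^-3) back to dBZ."""
    return 10.0 * np.log10(z)


def grid_mean_dbz(subcolumn_dbz):
    """Grid-box mean reflectivity: average in linear Z, then convert to dBZ.

    Assumption for this sketch: NaN subcolumns are ignored in the mean.
    """
    return float(z_to_dbz(np.nanmean(dbz_to_z(subcolumn_dbz))))


# Three illustrative subcolumn values; the linear mean is dominated by the
# strongest echo, so it exceeds the arithmetic mean of the dBZ values (20 dBZ).
subcols = np.array([10.0, 20.0, 30.0])
mean_dbz = grid_mean_dbz(subcols)
```

Averaging in Z rather than in dBZ matters because the logarithmic scale would otherwise underweight the strongest echoes, which carry most of the returned power.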

NEXRAD Observations
The NEXRAD network consists of 159 S-band (3 GHz) Doppler radars, which form a dense observational network nearly covering the CONUS. We use the GridRad mosaic product of Homeyer and Bowman (2017), which combines all NEXRAD radar data covering the region 155°W-69°W, 25°N-49°N. To compare the GridRad data to the E3SM model fields, the radar frequency in the COSP was set to 13.6 GHz, consistent with the Global Precipitation Measurement (GPM) Ku-band radar, since we originally aimed at evaluating the E3SM simulation with GPM data. However, due to the high detectable threshold of 13 dBZ, the low sampling frequency (4-7 overpasses over the CONUS per day), and the narrow swath width (245 km) of each overpass, GPM data within the three-year period (2014-2016) have a significant under-sampling issue. That is, the GPM sample sizes over 1° model grid boxes are generally too small to robustly represent the grid-element mean value. Therefore, we decided not to use GPM data in this study. As GPM operates over the whole Earth and is anticipated to run for a long time period, it will likely be a very useful dataset for evaluating coarse-resolution global models in the future.
Although the GPM radar frequency is higher than that of NEXRAD (13.6 GHz vs. 3 GHz), our previous study quantitatively evaluated the coincident observations from NEXRAD and GPM over the CONUS and found that the 3D radar reflectivity fields obtained from the two independent platforms are highly consistent with each other after proper smoothing of GPM data in the vertical to mimic the temporal averaging used in the GridRad processing of NEXRAD data (Wang et al., 2019b). Therefore, the 13.6 GHz setting in COSP is suitable for evaluation against NEXRAD. In this study, although biases caused by the temporal mismatch are minimal at the horizontal resolution of 1° (~100 km), we nevertheless perform Gaussian smoothing of the GridRad data to match the model time step (30 min) in the comparison.

Mapping the Radar Observations to the Model Grid
As shown in previous studies (e.g., Wang et al., 2015, 2016, 2018; Feng et al., 2012, 2019), the minimum detectable reflectivity of NEXRAD is 0 dBZ (Fig. 1a). However, the model grid-mean reflectivity can be as low as -100 dBZ. Because our focus is on significantly precipitating clouds, the minimum threshold of reflectivity at the 1° grid scale is set to 8 dBZ (corresponding to a rain rate ≥ 0.1 mm hr⁻¹). Thus, after coarsening the 4-km GridRad data to a 1° model grid element, only the grid elements with a mean value larger than 8 dBZ are taken into account in both observations (Fig. 1b) and simulation (Fig. 1c). In the vertical direction, the EAMv1-simulated radar reflectivity field (72 vertical levels, hybrid coordinate) is interpolated to the levels of GridRad (vertical resolution of 1 km). The simulation data are saved hourly, consistent with the hourly GridRad data.
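The coarsening and thresholding step can be sketched for a single level as below. This is a simplified illustration rather than the processing code used in the study: it assumes the 4-km pixels tile each 1° box as a regular 25 x 25 block (625 pixels per box), treats echo-free (NaN) pixels as zero linear reflectivity, and uses hypothetical function and variable names. The actual regridding is done with ESMF, as stated earlier.

```python
import numpy as np

MIN_DBZ = 8.0  # grid-scale threshold used in the text
BLOCK = 25     # assumed: 25 x 25 = 625 fine pixels per 1-degree box


def coarsen_dbz(dbz_fine, block=BLOCK, min_dbz=MIN_DBZ):
    """Block-average a 2D dBZ field in linear Z and mask boxes at or below min_dbz."""
    dbz_fine = np.asarray(dbz_fine, dtype=float)
    ny, nx = dbz_fine.shape
    if ny % block or nx % block:
        raise ValueError("array shape must be a multiple of the block size")
    # Assumption: pixels without echo (NaN) contribute Z = 0 to the box mean.
    z = np.where(np.isnan(dbz_fine), 0.0, 10.0 ** (dbz_fine / 10.0))
    z_mean = z.reshape(ny // block, block, nx // block, block).mean(axis=(1, 3))
    with np.errstate(divide="ignore"):
        dbz_mean = 10.0 * np.log10(z_mean)  # -inf where the box has no echo
    return np.where(dbz_mean > min_dbz, dbz_mean, np.nan)


# Synthetic 2 x 2 boxes: one box filled with 30 dBZ, one box with a single
# 20-dBZ pixel (its box mean falls below 8 dBZ), and two echo-free boxes.
fine = np.full((50, 50), np.nan)
fine[:25, :25] = 30.0
fine[25, 25] = 20.0
coarse = coarsen_dbz(fine)
```

The second box shows why the threshold is applied after coarsening: an isolated weak echo is detectable at 4 km but contributes a grid-mean value far below 8 dBZ.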

Results
After the horizontal averaging, vertical interpolation, and truncation at the identified minimum threshold of 8 dBZ, the 3D radar reflectivity fields obtained from GridRad and the model simulation become comparable. The EAMv1 simulation is evaluated from the perspectives of horizontal pattern, vertical distribution, and subgrid distribution.

Comparison of Horizontal Patterns
The plan views of temporal mean reflectivity through the entire study period are compared between GridRad (Figs. 2a, 2d, 2g, and 2j) and EAMv1 (Figs. 2b, 2e, 2h, and 2k) at the vertical levels of 2, 4, 8, and 11 km. At 2-km altitude, EAMv1 estimates higher reflectivity than the NEXRAD observations (Figs. 2a-b) except over the central United States. The overall mean value is 28.7 dBZ for EAMv1 and 25.1 dBZ for NEXRAD. The negative model bias lies in the region between the Rocky Mountains and the Mississippi basin (Fig. 2c), where Mesoscale Convective Systems (MCSs) contribute much of the precipitation. Those MCSs propagate eastward from their initiation over or just east of the Rocky Mountains, go through upscale growth, and finally dissipate in the eastern part of the Mississippi Basin (Yang et al., 2017; Feng et al., 2018, 2019).  (Fig. 2i), with a domain mean of 15.0 dBZ, much lower than 19.2 dBZ in the NEXRAD data. At 11-km altitude, EAMv1 severely underestimates the reflectivity values compared to NEXRAD (Figs. 2j-k), with a mean value of 9.8 dBZ for EAMv1 versus 16.6 dBZ for NEXRAD. The negative bias is generally more than 7.5 dBZ in the central United States (Fig. 2l).
Clearly, above 4 km, the model's negative biases increase with height, as shown in Figs. 2f, 2i, and 2l, most notably in the central United States. There is no valid reflectivity value simulated by EAMv1 above 12-km altitude, while NEXRAD still shows reflectivity values up to 15.7 dBZ, indicating that the simulated deep convection in the warm season is not deep enough, a problem that is further examined in the following section.

Comparison of Vertical Distribution of Radar Reflectivity
To quantitatively examine the simulated vertical distribution of radar reflectivity, contoured frequency by altitude diagrams (CFADs; Yuter and Houze, 1995) are generated from NEXRAD and EAMv1 and compared in Fig. 3. The CFADs represent the frequency of occurrence of reflectivity in a coordinate system having reflectivity bins (interval of 1 dBZ) on the x-axis and altitude bins (interval of 1 km) on the y-axis. The frequency within each bin box is calculated as the number of valid samples it contains divided by the total sample number of all reflectivity bins at all levels, meaning that the integrated value of all frequencies in each plot is 100%. 3 km lower than the 14 km top seen in the observations. At low levels, below 4 km, the NEXRAD data show a high-frequency core (> 3.2%) concentrated between 8 and 25 dBZ, whereas the simulated high-frequency core is at 13-28 dBZ. For reflectivity > 35 dBZ, the simulation has a higher probability of occurrence than the NEXRAD observations. The box-whisker plots (Figs. 3c, f, i, l, o, and r) represent the same results in a different way. Below 4 km, the percentile values are consistent between model and observations except at 1-km altitude, where the model overestimates the reflectivity. The simulated 25th-75th percentiles are located at reflectivity values of 15-27 dBZ at 1-km altitude, which is higher than the NEXRAD observation (12-28 dBZ). As noted in the discussion of Fig. 2, the consistency at low levels (e.g., 2 km) between model and observations is mainly due to the offset of negative and positive biases in different regions of the domain. Moreover, EAMv1 underestimates the frequency of echoes ≤ 15 dBZ and overestimates it for echoes between 15 and 30 dBZ, which causes the higher median values in the model. From 4 km upward, the model-observation differences become much larger than at low levels (Fig. 3), consistent with the result shown in Fig. 2. The underestimation of the 95th percentile value increases from 10 dBZ at 7 km to more than 20 dBZ at 11 km. Above 11 km, the model completely fails to simulate any reflectivity.
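The CFAD normalization described above (each bin count divided by the total number of valid samples over all reflectivity bins and all levels) can be sketched as follows. The bin edges, function name, and sample values are illustrative assumptions; in this sketch, samples that fall outside the chosen bin ranges are excluded from the total.

```python
import numpy as np


def cfad(dbz, height_km, dbz_edges, height_edges):
    """Return CFAD frequencies (%) with shape (n_height_bins, n_dbz_bins).

    Each bin count is divided by the total count over all bins and levels,
    so the frequencies sum to 100% whenever any valid sample is present.
    """
    dbz = np.asarray(dbz, dtype=float).ravel()
    height_km = np.asarray(height_km, dtype=float).ravel()
    valid = np.isfinite(dbz) & np.isfinite(height_km)
    counts, _, _ = np.histogram2d(
        height_km[valid], dbz[valid], bins=[height_edges, dbz_edges]
    )
    total = counts.sum()
    return 100.0 * counts / total if total > 0 else counts


# Assumed bin layout: 1-dBZ bins from 8 dBZ and 1-km bins centered on 1..14 km.
dbz_edges = np.arange(8.0, 61.0, 1.0)
height_edges = np.arange(0.5, 15.0, 1.0)

# Four illustrative samples, one of them missing.
sample_dbz = np.array([10.2, 10.7, 25.3, np.nan])
sample_height = np.array([1.0, 1.0, 5.0, 5.0])
freq = cfad(sample_dbz, sample_height, dbz_edges, height_edges)
```

Because the normalization is global rather than per level, a CFAD built this way also conveys how the number of echoes decreases with altitude, which is what makes the shallower simulated echo tops visible.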
The CFADs of NEXRAD observations vary from month to month. For example, the 0.6%-0.8% contour level in the observations stops at 9-km altitude in April, but extends to 10 km in May and reaches 11 km in June. It increases to the highest at 11.5 km in July and August, then decreases to 11 km in September. This seasonality follows the seasonal variation of intensity of convection (Wang et al., 2019a).
The severe underestimation of the echo top height by EAMv1 has been reported for simulation of tropical convection with the Community Atmosphere Model version 5 (CAM5) in a recent study (Wang and Zhang, 2019). Although EAMv1 is different from CAM5 in many aspects such as vertical resolution and dynamical core, they share the same Zhang-McFarlane (ZM) cumulus parameterization (Zhang and McFarlane, 1995) for representing deep convection. Wang and Zhang (2019) found the cloud top height of tropical convection is underestimated by more than 2 km, which can be alleviated by the adjustment of the ZM scheme. We have performed a series of sensitivity tests by changing physical parameters in ZM and cloud microphysics schemes to explore the possibility of model improvement in echo top height. These tests are detailed in Section 3.4.

Comparison of Subgrid Distribution of Reflectivity
The horizontal resolution difference between GCMs (∼100 km) and NEXRAD observations (∼4 km) presents another challenge for testing the realism of the model-simulated radar reflectivity. To mimic the observations, COSP divides the grid-mean cloud and precipitation properties into subcolumns (Pincus et al., 2006) that statistically downscale the data in a way that should be consistent with observations. The way this is done in COSP is discussed by Zhang et al. (2010) and Hillman et al. (2018). In this section we examine whether the probability distribution of subgrid reflectivity implied by COSP is consistent with the observed subgrid distribution of reflectivity shown by the NEXRAD observations. In EAMv1, 50 subcolumns are used for calculating the mean radar reflectivity for a model grid box. For the NEXRAD data, the 625 pixels inside each 1° grid box provide a probability density function (PDF) of observed reflectivity within the box.
Fig. 4 compares the simulated subgrid reflectivity distribution to the NEXRAD distribution based on all the GridRad samples combined for the 3-year period at each individual level. The default microphysics assumptions in COSP, which are for a single-moment scheme, produce a bimodal distribution at all altitudes at and below 8 km (blue histograms in the left-hand column of Fig. 4). The bimodality is significantly different from the observed histogram, which forms a smooth gamma distribution. Song et al. (2018) also found bimodal distributions when the COSP was implemented in the CAM with the original microphysics assumptions, which are clearly unlike observed radar reflectivity distributions.
Our modification of the microphysical assumptions in COSP (right-hand column of Fig. 4) greatly reduces the unrealistic bimodality. In addition, the modified microphysics assumptions produce higher extreme values of reflectivity, in better agreement with observations, and the grid-mean radar reflectivities increase by ~5 dBZ (Fig. 5). The improvement in the subgrid distribution and grid-mean reflectivity brought by the change of microphysics assumptions indicates the necessity of microphysical consistency between COSP and the host model. It should be noted that the simulated radar reflectivity and its subgrid distribution are sensitive to the overlap assumption and the distribution function of condensates that are set in COSP (Hillman et al., 2018). Our results are from the default setup of these aspects of COSP. It is not the purpose of this study to test those assumptions.
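A minimal sketch of how the simulated and observed subgrid distributions can be placed on a common footing is given below. Because a model grid box holds 50 subcolumns while an observed box holds 625 pixels, each histogram is normalized by its own sample count before comparison. The function name, the bin edges, and the sample values are illustrative assumptions and are not data from this study.

```python
import numpy as np


def reflectivity_pdf(dbz_samples, edges):
    """Histogram of valid dBZ samples, normalized so the bin fractions sum to 1."""
    dbz = np.asarray(dbz_samples, dtype=float).ravel()
    dbz = dbz[np.isfinite(dbz)]
    counts, _ = np.histogram(dbz, bins=edges)
    total = counts.sum()
    return counts / total if total > 0 else counts.astype(float)


# Common bins for both datasets (assumed 4-dBZ width for this toy example).
edges = np.arange(8.0, 41.0, 4.0)

# Toy samples: a two-peaked "model" set and a single-peaked "observed" set.
model_subcolumns = np.array([10.0, 10.0, 30.0, 30.0])
observed_pixels = np.array([12.0, 14.0, 16.0, 18.0, np.nan])

pdf_model = reflectivity_pdf(model_subcolumns, edges)
pdf_obs = reflectivity_pdf(observed_pixels, edges)
```

In this toy case the model fractions sit in two separated bins while the observed fractions occupy adjacent bins, which is the kind of contrast in shape that the histogram comparison is meant to expose.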

Sensitivity of Simulated Echo Top Height to Tunable Parameters of the Global Model
Different from the model evaluation of cloud top height (e.g., Xie et al., 2018), evaluation of radar echo top height indicates whether the processes internal to the cloud are producing precipitation correctly. To examine whether any model parameters in the ZM cumulus parameterization scheme and/or the MG2 microphysics parameterization scheme can significantly influence the echo top height, we conducted a series of sensitivity tests for the tunable parameters listed in Table 2. Each test is based on the default setup for all other parameters.
Wang and Zhang (2018) suggested that the restriction of the neutral buoyancy level (NBL) from the dilute CAPE calculation (Neale et al., 2008) can limit the depth of deep convection in ZM. When the convective plume reaches the NBL, all mass flux is detrained even if the updraft is still positively buoyant according to the cloud model calculation (Zhang, 2009). To allow deep convection to grow deeper, we performed a sensitivity test following Wang and Zhang (2018), in which the NBL determined in the dilute CAPE calculation is removed, and the upper limit of the integrals of mass flux, moist static energy, and other cloud properties is set to be very high (70 hPa in this study). After the modification, the convective cloud top height increases as shown in Wang and Zhang (2018); however, there is no change in the radar echo top height, i.e., the maximum altitude at which precipitation-sized particles occur. A possible reason for the limited effect on echo top height is that the cloud ice content is too low in midlatitude continental convection without a convective microphysics parameterization (Song et al., 2012), which cannot be improved by merely raising the NBL.
Other parameters that we tested in the ZM cumulus parameterization with the dilute CAPE calculation include the convective entrainment rate (zmconv_dmpdz), the convection adjustment time scale (zmconv_tau), the coefficient of autoconversion rate (zmconv_c0_lnd), ice particle size (clubb_ice_deep), convective fraction (cldfrc_dp), and the number of layers allowed for negative CAPE (zmconv_cape_cin). The overall conclusion is that separately tuning any of these parameters does not improve the simulation of echo top height. For the convective entrainment rate (zmconv_dmpdz), we changed the value from -0.7×10⁻³ to -1.0×10⁻⁵, which means that the entrainment in convection is almost turned off, similar to the undiluted CAPE assumption. Results show that the simulated echo top height is increased by 500-800 m in the modified simulation, and the reflectivity span in the lower troposphere is narrowed by 1-2 dBZ, which is closer to the observations (Fig. 6). This result is consistent with previous studies that tested the undiluted CAPE assumption (Neale et al., 2008; Hannah and Maloney, 2014). However, that assumption is unrealistic given that the undiluted CAPE-based closure deviates strongly from observations (Zhang, 2009). In summary, changing any single parameter alone in the ZM scheme does not improve the simulation of echo top height.
The MG2 cloud microphysics parameterization in E3SM determines only large-scale cloud and precipitation (i.e., those resolved at the model resolution). Changes in the MG2 cloud microphysics parameterization could affect the parameterized cumulus cloud and precipitation by changing the large-scale forcing from which cumulus clouds are calculated. By decreasing the MG2 autoconversion rate (prc_coef1), ideally the depletion of moisture within the atmospheric column is slowed down and more water vapor can be supplied to cumulus convection. Results show, however, that the echo top height is not affected by changing the MG2 assumptions. Attempts to accelerate the Wegener-Bergeron-Findeisen process in MG2 to increase the conversion of liquid to snow/ice, as well as the use of a lower size threshold for the ice-to-snow conversion, have also proven to be unimportant to the simulation of echo top height.
Thus, echo top height proves to be insensitive to the available tunable parameters. Setting the convective entrainment rate to an unrealistically low value gains only a 500-800 m increase in echo top height. Given that the model underestimation is more than 3 km, the increase is insufficient to resolve the discrepancy. Note that each individual tunable parameter was changed without retuning the model to keep the top-of-atmosphere radiative energy budget balanced and the model performance optimized. Thus, some expected improvement in echo top height may be subsequently offset by other untuned processes. Instead of quantifying how the model responds to the changes of parameters, we emphasize the trend of change in echo top height, which shows that the simulation of the echo top height cannot be significantly improved by tuning only one of those physical parameters. Further investigation of combinations of two or more parameters is a topic for a future study.

Conclusions and Discussion
We have evaluated the performance of E3SM EAMv1 in simulating the warm-season 3D radar reflectivity at an hourly scale over the North American sector of the globe by comparing the model results to the 3D distribution of radar reflectivity observed by NEXRAD radars over the CONUS during April-September of 2014-2016. The evaluation is achieved by improving the COSP radar simulator and employing special data processing techniques to ensure a fair comparison between model and observations that differ in sampling frequency, horizontal and vertical resolutions, and minimum detection limit. We find that:
1. Below 4-km altitude, the domain-mean reflectivities simulated by EAMv1 agree with NEXRAD observations in magnitude, but the simulation fails to capture the spatial variability. The model underestimates the reflectivity in the central U.S. between the Rocky Mountains and the Mississippi River. This pattern suggests that the model is not adequately representing the mesoscale convective systems that dominate warm-season rainfall in that region. The model overestimates the reflectivity outside this region.
2. Above 4-km altitude, EAMv1 shows a severe underestimation of the domain-mean reflectivity, and the negative bias increases with altitude up to 11 km, above which the model fails to simulate any reflectivity at all, whereas NEXRAD observations show strong radar echoes up to 14 km.
3. With the default microphysics assumptions in COSP, the simulated subgrid reflectivity PDF is bimodal, in disagreement with radar observations, which show that the subgrid probability distribution of reflectivity follows a gamma distribution. When the microphysics assumptions in COSP are changed to be consistent with the MG2 microphysics parameterization used in E3SM, the bimodality of the subgrid distribution is nearly eliminated. It is therefore important to maintain consistency of microphysics assumptions between the host model and the radar-echo simulator attached to the model.
The NEXRAD observations used in this study reveal that E3SM fails to simulate the occurrence of large ice-phase particles at high levels in deep convective clouds. In addition, the conclusion that "simulated deep convection is not deep enough" also echoes the dry bias seen in GCMs, as manifested in underestimations of total precipitation and of individual large rain rates over the CONUS (e.g., Zheng et al., 2019). We have now shown that this model deficiency cannot be significantly improved by tuning only one of the physical parameters in the ZM cumulus and MG2 cloud microphysics schemes.
The data processing techniques and metrics we have developed in this study can be used globally for model evaluation when satellite-based radars provide global 3D radar observations. The GPM radar observations, which have been proven consistent with NEXRAD (Wang et al., 2019b), will eventually be able to provide global radar echo coverage. However, as discussed in Section 2, the sampling by GPM at 1° model grid elements over only three years is insufficient for obtaining robust grid-mean values to compare with the E3SM simulation. When GPM has run for a much longer time period, it will become a very useful dataset for evaluating global model simulations.