Articles | Volume 14, issue 2
Methods for assessment of models
03 Feb 2021

Using radar observations to evaluate 3-D radar echo structure simulated by the Energy Exascale Earth System Model (E3SM) version 1

Jingyu Wang, Jiwen Fan, Robert A. Houze Jr., Stella R. Brodzik, Kai Zhang, Guang J. Zhang, and Po-Lun Ma

The Energy Exascale Earth System Model (E3SM) developed by the Department of Energy has a goal of addressing challenges in understanding the global water cycle. Success depends on correct simulation of cloud and precipitation elements. However, lack of appropriate evaluation metrics has hindered the accurate representation of these elements in general circulation models. We derive metrics from the three-dimensional data of the ground-based Next-Generation Radar (NEXRAD) network over the US to evaluate both horizontal and vertical structures of precipitation elements. We coarsened the resolution of the radar observations to be consistent with the model resolution and improved the coupling of the Cloud Feedback Model Intercomparison Project Observation Simulator Package (COSP) and E3SM Atmospheric Model Version 1 (EAMv1) to obtain the best possible model output for comparison with the observations. Three warm seasons (2014–2016) of EAMv1 simulations of 3-D radar reflectivity features at an hourly scale are evaluated. A general agreement in domain-mean radar reflectivity intensity is found between EAMv1 and NEXRAD below 4 km altitude; however, the model underestimates reflectivity over the central US, which suggests that the model does not capture the mesoscale convective systems that produce much of the precipitation in that region. The shape of the model-estimated histogram of subgrid-scale reflectivity is improved by correcting the microphysical assumptions in COSP. Different from previous studies that evaluated modeled cloud top height, we find the model severely underestimates radar reflectivity at upper levels – the simulated echo top height is about 5 km lower than in observations – and this result is not changed by tuning any single physics parameter. For more accurate model evaluation, a higher-order consistency between the COSP and the host model is warranted in future studies.

Please read the corrigendum first before continuing.

1 Introduction

Clouds and precipitation play a major role in Earth's budgets of energy, water, and momentum. However, the correct simulation of 3-D structures of clouds and precipitation has been challenging in general circulation models (GCMs) (Trenberth et al., 2007; Randall et al., 2007), partially because model grid spacings generally do not adequately resolve the cloud-structure details important to these budgets. In addition, the lack of appropriate evaluation metrics also hinders the evaluation of GCMs.

Over the contiguous US (CONUS), the detailed 3-D radar reflectivity field (indicating the 3-D distribution of precipitation particles) is observed by the ground-based Next-Generation Radar (NEXRAD) network of S-band weather radars (3 GHz; Zhang et al., 2011, 2016). In this study, we use the mosaic of NEXRAD observations called Gridded Radar Data (GridRad) developed by Homeyer and Bowman (2017), which has a horizontal resolution of 0.02° (regridded to 4 km in this study), a vertical resolution of 1 km (24 levels), and an update cycle of 1 h. In order to compare these data appropriately with output of the global model used here, we further coarsen the horizontal resolution, as described in Sect. 2.

The Energy Exascale Earth System Model (E3SM) is an ongoing effort of the Department of Energy (DOE) to advance the next generation of climate modeling. Version 1 of the E3SM Atmosphere Model (EAMv1) is a descendant of the National Center for Atmospheric Research (NCAR) Community Atmosphere Model version 5.3 (CAM5.3; Neale et al., 2012). However, it has evolved substantially in coding, performance, resolution, physical processes, testing, and development procedures (Rasch et al., 2019). Previous model evaluation has focused on the long-term climatological properties of certain cloud fields, surface precipitation, and water conservation on the global scale (e.g., Qian et al., 2018; Xie et al., 2018; K. Zhang et al., 2018; Lin et al., 2019). Evaluations of the vertical structures of cloud and precipitation elements have used vertically pointing radar observations obtained during field campaigns (Y. Zhang et al., 2018; Zhang et al., 2019). However, these tests lacked evaluation of fully 3-D cloud and precipitation structure over large regions of the globe and over long time periods.

For this study, we have built data processing techniques to evaluate EAMv1 simulation of the 3-D radar reflectivity field at its default setting of 1° grid spacing and 72 vertical layers at an hourly timescale. Our goal is to provide a comprehensive evaluation of both horizontal pattern and vertical structure of cloud and precipitation. We use radar observations obtained from the NEXRAD over the CONUS for the 3 years 2014–2016. In order to directly compare the model results with NEXRAD, we have improved the Cloud Feedback Model Intercomparison Project (CFMIP) Observation Simulator Package (COSP) (Bodas-Salcedo et al., 2011) and implemented it into EAMv1. We restrict the evaluation to the warm season (i.e., April to September). Over the CONUS, warm-season precipitation is dominated by convective processes, which are very different from the more widespread frontal cloud systems of cold-season precipitation. As discussed by Iguchi et al. (2018), precipitating ice particles have large variation in habits and scattering properties, and the effect of non-Rayleigh scattering and multiple scattering by large precipitating ice particles could introduce large uncertainty into simulating the radar reflectivity field. To reduce uncertainty due to these factors, we examine only the warm season of the 3 years from 2014 to 2016.

This paper is organized as follows: Sect. 2 describes the model, the GridRad dataset, the COSP simulator, and the step-by-step methodology of data processing to account for differences between the modeled and observed datasets, specifically (1) horizontal and vertical resolutions of EAMv1 (1°, 72 vertical levels) and NEXRAD (4 km horizontally, 1 km vertically) and (2) minimum detectable limits between the model and NEXRAD. Section 3 presents the model evaluation results and tests of the sensitivity to physics parameters. Section 4 provides synthesis and conclusions.

2 Methodology

2.1 EAMv1 description and configuration

EAMv1's dynamical core and physics parameterizations are described in detail by Rasch et al. (2019). The continuous Galerkin spectral finite-element method solves the primitive equations on a cubed-sphere grid (Dennis et al., 2012; Taylor and Fournier, 2010). Tracer transport on the cubed sphere is handled using a variant of the semi-Lagrangian vertical coordinate system of Lin (2004). The method locally conserves air mass, trace constituent mass, and moist total energy (Taylor, 2011). Turbulence, shallow cumulus clouds, and cloud macrophysics are parameterized with the Cloud Layers Unified By Binormals (CLUBB) parameterization (Golaz et al., 2002; Larson, 2017). Deep convection is based upon the formulation originally described in Zhang and McFarlane (1995, hereafter ZM), with modifications by Neale et al. (2008) and Richter and Rasch (2008). Stratiform clouds are represented with the “Morrison and Gettelman version 2” (MG2) two-moment bulk microphysics parameterization (Gettelman and Morrison, 2015). Aerosol microphysics and interactions with stratiform clouds are treated with an updated and improved version of the four-mode version of the Modal Aerosol Module (MAM4; Liu et al., 2016). Regarding the stratiform–convection partition, the MG2 stratiform cloud microphysics and CLUBB higher-order turbulence parameterization explicitly provide values for condensate mass and number, as well as an estimate of stratiform cloud fraction. In contrast, the convective cloud fraction is not parameterized in the mass-flux-based ZM scheme (it is assumed to be ≪1 for typical GCM resolutions such as 1° grid spacing or coarser). Instead, it is diagnosed from the cloud mass flux for the cloud radiation calculation and is treated as a tunable parameter.

The EAMv1 used in this study has 30 spectral elements (ne30), which corresponds to approximately 1° horizontal grid spacing, and the total number of grid columns is 48 602. Vertically, there are 72 layers using a traditional hybridized sigma pressure coordinate. The simulation is run for the time period from 1 January 2014 to 1 October 2016. We use a dynamic time step of 5 min and a cloud microphysics time step of 30 min. The large-scale circulation in the simulation is constrained using the nudging technique (Zhang et al., 2014; Ma et al., 2014; Lin et al., 2016), so that the model simulations can be constrained by realistic large-scale forcing. Specifically, horizontal winds (U, V components) are nudged towards the Modern-Era Retrospective analysis for Research and Applications, version 2 (MERRA2), reanalysis data (Gelaro et al., 2017) with a relaxation timescale of 6 h. Nudging is applied to all grid boxes at each time step, with the nudging tendency calculated using the model state and the linearly interpolated MERRA2 data (Sun et al., 2019).
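The nudging described above can be illustrated with a minimal sketch of Newtonian relaxation. It assumes the common form in which the tendency is the difference between the reference and model state divided by the relaxation timescale; the function names, the 3 h spacing of the reference snapshots, and the sample wind values are hypothetical and are not taken from the EAMv1 code.

```python
import numpy as np

TAU_S = 6 * 3600.0  # relaxation timescale of 6 h, in seconds


def interp_reference(u0, u1, t0, t1, t):
    """Linearly interpolate the reference wind between snapshots at t0 and t1."""
    w = (t - t0) / (t1 - t0)
    return (1.0 - w) * u0 + w * u1


def nudging_tendency(u_model, u_ref, tau=TAU_S):
    """Relaxation tendency (m s-2) that pulls u_model towards u_ref."""
    return -(u_model - u_ref) / tau


# Hypothetical example: reference snapshots 3 h apart, model state halfway between.
u_ref = interp_reference(np.array([10.0]), np.array([12.0]), 0.0, 10800.0, 5400.0)
tend = nudging_tendency(np.array([8.0]), u_ref)
print(u_ref, tend)
```

With a 6 h timescale the correction is gentle: a wind error of a few metres per second produces a tendency of order 1e-4 m s-2, so the large-scale flow is steered without overwriting the model's own fast physics.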

To facilitate the comparison with observations, model outputs are regridded to the geographic coordinate system with a horizontal grid spacing of 100 km, and the vertical coordinate is converted to the above mean surface level height in meters. By default, all the regridding processes in this study are based on the Earth System Modeling Framework Python Regridding Interface (, last access: 10 April 2019) using bilinear interpolation.

2.2 COSP radar simulator

The retrieved spaceborne satellite and ground-based radar products such as cloud water content and effective particle size (e.g., Randel et al., 1996; Wang et al., 2015; Tian et al., 2016; Um et al., 2018) are often treated as the ground truth for model evaluation (e.g., Fan et al., 2017; Han et al., 2019). However, the retrieved products often have large uncertainty (Stephens and Kummerow, 2007). To allow the comparison of model results with direct measurements from 3-D scanning radars (ground-based or satellite-borne), the COSP was developed for use in GCMs (Bodas-Salcedo et al., 2011). Instead of using retrieved products to evaluate the model simulation, COSP converts model output into pseudo-observations using forward calculations (Bodas-Salcedo et al., 2011; Swales et al., 2018; Zhang et al., 2010).

The COSP consists of three steps, as detailed in Zhang et al. (2010). The first step is to generate a subgrid-scale distribution of cloud and precipitation, which is done by using the Subgrid Cloud Overlap Profile Sampler (SCOPS; Klein and Jakob, 1999; Webb et al., 2001) and SCOPS for precipitation (SCOPS_PREC), respectively. Each GCM grid box is divided into 50 subcolumns in this study. Detailed description of SCOPS and SCOPS_PREC can be found in Zhang et al. (2010). Then, the radar signals are calculated by the QuickBeam code (Haynes and Stephens, 2007) using the column distribution of cloud and precipitation. Thus, COSP calculates the reflectivity for the combined cloud properties using its own subgrid assumption, and it does not distinguish convective and stratiform cloud contributions to reflectivity. Finally, the grid box mean radar reflectivity is calculated through the method of linear averaging (i.e., the reflectivity values [in dBZ] are converted to the Z values [mm⁶ m⁻³] to calculate the mean Z, and then mean Z is converted back into dBZ). Beyond this grid-box averaging, all other processing of radar reflectivity data from the model and NEXRAD in this study also uses the linearized Z values, including horizontal averaging, vertical interpolation, and the calculation and comparison of mean values.
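The linear averaging step can be sketched as follows. This is a minimal illustration of the dBZ-to-Z conversion described above, not the COSP source code; the function names and the three sample subcolumn values are hypothetical.

```python
import numpy as np


def dbz_to_z(dbz):
    """Convert reflectivity in dBZ to linear Z (mm^6 m^-3)."""
    return 10.0 ** (np.asarray(dbz, dtype=float) / 10.0)


def z_to_dbz(z):
    """Convert linear Z (mm^6 m^-3) back to dBZ."""
    return 10.0 * np.log10(z)


def mean_dbz(dbz_values):
    """Average reflectivities in linear Z space and return the mean in dBZ."""
    return z_to_dbz(np.mean(dbz_to_z(dbz_values)))


subcolumns = np.array([10.0, 20.0, 30.0])  # hypothetical subcolumn values, dBZ
print(mean_dbz(subcolumns))
```

For these three values the linear mean is 370 mm⁶ m⁻³, or about 25.7 dBZ, whereas a naive arithmetic mean of the dBZ values would give 20 dBZ. The difference shows why the averaging space matters: the strongest echoes dominate the grid-box mean.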

Table 1. Modification of the hydrometeor assumptions used in COSP.

a LS: large-scale; CV: convective; L: cloud liquid; I: cloud ice; S: snow. b Distribution width: ν in $N(D) = N_0 D^{\nu-1} e^{-\lambda D}$, which is a shape parameter in the gamma distribution describing the dispersion of the distribution. n/a: not applicable.


The COSP version 1.4 used in this study has no scientific difference from version 2.0 (Song et al., 2018; Swales et al., 2018). Following the general usage of COSP, we modified the microphysics assumptions used for the radar reflectivity calculation (hydrometeor density, size distribution, etc.) to make them consistent with those of the MG2 cloud microphysics scheme used in E3SM. These changes are documented in Table 1. Note that, although we tried to make the COSP use the same hydrometeor size distribution functions as MG2, the three parameters (slope, intercept, and shape parameters) are still separately defined in COSP. We use horizontally homogeneous cloud condensate distribution within the model grid element and the maximum–random overlapping scheme for cloud occurrence (Marchand et al., 2009; Hillman et al., 2018).

2.3 NEXRAD observations

The NEXRAD network consists of 159 S-band (3 GHz) Doppler radars, which form a dense observational network nearly covering the CONUS. We use the GridRad mosaic product of Homeyer and Bowman (2017), which combines all NEXRAD radar data covering the region 25–49° N, 155–69° W. To compare the GridRad data to the E3SM model fields, the radar frequency in the COSP was set to 13.6 GHz, consistent with the Global Precipitation Measurement (GPM) Ku-band radar, since we originally aimed at evaluating the E3SM simulation with GPM data. However, due to the high detectable threshold of 13 dBZ, low sampling frequency (four to seven overpasses over CONUS per day), and the narrow swath width (245 km) for each overpass, GPM data within the 3-year period (2014–2016) have a significant under-sampling issue. That is, the GPM sample sizes over 1° model grid boxes are generally too small to robustly represent the grid element mean value. Therefore, we decided not to use GPM data in this study. As GPM operates over the whole earth and is anticipated to run for a long time period, it will likely be a very useful dataset for evaluating the coarse-resolution global model in the future.

The GPM radar frequency is higher than that of NEXRAD (13.6 vs. 3 GHz). Previous studies have shown conversions from Ku (13.6 GHz) to S band (3 GHz) are necessary when using GPM Ku-band radar to calibrate the ground-based radars (Warren et al., 2018). Based on our previous study that quantitatively evaluated the coincident observations from NEXRAD and GPM over the CONUS, we found the 3-D radar reflectivity fields obtained from the two independent platforms are highly consistent with each other after proper smoothing of GPM data in the vertical (Wang et al., 2019b). We performed a series of offline tests of COSP simulation using the frequency of 3 GHz (NEXRAD), 13.6 GHz (GPM Ku band), and 94 GHz (the cloud profiling radar on board the CloudSat satellite). Their corresponding reflectivities are compared in Fig. 1. As shown, the reflectivity values with 3 GHz are very similar to those with 13.6 GHz, indicating that the Rayleigh scattering assumption is satisfied for both frequencies in this application. To examine whether the COSP can correctly handle the Mie scattering calculation, we also tested the 94 GHz frequency used by CloudSat, whose products have been widely used for the evaluation of coarse-resolution models (Zhang et al., 2010). As shown in Fig. 1, the reflectivities simulated with 94 GHz deviate significantly from those simulated with 3 and 13.6 GHz when reflectivities exceed 10 dBZ, which reveals that the COSP simulator is capable of handling both Rayleigh and Mie scattering calculations. However, there is no difference using Ku band or S band in the COSP simulator in this study, because the simulated condensates are not large enough to lead to non-Rayleigh scattering, which is typically observed at Z > 40 dBZ for the Ku band (Matrosov, 1992).
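The frequency dependence discussed above can be made concrete with a small worked example. It computes the wavelength at each of the three frequencies and the dimensionless size parameter x = πD/λ, which must be much smaller than 1 for Rayleigh scattering to hold. The 1 mm particle diameter is an illustrative assumption, not a value from this study, and the function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light, m s-1


def wavelength_m(freq_ghz):
    """Radar wavelength in metres for a frequency given in GHz."""
    return C / (freq_ghz * 1e9)


def size_parameter(diameter_m, freq_ghz):
    """Dimensionless size parameter x = pi * D / wavelength."""
    return math.pi * diameter_m / wavelength_m(freq_ghz)


# Illustrative 1 mm particle at the S-band, Ku-band and W-band frequencies.
for freq in (3.0, 13.6, 94.0):
    lam_cm = 100.0 * wavelength_m(freq)
    x = size_parameter(1e-3, freq)
    print(f"{freq:5.1f} GHz: wavelength = {lam_cm:.2f} cm, x = {x:.3f}")
```

The wavelengths are roughly 10 cm, 2.2 cm, and 3.2 mm. For the assumed 1 mm particle, x stays well below 1 at 3 and 13.6 GHz but approaches 1 at 94 GHz, consistent with the 94 GHz curve departing from the other two in Fig. 1.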

An attenuation correction has been applied in case of existence of any large particles, although they are extremely unlikely to occur in this application. Since the COSP mimics the satellite view from space to the ground, the layer below 1 km altitude is most vulnerable to the possible attenuation caused by large precipitation particles, which has been excluded from the comparison. In this study, biases caused by the temporal mismatch are minimal at the horizontal resolution of 1° (∼100 km); we nevertheless perform Gaussian smoothing of GridRad data to match the model time step (30 min) in the comparison.

Figure 1. Scatterplots of radar reflectivity values simulated by the COSP simulator at 3 GHz (x axis) vs. those simulated at 13.6 GHz (left y axis) and 94 GHz (right y axis).


Figure 2. Examples of (a) original GridRad observation, (b) GridRad mapped over the E3SM model grid, and (c) the concurrent model simulation on 11 May 2016, 07:00 UTC, at the 2 km altitude.

2.4 Mapping the radar observations to the model grid

As shown in previous studies (e.g., Wang et al., 2015, 2016, 2018; Feng et al., 2012, 2019), the minimum reflectivity of the 3-D mosaic NEXRAD dataset is 0 dBZ (Fig. 2a). However, the model grid-mean reflectivity can be as low as −100 dBZ. Because our focus is on significantly precipitating clouds, the minimum threshold of reflectivity at the 1° grid scale is set to 8 dBZ (corresponding to a rain rate ≥ 0.1 mm h⁻¹). We also tested a threshold of 0 dBZ and report later that it has only minor effects on our conclusions. For our main results, after coarsening the 4 km GridRad data to a model grid element, only the grid elements with a mean value larger than 8 dBZ are taken into account in both observations (Fig. 2b) and in the simulation (Fig. 2c). In the vertical direction, the EAMv1-simulated radar reflectivity field (72 vertical levels, hybrid coordinate) is interpolated to the levels of GridRad (vertical resolution of 1 km). The simulation data are saved hourly, consistent with the hourly GridRad data.
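The coarsening and thresholding described above can be sketched as a block average in linear Z followed by a mask. This is a minimal illustration under stated assumptions, not the processing code used in the study: the input is a regular 2-D array, missing echoes are stored as NaN and ignored in the block mean (treating them as zero reflectivity would be an equally plausible choice), and a factor of 25 stands in for mapping 4 km pixels onto a roughly 100 km grid element. Function names are hypothetical.

```python
import numpy as np


def coarsen_dbz(dbz_fine, factor):
    """Block-average a 2-D dBZ field by `factor` per direction, in linear Z.

    NaN pixels (no echo) are ignored; blocks with no valid pixel return NaN.
    """
    dbz_fine = np.asarray(dbz_fine, dtype=float)
    ny, nx = dbz_fine.shape
    if ny % factor or nx % factor:
        raise ValueError("field dimensions must be multiples of factor")
    z = 10.0 ** (dbz_fine / 10.0)  # NaN stays NaN
    blocks = z.reshape(ny // factor, factor, nx // factor, factor)
    total = np.nansum(blocks, axis=(1, 3))
    count = np.sum(np.isfinite(blocks), axis=(1, 3))
    z_mean = np.where(count > 0, total / np.maximum(count, 1), np.nan)
    return 10.0 * np.log10(z_mean)


def apply_threshold(dbz_grid, threshold_dbz=8.0):
    """Keep grid elements at or above the threshold; set the rest to NaN."""
    dbz_grid = np.asarray(dbz_grid, dtype=float)
    return np.where(dbz_grid >= threshold_dbz, dbz_grid, np.nan)


# Hypothetical 50 x 50 field of 4 km pixels mapped onto a 2 x 2 coarse grid.
rng = np.random.default_rng(0)
fine = rng.uniform(0.0, 40.0, size=(50, 50))
coarse = apply_threshold(coarsen_dbz(fine, 25))
print(coarse.shape)
```

The same mask is applied to the observed and simulated fields so that both datasets are compared over grid elements that pass the same minimum reflectivity.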

Figure 3. Comparison of radar reflectivity subgrid distribution between NEXRAD observations (red bars) and the simulations (blue bars) at the vertical levels of 2, 4, 8, and 11 km. Simulation results in the left and right columns are from the default microphysics assumptions in COSP and modified COSP microphysics assumptions, respectively.


Figure 4. Scatter density plot of radar reflectivity values from the simulation with the modified microphysics assumptions (y axis) vs. those with the default microphysics assumptions (x axis). The data shown are for April 2014. The dots are color-labeled with their frequency of occurrence.


3 Results

After the horizontal averaging, vertical interpolation, and truncation at the identified minimum threshold of 8 dBZ, the 3-D radar reflectivity fields obtained from GridRad and the model simulation become comparable. The EAMv1-simulated reflectivity is evaluated from the perspectives of subgrid distribution, horizontal pattern, and vertical distribution.

3.1 Comparison of subgrid distribution of reflectivity

The horizontal resolution difference between GCMs (∼100 km) and NEXRAD observations (4 km) presents a challenge for testing the model-simulated radar reflectivity. To mimic the observations, COSP divides the grid-mean cloud and precipitation properties into subcolumns (Pincus et al., 2006) that statistically downscale the data in a way that should be consistent with observations. The way this is done in COSP is discussed by Zhang et al. (2010) and Hillman et al. (2018). In this section we examine whether the subgrid reflectivity distribution generated by COSP is consistent with the observed subgrid reflectivity distribution shown by the NEXRAD observations.

In EAMv1, 50 subcolumns are used for calculating the mean radar reflectivity for a model grid box. There are 625 pixels inside each 1° grid box for NEXRAD data to provide a probability density function (PDF) of observed reflectivity within the box. After averaging the NEXRAD pixels at subgrid scale to 50 samples to match the COSP's subcolumns, Fig. 3 compares the simulated subgrid reflectivity PDF to the NEXRAD PDF based on all the GridRad samples combined for the 3-year period at each individual level, where the interval of reflectivity bins is 1 dBZ. The results for the default microphysics assumptions in COSP, which are for a single-moment scheme, produce a bimodal distribution at and below 8 km altitudes (blue histograms in the left-hand column of Fig. 3). The bimodality is significantly different from the observed PDF, which forms a smooth gamma distribution. Song et al. (2018) also found bimodal distributions when the COSP was implemented in the CAM with the original microphysics assumptions, which are clearly unlike real observed radar reflectivity distributions.

Our modification of the microphysical assumptions in COSP (right-hand column of Fig. 3) greatly reduces the bimodality. In addition, the modified microphysical assumptions produce higher values of reflectivity, in better agreement with observations, and the grid-mean radar reflectivities increase by ∼4 dBZ (Fig. 4), mainly for values less than 25 dBZ. The improvement in the subgrid distribution and grid-mean reflectivity brought by the change of microphysics assumptions indicates the necessity of microphysical consistency between the COSP and the host model. It should be noted that the simulated radar reflectivity and its subgrid distribution are sensitive to the overlap assumption and the distribution function of condensates that are set in COSP (Hillman et al., 2018). Our results are from the default setup of these aspects of COSP. It is not the purpose of this study to test those assumptions.

Although the simulated subgrid reflectivity distribution is improved by setting the microphysics assumptions used in COSP consistent with the MG2, the model is still significantly biased. In addition to the intrinsic model–observation differences in the number concentrations and mixing ratios of hydrometeors, there are other possible error sources related to the reflectivity calculation as mentioned in Sect. 2.2. For example, (1) the mixing ratios of hydrometeor types from different types of clouds are not directly passed from the host model to COSP but rather are lumped together and equally divided among all the precipitating subcolumns, (2) the spectral parameters for defining a gamma distribution are not consistent with those from MG2, and (3) the assumptions of subgrid distribution and hydrometeor vertical overlap are simple and not consistent with other parts of the host model. In addition, the subgrid distribution results from COSP are calculated based on the assumption about the distribution of cloud and precipitation among the 50 subcolumns, which is independent of what E3SM uses. Therefore, a higher-order consistency between the COSP and the host model is warranted in future studies.

In the following analysis, we focus on the evaluation of the simulated 3-D radar reflectivity field at the model's native grid spacing of 1°, since the subgrid information from COSP does not directly reflect the subgrid treatment in E3SM itself. In addition, the convective cloud fraction is not parameterized in the mass-flux-based ZM scheme; it is diagnosed from the cloud mass flux for the cloud radiation calculation and treated as a tunable parameter. Evaluating it is therefore not very meaningful unless it becomes an independent variable, for instance, at grey-zone resolutions.

Table 2. The statistical comparison of radar reflectivity between NEXRAD and EAMv1.


Figure 5. Plan view of radar reflectivity averaged from NEXRAD observations (a, d, g, j); EAMv1 simulation with the modified microphysics assumptions in COSP (b, e, h, k); and their absolute differences (c, f, i, l) at the level of 2, 4, 8, and 11 km altitude. The NEXRAD data are spatially averaged from native resolution to the model grid over the April–September period during 2014–2016, and the simulations are vertically interpolated to the NEXRAD levels.

Figure 6. Comparison of radar reflectivity histograms at 1° scale between NEXRAD observations (red bars) and the simulations (blue bars) at the vertical levels of 2, 4, 8, and 11 km.


3.2 Comparison of horizontal patterns

We now compare the temporal mean reflectivity over the entire study period between the NEXRAD observations (Fig. 5a, d, g, and j) and the EAMv1 simulation (Fig. 5b, e, h, and k), with microphysical assumptions consistent between COSP and the host model, at the vertical levels of 2, 4, 8, and 11 km. The mean, standard deviation, 95th-percentile values, and valid sample numbers of the model and NEXRAD are compared in Table 2. At 2 km altitude, EAMv1 estimates higher reflectivity than the NEXRAD observations (Fig. 5a–b) except over the central US. The overall mean value is 28.7 dBZ for EAMv1 and 25.1 dBZ for NEXRAD. The negative bias for the model is in the region between the Rocky Mountains and Mississippi Basin (Fig. 5c), where mesoscale convective systems (MCSs) contribute heavily to precipitation. Those MCSs propagate eastward from their initiation over or just east of the Rocky Mountains, go through upscale growth, and finally dissipate in the eastern part of the Mississippi Basin (Yang et al., 2017; Feng et al., 2018, 2019). The standard deviations of the two individual datasets are quite similar, and EAMv1 generates a higher 95th-percentile value than the observation, indicating the model overestimates the extremely high values in the lower troposphere. In addition, those simulated extreme values are evenly distributed across the entire domain and thus fail to mimic the spatial footprint of MCSs as depicted by the NEXRAD data.

At 4 km altitude (Fig. 5d–e), the model's underestimation over the central US becomes larger compared to 2 km altitude, and the overestimation at the foothills of Rocky Mountains also becomes larger. The model also overestimates reflectivity in the east region of the domain. These results indicate that the E3SM simulation fails to capture the observed spatial variability. The domain mean value between the model and observations is the same (24.0 dBZ) as a consequence of the offset between the negative and positive biases in different areas. The standard deviation and 95th-percentile values are comparable with the observations as well. At 8 km, underestimation of the reflectivity by the model occurs over almost the entire domain (Fig. 5i), with a domain mean of 15.0 dBZ, much lower than 19.2 dBZ in the NEXRAD data. Meanwhile, the modeled standard deviation and the extreme values are smaller, indicating the model has difficulty capturing the observed variability.

At 11 km altitude, EAMv1 severely underestimates the reflectivity values compared to NEXRAD (Fig. 5j–k), with a mean value of 9.8 dBZ for EAMv1 and 16.6 dBZ for NEXRAD. The negative bias is generally more than 7.5 dBZ in the central US (Fig. 5l), and the model severely underestimates the standard deviation and extreme reflectivity. Moreover, EAMv1's sample size is 50 times smaller than that of NEXRAD, indicating the lower occurrence of reflectivity values ≥ 8 dBZ. Clearly, above 4 km, the model's negative biases increase with height, as shown in Fig. 5f, i, and l, and are most evident in the central US. There is no valid reflectivity value simulated by EAMv1 above 12 km altitude, where NEXRAD still shows reflectivity values up to 15.7 dBZ, indicating that the simulated deep convection in the warm season is not deep enough, a problem that is further examined in the following section.

In addition to the mean values, the histograms of observed and simulated radar reflectivities are compared for different altitudes, where the interval of reflectivity bins is 2 dBZ (Fig. 6). By comparing the occurrence of Z ≥ 8 dBZ between model and observations, the model apparently has a narrower distribution than the observations, and the model–observation deviation in maximum values increases with height. At 8 km and below, the model generally overestimates the sample sizes of smaller reflectivity values but lacks extremely high reflectivity values. However, at 11 km altitude, the model greatly underestimates the sample sizes of the entire reflectivity spectrum compared to the observation, causing the severe underestimation in the mean value.

3.3 Comparison of vertical distribution of radar reflectivity

To quantitatively examine the simulated vertical distribution of radar reflectivity, contoured-frequency-by-altitude diagrams (CFADs; Yuter and Houze, 1995) are generated from NEXRAD and EAMv1 and compared in Fig. 7. The CFADs represent the frequency of occurrence of reflectivity in a coordinate system having reflectivity bins (interval of 1 dBZ) on the x axis and altitude bins (interval of 1 km) on the y axis. The frequency within each bin box is calculated as the number of valid samples it contains divided by the total sample number of all reflectivity bins at all levels, meaning that the integrated value of all frequencies in each plot is 100 %.
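The CFAD normalization described above can be sketched with a 2-D histogram. This is a minimal illustration, not the plotting code used for Fig. 7; it assumes the samples are supplied as flattened 1-D arrays of reflectivity and altitude, that invalid samples are stored as NaN, and that the bin edges shown in the usage lines are hypothetical choices.

```python
import numpy as np


def cfad(dbz, altitude_km, dbz_edges, alt_edges):
    """Return CFAD frequencies in percent, shaped (n_alt_bins, n_dbz_bins).

    Each bin count is divided by the total number of valid samples falling
    inside the binned range at all levels, so the array sums to 100.
    """
    dbz = np.asarray(dbz, dtype=float)
    altitude_km = np.asarray(altitude_km, dtype=float)
    valid = np.isfinite(dbz)
    counts, _, _ = np.histogram2d(
        altitude_km[valid], dbz[valid], bins=[alt_edges, dbz_edges]
    )
    total = counts.sum()
    return 100.0 * counts / total if total > 0 else counts


# Hypothetical samples: 1 dBZ reflectivity bins, 1 km altitude bins.
dbz_edges = np.arange(8.0, 31.0, 1.0)
alt_edges = np.arange(0.5, 5.0, 1.0)
freq = cfad([10.2, 10.7, 20.5, np.nan], [2.0, 2.0, 4.0, 4.0], dbz_edges, alt_edges)
print(freq.sum())
```

Because the normalization uses the total over all levels, sparsely populated upper levels show up as low frequencies rather than being rescaled, which is what makes the echo top height visible in the diagram.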

Figure 7 shows the CFADs for both NEXRAD observations (Fig. 7a, d, g, j, m, and p) and the EAMv1 simulation (Fig. 7b, e, h, k, n, and q) for each month from April to September combined over 2014–2016. The most distinct difference between the model and observations is the simulated echo top height. The echo top height in the simulation generally is at 11 km, at least 5 km lower than the 16 km top seen in the observations. At levels below 4 km, the NEXRAD data show a high-frequency zone (> 3.2 %) concentrated between 8 and 25 dBZ, whereas the simulated high-frequency zone is at 13–28 dBZ. For reflectivity > 35 dBZ, the simulation has a higher probability of occurrence than the NEXRAD observations.

Regarding the overall shape of CFADs, the model follows the well-known pattern where the reflectivity value range of the high-frequency zone (> 3.2 %) increases from cloud top to the freezing level and then slowly decreases or remains constant below the freezing level. The cores of maximum frequency (> 5 %) are located in the centres of the high-frequency zones. However, these characteristics are not present in the observations, whose high-frequency zones are greatly skewed to the lower reflectivity values. The characteristics of NEXRAD's CFADs could be due to averaging from fine resolution (4 km) to coarse resolution (1°), as well as averaging of convective and stratiform components, because the two components produce significantly different reflectivity profiles and magnitudes.

The box-and-whisker plots (Fig. 7c, f, i, l, o, and r) represent the same results in a different way, where the normalization is conducted at each level rather than against the entire dataset at all levels. Below 4 km, the percentile values are consistent between the model and observations except for the 1 km altitude, where the model overestimates the reflectivity. The simulated 25th–75th percentiles are located at the reflectivity values of 15–27 dBZ at 1 km altitude, which is higher than the NEXRAD observation (12–28 dBZ). As noted in the discussion of Fig. 5, the consistency at low levels (e.g., 2 km) between the model and observations is mainly due to the offset of negative and positive biases in different regions of the domain. Moreover, EAMv1 underestimates the frequency of echoes ≤ 15 dBZ and overestimates it for echoes between 15 and 30 dBZ, which causes the higher median values in the model. From 4 km upward, the model–observation differences become much larger than at low levels, consistent with the result shown in Fig. 5. The underestimation of 95th-percentile value increases from 10 dBZ at 7 km to more than 20 dBZ at 11 km. Above 11 km, the model fails to generate average reflectivity above 8 dBZ, and the typical reflectivity value is between 0 and 2 dBZ at 12 km.
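The per-level normalization behind the box-and-whisker plots can be sketched by computing percentiles separately at each altitude. This is a minimal illustration: the set of percentiles (5, 25, 50, 75, 95) is an assumption, since the text names only the 25th–75th range and the 95th percentile, and the function name and sample values are hypothetical.

```python
import numpy as np


def level_percentiles(dbz, altitude_km, levels_km, q=(5, 25, 50, 75, 95)):
    """Return {level: percentile array} computed independently at each level."""
    dbz = np.asarray(dbz, dtype=float)
    altitude_km = np.asarray(altitude_km, dtype=float)
    out = {}
    for lev in levels_km:
        sample = dbz[np.isclose(altitude_km, lev) & np.isfinite(dbz)]
        if sample.size:
            out[lev] = np.percentile(sample, q)
        else:
            out[lev] = np.full(len(q), np.nan)  # no valid echo at this level
    return out


# Hypothetical samples, all at 2 km; the 3 km level has no valid data.
stats = level_percentiles([10.0, 20.0, 30.0, 40.0, 50.0], [2.0] * 5, [2.0, 3.0])
print(stats[2.0])
```

Unlike the CFAD frequencies, these statistics do not depend on how many samples a level has relative to the others, so they describe the shape of the distribution at a level even when echoes there are rare.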

From Fig. 7 it is clear that the model severely underestimates the echo top height by at least 5 km. To examine how sensitive this result is to the threshold reflectivity, we reprocessed the results with a 0 dBZ threshold. Lowering the threshold to 0 dBZ extends the model CFADs upward by ∼ 1 km, but the echo top height of the observations changes little. The choice of threshold therefore does not change the conclusion that the model severely underestimates echo top height.
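A threshold test of this kind can be sketched in a few lines. The example below is illustrative only: the function name, the synthetic profile, and the 8 dBZ default threshold are assumptions made for the sketch, and echo top height is taken here to be the highest level at which reflectivity meets or exceeds the threshold.

```python
def echo_top_height(profile_dbz, heights_km, threshold_dbz):
    """Highest altitude (km) with reflectivity >= threshold, or None."""
    top = None
    for z, dbz in zip(heights_km, profile_dbz):
        if dbz is not None and dbz >= threshold_dbz:
            if top is None or z > top:
                top = z
    return top


# Synthetic profile (hypothetical values), one entry per km from 1 to 12 km.
heights_km = list(range(1, 13))
profile_dbz = [25.0, 24.0, 22.0, 20.0, 17.0, 14.0,
               12.0, 10.0, 9.0, 8.5, 3.0, None]

top_default = echo_top_height(profile_dbz, heights_km, 8.0)  # assumed default
top_lowered = echo_top_height(profile_dbz, heights_km, 0.0)  # 0 dBZ test
```

For this synthetic profile the lowered threshold raises the diagnosed echo top by one level, because only weak echoes exist above the default-threshold top. A profile whose reflectivity drops abruptly to no echo would show no change at all.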

The CFADs of NEXRAD observations vary from month to month. For example, the echo top height is at 15 km in April, which increases to 16 km in May, then reaches 17 km in June and July, and finally decreases to 15 km in September. Similarly, the 0.6–0.8 % contour level in the observations stops at 9 km altitude in April but extends to 10 km in May and reaches 11 km in June. It increases to the highest level at 11.5 km in July and August, and then decreases to 11 km in September. This seasonality follows the seasonal variation of intensity of convection (Wang et al., 2019a), which is not captured in the EAMv1 simulation (Fig. 7b, e, h, k, n, and q).

A similar underestimation of convective depth has been reported for the simulation of tropical convection with CAM5 in a recent study (Wang and Zhang, 2018). Although EAMv1 differs from CAM5 in many aspects, such as vertical resolution and dynamical core, the two models share the same ZM cumulus parameterization (Zhang and McFarlane, 1995) for representing deep convection. Wang and Zhang (2018) found that the cloud top height of tropical convection is underestimated by more than 2 km, which can be alleviated by adjusting the ZM scheme. We have performed a series of sensitivity tests by changing physical parameters in the ZM and cloud microphysics schemes to explore the possibility of improving the simulated echo top height. These tests are detailed in Sect. 3.4.

As evaluated in Zheng et al. (2019), E3SM v1 failed to simulate the diurnal variation of precipitation over the central US, where the observed nocturnal peak is greatly underestimated. Xie et al. (2019) improved the diurnal cycle of convection in E3SM v1 recently by modifying the convective trigger function in the ZM scheme. It will be interesting to see if the 3-D radar reflectivity fields can be better simulated using the updated ZM scheme.

Figure 7. Contoured-frequency-by-altitude diagrams (CFADs) normalized by the total number of samples at all altitude levels for NEXRAD (a, d, g, j, m, p) and EAMv1 simulation with the modified microphysics assumptions in COSP (b, e, h, k, n, q) for the months from April to September averaged over the 2014–2016 period. The box-and-whisker plots (c, f, i, l, o, r) for NEXRAD (red) and EAMv1 (blue) are calculated using normalization at each individual level, where the center of the box represents the 50th-percentile value, and the 25th and 75th percentiles are represented by the left and right boundary of the box, respectively. Whiskers correspond to the 5 and 95 % values.


Table 3. Changes of the tunable parameters in the sensitivity tests for echo top height.


Figure 8. Comparison of contoured-frequency-by-altitude diagrams (CFADs) for the warm seasons over 2014–2016 between (a) NEXRAD, (b) EAMv1 simulation, and (c) the EAMv1-test simulation with reduced convective entrainment rate.


3.4 Sensitivity of simulated echo top height to tunable parameters of the global model

Unlike the evaluation of cloud top height and high cloud fraction (e.g., Xie et al., 2018), for which EAMv1 has shown good agreement with satellite observations over the CONUS, evaluation of radar echo top height indicates whether the processes internal to the cloud are producing precipitation correctly. To examine whether any model parameters in the ZM cumulus parameterization scheme and/or the MG2 microphysics parameterization scheme can significantly influence the echo top height, we conducted a series of sensitivity tests for the tunable parameters listed in Table 3. In each test a single parameter is changed, and all other parameters retain their default values.

Wang and Zhang (2018) suggested that the restriction of the neutral buoyancy level (NBL) from the dilute convective available potential energy (CAPE) calculation (Neale et al., 2008) can limit the depth of deep convection in ZM. When the convective plume reaches the NBL, all mass flux is detrained even if the updraft is still positively buoyant according to the cloud model calculation (Zhang, 2009). To allow deep convection to grow deeper, we performed a sensitivity test following Wang and Zhang (2018), where the NBL determined in the dilute CAPE calculation is removed, and the upper limit of the integrals of mass flux, moist static energy, and other cloud properties is set to be very high (70 hPa in this study). After the modification, the convective cloud top height increases as shown in Wang and Zhang (2018); however, there is no change in the radar echo top height, i.e., the maximum altitude at which precipitation-sized particles occur. A possible reason for the limited effect on echo top height is that the cloud ice content is too low in midlatitude continental convection without convective microphysics parameterization (Song et al., 2012), which cannot be improved by merely raising the NBL.

Other parameters that we tested in the ZM cumulus parameterization with the dilute CAPE calculation include the convective entrainment rate (zmconv_dmpdz), the convection adjustment timescale (zmconv_tau), the coefficient of autoconversion rate (zmconv_c0_lnd), ice particle size (clubb_ice_deep), convective fraction (cldfrc_dp), and the number of layers allowed for negative CAPE (zmconv_cape_cin). The overall conclusion is that separately tuning any of these parameters does not improve the simulation of echo top height. For the convective entrainment rate (zmconv_dmpdz), we reduced its magnitude from −0.7 × 10⁻³ to −1.0 × 10⁻⁵, which means that the entrainment in convection is almost turned off, similar to the undiluted CAPE assumption. Results show that the simulated echo top height is increased by 500–800 m in the EAMv1-test simulation, and the reflectivity span in the lower troposphere is narrowed by 1–2 dBZ, which is closer to the observations (Fig. 8). This result is consistent with previous studies that also tested the undiluted CAPE assumption (Neale et al., 2008; Hannah and Maloney, 2014). However, that assumption is unrealistic because the undiluted CAPE-based closure deviates strongly from observations (Zhang, 2009). In summary, changing any of our selected parameters individually in the ZM scheme does not improve the simulation of echo top height.
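The one-at-a-time design of these tests can be sketched as below. This is only an illustration of the experimental layout, not the configuration used in the study: the helper name is hypothetical, the zmconv_dmpdz values are the ones quoted in the text, and the zmconv_tau values are placeholders that are not taken from the paper.

```python
def one_at_a_time(defaults, perturbed):
    """Build one configuration per perturbed parameter.

    Each configuration is a copy of `defaults` in which exactly one
    parameter is overridden, so any response can be attributed to it.
    """
    experiments = {}
    for name, value in perturbed.items():
        if name not in defaults:
            raise KeyError(f"unknown parameter: {name}")
        config = dict(defaults)  # shallow copy keeps defaults untouched
        config[name] = value
        experiments[name] = config
    return experiments


defaults = {
    "zmconv_dmpdz": -0.7e-3,  # default entrainment rate quoted in the text
    "zmconv_tau": 3600.0,     # PLACEHOLDER value, not from the paper
}
perturbed = {
    "zmconv_dmpdz": -1.0e-5,  # test value quoted in the text
    "zmconv_tau": 1800.0,     # PLACEHOLDER value, not from the paper
}

experiments = one_at_a_time(defaults, perturbed)

# Number of parameters that differ from the defaults in each experiment.
n_changed = {
    name: sum(1 for key in defaults if config[key] != defaults[key])
    for name, config in experiments.items()
}
```

Because every experiment differs from the control in a single entry, interactions between parameters are not sampled by this design.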

The MG2 cloud microphysics parameterization in E3SM determines only large-scale cloud and precipitation (i.e., those resolved by the model). Changes in the MG2 cloud microphysics parameterization could affect the parameterized cumulus cloud and precipitation by changing the large-scale forcing which feeds into the cumulus cloud calculations. By decreasing the MG2 autoconversion rate (prc_coef1), ideally the depletion of moisture within the atmospheric column is slowed down and more water vapor can be supplied to cumulus convection. Results show, however, that the echo top height is not affected by changing the MG2 assumptions. Attempts at accelerating the Wegener–Bergeron–Findeisen process in MG2 to increase the conversion of liquid to snow or ice, as well as using a lower size threshold for the ice-to-snow conversion, have also proven to be unimportant to the simulation of echo top height.

Thus, echo top height proves to be insensitive to the available tunable parameters. Setting the convective entrainment rate to an unrealistically low value gains only a 500–800 m increment in echo top height. Given that the model underestimation is more than 5 km, the increment is insufficient to resolve the discrepancy. Note that each individual tunable parameter was changed without retuning the model to keep the top-of-atmosphere radiative energy budget balanced and the model performance optimized. Thus, some expected improvement in echo top height can be subsequently offset by other untuned processes. Rather than quantifying how the model responds to the parameter changes, we emphasize the direction of change in echo top height: the simulated echo top height cannot be significantly improved by tuning only one of those physical parameters. Further investigation of combinations of two or more parameters is a topic for a future study.
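The size of the gain relative to the bias can be made explicit with a short calculation. The sketch below uses only the numbers quoted above and takes 5 km as a lower bound on the underestimation, so the resulting fractions are upper bounds; the variable names are illustrative.

```python
underestimate_km = 5.0       # lower bound on the model bias ("more than 5 km")
gain_range_km = (0.5, 0.8)   # echo top gain with near-zero entrainment

# Fraction of the bias removed, as an upper bound for each end of the range.
fraction_closed = [gain / underestimate_km for gain in gain_range_km]
percent_closed = [round(100.0 * f) for f in fraction_closed]
```

Under these numbers the best single-parameter change removes at most roughly 10–16 % of the bias.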

4 Conclusions and discussion

We have evaluated the model performance of E3SM EAMv1 in simulating the warm-season 3-D radar reflectivity at an hourly scale over the North American sector of the globe by comparing the model results to the 3-D distribution of radar reflectivity observed by NEXRAD radars over the CONUS during April–September of 2014–2016. The evaluation is achieved by improving the COSP radar simulator and employing special data processing techniques to ensure fair comparison between model and observations that are different in sampling frequency, horizontal–vertical resolutions, and minimum detection limit. Our findings are as follows:

  1. With the default microphysics assumptions in COSP, the simulated subgrid reflectivity PDF is bimodal, in disagreement with radar observations which show that the subgrid reflectivity follows a gamma distribution. When the microphysics assumptions in COSP are changed to be consistent with the MG2 microphysics parameterization used in E3SM, the bimodality of the subgrid distribution is nearly eliminated. It is therefore important to maintain consistency of microphysics assumptions between the host model and radar echo simulator attached to the model as advocated by the COSP community (Swales et al., 2018). For more accurate model evaluation, a higher-order consistency between the COSP and the host model is warranted in future studies.

  2. Below 4 km altitude, the simulated domain-mean reflectivities by EAMv1 agree with NEXRAD observations in magnitude, but the simulation fails to capture the spatial variability. The model underestimates the reflectivity in the central US between the Rocky Mountains and Mississippi River. This pattern suggests that the model is not adequately representing the mesoscale convective systems that dominate warm-season rainfall in that region. The model overestimates the reflectivity outside this region.

  3. Above 4 km altitude, EAMv1 shows a severe underestimation of the domain-mean reflectivity, and the negative bias increases with altitude up to 11 km, above which the model fails to simulate any valid reflectivity at all, whereas NEXRAD observations show strong radar echoes up to 16 km.

  4. EAMv1 is able to simulate the variability and extreme values of reflectivity in the lower troposphere but significantly underestimates them at high levels.

The NEXRAD observations used in this study reveal that EAMv1 fails to simulate the occurrence of large ice-phase particles at high levels in deep convective clouds. In addition, the conclusion that “simulated deep convection is not deep enough” also echoes the dry bias seen in GCMs as manifested in underestimations of total precipitation and individually large rain rates over the CONUS (e.g., Zheng et al., 2019). We have now shown that this model deficiency cannot be significantly improved by tuning a single value of the physical parameters in the ZM cumulus and MG2 cloud microphysics schemes. Note the large-scale circulation is nudged towards observations for the simulations in this study, so our results represent the best-case model performance. Compared to the nudged simulations, free running of EAMv1 has shown nonnegligible biases in the regional circulation (Sun et al., 2019). With the nudged simulations, the large biases in circulation can be excluded so that the performances of physics parameterizations in simulating convective systems can be more insightfully understood.

The data processing techniques and metrics we have developed in this study can be used globally for model evaluation when satellite-based radars provide global 3-D radar observations. The GPM radar observations will eventually be able to provide global radar echo coverage (Houze et al., 2019), and these data have been shown to be consistent with NEXRAD (Wang et al., 2019b). However, as discussed in Sect. 2, the sampling by GPM at 1° model grid elements over only 3 years of GPM data is insufficient for obtaining robust grid-mean values to compare with the EAMv1 simulation. In addition to the limited availability of observational data, the high computational cost of incorporating the COSP simulator and the large data volume (14 000 core hours and 1.2 TB of data per simulation month at hourly output frequency) have hindered simulations over an extended period. When GPM has run for a much longer period and more powerful computational resources become available, such data will be very useful for evaluating long-term model simulations at the global scale. In addition, the results of this study can provide metrics for evaluating cumulus parameterizations or insights on how to further improve them, as in Labbouz et al. (2018), which could be a follow-on work. Future studies can also focus on separately evaluating properties in convective and stratiform regions, since the thermodynamic and reflectivity profiles are fundamentally different between the two regions.
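To give a sense of scale, the per-month cost quoted above can be extrapolated to the period analyzed here. This is a rough sketch that assumes the cost scales linearly with the number of simulated months and counts only the 18 warm-season months (April–September of 2014–2016); any spin-up or repeated sensitivity runs would add to these totals.

```python
core_hours_per_month = 14_000   # quoted per simulation month
terabytes_per_month = 1.2       # quoted per simulation month, hourly output

months_per_season = 6           # April through September
n_seasons = 3                   # 2014, 2015, 2016
n_months = months_per_season * n_seasons

total_core_hours = core_hours_per_month * n_months
total_terabytes = terabytes_per_month * n_months

print(f"{n_months} months: {total_core_hours} core hours, "
      f"{total_terabytes:.1f} TB")
```

Each additional sensitivity test covering the same period would repeat this cost, which helps explain why multi-parameter combinations were left for future work.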

Code availability

The source code in this study is based on the Department of Energy (DOE) Energy Exascale Earth System Model (E3SM) Project version 1 at revision 9a86ab9, whose code can be acquired from the E3SM repository and Zenodo (Wang, 2021).

Data availability

The observational data are available through the National Center for Atmospheric Research Data Archive (last access: 20 May 2019; Bowman and Homeyer, 2017). Model results are available from Wang (2020) (last access: 5 August 2020).

Author contributions

JW performed the simulations and conducted the analyses. JF and RAH developed the idea of this research. KZ helped in the model configuration, and P-LM implemented the radar simulator. GJZ provided feedback and helped shape the research. All authors discussed the results and contributed to the final manuscript.

Competing interests

The authors declare that they have no conflict of interest.


Acknowledgements

We acknowledge the support of the Climate Model Development and Validation (CMDV) project at PNNL. The effort of Jingyu Wang, Jiwen Fan, Kai Zhang, and Po-Lun Ma was supported by CMDV. Robert A. Houze Jr. was supported by NASA Award NNX16AD75G and by master agreement 243766 between the University of Washington and PNNL. Stella R. Brodzik was supported by NASA Award NNX16AD75G and subcontracts from the CMDV and Water Cycle and Climate Extreme Modeling (WACCEM) projects of PNNL. Guang J. Zhang was supported by the DOE Biological and Environmental Research Program (BER) Award DE-SC0019373. PNNL is operated for the US Department of Energy (DOE) by the Battelle Memorial Institute under contract DE-AC05-76RL01830. This research used resources of the National Energy Research Scientific Computing Center (NERSC), a US Department of Energy Office of Science user facility operated under contract DE-AC02-05CH11231.

Financial support

This research has been supported by the US Department of Energy, Office of Science projects of Climate Model Development and Validation and Water Cycle and Climate Extreme Modeling under the Contract DE-AC05-76RL01830 with the Pacific Northwest National Laboratory, as well as the project DE-SC0019373 and the National Aeronautics and Space Administration (grant no. NNX16AD75G).

Review statement

This paper was edited by Christina McCluskey and reviewed by Alain Protat, Peter May, and two anonymous referees.


References

Bodas-Salcedo, A., Webb, M. J., Bony, S., Chepfer, H., Dufresne, J.-L., Klein, S. A., Zhang, Y., Marchand, R., Haynes, J., Pincus, R., and John, V. O.: COSP: Satellite simulation software for model assessment, B. Am. Meteorol. Soc., 92, 1023–1043, 2011.

Bowman, K. P. and Homeyer, C. R.: GridRad – Three-Dimensional Gridded NEXRAD WSR-88D Radar Data, Research Data Archive at the National Center for Atmospheric Research, Computational and Information Systems Laboratory, 2017.

Dennis, J., Edwards, K., Evans, J., Guba, O., Lauritzen, P. H., Mirin, A. A., St-Cyr, A., Taylor, M. A., and Worley, P. H.: CAM-SE: A scalable spectral element dynamical core for the Community Atmosphere Model, Int. J. High Perform. C., 26, 74–89, 2012.

Fan, J., Han, B., Varble, A., Morrison, H., North, K., Kollias, P., Chen, B., Dong, X., Giangrande, S. E., Khain, A., Lin, Y., Mansell, E., Milbrandt, J. A., Stenz, R., Thompson, G., and Wang, Y.: Cloud-resolving model intercomparison of an MC3E squall line case: Part I – Convective updrafts, J. Geophys. Res.-Atmos., 122, 9351–9378, 2017.

Feng, Z., Dong, X., Xi, B., McFarlane, S. A., Kennedy, A., Lin, B., and Minnis, P.: Life cycle of midlatitude deep convective systems in a Lagrangian framework, J. Geophys. Res., 117, D23201, 2012.

Feng, Z., Leung, L. R., Houze Jr., R. A., Hagos, S., Hardin, J., Yang, Q., Han, B., and Fan, J.: Structure and evolution of mesoscale convective systems: Sensitivity to cloud microphysics in convection-permitting simulations over the United States, J. Adv. Model. Earth Sy., 10, 1470–1494, 2018.

Feng, Z., Houze, R. A., Leung, L. R., Song, F., Hardin, J. C., Wang, J., Gustafson, W. I., and Homeyer, C. R.: Spatiotemporal Characteristics and Large-Scale Environments of Mesoscale Convective Systems East of the Rocky Mountains, J. Climate, 32, 7303–7328, 2019.

Gelaro, R., McCarty, W., Suárez, M. J., Todling, R., Molod, A., Takacs, L., Randles, C. A., Darmenov, A., Bosilovich, M. G., Reichle, R., Wargan, K., Coy, L., Cullather, R., Draper, C., Akella, S., Buchard, V., Conaty, A., da Silva, A. M., Gu, W., Kim, G.-K., Koster, R., Lucchesi, R., Merkova, D., Nielsen, J. E., Partyka, G., Pawson, S., Putman, W., Rienecker, M., Schubert, S. D., Sienkiewicz, M., and Zhao, B.: The Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA-2), J. Climate, 30, 5419–5454, 2017.

Gettelman, A. and Morrison, H.: Advanced two-moment bulk microphysics for global models. Part I: Off-line tests and comparison with other schemes, J. Climate, 28, 1268–1287, 2015.

Golaz, J.-C., Larson, V. E., and Cotton, W. R.: A PDF-based model for boundary layer clouds. Part I: Method and model description, J. Atmos. Sci., 59, 3540–3551, 2002.

Han, B., Fan, J., Varble, A., Morrison, H., Williams, C. R., Chen, B., Dong, X., Giangrande, S. E., Khain, A., Mansell, E., Milbrandt, J. A., Shpund, J., and Thompson, G.: Cloud-resolving model intercomparison of an MC3E squall line case: Part II. Stratiform precipitation properties, J. Geophys. Res.-Atmos., 124, 1090–1117, 2019.

Hannah, W. M. and Maloney, E. D.: The moist static energy budget in NCAR CAM5 hindcasts during DYNAMO, J. Adv. Model. Earth Sy., 6, 420–440, 2014.

Haynes, J. M. and Stephens, G. L.: Tropical oceanic cloudiness and the incidence of precipitation: Early results from CloudSat, Geophys. Res. Lett., L09811, 2007.

Hillman, B. R., Marchand, R. T., and Ackerman, T. P.: Sensitivities of simulated satellite views of clouds to subgrid-scale overlap and condensate heterogeneity, J. Geophys. Res.-Atmos., 123, 7506–7529, 2018.

Homeyer, C. R. and Bowman, K. P.: Algorithm Description Document for Version 3.1 of the Three-Dimensional Gridded NEXRAD WSR-88D Radar (GridRad) Dataset, Technical Report (last access: 20 May 2019), 2017.

Houze, R. A., Wang, J., Fan, J., Brodzik, S., and Feng, Z.: Extreme convective storms over high-latitude continental areas where maximum warming is occurring, Geophys. Res. Lett., 46, 4059–4065, 2019.

Iguchi, T., Kawamoto, N., and Oki, R.: Detection of Intense Ice Precipitation with GPM/DPR, J. Atmos. Oceanic Tech., 35, 491–502, 2018.

Klein, S. A. and Jakob, C.: Validation and Sensitivities of Frontal Clouds Simulated by the ECMWF Model, Mon. Weather Rev., 127, 2514–2531, 1999.

Larson, V. E.: CLUBB-SILHS: A parameterization of subgrid variability in the atmosphere, arXiv [preprint], arXiv:1711.03675, 2017.

Lin, G., Wan, H., Zhang, K., Qian, Y., and Ghan, S. J.: Can nudging be used to quantify model sensitivities in precipitation and cloud forcing?, J. Adv. Model. Earth Sy., 8, 1073–1091, 2016.

Lin, G., Fan, J., Feng, Z., Gustafson, W. I., Ma, P.-L., and Zhang, K.: Can the multiscale modeling framework (MMF) simulate the MCS-associated precipitation over the Central United States?, J. Adv. Model. Earth Sy., 11, 4669–4686, 2019.

Lin, S.-J.: A “Vertically Lagrangian” Finite-Volume Dynamical Core for Global Models, Mon. Weather Rev., 132, 2293–2307, 2004.

Liu, X., Ma, P.-L., Wang, H., Tilmes, S., Singh, B., Easter, R. C., Ghan, S. J., and Rasch, P. J.: Description and evaluation of a new four-mode version of the Modal Aerosol Module (MAM4) within version 5.3 of the Community Atmosphere Model, Geosci. Model Dev., 9, 505–522, 2016.

Ma, P.-L., Rasch, P. J., Fast, J. D., Easter, R. C., Gustafson Jr., W. I., Liu, X., Ghan, S. J., and Singh, B.: Assessing the CAM5 physics suite in the WRF-Chem model: implementation, resolution sensitivity, and a first evaluation for a regional case study, Geosci. Model Dev., 7, 755–778, 2014.

Marchand, R., Haynes, J., Mace, G. G., Ackerman, T., and Stephens, G.: A comparison of simulated cloud radar output from the multiscale modeling framework global climate model with CloudSat cloud radar observations, J. Geophys. Res., 114, D00A20, 2009.

Matrosov, S. Y.: Radar reflectivity in snowfall, IEEE T. Geosci. Remote, 30, 454–461, 1992.

Neale, R. B., Richter, J. H., Conley, A. J., Park, S., Lauritzen, P. H., Gettelman, A., Williamson, D. L., Rasch, P. J., Vavrus, S. J., Taylor, M. A., Collins, W. D., Zhang, M., and Lin, S.-J.: Description of the NCAR Community Atmosphere Model (CAM 5.0), Tech. Note NCAR/TN-486 + STR, Natl. Cent. for Atmos. Res. (last access: 20 May 2019), 2012.

Neale, R. B., Richter, J. H., and Jochum, M.: The Impact of Convection on ENSO: From a Delayed Oscillator to a Series of Events, J. Climate, 21, 5904–5924, 2008.

Pincus, R., Hemler, R. S., and Klein, S. A.: Using Stochastically Generated Subcolumns to Represent Cloud Structure in a Large-Scale Model, Mon. Weather Rev., 134, 3644–3656, 2006.

Qian, Y., Wan, H., Yang, B., Golaz, J.-C., Harrop, B., Hou, Z., Larson, V. E., Leung, L. R., Lin, G., Lin, W., Ma, P.-L., Ma, H.-Y., Rasch, P., Singh, B., Wang, H., Xie, S., and Zhang, K.: Parametric sensitivity and uncertainty quantification in the version 1 of E3SM atmosphere model based on short perturbed parameter ensemble simulations, J. Geophys. Res.-Atmos., 123, 13046–13073, 2018.

Randall, D. A., Wood, R. A., Bony, S., Colman, R., Fichefet, T., Fyfe, J., Kattsov, V., Pitman, A., Shukla, J., Srinivasan, J., Stouffer, R. J., Sumi, A., and Taylor, K. E.: Climate models and their evaluation, in: Climate Change 2007: The Physical Science Basis. Contribution of Working Group I to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change, edited by: Solomon, S., Qin, D., Manning, M., Chen, Z., Marquis, M., Averyt, K. B., Tignor, M., and Miller, H. L., Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, 589–662, 2007.

Randel, D. L., Vonder Haar, T. H., Ringerud, M. A., Stephens, G. L., Greenwald, T. J., and Combs, C. L.: A new global water vapor dataset, B. Am. Meteorol. Soc., 77, 1233–1246, 1996.

Rasch, P. J., Xie, S., Ma, P.-L., Lin, W., Wang, H., Tang, Q., Burrows, S. M., Caldwell, P., Zhang, K., Easter, R. C., Cameron-Smith, P., Singh, B., Wan, H., Golaz, J.-C., Harrop, B. E., Roesler, E., Bacmeister, J., Larson, V. E., Evans, K. J., Qian, Y., Taylor, M., Leung, L. R., Zhang, Y., Brent, L., Branstetter, M., Hannay, C., Mahajan, S., Mametjanov, A., Neale, R., Richter, J. H., Yoon, J.-H., Zender, C. S., Bader, D., Flanner, M., Foucar, J. G., Jacob, R., Keen, N., Klein, S. A., Liu, X., Salinger, A. G., Shrivastava, M., and Yang, Y.: An Overview of the Atmospheric Component of the Energy Exascale Earth System Model, J. Adv. Model. Earth Sy., 11, 2377–2411, 2019.

Richter, J. H. and Rasch, P. J.: Effects of convective momentum transport on the atmospheric circulation in the Community Atmosphere Model, Version 3, J. Climate, 21, 1487–1499, 2008.

Song, H., Zhang, Z., Ma, P.-L., Ghan, S., and Wang, M.: The importance of considering sub-grid cloud variability when using satellite observations to evaluate the cloud and precipitation simulations in climate models, Geosci. Model Dev., 11, 3147–3158, 2018.

Stephens, G. L. and Kummerow, C. D.: The remote sensing of clouds and precipitation from space: A review, J. Atmos. Sci., 64, 3742–3765, 2007.

Sun, J., Zhang, K., Wan, H., Ma, P.-L., Tang, Q., and Zhang, S.: Impact of nudging strategy on the climate representativeness and hindcast skill of constrained EAMv1 simulations, J. Adv. Model. Earth Sy., 11, 3911–3933, 2019.

Swales, D. J., Pincus, R., and Bodas-Salcedo, A.: The Cloud Feedback Model Intercomparison Project Observational Simulator Package: Version 2, Geosci. Model Dev., 11, 77–81, 2018.

Taylor, M. A.: Conservation of mass and energy for the moist atmospheric primitive equations on unstructured grids, in: Numerical techniques for global atmospheric models, Lecture Notes Comput. Sci. Eng., edited by: Lauritzen, P. H., Barth, T. J., Griebel, M., Keyes, D. E., Nieminen, R. M., Roose, D., and Schlick, T., Vol. 80, pp. 357–380, Heidelberg, Germany: Springer, 2011.

Taylor, M. A. and Fournier, A.: A compatible and conservative spectral element method on unstructured grids, J. Comput. Phys., 229, 5879–5895, 2010.

Tian, J., Dong, X., Xi, B., Wang, J., Homeyer, C. R., McFarquhar, G. M., and Fan, J.: Retrievals of ice cloud microphysical properties of deep convective systems using radar measurements, J. Geophys. Res.-Atmos., 121, 10820–10839, 2016.

Trenberth, K. E., Jones, P. D., Ambenje, P., Bojariu, R., Easterling, D., Klein Tank, A., Parker, D., Rahimzadeh, F., Renwick, J. A., Rusticucci, M., Soden, B., and Zhai, P.: Observations: Surface and atmospheric climate change, in: Climate Change 2007: The Physical Science Basis, edited by: Solomon, S., Qin, D., Manning, M., Chen, Z., Marquis, M., Averyt, K. B., Tignor, M., and Miller, H. L., Cambridge University Press, United Kingdom and New York, NY, USA, 235–336, 2007.

Um, J., McFarquhar, G. M., Stith, J. L., Jung, C. H., Lee, S. S., Lee, J. Y., Shin, Y., Lee, Y. G., Yang, Y. I., Yum, S. S., Kim, B.-G., Cha, J. W., and Ko, A.-R.: Microphysical characteristics of frozen droplet aggregates from deep convective clouds, Atmos. Chem. Phys., 18, 16915–16930, 2018.

Wang, J.: Model results for E3SMv1 COSP simulation (last access: 5 August 2020), 2020.

Wang, J.: Model code and configuration for E3SMv1 COSP simulation, Zenodo, 2021.

Wang, J., Dong, X., and Xi, B.: Investigation of ice cloud microphysical properties of DCSs using aircraft in situ measurements during MC3E over the ARM SGP site, J. Geophys. Res.-Atmos., 120, 3533–3552, 2015.

Wang, J., Dong, X., Xi, B., and Heymsfield, A. J.: Investigation of liquid cloud microphysical properties of deep convective systems: 1. Parameterization of raindrop size distribution and its application for stratiform rain estimation, J. Geophys. Res.-Atmos., 121, 10739–10760, 2016.

Wang, J., Dong, X., and Xi, B.: Investigation of liquid cloud microphysical properties of deep convective systems: 2. Parameterization of raindrop size distribution and its application for convective rain estimation, J. Geophys. Res.-Atmos., 123, 11637–11651, 2018.

Wang, J., Dong, X., Kennedy, A., Hagenhoff, B., and Xi, B.: A Regime-Based Evaluation of Southern and Northern Great Plains Warm-Season Precipitation Events in WRF, Weather Forecast., 34, 805–831, 2019a.

Wang, J., Houze Jr., R. A., Fan, J., Brodzik, S. R., Feng, Z., and Hardin, J. C.: The detection of mesoscale convective systems by the GPM Ku-band spaceborne radar, J. Meteorol. Soc. Jpn. (Special Edition on Global Precipitation Measurement (GPM): 5th Anniversary), 97, 2019b.

Wang, M. and Zhang, G. J.: Improving the Simulation of Tropical Convective Cloud-Top Heights in CAM5 with CloudSat Observations, J. Climate, 31, 5189–5204, 2018.

Warren, R. A., Protat, A., Siems, S. T., Ramsay, H. A., Louf, V., Manton, M. J., and Kane, T. A.: Calibrating Ground-Based Radars against TRMM and GPM, J. Atmos. Oceanic Tech., 35, 323–346, 2018.

Webb, M., Senior, C., Bony, S., and Morcrette, J. J.: Combining ERBE and ISCCP data to assess clouds in the Hadley Centre, ECMWF and LMD atmospheric climate models, Clim. Dynam., 17, 905–922, 2001.

Xie, S., Lin, W., Rasch, P. J., Ma, P.-L., Neale, R., Larson, V. E., Qian, Y., Bogenschutz, P. A., Caldwell, P., Cameron-Smith, P., Golaz, J.-C., Mahajan, S., Singh, B., Tang, Q., Wang, H., Yoon, J.-H., Zhang, K., and Zhang, Y.: Understanding cloud and convective characteristics in version 1 of the E3SM atmosphere model, J. Adv. Model. Earth Sy., 10, 2618–2644, 2018.

Xie, S., Wang, Y.-C., Lin, W., Ma, H.-Y., Tang, Q., Tang, S., Zheng, X., Golaz, J.-C., Zhang, G.-J., and Zhang, M.: Improved diurnal cycle of precipitation in E3SM with a revised convective triggering function, J. Adv. Model. Earth Sy., 11, 2290–2310, 2019.

Yang, Q., Houze Jr., R. A., Leung, L. R., and Feng, Z.: Environments of long-lived mesoscale convective systems over the central United States in convection permitting climate simulations, J. Geophys. Res.-Atmos., 122, 13288–13307, 2017.

Yuter, S. E. and Houze Jr., R. A.: Three-dimensional kinematic and microphysical evolution of Florida cumulonimbus, Part II: Frequency distribution of vertical velocity, reflectivity, and differential reflectivity, Mon. Weather Rev., 123, 1941–1963, 1995.

Zhang, G. J.: Effects of entrainment on convective available potential energy and closure assumptions in convection parameterization, J. Geophys. Res., 114, D07109, 2009.

Zhang, G. J. and McFarlane, N. A.: Sensitivity of climate simulations to the parameterization of cumulus convection in the Canadian climate centre general circulation model, Atmos. Ocean, 33, 407–446, 1995.

Zhang, J., Howard, K., Langston, C., Vasiloff, S., Kaney, B., Arthur, A., Van Cooten, S., Kelleher, K., Kitzmiller, D., Ding, F., Seo, D., Wells, E., and Dempsey, C.: National Mosaic and Multi-Sensor QPE (NMQ) System: Description, Results, and Future Plans, B. Am. Meteorol. Soc., 92, 1321–1338, 2011.

Zhang, J., Howard, K., Langston, C., Kaney, B., Qi, Y., Tang, L., Grams, H., Wang, Y., Cocks, S., Martinaitis, S., Arthur, A., Cooper, K., Brogden, J., and Kitzmiller, D.: Multi-Radar Multi-Sensor (MRMS) Quantitative Precipitation Estimation: Initial Operating Capabilities, B. Am. Meteorol. Soc., 97, 621–638, 2016.

Zhang, K., Wan, H., Liu, X., Ghan, S. J., Kooperman, G. J., Ma, P.-L., Rasch, P. J., Neubauer, D., and Lohmann, U.: Technical Note: On the use of nudging for aerosol–climate model intercomparison studies, Atmos. Chem. Phys., 14, 8631–8645, 2014.

Zhang, K., Rasch, P. J., Taylor, M. A., Wan, H., Leung, R., Ma, P.-L., Golaz, J.-C., Wolfe, J., Lin, W., Singh, B., Burrows, S., Yoon, J.-H., Wang, H., Qian, Y., Tang, Q., Caldwell, P., and Xie, S.: Impact of numerical choices on water conservation in the E3SM Atmosphere Model version 1 (EAMv1), Geosci. Model Dev., 11, 1971–1988, 2018.

Zhang, Y., Klein, S. A., Boyle, J., and Mace, G. G.: Evaluation of tropical cloud and precipitation statistics of Community Atmosphere Model version 3 using CloudSat and CALIPSO data, J. Geophys. Res., 115, D12205, 2010.

Zhang, Y., Xie, S., Klein, S. A., Marchand, R., Kollias, P., Clothiaux, E. E., Lin, W., Johnson, K., Swales, D., Bodas-Salcedo, A., Tang, S., Haynes, J. M., Collis, S., Jensen, M., Bharadwaj, N., Hardin, J., and Isom, B.: The ARM Cloud Radar Simulator for Global Climate Models: Bridging Field Data and Climate Models, B. Am. Meteorol. Soc., 99, 21–26, 2018.

Zhang, Y., Xie, S., Lin, W., Klein, S. A., Zelinka, M., Ma, P.-L., Rasch, P. J., Qian, Y., Tang, Q., and Ma, H.-Y.: Evaluation of clouds in version 1 of the E3SM atmosphere model with satellite simulators, J. Adv. Model. Earth Sy., 11, 1253–1268, 2019.

Zheng, X., Golaz, J.-C., Xie, S., Tang, Q., Lin, W., Zhang, M., Ma, H.-Y., and Roesler, E. L.: The summertime precipitation bias in E3SM Atmosphere Model version 1 over the Central United States, J. Geophys. Res.-Atmos., 124, 8935–8952, 2019.


The requested paper has a corresponding corrigendum published. Please read the corrigendum first before downloading the article.

Short summary
This paper presents an evaluation of the E3SM model against NEXRAD radar observations for the warm seasons during 2014–2016. The COSP forward simulator package is implemented in the model to generate radar reflectivity, and the NEXRAD observations are coarsened to the model resolution for comparison. The model severely underestimates the reflectivity above 4 km. Sensitivity tests on the parameters from cumulus parameterization and cloud microphysics do not improve this model bias.