Articles | Volume 16, issue 13
Model description paper
13 Jul 2023
Model description paper |  | 13 Jul 2023

The fully coupled regionally refined model of E3SM version 2: overview of the atmosphere, land, and river results

Qi Tang, Jean-Christophe Golaz, Luke P. Van Roekel, Mark A. Taylor, Wuyin Lin, Benjamin R. Hillman, Paul A. Ullrich, Andrew M. Bradley, Oksana Guba, Jonathan D. Wolfe, Tian Zhou, Kai Zhang, Xue Zheng, Yunyan Zhang, Meng Zhang, Mingxuan Wu, Hailong Wang, Cheng Tao, Balwinder Singh, Alan M. Rhoades, Yi Qin, Hong-Yi Li, Yan Feng, Yuying Zhang, Chengzhu Zhang, Charles S. Zender, Shaocheng Xie, Erika L. Roesler, Andrew F. Roberts, Azamat Mametjanov, Mathew E. Maltrud, Noel D. Keen, Robert L. Jacob, Christiane Jablonowski, Owen K. Hughes, Ryan M. Forsyth, Alan V. Di Vittorio, Peter M. Caldwell, Gautam Bisht, Renata B. McCoy, L. Ruby Leung, and David C. Bader

This paper provides an overview of the United States (US) Department of Energy's (DOE's) Energy Exascale Earth System Model version 2 (E3SMv2) fully coupled regionally refined model (RRM) and documents the overall atmosphere, land, and river results from the Coupled Model Intercomparison Project 6 (CMIP6) DECK (Diagnosis, Evaluation, and Characterization of Klima) and historical simulations – a first-of-its-kind set of climate production simulations using RRM. The North American (NA) RRM (NARRM) is developed as the high-resolution configuration of E3SMv2 with the primary goal of more explicitly addressing DOE's mission needs regarding impacts to the US energy sector facing Earth system changes. The NARRM features finer horizontal resolution grids centered over NA, consisting of 25100 km atmosphere and land, a 0.125 river-routing model, and 1460 km ocean and sea ice. By design, the computational cost of NARRM is 3× of the uniform low-resolution (LR) model at 100 km but only  10 %–20 % of a globally uniform high-resolution model at 25 km.

A novel hybrid time step strategy for the atmosphere is key for NARRM to achieve improved climate simulation fidelity within the high-resolution patch without sacrificing the overall global performance. The global climate, including climatology, time series, sensitivity, and feedback, is confirmed to be largely identical between NARRM and LR as quantified with typical climate metrics. Over the refined NA area, NARRM is generally superior to LR, including for precipitation and clouds over the contiguous US (CONUS), summertime marine stratocumulus clouds off the coast of California, liquid and ice phase clouds near the North Pole region, extratropical cyclones, and spatial variability in land hydrological processes. The improvements over land are related to the better-resolved topography in NARRM, whereas those over ocean are attributable to the improved air–sea interactions with finer grids for both atmosphere and ocean and sea ice. Some features appear insensitive to the resolution change analyzed here, for instance the diurnal propagation of organized mesoscale convective systems over CONUS and the warm-season land–atmosphere coupling at the southern Great Plains. In summary, our study presents a realistically efficient approach to leverage the fully coupled RRM framework for a standard Earth system model release and high-resolution climate production simulations.

1 Introduction

Global Earth system models (ESMs) are fundamental tools for understanding the past evolution of the climate system and projecting future climate changes under various anthropogenic scenarios. High horizontal resolution simulations on climate scales have been recognized as one of the increasingly important directions of ESM development in recent years (Demory et al.2014; Haarsma et al.2016). Compared to low-resolution models, high-resolution models show superior fidelity in representing both the large-scale circulation (e.g., meridional ocean heat transport) (Griffies et al.2015) and small-scale processes (e.g., clouds and streamflow) (Haarsma et al.2016, and references therein). More importantly, simulations with enhanced horizontal resolution exhibit improved skills in capturing regional climate change signals and facilitating process-level studies, which provide a crucial basis for assessing the impacts of climate extremes with augmented societal implications. However, fine-resolution and multi-century simulations (with ensembles) are competing requirements for climate experiments due to limited computational and human resources. This conflict will likely continue to challenge the climate modeling community, as evidenced by the fact that more than 3 times (72 vs. 23) as many model sources (including different versions of the same model) have published simulations at 100 km than at 25 km nominal resolutions in the current Coupled Model Intercomparison Project 6 (CMIP6) archive (, last access: 18 August 2022). This suggests that despite the commonly recognized benefits, not many modeling centers can afford to pursue routine high-resolution climate simulations.

The Energy Exascale Earth System Model (E3SM) project (Leung et al.2020) is supported by the US Department of Energy (DOE) with a primary goal of improving actionable predictions of Earth system variability and change by leveraging advanced DOE computational resources. Scientifically, E3SM development is motivated by modeling requirements in three overarching fields (i.e., water cycle, biogeochemistry, and cryosphere) to address the most critical DOE mission-related questions, such as water availability, wildfires, heat waves, and sea level rise, which all pose challenges to the energy sector with climate change. High-resolution simulations are clearly more desirable to achieve these E3SM objectives since these processes have high spatiotemporal variability. However, uniformly increasing the grid size for climate production simulations is not an easy task even with DOE's world class high-performance computing power. For example, the 25 km simulation is at least 32 times (16× more grid cells, smaller physics time step, and smaller dynamical core time step) more expensive than the 100 km version with the E3SM version 1 (E3SMv1) model (Caldwell et al.2019), making high-resolution models much more computationally expensive not only to run but also to tune for skillful simulations. With these demands and limitations, a multiscale approach is an attractive avenue for global ESMs to deliver high-resolution production simulations over target areas at a more economical cost.

The multiresolution method (Ringler et al.2008; Leung et al.2013), also known as regionally refined model (RRM) or variable-resolution (VR) model, was proposed to alleviate the computational burden of global ESMs by refining a fraction of the globe with higher resolution while keeping (without coarsening) the remaining area at lower resolution. The RRM method is a general tool for all major ESM components, such as atmosphere, land, ocean, and sea ice. With a careful design of the RRM mesh, the high-resolution grids can better represent fine-scale processes over an area of interest at a typical cost of only  10 %–20 % of a comparable globally uniform high-resolution configuration. Compared to regional or nested climate models, global RRMs by design minimize the impacts from the lack of a two-way dynamical feedback between the refined area and the outside domain.

Recently, an increasing number of studies have successfully applied the RRM technique in global ESMs to tackle a wide range of climate research themes from climatological statistics of idealized aquaplanet (Zarzycki et al.2014) and mean climate state of more realistic simulations (Sakaguchi et al.2015, 2016; Gettelman et al.2018; Tang et al.2019) to complex terrain climate (Wu et al.2017; Rhoades et al.2018c; Rahimi et al.2019; Bambach et al.2022) and climate extremes (Huang and Ullrich2017; Rhoades et al.2020a, b; Zarzycki et al.2021; Reed et al.2022; Xu et al.2022). Others leveraged RRM to study specific aspects of climate, such as tropical cyclones (Zarzycki and Jablonowski2014, 2015; Hazelton et al.2018), marine stratocumulus (Bogenschutz et al.2023), snowpack (Rhoades et al.2016, 2017), surface energy flux (Burakowski et al.2019), Greenland surface mass balance (van Kampenhout et al.2019), irrigation impacts on regional climate (Huang and Ullrich2016), and land use and land cover change influence on land–atmosphere coupling and precipitation (Devanand et al.2020). Lately, the RRM resolution has been pushed to a new limit for watershed-scale hydrology analysis (Xu and Di Vittorio2021) and cloud-resolving scale climate simulation (Liu et al.2022).

The RRM high-resolution results are robust for most places except the Intertropical Convergence Zone (Rauscher et al.2013; Zarzycki et al.2014), covering almost all typical climate regimes such as the contiguous US (CONUS) (Gettelman et al.2018; Tang et al.2019), the western (Rhoades et al.2016; Huang et al.2016; Huang and Ullrich2017; Rhoades et al.2018c) and eastern US (Liu et al.2022), South America (Sakaguchi et al.2015, 2016; Bambach et al.2022), Asia (Sakaguchi et al.2016), East Asia (Liang et al.2021), eastern China (Xu et al.2021), the Tibetan Plateau (Rahimi et al.2019), the Maritime Continent (Harris and Lin2014), Atlantic basin (Zarzycki et al.2015), the southeastern Pacific (Bogenschutz et al.2023), Greenland (van Kampenhout et al.2019), and the Arctic (Veneziani et al.2022). Furthermore, the RRM capability in representing the general high-resolution climate seems generally acceptable for different models, including the Variable-Resolution Community Earth System Model (VR-CESM) (e.g., Gettelman et al.2018), the E3SMv1 atmospheric model (EAMv1) (Tang et al.2019), the Model for Prediction Across Scales-Atmosphere (MPAS-A) (Hagos et al.2013; Sakaguchi et al.2015, 2016; Liang et al.2021), the Geophysical Fluid Dynamics Laboratory finite-volume dynamical core on the cubed-sphere grid (Harris and Lin2013, 2014), and the ICOsahedral Non-hydrostatic Earth System Model (ICON-ESM) (Jungclaus et al.2022).

All of the aforementioned studies utilize RRMs for Atmospheric Model Intercomparison Project (AMIP-type) (Gates et al.1999) simulations. Although these studies provide valuable experience and important knowledge about RRMs, modeling centers still face the question of how to transform such AMIP-type RRM achievements from individual scientific studies emphasizing specific climate aspects to a standard global ESM release version aiming at a much broader and general scope. At a minimum, the criteria of reasonable global climate should be satisfied for the RRM to be widely adopted for global ESM releases. Most previous AMIP-type RRM studies focus on the regional results within the refined grids without paying much attention to the outside domain. While this might be acceptable for targeted studies, one cannot release a global model without reasonable global results since such a model is expected to address the challenge of a long (multi-century) spinup and demonstrate top-of-atmosphere (TOA) radiative balance in pre-industrial fully coupled simulations. In addition, some physics parameterizations (e.g., deep convection) suffer from poor scale awareness and hence require retuning as the model resolution increases (e.g., Xie et al.2018). This implies significant model calibration efforts that modeling centers have to seriously consider when planning on releasing the RRM besides the low-resolution model. Furthermore, based on our EAMv1 RRM experience, retuning does not guarantee improved global climate performance. In the present study, building upon the EAMv1 RRM (atmosphere and land area of 25100 km horizontal resolution with the 25 km mesh over the CONUS) (Tang et al.2019) plus the E3SMv2 lower resolution configuration (Golaz et al.2022), we extend the RRM configuration to ocean and sea ice (see grids in Fig. A1) as a fully coupled RRM with fine meshes centered over North America (NA). We propose an innovative RRM strategy (see details in Sect. 2.1) to meet the criteria above with a minimal retuning effort and for the first time to deliver production climate simulations using a fully coupled RRM.

This paper focuses on the atmosphere, land, and river components of the E3SMv2 North American RRM (NARRM), while a companion paper (Luke P. Van Roekel, personal communication, 2023) overviews the NARRM ocean and sea ice. This paper is organized as follows. Section 2 describes the NARRM model, our hybrid time step strategy for the atmospheric component, and key tools and tests used to create its atmospheric configuration. Section 3 summarizes the simulations performed in the present study and reports on the computational cost of the NARRM historical simulation relative to its lower-resolution (LR) counterpart. Analyses of model results start at the global scale in Sect. 4 and then shift to the high-resolution NA region in Sect. 5 for atmosphere, land and river, and land–atmosphere interactions. Conclusions and discussions are presented in Sect. 6.

2 Model description

Except for the mesh and mesh-related settings, E3SMv2 LR and NARRM essentially have the same atmosphere, land, and river components. They are upgraded from E3SMv1 and briefly described here. In the E3SMv2 atmosphere model (EAMv2), the dynamical core uses the High-Order Method Modeling Environment (HOMME) package (Dennis et al.2005, 2011; Evans et al.2013) on the spectral element grid (Taylor and Fournier2010). HOMME has been updated to use a potential temperature formulation of the equations with a more accurate pressure gradient (Taylor et al.2020; Herrington et al.2022) and a new interpolation semi-Lagrangian scheme (Islet) for passive tracer transport (Bradley et al.2022). The physics operates on a separate finite-volume grid (Hannah et al.2021), which has four-ninths as many columns as the corresponding spectral element grid (see Table 1) and hence runs about faster than it would on the spectral element grid. The physics parameterization updates include the Cloud Layers Unified By Binormals scheme (Golaz et al.2002; Larson2017) for subgrid turbulent transport and cloud macrophysics, the Zhang–McFarlane (ZM) deep convection scheme (Zhang and McFarlane1995) with a new trigger method (Xie et al.2019), gravity wave parameterizations following Richter et al. (2010) with additional modifications (Beres et al.2004; Richter et al.2019), the O3v2 package (Tang et al.2021) for the prognostic stratospheric ozone, and the four-mode version of Modal Aerosol Module (MAM4) (Liu et al.2016; Wang et al.2020) with an updated treatment of dust aerosol (Feng et al.2022). The same set of EAM physics parameters is used in the LR and NARRM simulations analyzed here. The LR grid is a quasi-uniform 1 cubed sphere grid with an average grid spacing of 100 km. The NARRM grid has an average grid spacing of  25 km over North America, transitioning to match the  100 km cubed-sphere grid over the rest of the globe. All simulations, except the idealized baroclinic wave simulations described later, utilize E3SM's standard 72 vertical levels (L72).

The E3SMv2 land model (ELMv2) runs on the same grid as the atmospheric physics. ELMv2 upgrades the prescribed vegetation distribution for better consistency between land use and changes in plant functional types across platforms and adopts the new shortwave radiation model SNICAR-AD (Dang et al.2019) for snow and ice. The land use harmonization version 2f data (LUH2;, last access: 3 July 2023) (Hurtt et al.2020) are converted into E3SMv2 plant functional types with an updated version of the land use translator (Di Vittorio et al.2014). The trajectory of land cover change has also been improved through better tracking of previous land use change. The E3SMv2 river-routing model (Model for Scale Adaptive River Transport, MOSARTv2) utilizes the regular lat–long grid (0.5 for LR and 0.125 for NARRM). MOSARTv2 uses the kinematic wave method to route the runoff from ELM into the ocean model via an eight-direction-based river network (Li et al.2013). More details about the E3SMv2 model are documented by Golaz et al. (2022).

2.1 EAM hybrid time step strategy for RRM production simulations

In previous RRM studies, including the EAMv1 CONUS RRM (Tang et al.2019), the atmospheric physics time step is often chosen to be shorter than that of the globally uniform low-resolution model to match the highest-resolution grids in the RRM. However, such treatment faces the challenge of satisfying the criteria above for the purpose of global climate production simulations. Mainly because the ZM deep convection scheme and other cloud parameterizations used by EAM are by design not scale-aware (Xie et al.2018), if the EAM in NARRM used a shorter physics time step than LR while keeping other physics parameters unchanged, the NARRM results on the unrefined portion of the mesh (covering a larger area than the refined portion) would not match the quality of the LR results and thus undermine the NARRM global performance. Furthermore, even if NARRM used the retuned high-resolution physics parameters along with the shorter physics time step, we would still have degraded global simulation quality over the LR model (see Fig. A2 for the EAMv1 results). With all these considerations, in the present study when employing RRM for climate production campaigns, we opt for a hybrid time step strategy in EAM, which is a combination of an LR physics time step and the high-resolution dynamics time steps (see Table 1). In this way, NARRM retains much of the LR global climate characteristics with possible improvements at the refined area benefiting from the high-resolution dynamics. Moreover, this approach simplifies the RRM development as it naturally avoids further tuning the RRM beyond what was done for LR. This choice also ensures that the physics behaves as similarly as possible between the LR and RRM simulations to facilitate direct comparisons of their climates.

It is worthwhile noting that the hybrid time step strategy is a practical choice before the scale-aware cloud parameterization becomes available. With the coarsened physics time step, NARRM cannot take full advantage of resolved processes (e.g., updrafts) at 25 km because the dynamics at 25 km explicitly resolve greater vertical velocities relative to those at 100 km and hence have faster dynamical timescales, which require the correspondingly shortened physics time step to match the faster-evolving instability. The time-truncation errors of the hybrid time step method are large at 25 km as quantified by a moist bubble test (Herrington et al.2019).

2.2 EAM running on unstructured meshes

In EAM, the underlying grid is always treated as fully unstructured. EAM can run on any grid that represents a tiling of the sphere with quadrilateral elements. For quasi-uniform grids, EAM relies on cubed-sphere grids since these grids are simple to construct. RRM grids are constructed by external tools as described below. Internally, the code treats all these grids identically, the only difference being the various resolution-dependent parameters. For the dynamical core, these parameters consist of the many time steps in the model (given in Table 1) and the hyperviscosity coefficient. The dynamical core time steps are chosen to ensure stability of the model. For RRM grids, these time steps are chosen to match those that would be used in a global model with the same resolution as the highest resolution contained within the RRM. For the NARRM grid used here, which includes refinement down to 25 km, we use the same time steps as would be used by a global 25 km configuration of EAM.

Table 1Column numbers and time steps of the atmosphere component used in LR and NARRM simulations.

Download Print Version | Download XLSX

For hyperviscosity, EAM relies on a resolution-aware tensor hyperviscosity formulation (Guba et al.2014) applied on each model surface. The tensor coefficients vary spatially based on the two length scales of each spectral element (derived from the eigenvalues of the reference element map). This operator has a built in scaling of Δx3 with strength controlled by a coefficient ν with units of per second. The tensor is designed to have the proper directional resolution dependence for highly distorted elements, while matching the traditional constant-coefficient hyperviscosity on square elements. In EAMv2, we use the tensor hyperviscosity operator with ν=3.4×10-8 s−1 for all grids (cubed-sphere and RRM) and at all resolutions. The only exception is the LR 1 cubed-sphere grid, where for continuity with older simulations we continue to use the constant-coefficient hyperviscosity operator with μ=1×1015 m4 s−1. For a uniform degree p spectral element grid with square elements, the tensor operator with coefficient ν is identical to a constant coefficient hyperviscosity operator with coefficient μ=p23RΔx3ν, where Δx is the element edge length divided by p and R is the radius of the sphere. In EAM, we always use p=3.

2.3 Key tools for the RRM configuration

A number of tools have been developed to streamline the workflow for EAM and ELM simulations on RRM grids. These are described as follows, in the approximate order they are employed.

  • The Spherical Quadrilateral Grid Generator (SQuadGen). Generation of the atmosphere–land mesh is performed using SQuadGen (Ullrich2022; Guba et al.2014). This tool translates a monochrome PNG image, which denotes the desired level of grid refinement on an equirectangular projection, to a mesh of refined quadrilaterals based on a cubed sphere. The use of quadrilaterals is by necessity for compatibility with the spectral element dynamical core. Transition regions are managed using “paving”, that is, using predefined patterns of quadrilaterals which enable transition between coarse-resolution and fine-resolution regions. Smoothing of the grid is performed via spring dynamics. The spectral elements of the NARRM grid produced with this procedure are shown in Fig. 1. In the spectral element method, each field is represented by polynomials up to degree 3 within each element. The resolution represented by each element (its average length divided by 3) is shown in Fig. 2.

  • TempestRemap. The TempestRemap package (Ullrich and Taylor2015; Ullrich et al.2016) is used to generate conservative, consistent, and monotone linear maps between fields stored as volume averages (i.e., updated using the finite-volume methods) and fields stored as spectral elements (i.e., as coefficients of a set of basis functions). The generated maps require the construction of an “overlap mesh”, which is the union of the source and target face; the generation of an approximate map; and subsequent projection of the approximate map onto the linear space of conservative, consistent, and (optionally) monotone maps.

  • Topography generation. To generate topography and associated surface roughness fields on the NARRM grid, we rely on the tool chain described in Lauritzen et al. (2015) combined with a topography smoothing tool included with HOMME. The use of HOMME's topography smoothing tool ensures that the smoothing is done with the same discrete Laplace operator used internally in the dynamical core.

  • NetCDF Operators (NCOs). NCOs consist of a number of command-line tools that enable manipulation of netCDF files (Zender2008). The tools include variable extraction, remapping, and spatial and temporal averaging. Provenance information is preserved within the netCDF files to enable scientific reproducibility.

Figure 1North American RRM (NARRM) grids for the atmosphere dynamical core shown in (a) a cylindrical equidistant projection and (b) an orthographic projection.

2.4 Idealized test

Before running long coupled NARRM simulations, we first evaluate the dynamical core settings for the NARRM grid using a baroclinic instability test case. This test case establishes that the dynamical core behaves as expected in an idealized setting: the time steps are stable, the model can capture high-resolution features in the high-resolution region, and the presence of the high-resolution and mesh transition regions does not negatively impact the large-scale behavior. For this evaluation, we use an extension of the dry baroclinic wave test case by Ullrich et al. (2014) with two idealized, analytically prescribed mountains (Hughes and Jablonowski2023). The latter now serve as the trigger for baroclinic instability. The addition of the two mountains generates a flow with more energy at smaller scales as compared to the original test case, especially downstream of the mountains, making this an attractive test case for studying the impacts of resolution.

For this test case, we run simulations with three different horizontal grids, LR, NARRM, and high-resolution (HR), and 30 hybrid vertical levels (L30), which are specified in Appendix B of Reed and Jablonowski (2012). The LR and NARRM grids are as described above, and we add an HR grid. The HR grid is a global 0.25 grid which matches the high-resolution region of the NARRM grid. All idealized runs use the same settings as in the full model (except L30 instead of L72), with HR and NARRM using identical time steps since they both contain regions of 0.25 resolution. All simulations utilize the EAMv2 tensor hyperviscosity tuning with ν=3.4×10-8 and Δx3 resolution scaling.

The test case is fully described in Hughes and Jablonowski (2023). We use the dry configuration and make one modification to the locations of the mountains. In particular, the center locations of the mountains are shifted longitudinally by 144 to the east in order to place the two mountains within the NARRM's high-resolution region. The new center locations are therefore 144 and 76 W. The peak height of the mountain ranges is 2000 m. Figure 2 illustrates the size and location of the mountains and the NARRM mesh resolution, while Fig. 3 shows the surface pressure at day 6 computed on the same mesh. The latter highlights the topographically generated baroclinic instability in the Northern Hemisphere.

Figure 2Contour lines of the topographic height with a peak amplitude of 2000 m overlaid on a map of the NARRM grid resolution (square root of element area). The resolution is  25 km over North America (shown in yellow), transitioning to  100 km over the rest of the globe (dark blue). The two mountains are mostly contained within the high-resolution region. In the low-resolution region, the faint outline of an inscribed cube shows the slight non-uniformness of the 1 cubed-sphere grid used in that region.


Figure 3Contours of the surface pressure at day 6 showing the topographically triggered baroclinic instability in the Northern Hemisphere as computed on the NARRM grid. The instability has yet to be triggered in the Southern Hemisphere. The mountain height contours are overlaid. The colors saturate over the mountain ranges with minimum surface pressure values around 750–780 hPa (not shown).


Figure 4 shows contour lines of the 750 hPa temperature field after 6 d on all three grids. The plots are zoomed in over the region with the most activity shown in Fig. 3. We first compare the field in the NARRM's high-resolution region with the HR result and note the remarkable agreement between the two solutions (black contour lines) in the high-resolution region (yellow color). The presence of high resolution in the NARRM simulation allows the model to capture features in that region with finer scales than can be captured by the LR simulation (as expected). Further downstream from the mountains at the right edge of the Fig. 4b, the NARRM resolution has transitioned to match the LR resolution (blue color), and the scales captured by the NARRM solution are no longer as fine as they are in the HR solution. They are somewhat dissipated and fall between the LR and HR results. Thus, the presence of the high-resolution region in the NARRM grid improves some aspects of the solution in the low-resolution region. Finally, we note that there are no visible artifacts from the distorted elements in the mesh transition region. Examination of other fields, such as vorticity (not shown), demonstrate similar results.

Figure 4Contour lines of the 750 hPa temperature field on day 6 with contour intervals of 5 C. The temperature contours are overlaid on a map colorized by grid resolution. The data is plotted over a subset of the globe containing the mountains and most of the downstream region affected by the baroclinic instability. Results are shown from the LR grid (a), NARRM (b), and HR grid (c). The NARRM grid shows the transition from high resolution (yellow,  25 km) to low resolution (blue,  100 km).


3 Simulations and computational cost

We perform a set of NARRM production simulations parallel to the LR version documented by Golaz et al. (2022) and following the same CMIP6 specifications. The LR and NARRM production simulations analyzed in the present study are summarized in Table 2. These simulations consist of the CMIP6 Diagnosis, Evaluation, and Characterization of Klima (DECK) and historical simulations (Eyring et al.2016), i.e., one pre-industrial control (piControl, 500 years), two idealized CO2 runs (1pctCO2 and abrupt-4xCO2, each 150 years), a five-member historical ensemble (historical_N, 1850–2014), and a three-member Atmospheric Model Intercomparison Project (amip) type ensemble (amip_N, 1870–2014). Initial conditions are taken from 1 January of different years of piControl, as indicated in Table 2 for 1pctCO2, abrupt-4xCO2, and historical_N simulations. The amip_N simulations are initialized from the 1870 condition of corresponding historical_N simulations.

In order to estimate the effective radiative forcing of anthropogenic aerosols in LR and NARRM configurations, we perform pairs of nudged simulations with prescribed emissions of aerosols and their precursors for the present-day (PD, year 2010) and pre-industrial (PI, year 1850) values, which are taken from the CMIP6 emission data. Table 3 lists the nudged simulations used to assess the effective radiative forcing of anthropogenic aerosols. Horizontal winds in LR and NARRM are nudged towards wind fields from their respective baseline simulations, with a relaxation timescale of 6 h. These nudged simulations are 15 months long, with the first 3 months discarded as spinup. Previous studies have shown that nudging the horizontal winds can help constrain the large-scale circulation in the model (Zhang et al.2014; Sun et al.2019; Tang et al.2019), meaning that the anthropogenic aerosol effects can be determined with relatively short simulations (K. Zhang et al.2022; S. Zhang et al.2022).

Table 2Summary of E3SMv2 LR (Golaz et al.2022) and NARRM production simulations analyzed in this study. Numbers in parentheses indicate the simulation year numbers.

Download Print Version | Download XLSX

Table 3Nudged LR and NARRM atmospheric model simulations used in this study. All simulations are performed with prescribed sea surface temperature (SST) and sea ice concentration for year 2010. Nudging data are 6-hourly model output saved from the LR and NARRM free-running simulations (middle column). Due to the model instability problem with nudging application in RRM (with a relatively long time step), we use an alternative physics–dynamics coupling approach (see option “se_ftype = 1” in Sect. 3.1 of Zhang et al.2018) for the NARRM nudged simulations. We find the impact of using different physics–dynamics coupling approaches on the global mean effective aerosol forcing estimate in LR to be small (difference <0.05 W m−2).

Download Print Version | Download XLSX

3.1 Computational performance

A sequence of performance benchmark simulations were run on the Argonne National Laboratory Chrysalis cluster. Chrysalis has 512 compute nodes. Each node has two AMD Epyc 7532 “Rome” 2.4 GHz processors. Each processor has 32 cores, for a total of 64 cores per node. Each node has 256 GB 16-channel DDR4 3200 MHz memory. The interconnect hardware is Mellanox HDR200 InfiniBand and uses the fat tree topology. The model code was compiled using Intel release 20200925 with GCC version 8.5.0 compatibility and run using OpenMPI 4.1.3 provided in the Mellanox HPC-X Software Toolkit.

The simulations are run with one MPI process per core and no OpenMP threading. Throughput values are computed using the maximum wall-clock time (minimum throughput) over all message passing interface (MPI) processes; model initialization time is excluded. A throughput data point corresponds to one simulation run for 90 d. The input/output (I/O) configuration is identical to production simulations. At the end of 90 d, a restart file is written.

Figure 5Performance of the LR and NARRM historical simulations. (a) Throughput vs. number of computer nodes. Each data point is annotated with its throughput in simulated years per day (SYPD) and computer resource configuration name. The dashed gray line shows the perfect-scaling slope. (b) Computational resource plots for the L process layouts. Each component has one rectangle. A rectangle has the area given by the product of normalized wall-clock time and number of cores, with the NARRM total time normalized to 1.0.


Figure 6Performance of the atmosphere (Atm.) and ocean components of the NARRM historical simulation. Solid lines show measured performance. Dashed lines show the performance predicted by a simple model that uses the LR simulation with the XS process layout for input data; see the text for a description of the performance model.


Figure 5 summarizes the performance of the LR and NARRM historical_N simulations for several node counts and corresponding process layouts with names T (NARRM only), XS, S, M, and L. Note that while the layout names are shared among models, the specific layout associated with a name differs among models. Each simulation's data point is annotated with its throughput in simulated years per day (SYPD) and process layout name. The highest throughput of the LR simulations is 39.81 SYPD. In Golaz et al. (2022, Fig. 2), the highest throughput is 41.89 for the same node count; historical_N simulations have additional forcings to compute relative to the piControl simulation used in Golaz et al. (2022, Fig. 2). The LR throughput falls off from the perfect scaling slope faster than the NARRM throughput because the LR simulation has less work per node. For the L process layouts, accounting for 105 vs. 100 nodes, the throughput factor difference is 3.14.

Figure 5b shows the wall-clock-time–resource product for each component for the L layouts. A rectangle's width is proportional to the number of cores the component uses; its height is proportional to the wall-clock time to simulate a fixed simulation period, with the time normalized so that the NARRM simulation has a total time of 1.0. The atmosphere (ATM), sea ice (ICE), coupler (CPL), land (LND), and river runoff (ROF; LND and ROF are too small to label) components run on one set of nodes, while the ocean (OCN) component runs on another set. An unfilled rectangle having “LR” or “NARRM” at the top-right corner shows the total product. Because there is no global communication barrier between components run in sequence, the time value of each component is approximate, and thus the filled rectangles do not sum to the total time.

We can understand the NARRM component-level performance as a function of spatial and temporal discretization parameters and one LR simulation to calibrate throughput. The LR calibration simulation should reflect that RRM simulations have a large amount of work per node; thus, we use the LR simulation run with the XS process layout, the left-most LR point in Fig. 5a. We focus on the two most expensive components, the atmosphere and ocean. We start with the ocean, whose performance is simpler to model. For simplicity, we write the formulas in terms of wall-clock time (w.c.t.) for a fixed simulation length, e.g., 90 d. The input measured datum is the top-level ocean component (ocn) wall-clock time in the LR simulation run with the XS process layout, w.c.t.LRocn. The input parameters are the number of computer cores (ncore) used in the LR (XS) and RRM (variable) simulations, the number of cells (ncell) in each grid, and the time steps (Δt) in each simulation. For a fixed simulation length, the predicted ocean component RRM performance is then

(1) w . c . t . RRM ocn = ( n core ) LR ocn ( n core ) RRM ocn ( n cell ) RRM ocn ( n cell ) LR ocn ( Δ t ) LR ocn ( Δ t ) RRM ocn w . c . t . LR ocn .

The performance model for the atmosphere is more complicated because it has two important time steps, one each for the dynamical core (dynamics) and the column parameterizations (physics). Thus, the factor accounting for model time steps is broken into two terms, one each for the physics and dynamics. The predicted atmosphere component RRM performance is then

(2) w . c . t . RRM atm = ( n core ) LR atm ( n core ) RRM atm ( n cell ) RRM atm ( n cell ) LR atm ( Δ t physics ) LR atm ( Δ t physics ) RRM atm w . c . t . LR atmphysics + ( Δ t dynamics ) LR atm ( Δ t dynamics ) RRM atm w . c . t . LR atmdynamics .

Figure 6 shows the results of these models, where wall-clock time and simulation length have been converted to throughput (SYPD). The solid lines show the measured throughput of each component as a function of number of computer cores. The dashed lines show the corresponding throughput values predicted by Eqs. (1) and (2). The single LR XS layout ocean throughput value is used as the reference for the ocean, and the single LR XS layout atmosphere throughput value is similarly used as the reference for the atmosphere; these are the only measured data inputs to the performance models. The primary error in the performance model is not accounting for a fall-off in scaling at large core counts. Because this fall-off is small for the atmosphere and ocean components, these simple performance models are accurate and can be used to predict the cost of other model configurations. For example, a uniform high-resolution atmosphere model would use the ne120pg2 grid, which has 6⋅1202 elements. Using the same time steps and number of vertical levels as in the NARRM configuration, which has 14 454 elements, for fixed computational resources, the high-resolution atmosphere configuration's throughput would be 61202/14454=5.98 times smaller than the NARRM configuration's throughput, where this factor is the quotient of the numbers of elements in each of the two grids.

Figure 7Comparison of the global spatial RMSE of model climatology (annual and seasonal averages of years 1985–2014) vs. observations with the E3SM Diags package (C. Zhang et al.2022). The model results are from the first historical member of E3SMv2 (0101), LR (blue triangles), and NARRM (red triangles) and 52 CMIP6 models (r1i1p1f1). The boxes and whiskers show the 25th percentile, 75th percentile, and minimum and maximum RMSE of the CMIP6 ensemble. Quantities include (a) TOA net radiation flux, (b, c) TOA SW and LW cloud radiative effects, (d) precipitation, (e) surface air temperature over land, (f) sea level pressure, (g, h) 200 and 850 hPa zonal wind, and (i) 500 hPa geopotential height. TOA is the top of the atmosphere, SW is shortwave, CRE is cloud radiative effects, LW is longwave, ANN is annual, DJF is December–February, MAM is March–April, JJA is June–August, SON is September–November, and RMSE is root-mean-square error. The climatology of the observations and reanalysis data are calculated from CERES-EBAF Ed4.1 (Loeb et al.2018) (2001–2014) for (a), (b), and (c); GPCP2.3 (Adler et al.2018) (1985–2014) for (d); and ERA5 (Hersbach et al.2020) (1985–2014) for (e), (f), (g), and (h).


Figure 8Surface geopotential height of (a) LR and (b) NARRM over North America.

Figure 9Time series of global annual mean surface air temperature anomalies from the ensemble mean of LR (blue) and NARRM (red) historical runs and observational datasets (gray) (National Oceanic and Atmospheric Administration (NOAA) National Climatic Data Center (NCDC), National Aeronautics and Space Administration (NASA) GISTEMP, and HadCRUT4). The model ensemble minimum–maximum ranges are shaded, while the observational minimum and maximum numbers are labeled in the parentheses of legend.


Figure 10Comparison of climate sensitivities between LR (a, c) and NARRM (b, c) derived from idealized CO2 forcing simulations. (a, b) time series of global annual mean surface air temperature anomaly from the following simulations, abrupt-4xCO2 (red), 1pctCO2 (blue), and the control (piControl; green). The transient climate response (TCR) is computed as a 20-year average around the time of CO2 doubling (year 70). (c, d) Gregory regression plots. The estimated effective climate sensitivity (ECS) and effective CO2 radiative forcing (F) are as labeled.


4 Global climate

As described above, the RRM model is expected to simulate a global climate similar to the LR model for production simulation campaigns since most areas are still covered by the same LR grids. In this section, we will examine whether this is the case for the global mean climate, climate sensitivity, and climate feedback.

For the global climatology, we focus on the last 3 decades (years 1985–2014) of historical simulations when more observational datasets are available. Figure 7 provides an overall comparison of the global mean climate among LR (blue triangles), NARRM (red triangles), and CMIP6 (boxes and whiskers) models as quantified by the uncentered spatial root-mean-square error (RMSE) relative to the observations or reanalysis data. The RMSE numbers are calculated with the E3SM Diags package (C. Zhang et al.2022) for the first historical member (0101 for LR and NARRM, r1i1p1f1 for CMIP6 models). Figure 7 clearly shows that NARRM and LR simulate very similar annual and seasonal averages. NARRM outperforms LR in the June–July–August (JJA) shortwave (SW) cloud radiative effect (CRE) partly because it better represents low clouds in NA (see Fig. 13 for the example in California). NARRM also simulates slightly better December–January–February (DJF) precipitation compared to LR, partly due to its improved topography (Fig. 8) and orographic precipitation in NA (see Fig. 12b, d, f). For other times (e.g., annual mean, Fig. A3) NARRM and LR precipitation results are very similar. On the other hand, NARRM does not perform as well as LR for some other fields, such as the 200 hPa zonal wind in JJA and September–October–November (SON), which are associated with the increased positive biases in the tropical western Pacific and Amazon (not shown).

Figure 9 compares the long time series (years 1850–2014) of global annual average anomalies in the surface air temperature from the ensemble means of LR and NARRM historical simulations and observational datasets (National Oceanic and Atmospheric Administration National Climatic Data Center (Smith et al.2008; Zhang et al.2015), National Aeronautics and Space Administration GISTEMP (GISTEMP Team2018; Hansen et al.2010), and HadCRUT4 (Morice et al.2012)). Over the whole period, NARRM tracks LR closely, including good agreement with observations until the 1930s and low biases afterwards, which are mainly attributed to too strong aerosol-related forcing and feedback (see Golaz et al.2022, for details). This is further confirmed by the fact that the global mean effective radiative forcing of anthropogenic aerosols in NARRM and LR are very similar (1.415 W m−2 vs. 1.421 W m−2), as quantified by a pair of nudged simulations (see Sect. 5.1.2).

Following the CMIP6 DECK protocol (Eyring et al.2016), we quantify the climate sensitivity and feedback with the abrupt quadrupling of CO2 (abrupt-4xCO2) and the transient climate response (TCR) with a simulation forced by a 1 % yr−1 CO2 increase (1pctCO2) relative to the pre-industrial control simulation (piControl). The equilibrium climate sensitivity (ECS) is estimated with the linear regression of TOA radiation change against surface temperature change in a 150-year abrupt-4xCO2 simulation (Gregory et al.2004). The 2xCO2 effective radiative forcing (ERF) is computed as the y intercept of the Gregory plot divided by two, which measures the energy imbalance caused by doubling the atmospheric CO2 concentration while keeping the surface temperature unchanged. TCR, which measures the response on shorter timescales, is derived based on its definition – the average surface temperature change in the 20-year period when the CO2 concentration doubles from a 1pctCO2 experiment.

Figure 10 depicts the annual mean surface temperature change as a function of time and the Gregory plots from the idealized CO2 experiments with LR and NARRM. The differences in climate sensitivity between LR and NARRM are very subtle as quantified by both ECS (4.00 K vs. 3.94 K) and TCR (2.41 K vs. 2.44 K).

The regression slope in the Gregory plot (Fig. 10c, d) denotes the total radiative feedback caused by the quadrupled CO2 concentration. We further apply the radiative kernel method (Soden et al.2008; Held and Shell2012) to decompose the total radiative feedback into non-cloud and cloud feedbacks. The cloud feedback is estimated by adjusting the cloud radiative effect anomalies for non-cloud influences. Overall, NARRM shows a slightly larger ERF (3.22 W m−2 vs. 2.98 W m−2), which accompanied with the similar ECS produces a stronger negative total climate feedback in NARRM. The total climate feedback is −0.74 W m−2 K−1 for LR and −0.82 W m−2 K−1 for NARRM, which mainly relates to the slightly weaker positive SW cloud feedback in NARRM than in LR (see Fig. 22).

In summary, the results in this section confirm that NARRM with the hybrid time step methodology simulates largely identical global climate as its corresponding LR configuration and hence satisfies the necessary requirement (i.e., good global climate) of global RRM production simulations we proposed in the introduction.

5 North American results

In this section, we will zoom in over the refined region over North America (NA) and emphasize climate aspects most relevant to the E3SM water cycle scientific goals (Leung et al.2020) as well as some weaknesses in LR revealed by Golaz et al. (2022). The results will be described for the atmosphere, land, and river models, respectively. Moreover, we will analyze interactions between different components (i.e., land–atmosphere coupling) because these interactions are also expected to change with the resolution increase.

5.1 Atmosphere

5.1.1 Hydrology over the CONUS

First, we look at the overall atmospheric results over the CONUS (20–50 N, 65–125 W) by comparing the spatial RMSEs of the historical ensemble means between LR (blue triangles) and NARRM (red triangles) in Fig. 11. The same metric is used in Fig. 7 for the global results, but we adjust the variables to be more relevant to the CONUS. NARRM generally produces better (as quantified by smaller RMSE numbers) results than LR for these annual and seasonal climatologies, such as SW CRE (Fig. 11a), precipitation (Fig. 11c), and 200 hPa zonal wind (Fig. 11f). Because we have not retuned the physics of NARRM, some deteriorations are expected, for example longwave (LW) CRE in DJF and March–April–May (MAM) (Fig. 11b).

Precipitation and clouds, which are obviously important quantities for the water cycle, are largely improved in NARRM compared to LR. The precipitation patterns are better captured by NARRM as observed at the Sierra Madre Occidental in JJA (Fig. 12 left column) and in the western US in DJF (Fig. 12 right column) due to the better-resolved topography in NARRM (see Fig. 8). The poor representation of marine stratocumulus clouds is a long-standing problem that plagues many ESMs (e.g., Bogenschutz et al.2023). The underestimation of summertime low clouds (manifested as the excessive TOA shortwave CRE) in the California stratocumulus region is substantially improved with NARRM (Fig. 13). The improvement in this bias is likely due to a reduced bias in the simulated sea surface temperature (SST) in the coupled RRM. To constrain the impact of the RRM on the SST bias, we conduct two additional experiments: one where the LR ocean is coupled to the RRM atmosphere, and another where the RRM ocean is coupled to the LR atmosphere. The simulated SST bias averaged over years 51–100 of the piControl is shown in Fig. 14. Comparison of Fig. 14a and b shows a clear reduction in bias for the NARRM simulation. Figure 14c and d illustrate that the regional refinement in the atmosphere (c) is primarily responsible for the reduction in SST bias; however, comparing to the NARRM result, we see that the bias is further reduced when regional refinement is included in both components, highlighting the advantage of coupled RRM over a single-component RRM. Further details will be described in a future paper by Van Roekel et al. (Luke P. Van Roekel, personal communication, 2023).

Another well-known issue of global ESMs is the poorly captured diurnal propagation of organized mesoscale convective systems (MCSs). Over CONUS, such MCSs originate from the front range of the Rockies in the afternoon and propagate eastward, manifesting as a nocturnal precipitation peak in the central US and contributing as much as half of the summertime rainfall in that region (Riley et al.1987; Jiang et al.2006). Both LR and NARRM simulate the summertime nocturnal rain peak in the central US (Fig. 15) because of the new convective trigger method for deep convection (Xie et al.2019). However, the magnitude is weaker and the area of nocturnal peak extends much larger (almost the whole eastern half of the US) than the observation, which could be caused by remaining propagation or convective trigger deficiencies. Nevertheless, NARRM reduces the underestimation of maximum diurnal cycle magnitude with an 80 % greater value (5.61 mm d−1 vs. 3.10 mm d−1) than LR on the same 100 km grids. The NARRM maximum can be as high as 6.71 mm d−1 on the 25 km grids but still biases low compared to the observation (10.89 mm d−1). This result suggests that a resolution of  25 km is not adequate to capture the physics driving propagating MCSs, which probably require convection-permitting atmospheric simulations to achieve a good agreement with observations (Caldwell et al.2021).

Figure 11The same as Fig. 7 but contrasting LR and NARRM historical ensemble means at the refined CONUS area (20–50 N, 65–125 W). Note that variables shown are adjusted to be more appropriate for the CONUS. (a, b) TOA SW and LW cloud radiative effects, (c) precipitation, (d) total precipitable water, (e) surface air temperature, (f, g) 200 and 850 hPa zonal wind, and (h, i) 500 hPa geopotential height and vertical velocity (pressure). The same reference climatology data are used for the variables also shown in Fig. 7, whereas ERA5 (Hersbach et al.2020) (1985–2014) is used for (d), (e), and (i).


Figure 12Comparison of CONUS JJA (a, c, e) and DJF (b, d, f) precipitation geographic patterns from ERA5 reanalysis (a, b), LR (c, d), and NARRM (e, f) historical ensemble means.

Figure 13Mean TOA shortwave cloud radiative effects at California in JJA of (a) observations (CERES-EBAF Ed4.1), (b) LR (H1-5) minus observation, and (c) NARRM (H1-5) minus observation.


Figure 14Sea surface temperature (SST) bias (model–observations) simulated by four configurations of E3SMv2 (a) NARRM, (b) LR, and (c) RRM atmosphere coupled to LR ocean and (d) LR atmosphere coupled to RRM ocean. The data are averaged over years 51–100 of the respective piControl simulations.

Figure 15Mean diurnal phase (local time, colors) and magnitude (color density) of the maximum precipitation in JJA calculated from the first harmonic of 3-hourly total precipitation (mm d−1) for (a) Tropical Rainfall Measuring Mission (TRMM) observations (Huffman et al.2007), (b) LR (H1-5), (c) NARRM (H1-5) regridded to the same 100 km grids as (b), and (d) NARRM (H1-5) on 25 km grids.

5.1.2 Aerosols

The E3SMv2 LR model (Golaz et al.2022) simulates too strong aerosol-related forcing, which has been identified as the primary cause of the underestimated warming in the later portion of historical period in Fig. 9. We will examine here if the NARRM configuration helps bring down the biases in aerosols and anthropogenic forcing by better resolving the meteorological and climate fields. In addition, since NARRM employs the hybrid time step approach that eliminates retuning the scale-dependent aerosol parameters used in LR, e.g., the global scaling factor used to constrain the total emission fluxes of natural aerosols (dust and sea salt), which depend non-linearly on the model-resolved small-scale surface winds, we will also discuss the impact of increasing model horizontal resolution on the natural aerosols and total aerosol optical depth (AOD) in NARRM.

  1. Impact on anthropogenic aerosols.

    Aerosols in the NARRM configuration are represented in the same manner as in the LR with the enhanced MAM4 (Wang et al.2020) and improved dust aerosol properties (Feng et al.2022). Anthropogenic and wildfire emissions used in the LR and NARRM experiments are also from the same input datasets. However, cloud microphysical processes, horizontal advection, and convection can affect aerosol loading within the NARRM high-resolution domain if wet deposition and/or transport are substantially different from the LR configuration (Caldwell et al.2019). We first compare modeled surface mass concentrations of SO2, sulfate, black carbon, and organic carbon between the ensemble means of LR and NARRM historical simulations, and evaluate them against ground-based observations from the Clean Air Status and Trends Network (CASTNET) and the Interagency Monitoring of Protected Visual Environments (IMPROVE). The results are shown in Fig. 16. In general, both sets of simulations show a strong correlation with measurements, and biases are very similar between the two for the anthropogenic aerosol species. NARRM simulations slightly increase (less than 7 %) the concentrations of the four species shown here, compared to LR simulations, which may result from less wet removal or vertical transport in the refined mesh.

  2. Impact on natural aerosols.

    In addition to aerosol removal, emissions of natural aerosols such as dust and sea salt are highly sensitive to resolution changes due to their strong dependence on the resolved surface wind speeds in the model. Increasing model horizontal resolution normally requires retuning the dust and sea salt aerosol emission factors. For E3SMv1, Feng et al. (2022) showed that without retuning, an increase in the horizontal resolution by a factor of 4 (i.e., from  100 km in LR to  25 km in HR) results in about 29 % increase in global dust emissions and an even larger increase in dust AOD of 42 % due to the combined effects from the weakened removal. In contrast, as shown in Table 4, NARRM historical runs simulate nearly the same global mean AODs as LR for all the aerosol species including dust and sea salt, without changing their emission factors. Over the regionally refined CONUS, the mean dust and sea salt AODs are slightly increased (< 5 %). This suggests that NARRM largely retains the performance of LR for the aerosol simulations on the global and regional mean basis without requiring additional retuning of the scale-dependent emission factors.

  3. Aerosol spatial variability and extremes.

    On the other hand, NARRM shows improvement over LR in representing aerosol spatial variability and extreme values over the refined mesh region. Figure 17 compares the simulated AOD (550 nm) distributions between LR (0101) and NARRM (0101) historical simulations for the present-day time period of 2000–2014. While both depict a similar general geographical pattern, e.g., higher AODs over the more polluted eastern US than the western part of the country, NARRM captures greater and finer detail in spatial variability than LR, e.g., over the mountainous areas along the Rockies, Sierra Nevada, and Appalachians. The better-resolved AOD variability in NARRM results from the spatial refinement of the resolution-dependent aerosol emission fluxes (natural species), transport, and removal, as discussed above. Compared to the ground-based AOD measurements at the 37 AERONET (Holben et al.1998) sites (2006–2015), NARRM shows stronger spatial correlation with the observations than LR (Fig. 17c). Both configurations overestimate the mean AOD averaged over the AERONET sites, possibly linked to the weak wet removal in E3SMv2 (Golaz et al.2022).

    In addition to the improved spatial variability, higher resolution in NARRM also leads to more frequent occurrences of large AOD predictions over CONUS than the LR model, especially over the regions dominated by wind-driven dust or sea salt aerosols. Figure 18 shows an example of the calculated probability density function (PDF) for dust AOD over the major dust source region in the US (32–42 N, 118–108 W; indicated by the 10×10 box in Fig. 17b), from both the LR (0101) and NARRM (0101) simulations in 2000–2014, which are remapped to the same 0.25 grid resolution. It is worth noting that the remapping of the LR results to the finer resolution leads to little improvement in the resolved spatial variability in dust AOD. Clearly, NARRM predicts more occurrences of high dust AOD over this region than LR, e.g., 22 % of the dust AODs predicted by NARRM exceed 0.015, which is the top 98th percentile of the LR model predictions remapped to the same resolution. This suggests that LR may significantly underestimate the occurrences of large dust outbreaks in the southwestern US region relative to NARRM due to the unresolved surface winds for dust mobilization in the model. Similarly, NARRM would be more suitable for urban climate or air quality studies for capturing the extremely polluted cases occurring at finer spatial or temporal scales.

  4. Effective radiative forcing of anthropogenic aerosols.

    Figure 19 shows the effective radiative forcing of anthropogenic aerosols (ΔF) over CONUS and adjacent ocean areas estimated using nudged LR and NARRM simulations. ΔF is overall negative in both LR and NARRM and dominated by the shortwave component (ΔFSW in Figure 19b, e). The regional mean ΔFSW and ΔFLW are both slightly stronger (more negative for ΔFSW and more positive for ΔFLW) in NARRM compared to LR. Over the Pacific Ocean near 20 N and 120 W, ΔFSW and ΔF in NARRM are much stronger than in LR. This is mainly caused by larger low cloud fraction simulated in NARRM (see Appendix Fig. A4), which causes a larger contrast in droplet number concentration and liquid water path between the PD and PI simulations compared to LR (Fig. A5).

Figure 16Scatter plots of modeled annual mean surface concentrations of (a) SO2, (b) sulfate, (c) black carbon, and (d) organic carbon (POM+SOA) from LR (H1-5) and NARRM (H1-5) compared to observations at CASTNET and IMPROVE network surface sites during 2005–2014. The numbers are mean concentration and correlation coefficient (R) for data at the individual sites.


Table 4Comparison of simulated annual mean AOD (550 nm) between LR (H1-5) and NARRM (H1-5) for the time period of 1985–2014. POM stands for particulate organic matter, BC stands for black carbon, and SOA stands for secondary organic aerosol.

Download Print Version | Download XLSX

Figure 17Aerosol optical depth (AOD) at 550 nm from (a) LR (0101) and (b) NARRM (0101) historical simulations averaged over 2000–2014. Panel (c) shows the AOD comparison of the two model simulations with the AERONET observations during 2006–2015. The site locations of AERONET are denoted by the gray dots in panel (a). The gray box in panel (b) denotes the dust region referenced in Fig. 18.

Figure 18Calculated probability density function (PDF) of the dust AOD predictions from LR (0101) and NARRM (0101) between 2000–2014 remapped to the same 0.25  grid resolution, over the major dust source region in the US (32–42 N, 118–108 W; indicated by the 10×10 box in Fig. 17b).


Figure 19Anthropogenic aerosol effects simulated by the nudged LR and NARRM simulations.

5.1.3 Cloud and cloud feedback

Here we examine the impact of increased horizontal resolution over NA on the simulated clouds and their radiative effects with the LR and NARRM historical simulations and on cloud feedback changes with the quadrupling 4xCO2 simulations.

Figure 20 compares the cloud cover between the E3SM CALIPSO (Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation) simulator output and the GCM-Oriented CALIPSO Cloud Product (CALIPSO-GOCCP) (Zhang et al.2023). Cloud cover and cloud thermodynamic phase are diagnosed with the same algorithm in the CALIPSO simulator and CALIPSO-GOCCP data, facilitating consistent model–observation comparisons (Chepfer et al.2008, 2010; Cesana and Chepfer2013). Observed total cloud cover is larger over the NA polar region than the CONUS in the CALIPSO-GOCCP data. Total cloud cover is larger than 60 % over the eastern Pacific Ocean, northern Atlantic Ocean, and Arctic Ocean. Strong land–ocean contrast is observed – liquid phase clouds dominate over the ocean, while ice phase clouds prevail over the land in areas such as Greenland and CONUS. Compared to CALIPSO-GOCCP, LR overestimates total cloud cover at NA high latitudes and the western CONUS and underestimates it in Greenland, particularly over Baffin Bay and near the Greenland coast. The underestimated cloud cover over the western coast of the CONUS is also notable, which is consistent with the previous discussion on Fig. 13. The excessive modeled total cloud cover is primarily attributed to the positive biases in the liquid cloud over the polar region. The positive biases of ice cloud cover contributes to the biases over the mountainous regions in the western NA. On the other hand, ice clouds are underestimated over Greenland and northern Canada.

Over land, NARRM displays improvements relative to LR in western NA and the Arctic for both cloud phases. For instance, the ice cloud biases are significantly reduced from Alaska to the western CONUS (Fig. 20f, i), and the liquid cloud deficiencies over Alaska and Greenland are generally decreased (Fig. 20e, h). The better represented topography in NARRM (Fig. 8) is probably the key factor of these NARRM improvements. The impact of increased horizontal resolution on simulated cloud phase is also noted in the E3SMv1 model with the CALIPSO simulator (Zhang et al.2019), where increased horizontal resolution also slightly decreases simulated liquid and ice clouds at temperatures warmer than −40C in the Arctic region.

Over ocean, NARRM substantially improves the stratocumulus clouds to the west of coastal regions. NARRM also moderately outperforms LR in representing liquid clouds over the North Atlantic to the west of Greenland. This is related to the warmer ( 1.5 C) NARRM surface air temperature over the Labrador Sea. This warmer NARRM surface air temperature is consistent with the decreased sea ice concentration in that region (not shown), which is somewhat expected as an advantage of refining grids for both atmosphere and ocean and sea ice (see Fig. A1). Further process-level analysis is necessary to fully understand this LR-NARRM model behavior change and will be reported in separate papers.

Figure 21 compares the simulated SW and LW CRE with the CERES-EBAF Ed4.1 observations. Large negative SW CRE biases and positive LW CRE biases are shown over the Arctic land area and western coast of NA (i.e., Alaska to Oregon) in LR, which mainly result from the overestimated cloud cover. These biases are substantially reduced in NARRM, primarily owing to the improved cloud cover (Fig. 20). Given the reduced negative bias of marine stratocumulus clouds in NARRM near the western coasts of the CONUS, simulated SW CRE is also largely improved. As discussed by Golaz et al. (2022), sea ice concentration is largely overestimated over the North Atlantic Ocean in LR. The too large sea ice extent leads to weaker SW and LW CRE than observed in the Labrador Sea. This is primarily because of the brighter and colder sea ice surface in LR that reflects more SW radiative fluxes and emits less LW radiative fluxes than the observations under clear-sky conditions (not shown). Compared to LR, the maximum positive bias in NARRM sea ice extent in Labrador Sea is greatly alleviated. The better simulated sea ice extent reduces the biases of overly reflective clear-sky SW radiation and the insufficient clear-sky outgoing LW radiation. With generally comparable all-sky SW and LW radiative fluxes between LR and NARRM, those reduced clear-sky biases thus lead to a better CRE in NARRM over Labrador Sea.

Given the improved historical cloud cover and cloud radiative effects over NA, we further examine the regional climate feedbacks over this region in Fig. 22. Relative to the global mean value, the total climate feedback over NA is more negative from NARRM than from LR. This mainly results from the more negative Planck feedback and less positive SW cloud feedbacks. The more negative Planck feedback is related to the stronger surface warming over the northeastern Pacific (not shown).

Figure 23 shows the spatial distribution of cloud feedback of LR and NARRM in the NA region. Due to the different cloud types over land and ocean, we report their regional averages separately. Notably, the total land cloud feedback is 0.41 W m−2 K−1 smaller in NARRM (0.26 W m−2 K−1) than in LR (0.67 W m−2 K−1), which is dominated by the reduced SW cloud feedback over the northeastern US (Fig. 23e). Further examination indicates this reduction is mainly related to the weaker reduction in low cloud cover under warming in NARRM. Figure 20g shows that the overestimated cloud cover is slightly alleviated over the Arctic land region in NARRM, implying that the lower mean state cloud cover might contribute to a weaker cloud reduction under warming there. Over ocean, NARRM presents a stronger SW cloud feedback and a weaker LW cloud feedback over the marine low cloud regime, leading to a small change in total cloud feedback. Across the CSS/WGNE Pacific Cross-Section Intercomparison (GPCI) transect (Teixeira et al.2011), NARRM tends to show a weaker positive cloud feedback near the coast and more positive cloud feedback off the coast. These factors suggest that the regional refinement can significantly affect the regional cloud responses under warming and the predictability of regional climate.

Figure 20Spatial distribution of annual mean total cloud cover (a) and cloud cover in liquid phase (b) and ice phase (c) from the CALIPSO-GOCCP data. The cloud cover biases in LR (H1-5) and NARRM (H1-5) historical simulations (1985–2014) are shown in (d)(f) and (g)(i), respectively. Simulated cloud cover and cloud thermodynamic phase are derived by the CALIPSO simulator. Climatology data of CALIPSO-GOCCP version 3.1.2 (Chepfer et al.2010) from 2006–2018 are used in the model evaluation.

Figure 21Spatial distribution of the observed shortwave cloud radiative effect (a), the longwave cloud radiative effect (b), and the simulated cloud radiative effect biases in LR (H1-5) (c, d) and NARRM (H1-5) (e, f) historical simulations (1985–2014). The observed cloud radiative effect is from the CERES-EBAF Ed4.1.

Figure 22Mean global and NA climate feedbacks of LR and NARRM decomposed using radiative kernels (e.g., Soden et al.2008; Held and Shell2012).


Figure 23Spatial distribution of total (a, d, g), SW (b, e, h), and LW (c, f, i) North American cloud feedbacks for LR (a–c), NARRM (d–f), and the difference between NARRM and LR (g–i). The GPCI transect is denoted by the dashed black line. The average values over land and ocean are labeled in the brackets.

5.1.4 Extratropical cyclone

One of the primary motivations for pushing climate simulation resolution is to potentially better capture extremes. Extratropical cyclones (ETCs) are a major weather extreme phenomenon at middle and high latitudes, bringing with them strong winds and precipitation that can exert substantial societal impacts along their pathways over days and over hundreds to thousands of kilometers. Climate changes are likely to induce changes to the dynamical and physical characteristics of ETCs as well as their geospatial distribution (e.g., Bengtsson et al.2006; Ulbrich et al.2009). Projections of such future changes rely heavily on numerical climate and ESM models. Their skills in simulating major weather systems like ETC have been carefully scrutinized by modeling centers and by the climate science community in conjunction with the major intercomparison campaigns (Greeves et al.2007; Chang2013). While the conventional climate models with grid resolution around 100 km show reasonable skill in producing ETC frequency and spatial track density, it has also been found that higher-resolution models are better capable of capturing more intense ETCs (e.g., Jung et al.2006), which is critical for using ESM to project future climates, as growing evidence shows that global warming tends to shift the weather spectrum to the more extreme end (Melillo et al.2014). Here, we will demonstrate the benefits of higher resolution in simulating ETCs in a regionally refined setting by comparing the results from NARRM with those from LR simulations against the ETC activities derived from the ERA5 reanalysis.

The ETC tracks and statistics can be obtained using automated identification and tracking algorithms (e.g., Blender and Schubert2000; Geng and Sugi2001; Bengtsson et al.2006; Jung et al.2006; Ullrich and Zarzycki2017; Ullrich et al.2021). The objective identification and tracking also make it suitable to compare ETC activities and statistics derived from different data sources in particular for model evaluations. The algorithms usually identify and track the spatial features of a meteorological variable, such as mean sea level pressure (MSLP) or 850 hPa vorticity, that can characterize the structure of cyclones and their movements. In this work, we use a community feature detection and tracking framework, TempestExtremes (Ullrich and Zarzycki2017; Ullrich et al.2021), to derive ETC activities from 6-hourly MSLP data during the period of 1985–2014 from the E3SM simulations and the ERA5 reanalysis. Considering higher-resolution data can more accurately identify the storms and their tracks (Blender and Schubert2000; Geng and Sugi2001), all the model and reanalysis data are placed on 1×1 grids to feed the tracking software. The algorithm takes two steps. First, a candidate cyclone is detected when a minimum MSLP feature is enclosed by a contour of 200 Pa interval within 6 of the center. Candidates within 6 of one another are merged, with the lower center pressure taking precedence. The candidates are then stitched together to define the tracks if the features persist for at least 60 h with a maximum gap of at most 18 h. From the start to the end, a candidate cyclone must travel at least 12 great circle distance to qualify as an ETC.

Over the NARRM high-resolution domain, ETCs are most active during winter. Figure 24 shows the mean DJF track density for the models and the analysis derived by casting the computed ETC track data onto 5×5 grids. The tracks are mostly concentrated over the northeastern Pacific and northwestern Atlantic that form the well-known storm tracks. Both LR and NARRM simulations capture these main features to a large extent. There are, however, notable differences between LR and NARRM over these oceanic storm tracks. The track densities are clearly underestimated in the LR simulations inside the refined region, except for the Atlantic storm track in the coupled mode. NARRM clearly produces higher ETC track density than LR does for both sections of the oceanic storm tracks and mostly agrees better with the ERA5 data, although in the coupled mode the track density tends to be overestimated. The shape and orientation of both the Pacific and Atlantic storm tracks are much better produced by NARRM in the coupled mode. Given the significant differences from both coupled LR (Fig. 24d) and uncoupled NARRM (Fig. 24c), it is reasonable to believe that the better captured storm track shapes in the coupled NARRM are due to interactions with the refined ocean (Fig. A1). Several secondary centers of active ETCs in the ERA5 over land are also reproduced by the models, including the active regions over the Great Lakes and Hudson Bay, although the densities are overestimated in the models (more so in the NARRM). It is worthwhile mentioning that the NARRM simulations are able to produce the chain of secondary centers to the east of the Rocky Mountains that are also present in the ERA5 reanalysis but are largely missing in the LR simulations. This is presumably due to the NARRM's better-resolved mountainous terrain features, a benefit by design.

The benefit of grid refinement can be further seen in Fig. 25, which shows the histograms of the ETC as a function of the minimum center pressure and the maximum deepening rate during its lifetime. All events within the refined region bounded by the dashed black lines as shown in Fig. 24 are used to compute these statistics. The maximum deepening rate is defined as the maximum 6-hourly center pressure drop, normalized by sin(φref)/sin(φ), with φ being the latitude and φref the reference latitude at 45 (see also in Jung et al.2006). Clearly the NARRM very closely reproduces the number of intense cyclones (minimum center MSLP <960 hPa), while unsurprisingly the LR model underproduces. This is true in coupled and uncoupled modes. Both LR and NARRM simulations overestimate the number of weaker ETCs. On a similar note, the observed number of rapid-growth cyclones is closely reproduced by the NARRM simulations but is clearly underestimated by a large margin in the LR simulations.

Figure 24Mean extratropical cyclone track density in the DJF season between 1985 and 2014 from (a) ERA5, (b, c) AMIP, and (d, e) historical simulations of LR and NARRM. Dashed black lines denote the western and eastern boundary of the refined region.

Figure 25Histogram of extratropical cyclone minimum center sea level pressure and maximum 6 h deepening rate in the DJF season between 1985 and 2014 (a, b) AMIP and (c, d) historical simulations. LR is shown in dotted, NARRM in dashed, and ERA5 in solid black lines.


5.2 Land and river

5.2.1 Snowpack

Natural storage provided by mountain snowpack is central to water supply reliability in the western US (Siirila-Woodburn et al.2021). To evaluate model skill in representing this critical hydroclimate benchmark variable, intra-annual snowpack dynamics are evaluated using the methodology known as the snow water equivalent (SWE) triangle (Rhoades et al.2018a, b). The seven metrics that make up the SWE triangle attempt to distill management-relevant aspects of the accumulation and ablation of snowpack (e.g., peak water volume and snowmelt rate) for any arbitrary gridded SWE dataset. Five HUC2 basins of the mountainous western US are used to derive five-member ensemble and basin average evaluations of LR and NARRM fully coupled historical simulations and are compared with ERA5. All datasets are bi-linearly regridded using the Earth System Modeling Framework (ESMF) to 0.25 resolution prior to masking and computing the basin-average SWE triangle metrics.

NARRM provides enhanced winter (DJF) climatological representation of the spatial variability of SWE across the CONUS relative to LR (Fig. 26a, b). This is seen through higher SWE magnitudes and more granular spatial structures in NARRM compared with LR, particularly in coastal mountain ranges such as the Cascades and Sierra Nevada, and corroborates a long history of ESM studies that highlight the critical importance of horizontal resolution ( 0.25) in properly representing the mountainous hydrologic cycle (Demory et al.2014; Rhoades et al.2017; Kapnick et al.2018; Palazzi et al.2019; Bambach et al.2022; Rhoades et al.2022). As shown through the more granular intra-seasonal perspective of the SWE triangle metrics, certain aspects in the snowpack dynamics are improved with NARRM (e.g., peak water volume), namely in the Pacific Northwest and California (Fig. 26c). With that said, some E3SM SWE biases are not ameliorated with horizontal resolution and may arise due to the combination of higher winter season precipitation (Fig. 12) and a general cool bias (Fig. 9) in both the LR and NARRM fully coupled historical simulations.

Figure 26Climatological DJF average snow water equivalent (SWE) as simulated by E3SMv2 with (a) LR and (b) NARRM over the 1985–2014 period. (c) LR (blue) and NARRM (red) SWE triangle metrics for five HUC2 basins within the mountainous western US compared with ERA5 (gray). Black bars at the end of each histogram represent the mean 95 % confidence intervals.

5.2.2 Runoff and evapotranspiration

NARRM better captures spatial variability in land hydrologic processes, as indicated by two most important land hydrologic variables, total runoff (Fig. 27) and evapotranspiration (ET) (Fig. 28). For instance, in the coastal Pacific regions, the Rocky Mountains block atmospheric moisture from ocean to inland areas and lead to two distinct hydrologic regimes: a wet regime in the western mountains and a dry regime in the eastern mountains. This abrupt spatial shift from a wet to dry hydrologic regime can be clearly seen in the composite runoff map from the Global Runoff Data Center (GRDC) (Fekete et al.2011), Fig. 27a, and the observed evapotranspiration map from the Moderate Resolution Imaging Spectroradiometer (MODIS) satellite observations (Running et al.2017), Fig. 28a. Note that the GRDC runoff map is not completely based on the observational data since runoff measurements are not available at the regional or global scales due to technical and economic limitations. It is nevertheless a more realistic estimate than any model simulations because it was first generated with a monthly hydrologic model (hence producing spatiotemporal variability) and then corrected for bias against discharge measurements at thousands of river gauges (Fekete et al.2011). This abrupt shift of hydrologic regime around the Rocky Mountains, along with the other spatial variations, is much better resolved in the NARRM simulation than LR, which is the case for both simulated runoff and evapotranspiration, as shown in Figs. 27b, c and 28b, c. The NARRM simulated spatial patterns are thus more realistic than the LR ones over NA. Over the remaining regions of the globe, the NARRM and LR simulated spatial patterns are quite similar to each other in terms of both runoff and evapotranspiration (not shown).

The simulation biases in runoff and evapotranspiration are further examined in terms of absolute biases, i.e., the absolute difference between the simulated and “benchmark” values. Here the GRDC runoff and MODIS ET data are used as the benchmark data. Figures 27d and 28d show the maps of absolute bias difference, i.e., the difference between the absolute biases in the LR simulation and those in the NARRM simulation (former subtracting latter), for annual mean runoff and evapotranspiration, respectively. For a specific grid cell in these two maps, a positive difference means the absolute bias in the LR simulation is larger than that in the NARRM and vice versa. It appears that there are more absolute biases in LR than NARRM over both the western and eastern US. Using the longitude 100 W as the divide, the average absolute bias differences (positive indicates NARRM has less overall absolute bias than LR) are 22.8 and 0.9 mm yr−1 over the western and eastern US, respectively, for annual mean runoff, and are 21.6 and 18.5 mm yr−1 over the western and eastern US, respectively, for annual mean evapotranspiration. When compared to LR, NARRM can thus help reduce simulation biases in hydrologic variables.

Figure 27Annual mean runoff from (a) GRDC, (b) LR, and (c) NARRM simulations and (d) the differences in absolute biases between LR and NARRM (LRNARRM); a positive value suggests the absolute bias in the LR simulation is larger than that in the NARRM, while a negative value indicates the opposite.

Figure 28Annual mean evapotranspiration from (a) MODIS, (b) LR, and (c) NARRM simulations and (d) the differences in absolute biases between LR and NARRM (LRNARRM); a positive value suggests the absolute bias in the LR simulation is larger than that in the NARRM, while a negative value indicates the opposite.

5.2.3 Streamflow

Streamflow simulations are typically affected by multiple sources of uncertainties, such as the biases in the simulated runoff, the uncertainties in the river model parameters (e.g., river network topology, channel geometry, Manning's roughness coefficients), and water demand data (Li et al.2013, 2015a, b; Zhou et al.2020). For river network topology, the 0.5 and 0.125 resolution river network data are used for the LR and NARRM simulations, respectively, as shown in Fig. 29a, b. It is expected that a higher-resolution river network data can represent rivers more smoothly and hence more realistically. Another benefit of higher resolution river network data is to enable more extensive streamflow validation. Terrestrial water fluxes, particularly surface runoff and streamflow, are dominated by gravity and controlled by topography and hence mostly follow irregular watershed boundaries. In most land surface and ESMs, including E3SM, regular lat–long grids are used to resolve spatial heterogeneity for both runoff and river processes to be compatible with the other land and atmospheric components (Lawrence et al.2019; Golaz et al.2022). Both the magnitude and timing of streamflow at each river gauge are dominated by the corresponding upstream drainage area. Streamflow simulations are thus largely affected by the discrepancies between the watershed boundaries and regular lat–long grids. These discrepancies can be significantly reduced with higher-resolution river network data. For example, in this study a 10 % discrepancy threshold is used to select the river gauges for validating streamflow simulations; i.e., the relative difference between the real upstream drainage area of a river gauge and that estimated from a lat–long grid-based river network should not exceed 10 %. Over the NA domain, 615 river gauges satisfy the requirement for the 0.5 resolution river network, whilst 2924 river gauges satisfy the requirement for the 0.125 resolution river network. There are 563 river gauges that simultaneously satisfy the requirement for both resolutions.

Figure 29Simulated annual mean streamflow at 563 river gauges in NA compared against The Global Streamflow Indices and Metadata (GSIM) database. (a) River network used in LR, demonstrated by the mean annual discharge. (b) River network used in NARRM, demonstrated by the mean annual discharge. (c) Simulated annual mean streamflow against observed for LR and NARRM.

Figure 30Simulated JJA (a, c, e) and DJF (b, d, f) mean streamflow at CONUS river gauges compared against The Global Streamflow Indices and Metadata (GSIM) database. (a, b) Relative bias of LR. (c, d) Relative bias of NARRM. (e) The difference in absolute relative bias between LR (absolute value of a) and NARRM (absolute value of c) for the JJA season; a positive value (purple) indicates LR has greater absolute bias than NARRM. (f) The same as (e) but for the DJF season

Figure 29c shows the comparison between the annual mean observed and simulated streamflow over these 563 river gauges. Overall, both simulations produce the long-term average streamflow reasonably well across these gauges. NARRM performs noticeably better (closer to the red 1:1 line) for the top four gauges with the largest discharges. An additional analysis (figure not shown) indicates that LR produces greater absolute bias than NARRM in 330 out of 563 gauges (about 60 %) in the streamflow simulation. Interestingly, it appears that the overestimation and underestimation of JJA and DJF streamflow are concentrated in the western and eastern US, respectively, for both LR and NARRM, as shown in Fig. 30a–d. Figure 30e, f displays the difference in absolute biases between LR and NARRM (the former subtracted from the latter) at individual river gauges for the JJA and DJF seasons. Positive differences (indicating greater bias in LR than in NARRM, purple color) dominate over most gauges in the eastern US during JJA and over the CONUS during DJF. Taken together, Figs. 29 and 30 suggest an overall better performance of NARRM despite all the uncertainties.

5.3 Land–atmosphere coupling

Accurate representation of the interactive processes between the land surface, planetary boundary layer (PBL), and clouds and precipitation is an ongoing challenge for current state-of-art climate models. Here we assess the land–atmosphere (L-A) coupling in LR (H1-5), LR (A1-3), NARRM (H1-5), and NARRM (A1-3) using the 9-year warm-season observations at the Atmospheric Radiation Measurement (ARM) Southern Great Plains (SGP) site following Tao et al. (2021). Before the detailed analysis of L-A coupling, we first examine the seasonal variations of daytime mean surface heat fluxes from May to August during 2004–2012. As shown in Fig. 31, the evaporative fraction (EF) is in general underestimated in model simulations except for LR (H1-5). The much lower simulated EF compared with the ARM observations is mainly attributed to a large negative bias in surface latent heat fluxes (LH). Different from the other simulations, the surface sensible heat flux (SH) is significantly underestimated in LR (H1-5) from May to early July. As a result, the simulated daytime mean EF is higher than that from the observations. Overall, the surface state and fluxes are better reproduced in the historical runs than in the AMIP runs, where both LR (A1-3) and NARRM (A1-3) show a significant negative bias in LH and EF persisting since July. This is surprising and requires further analyses, which is beyond the scope of this paper. In the following, we focus on two local convective regimes and diagnose model behaviors using the local coupling metrics (Santanello et al.2018).

During the selected 9-year period, 165 and 154 clear-sky days are classified from LR (A1-3) and NARRM (A1-3), respectively (Table A1). This is double the 66 clear-sky days identified from ARM observations. However, the occurrence frequency of shallow cumulus (ShCu) days is much lower in these AMIP runs compared with that observed. For ShCu, only 6 and 5 d are identified in LR (A1-5) and NARRM (A1-5), respectively (not shown). For the historical runs, the number of selected clear-sky days from both LR (H1-5) and NARRM (H1-5) are comparable to that observed, but the occurrence frequency of ShCu days is still low. As we are targeting a statistical and climatological comparison between the long-term ARM data and climate model simulations, we extend the analysis period to 1980–2012 for model simulations on ShCu days due to the limited sample size between 2004 and 2012.

Figure 32 shows the composite clear-sky day mixing diagrams (Santanello et al.2009), which relates the conservative variables, potential temperature (θ), and total water-specific humidity (q) to the water and energy budgets and the growth of planetary boundary layer (PBL). The coevolution of Lvq and Cpθ (07:30 to 17:30 LST) is decomposed by vector components that represent the integrated fluxes of heat and moisture from the land surface (Vsfc), the advection (Vadv), and the entrainment at the PBL top (Vent as a residual). Six metrics are derived from these diagrams and summarized in Table 5. In general, although the differences among various model simulations are minor, several common model biases are noted when compared with the ARM observations. For example, the model-simulated clear-sky days are featured with too warm and too dry conditions in the early morning (07:30 LST). The θsfc in the two AMIP runs and historical runs, for both LR and NARRM, is about 3 and 2 times that which was observed, respectively. The high θsfc indicates that more energy at the surface goes to heating rather than moistening. Moreover, the ELH is significantly overestimated in the model simulations, which is about 5 (3) times that observed in the two AMIP runs (historical runs). The much higher simulated ELH suggests that the entrainment heating and drying dominates the surface fluxes on the simulated clear-sky days, which supports rapid and deep PBL growth in models. Different from the observations, the simulated advection tends to cool and dry the mixed layer, but the overall impact is much smaller compared to those from the surface and entrainment.

Figure 33 shows the daytime evolution composites of PBL, lifting condensation level (LCL), and LCL deficit (PBL top height minus LCL) on clear-sky and ShCu days. Both PBL and LCL on model-simulated clear-sky days are much higher than on the ARM-observed clear-sky days. The model behaviors are in general consistent among different simulations, except that the bias of LCL is significantly lower in LR (H1-5) compared with the others. In models, the PBL grows rapidly after sunrise on clear-sky days, corresponding to the large warm and dry air entrainments that dominate the PBL budget (Fig. 32). But the too warm and too dry early morning surface conditions lead to an even higher LCL on model-simulated clear-sky days. The PBL never reaches the LCL, with a negative LCL deficit throughout the day, which supports clear skies. The diurnal evolution of LCL on the ARM-observed ShCu days is similar to that on clear-sky days, but the development of the PBL is much more vigorous. As a result, the PBL is deep enough to touch the LCL for cloud formation around noon. Different from the observed results, the daytime evolution of PBL is much weaker on ShCu days than on clear-sky days in all model simulations. However, the decrease in LCL from clear-sky days to ShCu days is even greater, where the growth of PBL is high enough to touch the LCL for cloud formation. Note that the models simulate a positive LCL deficit at around 09:00 LST, a few hours earlier than that in the observations. To summarize, ShCu forms as a result of strong surface SH fluxes that drives the rapid development of PBL in observations, while in models ShCu results from a relatively more humid lower troposphere that leads to a lowered LCL. Differences among various model simulations are pretty minor.

Figure 31The seasonal variation of 2004–2012 daytime mean (06:00–18:00 LST): (a) surface sensible heat flux (SH), (b) surface latent heat flux (LH), and (c) surface evaporative fraction (EF, defined as [LH/(LH+SH)]) from ARM observations (black), LR (A1-3) (green), LR (H1-5) (blue), NARRM (A1-3) (orange), and NARRM (H1-5). A moving average of 30 d is applied to smooth out short-term fluctuations and highlight longer-term trends.


Figure 32Clear-sky-day mixing diagram of the PBL conservative variables, Lvq vs. Cpθ, during the daytime evolution from ARM observations (black), LR (A1-3) (green), LR (H1-5) (blue), NARRM (A1-3) (orange), and NARRM (H1-5) (red). Dots denote the composite hourly means from 07:30 to 17:30 LT. The text annotations depict the vector component contributions from surface (Vsfc), advection (Vadv), and entrainment fluxes (Vent) to the evolution.


Table 5The surface (βsfc) and entrainment (βent) Bowen ratios, the entrainment ratio of heat (ESH) and moisture (ELH), and the advective flux ratio of heat (ASH) and moisture (ALH) from the ARM observations, LR (A1-3), LR (H1-5), NARRM (A1-3), and NARRM (H1-5) on clear-sky days. The flux values (W m−2) are derived using the mixing diagram theory and surface, advection, and entrainment flux vectors depicted in Fig. 32.

Download Print Version | Download XLSX

Figure 33Composite daytime evolution of (a) PBL, (b) LCL, and (c) LCL deficit (PBL minus LCL) from ARM observations (black), LR (A1-3) (green), LR (H1-5) (blue), NARRM (A1-3) (orange), and NARRM (H1-5) (red) on clear-sky days. Panels (d)(f) are the same as panels (a)(c) but on shallow cumulus days.


6 Conclusions and discussion

A primary Earth system model (ESM) advancement is to represent the spatially continuous world more realistically on discretized grids, which often requires constantly increasing the finest scale of explicitly resolved processes within the computational limit. Before uniformly high-resolution global models solve their severe computational challenge for climate simulation campaigns, the multiresolution ESM (e.g., regionally refined model (RRM)) is a natural alternative for these campaigns. Nevertheless, it has been over a decade since such a multiresolution method (e.g., Ringler et al.2008) was proposed.

To our knowledge, this is the first study with a global ESM that has accomplished the CMIP6 climate simulation campaign with a fully coupled RRM configuration – a potentially significant step in the long journey of improving the explicitly resolved resolution of climate simulations. The key to this success is the application of the hybrid time step strategy (i.e., merging the high-resolution dynamics time step with the low-resolution physics time step) in the atmosphere model, which mitigates the negative impacts caused by the persistent poor scale-aware problem of atmospheric physics in a multi-scale framework (e.g., RRM). The powerful aspect of RRM is that it typically only costs  10 %–20 % of the globally uniform high-resolution model, substantially reducing the computational burden of production simulations. This is particularly important for high-resolution ensemble simulations, which are necessary to account for the internal variability of the climate system, but whose cost would otherwise be prohibitive. On the global scale, we show that NARRM reproduces the LR climate well. Within the high-resolution domain (i.e., North America), NARRM displays more improvements than deteriorations relative to LR. Furthermore, some of the NARRM improvements (e.g., marine shallow cumulus clouds over California and mixed-phase clouds near the Arctic) are attributable to the better-captured coupling processes, highlighting the strength of refining multiple components over a single component. The main detailed findings are as follows.

  • The new dry baroclinic idealized test (Hughes and Jablonowski2023) allows us to test the NARRM grid with the stand-alone atmospheric dynamical core and confirms that the NARRM mesh is numerically stable and that the results are reasonable compared to the LR and HR grids (Fig. 4).

  • By employing the EAM hybrid time step method, NARRM successfully matches the global climate (including climatology, time series, and climate sensitivity and feedback) simulated by LR (Figs. 7, 9, 10, 22) without retuning physics parameters.

  • Within the high-resolution region over the CONUS, precipitation and clouds are largely improved in NARRM compared to LR (Figs. 11a, c, 12, 13) due to the better topography in NARRM (Fig. 8) and/or reduced sea surface temperature biases.

  • Refining the atmospheric grid spacing from 100 to 25 km is not adequate to improve the diurnal propagation of organized MCSs over the CONUS (Fig. 15).

  • NARRM retains the LR performance of aerosol simulations on a global scale and regional mean basis without retuning of the scale-dependent aerosol emissions. Over the refined mesh, NARRM improves the simulated aerosol spatial variability and predictions of extreme polluted cases (e.g., the upper tail of AOD distribution) (Fig. 18). On the other hand, the refined grid resolution does not eliminate the high biases in aerosol loadings and effective radiative forcing inherited from the LR model.

  • NARRM generally simulates better cloud cover than LR for both liquid and ice phase clouds. Over land (e.g., western NA and Greenland), this improvement is likely related to topography, whereas over ocean it is attributed to air–sea interactions.

  • NARRM produces a comparable global mean cloud feedback to LR but a less positive cloud feedback over the NA (Fig. 22). The reduction in cloud feedback there mainly relates to the shortwave component. The total cloud feedback over coastal California does not change much due to the compensation between the shortwave and longwave components (Fig. 23).

  • While both the LR and NARRM simulations are to a large extent able to capture the spatial and statistical distributions of the observed extratropical cyclone (ETC) activities, NARRM shows a particularly improved skill when simulating the ETC activities along the oceanic storm tracks and over the mountain range to the east of the Rocky Mountains (Fig. 24). NARRM in coupled mode outperforms all other configurations (LR and uncoupled NARRM) when simulating the shape and orientation of the oceanic storm tracks within the NARRM high-resolution domain due to the coupling with the refined ocean surrounding North America. NARRM in general produces more ETCs than LR and overestimates the total number of cyclones compared to the ERA5 reanalysis. More importantly, for intense or rapidly developing cyclones, the NARRM simulations are in close agreement with the observations, whereas the LR simulations are mismatched by a significant margin (Fig. 25).

  • NARRM appears to better represent the spatial variability in land hydrologic processes by resolving the land features more realistically over the western US (Figs. 27, 28). With higher grid resolution, NARRM can better capture surface topography that dominates surface water flows across hillslopes and through rivers, and hence not only improves the river model performance but also provides more precise river gauge geo-referencing information for streamflow validations (Figs. 29, 30).

  • NARRM provides enhanced winter (DJF) climatological representation of the spatial variability of snow water equivalent (SWE) across the CONUS relative to LR (Fig. 26ab) as a result of higher SWE magnitudes and more granular spatial structures in NARRM. Certain biases (e.g., peak water volume) in snowpack are reduced in NARRM compared with LR (Fig. 26c).

  • Over the ARM SGP site during warm seasons, the surface conditions are warm and dry on the model-simulated clear-sky days, with overestimation in both the PBL height and the LCL (Figs. 32, 33), while the ShCu days in models result from a much moister environment compared with that in the observations (Fig. 33). In general, the surface properties and fluxes are better reproduced in the historical runs than in the AMIP runs (Fig. 31), showing only limited impact due to resolution.

Besides the NARRM configuration illustrated in the present study, E3SMv2 has been successfully run with RRM meshes with finer grids located in other regions (Antarctic, Arctic, and southeastern Pacific). We expect that the hybrid time step strategy is a general approach that can be applied to these RRMs to simulate high-resolution climate in different areas. With that in mind, we streamlined the process of creating new RRM configurations to facilitate broader RRM applications in the next phase of the E3SM project. Depending on the goal of RRM simulations, further improvements over the refined domain can be achieved via additional parameter tuning. In that case, nudging the outside coarser domain may be necessary to avoid severe degradations of the climate there. Such nudging capability is available in E3SM (e.g., Tang et al.2019), and one has the option to nudge towards the data from the reanalysis product or low-resolution E3SM simulation. Lastly, we highlight that this paper serves as an overview of the NARRM atmosphere, land, and river models. More in-depth analysis is planned to be reported in follow-up papers.

Appendix A

Figure A1North American RRM (NARRM) grids for ocean and sea ice.

Figure A2The same as Fig. 11 but for the global spatial RMSE of model climatology from the EAMv1 LR (blue triangles) and the EAMv1 RRM (red triangles) with high-resolution grids over the CONUS and both physics parameters and time steps tuned for high resolution. The details of these two EAMv1 simulations are documented in Tang et al. (2019).


Figure A3Comparison of global annual mean precipitation geographic patterns from (a) GPCP2.3, (b) LR, and (c) NARRM historical ensemble means.

Figure A4Cloud fractions over North America simulated by the nudged LR (a, b, c) and NARRM (c, d, e) for low clouds (CLDLOW; a, d), middle clouds (CLDMED; b, e), and high clouds (CLDHGH; c, f).

Figure A5Present-day (PD) minus pre-industrial (PI) changes in cloud droplet number concentration (a, d), liquid water path (LWP), and ice water path (IWP) in North America calculated from the nudged LR (a, b, c) and NARRM (d, e, f) simulations.

Table A1Definition criteria and sample size of the clear-sky regime (Clear) and shallow cumulus regime (ShCu) based on the ARM observations and four different model simulations.

Download Print Version | Download XLSX

Code and data availability

The E3SM code used in this work is available at (E3SM Project2022) and on GitHub at (last access: 3 July 2023), including a maintenance branch (maint-2.0;, last access: 3 July 2023) that has been created to reproduce these simulations.

The complete native model output and the nudging simulations' climatology data are accessible directly at the National Energy Research Scientific Computing Center (NERSC) (, E3SM Project2023a and, E3SM Project2023b) for low-resolution and NARRM simulations, which are documented at (E3SM Project2023c). A subset of the native output is also available through the DOE Earth System Grid Federation (ESGF) at (E3SM Project2023d). Data reformatted following CMIP conventions will also be available through ESGF at (E3SM Project2023e).

Performance data and scripts are located at (last access: 4 July 2023,, Bradley2022).

Author contributions

As part of the E3SM water cycle group co-led by JCG and LPVR, QT led the regionally refined model (RRM) development team with significant contributions from JCG, MAT, WL, BRH, PAU, and AMB and help from OG, JDW, CZ, CSZ, ELR, AFR, MEM, RLJ, AVDV, PMC, and GB. The RRM configuration leverages the low-resolution model development effort from TZ, KZ, and XZ. QT and JCG designed the paper scope. QT, JCG, MAT, AMB, KZ, BS, and RMF carried out the model simulations with assistance from AM, NDK, and RLJ. QT led the analysis and the manuscript writing with significant contributions from JCG, MAT, WL, AMB, TZ, KZ, YZ, MZ, MW, HW, CT, AMR, YQ, HYL, and YF. All co-authors contributed to the manuscript.

Competing interests

At least one of the (co-)authors is a guest member of the editorial board of Geoscientific Model Development. The peer-review process was guided by an independent editor, and the authors also have no other competing interests to declare


Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Financial support

This research was supported as part of the Energy Exascale Earth System Model (E3SM) project, funded by the US Department of Energy (DOE), Office of Science, Office of Biological and Environmental Research (BER). E3SM production simulations were performed on a high-performance computing cluster provided by the BER Earth System Modeling program and operated by the Laboratory Computing Resource Center at Argonne National Laboratory, and the National Energy Research Scientific Computing Center (NERSC), a DOE Office of Science User Facility supported by the Office of Science of the US Department of Energy under contract no. DE-AC02-05CH11231. Developmental simulations were performed using BER Earth System Modeling program's Compy computing cluster located at the Pacific Northwest National Laboratory and the NERSC machine cori-knl. Additional developmental simulations and post-processing and data archiving of production simulations used the resources of NERSC.

Lawrence Livermore National Laboratory (LLNL) is operated by Lawrence Livermore National Security, LLC, for the US DOE, National Nuclear Security Administration under contract no. DE-AC52-07NA27344. Support was received from the LLNL LDRD project 22-ERD-008, “Multiscale Wildfire Simulation Framework and Remote Sensing”, the DOE Atmospheric Radiation Measurement (ARM), and the DOE Atmospheric System Research (ASR) program. Data from the US DOE were used as part of the ARM Climate Research Facility Southern Great Plains site. Pacific Northwest National Laboratory is operated by Battelle for the US Department of Energy under contract no. DE-AC05-76RL01830. This paper describes objective technical results and analysis. Any subjective views or opinions that might be expressed in the paper do not necessarily represent the views of the US Department of Energy or the United States Government. Sandia National Laboratories is a multi-mission laboratory managed and operated by National Technology & Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International Inc., for the US Department of Energy's National Nuclear Security Administration under contract no. DE-NA0003525. This work was supported by the US DOE through the Los Alamos National Laboratory. Los Alamos National Laboratory is operated by Triad National Security, LLC, for the National Nuclear Security Administration of US Department of Energy (contract no. 89233218CNA000001). The University of Michigan scientists Christiane Jablonowski and Owen K. Hughes were supported by the DOE Office of Science (grant no. DE-SC0023220). Co-author Alan M. Rhoades was funded by the Director, Office of Science, Office of Biological and Environmental Research of the U.S. Department of Energy Regional and Global Model Analysis (RGMA) program, through the Calibrated and Systematic Characterization, Attribution and Detection of Extremes (CASCADE) Science Focus Area (award no. DE-AC02-05CH11231), and the “An Integrated Evaluation of the Simulated Hydroclimate System of the Continental US” project (award no. DE-SC0016605).

Review statement

This paper was edited by Sam Rabin and reviewed by two anonymous referees.


Adler, R., Sapiano, M., Huffman, G., Wang, J.-J., Gu, G., Bolvin, D., Chiu, L., Schneider, U., Becker, A., Nelkin, E., Xie, P., Ferraro, R., and Shin, D.-B.: The Global Precipitation Climatology Project (GPCP) Monthly Analysis (New Version 2.3) and a Review of 2017 Global Precipitation, Atmosphere, 9, 138,, 2018. a

Bambach, N. E., Rhoades, A. M., Hatchett, B. J., Jones, A. D., Ullrich, P. A., and Zarzycki, C. M.: Projecting climate change in South America using variable-resolution Community Earth System Model: An application to Chile, Int. J. Climatol., 42, 2514–2542,, 2022. a, b, c

Bengtsson, Y., Hodges, K. I., and Roeckner, R.: Storm tracks and climate change, J. Climate, 19, 3518–3543,, 2006. a, b

Beres, J. H., Alexander, M. J., and Holton, J. R.: A Method of Specifying the Gravity Wave Spectrum above Convection Based on Latent Heating Properties and Background Wind, J. Atmos. Sci., 61, 324–337,<0324:AMOSTG>2.0.CO;2, 2004. a

Blender, R. and Schubert, M.: Cyclone tracking in different spatial and temporal resolutions, Mon. Weather Rev., 128, 377–384,<0377:CTIDSA>2.0.CO;2, 2000.  a, b

Bogenschutz, P. A., Lee, H.-H., Tang, Q., and Yamaguchi, T.: Combining regional mesh refinement with vertically enhanced physics to target marine stratocumulus biases as demonstrated in the Energy Exascale Earth System Model version 1, Geosci. Model Dev., 16, 335–352,, 2023. a, b, c

Bradley, A.: Data and scripts for performance analysis in the E3SMv2 NARRM overview paper, Zenodo [data set],, 2022. a

Bradley, A. M., Bosler, P. A., and Guba, O.: Islet: interpolation semi-Lagrangian element-based transport, Geosci. Model Dev., 15, 6285–6310,, 2022. a

Burakowski, E. A., Tawfik, A., Ouimette, A., Lepine, L., Zarzycki, C., Novick, K., Ollinger, S., and Bonan, G.: Simulating surface energy fluxes using the variable-resolution Community Earth System Model (VR-CESM), Theor. Appl. Climatol., 138, 115–133,, 2019. a

Caldwell, P. M., Mametjanov, A., Tang, Q., Van Roekel, L. P., Golaz, J.-C., Lin, W., Bader, D. C., Keen, N. D., Feng, Y., Jacob, R., Maltrud, M. E., Roberts, A. F., Taylor, M. A., Veneziani, M., Wang, H., Wolfe, J. D., Balaguru, K., Cameron-Smith, P., Dong, L., Klein, S. A., Leung, L. R., Li, H.-Y., Li, Q., Liu, X., Neale, R. B., Pinheiro, M., Qian, Y., Ullrich, P. A., Xie, S., Yang, Y., Zhang, Y., Zhang, K., and Zhou, T.: The DOE E3SM Coupled Model Version 1: Description and Results at High Resolution, J. Adv. Model. Earth Sy., 11, 4095–4146,, 2019. a, b

Caldwell, P. M., Terai, C. R., Hillman, B., Keen, N. D., Bogenschutz, P., Lin, W., Beydoun, H., Taylor, M., Bertagna, L., Bradley, A. M., Clevenger, T. C., Donahue, A. S., Eldred, C., Foucar, J., Golaz, J.-C., Guba, O., Jacob, R., Johnson, J., Krishna, J., Liu, W., Pressel, K., Salinger, A. G., Singh, B., Steyer, A., Ullrich, P., Wu, D., Yuan, X., Shpund, J., Ma, H.-Y., and Zender, C. S.: Convection-Permitting Simulations With the E3SM Global Atmosphere Model, J. Adv. Model. Earth Sy., 13, e2021MS002544,, 2021. a

Cesana, G. and Chepfer, H.: Evaluation of the cloud thermodynamic phase in a climate model using CALIPSO-GOCCP, J. Geophys. Res.-Atmos., 118, 7922–7937, 2013. a

Chang, E. K. M.: CMIP5 projection of significant reduction in extratropical cyclone activity over North America, J. Climate, 26, 9903–9922,, 2013. a

Chepfer, H., Bony, S., Winker, D., Chiriaco, M., Dufresne, J.-L., and Sèze, G.: Use of CALIPSO lidar observations to evaluate the cloudiness simulated by a climate model, Geophys. Res. Lett., 35, L15704,, 2008. a

Chepfer, H., Bony, S., Winker, D., Cesana, G., Dufresne, J. L., Minnis, P., Stubenrauch, C. J., and Zeng, S.: The GCM-Oriented CALIPSO Cloud Product (CALIPSO-GOCCP), J. Geophys. Res.-Atmos., 115, D00H16,, 2010. a, b

Dang, C., Zender, C. S., and Flanner, M. G.: Intercomparison and improvement of two-stream shortwave radiative transfer schemes in Earth system models for a unified treatment of cryospheric surfaces, The Cryosphere, 13, 2325–2343,, 2019. a

Demory, M., Vidale, P., Roberts, M., Berrisford, P., Strachan, J., Schiemann, R., and Mizielinski, M.: The role of horizontal resolution in simulating drivers of the global hydrological cycle, Clim. Dynam., 42, 2201–2225,, 2014. a, b

Dennis, J. M., Fournier, A., Spotz, W. F., St-Cyr, A., Taylor, M. A., Thomas, S. J., and Tufo, H.: High-Resolution Mesh Convergence Properties and Parallel Efficiency of a Spectral Element Atmospheric Dynamical Core, Int. J. High Perform. C., 19, 225–235,, 2005. a

Dennis, J. M., Edwards, J., Evans, K. J., Guba, O., Lauritzen, P. H., Mirin, A. A., St-Cyr, A., Taylor, M. A., and Worley, P. H.: CAM-SE: A scalable spectral element dynamical core for the Community Atmosphere Model, Int. J. High Perform. C., 26, 74–89,, 2011. a

Devanand, A., Huang, M., Lawrence, D. M., Zarzycki, C. M., Feng, Z., Lawrence, P. J., Qian, Y., and Yang, Z.: Land Use and Land Cover Change Strongly Modulates Land-Atmosphere Coupling and Warm-Season Precipitation Over the Central United States in CESM2-VR, J. Adv. Model. Earth Sy., 12, e2019MS001925,, 2020. a

Di Vittorio, A. V., Chini, L. P., Bond-Lamberty, B., Mao, J., Shi, X., Truesdale, J., Craig, A., Calvin, K., Jones, A., Collins, W. D., Edmonds, J., Hurtt, G. C., Thornton, P., and Thomson, A.: From land use to land cover: restoring the afforestation signal in a coupled integrated assessment–earth system model and the implications for CMIP5 RCP simulations, Biogeosciences, 11, 6435–6450,, 2014. a

E3SM Project: E3SM-Project/E3SM: v2.0.2: Second patch release for v2.0 (v2.0.2), Zenodo [code],, 2022. a

E3SM Project: E3SMv2 water cycle low-resolution simulations,, last access: 3 July 2023a. a

E3SM Project: E3SMv2 water cycle North American regionally refined model simulations,, last access: 3 July 2023b. a

E3SM Project: E3SMv2 water cycle experiment documentations,, last access: 3 July 2023c. a

E3SM Project: E3SM project data,, last access: 3 July 2023d. a

E3SM Project: Energy Exascale Earth System Model, Department of Energy,, last access: 3 July 2023e. a

Evans, K., Lauritzen, P., Mishra, S., Neale, R., Taylor, M., and Tribbia, J.: AMIP Simulation with the CAM4 Spectral Element Dynamical Core, J. Climate, 26, 689–709,, 2013. a

Eyring, V., Bony, S., Meehl, G. A., Senior, C. A., Stevens, B., Stouffer, R. J., and Taylor, K. E.: Overview of the Coupled Model Intercomparison Project Phase 6 (CMIP6) experimental design and organization, Geosci. Model Dev., 9, 1937–1958,, 2016. a, b

Fekete, B., Vorosmarty, C., Hall, F., Collatz, G., Meeson, B., Los, S., Brown De Colstoun, E., and Landis, D.: ISLSCP II UNH/GRDC Composite Monthly Runoff, ORNL DAAC,, 2011. a, b

Feng, Y., Wang, H., Rasch, P. J., Zhang, K., Lin, W., Tang, Q., Xie, S., Hamilton, D. S., Mahowald, N., and Yu, H.: Global Dust Cycle and Direct Radiative Effect in E3SM Version 1: Impact of Increasing Model Resolution, J. Adv. Model. Earth Sy., 14, e2021MS002909,, 2022. a, b, c

Gates, W. L., Boyle, J. S., Covey, C., Dease, C. G., Doutriaux, C. M., Drach, R. S., Fiorino, M., Gleckler, P. J., Hnilo, J. J., Marlais, S. M., Phillips, T. J., Potter, G. L., Santer, B. D., Sperber, K. R., Taylor, K. E., and Williams, D. N.: An Overview of the Results of the Atmospheric Model Intercomparison Project (AMIP I), B. Am. Meteorol. Soc., 80, 29–56,<0029:AOOTRO>2.0.CO;2, 1999. a

Geng, Q. and Sugi, M.: Variability of the North Atlantic cyclone activity in winter analyzed from NCEP-NCAR reanalysis data, J. Climate, 14, 3863–3783,;2, 2001. a, b

Gettelman, A., Callaghan, P., Larson, V. E., Zarzycki, C. M., Bacmeister, J. T., Lauritzen, P. H., Bogenschutz, P. A., and Neale, R. B.: Regional Climate Simulations With the Community Earth System Model, J. Adv. Model. Earth Sy., 10, 1245–1265,, 2018. a, b, c

GISTEMP Team: GISS surface temperature analysis (GISTEMP), NASA Goddard Institute for Space Studies, (last access: 24 May 2018), 2018. a

Golaz, J.-C., Larson, V. E., and Cotton, W. R.: A PDF-based model for boundary layer clouds. Part I: Method and model description, J. Atmos. Sci., 59, 3540–3551,<3540:apbmfb>;2, 2002. a

Golaz, J.-C., Van Roekel, L. P., Zheng, X., Roberts, A. F., Wolfe, J. D., Lin, W., Bradley, A. M., Tang, Q., Maltrud, M. E., Forsyth, R. M., Zhang, C., Zhou, T., Zhang, K., Zender, C. S., Wu, M., Wang, H., Turner, A. K., Singh, B., Richter, J. H., Qin, Y., Petersen, M. R., Mametjanov, A., Ma, P.-L., Larson, V. E., Krishna, J., Keen, N. D., Jeffery, N., Hunke, E. C., Hannah, W. M., Guba, O., Griffin, B. M., Feng, Y., Engwirda, D., Di Vittorio, A. V., Dang, C., Conlon, L. M., Chen, C.-C.-J., Brunke, M. A., Bisht, G., Benedict, J. J., Asay-Davis, X. S., Zhang, Y., Zhang, M., Zeng, X., Xie, S., Wolfram, P. J., Vo, T., Veneziani, M., Tesfa, T. K., Sreepathi, S., Salinger, A. G., Reeves Eyre, J. E. J., Prather, M. J., Mahajan, S., Li, Q., Jones, P. W., Jacob, R. L., Huebler, G. W., Huang, X., Hillman, B. R., Harrop, B. E., Foucar, J. G., Fang, Y., Comeau, D. S., Caldwell, P. M., Bartoletti, T., Balaguru, K., Taylor, M. A., McCoy, R. B., Leung, L. R., and Bader, D. C.: The DOE E3SM Model Version 2: Overview of the Physical Model and Initial Model Evaluation, J. Adv. Model. Earth Sy., 14, e2022MS003156,, 2022. a, b, c, d, e, f, g, h, i, j, k, l

Greeves, C. Z., Pope, V. D., Stratton, R. A., and Martin, G. M.: Representation of Northern Hemisphere winter storm tracks in climate models, Clim. Dynam., 28, 683–702,, 2007. a

Gregory, J. M., Ingram, W. J., Palmer, M. A., Jones, G. S., Stott, P. A., Thorpe, R. B., Lowe, J. A., Johns, T. C., and Williams, K. D.: A new method for diagnosing radiative forcing and climate sensitivity, Geophys. Res. Lett., 31, L03205,, 2004. a

Griffies, S. M., Winton, M., Anderson, W. G., Benson, R., Delworth, T. L., Dufour, C. O., Dunne, J. P., Goddard, P., Morrison, A. K., Rosati, A., et al.: Impacts on ocean heat from transient mesoscale eddies in a hierarchy of climate models, J. Climate, 28, 952–977, 2015. a

Guba, O., Taylor, M. A., Ullrich, P. A., Overfelt, J. R., and Levy, M. N.: The spectral element method (SEM) on variable-resolution grids: evaluating grid sensitivity and resolution-aware numerical viscosity, Geosci. Model Dev., 7, 2803–2816,, 2014. a, b

Haarsma, R. J., Roberts, M. J., Vidale, P. L., Senior, C. A., Bellucci, A., Bao, Q., Chang, P., Corti, S., Fučkar, N. S., Guemas, V., von Hardenberg, J., Hazeleger, W., Kodama, C., Koenigk, T., Leung, L. R., Lu, J., Luo, J.-J., Mao, J., Mizielinski, M. S., Mizuta, R., Nobre, P., Satoh, M., Scoccimarro, E., Semmler, T., Small, J., and von Storch, J.-S.: High Resolution Model Intercomparison Project (HighResMIP v1.0) for CMIP6, Geosci. Model Dev., 9, 4185–4208,, 2016. a, b

Hagos, S., Leung, R., Rauscher, S. A., and Ringler, T.: Error Characteristics of Two Grid Refinement Approaches in Aquaplanet Simulations: MPAS-A and WRF, Mon. Weather Rev., 141, 3022–3036,, 2013. a

Hannah, W. M., Bradley, A. M., Guba, O., Tang, Q., Golaz, J.-C., and Wolfe, W.: Separating Physics and Dynamics grids for Improved Computational Efficiency in Spectral Element Earth System Models, J. Adv. Model Earth Sy., 13, e2020MS002419,, 2021. a

Hansen, J., Ruedy, R., Sato, M., and Lo, K.: Global surface temperature change, Rev. Geophys., 48, RG4004,, 2010. a

Harris, L. M. and Lin, S.-J.: A Two-Way Nested Global-Regional Dynamical Core on the Cubed-Sphere Grid, Mon. Weather Rev., 141, 283–306,, 2013. a

Harris, L. M. and Lin, S.-J.: Global-to-Regional Nested Grid Climate Simulations in the GFDL High Resolution Atmospheric Model, J. Climate, 27, 4890–4910,, 2014. a, b

Hazelton, A. T., Harris, L., and Lin, S.-J.: Evaluation of Tropical Cyclone Structure Forecasts in a High-Resolution Version of the Multiscale GFDL fvGFS Model, Weather Forecast., 33, 419–442,, 2018. a

Held, I. M. and Shell, K. M.: Using relative humidity as a state variable in climate feedback analysis, J. Climate, 25, 2578–2582, 2012. a, b

Herrington, A. R., Lauritzen, P. H., Reed, K. A., Goldhaber, S., and Eaton, B. E.: Exploring a Lower-Resolution Physics Grid in CAM-SE-CSLAM, J. Adv. Model. Earth Sy., 11, 1894–1916,, 2019. a

Herrington, A. R., Lauritzen, P. H., Lofverstrom, M., Lipscomb, W. H., Gettelman, A., and Taylor, M. A.: Impact of grids and dynamical cores in CESM2.2 on the surface mass balance of the Greenland Ice Sheet, J. Adv. Model. Earth Sy., 14, e2022MS003192,, 2022. a

Hersbach, H., Bell, B., Berrisford, P., Hirahara, S., Horányi, A., Muñoz-Sabater, J., Nicolas, J., Peubey, C., Radu, R., Schepers, D., Simmons, A., Soci, C., Abdalla, S., Abellan, X., Balsamo, G., Bechtold, P., Biavati, G., Bidlot, J., Bonavita, M., Chiara, G., Dahlgren, P., Dee, D., Diamantakis, M., Dragani, R., Flemming, J., Forbes, R., Fuentes, M., Geer, A., Haimberger, L., Healy, S., Hogan, R. J., Hólm, E., Janisková, M., Keeley, S., Laloyaux, P., Lopez, P., Lupu, C., Radnoti, G., Rosnay, P., Rozum, I., Vamborg, F., Villaume, S., and Thépaut, J.-N.: The ERA5 global reanalysis, Q. J. Roy. Meteo. Soc., 146, 1999–2049,, 2020. a, b

Holben, B. N., Eck, T. F., Slutsker, I., Tanre, D., Buis, J. P., Setzer, A., Vermote, E., Reagan, J. A., Kaufman, Y. J., Nakajima, T., Lavenu, F., Jankowiak, I., and Smirnov, A.: AERONET – A federated instrument network and data archive for aerosol characterization, Remote Sens. Environ., 66, 1–16,, 1998. a

Huang, X. and Ullrich, P. A.: Irrigation impacts on California's climate with the variable-resolution CESM, J. Adv. Model. Earth Sy., 8, 1151–1163,, 2016. a

Huang, X. and Ullrich, P. A.: The Changing Character of Twenty-First-Century Precipitation over the Western United States in the Variable-Resolution CESM, J. Climate, 30, 7555–7575,, 2017. a, b

Huang, X., Rhoades, A. M., Ullrich, P. A., and Zarzycki, C. M.: An evaluation of the variable-resolution CESM for modeling California's climate, J. Adv. Model. Earth Sy., 8, 345–369,, 2016. a

Huffman, G. J., Bolvin, D. T., Nelkin, E. J., Wolff, D. B., Adler, R. F., Gu, G., Hong, Y., Bowman, K. P., and Stocker, E. F.: The TRMM Multisatellite Precipitation Analysis (TMPA): Quasi-global, multiyear, combined-sensor precipitation estimates at fine scales, J. Hydrometeorol., 8, 38–55, 2007. a

Hughes, O. K. and Jablonowski, C.: A Mountain-Induced Moist Baroclinic Wave Test Case for the Dynamical Cores of Atmospheric General Circulation Models, EGUsphere [preprint],, 2023. a, b, c

Hurtt, G. C., Chini, L., Sahajpal, R., Frolking, S., Bodirsky, B. L., Calvin, K., Doelman, J. C., Fisk, J., Fujimori, S., Klein Goldewijk, K., Hasegawa, T., Havlik, P., Heinimann, A., Humpenöder, F., Jungclaus, J., Kaplan, J. O., Kennedy, J., Krisztin, T., Lawrence, D., Lawrence, P., Ma, L., Mertz, O., Pongratz, J., Popp, A., Poulter, B., Riahi, K., Shevliakova, E., Stehfest, E., Thornton, P., Tubiello, F. N., van Vuuren, D. P., and Zhang, X.: Harmonization of global land use change and management for the period 850–2100 (LUH2) for CMIP6, Geosci. Model Dev., 13, 5425–5464,, 2020. a

Jiang, X., Lau, N.-C., and Klein, S. A.: Role of eastward propagating convection systems in the diurnal cycle and seasonal mean of summertime rainfall over the US Great Plains, Geophys. Res. Lett., 33, L19809,, 2006. a

Jung, T., Gulev, S. K., Rudeva, I., and Soloviov, V.: Sensitivity of extratropical cyclone characteristics to horizontal resolution in the ECMWF model, Q. J. Roy. Meteor. Soc., 132, 1839–1857,, 2006. a, b, c

Jungclaus, J. H., Lorenz, S. J., Schmidt, H., Brovkin, V., Brüggemann, N., Chegini, F., Crüger, T., De-Vrese, P., Gayler, V., Giorgetta, M. A., Gutjahr, O., Haak, H., Hagemann, S., Hanke, M., Ilyina, T., Korn, P., Kröger, J., Linardakis, L., Mehlmann, C., Mikolajewicz, U., Müller, W. A., Nabel, J. E. M. S., Notz, D., Pohlmann, H., Putrasahan, D. A., Raddatz, T., Ramme, L., Redler, R., Reick, C. H., Riddick, T., Sam, T., Schneck, R., Schnur, R., Schupfner, M., von Storch, J.-S., Wachsmann, F., Wieners, K.-H., Ziemen, F., Stevens, B., Marotzke, J., and Claussen, M.: The ICON Earth System Model Version 1.0, J. Adv. Model. Earth Sy., 14, e2021MS002813,, 2022. a

Kapnick, S. B., Yang, X., Vecchi, G. A., Delworth, T. L., Gudgel, R., Malyshev, S., Milly, P. C. D., Shevliakova, E., Underwood, S., and Margulis, S. A.: Potential for western US seasonal snowpack prediction, P. Nal. Acad. Sci. USA, 115, 1180–1185,, 2018. a

Larson, V. E.: CLUBB-SILHS: A parameterization of subgrid variability in the atmosphere, arXiv [preprint],, 2017. a

Lauritzen, P. H., Bacmeister, J. T., Callaghan, P. F., and Taylor, M. A.: NCAR_Topo (v1.0): NCAR global model topography generation software for unstructured grids, Geosci. Model Dev., 8, 3975–3986,, 2015. a

Lawrence, D. M., Fisher, R. A., Koven, C. D., Oleson, K. W., Swenson, S. C., Bonan, G., Collier, N., Ghimire, B., van Kampenhout, L., Kennedy, D., Kluzek, E., Lawrence, P. J., Li, F., Li, H., Lombardozzi, D., Riley, W. J., Sacks, W. J., Shi, M., Vertenstein, M., Wieder, W. R., Xu, C., Ali, A. A., Badger, A. M., Bisht, G., Brunke, M. A., Burns, S. P., Buzan, J., Clark, M., Craig, A., Dahlin, K., Drewniak, B., Fisher, J. B., Flanner, M., Fox, A. M., Gentine, P., Hoffman, F., Keppel-Aleks, G., Knox, R., Kumar, S., Lenaerts, J., Leung, L. R., Lipscomb, W. H., Lu, Y., Pandey, A., Pelletier, J. D., Perket, J., Randerson, J. T., Ricciuto, D. M., Sanderson, B. M., Slater, A., Subin, Z. M., Tang, J., Thomas, R. Q., Val Martin, M., and Zeng, X.: The Community Land Model version 5: Description of new features, benchmarking, and impact of forcing uncertainty, J. Adv. Model. Earth Sy., 11, 4245–4287, 2019. a

Leung, L. R., Ringler, T., Collins, W. D., Taylor, M., and Ashfaq, M.: A Hierarchical Evaluation of Regional Climate Simulations, Eos T. Am. Geophys. Un., 94, 297–298,, 2013. a

Leung, L. R., Bader, D. C., Taylor, M. A., and McCoy, R. B.: An Introduction to the E3SM Special Collection: Goals, Science Drivers, Development, and Analysis, J. Adv. Model. Earth Sy., 12, e2019MS001821,, 2020. a, b

Li, H., Wigmosta, M. S., Wu, H., Huang, M., Ke, Y., Coleman, A. M., and Leung, L. R.: A physically based runoff routing model for land surface and earth system models, J. Hydrometeorol., 14, 808–828, 2013. a, b

Li, H.-Y., Leung, L. R., Getirana, A., Huang, M., Wu, H., Xu, Y., Guo, J., and Voisin, N.: Evaluating global streamflow simulations by a physically based routing model coupled with the community land model, J. Hydrometeorol., 16, 948–971, 2015a. a

Li, H.-Y., Ruby Leung, L., Tesfa, T., Voisin, N., Hejazi, M., Liu, L., Liu, Y., Rice, J., Wu, H., and Yang, X.: Modeling stream temperature in the A nthropocene: An earth system modeling approach, J. Adv. Model. Earth Sy., 7, 1661–1679, 2015b. a

Liang, Y., Yang, B., Wang, M., Tang, J., Sakaguchi, K., Leung, L. R., and Xu, X.: Multiscale Simulation of Precipitation Over East Asia by Variable Resolution CAM-MPAS, J. Adv. Model. Earth Sy., 13, e2021MS002656,, 2021. a, b

Liu, W., Ullrich, P., Li, J., Zarzycki, C. M., Caldwell, P. M., Leung, L. R., and Qian, Y.: The June 2012 North American Derecho: A testbed for evaluating regional and global climate modeling systems at cloud-resolving scales, Earth and Space Science Open Archive [preprint], p. 39,, 2022. a, b

Liu, X., Ma, P.-L., Wang, H., Tilmes, S., Singh, B., Easter, R. C., Ghan, S. J., and Rasch, P. J.: Description and evaluation of a new four-mode version of the Modal Aerosol Module (MAM4) within version 5.3 of the Community Atmosphere Model, Geosci. Model Dev., 9, 505–522,, 2016. a

Loeb, N. G., Doelling, D. R., Wang, H., Su, W., Nguyen, C., Corbett, J. G., Liang, L., Mitrescu, C., Rose, F. G., and Kato, S.: Clouds and the earth’s radiant energy system (CERES) energy balanced and filled (EBAF) top-of-atmosphere (TOA) edition-4.0 data product, J. Climate, 31, 895–918,, 2018. a

Melillo, J. M., Richmond, T. C., and Yohe, G. W. E.: Climate Change Impacts in the United States: The Third National Climate Assessment, US Global Change Research Program, US Global Change Research Program, p. 841,, 2014. a

Morice, C. P., Kennedy, J. J., Rayner, N. A., and Jones, P. D.: Quantifying uncertainties in global and regional temperature change using an ensemble of observational estimates: The HadCRUT4 data set, J. Geophys. Res.-Atmos., 117, D08101,, 2012. a

Palazzi, E., Mortarini, L., Terzago, S., and Von Hardenberg, J.: Elevation-dependent warming in global climate model simulations at high spatial resolution, Clim. Dynam., 52, 2685–2702,, 2019. a

Rahimi, S. R., Wu, C., Liu, X., and Brown, H.: Exploring a Variable-Resolution Approach for Simulating Regional Climate Over the Tibetan Plateau Using VR-CESM, J. Geophys. Res.-Atmos., 124, 4490–4513,, 2019. a, b

Rauscher, S. A., Ringler, T. D., Skamarock, W. C., and Mirin, A. A.: Exploring a Global Multiresolution Modeling Approach Using Aquaplanet Simulations, J. Climate, 26, 2432–2452,, 2013. a

Reed, K. A. and Jablonowski, C.: Idealized tropical cyclone simulations of intermediate complexity: a test case for AGCMs, J. Adv. Model. Earth Sy., 4, M04001,, 2012. a

Reed, K. A., Wehner, M. F., and Zarzycki, C. M.: Attribution of 2020 hurricane season extreme rainfall to human-induced climate change, Nature Commun., 13, 1–6,, 2022. a

Rhoades, A. M., Huang, X., Ullrich, P. A., and Zarzycki, C. M.: Characterizing Sierra Nevada Snowpack Using Variable-Resolution CESM, J. Appl. Meteorol. Clim., 55, 173–196,, 2016. a, b

Rhoades, A. M., Ullrich, P. A., and Zarzycki, C. M.: Projecting 21st century snowpack trends in western USA mountains using variable-resolution CESM, Clim. Dynam., 50, 261–288,, 2017. a, b

Rhoades, A. M., Jones, A. D., and Ullrich, P. A.: Assessing Mountains as Natural Reservoirs With a Multimetric Framework, Earth's Future, 6, 1221–1241,, 2018a. a

Rhoades, A. M., Jones, A. D., and Ullrich, P. A.: The Changing Character of the California Sierra Nevada as a Natural Reservoir, Geophys. Res. Lett., 45, 13008–13019,, 2018b. a

Rhoades, A. M., Ullrich, P. A., Zarzycki, C. M., Johansen, H., Margulis, S. A., Morrison, H., Xu, Z., and Collins, W. D.: Sensitivity of Mountain Hydroclimate Simulations in Variable-Resolution CESM to Microphysics and Horizontal Resolution, J. Adv. Model. Earth Sy., 10, 1357–1380,, 2018c. a, b

Rhoades, A. M., Jones, A. D., O'Brien, T. A., O'Brien, J. P., Ullrich, P. A., and Zarzycki, C. M.: Influences of North Pacific Ocean Domain Extent on the Western US Winter Hydroclimatology in Variable-Resolution CESM, J. Geophys. Res.-Atmos., 125, e2019JD031977,, 2020a. a

Rhoades, A. M., Jones, A. D., Srivastava, A., Huang, H., O'Brien, T. A., Patricola, C. M., Ullrich, P. A., Wehner, M., and Zhou, Y.: The Shifting Scales of Western US Landfalling Atmospheric Rivers Under Climate Change, Geophys. Res. Lett., 47, e2020GL089096,, 2020b. a

Rhoades, A. M., Hatchett, B. J., Risser, M. D., Collins, W. D., Bambach, N. E., Huning, L. S., McCrary, R., Siirila-Woodburn, E. R., Ullrich, P. A., Wehner, M. F., Zarzycki, C. M., and Jones, A. D.: Asymmetric emergence of low-to-no snow in the midlatitudes of the American Cordillera, Na. Clim. Change, 12, 1151–1159,, 2022. a

Richter, J. H., Sassi, F., and Garcia, R. R.: Toward a Physically Based Gravity Wave Source Parameterization in a General Circulation Model, J. Atmos. Sci., 67, 136–156,, 2010. a

Richter, J. H., Chen, C.-C., Tang, Q., Xie, S., and Rasch, P. J.: Improved Simulation of the QBO in E3SMv1, J. Adv. Model. Earth Sy., 11, 3403–3418,, 2019. a

Riley, G. T., Landin, M. G., and Bosart, L. F.: The Diurnal Variability of Precipitation across the Central Rockies and Adjacent Great Plains, Mon. Weather Rev., 115, 1161–1172,<1161:TDVOPA>2.0.CO;2, 1987. a

Ringler, T., Ju, L., and Gunzburger, M.: A multiresolution method for climate system modeling: application of spherical centroidal Voronoi tessellations, Ocean Dynam., 58, 475–498,, 2008. a, b

Running, S., Mu, Q., and Zhao, M.: MOD16A2 MODIS/Terra Net Evapotranspiration 8-Day L4 Global 500m SIN Grid V006, Tech. rep.,, 2017. a

Sakaguchi, K., Leung, L. R., Zhao, C., Yang, Q., Lu, J., Hagos, S., Rauscher, S. A., Dong, L., Ringler, T. D., and Lauritzen, P. H.: Exploring a Multiresolution Approach Using AMIP Simulations, J. Climate, 28, 5549–5574,, 2015. a, b, c

Sakaguchi, K., Lu, J., Leung, L. R., Zhao, C., Li, Y., and Hagos, S.: Sources and pathways of the upscale effects on the Southern Hemisphere jet in MPAS-CAM4 variable-resolution simulations, J. Adv. Model. Earth Sy., 8, 1786–1805,, 2016. a, b, c, d

Santanello, J. A., Peters-Lidard, C. D., Kumar, S. V., Alonge, C., and Tao, W.-K.: A Modeling and Observational Framework for Diagnosing Local Land–Atmosphere Coupling on Diurnal Time Scales, J. Hydrometeorol., 10, 577–599,, 2009. a

Santanello, J. A., Dirmeyer, P. A., Ferguson, C. R., Findell, K. L., Tawfik, A. B., Berg, A., Ek, M., Gentine, P., Guillod, B. P., van Heerwaarden, C., Roundy, J., and Wulfmeyer, V.: Land–Atmosphere Interactions: The LoCo Perspective, B. Am. Meteorol. Soc., 99, 1253–1272,, 2018. a

Siirila-Woodburn, E., Rhoades, A. M., Hatchett, B. J., Huning, L., Szinai, J., Tague, C., Nico, P. S., Feldman, D., Jones, A. D., Collins, W. D., and Kaatz, L.: A low-to-no snow future and its impacts on water resources in the western United States, Nature Rev. Earth Environ., 2, 800–819,, 2021. a

Smith, T. M., Reynolds, R. W., Peterson, T. C., and Lawrimore, J.: Improvements to NOAA’s Historical Merged Land–Ocean Surface Temperature Analysis (1880–2006), J. Climate, 21, 2283–2296,, 2008. a

Soden, B. J., Held, I. M., Colman, R., Shell, K. M., Kiehl, J. T., and Shields, C. A.: Quantifying climate feedbacks using radiative kernels, J. Climate, 21, 3504–3520, 2008. a, b

Sun, J., Zhang, K., Wan, H., Ma, P.-L., Tang, Q., and Zhang, S.: Impact of Nudging Strategy on the Climate Representativeness and Hindcast Skill of Constrained EAMv1 Simulations, J. Adv. Model. Earth Sy., 11, 3911–3933,, 2019. a

Tang, Q., Klein, S. A., Xie, S., Lin, W., Golaz, J.-C., Roesler, E. L., Taylor, M. A., Rasch, P. J., Bader, D. C., Berg, L. K., Caldwell, P., Giangrande, S. E., Neale, R. B., Qian, Y., Riihimaki, L. D., Zender, C. S., Zhang, Y., and Zheng, X.: Regionally refined test bed in E3SM atmosphere model version 1 (EAMv1) and applications for high-resolution modeling, Geosci. Model Dev., 12, 2679–2706,, 2019. a, b, c, d, e, f, g, h

Tang, Q., Prather, M. J., Hsu, J., Ruiz, D. J., Cameron-Smith, P. J., Xie, S., and Golaz, J.-C.: Evaluation of the interactive stratospheric ozone (O3v2) module in the E3SM version 1 Earth system model, Geosci. Model Dev., 14, 1219–1236,, 2021. a

Tao, C., Zhang, Y., Tang, Q., Ma, H.-Y., Ghate, V. P., Tang, S., Xie, S., and Santanello, J. A.: Land–Atmosphere Coupling at the US Southern Great Plains: A Comparison on Local Convective Regimes between ARM Observations, Reanalysis, and Climate Model Simulations, J. Hydrometeorol., 22, 463–481,, 2021. a

Taylor, M. A. and Fournier, A.: A compatible and conservative spectral element method on unstructured grids, J. Comput. Phys., 229, 5879–5895,, 2010. a

Taylor, M. A., Guba, O., Steyer, A., Ullrich, P. A., Hall, D. M., and Eldred, C.: An Energy Consistent Discretization of the Nonhydrostatic Equations in Primitive Variables, J. Adv. Model. Earth Sy., 12, e2019MS001783,, 2020. a

Teixeira, J., Cardoso, S., Bonazzola, M., Cole, J., DelGenio, A., DeMott, C., Franklin, C., Hannay, C., Jakob, C., Jiao, Y., Karlsson, J., Kitagawa, H., Köhler, M., Kuwano-Yoshida, A., LeDrian, C., Li, J., Lock, A., Miller, M. J., Marquet, P., Martins, J., Mechoso, C. R., Meijgaard, E. v., Meinke, I., Miranda, P. M. A., Mironov, D., Neggers, R., Pan, H. L., Randall, D. A., Rasch, P. J., Rockel, B., Rossow, W. B., Ritter, B., Siebesma, A. P., Soares, P. M. M., Turk, F. J., Vaillancourt, P. A., Von Engeln, A., and Zhao, M.: Tropical and subtropical cloud transitions in weather and climate prediction models: The GCSS/WGNE Pacific Cross-Section Intercomparison (GPCI), J. Climate, 24, 5223–5256,, 2011. a

Ulbrich, U., Leckebusch, G. C., and Pinto, J. G.: Extra-tropical cyclones in the present and future climate: a review, Theor. Appl. Climatol., 96, 117–131,, 2009. a

Ullrich, P.: Spherical Quadrilateral Grid Generator (SQuadGen) v1.1, Zenodo [code],, 2022. a

Ullrich, P. A. and Taylor, M. A.: Arbitrary-order conservative and consistent remapping and a theory of linear maps: Part I, Mon. Weather Rev., 143, 2419–2440, 2015. a

Ullrich, P. A. and Zarzycki, C. M.: TempestExtremes: a framework for scale-insensitive pointwise feature tracking on unstructured grids, Geosci. Model Dev., 10, 1069–1090,, 2017. a, b

Ullrich, P. A., Melvin, T., Jablonowski, C., and Staniforth, A.: A proposed baroclinic wave test case for deep and shallow-atmosphere dynamical cores, Q. J. Roy. Meteor. Soc., 140, 1590–1602, 2014. a

Ullrich, P. A., Devendran, D., and Johansen, H.: Arbitrary-order conservative and consistent remapping and a theory of linear maps: Part II, Mon. Weather Rev., 144, 1529–1549, 2016. a

Ullrich, P. A., Zarzycki, C. M., McClenny, E. E., Pinheiro, M. C., Stansfield, A. M., and Reed, K. A.: TempestExtremes v2.1: a community framework for feature detection, tracking, and analysis in large datasets, Geosci. Model Dev., 14, 5023–5048,, 2021. a, b

van Kampenhout, L., Rhoades, A. M., Herrington, A. R., Zarzycki, C. M., Lenaerts, J. T. M., Sacks, W. J., and van den Broeke, M. R.: Regional grid refinement in an Earth system model: impacts on the simulated Greenland surface mass balance, The Cryosphere, 13, 1547–1564,, 2019. a, b

Veneziani, M., Maslowski, W., Lee, Y. J., D'Angelo, G., Osinski, R., Petersen, M. R., Weijer, W., Craig, A. P., Wolfe, J. D., Comeau, D., and Turner, A. K.: An evaluation of the E3SMv1 Arctic ocean and sea-ice regionally refined model, Geosci. Model Dev., 15, 3133–3160,, 2022. a

Wang, H., Easter, R. C., Zhang, R., Ma, P.-L., Singh, B., Zhang, K., Ganguly, D., Rasch, P. J., Burrows, S. M., Ghan, S. J., Lou, S., Qian, Y., Yang, Y., Feng, Y., Flanner, M., Leung, L. R., Liu, X., Shrivastava, M., Sun, J., Tang, Q., Xie, S., and Yoon, J.-H.: Aerosols in the E3SM Version 1: New Developments and Their Impacts on Radiative Forcing, J. Adv. Model. Earth Sy., 12, e2019MS001851,, 2020. a, b

Wu, C., Liu, X., Lin, Z., Rhoades, A. M., Ullrich, P. A., Zarzycki, C. M., Lu, Z., and Rahimi-Esfarjani, S. R.: Exploring a Variable-Resolution Approach for Simulating Regional Climate in the Rocky Mountain Region Using the VR-CESM, J. Geophys. Res.-Atmos., 122, 10939–10965,, 2017. a

Xie, S., Lin, W., Rasch, P. J., Ma, P.-L., Neale, R., Larson, V. E., Qian, Y., Bogenschutz, P. A., Caldwell, P., Cameron-Smith, P., Golaz, J.-C., Mahajan, S., Singh, B., Tang, Q., Wang, H., Yoon, J.-H., Zhang, K., and Zhang, Y.: Understanding Cloud and Convective Characteristics in Version 1 of the E3SM Atmosphere Model, J. Adv. Model. Earth Sy., 10, 2618–2644,, 2018. a, b

Xie, S., Wang, Y.-C., Lin, W., Ma, H.-Y., Tang, Q., Tang, S., Zheng, X., Golaz, J.-C., Zhang, G. J., and Zhang, M.: Improved Diurnal Cycle of Precipitation in E3SM With a Revised Convective Triggering Function, J. Adv. Model. Earth Sy., 11, 2290–2310,, 2019. a, b

Xu, Z. and Di Vittorio, A.: Hydrological analysis in watersheds with a variable-resolution global climate model (VR-CESM), J. Hydrol., 601, 126646,, 2021. a

Xu, Z., Di Vittorio, A., Zhang, J., Rhoades, A., Xin, X., Xu, H., and Xiao, C.: Evaluating Variable-Resolution CESM Over China and Western United States for Use in Water-Energy Nexus and Impacts Modeling, J. Geophys. Res.-Atmos., 126, e2020JD034361,, 2021. a

Xu, Z., Chang, A., and Di Vittorio, A.: Evaluating and projecting of climate extremes using a variable-resolution global climate model (VR-CESM), Weather Climate Extremes, 38, 100496,, 2022. a

Zarzycki, C. M. and Jablonowski, C.: A multidecadal simulation of Atlantic tropical cyclones using a variable-resolution global atmospheric general circulation model, J. Adv. Model. Earth Sy., 6, 805–828,, 2014. a

Zarzycki, C. M. and Jablonowski, C.: Experimental Tropical Cyclone Forecasts Using a Variable-Resolution Global Model, Mon. Weather Rev., 143, 4012–4037,, 2015. a

Zarzycki, C. M., Levy, M. N., Jablonowski, C., Overfelt, J. R., Taylor, M. A., and Ullrich, P. A.: Aquaplanet Experiments Using CAM’s Variable-Resolution Dynamical Core, J. Climate, 27, 5481–5503,, 2014. a, b

Zarzycki, C. M., Jablonowski, C., Thatcher, D. R., and Taylor, M. A.: Effects of Localized Grid Refinement on the General Circulation and Climatology in the Community Atmosphere Model, J. Climate, 28, 2777–2803,, 2015. a

Zarzycki, C. M., Ullrich, P. A., and Reed, K. A.: Metrics for Evaluating Tropical Cyclones in Climate Data, J. Appl. Meteorol. Clim., 60, 643–660,, 2021. a

Zender, C. S.: Analysis of self-describing gridded geoscience data with netCDF Operators (NCO), Environ. Model. Softw., 23, 1338–1342, 2008. a

Zhang, C., Golaz, J.-C., Forsyth, R., Vo, T., Xie, S., Shaheen, Z., Potter, G. L., Asay-Davis, X. S., Zender, C. S., Lin, W., Chen, C.-C., Terai, C. R., Mahajan, S., Zhou, T., Balaguru, K., Tang, Q., Tao, C., Zhang, Y., Emmenegger, T., Burrows, S., and Ullrich, P. A.: The E3SM Diagnostics Package (E3SM Diags v2.7): a Python-based diagnostics package for Earth system model evaluation, Geosci. Model Dev., 15, 9031–9056,, 2022a. a, b

Zhang, G. J. and McFarlane, N. A.: Sensitivity of climate simulations to the parameterization of cumulus convection in the Canadian climate centre general circulation model, Atmos.-Ocean, 33, 407–446,, 1995. a

Zhang, H.-M., Huang, B., Lawrimore, J., Menne, M., and Smith, T. M.: NOAA Global Surface Temperature Dataset (NOAAGlobalTemp), Version 4.0, NOAA National Centers for Environmental Information [data set],, 2015. a

Zhang, K., Wan, H., Liu, X., Ghan, S. J., Kooperman, G. J., Ma, P.-L., Rasch, P. J., Neubauer, D., and Lohmann, U.: Technical Note: On the use of nudging for aerosol–climate model intercomparison studies, Atmos. Chem. Phys., 14, 8631–8645,, 2014.  a

Zhang, K., Rasch, P. J., Taylor, M. A., Wan, H., Leung, R., Ma, P.-L., Golaz, J.-C., Wolfe, J., Lin, W., Singh, B., Burrows, S., Yoon, J.-H., Wang, H., Qian, Y., Tang, Q., Caldwell, P., and Xie, S.: Impact of numerical choices on water conservation in the E3SM Atmosphere Model version 1 (EAMv1), Geosci. Model Dev., 11, 1971–1988,, 2018. a

Zhang, K., Zhang, W., Wan, H., Rasch, P. J., Ghan, S. J., Easter, R. C., Shi, X., Wang, Y., Wang, H., Ma, P.-L., Zhang, S., Sun, J., Burrows, S. M., Shrivastava, M., Singh, B., Qian, Y., Liu, X., Golaz, J.-C., Tang, Q., Zheng, X., Xie, S., Lin, W., Feng, Y., Wang, M., Yoon, J.-H., and Leung, L. R.: Effective radiative forcing of anthropogenic aerosols in E3SM version 1: historical changes, causality, decomposition, and parameterization sensitivities, Atmos. Chem. Phys., 22, 9129–9160,, 2022. a

Zhang, S., Zhang, K., Wan, H., and Sun, J.: Further improvement and evaluation of nudging in the E3SM Atmosphere Model version 1 (EAMv1): simulations of the mean climate, weather events, and anthropogenic aerosol effects, Geosci. Model Dev., 15, 6787–6816,, 2022. a

Zhang, Y., Xie, S., Lin, W., Klein, S. A., Zelinka, M., Ma, P.-L., Rasch, P. J., Qian, Y., Tang, Q., and Ma, H.-Y.: Evaluation of Clouds in Version 1 of the E3SM Atmosphere Model With Satellite Simulators, J. Adv. Model. Earth Sy., 11, 1253–1268,, 2019. a

Zhang, Y., Xie, S., Qin, Y., Lin, W., Golaz, J.-C., Zheng, X., Ma, P.-L., Qian, Y., Tang, Q., Terai, C. R., and Zhang, M.: Understanding Changes in Cloud Simulations from E3SM Version 1 to Version 2, EGUsphere, submitted, 2023. a

Zhou, T., Leung, L. R., Leng, G., Voisin, N., Li, H.-Y., Craig, A. P., Tesfa, T., and Mao, Y.: Global irrigation characteristics and effects simulated by fully coupled land surface, river, and water management models in E3SM, J. Adv. Model. Earth Sy., 12, e2020MS002069,, 2020.  a

Short summary
High-resolution simulations are superior to low-resolution ones in capturing regional climate changes and climate extremes. However, uniformly reducing the grid size of a global Earth system model is too computationally expensive. We provide an overview of the fully coupled regionally refined model (RRM) of E3SMv2 and document a first-of-its-kind set of climate production simulations using RRM at an economic cost. The key to this success is our innovative hybrid time step method.