Better calibration of cloud parameterizations and subgrid effects increases the fidelity of the E3SM Atmosphere Model version 1

Realistic simulation of the Earth’s mean-state climate remains a major challenge, and yet it is crucial for predicting the climate system in transition. Deficiencies in models’ process representations, propagation of errors from one process to another, and associated compensating errors can often confound the interpretation and improvement of model simulations. These errors and biases can also lead to unrealistic climate projections and incorrect attribution of the physical mechanisms governing past and future climate change. Here we show that a significantly improved global atmospheric simulation can be achieved by focusing on the realism of process assumptions in cloud calibration and subgrid effects using the Energy Exascale Earth System Model (E3SM) Atmosphere Model version 1 (EAMv1). The calibration of clouds and subgrid effects informed by our understanding of physical mechanisms leads to significant improvements in clouds and precipitation climatology, reducing common and long-standing biases across cloud regimes in the model. The improved cloud fidelity in turn reduces biases in other aspects of the system. Furthermore, even though the recalibration does not change the global mean aerosol and total anthropogenic effective radiative forcing [...]


Introduction
The Energy Exascale Earth System Model (E3SM) version 1 (E3SMv1; Caldwell et al., 2019) includes an atmospheric component called the E3SM atmosphere model (EAM) version 1 (EAMv1). EAMv1 was released in April 2018 together with the fully coupled E3SMv1 and all of its model components. EAMv1 uses a revised four-mode version of the modal aerosol module (MAM) (Liu et al., 2016; Wang et al., 2020); an updated two-moment cloud microphysics scheme (hereafter MG2); the Cloud Layers Unified By Binormals (CLUBB) parameterization (Larson and Golaz, 2005; Bogenschutz et al., 2013) for turbulence, shallow convection, and cloud macrophysics; the Zhang and McFarlane (1995) (ZM) parameterization for deep convection with the addition of convective momentum transport (Richter and Rasch, 2008); and a modified dilute plume calculation (Neale et al., 2008). The model shows general success in simulating present-day climatology, producing improved simulation compared to atmospheric simulations of previous-generation Earth system models (ESMs) that participated in the Coupled Model Intercomparison Project (CMIP) phase 5 (CMIP5) (Taylor et al., 2012).
However, EAMv1 still produces significant regional cloud and precipitation biases that are common in many ESMs (Y. Xie et al., 2018; Brunke et al., 2019). These persistent errors include the underestimation of coastal stratocumulus (Sc), overly bright trade cumulus (Cu), mislocation of the Sc-to-Cu transition regions, and a notable underestimation of the areal extent of clouds over the Indo-Pacific warm pool. EAMv1 also showed some new cloud biases compared to its predecessors, including overly bright clouds embedded within storm tracks and an unrealistically high liquid water path (LWP) in polar regions (Zhang et al., 2020). Closely related to these errors are biases in the mean, variability, and extremes of precipitation. EAMv1 produces high annual mean precipitation over the global average in high-elevation regions and in the central Pacific but low annual mean precipitation over Amazonia and the tropical western Pacific (TWP). EAMv1 contains the signature of a double Intertropical Convergence Zone (ITCZ) that has been problematic in ESMs for over 2 decades (Mechoso et al., 1995; Dai, 2006). Furthermore, similar to many other coarse-resolution models, EAMv1 produces too many light precipitation events and too few heavy precipitation events compared to observations (Stephens et al., 2010). The diurnal cycle of precipitation over regions that are strongly influenced by mesoscale convective systems (MCSs) is skewed, producing peak precipitation at midday instead of from the late afternoon to early morning. These common and persistent biases in predictions of clouds and precipitation arise from the coarse model resolution that is insufficient to represent small-scale features, as well as various deficiencies in parameterizations of cloud, turbulence, and convection processes. These deficiencies can, in turn, adversely affect other aspects of the atmosphere.
In addition to these cloud and precipitation biases, EAMv1 also shows large biases in the simulated present-day climatology of surface temperature and winds, similar to other global model predictions (Morcrette et al., 2018). These biases pose challenges for the fully coupled E3SMv1 to produce credible projections of the future climate. As discussed in Golaz et al. (2019), E3SMv1 appears very sensitive to perturbations of atmospheric composition (aerosols and greenhouse gases), producing differences in the observed and simulated temporal evolution of the global mean surface temperature in the 20th century and a relatively high estimate of equilibrium climate sensitivity (ECS) of 5.3 K compared to estimates based on multiple lines of evidence including process understanding, historical climate record, and paleoclimate record (Sherwood et al., 2020).
Many factors may contribute to the behavior and biases of the model. Biases affect the interpretation of climate projections and future model development plans. The choice of parameter settings for parameterizations is a scientifically important factor in creating (and reducing) these biases. This study explores the impact of changes to parameter settings (i.e., recalibration) to improve fidelity of model climate, and implications for climate change studies. Hence, this recalibration effort can provide important physical insights into future development of E3SM as well as other ESMs.
Model calibration, or tuning, is a crucial research element in Earth system modeling. This procedure optimizes model fidelity by addressing the trade-off between optimizing individual processes and process interactions so that the model climate agrees with observables while simultaneously satisfying energy balance requirements. These multiple constraints frequently expose the presence of error compensations in ESMs. As discussed in depth in Hourdin et al. (2017) and Schmidt et al. (2017), balancing these requirements is a mix of art and science because some degree of subjectivity is inevitable and choices are made based on expert judgment. Expert judgment consists of evaluation, intercomparison, and interpretation of results. This is followed by changes to the model parameter settings to make the model better suited for answering specific science questions that originally motivated its development. During the development of EAMv1, model calibration primarily used the traditional one-at-a-time parameter adjustment approach (Xie et al., 2018). In principle, automated procedures could be employed to perform such calibrations, but they are not yet used for final calibrations (for reasons discussed below). Instead, automated procedures have been performed for an ensemble of short simulations with perturbed parameters to provide a systematic assessment of the parametric sensitivity (Qian et al., 2018), helping to provide insight about multivariate responses of the model to changes in single or multiple parameters.
The traditional one-at-a-time parameter adjustment approach is inefficient and expensive in terms of both computational and human resources. It is a sequential and iterative process that requires a large number (e.g., hundreds) of iterations consisting of (1) running a multi-year simulation, (2) performing a comprehensive evaluation using diagnostics packages to assess the impact of the change in a single parameter value on different aspects of the simulation, and (3) designing and running the next simulation based on evaluation of the current simulation. However, there are too many uncertain parameters within a climate model to repeat this process and perfectly optimize its climate fidelity.
The perturbed parameter ensemble approach (Murphy et al., 2004) has been used for quantifying parametric uncertainty. The EAMv1 development team adopted the short simulation ensemble approach (Wan et al., 2014; Qian et al., 2018), which uses 5 d simulations rather than multi-year simulations to assess the fast physics (Ma et al., 2014). The approach significantly reduces the turnaround time and computational cost compared to the traditional multi-year simulation ensemble approach for a systematic assessment of the parametric uncertainty. One caveat, however, is that it requires a priori knowledge of a manageable set of uncertain parameters and their physically, observationally, or empirically justifiable ranges. The parameter space is also too large to explore fully, and only a subset of parameters is typically selected based on physical intuition and expert judgment. In hindsight, the parameter set selected for the short simulation ensemble during the EAMv1 development was insufficient because parameters not included in the original ensemble were later found to be important. Another limitation is that the short simulations focus on fast physical processes and rapid adjustments. By design, important factors such as slow internal variability of the atmosphere (e.g., inter-annual variability) and circulation feedbacks are not considered, and thus any conclusion drawn from the short simulation ensemble might not be applicable to the calibration of the ESM for climate simulations. Both limitations could be mitigated if the perturbed parameter ensemble included every possible combination of parameter choices and the simulations were a decade in length, but the amount of computational resources required for such an exercise is prohibitive.
The one-at-a-time calibration approach using multi-year simulations and the short simulation ensemble approach using multi-day simulations are complementary, but for the purpose of tuning EAMv1 both approaches shared some common challenges: (1) there were insufficient computational and human resources to explore and optimize parameter choices, (2) there was insufficient time to perform and analyze the simulations, and (3) there was a necessary tradeoff in that improvements to one aspect of the simulation in general may be made at the price of degradation in other aspects, suggesting model structural deficiency in addition to parametric uncertainty. Reconciling these contradictory results and further improving the model fidelity have been great challenges for the model development team.
In contrast to the above, an important aspect of the tuning strategy we present here is that we intentionally focus only on a subset of parameters and skill metrics related to cloud processes rather than optimizing the model for more than a dozen of the metrics that the community typically relies on (Burrows et al., 2018; Hourdin et al., 2017; Mauritsen et al., 2012; Gleckler et al., 2008). We find that when clouds in every regime are improved, other aspects of the global atmospheric simulation are also improved, even though they are not the direct targets for calibration. Interestingly, the recalibrated atmosphere model, denoted as EAMv1P, exhibits weaker sensitivities to aerosol perturbation and to surface warming for both clouds and precipitation. Because the notable biases in E3SMv1's simulated surface temperature evolution are due to a combination of high ECS (from cloud feedback) and strong aerosol forcing, EAMv1P may lead to improvements in the simulation of the 20th century temperature evolution and a lower estimate of ECS when running as part of the fully coupled E3SM. More challenges may yet emerge in tuning fully coupled models.
We acknowledge that our recalibration approach has several caveats. First, like all current model calibration strategies, our recalibration does not lead to a unique and perfect configuration, and there are likely multiple ways to achieve a different model configuration with equally accurate present-day climate. We also acknowledge that there may be complications when the recalibrated atmosphere model is coupled with the ocean. Additional tuning might be required. However, the experience from this study will likely be valuable in that effort. Finally, we acknowledge that some tuning choices are better justified than others because many of the uncertain parameters do not have a physically or observationally justifiable range. For those poorly constrained processes, the recalibration provides a way to identify the important process assumptions that affect our ability to accurately simulate the climate system. Future and ongoing studies that develop theoretical or observational constraints to reduce the uncertainties associated with these fundamental process formulations will continue to be very valuable.
In Sect. 2, we provide a discussion on the recalibration. Section 3 shows the results from the recalibrated model. We draw conclusions in Sect. 4.

Approach
Because clouds in different regimes are governed by different processes, the recalibration first treats each regional cloud bias separately, followed by adjustments (including sea salt and dust emission factors) to refine the cloud climatology and to restore the top-of-atmosphere (TOA) energy balance. The TOA cloud radiative effects (CREs) are the primary tuning target, but other cloud properties and cloud controlling factors are also assessed. We adopted the one-at-a-time parameter adjustment approach. Adjustments of uncertain parameters were driven by analysis of physical mechanisms affecting the simulation in every cloud regime. We also introduced new parameters for controlling the coupling of subgrid effects between the convection, turbulence, and surface flux parameterizations to produce better simulation of clouds. The recalibration is described in detail in this section.

Tropical clouds
Tropical clouds and precipitation are primarily controlled by the deep-convection parameterization and ice cloud microphysics. They interact strongly with the atmospheric circulation in the tropics through their overturning and vertical mixing of moist static energy. In EAMv1, cloud cover is significantly underestimated in the TWP and the eastern Pacific. Precipitation is biased low in the TWP and over the Amazon and biased high in the central Pacific, which can be viewed as a displacement of the Walker circulation. These biases also reflect errors in the simulated Hadley cell, moderating subsidence in the subtropics and the distribution of stratocumulus and trade cumulus.
Our main strategy to improve the tropical clouds and precipitation is through incorporating a previously missing gustiness representation, which includes the subgrid wind and temperature variance in the surface flux and the ZM's parcel buoyancy calculations. As we will show below, this improves the spatial distribution of cloud and precipitation, provided it is followed by subsequent parameter adjustments to keep the magnitude of tropical CREs and precipitation within a reasonable range. This idea is motivated in part by Harrop et al. (2018), who showed that including the Redelsperger et al. (2000) gustiness effects associated with deep convection over ocean increases local surface fluxes in EAMv1 running at ∼1° horizontal grid spacing. The circulation responses significantly improve clouds and precipitation over the TWP. This is because E3SMv1 uses the Large and Pond (1982) and Zeng et al. (1998) parameterizations for surface fluxes of heat, moisture, and momentum over ocean and land, respectively, and these bulk aerodynamic schemes are prone to underestimate surface fluxes in regions where (1) large-scale winds are weak and (2) convective episodes are frequent. Enabling gustiness effects increases surface fluxes in those regions and hence increases clouds and precipitation.
The gustiness effects associated with deep convection were not ready in time to be included in the E3SMv1 release because including the gustiness effects requires retuning of the model. In this study, we built on the success of Harrop et al. (2018) and extended the Redelsperger et al. (2000) parameterization to operate over both land and ocean. To account for the gustiness effects associated with shallow convection and turbulence, the subgrid wind variance predicted by CLUBB was passed to the surface flux calculations. The total wind speed used for surface flux computation, $U$, combines the resolved large-scale wind speed, $U_0$, with $U_{g(\mathrm{ZM})}$ and $U_{g(\mathrm{CLUBB})}$, the wind speed enhancements owing to the gustiness associated with ZM and CLUBB, respectively. The use of the Redelsperger et al. (2000) parameterization over land is meant as a simple approximation to incorporate a consistent gustiness treatment globally until more targeted studies of gustiness impacts over land yield a suitable alternative parameterization. Parameters $a_g$ and $b_g$ are tunable parameters used for calibrating the spatial distribution of surface fluxes. The $a_g$ parameter can be set to different values to account for the difference in surface roughness and to provide the flexibility to adjust the model in the face of the structural uncertainty of this parameterization. Based on sensitivity tests, we set $a_g$ to 0.9 over ocean and 1.2 over land and $b_g$ to 1.5 over both land and ocean. Figure 1 shows that the gustiness associated with the ZM deep-convection parameterization contributes about 15 % to the total surface wind speed felt by the surface flux scheme over tropical ocean and up to 45 % over tropical land. Meanwhile, gustiness associated with the shallow-convection and turbulence parameterization CLUBB accounts for 10 %-30 % of the total surface wind speed globally. Therefore, including gustiness effects significantly increases surface fluxes of sensible heat, moisture, and momentum in these regions.
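As a rough illustration of why gustiness matters most where large-scale winds are weak, the sketch below assumes that the resolved wind speed and the two gustiness terms combine in quadrature, the form Redelsperger et al. (2000) use for a single gustiness term. The combination rule, the function name, and the sample values are assumptions made for illustration only, and the roles of the tunable parameters a_g and b_g are not represented.

```python
import math

def total_wind_speed(u0, ug_zm, ug_clubb):
    """Combine resolved wind and gustiness terms in quadrature (assumed form)."""
    return math.sqrt(u0**2 + ug_zm**2 + ug_clubb**2)

# Hypothetical gustiness values in m/s, held fixed for both cases.
ug_zm, ug_clubb = 1.0, 1.0

for u0 in (2.0, 8.0):  # weak and strong resolved wind (hypothetical)
    u = total_wind_speed(u0, ug_zm, ug_clubb)
    gain = 100.0 * (u - u0) / u0
    print(f"U0 = {u0:.1f} m/s -> U = {u:.2f} m/s ({gain:.1f} % increase)")
```

Under this assumed form, the same gustiness produces a much larger relative increase in wind speed when the resolved wind is weak, which is in line with the regions where bulk schemes are said to underestimate surface fluxes.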
Next, we considered the subgrid temperature perturbation in the parcel buoyancy calculation in the ZM scheme. The subgrid temperature perturbation is set to 0.5 K in the Community Atmosphere Model version 5 (CAM5) (Neale et al., 2010) and 0.8 K in EAMv1 (Rasch et al., 2019). This treatment assumes that the subgrid heterogeneity of temperature is globally uniform. However, subgrid variability of temperature should vary in space and time. In particular, subgrid temperature heterogeneity is typically larger over land than over ocean. Setting a globally uniform subgrid temperature perturbation can potentially create biases in the distribution of deep convection. To address this deficiency, we computed the subgrid temperature perturbation by taking the square root of the subgrid temperature variance (a prognostic variable in CLUBB) and passed that information through ZM's parcel buoyancy calculation to account for the variability of the subgrid temperature perturbation. Based on sensitivity tests, a scaling factor of 2.0 was introduced to enhance the effect so that the simulated tropical clouds are in better agreement with observations (as discussed in Sect. 3).
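The calculation described above can be sketched in a few lines. This is a minimal illustration, assuming the perturbation is simply the scaling factor times the square root of the variance; the function name and the sample variances are hypothetical and are not taken from the model code.

```python
import math

def subgrid_temperature_perturbation(temp_variance, scale=2.0):
    """Return scale * sqrt(variance) in K, for a variance given in K^2."""
    if temp_variance < 0.0:
        raise ValueError("temperature variance must be non-negative")
    return scale * math.sqrt(temp_variance)

# Hypothetical variances (K^2): smaller over ocean, larger over land.
for label, variance in [("ocean", 0.04), ("land", 0.25)]:
    tpert = subgrid_temperature_perturbation(variance)
    print(f"{label}: variance = {variance} K^2 -> perturbation = {tpert:.2f} K")
```

Unlike a globally uniform 0.5 K or 0.8 K value, the perturbation here follows the local variance, so it can differ between land and ocean and change in time.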
Accounting for the gustiness effects and the variability of subgrid temperature variance was designed for EAMv1 running at ∼1° horizontal grid spacing. It is logical to expect that increasing model spatial resolution will reduce the impacts of these subgrid effects. Thus, a retuning of these subgrid effects would likely be needed when the model is run at a different horizontal resolution. The model configuration with only the gustiness effects and the subgrid temperature variance added to EAMv1 is labeled as EAMv1_SGV.
While EAMv1_SGV improves the spatial distribution of tropical clouds and precipitation (discussed in Sect. 3), tropical CREs and precipitation become overly strong after these changes, indicating a need for additional tuning to compensate for the unintended changes. Among all the tunable parameters, we targeted the ones that were heavily tuned in EAMv1 and adjusted their values to be closer to their theoretical or nominal values. For context, in EAMv1, the coefficients controlling the autoconversion rate in convective clouds c0_lnd and c0_ocn (which are inversely proportional to the timescale over which condensate is converted to precipitation) were set to 0.007, more than 3 times larger than the nominal rate used in Lord et al. (1982). The consequence is that little condensate is detrained from convective updrafts, producing cirrus clouds with very low water content in the upper troposphere. To compensate for the weak source of ice water, EAMv1 assumes more Aitken mode sulfate aerosols are efficient homogeneous ice nuclei. As a result, EAMv1 produces relatively high cloud ice number (Ni) with small ice water content and weak sedimentation rates, making the cirrus clouds more persistent and highly reflective. In this recalibration, we made a series of ZM parameter adjustments (Table 1). It is worth noting that changing dmpdz has different effects on CREs in different parts of the tropics and a significant impact on the subtropical CREs, but the exact mechanism is unclear and requires further investigation. We took an iterative approach to retune the model, adjusting one parameter at a time and assessing its impacts after each simulation. The model configuration with only these ZM parameter changes added to EAMv1 is labeled as EAMv1_ZM (Table 1).
In addition to changes made to the deep convection scheme in EAMv1_ZM, we also introduced two microphysical changes in MG2 in order to refine the tropical CRE (Table 3): (1) we increased the size threshold for sulfate aerosols to act as homogeneous ice nuclei (so4_sz_thresh_icenuc) to reduce ice number concentration and increase ice crystal size and thus the sedimentation rate and (2) increased ice_sed_ai to further increase the ice sedimentation rate. Combining the two MG2 changes with EAMv1_SGV and EAMv1_ZM, these adjustments increase cloudiness in the western and eastern Pacific, decrease cloudiness in the central Pacific, and cause weaker subsidence in the subtropics.

Subtropical low clouds
Realistic simulation of low clouds across various cloud regimes requires not only a realistic simulation of the large-scale meteorological conditions but also a versatile parameterization that is able to describe different subgrid characteristics of clouds and atmospheric thermodynamic conditions in different cloud regimes. Following Medeiros and Stevens (2011), cloud regimes are determined by the vertical velocity at 500 hPa and the lower tropospheric stability. We also assessed the geographical distribution of those clouds. The CLUBB parameterization employed in EAMv1 uses a multivariate probability density function (PDF) to describe the subgrid variability of cloud, thermodynamic, and dynamic variables, all of which are closely connected to changes in the subgrid vertical velocity $w$. The second and third moments of $w$, $\overline{w'^2}$ and $\overline{w'^3}$, are prognostic variables in CLUBB, meaning that the skewness of the $w$ PDF, $Sk_w \equiv \overline{w'^3} / (\overline{w'^2})^{3/2}$, is predicted according to the governing equations. This is a critical treatment because it allows CLUBB to produce different subgrid characteristics in different regimes. As illustrated in Golaz et al. (2002), a low skewness corresponds to a rather symmetric PDF of $w$ characteristic of the stratus and stratocumulus regimes, whereas a high skewness is more characteristic of a trade cumulus regime in which stronger and isolated updrafts embedded in subsidence occur more frequently. In principle, CLUBB can be used to represent the deep-convection regime as well (Thayer-Calder et al., 2015; Guo et al., 2015), but it requires a significant amount of effort to enable said unification, so EAMv1 still uses ZM for a separate treatment of deep convection. The limit of $Sk_w \le 4.5$ is imposed in EAMv1 in order to prevent numerical instability in CLUBB's equations.
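The skewness definition and the upper limit quoted above can be written as a short function. This is only a sketch: the function and argument names are hypothetical, the sample moments are invented, and only the upper bound stated in the text is applied.

```python
def w_skewness(wp2, wp3, sk_max=4.5):
    """Skewness of the vertical velocity PDF from its second and third moments.

    wp2: second moment (m^2 s^-2), must be positive.
    wp3: third moment (m^3 s^-3).
    The result is capped at sk_max, following the limit quoted in the text.
    """
    if wp2 <= 0.0:
        raise ValueError("second moment must be positive")
    return min(wp3 / wp2**1.5, sk_max)

# Hypothetical moments: a nearly symmetric case and a strongly skewed case.
print(w_skewness(wp2=1.0, wp3=0.5))   # low skewness, stratocumulus-like
print(w_skewness(wp2=0.25, wp3=1.0))  # raw value exceeds the cap
```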
To simulate different subgrid variabilities in different regimes, CLUBB uses different damping coefficients and different widths of the $w$ PDF as a function of $Sk_w$. For $X^*$ set to the diffusivity or variance of a CLUBB prognostic variable (e.g., vertical velocity variance, total water variance),
$$X^* = \mathrm{Xb} + (\mathrm{X} - \mathrm{Xb})\, e^{-0.5\,(Sk_w/\mathrm{Xc})^2},$$
where $X^*$ is a linear combination of low-skewness values X (C1, C11, and gamma_coef in Table 2) and high-skewness values Xb (C1b, C6rtb, C6rthlb, C11b, and gamma_coefb in Table 2) with a weighting factor $e^{-0.5\,(Sk_w/\mathrm{Xc})^2}$, where Xc is a transition factor (C1c, C6rtc, C6rthlc, C11c, gamma_coefc in Table 2). For instance, the damping coefficient for $\overline{w'^2}$, $\mathrm{C1}^*$, is expressed as a function of skewness, C1, C1b, and C1c:
$$\mathrm{C1}^* = \mathrm{C1b} + (\mathrm{C1} - \mathrm{C1b})\, e^{-0.5\,(Sk_w/\mathrm{C1c})^2}.$$
Although this variable skewness treatment provides a way to simulate different subgrid characteristics in different regimes, it is poorly constrained: the equation describing $X^*$ and the chosen values of parameters X, Xb, and Xc are somewhat ad hoc. In EAMv1, we set C1b and gamma_coefb to be the same as C1 and gamma_coef, respectively, to reduce unconstrained assumptions. This is a simple choice that reduces the number of free parameters in CLUBB, but it also limits the flexibility of the CLUBB parameterization with implications for the model fidelity. EAMv1 produces overly bright shallow Cu and a significant bias in near-coast Sc. Therefore, we explored a different pathway in this study by setting C1 and C1b and gamma_coef and gamma_coefb to different values and used the simulated low-cloud CREs as the tuning target to determine the parameter values. Improvements in the simulated clouds are significant, as will be shown in Sect. 3. However, it is worth noting that these improvements do not suggest that this treatment or the parameter settings are the correct representation of the physical processes in the real world.
Rather, our study should be viewed as a demonstration that it is useful to enable the variable skewness treatment to facilitate the production of different subgrid characteristics in different cloud regimes. Reducing the level of complexity of the physics may sometimes compromise the model fidelity and can lead to further uncertainties in climate projections. As we further show in Sect. 3, these changes also affect aerosol-cloud interactions, cloud feedbacks, and, ultimately, climate sensitivity. Future studies that employ sufficient observations (from Doppler lidar, for example) or large eddy simulations (LES) to either constrain the parameter values in the current parameterization or develop a new parameterization to mimic the real-world subgrid characteristics in different regimes would be highly valuable.
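To make the variable skewness treatment concrete, the sketch below implements one reading of the description given earlier: a blend of the low-skewness value X and the high-skewness value Xb, weighted by exp(-0.5 (Sk_w/Xc)^2). The function name and the numerical values are hypothetical and are not the values listed in Table 2.

```python
import math

def skewness_dependent_coef(sk_w, x_low, x_high, x_c):
    """Blend low- and high-skewness values with a Gaussian weight in Sk_w.

    Returns x_low at zero skewness and approaches x_high as |Sk_w| grows.
    x_c controls how quickly the transition happens.
    """
    weight = math.exp(-0.5 * (sk_w / x_c) ** 2)
    return x_high + (x_low - x_high) * weight

# Hypothetical coefficient values, for illustration only.
x_low, x_high, x_c = 1.0, 3.0, 0.5

for sk_w in (0.0, 0.5, 1.0, 4.5):
    coef = skewness_dependent_coef(sk_w, x_low, x_high, x_c)
    print(f"Sk_w = {sk_w:3.1f} -> coefficient = {coef:.3f}")

# Setting x_high equal to x_low removes the skewness dependence entirely,
# which mirrors the EAMv1 choice of C1b = C1 and gamma_coefb = gamma_coef.
assert skewness_dependent_coef(2.0, 1.0, 1.0, 0.5) == 1.0
```

Seen this way, the recalibration choice of giving X and Xb different values is what allows the stratocumulus-like and cumulus-like regimes to use different damping and PDF widths.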
To recalibrate CLUBB, we first increased the overall cloudiness through the following processes.
1. We weakened the turbulent mixing in the planetary boundary layer (PBL), which reduces PBL decoupling and mixing between the PBL and the free troposphere. This was achieved by increasing C1, C1b, C6rtb, C6rthlb, and C14; increasing C_k10; and increasing the eddy length scale threshold for Newtonian and buoyancy damping of $\overline{w'q_t'}$ and $\overline{w'\theta_l'}$ (wpxp_L_thresh, from 60 to 100 m; Fig. 2a, b).
2. We facilitated cloud formation by reducing the width of the $w$ PDF via reducing gamma_coef and gamma_coefb.
3. We promoted Sc-like symmetric mixing rather than shallow Cu-like asymmetric mixing by reducing $Sk_w$ via increasing C8.
4. We allowed larger horizontal variation in subgrid characteristics by enlarging the difference in parameter values between high- and low-skewness regimes (i.e., X's and Xb's), as determined from satellite observations, and modified the Xc values to refine the transition between low- and high-skewness regimes.
The change in the width of the $w$ PDF also affects the in-cloud liquid water mixing ratio (Qc) variance, resulting in variable enhancement factors for warm rain processes in cloud microphysics. We also reduced the cloudiness in the shallow Cu regime by decreasing the lateral entrainment (i.e., reducing mu). These changes increase the skewness in the shallow Cu regime (Fig. 2c, d) and lead to a realistic Sc-to-Cu transition (as discussed in Sect. 3). The model configuration with only these CLUBB parameter changes added to EAMv1 is labeled as EAMv1_CLUBB (Table 2).
Uncertainties in cloud microphysical processes affect all non-deep convective clouds, including subtropical clouds. The tuning of the microphysical processes is justified by fundamental process-level uncertainties and simplifying assumptions made in bulk microphysics schemes (including the MG2 scheme used in EAMv1) regarding particle size distributions and the subgrid scale distribution of cloud properties. To increase cloudiness in the Sc regime, we weakened cloud-top entrainment by enhancing droplet sedimentation (Bretherton et al., 2007). Next, we reduced the lower bound of subgrid vertical velocity used for cloud droplet nucleation (wsubmin). This improves the coupling between the simulated subgrid updraft velocity and the cloud microphysical properties such as droplet number, size, and condensate amount. We also adjusted the warm rain processes by restoring the heavily tuned prc_exp1, the exponent of droplet number ($N_c$) in the autoconversion parameterization in EAMv1, to the nominal value based on observations (Wood, 2005). This increases cloudiness in areas where more aerosols are present. The accretion process is also enhanced to compensate for the reduction of precipitation from the change in autoconversion. The above microphysical modifications designed to optimize stratocumulus will be combined with additional microphysical tunings inspired by cloud types at other latitudes (see Sect. 2.3).
It is worth noting that the autoconversion parameterization in EAMv1 is based on Khairoutdinov and Kogan (2000), which is a function of Qc and $N_c$. However, the parameter values (i.e., the scale factor and exponents of Qc and $N_c$) for different cloud regimes are very different (Kogan, 2013), indicating that the autoconversion process is governed by more factors than those considered in the current parameterization. Therefore, there is no one set of parameter values that can optimally represent the autoconversion process for all cloud regimes. Adjusting these parameters to achieve a reasonably good representation of clouds and precipitation is possible, but one should use caution when interpreting the results and acknowledge the fundamental deficiency of the underlying process representations in the model. Given the importance of warm rain processes (autoconversion and accretion) in simulating clouds and precipitation and their responses to forcings, developing new parameterizations that can flexibly represent these processes over a broad range of cloud types to address this model deficiency should be included in the roadmap toward next-generation ESMs.
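The sensitivity of autoconversion to droplet number can be illustrated with the power-law form of Khairoutdinov and Kogan (2000). The defaults below are the coefficients published in that paper (scale factor 1350, exponents 2.47 and -1.79, with Qc in kg/kg and N_c in cm^-3); they are not the tuned EAMv1 or EAMv1P settings, which are not reproduced here. The function name and sample inputs are hypothetical, and the alternative exponent of -1.2 is chosen only to show the effect of a weaker dependence.

```python
def kk2000_autoconversion(qc, nc, scale=1350.0, qc_exp=2.47, nc_exp=-1.79):
    """Autoconversion rate (kg/kg/s) for qc in kg/kg and nc in cm^-3.

    nc_exp plays the role of the droplet number exponent that the text
    refers to as prc_exp1.
    """
    return scale * qc**qc_exp * nc**nc_exp

qc = 5.0e-4                   # hypothetical cloud water mixing ratio, kg/kg
clean, polluted = 50.0, 200.0  # hypothetical droplet numbers, cm^-3

for nc_exp in (-1.79, -1.2):
    ratio = (kk2000_autoconversion(qc, clean, nc_exp=nc_exp)
             / kk2000_autoconversion(qc, polluted, nc_exp=nc_exp))
    print(f"nc_exp = {nc_exp}: clean/polluted rate ratio = {ratio:.2f}")
```

A more negative exponent suppresses rain formation more strongly as droplet number rises, which is one way to see why this exponent affects cloudiness in aerosol-rich areas.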

Midlatitude and high-latitude clouds
Another significant cloud bias present in the midlatitudes and high latitudes in EAMv1 can be attributed to excessive supercooled liquid clouds due to a suppressed Wegener-Bergeron-Findeisen (WBF) process. This insufficient conversion from liquid to ice is a consequence of an inherited value of a scaling factor of 0.1 that tuned down the WBF process rate significantly. The WBF rate was previously tuned down in order to address an underestimate of supercooled liquid clouds in CAM5 (DeMott et al., 2010; Liu et al., 2011). However, EAMv1 eliminated one of the sources of this bias by replacing the Meyers et al. (1992) ice nucleation (IN) scheme from CAM5 with a classical nucleation theory (CNT)-based scheme (Hoose et al., 2010; Wang et al., 2014). The CNT scheme addresses the overproduction of ice crystals by Meyers et al. (1992), which scavenges liquid water rapidly. Replacing the Meyers et al. (1992) scheme but maintaining the slow WBF conversion from liquid to ice produced unrealistically high liquid water path (LWP) in midlatitudes and high latitudes: the LWP poleward of 60° N and over the Southern Ocean is 15 %-30 % higher than the LWP in the tropics (see discussion in Sect. 3.1; Fig. 3). Such an unrealistic meridional distribution of LWP can cause significant biases in the radiative energy distribution, atmospheric circulation, and water cycle. The excessive cloud liquid water in midlatitudes and high latitudes can also lead to strong aerosol-cloud interactions and biases in long-range transport of aerosols due to strong wet scavenging. The high-resolution configuration of E3SMv1 reverted the IN scheme to Meyers et al. (1992) to address this bias, but the error compensation from two incorrect cloud processes can potentially produce biases in cloud microphysical properties, adversely impacting the credibility of climate projections.
In this study, we adopted an alternative approach to address this bias. Y.  shows that improvements can be made by increasing the WBF process rate. Therefore, we retained the new CNT-based IN scheme, which had been shown to perform better than the Meyers et al. (1992) scheme, and significantly increased the scale factor for the WBF process to increase the conversion from liquid to ice. This adjustment is superposed with additional benefits from the parameter adjustments in the ZM scheme (Sect. 2.1) that improved the upper-tropospheric ice clouds in the tropics and increased ice clouds in the midlatitudes. The model configuration with only the MG2 parameter changes added to EAMv1 is labeled as EAMv1_MP (Table 3). The combination of EAMv1_MP and EAMv1_ZM leads to lower LWP and higher ice water path (IWP) in the midlatitudes and high latitudes (see discussion in Sect. 3.1; Fig. 3).

Model simulations
The final revised model (labeled as EAMv1P) includes all changes discussed above and two additional changes to the scale factors for emissions of sea spray and dust aerosols (Table 4) so that the global mean aerosol optical depth (τ_aer) is similar between EAMv1 and the recalibrated model.
In this paper, we show model results from grouped parameter adjustments instead of individual parameter changes. Model configurations are listed in Table 5. The effects and the mechanisms of each individual parameter adjustment require further investigation and will be documented in separate papers.
Each model configuration was used for 11-year global atmospheric simulations (the first year was discarded as spinup) in which the atmosphere model was coupled with an interactive land model but sea surface temperature (SST) and sea ice cover were prescribed. Emissions of aerosols and their precursors were obtained from CMIP phase 6 (CMIP6) emission datasets (Hoesly et al., 2018; van Marle et al., 2017). We ran the coarse-resolution EAM configuration (i.e., ne30np4, which corresponds to approximately 1° horizontal grid spacing) with (1) present-day (here 2000 CE) forcing; (2) pre-industrial (here 1850 CE) forcing; (3) present-day forcing, except for pre-industrial aerosol emissions; (4) pre-industrial forcing with SST elevated by 4 K uniformly; and (5) present-day forcing with SST, sea ice, and solar constant set to pre-industrial conditions. We compute the effective radiative forcing (ERF) from these prescribed SST and sea ice experiments (Hansen et al., 2005). Forster et al. (2016) compared different methodologies for computing the ERF and recommend the prescribed SST and sea ice method. The differences between (1) and (3) provide information on the impacts of anthropogenic aerosols. Contrasting (2) and (4) provides climate feedback estimates. Total anthropogenic ERF (ERF_ant), also termed total adjusted forcing, is derived by comparing (5) and (2). ERF_ant includes anthropogenic forcing (greenhouse gas concentrations, aerosols, and land use and land cover change) and rapid adjustments in water vapor, clouds, and temperature.

Table 6 summarizes the global mean present-day climatology of cloud properties using the various model configurations listed in Table 5. Satellite observations summarized in Stubenrauch et al. (2013) and Neubauer et al. (2019) are also provided, but we note that comparing model state variables with satellite retrievals without using a simulator can be misleading because large retrieval and sampling uncertainties exist.
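The experiment pairings described above reduce to simple differences of global mean top-of-atmosphere (TOA) fluxes. The following is a minimal sketch of that bookkeeping, not the E3SM diagnostics code: the function name, the dictionary layout, and all flux values are hypothetical placeholders chosen only to show which experiments are differenced.

```python
# Sketch of the paired-experiment differencing; all numbers are placeholders,
# not model output. Keys follow the experiment numbering (1)-(5) in the text.

def flux_difference(net_toa_flux, perturbed, reference):
    """Return the net TOA flux of `perturbed` minus that of `reference` (W m^-2)."""
    return net_toa_flux[perturbed] - net_toa_flux[reference]

# Hypothetical global mean net TOA fluxes (W m^-2, positive downward).
net_toa_flux = {
    1: 0.9,   # present-day forcing
    2: 0.3,   # pre-industrial forcing
    3: 2.3,   # present-day forcing with pre-industrial aerosol emissions
    4: -6.0,  # pre-industrial forcing with SST + 4 K
    5: 1.9,   # present-day forcing with pre-industrial SST, sea ice, solar constant
}

# (1) minus (3): impact of anthropogenic aerosols.
erf_aer = flux_difference(net_toa_flux, 1, 3)
# (5) minus (2): total anthropogenic effective radiative forcing.
erf_ant = flux_difference(net_toa_flux, 5, 2)
# (4) minus (2): TOA response to the uniform 4 K SST warming, the raw
# ingredient of a feedback estimate.
toa_response_plus4k = flux_difference(net_toa_flux, 4, 2)
```

Keeping the experiment numbers as keys makes each pairing explicit, so a mix-up between the aerosol-only pair and the total-forcing pair is easy to spot.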
The CREs are computed by double radiation calls in the model. Shortwave and longwave CREs contributed from liquid clouds, ice clouds, convective clouds, and snow are independently computed. Rain droplets are not radiatively active in EAMv1. Because radiative transfer is nonlinear, the sum of the CREs from clouds and snow is not equal to the total CRE.
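The non-additivity noted above can be illustrated with a deliberately simple toy model. The sketch below is not the EAMv1 radiation code: it assumes a single beam attenuated by Beer-Lambert extinction, and the component names and optical depths are hypothetical. Each "radiative effect" comes from a second, diagnostic flux calculation with the chosen component removed, which mimics the double-call idea.

```python
import math

def transmitted_flux(incoming, optical_depths):
    """Toy Beer-Lambert transmission through all listed components."""
    return incoming * math.exp(-sum(optical_depths.values()))

def radiative_effect(incoming, optical_depths, removed):
    """Flux with every component minus flux with `removed` components excluded."""
    kept = {name: tau for name, tau in optical_depths.items() if name not in removed}
    return transmitted_flux(incoming, optical_depths) - transmitted_flux(incoming, kept)

# Hypothetical optical depths for illustration only.
optical_depths = {"liquid": 0.5, "ice": 0.3, "snow": 0.2}
incoming = 340.0  # W m^-2, placeholder value

# Effect of all condensate together versus the sum of one-at-a-time effects.
total_effect = radiative_effect(incoming, optical_depths, set(optical_depths))
sum_of_parts = sum(
    radiative_effect(incoming, optical_depths, {name}) for name in optical_depths
)
```

Because the exponential is nonlinear, `sum_of_parts` differs from `total_effect` in this toy setting, which is the same qualitative behavior described for the component CREs.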

Clouds
Compared with EAMv1, EAMv1_CLUBB shows lower-magnitude top-of-atmosphere (TOA) net CREs due primarily to a reduction of liquid clouds in the shallow Cu regime. EAMv1_MP also produces lower-magnitude total shortwave and longwave CREs, but in this case the change is attributable to the reduction of CREs from both liquid and ice clouds caused by the stronger WBF process. EAMv1_SGV only marginally increases CREs, but EAMv1_ZM significantly enhances the CREs from liquid and ice clouds, though the convective CREs are significantly reduced in EAMv1_ZM because the convective cloud fraction is much lower as a result of reducing the deep convective cloud fraction parameter dp1. The CRE differences are consistent with the differences in cloud optical depth (τ_cld). In contrast, cloud fractions and cloud heights are relatively invariant between different configurations. EAMv1_MP reduces LWP, IWP, N_c, and N_i mostly at midlatitudes and high latitudes, and EAMv1_ZM increases them mostly in the tropics. The EAMv1P configuration combines all of the changes and produces a global mean net CRE (−24.28 W m⁻²) not very different from that in EAMv1 (−24.7 W m⁻²), but we emphasize that the spatial distribution of clouds is as important as global mean values because different cloud regimes may respond to perturbations differently. Figure 3 shows that the changes made in EAMv1_ZM increase the IWP significantly at most latitudes except the polar regions. This is likely due to the combination of reducing the convective autoconversion efficiency (by reducing c0_lnd and c0_ocn) and decreasing the ice particle size de-

Table 6. Global mean 10-year-averaged cloud properties of EAMv1, EAMv1_CLUBB, EAMv1_MP, EAMv1_SGV, EAMv1_ZM, EAMv1P, and satellite observations summarized in Stubenrauch et al. (2013) and Neubauer et al. (2019).
Relevant cloud properties listed here are TOA shortwave cloud radiative effects (SWCRE; unit = W m⁻²) and those of liquid clouds (SWCRE_liq), ice clouds (SWCRE_ice), snow (SWCRE_snow), and convective clouds (SWCRE_conv); TOA longwave cloud radiative effects (LWCRE; unit = W m⁻²) and those of liquid clouds (LWCRE_liq), ice clouds (LWCRE_ice), snow (LWCRE_snow), and convective clouds (LWCRE_conv); cloud fraction (unit = %) of the total column (F_cld,tot), below 700 hPa (F_cld,low), between 400 and 700 hPa (F_cld,med), and above 400 hPa (F_cld,hgh); optical depth of all clouds (τ_cld) and that of liquid clouds (τ_liq), ice clouds (τ_ice), snow (τ_snow), convective clouds (τ_conv), and all clouds below 700 hPa (τ_low) and above 400 hPa (τ_hgh); column-integrated total LWP (unit = g m⁻²) and IWP (unit = g m⁻²), N_c (unit = 10⁹ m⁻²) and N_i (unit = 10⁹ m⁻²); altitude of the top (Z_hgh,top; unit = km) and base (Z_hgh,base; unit = km) of clouds above 400 hPa; and altitude of the top (Z_low,top; unit = km) and base (Z_low,base; unit = km) of clouds below 700 hPa.

These changes in condensate also lead to a more realistic liquid condensate fraction (LCF) thermal dependence (Fig. 3d). Because of the general IWP increase in EAMv1_ZM, the meridional distribution of LCF is reduced as a result of changes made in ZM (Fig. 3c). Interestingly, the global mean atmospheric temperature at which ice and liquid each contribute 50 % of total condensate, T5050 (McCoy et al., 2015), in EAMv1 is about 240 K, which is significantly lower than observational estimates of 254-258 K (McCoy et al., 2016). While the CMIP5 models tend to freeze liquid condensates at higher temperatures (Cesana et al., 2015; Tan et al., 2016; McCoy et al., 2016), EAMv1 appears to have overcorrected this bias and produced excessive supercooled liquid at low temperatures. Consistent with Y. , EAMv1_MP increases the T5050.
Combining with changes introduced in EAMv1_ZM, EAMv1P produces a much more reasonable T5050 of 254 K, which is at the lower bound of the observational estimates. We note that even though Hu et al. (2010) provided an observationally derived LCF-T relationship based on the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) measurements (Winker et al., 2007), EAMv1 does not have the CALIPSO cloud-phase simulator (Cesana and Chepfer, 2013), and thus a fair comparison is not possible. Evaluating the model LCF-T relationship against satellite observations in a consistent way will be very useful and requires further investigation.
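The T5050 diagnostic discussed above is the temperature at which the liquid condensate fraction crosses 0.5. The sketch below shows one simple way to estimate it from a binned LCF-T relationship by linear interpolation. The function name and the sample values are hypothetical, and the binning and averaging used in the paper are not specified here, so this is an illustration of the definition rather than the authors' procedure.

```python
def t5050(temperatures_k, liquid_fractions):
    """Estimate the temperature (K) where the liquid condensate fraction equals 0.5.

    Inputs are parallel sequences sorted by increasing temperature. The first
    bracketing pair of points is linearly interpolated; None is returned when
    the fraction never crosses 0.5.
    """
    points = list(zip(temperatures_k, liquid_fractions))
    for (t_lo, f_lo), (t_hi, f_hi) in zip(points, points[1:]):
        crosses = (f_lo - 0.5) * (f_hi - 0.5) <= 0.0
        if crosses and f_lo != f_hi:
            return t_lo + (0.5 - f_lo) * (t_hi - t_lo) / (f_hi - f_lo)
    return None

# Hypothetical LCF-T bins for illustration only.
sample_temperatures = [230.0, 240.0, 250.0, 260.0, 270.0]
sample_fractions = [0.0, 0.2, 0.4, 0.8, 1.0]
sample_t5050 = t5050(sample_temperatures, sample_fractions)
```

A lower T5050 means liquid persists to colder temperatures, which is the sense in which the 240 K value in EAMv1 indicates excessive supercooled liquid.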
Differences in the simulated cloud phase can have important implications for aerosol-cloud interactions (ACI) because the physical processes regulating the interactions between aerosols and warm clouds and between aerosols and cold clouds are very different. The simulated cloud phase can also affect cloud feedbacks to warming. The ACI and cloud feedbacks will be discussed in Sect. 3.4 and 3.5.

Figure 4 illustrates how the most challenging TOA shortwave cloud radiative effect (SWCRE) biases in EAMv1 (Fig. 4a) are greatly remedied by the cumulative effects of our retuning (Fig. 4f), with intermediate subpanels decomposing the grouped parameter changes in ways that help illustrate how they are intended to address those biases independently and jointly. By enabling the variable skewness treatment in CLUBB and the adjustments that follow, the overly bright shallow Cu and the significant lack of Sc in EAMv1 are greatly improved in EAMv1_CLUBB. SWCRE associated with coastal Sc is increased by about 10-20 W m⁻² off the coast of California and by about 30-40 W m⁻² off the coast of Peru and Chile, while over the shallow Cu regime SWCRE is reduced by 20-30 W m⁻². The elevated Sk_w in the shallow Cu regime also reduces the cloud water removal timescale (not shown). In EAMv1_MP, tuning up the WBF process corrects the SWCRE bias at midlatitudes and high latitudes. With an increasing fraction of ice condensate, the cloud water removal timescale is reduced (not shown) because warm rain processes are less efficient in removing condensate than ice precipitation processes (Mülmenstädt et al., 2021). Adjustments to cloud droplet sedimentation and warm rain processes make moderate improvements to Sc. Changes to ice crystal sedimentation and sulfate aerosol size result in a significant reduction in tropical SWCRE, as upper-tropospheric clouds respond to these adjustments the most.
Furthermore, EAMv1_SGV increases clouds in areas where large-scale winds are weak and convection occurs frequently, including the TWP and Amazonia. Some effects on the eastern Pacific are also observed. EAMv1_ZM further increases cloudiness in the ITCZ, especially in the western and eastern Pacific. Cloudiness in the Southern Pacific Convergence Zone (SPCZ) is also improved. Setting c0_lnd and c0_ocn to lower values essentially slows down the convective autoconversion process, leading to longer water removal timescales in the tropics. Combining all the changes, EAMv1P shows improved cloud distribution with reduced biases in the tropics, subtropics, midlatitudes, and high latitudes, indicating that the changes discussed in Sect. 2 are appropriate.
Further evaluation using the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) cloud simulator (Chepfer et al., 2008) Fig. 4, EAMv1 shows overly bright clouds in trade cumulus regions and over the Southern Ocean, and EAMv1P reduces these biases. Figure 5 shows that EAMv1 produces fewer clouds in trade cumulus regions and more clouds over the Southern Ocean than GOCCP, and EAMv1P increases the bias in trade cumulus regions and does not change the Southern Ocean bias. This could indicate that the improvements to the TOA SWCRE over these regions are achieved by compensating errors between cloud fraction and cloud optical depth.
Given the importance of low clouds in Earth's radiation budget, we investigate the planetary boundary layer (PBL) properties in different model configurations to gain insights into the physical mechanisms associated with the parameter adjustments. Table 7 shows that the adjustments to CLUBB parameters affect the simulated PBL properties significantly (as expected). The adjustments to CLUBB parameters directly reduce $\overline{w'^2}_{925}$ and increase $Sk_{w925}$, but they also govern the turbulent mixing and cloud processes in the PBL, producing a complex set of overall impacts on the macroscale properties of the PBL. The weaker $\overline{w'^2}_{925}$ indicates a shallower PBL, reducing both the PBL decoupling strength (PBL_dcp), defined as the difference between cloud base height and lifting condensation level (LCL) (Jones et al., 2011), and the frequency of occurrence of decoupled PBL. The cloud-top entrainment rate for PBL clouds (w_e) is reduced as a result. It is interesting to note that the changes to the ZM deep-convection scheme can also reduce cloud-top entrainment, presumably through strengthening of the large-scale subsidence in the subtropics. On the other hand, higher $Sk_{w925}$ indicates that the model produces more asymmetric mixing and shallow Cu-like clouds. This matters for cloud feedback since an increase in Cu-like clouds and a decrease in Sc-like clouds can lead to weaker low-cloud feedback (Cesana et al., 2019), and results will be discussed further in Sect. 3.5. Finally, we find that the inverse relative variance of cloud water, which affects the enhancement factors of autoconversion, accretion, and immersion freezing (Morrison and Gettelman, 2008), is not sensitive to the parameter changes. Thus, there is a limited impact on these three processes from changes in subgrid in-cloud water variance.
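The decoupling strength defined above is a difference of two heights, and the decoupling frequency is the fraction of samples in which that difference is large. The sketch below makes both definitions concrete under stated assumptions: the LCL is estimated with Espy's rule of thumb (roughly 125 m per kelvin of dewpoint depression), which is a stand-in and not necessarily how the LCL was computed in the paper, and the 150 m threshold is a hypothetical default parameter.

```python
def lcl_height_espy(temperature_k, dewpoint_k):
    """Approximate LCL height (m) from near-surface temperature and dewpoint.

    Uses Espy's approximation (about 125 m per K of dewpoint depression);
    this is an assumption made for the sketch.
    """
    return 125.0 * (temperature_k - dewpoint_k)

def decoupling_strength(cloud_base_m, lcl_m):
    """PBL decoupling strength: cloud base height minus LCL height (m)."""
    return cloud_base_m - lcl_m

def decoupling_frequency(strengths_m, threshold_m=150.0):
    """Fraction of samples whose decoupling strength exceeds `threshold_m`.

    The default threshold is a hypothetical choice for illustration.
    """
    if not strengths_m:
        raise ValueError("at least one sample is required")
    return sum(1 for s in strengths_m if s > threshold_m) / len(strengths_m)

# Hypothetical sample: 4 K dewpoint depression and a 900 m cloud base.
sample_lcl = lcl_height_espy(290.0, 286.0)
sample_strength = decoupling_strength(900.0, sample_lcl)
```

A well-mixed boundary layer has a cloud base close to the LCL, so the strength is near zero; larger values indicate that the cloud layer is separated from the surface-driven mixing below.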
Next, we compare the estimated inversion strength (EIS) (Wood and Bretherton, 2006) between model simulations and a reanalysis dataset. The EIS was computed following the CFMIP diagnostics code catalogue (Tsushima et al., 2017). EIS has traditionally been considered an important cloud-controlling factor affecting low clouds and low-cloud feedback (Myers et al., 2021). Figure 6 shows that EAMv1 generally underestimates EIS, except in the tropics. The revised model EAMv1P alleviates many of the biases, but some biases remain. EAMv1_CLUBB reduces the bias over land in general (except in northern Africa) as well as over the midlatitude and high-latitude ocean. EAMv1_MP shows significant differences in the polar regions, indicating that reducing supercooled liquid in the mixed-phase cloud regime can change polar PBL properties. EAMv1_SGV enhances the EIS as a result of convection invigoration. Similarly, EAMv1_ZM directly reduces the bias in the tropics and produces enhanced EIS in midlatitudes and high latitudes through large-scale circulation responses.

Figure 7 shows that changes in EAMv1_CLUBB also significantly reduce the PBL decoupling strength (Jones et al., 2011). The decoupled PBL is often a sign that the PBL grows too deep, and thus the negative buoyancy at the top of the PBL is insufficient to mix through the sub-cloud layer (Wood, 2012). These conditions favor the transition from Sc to shallow Cu (Wood, 2012; Xiao et al., 2011), reducing the overall cloudiness and contributing to the lack of Sc in EAMv1. This long-standing regional cloud bias is primarily alleviated by adjustments to CLUBB parameters, particularly the increases in C1 and C1b that reduce $\overline{w'^2}$. Furthermore, EAMv1_SGV also reduces the PBL decoupling strength over tropical land and subtropical and midlatitude ocean, likely due to the enhanced surface flux that moistens the PBL. The recalibrated model EAMv1P shows the collective effect of a significant reduction in decoupling strength (Fig. 7) and frequency (not shown).
We also diagnose the cloud-top entrainment efficiency (Bretherton et al., 2007) in different model configurations to further clarify the physical mechanisms associated with the parameter adjustments. Cloud-top entrainment efficiency is defined as $A = w_e \, \Delta b \, z_i / w_*^3$, where $w_e$ is the entrainment rate computed by differencing the resolved vertical motion and the change in inversion height ($z_i$), $\Delta b$ is the virtual potential temperature jump scaled into a buoyancy jump where the reference virtual potential temperature $\theta_{\mathrm{ref}}$ is 300 K, and $w_*$ is the convective velocity, $w_* = \left(2.5 \int_0^{z_i} \overline{w'b'} \, \mathrm{d}z\right)^{1/3}$, that measures the buoyancy integrated over the boundary layer, where $b'$ is the buoyancy perturbation and $\overline{w'b'}$ is the buoyancy flux. Figure 8 shows that the largest differences are again a result of changes made in CLUBB. As $\overline{w'^2}$ is reduced, EAMv1_CLUBB produces a shallower PBL consistent with a reduced cloud-top entrainment efficiency. In EAMv1_MP, the enhancement of liquid and ice sedimentation also reduces entrainment efficiency (Bretherton et al., 2007). EAMv1_SGV generally enhances the surface fluxes and produces a deeper and relatively less stable PBL, leading to enhanced mixing between the PBL and the free troposphere.
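The entrainment efficiency defined above can be evaluated from a buoyancy flux profile with a few lines of arithmetic. The sketch below is an illustration of the formula rather than the model diagnostic: it assumes the buoyancy jump is obtained as g Δθ_v / θ_ref with θ_ref = 300 K, integrates the buoyancy flux with the trapezoidal rule, and uses hypothetical function names and input values.

```python
GRAVITY = 9.81  # m s^-2

def buoyancy_jump(delta_theta_v_k, theta_ref_k=300.0):
    """Scale a virtual potential temperature jump (K) into a buoyancy jump (m s^-2)."""
    return GRAVITY * delta_theta_v_k / theta_ref_k

def convective_velocity(heights_m, buoyancy_flux):
    """w* = (2.5 * integral of buoyancy flux from the surface to z_i)^(1/3).

    `heights_m` runs from 0 to z_i; the integral uses the trapezoidal rule.
    """
    integral = sum(
        0.5 * (f_lo + f_hi) * (z_hi - z_lo)
        for z_lo, z_hi, f_lo, f_hi in zip(
            heights_m, heights_m[1:], buoyancy_flux, buoyancy_flux[1:]
        )
    )
    if integral <= 0.0:
        raise ValueError("integrated buoyancy flux must be positive")
    return (2.5 * integral) ** (1.0 / 3.0)

def entrainment_efficiency(entrainment_rate, delta_b, inversion_height_m, w_star):
    """A = w_e * delta_b * z_i / w*^3 (dimensionless)."""
    return entrainment_rate * delta_b * inversion_height_m / w_star**3

# Hypothetical inputs: constant buoyancy flux of 1e-4 m^2 s^-3 below a 1000 m inversion.
heights = [0.0, 250.0, 500.0, 750.0, 1000.0]
flux = [1.0e-4] * len(heights)
w_star = convective_velocity(heights, flux)
efficiency = entrainment_efficiency(0.005, buoyancy_jump(6.0), heights[-1], w_star)
```

With a constant buoyancy flux B, the inversion height cancels and the efficiency reduces to w_e Δb / (2.5 B), which gives a quick sanity check on any implementation.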
The changes in PBL decoupling strength and cloud-top entrainment efficiency shown in Figs. 7 and 8 are consistent with expectations and affirm our understanding of the physical mechanisms connecting the parameter adjustments, CREs, and PBL properties, even though they are not directly controlled by any tunable parameters. Unfortunately, currently there is no global observational estimate for decoupling frequency and cloud-top entrainment efficiency, and thus we cannot assert that the recalibration improves these physical mechanisms when taken alone. However, put together they constitute a reassuring sign that relevant metrics of macroscale low-cloud dynamics are associated with desired changes in TOA SWCRE in logical ways. Future studies that derive decoupling frequency and cloud-top entrainment efficiency, as well as other important cloud-controlling factors, from field campaign measurements for evaluating models in particular regions and time periods would be highly valuable.

Table 8 shows the global mean precipitation characteristics. We find that adjustments to the ZM scheme (e.g., reducing the convective autoconversion efficiency and convective cloud fraction) lead to a reduction of convective precipitation (PRECC) and an increase in large-scale precipitation (PRECL). Here convective precipitation refers to the precipitation produced by the ZM deep-convection parameterization and large-scale precipitation refers to the precipitation produced by the MG2 cloud microphysics parameterization. While EAMv1 produces more convective precipitation than large-scale precipitation, the revised model EAMv1P corrects this bias so that the model is in better agreement with observational estimates (Yang et al., 2013). The shift from convective to large-scale precipitation is expected to improve precipitation characteristics (Yang et al., 2013) because more detailed cloud microphysics processes are considered for large-scale clouds.

Precipitation
Nevertheless, the common bias in ESMs of producing frequent drizzle and light precipitation is pronounced in EAMv1, and adjustments of parameters have only a marginal impact. This suggests that the precipitation PDF bias is not related to parametric uncertainty and is perhaps attributable to the model's structural deficiencies, such as issues with the trigger and closure in its deep-convection scheme or the coarse resolution, which is insufficient to simulate strong moisture convergence or the dependency of precipitation formation on unresolved mesoscale forcing. Such an interpretation is consistent with many intercomparisons between super-parameterized and conventionally parameterized versions of the Community Earth System Model (CESM) (Kooperman et al., 2016) that have sampled different structural formulations for rainfall production. Recent studies indicated that using an improved convective trigger or incorporating a stochastic convection scheme into ZM can also help address the "too-frequent-too-weak" precipitation biases in EAMv1. Lastly, we show that the fraction of large-scale precipitation produced by autoconversion (R_auto in Table 8) in EAMv1 is already much lower than in its predecessor model CAM5 even at 0.25° horizontal grid spacing (Ma et al., 2015), and the changes in EAMv1_MP further reduce the autoconversion fraction. This change will affect

The cloud-top entrainment efficiency (Bretherton et al., 2007) is computed at every cloud physics time step (dt = 5 min) of the model.

Table 8. Global mean 10-year-averaged precipitation fields of EAMv1, EAMv1_CLUBB, EAMv1_MP, EAMv1_SGV, EAMv1_ZM, and EAMv1P.
Relevant precipitation variables listed here are total, convective, and large-scale precipitation rates (PRECT, PRECC, and PRECL, respectively; unit = mm d⁻¹); ratio of deep convective precipitation to total precipitation (R_conv); frequency of occurrence (unit = %) of no precipitation (FREQ_dry), drizzle with precipitation rates less than 0.5 mm d⁻¹ (FREQ_drizzle), light precipitation with precipitation rates between 0.5 and 8 mm d⁻¹ (FREQ_light), moderate precipitation with precipitation rates between 8 and 80 mm d⁻¹ (FREQ_moderate), and heavy precipitation with precipitation rates exceeding 80 mm d⁻¹ (FREQ_heavy); and ratio of autoconversion to total precipitation (R_auto).
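The precipitation frequency categories in the caption above amount to binning precipitation rates by fixed thresholds. The sketch below shows that binning under stated assumptions: "no precipitation" is taken as a rate of zero or less, bin edges are assigned to the higher category except at 80 mm d⁻¹, and the function names and sample rates are hypothetical. The paper does not state its edge conventions, so these choices are illustrative.

```python
from collections import Counter

CATEGORIES = ("dry", "drizzle", "light", "moderate", "heavy")

def classify_precipitation(rate_mm_per_day):
    """Assign a precipitation rate (mm d^-1) to one of the five categories."""
    if rate_mm_per_day <= 0.0:
        return "dry"       # assumed definition of "no precipitation"
    if rate_mm_per_day < 0.5:
        return "drizzle"
    if rate_mm_per_day < 8.0:
        return "light"
    if rate_mm_per_day <= 80.0:
        return "moderate"
    return "heavy"         # rates exceeding 80 mm d^-1

def frequency_of_occurrence(rates_mm_per_day):
    """Return the percentage of samples falling in each category."""
    if not rates_mm_per_day:
        raise ValueError("at least one sample is required")
    counts = Counter(classify_precipitation(r) for r in rates_mm_per_day)
    total = len(rates_mm_per_day)
    return {name: 100.0 * counts[name] / total for name in CATEGORIES}

# Hypothetical sample of rates for illustration only.
sample_rates = [0.0, 0.1, 0.3, 2.0, 7.9, 8.0, 50.0, 80.0, 120.0, 0.0]
sample_frequencies = frequency_of_occurrence(sample_rates)
```

A "too-frequent-too-weak" bias would appear here as large drizzle and light percentages together with a small dry percentage.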

(Posselt and Lohmann, 2009; Wang et al., 2012; Gettelman et al., 2013). As discussed in Rasch et al. (2019) and shown in Fig. 9, EAMv1 produces high annual mean precipitation over the globe, over high elevations, over the Maritime Continent, and in the central Pacific but low annual mean precipitation over Amazonia and the oceanic TWP. With an improved cloud distribution, we find the precipitation simulation improves as well. Figure 9 shows that tropical precipitation is greatly improved. EAMv1_SGV enhances precipitation in the TWP, eastern Pacific, and Amazonia, whereas EAMv1_CLUBB and EAMv1_ZM reduce precipitation in the central Pacific and western Indian Ocean while increasing precipitation in the SPCZ. This suggests that the displaced Walker circulation in EAMv1 is significantly improved in the recalibrated model. EAMv1_SGV also reduces the precipitation bias over high-elevation regions such as the Andes and Himalayas (likely through a non-local circulation response). We also find an unexpected improvement from the ZM changes: a reduction of the double ITCZ bias. While the physical mechanism remains unclear and requires further investigation, our results corroborate the finding of Song and Zhang (2018) that the double ITCZ bias is sensitive to adjustments in the deep-convection parameterization, which affects the tropical clouds (and energy budget) and precipitation directly and the large-scale circulations indirectly.
In summary, the recalibrated model with improved clouds also produces a more realistic present-day precipitation climatology. Pronounced precipitation biases in the tropics, over land, and over high elevations are significantly reduced. The improved realism of the precipitation distribution is consistent with the improved cloud distribution. These improvements lead to a more realistic atmospheric circulation and positive impacts on other aspects of the simulated atmosphere. The remaining biases in tropical clouds and precipitation could be related to the coarse model resolution, which fails to resolve islands, narrow mountain ranges, mesoscale convection, and small-scale meteorological fields (Wang et al., 2018), and to a deficiency in representing the triggering of deep convection. The lack of a representation of ice clouds in CLUBB can also contribute to the remaining biases in midlatitudes and high latitudes (Zhang et al., 2020).

Other aspects of the present-day climate
Our recalibration is governed by an understanding of the physical mechanisms present in the atmosphere and their representation in parameterizations. Our effort has focused on improving the CREs across cloud regimes. Improvements to clouds and precipitation have been accomplished that are consistent with our expectations, but an evaluation of other aspects of the simulated present-day climate is essential. While the possibility of compensating biases always exists, our confidence in the underlying physics in the model will be increased if many other aspects are also improved. Otherwise, we are forced to suspect that the model achieves its behavior primarily through compensating biases.
Near-surface air temperature is an important state variable for validating the fidelity of ESMs. Both dynamical and physical processes affect the temperature field, and thus an appropriate balance between these processes is essential for producing a realistic simulation of present-day conditions. Therefore, a realistic near-surface air temperature can also be viewed as a minimum requirement for providing some confidence in projections of future climate. However, like many weather and climate models (Morcrette et al., 2018), EAMv1 produces significant near-surface air temperature biases. The Northern Hemisphere (NH) high latitudes exhibit a 1-5 K warm bias, and there are cold biases in other places (Fig. 10). The warm high-latitude bias and the cold tropical bias produce a weaker Equator-to-pole temperature gradient, which can cause errors in midlatitude baroclinicity, storm tracks, and large-scale circulations. It can also lead to excessive melting of sea ice and land ice, which has adverse impacts on ocean circulation. Figure 10 shows that the parameter adjustments that aim to improve CREs generally improve the near-surface temperature, and the changes in EAMv1_MP lead to the largest improvements. This suggests that the liquid cloud bias in EAMv1 due to the underactive WBF process, coupled with the CNT-based IN parameterization, may be responsible for the near-surface temperature bias. Stronger liquid-to-ice conversion improves the CREs and subsequently affects the near-surface temperature, which will further impact circulations and affect other aspects of the Earth's climate.
Surface winds affect the physical climate and the biogeochemical cycle in a variety of ways. In EAMv1, surface winds affect surface fluxes of heat, moisture, and momentum, which influence the thermodynamic properties in the PBL but also more generally affect atmospheric energy and water cycles. The emissions of sea spray aerosols and mineral dust are a function of surface winds. Over the ocean, surface winds drive the ocean surface currents and influence the mixed-layer depth, heat budget, and carbon uptake in the ocean. Figure 11 shows that surface winds in EAMv1 are significantly stronger than those in the MERRA-2 reanalysis, especially in the Southern Ocean and North Atlantic. In the tropical Pacific Ocean, the trade easterlies are too strong, which pushes the cold tongue into the Indo-Pacific warm pool. The wind direction biases are reduced in EAMv1_SGV when the gustiness parameterization is enabled, such that the subgrid winds are accounted for in surface flux calculations. EAMv1_ZM also shows some minor improvements in the TWP. Combining all the model changes, the revised model EAMv1P shows significant improvements in surface winds in many parts of the tropics, North Atlantic, and Southern Ocean. In the fully coupled E3SM, these improvements may lead to more realistic ocean circulations as well as ocean-atmosphere exchange of heat, moisture, momentum, trace gases, and aerosols.

Although our recalibration is only targeted to improve CRE features (Fig. 4), those changes can affect aerosols as well because cloud processing is an important sink in the aerosol life cycle. Figure 12 shows that the changes in EAMv1_MP and EAMv1_ZM increase the aerosol loading, while EAMv1_SGV produces lower aerosol loading. The changes in aerosol loading are partially due to the changes in wet scavenging. In EAMv1_MP, the reduction of supercooled liquid water path increases aerosol loading in midlatitudes and high latitudes because liquid clouds remove aerosols efficiently.
EAMv1_SGV enhances the surface moisture flux, which also increases wet scavenging, and the weakened convective autoconversion in EAMv1_ZM reduces the wet removal of aerosols. We also find that the revisions have reduced dust emissions over the Sahara because of the weakened turbulence in EAMv1_CLUBB. Collectively, the recalibrated model EAMv1P reduces the aerosol optical depth (τ_aer) biases in the NH midlatitudes and high latitudes, in the tropics, and over land in general. There are, however, remaining τ_aer biases in the subtropics, eastern Pacific, eastern Atlantic, and Southern Ocean.
In addition to improvements in near-surface temperature, surface winds, and column-integrated aerosols, we observe improvements to sea level pressure (SLP) and to temperature and wind fields in the recalibrated model EAMv1P (Fig. 13). While EAMv1_CLUBB and EAMv1_MP do not produce different results from EAMv1, we find that the meridional winds at 850 and 500 hPa (coded as numbers 4 and 7) in EAMv1_SGV and EAMv1_ZM are in better agreement with ERA5, as their normalized standard deviation is reduced. Many other aspects of the climate are carefully evaluated using E3SM standard diagnostics (https://portal.nersc.gov/project/e3sm/beharrop/EAMv1P/, last access: 14 February 2022). We find that the recalibrated model shows improvements in most aspects of the simulated present-day climate (despite the fact that they were not tuning targets) and low or no degradation in others. We conclude that when improvements in simulating clouds across regimes are achieved by applying adjustments based on an understanding of the physical mechanisms, those changes are manifested by a more realistic simulation of many features of the global atmosphere. Because the correct response of the nonlinear climate system depends on both a realistic base state and realistic process representations, the improved realism in the recalibrated model EAMv1P provides greater confidence in estimating the responses of the climate system to anthropogenic forcings and ultimately the ECS.

Figure 12. Clear-sky aerosol optical depth (τ_aer). The MODIS onboard Aqua τ_aer data product (Levy et al., 2013) is used for comparison with EAMv1. Model clear-sky τ_aer is sampled at 13:30 LT.

Responses to anthropogenic aerosols
The role of aerosols in the climate system is a major uncertainty in projections of Earth's future climate and in interpreting how the climate has been forced over recent decades. The uncertainty has been attributed to both a lack of understanding of aerosol emissions in pre-industrial times (Carslaw et al., 2013) and uncertainties associated with modeling aerosol and cloud processes (Regayre et al., 2018; Yoshioka et al., 2019). E3SMv1 produces notable biases in the historical evolution of surface temperature due to a combination of high ECS (from cloud feedback) and strong aerosol forcing, both of which are likely to be too large. In this section, we assess the cloud and precipitation responses to anthropogenic aerosols in the recalibrated model, where processes influencing aerosols and clouds operate differently from EAMv1 and the simulated present-day atmosphere is more realistic than that in EAMv1. Our goal is to understand the impacts and the physical mechanisms of the parameter adjustments on cloud and precipitation responses to aerosols. The effects of anthropogenic aerosols are assessed by differencing paired simulations where one uses the present-day aerosol emissions and the other uses the pre-industrial aerosol emissions (see Sect. 2 for the experiment design). Table 9 shows that the global mean net total ERF_ant in EAMv1 is quite low compared to CMIP5 and other previous-generation models (Kiehl, 2007). This is mostly attributed to the aerosol ERF (ERF_aer). EAMv1_MP increases ERF_ant, but other parameter adjustments lower ERF_ant so that the recalibrated model EAMv1P produces about the same ERF_ant. ERF_aer comprises the ERF associated with aerosol-radiation interactions (ERF_ari), aerosol-cloud interactions (ERF_aci), and aerosol-induced surface albedo changes.
The ERF_aer is computed by differencing the all-sky TOA radiative flux between paired fixed-SST simulations with present-day and pre-industrial aerosol emissions (Hansen et al., 2005), which is referred to as ERF_fSST in Forster et al. (2016). ERF_aci is defined as the clean-sky TOA CRE difference (Ghan, 2013). Note that the Ghan (2013) method removes the direct radiative effect of the anthropogenic aerosols on CREs, producing stronger ERF_aci (−1.48 W m⁻²) compared to the Boucher et al. (2013) method (Wang et al., 2020), which assumes that ERF_aci is the residual between ERF_ari+aci and ERF_ari. EAMv1 produces slightly weaker net ERF_aer (−1.42 W m⁻²) and ERF_aci (−1.48 W m⁻²) than its predecessor CAM5's −1.47 and −1.53 W m⁻², respectively. EAMv1's ERF_aer falls within the 68 % confidence range of −1.6 to −0.6 W m⁻² (where the 90 % confidence range is between −2.0 and −0.4 W m⁻²) estimated recently by considering various lines of evidence including models, observations, theories, energy balance requirements, and observed temperature constraints (Bellouin et al., 2020).
Collectively, the net ERF aci and ERF aer in EAMv1P remain about the same as EAMv1, but EAMv1P produces significantly weaker ERF aci,sw and ERF aci,lw. These are due to competing effects of our microphysical versus deep convective recalibrations. Our microphysical tunings in EAMv1_MP significantly weaken ERF aci for two reasons. First, EAMv1_MP reduces supercooled liquid clouds in the NH storm track by strengthening the WBF process, which weakens the ERF aci due to aerosol effects on liquid clouds. Second, EAMv1_MP reduces the sulfate aerosols participating in homogeneous ice nucleation, an expected consequence of having increased the size threshold of sulfate aerosols. Since ERF aci is mostly attributed to aerosol effects on liquid clouds in EAMv1, reducing the amount of baseline liquid clouds reduces ERF aci. Conversely, our tunings of the deep-convection scheme in EAMv1_ZM enhance ERF aci. Since the ZM scheme does not consider detailed cloud microphysical processes, this enhancement is likely due to the overall increase in cloudiness as shown in Fig. 1.
Figure 13. Taylor diagram (Taylor, 2001) comparing sea level pressure, temperature, and winds in EAMv1, EAMv1_CLUBB, EAMv1_MP, EAMv1_SGV, EAMv1_ZM, and EAMv1P with the ERA5 reanalysis.
Table 9. Global mean 10-year-averaged total ERF ant derived from paired simulations with present-day and pre-industrial forcings. Shortwave, longwave, and net ERF aer; shortwave, longwave, and net ERF aci (unit = W m −2); and the differences in total precipitation rate (PRECT; unit = mm d −1), land surface temperature (Ts; unit = K), and aerosol optical depth (τ aer) between paired simulations with present-day and pre-industrial aerosol emissions are also given.
Both longwave and shortwave radiation affect surface temperature and atmospheric cooling rates, which govern the hydrological cycle. Because ERF aer is reduced for both shortwave and longwave in EAMv1P, the recalibrated model shows reduced aerosol-induced responses in precipitation and land surface temperature (Table 9), even though the net ERF aer is about the same. Furthermore, the τ aer difference between the paired simulations with present-day and pre-industrial aerosol emissions (Δτ aer) in EAMv1P agrees much better with estimates from model ensembles (Watson-Parris et al., 2020) and from an estimate based on a combination of models and observations (Kinne et al., 2006) than that in EAMv1. Because Δτ aer is significantly larger in EAMv1P than EAMv1, whereas ERF aci values in the two model configurations are similar, the sensitivity of CREs to aerosol perturbations (i.e., the change in CRE per unit aerosol perturbation) is lower in EAMv1P. Figure 14 shows that the recalibration leads to a smaller magnitude of both positive and negative ERF aci in most places. The aerosol-induced strong warming in the Arctic and strong cooling in the NH storm track, East Asia, and North America are reduced, indicating a weaker local CRE response to aerosols in EAMv1P. EAMv1_MP again produces the most significant reduction, which we attribute to the more effective WBF process that reduces the supercooled liquid clouds. Other changes introduced in EAMv1_MP may also contribute to the weaker ERF aci in East Asia, the Northeast Pacific, and North America, including (1) enhancing the sedimentation of ice and liquid cloud droplets, (2) reducing the sulfate aerosols available for homogeneous ice nucleation, and (3) reducing the minimum subgrid vertical velocity used for liquid droplet nucleation.
Regional exceptions with enhanced ERF aci magnitude also occur and are noteworthy in the subtropical stratocumulus regions off the Peruvian and Namibian coasts, where our recalibration has increased the amount of low cloud available to participate in aerosol-induced brightening.
Of particular note regarding model calibration against historical temperature changes is the aerosol-induced change in land surface temperature. In Fig. 15, we show that the strong influence of aerosols on surface temperature in EAMv1 is encouragingly reduced by each of our incremental recalibrations. Despite the fact that the global mean ERF aer remains the same in EAMv1P, the temperature effects are muted. With the reduced sensitivity of surface temperature to aerosol perturbations, we speculate that these recalibrations might ameliorate the concerning signature of the unrealistically strong cooling in the 1950s in E3SMv1 if the cause of the bias is indeed the overly strong aerosol forcing as hypothesized. We also find that aerosols induce opposite land temperature changes over northeastern Eurasia and northwestern North America. This indicates that the surface temperature changes are not determined only by local energy balance. Other processes in the climate system, such as large-scale circulation changes, also play a role. Furthermore, an empirical relation has been shown to exist between the global mean ERF ant and ECS in climate models from both the CMIP3 and CMIP5 collections (Kiehl, 2007; Forster et al., 2013). The relationship between ERF ant and ECS exists because both values in models are sensitive to simulated clouds. Our tuning strategy specifically targets improving the representation of clouds, and it is worth asking whether these improvements uphold or alter the ERF ant-ECS relation. The small difference in ERF ant between the EAMv1 and EAMv1P configurations suggests the possibility of a similar small difference in ECS between these two configurations, and yet we find this is not the case (see Sect. 3.5). Table 10 shows that the aerosol-induced change in cloud fraction remains small in all model configurations.
For column-integrated condensate amount, consistent with muted cloud radiative responses to aerosol, EAMv1_MP significantly reduces the sensitivity of LWP and IWP to aerosols. EAMv1_ZM also reduces the IWP sensitivity. The droplet and ice number concentrations are highly sensitive to anthropogenic aerosols as expected, but EAMv1_MP significantly reduces the sensitivity of both N c and N i to aerosols, while EAMv1_ZM only reduces the sensitivity of N i to aerosol perturbations. By combining the present-day N c and Ni in Table 6 and the relative change in N c and N i due to anthropogenic aerosols in Table 10, we find that EAMv1_ZM produces higher N c and N i in the unperturbed pre-industrial environment than those in EAMv1. EAMv1_ZM also produces a larger N c increase (4.79 × 10 9 m −2 ) due to anthropogenic aerosols than EAMv1 (4.58 × 10 9 m −2 ), which is consistent with the larger ERF aci . Changes in cloud macrophysical and microphysical properties drive cloud optical property and radiative effect changes as well. EAMv1_MP reduces the sensitivity of τ liq , τ ice , and τ snow to aerosols, leading to lower sensitivity of CRE for corresponding hydrometeors to aerosol perturbations. EAMv1_ZM also reduces the sensitivity of τ ice and τ snow to aerosols and the corresponding CRE sensitivities. This is likely due to the reduction of the ice particle size detrained from deep convection, which increases N i in the unperturbed pre-industrial environment so that the ice clouds are less susceptible to aerosols. Finally, the revised model EAMv1P shows decreases in shortwave and longwave CRE responses.
In addition to damping condensate and radiative responses to aerosol loading, our recalibration also reduces the sensitivity of precipitation intensity statistics. In EAMv1, anthropogenic aerosols reduce the frequency of occurrence of light precipitation (< 2 mm d −1) across all large-scale dynamical regimes based on large-scale vertical velocity at 500 hPa, reduce light-to-moderate precipitation (< 80 mm d −1) in strong ascending regions (< −20 hPa d −1), and increase precipitation between 2.5 and 20 mm d −1 in general (Fig. 16). The parameter adjustments in EAMv1_MP, EAMv1_SGV, and EAMv1_ZM all lead to weakened precipitation response compared to EAMv1. As a consequence, cloud and precipitation processes become less sensitive to aerosol perturbations in the recalibrated model. In summary, the recalibration reduces the overall responses of CREs, surface temperature, and the hydrological cycle to aerosols. Evaluation of the hydrological cycle response to aerosols indicates that the total precipitation rate is influenced globally (Table 9), regionally (not shown), and in terms of large-scale precipitation frequency of occurrence (using a joint PDF; Fig. 16). However, the global mean ERF ant, ERF aer, and ERF aci remain about the same between the default model EAMv1 and the recalibrated model EAMv1P due to invariant effects of changes in N c, and due to compensations in shortwave and longwave effects that vary in the opposite direction. These analyses demonstrate that the global mean ERFs are insufficient for understanding or constraining the response of the hydrological cycle and surface temperature to aerosols. The shortwave and longwave contribution to the total aerosol ERF, as well as the spatial distribution of aerosol ERF, need to be considered to understand how aerosols affect the Earth system. Furthermore, the unperturbed base state climate can play a role as well. As shown in Fig. 10, the recalibrated model reduces the surface temperature bias significantly, which can lead to a more realistic response of surface temperature to forcings.
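The joint frequency-of-occurrence analysis of precipitation intensity by dynamical regime (Fig. 16) can be illustrated with a small sketch. It is only a schematic of the binning idea, not the procedure used to produce the figure: the function name is hypothetical, the bin edges reuse the 2, 20, and 80 mm d −1 and −20 hPa d −1 thresholds mentioned in the text but are otherwise arbitrary, and the input samples are synthetic random numbers rather than model output.

```python
import numpy as np


def joint_frequency(precip, omega500, precip_edges, omega_edges):
    """Fraction of all samples falling in each (omega500, precip) bin.

    Samples outside the outermost edges are not counted, so the table sums
    to at most 1.
    """
    counts, _, _ = np.histogram2d(
        np.ravel(omega500), np.ravel(precip), bins=[omega_edges, precip_edges]
    )
    return counts / np.size(precip)


# Assumed bin edges: precipitation in mm d-1, vertical velocity in hPa d-1.
precip_edges = np.array([0.0, 2.0, 20.0, 80.0, 1000.0])
omega_edges = np.array([-500.0, -20.0, 0.0, 20.0, 500.0])

# Synthetic samples standing in for paired pre-industrial (pi) and
# present-day (pd) simulations; the distributions are arbitrary.
rng = np.random.default_rng(0)
omega = rng.normal(0.0, 30.0, size=10000)
precip_pi = rng.gamma(0.5, 4.0, size=10000)
precip_pd = rng.gamma(0.5, 4.2, size=10000)

freq_pi = joint_frequency(precip_pi, omega, precip_edges, omega_edges)
freq_pd = joint_frequency(precip_pd, omega, precip_edges, omega_edges)

# Aerosol-induced change in frequency of occurrence per bin.
response = freq_pd - freq_pi
```

Because each table is normalized by the total sample count, the entries of `response` sum to zero when all samples fall inside the bin range: a reduced frequency in one intensity bin must appear as an increase elsewhere.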

Response to surface warming
The response of the Earth system to surface warming is of great scientific and societal importance. ECS values in CMIP6 span a significantly wider range (1.8 to 5.6 K) than in CMIP5 and observationally constrained estimates (Sherwood et al., 2020), and their substantially higher multi-model mean value has been attributed to the same causes identified in E3SMv1: strong positive cloud feedbacks (Zelinka et al., 2020). In this section, we discuss the impacts of parameter adjustments on cloud and other climate feedbacks. The feedbacks are assessed using the Cess methodology (Cess et al., 1989) by contrasting the difference between a control preindustrial simulation and a perturbed simulation with SST elevated by 4 K globally (see Sect. 2 for the experiment design). Figure 17 shows that EAMv1's total climate feedback of −1.51 W m −2 K −1 is weaker than the CMIP5 multi-model mean (−1.6 W m −2 K −1), but it is within the inter-model spread of −1.05 to −1.95 W m −2 K −1 (Ringer et al., 2014). The less negative feedback suggests a faster warming in the late 20th century and a higher ECS, consistent with the findings in Golaz et al. (2019). EAMv1_CLUBB and EAMv1_MP produce stronger global mean feedback, which will lead to lower ECS and weaker warming in the 20th century, while EAMv1_ZM produces positive feedback in the tropics. The recalibrated model EAMv1P produces a stronger climate feedback of −1.74 W m −2 K −1, a 15 % increase from EAMv1, and thus it can be expected to have a lower ECS. In Fig. 18, climate feedbacks diagnosed using the Pendergrass et al. (2018) radiative kernel reveal that the noncloud feedbacks are invariant across different model configurations and that the variation in total climate feedback is due solely to the spread in cloud feedbacks as a result of our parameter and subgrid adjustments. Further decomposing the cloud feedback into its total, shortwave, and longwave components via cloud radiative kernels (Zelinka et al., 2012a, b, 2013) indicates that cloud feedbacks are weakened from 0.77, 0.35, and 0.42 W m −2 K −1 in EAMv1 to 0.47 (−39 %), 0.20 (−43 %), and 0.27 W m −2 K −1 (−35 %) in EAMv1P. The stronger negative total climate feedback from the weakened positive cloud feedback suggests that the recalibration will produce a slower warming in the late 20th century and lower ECS. Figure 18b shows that EAMv1_CLUBB and EAMv1_MP both reduce the magnitude of shortwave cloud feedback. EAMv1_MP strengthens the negative shortwave cloud optical depth feedback, likely due to the reduction of mean-state supercooled liquid in mixed-phase clouds (by strengthening the WBF process). The weaker cloud feedback in EAMv1_CLUBB comes from the reduction of cloud amount feedback. This is likely due to the fact that EAMv1_CLUBB improves the simulation of shallow Cu. Because Sc cloud amount decreases more with warming than shallow Cu (Cesana et al., 2019; Cesana and Del Genio, 2021; Myers et al., 2021; Scott et al., 2020), producing shallow Cu rather than Sc reduces cloud amount feedback. In other words, EAMv1_CLUBB simulates a control-state climate with more Cu and less Sc than the default EAMv1, and thus the positive feedback from warming-induced reductions of low-cloud cover is weakened because Cu is more resilient to warming than Sc. In the meantime, EAMv1_CLUBB reduces the decoupling strength and cloud-top entrainment in the Sc regime, which can also reduce the cloud amount feedback.
Table 10. The same as Table 6 but instead showing the change in cloud properties induced by anthropogenic aerosols relative to their pre-industrial values (unit = %). Variables are defined in Table 6.
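As a rough illustration of the feedback arithmetic above, the sketch below computes a Cess-type feedback parameter and the relative changes between configurations. It assumes the feedback is the change in global mean net TOA flux divided by the change in global mean surface temperature; the text states only that SST is raised uniformly by 4 K, so this normalization, the function names, and the flux change of −6.04 W m −2 are illustrative assumptions. The feedback values themselves are taken from the text.

```python
def cess_feedback(delta_net_toa, delta_ts):
    """Feedback parameter (W m-2 K-1): net TOA flux change per kelvin of warming."""
    return delta_net_toa / delta_ts


def percent_change(new, old):
    """Change of `new` relative to `old`, in percent."""
    return 100.0 * (new - old) / abs(old)


# Hypothetical flux change chosen to reproduce EAMv1's quoted feedback.
illustrative_feedback = cess_feedback(-6.04, 4.0)

# Strengthening of the total feedback magnitude, EAMv1 -> EAMv1P.
total_change = percent_change(abs(-1.74), abs(-1.51))

# Weakening of the kernel-derived cloud feedbacks, EAMv1 -> EAMv1P.
cloud_changes = {
    "total": percent_change(0.47, 0.77),
    "shortwave": percent_change(0.20, 0.35),
    "longwave": percent_change(0.27, 0.42),
}
```

Recomputing percentages from values rounded to two decimals can differ from the published figures by about one percentage point, so small mismatches with the quoted changes are expected.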
Contrary to the effects introduced by EAMv1_CLUBB and EAMv1_MP, EAMv1_ZM enhances total cloud feedback. Figure 18b shows that EAMv1_ZM significantly reduces both shortwave and longwave cloud optical depth feedbacks and diminishes longwave cloud amount feedback. The large reduction in the negative shortwave cloud optical depth feedback results in a stronger positive total cloud feedback. This indicates that changes made in EAMv1_ZM, particularly (1) the reduction of the ice particle radius detrained from deep convection (ice_deep) and (2) the reduction of convective autoconversion (c0_ocn and c0_lnd), which make convective clouds and their anvils opaque in the present-day climate, result in a weaker sensitivity of CRE to surface warming. However, the physical mechanisms relating those tuning choices to cloud feedbacks remain unclear and require further investigation. Figure 19 shows that parameter adjustments affect cloud feedbacks in different geographical regions. The total cloud feedback appears to be a balance between cloud optical depth feedback and cloud amount feedback, as the cloud altitude feedback is insensitive to our adjustments in parameters and subgrid effects. In the tropics, the recalibrated model EAMv1P shows stronger positive total cloud feedback (Fig. 19a), which can be attributed to the enhanced cloud optical depth feedback introduced by EAMv1_SGV and EAMv1_ZM. This highlights the importance of realistic representation of cloud properties associated with deep convection, including both the deep convective clouds and the anvil detrained from deep convection. In the subtropics, EAMv1P produces weaker positive total cloud feedback due to the reduction of cloud amount feedback in EAMv1_CLUBB and EAMv1_ZM. EAMv1_CLUBB weakens turbulent mixing and increases the skewness Sk w in the shallow Cu regions to facilitate asymmetric vertical mixing that enhances shallow Cu rather than the symmetric vertical mixing that enhances Sc.
For this reason, a weaker positive cloud feedback is expected since Sc cloud amount decreases more with warming than shallow Cu (Cesana et al., 2019). EAMv1_ZM also reduces subtropical cloud amount feedback, likely through its impacts on circulation that affect subtropical subsidence and clouds. In midlatitudes and high latitudes, EAMv1_MP makes the largest contribution to modifying cloud feedbacks. Making the WBF process more efficient reduces supercooled liquid clouds in the mean state, which strengthens the negative cloud optical depth feedback through enhancing the negative cloud-phase feedback . We note that the high-latitude cloud optical depth feedback is highly uncertain. Sherwood et al. (2020) estimated the feedback to be near zero based on two studies, Ceppi et al. (2016) and Terai et al. (2016), which reported feedback estimates of similar magnitude but opposite signs. Hence, it remains unclear if the stronger negative cloud optical depth feedback in the Southern Ocean produced by EAMv1_MP and EAMv1P is closer to reality, but this essentially reduces the global total cloud feedback due to the sign reversal of the total cloud feedback in the Southern Ocean.
In Table 11, we find that cloud fraction changes induced by surface warming are insensitive to the recalibration. LWP increases as the surface warms. By making the WBF process more efficient, EAMv1_MP shows a greater LWP response to surface warming, which weakens the positive cloud feedback as discussed previously. Liquid and ice particle numbers N c and N i are both reduced with surface warming, and parameter adjustments in EAMv1_MP and EAMv1_ZM affect the sensitivity. In terms of radiative properties, we find that the recalibration reverses the sign of the response of τ liq to surface warming largely due to the changes made in EAMv1_MP, leading to cloud thickening instead of thinning in the lower troposphere (i.e., increasing τ low as the surface warms). In the upper troposphere, EAMv1_ZM reduces the τ hgh sensitivity to surface warming, which weakens the positive high cloud feedback. The modifications in EAMv1_ZM have the largest impact on the CRE response associated with ice clouds. Combining all of the changes, the revised model EAMv1P reverses the sign of the liquid CREs, likely due to the cloud-phase response to warming caused by increased IWP in the model.
In assessing the impact of parameter changes on ECS, we also computed the lower tropospheric mixing index (LTMI) (Sherwood et al., 2014) and found that the recalibration leads to a 10 % reduction in LTMI (not shown), which corresponds to about 1 K decrease in ECS based on the LTMI-ECS relationship from CMIP5. Most parameter adjustments do not alter LTMI. EAMv1_ZM produces lower LTMI because it reduces convective activity by weakening the convective autoconversion process to increase the cirrus cloud opacity that stabilizes the troposphere. However, because the statistical significance of the relationship between LTMI and ECS has decreased in CMIP6 compared to CMIP5 (Schlund et al., 2020), LTMI might not be a good predictor for ECS in E3SM.
Finally, we assess the impacts of our recalibration on the patterned response of precipitation to surface warming. In Sect. 3.2 we showed that the parameter changes in EAMv1_ZM significantly reduce the ratio of convective precipitation rate to total precipitation rate in the present-day climatology. This change alone can lead to different precipitation responses to surface warming because different precipitation mechanisms are employed between the convection and the microphysics parameterizations. Figure 20 shows enhanced convective precipitation with warming in the tropics, the SPCZ, and storm tracks in EAMv1. EAMv1_ZM significantly reduces the response, likely due to the reduced convective autoconversion efficiency. Other parameter adjustments also affect the response in the Indo-Pacific warm pool, but the parameter changes do not have a direct impact on convective precipitation, and thus the change in response might be caused by circulation feedbacks. In the recalibrated model EAMv1P, the convective precipitation response to surface warming is mostly reduced in the tropics. The global mean convective precipitation response is reduced by 0.013 mm d −1 K −1 (−24 %) compared to the response in EAMv1. The relative increase in convective precipitation due to surface warming, however, is only slightly reduced from 3.07 % K −1 in EAMv1 to 2.97 % K −1 in EAMv1P.
Figure 19. Zonal mean of (a) total cloud feedback, (b) cloud optical depth feedback, (c) cloud amount feedback, and (d) cloud altitude feedback.
Table 11. The same as Table 6 but showing the change in cloud properties induced by surface warming relative to their pre-industrial values (unit = % K −1). Variables are defined in Table 6.
The large-scale precipitation response in EAMv1 has a similar magnitude as the convective precipitation response, but the response is larger in the storm tracks and not as strong in the tropics (Fig. 21). EAMv1_ZM significantly enhances the response in the TWP because the parameter changes in EAMv1_ZM shift the precipitation from convective to large-scale so that the response comes from the large-scale precipitation. The recalibrated model EAMv1P enhances the large-scale precipitation response by 0.018 mm d −1 K −1 (+37 %) compared to EAMv1. The relative increase in large-scale precipitation due to surface warming is also increased from 3.17 % K −1 in EAMv1 to 4.11 % K −1 in EAMv1P.
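The absolute and relative precipitation responses quoted above can be reconciled with a short back-of-the-envelope calculation. The sketch assumes that the relative response (% K −1) is the absolute response divided by the control-state precipitation rate of the same configuration; under that assumption, the ratio of control-state rates between EAMv1P and EAMv1 follows from the quoted numbers alone. The function names are hypothetical, and the results are sensitive to the rounding of the published values.

```python
def response_ratio(pct_change):
    """Ratio (EAMv1P / EAMv1) of the absolute response, from its percent change."""
    return 1.0 + pct_change / 100.0


def implied_baseline_ratio(pct_change, rel_old, rel_new):
    """Implied ratio (EAMv1P / EAMv1) of the control-state precipitation rate.

    With relative response = absolute response / baseline, it follows that
    baseline_new / baseline_old = (abs_new / abs_old) * (rel_old / rel_new).
    """
    return response_ratio(pct_change) * rel_old / rel_new


# Convective: absolute response -24 %, relative response 3.07 -> 2.97 % K-1.
convective = implied_baseline_ratio(-24.0, 3.07, 2.97)

# Large-scale: absolute response +37 %, relative response 3.17 -> 4.11 % K-1.
large_scale = implied_baseline_ratio(37.0, 3.17, 4.11)
```

Under these assumptions the control-state convective precipitation in EAMv1P is roughly 0.79 times that in EAMv1, while the large-scale rate is roughly 1.06 times as large. This is in line with the shift from convective to large-scale precipitation attributed to EAMv1_ZM, and it explains why the relative convective response barely changes even though the absolute response drops by about a quarter.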
In summary, the recalibration enhances the negative climate feedback to surface warming by reducing the positive cloud feedback. The storm track, shallow Cu regions, and the Indo-Pacific warm pool are the regions where the cloud feedback is most sensitive to the parameter adjustments. The largest precipitation response is seen in the tropics, SPCZ, and storm tracks. The parameter adjustments in the ZM deep-convection parameterization produce the largest changes in the response. Because the default model EAMv1 and the recalibrated model EAMv1P produce different climate and cloud feedbacks, the two models are expected to produce different estimates of ECS, even though their ERF ant values are about the same. Our results are consistent with the findings of Smith et al. (2020) that the statistical relationship between the ERF aer and ECS established in Kiehl (2007) and Forster et al. (2013) is challenged by modern ESMs. Fully coupled model simulations are needed to test this hypothesis.

Summary and discussion
In this study, we have developed a new model configuration of EAMv1, named EAMv1P, using a model calibration strategy that focuses on calibrating CREs that can be reliably observed across cloud regimes and geographical regions. The recalibration was guided by our understanding of the physical mechanisms that relate biases to uncertain process assumptions and used ample iterations to buffer unintended consequences of interventions in individual regimes against those they interact with. The recalibrated model produces an encouragingly improved present-day cloud and precipitation climatology and reduced sensitivity to aerosol perturbation and surface warming. Below we summarize the changes and behavior of the intermediate model configurations.
-Incorporating the subgrid effects (EAMv1_SGV) was intended to increase cloudiness in regions where largescale winds are weak and yet convection occurs frequently (e.g., TWP and Amazon) by enhancing local surface fluxes of heat, moisture, and momentum in those regions. Compared to all other intermediate model configurations, EAMv1_SGV produces the largest impact in terms of reducing the tropical surface wind direction bias, which will likely reduce the cold tongue bias in the fully coupled E3SM. Introducing these subgrid effects also reduces precipitation biases over the TWP, Amazon, and high-elevation regions (e.g., the Himalayas and Andes). EAMv1_SGV produces only a moderately weakened surface temperature response and precipitation response to aerosol forcing compared to the default model EAMv1.
-Parameter adjustments in the ZM deep-convection parameterization (EAMv1_ZM) were intended to improve overall tropical cloud amounts by weakening the convective autoconversion and reducing detrained ice crystal radius. We find that these changes increase IWP globally. Furthermore, we find that EAMv1_ZM is the only model configuration that produces a stronger ERF aci and a stronger positive cloud feedback. The enhanced ERF aci is seen in East Asia, Europe, and Sc and shallow Cu regions. The increased cloud feedback is primarily due to the significant reduction of negative cloud optical depth feedback in the tropics.
-Parameter adjustments in the CLUBB parameterization (EAMv1_CLUBB) were introduced to improve the subtropical Sc, shallow Cu, and the Sc-to-Cu transition by making parameters a function of the skewness of subgrid vertical velocity Sk w . These changes lead to encouraging reductions in both the "too-dim stratocumulus" and "too-bright trade cumulus" biases in modern ESMs. We find that the changes also significantly reduce the precipitation bias over the central Pacific Ocean. The changes introduced in EAMv1_CLUBB do not affect ERF aci , but they lead to the largest reduction in the positive cloud amount feedback in the subtropics compared to other intermediate model configurations.
-Parameter adjustments in the MG2 microphysical parameterization (EAMv1_MP) were intended to (1) reduce the excessive supercooled cloud liquid in the midlatitudes and high latitudes by enhancing the WBF process, (2) reduce ice particle number by reducing the sulfate aerosol available for homogeneous ice nucleation, and (3) improve Sc by enhancing the droplet sedimentation rate. We find that these changes give the largest reduction in ERF aci in the midlatitudes and high latitudes, in areas under great anthropogenic influence (e.g., East Asia, North America), and in the subtropics. EAMv1_MP also produces the weakest total cloud feedback due to the stronger negative cloud optical depth feedback in the tropics, midlatitudes, and high latitudes. The significant enhancement of negative cloud optical depth feedback results in a reversal of the sign of the total cloud feedback in the Southern Ocean.
The revised model EAMv1P includes all of the incremental changes discussed above. We find that EAMv1P produces a much more realistic CRE distribution than EAMv1 by addressing multiple regime-specific cloud biases spanning the tropics, subtropics, midlatitudes, and high latitudes. This is achieved through the collective effects of our modest adjustments to the ZM deep-convection scheme and subgrid effects, CLUBB turbulence, and MG2 microphysics. The improved CRE distribution naturally leads to better geographic distribution of radiative energy at the TOA, which is essential for setting up a realistic atmospheric circulation that further improves the overall fidelity of the model atmospheric state. We have also compared results from grouped parameter changes to understand how process assumptions affect CRE and other aspects of the simulated atmosphere. We show that the recalibrated model produces more improvements than the sum of the improvements from the individual intermediate configurations, demonstrating the nonlinearity in the climate system and the necessity of combining all of the improvements that target different biases in different regimes. Further reducing the model biases by improving parameterizations, numerics, resolution, and calibration is an ongoing effort for the E3SM team. Incorporating process-oriented diagnostics in model development and calibration will be useful for ensuring that the model gets the right answer for the right reason.
Cloud, precipitation, and surface temperature responses to anthropogenic aerosols and greenhouse gases are major sources of uncertainty in the simulated climate of the past, present, and future. Since the climate system is nonlinear, realistic estimates of the system's response depend on a realistic base state. EAMv1's deficiencies in base state fidelity likely contribute to its biases in the historical surface temperature evolution and its high ECS. In contrast, the recalibrated model EAMv1P produces a much more realistic present-day base climate state, due to a better calibration of cloud properties and subgrid effects that improve the representation of physical mechanisms compared to EAMv1. Hence, the revised model EAMv1P is more likely to produce credible estimates of the climate system's response to external forcings and climate projections when running as part of the fully coupled E3SM.
We show that the sensitivity of clouds, precipitation, and surface temperature to anthropogenic aerosols is significantly lower in the recalibrated model than in the default model, suggesting the potential to improve the historical surface temperature evolution over E3SMv1, such as the potential to reduce the cold bias between the 1960s and 1980s. We find that the responses to anthropogenic aerosols are mostly affected by parameter adjustments in EAMv1_MP and EAMv1_ZM. To simulate historical surface temperature evolution accurately, future model development efforts should target these two parameterizations so that cloud microphysical and deep convective processes are better constrained to represent their real-world counterparts.
The recalibrated model EAMv1P also produces a weaker cloud feedback compared to the default model EAMv1, suggesting potential improvements to the surface temperature evolution, like slower warming after the 1980s and a lower ECS. Parameter adjustments in EAMv1_CLUBB, EAMv1_MP, and EAMv1_ZM significantly affect cloud feedbacks. Hence, to reduce the uncertainty in the predictions of future climate, subgrid cloud properties and process representations, including turbulent mixing, cloud macrophysics and microphysics, and deep convection, need to be better constrained. We find that EAMv1 and EAMv1P produce different surface temperature responses to anthropogenic aerosols and different cloud feedbacks (and, consequently, ECS) even though they produce the same global mean ERF. This suggests that the statistical relationships between the global mean ERF, cloud feedback, and ECS established in Kiehl (2007) and Forster et al. (2013) do not apply to current generation ESMs, as documented in Smith et al. (2020). This indicates that global mean ERF is not a good indicator of the historical and future climate change. Other factors such as the spectral composition (i.e., shortwave versus longwave) and spatial distribution of the ERF and cloud feedback, as well as the realism of the unperturbed base climate state, need to be considered. Identifying the process representations that affect only ERF, those that affect only cloud feedback, and those that affect both is an important step toward a better understanding of the evolution of the climate system.
It is natural to wonder if an equivalent or superior ESM calibration might have been achievable with less human effort or fewer computational resources via semi-automated machine learning (ML) methods that emulate or expand the workflow outlined in this paper. Indeed, emulating a complex model's parameter sensitivities following human-constructed trial simulations to aid model calibration and uncertainty quantification would be an intriguing possibility. Several recent studies have shown successful application of ML methods in model calibration (Cleary et al., 2021;Dunbar et al., 2021;Couvreux et al., 2021;Hourdin et al., 2021). In theory, reinforcement learning (RL) with an appropriately formulated agent-based optimization system could be guided via its loss function formulation with skill metrics that optimize for the same patterns and mean-state climate metrics that we prioritized in this study. In practice, however, this ML task faces a fundamental challenge that the cost of an individual agent-reward sample is performing multi-year climate simulations. The workflow outlined in this paper has the considerable advantage that experienced human experts make educated parameter interventions based on assessment of the simulation and discriminate the desired effects in a nuanced way that tolerates certain unintended consequences. It is not clear how available ML methods could be infused with analogous physical foresight to make similar decisions, and thus it is logical to expect that they would require more evaluation samples to succeed via brute force. Therefore, experimenting with clever strategies to increase reward density and to integrate physical knowledge from experts in the ML workflow would be a highly worthwhile long-term challenge.
Author contributions. PLM designed the study, performed simulations and analyses, and prepared the first draft of the manuscript. BEH, VEL, RBN, AG, HM, HaW, KZ, PAB, ZZ, HS, XL, JW, PMC, and PJR contributed to developing the tuning strategy and analysis. SAK, MDZ, and YZ performed the cloud feedback decomposition analysis and contributed to cloud evaluation. YQ, JHY, CRJ, MH, SLT, XueZ, WL, JQ, HC, MAB, JM, SH, QT, and JF contributed to the evaluation and analysis of results. BS implemented the code in E3SMv1. XubZ, LKB, and JDF contributed to assessment of subgrid effects. HuW and MAT contributed to the testing of model sensitivity to time-stepping and process coupling. MSP contributed to the analysis of results and the assessment of using machine learning methods for tuning. JCG, SX, and LRL contributed to interpreting results and the comparisons with other modeling studies. All authors contributed to the writing of the manuscript.
Competing interests. At least one of the (co-)authors is a member of the editorial board of Geoscientific Model Development. The peer-review process was guided by an independent editor, and the authors also have no other competing interests to declare.
Disclaimer. Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Acknowledgements. The model tuning work was supported as part of the Energy Exascale Earth System Model (E3SM) project (project no. 65814). The analyses of effective radiative forcing and aerosol-cloud interactions were supported as part of the Enabling Aerosol-cloud interactions at GLobal convection-permitting scalES (EAGLES) project (project no. 74358). The E3SM and EAGLES projects were sponsored by the U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research, Earth System Model Development (ESMD) program area. The cloud diagnostics and feedback analysis were supported as part of a cloud diagnostics project (project no. 66187) and Program for Climate Model Diagnosis & Intercomparison (PCMDI) (project no. SCW1453), respectively, both sponsored by the U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research, Regional and Global Model Analysis (RGMA) program area. The development and evaluation of gustiness effects over land were part of the Integrated Cloud, Land-Surface, and Aerosol System Study (ICLASS) science focus area (project no. 57131), sponsored by the U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research, Atmospheric System Research (ASR) program. This research used resources of the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility operated under contract no. DE-AC02-05CH11231. This research also used a high-performance computing cluster provided by the Office of Biological and Environmental Research, ESMD program area, and operated by the Laboratory Computing Resource Center at Argonne National Laboratory. This paper describes objective technical results and analysis. Any subjective views or opinions that might be expressed in the paper do not necessarily represent the views of the U.S. Department of Energy or the United States Government.
Financial support. This study was funded by the U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research, Earth System Model Development (ESMD) program area (project nos. 65814, 74358); Regional and Global Model Analysis (RGMA) program area (project nos. 66187, SCW1453); and Atmospheric System Research (ASR) program (project no. 57131). The Pacific Northwest National Laboratory is operated for the U.S. Department of Energy by the Battelle Memorial Institute under contract DE-AC05-76RL01830. Work at Lawrence Livermore National Laboratory was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under contract no. DE-AC52-07NA27344. Sandia National Laboratories is a multi-mission laboratory managed and operated by National Technology & Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract no. DE-NA0003525.
Review statement. This paper was edited by Axel Lauer and reviewed by Yuan Wang and two anonymous referees.