Better calibration of cloud parameterizations and subgrid effects increases the fidelity of the E3SM Atmosphere Model version 1
Bryce E. Harrop
Vincent E. Larson
Richard B. Neale
Andrew Gettelman
Hugh Morrison
Hailong Wang
Kai Zhang
Stephen A. Klein
Mark D. Zelinka
Yuying Zhang
Yun Qian
Jin-Ho Yoon
Christopher R. Jones
Meng Huang
Sheng-Lun Tai
Balwinder Singh
Peter A. Bogenschutz
Xue Zheng
Wuyin Lin
Johannes Quaas
Hélène Chepfer
Michael A. Brunke
Xubin Zeng
Johannes Mülmenstädt
Samson Hagos
Zhibo Zhang
Xiaohong Liu
Michael S. Pritchard
Jingyu Wang
Peter M. Caldwell
Jiwen Fan
Larry K. Berg
Jerome D. Fast
Mark A. Taylor
Jean-Christophe Golaz
Shaocheng Xie
Philip J. Rasch
L. Ruby Leung
Download
- Final revised paper (published on 07 Apr 2022)
- Preprint (discussion started on 13 Oct 2021)
Interactive discussion
Status: closed
RC1: 'Comment on gmd-2021-298', Anonymous Referee #1, 22 Nov 2021
This summarises (at some length) efforts to improve the calibration and evaluation of the atmospheric component of the E3SM coupled model. The procedure described seems extremely labor intensive and (frankly) somewhat arbitrary. Nonetheless, the results do show significant improvements and, curiously, a reduction in the implied climate sensitivity (assessed via Cess-type perturbations). This is publishable with only minor revisions (as outlined below) and perhaps some condensing to reduce length and repetition.
I have two questions that might add to some of the discussion. What is the prospect for automating some of these tests, using ML/AI for instance to reduce the burden and increase the area of phase space tested? I don't have huge confidence that the current procedure will lead to true (local) minima in errors, but I'd like to see this discussed here.
Secondly, there is a preprint related to ECS in CESM2 (a related model) that has pointed out some odd (possibly erroneous) coding related to ice nucleation in that model (CAM6). Does this have any relevance here? https://www.essoar.org/pdfjs/10.1002/essoar.10507790.1
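For illustration, a minimal sketch of the kind of automated search alluded to in the first question: a perturbed-parameter ensemble of short simulations scored by a cost metric and emulated with a Gaussian process, so that the parameter space can be searched densely. The parameter names, bounds, and toy cost function below are hypothetical stand-ins; in practice each evaluation would be a short (e.g. 5-day) EAM run scored against observations.

```python
import numpy as np
from scipy.stats import qmc
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)

# Hypothetical tuning parameters with illustrative bounds (not the authors' values).
names = ["autoconv_exponent", "entrainment_efficiency", "ice_fall_factor"]
lo = np.array([1.0, 0.1, 0.5])
hi = np.array([3.0, 1.0, 1.5])

def skill_cost(theta):
    """Placeholder for an expensive model evaluation (lower is better)."""
    target = np.array([2.0, 0.4, 1.0])            # arbitrary "best" point for the toy problem
    return np.sum(((theta - target) / (hi - lo)) ** 2) + 0.01 * rng.normal()

# Space-filling design in the unit cube; run the "model" at each member.
unit_design = qmc.LatinHypercube(d=3, seed=0).random(n=32)
design = qmc.scale(unit_design, lo, hi)
costs = np.array([skill_cost(x) for x in design])

# Emulate the cost surface and search it at negligible expense.
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-4, normalize_y=True)
gp.fit(unit_design, costs)
unit_cand = qmc.LatinHypercube(d=3, seed=1).random(n=5000)
mean, std = gp.predict(unit_cand, return_std=True)
best = qmc.scale(unit_cand[[np.argmin(mean - std)]], lo, hi)[0]   # optimistic (lower-confidence-bound) pick
print(dict(zip(names, np.round(best, 3))))
```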
Minor points:
line 40. "...precise knowledge of ... ERF is not enough". This is a strawman argument. Who has ever said that it was?
line 84. The comparison to other ESMs is irrelevant. It is the comparison to the constrained range from observations that matters (Sherwood et al, 2020).
line 114-115. Is there any evidence that the skill scores derived from a 5 day simulation are correlated with skill scores from a year or 10 year run? Presumably they are not being tested against the same observations?
line 120: "in hindsight"? Is this referring to the 5-day simulations with EAMv1, or the previous one-at-a-time approach?
line 125. There is a big gap between 5 days and 10 years. Is there any assessment of how useful different lengths of simulation might be? For instance 1 year might be a good compromise?
line 128. use the actual times (10 years and 5 days) rather than 'short' or 'long' - relative measures are not very specific.
line 145. "perfect" is too much to ask. But the point about non-uniqueness is important.
line 160. Why? The authors just spent two pages saying why this was not a good approach!
line 224. "might"? --> "will"
line 245. Has the length of these simulations been mentioned?
line 385. How long are these simulations? (line 400 suggests 11 years, but is that just for simulation #5?) In any case, move this up in the text.
line 420/Table 6. Add observed values (where available) for comparison (i.e. from CERES, or CALIPSO).
line 438-449. please compare with Cesana et al (2021, doi:10.1029/2021GL094876). The implementation of a CALIPSO simulator should indeed be a high priority. Without a realistic target for LCF this tuning will inevitably be haphazard, but I think it likely that the EAMv1P is more realistic.
figure 4. The authors should add the EAMv1P-CERES map as well, so that the improved version can be compared directly with panel a (also in Figures 5, 6, 9, 10, 11, and 12).
line 497. It is of course challenging, but I don't think the biggest challenge is the lack of observational data.
line 635. This will always be true - it is not a binary situation.
line 802. This is an odd argument. Who has ever claimed that ERF is sufficient to determine responses? It is precisely the opposite - the major uncertainty (since the Charney report!) has always been in the sensitivity.
line 818+. The comparison to the other models is fine, but the comparison should be with observationally constrained estimates - ie. Sherwood et al (2020), IPCC AR6 Chp. 7 etc.
line 1019. This has never been claimed.
Citation: https://doi.org/10.5194/gmd-2021-298-RC1
AC1: 'Reply on RC1', Po-Lun Ma, 31 Jan 2022
The comment was uploaded in the form of a supplement: https://gmd.copernicus.org/preprints/gmd-2021-298/gmd-2021-298-AC1-supplement.pdf
RC2: 'Comment on gmd-2021-298', Anonymous Referee #2, 23 Nov 2021
This manuscript describes the process of retuning version 1 of EAM, the atmospheric component of the E3SM climate model, focusing on parameters related to various cloud processes, and how the retuning impacted a range of quantities beyond the tuning targets, from present-day surface temperature to aerosol forcing (via cloud adjustments) to cloud feedbacks.
General comments
The manuscript does a good job of explaining the reasoning that led to the strategy used in retuning, which relies on the bet that improving the representation of clouds will lead to improvements across the board. It’s nice to see that the bet pays off.
There’s rather a lot of detail in section 2, describing which groups of parameters were tuned. There’s really a lot of detail in section 3, which explains how the retuned model behaves with respect to a wide range of emergent phenomena. There is so much detail, in fact, that the manuscript works much better as a description of what was done than it does as an explanation of what was learned. If the authors’ goal is to document the strategy and its impacts they have succeeded, but if they aim to influence the ways in which readers undertake or understand model tuning they would be well advised to bring their ideas into sharper focus. Sharpening the manuscript will almost certainly involve relegating material to appendices or supplemental material.
A large proportion of the very many figures are of the form of Figure 4: six (well-constructed) maps, one showing the difference of v1 against observations, four showing the change induced by each of the four sets of parameter changes, and one showing the aggregate change of the final retuning relative to the original. Readers are left to judge improvement by mentally subtracting the bias in the upper left plot from the change in the lower right plot. Could the lower right plot be revised to show, for example, the improvement or degradation in the original bias as a result of the tuning?
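One concrete realization of this suggestion is the signed change in absolute bias, which is negative wherever the retuning brings the model closer to observations. A minimal sketch, assuming the three fields are 2-D arrays regridded to a common grid (the names are placeholders):

```python
import numpy as np

def bias_change(obs, v1, v1p):
    """Signed change in absolute bias: negative = improvement, positive = degradation."""
    return np.abs(v1p - obs) - np.abs(v1 - obs)

# Plotted with a diverging colormap centered on zero, this shows at a glance
# where the retuning reduces or worsens the original bias.
```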
Versions of the six-map figure (e.g. Fig 7) with no observational constraint are harder for readers to assess.
Tuning relies on the ability to measure improvements in simulations, normally relative to observations. It’s remarkable that the authors spend essentially no time discussing the sources of their observational constraints, or how uncertainty in these constraints is or isn’t considered as part of the tuning strategy.
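For concreteness, one hedged way to fold observational uncertainty into a tuning metric is a chi-square-like cost in which each field's area-weighted squared bias is normalized by an estimated observational error. The field names, error estimates, and weights below are placeholders, not values used by the authors.

```python
import numpy as np

def uncertainty_weighted_cost(model, obs, obs_err, area_w):
    """Sum over fields of the area-weighted mean squared bias, with each
    field's bias normalized by its observational uncertainty (dimensionless)."""
    cost = 0.0
    for name in obs:                      # dicts of 2-D fields on a common grid
        chi2 = area_w * ((model[name] - obs[name]) / obs_err[name]) ** 2
        cost += chi2.sum() / area_w.sum()
    return cost
```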
The tuning strategy used by the authors is somewhat traditional. Comparisons to other approaches (e.g. the automated calibration to process-scale constraints used by HiTune, doi:10.1029/2020MS002217 or the formal inference discussed in the Clima project, doi:10.1016/j.jcp.2020.109716) would no doubt be welcome.
More specific comments
Tuning of course involves the changing of specific parameters. The variable names in the specific computer code are perhaps too specific to be in the main text. This information, and indeed probably the original and changed values, could be summarized in one or more tables in an appendix.
Line 113: the current term of art is “perturbed parameter ensemble”.
The subsections of section 2 are labeled as tropical clouds, low clouds, etc. In practice each section might also be categorized according to the scheme whose parameters are being tuned. Indicating this (e.g. “Tropical clouds and the deep convection scheme”) might guide readers’ attention.
Section 2 describes the re-tuning in detail, including which parameters are re-tuned and why. General material (e.g. line 286-294) should be deferred or removed.
Line 220: The authors scale the temperature variance provided by one scheme by a factor of 2 before introducing it in another scheme. It’s not clear whether this is a reasonable physical assumption. If it is, the choice should be justified; if it’s not, the choice should be explained.
Section 2.2: how are the different cloud regimes identified in practice, during a simulation?
One reason for showing six panels in Figure 4 and its many analogs is to highlight the geographic distribution of the impacts of parameter changes. The authors might ask themselves if maps are the best way to show these differences in all cases.
Line 395-400 could be edited for clarity and to remove general material, as could lines 486-490.
Line 528: EIS is thought to control low cloud properties, not their feedback (sensitivity to surface temperature change).
The central point of lines 715-728 could no doubt be made more compactly and directly.
Line 739-740 are bewildering.
Figure 13 is hard for readers to interpret. Coding bias with shapes and variables with numbers is really quite unfriendly - bias would be better coded with size or shading, leaving shape to stand for quantities. But readers will also appreciate guidance in interpretation, since all the panels look very much the same to an unpracticed reader.
Line 831: Does the present analysis use the specific kernels of Zelinka 2012 and Pendergrass 2018? If so, this should be made explicit. If the text refers to the ideas, the original papers (e.g. doi:10.1175/2007JCLI2044.1) should be referenced.
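For reference, the generic form of the kernel calculation being asked about is sketched below (Soden et al. 2008 style). The array names are placeholders, the kernel is assumed to be pre-weighted by layer thickness, and the sketch is not meant to reproduce the authors' implementation.

```python
import numpy as np

def kernel_feedback(kernel, dX, dTs_global, area_w):
    """kernel     : radiative sensitivity, W m-2 per unit of X, shape (lev, lat, lon),
                    assumed already weighted by layer thickness
    dX         : change in the state variable between climate states, (lev, lat, lon)
    dTs_global : global-mean surface warming (K)
    Returns the global-mean feedback in W m-2 K-1."""
    dR = (kernel * dX).sum(axis=0)                 # column-integrated radiative response
    dR_gm = (dR * area_w).sum() / area_w.sum()     # area-weighted global mean
    return dR_gm / dTs_global
```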
Figure 18, especially panel a, is not particularly informative, since readers are asked to compare small changes in large numbers introduced by tuning.
Citation: https://doi.org/10.5194/gmd-2021-298-RC2
AC2: 'Reply on RC2', Po-Lun Ma, 31 Jan 2022
The comment was uploaded in the form of a supplement: https://gmd.copernicus.org/preprints/gmd-2021-298/gmd-2021-298-AC2-supplement.pdf
RC3: 'Comment on gmd-2021-298', Yuan Wang, 02 Dec 2021
The manuscript presents comprehensive results from the model calibration effort to improve DOE EAM_v1. It explores the impacts of the model recalibration on the fidelity of the simulated climate and the implications for cloud feedbacks and ERF. The documented technical details of the parameter calibration are highly beneficial for the scientific community's understanding of the model at the process level. The calibration comprises four major parts that reflect the latest advances in modeling clouds, convection, aerosols, and radiation. The manuscript is co-authored by leading scientists in the field. Overall, the model development process and its impacts on major science questions are thoroughly discussed. The work is appropriate for GMD and I recommend acceptance after the authors address the comments listed below.
1) As each experiment runs for 11 years, some more statistical analyses can be conducted. For example, for Fig. 3, are the differences significant compared with the natural variability in the control run? For those difference maps, the authors may consider wiping out those pixels with insignificant differences.
2) L139-141, the logic is unclear here. Why can the smaller cloud feedback and aerosol radiative forcing lead to a better surface temperature simulation in a coupled run?
3) L343, should be “reduced conversion”.
4) Fig. 4c&5c, tuning in MP apparently impacts LWP, but not cloud fraction in the Sc regions. Is it simply due to the diagnostic cloud fraction scheme, which is independent of cloud microphysics? If yes, should such a disconnection be targeted in future model development?
5) L527-534, please describe how EIS is calculated from the model output (for reference, a sketch of the standard Wood and Bretherton (2006) formulation is given after this list).
6) Fig. 14c, it is a little surprising to see that the altered cloud-rain autoconversion does not impact ERFaci significantly over the subtropical warm cloud regions. Any explanation?
7) Fig. 15a, at about the same latitude, why do aerosols induce opposite land temperature changes over northeastern Eurasia and northwestern North America?
8) Fig. 14e and L963-965, it is unclear to me why EAMv1_ZM shows less sensitivity of Nc and Ni to aerosols but produces a stronger ERFaci.
9) It may be beyond the scope of the study, but I am curious about the additivity of the impacts of the different tuning parts. Would the total impact from EAMv1P be a linear addition of those from each individual configuration? In other words, are there significant nonlinear interactions among those different configurations?
10) Near the end of the paper, it is worth discussing the unresolved outstanding biases in EAMv1P and whether they are likely to be resolved in the next stage.
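For item 5, a reference sketch of the standard Wood and Bretherton (2006) estimated inversion strength is given below; whether EAM computes EIS in exactly this way is for the authors to confirm. The lifting condensation level and the 700 hPa geopotential height are taken as inputs here rather than derived.

```python
import numpy as np

g, cp, Rd, Rv, Lv = 9.81, 1004.0, 287.04, 461.5, 2.5e6

def qsat(T, p):
    """Saturation mixing ratio (kg/kg) from a Bolton-type saturation vapor pressure."""
    es = 611.2 * np.exp(17.67 * (T - 273.15) / (T - 29.65))   # Pa
    return 0.622 * es / (p - es)

def eis(T0, T700, z700, lcl, p0=1000e2):
    """Estimated inversion strength (K) following Wood & Bretherton (2006).
    T0: near-surface air temperature (K); T700: temperature at 700 hPa (K);
    z700: height of the 700 hPa surface (m); lcl: lifting condensation level (m).
    Assumes surface pressure close to p0, so theta at the surface ~ T0."""
    lts = T700 * (p0 / 700e2) ** (Rd / cp) - T0        # lower-tropospheric stability
    T850 = 0.5 * (T0 + T700)                           # proxy temperature at 850 hPa
    qs = qsat(T850, 850e2)
    # Moist-adiabatic potential-temperature gradient evaluated at 850 hPa
    gamma_m = (g / cp) * (1.0 - (1.0 + Lv * qs / (Rd * T850))
                          / (1.0 + Lv**2 * qs / (cp * Rv * T850**2)))
    return lts - gamma_m * (z700 - lcl)
```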
Citation: https://doi.org/10.5194/gmd-2021-298-RC3
AC3: 'Reply on RC3', Po-Lun Ma, 31 Jan 2022
The comment was uploaded in the form of a supplement: https://gmd.copernicus.org/preprints/gmd-2021-298/gmd-2021-298-AC3-supplement.pdf