METEORv1.0.1: a novel framework for emulating multi-timescale regional climate responses

Sandstad, Marit; Steinert, Norman Julius; Baur, Susanne; Sanderson, Benjamin Mark

doi:https://doi.org/10.5194/gmd-18-8269-2025

Articles | Volume 18, issue 21

https://doi.org/10.5194/gmd-18-8269-2025

© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.

https://doi.org/10.5194/gmd-18-8269-2025

© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.

Articles | Volume 18, issue 21

Model description paper

|

06 Nov 2025

Model description paper |

| 06 Nov 2025

METEORv1.0.1: a novel framework for emulating multi-timescale regional climate responses

Marit Sandstad, Norman Julius Steinert, Susanne Baur, and Benjamin Mark Sanderson

Download

Final revised paper (published on 06 Nov 2025)
Preprint (discussion started on 13 Mar 2025)

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2025-1038', Anonymous Referee #1, 06 May 2025
Review of METEOR 1.0
Summary of paper: The authors show a new pattern-scaling technique in which regional annual-mean temperature and precipitation patterns, as responses to forcing changes, are time-dependent. The results capture the multi-model mean of assessed CMIP6 responses quite well, with an RMSE of ~0.15 K for warming and 0.16 × 10⁻⁷ kg m⁻² s⁻¹ for temperature.
Recommendation: Acceptance after revisions (whether those are minor or major is in the eye of the beholder and up to the authors).
General comments:

A very useful contribution to the long quest to emulate GCM/ESM response fields via enhanced pattern-scaling techniques.
CMIP6 MMM versus CMIP6 individual-model emulator:

As currently presented, the paper features mainly the validation of METEOR against the CMIP6 MMM, rather than validations of individual CMIP6 models. This is not made clear in the abstract or most of the text, where the reader gets the impression that METEOR, in its current calibration, is a useful emulator of individual ESMs. That might be the case, but it is not shown. In other words, the paper is not clearly framed as being limited to emulating only the multi-model mean. If the authors wish to present METEOR as an individual-GCM/ESM emulator, then the paper needs to test the appropriateness of CMIP6 model-by-model responses. Only small in-sample goodness-of-fit metrics are shown (e.g., Figure A1 panels a and d present the RMSE for the GHG response in the in-sample abrupt-4×CO₂). I therefore strongly encourage the authors to show more model-by-model validation—for example, by including absolute-error maps of 20-year means for individual model SSP5-8.5 or SSP1-2.6 out-of-sample temperature and precipitation fields for 2080–2100. Tables of RMSE and MAE values by model and scenario would be useful in an Appendix, allowing comparison to alternative emulation techniques. Similarly, Figures 8 and 9 could be extended to include maps of the best and worst CMIP6 model fits, rather than showing only MMM differences.
Global-mean validation versus regional validation:

At present, Figures 5–7 and B13–B20 show useful comparison plots for global-mean temperature and precipitation responses. That is reassuring (and a great result), but for an emulator of regional climate responses, more regional comparisons are needed. The global-mean response can be obtained much more simply—e.g., as an extension of the C-SCM with a few lines of code and these calibration parameters. I suggest replacing (or extending) Figures B13–B20 with figures that show the worst- and best-performing regions (using either custom definitions or IPCC AR6 regions). Regional responses could also be shown as maps—you already include CMIP6 MMM comparison maps in your Figures 8 and 9.
Limitations for impact models:

The utility of these results for impact emulators depends on each emulator’s needs. METEOR v1.0 is limited to annual-mean projections of best-estimate warming and precipitation changes, and does not yet include variability, compound-event modeling, climate-oscillation modes, distribution tails, etc. Although some of these caveats are mentioned in the conclusion, an explicit upfront statement of the current emulator’s scope (and its limitations) would be helpful.
Physical interpretation of response patterns (Figures B1–B12):

Looking at the GHG and “residual” response patterns, one wonders whether they are intended purely as statistical fits (in which case they need not be physically interpretable, as long as applications stay within the training spectrum), or whether they represent physically meaningful patterns. If the latter, one could apply the emulator beyond 2100 to 2300 with more confidence. Since the authors do not clearly state that these are statistical fits—and some discussion refers to physical interpretation of short- and long-term responses—I suggest the following:
Equilibrium response aggregate pattern: Add a fourth column to Figures 3 and 4, as well as B1–B12, that sums the short-, medium-, and long-term response patterns. This should yield the equilibrium response pattern, which readers can then evaluate for physical plausibility. If the equilibrium response is not physically plausible (and some of the patterns seem hard to interpret), then these components should be framed explicitly as purely statistical fits valid up to 2100 for the shown validations. Alternatively, you might introduce training constraints—for example, requiring that the sum of the three timescales falls within a physically plausible range. You could also discuss whether the land-ocean warming ratio evolves plausibly from short-term through equilibrium response.

Full colorbar: Many patterns appear clipped by the chosen colorbar limits, making it hard to see true minima and maxima. Please include a full colorbar for these figures and choose its range to include extreme values (possibly on a logarithmic scale) so that readers can see tail-end values. For example, in Figure B6 the MIROC-ES2L long-term GHG precipitation response is unclear; likewise CanESM5’s short-term precipitation response in Figure B5 and UKESM1-0-LL’s medium-term temperature response in Figure B3.

Correlation between temperature and precipitation:

Since METEOR emulates both variables, it would be useful to examine their regional co-evolution. For instance, map percent precipitation change per degree of warming—some regions should show ~2–5 % °C⁻¹, moisture-saturated regions near Clausius–Clapeyron (~7 % °C⁻¹), etc. This would provide a physics-based check on the emulator’s joint behavior.
Skill comparison to other techniques:

The reported skill metrics (Pearson, RMSE) need context. Consider benchmarking against the ClimateBench test (doi:10.1029/2021MS002954) using NorESM2 output, or comparing to other published emulators. You might also compare each model’s emulation error to the inter-model spread in response patterns, to assess whether emulator errors are small relative to GCM diversity.
Small comments:
Lines 11–12, Abstract: You state that the emulation system can “accurately predict gridded responses to out-of-sample scenarios.” That is too broad, since you demonstrate accuracy only for the MMM, annual means, and expected values. Please qualify.

Line 47: Do you mean that ClimateBench data are not widely available? They are provided via Zenodo—please clarify.

Line 105: “most impactful non-GHG forcer.” Perhaps note that this is currently true but may differ under low-emission scenarios.

Line 134: When you subtract the piControl “climatology,” do you mean a 20- or 30-year rolling mean, a trend, or a non-parametric low-pass filter? Please specify.

Line 137: Clarify whether you use cos(lat) for area weighting or each model’s native areacella.

Figures 3 & 4: Much of the long-term precipitation response lies outside the colorbar range—consider widening it or otherwise showing pattern extrema.

Figure A1 caption: Typo: “fo” → “of.”

Tropospheric ozone response: Where is this captured? I assume in the residual (aerosol-scaled) response—please state.

Residual scaling bias: Using sulfate as the scaler for residual response may bias low-emission scenarios, since nitrate aerosols could dominate forcing by century’s end. Discuss this potential bias.
Citation: https://doi.org/10.5194/egusphere-2025-1038-RC1
RC2: 'Comment on egusphere-2025-1038', Yann Quilcaille, 19 May 2025

All comments are detailed in the file. Overall, very good manuscript, minor revisions related to the method and validation.

Citation: https://doi.org/10.5194/egusphere-2025-1038-RC2
AC1: 'Comment on egusphere-2025-1038', Marit Sandstad, 16 Jun 2025

Thank you reviewer 1 for your thorough and useful review, attached are our detailed responses to each of your points including our strategies to address your criticisms in our revised manuscript.

Citation: https://doi.org/10.5194/egusphere-2025-1038-AC1
AC2: 'Comment on egusphere-2025-1038', Marit Sandstad, 16 Jun 2025

Thank you Yann Quilcaille for your thorough and useful review. Attached are our detailed responses to each of your points including our strategies to address your criticisms in our revised manuscript

Citation: https://doi.org/10.5194/egusphere-2025-1038-AC2

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

AR by Marit Sandstad on behalf of the Authors (24 Jun 2025) Author's response Author's tracked changes Manuscript

ED: Referee Nomination & Report Request started (01 Jul 2025) by Stefan Rahimi-Esfarjani

RR by Yann Quilcaille (02 Jul 2025)

RR by Anonymous Referee #1 (29 Sep 2025)

ED: Publish as is (29 Sep 2025) by Stefan Rahimi-Esfarjani

AR by Marit Sandstad on behalf of the Authors (06 Oct 2025) Manuscript

Short summary

We present METEORv1.0.1, a climate model emulator, that can be trained on full spatially resolved and widely available climate model data to reproduce climate variables and make predictions from unseen emission trajectories. The methodology identifies patterns with timescales of impact for one or more forcers using idealised experiments and anomaly calculations. Results for precipitation and temperature show good model performance and can reproduce hysteresis for overshoot scenarios.

METEORv1.0.1: a novel framework for emulating multi-timescale regional climate responses

Download

Interactive discussion

Peer review completion

Suggestions for revision or reasons for rejection