https://doi.org/10.5194/gmd-2024-133
Submitted as: development and technical paper | 23 Sep 2024
Status: a revised version of this preprint is currently under review for the journal GMD.

Using feature importance as exploratory data analysis tool on earth system models

Daniel Ries, Katherine Goode, Kellie McClernon, and Benjamin Hillman

Abstract. Machine learning (ML) models are commonly used to generate predictions, but these models can also support the discovery of new science. Generating accurate predictions requires that a model capture the structure of the underlying data. If the structure is properly extracted, ML could be a useful exploratory and evidential tool. In this paper, we present a case study that demonstrates the use of ML for exploratory data analysis (EDA) in the climate space. We apply the ML explainability method of spatio-temporal zeroed feature importance (stZFI) to understand how climate variable associations evolve over space and time. Our analyses focus on data from ensembles of earth system models (ESMs), which provide data on different climate states and conditions. We elect to work with ESM ensembles because they allow us to compare feature importance across alternative scenarios that are not available with observed data. The ensembles also account for natural variability, so we can distinguish signal from noise due to natural climate variability when computing feature importance. For our analyses, we consider the 1991 volcanic eruption of Mount Pinatubo: a large stratospheric aerosol injection. We explore the climate pathway associated with the eruption, from aerosols to radiation to temperature, at both the near-surface and stratospheric levels. In addition to applying the method to data generated from two different ESMs, we apply stZFI to reanalysis data to compare the associations identified by stZFI. We show how stZFI tracks, over time, the importance of aerosol optical depth for forecasting temperatures. This case study illustrates the usefulness of an ML tool (stZFI) for EDA on a well-studied climate exemplar.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: final response (author comments only)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on gmd-2024-133', Anonymous Referee #1, 22 Oct 2024
    • AC1: 'Reply on RC1', Daniel Ries, 16 Dec 2024
  • RC2: 'Comment on gmd-2024-133', Anonymous Referee #2, 08 Nov 2024
    • AC2: 'Reply on RC2', Daniel Ries, 16 Dec 2024

Viewed

Total article views: 315 (including HTML, PDF, and XML)
  • HTML: 260
  • PDF: 44
  • XML: 11
  • Total: 315
  • BibTeX: 3
  • EndNote: 5
Views and downloads calculated since 23 Sep 2024.

Viewed (geographical distribution)

Total article views: 310 (including HTML, PDF, and XML) Thereof 310 with geography defined and 0 with unknown origin.
Latest update: 16 Dec 2024
Short summary
Machine learning has advanced research in the climate science domain, but its models are difficult to understand. In order to understand the impacts and consequences of climate interventions such as stratospheric aerosol injection, complex models are often necessary. We use a case study to illustrate how we can understand the inner workings of a complex model. We present this technique as an exploratory tool that can be used to quickly discover and assess relationships in complex climate data.