Articles | Volume 18, issue 17
https://doi.org/10.5194/gmd-18-5605-2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Huge ensembles – Part 2: Properties of a huge ensemble of hindcasts generated with spherical Fourier neural operators
Ankur Mahesh
CORRESPONDING AUTHOR
Earth and Environmental Sciences Area, Lawrence Berkeley National Laboratory (LBNL), Berkeley, California, USA
Department of Earth and Planetary Science, University of California, Berkeley, USA
William D. Collins
Earth and Environmental Sciences Area, Lawrence Berkeley National Laboratory (LBNL), Berkeley, California, USA
Department of Earth and Planetary Science, University of California, Berkeley, USA
Boris Bonev
NVIDIA Corporation, Santa Clara, California, USA
Noah Brenowitz
NVIDIA Corporation, Santa Clara, California, USA
Yair Cohen
NVIDIA Corporation, Santa Clara, California, USA
Peter Harrington
National Energy Research Scientific Computing Center (NERSC), LBNL, Berkeley, California, USA
Karthik Kashinath
NVIDIA Corporation, Santa Clara, California, USA
Thorsten Kurth
NVIDIA Corporation, Santa Clara, California, USA
Joshua North
Earth and Environmental Sciences Area, Lawrence Berkeley National Laboratory (LBNL), Berkeley, California, USA
Travis A. O'Brien
Department of Earth and Atmospheric Sciences, Indiana University, Bloomington, Indiana, USA
Earth and Environmental Sciences Area, Lawrence Berkeley National Laboratory (LBNL), Berkeley, California, USA
Michael Pritchard
NVIDIA Corporation, Santa Clara, California, USA
Department of Earth System Science, University of California, Irvine, USA
David Pruitt
NVIDIA Corporation, Santa Clara, California, USA
Mark Risser
Earth and Environmental Sciences Area, Lawrence Berkeley National Laboratory (LBNL), Berkeley, California, USA
Shashank Subramanian
National Energy Research Scientific Computing Center (NERSC), LBNL, Berkeley, California, USA
Jared Willard
National Energy Research Scientific Computing Center (NERSC), LBNL, Berkeley, California, USA
Related authors
Ankur Mahesh, William D. Collins, Boris Bonev, Noah Brenowitz, Yair Cohen, Joshua Elms, Peter Harrington, Karthik Kashinath, Thorsten Kurth, Joshua North, Travis O'Brien, Michael Pritchard, David Pruitt, Mark Risser, Shashank Subramanian, and Jared Willard
Geosci. Model Dev., 18, 5575–5603, https://doi.org/10.5194/gmd-18-5575-2025, 2025
Short summary
Simulating extreme weather events in a warming world is a challenging task for current weather and climate models. These models' computational cost poses a challenge in studying low-probability extreme weather. We use machine learning to construct a new probabilistic system. We give an in-depth explanation of how we constructed this system. We present a thorough pipeline to validate our method. Our method requires fewer computational resources than existing weather and climate models.
Ankur Mahesh, Travis A. O'Brien, Burlen Loring, Abdelrahman Elbashandy, William Boos, and William D. Collins
Geosci. Model Dev., 17, 3533–3557, https://doi.org/10.5194/gmd-17-3533-2024, 2024
Short summary
Atmospheric rivers (ARs) are extreme weather events that can alleviate drought or cause billions of US dollars in flood damage. We train convolutional neural networks (CNNs) to detect ARs with an estimate of the uncertainty. We present a framework to generalize these CNNs to a variety of datasets of past, present, and future climate. Using a simplified simulation of the Earth's atmosphere, we validate the CNNs. We explore the role of ARs in maintaining energy balance in the Earth system.
Prabhat, Karthik Kashinath, Mayur Mudigonda, Sol Kim, Lukas Kapp-Schwoerer, Andre Graubner, Ege Karaismailoglu, Leo von Kleist, Thorsten Kurth, Annette Greiner, Ankur Mahesh, Kevin Yang, Colby Lewis, Jiayi Chen, Andrew Lou, Sathyavat Chandran, Ben Toms, Will Chapman, Katherine Dagon, Christine A. Shields, Travis O'Brien, Michael Wehner, and William Collins
Geosci. Model Dev., 14, 107–124, https://doi.org/10.5194/gmd-14-107-2021, 2021
Short summary
Detecting extreme weather events is a crucial step in understanding how they change due to climate change. Deep learning (DL) is remarkable at pattern recognition; however, it works best only when labeled datasets are available. We create ClimateNet – an expert-labeled curated dataset – to train a DL model for detecting weather events and predicting changes in extreme precipitation. This work paves the way for DL-based automated, high-fidelity, and highly precise analytics of climate data.
Bo Dong, Paul Ullrich, Jiwoo Lee, Peter Gleckler, Kristin Chang, and Travis A. O'Brien
Geosci. Model Dev., 18, 961–976, https://doi.org/10.5194/gmd-18-961-2025, 2025
Short summary
A metrics package designed for easy analysis of atmospheric river (AR) characteristics and statistics is presented. The tool is efficient for diagnosing systematic AR bias in climate models and useful for evaluating new AR characteristics in model simulations. In climate models, landfalling AR precipitation shows dry biases globally, and AR tracks are farther poleward (equatorward) in the North and South Atlantic (South Pacific and Indian Ocean).
Ryan J. O'Loughlin, Dan Li, Richard Neale, and Travis A. O'Brien
Geosci. Model Dev., 18, 787–802, https://doi.org/10.5194/gmd-18-787-2025, 2025
Short summary
We draw from traditional climate modeling practices to make recommendations for machine-learning (ML)-driven climate science. Our intended audience is climate modelers who are relatively new to ML. We show how component-level understanding – obtained when scientists can link model behavior to parts within the overall model – should guide the development and evaluation of ML models. Better understanding yields a stronger basis for trust in the models. We highlight several examples to demonstrate.
Bjorn Stevens, Stefan Adami, Tariq Ali, Hartwig Anzt, Zafer Aslan, Sabine Attinger, Jaana Bäck, Johanna Baehr, Peter Bauer, Natacha Bernier, Bob Bishop, Hendryk Bockelmann, Sandrine Bony, Guy Brasseur, David N. Bresch, Sean Breyer, Gilbert Brunet, Pier Luigi Buttigieg, Junji Cao, Christelle Castet, Yafang Cheng, Ayantika Dey Choudhury, Deborah Coen, Susanne Crewell, Atish Dabholkar, Qing Dai, Francisco Doblas-Reyes, Dale Durran, Ayoub El Gaidi, Charlie Ewen, Eleftheria Exarchou, Veronika Eyring, Florencia Falkinhoff, David Farrell, Piers M. Forster, Ariane Frassoni, Claudia Frauen, Oliver Fuhrer, Shahzad Gani, Edwin Gerber, Debra Goldfarb, Jens Grieger, Nicolas Gruber, Wilco Hazeleger, Rolf Herken, Chris Hewitt, Torsten Hoefler, Huang-Hsiung Hsu, Daniela Jacob, Alexandra Jahn, Christian Jakob, Thomas Jung, Christopher Kadow, In-Sik Kang, Sarah Kang, Karthik Kashinath, Katharina Kleinen-von Königslöw, Daniel Klocke, Uta Kloenne, Milan Klöwer, Chihiro Kodama, Stefan Kollet, Tobias Kölling, Jenni Kontkanen, Steve Kopp, Michal Koran, Markku Kulmala, Hanna Lappalainen, Fakhria Latifi, Bryan Lawrence, June Yi Lee, Quentin Lejeun, Christian Lessig, Chao Li, Thomas Lippert, Jürg Luterbacher, Pekka Manninen, Jochem Marotzke, Satoshi Matsouoka, Charlotte Merchant, Peter Messmer, Gero Michel, Kristel Michielsen, Tomoki Miyakawa, Jens Müller, Ramsha Munir, Sandeep Narayanasetti, Ousmane Ndiaye, Carlos Nobre, Achim Oberg, Riko Oki, Tuba Özkan-Haller, Tim Palmer, Stan Posey, Andreas Prein, Odessa Primus, Mike Pritchard, Julie Pullen, Dian Putrasahan, Johannes Quaas, Krishnan Raghavan, Venkatachalam Ramaswamy, Markus Rapp, Florian Rauser, Markus Reichstein, Aromar Revi, Sonakshi Saluja, Masaki Satoh, Vera Schemann, Sebastian Schemm, Christina Schnadt Poberaj, Thomas Schulthess, Cath Senior, Jagadish Shukla, Manmeet Singh, Julia Slingo, Adam Sobel, Silvina Solman, Jenna Spitzer, Philip Stier, Thomas Stocker, Sarah Strock, Hang Su, Petteri Taalas, John Taylor, Susann Tegtmeier, 
Georg Teutsch, Adrian Tompkins, Uwe Ulbrich, Pier-Luigi Vidale, Chien-Ming Wu, Hao Xu, Najibullah Zaki, Laure Zanna, Tianjun Zhou, and Florian Ziemen
Earth Syst. Sci. Data, 16, 2113–2122, https://doi.org/10.5194/essd-16-2113-2024, 2024
Short summary
To manage Earth in the Anthropocene, new tools, new institutions, and new forms of international cooperation will be required. Earth Virtualization Engines is proposed as an international federation of centers of excellence to empower all people to respond to the immense and urgent challenges posed by climate change.
Arjun Babu Nellikkattil, Danielle Lemmon, Travis Allen O'Brien, June-Yi Lee, and Jung-Eun Chu
Geosci. Model Dev., 17, 301–320, https://doi.org/10.5194/gmd-17-301-2024, 2024
Short summary
This study introduces a new computational framework called Scalable Feature Extraction and Tracking (SCAFET), designed to extract and track features in climate data. SCAFET stands out by using innovative shape-based metrics to identify features without relying on preconceived assumptions about the climate model or mean state. This approach allows more accurate comparisons between different models and scenarios.
Travis A. O'Brien, Mark D. Risser, Burlen Loring, Abdelrahman A. Elbashandy, Harinarayan Krishnan, Jeffrey Johnson, Christina M. Patricola, John P. O'Brien, Ankur Mahesh, Prabhat, Sarahí Arriaga Ramirez, Alan M. Rhoades, Alexander Charn, Héctor Inda Díaz, and William D. Collins
Geosci. Model Dev., 13, 6131–6148, https://doi.org/10.5194/gmd-13-6131-2020, 2020
Short summary
Researchers utilize various algorithms to identify extreme weather features in climate data, and we seek to answer this question: given a plausible weather event detector, how does uncertainty in the detector impact scientific results? We generate a suite of statistical models that emulate expert identification of weather features. We find that the connection between El Niño and atmospheric rivers – a specific extreme weather type – depends systematically on the design of the detector.
Mark D. Risser and Michael F. Wehner
Adv. Stat. Clim. Meteorol. Oceanogr., 6, 115–139, https://doi.org/10.5194/ascmo-6-115-2020, 2020
Short summary
Evaluation of modern high-resolution global climate models often does not account for the geographic location of the underlying weather station data. In this paper, we quantify the impact of geographic sampling on the relative performance of climate model representations of precipitation extremes over the United States. We find that properly accounting for the geographic sampling of weather stations can significantly change the assessment of model performance.
Short summary
We use machine learning emulators to create a massive ensemble of simulated weather extremes. This ensemble provides a large sample size, which is essential to characterize the statistics of extreme weather events and study their physical mechanisms. These ensembles can also help to accurately forecast the probability of low-likelihood extreme weather.