Articles | Volume 18, issue 17
https://doi.org/10.5194/gmd-18-5575-2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Huge ensembles – Part 1: Design of ensemble weather forecasts using spherical Fourier neural operators
Ankur Mahesh
CORRESPONDING AUTHOR
Earth and Environmental Sciences Area, Lawrence Berkeley National Laboratory (LBNL), Berkeley, California, USA
Department of Earth and Planetary Science, University of California, Berkeley, USA
William D. Collins
Earth and Environmental Sciences Area, Lawrence Berkeley National Laboratory (LBNL), Berkeley, California, USA
Department of Earth and Planetary Science, University of California, Berkeley, USA
Boris Bonev
NVIDIA Corporation, Santa Clara, California, USA
Noah Brenowitz
NVIDIA Corporation, Santa Clara, California, USA
Yair Cohen
NVIDIA Corporation, Santa Clara, California, USA
Joshua Elms
Department of Earth and Atmospheric Sciences, Indiana University, Bloomington, Indiana, USA
Peter Harrington
National Energy Research Scientific Computing Center (NERSC), LBNL, Berkeley, California, USA
Karthik Kashinath
NVIDIA Corporation, Santa Clara, California, USA
Thorsten Kurth
NVIDIA Corporation, Santa Clara, California, USA
Joshua North
Earth and Environmental Sciences Area, Lawrence Berkeley National Laboratory (LBNL), Berkeley, California, USA
Travis O'Brien
Department of Earth and Atmospheric Sciences, Indiana University, Bloomington, Indiana, USA
Michael Pritchard
NVIDIA Corporation, Santa Clara, California, USA
Department of Earth System Science, University of California, Irvine, USA
David Pruitt
NVIDIA Corporation, Santa Clara, California, USA
Mark Risser
Earth and Environmental Sciences Area, Lawrence Berkeley National Laboratory (LBNL), Berkeley, California, USA
Shashank Subramanian
National Energy Research Scientific Computing Center (NERSC), LBNL, Berkeley, California, USA
Jared Willard
National Energy Research Scientific Computing Center (NERSC), LBNL, Berkeley, California, USA
Short summary
Simulating extreme weather events in a warming world is challenging for current weather and climate models, whose computational cost makes low-probability extremes difficult to study. We use machine learning to construct a new probabilistic system, explain in depth how we built it, and present a thorough pipeline to validate it. Our method requires fewer computational resources than existing weather and climate models.