https://doi.org/10.5194/gmd-2024-54
Submitted as: development and technical paper | 02 May 2024
Status: a revised version of this preprint is currently under review for the journal GMD.

The Real Challenges for Climate and Weather Modelling on its Way to Sustained Exascale Performance: A Case Study using ICON (v2.6.6)

Panagiotis Adamidis, Erik Pfister, Hendryk Bockelmann, Dominik Zobel, Jens-Olaf Beismann, and Marek Jacob

Abstract. The weather and climate model ICON (ICOsahedral Nonhydrostatic) is used in high-resolution climate simulations in order to resolve small-scale physical processes. The envisaged performance for this task is 1 simulated year per day for a coupled atmosphere-ocean setup at global 1.2 km resolution. The computing power necessary for such simulations can only be found on exascale supercomputing systems. The main question we try to answer in this article is where to find sustained exascale performance, i.e. which hardware (processor type) is best suited for the weather and climate model ICON, and consequently how this performance can be exploited by the model, i.e. what changes are required in ICON’s software design to utilize exascale platforms efficiently. To this end, we present an overview of the available hardware technologies and a quantitative analysis of the key performance indicators of the ICON model on several architectures. It becomes clear that domain decomposition-based parallelization has reached its scaling limits, leading us to conclude that the performance of a single node is crucial to achieve both better performance and better energy efficiency. Furthermore, based on the computational intensity of the examined kernels of the model, it is shown that architectures with higher memory throughput are better suited than those with high computational peak performance. From a software engineering perspective, a redesign of ICON from a monolithic to a modular approach is required to address the complexity caused by hardware heterogeneity and new programming models, and thus to make ICON suitable for running on such machines.
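The abstract's argument about computational intensity can be illustrated with a simple roofline-style bound, in which a kernel's attainable performance is limited either by peak compute or by memory bandwidth times the kernel's FLOP-per-byte ratio. The following is only a minimal sketch: the function name, the two machines, and all numbers are hypothetical placeholders chosen for illustration, and none of them are taken from the paper or from ICON measurements.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, intensity_flop_per_byte):
    """Roofline-style upper bound on kernel performance in GFLOP/s.

    A kernel cannot run faster than the processor's peak, and it cannot
    run faster than the rate at which memory delivers its operands
    (bandwidth multiplied by FLOPs performed per byte moved).
    """
    return min(peak_gflops, bandwidth_gbs * intensity_flop_per_byte)


# Hypothetical architectures (illustrative values, not from the paper).
machines = {
    "compute_heavy": {"peak_gflops": 4000.0, "bandwidth_gbs": 200.0},
    "bandwidth_heavy": {"peak_gflops": 2000.0, "bandwidth_gbs": 1000.0},
}

# Hypothetical low-intensity kernel: 0.25 FLOP per byte of memory traffic.
kernel_intensity = 0.25

for name, m in machines.items():
    bound = attainable_gflops(m["peak_gflops"], m["bandwidth_gbs"], kernel_intensity)
    # Ridge point: the intensity above which the kernel becomes compute bound.
    ridge = m["peak_gflops"] / m["bandwidth_gbs"]
    limit = "memory" if kernel_intensity < ridge else "compute"
    print(f"{name}: bound = {bound:.1f} GFLOP/s ({limit} bound)")
```

Under these assumed numbers, the machine with half the peak but five times the bandwidth yields the higher bound for the low-intensity kernel, which is the qualitative pattern the abstract describes for memory-bound code.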

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: final response (author comments only)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on gmd-2024-54', Anonymous Referee #1, 28 May 2024
    • AC1: 'Reply on RC1', Panagiotis Adamidis, 21 Jun 2024
  • RC2: 'Comment on gmd-2024-54', Anonymous Referee #2, 13 Aug 2024
    • AC2: 'Reply on RC2', Panagiotis Adamidis, 21 Aug 2024

Viewed

Total article views: 995 (including HTML, PDF, and XML)
  • HTML: 766
  • PDF: 202
  • XML: 27
  • Total: 995
  • BibTeX: 14
  • EndNote: 18
Views and downloads (calculated since 02 May 2024)

Viewed (geographical distribution)

Total article views: 989 (including HTML, PDF, and XML) Thereof 989 with geography defined and 0 with unknown origin.
Latest update: 16 Sep 2024
Short summary
In this paper, we investigate performance indicators of the climate model ICON on different compute architectures to answer the question of how to generate high-resolution climate simulations. Evidently, utilizing more processing units of the conventionally used architectures is not enough; architectures with a larger memory throughput are presumably the most promising way forward. More potential can be gained from single-node optimization than from simply increasing the number of compute nodes.