https://doi.org/10.5194/gmd-2024-155
Submitted as: development and technical paper | 28 Oct 2024
Status: this preprint is currently under review for the journal GMD.

Reducing Time and Computing Costs in EC-Earth: An Automatic Load-Balancing Approach for Coupled ESMs

Sergi Palomas, Mario C. Acosta, Gladys Utrera, and Etienne Tourigny

Abstract. Earth System Models (ESMs) are intricate models used to simulate the Earth's climate, typically built from distinct, independent components, each dedicated to simulating specific natural phenomena (such as atmosphere and ocean dynamics, atmospheric chemistry, land and ocean biosphere, etc.). To capture the interactions between these processes, ESMs rely on coupling libraries, which oversee the synchronization and field exchanges among independently developed codes that typically run in parallel as a Multiple Program, Multiple Data (MPMD) application.

The performance achieved depends on the coupling approach, as well as on the number of parallel resources and the scalability properties of each component. Determining the appropriate number of resources for each component in a coupled ESM is crucial for the efficient use of the High Performance Computing (HPC) infrastructures used in climate modelling. However, this task traditionally involves manually testing multiple process allocations by trial and error, which demands a significant time investment from researchers, makes the process error-prone, and often results in a loss of application performance due to the complexity of the task. This paper introduces the automatic load-balance tool (auto-lb), a methodology and tool for determining the resource allocation of each component within coupled ESMs, aimed at improving the application's performance. Notably, the methodology is automatic and does not require HPC expertise. It works by minimizing the load imbalance: it reduces each constituent's execution cost (core-hours) and minimizes the core-hours wasted on synchronizations between constituents, without penalizing the execution speed of the entire model. This optimization is achieved regardless of the scalability properties of each constituent and the complexity of their dependencies during the coupling.
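To make the notion of load imbalance concrete, the following minimal Python sketch models two components that synchronize once per coupling interval, so the faster one idles until the slower one finishes. It is an illustration under stated assumptions, not the auto-lb implementation: the Amdahl-style timing model and the names `step_time`, `coupled_cost`, and `best_split` are hypothetical and do not come from the paper.

```python
def step_time(serial_s, parallel_s, procs):
    """Hypothetical Amdahl-style time (seconds) for one coupling interval.

    serial_s is the non-scalable part; parallel_s is the work that is
    divided evenly among procs processes. Real components scale differently.
    """
    return serial_s + parallel_s / procs


def coupled_cost(times, procs):
    """Cost of one coupling interval when all components run concurrently.

    The interval lasts as long as the slowest component. Every other
    component waits at the synchronization point, and its cores are
    counted as wasted core-seconds during that wait.
    """
    interval = max(times)
    total = interval * sum(procs)
    wasted = sum(p * (interval - t) for t, p in zip(times, procs))
    return interval, total, wasted


def best_split(comp_a, comp_b, total_procs):
    """Brute-force search for the two-component split with the shortest interval.

    comp_a and comp_b are (serial_s, parallel_s) tuples (assumed inputs).
    With a fixed core budget, the shortest interval is also the cheapest.
    """
    best = None
    for procs_a in range(1, total_procs):
        procs = (procs_a, total_procs - procs_a)
        times = [step_time(*comp_a, procs[0]), step_time(*comp_b, procs[1])]
        interval, _, _ = coupled_cost(times, procs)
        if best is None or interval < best[0]:
            best = (interval, procs)
    return best[1]


if __name__ == "__main__":
    # Made-up numbers: a heavier "atmosphere" and a lighter "ocean".
    atmosphere, ocean = (2.0, 960.0), (1.0, 240.0)
    split = best_split(atmosphere, ocean, total_procs=100)
    times = [step_time(*atmosphere, split[0]), step_time(*ocean, split[1])]
    print(split, coupled_cost(times, split))
```

Exhaustive search is only practical for a small number of components and a small core budget; it is used here because it keeps the definition of wasted core-hours easy to follow.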

To achieve this, we designed a new performance metric called "Fittingness", which assesses the performance of a coupled execution by evaluating the trade-off between parallel efficiency and application throughput. This metric is intended for scenarios where optimality can depend on various criteria and constraints. Aiming for maximum speed might not be desirable if it decreases parallel efficiency and therefore increases the computational cost of the simulation.
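The abstract does not give the Fittingness formula, so the sketch below does not reproduce it. It only illustrates the kind of trade-off described, using a hypothetical weighted score that combines the standard definition of parallel efficiency with a normalized speed. The function name `tradeoff_scores`, the `weight` parameter, and the timings are assumptions made for illustration.

```python
def tradeoff_scores(runs, weight=0.5):
    """Score candidate runs by a weighted mix of speed and parallel efficiency.

    runs is a list of (procs, seconds) pairs. The run with the fewest
    processes is taken as the efficiency reference. This is NOT the
    paper's Fittingness metric, only a stand-in with the same intent.
    weight=1.0 rewards speed alone; weight=0.0 rewards efficiency alone.
    """
    p_ref, t_ref = min(runs)              # smallest process count
    fastest = min(t for _, t in runs)     # shortest wall-clock time
    scores = {}
    for procs, seconds in runs:
        # Parallel efficiency relative to the reference run.
        efficiency = (t_ref * p_ref) / (seconds * procs)
        # Speed normalized so that the fastest run scores 1.0.
        speed = fastest / seconds
        scores[procs] = weight * speed + (1.0 - weight) * efficiency
    return scores


if __name__ == "__main__":
    # Made-up timings showing diminishing returns as processes are added.
    runs = [(48, 1000.0), (96, 550.0), (192, 350.0), (384, 300.0)]
    for w in (0.0, 0.5, 1.0):
        scores = tradeoff_scores(runs, weight=w)
        print(w, max(scores, key=scores.get))
```

With these made-up timings, a pure-speed criterion selects the largest allocation and a pure-efficiency criterion selects the smallest, while an intermediate weight selects a configuration in between, which is the situation the metric is meant to arbitrate.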

The methodology was tested across multiple experiments using the widely recognized European ESM, EC-Earth3. The results were compared with real operational configurations, such as those used for the Coupled Model Intercomparison Project Phase 6 (CMIP6) and for the European Climate Prediction Project (EUCP), and validated on different HPC platforms. All experiments suggest that the current approaches lead to performance loss, and that auto-lb can achieve better results in both execution speed and the number of core-hours needed. When compared to the EC-Earth standard-resolution CMIP6 runs, we achieved a configuration 4.7 % faster while also reducing the core-hours required by 1.3 %. Likewise, when compared to the EC-Earth high-resolution EUCP runs, the method showed a 34 % improvement in speed with a 6.7 % reduction in the core-hours consumed.


Status: open (until 23 Dec 2024)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • CEC1: 'Comment on gmd-2024-155: No compliance with the policy of the journal', Juan Antonio Añel, 29 Oct 2024
    • AC1: 'Reply on CEC1', Sergi Palomas, 14 Nov 2024
      • CEC2: 'Reply on AC1', Juan Antonio Añel, 15 Nov 2024

Model code and software

Prediction script, Sergi Palomas, https://earth.bsc.es/gitlab/spalomas/prediction-script


Latest update: 15 Nov 2024
Short summary
This work presents an automatic tool to enhance the performance of climate models by optimizing how computer resources are allocated. Traditional methods are time-consuming and error-prone, often resulting in inefficient simulations. Our tool improves speed and reduces computational costs without needing expert knowledge. The tool has been tested on European climate models, making simulations up to 34 % faster while using fewer resources, helping to make climate simulations more efficient.