https://doi.org/10.5194/gmd-2024-120
Submitted as: methods for assessment of models | 09 Sep 2024
Status: this preprint is currently under review for the journal GMD.

The ESGF Virtual Aggregation (CMIP6 v20240125)

Ezequiel Cimadevilla, Bryan Lawrence, and Antonio Santiago Cofiño

Abstract. The Earth System Grid Federation (ESGF) holds several petabytes of climate data distributed across millions of files in data centers worldwide. Obtaining and manipulating the scientific information (climate variables) held in these files is non-trivial. The ESGF Virtual Aggregation is one of several solutions for providing an out-of-the-box, aggregated, analysis-ready view of those variables. Here we discuss the ESGF Virtual Aggregation in the context of the existing infrastructure and of some of the other solutions that provide analysis-ready data. We describe how it is constructed and how it can be used, and we provide some performance evaluation. The ESGF Virtual Aggregation provides a sustainable solution to some of the problems encountered in producing analysis-ready data, without the cost of replicating data into different formats, albeit at the cost of more data movement within the analysis than some alternatives. If heavily used, it may also require more ESGF data servers than are currently deployed at data nodes. The need for such data servers should be a component of ongoing discussions about the future of the ESGF and its constituent core services.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: final response (author comments only)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on gmd-2024-120', Anonymous Referee #1, 01 Oct 2024
    • AC1: 'Reply on RC1', Ezequiel Cimadevilla, 11 Oct 2024
      • RC3: 'Reply on AC1', Anonymous Referee #1, 11 Oct 2024
  • RC2: 'Comment on gmd-2024-120', Anonymous Referee #2, 09 Oct 2024
    • AC2: 'Reply on RC2', Ezequiel Cimadevilla, 14 Oct 2024
  • CEC1: 'Comment on gmd-2024-120', Astrid Kerkweg, 18 Oct 2024
    • AC3: 'Reply on CEC1', Ezequiel Cimadevilla, 21 Oct 2024

Viewed

Total article views: 371 (including HTML, PDF, and XML)
  • HTML: 228
  • PDF: 73
  • XML: 70
  • Total: 371
  • BibTeX: 3
  • EndNote: 4
Views and downloads (calculated since 09 Sep 2024)

Viewed (geographical distribution)

Total article views: 361 (including HTML, PDF, and XML) Thereof 361 with geography defined and 0 with unknown origin.
Latest update: 20 Nov 2024
Short summary
The Earth System Grid Federation (ESGF) stores an enormous amount of climate data spread across millions of files in data centers all over the world. Accessing and working with this scientific information is complex. This work presents the ESGF Virtual Aggregation, an approach that combines data from different sources into a form that is ready for analysis straight away.