CIOFC1.0: a Common Parallel Input/Output Framework Based on C-Coupler2.0
Xinzhu Yu
Li Liu
Chao Sun
Qingu Jiang
Biao Zhao
Zhiyuan Zhang
Bin Wang
Abstract. As Earth system modeling develops ever finer grid resolutions, the inputting and outputting (I/O) of the increasingly large data fields becomes a processing bottleneck. Many models developed in China, as well as the Community Coupler (C-Coupler), do not fully benefit from existing parallel I/O support. This paper reports the design and implementation of a Common parallel Input/Output Framework based on C-Coupler2.0 (CIOFC1.0). Parallelization by CIOFC1.0 can accelerate the I/O of large data fields. The framework also allows users to conveniently specify the I/O settings, e.g., the data fields for I/O, the time series of the data files for I/O, and the data grids in the files. The framework can also adaptively input data fields from a time-series dataset during model integration, automatically interpolate data when necessary, and output fields either periodically or irregularly. CIOFC1.0 demonstrates the cooperative development of an I/O framework and a coupler, and thus enables convenient and simultaneous use of a coupler and an I/O framework.
Status: final response (author comments only)
RC1: 'Comment on gmd-2022-77', Anonymous Referee #1, 15 Jun 2022
General comments
The paper addresses the highly relevant question of I/O management for Earth system models by describing the implementation of an I/O server based on the C-Coupler2 library. Taking advantage of the existing remapping and interpolation functions of the coupler, and relying on the XML formalism to precisely define the data to be input or output, the authors propose a validated and tested library. This library essentially offers an efficient parallel I/O functionality to the C-Coupler user community.
The authors present a short but comprehensive landscape of the I/O servers currently available in the community. The choice of a new development is justified by the opportunity to carry it out at reduced cost, thanks to a regional cooperation (the developers associated with the article) and by building on an existing code (C-Coupler).
As supplementary material, the source code and a user guide are available online. Access to the source code facilitates the understanding of the implementation, even though the explanations in the article are clear enough to convey the overarching strategy. Despite my efforts, it was not possible to compile the test case provided with the source code. The reason is probably the lack of time that would have been necessary to fully investigate the issues. However, it seems to me that the configuration procedure is rather complex and tailored to the machine architecture of the C-Coupler community, and, even though only a small number of external libraries are required (NetCDF, PnetCDF and MPI), it is difficult to exactly identify the origin of the problem. Sadly, this is a problem that commonly happens in our community. A simple “makefile” with a few compilation options would have been much more convenient …
More generally speaking, such a community tool would be more attractive if its description could prove its usefulness for a larger number of models. This is particularly disappointing in the case of the CIOFC library, because its description shows that very few additional operations are required, starting from a model already instrumented with C-Coupler, to obtain the parallel I/O functionality, which makes the coupler + I/O server suite all the more attractive from a developer's point of view.
More details would be required to get a better picture of the compatibility of CIOFC with a non "C-Coupled" model, e.g. (i) is the tool able to handle masks (sparse matrices)? or (ii) what interpolations are available, and are they conservative? Readers who are not familiar with the C-Coupler would be helped if some functions used by the CIOFC tool but provided by the C-Coupler could be summarised. Even though it was not possible to reproduce the tests that validate the tool and certify its level of performance, the validation results presented in the document are convincing. One would have liked to find there a larger set of performance measurements, not only out of a spirit of competition, but also to be able to evaluate the limits of the chosen technology. However, these limits (synchronicity, a single server for the whole set of ESM components) are mentioned in the conclusion, which suggests fruitful future enhancements.
Specific comments
Technical choices are comprehensively described, but one would have liked to find more justification for them. For example:
- the XML format is adopted to select the data that have to be transferred. Even though this is a standard choice in the community, I wonder whether the choice of XML parser and its maintenance could become a problem during development, and would like to know the authors' opinion about that.
- the same question applies to the maintenance of the PnetCDF library and its compatibility with the other NetCDF libraries and compilers. Could the maintenance cost of an additional library be avoided by choosing another I/O server decomposition strategy (one or several global fields per process instead of one 2D horizontal subdomain per process; see for example the second-level servers of XIOS v2)?
- are the outputs CMOR or CF compliant, and if not, why?
Concerning the launching of the I/O server processes, it is stated that they are considered a "subset of model processes" (p9, l4). Their number is an XML file parameter (max_num_pio_proc), but is it also necessary to increase the number of model processes accordingly? If yes, on which model should the user do that? How are these I/O processes identified by the model(s) at initialisation so that they are not included in the pool of its compute processes?
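For concreteness, one common way to obtain such a subset (not necessarily what C-Coupler2.0 actually does) is to split the existing model communicator, so that no additional processes need to be launched. A minimal MPI sketch, in which max_num_pio_proc echoes the XML parameter mentioned above and all other names are hypothetical:

```c
/* Minimal sketch (not C-Coupler2.0's actual code): derive an I/O
 * communicator as a subset of the existing model communicator, so that
 * no extra processes are launched. "max_num_pio_proc" echoes the XML
 * parameter mentioned above; all other names are hypothetical. */
#include <mpi.h>

MPI_Comm build_io_comm(MPI_Comm model_comm, int max_num_pio_proc)
{
    int rank, size;
    MPI_Comm_rank(model_comm, &rank);
    MPI_Comm_size(model_comm, &size);

    int num_io = max_num_pio_proc;
    if (num_io < 1)    num_io = 1;
    if (num_io > size) num_io = size;

    /* Spread the I/O ranks evenly over the compute ranks. */
    int stride = size / num_io;
    int is_io_proc = (rank % stride == 0) && (rank / stride < num_io);

    MPI_Comm io_comm;
    MPI_Comm_split(model_comm,
                   is_io_proc ? 0 : MPI_UNDEFINED, /* others get MPI_COMM_NULL */
                   rank, &io_comm);
    return io_comm; /* MPI_COMM_NULL on pure compute ranks */
}
```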
The procedure that selects data from file or model is precisely described, sometimes with too much verbosity, e.g. the one related to the output time series manager (Sect. 3.4). It would also be interesting to describe how this information is transmitted to the library (XML parser). How the input/output data of the model are updated could also be unclear to the reader, since no model array to be updated/transferred is provided via the CIOFC API (for writing in output mode, or updating in input mode). A check of the example source code shows that this is done via the C-Coupler API, but it would be worth mentioning this in the article.
I also wonder whether a CIOFC output or input handler can be set at runtime, or, put differently, whether the way the model data are modified by input (or the output files) can be changed interactively during the simulation. If yes, can the interpolation be recomputed, and if so, how long does it take (performance)?
The authors emphasize the "flexibility to change the source of each boundary field" (p15, l17), but can a variable be switched at runtime between values read from a file (input) and values received from a model (coupling)? In another possible configuration, can these two sources contribute simultaneously to define a mixed variable following a geographical mask (e.g. SST coupled from a model and lake temperature read from a file)?
For readers interested in comparisons (since absolute values are provided in Figs. 26-29), it would be good to give more details about the experimental protocol behind the computing performance measurements in Sect. 4.3. For example, the kind of file system (and its theoretical performance) could be mentioned. On the model side, the output frequency during the measurements should also be provided.
During the measurements, the "numerical calculations in real models are neglected" (p18, l3), but are you sure that no interaction between computations and CIOFC mapping could occur? Another interaction would be worth avoiding: other users could stress the file system during the test. Was the machine empty during the measurements? In addition, do you think that three measurements per test case are enough to neglect the variability induced by the perturbations mentioned above? If the output frequency was kept constant, how was this value chosen, and does it change the results? If yes, how? This would be interesting in order to understand how much asynchronous output (or buffering?) is needed. Could you specify what kind of MPI communication is used in C-Coupler between models and I/O servers (MPI_Send, MPI_Bsend, MPI_Isend?)
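To illustrate why the choice of MPI call matters here (this is not C-Coupler's actual communication code), a blocking MPI_Send of a large field may stall the compute rank until the I/O server posts the matching receive, whereas MPI_Isend lets the model continue time stepping and complete the transfer later. A minimal sketch with hypothetical names:

```c
/* Illustration only (not C-Coupler's actual communication code): why the
 * choice of MPI call matters for overlapping computation with output.
 * "field", "nvals", "io_rank" and "model_comm" are hypothetical names. */
#include <mpi.h>

void send_field_blocking(double *field, int nvals, int io_rank, MPI_Comm model_comm)
{
    /* For large messages MPI_Send typically falls back to a rendezvous
     * protocol, so the compute rank may stall until the I/O server
     * posts the matching receive. */
    MPI_Send(field, nvals, MPI_DOUBLE, io_rank, /* tag */ 100, model_comm);
}

void send_field_overlapped(double *field, int nvals, int io_rank,
                           MPI_Comm model_comm, MPI_Request *req)
{
    /* MPI_Isend returns immediately; the model keeps time stepping and
     * calls MPI_Wait(req, MPI_STATUS_IGNORE) only before reusing the buffer. */
    MPI_Isend(field, nvals, MPI_DOUBLE, io_rank, /* tag */ 100, model_comm, req);
}
```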
I did not notice any typing errors in the text.
Citation: https://doi.org/10.5194/gmd-2022-77-RC1
AC1: 'Reply on RC1', Xinzhu Yu, 06 Sep 2022
Dear Reviewer,
Thanks a lot for reviewing our manuscript and for the comments and suggestions.
We would like to reply to some comments here, and will carefully follow all your comments and suggestions when revising the manuscript.
- Regarding the reproducibility of the test model, we will significantly revise the "Test Model with C-Coupler2 with CIOFC User's Guide" to include more details for running the test model. Some modifications need to be made to run the test model on a new platform or under a new directory, but we have not made these details clear in the user's guide.
- Besides the performance tests based on the test model mentioned in the manuscript, we will conduct more tests with real models.
- We will try to improve the description of the managers.
Citation: https://doi.org/10.5194/gmd-2022-77-AC1
RC2: 'Comment on gmd-2022-77', Anonymous Referee #2, 04 Aug 2022
General Comments
In this paper the authors describe the design and implementation of an I/O framework, CIOFC1.0, based on the community coupler (C-Coupler2.0) software used by many climate models (FGOALS, FIO-AOW, GRAPES) developed or used mainly in China. The I/O framework uses XML-formatted configuration files, similar to the C-Coupler2.0 software, and reuses the interpolation algorithms in C-Coupler2.0 for spatial interpolation while writing out the model output. The I/O framework supports structured and unstructured grids. Considering that climate models are running simulations at much higher resolutions than before and outputting more data at higher frequency, this work is critical to support climate models that use C-Coupler2.0, which currently does not have parallel I/O support. The paper would significantly benefit from including discussions and comparisons with I/O frameworks used by other major climate models. Also, including the performance of a climate model component instead of a test model and comparing it with other I/O frameworks would be useful to the reader.
Specific Comments
The paper includes a brief survey of the existing coupler software and frameworks; however, it would be useful to include more references to I/O frameworks and libraries used by other climate models (e.g. the NetCDF library, which has parallel I/O support; the PIO library used by CESM and other climate models; the SCORPIO library used by the E3SM climate model; the I/O libraries used by frameworks like ESMF; and the CFIO library). More discussion comparing the work described in this paper with these I/O frameworks (CFIO, XIOS, PIO, SCORPIO, NetCDF, ESMF, etc.) would help in understanding the original contributions of this paper and add to the motivation for this work. Although the paper mentions (p2, l20) that CIOFC1.0 can be used by other component models (apart from the community coupler), it is not apparent from the paper how this can be achieved, especially if the component model does not use C-Coupler2.0. From the provided source code and the discussion in the paper, the CIOFC1.0 framework is not a separate library (it is part of the C-Coupler2.0 library), so integrating it with a separate component model (GAMIL, GEOS, GRAPES) would demonstrate how it can be used by component models in an Earth system model.
When discussing the I/O configuration manager that uses XML-formatted inputs (Section 3.1), it would be useful to compare the approach here with other climate models that handle structured and unstructured grids (and have similar issues to deal with - supporting different types of grids, vertical coordinates, etc.). The spatial interpolation manager (Section 3.2) in CIOFC1.0 uses the interpolation algorithms from the coupler, so having this functionality in the I/O framework is useful when integrating model components directly with CIOFC1.0 (if not, can't this functionality be moved inside the coupler?). Showing the integration of the I/O framework with a model component that does not use C-Coupler2.0 would therefore have been useful here.
When implementing the parallel I/O operation (Section 3.3), were there any discussions on supporting low-level libraries other than PnetCDF or using formats other than NetCDF? Were there any software design decisions made with future support for new libraries or formats in mind? In Section 3.3 (p9, l5), when discussing I/O decompositions, data rearrangements, and grouping I/O processes as a subset of the model compute processes, it would be useful to refer to and discuss prior work in this area (specifically the work done by J. Dennis et al. on the Parallel I/O library). It would also be informative to discuss how the framework handles reading and writing multiple model variables into multiple files, since model components write hundreds of variables in a typical simulation run.
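For reference, the kind of collective PnetCDF write that such a parallel I/O operation typically relies on can be sketched as follows; this is illustrative only (not CIOFC1.0 code), assumes a simple latitude-band I/O decomposition, and uses hypothetical grid sizes and variable names:

```c
/* Minimal PnetCDF sketch (illustrative only, not CIOFC1.0 code): each I/O
 * process collectively writes its latitude band of a global 2-D field.
 * Grid sizes and names are hypothetical; error checking is omitted. */
#include <mpi.h>
#include <pnetcdf.h>

#define NLAT 180
#define NLON 360

void write_field(MPI_Comm io_comm, const float *local_band,
                 MPI_Offset first_lat, MPI_Offset num_lat)
{
    int ncid, dimids[2], varid;

    ncmpi_create(io_comm, "output.nc", NC_CLOBBER | NC_64BIT_DATA,
                 MPI_INFO_NULL, &ncid);
    ncmpi_def_dim(ncid, "lat", NLAT, &dimids[0]);
    ncmpi_def_dim(ncid, "lon", NLON, &dimids[1]);
    ncmpi_def_var(ncid, "sst", NC_FLOAT, 2, dimids, &varid);
    ncmpi_enddef(ncid);

    /* The I/O decomposition: this process owns global rows
     * [first_lat, first_lat + num_lat) of the field. */
    MPI_Offset start[2] = { first_lat, 0 };
    MPI_Offset count[2] = { num_lat, NLON };
    ncmpi_put_vara_float_all(ncid, varid, start, count, local_band);

    ncmpi_close(ncid);
}
```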
The implementation of a recursive tree-based timer is discussed in Section 3.4 (p11, l10). It would be useful to compare the approach here with current implementations of timers in other major climate models. Timers are typically useful in the model driver and other components of climate models, so why did the authors choose to implement the recursive tree-based timer in the I/O framework instead of in C-Coupler2.0 (p11, l1)?
The implementation of the output driving procedure is discussed in Section 3.5 (p11, l25). Again, it would be useful to compare the approach here with approaches used by other coupler software/frameworks like MCT, NUOPC, etc.
In Section 4 the evaluation of the I/O framework was performed using a test model (p17, l19).
- It would have also been useful to compare the serial (C-Coupler2.0 without CIOFC1.0) vs parallel (CIOFC1.0) I/O for a model component run (e.g. GRAPES, GEOS, GAMIL).
- When evaluating performance (p19, l23), a fine resolution vs. a coarse resolution is used for the variables being written out; however, it is not apparent whether the data are on a structured or an unstructured grid (the test model does seem to support both grids - p18, l4). For example, the performance, especially the data rearrangement time, would vary significantly depending on whether the data are on a structured grid with a regular 2-D decomposition or on an unstructured cubed-sphere grid.
- It would be useful to see the average model I/O write/read throughput with the I/O framework (this would include all I/O costs - creating files, defining I/O decompositions, data rearrangement, file system write time, etc. - and would also include the hundreds of multi-dimensional model variables written out in a typical run). The I/O throughput for a single variable is sometimes not representative of the overall I/O throughput.
- Including comparison of the I/O throughput with other I/O libraries/frameworks (CFIO, XIOS, PIO, SCORPIO, NetCDF, ESMF etc) would be useful here.
- In Section 4.3 (p20, l1-10) the authors discuss the difference in I/O write/read performance for different numbers of I/O processes. It would have been useful to include a discussion of how this information helps in choosing the I/O framework configuration parameters (e.g. the number of I/O processes) for a model run that outputs many 1D, 2D, and 3D variables.
- Was there any impact on I/O performance due to the placement of the I/O processes (is it configurable by the user)? Were there any interesting aspects of the machine architecture that impacted the I/O performance (e.g. in Earthlab, the 6D torus network, SSDs, the fast vs. ordinary storage pools, etc.)?
- Although data interpolation is handled by the coupler, it would be useful to include some brief performance statistics covering the data interpolation and a comparison with similar frameworks (online vs. offline interpolation, algorithm characteristics like conservativeness, performance of interpolation using other libraries/frameworks like ESMF, MOAB, etc.).
In the conclusion section (Section 5, p21 l24) the authors mention that they intend to support asynchronous I/O in the next version of the community coupler (C-Coupler3). Did you build this flexibility into the current design of the I/O framework? How much of the I/O framework needs to change to incorporate asynchronous I/O?
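As a conceptual sketch only (not the planned C-Coupler3 design), asynchronous output usually means that compute ranks hand a field off with a non-blocking send and return to time stepping, while dedicated server ranks receive the data and perform the actual file write; all names below are hypothetical and the write routine is a stub:

```c
/* Conceptual sketch only (not the planned C-Coupler3 design): with
 * asynchronous output, compute ranks hand a field off and return to time
 * stepping, while dedicated server ranks receive and write it. All names
 * are hypothetical; write_band_to_file() stands in for the real write. */
#include <mpi.h>

static void write_band_to_file(double *buf, int nvals, int source_rank)
{
    /* Placeholder for the actual (e.g. PnetCDF) write. */
    (void)buf; (void)nvals; (void)source_rank;
}

void compute_rank_output_step(double *field, int nvals, int server_rank,
                              MPI_Comm world, MPI_Request *pending)
{
    /* Post the transfer and return immediately to computation; the buffer
     * must not be reused before MPI_Wait(pending, ...) completes. */
    MPI_Isend(field, nvals, MPI_DOUBLE, server_rank, /* tag */ 200, world, pending);
}

void server_rank_loop(double *recv_buf, int nvals, MPI_Comm world)
{
    for (;;) {
        MPI_Status status;
        MPI_Recv(recv_buf, nvals, MPI_DOUBLE, MPI_ANY_SOURCE, 200, world, &status);
        /* The write cost is paid here, hidden from the compute ranks. */
        write_band_to_file(recv_buf, nvals, status.MPI_SOURCE);
    }
}
```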
As a general comment, the paper includes very detailed information on the implementation of the different software managers in the framework, and some of this information can be summarized without impacting the overall quality of the paper. Similarly, in the case of figures, the authors could consider showing only the relevant parts of the XML configurations and removing the list of PnetCDF APIs.
Citation: https://doi.org/10.5194/gmd-2022-77-RC2
AC2: 'Reply on RC2', Xinzhu Yu, 06 Sep 2022
Dear Reviewer,
Thanks a lot for reviewing our manuscript and for the comments and suggestions.
We would like to reply to some comments here, and will carefully follow all your comments and suggestions when revising the manuscript.
- We will include discussions and tests regarding other I/O frameworks that are frequently used by major climate models. We will include performance test results of a real climate model besides the test model we are currently using, and will further explain how CIOFC can be used by other component models.
- We will rewrite the introduction of the parallel I/O manager and add references to prior work when discussing I/O decompositions, data rearrangements, and grouping I/O processes as a subset of the model compute processes.
- We will rewrite the introductions of the output time series manager and the spatial data interpolation manager to compare them with the implementations of timers and the performance of the interpolation functions in other major climate models.
Citation: https://doi.org/10.5194/gmd-2022-77-AC2
AC3: 'Final Response', Xinzhu Yu, 09 Sep 2022
The comment was uploaded in the form of a supplement: https://gmd.copernicus.org/preprints/gmd-2022-77/gmd-2022-77-AC3-supplement.pdf