the Creative Commons Attribution 4.0 License.
Mixed-Precision Computing in the GRIST Dynamical Core for Weather and Climate Modelling
Abstract. Atmosphere modelling applications are becoming increasingly memory-bound because processor speeds and memory bandwidth develop at different rates. In this study, we mitigate memory bottlenecks and reduce the computational load of the GRIST dynamical core by adopting a mixed-precision computing strategy. Guided by a principle of limited-degree iterative development, we identify the equation terms that are precision-insensitive and change them from double to single precision. The results show that the precision-sensitive terms are predominantly linked to pressure-gradient and gravity terms, while most precision-insensitive terms are advective terms. The computational cost is reduced without compromising the solver accuracy. The runtime of the model's hydrostatic solver, non-hydrostatic solver, and tracer transport solver is reduced by 24 %, 27 %, and 44 %, respectively. A series of idealized tests and real-world weather and climate modelling tests has been performed to assess the optimized model performance qualitatively and quantitatively. In particular, in the high-resolution weather forecast simulation, the model sensitivity to the precision level is dominated by small-scale features, whereas in long-term climate simulation the precision-induced sensitivity can form at the large scale.
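As an illustration of why a difference-type term can be more precision-sensitive than a product-type term, the following minimal sketch compares single- and double-precision results for two toy expressions. It is not taken from GRIST: the helper `to_f32`, the variable names, and all numerical values are hypothetical, and cancellation between large, nearly equal values is only one plausible contributor to the sensitivity reported in the abstract.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (IEEE binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def rel_err(approx: float, exact: float) -> float:
    """Relative error of approx with respect to exact."""
    return abs(approx - exact) / abs(exact)


# Difference of two large, nearly equal values (pressure-gradient-like).
# Hypothetical pressures in Pa at two neighbouring cells.
p_left, p_right = 100000.0, 100000.01
diff64 = p_right - p_left
diff32 = to_f32(to_f32(p_right) - to_f32(p_left))

# Product of two moderate values (advective-flux-like).
# Hypothetical wind speed (m/s) and tracer mixing ratio (kg/kg).
u, q = 12.3, 0.0045
flux64 = u * q
flux32 = to_f32(to_f32(u) * to_f32(q))

grad_err = rel_err(diff32, diff64)
flux_err = rel_err(flux32, flux64)

print(f"relative error of the difference in single precision: {grad_err:.3e}")
print(f"relative error of the product in single precision:    {flux_err:.3e}")
```

In this toy setting the single-precision difference loses most of its significant digits because the spacing of representable values near 10^5 is comparable to the difference itself, while the product keeps roughly seven significant digits.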
Status: final response (author comments only)
RC1: 'Comment on gmd-2024-68', Luca Bertagna, 13 May 2024
The paper provides a good description of a single- and mixed-precision approach in atmosphere modeling, with adequate motivation for current and future models. The authors describe their strategy for building the single- and mixed-precision models, as well as the justification for their choices. The results are divided into a computational-performance section and a physical-performance section, which helps the reader understand the benefits and disadvantages of each approach, both from a computational perspective and from a model-skill one.
Overall, this is a good paper, and I recommend its publication. However, I would like to suggest a few aspects that the authors could clarify better, or provide some perspective on.
----
1. From the text in section 2.2, it seems the authors adopted a "greedy" approach, where each choice of variable to treat in single precision is built on top of the previous one. Is that the case? Have they tried starting from different choices for the initial single-precision variable? If so, what did they observe? If not, do they expect differences? I would think that all paths lead to a similar final configuration of variables switched to single precision, but it would be nice to know for sure whether that is the case or whether different initial choices may lead to different final configurations.
2. The authors say that "the precision optimization tests were conducted using the G8 grid". I assume the resulting configuration of single-precision variables was also used for the G6/G7 tests in section 4.2. Have the authors confirmed that the same configuration was indeed optimal (by means of adding/removing some SGL variables) for other grids as well? If not, would they expect any difference? Why or why not?
3. When formatting equations in section 2.3, be mindful of color-blind readers. I am not one, so I can't give a thumbs up/down, but I would suggest verifying that the color choices do not affect color-blind readers. E.g., the blue term in eq 10 may not look blue, or the green term in eq 5 may not stand out from the red around it. As a possible alternative, you could consider underlining, or using another box (perhaps dashed, to distinguish it from the black one).
4. In section 3, the authors mention that SGL, MIX, and DBL all used the same computational resources. I would assume that, among other things, reduced precision could allow the use of fewer computational resources, which would additionally benefit performance (due to reduced MPI costs). This can be particularly beneficial on machines with a sub-optimal interconnect, as well as on GPU architectures, where increased computational intensity (in terms of degrees of freedom per GPU) can increase overall performance. Have they explored this avenue? What are their thoughts on this?
5. On line 182: why is 24% in parentheses? Seems like the line should be "27%, 24%, and 44%".
6. Fig 2 seems to show absolute L2 error. It may be helpful to show a relative error, so that the reader can better gauge the impact of the SGL/MIX approximations.
7. Have the authors tried to see whether, for a fixed set of SGL variables, the quality of the SGL/MIX approximations (compared to DBL) changes with respect to numerical choices (such as the order of numerical schemes)? If not, do they expect similar quality? It would be nice to know if the need for specific terms in double precision comes from the underlying physics and PDEs, rather than from the particular details of the numerical scheme.
8. While single precision is definitely more appealing at km-scale, it can still be interesting to use it at coarser resolutions. For instance, it could allow running larger ensembles, benefiting UQ investigations. The authors mention the G6, G7, and G8 grids, which are all km or sub-km grids. Have they done any experiments at lower resolutions? If so, did they observe similar patterns? If they haven't done such experiments, are they planning to? Why or why not?
9. In terms of reproducibility, it would help if the authors could share a snapshot of the source code repo, containing all the needed modifications. It would also help to share (perhaps in the form of README files in that same repo) instructions on how to run the particular experiments they ran (e.g., input files, run scripts, peculiar environment settings, ...).
Citation: https://doi.org/10.5194/gmd-2024-68-RC1
- AC1: 'Reply on RC1', Yi Zhang, 29 May 2024
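Point 6 of RC1 contrasts absolute and relative L2 error. The distinction can be sketched as follows, using the standard definitions ||x - ref||_2 and ||x - ref||_2 / ||ref||_2. This is a minimal illustration, not the norm used in the paper: the function names and the sample values for surface pressure and vorticity are hypothetical, and a model diagnostic would typically also apply area weights, which are omitted here.

```python
import math


def l2_abs(test, ref):
    """Absolute L2 error: Euclidean norm of (test - ref), unweighted."""
    return math.sqrt(sum((t - r) ** 2 for t, r in zip(test, ref)))


def l2_rel(test, ref):
    """Relative L2 error: absolute L2 error divided by the L2 norm of ref."""
    return l2_abs(test, ref) / math.sqrt(sum(r * r for r in ref))


# Hypothetical samples: surface pressure in Pa and relative vorticity in 1/s.
ps_ref = [101325.0, 100800.0, 99500.0]
ps_test = [101325.5, 100799.5, 99500.5]
vor_ref = [1.0e-5, -2.0e-5, 3.0e-5]
vor_test = [1.1e-5, -2.1e-5, 3.1e-5]

for name, test, ref in (("ps", ps_test, ps_ref), ("vor", vor_test, vor_ref)):
    print(f"{name}: absolute L2 = {l2_abs(test, ref):.3e}, "
          f"relative L2 = {l2_rel(test, ref):.3e}")
```

With these made-up values the absolute error is larger for pressure, while the relative error is larger for vorticity, which is why a relative (or otherwise scaled) norm makes errors in variables of very different magnitude easier to compare.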
RC2: 'Comment on gmd-2024-68', Filip Vana, 19 May 2024
In this paper, the authors describe the effort of making the GRIST dynamical core computationally cheaper by adopting reduced precision in selected parts of the code. The paper is really well written and logically well structured, which helps its readability; it is a joy to read. The thorough evaluation is quite impressive, as it represents a lot of hard work. It is also done with great care to cover all aspects potentially affected by reduced precision.
The only complaint is that the paper doesn't really attempt to modify the original code in order to make it more profitable for reduced precision. By that I mean that the authors only tried to identify the precision-sensitive parts of the original code that must be evaluated exclusively in double precision. There is no discussion trying to explain this sensitivity, nor an attempt to propose a modification or new method that would allow the use of single precision there as well. But that is perhaps the subject of another paper.
Small points:
1/ At line 110, some error norm is computed based on two model variables, Ps and VOR. How are those norms evaluated, and is there any scaling applied to one of them to make the two norms roughly comparable? The way it is described here is too generic to be followed.
2/ I found it a bit unintuitive to digest the results for the splitting supercell thunderstorms in section 4.2, especially the text around line 245 describing the results presented in figure 4. I am a bit surprised by the great similarity between the double-precision and mixed-precision solutions until 5400 s, with almost bifurcation-like behaviour afterwards. It feels like something strange happens in that time range. I am also quite surprised to find that the highest-resolution runs remain similar across the two precisions while the lower-resolution runs show differences. In our experience it was rather the opposite: higher-resolution runs exhibited higher sensitivity to the numerical precision used. Could this be somehow explained?
Despite my general comment and the two small points, which are questions rather than real complaints, I suggest that the paper be accepted for publication. If the authors wish, they could address my points, but it could be published straight away as submitted. Bravo!
Filip Vana (ECMWF)
Citation: https://doi.org/10.5194/gmd-2024-68-RC2
- AC2: 'Reply on RC2', Yi Zhang, 29 May 2024