Preprints
https://doi.org/10.5194/gmd-2020-323

Submitted as: development and technical paper | 10 Dec 2020

Review status: a revised version of this preprint was accepted for the journal GMD.

The GPU version of LICOM3 under HIP framework and its large-scale application

Pengfei Wang1,3, Jinrong Jiang2,4, Pengfei Lin1,4, Mengrong Ding1, Junlin Wei2, Feng Zhang2, Lian Zhao2, Yiwen Li1, Zipeng Yu1, Weipeng Zheng1,4, Yongqiang Yu1,4, Xuebin Chi2,4, and Hailong Liu1,4
  • 1State Key Laboratory of Numerical Modeling for Atmospheric Sciences and Geophysical Fluid Dynamics (LASG), Institute of Atmospheric Physics (IAP), Chinese Academy of Sciences (CAS), Beijing 100029, China
  • 2Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
  • 3Center for Monsoon System Research (CMSR), Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100190, China
  • 4University of Chinese Academy of Sciences, Beijing 100049, China

Abstract. A high-resolution (1/20°) global ocean general circulation model with graphics processing unit (GPU) code implementations is developed based on the LASG/IAP Climate system Ocean Model version 3 (LICOM3) under the Heterogeneous-compute Interface for Portability (HIP) framework. The dynamic core and physics package of LICOM3 are both ported to the GPU, and 3-dimensional parallelization is applied. The HIP version of LICOM3 (LICOM3-HIP) is 42 times faster than the same number of CPU cores when 384 AMD GPUs and CPU cores are used. LICOM3-HIP has excellent scalability; it still obtains a speedup of more than four on 9216 GPUs compared with 384 GPUs. In this phase, we successfully performed a test of the 1/20° LICOM3-HIP using 6550 nodes and 26200 GPUs, and at this scale the model's speed still increases, reaching about 2.72 simulated years per day (SYPD). The high performance is due to putting almost all of the computation inside the GPUs, which greatly reduces the time cost of data transfer between CPUs and GPUs. At the same time, a 14-year spin-up integration following the surface forcing protocol of phase 2 of the Ocean Model Intercomparison Project (OMIP-2) has been conducted, and the preliminary results have been evaluated. We found that the model results differ little from those of the CPU version. Further comparison with observations and lower-resolution LICOM3 results suggests that the 1/20° LICOM3-HIP can not only reproduce the observations but also produce much smaller-scale activity, such as submesoscale eddies and frontal-scale structures.
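
The scaling figures quoted in the abstract can be put in perspective with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, the speedup of 4.0 is the lower bound implied by "more than four", and the wall-clock estimate assumes, purely for illustration, that a 14-year integration ran at the full-scale rate of 2.72 SYPD, which the abstract does not state.

```python
def scaling_efficiency(speedup, base_gpus, scaled_gpus):
    """Parallel efficiency relative to the baseline GPU count.

    Ideal (linear) scaling would give a speedup equal to
    scaled_gpus / base_gpus; efficiency is the achieved fraction of that.
    """
    ideal_speedup = scaled_gpus / base_gpus
    return speedup / ideal_speedup


def wall_clock_days(simulated_years, sypd):
    """Wall-clock days needed to simulate a number of years at a given SYPD."""
    return simulated_years / sypd


# Figures quoted in the abstract.
base_gpus, scaled_gpus = 384, 9216
min_speedup = 4.0  # "more than four", so this is a lower bound

# Lower bound on strong-scaling efficiency from 384 to 9216 GPUs.
efficiency = scaling_efficiency(min_speedup, base_gpus, scaled_gpus)

# GPUs per node in the largest test (26200 GPUs on 6550 nodes).
gpus_per_node = 26200 / 6550

# Assumption for illustration: a 14-year run at the full-scale 2.72 SYPD.
days_for_spinup = wall_clock_days(14, 2.72)

print(f"GPU count ratio: {scaled_gpus / base_gpus:.0f}x")
print(f"Efficiency lower bound: {efficiency:.1%}")
print(f"GPUs per node: {gpus_per_node:.0f}")
print(f"Days for 14 simulated years at 2.72 SYPD: {days_for_spinup:.2f}")
```

Because the quoted speedup is a lower bound, the computed efficiency is a lower bound as well; the actual value depends on the measured speedup reported in the full paper.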

Status: final response (author comments only)

Data sets

The dataset of figures for "The GPU version of LICOM3 under HIP framework and its large-scale application" (updated) Hailong Liu, Pengfei Wang, Jinrong Jiang, and Pengfei Lin https://doi.org/10.5281/zenodo.4302811

Model code and software

The GPU version of LICOM3 under HIP framework and its large-scale application (updated) Hailong Liu, Pengfei Wang, Jinrong Jiang, and Pengfei Lin https://doi.org/10.5281/zenodo.4302813

Viewed

Total article views: 348 (including HTML, PDF, and XML)
  • HTML: 278
  • PDF: 65
  • XML: 5
  • Total: 348
  • BibTeX: 5
  • EndNote: 5
Views and downloads calculated since 10 Dec 2020.

Viewed (geographical distribution)

Total article views: 255 (including HTML, PDF, and XML), of which 254 have a defined geography and 1 is of unknown origin.
 
Latest update: 12 Apr 2021
Short summary
Global ocean general circulation models are a fundamental tool for oceanographic research, ocean forecasting, and climate change research. Increasing the resolution greatly improves the simulation, but it also demands much more computing resources. In this study, we have ported an ocean general circulation model to a heterogeneous computing system and developed a 3–5 km model version. A 14-year integration has been conducted, and the preliminary results have been evaluated.