Submitted as: methods for assessment of models
02 May 2022
Status: this preprint is currently under review for the journal GMD.

Characterizing Uncertainties of Earth System Modeling with Heterogeneous Many-core Architecture Computing

Yangyang Yu1,2, Shaoqing Zhang1,2,3, Haohuan Fu4,5, Lixin Wu1,2,3, Dexun Chen5, Yang Gao3,6, Zhiqiang Wei3, Dongning Jia3, and Xiaopei Lin1,2,3
  • 1Key Laboratory of Physical Oceanography, Ministry of Education, Institute for Advanced Ocean Study, Frontiers Science Center for Deep Ocean Multispheres and Earth System (DOMES), Ocean University of China, Qingdao, 266100, China
  • 2College of Oceanic and Atmospheric Sciences, Ocean University of China, Qingdao, 266100, China
  • 3Pilot National Laboratory for Marine Science and Technology, Qingdao, 266100, China
  • 4Ministry of Education Key Lab. for Earth System Modeling, and Department of Earth System Science, Tsinghua University, Beijing, 100084, China
  • 5National Supercomputing Center in Wuxi, Wuxi, 214072, China
  • 6Key Laboratory of Marine Environmental Science and Ecology, Ministry of Education, Frontiers Science Center for Deep Ocean Multispheres and Earth System (DOMES), Ocean University of China, Qingdao, 266100, China

Abstract. Physical and thermal limits of semiconductor technology require the adoption of heterogeneous architectures in supercomputers, such as graphics processing units (GPUs) with many-core accelerators and many-core processors with management and computing cores, to sustain continued growth in computing performance. The transition from homogeneous multi-core architectures to heterogeneous many-core architectures can produce “potential differences” that lead to numerical perturbations and uncertainties in simulation results, which can be confounded with errors caused by coding bugs. Developing a methodology to identify these computational perturbations and to secure model correctness is therefore a critical step in model development on computer systems with new architectures. We have developed a methodology to characterize uncertainties in the heterogeneous many-core computing environment. It comprises a simple multiple-column atmospheric model consisting of typical discontinuous physical parameterizations defined by on-off switches, an efficient ensemble-based test approach, and a software tool applied to GPU-based high-performance computing (HPC) and Sunway systems. Statistical distributions from ensembles on the heterogeneous systems provide a quantitative analysis of computational perturbations and acceptable error tolerances. The methodology aims to distinguish perturbations caused by platforms from discrepancies caused by software bugs, and it provides encouraging references for verifying the reliability of supercomputing platforms and for discussing the sensitivity of Earth system modeling to the adoption of new heterogeneous many-core architectures.

Status: open (until 27 Jun 2022)

Latest update: 20 May 2022
Short summary
To understand the scientific consequences of perturbations caused by slave cores in heterogeneous computing environments, we examine the influence of perturbation amplitudes on the determination of cloud bottom and cloud top and compute the probability density function (PDF) of generated clouds. A series of comparisons of the PDFs between homogeneous and heterogeneous systems shows consistently acceptable error tolerances when using slave cores in heterogeneous computing environments.
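
The idea of comparing ensemble PDFs of a switch-controlled diagnostic can be illustrated with a minimal Python sketch. It is not the authors' model or software tool: the toy humidity profile, the 0.8 threshold, the perturbation amplitude, the use of two random seeds as stand-ins for two platforms, and the total variation distance as the comparison metric are all assumptions chosen only for illustration.

```python
import random

def cloud_top_level(humidity, threshold=0.8):
    """Toy on-off switch (hypothetical): index of the highest level whose
    humidity exceeds the threshold, or -1 if no level does."""
    top = -1
    for k, q in enumerate(humidity):
        if q > threshold:
            top = k
    return top

def run_ensemble(base_profile, amplitude, members, seed):
    """Perturb every level by a uniform value in [-amplitude, amplitude]
    for each member and record the diagnosed cloud top."""
    rng = random.Random(seed)
    tops = []
    for _ in range(members):
        perturbed = [q + rng.uniform(-amplitude, amplitude) for q in base_profile]
        tops.append(cloud_top_level(perturbed))
    return tops

def empirical_pdf(samples, levels):
    """Relative frequency of each possible outcome, -1 .. levels-1."""
    counts = {k: 0 for k in range(-1, levels)}
    for s in samples:
        counts[s] += 1
    n = len(samples)
    return {k: c / n for k, c in counts.items()}

def total_variation(p, q):
    """Total variation distance between two discrete distributions
    defined on the same set of outcomes."""
    return 0.5 * sum(abs(p[k] - q[k]) for k in p)

# Assumed humidity profile with values close to the switch threshold,
# so that small perturbations can flip the diagnosed cloud top.
profile = [0.50, 0.79, 0.85, 0.81, 0.79, 0.40]

# Two seeds stand in for a reference platform and a candidate platform.
reference = empirical_pdf(run_ensemble(profile, 0.02, 2000, seed=1), len(profile))
candidate = empirical_pdf(run_ensemble(profile, 0.02, 2000, seed=2), len(profile))

distance = total_variation(reference, candidate)
print(f"total variation distance: {distance:.3f}")
```

In a setting like the one described above, a small distance between the two ensemble distributions would be consistent with platform-level perturbations, whereas a large distance would suggest a discrepancy worth investigating as a possible bug. Any acceptance threshold would need to be calibrated against the real model.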