GPU-HADVPPM4HIP V1.0: using the heterogeneous-compute interface for portability (HIP) to speed up the piecewise parabolic method in the CAMx (v6.10) air quality model on China's domestic GPU-like accelerator
Kai Cao, Qizhong Wu, Lingling Wang, Hengliang Guo, Nan Wang, Huaqiong Cheng, Xiao Tang, Dongxing Li, Lina Liu, Dongqing Li, Hao Wu, and Lanning Wang
Henan Ecological Environmental Monitoring and Safety Center, Henan Key Laboratory of Environmental Monitoring Technology, Zhengzhou 450008, China
Hengliang Guo
National Supercomputing Center in Zhengzhou, Zhengzhou 450001, China
Nan Wang
Henan Ecological Environmental Monitoring and Safety Center, Henan Key Laboratory of Environmental Monitoring Technology, Zhengzhou 450008, China
Huaqiong Cheng
College of Global Change and Earth System Science, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
Joint Center for Earth System Modeling and High Performance Computing, Beijing Normal University, Beijing 100875, China
Xiao Tang
State Key Laboratory of Atmospheric Boundary Layer Physics and Atmospheric Chemistry, Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China
Dongxing Li
College of Global Change and Earth System Science, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
Joint Center for Earth System Modeling and High Performance Computing, Beijing Normal University, Beijing 100875, China
Lina Liu
National Supercomputing Center in Zhengzhou, Zhengzhou 450001, China
Dongqing Li
College of Global Change and Earth System Science, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
Hao Wu
National Supercomputing Center in Zhengzhou, Zhengzhou 450001, China
AMD’s Heterogeneous-compute Interface for Portability (HIP) was used to port the piecewise parabolic method solver from NVIDIA GPUs to China's GPU-like accelerators. The results show that the larger the model scale, the greater the acceleration on the GPU-like accelerator, with a speedup of up to 28.9 times. Multi-level parallelism achieves a speedup of 32.7 times on the heterogeneous cluster. A comparison of the results indicates that the GPU-like accelerators are more accurate for geoscience numerical models.