GPU-HADVPPM V1.0: a high-efficiency parallel GPU design of the piecewise parabolic method (PPM) for horizontal advection in an air quality model (CAMx V6.10)
Kai Cao, Qizhong Wu, Lingling Wang, Nan Wang, Huaqiong Cheng, Xiao Tang, Dongqing Li, and Lanning Wang
Henan Ecological Environment Monitoring and Safety Center, Henan Key Laboratory of Environmental Monitoring Technology, Zhengzhou 450000, China
Nan Wang
Henan Ecological Environment Monitoring and Safety Center, Henan Key Laboratory of Environmental Monitoring Technology, Zhengzhou 450000, China
Huaqiong Cheng
College of Global Change and Earth System Science, Beijing Normal University, Beijing 100875, China
Xiao Tang
State Key Laboratory of Atmospheric Boundary Layer Physics and Atmospheric Chemistry, Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China
Dongqing Li
College of Global Change and Earth System Science, Beijing Normal University, Beijing 100875, China
Offline performance experiments show that GPU-HADVPPM on a V100 GPU achieves a speedup of up to 1113.6× over its original version on an E5-2682 v4 CPU. After a series of optimization measures, the CAMx-CUDA model improves computing efficiency by 128.4× on a single V100 GPU card. A parallel architecture based on an MPI plus CUDA hybrid paradigm is also presented, and it achieves a speedup of up to 4.5× when launching eight CPU cores and eight GPU cards.