Articles | Volume 18, issue 4
https://doi.org/10.5194/gmd-18-1089-2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Enhancing single precision with quasi-double precision: achieving double-precision accuracy in the Model for Prediction Across Scales – Atmosphere (MPAS-A) version 8.2.1
Jiayi Lai
College of Global Change and Earth System Science, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
Lanning Wang
CORRESPONDING AUTHOR
College of Global Change and Earth System Science, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
Joint Center for Earth System Modeling and High-Performance Computing, Beijing Normal University, Beijing 100875, China
Qizhong Wu
College of Global Change and Earth System Science, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
Joint Center for Earth System Modeling and High-Performance Computing, Beijing Normal University, Beijing 100875, China
National Supercomputing Center, Wuxi 214026, China
Fang Wang
CMA Earth System Modeling and Prediction Centre (CEMC), Beijing 100081, China
Related authors
Kai Cao, Qizhong Wu, Xiao Tang, Jinxi Li, Xueshun Chen, Huansheng Chen, Wending Wang, Huangjian Wu, Lei Kong, Jie Li, Jiang Zhu, and Zifa Wang
EGUsphere, https://doi.org/10.5194/egusphere-2025-2918, 2025
This preprint is open for discussion and under review for Geoscientific Model Development (GMD).
Short summary
This study achieves significant acceleration by developing an optimized advection module for the Emission and atmospheric Processes Integrated and Coupled Community Model on GPU-like accelerators. By implementing thread-block coordinated indexing, minimizing CPU-GPU communication, and adopting a hybrid parallelization framework, we demonstrate substantial speedups: 556.5× faster offline performance for the Heterogeneous Interface PPM solver and 20.5× acceleration in coupled simulations.
Zichen Wu, Xueshun Chen, Zifa Wang, Huansheng Chen, Zhe Wang, Qing Mu, Lin Wu, Wending Wang, Xiao Tang, Jie Li, Ying Li, Qizhong Wu, Yang Wang, Zhiyin Zou, and Zijian Jiang
Geosci. Model Dev., 17, 8885–8907, https://doi.org/10.5194/gmd-17-8885-2024, 2024
Short summary
We developed a model to simulate polycyclic aromatic hydrocarbons (PAHs) from global to regional scales. The model reproduces the PAH distribution well. The concentration of BaP (an indicator species for PAHs) could exceed the target value of 1 ng m⁻³ over some areas (e.g., central Europe, India, and eastern China). The change in BaP from 2013 to 2018 is smaller than that in PM2.5. China still faces significant potential health risks posed by BaP, although the Action Plan has been implemented.
Lei Kong, Xiao Tang, Zifa Wang, Jiang Zhu, Jianjun Li, Huangjian Wu, Qizhong Wu, Huansheng Chen, Lili Zhu, Wei Wang, Bing Liu, Qian Wang, Duohong Chen, Yuepeng Pan, Jie Li, Lin Wu, and Gregory R. Carmichael
Earth Syst. Sci. Data, 16, 4351–4387, https://doi.org/10.5194/essd-16-4351-2024, 2024
Short summary
A new long-term inversed emission inventory for Chinese air quality (CAQIEI) is developed in this study, which contains constrained monthly emissions of NOx, SO2, CO, PM2.5, PM10, and NMVOCs in China from 2013 to 2020 with a horizontal resolution of 15 km. Emissions of different air pollutants and their changes during 2013–2020 were investigated and compared with previous emission inventories, which sheds new light on the complex variations of air pollutant emissions in China.
Kai Cao, Qizhong Wu, Lingling Wang, Hengliang Guo, Nan Wang, Huaqiong Cheng, Xiao Tang, Dongxing Li, Lina Liu, Dongqing Li, Hao Wu, and Lanning Wang
Geosci. Model Dev., 17, 6887–6901, https://doi.org/10.5194/gmd-17-6887-2024, 2024
Short summary
AMD’s heterogeneous-compute interface for portability was used to port the piecewise parabolic method solver from NVIDIA GPUs to China's GPU-like accelerators. The results show that the larger the model scale, the greater the acceleration on the GPU-like accelerator, up to 28.9 times. The multi-level parallelism achieves a speedup of 32.7 times on the heterogeneous cluster. A comparison of the results shows that the GPU-like accelerators give higher accuracy for geoscience numerical models.
Zehua Bai, Qizhong Wu, Kai Cao, Yiming Sun, and Huaqiong Cheng
Geosci. Model Dev., 17, 4383–4399, https://doi.org/10.5194/gmd-17-4383-2024, 2024
Short summary
Research on scientific computing on RISC CPU platforms remains relatively limited. MIPS-architecture CPUs, a type of RISC CPU, have distinct advantages in energy efficiency and scalability. The air quality modeling system runs stably on the MIPS and LoongArch platforms, and the experimental results verify the stability of scientific computing on these platforms. The work provides a technical foundation for scientific applications based on MIPS and LoongArch.
Jiaxu Guo, Juepeng Zheng, Yidan Xu, Haohuan Fu, Wei Xue, Lanning Wang, Lin Gan, Ping Gao, Wubing Wan, Xianwei Wu, Zhitao Zhang, Liang Hu, Gaochao Xu, and Xilong Che
Geosci. Model Dev., 17, 3975–3992, https://doi.org/10.5194/gmd-17-3975-2024, 2024
Short summary
To enhance the efficiency of experiments using SCAM, we train a learning-based surrogate model to facilitate large-scale sensitivity analysis and tuning of combinations of multiple parameters. Employing a hybrid method, we investigate the joint sensitivity of multi-parameter combinations across typical cases, identifying the most sensitive three-parameter combination out of 11. Subsequently, we conduct a tuning process aimed at reducing output errors in these cases.
Yaqi Wang, Lanning Wang, Juan Feng, Zhenya Song, Qizhong Wu, and Huaqiong Cheng
Geosci. Model Dev., 16, 6857–6873, https://doi.org/10.5194/gmd-16-6857-2023, 2023
Short summary
In this study, to noticeably improve precipitation simulation in steep mountains, we propose a sub-grid parameterization scheme for the topographic vertical motion in CAM5-SE to revise the original vertical velocity by adding the topographic vertical motion. The dynamic lifting effect of topography is extended from the lowest layer to multiple layers, thus improving the positive deviations of precipitation simulation in high-altitude regions and negative deviations in low-altitude regions.
Xianwei Wu, Liang Hu, Lanning Wang, Haitian Lu, and Juepeng Zheng
Geosci. Model Dev. Discuss., https://doi.org/10.5194/gmd-2023-164, 2023
Revised manuscript not accepted
Short summary
To build an effective surrogate model for the community atmospheric model (CAM), we present a surrogate model-based parameter tuning framework, apply it to improve CAM5 precipitation performance, and propose a multilevel surrogate model-based optimization method. We design a nonuniform parameter parameterization scheme and integrate the parameters using a parameter smoothing scheme; the experimental results improve in four regions.
Kai Cao, Qizhong Wu, Lingling Wang, Nan Wang, Huaqiong Cheng, Xiao Tang, Dongqing Li, and Lanning Wang
Geosci. Model Dev., 16, 4367–4383, https://doi.org/10.5194/gmd-16-4367-2023, 2023
Short summary
Offline performance experiments show that GPU-HADVPPM on a V100 GPU achieves up to a 1113.6× speedup over its original version on an E5-2682 v4 CPU. With a series of optimization measures, the CAMx-CUDA model improves computing efficiency by 128.4× on a single V100 GPU card. A parallel architecture with an MPI plus CUDA hybrid paradigm is presented, and it achieves up to a 4.5× speedup when launching eight CPU cores and eight GPU cards.
Jinming Feng, Meng Luo, Jun Wang, Yuan Qiu, Qizhong Wu, and Ke Wang
EGUsphere, https://doi.org/10.5194/egusphere-2023-867, 2023
Preprint withdrawn
Short summary
We modified the code of the Weather Research and Forecasting Model (WRF) v3.8.1 to include forcing components beyond greenhouse gases and evaluated the impact of forcing configurations on climate simulation results in China. Different external forcing configurations in WRF had a considerable impact on the annual temperature and precipitation trends; this impact was stronger than that of parameterization schemes but weaker than that of spectral nudging.
Yuejin Ye, Zhenya Song, Shengchang Zhou, Yao Liu, Qi Shu, Bingzhuo Wang, Weiguo Liu, Fangli Qiao, and Lanning Wang
Geosci. Model Dev., 15, 5739–5756, https://doi.org/10.5194/gmd-15-5739-2022, 2022
Short summary
The swNEMO_v4.0 is developed with ultrahigh scalability through the concepts of hardware–software co-design based on the characteristics of the new Sunway supercomputer and NEMO4. Three breakthroughs, including an adaptive four-level parallelization design, many-core optimization and mixed-precision optimization, are designed. The simulations achieve 71.48 %, 83.40 % and 99.29 % parallel efficiency with resolutions of 2 km, 1 km and 500 m using 27 988 480 cores, respectively.
Qian Ma, Kaicun Wang, Yanyi He, Liangyuan Su, Qizhong Wu, Han Liu, and Youren Zhang
Earth Syst. Sci. Data, 14, 463–477, https://doi.org/10.5194/essd-14-463-2022, 2022
Short summary
Surface incident solar radiation plays a key role in atmospheric circulation, the water cycle, and ecological equilibrium on Earth. A homogenized century-long surface incident solar radiation dataset was obtained over Japan.
Ying Wei, Xueshun Chen, Huansheng Chen, Yele Sun, Wenyi Yang, Huiyun Du, Qizhong Wu, Dan Chen, Xiujuan Zhao, Jie Li, and Zifa Wang
Geosci. Model Dev., 14, 4411–4428, https://doi.org/10.5194/gmd-14-4411-2021, 2021
Short summary
The sub-grid particle formation (SGPF) in plumes plays an important role in air pollution and climate. We coupled an SGPF scheme to a chemical transport model with an aerosol microphysics module and applied it to investigate the SGPF impact over China. The scheme clearly improved the model performance in simulating aerosol components and particle number at typical sites influenced by point sources. The results indicate the significant effects of SGPF on aerosol particles in industrial areas.
Xueshun Chen, Fangqun Yu, Wenyi Yang, Yele Sun, Huansheng Chen, Wei Du, Jian Zhao, Ying Wei, Lianfang Wei, Huiyun Du, Zhe Wang, Qizhong Wu, Jie Li, Junling An, and Zifa Wang
Atmos. Chem. Phys., 21, 9343–9366, https://doi.org/10.5194/acp-21-9343-2021, 2021
Short summary
Atmospheric aerosol particles have significant climate and health effects that depend on aerosol size, composition, and mixing state. A new global-regional nested aerosol model with an advanced particle microphysics module and a volatility basis set organic aerosol module was developed to simulate aerosol microphysical processes. Simulations strongly suggest the important role of anthropogenic organic species in particle formation over the areas influenced by anthropogenic sources.
Hui Wang, Qizhong Wu, Alex B. Guenther, Xiaochun Yang, Lanning Wang, Tang Xiao, Jie Li, Jinming Feng, Qi Xu, and Huaqiong Cheng
Atmos. Chem. Phys., 21, 4825–4848, https://doi.org/10.5194/acp-21-4825-2021, 2021
Short summary
We assessed the influence of the greening trend on BVOC emission in China. The comparison among different scenarios showed that vegetation changes resulting from land cover management are the main driver of BVOC emission change in China. Climate variability contributed significantly to interannual variations but not much to the long-term trend during the study period.
Lei Kong, Xiao Tang, Jiang Zhu, Zifa Wang, Jianjun Li, Huangjian Wu, Qizhong Wu, Huansheng Chen, Lili Zhu, Wei Wang, Bing Liu, Qian Wang, Duohong Chen, Yuepeng Pan, Tao Song, Fei Li, Haitao Zheng, Guanglin Jia, Miaomiao Lu, Lin Wu, and Gregory R. Carmichael
Earth Syst. Sci. Data, 13, 529–570, https://doi.org/10.5194/essd-13-529-2021, 2021
Short summary
China's air pollution has changed substantially since 2013. Here we have developed a 6-year-long high-resolution air quality reanalysis dataset over China from 2013 to 2018 to illustrate such changes and to provide a basic dataset for relevant studies. Surface fields of PM2.5, PM10, SO2, NO2, CO, and O3 concentrations are provided, and the evaluation results indicate that the reanalysis dataset has excellent performance in reproducing the magnitude and variation of air pollution in China.
Han Xiao, Qizhong Wu, Xiaochun Yang, Lanning Wang, and Huaqiong Cheng
Geosci. Model Dev., 14, 223–238, https://doi.org/10.5194/gmd-14-223-2021, 2021
Short summary
Few studies have investigated the effects of initial conditions on the simulation or prediction of PM2.5 concentrations. Here, sensitivity experiments are used to explore the effects of three initial mechanisms (clean, restart, and continuous) and emissions in Xi’an in December 2016. According to this work, if the restart mechanism cannot be used due to computing resource and storage space limitations when forecasting PM2.5 concentrations, a spin-up time of at least 27 h is needed.
Shaoqing Zhang, Haohuan Fu, Lixin Wu, Yuxuan Li, Hong Wang, Yunhui Zeng, Xiaohui Duan, Wubing Wan, Li Wang, Yuan Zhuang, Hongsong Meng, Kai Xu, Ping Xu, Lin Gan, Zhao Liu, Sihai Wu, Yuhu Chen, Haining Yu, Shupeng Shi, Lanning Wang, Shiming Xu, Wei Xue, Weiguo Liu, Qiang Guo, Jie Zhang, Guanghui Zhu, Yang Tu, Jim Edwards, Allison Baker, Jianlin Yong, Man Yuan, Yangyang Yu, Qiuying Zhang, Zedong Liu, Mingkui Li, Dongning Jia, Guangwen Yang, Zhiqiang Wei, Jingshan Pan, Ping Chang, Gokhan Danabasoglu, Stephen Yeager, Nan Rosenbloom, and Ying Guo
Geosci. Model Dev., 13, 4809–4829, https://doi.org/10.5194/gmd-13-4809-2020, 2020
Short summary
Science advancement and societal needs require Earth system modelling with higher resolutions that demand tremendous computing power. We successfully scale the 10 km ocean and 25 km atmosphere high-resolution Earth system model to a new leading-edge heterogeneous supercomputer using state-of-the-art optimizing methods, promising the solution of high spatial resolution and time-varying frequency. Corresponding technical breakthroughs are of significance in modelling and HPC design communities.
Short summary
High-performance computing limitations often hinder numerical model development. Traditional models use double precision for accuracy, which is computationally expensive. Lower precision reduces costs but can introduce errors. The quasi-double-precision (QDP) algorithm helps mitigate these errors. This study applies the QDP algorithm to the Model for Prediction Across Scales – Atmosphere, showing reduced errors and computational time, making it an efficient solution for large-scale simulations.