Articles | Volume 15, issue 20
Development and technical paper
24 Oct 2022

Development of a regional feature selection-based machine learning system (RFSML v1.0) for air pollution forecasting over China

Li Fang, Jianbing Jin, Arjo Segers, Hai Xiang Lin, Mijie Pang, Cong Xiao, Tuo Deng, and Hong Liao



Short summary
This study proposes a regional feature selection-based machine learning system for short-term air quality prediction in China. The system includes a tool that estimates how important each input is, so that the most useful inputs can be selected to improve prediction. It provides large-scale air quality prediction with improved interpretability, lower training costs, and higher accuracy than a standard machine learning system. It can serve as an early warning for citizens and help reduce exposure to PM2.5 and other air pollutants.
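
The summary does not say how input importance is computed, so the sketch below is an illustration only and not the RFSML implementation. It uses permutation importance, a generic technique swapped in here as an assumption: each input column is shuffled in turn, and the resulting increase in prediction error is taken as that input's importance. The data are synthetic, the feature names (`pm25_lag1`, `wind_speed`, `noise`) are hypothetical, and a small linear model fitted by gradient descent stands in for whatever learner the system actually uses.

```python
import random


def fit_linear(X, y, lr=0.3, epochs=200):
    """Fit y ~ w.x + b by batch gradient descent (a stand-in model)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(model, X):
    w, b = model
    return [sum(wj * xj for wj, xj in zip(w, xi)) + b for xi in X]


def mse(y_true, y_pred):
    return sum((a - p) ** 2 for a, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(model, X, y, n_repeats=5, seed=0):
    """Mean increase in MSE when each input column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(y, predict(model, X))
    scores = []
    for j in range(len(X[0])):
        increase = 0.0
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)  # break the link between input j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            increase += mse(y, predict(model, X_perm)) - baseline
        scores.append(increase / n_repeats)
    return scores


# Synthetic data: the target depends strongly on the first input,
# weakly on the second, and not at all on the third.
names = ["pm25_lag1", "wind_speed", "noise"]  # hypothetical feature names
data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1) for _ in names] for _ in range(200)]
y = [2.0 * r[0] - 1.0 * r[1] + data_rng.gauss(0, 0.1) for r in X]

model = fit_linear(X, y)
scores = permutation_importance(model, X, y)

# Rank inputs by importance and keep the top two as the selected features.
ranked = sorted(zip(names, scores), key=lambda p: p[1], reverse=True)
selected = [name for name, _ in ranked[:2]]
print(ranked)
print(selected)
```

Retraining on only the `selected` inputs is one way a feature selection step could reduce training cost. In a regional setting, the same ranking would presumably be repeated per region, since the informative inputs may differ from place to place.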