EnKF-based fusion of site-available machine learning air quality predictions from RFSML v1.0 and gridded chemical transport model forecasts from GEOS-Chem v13.1.0
Abstract. Statistical methods, particularly machine learning models, have gained significant popularity in air quality prediction. These prediction models are trained on historical measurement datasets collected independently at environmental monitoring stations, and their operational forecasts are driven by real-time ambient pollutant observations. Consequently, these high-quality machine learning models provide predictions only at monitoring sites. In contrast, deterministic chemical transport models (CTMs), which simulate the full life cycle of air pollutants, provide forecasts that are continuous over the 3D field. However, owing to complex error sources in the emission, transport, and removal of pollutants, CTM forecasts are typically biased, particularly at fine scales. In this study, we proposed a gridded prediction with high accuracy, obtained by fusing predictions from our recent regional-feature-selection machine learning model (RFSML v1.0) with a CTM forecast. The prediction fusion was conducted using the ensemble Kalman filter (EnKF), which is based on Bayesian theory. The background error covariance is an essential part of the assimilation process. Ensemble CTM predictions driven by perturbed emission inventories were first used to represent the spatial covariance statistics, which could resolve the main part of the CTM error. In addition, a covariance inflation algorithm was designed to amplify the ensemble perturbations and thereby account for model errors other than the uncertainty in emission inputs. Model evaluation tests were conducted against independent measurements. Our EnKF-based prediction fusion showed significant improvements over the pure CTM. Moreover, covariance inflation further enhanced the fused prediction, particularly in cases of severe underestimation.
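The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a stochastic (perturbed-observation) EnKF, treats the site-level machine learning predictions as the "observations", and uses a constant multiplicative inflation factor in place of the paper's inflation algorithm, which the abstract does not specify. The names `enkf_fuse`, `H`, `obs_var`, and `inflation` are hypothetical.

```python
import numpy as np

def enkf_fuse(ensemble, H, y, obs_var, inflation=1.0, rng=None):
    """One EnKF analysis step (illustrative sketch, assumptions as stated above).

    ensemble : (n_grid, n_members) CTM forecasts, e.g. from perturbed emissions
    H        : (n_obs, n_grid) operator sampling the grid at the sites
    y        : (n_obs,) site-level predictions to fuse into the grid
    obs_var  : scalar error variance assumed for the site predictions
    inflation: multiplicative factor applied to the ensemble perturbations
    """
    rng = np.random.default_rng() if rng is None else rng
    n_obs, n_mem = H.shape[0], ensemble.shape[1]

    # Inflate perturbations about the ensemble mean (the mean is unchanged).
    mean = ensemble.mean(axis=1, keepdims=True)
    X = mean + inflation * (ensemble - mean)
    A = X - mean                      # inflated anomalies
    HA = H @ A                        # anomalies at the sites

    # Sample covariances: P H^T and the innovation covariance S = H P H^T + R.
    P_HT = A @ HA.T / (n_mem - 1)
    S = HA @ HA.T / (n_mem - 1) + obs_var * np.eye(n_obs)

    # Kalman gain K = P H^T S^{-1}; S is symmetric, so solve instead of inverting.
    K = np.linalg.solve(S, P_HT.T).T

    # Perturbed observations, one realisation per ensemble member.
    Y = y[:, None] + rng.normal(0.0, np.sqrt(obs_var), size=(n_obs, n_mem))

    # Analysis ensemble: each member is nudged toward its perturbed observation.
    return X + K @ (Y - H @ X)
```

In this sketch the ensemble covariance is what carries a correction at a monitoring site to neighbouring grid cells, and a larger `inflation` increases the weight given to the site predictions when the ensemble spread underestimates the true CTM error.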
Li Fang et al.
Status: open (until 10 Apr 2023)
- RC1: 'Comment on gmd-2022-301', Anonymous Referee #1, 07 Mar 2023
- RC2: 'Comment on gmd-2022-301', Anonymous Referee #2, 16 Mar 2023
In this article, the authors propose a high-accuracy gridded prediction obtained by fusing the regional-feature-selection machine learning prediction (RFSML v1.0) with a CTM prediction. Ensemble CTM predictions driven by perturbed emission inventories are used to represent the spatial covariance statistics of the CTM error, and a covariance inflation algorithm is designed to amplify the ensemble perturbations and reduce the remaining error. The model is evaluated against independent measurements. The EnKF-based prediction fusion is a significant improvement over the pure CTM. The manuscript is innovative, its content is detailed, the discussion is explicit, and the conclusions are clear. A minor revision adding some content is suggested before publication in the journal Geoscientific Model Development.