A Regional multi-Air Pollutant Assimilation System (RAPAS v1.0) for emission estimates: system development and application

Feng, Shuzhuang; Jiang, Fei; Wu, Zheng; Wang, Hengmao; He, Wei; Shen, Yang; Zhang, Lingyu; Zheng, Yanhua; Lou, Chenxi; Jiang, Ziqiang; Ju, Weimin

doi:https://doi.org/10.5194/gmd-16-5949-2023

Articles | Volume 16, issue 20

https://doi.org/10.5194/gmd-16-5949-2023

Articles | Volume 16, issue 20

Model description paper

20 Oct 2023

Model description paper |

| 20 Oct 2023

A Regional multi-Air Pollutant Assimilation System (RAPAS v1.0) for emission estimates: system development and application

Shuzhuang Feng, Fei Jiang, Zheng Wu, Hengmao Wang, Wei He, Yang Shen, Lingyu Zhang, Yanhua Zheng, Chenxi Lou, Ziqiang Jiang, and Weimin Ju

Abstract

Top-down atmospheric inversion infers surface–atmosphere fluxes from spatially distributed observations of atmospheric composition in order to quantify anthropogenic and natural emissions. In this study, we developed a Regional multi-Air Pollutant Assimilation System (RAPAS v1.0) based on the Weather Research and Forecasting–Community Multiscale Air Quality (WRF–CMAQ) modeling system model, the three-dimensional variational (3D-Var) algorithm, and the ensemble square root filter (EnSRF) algorithm. This system can simultaneously assimilate hourly in situ CO, SO₂, NO₂, PM_2.5, and PM₁₀ observations to infer gridded emissions of CO, SO₂, NO_x, primary PM_2.5 (PPM_2.5), and coarse PM₁₀ (PMC) on a regional scale. In each data assimilation window, we use a “two-step” scheme, in which the emissions are inferred first and then input into the CMAQ model to simulate initial conditions (ICs) of the next window. The posterior emissions are then transferred to the next window as prior emissions, and the original emission inventory is only used in the first window. Additionally, a “super-observation” approach is implemented to decrease the computational costs, observation error correlations, and influence of representative errors. Using this system, we estimated the emissions of CO, SO₂, NO_x, PPM_2.5, and PMC in December and July 2016 over China using nationwide surface observations. The results show that compared to the prior emissions (2016 Multi-resolution Emission Inventory for China – MEIC 2016)), the posterior emissions of CO, SO₂, NO_x, PPM_2.5, and PMC in December 2016 increased by 129 %, 20 %, 5 %, 95 %, and 1045 %, respectively, and the emission uncertainties decreased by 44 %, 45 %, 34 %, 52 %, and 56 %, respectively. With the inverted emissions, the RMSE of simulated concentrations decreased by 40 %–56 %. Sensitivity tests were conducted with different prior emissions, prior uncertainties, and observation errors. The results showed that the two-step scheme employed in RAPAS is robust in estimating emissions using nationwide surface observations over China. This study offers a useful tool for accurately quantifying multi-species anthropogenic emissions at large scales and in near-real time.

Download & links

Article (PDF, 13986 KB)

Supplement (7496 KB)

Download & links

Article (13986 KB)
Full-text XML
Supplement (7496 KB)
BibTeX
EndNote

How to cite.

Received: 26 Apr 2021 – Discussion started: 27 Aug 2021 – Revised: 18 Aug 2023 – Accepted: 06 Sep 2023 – Published: 20 Oct 2023

1 Introduction

Owing to rapid economic development and pollution control legislation, there is an increasing demand for providing updated emission estimates, especially in areas where anthropogenic emissions are intensive. Accurately estimating source emission quantities and spatiotemporal changes resulting from various regulations is imperative and valuable for understanding air quality responses and is crucial for providing timely instructions for the design of future emission regulations. However, most inventories have been developed based on a bottom-up approach and are usually updated with a delay of a few years owing to the complexity of gathering statistical information on activity levels and sector-specific emission factors (Ding et al., 2015). The large uncertainty associated with the low temporal and spatial resolutions of these datasets also greatly limits the assessment of emission changes. Some studies (Bauwens et al., 2020; Shi and Brasseur, 2020) have evaluated emission changes indirectly through concentration measurements; however, air pollution changes are not only dominated by emission changes, but also highly affected by meteorological conditions (Shen et al., 2021).

Top-down atmospheric inversion infers surface–atmosphere fluxes from spatially distributed observations of atmospheric compositions. Recent efforts have been focused on developing air pollution data assimilation (DA) systems to conduct top-down inversions, which can integrate model and multi-source observational information to constrain emission sources. Two major methods are widely used in those DA systems: four-dimensional variational data assimilation (4D-Var) and ensemble Kalman filter (EnKF). 4D-Var provides a global optimal analysis by minimizing a cost function. It shows an implicit flow-dependent background error covariance and can reflect complex nonlinear constraint relationships (Lorenc, 2003). Additionally, a weak constraint 4D-Var method can partly account for the model error by defining a systematic error term in a cost function (Derber, 1989). For example, the GEOS-Chem and TM5 4D-Var frameworks have been used to estimate CH₄ (Alexe et al., 2015; Monteil et al., 2013; Schneising et al., 2009; Stanevich et al., 2021; Wecht et al., 2014) and CO₂ fluxes (Basu et al., 2013; Nassar et al., 2011; H. Wang et al., 2019) from different satellite retrieval products. Additionally, Jiang et al. (2017) and Stavrakou et al. (2008) also used the 4D-Var algorithm to estimate global CO and NO_x emission trends using MOPITT and GOME/SCIAMACHY retrievals, respectively. Using NIES lidar observations, Yumimoto et al. (2008) applied the 4D-Var DA to infer dust emissions over eastern Asia, and the results agreed well with various satellite data and surface observations. Based on surface observations, Meirink et al. (2008) developed a 4D-Var system to optimize monthly methane emissions, which showed a high degree of consistency in posterior emissions and uncertainties when compared with an analogous inversion based on the traditional synthesis approach.

Although considerable progress has been made to reduce large uncertainties in emission inventories, the drawback of the 4D-Var method is the additional development of adjoint models, which are technically difficult and cumbersome for complex chemical transport models (Bocquet and Sakov, 2013). Instead, EnKF uses flow-dependent background error covariance generated by ensemble simulations to map deviations in concentrations to increments of emissions, which is more flexible and easier to implement. Many previous studies have used EnKF techniques to assimilate single- or dual-species observations to optimize the corresponding emission species (Chen et al., 2019; Peng et al., 2017; Schwartz et al., 2014; Sekiyama et al., 2010). Miyazaki et al. (2017) improved NO_x emission estimates using multi-constituent satellite observations and further estimated global surface NO_x emissions from 2005 to 2014. Feng et al. (2020b) used surface observations of NO₂ to infer the NO_x emission changes in China during the COVID-19 epidemic and to quantitatively evaluate the impact of the epidemic on economic activities from the perspective of emission change. Tang et al. (2011) adjusted the emissions of NO_x and volatile organic compounds (VOCs) through assimilating surface O₃ observations and achieved a better performance in O₃ forecasts. However, such a revision may encounter the problem of model error compensation rather than a retrieval of physically meaningful quantities, which should be avoided from overfitting for emission inversion purposes (Bocquet, 2012; Navon, 1998; Tang et al., 2011). The EnKF has also been widely applied to optimize emissions of carbon dioxide (Jiang et al., 2021; Liu et al., 2019), carbon monoxide (Feng et al., 2020a; Mizzi et al., 2018), sulfur dioxide (Chen et al., 2019), ammonia (Kong et al., 2019), etc.

Multi-species data assimilation can efficiently reduce the uncertainty in emission inventories and has led to improvements in air quality forecasting (Ma et al., 2019; Miyazaki et al., 2012b) as it offers additional constraints on emission estimates through improvements in related atmospheric fields, chemical reactions, and gas–particle transformations (Miyazaki and Eskes, 2013). Barbu et al. (2009) updated sulfur oxide (SO_x) emissions with SO₂ and sulfate aerosol observations and found that the simultaneous assimilation of both species performed better than assimilating them separately. Müller and Stavrakou (2005) also found that the simultaneous optimization of the sources of CO and NO_x led to better agreement between simulations and observations compared to the case where only CO observations are used.

The deviation in the chemical initial conditions (ICs) is an important source of error that affects the accuracy of emission inversion because atmospheric inversion fully attributes the biases in simulated and observed concentrations to deviations in emissions (Meirink et al., 2006; Peylin et al., 2005). The biases of concentrations would be compensated for through the unreasonable adjustment of pollution emissions without the optimization of ICs (Tang et al., 2013). Simultaneously optimizing chemical ICs and emissions has been applied in many previous studies to constrain emissions (Ma et al., 2019; Miyazaki et al., 2012a; Peng et al., 2018). For example, Elbern et al. (2007) jointly adjusted O₃, NO_x, and VOC ICs as well as NO_x and VOCs emissions through assimilating surface O₃ and NO_x observations. Although the forecast skills of O₃ were improved, due to the coarse model resolution and the strong nonlinear relationship between O₃ and NO_x, the assimilation of O₃ observations worsened the emission inversion and forecast of NO_x. Peng et al. (2018) assimilated near-surface observations to simultaneously optimize the ICs and emissions. In the 72 h forecast evaluation, their resultant emission succeeded in improving the SO₂ forecast while having little influence on CO and aerosol forecasts and even degrading the forecast of NO₂. Ma et al. (2019) also found that the DA benefits for forecast almost disappeared after 72 h using optimized ICs and emissions. Although a large improvement has been achieved, this method has significant limitations with respect to emission inversion as the contributions from the emissions and chemical ICs to the model's biases are difficult to distinguish (Jiang et al., 2017). In addition, the constraints of the chemical ICs with observations in each assimilation window make the emission inversions between the windows independent. This means that if the emission in one window is overestimated or underestimated, it cannot be transferred to the next window for further correction and compensation. Considering the importance of emissions in chemical field prediction (Bocquet et al., 2015), the rapid disappearance of the DA benefits seems unrealistic, indicating that simultaneously optimizing chemical ICs and emissions may result in a systematic bias in the inverted emissions (Jiang et al., 2021).

Since 2013, China has deployed an air pollution monitoring network that publishes nationwide and real-time hourly surface observations. This dataset provides an opportunity to improve emission estimates using DA. In this study, a regional multi-air pollutant assimilation system using 3D-Var and EnKF DA techniques was constructed to simultaneously assimilate various surface observations (e.g., CO, SO₂, NO₂, O₃, PM_2.5, and PM₁₀). We adopted a two-step method in this system, in which the ICs of each DA window were simulated using the posterior emissions of the previous DA window. The capabilities of RAPAS for reanalysis field generation and emission inversion estimation were also evaluated. The robustness of the system was investigated with different prior inventories, uncertainty settings of prior emissions, and observation errors. The remainder of the paper is organized as follows: Sect. 2 introduces the DA system and observation data, Sect. 3 describes the experimental design, Sect. 4 presents and discusses the results of the system performance and sensitivity tests, and Sect. 5 concludes the paper.

2 Method and data

2.1 System description

2.1.1 Procedure of the assimilation system

A regional air pollutant assimilation system has been preliminarily constructed and successfully applied in our previous studies to optimize the gridded CO and NO_x emissions (Feng et al., 2020a, b). Herein, the system was further extended to simultaneously assimilate multiple species (e.g., CO, SO₂, NO₂, O₃, PM_2.5, and PM₁₀) and was officially named the Regional multi-Air Pollutant Assimilation System (RAPAS v1.0). RAPAS has three components: a regional chemical transport model (CTM), which is coupled offline and used to simulate the meteorological fields and atmospheric compositions, and the 3D-Var and ensemble square root filter (EnSRF) modules, which are used to optimize chemical ICs (Feng et al., 2018; Jiang et al., 2013b) and anthropogenic emissions (Feng et al., 2020a, b), respectively. 3D-Var was introduced considering its excellent performance in our previous study and the lower computational cost during the spin-up period in optimizing ICs. Additionally, the 3D-Var method can obtain a better IC than the EnKF method (Schwartz et al., 2014).

Based on the above three components, RAPAS was divided into two subsystems: the IC assimilation (IA) subsystem (CTM plus 3D-Var) and the emission inversion (EI) subsystem (CTM plus EnSRF). As shown in Fig. 1, the IA subsystem was first run to optimize the chemical ICs (Kleist et al., 2009; Wu et al., 2002) for the subsequent EI subsystem. Distinguishing the source type of the model–observation mismatch error was not required for the IA subsystem. The EI subsystem runs cyclically with a two-step scheme. In the first step, the prior emissions (X^b) are perturbed and input into the CTM model to simulate chemical concentration ensembles. The simulated concentrations of the lowest model level were then interpolated to the observation space according to the locations and times of the observations using the nearest-neighbor interpolation method. Prior emissions (X^b), simulated observations, and real observations were entered into the EnSRF module to generate optimized emissions (X^a). In the second step, the optimized emissions were re-entered into the CTM model to generate the ICs of the next DA window. Meanwhile, the optimized emissions were transferred to the next window as prior emissions. Unlike the joint adjustment of ICs and emissions (“one-step” scheme) in emission inversion (Chen et al., 2019), the two-step scheme needs to run the CTM model twice, which is time-consuming, but can transfer the potential errors of the inverted emissions in one DA window to the next for further correction.

https://gmd.copernicus.org/articles/16/5949/2023/gmd-16-5949-2023-f01

Figure 1Composition and flowchart of RAPAS. X^a and X^b represent the prior and posterior emissions. The 3D-Var assimilation stage lasts 5 d with a data input frequency of 6 h, and the DA window in the EI subsystem is set to 1 d.

2.1.2 Atmospheric transport model

The regional chemical transport model of the Weather Research and Forecasting–Community Multiscale Air Quality (WRF–CMAQ) modeling system was adopted in this study (Skamarock and Klemp, 2008; Byun and Schere, 2006). CMAQ is a regional 3D Eulerian atmospheric chemistry and transport model with a “one-atmosphere” design developed by the US Environmental Protection Agency (EPA). It can simultaneously address the complex interactions among multiple pollutants as well as air quality issues. The CMAQ model was driven by the WRF model, which is a state-of-the-art mesoscale numerical weather prediction system designed for both atmospheric research and meteorological field forecasting. In this study, WRF version 4.0 and CMAQ version 5.0.2 were used. The WRF simulations were performed with a 36 km horizontal resolution on 169 × 129 grids, covering all of mainland China (Fig. 2). This spatial resolution has been widely adopted in regional simulations as it can provide good simulations of spatiotemporal variations in air pollutants (Mueller and Mallard, 2011; Sharma et al. 2016). In the vertical direction, there were 51 sigma levels on the sigma–pressure coordinates extending from the surface to 100 hPa. The underlying surface of the urban and built-up land was replaced by the MODIS land cover retrieval of 2016 to adapt to the rapid expansion of urbanization. The CMAQ model was run with the same domain but with three grid cells removed from each side of the WRF domain. There were 15 layers in the CMAQ vertical coordinates, which were interpolated from 51 WRF layers.

The meteorological initial and lateral boundary conditions were both provided by the final operational global analysis data of the National Centers for Environmental Prediction (NCEP) with a 1^∘ × 1^∘ resolution at 6 h intervals. The chemical lateral boundary conditions and chemical ICs in the IA subsystem originate from background profiles. As mentioned above, in the EI subsystem, the chemical IC in the first window is provided by the IA subsystem, and in the following windows, it is forward simulated using optimized emissions from the previous window. Carbon Bond 05 with updated toluene chemistry (CB05tucl) and the sixth-generation aerosol module (AERO6) were chosen as the gas-phase and aerosol chemical mechanisms, respectively (Appel et al., 2013; Sarwar et al., 2012). The detailed physical and chemical configurations are listed in Table 1.

https://gmd.copernicus.org/articles/16/5949/2023/gmd-16-5949-2023-f02

Figure 2Model domain and observation network. The dashed red frame depicts the CMAQ computational domain, the black squares represent the surface meteorological measurement sites, the navy blue triangles represent the sounding sites, and the red and blue dots represent the air pollution measurement sites. Observations from all sites were assimilated in the 3D-Var subsystem, while observations of city sites where red dots were averaged are used for assimilation and where blue dots were averaged are used for independent evaluation in the EI subsystem; the boxed subregions are the North China Plain (NCP) and Yangtze River Delta (YRD), and the shaded area depicts the topography.

Download

Table 1Configuration options of Weather Research and Forecasting–Community Multiscale Air Quality model (WRF–CMAQ).

Download Print Version | Download XLSX

2.1.3 3D-Var assimilation algorithm

The Gridpoint Statistical Interpolation (GSI) system developed by the US NCEP was utilized in this study. Building on the work of Liu et al. (2011), Jiang et al. (2013b), and Feng et al. (2018), we extended the GSI to simultaneously assimilate multiple species (including CO, SO₂, NO₂, O₃, PM_2.5, and PM₁₀) and first used individual aerosol species of PM_2.5 as analysis variables within the GSI–WRF–CMAQ framework. Additional work includes the construction of surface air pollutant observation operators, the updating of observation errors, and the statistics of background error covariance for the analysis variables. Moreover, the data interface was modified to read/write the CMAQ output/input file directly, which was easy to implement.

In the sense of minimum analysis error variance, the 3D-Var algorithm optimizes the analysis fields with observations through an iterative process to minimize the cost function J(x) defined below:

\begin{matrix} (1) & \begin{aligned} J (x) & = \frac{1}{2} (x_{a} - x_{b})^{T} B^{- 1} (x_{a} - x_{b}) \\ + \frac{1}{2} [H (x_{a}) - y]^{T} R^{- 1} [H (x_{a}) - y], \end{aligned} \end{matrix}

where x_a is a vector of the analysis field; x_b is the background field; y is the vector of observations; B and R are the background and observation error covariance matrices, respectively, representing the relative contributions to the analysis; and H is the observation operator that maps the model variables to the observation space.

The analysis variables were the 3D mass concentrations of the pollution components (e.g., CO and sulfate) at each grid point. Hourly mean surface pollution observations within a 1 h window of the analysis were assimilated. To assimilate the surface pollution observations, model-simulated compositions were first diagnosed at observation locations. For gas concentrations to be directly used as analysis variables, the units need to be converted from parts per million (ppm) and parts per billion (ppb) to milligrams per cubic meter (mg m⁻³) and micrograms per cubic meter (µg m⁻³), respectively, to match the observations. The model-simulated PM_2.5 and PM₁₀ concentrations at the ground level were diagnosed as follows:

\begin{matrix} (2) & \begin{aligned} {PM}_{2.5} & = f_{i} \times {PM}_{i} + f_{j} \times {PM}_{j} + f_{k} \times {PM}_{k} = OC \\ + EC + {SO}_{4}^{2 -} + {NO}_{3}^{-} + {NH}_{4}^{+} + SEAS + {AP}_{2.5}, \end{aligned} \\ (3) & {PM}_{10} = {PM}_{i} + {PM}_{j} + {PM}_{k} = {PM}_{2.5} + PMC, \end{matrix}

where f_i, f_j, and f_k are the PM_2.5 fractions of the Aitken, accumulation, and coarse modes, respectively. These ratios are recommended as the concentrations of PM_2.5 and fine-mode aerosols (i.e., Aitken plus accumulation) can differ because PM_2.5 particles include small tails from the coarse mode in the CMAQ model (Binkowski and Roselle, 2003; Jiang et al., 2006). PM_i, PM_j, and PM_k are the mass concentrations of the Aitken, accumulation, and coarse modes in the CMAQ model, respectively. Seven aerosol species of PM_2.5 (organic carbon (OC), elemental carbon (EC), sulfate (SO $_{4}^{2 -}$ ), nitrate (NO $_{3}^{-}$ ), ammonium (NH $_{4}^{+}$ ), sea salt (SEAS), and fine-mode unspeciated aerosols (AP_2.5)) and additional coarse PM₁₀ (PMC) were extracted as analysis variables and were updated using the PM_2.5 and PMC observations. Before calculating Eq. (1) within the GSI, the analysis variables were bilinearly interpolated in the horizontal direction to the observation locations.

Calculating background error covariance (B) is generally costly and difficult when a high-dimensional numerical model is used. For simplification, B was represented as a product of spatial correlation matrices and standard deviations (SDs):

\begin{matrix} (4) & B = {DCD}^{T} \\ (5) & C = C_{x} \otimes C_{y} \otimes C_{z}, \end{matrix}

where D is the background error SD matrix; C is the background error correlation matrix; ⊗ is the Kronecker product; and C_x, C_y, and C_z denote three 1D correlation submatrices in the longitude, latitude, and vertical coordinate directions, respectively. C_x and C_y are assumed to be horizontally isotropic such that they can be represented using a Gaussian function. The correlation between any two points x_i and x_j in the horizontal direction is expressed as follows:

\begin{matrix} (6) & c (x_{i}, x_{j}) = e^{- \frac{(x_{i} - x_{j})^{2}}{2 L^{2}}}, \end{matrix}

where L is the horizontal correlation scale estimated using the proxy of the background error (Fig. 3). The vertical correlation matrix C_z is directly estimated from the model background field as C_z is only an n_z×n_z (here, n_z=15) matrix.

https://gmd.copernicus.org/articles/16/5949/2023/gmd-16-5949-2023-f03

Figure 3Vertical profiles of standard deviations (a, ppm; b and c, µg m⁻³) and horizontal (d–f, km) and vertical (g–i, sigma units) length scales for CO, SO₂, NO₂, O₃, sulfate, nitrate, ammonium, EC, OC, sea salt, unspeciated aerosols (AP_2.5), PMC, PM_2.5, and PM₁₀.

Download

To estimate these matrices, the National Meteorological Center (NMC) method was used to compute B for each variable by taking the differences between forecasts of different lengths valid at the same time (Parrish and Derber, 1992; Rabier et al., 1998). Differences between the 24 and 12 h WRF–CMAQ forecasts of 60 pairs (2 pairs per day) of analysis variables valid at either 00:00 or 12:00 UTC during November 2016 were used. The horizontal and vertical length scales of the correlation matrices were estimated using recursive filters (Purser et al., 2003). The vertical distribution of the background error SDs, which varies with height and species, is shown in Fig. 3. The vertical profile of the background error SDs corresponds to the vertical concentration distribution. This means that higher concentrations tend to have larger background error SDs (e.g., CO and nitrate). These SDs exhibit a common reduction as the height increases, especially at the top of the boundary layer. The horizontal correlation of the background error determines the propagation of observation information in this direction, whereas the vertical correlation determines the vertical extension of such increments. For gaseous pollutants and most individual aerosol components, the horizontal length scales increased with height, whereas for the total particulate matter (i.e., PM_2.5, PM₁₀), the scales increased with height in the boundary layer and decreased with height in the free troposphere. The ground-level scale generally spreads across 40–45 km for all control variables. The vertical length scale of most species first increased and then decreased with height, which may be related to vertical mixing (Kahnert, 2008) and stack emissions at approximately 200 m height.

2.1.4 EnKF assimilation algorithm

In EnKF, the time-dependent uncertainties of the state variables are estimated using a Monte Carlo approach through an ensemble. Uncertainty can be propagated using linear or nonlinear dynamic models (flow-dependent background error covariance) by simply implementing ensemble simulations. The EnSRF algorithm introduced by Bierman (1977) and Maybeck (1979) was used to constrain pollution emissions in this study. EnSRF is a deterministic EnKF that obviates the need to perturb observations, which has a higher computational efficiency and a better performance (Sun et al., 2009).

The perturbation of the prior emissions represents the uncertainty. We implemented additive emission adjustment methods, which were calculated using the following function:

\begin{matrix} (7) & X_{i}^{b} = X_{0}^{b} + δ X_{i}^{b}, i = 1, 2, \dots, N, \end{matrix}

where b is the background (prior) state; i is the identifier of the perturbed samples; and N is the ensemble size, which was set to 40 considering the trade-off between computational cost and inversion accuracy (Fig. S1). In contrast to the estimation of parameters based on the augmentation of the conventional state vector (e.g., concentrations) with the parameter variables, X only comprises emissions in this study (similarly hereafter). $δ X_{i}^{b}$ is the randomly perturbed samples added to the prior emissions $X_{0}^{b}$ to produce ensemble samples of the inputs $X_{i}^{b}$ . $δ X_{i}^{b}$ is drawn from Gaussian distributions with a mean of 0 and standard deviation of the prior emission uncertainty in each grid. The state variables of the emissions include CO, SO₂, NO_x, primary PM_2.5 (PPM_2.5), and PMC. We used variable localization to update the analysis, which means that the covariance among different state variables was not considered, and the emission of one species was constrained only by its corresponding air pollutant observation. This method has been widely used in chemical data assimilation systems to avoid spurious correlations between species (Ma et al., 2019; Miyazaki et al., 2012b).

After obtaining an ensemble of state vectors (prior emissions), ensemble runs of the CMAQ model were conducted to propagate the errors in the model with each ensemble sample of state vectors. Combined with the observational vector y, the state vector $\overline{X^{b}}$ was updated by minimizing the analysis variance.

\begin{matrix} (8) & \overline{X^{a}} = \overline{X^{b}} + K (y - H \overline{X^{b}}), \\ (9) & K = P^{b} H^{T} ({HP}^{b} H^{T} + R)^{- 1}, \\ (10) & P^{b} = \frac{1}{N - 1} \sum_{i = 1}^{N} (X_{i}^{b} - \overline{X^{b}}) (X_{i}^{b} - \overline{X^{b}})^{T}, \\ (11) & δ X_{i}^{a} = δ X_{i}^{b} - \tilde{K} H δ X_{i}^{b} \end{matrix}

While employing sequential assimilation and independent observations, $\tilde{K}$ is calculated as follows:

\begin{matrix} (12) & \tilde{K} = {(1 + \sqrt{R / ({HP}^{b} H^{T} + R)})}^{- 1} K, \end{matrix}

where $\overline{X^{b}}$ is the mean of the ensemble samples $X_{i}^{b}$ ; H is the observation operator that maps the model space to the observation space, consisting of the model integration process converting emissions into concentrations and spatial interpolation matching the model concentration to the locations of the observations; $y - H \overline{X^{b}}$ reflects the differences between the simulated and observed concentrations; P^b is the ensemble-estimated background (a priori) error covariance; K is the Kalman gain matrix of the ensemble mean depending on the background error covariance P^b and the observation error covariance R, representing the relative contributions to analysis; and $\tilde{K}$ is the Kalman gain matrix of the ensemble perturbation, which is used to calculate emission perturbations after inversions $δ X_{i}^{a}$ . The ensemble mean $\overline{X^{a}}$ of the analyzed state was considered the best estimate of the emissions.

When large volumes of site observations are at a much higher resolution than the model grid spacing, many correlated or fully consistent model–data mismatch errors can appear in one cluster, resulting in excessive adjustments and deteriorated model performance (Houtekamer and Mitchell, 2001). To reduce the horizontal observation error correlations and influence of representativeness errors, a super-observation approach combining multiple noisy observations located within the same grid and assimilation window was developed based on optimal estimation theory (Miyazaki et al., 2012a). Previous studies have demonstrated the necessity for data-thinning and dealiasing errors (Feng et al., 2020b; Zhang et al., 2009a). The super-observation y_new, super-observation error r_new, and corresponding simulation x_new,i of the ith sample are calculated as follows:

\begin{matrix} (13) & 1 / r_{new}^{2} = \sum_{j = 1}^{m} 1 / r_{j}^{2}, \\ (14) & y_{new} = \sum_{j = 1}^{m} w_{j} y_{j} / \sum_{j = 1}^{m} w_{j}, \\ (15) & x_{new, i} = \sum_{j = 1}^{m} w_{j} x_{i j} / \sum_{j = 1}^{m} w_{j}, \end{matrix}

where j is the identifier of m observations within a super-observation grid, r_j is the observational error of the actual jth observation y_j, x_ij is the simulated concentration using the ith prior emission sample corresponding to the jth observation, and $w_{j} = 1 / r_{j}^{2}$ is the weighting factor. The super-observation error decreased as the number of observations used within a super-observation increased. This method was used in our previous inversions using surface-based (Feng et al., 2020b) and satellite-based (Jiang et al., 2021) observations.

In this study, the DA window was set to 1 d because the model requires a longer time to integrate the emission information into the concentration ensembles (Ma et al., 2019). Due to the super-observation approach, only one assimilation is needed per grid cell in one assimilation window. In addition, owing to the complexity of hourly emissions, it is difficult to simulate hourly concentrations that match the observations well. Although a longer DA window would allow more observations to constrain the emission change of one grid, the spurious correlation signals of EnKF would attenuate the observation information over time (Bruhwiler et al., 2005; Jiang et al., 2021). Kang et al. (2012) conducted observing system simulation experiments (OSSEs) and demonstrated that owing to the transport errors and increased spurious correlation, a longer DA window (e.g., 3 weeks) would cause the analysis system to blur essential emission information away from the observation. Therefore, daily mean simulations and observations were used in the EnSRF algorithm, and daily emissions were optimized in this system.

EnKF is subject to spurious correlations because of the limited number of ensembles when it is applied in high-dimensional atmospheric models, which can cause rank deficiencies in the estimated background error covariance and filter divergence as well as further degrade analyses and forecasts (Wang et al., 2020). Covariance localization is performed to reduce spurious correlations caused by a finite ensemble size (Houtekamer and Mitchell, 2001). Covariance localization preserves the meaningful impact of observations on state variables within a certain distance (cutoff radius) but limits the detrimental impact of observations on remote state variables. The localization function of Gaspari and Cohn (Gaspari and Cohn, 1999) is used in this system, which is a piecewise continuous fifth-order polynomial approximation of a normal distribution. The optimal localization scale is related to the ensemble size, assimilation window, dynamic system, and lifetime of the chemical species in the atmosphere. CO, SO₂, and PM_2.5 are rather stable in the atmosphere, with a lifetime of more than 1 d. According to the average wind speed (3.3 m s⁻¹, Table 4) and length of the DA window, the localization scales of CO, SO₂, and PM_2.5 were set to 300 km. In addition, the localization scales of NO₂, which is rather reactive and has a lifetime of approximately 10 h in winter (de Foy et al., 2015), and PMC, which mainly comes from local sources and has a short residence time in the atmosphere owing to the rapid deposition rate (Clements et al., 2014, 2016; Hinds, 1982), were set to 150 and 250 km, respectively.

2.2 Prior emissions and uncertainties

Anthropogenic emissions over China were obtained from the 2016 Multi-resolution Emission Inventory for China (MEIC 2016) (Zheng et al., 2018), while those over the other regions of eastern Asia were obtained from the mosaic Asian anthropogenic emission inventory (MIX) (Li et al., 2017). The spatial resolutions of the MEIC and MIX inventories were both 0.25^∘ × 0.25^∘, and they are downscaled to match the model grid spacing of 36 km. The spatial distributions of CO, SO₂, NO_x, PPM_2.5, and PMC emissions are shown in Fig. 12. The daily emission inventory, which was arithmetically averaged from the combined monthly emission inventory, was directly used in the EI subsystem and was employed as the prior emission of the first DA window in the EI subsystem (Fig. 1). During the simulations, daily emissions were further converted to hourly emissions. All species emitted from area sources were converted to hourly emissions using the same diurnal profile (Fig. S2), and for the point source, we assumed that there was no diurnal change. MEIC 2012 was used as an alternative a priori over China to investigate the impact of different prior emissions on optimized emissions. The Model of Emissions of Gases and Aerosols from Nature (MEGAN) (Guenther et al., 2012) was used to calculate time-dependent biogenic emissions, which was driven by the WRF model. Biomass burning emissions were not included because they have little impact across China during the study period (Zhang et al., 2020).

During the inversion cycles, inverted emissions of different members converge gradually, and the ensemble-estimated error covariance matrix is likely to be underestimated. To avoid this, considering the compensation of model errors and comparable emission uncertainties from one day to the next, we imposed the same uncertainty on emissions at each DA window. As mentioned above, the optimized emissions of the current DA window were transferred to the next DA window as prior emissions. The technology-based emission inventory developed by Zhang et al. (2009b), using the same method as MEIC, showed that the emissions of PMC and PPM_2.5 had the largest uncertainties, followed by CO and finally SO₂ and NO_x. Therefore, the uncertainties in PMC, PPM_2.5, CO, SO₂, and NO_x in this study were set as 40 %, 40 %, 30 %, 25 %, and 25 %, respectively. However, previous studies have shown that inversely estimated CO and PMC emissions can exceed 100 % higher than the bottom-up emissions (MEIC) in certain areas (Feng et al., 2020b; Ma et al., 2019). Therefore, according to the extent of underestimation, we set an uncertainty of 100 % for both the CO and the PMC emissions at the beginning of the three DA windows to quickly converge the emissions. Mean emission analysis is generally minimally sensitive to the uncertainty setting in the assimilation cycle method (Feng et al., 2020a; Gurney et al., 2004; Miyazaki et al., 2012a) as the inversion errors of the current window can be transferred to the next window for further optimization (Sect. 4.3).

2.3 Observation data and errors

Hourly-averaged surface CO, SO₂, NO₂, O₃, PM_2.5, and PM₁₀ observations from 1504 national air quality monitoring stations were assimilated into this system, which were obtained from the Ministry of Ecology and Environment of the People's Republic of China (https://air.cnemc.cn:18007/, last access: 25 June 2020). These sites are distributed over most of central and eastern China and become denser near metropolitan areas (see Fig. 2). To ensure data quality, value-range checks were performed to eliminate unrealistic or unrepresentative observations, and only the observations within the subjectively selected threshold range were assimilated (Table 2). In addition, a time-continuity check was performed to eliminate gross outliers and sudden anomalies using the function $max (|O (t) - O (t \pm 1)|) \leq f (t$ ), where O(t) and O(t±1) represent observations at time t and t±1, respectively, and $f (t) = T_{a} + T_{b} \times O_{t}$ . This means that the concentration difference between time t and time t+1 and time t−1 should be less than f(t). T_b was fixed at 0.15, and the section of T_a is given in Table 2, which was determined empirically according to the time series change in the concentration at each site. To avoid potential cross-correlations, we assimilated PM_2.5 and PMC. Additionally, in the EI subsystem, the observations within each city were averaged to reduce the data density, reduce the error correlation, and increase spatial representation (Houtekamer and Mitchell, 2001; Houtekamer and Zhang, 2016). Finally, 336 city sites were available across mainland China, in which data from 311 cities were selected for assimilation, and the remaining 25 were selected for independent validation (Fig. 2). In the IA subsystem, owing to the small horizontal correlation scale (Fig. 3), all site observations were assimilated to provide a good IC for the next emission inversion in order to obtain more extensive observation constraints.

The observation error covariance matrix (R) includes both the measurement and the representation errors. The measurement error ε₀ is defined as follows:

\begin{matrix} (16) & ε_{0} = ermax + ermin \times Π_{0}, \end{matrix}

where ermax is the base error, and Π₀ denotes the observed concentration. These parameters for different species are listed in Table 2 and were determined according to Chen et al. (2019), Feng et al. (2018), and Jiang et al. (2013b).

The representative error depends on the model resolution and characteristics of the observation locations, which were calculated using the equations of Elbern et al. (2007), defined as follows:

\begin{matrix} (17) & ε_{r} = γ ε_{0} \sqrt{Δ l / L}, \end{matrix}

where γ is a tunable parameter (here, γ=0.5), Δl is the grid spacing (36 km), and L is the radius (3 km for simplification) of the influence area of the observation. The total observation error (r) was defined as follows:

\begin{matrix} (18) & r = \sqrt{ε_{0}^{2} + ε_{r}^{2}} . \end{matrix}

Table 2Parameters of quality control and measurement error.

Download Print Version | Download XLSX

3 Experimental design

RAPAS inversions were conducted according to the procedure and settings described in Sect. 2. December is one of the months with the most severe air pollution, whereas July is one of the least polluted months in China. Therefore, this study mainly tested the performance of the RAPAS system over these 2 months. For December, the IA subsystem was run from 26 to 30 November 2016, with a 6 h interval cycling assimilation to optimize ICs (ICDA). A better IC at 00:00 UTC on 1 December could be obtained by a 5 d high-frequency cycling assimilation and atmospheric mixing. The EI subsystem was then run for December 2016 with a 1 d assimilation window to optimize emissions (EMDA). In July, the system operated identically to that of December. It should be noted that owing to the stronger atmospheric oxidation, the lifetime of NO₂ in July was significantly shorter than that in December; thus, we adopted a smaller localization scale for NO₂ (80 km). Both assimilation experiments used the combined prior emission inventories of 2016, as described in Sect. 2.2, and the emission base year coincided with the research stage. An observing system simulation experiment (OSSE) was conducted to evaluate the performance of the RAPAS system, which has been widely used in previous assimilation system developments (Daley, 1997). In the OSSE experiment, we used the MEIC 2016 inventory as a “true” emission and the true emission, reduced by 30 % over mainland China, as a prior emission. The simulations were performed using the true emission and sampled according to the locations and times of the real observations used as artificial observations. The observation errors were the same as those in EMDA. To evaluate the IC improvements from the IA subsystem, an experiment without 3D-Var (NODA) was conducted with the same meteorological fields and physical and chemistry parameterization settings as those of the ICDA. To evaluate the posterior emissions of the EI subsystem, two parallel forward-modeling experiments were performed for December 2016: a control experiment (CEP) with prior (MEIC 2016) emissions and a validation experiment (VEP) with posterior emissions. Both experiments used the same IC at 00:00 UTC on 1 December generated through the IA subsystem. The only difference between CEP and VEP were emissions. Table 3 summarizes the different emission inversion experiments conducted in this study.

To investigate the robustness of our system, seven sensitivity tests (from EMS1 to EMS7; see Table 3) were performed. These experiments were all based on EMDA. EMS1 used MEIC 2012 as the original prior emission in China, aiming to investigate the impact of different prior inventories on the estimates of emissions. The other experiments (EMS2–5) aimed to test the impact of different prior uncertainty settings, in which the prior uncertainties were reduced by −50 % and −25 % and increased by 25 % and 50 %, respectively. EMS6 aimed to evaluate the impact of observation errors on emission estimates, in which all observation errors are magnified twice. EMS7 aimed to evaluate the impact of IC optimization of the first window on emission estimates, in which the ICs were taken from a 5 d spin-up simulation. Eight forward-modeling experiments (VEP1, VEP2, …, VEP7) were also performed with the posterior emissions of EMS1 to EMS7 to evaluate their performance.

Table 3Emission inversion and sensitivity experiments conducted in this study.

Download Print Version | Download XLSX

4 Results

4.1 Evaluations

4.1.1 Simulated meteorological fields

In the RAPAS system, the inversion approach attributes all biases between the simulated and observed concentrations to emissions. Meteorological fields dominate the physical and chemical processes of air pollutants in the atmosphere, and thus their simulation accuracy would significantly affect the estimates of emissions in this study. To quantitatively evaluate the performance of the WRF simulations, the mean bias (BIAS), root mean square error (RMSE), and correlation coefficient (CORR) were calculated against the surface meteorological observations measured at 400 stations, and the planetary boundary layer height (PBLH) was calculated using the sounding data at 92 sites. Surface observations were obtained from the National Climatic Data Center integrated surface database (http://www.ncdc.noaa.gov/oa/ncdc.html, last access: 25 October 2021), and sounding data were obtained from the website of the University of Wyoming (http://weather.uwyo.edu/upperair/sounding.html, last access: 10 March 2022). The sounding data had a 12 h interval. The observed PBLH was calculated using sound data via the bulk Richardson number method (Richardson et al., 2013). The spatial distribution of meteorological stations is shown in Fig. 2. The simulated temperature at 2 m (T2), relative humidity at 2 m (RH2), wind speed at 10 m (WS10), and PBLH from 26 November to 31 December 2016 were evaluated against the observations. Table 4 summarizes the statistical results of the evaluation of the simulated meteorological parameters. Overall, T2, RH2, and PBLH were slightly underestimated, with biases of −0.1 ^∘C, −3.8 %, and −41.1 m, respectively. CORRs were approximately 0.98 for T2, 0.94 for RH2, and 0.90 for PBLH, showing good consistency between the observations and simulations. WS10 was overestimated, with a bias of 0.7 m s⁻¹ and an RMSE of 0.8 m s⁻¹ but was better than the simulations from many previous studies (Chen et al., 2016; Jiang et al., 2012a, b). Therefore, the WRF can generally reproduce meteorological conditions sufficiently in terms of their temporal variation and magnitude over China, which is adequate for our inversion estimation.

Table 4Statistics comparing the simulated and observed 10 m wind speed (WS10), 2 m temperature (T2), and 2 m relative humidity (RH2), as well as the planetary boundary layer height (PBLH). BIAS: mean bias; RMSE: root mean square error; CORR: correlation coefficient.

Download Print Version | Download XLSX

4.1.2 Initial conditions

Figure 4 shows an evaluation of the analyzed concentrations of the six species against surface observations. For comparison, the evaluations of the simulations without 3D-Var (NODA) are also shown in Fig. 4. The simulations of the NODA experiment (red dots) are scattered on both sides of the central line, as large systematic biases remain across many measurement sites. Conversely, the ICDA experiment (blue dots) showed a much better agreement with the observations than those from NODA. The statistics show that there are large systematic biases in the NODA simulations, with large RMSEs and small CORRs for all species, particularly for CO and PMC. After the assimilation of surface observations, the RMSE of CO decreased to 0.7 mg m⁻³, and the RMSEs of SO₂, NO₂, O₃, PM_2.5, and PMC decreased to 22.0, 12.0, 9.6, 20.5, and 19.6 µg m⁻³, respectively, with respective reductions of 50.0 %, 73.1 %, 61.0 %, 64.7 %, 69.5 %, and 60.8 % compared to those of NODA (Table 5). The CORRs of ICDA increased by 290.0 %, 291.3 %, 55.4 %, 87.2 %, 130.0 %, and 214.8 % to 0.78, 0.90, 0.87, 0.88, 0.92, and 0.85, respectively. These statistics indicate that the ICs of the ground level improved significantly. However, owing to the lack of observations, we still do not know the simulation bias in the upper–middle boundary layer. Although concentrations at high altitudes can be constrained by ground-based observations through vertical correlations, the effect is limited; therefore, the bias remains non-negligible.

https://gmd.copernicus.org/articles/16/5949/2023/gmd-16-5949-2023-f04

Figure 4Scatterplots of simulated versus observed (a) CO, (b) SO₂, (c) NO₂, (d) O₃, (e) PM_2.5, and (f) PMC mass concentration initializations from the background (red) and analysis (blue) fields at 00:00 UTC on 1 December.

Download

Table 5Comparisons of the surface CO, SO₂, NO₂, O₃, PM_2.5, and PMC mass concentrations from the control and assimilation experiment against observations aggregated over all analysis times. CO unit: mg m⁻³; other units: µg m⁻³. BIAS: mean bias; RMSE: root mean square error; CORR: correlation coefficient.

Download Print Version | Download XLSX

4.1.3 Posterior emissions

Owing to the mismatched spatial scales, it is difficult to directly evaluate the optimized emissions against observations. Generally, we indirectly validated the optimized emissions by comparing the forward-simulated concentrations using the posterior emissions against atmospheric measurements (e.g., Jiang et al., 2014; Jin et al., 2018; Peters et al., 2007). Figure 5 shows the spatial distributions of the mean biases between the gaseous pollutants simulated using prior and posterior emissions and assimilated observations. In the CEPs, for each species, the distribution of biases was similar to the increments in background fields constrained through 3D-Var, as shown in Fig. S3. For example, almost all sites had large negative biases for CO, while for SO₂ and NO₂, positive biases were mainly distributed over the North China Plain (NCP), the Yangtze River Delta (YRD), the Sichuan Basin (SCB), and central China, and negative biases were distributed over the remaining areas. After constraining with observations, the biases of all three gaseous air pollutants were significantly reduced at most sites. For CO, the biases at 62 % of the sites decreased to absolute values less than 0.2 mg m⁻³, and for SO₂ and NO₂, the biases at 52 % and 47 % of the sites were within ±4 µg m⁻³. However, large negative biases were still observed in western China, indicating that the uncertainties of the posterior emissions are still large in western China, which may be attributed to the large biases in prior emissions and the relatively limited observations. Overall, the statistics show that there are different levels of improvement at the 311 assimilation sites of 92 %, 85 %, and 85 % for CO, SO₂, and NO₂, respectively. The small number of sites with a worse performance may be related to over-adjusted emissions by EI or contradictory adjustments caused by opposite biases in adjacent areas.

Table 6 lists the statistical results of the evaluations averaged over the whole mainland of China. For CO, the mean bias was −0.8 mg m⁻³ with the prior emissions, while it substantially reduced to −0.1 mg m⁻³ (reduction rate of 89.6 %) when simulating with the posterior emissions. Additionally, the RMSE decreased by 48.1 % from 1.08 to 0.56 mg m⁻³, and the CORR increased by 76.1 % from 0.46 to 0.81. For SO₂ and NO₂, the regional mean biases increased slightly as the positive/negative biases among different sites might be offset. However, the RMSEs decreased to 17.7 and 12.3 µg m⁻³, respectively, which were 58.3 % and 50.8 % lower than those of CEPs, and the CORRs increased by 125.6 % and 35.4 %, both reaching up to 0.88, indicating that EI significantly improved the NO_x and SO₂ emission estimates.

https://gmd.copernicus.org/articles/16/5949/2023/gmd-16-5949-2023-f05

Figure 5Spatial distribution of the BIAS of the simulated (a, b) CO, (c, d) SO₂, and (e, f) NO₂ with prior (a, c, e, CEP) and posterior (b, d, f, VEP) emissions. CO unit: mg m⁻³; SO₂ and NO₂ units: µg m⁻³.

Figure 6 shows the spatial distributions of the mean biases of simulated PM_2.5 and PMC evaluated against assimilated observations. Similarly, the CEP simulations did not perform well. There were widespread underestimations across the country, with mean biases of −24.0 and −32.4 µg m⁻³. After data assimilation, the performance of the VEP simulations significantly improved. The biases decreased by 72.1 % and 90.4 % to −6.7 and −3.1 µg m⁻³; the RMSEs decreased by 41.2 % and 40.7 % to 29.6 and 24.6 µg m⁻³; and the CORRs increased by 35.9 % and 176.0 % to 0.87 and 0.69 for PM_2.5 and PMC, respectively. Overall, 89.6 % and 97.2 % of the assimilation sites were improved for PM_2.5 and PMC, respectively. However, compared with the results for the three gaseous pollutants, there were sites with large biases scattered throughout the entire domain. In addition to the potential over-adjusted or contradictory adjustments of emissions as in the three gas species, the sites with large biases may be related to the complex precursors and complex homogeneous and heterogeneous chemical reactions and transformation processes of secondary PM_2.5 and the fact that we did not simulate the time variation in dust blowing caused by wind speed for PMC owing to the lack of land cover data that are compatible with the CMAQ dust module and agricultural activity data to identify dust source regions.

https://gmd.copernicus.org/articles/16/5949/2023/gmd-16-5949-2023-f06

Figure 6The same as in Fig. 5 but for PM_2.5 and PMC.

Figures 7 and 8 show the spatial distributions of the biases calculated against independent observations for the five species. With posterior emissions, the decreasing ratios of RMSEs ranged from 26.7 %–42.0 %, and the CORRs increased by 13.7 %–59.0 % to 0.62–0.87. Overall, the biases at the independent sites are similar or slightly worse than those at the assimilated sites, which is reasonable as the closer the independent sites are to the assimilated site, the more constraints of observation information can be obtained and the more significant the improvements in the optimized state variables of the model. For example, generally, the transmission distance of NO₂ is relatively short, and remote cities with small emission correlations to the cities with assimilated observations are relatively less constrained, resulting in only a 26.7 % decrease in the RMSE.

https://gmd.copernicus.org/articles/16/5949/2023/gmd-16-5949-2023-f07

Figure 7As in Fig. 5 but for the independent validation.

https://gmd.copernicus.org/articles/16/5949/2023/gmd-16-5949-2023-f08

Figure 8As in Fig. 6 but for the independent validation.

Comparing our results with those of previous studies, Tang et al. (2013) inverted CO emissions over Beijing and the surrounding areas and obtained comparable improvements (Table 6) in the RMSE (37 %–48 % vs. 30 %–51 %) and CORR (both studies ∼ 0.81); however, we decreased the biases by 90 %–97 %, which is much greater than their 48 %–64 % reductions. Additionally, Chen et al. (2019) showed that the RMSE of simulated SO₂ with updated SO₂ emissions decreased by 4.2 %–52.2 % for different regions, and the CORR only increased to 0.69 at most. These improvements are smaller than those obtained in this study, which may be due to the insufficient adjustment of emissions caused by the underestimated ensemble spread through the inflation method. The better performance in this study may be related to our inversion process, which causes the optimized emissions of the current DA window to propagate to the next DA window for further correction.

Table 6Statistics comparing the pollution concentrations from the simulations with prior (CEP) and posterior (VEP) emissions against assimilated and independent observations, respectively. CO unit: mg m⁻³; other units: µg m⁻³. BIAS: mean bias; RMSE: root mean square error; CORR: correlation coefficient.

Download Print Version | Download XLSX

4.1.4 Uncertainty reduction

The uncertainty reduction rate (UR) is an important quantity to evaluate the performance of RAPAS and the effectiveness of in situ observations (Chevallier et al., 2007; Jiang et al., 2021; Takagi et al., 2011). Following Jiang et al. (2021), the UR was calculated as

\begin{matrix} (19) & UR = (1 - \frac{σ_{posterior}}{σ_{prior}}) \times 100, \end{matrix}

where σ_posterior and σ_prior are the posterior and prior uncertainties, respectively, calculated using the standard deviations of the prior and posterior perturbations. Figure 9 shows the URs averaged in each province and mainland China. URs varied with species as they are closely related to the magnitude settings of prior uncertainties (Jiang et al., 2021). The URs of PPM_2.5 and PMC were the most effective, while the UR of NO_x emissions was the lowest. For mainland China overall, uncertainties were reduced by 44.4 %, 45.0 %, 34.3 %, 51.8 %, and 56.1 % for CO, SO₂, NO_x, PPM_2.5, and PMC, respectively. For one species, URs varied across provinces. URs are usually related to observation coverage, which means that the more observation constraints there are, the more URs decrease. Additionally, URs may also be related to emission distributions. Generally, URs were more significant in the provinces where observations and emissions were both relatively concentrated (e.g., Tibet), while they were much lower where the emissions were scattered or relatively uniform, but the observations were only in large cities, even if there were many more observations in other provinces.

https://gmd.copernicus.org/articles/16/5949/2023/gmd-16-5949-2023-f09

Figure 9Time-averaged posterior emission uncertainty reduction (%) indicated by the standard deviation reduction in the total emissions per province calculated by prior and posterior ensembles.

Download

4.1.5 Evaluation using chi-squared statistics

To diagnose the performance of the EnKF analysis, chi-squared (χ²) statistics were calculated, which are generally used to test whether the prior ensemble mean RMSE with respect to the observations is consistent with the prior “total spread” (square root of the sum of ensemble variance and observation error variance). Following Zhang et al. (2015), for the tth window, χ² is defined as

\begin{matrix} (20) & χ_{t}^{2} = (y - H \overline{X^{b}})^{T} ({HP}^{b} H^{T} + R)^{- 1} (y - H \overline{X^{b}}) . \end{matrix}

Figure 10 shows the time series of the relative changes between the prior and posterior emissions and the χ² statistics. There were relatively large adjustments in emissions in the first three windows, especially for the PMC. Subsequently, the five species reached a more optimal state with successive emission inversion cycles. The χ² statistics showed similar variation characteristics to the daily changes in emissions. The χ² value was slightly greater than 1, indicating that the uncertainties from the error covariance statistics did not fully account for the error in the ensemble simulations. A similar result was reported by Chen et al. (2019). Further investigations should be conducted to generate larger spreads by accounting for the influence of model errors. As we imposed the same uncertainty in prior emissions at each DA window to partially compensate for the influence of model errors, χ² statistics showed small fluctuations, indicating that the system updated emissions consistently and stably.

https://gmd.copernicus.org/articles/16/5949/2023/gmd-16-5949-2023-f10

Figure 10Relative changes (a) in posterior emission estimates of CO, SO₂, NO_x, PPM_2.5, and PMC and χ² statistics (b) of these state vectors in each window.

Download

4.1.6 Evaluation using OSSE

Figure 11 shows the spatial distribution of the error reduction in the posterior emissions of the five species. After inversion, in most areas, the emission errors were reduced by more than 80 %, especially in the central and eastern regions with dense observation sites, while in remote areas far away from cities, due to the sparse observation sites, the emission errors were still not well adjusted. Overall, the error reduction rates of CO, SO₂, NO_x, PPM_2.5, and PMC were 78.4 %, 86.1 %, 78.8 %, 77.6 %, and 72.0 %, respectively, indicating that with the in situ observations in China, RAPAS can significantly reduce emission errors and thus showed good performance in emission estimates.

https://gmd.copernicus.org/articles/16/5949/2023/gmd-16-5949-2023-f11

Figure 11Spatial distribution of the error reduction (%) of posterior emissions in the OSSE.

4.2 Inverted emissions

Figure 12 shows the spatial distribution of temporally averaged prior and posterior emissions and their differences in emissions in December 2016. It should be noted that emissions outside China were masked; as the observation sites were limited to China in this study, there was a slight change in the emissions outside China. Higher emissions were mainly concentrated in central and eastern China, especially in the NCP, YRD, and Pearl River Delta, and lower emissions occurred across northwestern and southern China. Compared with the prior emissions, posterior CO emissions were considerably increased across most areas of mainland China, especially in northern China, with an overall increase of 129 %. A notable underestimation of prior emissions was also confirmed by inversion estimations (Feng et al., 2020b; Tang et al., 2013; Wu et al., 2020) and model evaluations (Kong et al., 2020) in previous studies. For SO₂, the emissions increased mainly in northeastern China, Shanxi, Ningxia, Gansu, Fujian, Jiangxi, and Yunnan provinces. In the SCB, central China, the YRD, and part of the NCP, emissions were significantly reduced. The national total SO₂ emissions increased by 20 %. For NO_x, although the increment of the national total emissions was small (approximately 5 %), there were large deviations. The emissions in the NCP and YRD were reduced, whereas the emissions in most cities in other regions increased. The changes in the emission of PPM_2.5 were similar to those of SO₂. Compared with the prior emissions, the posterior PPM_2.5 emissions decreased over central China, the SCB, and the YRD, whereas those in southern and northern China increased, especially in Shanxi, Shaanxi, Gansu, and southern Hebei provinces. Overall, the relative increase was 95 %. For PMC, the posterior emissions were increased over all of mainland China, with a national mean relative increase of 1045 %. Larger emission increments mainly occurred in areas with significant anthropogenic emissions of CO and PPM_2.5, indicating that the large underestimation of PMC emissions in the prior inventory may mainly be attributed to the underestimations of anthropogenic activities. The absence of natural dust is another reason, as the wind-blown dust scheme was not applied in this study. Overall, PM₁₀ emissions (PPM_2.5 + PMC) increased by 318 %. If we assume that all the increments in PM₁₀ emissions are from natural dust, it means the contribution of natural dust accounted for 75 % of total PM₁₀ emissions, which is consistent with the source apportionment of PM₁₀ of 75 % in Changsha in central China (Li et al., 2010). Large PMC emission increments were also reported by Ma et al. (2019).

Detailed estimations of posterior emissions and relative changes compared to prior emissions in each province and mainland China are given in Table S1. The evaluation results for July showed that the emission uncertainty could still be significantly reduced, and the performance of the system in July was comparable to that in December (Table S2). Additionally, the seasonal variation in emissions was well reflected (Figs. S4 and S5), which means that our system performed well at different times of the year. Note that the differences, excluding PMC, between the prior and posterior emissions mainly reflect the deficiencies of the prior emissions as the times of the prior emissions and observations were consistent in this study.

https://gmd.copernicus.org/articles/16/5949/2023/gmd-16-5949-2023-f12

Figure 12Spatial distribution of the time-averaged prior emissions (left column, MEIC 2016), posterior emissions (middle column), and differences (right column, posterior minus prior).

4.3 Sensitivity tests

4.3.1 Impact of prior inventories

Various prior inventories have shown considerable differences in space allocation and emission magnitudes. Inversion results can be sensitive to a priori emissions if the observations are insufficient (Gurney et al., 2004; He et al., 2018). MEIC 2012 was used as an alternative a priori in EMS1 to investigate the impact of different prior emissions on posterior emissions. Figure 13 shows the time series of the relative differences in the daily posterior emissions of the five species between the EMDA (base) and EMS1 experiments. Overall, the differences between the two posterior emissions gradually decreased over time. At the beginning, the differences in the CO, SO₂, NO_x, PPM_2.5, and PMC between the two inventories (i.e., MEIC 2012 vs. MEIC 2016) were 17.5 %, 114.5 %, 30.8 %, 46.0 %, and 72.0 %, respectively, compared to 2.5 %, 4.5 %, 4.5 %, −8.9 %, and 3.0 % in the last 10 d. In addition, the species with larger emission differences at the beginning took a longer time (i.e., more DA steps) to achieve convergence. The quick convergence of PMC emissions was attributed to the large prior uncertainty of 100 % used in the first three DA windows. In contrast to the other species, there were significant negative deviations in PPM_2.5 emissions between the two experiments. This may be due to the positive deviations in the precursors of PM_2.5 (i.e., SO₂ and NO_x), which lead to a larger amount of secondary production. The PPM_2.5 emissions will be reduced to balance the total PM_2.5. We compared the PM_2.5 concentrations simulated by the two optimized inventories and found that they were almost the same (Fig. S6). Overall, this indicates that observations in China were sufficient to infer emissions and that our system was robust. Meanwhile, the monthly posterior emissions shown in Sect. 4.2 were still underestimated to a certain extent.

https://gmd.copernicus.org/articles/16/5949/2023/gmd-16-5949-2023-f13

Figure 13Relative differences in CO, SO₂, NO_x, PPM_2.5, and PMC emissions (%, the ratio of the absolute difference to EMDA) between the EMDA and EMS1 experiments.

Download

4.3.2 Impact of prior uncertainty settings

The uncertainty of prior emissions determines how closely the analysis is weighted towards the background and observations; however, information about prior uncertainties is generally not readily available. To evaluate the possible influence of prior uncertainties on the optimized emissions, we increased and reduced the uncertainties after 3 d of cycling, starting at 00:00 UTC, 3 December, by 25 % and 50 % in EMS2 (−50 %), EMS3 (−25 %), EMS4 (+25 %), and EMS5 (+50 %), respectively. Table 7 summarizes the emission changes with different prior uncertainty settings in the EMS2–5 experiments. To better understand the response of the system to the emission uncertainty settings, Fig. 14 illustrates the time series of SO₂ emission changes, chi-squared statistics, and RMSEs of simulated SO₂ with emissions updated in the EMDA and EMS2–5 experiments over the YRD and NCP (Fig. 2). Compared with EMDA, when the uncertainties decreased (increased), the emissions of the five species decreased (increased) accordingly. This is because the posterior emissions of the five species were larger than the prior emissions, and, as shown in Fig. 14a–d, larger uncertainty will lead to faster convergence, resulting in larger posterior emissions. It can also be seen from Fig. 14 that a faster convergence will reduce the RMSE of the simulated concentration with the posterior emissions in the early stage of the experiment; however, in the later stage of the experiment, there were no significant differences in the RMSE and chi-squared statistics among the different experiments. However, day-to-day changes in emissions also cause slight fluctuations. In addition, when greater uncertainties are set, the day-to-day changes in emissions are more drastic, resulting in a larger RMSE, as shown in the NCP. Moreover, the significant day-to-day variations in the estimated emissions may not be in line with the actual situation. Owing to the spatial–temporal inhomogeneity of emissions, the differences in chi-squared statistics between the YRD and NCP show that it may be necessary to apply different a priori uncertainties according to different regions (Chen et al., 2019). Therefore, when using an EnKF system for emission estimation, error settings must be carefully executed. Overall, the uncertainties chosen in EMDA aim to minimize the deviation of the concentration fields and maintain the stability of the inversion.

Table 7Relative differences in CO, SO₂, NO_x, PPM_2.5 and PMC emissions ( %, the ratio of the absolute difference to EMDA) between the EMDA and EMS2–5 experiments.

Download Print Version | Download XLSX

https://gmd.copernicus.org/articles/16/5949/2023/gmd-16-5949-2023-f14

Figure 14Time series of the SO₂ emission changes, chi-squared statistics, and RMSE of simulated SO₂ with updated SO₂ emissions in the EMDA and EMS2–5 experiments over the YRD and NCP.

Download

4.3.3 Impact of observation error settings

Observation errors are another factor that determine the relative weights of the observations and background in the analysis. A proper estimate of the observation error is important for filter performance; however, observation errors are generally not provided with datasets. The observation error is usually set to a fixed value (Ma et al., 2019), specific proportion of the observation value (Tang et al., 2013), or value calculated by combining measurement error with representative error as used in this study. Generally, the performance of data assimilation is sensitive to the specification of the observation error (Tang et al., 2013). A sensitivity experiment (EMS6) with a doubled observation error was conducted to evaluate the influence of observation error on the optimized emissions. Overall, the spatial distribution of emissions after optimization was almost the same as that of the EMDA experiment but with a lower increment (Fig. S7), resulting in a weaker estimate of the national total emissions for each species. This is because the observation error inflates, and the system becomes more certain of the prior emission and reduces the effect of observation information. Figure 15 shows the time series of simulated and observed daily concentrations and their RMSEs verified against the assimilated sites. The simulations in VEP6 usually performed worse, with larger biases and RMSEs than those of VEP (Figs. S8 and S9), especially in western and southern China, where posterior emissions were significantly underestimated. These results generally corresponded to sluggish emission changes and large chi-squared statistics (Fig. S10), suggesting that an observation error that is too large may substantially impact the estimated emissions.

https://gmd.copernicus.org/articles/16/5949/2023/gmd-16-5949-2023-f15

Figure 15Time series of the daily concentrations (CONC, left) and root mean square error (RMSE, right) obtained from CEP, VEP, and VEP6. The simulations were verified against the assimilated sites.

Download

4.3.4 Impact of the IC optimization of the first window

Several studies indicate large emission discrepancies resulting from IC errors (Jiang et al., 2013a; Miyazaki et al., 2017; Tang et al., 2013), which means that if the IC is not optimized, the errors in concentrations would be compensated for through the adjustment of emissions. To evaluate the impact of IC optimization of the first window on the emission inversions, an EMS7 experiment without the IA step was conducted. Figure 16 shows the time series of the relative differences in the daily posterior emissions of the five species between the EMDA and EMS7 experiments. It can be observed that IC optimization had a significant impact on the emission inversions of long-lived species (i.e., CO). The overall difference in the inverted CO emissions between the two experiments was approximately 5.3 % but can reach 26.1 % in the first few windows. For the short-lived species, IC optimization had little impact on the emissions; for example, the average emission differences of SO₂, NO_x, and PMC in the two experiments were 0.3 %, 0.3 %, and 0.9 %, respectively. For PPM_2.5, the average emission difference is affected not only by primary emissions, but also by the complex chemistry of its precursors. Therefore, the difference between the two experiments fluctuated, with an overall difference of 2 %. Notably, with the gradual disappearance of the benefit of IC assimilation, the two experiments reached a unified state after several windows. For CO, the impact of IA on emission inversion lasted approximately half a month. These results indicate that removing the bias of the IC of the first DA window is essential for the subsequent inverse analysis (Jiang et al., 2017).

https://gmd.copernicus.org/articles/16/5949/2023/gmd-16-5949-2023-f16

Figure 16Relative differences in CO, SO₂, NO_x, PPM_2.5, and PMC emissions (%, the ratio of the absolute difference to EMDA) between the EMDA and EMS7 experiments.

Download

4.4 Discussion

Optimal state estimation using an EnKF relies on the assumption of an unbiased Gaussian prior error, which is not guaranteed in such highly nonlinear and large-bias systems. In this study, some pollutants (e.g., CO, PMC) have very large simulated biases; thus, if a small uncertainty is adopted, the emission bias cannot be fully reduced. If a very large uncertainty is adopted, then the degree of freedom of adjustment is too large, and the inverted daily emissions will fluctuate abnormally. Therefore, we only set a larger prior uncertainty in the first three windows, adopting a moderate uncertainty in the following windows, and used a two-step inversion scheme and cyclic iteration to gradually correct the emission errors. Figure 10a shows the time series of the relative differences between prior and posterior emissions in each window. There were relatively large adjustments for the emissions in the first three windows, especially for PMC, but the adjustment ranges of the five species after the first three windows were within the uncertainty range (e.g., ±25 %), indicating that with this scheme, the EnKF method used in this system performed well in emission inversion.

Model–data mismatch errors are from both the emissions and the inherent model errors arising from the model structure, discretization, parameterizations, and biases in the simulated meteorological fields. Neglecting model errors would attribute all uncertainties to emissions and lead to considerable bias in the estimated emissions. In the version of the CMAQ model used in this study, there are no heterogeneous reactions (Quan et al., 2015; Wang et al., 2017), the parameterization scheme for the formation of secondary organic aerosols (SOAs) is imperfect (Carlton et al., 2008; Jiang et al., 2012a; Yang et al., 2019), no feedback between chemistry and meteorology was considered, and we used an idea profile for chemical lateral boundary conditions. All the above problems can lead to underestimated concentrations of pollutants, which in turn require more emissions to compensate for this, leading to overestimation of emissions. In addition, previous studies have shown that ammonia emissions in the MEIC inventory are underestimated (Kong et al., 2019; Paulot et al., 2014; Zhang et al., 2018). Owing to the lack of ammonia observations, our system does not include emission estimates of ammonia, which means that the concentration of ammonium aerosol was underestimated in this system, also resulting in an overestimation of the PPM_2.5 emission. Wind-blown dust was also not simulated; thus, the PMC emission inverted in this system comes from anthropogenic activities and natural sources. Although some of these shortcomings can be solved by updating the CTM model, there will still be errors in each parameterization and process. In general, a parameter estimation method was used to reduce the model errors, in which some uncertain parameters were included in the augmented state vector and optimized synchronously based on the available observations (Brandhorst et al., 2017; Evensen, 2009). However, it is difficult to identify the key uncertain parameters of different species in different models, which generally comes not only from the complex atmospheric chemical model but also from hundreds of model inputs (Tang et al., 2013). Another method is bias correction, which treats the model error as a bias term and includes it in an augmented state vector (Brandhorst et al., 2017; De Lannoy et al., 2007; Keppenne et al., 2005). In addition, the weak-constraint 4D-Var method can be used to reduce model errors, which adds a correction term in the model integration to account for the different sources of model error (Sasaki, 1970). Although the reliable diagnosis of model error remains a challenge (Laloyaux et al., 2020), it should be considered in an assimilation system. In the future, we will consider model errors in our system to obtain better emission estimates.

Independent variable localization was adopted to avoid potential spurious correlations across different species in this study. However, the transmission scales for different species in different regions differ, and a more accurate localization range can be obtained through backward trajectory analysis. In addition, O₃ observations were not assimilated to improve NO_x and VOC emissions using cross-species information. O₃ concentrations and NO_x (VOC) emissions were positively correlated in the NO_x (VOC)-limited region and negatively correlated in the VOC (NO_x)-limited region (Tang et al., 2011; N. Wang et al., 2019). Hamer et al. (2015) successfully used O₃ observations to estimate NO_x and VOC emissions within the 4D-Var framework within an ideal model. However, the NO_x emissions are often point or line sources, which are all small compared to the model resolution. With a coarse spatial resolution, the model cannot accurately simulate the relationships between O₃ and its precursors. When assimilating O₃ observations to infer NO_x or VOC emissions, the inaccurate relationships simulated by the model would worsen the inversion of NO_x emissions (Inness et al., 2015). In general, improving the model resolution can improve the detailed simulation and provide better prior information on O₃–NO_x–VOC, but it is still difficult to determine whether the condition is NO_x-limited or VOC-limited in the real atmosphere using prior emissions (Liu and Shi, 2021). Elbern et al. (2007) emphasized that assimilating O₃ to correct NO_x or VOC emissions must follow the EKMA framework derived based on observations, otherwise, even if the resolution is improved to sufficiently solve point and line sources, precursor emissions may still be adjusted in an opposite direction. This can be demonstrated in our OSSE experiment at a high resolution of 3 km (Fig. S11). In this study, the spatial resolutions of the prior emission inventory (i.e., MEIC) is 0.25^∘ × 0.25^∘, which is appropriate for modeling at regional scales (Zheng et al., 2017). With this emission inventory, it is unable to accurately simulate the O₃–NO_x–VOC relationships. Therefore, to avoid the impact of inaccurate O₃–NO_x relationships on emission inversion, in our system, we did not assimilate O₃ but directly assimilated NO₂ to optimize the NO_x emissions. This work will be followed by an ongoing study using the available VOC observations.

Although we do not assimilate O₃ observations, model resolution still has some influence on inversion results. In our previous study (Feng et al., 2022), we inferred the NO_x emissions over the YRD in China using NO₂ observations, which has a spatial resolution of 12 km. The study period, assimilated observations, and inversion settings are the same as in this study. We compared the posterior emissions of the YRD between this study and Feng et al. (2022). The results showed that there was a similar spatial distribution of posterior emissions inferred using the two resolutions (36 km vs. 12 km) (Fig. S12), but the total NO_x emission in the YRD inferred using the 36 km resolution was about 8.8 % higher than that inferred using the 12 km resolution. The differences are mainly caused by meteorological differences at different resolutions. This indicates that coarse model resolution may lead to some overestimation of the inverted emissions. In addition, as shown previously, the concentrations after DA were evidently underestimated in western China, indicating that the inverted emissions over these regions still have large uncertainties because of the sparsity of observations, which are spatially insufficient for sampling the inhomogeneity in emissions. Therefore, further investigations with the joint assimilation of multi-source observations (e.g., satellite) are underway.

NO_x is mainly emitted by transportation (Li et al., 2017), which can reflect the level of economic activity to a certain extent. Weekly emission changes were explored to verify the performance of the system in depicting emission changes (Fig. S13). Although the “weekend effect” of emissions in China is not significant (Wang et al., 2014, 2015), the posterior NO_x emission changes are in good agreement with the observations. In our previous studies (Feng et al., 2020a, b), this system was successfully applied to optimize NO_x and CO emissions. The inverted emission changes were also in line with the epidemic control time points. Additionally, the emission changes can reflect the emission migration from developed or urban areas to developing or surrounding areas in recent years, which is consistent with the emission control strategies in China. Although the system did not consider the model error, resulting in a certain difference between the posterior and actual emissions, the spatiotemporal changes in posterior emissions were relatively reasonable and can be used to monitor emission changes and inform emission regulations.

5 Summary and conclusions

In this study, we developed a Regional multi-Air Pollutant Assimilation System (RAPAS v1.0) based on the WRF–CMAQ model, 3D-Var algorithm, and EnKF algorithm. RAPAS can quantitatively optimize gridded emissions of CO, SO₂, NO_x, PPM_2.5, and PMC on a regional scale by simultaneously assimilating hourly in situ measurements of CO, SO₂, NO₂, PM_2.5, and PM₁₀. This system includes two subsystems: the IA subsystem and the EI subsystem, which optimize chemical ICs and infer anthropogenic emissions.

Taking the 2016 MEIC in December as a priori, the emissions of CO, SO₂, NO_x, PPM_2.5, and PMC in December 2016 were inferred by assimilating the corresponding nationwide observations over China. The optimized ICs and posterior emissions were examined against assimilated and independent observations through parallel forward-simulation experiments with and without DA. Sensitivity tests were performed to investigate the impact of different inversion processes, prior emissions, prior uncertainties, and observation errors on emission estimates.

RAPAS showed a good performance in assimilating surface in situ observations, with the calculated emission uncertainties reduced by 44.4 %, 45.0 %, 34.3 %, 51.8 %, and 56.1 % for CO, SO₂, NO_x, PPM_2.5, and PMC, respectively. It can also significantly improve the simulations; the RMSEs of the simulated concentrations with posterior emissions decreased by 40.1 %–56.3 %, and the CORRs increased from 0.26–0.66 to 0.69–0.87 for different species. The OSSE experiment showed that the errors of posterior CO, SO₂, NO_x, PPM_2.5, and PMC could be reduced by 78.4 %, 86.1 %, 78.8 %, 77.6 %, and 72.0 %, respectively. Overall, compared with the prior emissions (MEIC 2016), the posterior emissions increased by 129 %, 20 %, 5 %, and 95 % for CO, SO₂, NO_x, and PPM_2.5, respectively. The posterior PMC emissions, which included anthropogenic and natural dust contributions, increased by 1045 %. Sensitivity tests with different prior inventories showed that the observations in China were sufficient to infer emission and that our system was less dependent on prior inventories. Additionally, sensitivity tests with different prior uncertainties indicated that when the posterior emissions were larger than the prior emissions, the emissions decreased (increased) with decreases (increases) in uncertainties because of the different convergence rates. These results demonstrate the advantage of the two-step method in emission inversion in that the inversion errors of the last window can be transferred to the current window for further optimization, thus enhancing the robustness of emissions estimation using nationwide observations over China with RAPAS. It should be noted that the system usually responds slowly to too small a priori uncertainties or too large observation errors, which may result in large errors in the estimated emissions.

In summary, the comprehensive evaluation and sensitivity tests revealed that RAPAS could serve as a useful tool for accurately quantifying the spatial and temporal changes in multi-species emissions at regional scales and in near-real time, which will be helpful for air pollution control in China and other regions around the world with dense ground observation networks.

Code and data availability

The codes of RAPAS v1.0 are available at https://doi.org/10.5281/zenodo.5566225 (Feng and Jiang, 2021a). The WRF model code is open-source code and can be obtained from the WRF model's user page (https://www2.mmm.ucar.edu/wrf/users, WRF Users Page, 2021). The CMAQ model is available through an open license as well (https://www.epa.gov/cmaq, US EPA, 2021). The observational and emission data used in this study are available at https://doi.org/10.5281/zenodo.4718290 (Feng and Jiang, 2021b).

Supplement

The supplement related to this article is available online at: https://doi.org/10.5194/gmd-16-5949-2023-supplement.

Author contributions

SF, FJ, ZW, and ZJ developed RAPAS v1.0. SF and FJ designed the research. SF performed the model simulations, analyzed data, and prepared the paper with contributions from all co-authors. FJ supervised the model development project and assisted in conceptualization and writing. HW, WH, YS, LZ, YZ, CL, and WJ contributed to the discussion and improvement of the paper.

Competing interests

The contact author has declared that none of the authors has any competing interests.

Disclaimer

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.

Acknowledgements

This work is supported by the National Key R&D Program of China (grant no. 2020YFA0607504), the National Natural Science Foundation of China (grant nos. 42305116 and 41907378), and the Fundamental Research Funds for the Central Universities (grant no. 090414380031). We are grateful to the High-Performance Computing Center (HPCC) of Nanjing University for doing the numerical calculations in this paper on its blade cluster system, and we thank the MEIC team for providing the prior anthropogenic emissions (http://www.meicmodel.org/, last access: 8 August 2018).

Financial support

This research has been supported by the National Key R&D Program of China (grant no. 2020YFA0607504), the National Natural Science Foundation of China (grant nos. 42305116 and 41907378), and the Fundamental Research Funds for the Central Universities (grant no. 090414380031).

Review statement

This paper was edited by Ignacio Pisso and reviewed by two anonymous referees.

References

Appel, K. W., Pouliot, G. A., Simon, H., Sarwar, G., Pye, H. O. T., Napelenok, S. L., Akhtar, F., and Roselle, S. J.: Evaluation of dust and trace metal estimates from the Community Multiscale Air Quality (CMAQ) model version 5.0, Geosci. Model Dev., 6, 883–899, https://doi.org/10.5194/gmd-6-883-2013, 2013.

Alexe, M., Bergamaschi, P., Segers, A., Detmers, R., Butz, A., Hasekamp, O., Guerlet, S., Parker, R., Boesch, H., Frankenberg, C., Scheepmaker, R. A., Dlugokencky, E., Sweeney, C., Wofsy, S. C., and Kort, E. A.: Inverse modelling of CH₄ emissions for 2010–2011 using different satellite retrieval products from GOSAT and SCIAMACHY, Atmos. Chem. Phys., 15, 113–133, https://doi.org/10.5194/acp-15-113-2015, 2015.

Barbu, A. L., Segers, A. J., Schaap, M., Heemink, A. W., and Builtjes, P. J. H.: A multi-component data assimilation experiment directed to sulphur dioxide and sulphate over Europe, Atmos. Environ., 43, 1622–1631, 2009.

Basu, S., Guerlet, S., Butz, A., Houweling, S., Hasekamp, O., Aben, I., Krummel, P., Steele, P., Langenfelds, R., Torn, M., Biraud, S., Stephens, B., Andrews, A., and Worthy, D.: Global CO₂ fluxes estimated from GOSAT retrievals of total column CO₂, Atmos. Chem. Phys., 13, 8695–8717, https://doi.org/10.5194/acp-13-8695-2013, 2013.

Bauwens, M., Compernolle, S., Stavrakou, T., Müller, J.-F., van Gent, J., Eskes, H., Levelt, P. F., van der A, R., Veefkind, J. P., Vlietinck, J., Yu, H., and Zehner, C.: Impact of Coronavirus Outbreak on NO₂ Pollution Assessed Using TROPOMI and OMI Observations, Geophys. Res. Lett., 47, e2020GL087978, https://doi.org/10.1029/2020gl087978, 2020.

Bierman, G. J.: Factorization methods for Discrete Sequential estimation, Academic Press, ISBN 9780120973507, 1977.

Binkowski, F. S. and Roselle, S. J.: Models-3 community multiscale air quality (CMAQ) model aerosol component – 1. Model description, J. Geophys. Res.-Atmos., 108, 4183, https://doi.org/10.1029/2001jd001409, 2003.

Bocquet, M.: Parameter-field estimation for atmospheric dispersion: application to the Chernobyl accident using 4D-Var, Q. J. Roy. Meteor. Soc., 138, 664–681, 2012.

Bocquet, M. and Sakov, P.: Joint state and parameter estimation with an iterative ensemble Kalman smoother, Nonlin. Processes Geophys., 20, 803–818, https://doi.org/10.5194/npg-20-803-2013, 2013.

Bocquet, M., Elbern, H., Eskes, H., Hirtl, M., Žabkar, R., Carmichael, G. R., Flemming, J., Inness, A., Pagowski, M., Pérez Camaño, J. L., Saide, P. E., San Jose, R., Sofiev, M., Vira, J., Baklanov, A., Carnevale, C., Grell, G., and Seigneur, C.: Data assimilation in atmospheric chemistry models: current status and future prospects for coupled chemistry meteorology models, Atmos. Chem. Phys., 15, 5325–5358, https://doi.org/10.5194/acp-15-5325-2015, 2015.

Brandhorst, N., Erdal, D., and Neuweiler, I.: Soil moisture prediction with the ensemble Kalman filter: Handling uncertainty of soil hydraulic parameters, Adv. Water Resour., 110, 360–370, 2017.

Bruhwiler, L. M. P., Michalak, A. M., Peters, W., Baker, D. F., and Tans, P.: An improved Kalman Smoother for atmospheric inversions, Atmos. Chem. Phys., 5, 2691–2702, https://doi.org/10.5194/acp-5-2691-2005, 2005.

Byun, D. and Schere, K. L.: Review of the governing equations, computational algorithms, and other components of the models-3 Community Multiscale Air Quality (CMAQ) modeling system, Appl. Mech. Rev., 59, 51–77, https://doi.org/10.1115/1.2128636, 2006.

Carlton, A. G., Turpin, B. J., Altieri, K. E., Seitzinger, S. P., Mathur, R., Roselle, S. J., and Weber, R. J.: CMAQ Model Performance Enhanced When In-Cloud Secondary Organic Aerosol is Included: Comparisons of Organic Carbon Predictions with Measurements, Environ. Sci. Technol., 42, 8798–8802, 2008.

Chen, D., Liu, Z., Fast, J., and Ban, J.: Simulations of sulfate–nitrate–ammonium (SNA) aerosols during the extreme haze events over northern China in October 2014, Atmos. Chem. Phys., 16, 10707–10724, https://doi.org/10.5194/acp-16-10707-2016, 2016.

Chen, D., Liu, Z., Ban, J., and Chen, M.: The 2015 and 2016 wintertime air pollution in China: SO₂ emission changes derived from a WRF-Chem/EnKF coupled data assimilation system , Atmos. Chem. Phys., 19, 8619–8650, https://doi.org/10.5194/acp-19-8619-2019, 2019.

Chevallier, F., Bréon, F.-M., and Rayner, P. J.: Contribution of the Orbiting Carbon Observatory to the estimation of CO₂ sources and sinks: Theoretical study in a variational data assimilation framework, J. Geophys. Res.-Atmos., 112, D09307, https://doi.org/10.1029/2006JD007375, 2007.

Clements, A. L., Fraser, M. P., Upadhyay, N., Herckes, P., Sundblom, M., Lantz, J., and Solomon, P. A.: Chemical characterization of coarse particulate matter in the Desert Southwest – Pinal County Arizona, USA, Atmos. Pollut. Res., 5, 52–61, https://doi.org/10.5094/apr.2014.007, 2014.

Clements, N., Hannigan, M. P., Miller, S. L., Peel, J. L., and Milford, J. B.: Comparisons of urban and rural PM_10-−2.5 and PM_2.5 mass concentrations and semi-volatile fractions in northeastern Colorado, Atmos. Chem. Phys., 16, 7469–7484, https://doi.org/10.5194/acp-16-7469-2016, 2016.

Daley, R.: Atmospheric Data Assimilation (gtSpecial IssueltData Assimilation in Meteology and Oceanography: Theory and Practice), J. Meteorol. Soc. Jpn. Ser. II, 75, 319–329, 1997.

de Foy, B., Lu, Z., Streets, D. G., Lamsal, L. N., and Duncan, B. N.: Estimates of power plant NO_x emissions and lifetimes from OMI NO₂ satellite retrievals, Atmos. Environ., 116, 1–11, https://doi.org/10.1016/j.atmosenv.2015.05.056, 2015.

De Lannoy, G. J. M., Houser, P. R., Pauwels, V. R. N., and Verhoest, N. E. C.: State and bias estimation for soil moisture profiles by an ensemble Kalman filter: Effect of assimilation depth and frequency, Water Resour. Res., 43, W06401, https://doi.org/10.1029/2006WR005100, 2007.

Derber, J. C.: A Variational Continuous Assimilation Technique, Mon. Weather Rev., 117, 2437–2446, 1989.

Ding, J., van der A, R. J., Mijling, B., Levelt, P. F., and Hao, N.: NO_x emission estimates during the 2014 Youth Olympic Games in Nanjing, Atmos. Chem. Phys., 15, 9399–9412, https://doi.org/10.5194/acp-15-9399-2015, 2015.

Elbern, H., Strunk, A., Schmidt, H., and Talagrand, O.: Emission rate and chemical state estimation by 4-dimensional variational inversion, Atmos. Chem. Phys., 7, 3749–3769, https://doi.org/10.5194/acp-7-3749-2007, 2007.

Evensen, G.: The Ensemble Kalman Filter for Combined State and Parameter Estimation Monte Carlo Techniques for Data Assimilation in Large Systems, IEEE Contr. Syst. Mag., 29, 83–104, https://doi.org/10.1109/mcs.2009.932223, 2009.

Feng, S. and Jiang, F.: The code of Regional multi-Air Pollutant Assimilation System (RAPAS v1.0) for emission estimates (v1.0), Zenodo [code], https://doi.org/10.5281/zenodo.5566225, 2021a.

Feng, S. and Jiang, F.: Anthropogenic air pollutant emissions over China inferred by Regional multi-Air Pollutant Assimilation System (RAPAS v1.0), Zenodo [data set], https://doi.org/10.5281/zenodo.4718290, 2021b.

Feng, S., Jiang, F., Jiang, Z., Wang, H., Cai, Z., and Zhang, L.: Impact of 3DVAR assimilation of surface PM_2.5 observations on PM_2.5 forecasts over China during wintertime, Atmos. Environ., 187, 34–49, https://doi.org/10.1016/j.atmosenv.2018.05.049, 2018.

Feng, S., Jiang, F., Wu, Z., Wang, H., Ju, W., and Wang, H.: CO Emissions Inferred From Surface CO Observations Over China in December 2013 and 2017, J. Geophys. Res.-Atmos., 125, e2019JD031808, https://doi.org/10.1029/2019jd031808, 2020a.

Feng, S., Jiang, F., Wang, H., Wang, H., Ju, W., Shen, Y., Zheng, Y., Wu, Z., and Ding, A.: NO_x Emission Changes Over China During the COVID-19 Epidemic Inferred From Surface NO₂ Observations, Geophys. Res. Lett., 47, e2020GL090080, https://doi.org/10.1029/2020gl090080, 2020b.

Feng, S., Jiang, F., Wang, H., Shen, Y., Zheng, Y., Zhang, L., Lou, C., and Ju, W.: Anthropogenic emissions estimated using surface observations and their impacts on PM_2.5 source apportionment over the Yangtze River Delta, China, Sci. Total Environ., 828, 154522, https://doi.org/10.1016/j.scitotenv.2022.154522, 2022.

Gaspari, G. and Cohn, S. E.: Construction of correlation functions in two and three dimensions, Q. J. Roy. Meteor. Soc., 125, 723–757, https://doi.org/10.1256/smsqj.55416, 1999.

Guenther, A. B., Jiang, X., Heald, C. L., Sakulyanontvittaya, T., Duhl, T., Emmons, L. K., and Wang, X.: The Model of Emissions of Gases and Aerosols from Nature version 2.1 (MEGAN2.1): an extended and updated framework for modeling biogenic emissions, Geosci. Model Dev., 5, 1471–1492, https://doi.org/10.5194/gmd-5-1471-2012, 2012.

Gurney, K. R., Law, R. M., Denning, A. S., Rayner, P. J., Pak, B. C., Baker, D., Bousquet, P., Bruhwiler, L., Chen, Y. H., Ciais, P., Fung, I. Y., Heimann, M., John, J., Maki, T., Maksyutov, S., Peylin, P., Prather, M., and Taguchi, S.: Transcom 3 inversion intercomparison: Model mean results for the estimation of seasonal carbon sources and sinks, Global Biogeochem. Cy., 18, GB1010, https://doi.org/10.1029/2003gb002111, 2004.

He, W., van der Velde, I. R., Andrews, A. E., Sweeney, C., Miller, J., Tans, P., van der Laan-Luijkx, I. T., Nehrkorn, T., Mountain, M., Ju, W., Peters, W., and Chen, H.: CTDAS-Lagrange v1.0: a high-resolution data assimilation system for regional carbon dioxide observations, Geosci. Model Dev., 11, 3515–3536, https://doi.org/10.5194/gmd-11-3515-2018, 2018.

Hinds, W. C.: Aerosol Technology: Properties, Behavior, and Measurement of Airborne Particles, John Wiley, New York, ISBN 9780471087267, 1982.

Houtekamer, P. L. and Mitchell, H. L.: A sequential ensemble Kalman filter for atmospheric data assimilation, Mon. Weather Rev., 129, 123–137, https://doi.org/10.1175/1520-0493(2001)129<0123:asekff>2.0.co;2, 2001.

Houtekamer, P. L. and Zhang, F.: Review of the Ensemble Kalman Filter for Atmospheric Data Assimilation, Mon. Weather Rev., 144, 4489–4532, https://doi.org/10.1175/mwr-d-15-0440.1, 2016.

Inness, A., Blechschmidt, A.-M., Bouarar, I., Chabrillat, S., Crepulja, M., Engelen, R. J., Eskes, H., Flemming, J., Gaudel, A., Hendrick, F., Huijnen, V., Jones, L., Kapsomenakis, J., Katragkou, E., Keppens, A., Langerock, B., de Mazière, M., Melas, D., Parrington, M., Peuch, V. H., Razinger, M., Richter, A., Schultz, M. G., Suttie, M., Thouret, V., Vrekoussis, M., Wagner, A., and Zerefos, C.: Data assimilation of satellite-retrieved ozone, carbon monoxide and nitrogen dioxide with ECMWF's Composition-IFS, Atmos. Chem. Phys., 15, 5275–5303, https://doi.org/10.5194/acp-15-5275-2015, 2015.

Jiang, F., Liu, Q., Huang, X., Wang, T., Zhuang, B., and Xie, M.: Regional modeling of secondary organic aerosol over China using WRF/Chem, J. Aerosol Sci., 43, 57–73, https://doi.org/10.1016/j.jaerosci.2011.09.003, 2012a.

Jiang, F., Zhou, P., Liu, Q., Wang, T., Zhuang, B., and Wang, X.: Modeling tropospheric ozone formation over East China in springtime, J. Atmos. Chem., 69, 303–319, https://doi.org/10.1007/s10874-012-9244-3, 2012b.

Jiang, F., Wang, H. M., Chen, J. M., Machida, T., Zhou, L. X., Ju, W. M., Matsueda, H., and Sawa, Y.: Carbon balance of China constrained by CONTRAIL aircraft CO2 measurements, Atmos. Chem. Phys., 14, 10133–10144, https://doi.org/10.5194/acp-14-10133-2014, 2014.

Jiang, F., Wang, H., Chen, J. M., Ju, W., Tian, X., Feng, S., Li, G., Chen, Z., Zhang, S., Lu, X., Liu, J., Wang, H., Wang, J., He, W., and Wu, M.: Regional CO₂ fluxes from 2010 to 2015 inferred from GOSAT XCO₂ retrievals using a new version of the Global Carbon Assimilation System, Atmos. Chem. Phys., 21, 1963–1985, https://doi.org/10.5194/acp-21-1963-2021, 2021.

Jiang, W., Smyth, S., Giroux, E., Roth, H., and Yin, D.: Differences between CMAQ fine mode particle and PM_2.5 concentrations and their impact on model performance evaluation in the lower Fraser valley, Atmos. Environ., 40, 4973–4985, https://doi.org/10.1016/j.atmosenv.2005.10.069, 2006.

Jiang, Z., Jones, D. B. A., Worden, H. M., Deeter, M. N., Henze, D. K., Worden, J., Bowman, K. W., Brenninkmeijer, C. A. M., and Schuck, T. J.: Impact of model errors in convective transport on CO source estimates inferred from MOPITT CO retrievals, J. Geophys. Res.-Atmos., 118, 2073–2083, 2013a.

Jiang, Z., Liu, Z., Wang, T., Schwartz, C. S., Lin, H.-C., and Jiang, F.: Probing into the impact of 3DVAR assimilation of surface PM₁₀ observations over China using process analysis, J. Geophys. Res.-Atmos., 118, 6738–6749, https://doi.org/10.1002/jgrd.50495, 2013b.

Jiang, Z., Worden, J. R., Worden, H., Deeter, M., Jones, D. B. A., Arellano, A. F., and Henze, D. K.: A 15-year record of CO emissions constrained by MOPITT CO observations, Atmos. Chem. Phys., 17, 4565–4583, https://doi.org/10.5194/acp-17-4565-2017, 2017.

Jin, J., Lin, H. X., Heemink, A., and Segers, A.: Spatially varying parameter estimation for dust emissions using reduced-tangent-linearization 4DVar, Atmos. Environ., 187, 358–373, https://doi.org/10.1016/j.atmosenv.2018.05.060, 2018.

Kahnert, M.: Variational data analysis of aerosol species in a regional CTM: background error covariance constraint and aerosol optical observation operators, Tellus B, 60, 753–770, https://doi.org/10.1111/j.1600-0889.2008.00377.x, 2008.

Kang, J.-S., Kalnay, E., Miyoshi, T., Liu, J., and Fung, I.: Estimation of surface carbon fluxes with an advanced data assimilation methodology, J. Geophys. Res.-Atmos., 117, D24101, https://doi.org/10.1029/2012JD018259, 2012.

Keppenne, C. L., Rienecker, M. M., Kurkowski, N. P., and Adamec, D. A.: Ensemble Kalman filter assimilation of temperature and altimeter data with bias correction and application to seasonal prediction, Nonlin. Processes Geophys., 12, 491–503, https://doi.org/10.5194/npg-12-491-2005, 2005.

Kleist, D. T., Parrish, D. F., Derber, J. C., Treadon, R., Wu, W.-S., and Lord, S.: Introduction of the GSI into the NCEP Global Data Assimilation System, Weather Forecast., 24, 1691–1705, https://doi.org/10.1175/2009waf2222201.1, 2009.

Kong, L., Tang, X., Zhu, J., Wang, Z., Pan, Y., Wu, H., Wu, L., Wu, Q., He, Y., Tian, S., Xie, Y., Liu, Z., Sui, W., Han, L., and Carmichael, G.: Improved Inversion of Monthly Ammonia Emissions in China Based on the Chinese Ammonia Monitoring Network and Ensemble Kalman Filter, Environ. Sci. Technol., 53, 12529–12538, https://doi.org/10.1021/acs.est.9b02701, 2019.

Kong, L., Tang, X., Zhu, J., Wang, Z., Fu, J. S., Wang, X., Itahashi, S., Yamaji, K., Nagashima, T., Lee, H.-J., Kim, C.-H., Lin, C.-Y., Chen, L., Zhang, M., Tao, Z., Li, J., Kajino, M., Liao, H., Wang, Z., Sudo, K., Wang, Y., Pan, Y., Tang, G., Li, M., Wu, Q., Ge, B., and Carmichael, G. R.: Evaluation and uncertainty investigation of the NO₂, CO and NH₃ modeling over China under the framework of MICS-Asia III, Atmos. Chem. Phys., 20, 181–202, https://doi.org/10.5194/acp-20-181-2020, 2020.

Laloyaux, P., Bonavita, M., Chrust, M., and Gürol, S.: Exploring the potential and limitations of weak-constraint 4D-Var, Q. J. Roy. Meteor. Soc., 146, 4067–4082, 2020

Li, J.-D., Deng, Q.-H., Lu, C., and Huang, B.-L.: Chemical compositions and source apportionment of atmospheric PM₁₀ in suburban area of Changsha, China, J. Cent. South Univ. T., 17, 509–515, 2010.

Li, M., Zhang, Q., Kurokawa, J.-I., Woo, J.-H., He, K., Lu, Z., Ohara, T., Song, Y., Streets, D. G., Carmichael, G. R., Cheng, Y., Hong, C., Huo, H., Jiang, X., Kang, S., Liu, F., Su, H., and Zheng, B.: MIX: a mosaic Asian anthropogenic emission inventory under the international collaboration framework of the MICS-Asia and HTAP, Atmos. Chem. Phys., 17, 935–963, https://doi.org/10.5194/acp-17-935-2017, 2017.

Liu, C. and Shi, K.: A review on methodology in O₃-NO_x-VOC sensitivity study, Environ. Pollut., 291, 118249, https://doi.org/10.1016/j.envpol.2021.118249, 2021.

Liu, Y., Kalnay, E., Zeng, N., Asrar, G., Chen, Z., and Jia, B.: Estimating surface carbon fluxes based on a local ensemble transform Kalman filter with a short assimilation window and a long observation window: an observing system simulation experiment test in GEOS-Chem 10.1, Geosci. Model Dev., 12, 2899–2914, https://doi.org/10.5194/gmd-12-2899-2019, 2019.

Liu, Z., Liu, Q., Lin, H.-C., Schwartz, C. S., Lee, Y.-H., and Wang, T.: Three-dimensional variational assimilation of MODIS aerosol optical depth: Implementation and application to a dust storm over East Asia, J. Geophys. Res.-Atmos., 116, D23206, https://doi.org/10.1029/2011jd016159, 2011.

Lorenc, A. C.: Modelling of error covariances by 4D-Var data assimilation, Q. J. Roy. Meteor. Soc., 129, 3167–3182, 2003.

Hamer, P. D., Bowman, K. W., Henze, D. K., Attié, J.-L., and Marécal, V.: The impact of observing characteristics on the ability to predict ozone under varying polluted photochemical regimes, Atmos. Chem. Phys., 15, 10645–10667, https://doi.org/10.5194/acp-15-10645-2015, 2015.

Ma, C., Wang, T., Mizzi, A. P., Anderson, J. L., Zhuang, B., Xie, M., and Wu, R.: Multiconstituent Data Assimilation With WRF-Chem/DART: Potential for Adjusting Anthropogenic Emissions and Improving Air Quality Forecasts Over Eastern China, J. Geophys. Res.-Atmos., 124, 7393–7412, https://doi.org/10.1029/2019jd030421, 2019.

Maybeck, P. S.: Stochastic Models, Estimation and Control Academic Press, ISBN 9780124807013, 1979.

Meirink, J. F., Eskes, H. J., and Goede, A. P. H.: Sensitivity analysis of methane emissions derived from SCIAMACHY observations through inverse modelling, Atmos. Chem. Phys., 6, 1275–1292, https://doi.org/10.5194/acp-6-1275-2006, 2006.

Meirink, J. F., Bergamaschi, P., and Krol, M. C.: Four-dimensional variational data assimilation for inverse modelling of atmospheric methane emissions: method and comparison with synthesis inversion, Atmos. Chem. Phys., 8, 6341–6353, https://doi.org/10.5194/acp-8-6341-2008, 2008.

Miyazaki, K. and Eskes, H.: Constraints on surface NO_x emissions by assimilating satellite observations of multiple species, Geophys. Res. Lett., 40, 4745–4750, https://doi.org/10.1002/grl.50894, 2013.

Miyazaki, K., Eskes, H. J., and Sudo, K.: Global NO_x emission estimates derived from an assimilation of OMI tropospheric NO₂ columns, Atmos. Chem. Phys., 12, 2263–2288, https://doi.org/10.5194/acp-12-2263-2012, 2012a.

Miyazaki, K., Eskes, H. J., Sudo, K., Takigawa, M., van Weele, M., and Boersma, K. F.: Simultaneous assimilation of satellite NO₂, O₃, CO, and HNO₃ data for the analysis of tropospheric chemical composition and emissions, Atmos. Chem. Phys., 12, 9545–9579, https://doi.org/10.5194/acp-12-9545-2012, 2012b.

Miyazaki, K., Eskes, H., Sudo, K., Boersma, K. F., Bowman, K., and Kanaya, Y.: Decadal changes in global surface NO_x emissions from multi-constituent satellite data assimilation, Atmos. Chem. Phys., 17, 807–837, https://doi.org/10.5194/acp-17-807-2017, 2017.

Mizzi, A. P., Edwards, D. P., and Anderson, J. L.: Assimilating compact phase space retrievals (CPSRs): comparison with independent observations (MOZAIC in situ and IASI retrievals) and extension to assimilation of truncated retrieval profiles, Geosci. Model Dev., 11, 3727–3745, https://doi.org/10.5194/gmd-11-3727-2018, 2018.

Monteil, G., Houweling, S., Butz, A., Guerlet, S., Schepers, D., Hasekamp, O., Frankenberg, C., Scheepmaker, R., Aben, I., and Rockmann, T.: Comparison of CH₄ inversions based on 15 months of GOSAT and SCIAMACHY observations, J. Geophys. Res.-Atmos., 118, 11807–11823, 2013.

Müller, J.-F. and Stavrakou, T.: Inversion of CO and NO_x emissions using the adjoint of the IMAGES model, Atmos. Chem. Phys., 5, 1157–1186, https://doi.org/10.5194/acp-5-1157-2005, 2005.

Mueller, S. F. and Mallard, J. W.: Contributions of Natural Emissions to Ozone and PM_2.5 as Simulated by the Community Multiscale Air Quality (CMAQ) Model, Environ. Sci. Technol., 45, 4817–4823, https://doi.org/10.1021/es103645m, 2011.

Nassar, R., Jones, D. B. A., Kulawik, S. S., Worden, J. R., Bowman, K. W., Andres, R. J., Suntharalingam, P., Chen, J. M., Brenninkmeijer, C. A. M., Schuck, T. J., Conway, T. J., and Worthy, D. E.: Inverse modeling of CO₂ sources and sinks using satellite observations of CO₂ from TES and surface flask measurements, Atmos. Chem. Phys., 11, 6029–6047, https://doi.org/10.5194/acp-11-6029-2011, 2011.

Navon, I. M.: Practical and theoretical aspects of adjoint parameter estimation and identifiability in meteorology and oceanography, Dynam. Atmos. Oceans, 27, 55–79, 1998.

Parrish, D. F. and Derber, J. C.: The National Meteorological Center's spectral statistical- interpolation analysis system, Mon. Weather Rev., 120, 1747–1763, https://doi.org/10.1175/1520-0493(1992)120<1747:tnmcss>2.0.co;2, 1992.

Paulot, F., Jacob, D. J., Pinder, R. W., Bash, J. O., Travis, K., and Henze, D. K.: Ammonia emissions in the United States, European Union, and China derived by high-resolution inversion of ammonium wet deposition data: Interpretation with a new agricultural emissions inventory (MASAGE_NH3), J. Geophys. Res.-Atmos., 119, 4343–4364, 2014.

Peng, Z., Liu, Z., Chen, D., and Ban, J.: Improving PM_2.5 forecast over China by the joint adjustment of initial conditions and source emissions with an ensemble Kalman filter, Atmos. Chem. Phys., 17, 4837–4855, https://doi.org/10.5194/acp-17-4837-2017, 2017.

Peng, Z., Lei, L., Liu, Z., Sun, J., Ding, A., Ban, J., Chen, D., Kou, X., and Chu, K.: The impact of multi-species surface chemical observation assimilation on air quality forecasts in China, Atmos. Chem. Phys., 18, 17387–17404, https://doi.org/10.5194/acp-18-17387-2018, 2018.

Peters, W., Jacobson, A. R., Sweeney, C., Andrews, A. E., Conway, T. J., Masarie, K., Miller, J. B., Bruhwiler, L. M. P., Petron, G., Hirsch, A. I., Worthy, D. E. J., van der Werf, G. R., Randerson, J. T., Wennberg, P. O., Krol, M. C., and Tans, P. P.: An atmospheric perspective on North American carbon dioxide exchange: CarbonTracker, P. Natl. Acad. Sci. USA, 104, 18925–18930, https://doi.org/10.1073/pnas.0708986104, 2007.

Peylin, P., Rayner, P. J., Bousquet, P., Carouge, C., Hourdin, F., Heinrich, P., Ciais, P., and AEROCARB contributors: Daily CO₂ flux estimates over Europe from continuous atmospheric measurements: 1, inverse methodology, Atmos. Chem. Phys., 5, 3173–3186, https://doi.org/10.5194/acp-5-3173-2005, 2005.

Purser, R. J., Wu, W. S., Parrish, D. F., and Roberts, N. M.: Numerical aspects of the application of recursive filters to variational statistical analysis. Part I: Spatially homogeneous and isotropic Gaussian covariances, Mon. Weather Rev., 131, 1524–1535, https://doi.org/10.1175//1520-0493(2003)131<1524:naotao>2.0.co;2, 2003.

Quan, J., Liu, Q., Li, X., Gao, Y., Jia, X., Sheng, J., and Liu, Y.: Effect of heterogeneous aqueous reactions on the secondary formation of inorganic aerosols during haze events, Atmos. Environ., 122, 306–312, 2015.

Rabier, F., McNally, A., Andersson, E., Courtier, P., Unden, P., Eyre, J., Hollingsworth, A., and Bouttier, F.: The ECMWF implementation of three-dimensional variational assimilation (3D-Var). II: Structure functions, Q. J. Roy. Meteor. Soc., 124, 1809–1829, https://doi.org/10.1256/smsqj.55002, 1998.

Richardson, H., Basu, S., and Holtslag, A. A. M.: Improving Stable Boundary-Layer Height Estimation Using a Stability-Dependent Critical Bulk Richardson Number, Bound.-Lay. Meteorol., 148, 93–109, 2013.

Sarwar, G., Simon, H., Bhave, P., and Yarwood, G.: Examining the impact of heterogeneous nitryl chloride production on air quality across the United States, Atmos. Chem. Phys., 12, 6455–6473, https://doi.org/10.5194/acp-12-6455-2012, 2012.

Sasaki, Y.: Some Basic Formalisms in Numerical Variational Analysis, Mon. Weather Rev., 98, 875–883, https://doi.org/10.1175/1520-0493(1970)098<0875:SBFINV>2.3.CO;2, 1970.

Schneising, O., Buchwitz, M., Burrows, J. P., Bovensmann, H., Bergamaschi, P., and Peters, W.: Three years of greenhouse gas column-averaged dry air mole fractions retrieved from satellite – Part 2: Methane, Atmos. Chem. Phys., 9, 443–465, https://doi.org/10.5194/acp-9-443-2009, 2009.

Schwartz, C. S., Liu, Z., Lin, H.-C., and Cetola, J. D.: Assimilating aerosol observations with a “hybrid” variational-ensemble data assimilation system, J. Geophys. Res.-Atmos., 119, 4043–4069, https://doi.org/10.1002/2013jd020937, 2014.

Sekiyama, T. T., Tanaka, T. Y., Shimizu, A., and Miyoshi, T.: Data assimilation of CALIPSO aerosol observations, Atmos. Chem. Phys., 10, 39–49, https://doi.org/10.5194/acp-10-39-2010, 2010.

Sharma, S., Chatani, S., Mahtta, R., Goel, A., and Kumar, A.: Sensitivity analysis of ground level ozone in India using WRF-CMAQ models, Atmos. Environ., 131, 29–40, https://doi.org/10.1016/j.atmosenv.2016.01.036, 2016.

Shen, Y., Jiang, F., Feng, S., Zheng, Y., Cai, Z., and Lyu, X.: Impact of weather and emission changes on NO₂ concentrations in China during 2014–2019, Environ. Pollut., 269, 116163, https://doi.org/10.1016/j.envpol.2020.116163, 2021.

Shi, X. and Brasseur, G. P.: The Response in Air Quality to the Reduction of Chinese Economic Activities During the COVID-19 Outbreak, Geophys. Res. Lett., 47, e2020GL088070, https://doi.org/10.1029/2020gl088070, 2020.

Skamarock, W. C. and Klemp, J. B.: A time-split nonhydrostatic atmospheric model for weather research and forecasting applications, J. Comput. Phys., 227, 3465–3485, https://doi.org/10.1016/j.jcp.2007.01.037, 2008.

Stanevich, I., Jones, D. B. A., Strong, K., Keller, M., Henze, D. K., Parker, R. J., Boesch, H., Wunch, D., Notholt, J., Petri, C., Warneke, T., Sussmann, R., Schneider, M., Hase, F., Kivi, R., Deutscher, N. M., Velazco, V. A., Walker, K. A., and Deng, F.: Characterizing model errors in chemical transport modeling of methane: using GOSAT XCH₄ data with weak-constraint four-dimensional variational data assimilation, Atmos. Chem. Phys., 21, 9545–9572, https://doi.org/10.5194/acp-21-9545-2021, 2021.

Stavrakou, T., Müller, J.-F., Boersma, K. F., De Smedt, I., and van der A, R. J.: Assessing the distribution and growth rates of NO_x emission sources by inverting a 10-year record of NO₂ satellite columns, Geophys. Res. Lett., 35, L10801, https://doi.org/10.1029/2008gl033521, 2008.

Sun, A. Y., Morris, A., and Mohanty, S.: Comparison of deterministic ensemble Kalman filters for assimilating hydrogeological data, Adv. Water Resour., 32, 280–292, https://doi.org/10.1016/j.advwatres.2008.11.006, 2009.

Takagi, H., Saeki, T., Oda, T., Saito, M., Valsala, V., Belikov, D., Saito, R., Yoshida, Y., Morino, I., Uchino, O., Andres, R. J., Yokota, T., and Maksyutov, S.: On the Benefit of GOSAT Observations to the Estimation of Regional CO₂ Fluxes, SOLA, 7, 161–164, https://doi.org/10.2151/sola.2011-041, 2011.

Tang, X., Zhu, J., Wang, Z. F., and Gbaguidi, A.: Improvement of ozone forecast over Beijing based on ensemble Kalman filter with simultaneous adjustment of initial conditions and emissions, Atmos. Chem. Phys., 11, 12901–12916, https://doi.org/10.5194/acp-11-12901-2011, 2011.

Tang, X., Zhu, J., Wang, Z. F., Wang, M., Gbaguidi, A., Li, J., Shao, M., Tang, G. Q., and Ji, D. S.: Inversion of CO emissions over Beijing and its surrounding areas with ensemble Kalman filter, Atmos. Environ., 81, 676–686, https://doi.org/10.1016/j.atmosenv.2013.08.051, 2013.

US EPA: CMAQ: The Community Multiscale Air Quality Modeling System, US EPA [code], https://www.epa.gov/cmaq, last access: 25 April 2021.

Wang, C., Lei, L., Tan, Z.-M., and Chu, K.: Adaptive Localization for Tropical Cyclones With Satellite Radiances in an Ensemble Kalman Filter, Front. Earth Sci., 8, 39, https://doi.org/10.3389/feart.2020.00039, 2020.

Wang, H., Jiang, F., Wang, J., Ju, W., and Chen, J. M.: Terrestrial ecosystem carbon flux estimated using GOSAT and OCO-2 XCO₂ retrievals, Atmos. Chem. Phys., 19, 12067–12082, https://doi.org/10.5194/acp-19-12067-2019, 2019.

Wang, N., Lyu, X., Deng, X., Huang, X., Jiang, F., and Ding, A.: Aggravating O3 pollution due to NO_x emission control in eastern China, Sci. Total Environ., 677, 732–744, 2019.

Wang, Y. H., Hu, B., Ji, D. S., Liu, Z. R., Tang, G. Q., Xin, J. Y., Zhang, H. X., Song, T., Wang, L. L., Gao, W. K., Wang, X. K., and Wang, Y. S.: Ozone weekend effects in the Beijing–Tianjin–Hebei metropolitan area, China, Atmos. Chem. Phys., 14, 2419–2429, https://doi.org/10.5194/acp-14-2419-2014, 2014.

Wang, Z., Li, Y., Dong, X., Sun, R., Sun, N., and Pan, L.: Analysis on weekend effect of air pollutants in urban atmosphere of Beijing, Journal of University of Chinese Academy of Sciences, 32, 843–850, 2015.

Wang, Z., Wang, W., Tham, Y. J., Li, Q., Wang, H., Wen, L., Wang, X., and Wang, T.: Fast heterogeneous N₂O₅ uptake and ClNO₂ production in power plant and industrial plumes observed in the nocturnal residual layer over the North China Plain, Atmos. Chem. Phys., 17, 12361–12378, https://doi.org/10.5194/acp-17-12361-2017, 2017.

Wecht, K. J., Jacob, D. J., Sulprizio, M. P., Santoni, G. W., Wofsy, S. C., Parker, R., Bösch, H., and Worden, J.: Spatially resolving methane emissions in California: constraints from the CalNex aircraft campaign and from present (GOSAT, TES) and future (TROPOMI, geostationary) satellite observations, Atmos. Chem. Phys., 14, 8173–8184, https://doi.org/10.5194/acp-14-8173-2014, 2014.

WRF Users Page: WRF Model Users' Page, WRF Users Page [code], https://www2.mmm.ucar.edu/wrf/users, 2021.

Wu, H., Tang, X., Wang, Z., Wu, L., Li, J., Wang, W., Yang, W., and Zhu, J.: High-spatiotemporal-resolution inverse estimation of CO and NO_x emission reductions during emission control periods with a modified ensemble Kalman filter, Atmos. Environ., 236, 117631, https://doi.org/10.1016/j.atmosenv.2020.117631, 2020.

Wu, W. S., Purser, R. J., and Parrish, D. F.: Three-dimensional variational analysis with spatially inhomogeneous covariances, Mon. Weather Rev., 130, 2905–2916, https://doi.org/10.1175/1520-0493(2002)130<2905:tdvaws>2.0.co;2, 2002.

Yang, W., Li, J., Wang, W., Li, J., Ge, M., Sun, Y., Chen, X., Ge, B., Tong, S., Wang, Q., and Wang, Z.: Investigating secondary organic aerosol formation pathways in China during 2014, Atmos. Environ., 213, 133–147, 2019.

Yumimoto, K., Uno, I., Sugimoto, N., Shimizu, A., Liu, Z., and Winker, D. M.: Adjoint inversion modeling of Asian dust emission using lidar observations, Atmos. Chem. Phys., 8, 2869–2884, https://doi.org/10.5194/acp-8-2869-2008, 2008.

Zhang, F., Weng, Y., Sippel, J. A., Meng, Z., and Bishop, C. H.: Cloud-Resolving Hurricane Initialization and Prediction through Assimilation of Doppler Radar Observations with an Ensemble Kalman Filter, Mon. Weather Rev., 137, 2105–2125, https://doi.org/10.1175/2009mwr2645.1, 2009.

Zhang, L., Chen, Y., Zhao, Y., Henze, D. K., Zhu, L., Song, Y., Paulot, F., Liu, X., Pan, Y., Lin, Y., and Huang, B.: Agricultural ammonia emissions in China: reconciling bottom-up and top-down estimates, Atmos. Chem. Phys., 18, 339–355, https://doi.org/10.5194/acp-18-339-2018, 2018.

Zhang, Q., Streets, D. G., Carmichael, G. R., He, K. B., Huo, H., Kannari, A., Klimont, Z., Park, I. S., Reddy, S., Fu, J. S., Chen, D., Duan, L., Lei, Y., Wang, L. T., and Yao, Z. L.: Asian emissions in 2006 for the NASA INTEX-B mission, Atmos. Chem. Phys., 9, 5131–5153, https://doi.org/10.5194/acp-9-5131-2009, 2009.

Zhang, S., Zheng, X., Chen, J. M., Chen, Z., Dan, B., Yi, X., Wang, L., and Wu, G.: A global carbon assimilation system using a modified ensemble Kalman filter, Geosci. Model Dev., 8, 805–816, https://doi.org/10.5194/gmd-8-805-2015, 2015.

Zhang, X., Liu, J., Han, H., Zhang, Y., Jiang, Z., Wang, H., Meng, L., Li, Y. C., and Liu, Y.: Satellite-Observed Variations and Trends in Carbon Monoxide over Asia and Their Sensitivities to Biomass Burning, Remote Sensing, 12, 830, https://doi.org/10.3390/rs12050830, 2020.

Zheng, B., Zhang, Q., Tong, D., Chen, C., Hong, C., Li, M., Geng, G., Lei, Y., Huo, H., and He, K.: Resolution dependence of uncertainties in gridded emission inventories: a case study in Hebei, China, Atmos. Chem. Phys., 17, 921–933, https://doi.org/10.5194/acp-17-921-2017, 2017.

Zheng, B., Tong, D., Li, M., Liu, F., Hong, C., Geng, G., Li, H., Li, X., Peng, L., Qi, J., Yan, L., Zhang, Y., Zhao, H., Zheng, Y., He, K., and Zhang, Q.: Trends in China's anthropogenic emissions since 2010 as the consequence of clean air actions, Atmos. Chem. Phys., 18, 14095–14111, https://doi.org/10.5194/acp-18-14095-2018, 2018.

Articles

Short summary

We document the system development and application of a Regional multi-Air Pollutant Assimilation System (RAPAS v1.0). This system is developed to optimize gridded source emissions of CO, SO₂, NO_x, primary PM_2.5, and coarse PM₁₀ on a regional scale via simultaneously assimilating surface measurements of CO, SO₂, NO₂, PM_2.5, and PM₁₀. A series of sensitivity experiments demonstrates the advantage of the “two-step” inversion strategy and the robustness of the system in estimating the emissions.