the Creative Commons Attribution 4.0 License.
the Creative Commons Attribution 4.0 License.

Development of the CMA-GFS-AERO 4D-Var assimilation system v1.0 – Part 1: System description and preliminary experimental results
Yongzhu Liu
Xiaoye Zhang
Chao Wang
Wenxing Jia
Deying Wang
Zhaorong Zhuang
Xueshun Shen
We developed a strongly coupled aerosol–meteorology four-dimensional variational (4D-Var) assimilation system, CMA-GFS-AERO 4D-Var, for investigating the feedback of aerosol data assimilation on meteorological forecasts. This system was developed on the basis of the framework of the incremental analysis scheme of the China Meteorological Administration Global Forecasting System (CMA-GFS). CMA-GFS-AERO 4D-Var includes three component models: forward, tangent linear, and adjoint models. The CMA-GFS-AERO forward model was constructed by integrating an aerosol module containing the main physical processes of black carbon (BC) aerosol in the atmosphere into the CMA-GFS weather model. The tangent linear model and the adjoint model of the aerosol module were further developed and coupled online with the CMA-GFS tangent linear and adjoint models, respectively. In CMA-GFS-AERO 4D-Var, the BC mass concentration was used as the control variable and minimized together with atmospheric variables. The validation of this system includes the tangent linear approximation, the adjoint correctness test, the single-observation experiment, and the full-observation experiment. The results show that the CMA-GFS-AERO tangent linear model performs well in terms of tangent linear approximation for BC and that adjoint sensitivity agrees well with tangent linear sensitivity. Assimilating BC observations can generate analysis increments not only for BC but also for atmospheric variables, highlighting the capability of CMA-GFS-AERO 4D-Var in exploring the feedback of BC assimilation on atmospheric variables. The computational performance of CMA-GFS-AERO 4D-Var also indicates its potential in operational application. This study focuses on the theoretical architecture and practical implementation of the system; the detailed analysis of a series of cycling assimilation experiments will be described in part 2 of this set of companion papers.
- Article
(7118 KB) - Full-text XML
-
Supplement
(952 KB) - BibTeX
- EndNote
Coupled chemistry–meteorology models (CCMMs) are atmospheric chemistry models that concurrently simulate meteorological processes and chemical transformations (Zhang, 2008; Baklanov et al., 2014; Bocquet et al., 2015). They are more recent compared to chemical transport models (CTMs), which rely on meteorological fields as inputs (Seinfeld and Pandis, 1998). CCMMs account for the feedback mechanism between aerosols and meteorology, specifically the moisture and temperature perturbations resulting from aerosol microphysics and radiative forcing, which, in turn, affect atmospheric dynamics such as convection, circulation, and stability, whereas CTMs lack the capability to incorporate these feedback mechanisms (Guerrette and Henze, 2015).
CCMMs provide the possibility of assimilating both meteorological and chemical data, enabling the production of an optimal initial condition for improving air quality predictions and developing reanalysis of three-dimensional (3D) chemical concentrations over the past decades (Bocquet et al., 2015). One of the first applications of data assimilation with a CCMM was conducted at Météo-France. Semane et al. (2009) used four-dimensional variational (4D-Var) data assimilation to assimilate the vertical profiles of ozone (O3) concentrations obtained from the Microwave Limb Sounder (MLS) aboard the Aura satellite into the ARPEGE/MOCAGE (Action de Recherche Petite Echelle Grande Echelle/Modèle de Chimie Atmosphérique de Grande Echelle) chemistry–meteorology integrated system, and they found that the assimilation of O3 reduces the wind bias in the lower stratosphere. This general approach is also adopted by the European Centre for Medium-Range Weather Forecasts (ECMWF), although without considering the influence of chemical species on meteorological variables (Flemming et al.,2011; Inness et al., 2013). Flemming et al. (2011) utilized the 4D-Var system of the Integrated Forecast System (IFS) coupled with three different O3 chemistry mechanisms, including a linear chemistry, the MOZART3 (Model for Ozone and Related Chemical Tracers, version 3) chemistry, and the TM5 (Transport Model, version 5) chemistry, to assimilate O3 data from four satellite-borne sensors to improve the simulation of the stratospheric O3 hole in 2008. Previous efforts have also explored the application of ensemble-based methods for data assimilation with a CCMM (Pagowski and Grell, 2012; Bocquet et al., 2015). Pagowski and Grell (2012) assimilated surface measurements of fine aerosols using the Weather Research and Forecasting model coupled with Chemistry (WRF-Chem) and the ensemble Kalman filter (EnKF) method. Bocquet et al. (2015) also presented an application of the EnKF to assimilate surface fine-particulate-matter observations and meteorological observations with the WRF-Chem model over the eastern part of North America. Results demonstrated a large positive impact of aerosol data assimilation on aerosol concentrations, while the effect of meteorological observation assimilation on aerosol concentration was rather minor. All the preceding studies have laid good foundations for data assimilation with CCMMs. However, since CCMMs are fairly recent, the development and applications of data assimilation in CCMMs are still limited. Further research and more attention are required, especially in terms of the potential feedback of chemical data assimilation on meteorological forecasts.
Additionally, EnKF estimates background error covariance through ensemble forecasts, which rely on a limited number of ensemble members (Zhu et al., 2022). In high-dimensional problems, the limited number of samples may not be able to fully capture all the error characteristics, resulting in inaccuracies in the estimation of background error covariance. Although ensemble Kalman smoothers (EnKSs) extend the EnKF framework by incorporating an assimilation window to leverage temporal observational information, they remain constrained by similar limitations in ensemble size. In contrast, 4D-Var explicitly integrates both the complete observational dataset and the full model dynamics within the assimilation window to constrain state evolution, rather than relying solely on ensemble statistics. This generally allows 4D-Var to achieve higher accuracy in high-dimensional problems by making better use of both observational data and model constraints, leading to more precise state estimation. While the flow dependence of the background error covariance is implicitly realized within the assimilation window in 4D-Var, modeling the cross-variable component of the covariance presents a significant challenge in data assimilation for CCMMs. Furthermore, the tangent linear model (TLM) and the adjoint model (ADM) are essential components of 4D-Var, but their development is often fraught with difficulties.
Significant efforts have been made in the field of atmospheric chemistry adjoint modeling. Elbern and Schmidt (1999) first constructed the ADM of a 3D CTM, EUARD (The University of Cologne European Air Pollution Dispersion Chemistry Transport Model). Inspired by this work, various ADMs of CTMs have been successively developed, including mainly CHIMERE (Menut et al., 2000; Vautard et al., 2000; Schmidt and Martin, 2003), IMAGES (Intermediate Model of Global Evolution of Species; Müller and Stavrakou, 2005), STEM-III (Sulfur Transport Eulerian Model; Sandu et al., 2005), CAMx (Comprehensive Air Quality Model with Extensions model; Liu et al., 2007), CMAQ (Community Multiscale Air Quality model; Hakami et al., 2007), and GEOS-Chem (Henze et al., 2007). An et al. (2016) and Wang et al. (2022) constructed the ADM of GRAPES-CUACE (Global/Regional Assimilation and PrEdiction System coupled with CMA Unified Atmospheric Chemistry Environmental Forecasting System), an independently developed CCMM in China (Wang et al., 2010, 2018). ADMs of these widely used CTMs play an important role in inverse modeling and chemical data assimilation (Menut et al., 2000; Müller and Stavrakou, 2005; Sandu et al., 2005; Hakami et al., 2007; Henze et al., 2009). However, these CTMs do not take into account the relationships between chemical species and meteorological variables, resulting in certain uncertainties in adjoint sensitivity, which in turn affect the effectiveness of 4D-Var. Although GRAPES-CUACE is a CCMM, its ADM includes only the adjoint model of the chemical model and not the adjoint model of the meteorological model, leading to uncertainties in the sensitivity calculation as well.
Black carbon (BC) aerosol, a major component of the fine particulate matter (PM2.5) defined by an aerodynamic diameter of 2.5 µm or less, primarily originates from the incomplete combustion of biomass and fossil fuels (Kuhlbusch, 1998). As an important atmospheric pollutant, BC is porous and adsorbs other solid and gaseous pollutants (e.g., SO2, O3), and it provides catalytic conditions for them, which plays an important role in photochemical and heterogeneous reactions and gas–particle conversion processes (Koch, 2001). BC is also the main optically absorbing component of atmospheric aerosols, effectively absorbing solar radiation in the visible to infrared wavelength range, thus affecting the temperature field throughout the atmosphere, including the surface temperature. The climatic effects of BC have been widely reported, but the extent to which BC affects weather forecasting requires further investigation (Chung and Seinfeld, 2002; Menon et al., 2002; Bond et al., 2013).
To deeply investigate the feedbacks of aerosol data assimilation on meteorological forecasts, we utilized BC as a starting point to develop the strongly coupled aerosol–meteorology 4D-Var system. Firstly, we constructed a coupled aerosol–meteorology system, named CMA-GFS-AERO, by integrating an aerosol module (AERO-BC) containing main aerosol physical processes of BC in the atmosphere into the operational version of the weather model CMA-GFS V4.0 (Shen et al., 2023), which was developed by the China Meteorological Administration (CMA; GFS stands for Global Forecasting System). Then, the tangent linear model and the adjoint model of the AERO-BC module were constructed and coupled online with the TLM and ADM of CMA-GFS (Liu et al., 2017, 2023; Zhang et al., 2019), respectively. Thus, CMA-GFS-AERO ADM includes not only the adjoint model of physical processes of BC, but also the adjoint model of the meteorological model. Moreover, the BC adjoint variables and the meteorological adjoint variables mutually influence each other throughout the adjoint integration process, leading to a notable enhancement in the precision of the adjoint sensitivity of the aerosol and meteorology state. Based on the CMA-GFS-AERO forward model and its TLM and ADM, we further constructed CMA-GFS-AERO 4D-Var by adding BC as a control variable into the incremental analysis scheme of CMA-GFS 4D-Var. The rationality and capability of CMA-GFS-AERO 4D-Var in capturing the feedbacks of aerosol data assimilation on meteorological analysis were verified using the single-observation experiment and the full-observation experiment. The following text is divided into five sections. Section 2 introduces the methods, Sect. 3 describes the development of CMA-GFS-AERO 4D-Var, Sect. 4 provides the model setup, Sect. 5 presents the results, and the conclusions are found in Sect. 6.
2.1 Model description
2.1.1 CMA-GFS
The China Meteorological Administration Global Forecasting System (CMA-GFS, formerly known as GRAPES-GFS) is an operational global numerical weather model independently developed by the CMA (Chen and Shen, 2006; Chen et al., 2008; Shen et al., 2023). For this work, we used CMA-GFS version 4.0 (CMA-GFS v4.0). The dynamic core of CMA-GFS utilizes the fully compressible non-hydrostatical equations formulated on spherical coordinates with latitude and longitude and adopts the height-based, terrain-following coordinate, which is shown in Fig. S1 in the Supplement (Yang et al., 2007). The model employs semi-implicit and semi-Lagrangian schemes in two-level time integration (Yang et al., 2007). The spatial differential adopts the Arakawa-C grid in the horizontal and Charney–Phillips variable staggering in the vertical. The large-scale transport processes utilize a hybrid piecewise rational method (PRM) and quasi-monotone semi-Lagrangian (QMSL) scheme (Su et al., 2013). The physical parameterization schemes used in this work are consistent with those adopted in the operational application of CMA-GFS v4.0 and have been proven to perform well in global numerical weather prediction. The selected schemes mainly include the Simplified Arakawa–Schubert (SAS) cumulus convection scheme (Arakawa and Schubert, 1974; Liu et al., 2015), the double-moment cloud microphysics scheme (Liu et al., 2003a, b; Li et al., 2024), the Rapid Radiative Transfer Model for the GCM (RRTMG) longwave and shortwave radiation schemes (Mlawer et al., 1997; Morcrette et al., 2008), the Common Land Model (CoLM) land surface scheme (Dai et al., 2003), and the New Medium Range Forecast (NMRF) boundary layer scheme (Hong and Pan, 1996; Han and Pan, 2011). The state variables of the CMA-GFS nonlinear model (NLM) include non-dimensional pressure (Π), potential temperature (θ), the east–west component of horizontal wind (u), the north–south component of horizontal wind (v), the vertical component of wind (), and the specific humidity (q).
2.1.2 CUACE
CUACE (CMA Unified Atmospheric Chemistry Environmental Forecasting System) is an air quality model developed by the Chinese Academy of Meteorological Sciences to study both air quality forecasting and climate change (Gong and Zhang, 2008; Wang et al., 2010; Zhou et al., 2012). CUACE mainly includes three modules: the aerosol module, the gaseous chemistry module, and the thermodynamic equilibrium module. CUACE adopts CAM (Canadian Aerosol Module; Gong et al., 2003), which employs the size-segregated multicomponent aerosol algorithm as its aerosol module. CAM involves six types of aerosols – BC, sulfate (SF), nitrate (NI), sea salt (SS), organic carbon (OC), and soil dust (SD) – and each of them utilizes the sectional representation method (Gelbard et al., 1980; Meng et al., 1998; Gong et al., 2003), in which the aerosol size distribution is generally approximated by a set of contiguous, nonoverlapping and discrete size bins, to represent particle size distributions. The core of CAM is the major aerosol processes in the atmosphere, including hygroscopic growth, coagulation, nucleation, condensation, dry deposition/sedimentation, and below-cloud scavenging.
2.2 Incremental 4D-Var
The CMA-GFS 4D-Var data assimilation system has been in operation at CMA since 1 July 2018 (Zhang et al., 2019). CMA-GFS 4D-Var applies the incremental analysis scheme proposed by Courtier et al. (1994). The cost function is defined as
where represents the analysis increment of the model variables, with xa being the analysis field and xb the background state; is the observation innovation at time i, with yi being the observation at time i; Hi represents the observation operator at time i; M0→i denotes the nonlinear model integration from the analysis time to time i; Hi is the linear operator corresponding to Hi; M0→i is the linear model corresponding to M0→i; B represents the error covariance matrix of xb; Ri denotes the observation error covariance matrix at time i; and Jc is the weak constraint term on the basis of the digital filter. Jc is not relevant to the current work, so the formula described below omits the Jc term from the cost function for the sake of simplicity.
After the physical and preconditioning transformations of the control variables, the cost function can be expressed as (Courtier et al., 1994; Lorenc et al., 2000; Zhang et al., 2019)
where w denotes the control variables after the physical and preconditioning transformations, and the analysis increment is expressed as δx=Uw, with U (UUT=B) being the square root matrix of the background error covariance matrix.
The gradient of the cost function J(w) with respect to the control variable w is
where is the adjoint operator of Hi and is the adjoint model of M0→i, which denotes the backward integration of the ADM from the time i to the analysis time.
Currently, the CMA-GFS 4D-Var system adopts a 6 h cycle and is performed four times a day, with assimilation windows of 03:00–09:00, 09:00–15:00, 15:00–21:00, and 21:00–03:00 UTC. The assimilation process is divided into two parts: the outer loop and the inner loop. In the outer loop, the CMA-GFS NLM (M0→i) is integrated at high resolution for 6 h to obtain the trajectory, which is a collection of stored values of all model state variables at all time steps within the assimilation window. The observation innovation di is calculated in the outer loop as well. In the inner loop, the CMA-GFS TLM and ADM are integrated at low resolution to calculate the cost function (J(w)) and its gradient (∇wJ). The gradient is further provided to the Lanczos-CG algorithm (Lanczos, 1950; Liu et al., 2018) to perform the minimization, obtaining the optimal analysis increments of control variables.
The computational cost is an important factor to be considered when developing a coupled aerosol–meteorology 4D-Var system with potential for operational application (Flemming et al., 2015). The CUACE model is computationally expensive since it includes more than 100 chemical variables for aerosols and gases, as well as hundreds of gas-phase chemical reactions. It is difficult to construct a coupled aerosol–meteorology 4D-Var system directly based on the CUACE model. On the other hand, BC has an important impact on the climate and can be used to study the two-way feedback interactions between aerosol and meteorology (Chung and Seinfeld, 2002; Menon et al., 2002; Bond et al., 2013). Therefore, we utilized BC as a starting point to construct the strongly coupled aerosol–meteorology 4D-Var system (CMA-GFS-AERO 4D-Var).
Creating CMA-GFS-AERO 4D-Var required three important components: (1) the CMA-GFS-AERO forward model, (2) the CMA-GFS-AERO TLM and ADM, and (3) the 4D-Var framework. This section provides a detailed description of the construction of the CMA-GFS-AERO 4D-Var from these three aspects.
3.1 CMA-GFS-AERO forward model
In this work, for the sake of interest in BC and the consideration of computational efficiency, we developed the CMA-GFS-AERO forward model by integrating the aerosol module AERO-BC into CMA-GFS v4.0. The AERO-BC module was created by extracting BC-related codes from the CUACE model, with its functionality aligning with the BC aerosol processes in the CAM module of CUACE. In other words, the physical processes for BC in AERO-BC are identical to those in the CAM module, with no changes made. The main differences lie in the engineering aspect: (1) while the CAM module was originally written in Fortran 77, the AERO-BC code has been rewritten in Fortran 90; (2) since CAM in CUACE deals with six types of aerosols, the code structure is somewhat complex and redundant, whereas AERO-BC focuses solely on BC, resulting in a simpler and more streamlined structure. These updates improve code readability and enhance computational efficiency, without affecting the underlying physical processes.
In AERO-BC, BC is represented by six bins with particle diameters of 0.01–0.04, 0.04–0.16, 0.16–0.64, 0.64–2.56, 2.56–10.24, and 10.24–40.96 µm, where the radius range is calculated by the geometric progression method to satisfy , with VRAT being the average volume ratio between adjacent bins (Jacobson et al., 1994). Thus, six new prognostic variables for the mass mixing ratio of BC, denoted as (units: kg kg−1), where , are added in the dynamical framework of CMA-GFS.
The main processes in AERO-BC include (1) calculating the emission flux of BC through the surface flux calculation module, (2) calculating the vertical diffusion trend of BC by solving the vertical diffusion equation, and (3) simulating key BC aerosol processes in the atmosphere, including hygroscopic growth, coagulation, nucleation, condensation, dry deposition/sedimentation, and below-cloud scavenging. For more details, please refer to the relevant literature on the CAM module (Gong et al., 2003; Gong and Zhang, 2008; Wang et al., 2010; Zhou et al., 2012). The AERO-BC module is designed as a one-dimensional (1D) column module, which operates at individual vertical columns corresponding to fixed horizontal locations (i.e., fixed latitude and longitude). In the integration of AERO-BC with CMA-GFS, the interface programs transfer meteorological parameters (e.g., temperature, wind, and humidity) from CMA-GFS to AERO-BC, extend the spatial dimension from 1D to 3D, and read emissions for AERO-BC. The transport processes for are the same as those for the variables associated with the different water species in CMA-GFS, using the hybrid PRM and QMSL schemes (Su et al., 2013).
Besides, according to the vertical distribution of BC in the MERRA-2 (Modern-Era Retrospective analysis for Research and Applications, Version 2) reanalysis data (GMAO, 2015), we observed that the BC mass mixing ratio decreases rapidly after entering the stratosphere, reaching values of about 10−12 kg kg−1. This is 2–3 orders of magnitude smaller compared to the surface. To improve computational efficiency and to balance memory usage with the effectiveness of BC forecasting, we set the height of in the CMA-GFS-AERO model to 65 levels (approximately 30 hPa), which corresponds to the middle layer of the stratosphere. Regarding the absence of BC above model level 65, we handled vertical transport by assuming that any BC concentrations above this level are negligible. This approximation does not significantly affect the model's performance, as the BC mass mixing ratio is very small in the upper layers. Correspondingly, in the adjoint code, BC concentrations above model level 65 are also treated as negligible, and this does not significantly affect the adjoint calculations.
3.2 CMA-GFS-AERO TLM and ADM
In developing the TLM and ADM of the CMA-GFS-AERO model, we firstly constructed the tangent linear and adjoint codes of the AERO-BC module, subsequently coupling them with the TLM and ADM, respectively, of the CMA-GFS model (Liu et al., 2017, 2023; Zhang et al., 2019). Since adjoint codes generated by automatic differentiation tools often suffer from issues such as poor readability and maintainability, low efficiency, and even errors due to the complexity of numerical models (Zou et al., 1997), the tangent linear and adjoint codes in this study were written line by line manually, without using any automatic differentiation tool.
The AERO-BC can be symbolically written as
where F denotes the AERO-BC model operator and c and yAERO are vectors representing the input and output variables of the AERO-BC, respectively.
The TLM of the AERO-BC can be obtained by linearizing F, expressed as
where F is the TLM operator and δc and δyAERO represent perturbations of input and output variables of the AERO-BC, respectively.
The ADM of AERO-BC is essentially the transpose of the AERO-BC TLM expressed as
where FT is the adjoint operator of F and and δc* represent input and output variables of the ADM of AERO-BC, respectively.
In constructing the TLM and the ADM of AERO-BC, no simplifications were made to the AERO-BC processes. Specifically, no regularization was applied to the nonlinear equations, nor were any complex processes, which were difficult to linearize, omitted. As a result, the TLM and the ADM of AERO-BC fully include all processes related to emission flux, vertical diffusion, and aerosol physical processes as described in Sect. 3.1.
The TLM and the ADM of AERO-BC are 1D column modules, meaning that they operate independently at each fixed horizontal grid point (i.e., fixed latitude and longitude), with vertical variation only. To extend them to 3D, the tangent linear model and the adjoint model of the interface programs were also constructed. Furthermore, the tangent linear model and the adjoint model of BC transport processes follow the same framework as those for the variables associated with the different water species in the CMA-GFS TLM and ADM, utilizing the tangent linear model and the adjoint model of QMSL. In this way, the 3D parameters could be transferred from CMA-GFS to AERO-BC. Thus, we obtained the CMA-GFS-AERO TLM and ADM.
3.3 CMA-GFS-AERO 4D-Var
On the basis of the CMA-GFS-AERO forward model and its TLM and ADM, we further constructed the CMA-GFS-AERO 4D-Var by adding BC as a control variable into the incremental analysis scheme introduced in Sect. 2.2. We also provided a detailed introduction to the BC observation and errors, the BC observation operator, and the background error covariance for BC.
3.3.1 BC mass concentration as a control variable
The establishment of a strongly coupled aerosol–meteorology 4D-Var system based on CMA-GFS 4D-Var requires the addition of aerosol analysis. Although the six variables for the mass mixing ratio of BC () have been used in the CMA-GFS-AERO forward model, they can constitute a heavy burden for the analysis if they are all included in the control vector. The reasons for this, as mentioned by Benedetti et al. (2009), mainly include the following: (1) background error statistics would have to be generated for all variables separately, (2) the control vector would be significantly larger in size, which would consequently increase the cost of the iterative minimization, and most importantly, (3) the BC analysis would be under-constrained since the surface observations of BC are mass concentrations (units: µg m−3), which do not distinguish between size bins, resulting in one observation of BC mass concentration being used to constrain six BC variables. To address these issues, the BC mass concentration is selected as the control variable, denoted as Cbc (units: µg m−3), and is added to the control vector (, where ψ is the stream function, χu is the unbalanced velocity potential, πu is the unbalanced Exner pressure, and q is the specific humidity) of CMA-GFS 4D-Var. Thus, the control vector for the CMA-GFS-AERO 4D-Var is , assuming that these five variables are independent of each other.
The conversion relationship between Cbc and is
where ρ is the atmospheric density. In order to obtain the BC initial field that can be used in the CMA-GFS-AERO model from the analysis field, it is also necessary to convert Cbc into . Firstly, the distribution weights (ωn) of each size bin of in the background field are calculated based on the entire three-dimensional domain, following the equation , where N represents the number of three-dimensional grid points. Secondly, the analysis increment of () is calculated based on the analysis increment of Cbc (δCbc), following the equation
Finally, is interpolated and superimposed on in the background field to obtain the initial field of BC.
Similarly, in the minimization process of the inner loop of CMA-GFS-AERO 4D-Var, the conversion between the tangent linear variable of BC () and the analysis increment of Cbc (δCbc) is also calculated according to the derivative of Eq. (7) () and Eq. (8).
3.3.2 BC observation and errors
The BC observations used in the CMA-GFS-AERO 4D-Var system are the BC surface concentrations obtained from the China Atmospheric Monitoring Network (CAWNET), which was established by the CMA and has been monitoring the BC surface mass concentration in China since 2006 (Xu et al., 2020). The BC observation data were collected from 32 stations (Guo et al., 2020), including 11 urban, 17 rural, and 4 remote stations. The distribution of these stations is shown in Fig. S2. The monitoring of BC in CAWNET was conducted using an Aethalometer, AE31, which is one of the models produced by Magee Scientific (USA, https://www.aerosolmageesci.com, last access: 30 July 2025). The AE31 determines mass concentration of BC particles collected from air samples flowing through a quartz filter. The instrument measures the transmission through the filter over a wide spectrum of wavelengths from 370 to 950 nm. Light at the selected wavelength is transmitted through control and sample filters, and the attenuation change in the filter is then translated into the BC mass concentration. In this study, we used the BC concentration measured at the recommended wavelength of 880 nm. The AE31 measures BC concentrations every 5 min. We performed quality control on the original data and obtained the hourly average values, which were used in the BC assimilation experiments. The quality control procedures are as follows:
-
Eliminating abnormal values. During the calculation of hourly averages from the 5 min sampled data, any BC concentration values that differ significantly from the hourly average (i.e., those where the absolute difference exceeds 3 times the standard deviation) are considered abnormal and discarded. Additionally, any bad data flagged by the instrument's monitoring system are also removed.
-
Filling in missing values. If more than one-third of the data for a given hour are missing or if there are more than three consecutive missing values, the entire hour's data are discarded. For other cases, linear interpolation is applied to fill in the missing values.
The observation error covariance matrix R in Eq. (1) contains both measurement and representativeness errors. Following the formula described by Chen et al. (2019), which is an improvement on the method proposed by Pagowski et al. (2010) and Schwartz et al. (2012), we calculated the measurement error ε0. The formula is expressed as
where Obc denotes the observed BC concentrations (units: µg m−3).
Representativeness errors reflect the inaccuracies in the observation operator and in the interpolation from the model grid to the observation location. We used the representativeness error (εr) expression defined by Elbern et al. (2007) as follows:
where γ is an adjustable parameter scaling ε0 (γ=0.5 was used here), Δx is the grid spacing (100 km in this work), and L is the radius of influence of a BC observation. According to Elbern et al. (2007), L was set to 2, 10, and 20 km for urban, rural, and remote stations, respectively. The total BC observation error (εbc) was defined as
which constituted the diagonal elements in the R matrix.
3.3.3 BC observation operator
The observation operator in the CMA-GFS-AERO 4D-Var system performs two basic tasks: (1) transforming model state variables into observed physical quantities and (2) interpolating the background field (or analysis field) to the location of the observation. The transformation of the physical quantities is related to the type of observations, and the spatial interpolation operator consists of both horizontal and vertical interpolation. Since the CMA-GFS-AERO 4D-Var system adopts the Charney–Phillips staggered grid in the vertical direction and the Arakawa-C grid in the horizontal direction, the observation operator must account for the staggered locations of different physical variables. To minimize errors introduced by variable transformations and spatial interpolation, appropriate handling of horizontal staggering and vertical layer transitions is required. The steps to construct the BC observation operator are as follows:
-
Based on Eq. (7), the BC mass mixing ratios () of six size bins are summed and converted into the mass concentrations (Cbc), which are further interpolated to the observation locations by the horizontal bilinear interpolation to obtain the equivalent BC concentrations that are consistent with the units of the observations.
-
According to the heights of BC surface observations, the corresponding vertical interpolation schemes are selected to obtain the equivalent BC observations. If the height of BC surface observation is greater than the height of the first model layer, the cubic spline interpolation is used to process the BC concentration interpolation. If the observation height is less than the height of the first model layer and the difference between the two heights is less than 300 m, the BC concentration at the first model layer is regarded as the equivalent BC observation. However, if the difference between the two heights is greater than or equal to 300 m, the data from that site are discarded.
3.3.4 Background error covariance for BC
The variable fields involved in variational assimilation are all three-dimensional, and it is challenging to directly deal with the correlations of these three-dimensional fields due to their high dimensionality. Therefore, in the CMA-GFS 4D-Var assimilation system, a simplification is made by assuming that the correlation coefficient can be expressed as the product of the vertical correlation coefficient and the horizontal correlation coefficient (Zhang et al., 2019). The horizontal correlation is calculated using the spectral filtering method, while the vertical correlation is calculated through empirical orthogonal function (EOF) decomposition (Zhang et al., 2019).
In the CMA-GFS-AERO 4D-Var system, the background error covariance for the control variable BC adopts a modeled structure. The background error variance varies with height, as shown in Fig. 1a. The vertical correlation model of the background error is derived through a combination of theoretical considerations (Bergman, 1979) and experimental tuning, with particular reference to the methodology used for humidity in the CMA-GFS 4D-Var system. It is expressed as
where zi and zj are the model terrain heights of level i and j, respectively. , where g denotes the gravitational acceleration, Rd represents the gas constant for dry atmospheric air, T0 is the standard temperature (273.15 K), and kp is the constant coefficient (Bergman, 1979). Following the value of kp used for the control variable of humidity in the CMA-GFS 4D-Var system, we set kp to 10 for the control variable BC. Figure 1b depicts the distribution of the vertical correlation coefficients of the background error of the 1st, 10th, and 20th layers with other layers.

Figure 1(a) Vertical profile of background error variance for BC; (b) vertical correlation coefficients of background error between the 1st, 10th, and 20th layers with other layers for BC.
The horizontal correlation of the background error for the control variable BC is calculated by the second-order autoregressive (SOAR) correlation function, which is commonly used in operational data assimilation systems (Ballard et al., 2016), expressed as
where di,j is the arc length of the great circle between two points i and j and L is the characteristic horizontal length scale. The length scale for the control variable BC varies with height in the model, following the way the length scale of the humidity variable varies with height in the CMA-GFS 4D-Var system, which is shown in Table 1.
3.3.5 Flow-dependent background error covariance in CMA-GFS-AERO 4D-Var
In the strongly coupled aerosol–meteorology assimilation system, interactions between the atmospheric variables and BC allow BC observations to influence the analysis increments of atmospheric variables and vice versa. The incremental 4D-Var algorithm implicitly evolves the background error covariances (B) throughout the assimilation window according to the TLM dynamics. This process modifies prior background error variance estimates and induces non-zero correlations between model variables (Smith et al., 2015). By utilizing the fully coupled TLM and ADM in the inner loops of the strongly coupled assimilation system, cross-covariance information between BC and atmospheric variables is generated. This enables observations of one variable to produce analysis increments in the other, leading to more consistent analyses.
Specifically, if the BC observation is assumed to take place at the beginning of the assimilation window only and under the assumption of a single, non-cycling assimilation window, the 4D-Var assimilation behaves similarly to the 3D-Var assimilation. In this case, since the BC variable is assumed to be uncorrelated with the atmospheric variables in the static B and there is no direct relationship between the BC observation operator and the atmospheric variables, the BC observation does not lead to the generation of the analysis increments of atmospheric variables. Therefore, the merits of a coupled data assimilation system cannot be fully manifested by assimilating a BC observation at the beginning of the window only. If the BC observation is assumed to take place at the middle and the end of the assimilation window, B evolves within the assimilation time window through the TLM M0→i, obtaining the implicit background error covariance matrix at the observation time. includes the cross-covariance information of BC and atmospheric variables and can realize the feedback of the BC observation on the atmospheric variables through the CMA-GFS-AERO ADM , further producing analysis increments of atmospheric variables. In other words, the distribution of the analysis increment at the observation time is determined by the cross-time error matrix M0→iB.
In this work, the horizontal resolution of the CMA-GFS-AERO forward model in the outer loop was set to 0.25°, with an integration step of 300 s, and the horizontal resolution of the CMA-GFS-AERO TLM and ADM in the inner loop was 1.0°, with an integration step of 900 s. The model has 87 vertical layers, with the top being approximately 0.1 hPa (Fig. S1). Referring to the running scheme of the CMA-GFS 4D-Var system described in Sect. 2.2, the CMA-GFS-AERO 4D-Var system also adopts the same 6 h cycling schedule and assimilation windows. The forecast of the CMA-GFS-AERO model started at 03:00 UTC on 1 October 2016 and was restarted every 6 h. The meteorological initial fields for each 6 h cycle were obtained from the operational CMA-GFS analysis. The BC field was initialized with null concentrations at 03:00 UTC on 1 October 2016. From the second forecast cycle onward, the initial conditions of BC were derived from the BC field at the end of the previous 6 h forecast, allowing the BC field to be cycled. The first 9 d was used as the spin-up time to establish a realistic BC distribution. The maximum minimization iteration number in the inner loop was set to 50, while the outer loop was performed only once. This setting is consistent with the operational configuration of the CMA-GFS 4D-Var system and has been found sufficient for achieving convergence in our experiments. The atmospheric observations used in this work are shown in Table S1.
Anthropogenic emission sources used in this study were from the Multi-resolution Emission Inventory for China (MEIC) (Li et al., 2017; Zheng et al., 2018), the Copernicus Atmosphere Monitoring Service global and regional emissions (CAMS) (Granier et al., 2019), and the global datasets of the Task Force Hemispheric Transport of Air Pollution (HTAP) (Janssens-Maenhout et al., 2015). These inventories include various gases (NOx, CO, SO2, NH3, CH4, and non-methane volatile organic compounds (NMVOCs)) and particulates (OC, BC, PM2.5, and PM10), where PM10 refers to the inhalable particulate matter with an aerodynamic diameter of 10 µm or less. These data were processed into grid-point emission data applicable to the CUACE model through the EMIPS emission source processing system (Chen et al., 2023). To improve computational efficiency, they were further simplified into emission source data containing only BC as input to the CMA-GFS-AERO model.
At present, we have run the CMA-GFS-AERO 4D-Var system for 3 months from 1 October 2016. This section mainly shows the experiment results of random time in this 3-month period to the present the rationality and stability of the system. The detailed analysis of the cycling assimilation experiments will be further elaborated in part 2 of this paper.
5.1 Validation of CMA-GFS-AERO TLM and ADM
Validation of the tangent linear and adjoint models is an important part of introducing a new modeling component, such as the AERO-BC module. Considering that CMA-GFS TLM and ADM have been validated and documented in Liu et al. (2017, 2023) and Zhang et al. (2019), here we mainly present the validation of the tangent linear model and adjoint model of the newly developed AERO-BC module.
The correctness of the AERO-BC TLM can be verified by checking whether the following equation is satisfied (Mahfouf and Rabier, 2000; Liu et al., 2017; Tian and Zou, 2020):
where ∥.∥ denotes the norm of the vector and α is the scale factor of initial perturbations with the range from 1.0 to 10−14. As the scale factor α becomes smaller and smaller, the function Φ(α) is expected to approach unity in an approximately linear manner.
We firstly verified all submodules in the AERO-BC TLM, finding that the tangent linear approximation of each submodule was correct. Subsequently, we conducted a set of six experiments with the integration time from 1 to 6 h to verify the correctness of the full AERO-BC TLM. The background field and analysis increment generated by the CMA-GFS-AERO 4D-Var system were used as the basic-state initial field and the perturbation initial field of the CMA-GFS-AERO TLM for 6 h forecasting. The atmospheric and BC state variables c and their perturbations δc of these six time periods were used as inputs of the AERO-BC and its TLM, and the tangent linear approximation of the output variable (the perturbation of the mass mixing ratio of BC, ) of the AERO-BC TLM is tested using Eq. (14).
Figure 2 shows the results of the six correctness experiments. As expected, in each verification experiment, as the scale factor α becomes smaller and smaller for certain ranges of α values, the values of Φ(α) gradually get closer and closer to unity. When α is too small (such as 10−12), the accuracies of the Φ(α) values start to be affected by the machine round-off errors and drift away from unity. This indicates that the tangent linear approximation of the AERO-BC TLM is correct.

Figure 2Variations in the function |Φ(α)−1| for the correctness check of the AERO-BC TLM for the 6 h forecast length, where α is the scale factor of initial perturbations.
We further diagnosed the impact of linearized physical processes on the forecast effectiveness of CMA-GFS-AERO TLM. Generally, the diagnostic method is to calculate the relative error (r) between the tangent linear perturbation forecast M(δx) and the nonlinear perturbation forecast ΔM(δx) (Mahfouf, 1999; Liu et al., 2019; Zhang et al., 2019), which can be expressed as
The nonlinear perturbation forecast ΔM(δx) is the difference between the NLM forecasts from two different initial conditions: the analysis field xa and the background field xb; that is ). And the tangent linear perturbation forecast M(δx) is integrated using the analysis increment δx () as the initial perturbation field. r needs to be calculated for each model variable at each grid.
The forecast period for this experiment was 6 h starting from 03:00 UTC on 25 October 2016 (randomly selected time). For the nonlinear perturbation test, which includes the full physical processes, the two initial conditions were the analysis field xa and the background field xb generated by the CMA-GFS-AERO 4D-Var system at 03:00 UTC on 25 October 2016. For the tangent linear perturbation test, the initial condition was the analysis increment δx () at 03:00 UTC on 25 October 2016. The model trajectory required for the tangent linear perturbation forecast was calculated by the CMA-GFS-AERO NLM including the full physical process with the background field xb as the initial field. The nonlinear and tangent linear models were performed at the same resolution of 1.0°, and the analysis field xa and the background field xb were interpolated from 0.25 to 1.0° based on the 3D interpolation method (Huo et al., 2022).
Figure 3 depicts the results of the nonlinear perturbation forecast and the tangent linear perturbation forecast. Figure 3a–b show the differences in vertically accumulated and latitudinally averaged BC mass concentration (units: µg m−3) after 6 h of integration of the CMA-GFS-AERO NLM with two initial conditions of xa and xb, respectively, and Fig. 3c–d present the vertically accumulated and latitudinally averaged BC mass concentration perturbations after 6 h of integration of CMA-GFS-AERO TLM with the initial condition of δx (). It can be seen that after 6 h of forecast, the distribution of the results of CMA-GFS-AERO NLM and TLM, both horizontally and vertically, are very similar with only minor differences. This indicates that CMA-GFS-AERO TLM shows good performance in terms of tangent linear approximation for BC.

Figure 3Differences in (a) vertically accumulated and (b) latitudinally averaged BC mass concentration (units: µg m−3) after 6 h of integration of the CMA-GFS-AERO NLM with initial conditions xa and xb and perturbations in (c) vertically accumulated and (d) latitudinally averaged BC mass concentration after 6 h of integration of CMA-GFS-AERO TLM with the initial condition δx ().
The vertical distribution of the globally averaged relative error between the perturbation forecasts of CMA-GFS-AERO TLM and NLM, which was calculated according to Eq. (15), is shown in Fig. 4. It can be seen that below the 20th model layer, the tangent linear approximation for BC is better than that for the wind field, potential temperature, and specific humidity. Although the tangent linear approximation for BC is slightly worse above the 20th model layer, it is still far better than that for specific humidity. It is worth noting that the BC concentration above the 20th model level is quite low (Fig. 3b), so the impact of the tangent linear approximation is minimal. This phenomenon indicates that, in comparison to variables such as potential temperature and specific humidity in the CMA-GFS-AERO model, the tangent linear approximation for BC is quite effective, making it well-suited for constructing a 4D-Var system.

Figure 4The vertical distribution of the globally averaged relative error between the perturbation forecasts of the CMA-GFS-AERO TLM and NLM at the resolution of 1.0° (black line: non-dimensional pressure; blue line: potential temperature; red line: BC; magenta line: u wind; cyan line: v wind; green line: specific humidity).
The correctness of the AERO-BC adjoint can be verified by checking whether the following equation is satisfied (Mahfouf and Rabier, 2000; Liu et al., 2017; Tian and Zou, 2020):
where 〈,〉 denotes the inner product. Using δc as the input of the AERO-BC TLM, the output of the AERO-BC TLM F(δc) can be obtained and the left-hand side (LHS) of Eq. (16) can be calculated. Then, taking F(δc) as the input of the AERO-BC adjoint, we can get its output FT(F(δc)) and calculate the right-hand side (RHS). If the AERO-BC adjoint is developed correctly, the LHS and RHS of Eq. (16) are expected to agree with the machine accuracy of the data type declared in the program, which is double precision in the AERO-BC.
Following Eq. (16), we conducted five experiments with the integration time equal to 1, 6, 12, 24, and 36 steps with a time step of 900 s. Considering the mass mixing ratio of BC () as an example, for each experiment, the atmospheric variables and perturbations in the analysis increment generated by the CMA-GFS-AERO 4D-Var system were used as the input of the AERO-BC TLM. We ran the tangent linear codes once to obtain the value of the tangent linear output and calculated the LHS of Eq. (16). Then, taking the tangent linear output as input, the AERO-BC adjoint codes were run once to obtain the sensitivity value, which was further used to calculate the RHS of Eq. (16) with the perturbation. The validation results are presented in Table 2. The resulting LHS and RHS from the five tests agree with the precision of machine accuracies, indicating the correctness of the AERO-BC adjoint model.
5.2 Single-observation experiment
In order to evaluate the rationality of the CMA-GFS-AERO 4D-Var system, we performed the single-observation experiment for BC. The experiment period was 6 h starting from 03:00 UTC on 24 November 2016 (randomly selected time), and the forecast field of the CMA-GFS-AERO model at this time was selected as the background field. During the assimilation process, no atmospheric observations were added. We adopted the BC surface observation at Nanjiao station (116.47° E, 39.8° N), which is located in Beijing, at 03:00 UTC on 24 November 2016. The altitude of Nanjiao station is 31.3 m, and the observed BC concentration is 10.0 µg m−3. Figure S3 shows the location of the BC observation and the wind field at 925 hPa, which moves from northwest to southeast. The BC observation was placed at 03:00, 06:00, and 09:00 UTC, respectively, corresponding to the beginning, the middle, and the end of the assimilation time window.
Theoretically, the analysis increment at the beginning time for 4D-Var assimilation is . If we only assimilate the observation at time ti, the analysis increment at the observation time is . When assimilating the single observation, is a vector with only one factor. If the observation position and the analysis grid coincide, the spatial interpolation in the observation operator can be ignored. Thus, the analysis increment at the observation time can reflect the structure of the background error covariance at the observation time. Figure 5 shows the analysis increments of BC at the first model layer at the observation times, with the BC observation placed at 03:00, 06:00, and 09:00 UTC, respectively. When the BC observation is placed at 03:00 UTC (the observation innovation () is −1.2 µg m−3 at 03:00 UTC), the 4D-Var assimilation behaves similarly to the 3D-Var assimilation, and the horizontal distribution of the BC analysis increment is determined by the static background error covariance model B. Since the CMA-GFS-AERO 4D-Var system uses a homogeneous second-order autoregressive spatial correlation model, the BC analysis increment at 03:00 UTC (Fig. 5a) is essentially isotropic, and only the background error covariance, which varies with latitude, causes the analysis increment to differ somewhat in the north–south direction. When the BC observation is placed at 06:00 UTC (the observation innovation is −9.5 µg m−3 at 06:00 UTC) and 09:00 UTC (the observation innovation is −9.0 µg m−3 at 09:00 UTC), the BC analysis increments show anisotropic characteristics (Fig. 5b–c), which is consistent with the movement of the wind at 925 hPa (Fig. S3), indicating that the background error covariance varies with the weather situation. Meanwhile, it can also be seen that the values of the BC analysis increments at 06:00 and 09:00 UTC are much larger than those at 03:00 UTC. This is because the BC observation innovations at 06:00 and 09:00 UTC are greater than those at 03:00 UTC.

Figure 5The analysis increments of BC at the first model level at the observation times, with the BC observation placed at (a) the beginning of the assimilation time window, 03:00 UTC; (b) the middle of the assimilation window, 06:00 UTC; and (c) the end of the assimilation time window, 09:00 UTC. The black triangle represents the ideal observation location (116.47° E, 39.8° N).
Figure 6 presents the evolved analysis increments of BC at the first model level at the end of the assimilation time window (09:00 UTC) obtained by the CMA-GFS-AERO TLM, with the BC observation placed at 03:00 and 06:00 UTC, respectively. For the case where the BC observation is placed at 03:00 UTC, the initial analysis increment at 03:00 UTC (Fig. 5a) exhibits an isotropic structure due to the static B. In contrast, the propagated analysis increment at the end of the assimilation time window (09:00 UTC, Fig. 6a) exhibits an anisotropic structure under the influence of the flow-dependent . Similarly, when the BC observation is placed at 06:00 UTC, both the initial analysis increment at 06:00 UTC (Fig. 5b) and the propagated analysis increment at 09:00 UTC (Fig. 6b) exhibit an anisotropic structure. In addition, the horizontal distribution structure of the BC analysis increments in Fig. 6a and b closely resembles that of the analysis increments at the observation time of 09:00 UTC (Fig. 5c). This indicates the significant impact of flow-dependent dynamics on the evolution of the analysis increments. No matter what time the observation is placed at, the spatial propagation of the observation information is effectively achieved through the model integration. In this idealized single-observation experiment, the propagation of BC increments is primarily dominated by advection due to the limited observational constraint. When more comprehensive observations are assimilated, advection remains a key factor, but its dominance is less pronounced as other processes also influence the adjustment of BC distributions (see Sect. 5.3).

Figure 6The analysis increments of BC at the first model level at the end of the assimilation time window, 09:00 UTC, with the BC observation placed at (a) the beginning of the assimilation window, 03:00 UTC and (b) the middle of the assimilation window, 06:00 UTC. The black triangle represents the ideal observation location (116.47° E, 39.8° N).
Figure 7 depicts the analysis increments of temperature at the first model level at the beginning time of the assimilation time window (03:00 UTC), with the BC observation placed at 06:00 and 09:00 UTC in panels (a) and (b), respectively. In this specific case, the analysis increments of temperature are positive, with a value of about 0.02 K near the observation location, when the BC observation is placed at 06:00 and 09:00 UTC. The temperature analysis increment depends on several factors, including the BC observation innovation, the location of the observation, and the meteorological conditions during the assimilation time window. Here, the positive analysis increments of temperature may be due to the fact that the BC observation innovations at 06:00 and 09:00 UTC are negative (−9.5 µg m−3 at 06:00 UTC and −9.0 µg m−3 at 09:00 UTC), indicating that the background BC concentration is lower than the observed values. Assimilation of these observations increases the BC concentrations in the analysis, which, under the prevailing meteorological conditions, leads to positive temperature increments near the observation site. As explained in Sect. 3.3.5, the coupling between BC and atmospheric variables within the system allows this type of feedback to occur.

Figure 7The analysis increments of temperature at the first model layer at the beginning of the assimilation time window, 03:00 UTC, with the BC observation placed at (a) the middle of the assimilation window, 06:00 UTC and (b) the end of the assimilation time window, 09:00 UTC. The black triangle represents the ideal observation location (116.47° E, 39.8° N).
Figure 8 shows the analysis increments of pressure at the first model level, as well as the east–west component of horizontal wind and relative humidity at the same level at the beginning of the assimilation time window (03:00 UTC), with the BC observation placed at 09:00 UTC. The assimilation of a single BC observation produces noticeable analysis increments in the pressure, east–west component of horizontal wind, and relative humidity in North China, which shows that the CMA-GFS-AERO 4D-Var coupled assimilation system can reflect the impact of BC assimilation on atmospheric increments. In reality, unlike the single-observation experiment, the BC observation is distributed within the assimilation time window, rather than just at a fixed moment; thus, the advantages of the CMA-GFS-AERO 4D-Var strong coupling assimilation system can be fully utilized to explore the feedback of BC assimilation on atmospheric variables.

Figure 8The analysis increments of the (a) pressure, (b) east–west component of horizontal wind, and (c) relative humidity at the first model layer at the beginning of the assimilation time window, 03:00 UTC, with the BC observation placed at the end of the assimilation time window, 09:00 UTC. The black triangle represents the ideal observation location (116.47° E, 39.8° N).
5.3 Case study on BC and atmosphere assimilation
On the basis of the single-observation experiment, we further conducted the full-observation experiment for BC and atmospheric variables. The experiment period was also 6 h starting from 03:00 UTC on 24 November 2016 (the same time as the experimental setup in Sect. 5.2), and the forecast field of the CMA-GFS-AERO model at this time was selected as the background field. We conducted a set of four experiments to investigate the impact of different BC assimilation strategies on both BC and atmospheric variables. These experiments are listed in Table 3. Differently from the single-observation experiment in Sect. 5.2, in which the observations are placed at a fixed time, we assimilated all available BC observations with an hourly frequency within the assimilation time window in the full-observation experiment. For the DA_MET_then_BC experiment, the CMA-GFS-AERO 4D-Var system was executed twice sequentially within the same assimilation window. In the first step, only operational meteorological observations were assimilated, and the resulting analysis was used as the background field for the second step, in which only BC surface observations were assimilated. Except for the observational datasets, the model configurations and assimilation settings in both steps remained identical. This two-step procedure allows us to separate the effect of BC observations from the influence of meteorological observations and their associated background adjustment, thereby facilitating a clearer attribution of the BC assimilation impact. In contrast, DA_MET_BC_simult assimilated both operational meteorological observations and BC surface observations simultaneously within a single 4D-Var run. This one-step assimilation strategy allows all observations to jointly influence the analysis field, reflecting the integrated effect of both meteorological and BC observations. In the following analysis, we primarily compare the BC analysis increments obtained from DA_BC, DA_MET_then_BC, and DA_MET_BC_simult experiments, noting that the BC analysis increments from the DA_MET experiment are very small (figure omitted). Additionally, we compare the atmospheric analysis increments caused by BC assimilation in DA_BC, DA_MET_then_BC (DA_MET_then_ BC-DA_MET), and DA_MET_BC_simult (DA_MET_BC_simult-DA_MET).
Figure 9 presents the analysis increments of BC at the first model layer from the DA_BC, DA_MET_then_BC, and DA_ MET_BC_ simult experiments. These analysis increments are valid at the beginning of the assimilation window, as is standard in 4D-Var systems. When only BC surface observations are assimilated (DA_BC), the BC analysis increment is mainly concentrated in North China and Eastern China, with a maximum value of about 6.0 µg m−3 (Fig. 9a). When operational meteorological observations are assimilated first, followed by BC surface observations (DA_MET_then_BC), or when both operational meteorological and BC surface observations are assimilated simultaneously (DA_MET_BC_simult), the distribution and the value of BC analysis increments are nearly identical to those of DA_BC, with only minor differences (Fig. 9b–c). This indicates that the three BC assimilation strategies have similar assimilation effects on BC, further demonstrating that the assimilation of meteorological observations has a relatively small impact on BC analysis increments.

Figure 9The analysis increments of BC at the first model layer from (a) DA_BC, (b) DA_MET_then_BC, and (c) DA_ MET_BC_ simult.
We further explored the impact of different BC assimilation strategies on analysis increments of atmospheric variables. Figure 10 shows the analysis increments of the temperature, pressure, east–west component of horizontal wind, and relative humidity at the first model layer, resulting from BC assimilation in DA_BC, DA_MET_then_BC, and DA_ MET_BC_simult. It is worth noting that in DA_BC, only BC observations are assimilated, so the analysis increments of atmospheric variables purely reflect the response to BC. In contrast, both DA_MET_then_BC and DA_MET_BC_simult assimilate BC and meteorological observations, and thus their analysis increments include the combined effects of both types of observations. To isolate the influence of BC assimilation alone on atmospheric variables and under the assumption that the contribution of meteorological observations is comparable between DA_MET_then_BC/DA_MET_BC_simult and DA_MET, we calculated the differences between the analysis increments of these experiments and those from DA_MET, which assimilates only meteorological observations. This subtraction effectively removes the contributions from meteorological observations, allowing the resulting increments to be attributed solely to the assimilation of BC observations. In this way, a more direct and fair comparison can be made with DA_BC. Figure 10a, d, g, and j display the analysis increments of these variables from BC assimilation in DA_BC. Figure 10b, e, h, and k show the increments due to BC assimilation in DA_MET_then_BC, obtained by subtracting the atmospheric increments in DA_MET from DA_MET_then_BC (DA_MET_then_ BC-DA_ MET). Figure 10c, f, i, and l illustrate the increments caused by BC assimilation in DA_MET_BC_simult, obtained similarly by subtracting the increments in DA_MET from DA_MET_ BC_ simult (DA_MET_BC_simult − DA_MET).

Figure 10The analysis increments of the (a, b, c) temperature, (d, e, f) pressure, (g, h, i) east–west component of horizontal wind, and (j, k, l) relative humidity at the first model layer caused by BC assimilation. Panels (a), (d), (g), and (j) are analysis increments from DA_BC; panels (b), (e), (h), and (k) are the differences in analysis increments between DA_MET_ then_BC and DA_MET (DA_MET_then_BC minus DA_MET); and panels (c), (f), (i), and (l) are the differences in analysis increments between DA_MET_BC_simult and DA_MET (DA_ MET_BC_simult minus DA_MET).
When only BC surface observations are assimilated (DA_BC), analysis increments of the temperature (Fig. 10a), pressure (Fig. 10d), east–west component of horizontal wind (Fig. 10g), and relative humidity (Fig. 10j) are present in North China and Eastern China. The value of the analysis increments for the temperature, pressure, east–west component of horizontal wind, and relative humidity reach approximately 0.1 K (Fig. 10a), 0.1 Pa (Fig. 10d), −0.2 m s−1 (Fig. 10g), and 0.8 % (Fig. 10j), respectively.
When operational meteorological observations are assimilated first, followed by BC surface observations (DA_MET_then_BC), the distributions and the values of the analysis increments of these four atmospheric variables due to BC assimilation (Fig. 10b, e, h, k) are basically consistent with those of DA_BC. This is because, although the DA_MET_then_BC experiment assimilates meteorological observations before BC surface observations, the background field of BC remains unchanged. While the assimilation of meteorological observations updates atmospheric variables, it does not directly alter the BC background field. Therefore, the observation-minus-background (OMB) values for BC observations in DA_MET_then_BC are very close to those in DA_BC, with only minor differences caused by the slight influence of updated meteorological fields on the observation operator. As a result, the analysis increments of atmospheric variables due to BC assimilation are similar between the two experiments. Additionally, the values in each sub-image of the middle panel in Fig. 10 differ slightly from those on the left. These differences are attributed to the distinct basic-state values of the atmospheric variables used in the tangent linear and adjoint processes. Specifically, in DA_BC, the basic-state values of the atmospheric variables are derived from the atmospheric background field without meteorological assimilation, while in DA_MET_then_BC, they are taken from the atmospheric analysis field after assimilating the operational meteorological observations. These differences in the input to the TLM and ADM can lead to subtle variations in the analysis increments.
The overall distribution and pattern of the analysis increments of temperature (Fig. 10c), pressure (Fig. 10f), and the east–west component of horizontal wind (Fig. 10i) caused by BC assimilation in DA_MET_BC_simult are consistent with those in DA_BC and DA_MET_then_BC. However, the increment values in DA_MET_BC_simult are smaller, with values reaching approximately 0.02 K (Fig. 10c), 0.002 Pa (Fig. 10f), and −0.05 m s−1 (Fig. 10i), respectively. The analysis increment of relative humidity (Fig. 10l) due to BC assimilation in DA_MET_BC_simult shows a small positive value distribution, whereas in DA_BC and DA_MET_then_BC, it exhibits a negative value distribution. The differences in analysis increments of the four atmospheric variables caused by BC assimilation between DA_MET_BC_simult and DA_ BC/DA_MET_then_BC may be attributed to the stronger constraints imposed by the atmospheric observations. In both DA_MET_then_BC and DA_BC, only BC surface observations are incorporated during the BC assimilation step. At this stage, the system relies solely on BC observations to correct the initial field. In the absence of atmospheric observations, BC observations play a dominant role, leading to larger analysis increments of atmospheric variables. In contrast, in DA_MET_BC_simult, both operational meteorological observations and BC surface observations are assimilated simultaneously. In this scenario, atmospheric observations may impose additional constraints on the adjustment of atmospheric fields, thereby moderating the impact of BC observations during the assimilation process. As a result, a more balanced adjustment of atmospheric variables is achieved in DA_MET_BC_simult. This behavior also highlights the importance of properly specifying the observation error covariance matrix. In future work, we plan to further examine the specification of the BC observation errors and their impact on assimilation performance.
Table 4Computational costs (unit: s) for 6, 24, and 120 h integrations of CMA-GFS and CMA-GFS-AERO models.

The CMA-GFS and CMA-GFS-AERO models are integrated with the same time step (300 s), the same horizontal resolution of 0.25°, and the same CPU cores (1920 cores).
Table 5Computational costs (unit: s) for 12 h integrations of CMA-GFS TLM/ADM and CMA-GFS-AERO TLM/ADM.

CMA-GFS TLM/ADM and CMA-GFS-AERO TLM/ADM are integrated with the same time step (900 s) and the same horizontal resolution of 1°.
Table 6Computational costs (unit: s) for 6 h integrations of CMA-GFS 4D-Var and CMA-GFS-AERO 4D-Var.

CMA-GFS 4D-Var and CMA-GFS-AERO 4D-Var are integrated with the same time step of 300 s/900 s (outer loop/inner loop), the same horizontal resolution of 0.25°/1° (outer loop/inner loop), and the same number of minimization iterations of 35 steps.
The preliminary results obtained from this set of four experiments indicate that different BC assimilation strategies have little impact on BC analysis increments but significantly affect the analysis increments of atmospheric variables. When only BC observations are assimilated, the influence of BC on atmospheric variables is more pronounced, whereas the simultaneous assimilation of meteorological observations moderates this influence. This suggests that the presence of meteorological observations during assimilation may impose additional constraints on the adjustment of atmospheric fields, potentially reducing the degree to which the assimilation of BC observations alone can alter the atmospheric state. In this way, the integration of meteorological observations helps stabilize the adjustment process, supporting more consistent and interpretable assimilation results. Moreover, the four experiments demonstrate that the CMA-GFS-AERO 4D-Var system has been technically implemented and is able to produce credible analysis increments in both BC and atmospheric fields. These increments display realistic spatial structures and amplitudes, indicating that the system performs as intended under the current configuration and with the available observations. These results offer practical evidence of the system's functionality and its potential utility for exploring the feedback of BC data assimilation on meteorological forecasts. In the future, we will conduct cycling assimilation experiments using CMA-GFS-AERO 4D-Var to gain deeper insights into the role of BC assimilation in numerical weather prediction and further refine the system for broader applications.
5.4 Computational performance of CMA-GFS-AERO 4D-Var
This section presents the computational performance of CMA-GFS-AERO 4D-Var from three aspects: (1) the forward model, (2) the TLM and ADM, and (3) the 4D-Var system. We firstly evaluated the computational performance of a CMA-GFS-AERO simulation and compared it with that of the CMA-GFS simulation. Table 4 shows the computational costs for 6, 24, and 120 h integrations of CMA-GFS and CMA-GFS-AERO models. It can be seen that for 6, 24, and 120 h forecasts with the same integration time step (300 s), the same horizontal resolution of 0.25°, and the same number of CPU cores (1920 cores), the CMA-GFS-AERO simulations increase only about 10 % of the computational time of the CMA-GFS simulations (as a reference, the microphysics process accounts for approximately 5 % of the total computation time in CMA-GFS simulations). This shows the high efficiency of the CMA-GFS-AERO forward model, which is an important factor in developing a strongly coupled aerosol–meteorology 4D-Var system.
Table 5 presents the computational costs for 12 h integrations of CMA-GFS TLM/ADM and CMA-GFS-AERO TLM/ADM, and Table 6 shows the computational costs for 6 h integrations of CMA-GFS 4D-Var and CMA-GFS-AERO 4D-Var. It is apparent that with an increasing number of CPU cores, the acceleration effects of CMA-GFS-AERO TLM, ADM, and 4D-Var are comparable to those of CMA-GFS TLM, ADM, and 4D-Var. When using 1440 CPU cores, the total times of CMA-GFS-AERO TLM, ADM, and 4D-Var are approximately 1.1 times, 1.2 times, and 1.4 times those of CMA-GFS TLM, ADM, and 4D-Var, respectively. This highlights the high efficiency and good scalability of CMA-GFS-AERO TLM, ADM, and 4D-Var, making the coupled aerosol–meteorology 4D-Var system potentially suitable for operational application.
In this study, we developed CMA-GFS-AERO 4D-Var, a strongly coupled aerosol–meteorology data assimilation system, under the framework of the incremental analysis scheme of CMA-GFS 4D-Var. CMA-GFS-AERO 4D-Var includes three model components: forward, tangent linear, and adjoint models. The CMA-GFS-AERO forward model was constructed by integrating the AERO-BC module, an aerosol module containing main aerosol physical processes of BC in the atmosphere, the code of which was extracted from the CUACE air quality model and further optimized in this work, into the CMA-GFS weather model. The tangent linear model and the adjoint model of the AERO-BC module were developed and coupled online with the TLM and ADM of CMA-GFS, respectively. Thus, the CMA-GFS-AERO ADM includes not only the adjoint model of physical processes of BC, but also the adjoint model of the meteorological model. The BC mass concentration was used as the control variable and minimized together with atmospheric variables. The background error covariance of the control variable BC adopted a modeled structure. The assimilation system used BC surface observations from the China Atmospheric Monitoring Network. The observation error and the observation operator of BC were described in detail as well.
The CMA-GFS-AERO TLM and ADM were verified by tangent linear approximation and adjoint correctness tests. The results show that CMA-GFS-AERO TLM exhibits good performance in tangent linear approximation for BC, and adjoint sensitivity agrees well with tangent linear sensitivity. The CMA-GFS-AERO 4D-Var system was validated for its accuracy and rationality by the single-observation experiment and the full-observation experiment. The results show that assimilating BC observations can generate analysis increments not only for BC but also for atmospheric variables such as the temperature, pressure, wind field, and relative humidity. This demonstrates that the newly developed CMA-GFS-AERO 4D-Var system has been technically implemented and is capable of producing credible assimilation outcomes, highlighting its potential as a useful tool for exploring the feedback of BC data assimilation on meteorological forecasts. Additionally, the computational performance of CMA-GFS-AERO 4D-Var was evaluated, and the results indicate that when using 1440 CPU cores for 6 h integrations, the total time of CMA-GFS-AERO 4D-Var is approximately 1.4 times that of CMA-GFS 4D-Var, highlighting the high efficiency of CMA-GFS-AERO 4D-Var and the potential in operational application.
The next steps are as follows. We intend to explore the impact of assimilating surface BC observations on the forecast fields of BC and atmospheric variables through cycling assimilation experiments. CMA-GFS-AERO 4D-Var still needs to be applied to control variables for BC emission scaling factors. Further development of CMA-GFS-AERO 4D-Var will aim to assimilate more aerosol species while ensuring computational efficiency, providing an effective way to study the impact of aerosol assimilation on the analysis and forecast fields of atmospheric variables.
The CMA-GFS model and its 4D-Var system and CUACE model were distributed by the CMA Earth System Modeling and Prediction Centre (CEMC) and the Chinese Academy of Meteorological Sciences, respectively. The model was run on the PI-SUGON high-performance computer with Intel Fortran Compiler. Due to copyright restrictions of CEMC, the full codes of the system are not freely available; interested users can contact the operational management department of CEMC or the author Yongzhu Liu (liuyzh@cma.gov.cn) for further assistance. Codes related to this study, including the tangent linear and adjoint interface codes for black carbon (BC) and the observation operator codes for BC and the CMA-GFS-AERO 4D-Var main program, are available on Zenodo (https://doi.org/10.5281/zenodo.14880420; Liu et al., 2025). Model outputs of the four assimilation experiments of BC and atmosphere used in this study are also available at this website.
The supplement related to this article is available online at https://doi.org/10.5194/gmd-18-4855-2025-supplement.
XZ, XS, and WH envisioned and oversaw the project. YL and CW developed the model code and performed the simulations. YL, WJ, and CW prepared the manuscript with contributions from all co-authors.
The contact author has declared that none of the authors has any competing interests.
Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.
This work was supported by the Major Program of National Natural Science Foundation of China (42090032) and National Natural Science Foundation of China (42305111). The assimilation and forecast experiments were performed on the high-performance computer PI-SUGON of the China Meteorological Administration. The development of the CMA-GFS-AERO 4D-Var system is a collaborative effort involving contributions from many colleagues. We sincerely thank the entire team for their cooperation. We also thank YaQiang Wang from CAMS for providing anthropogenic emission source data and black carbon observations. Special thanks are extended to the three anonymous reviewers for their insightful comments and constructive suggestions, which contributed significantly to the improvement of the initial version of this paper.
This research has been supported by the Major Program of National Natural Science Foundation of China (grant no. 42090032) and National Natural Science Foundation of China (grant no. 42305111).
This paper was edited by Yuefei Zeng and reviewed by three anonymous referees.
An, X. Q., Zhai, S. X., Jin, M., Gong, S., and Wang, Y.: Development of an adjoint model of GRAPES–CUACE and its application in tracking influential haze source areas in north China, Geosci. Model Dev., 9, 2153–2165, https://doi.org/10.5194/gmd-9-2153-2016, 2016.
Arakawa, A. and Schubert, W. H.: Interaction of a cumulus cloud ensemble with the large-scale environment, Part I, J. Atmos. Sci., 31, 674–701, https://doi.org/10.1175/1520-0469(1974)031<0674:IOACCE>2.0.CO;2, 1974.
Baklanov, A., Schlünzen, K., Suppan, P., Baldasano, J., Brunner, D., Aksoyoglu, S., Carmichael, G., Douros, J., Flemming, J., Forkel, R., Galmarini, S., Gauss, M., Grell, G., Hirtl, M., Joffre, S., Jorba, O., Kaas, E., Kaasik, M., Kallos, G., Kong, X., Korsholm, U., Kurganskiy, A., Kushta, J., Lohmann, U., Mahura, A., Manders-Groot, A., Maurizi, A., Moussiopoulos, N., Rao, S. T., Savage, N., Seigneur, C., Sokhi, R. S., Solazzo, E., Solomos, S., Sørensen, B., Tsegas, G., Vignati, E., Vogel, B., and Zhang, Y.: Online coupled regional meteorology chemistry models in Europe: current status and prospects, Atmos. Chem. Phys., 14, 317–398, https://doi.org/10.5194/acp-14-317-2014, 2014.
Ballard, S. P., Li, Z., Simonin, D., and Caron, J. F.: Performance of 4D-Var NWP-based nowcasting of precipitation at the Met Office for summer 2012, Q. J. Roy. Meteor. Soc., 142, 472–487, https://doi.org/10.1002/qj.2665, 2016.
Bergman, K. H.: Multivariate Analysis of Temperatures and Winds Using Optimum Interpolation, Mon. Weather Rev., 107, 1423–1444, https://doi.org/10.1175/1520-0493(1979)107<1423:MAOTAW>2.0.CO;2, 1979.
Benedetti, A., Morcrette, J. J., Boucher, O., Dethof, A., Engelen, R. J., Fisher, M., Flentje, H., Huneeus, N., Jones, L., and Kaiser, J. W.: Aerosol analysis and forecast in the European centre for medium-range weather forecasts integrated forecast system: 2. Data assimilation, J. Geophys. Res.-Atmos., 114, D13205, https://doi.org/10.1029/2008JD011115, 2009.
Bocquet, M., Elbern, H., Eskes, H., Hirtl, M., Žabkar, R., Carmichael, G. R., Flemming, J., Inness, A., Pagowski, M., Pérez Camaño, J. L., Saide, P. E., San Jose, R., Sofiev, M., Vira, J., Baklanov, A., Carnevale, C., Grell, G., and Seigneur, C.: Data assimilation in atmospheric chemistry models: current status and future prospects for coupled chemistry meteorology models, Atmos. Chem. Phys., 15, 5325–5358, https://doi.org/10.5194/acp-15-5325-2015, 2015.
Bond, T. C., Doherty, S. J., Fahey, D. W., Forster, P. M., Berntsen, T., DeAngelo, B. J., Flanner, M. G., Ghan, S., Kärcher, B., and Koch, D.: Bounding the role of black carbon in the climate system: A scientific assessment, J. Geophys. Res.-Atmos., 118, 5380–5552, https://doi.org/10.1002/jgrd.50171, 2013.
Chen, D. and Shen, X.: Recent progress on GRAPES research and application, J. Appl. Meteorol. Sci., 17, 773–777, 2006 (in Chinese).
Chen, D., Liu, Z., Ban, J., Zhao, P., and Chen, M.: Retrospective analysis of 2015–2017 wintertime PM2.5 in China: response to emission regulations and the role of meteorology, Atmos. Chem. Phys., 19, 7409–7427, https://doi.org/10.5194/acp-19-7409-2019, 2019.
Chen, D., Xue, J., Yang, X., Zhang, H., Shen, X., Hu, J., Wang, Y., Ji, L., and Chen, J.: New generation of multi-scale NWP system (GRAPES): general scientific design, Chinese Sci. Bull., 53, 3433–3445, https://doi.org/10.1007/s11434-008-0494-z, 2008.
Chen, W., Wang, Y., Li, J., Yi, Z., Zhao, Z., Guo, B., Che, H., and Zhang, X.: Description and evaluation of a newly developed emission inventory processing system (EMIPS), Sci. Total Environ., 870, 161909, https://doi.org/10.1016/j.scitotenv.2023.161909, 2023.
Chung, S. H. and Seinfeld, J. H.: Global distribution and climate forcing of carbonaceous aerosols, J. Geophys. Res.-Atmos., 107, AAC 14-11–AAC 14-33, https://doi.org/10.1029/2001JD001397, 2002.
Courtier, P., Thépaut, J. N., and Hollingsworth, A.: A strategy for operational implementation of 4D-Var, using an incremental approach, Q. J. Roy. Meteor. Soc., 120, 1367–1387, https://doi.org/10.1002/qj.49712051912, 1994.
Dai, Y., Zeng, X., Dickinson, R. E., Baker, I., Bonan, G. B., Bosilovich, M. G., Denning, A. S., Dirmeyer, P. A., Houser, P. R., Niu, G., Oleson, K. W., Schlosser, C. A., and Yang, Z.: The Common Land Model, B. Am. Meteorol. Soc., 84, 1013–1024, https://doi.org/10.1175/BAMS-84-8-1013, 2003.
Elbern, H. and Schmidt, H.: A four-dimensional variational chemistry data assimilation scheme for Eulerian chemistry transport modeling, J. Geophys. Res.-Atmos., 104, 18583–18598, https://doi.org/10.1029/1999JD900280, 1999.
Elbern, H., Strunk, A., Schmidt, H., and Talagrand, O.: Emission rate and chemical state estimation by 4-dimensional variational inversion, Atmos. Chem. Phys., 7, 3749–3769, https://doi.org/10.5194/acp-7-3749-2007, 2007.
Flemming, J., Huijnen, V., Arteta, J., Bechtold, P., Beljaars, A., Blechschmidt, A.-M., Diamantakis, M., Engelen, R. J., Gaudel, A., Inness, A., Jones, L., Josse, B., Katragkou, E., Marecal, V., Peuch, V.-H., Richter, A., Schultz, M. G., Stein, O., and Tsikerdekis, A.: Tropospheric chemistry in the Integrated Forecasting System of ECMWF, Geosci. Model Dev., 8, 975–1003, https://doi.org/10.5194/gmd-8-975-2015, 2015.
Flemming, J., Inness, A., Jones, L., Eskes, H. J., Huijnen, V., Schultz, M. G., Stein, O., Cariolle, D., Kinnison, D., and Brasseur, G.: Forecasts and assimilation experiments of the Antarctic ozone hole 2008, Atmos. Chem. Phys., 11, 1961–1977, https://doi.org/10.5194/acp-11-1961-2011, 2011.
Gelbard, F., Tambour, Y., and Seinfeld, J. H.: Sectional representations for simulating aerosol dynamics, J. Colloid Interf. Sci., 76, 541–556, https://doi.org/10.1016/0021-9797(80)90394-X, 1980.
GMAO (Global Modeling and Assimilation Office): MERRA-2 inst3_3d_aer_Nv: 3d,3-Hourly,Instantaneous,Model-Level,Assimilation,Aerosol Mixing Ratio V5.12.4, NASA Goddard Earth Sciences Data and Information Services Center (GES DISC) [data set], https://doi.org/10.5067/LTVB4GPCOTK2, 2015.
Gong, S. L., Barrie, L. A., Blanchet, J. P., Von Salzen, K., Lohmann, U., Lesins, G., Spacek, L., Zhang, L. M., Girard, E., and Lin, H.: Canadian Aerosol Module: A size-segregated simulation of atmospheric aerosol processes for climate and air quality models 1. Module development, J. Geophys. Res.-Atmos., 108, AAC 3-1–AAC 3-16, https://doi.org/10.1029/2001JD002002, 2003.
Gong, S. L. and Zhang, X. Y.: CUACE/Dust – an integrated system of observation and modeling systems for operational dust forecasting in Asia, Atmos. Chem. Phys., 8, 2333–2340, https://doi.org/10.5194/acp-8-2333-2008, 2008.
Granier, C., Darras, S., Denier van der Gon, H., Doubalova, J., Elguindi, N., Galle, B., Gauss, M., Guevara, M., Jalkanen, J.-P., Kuenen, J., Liousse, C., Quack, B., Simpson, D., and Sindelarova, K.: The Copernicus Atmosphere Monitoring Service Global and Regional Emissions (April 2019 Version), Copernicus Atmosphere Monitoring Service, https://doi.org/10.24380/d0bn-kx16, 2019.
Guerrette, J. J. and Henze, D. K.: Development and application of the WRFPLUS-Chem online chemistry adjoint and WRFDA-Chem assimilation system, Geosci. Model Dev., 8, 1857–1876, https://doi.org/10.5194/gmd-8-1857-2015, 2015.
Guo, B., Wang, Y., Zhang, X., Che, H., Ming, J., and Yi, Z.: Long-Term variation of black carbon aerosol in China based on revised aethalometer monitoring data, Atmosphere, 11, 684, https://doi.org/10.3390/atmos11070684, 2020.
Hakami, A., Henze, D. K., Seinfeld, J. H., Singh, K., Sandu, A., Kim, S., Byun, D., and Li, Q.: The adjoint of CMAQ, Environ. Sci. Technol., 41, 7807–7817, https://doi.org/10.1021/es070944p, 2007.
Han, J. and Pan, H. L.: Revision of convection and vertical diffusion schemes in the NCEP global forecast system, Weather Forecast., 26, 520–533, https://doi.org/10.1175/WAF-D-10-05038.1, 2011.
Henze, D. K., Hakami, A., and Seinfeld, J. H.: Development of the adjoint of GEOS-Chem, Atmos. Chem. Phys., 7, 2413–2433, https://doi.org/10.5194/acp-7-2413-2007, 2007.
Henze, D. K., Seinfeld, J. H., and Shindell, D. T.: Inverse modeling and mapping US air quality influences of inorganic PM2.5 precursor emissions using the adjoint of GEOS-Chem, Atmos. Chem. Phys., 9, 5877–5903, https://doi.org/10.5194/acp-9-5877-2009, 2009.
Hong, S. Y. and Pan, H. L.: Nonlocal boundary layer vertical diffusion in a medium-range forecast model, Mon. Weather Rev., 124, 2322–2339, https://doi.org/10.1175/1520-0493(1996)124<2322:NBLVDI>2.0.CO;2, 1996.
Huo, Z., Li, X., Chen, J., and Liu, Y.: CMA global ensemble prediction using singular vectors from background field, J. Appl. Meteorol. Sci., 33, 655–667, 2022 (in Chinese).
Inness, A., Baier, F., Benedetti, A., Bouarar, I., Chabrillat, S., Clark, H., Clerbaux, C., Coheur, P., Engelen, R. J., Errera, Q., Flemming, J., George, M., Granier, C., Hadji-Lazaro, J., Huijnen, V., Hurtmans, D., Jones, L., Kaiser, J. W., Kapsomenakis, J., Lefever, K., Leitão, J., Razinger, M., Richter, A., Schultz, M. G., Simmons, A. J., Suttie, M., Stein, O., Thépaut, J.-N., Thouret, V., Vrekoussis, M., Zerefos, C., and the MACC team: The MACC reanalysis: an 8 yr data set of atmospheric composition, Atmos. Chem. Phys., 13, 4073–4109, https://doi.org/10.5194/acp-13-4073-2013, 2013.
Jacobson, M. Z., Turco, R. P., Jensen, E. J., and Toon, O. B.: Modeling coagulation among particles of different composition and size, Atmos. Environ., 28, 1327–1338, https://doi.org/10.1016/1352-2310(94)90280-1, 1994.
Janssens-Maenhout, G., Crippa, M., Guizzardi, D., Dentener, F., Muntean, M., Pouliot, G., Keating, T., Zhang, Q., Kurokawa, J., Wankmüller, R., Denier van der Gon, H., Kuenen, J. J. P., Klimont, Z., Frost, G., Darras, S., Koffi, B., and Li, M.: HTAP_v2.2: a mosaic of regional and global emission grid maps for 2008 and 2010 to study hemispheric transport of air pollution, Atmos. Chem. Phys., 15, 11411–11432, https://doi.org/10.5194/acp-15-11411-2015, 2015.
Koch, D.: Transport and direct radiative forcing of carbonaceous and sulfate aerosols in the GISS GCM, J. Geophys. Res.-Atmos., 106, 20311–20332, https://doi.org/10.1029/2001JD900038, 2001.
Kuhlbusch, T. A. J.: Black carbon and the carbon cycle, Science, 280, 1903–1904, https://doi.org/10.1126/science.280.5371.1903, 1998.
Lanczos, C.: An iteration method for the solution of the eigenvalue problem of linear differential and integral operators, J. Res. Nat. Bur. Stand., 45, 255–282, 1950.
Li, M., Zhang, Q., Kurokawa, J.-I., Woo, J.-H., He, K., Lu, Z., Ohara, T., Song, Y., Streets, D. G., Carmichael, G. R., Cheng, Y., Hong, C., Huo, H., Jiang, X., Kang, S., Liu, F., Su, H., and Zheng, B.: MIX: a mosaic Asian anthropogenic emission inventory under the international collaboration framework of the MICS-Asia and HTAP, Atmos. Chem. Phys., 17, 935–963, https://doi.org/10.5194/acp-17-935-2017, 2017.
Liu, F., Zhang, Y., Su, H., and Hu, J.: Adjoint Model of Atmospheric Chemistry Transport Model CAMx: Construction and Application, Acta Scientiarum Natruralium Universitatis Pekinensis, 43, 764–770, 2007 (in Chinese).
Liu, K., Chen, Q., and Sun, J.: Modification of cumulus convection and planetary boundary layer schemes in the GRAPES global model, J. Meteorol. Res., 29, 806–822, https://doi.org/10.1007/s13351-015-5043-5, 2015.
Liu, Q., Hu, Z., and Zhou, X.: Explicit Cloud Schemes of HLAFS and Simulation of Heavy Rainfall and Clouds, Part I: Explicit Cloud Schemes, J. Appl. Meteorol. Sci., 14, 60–67, 2003a (in Chinese).
Liu, Q., Hu, Z., and Zhou, X.: Explicit Cloud Schemes of HLAFS And Simulation of Heavy Rainfall And Clouds, Part II: Simulation of Heavy Rainfall and Clouds, J. Appl. Meteorol. Sci., 14, 68–77, 2003b (in Chinese).
Liu, Y., Zhang, L., and Jin, Z.: The Optimization of GRAPES Global Tangent Linear Model and Adjoint Model, J. Appl. Meteorol. Sci., 28, 62–71, https://doi.org/10.11898/1001-7313.20170106, 2017 (in Chinese).
Liu, Y., Gong, J., Zhang, L., and Chen, Q.: Influence of linearized physical processes on the GRAPES 4DVAR, Acta Meteorol. Sin., 77, 196–209, 2019.
Liu, Y., Zhang, L., Chen, J., and Wang, C.: An improvement of the linearized planetary boundary layer parameterization scheme for CMA-GFS 4DVar, J. Appl. Meteorol. Sci., 34, 15–26, https://doi.org/10.11898/1001-7313.20230102, 2023.
Liu, Y., Zhang, L., and Lian, Z.: Conjugate Gradient Algorithm in the Four-Dimensional Variational Data Assimilation System in GRAPES, J. Meteorol. Res., 32, 974–984, https://doi.org/10.1007/s13351-018-8053-2, 2018.
Liu, Y. Zhang, X., Han, W., Wang, C., Jia, W., Wang, D., Zhuang, Z., and Shen, X.: Development of the CMA-GFS-AERO 4D-Var assimilation system v1.0- Part 1: System description and preliminary experimental results, Zenodo [data set], https://doi.org/10.5281/zenodo.14880420, 2025.
Li, Z., Liu, Q., and Ma, Z.: Influences of Graupel Microphysics on CMA-GFS Simulation of Summer Regional Precipitation, J. Meteorol. Res., 38, 27–38, https://doi.org/10.1007/s13351-024-3068-3, 2024.
Lorenc, A. C., Ballard, S. P., Bell, R. S., Ingleby, N. B., Andrews, P. L. F., Barker, D. M., Bray, J. R., Clayton, A. M., Dalby, T., and Li, D.: The Met. Office global three-dimensional variational data assimilation scheme, Q. J. Roy. Meteor. Soc., 126, 2991–3012, https://doi.org/10.1256/smsqj.57001, 2000.
Mahfouf, J.-F.: Influence of physical processes on the tangent-linear approximation, Tellus A, 51, 147–166, https://doi.org/10.3402/tellusa.v51i2.12312, 1999.
Mahfouf, J. F. and Rabier, F.: The ECMWF operational implementation of four-dimensional variational assimilation-Part I : Experimental results with simplified physics, Q. J. Roy. Meteor. Soc., 126, 1143–1170, https://doi.org/10.1002/qj.49712656415, 2000.
Meng, Z., Dabdub, D., and Seinfeld, J. H.: Size-resolved and chemically resolved model of atmospheric aerosol dynamics, J. Geophys. Res., 103, 3419–3435, https://doi.org/10.1029/97JD02796, 1998.
Menon, S., Hansen, J., Nazarenko, L., and Luo, Y.: Climate effects of black carbon aerosols in China and India, Science, 297, 2250–2253, https://doi.org/10.1126/science.1075159, 2002.
Menut, L., Vautard, R., Beekmann, M., and Honoré, C.: Sensitivity of photochemical pollution using the adjoint of a simplified chemistry-transport model, J. Geophys. Res.-Atmos., 105, 15379–15402, https://doi.org/10.1029/1999JD900953, 2000.
Mlawer, E. J., Taubman, S. J., Brown, P. D., Iacono, M. J., and Clough, S. A.: Radiative transfer for inhomogeneous atmospheres: RRTM, a validated correlated-k model for the long wave, J. Geophys. Res.-Atmos., 102, 16663–16682, https://doi.org/10.1029/97JD00237, 1997.
Morcrette, J. J., Barker, H. W., Cole, J. N. S., Iacono, M. J., and Pincus, R.: Impact of a new radiation package, McRad, in the ECMWF Integrated Forecast System, Mon. Weather Rev., 136, 4773–4798, https://doi.org/10.1175/2008MWR2363.1, 2008.
Müller, J.-F. and Stavrakou, T.: Inversion of CO and NOx emissions using the adjoint of the IMAGES model, Atmos. Chem. Phys., 5, 1157–1186, https://doi.org/10.5194/acp-5-1157-2005, 2005.
Pagowski, M. and Grell, G. A.: Experiments with the assimilation of fine aerosols using an Ensemble Kalman Filter, J. Geophys. Res., 117, D21302, https://doi.org/10.1029/2012JD018333, 2012.
Pagowski, M., Grell, G. A., McKeen, S. A., Peckham, S. E., and Devenyi, D.: Three-dimensional variational data assimilation of ozone and fine particulate matter observations: some results using the Weather Research and Forecasting–Chemistry model and Grid-point Statistical Interpolation, Q. J. Roy. Meteor. Soc., 136, 2013–2024, https://doi.org/10.1002/qj.700, 2010.
Sandu, A., Daescu, D. N., Carmichael, G. R., and Chai, T.: Adjoint sensitivity analysis of regional air quality models, J. Comput. Phys., 204, 222–252, https://doi.org/10.1016/j.jcp.2004.10.011, 2005.
Schmidt, H. and Martin, D.: Adjoint sensitivity of episodic ozone in the Paris area to emissions on the continental scale, J. Geophys. Res.-Atmos., 108, 8561, https://doi.org/10.1029/2001JD001583, 2003.
Schwartz, C. S., Liu, Z., Lin, H. C., and McKeen, S. A.: Simultaneous three-dimensional variational assimilation of surface fine particulate matter and MODIS aerosol optical depth, J. Geophys. Res.-Atmos., 117, D13202, https://doi.org/10.1029/2011JD017383, 2012.
Seinfeld, J. H. and Pandis, S. N.: Atmospheric Chemistry and Physics: From Air Pollution to Climate Change, Environment ence & Policy for Sustainable Development, 1st ed., Wiley-Interscience, New York, https://archive.org/details/atmosphericchemi0000sein (last access: 30 July 2025), 1998.
Semane, N., Peuch, V.-H., Pradier, S., Desroziers, G., El Amraoui, L., Brousseau, P., Massart, S., Chapnik, B., and Peuch, A.: On the extraction of wind information from the assimilation of ozone profiles in Météo–France 4-D-Var operational NWP suite, Atmos. Chem. Phys., 9, 4855–4867, https://doi.org/10.5194/acp-9-4855-2009, 2009.
Shen, X., Su, Y., Zhang, H., and Hu, J.: New Version of the CMA-GFS Dynamical Core Based on the Predictor–Corrector Time Integration Scheme, J. Meteorol. Res., 37, 273–285, https://doi.org/10.1007/s13351-023-3002-0, 2023.
Smith, P. J., Fowler, A. M., and Lawless, A. S.: Exploring strategies for coupled 4D-Var data assimilation using an idealised atmosphere–ocean model, Tellus A, 67, 27025, https://doi.org/10.3402/tellusa.v67.27025, 2015.
Su, Y., Shen, X., Peng, X., Li, X., Wu, X., Zhang, S., and Chen, X.: Application of PRM scalar advection scheme in GRAPES global forecast system, Chinese Journal of Atmospheric Sciences, 37, 1309–1325, https://doi.org/10.3878/j.issn.1006-9895.2013.12164, 2013 (in Chinese).
Tian, X. and Zou, X.: Development of the tangent linear and adjoint models of the MPAS-Atmosphere dynamic core and applications in adjoint relative sensitivity studies, Tellus A, 72, 16, https://doi.org/10.1080/16000870.2020.1814602, 2020.
Vautard, R., Beekmann, M., and Menut, L.: Applications of adjoint modelling in atmospheric chemistry: sensitivity and inverse modelling, Environ. Modell. Softw., 15, 703–709, https://doi.org/10.1016/S1364-8152(00)00058-X, 2000.
Wang, C., An, X., Zhao, D., Sun, Z., Jiang, L., Li, J., and Hou, Q.: Development of GRAPES-CUACE adjoint model version 2.0 and its application in sensitivity analysis of ozone pollution in north China, Sci. Total Environ., 826, 153879, https://doi.org/10.1016/j.scitotenv.2022.153879, 2022.
Wang, H., Gong, S., Zhang, H., Chen, Y., Shen, X., Chen, D., Xue, J., Shen, Y., Wu, X., and Jin, Z.: A new-generation sand and dust storm forecasting system GRAPES_CUACE/Dust: Model development, verification and numerical simulation, Chinese Sci. Bull., 55, 635–649, https://doi.org/10.1007/s11434-009-0481-z, 2010.
Wang, H., Peng, Y., Zhang, X., Liu, H., Zhang, M., Che, H., Cheng, Y., and Zheng, Y.: Contributions to the explosive growth of PM2.5 mass due to aerosol–radiation feedback and decrease in turbulent diffusion during a red alert heavy haze in Beijing–Tianjin–Hebei, China, Atmos. Chem. Phys., 18, 17717–17733, https://doi.org/10.5194/acp-18-17717-2018, 2018.
Xu, X., Yang, X., Zhu, B., Tang, Z., Wu, H., and Xie, L.: Characteristics of MERRA-2 black carbon variation in east China during 2000–2016, Atmos. Environ., 222, 117140, https://doi.org/10.1016/j.atmosenv.2019.117140, 2020.
Yang, X., Chen, J., Hu, J., Chen, D., Shen, X., and Zhang, H.: A semi-implicit semi-Lagrangian global nonhydrostatic model and the polar discretization scheme, Sci. China Ser. D, 50, 1885–1891, https://doi.org/10.1007/s11430-007-0124-7, 2007.
Zhang, L., Liu, Y., Liu, Y., Gong, J., Lu, H., Jin, Z., Tian, W., Liu, G., Zhou, B., and Zhao, B.: The operational global four-dimensional variational data assimilation system at the China Meteorological Administration, Q. J. Roy. Meteor. Soc., 145, 1882–1896, https://doi.org/10.1002/qj.3533, 2019.
Zhang, Y.: Online-coupled meteorology and chemistry models: history, current status, and outlook, Atmos. Chem. Phys., 8, 2895–2932, https://doi.org/10.5194/acp-8-2895-2008, 2008.
Zheng, B., Tong, D., Li, M., Liu, F., Hong, C., Geng, G., Li, H., Li, X., Peng, L., Qi, J., Yan, L., Zhang, Y., Zhao, H., Zheng, Y., He, K., and Zhang, Q.: Trends in China's anthropogenic emissions since 2010 as the consequence of clean air actions, Atmos. Chem. Phys., 18, 14095–14111, https://doi.org/10.5194/acp-18-14095-2018, 2018.
Zhou, C., Gong, S., Zhang, X., Liu, H., Xue, M., Cao, G., An, X., Che, H., Zhang, Y., and Niu, T.: Towards the improvements of simulating the chemical and optical properties of Chinese aerosols using an online coupled model–CUACE/Aero, Tellus B, 64, 18965, https://doi.org/10.3402/tellusb.v64i0.18965, 2012.
Zhu, S., Wang, B., Zhang, L., Liu, J., Liu, Y., Gong, J., Xu, S., Wang, Y., Huang, W., Liu, L., He, Y., Wu, X., Zhao, B., and Chen, F.: A 4DEnVar-based ensemble four-dimensional variational (En4DVar) hybrid data assimilation system for global NWP: system description and primary tests, J. Adv. Model. Earth Sy., 14, e2022MS003023, https://doi.org/10.1029/2022MS003023, 2022.
Zou, X., Vandenberghe, F., Pondeca, M., and Kuo, Y.-H.: Introduction to adjoint techniques and the MM5 adjoint modeling system (No. NCAR/TN-435 þ STR), University Corporation for Atmospheric Research, https://idv.sinica.edu.tw/cytseng/current term/TN435.pdf (last access: 30 July 2025), 1997.