Implementation of aerosol data assimilation in WRFDA (V4.0.3) for WRF-Chem (V3.9.1) using the MADE/VBS scheme

The Weather Research and Forecasting model data assimilation (WRFDA) system, initially designed for meteorological data assimilation, is extended for aerosol data assimilation for the WRF model coupled with Chemistry (WRF-Chem). An interface between WRF-Chem and WRFDA is built for the Regional Atmospheric Chemistry Mechanism (RACM) chemistry and the Modal Aerosol Dynamics Model for Europe (MADE) coupled with the Volatility Basis Set (VBS) aerosol schemes. This article describes the implementation of the new interface for assimilating PM2.5, PM10, and four gas species (SO2, NO2, 5 O3, and co) on the ground. And the effects of aerosol data assimilation are briefly examined through a month-long case study during the Korea-United States Air Quality (KORUS-AQ) period. It is demonstrated that the 3DVAR analysis can lead to consistent forecast improvements up to 26%, diminishing most systematic bias errors for 24 h.


WRFDA for WRF-Chem
A new interface for the RACM-MADE-VBS scheme is developed based on version 4.0.3 of the WRFDA system to assimilate surface PM 2.5 , PM 10 , so 2 , no 2 , o 3 , and co measurements. The variational data assimilation system seeks an analysis solution as the best estimate of the true state by minimizing deviations of model variables (x) from the corresponding observations (y) 140 based on the error statistics of background forecasts and observations. The variational scheme assumes Gaussian and unbiased error distributions, which can be characterized by covariances alone, its solution is thus found a least-squares best fit using the covariances. In practice, when the cost function J(x) is reached to a minimum through an iterative minimization process, the resulting state vector x becomes the analysis solution (Lorenc, 1986).
145 where x b is a deterministic forecast from the previous assimilation cycle with subscript b denoting background forecasts, and B and R represent background and observation error covariance matrices, respectively. An observation operator H(x) transforms model states (x) to observed quantities (y) at observation locations and can be nonlinear.
The WRFDA employs an incremental formulation (Courtier et al., 1994) where observations are assimilated through the observation operator to provide analysis increments δx(= x − x b ) that minimizes J. This approach can keep analysis imbal-150 ance to a minimum, making the minimization procedure more efficient. The resulting analysis x a (= x b + δx) is then used as the initial condition for the following forecast. For a typical numerical weather prediction (NWP) model, the state vector (x) that contains all the prognostic variables lies in the huge dimensional state space (with typical degrees of freedom greater than O(10 7 )), which makes the computation of J b prohibitive. As a practical way of solving J b , a control vector (v) that consists of analysis variables is defined as δx = B 1/2 v. While forecast errors of model variables are typically correlated through govern-155 ing equations, control variables are designed to have no cross-correlations such that the error matrix is diagonalized. With the control variable transformation, the cost function is rewritten as below.
where the innovation vector is defined as d = y − H(x b ) and H is a linearized version of H. In weather data assimilation, the control variable transformation has been broadly practiced because meteorological variables follow physical balance equations 160 (such as geostrophic or hydrostatic equations) at large scales (Bannister, 2008). But it is not straightforward to define multivariate correlations between chemical species or between chemical and meteorological variables due to their complex interactions and chemical reactions that are highly nonlinear and often transient. Therefore, chemical or aerosol species in the model states (x) are directly used as control variables (v) in chemical data assimilation. For the MADE-VBS aerosol scheme, the analysis (or control) variables consist of 35 aerosol species in the three-dimensional model space.
In the variational algorithms, the square root of the B matrix (B = B 1/2 (B 1/2 ) T ) is decomposed into a series of submatrices for the control variable transform.
where the U p matrix is called physical or balance transformation (via linear regression), S a diagonal matrix of forecast error standard deviation, U v the vertical transform, and U h the horizontal transform matrix.
The WRFDA provides various options for estimating the background error covariance through "cv_option" in namelist.
Here, cv_option = 7 is chosen for no balance transformation in the regional simulations, meaning that the chemical species are control variables as full fields such that U p becomes an identity operator for chemical data assimilation. The horizontal 180 transform matrix U h is performed using recursive filters (Purser et al., 2003), while the vertical transform U v is carried out via an empirical orthogonal function (EOF) decomposition of the vertical component of the background error covariance.
In the 3DVAR algorithm, background error covariance estimates are important, particularly in data-sparse regions. As most surface stations are concentrated in urban areas, the background error covariance can play a major role on spreading out the observed information horizontally and vertically.

185
In this study, chemical simulations are carried out in the WRF-Chem model, starting at 00 UTC every day for one month of May 2016, to compute background error covariance statistics for chemical and aerosol species defined in the RACM-MADE-VBS parameterization. Differences between 24-and 48-h forecasts at the same validation time are then used as a proxy for forecast errors in each domain, and a total of 29 sample forecasts for 3 -31 May 2016 were used to construct the B matrix using the National Meteorological Center (NMC) method (Parrish and Derber, 1992), assuming the same model bias and uncorrelated model errors. There are 5 sequential stages (e.g., stage0 -stage4) implemented with different options in the GENBE software.
In this study, all the grid points are binned together for each model level, with no latitudinal or longitudinal dependencies in the background error covariance. linearly reduces with height, indicating large uncertainties at the sea level, but soila is characterized by the high peak in the mid-troposphere, which might be associated with the large uncertainty in the long-range transport of dust aerosols. In the gas species, ozone represents most uncertainties near the top (e.g., low stratosphere), while carbon monoxide shows large error values in the low troposphere. The vertical error structure is hard to see in so 2 and no 2 due to the trivial values, but their standard deviations are also relatively large in the boundary layer.

205
The vertical spread of the observed information at the surface is determined by vertical error correlations, closely associated with the simulated boundary layer height. As the static background error covariance cannot simulate the diurnal variability of the boundary layer, this becomes one of the main limitations of the 3DVAR analysis for air quality applications. Figure   3 depicts the normalized vertical auto-correlations derived from the time-lagged forecasts for four major aerosol species in accumulation and Aitken modes, three coarse mode aerosols, and four trace gases (from top to bottom panels). Generally, 210 correlation contours tend to spread more in the lower levels, implying that the updates in the lowest level can affect the entire boundary layer. The accumulation and coarse mode particles have a wider vertical spread than the Aitken mode particles that would have more localized effects. The circular pattern around level 22 in most species could be related to the advection with strong upper-level jets. While all the trace gases have relatively large correlations near the surface, ozone and nitrogen dioxide show the largest correlations near the tropopause and stratosphere, respectively.

215
To examine the horizontal propagation of the increments from point observations, the horizontal correlation length scales of the same species are illustrated in Fig. 4. In accumulation and coarse modes (in the top and the third rows, respectively), the overall vertical structure is similar, with the linear increase down to the surface. The length scale at the surface is specified around 36 km for so4aj, for example, corresponding to four grids in the 9-km domain, meaning that an observation at a point location can affect four surrounding grid points radially. On the other hand, Aitken mode variables have short length scales near 220 the surface, which tend to increase in the upper levels, but their maximum values are smaller than those of their counterparts in the accumulation mode, representing more localized effects horizontally. Trace gases show different vertical distributions with the maximum near the top, except for ozone.
When the RACM-MADE-VBS option (e.g., chem_opt = 108) is chosen, the model equivalent of the observed PM 2.5 is computed as a total sum of three-dimensional mass mixing ratios of 32 aerosol species in accumulation (j) and Aitken (i) 225 modes predicted in the WRF-Chem model, as below.
where ρ d is dry density ([kg/m 3 ]) for the unit conversion from aerosol mixing ratios (µg/kg) to mass concentrations ( When PM 10 is assimilated alone, the model correspondent is computed by adding three coarse-mode variables -anthro-235 pogenic primary aerosol (antha), marine aerosol concentration (seas), soil-derived aerosol particles such as dust (soila) -into the simulated PM 2.5 . But in the concurrent assimilation of PM 10 and PM 2.5 , the residuals from (PM 10 -PM 2.5 ) are assimilated as a sum of three coarse-mode aerosols, following Pang et al. (2018) and Sun et al. (2020).

Observation processing and measurement errors
In this study, hourly surface observations of six major pollutants (PM 2.5 , PM 10 , so 2 , no 2 , o 3 , and co) are used from 379 the assimilation procedure. Surface PM 2.5 and PM 10 observations are rejected when they are greater than 300, 500 µg/m 3 , respectively, or are different from their model equivalent (e.g., H(x b )) by more than 100 µg/m 3 . Gas species are also checked with the maximum threshold of 2, 2, 2, and 20 ppmv for the observed so 2 , no 2 , o 3 , co, respectively. They are also rejected based on the threshold of 0.2 ppmv for the innovations.
Gas-phase pollutants on the ground are assimilated together, as opposed to individual species, using the corresponding model variables as their analysis (or control) variables. As observations for all the gas species are processed to have the same ppmv unit as the model variables before assimilation, the observation operator becomes a simple horizontal interpolation (e.g., bilinear interpolation) of the corresponding variable at the lowest model level. For the new assimilation capability, several new 260 parameters are added to namelist.input in WRF-Chem, as summarized in Table 2. To demonstrate the capability of all the new observation operators (that are independent of each other), this study only presents the simultaneous assimilation of all six pollutants using chemicda_opt = 4, as listed in Table 2.
In this 3D-Var analysis, observation errors are assumed uncorrelated such that the observation error covariance matrix R in Eq. (1) Fig. 6) varies from cycle to cycle, but the time series of omb and oma indicates that the analysis system gets spun up quickly (with the steady trend of oma) and runs reliably throughout the month-long period with the analyses closer to observations than background forecasts for all six pollutants. The number of observations in trace gases is omitted in Fig. 7 because it is very similar to that of PM observations and the omb in gas species is greatly fluctuating with cycles. Such large oscillations of omb and large differences between analysis and background are often attributable to 295 the considerable errors in the forecast model and/or forcing parameters, which prompt the model state to return to its own equilibrium quickly (e.g., within 6 h). For the rest of the figures, the evaluation is made only for 7-31 May 2016, discarding the first week of cycling as a spin-up period.
The MADE-VBS aerosol parameterization has been reported to simulate the chemical composition over the East Asian region reasonably well (Lee et al., 2020; Saide et al., 2020). As seen in Fig. 8, the surface PM 2.5 analysis is dominated by 300 nitrates (NO3), sulfate (SO4), ammonium (NH4), unspeciated PM 2.5 , and anthropogenic secondary organic aerosols (ASOA) in Seoul, South Korea (in that order). Compared to the mean analysis for the evaluation period, the analysis in the heavy pollution event on 26 May 2016 shows that major constituents' contributions further increase, particularly by nitrate. Due to the limited information content of atmospheric composition measurements as well as the scarcity of such observations, it is hard to evaluate the fractional aerosol contribution by all the species defined in the MADE-VBS scheme. Hence, the pie charts 305 are presented only to understand how the aerosol composition is represented in the parameterization and how it changes with high PM 2.5 concentrations. But the overall composition with major constituents seems consistent with those from previous studies (Jo et al., 2020;Tian et al., 2019). Because the forward operator assigns an equal weight for each aerosol species, the fractional contribution is not substantially changed by assimilation in this particular case study. Figure 9 presents the horizontal distribution of analysis increments in the assimilated variables at the lowest model level, 310 averaged for the evaluation period. Surface PM 2.5 concentrations are reduced by assimilation, especially over the middle east China (along 30 • N), indicating that they were mostly overpredicted in background forecasts, likely due to the systematic overestimation of anthropogenic emission data. Given that air pollutants in the emission data constitute the majority of the precursors of PM 2.5 pollution, surface PM 2.5 concentrations could strongly depend on emissions, which might have led to the overestimation in the background forecasts. Therefore, the assimilation of surface PM 2.5 tends to counteract the overestimation 315 driven by the emission data over China. On the contrary, PM 10 concentrations are predominantly enhanced by assimilation over most areas, presumably because coarse mode aerosols might not be sufficiently described in both the emission data (through "E_PM_10") and the model estimate. Among the coarse-mode species, dust aerosols (soila) show the most significant analysis increments over the Jing-Jin-Ji (an abbreviation of the Chinese names of Beijing, Tianjin, and Hebei) Metropolitan region (not shown). On the other hand, the analysis does not make meaningful changes in so 2 and no 2 that have short lifetime, but tend to 320 decrease ozone and increase carbon monoxide over South Korea. The reduction of PM 2.5 and the large increase of PM 10 in the boundary layer, as shown in Fig. 11, are consistent with previous 330 results. PM 2.5 shows that the analysis (orange) and the following 6 h forecast (red) are not much different in the climatological sense (e.g., mean over time and space). But PM 10 displays relatively large discrepancies between the two and even bigger differences between NODA and DA, mainly due to the large changes in soila, as shown in the previous figure. Systematic disparities between observations and the model estimates typically imply the deficiencies in the model simulation and/or the forcing parameters. As the focus of this study is the prediction of surface PM 2.5 concentrations, no further investigation is 335 made on the systematic errors in PM 10 simulations in this study. But generally, the larger the model error gets, the harder it is to make an optimal solution in the analysis. On the other hand, sulfur dioxide and nitrogen dioxide near the surface are slightly increased by assimilation over domain 2, which does not last for 6 h because of their short lifetime. While the changes in the vertical structure of those two fields are confined to the boundary layer, ozone and carbon monoxide experience the adjustments for the entire profile through the cycling. 340 Figure 12 illustrates the time series of rms and bias errors of 0-48 h forecasts with respect to independent PM 2.5 observations at the surface. The large initial errors in NODA imply that aerosol species are not properly initialized without assimilation, even if they are recycled every 6 h for the whole month. With data assimilation, initial conditions in the DA experiment are substantially improved over both domains, leading to smaller forecast errors throughout 48 h forecasts. In domain 1, a large overestimation in NODA is significantly reduced by assimilation, but the positive bias remains for 48 h. In domain 2, the 345 systematic bias is mostly corrected in DA up to 36h forecasts, and the rms errors are consistently small compared to NODA.
The forecast errors are mostly distinguishable for the first 24 h, and the analysis impact typically lasts no longer than 6 h in trace gases like no 2 and so 2 , 24 h forecast mean errors are thus summarized in Table 3 for all six pollutants. Comparing to the baseline run (NODA), the DA experiment systematically improves surface PM 2.5 forecasts in both domains, with the rms errors decreased by 26% and 20% over domains 1 and 2, respectively. The rms errors of PM 10 are reduced only by ∼14% in 350 DA, but the systematic underestimation gets mostly diminished over both domains. The assimilation is not very effective in the prediction of gas species except for carbon monoxide, partly due to the model errors and partly due to the observation errors that might need to be further adjusted for better results.
The forecast errors depicted in Fig. 12 are dominated by moderate (or clear-sky) cases in Korea, but air quality forecasting becomes more crucial for heavy pollution events, making the categorical forecast verification important in a practical sense. Table 4 categorizes four different events based on hourly surface concentrations in six pollutants, and Table 5  forecasts for different air pollution events. Figure 13 highlights the forecast accuracy for categorized events, verified against independent observations, based on the formulae described below.
The air quality forecasting operated by the NIER in Korea is currently evaluated in the same way daily, except for daily mean 365 values (rather than hourly averages). The forecast accuracy rates defined here can be considered as skill scores for categorized events so that the higher, the better. First of all, NODA shows very stable accuracy rates between 40 and 50% for all events.
As forecast errors usually grow with time due to the model uncertainty, this means that the forcing parameters consistently constrain the chemical forecasting. With data assimilation (red), the initial accuracy gets doubled up in both domains (up to 80%), even for high pollution cases. But the benefit of the analysis quickly disappears with time, implying the challenges 370 with chemical data assimilation using the 3DVAR technique and large uncertainties in aerosol simulations. Note that the DA algorithm used here cannot produce an optimal solution when there are significant errors in the model and/or the forcing parameters as the strong-constraint variational system assumes a perfect model in the optimization. As Bocquet et al. (2015) pointed out, even with the improved analysis, it is hard to compete with forcing parameters such as emissions by which the chemical transport model is strongly driven, making the chemical analysis impact typically limited to the first-day forecasts.

375
The results shown here are consistent with previous studies, illustrating that the most benefit of data assimilation is limited to the first 24h forecasts, although the overall forecast accuracy in DA still remains higher than NODA up to 48 h. In domain 2, due to the small sample size, the forecast accuracy for high concentrations is not as consistent (and smooth) as in domain 1, and the results may not be statistically significant. But the false alarm rates for all events are also clearly reduced for 0-24 h forecasts, indicating that the assimilation systematically improves the 9-km simulations.