Interactive comment on “ Implementation of aerosol assimilation in Gridpoint Statistical Interpolation v . 3 . 2 and WRF-Chem v . 4 . 3 . 1

Abstract. Gridpoint Statistical Interpolation (GSI) is an assimilation tool that is used at the National Centers for Environmental Prediction (NCEP) in operational weather forecasting in the USA. In this article, we describe implementation of an extension to the GSI for assimilating surface measurements of PM2.5, PM10, and MODIS aerosol optical depth at 550 nm with WRF-Chem (Weather Research and Forecasting model coupled with Chemistry). We also present illustrative results. In the past, the aerosol assimilation system has been employed to issue daily PM2.5 forecasts at NOAA/ESRL (Earth System Research Laboratory) and, we believe, it is well tested and mature enough to be made available for wider use. We provide a package that, in addition to augmented GSI, consists of software for calculating background error covariance statistics and for converting in situ and satellite data to BUFR (Binary Universal Form for the Representation of meteorological data) format, and sample input files for an assimilation exercise. Thanks to flexibility in the GSI and coupled meteorology–chemistry of WRF-Chem, assimilating aerosol observations can be carried out simultaneously with meteorological data assimilation. Both GSI and WRF-Chem are well documented with user guides available online. This article is primarily intended to be a technical note on the implementation of the aerosol assimilation. Its purpose is also to provide guidance for prospective users of the computer code. Scientific aspects of aerosol assimilation are also briefly discussed.

The above publications have shown that initial conditions play an important but not dominant role in chemical forecasting.Especially for predicting air quality, i.e., chemical composition in the boundary layer, inaccurate source emissions and deficient physical and chemical parameterizations result in deteriorating forecasts soon after assimilation.In this context, applying 3D-Var assimilation methods that aim to exclusively ameliorate initial conditions constitutes only a first step towards improving chemical forecasts.
Below, we describe aerosol observations that can be currently assimilated with our extension of the Gridpoint Statistical Interpolation (GSI, Wu et al., 2002;Purser et al., 2003a, b).Next, we provide a brief introduction to the 3D-Var formulation of the GSI and elaborate on forward operators for aerosol observations and specification of model (background) error.We conclude by presenting results of an application of the assimilation system.

Observations and measurement errors
In our implementation, assimilated observations include surface measurements of PM 2.5 and PM 10 , and aerosol optical depth (AOD, alternatively, aerosol optical thickness, AOT) retrievals at 550 nm from Moderate Resolution Imaging Spectroradiometer (MODIS) satellites Aqua and Terra.
In North America, continuous measurements of surface aerosol concentrations at hourly resolution are made available thanks to monitoring stations participating in the US EPA AirNow program.The observations are processed with minimal delay, making them suitable for real-time assimilation.A free subscription to the real-time data feed is possible through the AirNow gateway (http://airnowapi.org/).A computer code is made available to convert text-formatted files obtained form the gateway to BUFR (Binary Universal Form for the Representation of meteorological data, Dragosavac, 2007) as required by the GSI.
Aerosol optical depth data come from MODIS sensors on board the Terra and Aqua satellites.Retrievals over land and sea are derived from the dark-target product (Remer et al., 2005) and deep-blue product over bright land surface (Hsu et al., 2004(Hsu et al., , 2006)).Currently, the dark-target ocean and land AOD products are available from both Terra and Aqua, but deep blue retrievals are only available from Aqua.MODIS retrieved AOD is provided at seven wavelengths: 470, 550, 660, 870, 1240, 1630, and 2130 nm.In our implementation, only level 2 (L2) AOD retrievals at 550 nm are used.The AOD observation error is specified after Remer et al. (2005) as ε AOD = 0.03+0.05τover water and ε AOD = 0.05+0.15τover land, where τ is an AOD observation.Only AOD retrievals marked with the highest quality flag are retained for the assimilation.
L2 retrievals from Aqua are available at ftp://ladsweb.nascom.nasa.gov/allData/51/MYD04_L2and L2 retrievals from Terra are available at ftp://ladsweb.nascom.nasa.gov/allData/51/MOD04_L2.These data come in HDF-EOS format at 5 min segments of the satellite's orbit that correspond to 10 km × 10 km resolution at the surface.Computer code (W.Wolf, personal communication, 2013) is available in the package to convert HDF to BUFR for the GSI.

Aerosol assimilation within the Gridpoint Statistical Interpolation
GSI includes a 3D-VAR assimilation tool from which an analysis is obtained by minimization of a cost function given by In Eq. ( 1), x is a vector of analysis, x b is the forecast or background vector, y is an observation vector, B is the background error covariance matrix, H is an observation operator, and R is the observation error covariance matrix.The background error covariance matrix B is separated into vertical and horizontal components, and is represented as a product of error variances and spatial correlation matrices.The correlation matrices simulate Gaussian shapes in space and in the GSI are modeled with recursive filters (Purser et al., 2003a, b).The application of the filters requires specification of the background error correlation length scales.The observation covariance matrix R combines measurement and representativeness errors, and is usually assumed to be diagonal.The observation operator H , which can be non-linear, converts model variables to observation space.Solutions to the minimization problem are sought using the incremental approach (Courtier et al., 1994).With this approach, two minimization loops are employed: an outer loop where fully non-linear observation operator is applied and an inner loop where the observation operator is linearized.
Our extension to the GSI includes separate options for Goddard Chemistry Aerosol Radiation and Transport (GO-CART, Chin et al., 2000Chin et al., , 2002;;Ginoux et al., 2001) and all other aerosol modules in WRF-Chem (Weather Research and Forecasting model coupled with Chemistry) (Grell et al., 2005).Since the Community Radiative Transfer Model (CRTM, Han et al., 2006;Liu and Weng, 2006), which is coupled to the GSI, is currently only available for GOCART, AOD can only be assimilated with the GOCART model background.

Forward models and observation processing in GSI
The forward models for GOCART differ from other aerosol parameterizations and are described first.
PM 2.5 concentration is calculated as where ρ d , dry air density, is multiplied by mixing ratios of aerosol species.Factors for sulfate and organic carbon account for increasing the mass of the compounds due to the presence of ammonium ion and oxygen, respectively.Factors for dust and sea salt account for a size cut-off at the 2.5 µm diameter calculated assuming lognormal distribution of these species.An expression for PM 10 concentration is Only a brief description of the observation operator for AOD is given here and we refer the reader to Liu et al. (2011) and Schwartz et al. (2012) for full details.We assume that the size distribution of aerosol species within each size bin is logarithmic and that the particles are spherical and externally mixed.Parameters of the distributions are give in Liu et al. (2011).CRTM contains profiles of GOCART aerosol species that include their effective radii, standard deviations, and refractive indices.The extinction coefficient of each aerosol species is computed for a given wavelength based on Mie scattering theory and accounting for hygroscopic size growth of hydrophilic species.Finally, AOD is calculated from the equation where E ext is the extinction coefficient (a function of wavelength λ, refractive index n r , and effective radius r eff ), c is aerosol mixing ratio, ρ d is dry air density, and d is layer depth.Indices i and k denote aerosol species and model layers respectively; n = 15 denotes the number of GOCART aerosol species.
For each of the Eqs.( 2), (3), and (4), mixing ratios of aerosol species are horizontally linearly interpolated to the observation location.No extrapolations are performed in the vertical for surface observations, as their locations are assumed to coincide with the first model level.
A representativeness error for a surface observation is assigned based on the character of the site after Elbern et al. (2007), using a formula given by ε repr = αε m ( x/L repr ) 1/2 , where ε m is measurement error, α is a tunable parameter, x is model grid size, and L repr represents the observation's radius of influence.The parameter α determines magnitude of the observation error and can be specified in the "namelist".Its default value, which was obtained through experimentation, is set to 0.5.Radii of influence for observations are prescribed equal to 10, 4, and 2 km for rural, suburban, and urban sites, respectively.The total observation error is calculated as ε obs = (ε 2 m + ε 2 repr ) 1/2 .Only surface measurements that fall below specified thresholds are accepted (default values are set to 100 µg m −3 for PM 2.5 and to 150 µg m −3 for PM 10 ).Also, an observation is rejected if its deviation from the background is greater than these maximum allowable values.Depending on the user's preference, an observation can also be rejected if a difference between its actual elevation and model terrain height interpolated to its geographic location exceeds a threshold specified in the namelist.Characteristics of the error for different instruments and the default values can be easily modified (in the GSI distribution files convinfo, chemmod.f90,and read_anowbufr.f90).
To reduce the volume and diminish the correlation of satellite observation errors, thinning (subsampling) of AOD observations is recommended to a resolution that is comparable to the model grid size.Thinning options can be specified in the namelist.
For aerosol options other than GOCART, PM 2.5 or PM 10 are read as PM2_5_DRY or PM10 from WRF-Chem output so that Eqs. ( 2) and ( 3) are not required.The rationale for such an approach is discussed in the next section.Calculation of surface PM observation errors and data selection for the assimilation follows the implementation for GOCART.

Specification of background error
In GSI, error correlation length scales and variances can vary zonally and vertically.They can be calculated as forecast statistics using the NMC (National Meteorological Center) method (Parrish and Derber, 1992) or the ensemble method (Fisher, 2003).Computer code for producing a file containing these statistics for meteorological state variables and desired aerosols formatted for the GSI is available for download www.geosci-model-dev.net/7/1621/2014/ with the WRF data assimilation system at http://www.mmm.ucar.edu/wrf/users/wrfda/downloads.html.
For GOCART parameterization, state variables include 15 aerosol species.As an illustration, vertical profiles of standard deviations and horizontal correlation length scales for OC 1 , OC 2 , and sulfate are shown in Fig. 1.These statistics were derived for a month-long period in the 2012 summer, over a domain spanning eastern North America, with 24 km grid resolution, using the NMC method applied to 24 and 48 h forecasts.In the GOCART case, increments (or additions to the background state) to each aerosol species are obtained using background error statistics for individual aerosol species.We will not reflect on the realism of the statistics derived using the NMC method in this manuscript, but only point out that accounting for uncertainty in emission sources and aerosol parameterization deficiencies should be considered when estimating model errors.Pagowski and Grell (2012) discuss this topic in detail.
An alternative approach is also available where increments to individual species are calculated based on their a priori contribution to the total aerosol mass.This is expressed as the sum of 15 aerosols species accounting for multiplication factors of sulfate and organic carbon (hereafter, "ratio approach").With this approach, statistics for the total aerosol are used to minimize the 3D-Var cost function and need to be provided in the background error input file.The choice of any of the two approaches is determined in the namelist.Also, error correlation length scales and standard deviations can be tuned for optimal performance and modified by factors specified in the namelist.
For parameterizations other than GOCART, specifying background error statistics for a large number of aerosol species is, in our opinion, overly burdensome, especially because such statistics may not be reliable given the large uncertainties in emissions and in the state of science in aerosol modeling.Therefore, for these parameterizations, we require that background error statistics are provided for a WRF-Chem output variable PM2_5_DRY/PM10 when PM 2.5 /PM 10 observations are assimilated.This variable is also a state variable for which an increment will be calculated.

Running GSI and aerosol assimilation cycle
A comprehensive user guide for GSI is available at http://www.dtcenter.org/com-GSI/users/docs/users_guide/GSIUserGuide_v3.2.pdf.Also, an online tutorial is available and group tutorials are given at least once a year (http://www.dtcenter.org/com-GSI/users/tutorial/index.php).Thus, only a cursory description of the assimilation is given here.Our package provides a default configuration and shell scripts for assimilating PM 2.5 , PM 10 , and MODIS AOD with WRF-Chem GOCART parameterization.
Specifically, for aerosol assimilation, in addition to an input file with aerosol background statistics, a user needs to provide WRF-Chem output in netcdf format, observations files in BUFR format (normally a single file for PM 2.5 and PM 10 , and/or file with MODIS AOD), a namelist specifying options for the assimilation, and a configuration file anavinfo.The latter file contains the names of aerosol species as state variables for which minimization of the 3D-Var cost function is performed.Normally, entries in anavinfo would include either GOCART species or PM25/PM10.We note that a simultaneous assimilation of meteorological variables is also possible.
On the output, GSI overwrites the input WRF-Chem file.For quality control and to visualize increments, we suggest using ncdiff, a component of netcdf manipulation software NCO available at http://nco.sourceforge.net(alternatively diffv operator from the CDO package, https://code.zmaw.de/projects/cdo).For GOCART, the output WRF-Chem file contains an analysis of aerosol species.No further processing is required to issue the next forecast.For other aerosol options, increments to individual aerosol species need to be calculated using the ratio approach and added to the background.They will constitute initial conditions for the following forecast.We again recommend using NCO software for this procedure.Sample increments to OC 1 , OC 2 , and sulfate on the first model level (i.e., assumed to be at the surface) are shown in Fig. 2. Their magnitudes and spatial patterns are related to the specification of background error statistics for individual aerosol species.Surface and satellite observations were assimilated to produce this figure.
We routinely employ a 6-hour assimilation cycle that includes both assimilation of standard meteorological observations and aerosol observations.
The impact of aerosol assimilation has been well documented in the publications cited in Sect. 1.For illustration, Fig. 3 shows bias and spatial correlation with respect to AirNow measurements calculated for forecasts issued over a month-long period during summer 2012 with and without assimilation of surface observations of PM 2.5 .The GOCART parameterization was used with the ratio approach.The improvement in the early forecast hours is noteworthy.Reasons for a relatively quick deterioration of the aerosol forecasts at later hours were briefly noted in Sect. 1 and are elaborated in detail in Pagowski and Grell (2012) and Jiang et al. (2013).

Conclusions
We described our implementation of the assimilation of PM 2.5 and PM 10 surface observations and satellite MODIS AOD level 2 retrieval using the GSI and WRF-Chem.Along with aerosol assimilation, computer codes for formatting the observations are included in the package.Also, an example configuration and sample input files for an assimilation exercise are supplied.
We recommend that prospective users become familiar with a general application of the GSI as described in the user guide and in the online tutorial.
We hope that the availability of this implementation will lead to further development of the aerosol and chemical data assimilation system that may include a wider range of observations.GSI is a community-based system and user contributions are encouraged.

Figure 1 .
Figure 1.Vertical profiles of standard deviations (top) and horizontal correlation length scales (bottom) for OC 1 , OC 2 , and sulphate derived for a North American domain (see text for details).Tick mark values of − log(p/p s ) on the ordinate approximately correspond to values of atmospheric pressure equal to 1000, 600, 370, 220, 135, and 80 hPa.

Figure 2 .
Figure 2. Sample analysis increments of OC 1 , OC 2 , and sulfate (from the top) on the first model level.

Figure 3 .
Figure 3. Bias (left) and spatial correlation (right) calculated for forecasts issued over a month-long period in summer 2012 for the North America domain with and without assimilation of surface observations of PM 2.5 .