Development of an Ozone Monitoring Instrument (OMI) aerosol index (AI) data assimilation scheme for aerosol modeling over bright surfaces – a step toward direct radiance assimilation in the UV spectrum

Using the Vector LInearized Discrete Ordinate Radiative Transfer (VLIDORT) code as the main driver for forward model simulations, a first-of-its-kind data assimilation scheme has been developed for assimilating Ozone Monitoring Instrument (OMI) aerosol index (AI) measurements into the Naval Aerosol Analysis and Predictive System (NAAPS). This study suggests that both root mean square error (RMSE) and absolute errors can be significantly reduced in NAAPS analyses with the use of OMI AI data assimilation when compared to values from NAAPS natural runs. Improvements in model simulations demonstrate the utility of OMI AI data assimilation for aerosol model analysis over cloudy regions and bright surfaces. However, the OMI AI data assimilation alone does not outperform aerosol data assimilation that uses passive-based aerosol optical depth (AOD) products over cloud-free skies and dark surfaces. Further, as AI assimilation requires the deployment of a fully multiple-scatter-aware radiative transfer model in the forward simulations, computational burden is an issue. Nevertheless, the newly developed modeling system contains the necessary ingredients for assimilation of radiances in the ultraviolet (UV) spectrum, and our study shows the potential of direct radiance assimilation at both UV and visible spectrums, possibly coupled with AOD assimilation, for aerosol applications in the future. Additional data streams can be added, including data from the TROPOspheric Monitoring Instrument (TROPOMI), the Ozone Mapping and Profiler Suite (OMPS), and eventually the Plankton, Aerosol, Cloud and ocean Ecosystem (PACE) mission.


Introduction
Operational chemical transport modeling (CTM) of atmospheric aerosol particles, including simulation of sources and sinks and long-range transport of aerosol events such as biomass burning aerosols from fires and dust outbreaks, is now commonplace at global meteorology centers for air quality and visibility forecasts (e.g., Sessions et al., 2015; Lynch et al., 2016). Variational and ensemble-based assimilation of satellite-derived aerosol products such as aerosol optical depth (AOD), lidar backscatter measurements, and surface aerosol properties can substantially improve accuracies in CTM analyses and forecasts (Zhang et al., 2008; Yumimoto et al., 2008; Uno et al., 2008; Benedetti et al., 2009; Schutgens et al., 2010; Sekiyama et al., 2010; Saide et al., 2013; Schwartz et al., 2012; Li et al., 2013; Rubin et al., 2017; Lynch et al., 2016).
Currently, the main satellite inputs for operational aerosol modeling are AOD products derived from passive-based polar-orbiting imagers, such as the Moderate Resolution Imaging Spectroradiometer (MODIS), the Visible Infrared Imaging Radiometer Suite (VIIRS), and the Advanced Very High Resolution Radiometer (AVHRR). Experimentation is proceeding with the use of products from the Multi-angle Imaging SpectroRadiometer (MISR) (e.g., Lynch et al., 2016; Randles et al., 2017; Buchard et al., 2017) and from geostationary instruments such as Himawari and the Geostationary Operational Environmental Satellite (GOES). A major advantage of such passive-based satellite sensors is that the AOD is retrieved with high spatial and temporal resolution over relatively broad fields of view (e.g., Zhang et al., 2014). For example, MODIS and VIIRS provide near-global daily daytime coverage (e.g., Levy et al., 2013; Hsu et al., 2019), and GOES and Himawari are capable of retrieving AOD over North American and East Asian regions at sub-hourly temporal resolution (e.g., Bessho et al., 2016).
To date, these traditional passive-based satellite AOD retrievals have been limited to darker surfaces and relatively cloud-free conditions. The widely used MODIS Dark Target aerosol data, for instance, are globally available only over oceans and dark land surfaces (e.g., Levy et al., 2013). The MISR and MODIS Deep Blue aerosol products are also available over some arid environments but are not applicable to snow- and ice-covered regions (e.g., Kahn et al., 2010; Hsu et al., 2013). Also, none of the abovementioned aerosol products are valid over cloudy regions.
In comparison to AOD, the semi-quantitative UV-based aerosol index (AI) has long been used to monitor major aerosol events such as smoke plumes and dust storms, starting with the Total Ozone Mapping Spectrometer (TOMS) from the late 1970s (Herman et al., 1997). AI is derived using the ratio of observed UV radiances to simulated ones assuming only a clear Rayleigh sky (e.g., Torres et al., 2007). AI retrievals are currently computed using observations from sensors with ozone-sensitive channels. For example, the Ozone Monitoring Instrument (OMI), Ozone Mapping and Profiler Suite (OMPS), TROPOspheric Monitoring Instrument (TROPOMI), and the future Plankton, Aerosol, Cloud and ocean Ecosystem (PACE) mission include ozone-sensitive channels that can detect UV-absorbing aerosol particles, such as black-carbon-laden smoke or iron-bearing dust, over bright surfaces, such as desert and snow- and ice-covered regions, and aerosol plumes above clouds (e.g., Torres et al., 2012; Yu et al., 2012; Alfaro-Contreras et al., 2014). To complement existing AOD assimilation systems, we have developed an AI data assimilation (AI-DA) system that is capable of assimilating OMI AI over bright surfaces and cloudy regions for aerosol analyses and forecasts. This study can be considered one of the first attempts at direct radiance assimilation in the UV spectrum for aerosol applications, as AI can be directly computed from UV radiances and the developed OMI AI-DA system has all the necessary components for a typical radiance assimilation package. In time we expect our assimilation model to merge with AOD or solar radiance assimilation to influence aerosol loading, height, and absorption (e.g., the VIIRS + OMPS product; Lee et al., 2015). Details of the developed OMI AI assimilation system are presented in the paper, which is organized as follows. Datasets used in the study are summarized in Sect. 2. Section 3 discusses the components of the AI-DA system. Section 4 provides an evaluation of the developed system, and Sect. 5 contains a summary discussion.

Datasets and models
Three datasets are used in this study. These are (i) the OMI level 2 UV aerosol product (OMAERUV; Torres et al., 2007), (ii) the Aerosol Robotic Network (AERONET; Holben et al., 1998) AOD product, and (iii) reanalysis data from the Naval Aerosol Analysis and Prediction System (NAAPS; Lynch et al., 2016), which was the first operational global aerosol mass transport model available to the community. The assimilation system is based on spatial and temporal variations of aerosol particles from NAAPS (Zhang and Reid, 2006; Zhang et al., 2008), and the Vector LInearized Discrete Ordinate Radiative Transfer (VLIDORT; Spurr, 2006) code is used to construct a forward model for the AI-DA system.

OMI aerosol product
UV aerosol index data from the OMI level 2 version 3 UV aerosol products (OMAERUV) are used in this study. OMI is onboard the Aura satellite (launched in 2004) and observes the Earth's atmosphere over the UV-visible spectrum with a pixel size of 13 × 24 km at nadir for the global scan mode and a swath of ∼ 2600 km (Levelt et al., 2018). The daytime equatorial crossing for the Aura platform is ∼ 13:30. The dataset comprises the UV AI, viewing and solar geometries, spectrally dependent surface albedos at the 354 and 388 nm spectral channels, terrain pressure, geolocations, XTrack and algorithm quality flags, and other aerosol and ancillary parameters. The UV AI is designed to detect UV-absorbing aerosol particles and is based on the observed radiance at 354 nm ($I_{\mathrm{obs}354}$) and the calculated radiance at 354 nm ($I_{\mathrm{cal}354}$) for a Rayleigh (no aerosol) atmosphere (e.g., Torres et al., 2007). It is defined as

$$\mathrm{AI} = -100\,\log_{10}\left(\frac{I_{\mathrm{obs}354}}{I_{\mathrm{cal}354}}\right). \tag{1}$$

Unbiased, noise-reduced, quality-assured AI data are necessary for AI data assimilation. This is especially important for OMI observations because this particular sensor suffers from well-referenced "row anomaly" issues. To remove pixels with row anomalies, only retrievals with XTrack flag values of 0 are retained. Also, abnormal AI values were identified over mountain regions. Thus, retrievals with terrain (surface) pressure less than 850 hPa are excluded in the study. Finally, only retrievals with OMI AI values larger than −2 are used. Therefore, OMI observations over cloudy skies, which could have negative OMI AI values, are also included. Both cloud-free and above-cloud AI data satisfying these quality checks are aggregated and averaged in 1° × 1° (latitude-longitude) bins. As a radiative transfer model run is applied for each observation, the gridded data are used in the assimilation process in order to reduce the computational burden.
Averaged parameters for the gridded data include the solar and sensor zenith angles, the relative azimuth angles, the spectrally dependent surface albedos at 354 and 388 nm, the cloud fraction, and the AI values themselves. Additional quality assurance steps are also applied during the spatial averaging process. Isolated high AI values are removed as follows. First, for a 4 × 4 pixel box, if the mean AI is less than 0.7 but an individual AI value is larger than 0.7, then that one value is removed. Second, if the standard deviation of AI values for a 3 × 3 pixel box surrounding a pixel is larger than 0.5, that individual AI value is likewise removed. Note that both approaches are essentially homogeneity tests that are used for identifying outliers. The thresholds are empirically estimated through visual inspection.
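The screening and gridding steps described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: the function name `grid_omi_ai` and its argument names are hypothetical, only the three pixel-level checks (XTrack flag, terrain pressure, AI lower bound) are applied, and the two homogeneity tests are deliberately omitted.

```python
import numpy as np


def grid_omi_ai(lat, lon, ai, xtrack_flag, terrain_pressure_hpa):
    """Screen OMI AI pixels and average the survivors in 1 x 1 degree bins.

    Hypothetical helper: argument names are illustrative and do not
    correspond to OMAERUV field names.
    """
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    ai = np.asarray(ai, dtype=float)
    xtrack_flag = np.asarray(xtrack_flag)
    terrain_pressure_hpa = np.asarray(terrain_pressure_hpa, dtype=float)

    # Pixel-level checks taken from the text: XTrack flag of 0,
    # terrain pressure of at least 850 hPa, and AI larger than -2.
    keep = (xtrack_flag == 0) & (terrain_pressure_hpa >= 850.0) & (ai > -2.0)

    # Integer bin indices on a global 1-degree grid (180 x 360 cells).
    ilat = np.clip(np.floor(lat[keep]).astype(int) + 90, 0, 179)
    ilon = np.clip(np.floor(lon[keep]).astype(int) + 180, 0, 359)

    total = np.zeros((180, 360))
    count = np.zeros((180, 360), dtype=int)
    np.add.at(total, (ilat, ilon), ai[keep])  # unbuffered accumulation
    np.add.at(count, (ilat, ilon), 1)

    mean = np.full((180, 360), np.nan)  # NaN marks empty bins
    filled = count > 0
    mean[filled] = total[filled] / count[filled]
    return mean, count
```

The same accumulation pattern would apply to the other averaged parameters (angles, surface albedos, cloud fraction); the bin edges at whole degrees are an assumption of this sketch.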

AERONET data
Version 3 level 2 daytime, cloud-cleared, and quality-assured AERONET data are used to evaluate the performance of the OMI AI data assimilation in our study (Holben et al., 1998; Giles et al., 2019). During daytime, AOD from AERONET instruments is derived by measuring the attenuated solar radiance, typically at eight wavelengths ranging from 340 to 1640 nm. In this study, AERONET data are collocated with NAAPS analyses with and without OMI AI assimilation. In order to collocate AERONET and NAAPS AOD data, AERONET AOD values within ±30 min of a given NAAPS analysis time are averaged and used as ground-based AOD values for the collocated NAAPS 1° × 1° (latitude-longitude) bins. As AERONET data require a cloud-free line of sight to the solar disk, the performance of OMI AI data assimilation over overcast regions is not evaluated.
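The temporal collocation rule can be illustrated with a short sketch. The function name `collocate_aeronet` and the representation of times as minutes are assumptions made for the example; the only element taken from the text is the ±30 min averaging window around the model analysis time.

```python
import numpy as np


def collocate_aeronet(obs_times_min, obs_aod, analysis_time_min,
                      half_window_min=30.0):
    """Average AERONET AOD within +/- half_window_min of an analysis time.

    Hypothetical helper: times are plain numbers in minutes. Returns None
    when no observation falls inside the window.
    """
    times = np.asarray(obs_times_min, dtype=float)
    aod = np.asarray(obs_aod, dtype=float)
    in_window = np.abs(times - analysis_time_min) <= half_window_min
    if not in_window.any():
        return None
    return float(aod[in_window].mean())
```

For a site with observations at minutes 700, 715, 740, and 800 and an analysis at minute 720, only the first three observations would contribute to the average.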

NAAPS and NAAPS reanalysis data
The NAAPS (http://www.nrlmry.navy.mil/aerosol/, last access: 18 December 2020) model is a multispecies, three-dimensional, Eulerian global transport model using the operational Navy Global Environmental Model (NAVGEM) as the meteorological driver (Hogan et al., 2014). NAAPS provides 6 d forecasts at a 3 h interval with a spatial resolution of 1/3° (latitude-longitude) and 42 vertical levels on a global scale. NAAPS predicts four aerosol particle classes: anthropogenic and biogenic fine particles (ABF, such as primary and secondary organic aerosols and sulfate aerosols), dust, biomass burning smoke, and sea salt (Lynch et al., 2016).
The 2003-2018 NAAPS reanalysis version 1 (v1) (Lynch et al., 2016) is a modified version of the operational NAAPS model. In this version, quality-controlled retrievals of AOD from MODIS and MISR (Zhang and Reid, 2006; Hyer et al., 2011; Shi et al., 2014) are assimilated into NAAPS through the Naval Research Laboratory Atmospheric Variational Data Assimilation System-AOD System (NAVDAS-AOD; e.g., Zhang et al., 2008, 2011, 2014). Aerosol source functions, including biomass burning smoke and dust emissions, are regionally tuned based on the AERONET data. Other aerosol processes, including dry deposition over water, are also tuned based on AOD data assimilation correction fields. NOAA Climate Prediction Center (CPC) MORPHing (CMORPH) precipitation data are used to constrain the wet removal process within the tropics (Joyce et al., 2004). The usage of CMORPH avoids the ubiquitous precipitation bias that exists in all global atmospheric models (e.g., Dai, 2006) and has been shown to improve aerosol wet deposition, therefore yielding better AOD (Xian et al., 2009). The reanalysis agrees reasonably well with AERONET data on a global scale (Lynch et al., 2016) and also reproduces AOD trends that are in good agreement with satellite-based analyses (e.g., Zhang and Reid, 2010; Hsu et al., 2012). In this study, we use a free-running version of NAAPS reanalysis v1 without AOD assimilation to provide aerosol fields every 6 h at 1° × 1° (latitude-longitude) resolution.

VLIDORT radiative transfer code
VLIDORT is a linearized, multiple-scatter radiative transfer model for the simultaneous generation of Stokes 4-vectors and analytically derived Jacobians (weighting functions) of these 4-vectors with respect to any atmospheric or surface property (Spurr, 2006). The model uses discrete-ordinate methods to solve the polarized plane-parallel RT equations in a multilayer atmosphere, plus the solution of a boundary value problem and subsequent source function integration to obtain radiation fields at any geometry and any atmospheric level. VLIDORT has a "pseudo-spherical" ansatz: the treatment of solar-beam attenuation in a spherical-shell atmosphere before scattering. Single scattering in VLIDORT is accurate for both line-of-sight and solar-beam spherical geometry. The model has a full thermal emission capability. VLIDORT has two supplements, one dealing with bidirectional (non-Lambertian) reflection at the surface and the other with the inclusion of surface light sources (SIF or water-leaving radiances). Full details on the VLIDORT model may be found in a recent review paper (Spurr and Christi, 2019, and references to VLIDORT therein).
VLIDORT is used to simulate the AI in this study. Simulations at 354 and 388 nm are performed for both Rayleigh atmospheres and scenarios with aerosol loadings (four mass-mixing profiles for different aerosol types) taken from the NAAPS model. In addition to the AI, Jacobian calculations are needed with respect to these aerosol profiles. Firstly, radiance Jacobians with respect to these four mass-mixing profiles are computed analytically using VLIDORT's linearization facility; secondly, the associated Jacobians of AI are further derived through a second VLIDORT linearization with respect to the Lambertian-equivalent reflectivity. The details of this process are given in the next section.

OMI AI assimilation system
The OMI assimilation system has three components: a forward model, a 3-D variational assimilation system, and a post-processing system. Based on the background NAAPS 3-D aerosol concentrations for dust, smoke, ABF, and sea salt aerosols, the forward model computes not only the associated AI values but also the Jacobians of AI with respect to the four aerosol mass loading profiles. The 3-D variational assimilation system is a modified 3-D NAVDAS-AOD system (Zhang et al., 2008) that computes increments for dust and smoke aerosol concentrations based on OMI AI data. The post-processing system constructs a new NAAPS analysis based on the background NAAPS aerosol concentrations and the increments derived from the 3-D variational assimilation system. Details of the forward model and the modified NAVDAS-AOD system are described in this section.

Forward model for simulating OMI AI
To construct an AI-DA system, a forward model is needed to simulate AI using aerosol concentrations from NAAPS. In this study, the forward model is built around the VLIDORT model, following a similar method to that suggested in Buchard et al. (2015). Here VLIDORT is configured to compute OMI radiances and Jacobians as functions of the observational conditions at 354 and 388 nm using geolocation information from OMI data such as satellite zenith, solar zenith, and relative azimuth angles, as well as ancillary OMI data (surface albedos at 354 and 388 nm).
To convert from NAAPS mass loading concentrations to aerosol extinction and scattering profiles, we require aerosol optical properties for the four species at 354 and 388 nm, which are summarized in Table 1. The optical properties of ABF (assumed to be sulfate in this study), sea salt, dust, and smoke aerosols, including mass extinction efficiencies and single-scattering albedos at 354 and 388 nm, are adapted from NASA's Goddard Earth Observing System version 5 (GEOS-5) model (e.g., Colarco et al., 2014; Buchard et al., 2015). Note that the study period is July and August of 2007 over Africa, coinciding with the early biomass burning season, which is associated with lower single-scattering albedo values (Eck et al., 2013). With that in mind, we choose a relatively low value of 0.85 for the smoke single-scattering albedo at 354 nm (e.g., Eck et al., 2013; Cochrane et al., 2019). A slightly higher single-scattering albedo of 0.86 is assumed at 388 nm. The slight increase in single-scattering albedo from 354 to 388 nm has also been observed from Solar Spectral Flux Radiometer (SSFR) observations during the recent NASA ObseRvations of CLouds above Aerosols and their intEractionS (ORACLES) campaign. Scattering matrices for dust, smoke, sea salt, and sulfate (to represent ABF) aerosols are based on associated expansion coefficients (e.g., Colarco et al., 2014; Buchard et al., 2015) taken from NASA's GEOS-5 model. Also, to reduce computational expenses, scalar radiative transfer calculations are performed.
To simulate OMI AI, the Lambertian equivalent reflectivity (LER) at 388 nm ($R_{388}$) is needed for estimating the LER at 354 nm. $R_{388}$ is calculated from VLIDORT based on Eq. (2):

$$R_{388} = \frac{I_{\mathrm{aer}388}(\rho_{388}) - I_{\mathrm{ray}388}(0)}{T + S_b\left[I_{\mathrm{aer}388}(\rho_{388}) - I_{\mathrm{ray}388}(0)\right]}. \tag{2}$$

Here, $I_{\mathrm{ray}388}(0)$ is the calculated path radiance at 388 nm assuming a Rayleigh atmosphere with a surface albedo of 0. $T$ and $S_b$ are the calculated transmittance and spherical albedo at 388 nm. $I_{\mathrm{aer}388}(\rho_{388})$ is the computed radiance including 3-D aerosol fields from NAAPS and the 388 nm surface albedo from OMI data. In Buchard et al. (2015), an adjusting factor is applied to $R_{388}$ by adding the difference between climatological surface albedos at 354 and 388 nm. A similar approach is adopted in this study, as shown in Eq. (3):

$$R'_{388} = R_{388} - (\rho_{388} - \rho_{354}). \tag{3}$$

Here, $R'_{388}$ is the surface-albedo-adjusted Lambertian equivalent reflectivity at 388 nm. $\rho_{388}$ and $\rho_{354}$ are surface albedo values at the 388 and 354 nm channels that are obtained from the OMI OMAERUV data. Finally, the simulated AI ($\mathrm{AI}_{\mathrm{naaps}}$) is given by

$$\mathrm{AI}_{\mathrm{naaps}} = -100\,\log_{10}\left[\frac{I_{\mathrm{aer}354}(\rho_{354})}{I_{\mathrm{ray}354}(R'_{388})}\right]. \tag{4}$$

Here, $I_{\mathrm{aer}354}(\rho_{354})$ is the calculated radiance at 354 nm using NAAPS aerosol fields and the OMI-reported surface albedo at 354 nm ($\rho_{354}$). $I_{\mathrm{ray}354}(R'_{388})$ is the calculated radiance assuming a Rayleigh atmosphere and the derived value of $R'_{388}$ as the surface albedo (Buchard et al., 2015).

Figure 1. (a) Spatial distribution of NAAPS AODs using NAAPS reanalysis data from the collocated OMI and NAAPS dataset for July 2007. (b) Simulated AI using NAAPS reanalysis data as shown in (a). (c) Spatial distribution of OMI AI using gridded OMI data from the collocated OMI and NAAPS dataset for July 2007. Grey highlights those 1° × 1° (latitude-longitude) bins that have fewer than three collocated NAAPS and OMI AI data points for the study period.

The forward-model-simulated OMI AI values are intercompared with OMI AI values as shown in Fig. 1 for the study region. A total of 1 month (1-31 July 2007) of NAAPS reanalysis data and OMI AI data were used. Note that OMI AI data over both cloud-free and cloudy skies were used. Since surface albedos included in the OMI data represent reflectivities under clear-sky situations, the albedo under a cloudy sky ($\rho_{\mathrm{cld}}$) is computed as

$$\rho_{\mathrm{cld}} = \rho_{\mathrm{clr}}\,(1 - f_c) + 0.8\,f_c. \tag{5}$$

Here, $\rho_{\mathrm{clr}}$ and $f_c$ are the clear-sky surface albedo (e.g., $\rho_{354}$ or $\rho_{388}$) and the cloud fraction, both quantities obtained from the OMI dataset. Clouds are assumed to be tropospheric (close to the surface) with a UV albedo of 0.8 such that this equation applies to both the 354 and 388 nm channels. Figure 1a shows the spatial distribution of NAAPS AOD over central and northern Africa using collocated NAAPS and OMI AI datasets. OMI AI data are grid-averaged in 1° × 1° (latitude-longitude) bins.
Also, we focus on Africa in this paper as this area includes dust plumes over deserts and smoke plumes overlying stratus cloud decks. The Arctic is not included as additional efforts may be needed to fully understand the properties of sea ice reflectivity; we leave this topic for a future paper. Only bins that have valid NAAPS and OMI AI data are used to generate Fig. 1. Dust plumes are visible over northern Africa and the Persian Gulf, and a smoke plume from central Africa is also evident. These UV-absorbing aerosol plumes are also captured by OMI AI, as seen in Fig. 1c. Shown in Fig. 1b is the simulated OMI AI using the NAAPS aerosol fields and viewing geometries with surface albedos from OMI. The simulated OMI AI shows similar patterns to those derived from OMI, especially for the dust plumes over northern Africa and smoke plumes over central Africa. An overall correlation of 0.79 is found between simulated and satellite-retrieved OMI AI values, as shown in Fig. 1, suggesting that the forward model is functioning reasonably as designed.
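The chain of calculations in the forward model (LER at 388 nm, albedo adjustment, simulated AI, and cloudy-sky albedo) can be sketched in a few lines. This is an illustration under stated assumptions, not the VLIDORT-based implementation: all function names are hypothetical, the LER inversion assumes the standard Lambertian relation I(R) = I0 + R T / (1 − R S_b), and `lambertian_radiance` is a toy stand-in for a radiative transfer call, used only to check that the inversion round-trips.

```python
import math


def lambertian_radiance(path_radiance, transmittance, spherical_albedo,
                        reflectivity):
    """Toy stand-in for an RT model over a Lambertian surface (assumed form)."""
    return path_radiance + reflectivity * transmittance / (
        1.0 - reflectivity * spherical_albedo)


def lambertian_equivalent_reflectivity(radiance, path_radiance,
                                       transmittance, spherical_albedo):
    """Invert the Lambertian relation for the reflectivity (LER)."""
    excess = radiance - path_radiance
    return excess / (transmittance + spherical_albedo * excess)


def adjusted_ler(ler_388, albedo_388, albedo_354):
    """Shift the 388 nm LER by the 354 - 388 nm surface albedo difference."""
    return ler_388 - (albedo_388 - albedo_354)


def cloudy_sky_albedo(clear_sky_albedo, cloud_fraction, cloud_albedo=0.8):
    """Blend clear-sky albedo with an assumed cloud albedo of 0.8."""
    return clear_sky_albedo * (1.0 - cloud_fraction) + cloud_albedo * cloud_fraction


def aerosol_index(radiance_354, rayleigh_radiance_354):
    """AI = -100 log10(I / I_Rayleigh); positive when aerosols darken the scene."""
    return -100.0 * math.log10(radiance_354 / rayleigh_radiance_354)
```

In the actual system, the radiances, transmittance, and spherical albedo would come from VLIDORT runs rather than from the toy relation above; the numerical values used to exercise these helpers are arbitrary.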

Forward model for Jacobians of AI
Jacobians of OMI AI with respect to aerosol mass concentrations are needed for the OMI AI assimilation system. In this study, AI Jacobians ($K$) are calculated from radiance Jacobians with respect to aerosol mass concentrations for four aerosol species (smoke, dust, ABF (represented as sulfate), and sea salt) at the 354 nm ($K_{354,nk} = \partial I_{\mathrm{aer}354} / \partial M_{nk}$) and 388 nm ($K_{388,nk} = \partial I_{\mathrm{aer}388} / \partial M_{nk}$) wavelengths. Here $M_{nk}$ is the mass concentration for aerosol type $k$ in vertical layer $n$. $I_{\mathrm{aer}354}$ and $I_{\mathrm{aer}388}$ are radiances for the 354 and 388 nm channels, respectively, and $K_{354,nk}$ and $K_{388,nk}$ are the corresponding radiance Jacobians. AI Jacobians can then be calculated through analytic differentiation of the basic formula in Eq. (1); after some algebra, we obtain Eq. (6), in which the terms $A_1$ and $A_2$ are given by Eqs. (7) and (8), respectively. Based on these equations, radiance Jacobians with respect to aerosol particles, $K_{354,nk}$ and $K_{388,nk}$, are computed at 354 and 388 nm, respectively, using OMI-reported surface albedo values ($\rho_{354}$ and $\rho_{388}$), followed by a calculation of the albedo Jacobian $\partial I_{\mathrm{ray}354}(R'_{388}) / \partial R$ at 354 nm, where $R'_{388}$ is the surface-albedo-adjusted LER at 388 nm. To check the analytic Jacobian calculation in Eqs. (6)-(8), we compute the aerosol AI Jacobians using a finite-difference (FD) method. Here, the derivative of AI as a function of the aerosol concentration of species $k$ in layer $n$ is computed using

$$\frac{\partial \mathrm{AI}}{\partial C_{nk}} \approx \frac{\mathrm{AI}' - \mathrm{AI}}{C'_{nk} - C_{nk}}. \tag{9}$$

Here $C_{nk}$ and $C'_{nk}$ are the baseline and perturbed aerosol concentrations, respectively, and AI and AI′ are computed using $C_{nk}$ and $C'_{nk}$, respectively. Figure 2b shows the comparison of Jacobians of dust aerosols estimated from the analytic and the FD solutions. Dust, smoke, ABF, and sea salt aerosol concentrations as a function of altitude are shown in Fig. 2a. To compute FD Jacobians with respect to dust aerosols, a 10 % perturbation is introduced in the dust profiles. A very close match is found between analytic and FD Jacobians. This validates the analytical solution used in the study. The analytic solution is of course much faster, as a single call to VLIDORT will deliver all necessary Jacobians at one wavelength, compared to 97 separate calls to VLIDORT with the FD calculation (one baseline run plus perturbations of four species in each layer of the 24-layer atmosphere, i.e., 1 + 4 × 24 = 97).
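For readers who want to see where the analytic AI Jacobian comes from, the chain rule can be sketched as below. This is a hedged derivation under two assumptions that are ours, not confirmed by the source: the simulated AI and LER take the standard forms $\mathrm{AI}_{\mathrm{naaps}} = -100\log_{10}[I_{\mathrm{aer}354}(\rho_{354})/I_{\mathrm{ray}354}(R'_{388})]$ and $R_{388} = D/(T + S_b D)$, and the Rayleigh quantities $T$, $S_b$, and $I_{\mathrm{ray}388}(0)$ do not depend on aerosol mass. The shorthand $D$ is introduced here for convenience, and the grouping of terms may differ from the coefficients $A_1$ and $A_2$ of the paper's Eqs. (7) and (8).

```latex
\begin{align}
D &= I_{\mathrm{aer}388}(\rho_{388}) - I_{\mathrm{ray}388}(0),
\qquad
R'_{388} = \frac{D}{T + S_b D} - (\rho_{388} - \rho_{354}), \\
\frac{\partial R'_{388}}{\partial M_{nk}}
  &= \frac{T}{(T + S_b D)^{2}}\, K_{388,nk}, \\
\frac{\partial \mathrm{AI}_{\mathrm{naaps}}}{\partial M_{nk}}
  &= -\frac{100}{\ln 10}
     \left[
       \frac{K_{354,nk}}{I_{\mathrm{aer}354}(\rho_{354})}
       - \frac{1}{I_{\mathrm{ray}354}(R'_{388})}\,
         \frac{\partial I_{\mathrm{ray}354}(R'_{388})}{\partial R}\,
         \frac{T}{(T + S_b D)^{2}}\, K_{388,nk}
     \right].
\end{align}
```

Under these assumptions the AI Jacobian is a linear combination of the two radiance Jacobians, which is why one linearized VLIDORT call per wavelength plus one albedo Jacobian is sufficient.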

The variational OMI AI assimilation system
The OMI AI assimilation system is based on AI simulations (with Jacobians) from the forward model. Two principles underlie the assimilation procedure. First, we assume that OMI AI is sensitive to UV-absorbing aerosol particles, such as NAAPS smoke and dust, or that only smoke and dust are injected high enough into the troposphere to impact AI. Therefore, innovations are limited to modifications of dust and smoke aerosol properties. For classes that do not strongly project onto AI, such as sea salt and ABF aerosols, aerosol concentrations are not modified during the process. Second, contributions of smoke and dust aerosols to AI ($\mathrm{AI}_{\mathrm{smoke}}$ and $\mathrm{AI}_{\mathrm{dust}}$) prior to assimilation are estimated by multiplying smoke and dust aerosol concentrations from NAAPS with Jacobians of AI for the respective smoke and dust aerosols. The ratio of AI innovation from smoke aerosols ($\Delta\mathrm{AI}_{\mathrm{smoke}}$) to total AI innovation ($\Delta\mathrm{AI}$, i.e., OMI AI − $\mathrm{AI}_{\mathrm{naaps}}$) is assumed to be the ratio of $\mathrm{AI}_{\mathrm{smoke}}$ to $\mathrm{AI}_{\mathrm{smoke}} + \mathrm{AI}_{\mathrm{dust}}$. The same assumption holds for dust aerosols.
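The second principle, splitting the AI innovation between smoke and dust in proportion to their prior contributions, can be sketched as follows. The function name `partition_innovation` is hypothetical, and summing the Jacobian-weighted concentrations over vertical layers to get each species' contribution is an assumption of this sketch.

```python
import numpy as np


def partition_innovation(omi_ai, simulated_ai, smoke_conc, dust_conc,
                         jac_smoke, jac_dust):
    """Split the AI innovation between smoke and dust.

    Hypothetical helper: each species' prior contribution to AI is taken
    as the layer sum of Jacobian times concentration (an assumption).
    Returns (smoke_innovation, dust_innovation).
    """
    ai_smoke = float(np.dot(jac_smoke, smoke_conc))
    ai_dust = float(np.dot(jac_dust, dust_conc))
    total = ai_smoke + ai_dust
    if total == 0.0:
        raise ValueError("no smoke or dust contribution to partition")

    innovation = omi_ai - simulated_ai  # observed minus modeled AI
    frac_smoke = ai_smoke / total
    return innovation * frac_smoke, innovation * (1.0 - frac_smoke)
```

With a two-layer column in which smoke contributes 2 AI units and dust contributes 6, an innovation of 1.0 would be split as 0.25 for smoke and 0.75 for dust.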
Given these two principles, the overall design concept for the OMI AI assimilation is expressed in Eq. (10), where $C^a$ and $C^b$ are NAAPS aerosol concentrations for the analysis and background fields, respectively; $C^b_{\mathrm{dust}}$ and $C^b_{\mathrm{smk}}$ are background NAAPS particle mass concentrations for dust and smoke; $H(C)$ is the NAAPS forward model that links NAAPS particle mass concentrations to AI; and $\mathbf{H}$ is defined as $\partial H(C)/\partial C$, which is the Jacobian matrix of AI with respect to aerosol concentrations. $Y$ is the observed OMI AI, and $Y - H(C^b)$ is the innovation of AI representing the difference between observed and modeled AI values. The innovation is weighted by the fractional contributions of dust and smoke aerosols, respectively. These fractions are estimated using NAAPS aerosol concentrations for relatively high aerosol loading cases (AOD > 0.15). For low aerosol loading (AOD < 0.15) as reported from NAAPS, it is possible that NAAPS could underestimate aerosol concentrations. Thus, the fractional contribution of innovations is assigned a value of 1 for the dominant aerosol type based on a NAAPS aerosol climatology (Zhang et al., 2008). Note that the weighted innovation is in observational space. $P_{\mathrm{dust}}$ and $P_{\mathrm{smk}}$ are model error covariance matrices for dust and smoke aerosols (e.g., Zhang et al., 2008, 2011, 2014). $R$ is the observation-based error covariance. The gain-weighted terms represent the estimated increments in model space. The background error covariance matrix is constructed from modeled error variances and error correlations following the methodology in previous studies (Zhang et al., 2008). The horizontal background error correlation is generated using the second-order autoregressive (SOAR) function, as shown in Eq. (11) (Zhang et al., 2008):

$$\mathrm{Corr}(x, y) = \left(1 + \frac{R_{xy}}{L}\right)\exp\left(-\frac{R_{xy}}{L}\right). \tag{11}$$

Here, $x$ and $y$ are two given locations, and $R_{xy}$ is the great circle distance between them. $L$ is the averaged error correlation length and is set to 200 km based on Zhang et al. (2008).
Similarly, the vertical error correlation between two pressure levels $p_1$ and $p_2$ is also based on the SOAR function, this time in pressure space (Eq. 12; Zhang et al., 2011). Here, $L$ is a unitless number representing the vertical correlation length and is set to 0.2. The horizontal error variance is based on the root mean square error (RMSE) of aerosol concentrations, which is arbitrarily set to 100 µg m$^{-3}$ for near-surface dust aerosols (ground to 700 hPa). The RMSE of dust aerosol mass is assumed to decrease as altitude increases and is set to 50 %, 25 %, and 1 % of the near-surface values for 500-700, 350-500, and 70-350 hPa, respectively. Note that different aerosol species have different mass extinction efficiency values. Here we assume that the modeled error in aerosol extinction is the same for different aerosol species, and thus the RMSE of the smoke aerosol concentration is scaled by the mass extinction efficiency ratio between smoke and dust aerosols. The observational errors are assumed to be non-correlated in this study (e.g., Zhang et al., 2008). OMI AI values over cloud-free and cloudy skies are used in the study, and therefore RMSEs of AI are required for both these situations. Note that, as suggested by Yu et al. (2012), for the same above-cloud CALIOP AOD, variations in AI are found to be of the order of 1 for cloud optical depth changing from 2 to 20. Thus, we assume that the RMSE of OMI AI is 0.5 for cloud-free skies, increasing linearly with the cloud fraction up to a value of 1 for 100 % overcast.
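The error settings above translate into a few small functions. This is a sketch with hypothetical function names: the SOAR form (1 + r/L) exp(−r/L) is the standard second-order autoregressive correlation, the treatment of values exactly on a pressure boundary is our choice, and the direction of the smoke scaling (dust efficiency divided by smoke efficiency) is inferred from the stated assumption of equal extinction error rather than quoted from the source.

```python
import math


def soar_correlation(distance, length_scale):
    """Second-order autoregressive (SOAR) correlation, (1 + r) exp(-r)."""
    r = abs(distance) / length_scale
    return (1.0 + r) * math.exp(-r)


def ai_observation_rmse(cloud_fraction):
    """AI RMSE: 0.5 for clear sky, rising linearly to 1.0 at full overcast."""
    return 0.5 + 0.5 * cloud_fraction


def dust_background_rmse(pressure_hpa, near_surface_rmse=100.0):
    """Dust concentration RMSE (ug m-3) by pressure band.

    Boundary values are assigned to the lower-altitude band (assumption).
    """
    if pressure_hpa >= 700.0:
        return near_surface_rmse
    if pressure_hpa >= 500.0:
        return 0.50 * near_surface_rmse
    if pressure_hpa >= 350.0:
        return 0.25 * near_surface_rmse
    if pressure_hpa >= 70.0:
        return 0.01 * near_surface_rmse
    raise ValueError("RMSE is not specified above the 70 hPa level")


def smoke_background_rmse(dust_rmse, dust_mass_ext_eff, smoke_mass_ext_eff):
    """Scale dust RMSE so both species have the same extinction error.

    Inferred direction: rmse_smoke * k_smoke == rmse_dust * k_dust.
    """
    return dust_rmse * dust_mass_ext_eff / smoke_mass_ext_eff
```

For the horizontal correlation, `soar_correlation(R_xy, 200.0)` would be called with the great circle distance in kilometers. For the vertical correlation, the distance argument would be a unitless pressure separation with a length scale of 0.2; taking that separation as |ln(p1/p2)| is an assumption of this note.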
Lastly, we assume that detectable UV-absorbing aerosols have AI values larger than 0.8 (e.g., Torres et al., 2013). Therefore, for regions with OMI AI values larger than 0.8, UV-absorbing aerosol particles can be added or removed from air columns based on innovations, which are the differences between OMI-reported and simulated AI values. For regions with OMI AI values less than 0.8, innovations are only used to remove UV-absorbing aerosol particles from air columns.
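The analysis update described in this section can be written compactly as below. This is a hedged sketch assembled from the quantities defined in the text (background and analysis concentrations, the forward model and its Jacobian, the innovation, the species error covariances, and the observation error covariance); the exact layout of the paper's own equation may differ. The symbols $f_{\mathrm{dust}}$ and $f_{\mathrm{smk}}$ are our shorthand for the fractional contributions and are treated here as per-observation scalars.

```latex
\begin{align}
C^{a} = C^{b}
  &+ P_{\mathrm{dust}} \mathbf{H}^{T}
     \left(\mathbf{H} P_{\mathrm{dust}} \mathbf{H}^{T} + R\right)^{-1}
     \left[Y - H(C^{b})\right] f_{\mathrm{dust}} \nonumber \\
  &+ P_{\mathrm{smk}} \mathbf{H}^{T}
     \left(\mathbf{H} P_{\mathrm{smk}} \mathbf{H}^{T} + R\right)^{-1}
     \left[Y - H(C^{b})\right] f_{\mathrm{smk}}, \\
f_{\mathrm{dust}}
  &= \frac{\mathbf{H} C^{b}_{\mathrm{dust}}}
          {\mathbf{H} C^{b}_{\mathrm{dust}} + \mathbf{H} C^{b}_{\mathrm{smk}}},
\qquad
f_{\mathrm{smk}} = 1 - f_{\mathrm{dust}}.
\end{align}
```

In this form, the AI threshold of 0.8 acts as a filter on the innovation: where OMI AI is below the threshold, only negative innovations (which remove UV-absorbing aerosol) would be passed to the update.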

System evaluation and discussion

Evaluating the performance of the AI assimilation system over Africa
Using 2 months of OMI data (July-August 2007), the performance of OMI AI assimilation was evaluated around the Africa region (20° S-40° N; 60° W-50° E). The study region was chosen to examine the performance of OMI AI data assimilation over bright surfaces such as the deserts of northern Africa and to study aerosol advection over clouds, in this case smoke off the west coast of southern Africa. In this demonstration, two NAAPS runs were performed for the period of 1 July to 31 August 2007, one with OMI AI assimilation (AI-DA run) and one without (natural run). Both runs were initialized with the use of NAAPS reanalysis data at 00:00 UTC 1 July and do not include any other form of aerosol assimilation. Figure 3a shows the true-color composite from Aqua MODIS for 28 July 2007 over the study region that is obtained from the NASA Worldview site (https://worldview.earthdata.nasa.gov/; last access: June 2020). Visible in the image are the dust plumes from northern Africa transported to the Atlantic Ocean and smoke plumes from central and southern Africa transported to the west coast of South Africa. As indicated by the aggregated OMI AI data for 12:00 UTC on 28 July 2007 (Fig. 3b), dust plumes from northern Africa are transported to the north corner of the west coast of northern Africa. Smoke plumes are also visible in the OMI AI plot in southern Africa and are transported to the west coast and over the Atlantic. Comparing Fig. 3a and b, smoke plumes, as identified from OMI, are also found over cloudy regions as indicated from the MODIS visible imagery. Note that Fig. 3b shows the OMI AI data used in the assimilation process, and again AI retrievals over both cloud-free and cloudy conditions are included. Figure 3c is the 12:00 UTC, 28 July 2007 NAAPS AOD product from the natural run. In comparison, Fig. 3d shows the same situation, this time with the use of OMI AI data assimilation. Comparing Fig. 3b and d, dust and smoke aerosol patterns as shown from OMI AI more closely resemble the NAAPS AOD fields after AI assimilation. Over the northeast coast of Africa, heavy aerosol plumes, as hinted at in NAAPS AOD from the natural run (Fig. 3c), cover larger spatial areas than those inferred from OMI AI data. In comparison, NAAPS AOD patterns from the OMI AI data assimilation cycle closely resemble aerosol patterns as suggested from OMI AI data. Also shown in Fig. 3e and f are the simulated AI using NAAPS data from the natural and OMI AI-DA runs (data from Fig. 3c and d), respectively. Clearly, with the use of NAAPS data from the natural run, simulated OMI AI is overestimated in comparison with OMI AI data (Fig. 3b). Simulated AI patterns with the use of NAAPS data from the OMI AI-DA run rather closely resemble AI patterns from the OMI data, again indicating that the OMI AI-DA system is functioning reasonably as designed.
The performance of AI-DA is also evaluated using OMI AI for the whole study period, as shown in Fig. 4. These data are constructed using collocated OMI AI and NAAPS data according to the conditions introduced in Sect. 3. Here, Fig. 4a and e are spatial distributions of 2-month-averaged (July and August 2007) AODs for NAAPS AI-DA and natural runs, respectively. Figure 4b is the spatial distribution of the simulated AI using NAAPS data from AI-DA runs, and Fig. 4c is the spatial distribution of OMI AI for the 2-month period. Figure 4f and g show similar plots to those in Fig. 4b and c, but this time for NAAPS natural runs. While simulated AI values from NAAPS natural runs (Fig. 4f) are overestimated compared to OMI AI values (Fig. 4g) for the study region, the patterns of simulated AI from NAAPS AI-DA runs (Fig. 4b) are similar to patterns shown from OMI AI (Fig. 4c). This is also seen in Fig. 4d, which is the difference between simulated AI from NAAPS AI-DA runs and OMI AI. In contrast to the situation in Fig. 4d, Fig. 4h, which is the difference between simulated AI from NAAPS natural runs and OMI AI, shows much larger differences in AI values.
While it is not too difficult to make the model mimic the AI product, proof of real skill lies in any improvements to AOD calculations. To this end, the performance of OMI AI assimilation was evaluated with the use of AERONET data. Figure 5a shows the intercomparison of NAAPS AOD versus AERONET AOD at 0.55 µm. A total of 1443 collocated pairs of NAAPS and AERONET data were compiled for the study region over the 2-month test period. Compared with AERONET data, NAAPS AOD from the natural run had a correlation of 0.68, a mean absolute error in AOD of 0.154, and an RMSE of 0.220. With AI assimilation, the correlation between NAAPS AOD and AERONET AOD increased to 0.74 (Fig. 5b), the absolute error was reduced to 0.104, and the RMSE was reduced to 0.156, both roughly a 30 % reduction. Note that AERONET AOD values are only available for lines of sight that are free of cloud presence for the sun-photometer instruments. Also, the slope of AERONET versus NAAPS AOD is 0.87 for the NAAPS natural runs, and a similar slope of 0.84 is found for the NAAPS AI-DA runs.
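The skill metrics quoted above can be reproduced from any set of collocated model and AERONET AOD pairs. The following is a minimal sketch, not the evaluation code used in this study; the function names `aod_skill` and `percent_reduction` are introduced here for illustration only. The last two lines simply redo the arithmetic on the reported error values.

```python
import numpy as np

def aod_skill(model_aod, aeronet_aod):
    """Return (correlation, mean absolute error, RMSE) for collocated AOD pairs."""
    model = np.asarray(model_aod, dtype=float)
    obs = np.asarray(aeronet_aod, dtype=float)
    diff = model - obs
    corr = np.corrcoef(model, obs)[0, 1]   # Pearson correlation coefficient
    mae = np.mean(np.abs(diff))            # mean absolute error
    rmse = np.sqrt(np.mean(diff ** 2))     # root mean square error
    return corr, mae, rmse

def percent_reduction(before, after):
    """Relative reduction of an error metric, in percent of its initial value."""
    return 100.0 * (before - after) / before

# Error values reported in the text (natural run versus AI-DA run).
mae_reduction = percent_reduction(0.154, 0.104)   # roughly 32 %
rmse_reduction = percent_reduction(0.220, 0.156)  # roughly 29 %
```

Both reductions are expressed relative to the natural-run value, which is consistent with the "roughly 30 %" statement.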

Intercomparison with AOD data assimilation
Typically, NAAPS reanalyses are constructed through assimilation of MISR and MODIS aerosol products (NAAPS AOD assimilation). Thus, the performances of NAAPS AOD and AI-DA assimilations are compared against AERONET data. Figure 5c shows the comparison of AERONET AOD and NAAPS AOD after AOD assimilation, while Fig. 5b shows a similar plot but using NAAPS data from AI-DA. Note that the same version of the NAAPS model with the same temporal and spatial resolutions driven by the same meteorological data was used to construct Fig. 5, and thus the differences in Fig. 5a, b, and c only result from different aerosol data assimilation methods implemented (no data assimilation for the natural run). A better correlation between AERONET and NAAPS data of 0.79 is found using AOD data assimilation. In comparison, the correlation is 0.74 for the AI-DA runs. Slightly better RMSE (0.140 versus 0.156) and absolute error (0.095 versus 0.104) values are also found for the AOD data assimilation runs. This result is not surprising as OMI AI provides only a proxy for aerosol properties, while passive-based AOD retrievals are often considered a more reliable parameter for representing column-integrated aerosol properties. But still, the evaluation efforts are over a cloud-free line of sight as detected from AERONET, and AI-DA may further assist traditional AOD data assimilation by providing AI assimilation over cloudy regions.

Sensitivity test
As mentioned in Sect. 3, aerosol properties for non-smoke aerosol types were obtained from the NASA GEOS-5 model (e.g., Colarco et al., 2014; Buchard et al., 2015). Yet, different smoke aerosol single-scattering albedo (SSA) values are used in this study, as values for central Africa have a strong seasonal dependency (e.g., Eck et al., 2013). While SSA values of 0.85 and 0.86 are used for the 354 and 388 nm channels, respectively, in our study, we have also examined the sensitivity of simulated OMI AI with respect to differing SSA values (Fig. 6). Figure 6a-c show the simulated AI at 12:00 UTC on 28 July 2007 using NAAPS reanalysis data (Lynch et al., 2016) for three scenarios: SSA values at 354 and 388 nm of 0.84 and 0.84 (Fig. 6a), 0.85 and 0.85 (Fig. 6b), and 0.86 and 0.86 (Fig. 6c). Over the central Africa area, where smoke plumes are expected, simulated OMI AI patterns are similar for Fig. 6a and b, but reduced values of AI are found when using higher SSA values of 0.86 at both 354 and 388 nm. This is further confirmed by the averaged AI for the smoke region over central Africa (14.5 to 0.5° S latitude and 10.5 to 30.5° E longitude; indicated using the black box in Fig. 6f) of 0.96, 0.94, and 0.78 for Fig. 6a-c, respectively. Figure 6d-f show the simulated AI with the SSA at 354 nm fixed at 0.85 and SSA values at 388 nm of 0.85, 0.855, and 0.86, respectively. Interestingly, the spectral dependence of SSA seems to affect the simulated AI significantly, and this phenomenon has also been reported by previous studies (e.g., Hammer et al., 2016). The averaged AI values over central Africa (again indicated by the black box in Fig. 6f) are 0.94, 1.11, and 1.32 for 388 nm SSAs of 0.85, 0.855, and 0.86, respectively. This exercise suggests that simulated AI is a strong function of SSA, so that both reliable SSA values and their spectral dependence between 354 and 388 nm are needed on a regional basis for future applications.
Interestingly, although simulated AI values are significantly affected by perturbing SSA values as shown in Fig. 6, less significant impacts are observed for NAAPS AOD. This is found by running the OMI AI-DA for 12:00 UTC on 28 July 2007 for the SSA values used to generate Fig. 6. For example, for the region highlighted by the black box in Fig. 6f, the averaged values for the simulated OMI AI are 0.96, 0.94, and 0.78 using SSA values at the 354 and 388 nm channels of 0.84 and 0.84, 0.85 and 0.85, and 0.86 and 0.86, respectively. The corresponding NAAPS AODs are found to be 0.559, 0.560, and 0.585 after OMI AI-DA, which is a change of less than 5 %. Similarly, by fixing the SSA value of the 354 nm channel as 0.85 and perturbing SSA values at 388 nm from 0.85 to 0.86, a ∼ 30 % change is found in simulated OMI AI (from 0.94 to 1.32), yet a ∼ 10 % change is found for the NAAPS AOD (from 0.560 to 0.504) after OMI AI-DA.
It is also of interest to investigate the changes in aerosol vertical distributions due to the OMI AI-DA. For this exercise, we selected the case of 12:00 UTC on 28 July 2007 and compared vertical distributions of smoke and dust aerosols near the peak AI value of the smoke plume (9.5° S and 20.5° E) for the NAAPS natural and AI-DA runs (Fig. 7a). Note that the differences between OMI AI-DA and natural runs as shown in Fig. 7 are essentially an integrated effect of OMI AI-DA from 00:00 Z on 1 July to 12:00 Z on 28 July 2007. As shown in Fig. 7a, the corrections to dust and smoke aerosol concentrations from the AI-DA system seem to be systematic changes across the majority of vertical layers, instead of moving dust or smoke aerosol plumes vertically. As dust aerosol concentrations are reduced at all layers, a systematic correction to smoke aerosol concentrations, although non-linear, is also observed. AI assimilation helps reduce the amount of upper troposphere dust (likely an artifact) but does change the layer centroid slightly upwards. We have also evaluated NAAPS vertical distributions near a peak dust plume region (25.5° N and 12.5° W) for the case of 12:00 Z on 28 July 2007 as shown in Fig. 7b. Similar to Fig. 7a, a non-linear correction to dust aerosol concentrations is also observed across the vertical domain.

Issues and discussion
The OMI AI data assimilation system is a proxy for an all-sky, all-band radiance assimilation system. It contains all the necessary components for such radiance assimilation, including a forward model for simulating radiances and AI values and their Jacobians, based on a full vector linearized radiative transfer model called for every observation. Therefore, the computational burden is a direct issue associated with the deployment of calls to a radiative transfer model for each observation. For the study area in this work, after binning OMI AI data into a 1° × 1° (latitude-longitude) product, it still takes about 1 CPU day for NAAPS to run for 1 month of model time. In comparison, the timescale for running AOD assimilation for 1 month is at the hourly level. Clearly, there will be an unavoidable computational burden of some sort for OMI AI assimilation and by extension for future radiance assimilation in the UV-visible spectrum for aerosol analyses. Performance enhancement methods, such as parallel processing (the VLIDORT software is thread-safe and can be used in parallel environments such as OpenMP) or fast lookup table extraction based on neural networks and trained datasets of forward simulation, must be explored in order to enable such assimilation applications in near-real time on a global scale.
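Binning observations onto a coarser grid is one way to limit the number of radiative transfer calls, since the forward model is then invoked once per occupied grid cell rather than once per pixel. The sketch below illustrates such a binning step under stated assumptions: a simple unweighted mean per cell and a global grid starting at 90° S, 180° W. The binning rule actually used to build the 1° × 1° product is not specified here, and the function name `grid_mean` is hypothetical.

```python
import numpy as np

def grid_mean(lat, lon, value, res=1.0):
    """Average scattered observations onto a regular global lat-lon grid.

    Assumes an unweighted mean per cell (an illustrative choice).
    Returns an (n_lat, n_lon) array with NaN where no observation fell.
    """
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    value = np.asarray(value, dtype=float)

    n_lat = int(round(180.0 / res))
    n_lon = int(round(360.0 / res))

    # Cell indices; clip so that lat = 90 and lon = 180 fall in the last cell.
    i = np.clip(np.floor((lat + 90.0) / res).astype(int), 0, n_lat - 1)
    j = np.clip(np.floor((lon + 180.0) / res).astype(int), 0, n_lon - 1)

    total = np.zeros((n_lat, n_lon))
    count = np.zeros((n_lat, n_lon))
    np.add.at(total, (i, j), value)   # unbuffered add handles repeated cells
    np.add.at(count, (i, j), 1.0)

    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(count > 0, total / count, np.nan)

# Two pixels near the smoke plume share one cell; one pixel lies in the dust region.
gridded = grid_mean([-9.4, -9.6, 25.5], [20.2, 20.7, -12.5], [1.0, 2.0, 0.5])
```

In this toy case three pixels collapse to two occupied cells, so a per-cell forward model would be called twice instead of three times.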
In contrast to the assimilation of retrieved aerosol properties, both aerosol absorption and scattering need to be accounted for when assimilating radiance or OMI AI in the UV spectrum. This requires the inclusion of more dynamic aerosol optical properties in the data assimilation process, that is, properties that vary with region and season. As noted already, even for biomass burning aerosols over South Africa, lower single-scattering albedo values were found at earlier stages of burning seasons (e.g., Eck et al., 2013). A lookup table of aerosol optical properties as functions of region and season will be needed for global implementation of OMI AI and future radiance assimilation for aerosol modeling.
OMI AI is sensitive to above-cloud UV-absorbing aerosols (e.g., Yu et al., 2012; Alfaro-Contreras et al., 2014), and therefore OMI AI values over cloudy scenes were also used in this study. However, OMI AI cannot be used to infer aerosol properties for aerosol plumes beneath a cloud deck. For regions with high clouds, the use of OMI AI data assimilation will likely result in an underestimation of AOD as below-cloud aerosol plumes are not accounted for. Therefore, only OMI AI data over low cloud scenes are to be used for aerosol assimilation efforts. In addition, although some quality assurance steps were applied in this study for the OMI AI data, lower AI values were observed over glint regions near the west coast of Africa. Abnormally high OMI AI values are also seen near the Arctic region; this may be related to the presence of floating ice sheets. Thus, innovative and detailed data screening and quality assurance steps are needed to exclude potentially noisy OMI AI retrievals and for further application of OMI AI data assimilation on a global scale.
Even with these known issues, OMI AI assimilation as presented in the study illustrates a new method for assimilating non-conventional aerosol products. Bearing in mind that OMI AI assimilation is essentially radiance assimilation in the UV spectrum, this study demonstrates the potential of directly assimilating satellite radiance in the UV-visible spectrum for aerosol modeling and analyses.

Conclusions
The OMI aerosol index (AI), which measures the differences between simulated radiances over Rayleigh sky and observed radiances at 354 nm, has been used to detect the presence of absorbing aerosols over both dark and bright surfaces. We have constructed a new assimilation system, based on the VLIDORT radiative transfer code as the major component of the forward model, for the direct assimilation of OMI AI. The aim is to improve accuracies of aerosol analyses over bright surfaces such as cloudy regions and deserts.
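As a reminder of how such an index behaves, the sketch below evaluates a residue-style AI from a pair of radiances. It assumes the commonly used form AI = −100 log10(I_obs / I_calc) at 354 nm, where I_calc is the radiance simulated for a Rayleigh (aerosol-free) atmosphere; this form, the function name `aerosol_index`, and the numerical values are illustrative assumptions and are not taken from this section.

```python
import math

def aerosol_index(i_obs_354, i_calc_354):
    """Residue-style UV aerosol index (assumed form, for illustration only).

    i_obs_354  : observed radiance at 354 nm
    i_calc_354 : radiance simulated for a Rayleigh atmosphere at 354 nm
    """
    return -100.0 * math.log10(i_obs_354 / i_calc_354)

# An observed radiance 2 % below the Rayleigh simulation gives a positive AI
# of about 0.88, the sign expected for UV-absorbing aerosols.
ai_absorbing = aerosol_index(0.98, 1.00)

# Identical observed and simulated radiances give AI = 0.
ai_clear = aerosol_index(1.00, 1.00)
```

With this sign convention, absorbing aerosols such as smoke and dust yield positive values because they lower the observed UV radiance relative to the Rayleigh simulation.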
The performance of the OMI AI data assimilation system was evaluated over south-central and northern African regions for the period of 1 July-31 August 2007. This evaluation was done by intercomparing NAAPS analyses with and without the inclusion of OMI AI data assimilation. Besides cloud-free AI retrievals over dark surfaces, OMI AI retrievals over desert regions were also considered. When compared against AERONET data, a total ∼ 29 % reduction in root mean square error (RMSE) with a ∼ 32 % reduction in absolute error was found for NAAPS analyses with the use of OMI AI assimilation. Also, NAAPS analyses with the inclusion of OMI AI data assimilation show similar aerosol patterns as those in the OMI AI datasets, showing that our OMI AI data assimilation system works as expected.
This study also suggests that NAAPS analyses with OMI AI data assimilation cannot outperform NAAPS reanalysis data incorporated with MODIS and MISR AOD assimilation through validation against AERONET data. This is not surprising, as OMI AI is only a proxy for the AOD and is sensitive to other factors such as surface albedo and aerosol vertical distribution. Also, AERONET data are only available over cloud-free fields of view, so the performance of our OMI AI data assimilation system over cloudy regions has not been evaluated.
There are a number of issues arising from our study. For example, aerosol optical properties are needed for the OMI AI-DA system; these have strong regional and temporal signatures that need to be carefully quantified before applying them to the AI-DA on a global scale. Also, OMI AI retrievals are rather noisy and contain known and unknown biases. Abnormally high OMI AI values are found over mountain regions and polar regions. Sporadic high AI values are also known to occur for reasons that are still not properly understood. Even though quality assurance steps were proposed in this study, detailed analyses of OMI AI data are needed for future implementation of OMI AI data assimilation in aerosol studies.
Lastly, AI values are derived from radiances, and thus the AI-DA system presented in the study can be thought of as a radiance assimilation system for the UV spectrum. This is because the AI-DA system contains all the necessary components for radiance assimilation based on a forward model for calculating not only simulated satellite radiances, but also the aerosol profile Jacobians of these radiances, with both quantities as functions of observation conditions. This study is among the first attempts at radiance assimilation in the UV spectrum and indicates the future potential for direct radiance assimilation in the UV and visible spectra for aerosol analyses and forecasts.
Code and data availability. The OMI data assimilation scheme (V1.0) is constructed using VLIDORT and NAVDAS-AOD for NAAPS analyses and forecasts. The VLIDORT radiative transfer model is the property of RT Solutions Inc. The VLIDORT code is publicly available and comes with a standard GNU public license through direct contact with RT Solutions Inc. (http://www.rtslidort.com/mainprod_vlidort.html, last access: 18 December 2020). Both NAAPS and NAVDAS-AOD are proprietary to the Naval Research Laboratory, United States Department of the Navy. Nevertheless, both NAAPS and NAVDAS-AOD are well documented in past studies (e.g., Lynch et al., 2016; Zhang et al., 2008; Rubin et al., 2017), and we have made every effort to thoroughly report our methods so that they may be replicated. AOD fields from the NAAPS OMI AI-DA and natural runs over the study region and period are shared in the Supplement to the paper for readers who are interested. The NAAPS reanalysis data are available from the USGODAE website (https://nrlgodae1.nrlmry.navy.mil/cgi-bin/datalist.pl?dset=nrl_naaps_reanalysis&summary=Go, last access: 18 December 2020, Naval Research Laboratory Monterey, Lynch et al., 2016). The OMI OMAERUV data are available from the NASA Goddard Earth Sciences Data and Information Services Center (GES DISC; https://doi.org/10.5067/Aura/OMI/DATA2004, Torres, 2006). AERONET data are obtained from the NASA AERONET web page (https://aeronet.gsfc.nasa.gov/cgi-bin/draw_map_display_inv_v3, last access: 18 December 2020, NASA AERONET team, Holben et al., 1998; Giles et al., 2019).
Author contributions. All authors contributed to the overall design of the study. Authors JZ and RS coded the system. Author JSR provided valuable suggestions throughout the study. Author PX assisted with the evaluation of the system.
Competing interests. The authors declare that they have no conflict of interest.