
Abstract. We present an aerosol data assimilation system based on a global aerosol climate model (SPRINTARS – Spectral Radiation-Transport Model for Aerosol Species) and a four-dimensional variational data assimilation method (4D-Var). Its main purposes are to optimize emission estimates, improve composites, and obtain the best estimate of the radiative effects of aerosols in conjunction with observations. To reduce the huge computational cost caused by the iterative integrations in the models, we developed an offline model and a corresponding adjoint model, which are driven by pre-calculated meteorological, land, and soil data. The offline and adjoint models shortened the computational time of the inner loop by more than 30 %. By comparing the results with a 1 yr simulation from the original online model, the consistency of the offline model was verified, with correlation coefficient R > 0.97 and a small absolute value of the normalized mean bias (NMB). The feasibility and capability of the developed system for aerosol inverse modelling were demonstrated in several inversion experiments based on the observing system simulation experiment framework. In the experiments, we used simulated observation data sets of fine- and coarse-mode AOTs from sun-synchronous polar orbits to investigate the impact of the observational frequency (number of satellites) and coverage (land and ocean), and assigned aerosol emissions to control parameters. Observations over land have a notably positive impact on the performance of inverse modelling as compared with observations over ocean, implying that reliable observational information over land is important for inverse modelling of land-borne aerosols. The experimental results also indicate that information that differentiates between aerosol species is crucial to inverse modelling over regions where various aerosol species coexist (e.g. industrialized regions and areas downwind of them).


Introduction
It is well known that airborne aerosols play an important role in air quality, acid rain, and human health (Pope et al., 2002). Furthermore, aerosols crucially impact climate and weather through complicated processes (i.e. direct, semidirect, first indirect, and second indirect effects). Rodwell and Jung (2008) reported that updated aerosol climatology leads to improvements in forecast skill and error reduction in precipitation and wind for the forecast model of the European Centre for Medium-range Weather Forecasts (ECMWF). Their results indicate that aerosols and weather are strongly connected, and that large uncertainties remain in the description of aerosols. Recently, sophisticated chemical transport models (CTMs) have been developed at several research institutes that have provided insight into various aspects of aerosols (e.g. emissions, transport, deposition, and climate effects). However, uncertainties remain in the model results. The aerosol model inter-comparison (AeroCom) found that there is large diversity among the models in emissions, composition, and optical properties (Textor et al., 2006; Kinne et al., 2006). The Fourth Assessment Report (AR4) of the IPCC (Forster et al., 2007) suggests that scientific understanding of aerosol radiative forcing is still at a mid-low to low level, and its uncertainty is greater than that for long-lived greenhouse gases.

Published by Copernicus
Data assimilation, which optimizes initial conditions and model parameters with observational constraints and has contributed substantially to the development of numerical weather prediction (NWP), has recently also been applied to CTMs. For gaseous species, Elbern et al. (1997) applied the four-dimensional variational data assimilation method (4D-Var) to the European Air Pollution Dispersion (EURAD) CTM and optimized ozone initial conditions over central Europe (Elbern and Schmidt, 2001). Chai et al. (2006, 2007) developed a Sulfur Transport Eulerian Model (STEM) 4D-Var system and assimilated a data set from an observational campaign. Assimilation methods have also been extended to inverse modelling of various gaseous species (e.g. Yumimoto and Uno, 2006; Stavrakou and Müller, 2006; Elbern et al., 2007; Kopacz et al., 2009; Stavrakou et al., 2009). More recently, an 8 yr reanalysis of atmospheric composition was produced by the Monitoring Atmospheric Composition and Climate (MACC) project with the ECMWF's Integrated Forecast System (Inness et al., 2013).
For airborne aerosols, Hakami et al. (2005) performed inverse modelling of black carbon emissions with the STEM 4D-Var system. Yumimoto et al. (2008, 2012) estimated dust emission and particle size distributions of extreme dust storms over East Asia with an in situ lidar network and the RAMS/CFORS-4DVAR (RC4) data assimilation system. Dubovik et al. (2008) optimized global aerosol sources from satellite data using the adjoint of the GOCART model. Wang et al. (2012) performed a top-down estimate of dust emission with satellite measurements and the GEOS-Chem adjoint model (Henze et al., 2007). As part of the MACC project, aerosol optical thickness (AOT) measured by satellites was assimilated in the Integrated Forecast System with the 4D-Var method (Benedetti et al., 2009). Zhang et al. (2008) assimilated AOT in an operational forecast system with the two-dimensional variational data assimilation method (2D-Var). Huneeus et al. (2012, 2013) performed top-down estimates of aerosol emission inventories with total and fine-mode AOT measured by satellites. In addition to the variational method, ensemble-based assimilation methods have also been applied to CTMs (Constantinescu et al., 2007a, b; Sekiyama et al., 2010; Schutgens et al., 2010; Yumimoto and Takemura, 2011; Miyazaki et al., 2012). Yet, compared with NWP, data assimilation for aerosol species is still in the development stage.
Here we present a data assimilation system based on 4D-Var and the global aerosol climate model Spectral Radiation-Transport Model for Aerosol Species (SPRINTARS) with the ultimate aim of optimizing emission estimates, improving four-dimensional descriptions, and obtaining the best estimate of the climate effect of airborne aerosols in conjunction with various observations. To reduce the huge computational cost arising from the iterative integration of the forward and adjoint models, we have developed an offline version of SPRINTARS. An adjoint version of SPRINTARS was developed based on the offline model. To assess the capability of the system in inverse modelling applications, we performed several inversion experiments based on the observing system simulation experiment (OSSE) framework. The experiments also examined the impact of the observation frequency (number of satellites) and coverage (land and ocean) on the inversion results.
The paper is structured as follows. Section 2 briefly describes the methodology of 4D-Var for aerosol data assimilation and inverse modelling. Section 3 describes the SPRINTARS/4D-Var data assimilation system. The offline and adjoint models used in the system are also presented. In Sect. 4, we validate the offline model with respect to the original online model. Section 5 describes the inversion tests that we performed. The impact of observational frequency and coverage on the inversion is analysed. Finally, Sect. 6 presents our conclusions.

The 4D-Var data assimilation method with aerosol transport model
At a given time step t, the evolution of the aerosol transport model is described as

$$C_{t+1} = M_t(C_t, E), \tag{1}$$

where $C$ and $E$ are vectors of the aerosol mass concentration of dimension m and emission of dimension l, respectively. To simplify the problem setup, here we assume that the emission is constant over time. $M$ denotes the model operator, which includes advection, diffusion, chemical reaction, deposition, emission, and feedback of the aerosols. Using a unified vector $x = [C, E]^T$ of dimension n = m + l, Eq. (1) can be redefined as

$$x_{t+1} = M_t(x_t). \tag{2}$$

In the 4D-Var method (Talagrand and Courtier, 1987), we define the cost function (J) as follows:

$$J = J_C + J_E + J_o, \tag{3}$$

$$J_C = \frac{1}{2}(C_0 - C_b)^T B_C^{-1}(C_0 - C_b), \tag{4}$$

$$J_E = \frac{1}{2}(E - E_b)^T B_E^{-1}(E - E_b), \tag{5}$$

$$J_o = \frac{1}{2}(H(x_0) - y)^T R^{-1}(H(x_0) - y). \tag{6}$$

Here $J_C$ and $J_E$ are called background terms, which guarantee the uniqueness of the optimized solution even in the underdetermined problem in which n is larger than the number of observations p; $B_C$ and $B_E$ are the background error covariance matrices of dimensions m × m and l × l for concentration and emission, respectively; $C_0$ is the initial aerosol concentration at t = 0; and $C_b$ and $E_b$ represent the background or a priori values of concentration and emission, respectively. Therefore, $J_C$ and $J_E$ are measures of the deviation from the background value weighted by the background error covariance. $J_o$ represents the observational term, which measures the distance between the observation ($y$) and modelled values; $H$ is given by

$$H(x_0) = \hat{H}\left(M_{t-1}(\cdots M_1(M_0(x_0)))\right), \tag{7}$$

where $\hat{H}$ denotes the observation operator, which maps the model state into the observation state; $x_0 = [C_0, E]^T$ is the control parameter of dimension n; and $R$ represents the observation error covariance matrix.
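To make the structure of the cost function concrete, the following minimal Python sketch evaluates the background and observational terms for small vectors. It assumes the standard quadratic 4D-Var form for each term and, purely to avoid matrix inversion, diagonal error covariance matrices; the function names are hypothetical and are not part of SPRINTARS/4D-Var.

```python
# Toy evaluation of J = J_C + J_E + J_o (illustrative only).
# Assumption: all error covariance matrices are diagonal, so each
# matrix is represented by a list of variances.

def quad_term(x, x_ref, variances):
    """Return 0.5 * (x - x_ref)^T diag(variances)^-1 (x - x_ref)."""
    return 0.5 * sum((a - b) ** 2 / v for a, b, v in zip(x, x_ref, variances))

def cost(c0, e, c_b, e_b, var_c, var_e, modelled, obs, var_obs):
    """Sum the two background terms and the observational term.

    `modelled` holds the model values already mapped to observation
    space, i.e. H(x0); computing it requires a forward model run.
    """
    j_c = quad_term(c0, c_b, var_c)          # deviation of C0 from background
    j_e = quad_term(e, e_b, var_e)           # deviation of E from a priori
    j_o = quad_term(modelled, obs, var_obs)  # model-observation misfit
    return j_c + j_e + j_o

# Hypothetical one-element example
j = cost([1.0], [2.0], [0.0], [1.0], [1.0], [4.0], [0.5], [0.3], [0.04])
print(j)
```

A larger background variance lets the solution move further from the a priori value for the same observational misfit, which is how the relative weights of the terms are set.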
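To obtain the optimal solution in which the cost function is minimized, the gradient of the cost function is required. This gradient of the cost function with respect to the control parameter $x_0$ is given by

$$\nabla_{x_0} J = \begin{bmatrix} B_C^{-1}(C_0 - C_b) \\ B_E^{-1}(E - E_b) \end{bmatrix} + \mathbf{H}^T R^{-1}(H(x_0) - y), \tag{8}$$

where $\mathbf{H}$ is the tangent linear of $H$ given by

$$\mathbf{H} = \hat{\mathbf{H}}\,\mathbf{M}_{t-1} \cdots \mathbf{M}_1 \mathbf{M}_0, \tag{9}$$

in which $\hat{\mathbf{H}}$ and $\mathbf{M}$ represent the tangent linear versions of the observation operator $\hat{H}$ and the model evolution $M$. The dimension of $\nabla_{x_0} J$ is n.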
In 4D-Var, the adjoint model is used to calculate Eq. (8). The adjoint model of Eq. (1) is derived as follows:

$$\chi_t = \mathbf{M}_t^T \chi_{t+1} + \varphi_t. \tag{10}$$

Here $\chi_t = [\lambda_t, \varepsilon]^T$, where $\lambda$ and $\varepsilon$ are the adjoint variables for $C$ and $E$, respectively, and $\mathbf{M}_t^T$ is the transpose of the tangent linear model. In addition, $\varphi$, which represents the residual between the observations and modelled values, drives the adjoint model and is given by

$$\varphi_t = \hat{\mathbf{H}}^T R^{-1}\left(\hat{H}(x_t) - y\right), \tag{11}$$

evaluated with the observations available at time step t. As the subscripts in Eq. (10) show, the adjoint models are integrated from the final time t = T to the initial time t = 0, backward in time. The adjoint variables at time step t represent the sensitivity of the observational part of the cost function with respect to the concentration and emission at time step t. With the adjoint variables at the initial time, the gradient of the cost function with respect to the control parameter is obtained as follows:

$$\nabla_{x_0} J = \begin{bmatrix} B_C^{-1}(C_0 - C_b) + \lambda_0 \\ B_E^{-1}(E - E_b) + \varepsilon \end{bmatrix}. \tag{12}$$

The optimal solution of the initial conditions $C_0$ and emission $E$ that minimizes the cost function is obtained in an iterative manner. At each iteration step, the cost function and its gradient are re-calculated with updated initial conditions and emission.
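The backward integration can be illustrated with a scalar toy problem. The sketch below is not the SPRINTARS adjoint; it assumes a hypothetical linear model c[t+1] = a*c[t] + e (decay plus constant emission), an identity observation operator, and a single observation error variance r. One backward sweep returns the gradient of the observational term with respect to both the initial concentration and the emission.

```python
# Toy adjoint for the linear model c[t+1] = a*c[t] + e (illustrative only).

def forward(c0, e, a, n_steps):
    """Integrate the toy model and return the trajectory c[0..n_steps]."""
    traj = [c0]
    for _ in range(n_steps):
        traj.append(a * traj[-1] + e)
    return traj

def obs_cost(c0, e, a, obs, r):
    """J_o = 0.5 * sum_t (c[t] - y[t])^2 / r, with obs[t-1] valid at step t."""
    traj = forward(c0, e, a, len(obs))
    return 0.5 * sum((traj[t + 1] - y) ** 2 / r for t, y in enumerate(obs))

def adjoint_gradient(c0, e, a, obs, r):
    """Return (dJ_o/dc0, dJ_o/de) from a single backward sweep."""
    traj = forward(c0, e, a, len(obs))
    lam, eps = 0.0, 0.0                    # adjoint variables beyond the final time
    for t in range(len(obs), 0, -1):       # t = T, T-1, ..., 1
        lam += (traj[t] - obs[t - 1]) / r  # add the forcing (weighted residual)
        # Transpose of the tangent linear step [[a, 1], [0, 1]]:
        eps = eps + lam                    # emission adjoint accumulates lam
        lam = a * lam                      # concentration adjoint steps back
    return lam, eps

# Check against central finite differences (hypothetical numbers)
c0, e, a, r = 1.0, 0.5, 0.9, 0.25
obs = [1.2, 1.5, 1.9]
g_c0, g_e = adjoint_gradient(c0, e, a, obs, r)
h = 1e-6
fd_c0 = (obs_cost(c0 + h, e, a, obs, r) - obs_cost(c0 - h, e, a, obs, r)) / (2 * h)
fd_e = (obs_cost(c0, e + h, a, obs, r) - obs_cost(c0, e - h, a, obs, r)) / (2 * h)
print(g_c0, fd_c0, g_e, fd_e)
```

The adjoint sweep costs roughly one model integration regardless of the number of control parameters, whereas finite differences need two forward runs per parameter; this is the reason the adjoint approach scales to gridded emissions.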

The SPRINTARS/4D-Var data assimilation system
A schematic diagram of the SPRINTARS/4D-Var data assimilation system is shown in Fig. 1. The system is composed of three major processes: a priori run, inner loop, and a posteriori run. The a priori run (Fig. 1a) is a standard run by the original online SPRINTARS (hereafter referred to as ONS) before data assimilation. The inner loop (Fig. 1b) is the main core of 4D-Var, and consists of a forward run, backward run, and optimization process. In this iterative cycle, observations are assimilated, and the initial and boundary conditions and aerosol emissions are optimized to minimize the cost function. For the optimization, more than 10 iterative integrations of forward and adjoint runs are required. To reduce this computational time, we developed an offline version of SPRINTARS (hereafter referred to as OFS) and a corresponding adjoint model (ADJ), which avoid integrating the meteorological and radiative processes of the coupled general circulation model (GCM). Meteorological, land, and soil data pre-calculated by the a priori run are stored and used to drive OFS and ADJ in the inner loop. Using optimized initial and boundary conditions and emissions, the a posteriori run is performed by ONS (Fig. 1c). The a posteriori run provides assimilated (a posteriori) estimates of 4-dimensional distributions, deposition fluxes, and radiative forcing of aerosols.
To account for the non-linearity and aerosol feedback of the system in the assimilation, we can update the input meteorological, land, and soil data to those provided by the a posteriori run instead of the a priori run, and then perform the inner loop again to optimize the control parameters with the updated input data. This update is called the outer loop (e.g. Huang et al., 2009). The following subsections give detailed descriptions of each component.
In this study, we use SPRINTARS version 3.84 with T42 horizontal resolution (approximately 2.8° × 2.8°) and 20 vertical sigma layers. Both anthropogenic and biomass burning emissions are based on Lamarque et al. (2010). The soil dust emission is represented as a function of the cube of the wind velocity at 10 m height depending on land use, soil texture, vegetation, leaf area index (LAI), soil moisture, and snow cover. The sea salt spray emission is proportional to the 3.2 power of the wind velocity at 10 m height over the ocean without sea ice. Sulfur dioxide (SO2) emissions from volcanic eruptions are based on the Global Emissions Inventory Activity (GEIA) database (Andres and Kasgnoc, 1998). The reader can refer to Takemura (2012) for details of the aerosol emissions in SPRINTARS. Reanalysis products provided by the National Centers for Environmental Prediction (NCEP)/National Center for Atmospheric Research (NCAR) are used for nudging of the horizontal wind and temperature in the MIROC.

Offline SPRINTARS
To reduce the computational time of the inner loop, we developed an offline version of SPRINTARS (OFS). OFS is driven by meteorological data pre-calculated by ONS, and only advection, diffusion, chemistry, dry and wet depositions, gravitational settling, and emissions of aerosols are calculated, skipping the integrations of the dynamic core and physical package of the MIROC. The input meteorological data include principal meteorological variables (e.g. wind velocity, temperature, pressure, humidity, temperature at 2 m height, and wind velocity at 10 m height), soil and land information (soil moisture, snow amount, LAI, and sea ice), cloud and precipitation information (precipitation flux, cloud cover, cumulus fraction, cloud water, and water/ice partition), and radiative variables (long- and short-wave heating rate), which are linearly interpolated to the model time step. Compared to the ONS, the OFS is faster by a factor of 1.5 with T42 resolution. With finer resolution, the computational efficiency of OFS should be even more pronounced. We validate OFS in Sect. 4.
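The linear interpolation of stored driving data to the model time step can be sketched as follows. The storage interval is not stated here, so the times and values in the example are hypothetical, and the function name is illustrative rather than taken from the model code.

```python
def interp_in_time(t, t0, f0, t1, f1):
    """Linearly interpolate a stored value between output times t0 and t1.

    t, t0 and t1 share one time unit; f0 and f1 are the stored values
    (e.g. 10 m wind speed at one grid point) at t0 and t1.
    """
    w = (t - t0) / (t1 - t0)  # weight of the later record, between 0 and 1
    return (1.0 - w) * f0 + w * f1

# Hypothetical records 3 h apart, evaluated 1 h after the first one
print(interp_in_time(1.0, 0.0, 10.0, 3.0, 16.0))
```

Any variability between the two stored records is lost by construction, which is relevant to the emission biases discussed in Sect. 4.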

The adjoint of offline SPRINTARS
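The adjoint version of SPRINTARS (ADJ) is derived directly from the discrete equation of OFS. The adjoint model of Eq. (13) becomes

$$\lambda_t = \mathbf{M}_{\mathrm{advc}}^T \mathbf{M}_{\mathrm{diff}}^T \mathbf{M}_{\mathrm{chem}}^T \mathbf{M}_{\mathrm{depo}}^T \mathbf{M}_{\mathrm{emiss}}^T \lambda_{t+1} + \varphi_t, \tag{14}$$

where $\mathbf{M}_{\mathrm{advc}}$, $\mathbf{M}_{\mathrm{diff}}$, $\mathbf{M}_{\mathrm{chem}}$, $\mathbf{M}_{\mathrm{depo}}$, and $\mathbf{M}_{\mathrm{emiss}}$ represent the tangent linear (or Jacobian) of the advection, diffusion, chemistry, deposition, and emission operators $M_{\mathrm{advc}}$, $M_{\mathrm{diff}}$, $M_{\mathrm{chem}}$, $M_{\mathrm{depo}}$, and $M_{\mathrm{emiss}}$, respectively, and $\varphi$ measures the residual between the model and observations (Eq. 11) and drives the adjoint model. Integration of the adjoint model from the final time step t = T to the initial time step t = 0 propagates observational information measured in the assimilation window backward in time as the adjoint variables, and calculates the gradient of J with respect to the initial conditions and emission (see Eqs. 8 and 12). In the same way as for OFS, meteorological, land, and soil data pre-calculated by ONS are used to drive ADJ.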

The optimization process
The optimization (or descent) process numerically searches for the minimum of the cost function using its gradient, and is performed after each iteration. The quasi-Newton limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS; Liu and Nocedal, 1989) algorithm and the conjugate gradient method are available in SPRINTARS/4D-Var. We adopted the L-BFGS algorithm in this study.
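For readers unfamiliar with L-BFGS, the following pure-Python sketch shows the core idea: a two-loop recursion builds a search direction from a short history of parameter and gradient differences, so no Hessian is ever stored. It is a simplified illustration, not the routine used in SPRINTARS/4D-Var; in particular, the backtracking (Armijo) line search, the memory length, and the tolerances are assumptions chosen for brevity.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def lbfgs_direction(grad, s_hist, y_hist):
    """Two-loop recursion: approximate -(inverse Hessian) * grad."""
    q = list(grad)
    alphas = []
    for s, y in zip(reversed(s_hist), reversed(y_hist)):  # newest to oldest
        a = dot(s, q) / dot(y, s)
        alphas.append(a)
        q = [qi - a * yi for qi, yi in zip(q, y)]
    if s_hist:  # initial scaling from the most recent pair
        gamma = dot(s_hist[-1], y_hist[-1]) / dot(y_hist[-1], y_hist[-1])
        q = [gamma * qi for qi in q]
    for s, y, a in zip(s_hist, y_hist, reversed(alphas)):  # oldest to newest
        b = dot(y, q) / dot(y, s)
        q = [qi + (a - b) * si for qi, si in zip(q, s)]
    return [-qi for qi in q]

def minimize_lbfgs(fun, grad, x0, memory=5, max_iter=50, tol=1e-10):
    x, g = list(x0), grad(x0)
    s_hist, y_hist = [], []
    for _ in range(max_iter):
        if dot(g, g) ** 0.5 < tol:
            break
        d = lbfgs_direction(g, s_hist, y_hist)
        step, f0, slope = 1.0, fun(x), dot(g, d)
        # Backtracking line search with a sufficient-decrease condition
        while (fun([xi + step * di for xi, di in zip(x, d)])
               > f0 + 1e-4 * step * slope and step > 1e-12):
            step *= 0.5
        x_new = [xi + step * di for xi, di in zip(x, d)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        if dot(s, y) > 1e-12:  # keep only pairs with positive curvature
            s_hist.append(s)
            y_hist.append(y)
            if len(s_hist) > memory:
                s_hist.pop(0)
                y_hist.pop(0)
        x, g = x_new, g_new
    return x

# Hypothetical two-parameter quadratic cost with its minimum at (3, -1)
cost = lambda x: (x[0] - 3.0) ** 2 + 10.0 * (x[1] + 1.0) ** 2
cost_grad = lambda x: [2.0 * (x[0] - 3.0), 20.0 * (x[1] + 1.0)]
print(minimize_lbfgs(cost, cost_grad, [0.0, 0.0]))
```

In the assimilation system, each evaluation of the cost corresponds to a forward run and each evaluation of the gradient to an adjoint run, so the number of iterations directly sets the computational cost of the inner loop.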

Validation of the offline model
In this section, the fidelity of OFS is examined by comparison with ONS. In the examination, we focus on the aerosol optical variables (i.e. et al., 2011; Zhang et al., 2008; Sekiyama et al., 2010; Schutgens et al., 2010; Yumimoto et al., 2008). We performed a 1 yr integration of ONS for 2007 with a 1 yr spin-up run, and then drove OFS by the meteorological data calculated by ONS with the same initial conditions and anthropogenic and biomass burning emissions. Emissions of natural aerosols (i.e. dust, sea salt, and DMS) were calculated in each model and compared between OFS and ONS. The comparison used 1 yr model outputs between 70° E and 70° N at 3 h intervals. Table 1 summarizes the statistical results for the aerosol optical variables, including the root mean square error (RMSE), correlation coefficient (R), linear least squares best-fit slope and intercept, normalized mean bias (NMB), normalized mean error (NME), and the number of data points used for calculating the statistics. Formulations of the statistical measures are given in Appendix A. Scatter plots of ONS versus OFS results are shown in Fig. 2.
Colours denote the frequency of occurrence on a log scale; light blue and turquoise represent occurrence frequency ranges of 100-1000 (0.00054 %-0.0054 %) and 1000-10 000 (0.0054 %-0.054 %), respectively. The AOTs and emissions simulated by OFS successfully reproduce those by ONS with R > 0.97 and slopes between 0.91 and 1.05. The NMB, for which factor-of-2 under- and over-predictions correspond to −50 % and 100 %, respectively, and the NME also show good agreement. Slight overestimates are found in AOT for sulfate and carbon aerosols; the NMBs are 6.9 % and 4.0 %, respectively. In addition to the error in transport due to the use of a time-interpolated wind velocity, the main reason for the overestimation of these aerosols is their hygroscopicity. It is well known that hygroscopic aerosols can absorb water and change their particle sizes and optical properties depending on their chemical characteristics (e.g. Tang, 1996). In the model, the growth rates and extinction cross sections of sulfate, carbon, and sea-salt aerosols depend on the relative humidity (RH), and increase significantly with the RH under very humid conditions (RH > 80 %) (Takemura et al., 2000). The difference between the simulated RH in ONS and the time-interpolated RH in OFS can produce these overestimates. Sea-salt aerosol is also hygroscopic, and its particle size growth rate and optical properties also depend on the RH. Underestimation of its emission, as explained in the following paragraph, however, outweighs the overestimation.
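The statistical measures used in this comparison can be sketched in a few lines of Python. The exact formulations are those of Appendix A, which is not reproduced here; the definitions below are the commonly used ones and are assumed rather than confirmed. They are at least consistent with the statement that factor-of-2 under- and over-predictions give NMB values of −50 % and 100 %. In the comparison, `model` would hold OFS values and `ref` the corresponding ONS values.

```python
import math

def rmse(model, ref):
    """Root mean square error."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, ref)) / len(ref))

def correlation(model, ref):
    """Pearson correlation coefficient R."""
    n = len(ref)
    mean_m, mean_o = sum(model) / n, sum(ref) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, ref))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_o = sum((o - mean_o) ** 2 for o in ref)
    return cov / math.sqrt(var_m * var_o)

def nmb(model, ref):
    """Normalized mean bias in percent (assumed: sum of errors / sum of ref)."""
    return 100.0 * sum(m - o for m, o in zip(model, ref)) / sum(ref)

def nme(model, ref):
    """Normalized mean error in percent (assumed: sum of |errors| / sum of ref)."""
    return 100.0 * sum(abs(m - o) for m, o in zip(model, ref)) / sum(ref)

# Hypothetical reference values and factor-of-2 over- and under-predictions
ref = [1.0, 2.0, 4.0]
print(nmb([2.0, 4.0, 8.0], ref), nmb([0.5, 1.0, 2.0], ref))
```

The asymmetry of the NMB (bounded below by −100 % but unbounded above) is the reason the factor-of-2 values are not symmetric around zero.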
Values of AOT for aeolian aerosols (sea salt and dust) by OFS show small underestimations of −6.0 % and −6.1 % in the NMB. These can mostly be attributed to underestimations in their emissions (Tables 1-3). As mentioned in Sect. 3, the emission fluxes of sea-salt and dust aerosols are proportional to the 3.2 power and the cube of the wind velocity near the surface, respectively. Moreover, the threshold wind velocity partly contributes to underestimation of the dust emission; dust is emitted only when the wind velocity exceeds the threshold velocity. The time-interpolated wind velocity cannot reproduce the fine-scale variation of the simulated one, which results in the underestimation of emissions and AOT. Overall, however, the emissions of aeolian aerosols by OFS show good agreement with those by ONS. The relatively large values of RMSE and NME for dust emission are attributed to a large underestimation (∼800 g km−2 s−1) of a strong dust storm in the Taklimakan desert on 9 May 2007; the lower wind velocity at 10 m of OFS (13.1 m s−1 in OFS versus 14.0 m s−1 in ONS) is primarily responsible for that.
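Why a smoothed wind systematically lowers a power-law emission can be seen in a small numerical example. The sketch below uses a deliberately simplified flux law (proportionality constant of one, a single threshold, and hypothetical wind values); it is not the SPRINTARS emission scheme, which also depends on land and soil properties.

```python
def emission_flux(u10, threshold=0.0, power=3.0):
    """Flux proportional to u10**power, emitted only above a threshold wind."""
    return u10 ** power if u10 > threshold else 0.0

def mean_flux(winds, threshold=0.0, power=3.0):
    """Time-mean flux over a series of wind speeds."""
    return sum(emission_flux(u, threshold, power) for u in winds) / len(winds)

# Hypothetical 10 m winds (m/s): the online series fluctuates around 8 m/s,
# while the time-interpolated series only carries the smooth value.
online = [10.0, 6.0, 10.0, 6.0]
interpolated = [8.0, 8.0, 8.0, 8.0]

# Cubic law without threshold: same mean wind, lower mean flux when smoothed
print(mean_flux(online), mean_flux(interpolated))

# With a 9 m/s threshold the smoothed series never emits at all
print(mean_flux(online, threshold=9.0), mean_flux(interpolated, threshold=9.0))
```

Because the flux law is convex in wind speed, the mean of the cubes exceeds the cube of the mean, so any loss of variability biases the emission low even when the mean wind is preserved.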
The Ångström exponent of OFS agrees well with that of ONS, but its scatter plot (Fig. 2f) exhibits a large spread in the distribution compared with AOTs and emissions. The Ångström exponent, defined as the slope of the AOT between the wavelengths of 440 and 870 nm, is commonly used as an indicator of the aerosol size distribution. Errors from finer (i.e. sulfate and carbonaceous) and coarser (i.e. sea-salt and dust) aerosols accumulate and lead to the broad scatter plot.
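As a brief illustration, the Ångström exponent can be computed from the AOT at two wavelengths as the negative slope in log-log space. The sketch below uses this standard two-wavelength definition; the function name and the sample AOT values are hypothetical.

```python
import math

def angstrom_exponent(aot_1, aot_2, wavelength_1=440.0, wavelength_2=870.0):
    """Negative log-log slope of AOT versus wavelength (wavelengths in nm)."""
    return -math.log(aot_1 / aot_2) / math.log(wavelength_1 / wavelength_2)

# Hypothetical AOTs following a power law with exponent 1.5
aot_440 = 0.30
aot_870 = aot_440 * (870.0 / 440.0) ** -1.5
print(angstrom_exponent(aot_440, aot_870))
```

Larger exponents indicate a stronger wavelength dependence and hence finer particles. Since the exponent is a ratio of logarithms of two AOTs, small relative errors in either AOT are amplified, which is consistent with the wider scatter noted above.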
Figure 3 shows spatial distributions of the total AOT of ONS and bias (OFS minus ONS) for each individual aerosol component. The OFS successfully reproduces the spatial distribution of AOT, and the bias is limited to between −0.04 and 0.05 except in southwestern China. On the one hand, the positive bias in industrialized regions (i.e. Europe, East Asia, and the eastern coast of North America) is dominated by sulfate aerosol. On the other hand, carbonaceous aerosol contributes to overestimations over biomass burning sources (i.e. Central Africa, Southeast Asia, and Central and South America). The relatively large positive bias (∼0.09 of the total AOT) around southwestern China and Southeast Asia is attributed to dense concentrations of sulfate and carbonaceous aerosols and the very humid conditions. The negative bias is attributed to sea-salt and dust aerosols (Fig. 3a, b). Tables 2 and 3 summarize the emission amounts of sea-salt and dust aerosols. Underestimates of sea-salt (∼0.6 %) and dust (∼5 %) emissions are found for major emission sources. An underestimation of the sea-salt AOT forms a zonal band over the zone of westerlies (30 ...). Underestimates of natural aerosols seem to be systematic, and there is no regional dependency among sources. Vertical distributions of the annual and zonal means of the total aerosol extinction coefficient and bias (OFS minus ONS) for each individual aerosol component are exhibited in Fig. 4.
Vertical distributions by OFS agree well with those by ONS. On the one hand, large biases are found in the lower layer (sigma level > 0.85) for each individual aerosol component. On the other hand, in the upper region, the difference between ONS and OFS is quite small (< 0.001 km−1) except above the Equator. The bias in the lower layer (sigma level > 0.85), where most aerosols exist (total aerosol extinction coefficient > 0.06 km−1), ranges from −8.0 % to 8.4 %. Sulfate and carbonaceous aerosols show overestimates in the Northern Hemisphere. However, the negative biases of sea-salt and dust aerosols outweigh them in the total extinction coefficient. Sea-salt aerosol by OFS exhibits a symmetric distribution of negative bias (see Fig. 3e). Underestimations of the dust aerosol are concentrated in the 0°-50° N range, where dust sources are situated. Positive biases are found around the Equator for every aerosol (especially for dust). A possible explanation for this is underestimation of wet removal due to the interpolated precipitation and cloud variables in OFS.

Inversion experiments based on the OSSE framework
We performed several inversion tests based on the OSSE framework to assess the capabilities of SPRINTARS/4D-Var in inverse modelling applications. The OSSE framework is a powerful tool used to evaluate the potential impact of a future or planned observing system on a data assimilation application and is also useful for assessing the performance of the data assimilation system (Masutani et al., 2010). With CTMs, Edwards et al. (2009), Zoogman et al. (2011), Sekiyama et al. (2012), and Yumimoto (2013) have carried out OSSEs for future geostationary satellites and space-borne lidars.
In the OSSE framework, the nature run (NR), the simulated observation, and the control run (CR) are defined. The NR is a proxy of the "true" state, and is usually derived from a standard model simulation. The simulated observation, a representation of the observation data measured by the observing system we want to examine, is retrieved from the NR. The CR is used as an "alternative" state and is generated by a model simulation with different parameter settings, other meteorological data, and perturbed emissions. With the CR, the simulated observation is assimilated to generate the analysis run (AR). By comparing the AR with the NR (estimating how close the AR is to the NR), we can evaluate the impact of the simulated observation in the assimilation and the capabilities of the assimilation system.
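The roles of the NR, simulated observation, CR, and AR can be illustrated with a one-parameter twin experiment. The sketch below is a deliberately minimal stand-in for the real system: the "model" is a hypothetical linear response of AOT to a single emission, the observations are noise-free, and the analysis minimizes a scalar cost function in closed form rather than with an adjoint and iterative descent. All numbers are illustrative.

```python
def run_model(emission, response):
    """Toy linear 'transport model': AOT at each time = emission * response."""
    return [emission * g for g in response]

def analyse_scaling(control, obs, bg_var, obs_var):
    """Scaling factor s minimizing
    J(s) = 0.5*(s - 1)^2/bg_var + 0.5*sum((s*control_t - obs_t)^2)/obs_var.
    """
    num = 1.0 / bg_var + sum(c * y for c, y in zip(control, obs)) / obs_var
    den = 1.0 / bg_var + sum(c * c for c in control) / obs_var
    return num / den

response = [0.02, 0.05, 0.08, 0.10]  # hypothetical AOT per unit emission
e_true, e_prior = 1.0, 1.6           # "true" and perturbed emissions

nature = run_model(e_true, response)    # nature run (NR)
obs = list(nature)                      # simulated observations from the NR
control = run_model(e_prior, response)  # control run (CR)

# Background error of 200 % and observational error of 0.05 (as variances)
s = analyse_scaling(control, obs, bg_var=2.0 ** 2, obs_var=0.05 ** 2)
e_analysis = s * e_prior                # emission used by the analysis run (AR)
print(e_prior, e_analysis, e_true)
```

The analysis moves most of the way from the perturbed emission towards the true one; the small remaining offset comes from the background term, which pulls the scaling factor towards one.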

Experimental setting
Figure 5 shows a schematic diagram of our inversion experiments. The NR is derived from a model simulation driven by a standard set of emissions. The aerosol fields produced by the NR are used to generate the simulated observations. For the simulated observations, we consider fine- and coarse-mode AOTs provided by the Level 2 Moderate Resolution Imaging Spectroradiometer (MODIS) product (Remer et al., 2005, 2008). The simulated fine-mode AOT is generated with AOTs of sulfate, carbonaceous, sea-salt (two finer bins), and dust (three finer bins) aerosol fields calculated by the NR. The simulated coarse-mode AOT consists of the two coarser bins of sea-salt and the three coarser bins of dust aerosols.
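The composition of the two simulated observables can be written down directly. The sketch below assumes that the sea-salt and dust bins are ordered from finest to coarsest (four and six bins, respectively, as implied by the split described above); the function names and AOT values are hypothetical. It also computes the fine mode fraction, i.e. the fraction of the total AOT contributed by the fine mode.

```python
def simulated_modes(sulfate, carbon, sea_salt_bins, dust_bins):
    """Return (fine-mode AOT, coarse-mode AOT) from species AOTs.

    Assumes sea_salt_bins has 4 entries and dust_bins has 6 entries,
    each ordered from the finest to the coarsest size bin.
    """
    fine = sulfate + carbon + sum(sea_salt_bins[:2]) + sum(dust_bins[:3])
    coarse = sum(sea_salt_bins[2:]) + sum(dust_bins[3:])
    return fine, coarse

def fine_mode_fraction(fine, coarse):
    """Fraction of the total AOT composed of the fine-mode AOT."""
    return fine / (fine + coarse)

# Hypothetical column AOTs at one grid point
fine, coarse = simulated_modes(
    sulfate=0.10, carbon=0.05,
    sea_salt_bins=[0.01, 0.02, 0.03, 0.04],
    dust_bins=[0.01, 0.01, 0.02, 0.02, 0.03, 0.03],
)
print(fine, coarse, fine_mode_fraction(fine, coarse))
```

Four species contribute to the fine mode but only two to the coarse mode, so the fine-mode observation alone cannot separate the individual fine species.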
Six sets of simulated observations are constructed based on combinations of two existing and one imaginary satellite in sun-synchronous polar orbits (Table 4) and data coverage over ocean and land. The perfect experiment (PE) uses simulated observations over the globe (all sky; ocean and land) at 3 h intervals, and is conducted to assess the capabilities of the data assimilation system. In the PE, it is expected that the AR recovers the aerosol composites and emissions of the NR.
To investigate the impact of the observational frequency in the inversion, we conducted Experiments 1-3 (E1-3). Observational data sets from one (Terra) and two (Terra and Aqua) satellites were assigned to E1 and E2, respectively. E3 assimilated the simulated AOTs measured by three satellites (two existing and one imaginary satellite; see Table 4). Remer et al. (2008) noted that, compared to the land product, the MODIS product over the ocean contains inherently more information because of the spectral surface reflectance. This makes the fine mode fraction (FMF; the fraction of the total AOT composed of the fine-mode AOT) over the ocean more reliable than that over land; E1-3 therefore use simulated AOTs only over the ocean. Two additional sensitivity experiments (Experiments 4 and 5) were also conducted to evaluate how much the land product impacts the inversion, because major aerosol sources (except sea-salt aerosol) are situated over land. In the sensitivity experiments (E4 and E5), we assumed that fine- and coarse-mode AOTs could be obtained over land with the same frequency and accuracy as those over the ocean. E4 and E5 are the counterparts of E1 and E2, respectively. The additional land product increases the total number of data by a factor of 1.6. In each data set except PE, observations were limited to the clear sky so as to reproduce more realistic observational coverage. The cloud fraction modelled by the NR was used for the cloud masking. The AOTs between 60° S and 70° N were assimilated at 3 h intervals. We assumed that the observational error was 0.05, referring to Kaufman et al. (2005). The six experiments are summarized in Table 5.
The CR is driven with perturbed emissions given by

$$E' = \sigma E, \tag{15}$$

where $E'$ and $E$ are the perturbed and base emissions, respectively, and $\sigma$ is the scaling factor. Emissions of SO2 from fossil fuel and biomass burning, carbonaceous aerosol from fossil fuel, biomass fuel, forest fires, and agriculture, sea salt, and dust were perturbed independently and optimized in the inverse modelling. Volcanic and DMS emissions were excluded from the inversion. The scaling factor was randomly generated following a log-normal distribution with mean = 1 and variation = 2 for SO2 and carbon and with mean = 1 and variation = 3 for sea-salt and dust emissions. The larger variation for the natural aerosol emissions reflects their relatively larger uncertainties (Carmichael et al., 2008). The scaling factor was allowed to vary for every grid cell, each day, and each aerosol species. The deviation (CR minus NR) is shown in Fig. 6b. Compared with the averaged AOT of the NR (Fig. 6a), a large deviation is found in the source and downwind regions. The CR shows lower biases (see also the first column in Table 6) due to the maximum limitation of the scaling factor, which avoids extremely large perturbations. Compared with the emissions, the AOTs show better correlation and lower NMB and NME. The CR and NR were initialized with identical aerosol fields, which causes the lower deviation in AOTs.
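The perturbation step can be sketched with Python's standard random module. Two points are assumptions rather than facts from the text: "mean" and "variation" are interpreted here as the mean and variance of the scaling factor itself (not of its logarithm), and the upper limit of the scaling factor, whose value is not given, is set to an arbitrary illustrative number.

```python
import math
import random

def lognormal_params(mean, variance):
    """Convert the mean and variance of a log-normal variable into the
    (mu, sigma) of the underlying normal distribution."""
    sigma2 = math.log(1.0 + variance / mean ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)

def perturb_emissions(base, mean, variance, max_factor, rng):
    """Scale each base emission by an independent, capped log-normal factor."""
    mu, sigma = lognormal_params(mean, variance)
    factors = [min(rng.lognormvariate(mu, sigma), max_factor) for _ in base]
    return [f * e for f, e in zip(factors, base)], factors

rng = random.Random(0)                   # fixed seed for reproducibility
base_emissions = [1.0, 2.0, 0.5, 4.0]    # hypothetical gridded emissions
perturbed, factors = perturb_emissions(
    base_emissions, mean=1.0, variance=2.0,
    max_factor=5.0,                      # hypothetical upper limit
    rng=rng,
)
print(factors)
```

A log-normal factor keeps every perturbed emission positive, while the cap removes the long upper tail; capping alone lowers the mean factor below one, which fits the lower bias of the CR noted above.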
We assigned scaling factors of aerosol emissions to control parameters. The scaling factor allows increases or decreases of the existing emissions, but cannot detect missing sources. Because the CR was generated by emissions perturbed by the scaling factor (as shown by Eq. 15), detection of missing aerosol sources is beyond the scope of this experiment. Emissions of dust and sea-salt aerosols, which have several particle bins, were adjusted as total emissions (not the emission of each bin). The initial aerosol conditions were not included in the control parameters, and the CR, NR, and ARs were initialized with identical aerosol fields. The background errors were based on the setting of the CR. We assigned 200 % for SO2 and carbon emissions. For the natural aerosol emissions (dust and sea salt), an uncertainty of 300 % was assigned. Temporal and spatial correlations were not considered in this study. The experimental period was 10 days (21-31 May 2007), based on the average lifetime of aerosols in the atmosphere. The assimilation window was the full 10 day period. To demonstrate the feasibility of the offline and adjoint models in inverse modelling, all inverse experiments were performed in the inner loop. In other words, we performed all the NRs, CRs, and ARs with the offline model.

Results of the inversion experiments
The reduction of the cost function is shown in Fig. 7. Note that the cost function is normalized by the initial values and presented on a log scale. The cost function was reduced by one order of magnitude during 15-24 iterations in each experiment. Compared with other ideal experiments (e.g. Henze et al., 2007), the reduction rate is relatively low. This can be attributed to the observation data. We assimilated two column quantities (i.e. fine- and coarse-mode AOTs) integrated from vertical profiles of four aerosol species. In additional inversion tests, in which the AOT of each individual aerosol species was independently assimilated, the reduction of the cost function became much faster; the cost function was reduced by one order of magnitude within 10 iterations (not shown). One interesting feature is that the reduction rates of the cost functions show different behaviours depending on the observational coverage, not on the number of observations; PE, E4, and E5 exhibited relatively rapid reduction during early iterations. This is because they used observation data over land. Major emission sources (except sea-salt aerosol) are situated over land, and the observation data over land carry information from around these sources. Table 6 shows the statistics of the six experiments versus the CR. The assimilation efficiency (AE) is defined as the reduction rate of the RMSE through the inversion (the formulation is given in Appendix A). The PE achieves significant improvements (AE > 70 %) and agreement with the NR (R > 0.99, absolute value of NMB < 1.2 %, and NME < 10 %) for the AOT. The 10 day averaged AOT (Fig. 6c) shows that the deviation is less than 0.01, except for a few grids over deserts and high-latitude oceans where no observation data were assimilated. These results confirm that the PE successfully reproduces the AOT fields of the NR and demonstrate the feasibility of the assimilation system. During E1-6, in general, a larger observation number leads to better improvement (Table 6). The impact of observation data over land is discussed below.
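The assimilation efficiency can be sketched as the relative reduction of the RMSE against the NR when going from the CR to the AR. The precise formulation is given in Appendix A, which is not reproduced here, so the expression below is an assumed reading of "reduction rate of the RMSE"; the function names and sample values are hypothetical.

```python
import math

def rmse(values, reference):
    """Root mean square error of `values` against `reference`."""
    return math.sqrt(
        sum((v - r) ** 2 for v, r in zip(values, reference)) / len(reference)
    )

def assimilation_efficiency(control, analysis, nature):
    """Percentage reduction of the RMSE (versus the nature run) achieved
    by the analysis run relative to the control run."""
    rmse_cr = rmse(control, nature)
    rmse_ar = rmse(analysis, nature)
    return 100.0 * (rmse_cr - rmse_ar) / rmse_cr

# Hypothetical AOT values at two grid points
nature = [0.0, 0.0]
control = [2.0, 2.0]
analysis = [0.5, 0.5]
print(assimilation_efficiency(control, analysis, nature))
```

Under this reading, an AE of 100 % means that the analysis reproduces the NR exactly, 0 % means no improvement over the CR, and negative values indicate that the assimilation degraded the field.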
Histograms of the deviations (CR minus NR, and AR minus NR) are shown in Fig. 8. The CR shows skewed distributions due to the emissions randomly perturbed by the scaling factors, which have the maximum limitation. The inversion results by E2 and E4 also show skewness. Regional dependencies of the assimilated data due to cloud cover and land/ocean coverage lead to this skewness. However, the PE, in which the simulated observations over the globe are assimilated, achieves symmetric distributions and the best improvement, considerably increasing the fractions of deviations between −0.05 and 0.05. E2 also improves the AOT fields for every aerosol species; R is higher than 0.9, AE is 18-47 %, the absolute value of NMB is less than 1.2 %, and NME is reduced by more than half except for carbon AOT (Table 6).
Except for sea-salt aerosol, E4 shows better agreement than E2, especially for dust AOT, despite using fewer observation data. This advantage comes mainly from observation data near the source regions. Observation data over land, however, lead to slightly less improvement in sea-salt AOT, even for the same satellite orbit (i.e. E1 versus E4 and E2 versus E5). The least improvement in sea-salt AOT is found around the North Pacific and North Atlantic oceans, where various aerosols transported from the source regions coexist (not shown). Modifications of the other aerosols introduced by observation data over land may cause this disadvantage for sea-salt aerosol.
Comparison of the aerosol emission amounts shows that the inversion experiments lead to improved agreement with the NR (Fig. 9). In particular, PE and E5 successfully recover the emission amounts of the NR with an absolute value of NMB < 10 %. In E1-E3 (without data over land), greater improvement generally resulted from increased use of satellite data (see also Table 6). However, the additional improvement is limited. Observation data over land (E4 and E5) bring significant improvement in the SO₂, carbon, and dust aerosol emissions. As mentioned above, the modification of SO₂ and carbon emissions complicates the estimation of sea-salt emission in the downwind regions where various aerosol species coexist. Dust emission shows significant improvement (R > 0.99, AE > 90 %, and NME < 7 %) in PE and E5 compared with the other aerosols. Over land, the coarse-mode AOT is dominated by dust aerosol, and this unambiguous observational constraint results in the outstanding improvement. In contrast, the improvements in SO₂ and carbon emissions are acceptable but smaller; PE leads to AE = 26.5 % and 48.4 % and R = 0.764 and 0.899, respectively. The fine-mode AOT is sensitive not only to sulfate and carbonaceous aerosols but also to sea-salt and dust aerosols. This makes unambiguous detection of individual aerosol species from the fine-mode AOT quite difficult. Moreover, major SO₂ sources coincide with carbonaceous sources in industrialized and biomass burning regions. These overlapping observational sensitivities and source regions cause the relatively poorer improvement. We performed two additional inversion tests. One uses sulfate AOT to optimize SO₂ emission; the other assimilates the AOT of carbonaceous aerosol to optimize carbon emission. The two tests achieve R = 0.922 and 0.964, AE = 50.2 % and 70.8 %, and NMB = 37.7 % and 30.6 % for SO₂ and carbon emissions, respectively.
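The ambiguity described above can be illustrated with a toy single-column example: when sulfate and carbonaceous sources are co-located and both contribute to the fine-mode AOT, different pairs of emission scaling factors produce identical fine- and coarse-mode AOTs, whereas species-resolved AOTs tell them apart. All numbers and names below are hypothetical, and the linear dependence of AOT on the scaling factors is a simplifying assumption, not the SPRINTARS formulation.

```python
# Hypothetical (fine-mode, coarse-mode) AOT contributions of each species in one
# column when its emission scaling factor equals 1 (illustration only).
BASE_AOT = {
    "sulfate": (0.12, 0.00),
    "carbon": (0.08, 0.00),
    "sea_salt": (0.02, 0.05),
    "dust": (0.03, 0.20),
}


def mode_aot(scale):
    """Fine- and coarse-mode AOT, assuming AOT scales linearly with emissions."""
    fine = sum(scale[s] * BASE_AOT[s][0] for s in BASE_AOT)
    coarse = sum(scale[s] * BASE_AOT[s][1] for s in BASE_AOT)
    return fine, coarse


def species_aot(scale):
    """Total AOT of each individual species under the same linear assumption."""
    return {s: scale[s] * sum(BASE_AOT[s]) for s in BASE_AOT}


truth = {"sulfate": 1.0, "carbon": 1.0, "sea_salt": 1.0, "dust": 1.0}
# Sulfate halved and carbon raised so that the fine-mode sum is unchanged.
alias = {"sulfate": 0.5, "carbon": 1.75, "sea_salt": 1.0, "dust": 1.0}

print("mode AOT, truth:", mode_aot(truth))
print("mode AOT, alias:", mode_aot(alias))
print("species AOT, truth:", species_aot(truth))
print("species AOT, alias:", species_aot(alias))
```

In this sketch the two scaling-factor sets are indistinguishable from the fine- and coarse-mode AOTs alone, which is consistent with the better results of the two additional tests that assimilate species-specific AOT.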
Figure 10 shows the relationships between AE and the number of observations for AOT and aerosol emissions. In general, higher AE is obtained with an increased number of observations over the ocean (E1-E3) for both AOT and emissions. For these ocean-only experiments, the slope of AE versus the number of observations ranges from 2.5 × 10⁻⁵ to 5.7 × 10⁻⁵, except for sulfate and sea-salt AOT and emissions. Sea-salt AOT and emissions exhibit much higher slopes of 2.4-2.8 × 10⁻⁴ because sea-salt aerosol is emitted in sea spray and is distributed mainly over the ocean. It is clear that observations over land give a significant improvement in sulfate, carbonaceous, and dust aerosols. In E5, the AEs of carbonaceous and dust emissions were more than four times those in E3, even though E5 used only 2.5 % more data than E3.
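A slope of AE versus the number of observations, as quoted above, can be obtained from an ordinary least-squares fit through the points of the ocean-only experiments. The sketch below assumes this fitting method, which the text does not state, and uses hypothetical observation counts and AE values rather than the data of Fig. 10.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical values for three ocean-only experiments (illustration only).
n_obs = [20000, 40000, 60000]  # number of assimilated observations
ae = [10.0, 11.0, 12.0]        # assimilation efficiency in percent

slope = ols_slope(n_obs, ae)
print(f"slope = {slope:.1e} (% AE per observation)")
```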
The experiments show that observation data over land have a larger impact on inversion than data over ocean because they can provide information from around the source regions. Hsu et al. (2004)  Satellite Observation), transmits pulses of light at 532 and 1064 nm, measures the backscattered intensity, and provides vertical distributions of aerosols regardless of the surface conditions (Winker et al., 2010). These reliable observations over land (around source regions) play an important role in aerosol inverse modelling (Sekiyama et al., 2009; Ku and Park, 2013). The second implication obtained from the inversion experiments is the importance of information that differentiates between aerosol species. The fine- and coarse-mode AOTs are inadequate for identifying major tropospheric aerosol species (sulfate and carbonaceous aerosols in particular). Omar et al. (2009) developed aerosol classification algorithms for CALIPSO aerosol products based on a cluster analysis of a multiyear AERONET data set. The Ångström exponent, depolarization ratio (Shimizu et al., 2004), and colour ratio (Sugimoto et al., 2002) provide characteristics of the aerosol layer (e.g. dominant aerosol species) and are also useful for aerosol inverse modelling.

Conclusions
The SPRINTARS/4D-Var data assimilation system was developed based on a global aerosol climate model (SPRINTARS) and a four-dimensional variational data assimilation method (4D-Var) with the aim of optimizing emissions, improving four-dimensional composites, and obtaining the best estimate of the climate effects of aerosol species with observational constraints. To reduce the huge computational cost due to the iterative integrations of the forward and adjoint models, we employed an offline version of SPRINTARS (OFS) in which the integrations of the dynamic core and the physical package by the coupled GCM are skipped, and pre-calculated meteorological, soil, and land data drive aerosol advection, diffusion, chemistry, wet and dry deposition, gravitational settling, and emission processes. The corresponding adjoint model was derived from OFS, and the inner loop is accelerated by more than 30 % at T42 horizontal resolution (about 2.8° × 2.8°) by using the offline and adjoint models. At a finer resolution, the computational efficiency should be greatly improved. The SPRINTARS/4D-Var system also has the capability of an iterative outer loop that takes the non-linearity and feedback of aerosols into account.
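The iteration of forward and adjoint integrations summarized above can be illustrated with a deliberately small toy problem. The sketch below is not the SPRINTARS/4D-Var implementation: it assumes a scalar linear model x(t+1) = a·x(t) + s·e, a single emission scaling factor s as the control parameter, noise-free synthetic observations, unit error variances, and a fixed-step gradient descent in place of the minimizer used in the actual system. All names and parameter values are hypothetical.

```python
def forward(s, a, e, n_steps, x0=0.0):
    """Forward model: x[t+1] = a * x[t] + s * e, returning x[0..n_steps]."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(a * xs[-1] + s * e)
    return xs


def cost_and_gradient(s, s_b, obs, a, e, sigma_b, sigma_o):
    """Cost J(s) and dJ/ds; obs[t-1] is the observation at time step t."""
    n = len(obs)
    xs = forward(s, a, e, n)
    resid = [xs[t] - obs[t - 1] for t in range(1, n + 1)]
    cost = 0.5 * ((s - s_b) / sigma_b) ** 2
    cost += 0.5 * sum((r / sigma_o) ** 2 for r in resid)
    # Adjoint (backward) integration: lam accumulates dJ/dx[t] from t = n to 1.
    lam = 0.0
    grad_obs = 0.0
    for t in range(n, 0, -1):
        lam = a * lam + resid[t - 1] / sigma_o ** 2
        grad_obs += lam * e  # direct dependence of x[t] on s is e
    grad = (s - s_b) / sigma_b ** 2 + grad_obs
    return cost, grad


def run_inversion(s_first_guess=1.0, s_true=1.5, n_iter=30, step=0.01):
    """Fixed-step descent on the scaling factor; returns (s, cost history)."""
    a, e, n_steps = 0.8, 1.0, 10
    obs = forward(s_true, a, e, n_steps)[1:]  # synthetic "nature run" data
    s = s_first_guess
    history = []
    for _ in range(n_iter):
        cost, grad = cost_and_gradient(s, s_first_guess, obs, a, e, 1.0, 1.0)
        history.append(cost)
        s -= step * grad  # step size chosen small enough for this toy problem
    return s, history


s_opt, costs = run_inversion()
print("recovered scaling factor:", round(s_opt, 3))
print("normalized final cost:", costs[-1] / costs[0])
```

Each iteration needs one forward and one backward sweep, which is why shortening both integrations with an offline model reduces the cost of the whole minimization. The background term keeps the recovered factor slightly closer to the first guess than to the true value.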
We validated OFS by using 1 yr simulation results. The AOT of each individual aerosol species and the natural aerosol emissions by OFS show good agreement with those by the online (standard) SPRINTARS (ONS), with R > 0.97 and an absolute value of NMB < 7 %. The wind-blown sea-salt and dust emissions are slightly underestimated (NMB = −5.98 % and −6.08 %) due to the time-interpolated (less variable) wind velocity near the surface in OFS. These negative biases result in underestimation of the sea-salt and dust AOTs around the desert regions and the zone of westerlies over the ocean. A positive bias is found in sulfate and carbonaceous AOTs, with NMB = 6.90 % and 4.02 %. The time-interpolated RH in OFS contributes to this bias mainly through the hygroscopicity of these aerosols. A difference in the representation of precipitation and clouds between OFS and ONS causes wet removal to be underestimated, which results in a positive bias in the aerosol extinction coefficient around the Equator.

To assess the capability of the developed assimilation system in inverse modelling applications, several inversion experiments based on the OSSE framework were conducted. In the inversion experiments, simulated fine- and coarse-mode AOTs generated by the NR are assimilated into the CR, in which perturbed emissions are used. The inversion results successfully reproduce the original unperturbed emissions, demonstrating the feasibility of the system. The inversion experiments found that the addition of observations over land (the observation coverage) had a significantly positive impact on the inversion (except for sea-salt emission) compared with an increase of observations over the ocean, because most major aerosol sources are situated over land. This indicates that reliable observations over land are important in aerosol inverse modelling. Another implication obtained from the inversion experiments is that information that differentiates between aerosol species is crucial over regions where various aerosols are brought together and coexist (e.g. industrialized regions and the northern Pacific and northern Atlantic oceans). Measurements of aerosol characteristics (e.g. the Ångström exponent, depolarization ratio, and colour ratio) would be useful.

Subsequent studies will apply the system to inversions with real observations. Coarse-mode AOT data from the MODIS/Terra and Aqua satellites will be assimilated to reproduce the dust emission over East Asia. These results will be presented in a forthcoming publication.

Fig. 2. Scatter plots of ONS versus OFS results for aerosol optical thicknesses of (a) total, (b) sulfate, (c) carbonaceous, (d) sea-salt, and (e) dust aerosols; (f) Ångström exponent; and natural aerosol emissions for (g) sea-salt and (h) dust aerosols. Colours denote frequency of occurrence on a log scale. The white broken line is the 1 : 1 line and the black broken lines denote the 1.5 : 1 and 1 : 1.5 lines.

Fig. 5. Schematic diagram of the inversion experiments with the OSSE framework.

Fig. 6. Spatial distributions of the averaged AOT and its deviations (CR minus NR, and AR minus NR). (a) 10 day averaged total AOT from NR, (b) deviation between CR and NR, and (c) deviation between PE and NR.

Fig. 7. Reduction rate of the cost function on a log scale.

Fig. 8. Histograms of deviations (NR minus CR, and NR minus AR) for (a) total, (b) sulfate, (c) carbon, (d) sea-salt, and (e) dust AOTs. Grey shading shows NR minus CR. The numbers in the panels are the fractions of the deviations between −0.05 and 0.05.

Fig. 10. The relationships between AE and the number of assimilated observations for (a) aerosol optical thickness and (b) aerosol emissions.

Table 1. Statistical results of the offline model (OFS) versus the online model (ONS) for aerosol optical thickness (AOT) and emissions. All statistics were calculated between 70° S and 70° N.

Table 4. List of satellites in sun-synchronous polar orbits considered in the inversion experiments.

Table 5. Inversion experiments and their settings.

Table 6. Statistics of the six experiments versus the control run (CR) for aerosol optical thickness and emissions. All statistics were calculated between 60° S and 70° N.