Particle filter based volcanic ash emission inversion applied to a hypothetical sub-Plinian Eyjafjallajökull eruption using the Ensemble for Stochastic Integration of Atmospheric Simulations (ESIAS-chem) version 1.0

A particle filter based inversion system is presented, which enables to derive timeand altitude-resolved volcanic ash emission fluxes along with its uncertainty. The system assimilates observations of volcanic ash column mass loading as retrieved from geostationary satellites. It aims to estimate the temporally varying emission profile endowed with error margins. In addition, we analyze the dependency of our estimate on wind field characteristics, notably vertical shear, within variable observation intervals. Thus, the proposed system addresses the special challenge of analyzing the vertical profile of volcanic 5 ash clouds given only column mass loading data as retrieved by geostationary satellite imagery. The underlying method rests on a linear-combination of height-time emission finite elements of arbitrary resolution, each of which is assigned to a model run subject to ensemble-based space-time source inversion. Employing a modular concept, this setup builds the Ensemble for Stochastic Integration of Atmospheric Simulations (ESIAS-chem). It comprises a particle smoother in combination with a discrete-grid ensemble extension of the Nelder-Mead minimization method. The ensemble version of the EURopean Air pol10 lution Dispersion Inverse Model (EURAD-IM) is integrated into ESIAS-chem but can be replaced by other models. As initial validation of ESIAS-chem, the system is applied to simulated artificial observations of both ash-contaminated and ash-free atmospheric columns using identical twin experiments. Thus, in this idealized initial performance test the underlying meteorological uncertainty is neglected. The inversion system is applied to two notional sub-Plinian eruptions of the Eyjafjallajökull volcano, Iceland, with strong ash emission changes with time and injection heights. It demonstrates the ability of ESIAS-chem 15 to retrieve the volcanic ash emission fluxes from the assimilation of column mass loading data only. However, the analyzed emission profiles strongly differ in their levels of accuracy depending of the strength of wind shear conditions. While the error is only 10 %–20 % for the estimated emission fluxes under strong wind conditions it increases up to 60 % under weak wind shear conditions. In case of increasing wind shear, the performance of the analysis may benefit from extending the assimilation window, in which new observations potentially contribute valuable information to the analysis system. For our test cases using 20 an artificial volcanic eruption, we found an assimilation window length of 18 hours, i. e. 10 hours after the eruption terminated, to be sufficient for analyzing the extent and location of the artificial ash cloud. In the performed test cases, the analysis ensemble predicts the location of high volcanic ash column mass loading in the atmosphere with a very high probability of >95 %.


Introduction
Emission profiles of volcanic eruptions depend on multiple parameters, such as crater size or exit velocity of the emitted mass.
Further, they depend on atmospheric stability and wind profile at the volcano. Many of these parameters are unknown or difficult to measure exactly. This renders the estimation of emission profiles of volcanic eruptions challenging for chemistry transport models in the context of data assimilation and inverse modelling for source estimation. Therefore, special methods 30 for assessing the strength and vertical distribution of volcanic emissions are necessary. As volcanic eruptions contain enormous amounts of harmful trace gases and particulate matter, a detailed knowledge not only about the spatial and temporal variations of the emissions and its strength is needed but also accurate information about the analysis error of the emissions and the evolving volcanic ash cloud is required.
Typically, explosive volcanic eruptions occur as sequences of emissions with highly varying ejection mass and height. Only 35 limited observations of volcanic ash emission parameters are available (e.g. eruption plume heights retrieved from radar measurements, Arason et al., 2011), which are affected by their specific uncertainties and limitations, e. g. by orographic shielding in the case of radar observations. Thus, eruption models are applied to simulate volcanic emissions. These may be inferred from statistical data (e. g., Sparks et al., 1997;Mastin et al., 2009) or physical processes (e. g., Woodhouse et al., 2013;Folch et al., 2016). Statistical models base on observational data from historic volcanic eruptions, which are sparse and show a large 40 variance in eruption rate for given plume heights. For example, Mastin et al. (2009) calculated an uncertainty by a factor of four in estimating the emission rate for a plume height of 25 km using their statistical model. Physical plume-scale models require orographic details of the volcano (e. g. crater size) but also meteorological fields and parameters (e. g. wind entrainment coefficients), which are often poorly known and render these models highly uncertain. Costa et al. (2016) identified the wind entrainment coefficient as main source of uncertainty leading to up to two orders of magnitude differences for the estimation g. via the European Aerosol Research Lidar Network, EARLINET 1 ), column mass loading observations rarely provide information about the vertical distribution of volcanic ash and are mostly limited to cloud top heights (e. g., L. J. Ventress, 2016;Piontek, 2021). Therefore, multiple data assimilation / source inversion methods make assumptions about the vertical extent of the volcanic ash cloud (e. g., Schmehl et al., 2012;Wilkins et al., 2016b;Zidikheri et al., 2016). In addition to column mass 60 loading observations of volcanic ash clouds, Zidikheri et al. (2017a) suggested to use brightness temperature measurements to distinguish regions with high mass load from those with low mass load. All these observations are influenced by water cloud cover, which limits the detection of volcanic ash in the atmosphere.
First estimations of volcanic ash emissions from the 2010 Eyjafjallajökull eruption in a high temporal and vertical resolution were made by Stohl et al. (2011) and later by Kristiansen et al. (2012) and Kristiansen et al. (2015). Their algorithm bases 65 on the inversion technique of Eckhardt et al. (2008), in which an optimal combination of distinct emission packages is estimated using a least squares method. The method showed to provide reliable a posteriori estimates of the time-varying emission profiles. Stohl et al. (2011) include errors from a priori estimates, retrieval errors and model errors and discussed results in terms of relative error reduction subject to assumptions made. Schmehl et al. (2012) initiate the volcanic ash analysis using an ensemble of simulations with random emission strengths and wind fields. Their best estimate of the volcanic ash concentration 70 is found iteratively using a "genetic algorithm variational approach". Herein, rather strong assumptions on the emission profile are made: the emissions are kept fixed for the simulation duration; emissions are placed into a single model layer; wind fields are only adjusted in the model layer containing volcanic ash emissions. However, the method provides a quick and easy to implement first estimate of the volcanic ash concentrations in the atmosphere. Yet, the strong assumptions may render the approach unfeasible for longer lasting volcanic eruptions in which the emissions vary more strongly. Another data assimi-75 lation method for estimating the volcanic ash emissions was proposed by Lu et al. (2016). They developed an adjoint-free, ensemble-based four-dimensional variational data assimilation (4D-var) method. The method showed reliable estimates of the true emission profile in their experiments using synthetic, vertically integrated satellite observations. However, they do not address the uncertainty estimate of their analysis. Zidikheri et al. (2016) and later Zidikheri et al. (2017b) developed an inversion system that aims to analyze the horizontal 80 distribution of volcanic ash column mass loading rather than the emission strength. This study was extended by Zidikheri et al. (2017a) to additionally estimate the height and the particle size distribution of volcanic ash emissions using a parameter refinement method. Here, an ensemble of source parameter values has been applied. Using a proper metric (in their case the pattern correlation coefficient) the ensemble is evaluated against observations. The best fitted ensemble member is taken as analysis. The method is easy to implement for a fast analysis of a volcanic eruption as only the upper and lower bounds of 85 the considered source parameters need to be defined. However, the number of model runs used to find the analysis increases exponentially with the number of parameters. Rough estimates of the parameters' uncertainty are provided by the spread of the top 2 % ensemble members with respect to the metric (Zidikheri et al., 2017b), which does not take uncertainties in the observed quantities into account. Wilkins et al. (2014) used the "data insertion" method, in which observed volcanic ash column mass loadings act as virtual sources for volcanic ash with a predefined vertical distribution. The algorithm was successfully applied to the eruptions of Eyjafjallajökull, Iceland, 2010(Wilkins et al., 2016b) and Grímsvötn, Iceland, 2011(Wilkins et al., 2016c. Given the lack of vertical information in column mass loading retrievals of volcanic ash, the data insertion method needs assumptions about the vertical distribution of the volcanic ash content in the atmosphere. Thus, this larger source of uncertainty for the volcanic ash analysis is ignored. The data insertion scheme has also been implemented as a first step towards an ensemble-based data 95 assimilation scheme in the FALL3D-8.0 atmospheric transport model (Prata et al., 2021). Fu et al. (2017) developed a mask-state algorithm for ensemble Kalman Filters to reduce the size of the state vector to be optimized. More recent applications of the ensemble Kalman Filter and its variants are provided by Pardini et al. (2020) and Osores et al. (2020). By estimating the source parameters of the volcanic eruption, the approaches using the ensemble Kalman Filter assume constant emission parameters between two assimilation steps. This is a rather strong assumption on the emissions 100 especially if observational data is sparse or far away from the volcano. However, keeping this assumption in mind the ensemble Kalman Filter methodology provides an estimate on the analysis uncertainty.
A general framework for calculating uncertainties of volcanic ash concentrations for constant volcanic ash emissions given "any model and any observational data" was proposed by Denlinger et al. (2012). Bursik et al. (2012) applied the "polynomial chaos quadrature weighted estimate" (PCQWE) method to volcanic ash emissions. This approach was further extended by 105 Stefanescu et al. (2014) to take uncertainties in the wind fields into account as well. An extension of the polynomial chaos quadrature method was proposed by Madankan et al. (2014) to generate hazard maps of volcanic ash in the atmosphere. The PCQWE method aims to map uncertainties in the input parameters of a volcanic eruption onto the volcanic ash concentrations without accounting for constraining volcanic ash observations. Additionally, Dare et al. (2016) investigated the influence of meteorological ensemble forecasts on the dispersion of volcanic ash. They found that not only the ensemble statistics should be 110 evaluated but also the single ensemble members, which may contribute significant information to the distribution of volcanic ash.
Although these studies applied highly advanced data assimilation and source inversion methods for analyzing the emission strength and the uncertainty of volcanic ash dispersion forecasts, a joint assessment of both, emission strength and its uncertainty, in a high temporal and vertical resolution has not yet been evaluated. Thus, this contribution aims to fill this gap by 115 providing such estimates of emission strength and its uncertainty in a high temporal and spatial resolution resulting in heightresolved probability maps of volcanic ash concentrations.
Section 2 describes the full stochastic inversion system ESIAS-chem and the methods applied. The potential and limitations of ESIAS-chem are shown by identical twin experiments in section 3. A discussion of the results and conclusions will be given in section 4. The Ensemble for Stochastic Integration of Atmospheric Simulations (ESIAS) is designed to control simultaneous and interactive runs of ultra-large ensembles of complex atmospheric models. ESIAS comprises a meteorological (ESIAS-met) and an atmospheric chemical part (ESIAS-chem). One main feature of ESIAS is the potential to include data assimilation and source inversion modules. The emphasis of this paper is placed on ESIAS-chem for probabilistic atmospheric chemistry-transport-125 diffusion simulations with data assimilation. In the following, the main components of the ESIAS-chem system are introduced, which include an ensemble of emission packages, a modified version of the Nelder-Mead minimization algorithm, and a particle filter and resampling algorithm. Further, the full stochastic particle smoother algorithm of ESIAS-chem for probabilistic volcanic ash analyses is described. Finally, the metrics for analyzing the systems performance is summarized. The reader is referred to Tab. 1 for a summary of the used variables. for estimating the volcanic ash column mass loading was used by Stohl et al. (2011) and Kristiansen et al. (2015) aiming to estimate the optimal emission profile. Contrary to their analysis, the focus of ESIAS-chem lays on the predictability of volcanic ash emissions and the resulting volcanic ash concentrations.
In order to find the optimal emission profile, the cost function J(a) x k ∆z k , where x k is the modeled concentration of volcanic ash in [µg m −3 ] and ∆z k is the thickness of model layer k in [m]. Please note that the vertical and temporal resolution of the emissions can be varied by changing the parameters K max and N T , thus, making them independent from the vertical and temporal resolution of the model. Matrix R accounts for the impact of retrieval 155 errors of the volcanic ash column mass loading and is considered diagonal. It can be made spatially and temporally dependent, to account for assumed increased retrieval errors due to water cloud influences, particularly thick umbrella ash clouds above or in the vicinity of the volcano or interference of other aerosols or mineral dust. In our study, we have made assumptions about the observation error (including retrieval error). In applications to real volcanic eruptions, the use of retrieval errors provided by the observations is highly encouraged. Starting from a scalar column load value as exclusive data source, we considered estimation uncertainties of the derived height profile presented here as an order of magnitude larger than retrieval errors in this idealized experiments, especially if the number K max determines some multiple of O(10) layers. The observation error can also be incorporated in constructing the ensemble, as in general any ensemble data assimilation procedure can straightforwardly account for the retrieval uncertainty by artificially perturbing retrievals of column mass loading, where the random perturbation is scaled by the assumed statistics of retrieval errors. Clearly, this must not be the only means to generate the ensemble, as this 165 accounts only for a fraction of the overall uncertainty, resulting in underdispersive ensembles. Although the algorithm is designed for column mass loading observations, it is as well applicable to observations of any volcanic ash related quantity, e.

Nelder-Mead algorithm
The minimization problem posed by (1) is quadratic within the limits of being bounded due to positive semi-definiteness of 170 all components. Quasi-Newton methods, including a bounded variant proved less efficient, as a background state reasonably close to the "truth" for a tangent-linear approximation to hold, is typically unknown. This missing a priori knowledge cannot serve any preconditioning requirements other than highly speculative inferences from assumed eruption type and strengths scenarios. With an increasing number of model levels with their (positive semi-definite) concentrations to be attributed, while column values as given data are single scalars only, the ill-conditioning of the minimization problem increases drastically and 175 a much needed reasonable background information prior to the volcanic eruption is hardly available. Also simple smoothness assumptions of the vertical profile are often invalid for ash clouds, at least during early stages. As minimization tests with the Nelder-Mead method performed clearly best, without getting lost in drastically elongated minima as introduced by underdetermined degrees of freedom through vertical level concentrations, the algorithm by Nelder and Mead (1965) was applied to the inversion problem. The Nelder-Mead minimization algorithm is a combinatorial optimization method without constraints and 180 without the need to compute the function derivatives. It has proven to be robust, especially in cases where the function to be minimized has discontinuities or the function values are noisy (see McKinnon, 1998). This is expected to be likely in highly variable volcanic eruptions especially given highly uncertain, and thus noisy, observations. Additionally, the Nelder-Mead algorithm can easily account for bounded regions, in our case positive semi-definite ash loads, and needs relatively few function evaluations (mostly 1-2 function evaluations per iteration, Lagarias et al., 1998).

185
The idea of the algorithm is to move a simplex on the surface of the cost function to find an improved model state in a N emis -dimensional space. The version of the Nelder-Mead method used in this study follows Gao and Han (2012) and utilizes adaptive parameters controlling the step size for each iteration of the minimization. The version has been implemented for parallel operation (Klein and Neira, 2014;Lee and Wiswall, 2007). In our application, the Nelder-Mead algorithm is used to find the optimal combination of the pairwise distinct emission packages. This is accomplished by assigning a factor a i , which 190 needs to be scaled by the algorithm, to each emission package.
Due to its simplicity, the Nelder-Mead algorithm is easy to implement but it is likely to find a local rather than the global minimum of the cost function (which is also a problem for least-square minimization techniques with poor initial guesses, as for volcanic eruptions). Thus, we have added some adjustments to the algorithm. First, we perform the minimization only for integers (including 0). Thus, only integer values are accepted for the scaling factors a i of the emission packages. By applying 195 this constraint it is assumed that the introduced errors are of lower order than the error introduced by the temporal and vertical resolution of the emission packages. Further, the minimization is restarted with larger perturbations of the vertices (edges of the simplex) if the algorithm is trapped in a local minima. Finally, the minimization is started for an ensemble of Nelder-Mead analysis. As perturbed observations are used as input to the minimization procedure, the solutions (here emission profiles) produced by the analysis ensemble are assumed to map the uncertainty given by the observations onto the emission rates (cf.

Particle filter
The particle filter methodology, also known as sequential importance resampling, is used as a non-Gaussian data assimilation method for large ensemble simulations of the atmospheric state. Please note that the term ensemble in this section defines a full model simulation and does not refer to the ensemble of emission packages defined in the Sect. 2.1.1. The particle filter method 205 was proposed by Gordon et al. (1993) and further popularized in oceanography and meteorology by van Leeuwen (2009). It develops from Bayes' Theorem where p(·) denotes the probability density function (PDF), y the observations, and x the model state. The a priori PDF is approximated by an ensemble of N ens model runs where δ(·) denotes the Kronecker delta function and x (i) is the model state of particle (ensemble member) i.

Applying (4) results in
where an individual weight w (i) is applied to each ensemble member. Thus, each ensemble member is weighted by the normal-

215
ized likelihood of its current model state. The ensemble statistics can now be computed using the ensemble member weights.
For example, the ensemble mean is It is noted that in the particle filter method no assumptions of the statistical forecast error characteristics and the observation error were made (the errors do not need to be normally distributed and the model state does not need to be unbiased as other In particle filters, filter degeneracy often occurs (cf. Bengtsson et al., 2008;Snyder et al., 2008;Bickel et al., 2008), especially in high dimensional problems. Several methods exist to reduce filter degeneracy (see e.g. van Leeuwen, 2009, for a review) and the reader is referred to the original papers for more information. In ESIAS-chem, the particle filtering and resampling steps are applied after the ensemble of optimal emission profiles has been found by the DENM algorithm. A weight w (i)

225
is assigned to each optimal emission profile. Residual resampling (Liu and Chen, 1998) is used to replace emission profiles leading to small weights by emission profiles with high weights (this step includes perturbing duplicated emission profiles).
After resampling, the weights are normalized again (w (i) = 1/N Ens ). Thus, the statistical informative value of the analysis ensemble is preserved. Qualitatively, the strategy of particle filtering applied here can be expressed as follows: By replacing the valueless ensemble members of the analysis (i. e. those with too little weight) each ensemble member has comparable skill 230 to match the observations. Hence, the probability of an event (e. g. volcanic ash concentrations above a certain threshold) can directly be extracted from the relative number of ensemble members that simulate this event.

ESIAS-chem
ESIAS-chem is designed as a flexible analysis system for quantitative volcanic ash assessments, along with an uncertainty quantification of the analyzed emission flux profile. ESIAS-chem is constructed such that it is applicable to other scenarios of 235 accidentally released matter and constituents, given constraining observations are available. Further, it is capable to be coupled with ensembles of meteorological fields to account for additional uncertainties resulting from meteorological forecasts.
However, this idealized investigation focuses on the ability of the system to reconstruct the emission profile and its uncertainty under perfect meteorological conditions. Thus, no meteorological ensemble is used at this stage.
The aerosol dynamics (nucleation, accumulation, deliquescence) and aerosol chemistry in EURAD-IM (EURopean Air pollu- has been switched off for two reasons: numerical efficiency in an ensemble context and specifics of volcanic ash properties cannot be expected to be reasonably well featured by a general pollutant aerosol module like MADE. Ideally, a full volcanic ash aerosol dynamics and chemistry model as proposed by Schmidt (2013) would be in place, along with its not existing adjoint.

245
Yet we consider the error to be negligible within the evolution time frame addressed in our idealized study.
The workflow of the system is illustrated in Fig. 1. Once a volcanic eruption is detected (t = t 0 ), the system is started by generating the ensemble of emission packages (cf. section 2.1.1) of default mass of volcanic ash. As long as no observations are available, this ensemble of emission packages serves as an estimation predictor of the maximum possible volcanic ash extent without providing quantitative volcanic ash estimates. Once new observations become available, the system is restarted The ensemble of emission packages is compared with the observations to calculate (1), which is to be minimized using the DENM method. As this algorithm is optimized for the estimation of column integrated ash loading in a considerably 13 January, 2021).
underdetermined control system, a regularization term is added to the cost function (1), leading to This choice restricts the scaling factors a to vary too strongly. In first tests without the regularization term, the emission rates have partly increased to unrealistic high values. Therefore, the B-matrix was chosen in a sequence of sensitivity tests, in which the influence of the regularization term on the emission profile was evaluated. Best results have been found by choosing B 260 as diagonal matrix B = diag(10). Please note that the chosen diagonal form of the B-matrix led to reasonable results for the artificial emission profile used in this study. However, for realistic applications a more elaborated evaluation of a properly chosen B-matrix is required and straightforwardly applicable. In this performance test, the only purpose of the matrix serves to restrict the scaling factors a not to vary too strongly. In addition, the regularization term was chosen in order to maintain a suitable spread of the analysis ensemble.

265
The minimization is initialized with a set of arbitrarily varying scaling factors a for the pairwise distinct emission packages.
The algorithm was tested using a time-varying initial emission profile with umbrella-shaped vertical mass distribution. Due to the chosen true emission profile in this idealized study (cf. Sect. 3), the minimization using the initial emission profile with umbrella-shaped vertical mass distribution shows larger errors. In the application of the algorithm to a real volcanic eruption, the performance of the analysis using an umbrella-shaped initial emission profile may exceed the performance using 270 an arbitrary emission profile. Hence, ESIAS-chem is designed to adjust the initial emission profile to the characteristics of the current volcanic eruption. In addition, the observation errors are represented by perturbed observations (cf. Houtekamer and Mitchell, 1998), which are assimilated by ESIAS-chem leading to larger ensemble spreads.
Once an improved emission profile has been found by the DENM minimization, a particle filter step is applied to the analysis ensemble. The weights, which result from the filtering step, are applied to the analyzed emission profiles. If new observations 275 become available the assimilation window may be elongated to use the new observations for updating the emissions within the whole assimilation window [t 0 , t i+1 ].

Metrics used for analyzing ESIAS-chem's performance
The results of the stochastic inversion method are validated using different measures on the ensemble mean. The pattern correlation coefficient (pcc, cf. Zidikheri et al., 2016) provide information about the accuracy of the horizontal extent of the 280 volcanic ash cloud. The pcc is defined by (Zidikheri et al. (2016)) with va = va − va. Herein, the entries of the binary volcanic ash detection vector va for the ensemble mean (subscript x) and observations (y) are equal to 1 if the grid column contains volcanic ash above 0.2 g m −2 , which is the detection limit of volcanic ash column mass loading observations , and 0 otherwise. The averaged volcanic ash detection va is where 1 denotes the vector with 1 on all entries and < ·, · > indicates the scalar product. The pattern correlation coefficient gives information about the compliance between the simulated and observed horizontal distribution of volcanic ash in the atmosphere.
If the volcanic ash clouds, indicated by the column mass loading, of the nature run and the ensemble mean perfectly coincide, 290 the pattern correlation coefficient equals 1. If the volcanic ash clouds of the nature run and the ensemble mean are totally disjoint, the pattern correlation coefficient equals 0. It is noted that for this model setup with perfect meteorology, pcc < 1 indicates that the analysis contains volcanic ash either in model layers or at times, where no volcanic ash is emitted in the nature run.
The inner-cloud distribution of volcanic ash of the analysis ensemble mean is analyzed using the relative mean absolute error 295 (RMAE). The RMAE is defined by where N y is the number of grid columns in which volcanic ash column mass loading of the nature run y j ≥ 0.2 g m −2 and x j is the column mass load of the analysis ensemble mean in grid cell j. The RMAE is as well calculated for volcanic ash concentrations, for which this threshold is ≥ 10 µg m −3 . In this case, y j corresponds to the volcanic ash concentration The probability estimate of the ensemble analysis is investigated using the Brier score where r is the number of verification classes, p j,i is the forecast probability of the ensemble for class j to predict event i, and The ability of ESIAS-chem to provide quantitative estimates of the volcanic ash emission uncertainty is explored by identical twin experiments. Identical twin experiments are necessary, yet not sufficient standard test procedures for validating spatiotemporal data assimilation and inverse modelling set-ups. They are idealized experiments as they rest on the "perfect model assumption" and its analog for the data side: exactly known accuracy and representativity. This provides a total knowledge of the "synthetic truths" as given by simulations with the same model and extraction of artificial "measurements/soundings" 325 thereof. The term identical twin refers to the fact, that observations and a priori knowledge are constructed from the same model and input data, in which only the parameters to be optimized (emission profile in our case) differ. Given the identical twin assumption the experiment is then to be made realistic in all other respects. Daley (1991) concludes however, that identical twin experiments "err on the optimistic side" (loc. cit.). Yet, the applicability of ESIAS-chem to real volcanic eruptions will be shown in a companion paper. cantly larger time steps (six hours in this analysis), however. The synthetic observations from the nature run are extracted from the full domain including grid cells containing volcanic ash and those without volcanic ash. The information of ash free areas is necessary in order to avoid estimates of spurious emissions at false times and heights in the analysis.

345
The uncertainty of volcanic ash column mass loading observations is about 40 % (Western et al., 2015;Clarisse and Prata, 2016) or even higher (Wen and Rose, 1994;Kylling et al., 2014). For the identical twin experiments in this study, the observation error needs a special treatment because a relative error overemphasizes the system to low observed values. Further, no relative error for observations of no-volcanic ash observations can be obtained. Therefore, the following expression for the observation error σ y i with a minimum error of 0.1 g m −2 is used With this choice of observation error, the impact of low volcanic ash values observed at the edge of the volcanic ash cloud on the analysis is diminished. For applications to real volcanic eruptions, the observation error provided by the satellite retrieval per pixel should be considered.
The Hovmoeller-like plot of the nature run emission profile is shown in Fig. 2. It shows the variable emission rate by a height-355 time graphic above the volcano. The selected sub-Plinian eruption type (Bardintzeff and McBirney, 2000) is characterized by two short explosive phases between 02-04 UTC and 06-08 UTC reaching a height of approx. 8 km above the volcano.
The length of the assimilation window influences the performance of the inversion algorithm due to differences in vertical and horizontal mixing and vertical wind shear. Hence, the performance of ESIAS-chem is tested for different assimilation window lengths. All assimilation windows start at 00 UTC for the specific day and last for 6-36 hours. Fig. 3 illustrates the different 360 assimilation window lengths, which differ in length by multiples of 6 hours. With increased residence time in the atmosphere, the volcanic ash at different heights becomes more horizontally split by wind shear. This effect can be exploited by increasing the assimilation window length. Contrary, vertical and horizontal mixing of volcanic ash may limit the benefit that is gained by increasing the assimilation window length. For example, if volcanic ash emitted by two different emission packages is mixed, it is impossible to attribute the volcanic ash to one or the other emission package. Once the volcanic ash emissions are 365 optimized, a forecast is appended until 36 hours after the simulation start (hatched areas in Fig. 3). Thus, beneficial impacts of the inversion results for the analyses with differing assimilation window lengths can be assessed.
The first real weather test day to which ESIAS-chem is applied is 15 April 2010, which was characterized by strong westnorth-westerly winds in Iceland. This is illustrated by the wind profile at the volcano (Fig. 4a) for the whole simulation length of 36 hours and the wind field over Europe at 500 hPa on 15 April 2010, 12 UTC (Fig. 4b). During this day, the polar front 370 and the polar jet stream are located above Iceland, driving the volcanic ash to travel fast southeast towards continental Europe, with wind speeds of up to 60 m s −1 at heights of 5-8 km above the volcano. Fig. 4c shows a vertical cross section of pressure and temperature along the red line in Fig. 4b at 12 UTC on 15 April 2010. As indicated by the intersection of the isobars and isotherms, 15 April 2010 is characterized by substantial vertical wind shear above Iceland, which is expected to ease the distinction of volcanic ash emitted at different heights as seen from above.
pothetical sub-Plinian eruption of the Eyjafjallajökull volcano on 29 April 2010 (Fig. 5). A similar emission profile is taken as depicted in Fig. 2, yet with slightly different emission rates. This day is characterized by weak winds of approximately 10 m s −1 in the vicinity of the volcano, which is illustrated by Fig. 5. Thus, the emitted volcanic ash is only slowly transported.
Additionally, the vertical wind shear on 29 April 2010, 12 UTC, is low, as indicated by a higher barotropicity above Iceland  ( Fig. 5c). The two dates were chosen because of their different wind patterns and the real eruption of the Eyjafjallajökull that occurred during these days. Hence, the identical twin experiment provides an optimal case scenario for the application of ESIAS-chem to real volcanic eruptions.

Volcanic ash dispersion 385
The ability of the analysis ensemble mean to predict the volcanic ash dispersion is investigated using the pattern correlation coefficient (pcc) and the relative mean absolute error (RMAE) introduced in Section 2.2. The pattern correlation coefficient is shown in Fig. 6 for the two analysis days. The lines in Fig. 6 indicate results for different assimilation window lengths as illustrated by Fig. 3. Fig. 6 shows a constantly large pattern correlation coefficient > 0.95 after the artificial eruption terminated at 08 UTC for both analysis days, except for assimilation window lengths of 6 and 12 hours. By applying an assimilation 390 window of 6 hours from the simulation start, the artificial volcanic eruption has not terminated, thus, the latest emissions from the nature run volcanic eruption are not considered in the analysis. This leads to a reduced pcc for the 6 hour assimilation window test case. The assimilation window of 12 hours ends four hours after the termination of the artificial volcanic eruption.
The reduced pcc exhibits that this short assimilation window is not sufficient in order to analyze the correct emission profile.  is a measure for volcanic ash column mass loading above and below the chosen threshold. It does not measure differences in the strength of volcanic ash column mass loading above the threshold.
Increasing the assimilation window length (i.e. taking later observations into account) increases the pattern correlation coeffi-400 cient on both days. The analysis suggests that for the respective test cases an assimilation window of 18 hours, that is 10 hours after the artificial eruption terminated, is sufficient for ESIAS-chem to analyze the exact location of the volcanic ash cloud as observed from space leading to a pcc value that remains high (> 0.95) throughout the full analysis time period of 36 hours. Fig. 6 demonstrates that the inversion system is able to accurately analyze the horizontal extent of the volcanic ash cloud. Fig. 7 shows the RMAE for volcanic ash column mass loading on both days. The RMAE is relatively constant for the duration 405 of the simulations, except for the 12 hours assimilation window case in Fig. 7a. At the end of the simulation time at 36 hours, the test cases with longer assimilation windows (≥ 18 hours) show a RMAE of the order of 10 % for both days. These low values show the good performance of the analysis for these assimilation window lengths with respect to the nature run. In principle, Fig. 7 corroborates the same findings that are analyzed for the pattern correlation coefficient, i. e. increasing the assimilation window length decreases the error of the analysis ensemble mean. For both days and meteorological circulation patterns, an 410 assimilation window of 18 hours is sufficient to reduce the RMAE to a value of approx. 10 % for column mass loading values above 0.2 g m −2 . On 15 April 2010, assimilation windows larger than 24 hours result in a slightly higher RMAE than the analysis using an assimilation window of 18 hours. This is a result of the convergence of volcanic ash in the upper troposphere south of Norway around 24 hours after the simulation has started (not shown). Thus, additional observations at later times do not contribute significant information to the inversion system. In summary, Fig. 7 proves that the inversion system is able to 415 analyze the distribution of volcanic ash column mass loading properly for weak and strong wind conditions.
The above analysis focuses on the comparison of the nature run and the ensemble mean with respect to column mass loading  The different lines indicate different assimilation window lengths from 6 hours (gray) to 36 hours (magenta) as defined by Fig. 3. of volcanic ash. Thus, it does not provide any information about the vertical distribution of volcanic ash. The ability of ESIASchem to infer vertical profiles of volcanic ash is given in Fig. 8, which displays the relative mean absolute error of the volcanic ash concentrations above 10 µg m −3 . The RMAE of the volcanic ash concentrations decreases by increasing the assimilation 420 window length, which is especially visible for 29 April 2010. On both days, an assimilation window of only 6 hours results in a RMAE larger than 100 %. Therefore, this test case is not shown in Fig. 8. The RMAE for the 12 hour assimilation window test case show a spike at 12 UTC. This results from the insufficient estimation of the upper part of the eruption column in the second explosive phase of the eruption (cf. Fig. A2 in the appendix). This error smoothed out in the subsequent hours of the simulation. On average, the RMAE reduces to about 20 % on both days for assimilation windows larger than 18 hours, which 425 shows the good performance of the ESIAS-chem analysis not only in terms of column mass loading but also in terms of the vertical distribution of the volcanic ash in the atmosphere.

Emission profile
As an example, the analysis results using an assimilation window of 24 hours are investigated in more detail. This test case is chosen as the previous analysis showed the good performance of the 24 hour assimilation window experiments. Further, an 430 assimilation window of 24 hours is a reasonable choice for either analysis of longer lasting volcanic eruptions or an operational use. The analyzed ensemble mean emission profiles for other assimilation window lengths are shown in the Appendix A along with the relative error. Fig. 9 and Fig. 10 display the profile of (a) the nature run emissions, (b) the ensemble mean emissions, (c) the relative error of the ensemble mean 435 and (d) the relative ensemble standard deviation for the 24 hour assimilation window experiments on 15 April and 29 April 2010. Herein, x and y are the ensemble mean and nature run emissions, respectively, and σ x is the ensemble standard deviation.
The total nature run emissions on both days (4.25 · 10 8 tons and 4.30 · 10 8 tons on 15 April and 29 April, respectively) are 440 well captured by the analyzed total emissions with mean emissions of 4.60 · 10 8 tons and 4.10 · 10 8 tons, respectively, and standard deviations of 3.67 · 10 7 tons and 3.47 · 10 7 tons. The relative error of the total emitted volcanic ash is 7.7 % and 4.7 %, respectively. On 15 April 2010, the analyzed emission profile of the ensemble mean shows the two explosive eruptions of the nature run emission profile with the correct height of the maximum emissions at the right time (Fig. 9b). Even though the ensemble mean shows a vertically and temporally smoothed emission profile, the false emissions are low with respect to the 445 maximum emissions. The relative error of the ensemble mean emissions is of the order of 10 %-20 % for most emission times and heights (cf. Fig. 9c) and therefore, the results are similar to the analysis presented before. The relative ensemble standard deviation is of the same order as the relative error of the ensemble mean emissions, indicating a reasonable ensemble spread.
The analyzed emission profile of the ensemble mean on 29 April 2010 (Fig. 10b) however shows strong deviations from the nature run emission profile (Fig. 10a). Although the highest level emissions of the nature run emission profile at 8 km height The analyzed emission profile on 29 April 2010 shows the limits of the ESIAS-chem approach. While the volcanic ash column mass loading have only low errors, the emission profile shows large deviations up to 60 % and more (Fig. 10c). The ensemble standard deviation of the emission profile (Fig. 10d) is lower than the relative error of the ensemble mean and ranges 460 around 20 %. The results indicate that on 29 April 2010 the mixing of volcanic ash in the atmosphere is too effective, which prohibits a proper estimate of volcanic ash emission profiles. However, the previous results show that even though the volcanic ash emission profile could not be properly estimated by the system on 29 April 2010, the vertical and horizontal distribution of volcanic ash in the atmosphere is fairly represented by the ensemble mean.

Probability analysis 465
The proper analysis of high volcanic ash concentrations in the atmosphere as well as their forecast accuracy are of great importance for air safety advisory services. Yet, only the ability of ESIAS-chem to provide reasonable estimates of vertically resolved volcanic ash forecasts and analysis is shown. Thus, in this section the probability estimate of the analysis ensemble for the volcanic ash emissions and the resulting concentrations remains to be discussed. Fig. 11 shows the histogram of the relative emission factor for different assimilation window lengths for the test case on 15 April 2010 as given by the analysis 470 ensemble. The relative emission factor is calculated for each time-height combination (t, k) of the emission profile by dividing the emission rate of each member of the analysis ensemble ER (i) t,k by the respective nature run emission rate ER N R Thus, emissions in the analysis ensemble that are temporally or vertically outside the nature run emission profile are not considered. The calculation of the histogram in Fig. 11 includes all emissions and four different subset of emission strengths 475 (the strongest 50 %, 25 %, and 10 % emissions). The relative emission factors for the 12 hour assimilation window test case tend to underestimate the emissions of the nature run ( Fig. 11a and Fig. 11d). By increasing the assimilation window length, the histograms peak around factor 1, while the occurrences of underpredicting the nature run emission rates diminish. A relative emission factor of 1 indicates a good match of the analyzed and nature run emission rates. This improvement by increasing the assimilation window length is especially true for the top 10 % emission rates in Fig. 11d.   In general, the analysis tends to underestimate the emission rates as was previously discussed in Sect. 3.2.2. This results in a bias toward too small relative emission factors in the histograms. However, by increasing the assimilation window length, the underestimation of the emission rates by the analysis ensemble reduces. For the strongest 25 % of the emission rates, assimilation windows longer than 18 hours show a second maximum at a relative emission factor of 1 (Fig. 12c). These test cases also show a lower rate 485 of underprediction for the top 10 % emission rates (Fig. 12d). Thus, the results suggest that the reliability of the ensemble to analyze the strong emission rates in the upper emission plumes increases with increasing assimilation window length for both meteorological conditions, yet with different significance.
The accuracy of the probabilistic prediction of volcanic ash concentrations by the ensemble is measured by the Brier score (cf. Sect. 2.2). The Brier score is shown in Fig. 13 for each hour and for all assimilation window lengths. The Brier score the volcanic ash concentrations are attributed more and more to different classes used for the calculation of the Brier score.

495
This reduces the underlying probability and increases the Brier score. With increasing time after the volcanic eruption, the volcanic ash concentrations reduces due to dispersion and deposition. Lower volcanic ash concentrations have larger errors (not shown) meaning that ESIAS-chem is less able to predict these low concentrations with high confidence. Especially for shorter assimilation window lengths, ESIAS-chem is not able to estimate the emission profile properly. Thus, the corresponding volcanic ash is emitted into false layers or at false times leading to larger errors in the probabilistic forecast.

500
As an example, the following analysis is aiming to assess the confidence of the ensemble prediction of volcanic ash using the 24 hour assimilation window experiment. Fig. 14a compares the probability of volcanic ash column mass loading exceeding 2 g m −2 on 16 April 2010, 00 UTC. Additionally, the nature run's volcanic ash column mass loading contours for 0.5, 1, and 2 g m −2 are overlaid by blue lines. On 15 April 2010, wind conditions are favorable for volcanic ash to disperse rapidly. Thus, the area containing high volcanic ash column mass loading covers only a small region above South-Sweden. The ensemble 505 predicts a probability of more than 90 % for high volcanic ash column mass loading in this area. A small probability of about 20-30 % of volcanic ash column mass loading exceeding the threshold of 2 g m −2 is also predicted above the North Sea, where nature run's volcanic ash column mass loading exceeds 1 g m −2 . Fig. 14b shows the vertical cross-section along the Figure 13. Brier score as calculated by (10) Fig. 14b). Even though the volcanic ash at 4 km height in the center of the cross-section is covered from above by the elevated volcanic ash at 7 km height, the ensemble predicts a 50 % chance of volcanic ash exceeding the threshold at 4 km height. This is remarkable, since only vertically integrated observations of volcanic ash are assimilated. The volcanic ash northeast of the center of the vertical cross-section (i. e. to the right in Fig. 14b) is predicted by only 20-30 % of the ensemble. The ensemble predicts this volcanic ash in this vertical column to be at a height of 6-7 km by a chance of more 520 than 70 %. This may be due to the lack of vertical wind shear that prevents the distinction of volcanic ash emitted at different height levels. Fig. 15a shows the probability of volcanic ash column mass loading exceeding 2 g m −2 as predicted by the ensemble on 30 April 2010, 12 UTC, i. e. 36 hours after the simulation start and 12 hours after the end of the assimilation window. Isolines of nature run's volcanic ash column mass loading for 0.5 g m −2 , 1 g m −2 , and 2 g m −2 are also given by blue lines. A vertical 525 cross-section of the probability of volcanic ash concentration exceeding 2 mg m −3 along the red line in Fig. 15a is shown in Fig. 15b. Even though the emission profile on 29 April 2010 was not well analyzed, the ensemble predicts the high volcanic ash concentration with a probability of more than 90 %. Fig. 15b shows a vertically tilted volcanic ash cloud. This suggests that only little vertical mixing occurred on 29. April 2010 in the displayed vertical cross-section. Thus, the falsely emitted volcanic ash in the horizontally smoothed analysis emission profile leads to similar volcanic ash concentrations, which suggests that 530 horizontal mixing of volcanic ash happened. Hence, an exact estimation of the emission profile is generally impossible from column mass loading observations as different emission packages lead to similar volcanic ash concentrations and/or column mass loadings. However, the good performance in analyzing the vertical structure of the volcanic ash cloud is partly due to the perfect model / perfect meteorology assumption made in this study. The reliable estimate of the emission profile for the test case with strong wind shear suggest that the vertical structure of the volcanic ash is also sufficiently estimated under real 535 conditions, where meteorological forecast uncertainties impose a limiting factor to further improvements. This needs to be proved in the application to real volcanic eruptions.

Discussion and conclusions
In this study, a new method for estimating volcanic ash emissions and its uncertainty from column mass loading observations is developed. This new method is realized by the atmospheric chemical part of the Ensemble for Stochastic Integration of 540 Atmospheric Simulations (ESIAS-chem). The method comprises an ensemble-based particle smoother, which extends the assimilation window to include the latest observations available. The discrete-grid ensemble Nelder-Mead method (DENM) is developed in order to efficiently achieve a posterior ensemble representation of the time-dependent emission profile. The particle smoother approach enables to use the latest observations for the estimation of the emission profile within the whole assimilation window, while consistancy with all observations within the time interval is enforced. the total emitted mass of volcanic ash is reasonably well estimated by the analysis ensemble on both days, by increasing the assimilation window length, the ensemble performs increasingly better in analyzing the emission rates, especially for high emission rates in the upper part of the eruption column,

555
on 15 April 2010, a second lower volcanic ash layer covered from above by the main volcanic ash cloud was predicted by about 50 % of the ensemble members.
Due to the identical twin approach, the presented investigation acts as a best case scenario for probabilistic volcanic ash assessments. The analysis is idealized in different ways: The uncertainties in meteorological fields (especially in winds) in model parameters (e. g. deposition velocity), and parametrizations (e. g. clouds) have been neglected. Further, the amount of areas allow for removing volcanic ash emissions from the analysis. The ability of ESIAS-chem to give reliable results for real volcanic eruptions using non-idealized meteorology and incomplete observations will be addressed in a follow-up study.
Even though direct observations of volcanic ash columns were used in this study, ESIAS-chem is extremely flexible in terms of observational data. All kinds of data can be used the constrain the inversion method, such as samples of tephra fall out, if 565 available.
ESIAS-chem is designed to account for additional information on the emission profile, which may, for example, be obtained from radar or web cam observations (e. g., Arason et al., 2011). Thus, changes in the vertical or temporal resolution of the emission profile are applicable if suggested by observations without noteworthy modifications.
In this study, ESIAS-chem was challenged with highly variable volcanic ash emissions. The analysis has shown that ESIAS-570 chem is able to provide good estimates of the volcanic ash concentration in the atmosphere as well as its forecast probability.
Further, the emission profile was estimated reasonably well at least for the strong wind test case for assimilation window lengths greater than 18 hours. However, the ideal length of the assimilation window may depend on the current meteorological situation, most notable the vertical wind shear, and the availability of observational data. Thus, in applications to real volcanic eruptions the assimilation window should be as large as practicably possible to include a large number of observations linking 575 eruption time of particles with observation time.
The system shows high probability in estimating the vertical distribution of high volcanic ash concentration for both test dates. Although the system lacks to estimate the true emission profile sufficiently well for weak wind conditions, the analysis of the probability of volcanic ash showed that its vertical distribution in the atmosphere is reliably predicted.
Besides volcanic ash eruptions, ESIAS-chem is applicable to a variety of emission scenarios, especially unexpected emission 580 events like forest fires and mineral dust events. Therefore, it provides a fast and efficient model for source term estimation including uncertainty representation. In principle, the method can be adapted to multi-source emission scenarios. The enhanced need for compute resources of ESIAS-chem can partly be compensated by reducing the resolution of the emission profile.
For the analysis of real volcanic ash emissions, it is intended to use a meteorological ensemble to account for additional uncertainties in wind fields, which is well applicable within the concept of ESIAS-chem. It is noted that ESIAS-chem is 585 flexible in integrating other modules and is applicable to other atmospheric models as well. Author contributions. PF designed the code extension necessary for this analysis. HE contributed mainly to the idea of ensemble estimation 590 of emission uncertainties. Main input regarding satellite observations and its uncertainty was given by ACL. All authors contributed equally to the manuscript.