FALL3D-8.0: a computational model for atmospheric transport and deposition of particles, aerosols and radionuclides – Part 2: model applications

This manuscript presents different application cases and validation results of the latest version release of the FALL3D-8.0 model, an open-source atmospheric transport model. The code has been redesigned from scratch to incorporate different categories of species and to overcome legacy issues that precluded its preparation towards extreme-scale computing. Validation results are shown for long-range dispersal of fine volcanic ash and SO2 clouds, tephra fallout deposits and dispersal and ground deposition of radionuclides. The first two examples (i.e. the 2011 Puyehue-Cordón Caulle and 2019 Raikoke erup5 tions) make use of geostationary satellite retrievals for two purposes: first, to furnish an initial data insertion condition for the model; and second, to validate the time series of model outputs against the satellite retrievals. The metrics used to validate the model simulations of volcanic ash and SO2 are the Structure, Amplitude and Location (SAL) metric and the Figure of Merit in Space (FMS). The other two application cases are validated with scattered ground-based observations of deposit load and local particle grain size distributions from the 23 February 2013 Mt. Etna eruption and with measurements from the Radioac10 tivity Environmental Monitoring (REM) database during the 1986 Chernobyl nuclear accident. Simulation results indicate that FALL3D-8.0 outperforms previous code versions both in terms of model accuracy and code performance. We also find that simulations initialised with the new data insertion scheme consistently improve agreement with satellite retrievals at all lead times out to 48 hours for both SO2 and long-range fine ash simulations.

2.1.1 Volcanic ash detection Figure 1 shows SEVIRI observations of the Cordón Caulle volcanic ash plume and illustrates the steps used to detect volcanic ash in the present study. For context, Fig. 1a and 1b show a composite of MODIS true colour imagery and the SEVIRI 10.8 µm brightness temperature (T 11 B ) respectively. Here, we propose an ash detection scheme based on applying successive masks that flag SEVIRI pixels as 'ash-affected' before attempting a subsequent quantitative ash retrieval (Sec. 2.1.2): 1. First, we apply a temperature cut-off threshold to water vapor corrected BTDs (∆T ash ): that is, only those pixels with ∆T ash less than the cut-off threshold of T wc = −0.5 K are flagged as potential ash pixels.
This water vapor correction follows the semi-empirical approach of Yu et al. (2002). As illustrated in Fig. 1c, this first threshold is reasonably effective at detecting the Cordón Caulle ash cloud. However, this simple cut-off threshold may 65 not remove false positives due to temperature inversions generated by clear land at night (Platt and Prata, 1993), icecovered surfaces (Yamanouchi et al., 1987), cold cloud-tops (Potts and Ebert, 1996) and high satellite zenith angles (Gu et al., 2005).
2. Second, we apply a cold surface mask designed to remove false positives due to reasons mentioned above. This cut-off condition relabels potential ash pixels as 'ash free' if:

70
∆T ash > T sc and      T 11 B > 255 K over land T 11 B > 240 K over ocean (2) where T sc = −1.5 K is the cold surface cut-off value. We note that T 11 B condition of this mask will preserve ash detection sensitivity for high altitude (cold) ash clouds, which is particularly well-suited for the Cordón Caulle case study. However, this condition may not be suitable for low-level ash clouds (low thermal contrast resulting in less negative BTDs in addition to warmer cloud-tops). The effect of this mask is illustrated by comparing Figs. 1c and d. Note how, for the case 75 shown, the cold surface mask removes almost all false positives over the region covered by low-level stratiform cloud.
3. Third, we apply a mask for false positives due to an increased path length at high satellite zenith angles (Gu et al., 2005).
We mask out false positives at high zenith angles imposing: ∆T ash > T zc and ζ > 80 o where T zc = −2 K is the zenith cut-off threshold and ζ is the satellite zenith angle. The effect of the high zenith mask 4. Finally, the last step in the detection process is to remove any spurious ash-labeled pixels using a noise filter that removes objects (groups of contiguous pixels) that are less than 16 pixels in size (Fig. 1f).
The MODIS true color composite shown in Fig. 1a illustrates that, even in a relatively complex scene (numerous clouds, large regions of land and ocean, high mountains, ice-covered surfaces, etc.), the ash detection is robust and provides a good balance 85 between reduced false positives and increased true positives. An interesting point to note is that negative BTDs in the vicinity of Cordón Caulle are enhanced due to the high satellite zenith angles at these locations. Gu et al. (2005) discuss the benefit of improved sensitivity to ash at high satellite zenith angles, but also show that mass loading retrievals can be overestimated in these situations. We correct for the effect of high zenith angles after retrieving the mass loading.
2.1.2 Volcanic ash retrieval 90 Once pixels have been identified as being 'ash-affected' we apply a Look-up Table (LuT) approach (Prata and Grant, 2001;Prata and Prata, 2012) to retrieve volcanic ash optical depth (τ ), effective radius (r e ; in µm), and column mass loading (m l ; in gm −2 ). The retrieval procedure is illustrated in Fig. 2a. The temperature difference model employed here is based on the forward model developed by Prata (1989b) and Wen and Rose (1994): 95 where I λ is the radiance at the top of the atmosphere at wavelength (λ), τ λ is the wavelength-dependent optical depth, B(T s ) is the Planck radiance evaluated for surface temperature (T s ) below the ash cloud, and B(T c ) is the Planck radiance corresponding to the temperature at the ash cloud-top (T c ). The optical depth is defined as: τ (λ) = πL ∞ 0 r 2 Q ext (λ, r)n(r)dr where L is the geometric thickness of the ash cloud, Q ext (λ, r) is the extinction efficiency factor (determined from Mie 100 calculations), r is the particle radius and n(r) represents the distribution of particles within the ash cloud. The ash mass loading is determined as: where ρ is the ash particle density (set to 2500 kgm −3 based on field measurements reported by Dominguez et al. (2020) for distal ash) and the cos(ζ) term corrects the mass loading based on the satellite zenith angle. Uncertainties using this approach 105 have been previously estimated to be up to 50% (Wen and Rose, 1994;Corradini et al., 2008). Our microphysical model, used to parameterize a volcanic ash cloud in the radiative transfer calculations, assumes that ash particles are spherical, composed of andesite and conform to a lognormal size distribution with a spread equal to 0.5 (geometric standard deviation σ g = 1.65), similar to existing operational volcanic ash retrieval algorithms (e.g. Francis et al., 2012;Pavolonis et al., 2013).
The retrieval scheme relies on interpolating pre-computed LuTs generated by conducting radiative transfer calculations made 110 for varying values of r e , τ , T s and T c . The LuTs are generated using a new python implementation of the original FORTRAN code developed by Prata (1989b) to solve the radiative transfer equation for a single-layer ash cloud using the Discrete Ordinates Method (DOM; Stamnes et al., 1988;Laszlo et al., 2016). In the present study, we consider τ in the range from 0-9.9 in 4 https://doi.org/10.5194/gmd-2020-166 Preprint. Discussion started: 17 June 2020 c Author(s) 2020. CC BY 4.0 License. steps of 0.1 and r e from 1-15 µm in steps of 0.2 µm. All radiative transfer calculations use 16 radiation streams and a unique LuT is generated for every combination of T c and T s identified from ash-affected pixels. Figure 3 shows a graphical example 115 of a LuT generated for one combination of T s and T c and the range of τ and r e considered.
To determine T s directly from measurements it is generally recommended to find a clear-air pixel near the volcanic cloud of interest (e.g. Wen and Rose, 1994) and can sometimes be determined by finding the maximum value of T 11 B in the scene (Prata and Lynch, 2019). Obtaining an estimate for T c from measurements, however, can be more difficult as the minimum value of T 11 B may not correspond to the (semi-transparent) ash cloud of interest. Nevertheless, even if T s and T c can be reasonably 120 estimated from measurements, it is often assumed that a single or mean value (and corresponding standard deviation) is representative of the entire ash cloud. Figure 1 shows that, in our case, the ash plume extends more than 60°in longitude and 20°in latitude, over land (including the Andes mountain ranges) and ocean, meaning that there is a considerable variation in cloud-top and surface temperature across ash-affected pixels. In addition, the meteorological setting within the considered spatial and temporal domains is complex (significant amounts of clouds), making estimates of T s and T c from measurements challenging 125 for the Cordón Caulle case study. To account for variation in T s and T c across space and time, we use ERA5 reanalysis data to determine T s and T c at every ash-affected pixel over our study period from 5-10 June 2011 (in one hour time steps).
To determine T s from ERA5, we use the surface skin temperature (T skin ) and assume that the atmospheric transmittance (t atm ) has only a small effect on measured radiances at the top of the atmosphere for spit-window channels (i.e. t atm ≈ 1).
We also correct T skin for variations in land surface emissivity using the University of Wisconsin global IR land surface emis-130 sivity database (Seemann et al., 2008). For ocean surfaces, we set the emissivity to 0.99 consistent with Western et al. (2015).
Analysis comparing T 11 B SEVIRI measurements against the emissivity-corrected T skin for clear-sky pixels on 4 June 2011 indicates average differences of ∼2 K. To determine T c from ERA5 we require an estimate of the volcanic cloud-top height.
A fortuitous CALIPSO pass early on during the eruption on 5 June 2011 reveals that the Cordón Caulle ash cloud reached as high as 13-14 km above sea level (Fig. 4a) and later observations indicate heights from 10-13 km (Fig. 4b). For the retrievals 135 presented here, we take T c to be the ERA5 temperature at 13 km (a. s. l.) and make the simplifying assumption of constant height at all locations (and times) for every ash-affected pixel detected during 5-10 June 2011. The assumption of constant cloud-top height allows T c to vary in time, horizontally but not vertically. However, FALL3D-8.0 simulations indicate that the height of the Cordón Caulle ash cloud was relatively stable over the course of its dispersion in the atmosphere and so we expect errors introduced by this assumption to be small. This was probably due to its injection into the stratosphere and its transport 140 via the Southern hemisphere jet stream (height variations from 11-15 km; Klüser et al., 2013;Vernier et al., 2013;Prata et al., 2017). For our study period from 5-10 June 2011, T c ranged from 206-226 K while T s ranged from 230-304 K. We therefore performed radiative transfer calculations to construct unique LuTs, in steps of 2 K, for every possible combination of T s and T c within these ranges.

145
Retrieval methods applicable to broadband IR satellite observations have largely focussed on exploiting SO 2 absorption features near 8.6 µm and 7.3 µm (e.g. Realmuto et al., 1994;Prata et al., 2003;Watson et al., 2004). To validate the new SO 2 scheme in FALL3D-8.0, we apply a three-channel technique to IR geostationary satellite measurements to retrieve total SO 2 column densities in Dobson Units (DU) (Prata et al., 2003;Prata and Kerkmann, 2007;Doutriaux-Boucher and Dubuisson, 2009). The three-channel technique exploits the strong SO 2 absorption feature near 7.3 µm and is only sensitive to upper-level 150 SO 2 (> 4 km) due to the absorption of lower-level water vapor at this wavelength. We use the recent SO 2 -rich eruption of Raikoke volcano in Russia during June 2019 as a validation case study and apply the SO 2 retrieval to observations made by the multispectral Advanced Himawari Imager (AHI) instrument aboard Himawari-8 geostationary satellite (Bessho et al., 2016).

SO 2 detection
The three channels used to detect SO 2 using AHI measurements are centred around 6.9, 7.3 and 11.2 µm. To determine whether 155 there is an SO 2 signal in the data, we first construct a synthetic 7.3 µm brightness temperature by interpolating from 6.9 to 11.2 µm in the radiance space and then converting to brightness temperature via the Planck function (Prata et al., 2003). Figure 5 illustrates how the interpolation procedure works in radiance space. The resulting 'clear' brightness temperature (T 7.3 BC ) is a good approximation of the measured value (T 7.3 B ) in a SO 2 -free atmosphere, so that one can identify SO 2 clouds by taking the difference between these two variables: In theory, ∆T SO2 should be equal to zero under clear-sky conditions and increase with increasing SO 2 column density. However, in reality, high satellite zenith angles and variations in temperature and humidity can cause ∆T SO2 to be positive even under clear-sky conditions (Prata et al., 2003;Doutriaux-Boucher and Dubuisson, 2009). To remove false positives due to high satellite zenith angles and high water vapour burdens, we compute two SO 2 -related BTDs (∆T 69 and ∆T 86 ) and apply two 165 successive temperature cut-off thresholds: where only those pixels with a ∆T 69 greater than a cut-off threshold of T 69 = −2.5 K are flagged as potential SO 2 . The second threshold takes advantage of the SO 2 absorption feature near 8.6 µm: where only those pixels with a ∆T 86 greater than a cut-off threshold of T 86 = 3.5 K are flagged as potential SO 2 .
In addition, the presence of meteorological clouds and embedded volcanic ash can also affect the interpolation procedure used to construct T 7.3 BC . Figures 6a to 6c show, respectively, T 7.3 B , T 7.3 BC , and ∆T SO2 brightness temperatures for the SO 2 -rich Raikoke cloud on 22 June 2019 at 21:00 UTC. Clearly, the interpolation procedure does a good job at removing the SO 2 signal from the measurements resulting in excellent detection sensitively for ∆T SO2 . Comparison of Figures 6d, e and f show how the 175 ∆T 69 and ∆T 86 thresholds are used to remove false alarms whilst preserving legitimate SO 2 -affected pixels.

SO 2 retrieval
As mentioned above, ∆T SO2 calculated via Eq. (7) is a function of the total column density of SO 2 . The SO 2 retrieval is based on constructing this function from offline radiative transfer calculations. For this purpose we use the MODTRAN-6.0 code (Berk et al., 2014) to compute top-of-the-atmosphere (TOA) radiances at the 7.3 µm wavelength (Fig. 2b). All 180 radiances determined from MODTRAN-6.0 were convolved using the AHI spectral response functions. These radiances are then converted to brightness temperatures to compute BTDs between an SO 2 -free atmosphere and atmospheres with varying column amounts of SO 2 at 7.3 µm (i.e. ∆T SO2 ). We are then able to generate a function representing the relationship between the SO 2 column density, u(∆T SO2 ), and ∆T SO2 by interpolating between the data points generated from the radiative transfer modelling (Fig. 8). In practice we generate this function using a 1D quadratic interpolation procedure implemented in the 185 SciPy python package (Virtanen et al., 2020). Atmospheric profiles of temperature, humidity and gases were taken from the US standard atmosphere. In varying the SO 2 column amounts, we must specify an SO 2 profile. We use CALIPSO total attenuated backscatter profiles collocated with Himawari-8 observations of ∆T SO2 to constrain the SO 2 profiles used in the radiative transfer calculations for the Raikoke case. When using the CALIPSO observations to determine the height and thickness of SO 2 layers we make the assumption that SO 2 is collocated with sulphate aerosols (Carboni et al., 2016;Prata et al., 2017).
190 Figure 7a shows a daytime CALIOP overpass intersecting SO 2 layers detected by Himawari-8 during the initial explosive phase of the Raikoke eruption. The vertical distribution of cloud/aerosol layers in the CALIOP observations reveal that the eastern part of the plume reached at least 12 km (a.s.l.). Later CALIOP/AHI observations reveal complex stratospheric dynamics; two distinct components are apparent in the attenuated backscatter data with thin layers (1-2 km) present at 13-15 km in the northern part of the SO 2 cloud and ∼12 km in the southern part (Fig. 7b). Based on these initial observations, we constructed 195 u(∆T SO2 ) using a uniform SO 2 distribution with a maximum cloud-top height of 13.5 km and thickness of 2.5 km (Fig. 8).
The retrieval then proceeds by computing ∆T SO2 from AHI data and evaluating u(∆T SO2 ) for every SO 2 -affected pixel.

Data insertion
The data insertion scheme was recently introduced in FALL3D-8.0 and is briefly described in Folch et al. (2020). Here we describe the data insertion setup used for the Cordón Caulle and Raikoke case studies. To insert IR satellite retrievals of 200 volcanic ash and SO 2 (described in Sects. 2.1.2 and 2.2.2) into FALL3D, the satellite retrievals were re-sampled (using nearest neighbour sampling) from their native projection into a regular 0.1°× 0.1°latitude-longitude grid, consistent with the FALL3D grid. The vertical distribution must also be specified in the model as the satellite retrievals represent total column loadings (i.e. 2D spatial fields). For the cases presented here, CALIOP observations were used to constrain the vertical distribution of ash and SO 2 . Note that the vertical distribution is only required for the data insertion time. The data insertion times used for  Table 1.
Both qualitative and quantitative validation approaches have been used to validate previous versions of FALL3D against satellite observations (Corradini et al., 2011;Folch et al., 2012). Here we use the SAL metric (Wernli et al., 2008) to quantitatively 210 compare satellite retrievals of volcanic ash (Sect. 2.1.2) and SO 2 (Sect. 2.2.2) to the corresponding simulations with and without data insertion. The SAL metric was developed for validation of precipitation forecasts against radar and satellite data (Wernli et al., 2008). Dacre (2011) (2018), we also use the FMS score as a complement to SAL for comparing the spatial coverage of observed vs. modelled fields. A detailed mathematical description of the SAL metric is presented in Wernli et al. (2008) and so we only provide a brief description of each of the components of SAL (i.e. S, A and L) in the following subsections.
The main requirement for calculating SAL is the determination of model and observation objects. Objects are identified as clusters of contiguous pixels whose magnitude is above some threshold corresponding to a physical quantity determined from 220 observations. In our case, the threshold is determined based on the detection limit of the satellite retrievals. For the SEVIRI ash retrievals (Cordón Caulle case; Sect. 2.1.2), we use a threshold of 0.2 gm −2 consistent with the threshold suggested by Prata and Prata (2012). For the Himawari-8 SO 2 retrievals (Raikoke case; Sect. 2.2.2), there is currently no commonly accepted detection threshold. For the purposes of identifying SO 2 objects, we allowed for a threshold of 5 DU, noting that the minimum detected SO 2 total column burdens at each time step in the satellite retrievals were ∼8-10 DU (after applying the threshold tests 225 defined in Sect. 2.2.1). The objects are determined for both observation (satellite retrievals) and model fields and are computed as the absolute sum of S, A and L, which results in an index that varies from 0 (best agreement) to 6 (worst agreement). All comparisons between observations and model simulations are made using a regular 0.1°× 0.1°latitude-longitude grid.

Amplitude
The Amplitude (A) metric is the simplest of the three metrics used to construct SAL and compares the normalised difference 230 of the mass-averaged values of the observation and model fields. It can vary from -2 to +2 where negative (positive) values indicate that the model is under-predicting (over-predicting) the mass when compared with observations.

Location
The Location (L) metric has two components (L = L 1 + L 2 ). L 1 is calculated as the distance between the centers of mass between the model and observation fields, normalised by the maximum distance across the specified domain. It can vary from 0 to +1 235 and is considered a first-order indication of the accuracy of the model simulation compared with observations. However, L 1 can equal 0 (suggesting perfect agreement) for situations where observation and model fields clearly do not agree. For example, Wernli et al. (2008) describe the case of two objects at opposite sides of the domain having the same center of mass as a single object in the center of the domain. L 2 was introduced to handle these situations by considering the weighted average distance between the overall center of mass and the center of mass of each individual object for both model and observation fields. L 2 is computed by taking the normalised difference between the model weighted average distance and the observation weighted average distance. It is scaled such that it varies from 0 to +1 (to vary over the same range as L 1 ), meaning that L varies in the range from 0 to +2.

Structure
The Structure (S) metric is the most complex of the three metrics used to construct SAL. The general idea is to compute the 245 normalised 'volume' of all individual objects for each dataset (i.e. the model and observation fields). The normalised volumes are computed by dividing the total (sum) mass of each object by its maximum mass. The weighted mean of the normalised volumes is then computed for the observation and model fields and S is computed by taking the difference between the weighted means. The S metric can vary from -2 to +2, where negative values indicate that modelled objects are too small or too peaked (or a combination of both) compared to the observed fields. The FMS score compares the spatial coverage of observed vs. modelled fields. It is simply the area of intersection divided by the area of union between the ash mass loading observation and model fields: where A mod and A obs are the modelled and observed ash mass loading areas, respectively. The FMS varies from 0 (no intersec-255 tion) to 1 (perfect overlap). initiating a remarkable example of a long-lasting plume with complex dynamics, strongly influenced by the interplay between eruptive style, atmospheric winds and deposit erosion (Bonadonna et al., 2015). The initial explosive phase of the eruption (4-14 June) was characterised by the development of eruption columns with heights oscillating between 6-14 km above sea level (a.s.l.). Plume heights progressively decreased (4-6 km a.s.l) between 15 and 30 June, and low intensity ash emission persisted 265 for several months (Elissondo et al., 2016). Due to the predominant westerly winds, ash was transported towards Argentina and a wide area of the arid and semi-arid regions of northern Patagonia was severely affected by tephra dispersal and fallout. The eruption had multiple impacts on the ecosystem, critical infrastructures, human activities and several economic sectors (e.g. agriculture, aviation, and tourism; Wilson et al., 2013).  Fig. 11 and summarised in Table 2. Comparison of Figs. 9a and 10a highlight the advantage of a data insertion scheme. For the simulation without data insertion (Fig. 10a), the plume has already begun to deviate from the satellite observations with too much mass dispersing towards the south. This is reflected in both the SAL score of 1.93 and FMS score of 0.22 at this time.
The data insertion scheme (Fig. 9a) Table 2); however, at around 36 hours the SAL reached above 2 and then decreased sharply (Fig. 11c). The reason for the sudden reduction in SAL just after 36 hours is most likely due to the satellite retrievals being compromised by cloud interference at this time in addition to the continual input of mass at the source in the model simulations. This input of mass was included to account for ash erupted after the data insertion time. The satellite retrievals capture some of the ash plume near source (Figs. 9c and 10c), but cannot be expected to accurately characterise the plume at this location due to its high opacity 295 in the IR window. Another difference between the model and observations at this time is the large difference in the centres of mass (L = 0.32). This is due to the high mass loadings near source in the model fields and high mass loadings near the centre of the domain (43°S, 35°W) in the observed fields. The satellite is likely over-estimating mass in this part of the ash cloud because of underlying meteorological clouds that have not been accounted for in the radiative transfer modelling. After 72 hours (Fig. 9d), the simulations with and without data insertion are identical as all ash used in the data insertion scheme has  Table 2. We selected a data insertion time of 22 June 2019 at 18:00 UTC (1 day after the beginning of the eruption) as this is a time when the SO 2 cloud was completely detached from source ( Fig. 12a). Note that the AHI retrievals of the SO 2 plume at the beginning of the eruption (Fig. 7a) were likely compromised by interference of ice particles in the initial eruption plume (Prata et al., 2003;Doutriaux-Boucher and Dubuisson, 2009). In addition, retrievals early on in the plume's dispersion may have been affected by band saturation caused by extremely high SO 2 330 column loads.
At the data insertion time (22 June 2019 at 18:00 UTC), the SAL score for the FALL3D SO 2 simulation without data insertion is 2.87 and the FMS is 0.32. Therefore, applying data insertion at this time represents a significant correction of the model simulation to the satellite observations (compare Fig. 12a and Fig. 13a). The main difference between the satellite observations and simulation without data insertion is that the model indicates a portion of the SO 2 plume connecting back to 335 the volcano while this feature is not present in the observations. TROPOMI observations of the SO 2 cloud confirm this spatial structure (see Fig. 13 of Global Volcanism Program, 2019). The reason for the lack of detection of SO 2 in this region in the AHI retrievals is probably due to water vapour interference, implying that this part of the plume was at lower altitudes than the main SO 2 cloud. Indeed, SO 2 height retrievals from CrIS data show that plume heights varied from ∼3-7 km a.s.l. in this region (see Fig. 5 of Hyman and Pavolonis, 2020).

340
For the simulations without data insertion, at 24 hours after insertion, the validation metrics exhibit minor changes with SAL decreasing from 2.87 to 2.59 and the FMS from 0.32 to 0.23 (Fig. 13a, b). For the simulations with data insertion, SAL has steadily increased from 0 to 1.21 while the FMS has decreased from 1 to 0.29 over the first 24 hours (Fig. 12a, b; see Supplementary Material for the full animation of the data insertion simulations). Figure 11b shows that the SAL score for the simulation with data insertion is largely affected by the S and A scores whereas the L score is low (0.05) indicating the FALL3D 345 is able to track the centre of mass of SO 2 very well when initialised with satellite retrievals (Fig. 11b). In this case the A metric is negative, meaning that the model is under-predicting the mass when compared to the satellite retrievals. This is probably due to the fact that the total mass retrieved by the satellite actually increases after the data insertion time. An increase in SO 2 mass cannot be accounted for in the data insertion scheme if no new sources of SO 2 are included in the model simulations. A reason for an increase in mass in the satellite retrievals, even after the SO 2 cloud has detached from source, could be due to several 350 factors related to the detection sensitivity of the retrieval. An interesting physical reason for the increase in SO 2 could be that ice particles in the nascent plume were sequestering SO 2 initially and then releasing it later on as the plume dispersed into the atmosphere . Increases in the satellite-retrieved SO 2 mass can also be due to the horizontal and vertical distribution of water vapour. For example, if the SO 2 cloud is in a region of high water vapour initially and then moves into a drier region it is likely that more SO 2 will be detected thus increasing the retrieved mass. This effect can also occur if the SO 2 355 layers are transported vertically in the atmosphere.
At 48 hours, for the simulations without data insertion (Fig. 13c), the SAL score actually improves (decreasing from 2.58 to 1.88) and the FMS largely remains the same (decreasing from 0.23 to 0.20). The improvement in the SAL score can be attributed to a steady increase in S metric and decrease in the A metric (Fig. 11d). This indicates that the structure and mass (amplitude) modelled by the simulations without data insertion are converging towards that observed by the satellite over 48 360 hours. For the simulations with data insertion, at 48 hours after insertion (Fig. 12c), the SAL score has continued to increase (from 1.21 to 1.38) and the FMS has continued to decrease (from 0.29 to 0.25). In general, at all lead times, the validation metrics indicate that the data insertion simulations provide better agreement with observations than the simulations without data insertion (Table 2).

The 2013 Mt. Etna tephra deposit 365
On 23 February 2013 at 18:15 UTC, the eruptive activity of Mt. Etna increased significantly. A buoyant plume rising up to 9 km above sea level (a.s.l.) along with incandescent lava fountains exceeding 500 m above the crater were generated during the paroxysmal phase (Poret et al., 2018), resulting in an ash plume extending towards the NE for more than 400 km away from the source and moderate ash fallout over the Italian regions of Calabria and Puglia.
In order to simulate this event using high-resolution wind fields, we ran first the ARW (Advanced Research WRF) core of the WRF (Weather Research and Forecasting) model (Skamarock et al., 2008)  (2018), this distribution underestimates the fine ash fraction. In order to correct this drawback, the simulation was run with a fine-enriched bi-Gaussian TGSD given by with the mean and standard deviation for the fine subpopulation given, respectively, by µ f = 2.54 and σ f = 0.38, and the 385 coarse subpopulation was defined by the parameters µ c = −2.96 and σ c = 1.03. The fractions of each subpopulation are p and 1 − p. In this work, p = 0.7 was used. Table 1 summarises the rest of model configuration options, and the resulting tephra ground load map is shown in Fig. 14.
In order to validate the FALL3D-8.0 deposit we compared the simulations results to the observations reported by Poret et al. (2018), which consist of deposit load and local grain-size distribution samples at 10 locations (S1-S10). Proximal sites (S1-S7) 390 are located between 5 and 16 km from the vent, whereas the rest of samples (S8-S10) correspond to the locations of Messina (Sicily, S8), Cardinale (Calabria, S9) and Brindisi (Puglia, S10), the latter located at about 410 km from the volcano. Figure 15a compares modelled and observed tephra loading at all sites. Note that all points lie within a factor 3 error band and, remarkably, that a perfect agreement (black solid line) is found across four orders of magnitude (from 10 −3 kg m −2 to more than 10 kg m −2 ). This good agreement is also observed at a bin level in the local grain size distributions. To illustrate this, 395 Fig. 15b compares computed and observed particle distribution modes at all sites (S01-S10). It should be noted that the model predicts unimodal distributions at all sampling sites in good agreement with field observations.

The 1986 Chernobyl nuclear accident
One of the most serious nuclear accident on the Earth occurred on 25 April 1986 at 21:23 UTC at the Chernobyl Nuclear Power Plant (NPP) in Ukraine. Two explosions in the NPP dispersed the radioactive material into the atmosphere where it was 400 transported by the winds up to distances of thousands of kilometers away from the NPP.
In order to simulate the dispersion of the radioactive material an estimation of the emission rate from the Chernobyl NPP is needed. Unfortunately, estimations of such a source term is still very uncertain and they rely on the solution of an inverse Here we used the source term reported by Brandt et al. (2002) as input for the FALL3D simulations (see Code and data availability statement to access the corresponding input files).
Particle size distributions and settling velocities of radioactive material, such as 134 Cs, 137 Cs, and 131 I, released during the accident are uncertain and their estimations also complicated by the interaction with other atmospheric particles and aerosols (Brandt et al., 2002). Effective settling velocities range from 0.0005 to 0.005 m/s for 137 Cs and from 0.001 to 0.02 m/s for 410 131 I (Brandt et al., 2002). Considering these ranges and discretizing velocities in 4 classes (see Table 3) we chose the effective classes and fractions through the best fit of the simulations results with the radioactivity values measured on 10 May 1986.
FALL3D simulations were carried out on the computational domain shown in Fig. 16 for the period from 24 April to 10 May 1986, considering the input values reported in Table 3 (Table 3) is reported in Fig. 16. We can see that most of simulated values are within an order of magnitude of the measurements (Fig. 17).
Simulations results of the radioactive cloud evolution relative to 137 Cs (vertically integrated radioactivity concentration in the atmosphere, expressed in Bq/m 2 ) from 28 April to 9 May, 1986, are reported in Fig. 18. We can clearly see that simulations correctly reproduced the patterns described by Brandt et al. (2002). The evolution of the 137 Cs dispersal is also available as a 420 video in the Supplementary Material, together with videos corresponding to the dispersal of 134 Cs and 131 I.

Conclusions
Four different examples from the new FALL3D-8.0 benchmark suite have been presented to validate the accuracy of the lastest major version release of the FALL3D model and complement a companion paper (Folch et al., 2020)  Ideally, in an operational setting, the model should be re-run with an updated data insertion time when new, good quality satellite retrievals become available. From a qualitative perspective it appears that SAL values of less than 1.5 and FMS values of ∼0.40 indicate good spatial agreement between the model and observation fields (see Fig. 9b). However, it's also important to consider that the satellite retrievals can be affected by cloud interference, meaning that the ash/SO 2 detection schemes may miss some legitimate ash or SO 2 that the model is otherwise predicting (see Fig. 9c). A data assimilation scheme that 445 considers the errors in the satellite retrievals in addition to errors in the model simulations (constructed based on an ensemble for example) can be used to resolve these issues (e.g. Fu et al., 2017;Pardini et al., 2020).
For the ground deposit load validation case study of the February 2013 Mt Etna eruption, we find exceptionally good agreement between field observations and FALL3D simulations of deposit loads. Acceptable ratios of model to observed ash loading (between 1:3 and 3:1) were found across 4 orders of magnitude, i.e., from 10 −3 kgm −2 to more than 10 kgm −2 . Good 450 agreement (between model and field observations) in terms of the mode (φ) of distributions was also found at the majority of field sampling sites (Fig. 14b).
For the radionuclides validation, we found that the model has a very good performance in reproducing the observations of the dispersal of the radioactive cloud after the 22 April Chernobyl accident, similarly to other models in the literature (e.g. Brandt et al., 2002).     Figure 2. Flow diagram showing the (a) volcanic ash retrieval process and (b) volcanic SO2 retrieval process used in the present study.
Parallelograms indicate datasets (blue for inputs, green for outputs) and rectangles indicate processes (i.e. code used to implement the retrieval algorithms and perform radiative transfer calculations). Offline calculations are any computations that are pre-computed (i.e. before any observations are made by the satellite).