Weak-constraint inverse modeling using HYSPLIT-4 Lagrangian dispersion model and Cross-Appalachian Tracer Experiment (CAPTEX) observations – effect of including model uncertainties on source term estimation
- 1NOAA Air Resources Laboratory (ARL), NOAA Center for Weather and Climate Prediction, 5830 University Research Court, College Park, MD 20740, USA
- 2Cooperative Institute for Climate and Satellites, University of Maryland, College Park, MD 20740, USA
Correspondence: Tianfeng Chai (email@example.com)
A Hybrid Single-Particle Lagrangian Integrated Trajectory version 4 (HYSPLIT-4) inverse system that is based on variational data assimilation and a Lagrangian dispersion transfer coefficient matrix (TCM) is evaluated using the Cross-Appalachian Tracer Experiment (CAPTEX) data collected from six controlled releases. For simplicity, the initial tests are applied to release 2, for which HYSPLIT has the best performance. Before introducing model uncertainty terms that change with the source estimates, the tests using concentration differences in the cost function result in severe underestimation, while those using logarithm concentration differences result in overestimation of the release rate. Adding model uncertainty terms improves results for both choices of the metric variables in the cost function. A cost function normalization scheme is later introduced to avoid spurious minimal source term solutions when using logarithm concentration differences. The scheme is effective in eliminating the spurious solutions, and it also helps to improve the release estimates for both choices of the metric variables. The tests also show that calculating logarithm concentration differences generally yields better results than calculating concentration differences, and the estimates are more robust for a reasonable range of model uncertainty parameters. This is further confirmed with nine ensemble HYSPLIT runs in which meteorological fields were generated with varying planetary boundary layer (PBL) schemes. In addition, it is found that the emission estimate using a combined TCM, formed by taking the average or median values of the nine TCMs, is similar to the median of the nine estimates using each of the TCMs individually. The inverse system is then applied to the other CAPTEX releases with a fixed set of observational and model uncertainty parameters, and the largest relative error among the six releases is 53.3 %.
Finally, the system is tested for its capability to find a single source location as well as its source strength. In these tests, the location and strength that yield the best match between the predicted and the observed concentrations are taken as the inverse modeling results. The estimated release rates are mostly not as accurate as in the cases where the exact release locations are assumed known, but they are all within a factor of 3 for all six releases. However, the estimated locations may have large errors.
1 Introduction
The transport and dispersion of gaseous and particulate pollutants are often simulated to generate pollution forecasts for emergency response or to produce retrospective analyses for a better understanding of particular events. Lagrangian particle dispersion models are particularly suited to providing plume products associated with emergency response scenarios. While accurate air pollutant source terms are crucial for quantitative predictions, they are rarely available in most applications and have to be approximated under many assumptions. For instance, the smoke forecasts over the continental US operated by the National Oceanic and Atmospheric Administration (NOAA) using the Hybrid Single-Particle Lagrangian Integrated Trajectory (HYSPLIT) model (Draxler and Hess, 1997; Stein et al., 2015) in support of the National Air Quality Forecast Capability (NAQFC) rely on outdated fuel loading data and a series of assumptions related to smoke release heights and strength approximation (Rolph et al., 2009).
Observed concentration, deposition, or other functions of the atmospheric pollutants such as aerosol optical thickness measured by satellite instruments can be used to estimate some combination of source location, strength, and temporal evolution using various source term estimation (STE) methods (Bieringer et al., 2017; Hutchinson et al., 2017). Among the applications, the recent Fukushima Daiichi nuclear power plant accidents saw the most implementations of the STE methods to estimate the radionuclide releases. The STE methods range from simple comparisons between model outputs and measurements (Achim et al., 2014; Chino et al., 2011; Hirao et al., 2013; Katata et al., 2012, 2015; Kobayashi et al., 2013; Oza et al., 2013; Terada et al., 2012) to sophisticated ones using various dispersion models and inverse modeling schemes (Chai et al., 2015; Saunier et al., 2013; Stohl et al., 2012; Winiarek et al., 2012, 2014). Another active field for STE applications is the estimation of the volcanic ash emissions. Many attempts have been made for several major volcano eruptions (Chai et al., 2017; Prata and Grant, 2001; Wen and Rose, 1994; Wilkins et al., 2014, 2016).
While many STE methods have been applied to reconstruct emission terms, the problem remains at the frontier of current research. Two popular advanced inverse modeling approaches are cost-function-based optimization methods and those based on Bayesian inference. However, it is difficult to evaluate an STE method without knowing the actual sources, as is the case for most applications. Chai et al. (2015) generated pseudo-observations using the same dispersion model in their initial inverse modeling tests, which are often called "twin experiments". Such tests allow observational errors to be added realistically (Chai et al., 2015), but it is non-trivial to represent the model errors incurred by other factors such as the uncertainties of the meteorological fields. One way to objectively evaluate the inverse modeling results is to compare the predictions with independent observations or withheld data. However, such indirect comparisons still cannot provide quantitative error statistics for the source terms.
There have been some tracer experiments conducted to study atmospheric transport and dispersion with controlled releases. In these experiments, the source terms were well quantified and comprehensive measurements were made subsequently over an extended area (Draxler et al., 1991; Van Dop et al., 1998). With the known source terms, they provide a unique opportunity to evaluate the STE methods. Singh and Rani (2014) and Singh et al. (2015) used measurements from a recent dispersion experiment (Fusion Field Trials 2007) to evaluate a least-squares technique for the identification of a point release. The European Tracer Experiment (ETEX) data set was also used to study STE methods based on the principle of maximum entropy and a least-squares cost function (Bocquet, 2005, 2007, 2008). However, such formal evaluation of the STE methods is still very limited.
A HYSPLIT inverse system based on 4D-Var data assimilation and a transfer coefficient matrix (TCM) was developed and applied to estimate a cesium-137 source from the Fukushima nuclear accident using air concentration measurements (Chai et al., 2015). The system was further developed to estimate the effective volcanic ash release rates as a function of time and height by assimilating satellite mass loadings and ash cloud top heights (Chai et al., 2017). In this study, the Cross-Appalachian Tracer Experiment (CAPTEX) data are used to evaluate the HYSPLIT inverse modeling system. The paper is organized as follows. Section 2 describes the CAPTEX experiment, HYSPLIT-4 model configuration, and the source term inversion method. Section 3 presents emission inversion results, and a summary is given in Sect. 4.
2.1 CAPTEX experiment
The CAPTEX experiment consisted of seven near-surface releases of the inert tracer perfluoro-monomethyl-cyclohexane (PMCH) from Dayton, Ohio, USA, and Sudbury, Ontario, Canada, during September and October 1983 (Draxler, 1987). Table 1 lists the locations, times, amounts, and measurement counts of the seven releases. Samples were collected at 84 different measurement sites distributed from 300 to 1100 km downwind of the emission sources, as either 3 or 6 h averages, for up to 60 h after each release (NOAA ARL, 2018a). Figure 1 shows the distribution of measurement sites and the two source locations. Release 6 is excluded from the testing, as in the earlier studies using CAPTEX data (e.g., Hegarty et al., 2013; Ngan et al., 2015), due to low detection at the measurement sites. Note that 3.4 fL L−1 has been subtracted from all CAPTEX measurements to remove the nearly constant ambient background concentration of 3.0 fL L−1 and sampling "noise" (Ferber et al., 1986). At ground level, 1 fL L−1 is equivalent to 15.6 pg m−3. Duplicate sample analyses showed that the majority of the data have a mean standard deviation estimated as 10.8 %, but contaminated samples may have standard deviations as large as 65 % (Ferber et al., 1986).
2.2 HYSPLIT-4 model configuration
In this study, the tracer transport and dispersion are modeled using the HYSPLIT model (version 4, NOAA ARL, 2018b) in its particle mode in which three-dimensional (3-D) Lagrangian particles released from the source location passively follow the wind field (Draxler and Hess, 1997, 1998; Stein et al., 2015). A particle release rate of 50 000 particles h−1 is used for all calculations. Random velocity components based on local stability conditions are added to the mean advection velocity in the three wind component directions. The meteorological data used to drive the HYSPLIT are time averaged from the Advanced Research Weather Research and Forecasting (WRF) model (ARW, version 3.2.1) simulation results at 10 km resolution and they are identical to those used by Hegarty et al. (2013). The 10 km run was nested inside a larger domain at 30 km resolution, over which the simulation was started using the North American Regional Reanalysis (NARR) at 32 km (Mesinger et al., 2006). In the WRF simulations, 3-D grid nudging of winds was applied in the free troposphere and within the planetary boundary layer (PBL). There are 43 vertical layers, with the lowest one being approximately 33 m thick. Tracer concentrations are computed over each grid cell by summing the mass of all particles in the cell and dividing the result by the cell's volume. In this study, the concentration grid cells have 0.25∘ resolution in both latitude and longitude directions and vertically they extend 100 m from the ground.
To avoid running HYSPLIT repeatedly, a TCM is generated, similar to the previous HYSPLIT inverse modeling studies (Chai et al., 2015, 2017). As described in Draxler and Rolph (2012), independent simulations are performed with a unit emission rate from each source location and for each predefined time segment. Each release scenario is then simply a linear combination of the unit emission runs.
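The linear combination above can be sketched as a matrix-vector product. In the sketch below, the matrix values and release rates are invented for illustration and are not actual HYSPLIT outputs:

```python
import numpy as np

# Illustrative sketch of the TCM idea (values invented, not from
# HYSPLIT): tcm[m, k] is the concentration at receptor sample m from a
# unit release during source segment k, so any release scenario is the
# matrix-vector product of the TCM with the segment release rates.
tcm = np.array([[2.0, 0.0],
                [1.0, 3.0],
                [0.0, 4.0]])      # pg m^-3 per unit release rate

q = np.array([5.0, 2.0])          # hypothetical release rates per segment

c_pred = tcm @ q                  # linear combination of the unit runs
```

Rebuilding predictions this way makes repeated cost function evaluations during the inversion essentially free compared with rerunning the dispersion model.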
2.3 Emission inversion
Similar to Chai et al. (2015), the unknown releases can be solved by minimizing a cost functional that integrates the differences between model predictions and observations, deviations of the final solution from the first guess (a priori), as well as other relevant information written into penalty terms (Daley, 1991). For the current application, the cost functional ℱ is defined as

$$\mathcal{F}=\frac{1}{2}\sum_{i,j}\frac{\left(q_{ij}-q_{ij}^{b}\right)^{2}}{\sigma_{ij}^{2}}+\frac{1}{2}\sum_{m}\frac{\left(c_{m}^{h}-c_{m}^{o}\right)^{2}}{\epsilon_{m}^{2}},\qquad(1)$$
where qij is the discretized source term at hour i and location j for which an independent HYSPLIT simulation has been run and recorded in a TCM. qbij is the first guess or a priori estimate, and σ2ij is the corresponding error variance. Note that all tracer sources in this study were at ground level and the release heights in HYSPLIT were set as 10 m for all of the following test cases. We also assume that the uncertainties of the release at each time and location are independent of each other, so that only the diagonal terms of the typical a priori error covariance appear in Eq. (1). chm and com denote HYSPLIT-predicted and measured concentrations, respectively. The observational errors are assumed to be uncorrelated. As ϵ2m is essentially used to weight the model–observation difference terms, the uncertainties of the model predictions and the representativeness errors should be included in addition to the observational uncertainties. This will be further discussed in Sect. 3.2. A large-scale bound-constrained limited-memory quasi-Newton code, L-BFGS-B (Zhu et al., 1997), is used to minimize the cost functional ℱ defined in Eq. (1) when multiple parameters need to be determined. As shown by Chai et al. (2015), the metric variable can be changed from concentration to logarithm concentration. Both choices of metric variable will be tested here. Note that the cases presented in this study are all formulated as overdetermined problems.
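A minimal numerical sketch of this minimization is given below, using SciPy's L-BFGS-B implementation as a stand-in for the original Fortran code. The TCM, observations, and uncertainty values are invented for illustration:

```python
import numpy as np
from scipy.optimize import minimize

# Sketch of the TCM-based cost functional of Eq. (1), minimized with
# L-BFGS-B. All numbers below are made up: a 3x2 "TCM", synthetic
# observations consistent with q = (5, 2), and simple uncertainties.
tcm = np.array([[2.0, 0.0], [1.0, 3.0], [0.0, 4.0]])
c_obs = np.array([10.0, 11.0, 8.0])    # synthetic observations
q_b, sigma, eps_m = 0.0, 1.0e4, 1.0    # first guess and uncertainties

def cost(q):
    c_h = tcm @ q                      # predicted concentrations
    background = 0.5 * np.sum((q - q_b) ** 2 / sigma ** 2)
    misfit = 0.5 * np.sum((c_h - c_obs) ** 2 / eps_m ** 2)
    return background + misfit

res = minimize(cost, x0=np.zeros(2), method="L-BFGS-B",
               bounds=[(0.0, None)] * 2)   # non-negative releases
```

The non-negativity bounds correspond to the bound-constrained feature of L-BFGS-B; with the large σ used here, the background term barely pulls the solution away from the data fit.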
3.1 Recovering emission strength without model uncertainty
As an initial test, the exact release location and time are both assumed known, and the only unknown variable left to be determined is the release rate or the total release amount. For this type of one-dimensional problem, an optimal emission strength can be easily found without having to use sophisticated minimization routines. For instance, ℱ may be directly calculated for a number of emission strength values, and the resulting ℱ=ℱ(q) plot will reveal the optimal strength q that is associated with the minimal ℱ. Note that such an optimal solution not only depends on the chosen parameters in Eq. (1) but also highly depends on the HYSPLIT model setup and the meteorological fields.
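The one-dimensional scan described above can be sketched as follows; the unit-emission responses and "observations" are invented (constructed to be consistent with a release rate of 6):

```python
import numpy as np

# One-dimensional search: evaluate the cost functional on a grid of
# release rates and take the minimizer. Values are illustrative only.
tcm_col = np.array([2.0, 1.0, 4.0])   # unit-emission concentrations
c_obs = tcm_col * 6.0                 # synthetic observations
q_b, sigma, eps_m = 0.0, 1.0e4, 1.0   # first guess and uncertainties

def cost(q):
    c_h = tcm_col * q
    return (0.5 * (q - q_b) ** 2 / sigma ** 2
            + 0.5 * np.sum((c_h - c_obs) ** 2 / eps_m ** 2))

q_grid = np.linspace(0.0, 20.0, 2001)          # 0.01 resolution
q_opt = q_grid[np.argmin([cost(q) for q in q_grid])]
```

Plotting `cost(q)` over `q_grid` reproduces the ℱ=ℱ(q) curve described in the text, with its minimum at the optimal strength.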
Both Hegarty et al. (2013) and Ngan et al. (2015) showed that the HYSPLIT dispersion model performed better for release 2 than for the other releases. Thus, release 2 is initially chosen to perform a series of inverse modeling tests. Assuming no prior knowledge of the emission strength, the first guess is given as qb=0, and σ=10^4 kg h−1 is assumed. Sensitivity tests show that when qb is changed to 100 kg h−1, the emission strength estimates are nearly unchanged with the same or a larger σ.
Firstly, no model uncertainties are considered to contribute to ϵ. The observational uncertainties are formulated to include a fractional component fo×co and an additive part ao. Note that this general formulation is chosen for its simplicity. It should be replaced when more uncertainty information is available. Table 2 lists the emission strength q that generates the minimal cost function for a series of fo and ao combinations, where fo ranges from 10 % to 50 %, and ao is assigned as 10, 20, and 50 pg m−3. All the emission strength values obtained are significantly lower than the actual release of 67 kg h−1. A larger fo value tends to yield a smaller q estimate, but a larger ao results in a larger q. The significant underestimation of the release strength is caused by the implicit assumption of a perfect model when ϵ does not include the model uncertainties. Figure 2 shows the comparison between the predicted and measured concentrations when the actual release rate of 67 kg h−1 is applied. Large discrepancies still exist even when the exact release is known and used in the simulation. For the measured zero concentrations, most of the predicted values are non-zero and can be above 1000 pg m−3. As ϵm=ao for these zero concentrations, the corresponding (chm)2/a2o terms will dominate the cost function when ao is not large enough. This explains why the underestimation is not as severe for ao=50 pg m−3 as for ao=10 pg m−3. While ϵm does not change with fo for the zero concentrations, smaller fo values help increase the weighting of the terms associated with large measured concentrations. Thus, the estimated emission strength when fo=10 % is better than when fo=50 %.
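The dominance of the measured-zero terms can be illustrated with the representative numbers quoted above (a predicted value of 1000 pg m−3 against a measured zero, with ao=20 pg m−3 and fo=20 %); all values are illustrative:

```python
# For a measured zero concentration eps_m = a_o, so a large predicted
# value contributes enormously to the cost, dwarfing a typical
# in-plume mismatch (here the small additive part of the in-plume
# uncertainty is ignored for simplicity).
a_o, f_o = 20.0, 0.2

# Predicted 1000 pg/m^3 where 0 was measured:
zero_obs_term = 0.5 * (1000.0 - 0.0) ** 2 / a_o ** 2
# Predicted 1200 pg/m^3 where 1000 pg/m^3 was measured:
plume_term = 0.5 * (1200.0 - 1000.0) ** 2 / (f_o * 1000.0) ** 2
```

A single such zero-measurement term outweighs thousands of typical in-plume terms, which is why the minimization drives the release rate down.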
As stated in Chai et al. (2015), the metric variable in Eq. (1) can be changed to ln(c), i.e., replacing (chm−com) with (ln chm−ln com). A constant 0.001 pg m−3 is added to both chm and com to allow the logarithm operation for zero concentrations. In such a case, ϵm can be calculated as

$$\epsilon_{m}=f_{o}+\frac{a_{o}}{c_{m}^{o}+0.001}.\qquad(2)$$
Note that 0.001 pg m−3 is also added to com in the second term to avoid dividing by zero. The ao/(com+0.001) term in Eq. (2) makes ϵm larger for low measured concentrations than for high measured concentrations. This shifts weight towards the measured high concentrations and results in the overestimation shown in Table 3. The measured zero concentrations have little effect on the final emission strength estimates. Table 3 shows that the emission strengths are overestimated but are within a factor of 2 of the actual release of 67 kg h−1 for all fo and ao combinations. Similar trends of how q changes with fo and ao are also observed here; i.e., a larger ao or a smaller fo tends to yield a larger q estimate.
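The down-weighting of low concentrations can be sketched directly from the form of Eq. (2) as reconstructed above; the parameter values follow the ranges used in the text:

```python
# Log-metric observational uncertainty following the form of Eq. (2):
# eps_m = f_o + a_o / (c_o + 0.001). The additive term inflates eps_m
# for low measured concentrations, so they receive less weight.
def eps_log(c_obs_pg_m3, f_o=0.2, a_o=20.0):
    return f_o + a_o / (c_obs_pg_m3 + 0.001)

eps_low = eps_log(1.0)       # low measured concentration
eps_high = eps_log(1000.0)   # high measured concentration
```

For a measured zero, `eps_log(0.0)` is of order a_o/0.001, so such samples contribute almost nothing to the cost, consistent with the statement above.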
While using logarithm concentration as the metric variable yields better emission estimates than using concentration as the metric variable, the results in Table 3 are apparently systematically overestimated compared to the systematically underestimated results in Table 2. In addition, the fo and ao combinations associated with the best emission estimates in Tables 2 and 3 appear to be in the opposite corners of the tables.
3.2 Recovering emission strength with model uncertainty
To consider the model uncertainties in a simplified way, ϵ2m will be formulated as

$$\epsilon_{m}^{2}=\left(f_{o}c_{m}^{o}\right)^{2}+a_{o}^{2}+\left(f_{h}c_{m}^{h}\right)^{2}+a_{h}^{2},\qquad(3)$$

where fh and ah are the fractional and additive model uncertainty parameters, analogous to fo and ao.
As ao and ah affect ϵ2m in a similar way, the representativeness errors caused by comparing point measurements with the predicted concentrations averaged over a grid cell can be included in either ah or ao.
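The interchangeability of the two additive parameters follows directly from the form of Eq. (3), as the small sketch below shows with illustrative values:

```python
# Combined uncertainty of Eq. (3):
# eps^2 = (f_o*c_o)^2 + a_o^2 + (f_h*c_h)^2 + a_h^2.
# Because a_o and a_h enter identically (both as squared additive
# terms), swapping their values leaves eps^2 unchanged, which is why
# representativeness errors can be folded into either one.
def eps2(c_o, c_h, f_o=0.2, a_o=20.0, f_h=0.2, a_h=20.0):
    return (f_o * c_o) ** 2 + a_o ** 2 + (f_h * c_h) ** 2 + a_h ** 2

swap_invariant = (eps2(100.0, 50.0, a_o=10.0, a_h=30.0)
                  == eps2(100.0, 50.0, a_o=30.0, a_h=10.0))
```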
With logarithm concentration as the metric variable, ϵm is comprised of two parts, an observational part and a model part, as

$$\epsilon_{m}=f_{o}+\frac{a_{o}}{c_{m}^{o}+0.001}+f_{h}+\frac{a_{h}}{c_{m}^{h}+0.001}.\qquad(4)$$
Note that a constant small number (0.001 pg m−3) is added to the denominators com and chm to avoid dividing by zero.
Since the predicted concentrations in Eqs. (3) and (4) will vary when source term estimates change, the model uncertainties will depend on the current release parameters. Thus, the model uncertainty terms are not static during the inverse modeling and they change along with the source estimates. Using concentration and logarithm concentration as the metric variable, respectively, Tables 4 and 5 show the emission strength estimates with different fh and ah, while keeping fo=20 %, ao=20 pg m−3. Additional tests with other chosen fo and ao values show similar but slightly different results. For brevity, they are not presented here. It should be noted that the model uncertainties are not equivalent to model errors. Although dispersion model simulations can have large errors due to various reasons including the source term uncertainties, the model uncertainties are used to indicate that the model is not perfect even with the “optimal” model parameters. Similar to the weak constraint applied in operational 4D-Var data assimilation systems (Tremolet, 2006; Zupanski, 1997), introducing model uncertainties is mainly intended to relax the model constraint for imperfect models. Here, the fh and ah parameters are given similar ranges to those given to the observational uncertainty parameters.
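This state dependence can be made concrete with a small sketch that recomputes the Eq. (3)-style weights from the current predictions at every cost evaluation. The TCM, observations, and starting point are invented:

```python
import numpy as np
from scipy.optimize import minimize

# Invented unit-emission responses and synthetic observations
# consistent with q_true; the model-uncertainty part of eps^2 is
# recomputed from the current predictions at each evaluation.
tcm = np.array([[2.0, 0.0], [1.0, 3.0], [0.0, 4.0]])
q_true = np.array([5.0, 2.0])
c_obs = tcm @ q_true
f_o, a_o, f_h, a_h = 0.2, 20.0, 0.2, 20.0   # uncertainty parameters

def cost(q):
    c_h = tcm @ q
    # Eq. (3): the (f_h*c_h)^2 part depends on the current c_h, so the
    # weighting evolves with the source estimate during minimization.
    eps2 = (f_o * c_obs) ** 2 + a_o ** 2 + (f_h * c_h) ** 2 + a_h ** 2
    return 0.5 * np.sum((c_h - c_obs) ** 2 / eps2)

res = minimize(cost, x0=np.array([4.0, 3.0]), method="L-BFGS-B",
               bounds=[(0.0, None)] * 2)
```

Because the synthetic observations are exactly reproducible here, the cost still vanishes at the true source; in real applications the moving weights are what relax the otherwise strong model constraint.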
When concentration is used as the metric variable, the emission strength estimates with model uncertainties considered are improved over those without model uncertainties. The estimates of emission strength generally increase with the model uncertainty, either through ah or fh, except for fh=50 %, where the q estimates slowly decrease with ah. When fh=0 % and ah=10, 20, and 50 pg m−3 (with ao=20 pg m−3), the q estimates of 7.7, 9.1, and 13.6 kg h−1 are in line with the results shown in Table 2, where q=7.1 kg h−1 for ao=20 pg m−3 and q=12.6 kg h−1 for ao=50 pg m−3. However, the trend of how the q estimates change with fh is opposite to how they change with fo. Table 4 shows that the emission strength increases with the model uncertainty factor fh. With fh=20 %, the release estimates of 48.5, 50.4, and 53.5 kg h−1 are all within 30 % of the actual release rate of 67 kg h−1. Instead of the underestimation shown in Table 2, the release estimates are overestimated when fh=50 % is assumed.
With logarithm concentration as the metric variable, a larger ah or fh results in slightly smaller q estimates. While how the q estimates change with fh is similar to how they change with fo in Table 3, how they change with ah is opposite to how they changed with ao before introducing model uncertainties. Equation (4) shows that fo and fh affect ϵm in a simple monotonic way, while the effect of ah is more complicated, as ah is divided by (chm+0.001), which varies with the source terms. Table 5 shows that the source terms are no longer overestimated as they were in Table 3. In fact, all cases show slight to moderate underestimation, with the worst result being q=42.6 kg h−1 when fh=50 % and ah=50 pg m−3. Another benefit of using logarithm concentration as the metric variable is that the range of the release estimates listed in Table 5 is not as large as that in Table 4, which resulted from using concentration as the metric variable for the same 12 combinations of ah and fh.
3.3 Cost function normalization
Without model uncertainties, the weighting terms for each model–observation pair do not change with the emission estimates. When ϵm is formulated as in Eq. (3) or Eq. (4), however, it varies with the emission estimates. This may cause complications in some circumstances when logarithm concentration is used as the metric variable. To avoid having a zero source as a global minimizer in such situations, the sum of the weights of the mismatches between model simulation and observations is kept unchanged for varying qij by normalizing it with the weight sum evaluated at qij=qbij, as shown in Eq. (5) for the logarithm concentration metric:

$$\mathcal{F}=\frac{1}{2}\sum_{i,j}\frac{\left(q_{ij}-q_{ij}^{b}\right)^{2}}{\sigma_{ij}^{2}}+\frac{\sum_{m}1/\epsilon_{m}^{2}\big|_{q=q^{b}}}{\sum_{m}1/\epsilon_{m}^{2}}\cdot\frac{1}{2}\sum_{m}\frac{\left(\ln c_{m}^{h}-\ln c_{m}^{o}\right)^{2}}{\epsilon_{m}^{2}}.\qquad(5)$$
Figure 3 shows the cost function as a function of source strength when ϵm is defined as in Eq. (4), with fh=0, ah=50 pg m−3, fo=10 %, and ao=20 pg m−3. Before introducing the cost function normalization, the global minimal cost function appears as the release strength approaches zero, while a local minimal cost function exists at 56.8 kg h−1. Several such instances were found when ah=50 pg m−3 and fh is 0 % or 10 %, while both fo and ao are relatively small. The smaller cost function as the release strength approaches zero is due to ϵm in Eq. (4) increasing as chm gets smaller. While the model–observation differences are not smaller for lower release strengths, the drastic increase of ϵm when ah=50 pg m−3 and fh is 0 % or 10 % results in a smaller cost function with decreasing source strength.
Figure 3 shows that the cost function has its minimum at q=67.3 kg h−1 after normalization. Note that the dramatic difference in the cost function magnitude before and after the normalization is due to the extremely small value of the weight sum calculated at qb=0. Tables 6 and 7 show the emission strength estimates after cost function normalization with different fh and ah, while keeping fo=20 % and ao=20 pg m−3, using concentration and logarithm concentration as the metric variables, respectively. Note that fo=20 % was chosen for the cases listed in Table 7, while fo=10 % was chosen in Fig. 3 to illustrate the potential problem. How the estimates change with fh and ah in Tables 6 and 7 is similar to what is shown in Tables 4 and 5. The estimates are generally closer to the actual release than those obtained without the cost function normalization.
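The effect of the normalization can be demonstrated with a small sketch. All response and observation values below are invented (the observations are deliberately inconsistent with any single release rate), and the uncertainty parameters follow the Fig. 3 case:

```python
import numpy as np

# Sketch of the weight normalization: the data term is rescaled by the
# weight sum evaluated at the first guess (qb = 0 here, as in the
# text) divided by the weight sum at the current estimate, so an
# estimate cannot lower the cost merely by inflating the uncertainties.
tcm_col = np.array([2.0, 1.0, 4.0])          # unit-emission responses
c_obs = np.array([15.0, 4.0, 30.0])          # synthetic observations
f_o, a_o, f_h, a_h = 0.10, 20.0, 0.0, 50.0   # parameters as in Fig. 3
TINY = 0.001                                 # pg m^-3, as in the text

def eps(c_h):
    # Log-metric uncertainty following the form of Eq. (4).
    return f_o + a_o / (c_obs + TINY) + f_h + a_h / (c_h + TINY)

def data_term(q, normalize):
    c_h = tcm_col * q
    e = eps(c_h)
    term = 0.5 * np.sum((np.log(c_h + TINY) - np.log(c_obs + TINY)) ** 2
                        / e ** 2)
    if normalize:
        w_first_guess = np.sum(1.0 / eps(tcm_col * 0.0) ** 2)
        term *= w_first_guess / np.sum(1.0 / e ** 2)
    return term
```

Without the normalization, a near-zero source yields a smaller data term than a physically plausible strength (q=6 here); with the normalization, the ordering is reversed and the spurious minimum near zero disappears.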
When having concentration as the metric variable and with fh=50 %, the emission strength estimates are 64.7, 64.7, and 65.3 kg h−1 for ah=10, 20, and 50 pg m−3, respectively. They are all within 5 % of the actual release rate. However, fh less than or equal to 20 % results in significant underestimation. When having logarithm concentration as the metric variable, the source term estimates are not very sensitive to fh and ah values, and the results listed in Table 7 are all within 20 % of the actual release rate. Among those estimates, a result of 67.3 kg h−1 when fh=10 % and ah=10 pg m−3 is almost identical to the actual release rate.
3.4 Tests with ensemble HYSPLIT runs
Ngan and Stein (2017) simulated the CAPTEX releases using a variety of PBL schemes. In their configuration, WRF version 3.5.1 was used with 27 km grid spacing and 33 vertical layers. The NARR data set was used for the initial conditions and lateral boundary conditions. The WRF model was initialized every day at 06:00 UTC, and the first 18 h of spin-up time in each 42 h simulation were discarded. The PBL schemes used to create the WRF ensemble were the Yonsei University (Hong et al., 2006), Mellor–Yamada–Janjic (Janjic, 1994), quasi-normal scale elimination (Pergaud et al., 2009), MYNN 2.5 level TKE (Nakanishi and Niino, 2006), ACM2 (Pleim, 2007), Bougeault–Lacarrere (Bougeault and Lacarrère, 1989), University of Washington (Bretherton and Park, 2009), total energy mass flux (Angevine et al., 2010), and Grenier–Bretherton–McCaa (Grenier and Bretherton, 2001) schemes. Nine simulations were conducted with these PBL schemes and their associated surface layer schemes, except for the YSU, BouLac, UW, and GBM cases, in which the MM5 Monin–Obukhov surface layer scheme was applied. The land surface model was the Noah land surface model (Chen and Dudhia, 2001), except for ACM2, for which the Pleim–Xiu land surface model was used.
An individual TCM is generated using each of the nine simulations. The nine TCMs can be used to estimate the emission strengths independently following the same procedure described previously. Tables 8 and 9 show the third (25th percentile), fifth (median), and seventh (75th percentile) emission strengths of the nine estimates that minimize the normalized ℱ defined in Eq. (5) with different fh and ah, while keeping fo=20 %, ao=20 pg m−3, using concentration and logarithm concentration as the metric variables, respectively. The 25th percentile and 75th percentile values are mostly within 5 % of the median estimates. While the median estimates show the same trends with fh and ah as the results in Tables 6 and 7, they are significantly larger due to the meteorological model differences. Apparently, the differences among the simulations with different PBL schemes are smaller than the differences between the ensemble simulations here and the simulation used in the earlier sections. This suggests that uncertainties of the emission strength are probably larger than the ranges indicated by the 25th and 75th percentile values. The results using logarithm concentration as the metric variable are quite robust with the listed model uncertainty parameters. However, the estimates using concentration as the metric variable are very sensitive to fh and ah. This is consistent with results shown in Sect. 3.2 and 3.3.
Instead of using each individual TCM generated from nine simulations independently, the nine TCMs can be combined into one matrix by taking the median or average values. The combined TCM can then be used to estimate the source terms. The results for concentration and logarithm concentration metric variables are listed in Tables 10 and 11, respectively. They show that the emission estimates using the median transfer coefficients of the nine TCMs are very close to the median of the nine estimates using the nine simulations individually. For the cases with logarithm concentration as the metric variable, the emission estimates using the median value of the nine TCMs are all within 3.1 % of the median values of the nine estimates obtained with each individual TCM. For the cases with concentration as the metric variable, the average relative differences are 6.4 %, with the maximum relative difference being 10.8 % when fh=10 % and ah=50 pg m−3. Combining the TCMs by taking the median value generates slightly better results than combining the TCMs by taking the average value does.
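Combining the ensemble TCMs element-wise can be sketched as follows; the shapes and values are invented stand-ins for the matrices produced by the nine WRF/HYSPLIT runs described above:

```python
import numpy as np

# Stack the per-member TCMs along a new leading axis, then take the
# element-wise median (or mean) across members to form one combined
# TCM that can be used in the inversion like any single-member TCM.
rng = np.random.default_rng(0)
n_members, n_obs, n_segments = 9, 40, 6
tcms = rng.lognormal(mean=0.0, sigma=0.5,
                     size=(n_members, n_obs, n_segments))

tcm_median = np.median(tcms, axis=0)   # element-wise across members
tcm_mean = tcms.mean(axis=0)
```

Either combined matrix plugs directly into the cost function in place of a single TCM, which is the comparison reported in Tables 10 and 11.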
Similar to what was found in earlier sections and also in Chai et al. (2015), the cases having logarithm concentration as the metric variable generally yield better results than those having concentration as the metric variable. It is probably due to the large range of the concentrations. When having concentration as the metric variable, certain model uncertainty parameters yield good source terms, but the estimates are quite sensitive to the choices of the model uncertainty parameters. However, it is not easy to find such model uncertainty parameters that would yield satisfactory results for applications when the actual releases are indeed unknown. The results here and in the previous sections show that the estimates having logarithm concentration as the metric variable are quite robust for a reasonable range of model uncertainty parameters. For these reasons, logarithm concentration is chosen as the metric variable for the later tests.
3.5 Source location and other releases
In addition to the source strength, the source location and its temporal variation can be retrieved with adequate accuracy using the HYSPLIT inverse system described here if sufficient measurements are available. For instance, Chai et al. (2015) estimated ninety-nine 6 h emission rates of the radionuclide cesium-137 from the Fukushima nuclear accident using 1296 daily average air concentration measurements at 115 stations around the globe. Here, the system's capability to locate a single source will be tested using a straightforward approach. In these tests, the release time is assumed known, but the location and strength are left to be determined. A suspect region is first gridded at a certain spatial resolution to form a limited number of candidate source locations. An optimal strength is then found at each candidate source location following the method described earlier. The location that results in the best match between the predicted and the observed concentrations is considered the likely source location.
In the following tests, an 11×11 grid with 0.2∘ resolution in both longitude and latitude directions is used to generate 121 candidate source locations. The grids are centered at 40.0∘ N, 84.5∘ W for releases 1–4 and at 46.6∘ N, 80.8∘ W for releases 5 and 7. Using the normalized ℱ defined in Eq. (5) and assuming fo=20 %, ao=20 pg m−3, fh=20 %, and ah=20 pg m−3, a minimal cost function associated with an optimal release strength can be found at each location. When logarithm concentration is taken as the metric variable, the emission estimates are not sensitive to the fh and ah choices, as indicated by the results in Tables 7, 9, and 11. Figure 4 shows the 121 candidate locations and their respective minimal cost function values for release 2. No candidate location is chosen to collocate with the actual source location, which would be unknown in future applications that need to locate sources. A global minimal point is found at 39.8∘ N, 84.5∘ W, with its minimum achieved when q=48.5 kg h−1. This grid point is taken as the estimated source location, and it is 26.4 km away from the actual release site (39.90∘ N, 84.22∘ W). The neighboring location (39.8∘ N, 84.3∘ W), which is the closest to the actual release site, yields a slightly larger ℱ=3.17 with an optimal release rate of 60.9 kg h−1. If the exact source location is known, as in the tests presented earlier, the cost function ℱ reaches 1.59 at its minimal point when q=61.5 kg h−1. Apparently, compared with the cases in which the release strength is the only unknown, finding both the source location and its strength from the same amount of observations is expected to be more difficult. Note that the smaller normalized ℱ values in Fig. 3 are for a case with different observation and model uncertainty parameters, where fo=10 %, ao=20 pg m−3, fh=0 %, and ah=50 pg m−3.
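The candidate-location search can be sketched as a nested scan: one strength scan per candidate, keeping the location with the smallest cost. The per-location unit-emission responses below are invented stand-ins for TCM columns, and a plain least-squares cost is used for brevity instead of the normalized log-metric functional:

```python
import numpy as np

# Grid search over invented candidate locations: the synthetic
# observations are built from one "true" location and strength, so the
# scan should recover both exactly in this idealized setting.
rng = np.random.default_rng(1)
n_locations, n_obs = 121, 30
unit_runs = rng.uniform(0.0, 5.0, size=(n_locations, n_obs))
true_loc, true_q = 60, 67.0
c_obs = unit_runs[true_loc] * true_q        # synthetic observations

q_grid = np.linspace(0.0, 200.0, 401)       # 0.5 resolution

best_loc, best_q, best_cost = None, None, np.inf
for loc in range(n_locations):
    c_pred = unit_runs[loc][None, :] * q_grid[:, None]
    costs = 0.5 * ((c_pred - c_obs) ** 2).sum(axis=1)
    k = np.argmin(costs)
    if costs[k] < best_cost:
        best_loc, best_q, best_cost = loc, q_grid[k], costs[k]
```

With real data the minimum is far less sharp, which is why the estimated locations in Table 12 can sit a grid cell or more away from the true release sites.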
Table 12 lists the source location and strength estimates for the six releases following the same procedure as described here, where the uncertainty parameters are fo=20 %, ao=20 pg m−3, fh=20 %, and ah=20 pg m−3. Releases 1 and 4 have the minimal cost function ℱmin occurring at the north boundary and the west boundary, respectively. In such scenarios, it might be necessary to expand the suspected source region in future applications to find the source locations. However, if the source locations are known to reside in the suspected region, the sources may well lie near the boundaries. In such cases, the point with ℱmin should still be considered the estimated source location. Releases 3, 5, and 7 have their ℱmin occurring at inner grid points, similar to release 2 shown in Fig. 4. For none of the releases does the candidate source location closest to the actual site yield the best match between model simulation and observations, as quantified by the cost function ℱ. Among the six releases, the estimated source location for release 2 is the closest to its actual release site, with a distance of 26.4 km.
The release rates obtained along with the likely source locations are underestimated by a factor of 3 for release 1 and overestimated by a factor of 3 for releases 4 and 7, while the estimates for releases 2, 3, and 5 are much better, with relative errors of −27.6 %, −5.4 %, and 21.5 %, respectively. Table 12 also lists the release rates q′ estimated with the exact source location assumed known. These estimates are within a factor of 2 of the actual release rates for all releases, and the largest relative error is 53.3 % for release 1. The posterior uncertainties of the release rate estimates are also calculated and listed. They range from 1.8 kg h−1 for release 2 to 6.2 kg h−1 for release 1. These posterior uncertainties appear to be underestimated, likely because of the model uncertainty assumption, including its simplified formulation as well as the chosen parameter values. Whether the source location is known or unknown, release 2 has one of the best emission estimates among the six releases, probably because the HYSPLIT forward model performs best for this release (Hegarty et al., 2013). The significant model errors in simulating transport and dispersion, even with the exact source terms, are mostly caused by meteorological uncertainties, although the HYSPLIT physical schemes and parameters, as well as the numerical discretization, also contribute.
An assumption made in this inverse modeling algorithm is that the differences between model and observation follow a normal distribution with a zero mean. Figure 5 shows the probability density function (pdf) of ln(ch)−ln(co) for the six CAPTEX releases using the estimated release rates q′ listed in Table 12. The pdf of ln(ch)−ln(co) for release 2 is consistent with the normal distribution assumption, the pdf for release 4 shows the largest deviation from a normal distribution, and those for the other four releases resemble a normal distribution to some extent. The largest relative error, for release 1, is likely related to the negative mean of its ln(ch)−ln(co) distribution shown in Fig. 5: the overestimated q′ probably results from compensating for the model's low bias. Note that the better performance using ln(ch)−ln(co) rather than ch−co is believed to arise because the normal distribution assumption is largely valid for the former but probably not for the latter.
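A quick check of this zero-mean assumption can be made directly from the paired concentrations. The sketch below is illustrative only; the detection floor `floor` (used to exclude zero or background values) is an assumed parameter, not a value from the paper.

```python
import numpy as np

def log_residual_stats(c_mod, c_obs, floor=1e-3):
    """Sample mean, standard deviation, and skewness of ln(ch) - ln(co),
    using only paired samples above a detection floor. A mean far from zero
    or a large skewness flags a violation of the normality assumption."""
    mask = (c_mod > floor) & (c_obs > floor)
    r = np.log(c_mod[mask]) - np.log(c_obs[mask])
    mu = r.mean()
    sd = r.std(ddof=1)
    skew = np.mean(((r - mu) / sd) ** 3)
    return mu, sd, skew
```

A clearly negative mean, as seen for release 1, indicates that the model underpredicts the observed concentrations, which the inversion compensates for by inflating q′.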
The meteorological fields and the observations are the two major inputs to the current inverse modeling. As discussed above, the better model performance for release 2 leads to better inverse results than for the other releases. However, it is impossible to eliminate the model uncertainties. In practice, ensemble runs can be used to quantify the uncertainties and to reduce the model errors by taking the average or median values of the ensemble runs. On the other hand, increasing the number of observations is effective in improving the inverse modeling results and reducing the uncertainty of the estimates. In principle, when the release strength is the only value to be determined, each measurement within the predicted plume can provide an independent estimate. However, relying on a single observation is problematic, since a particular model output can differ greatly from the observation and, when used in isolation, lead to an erroneous estimate of the source strength. For instance, although the HYSPLIT predictions of release 2 with the exact source terms are very good overall, comparisons with individual measurements show severe underestimation (e.g., 0.77 pg m−3 predicted versus 686 pg m−3 measured) as well as significant overestimation (e.g., 2033 pg m−3 predicted versus 31.2 pg m−3 measured). Therefore, similar to a regression technique, increasing the number of samples improves the final results, as exemplified by the very good source term estimate for release 2 when all available measurements are used. Also note that samples outside the predicted plumes do not contribute to the inverse modeling. Table 1 lists the total measurement counts for each release, but the measurements actually contributing to the inverse modeling are those inside the HYSPLIT plumes, including those with zero or background concentrations.
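The two ensemble strategies mentioned above — merging the member TCMs into one combined matrix before inverting, versus inverting each member and taking the median of the estimates — can be compared with a small sketch. Here `invert` stands in, as an assumption, for whatever single-TCM inversion is used.

```python
import numpy as np

def ensemble_estimates(tcms, invert):
    """Compare two ways of using a TCM ensemble:
    (a) combine the member TCMs element-wise (median) and invert once;
    (b) invert each member separately and take the median of the estimates."""
    combined_tcm = np.median(np.stack(tcms), axis=0)        # (a) one merged TCM
    q_combined = invert(combined_tcm)
    q_median = float(np.median([invert(t) for t in tcms]))  # (b) median of member estimates
    return q_combined, q_median
```

With a simple least-squares `invert` and members that differ only by an overall scale, both routes return the same rate, consistent with the finding that the combined-TCM estimate is similar to the median of the nine individual estimates.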
The numbers of such effective measurements inside the plumes generated by HYSPLIT from the exact source locations and time periods are reduced to 148, 237, 211, 68, 46, and 53 for releases 1–5 and 7, respectively. The largest number of effective measurements, 237 for release 2, again reflects the best HYSPLIT performance among the six releases. The set of effective measurements changes when the source location or release time changes: measurements that are not active in determining the source strength for a known source location and release time may still be effective in locating the source.
A HYSPLIT inverse system developed to estimate the source term parameters has been evaluated using the CAPTEX data collected from six controlled releases. In the HYSPLIT inverse system, a cost function measures the differences between model predictions and observations, weighted by the observational uncertainties. Inverse modeling tests with various observational uncertainties show that calculating concentration differences results in severe underestimation of the release rate, while calculating logarithm concentration differences results in overestimation.
Unlike other source term estimation (STE) applications, where model uncertainties are either ignored or assumed static, we introduce model uncertainty terms that depend on the source term estimates. The model uncertainty terms improve the inverse results for both choices of the metric variable in the cost function. It is also found that cost function normalization can avoid spurious minimal source term solutions when logarithm concentration is used as the metric variable. The inverse tests show that using logarithm concentration as the metric variable generally yields better results than using concentration, and the corresponding estimates are robust for a reasonable range of model uncertainty parameters. These conclusions are further confirmed with nine ensemble runs in which meteorological fields were generated using a different version of the WRF meteorological model with varying PBL schemes.
With a fixed set of observational and model uncertainty parameters, the inverse method with logarithm concentration as the metric variable is then applied to all six releases. The emission rates are well recovered, with the largest relative error being 53.3 % for release 1. The system is also tested for its capability to locate a single source and estimate its strength simultaneously. The location and strength that result in the best match between the predicted and observed concentrations are taken as the inverse results. The estimated location is close to the actual release site for release 2, for which the forward HYSPLIT model has the best performance, and the strength estimates are within a factor of 3 of the actual rates for all six releases.
TC designed and conducted the inverse tests. AS provided guidance on the HYSPLIT modeling and suggestions on the inverse tests. FN performed the HYSPLIT ensemble simulations using different PBL schemes.
The authors declare that they have no conflict of interest.
This study was supported by NOAA grant NA09NES4400006 (Cooperative Institute
for Climate and Satellites – CICS) at the NOAA Air Resources Laboratory in
collaboration with the University of Maryland.
Edited by: Slimane Bekki
Reviewed by: two anonymous referees
Achim, P., Monfort, M., Le Petit, G., Gross, P., Douysset, G., Taffary, T., Blanchard, X., and Moulin, C.: Analysis of Radionuclide Releases from the Fukushima Dai-ichi Nuclear Power Plant Accident Part II, Pure Appl. Geophys., 171, 645–667, https://doi.org/10.1007/s00024-012-0578-1, 2014.
Angevine, W. M., Jiang, H., and Mauritsen, T.: Performance of an Eddy Diffusivity-Mass Flux Scheme for Shallow Cumulus Boundary Layers, Mon. Weather Rev., 138, 2895–2912, https://doi.org/10.1175/2010MWR3142.1, 2010.
Bieringer, P. E., Young, G. S., Rodriguez, L. M., Annunzio, A. J., Vandenberghe, F., and Haupt, S. E.: Paradigms and commonalities in atmospheric source term estimation methods, Atmos. Environ., 156, 102–112, https://doi.org/10.1016/j.atmosenv.2017.02.011, 2017.
Bocquet, M.: Reconstruction of an atmospheric tracer source using the principle of maximum entropy. II: Applications, Q. J. Roy. Meteor. Soc., 131, 2209–2223, https://doi.org/10.1256/qj.04.68, 2005.
Bocquet, M.: Inverse modelling of atmospheric tracers: non-Gaussian methods and second-order sensitivity analysis, Nonlin. Processes Geophys., 15, 127–143, https://doi.org/10.5194/npg-15-127-2008, 2008.
Bougeault, P. and Lacarrère, P.: Parameterization of Orography-Induced Turbulence in a Mesobeta-Scale Model, Mon. Weather Rev., 117, 1872–1890, 1989.
Chai, T., Draxler, R., and Stein, A.: Source term estimation using air concentration measurements and a Lagrangian dispersion model – Experiments with pseudo and real cesium-137 observations from the Fukushima nuclear accident, Atmos. Environ., 106, 241–251, https://doi.org/10.1016/j.atmosenv.2015.01.070, 2015.
Chai, T., Crawford, A., Stunder, B., Pavolonis, M. J., Draxler, R., and Stein, A.: Improving volcanic ash predictions with the HYSPLIT dispersion model by assimilating MODIS satellite retrievals, Atmos. Chem. Phys., 17, 2865–2879, https://doi.org/10.5194/acp-17-2865-2017, 2017.
Chen, F. and Dudhia, J.: Coupling an advanced land surface-hydrology model with the Penn State-NCAR MM5 modeling system. Part I: Model implementation and sensitivity, Mon. Weather Rev., 129, 569–585, 2001.
Chino, M., Nakayama, H., Nagai, H., Terada, H., Katata, G., and Yamazawa, H.: Preliminary Estimation of Release Amounts of I-131 and Cs-137 Accidentally Discharged from the Fukushima Daiichi Nuclear Power Plant into the Atmosphere, J. Nucl. Sci. Technol., 48, 1129–1134, 2011.
Daley, R.: Atmospheric Data Analysis, Cambridge University Press, Cambridge, 1991.
Draxler, R.: Sensitivity of a Trajectory Model to the Spatial and Temporal Resolution of the Meteorological Data during CAPTEX, J. Clim. Appl. Meteorol., 26, 1577–1588, 1987.
Draxler, R. and Hess, G.: Description of the HYSPLIT_4 modeling system, Tech. Rep. NOAA Technical Memo ERL ARL-224, National Oceanic and Atmospheric Administration, Air Resources Laboratory, Silver Spring, Maryland, USA, 1997.
Draxler, R. and Hess, G.: An overview of the HYSPLIT_4 modeling system for trajectories, dispersion and deposition, Aust. Meteorol. Mag., 47, 295–308, 1998.
Draxler, R., Dietz, R., Lagomarsino, R., and Start, G.: Across North-America Tracer Experiment (ANATEX) – Sampling and Analysis, Atmos. Environ., 25, 2815–2836, https://doi.org/10.1016/0960-1686(91)90208-O, 1991.
Draxler, R. R. and Rolph, G. D.: Evaluation of the Transfer Coefficient Matrix (TCM) approach to model the atmospheric radionuclide air concentrations from Fukushima, J. Geophys. Res., 117, D05107, https://doi.org/10.1029/2011JD017205, 2012.
Ferber, G., Heffter, J., Draxler, R., Lagomarsino, R., and Thomas, F.: Cross-Appalachian Tracer Experiment (CAPTEX '83) final report, Tech. Rep. NOAA Technical Memorandum ERL ARL-142, National Oceanic and Atmospheric Administration, Silver Spring, Maryland, USA, available at: https://www.arl.noaa.gov/documents/reports/arl-142.pdf (last access: 12 December 2018), 1986.
Grenier, H. and Bretherton, C.: A moist PBL parameterization for large-scale models and its application to subtropical cloud-topped marine boundary layers, Mon. Weather Rev., 129, 357–377, 2001.
Hegarty, J., Draxler, R. R., Stein, A. F., Brioude, J., Mountain, M., Eluszkiewicz, J., Nehrkorn, T., Ngan, F., and Andrews, A.: Evaluation of Lagrangian Particle Dispersion Models with Measurements from Controlled Tracer Releases, J. Appl. Meteorol. Clim., 52, 2623–2637, https://doi.org/10.1175/JAMC-D-13-0125.1, 2013.
Hirao, S., Yamazawa, H., and Nagae, T.: Estimation of release rate of iodine-131 and cesium-137 from the Fukushima Daiichi nuclear power plant, J. Nucl. Sci. Technol., 50, 139–147, https://doi.org/10.1080/00223131.2013.757454, 2013.
Hong, S.-Y., Noh, Y., and Dudhia, J.: A new vertical diffusion package with an explicit treatment of entrainment processes, Mon. Weather Rev., 134, 2318–2341, https://doi.org/10.1175/MWR3199.1, 2006.
Hutchinson, M., Oh, H., and Chen, W.-H.: A review of source term estimation methods for atmospheric dispersion events using static or mobile sensors, Inform. Fusion, 36, 130–148, https://doi.org/10.1016/j.inffus.2016.11.010, 2017.
Janjic, Z.: The Step-Mountain Eta Coordinate Model – Further Developments of the Convection, Viscous Sublayer, and Turbulence Closure Schemes, Mon. Weather Rev., 122, 927–945, 1994.
Katata, G., Ota, M., Terada, H., Chino, M., and Nagai, H.: Atmospheric discharge and dispersion of radionuclides during the Fukushima Dai-ichi Nuclear Power Plant accident. Part I: Source term estimation and local-scale atmospheric dispersion in early phase of the accident, J. Environ. Radioactiv., 109, 103–113, https://doi.org/10.1016/j.jenvrad.2012.02.006, 2012.
Katata, G., Chino, M., Kobayashi, T., Terada, H., Ota, M., Nagai, H., Kajino, M., Draxler, R., Hort, M. C., Malo, A., Torii, T., and Sanada, Y.: Detailed source term estimation of the atmospheric release for the Fukushima Daiichi Nuclear Power Station accident by coupling simulations of an atmospheric dispersion model with an improved deposition scheme and oceanic dispersion model, Atmos. Chem. Phys., 15, 1029–1070, https://doi.org/10.5194/acp-15-1029-2015, 2015.
Kobayashi, T., Nagai, H., Chino, M., and Kawamura, H.: Source term estimation of atmospheric release due to the Fukushima Dai-ichi Nuclear Power Plant accident by atmospheric and oceanic dispersion simulations, J. Nucl. Sci. Technol., 50, 255–264, https://doi.org/10.1080/00223131.2013.827910, 2013.
Mesinger, F., DiMego, G., Kalnay, E., Mitchell, K., Shafran, P., Ebisuzaki, W., Jovic, D., Woollen, J., Rogers, E., Berbery, E., Ek, M., Fan, Y., Grumbine, R., Higgins, W., Li, H., Lin, Y., Manikin, G., Parrish, D., and Shi, W.: North American regional reanalysis, B. Am. Meteorol. Soc., 87, 343–360, https://doi.org/10.1175/BAMS-87-3-343, 2006.
Nakanishi, M. and Niino, H.: An improved Mellor-Yamada level-3 model: Its numerical stability and application to a regional prediction of advection fog, Bound.-Lay. Meteorol., 119, 397–407, https://doi.org/10.1007/s10546-005-9030-8, 2006.
Ngan, F. and Stein, A.: A Long-Term WRF Meteorological Archive for Dispersion Simulations: Application to Controlled Tracer Experiments, J. Appl. Meteorol. Clim., 56, 2203–2220, https://doi.org/10.1175/JAMC-D-16-0345.1, 2017.
Ngan, F., Stein, A., and Draxler, R.: Inline Coupling of WRF-HYSPLIT: Model Development and Evaluation Using Tracer Experiments, J. Appl. Meteorol. Clim., 54, 1162–1176, https://doi.org/10.1175/JAMC-D-14-0247.1, 2015.
NOAA Air Resources Laboratory (ARL): CAPTEX data, available at: https://www.arl.noaa.gov/wp_arl/wp-content/uploads/documents/datem/exp_data/captex/, last access: 12 December 2018a.
Oza, R. B., Indumati, S. P., Puranik, V. D., Sharma, D. N., and Ghosh, A. K.: Simplified approach for reconstructing the atmospheric source term for Fukushima Daiichi nuclear power plant accident using scanty meteorological data, Ann. Nucl. Energy, 58, 95–101, https://doi.org/10.1016/j.anucene.2013.03.016, 2013.
Pergaud, J., Masson, V., Malardel, S., and Couvreux, F.: A Parameterization of Dry Thermals and Shallow Cumuli for Mesoscale Numerical Weather Prediction, Bound.-Lay. Meteorol., 132, 83–106, https://doi.org/10.1007/s10546-009-9388-0, 2009.
Pleim, J. E.: A combined local and nonlocal closure model for the atmospheric boundary layer. Part I: Model description and testing, J. Appl. Meteorol. Clim., 46, 1383–1395, https://doi.org/10.1175/JAM2539.1, 2007.
Prata, A. and Grant, I.: Retrieval of microphysical and morphological properties of volcanic ash plumes from satellite data: Application to Mt Ruapehu, New Zealand, Q. J. Roy. Meteor. Soc., 127, 2153–2179, https://doi.org/10.1256/smsqj.57614, 2001.
Rolph, G. D., Draxler, R. R., Stein, A. F., Taylor, A., Ruminski, M. G., Kondragunta, S., Zeng, J., Huang, H.-C., Manikin, G., Mcqueen, J. T., and Davidson, P. M.: Description and verification of the NOAA Smoke Forecasting System: the 2007 fire season, Weather Forecast., 24, 361–378, https://doi.org/10.1175/2008WAF2222165.1, 2009.
Saunier, O., Mathieu, A., Didier, D., Tombette, M., Quélo, D., Winiarek, V., and Bocquet, M.: An inverse modeling method to assess the source term of the Fukushima Nuclear Power Plant accident using gamma dose rate observations, Atmos. Chem. Phys., 13, 11403–11421, https://doi.org/10.5194/acp-13-11403-2013, 2013.
Singh, S. K. and Rani, R.: A least-squares inversion technique for identification of a point release: Application to Fusion Field Trials 2007, Atmos. Environ., 92, 104–117, https://doi.org/10.1016/j.atmosenv.2014.04.012, 2014.
Singh, S. K., Turbelin, G., Issartel, J.-P., Kumar, P., and Feiz, A. A.: Reconstruction of an atmospheric tracer source in Fusion Field Trials: Analyzing resolution features, J. Geophys. Res., 120, 6192–6206, https://doi.org/10.1002/2015JD023099, 2015.
Stein, A. F., Draxler, R. R., Rolph, G. D., Stunder, B. J. B., Cohen, M. D., and Ngan, F.: NOAA's HYSPLIT atmospheric transport and dispersion modeling system, B. Am. Meteorol. Soc., 96, 2059–2077, https://doi.org/10.1175/BAMS-D-14-00110.1, 2015.
Stohl, A., Seibert, P., Wotawa, G., Arnold, D., Burkhart, J. F., Eckhardt, S., Tapia, C., Vargas, A., and Yasunari, T. J.: Xenon-133 and caesium-137 releases into the atmosphere from the Fukushima Dai-ichi nuclear power plant: determination of the source term, atmospheric dispersion, and deposition, Atmos. Chem. Phys., 12, 2313–2343, https://doi.org/10.5194/acp-12-2313-2012, 2012.
Terada, H., Katata, G., Chino, M., and Nagai, H.: Atmospheric discharge and dispersion of radionuclides during the Fukushima Dai-ichi Nuclear Power Plant accident. Part II: verification of the source term and analysis of regional-scale atmospheric dispersion, J. Environ. Radioactiv., 112, 141–154, https://doi.org/10.1016/j.jenvrad.2012.05.023, 2012.
Van Dop, H., Addis, R., Fraser, G., Girardi, F., Graziani, G., Inoue, Y., Kelly, N., Klug, W., Kulmala, A., Nodop, K., and Pretel, J.: ETEX: A European tracer experiment; Observations, dispersion modelling and emergency response, Atmos. Environ., 32, 4089–4094, https://doi.org/10.1016/S1352-2310(98)00248-9, 1998.
Wilkins, K. L., Mackie, S., Watson, I. M., Webster, H. N., Thomson, D. J., and Dacre, H. F.: Data insertion in volcanic ash cloud forecasting, Ann. Geophys., 57, https://doi.org/10.4401/ag-6624, 2014.
Wilkins, K. L., Watson, I. M., Kristiansen, N. I., Webster, H. N., Thomson, D. J., Dacre, H. F., and Prata, A. J.: Using data insertion with the NAME model to simulate the 8 May 2010 Eyjafjallajokull volcanic ash cloud, J. Geophys. Res., 121, 306–323, https://doi.org/10.1002/2015JD023895, 2016.
Winiarek, V., Bocquet, M., Saunier, O., and Mathieu, A.: Estimation of errors in the inverse modeling of accidental release of atmospheric pollutant: Application to the reconstruction of the cesium-137 and iodine-131 source terms from the Fukushima Daiichi power plant, J. Geophys. Res.-Atmos., 117, D05122, https://doi.org/10.1029/2011JD016932, 2012.
Winiarek, V., Bocquet, M., Duhanyan, N., Roustan, Y., Saunier, O., and Mathieu, A.: Estimation of the caesium-137 source term from the Fukushima Daiichi nuclear power plant using a consistent joint assimilation of air concentration and deposition observations, Atmos. Environ., 82, 268–279, https://doi.org/10.1016/j.atmosenv.2013.10.017, 2014.
Zhu, C., Byrd, R. H., Lu, P., and Nocedal, J.: L-BFGS-B – FORTRAN subroutines for large-scale bound-constrained optimization, ACM T. Math. Software, 23, 550–560, 1997.
Zupanski, D.: A general weak constraint applicable to operational 4DVAR data assimilation systems, Mon. Weather Rev., 125, 2274–2292, 1997.