Three-dimensional methane distribution simulated with FLEXPART 8-CTM-1.1 constrained with observation data

A Lagrangian particle dispersion model, the FLEXible PARTicle dispersion chemical transport model (FLEXPART CTM), is used to simulate global threedimensional fields of trace gas abundance. These fields are constrained with surface observation data through nudging, a data assimilation method, which relaxes model fields to observed values. Such fields are of interest to a variety of applications, such as inverse modelling, satellite retrievals, radiative forcing models and estimating global growth rates of greenhouse gases. Here, we apply this method to methane using 6 million model particles filling the global model domain. For each particle, methane mass tendencies due to emissions (based on several inventories) and loss by reaction with OH, Cl and O(1D), as well as observation data nudging were calculated. Model particles were transported by mean, turbulent and convective transport driven by 1×1 ERA-Interim meteorology. Nudging is applied at 79 surface stations, which are mostly included in the World Data Centre for Greenhouse Gases (WDCGG) database or the Japan–Russia Siberian Tall Tower Inland Observation Network (JR-STATION) in Siberia. For simulations of 1 year (2013), we perform a sensitivity analysis to show how nudging settings affect modelled concentration fields. These are evaluated with a set of independent surface observations and with vertical profiles in North America from the National Oceanic and Atmospheric Administration (NOAA) Earth System Research Laboratory (ESRL), and in Siberia from the Airborne Extensive Regional Observations in SIBeria (YAK-AEROSIB) and the National Institute for Environmental Studies (NIES). FLEXPART CTM results are also compared to simulations from the global Eulerian chemistry Transport Model version 5 (TM5) based on optimized fluxes. Results show that nudging strongly improves modelled methane near the surface, not only at the nudging locations but also at independent stations. Mean bias at all surface locations could be reduced from over 20 to less than 5 ppb through nudging. Near the surface, FLEXPART CTM, including nudging, appears better able to capture methane molar mixing ratios than TM5 with optimized fluxes, based on a larger bias of over 13 ppb in TM5 simulations. The vertical profiles indicate that nudging affects model methane at high altitudes, yet leads to little improvement in the model results there. Averaged from 19 aircraft profile locations in North America and Siberia, root mean square error (RMSE) changes only from 16.3 to 15.7 ppb through nudging, while the mean absolute bias increases from 5.3 to 8.2 ppb. The performance for vertical profiles is thereby similar to TM5 simulations based on TM5 optimized fluxes where we found a bias of 5 ppb and RMSE of 15.9 ppb. With this rather simple model setup, we thus provide three-dimensional methane fields suitable for use as boundary conditions in regional inverse modelling as a priori information for satellite retrievals and for more accurate estiPublished by Copernicus Publications on behalf of the European Geosciences Union. 4470 C. D. Groot Zwaaftink et al.: 3-D methane distribution nudged to observations with FLEXPART CTM mation of mean mixing ratios and growth rates. The method is also applicable to other long-lived trace gases.

Abstract.A Lagrangian particle dispersion model, the FLEXible PARTicle dispersion chemical transport model (FLEXPART CTM), is used to simulate global threedimensional fields of trace gas abundance.These fields are constrained with surface observation data through nudging, a data assimilation method, which relaxes model fields to observed values.Such fields are of interest to a variety of applications, such as inverse modelling, satellite retrievals, radiative forcing models and estimating global growth rates of greenhouse gases.Here, we apply this method to methane using 6 million model particles filling the global model domain.
For each particle, methane mass tendencies due to emissions (based on several inventories) and loss by reaction with OH, Cl and O( 1 D), as well as observation data nudging were calculated.Model particles were transported by mean, turbulent and convective transport driven by 1 • × 1 • ERA-Interim meteorology.Nudging is applied at 79 surface stations, which are mostly included in the World Data Centre for Greenhouse Gases (WDCGG) database or the Japan-Russia Siberian Tall Tower Inland Observation Network (JR-STATION) in Siberia.For simulations of 1 year (2013), we perform a sensitivity analysis to show how nudging settings affect modelled concentration fields.These are evaluated with a set of independent surface observations and with vertical profiles in North America from the National Oceanic and Atmospheric Administration (NOAA) Earth System Research Laboratory (ESRL), and in Siberia from the Airborne Extensive Regional Observations in SIBeria (YAK-AEROSIB) and the National Institute for Environmental Studies (NIES).FLEXPART CTM results are also compared to simulations from the global Eulerian chemistry Transport Model version 5 (TM5) based on optimized fluxes.Results show that nudging strongly improves modelled methane near the surface, not only at the nudging locations but also at independent stations.Mean bias at all surface locations could be reduced from over 20 to less than 5 ppb through nudging.Near the surface, FLEXPART CTM, including nudging, appears better able to capture methane molar mixing ratios than TM5 with optimized fluxes, based on a larger bias of over 13 ppb in TM5 simulations.The vertical profiles indicate that nudging affects model methane at high altitudes, yet leads to little improvement in the model results there.Averaged from 19 aircraft profile locations in North America and Siberia, root mean square error (RMSE) changes only from 16.3 to 15.7 ppb through nudging, while the mean absolute bias increases from 5.3 to 8.2 ppb.The performance for vertical profiles is thereby similar to TM5 simulations based on TM5 optimized fluxes where we found a bias of 5 ppb and RMSE of 15.9 ppb.With this rather simple model setup, we thus provide three-dimensional methane fields suitable for use as boundary conditions in regional inverse modelling as a priori information for satellite retrievals and for more accurate esti-mation of mean mixing ratios and growth rates.The method is also applicable to other long-lived trace gases.

Introduction
Three-dimensional (3-D) global concentration fields of different trace gases, such as methane (CH 4 ), carbon dioxide (CO 2 ) or carbon monoxide (CO), are of interest for many applications.They can, for instance, inform regional air quality policies and verify climate policies via a comparison of global emissions and atmospheric concentration growth rates, and are required as input for many types of applications like radiation simulations or satellite retrievals.For instance, 3-D global concentration fields used as initial and boundary conditions for regional chemical transport models have a substantial influence on the results from these models (e.g.Tang et al., 2007;Andersson et al., 2015;Pendlebury et al., 2018).Regional inverse modelling of greenhouse gas emissions often requires global 3-D greenhouse gas concentrations as boundary conditions (Thompson and Stohl, 2014).These and many other applications explain the popularity of 3-D concentration fields produced with different tools (e.g.CarbonTracker; Peters et al., 2007).
In situ measurements are typically too sparse to provide global 3-D concentration fields.Interpolation methods are associated with large errors in regions with few observations (e.g. in the free troposphere and over oceans).They are therefore rarely used for constructing 3-D concentration fields for the atmosphere, while interpolation of maritime CO 2 measurements is more popular (Laruelle et al., 2017).Satellite retrievals can in principle produce global coverage but generally lack sufficient temporal and/or vertical resolution, spatial coverage (especially in polar regions) and the calibration necessary to make them comparable to ground-based measurements, and they can therefore contain substantial uncertainties and biases.Often, satellite observations require input of a priori vertical concentration profiles in their retrieval (Schepers et al., 2012).Concentration fields can also be simulated with atmospheric chemistry transport models, yet uncertainties in the fluxes (emissions and surface uptake) and atmospheric chemistry and transport errors in modelling will lead to mismatches between modelled and observed concentrations.In particular, if simulated concentrations are globally or regionally too high or too low compared to reality, such biases may render these concentration fields unusable.For example, for regional inverse modelling of surface emissions, a "background" concentration field biased high would lead to low (or even, for many species, unrealistic negative) emissions (Manning et al., 2003;Stohl et al., 2009).For example, an average increase by 10 ppb in the methane background leads to a 20 % decrease in Swiss national methane emissions (Stephan Henne, unpublished results).
Therefore, often a combination of (transport) models with data assimilation techniques is used to obtain more accurate model results.One prominent technique uses observation data and a data assimilation or inverse modelling approach to optimize the surface fluxes of greenhouse gases.Such methods can improve our understanding of greenhouse gas emissions and quantify sources and sinks (e.g.Meirink et al., 2008;Bergamaschi et al., 2013;Berchet et al., 2015).Simulations using the optimized emissions will generally also improve the 3-D concentration distribution compared to simulations with non-optimized emissions.The global chemistry Transport Model version 5 (TM5) simulations used in this paper were made following this approach.However, these methods cannot correct for transport errors and errors in the chemical sinks, although such errors may partly be offset by compensating errors in optimized emissions.Furthermore, emissions are only improved for certain large regions, emission types and often at low (e.g.monthly) time resolution.Thus, to improve 3-D concentration distributions, direct "correction" of simulated concentrations using observation data is preferable over improvement of emission fluxes.A combination of both approaches can also be conceived.
Data assimilation techniques such as four-dimensional variational assimilation (4D-Var) also allow simulated concentrations to be corrected in a formal fully consistent model framework.These techniques are, however, relatively computationally expensive and may not offer substantial benefits over more simple methods.Newtonian relaxation (also known as nudging or repeated insertion) is a much simpler data assimilation method to correct simulated concentrations with observation data (e.g.Anthes, 1974;Stauffer and Seaman, 1990).While it may lead to violations of the model's underlying physics and mass conservation, the method has gained popularity because of its computational efficiency, robustness and simple implementation.In contrast to complex data assimilation techniques such as 4D-Var, which require adjoint versions of the simulation model, implementing Newtonian relaxation requires only a few lines of extra code.
Apart from the question of how observation data are used to correct model biases, there exists a separate problem with current Eulerian chemistry transport models.These models struggle with the simulation of large-scale transport of chemical species, e.g. over intercontinental distances due to numerical diffusion introduced by the advection schemes (Rastigejev et al., 2010;Eastham and Jacob, 2017).In remote areas, where concentration enhancements occur mainly through transport from far-away source regions, these models are not very reliable.Lagrangian transport models, which do not suffer from numerical diffusion to a comparable extent, have often been used successfully to simulate pollution transport over intercontinental distances (e.g.Stohl et al., 2003).
In this study, we improve modelled concentration fields of long-lived trace gases in a Lagrangian transport model, the FLEXible PARTicle dispersion chemical transport model (FLEXPART CTM), by nudging simulated concentrations towards observations where and when available.We explore the method with the application to methane.Concentration fields obtained with and without nudging are evaluated based on (independent) surface observations and vertical profiles from aircraft campaigns in different regions.Furthermore, we compare the model skill to that of a frequently used Eulerian model (TM5) using non-optimized and optimized emission fields.For this comparison, we chose TM5 because this is a state-of-the-art transport model and the TM5-4D-Var is the inversion framework used in Europe's Copernicus Atmospheric Monitoring Service (CAMS).
A description of the transport model, nudging method and observations is given in Sect. 2. In Sect.3, we will first discuss the sensitivity of modelled concentration fields to settings of the nudging method and then evaluate model performance in comparison to surface and profile observations, and TM5 modelled concentration fields.
2 Methods and data

FLEXPART CTM
The FLEXPART CTM model was developed at Empa based on FLEXPART (Stohl et al., 1998(Stohl et al., , 2005)).FLEXPART is a Lagrangian particle dispersion model that calculates trajectories of parcels to describe transport processes in the atmosphere.Parcels, which can represent gases or aerosols, are influenced by dry deposition and wet deposition.For the current study, we used FLEXPART 8-CTM-1.1 (Henne et al., 2018a), which is based on FLEXPART 8.0 and described in detail by Henne et al. (2018b).Important differences between FLEXPART and FLEXPART CTM are found in the implementation of flux fields and chemistry.For the current purpose, simulations are made in domain filling mode (Stohl and James, 2004).This means that at initialization air parcels are randomly distributed over the whole model domain (in this case global) proportionally to air density.Each parcel represents a fraction of the total atmospheric mass.In addition to an air tracer, each parcel can also carry a number of chemical species including methane.Whenever air parcels reside near the surface, methane fluxes are accounted for by changing the methane masses of the respective parcel.Methane loss through reaction with OH, Cl and O( 1 D) radicals is also accounted for in the form of a pseudo-first-order reaction with prescribed monthly variable concentration fields (Henne et al., 2018b).We used OH concentration fields from the GEOS-Chem model (Bey et al., 2001).As such, we can simulate methane concentrations if initial conditions are well represented and a spin-up time is included.
The simulations presented here are driven with methane fluxes from a combination of several inventories.For anthropogenic sources, we included fluxes from the Emissions Database for Global Atmospheric Research (EDGAR), versions v4.2 for 2000-2008, while 2008 emissions were repeated for 2009-2013.Monthly biomass burning emissions are based on the Global Fire Emissions Database (GFED) version 3 (Van der Werf et al., 2010).Wetland emissions were obtained from the Lund-Potsdam-Jena Wetland Hydrology and Methane (LPJ-WHyMe; Spahni et al., 2011) dynamic global vegetation model.Additional sources taken into account include ocean hydrates (Houweling et al., 1999), wild ruminants (Houweling et al., 1999) and termites (Sanderson, 1996).Flux fields are averaged or interpolated to monthly intervals with 1 • × 1 • spatial resolution.A spinup simulation without nudging was run for the years 2000-2012.At the end of this spin-up, a single global scaling factor was applied to the simulated methane molar mixing ratios derived by a comparison to surface observations for the year 2012.This scaling allowed us to remove part of the bias from the spin-up.The sensitivity analysis and evaluation of FLEXPART CTM and the nudging method was made for 2013.The first 20 days of the simulation year with nudging are also considered spin-up time and not analysed further.In the sensitivity analysis, we consider different settings for our nudging routine (see Sect. 2.2 and Table 1) and in our evaluation we compare simulations with and without nudging to observations as well as other model simulations.Meteorological input data are ERA-Interim reanalysis fields (1 • × 1 • , 3-hourly) from the European Centre for Medium-Range Weather Forecasts (ECMWF).Output fields are saved as daily averages at 2 • × 2 • spatial resolution at 24 levels.The output levels follow topography and resolution ranges from 500 m near the surface, 1000 m in the troposphere and increasing up to 5000 m at the top level.The simulations include approximately 6 million particles each.

Nudging routine
In our simulations, we will test and use the nudging routine presented by Henne et al. (2018b).For each observation, we consider a symmetrical kernel in which modelled data should be relaxed towards the observation.The weight of the kernel varies in space and time.An Epanechnikov function is used for the spatial weight of the kernel (w s ) for each pair of observation (i) and parcel (j ): where r ij is based on the kernel extent h in directions x, y and z following (2) X j , Y j and Z j refer to the parcel location; x i , y i and z i refer to the observation location.
Table 1.Overview of kernel settings in the sensitivity analysis.The spatial width in the x direction is equal to the width in the y direction (in metres).σ obs is the standard deviation of observations over 1 year at each nudging location, σ max is the maximum value of σ obs from all nudging locations, h min is 1 • and u is the mean wind speed at the nudging location based on ERA-Interim data.

Sim Spatial width Height Temporal width Relaxation time
The temporal kernel has a tricubic weight function following where h t is the temporal kernel width, t j the current model time and t i the time of the closest valid observation.Using the spatial and temporal weight, the nudging tendency is calculated as where m j is the modelled mass and M i is the observed mass, which is calculated from observed mole fractions.τ i is the nudging relaxation timescale and t the model synchronization time step.The latter should be smaller than the former to assure numerical stability.The kernel widths and nudging relaxation timescale can be set for each observation location.Different settings have been tested as shown in Table 1.
In tests NF1 to NF6, we analysed the influence of specific parameters and kernel settings, which were the same at all stations.In tests NV1 to NV3, we introduced a kernel size dependent on the variability of the observed methane concentrations over 1 year at each station.We thereby assumed that the kernel size should be smaller in the case of larger variability (standard deviation).In all these tests, the temporal kernel width is assumed to be proportional to the spatial width.In test NW1, we reduced the temporal kernel size with increasing mean wind speed at each location, assuming air masses will reside less time around the nudging location if wind speeds are strong and observations are representative of a shorter time period.To determine the mean wind speed, we used the wind speed in 2013 at the stations interpolated from ERA-Interim data.In test NW2, we tested an influence of wind speed on the spatial extent of the kernel.Finally, in NW3, we assumed that spatial kernel size increases with wind speed, yet the temporal kernel width decreases with increased variability of the observed methane mole fractions.

Observations
Several data sources were used for different purposes in this study.Input data for the FLEXPART CTM simulations have been described in Sect.2.1.At the surface, we included observations of CH 4 (reported in units of dry air mole fraction, nmol mol −1 , abbreviated ppb) from the World Data Centre for Greenhouse Gases (WDCGG; https://ds.data.jma.go.jp/gmd/wdcgg/, last access: 17 January 2017) and EBAS (http://ebas.nilu.no/,last access: 28 February 2017) at 85 locations and 5 locations in the JR-STATION tall tower network in Siberia (Sasakawa et al., 2010).Observations include flask air sampled at intervals of a few days up to months and continuous measurements.All observations have been converted to the National Oceanic and Atmospheric Administration (NOAA) 2004 methane standard scale (Dlugokencky et al., 2005).The National Institute for Environmental Studies (NIES) data were converted to NOAA 2004 National Institute for Environmental scale assuming CH 4 NOAA−2004 = (CH 4 NIES + 12.2) /1.0087.Of 90 surface or tower observation locations in total, we arbitrarily selected 11 for validation purposes that were not used for nudging.Locations are shown in Fig. 1; a full list of surface stations is given in the Supplement (Table S1).
Further validation of the nudging routine was based on the comparison to aircraft measurements.These include regular vertical profiles of NOAA at 13 locations (Sweeney et al., 2015).Profiles were taken from the surface up to about 8000 m altitude in approximately monthly intervals.Furthermore, we include methane profiles observed in Siberia during an Airborne Extensive Regional Observations in SIBeria (YAK-AEROSIB) campaign (Paris et al., 2008) in July 2013 and observed by NIES at monthly intervals in Surgut and Novosibirsk (Sasakawa et al., 2017) for validation.Locations are shown in Fig. 1.We interpolated FLEXPART CTM data to each measurement during NOAA, NIES or YAK-AEROSIB flights.We then calculated mean profiles per month based on the vertical grid resolution of FLEXPART CTM output (24 layers).

TM5 simulations
To compare the performance of FLEXPART CTM with the nudging routine, we also evaluate reference and reanalysis fields of methane from the global chemistry transport model TM5 (Huijnen et al., 2010;Krol et al., 2005) with the same observations and methods.Methane inversion simulations are provided by the Netherlands Organisation for Applied Scientific Research -Netherlands Institute for Space Research (TNO-SRON) and are available through CAMS.The TM5-4D-Var inverse modelling system provides optimized methane fluxes (e.g.Bergamaschi et al., 2013).The TM5 methane fields are averaged daily and have a 2 • × 3 • resolution.In our evaluation, we include TM5 simulations based on a priori information (here referred to as the TM5 reference simulation) and a simulation with optimized fluxes (referred to as TM5 reanalysis simulation) that included all NOAA surface observations.Notice that TM5 has assimilated data also from the surface stations used for evaluation, and thus the comparison is not totally independent, giving the TM5 reanalysis simulation a (perhaps small) advantage over the FLEXPART CTM simulations.We did not use TM5 reanalysis simulations with the assimilated Greenhouse Gases Observing Satellite (GOSAT) data (Bergamaschi et al., 2009) since the bias in comparison to surface observations was larger than in the presented simulations, as shown in the Supplement (Figs.S1 and S2).

Sensitivity analysis
The nudging kernel settings such as the size, described in Sect.2.2, will affect the influence observations have on the methane fields.To show this influence and to find appropriate settings for our application, we performed a sensitivity analysis.The different kernel settings are listed in Table 1. Figure 2 shows the annual mean surface methane molar mixing ratios as simulated with FLEXPART CTM without nudging.Additionally, the difference in annual mean mixing ratios between the reference simulation and two nudging simulations is shown.With relatively small nudging kernels (NF2, Fig. 2b), some influence on global methane fields is seen.The strongest effects, however, are restricted to the nudging locations.Increasing the kernel size for background stations (NV3, Fig. 2c) shows that the nudging can influence simulated methane values strongly in a far larger region.The nudging appears to mostly lower modelled methane mole fractions in the Northern Hemisphere and to increase methane in the Southern Hemisphere, indicating a positive and negative bias of methane concentrations, respectively.This will later be discussed based on a comparison to observations.Furthermore, Fig. 3 illustrates how nudging influences simulated methane values over time in a region in the Southern Hemisphere at different altitudes.Methane starts to increase in the planetary boundary layer, because nudging occurs at surface stations, and over time the influence of nudging extends to higher altitudes.The differences come from nudging stations within this particular region, but effects from elsewhere will also reach this region with time and likely at higher altitudes.
A more quantitative analysis of the sensitivity to nudging kernel settings is given in Fig. 4. We compare observed and modelled daily averaged methane concentrations in 2013 through the coefficient of determination (r 2 ), root mean square error (RMSE) and bias for nudged FLEXPART CTM simulations and the reference simulation at all surface locations (including both stations used for the nudging and independent validation stations).Generally, model performance in terms of root mean squared error, correlation and bias improves for all tested nudging kernel settings compared to the reference simulation.Already for a rather small kernel size (NF1), the coefficient of determination increases, while there remains a relatively large spread in model bias.It also appears that the chosen relaxation time (NF4 vs. NF3 or NV3 vs. NV2) affects the coefficient of determination.Especially large changes are seen for the temporal kernel width (NW3 vs. NW2 and NF5 vs. NF3).
The representativeness of a station is directly linked to the variability of the observed methane concentrations.If the variability is high, this may indicate that the station is often influenced by emissions in its vicinity or by smallscale transport phenomena.Such a station should exert less weight on the simulated concentration field than a station with greater representativeness and its area (and/or time period) of influence should be smaller.Therefore, in another set of tests (NV1-3, NW1 and NW3), we assumed that the variability of observed concentrations (expressed in terms of the annual standard deviation) limits either the temporal or spatial kernel width.Unlike tests NF1-NF6, the kernel size is thereby no longer identical for all locations.Based on the small values of RMSE, it appears that changing the spatial kernel size based on the observed variability can improve the performance of the nudging routine (NV1 and NV3).The mean RMSE in the reference case is 35.3 ppb for all stations and 23.6 ppb for validation stations and is reduced to 10.3 ppb for all and 18.7 ppb for validation stations in the NV3 simulation.The correlation improves mostly at nudging stations, as can be concluded from an increase of mean r 2 for all stations from 0.54 in the reference to 0.92 in the NV3 simulation, yet at validation stations remains at 0.62 for REF and NV3.
We are particularly interested if the nudging routine can remove some of the bias of modelled methane simulations, as bias is detrimental for many applications (e.g.inverse modelling).In Fig. 5, we therefore show the model bias and coefficient of determination at surface stations for a selection of sensitivity simulations.In the reference simulation, methane concentrations tend to be underestimated in the Southern Hemisphere and overestimated in the Northern Hemisphere by FLEXPART CTM with the used emission inventories.The nudging routine removes most of the model bias at many stations.We found a large variation of observed methane in Siberia at the stations of the JR-STATION network.At some of these stations, daily mean mixing ratios reach values ex-ceeding 2000 ppb (Sasakawa et al., 2010).The stations are located in taiga, steppe and wetland biomes.Sasakawa et al. (2010) showed that during summer methane emissions from wetlands and during winter from fossil fuel extraction can explain most of the variation in methane values in this region.Also biomass burning is locally contributing to elevated methane.It appears that, in this region, the model benefits from nudging kernel sizes related to the standard deviation of observations (compare NV3 to NF3).For simulation NV3, the nudging kernel sizes are relatively small and there is less overlap of the different kernels in this region, allowing for a larger spatial variability in simulated concentrations.
Bias and r 2 shown in Fig. 5 summarize model performance at many stations.To get a better understanding of how nudging changes the modelled methane concentrations, we discuss the annual cycle at two surface stations (Figs. 6 and 7).As an example of a nudging location, we look at the remotely located Palmer Station, Antarctica (PSA; −64.92 • N, −64.00 • E).Here, observations include approximately weekly flask-air samples.Modelled and observed mixing ratios in 2013 are shown as time series and a scatter plot (Fig. 6).Additionally, model performance statistics for PSA are given in the table in Fig. 6.The figures and table show that correlation between model and observations is already large for the reference simulation, as the seasonal cycle is well captured by the model, but the bias can be reduced through nudging.For the smallest nudging kernel size (NF1), the improvement is limited, from −8.5 ppb (REF) to −8.4 ppb (NF1).For all other nudged simulations, however, the bias is reduced to less than −2.3 ppb.The time series reveals more differences.Since the variation of observed methane mixing ratios at PSA is relatively small, the nudg- ing kernel in simulation NV3 was large (over 10 • radius).In the case of small nudging kernels (NF3), methane deviates from the reference simulation only during measurement periods but nearly returns to the reference values in between measurements.For larger kernels (NV3), the methane concentrations are kept at a level similar to the observations also in between observations.Given the remote site location and the lack of local emissions, this scenario appears more realistic.This behaviour is particular to remote stations and rather extreme at PSA, where the reference model has a pronounced bias and there are no other observations to correct it.At other stations, the influence of other close-by nudging stations is generally larger and better results are obtained for small kernels as well.Besides the kernel settings, the number of nudging stations included in a simulation will thus also affect model performance, as will be discussed at the end of this section.
Another station for which we show observations and simulations in more detail is Heimaey (ICE, Fig. 7).Heimaey is located close to the southern coast of Iceland and is an independent station where no nudging is applied in FLEX-PART CTM simulations.At this location, the reference simulation has a relatively large bias of over 28 ppb.With the introduction of nudging at other stations, the bias is reduced to approximately 6 ppb and also all other statistical parameters improve (see table in Fig. 7).Most improvement, both in correlation and bias reduction, was seen in simulation NV3.In this simulation, the relatively large kernel given to background stations helps to reduce bias globally.In the time series (Fig. 7a), it can be seen that model performance at the start of the year is worse than at the end of the year.Although the deviation between reference and nudged simulations occurs from the start, the difference between several simulations accumulates during the simulation period as more ob- servations are included.We show the simulations over the complete nudging period to demonstrate this, but we did not include the first 20 days with nudging in the model performance calculations.Deviations can of course also increase if the model performs worse in particular periods, for instance, due to errors in emissions, inducing stronger nudging effects.
To demonstrate that the nudging routine generally improves model results throughout the domain, we evaluate the bias at all independent validation stations in Table 2. Effects are smaller than at many nudging stations shown in Fig. 5. Nonetheless, there is a clear improvement at some stations, even though they are remote, such as USH (Ushuaia, Tierra del Fuego, Argentina).Table 2 also shows that FLEXPART CTM model performance at some stations can decrease depending on nudging kernel settings.This is the case at THD (Trinidad Head, California, USA) and YAK1 (Yakutsk, Siberia, Russia).While methane mixing ratios were slightly underestimated at THD in the reference simulation, an overestimation was observed at close-by nudging locations.In the nudging simulations, methane concentrations were thus decreased in this region and underestimation at the independent station increased.In Yakutsk, measurements are made at a low inlet (11 m, here called YAK1) and a high inlet (77 m, YAK2).Nudging appears to decrease bias of the methane values at the high inlet but not at lower altitude.A large bias both in reference and nudged simulations is seen at BKT (Bukit Koto, Tabang, Indonesia).This station is influenced by nearby biomass burning emissions and a sea-breeze system that is difficult to resolve in the FLEXPART simulations (Henne et al., 2018b).
Jungfraujoch, Switzerland (JFJ), is a high-altitude station and the strongly decreased bias of modelled mixing ratios indicates that surface nudging has a beneficial influ-  1).Independent validation stations are included.The red line in the boxplots indicates the median, the bottom and top box edges are the 25th and 75th percentiles, and whisker ends are at 9th and 91st percentiles.The circles are outliers.
two locations with NOAA Earth System Research Laboratory (ESRL) vertical profiles.
Nudging at the surface changes modelled methane concentrations up to heights of 8 km (Fig. 8).At ACG (Alaska Coast Guard, Alaska, United States of America), differences due to surface nudging even appear largest at heights between 2 and 6 km.The number of nudging observations is limited in this region and differences are therefore mostly due to nudging in well-observed regions further south, followed by northward transport accompanied by isentropic up-  lift of those air masses to higher altitudes.Similarly, Sweeney et al. (2015) showed, based on vertical profiles of CO 2 , that air transported to this region from the central Pacific is well mixed throughout the observed column (< 8000 m).In a region with multiple nudging locations such as DND (Dahlen, North Dakota, USA), on the other hand, mainly the methane concentrations near the surface are affected by nudging.As for surface validation stations, improvements are seen mostly in RMSE and bias rather than correlation (Table 3).
Finally, we noted earlier that, besides nudging kernel properties, the number of nudging observations included in the simulation can affect the model performance.We therefore tested model performance for simulations with different numbers of particles and where some of the nudging locations in NV3 were removed.As expected, reducing the number of particles somewhat increases the noise in the simu-lation but does not strongly influence overall model performance (not shown), because as long as the parcels in the simulation are well-mixed the same fraction of particles will be influenced by nudging.We expect a larger difference in the case where we randomly remove nudging stations from our simulations.We tested this for the kernel settings used in NV3, with 50 % and 80 % of the nudging stations included.A summary of model performance results is given in Table 4. Looking at all stations, correlation increases and RMSE decreases with an increase in number of stations.As already discussed, the influence on correlation between model and observations at independent sites is limited.RMSE at independent validation sites does decrease if 50 % or 80 % of the nudging locations are included, but the final 20 % do not improve results here.With 80 % of the nudging locations, the background methane mixing ratios were thus already im-Geosci.Model Dev., 11,[4469][4470][4471][4472][4473][4474][4475][4476][4477][4478][4479][4480][4481][4482][4483][4484][4485][4486][4487]2018 www.geosci-model-dev.net/11/4469/2018/,3 GMI, Guam −5.2 −5.6 −6.4 −6.3 −6.6 −6.5 −5.9 −4.9 −6.0 −5.8 −4.1 −6.5 −6.0 KUM, Cape Kumukahi −2.5 −1.6  proved globally, or the nudging stations most relevant for the validation stations were already included in the 80 %.

Model performance
Based on results shown in previous sections, we selected the NV3 run to provide the best nudged FLEXPART CTM methane distribution.For further analysis of model performance, we will only use this simulation.We will evaluate this best case simulation in comparison with the methane field simulated by the Eulerian TM5 model (see Sect. 2.4).

Surface mixing ratios
As for FLEXPART CTM in Fig. 5, we show bias and correlation values for 1 year of TM5 simulations at all surface stations (Fig. 9).We include TM5 simulations that are based on a priori information (TM5 REF) and TM5 reanalysis simulations that include fluxes optimized by the TM5-4D-Var system (e.g.Bergamaschi et al., 2010).When comparing the reference versions of FLEXPART CTM and TM5, similar biases can be found for both models, although FLEXPART CTM's performance with mean RMSE of 35.3 ppb and mean absolute bias of 20.9 ppb appears slightly poorer than that of TM5 with mean RMSE of 33.9 ppb and mean absolute bias of 17.3 ppb (also, see Table 5).On the other hand, correlation is higher for FLEXPART CTM (r 2 = 0.54) than for TM5 (r 2 = 0.39).Differences may be due to the use of different emission inventories but can also be related to other modelling aspects, e.g.differences in methane lifetime and OH fields.More interesting, however, is the performance of FLEXPART CTM when the nudging routine is switched on and TM5 based on optimized fluxes, shown in Fig. 9 and Table 5. Especially if nudging stations and validation stations are both considered, the performance of the nudged FLEX-PART CTM simulation (NV3) is better than that of the TM5 simulations.Mean bias of all stations was reduced to 4.9 ppb in NV3 but 13.3 ppb in the optimized TM5 simulation.At independent stations (for FLEXPART CTM), methane mixing ratios are on average biased by 8.8 ppb in NV3 and 8.1 ppb in TM5 reanalysis.It should be noted that most of the nudging data are also used in the reanalysis version of TM5.In fact, data from independent NOAA surface stations (diamonds in Fig. 9) such as ICE, USH, AZR (Terceira Island, Azores, Portugal) and GMI (Guam, USA) are not used in FLEXPART CTM but are ingested in the TM5 reanalysis system, and better performance by TM5 may therefore be expected.Observations from the Yakutsk tower are not used in either of the simulations.Here, the model bias is lower in the nudged FLEXPART CTM simulations.This is probably also related to the inclusion of data from the JR-STATION network in FLEXPART CTM, which are not included in TM5.It appears that both models, with optimization, are well able to capture background concentrations in the Southern Hemisphere.In the northern midlatitudes, deviations from observations are generally largest.This is also the region with the most observations, and spatial variability appears large.Here, the models are, with the current setup, not able to capture this large spatial variability, and observations are representative of small regions only.This is further demonstrated in a comparison of averaged zonal mean values from FLEXPART CTM and TM5 simulations and zonal mean values from the GLOBALVIEW-CH4 product in Fig. 10.GLOBALVIEW-CH4 extends surface measurements of methane in time and space to provide model-independent trace gas climatology (Masarie and Tans, 1995;GLOBALVIEW-CH4, 2009).Differences between the three products are largest from the Equator to the northern midlatitudes.The GLOBALVIEW-CH4 product is probably more representative of a maritime baseline and biased low in the Northern Hemisphere, partly explaining the larger difference here between both models and GLOBALVIEW-CH4.

Aircraft profiles
To assess model performance above the surface, we use vertical profiles obtained from aircraft measurements at 16 locations (see Fig. 1).Overall, 12 of these are located in North In Siberia, we obtained data from the YAK-AEROSIB campaign and regular NIES profiles in Surgut and Novosibirsk.
We show mean profiles in 2013 at each location in Fig. 11.
The simulations are interpolated to profile observation locations and times, and averaging is done at FLEXPART CTM vertical resolution.Besides comparing FLEXPART and observations, we again also include results of TM5 reference and reanalysis (or 4D-Var) output.As was also shown before based on vertical profiles at two stations (Fig. 8), nudging FLEXPART CTM simulations at the surface influences vertical profiles throughout the troposphere.In North America, many surface observations used for nudging FLEX-PART CTM are also included in the TM5 reanalysis simulations.Good results are therefore obtained with both models at some locations, such as DND (Dahlen) or CMA (Cape May).It appears that FLEXPART CTM tends to underestimate methane concentrations at altitudes above 6000 m; see, for instance, SCA (Charleston, South Carolina, USA), WBI (West Branch, Iowa, USA) and YAK-AEROSIB.This may indicate that the exchange between troposphere and stratosphere is too strong in FLEXPART CTM, or that the stratospheric methane may be underestimated.Obviously, the performance of both models varies with altitude, as will be discussed later.An especially large difference between FLEX-PART CTM and TM5 is seen in Novosibirsk, even though this does not appear related only to the inclusion of Siberian data in the FLEXPART CTM nudging routine.This may be due to the use of different emission inventories.The profiles shown in Fig. 11 are averages of all available monthly profiles per site.Model performance of FLEXPART CTM and TM5 varies throughout the year, however, and as an example, we show seasonally averaged profiles at ESP (Estevan Point, British Columbia, Canada) in Fig. 12.Both models appear to benefit from the nudging and reanalysis techniques, respectively, with improved concentrations, especially near the surface, compared to reference simulations.For both models, strong deviations from measurements oc-  cur in autumn with very similar profiles for TM5 and nudged FLEXPART CTM.The nudging appears to improve modelled FLEXPART CTM concentrations up to roughly 3 km altitude in all seasons except for autumn.
To assess if modelled vertical profiles deviate systematically from observed vertical profiles, we split the profiles into three altitude ranges: below 2000 m altitude, from 2000 up to 6000 m altitude and above 6000 m.In Fig. 13, the bias of each model is shown for all NOAA profiles in North America for each of the altitude ranges.As was already indicated by the average profiles in Fig. 11, FLEXPART CTM tends to underestimate methane at altitudes above 6000 m.The nudging routine near the surface partly enhances this bias in this altitude range.In TM5, values are rather overestimated at this altitude and an improvement between the reference and reanalysis runs is visible.At the altitude range between 2000 and 6000 m, FLEXPART CTM performs better than at upper levels.For TM5, there seems to be a small overestimation of methane concentrations at this altitude range.Near the surface, below 2000 m, a clear impact of the nudging routine for FLEXPART CTM is shown.For TM5, strong positive as well as negative biases occur near the surface and are larger in the reanalysis simulation.For this region, it appears that the FLEXPART CTM nudging routine improves near-surface concentration fields, while the TM5 reanalysis scheme is better able to improve high-altitude concentrations.Similarly, Bergamaschi et al. (2013) showed for a comparison between TM5 modelled methane concentrations and NOAA aircraft measurements during 2003 to 2010, that RMSE in the boundary layer was larger than in the free troposphere.
The underestimation of methane concentrations at high altitude by FLEXPART CTM means that the bias of average profiles is mostly negative (Table 6), and averaged absolute bias of all profiles increases in the case of nudging.The removal of methane near the surface can cause too-low values at higher altitudes.Looking at RMSE, both models show similar values (∼ 16 ppb) for average profiles (Table 7).Although nudging was particularly useful to reduce the bias near the surface, as we concluded based on surface observations, it is less effective at improving vertical profiles above

Conclusions
We presented a nudging routine in combination with FLEX-PART CTM to provide three-dimensional fields of longlived trace gas abundance constrained with surface observations.In particular, this study focuses on the optimization of methane.However, with the right settings, this framework could be used for other species.The nudging constrains model results with observations and thereby considerably levels out (combined) model errors in emissions, transport modelling and methane losses.
In a sensitivity analysis, we showed that modelled methane near the surface greatly improves through the inclusion of observations at 79 locations.To show that the observations improve the concentration fields not only locally but globally, we compare model results to independent observations.These were at the surface (11 sites) as well as vertical profiles from aircraft campaigns (19 sites).For very small kernels (0.5 • width), nudging was too weak and not much improvement was seen.Large kernels are suitable for background stations, but for stations with large variability in the observations a smaller kernel leads to better results.The number of observations included in the simulation also influences model performance.We concluded that nudging kernel settings of test NV3 (see Table 1) were best to improve modelled methane concentrations.This setting is based on a kernel whose bandwidth is dependent on the observed methane variability over 1 year at a site.
A comparison with methane fields from TM5 simulations was made.We included reference TM5 simulations, based on bottom-up emission information, as well as reanalysis simulations that use optimized emission fluxes (TM5-4D-Var).There are small differences in the bottom-up emission inventories driving FLEXPART CTM and TM5.Results showed that in the upper troposphere FLEXPART CTM underestimates methane and best results are obtained near the surface.In contrast, TM5 appears to be better in the upper atmosphere, yet model performance near the surface is poorer.The mean RMSE of all averaged vertical profiles was 15.7 ppb for FLEXPART CTM (nudged), yet 15.9 ppb for TM5 (reanalysis).For all surface stations, mean RMSE values were 10.3 and 29.7 ppb for FLEXPART CTM and TM5, respectively, and at independent surface stations values were 18.7 and 20.6 ppb, respectively.It should be noted, however, that the independent stations are only strictly independent for FLEXPART CTM simulations but are partly included in TM5 assimilation runs.In future experiments, it might be useful to use the same data selections for the FLEX-PART nudging and the TM5-4D-Var inversion.It could also be interesting to drive the FLEXPART nudging with a poste-riori emissions from the TM5 inversion to see the differences in how the models simulate 3-D concentrations.Increases in correlation between observations and modelled values at surface stations are considerable at nudging locations but low at independent validation stations.
In summary, we can simulate three-dimensional fields of methane with a Lagrangian model constrained with observations through nudging.We have shown that in the troposphere this simple assimilation technique, which is computationally inexpensive, can give similarly good results as a Eulerian chemistry transport model combined with a computationally expensive four-dimensional variational data assimilation scheme.Near the surface, the Lagrangian model with nudging even seems to outperform the Eulerian model with data assimilation.
were primarily supported by NASA grants to MIT (NNX11AF17G) and SIO (NNX11AF15G,NNX11AF16G) Edited by: Andrea Stenke Reviewed by: three anonymous referees

Figure 1 .
Figure 1.Map of surface and profile locations used for nudging or validation.For profiles that span a larger domain, only the centre location is indicated.

Figure 2 .
Figure 2. Simulated annual mean surface methane (ppb) for 2013 in the FLEXPART CTM reference simulation (a) and the difference in annual mean surface methane for FLEXPART CTM NF2 (b) and NV3 (c) compared to the reference simulation.Additionally, zonal averages of surface methane (ppb) for all three simulations are also shown (d).

Figure 3 .
Figure 3. Difference in simulated regional mean methane (ppb) per altitude for FLEXPART CTM simulations NV3-REF in January 2013.The values are averages per altitude level for the region south of 60 • S.

Figure 4 .
Figure 4. FLEXPART CTM model performance based on boxplots of coefficient of determination, root mean square error (RMSE) and bias in daily average methane model fractions at surface stations for different kernel settings (see Table1).Independent validation stations are included.The red line in the boxplots indicates the median, the bottom and top box edges are the 25th and 75th percentiles, and whisker ends are at 9th and 91st percentiles.The circles are outliers.

Figure 6 .
Figure 6.Methane (ppb) as observed and modelled at Palmer Station in 2013 for a selection of FLEXPART CTM simulations, shown as time series (a) and a scatter plot (b).

Figure 7 .
Figure 7. Methane (ppb) as observed and modelled in Heimaey in 2013 for a selection of FLEXPART CTM simulations, shown as time series (a) and a scatter plot (b).

Figure 8 .
Figure 8. Averaged annual vertical methane profiles for 2013 based on monthly averaged data at NOAA profile locations ACG (Alaska Coast Guard, USA) and DND (Dahlen, North Dakota, USA).See Fig. 1 for approximate profile locations.

Figure 12 .
Figure 12.Seasonal observed and modelled methane profiles in 2013 at Estevan Point (Canada).FLP refers to FLEXPART CTM.

Figure 13 .
Figure 13.Model bias of FLEXPART CTM (abbreviated FLP) and TM5 at three altitude ranges for NOAA profiles in 2013 in North America.

Table 2 .
Bias of modelled methane (ppb) at independent surface validation stations.The mean value is the mean of absolute bias value at all stations listed.Settings of FLEXPART CTM simulations are given in Table1.Station locations are shown in Fig.1and coordinates are given in TableS1.

Table 3 .
Model performance at ACG (Alaska Coast Guard, USA) and Dahlen (DND) in terms of r 2 , RMSE and bias (ppb) of a selection of simulations according to Table1.

Table 4 .
Average of coefficient of determination, root mean square error and absolute bias at all surface stations and at surface validation stations only for different FLEXPART CTM simulations.The percentage refers to the number of nudging locations that were included in the simulation.

Table 5 .
Mean values of r 2 , RMSE and absolute bias for different FLEXPART CTM and TM5 simulations at surface stations for all stations or independent validation stations only.America and are part of the NOAA/ESRL aircraft program; one is also part of NOAA but is in the region of the Cook Islands in the central southern Pacific Ocean (Rarotonga).

Table 6 .
Bias (ppb) of averaged methane profiles at NOAA and NIES profile sites and during the YAK-AEROSIB flight campaign.Bias is determined with equal weight of each vertical layer.

Table 7 .
RMSE (ppb) of averaged methane profiles at NOAA and NIES profile sites and during the YAK-AEROSIB flight campaign.RMSE is determined with equal weight of each vertical layer.
. The operation of all UK DECC Network stations, TAC, RGL and TTA was funded by the UK Department of Business, Energy and Industrial Strategy (contract TRN1028/06/2015) with additional funding at Mace Head, Ireland and Ragged Point, Barbados, under NASA contract NNX16AC98G through MIT with a subaward 5710004056 to University of Bristol and NOAA contract RA-133R-15-CN-0008 for Ragged Point; Cathrine Lund Myhre (NILU); China Meteorological Administration; Climate Science Centre -CSIRO Oceans & Atmosphere; Agency for Meteorology, Climatology and Geophysics (BMKG); Dirección Meteorológica de Chile; EMEP; Environment and Climate Change Canada; Federal Environment Agency, Austria; Federal Environment Agency, Germany; Institute of Arctic and Alpine Research at the University of Colorado, Boulder, funded by US National Science Foundation grant AON 1108391; Italian National Agency for New Technology, Energy and the Environment; Izana Atmospheric Research Center; Meteorological State Agency of Spain; Japan Meteorological Agency, Korea Meteorological Administration; National Institute of Water & Atmospheric Research Ltd. (NIWA), New Zealand; Ricerca sul Sistema Energetico -RSE S.p.A.; South African Weather Service; Swiss Federal Laboratories for Materials Science and Technology (Empa); University of Malta and University of Urbino.This study was funded by the Norwegian Research Council as part of ICOS-Norway (project 245927) and was part of the Nordic Centre of Excellence eSTICC (eScience Tools for Investigating Climate Change in Northern High Latitudes) funded by Nordforsk (grant 57001).