Regional CO 2 inversions with LUMIA , the Lund University Modular Inversion Algorithm , v 1 . 0

Atmospheric inversions are commonly used for estimating large-scale (continental to regional) net sources and sinks of CO2 and other stable atmospheric tracers from their observed concentrations. Recently, there has been an increasing demand from stakeholders for robust estimates of greenhouse gases at country-scale (or higher) resolution, in particular in the framework of the Paris agreement. This increase in resolution is in theory enabled by the growing availability of observations from surface in-situ networks (such as ICOS in Europe) and from remote sensing products (OCO-2, GOSAT-2). The increase 5 in the resolution of inversions is also a necessary step to provide efficient feedback to the process-based (bottom-up) modelling community (vegetation models, fossil fuel emission inventories). This, however, calls for new developments in the inverse modelling systems, mainly in terms of diversification of the inversion approaches, shift from global to regional inversions, and improvement in the computational efficiency, We have developed the Lund University Modular Inversion Algorithm (LUMIA) as a tool to address some of these new 10 developments. LUMIA is meant to be a platform for inverse modelling developments at Lund University. It aims at being a flexible, yet simple and easy to maintain set of tools that modellers can combine to build inverse modelling experiments. It is in particular designed to be transport model agnostic, which should facilitate isolating the transport model errors from those introduced by the inversion setup itself. Here, we briefly describe the motivations for developing LUMIA as well as the underlying development principles, current status and future prospects. We present a first LUMIA inversion setup for a regional CO2 15 inversions over Europe, based on a new coupling between the Lagrangian FLEXPART (high resolution foreground transport) and the global coarse resolution TM5 transport models, using in-situ data from surface and tall tower observation sites.


Introduction
The accumulation of greenhouse gases in the atmosphere is the main driver of climate change. The largest contribution of anthropogenic activities to global warming is through the release of fossil carbon (mainly as CO 2 ) to the atmosphere, but other 20 human activities such as land use change (for agriculture, deforestation, etc.) also play a significant role. The climate forcings from this increased CO 2 concentration is likely to induce feedbacks through reactions of the terrestrial ecosystems and of the oceans (Stocker et al., 2013). Our capacity to correctly predict climate change, anticipate and mitigate its effects depends therefore largely on our capacity to model and predict the evolution of carbon exchanges between the atmosphere and other reservoirs. For future climate simulations, the only available option is through "direct" (bottom-up) modelling of the different components of the biogeochemical cycles, i.e. using models (numerical or statistical) that simulate, as accurately as possible (given the precision requirements of the simulation), greenhouse gas fluxes to and from the atmosphere. For past periods, however, the "inverse" (top-down) approach is also possible, in which the greenhouse gas fluxes are diagnosed from their observed impact on atmospheric greenhouse gas concentrations. 30 Direct and inverse approaches are complementary, the former can provide detailed estimates of the spatial and temporal variability of the fluxes, but often with large uncertainties on the total fluxes Sitch et al. (2015) . On the contrary, inverse approaches provide robust estimates of total fluxes at large scales consistent with the observations (e.g. Gurney et al. (2002)), but with poor sensitivity to smaller scales (e.g. Peylin et al. (2013)). An atmospheric inverse model typically couples an atmospheric transport model (which computes the relationships between 35 fluxes and concentrations) with an inversion algorithm, whose task is to determine the most likely set of fluxes, within some prior constraints and given the information from an observation ensemble (in a Bayesian approach). In practice, inversions are complex codes, computationally heavy. The complexity arises in a large part from the necessity to combine large quantities of informations from sometimes very heterogeneous datasets (various types of observations, flux estimates, meteorological forcings, etc.). The computational weight depends largely on that of the underlying transport model, which usually needs to be 40 ran a large number of times (iteratively or as an ensemble).
In recent years, the availability of observations has grown by orders of magnitude, with the deployment of high-density surface observation networks (such as the Integrated Carbon Observation System, ICOS, in Europe) and the fast developments in satellite retrievals of tropospheric greenhouse gas concentrations (GOSAT, etc.). Meanwhile, the demand for inversions is increasing, in particular from stakeholders such as regional, national or trans-national governments who are interested 45 in country-scale inversions as a means of quantifying their carbon emissions, in connection with emission reduction targets as defined in the Paris agreement .
This context puts strains on the existing inverse models. The larger availability of high quality data means that fluxes can be constrained at finer scales, but it also means that models of higher definition and precision must be used. The development of regional inversions (of varying scales) allows in theory an efficient usage of high resolution data while preserving a reasonable 50 computational cost, but comes with specific challenges such as the need of more boundary conditions and the lack of options for cross-validation when the resolution increases and the domain size shrinks. The demands from various stakeholders (policy makers, bottom-up modellers, medias, etc.) also call for developments in the inversion techniques, with for instance a more pronounced focus on the quantification of anthropogenic sources  or the optimization of ecosystem models parameters instead of CO 2 fluxes in carbon cycle data assimilation systems (CCDAS) (Kaminski et al., 2013). 55 To enable such progress in the method and quality of the inversions, it is important to have a robust and flexible tool. The purpose of LUMIA (Lund University Modular Inversion Algorithm) is to be not an integrated, well defined inverse model, but a development platform for top-down experiments. LUMIA was developed from the start as a model-agnostic inversion tool, with a clear isolation of the data stream between the transport model and the optimization algorithm in an interface module.
One of the main aims is to eventually allow a better characterization of the uncertainty associated to the transport model. Strong observations and the prior). Mathematically, this means finding the vectorx that minimizes a cost function J(x) defined (in our case) as where the prior (B) and observation (R) error covariance matrices weigh the relative contributions to the cost function of 90 each departure from each prior control variable x i b and from each observation y j . The optimal control vectorx is solved for analytically (for small scale problems) or approximated step-wise (variational and ensemble approaches are most common (Rayner et al., 2018)).
An inversion system is therefore the combination of an observation operator (i.e. transport model, sampling operator, etc.), an inversion technique and a set of assumptions on the prior values of the variables to estimates, their uncertainties and the 95 uncertainties of the observations. Each of these components introduces its own share of uncertainty, which makes the results harder to interpret: which feature of the solution is real, and which is introduced by e.g. the transport model, or incorrect assumptions on some uncertainties?
These questions are difficult to address with inversion systems tightly integrating a transport model and an implementation of a specific inversion technique. LUMIA is therefore developed as a flexible inversion library, with a clear isolation between 100 the transport model and the inversion algorithm itself. LUMIA is primarily designed of a collection of modules, as simple and independent as reasonable, that can be used as basic elements to construct (inverse or forward) transport modelling experiments.
While the name "LUMIA" refers to that code library, its development (and first application) stem from the need of performing regional CO 2 inversions based on observations using in-situ data from the ICOS network. At this stage, the existing code is therefore largely limited to that regional inversion setup. The distinction between the LUMIA inversion library and the specific instance to allow forward transport runs.
The two pre-processing modules (in blue on the diagram) contain routines to import fluxes and observations from various data providers (handling of different file formats, units conversions, etc.), and are in charge of filling in the databases in obsdb and optimData. These preprocessing steps typically represent a large fraction of the total code of (any) inversion setup, but are very user-and application-specific. We chose to explicitly exclude them from the LUMIA library, as the benefit of including 130 them is far outweighed by the associated code maintenance cost. These pre-processing steps are also not necessarily performed at the same time, and on the same computer, as the inversion itself.
The execution of the actual inversion is controlled by a master python script, written by the user and for which an example is provided in SI. The transport operator itself (purple box) is formally independent of the LUMIA library, nonetheless our transport script, which relies on pre-computed observation response functions, is distributed and can be used as an example (in 135 SI).

Inversion setup
Our test inversion setup is designed to optimize the monthly net atmosphere-ecosystem carbon flux (NEE, Net Ecosystem Exchange) over Europe at a target horizontal resolution of 0.5 • , using CO 2 observations from the European ICOS network (or similar/precursor sites). Two series of inversions are presented: First, a series of Observing System Synthetic Experiments 140 (OSSEs), using known truth and synthetic observations; then a series of inversions constrained by real observations. All the inversions are performed on a domain ranging from 15 • W, 33 • N to 35 • E, 73 • N (illustrated in Figure 2, and hereafter referred to as the Regional Inversion Domain, RID) and cover the year 2011. The following sections describe the inversion technique, the transport model and the problem constraints (prior fluxes and observations). Green boxes are LUMIA modules, purple boxes are external codes and blue boxes are pre-processing scripts, typically user-and application-specific. The black thick line marks the limits of LUMIA itself. The red arrows show the flow of data in the inversion setup phase and the green arrows show the data flow during the inversion. The inversion inputs are prepared in a preparation phase (blue arrows), which involves external models (FLEXPART and TM5). The transport model used in the inversion is considered an external code (here is is a simple python script that reads in observations, fluxes and footprints, but a full transport model could be plugged in instead). 145 We use a Bayesian variational inversion algorithm, similar to that used in TM5-4DVAR inversions (Basu et al., 2013b;Meirink et al., 2008). In a variational inversion, the minimum of the cost function J(x) (Equation 2) is solved for iteratively:

Inversion approach
1. An initial "prior" run is performed to compute the concentrations (y m = Hx b ) corresponding to the prior control vector 2. The local cost function (J(x = x b )) and cost function gradient (∇ x J(x = x b )) are computed. 150 3. A control vector increment (δx) is deduced from the gradient, and the process is repeated from step 1 (with x = x b + δx), until a convergence criterion is reached.
The control vector increments are computed using an external library implementing the Lanczos algorithm (Lanczos, 1950).
For efficiency (reduction of the number of iterations) and practicality (reduction of the number of large matrix multiplications) reasons, the optimization is performed on the preconditioned variable ω = B −1/2 (x − x b ) (following Courtier et al. (1994)  and similar to the implementation in Basu et al. (2013b). Equation 2 then becomes with d 0 = Hx b − y the prior model-data mismatches. In this formulation, the cost function gradient is given by The non-preconditioned observational cost function gradient ∇ x J obs = H T R −1 (Hx − y) is computed using the adjoint 160 technique (Errico, 1997). The transformation matrix B 1/2 is obtained by eigen-value decomposition of B. Note that in this formulation, the inverse of B (or the square root of its inverse) is actually never needed, making it possible to constrain the inversion with a non invertible matrix.
In practice, the preconditioning adds two extra steps to the algorithm described above: conversion from ω to x (x = B 1/2 ω+ x b ) before applying the transport operator (i.e. running the transport model), just before step 1; Conversion from ∇ x J obs to 165 ∇ ω J obs (just after step 2). The initial preconditioned control vector is filled with zeros and corresponds to x 0 = x b .

Observation operator (transport model)
The observation operator (H in Equation 2) groups the ensemble of operations to compute the CO 2 concentrations corresponding to a given control vector. In our case, this covers the disaggregation from the monthly fluxes in the control vector to a 3-hourly temporal resolution, the addition of prescribed fluxes (fossil, ocean and biomass burning categories), their transport 170 to the observations location and the addition of background concentrations.
In this first implementation of CO 2 inversions with LUMIA, we opted for a regional transport model based on pre-computed observational response functions (footprints): where f t nee is the NEE flux map at time step t of the month m, f t 0nee is the corresponding prior NEE map, x m and x m b are 180 the control vector and prior control vector components corresponding to month m at the same spatial coordinates, and n t is the number of three-hourly intervals in the month m. In other words, the inversion adjusts an offset to the prior, high temporal resolution fluxes.
The adjoint operations corresponding to Equations 5 and 6 are summarized by 185 with δ i y the model-data mismatches weighted by their uncertainties (See Section 3.3.1). Since K and y bg are constant throughout the inversion iterations, they can be pre-computed, which reduces the transport computations to a set of very simple matrix operations. This tremendously reduces the computational cost of the inversions but increases the I/O and storage requirements (one response function K must be stored for each observation and is read at each forward and adjoint iteration). 3.2.1 Response functions (regional transport model) The response functions (K) were computed using the FLEXPART 10.0 Lagrangian transport model (Seibert and Frank, 2004;Stohl et al., 2010). FLEXPART simulates the dispersion, backwards in time from the observation location, of a large number of virtual air "particles". The response function K φ i corresponds to the aggregated residence time of the particles released for observation y i , in a given space-time grid box φ of the regional inversion, and below a threshold altitude layer arbitrarily set to 195 100m).
The simulations were driven by ECMWF ERA-Interim reanalysis, extracted at a 3-hourly temporal resolution, and on a 0.5 • × 0.5 • horizontal resolution, on a regional domain ranging from 25 • W, 23 • N to 45 • W, 83 • N, slightly larger than the inversion grid, which allows for some accounting of particles re-entry (i.e. when an air mass leaves the inversion domain, and re-enters it later, which is not accounted for in the background).

200
One set of 3-hourly response functions was computed for each observation, up to seven days backward in time (less if all the particles leave the domain sooner). For plain or low altitude sites (see Table 1), the particles were released from the sampling height above ground of the observations. For high altitude sites (around which the orography is unlikely to be correctly accounted for), the particles were released from the altitude above sea level of the observation sites.
The response functions are stored in HDF5 files, following a format described in SI. More detailed information on the 205 configuration of the FLEXPART runs is also provided in SI.

Background concentrations (global transport model)
The background CO 2 concentrations (y bg in Equation 5) result from the transport of CO 2 -loaded air masses from outside the regional inversion domain to the observation sites. One approach to compute these background concentrations has been proposed by Rödenbeck et al. (2009), and consists in extracting background concentrations time series at the observation sites 210 from the model output of a global, coarse resolution Eulerian transport model, driven by a set of inversion-derived CO 2 fluxes.
The background extraction is done in three steps: 1. A global, coarse resolution inversion is performed, constrained by a realistic set of prior CO 2 fluxes f glo apri , an initial atmospheric distribution of CO 2 concentrations (C ini ), a set of global, background surface CO 2 observations and a subset of the observations to be used later in the regional, high resolution CO 2 inversion. The aim of this step is to obtain 215 a set of CO 2 fluxes f glo that leads to a very realistic atmospheric CO 2 distribution in and around the regional inversion domain (RID). The accuracy of the fluxes themselves has less importance. 2. The CO 2 concentrations y tot corresponding to the transport of the optimized coarse resolution fluxes f glo to the observation sites within the RID are computed using a forward run of the global transport model used in step 1. The foreground CO 2 concentrations y f g are computed using a modified version of that same model, in which the fluxes and concentra-220 tions are maintained as zero at all times outside the regional domain, so that the concentrations y f g result only from the transport of the fraction of the fluxes f glo that is within the RID.
3. The background CO 2 concentrations are obtained by subtraction of the foreground concentrations to the total ones (y bg = y tot − y f g ).
The underlying assumptions is, that, by the time the air masses originating from outside the RID reach the observation sites, 225 existing high resolution patterns of CO 2 at the regional domain boundaries concentrations would have been dispersed, and therefore the field of background CO 2 concentrations within the RID can be well represented with a coarse resolution transport model. On the other hand, this background CO 2 distribution should be as realistic as possible (within the limits of the model resolution), especially in and around the boundaries of the foreground domain, therefore the use of an inversion in step 1 above.
We refer to Rödenbeck et al. (2009) for a much more complete description of the approach. 230 We implemented the Rödenbeck et al. (2009) approach in a TM5 model setup (Huijnen et al., 2010) with the initial global inversion (step 1) performed in a TM5-4DVAR setup, based on (Basu et al., 2013a). The NEE flux is optimized monthly on a global 6 • × 4 • grid, and three additional prescribed CO 2 flux categories are accounted for (fossil fuel, biomass burning and ocean sink). It covers the entire period of the LUMIA inversion, plus six extra months at the beginning and one at the end to limit the influence of the initial condition and to ensure that the background concentrations in the last month of the LUMIA 235 inversion are well constrained by the observations (observations provide important constraints on the fluxes from the preceding month).

The inversion is constrained by flask observations from the NOAA ESRL Carbon Cycle Cooperative Global Air Sampling
Network (Dlugokencky et al., 2019) outside the European domain, and by a subset of the observations used for the regional inversion within the European domain (see Section 3.3 for references, and SI for a full list of the sites used in that step).

240
Since the focus of this inversion is to produce a realistic CO 2 distribution around the European domain, the choice of a prior matters a lot less than the selection of observations. For practical reasons, prior fluxes from the CarbonTracker 2016 release were used (Peters et al., 2007): the NEE prior is generated by the SibCASA model (Schaefer et al., 2008) The total (y tot ) and foreground (y f g ) CO 2 concentration time series at the observation sites are extracted using a modified forward TM5 run implementing the step 2 of the background extraction approach described above. The foreground and total TM5 orography) and 5000 m.a.s.l (with a vertical resolution of 250 m and a temporal resolution of 30 minutes). The latter is used to construct a part of the observation uncertainties.

Observations and observation uncertainties
Observations from the GLOBALVIEWplus 4.2 obspack product were used in the inversions (NOAA Carbon Cycle Group 255 ObsPack Team, 2019). For the year 2011, the product includes observations from 26 sites within our regional domain (in addition to observations from mobile platforms, which were not used). Continuous observations are available at 18 of these 26 sites and nine sites are high altitude. Most of these observation sites are now part of the European ICOS network. A list of the sites (coordinates, observation frequency, sampling height and data provider) is provided in Table 1, and the location of the observations is also reported in Figure 2.

Observation uncertainties
The observation uncertainty matrix (R) accounts for both the measurement uncertainties (ε obs ) and the model representation uncertainty (ε H , i.e. the incapacity of the model to represent perfectly well the observations, even given perfect fluxes). In theory, the diagonal of the matrix stores the absolute total uncertainty associated to each observation while the off-diagonals should store the observation error correlations. In practice, these correlations are difficult to quantify, and the size of the matrix 265 would anyway make it impractical to invert. The off-diagonals are therefore ignored in our system (as in most similar inversion setups) and the observation uncertainty is stored in a simpler observation error vector, ε y .
Our inversion system uses an observation operator that decomposes the background and foreground components of the CO 2 mixing ratio, therefore the model uncertainty can itself be decomposed in foreground and background uncertainties: The instrumental error (ε y ) is provided by the data providers for most of the observations, and typically ranges between 0.1-0.7 ppm (see Figure 3). We enforced a minimum instrumental error (ε min obs ) of 0.3 ppm for all the observations. The model representation error can not be formally quantified, as this would require knowing precisely the CO 2 fluxes that the inversion is attempting to estimate. One can, however, assign representation error estimates (foreground and background) based, in particular, on assumptions of situations that would normally lead to a degradation of the model performances (for 275 instance late-night/early-morning observations, with a development of the boundary layer that may not be well captured by the model, or observations in regions with a complex orography). Transport model comparisons can also provide representation error estimates based on the difference in their results.

Foreground model uncertainties:
As described in Section 3.2.2, background and foreground CO 2 concentrations are computed for each observation site from a 280 TM5 simulation. We performed a forward transport simulation with the regional transport model in LUMIA, using both the background concentrations and the foreground fluxes from that TM5 simulation, so that the two simulations differ only by their regional transport model. A comparison between the concentrations computed by the two models is shown in Figure 4.
The bias between the two models is very contained during the summer months (it is below 0.2 ppm from April to September, and goes as low as 0.01 ppm in July), but rises during the winter months (up to 1.45 ppm in November). The mean average 285 difference between the two simulations is also much larger in winter: it ranges from 0.82 ppm in September to 4.3 ppm in November, with a yearly average of 3.3 ppm.
This comparison is not a formal performance assessment of either TM5 or of the FLEXPART-based transport used in LUMIA, and it particular the bias should be interpreted with care as the sign of the total net foreground flux changes during the year (which mechanically leads to a change of the sign of the bias). Nonetheless, it provides an indication on the order of 290 magnitude of the foreground model transport errors. We use the absolute differences between the two models as a proxy for ε f g .

Background model uncertainties:
Background concentrations are expected to be accurately estimated by the global TM5 inversion when the dominant winds are from the West and that any signal from a strong point CO 2 source or sink has had time to dissipate along the air mass 295 trajectory over the Atlantic Ocean. In less favourable conditions, there can be entries of less well-mixed air inside the domain, in particular in case of Easterly winds or in events of re-entry of continental air that would have previously left the domain.
These events are less likely to be well captured by the TM5 inversion and should be attributed a higher uncertainty.
There is no perfect and easy way to detect these events, but one of their consequences would be a less homogeneous background CO 2 distribution around the observation sites when they occur. As part of the TM5 simulation, vertical profiles of 300 background concentrations were stored for each observation (from the surface to 5000 m.a.s.l, at a 250m vertical resolution).
We set the background uncertainty of each observation (ε bg ) to the standard deviation of its corresponding background CO 2 vertical profile. ε bg is on average 0.36 ppm, one order of magnitude lower than ε f g , and it is also more constant (it ranges between 0.01 and 3.6 ppm). Note that these statistics are computed before the observation selection procedure, described in the following section. The different components of the observation uncertainty are compared in Figure 3 for two representative 305 sites.

Observation selection
The inversions are performed on a subset of the observations included in the obspack product. Only observations for which the transport model simulation is expected to result in accurate concentrations are kept. In practice, one of the main difficulties of transport modelling is to correctly compute the mixing of air in the lower troposphere below the boundary layer. The lowest

315
A second filter, used in some of the inversions and in the background TM5 inversion (Section 3.2.2), is the limitation of a maximum of one observation per 24 hours at each observation site (the one that has the lowest observation uncertainty, as per the definition in the previous above). This is justified by the fact that two observations at a same site, within a small time interval have strongly correlated model representation errors and do therefore not provide independent information. In the absence of a proper accounting of observational error correlations, it may be preferable to limit the number of assimilated observations.

320
The use of this second observation filter is discussed in further details in Section 5.

Prior and prescribed fluxes
In addition of the Net Ecosystem Exchange (NEE, net atmosphere-land CO 2 flux) that is optimized in the inversions, the simulations also account for anthropogenic CO 2 emissions (combustion of fossil fuels, bio fuels and cement production), for biomass burning emissions (large scale forest fires) and for the ocean-atmosphere CO 2 exchanges.     ORCHIDEE is a global processed-based terrestrial biosphere model (initially described in Krinner et al. (2005)) that com- The land cover change (including deforestation, regrowth and cropland dynamic) were prescribed using annual land cover maps derived from the Harmonized land use data set (Hurtt et al., 2011) combined with the the ESA-CCI land cover products. The ocean-atmosphere flux is taken from the Jena CarboScope v1.5 product, which provides temporally and spatially resolved estimates of the global sea-air CO 2 flux, estimated by fitting a simple data-driven diagnostic model of ocean mixed-layer 350 biogeochemistry to surface-ocean CO 2 partial pressure data from the SOCAT v1.5 database .
A biomass burning flux category was also included in the inversion, based on fluxes from the Global Fire Emission Database v4 (Giglio et al., 2013). In our European domain biomass burning emissions are negligible regarding the other CO 2 emission sources, however, we include it for completeness.
All fluxes are regridded on the same 0.5 • × 0.5 • , 3-hourly resolution (by simple aggregation or re-binning). A summary of 355 the prior fluxes sources, original resolution and yearly totals is provided in Table 2.

Prior uncertainties
The background error covariance matrix (B in Equation 2) is constructed following the "correlation length" approach used in many other inversion systems (e.g. covariance between fluxes x 1 and x 2 at gridcells with coordinates p 1 = (i1, j1, t1) and p 2 = (i2, j2, t2) is defined as: where σ 2 x1 and σ 2 x2 are the variances assigned to the prior monthly NEE at coordinates p 1 and p 2 , and C h and C t are decay functions of the spatial (C h ) and temporal (C t ) distance between p 1 and p 2 , defined respectively as C t (p 1 , p 2 ) = e −|t1−t2|/Lt L h β , with d(p 1 , p 2 ) the geographic distance between the center of the two pixels. L t and L h are temporal and spatial correlation length. The decay functions are further referred as Gaussian or exponential decay functions, depending 365 on the value of the β parameter (1 for exponential, 2 for Gaussian). We explore the two versions of decay functions in our OSSEs described in Section 4.
The true uncertainty of the prior fluxes (σ 2 x ) is difficult to evaluate and is therefore constructed on reasonable but arbitrary assumptions. We assumed that the flux uncertainties scale linearly with the net flux, and therefore to set the individual flux uncertainties to a constant fraction of the absolute net flux, possibly within a range.

370
This approach is practical as is doesn't require any additional inputs, and justified by the fact that the uncertainties should be low in areas where the flux is known to be low (for instance, desert areas in a CO 2 inversion). One drawback however is that the net CO 2 flux (NEE) may be close to zero as a result from the gross fluxes (GPP and respiration) of same magnitude.
An alternative approach is setting the uncertainty as a scaling factor of the vegetation respiration. We did not explore this here because of a lack of a respiration field at the beginning of the study.   A series of OSSEs was performed aiming primarily at verifying the absence of abnormal behaviour of the inversion system (i.e. verify that it converges towards the truth and responds as expected to changes in observational or prior constraints). The OSSEs share a common prior NEE (ORCHIDEE) and a common set of observations derived from a forward propagation of the "true" (LPJ-GUESS) fluxes (at the same time and location as the actual observations listed in Table 1), perturbed by a random 380 error (y = y truth + N (0, σ 2 y ), with σ 2 y the uncertainty of each observation as defined in the matrix R). In the reference inversion Sref, the prior error covariance matrix (B) is constructed with prior uncertainties set to 25% of the absolute prior value (σ 2 x b = 0.25|x b |) and exponentially decaying horizontal and temporal covariances, with a horizontal covariance length (L h ) of 200 km and a temporal covariance length (L t ) of one month.
Inversions S500 and S100 differ from Sref by the use of respectively longer (500 km) and shorter (100 km) horizontal 385 covariance lengths. Inversion SG uses a Gaussian decay function instead the of exponential one used in Sref. Inversions SP and SA use respectively only the low altitude and only the high altitude sites (See Table 1). Inversion SB is similar to Sref but uses a perturbed set of background CO 2 concentrations: y bg pert = 0.75y bg truth +0.25y bg prior , with y bg prior the background concentrations corresponding to the prior fluxes of the coarse resolution TM5 inversion from which y bg truth are generated (See Section 3.2.2).

395
On the other hand, the lower panel of Figure 5 compares the root mean square error of the inversions to that of the prior: all the inversions lead to a reduction of the RMSE by ≈30%. Inversion S100 performs marginally better, and SP and SG marginally worse.
These contradictory analyses can be explained by looking at the fluxes at the monthly scale, shown in the upper panel of Figure 5. The prior NEE has a much more pronounced seasonal cycle than our assumed truth. The summer uptake is, for

Regional decomposition
Aggregated monthly and annual regional totals are shown for all the inversions in Figure 6 (with the region definitions provided in SI). The bar plots on the left show the prior, true and posterior annual NEE budgets in each region, and the plots on the right show the monthly prior and posterior differences with the "true" regional NEE.
As it is the case at the domain-scale, the monthly regional NEE budgets are systematically improved by the inversions.

415
This sometimes leads to degradation of the representation of the annual budget (in the Northern, Southern and Eastern Europe regions in particular), which confirms that the annual budget is not robustly constrained by our system, contrarily to the monthly NEE.
The largest discrepancies are found in the Eastern, Northern and South-Eastern Europe regions, where the observation network is the densest. On the other hand, the flux estimate is, at least at the monthly scale, very insensitive to the different inversion settings in Central Europe. Reductions in the network density only leads to marginal degradation of the monthly regional NEE (at least in these large regions).
The use of a biased set of background concentrations in inversion SB affects the Western Europe region more than any other: background air generally enters the domain from the West, and Western Europe fluxes are then used as a buffer to compensate for this biased background.

425
Inversion S500 behaves noticeably different from the other inversions in Northern, Eastern and South-Eastern Europe (the three regions that are the least constrained by observations): Inversion S500 uses longer spatial covariance lengths (500 km).
Therefore, the flux adjustments will be spread over a larger area than in e.g. Sref (which uses a B matrix constructed by using 200 km covariance lengths). Since the prior error (i.e. the difference between ORCHIDEE and LPJ-GUESS) follows a relatively homogeneous pattern, S500 effectively produces a better estimation of the NEE in Eastern Europe. There is, however, 430 no guarantee that such a conclusion would apply to an inversion constrained by real observations, as the difference between the prior and the real NEE fluxes may not follow such a simple pattern.
This point is further illustrated in Figure 7, which shows maps of prior, posterior (Sref) and true fluxes, and the associated errors and error corrections, for the four seasons. Flux error reductions are more consistently obtained in Eastern and Southern Europe, where the network is the least dense. One straightforward explanation is that the prior errors are much larger in these 435 regions and therefore easier to reduce. But in Western and Central Europe, where the observation network is dense and where neighbour sites occasionally provide contradictory signals, the inversions have to apply more complex corrections to the control vector, which is more difficult when the covariances lengths are long (i.e. in S500) and occasionally leads to locally increased error. The left panel shows the RMSE reduction at each observation site for inversion Sref (the size of the dots is proportional to the number of assimilated observations at each site, and the color shows the net RMSE reduction). At all sites the inversion 450 leads to improvements in the fit, but those are generally much more modest in Western Europe.

Evolution of the fit to observations
Finally, the center panel compares the RMSE reduction of inversion Sref to that of the other OSSEs: The best performance is logically achieved by inversion S1 at most of the sites because it is constrained by exact concentrations, and therefore it is easier for the inversion to find a solution. But overall the inversion performances are very comparable. Central Europe Figure 6. Prior, true and posterior annual regional NEE balance (bar plots) and monthly regional NEE error in the prior and posterior fluxes (line plots) The comparatively low misfits in continental Europe can be explained by the (coincidental) good performance of the prior 455 in that region (see previous section and Figure 6), but also by the strong sensitivity of these sites to background concentrations.
Sites in the UK and, in particular, Ireland sample very little continental air, which leaves little margin for the inversion to improve the representation of their observations.

Inversions against real observations
The OSSEs presented above neglect several complications of real inversions, in particular transport model errors (the obser- . regards to typical user choices, such as the selection of observations and the definition of prior uncertainties, and to provide some indicators of the transport model performance.

465
The reference inversion (Rref) is constrained by the observations presented in Section 3.3 and uses a prior terrestrial flux from the LPJ-GUESS model described in Section 3.4. The prior uncertainties (σ 2 x b ) are set to 25% of the net monthly flux in each pixel, and gaussian spatial correlation length of 200 km and exponential temporal correlation lengths of one month are used. Besides Rref, two groups of inversions were performed (with the specific settings listed in Table 3): -The sensitivity of the fluxes to the observation selection is tested in experiments RP, Rlf and RA. In RP, high altitude 470 sites (see Table 1) have been excluded from the inversion. In Rlf the same observation network as in Rref has been used, but the frequency of observations has been limited to a maximum of one for each day and site. Finally, in RA only high altitude sites plus LMP, PAL and TTA have been used (because of the lack of any altitude site in the regions of these three sites).
-Inversions R500 and R100 were designed to assess the impact of the structure of the prior covariance matrix on the 475 results. In inversions R500 and R100, the spatial correlation length used to build B was changed to 500 km and 100 km As for the OSSEs, we analyze first the optimized fluxes and then the evolution of the fits to the observations. Note that this time the true fluxes are not known.

Optimized fluxes
Optimized fluxes are shown in Figure 9, as monthly (left) and annual totals (right), aggregated on the entire domain (first (Rref, R500, R100) find a 50 to 60% reduction of the net NEE sink, inversions RP (which uses only low altitude sites) and Rlf (which uses the same number of sites but a reduced number of observations) lead only to a 25% reduction of the net sink, and RA (altitude sites only) does not adjust the total flux significantly. Focusing on the monthly totals, the inversions 485 consistently point to a larger seasonal cycle than simulated by the LPJ-GUESS model, with a more positive NEE in the winter and autumn months (especially in March, September, October and November), and a stronger carbon uptake in the European summer months. Especially the near-zero NEE in August is clearly contradicted by all the inversions.
The regional decomposition shows a more complex picture. The increased early summer (May-July) uptake is largely attributed to three regions: Northern, South-Eastern and Central Europe, and the August shift towards negative NEE is almost Contrarily to what happened in the OSSEs, the spread of the inversions ensemble is not narrower where the network density is 495 the highest. The results in Central Europe are on the contrary extremely sensitive to the restriction of the observation ensemble to high (RA) or low (RP) altitude sites, and changes in the observation frequency (Rlf) also have an impact (but in a predictable sense: they lead to a correction of the flux of smaller amplitude than e.g. Rref). Changes in the spatial covariance lengths (Rref, R500, R100) correspond roughly to what was seen with the OSSEs: the impact is mostly in regions with low observational coverage.

500
Maps of prior and posterior fluxes are shown in the upper and central row of Figure 10 for the reference inversion Rref.
While at large scales the inversion preserves relatively well the spatial NEE distribution, it tends to concentrate a large part of the flux adjustments around some of the observation sites, in particular in Western and Central Europe. On the contrary, flux adjustments are relatively smooth in areas at the domain edges.
The lower row of Figure 10 shows the standard deviation of the ensemble at each pixel (and for each 4-month period). The 505 largest discrepancies between the inversions are found around Northern Germany and Poland (in the "Central Europe" region).
One likely explanation is site-specific transport model errors: Observations from two sites sensitive partly to the same fluxes, but affected by different systematic model error (biases) may provide apparently contradictory information on the direction and/or amplitude of the flux adjustments. The inversion can only reconcile such contradictory observational information by clustering strong flux adjustments in small areas and/or time periods. The problem is stronger where the network density is 510 the highest, and the ensemble spread varies the most where the network density is the most impacted by the sites selection in inversions RA and RP. In the OSSEs, similar changes in network density did not lead to similar problem because of the use of a perfect transport model.

Reduction of the observation misfits
The comparison of prior and posterior fit residuals is more critical in this series of inversions than in the OSSEs since here the 515 true fluxes are not known. Figure 11 shows observation fits for the R-denoted series of inversions in a similar way than done in  Looking at each site separately (left and middle panels) shows, that, while at most sites the representation of the observations 520 is improved by the inversion (red circles), it is strongly degraded at two sites: OXK (-1.15 ppm) and OPE (-0.74 ppm). At another six sites, smaller degradations of the RMSE occur (from -0.02 ppm at PUI, to -0.25 ppm at CES).
The central plot in Figure 11 shows a comparison of the RMSE reduction at the sites with other inversions, but also the evolution of the model-observation biases for inversion Rref (blue lines, right axis  sample largely the same air masses; it is not possible for the inversion to find a combination of fluxes that improves the fit at all sites (at least not without departing too much from the prior).
Likely causes for these poor misfit reductions are transport model errors (diverging biases at neighbouring sites, introduced either by the foreground transport model (FLEXPART) or by the background one (TM5-4DVAR)) and/or a locally insufficient 530 resolution of the optimized state vector. These causes will need to be investigated thoroughly im a future study, but are not a sign of a malfunction of the optimization method itself.

Discussion
We have setup an atmospheric inversion system based on an implementation of the variational inversion approach (Section 3.1) with a transport model based on an offline coupling between FLEXPART (high-resolution regional transport) and TM5 535 (coarse-resolution transport of the background fluxes and historical atmospheric CO 2 burden). The inversion was tested through a series of synthetic experiments and realistic inversions, which show that it is working as expected. In this section we discuss separately three aspects of the paper. First the inversion results themselves, then the TM5-FLEXPART coupling and finally the LUMIA system. .

540
The inversion technique used in this study is by design not innovative (the definition of the control vector, the specification of the uncertainties, etc. replicate what has been done in previous studies (e.g. Kountouris et al. (2018)), as the aim is to have a reference setup. The scientific results are therefore at this stage limited (as it also wasn't the aim of the paper), but the analysis of the OSSEs results show that the inversions are working as expected. Fluxes are well resolved at the monthly scale and large error reductions (with respect to the known truth) occur where the network is the densest, as expected.

545
The aggregated annual flux estimates are, on the contrary, generally degraded by the inversions compared to the prior. This is because, while both positive and negative prior errors are correctly adjusted in the inversions, the error reduction is much stronger in summer (when the prior strongly overestimates the (prescribed) carbon uptake). The OSSEs are an ideal setup for such studies with no transport and no observational errors, such that this bias in the annual posterior NEE results entirely from uneven seasonal constraints on the flux adjustments (the prior uncertainties are larger in summer than in winter, because NEE 550 is larger in summer).
As mentioned earlier, the focus of the paper is not on the inversion results themselves, and the results shown here might be a worst case scenario: scaling the prior uncertainty to the respiration instead of to the NEE itself, as done here, would for instance partly even out the uncertainties. Nonetheless, the results have potential implications for existing inversions (in the absence of a known truth, it is very difficult to tell how adapted is a particular prior error covariance matrix to a given problem), which 555 makes it a worthy topic for a dedicated study.
The application of the same inversion approach to real observations leads to smaller flux adjustments. This could be a sign that the difference between the LPJ-GUESS prior (used in this second set of inversions) and the true fluxes is smaller than that between the prior and synthetic truth in the OSSEs, but the analysis of the observation misfits reduction also point to potential site-dependent transport model errors. One of the next steps towards improving our inversions will therefore have to be a thorough model calibration effort. In that sense, the flexibility of LUMIA with regards to the transport model is particularly adapted.

TM5-FLEXPART coupling
The inversions rely on an offline coupling between the FLEXPART Lagrangian transport model (for regional, high resolution transport) and TM5-4DVAR for providing background concentrations. The setup replicates the 2-step scheme of Rödenbeck A succint comparison between this "TM5-FLEXPART" transport model and TM5 itself was performed (Section 3.3.1) and is used as a proxy for the transport model error. It doesn't show any global bias between the two models, but a possible seasonal offset towards the month of November. The prescribed observation uncertainties are scaled up to account for this possible larger model error, so the impact on inversions should be limited. Nonetheless, that possible seasonal bias would need to be 570 investigated and accounted for before deriving scientific conclusions from inversions against real observations. The choice of the models and of that specific coupling was driven in part by the perspective of exchanges with other groups using similar setups. In the current stage, replacing the FLEXPART response functions from another similar Lagrangian transport model (STILT (), NAME (), etc.) or the TM5 background time series by data generated with a different model (using either the same or a different technique to estimate background concentrations at the observation sites) is straightforward and 575 will facilitate a better evaluation of the model performance.
Note also that the Rödenbeck et al. (2009) approach means that there is no 'hard' coupling between the two models meaning that there is no risk of having to use an older version of one model because of the lack of implementation of the coupling in newer code. This, of course, also facilitates the exchange of one transport model against another as mentioned above.
From a practical and technical point of view, the current setup presents the advantage of speed and scalability: the application 580 of the transport operator is done independently for each observation and therefore can be distributed on as many CPUs as available. Inversions can thus be performed in very limited (user) time (5-8 hours wall time per inversion on 24 CPUs for the inversions in this paper). This time efficiency is critical for running not only single inversions, but inversion ensembles, which provide a better representation of the real uncertainties. Evolutions of the code for very large ensemble of observations (such as from satellite retrievals) may, however, benefit from further developments (aggregation of observations and footprints; 585 reduction of the number of grid points where possible; etc.)

LUMIA framework
LUMIA is primarily designed as a library of (python) code to support the development of inverse modelling activities at Lund University. It aims at being not an integrated atmospheric inversion system (i.e. transport model + inversion algorithm), but rather a toolbox for inversions (and for model-data fusion approaches in general). Currently, the available code is mainly 590 limited to the modules used in the inversions presented in this paper, but applications such as inversions with other tracers (CH 4 , 14 CO 2 ), with other regional transport models (STILT) and coupling with a vegetation model (towards a CCDAS) are envisioned. The existing (and future) modules are build with the aim to maximize flexibility (i.e. the capacity for the user to build their own inversion experiments, rather than just replicate existing ones), usability and sustainability of the code. The concept 595 of usability refers to the need to limit the time spent by users in understanding the code itself (rather than the algorithms implemented in the code), while the concept of sustainability refers to the capacity for the code to retain its flexibility and usability characteristics throughout future developments. Concretely, it means that the code should attempt to follow a number of simple but essential coding practices (one function, one task; explicit naming of variables; code as close as possible as the original equations; clean I/O; usage of only standard libraries to facilitate portability, etc.). The aim is in particular that a 600 module can be removed/replaced without risking to break other modules.
The LUMIA code is not meant to be a "key in hand" system, it targets users having or willing to acquire robust understanging of inverse modelling (it is perfectly usable as a toy model for learning). We therefore do not publish the code in a public repository, but we are very open to collaborations and distribute the code on-demand. An archive of the code in it's current shape is nonetheless included as SI of this document.

7 Summary and conclusions
We have presented the LUMIA inversion library, as well as a first set of regional CO 2 inversions performed with LUMIA, relying on a new coupling between the FLEXPART and TM5-4DVAR transport model. LUMIA is designed to be a toolbox for inversions and is meant to support future inverse modelling work at Lund University.
The inversions intentionally follow a very classical approach, so as to facilitate comparison with other systems and to set a 610 point of reference. The aim is, however, to use LUMIA in the future to develop more innovative approaches (multiple transport models, use of satellite data, optimisation of vegetation model parameters (CCDAS), etc.).
Although the inversion setup lacks the maturity of established systems, it offers promising computational performances and the results suggest interesting scientific questions regarding the capacity of regional inversion systems to constrain the annual budget of CO 2 , and point to specific improvements of the inversion approach.