Recovery of sparse urban greenhouse gas emissions

Zanger, Benjamin; Chen, Jia; Sun, Man; Dietrich, Florian

doi:https://doi.org/10.5194/gmd-15-7533-2022

Articles | Volume 15, issue 20

Methods for assessment of models

17 Oct 2022

Abstract

To localize and quantify greenhouse gas emissions from cities, gas concentrations are typically measured at a small number of sites and then linked to emission fluxes using atmospheric transport models. Solving this inverse problem is challenging because the system of equations often has no unique solution and the solution can be sensitive to noise. A common top–down approach for solving this problem is Bayesian inversion with the assumption of a multivariate Gaussian distribution as the prior emission field. However, such an assumption has drawbacks when the assumed spatial emissions are incorrect or not Gaussian distributed. In our work, we investigate sparse reconstruction (SR), an alternative reconstruction method that can achieve reasonable estimates without using a prior emission field by assuming that the emission field is sparse. We show that this assumption is generally true for the cities we investigated and that the use of the discrete wavelet transform helps to make the urban emission field even more sparse. To evaluate the performance of SR, we created concentration data by applying an atmospheric forward transport model to CO₂ emission inventories of several major European cities. We used SR to locate and quantify the emission sources by applying compressed sensing theory and compared the results to regularized least squares (LS) methods. Our results show that SR requires fewer measurements than LS methods and that SR is better at localizing and quantifying unknown emitters.

Received: 29 Dec 2021 – Discussion started: 21 Feb 2022 – Revised: 26 Aug 2022 – Accepted: 31 Aug 2022 – Published: 17 Oct 2022

1 Introduction

Understanding anthropogenic greenhouse gas (GHG) emissions is important for scientists and decision makers fighting climate change. Based on a growing number of atmospheric observations, studies estimating emission fields of GHG sources and sinks have been performed on local (Chen et al., 2016; Viatte et al., 2017; Toja-Silva et al., 2017), metropolitan (Jones et al., 2021; Turner et al., 2020; Hase et al., 2015), country (Miller et al., 2013; Shekhar et al., 2020), and global (Hirsch et al., 2006; Mueller et al., 2008; Turner et al., 2015; Jacob et al., 2016) scales. One of the main reasons for such studies is to verify and improve GHG emission inventories created by bottom–up methods. Verification and improvement include but are not limited to

determining the difference between the real emissions and the emissions captured by inventories,
determining differences between the real and bottom–up estimated emissions for individual emitters, and
finding emitters which are not captured by inventories (unknown emitters).

Atmospheric inverse modeling methods use column or in situ GHG concentration measurements to estimate emission fields. Due to a lack of measurements and high modeling and measurement uncertainties, estimating each grid cell of an emission field independently is not possible. Instead, sectors (Jones et al., 2021), spatial correlations (Wesloh et al., 2020), and/or temporal correlations (Jones et al., 2021; Wesloh et al., 2020) are used to construct alternative parameterizations of emission fields to prevent overfitting.

An alternative way to overcome these issues is offered by sparse reconstruction (SR) methods (Ray et al., 2015). SR methods use concentration measurements to estimate sparse emission fields, i.e., fields in which only a small number of large emitters contribute significantly to the total emissions. These methods determine the critical emission grid cells and adjust the emissions of only those cells until the model best matches the observations. All other grid cells are set to zero. If the conditions of compressed sensing (CS) are fulfilled, SR methods are guaranteed to determine the best possible emission field and to provide a good estimate of its emissions.

SR for the recovery of GHG emissions has been proposed by several recent studies. Ray et al. (2014, 2015) used stagewise orthogonal matching pursuit (StOMP), a reconstruction method known from compressed sensing, and modified it to enforce positive emission estimates. Because sparse reconstruction can only recover sparse fields, they represented the emissions as a multi-scale resolution field based on wavelets. In Ray et al. (2015), StOMP was modified so that prior information can be included. Both studies reconstructed fossil fuel CO₂ emissions in an idealized scenario with synthetic measurements and very low measurement noise (SNR >40 dB). Hase et al. (2017) demonstrated sparse reconstruction with enforced positive emission estimates of anthropogenic CH₄ emissions from synthetic observations in the US. To increase the sparsity of the emission field, Hase et al. (2017) used a redundant dictionary representation, in which the representation of the emission field is not unique. Yan et al. (2012) proposed compressed sensing, i.e., sparse reconstruction with guaranteed best feature selection, for environmental monitoring with a focus on undersampling.

In this paper, we apply SR to assessing urban GHG emissions. Current GHG emission monitoring systems in cities, such as Dietrich et al. (2021), Shusterman et al. (2016) and Sargent et al. (2018), acquire GHG concentration in the atmosphere as column or in situ concentration measurements. These measurements are then related to city emissions and background concentrations, where the city emissions are the unknowns of interest while the background is (partially) known. Göckede et al. (2010) have shown that in smaller domains, uncertainties in the background have a high influence on the estimation of the city emissions. Therefore, modern approaches, such as Jones et al. (2021) and Klappenbach et al. (2021), use the measurements acquired to additionally improve the certainty of background concentrations using a Bayesian approach. In this work, we ignore the background and make the assumption that it is known in full detail. Extending our approach to include background concentrations is straightforward.

For the urban emissions, we use anthropogenic emission inventories from multiple European cities. Because SR requires sparse emission fields, we apply a wavelet transformation to emissions that are not sparse in the spatial domain.

We are the first to apply SR to the estimation of urban GHG emissions. The findings of our work are the following:

Urban emissions are mostly sparse and a third-level wavelet transform performs well in sparsifying urban emissions further.
SR needs fewer measurements than Gaussian prior methods to achieve a similar performance if the emissions are sparse enough.
SR performs well in localizing and quantifying large emitters, leading to the application of finding unknown emitters not captured by emission inventories.

The paper is structured as follows. Section 2 gives a formulation of inverse problems, introduces the reader to sparse reconstruction methods, compressed sensing, and compressible emissions, and provides a description of the algorithms used in this paper. The compressibility of the anthropogenic emissions in European cities is discussed in Sect. 3. Section 4 shows selected scenarios of our reconstruction method, highlighting beneficial conditions and use cases of our reconstruction method as well as discussing measurement noise. In Sect. 5 we create case studies for different European cities in an idealized and noisy case.

2 Methodology

This section gives the problem statement of atmospheric inverse problems (Sect. 2.1), provides an introduction to the theory of sparse reconstruction (Sect. 2.2, Sect. 2.3, and Sect. 2.4), and introduces measures (Sect. 2.5) and algorithms (Sect. 2.6) for sparse reconstruction.

2.1 Inverse problems

An inverse problem is a problem in which input parameters are to be determined from observations of a process. For the problem in this paper, the input parameters x∈ℝⁿ are the GHG emission fluxes for each grid cell in an emission field and the measurements y∈ℝ^m are in situ or column measurements of GHG concentrations in the atmosphere. These quantities are connected by an atmospheric process, referred to as the forward model F: y=F(x). In practice, such a forward model can be linear, non-linear, or even stochastic. For this analysis, we limit the forward model to linear cases. Therefore, we can write y=Ax, where $A \in \mathbb{R}^{m \times n}$ is called the sensing matrix. A least squares estimate of the GHG emission fluxes x is given by

\begin{equation} \hat{x} = \underset{x}{\operatorname{argmin}} \, \| A x - y \|_{2}^{2} . \tag{1} \end{equation}

Often, however, such inverse problems are ill-posed, as in the cases we deal with in this paper. In such cases, either no solution exists, the solution is not unique, or the solution does not depend smoothly on the data and is therefore sensitive to noise. For ill-posed inverse problems, the least squares estimation without regularization does not provide a useful reconstruction technique. For a more detailed discussion of ill-posed problems we refer to chap. 3 of Nakamura and Potthast (2015).
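The non-uniqueness described above can be illustrated with a minimal sketch. The sensing matrix, measurements, and candidate emission fields below are hypothetical toy values chosen for illustration, not data from this study. With fewer measurements than grid cells (m < n), two very different emission fields reproduce the same measurements exactly, so the least squares objective of Eq. (1) cannot distinguish between them.

```python
def forward(A, x):
    """Linear forward model y = A x, with A given as a list of rows."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def squared_residual(A, x, y):
    """Least squares objective ||A x - y||_2^2 from Eq. (1)."""
    return sum((p - q) ** 2 for p, q in zip(forward(A, x), y))


# Hypothetical sensing matrix: m = 2 measurements, n = 3 grid cells.
A = [[1.0, 0.5, 0.0],
     [0.0, 0.5, 1.0]]
y = [1.0, 1.0]

x_single = [0.0, 2.0, 0.0]  # one emitter in the middle cell
x_pair = [1.0, 0.0, 1.0]    # two emitters in the outer cells

# Both candidates fit the measurements perfectly.
print(squared_residual(A, x_single, y))  # 0.0
print(squared_residual(A, x_pair, y))    # 0.0
```

Any additional information used to choose between such candidates (a prior, a penalty term, or a sparsity assumption) is what the regularization approaches in the following sections provide.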

2.2 Bayesian inversion

A typical approach in atmospheric sciences to solve inverse problems is Bayesian inversion. In such a setup, the unknown emissions are assumed to follow a known probability distribution, referred to as the a priori distribution. Measurements are used to update the a priori distribution, which results in a new probability distribution referred to as the a posteriori distribution. From this a posteriori distribution, a parameter estimate can be obtained using a maximum likelihood (ML) detector. Since the ML detector acts on the a posteriori distribution, it is commonly referred to as the maximum a posteriori (MAP) detector. Let us denote the a priori distribution by p_X(x). Furthermore, probability distributions are assigned to both the observations and the model to account for uncertainties. The measurement distribution is written as p_Y(y) and the distribution which maps x to y is written as $p_{Y | X} (y | x)$. Using Bayes' theorem, the a posteriori distribution of x conditioned on the observations y can be derived:

\begin{equation} p_{X | Y} (x | y) = \frac{p_{Y | X} (y | x) \, p_{X} (x)}{p_{Y} (y)} . \tag{2} \end{equation}

The MAP detector can be applied to this derived distribution. Since p_Y(y) is a constant factor for a given set of measurements, the MAP detector can be written as

\begin{equation} \hat{x} = \underset{x}{\operatorname{argmax}} \; p_{Y | X} (y | x) \, p_{X} (x) , \tag{3} \end{equation}

where $\hat{x}$ are the estimated physical quantities. Typically the distributions are assumed to be Gaussian, where the a priori distribution is assumed to be centered around an initial guess, x_A. This allows for easier error analysis and makes the problem computationally feasible.

2.3 From Bayesian inversion to regularization

To show the relation between Bayesian inversion and regularization methods, we show how a Bayesian inversion problem, using Gaussian priors, can be converted to a regularization problem. Assume that $p_{Y | X} (y | x)$ is Gaussian distributed with the covariance matrix S_o,

\begin{equation} p_{Y | X} (y | x) = \frac{1}{\sqrt{(2 \pi)^{m} \det (S_{o})}} \exp \left( - \frac{1}{2} \| S_{o}^{-1/2} (A x - y) \|_{2}^{2} \right) , \tag{4} \end{equation}

where m is the number of measurements, and that p_X(x) is a Gaussian prior of the form

\begin{equation} p_{X} (x) = C \exp \left( - \frac{1}{2} \| S_{A}^{-1/2} (x - x_{A}) \|_{2}^{2} \right) , \tag{5} \end{equation}

where C is a normalization constant and S_A is the covariance matrix of the prior. Applying the MAP detector from Eq. (3) gives an estimation of

\begin{align}
\hat{x} &= \underset{x}{\operatorname{argmax}} \; p_{Y | X} (y | x) \, p_{X} (x) \tag{6} \\
&= \underset{x}{\operatorname{argmin}} \left[ \| S_{o}^{-1/2} (A x - y) \|_{2}^{2} + \| S_{A}^{-1/2} (x - x_{A}) \|_{2}^{2} \right] . \tag{7}
\end{align}
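The step from Eq. (6) to Eq. (7) can be made explicit by taking the negative logarithm of the product of Eqs. (4) and (5). The following is a short worked derivation, assuming that S_o, S_A, and C do not depend on x:

```latex
\begin{align*}
-2 \ln \left[ p_{Y | X} (y | x) \, p_{X} (x) \right]
  &= \| S_{o}^{-1/2} (A x - y) \|_{2}^{2}
   + \| S_{A}^{-1/2} (x - x_{A}) \|_{2}^{2} \\
  &\quad + \ln \left[ (2 \pi)^{m} \det (S_{o}) \right] - 2 \ln C .
\end{align*}
```

The last two terms are constant in x, and the map from p to -2 ln p is strictly decreasing. Maximizing the product of the two distributions is therefore equivalent to minimizing the sum of the two weighted squared norms.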

The idea of regularization methods on the other hand is to add a penalty term to the least squares problem from Eq. (1) to prefer solutions of a certain kind,

\begin{equation} \underset{x}{\operatorname{argmin}} \left[ \| C_{1} (A x - y) \|_{2}^{2} + \lambda R (C_{2} x) \right] , \tag{8} \end{equation}

where $R : \mathbb{R}^{n} \to \mathbb{R}$ is the regularization function and C₁, C₂ are correlation matrices. This equation is equivalent to Eq. (7) with λ=1, $C_{1} = S_{o}^{-1/2}$, $R (x) = \| S_{A}^{-1/2} (x - x_{A}) \|_{2}^{2}$, and C₂=I. For such a regularization function, the regularization scheme is known as Tikhonov regularization or ridge regression (Golub et al., 1999). While in Bayesian inversion it is most often assumed that a prior value x_A is known, in regularization x_A is often unknown and assumed to be 0.
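As a minimal sketch of how the Gaussian-prior estimate of Eq. (7) can be computed, the example below solves the corresponding normal equations, (AᵀS_o⁻¹A + S_A⁻¹) x = AᵀS_o⁻¹y + S_A⁻¹x_A, for a small toy problem. Diagonal covariance matrices are assumed for simplicity, and the function names, matrix, and variances are hypothetical and chosen only for illustration.

```python
def solve_linear(M, b):
    """Solve M z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - tail) / aug[i][i]
    return z


def map_estimate(A, y, var_obs, var_prior, x_prior):
    """Minimizer of Eq. (7), assuming diagonal S_o and S_A (variances given)."""
    m, n = len(A), len(A[0])
    # Left-hand side A^T S_o^{-1} A and right-hand side A^T S_o^{-1} y.
    M = [[sum(A[k][i] * A[k][j] / var_obs[k] for k in range(m))
          for j in range(n)] for i in range(n)]
    b = [sum(A[k][i] * y[k] / var_obs[k] for k in range(m)) for i in range(n)]
    # Add the prior terms S_A^{-1} and S_A^{-1} x_A.
    for i in range(n):
        M[i][i] += 1.0 / var_prior[i]
        b[i] += x_prior[i] / var_prior[i]
    return solve_linear(M, b)


# Hypothetical toy problem: 2 measurements, 3 grid cells, zero prior mean.
A = [[1.0, 0.5, 0.0],
     [0.0, 0.5, 1.0]]
y = [1.0, 1.0]
x_hat = map_estimate(A, y, var_obs=[0.01, 0.01],
                     var_prior=[1.0, 1.0, 1.0], x_prior=[0.0, 0.0, 0.0])
print(x_hat)  # all three cells receive the same emission, about 0.662
```

In this toy case the squared l₂ penalty spreads the emissions evenly over all grid cells, which illustrates why a different penalty is needed when a sparse solution is expected.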

In this paper, we investigate sparse reconstruction (SR) methods. To achieve SR, the regularization term has to be changed so that sparse solutions are preferred over non-sparse solutions. In statistics, such a regularization function is the Lasso regularization function, presented by Tibshirani (1996). The Lasso is given by

\begin{equation} R (x) = \sum_{j} | x_{j} | = \| x \|_{1} \tag{9} \end{equation}

and is used especially when x is approximately sparse. The equivalent of the Lasso in Bayesian inversion is the assumption of a Laplacian distributed prior. The Lasso is expected to select those elements in x which are important and meaningful, while the irrelevant features are estimated to be zero. Su et al. (2017) showed that this is not necessarily the case, as coefficients which are zero in x are sometimes estimated to be important (which is referred to as false discovery). In the next section, we introduce compressed sensing (CS), which provides sufficient conditions to prevent false discoveries in x using Lasso regularization.
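To illustrate how the Lasso penalty selects a few elements and sets the rest to zero, the following sketch minimizes ½‖Ax − y‖₂² + λ‖x‖₁ with the iterative shrinkage-thresholding algorithm (ISTA). ISTA is used here only as a simple, well-known Lasso solver and is not necessarily the algorithm used in this paper. The factor ½, the toy matrix, the value of λ, and the iteration count are assumptions made for this example.

```python
def soft_threshold(v, tau):
    """Proximal operator of tau * |v|: shrink v towards zero by tau."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def ista(A, y, lam, n_iter=2000):
    """Minimize 0.5 * ||A x - y||_2^2 + lam * ||x||_1 by ISTA."""
    m, n = len(A), len(A[0])
    # The squared Frobenius norm bounds the largest eigenvalue of A^T A,
    # so its inverse is a safe (conservative) step size.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * n
    for _ in range(n_iter):
        residual = [sum(A[i][j] * x[j] for j in range(n)) - y[i]
                    for i in range(m)]
        grad = [sum(A[i][j] * residual[i] for i in range(m))
                for j in range(n)]
        x = [soft_threshold(x[j] - step * grad[j], step * lam)
             for j in range(n)]
    return x


# Hypothetical toy problem: the middle cell is seen by both sensors.
A = [[1.0, 0.8, 0.0],
     [0.0, 0.8, 1.0]]
y = [1.0, 1.0]  # produced by the single emitter x = [0, 1.25, 0]

x_hat = ista(A, y, lam=0.05)
print(x_hat)  # only the middle cell is non-zero
```

The recovered middle-cell emission is slightly below 1.25 because the l₁ penalty shrinks non-zero coefficients; a smaller λ reduces this bias at the cost of slower convergence in this simple solver.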

2.4 Compressed sensing

Compressed sensing (CS) is a theory which provides sufficient conditions that guarantee the best possible reconstruction using the Lasso regularizer, thereby preventing false discoveries. The conditions of CS apply to the forward model A and are hard to verify. For this reason, they are normally taken into account already during the design process. In the following, we provide the very basics of CS needed to understand our work. For a more comprehensive introduction to CS, we refer to Boche et al. (2015).

CS states that an s-sparse signal $x \in \Sigma_{s}^{n}$ can be uniquely reconstructed from m measurements y∈ℝ^m, defined by y=Ax with $A \in \mathbb{R}^{m \times n}$, if certain conditions are satisfied for A. Here, a signal is called s-sparse if it has at most s non-zero entries, $| \{ j \mid x_{j} \neq 0 \} | \leq s$, and $\Sigma_{s}^{n}$ is the set of all s-sparse signals in ℝⁿ. An overview of the symbols, norms, and measures used is given in Table 1. The unique solution is found by solving the l₀ minimization problem given by

\begin{equation} \hat{x} = \underset{x}{\operatorname{argmin}} \, \| x \|_{0} \quad \text{s.t.} \quad A x = y . \tag{P0} \end{equation}

Solving this minimization problem is NP-hard and therefore impractical for real-world applications. Candès et al. (2006) showed that, under additional conditions on A, one can instead solve the l₁ regularization problem, which is convex:

\begin{equation} \hat{x} = \underset{x}{\operatorname{argmin}} \, \| x \|_{1} \quad \text{s.t.} \quad A x = y . \tag{P1} \end{equation}

In that case, solving Eq. (P1) leads to the same solution as Eq. (P0). A sufficient condition for recovering the solution of Eq. (P0) by solving Eq. (P1) is the restricted isometry property (RIP) introduced in Candès and Tao (2005). This property defines a restricted isometry constant (RIC) δ_s for a certain sparsity level s as the smallest constant for which

\begin{equation} (1 - \delta_{s}) \| x \|_{2}^{2} \leq \| A x \|_{2}^{2} \leq (1 + \delta_{s}) \| x \|_{2}^{2} , \tag{10} \end{equation}

where the inequality must hold for all $x \in \Sigma_{s}^{n}$. The RIC tells us how close the singular values of m×s submatrices of A are to 1. For $\delta_{2 s} < 1$, any s-sparse solution can be uniquely determined by solving the Eq. (P1) problem, and for $\delta_{2 s} < \sqrt{2} - 1$ this is even possible if the signal is superimposed by noise (noisy case) (Candès, 2008). In practice, calculating the RIC is NP-hard (see Tillmann and Pfetsch, 2014), so it is not feasible to calculate this constant for a given matrix. However, the RIP can be used within a design process, since there are known random distributions of matrices which satisfy the RIP for large s given sufficiently large n and m (Baraniuk et al., 2008). Another property, which very loosely upper-bounds the RIC, is the incoherence property (Wang et al., 2015). The coherence μ of a matrix is defined by

\begin{equation} \mu = \max_{i \neq j} | \langle a_{i} , a_{j} \rangle | , \tag{11} \end{equation}

where a_i and a_j are distinct column vectors of $\tilde{A}$, the column-normalized version of A. The coherence bounds the RIC by

\begin{equation} \delta_{2 s} \leq (2 s - 1) \mu . \tag{12} \end{equation}

Therefore, s-sparse solutions can be uniquely recovered if $μ < \frac{1}{2 s - 1}$ holds in the noiseless case or $μ < \frac{\sqrt{2} - 1}{2 s - 1}$ in the noisy case.
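Unlike the RIC, the coherence of Eq. (11) is cheap to evaluate for a given sensing matrix. The sketch below computes μ and the largest sparsity level s for which the bound μ < threshold / (2s − 1) holds, with threshold = 1 for the noiseless case and √2 − 1 for the noisy case. The function names and example matrices are hypothetical, and the matrix is assumed to have no all-zero columns.

```python
import math


def coherence(A):
    """Largest absolute inner product between distinct normalized columns."""
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]
    norms = [math.sqrt(sum(v * v for v in col)) for col in cols]
    mu = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            inner = sum(a * b for a, b in zip(cols[i], cols[j]))
            mu = max(mu, abs(inner) / (norms[i] * norms[j]))
    return mu


def max_guaranteed_sparsity(mu, n, threshold=1.0):
    """Largest s <= n with mu < threshold / (2 s - 1); 0 if none exists."""
    s = 0
    while s < n and mu * (2 * (s + 1) - 1) < threshold:
        s += 1
    return s


# Hypothetical sensing matrix with two orthogonal columns and one mixed column.
A = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
mu = coherence(A)
print(mu)                                               # about 0.707
print(max_guaranteed_sparsity(mu, n=3))                 # 1
print(max_guaranteed_sparsity(mu, n=3,
                              threshold=math.sqrt(2) - 1))  # 0
```

Because the coherence only very loosely bounds the RIC, a small guaranteed sparsity level from this check does not imply that reconstruction fails in practice; it only means that this particular sufficient condition gives no guarantee.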

In real-world scenarios, coefficients are rarely sparse but often compressible. This means that the coefficients can be well approximated by sparse coefficients. A more detailed explanation for compressible coefficients is found in Sect. 2.5. CS guarantees good estimates of compressible solutions if the RIP is satisfied for the noisy case. Then the reconstruction error can also be bounded (see Candès, 2008, for the exact definitions and bounds).

2.5 Sparsifying emissions

Sparsity is one of the key elements for SR. However, emissions are not always sparse. In order to make a non-sparse emission field sparse, a transformation into a different domain can be used. Such transformations include the Fourier transform, wavelet transform, transformations tailored to specific data sets, e.g., by SVD truncation (see Hong et al., 2011), or over-complete dictionaries, where the representation does not have to be orthogonal and multiple representations for the same emission field exist (see Candès et al., 2011). In this paper, we only deal with the discrete wavelet transform (DWT) to sparsify emissions. This transform is used for image compression (Lewis and Knowles, 1992) and was also used in Ray et al. (2014) and Ray et al. (2015) to parameterize fossil fuel CO₂ emissions for the US.
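The sparsifying effect of a wavelet transform can be sketched with a one-dimensional orthonormal Haar transform. This is a simplified illustration: the Haar wavelet, the one-dimensional signal, and the function name are assumptions made for this example, whereas emission fields are two-dimensional and the wavelet family is not specified at this point in the text.

```python
import math


def haar_dwt(signal, levels):
    """Multi-level orthonormal Haar transform of a 1-D signal.

    The signal length must be divisible by 2 ** levels. The output is
    ordered as [coarsest approximation, coarsest details, ..., finest details].
    """
    coeffs = list(signal)
    length = len(coeffs)
    root2 = math.sqrt(2.0)
    for _ in range(levels):
        half = length // 2
        approx = [(coeffs[2 * i] + coeffs[2 * i + 1]) / root2
                  for i in range(half)]
        detail = [(coeffs[2 * i] - coeffs[2 * i + 1]) / root2
                  for i in range(half)]
        # Overwrite the current approximation with its coarser version
        # followed by the new detail coefficients.
        coeffs[:length] = approx + detail
        length = half
    return coeffs


# Hypothetical emission transect: a high-emission and a low-emission district.
emissions = [4.0, 4.0, 4.0, 4.0, 1.0, 1.0, 1.0, 1.0]
coeffs = haar_dwt(emissions, levels=3)

nonzero_before = sum(1 for v in emissions if abs(v) > 1e-12)
nonzero_after = sum(1 for v in coeffs if abs(v) > 1e-12)
print(nonzero_before, nonzero_after)  # 8 non-zero values become 2
```

The piecewise-constant signal is dense in the spatial domain but is described by only two non-zero wavelet coefficients, which is the property that SR in the wavelet domain relies on.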

There are several ways to quantify the sparsity of an emission field. One possibility is to measure the error, using any l_p norm, of an emission field x to its best s-sparse approximation:

\begin{equation} \sigma_{s} (x)_{p} = \inf \{ \| x - z \|_{p} : z \in \Sigma_{s}^{n} \} . \tag{13} \end{equation}

Independent of the norm used, the best approximation is given by an emission field z which retains the s largest entries of x (in absolute value) and is zero elsewhere. Often, it is more intuitive to give the relative sparsity s_rel, the number of non-zero entries as a percentage of all entries, instead of the sparsity level s. In this paper, both notations are used, e.g., σ_10%(x)₂ is the l₂ error of the signal which best approximates x and has at most 10 % non-zero elements, while σ₁₀(x)₂ is the l₂ error of the signal approximating x with at most 10 non-zero elements. To show the distribution of values in x, σ_s(x) can be plotted over all possible s.

https://gmd.copernicus.org/articles/15/7533/2022/gmd-15-7533-2022-f01

Figure 1. Visualization of the compressibility of emission fields by plotting σ_s(x)₁ over all s. The more hyperbolic the curve is, the more compressible the emission field.
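The best s-term approximation error of Eq. (13) can be computed directly by sorting the entries by magnitude and taking the l_p norm of everything beyond the s largest. The sketch below does this for a hypothetical emission vector; the function name and the example values are illustrative only.

```python
def best_s_term_error(x, s, p=2):
    """l_p error of the best s-sparse approximation of x, as in Eq. (13)."""
    magnitudes = sorted((abs(v) for v in x), reverse=True)
    tail = magnitudes[s:]  # entries that the s-sparse approximation drops
    return sum(v ** p for v in tail) ** (1.0 / p)


# Hypothetical emission field: three large emitters and three small ones.
x = [5.0, 0.1, 3.0, 0.2, 0.05, 4.0]

print(best_s_term_error(x, s=3, p=1))  # about 0.35

# Compressibility curve: the l_1 error for every sparsity level s = 0, ..., n.
curve = [best_s_term_error(x, s, p=1) for s in range(len(x) + 1)]
print(curve)
```

A curve that drops steeply for small s and then flattens, as it does for this example, indicates a compressible field that is well approximated by a few large entries.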
