A multi-pollutant and multi-sectorial approach to screen the consistency of emission inventories

Some studies show that significant uncertainties affect emission inventories, which may call into question conclusions based on air quality model results. These uncertainties result from the need to compile a wide variety of information to estimate an emission inventory. In this work, we propose and discuss a screening method to compare two emission inventories, with the overall goal of improving the quality of emission inventories by feeding back the results of the screening to inventory compilers, who can check the inconsistencies found and, where applicable, resolve errors. The method targets three different aspects: 1) the total emissions assigned to a series of large geographical areas, countries in our application; 2) the way these country total emissions are shared in terms of sector of activity and 3) the way inventories spatially distribute emissions from countries to smaller areas, cities in our application. The first step of the screening approach consists in sorting the data and keeping only emission contributions that are relevant enough. In a second step, the method identifies, among those significant differences, the most important ones that are evidence of methodological divergence and/or errors that can be found and resolved in at least one of the inventories. The approach has been used to compare two versions of the CAMS-REG European scale inventory over 150 cities in Europe for selected activity sectors. Among the 4500 screened pollutant-sectors, about 450 were kept as relevant, among which 46 showed inconsistencies. The analysis indicated that these inconsistencies arose almost equally from large scale reporting and spatial distribution differences. They mostly affect SO2 and PM coarse emissions from the industrial and residential sectors. The screening approach is general and can be used for other types of applications related to emission inventories.


Introduction
Air pollution remains a critical issue, as it is one of the main causes of human health damage worldwide. In the EU28 alone, exposure to a pollutant such as fine particulate matter (PM2.5) is estimated to be responsible for approximately 390 000 premature deaths per year (EEA, 2020). Reducing pollution levels requires appropriate regulatory decisions leading to the implementation of effective abatement strategies. Such decisions are not easy to support because they may involve several pollution sources interacting through complex and non-linear atmospheric phenomena. As only air quality models can simulate the impacts of emission reductions considering the complexity of the atmosphere, they are potentially the only tools able to support the planning of reduction strategies. The accuracy of their results however strongly depends on the quality of a wide range of input data (meteorological fields, boundary conditions, land use data and pollutant emissions) (Im et al., 2018, Zhu et al., 2019, Dufour et al., 2021, de Meij et al., 2009, de Meij et al., 2018, Cuvelier et al., 2010). Many previous studies have shown that emissions are among the inputs with the most critical influence on the results of air quality modelling and, in particular, on the urban-scale source apportionment used to design air quality plans (Kryza et al., 2015, Zhang et al., 2015). More alarmingly, some studies have shown that significant uncertainties affect emission inventories, which may call into question conclusions based on air quality model results (Markakis et al., 2015). These uncertainties result from the need to compile a wide variety of information to develop an emission inventory. Indeed, these inventories are prepared for many pollutants (NOx, PM, VOC, SO2, CO, CO2...) and for many activity sectors (transport, industry, residential, agriculture, natural sources...) that entail different emission processes.
The spatial and temporal distribution of emissions is typically based on proxies that can be estimated by very different methods. For example, top-down approaches start from emission estimates at large scale (e.g. national inventory) and disaggregate the emissions spatially and temporally with finer scale proxies. Bottom-up approaches compute the emissions directly, starting from local spatial and temporal proxies based on accurate locations or high-resolution shape patterns (road, ship routes, high-resolution land use, vehicle counting, etc.). Various sources of proxies are reported to create very high resolution inventories at the urban scale (Zheng et al., 2021, Geng et al., 2017, Ramacher et al., 2021). One of the most important challenges in compiling local scale emission inventories is to remain consistent with data provided by the national inventory while providing satisfactory accuracy at all locations and times.
For all these reasons, compiling emission inventories can lead to different results depending on the methods and data used. In previous work, Thunis et al. (2016) proposed a methodology to compare two emission estimates over a given area based on a limited input, the total emission per pollutant and macro sector. With this method, the differences between the total emissions of the two inventories are apportioned in terms of emission factors and activity differences. This information can then be used by emission inventory developers to identify the main causes for these discrepancies and likely errors in their estimates. However, this method is able to apportion differences between emission factors and activity only when the difference in emission factors is known for at least one of the emitted pollutants.
Since this requirement led to one pollutant being treated arbitrarily, Clappier and Thunis (2020) improved the method by implementing a probabilistic approach that finds the most likely allocation between emission factor and activity, thereby removing this limiting assumption. Trombetti et al. (2018) then extended the approach to multiple inventories.

While the proposed approach shares some graphical representation with the previous ones, it differs in terms of diagnostics. The three original features of the new approach are: 1) differences in total emissions are allocated into three key components that provide information on the sectoral and spatial shares of the emissions at two geographical scales for each pollutant, 2) the capability to perform the analysis simultaneously for a large number of locations while systematically excluding emissions that are not relevant (lower emissions compared to others) and 3) the ranking of the largest inconsistencies between the two inventories.
In this paper, we propose a screening method, based on a comparison between two inventories, to identify possible errors and inconsistencies. This new method can be applied to two inventories that can either be two versions or two different years of a given inventory, two inventories based on distinct sources of information, e.g. CAMS-REG and EDGAR (Crippa et al. 2018), top-down vs. bottom-up approaches, regional vs. Europe-wide, etc. Here, we illustrate the use of the proposed method by focusing on comparisons between two versions of the same inventory and apply the screening methodology to a continental scale inventory used for air quality modelling at the urban scale: the CAMS-REG inventory. In a follow-up paper, we extend and apply the approach to the comparison of different inventories.

The paper starts with a description of the screening approach that includes its required input data, the methodology itself and its output. An application with the EU-wide CAMS-REG inventory is then presented in Section 3 while further considerations are addressed in Section 4.

The approach presented in this article aims at comparing two emission inventories over a series of geographical areas within the domain they spatially cover. These geographical areas include two groups characterised by different scales: large (e.g. country) and focus (e.g. cities) areas. For each pollutant, the method screens the consistency of the inventories around three aspects: (1) the total pollutant emissions assigned at large scale; (2) the way these total pollutant emissions are shared in terms of sector of activity and (3) the way large scale emissions are distributed to the focus areas.
In other words, the screening approach intends to answer the following questions:
• Are there inconsistencies in total pollutant emissions at large scale level?
• Are there inconsistencies in the sectoral contributions to the emissions at large scale level?
• Are there inconsistencies in the way inventories distribute large-scale emissions spatially?


Note that the method proposed here is designed with a focus on the spatial dimension. Other uncertainties related to emission inventories (e.g. speciation of VOC or PM, temporal distribution of the emissions) are not considered.

2.1 Input data
Based on a 0.1 x 0.1 degree gridded emission inventory detailed in terms of emitted pollutants (denoted as "p") and sectors of activity (denoted as "s"), the data required for each pollutant and sector (denoted as a [p,s] couple) are twofold and consist of:
• Emission totals aggregated over specific areas of interest (e.g., urban areas, agriculture intensive areas, industrial areas…). These areas, referred to as focus areas, can be freely selected and represent locations where we wish to assess the consistency of the inventory. The associated emissions are denoted by a lowercase notation, $e_{p,s}$.

• Emission totals aggregated over larger areas (e.g., country, regions, modelling domains…). In general, these areas correspond to the larger scale at which data are reported. For example, country is the scale of interest for EU wide inventories as this is the scale at which national emission totals are typically reported or estimated. These areas are referred to as larger scale areas and their emissions are denoted by an uppercase notation, $E_{p,s}$.

The number of focus areas is denoted by N. We will also denote emission totals summed over all sectors by an overbar ($\bar{E}_p = \sum_s E_{p,s}$).

Methodology
The number of [p,s] points under screening is equal to the number of pollutants multiplied by the number of sectors and by the number of focus areas (i.e. $N_p \times N_s \times N$). Because this number may become overwhelming, we proceed in a number of steps that help focus the screening on priority aspects. To this end, threshold parameters are set to restrict the screening to relevant emissions (i.e., emissions that are large enough), and priority is given, in a second step, to the detection of the largest differences between inventories among those relevant data. These steps, schematically represented in the flowchart of Figure 1, are discussed in the next subsections.

Exclusion of non-relevant emissions
Not all emission data (E and e for all pollutants and sectors) are kept for the analysis because large differences on small numbers are not relevant. For the exclusion, we proceed as follows. For each [p,s] and each inventory, we calculate the ratio between the focus area emission ($e_{p,s}$) and its respective larger scale pollutant total, i.e. $\bar{E}_p$. As we will see in the application (Section 3), this exclusion step leads to eliminating a large majority of the [p,s] couples from the screening process (between 80 and 90%).

2.2.2 Decomposition into key components
The objective of the decomposition is to isolate emission characteristics that are associated with the inventory compilation process, in order to facilitate the resolution of the detected inconsistencies. The three main characteristics are (1) the Pollutant Totals over Large areas (LPT), (2) the Sectorial Shares over Large areas (LSS) and (3) the Activity Share over the Focus areas (FAS). To isolate these components, we decompose the ratio of the known pollutant-sector emissions for each focus area as follows:

$$\frac{e^{2}_{p,s}}{e^{1}_{p,s}} = \frac{e^{2}_{p,s}/E^{2}_{p,s}}{e^{1}_{p,s}/E^{1}_{p,s}} \times \frac{E^{2}_{p,s}/\bar{E}^{2}_{p}}{E^{1}_{p,s}/\bar{E}^{1}_{p}} \times \frac{\bar{E}^{2}_{p}}{\bar{E}^{1}_{p}} \qquad (3)$$

where $\bar{E}_p$ represents the larger scale emissions summed over all sectors for a given pollutant. Superscripts refer to the two inventories used for the screening. Equation (3) is an identity where all terms are known from input quantities, i.e. the focus and larger scale emissions detailed in terms of pollutants and sectors. The three terms on the right-hand side of the identity provide information on the FAS, LSS and LPT, respectively.
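For convenience, we rewrite equation (3) in logarithmic form as:

$$\log\frac{e^{2}_{p,s}}{e^{1}_{p,s}} = \log\frac{e^{2}_{p,s}/E^{2}_{p,s}}{e^{1}_{p,s}/E^{1}_{p,s}} + \log\frac{E^{2}_{p,s}/\bar{E}^{2}_{p}}{E^{1}_{p,s}/\bar{E}^{1}_{p}} + \log\frac{\bar{E}^{2}_{p}}{\bar{E}^{1}_{p}} \qquad (4)$$

which can be rewritten as equation (5) with simplified notations:

$$\hat{e} = \widehat{FAS} + \widehat{LSS} + \widehat{LPT} \qquad (5)$$

where the hat symbol indicates that quantities are expressed as logarithmic ratios. These three quantities are at the basis of the screening methodology and serve as input for the graphical representation as well.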
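As an illustration, the decomposition can be checked numerically. The Python sketch below is not part of the screening tool: the function name `decompose`, its argument names and the emission values are hypothetical, inventory 2 is placed in the numerator by assumption, and natural logarithms are assumed (the identity holds for any base).

```python
import math

def decompose(e1, E1, Ebar1, e2, E2, Ebar2):
    """Log-ratio components for one [p,s] couple and one focus area.

    e*    : focus-area emission of the pollutant-sector couple
    E*    : larger-scale emission of the same couple
    Ebar* : larger-scale emission summed over all sectors
    Suffixes 1 and 2 identify the two inventories (ordering assumed).
    """
    fas_hat = math.log((e2 / E2) / (e1 / E1))          # focus activity share
    lss_hat = math.log((E2 / Ebar2) / (E1 / Ebar1))    # larger-scale sectorial share
    lpt_hat = math.log(Ebar2 / Ebar1)                  # larger-scale pollutant total
    e_hat = math.log(e2 / e1)                          # focus-area emission ratio
    return e_hat, fas_hat, lss_hat, lpt_hat

# Made-up emissions, for illustration only
e_hat, fas_hat, lss_hat, lpt_hat = decompose(
    e1=2.0, E1=40.0, Ebar1=100.0,
    e2=6.0, E2=60.0, Ebar2=120.0,
)
print(e_hat, fas_hat + lss_hat + lpt_hat)  # both sides of the identity
```

Because the decomposition is an identity, the two printed values agree up to floating-point rounding whatever the input emissions, provided they are all strictly positive.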

Identification of inconsistencies
One of the main steps of the methodology consists in keeping only the largest differences among the relevant emissions identified in Section 2.2.1. The comparison of two emission inventories always leads to different estimates, as inventories can be calculated by different methods (e.g. different activity data, emission factors and spatial disaggregation to the grid).
Differences originate from methodological choices but also from errors generated during the inventory compilation process. When differences are small, it is not possible to tell whether they originate from methodological choices or from errors. Moreover, it is not possible to assess whether one methodological choice leads to an improvement as compared to the other, because true emissions remain unknown (Kryza et al., 2015). We will refer to these small differences as "uncertainty".
Although very large differences may result from methodological choices as well (e.g. inclusion or not of condensable emissions for the residential sector), they are more likely to be associated with errors. Given the magnitude of the differences, it will in most cases be possible to identify one best value out of the two inventory estimates, even though the truth is unknown. These large differences therefore point to a list of potential issues for inventory compilers to check and fix where applicable, opening the way to potential improvement. In this work, these large differences are named "inconsistencies" and are intended as differences that are large enough to ensure that one of the two inventory values is tenable (i.e. justifiable) whereas the other is not.
In the proposed screening methodology, a threshold is introduced to distinguish inconsistencies from uncertainties. This arbitrary level is denoted as $\beta_t$ and is a free input parameter in this screening process.

The detection of inconsistencies is performed as follows. For each [p,s], we check that neither of the two input quantities, $|\hat{e}|$ and $|\hat{E}|$, exceeds the threshold, and that none of the three main components, $|\widehat{FAS}|$, $|\widehat{LSS}|$ and $|\widehat{LPT}|$, does either. This is to flag potential compensating effects between $\widehat{FAS}$ and $\hat{E}$ since $\hat{e} = \widehat{FAS} + \hat{E}$, and between $\widehat{LSS}$ and $\widehat{LPT}$ since $\hat{E} = \widehat{LSS} + \widehat{LPT}$. To achieve this, the following indicator is defined.
Differences beyond the threshold ($\beta_{p,s} \ge \beta_t$) are then flagged as inconsistencies.
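The check described above can be sketched in a few lines of Python. The exact form of the indicator is not reproduced in this text, so the sketch rests on a stated assumption: a [p,s] couple is flagged when the largest absolute value among the five log-ratios reaches $\log(\beta_t)$. The function name `is_inconsistent` and the numerical values are hypothetical.

```python
import math

def is_inconsistent(fas_hat, lss_hat, lpt_hat, beta_t=2.0):
    """Flag one [p,s] couple (assumed rule: largest |log-ratio| >= log(beta_t))."""
    E_hat = lss_hat + lpt_hat          # larger-scale [p,s] emission ratio
    e_hat = fas_hat + E_hat            # focus-area [p,s] emission ratio
    largest = max(abs(x) for x in (e_hat, E_hat, fas_hat, lss_hat, lpt_hat))
    return largest >= math.log(beta_t)

# Compensating case: sectorial share up by a factor 3, pollutant total down
# by a factor 3. E_hat and e_hat are zero, yet the couple is still flagged.
compensating = is_inconsistent(fas_hat=0.0,
                               lss_hat=math.log(3.0),
                               lpt_hat=-math.log(3.0))

# Small differences (10% on each component) stay at uncertainty level.
small = is_inconsistent(*(math.log(1.1),) * 3)
print(compensating, small)
```

Checking the components as well as the totals is what lets the compensating case surface: a test on $\hat{e}$ or $\hat{E}$ alone would miss it.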

Calculation of an emission consistency indicator (ECI)
As a follow-up step, all [p,s] couples that remain after the relevance ($\gamma_{p,s} > \gamma_t$) and inconsistency detection steps ($\beta_{p,s} > \beta_t$) are used to calculate an "Emission Consistency Indicator (ECI)" (equation 7). The ECI quantifies the maximum difference among all relevant [p,s], normalized by the inconsistency level ($\beta_t$). It therefore quantifies the ratio between the maximum inconsistency and the assumed level of uncertainty. A value of ECI less than one means that all differences are considered as uncertainty (in other words, neither inventory can be identified as best performing). Together with the ECI, which quantifies this maximum difference, we associate the percentage of inconsistent [p,s] with respect to the total number of relevant data, to provide information on the number of detected inconsistencies. To facilitate the screening process, these concepts are displayed graphically. This is discussed in the next section.
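A minimal Python sketch of these two summary numbers follows. It assumes, as a reading of the description above rather than as the published formula, that the ECI is the largest indicator value among the relevant couples divided by $\beta_t$, and that the indicator values are expressed on the same scale as $\beta_t$. The function name `eci_summary` and all values are hypothetical.

```python
def eci_summary(beta_by_couple, beta_t=2.0):
    """Return (ECI, percentage of inconsistent couples) for relevant [p,s] couples.

    beta_by_couple maps a (pollutant, sector) pair to its indicator value,
    assumed to be on the same scale as the threshold beta_t.
    """
    if not beta_by_couple:
        raise ValueError("no relevant [p,s] couple to summarise")
    values = list(beta_by_couple.values())
    eci = max(values) / beta_t                        # largest difference, normalised
    n_flagged = sum(1 for b in values if b >= beta_t)
    share = 100.0 * n_flagged / len(values)           # percent of relevant couples
    return eci, share

# Made-up indicator values for the relevant couples of one focus area
betas = {
    ("PM2.5", "industry"): 1.3,
    ("SO2", "industry"): 5.0,
    ("NOx", "transport"): 1.1,
    ("NH3", "other"): 2.4,
}
eci, share = eci_summary(betas)
print(eci, share)
```

With these made-up values, two of the four relevant couples reach the threshold and the largest one is 2.5 times the assumed uncertainty level.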

Diamond diagram
For the graphical representation, we use an aggregated form of equation (5), recalling that the last two terms on the right-hand side combine into the ratio of the larger scale [p,s] emissions, i.e.:

$$\hat{e} = \widehat{FAS} + \hat{E} \qquad (8)$$

where $\widehat{FAS}$ is related to the large scale-to-focus scale emission share and $\hat{E}$ is related to the large scale emissions. Relation (8) is the basis of the "diamond" diagram (Figure 2) that provides an overview of all inconsistencies detected during the screening process. In this diagram, each inconsistent emission [p,s] is represented by a point that has larger scale emissions ($\hat{E}$) as abscissa and focus activity share ($\widehat{FAS}$) as ordinate. The sum of these two terms ($\hat{e}$) is constant for points that lie on the same diagonal of slope −1. At this stage it is important to note that positive differences in terms of larger scale emissions and focus area shares will characterize points lying on the right and top parts of the diagram, respectively. In addition, the upper right and lower left diagram areas indicate summing-up effects whereas the lower right and top left areas highlight compensating effects.
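The reading of the diagram can be sketched as follows. This Python fragment is only an illustration: the function names are hypothetical, and the test for lying outside the diamond assumes that the limit is $\log(\beta_t)$ applied to each of the three terms of relation (8).

```python
import math

def outside_diamond(E_hat, fas_hat, beta_t=2.0):
    """True if any of E_hat, FAS_hat or their sum e_hat reaches the limit."""
    limit = math.log(beta_t)
    e_hat = E_hat + fas_hat
    return max(abs(E_hat), abs(fas_hat), abs(e_hat)) >= limit

def effect_type(E_hat, fas_hat):
    """Classify a point by the quadrant of the diagram it falls in."""
    product = E_hat * fas_hat
    if product > 0:
        return "summing-up"      # upper right or lower left
    if product < 0:
        return "compensating"    # lower right or upper left
    return "single component"    # on one of the axes

# Two points with the same magnitudes but opposite behaviour (made-up values)
for x, y in [(0.5, 0.5), (0.5, -0.5)]:
    print(x, y, effect_type(x, y), outside_diamond(x, y))
```

The two example points have components of identical size, but only the summing-up one leaves the diamond, because its total $\hat{e}$ exceeds the limit while the compensating one sums to zero.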

The diamond shape (in the middle of the diagram) derives from equation (8), where the threshold is used to draw the inconsistency limit for each of its three terms. Each [p,s] point lying outside this shape is therefore characterized by an inconsistency in terms of E, FAS and/or e, small or large according to its relative position in the diagram. The calculation of the inconsistency limit (equation 6) however considers LPT and LSS as two additional criteria. Because of their link ($\hat{E} = \widehat{LSS} + \widehat{LPT}$), a point within the diamond therefore represents an inconsistency in terms of LPT and LSS that compensate each other, since their sum remains lower than the threshold ($\hat{E} \le \log(\beta_t)$; otherwise, the point would lie outside the diamond). We recall that LPT is related to the total of the large scale emissions whereas LSS provides information on their sectoral share. In this diamond diagram, shapes are used to differentiate activity sectors while colours differentiate pollutants. The size of the symbol is proportional to the relevance of the emission contribution (γ). Finally, we use symbol filling to distinguish priorities among inconsistencies related to the three components: LPT, LSS and FAS. The priority is set as follows: (1) LPT, (2) LSS and (3) FAS. This is motivated by the fact that larger scale inconsistencies are easier to tackle and might correct for many focus area inconsistencies at the same time (i.e. for all focus areas belonging to a given larger area). The priority is then set by checking, in this order, whether the component exceeds the threshold or is larger than the remaining components. In practice, this is implemented as follows:
1. If $\widehat{LPT} \ge \log(\beta_t)$ or $\widehat{LPT} \ge \max(\widehat{LSS}, \widehat{FAS})$, then priority is set to the larger scale pollutant total (LPT).
2. If step 1 is not fulfilled and if $\widehat{LSS} \ge \log(\beta_t)$ or $\widehat{LSS} \ge \widehat{FAS}$, then priority is set to the larger scale sectorial share (LSS).
3. If neither step 1 nor step 2 is fulfilled, priority is set to the focus area activity share (FAS).
Note that compared to the emission diagram proposed by Thunis et al. (2016) and Clappier and Thunis (2020), the diagram proposed here does not distinguish between acceptable (within the diamond) and non-acceptable data (outside the diamond) but displays only inconsistencies (i.e. data to be checked for which some explanation must be found). Moreover, the current formulation does not rely on probabilistic assumptions and directly relates to emission characteristics that are readily available to emission developers.
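The priority rule listed above can be sketched as a small Python function. The function name `priority` is hypothetical, and two points are assumptions rather than statements from the text: the components are compared in absolute value, and the step 1 test is taken to mirror the step 2 test.

```python
import math

def priority(fas_hat, lss_hat, lpt_hat, beta_t=2.0):
    """Return which component ('LPT', 'LSS' or 'FAS') should be checked first."""
    limit = math.log(beta_t)
    # Assumption: magnitudes are compared, so that a decrease counts as much
    # as an increase of the same factor.
    fas, lss, lpt = abs(fas_hat), abs(lss_hat), abs(lpt_hat)
    if lpt >= limit or lpt >= max(lss, fas):   # step 1 (assumed form)
        return "LPT"
    if lss >= limit or lss >= fas:             # step 2
        return "LSS"
    return "FAS"                               # step 3

# Made-up log-ratios illustrating the three outcomes
print(priority(fas_hat=0.1, lss_hat=0.2, lpt_hat=1.0))
print(priority(fas_hat=0.5, lss_hat=0.6, lpt_hat=0.1))
print(priority(fas_hat=2.0, lss_hat=0.3, lpt_hat=0.1))
```

Testing the larger scale components first reflects the motivation given in the text: fixing a country-level issue may resolve the inconsistencies of all focus areas in that country at once.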

2.3.2 Supporting diagrams
In addition to the diamond diagram, other diagrams are proposed to support the interpretation of the screening. These diagrams are designed to provide additional information by detailing further some aspects (e.g. geographical) at the expense of aggregation or simplification (e.g. limitation to top inconsistencies) on other aspects. These diagrams are:
• Overview map: Data are displayed on a geographical map to easily identify the inconsistencies for each focus area. However, only the maximum inconsistency ($\max_{p,s}\{\beta_{p,s}\}$) for each focus area is shown. While the size is here proportional to the magnitude of the inconsistency, the symbol shapes, colours and filling remain similar to the overview diamond.

• Barplot: For a given pollutant and focus area, this diagram allows visualizing inventory differences directly in terms of FAS, LSS and LPT components. This diagram is used here as a means of validation (with respect to the diamond results).
We discuss these visualizations further in the application section, showing a graphical example and comments for each type of plot.

Input
In this section, we apply our screening methodology to the CAMS-REG regional anthropogenic emission inventory that covers emissions for UNECE-Europe for the main air pollutants and greenhouse gases.

The method starts from the emissions reported by European countries to UNFCCC (for greenhouse gases) and to EMEP/CEIP (for air pollutants), which have been aggregated into 246 different combinations of sectors and fuels. Reported data are analysed by sector and completed with alternative emission estimates where the completeness, consistency and/or quality of the reported data was not sufficient. In practice, reported data were found fit for purpose for EU Member States and the UK, Norway, Iceland and Switzerland, while for other countries alternative emission estimates were used. In addition, some further modifications were made to the dataset, for which we refer to Kuenen et al. (2021). This results in a complete emission inventory for all countries, which is then spatially distributed at high resolution using a consistent methodology over the whole domain.
(Britz et al. 2015)). It also uses a new approach for Agricultural Waste Burning, improves the point source database, and uses updated harmonized inland and sea shipping based on the FMI STEAM model (Jalkanen et al., 2009). Further details on these changes are provided in Kuenen et al. (2021).
It is important to stress that the proposed screening methodology assesses the overall consistency of the two inventory versions, i.e. it covers the consistency of the inventory compilation itself but also the consistency of its input data (in this case: country reported emissions).
Finally, the relevance and inconsistency thresholds are set in this work to $\gamma_t = 0.5$ and $\beta_t = 2$. Although the choice of these thresholds is arbitrary and may seem challenging, this is not the case in practice and identifying the inconsistencies is a robust process. A too low threshold will lead to detecting too many differences, among which the smallest (at uncertainty level) do not allow assessing what is best. However, the largest differences (inconsistencies) are still identified and can be taken care of to improve one or both inventories. A too low threshold will therefore lead to confusing information by mixing uncertainties and inconsistencies. On the other hand, a too high threshold will lead to detecting too few inconsistencies and therefore to missing errors that could potentially be corrected. In practice, it is recommended to start with a high threshold and lower it progressively until differences cannot be justified anymore. The values for the relevance and inconsistency thresholds ($\gamma_t$ and $\beta_t$) presented here reflect these considerations.

Results
The diamond diagram (Figure 4)

The European map presented in Figure 5 flags the inconsistency that dominates in each city. It is interesting to note that while the total number of inconsistencies (46) might seem large, the map shows that in some countries, the same type of inconsistency is widespread. This is the case for the PMco industrial emissions in the UK, the SO2 industrial emissions in France, or the NMVOC residential emissions in the Czech Republic. As the size of the symbol is here proportional to the magnitude of the inconsistency, the PMco emissions from the industrial sector might need a priority check in some UK cities. Even though these inconsistencies appear in different countries, their type is similar and resolving one might bring useful information to resolve the others.

Figure 5: EU (overview) map. Only cities where at least one [p,s] couple ratio with relevant emissions ($\gamma \ge \gamma_t$) is above the inconsistency threshold ($\beta \ge \beta_t$) are shown by a symbol. If more than one [p,s] fulfils these two conditions, only the largest is shown. All other cities are represented by a black dot.
We now focus on some of these inconsistencies and try to understand their origin. For this purpose, we select examples among those showing important inconsistencies, with the aim of illustrating different types of inconsistencies, i.e. in terms of LSS, FAS and LPT (Figure 6). Note that depending on pollutant and sector, the ECI can differ quite strongly in magnitude, which explains the different range of values in our examples.
Vilnius: inconsistent country totals for PM2.5: In Vilnius (Lithuania), the ECI is about 2 (see diamond plot in Figure 6 top row), indicating inconsistencies that are about twice as large as the level of uncertainty. The flagged inconsistency is aligned along the x-axis, indicating issues in terms of country PM2.5 values, in particular pollutant levels (LPT). The associated bar-plot highlights a factor 2 to 3 difference between the CAMS-REG-v2.2.1 and -v4.2 estimates (Figure 6). Note that the country sectorial shares also diverge for the industrial and transport sectors. This is however seen by the screening tool as a second priority.
The changes can be explained by the changes in the emission reporting that is used as input to the CAMS-REG inventories. Significant updates were made in the 2019 submission compared to the 2017 submission for Lithuania. For example, PM2.5 emissions from the residential sector decreased by nearly a factor 4, whereas road transport emissions increased by ~70%. National total PM2.5 emissions reported by Lithuania for 2015 were reduced by more than 50% between submissions in 2017 and 2019.

Dublin: inconsistent industry country share for PM2.5: In Dublin, the ECI is about 50, indicating inconsistencies that are about 50 times larger than the level of uncertainty. The flagged inconsistency (PMco from industry) lies on the right, indicating a much larger value attributed to this pollutant/sector in the CAMS-REG-v4.2 version. This is confirmed in the associated bar-plot (Figure 6 second row) that highlights a totally different industrial share in the two inventories. This country scale issue is partly echoed in the urban share, but this is seen by the screening tool as a second priority. Similar to the Vilnius case above, this can be explained by changes in country reporting. Whereas total emissions in the 2017 submission for Industry were 1.5 kt PM2.5 and 1.6 kt PM10, in the 2019 submission the PM2.5 emissions amounted to 1.9 kt and PM10 emissions were 7.7 kt. Hence PMco emissions from industrial sources were increased by more than a factor 50, from 0.1 to 5.8 kt, between both versions.
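The Dublin figures can be retraced with a few lines of arithmetic. The Python sketch below uses only the reported totals quoted above and assumes that PMco is obtained as the difference between PM10 and PM2.5; the variable names are illustrative.

```python
# Reported industrial emissions for Ireland, in kt (values quoted in the text)
pm25_2017, pm10_2017 = 1.5, 1.6
pm25_2019, pm10_2019 = 1.9, 7.7

# Assumption: coarse PM is the part of PM10 that is not PM2.5
pmco_2017 = pm10_2017 - pm25_2017   # about 0.1 kt
pmco_2019 = pm10_2019 - pm25_2019   # about 5.8 kt

ratio = pmco_2019 / pmco_2017       # change in PMco between the two submissions
print(round(pmco_2017, 1), round(pmco_2019, 1), round(ratio))
```

A small absolute revision of PM10 thus translates into a very large relative change in PMco, because the 2017 coarse fraction was close to zero.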
Newcastle: inconsistent industry urban share for PMco: In Newcastle (UK), the ECI is 68 (Figure 6 third row), indicating inconsistencies that are about 70 times larger than the level of uncertainty. The flagged inconsistency (PMco from industry) is mostly driven by the urban share, but country values differ largely as well (factor 2). Note that large differences of the same type also occur for PM2.5. While this is not flagged as a major inconsistency by the screening approach (because the relative importance of the emissions (γ) is too small), this might become the case when the PMco inconsistency has been resolved. The associated bar-plot highlights differences in country totals and country sectorial shares, but these are not sufficient (in terms of γ or β) to trigger the flagging. In contrast, the very large difference in the urban share, with CAMS-REG-v4.2 exceeding -v2.2 emissions by almost a factor 100, is flagged. As mentioned above, this inconsistency is present in many UK cities. The inconsistency can be explained partly by changes in reporting between the 2017 and 2019 submissions, as the PMco from industry increased from 15.8 to 35.2 kt. For the distribution within the country, E-PRTR is used for distributing emissions to point source installations. When checking in detail for this location, a factor 1000 error in E-PRTR reporting was found, which led to an over-allocation of PM emissions from the industrial sector to this specific industrial site located within the Newcastle urban area. As a result, emissions in this particular location are overestimated in CAMS-REG-v4.2, which is compensated for by underestimated emissions elsewhere in the UK.

London: inconsistent "other" urban share for NH3: In London (UK), the ECI is 2.5, indicating inconsistencies that are about 2.5 times larger than the level of uncertainty (Figure 6 bottom row). The flagged inconsistency (NH3 from the "other" sector) results from both urban and country differences that add up, but the dominating factor is the urban share. The associated bar-plot confirms this issue, while differences in country totals and country sectorial shares appear moderate in comparison to the urban share issue. In contrast to Newcastle, this issue appears only for London. In this specific case, it was found that a relatively large part of NH3 emissions was reported in the category "other waste" for the United Kingdom as a whole. Given the relatively low importance of the sector "other waste" and the absence of point source information for NH3 for this particular sector, these emissions were allocated using a surrogate point source distribution where all emissions ended up in the same point source in London, thus significantly over-allocating emissions in this location. This therefore points to an inconsistency in the CAMS-REG methodology.
Half of the inconsistencies between the two versions of CAMS-REG considered in this study can be attributed to changes in country reporting. All European countries annually revise and report their historical annual emissions back to 1990, hence the emissions of e.g. 2015 are reported again every year. The differences between both versions may be the result of a correction of an error and/or the implementation of a different methodology to estimate the emissions. This may be checked in the reports (IIRs) that are submitted annually along with the reported emissions, but likely not all changes are documented in detail. As a consequence, only in a selection of cases will it be possible to identify an error in one of the inventories as such, and hence to define the better inventory.

An important driver for inventory improvement in recent years has been the annual review of air pollutant inventory data under the NEC Directive organised by the European Commission, which has led to substantial revisions of nationally reported data since 2017 for all pollutants and sectors.

Further considerations
The approach presented in this work is intended as a screening to flag inconsistencies. Only differences that are above a user-defined threshold ($\beta_t$) are detected and smaller differences are disregarded. This threshold reflects the limit between relatively small differences for which no emission inventory can be estimated to be the best (because true emissions are unknown) and differences that are so large that they are likely associated with a large error in one of the two (or both) inventories (hence called inconsistencies) for which it should be possible to identify a best performing inventory.

While solving a few inconsistencies will generally lower the overall number of inconsistencies, this is however not always the case. Indeed a very large inconsistency can potentially lead to a γ factor that is so large that all other [p,s] for that city would be disregarded in proportion. Once the inconsistency is solved, the new γ estimates might lead to one or more new inconsistencies to be flagged. This is therefore a step-wise approach.

The settings used in this work, i.e., the choice of 150 urban areas and the country level as larger scale, have been arbitrarily fixed. The method allows for flexible choices and could be applied to areas other than urban ones (e.g., high emission industrial or intensive agriculture areas) to assess the consistency with respect to other types of emissions. Similarly, the larger scale can be adapted to the specific inventory and focus on regions rather than countries, or be defined as the entire modelling domain.
The proposed application focuses on the comparison of two versions of a specific inventory (here CAMS-REG). Although more challenging, the screening method can be applied to the comparison of two different inventories. Additional challenges will then appear, in particular (1) differences in spatial resolution that might result in sources being excluded from a grid cell in one inventory and included in the other, resulting in artificial differences, and (2) the need to harmonize the emissions in terms of sectorial categories as a first step before the comparison. This inter-comparison of inventories is the subject of a follow-up paper, where these specific issues are discussed.

Given its flexible settings, the screening method also applies to bottom-up inventories. These can then be compared with themselves (e.g. two versions) or with other inventories. As mentioned earlier, the smaller areas of interest can be defined freely by the user, and this is also the case for the larger scale, which in the extreme case can be set to the total domain area.

Finally, the screening tool also provides text information that summarizes the inconsistencies by detailing the city, sector, pollutant, type and amplitude for each of them. A comment line is attached to each inconsistency in order to keep track of the steps taken to resolve it (or not).
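As an illustration only, such a text summary could be represented by a small record holding the fields listed above. The sketch below is a hypothetical layout in Python, not the format used by the screening tool: the class name, field names, separator, and the comment text are all assumptions, and the amplitude of 2.5 is a placeholder value.

```python
from dataclasses import dataclass


@dataclass
class InconsistencyRecord:
    """One flagged inconsistency (hypothetical layout, not the tool's own format)."""
    city: str
    sector: str
    pollutant: str
    kind: str          # e.g. "country total", "sector share" or "urban share"
    amplitude: float   # size of the difference, in the screening's own metric
    comment: str = ""  # free-text note on the steps taken to resolve it (or not)

    def summary_line(self) -> str:
        # Build a single readable line; an empty comment is reported explicitly.
        note = self.comment or "no action recorded"
        return (f"{self.city} | {self.sector} | {self.pollutant} | "
                f"{self.kind} | {self.amplitude:.1f} | {note}")


# Placeholder entry loosely based on the London case discussed in the text.
record = InconsistencyRecord("London", "other", "NH3", "urban share", 2.5)
line = record.summary_line()

# The comment can be filled in later to keep track of the follow-up.
record.comment = "surrogate point-source allocation under review"
```

Keeping the comment as a mutable field next to the flagged values means the follow-up history stays attached to the inconsistency it refers to when the screening is repeated.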

Conclusions
In this work, we proposed and discussed a screening method to compare two emission inventories. The overall goal is to improve the quality of emission inventories by feeding back the results of the screening to inventory compilers, who can check the inconsistencies found and, where applicable, resolve errors. The method targets three different aspects: 1) the total emissions assigned to a series of large geographical areas, countries in our application; 2) the way these country total emissions are shared in terms of sector of activity and 3) the way inventories spatially distribute emissions from countries to smaller areas, cities in our application. The method provides a way to quantify the level of consistency (intended here as a whole, i.e. emission compilation plus all relevant input data) between two inventories.
Given the large and possibly overwhelming amount of data to analyze (many pollutants, activity macro sectors and cities), the first step of the screening approach consists in sorting the data for the comparison and keeping only emission contributions that are relevant enough. In a second step the method identifies, among those significant differences, the most important ones that are evidence of methodological divergence and/or errors that can be found and resolved in at least one of the inventories. Although this screening does not allow the quality of the inventories to be checked in an absolute way, the magnitude of the differences is often large enough that it is possible to identify one best value out of the two inventory estimates, even though the truth is unknown.
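The three aspects can be pictured with a simple identity: the emission of a pollutant from one sector in a city equals the country total, times the sector's share of that total, times the city's share of the country sector emission. The sketch below, a minimal Python illustration, applies this identity to two inventories and reports the ratio for each factor. It is not the indicator defined in this work; the function name, dictionary keys and numerical values are assumptions chosen for the example.

```python
import math


def decompose_ratio(inv_a, inv_b):
    """Split the city-level emission ratio of two inventories into three factors.

    Each inventory is a dict with hypothetical keys:
      'country_total'  : country emission of the pollutant, all sectors
      'country_sector' : country emission of the pollutant for one sector
      'city_sector'    : emission of that pollutant and sector inside the city
    """
    def factors(inv):
        total = inv["country_total"]
        sector_share = inv["country_sector"] / inv["country_total"]
        urban_share = inv["city_sector"] / inv["country_sector"]
        return total, sector_share, urban_share

    total_a, sector_a, urban_a = factors(inv_a)
    total_b, sector_b, urban_b = factors(inv_b)
    return {
        "country_total": total_a / total_b,
        "sector_share": sector_a / sector_b,
        "urban_share": urban_a / urban_b,
    }


# Illustrative numbers only (arbitrary units), not taken from CAMS-REG.
v1 = {"country_total": 100.0, "country_sector": 20.0, "city_sector": 2.0}
v2 = {"country_total": 110.0, "country_sector": 22.0, "city_sector": 6.6}

ratios = decompose_ratio(v1, v2)

# The factor whose ratio is furthest from 1 (on a log scale) dominates.
dominant = max(ratios, key=lambda name: abs(math.log(ratios[name])))
```

By construction the product of the three ratios equals the ratio of the city-level emissions, so a large city-level difference can be traced back to the factor that contributes most, here the urban share.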

The approach has been used to compare two versions of the CAMS-REG European scale inventory over 150 cities in Europe for selected activity sectors. Versions 2.2.1 and 4.2 of this inventory differ both in terms of reporting year (new activity data, new emission factors, etc.) and in terms of spatial distribution (e.g. split in road transport emissions between urban, rural and highway shares, new proxies for agriculture…). Among the 4500 screened pollutant-sectors, about 450 were kept as relevant, among which 46 showed inconsistencies. The analysis indicated that these inconsistencies arose almost equally from reporting by countries and from methodological issues in CAMS-REG (e.g. spatial distribution). They mostly affect SO2 and PM coarse emissions from the industrial and residential sectors. Differences in terms of reporting may be the result of a correction of an error and/or the implementation of a different methodology to estimate the emissions. However, the fact that about half of the inconsistencies can be attributed to changes in country reporting stresses the need to further check the informative inventory reports (IIRs) that are submitted annually along with the reported emissions. For inconsistencies related to the CAMS-REG methodology, and in particular the spatial distribution therein, the analysis presented here showed that for specific cities, screened errors could be explained, and some of them resolved, leading to improved inventories.

Although only a particular example has been discussed here, the screening approach is general and can be used for other types of applications related to emission inventories. The approach can be applied to other inventory scales (e.g. regional or local) and can be tuned to address different sectors or areas. Intensive agriculture or industrial areas could for example be added to the urban agglomerations considered in this work. Because the type of emission expertise strongly depends on the type of application, expertise relevant to the given application is necessary to analyze the screening results and correct likely errors. The screening approach also allows assessing the consistency of temporal series of emissions, comparing inventories based on different sources of information, or even comparing inventories based on different methodologies (e.g. comparison of top-down and bottom-up). The latter are the subject of a follow-up paper.

Appendix
This section provides details on the cities considered in this study, in terms of sectors and emitted pollutants. For each city, the table below distinguishes between non-relevant and relevant emission inventory pairs, the latter being further split into consistent and inconsistent.