Local Fractions - a method for the calculation of local source contributions to air pollution, illustrated by examples using the EMEP MSC-W model (rv4_33)

We present a computationally inexpensive method for individually quantifying the contributions from different sources to local air pollution. It can explicitly distinguish between regional/background and local/urban air pollution, allowing fully consistent downscaling schemes. The method can be implemented in existing Eulerian chemical transport models and can be used to distinguish the contributions of a large number of emission sources to air pollution in every receptor grid cell within one single model simulation, and thus to provide detailed maps of the origin of the pollutants. Hence it can be used for time-critical operational services providing scientific information as input to local policy decisions on air pollution abatement. The main limitation in its current version is that only primary pollutants can be addressed.


Introduction
The origin of atmospheric pollutants within a given region is one of the fundamental questions of air quality research. Degradation of air quality, either temporary or sustained, is often the result of both local and long-range transported air pollution, originating from anthropogenic but also natural emission sources. Anthropogenic emissions are due to a large number of different categories such as road traffic, industrial point sources and large area sources.
In order to devise optimal strategies of air pollution abatement, for example short-term or long-term emission reduction measures, air quality managers need to have access to reliable scientific knowledge about the origin of air pollution. Typical questions include: a) By what amount can local air pollution be reduced through local measures only, and in which cases will regional or countrywide measures be necessary? b) What will be the benefit of emission reduction measures imposed on one or several specific emission sectors? c) Will these measures be efficient on a short time frame or should they be implemented on a longer-term basis?
Many different methods exist to extract information about the origin of air pollution (e.g., Thunis et al., 2019; Clappier et al., 2017). Some of them are based on measurements of the chemical composition of air masses in the region of interest (receptor region). Such a 'chemical fingerprint' can then give hints on the origin (source region or sector) of the pollutants. Most methods, however, are based on models as these can be readily applied to scenario calculations as well. Chemical Transport Models (CTMs), in particular, are efficient mathematical tools that treat emission sources, transport, chemical conversion and loss mechanisms of air pollutants in a consistent way, and allow different scenarios to be assessed within a reasonable amount of computing time.
https://doi.org/10.5194/gmd-2019-296 Preprint. Discussion started: 11 November 2019. © Author(s) 2019. CC BY 4.0 License.
The simplest method to evaluate the importance of different emission sources in a CTM is the 'direct' method (e.g., Folberth et al., 2012), sometimes also referred to as the 'annihilation' or 'brute force' method, where the same model simulation is repeated with and without including a chosen emission source. The difference in pollutant concentrations in the receptor region can then be attributed to the chosen emission source. In order to stay within quasi-linearity one can choose to reduce the emission source by only a small amount. This is usually referred to as the 'perturbation method' (e.g., Jacob et al., 1999; Fiore et al., 2009) and is well suited to simulate the effects of policy measures to reduce emissions from certain sectors by a certain amount. However, one of the drawbacks of this method is that for each source contribution a new, independent simulation must be performed.
In general, the chemical processes involve non-linearities, and the sum of individual contributions calculated by the direct or perturbation methods does not equal the base case simulation where all sources are included. A method that better treats non-linearities is the so-called 'tagging method' (e.g., Wang et al., 1998; Wu et al., 2011; Grewe et al., 2013). It distinguishes chemically identical molecules according to their sources. In the model this can be done by defining one separate tracer for each source. In this way it is possible to track these tracers through transport and chemical transformations, and thereby quantify their contribution to air pollution at any given location in one single model run. The limitation of tagging methods has been that the calculation becomes impractical when the number of different sources is large.
In this regard, 'adjoint models' (Elbern and Schmidt, 1999; Vautard et al., 2000; Henze et al., 2007) are superior. Adjoint models calculate the derivative of a model scalar with respect to all other model parameters in one single simulation and in this way efficiently quantify the contribution from all emission sources to air pollution in a given receptor region. However, a new adjoint simulation must be performed for each receptor region.
Still, only a relatively small number of sources or receptors at a time can be analyzed by all these methods. Perturbation methods calculate all receptor values for one source group, tagging methods compute all receptors for a limited number of source groups, while adjoint models address all sources for one receptor group. Ideally, all contributions to all receptor points should be described.
In this paper we present a method which can efficiently calculate the contribution of a significantly larger number of sources (thousands or more) to a limited (but large) number of receptor regions. This method does not provide results that cannot be obtained by other means, but it does so at a lower computational cost and is thus well suited especially for time-critical operational applications. It can be built on top of existing Eulerian CTMs relatively easily, and thereby has the potential to offer a new range of applications. However, it is at present limited to primary pollutants, for which linearity can be assumed. It will thus complement existing methods, but not replace them. In the following section we describe the method in technical detail, while in Sect. 3 we show concrete examples of what kind of results the method can provide. The results will also be compared against the direct method. Finally, in the last section we discuss possible applications of the method as well as plans for further development.

Description of the method
In theory the method corresponds to a tagging method, where pollutants from different origins are tagged and their values are traced and stored individually.
We define the Local Fraction $LF_s$ in a receptor grid cell as the fraction of pollutant that is due to a particular source term $s$.
For example, $s$ can be a power plant or the road traffic emissions in a specific source region. $LF_s$ is a number between zero and one and is calculated as:

$$LF_s = \frac{LP_s}{TP} \qquad (1)$$

The Total Pollutant is abbreviated $TP$; it could be the air concentration of particulate matter, for example. The Local Pollutant, $LP_s$, is the part of $TP$ "tagged" from a specific origin $s$. Its value is in general the result of various processes (emissions, advection, diffusion, etc.) as will be described below. Given the value of the Total Pollutant, the Local Fraction and the Local Pollutant carry the same information, but as we will see, there are a few practical advantages of storing the Local Fraction rather than the Local Pollutant.
In a time-splitting framework the different physical processes are included sequentially, and we will show in the next sections how the value of the Local Pollutant changes during each of them. For simplicity, the initial value for $LF_s$ is set to zero, given that in the long term $LF_s$ should not be sensitive to the initial value.

Emissions
The Local Pollutant and Local Fraction are associated with a particular emission source category ($E_s$) in a specific grid cell.
(Formally the source could also be spread over a group of grid cells, but at present we limit ourselves to sources defined on single grid cells.) If $E_s(t)$ is the emission rate of source $s$ at time $t$, $LP_s$ will increase during the time step $\Delta t$:

$$LP_s(t + \Delta t) = LP_s(t) + E_s(t)\,\Delta t \qquad (2)$$

and

$$LF_s(t + \Delta t) = \frac{LP_s(t + \Delta t)}{TP(t + \Delta t)} \qquad (3)$$

For instance, $s$ could refer to emissions of particulate matter from road traffic, $TP$ would be the total concentration of particulate matter in the receptor region, and $LF_s$ would then be the fraction by which the total concentrations in the receptor region would be reduced if the emissions from road traffic in the source region were removed completely (assuming linearity).
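The emission step can be illustrated with a minimal Python sketch for a single grid cell. It is not taken from the EMEP MSC-W code: the function name `apply_emissions` and its arguments are hypothetical, and emission rates are assumed to be already expressed in concentration per unit time. The Local Pollutant of each tracked source grows by its own emissions, the Total Pollutant grows by all emissions in the cell, and the Local Fraction is their ratio.

```python
import numpy as np

def apply_emissions(lf, tp, emis_tracked, emis_total, dt):
    """Emission step for one grid cell (illustrative sketch).

    lf           : array (n_sources,), Local Fractions before the step
    tp           : float, Total Pollutant before the step
    emis_tracked : array (n_sources,), emission rate of each tracked source
    emis_total   : float, emission rate of all sources in the cell
    dt           : float, time step
    Rates are assumed to be in concentration per unit time.
    """
    lf = np.asarray(lf, dtype=float)
    emis_tracked = np.asarray(emis_tracked, dtype=float)
    # Local Pollutant after the step: previous amount plus new emissions
    lp_new = lf * tp + emis_tracked * dt
    # Total Pollutant receives the emissions from all sources
    tp_new = tp + emis_total * dt
    if tp_new > 0.0:
        lf_new = lp_new / tp_new
    else:
        lf_new = np.zeros_like(lp_new)  # no pollutant, fractions undefined
    return lf_new, tp_new
```

Starting from zero Local Fractions, as in the text, the fractions build up as the tracked emissions accumulate relative to the total.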

Advection
Transport of pollutants will mix pollutants from different origins. We will trace individually the Local Pollutant due to different sources and from every horizontal grid cell within the source region. We then need two sets of position indices, one for the origin (source region) and one for the actual position (receptor grid cell):

$$LF_{s,x_s,y_s}(x, y, z, t) \qquad (4)$$

where $x_s$ and $y_s$ are the (horizontal) coordinates of the source grid cell, and $x$, $y$ and $z$ are the coordinates of the receptor grid cell. $s$ is a specific source category at $(x_s, y_s)$. In order to keep the calculation at a reasonable cost, one can limit $x_s$ and $y_s$ to be within a preset number of grid cells, $\Delta_{max}$, from the receptor grid cell:

$$|x_s - x| \le \Delta_{max}, \qquad |y_s - y| \le \Delta_{max} \qquad (5)$$

The source position indices are then replaced by the position of the source relative to the receptor grid cell:

$$LF_{s,\Delta x_s,\Delta y_s}(x, y, z, t) \qquad (6)$$

where $\Delta x_s = x_s - x$ and $\Delta y_s = y_s - y$ are the signed distances to the source. In practice, $z$ is also limited, as it is usually not necessary to trace pollutants for receptor grid cells all the way up through the atmosphere. Note that the vertical position of the source is not explicitly traced, but it can, in principle, be included in the form of separate sources $s$.
We call the region delimited by all $(x_s, y_s)$ and the vertical range of $z$ the "local region".
$LF_{s,\Delta x_s,\Delta y_s}(x, y, z, t)$ is in practice a seven-dimensional array. The range of $s$ depends on the number of source categories to be tracked. The size of this array can be very large, which reflects the large amount of information it carries.
Pollutants can be traced within this region. If they leave the local region, they are no longer identifiable by the method, even if they return into the local region.
Let us consider a flux of pollutant, $F(x, y, z, t)$ (assumed positive), from a grid cell $x$ to $x + 1$ during $\Delta t$, and a source at a position $\Delta x_s$ relative to $x$.
The amount of Local Pollutant leaving the grid cell $x$ is

$$F(x, y, z, t)\, LF_{s,\Delta x_s,\Delta y_s}(x, y, z, t) \qquad (7)$$

At position $x + 1$, the relative position of that source is $\Delta x_s - 1$, and the Local Pollutant is thus updated according to

$$LP_{s,\Delta x_s - 1,\Delta y_s}(x + 1, y, z, t + \Delta t) = LP_{s,\Delta x_s - 1,\Delta y_s}(x + 1, y, z, t) + F(x, y, z, t)\, LF_{s,\Delta x_s,\Delta y_s}(x, y, z, t) \qquad (8)$$

Or, if the source is moved by one grid cell ($\Delta x_s$ replaced by $\Delta x_s + 1$), the formula becomes:

$$LP_{s,\Delta x_s,\Delta y_s}(x + 1, y, z, t + \Delta t) = LP_{s,\Delta x_s,\Delta y_s}(x + 1, y, z, t) + F(x, y, z, t)\, LF_{s,\Delta x_s + 1,\Delta y_s}(x, y, z, t) \qquad (9)$$

$$LF_{s,\Delta x_s,\Delta y_s}(x + 1, y, z, t + \Delta t) = \frac{LP_{s,\Delta x_s,\Delta y_s}(x + 1, y, z, t + \Delta t)}{TP(x + 1, y, z, t + \Delta t)} \qquad (10)$$

The fluxes and Total Pollutants are not explicitly dependent on the source $s$, and are normally available quantities in the CTM.
If the flux is exiting the grid cell $x$, the Local Fractions at $x$ do not have to be updated, since it can be assumed that the fractions being removed are the same for the Local and Total Pollutants.
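The advection update can be sketched in one dimension as follows. This is a simplified illustration, not the model's actual scheme: it assumes non-negative fluxes directed towards increasing $x$, no inflow at the left boundary, and fluxes that never exceed the cell content. The array layout (`lf[k, x]` with offset index `k = Δx_s + dmax`) and the function name `advect_x` are choices made for this example only.

```python
import numpy as np

def advect_x(lf, tp, flux):
    """One advection step along x with non-negative fluxes (sketch).

    lf   : array (2*dmax+1, nx); lf[k, x] is the fraction of the pollutant
           in receptor cell x that comes from the source at offset k - dmax
    tp   : array (nx,), Total Pollutant in each cell
    flux : array (nx,), amount moved from cell x to cell x+1 during the
           step (flux[-1] leaves the domain); assumed 0 <= flux <= tp
    """
    nx = tp.shape[0]
    inflow = np.zeros(nx)
    inflow[1:] = flux[:-1]
    remaining = tp - flux              # amount that stays in each cell
    tp_new = remaining + inflow

    # What stays keeps its fractions: outflow removes the same fraction
    # from the Local and the Total Pollutant.
    lp_new = lf * remaining
    # What arrives in cell x+1 from cell x: a source at offset d+1 seen
    # from x is at offset d seen from x+1. Offsets that would fall outside
    # the local region are dropped (no longer traceable).
    lp_new[:-1, 1:] += flux[:-1] * lf[1:, :-1]

    mask = np.broadcast_to(tp_new > 0, lp_new.shape)
    lf_new = np.divide(lp_new, tp_new, out=np.zeros_like(lp_new), where=mask)
    return lf_new, tp_new
```

In this layout the index shift `lf[1:, :-1]` to `lp_new[:-1, 1:]` is the only place where the relative source position is updated; the fluxes themselves are computed once for the Total Pollutant.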

Diffusion
For diffusion we compute the effect of diffusion directly on every Local Pollutant:

$$LF_{s,\Delta x_s,\Delta y_s}(x, y, :, t + \Delta t) = \frac{\mathrm{Diffusion}\left(LP_{s,\Delta x_s,\Delta y_s}(x, y, :, t)\right)}{\mathrm{Diffusion}\left(TP(x, y, :, t)\right)} \qquad (11)$$

where "Diffusion()" is the numerical operator that computes the diffusion in the model and the colon ':' indicates its operation over the entire vertical grid column. This ensures a consistent treatment of the Local Fractions, whatever numerical procedure is applied for the diffusion.
In a practical implementation it is not necessary to include all the vertical levels, as the contribution from higher levels is negligible (it corresponds to pollutants leaving and returning to the local region during the same time step). In our implementation we include only two layers above the highest local region considered.
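A minimal sketch of the diffusion update of Eq. (11) is given below. The operator `simple_mixing` is a hypothetical stand-in for whatever vertical diffusion routine a model uses (here an explicit exchange between adjacent layers of equal thickness); only the ratio of the diffused Local Pollutant to the diffused Total Pollutant reflects the method described above.

```python
import numpy as np

def simple_mixing(col, k=0.25):
    """Hypothetical diffusion operator: explicit exchange between adjacent
    layers with no flux through the column boundaries."""
    out = np.array(col, dtype=float)
    exchange = k * (out[1:] - out[:-1])
    out[:-1] += exchange
    out[1:] -= exchange
    return out

def diffuse_local_fractions(lf_col, tp_col, diffusion=simple_mixing):
    """Apply the same diffusion operator to each Local Pollutant column
    and to the Total Pollutant column, then take the ratio.

    lf_col : array (n_tracked, nz), Local Fractions in one vertical column
    tp_col : array (nz,), Total Pollutant in the same column
    """
    lf_col = np.asarray(lf_col, dtype=float)
    tp_col = np.asarray(tp_col, dtype=float)
    tp_new = diffusion(tp_col)
    lp_new = np.array([diffusion(lf * tp_col) for lf in lf_col])
    mask = np.broadcast_to(tp_new > 0, lp_new.shape)
    lf_new = np.divide(lp_new, tp_new, out=np.zeros_like(lp_new), where=mask)
    return lf_new, tp_new
```

Because the operator is passed in as a function, the same wrapper works for any linear diffusion scheme, which mirrors the remark that the treatment is consistent whatever numerical procedure is used.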

Deposition
For deposition (dry or wet), we can assume that the same fractions of Local and Total Pollutants are removed. Therefore the Local Fraction will not vary during the deposition process:

$$LF_{s,\Delta x_s,\Delta y_s}(x, y, z, t + \Delta t) = LF_{s,\Delta x_s,\Delta y_s}(x, y, z, t) \qquad (12)$$

The simplicity of this formula is one of the motivations for storing $LF$ rather than $LP$.
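The invariance of the Local Fraction under a proportional loss can be checked numerically with a short sketch. The function `apply_proportional_loss` and its `loss_fraction` argument are hypothetical names used only for this illustration; the point is that when the same relative amount is removed from the Local and the Total Pollutant, their ratio does not change.

```python
import numpy as np

def apply_proportional_loss(lf, tp, loss_fraction):
    """Remove the same relative amount from Local and Total Pollutant.

    lf            : array (n_tracked,), Local Fractions before the loss
    tp            : float, Total Pollutant before the loss
    loss_fraction : float in [0, 1), share removed during the step
    """
    lf = np.asarray(lf, dtype=float)
    lp_new = lf * tp * (1.0 - loss_fraction)   # Local Pollutant after loss
    tp_new = tp * (1.0 - loss_fraction)        # Total Pollutant after loss
    lf_new = lp_new / tp_new if tp_new > 0.0 else np.zeros_like(lp_new)
    return lf_new, tp_new
```

In an implementation that stores the Local Fractions, this step therefore requires no work at all, whereas stored Local Pollutants would have to be rescaled one by one.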

Chemistry
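To fully follow the pollutants through all the chemical reactions would, in principle, require an explicit reference to all the sources and grids. It is possible to reduce the size of the problem if linearity is assumed. This has been done by other groups (e.g., Kranenburg et al., 2013). The calculation of all the chemical reactions is one of the most computationally intensive parts of CTMs (roughly 60% in the EMEP MSC-W model (Simpson et al., 2012) used for the tests presented below). A consistent chemical treatment of Local Pollutants would almost multiply the computation time by the number of Local Pollutants considered, i.e. the size of $(s, \Delta x_s, \Delta y_s)$. In order to preserve the simplicity of the method, we will in this version assume that the chemical processes modify the local and non-local parts of the pollutants in the same proportions. With this assumption Eq. (12) can be used. This assumption is correct for primary particles and, as illustrated in our examples below, can give meaningful results also for NH3, SOx and NOx. So far the method is only developed for emitted pollutants, and not for secondary pollutants.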

Examples and validation
The Local Fractions will depend on a broad range of factors such as emission distributions, meteorological conditions, grid resolution, chemical regime, size of the local region, etc. It is beyond the scope of this article to systematically quantify how all the possible situations affect Local Fractions. Here we will only try to give some basic examples for situations where the approximations made are valid and the results calculated are of relevance.
The Local Fraction $LF_{s,\Delta x_s,\Delta y_s}(x, y, z, t)$ is a 7-dimensional array, and in the following sections we will try to briefly illustrate the information that can be provided by this array.
The results shown in this section are based on a grid with a resolution of 0.3° in the longitude direction and 0.2° in the latitude direction.

Illustration of Source Receptor capabilities
For fixed values of $x$, $y$, $z$ and $t$, the Local Fractions $LF_{s,\Delta x_s,\Delta y_s}(x, y, z, t)$ give the contributions of a pollutant $s$ emitted at $(x + \Delta x_s, y + \Delta y_s)$ to the position $(x, y)$, i.e. a two-dimensional map of the origin of the pollutants found at position $(x, y)$.
The array $LF_{s,\Delta x_s,\Delta y_s}(x, y, z, t)$ thus provides a complete description of all source receptor relationships within a given distance from the receptor grid cell. In order to compare with the direct method, one can "invert" $LP_{s,\Delta x_s,\Delta y_s}(x, y, z, t)$ to get a map of the receptors for a fixed source:

$$LP^{\dagger}_{s,\Delta x_s,\Delta y_s}(x, y, z, t) = LP_{s,-\Delta x_s,-\Delta y_s}(x + \Delta x_s, y + \Delta y_s, z, t) \qquad (13)$$

$LP^{\dagger}_{s,\Delta x_s,\Delta y_s}(x, y, z, t)$ then gives the contributions of a pollutant $s$ located at $(x, y)$ to the position $(x + \Delta x_s, y + \Delta y_s)$.
Figure 4 illustrates a comparison of the results obtained
1) by removing the emissions from a single grid cell and computing the difference with the normal case (direct method);
2) by using one single run and Eq. (13) with a local region of size 41×41×8.
Within the local region the results are similar, but the Local Fraction method gives such a map for any grid cell in one single run, while the direct method would require a separate run for each source region.
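The inversion of Eq. (13), from source maps stored per receptor to a receptor map for one fixed source, is a pure re-indexing and can be sketched as follows. The array layout `lp[j, i, y, x]` (offset indices first, a single level and time) and the function name `receptor_map` are assumptions made for this example.

```python
import numpy as np

def receptor_map(lp, x_src, y_src):
    """Receptor map for a fixed source cell (x_src, y_src).

    lp[j, i, y, x] is the Local Pollutant found at receptor (x, y) that was
    emitted at (x + i - dmax, y + j - dmax). The returned array has element
    [dy + dmax, dx + dmax] equal to the contribution of the source cell to
    the receptor at (x_src + dx, y_src + dy); receptors outside the grid
    are left as NaN.
    """
    n = lp.shape[0]
    dmax = (n - 1) // 2
    ny, nx = lp.shape[2:]
    out = np.full((n, n), np.nan)
    for dy in range(-dmax, dmax + 1):
        for dx in range(-dmax, dmax + 1):
            xr, yr = x_src + dx, y_src + dy
            if 0 <= xr < nx and 0 <= yr < ny:
                # Seen from the receptor, the source sits at (-dx, -dy)
                out[dy + dmax, dx + dmax] = lp[-dy + dmax, -dx + dmax, yr, xr]
    return out
```

Since no transport is recomputed, such a map can be extracted for any source cell from the output of the same single run.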
Note that for the purposes of this experiment we have chosen a zero order advection scheme in all model runs. The default fourth order scheme is slightly non-local, and the direct method would give spurious results very close to the sources. This is however not a problem for the LF method, and for short distances it is actually an advantage compared to the direct method.

Vertical transport
For source apportionment applications, the focus is typically on horizontal transport. Nevertheless the code should trace the pollutants with a combination of vertical and horizontal transport. Over short distances only transport through the lowest layers needs to be considered. If the focus is on regions where a large part of the pollutants are transported over long distances, the vertical extent of the local region should be chosen large enough.
Figure 5 illustrates the dependence of the Local Fraction on the thickness of the local region. In this example, only a few vertical levels are required to describe the Local Fraction within the grid cell (the remaining discrepancy comes from pollutants first leaving the grid cell, and then returning later). For a distance of up to 14 grid cells, when 8 vertical layers are included in the local region, the results are not distinguishable from the exact value calculated by the direct method. Obviously, emissions or vertical mixing at higher altitudes would require including the corresponding vertical layers.
For NOx, even for relatively small distances, there is a discrepancy between the contribution calculated with the Local Fraction method and the direct method (Fig. 6). This is because the Local Fraction method does not explicitly distinguish between NO and NO2. The NO/NO2 mix modelled for remotely emitted NOx may differ from the local values. Since reaction rates are different for NO and NO2, the local NOx transformation rate is not representative of the reaction rates of the incoming "older" NOx. At distances larger than a few grid cells, a discrepancy can be observed between the contribution calculated with the Local Fraction method and the direct method.

Completeness
For local regions that are large enough, the source of all primary particles can be accounted for. This can be verified directly by summing all the Local Fractions for a given grid cell:

$$\sum_{\Delta x_s,\Delta y_s} LF_{s,\Delta x_s,\Delta y_s}(x, y) \qquad (14)$$

A sum of one means that all sources are accounted for. The difference between the sum of the Local Fractions and one gives the fraction of pollutants with sources outside of the local region. In Fig. 7 the sum of the Local Fractions is shown for every grid cell on the map for different horizontal sizes of the local region. For most land areas, more than 80% of the sources are found for the smallest window (41×41) and essentially all sources for the largest (161×161). Note that incomplete results are not a measure of an error in the method. Rather, they show the amount of pollutants with sources outside of the local region, which is useful information.
The calculation of the Local Fractions only needs information from the nearest neighbors, see Eq. (9), and is therefore well suited for parallel processing in a space partition framework. While storing all the Local Fractions is memory demanding, the data are distributed among the compute nodes, so that the memory requirement can be met by increasing the number of nodes.
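The completeness check of Eq. (14) is a sum over the source-offset dimensions. The sketch below assumes a hypothetical layout `lf[j, i, y, x]` for one source category and one level; the function names are illustrative only.

```python
import numpy as np

def completeness(lf):
    """Sum of the Local Fractions over all source offsets.

    lf[j, i, y, x] holds the fraction at receptor (x, y) from the source at
    offset (i - dmax, j - dmax). Returns an array (ny, nx); a value of 1.0
    means that all sources lie inside the local region.
    """
    return lf.sum(axis=(0, 1))

def outside_fraction(lf):
    """Fraction of the pollutant with sources outside the local region."""
    return 1.0 - completeness(lf)
```

For a receptor whose Local Fractions sum to 0.976, `outside_fraction` returns 0.024, i.e. 2.4 % of the pollutant originates outside of the local region.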
In order to illustrate the computational cost, we can consider a typical model run on a 400×260×20 grid (0.3 degrees longitude × 0.2 degrees latitude resolution), over one month (March 2016) on 160 processors, which takes 447 seconds without the Local Fraction calculations. Table 1 shows the additional computational cost for computing the Local Fractions. The mathematical operations required to compute the Local Fractions are proportional to the number of sources considered and the size of the local region (in our implementation the additional time required for those operations grows faster, probably because of suboptimal utilization of cache memory). If one is only interested in the nearby sources (within a city, for example), the Local Fractions can be calculated at almost no additional cost. Remote sources can still be described, but at an additional cost.

Discussion
Local Fractions are a new concept that can help understand and analyse the origin of pollutants. The concept has the potential to be developed further, and a new range of applications is still being developed.

Source apportionment
Source receptor relationships can be produced for any source and receptor within a region around the source. The size of this region can be chosen to be relatively large (100 grid cells or more). Since the fluxes are given from and to individual grid cells, small regions (typically cities) can be studied simply by adding up individual grid cell contributions. These small regions do not have to be predefined in the model simulations. Indeed, the relative contributions of sources that contribute to the pollutants within a city covering several grid cells can be determined in a post-processing step, using graphical user interfaces where the user can choose the source region and source categories interactively.
Still, the method provides information about transport within a limited region only (the 'local region'). The choice of the size of this region is a balance between the computational cost and the distance to the sources of interest. For the study of a city, it may be sufficient to include a region covering the agglomeration. The total pollutants from sources outside of the local region are still quantified but without specification of their location, using the method presented in Sect. 3.4.
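The post-processing step of adding up individual grid cell contributions for a city can be sketched as follows. This is only an illustration under assumed conventions: a layout `lf[j, i, y, x]` for one source category and one level, boolean masks selecting the receptor and source cells, and a hypothetical function name `city_contribution`.

```python
import numpy as np

def city_contribution(lf, tp, receptor_mask, source_mask):
    """Share of the pollutant in the receptor cells that was emitted in the
    source cells, obtained by summing single grid cell contributions.

    lf            : array (n, n, ny, nx), Local Fractions per source offset
    tp            : array (ny, nx), Total Pollutant
    receptor_mask : bool array (ny, nx), cells forming the receptor region
    source_mask   : bool array (ny, nx), cells forming the source region
    """
    n = lf.shape[0]
    dmax = (n - 1) // 2
    ny, nx = tp.shape
    total = 0.0
    local = 0.0
    for y, x in zip(*np.nonzero(receptor_mask)):
        total += tp[y, x]
        for j in range(n):
            for i in range(n):
                ys, xs = y + j - dmax, x + i - dmax
                inside = 0 <= ys < ny and 0 <= xs < nx
                if inside and source_mask[ys, xs]:
                    # Local Pollutant = Local Fraction * Total Pollutant
                    local += lf[j, i, y, x] * tp[y, x]
    return local / total if total > 0.0 else 0.0
```

Because the masks are only applied after the simulation, the same model output can be reused for any choice of city outline or source region that fits within the local region.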

Downscaling
One obstacle to combining fine scale (urban) and regional modelling is the problem of "double counting". In the regional scale model, there is usually only one total concentration value, without distinction between its origins. Ideally, the regional model should only compute the background/regional contributions and the fine scale model can then add the local contribution.
The Local Fractions can give the relative contributions from different sources directly. Thus, it is possible to either redistribute or replace only the appropriate local contributions using the more accurate fine scale model. This avoids several of the problems associated with quantifying sources of different origins (Thunis, 2018).
An example of an operational downscaling tool is "uEMEP" (= urban EMEP), which combines the method described in this paper with the EMEP MSC-W air quality model (Simpson et al., 2012) to provide daily air quality forecasts for all of Norway (https://luftkvalitet.miljostatus.no/, Denby et al., 2020).

Improved modelling
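Concentrations of pollutants near the surface are required to assess health impacts or dry deposition. However, in many CTMs, the lowest layer is several tens of meters thick, and the concentrations of pollutants will have a non-constant vertical profile within the layer. The shape of the profile will depend on the local conditions: if the pollutants are emitted locally at the surface the concentration will typically decrease with height, while the opposite is true for background pollutants. With the knowledge of the Local Fractions it is possible to improve the description of the vertical profile, and thus obtain a more accurate estimation of, for instance, 3 meter concentrations (useful for health impact studies) or dry deposition rates. As shown in Figs. 1 and 2, the Local Fractions vary strongly in space and time. If this information can be used to give better estimations of vertical profiles of pollutants it should have a significant effect on the results.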

Future work
In this work, sources are always defined in an individual grid cell. The relative position of the source, $(\Delta x_s, \Delta y_s)$, could be replaced by a generic index that would point to more general groups of grid cells or regions. The formalism would be the same, except that emissions from any grid cell from the relevant region should be added together in the Local Fraction. This would make it possible, for instance, to distinguish individually all grid cells in the immediate vicinity of the receptor grid cell, and successively larger regions as the distance increases. Another application could be to define countries as emitter regions.
In the future we plan to generalize the method to include chemical processes. The Local Fractions could then give information about sensitivities to changes in emissions without necessarily summing up to one hundred percent.

Time and space dependence ($\Delta x_s = \Delta y_s = 0$)
In Fig. 1, an illustration of the time evolution of the instantaneous Local Fraction for fine particulate matter (PM2.5) at an arbitrary location (in the Oslo agglomeration) is shown. The value gives the fraction of PM2.5 which has its origin in the same grid cell. It is strongly correlated with the concentrations of PM2.5, but it does not always vary exactly in the same way. It will also depend on the wind speed, emission rates and the surrounding levels of pollution. If a relatively large amount of clean air is moving into that area, the total concentration will decrease, but the Local Fraction will remain high. High Local Fractions indicate that most of the pollutant is locally produced.

Figure 2 shows a map of monthly-mean Local Fractions for March 2016. It gives a picture of how much the sources in a particular grid cell contribute compared to the surrounding sources. The distribution is similar to the emission distribution, but isolated emission sources show up more clearly in the Local Fractions map.

Figure 3 shows such a map for an arbitrary location. It is simply the value of $LF_{s,\Delta x_s,\Delta y_s}(x, y, z, t)$ averaged over one month, where $x$ and $y$ are the position of the central point (receptor). Such a map is calculated for any point on the grid in a single simulation. In this example the local region has a horizontal extent of 41 times 41 grid cells. Direct methods would then, in principle, require 41 · 41 + 1 = 1682 simulations to calculate the values of one of those maps.

Figure 1. Time evolution of the Local Fraction of PM2.5 in the Oslo agglomeration (left axis) during the period 5th to 9th January 2016 (longitude=11.55°, latitude=59.9°). The total concentration of PM is also shown (right axis).

Figure 2. Example of spatial distribution of the Local Fraction of PM2.5, averaged over one month (March 2016, left panel). The total emissions of PM2.5 averaged over that period are shown in the right panel.

Figure 3. Example of Local Fractions as a source map. The values show the fraction of PM2.5 that has been emitted at that location and transported to the central point. The sum of all the fractions is in this case 0.976, meaning that 2.4 % of the PM2.5 concentration at the central position originates from sources outside of the local region.

Figure 4. Receptor map for a single grid cell emission, obtained through the direct method (left panels) and the Local Fraction method (right panels), averaged over one month (March 2016). Concentrations of PM2.5, NOx, SOx and NH3 (in µg m⁻³). The direct method requires a separate run for each source location. The Local Fraction method gives the receptor map in one single run, in a limited region around the source, but for any source grid cell.

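Figure 5. Sensitivity of the concentration of PM2.5 (µg m⁻³) to the number of vertical levels included in the local region, for different distances from the source. The distance from the source is given in numbers of grid cells (one grid cell = 0.3 degrees in longitude direction). Source in Oslo agglomeration. Horizontal axis is time (120 hours). The 8 levels results cannot be distinguished from the direct method on the figure, even for the largest distance considered.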

Figure 6. Concentration of NOx (µg m⁻³). Sensitivity to the number of vertical levels included in the local region. Distance to source in number of grid cells (one grid cell = 0.3 degrees in longitude direction). Source in Oslo agglomeration. Horizontal axis is time (120 hours).

Figure 8 shows the results for different vertical extents of the local region. The Local Fractions get close to complete in most places when 8 vertical levels are included (approximately 1522 meters height). As one would expect, this roughly corresponds [...]

Figure 7. Sum of all Local Fractions (Eq. (14)) for PM2.5 and different sizes of the local region (average for March 2016). The distance is counted as number of grid cells in each direction. All vertical layers (20) are included. A sum of 1.0 means that all the sources have been accounted for.

Figure 8. Sum of all Local Fractions (Eq. (14)) for PM2.5 and for different vertical extents of the local region (average for March 2016). A horizontal region of 161×161 is included in the local region. For a standard atmosphere, the heights of the top of layers 6, 7, 8 and 9 are, respectively, 623, 1015, 1522 and 2149 meters. A sum of 1.0 means that all the sources have been accounted for.

Computational aspects
Since one of the key advantages of the Local Fraction method is its low computational demand, we will give a few concrete examples of the computational cost for providing the Local Fraction values in our implementation. The transformations carried out for the calculation of Local Fractions presented in Sect. 2 are all relatively simple. The most computationally intensive parts of the model (calculation of fluxes, chemical transformations, deposition processes) are not explicitly performed for every Local Pollutant, but only once for the total concentrations. This also means that there are no fundamental changes in the model code involved. The updates of the Local Fractions can be added on top of an existing model in separate routines. For processes where local pollutants are transformed by the same relative amount as non-local pollutants (deposition and chemistry in our implementation), there is no need to update the Local Fractions; this is the main motivation for storing the Local Fractions rather than the Local Pollutants.
A substantial amount of time can be required for writing results to disk, especially if all results are written out every hour. This is mainly due to the large amount of data collected; for instance, for a local region of size 21×21×1 and 14 sectors or species, 400 · 260 · 21 · 21 · 14 values (about 5 GB of data) have to be written to disk each time output is requested.
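The output volume quoted above can be reproduced with a short calculation. The helper `local_fraction_output_size` is hypothetical, and the default of 8 bytes per value (double precision) is an assumption made here because it is consistent with the quoted 5 GB; the storage format is not stated in the text.

```python
def local_fraction_output_size(nx, ny, n_dx, n_dy, n_z, n_sources,
                               bytes_per_value=8):
    """Number of Local Fraction values and bytes for one output record.

    nx, ny          : horizontal grid dimensions
    n_dx, n_dy, n_z : size of the local region
    n_sources       : number of sectors or species tracked
    bytes_per_value : assumed storage size per value (8 = double precision)
    """
    n_values = nx * ny * n_dx * n_dy * n_z * n_sources
    return n_values, n_values * bytes_per_value

n_values, n_bytes = local_fraction_output_size(400, 260, 21, 21, 1, 14)
print(n_values, round(n_bytes / 1e9, 1))  # 642096000 5.1
```

The same function shows how quickly the volume grows with the local region: the size scales with the product of the horizontal window dimensions, the number of levels and the number of tracked sources.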

Table 1. Additional computation time needed for the calculation of Local Fractions in different settings, expressed as a fraction of the total time needed when the calculation of Local Fractions is not included. The first column shows the horizontal size of the local region, while the second column shows the number of vertical levels through which pollutants are traced. The total time without calculation of Local Fractions in our tests was 447 seconds.