Articles | Volume 18, issue 10
https://doi.org/10.5194/gmd-18-2983-2025
https://doi.org/10.5194/gmd-18-2983-2025
Development and technical paper
 | 
21 May 2025
Development and technical paper |  | 21 May 2025

Accounting for effects of coagulation and model uncertainties in particle number concentration estimates based on measurements from sampling lines – a Bayesian inversion approach with SLIC v1.0

Matti Niskanen, Aku Seppänen, Henri Oikarinen, Miska Olin, Panu Karjalainen, Santtu Mikkonen, and Kari Lehtinen
Abstract

The particle number (PN) emissions of both light- and heavy-duty vehicles are nowadays regulated and are typically measured from a full dilution tunnel with constant volume sampling (CVS). PN measurements for research and development purposes, though, are often taken from the raw exhaust to avoid the high setup costs of CVS. There is, however, a risk with these and any other kind of PN measurements with high number concentrations, which is that physical processes such as coagulation and diffusion losses inside sampling lines can alter, sometimes dramatically, the particle size distribution and bias its measurement. In this paper, we propose a method in the Bayesian framework for inverse problems to estimate the initial, unaltered particle size distribution based on the distorted measurements. The proposed method takes into account particle morphology and van der Waals and viscous forces in the coagulation model and allows the incorporation of prior information on the particle size distribution and, most importantly, a systematic quantification of uncertainty. We analyze raw exhaust PN measurements of a fuel-operated auxiliary heater and find that while a typical sampling line can reduce the PN by more than 50 %, the initial particle size distribution can be feasibly estimated with reasonable computational demands. The proposed method should give more freedom for designing the measurement setup and also aid in the comparison of results obtained at different sampling locations, such as CVS and tailpipe.

Share
1 Introduction

Particulate matter (PM), i.e., particle pollution, poses great health risks to humans and causes environmental damage (Shiraiwa et al.2017; Manisalidis et al.2020). Significant PM sources, such as vehicle tailpipe emissions, are therefore in most cases subject to particle mass regulations and nowadays also to particle number (PN) regulations in Europe (Commission Regulation (EU)2019; Giechaskiel et al.2021b). PN regulations were introduced to better cover the fraction of ultrafine (<0.1µm) particles, which is negligible in terms of mass but dominates the ambient atmosphere in terms of PN (80 %–90 % of all particles) (Hofman et al.2016). These ultrafine particles are a cause of growing concern in the public health community due to their ability to circumvent primary airway defenses and penetrate deep in the lungs (Kwon et al.2020; Morawska and Buonanno2021). Furthermore, the harmful systematic health effects associated with coarse (<10µm) and fine (<2.5µm) particulates can often be attributed to the fraction of ultrafine particles (Kwon et al.2020). At the moment PN regulations are given for particles with a diameter >23nm, not because smaller particles are any less dangerous, but mainly because smaller particles are very sensitive to sampling conditions and thus hard to measure accurately and with good repeatability (Giechaskiel et al.2021b). The Euro 7 road vehicles emission standard proposal increases the size range to particles >10nm (Giechaskiel et al.2024). Details of the emissions such as those emitted from tailpipes are also important basic knowledge for the purpose of climate change research, as aerosol forcing uncertainty represents the largest climate forcing uncertainty overall (Kahn et al.2023).

PN emissions are typically measured using condensation particle counters (e.g., Hering et al.2005) or diffusion chargers (e.g., Schriefl et al.2019), which can measure total PNs. If more detailed information is required, one can measure PN as a function of particle size, i.e., a particle size distribution (PSD), with an instrument such as the engine exhaust particle sizer (EEPS) spectrometer (TSI2024; Wang et al.2016a). There are, however, aspects of PN measurement that may bias the results considerably unless explicitly accounted for in the design of the experiment (which may not always be possible) or numerically post-measurement. For example, measuring PSD in raw exhaust usually involves the use of sampling lines of different lengths to transfer and possibly cool the sample to the measurement devices. Within the sampling lines, the particles have a window of time to coagulate and diffuse onto the inner walls of the lines, which alters the PSD of the raw exhaust. The shortest possible residence times in the lines are typically targeted because residence times that are short enough (and PN concentrations low enough) limit the effects of diffusion and coagulation, leaving the PSD mostly unaltered. This may not always be practicable, however, and sometimes the PSD undergoes significant changes, up to more than an order of magnitude, especially in the concentrations of the smallest particle sizes. In one study, Giechaskiel et al. (2019) examined the measurement of PN directly from the tailpipe of heavy-duty engines and found that an increase from 0.5 to 4 m in the sampling line length resulted in a loss of 20 %–50 % of particles >10nm, not explainable with just diffusion losses. Further, Liu et al. (2021) studied the effect of Brownian coagulation on particle evolution in gasoline engine exhaust by measuring the PSD at various locations along a laboratory exhaust system. They found that the PSD evolution mainly occurs for sizes <50nm, that the evolution is dominated by coagulation, and that a significant (up to 90 %) particle reduction can occur when there is a high number of accumulation mode (>50nm) particles. Such high losses would render a measurement useless unless the changes in the PSD are rigorously quantified and corrected for.

Of the physical processes in a sampling line, wall deposition losses are nowadays commonly corrected, but the effect of coagulation is typically not. Further, because the effect of coagulation is nonlinear and depends on the PN concentration, it cannot be corrected with a simple calibration measurement. To ensure coagulation and diffusion do not have a significant effect on the PSD, legislation specifies for example the maximum residence time and Reynolds number for a sample in the sampling line (Giechaskiel et al.2021b), but we have not found in the literature attempts to correct the estimate of a PSD after the measurement has been made. In this paper, we aim to find out if the initial PSD can be estimated based on measurements that have been distorted by coagulation and diffusional losses. Our approach uses methods in the Bayesian framework for inverse problems (Kaipio and Somersalo2006) and makes use of a numerical model of the coagulation and wall diffusion process that takes into account the fractal nature of soot particles (Park et al.2003; Rogak and Flagan1992) as well as van der Waals and viscous forces (Alam1987). Due to uncertainties related to the measurement (e.g., noise) and to the coagulation model (e.g., poorly known parameters), the resulting estimate is also uncertain to some degree. In the Bayesian framework, this uncertainty is quantified by the posterior probability density, i.e., the probability density of the unknown variable of interest, conditioned on the measurement. The possible prior information on the model unknown and statistics of the measurement noise are also accounted for. From the posterior one can calculate not only the most probable values of the initial PSD, but also the credible intervals to explicitly quantify the uncertainty of the PSD estimate. Robust uncertainty quantification is essential to assess the reproducibility of measurements between different laboratories (Giechaskiel et al.2018, 2021a).

To test the proposed method, we carry out inversion with both synthetic data and real PSD measurements of a fuel-operated auxiliary heater (Oikarinen et al.2022). We will compute the estimate of the initial PSD in two ways that are useful in different circumstances. The first one is based on solving an optimization problem, which is computationally fast and can be done in real time when collecting measurements in the field. In this case, the posterior uncertainty is approximated with a Gaussian distribution. The second approach is based on sampling the posterior with Markov chain Monte Carlo (MCMC), which is a lot more demanding computationally but does not need to rely on Gaussian approximations for the uncertainty estimates. The second approach can also be used to take into account the possible uncertainty in the parameters of the coagulation model, as will be shown later. We will also compare and discuss the results between the two approaches. The code and data used to carry out the examples in this paper are available in the package SLIC (Niskanen2024).

The remainder of the paper is organized as follows. In Sect. 2, we describe our approach to modeling coagulation and wall deposition. In Sect. 3, we describe and solve the inverse problem of estimating the initial PSD based on measurements done with a system that has a sampling line. In Sect. 4, we apply the method on synthetic and real data, and the results are discussed in Sect. 5. Finally, conclusions are given in Sect. 6.

2 Simulating the microphysical processes in a sampling line

When collecting PSD data from sampling lines, there are two main processes quick enough to affect aerosol particles during the short time (up to a few seconds) a sample resides in a sampling line: coagulation (agglomeration of particles) and wall deposition due to diffusion (particles lost to the sampling line walls). Together these processes reduce the number of particles and increase their size.

The time evolution of a PN concentration density n(v,t) (unit: m-1cm-3) experiencing coagulation and wall diffusion can be described by the following equation (Seinfeld and Pandis2016):

dn(v,t)dt=120vβ(v-v¯,v¯)n(v-v¯,t)n(v¯,t)dv¯(1)-n(v,t)0β(v¯,v)n(v¯,t)dv¯-R(v)n(v,t),

where the first integral represents a coagulation source, the second a coagulation sink, and R(v) in this case the rate of removal of particles by wall diffusion. Particle volume and time are denoted by v and t, respectively. The rate of coagulation is governed by the coagulation coefficient β(v1,v2), which describes the frequency that two particles of volume v1 and v2 collide.

To approximate the solution of Eq. (1) numerically, the particle size range is first discretized. We adopt here the sectional method (Gelbard et al.1980; Jacobson et al.1994; Lehtinen and Zachariah2001) for its computational efficiency and ease of implementation, where the particle size range is divided into a finite number of sections or bins. Here we use geometrically distributed bins so that the logarithm of the bin width is constant. The sectional method conserves total particle volume, and in Salminen et al. (2022) it was even found that for pure coagulation the sectional method can be more accurate than the finite element method with a similar number of discretization points.

In a typical implementation of the sectional method, particles within each bin are approximated to have a constant size equal to the bin middle point. Consider then two particles from bins with volumes vi and vj colliding and forming a larger particle with volume vi+vj. Unless the volume of the new particle coincides exactly with a center of a bin, the particle is divided into two adjacent bins in a way that conserves the total volume. For this purpose, let us define a size-splitting operator ξijk, which gives the volume fraction of the coagulated particle vi+vj partitioned into bin k:

(2) ξ i j k = v k + 1 - ( v i + v j ) v k + 1 - v k , v k v i + v j < v k + 1 ; k < n B , 1 - ξ i j ( k - 1 ) , v k - 1 < v i + v j < v k ; k > 1 , 1 , v k v i + v j ; k = n B , 0 , otherwise ,

where nB is the total number of bins. Let us now denote the PN concentration in the kth bin by Nk=nkΔDp, where ΔDp is the width and nk the PN concentration density (cf. n(v,t) in Eq. 1) of bin k and k=1,,nB. The discretized version of Eq. (1) then reads

(3) d N k d t = 1 2 i , j ξ i j k β i j N i N j - i β i k N i N k - R k N k ,

where βij is the coagulation coefficient between size bins i and j. Time integration for Eq. (3) can be carried out for example with Runge–Kutta methods. We use here RK45 as implemented in SciPy (Virtanen et al.2020).

2.1 Coagulation model

Particles collide due to their Brownian, or thermal, motion, as well as motion due to external forces such as laminar shear or turbulent flow and gravitational, van der Waals, Coulomb, and hydrodynamic forces (Seinfeld and Pandis2016). Colliding liquid particles coalesce and retain their spherical shape. Solid particles above a certain critical size r1, however, are not expected to coalesce but instead form fractal-like agglomerates of dense primary particles. The collision frequency for fractal-like particles can be significantly higher than for spherical particles with the same volume (Rogak and Flagan1992).

In this work, for coagulation we consider the particles' Brownian motion and the effects of van der Waals and viscous forces, as those are likely to dominate over other factors in coagulation at short timescales and distances. Further, in our specific example with soot particles, we will take into account the particle shape because soot particles are known to form long chains of aerosol agglomerates (Park et al.2003). Figure 1 demonstrates the effect of fractal geometry and van der Waals and viscous forces on the coagulation coefficient. Although other aspects of coagulation such as the effect of charge or turbulent flow are not modeled in this work, they can be included in the considered framework should the need arise. For examples that have dealt with the Coulombic effects see Mahfouz and Donahue (2021a), Mahfouz and Donahue (2021b), Pfeifer et al. (2023), and Thomas et al. (2024), and for examples on the effect of turbulence see Okuyama et al. (1978), Williams and Crane (1983), and Zhao et al. (2021).

https://gmd.copernicus.org/articles/18/2983/2025/gmd-18-2983-2025-f01

Figure 1Coagulation coefficients calculated for mobility diameters 5.62–562 nm. (a) Spherical vs. fractal coagulation coefficient (fractal dimension =1.7 and primary particle diameter =30nm). (b) Spherical coagulation coefficient with and without van der Waals and viscous forces (Hamaker constant =2×10-19J).

Download

2.1.1 Brownian coagulation of fractal-like agglomerates

In this paper, the coagulation of fractal-like agglomerates is modeled following mostly Rogak and Flagan (1992) and Jacobson and Seinfeld (2004), where it is assumed that the agglomerate structure is completely determined by three parameters: the fractal dimension Df, the primary particle radius r1, and the number of primary particles 𝒩. All primary particles are thus assumed to be of the same size. The fractal dimension, Df[1,3], characterizes the shape of the agglomerate. For example, when Df=1, the agglomerate forms a line, whereas with Df=2, it fills space like a surface, and as Df approaches 3, it curls up to resemble a spherical volume. For soot particles, typical values for the primary particle radius are between 10 and 17 nm and the fractal dimension between 1.5 and 2.2 (Wentzel et al.2003; Lapuerta et al.2017).

The three parameters above give rise to a number of equivalent radii that characterize the coagulation of non-spherical particles. Let us first highlight two here: the volume equivalent radius and the mobility radius. The volume equivalent radius, rv, is the radius of a sphere with the same volume as the aggregate, whereas the mobility radius, rm, is the radius of a sphere that experiences the same drag force as the agglomerate under the same dynamic conditions (Rogak and Flagan1992). This distinction is important, because while mobility radii are what is usually measured, the size-splitting coefficient Eq. (2) should be calculated using the volume equivalent radii to conserve particle volume. Thus, for non-spherical particles, the output bin sizes of a typical measurement device should not be used directly for modeling coagulation.

Let us first state 𝒩i as the ratio of the volumes of the aggregate and the primary particle, Ni=(rv,i/r1)3, where i refers to the index of a size bin. The fractal or outer radius of the agglomerate is then defined as

(4) r f , i = r 1 N i 1 / D f , r v , i r 1 , r v , i , r v , i < r 1 .

The Brownian coagulation rate between two particles of sizes i and j can be stated in the Fuchs form, modified for fractal geometry, as

(5) β i , j = 4 π ( r c , i + r c , j ) ( D m , i + D m , j ) C i , j - 1 ,

where rc,i and Dm,i are the collision radius and Brownian diffusion coefficient of particle i, respectively, and Ci,j is a transition-regime correction factor that extends the validity of the formula from the continuum regime to sizes smaller or comparable to the mean free path of diffusion particles. Following Rogak and Flagan (1992), we set the collision radius equal to the outer radius of the particle, rc,i=rf,i.

The expressions for Dm,i and Ci,j require the (transition-regime) mobility radius rm,i, which is computed by interpolating between the continuum-regime mobility radius rmc,i and free-molecular-regime mobility radius rmk,i, as

(6) r m , i C c ( Kn m , i ) = r mc , i C c ( Kn a , i ) ,

where Cc(Kn) is the Cunningham slip correction factor, given by

(7) C c ( Kn ) = 1 + Kn ( 1.257 + 0.4 exp ( - 1.1 / Kn ) ) ,

with Kn representing the Knudsen number. The mobility radius Knudsen number is given by

(8) Kn m , i = λ a r m , i

and the adjusted Knudsen number by

(9) Kn a , i = λ a r mc , i r mk , i 2 .

Above, λa denotes the mean free path of an air molecule, which can be calculated from

(10) λ a = k B T 2 π d 2 p ,

where kB is the Boltzmann constant, T is the gas temperature, d is the diameter of an air molecule (approximated here as the diameter of a nitrogen molecule), and p is the gas pressure. The free-molecular-regime mobility radius, rmk,i, is assumed to be equal to the projected-area equivalent radius rA,i, given by

(11) r A , i = r 1 max { N i 2 / 3 , min [ 1 + 0.67 ( N i - 1 ) , D f N i 2 / D f / 3 ] } , r v , i r 1 , r v , i , r v , i < r 1 ,

and the continuum-regime mobility radius is given by

(12) r mc , i = max r f , i ln 2 r f , i r 1 + 1 - 1 , r f , i D f - 1 2 0.7 , r A , i .

Now, the diffusion coefficient can be calculated from

(13) D m , i = k B T 6 π η r m , i C c ( Kn m , i ) ,

where η represents the dynamic viscosity of gas, which is temperature-dependent and over the range of 100–1800 K can be determined using the Sutherland equation (Hinds1999):

(14) η = 1.458 × 10 - 6 Pa s K T K 1.5 T + 110.4 K .

The correction factor Ci,j is given by

(15) C i , j = r c , i + r c , j r c , i + r c , j + δ m , i 2 + δ m , j 2 + c ̃ ,

where

(16) δ m , i = ( 2 r m , i + λ m , i ) 3 - ( 4 r m , i 2 + λ m , i 2 ) 3 / 2 6 r m , i λ m , i - 2 r m , i

and

(17) c ̃ = 4 ( D m , i + D m , j ) v ¯ i 2 + v ¯ j 2 ( r c , i + r c , j ) .

The effective mean free path of a particle size i is given by

(18) λ m , i = 8 D m , i π v ¯ i ,

with the mean thermal speed

(19) v ¯ i = 8 k B T π M ¯ i ,

where M¯i=4/3πrv,i3ρ is the mass of a single particle, with ρ denoting the particle density.

The above model was written in terms of the volume equivalent radius rv and parameters Df and r1 to find the mobility radius rm; i.e., it forms a relation f:rvrm. On the other hand, measurements of PN concentration are usually modeled with respect to rm, and hence, a mapping from rm to rv is needed. An explicit solution f-1:rmrv is not possible, though, and we calculate rv by solving a minimization problem:

(20) r v , i = arg min r v , i r m , i - f ( r v , i , r 1 , D f ) 2 2 .

2.1.2 Van der Waals and viscous forces

Van der Waals forces are weak fluctuating dipole–dipole forces that increase the rate of aerosol coagulation (Alam1987). Viscous forces, on the other hand, slow down the rate of coagulation. Viscous forces are absent in the free molecular regime but in the continuum regime can even slow down coagulation more than van der Waals forces enhance it (Jacobson and Seinfeld2004). These forces can be taken into account by multiplying the Brownian coagulation coefficient Eq. (5) by a correction factor Vi,j. We follow here Alam (1987), who determined an interpolation formula for the correction factor between the free-molecular and continuum regimes as

(21) V i , j = W c , i , j 1 + c ̃ 1 + W c , i , j / W k , i , j c ̃ ,

where c̃ is defined in Eq. (17) and Wk,i,j and Wc,i,j are correction factors for the free-molecular and continuum regimes, respectively. These are given by

Wk,i,j=-12(ri+rj)2kBT×ri+rjdEi,j(r)dr+rd2Ei,j(r)dr2(22)×exp-1kBTr2dEi,j(r)dr+Ei,j(r)r2dr

and

Wc,i,j=[(ri+rj)(23)×ri+rjDDi,j(r)expEi,j(r)kBTdrr2]-1.

The correction factors include the Van der Waals interaction potential

Ei,j(r)=-A6[2rirjr2-(ri+rj)2+2rirjr2-(ri-rj)2(24)+lnr2-(ri+rj)2r2-(ri-rj)2],

where A is the Hamaker constant, and the diffusion ratio that corrects the diffusion coefficient in the continuum regime for viscous forces

DDi,j(r)=1+2.6rirj(ri+rj)2rirj(ri+rj)(r-ri-rj)(25)+rirj(ri+rj)(r-ri-rj).

Typical values for the Hamaker constant of soot are in the range of 1×10-194×10-19J (Liu et al.2018).

2.2 Wall deposition by diffusion

Aerosol particles adhere when they collide with a surface, and thus their concentration at the surface is zero. This results in a concentration gradient near the surface which causes a continuous diffusion of particles to the surface. In addition to diffusion, other basic mechanisms of wall deposition include electrostatic attraction, i.e., the effect of charges on the walls (Mahfouz and Donahue2020); inertial impaction; interception; and gravitational settling (Hinds1999). Also surface properties such as roughness can influence deposition (Shimada et al.1987; Hussein et al.2012). In this paper, we only model wall deposition by diffusion, but other mechanisms can naturally be included in the model as needed.

For laminar flow of particles through a cylindrical tube, the fraction P=nout/nin of entering particles that exit as a result of diffusional losses can be approximated with the help of a dimensionless diffusion parameter μ as (Hinds1999)

(26) P = 1 - 5.50 μ 2 / 3 + 3.77 μ , μ < 0.009 , 0.819 exp ( - 11.5 μ ) + 0.0957 exp ( - 70.1 μ ) , μ 0.009 .

The diffusion parameter is defined as

(27) μ = D L Q ,

where L is the tube length and Q is the volume flow rate (m3 s−1) through the tube. For the particle diffusion coefficient D we use Dm from Eq. (13). To transform the penetration P into the removal term R in Eq. (3), we set

(28) R = - log ( P ) / T ,

where T is the residence time of the aerosol in the tube.

3 Estimating the initial size distribution

Let us now formulate the problem of estimating the initial PSD as a problem of statistical inference within the Bayesian framework. The unknown parameters are modeled as random variables, and all information on them is expressed in terms of probability density functions. The solution to the inference problem is the posterior probability density, a conditional probability density of the unknown parameters conditioned on the measured data (Kaipio and Somersalo2006).

Let us denote by ÑRnB a vector of length nB that represents a parametrization of the PSD of the aerosol at the entry of the sampling line (i.e., the initial PSD). This will be the model unknown in the Bayesian inference. In the parametrization used below, Ñ consists of the logarithms of the size-discretized PSD N,Ñ:=logN, to enforce positivity and to treat the wide range of number concentrations in a numerically more stable way. In what follows, ̃ will be used to denote a variable on the log scale. Further, let yRnM be a vector of length nM consisting of noisy, indirect observations of the PSD after the particles have undergone coagulation and diffusion processes when transported through the sampling line. The posterior is then given by Bayes' formula,

(29) π ( N ̃ | y ) = π ( y | N ̃ ) π ( N ̃ ) π ( y ) π ( y | N ̃ ) π ( N ̃ ) ,

where π(y|Ñ) is the likelihood function; π(Ñ) is the prior probability density; and π(y) is the value of the marginal probability density of measurements at y, which once the measurements are realized, acts as a normalization constant. The likelihood function describes the likelihood of different measurement outcomes given a realization of parameters Ñ. It considers statistics of the measurement noise and the misfit between measurements and the model. The prior density models statistically the information on the parameters before measurements y are accounted for. It should include a higher probability for parameter values we expect to see compared to those we do not expect to see, and can, for example, include an assumption that small differences in number concentration between neighboring size bins are more probable than large ones.

3.1 The prior

We model the prior density as Gaussian with mean Ñ* and covariance Γ̃pr. The covariance is constructed to promote smoothness (Lieberman et al.2010) between adjacent size bins:

(30) Γ ̃ pr ( i , j ) = a ̃ exp - 1 2 d ̃ i - d ̃ j 2 b ̃ 2 ,

where ã=Γ̃pr(i,i) is the variance of the initial PSD, d̃i is the size bin center, and b̃=l̃/2ln(100) is related to the smoothness of the initial PSD over the size range. The degree of smoothness is controlled by correlation length l̃, which is defined as the distance (here, in terms of particle size on a logarithmic scale) over which the cross-covariance Γ̃pr(i,j) drops to 1 % of ã. Due to the log transformation, a correlation length l̃=1 corresponds to 1 order of magnitude. An example of a covariance matrix with l̃=3/4 and ã=4, when each order of magnitude is divided into 16 bins, is shown in Fig. 2.

https://gmd.copernicus.org/articles/18/2983/2025/gmd-18-2983-2025-f02

Figure 2An example prior covariance matrix with size bins covering 2 orders of magnitude and 16 bins per decade.

Download

With the above notation, the prior density can be written as

(31) π ( N ̃ ) exp - 1 2 L ̃ pr N ̃ * - N ̃ 2 ,

where L̃pr is a matrix square root of the inverse of the prior covariance, i.e., L̃prTL̃pr=Γ̃pr-1.

3.2 The likelihood

Let h:RnBRnM denote the parameter-to-observable mapping, i.e., the forward model, which models the dependency between the initial time PSD (or, to be specific, its logarithmic parameters) and the observable quantity; i.e., in this case, h models not only the measurement device but also the coagulation and wall deposition processes that the particle ensemble undergoes while being transported through the sampling line between the initial time and the time of the observation. We will derive the exact form of h in Sect. 4 because it depends on the measurement device used. With the standard assumption that the measurement process is corrupted by additive Gaussian noise eRnM, the following observation model can be written for the measurement:

(32) y = h ( N ̃ ) + e .

If we assume that the measurement noise is independent of the unknowns and normally distributed with zero mean and covariance Γe,eN(0,Γe), the likelihood can be written as

(33) π ( y | N ̃ ) exp - 1 2 L e y - h ( N ̃ ) 2 ,

where LeTLe=Γe-1.

3.3 The posterior

The solution to the inference problem, the posterior density Eq. (29), can now be stated by combining the prior Eq. (31) and the likelihood Eq. (33):

π(Ñ|y)exp{-12Le(y-h(Ñ))2(34)-12Lpr(Ñ*-Ñ)2}.

The final task in the problem is to summarize the posterior by calculating point and interval estimates, which we will describe next.

3.3.1 Laplace approximation to the posterior

One of the most commonly used approximations to the posterior is the Laplace approximation, in which the posterior density is approximated with a Gaussian distribution (MacKay2003). The mean of the distribution is set to the maximum a posteriori (MAP) estimate, and the covariance is calculated based on the curvature of the posterior around the MAP estimate, i.e., π(Ñ|y)N(ÑMAP,Γ̃post). The MAP estimate is defined as the point where the posterior density has a maximum:

(35) N ̃ MAP = arg max N ̃ D π ( N ̃ | y ) ,

where DRnB is a space of possible values for Ñ. In this paper, we use the Gauss–Newton algorithm equipped with line search to compute the MAP estimate. The posterior covariance is given by

(36) Γ ̃ post = J ( N ̃ MAP ) T Γ ̃ e - 1 J ( N ̃ MAP ) + Γ ̃ pr - 1 - 1 ,

where J(ÑMAP) denotes the Jacobian matrix of h(Ñ) with respect to Ñ, evaluated at the MAP estimate.

The Laplace approximation to the posterior covariance can be used to compute parameter uncertainty estimates such as approximate credible intervals (CIs), which in the Bayesian framework can be directly interpreted as statements on the probabilities of the parameter values. This means that the true parameter value is contained within the m % CI with m % probability. The CI does not have a unique definition, but here for the Laplace approximation we define a 95 % credible interval as ÑMAP±1.96σ̃post, where σ̃post2(i)=Γ̃post(i,i), i.e., the values on the diagonal of the posterior covariance matrix. Estimates for N can be calculated by transforming the log-parametrized PSD values to the absolute scale as 10ÑMAP±1.96σ̃post. Note that due to the parametrization the distribution of N is not Gaussian but log normal.

3.3.2 Full characterization of the posterior

In many cases, the MAP estimate and Laplace approximation are sufficient to summarize the posterior and quantify uncertainty in the parameters. However, to properly assess the validity of this approximation the posterior needs to be fully characterized. In practice, this is done by sampling methods such as Markov chain Monte Carlo (MCMC), which generate correlated samples {Ñi,i=0,1,} that are distributed according to the posterior probability density. The basic algorithm to construct an MCMC sampler is the Metropolis algorithm (Metropolis et al.1953), where a proposal Ñ for a new sample is generated based on the current sample Ñi, and the proposal is accepted with a probability of the posterior ratio, min(1,π(Ñ|y)/π(Ñi|y)). If the proposal is accepted, we set Ñi+1=Ñ and otherwise Ñi+1=Ñi. The algorithm has to be run for long enough so that the chain is converged, i.e., will have properly sampled the posterior and sufficiently accurate estimates can be computed. How long is long enough depends on many factors, such as the required accuracy, efficiency of the sampler, and shape of the posterior. Assessing convergence reliably can be difficult but is a requirement before any conclusions from the samples can be drawn.

Once a set of samples representative of the posterior has been obtained, estimates of the conditional mean (CM) and credible intervals can be computed. The CM estimate is the expected value of the posterior, i.e., its center of mass, and is computed as

(37) N ̃ CM = D N ̃ π ( N ̃ | y ) d N ̃ 1 n s i = 1 n s N ̃ i ,

where ns is the number of samples. In the case of MCMC we define the m % credible interval as the interval between the two ends, or tails, of the marginal posterior distribution, where both tails contain (100-m)/2% of the marginal posterior samples. The true shape of the marginal posteriors can also be visualized by histograms of the samples.

Drawing lots of samples to approximate the posterior is naturally slower than computing the Laplace approximation, and the feasibility of applying MCMC on a given problem depends highly on the efficiency of the chosen sampler. In this work, we use the Metropolis-adjusted Langevin algorithm (MALA) (Roberts and Tweedie1996) with adaptive proposals (Haario et al.2001). Compared to a regular random walk Metropolis algorithm, MALA increases sampling efficiency using the gradient of the posterior to guide the proposals towards areas of higher posterior probability. For more information on MCMC methods, see for example Brooks et al. (2011).

3.4 Uncertainties in the forward model

In addition to the unknown of primary interest, Ñ, the forward model includes auxiliary (sometimes also called nuisance) parameters – ones that we are not primarily interested in but nevertheless affect the model output. These include parameters such as exhaust flow velocity, fractal dimension, and the Hamaker constant. Above, the inference of the PSD was written assuming that these parameters are known exactly. This, of course, is seldom the case in reality, and the assumption may lead to erroneous results and unrealistically narrow uncertainty estimates. To quantify the effect of uncertainty in the auxiliary parameters on the posterior probability of Ñ, we can use marginalization, where the auxiliary parameters are modeled as additional unknowns and then integrated out.

Let us denote the auxiliary unknowns by νRnν and define a prior π(ν) that is independent of π(Ñ). Then, the augmented form of the posterior reads

(38) π ( N ̃ , ν | y ) π ( y | N ̃ , ν ) π ( N ̃ ) π ( ν ) .

By carrying out the following integral,

(39) π ν ( N ̃ | y ) = R n ν π ( N ̃ , ν | y ) d ν ,

the effect of ν is integrated out and we are left with a marginal posterior probability πν(Ñ|y). We use the subscript ν to denote the variable(s) over which the posterior is marginalized to avoid confusion with Eq. (34), which implicitly considers ν known. In practice, the integration is done by carrying out MCMC sampling for both Ñ and ν simultaneously. Once a representative set of samples has been collected, the marginalization Eq. (39) is trivial.

4 Case studies

Let us now use the described inversion approach to analyze measurements of particle emissions from fuel-operated auxiliary heaters (AHs) used in vehicle preheating or providing additional heat (Oikarinen et al.2022). In this kind of measurement, often (i) the PN concentration is high and (ii) the sample needs to be transported some distance from the end of the exhaust to the measurement device – two factors that increase the likelihood of coagulation and wall losses distorting the PSD, as well as raise the question of the validity of the measurements unless these effects are tested and corrected for. We will compute both the Laplace approximation and a full characterization of the posterior with MCMC to compare these approaches. Two cases will be studied, one based on synthetic data where the true initial PSDs are known and one considering real measurements.

4.1 Modeling the measurement setup

The PSD measurements in Oikarinen et al. (2022) were carried out using the Engine Exhaust Particle Sizer™ (EEPS) model 3090, manufactured by TSI Inc. The EEPS is a spectrometer that estimates PSD based on measurements of electrical mobility. A simplified schematic of the measurement setup in Oikarinen et al. (2022) is shown in Fig. 3, where exhaust gas from the AH is led to the EEPS through a sampling line with a length of 3.2 m and an inner diameter of 5 mm. Flow velocity through the line was 3.5 m s−1, leading to a residence time of approximately 0.9 s. The sampling line was heated to 250 °C, and just before the EEPS the sample was diluted at a ratio that varied between 40:1 and 100:1 to allow for measurements of PN concentrations higher than the upper limit of the instrument.

https://gmd.copernicus.org/articles/18/2983/2025/gmd-18-2983-2025-f03

Figure 3A schematic of the measurement setup.

Download

The operating principle of the EEPS is, in short, the following (TSI2015): a flow of exhaust gas is directed at the inlet, where particles in the gas are positively charged. The charged particles are then directed in a laminar flow through a cylindrical tube that has a positive high-voltage electrode column in the center and a stack of 22 circular electrometers on the outer wall. The electric field between the center electrode and electrometers deflects the particles outwards so that particles with high electrical mobility hit the electrometers near the inlet and particles with low electrical mobility hit the electrometers further along the stack. The striking particles generate currents that are proportional to the particle concentration. Measurements of these currents with their location along the stack can be inverted to give an estimate of the PN concentration. The operating size range of the EEPS is between 5.6 and 560 nm, and the whole size range is measured simultaneously, every 1 s in the present case.

An observation model (cf. Eq. 32) for the EEPS in terms of the raw electrometer currents can be written as (Wang et al.2016a):

(40) y = H f + e ,

where y∈ℝ22 is the measured current, f∈ℝ17 denotes a representation of the size distribution used by the EEPS, and e∈ℝ22 is measurement noise. The size distribution f is mapped to currents with an instrument matrix1 HR22×17, which can be exported from the EEPS2. Here we use the SOOT instrument matrix (Wang et al.2016b), which TSI has developed to give more accurate estimates with soot particles than the instrument's default instrument matrix. To connect the aerosol bin model Eq. (3) to the observation model above, we map the size bin representation to f with an interpolation matrix BR17×nB:

(41) f = B N ^ ,

where N^ is a size bin representation of the PSD of the sample when it enters the EEPS after it has been altered by the sampling line. Finally, let Ξ represent the coagulation and wall diffusion model so that N^=Ξ(N), and let ϕ denote a transformation from the log space to the absolute scale N=ϕ(Ñ). The full observation model is then y=h(Ñ)+e, where the forward model is

(42) h ( N ̃ ) := HB Ξ ( ϕ ( N ̃ ) ) .

To specify the covariance of the measurement noise, Γe, we assume that noise in each electrometer is independent of the other electrometers so that Γe is diagonal and consists of the noise variances σe,i2 of each electrometer i. These variances can generally be estimated from a calibration measurement with a minimal number of particles using a filtered sample. However, an analysis of the measurement data shows in this case that the noise seems to be approximately proportional to the measured current. Specifically, during time windows that are short enough, where the PSD is expected to stay constant and thus variation in the measured signal can be attributed to noise, the standard deviation of the measurements is roughly 5 %–15 % of the amplitude of the measurement. An exception to this is when the current drops below the electrometer noise floor. We therefore model the noise variance as

(43) σ e , i 2 = max σ floor , i 2 , ( ϵ y i ) 2 , i = 1 , , 22 ,

where σfloor,i2 and yi denote the noise floor and measured current of electrometer i, respectively, and ϵ denotes the noise level. Note that in the derivation of the likelihood we assumed that the noise and the unknowns are independent, but here some dependency is introduced between them. In practice, we thus use the measured values of yi to assess the variance in Eq. (43) instead of reformulating the likelihood model to accommodate for a non-additive noise model that would result from the dependency between N and noise. The effect of this approximation, however, is minor if the noise level is not very high.

4.2 Inversions with simulated data

We created simulated data with the above model (Eq. 42), specifying the initial PSD and auxiliary parameters and adding noise with level ϵ=0.10. The initial number concentration was chosen to be high enough to clearly show coagulation effects but still realistic to be comparable to the number concentrations seen in the real measurements shown later. With simulated data, one must be careful to not commit a so-called inverse crime of using the same exact model to create the data and solve the inverse problem (Kaipio and Somersalo2006), which might lead to unrealistically good results. To ensure that the (tiny) numerical errors that arise due to discretizing the model are not the same, the data were created using a higher number of size bins than what was used in the inversion. We used 30 and 16 bins per decade for the data-generating and inversion models, respectively. The dilution ratio was set to 100:1, and the auxiliary parameters were set to the values given in Table 1. The expected value of the prior was set to Ñ*=log(103/wbin), where wbin is the logarithmic bin width, and correlation length and variance were set to l̃=12/16 and ã=4, respectively.

Table 1True values for the auxiliary parameters that were used to simulate synthetic data and the range over which they are allowed to vary when marginalizing.

Download Print Version | Download XLSX

4.2.1 Known auxiliary parameters

First, we carried out inversion with the same auxiliary parameters that were used to create the data. In this case, the only source of modeling error with respect to the accurate model used for generating the synthetic data is that caused by the coarser discretization of the particle size. The specified initial PSD and the corresponding generated noisy data points are shown in Fig. 4.

https://gmd.copernicus.org/articles/18/2983/2025/gmd-18-2983-2025-f04

Figure 4Synthetic data. (a) Simulated noisy data, model fit corresponding to the MAP estimate, and standard deviation of the measurement noise. (b) MAP estimate and 95 % posterior credible interval of the initial PSD, the MAP estimate propagated through the sampling line, and the true initial PSD.

Download

The Gauss–Newton algorithm to compute the Laplace approximation for the initial PSD converged in nine iterations and took 0.4 s. All calculations in this paper were done on a laptop with an Intel(R) Core(TM) i7-12700H processor, using only a single CPU core. The Laplace approximation for the initial PSD and the PSD after the sampling line are also shown in Fig. 4. The PSD after the sampling line is calculated from the MAP estimate with the coagulation and wall diffusion model, and it depicts what would be estimated based on the EEPS data if the effects of the PSD evolution within the sampling line were neglected. A comparison between the estimated PSDs before and after the sampling line reveals a clear coagulation effect; the concentration of the smallest particles has reduced by up to 2 orders of magnitude, and these particles show up as a slightly higher number concentration of larger particles. The initial PSD can be said to be estimated well because the true initial PSD is found within the 95 % CIs over the whole size range. The relative error between the MAP estimate and the true initial PSD is 6.2 %.

To compute the true posterior, we ran the MCMC sampler for 200 000 iterations, which took 20 min. The first 25 % of the samples were removed as burn-in, and the remaining 150 000 samples had an average integrated autocorrelation time of 20.2 samples, which means there were over 7400 independent posterior samples in the chain. Figure 5a–c show the first variable of the MCMC chain (i.e., the smallest size bin) as an example, and a visual inspection of this and the rest of the chains showed no sign of convergence issues. In the remainder of this paper, it is assumed that a similar amount of sampling and a similar convergence analysis are done to validate each presented MCMC result. Figure 5d shows the CM estimate and 95 % CIs. Again, the initial PSD is estimated well and is found within the CIs. The relative error between the CM estimate and the true initial PSD is 7.1 %. The main differences compared to the Laplace approximation are that for particles larger than 200 nm, the CM estimate gives lower concentrations than the MAP estimate and that the CIs obtained by sampling are narrower than those of the Laplace approximation.

https://gmd.copernicus.org/articles/18/2983/2025/gmd-18-2983-2025-f05

Figure 5Synthetic data. The full MCMC chain, shown in (c), of the log number concentration of the first size bin and two zoomed-in sections showing (a) the beginning of the chain which is still in burn-in and (b) part of the chain when it has reached the steady state. (d) CM estimate and posterior credible intervals calculated with MCMC using the true auxiliary parameters.

Download

4.2.2 Unknown auxiliary parameters

As mentioned in Sect. 3.4, it not always realistic to assume the auxiliary parameters are known accurately, and one approach to take their uncertainty into account is to model the auxiliary parameters as additional unknowns. An example of what can happen when just one parameter, here the fractal dimension, is incorrectly specified and its uncertainty not modeled is shown in Fig. 6. Here the data were generated with Df=1.7, but in the inversion we used Df=2.1, which is well within the possible values for the fractal dimension of soot. Figure 6a shows that, although not a dramatic change from the results in Fig. 5, the initial PSD is underestimated and the 95 % credible intervals do not always contain the truth, especially around 10 nm. In addition, the relative error between the CM estimate and the truth has increased to 18.8 %.

https://gmd.copernicus.org/articles/18/2983/2025/gmd-18-2983-2025-f06

Figure 6Synthetic data. (a) Posterior distribution with fractal dimension set to an incorrect value. (b) Posterior distribution with fractal dimension marginalized.

Download

Let us now model Df as an unknown and marginalize it. The resulting posterior is shown in Fig. 6b. Compared to the model where Df was assumed to be accurate, the CIs of especially particles under 50 nm (which seem to be the most affected by coagulation) are considerably wider and now do include the true initial PSD. The CM estimate is also closer to the truth, with the relative error being 10.1 %.

Finally, let us marginalize all other auxiliary parameters as well, one by one and also all together. A complete list of the auxiliary parameters is given in Table 1, which also lists the ranges over which the integration is carried out. The fixed auxiliary parameters are held at their true values, except for Df, which is set to 2.1. Figure 7a shows the full posterior in the case where all auxiliary parameters were integrated out at the same time. As is expected, the CIs in this case are wider than when only the fractal dimension has been marginalized. The relative error between the CM estimate and the truth is 6.8 %, which is even lower than when using the correct auxiliary parameters (7.1 %). However, this is likely just a coincidence related to, e.g., the chosen parameter ranges for marginalization and not something to be expected in general.

https://gmd.copernicus.org/articles/18/2983/2025/gmd-18-2983-2025-f07

Figure 7Synthetic data. (a) Posterior with all auxiliary model parameters marginalized. (b) Marginal posteriors of size bin number 5 (12.40 nm) with different auxiliary parameters marginalized.

Download

To visualize the effects of uncertainties in different auxiliary parameters, let us focus on the posterior of a single size bin, i.e., a marginal posterior, which can be plotted as a histogram. Figure 7b shows these marginal posteriors at the sixth size bin (corresponding to dm=12.40nm) of five cases, where either the flow velocity, Hamaker constant, fractal dimension, all of the auxiliary parameters, or none of the auxiliary parameters has been marginalized. Note that the particle number in Fig. 7b is shown on the x axis, and the y axis now denotes the posterior probability density. The marginalization of temperature and the primary particle diameter is not shown because their effect on the posterior was minimal. Over the considered integration ranges, the effects of fractal dimension and the Hamaker constant were larger than that of the flow velocity, but in each case the marginalization increased the width of the CIs. Marginalizing all auxiliary parameters at the same time had naturally the largest effect.

4.3 Fuel-operated auxiliary heater measurement

The measurement campaign by Oikarinen et al. (2022) was carried out in February 2021 in Finnish winter conditions (19 to 7 °C) over several days and with multiple vehicles. For the scope of this paper, we chose a measurement of one of the vehicles, a gasoline-powered 2019 Volkswagen Golf, to analyze as an example. The vehicle had an original equipment manufacturer (OEM)-installed AH manufactured by Webasto Ltd. Emissions of the AH were measured over a half-hour period, which included a cold start and shutdown of the AH. The EEPS was set to measure at 1 s intervals, leading to a total of around 1900 measurements. A more detailed description of the setup and analysis of the EEPS-inverted data is found in the original paper.

As mentioned earlier, computing the Laplace approximation is much faster than doing MCMC sampling and was therefore used to invert the dataset. MCMC was carried out at a few selected measurement points, discussed later. Discretization of the inversion model was set to 16 bins per decade so that the output bins were the same as those of the EEPS. The prior parameters were also kept the same as in the synthetic data inversion. However, because the choice of prior is subjective to some degree, in Sect. 5 we will briefly discuss different choices for the prior parameters and their influence on the PSD estimates. Further, because the data exhibit temporal smoothness, we used the previous MAP estimate as an initial guess for computing the next one, which reduced the number of Gauss–Newton iterations required to minimize the cost function. Inverting the whole dataset took 2 min and 20 s.

The MAP estimates of the initial PSD for the duration of the whole measurement are shown in Fig. 8. The results are qualitatively similar to the ones obtained directly from the EEPS, discussed in Oikarinen et al. (2022), where after startup, the AH emissions are relatively stable, followed by a burst of sub-20 nm particles during shutdown. However, there is a large difference in the total PN. Compared to the (diffusion-corrected) EEPS estimates, the total PN is on average 50 %–100 % higher in the MAP estimates, varying over the measurement period with a peak 230 % higher during the AH shutdown. Because coagulation affects the particle population differently depending on the particle size, also the ratio of PN between the MAP and EEPS estimates depends on the particle size. This is visualized in Fig. 9, which shows the ratio of PNs in the MAP and EEPS estimates for a few ranges of particles sizes. The number of particles under 20 nm may be over 3 times higher with the sampling line modeled, whereas the number of particles over 100 nm is typically reduced by up to 50 %.

https://gmd.copernicus.org/articles/18/2983/2025/gmd-18-2983-2025-f08

Figure 8The MAP estimates for the initial PSD for a measurement of AH exhaust emissions of a 2019 Volkswagen Golf. Lines 1 and 2 denote measurement times analyzed in more detail.

Download

https://gmd.copernicus.org/articles/18/2983/2025/gmd-18-2983-2025-f09

Figure 9Ratio of PNs between the MAP and diffusion-corrected EEPS estimates during the example measurement. The MAP estimate of the number of small particles is 3 times that of the EEPS estimate in the beginning, whereas the number of particles over 100 nm is only around half of that of the EEPS estimate.

Download

To compare the MAP estimates and EEPS output in more detail, let us analyze the two time instants shown as dashed red lines in Fig. 8. The time instant at Line 1 in the middle of the measurement corresponds to the steady state, and the time instant at Line 2 at the end is the time when the burst of sub-20 nm particles was at its peak. The Laplace approximations and the corresponding data fits at these points are shown in Figs. 10 and 11. First, in both examples, modeling the sampling line increases the number of the smallest particles in the estimates significantly while reducing the number of larger particles. Second, the PSDs after the sampling line are close to the PSD estimates from the EEPS, as they should be if our inversion algorithm was working correctly. The differences between them can be mostly attributed to the different inversion algorithms, choices of prior/regularization parameters, and handling of measurement noise. Inversion in the EEPS is based on a Tikhonov-regularized Gauss–Markov least-squares solution (Mirme et al.2007; Mirme and Mirme2013; Wang et al.2016a), but its details are not public. Moreover, the EEPS sometimes rounds the PSD estimates of some size bins down to zero, which can be seen in both Figs. 10 and 11 as gaps in the dotted line (values set to not a number (NaN) because of the log scale).

https://gmd.copernicus.org/articles/18/2983/2025/gmd-18-2983-2025-f10

Figure 10AH measurement at time instant 1 (Line 1 in Fig. 8). (a) Measured data, model fit corresponding to the MAP estimate, and standard deviation of the measurement noise. (b) MAP estimate and 95 % posterior credible interval of the initial PSD, the MAP estimate propagated through the sampling line, and the PSD estimate given by the EEPS.

Download

https://gmd.copernicus.org/articles/18/2983/2025/gmd-18-2983-2025-f11

Figure 11AH measurement at time instant 2 (refer to the caption of Fig. 10).

Download

Finally, we used MCMC to marginalize the auxiliary model parameters for the measurements at Line 1 and Line 2. The marginalization was carried out simultaneously for all five parameters over the ranges listed in Table 1. The results, shown in Fig. 12, resemble those of the synthetic tests in that the CM and MAP estimates are still close, but the 95 % CIs in the marginalized case are wider especially for the smallest particles.

https://gmd.copernicus.org/articles/18/2983/2025/gmd-18-2983-2025-f12

Figure 12AH inversion results with all auxiliary parameters marginalized using MCMC for (a) Line 1 and (b) Line 2.

Download

5 Discussion

The analyzed synthetic and real measurements show that sampling lines can have a major effect on the PSD when measuring the emissions of fuel-operated auxiliary heaters. The same observation obviously applies to any measurement where the sampling line is long and/or the PN concentration is high. When should one then be concerned about their measurements being distorted and consider applying, for example, the methods in this paper? The answer depends not just on the sampling line and PN concentration, but also on factors such as the particle shape, density, and fluid they are suspended in, which affect the coagulation rate and particle diffusion. In the context of the specific numerical example studied in this paper (see Fig. 4), our model shows that if the length of the sampling line was reduced from 3.2 m to, say, 1 m, the true initial total PN between 5.6 and 562 nm would still be about 33 % higher than what would be measured after the sampling line. It could be possible to define an indicator for a risk of distorted measurements based on the residence time and measured PN, but this is out of the scope of the present study.

A question regarding the inversion is if there is enough benefit to sampling the true posterior with MCMC compared to just computing the Laplace approximation. Based on the results in Sect. 4, the answer is in most cases no; the MAP and CM estimates are similar, and the Gaussian density seems to approximate the posterior well. Most importantly, computing the Laplace approximation is fast enough that it could even be combined with the inversion algorithm in the EEPS and still produce the estimates in real time, only now taking into account the effect of the sampling line and providing uncertainty estimates as well. A major benefit of running MCMC is that parameters in the forward model that we are uncertain of can be integrated out and their uncertainty propagated to the PSD estimates. The trade-off is an increased computation time. However, it could be possible to account for this uncertainty in an approximative manner while still retaining the speed of the Laplace approximation, using the Bayesian approximation error method (Kaipio and Somersalo2007; Kaipio and Kolehmainen2013), but this has not been pursued in this work.

As mentioned earlier, the inverse problem of estimating the PSD from measurements of currents is highly ill-posed even without considering the sampling line, and hence the uncertainty of the posterior depends to a large extent on the prior assumptions. For example, as the correlation length l̃ (see Sect. 3.1) is increased, the posterior will be constrained to more and more smooth solutions, which also constrains the posterior uncertainty. Let us demonstrate this in Fig. 13 by plotting the Laplace approximations of two posteriors which differ only in the prior correlation length. Note how in addition to smoothing out the MAP estimate, increasing the correlation length also decreases the width of the credible intervals, i.e., decreases the posterior uncertainty.

https://gmd.copernicus.org/articles/18/2983/2025/gmd-18-2983-2025-f13

Figure 13The influence of prior correlation length shown as MAP estimates and 95 % CIs with (a) l̃=8/16 and (b) l̃=12/16. The overlaid dotted lines are the MAP estimates from the neighboring figure for an easier comparison.

Download

A source of uncertainty related to coagulation that has not been modeled in this work is the influence of particles larger and smaller than what we can measure, >562 and <5.6nm, respectively, in the case of the EEPS. In the measurable range, coagulation with these larger particles results in higher-than-modeled particle losses, whereas coagulation with the smaller particles both introduces new particles at the lower measurable limit as well as causes particles in the measurable range to grow faster than modeled. Hence, the presence of particles larger than we can measure will lead to underestimating the initial PSD, and the presence of particles smaller than we can measure will have the opposite effect. According to a numerical simulation over a wide size range, however, if we assume the PSD outside the measurable range to continue the downward trend that is seen, for example, in Fig. 12a, its effect on the measured PSD is negligible.

Finally, there are a few loss mechanisms, including sampling losses (extraction of the aerosol at the sampling location), gravitational losses, inertial impaction, thermophoresis, and electrostatic deposition that may modify PN concentrations in a sampling line (Giechaskiel et al.2012; Hinds1999). Most of these are, however, likely negligible in the case studied here and therefore not modeled. Sampling losses, gravitational losses, and inertial impaction mainly affect particles >1µm, which is larger than what the EEPS can measure. On the other hand, inertial impaction can also cause losses of nanometer-size particles, but these are at or below the measured size range. Thermophoresis, the motion of particles in a temperature gradient, is potentially a major loss mechanism in sampling lines if there is a large temperature difference between the exhaust gas and the sampling line walls. In the measurement by Oikarinen et al. (2022), the sampling line was heated to 250 °C, which is very close to the temperature of the exhaust gas exiting the AH exhaust pipe, and hence the thermophoretic losses are expected to be minimal. Out of the mentioned loss mechanisms here, electrostatic deposition has the potential to have a major effect but at the same time is difficult to quantify because we do not know the charge on either the soot particles or the sampling line walls. We have hence ignored it in this paper but recognize that this effect may result in even higher particle losses than modeled so far.

6 Conclusions

In this article, we investigated how coagulation and wall diffusion in a sampling line modify a sample's particle size distribution and hence bias its measurement. Coagulation was found to become a significant factor especially at high PN concentrations because while the rate of wall diffusion is independent of the number concentration, the rate of coagulation is approximately proportional to its square. As an example of a high-PN case we studied the exhaust emissions of a fuel-burning road vehicle auxiliary heater, but it is worth noting that the presented method is applicable, and extendable, to any measurement where coagulation and/or any other aerosol process (such as condensation or evaporation) is suspected to affect the measurement. Through simulations and examining real data we found that, in a typical measurement setup, coagulation in the sampling line can reduce the total PN by more than 50 % and the number of small particles (<20nm) even by an order of magnitude. The initial particle size distribution, not yet biased by the sampling line, can, however, be estimated from measurements that were done after the sampling line, given that we have a model for the processes in the sampling line.

Estimating the initial particle size distribution is an ill-posed inverse problem which we solved using methods in the Bayesian framework for inverse problems. In this framework, we can incorporate prior information about the size distribution and carry out a systematic calculation of the uncertainty related to the estimates. Two approaches to exploring the posterior probability density were tested: one where the posterior is fully characterized by MCMC sampling and another where the posterior is approximated as a Gaussian distribution. These resulted in similar estimates for the initial PSD, but both approaches also had distinct advantages. With the Gaussian approximation the inverse problem could be solved in a computationally efficient manner so that computing the estimates of the initial size distribution and its uncertainty could be done in real time even with measurement devices that have fast response times (1–10 Hz, for example). Using MCMC, on the other hand, we were able to take into account the (possible) uncertainty in the parameters of the coagulation and diffusion models, such as particle shape, using marginalization. This results in uncertainty estimates that are more consistent with what we assume to know about the problem but are computationally more demanding to calculate.

In conclusion, when the number concentration is high and the particles are small, the effect of coagulation in sampling lines should be taken into account when carrying out PSD or PN measurements. Otherwise, the particle numbers may be severely underestimated. Further, estimates of the PSDs have major uncertainties associated with them that traditional, deterministic inversion methods do not convey. Bayesian methods for uncertainty quantification could therefore prove useful in assessing the health and environmental impacts of fine particles in the future.

Code and data availability

The current version of SLIC is available from the project website under the MIT license: https://github.com/mniskanen/sampling-line-inversion (last access: 12 February 2025). The exact version of the model and the input data and scripts to run the model and to produce the results and plots for all the simulations presented in this paper are archived on Zenodo: https://doi.org/10.5281/zenodo.12188947 (Niskanen2024).

Author contributions

MN: formal analysis, investigation, methodology, software, validation, visualization, writing (original draft preparation, review and editing). AS: methodology, writing (review and editing). HO: conceptualization, data curation, writing (review and editing). MO: conceptualization, data curation, investigation, writing (review and editing). PK: conceptualization, funding acquisition, project administration, supervision, writing (review and editing). SM: conceptualization, funding acquisition, project administration, supervision, writing (review and editing). KL: conceptualization, funding acquisition, methodology, project administration, writing (review and editing).

Competing interests

The contact author has declared that none of the authors has any competing interests.

Disclaimer

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.

Financial support

This research has been supported by the Research Council of Finland (grant nos. 346375, 325022, 352968, 337550, and 337551), the Jane ja Aatos Erkon Säätiö (project AHMA), the Tampere Institute for Advanced Study, and the Kone Foundation.

Review statement

This paper was edited by Simon Unterstrasser and reviewed by three anonymous referees.

References

Alam, M. K.: The effect of van der Waals and viscous forces on aerosol coagulation, Aerosol Sci. Tech., 6, 41–52, https://doi.org/10.1080/02786828708959118, 1987. a, b, c

Brooks, S., Gelman, A., Jones, G., and Meng, X.-L.: Handbook of Markov Chain Monte Carlo, CRC press, https://doi.org/10.1201/b10905, 2011. a

Commission Regulation (EU): 2019/1939 of 7 November 2019 amending Regulation (EU) No 582/2011 as regards Auxiliary Emission Strategies (AES), access to vehicle OBD information and vehicle repair and maintenance information, measurement of emissions during cold engine start periods and use of portable emissions measurement systems (PEMS) to measure particle numbers, with respect to heavy duty vehicles (Text with EEA relevance), https://eur-lex.europa.eu/eli/reg/2019/1939/oj (last access: 20 June 2024), 2019. a

Gelbard, F., Tambour, Y., and Seinfeld, J. H.: Sectional representations for simulating aerosol dynamics, J. Colloid. Interf. Sci., 76, 541–556, https://doi.org/10.1016/0021-9797(80)90394-x, 1980. a

Giechaskiel, B., Arndt, M., Schindler, W., Bergmann, A., Silvis, W., and Drossinos, Y.: Sampling of non-volatile vehicle exhaust particles: a simplified guide, SAE International Journal of Engines, 5, 379–399, https://doi.org/10.4271/2012-01-0443, 2012. a

Giechaskiel, B., Casadei, S., Mazzini, M., Sammarco, M., Montabone, G., Tonelli, R., Deana, M., Costi, G., Di Tanno, F., Prati, M., Clairotte, M., and Di Domenico, A.: Inter-laboratory correlation exercise with Portable Emissions Measurement Systems (PEMS) on chassis dynamometers, Appl. Sci.-Basel, 8, 2275, https://doi.org/10.3390/app8112275, 2018. a

Giechaskiel, B., Lähde, T., Schwelberger, M., Kleinbach, T., Roske, H., Teti, E., van den Bos, T., Neils, P., Delacroix, C., Jakobsson, T., and Lu Karlsson, H.: Particle number measurements directly from the tailpipe for type approval of heavy-duty engines, Appl. Sci.-Basel, 9, 4418, https://doi.org/10.3390/app9204418, 2019. a

Giechaskiel, B., Lähde, T., Melas, A. D., Valverde, V., and Clairotte, M.: Uncertainty of laboratory and portable solid particle number systems for regulatory measurements of vehicle emissions, Environ. Res., 197, 111068, https://doi.org/10.1016/j.envres.2021.111068, 2021a. a

Giechaskiel, B., Melas, A., Martini, G., and Dilara, P.: Overview of vehicle exhaust particle number regulations, Processes, 9, 2216, https://doi.org/10.3390/pr9122216, 2021b. a, b, c

Giechaskiel, B., Melas, A., Broekaert, S., Gioria, R., and Suarez-Bertoa, R.: Solid Particle Number (SPN) Portable Emission Measurement Systems (PEMS) for heavy-duty applications, Appl. Sci.-Basel, 14, 654, https://doi.org/10.3390/app14020654, 2024. a

Haario, H., Saksman, E., and Tamminen, J.: An adaptive Metropolis algorithm, Bernoulli, 7, 223–242, https://doi.org/10.2307/3318737, 2001. a

Hering, S. V., Stolzenburg, M. R., Quant, F. R., Oberreit, D. R., and Keady, P. B.: A laminar-flow, Water-Based Condensation Particle Counter (WCPC), Aerosol Sci. Tech., 39, 659–672, https://doi.org/10.1080/02786820500182123, 2005. a

Hinds, W. C.: Aerosol Technology Properties, Behavior, and Measurement of Airborne Particles, Wiley and Sons, Incorporated, John, 2nd edn., ISBN 9781118591970, 1999. a, b, c, d

Hofman, J., Staelens, J., Cordell, R., Stroobants, C., Zikova, N., Hama, S., Wyche, K., Kos, G., Van Der Zee, S., Smallbone, K., Weijers, E., Monks, P., and Roekens, E.: Ultrafine particles in four European urban environments: results from a new continuous long-term monitoring network, Atmos. Environ., 136, 68–81, https://doi.org/10.1016/j.atmosenv.2016.04.010, 2016. a

Hussein, T., Smolik, J., Kerminen, V.-M., and Kulmala, M.: Modeling dry deposition of aerosol particles onto rough surfaces, Aerosol Sci. Tech., 46, 44–59, https://doi.org/10.1080/02786826.2011.605814, 2012. a

Jacobson, M. Z. and Seinfeld, J. H.: Evolution of nanoparticle size and mixing state near the point of emission, Atmos. Environ., 38, 1839–1850, https://doi.org/10.1016/j.atmosenv.2004.01.014, 2004. a, b

Jacobson, M. Z., Turco, R. P., Jensen, E. J., and Toon, O. B.: Modeling coagulation among particles of different composition and size, Atmos. Environ., 28, 1327–1338, https://doi.org/10.1016/1352-2310(94)90280-1, 1994. a

Kahn, R. A., Andrews, E., Brock, C. A., Chin, M., Feingold, G., Gettelman, A., Levy, R. C., Murphy, D. M., Nenes, A., Pierce, J. R., Popp, T., Redemann, J., Sayer, A. M., da Silva, A. M., Sogacheva, L., and Stier, P.: Reducing aerosol forcing uncertainty by combining models with satellite and within the atmosphere observations: a three way street, Rev. Geophys., 61, 1–27, https://doi.org/10.1029/2022rg000796, 2023. a

Kaipio, J. and Kolehmainen, V.: Approximate marginalization over modeling errors and uncertainties in inverse problems, Bayesian Theory and Applications, Oxford University Press, 644–672, https://doi.org/10.1093/acprof:oso/9780199695607.003.0032, 2013. a

Kaipio, J. and Somersalo, E.: Statistical and Computational Inverse Problems, vol. 160 of Applied Mathematical Sciences, Springer-Verlag GmbH, ISBN 978-0-387-27132-3, 2006. a, b, c

Kaipio, J. and Somersalo, E.: Statistical inverse problems: discretization, model reduction and inverse crimes, J. Comput. Appl. Math., 198, 493–504, https://doi.org/10.1016/j.cam.2005.09.027, 2007. a

Kwon, H.-S., Ryu, M. H., and Carlsten, C.: Ultrafine particles: unique physicochemical properties relevant to health and disease, Exp. Mol. Med., 52, 318–328, https://doi.org/10.1038/s12276-020-0405-1, 2020. a, b

Lapuerta, M., Barba, J., Sediako, A. D., Kholghy, M. R., and Thomson, M. J.: Morphological analysis of soot agglomerates from biodiesel surrogates in a coflow burner, J. Aerosol Sci., 111, 65–74, https://doi.org/10.1016/j.jaerosci.2017.06.004, 2017. a

Lehtinen, K. E. and Zachariah, M. R.: Self-preserving theory for the volume distribution of particles undergoing brownian coagulation, J. Colloid. Interf. Sci., 242, 314–318, https://doi.org/10.1006/jcis.2001.7791, 2001. a

Lieberman, C., Willcox, K., and Ghattas, O.: Parameter and state model reduction for large-scale statistical inverse problems, SIAM J. Sci. Comput., 32, 2523–2542, https://doi.org/10.1137/090775622, 2010. a

Liu, H., Yu, Y., Wang, C., Xu, H., and Ma, X.: Brownian coagulation of particles in the gasoline engine exhaust system: experimental measurement and Monte Carlo simulation, Fuel, 303, 121340, https://doi.org/10.1016/j.fuel.2021.121340, 2021. a

Liu, Y., Song, C., Lv, G., Chen, N., Zhou, H., and Jing, X.: Determination of the attractive force, adhesive force, adhesion energy and Hamaker constant of soot particles generated from a premixed methane/oxygen flame by AFM, Appl. Surf. Sci., 433, 450–457, https://doi.org/10.1016/j.apsusc.2017.10.030, 2018. a

MacKay, D. J. C.: Information theory, inference and learning algorithms, chap. 27, Cambridge university press, ISBN 9780521642989, 2003. a

Mahfouz, N. G. A. and Donahue, N. M.: Primary ion diffusion charging and particle wall loss in smog chamber experiments, Aerosol Sci. Tech., 54, 1058–1069, https://doi.org/10.1080/02786826.2020.1757032, 2020. a

Mahfouz, N. G. A. and Donahue, N. M.: Atmospheric nanoparticle survivability reduction due to charge induced coagulation scavenging enhancement, Geophys. Res. Lett., 48, 1–9, https://doi.org/10.1029/2021gl092758, 2021a. a

Mahfouz, N. G. A. and Donahue, N. M.: Technical note: The enhancement limit of coagulation scavenging of small charged particles, Atmos. Chem. Phys., 21, 3827–3832, https://doi.org/10.5194/acp-21-3827-2021, 2021. a

Manisalidis, I., Stavropoulou, E., Stavropoulos, A., and Bezirtzoglou, E.: Environmental and health impacts of air pollution: a review, Frontiers in Public Health, 8, 1–13, https://doi.org/10.3389/fpubh.2020.00014, 2020. a

Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H., and Teller, E.: Equation of state calculations by fast computing machines, J. Chem. Phys., 21, 1087–1092, https://doi.org/10.1063/1.1699114, 1953. a

Mirme, A., Tamm, E., Mordas, G., Vana, M., Uin, J., Mirme, S., Bernotas, T., Laakso, L., Hirsikko, A., and Kulmala, M.: A wide-range multi-channel Air lon Spectrometer, Boreal Environ. Res., 12, 2007. a

Mirme, S. and Mirme, A.: The mathematical principles and design of the NAIS – a spectrometer for the measurement of cluster ion and nanometer aerosol size distributions, Atmos. Meas. Tech., 6, 1061–1071, https://doi.org/10.5194/amt-6-1061-2013, 2013. a

Morawska, L. and Buonanno, G.: The physics of particle formation and deposition during breathing, Nature Reviews Physics, 3, 300–301, https://doi.org/10.1038/s42254-021-00307-4, 2021. a

Niskanen, M.: SLIC v.1.0.1, Zenodo [code], https://doi.org/10.5281/zenodo.12188947, 2024. a, b

Oikarinen, H., Olin, M., Martikainen, S., Leinonen, V., Mikkonen, S., and Karjalainen, P.: Particle number, mass, and black carbon emissions from fuel-operated auxiliary heaters in real vehicle use, Atmospheric Environment: X, 16, 100189, https://doi.org/10.1016/j.aeaoa.2022.100189, 2022. a, b, c, d, e, f, g

Okuyama, K., Kousaka, Y., and Yoshida, T.: Turbulent coagulation of aerosols in a pipe flow, J. Aerosol Sci., 9, 399–410, https://doi.org/10.1016/0021-8502(78)90002-2, 1978. a

Park, K., Cao, F., Kittelson, D. B., and McMurry, P. H.: Relationship between particle mass and mobility for Diesel exhaust particles, Environ. Sci. Technol., 37, 577–583, https://doi.org/10.1021/es025960v, 2003. a, b

Pfeifer, J., Mahfouz, N. G. A., Schulze, B. C., Mathot, S., Stolzenburg, D., Baalbaki, R., Brasseur, Z., Caudillo, L., Dada, L., Granzin, M., He, X.-C., Lamkaddam, H., Lopez, B., Makhmutov, V., Marten, R., Mentler, B., Müller, T., Onnela, A., Philippov, M., Piedehierro, A. A., Rörup, B., Schervish, M., Tian, P., Umo, N. S., Wang, D. S., Wang, M., Weber, S. K., Welti, A., Wu, Y., Zauner-Wieczorek, M., Amorim, A., El Haddad, I., Kulmala, M., Lehtipalo, K., Petäjä, T., Tomé, A., Mirme, S., Manninen, H. E., Donahue, N. M., Flagan, R. C., Kürten, A., Curtius, J., and Kirkby, J.: Measurement of the collision rate coefficients between atmospheric ions and multiply charged aerosol particles in the CERN CLOUD chamber, Atmos. Chem. Phys., 23, 6703–6718, https://doi.org/10.5194/acp-23-6703-2023, 2023. a

Roberts, G. O. and Tweedie, R. L.: Exponential convergence of Langevin distributions and their discrete approximations, Bernoulli, 2, 341, https://doi.org/10.2307/3318418, 1996. a

Rogak, S. N. and Flagan, R. C.: Coagulation of aerosol agglomerates in the transition regime, J. Colloid. Interf. Sci., 151, 203–224, https://doi.org/10.1016/0021-9797(92)90252-h, 1992. a, b, c, d, e

Salminen, T., Lehtinen, K. E., Kaipio, J. P., Russell, V., and Seppänen, A.: Application of finite element method to general dynamic equation of aerosols – comparison with classical numerical approximations, J. Aerosol Sci., 160, 105902, https://doi.org/10.1016/j.jaerosci.2021.105902, 2022. a

Schriefl, M. A., Bergmann, A., and Fierz, M.: Design principles for sensing particle number concentration and mean particle size with unipolar diffusion charging, IEEE Sens. J., 19, 1392–1399, https://doi.org/10.1109/jsen.2018.2880278, 2019. a

Seinfeld, J. H. and Pandis, S. N.: Atmospheric Chemistry and Physics: From Air Pollution to Climate Change, Wiley, ISBN 1118947401, https://www.ebook.de/de/product/25599491/john_h_seinfeld_spyros_n_pandis_atmospheric_chemistry_and_physics_from_air_pollution_to_climate_change.html, 2016. a, b

Shimada, M., Okuyama, K., Kousaka, Y., and Ohshima, K.: Turbulent and brownian diffusive deposition of aerosol particles onto a rough wall, J. Chem. Eng. Jpn., 20, 57–64, https://doi.org/10.1252/jcej.20.57, 1987. a

Shiraiwa, M., Ueda, K., Pozzer, A., Lammel, G., Kampf, C. J., Fushimi, A., Enami, S., Arangio, A. M., Fröhlich-Nowoisky, J., Fujitani, Y., Furuyama, A., Lakey, P. S. J., Lelieveld, J., Lucas, K., Morino, Y., Pöschl, U., Takahama, S., Takami, A., Tong, H., Weber, B., Yoshino, A., and Sato, K.: Aerosol health effects from molecular to global scales, Environ. Sci. Technol., 51, 13545–13567, https://doi.org/10.1021/acs.est.7b04417, 2017. a

Thomas, A. E., Bauer, P. S., Dam, M., Perraud, V., Wingen, L. M., and Smith, J. N.: Automotive braking is a source of highly charged aerosol particles, P. Natl. Acad. Sci. USA, 121, 1–7, https://doi.org/10.1073/pnas.2313897121, 2024. a

TSI: Engine Exhaust Particle Sizer (EEPS) spectrometer model 3090 operation and service manual, TSI Incorporated, USA, J edn., p/N 1980494, 2015. a

TSI: Engine Exhaust Particle Sizer (EEPS) 3090, https://tsi.com/engine-exhaust-particle-sizer-spectrometer-3090/, last access: 12 April 2024. a

Virtanen, P., Gommers, R., Oliphant, T. E., Haberland, M., Reddy, T., Cournapeau, D., Burovski, E., Peterson, P., Weckesser, W., Bright, J., van der Walt, S. J., Brett, M., Wilson, J., Millman, K. J., Mayorov, N., Nelson, A. R. J., Jones, E., Kern, R., Larson, E., Carey, C. J., Polat, İ., Feng, Y., Moore, E. W., VanderPlas, J., Laxalde, D., Perktold, J., Cimrman, R., Henriksen, I., Quintero, E. A., Harris, C. R., Archibald, A. M., Ribeiro, A. H., Pedregosa, F., van Mulbregt, P., and SciPy 1.0 Contributors: SciPy 1.0: fundamental algorithms for scientific computing in Python, Nat. Methods, 17, 261–272, https://doi.org/10.1038/s41592-019-0686-2, 2020.  a

Wang, X., Grose, M. A., Avenido, A., Stolzenburg, M. R., Caldow, R., Osmondson, B. L., Chow, J. C., and Watson, J. G.: Improvement of Engine Exhaust Particle Sizer (EEPS) size distribution measurement – I. Algorithm and applications to compact-shape particles, J. Aerosol Sci., 92, 95–108, https://doi.org/10.1016/j.jaerosci.2015.11.002, 2016a. a, b, c

Wang, X., Grose, M. A., Caldow, R., Osmondson, B. L., Swanson, J. J., Chow, J. C., Watson, J. G., Kittelson, D. B., Li, Y., Xue, J., Jung, H., and Hu, S.: Improvement of Engine Exhaust Particle Sizer (EEPS) size distribution measurement – II. Engine exhaust particles, J. Aerosol Sci., 92, 83–94, https://doi.org/10.1016/j.jaerosci.2015.11.003, 2016b. a

Wentzel, M., Gorzawski, H., Naumann, K.-H., Saathoff, H., and Weinbruch, S.: Transmission electron microscopical and aerosol dynamical characterization of soot aerosols, J. Aerosol Sci., 34, 1347–1370, https://doi.org/10.1016/s0021-8502(03)00360-4, 2003. a

Williams, J. and Crane, R.: Particle collision rate in turbulent flow, Int. J. Multiphas. Flow, 9, 421–435, https://doi.org/10.1016/0301-9322(83)90098-8, 1983. a

Zhao, Y., Yang, W., Song, X., Jiang, C., and Feng, Y.: Coagulation patterns and the impacts on traffic-related ultrafine particle dispersion in road tunnels employing dynamic mesh algorithms, Environ. Sci. Pollut. R., 28, 61380–61396, https://doi.org/10.1007/s11356-021-14987-z, 2021. a

1

Also known as an inversion matrix in the EEPS manual.

2

If the raw current data and/or the instrument matrix were not available, inversion could also be done based on the EEPS estimate of the PSD. However, specifying the distribution of the uncertainty, or noise, in the data would not be as straightforward as with the raw data.

Download
Short summary
Particle size is a key factor determining the properties of aerosol particles which have a major influence on the climate and on human health. When measuring the particle sizes, however, sometimes the sampling lines that transfer the aerosol to the measurement device distort the size distribution, making the measurement unreliable. We propose a method to correct for the distortions and estimate the true particle sizes, improving measurement accuracy.
Share