Partial Derivative Fitted Taylor Expansion : an e ffi cient method for calculating gas / liquid equilibria in atmospheric aerosol particles – Part 2 : Organic compounds

Partial Derivative Fitted Taylor Expansion: an efficient method for calculating gas/liquid equilibria in atmospheric aerosol particles – Part 2: Organic compounds D. Topping, D. Lowe, and G. McFiggans National Centre of Atmospheric Science, Univ. of Manchester, Manchester, M13 9PL, UK Centre for Atmospheric Sciences, School of Earth, Atmospheric and Environmental Sciences, Univ. of Manchester, Manchester, M13 9PL, UK


Introduction and rationale
Gas to particle partitioning, driven by a difference in equilibrium and partial pressures, is a key process that dictates the evolving chemical composition of atmospheric aerosol particles, thus their environmental impacts.The equilibrium vapour pressure above a solution is given by: where P i is the equilibrium vapour pressure, P o the pure component vapour pressure, γ i the activity coefficient in solution and x i the mole fraction of component i in solution.
Sensitivity to choice of predictive technique for calculating P o , in terms of aerosol mass and properties, has recently been reviewed by McFiggans et al. (2010) and Topping et al. (2011).In this paper we use the technique for calculating P o recommended by Barley et al. (2009).
In a previous publication, a new hybrid reduced complexity ionic mixing rule for calculating γ i and hence, equilibrium vapour pressure of inorganic condensates, was presented (Topping et al., 2009).PD-FiTE, or Partial Derivative Fitted Taylor Series Expansion, was inspired by the MTEM model (Zaveri et al., 2005), which in turn is based on the observation that the logarithm of activity coefficients varied linearly as a function of water activity when expressed in terms of equivalent mole fractions (Zaveri et al., 2005).Whilst the terms within the inorganic PD-FiTE model Taylor Series Expansion are based on empirical observations of activity coefficient variation, the development of the organic model in this paper is not.Rather, in this manuscript we test the applicability of another Taylor Series Expansion applied to organic solutes in water, where the model parameters are derived solely to ensure correct limiting behaviour Published by Copernicus Publications on behalf of the European Geosciences Union.and interaction terms derived by fitting this framework to a more complex benchmark model (UNIFAC).This fitting methodology is the same methodology applied to inorganic PD-FiTE, despite the different approach used in defining the model terms and concentration scales, hence we use the same acronym here.
Activity coefficients account for interactions taking place in solution.Numerous model are available for both inorganic, organic and mixed inorganic-organic solutions.Unfortunately these models are far too expensive for inclusion in large scale models (Zaveri et al., 2005;Topping et al., 2009).As inorganic and organic activity coefficient frameworks have different theoretical constructs it is difficult to build reduced complexity frameworks which are equally applicable to both systems (Zuend et al., 2011).Following inorganic PD-FiTE, in this paper we use the same Taylor Series expansion methodology to develop a model for organic solutes in aqueous solutions via an appropriate representation of multi-component concentrations and interactions.Assessing the accuracy of both the inorganic and organic versions of PD-FiTE to mixed inorganic-organic aqueous systems will form the focus of a future study, using the revised AIOMFAC model as a benchmark (Zuend et al., 2011).

Reduced complexity activity coefficient framework
The numerical basis for the inorganic PD-FiTE is a simple Taylor series expansion involving only the first order term: lnγ i (x j ,x k ,...RH) = lnγ o i (RH) where lnγ i (RH) is the activity coefficient of component i in the mixture as a function of relative humidity (RH), lnγ o i (RH) the binary activity coefficient of component i in water at a given RH, x i the equivalent mole fraction of component i.As described by Topping et al. (2009), the interaction terms implicitly account for any effects of partial dissociation of the HSO 4 -ion.Since the dissociation of organics cannot be modelled with any certainty (Clegg and Seinfeld, 2006), only interactions between undissociated molecules are considered.For the organic model, interactions are restricted to binary pairs of solutes, thus reducing computational cost.The activity coefficient function then takes on the following general form: Where N represents the total number of solutes, c i and c j represent a specific concentration scale for components i and j .Models such as UNIQUAC (Abrams et al., 1975) and the Wilson (Wilson, 1964) equations also treat binary interactions, the mixture represented by the sum of these pairs.In the residual and combinatorial expressions of the UNIQUAC model, binary pairs are coupled using absolute mole fractions weighted according to molecular surface area and volume parameters: (4) where m represents the total number of compounds, r i and q i are the surface area and volume parameters for components i with associated mole fraction x i .In organic PD-FiTE, choice of concentration scale can be chosen according to the limiting requirements of the numerical framework.The numerical framework of organic PD-FiTE is not based on empirical observations of activity coefficient variation, rather the same parameter fitting methodology is used as the inorganic framework,as detailed in Sect.3. The ability of this new framework to replicate activity coefficients for various concentrations is given in Sect. 4 where the UNIFAC model is used as a benchmark.As with inorganic PD-FiTE, for a one component system (i.e. one organic solute in water), the activity coefficient of the solute must equal the binary activity coefficient in water.This represents the first term in the Taylor series expansion: where we use the symbol f to represent activity coefficients of organic solutes.As inorganic PD-FiTE was based on the empirical observation that the logarithm of solute activity coefficients varies roughly linearly with water activity, thus RH for a bulk solution, Eq. ( 6) would be written explicitly as a function of RH were we able to use the same basis.As mentioned previously, this framework does not use the same empirical basis.For the inorganic model, the range of compounds defining the composition space was relatively small and, using the PD-FiTE fitting methodology, the variation in water activity of the whole system was well constrained.For the organic model, it is likely that the compounds selection will change depending on the application, thus systems studied.Whilst the water activity scale could be used, given the fitting methodology optimizes interaction terms, as new compounds are introduced every model parameter would have to be refit.To mitigate this problem, a different concentration scale is chosen .In Eq. ( 2), the expression x refers to equivalent mole fractions.In the first instance, the Taylor Series expression used for organic PD-FiTE is expressed using mole fractions of components in the multicomponent mixture (including water).For the organic model, binary activity coefficients and interaction are expressed as a function of water mole fraction.The reason for this is twofold.Firstly, this will aid development of coupled inorganicorganic approaches in the future.Whilst interaction terms between inorganic and organic components are not presented here, the effect of dilution by the water associated with the Geosci.Model Dev., 5, 1-13, 2012 www.geosci-model-dev.net/5/1/2012/inorganic fraction can be used to effect the organic solute activity coefficients in a semi-coupled approach.Secondly, the combination of data from binary systems assumes the solvent to be water.This allows future expansion without the need to run a thermodynamic model to derive fits as a function of RH for each binary pair.It is possible that multiple phases exist in aerosol particles in the atmosphere.However, solving this problem is extremely challenging even for ternary systems (Zuend et al., 2011) and currently cannot be generalised to complex mixtures of multiple organic and inorganic solutes.
In the absence of treatment of phase separation or mixed solvent system, taking water as the solvent Eq. ( 6) becomes: and the generic Taylor series expansion is written as: The reliance on data from binary systems is an important feature of any flexible organic activity framework.Whilst the inorganic fraction of aerosol particles is restricted to small range of compounds, the number of organic compounds acting as potential condensates can be very large.With this in mind, the form of the expression encapsulated within the summation can now be chosen.Taking a hypothetical ternary system of organic compounds "A" and "B", starting with a binary solution of component "A" in water, as component "B" is added, the deviation in activity coefficient f A has to be accounted for using an appropriate concentration scale.Equation (8) can be re-written as: To ensure correct limiting behaviour, the difference between the binary activity coefficient lnf o A (x w ) and ternary activity coefficient lnf A (x B ,x w ) can be expressed a function of dry solute mole fractions.As the dry mole fraction of solute "B" approaches zero, the term ∂lnf A ∂x B x w converges to zero.Therefore, for a specific mole fraction of water, Eq. ( 9) is re-written as: where x B is the dry mole fraction of solute "B".As "B" tends to zero, the activity coefficient of "A" converges to the binary activity coefficient at a specific concentration of water.Parameters in inorganic PD-FiTE were optimised using the ADDEM thermodynamic model (Topping et al., 2005).The same parameter optimisation approach is used here.The term x B is re-written as a specific function of where β now represents a scaling factor in lnf A , at a specific concentration of water, as the dry solute mole fraction of "B" changes.Whilst a simple scaling factor that varies linearly with x B could be used, the order of polynomial chosen (e.g.linear, quadratic, cubic, etc) is based on accuracy of fit as compared to UNIFAC predictions as described in Sect. 4. Equation ( 11) is not complete and variability of ∂ lnf A ∂x B as a function of x w is required.As the concentration of component "A" approaches zero, the magnitude of the dependence of β(F (x B )) on the concentration of water increases.Therefore we introduce another variable to Eq. ( 11): where (α(F (x w )) x A →0 ) represents the variation of as a function of water mole fraction as "A" approaches zero, and (β x B ) represents the scaling of (α(F (x w )) x A →0 ) as a function of x B .
We can now write the generalised expression for lnf A as: Where the variables encapsulated within the summation can be expressed as: The rationale behind the above framework is best illustrated using an example.Two compounds were randomly selected within the UNIFAC framework with the following functional groups: (1) CH 3 ×2, CH 2 ×1, OH ×1, COOH ×1 (compound "A"); (2) CH 2 ×1, COOH ×1 (compound "B").Figure 1 shows a surface plot of lnf A (x B ,x w ) − lnf o A (x w ) or lnf A as a function of x B and x w .As the "dry" mole fraction of solute "B" tends to zero, the value of lnf A tends to zero for all concentrations of water.However, as "A" tends to zero, the variability of lnf A varies non-linearly as a function of x w , the magnitude increases as "A" decreases.
In any higher order systems, the assumption is made that the behaviour of each binary pair holds in the mixture.However, the use of total mole fractions and dry mole fractions ensures an appropriate dilution effect is captured.Comparisons between organic PD-FiTE with the UNIFAC model are presented in Sect. 4.
In the following section, the procedure used for constraining both (βx B ) and (α(F (x w )) x A →0 ) is described.

Activity coefficients
Using the same methodology described by Topping et al. (2009), the interaction terms in the model are optimised based on fitting to a more complex scheme, the UNIFAC model (Fredenslund et al., 1975).Parameters (β B,A (x B ) x w ) and (α B,A (x w ) x A →0 ) are expressed using polynomials as a function of x B and x w respectively.The required level of complexity, or order of the polynomials, are dictated by setting the tolerance on two independent statistical variables, automating the whole process.The order of fitting to both parameters is important.First, the polynomials for (α B,A (x w ) x A →0 ) are determined by setting the dry mole fraction of "A", x A to a negligible amount (0.0001).Following this, the mole fraction of water is varied and the x w axes in Fig. 1 is populated.The order of the polynomial is chosen by selecting the best fit between 1st to 8th order polynomials in the MATLAB software package 7.6.0(R2008a).For each polynomial fit, the coefficients are derived through initialisation with a random number generator which is run 1000 times, the fitting routine using the Levenberg Marquardt algorithm.The point at which the order of a polynomial meets the criteria for being chosen is defined as: (1) the pairwise linear correlation coefficient between UNIFAC predictions and the polynomial fit is greater than 0.99 or (2) when there is no significant decrease in the sum of the square of the residuals as the degree of polynomial is increased.The cutoff value used for the second criteria was set to 10 −4 .
Following this, the polynomial for (β B,A (x B ) x w ) was derived by choosing a slice from the surface displayed in Fig. 1 as a function of x B .The value of x w chosen to define the location of this slice is that which corresponds to the maximum value of (α B,A (x w ) x A →0 ).Subsequent values of lnf A as a function of x B are then normalised to this value to give us the scaling factors used to derive (β B,A (x B ) x w ).
The order of the polynomial chosen is defined using the same procedure outlined above.Using this approach it is possible to use less computationally expensive polynomials, whilst retaining accuracy, as compared to pre-defined polynomial orders for each case (e.g. a 5th order polynomial for each value of (β B,A (x B ) x w ) and (α B,A (x w ) x A →0 )).For example, each variable might be adequately represented by a cubic and quadratic expression.On the other hand, it is possible to pre-define the order of each polynomial to further aid computational performance whilst accepting a set decrease in overall accuracy.For example, it may be desirable to restrict the order of (β B,A (x i ) x w ) to a linear expression such     that the direction of lnf A is at least captured.In applying the framework to a specific set of compounds in Sect.4, the automated procedure is used.For binary activity coefficients lnf o A (x w ), the same automated procedure is used.

Calculating water content
For calculating water content, thus mole fractions in solution, the ZSR mixing rule is used (Zdanovskii, 1936).This method has been reviewed extensively in the literature and therefore not analysed here.Using ZSR, the water content associated with each compound at a specific relative humidity is added together to calculate the water content of the mixture: where the individual water contents are fit to UNIFAC for a range of water activities (0 to 0.99).The mole fraction of water associated with each solute x w i (RH) is fit to the UNI-FAC model to ensure a well behaved polynomial fit (rather than absolute water content).The water content associated with each solute can then be calculated using the following expression: where n i is the number of moles of solute i.

Benchmarking PD-FiTE against UNIFAC
In the following section Eq. ( 13) is applied to a specific set of compounds and benchmarked against the UNIFAC model.Barley et al. (2011) recently presented the sensitivity of predicted aerosol mass and chemical signatures, such as Oxygen:Carbon ratio and average molecular weight, to choice of predictive technique used within absorptive partitioning calculations.In that study, a gas phase degradation mechanism, the Master Chemical Mechanism (MCM), was used to simulate the gas phase abundance of 2700 compounds for various anthropogenic and biogenic scenarios (Jenkin et al., 1997).Here we use the same simulations and average the contribution of individual components to the predicted condensed phase abundance, across all conditions, in www.geosci-model-dev.net/5/1/2012/Geosci.Model Dev., 5, 1-13, 2012   order to extract a subset of compounds to be used as an example on the applicability of PD-FiTE.The resulting compounds defined in that study are listed in Table 1 along with their SMILES string, compound identification number as defined within the MCM and chemical structure.Specific details regarding the selection methodology are presented in Appendix A. The simplified molecular input line entry specification or SMILES is a specification for unambiguously describing the structure of chemical molecules using short ASCII strings.Each compound SMILES representation was translated into appropriate UNIFAC functional groups using the PyBel toolbox (O'Boyle et al., 2008) and a complete set of SMARTS to represent the Hansen et al. (1991) UNI-FAC matrix.For a more detailed discussion of SMILES and SMARTS, the reader is referred to the Daylight Chemical Information systems webpage (www.daylight.com).To illustrate the use of Eq. ( 13), the polynomials used to calculate the binary activity coefficient and water content of each compound are listed in Tables 2 and 3 respectively.An example subset of the polynomials used to calculate interactions between each solute pair are given in Appendix B.
To illustrate the behaviour of compounds listed in Table 1, Figs. 2 and 3 show surface plots of lnf 1 as a function of x B and x w for two separate ternary mixtures with compounds "3" and "4".As the 'dry' mole fraction of solute "B", or compounds "3" and "4" approach zero, the value of lnf 1 tends to zero for all concentrations of water.Whilst the presence of solute "3" always decreases lnf 1 relative to a binary solution of only solute "1" in water, the presence of solute "4" can increase or decrease lnf 1 depending on the concentration of water.As described in Sect.2, each surface is used to fit the interaction parameters (β B,A (x B ) x w ) and (α B,A (x w ) x A →0 ) between each binary pair of solutes.
www.geosci-model-dev.net/5/1/2012/Geosci.Model Dev., 5, 1-13, 2012   To benchmark PD-FiTE, the original UNIFAC model is used with the updated parameters of Peng et al. (2001).The version of PD-FiTE defined by Eq. ( 13) is hereafter referred to as "PD-FiTE-ternary" as it account for interactions between binary pairs of solutes in water.Comparisons are also made with PD-FiTE assuming that all solute-solute interactions can be set to zero such that the summation term in Eq. ( 13) is not used.This variant shall hereafter be referred to "PD-FiTE-binary" as it represents individual solutes in wa-Table 4. Average percentage deviations between PD-FiTE and UNIFAC.Statistics averaged across 100 random initializations at ten relative humidities (10, 20, 30, 40, 50, 60, 70, 80, 90, 99 %)  ter.An analysis of the accuracy of both approaches is important since "PD-FiTE-binary" is less computational expensive than "PD-FiTE-ternary" as discussed in Sect. 5. Pre-defining concentrations of inorganic compounds to benchmark the model for a range of typical environments is relatively easy, (Zaveri et al., 2005).For organic compounds, however, this is less clear.For this reason, in this study, concentrations of each compound in Table 1 were randomly generated using uniformly distributed pseudorandom numbers in the MAT-LAB software package 7.6.0(R2008a).In total, 100 relative Table 7. Polynomials representing interactions between solute "A" (compound '1' in Table 1) and solutes "B" (compounds 3, 4 and 5 in Table 1), for a specific concentration of water as "B" changes; 4) D( 5) D( 6 concentrations for each compound were derived using the Mersenne Twister algorithm.For each concentration the relative humidity was varied between 10 to 99 %, resulting in a different water content calculated using the ZSR rule.For comparisons with the UNIFAC model the same water content was used.Some overall statistics are discussed before individual compound analysis is given.Figure 4 displays scatter plots for both the binary and ternary version of PD-FiTE versus UNIFAC for four different relative humidities (10, 30, 60 and 90 %).Taking the 10 % RH case, ternary PD-FiTE clearly correlates better with UNIFAC than binary PD-FiTE, as confirmed by the Pearson squared correlation coefficients (0.96 and 0.40, respectively).The performance of binary PD-FiTE decreases as the activity coefficients increase.This is to be expected as the highly non-linear influence of multiple solutes is not captured within the binary framework.As the relative humidity increases the correlation coefficient of both variants improves.Figure 5 plots the ratio of predictions from both binary and ternary PD-FiTE as compared to UNIFAC across 100 sets of simulations at 10 different relative humidities.The ternary model is clearly more accurate than the binary variant, with median values always within ±20 % across all relative humidity ranges.Predictions from binary PD-FiTE lie across a much broader distribution with deviations as low as −40 % and as high as +70 % in the inter-quartile range.Table 4 provides average deviations for each individual compound across all simulations and relative humidities.Whilst some detail provided by Figs. 4 and 5 is lost within the average statistics, average percentage deviations from ternary PD-FiTE are smaller than those from binary PD-FiTE.For example, for compound 7 used in this study, and listed in Table 4, the percentage deviation on changing from the ternary to the binary model can increase from 2.2 % to −73.3 %.Multiple outliers have influenced the derivation of mean values which should be combined with the distributions presented in Fig. 5. Overall, binary PD-FiTE is statistically less accurate than ternary PD-FiTE with total average deviations of −16.12 % and 3.8 %, respectively.

Computational performance
Figure 6 displays the cumulative CPU time in seconds after each set of 100 calls to both PD-FiTE (binary and ternary) and UNIFAC for the 100 sets of simulations described in Sect. 4. Whilst a comparison of the number of floating point operations might be advantageous, this is difficult to attribute to an iterative approach using UNIFAC.There are two lines representing UNIFAC: the first represents the use of water contents calculated using the ZSR approach within PD-FiTE, the second represents an iterative solution for both water contents and activity coefficients.No code was parallelised for this comparison.All routines were written in the Matlab software package which was run on a windows XP machine (2.66 Ghz Intel dual core, 2GB RAM).In each case, the fixed parameters within the model were saved to memory at the first call.Given the number of permutations in ternary PD-FiTE this significantly reduces run time.The binary version of PD-FiTE is the most efficient method, as to be expected.The order of the polynomials used to represent interactions within the ternary PD-FiTE model are optimised during the fitting process.Therefore it is difficult to directly relate the generic increase in computational cost of the ternary method over the binary method.For the compounds analysed in this study, ternary PD-FiTE is less efficient than the binary variant by a factor of ≈6, as indicated in Fig. 6.However, both ternary and binary PD-FiTE are up to a factor of ≈12 and ≈66 times more efficient than calling the UNIFAC model using the same water content, and ≈310 and ≈1800 times more efficient than an iterative model using UNIFAC.

Dynamical testcases
As a demonstration of the usefulness of PD-FiTE in describing the role of variation of particle composition in the dynamical evolution of an aerosol population, the code has been incorporated into a model describing the explicit disequilibrium mass transfer of semivolatile species to a developing aerosol size distribution (Microphysical Aerosol Numerical model Incorporating Chemistry, Lowe et al., 2009), using the same example compounds defined in Sect. 4. For illustration and clarity of explanation, no gas-or condensed-phase reactions are considered.The gas-phase mixing ratios of the semivolatile species are all initialised at zero, increased sinusoidally to their maximum values over a period of 12 h, then decreased again to zero over another 12 h.Uptake of www.geosci-model-dev.net/5/1/2012/Geosci.Model Dev., 5, 1-13, 2012  Geosci.Model Dev., 5, 1-13, 2012 www.geosci-model-dev.net/5/1/2012/semivolatile species to the condensed-phase does not deplete the gas-phase mixing ratios.The maximum mixing ratios, taken from an online box model simulation using the MCM are given in Table 5 .
The condensed phase consists of 1 aerosol distribution, comprised of 64 individual size-sections.Particle growth across the size-sections within each mode is dealt with using the Moving Centre method.The aerosol distribution is initialised with 2 log-normal modes, each with a representative involatile core (hereafter referred to as compounds 14 and 15, used to represent primary and heavily oygenated organic aerosol).The first log-normal mode has a mode radius of 0.2 µm, width of 1.7, and total particle number of 10 cm −3 air ; the second has a mode radius of 0.02 µm, width of 2.0, and total particle number of 2000 cm −3 air .The total dry aerosol mass is 3.5 µg m −3 air , temperature is 285.15K, and relative humidity is 75 %.These compounds are used purely as a way to initialise an existing aerosol distribution, the functionality of compound 15, was assumed to be that reported for fulvic acid (Topping et al., 2005).
The testcase was run twice: (1) assuming ideality, and (2) using PD-FiTE to calculate activity coefficients over 24 h (model time).Semivolatile partitioning is driven by the difference between atmospheric partial pressure and vapour pressure of the species over the condensed-phase.The sizeresolved vapour pressure differences for each of the 13 semivolatile species, as well as the size-resolved dry mole fractions for each of the 15 condensed-phase species are provided as supplementary material; for brevity, however, we will only examine the behaviour of two compounds: Compound "8" (Fig. 7); and Compound "12" (Fig. 8) which exhibit positive and negative deviations from ideality respectively.In Figs. 7 and 8 the coloured surface plots show the differences between partial pressure and vapour pressure across the size ranges of the two aerosol modes for each semivolatile gas species (plotted as the difference in atmospheres; red indicates higher partial pressures; blue indicates higher vapour pressures).The black lines show the absolute gas phase concentrations of each semivolatile gas species during the model run.
Compound "12" is the dominate condensed-phase semivolatile species in this example (Compound "1" has a higher partial pressure, but is also more volatile, and so does not contribute as much to the condensed phase), with a dry mole fraction close to 0.35 at 12 h across the majority of the aerosol distribution in the ideal testcase (Fig. 7b).Pressure differences are highest over the largest particles (Fig. 7a); with diffusion controlled condensation rates these large particles never achieve equilibrium with the gas-phase.The introduction of non-ideality greatly increases the volatility of compound "12", decreasing the vapour pressures over all particles sizes (Fig. 7c) and reducing the dry mole fraction maxima to ≈0.1 (Fig. 7d).
Compound "5" has a lower abundance than compound "12", but is also less volatile, and so reaches a dry mole fraction of ≈0.025 at 12 h in the ideal testcase (Fig. 8b).The lower volatility causes greater pressure differences over more of the particle size distribution (Fig. 8a), leading to the mole fraction in the smaller particles to continue increasing (leading to greater composition gradients across the particle size-range).Non-ideality greatly decreases the volatility of compound "12", increasing the vapour pressure differences (Fig. 8c), which leads to an increase in the dry mole fraction maxima to ≈0.065 at 12 h (Fig. 8d).
This simulation demonstrates the ability of PD-FiTE to be used stably in dynamical simulations of multicomponent aerosol evolution in a changing gaseous environment.Whilst it is difficult to derive generalised conclusions to one example dynamical testcase, it is interesting to observe both the positive and negative deviations from ideality resulting from interactions taking place in solution.This was also found by Barley et al. (2011) who performed a much broader assessment of the sensitivity to non-ideality within an equilibrium framework.This demonstrates the potential inaccuracy in prescribing fixed negative or positive deviations from ideality when modelling the evolving chemical composition of aerosol particles.

Conclusions and future work
A simple, flexible mixing rule is presented which allows the calculation of activity coefficients, thus equilibrium vapour pressures, of organic condensates above a multi-component aqueous solution.Based on the same fitting methodology as a previously published inorganic model (PD-FiTE), organic PD-FiTE treats interactions between binary pairs of solutes with variable sets of polynomials.Using 13 compounds extracted from an example gas phase degradation mechanism, the framework is benchmarked against the UNIFAC model.For 1000 randomly derived concentration ranges and 10 relative humidities between 10 and 99 %, the average deviation was calculated to be 3.8 %.Whilst compound specific deviations did vary, the median and inter-quartile values across all relative humidity range always fell within ±20 %.Comparisons were also made with organic PD-FiTE by assuming interactions between solutes can be set to zero.Predictions from this approach lie across a much broader distribution with deviations as low as −40 % and as high as +70 % in the inter-quartile range, with an average deviation of −16.1 %.The computational cost of both variants was compared to the use of UNIFAC for the same amounts of water.Both the fully coupled and uncoupled organic PD-FiTE are up to a factor of ≈12 and ≈66 times more efficient than calling the UNIFAC model using the same water content respectively, and ≈310 and ≈1800 times more efficient than an iterative model using UNIFAC.All of the above comparisons assume that the UNIFAC model is accurate for all compounds studied here.

Fig. 1 . 30 Fig. 1 .
Fig. 1.Difference in activity coefficient of compound "A" as a function of x B and x w .The colour-scale is bound by red (low values) to purple (high values) figure

Fig. 2 .Fig. 2 .
Fig. 2. Difference in activity coefficient of compound 1 as a function of x 3 and x w .The compound numbers and SMILES strings are listed in Table 1.The colour-scale is bound by red (low values) to purple (high values) Fig. 2. Difference in activity coefficient of compound 1 as a function of x 3 and x w .The compound numbers and SMILES strings are listed in Table 1.The colour-scale is bound by red (low values) to purple (high values).

Fig. 3 .Fig. 3 .
Fig. 3. Difference in activity coefficient of compound 1 as a function of x 4 and x w .The compound numbers and SMILES strings are listed inTable 1.The colour-scale is bound by red (low values) to purple (high values)

Fig. 4 .
Fig. 4. Comparing binary/ternary PD-FiTE and UNIFAC for a random specific testcase.Each subplot represents a specific RH.Linear correlation coefficients are given in each subplot title.'Rb' and 'Rt' are the correlation coefficients for the binary and ternary model respectively.For this specific example, the compounds with the largest deviations from the binary model are 3,4,7,8,9 and 10.The compound numbers and SMILES strings are listed inTable 1 and average percentage deviations in Table 4. 33

Fig. 4 .
Fig. 4.Comparing binary/ternary PD-FiTE and UNIFAC for a random specific testcase.Each subplot represents a specific RH.Linear correlation coefficients are given in each subplot title.'Rb' and 'Rt' are the correlation coefficients for the binary and ternary model, respectively.For this specific example, the compounds with the largest deviations from the binary model are 3, 4, 7, 8, 9 and 10.The compound numbers and SMILES strings are listed in Table1and average percentage deviations in Table4.

Fig. 5 .Fig. 5 .
Fig. 5. Ratio of activity coefficients predicted by PD-FiTE to UNIFAC as a function of RH.Blue boxes correspond to PD-FiTE binary and cyan to PD-FiTE ternary.Using the MATLAB R2008a software package, the box represents the upper/lower quartile range with the median highlighted inbetween.The whiskers represent values within the 10 and 90 percentile boundaries, outliers representing any values greater than 1.5 times the inter-quartile range.

Fig. 6 .
Fig.6.Cumulative CPU time for each 100'th call of each activity coefficient model.Two variations of the UNIFAC model correspond to using the same water content as calculated in PD-FiTE (using ZSR "W ZSR " and using an iterative solver).

Fig. 6 .
Fig.6.Cumulative CPU time for each 100'th call of each activity coefficient model.Two variations of the UNIFAC model correspond to using the same water content as calculated in PD-FiTE (using ZSR "W ZSR " and using an iterative solver).

Fig. 7 .Fig. 7 .Fig. 8 .
Fig.7.Evolving condensed phase abundance and gas phase concentration for compound "8" as a function of time, assuming ideality (a and b) and using PD-FiTE (c and d).Panels (a) and (c) show the size-resolved differences between vapour pressure (VP) over aerosol particles and the gas-phase partial pressure (PP) (red indicates higher partial pressures; blue indicates higher vapor pressures), as well as the temporal variation in the partial pressure of compound "8" (black line).Panels (b) and (d) show the size-resolved dry mole fractions of compound "8" within the condensed-phase.

Fig. 8 .
Fig.8.Evolving condensed phase abundance and gas phase concentration for compound "12" as a function of time, assuming ideality (a and b) and using PD-FiTE (c and d).Panels (a) and (c) show the size-resolved differences between vapour pressure (VP) over aerosol particles and the gas-phase partial pressure (PP) (red indicates higher partial pressures; blue indicates higher vapor pressures), as well as the temporal variation in the partial pressure of compound "12" (black line).Panels (b) and (d) show the size-resolved dry mole fractions of compound "12" within the condensed-phase.
Table 1 and average percentage deviations in Table 4.
Table 1 and average percentage deviations in Table 4.
ation for fitting as presented within the Master Chemical Mecha-

Table 2 .
Polynomials for binary activity coefficient: lnγ o

Table 5 .
, providing 1000 datapoints.Mixing ratios (ppt) of each compound used within the dynamical simulation.