New-particle formation from condensable gases is a common atmospheric process that has significant but uncertain effects on aerosol particle number concentrations and aerosol–cloud–climate interactions. Assessing the formation rates of nanometer-sized particles from different vapors is an active field of research within atmospheric sciences, with new data being constantly produced by molecular modeling and experimental studies. Such data can be used in large-scale climate and air quality models through parameterizations or lookup tables. Molecular cluster dynamics modeling, ideally benchmarked against measurements when available for the given precursor vapors, provides a straightforward means to calculate high-resolution formation rate data over wide ranges of atmospheric conditions. Ideally, the incorporation of such data should be easy, efficient and flexible in the sense that the same tools can be conveniently applied to different data sets in which the formation rate depends on different parameters. In this work, we present a tool to generate and interpolate lookup tables of formation rates for user-defined input parameters. The table generator primarily applies cluster dynamics modeling to calculate formation rates from an input quantum chemistry data set defined by the user, but the interpolator may also be used for tables generated by other models or data sources. The interpolation routine uses a multivariate interpolation algorithm, which is applicable to different numbers of independent parameters, and gives fast and accurate results with typical interpolation errors of up to a few percent. These routines facilitate the implementation and testing of different aerosol formation rate predictions in large-scale models, allowing the straightforward inclusion of new or updated data without the need to apply separate parameterizations or routines for different data sets that involve different chemical compounds or other parameters.

Formation of secondary aerosol particles from condensable gases is a well-known and ubiquitous phenomenon in Earth’s atmosphere

The combination of quantum chemistry and molecular cluster dynamics simulations is the state-of-the-art method to calculate theoretical particle formation rates

New quantum chemical data, calculated for different chemical species

Deriving parameterizations quickly becomes cumbersome as the number of independent parameters increases: finding parameterization formulas and coefficients that reproduce the formation rate data with reasonable accuracy throughout the parameter space is virtually impossible for arbitrary chemical systems. Furthermore, evaluating very complex formulas within a computationally heavy large-scale model may not be optimal. The benefit of lookup tables is that values determined from a table of sufficient resolution are guaranteed to be close to the original data, and no pre-processing of the formation rate data is needed. The use of such tables, on the other hand, requires multivariate interpolation algorithms that should ideally be applicable to tables of arbitrary dimensions, with no need for manual changes depending on the number of independent parameters. While interpolated values must be sufficiently accurate, the interpolation routine should also be computationally efficient and preferably simple.

In this work, we introduce a tool to incorporate molecular modeling data in atmospheric models by flexible routines to generate and interpolate formation rate lookup tables. With this approach, we aim to cover the following aspects:

flexible implementation of state-of-the-art molecular modeling results in large-scale models;

inclusion of arbitrary chemical compounds;

efficient multivariate interpolation; and

user-friendly routines with no need for modifications depending on, for instance, included chemical compounds.

The proposed lookup table approach comprises two components: a generator routine to create tables and an interpolation routine to be implemented in an atmospheric model. We refer to the routines as J-GAIN (Formation rate (

Flow chart illustrating the generation and application of particle formation rate tables by J-GAIN. The boxes summarize the steps for using the two parts: (1) table generation with automatized calculation of formation rates by cluster dynamics modeling and (2) implementation of tables and the table interpolator in a host model. User-defined input and output are specified outside the boxes.

By default, the table generator takes molecular cluster thermochemistry data for the given chemical compounds as input and calculates the formation rates by molecular cluster dynamics simulations through coupling to the open-source cluster model ACDC (Atmospheric Cluster Dynamics Code;

In addition to the molecular cluster data, the table generator requires user-defined input for the ranges of the parameters that define the ambient conditions. These parameters include the concentrations of the vapor compounds, temperature (

The user sets the range of each parameter by giving the lower and upper limits and the number of values to be placed within the limits at even intervals. Parameters can be defined as “logarithmic”, in which case the data points are placed evenly on a logarithmic scale. This is relevant, for example, for vapor concentrations. The formation rate table is then generated by running the generator, which calls ACDC to obtain the formation rate
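The placement of parameter values described above can be illustrated with a minimal Python sketch. It is not the J-GAIN generator itself: the function name `parameter_grid` and its arguments are hypothetical, and the sketch assumes that the lower and upper limits are themselves included as grid points.

```python
import math


def parameter_grid(lower, upper, n_values, logarithmic=False):
    """Return n_values points from lower to upper (both included, by assumption).

    Points are evenly spaced on a linear scale, or on a log10 scale
    when logarithmic=True (e.g., for vapor concentrations).
    """
    if n_values < 1:
        raise ValueError("n_values must be at least 1")
    if n_values == 1:
        return [float(lower)]
    if logarithmic:
        if lower <= 0 or upper <= 0:
            raise ValueError("logarithmic parameters must be positive")
        log_lo, log_hi = math.log10(lower), math.log10(upper)
        step = (log_hi - log_lo) / (n_values - 1)
        return [10.0 ** (log_lo + i * step) for i in range(n_values)]
    step = (upper - lower) / (n_values - 1)
    return [lower + i * step for i in range(n_values)]


# Illustrative values only: a temperature axis and a vapor concentration axis.
temperatures = parameter_grid(200.0, 300.0, 3)
concentrations = parameter_grid(1e5, 1e9, 5, logarithmic=True)
```

With these illustrative limits, the concentration axis has one point per decade, whereas a linear axis over the same range would leave the lowest decades almost unresolved.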

The tables are written as binary files, and a descriptor file is generated together with each table. The latter contains the essential information on the table, including the names and units of the independent parameters, the lower and upper limits of the parameter values, and the numbers of values along each dimension. To ensure sufficient accuracy, several tables can be generated at different resolutions. Possible errors in interpolated values can then be assessed by interpolating a coarser-resolution table on the grid of a higher-resolution reference table (Sect.

The interpolation routine uses the descriptor file to obtain the number and identities of the independent parameters for the corresponding lookup table. After loading the table, the routine determines the formation rate by linear multivariate interpolation. In general, for an

Treatment of input values that are outside the ranges covered by the table is parameter-dependent and defined by the user. For each parameter, the following options are included in the example routines.

Value below/above the table limits

Value below the lower limit

In the code repository, we provide an example of simple interfaces to load and interpolate lookup tables within a host model. The input parameters of the interpolation subroutine include all independent parameters that the particle formation rate may depend on, and the total formation rate is returned as output. Importantly, the interpolator is not limited to using a single table: separate particle formation pathways, corresponding to different chemical compounds, can be incorporated as separate tables. If more than one table is used, the interpolator is applied separately to each table, and the total formation rate can be obtained as the sum of the individual formation rates.
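As a rough illustration of applying the interpolator separately to each table and summing the results, the following Python sketch implements plain multilinear interpolation on a regular grid of any dimension. It is not the interface provided in the repository: the names `interpolate` and `total_formation_rate`, the dictionary layout of a table, and the row-major storage order are all assumptions made for this example, and input values are assumed to lie within the table limits.

```python
import bisect
import itertools


def interpolate(axes, values, point):
    """Multilinear interpolation on a regular grid of arbitrary dimension.

    axes:   list of ascending coordinate lists, one per independent
            parameter (each with at least two points)
    values: flat list of tabulated rates in row-major order
            (last axis varies fastest; an assumption of this sketch)
    point:  one coordinate per axis, assumed to lie within the table limits
    """
    lower_idx, fractions = [], []
    for axis, x in zip(axes, point):
        # Index of the grid cell containing x, clamped to a valid cell.
        i = min(max(bisect.bisect_right(axis, x) - 1, 0), len(axis) - 2)
        lower_idx.append(i)
        fractions.append((x - axis[i]) / (axis[i + 1] - axis[i]))

    # Row-major strides for locating a grid node in the flat list.
    strides = [1] * len(axes)
    for d in range(len(axes) - 2, -1, -1):
        strides[d] = strides[d + 1] * len(axes[d + 1])

    # Weighted sum over the 2**n corners of the enclosing cell.
    result = 0.0
    for corner in itertools.product((0, 1), repeat=len(axes)):
        weight, flat = 1.0, 0
        for d, c in enumerate(corner):
            weight *= fractions[d] if c else 1.0 - fractions[d]
            flat += (lower_idx[d] + c) * strides[d]
        result += weight * values[flat]
    return result


def total_formation_rate(tables, conditions):
    """Sum the rates interpolated separately from each table."""
    return sum(
        interpolate(t["axes"], t["values"], [conditions[n] for n in t["names"]])
        for t in tables
    )


# Two small made-up tables that depend on different parameters.
table_a = {"names": ["x", "y"], "axes": [[0.0, 1.0, 2.0], [0.0, 10.0]],
           "values": [0.0, 30.0, 2.0, 32.0, 4.0, 34.0]}
table_b = {"names": ["x"], "axes": [[0.0, 4.0]], "values": [0.0, 4.0]}
rate = total_formation_rate([table_a, table_b], {"x": 0.5, "y": 5.0})
```

Because the loop runs over the 2^n corners of the enclosing grid cell, the same function handles any number of independent parameters without modification, at a cost that grows with each added dimension. For parameters defined as logarithmic, the axes and query point could be passed as logarithms; whether this matches the choice made in J-GAIN is not assumed here.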

The repository includes a simple example of summing the rates interpolated from two separate tables. However, the user may construct different ways to treat several tables according to their needs and data availability. To give an example, a possible practical application could be as follows: separate tables are used for parallel formation mechanisms, for example, inorganic

Schematic presentation of treatment of several tables: an example of a possible table combination that the user may construct according to their needs.

We demonstrate the application and performance of the J-GAIN table generator and interpolator using previously published quantum chemical data for sulfuric acid and ammonia

We apply the

Figure

Particle formation rate

Relative error in interpolated particle formation rate

In order to demonstrate the application of the interpolator for interpolation over all independent parameters at realistic ambient conditions, corresponding to implementation of the routine in an atmospheric model,

Particle formation rate

In general, such errors are very small compared to typical uncertainty estimates for formation rate data: uncertainties in both theoretical and experimental

In order to assess the interpolation speed with respect to the numbers of data points and independent parameters, we apply arbitrary test tables of different sizes and dimensions. Figure

The performance is primarily affected by the number of dimensions: the addition of a new dimension may increase the run time by up to a factor of

The approximate run times of Fig.

Time required to perform 1 million lookups for tables of different sizes. The figure shows different combinations

For applications with, e.g., higher spatial resolution or shorter time steps, the performance of the formation rate routine can be optimized by reasonable choices of table dimensions and size. For example, for very strongly clustering chemical systems, the presence of atmospheric ions has only minor effects, and the IPR parameter could thus be discarded in the interest of speed. In addition to the resolution of the independent parameter values (i.e., the absolute intervals), their ranges (minimum and maximum values) can also be chosen considering the modeled environment, so that redundant values are avoided. If the numbers and ranges of parameters cannot be optimized further, very large tables can be divided into separate subtables that cover different sets of ambient conditions and are selected within the host model application based on the input conditions.

It can also be noted that, given the high accuracy of the interpolated values, interpolation from pre-generated tables is far faster than calculating the formation rates with the molecular cluster model. For example, generating the table of

It must be noted that incorporating aerosol formation rates in an atmospheric model involves certain limitations. These limitations are independent of the data source or implementation method and apply equally to lookup tables, parameterizations or other approaches. An obvious restriction is the availability of tracers in the host model. While various chemical compounds may contribute to atmospheric particle formation, including large numbers of new species in a chemical transport model may be cumbersome. In addition, sources of individual chemical species, such as different types of amines or organic acids, may not be well quantified.

Here, we apply particle formation rates from

A simplified option to include rough source–sink dynamics for species that can be assumed to have common sources and similar properties with respect to gas-phase chemistry and gas-to-particle partitioning is to implement a lumped or representative trace compound. For example, monoamines with similar properties – namely di- and trimethylamines – have been approximated as a single representative alkylamine species, the emissions of which are scaled from ammonia emissions by an assumed amine-to-ammonia ratio due to their common sources

In addition, it can be noted that the standard approach to implement formation rates involves certain approximations related to molecular cluster kinetics. Namely, formation rates are assumed to be determined solely by ambient conditions, applying the steady-state approximation for the cluster population and omitting time-dependent vapor–cluster–aerosol kinetics

Adequate representation of aerosol particle formation from vapors in atmospheric models is needed for assessing the climate and health effects of aerosols. The increasing amount of available computational molecular cluster chemistry data enables calculation of new-particle formation rates for different chemical compounds and computational chemistry methods. As formation rates are typically complicated functions of ambient conditions, a practical approach to apply them in an atmospheric model framework is through lookup tables. Here, we provide a tool to generate and interpolate formation rate tables, applicable to arbitrary sets of chemical species, for conveniently incorporating theoretical particle formation rate data in large-scale models.

Tests conducted using data for

For J-GAIN, formation rate tables can be created (1) by the provided table generator when applying the molecular cluster dynamics modeling approach or (2) by saving tabulated formation rates from another data source in a similar table. The details of the formats are listed below, and detailed instructions for the table generator are available in the J-GAIN repository

For the J-GAIN table generator, the molecular cluster input data are given in the format of the ACDC model. Briefly, the set of input files includes

molecular compositions of the clusters

cluster formation free energies as enthalpies and entropies, and

cluster dipole moments and polarizabilities if charged species are also included.

For other data sources, the data need to be processed into the table format applied by the J-GAIN interpolator, including two files:

a binary file that contains the formation rate array and

a descriptor file that lists the independent parameters and their ranges.

Order of loops over the values of the independent parameters for saving a lookup table in the J-GAIN format (see Appendix A for details).

…

Save

…

In Algorithm A1,

Table

Parameter value as a function of time

Time profiles of the independent parameters for the diurnal test case.

J-GAIN is available at

DY designed, wrote and tested the programs and created the figures. TO conceived the project and wrote the manuscript. All authors contributed to discussing the results and revising the manuscript.

The contact author has declared that neither of the authors has any competing interests.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The authors thank Pontus Roldin and Carl Svenhag for discussions and help with testing the tool.

This research has been supported by the Swedish Research Council Vetenskapsrådet (grant no. 2019-04853) and Formas – a Swedish Research Council for Sustainable Development (grant no. 2019-01433).

This paper was edited by Andrea Stenke and reviewed by two anonymous referees.