As our knowledge and understanding of atmospheric aerosol particle evolution and impact grows, designing community mechanistic models requires an ability to capture increasing chemical, physical and therefore numerical complexity. As the landscape of computing software and hardware evolves, it is important to profile the usefulness of emerging platforms in tackling this complexity. Julia is a relatively new programming language that promises computational performance close to that of Fortran, for example, without sacrificing the flexibility offered by languages such as Python. With this in mind, in this paper we present and demonstrate the initial development of a high-performance community mixed-phase atmospheric 0D box model, JlBox, written in Julia.

In JlBox v1.1 we provide the option to simulate the chemical kinetics of a gas phase whilst also providing a fully coupled gas-particle model with dynamic partitioning to a fully moving sectional size distribution, in the first instance. JlBox is built around chemical mechanism files, using existing informatics software to parse chemical structures and relationships from these files and then provide parameters required for mixed-phase simulations. In this study we use mechanisms taken from both a subset of the Master Chemical Mechanism (MCM) and the complete MCM. Exploiting the ability to generate Jacobian matrices by automatic differentiation within Julia, we profile the use of sparse linear solvers and pre-conditioners, whilst also using a range of stiff solvers included within the expanding ODE solver suite the Julia environment provides, including the development of an adjoint model. Case studies range from a single volatile organic compound (VOC) with 305 equations to a “full” complexity MCM mixed-phase simulation with 47 544 variables. Comparison with an existing mixed-phase model shows significant improvements in performance for multi-phase and mixed VOC simulations and potential for developments in a number of areas.

Mechanistic models of atmospheric aerosol particles are designed, primarily, as a facility for quantifying the impact of processes and chemical complexity on their physical and chemical evolution. Depending on how aligned these models are with the state of the science, they have been used for validating or generating reduced-complexity schemes for use in regional to global models

With ongoing investments in atmospheric aerosol monitoring technologies, the research community continues to hypothesise and identify new processes and molecular species deemed important to improve our understanding of their impacts. This continually expanding knowledge base of processes and compounds, however, likewise requires us to expand our aerosol modelling frameworks to capture this increased complexity. It also raises an important question about the appropriate design of community-driven process models that can adapt to such increases in complexity, and also about how we ensure our models exploit emerging computational platforms, if appropriate.

In this paper we present a new community atmospheric 0D box model, JlBox, written in Julia. Whilst the first version of JlBox, v1.1, has the same structure and automatic model generation approach as PyBox

In the following sections we describe the components included within the first version of JlBox, JlBox v1.1. In Sect.

The gas-phase reaction of chemicals in the atmosphere follows the gas kinetics equation:

As we have to keep track of the concentration of every compound in every size bin, this significantly increases the complexity of the ODE relative to the gas-phase model:

Array layout for ODE states

Sensitivity analysis is useful when we need to investigate how the model behaves when we perturb the model parameters and initial values. One approach is to see how all the outputs change due to one perturbed value by simply subtracting the original outputs from the perturbed outputs, or, in a local sense, by solving an ODE whose RHS is the partial derivative with respect to the respective parameter. However, this approach would be very expensive when we want the sensitivity of a scalar output with respect to all the parameters, as is often the case in data assimilation. The adjoint method can solve this problem efficiently. Imagine there is some scalar function

Solve the ODE Eq. (

Numerically integrate formula Eq. (

JlBox implements the adjoint sensitivity algorithm with the help of an auto-generated Jacobian matrix
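The contrast between perturbation-based and adjoint sensitivities described above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not JlBox code. It assumes a hypothetical two-step linear mechanism (A → B → C with rate coefficients `k[0]` and `k[1]`), a scalar output g equal to the concentration of B at the end of the run, the standard continuous adjoint equation dλ/dt = -Jᵀλ with λ(T) = ∂g/∂y, and the gradient dg/dk = ∫ λᵀ (∂f/∂k) dt. All function names are assumptions made for this example.

```python
def rhs(y, k):
    """Toy gas kinetics: A -> B (rate k[0]), B -> C (rate k[1]); y = [A, B]."""
    a, b = y
    return [-k[0] * a, k[0] * a - k[1] * b]

def jac_y(y, k):
    """Jacobian of rhs with respect to the state y (constant for this linear toy)."""
    return [[-k[0], 0.0], [k[0], -k[1]]]

def jac_k(y, k):
    """Jacobian of rhs with respect to the rate coefficients k."""
    a, b = y
    return [[-a, 0.0], [a, -b]]

def rk4_step(f, y, h):
    """One classical Runge-Kutta step of size h (h may be negative)."""
    k1 = f(y)
    k2 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k1)])
    k3 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k2)])
    k4 = f([yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6.0 * (p + 2 * q + 2 * r + s)
            for yi, p, q, r, s in zip(y, k1, k2, k3, k4)]

def forward(y0, k, t_end, n):
    """Integrate the forward ODE and keep the whole trajectory."""
    h = t_end / n
    traj = [list(y0)]
    for _ in range(n):
        traj.append(rk4_step(lambda y: rhs(y, k), traj[-1], h))
    return traj

def adjoint_sensitivity(y0, k, t_end, n):
    """dg/dk for g = B(t_end): one forward solve plus one backward solve."""
    traj = forward(y0, k, t_end, n)
    h = t_end / n
    lam = [0.0, 1.0]  # lambda(T) = dg/dy for g = B(T)

    def integrand(lam, y):
        dfdk = jac_k(y, k)
        return [sum(lam[i] * dfdk[i][j] for i in range(2)) for j in range(2)]

    grad = [0.0, 0.0]
    prev = integrand(lam, traj[n])
    for step in range(n, 0, -1):
        # Adjoint ODE d(lambda)/dt = -J^T lambda, stepped backwards in time.
        # J is constant here; a nonlinear model would need J along the trajectory.
        def lam_rhs(l, y=traj[step]):
            jac = jac_y(y, k)
            return [-sum(jac[i][j] * l[i] for i in range(2)) for j in range(2)]
        lam = rk4_step(lam_rhs, lam, -h)
        cur = integrand(lam, traj[step - 1])
        # Trapezoidal quadrature of lambda^T (df/dk) over this step.
        grad = [g + 0.5 * h * (p + c) for g, p, c in zip(grad, prev, cur)]
        prev = cur
    return grad

def finite_difference(y0, k, t_end, n, eps=1e-6):
    """Perturbation approach: two extra forward solves per parameter."""
    grad = []
    for j in range(len(k)):
        up, down = list(k), list(k)
        up[j] += eps
        down[j] -= eps
        g_up = forward(y0, up, t_end, n)[-1][1]
        g_down = forward(y0, down, t_end, n)[-1][1]
        grad.append((g_up - g_down) / (2 * eps))
    return grad
```

In this sketch the perturbation approach needs two forward solves per parameter, whereas the adjoint approach needs one forward and one backward solve regardless of the number of parameters, which is the property that makes it attractive for mechanisms with many rate coefficients.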

JlBox is written in pure Julia and is presently only dependent on the UManSysProp Python package for parsing chemical structures into objects for use with fundamental property calculations during a pre-processing stage. The pre-processing stage also includes extracting the rate function, stoichiometry matrix and other parameters from a file that defines the chemical mechanism using the common KPP format, followed by a solution to the self-generated ODEs using implicit ODE solvers. Specifically, the model consists of six parts:

Run a chemical mechanism parser.

Perform rate expression formulation and optimisation.

Perform RHS function formulation.

Create a Jacobian of the RHS function.

Prepare and calculate parameters for the partitioning process.

Perform adjoint sensitivity analysis where required.

Figure

Schematic illustrating the structure of JlBox v1.1, whether in forward or adjoint configuration.

Like PyBox, JlBox builds the required equations to be solved by reading a chemical mechanism file. In the examples provided here, we use mechanisms extracted from the Master Chemical Mechanism (MCM) to build the intended model for simulation. A preview of a mechanism file is given in Listing 1.

Example of the MCM file.

There are two sections in each line of the mechanism file separated by the

Upon reading each set of equations, JlBox will assign unique numbers for reactants and products if encountered for the first time; then it will fill in the stoichiometry matrix
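The bookkeeping described above (numbering species on first encounter and filling a signed stoichiometry matrix) can be sketched as follows. This is an illustrative Python sketch, not the JlBox parser. It assumes a simplified, hypothetical line format `reactants = products : rate expression`, loosely modelled on KPP-style files, and does not reproduce the MCM file syntax. The species names and rate values in the usage example are placeholders.

```python
def parse_side(text):
    """Split 'A + 2 B' into [(name, coefficient), ...]."""
    terms = []
    for term in text.split("+"):
        parts = term.split()
        if not parts:
            continue
        if len(parts) == 2:
            coeff, name = float(parts[0]), parts[1]
        else:
            coeff, name = 1.0, parts[0]
        terms.append((name, coeff))
    return terms

def parse_mechanism(lines):
    """Return (species index map, sparse stoichiometry, rate expression strings).

    The stoichiometry is stored sparsely as {(species index, equation index):
    signed coefficient}, negative for reactants and positive for products.
    """
    species = {}     # name -> unique index, assigned on first encounter
    stoich = {}
    rate_exprs = []

    def index_of(name):
        return species.setdefault(name, len(species))

    for eq, line in enumerate(lines):
        equation, rate = line.split(":", 1)
        lhs, rhs = equation.split("=", 1)
        for name, coeff in parse_side(lhs):
            key = (index_of(name), eq)
            stoich[key] = stoich.get(key, 0.0) - coeff
        for name, coeff in parse_side(rhs):
            key = (index_of(name), eq)
            stoich[key] = stoich.get(key, 0.0) + coeff
        rate_exprs.append(rate.strip())
    return species, stoich, rate_exprs

# Usage with two placeholder equations (hypothetical species and rates).
species, stoich, rate_exprs = parse_mechanism([
    "A + B = C : 1.0e-3",
    "C = 2 D + A : 5.0e-2",
])
```

Storing only the non-zero entries keeps memory proportional to the number of reactants and products rather than to the full species-by-equation matrix.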

The latter part of a line of the chemical mechanism file, after the symbol

The gas–aerosol partitioning process requires additional pre-processing of several parameters of each compound required by the growth equation. These are listed in Eqs. (

When solving the ODE, the RHS function of the gas-phase kinetics firstly updates the non-constant rate coefficients

Following this, the model calculates the rate of change (loss/gain) of reactants and products in each equation and sums the loss/gain of the same species across different equations using the following:

There are two ways to implement this. The first projects the structure to program instructions executed by the RHS function. The second stores it as data and the RHS function loops through the data to calculate the result.

The first method is intended to statically figure out the symbolic expressions of the loss and gain for each species as combinations of rate coefficients and gas concentrations, and to generate the RHS function line by line from the relevant expressions. This method is straightforward and fast, especially for small cases. However, it consumes lots of memory and time for compiling when the mechanism file is large (i.e.

The other approach is to use sparse matrix manipulation because of the sparse structure of the stoichiometry matrix in atmospheric chemical mechanisms. Considering equation numbers as columns, compound numbers as rows and signed stoichiometry (positive for products and negative for reactants) as values, most columns of the stoichiometry matrix have limited (usually
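The second, data-driven approach can be sketched as a loop over the non-zero entries of each column of the stoichiometry matrix. The Python sketch below is illustrative only and assumes a mass-action rate law (rate coefficient multiplied by reactant concentrations raised to their reaction orders). The data layout and the name `gas_rhs` are hypothetical and are not taken from JlBox.

```python
def gas_rhs(y, rate_coeffs, equations):
    """Evaluate dy/dt by looping over sparse per-equation data.

    equations[j] = (reactants, net), where
      reactants = [(species index, reaction order), ...]
      net       = [(species index, signed stoichiometry), ...]
    Only non-zero entries are stored, so the cost scales with the number of
    reactants and products rather than with the full matrix size.
    """
    dydt = [0.0] * len(y)
    for k, (reactants, net) in zip(rate_coeffs, equations):
        rate = k
        for i, order in reactants:
            rate *= y[i] ** order
        for i, nu in net:
            dydt[i] += nu * rate   # loss for reactants, gain for products
    return dydt

# Placeholder mechanism: A + B -> C and C -> 2 D (species order A, B, C, D).
equations = [
    ([(0, 1), (1, 1)], [(0, -1), (1, -1), (2, 1)]),
    ([(2, 1)], [(2, -1), (3, 2)]),
]
dydt = gas_rhs([1.0, 3.0, 4.0, 0.0], [2.0, 0.5], equations)
```

Because the mechanism is held as data, the same compiled function serves any mechanism size, which is consistent with the reduced compilation cost the text attributes to this approach.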

The gas–aerosol partitioning component of JlBox simulates the condensational growth of aerosols in discrete size bins where each particle has the same size. Please note that as we use a fully moving distribution in v1.1, when we further refer to a size bin we retain a discrete representation with no defined limits per bin.

JlBox computes the rate of loss/gain for gas-phase and condensed-phase substances through all size bins. Firstly, for each size bin

The combination of the gas-phase Eq. (

Please note we explicitly simulate the partitioning of water between the gaseous and condensed phase in the same way as every other condensate. We appreciate this may significantly increase the run time of the box model. However, in this instance we wish to retain the explicit nature of the partitioning process before applying any simplifications, as we briefly discuss in Sect.

JlBox uses the

Since all the states in the ODE (Eqs.

The Jacobian matrix of the RHS

JlBox implements an analytical Jacobian function for both gas kinetics and the partitioning process as well as those generated using finite differences and automatic differentiation. Theoretically, an analytical Jacobian is the most accurate and efficient approach but can be laborious to implement due to the nature of the equations involved and is therefore error-prone due to manual input. The finite-difference approximation can have low numerical accuracy and high performance costs due to multiple evaluations of the RHS function, although it is the simplest to implement and is applicable to most functions. Automatic differentiation shares the advantages of both methods mentioned previously; it has the convenience of automatically generating a Jacobian matrix from the Julia-based model, much like the finite-difference method, whilst retaining the accuracy of the analytical solution. Based on the fact that all programs are combinations of primitive instructions, an auto-differentiation library can generate the derivative of a program according to the chain rule and predefined derivatives of primitive instructions. The only limitation is that the RHS function must be fully written in the Julia language, and this dictates any additional work that might be required. JlBox uses the
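The chain-rule idea behind automatic differentiation can be illustrated with a minimal forward-mode sketch based on dual numbers. It is written in Python for illustration and is not the Julia library used by JlBox. The `Dual` class, the toy RHS (A + B → C, C → products, with placeholder rate coefficients) and the helper names are assumptions made for this example. A finite-difference Jacobian is included for comparison.

```python
class Dual:
    """A value paired with its derivative with respect to one seeded input."""
    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(x, 0.0)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)
    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.value - other.value, self.deriv - other.deriv)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)
    __rmul__ = __mul__

    def __neg__(self):
        return Dual(-self.value, -self.deriv)

def rhs(y):
    """Toy RHS; works on floats and on Dual numbers alike."""
    r1 = 2.0 * y[0] * y[1]   # A + B -> C
    r2 = 0.5 * y[2]          # C -> products
    return [-r1, -r1, r1 - r2]

def jacobian_ad(f, y):
    """Forward-mode AD: seed one input at a time and read off one column."""
    n = len(y)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        seeded = [Dual(y[i], 1.0 if i == j else 0.0) for i in range(n)]
        for i, out in enumerate(f(seeded)):
            jac[i][j] = Dual.lift(out).deriv
    return jac

def jacobian_fd(f, y, eps=1e-6):
    """Forward finite differences: one extra RHS evaluation per column."""
    n = len(y)
    base = f(y)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        bumped = list(y)
        bumped[j] += eps
        for i, out in enumerate(f(bumped)):
            jac[i][j] = (out - base[i]) / eps
    return jac
```

For this toy RHS the dual-number Jacobian matches the hand-derived one exactly, while the finite-difference Jacobian carries a truncation error that depends on the chosen step `eps`.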

To improve performance and reduce memory consumption, JlBox has special treatments for computing the Jacobian of mixed-phase RHS. Firstly, the gas kinetic part

As the size of the Jacobian matrices grows quickly (

Following the Kinetic PreProcessor (KPP) and AtChem model approach

In JlBox, the functions for solving

In this section we demonstrate the ability to build and deploy an adjoint model. Using it to quantify sensitivity typically relies on experimental data and processes that will be incorporated in future versions. Nonetheless, the example given in Sect.

Solving the ODE Eq. (

The goal of JlBox is to provide a high-performance mechanistic atmospheric aerosol box model that also retains the flexibility and usability of Python implementations, for example. It should not only match or exceed the performance of other models for a given scenario but also have the capacity to integrate benchmark chemical mechanisms with coupled aerosol process descriptions. In this section we validate the output of JlBox against PyBox since the model process representations are identical, whilst also investigating the relative performance as the “size” of the problem scales.

To test the numerical correctness of JlBox, we ran our model alongside existing box models, including PyBox and KPP, with identical scenarios. JlBox is designed as a more efficient version of PyBox, so it is expected to give identical results in both gas- and mixed-phase scenarios. Meanwhile, gas-phase models constructed from the widely used KPP software provide some assurance that the results from JlBox are reliable. However, aerosol processes are not available in KPP; as a result we could only compare outputs of gas kinetics. We prepared two test scenarios: a gas-phase-only simulation and a multi-phase simulation. The settings of the simulations are listed in Table

Initial conditions and solver configurations.

Figure

Comparison of gas-only

A demonstration of an adjoint sensitivity analysis is conducted to calculate the partial derivative of secondary organic aerosol (SOA) mass at the end of the simulation with respect to the rate coefficients of each equation in the mechanism. The configurations of the simulation are the same as the mixed-phase

The results presented in Table

Sensitivities of SOA mass with respect to gas-phase rate coefficients. The units of the last two columns depend on the number of reactants.

In this section we demonstrate the performance of JlBox on “large-scale” problems that neither KPP nor PyBox can solve due to constraints imposed by the model workflow and language dependencies as shown in Appendix

The elapsed time taken by JlBox is plotted in Fig.

Elapsed time of 72 h mixed-phase simulations. The initial conditions used for each case are listed in Appendix A.

Figure

Time series plot of SOA mass from the same case studies used in profiling total simulation time. In this study, as noted in the text, we use a predictive technique that under-predicts the saturation vapour pressure to create the maximum number of viable condensing products.

Time series plot of size bins for experiment A1 with the high-RH scenario.

JlBox is developed based on the PyBox model

Comparison between JlBox and PyBox.

As for algorithmic advances, the automatic differentiation method for generating Jacobian matrices is not only the most effective addition but also a fundamental one. It is an accurate and convenient way to calculate the Jacobian matrix which only requires an RHS function fully written in Julia. With Jacobian matrices available, the number of RHS evaluations is dramatically reduced since the implicit ODE solver no longer needs to estimate the Jacobian matrix using finite differences. Also, without automatic differentiation, it would not be as easy to build the adjoint model of a fully coupled process model, which explicitly requires the Jacobian matrix for the entire model, let alone to extend the model with more processes. In addition, the adoption of sparse matrices for gas kinetics reduced the compilation cost to a small constant value, enabling JlBox to simulate large-scale mechanisms such as the entire MCM, which for PyBox typically remains limited by memory.

Compared to other models like KPP

There are a number of processes and algorithmic implementations not included in this version of JlBox that would be useful for further use in a scientific capacity. These include coagulation, hybrid sectional methods and auto-oxidation product schemes, to name a few

Quantifying the importance, or not, of process and chemical complexity requires a multifaceted approach. With the proliferation of data-science-driven approaches across most scientific domains,

JlBox will continually grow and we encourage uptake and further developments.

Initial condition for anthropogenic VOC experiments from

Initial condition for biogenic VOC experiments from

In Table

Elapsed time and total allocated memory of the multi-phase APINENE simulation in Sect.

In Table

Performance comparison of PyBox, JlBox and KPP based on elapsed time of forward and adjoint simulation in Sects.

As shown in Fig.

Performance comparison between JlBox and PyBox with different numbers of size bins and mechanisms.

The exact code for JlBox v1.1 used in this paper can be found on Zenodo at

JlBox was written and evaluated by LH. DT provided guidance on comparisons with PyBox, including mapping the same structure to JlBox, and helped in the effective design and sustainability of JlBox.

The authors declare that they have no conflict of interest.

This work was supported by the EPSRC UKCRIC Manchester Urban Observatory (University of Manchester) (grant number: EP/P016782/1). The authors would like to acknowledge the assistance given by Research IT at the University of Manchester. The authors would also like to acknowledge the ETH Zurich Euler cluster for supporting large-scale simulations.

This research has been supported by the Engineering and Physical Sciences Research Council (grant no. EP/P016782/1).

This paper was edited by Sylwester Arabas and reviewed by two anonymous referees.