JlBox v1.0: A Julia based mixed-phase atmospheric chemistry box-model

As our knowledge and understanding of atmospheric aerosol particle evolution and impact grows, designing community mechanistic models requires an ability to capture increasing chemical, physical and therefore numerical complexity. As the landscape of computing software and hardware evolves, it is important to profile the usefulness of emerging platforms in tackling this complexity. Julia is a relatively new programming language that promises computational performance close to 5 that of Fortran, for example, without sacrificing flexibility offered by languages such as Python. With this in mind, in this paper we present and demonstrate the initial development of a high-performance community mixed phase atmospheric 0D box-model, JlBox, written in Julia. In JlBox v1.0 we provide the option to simulate the chemical kinetics of a gas phase whilst also providing a fully coupled gasparticle model with dynamic partitioning to a fully moving sectional size distribution, in the first instance. JlBox is built around 10 chemical mechanism files, using existing informatics software to provide parameters required for mixed phase simulations. In this study we use mechanisms from a subset and the complete Master Chemical Mechanism (MCM). Exploiting the ability to perform automatic differentiation of Jacobian matrices within Julia, we profile the use of sparse linear solvers and preconditioners, whilst also using a range of stiff solvers included within the expanding ODE solver suite the Julia environment provides, including the development of an adjoint model. Case studies range from a single volatile organic compound [VOC] 15 with 305 equations to a ’full’ complexity MCM mixed phase simulation with 47544 variables. Comparison with an existing mixed phase model shows significant improvements in performance and potential for developments in a number of areas.


Introduction
Mechanistic models of atmospheric aerosol particles are designed, primarily, as a facility for quantifying the impact of processes and chemical complexity on their physical and chemical evolution. Depending on how aligned these models are with the 20 state-of-the-science, they have been used for validating or generating reduced complexity schemes for use in regional to global models (Zaveri et al., 2008;Riemer et al., 2009;Amundson et al., 2006;Korhonen et al., 2004;Roldin et al., 2014;Hallquist et al., 2009;Kokkola et al., 2018). This is based on the evaluation that 'full' complexity schemes are too computationally expensive for use in large scale models. With this in mind, the community has developed a spectrum of box-models that focus on to capture process descriptions for use in regional to global models (e.g., Zaveri et al., 2008;Kokkola et al., 2018). Recent studies are also exploring coupling the latter with numerical techniques for reducing systematic errors through assimilation of ambient measurements (e.g., Sherwen et al., 2019).
With ongoing investments in atmospheric aerosol monitoring technologies, the research community continue to hypothesise and identify new processes and molecular species deemed important to improve our understanding of their impacts. This 30 continually expanding knowledge base of processes and compounds, however, presents both numerical and computational challenges on the development of the next generation of mechanistic models. It also raises an important question about appropriate design of community driven process models that can not only adapt to increases in complexity, but how we ensure our platforms exploit emerging computational platforms, if appropriate.
In this paper we present a new community atmospheric 0D box-model, JlBox, written in Julia. Whilst the first version of 35 JlBox, v1.0, has the same structure and automatic model generation approach as PyBox (Topping et al., 2018), we present significant improvements in a number of areas. Julia is a relatively new programming language, created with the understanding that Scientific computing has traditionally required the highest performance, yet domain experts have largely moved to slower dynamic languages for daily work. Julia promises computational performance close to that of Fortran, for example, without sacrificing flexibility offered by languages such as Python. In JlBox v1.0 we evaluate the performance of a one-language 40 driven simulation that still utilises automated property predictions provided by UManSysProp and other informatics suites (Topping et al., 2018). The choice of programming language when building new and sustainable model infrastructures is clearly influenced by multiple factors. These include issues around training, support and computational performance to name a few. Python has seen a persistent increase in use across the sciences, in part driven by the large ecosystem and community driven tools that surrounds it. This was the main factor behind the creation of PyBox. Likewise, in this paper we demonstrate 45 that the growing ecosystem around Julia offers a number of significant computational and numerical benefits to tackle known challenges in creating aerosol models using a one language approach. Specifically, we make use of the ability to perform automatic differentiation of Julia code using tools now available in that ecosystem. In JlBox we demonstrate the usefulness of this capability when coupling particle phase models to a gas phase model where deriving an analytical jacobian might be deemed too difficult. 50 In the following sections we describe the components included within the first version of JlBox, JlBox v1.0. In section 2 we briefly describe the theory on which JlBox is based, including the equations that define implementation of the adjoint sensitivity studies. In section 3 we discuss the code structure, including parsing algorithms for chemical mechanisms, and the use of sparse linear solvers and pre-conditioners, whilst also using a range of stiff solvers included within the expanding ODE solver suite DifferentialEquations.jl. In section 4 we then demonstrate the computational performance of JlBox relative to an 55 existing community gas phase and mixed phase box-models, looking at a range of mechanisms from the Master Chemical Mechanism (Jenkin et al., 1997(Jenkin et al., , 2002. We present JlBox as a platform for a range of future developments, including the addition of in/on aerosol processes currently not captured. It is our hope that the demonstration of Julia specific functionality in this study will facilitate this process.

60
The gas phase reaction of chemicals in atmosphere follows the gas kinetics equation: where [C i ] is the concentration of compound i, r j is the the reaction rate of reaction j , k j is the corresponding reaction rate coefficient and S ij is the value of the stoichiometry matrix for compound i and j. The above Ordinary Differential Equation (ODE), equation (1), fully determines the concentrations of gas phase chemicals at any time given reaction coefficients k j , a stoichiometry matrix {S ij } and initial values. All chemical kinetic demonstrations in this study are provided by the Master Chemical Mechanism (MCM) (Jenkin et al., 1997(Jenkin et al., , 2002, but the parsing scheme allows for any mechanism provided in the standard Kinetics PreProcessor [KPP] format (Damian et al., 2002). When adding aerosol particles to the system, more interactions have to be considered in order to predict the state of the system, including concentrations of components in the gas and particulate phase. In JlBox v1.0 we only consider the gas-aerosol partitioning to a fully moving sectional size distribution, 70 recognising the need to use hybrid sectional methods when including coagulation, (e.g., Kokkola et al., 2018). We discuss these future developments in section 5. We use bulk-absorptive partitioning in v1.0 where gas-to-particle partitioning is dictated by gas phase abundance and equilibrium vapour pressures above ideal droplet solutions. This process is described by the growthdiffusion equation provided by Jacobson (2005) (Pages 543, 549, 557, 560).
is the effective saturation vapor concentration over a curvature surface of size bin k (considering Kelvin effect), D ef f i,k is the effective molecular diffusion coefficient, r k is the size of particles in size bin k, n k the respective number concentration of particles, [C core,k ] is the molar concentration of 80 core, m w,i molecular weight of condensate i, ρ i is liquid phase density, p s i is pure component saturation vapor pressure, D * i is molecular diffusion coefficient, K n,i is Knudsen number, α i is accommodation coefficient, σ is the surface tension of the droplet, R * is the universal gas constant, N A is Avogadro's number, and T is temperature.
As we have to keep track of the concentration of every compound in every size bin, this significantly increases the complexity of the ODE relative to the gas phase model:

85
dy dt = f (y; p), y = (C 1 , C 2 , . . . , C n ; C 1,1 , C 2,1 , . . . , C n,m ) where y represents the states of the ODE, n is number of chemicals, m is number of size bins, p is a vector of parameters of the ODE, and f (y) is the RHS function implicitly defined by equations (1) and (2). For example, the gas phase simulation of a mechanism with n = 800 chemicals has to solve an ODE with 800 states, while the mixed phase simulation with m = 16 size bins will have 13600 (= 800 + 800 × 16) states. Meanwhile, the size of Jacobian matrix (required by implicit ODE solvers) 90 will increase in a quadratic way from 800 × 800 to 13600 × 13600.
Sensitivity analysis is useful when we need to investigate how the model behaves when we perturb the model parameters and initial values. One approach is to see how all the outputs change due to one perturbed value by simply subtracting the original outputs from the perturbed outputs, or, in a local sense, solving an ODE whose RHS is the partial derivative of the respective parameter. However, this approach would be very expensive when we want the sensitivity of a scalar output with respect to 95 all the parameters. This is often the case when doing data assimilation. The adjoint method can efficiently solve thie problem.
Imagine there is some scalar function g(y) and we would like to compute its sensitivity against some parameters p. Introducing the adjoint vector λ(t) with the shape of g(y)'s gradient, the adjoint method could compute this in two steps (Damian et al., 2002): 1. Solve the ODE (6) in a backward order 100 2. Numerically integrate formula (7) JlBox implements the adjoint sensitivity algorithm with the help of an auto-generated Jacobian matrix ∂f (y; p)/∂y. Users only need to supply the gradient function of the scalar function with respect to ODE states ∂g/∂y as well as the Jacobian 105 function ∂f (y; p)/∂p of RHS function with respect to parameters so as to get the sensitivity of the scalar function with respect to parameters ∂g/∂p at time t F . Both are provided automatically through the automatic differentiation provided by Julia.

Implementation
JlBox written in pure Julia and is presently only dependent on the UManSysProp Python package for parsing chemical structures into objects for use with fundamental property calculations during a pre-processing stage. The pre-processing stage also 110 includes extracting the rate function, stoichiometry matrix and other parameters from a file that defines the chemical mechanism using the common KPP format, followed by a solution to the self-generated ODEs using implicit ODE solvers. Specifically, the model consists of 6 parts: 1. Run a chemical mechanism parser 2. Perform rate expression formulation and optimization 115 3. Perform RHS function formulation 4. Create a Jacobian of RHS function 5. Preparation and calculation for partitioning process 6. Adjoint sensitivity analysis where required. Upon reading each set of equations, JlBox will assign unique numbers for reactants and products if encountered for the first time; then it will fill in the stoichiometry matrix S ij with stoichiometry coefficients where i is the number of the equation

145
(depicted at the beginning of each line) and j is the number of the reactants/products. The stoichiometry matrix is firstly built as a list of triplet (i, j, S ij ) for fast insertion of elements and then it is transformed into the compressed sparse column (CSC) format which is more memory efficient for calculating the RHS of gas-kinetics.
The latter part of a line of the chemical mechanism file, after the symbol :, represents the expression of reaction rate coefficient k j (y; p). The expression consists of prescribed combinations of basic arithmetic operators + -* / ** , basic math given a constant temperature at 298.15K. To further reduce computation, when a reaction rate coefficient is constant, the related expression is deleted from the function which is called at every time step to update the coefficients and the respective initial value of the coefficient is set to be the constant.

160
The gas-aerosol partitioning process requires additional pre-processing of several parameters of each compound required by the growth equation. These are listed in equations 2 to 4. Python packages UManSysProp (Topping et al., 2018) and OpenBabel (O'Boyle et al., 2011) are called during the pre-processing stage to calculate thermodynamic properties required by those parameters.

165
When solving the ODE, the RHS function of the gas phase kinetics firstly updates the non-constant rate coefficients k j (y; p), then constructs the reaction rate r j from concentrations of compounds [C i ], their stoichiometry matrix (S ij ), and rate coeffi- Following this, the model calculates the rate of change (loss/gain) of reactants and products in each equation and sums the 170 loss/gain of the same species across different equations using: There are two ways to implement this. The first projects the structure to program instructions executed by the RHS function.
The second stores it as data and the RHS function loops through the data to calculate the result.
The first method is intended to statically figure out the symbolic expressions of the loss and gain for each species as combina-175 tions of rate coefficients and gas concentrations, and generate the RHS function line by line from the relevant expressions. This method is straightforward and fast, especially for small cases. However, it consumes lots of memory and time for compiling when the mechanism file is large (i.e., > 1000 equations).
The other approach is to use spare matrix manipulation because of the sparse structure of the stoichiometry matrix in atmospheric chemical mechanisms. Considering equation numbers as columns, compounds numbers as rows, and signed stoi- The gas-aerosol partitioning component of JlBox simulates the condensational growth of aerosols in discrete size bins where each particle has the same size. Please note that as we use a fully moving distribution in v1.0, when we further refer to a size bin we retain a discrete representation with no defined limits per bin. JlBox computes the rate of loss/gain for gas  (2). As we adopted the moving bin scheme in v1.0, it keeps track of the bin sizes r k as they grow during the process following formulas (10) and (11). Finally, the rate of change of a given specie d[C i,k ]/dt is summed across all bins to give the corresponding loss/gain of gas phase concentrations according to conservation law.
The combination of the gas phase (9) and condensed phase (12) rate of change expressions provides the overall RHS function (5) of a mixed-phase simulation.

Numerical methods and automatic differentiation
JlBox uses the DifferentialEquations.jl library to solve the ODE, assembling the RHS function in a canonical way: The Jacobian matrix of the RHS ∂f (y; p)/∂y is needed in implicit ODE solvers as well as in adjoint sensitivity analysis.

215
The accuracy of the Jacobian matrix, however, has variable requirements in each case. For implicit ODE solvers, when doing forward simulations, the accuracy of the matrix only affects the rate of convergence instead of the accuracy of the result. Some methods like BDF and Rosenbrock-W, by-design, could tolerate inaccurate Jacobian matrices (Wanner and Hairer, 1996, p114).
Meanwhile, for adjoint sensitivity analysis, accurate Jacobian matrices are needed as they explicitly appear in the RHS function (6).

220
JlBox implements an analytical Jacobian function for both gas kinetics and partitioning process as well as those generated using finite differentiation and automatic differentiation. Theoretically, an analytical Jacobian is the most accurate and efficient approach, but can be laborious to implement due to the nature of the equations involved and therefore error-prone due to manual imputation. The finite difference approximation can have low numerical accuracy and high performance costs due to multiple evaluations of the RHS function, although it is the simplest to implement and is applicable to most functions. Automatic The only limitation is that the RHS function must be fully written in the Julia language and this dictates any additional work 230 that might be required. JlBox uses the ForwardDiff.jl library to perform auto-differentiation. The library introduces the dual-number trick with the help of Julia's multiple dispatch mechanism.
To improve performance and reduce memory consumption, JlBox has special treatments for computing the Jacobian of mixed phase RHS. Firstly, the gas kinetic part ∂f i /∂y j | 1≤i,j≤n is produced analytically because it is sparse and has simple analytical form, while auto-differentiation tools will waste lots of memory and time as they treat it as a dense martix. Secondly, 235 according to (12), one part of the Jacobian could be expressed as the sum of another part: which could also reduce computation. We only have to compute the Jacobian of (2) using methods mentioned previously.
For comparison of performance and accuracy, JlBox implements two auto-differentiated Jacobians for aerosol processes called "coarse_seeding" and "fine_seeding" with and without the optimizations mentioned above. According to benchmark results 240 presented in Appendix table B1, it was found those optimizations could significantly reduce memory usage without effecting the performance.

Sparse linear solvers and pre-conditioners
As the size of the Jacobian matrices grow quickly (O(n 2 )) following the growth of number of states n, it becomes increasingly slow when simulating a mixed-phase model on the full MCM mechanism which has 47544 states when using 16 size bins.

245
The majority of time is spent in solving the dense linear equation M x = b where M = I − γJ, J is the Jacobian matrix, γ is a scalar set by ODE solver, x and b are some vectors.
Following the Kinetic PreProcessor (KPP) and AtChem model approach (Sommariva et al., 2020;Damian et al., 2002), as the Jacobian is quite sparse JlBox introduces the option to use sparse linear solvers provided by 'DifferentialEquations.jl'.
Specifically JlBox is optimized for the iterative sparse linear solver GMRES in CVODE_BDF by providing pre-conditioners 250 which could drastically reduce the number of iterations of iterative sparse linear solvers like GMRES. Theoretically, a preconditioner P is a rough approximation of the matrix M so that P −1 M has less condition number than M . It is 'rough' in a way that the pre-condition process of solving P −1 x = b is easier. In practice, the pre-conditioner P is stored in LU factored form so that solving P −1 x = b is a simple back substitution that sometimes needs to be updated to retain proximity with the changing Jacobian.

255
In JlBox, the functions for solving P −1 x = b and updating P are specified by 'prec' and 'psetup' arguments inside the CVODE_BDF solver. JlBox provides default prec and psetup as a tri-diagonal pre-conditioner following the approach used in AtChem (Sommariva et al., 2020). In psetup, a full Jacobian is calculated in sparse format followed by taking its tridiagonal values forming the approximated tridiagonal M . A LU factorization is then calculated using the Thomas algorithm and stored in cache so that prec can solve the linear equation quickly.

Adjoint sensitivity analysis
In this section we simply demonstrate the ability to build and deploy an adjoint model. Using it to quantify sensitivity typically relies on experimental data and processes that will be incorporated in future versions. Nonetheless, the example given in section 4.2 demonstrates the ability to evaluate the sensitivity of predicted secondary organic aerosol to all gas phase kinetic coefficients. An adjoint sensitivity analysis computes the derivatives of a scalar function g(y) of the ODE states with respect 265 to some parameters p of the RHS function f (y; p). The actual computation reformulates solving the ODE (6) in a backward order and numerically integrating formula (7). It is worth noting that the equation is in the linear form, so using an implicit method that linearizes the RHS function like the Rosenbrock method may give a good result. The Rosenbrock method explicitly includes the Jacobian function as an estimation of the RHS function. In this case, such estimation is an exact representation which enables longer time-steps. The backward differentiation formula (BDF) may also benefit from this for the same reason, with the number of Newton steps reduced to one or two. As the Jacobian matrix is frequently called, a fast and accurate Jacobian function is needed. With this in mind, the special treatment of AD mentioned in section 3.3 delivered a 10x improvement in performance compared with the one that simply wrap the RHS with the AD function. For the second step, we adopt the adaptive Gauss-Kronrod quadrature to calculate the formula accurately.
Solving the ODE (6) in a backward manner poses a significant problem as we need to evaluate (an accurate) Jacobian matrix 275 in a backward order (from t F to t 0 ) which requires accessing the states y(t) at given time-points in backward order. The only way to achieve that is to store a series of states y i at some checkpoints t i . The stored states alone are sufficient for using ODE solvers with fixed time step, but an adaptive ODE solver is needed for better error control which requires accessing y(t) at an arbitrary time t. Thus we need to interpolate those states into dense outputs. Since the time derivative of y is easily accessible in the form of dy/dt = f (y), we can use Hermite interpolation or higher order interpolation to enhance the accuracy of the 280 interpolation. JlBox utilises the solution object of DifferentialEquations.jl (which internally implements Hermite interpolation) to provide y(t) at any given point t 0 ≤ t ≤ t F .

Model Output
The goal of JlBox is to provide a high performance mechanistic atmospheric aerosol box model that also retains the flexibility and usability of Python implementations, for example. Not only should it have comparable performance, if not run faster, than 285 other models for a given scenario, but have the capacity for integrating benchmark chemical mechanisms with coupled aerosol process descriptions. In this section we validate the output of JlBox against PyBox since the model process representations are identical, whilst also investigating the relative performance as the 'size' of the problem scales.

Validation against an existing model box-model
To test the numerical correctness of JlBox, we ran our model together with existing box-model including PyBox and KPP with 290 identical scenarios. JlBox is designed as a more efficient version of PyBox, so it is expected to have identical results in both gas and mixed phase scenarios. Meanwhile, gas phase models constructed from the widely used KPP software could provide some guarantee that the results from JlBox is useful. However, aerosol processes are not available in KPP, as a result we could only compare outputs of gas kinetics. We prepared two test scenarios with gas phase simulation only and mixed-phase simulation.
The settings of the simulations are listed in Table 1. Additionally, in the mixed phase simulation, we set the initial aerosol 295 to be an ideal representation of ammonium sulphate solution satisfying a lognormal size distribution with an average size of 0.2 microns and a standard deviation of 2.2 microns, discretized into 16 bins. The saturation vapour pressure threshold of whether to include the gas-to-particle partitioning of a specific chemical is chosen to be 10 −6 atm based on an extremely low absorptive partitioning coefficient for a wide range of pre-existing mass loadings. For all simulations presented in this paper we use the vapour pressure technique of Joback and Reid (1987). Whilst known to systematically under predict saturation

Evaluation of adjoint sensitivity analysis
A demonstration of an adjoint sensitivity analysis is conducted to calculate the partial derivative of secondary oraganic aerosol 310 mass (SOA) at the end of simulation with respect to the rate coefficients of each equation in the mechanism. The configurations of the simulation is the same as the mixed phase alpha-pinene scenario (Table 1) presented in the previous section.
The results presented in Table 2 highlighted the top 10 (in terms of absolute magnitude) estimated deviations of SOA mass dSOA under a 1% change of rate coefficients because the derivate itself (dSOA/dratecoef f ) is not comparable due to different units involved. The reactions between alpha-pinene and ozone have the most substantial effect. The order of the 315 equations simply highlights the flow of alpha-pinene to its subsequent products. This might attribute to the fact that the system hasn't reached the equilibrium state (also illustrated in the exponential growth of SOA mass in Figure 2). Another interesting point is that competing reactions have similar sensitivities but opposite signs like reactions of APINOOB, APINOOA, and APINENE+OH. The competing reactions between alpha-pinene and ozone is an outlier with a ratio of 5 between the two. A plausible explanation is that for those reactions with opposite sensitivities, the products of one leads to little or no SOA while 320 the other contributes more, so when the former reaction is accelerated due to its perturbed rate coefficient, it reduces the ability of the latter reaction to produce SOA. As a result, the two reactions have opposite sensitivities. For the reactions of APINENE and O3, it is possible that the APINOOA and APINOOB pathways both produce SOA, and the first produces more than the second one. When the rate coefficient of the second reaction is increased, the decrease of SOA due to less APINOOA does not offset the increase of SOA due to more APINOOB, which leads to a smaller but still positive sensitivity of SOA. As we state 325 earlier, a deeper analysis with alternative options for saturation vapour pressures and process inclusion may reveal important dependencies.

Performance on large scale problems
In this section we demonstrate the performance of JlBox on 'large scale' problems where both KPP and Pybox fail to solve due to constraints imposed by the model workflow and language dependencies as shown in Appendix B. We define 'large 330 scale' problems as those beyond single VOCs or gas phase only simulations. Equipped with a sparse linear solver and autogenerated tridiagonal preconditioner, JlBox is ideal for simulating larger mechanisms than we present above. With this in mind, the largest possible mechanism accessible from the MCM suite is selected, which contains 16701 chemical equations, 5832 species. Moreover, we performed 72-hour mixed phase simulations with 16 moving bins. This means that JlBox has to solve a system of stiff ODEs of 47544 variables that requires solving matrices of 47544 × 47544 at each time step. The initial conditions are taken from an existing representative chamber study on mixed VOC systems (Couvidat et al., 2018, Table 1 & 2) (see Appendix A) with 16 experiments in two sets. We use average values of temperature where ranges are provided.
In addition, instead of using the relative humidity selected in those studies, we performed perturbed simulations with low RH scenario of 10% and high RH scenario of 80% respectively to investigate possible dependence on stiffness according to variable partitioning from the gas to the condensed phase. All the simulations were executed on the ETH Zurich Euler cluster, 340 requesting 4 cores and 7GB memory each to exploit parallelism between different initial conditions. This was chosen as a PC would have to run them in sequential order making it too time consuming.
The elapsed time taken by JlBox is plotted in Figure 3. The average time is around 7 hours which is approximately 1/10 of simulation time. In addition, the maximum memory consumption is 8216 MB and average consumption is 4273 MB. Both values are significantly smaller than the Jacobian matrix if we were to store it in a double precision dense form. Note that the 345 Euler cluser provide 3 types of CPU nodes equipped with Intel XeonE3 1585Lv5, XeonGold 6150 and AMD EPYC 7742 and the simulation jobs are distributed to all three kinds of node. Although XeonE3 has better single core performance compared to the other two, the time variations between different scenarios far exceeds the variations due to the difference in processors. Figure 4 shows the generation of SOA mass in the 72-hour period. JlBox captures a diurnal change of photolysis rate as is depicted in experiment A. We remind the reader that we have no despositional loss, or variable emissions, and that we are 350 using the boiling point method of Joback and Reid (1987) for estimating saturation vapour pressures.

Comparison with other models
JlBox is developed based on the PyBox model (Topping et al., 2018): they have similar structures, rely on the same methods for calculating pure component properties and provide almost identical results. Despite these similarities, we feel JlBox has 355 made significant improvements over PyBox in terms of readability, functionality, scale-ability, and efficiency from both a programming and algorithmic sense ( Table 3). The Julia programming language makes the most significant contribution to those improvements in that it promises a high performance environment, close to Fortran, without sacrificing flexibility of Python. For example, the directly translated partitioning code in JlBox can run at a comparable speed as the individual Fortran routines in PyBox, and the multiple dispatching mechanism makes it trivial for implementing the automatic differentiation. As 360 a result, JlBox elegantly solves the "two-language problem" without compromising anything by writing everything in Julia. It spares users from editing "code in code" like PyBox that makes it easier to maintain the code base and to extend the model. The homogeneous code base of JlBox also enables a convenient migration to other devices like GPUs considering there is already a GPU backend for Julia.
As for algorithmic advances, the automatic differentiation method for generating Jacobian matrices is not only the most 365 effective addition but also a fundamental one. It is an accurate and convenient way to calculate the Jacobian matrix which only requires an RHS function fully written in Julia. With Jacobian matrices available, the number of RHS evaluations is dramatically reduced since the implicit ODE solver no longer needs to estimate the Jacobian matrix using finite differences.
Also, without automatic differentiation, it will not be so easy to build the adjoint model of a fully coupled process model which explicitly requires the Jacobian matrix for the entire model, let alone to extend the model with more processes. Besides, the 370 adaptation of sparse matrices for gas kinetics reduced the compilation cost to a small constant value enabling the JlBox to simulate large scale mechanisms such as the entire MCM mechanism, which for PyBox typically remains limited by memory.
Compared to other models like KPP (Damian et al., 2002) and AtChem (Sommariva et al., 2020), JlBox is unique due to its ability to perform coupled mixed phase simulation efficiently especially on large mechanisms such as the full MCM mechanism where the vanilla KPP variant often fails to compile. JlBox is written in pure standard Julia without any string manipulation to 375 codes as against KPP and AtChem, which enables full IDE support making it more developer friendly.

Future development
There are a number of processes and algorithmic implementations not included in this version of JlBox that would be useful for further use in a scientific capacity. These include coagulation, hybrid sectional methods and auto-oxidation products schemes to name a few (Ehn et al., 2014;Hallquist et al., 2009;Riemer et al., 2009). As we state earlier, the purpose of this develop-380 ment stage was to create and profile the first Julia implementation of an aerosol box-model for the scientific community that would demonstrably exploit the exciting potential this emerging language has to offer. In version 1.0 we provide a fully coupled model. We could, and will, provide options for implementing simplified approaches to aerosol process, such as operator splitting. Indeed, these methods have proven to provide robust mechanisms for mitigating computational efficiency barriers if implemented correctly. However our ethos with JlBox is to build and develop a platform for a benchmark community box-385 model that exploits the benefits that aforementioned benefits that Julia provides. This includes the ability to exploit existing and emerging hardware and software platforms as we try to tackle the growing chemical and process complexity associated with aerosol evolution. We hope that, with version 1.0, the community can help develop and expand this new framework.
Quantifying the importance, or not, of process and chemical complexity requires a multifaceted approach. With the proliferation of data science driven approaches across most scientific domains, Reichstein et al. (2019) note that the next generation 390 of earth system models are likely going to merge machine learning and traditional process driven models to attempt to solve aforementioned challenges in complexity whilst exploiting the rich and growing data-sets of global observations. Julia is being used in development of machine learning (ML) frameworks, with libraries such as Flux-ML enabling researchers to embed process driven models within the back propagation pipeline (Innes, 2018). This opens up the possibility to develop observation driven parameterisations in hybrid mechanistic-ML frameworks, which helps with the issue around provenance in ML 395 parameterisation developments.
JlBox will continually grow and we encourage uptake and further developments.
Appendix A: Initial condition of section 4.3 in OrdinaryDiffEq.jl. It uses the classical TRBDF2 scheme (Hosea and Shampine, 1996) while benefiting from a highperformance linear solver provided by the Julia community.
In mixed phase simulation and its adjoint sensitivity analysis. For the full MCM mechanism, due to memory restrictions, the only practical option is to use the CVODE ODE solver with the FGMRES sparse linear solver.

Accur
Top 10 sensitivities of seco aerosol (SOA) mass under rate coefficients of each ga using the mixed-phase sim MCM APINENE mechanism resolve more complex mechanisms and processes.
• We need adjoint sensitivity analysis for figuring out importance of parameters and for data assimilations.
• The new JlBox model can satisfy both needs while having almost identical outputs as existing model PyBox.
• A test run of adjoint sensitivity analysis is performed on secondary organic aerosol (SOA) mass w.r.t. rate coefficients of gas phase reactions.

Initial Condition
Low RH High RH