A key and expensive part of coupled atmospheric chemistry–climate model
simulations is the integration of gas-phase chemistry, which involves dozens
of species and hundreds of reactions. These species and reactions form a
highly coupled network of differential equations (DEs). The lifetimes of the
different species present in the atmosphere vary by orders of magnitude, and
so solving these DEs to obtain robust numerical solutions poses a “stiff
problem”. With newer models having more species and
increased complexity, it is now becoming increasingly important to have
chemistry solving schemes that reduce time but maintain accuracy. While a
sound way to handle stiff systems is by using implicit DE solvers, the
computational costs for such solvers are high due to internal iterative
algorithms (e.g. Newton–Raphson methods). Here, we propose an approach for
implicit DE solvers that improves their convergence speed and robustness with
relatively small modification in the code. We achieve this by blending the
existing Newton–Raphson (NR) method with quasi-Newton (QN) methods, whereby
the QN routine is called only on selected iterations of the solver. We test
our approach with numerical experiments on the UK Chemistry and Aerosol
(UKCA) model, part of the UK Met Office Unified Model suite, run in both an
idealised box-model environment and under realistic 3-D atmospheric
conditions. The box-model tests reveal that the proposed method reduces the
time spent in the solver routines significantly, with each QN call costing
27 % of a call to the full NR routine. A series of experiments over a range
of chemical environments was conducted with the box model to find the optimal
iteration steps to call the QN routine which result in the greatest reduction
in the total number of NR iterations whilst minimising the chance of causing
instabilities and maintaining solver accuracy. The 3-D simulations show that
our moderate modification, by means of using a blended method for the
chemistry solver, speeds up the chemistry routines by around 13 %,
resulting in a net improvement in overall runtime of the full model by
approximately 3 % with negligible loss in accuracy. The blended QN
method also improves the robustness of the solver, reducing the number of
grid cells which fail to converge after 50 iterations by 40 %. The relative
differences in chemical concentrations between the control run and that using
the blended QN method are of order

With the advent of supercomputers, simulating the atmosphere using computational models has become an integral part of atmospheric science research, complementing experimental measurements and in situ and remote observations. Model predictions are playing an increasingly important role in both purely scientific investigations and public policy making (IPCC, 2013; Glotfelty et al., 2017). In recent years, increasing computational power has enabled the development of coupled chemistry–climate models (Morgenstern et al., 2009) which determine the chemical evolution and transport (Lauritzen et al., 2009) of trace atmospheric constituents, such as long-lived greenhouse gases, ozone, nitrogen oxides, volatile organic compounds and aerosol particles, and their influence on the environment, air quality and human health (Heal et al., 2013; Lamarque et al., 2013; O'Connor et al., 2014; Tilmes et al., 2015; Collins et al., 2017). These models require globally accurate predictions over time frames that span decades (Lamarque et al., 2013), involving chemical reactions of species with lifetimes ranging from sub-seconds to centuries (Whitehouse et al., 2004), making the task computationally very expensive.

The UK Chemistry and Aerosols (UKCA) model is part of the Met Office Unified Model (UM) (Cullen, 1993; Hewitt et al., 2011) and works as its chemistry (Morgenstern et al., 2009; O'Connor et al., 2014) and aerosol (Mann et al., 2010) component. Hereafter, we refer to UM-UKCA as the fully coupled chemistry–climate model and refer to the individual submodules as UKCA and UM. Solving the chemistry in UKCA comes at a significant cost as it is one of the most expensive components in the UM-UKCA model. As coupled chemistry–climate models become more complex and the description of chemistry more involved, the demand for computationally economical methods will grow. Hence, it makes sense to investigate ways of increasing the speed of the existing schemes with the goal of little or no sacrifice in accuracy.

Problems of a similar kind appear in other fields such as combustion systems which contain possibly reduced physical dynamics but more intensive chemistry (up to thousands of reactions) (Lu et al., 2009) and aerosol microphysics and dynamics (Mitsakou et al., 2005). Mathematically, these systems are represented by complex networks of coupled differential equations (DEs) which one must solve numerically. There is no universal best numerical method that works for every type of DE. Often one needs to choose the most suitable method for the task at hand (e.g. ease of incorporation into or modification within the model, CPU cost/time of the solution, accuracy). The numerical methods available can be conveniently categorised as explicit or implicit. Explicit methods are direct integration methods that work for many types of conventional problems but have worse stability properties, while implicit methods are more involved and indirect in calculations but have superior stability properties (Atkinson, 1989; Sandu et al., 1997; Damian et al., 2002). Generally, explicit methods are quicker than implicit methods at integration of single iteration steps but can fall behind in the total integration cost due to the extra effort needed to ensure stability (generally by halving the time steps). When it comes to atmospheric chemistry calculations, the main obstacle to obtaining stable solutions is the problem of stiffness, which, broadly speaking, originates from different chemical reactions having timescales that differ by orders of magnitude (Cariolle et al., 2017). If one uses an explicit DE method, the (approximate) concentration values of the next time step are calculated based on the tendencies at the current time. This makes it extremely hard to choose a time step which is short enough to capture the chemical changes and preserve stability but also long enough to make the calculations feasible for computers.
A good way to overcome this difficulty is by using an implicit method where tendencies are not based on current values but treated as unknowns to be solved (along with the new concentration values). This greatly increases the stability of solutions at the cost of a series of extra calculations for each time step. But again, there is no single best implicit method which is suitable for all types of stiff problems. In fact, there are families of numerical schemes available for each category (Atkinson, 1989). It is therefore desirable for any proposed new method to be flexible enough that it can be appended to the existing solver algorithms without substantial change. This is the aim of the method proposed here.
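The contrast between explicit and implicit schemes on a stiff problem can be illustrated with a minimal sketch. It assumes a single hypothetical species with first-order loss, dy/dt = -k y, a lifetime of 1 s and a time step of 60 s; the function names and parameter values are illustrative and are not taken from UKCA.

```python
def forward_euler(y0, k, dt, n_steps):
    """Explicit update: the tendency is evaluated at the current value."""
    y = y0
    for _ in range(n_steps):
        y = y + dt * (-k * y)
    return y


def backward_euler(y0, k, dt, n_steps):
    """Implicit update: the tendency is evaluated at the unknown new value.

    For this linear test problem, y_new = y + dt * (-k * y_new) can be
    solved in closed form, so no iteration is needed.
    """
    y = y0
    for _ in range(n_steps):
        y = y / (1.0 + k * dt)
    return y


# Hypothetical fast-reacting species: lifetime 1/k = 1 s, time step 60 s.
y_explicit = forward_euler(1.0, k=1.0, dt=60.0, n_steps=10)
y_implicit = backward_euler(1.0, k=1.0, dt=60.0, n_steps=10)
print(f"forward Euler:  {y_explicit:.3e}")
print(f"backward Euler: {y_implicit:.3e}")
```

With a time step much longer than the species lifetime, the explicit update grows without bound while the implicit update decays towards zero, as the true solution does. For nonlinear chemistry the implicit equation has no closed form, which is where the iterative solver enters.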

As will be detailed further in the text, a common feature of the many currently available implicit schemes is the solution of large systems of nonlinear differential equations iteratively (Ortega and Rheinboldt, 1970; Brandt, 1977; Kelley, 1995). At each time step, expensive subroutines have to be called several times; this is the main source of computational cost of the chemical time integration. These subroutines typically include (i) construction of a Jacobian (derivative of a function in higher dimensions) and (ii) a Newton–Raphson-type iterative algorithm to solve the nonlinear algebraic equations (associated with the nonlinear differential equations). To overcome the high costs, methods that avoid or reduce Jacobian construction have gained popularity in recent years (Brown and Saad, 1990; Chan and Jackson, 1984; Knoll and Keyes, 2004; Viallet et al., 2016, and the references therein). Our motivation for this work is somewhat similar in that we use approximations of the Jacobian to reduce the costs of the solver.

Illustration of application of the QN method (adopted in our work) to find the root of a function of one variable.

Here, we develop an approach which reduces the costs of expensive routines by partly recycling the information generated within the iterations. The method is based on exploiting this information in a way that enables one to take extra steps forward for the desired solution without going through the costly parts of the cycle. The approach is an adaptation of the quasi-Newton (QN) methods (Broyden, 1965; Shanno, 1970; Fletcher, 1970; Goldfarb, 1970; Davidon, 1991), fused into the classical Newton–Raphson (NR) method, which are commonly used for solving large systems of nonlinear algebraic equations.

The main idea behind the QN method is illustrated in Fig. 1. The objective of
finding species concentrations after a short time interval can be transformed
into finding the roots of a nonlinear function, which, in Fig. 1, is
represented as a function

Our adaptation of the QN method uses an “inverse update” approximation (Kvaalen, 1991) instead of the more commonly used “forward updates” (Broyden, 1965). We demonstrate that the approach improves the convergence rate significantly with respect to the number of main NR iterations and saves computational time. We further argue that using our mixed-method approach makes the algorithm more robust against “stiff environments” as it reduces the probability of the solver failing to converge on a solution and restarting using a shorter time step. We also test how the solutions (chemical concentrations of species) are affected over a long period of integration. We show that the differences in prognostic variables between our suggested QN method and the classical NR method are negligible and do not grow in time.

The structure of this article is as follows. In Sect. 2, we describe the UM-UKCA model and give a brief summary of its basic features. We then outline the current algorithm that handles the reaction kinetics by solving systems of nonlinear ordinary differential equations (ODEs) followed by our suggested modification using quasi-Newton methods. We further discuss why and how this modification works, its advantages and its possible dangers. In Sect. 3, we report results of our computational experiments carried out under both a controlled box-model environment and as part of the full 3-D Met Office UM-UKCA model. We compare the results of the code-modified runs with the control runs from the perspective of computational savings and differences in the concentrations/mixing ratios of chemical species, and discuss related matters with regard to parallel computing clusters. In Sect. 4, we conclude the paper by summarising and highlighting our results and pointing to possible future directions.

UM-UKCA, originally developed by the National Centre for Atmospheric Science and the UK Met Office, was designed as a framework for atmospheric chemistry and aerosol computations that operates under the Met Office Unified Model (UM) platform and models atmospheric chemistry and aerosol fields that can feed back onto the model dynamics via the model radiation scheme (Morgenstern et al., 2009; O'Connor et al., 2014). It computes a number of possible physical–chemical processes taking place in the atmosphere such as radiation, photolysis, emissions, wet/dry deposition and clouds. It is coupled to the UM transport dynamics sequentially; that is, transport routines and chemistry–aerosol routines are performed one after another (operator splitting) with adjustable frequency. Currently in its global configuration, for transport, a time step of 20 min is used, whilst a chemical time step of 1 h is used to update the new concentrations of species in the model.
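The sequential coupling can be pictured with a short sketch. It assumes only the two time steps quoted above (20 min for transport, 1 h for chemistry); the function name and the call counters are hypothetical stand-ins for the actual transport and chemistry routines.

```python
def run_split(n_hours, dt_transport_min=20, dt_chem_min=60):
    """Count calls in a sequential (operator-split) transport/chemistry loop."""
    steps_per_chem = dt_chem_min // dt_transport_min
    n_transport = 0
    n_chem = 0
    for step in range(1, n_hours * 60 // dt_transport_min + 1):
        n_transport += 1              # transport advances every model step
        if step % steps_per_chem == 0:
            n_chem += 1               # chemistry advances once per chemical step
    return n_transport, n_chem


n_transport, n_chem = run_split(24)
print(n_transport, n_chem)
```

For one model day this gives three transport steps per chemistry call, so each chemistry call has to integrate the stiff system over a full hour in one go.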

A number of chemical schemes are available in UKCA for modelling different parts of the atmosphere (troposphere, stratosphere, etc.) with varying model details (e.g. radiative feedback switched on/off). In this paper, we use the more general stratospheric–tropospheric coupled scheme with and without an online aerosol mode (either using GLOMAP mode (Mann et al., 2010) or aerosol climatologies) to demonstrate our results. The pure stratospheric–tropospheric mode (StratTrop) contains 75 species and consists of 283 chemical reactions (Banerjee et al., 2016). When GLOMAP-mode aerosols are activated, 12 additional tracers are added to the system and a total of 306 reactions represent the atmospheric chemistry. The StratTrop chemical mechanism is solved using an implicit backward Euler scheme under the ASAD framework (Carver et al., 1997; Wild and Prather, 2000), as described in detail below, while photolysis is computed using the Fast-JX scheme (Wild et al., 2000). The details of these schemes can be found in Abraham et al. (2012). The UM-UKCA version used here is vn10.6.1, in the Global Atmosphere 7.1 configuration, which is a development of the UM-UKCA GA6 configuration (Walters et al., 2017).

In addition to the full 3-D UM-UKCA model, we also use a box-model version of UKCA (hereafter referred to as UKCA_BOX) to gain better control of the chemistry part of our simulations. UKCA_BOX is designed as a development tool using the same UKCA code, branched from version 10.1 of the UM-UKCA, but with the rest of the UM-UKCA model removed and replaced with inputs that feed the UKCA code with the same information as if it were a single grid cell in the full 3-D model. The box model uses the same StratTrop (CheST) chemical mechanism, ASAD chemical solver and Fast-JX photolysis scheme as the full 3-D model but does not have any emissions, deposition or transport. As it runs for only a single grid cell, it can be run cheaply on a single processor across many test cases. Thus, it is ideal for testing and optimising the chemical solver in UKCA over a wide range of idealised chemical environments.

In the following sections, we discuss the chemical time integration schemes in the UKCA package for determining the new tracer concentrations and chemical tendencies. All numerical schemes are implemented using the Fortran 95 language. The code is available in the UM-UKCA trunk from version 10.8. Branches are also available in vn10.7 and vn10.6.1.

The time integration for the gas-phase chemistry in UKCA is carried out by the ASAD package which provides a flexible framework for adding and removing new reactions/species (Carver et al., 1997; Wild and Prather, 2000). The UKCA version of the ASAD package uses a backward Euler numerical scheme to compute the new species concentrations at the next chemical time step. One of the reasons for this choice is that the relevant timescales of the reactions of species vary over many orders of magnitude depending on the location and time of the reactions, which makes the system extremely stiff. The backward Euler method is an implicit scheme which has superior numerical stability properties to almost all explicit or semi-explicit methods and hence works particularly well with stiff systems (Atkinson, 1989). This enables the use of longer time steps and makes long time integrations feasible. The drawback is that, as in all implicit schemes, it demands that systems of nonlinear algebraic equations are solved at each time step, requiring extra calculations and so increasing the computational cost significantly.
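To make the extra cost concrete, the following sketch performs one backward Euler step for a single hypothetical species with a quadratic loss, dy/dt = -k y^2, solving the resulting nonlinear algebraic equation by Newton–Raphson. The test equation, the tolerance and the function name are assumptions chosen for illustration; ASAD solves a coupled system of many species.

```python
def backward_euler_step(y_old, k, dt, tol=1e-12, max_iter=50):
    """One backward Euler step for dy/dt = -k*y**2 using Newton-Raphson.

    The new value is the root of F(y) = y - y_old + dt*k*y**2.
    Returns (y_new, number of iterations used).
    """
    y = y_old  # first guess: the previous concentration
    for it in range(1, max_iter + 1):
        residual = y - y_old + dt * k * y * y
        jacobian = 1.0 + 2.0 * dt * k * y  # dF/dy, a scalar in this example
        dy = -residual / jacobian
        y += dy
        if abs(dy) <= tol * abs(y):
            return y, it
    raise RuntimeError("Newton-Raphson did not converge")


y_new, n_iter = backward_euler_step(y_old=1.0, k=10.0, dt=1.0)
print(y_new, n_iter)
```

Every pass through the loop re-evaluates the residual and the derivative. In a multi-species system the derivative becomes a Jacobian matrix and the division becomes a linear solve, which is why each iteration is expensive.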

These heavy costs can be partly reduced by exploiting the fact that the coupling among species is “loose” in the sense that each species reacts with several other species but not all. This makes the Jacobian sparse and allows for the use of sparse matrix methods, which significantly cut costs. This approach was implemented in the UM-UKCA model (see Morgenstern et al., 2009).
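The origin of the sparsity can be sketched by deriving the non-zero pattern of the Jacobian directly from a reaction list. The six-species mechanism below is hypothetical and is not part of StratTrop; the function name and data layout are likewise assumptions made for illustration.

```python
def jacobian_pattern(n_species, reactions):
    """Return the set of (row, col) positions that can be non-zero.

    Each reaction is (reactants, products), given as tuples of species
    indices. A reaction rate depends only on its reactants (columns) and
    changes every reactant and product of that reaction (rows).
    """
    pattern = {(i, i) for i in range(n_species)}  # diagonal is always present
    for reactants, products in reactions:
        for row in set(reactants) | set(products):
            for col in reactants:
                pattern.add((row, col))
    return pattern


# Hypothetical mechanism with species A..F mapped to indices 0..5.
reactions = [
    ((0, 1), (2,)),  # A + B -> C
    ((2,), (3,)),    # C -> D
    ((3, 4), (5,)),  # D + E -> F
]
pattern = jacobian_pattern(6, reactions)
fill = len(pattern) / 6 ** 2
print(f"{len(pattern)} of 36 entries can be non-zero ({fill:.0%})")
```

Even in this tiny example more than half of the entries are structurally zero, and the fraction of zeros typically grows with the number of species, since each species takes part in only a limited number of reactions.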

The reaction kinetics in the atmosphere can be represented, mathematically, as a system of nonlinear ODEs where the initial values are prescribed. Emissions and dry/wet deposition enter these equations as source and sink terms. The task of determining the change in chemical species concentrations is equivalent to solving the coupled nonlinear system numerically.

Let

To solve Eq. (1) numerically using a backward Euler scheme, we discretise the
time variable, so the discrete equation takes the form

Here, we give a brief description of the NR method, which will prepare the
ground for discussion of our contribution. Setting

The linear equation (Eq. 5) can also be written in the form

In the current UKCA implementation, each major calculation step of the ODE
solution algorithm is carried out by a separate routine as shown in Fig. 2a.
The main solving engine begins by calculating the current tendencies
(right-hand side of Eq. 1) using the updated chemical concentrations from the
previous time step (Step 1 in Fig. 2a). Then an initial predictor guess
(forward Euler type) is calculated to be used in the following iterative
loop. After that, the Jacobian is calculated using the exact quadratic form
of the nonlinear reaction rates (Step 2). This step is followed by the
solution of the linear Eq. (6) (Step 3). After the new increment (

Flowchart showing steps taken to numerically solve the nonlinear
chemical equations using the Newton–Raphson method: as carried out in the
standard version of ASAD in the UKCA chemical transport model

We noted above that the expensive parts of the chemical integration are the Jacobian construction and solution of a system of linear equations at each iteration. Our strategy is based on the idea of using QN methods to minimise the number of iterations in the main NR solving loop, thereby reducing the number of Jacobian reconstructions and linear systems to be solved.

In QN methods, the use of the exact Jacobian at every iteration is abandoned. Instead, it is approximated in a way that satisfies certain imposed conditions. The ideas behind these (secant) methods, which date back to Broyden (1965), Shanno (1970), Fletcher (1970), Goldfarb (1970) and Davidon (1991), resemble using the inverse difference quotient of a function (of one variable) to replace the reciprocal of the exact derivative of the same function (see Fig. 1). The price of avoiding the exact Jacobian is a slowdown in convergence (not quadratic as in the NR algorithm but still super-linear). In general, this strategy is profitable because the time gained by bypassing the costly steps outweighs the time lost to the additional iterations.
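The one-variable analogy can be written out as a short sketch: Newton's method divides by the exact derivative, whereas the secant method replaces its reciprocal with the inverse difference quotient built from the two most recent iterates. The test function x^2 - 2, the starting points and the tolerances are arbitrary choices for illustration.

```python
def newton(f, dfdx, x, tol=1e-12, max_iter=50):
    """Newton's method: every step needs the exact derivative."""
    for it in range(1, max_iter + 1):
        step = -f(x) / dfdx(x)
        x += step
        if abs(step) <= tol:
            return x, it
    raise RuntimeError("Newton did not converge")


def secant(f, x_prev, x, tol=1e-12, max_iter=50):
    """Secant method: 1/f'(x) is replaced by (x - x_prev)/(f(x) - f(x_prev))."""
    f_prev = f(x_prev)
    for it in range(1, max_iter + 1):
        f_curr = f(x)
        if f_curr == f_prev:  # flat secant line: no further progress possible
            return x, it
        inv_slope = (x - x_prev) / (f_curr - f_prev)
        x_prev, f_prev = x, f_curr
        x = x - inv_slope * f_curr
        if abs(x - x_prev) <= tol:
            return x, it
    raise RuntimeError("secant did not converge")


def f(x):
    return x * x - 2.0


root_newton, its_newton = newton(f, lambda x: 2.0 * x, 2.0)
root_secant, its_secant = secant(f, 1.0, 2.0)
print(root_newton, its_newton)
print(root_secant, its_secant)
```

The secant iteration usually needs a few more steps than Newton's method, but each step costs one function evaluation and no derivative. In several dimensions the derivative is the Jacobian, so the saving per step is much larger.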

Our implementation is somewhat different from the standard quasi-Newton methods in that Newton–Raphson iterations are not completely replaced by the QN iterations. Rather, QN iterations are fused into the existing NR loop and implemented only if a chosen criterion is met. In this sense, the new algorithm is a mixed method which uses both NR and QN methods as needed. This keeps the changes to the existing algorithm minimal and makes the method flexible and practical to use. Despite this relatively small change in the algorithm, the computational gain in return is considerable.

Diagrammatically (see Fig. 2b), the approach works as follows. If the desired convergence has not taken place by the end of a Newton–Raphson iteration, then instead of moving on to the next iteration and reconstructing the Jacobian from scratch (Step 2), we make a pseudo-iteration and form an “effective approximation” for the inverse of the Jacobian using the concentrations already computed (Step 5). Step 6 follows, in which we re-solve for the newer concentration values, making use of the information available from Step 3. So, a full NR iteration is effectively replaced by a QN pseudo-iteration taking much less time. These measures are quantified in Sect. 3.1.
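The control flow of such a mixed loop can be sketched as follows. This is a simplified stand-in and not the ASAD implementation: the pseudo-iteration here simply reuses the Jacobian from the preceding NR iteration (a frozen-Jacobian or “chord” step) rather than applying the Broyden-type inverse update adopted in this work, and the two-species reaction A + B → products, the rate constant and all function names are hypothetical.

```python
def residual(y, y_old, dt, k):
    """Backward Euler residual F(y) for the reaction A + B -> products."""
    rate = k * y[0] * y[1]
    return [y[0] - y_old[0] + dt * rate, y[1] - y_old[1] + dt * rate]


def jacobian(y, dt, k):
    """Exact Jacobian dF/dy of the residual above."""
    return [[1.0 + dt * k * y[1], dt * k * y[0]],
            [dt * k * y[1], 1.0 + dt * k * y[0]]]


def solve2(J, rhs):
    """Solve the 2x2 linear system J d = rhs by Cramer's rule."""
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    return [(rhs[0] * J[1][1] - J[0][1] * rhs[1]) / det,
            (J[0][0] * rhs[1] - J[1][0] * rhs[0]) / det]


def blended_step(y_old, dt, k, qn_after=(), tol=1e-10, max_iter=50):
    """One backward Euler step; returns (y_new, n_nr, n_qn)."""
    y = list(y_old)
    n_qn = 0
    for n_nr in range(1, max_iter + 1):
        J = jacobian(y, dt, k)  # costly part: rebuild the Jacobian
        d = solve2(J, [-r for r in residual(y, y_old, dt, k)])
        y = [y[0] + d[0], y[1] + d[1]]
        if max(abs(d[0]), abs(d[1])) < tol:
            return y, n_nr, n_qn
        if n_nr in qn_after:
            # Cheap pseudo-iteration: new residual, same (stale) Jacobian.
            d = solve2(J, [-r for r in residual(y, y_old, dt, k)])
            y = [y[0] + d[0], y[1] + d[1]]
            n_qn += 1
            if max(abs(d[0]), abs(d[1])) < tol:
                return y, n_nr, n_qn
    raise RuntimeError("solver did not converge")


y_ctrl, nr_ctrl, _ = blended_step([1.0, 0.5], dt=1.0, k=1.0)
y_mix, nr_mix, qn_mix = blended_step([1.0, 0.5], dt=1.0, k=1.0, qn_after=(2, 3))
print("control:", nr_ctrl, "NR iterations")
print("blended:", nr_mix, "NR iterations and", qn_mix, "pseudo-iterations")
```

The `qn_after` argument plays the role of the selection criterion: the pseudo-iteration is attempted only after the listed NR iterations, so an empty tuple reproduces the unmodified NR loop.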

In the above description, we refer to the “effective approximation” of the inverse of the Jacobian. However, in practice, we do not strictly construct an approximate “inverse” since taking the inverse of a matrix brings more expense. Rather, the remnants of the main NR iteration (the Jacobian from Step 2, concentrations from Step 3) are recycled and used in the approximation scheme for the inverse of the Jacobian (Broyden approximation). Schematically, after the main Newton–Raphson route, we perform Steps 4–6 shown in Fig. 2b, which is formalised below.

We use a particular, Broyden-type inverse approximation scheme (Kvaalen,
1991), which is given by the following form

Once

In this section, we compare results obtained with the new method (quasi-Newton) and without it (classical Newton–Raphson) when implemented in the current version of the UKCA solver. We consider the effectiveness of the algorithm on a single processor, i.e. with UKCA_BOX, as well as on a high-performance parallel computing (HPC) platform (ARCHER) with the full 3-D UM simulations. In both cases, our analysis will be two-fold: comparison of computational performance (savings, robustness, etc.) and comparison of predicted model values. We show that, although the chemistry step alone takes 5 to 10 % of the entire computations, there is a noticeable speed-up when the chemistry component is modified in the way suggested without causing any significant error in prognostic variable values. This also improves the robustness of the computation by reducing the number of cases during the course of entire chemical integration for which the time step has to be halved in order to converge on a solution.

To assess the effect of the QN approximation method on the performance of the UKCA chemistry solver, we first tested the changes in UKCA_BOX. UKCA_BOX allows us to test the performance of the QN methods in a highly controlled environment and to optimise the options for the solver based on a variety of chemical conditions.

Four standard test cases were set up for these experiments to test the
behaviour of the box model in different chemical environments: Urban, Rural,
Marine and Stratosphere (Strat). The initial conditions for these test cases
were extracted for July from a 10-year run of the full UM-UKCA model for the
year 2000 at 1.875

Summary of data points from UM model runs used to initialise UKCA_BOX scenarios, parameters describing atmospheric conditions of each scenario and initial concentrations of select chemical species. In each case, data are extracted from a 10-year July average run of the UM-UKCA model for the year 2000.

The UKCA_BOX uses the Fast-JX photolysis scheme (Wild et al., 2000),
comparable to that used in the full UM-UKCA model (Telford et al., 2013). For
the purposes of these experiments, a simplified setup was used whereby
photolysis turns “on” and “off” every 12 h of integration, using
precalculated photolysis rates. This was done to minimise the computation of
photolysis rates and create idealised scenarios with an abrupt step change at
“dawn” and “dusk” to test the stability of the solver. Photolysis rates
were taken from an offline run of the 1-D column Fast-JX scheme at 12:00 UTC
on 1 July, 40

As discussed in the previous section, the QN method is cheaper than the full NR method because it does not recalculate the full Jacobian at each iteration (Table 2). On average, one QN iteration takes 27 % of the time of a full NR iteration. Since the QN method reduces the number of NR iterations required to converge, the time taken will therefore generally be reduced. However, the QN method is not as exact as the NR method, and so there is not a one-to-one efficiency: calling the QN method many times may only reduce the number of NR iterations required by a few, and in some cases calling the QN method too many times can result in a net increase in computational burden. Finding the most efficient setup therefore becomes an optimisation problem: how can we gain the maximum reduction in NR iterations, with as few calls to the QN method as possible? In particular, we are interested in reducing the number of iterations required for the solver during the most challenging chemical states when the equations are most stiff. This will reduce the range of time taken for cores to solve each part of the domain, therefore reducing time spent waiting for all cores to catch up to the same time in the full 3-D model.
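The trade-off can be expressed as simple arithmetic. The sketch below uses the measured ratio of 27 % from the text; the iteration counts in the example (seven NR iterations for the control, five NR iterations plus two QN calls for the blended solver) are hypothetical and serve only to illustrate the bookkeeping.

```python
QN_RELATIVE_COST = 0.27  # one QN call costs 27 % of a full NR iteration


def nr_equivalent(n_nr, n_qn, qn_cost=QN_RELATIVE_COST):
    """Total solver cost expressed in units of full NR iterations."""
    return n_nr + qn_cost * n_qn


control = nr_equivalent(n_nr=7, n_qn=0)   # hypothetical control time step
blended = nr_equivalent(n_nr=5, n_qn=2)   # hypothetical blended time step
saving = 1.0 - blended / control
print(f"control {control:.2f}, blended {blended:.2f}, saving {saving:.1%}")
```

In these units a QN call pays for itself only if, on average, it removes more than 0.27 NR iterations, which is why calling it on too many iterations can increase the net cost.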

Wall-clock times for running 1000 calls for the NR iterations and QN iteration within the UKCA_BOX model run on a single processor core.

To test the range of options, we devised nine experiments for each scenario, as
summarised in Table 3. The control (CNTL) experiment does not call the QN
method and is identical to the solver in the release version of UKCA. The
other scenarios call the QN method after one or more NR-iterations, as given
by the numbers in the names of experiments in Table 3. For example, QN1 calls
the QN method after the first NR iteration only, QN2–3 calls it after
the second and third NR iterations, and QN1

Summary of experiments conducted using UKCA_BOX. The control (CNTL) experiment does not call the QN method. The other experiments call the QN method after one or more NR iterations.

Figure 3 shows chemical concentrations for a selection of chemical tracers
from the box model, comparing the CNTL experiment with the QN experiments,
for the Urban scenario. Similar figures for the other scenarios are included
in the Supplement. In this scenario, the mix of

Concentrations of

Time series of the number of iterations required to converge for the Urban scenario are shown in Fig. 4. Similar figures for the Rural, Marine and Strat scenarios are included in the Supplement (Figs. S1 and S2; S3 and S4; S5 and S6, respectively), which in general are found to converge in fewer iterations than the Urban case. The dashed blue line shows the number of NR iterations required to reach a stable solution at each time step, the red line shows the number of QN iterations required, and the black line shows the estimated NR-equivalent number of iterations taken to solve, using the result that QN iterations take on average 27 % of the computational time to solve compared to the NR method (Table 2). The first time step is the most difficult to solve, as the initial chemical concentrations are typically far from a steady state having been taken from monthly average values from model cells. After that, the dawn and dusk periods, the time steps immediately after photolysis is turned on and off, respectively, are the next most challenging, as changing photolysis rates causes an abrupt change in the lifetimes of many species. The inclusion of the QN method can be seen to improve the solver when the net NR-equivalent iterations (black line) are lowered compared to the CNTL scenario, and is optimal when this can be achieved with the minimum number of QN pseudo-iterations (red line, Fig. 4). While the UKCA_BOX model only solves a single case at any one time step, each core in the 3-D model will solve for many grid cells at each time step, and can only move on to the next time step once all have converged. In other words, the 3-D model is only as fast as its slowest grid cell. For this reason, the cases where the new methods reduce iteration count at the more challenging time steps (at dawn and dusk) are considered a stronger indication that they will improve integration time in the full 3-D model rather than the average.
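The reasoning that the 3-D model is only as fast as its slowest grid cell can be illustrated with a small sketch. The per-cell iteration counts below are invented for illustration and do not come from any simulation; the function name is hypothetical.

```python
def subdomain_iterations(cell_iterations):
    """Iterations a core spends on a subdomain: set by its slowest cell."""
    return max(cell_iterations)


# Hypothetical iteration counts for six grid cells on one core.
cells_cntl = [4, 4, 5, 4, 8, 4]
cells_qn = [4, 4, 4, 4, 6, 4]

for label, cells in (("CNTL", cells_cntl), ("QN", cells_qn)):
    mean = sum(cells) / len(cells)
    print(f"{label}: mean {mean:.2f}, subdomain cost {subdomain_iterations(cells)}")
```

In this made-up case the mean changes only slightly, but the cost of the subdomain drops from eight to six iterations because the single difficult cell converges sooner.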

The Urban scenario is the most challenging of the test cases to solve, due to
the high initial concentrations of reactive tracers (Fig. 4). The CNTL
scenario takes 12 full NR iterations to solve the first time step, then
between 4 and 7 for each time step thereafter, needing 4.36 iterations on
average (Fig. 4a). More iterations are required at dawn and dusk, with a
maximum of seven NR iterations required at dusk. Calling the QN pseudo-iteration
on the first iteration (QN1, QN1–2, QN1–3 and QN1

Plots of solver iteration (convergence) numbers for the original
full NR method and QN methods, with QN pseudo-iterations only called on
particular iteration(s). The CNTL scenario

Computational speed-up using the QN method in comparison to the regular Newton–Raphson method.

Average wall-clock time in seconds (

In this section, we report our results for the full 3-D global UM-UKCA
simulations with the QN method implemented (on the original ASAD solver code)
and without (classical NR method). We discuss these results from the
perspectives of model performance (computational savings and stability) and
prognostic evaluations (comparison of model physical values). All simulations
were performed using version 10.6.1 of the model, applying the GA7.1
configuration at 1.875

We have performed three sets of numerical experiments with two slightly
different configurations of UKCA. The first version (StratTrop) uses the
stratosphere–troposphere chemistry where all radiative feedback from UKCA
trace gases was turned off and aerosol climatologies were used. This setup
allows for changing the chemical species whilst maintaining the same wind
fields between the simulations. The UM-UKCA is parallelised by breaking the
domain up into a chess-board pattern of subdomains, defined by the number of
processes given for the east–west (EW) and north–south (NS) directions. The
solver iterates across all grid cells in the subdomain until all have reached
a stable solution. Thus, the computational speed is limited by the
hardest-to-converge (“stiffest”) grid cell in each subdomain. This
configuration was run for 20 model years using 432 cores
(24EW

Number of times that the solver needed to halve the time step in order to avoid divergences or wild oscillations over 1 year of integration.

A second set of simulations was performed using the stratosphere–troposphere
chemistry combined with the GLOMAP-mode aerosol scheme (StratTrop

Left column

Histograms of the number of NR iterations to convergence for the
216-core StratTrop

We begin our discussion with an overview of the timing for each simulation set. These total time measurements are complemented by a robustness assessment, checking the number of times that the time step of the main chemistry solver is halved in order to reach the prescribed accuracy (that is, where UKCA spends more CPU time in regions of stiff chemistry). This initial analysis is then expanded to a more detailed analysis via time measurement maps of the simulations and iteration maps of the chemistry solver.

Table 4 gives the total wall-clock time measurement results for the four
20-year sets of simulations (jobs). A plot of the speed-up for absolute
wall-clock time is also included in the Supplement (Fig. S7). Using our
suggested modification of the current algorithm leads to a net savings of

A legitimate question is how quasi-Newton methods, which are essentially based on approximations, change the robustness of the numerical scheme. This is particularly important since the modelled systems are generally under stiff conditions which are prone to instability. A poorly designed approximate method could wash out important information on the direction of the chemical evolution and cause the program to crash after some number of steps. To demonstrate that the approximation scheme that we propose is safe, we show in Table 6 the number of times the UKCA model halves the time step (a sign that the chemical conditions at that particular location and time are such that the solution fails to converge, oscillates or even diverges, and therefore the time step has to be reduced). According to Table 6, with the QN modification, the occurrence of halving the time step is nearly 2 times less frequent compared to the original algorithm, suggesting that the mixed QN method can be more robust in chemically stiff environments, saving more computational time overall as halving the time step significantly increases computational costs. The parallelisation of the UM-UKCA is such that the whole model can be held up by the few grid cells which fail to converge under the normal time step. So improving the robustness of the solver potentially has much greater benefits to net computational efficiency than just the direct reduction in cost to solve the individual grid cells.
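The halving mechanism itself can be sketched for a single hypothetical species with quadratic loss, dy/dt = -k y^2. This is an illustration under stated assumptions and not the UKCA code: the iteration limit is set to 8 rather than 50 so that failures occur in a small example, and the recursion into two half steps, the rate constant and the function names are all hypothetical.

```python
def newton_backward_euler(y_old, k, dt, max_iter, rtol=1e-10):
    """Try one backward Euler step; return None if Newton fails to converge."""
    y = y_old
    for _ in range(max_iter):
        dy = -(y - y_old + dt * k * y * y) / (1.0 + 2.0 * dt * k * y)
        y += dy
        if abs(dy) <= rtol * abs(y):
            return y
    return None


def integrate(y, k, dt, max_iter=8, stats=None, min_dt=1e-12):
    """Advance y by dt, halving the step whenever the solver fails."""
    stats = stats if stats is not None else {"halvings": 0}
    y_new = newton_backward_euler(y, k, dt, max_iter)
    if y_new is not None:
        return y_new, stats
    if dt < min_dt:
        raise RuntimeError("time step underflow")
    stats["halvings"] += 1  # failure: redo the interval as two half steps
    y_mid, _ = integrate(y, k, dt / 2.0, max_iter, stats, min_dt)
    return integrate(y_mid, k, dt / 2.0, max_iter, stats, min_dt)


y_end, stats = integrate(1.0, k=1.0e6, dt=1.0)
print(y_end, stats["halvings"])
```

Each halving discards the work already spent on the failed attempt and at least doubles the number of solves for that interval, so any reduction in the number of failures translates directly into saved time.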

Next, we carry out a grid point analysis of NR iterations to understand the origin of the computational savings. In general, the time that it takes the solver to calculate final chemical concentrations on a grid point depends heavily on the ambient photochemical conditions at that point and time. As a result, the number of iterations needed before the program exits the solver loop varies significantly across the domain.

Figure 5 shows maps of the mean number of iterations to convergence (averaged over column and time) for the 1-year simulations (one chemical time step is equal to 1 model hour) with the StratTrop (216- and 432-core) and GLOMAP (432-core) schemes. The CNTL simulations (left-hand column) clearly show regions where more iterations are required. The right-hand column shows the difference in mean number of iterations to convergence when using the QN2–3 method. Not only is the mean number of NR iterations reduced globally, but greater benefit is seen in the hot-spot regions noted in the CNTL simulations.

As Fig. 7 but for OH. Note the use of a log scale in the top (CNTL) plots. Note that the model domains are visible due to the extremely small differences in OH.

By summing the total number of points through the 1-year period according to number of iterations, a histogram of iteration numbers is produced which neatly summarises the performance of both methods (the CNTL and the QN cases). Figure 6 shows the histogram of the iteration numbers over all grid points for the 1-year simulations with the same StratTrop (216- and 432-core) and GLOMAP (432-core) schemes. The QN method greatly reduces the peak at eight iterations, and allows the majority of solutions (approximately 70 %) to be found in four or fewer NR iterations.
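The bookkeeping behind such a histogram can be sketched in a few lines of Python. The iteration counts below are made-up toy values, not model output, and the function names are hypothetical.

```python
from collections import Counter


def iteration_histogram(iterations_per_point):
    """Return {iteration count: fraction of grid points} in ascending order."""
    counts = Counter(iterations_per_point)
    total = sum(counts.values())
    return {n: counts[n] / total for n in sorted(counts)}


def fraction_within(hist, max_iterations):
    """Fraction of grid points that converged in at most max_iterations."""
    return sum(frac for n, frac in hist.items() if n <= max_iterations)


# Made-up iteration counts for ten grid points (NOT taken from the simulations).
toy_counts = [3, 4, 4, 2, 8, 4, 3, 5, 4, 3]
hist = iteration_histogram(toy_counts)
print(hist)
print(fraction_within(hist, 4))
```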

In this section, we evaluate the accuracy of our proposed method. Recall from Sect. 3.1 that the QN method produces physical values which are very close to what the original method calculates even for fast-changing species.

We test the accuracy of the two methods by comparing the model predictions
for two different species which have very different lifetimes (

For comparison of differences in values, we consider only the StratTrop scenario in which ozone and other chemical feedbacks are not included. This avoids intrinsic perturbations dominating the solutions over long periods of time and ensures that the dynamics are identical between both simulations.

From the last 10-year average of two 20-year experiments (StratTrop-CNTL and
StratTrop-QN2–3), we see that

For the comparison of OH concentrations in the 20-year StratTrop-CNTL and
StratTrop-QN2–3 experiments, Fig. 8 shows the zonal-mean differences and
surface value differences in the month of July. The difference values are
slightly larger but still only of the order of 0.1 % or smaller. Note that
the largest percentage differences are seen in the areas with the smallest
absolute OH concentrations. Almost everywhere else the fractional difference
in OH is less than the tolerance of the solver (10

In this subsection, we give a quantitative analysis of the differences in the
physical values obtained from the computations. In the strict sense of the
word, there is actually no extra “error” associated with our proposed
method of computation as both the classical NR and QN approaches give
approximate solutions of the real DE within a chosen error tolerance (which
is met by each method). Nevertheless, for completeness and comparison, we will
regard the NR computations (CNTL runs) as the “true” values and measure the
difference in OH and

The figures in the previous sections provide maps of absolute and relative
differences. Depending on the location of the point, these differences vary
but always stay very small. In order to have a more quantitative measure of
how different one particular run is from the other, we need a metric that
will take into account all of the grid points and the corresponding errors.
Considering the extreme low values of OH in certain regions, the most
suitable metrics (Yu et al., 2006) are the normalised mean absolute
difference (NMAD) and normalised root mean square difference (NRMSD) which
are, respectively, defined by

Comparison of Newton–Raphson versus quasi-Newton methods by the metrics NMAD and NRMSD.
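For readers who wish to compute comparable statistics, the sketch below implements NMAD, NRMSD and NMB in Python. The exact normalisations are an assumption: we use commonly quoted forms (sum of absolute differences over the sum of reference values, root mean square difference over the mean reference value, and summed difference over the sum of reference values), with the NR (CNTL) values as the reference. These should be checked against the definitions adopted in the article before any quantitative use; the function names and the sample numbers are hypothetical.

```python
import math


def nmad(test, ref):
    """Normalised mean absolute difference: sum|t - r| / sum(r) (assumed form)."""
    return sum(abs(t - r) for t, r in zip(test, ref)) / sum(ref)


def nrmsd(test, ref):
    """Normalised RMS difference: sqrt(mean((t - r)^2)) / mean(r) (assumed form)."""
    n = len(ref)
    rms = math.sqrt(sum((t - r) ** 2 for t, r in zip(test, ref)) / n)
    return rms / (sum(ref) / n)


def nmb(test, ref):
    """Normalised mean bias: sum(t - r) / sum(r) (assumed form)."""
    return sum(t - r for t, r in zip(test, ref)) / sum(ref)


# Made-up values standing in for NR (reference) and QN (test) fields.
ref_values = [1.0, 2.0, 3.0, 4.0]
test_values = [1.1, 1.9, 3.0, 4.2]
print(nmad(test_values, ref_values),
      nrmsd(test_values, ref_values),
      nmb(test_values, ref_values))
```

Normalising by the summed (or mean) reference value, rather than point by point, keeps grid points with extremely low OH from dominating the statistic.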

We also plot the NMAD, NRMSD and NMB as a function of time (each month) in
the last 10-year period for OH (Figs. S8, S9 and S10 of the Supplement,
respectively) and for

Atmospheric chemistry simulations are at the heart of coupled chemistry–climate models. Solving the complex sets of equations that represent the evolution of species comes at a high computational cost. In this article, we introduced a version of the quasi-Newton method into the UKCA coupled climate model. The quasi-Newton method demonstrates improvements, in multiple ways, over the classical Newton–Raphson method used in the UKCA model chemistry solver.

The main benefit of the QN approach, as discussed in Sect. 3, is its ability to reduce the computational time for the simulations. The advantages, however, are not limited to reducing the costs of chemistry calculations. The computations are more robust against stiff chemical environments, thereby reducing the possibility of divergence and instability in computations. On parallel platforms, even when there is no danger of instability, robustness actually can translate into extra computational gain as the method saves further time by avoiding unnecessary wait times in the subdomains. Overall, we see a reduction in total computational costs of the whole UKCA model of approximately 3 %, corresponding to a reduction of approximately 15 % in the chemistry routines. Whilst this may not seem like a big reduction, it is significant given the high costs associated with the rest of the coupled UKCA model. In practice, a 3 % reduction of costs for a large study involving 10 000 model years corresponds to 300 model years saved, roughly 100 real days of supercomputer time with the current setup.

We also demonstrated that the suggested method, while improving the performance, does not deteriorate the accuracy of physical predictions, which is an obvious requirement for any proposed method. From the cross comparisons under different computational environments (UKCA_BOX or parallel UM simulations) and different chemical scenarios (interactive or noninteractive), for a large spectrum of chemical species (ranging from very long to very short lifetimes), the method maintains the same level of accuracy as the original method.

Another feature of our approach is the ease with which it can be used with many existing chemistry solving systems. Whilst this work focussed specifically on the UKCA, the algorithm can be easily integrated into the existing codes of other (unrelated) coupled chemical system solvers. If implemented in a chemical transport model, for example, one would expect the overall benefit to be greater, because the chemical solver accounts for a larger proportion of the computational expense when other online physical processes are absent. As shown in Sect. 2, it is also simple to detach the algorithm from the modified program and revert to the original algorithm if desired using options defined in the namelist. Furthermore, since the method is quite generic, it can be used beyond solving chemical systems. We think that it will be just as easy to implement the method in other components of the climate model, for instance, solving systems of time-dependent nonlinear (partial) differential equations which can be cast into a problem of solving systems of nonlinear algebraic equations at each time step.

Finally, we remark that we have focused on one particular quasi-Newton approach, which takes advantage of available information and uses it to replace costly Jacobian construction and linear system solving routines, and which proved to work robustly under fairly general conditions. There are also other Newton-type methods that avoid or reduce Jacobian construction (Brown and Saad, 1990). Although these methods pursue relatively different strategies (and hence require more substantial changes to a classical NR-type algorithm), it would be interesting to investigate their numerical capability.

Due to intellectual property right restrictions, we cannot provide either the source code or the documentation papers for the UM. However, we provide pseudo-code for the NR and QN routine part of the DE system solver of the UKCA (see Appendix A below).

!

! Inside the new chemistry step: determine the concentrations for the next step...

…

…

…

Update tendencies (

Make an initial guess for the algebraic system as an input to the iterative
solver

! Iteration counter: k, maximum iteration counter: max_iter

! Update the

! Jacobian construction and linear system solving

Compute exact Jacobian

Solve for the new increment

! Updating the

Perform treatments for troublesome convergence (e.g.

dampening factor) or

Filtering of possible negative values in components of

! This can be done on iterations 2

recommended on steps 2 & 3

! This step will not be done if the

and the routine is about to exit

If (

Update the tendencies

Update the

!

Compute the Jacobian modification factor

Re-solve
for the newer increment

Update

End If

End Do
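As a runnable companion to the pseudo-code above, the following Python sketch shows the overall shape of a blended iteration: a full NR iteration on every pass, followed on selected iterations (here 2 and 3) by a cheaper step that reuses the Jacobian already built. It is a toy illustration under stated assumptions, not the UKCA code. In particular, the cheap step here is a plain frozen-Jacobian (chord) step standing in for the Jacobian modification factor of the pseudo-code, and the two-species kinetics, rate constants, tolerance and all function names are hypothetical.

```python
K1 = 1.0e3  # hypothetical fast rate constant (A -> B)
K2 = 1.0    # hypothetical slow rate constant (B + B -> A + B)


def residual(y, y_old, dt):
    """Backward-Euler residual F(y) = y - y_old - dt*f(y) for the toy kinetics."""
    a, b = y
    rate = K1 * a - K2 * b * b          # net conversion of A into B
    return [a - y_old[0] + dt * rate, b - y_old[1] - dt * rate]


def jacobian(y, dt):
    """Exact Jacobian dF/dy of the residual above."""
    b = y[1]
    return [[1.0 + dt * K1, -2.0 * dt * K2 * b],
            [-dt * K1, 1.0 + 2.0 * dt * K2 * b]]


def solve2(m, rhs):
    """Solve the 2x2 linear system m * x = rhs by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(rhs[0] * m[1][1] - m[0][1] * rhs[1]) / det,
            (m[0][0] * rhs[1] - rhs[0] * m[1][0]) / det]


def converged(dy, y, tol):
    return max(abs(d) / max(abs(v), 1e-30) for d, v in zip(dy, y)) < tol


def implicit_step(y_old, dt, qn_iters=(2, 3), tol=1e-10, max_iter=20):
    """Return (y_new, number of NR iterations, number of cheap QN-type steps)."""
    y = list(y_old)                      # initial guess: previous concentrations
    n_nr = n_qn = 0
    for k in range(1, max_iter + 1):
        jac = jacobian(y, dt)            # expensive part: build (and solve with) J
        f = residual(y, y_old, dt)
        dy = solve2(jac, [-f[0], -f[1]])
        y = [y[0] + dy[0], y[1] + dy[1]]
        n_nr += 1
        if converged(dy, y, tol):
            return y, n_nr, n_qn
        if k in qn_iters:                # cheap step: new residual, old Jacobian
            f = residual(y, y_old, dt)
            dy = solve2(jac, [-f[0], -f[1]])
            y = [y[0] + dy[0], y[1] + dy[1]]
            n_qn += 1
            if converged(dy, y, tol):
                return y, n_nr, n_qn
    raise RuntimeError("no convergence; a real solver would halve the time step")


y_plain, nr_plain, _ = implicit_step([1.0, 0.0], dt=1.0, qn_iters=())
y_blend, nr_blend, qn_blend = implicit_step([1.0, 0.0], dt=1.0, qn_iters=(2, 3))
print(nr_plain, nr_blend, qn_blend)
```

Passing an empty `qn_iters` recovers the classical NR loop, mirroring the option of switching the modification off. In a production solver the saving comes from reusing the factorised Jacobian in the cheap step; the 2x2 Cramer solve here is only a stand-in for that linear solve.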

EE developed and implemented the method. NLA modernised the implementation and performed the global simulations in the article. SAN developed the modern BOX_MODEL with support from PTG, NLA, and ATA and performed the UKCA_BOX simulations in the article. CM developed the earlier versions of the box model. ATA and JAP oversaw the work. EE, NLA, and SAN wrote the paper with contributions from all authors.

The authors declare that they have no conflict of interest.

We thank Oliver Wild for many useful discussions. The first author thanks Nigel Wood and Olaf Morgenstern for helpful comments. We also thank Alan Hewitt and Stuart Whitehouse for reviewing the code.

Model integrations have been performed using the ARCHER UK National Supercomputing Service and the MONSooN system, a collaborative facility supplied by the Joint Weather and Climate Research Programme, which is a strategic partnership between the UK Met Office and the Natural Environment Research Council. This work used the NEXCS HPC facility provided by the Natural Environment Research Council. We thank NCAS for providing support for the UKCA model development.

The first and last authors were supported under the ERC (ACCI) grant (project number 267760). Alex T. Archibald and Scott Nicholls thank the Isaac Newton Trust under whose auspices this work was funded.

Edited by: Jason Williams
Reviewed by: three anonymous referees