We explore coupling to a configurable subsurface reactive transport code as a flexible and extensible approach to biogeochemistry in land surface models. A reaction network with the Community Land Model carbon–nitrogen (CLM-CN) decomposition, nitrification, denitrification, and plant uptake is used as an example. We implement the reactions in the open-source PFLOTRAN (massively parallel subsurface flow and reactive transport) code and couple it with the CLM. To make the rate formulae designed for use in explicit time stepping in CLMs compatible with the implicit time stepping used in PFLOTRAN, the Monod substrate rate-limiting function with a residual concentration is used to represent the limitation of nitrogen availability on plant uptake and immobilization. We demonstrate that CLM–PFLOTRAN predictions (without invoking PFLOTRAN transport) are consistent with CLM4.5 for Arctic, temperate, and tropical sites.

Switching from an explicit to an implicit method increases rigor but introduces
numerical challenges.
Care needs to be taken to use scaling, clipping,
or log transformation to avoid negative concentrations during the
Newton iterations.
With a tight relative update tolerance (STOL) to avoid
false convergence, an accurate solution can be achieved with about
50 % more computing time than CLM in point mode site simulations
using either the scaling or clipping methods. The log transformation
method takes 60–100 % more computing time than CLM. The
computing time increases slightly for clipping and scaling; it
increases substantially for log transformation for half saturation
decrease from

As some biogeochemical processes (e.g., methane and nitrous oxide reactions) involve very low half saturation and thresholds, this work provides insights for addressing nonphysical negativity issues and facilitates the representation of a mechanistic biogeochemical description in Earth system models to reduce climate prediction uncertainty.

Land surface (terrestrial ecosystem) models (LSMs) calculate the
fluxes of energy, water, and greenhouse gases across the
land–atmosphere interface for the atmospheric general circulation
models for climate simulation and weather forecasting

As LSMs usually hardcode the soil biogeochemistry reaction network
(pools/species, reactions, rate formulae), substantial effort is often
required to modify the source code for testing alternative models and
incorporating new process understanding.

An essential aspect of LSMs is to simulate competition for nutrients
(e.g., mineral nitrogen, phosphorus) among plants and microbes. In
CLMs, plant and immobilization nitrogen demands are calculated
independently of soil mineral nitrogen. The limitation of nitrogen
availability on plant uptake and immobilization is simulated by
a demand-based competition: demands are downregulated by soil nitrogen
concentration

Three methods are used to avoid negative concentration in RTM
codes. One is to use the logarithm concentration as the primary
variable

As LSMs need to run under various conditions at the global scale for
simulation durations of centuries, it is necessary to resolve accuracy
and efficiency issues to use RTM codes for LSMs. The objective of this
work is to explore some of the implementation issues associated with
using RTM codes in LSMs, with the ultimate goal being accurate,
efficient, robust, and configurable representations of subsurface
biogeochemical reactions in CLM. To this end, we develop an
alternative implementation of an existing CLM biogeochemical reaction
network using PFLOTRAN (massively parallel subsurface flow and reactive transport)

Among the many reactions in LSMs are the soil biogeochemical reactions
for carbon and nitrogen cycles, in particular the organic matter
decomposition, nitrification, denitrification, plant nitrogen uptake,
and methane production and oxidation. The kinetics are usually
described by a first-order rate modified by response functions for
environmental variables (temperature, moisture, pH, etc.)

The reaction network for the carbon

In CLM–PFLOTRAN, CLM can instruct PFLOTRAN to solve the partial
differential equations for energy (including freezing and thawing),
water flow, and reaction and transport in the surface and
subsurface. This work focuses on the PFLOTRAN biogeochemistry, with the
CLM solving the energy and water flow equations and handling the
solute transport (mixing, advection, diffusion, and leaching).
Here, we focus on how reactions are implemented and thus only use PFLOTRAN in
batch mode (i.e., without transport). However, PFLOTRAN's advection and
diffusion capabilities are operational in the CLM–PFLOTRAN coupling described
here. In each
CLM time step, the CLM provides production rates for

The reactions and rates are implemented using the “reaction sandbox”
concept in PFLOTRAN

Unlike the explicit time stepping in CLM, in which only reaction rates need to be calculated, implicit time stepping requires evaluating derivatives. While PFLOTRAN provides an option to calculate derivatives numerically via finite-differencing, we use analytical expressions for efficiency and accuracy.
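The difference between the two derivative options can be illustrated with a single Monod rate term. The following is a minimal sketch in Python, not PFLOTRAN code; the function names, the parameter values (`r_max`, `k_half`), and the finite-difference step are assumptions chosen for illustration only.

```python
def monod_rate(c, r_max, k_half):
    """Monod-limited rate: r_max * c / (c + k_half)."""
    return r_max * c / (c + k_half)


def monod_rate_derivative(c, r_max, k_half):
    """Analytical derivative of the Monod rate with respect to c."""
    return r_max * k_half / (c + k_half) ** 2


def forward_difference(f, c, rel_step=1.0e-8):
    """Forward-difference derivative with a step scaled to the size of c."""
    h = rel_step * max(abs(c), 1.0e-30)
    return (f(c + h) - f(c)) / h


# Hypothetical values: concentration equal to the half saturation.
r_max, k_half, c = 1.0e-9, 1.0e-6, 1.0e-6
analytical = monod_rate_derivative(c, r_max, k_half)
numerical = forward_difference(lambda x: monod_rate(x, r_max, k_half), c)
print(analytical, numerical)
```

The analytical form costs one extra expression per rate term and does not depend on a step size, whereas the finite-difference estimate needs an additional rate evaluation per unknown and loses digits to cancellation when the concentration is very small.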

Many reactions can be specified in an input file, providing flexibility in adding various reactions with user-defined rate formulae. As typical rate formulae consist of first-order, Monod, and inhibition terms, a general rate formula with a flexible number of terms and typical moisture, temperature, and pH response functions is coded in PFLOTRAN. Most of the biogeochemical reactions can therefore be specified with a flexible number of reactions, species, rate terms, and response functions without source code modification; code modification is necessary only when new rate formulae or response functions are introduced. In contrast, the pools and reactions are traditionally hardcoded in CLM, so any change to the pools, reactions, or rate formulae may require source code modification. The more general approach used by PFLOTRAN thus facilitates implementation of increasingly mechanistic reactions and tests of various representations with less code modification.
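The idea of a general rate formula built from a flexible number of first-order, Monod, and inhibition terms can be sketched as follows. This Python fragment is purely illustrative: the `RateTerms` container, the `reaction_rate` function, and the species names are hypothetical and do not reproduce PFLOTRAN's input syntax or source code.

```python
from dataclasses import dataclass, field


@dataclass
class RateTerms:
    """Hypothetical description of one reaction's rate formula."""
    rate_constant: float
    first_order: list = field(default_factory=list)  # species names
    monod: dict = field(default_factory=dict)        # species -> half saturation
    inhibition: dict = field(default_factory=dict)   # species -> inhibition constant


def reaction_rate(terms, conc, response=1.0):
    """Multiply the rate constant by every configured term.

    `response` stands in for the product of moisture, temperature,
    and pH response functions, evaluated elsewhere.
    """
    rate = terms.rate_constant * response
    for name in terms.first_order:
        rate *= conc[name]
    for name, k_half in terms.monod.items():
        rate *= conc[name] / (conc[name] + k_half)
    for name, k_inh in terms.inhibition.items():
        rate *= k_inh / (k_inh + conc[name])
    return rate


# Hypothetical reaction: first order in "A", Monod in "B", inhibited by "C".
terms = RateTerms(rate_constant=2.0, first_order=["A"],
                  monod={"B": 1.0}, inhibition={"C": 1.0})
print(reaction_rate(terms, {"A": 3.0, "B": 1.0, "C": 1.0}))
```

With such a structure, adding or removing a term changes only the data describing the reaction, which is the property that allows reactions to be configured without recompiling.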

To use RTMs in LSMs, we need to make reaction networks designed for
use in explicit time-stepping LSMs compatible with implicit time-stepping RTMs. The limitation of reactant availability on reaction
rate is well represented by the first-order rate
(Eqs.

For the litter decomposition reactions (Appendix Reactions

To separate mineral nitrogen into ammonium (

In this equation

CLM uses a demand-based competition approach (Appendix A,
Sect.

Negative components of the concentration update (

A third approach, log transformation, also ensures a positive solution

The Newton–Raphson method and scaling, clipping, and log
transformation are widely used and extensively tested for RTMs, but
not for coupled LSM–RTM applications. The CLM describes biogeochemical
dynamics within daily cycles for simulation durations of hundreds of
years; the nitrogen concentration can be very low (

For tests 1–3, we start with plant ammonium uptake to examine the
numerical solution for Monod function, and then add nitrification and
denitrification incrementally to assess the implications of adding
reactions. For test 4, we check the implementation of mineralization
and immobilization in the decomposition reactions. We then compare
the nitrogen demand partition into ammonium and nitrate between CLM
and PFLOTRAN. With coupled CLM–PFLOTRAN spin-up simulations for
Arctic, temperate, and tropical sites, we assess the application of
scaling, clipping, and log transformation to achieve accurate,
efficient, and robust simulations. Spreadsheets, PFLOTRAN input
files, and additional materials are provided as Supplement, and archived
at

Our implementation of CLM soil biogeochemistry introduces mainly two
parameters: half saturation (

It was observed that plants can decrease nitrogen concentration to
below the detection limit in hours

We consider the plant ammonium uptake Reaction (

Discretizing it in time using the backward Euler method for a time-step size

Ignoring the negative root, [

We use a spreadsheet to examine the Newton–Raphson iteration process
for solving Eq. (

Even though clipping avoids convergence to the negative solution, the
ammonium consumption is clipped, but the

In contrast to clipping, scaling applies the same scaling factor to
limit both ammonium consumption and

Small to zero concentration for ammonium and

This simple test for the Monod function indicates that (1) Newton–Raphson iterations may converge to a negative concentration; (2) scaling, clipping, and log transformation can be used to avoid convergence to negative concentration; (3) small or zero concentration makes the Jacobian matrix stiff or singular when log transformation is used, and clipping is needed to guard against overflow or underflow of the exponential function; (4) clipping limits the consumption, but not the corresponding production, violating reaction stoichiometry in the iteration; (5) production reactions with external sources are inhibited in the iterations when scaling is applied, which is unintended; (6) additional iterations can resolve issues in (4) or (5); and (7) loose update tolerance convergence criteria may cause false convergence and result in mass balance errors for clipping and scaling.
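Findings (1) and (2) can be reproduced with a few lines of Python. The sketch below applies backward Euler to a single Monod uptake term, giving the residual f(c) = c - c_old + dt_rate * c / (c + k_half), and solves it with Newton–Raphson with and without clipping. The parameter values, the clipping floor of 1e-20, and the use of an absolute residual test as the only stopping criterion are assumptions made for illustration; they are not the settings used in this work.

```python
import math


def residual(c, c_old, dt_rate, k_half):
    """Backward Euler residual for Monod-limited uptake."""
    return c - c_old + dt_rate * c / (c + k_half)


def residual_derivative(c, dt_rate, k_half):
    """Analytical derivative of the residual with respect to c."""
    return 1.0 + dt_rate * k_half / (c + k_half) ** 2


def newton(c_old, dt_rate, k_half, clip_floor=None, max_iter=50, atol=1.0e-15):
    """Newton-Raphson; if clip_floor is given, updates are clipped to it."""
    c = c_old
    for _ in range(max_iter):
        f = residual(c, c_old, dt_rate, k_half)
        if abs(f) < atol:
            break
        c_new = c - f / residual_derivative(c, dt_rate, k_half)
        if clip_floor is not None and c_new < clip_floor:
            c_new = clip_floor  # keep the concentration positive
        c = c_new
    return c


def positive_root(c_old, dt_rate, k_half):
    """Positive root of c**2 + (k_half + dt_rate - c_old)*c - k_half*c_old = 0."""
    b = k_half + dt_rate - c_old
    return 0.5 * (-b + math.sqrt(b * b + 4.0 * k_half * c_old))


# Hypothetical case: demand over the step (dt_rate) exceeds the available c_old.
c_old, dt_rate, k_half = 1.0e-3, 2.0e-3, 1.0e-6
print("unguarded:", newton(c_old, dt_rate, k_half))
print("clipped:  ", newton(c_old, dt_rate, k_half, clip_floor=1.0e-20))
print("exact:    ", positive_root(c_old, dt_rate, k_half))
```

In this single-species case the unguarded iteration settles on the negative root of the quadratic, while the clipped iteration recovers the positive one. The stoichiometry issue in finding (4) does not appear here because there is no product species to track.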

Adding a nitrification Reaction (

A semianalytical solution similar to Eq. (

Depending on the rates (

Influence of half saturation

The scaling factor (

In summary, this test problem demonstrates that (1) a negative update can be produced even for products during a Newton–Raphson iteration; and (2) when a negative update is produced for a very low concentration, a very small scaling factor may numerically inhibit all of the reactions due to false convergence even with very tight STOL.
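The second point can be made concrete with a two-species example. The Python sketch below assumes a scaling rule of the form lambda = min(1, alpha * c_i / |delta_i|) over species with negative updates, and a relative update test ||lambda * delta|| / ||c|| < STOL; both forms, the value of `alpha`, the species, and all numbers are assumptions for illustration and may differ from the actual implementation.

```python
import math


def scaling_factor(conc, delta, alpha=0.99):
    """Largest factor <= 1 that keeps every conc + factor * delta positive."""
    lam = 1.0
    for c, d in zip(conc, delta):
        if d < 0.0:
            lam = min(lam, alpha * c / (-d))
    return lam


def norm2(values):
    """Euclidean norm of a sequence."""
    return math.sqrt(sum(v * v for v in values))


def relative_update(conc, delta):
    """Relative size of an update, as compared against STOL."""
    return norm2(delta) / norm2(conc)


# Hypothetical state: abundant ammonium, nearly absent nitrous oxide.
conc = [1.0e-3, 1.0e-15]
raw_update = [1.0e-5, -1.0e-6]  # Newton update before scaling

lam = scaling_factor(conc, raw_update)
scaled_update = [lam * d for d in raw_update]

stol = 1.0e-8
print("scaling factor:", lam)
print("raw update passes STOL:   ", relative_update(conc, raw_update) < stol)
print("scaled update passes STOL:", relative_update(conc, scaled_update) < stol)
```

The negative update to the trace species drives the scaling factor to about 1e-9, so the scaled update passes the STOL test even though the unscaled update is about 1 % of the solution norm. The positive ammonium update is suppressed by the same factor, which is the numerical inhibition described above.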

The matrix and update equations with plant nitrate uptake and
denitrification added to test 2 are available in
Appendix

We examine another part of the reaction network: decomposition,
nitrogen immobilization, and mineralization
(Fig.

For comparison with CLM, we examine the uptake rate as a function of
demands and available concentrations

We test the implementation by running CLM–PFLOTRAN simulations for
Arctic (US-Brw), temperate (US-WBW), and tropical (BR-Cax) AmeriFlux
sites. The CLM–PFLOTRAN simulations are run in the mode in which
PFLOTRAN only handles subsurface chemistry (decomposition,
nitrification, denitrification, plant nitrogen uptake). For comparison
with CLM, (1) depth and

Calculated LAI and nitrogen distribution among vegetation, litter,
SOM,

The US-Brw site (71.35

The US-WBW site (35.96

The BR-Cax site (

The site climate data from 1998 to 2006, 2002 to 2010, and 2001 to
2006 are used to drive the spin-up simulation for the Arctic (US-Brw),
temperate (US-WBW), and tropical (BR-Cax) sites, respectively. This
introduces a multi-year cycle in addition to the annual cycle
(Figs.

Calculated LAI and nitrogen distribution among vegetation, litter,
SOM,

The Arctic site shows a distinct summer growing season
(Fig.

The higher

Numerical errors introduced due to false convergence in clipping,
scaling, or log transformation are captured in CLM when it checks
carbon and nitrogen mass balance for every time step for each column,
and reports

Wall time for CLM–PFLOTRAN relative to CLM for spin-up simulation on OIC (ORNL Institutional Cluster Phase5).

CLM wall time is 29.3, 17.7, and 17.1 h for the Arctic, temperate, and
tropical sites for a simulation duration of 1000, 600, and 600 years.

Mass balance errors are reported for

Calculated LAI and nitrogen distribution among vegetation, litter,
SOM,

Resetting nitrous oxide concentration to

The results for scaling are similar to those for clipping: mass balance errors
are reported for

Decreasing STOL can decrease and eliminate the numerical inhibition
in the case of

The frequent negative update to nitrous oxide is produced because the
rate for the nitrification Reaction (

The numerical errors can be decreased and eliminated by decreasing
STOL. Similar to tests 2–3, a
small STOL can result in small

Global land surface models have traditionally represented subsurface soil biogeochemical processes using preconfigured reaction networks. This hardcoded approach makes it necessary to revise source code to test alternative models or to incorporate improved process understanding. We couple PFLOTRAN with CLM to facilitate testing of alternative models and incorporation of new understanding. We implement the CLM-CN decomposition cascade, nitrification, denitrification, and plant nitrogen uptake reactions in CLM–PFLOTRAN. We illustrate that with implicit time stepping using the Newton–Raphson method, the concentration can become negative during the iterations even for species that have no consumption, which needs to be prevented by intervening in the Newton–Raphson iteration procedure.

Simply stopping the iteration when a negative concentration occurs and returning to the time-stepping subroutine to cut the time-step size can avoid negative concentration, but may result in small time-step sizes and high computational cost. Clipping, scaling, and log transformation can all prevent negative concentration and reduce computational cost, but at some risk to accuracy. Our results reveal implications when the relative update tolerance (STOL) is used as one of the convergence criteria. While use of STOL improves efficiency in many situations, satisfying STOL does not guarantee satisfying the residual equation, and therefore it may introduce false convergence. Clipping reduces the consumption but not the production in some reactions, violating reaction stoichiometry. Subsequent iterations are required to resolve this violation. A tight STOL is needed to avoid false convergence and prevent mass balance errors. While the scaling method reduces the whole update vector following the stoichiometry of the reactions to maintain mass balance, a small scaling factor caused by a negative update to a small concentration may diminish the update and result in false convergence, numerically inhibiting all reactions, which is not intended for production with external sources (e.g., nitrogen deposition from CLM to PFLOTRAN). For accuracy and efficiency, a very tight STOL is needed when the concentration can be very low. Log transformation is accurate and robust, but requires more computing time. The computational cost increases with decreasing concentrations, most substantially for log transformation.
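For a single species, the log transformation can be sketched as follows. The Python fragment solves the backward Euler equation for Monod-limited uptake, c - c_old + dt_rate * c / (c + k_half) = 0, with u = ln(c) as the primary variable, so that c = exp(u) is positive by construction. The parameter values, the limit of 5 on the change in u per iteration (used here to guard the exponential against overflow and underflow), and the absolute residual test are assumptions for illustration, not the configuration used in this work.

```python
import math


def solve_log_newton(c_old, dt_rate, k_half,
                     max_log_change=5.0, max_iter=100, atol=1.0e-15):
    """Newton-Raphson in u = ln(c) for backward Euler Monod uptake."""
    u = math.log(c_old)
    for _ in range(max_iter):
        c = math.exp(u)
        f = c - c_old + dt_rate * c / (c + k_half)
        if abs(f) < atol:
            break
        # Chain rule: df/du = c * df/dc.
        dfdu = c * (1.0 + dt_rate * k_half / (c + k_half) ** 2)
        du = -f / dfdu
        # Limit the change in ln(c) so that exp(u) stays representable.
        du = max(-max_log_change, min(max_log_change, du))
        u += du
    return math.exp(u)


# Hypothetical case: demand over the step (dt_rate) exceeds the available c_old.
c_new = solve_log_newton(c_old=1.0e-3, dt_rate=2.0e-3, k_half=1.0e-6)
print(c_new)
```

Because df/du scales with the concentration itself, the Newton step in u grows as the concentration shrinks, so the step limit becomes active more often and more iterations are needed. This is consistent with the higher cost of log transformation at low concentrations noted above.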

These computational issues arise because we switch from explicit
to implicit methods for soil biogeochemistry. We use small half
saturation (e.g., 10

For reactions with very low half saturation and residual concentrations, e.g.,
redox reactions involving O

Our CLM–PFLOTRAN spin-up simulations at Arctic, temperate, and
tropical sites produce results similar to CLM4.5, and indicate that
accurate and robust solutions can be achieved with clipping, scaling, or
log transformation. The computing time is 50 to 100 % more than
CLM4.5 for a range of half saturation values from 10

An alternative to our approach of coupling LSMs with reactive
transport codes is to code the solution to the advection, diffusion,
and reaction equations directly in the LSM. This has been done using
explicit time stepping and operator splitting to simulate the
transport and transformation of carbon, nitrogen, and other species in
CLM

PFLOTRAN is open-source
software. It is distributed under the terms of the GNU Lesser General
Public License as published by the Free Software Foundation either
version 2.1 of the License, or any later version. It is available at

The CLM-CN decomposition cascade consists of three litter pools with
variable CN ratios, four soil organic matter (SOM) pools with constant
CN ratios, and seven reactions

CLM4.5 has an option to separate

As the CN ratio is variable for the three litter pools, litter N pools
need to be tracked such that Reaction (

The nitrification reaction to produce

The denitrification reaction is

The plant nitrogen uptake reaction can be written as

Denote

Ignoring equilibrium reactions and transport for simplicity of
discussion in this work, PFLOTRAN solves the ordinary differential
equation,

If none of these tolerances are met in MAXIT iterations or MAXF
function evaluations, the iteration is considered to diverge, and
PFLOTRAN decreases the time-step size for MAX_CUT times. The default
values in PFLOTRAN are ATOL

Adding to test 2 a plant

G. Bisht, B. Andre, R. T. Mills, J. Kumar, and F. M. Hoffman developed the CLM–PFLOTRAN framework that this work is built upon. F. Yuan, G. Tang, G. Bisht, and X. Xu added biogeochemistry to the CLM–PFLOTRAN interface. F. Yuan proposed the nitrification and denitrification reactions and rate formulae. G. Tang, F. Yuan, and X. Xu implemented the CLM soil biogeochemistry in PFLOTRAN under guidance of G. E. Hammond, P. C. Lichtner, S. L. Painter, and P. E. Thornton. G. Tang prepared the manuscript with contributions from all co-authors. G. Tang, F. Yuan, G. Bisht, and G. E. Hammond contributed equally to the work.

Thanks to Nathaniel O. Collier at ORNL for many discussions that
contributed significantly to this work. Thanks to Kathie Tallant and
Kathy Jones at ORNL for editing service. This research was funded by
the U.S. Department of Energy, Office of Science, Biological and
Environmental Research, Terrestrial Ecosystem Sciences and Subsurface
Biogeochemical Research Program, and is a product of the
Next-Generation Ecosystem Experiments in the Arctic (NGEE-Arctic)
project. Development of CLM–PFLOTRAN was partially supported by the
ORNL Laboratory Directed Research and Development (LDRD) program. ORNL
is managed by UT-Battelle, LLC, for the U.S. Department of Energy
under contract DE-AC05-00OR22725.