A generic biogeoch mical module for earth system models

Introduction Conclusions References


Introduction
Terrestrial ecosystems store almost three times as much carbon as the atmosphere; hence changes in the terrestrial carbon budgets have important implications to the future climate through carbon cycle feedbacks.Accurate modeling of carbon cycling at regional to global scales must incorporate mechanism-based, robust representations of soil carbon and nutrient cycling processes as well as other biogeochemical processes that are coupled with the carbon cycle.
A number of biogeochemistry modules have been developed and used in Earth system models (ESM) to simulate the fluxes of carbon, nitrogen, energy and water into and out of an ecosystem (e.g.CLM-CN that originated from Biome-BGC, Thornton et al., 2007, CENTURY/DAYCENT, Parton et al., 1988, 2001, and other terrestrial biosphere models that participated in the Carbon Land Model Intercomparison Project -CLAMP, Randerson et al., 2009, the International Land Model Benchmarking Project -ILAMB, Luo et al., 2012, etc.).In these previous modules, biogeochemical processes are represented by a set of ordinary differential equations (ODEs), and each of the equation lumps contribution from various individual biogeochemical processes.Although schematic diagrams are presented in literature for C and N flows among different compartments, e.g.within the plants, above ground or in the soil, it is difficult to deconvolute each differential equation to track the contribution from individual process.This raises a difficulty in updating or adding new processes into the module.Furthermore, many land surface models such as the current CLM only include C and N cycling, even though P cycling has been shown to be important in regulating terrestrial biogeochemical processes (Wang et al., 2010;Yang et al., 2013).In addition, anthropogenic sulfur (S), which is not in the current CLM, can also disturb the biogeochemical cycling in terrestrial ecosystems through competition for labile forms of organic carbon between nitrate-reducing and sulfate-reducing bacteria (Bünemann and Condron, 2007;Gu et al., 2012).Introduction

Conclusions References
Tables Figures

Back Close
Full All of the biogeochemistry modules, as reviewed in a recent report of the Intergovernmental Panel on Climate Change (IPCC, 2007), assume that organic carbon decomposition driven by microbial communities follows a first order chemical decay process, neglecting feedback interactions between microbial activities and the controlling processes and factors (such as the form of substrates, microbial function, community, and abundance) (Todd-Brown et al., 2012).The physiology of the microbial communities is affected by climate change, which in turn leads to changes in microbial decomposition rates (Bradford et al., 2008).To better represent how microbial processes influence biogeochemical cycling, integrating microbial communities into terrestrial ecosystem models has been called upon in recent years to reduce uncertainty in model prediction (Allison and Martiny, 2008;McGuire and Treseder, 2010).Recent advances in soil microbiology are now allowing us to better understand the dynamics of microbial community and how it affects biogeochemical processes in soils.The simple representation of soil biogeochemical processes in current models can no longer handle the complexity that results from the integration of the microbial community dynamics.
One of the major challenges to integrate microbial dynamics is that currently each biogeochemistry module is developed in an ad hoc way with little flexibility to modify or incorporate new biogeochemical processes because they are hardwired in the computer source codes.When additional and/or different biogeochemical and physical processes are to be included, the source codes have to be significantly modified, including code changes to output new state variables.Maintenance and modification of the current software hierarchy is time consuming, labor intensive, and error prone.
In this paper, we propose a generic, reaction-based algorithm to integrate soil biogeochemical processes into land surface models using CLM as an example.A new biogeochemical module was developed based on the generic algorithm.The module can be readily used to update or incorporate new C, N, P and other nutrient cycle models in CLM and to evaluate the effects of alternative process pathways and mechanisms on climate changes.Similar to modeling geochemical reactions using the reaction based approach that has been commonly used in modeling subsurface reactive transport, the Introduction

Conclusions References
Tables Figures

Back Close
Full new biogeochemical module treats terrestrial C, N, and P cycles as reaction pathways and these pathways are explicitly modeled using the reaction based approach.In the following sections, we will first introduce this approach, followed by using this approach to develop the generic biogeochemical module in CLM.As a validation, the results from the new module incorporated into CLM were compared with the original CLM-CN.
The results were then compared to those using the decomposition model of soil organic matter in CENTURY (Parton et al., 1988) under the CLM modeling framework.A procedure to integrate new processes into the new biogeochemical module is demonstrated using a global P cycling as an example (Wang et al., 2010).The P included model can be used to evaluate P effect on productivity of terrestrial ecosystems.

Reaction-based approach
A dynamic reaction system may be defined by reaction pathways or networks (Yeh et al., 2001).In the reaction-based approach, the mass balance of each state variable of the reaction system can be written as in which, C i is the mass of the i -th state variable, t is time, R k is the rate of the k-th reaction, v i k is the reaction stoichiometry of the i -th state variable in the k-th reaction associated with the products, µ i k is the reaction stoichiometry of the i -th state variable in the k-th reaction associated with the reactants, and N is the number of reactions.
Equation (1) states that the rate of change in the mass of any state variable is due to all reactions that produce or consume that state variable.The whole system can be

Conclusions References
Tables Figures

Back Close
Full written in a matrix form as If M is the total number of state variables in the system, then [I] is a M × M unit matrix, dC dt is a vector of length M, [A] is a M × N matrix, the column k of which consists of stoichiometries of all C i 's involved in reaction k with rate R k , which are allowed to vary during the simulation to account for the effects of moisture, temperature, etc. on the reaction stoichiometries.
The reaction rates in subsurface ecosystems typically span over a wide range of magnitude, which makes it inefficient to directly integrate Eq. (1).To overcome this difficulty, the fast reactions are often assumed to occur instantaneously, so the ODEs for the fast reactions can then be described using algebraic equations (AEs).A systematic way of deriving ODEs and decouple them from fast reactions using matrix row and column operation was detailed in Fang et al. (2003).A generic biogeochemical reaction solver was developed in Fang et al. (2003) for this purpose.It was modified and called NGBGC (Next Generation BioGeoChemical module) hereafter in this study.

Development of the generic biogeochemical module
The current biogeochemical module in CLM has a total of around 100 state variables and hundreds of physical and biogeochemical processes to represent carbon and nitrogen cycles.The number of equations is expected to grow when more processes and state variables are added.Users have to manually discern the ODEs and keep track of the state variables, stoichiometries, and their relationship to each process.This is tractable when each process only involves at most two state variables as currently represented in the CLM model, although a significant effort is required to track each variable and process.It becomes a significant burden for the programmers and modelers when complex reaction pathways that involve more variables and processes have to be integrated.Users also need to be familiar with the code before they can make Introduction

Conclusions References
Tables Figures

Back Close
Full efficient updates to it.These difficulties can be resolved using the new biogeochemical module presented in this study.The new biogeochemical module, NGBGC, is designed to flexibly handle any number of biogeochemical processes and state variables that are to be included in any ESM without changing algorithms and codes related to ODEs for biogeochemistry in CLM.
In addition, NGBGC can be conveniently used to derive the mathematical equations to describe these processes.The following illustrates the essence of the proposed generic algorithm and modeling procedures.
To be consistent with the biogeochemical processes commonly integrated in CLM such as CLM-CN, we first identified these processes by reversely analyzing the numerical codes that solve the ODEs used in CLM-CN.A reaction network was then built based on the identified processes.Table 1 shows the processes involved in the cascade model of litter and soil organic matter decomposition in CN.
In Table 1, C and N are state variables of each pool.The symbol "→" denotes the reaction direction, which is assumed to be nonreversible.Variables on the left are reactants, and on the right are products.If a process does not have a reactant/product, the process represents the source/sink of the state variable.Other variables in the table are defined in the List of Abbreviations in the Appendix.
If one were not familiar with the code and processes that the code is written to represent, making changes would be difficult.In a reaction-based approach, by studying the processes in Table 1 and how the rates were defined in the original code, we came up with the following reaction-based decomposition cascade processes with balanced elements in Table 2.
In Table 2, parameters a are the inverse of C : N ratio for each pool, which are calculated before each new time integration, C and N are state variables of each pool.R 1 to R 8 are decomposition rates, R 9 -R 15 are heterotrophic respiration rates of each pool.To illustrate the procedure executed in NGBGC, the ODEs in the form of Eqs. ( 1) and ( 2) for the state variables and reactions in Table 2 can be written as follows: The ODEs for the rest of the state variables based on the reactions in Table 2 can be derived similarly based on the mass balance for each state variable and were not listed here.The advantages using such a reaction-based approach are: (1) the meaning of each process is obvious to the users, (2) carbon and nitrogen are explicitly linked in the reaction pathways, and additional variables, such as rate names for N transition in Table 1, are not necessary because they will be handled automatically by NGBGC using the stoichiometries in each process, and (3) new elements can be added in each process in a straightforward way.
If the decomposition model of soil organic matter in CENTURY (Parton et al., 1988) is to be used, the pathways with balanced elements are shown in Table 3.
The ODEs for the state variables based on the reactions in Table 3 can be formulated similarly as described for those in Table 2.Note that NGBGC will automatically form the ODEs once reactions such as those in Table 2 or 3 are provided as input to the module.NGBGC has been incorporated in CLM4.5 (science branch).It consolidates all the biogeochemical subroutines (6 in CLM-CN, 3 for carbon and 3 for nitrogen) used in the original CLM-CN code into one with a little over 100 lines of code.Unlike the original code, this single subroutine to describe biogeochemical processes will not be modified when a new or updated biogeochemical process is to be incorporated so it can significantly simplify the programming and more effective ODE solvers for new computational platforms can be introduced easily by modifying a single subroutine.

Development of process database and script tool
To use NGBGC, a perl script was written to provide the stoichiometries of biogeochemical reactions such as those illustrated in Tables 2 or 3 as an input file to the module.
Forming a self-consistent biogeochemical reaction network with correct stoichiometries is a major challenge not only for modeling, but also for process characterization and understanding because of the complexity and a large number of biogeochemical processes and reactions involved in terrestrial carbon cycling.To solve this problem from the modeling point of view, we developed a reaction database that can be used to concisely store all biogeochemical reactions in plants, litter, and soil that can potentially affect carbon cycling processes.For a specific simulation problem, relevant reactions can be extracted by a PERL script from the database to formulate a table of reaction network as illustrated in Tables 2 or 3.Each biogeochemical process in the database has a unique name that is used to track the rate definition in the code and a tag to denote its source so that users have the flexibility to choose those processes either from a particular model (e.g.CLM-CN, CEN-TURY, etc.) currently used by the modeling community or introduce new or updated processes.For users to efficiently update the database, the reactions are grouped into three types: reactions within plants, reactions from plants to soil, and reactions in the soil.The format will be provided in a user's manual so that one can easily update the database with new and updated biogeochemical processes by only modifying the source code for the kinetics of the new processes.A database containing those processes reversely identified from CLM-CN can be found in the Supplement.Application of the database is demonstrated in the result section and the simulation results using the reaction-based approach are benchmarked against those from the existing biogeochemical module, such as CLM-CN.

GMDD Introduction Conclusions References
Tables Figures

Back Close
Full For each process in the database, state variables are separated into plant functional type (pft) or column averaged type by (p) and (c), respectively.For instance, the following pathway represents the contribution to the column c in litter pool from the leaf pool of each pft type: The ODEs for the above three reactions can be written as In the current version of CLM-CN, the right hand side of Eq. ( 6) is represented by a lumped rate:

Conclusions References
Tables Figures

Back Close
Full in which r is a function of leafc(p).R1, R2 and R3 are calculated as a fraction of r based on the fraction of each litter pool in the column and the weight of each pft relative to the column.Thus, for this type of reactions involving both pft and column state variables, attention needs to be paid when assigning reaction rates to the corresponding ODEs.We have two functions specified in the module: set_column_chem_rate and set_pft_chem_rate; set_pft_chem_rate is called when integrating pft state variables, and set_column_chem_rate is called when integrating column state variables.In the database, a comment is appended for this type of pathways that use a lumped rate in the ODEs like Eq. ( 10).
As mentioned above, a tool written with the PERL script is a key component of this generic module.The script has three functions: (1) generate the stoichiometries that are readable by NGBGC; (2) define and incorporate new processes in CLM; and (3) generate the necessary functions based on a specific reaction network picked from the database.These new functions include pointers that point the reaction names in the database to the defined rate expressions that are used in the code, and state variable names and type (C, N, P, ...), identify those reactions (such as those in Table 2) that have dynamic variable stoichiometries during the simulation.It also identifies transport properties (mobile or immobile) of each column state variable so that a transport capability in CLM such as the one that was recently developed by Tang et al. (2013) (CLM-BeTR) to deal with depth dependent biogeochemistry in the soil can be used.Note that by identifying the pools that are mobile in the soil, Eq. ( 2) can be readily extended to derive the transport equations of those pools for the land model (Fang et al., 2003).
The framework developed in this study allows using the same modeling algorithm to simulate C and N cycling in terrestrial ecosystems with different mathematical representations.The new approach can also readily integrate other nutrient cycles of importance, such as phosphorus (P), potassium (K), and sulfur (S), into the simulation system.Once these processes are in the database, the module generates equations to be solved automatically.But users have the freedom to input user-defined rate expressions for the new kinetic mechanisms.

CN model
As mentioned in the section above, we have reversely identified those processes used in the CLM-CN model and incorporated them into our database.All the processes included in CLM-CN were then extracted from the database using the PERL script and input into NGBGC for simulation.All the modeling results presented in this study are defined on a grid with 1 • resolution, using the stand-alone CLM with the Qian et al. (2006) atmospheric input data for 2002 and 2003, CO 2 level and aerosol deposition for 2000, and arbitrary-initial condition.Using the new approach and the processes in Table 1, we were able to obtain identical results as that of the original CLM-CN.We then replaced the decomposition processes with the reactions in Table 2 and refer our model as CLM-NGBGC.Results of the simulation after two years of integration using the original code and our code were compared and they are exactly the same as shown in Fig. 1 by the temporal change of total litter C at the Tonzi Ranch site located in the lower foothills of the Sierra Nevada Mountains.
Another simulation was run using the decomposition model of soil organic matter described in Parton et al. (1988) with initial global residence time of soil1 (1.5 yr), soil2 (25 yr), and soil3 (1000 yr).The database was used to pick the reactions in Table 3 instead of reactions with rates R 4 , R 5 , R 6 in Table 2. Since this decomposition model has only three soil organic pools, reaction with rate R 15 in Table 2 was not picked either.This is done through "century" tag in the database.We refer to the reaction network involving soil organic matter decomposition reactions in Table 2 as Reaction Network 1 and that involving reactions in Table 3 as Reaction Network 2 hereafter.Spatial distribution of net primary productivity (NPP) and carbon in soil organic matter  and 3.The total NPP calculated using Reaction Network 2 is 89 % of that using Reaction Network 1.

CNP model
Recent technology of global mapping of P (Yang et al., 2013) has made it possible to integrate global P cycling in land model.There is no P cycling model in the current CLM framework.To integrate phosphorus cycling into CLM, we assume the allocation of P to live aboveground and belowground and the flow of P in plant litter and different organic soil pools follow the allocation of N in CLM-CN: (1) phenology transition flux of C, N and P maintains a constant stoichiometry, (2) P is allocated to an additional storage pool after the P needs are satisfied for plant tissues, and (3) the scheme of occurrence of retranslocation of P follows that of N. Stoichiometric relationship of P during decomposition follows that of N in Tables 2 and 3.For instance, the first reaction in Table 2 is now written as where P labile is the labile pool of P in the soil, parameters b are the inverse of the C : P ratio for each pool.
Five extra processes described in Wang et al. (2010) were used to represent the dynamics of labile, sorbed, strongly sorbed, occluded phosphorus in soil, and biochemical mineralization (breakdown) of slow and passive soil organic P pools through a group of enzymes collectively known as phosphatases (Wang et al., 2007).These processes are shown in Table 4. Sorbed P was assumed to be in equilibrium with labile P, and their relationship is described using the Langmuir isotherm (Wang et al., 2007).sorbed P (g P m −2 ) and adsorption constant (g P m −2 ), respectively.v pmax (0.1) is the maximum specific biochemical mineralization rate (yr −1 ), K ptase (150.0) is an empirical constant (gN(gP) −1 ).λ pup (25.0) and λ ptase (15.0) are the N cost for P uptake and phosphatatse production (gN(gP) −1 ), respectively.P enters the system through dust deposition and weathering and leaves by leaching.
Weathering is not soil texture dependent, but fixed at a constant rate.Dust deposition rate is 0.0017 g P m −2 yr −1 .Soil P leaching rate is calculated as In this study, k is assumed to be 0.04 yr −1 .The C : P ratios of plant tissues: leaf, live stem, dead stem, live coarse root, dead coarse root, fine root were estimated from Wang et al., roughly 15 times of the C : N ratio.The initial C : P ratios for litters and newly formed soil organic pool are 4 times of their C : N ratios, the C : P ratios of the other soil organic pools are 7 times of their C : N ratios.
All parameters for this P cycling model were within the ranges reported in the literature, and were assumed to be constant, independent of soil texture or biome in the modeling.Maintenance respiration dependency on P was not considered either.Therefore, this example is for demonstration purpose only.
Note that P sorption is a fast process and consequently was treated as an equilibrium process and represented by an algebraic equation.The total number of ODEs derived was one less than the total number of state variables.To complete the system, one algebraic equation represented by the Langmuir isotherm is included in the solution process.Using the decomposition approach, the ODE regarding labile P was written as the following:

GMDD Introduction Conclusions References
Tables Figures

Back Close
Full These ODEs for P only were then linked with other ODEs involving C and N.After each integration, the combined mass of labile and sorbed P was obtained.The individual mass of labile and sorbed P was solved using Newton iteration for the Langmuir equation.As described before, once reaction database was updated to include P, and new reaction rate laws such as those in Table 4 were inputted, the entire set of ODEs was automatically formulated and solved by the NGBGC.If a depth dependent P were to be considered, only labile P in Eq. ( 12) will be transported.Total gross photosynthesis (GPP) for the original CN model was scaled or downregulated by the N limitation.GPP down-regulation for this CNP model is controlled by the minimum of N-limiting factor and P-limiting factor, which are calculated as x P = min 1, P labile in which, x is the nutrient limiting factor, N smin is the amount of mineral N in soil, P labile is the amount of labile P in the soil, F is the amount of minimal nutrient required to sustain a given NPP, and ∆t is the integration time step of the model.Using the generic approach, this model was implemented within a few days and incorporated in the current CLM framework.
With the P weathering rate of 0.05 g P m −2 yr −1 , almost no P limiting on productivity was shown.Figures 4 and 5 show the distribution of labile and sorbed P after two years of simulation.Total labile and sorbed P account for 18 % and 14 % of total P in soil, respectively.When the weathering rate is decreased to 0.01 g P m −2 yr −1 , Fig. 6 shows the P limited regions.P limiting on NPP of tropical evergreen forest and savannah is consistent with the simulation in Wang et al. (2010) using this low weathering rate.
As mentioned above, soil texture and biome were not considered in our P model.Maintenance respiration dependency on P was not considered either.Studies have found that the carbon costs of root maintenance respiration in phosphorus-deficient plants could be as high as 39 % of daily photosynthesis (Meir et al., 2008;Nord et al., 2011;Postma and Lynch, 2011).The model can be improved and applied globally when global P datasets become available.

Conclusions
A generic biogeochemistry module was developed that can be used to simulate C, N, and P biogeochemical cycles in terrestrial biosphere.This new module uses a reactionbased approach in which the overall rate of a state variable is described by summing the rates of all individual processes where this variable is a reactant or product.The approach provides a flexible way to update or add a new process by only modifying the kinetic mechanisms for the targeted specific process without the need to modify the integration of specific ODEs as required for the current CLM model.In addition, a database was generated that can store individual biogeochemical processes used in all ESMs orderly and be readily updated when new and updated process models become available.The processes in the database can be efficiently extracted using a PERL script to form a formatted reaction network to be used in the generic biogeochemistry module.Biogeochemical processes represented in different ESMs can be modeled by extracting different pools of reactions from the database.This way, different CN models and biogeochemical mechanisms can be compared effectively in the same CLM modeling framework and evaluated using observations.This framework also allows structural uncertainties related to biogeochemistry models to be assessed with minimal coding efforts.
The approach represents a cost-effective way for the CLM community to model biogeochemical processes and provides an efficient way to integrate new and updated process models.In addition, the reaction database approach provides a mutually understandable venue to communicate with biogeochemists for improvement of model process representations and even inspire new research.nitrogen in soil organic matter pool 1 N soil2 nitrogen in soil organic matter pool 2 N soil3 nitrogen in soil organic matter pool 3 N soil4 nitrogen in soil organic matter pool 4 N smin mineral nitrogen in soil P litr1 phosphorus in litter pool 1 P labile phosphorus in the labile soil pool P soil1 phosphorus in soil organic matter pool 1 P soil2 phosphorus in soil organic matter pool 2 P soil3 phosphorus in soil organic matter pool 3 P sorbed sorbed phosphorus pool in soil P ssorbed strongly sorbed phosphorus pool in soil P occluded occluded phosphorus pool in soil Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | soil1 pool after two years of simulation are shown in Figs. 2 τ p, sorb (0.01) and τ p, ssorb (0.01) are rate constants for the sorbed and strongly sorbed P pools in yr −1 , respectively.S pmax (50.0) and K plab (64.0) are the maximum amount of Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper |

GMDD
soil3 ) N smin + C soil3 + a soil3 N soil3 → C soil1 + a soil1 N soil1 R 5 Discussion Paper | Discussion Paper | Discussion Paper | Table A1.List of abbreviations.a cwd N : C (g N/g C) ratio in coarse woody debris pool a litr1 N : C (g N/g C) ratio in litter pool 1 a litr2 N : C (g N/g C) ratio in litter pool 2 a litr3 N : C (g N/g C) ratio in litter pool 3 a soil1 N : C (g N/g C) ratio in soil organic matter pool 1 a soil2 N : C (g N/g C) ratio in soil organic matter pool 2 a soil3 N : C (g N/g C) ratio in soil organic matter pool 3 a soil4 N : C (g N/g C) ratio in soil organic matter pool 4 b litr1 P : C (g P/g C) ratio in litter pool 1 b soil1 P : C (g P/g C) ratio in soil organic matter pool 1 C cwd carbon in coarse woody debris pool C litr1 carbon in litter pool 1 C litr2 carbon in litter pool 2 C litr3 carbon in litter pool 3 C soil1 carbon in soil organic matter pool 1 C soil2 carbon in soil organic matter pool 2 C soil3 carbon in soil organic matter pool 3 C soil4 carbon in soil organic matter pool 4 N cwd nitrogen in coarse woody debris pool N litr1 nitrogen in litter pool 1 N litr2 nitrogen in litter pool 2 N litr3 nitrogen in litter pool 3 N soil1

Figure 1 .
Figure 1.Comparison of temporal change of total litter C at the Tonzi Ranch site with the original model (CLM-CN) and our new model (CLM-NGBGC).

Fig. 1 .
Fig. 1.Comparison of temporal change of total litter C at the Tonzi Ranch site with the original model (CLM-CN) and our new model (CLM-NGBGC).

Table 1 .
Decomposition processes created from the ODEs in CLM-CN.

Table 3 .
CENTURY decomposition model of soil organic matter.