A generic biogeochemical module for Earth system models : Next Generation BioGeoChemical Module ( NGBGC ) , version 1 . 0

Physical and biogeochemical processes regulate soil carbon dynamics and CO 2 flux to and from the atmosphere, influencing global climate changes. Integration of these processes into Earth system models (e.g., community land models (CLMs)), however, currently faces three major challenges: (1) extensive efforts are required to modify modeling structures and to rewrite computer programs to incorporate new or updated processes as new knowledge is being generated, (2) computational cost is prohibitively expensive to simulate biogeochemical processes in land models due to large variations in the rates of biogeochemical processes, and (3) various mathematical representations of biogeochemical processes exist to incorporate different aspects of fundamental mechanisms, but systematic evaluation of the different mathematical representations is difficult, if not impossible. To address these challenges, we propose a new computational framework to easily incorporate physical and biogeochemical processes into land models. The new framework consists of a new biogeochemical module, Next Generation BioGeoChemical Module (NGBGC), version 1.0, with a generic algorithm and reaction database so that new and updated processes can be incorporated into land models without the need to manually set up the ordinary differential equations to be solved numerically. The reaction database consists of processes of nutrient flow through the terrestrial ecosystems in plants, litter, and soil. This framework facilitates effective comparison studies of biogeochemical cycles in an ecosystem using different conceptual models under the same land modeling framework. The approach was first implemented in CLM and benchmarked against simulations from the original CLM-CN code. A case study was then provided to demonstrate the advantages of using the new approach to incorporate a phosphorus cycle into CLM. To our knowledge, the phosphorus-incorporated CLM is a new model that can be used to simulate phosphorus limitation on the productivity of terrestrial ecosystems. The method presented here could in theory be applied to simulate biogeochemical cycles in other Earth system models.


Introduction
Terrestrial ecosystems store almost three times as much carbon as the atmosphere; hence changes in the terrestrial carbon budgets have important implications for the future climate through carbon cycle feedbacks.Accurate modeling of carbon cycling at regional to global scales must incorporate mechanism-based, robust representations of soil carbon and nutrient cycling processes as well as other biogeochemical processes that are coupled with the carbon cycle.
A number of biogeochemistry modules have been developed and used in Earth system models (ESMs) to simulate the fluxes of carbon, nitrogen, energy, and water into and out of an ecosystem (e.g., CLM-CN, which originated from Biome-BGC (Thornton et al., 2007), CENTURY/DAYCENT (Parton et al., 2001(Parton et al., , 1988)), and other terrestrial biosphere models that participated in the Carbon Land Model Intercomparison Project (CLAMP) (Randerson et al., 2009), the

Y. Fang et al.: A generic biogeochemical module for Earth system models
International Land Model Benchmarking Project (ILAMB) (Luo et al., 2012), etc.).In these previous modules, biogeochemical processes are represented by a set of ordinary differential equations (ODEs), and each of the equations is a sum of the contributions from various individual biogeochemical processes.Although schematic diagrams are presented in the literature for C and N flows among different compartments, e.g., within the plants, above ground or in the soil, it is difficult to deconvolute each differential equation to track the contribution from individual processes.Not knowing the detailed processes simulated in a model makes it difficult to update or add new processes to the module.Furthermore, many land surface models such as the current CLM only include C and N cycling, even though P cycling has been shown to be important in regulating terrestrial biogeochemical processes (Buendia et al., 2010;Goll et al., 2012;Wang et al., 2010;Yang et al., 2013).In addition, anthropogenic sulfur (S), which is not included in the current CLM, can also disturb the biogeochemical cycling in terrestrial ecosystems through competition for labile forms of organic carbon between nitrate-reducing and sulfate-reducing bacteria (Bünemann and Condron, 2007;Gu et al., 2012).Although P models and S models have been developed in the literature (Bünemann and Condron, 2007;Goll et al., 2012;Mitchell and Fuller, 1988;Wang et al., 2010), including them in CLM in its current code structure requires a nontrivial amount of work.
All of the biogeochemistry modules, as reviewed in a recent report of the Intergovernmental Panel on Climate Change (IPCC, 2007), assume that organic carbon decomposition driven by microbial communities follows a first-order chemical decay process, neglecting feedback interactions between microbial activities and the controlling processes and factors such as the form of substrates, microbial function, community, and abundance (Todd-Brown et al., 2012).The physiology of microbial communities is affected by climate change, which in turn leads to changes in microbial decomposition rates (Bradford et al., 2008).To better represent how microbial processes influence biogeochemical cycling, integrating microbial communities into terrestrial ecosystem models has been called upon in recent years to reduce uncertainty in model prediction (Allison and Martiny, 2008;McGuire and Treseder, 2010).Recent advances in soil microbiology are now allowing us to better understand the dynamics of microbial communities and how they affect biogeochemical processes in soils.The simple representation of soil biogeochemical processes in current models can no longer handle the complexity that results from the integration of microbial community dynamics.
One of the major challenges of integrating microbial dynamics is that currently each biogeochemistry module is developed in an ad hoc way with little flexibility to modify or incorporate new biogeochemical processes because they are hardwired in the computer source codes.When additional and/or different biogeochemical and physical processes are to be included, the source codes have to be significantly modified, including code changes to output new state variables.For example, a recent modeling study has found that incorporating microbial processes can improve the global soil carbon projections (Wieder et al., 2013).The modeled processes in that study are compared to the conventional processes in Fig. 1.It can be seen that in order to use this simple new model, ODEs for carbon state variables in litter and soil have to be manually modified and six more ODEs for the microbes have to be added.Users have to manually discern the ODEs and keep track of the state variables, stoichiometries, and their relationship to each process.This is easy when fluxes between different pools only pass through a single microbial population, but in reality the conversion of pools involves microbial consortia.It is tractable when each process only involves at most two state variables as currently represented in CLM, although a significant effort is required to track each variable and process.It becomes a significant burden for the programmers and modelers when complex reaction pathways that involve more variables and processes have to be integrated.Users also need to be familiar with the code before they can make efficient updates to it.Maintenance and modification of the current software hierarchy is time consuming, labor intensive, and error prone.These difficulties can be resolved using the new biogeochemical module presented in this study.
In this paper, we propose a generic, reaction-based algorithm to integrate soil biogeochemical processes into land surface models using processes simulated in CLM as an example.CLM is a land model driven by atmospheric forcing.It simulates coupled biogeophysical and biogeochemical processes, which include solar and longwave radiation interactions with vegetation canopy and soil, momentum and turbulent fluxes from canopy and soil, heat transfer in soil and snow, canopy hydrology, soil hydrology, snow hydrology, stomatal physiology and photosynthesis, river transport, and carbon-nitrogen cycling that includes prognostic vegetation phenology (Lawrence et al., 2011).The carbon-nitrogen biogeochemical module in CLM is referred to as CLM-CN.It is based on the terrestrial biogeochemistry Biome-BGC model with a prognostic carbon and nitrogen cycle (Thornton et al., 2002;Thornton and Rosenbloom, 2005).In CLM-CN, the photosynthetic carbon is partitioned into vegetation pools.It simulates a total of around 100 state variables in plant pools for leaf, live stem, dead stem, live coarse root, dead coarse root, fine root, growth storage pools for leaf and fine root growth, a coarse woody debris pool, three litter pools and four soil pools.Hundreds of physical and biogeochemical processes are used to represent the carbon and nitrogen cycles.The equation of the conservation of carbon and nitrogen mass in each pool is solved using an explicit method, i.e., the state of a system at a later time is calculated from the state of the system at the current time.Detailed technical description of the model can be found in Oleson et al. (2010).The new biogeochemical module developed in this paper can be readily used to update or incorporate new C, N, P and other nutrient cycle models into CLM and other Earth system models and to evaluate the effects of alternative process pathways and mechanisms on climate changes.Similar to modeling geochemical reactions using the reaction-based approach that has been commonly used in modeling subsurface reactive transport (Aguilera et al., 2005;Chilakapati et al., 2000;Yeh et al., 2001), the new biogeochemical module treats terrestrial C, N, and P cycles as reaction pathways and these pathways are explicitly modeled using the reactionbased approach.The reaction-based approach requires a selfconsistent reaction network, the forming of which with correct stoichiometries is a major challenge not only for modeling, but also for process characterization and understanding because of the complexity and a large number of biogeochemical processes and reactions involved in terrestrial carbon cycling.To solve this problem from the modeling point of view, a reaction database that can be used to concisely store all biogeochemical reactions in plants, litter, and soil that can potentially affect carbon cycling processes is developed here.In the following sections, we will first introduce the reaction-based approach, followed by using this approach to develop the generic biogeochemical module in CLM.As a validation, the results from the new module incorporated into CLM were compared with the original CLM-CN.The results were then compared to those using the decomposition model of soil organic matter in CENTURY (Parton et al., 1988) under the CLM modeling framework.A procedure to integrate new processes into the new biogeochemical module is demonstrated using global P cycling as an example (Wang et al., 2010).The model that includes the phosphorus dynamics can be used to evaluate the effect of phosphorus on productivity of terrestrial ecosystems.

Reaction-based approach
A dynamic reaction system may be defined by reaction pathways or networks (Yeh et al., 2001).In the reaction-based approach, the mass balance of each state variable of the reaction system can be written as in which C i is the mass of the ith state variable, t is time, R k is the rate of the kth reaction, v ik is the reaction stoichiometry of the ith state variable in the kth reaction associated with the products, µ ik is the reaction stoichiometry of the ith state variable in the kth reaction associated with the reactants, and N is the number of reactions.Equation ( 1) states that the rate of change in the mass of any state variable is due to all reactions that produce or consume that state variable.The whole system can be written in a matrix form as If M is the total number of state variables in the system, then [I] is an M × M unit matrix, dC dt is a vector of length M, and [A] is an M×N matrix, the column k of which consists of stoichiometries of all C i s involved in reaction k with rate R k , which are allowed to vary during the simulation to account for the effects of moisture, temperature, etc. on the reaction stoichiometries.
The reaction rates in subsurface ecosystems typically span a wide range of magnitude, which makes it inefficient to directly integrate Eq. (1).To overcome this difficulty, the fast reactions are often assumed to occur instantaneously, so the ODEs for the fast reactions can then be described using thermodynamic equilibrium approaches represented by algebraic equations (AEs).Automatic code generators are already available to serve this purpose (Aguilera et al., 2005;Chilakapati et al., 2000).However, many of them depend on third-party commercial software.In this paper, we use a systematic way to derive ODEs and decouple them from fast reactions using matrix row and column operation (matrix decomposition) as detailed in Fang et al. (2003).A simple system with one slow reaction and one fast (equilibrium) reaction, and three state variables are used to illustrate the method: We define the flux of a substance as a derivative of that substance with respect to time, and it has the dimension of amount of substance per unit volume transformed per unit time.The reaction rate for a single reaction differs from the flux of a substance by the reciprocal of its stoichiometric number in that reaction.The rate of a reaction is always positive, and it can be calculated either as linear or nonlinear function.Assuming Reaction (R1) is a slow reaction with reaction rate R 1 and Reaction (R2) is a fast adsorption/desorption reaction with reaction rate R 2 , the simple system in matrix form can be written as Equation ( 3) can be expanded as Direct integration of the above ODEs is impractical due to the large rate of R 2 , which requires very small time steps.To solve the problem, a direct thermodynamic equilibrium approach, such as a sorption isotherm (Wang et al., 2010), may be employed to model the fast Reaction (R2).The system can now be solved with two ODEs and one algebraic equation and R 2 is no longer in any of the ODEs.This can be done by performing column reduction by determining a pivot (nonzero) element in the fast reaction column in matrix [A] of Eq. ( 2) and using a matrix row operation to convert the column containing the pivot element into a unit column.Element -1 or c in the second column of the reaction matrix in Eq. ( 3) can be a pivot element.If c is chosen as a pivot element, the third row of the matrix equation is first divided by c and then added to the first row to convert the second column of the reaction matrix into a unit column.Equation (3) now becomes Expanding Eq. ( 7), we obtain the following: Equation ( 10) is the only ODE that contains R 2 and it can be replaced with an algebraic equation (which can be a function of A and C) and solved together with Eqs. ( 8) and (9).A and C can be solved using Newton-Raphson iteration after (A+1/cC) is solved with explicit time integration.
If the system contains no fast processes, ODEs can be formed by simply expanding the matrix equation provided by the user.Otherwise, column reduction will be performed to generate ODEs containing only slow reactions similar to the procedure above for the simple example.The equations can be solved either using an explicit method or the Newton-Raphson iteration approach.The solver developed in Fang et al. (2003) was modified and is called NGBGC (Next Generation BioGeoChemical Module) hereafter in this study.

Development of the generic biogeochemical module
The new biogeochemical module, NGBGC, is designed to flexibly handle any number of biogeochemical processes and state variables that are to be included in any ESM without changing algorithms and codes related to ODEs for biogeochemistry in CLM.In addition, NGBGC can be conveniently used to derive the mathematical equations to describe these processes.A conceptual diagram in Fig. 2 shows the procedure from preparing input (steps above the dotted line) to running NGBGC inside CLM.The preprocessing part runs a PERL script on a reaction network written in a text file.The PERL script generates reaction stoichiometry and supporting

Reactions
Flux name functions that later will be compiled with the code in CLM, and the user subroutine in which kinetics or algebraic equations (if any) for each process are defined to generate an executable.Given a reaction of 2A ↔ 5B, the stoichiometry generated will look like "1 −2.0 2 5.0", i.e., positive number for product, negative number for reactant.The integers "1" and "2" represent substance A and B, respectively.NGBGC will read in the stoichiometries and run matrix decomposition to generate ODEs, which will then be solved using an explicit method.The following illustrates the essence of the proposed generic module and modeling procedures.
To be consistent with the biogeochemical processes commonly integrated in CLM, such as CLM-CN, we first identified these processes by reversely analyzing the numerical codes that solve the ODEs used in CLM-CN.A reaction network was then built based on the identified processes.Table 1 shows the processes involved in the cascade model of litter and soil organic matter decomposition in CN.
In Table 1, C and N are state variables of each pool.The symbol "→" denotes the reaction direction, which is assumed to be nonreversible.Variables on the left are reactants and on the right are products.If a process does not have a reactant/product, the process represents the source/sink of the state variable.Other variables in the table are defined in the List of Abbreviations in the Appendix (Table A1).
If one were not familiar with the code and processes that the code is written to represent, making changes would be difficult.In a reaction-based approach, by studying the processes in Table 1 and how the rates were defined in the original code, we came up with the following decomposition cascade processes with balanced elements in Table 2 using the reaction-based approach.
In Table 2, parameter a is the N : C ratio for each pool, which is calculated before each new time integration, and C and N are state variables of each pool.R 1 to R 8 are decomposition rates, and R 9 -R 15 are heterotrophic respiration rates of each pool.
To illustrate the procedure executed in NGBGC, the ODEs for some of the state variables in the form of Eq. ( 2) using the reactions in Table 2 can be written as follows: and The ODEs for the rest of the state variables based on the reactions in Table 2 can be derived similarly based on the mass balance for each state variable.All the ODEs from the reactions in Table 2 and their reaction kinetics are listed in the Supplement.
The advantages of using such a reaction-based approach are: (1) the meaning of each process is obvious to the users, (2) carbon and nitrogen are explicitly linked in the reaction pathways, and additional variables, such as flux names for N transition in Table 1, are not necessary because they will be handled automatically by NGBGC using the stoichiometries in each process, and (3) new elements can be added to each process in a straightforward way.
If the decomposition model of soil organic matter in CEN-TURY (Parton et al., 1988) is to be used, the pathways with balanced elements are shown in Table 3.
The ODEs for the state variables based on the reactions in Table 3 can be formulated similarly as described for those in Table 2.They can be found in the Supplement.Note that NG-BGC will automatically form the ODEs once reactions such as those in Tables 2 and 3 are provided as input to the module.NGBGC has been incorporated into CLM4.5 (science branch).It consolidates all the biogeochemical subroutines Table 3. CENTURY decomposition model of soil organic matter.
Reactions Rate (6 in CLM-CN, 3 for carbon and 3 for nitrogen) used in the original CLM-CN code to solve the ODEs of the state variables into one with a little over 100 lines of code.Unlike the original code, this single subroutine to describe biogeochemical processes will not be modified when a new or updated biogeochemical process is to be incorporated, so it can significantly simplify the programming and more effective ODE solvers for new computational platforms can be introduced easily by modifying a single subroutine.

Development of process database and script tool
A reaction database is developed so that the reaction network as illustrated in Tables 2 and 3 can be extracted for a specific simulation problem.The database is generic and can be expanded when new understandings become available.Each biogeochemical process in the database has a unique name that is used to track the rate definition in the code and a tag to denote its source so that users have the flexibility to choose those processes either from a particular model (e.g., CLM-CN, CENTURY, etc.) currently used by the modeling community or to introduce new or updated processes.For example, the microbial model shown in Fig. 1 can be added to the database with the tag "microbial_model".The entry in the database for carbon flux from the litter pool 1 to the microbial pool 1 in the model can be written as lit1_c(c) ↔ lmic1_c(c) !litr1_to_lmic microbial_model, where (c) means the reaction is in the soil column of the model grid, litr1_to_lmic is the rate name that will be used in the user subroutine, and microbial_model is the tag.The rate law can be first-order kinetics, Michaelis-Menten kinetics, or any other expressions fitted from the experiment.They are written by the user in the user subroutine.
For users to efficiently update the database, the reactions are grouped into three types: reactions within plants, reactions from plants to soil, and reactions in the soil.The format is provided in the comments of the database so that one can easily update the database with new and updated biogeochemical processes by only modifying the user subroutine for the kinetics of the new processes.A database containing those processes reversely identified from CLM-CN can be found in the Supplement.Application of the database is demonstrated in the Results section and the simulation results using the reaction-based approach are benchmarked against those from the existing biogeochemical module, such as CLM-CN.
For each process in the database, state variables are separated into plant functional type (p) or column averaged type (c).For instance, the following pathways represent the contribution to the column c in litter pools from the leaf pool of Geosci.Model Dev., 6,[1977][1978][1979][1980][1981][1982][1983][1984][1985][1986][1987][1988]2013 www.geosci-model-dev.net/6/1977/2013/ each plant functional type (pft): The ODEs for the above three reactions can be written as In the current version of CLM-CN, the right-hand side of Eq. ( 14) is represented by a lumped rate: in which r is a function of leafc(p).R 1 , R 2 and R 3 are calculated as a fraction of r based on the fraction of each litter pool in the column and the weight of each pft relative to the column.Thus, for this type of reaction involving both pft and column state variables, attention needs to be paid when assigning reaction rates to the corresponding ODEs.We have two functions specified in the module: set_column_chem_rate and set_pft_chem_rate; set_pft_chem_rate is called when integrating pft state variables, and set_column_chem_rate is called when integrating column state variables.In the database, a comment is appended for this type of pathway that uses a lumped rate in the ODEs like Eq. ( 18).
As shown in Fig. 2, a script written in the PERL language is a key component of this generic module.The script has three functions: (1) generate the stoichiometries that are readable by NGBGC, (2) define and incorporate new processes into CLM, and (3) generate the necessary functions based on a specific reaction network picked from the database.These new functions include pointers that link the reaction names in the database to the defined rate expressions that are used in the code, and state variable names and types (C, N, P, . . . ) identify those reactions (such as those in Table 2) that have dynamic variable stoichiometries, i.e., changing N : C and P : C ratios during the simulation.It also identifies transport properties (mobile or immobile) of each column state variable so that a transport capability in CLM such as the one that was recently developed by Tang et al. ( 2013) (CLM-BeTR) to deal with depth-dependent biogeochemistry in the soil can be used.Note that by identifying the pools that are mobile in the soil, Eq. ( 2) can be readily extended to derive the transport equations of those pools for the land model (Fang et al., 2003).
The framework developed in this study allows using the same modeling algorithm to simulate C and N cycling in terrestrial ecosystems with different mathematical representations.For example, microbial models can be evaluated by simply replacing the reaction network formed by conventional models with those from the microbial models and new kinetics.The new approach can also readily integrate other nutrient cycles of importance, such as phosphorus (P), potassium (K), and sulfur (S), into the simulation system.Once these processes are in the database, the module generates equations to be solved automatically.But users have the freedom to input user-defined rate expressions with arbitrary form for the new kinetic mechanisms.These expressions are prepared by the user in the user subroutine, which is called by NGBGC during the simulation.

CN model
As mentioned in the section above, we have reversely identified those processes used in the CLM-CN model and incorporated them into our database.All the processes included in CLM-CN were then extracted from the database using the PERL script and input into NGBGC for simulation.All the modeling results presented in this study are defined on a grid with a 1-degree resolution, using the stand-alone CLM with the Qian et al. (2006) atmospheric input data for 2002 and 2003, CO 2 level and aerosol deposition for 2000, and arbitrary initial conditions.Using the new approach and the processes in Table 1, we were able to obtain identical results as those of the original CLM-CN.We then replaced the decomposition processes with the reactions in Table 2 and refer to our model as CLM-NGBGC.Results of the simulation after two years of integration using the original code and our code were compared and they are exactly the same, as shown in Fig. 3 by the temporal change of total litter C at the Tonzi Ranch site located in the lower foothills of the Sierra Nevada Mountains.
Another simulation was run using the decomposition model of soil organic matter described in Parton et al. (1988) with initial global residence time of soil1 (1.5 yr), soil2 (25 yr), and soil3 (1000 yr).We selected all the reactions except for the reactions with rates R 4 , R 5 , and R 6 in Table 2 and all the reactions in Table 3 from the database.Since this decomposition model has only three soil organic pools, the reaction with rate R 15 in Table 2 was not selected either.This is done through the "century" tag in the database.We refer to the reaction network involving soil organic matter decomposition reactions in Table 2 as Reaction Network 1 and that involving reactions in Table 3    2 hereafter.Spatial distribution of net primary productivity (NPP) and carbon in the soil organic matter soil1 pool after two years of simulation is shown in Figs. 4 and 5, respectively.The total NPP calculated using Reaction Network 2 is 89 % of that using Reaction Network 1.

CNP model
Recent contributions of global mapping of P (Yang et al., 2013) has made it possible to integrate global P cycling into land models.There is no P cycling model in the current CLM framework.To integrate phosphorus cycling into CLM, the allocation of P to live aboveground and belowground and the flow of P in plant litter and different organic soil pools are assumed to follow the allocation of N in CLM-CN: (1) the phenology transition flux of C, N, and P maintains a constant stoichiometry, (2) P is allocated to an additional storage pool after the P needs are satisfied for plant tissues, and (3) the scheme of occurrence of retranslocation of P follows that of N. The stoichiometric relationship of P during decomposition follows that of N in Tables 2 and 3.For instance, R1 in Table 2 is now written as where P labile is the labile pool of P in the soil and parameter b is the P : C ratio for each pool.
Five extra processes described in Wang et al. (2010) were used to represent the dynamics of labile, sorbed, strongly sorbed, and occluded phosphorus in soil, and biochemical mineralization (breakdown) of slow and passive soil organic P pools through a group of enzymes collectively known as phosphatases (Wang et al., 2007).These processes are shown in Table 4. Sorbed P was assumed to be in equilibrium with labile P, and their relationship is described using the Langmuir isotherm (Wang et al., 2007).
τ p,sorb (0.01) and τ p,ssorb (0.01) are rate constants for the sorbed and strongly sorbed P pools in yr −1 , respectively.S pmax (50.0) and K plab (64.0) are the maximum amount of sorbed P (g P m −2 ) and an adsorption constant (g P m −2 ), respectively.v pmax (0.1) is the maximum specific biochemical mineralization rate (yr −1 ) and K ptase (150.0) is an empirical constant (gN(gP) −1 ).λ pup (25.0) and λ ptase (15.0) are the N cost for P uptake and phosphatase production (gN(gP) −1 ), respectively.P enters the system through dust deposition and weathering and leaves by leaching.Weathering is not soil texturedependent, but fixed at a constant rate.The dust deposition rate is 0.0017 g P m −2 yr −1 .The soil P leaching rate is calculated as In this study, k is assumed to be 0.04 yr −1 .The C : P ratios of six plant tissues, leaf, live stem, dead stem, live coarse root, dead coarse root, and fine root, were estimated from Wang et al. as roughly 15 times the C : N ratios.The initial C : P ratios for litter and the newly formed soil organic pool are four times their C : N ratios, and the C : P ratios of the other soil organic pools are seven times their C : N ratios.
All parameters for this P cycling model were within the ranges reported in the literature, and were assumed to be constant, independent of soil texture or biome in the modeling.Maintenance respiration dependency on P was not considered either.Therefore, this example is for demonstration purpose only.
Note that P sorption is a fast process and consequently was treated as an equilibrium process and represented by an algebraic equation.The total number of ODEs derived was one less than the total number of state variables.To complete the  system, one algebraic equation represented by the Langmuir isotherm is included in the solution process.Using the decomposition approach, the ODE regarding labile P was written as These ODEs for P only were then linked with other ODEs involving C and N.After each integration, the combined mass of labile and sorbed P was obtained.The individual mass of labile and sorbed P was solved using Newton iteration for the Langmuir equation.As described before, once the reaction database was updated to include P, and new reaction rate laws such as those in Table 4 were written in a user-defined subroutine of kinetics, the entire set of ODEs was automatically formulated and solved by the NGBGC.If a depth-dependent P were to be considered, only labile P in Eq. ( 21) would be transported.
Total gross photosynthesis (GPP) for the original CN model was scaled or down-regulated by the N limitation.GPP down-regulation for this CNP model is controlled by the minimum of the N-limiting factor and the P-limiting factor, which are calculated as and in which x is the nutrient-limiting factor, N smin is the amount of mineral N in soil, P labile is the amount of labile P in the soil, F is the amount of minimal nutrient required to sustain a given NPP, and t is the integration time step of the model.Using the generic approach, this model was implemented within a few days and incorporated into the current CLM framework.
With the P weathering rate of 0.05 g P m −2 yr −1 , almost no P limiting on productivity was shown.Figures 6 and 7 show the distribution of labile and sorbed P after two years of simulated time.Total labile and sorbed P account for 18 % and 14 % of total P in soil, respectively.
When the weathering rate is decreased to 0.01 g P m −2 yr −1 , Fig. 8 shows the P-limited regions.P limiting on NPP of tropical evergreen forest and savannah  is consistent with the simulation in Wang et al. (2010) using this low weathering rate.
As mentioned above, soil texture and biome were not considered in our P model.Maintenance respiration dependency on P was not considered either.Studies have found that the carbon costs of root maintenance respiration in phosphorusdeficient plants could be as high as 39 % of daily photosynthesis (Meir et al., 2008;Nord et al., 2011;Postma and Lynch, 2011).The model can be improved and applied globally when global P datasets become available.

Conclusion
A generic biogeochemistry module was developed that can be used to simulate C, N, and P biogeochemical cycles in the terrestrial biosphere.This new module uses a reactionbased approach in which the overall rate of a state variable is described by summing the rates of all individual processes where this variable is a reactant or product.The approach provides a flexible way to update or add a new process by only modifying the kinetic mechanisms for the targeted specific process without the need to modify the integration of specific ODEs as required for the current CLM model.In addition, a database was generated that can store individual biogeochemical processes used in all ESMs orderly and be readily updated when new and updated process models become available.The processes in the database can be efficiently extracted using a PERL script to form a formatted reaction network to be used in the generic biogeochemistry module.Biogeochemical processes represented in different ESMs can be modeled by extracting different pools of reactions from the database.This way, different CN models and biogeochemical mechanisms can be compared effectively in the same CLM modeling framework and evaluated using observations by changing the reaction networks and user-defined subroutines for kinetics.This framework also allows structural uncertainties related to biogeochemistry models to be assessed with minimal coding efforts.
The approach presented in this manuscript can also facilitate the exploration of new methods of finding a steady-state biogeochemical solution for an ecosystem.When observations are not available, defining the initial conditions of an ecosystem is important in modeling ecosystem biogeochemical cycles.Most biogeochemical cycle models in Earth system models are kinetic reactions.A steady-state solution, or model spin-up, from numerical integration of the ODEs using a daily or hourly time step for all processes is needed to obtain the initial conditions (McGuire et al., 1997;Thornton and Rosenbloom, 2005).Due to the long residence time (hundreds of years) of soil carbon, long model simulation is often required so that a steady-state solution of state variables over repeated climate forcing is reached.The computation is expensive especially for global modeling.The approach presented in this manuscript will facilitate the new model spinup method to obtain a steady state of the system by setting the time derivatives of the state variables to zero, which gives a set of algebraic equations.For example, we can apply the

Fig. 1 .
Fig. 1.Diagram of soil C models.(a) Conventional model and (b) microbial model.Lit, SOM, and Mic stand for C pools for litter, soil organic, and microbial biomass, respectively.Dashed arrows are plant inputs.Solid arrows are fluxes between pools.

Fig. 3 .
Fig. 3. Comparison of temporal change of total litter C at the Tonzi Ranch site of the original model (CLM-CN) with our new model (CLM-NGBGC).

Table 1 .
Decomposition processes created from the ODEs in CLM-CN.