Using model reduction to predict the soil surface C1800 flux: An example of representing complex biogeochemical dynamics in a computationally efficient manner

Earth System Models (ESMs) must calculate large-scale interactions between the land and atmosphere while accurately characterizing ﬁne-scale spatial heterogeneity in water, carbon, and nutrient dynamics. We present here a high-dimensional model representation (HDMR) approach that allows detailed process representation of a coupled 5 carbon and water tracer (the δ 18 O value of the soil-surface CO 2 ﬂux ( δF s )) in a computationally tractable manner. δF s depends on the δ 18 O value of soil water, soil moisture, soil temperature, and soil CO 2 production (all of which are depth-dependent), and the δ 18 O value of above-surface CO 2 . We tested the HDMR approach over a growing season in a C 4 -dominated pasture using two vertical soil discretizations. The di ﬀ erence 10 between the HDMR approach and the full model solution in the three-month integrated isoﬂux was less than 0.2 % (0.5 mol m − 2 ‰), and the approach is up to 100 times faster than the full numerical solution. This type of model reduction approach allows representation of complex coupled biogeochemical processes in regional and global climate models and can be extended to characterize subgrid-scale spatial heterogeneity. is a O value of . Mechanistic models that include these interactions may too computationally expensive to integrate in regional and global models at their native spatial scale. The results presented here demonstrate that the accurately predicts s up to times faster than the full numerical


Introduction
Atmospheric CO 2 has substantial impacts on global climate, both over the long-term and, as we have witnessed since the beginning of the industrial revolution, much shorter time scales (Watson et al., 2001). As a result, the impacts of anthropogenic CO 2 emissions and climate system feedbacks on the long-term state and stability of the climate are currently the focus of much research. Since interactions with the terrestrial biosphere dominate spatial and inter-and intra-annual variations in atmospheric CO 2 concentrations (Tans et al., 1990), developing reliable models of ecosystem CO 2 exchanges is necessary to predict future climate.
Terrestrial carbon cycle models used at the site and regional scales and in Earth System Models (ESMs; e.g., Bonan et al., 2002;Denning et al., 1996;Parton et al., known to exist at scales substantially finer than those represented in current ESMs (∼ 100 km resolution; King et al., 2010;Thompson et al., 2011), either explicitly or by integrating scaling rules based on mechanistic process representation.
Land models are often tested and calibrated against field eddy covariance measurements of net CO 2 ecosystem exchange (NEE) (Baldocchi et al., 2001). Difficulties in interpreting these NEE measurements arise from landscape horizontal and vertical heterogeneity, footprint uncertainty, unsteady conditions, and stable nocturnal conditions (Aubinet et al., 2000;Baldocchi, 2003;Goulden et al., 1996). Further, ecosystem model development requires accurate estimates of the gross CO 2 fluxes comprising the net flux, i.e., the assimilated (photosynthetic) and respired fluxes. This partitioning is necessary since the processes controlling these fluxes respond differently to environmental forcings and therefore require separate model formulations and parameterizations.
Measurements of the stable isotope 18 O in CO 2 have been proposed as a tracer to partition measured net CO 2 fluxes into component gross fluxes (Yakir and Wang, 1996), identify regional distributions of CO 2 exchanges (Ciais et al., 1997a,b;Cuntz et al., 2003;Francey and Tans, 1987;Peylin et al., 1999), and investigate interactions between the C and water cycles (Buenning et al., 2012;Wingate et al., 2009). However, using measurements of 18 O in CO 2 for these methods requires accurate estimation of the δ 18 O value of the soil-surface CO 2 flux (δF s (‰)), which depends on a complex suite of interactions between the C and water cycles (Riley et al., 2002;Tans, 1998).
Using this example, we illustrate here a computationally efficient approach to represent these dynamics in a manner appropriate for inclusion in regional and global models.
phase can substantially change the δ O value of the soil gas CO 2 . Upon dissolution, CO 2 can exchange 18 O atoms with the water, thereby acquiring the 18 O composition of the water. The impact of this exchange on the δ 18 O value of soil water (δ sw (‰)) is small, since there are orders of magnitude more H 2 O than CO 2 molecules in soil moisture. The competition between CO 2 diffusion through the open pore space and dissolution into the soil water can substantially impact δF s (Miller et al., 1999;Riley, 2005). Three classes of methods to estimate δF s have been reported. Several authors have hypothesized that a depth-integrated δ 18 O value of soil water and a constant effective kinetic fractionation factor can be used (Ciais et al., 1997a,b;Miller et al., 1999;Yakir and Wang, 1996). Tans (1998) Riley et al., 2002;Stern et al., 1999). ISOLSM has been integrated with the general circulation model CCM3 (Buenning et al., 2012) to investigate the impact of ecosystems on the δ 18 O value of atmospheric CO 2 (δ a ). However, the soil-gas diffusion and reaction submodels in ISOLSM are computationally expensive. The High-Dimensional Model Representation (HDMR) method applied here allows reduction of the full model to a series of look up tables, while still characterizing second order interactions between variables important in the system. This approach substantially reduces simulation runtime (by up to a factor of 100), while still generating accurate δF s predictions.  (Bonan, 1996). LSM1 is a "big-leaf" model that calculates internally consistent ecosystem energy, CO 2 , and H 2 O exchanges with the atmosphere. Soil moisture, advective water fluxes, and temperature, all of which impact δ sw , are calculated at user-defined depths in the soil.
The isotopic mechanisms integrated in ISOLSM are described in detail by Riley et al. (2002); the model has been applied in a number of other studies of isotope and bulk C and water dynamics (Aranibar et al., 2006;Cooley et al., 2005;Henderson-Sellers et al., 2006;Lai et al., 2006;McDowell et al., 2008;Riley et al., 2003Riley et al., , 2008Riley et al., , 2009Riley, 2005;Still et al., 2009;Torn et al., 2011). A brief description of the model follows to illustrate the nature of the interactions impacting δF s . ISOLSM solves for δ sw using an explicit method with boundary conditions specified for the δ 18 O values of precipitation and above-canopy vapor. Surface evaporation is calculated in LSM1 using a laminar soil-surface boundary layer resistance and the gradient between vapor concentrations at the soil surface and canopy air. A similar approach is taken in ISOLSM to compute the soil-surface H 18 2 O flux. In this case, though, the additional effects of an equilibrium partitioning factor and a different laminar boundary layer resistance for the heavier isotopologue are included. Root water withdrawal from the soil profile (driven by transpiration) is calculated using modules from LSM1; root H 2 18 O withdrawal occurs (F s and F s , respectively (µmol m s )) are computed from the concentration gradients and diffusivities at the soil surface. Finally, the δ 18 O value of the soil-surface CO 2 flux is calculated as where r pdb is the V-PDB-CO 2 standard. As applied here, ISOLSM uses 2.5 cm control volumes to solve for δ sw and twenty unevenly spaced control volumes down to 1 m depth for the soil-gas calculations. Model testing is described in Riley et al. (2003) and an application of ISOLSM to analyze the impact of near-surface δ sw on δF s is presented in Riley (2005).

High-dimensional model reduction
The HDMR technique described here (termed the cut-HDMR) is a special application of a group of tools designed to represent high-dimensional models (Alis and Rabitz, 2001;Rabitz and Alis, 1999;. HDMR was developed to substantially decrease simulation runtime while retaining nonlinear interactions between state variables and model parameters. The HDMR method has been used, for example, to study global atmospheric chemistry (Wang et al., 1999), stratospheric chemistry (Shorter et al., 1999), and atmospheric radiation transport (Shorter et al., 2000).
Here, f 0 is a constant that represents the system response at a (i.e., g(a)), a is the reference point (the central point of the n-dimensional hypercube defined by x); f i (x i )) characterizes the impact on g(x) of a change in x i while other inputs are taken from the reference point a; f i j (x i , x j ) characterizes the impact on g(x) of simultaneous changes in x i and x j ; and f 12...n (x 1 , x 2 , . . . , x n ) gives the residual impact on g(x) of all the variables simultaneously. The cut-HDMR approach ignores functions with greater than two variable interactions under the hypothesis that first and second order interactions dominate δF s . The three expansion functions are calculated as: The nomenclature for g indicates that it is evaluated assuming all variables are at the reference point a except the specific value(s) of x contained in the parentheses. Subtracting off the lower-order expansion functions when calculating f i j (x i , x j ) ensures a unique addition from g(x i , x j , a).

Applying HDMR to calculate δF s
For this work the HDMR expansion functions were generated using ISOLSM to evaluate δF s at steady state for a suite of input variables. In general, the soil-gas system will 7 The input variables that comprise x are assigned from a range divided into N equal intervals (Table 1). While all ISOLSM simulations here used the spatial discretizations described in Riley et al. (2002), the HDMR expansion functions are evaluated with two vertical discretization scenarios (D 1 and D 2 ). Scenario D 1 uses 2.5 cm soil control volumes down to 20 cm depth and N = 100, and scenario D 2 uses average soil moisture, temperature, and δ sw in 5 cm increments down to 20 cm depth and N = 100.
To develop the HDMR expansion functions, ISOLSM is run to steady state for each set of conditions (i.e., each x). δF s is then evaluated and the expansion functions are calculated with Eqs. (3)-(5) and stored as look-up tables. Computing the expansion functions for each discretization took about seven days on a 2 GHz Atherton PC with 512 MB of RAM. During the HDMR simulation, first and second order interpolation routines are used to calculate the expansion functions for a specific input set x. The advantage to the HDMR approach is the ability to rapidly evaluate Eq. (2) once the expansion functions have been computed.
In ISOLSM the soil CO 2 source term is calculated as the sum of autotrophic and heterotrophic respiration, each with their own exponentially decaying depth profile defined by the parameters z a 0 and z h 0 (m), respectively. z a 0 and z h 0 are sensitive to soil moisture, becoming larger as the soil dries (i.e., the soil CO 2 production moves deeper as the soil dries). To save computational time, the HDMR expansion functions were generated with a single exponential parameter, z 0 . Therefore, in the simulations presented here, the HDMR model applies a parameter weighted by the predicted autotrophic, F a (µmol m −2 s −1 ), and heterotrophic, F h (µmol m −2 s −1 ), CO 2 sources to approximate the depth-distribution of CO 2 production: 8 The HDMR approach was tested using meteorological data from May-July 2000 in a C 4 -dominated tallgrass prairie pasture in Oklahoma (36 • 56 N, 96 • 41 W). This dataset was used previously to develop and test ISOLSM (Riley et al., 2002(Riley et al., , 2003. The site is in a region with various land uses, including crops, sparse trees, and other grasslands, has not been grazed since 1996, and is burned every spring. Maximum leaf area index is about 3 and maximum net ecosystem exchange during the growing season is about 35 µmol m −2 s −1 . The site and collection of meteorological forcing and flux data are described in detail in Suyker and Verma (2001) and Colello et al. (1998).

Results and discussion
The magnitude and vertical distribution of δ sw is an important determinant of δF s (Riley, 2005). In this system, low humidity, high air temperatures, and high soil evaporation rates generate strong δ sw gradients in the top 5 cm of soil, making accurate prediction of δ sw critical for predicting δF s . For example, Fig. 1 shows predicted δ sw for four soil layers over a twenty-day period in June 2000. The spikes in δ sw in the first soil layer occur for a number of reasons. Rapid increases in δ sw are typically driven by large soil evaporative fluxes, which occur when the vapor gradient between the soil surface and canopy air is large. The δ 18 O value of above canopy vapor (δ v ) is impacted by surface evaporative fluxes, resulting in a feedback between δ v and near-surface δ sw . Because we lack continuous measurements of δ v , we estimate it using a constant offset (7 ‰) from the estimated stem water δ 18 O value. In reality, δ v can change more rapidly than this approach allows, as shown in Helliker (∇ 0-5 cm δ sw (‰ cm )).
Differences between ISOLSM and HDMR predictions occur for several reasons. First, discretization scenario D 2 is unable to capture the impact on δF s resulting from large δ sw gradients between 0 and 5 cm depth. We have previously shown that these gradients can substantially impact δF s (Riley, 2005). Following precipitation (e.g., days 162, 163, 164, 167, 172, and 174; Fig. 3) the enhanced soil-surface evaporation leads to ∇ 0-5 cm δ sw of up to 5 ‰ cm −1 . We have observed gradients of this magnitude in a sorghum field in Oklahoma (unpublished data), as have Miller et al. (1999) in their soil column experiments. The impact of these gradients on δF s is better captured in scenario D 1 since this HDMR solution is based on the identical spatial discretization as that of the ISOLSM simulation (i.e., eight 2.5 cm control volumes in the top 20 cm of soil). Second, even in the absence of vertical spatial gradients in δ sw , rapid changes in δ sw will lead to errors in the HDMR predictions since the HDMR solution is based on the steady-state full model solution. However, the errors between the D 1 discretization scenario predictions and the ISOLSM solution are small during these periods of rapid change. These results imply that errors associated with the steady-state assumption are relatively small for the conditions simulated here. Third, using an approximation for z 0 (Eq. 6) will lead to errors in the depth distribution of CO 2 production, although the total production will be correct. Finally, the HDMR solution is linearly interpolated between the forcing values shown in Table 1; this interpolation will lead to some error. I have attempted to minimize this type of error by using relatively small increments between successive values at which the expansion functions were evaluated (i.e., N = 100). the three-month period. I c is accurately simulated by the HDMR approach for both discretization scenarios (Fig. 4). Differences in the HDMR model predictions from the full model solution during periods of large near-surface δ sw gradients did not substantially impact predictions of the cumulative isoflux over the three-month period. The error in cumulative isoflux after three months is about 0.2 % (0.5 mol m −2 ‰) for both discretization scenarios. The HDMR solution was computed ∼ 50 and 100 times faster than the full ISOLSM numerical solution for discretization scenarios D 1 and D 2 , respectively. This increased computational efficiency makes it practical to include the ISOLSM-based HDMR solution for δF s in regional-and global-scale models.
In the broader context, spatial heterogeneity in hydrology and biogeochemical cycling occurs on scales substantially finer than can currently, and likely ever, be directly represented in ESMs. The actual transformation of soil organic matter and CO 2 production, e.g., occurs at the 10's of nm scale, and is impacted by pore-scale heterogeneity in nutrients, water, organic molecules, mineral surfaces, microbes, and others (Kleber et al., 2011). The microbial community acting at these scales is incredibly diverse (Goldfarb et al., 2011) as are the range of organic molecules being transformed and consumed (Kogel-Knabner, 2002;Sutton and Sposito, 2005). At the mm to cm scale, aggregation, macropores, plant roots, and other soil structural properties impact distributions of microbes and resources (Six et al., 2001). Vertical structure of hydrology and C inputs can vary on horizontal scales as small as a few meters and vertical scales on the order of 10 cm. Accounting for these types of heterogeneity across 10's of km in an ESM is a substantial challenge that high-dimensional model reduction techniques such as that presented here may help address. 2 plex function of the depth-dependent (a) δ 18 O value of soil water, (b) soil moisture, (c) soil temperature, and (d) soil CO 2 production, as well as the δ 18 O value of abovesurface CO 2 . Mechanistic models that include these interactions (e.g., ISOLSM) may be too computationally expensive to integrate in regional and global models at their native spatial scale. The results presented here demonstrate that the HDMR technique accurately predicts δF s up to 100 times faster than the full numerical solution. Under rapidly changing soil moisture conditions, such as immediately after a precipitation event, the full numerical solution of the C 18 OO surface flux differs slightly from the HDMR solution. Errors in the HDMR solution arise from the steady-state assumption, approximation of the depth-dependence of soil CO 2 production, and linear interpolation. However, these errors have a small impact on the predicted cumulative isoflux. The error in the cumulative isoflux over the growing season calculated with HDMR (compared to that calculated with the full model) was less than 0.2 %.
Applying measurements of the δ 18 O value of atmospheric CO 2 to partition measured net ecosystem fluxes into gross fluxes and, at the regional and global scale, to estimate spatially explicit CO 2 exchanges requires accurate prediction of the δ 18 O value of the soil-surface CO 2 flux. Further, for regional and global simulations such a method must be computationally efficient. The HDMR method applied here shows great promise as a tool for addressing the need for mechanistic representation of processes across a wide range of scales and spatial heterogeneity. Table 1. Parameters and state variables used to generate the expansion functions. Spatial discretization scenarios D 1 and D 2 correspond to eight 2.5 cm and four 5 cm control volumes, respectively, in the top 20 cm of soil. All HDMR simulations are performed by dividing each parameter range into 100 equal spaces (i.e., N = 100).
1 Figure 2. Simulated δF s over the growing season from ISOLSM and the HDMR approach 2 using discretization scenarios D 1 and D 2 . Variability in δF s is large when δ sw variability in 3 the top 2.5 or 5 cm is large. 4 Fig. 2. Simulated δF s over the growing season from ISOLSM and the HDMR approach using discretization scenarios D 1 and D 2 . Variability in δF s is large when δ sw variability in the top 2.5 or 5 cm is large.  Fig. 2, but for a 20-day period. Also shown are (b) differences between the full ISOLSM model results and the HDMR predictions for ∆z = 2.5 and 5 cm and (c) ∇ 0-5 cm δ sw , the predicted gradient in δ sw over the top 5 cm of soil. Differences between scenarios D 1 and D 2 are largest when near-surface δ sw gradients are large. Discretization scenario D 1 more accurately predicts the impact of these gradients on δF s since this HDMR solution is based on the identical spatial discretization as that of the ISOLSM simulation (i.e., 2.5 cm in the top 20 cm of soil).