Interactive comment on “ A reduced order modeling approach to represent subgrid-scale hydrological dynamics for regional-and climate-scale land-surface simulations : application in a polygonal tundra landscape ”

Existing land surface models (LSMs) describe physical and biological processes that occur over a wide range of spatial and temporal scales. For example, biogeochemical and hydrological processes responsible for carbon (CO2, CH4) exchanges with the atmosphere range from molecular scale (pore-scale O2 consumption) to tens of kilometer scale (vegetation distribution, river networks). Additionally, many processes within LSMs are nonlinearly coupled (e.g., methane production and soil moisture dynamics), and therefore simple linear upscaling techniques can result in large prediction error. In this paper we applied a particular reduced-order modeling (ROM) technique known as "Proper Orthogonal Decomposition mapping method" that reconstructs temporally-resolved fine-resolution solutions based on coarse-resolution solutions. We applied this technique to four study sites in a polygonal tundra landscape near Barrow, Alaska. Coupled surface-subsurface isothermal simulations were performed for summer months (June–September) at fine (0.25 m) and coarse (8 m) horizontal resolutions. We used simulation results from three summer seasons (1998–2000) to build ROMs of the 4-D soil moisture field for the four study sites individually (single-site) and aggregated (multi-site). The results indicate that the ROM produced a significant computational speedup (> 103) with very small relative approximation error (more » ROM. We also demonstrated that our approach: (1) efficiently corrects for coarse-resolution model bias and (2) can be used for polygonal tundra sites not included in the training dataset with relatively good accuracy ( « less


Introduction
The terrestrial hydrological cycle strongly impacts, and is impacted by, atmospheric processes.Further, a primary control on terrestrial biogeochemical (BGC) dynamics and greenhouse gas (GHG) emissions from soils (e.g., CO 2 , CH 4 , N 2 O) across spatial scales is exerted by the system's hydrological state (Schuur et al., 2008).Soil moisture also impacts soil temperature, which is another important controller of GHG emissions (Torn and Chapin, 1993).Since climate change is predicted to change the amount and temporal distribution of precipitation globally, there is a critical need for models to not only accurately capture subgrid heterogeneity of terrestrial hydrological processes, but also the impacts of subgrid hydrological heterogeneity on BGC fluxes.
Terrestrial hydrological states are important for climate prediction across a very wide range of spatial scales, from soil pores to continental.The critical spatial scale relevant to soil moisture state and subsurface and surface fluxes may be as small as ∼ 10 m 2 (Wood et al., 2011), although there is vibrant disagreement about the relative increase in predictability when trying to explicitly simulate at such high resolutions with limited observational data to constrain parameter values (Beven and Cloke, 2012).However, the importance of representing fine-resolution spatial structure in hydrological states and fluxes has been demonstrated for surface evapotranspiration budgets (Vivoni et al., 2007;Wood, 1997), runoff and streamflow (Arrigo and Salvucci, 2005;Barrios and Francés, 2012;Vivoni et al., 2007), and atmospheric feedbacks (Nykanen and Foufoula-Georgiou, 2001).It remains unclear what the critical spatial scale is for biogeochemical dynamics, although it has been shown that "hot spot" formation is important for wetland biogeochemistry occurs at scales ∼ O(10 cm) (Frei et al., 2012) and for nitrogen cycle variations at ∼ O(m) (McClain et al., 2003).In contrast, the current suite of land surface models applicable at watershed (e.g., PAWS, Riley and Shen, 2014;Shen, 2009), regional (Maxwell et al., 2012), or climate (Koven et al., 2013;Tang et al., 2013)  The methods to represent spatial heterogeneity in hydrological and biogeochemical dynamics differ between watershed and regional or climate-scale models.While many current watershed-scale models explicitly represent lateral inter-connectivity for subsurface and surface fluxes, regional and climate-scale models currently rely on a nonspatially explicit tiling approach.For example, CLM4.5 (Koven et al., 2013;Lawrence et al., 2012;Tang et al., 2013), the land model integrated in the Community Earth System Model (Hurrell et al., 2013), represents land-surface grid cells with the same horizontal extent as the atmospheric grid cells (which can range from ∼ 1 • × 1 • for climate change simulations to ∼ 0.25 • × 0.25 • for relatively short simulations, Bacmeister et al., 2013;Wehner et al., 2014).These grid cells are disaggregated into a subgrid hierarchy of non-spatially explicit land units (e.g., vegetated, lakes, glacier, urban), columns (with variability in hydrological, snow, and crop management), and plant functional types (accounting for variations in broad categories of plants and bare ground).Therefore, we contend that representing the much smaller spatial scales now recognized to control hydrological and biogeochemical dynamics in regional and global-scale models will require a reformulation of the overall design of these models.
One potential approach to represent spatial heterogeneity in soil moisture fields at resolutions finer than represented in a particular modeling framework is to relate the statistical properties of the soil moisture field with the spatial scale.Hu et al. (1997) showed that the variance (σ 2 θ ) of the soil moisture (θ) field at different spatial averaging areas (A) can be related to the ratio of those areas raised to a scaling exponent (γ).They also showed that γ is related to the spatial correlation structure of the soil moisture field and that it decreases as soils dry.Observational studies have described a power law decay of variance as a function of the observation scale (Rodriguez-Iturbe et al., 1995;Wood, 1998), and several investigators have demonstrated that the relationship between σ 2 θ and spatial scale is not "simple" (i.e., not log-log linear across all spatial scales; e.g., Das and Mohanty, 2008;Famiglietti et al., 1999;Joshi and Mohanty, 2010;Mascaro et al., 2010Mascaro et al., , 2011;;Nykanen and Foufoula-Georgiou, 2001).Introduction

Conclusions References
Tables Figures

Back Close
Full A second potential approach to account for spatial heterogeneity in soil moisture states is to relate its higher-order moments to the mean, and then apply these relationships within a model that predicts the transient coarse-resolution mean.In many observationally-based studies, an upward convex relationship between the mean and variance has been reported (e.g., Brocca et al., 2010Brocca et al., , 2012;;Choi and Jacobs, 2011;Famiglietti et al., 2008;Lawrence and Hornberger, 2007;Li and Rodell, 2013;Pan and Peters-Lidard, 2008;Rosenbaum et al., 2012;Tague et al., 2010;Teuling et al., 2007;Teuling and Troch, 2005).Theoretical analyses have also indicated that an upward convex relationship is consistent with current understanding of soil moisture dynamics (e.g., Vereecken et al., 2007).However, as discussed in Brocca et al. (2007), the relationships between soil moisture mean and statistical moments have been reported to depend on many factors, including lateral redistribution, radiation, soil characteristics, vegetation characteristics, elevation above the drainage channel, downslope gradient, bedrock topography, and specific upslope area.This large number of observed controllers and the lack of an accepted set of dominant factors argue that substantial work remains before this type of information can be integrated with land models to represent subgrid spatial heterogeneity.
Modeling studies have also been performed to investigate spatial scaling properties of moisture and how these properties relate to ecosystem properties.For example, Ivanov et al. (2010) studied spatial heterogeneity in moisture on an idealized small hill slope, and found hysteretic patterns during the wetting-drying cycle and that the system response depends on precipitation magnitude.Riley and Shen (2014) used a distributed modeling framework to analyze relationships between mean and higher-order moments of soil moisture and ecosystem properties in a watershed in Michigan.They concluded that the strongest relationship between the observed declines in variance with increases in mean moisture (past a peak in this relationship) was with the gradient convolved with mean evapotranspiration.Other studies have focused on upscaling fineresolution model parameters to effective coarser-resolution parameters.For example, Introduction

Conclusions References
Tables Figures

Back Close
Full Jana and Mohanty (2012) showed that power-law scaling of hydraulic parameters was able to capture subgrid topographic effects for four different hill slope configurations.Theoretical work to explicitly include spatial heterogeneity in the hydrological governing equations has also been applied to this problem.Albertson and Montaldo (2003) and Montaldo and Albertson (2003) developed a relationship for the time rate of change of soil moisture variance based on the mean moisture and spatial covariances between soil moisture, infiltration, drainage, and ET.Teuling and Troch (2005) applied a similar approach to study the impacts of vegetation, soil properties, and topography on the controls of soil moisture variance.Kumar (2004) applied a Reynolds averaging approach, and ignoring second and higher order terms, derived a relationship for the time rate of change of the mean moisture field that depends on the moisture variance.Choi et al. (2007) applied the model to a ∼ 25 000 km 2 Appalachian Mountain region for summer months of one year and found that subgrid variability significantly affected the prediction of mean soil moisture.
The approaches described above to capture fine-resolution spatial heterogeneity within a coarse-resolution modeling framework have some limitations.First, the soil moisture probability density function is often very non-normal (Ryu and Famiglietti, 2005), making the sole use of variance as a descriptor of moisture heterogeneity insufficient.A similar problem arises with the Reynolds averaging approach that does not include higher-order terms.This approach also requires a method to "close" the solution (i.e., relate the higher-order terms to the mean moisture), and there is no generally accepted method to perform this closure.Perhaps the largest constraint of these approaches in the context of climate change and atmospheric interactions is that they cannot account for the temporal memory in the system that impacts biogeochemical transformations.In particular, the biogeochemical dynamics at a particular point in time depend on the state and dynamics that occurred in the past, and just knowing the statistical distribution of moisture at a particular time may not maintain the continuity required for accurate prediction.Therefore, for applications related to regional to global-scale interactions with the atmosphere, a method is required that allows for (1) computationally Introduction

Conclusions References
Tables Figures

Back Close
Full tractable simulations (i.e., relatively coarser resolution); (2) spatially explicit prediction of the temporal evolution of soil moisture at relatively finer resolutions; and (3) integration of the relatively finer resolution soil moisture predictions with representations of the relevant biogeochemical dynamics.
To that end, we described, tested, and applied a generally applicable reduced order modeling technique that allowed us to reconstruct a fine-resolution heterogeneous 4-D soil moisture solution from a coarse-resolution simulation, thereby resulting in significant computational savings.In this study, we built ROMs based on the Proper Orthogonal Decomposition Mapping Method (Robinson et al., 2012), which first involved training the reduced order model (ROM) using fine-and coarse-resolution simulations over multiple years.Hydrologic simulations of coupled surface and subsurface processes for an Alaska polygonal tundra system were performed using a PFLOTRAN model (Bisht and Riley, 2014;Hammond et al., 2012).Simulations were performed for four study sites in Alaska that represent distinct polygonal surface characteristics and individual ROMs were built for each site.The resulting ROMs were then applied under different forcing conditions and over periods outside of the ROM training period.In the Sect. 2 we described the approach used to develop the ROMs, the polygonal tundra site used for our simulations, the PFLOTRAN hydrological simulations configuration, and our approach for evaluating the ROM solutions.In the Sect.3, we presented analyses of ROM accuracy within the temporal training (1998)(1999)(2000) and testing (2002 and 2006) periods.Finally, we ended with a discussion of limitations of the approach, possible improvements, and methods to incorporate the proposed ROM approach within a global-scale hydrological and biogeochemical model.Introduction

Conclusions References
Tables Figures

Back Close
Full

Site description and hydrologic simulation setup
In this study, we developed ROMs for hydrological simulations performed at four sites in the Barrow Environmental (BEO) in Barrow, Alaska (71.3 • N, 156.5 • W).The BEO lies within the Alaskan Arctic Costal Plain, which is a relatively flat region, characterized by thaw lakes and drained basins (Hinkel et al., 2003;Sellmann et al., 1975) and polygonal ground features (Hinkel et al., 2001;Hubbard et al., 2013).The Department of Energy (DOE) Next-Generation Ecosystem Experiments (NGEE-Arctic) project sets up four intensely monitored sites (A, B, C and D, shown in Fig. 1) within the BEO in 2012 to study impact of climate change in high-latitude regions.The four NGEE-Arctic study sites have distinct micro-topographic features, which include low-centered (A), high-centered (B), and transitional polygon (C and D).The mean annual air temperature for our study sites is approximately −13 • C (Walker et al., 2005) and the mean annual precipitation is 106 mm with the majority of rainfall falling during the summer season (Wu et al., 2013).The study site is underlain with continuous permafrost and the seasonally active layer depth ranges between 30-90 cm (Hinkel et al., 2003).
We applied a version of the three-dimensional subsurface reactive transport simulator PFLOTRAN, which was modified to include surface water flows, for simulating surface-subsurface hydrologic processes at the four NGEE-Arctic study sites.The subsurface flows in PFLOTRAN are solved with a finite volume and an implicit time integration scheme, and are sequentially coupled to a finite volume based surface flow solution that is solved explicitly in time.Simulations at the four study sites were conducted using meshes at horizontal resolutions of 0.25 m, 0.5 m, 1.0 m, 2.0 m, 4.0 m, and 8.0 m.A constant vertical resolution of 5 cm with a total depth of 50 cm was used for all simulations.The simulations were carried out for four summer months (July-September) of each year between 1998-2006.Evapotranspiration and effective precipitation boundary conditions for the PFLOTRAN simulations were obtained from offline simulations of the Community Land Model (CLM4.5;Oleson, 2013).Vertical heterogeneity in soil Introduction

Conclusions References
Tables Figures

Back Close
Full properties was prescribed using data from Hinzman et al. (1991).A static active layer depth of 50 cm, corresponding approximately to the maximum seasonal value, was assumed for all simulations.Details of model setup are provided in Bisht and Riley (2014).
In the current study, the ROM was trained on three years of data (1998)(1999)(2000) and ROM predictions for 2002 and 2006 were compared against fine-resolution simulations.

Development of the reduced order modeling approach
The multifidelity ROM approach we are proposing is based on the gappy Proper Orthogonal Decomposition (POD) mapping approach (Robinson et al., 2012).Let p be a set of parameters that defines a particular solution or observation.The set of parameters could include system parameters (e.g., vegetation distribution, soil types, and topography), climate forcings, time, and other quantities that have an influence on the system response.In this paper, we limited p to be time (days for summer seasons in a year) and the climate forcings prescribed at that particular time (precipitation and evapotranspiration rates).Then, given a sample set S N = {q 1 , . . ., q N }, where q i is a set of parameters p and N is the number of samples, we can compute the corresponding solution {f(q 1 ), . . ., f(q N )}.In this paper, f is a fine-resolution three-dimensional soil moisture field, but in general, f can be any spatial quantity of interest (e.g., soil temperature or GHG emission).

POD method
The POD approximation of f, f POD , is given by where of the matrix given by W N = f(p 1 ) − f, . . ., f(p N ) − f (Everson and Sirovich, 1995): The POD bases ζ i are thus the left eigenvectors of W N and λ i are the associated eigenvalues with each POD basis.The POD method is thus similar to principal component analysis (Jolliffe, 2002) and Karhunen Loeve decomposition (Moore, 1981).We computed the POD bases based on the kernel eigenvalue approach (Everson and Sirovich, 1995).
The number of POD bases, M, used to reconstruct the approximate solution to a certain level of error (ε λ ) can be determined using e As mentioned in Wilkinson (2011), the dimensional reduction afforded by the POD method depends on the extent to which the components of f are correlated.We note that Eq. (1) only states how f is represented in a linear space spanned by the POD bases, but there are multiple approaches of determining α(p) = {α 1 (p), . . ., α M (p)} for a given p.One optimal approach of determining α, denoted by α POD (p), minimizes the least squares error between f(p) and f POD (p): (3) However, α POD (p) cannot usually be determined using Eq.(3) since f(p) is the quantity we would like to approximate.To overcome this conundrum, the POD projection method (Willcox and Peraire, 2002) directly discretizes the governing equations using the linear space spanned by ζ i and solves the resulting algebraic equations for α POD (p).This approach requires extensive modification of the existing code of the simulator, and is thus not suitable for existing LSMs.However, we had presented results based on Eq. ( 3) in the Sect. 3 to demonstrate the limit of accuracy for POD-related methods.Introduction

Conclusions References
Tables Figures

Back Close
Full

POD mapping method
To overcome the difficulties associated with calculating α POD (p), the POD mapping method (POD-MM) attempts to efficiently reconstruct the fine-resolution solution based on a coarse-resolution solution.This approach is a modification of the gappy POD (Everson and Sirovich, 1995).For the same sample set S N , we determine {g(q 1 ), . . ., g(q N )}, where g(p) is the coarse-resolution solution for a given parameter set p with resolution ∆x g .As in Robinson et al. (2012), the multifidelity POD bases, , are then determined through a SVD of the matrix W POD-MM : where f is as defined before and g = 1 where ζ f i and ζ g i are components associated with the fine-and coarse-resolution models.Given a coarse-resolution solution g(p), we first determine where • 2 is the L 2 norm.We note that α POD-MM (p) is not simply given by Eq. ( 3) since , where ∆x g is the resolution at which g is computed.Note that we have denoted the solution constructed using the POD and the POD-MM method by f POD (p) and f POD-MM ∆x g (p), respectively.The relative error of the POD method with respect to the true fine-resolution solution is defined as: This error measure gives the maximum theoretical accuracy achievable using PODrelated methods, including POD-MM.For the POD mapping method, a separate ROM is constructed for each ∆x g we considered and the error measure for each ROM is defined as:

Results and discussion
As described in the Methods section, we developed the ROM models for the four NGEE-Arctic Barrow study sites chosen for detailed characterization.The four sites differ in their topographic characteristics and therefore each site has a different dynamic soil moisture response to the same meteorological forcings.In addition, since the parameters varied in this study are time and the forcing terms, historical data (prior-year simulations) can be used to construct the ROM.The resulting ROM is subsequently used to predict future responses.For more general cases involving system parameters, statistical or adaptive sampling techniques are needed to generate S N and the corresponding W POD-MM (Pau et al., 2013a, b).For all study years, domain average soil Introduction

Conclusions References
Tables Figures

Back Close
Full moisture decreased during the first half of the simulation time period due to losses associated with evapotranspiration, while soil moisture increased in the latter half due to increased rainfall.Sites A and B had the lowest mean soil moisture, followed by site C, and then the wettest site, D.

Application of POD method
We first constructed four separate ROMs, one for each site.tends to get completely saturated, thereby resulting in a discontinuity in the parameter space and requiring larger M. For all four sites, we found α 1 to be linearly correlated to the mean of f (i.e.fineresolution mean soil moisture, µ f ) (Fig. 4).Other α i does not have such a simple correlation with µ f .Since µ f can be accurately approximated by the mean of g (µ g ; (Bisht and Riley, 2014), we approximated f by where α fit 1 (µ) = a 1 (µ) + a 2 and we determined a 1 and a 2 from a least-square fit of α 1 and µ f .For ∆x g = 8 m, the error e POD-mean ∆x g , defined analogously to Eq. ( 7) with f POD replaced by f POD-mean ∆x g , has a maximum of 0.013 and a mean of 0.0016 at site A (Fig. 5).
At site D and ∆x g = 8 m, the maximum and mean of e POD-mean ∆x g are 0.016 and 0.005 respectively.

Application of POD mapping method
As with the previous analysis, ROMs based on POD-MM were constructed using only soil moisture data from 1998-2000 and daily prediction of soil moisture at 0.25 m were made for 2002 and 2006 using only the ROMs and coarse-resolution solutions.We only presented our analyses for site A and D for brevity but the results were consistent with those from the remaining sites; Fig. 2 shows that site B should yield similar results to site A and site C to site D.
The mean error for the POD mapping method (e POD-MM ∆x g

) decreases monotonically
with M for all ∆x g = ∆x > 0.25 m up to M = M optimal , after which e POD-MM ∆x g starts to increase and fluctuate (Fig. 6).This behavior is consistent with results from Everson and Sirovich (1995)  , the accuracy of f POD-MM ∆x g can be systematically improved by utilizing more POD bases in the approximation.
For a given M ≤ M optimal , e POD-MM ∆x g decreases with ∆x g , which implies that increasing the number of bases leads to a more accurate reconstructed fine-resolution solution.
For site D, M optimal also increases with decreasing ∆x g since larger information content allows more α POD-MM for more POD bases to be determined accurately.For year 2006, M optimal = 33 for ∆x g = 0.5 m and M optimal = 28 for ∆x g = 8 m.For site A however, M optimal = 10 for all ∆x g .As such, when the underlying dynamics that we want to capture are mild (small M optimal ), the dependence of M optimal on ∆x g is weaker.For the results shown, e POD-MM ∆x g is less than 6 × 10 −5 when f POD-MM ∆x g is evaluated at M optimal for site D. In addition, the mean of (f POD-MM ∆x g − f) is 1.15 × 10 −5 and 6.72 × 10 −6 for sites A and D, respectively, indicating that there is only a negligible bias in the ROM solution.
The above approach requires knowledge of the true fine-resolution solutions to determine M optimal .Alternatively, we can determine M optimal by examining the amount of variance represented by the first M POD bases.For M < M optimal , there is a linear relationship between log(e POD-MM ∆x g ) and log(e 0.01.We further examined the relative point-wise error, given by for the days with the largest e POD-MM ∆x g ; they corresponded to day 1 of 2002 for site A (Fig. 9) and day 106 of 2002 for site D (Fig. 10).For site A, the maximum ε (Fig. 10).In addition, the saturated portion of the solution has small fluctuating errors, as evident from ε POD-MM ∆x g of the bottom layer (Fig. 10).Future work will examine how we can remove these fluctuations by simultaneously taking into account both water content and saturation.

Application of an alternate POD mapping method
We also analyzed an alternative formulation to investigate if, for the POD mapping method, the number of POD bases required could be reduced for a fixed approximation error threshold or further reduction in the approximation error can be achieved with a given number of bases.Instead of applying Eq. ( 4), we performed a SVD of the matrix where h(p) = f(p) − g(p), and g is the piecewise constant map from the coarseresolution grid of g onto the fine-resolution grid of f.By using the deviation of f from the Introduction

Conclusions References
Tables Figures

Back Close
Full mapped coarse-resolution solution g, we removed the bias resulting from the mismatch between the mean of f and g.We note that this alternative POD mapping formulation is possible since our coarse-and fine-resolution grids are nested (which will always be the case for the types of applications we are developing here).For non-nested grids, a linear mapping is expected to work as well, although we did not analyze that approach here.We denote the resulting POD bases vector as where ζ h i are the components associated with h.Given a solution g(p), the approximate f where the α POD-MM2 is determined analogously to α POD-MM .

Multi-site ROM
We analyzed whether we could construct a single ROM that applies to all four sites as a first step towards developing a ROM that is applicable to the entire NGEE-Arctic study region.We used daily snapshots from all four sites for 1998-2000 to construct a single ROM.Similar to our previous analysis, we first examined the accuracy obtained through a POD method.The POD-related method can theoretically perform very well even when all four sites are considered in aggregate (Fig. 12).However, the number of POD bases needed to achieve similar accuracy is greater than when separate ROMs are constructed for each site (compare Figs. 2 and 12).
Using the multi-site ROM, e POD-MM ∆x g ≤ 10 −3 when only a relatively small number of POD bases are used (Fig. 13).For ∆x g = 8 m, the error is minimum when M = 30.The magnitude of the error is only slightly larger than single-site ROMs.Although this approach is less efficient since M is generally larger than M for the single-site ROM, it will still be significantly faster than performing simulations at the finest-resolution.A multi-site ROM is a good alternative to multiple single-site ROMs when the number of sites becomes large.In addition, if the sites have some similar features, smaller number of snapshots is required per site, leading to lower computational cost needed to construct a single multi-site ROM compared to multpile single-site ROMs.The above results also motivated our next analysis.

Site-independent ROM
We examined whether a single ROM constructed using the POD method at sites B, C, and D could accurately predict fine-resolution soil moisture at site A. The error of the POD ROM decreases with increasing number of bases but not as rapidly as e POD of single-or multi-site ROMs (Fig. 14).For the POD mapping method, the error (e POD-MM ∆x g ) also decreases slowly with M when compared to single-or multi-site ROMs (Fig. 15a).
For ∆x g = 8 m, the minimum e POD-MM ∆x g is 0.025, occurring at M = 10.For ∆x g < 8 m, Introduction

Conclusions References
Tables Figures

Back Close
Full ).However, for 0.6 ≤ θ ≤ 0.8, the fit is poorer with the PDF of f POD-MM ∆x g resembling a dual-mode Gaussian distribution centered around 0.69 and 0.74.These peaks also deviate slightly from that of f.The three modes in the PDF in Fig. 16 correspond to the three different soil material properties used to characterize subsurface structure of the polygonal landscape (Bisht and Riley, 2014).
The predicted pointwise soil moisture errors at site A have a maximum relative error of 0.15 and a mean of 0.02 (Fig. 17, shown for 20-25 cm layer solutions of day 20 of year 1998), i.e., substantially less accurate than for the site-dependent method (Fig. 8).
Since Bisht and Riley (2014) demonstrated that the soil moisture at each soil layer is inversely correlated to the DEM, we introduced the following approximation where f POD-MM3   ∆x g , j is the approximated solution of j th layer.In above, where f j (g j ) is the j th layer of f (g) for site B, C and D, and f j ( g j ) is the approximate reconstruction of the mean for the j th layer of f (g) based on the DEM of site A. The f j and g j are defined as

Conclusions References
Tables Figures

Back Close
Full where µ g, j is the mean of the j th layer computed based on different snapshots of g computed at site A, DEM is the surface elevation of site , 1 ≤ j ≤ 10, we showed that we could improve the accuracy of the site-independent ROM by taking into consideration the correlation between the soil moisture and the DEM (Fig. 15b).At ∆x g = 8 m, a minimum of 0.014 is obtained for e POD-MM3 ∆x g when M = 5.The PDF of f POD-MM3   ∆x g , j , 1 ≤ j ≤ 10 is also a closer approximation of the PDF of f compared to the PDF of f POD-MM ∆x g (Fig. 16) and the heterogeneous structure of f is approximately reproduced (Fig. 17).In addition, looking only at the 5th soil layer, the mean and variance of ε , are more uniformly smaller than ε POD-MM ∆x g .Fine-resolution soil moisture fields retrieved using the site-independent ROM are quite accurate (< 1.5 %) given the large topographic differences between Site A and the remaining three sites.In other words, this approach led to an accurate fine-scale soil moisture prediction for a site that was excluded from the training dataset, but did share some topographic features with sites that were part of the training dataset.Our hypothesis is that the level of error from the site-independent ROM is well below that required for accurate prediction of soil moisture impacts on BGC dynamics.For example, at a moisture content of 0.4 %, a mean relative error of 0.015 corresponds to an error in moisture content of 0.006 %, which will have negligible impacts on GHG emission prediction.We will explore in future work whether these results could be improved by increasing the number of sites used to train the dataset or taking into account other correlation aspects between the DEM and the solution.

Conclusions References
Tables Figures

Back Close
Full

Application to larger-scale hydrological simulations
The POD mapping method shows great promise in allowing prediction of fine-resolution soil moisture dynamics using coarse-resolution simulations.Here we applied a factor of 2 5 difference in resolution and achieved soil moisture simulation errors of < 0.06 % during two years that were not included in the training dataset, with an effective decrease in computation time of more than a factor of 1000.If the above results hold for simulations that include more heterogeneity in subsurface (e.g., conductivity) and surface (vegetation) properties, integration of the relevant ROMs into a land model such as CLM will allow much finer representation of processes than is currently possible.
The results indicate that the POD-MM is insensitive to fine-vs.coarse-resolution simulation biases.This result is potentially useful in cases where we know coarseresolution solutions are biased, for example Chen and Durlofsky (2006) showed that an adaptive upscaling technique for subsurface permeability was needed to correct for bias in coarse simulation of synthetic channelized reservoir.To demonstrate that POD-MM corrects bias in the coarse solution, let g bias = (1 + δ)g where δ is a prescribed perturbation.We showed that α POD-MM is determined by solving for which the solution is equivalent to solving Eq. ( 6).Therefore, a constant bias will not affect the accuracy of our approximation.To further support the above analysis, Introduction

Conclusions References
Tables Figures

Back Close
Full only emerge at large M due to overfitting, the same reason we observed fluctuations in Fig. 7.The above analysis however does not apply to f POD-mean ∆x g , since f POD-mean ∆x g relies on the assumption that the coarse-and the fine-resolution means have negligible differences.Thus, any bias in the coarse-resolution mean will lead to a biased f POD-mean ∆x g .Further study is needed to study the biases in site-independent ROMs and the effects of coarse-resolution bias due to the upscaling of heterogeneous soil properties across scales.
The current method can be efficiently deployed within the existing CESM framework.For the cases that we have examined, the ROMs for the subsurface processes can be developed without considering the full coupled system so that the fine-resolution solutions can be determined more efficiently.Once ROMs are constructed, coarse resolution predictions of soil moisture can be mapped onto a fine grid to predict biogeochemical processes at higher spatial resolution, while the land-atmosphere interactions can still be modeled at a coarser grid.We will explore how such a ROM framework can be robustly implemented within the CLM model in future work.
As with any sampling-based ROM, the POD mapping method performs well only if the snapshots of the solution used to construct the ROM form an approximation space that can reasonably represent the solution.In the cases that we examined here, the annual cycle of the climate forcing does not change drastically from year to year and the response of soil moisture to climate forcing was relatively smooth.We thus obtained good predicted solutions using only data from a period of 3 years to build the ROM.However, in situations where the characteristics of the solutions to be predicted are not fully captured by solutions from prior years, for example due to anomalous climate forcing, sampling techniques should be used to ensure that the samples used to construct the ROM are formed across the full phase space of expected forcing.
While the computational costs of evaluating the ROM are typically low, the initial computational overhead required to construct the ROM can be large.In addition to the potentially large number of samples that we need to simulate, we have to efficiently deal with the challenges of storing and retrieving the simulated solutions, especially if each

GMDD Introduction
Full solution consists of millions of degree of freedoms.As such, the current method can be deployed successfully for large-scale applications only if a database management system suited for large-scale computing is employed.

Conclusions
In this paper, we proposed the construction of ROMs for land models based on PODrelated methods.ROMs were built for soil moisture predictions from the PFLOTRAN model for the four NGEE-Arctic sites.An initial analysis based on the POD projection method was first used to classify and rank the sites, allowing us to conclude that hydrological complexity of the sites cannot be concluded based on DEM alone.We then used the POD mapping method as a ROM technique to map coarse-resolution solutions to fine-resolution solutions.The resulting fine-resolution ROM solutions had maximum mean relative errors less than 0.1 % with a computational speedup greater than 10 3 .Our initial analyses indicated that the POD mapping method is insensitive to coarse-resolution model biases.We compared the single-site, multi-site, and siteindependent ROMs against fine-resolution simulations.Both the single-site and multisite approaches were very accurate (< 0.1 %), and the site-independent approach had relative error < 1.5 % when used to assess a site that was not included in the ROM training.The errors for the ROMs were larger for sites that were excluded in the training dataset as compared to sites that were included in ROM development.However, the overall error magnitude is still quite low given the large topographical differences across the sites, thereby giving creditability for using ROMs in larger-scale simulations.We therefore concluded that the integration of ROMs into an Earth System Modeling framework is practical and can provide an accurate approach to spatial scaling.Figures

Conclusions References
Tables Figures

Back Close
Full scales typically represent hydrological or biogeochemical cycles at ∼ O(100 m-km) scales.Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | and ζ i are the POD bases.The POD bases are determined through a singular value decomposition (SVD) Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | ζ g i are not mutually orthogonal.The approximate solution, f POD-MM ∆x g (p), is then given by GMDD Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | The first step was to develop the ROM based on the POD method using the finest resolution (∆x f = 0.25 m) soil moisture predictions from 1998-2000.Given the soil moisture data for 2002 and 2006, we then determined f POD based on Eqs.(1) and (3).The mean relative error of the POD method, e POD , over 120 days in year 2002 and 2006 decreases with increasing M (Fig.2).Here, the M values at which the error is computed correspond to increasing ε λ = 10 −1 , 10 −2 , . . ., 10 −8 .There was no significant difference between the error budgets as a function of M for 2002 and 2006.Interestingly, site D requires more POD bases than the other sites to obtain the same value of e POD .We therefore concluded that the sites increase in hydrological complexity in the following order: A, B, C, and D. This conclusion is not obvious based solely on the DEM of the sites.For example, although the DEMs of sites B and C have similar distributions, site C requires significantly larger M compared to site B for the same e POD (Fig.3).In addition, site D requires the most M although its DEM has the smallest standard deviation.The larger number of POD bases required by site D can be attributed to particularly non-smooth soil moisture probability distribution functions (PDFs) under relatively saturated conditions.The POD method is more efficient when the approximated solution has more smoothness in the parameter space (i.e., a solution at a particular point varies smoothly with the parameters).Site D is relatively flat and at the end of the summer season it Discussion Paper | Discussion Paper | Discussion Paper | in their development of ROMs for face reconstruction.Although larger M improves the least square fit, it leads to overfitting and increases the uncertainty Discussion Paper | Discussion Paper | Discussion Paper | in the computed α POD-MM .This increased uncertainty in α POD-MM introduces significant random noise into the reconstructed fine-resolution solution, leading to fluctuating e POD-MM ∆x g.Compared to f POD-mean ∆x g

i
=1 λ i /λ T and λ T = N i =1 λ i ; the slope of the line is dependent on ∆x g (Fig.7).In addition, e λ M < 10 −6 appears to be a reasonable criterion for determining M optimal .Choosing this value leads to M = 10 and M = 25 at sites A and D, respectively, for ∆x g = 8.0 m.These values are very close to the M optimal values identified based on Fig. 6.We next analyzed the daily variation of e POD-MM ∆x g for years 2002 and 2006 (Fig. 8).The error e POD-MM ∆x g is typically larger in the wetter periods although its maximum is below Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | ∆x g for small M (Fig. 11).For example, for M = 1 and ∆x g = 0.5 m, e POD-MM ∆x g is an order of magnitude smaller than e POD-MM ∆x g .However, the convergence behavior of e POD-MM2 ∆x g with M is less well behaved as compared to the POD-MM.As a result, minimum achievable e larger ∆x g .The original formulation is thus preferred since it allows the error to be reduced systematically by increasing M, especially when ∆x g is large.Discussion Paper | Discussion Paper | Discussion Paper | for M > 10.The PDF of f POD-MM ∆x g for 0.4 ≤ θ ≤ 0.6 is reasonably close to the PDF of f (Fig.16, shown for day 20 of year 1998 for which e POD-MM ∆x g is approximately the minimum e POD-MM ∆x g Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper |

Fig. 10 .Fig. 15 .
Fig. 10.Solutions of f, g, and ε POD-MM ∆x g for day 106 of year 2002 at site D and for three soil depths; the top, middle, and bottom rows correspond to layers 0-5 cm, 20-25 cm, and 45-50 cm, respectively, from the surface.

Fig. 16 .;
Fig. 16.The probability density function (PDF) of f, f POD-MM ∆x g and f POD-MM3 ∆x g for day 20 of year A, and µ DEM ∆x f (µ DEM ∆x g) is the mean of the DEM at resolution ∆x f (∆x g ).Thus, no fine-resolution solution of A is used to determine f