Edinburgh Explorer Description and validation of an intermediate complexity model for ecosystem photosynthesis and evapotranspiration: ACM-GPP-ETv1

. Photosynthesis (gross primary production, GPP) and evapotranspiration (ET) are ecosystem processes with global signiﬁcance for climate, the global carbon and hydrological cycles and a range of ecosystem services. The mech-anisms governing these processes are complex but well un-derstood. There is strong coupling between these processes, mediated directly by stomatal conductance and indirectly by root zone soil moisture content and its accessibility. This coupling must be effectively modelled for robust predictions of earth system responses to global change. Yet, it is highly demanding to model leaf and cellular processes, like stomatal conductance or electron transport, with response times of minutes, over decadal and global domains. Computational demand means models resolving this level of complexity cannot be easily evaluated for their parameter sensitivity nor calibrated using earth observation information through data assimilation approaches requiring large ensembles. To over-come these challenges, here we describe a coupled photosynthesis evapotranspiration model of intermediate complexity. The model reduces computational load and parameter numbers by operating at canopy scale and daily time step. Through the inclusion of simpliﬁed representation of key process interactions, it retains sensitivity to variation in climate, leaf traits, soil states and atmospheric CO 2 . The new model is calibrated to match the biophysical responses of a complex terrestrial ecosystem model (TEM) of GPP and ET through a Bayesian model–data fusion framework. The calibrated ACM-GPP-ET generates unbiased estimates of TEM GPP and ET and captures 80 %–95 % of the sensitivity of carbon and water ﬂuxes by the complex TEM. The ACM-GPP-ET model operates 3 orders faster than the complex TEM. Independent evaluation of ACM-GPP-ET at FLUXNET sites, using a single global parameterisation, shows good agreement, with typical R 2 ∼ 0 . 60 for both GPP and ET. This intermediate complexity modelling approach allows full Monte Carlo-based quantiﬁcation of model parameter and structural uncertainties and global-scale sensitivity analyses for these processes and is fast enough for use within terrestrial ecosystem model–data fusion frameworks requiring large ensem-bles.


Introduction
Ecosystem photosynthesis and evaporation are key ecosystem fluxes, and their strong coupling generates important feedbacks between plant carbon and water cycles (Tuzet et al., 2003;Bonan and Doney, 2018). Ecosystem photosynthesis, or gross primary productivity (GPP), is generally the sole input of organic carbon into terrestrial ecosystems, ultimately determining potential carbon accumulation rates. Ecosystem evaporation, or evapotranspiration (ET), is the combination of plant-mediated transpiration, soil surface evaporation and subsequent evaporation of rainfall intercepted by plant canopies. The dominant abiotic factors governing the magnitude and variability of GPP are tempera-T. L.  ture, which is controlled by root biomass and its distribution through the soil profile. Thus, the soil-root interface is a second coupling point between the plant carbon and water cycles (Beer et al., 2009;Bonan and Doney, 2018). State-of-the-art terrestrial ecosystem models (TEMs) provide a mechanistic and process-oriented representation of the coupling between plant carbon and water cycles (e.g. Krinner et al., 2005;Oleson et al., 2010;Smallman et al., 2013;Harper et al., 2016) at leaf or even sub-leaf scale, resolving radiative transfer, stomatal conductance and electron transport. TEMs represent state-of-the-art knowledge on how ecosystems function and are used to provide meaningful predictions of the responses by and feedbacks from the terrestrial land surface in response to changes in the earth system (Bonan and Doney, 2018). Mechanistic models linking leaf-level photosynthesis (e.g. Farquhar and von Caemmerer, 1982;Collatz et al., 1991) and transpiration (e.g. Monteith, 1965) through models of stomatal regulation Williams et al., 1996;Bonan et al., 2014) are well established. Scaling from leaf to canopy scale has grown increasingly complex as the role of non-linear within-canopy variation of both abiotic (e.g. light, temperature, momentum, CO 2 and H 2 O) and biotic (i.e. plant traits) factors in plant carbon-water relations has improved (e.g. Wang and Leuning, 1998;Buckley et al., 2013;Sun et al., 2014;Way et al., 2015;Coble et al., 2016;Scartazza et al., 2016;Nolan et al., 2017;Bonan et al., 2018).
However, the increasing complexity of TEMs presents new challenges. Many of the most complex TEMs are too slow for use in model-data fusion analyses, which are reliant on massive ensemble simulations (e.g. Ziehn et al., 2012;Smallman et al., 2017). While effective and more computationally efficient alternative model-data fusion approaches are available, they often rely on model code modifications, such as the creation of the model adjoint in variational approaches (e.g. Kuppel et al., 2012;Raoult et al., 2016) or model emulation, often resulting in larger uncertainties in their posterior analysis (e.g. Fer et al., 2018). The complexity of typical TEMs generally prevents a robust quantification of their uncertainties; it is very challenging computationally to determine the sensitivities of TEM model outputs to parameter variation. This hinders interpretation of model-data mismatch. Finally, there are major challenges in procuring sub-daily meteorological observations needed to drive TEMs away from meteorological stations -this is a particularly acute problem in tropical regions. Thus, TEMs are generally run using statistical downscaled climate reanalysis data, which contain errors. The uncertainty generated when these errors are propagated into TEM GPP and ET estimates is comparable to intermediate complexity (IC) model error associated with simulating daily fluxes directly (Williams et al., 1997(Williams et al., , 2001a. Thus, IC models have been shown to have similar errors to TEM models but at lower computational cost (i.e. 1 time step versus 24 time steps). Thus, there is considerable value in having less complex, fast-running models that simulate GPP and ET. The challenge here is to produce a model both sufficiently mechanistic to represent the coupling between plant carbon and water cycles linking to ecophysiological processes and observations of key global unknowns (e.g. rooting depth) but also computationally fast enough to be integrated into model-data fusion schemes and to allow a full exploration of parameter-related uncertainties.
Photosynthesis is often estimated using physiologically realistic light, CO 2 and temperature response functions (e.g. Jones, 1992;Williams et al., 1997). Evaporation is frequently estimated using simplified versions of the Penman-Monteith model, typically modelling plant stomatal regulation as a function of environmental drivers (e.g. Priestley and Taylor, 1972;Fisher et al., 2008). The impact of moisture limitations on both GPP and ET is commonly achieved through the use of the vapour pressure deficit (VPD) as a proxy (e.g. Mu et al., 2011;Wang et al., 2017) or using a single soil layer "bucket" (e.g. Martens et al., 2017). While simple models can show skill when compared to in situ estimates (Mu et al., 2011;Bloom and Williams, 2015;Martens et al., 2017;Wang et al., 2017), they usually estimate a single process, either photosynthesis or evapotranspiration, neglecting their coupling. Without coupling, the feedbacks between C and water cycles will not be modelled robustly. For instance, there is a high risk that independently calibrated, simple GPP and ET models that are coupled naively in a plant-soil model framework will misdiagnose the sensitivity of water use efficiency (C fixed per water transpired) and have low predictive capability outside of the calibrated range (e.g. big leaf versus multiple leaf canopy; Tuzet et al., 2003;Wang and Leuning, 1998). Thus, connecting a series of simple models to generate a model of IC carries significant risks. The IC model must represent process interactions effectively. A key test therefore is that any IC model must reproduce the sensitivities of key processes (i.e. GPP, ET), their interactions (WUE) and soil moisture status demonstrated by the state-of-the-art TEMs to ensure flux estimates are right for the right reasons.
This study builds on two previously developed aggregated canopy models (ACMs) for GPP (Williams et al., 1997) and ET (Fisher et al., 2008) and an existing state-of-the-art TEM, Soil Plant Atmosphere (SPA; Williams et al., 1996;Smallman et al., 2013). ACM-GPP simulated daily GPP sensitive to canopy nitrogen (N), temperature, absorbed short-wave radiation and atmospheric CO 2 concentration, based on physiologically realistic relationships but lacking a representation of the impact of soil moisture availability on photosynthesis. Despite this limitation ACM-GPP has been coupled to the DALEC C-cycle model (Williams et al., 2005) and successfully used in model-data fusion experiments to improve our understanding of ecosystem C status, C allocation and residence times (Fox et al., 2009;Bloom and Williams, 2015;Bloom et al., 2016;Smallman et al., 2017) but also carbonnitrogen interactions (Thomas and Williams, 2014). In addition to lacking a soil moisture response to photosynthesis, ACM-GPP limits the capacity of DALEC analyses to constrain the root component of the C cycle as roots cur-rently play no ecological role within the modelling system (i.e. water or nutrient uptake). ACM-ET simulates the bulk ecosystem evapotranspiration based on a modified Penman-Monteith approach sensitive to absorbed short-wave radiation, temperature, vapour pressure deficit and wind speed. However, ACM-ET's bulk approach does not allow for distinguishing between different evaporative sources (i.e. soil surface, root-extracted rainfall and canopy-intercepted rainfall). Thus, it does not account for the different biotic and abiotic drivers which have varied responses to environmental change (Wei et al., 2017). Moreover, ACM-ET does not have a mechanistic coupling to water supply governed by root biomass and root vertical distribution. ACM-GPP and ACM-ET use different empirical models linking LAI, minimum tolerated leaf water potential and meteorological drivers to estimate canopy conductance. Both ACMs can be calibrated to provide useful GPP and ET estimates; however when combined their predictive capacity for emergent properties such as WUE is limited (R 2 < 0.2; data not shown), highlighting the need for further development to reproduce emergent ecosystem properties.
Here we describe a process model of intermediate complexity, ACM-GPP-ET version 1, that simulates gross primary productivity and evapotranspiration. ACM-GPP-ET is a fast, coupled representation of plant carbon and water cycles at ecosystem scale and daily time resolution. Coupling is achieved via a canopy stomatal model that determines CO 2 and H 2 O exchanges in the canopy. With fewer parameters than many state-of-the-art process-orientated models, our model is simpler to calibrate. With a daily time step and single canopy layer, the model is fast and therefore viable for ensemble modelling. In fact ACM-GPP-ET is explicitly intended as a replacement for ACM-GPP as part of the DALEC model, addressing current weaknesses in simulating carbon-water interactions with our model-data fusion framework. To ensure its realism, ACM-GPP-ET is an emulation of a more complex TEM, SPA (Williams et al., 1996), that resolves leaf-scale, hourly exchanges of CO 2 and water. SPA also includes a detailed, multi-layer representation of radiative transfer, energy balance, carboxylation and plantsoil interactions at sub-daily timescales. SPA explicitly couples available supply of water from the soil (determined as a function of soil characteristics, root biomass and structure) to demand by the atmosphere (as a function of absorbed radiation and vapour pressure deficit) which results in robust dynamics in response to varied water availability (Bonan et al., 2014). We create a very large ensemble of SPA runs across environmental space to map the sensitivity of GPP and ET to biophysical changes and then fit the parameters of ACM-GPP-ET to these surfaces.
ACM-GPP-ET is based on previous approaches developed to estimate GPP and ET independently (Williams et al., 1997;Fisher et al., 2008) but here uniquely are realistically coupled for the first time. GPP is estimated as a function of foliar nitrogen content allocated to photosynthetic activity, temperature, intercellular CO 2 concentration and absorbed PAR. ET is estimated as the sum of transpiration, evaporation from the soil surface and of rainfall intercepted by the canopy, within a soil water mass-balanced system. Using a combination of GPP and ET estimates from both TEM and observation-orientated analyses, spanning site to global scales, we calibrate and validate ACM-GPP-ET and address the following questions: 3. How do ACM-GPP-ET predictions compare to fully independent FLUXNET-derived estimates of carbon and water fluxes across the globe?
Finally we discuss novel research applications made possible using our intermediate complexity, ecophysiologically based modelling approach including full Monte Carlo-based quantification of model parameter and structural uncertainties, global-scale sensitivity analyses (e.g. WUE response to increased CO 2 ), rapid testing of alternative theoretical models of stomatal conductance and use within terrestrial ecosystem model-data fusion frameworks.
2 Description of ACM-GPP-ET

Model overview
The aggregated canopy model for gross primary productivity and evapotranspiration version 1 (ACM-GPP-ET v1) provides a computationally efficient yet broadly mechanistic representation of photosynthetic and evaporative fluxes of terrestrial ecosystems. Evapotranspiration is explicitly represented as the sum of transpiration (coupled to GPP via a mechanistic representation of stomatal conductance), evaporation from the soil surface and evaporation of precipitation intercepted by the canopy. Absorption and reflectance of short-and long-wave radiation are estimated as non-linear functions of LAI. Aerodynamic conductance for canopy and soil surface exchange is estimated as a function of wind speed and canopy structure (LAI and height). ACM-GPP-ET includes a four-layer model of soil water balance. The top three soil layers are accessible to roots, which determines the available supply of water to the plant as a function of fine root biomass and their distribution through the soil profile. Soil evaporation is assumed to be supported by the top soil layer only (Fig. 1).

Model drivers
ACM-GPP-ET requires both meteorological and biophysical information as inputs (Table 1). Most of the needed drivers are widely available from either field observations or global reanalyses. Meteorological drivers are extracted from the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA-Interim reanalysis (Dee et al., 2011), while soil textural information is extracted from global interpolations of field inventories (e.g. Harmonized World Soils Database, HWSD; Hiederer and Köchy, 2011). LAI is widely available from satellite-based remote sensing such as the NASA-generated MODIS product (https://modis.gsfc. nasa.gov/data/dataprod/mod15.php, last access: 18 October 2017). In contrast, information on below-ground biophysical information, such as root stocks and rooting depth, is more challenging to obtain as it is highly spatially variable and not directly observable from space. Simulation models and model-data fusion-based C-cycle analyses may simulate root stocks which can provide useful information (e.g. Bloom et al., 2016), while rooting depth information can also be statistically estimated (e.g. Fan et al., 2017).

Model parameters
Parameters within ACM-GPP-ET represent a wide range of time-invariant physical, biogeophysical and biogeochemical properties. In this study we calibrate a total of 22 parameters related to nitrogen-use and light-use efficiency, temperature response of photosynthesis, plant water-use efficiency and radiation absorption and reflectance processes (Table 2); these parameters broadly relate to ecosystem traits. Ecosystem traits can be reasonably expected to vary between ecosystems, and thus we should be able to retrieve ecologically consistent estimates for these parameters given suitable carbon and water flux information. In this study were we are calibrating against a complex model with "known" parameter values against which we can compare our estimated values. There are a further 12 biophysical parameters which are assumed to be constant and therefore are not retrieved as part of our calibration procedure (Table 3).

Gross primary productivity
Following Williams et al. (1997) and Jones (1992), GPP is estimated as a co-limited function of temperature, CO 2 (limited by stomatal opening and thus plant water availability) and absorbed PAR. The temperature-and foliar-nitrogenlimited rate of photosynthesis (P NT ; g C m −2 d −1 ) is first determined as a function of leaf area index (LAI; m 2 m −2 ), average foliage nitrogen content (N fol ; grams of nitrogen per metre squared of leaf), nitrogen use efficiency (NUE; g C/g N/m 2 leaf/d) and temperature (T air ; • C).
where T adj describes a skewed normal distribution (scaling 0-1) with an optimum temperature (T opt ), maximum temperature (T max ) and kurtosis (Kurt).
CO 2 limitation is dependent on canopy conductance of CO 2 (g c ; mmolCO 2 m −2 d −1 ), which is assumed to be the combined conductance of the stomata (g s ; mmol H 2 O m −2 s −1 ) and leaf boundary layer (g b-mmol ; mmol H 2 O m −2 s −1 ). Note that g s and g b are calculated for conductance of water vapour, and thus coefficients 1.65 and 1.37 convert conductance of Table 1. Drivers required as inputs by ACM-GPP-ET. Each driver has its unit specified and a brief description. The mean value across the calibration climate space is given with the standard deviation in parentheses. The drivers are divided between those which are time-varying and those assumed constant for a given location but that can vary between locations.  (Jones, 1992); dayl sec = 86 400 is the number of seconds per day.
The canopy boundary layer (g b ; see Eq. 57) and stomatal conductance (g s ) are initially calculated in metres per second (m s −1 ) and thus must be converted into millimoles of H 2 O per metre per second (mmol H 2 O m −2 s −1 ) for the purposes of calculating CO 2 exchange. Note that the coupling between photosynthesis and transpiration occurs via g s , which is estimated as a function of available water supply, atmospheric demand and the intrinsic water use efficiency threshold on GPP; see Sect. 2.6 for details on the estimation of g s . g b-mmol = g b · (1000 · Pr air /(T airK · R con )) (4) g s-mmol = g s · (1000 · Pr air /(T airK · R con )), where T airK is the air temperature in kelvin, Pr air is air pressure (default = 101 325 Pa) and R con is the universal gas constant (8.3144 J K −1 mol −1 ). The scalar 1000 adjusts units from moles (mol) to millimoles (mmol). The internal CO 2 concentration (C i ; ppm or µmol mol −1 ) is estimated as a function of atmospheric CO 2 concentration (C a ; ppm or µmol mol −1 ), g c , the CO 2 compensation (C comp ; ppm or µmol mol −1 ) and half saturation (C half ; ppm or µmol mol −1 ) points.
M C (12 g C mol −1 ) is the molar ratio of carbon, where its inverse (M −1 C converts from moles of carbon (mol C) to grams of carbon (g C) and 1 × 10 6 scales from micromoles (µmol) to moles.
C comp determines the C i at which GPP becomes positive, while C half is the C i at which CO 2 -limited photosynthesis is at 50 % of its maximum rate. Both C comp and C half are calculated as a function of temperature following McMurtie et al. (1992).
T airK (10) where α comp and α half describe the values at the reference temperature (20 • C or 298.15 K), and β comp and β half describe the sensitivity of the temperature response. CO 2 -limited photosynthesis (P CO 2 ; g C m −2 d −1 ) is calculated as a function of g c and CO 2 exchange gradient, where 1 × 10 −6 scales from µmol to moles, and M C converts moles to grams of C. At this juncture, a day length (dayl: hours) correction is applied to be consistent with the light limitation calculation which follows Light-limited photosynthesis (P I ; g C m −2 d −1 ) is defined as a function of absorbed short-wave radiation (I ) and a quan-tum yield parameter (E 0 ).
The final GPP estimate (g C m −2 d −1 ) is the result of combined light-and CO 2 -limited photosynthesis. GPP = P I · P CO 2 P I + P CO 2 (14)

Evapotranspiration
Evapotranspiration is based on the Penman-Monteith model assuming isothermal net radiation conditions (Jones, 1992). Evaporation is simulated from three source which are (i) transpiration, (ii) evaporation of precipitation intercepted by the canopy and (iii) the soil surface. The following sections detail the calculation of each evaporative source within their respective available water supplies.

Transpiration
Transpiration (E trans ; kg H 2 O m −2 d −1 ) is estimated by the Penman-Monteith equation linking the drivers of transpiration, canopy radiation status and atmospheric demand, with restrictions on evaporative losses, namely available water supply from the roots within the soil profile. The upper limit on water supply is imposed by restricting the maximum stomatal conductance (g s ) for a given set of environmental conditions (process described in Sect. 2.6).
where iso-canopy is the isothermal net radiation (W m −2 ; see Sect. 2.7.2), s (kPa K −1 ) is the slope of curve relating saturation vapour pressure with air temperature and γ is the psychrometer constant (kPa K −1 ). ρ air is the density of air (kg m −3 ), λ is the latent heat of vaporisation (J kg −1 ) and cp air is the specific heat capacity of air (J kg K −1 ).
s, γ and λ are calculated as a function of T air following equations described in Jones (1992).
The calculation of canopy conductance (g b ; m s −1 ) is described in Sect. 2.8.1 linked to canopy properties (LAI and canopy height) and wind speed. Stomatal conductance (g s ; m s −1 ), used in both the calculation of GPP and transpiration, is calculated via an iterative bisection procedure described in the following section.

Wet canopy evaporation
Wet canopy surface evaporation (E wet ; kg H 2 O m −2 d −1 ) is the evaporation of precipitation intercepted by the canopy and thus is limited by the available canopy water storage (C stor ).
where E pot is the potential wet canopy evaporation (kg H 2 O m −2 d −1 ). E pot is assumed to be unrestricted evap-oration as estimated by the Penman model assuming isothermal net radiation. E pot is further restricted based on the ratio of current canopy water storage: where C max is the maximum canopy water storage (kg H 2 O m −2 ), defined as a function of LAI related by α (0.2) as previously used in SPA (Smallman et al., 2013).
where C stor is determined by water inputs from precipitation, less that which reaches the soil surface (i.e. throughfall) and its water losses by evaporation (as described above) or overflow from intercepted water exceeding C max onto the ground. The fraction of precipitation expected to be throughfall (Tfall) is estimated as a function of LAI related by µ (0.5), where µ is selected assuming that interception of rainfall is similar to that of direct radiation.

Soil surface evaporation
Soil evaporation (E soil ; kg H 2 O m −2 d −1 ) is estimated using the Penman-Monteith equation linking drivers of evaporation, soil isothermal radiation ( iso-soil ; W m −2 ) and atmospheric demand, with restrictions on evaporative losses (i.e. available water in the top soil layer). The upper limit of evaporation is also restricted by the thickness of the dry layer (drythick; m) of soil at the surface.
where g soil is the soil surface aerodynamic conductance (m s −1 ), and g ws is the conductance of water vapour (m s −1 ) through the soil air space. VPD soil is the vapour pressure deficit (kPa) between the air above the soil and air within the soil pore space.
where por top is the porosity (0-1; m 3 m −3 ) for the top soil layer as calculated using the Saxton model of soil hydrology (Saxton et al., 1986;Williams et al., 2001b), D w is the diffusion coefficient for water through air (m 2 s −1 ) at reference temperature 293.2 K and 1.75 is a scalar coefficient relating the temperature dependence of D w . τ is the tortuosity (i.e.
where e surf is the vapour pressure deficit within the soil air space (kPa). e soil is the vapour pressure in the soil air space (kPa), and e sat is the saturation vapour pressure at the current temperature. SWP top is the soil water potential (MPa) for the top soil layer, while V w is the partial molar volume of water at 20 • C (i.e. 1.805 ×10 −7 m 3 mol −1 ). All other scalar values are coefficients relating current air temperature to e sat .

Calculating g s : the coupling point between plant C and H 2 O cycles
The intrinsic water use efficiency (iWUE) optimisation approach for estimating g s is well established and validated; in particular iWUE has been shown to show improved drought response of g s compared to the Ball-Berry g s model (Williams et al., 1996;Bonan et al., 2014). The model aims to maximise photosynthetic uptake within the constraints on g s imposed by the available supply of water to the canopy and atmospheric demand for evaporation; this approach is referred to as optimising the iWUE. Calculation of g s is a three-step process.
Step 1 is the estimation of the potential steady flow water supply over the day (MaxSupply; kg H 2 O m −2 d −1 ) from the soil via roots to the canopy.
where LWP min is the minimum tolerated leaf water potential (MPa), wSWP is the soil water potential weighted by root access (MPa) and R tot is the total hydraulic resistance (MPa s −1 m −2 mmol −1 H 2 O). The unit is changed from mmol to kilograms of water (kg H 2 O) using the molar mass of water M H 2 O (18 g mol −1 ) and 1×10 −3 scalar.
Step 2 inverts the Penman-Monteith equation to calculate the value of g s required to meet MaxSupply under current atmospheric demand and isothermal net radiation conditions.
Step 3 uses an iterative bisection process which quantifies the sensitivity of GPP to the g s increment by 1 mmol H 2 O m −2 s −1 (δGPP; grams of C per metre squared of leaf per day per millimole of H 2 O). The bisection process varies g s between 0 and g s-max minimising gs opt , the difference between δGPP/g s and iWUE, g C/m 2 leaf/d/mmol H 2 Og s .

Radiation balance
State-of-the-art radiative transfer schemes are able to quantify differential canopy absorption, transmittance to soil surface and reflection back to the sky of PAR, NIR and longwave radiation. Using a detailed radiative transfer scheme as a base (Williams et al., 1998), here we have developed simple Michaelis-Menten relationships parameterised to reproduce the emergent absorption, transmittance and reflection properties of a canopy as a function of LAI. Net canopy ( iso-canopy ; W m −2 ) and soil ( iso-soil ; W m −2 ) isothermal radiation balances are calculated from the combination of short-and long-wave absorption detailed in the following sections.

Short-wave radiation absorption
ACM-GPP-ET uses a bidirectional radiative transfer scheme to estimate the absorption of PAR and NIR by the canopy and soil surface. Downward radiation first interacts with the canopy either being reflected back toward the sky, transmitted toward the soil surface or absorbed by the canopy. Second, the radiation which is transmitted through the canopy to the soil surface is either absorbed or reflected back through the canopy. The fraction of incoming PAR (canopy PAR-abs ) and NIR (canopy NIR-abs ) absorbed is estimated as the residual of that reflected back into the sky (canopy PAR-refl , canopy NIR-refl ) or that transmitted (canopy PAR-trans , canopy NIR-trans ) toward the soil surface.
canopy PAR-abs = 1 − canopy PAR-refl − canopy PAR-trans (39) where α NIR-refl and α PAR-refl are the maximum fraction of NIR and PAR reflected by the canopy. K NIR-refl and K PAR-refl are the LAI values at which 50 % of maximum reflectance is achieved for NIR and PAR respectively. α NIR-trans and α PAR-trans are the maximum reduction in transmittance for NIR and PAR; similarly K NIR-trans and K PAR-trans are the LAI at which transmittance is reduced by 50 %. Absorption of PAR (APAR canopy ) and NIR (ANIR canopy ) by the canopy on its first pass down through the canopy is estimated as APAR canopy = PAR · canopy PAR-abs (41) ANIR canopy = NIR · canopy NIR-abs .
Transmitted PAR and NIR are then incident on the soil surface to be absorbed by the soil surface or reflected back up towards the canopy. We assume that the soil absorption fraction (soil abs ) of incident PAR and NIR is the same; however PAR and NIR remain independently tracked to allow for subsequent reflection back towards the canopy.
APAR soil = PAR · canopy PAR-trans · soil abs (43) ANIR soil = NIR · canopy NIR-trans · soil abs (44) PAR and NIR which are reflected from the soil are then available for a second opportunity of the canopy to absorb and typically contributed < 1 % of absorbed radiation in ACM-GPP-ET. Therefore the total APAR canopy and ANIR canopy are calculated as follows: ANIR canopy = ANIR canopy + (NIR · canopy NIR-trans · (1 − soil abs ) · canopy NIR-abs ).
Estimates of incoming short-wave radiation are widely available; however partitioned estimates of NIR and PAR are less frequent. In such circumstances we assume a fixed ratio between PAR and total short-wave radiation (PAR : SW).

Isothermal long-wave radiation absorption
The long-wave radiation balance is estimated assuming isothermal conditions (i.e. the surfaces are assumed to be the same temperature as the surrounding air). Calculation of the isothermal long-wave radiation is a trade-off between the need to account for a positive bias in available energy for evaporation, should only the short-wave radiation be accounted for, and errors introduced by not considering thermal heating and cooling of surfaces. Similar to the short-wave radiative transfer scheme described above, ACM-GPP-ET uses a bidirectional radiative transfer scheme to estimate the absorption and emission of long-wave radiation by the canopy and soil surface. The long-wave radiation balance is divided into four components. First, incoming radiation interacts with the canopy either being reflected back toward the sky, transmitted toward the soil surface or absorbed by the canopy. Second, the radiation which is transmitted through the canopy to the soil surface is either absorbed or reflected back through the canopy. Third, the soil surface emits long-wave radiation towards the canopy, repeating Step 1 in reverse. Fourth, the canopy itself emits long-wave radiation which is either incident on the soil surface or lost to the sky.
Emission of long-wave radiation (W m −2 ) is dependent on temperature and emissivity (σ ) related by the Stefan-Boltzmann constant (κ). Incoming long-wave radiation (LW) from the sky is assumed to be related to surface air temperature (T airK ) minus 20 • C, while long-wave emission (LW em ) from surfaces is assumed to be related to surface air temperature under isothermal conditions.
The fractions of incoming LW to be reflected (canopy LW-refl ), transmitted (canopy LW-trans ) and absorbed (canopy LW-abs ) by the canopy are estimates as a function of LAI.
The canopy emits long-wave radiation (i.e. LW em · LAI), much of which is absorbed within the canopy itself, resulting in a decreasing fraction of long-wave emitted by the canopy from actually leaving the canopy airspace (canopy LW−release ).
The soil surface also emits long-wave radiation (i.e. LW em ) which is absorbed, reflected or transmitted through the canopy above. Net absorption of long-wave radiation by the canopy is therefore calculated as ALW canopy = LW · canopy LW-abs + LW em · canopy LW-abs − LW em · LAI · 2 · canopy LW-release .
Note that factor 2 refers to the two sides of a leaf, both of which are releasing long-wave radiation, one heading upwards towards the sky and the other heading downwards towards the soil. The net absorption of long-wave radiation by the soil is estimated as where absorption of incident long-wave radiation is assumed to be equal to σ . We note a small quantity of long-wave emitted from the soil will be reflected back to the surface; however this is typically < 1 % of the long-wave energy budget and is neglected here.

Canopy aerodynamic conductance
Canopy conductance of water vapour (g b ; m s −1 ) and CO 2 is estimated using the leaf-level boundary layer conductance model used in SPA (Nikolov et al., 1995;Smallman et al., 2013). We assume that exchange is dominant at the top of the canopy and that conductance should be linked to the canopy top wind speed (u h ; m s −1 ). Note that the boundary layer conductance model allows for the simulation of both free and forced convection, although here we simulate only the forced convection due to the lack of an explicit simulation of the energy balance. A detailed description of the conductance model is given in Nikolov et al. (1995).
where g b is the conductance for water vapour (m s −1 ), and D wv is the temperature-dependent molecular diffusivity of water vapour (m 2 s −1 ), where D wv20 (0.0000242) is D wv at a 20 • C (293.15 K) reference temperature. Sh is the Sherwood number, and d o is the leaf diameter.
Sh is estimated as a fraction of the Nusselt number (N u), while N u is a function of the Prandtl (P r = 0.72) and Reynolds (Re) numbers.
where κ is the von Karman constant (0.41), d is the canopy zero plane displacement height (m), z 0 is the canopy roughness length (m) and u * is the friction velocity (m s −1 ). d and z 0 are calculated based on canopy structure (height z h and LAI) as described in Raupach (1994). and where C d1 is an empirically fitted parameter (7.5), and h (0.193) corrects the roughness length for the effect of the roughness sub-layer. u * is estimated as a function of LAI and u h : where C s = 0.003 approximates the impact of substrate drag, and C r = 0.3 corrects for the roughness sub-layer (Raupach, 1994).

Soil aerodynamic conductance
Soil aerodynamic conductance (g soil ; m s −1 ) is first calculated as a resistance. Soil resistance is integrated from the soil roughness length (z soil = 0.001 m) through the canopy based on the turbulent eddy diffusivity following Niu and Yang (2004).
where dz is the vertical step size (m) through the canopy, and K h is the eddy diffusivity at z position (m) within the canopy. Eddy diffusivity (K h ; m 2 s −1 ) is assumed to have an exponential decay through the canopy (as with momentum). Eddy diffusivity at the canopy top is estimated as specified in Kaimal and Finnigan (1994).
K h is decayed through the canopy as described below.
The coefficient of momentum decay f is dependent on c d the coefficient of drag for foliage (0.2), LAI, l m and soil surface Monin-Obukov similarity coefficient ( m ). m is assumed to be = 1 describing neutral conditions (Garratt, 1992).

Plant hydraulic resistance
We use a mechanistic model of plant hydraulics to determine the maximum available water supply to the canopy from each of the three potential rooting layers (E layer ; mmol H 2 O m −2 s −1 ) under steady-state flow (Jones, 1992). The advantage to using a mechanistic approach allows for the estimation of physiological properties which makes novel comparisons with field observations such as Poyatos et al. (2016) possible. The model assumes that the canopy at LWP min is drawing from each of the three soil layers based on their layer-specific SWP, canopy, root and soil hydraulic resistances (MPa s −1 m −2 mmol −1 ).
where ρ lw is the density of liquid water (1000 kg m −3 ), and g is the acceleration of gravity (9.82 m s −12 ). The hydraulic resistance due to the soil (R soil ), roots (R root ) and the combined resistance of the stem and branch (R canopy ) each have units of MPa s −1 m −2 mmol −1 .
Root radius is the mean root radius (0.00029 m; Bonan et al., 2014), and r s is the mean distance between roots (m). l R is the root length (m) within the current soil layer, and l s is the thickness of the current soil layer (m). Root resist is the root resistivity (25 MPa/s/g/mmolH 2 0; Bonan et al., 2014), and G p is the plant conductivity to water (5 mmol H 2 O/m leaf area /s/MPa; Bonan et al., 2014).
The root length (l R ) within each of the three soil layers available for root access is a function of available root biomass within that layer (Root layer ; g m −2 ). The total root biomass (Root total ; g m −2 ) is distributed between the three soil layers assuming that 50 % of the biomass is in the top 25 % of the rooting profile, where the current rooting depth (D root-cur ; m) is assumed to follow an exponentially decaying function.
l R = Root layer Root density · π · Root 2 radius (77) G s is the soil conductivity (m 2 s −1 MPa −1 ), which is calculated as a function of soil textural parameters derived from the Saxton model of soil hydraulics (sax c1 , sax c2 and sax c3 ) and volumetric water content ( ; m 3 m −3 ). For further details, see Saxton et al. (1986) and Williams et al. (2001b) G s = sax c1 · e sax c2 +sax c3 The ratio of E layer / E layer determines the proportional extraction of water from each soil layer (Up frac ) due to E trans .
For use elsewhere in the model, the soil-layer-specific SWP and hydraulic resistances are aggregated based on uptake potential from each soil layer to provide an apparent SWP and resistance, i.e. the weighted soil water potential (wSWP) and total hydraulic resistance (R tot ).

Soil water balance
The Saxton model of soil hydraulics is used as the basis for simulation of the soil water balance within ACM-GPP-ET (Saxton et al., 1986). The implementation is a simplified version of that used within the SPA model (Williams et al., 2001b;Smallman et al., 2013). A total of four soil layers are simulated by the model; three of these layers are available for root access depending on the amount of root currently available. The first soil layer has a fixed depth of 10 cm from which soil surface evaporation is extracted, while the second layer has a fixed depth of 20 cm (i.e. total depth of first and second soil layers is 30 cm). The third layer has a variable depth dependent on the penetration depth of the roots within this layer (i.e. root biomass), thus providing a potential advantage of increasing rooting depth to access water resources deeper within the soil. The fourth soil layer is defined by the maximum soil rooting depth (D root max ; m). The soil water mass balance is updated through four stages briefly described below.
The soil water mass balance is updated in sequence dealing with (i) evaporative losses, (ii) gravitational drainage, (iii) infiltration of precipitation and (iv) adjustments to the soil layers based on changes in rooting depth. Evaporative losses from the soil surface are extracted solely from the top soil layer, while water losses due to transpiration are extracted based on the Up frac as determined based on the rooting distribution. The gravitational drainage and infiltration schemes are a simplified implementation of those used by the SPA model (Williams et al., 2001b). Gravitational drainage is then calculated based on the downward flow of water from soil layers currently above their field capacity to deeper layers and ultimately out of the bottom of the soil water column (i.e. drainage flux). Precipitation which reaches the soil surface infiltrates based on the available pore space (i.e. porosity) of the soil layers. As the minimum time period used for the model is daily, we assume that the maximum available pore space can be utilised. Once all soil layers have filled all available pore space (i.e. the soil is saturated), all remaining precipitation is assumed to be lost from the system as run-off.

Calibration procedure
We used the Soil Plant Atmosphere (SPA; Williams et al., 1996;Smallman et al., 2013) model to generate a dataset of photosynthetic and evaporative fluxes for the calibration of ACM-GPP-ET. The calibration of ACM-GPP-ET was conducted using the CARbon DAta MOdel fraMework (CARDAMOM) model-data fusion framework (Bloom and Williams, 2015). SPA simulated a 12-year period (2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012) at an hourly time step for 200 locations selected using a stratified random process from across the global land surface (Fig. 2); stratification was to ensure even coverage across the latitudinal gradient. The number of sites selected was a trade-off to ensure good spatial coverage of training data but to end with a calibration dataset comprised of ∼ 50 000 d to reduce computational cost for the calibration process. Land cover areas covered by desert, rocky areas or areas dominated by C 4 photosynthetic pathway vegetation, as specified in the ECMWF land cover map, were excluded from the sampling to avoid areas which do not have substantial photosynthetic activity and to reflect the fact that ACM-GPP-ET is designed to simulate the dominant C 3 photosynthetic pathway. To isolate the impact of root biomass and vertical distribution on water supply from the role of soil moisture status, SPA's soil moisture is held at field capacity for the creation of the calibration dataset. However, part of the validation process (see Sect. 4 for details) of SPA is rerun but this time allowing the soil moisture status to vary in response to inputs and losses to facilitate validation of ACM-GPP-ET's capacity to simulate the development of drought compared to our complex model.
For both the calibration and validation simulations, SPA (and ACM-GPP-ET) require inputs of meteorological drivers, foliar nitrogen, soil textural information and time series of LAI and root biomass. Prescribing LAI and root biomass, opposed to coupling ACM-GPP-ET to a C-cycle model, allows the isolation of photosynthetic driver sensitivity without complex C-cycle feedbacks. Meteorological drivers were taken from the ERA-Interim reanalysis (Dee et al., 2011); these drivers were downscaled to an hourly time step using a weather generator (https://github.com/GCEL/ WeatherGenerator_v1, last access: 29 April 2019). Atmospheric CO 2 concentration for each day was sampled from 300 to 450 ppm; the exaggerated CO 2 range is to ensure that influence of increasing CO 2 concentrations is contained within the calibration dataset. Note that prescribing LAI, root biomass and fixing soil moisture at field capacity prevents propagation of unexpected/undesired responses to strongly increasing atmospheric CO 2 gradients; i.e. each day's GPP and ET are purely as a function of that day's conditions. Mean foliar nitrogen content was randomly sampled for each site (but held constant over time) from log10 normal distribution (mean = 1.89 g N m −2 ; Kattge et al., 2011). Soil sand and clay contents were extracted from the Harmonized World Soils Database (HWSD; Hiederer and Köchy, 2011); locations for which the sand and clay content fall outside the parameterised bounds for the Saxton soil hydrological model were also excluded (∼ 0.7 % of global land surface). Time series of LAI and root stocks used to drive SPA and ACM- GPP-ET were extracted from a global terrestrial carbon cycle model-data fusion analysis (Bloom et al., 2016). We used LAI and fine roots datasets from Bloom et al. (2016) derived from MODIS LAI products, remotely sensed above-ground biomass and ecological process knowledge (for details, see Bloom and Williams, 2015).
In order to quantify the ability of the analysis framework to retrieve accurate ecophysiological trait information (e.g. optimum temperature of photosynthesis), a single set of ecophysiological parameters is used to drive SPA. This nominal parameter set is a combination of the forest hydraulic parameters from Williams et al. (1996) and broad-leaf forest radiation reflectance parameters from Smallman et al. (2013).
The hourly SPA simulations were aggregated to daily time step and then sub-sampled at a 2-weekly interval to reduce temporal auto-correlation. Further filtering was applied to remove days with zero GPP (i.e. winter) and days for which SPA was unable to solve its energy balance closure to a cumulative absolute error, over sub-daily time steps, of less than 50 W m −2 summed across each 24 h period. Thus, after filtering, some 42 658 simulation days were available for calibration.

SPA
The Soil Plant Atmosphere (SPA; Williams et al., 1996;Smallman et al., 2013) model simulates a mechanistic representation of the terrestrial ecosystem, coupling plant carbon and water cycles through ecophysiological principles. SPA simulates up to 10 canopy layers simulating both the sunlit and shaded leaf area, each being independently connected to water supply from the soil. Water accessibility from 20 soil layers is determined as a function of root penetration within each soil layer. SPA estimates the surface exchanges of heat, water and CO 2 within a mass-and energy-balanced framework. SPA has been extensively validated at range of spatial scales (leaf to landscape) and climate zones (tropical, temperate and Arctic) (Williams et al., 1998(Williams et al., , 2001bFisher et al., 2006Fisher et al., , 2007Sus et al., 2010;Smallman et al., 2013Smallman et al., , 2014López-Blanco et al., 2018). A detailed description of SPA and its major developments can be found in Williams et al. (1996Williams et al. ( , 1998Williams et al. ( , 2001bWilliams et al. ( , 2005, Sus et al. (2010) and Smallman et al. (2013); however a brief description follows below.
Leaf-level photosynthesis Farquhar and von Caemmerer, 1982), transpiration (Penman- Monteith;Jones, 1992) and energy balance are coupled via a mechanistic model of stomatal conductance (Williams et al., 1996;Bonan et al., 2014). Stomatal opening is modulated to optimise photosynthetic uptake within constraints determined by (i) the intrinsic water use efficiency and (ii) the balance of atmospheric demand with available water supply from the soil. SPA makes use of a detailed multi-canopy layer radiative transfer scheme estimating the absorption and reflectance of both short-and long-wave radiation for the sunlit and shaded leaf area (Williams et al., 1998). Aerodynamic exchange coefficients for water, heat and CO 2 are estimated accounting for above-, within-and under-canopy momentum decay, including stability corrections based on the Monin-Obukov stability theorem (Smallman et al., 2013).

CARDAMOM
The CARbon DAta MOdel fraMework (CARDAMOM; Bloom and Williams, 2015) is a model-data fusion (MDF) framework. CARDAMOM uses a Bayesian approach within a Metropolis Hastings-Markov Chain Monte Carlo (MH-MCMC) algorithm to retrieve parameters for a given model as a function of observational constraints. Parameter priors are specified with a uniform probability distribution truncated between minimum and maximum values (Table 1). CARDAMOM was used to calibrate ACM-GPP-ET based on GPP, transpiration, soil evaporation and evaporation of intercepted rainfall provided by SPA.
For the purpose of calculating the cost function used within the MDF approach, we assumed uncertainties consistent with those expected from eddy covariance. The uncertainty of both GPP and evaporative fluxes is assumed to be 15 %, representing random error estimates typically found for eddy covariance (Stoy et al., 2006;Mauder et al., 2013). The use of the SPA model with known input parameters allows for quantification of how accurately the equivalent retrieved parameters have been determined. Where there is a directly equivalent parameter between SPA and ACM-GPP-ET, the SPA values are provided in Table 1.

Validation procedure
ACM-GPP-ET is validated over two phases using a range of evapotranspiration and GPP estimates from (1) outof-sample SPA simulations and (2) fully independent eddy-covariance-derived estimates from the FLUXNET2015 database.

SPA validation
Phase (1) tests the ability of ACM-GPP-ET to simulate the soil moisture dynamics and resulting feedbacks on plant photosynthesis and evaporation estimated by the state-of-the-art SPA model. To achieve this we compare out-of-sample SPAsimulated estimates of GPP and evapotranspiration. SPA's soil moisture status was held constant at field capacity during calibration phase and substantially sub-sampled. Here SPA's soil moisture status was allowed to vary as a function of its inputs and outputs; these simulations were carried out at the same locations as those used for the calibration analysis but without any sub-sampling.

Independent validation: FLUXNET2015
Phase (2) quantifies ACM-GPP-ET's predictive skill using fully independent observations. Furthermore, we contrast ACM-GPP-ET's predictive capacity with that of SPA. This is achieved by both ACM-GPP-ET and SPA simulating carbon and water fluxes at FLUXNET2015 sites. The SPA simulations use the same nominal set of ecophysiological parame-ters used in generating the calibration dataset. ACM-GPP-ET uses the maximum likelihood parameter set retrieved during the calibration procedure. The parameter set used here broadly represents a forest ecosystem. Therefore, we hypothesise that both SPA and ACM-GPP-ET will perform best at forest sites and less well at sites with different hydraulic traits.
Daily estimates of GPP and ET derived from eddy covariance were extracted from the FLUXNET2015 database (http://fluxnet.fluxdata.org/, last access: 1 November 2016). The FLUXNET2015 database was filtered to include only sites which overlapped our simulation period (2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015) for a minimum of 3 years, to allow for inter-annual comparison and not more than 20 % missing data. ACM-GPP-ET is designed to emulate the C 3 photosynthetic pathway; therefore sites which are listed to be dominated by vegetation which use the C 4 or Crassulacean acid metabolism (CAM) pathways were removed. We also removed estimates which do not carry the highest-quality flags to avoid comparing our model against estimates generated using a statistical model of net exchange (i.e. we use only the non-gap-filled observations). Eddy covariance estimates for which energy balance non-closure is less than 85 % were also rejected. In total 497 site years encompassing 10 vegetation types across 59 sites were available for validation of GPP and ET. Meteorological drivers used to drive SPA and ACM-GPP-ET were extracted from the local site-level measurements. Missing data were gap filled, either assuming linear interpretation (for single missing hours) or extracted from bias-corrected temporally downscaled meteorological drivers as used in the calibration process. Finally, as with the earlier SPA simulations, LAI and root biomass for each site are extracted from Bloom et al. (2016).

Results
We show that a single global calibration of ACM-GPP-ET can effectively reproduce the patterns of GPP and ET simulated by SPA. Importantly the predictions of WUE are consistent for both ACM-GPP-ET and SPA, so that the simplified model is able to capture the interactions between C and water cycling. We also describe an independent validation against FLUXNET data, across 59 sites.

SPA calibration and validation
A single global ACM-GPP-ET parameterisation simulates the calibration dataset with a high degree of skill when using the maximum-likelihood parameter sets retrieved from the calibration analysis (Fig. 3). The dynamics of soil evaporation is best simulated by ACM-GPP-ET followed by GPP and transpiration, each achieving R 2 ≥ 0.91. Evaporation of canopy-intercepted precipitation achieved a lower R 2 at 0.81. All fluxes are largely unbiased (GPP bias = −0.2 g C m −2 d −1 and evaporative fluxes magnitude ≤ 0.08 kg H 2 O m −2 d −1 ), with low root mean square error (RMSE; 0.97 g C m −2 d −1 and ≤ 0.39 kg H 2 O m −2 d −1 ). However, we note a tendency to underestimate peaks in transpiration found in the SPA simulation (Fig. 3). Evaporation of canopy-intercepted precipitation is least well simulated by each metric used here; however this is expected given the sensitivity of this flux to the timing and intensity of precipitation events and the canopy energy balance varying strongly at sub-daily timescales which are not accounted for here. Transpiration (E trans ; 61 %) dominates the overall ACM-GPP-ET-simulated evaporative budget, followed by evaporation of canopy-intercepted rainfall (E wet ; 34 %) and soil surface evaporation (E soil ; 5 %), broadly consistent with those simulated by SPA in the calibration dataset (67 %, 28 % and 5 % respectively). ACM-GPP-ET is 2200 times faster than SPA, where ACM-GPP-ET requires ∼ 0.000007 s d −1 and SPA 0.015 CPU seconds per simulated day. Overall, ACM-GPP-ET simulates the calibration dataset with substantial skill and a substantial reduction in computational requirements.
Simulation of carbon and water fluxes remains robust in the phase (1) validation where ACM-GPP-ET is compared against the out-of-sample SPA simulations in which soil moisture content is dynamically simulated, i.e. not held at field capacity as in the calibration procedure (Fig. 4). Similarly, partitioning between evaporative fluxes remains closely aligned between ACM-GPP-ET (E trans = 59 %, E wet = 35 %, E soil = 6 %) and SPA (E trans = 62 %, E wet = 31 %, E soil = 7 %). Only soil evaporation suffers a substantial reduction in the simulation of the variability of SPA's soil evaporation from R 2 = 0.96 to 0.59 however remaining unbiased and the RMSE increasing by only 0.02 kg H 2 O m −2 d −1 (Fig. 4).
Ecosystem water use efficiency (WUE = GPP/E trans ) simulated by SPA is well predicted by ACM-GPP-ET (R 2 = 0.79, RMSE = 1.88 g C per kg H 2 O, bias = 0.33 g C per kg H 2 O). The consistency within simulations with dynamic water availability demonstrates resilience in ACM-GPP-ET's ability to represent the linkages between the plant carbon and water cycles, which is key when considering the impacts of climatic extremes such as the evolution of drought. This ability to simulate drought in a manor consistent with SPA is supported by the high-quality simulation of soil moisture content (R 2 = 0.84, RMSE = 4.19 kg H 2 O m −2 , bias = 1.17 kg H 2 O m −2 ; Fig. 4).

FLUXNET2015: independent validation
For phase (2) validation, ACM-GPP-ET and SPA simulated GPP and ET estimates for 59 sites from the FLUXNET2015 database to provide fully independent validation of ACM-GPP-ET's ability to simulate real-world estimates but also of its predictive skill compared to that of SPA (Table 4; Fig. 5). For the FLUXNET validation we aim to achieve a similar degree of predictive capacity to existing remote-sensing-driven estimates. As stated earlier we hypothesise that both ACM-GPP-ET and SPA will perform better at forest sites than in other ecosystems. GPP is typically better predicted than ET by both ACM-GPP-ET and SPA, which is expected given that ET is the combination of three evaporative fluxes (ET = E trans +E wet +E soil ). Both GPP and ET are underestimated by ACM-GPP-ET and SPA, with larger RMSEs than found when comparing between ACM-GPP-ET and SPA simulations (Table 4). However, ACM-GPP-ET marginally outperforms SPA at most sites and for ET in particular. The between-site distribution of R 2 and RMSE is skewed, with a relatively small number of sites performing poorly (Fig. 5).
For each metric shown (R 2 , RMSE and bias) the distribution achieved by ACM-GPP-ET indicates a potentially greater degree of predictive skill at a daily time step than SPA. ACM-GPP-ET and SPA perform well at forested sites (except evergreen broad-leaf forests), with more variable performance in grassland-, crop-and savannah-type ecosystems (Fig. 6). However, both ACM-GPP-ET and SPA demonstrated a clear capability to simulate inter-site variation (i.e. the mean GPP and/or ET between sites), with R 2 ∼ 0.94 for both GPP and ET. Variation in predictive capability is not unexpected given that we use a single set of parameters for both models without site-specific modifications.
ACM-GPP-ET-simulated and eddy-covariance-derived estimates of GPP and ET were compared at different temporal aggregations (weekly, monthly and annual), showing good skill at simulating seasonal and inter-annual dynamics. From daily through to weekly and monthly aggregation, the statistical agreement between variation in simulated and observed estimates improves considerably. Estimation of ET is most improved, increasing from R 2 = 0.58 to 0.72 to 0.75, while for GPP R 2 increases from 0.61 to 0.65 to 0.68. RMSE and mean bias remain largely unchanged. However, simulation of inter-annual variation is more challenging with the R 2 for GPP = 0.68 and ET = 0.46. A similar pattern of results is found for SPA (not shown).

Discussion
In this study we have described, calibrated (Fig. 3) and validated (Figs. 4-7) a model of intermediate complexity, the aggregated canopy model for gross primary productivity and evapotranspiration (ACM-GPP-ET v1). ACM-GPP-ET provides a process-orientated representation of plant photosynthesis and the water cycle, coupled through ecophysiological principles (Fig. 1). ACM-GPP-ET simulations using a single global calibration have been validated against simulated GPP and evaporative fluxes but also emergent properties including WUE and soil moisture status from the stateof-the-art SPA model. These simulations were driven with LAI, fine root stock and meteorological conditions spanning across global gradients (Fig. 2). Furthermore, to provide fully independent validation we have compared our estimated GPP and ET fluxes, again using a single global calibration, against multiple eddy-covariance-derived flux data from the FLUXNET2015 database, demonstrating substantial predictive skill.

ACM-GPP-ET: lessons learned on model simplification
A number of alternative model structures were tested over the course of the development of ACM-GPP-ET, and while it is beyond the scope of this study to describe these in detail, there are a range of important lessons learned from the development of specific components. The single most computationally expensive component is the iterative solution linking photosynthesis and transpiration via stomatal conductance. However, a coupled representation of stomatal conductance linking these processes was essential for maintaining the predictive capacity of both canopy exchanges and emergent properties of WUE and soil moisture status. Similarly, the simulation of soil moisture dynamics is time-consuming, due to the need for simulating non-linear drainage processes occurring at sub-daily time steps. Originally a single-layer bucket was tested but was unable to generate reasonable soil moisture dynamics and ultimately drought responses compared with SPA. Our experience is consistent with other studies which have explicitly considered the impact of vary- ing the number of soil moisture layers (Blyth and Daamen, 1997). Water drainage between soil layers and runoff of water from the canopy surface places an upper limit on efficiency achievable while maintaining predictive skill for soil moisture status and, indirectly, canopy fluxes. However, we expect further efficiency improvements to be achievable through subsequent code modifications including alternative theoretical approaches to achieve the photosynthesistranspiration coupling. A dedicated focus on code optimality is beyond the scope of the current study but is critical to the ongoing process of model improvement.
In this study 22 parameters are calibrated ( Table 3); 15 of these are related to the estimation of canopy and/or soil absorption of PAR, NIR and long-wave radiation. The key challenge for the radiative transfer was the essential requirement to reproduce the emergent non-linear functional shape between LAI and canopy radiation absorption, transmittance to soil and reflectance, making the complex vertical structure implicit in the calibration. We found an appropriate simulation of non-linear radiative transfer was critical for realis-tic radiative responses of each component of evaporation. In contrast, for a GPP model alone, a far simpler radiative transfer scheme was viable (Williams et al., 1997). However, the large number of parameters in the radiative transfer scheme is open to constraint through, for example, remote sensing observations of canopy structure and reflectance. These observations could be used to calibrate the scheme for individual locations but also canopy structural forms (i.e. canopy vertical structure).

Intermediate complexity emulation of a complex TEM
ACM-GPP-ET accurately simulates its calibration dataset (Fig. 3) and out-of-sample validation (Fig. 4) generated by the SPA model. Substantial predictive skill was achieved for photosynthesis and each of the evaporative fluxes but also the unbiased simulation of soil moisture and WUE, which were not part of the calibration process (Fig. 4). The single global calibration effectively spans global process sensitivity to climate across gradients of latitude, maritime-continental gradients and seasonal cycles. The calibration also represents the effect of ecological variation of LAI and fine root stocks. By including 10 different drivers, we generated a major challenge for model simplification. ACM-GPP-ET must robustly represent functional forms for C and water cycling across these multiple response dimensions, including any interactions. We note an underestimate in peak transpiration fluxes (Fig. 3) which we hypothesise is due to the lack of including the impact of energy balance in canopy and non-linear responses at sub-daily timescales. While this bias may in some cases lead to an underestimate of within-day drought/water supply limitation, the statistical analyses for validation indicate that the functional forms embedded in ACM-GPP-ET effectively represent those arising from complex mechanistic interactions within SPA. ACM-GPP-ET generates robust daily aggregations from SPA's hourly resolution.
ACM-GPP-ET effectively reproduces the ecoclimatological sensitivity of plant water use efficiency from the SPA model (Fig. 4). Reliable simulation of the dynamics and magnitude of plant WUE is an important property for robust modelling of hydrological, ecological and biological interactions. For climate sensitivity studies, ecosystem carbon-water coupling controls drought development and its interactions with ecosystem processes (Beer et al., 2009;Keenan et al., 2013;Bonan et al., 2014). Similarly, appropriate partitioning of evaporative fluxes, i.e. T /ET, is essential to simulate the overall ecosystem response to change in climate correctly. Transpiration has a direct interaction with biogeochemical cycling through canopy conductance, whereas evaporative fluxes have an indirect effect through adjustments to soil moisture and radiation environment mediated through variation in canopy cover. T /ET estimated by ACM-GPP-ET and SPA is closely aligned for both the calibration procedure with fixed soil moisture and that with dynamic soil moisture. The fraction of T /ET declined under dynamic soil moisture in both models, indicating a consistent response to varied water status. This consistency indicates that ACM-GPP-ET effectively represents and aggregates the critical nexus for carbon and water cycling simulated in the leaf-scale stomatal conductance routines of SPA to canopy scale. As noted in Sect. 6.1 alternative coupling approaches could be investigated to reduce the computational requirements, but perhaps more importantly our framework allows for the testing of alternative coupling hypotheses to robustly assess their predictive capacity against emergent properties. For example, recent analyses estimate that transpiration is responsible for ∼ 58 % of global evaporation, consistent with both SPA and ACM-GPP-ET estimates and contrasting with many TEMs which tend to underestimate T /ET partitioning (Wei et al., 2017) but also expanding on the stomatal model comparison conducted by Bonan et al. (2014).

Comparison to independent flux observations worldwide
A key component of our validation process was the comparison of ACM-GPP-ET-and SPA-simulated GPP and ET fluxes at 59 FLUXNET2015 sites. We aimed to achieve a comparable predictive skill compared to analyses driven by remotely sensed information. Moreover, we hypothesised that both ACM-GPP-ET and SPA would perform best at forest sites due to the choice of forest-based parameters used in the calibration procedure. ACM-GPP-ET and SPA simulated both GPP and ET fluxes with substantial skill, especially given that only a single global parameterisation was used (Table 4; Figs. 5-6). Indeed, ACM-GPP-ET slightly outperforms SPA in each statistical metric presented here; the greatest difference is found for simulating daily variation of GPP and in particular ET fluxes (Fig. 5). However, it is unlikely that ACM-GPP-ET has actually improved on SPA itself, as ACM-GPP-ET is an emulation of SPA; therefore the difference should not be viewed as significant. The improved statistics found for ACM-GPP-ET are likely due to a combination of factors underlying errors which by chance lead to an apparent improvement. One exception to this assumption is that SPA's sub-daily meteorological drivers were gap filled based on downscaled reanalysis drivers (as used in the calibration process), which, as noted in the Introduction, can introduce errors comparable in magnitude to the direct daily aggregation (e.g. Williams et al., 2001a). Finally, as hypothesised SPA and ACM-GPP-ET performed best at forests sites, with some grassland, cropland and woody savanna sites performing less well (Fig. 6). The relative pattern of performance between vegetation types is consistent between ACM-GPP-ET and SPA, strongly indicating a consistent underlying response to a wide range of climate conditions and ecological states and similar predictive capabilities when applied in circumstances without site-specific information.
The ACM-GPP-ET predictions used a single, global calibration and thus evaluated a single response surface to FLUXNET data, without taking into account any ecological variation in plant processes among FLUXNET sites beyond LAI, fine root biomass and soil textural information (potential errors in these components are discussed below). Thus, critical plant traits, such as the maximum rate of carboxylation per leaf area (V cmax ), stem hydraulic conductance (G p ) and rooting depth, were set the same across all sites for SPA and ACM-GPP-ET. But, we expect these parameters to vary, given our knowledge of trait variation at sites and from worldwide studies (Wright et al., 2004;Kattge et al., 2011;Johnson et al., 2018). So, the global parameterisation is likely to be biased for most sites. The poorer performance of SPA and ACM-GPP-ET in predicting GPP in evergreen deciduous forests is likely linked to a bias in parameters, for example (Fig. 5). For a more robust global application, ACM-GPP-ET requires prior estimates for local values of the parameters in Table 3. Based on previous works by Bonan et al. (e.g. 2014), we expect the most important local parameterisation will be for root resistivity, plant conductivity and NUE. Root resistivity and plant conductance determine the maximum rates of water transport to the canopy. NUE determines the capacity of carboxylation and electron transport in photosynthesis.
Errors in the LAI, fine root biomass and soil textural information used in the SPA and ACM-GPP-ET model inputs are also likely. Errors in the remote sensing of LAI assimilated within the CARDAMOM framework will propagate into the resulting estimates of LAI and fine root biomass. Soil textural information comes from the HWSD, the errors for which are poorly described. Furthermore, we made assumptions about root depth and canopy height which are also likely biased. LAI information ultimately comes from MODIS (or other satellite products), which has varied skill depending on vegetation cover type (Yan et al., 2016). Yan et al. (2016) showed MODIS LAI detects LAI dynamics least well over forests (particularly needle-leaf), but perhaps more critically for our analysis they also showed a consistent RMSE between 0.6 and 0.8 m 2 m −2 between vegetation types. For arable crop land and grassland, such an RMSE could constitute an error in the magnitude of LAI on the order of 66 % (in their observation dataset), potentially resulting in substantial errors and bias in estimation of ecosystem fluxes.

Comparison to other global GPP and ET estimates
We aimed for and achieved ACM-GPP-ET's performance against FLUXNET2015 sites to be comparable to that demonstrated by GPP and ET estimates generated by a range of alternative approaches and temporal resolutions (e.g. Jung et al., 2011;Mu et al., 2011;Martens et al., 2017;Wang et al., 2017). For example, FLUXCOM uses a machine-learning approach assimilation and a wide range of global spanning information to estimate monthly GPP (number of sites not reported: R 2 =∼ 0.82, RMSE =∼ 1.18 g C m −2 d −1 ) and ET (number of sites not reported: R 2 =∼ 0.86, RMSE =∼ 0.47 kg H 2 O m −2 d −1 ) (Jung et al., 2011). At monthly timescales FLUXCOM performs marginally better in the estimation of variation in fluxes but with RMSEs roughly half that found with ACM-GPP-ET for both GPP and ET; this is not unexpected as FLUXCOM was calibrated against the FLUXNET database itself (Jung et al., 2011), whereas the satellite-based remotely sensed derived 8 d MODIS estimates, based on absorbed radiation and empirical response functions, perform less favourably than ACM-GPP-ET for both GPP (18 sites, R 2 = 0.52, RMSE = 0.96 g C m −2 d −1 ; Wang et al., 2017) and ET (46 sites, R 2 = 0.65, RMSE = 0.84 kg H 2 O m −2 d −1 ; Mu et al., 2011). Finally, the Global Land Evaporation Amsterdam Model (GLEAM) estimates of ET performed similarly to those achieved by ACM-GPP-ET; GLEAM makes use of a comparatively complex approach to estimate ET, using a model of ecosystem water cycling updated by satellite-based remotely sensed information within a data assimilation framework to generate a daily estimate of the global water budget (63 sites, R 2 = 0.64, RMSE = 0.73 kg H 2 O m −2 d −1 ; Martens et al., 2017). In each case the approaches highlighted above made use of vegetation-type-specific information or location-specific remotely sensed biophysical information to drive their analysis compared to our comparatively naive approach using a single set of ecophysiological parameters. Therefore, we reasonably expect significant improvements through the inclusion of location-and/or vegetation-type-specific calibration as would be achieved through model-data fusion approaches (e.g. Bloom et al., 2016). A key benefit of our processorientated approach is that unlike the above-described approaches, which are each dependent on the input of remotely sensed information, our modelling framework can be used predictively to extrapolate into space and times for which remotely sensed information are not available. This is particularly true when ACM-GPP-ET has been coupled to our DALEC C-cycle model, allowing for feedbacks between car-bon supply and available LAI. Further details on the potential of this coupling are below.

Global applications
It is typical to generate regional and global estimates of carbon and water cycling using complex land surface models. Such models make vital contributions to assessments of the global carbon budget (Le Quéré et al., 2015) and weather and climate forecasts. A challenge for these models is that their complexity generates high computational demand, and they have demanding parameterisation needs. Thus, these models are often applied using plant-functional-type approaches, whereby parameters are set for an entire biome, with no variation, and no uncertainty is attached. There is a need then for models of intermediate complexity that are less demanding computationally and have fewer parameters but retain realism.
ACM-GPP-ET is such a model, constructed from the simplification of a complex land surface model with a long evaluation history, SPA. ACM-GPP-ET captures the critical functional forms in carbon-water interactions that emerge from process representation at sub-canopy, hourly timescales. ACM-GPP-ET can represent the interactions of supply and demand on stomatal opening and how this responds to changes in atmospheric conditions and soil moisture states. This level of detail is critical for application in global change analyses that are vital for diagnosing and predicting earth system evolution. Thus, ACM-GPP-ET produces realistic outputs, based on comparison with its more complex precursor.
ACM-GPP-ET is well suited for ensemble modelling schemes due to its faster runtime, as shown in the MH-MCMC calibration process used here with SPA outputs used as training data. The parameter posteriors generated here (Fig. A1) provide a starting point for full carbon cycle and water cycle analyses across regional to global domains. For instance, Bloom et al. (2016) have shown how an IC GPP model, ACM-GPP (Williams et al., 1997), combined with a carbon cycling model (DALEC; Williams et al., 2005), can be linked into a model-data fusion framework, CARDAMOM. CARDAMOM can, when combined with DALEC, retrieve probabilistic estimates of carbon stocks, fluxes and model parameters (including key unknowns such as photosynthate allocation to plant tissues and their residence times). CARDAMOM can produce outputs across a domain at the resolution of input forcing (climate data, burned area) and observational constraints (satellite time series of LAI, biomass maps, soil C maps). The advantage of CARDAMOM is that it generates likelihoods for model initial conditions and parameter values that are consistent with climate forcing and domain observations from, for example, satellites, and their estimated errors. Currently CAR-DAMOM infers water limitations to C cycling through satellite observations of greenness alone. Because there is no cou-pling to a local water model, CARDAMOM cannot use modelled information on water balance or independent observations such as surface soil moisture (e.g. Chen et al., 2018). Through using ACM-GPP-ET in CARDAMOM, it will be possible to assimilate new observational data related to water fluxes and state variables.
The combination of ACM-GPP-ET, coupled to DALEC, and CARDAMOM provides multiple direct and indirect avenues for propagating information acquired using intermediate complexity models to complex state-of-the-art TEMs. ACM-GPP-ET and SPA directly share five parameters calibrated in this study (Table 2) plus a further nine biophysical traits which were not calibrated in this study (Table 3). Moreover, all of the parameters calibrated in this study (Table 2) can be indirectly related to those used in SPA (and many other TEMs), e.g. NUE, which is closely related to V cmax , J max and foliar nitrogen but also radiation absorption and reflectance as a function of LAI. Similarly, when ACM-GPP-ET is combined with DALEC and used within the CAR-DAMOM framework, analyses such as those carried out by Bloom et al. (2016) (as is intended) retrieving information on carbon stocks, carbon allocation and residence times result in the retrieval of ecologically relevant traits. These traits can be directly related to parameters found in most state-of-the-art TEMs equipped with a C cycle. Such information should at a minimum provide information on spatial variation expected and in the optimum situation inform the researcher of the exact magnitude of those parameters.

Further opportunities and gaps
There are an array of next steps to undertake for further development both as a stand-alone tool and as part of a coupled modelling framework along with DALEC and CARDAMOM. The ACM-GPP-ET parameters estimated here against SPA can be calibrated individually at the FLUXNET2015 site (where sufficient biophysical information is available) to determine critical parameter variability to explain observed differences in fluxes. Driven with remotely sensed LAI, ACM-GPP-ET could make global estimates of GPP, ET and WUE for direct comparison with outputs from FLUXCOM, GLEAM and CMIP5 model ensembles. As part of the CARDAMOM framework, a site-specific FLUXNET2015 analysis allows us to assess our ability to retrieve information on the whole carbon cycle across ecological and climate gradients within a data-rich environment, including key unknowns such as rooting depths, which play a critical role in ecosystem resilience to drought. Such analyses provide the supporting frameworks needed to conduct global-scale reanalyses and potentially near-term (next 12 months) and intermediate-term (next 10 years) predictions with fully resolved uncertainties due to the propagation of ensembles.
Due to lack of space we have not reviewed the uncertainty estimates on parameter retrievals in the calibration of ACM-GPP-ET from SPA, but these contain useful information on the relative uncertainties in key processes in the aggregation. There are gaps in the capacity of ACM-GPP-ET globally; C 4 pathways have not been included, nor organic soils. However, SPA is capable of simulating flux responses to these process adjustments, so new calibrations and/or model structure can be generated following a similar approach to that laid out here.

Conclusions
We have calibrated and robustly validated a model of intermediate complexity, ACM-GPP-ET, demonstrating good capacity for simulating the carbon-water dynamics of a stateof-the-art SPA model. ACM-GPP-ET has demonstrated substantial predictive skill when simulating fully independent eddy-covariance-derived estimates of carbon and water exchange, which is comparable to that of other globally used GPP and ET products. Finally, ACM-GPP-ET is highly computationally efficient, ∼ 2200 times faster than SPA, opening up a substantial range of further opportunities.
Code availability. The code for ACM-GPP-ET version 1 has been made freely available from Edinburgh DataShare (doi: https://doi.org/10.7488/ds/2480; Smallman and Williams, 2018). Subsequent source code developments will be made available under GNU General Public License (GPL) via GitHub (https://github. com/GCEL/ACM_GPP_ET/releases, last access: 29 April 2019). The Fortran source code for the Weather Generator v1 used when downscaling daily meteorology to hourly time step is available at https://github.com/GCEL/WeatherGenerator_v1 (Smallman and Williams, 2019). The CARDAMOM calibration process of ACM-GPP-ET retrieves multiple parameter sets consistent with the calibration dataset, resulting in a probability density function (PDF) for each parameter. A detailed discussion of the PDFs retrieved is beyond the scope of this study; however a brief description of the primary features is given below (Fig. A1).
The width (relative to the prior range) and overall shape of the PDF (i.e. uni-or multi-modal) give an indication of the constraint achieved. The majority of parameter posteriors (16 out of 22) cover less than 50 % of their prior range and show a single clear peak value without substantial skew (e.g. NUE; Fig. A1). Notable exceptions include aPAR_trans and kPAR_refl, indicating that there is a degree of equifinality in the absorption of PAR required to support photosynthesis and canopy evaporation. Therefore, this provides a potential focus for refinement of the model or calibration process (e.g. through the introduction of new data streams). Figure A1. Probability density functions of the retrieved parameters for ACM-GPP-ET. The figure label refers to the name given in Table 2 of the main text. The range of the x axis matches that of the parameter prior ranges to allow easy identification of those parameters which are the easiest to constrain. Table B1. Ecophysiological parameters used by SPA within this study but not already provided in Tables 2 and 3. These parameters are drawn from previous SPA publications (Williams et al., 1996;Bonan et al., 2014).

Symbol
Value Units Description κC 33.6 µmol C g N s −1 Coefficient relating the maximum rate of carboxylation to leaf nitrogen content κJ 53.8 µmol C g N s −1 Coefficient relating the maximum rate of electron transport to leaf nitrogen content minLWP spa 2 MPa Absolute value for minimum tolerated leaf water potential Leaf cap 2500 millimole of H 2 O per metre squared of leaf per MPa Leaf/plant water capacitance for supply versus demand calculation leaf parrefl 0.16 fraction Leaf-level reflectance of incident photosynthetically active radiation leaf partrans 0.16 fraction Leaf-level transmittance of photosynthetically active radiation leaf nirrefl 0.43 fraction Leaf-level reflectance of incident near-infrared radiation leaf nirtrans 0.26 fraction Leaf-level transmittance of near-infrared radiation soil parrefl 0.03 fraction Reflectance of photosynthetically active radiation incident on soil surface soil nirrefl 0.02 fraction Reflectance of near-infrared radiation incident on soil surface