P-model v1.0: an optimality-based light use efficiency model for simulating ecosystem gross primary production

. Terrestrial photosynthesis is the basis for vegetation growth and drives the land carbon cycle. Accurately simulating gross primary production (GPP, ecosystem-level apparent photosynthesis) is key for satellite


Introduction
Realistic, reliable, and robust estimates of terrestrial photosynthesis are required to understand variations in the carbon cycle, monitor forest and cropland productivity, and predict impacts of global environmental change on ecosystem function . Understanding how photosynthetic rates depend on temperature, humidity, solar radiation, CO 2 , and soil moisture is at the core of this challenge. Process-based dynamic vegetation models (DVMs) and Earth system models (ESMs) in use today almost always use some form of the Farquhar-von Caemmerer-Berry (FvCB) model for C 3 photosynthesis (Farquhar et al., 1980;von Caemmerer and Farquhar, 1981), in combination with stomatal conductance (g s ) models (Ball et al., 1987;Leuning, 1995;Medlyn et al., 2011) that couple water and carbon fluxes at the leaf surface.
The FvCB model describes the instantaneous saturating relationship between leaf-internal CO 2 concentrations (c i ) and assimilation (A), and how this relationship depends on absorbed photosynthetically active radiation (APAR). It simulates A as the minimum of a light-limited and a RuBisCOlimited assimilation rate, A J and A C , respectively: Although the FvCB model is standard for leaf-scale photosynthesis and its environmental response on timescales of minutes to hours, DVMs and ESMs using FvCB produce divergent results for ecosystem-level fluxes and their response to the environment at longer timescales (Rogers et al., 2017). This is due to assumptions that have to be made about photosynthetic parameters that are not predicted by the FvCB model: stomatal conductance (g s ) and the maximum rates of RuBisCO carboxylation (V cmax ) and electron transport (J max ) for ribulose-1,5-bisphosphate (RuBP) regeneration, which together determine the relationship between c i and A. Common approaches for determining the values of V cmax and J max in DVMs and ESMs are to prescribe fixed values per plant functional type (PFT) and attempt to simulate the distribution of PFTs in space, or to use empirical relationships between leaf N and V cmax and simulate leaf N internally or prescribe it per PFT (Smith and Dukes, 2013;Rogers, 2014). While the FvCB model describes a non-linear relationship between instantaneous assimilation and absorbed light, ecosystem production, integrated over weeks to months, scales proportionally with APAR (Monteith, 1972;Medlyn, 1998). This observation underlies the general light use efficiency (LUE) model which describes ecosystem-level photosynthesis (gross primary production; GPP) as the product of APAR and LUE: where PAR is the incident photosynthetically active radiation and fAPAR is the fraction of PAR that is absorbed by green tissue. The LUE model is the basis for observation-driven GPP models that use fAPAR and PAR based on remote sensing data and combine this with different approaches for simulating LUE (Running et al., 2004;Zhang et al., 2017;Field et al., 1995) and for some forest growth models (Landsberg and Waring, 1997). Other remote-sensing-data-based models (Jiang and Ryu, 2016) apply the FvCB model in combination with vegetation cover and type data and prescribed V cmax for a set of PFTs.
Here, we describe a model, referred to as the P-model, that unifies the FvCB and LUE models following the theory developed by Prentice et al. (2014) and Wang et al. (2017a). The model assumes an optimality principle that balances the C cost (per unit of assimilation) of maintaining transpiration and carboxylation (V cmax ) capacities. It thus predicts how the ratio of leaf-internal to ambient CO 2 (c i : c a = χ) acclimates to the environment, given temperature (T ), water vapour pressure deficit (D), atmospheric pressure (p), and ambient CO 2 concentration (c a ) (Prentice et al., 2014). The P-model also assumes that the photosynthetic machinery tends to coordinate V cmax and J max in order to operate close to the intersection of the light-limited and RuBisCO-limited assimilation rates (coordination hypothesis; Chen et al., 1993;Haxeltine and Prentice, 1996;Maire et al., 2012) under mean daytime environmental conditions. By further assuming equality in the marginal cost and benefit of J max , daily-to-monthly average assimilation rates can then be described as fractions of absorbed PAR, i.e. as a LUE model (Eq. 2) (Wang et al., 2017a).
Thus, the P-model embodies an optimality-based theory for predicting the acclimation of leaf-level photosynthesis to its environment and for simulating LUE. In combination with prescribed PAR and remotely sensed fAPAR, it estimates GPP across diverse environmental conditions (Wang et al., 2017a). Its prediction for acclimating photosynthetic parameters reduces the number of prescribed (and temporally fixed) values and avoids the distinction of model parameterisation by vegetation types or biomes (apart from a distinction between the C 3 and C 4 photosynthetic pathways). The P-model has a further advantage over other data-driven GPP models (Running et al., 2004;Zhang et al., 2017) and empirically upscaled GPP estimates (Jung et al., 2011) in that it accounts for the influence of changing CO 2 , and that it uses first principles (rather than imposed functions) to represent effects of T , D, and p (Sect. 2). The theory underlying the P-model regarding the water-carbon trade-off has been described by Prentice et al. (2014) and applied by Keenan et al. (2016) to simulate how changes in primary production have driven the terrestrial C sink over past decades, and by Smith et al. (2019) to explain variations in observed V cmax . Wang et al. (2017a) complemented the theory by including effects of limited electron transport capacity (J max ) and predicted variations in observed χ across environmental gradients. To resolve model biases under conditions of low soil moisture, Stocker et al. (2019) further applied an empirical stress function to reduce LUE under dry soil conditions. The purpose of this paper is (i) to provide a full documentation of the model implementation and reference for opensource software (rpmodel R package, available on CRAN); (ii) to provide an evaluation of model-predicted LUE and GPP against GPP derived from eddy covariance flux measurements (FLUXNET 2015 Tier 1 dataset); (iii) to apply this model for global-scale simulations and compare spatial patterns and global totals of simulated GPP with other estimates with global coverage; and (iv) to introduce a robust and pragmatic solution to resolving model bias under dry and cold conditions. With (iv), we do not aim at extending the theoretical basis for the P-model (Prentice et al., 2014;Wang et al., 2017a) but to include environmental controls in the LUE model that serve to make the model applicable as a remote-sensing-data-driven GPP model for a wide range of conditions and vegetation types. The evaluation focuses on different components of variability (spatial, annual, seasonal, daily anomalies) (Sects. 4.6-4.2). We further address uncertainties associated with the fAPAR forcing (Sect. 4.4) and the uncertainties in the evaluation data by using GPP data derived from different flux decomposition methods (Sect. 4.5). The use of continuous GPP measurements, rather than experimentally disturbed measurements, makes it challenging to assess modelled GPP under extreme environmental conditions. We therefore make a further evaluation of simulated GPP during the course of soil moisture drought events (fLUE droughts; Sect. 4.3).

Theory
The theory underlying the P-model has been described by Wang et al. (2017a) and the derivation of equations is given therein. It is presented here again for completeness.

Balancing carbon and water costs
The P-model centres around a prediction for the optimal ratio of leaf-internal to ambient CO 2 concentration c i : c a (termed χ) that balances the costs associated with maintaining the transpiration stream and the cost of maintaining a given carboxylation capacity. The optimal balance is achieved when the two marginal costs are equal: Here, a and b are the respective unit costs. b is assumed to be constant, and a to scale linearly with the temperaturedependent viscosity of water η(T ), calculated following Huber et al. (2009). Below, we introduce β = b/a , with a = η * a and η * = η(T )/η(25 • C). The optimal χ solves the above equation. We use Fick's law (Fick, 1855) to express transpiration and assimilation as a function of stomatal con-ductance g s : and and use the RuBisCO-limited assimilation rate from the FvCB model: with where c i is given by c a χ. K is the effective Michaelis-Menten coefficient for RuBisCO-limited assimilation (Sect. B3), and * is the photorespiratory compensation point in the absence of dark respiration (Sect. B1). The optimal χ can be derived as (3) are divided by A, the solution is independent of whether the RuBisCO-limited rate A C or the lightlimited rate A J (see below) is followed. With this prediction for χ, we can use the coordination hypothesis (Chen et al., 1993;Haxeltine and Prentice, 1996;Maire et al., 2012) and the light-limited assimilation rate from the FvCB model to write I abs is the amount of absorbed light and ϕ 0 is the intrinsic quantum yield efficiency. This equation has the form of a LUE model (Eq. 2) in that A J scales linearly with I abs . Using Eqs. (9) and (8), m can be expressed directly as The unit cost ratio β has been estimated by Wang et al. (2017a) to 240 based on global leaf δ 13 C data and a simplified version of the P-model (assuming * = 0 and neglecting www.geosci-model-dev.net/13/1545/2020/ Geosci. Model Dev., 13, 1545-1581, 2020 the J max limitation). Here, we re-estimated β to 146 based on the full version of the model using the same global leaf δ 13 C dataset. This is more strictly consistent with the model formulation implemented here. Equation (12) provides the basis for predicting CO 2 assimilation rates in the form of a LUE model (Eq. 2) where LUE is a function of T and p (both affecting * , K, and η * ; see Sects. B1 and B3), D, and c a . The prediction of optimal χ has a number of corollaries (see Appendix C). An estimate for stomatal conductance (g s ) and the intrinsic water use efficiency (iWUE = A/g s ) directly follow from the optimal water-carbon balance (Eq. 3). By assuming A J = A C , we can further derive V cmax , as well as dark respiration (R d ), which is a function of V cmax (see Sects. C3 and C4).

Introducing J max limitation
Equation (10) assumes that the light response of A is linear up to the coordination point. In reality, rates saturate towards high light levels because the electron transport rate J , necessary for the regeneration of ribulose 1,5-bisphosphate (RuBP), tends towards a maximum J max . To account for this effect, Eq. (10) can be modified, following the formulation by Smith (1937), using a non-rectangular hyperbola relationship between A J and I abs to allow for the effect of finite J max : In this equation, A J is no longer linear with respect to I abs and thus does not have the form of a LUE model. However, J max is assumed here to acclimate on longer timescales to I abs , so that the marginal gain in assimilation A per unit change in J max is equal to the unit cost (c) of maintaining J max .
The unit cost c is assumed to include the maintenance of light-harvesting complexes and various proteins involved in the electron transport chain. The cost of maintaining a given J max is thus assumed to scale linearly with J max and that this proportionality is constant (c is constant). By taking the derivative of Eq. (13) with respect to J max and rearranging terms (see Appendix F2 for intermediate steps), we obtain the J max limitation factor L in Eq. (13) as with c * = 4c. Note that L is independent of I abs . Hence, A J is again a linear function of absorbed light. The cost factor c * is estimated from published values of J max : V cmax = 1.88 at 25 • C. (Kattge and Knorr, 2007) and χ = 0.8 (Lloyd and Farquhar, 1994) at c * = 0.41 (Wang et al., 2017a). The revised LUE model thus becomes Wang et al. (2017a) showed that this formulation of J max costs leads to a realistic dependence of the J max : V cmax ratio on growth temperature.
As shown by Smith et al. (2019), an alternative approach can be used to introduce the effects of J max limitation, replacing Eq. (13) by the more widely used one-parameter family of saturation curves following Farquhar and Wong (1984). This alternative is described in Appendix F3 and implemented as an optional method in the R package rpmodel.

The light use efficiency model
A is commonly expressed in mol m −2 s −1 . For further model description and evaluation, we refer to ecosystem-scale quantities in mass units of assimilated C and model GPP (g C m −2 d −1 ) following Eq. (2) with fAPAR · PPFD = I abs (18) Here, M C is the molar mass of carbon (12.0107 g mol −1 ) to convert from molar units to mass units, and PPFD is the photosynthetic photon flux density per square metre, integrated over a day (mol m −2 d −1 ). fAPAR is unitless and integrates across the canopy, i.e. from fluxes per unit leaf area to fluxes per unit ground area. LUE is in units of g C mol −1 . The intrinsic quantum yield parameter ϕ 0 is modelled as temperature dependent, and an additional (unitless) empirical soil moisture stress factor (β(θ )) is included for modelling LUE.

Temperature dependence of the intrinsic quantum yield of photosynthesis
The temperature dependence of the intrinsic quantum yield (ϕ 0 (T ), mol mol −1 ) is modelled following the temperature dependence of the maximum quantum yield of photosystem II in light-adapted leaves, determined by Bernacchi et al. (2003) as where a L is the leaf absorptance, and b L is the fraction of absorbed light that reaches photosystem II. The factor 1/4 Geosci. Model Dev., 13, 1545-1581, 2020 www.geosci-model-dev.net/13/1545/2020/ is introduced here as the equation given by Bernacchi et al. (2003) applies to electron transport rather than C assimilation. Here, (a L b L /4) is treated as a single calibratable parameter (see Sect. 3.3) and is henceforth referred to as c L ≡ a L b L /4. (All calibratable parameters are thereafter indicated by a hat over the symbol.) This temperature dependence was not accounted for in earlier P-model publications (Keenan et al., 2016;Wang et al., 2017a). To test the effect of this temperature dependence on simulated GPP, we conducted alternative simulations, where a constant ϕ 0 was calibrated instead (Sect. 3.2). Note that ϕ 0 includes the factor a L for incomplete leaf absorptance, which is commonly quantified separately from the quantum yield efficiency. In other vegetation models, a L is commonly ascribed a value of 0.72-0.88 (Rogers et al., 2017). Values of ϕ 0 used here are accordingly lower than values for the intrinsic quantum yield reported from experimental studies (Long et al., 1993;Singsaas et al., 2001). Furthermore, within-canopy reflection and reabsorption indicate that leaf-level absorptance is not equivalent to canopy-level absorptance; thus, ϕ 0 should be regarded as canopy-scale effective value of intrinsic quantum yield. It is treated here as a calibratable parameter, which may vary according to the fAPAR forcing dataset used.
3.1.2 Soil moisture stress β(θ ) is an empirical soil moisture stress function. We use results by Stocker et al. (2018) to fit this function based on two general patterns. First, the functional form of β(θ ) is approximated by a quadratic expression that approaches 1 for soil moisture at a certain threshold θ * and held constant at 1 for soil moisture values above this threshold. Here, θ is the plant-available soil water, expressed as a fraction of available water holding capacity, and θ * is set to 0.6. The general form is Second, the sensitivity of β(θ ) to extreme soil dryness (θ → 0) is related to the mean aridity, quantified as the mean annual ratio of actual over potential evapotranspiration (AET / PET) . The decline in β(θ ) with drying soils is steep in dry climates and less steep in less dry climates. In Eq. (21), the sensitivity parameter q is defined by the maximum β reduction at low soil moisture β 0 ≡ β(θ = θ 0 ), leading to q = (β 0 − 1)/(θ * − θ 0 ) 2 . Note that q has a negative value. β 0 is modelled as a linear function of the mean aridity: a θ and b θ are treated as calibratable parameters. Soil moisture (θ ), AET, and PET are simulated using the SPLASH model , which treats soil water storage as a single bucket and calculates potential evapotranspiration based on Priestley and Taylor (1972). The only difference with the model version described by Davis et al. (2017) is that we account here for a variable water holding capacity calculated based on soil texture and depth data from SoilGrids (Hengl et al., 2014). A detailed description of the applied empirical functions for calculating plant-available water holding capacity from texture data is given in Appendix D.

Site-scale simulations
We conducted multiple sets of site-scale simulations (Table 1) to investigate the dependence of model performance on alternative model setups (variable/fixed soil moisture and temperature effects), alternative choices of forcing data (fA-PAR), and alternative observational target data for calibration (GPP based on different flux decompositions). Parameters ( c L , a θ , and b θ ) were calibrated and evaluated against the appropriate observational data for each set of simulations separately.
The ORG setup is the P-model in its original form, as described in Wang et al. (2017a). It uses a fixed quantum efficiency of photosynthesis ( ϕ 0 is calibrated instead of c L ) and does not account for soil moisture stress (β(θ ) = 1). Here, the model is forced with fAPAR data based on MODIS FPAR (MCD15A3H), linearly interpolated 4 d values to daily values (see Sect. 3.4.1), and is calibrated against GPP data from FLUXNET 2015 based on the nighttime partitioning method (NT) (see Sect. 3.5.1). The simulation set BRC ("Bernacchi") is identical to ORG except that ϕ 0 is allowed to vary with temperature following Bernacchi et al. (2003) and Eq. (20), and c L is calibrated. The full P-model setup (FULL) includes the soil moisture stress function described above, and c L , a θ , and b θ are calibrated simultaneously.
All additional simulations account for both temperature and soil moisture effects. The simulation set FULL_FPARitp also uses MODIS FPAR data for fAPAR but applies a spline to get daily values. This is to evaluate alternative interpolation methods. The simulation set FULL_EVI uses MODIS EVI (MOD13Q1), interpolated to daily from 8 d data, to assess to which the degree the model performance depends on the fAPAR forcing data source. See Sect. 3.4.1 for more information.
All site-scale simulations were calibrated against GPP data (Sect. 3.3), calculated using the nighttime flux decomposition method (Reichstein et al., 2005). Additional simulation sets FULL_DT, FULL_NTsub, and FULL_Ty were used to investigate the dependence of model performance on the choice of observational data used for calibration. We used GPP data based on the nighttime decomposition method (Reichstein et al., 2005) for FULL_NTsub, the daytime decomposition method (Lasslop et al., 2010) for FULL_DT, and an alternative decomposition method, previously used in Wang et al. (2017a), for FULL_Ty. The Ty method estimates a con-stant monthly background respiration rate fitted to match net ecosystem exchange fluxes of CO 2 from measurements assuming a linear or saturating dependence of GPP on PPFD. Calibration and evaluation of FULL_DT, FULL_NTsub, and FULL_Ty are done only for sites and dates where observational data are available for all three datasets (DT, NT, and Ty); hence, there is the distinction between FULL_NTsub and FULL.

Global simulations
Global simulations were conducted for the FULL setup, using the respectively calibrated parameters from the site-scale simulations. All vegetation is assumed to follow the C 3 photosynthetic pathway and we do not distinguish between croplands and other vegetation. We conducted two simulations with alternative fAPAR forcing data. These are described in Sect. 3.4.

Model calibration
Calibration was performed only for the model parameters determining the quantum efficiency of photosynthesis ( ϕ 0 or c L , respectively) and the dependence of the sensitivity of the soil moisture stress function on average aridity (parameters a θ and b θ ). Simulated GPP was calibrated to minimise the root mean square error (RMSE) compared to observed daily GPP (Sect. 3.5). We used generalised simulated annealing from the GenSA R package (Xiang et al., 2013) to calibrate model parameters. This algorithm is particularly suited to find global minima of non-linear objective functions in situations where there can be a large number of local minima. To test the robustness of the calibration and evaluation metrics, we additionally performed out-of-sample calibrations for the FULL setup where the training set included data from all but one site. The test dataset, used to calculate R 2 and RMSE, contained only data from the single left-out site.

Forcing data
Unstressed light use efficiency, m in Eq. (19), is simulated using monthly mean values for daytime T and D; temporally constant site-specific elevation (used to calculate atmospheric pressure, scaled from sea-level standard pressure of 101 325 Pa); and annually varying observed atmospheric CO 2 (MacFarling Meure et al., 2006), identical across sites. The choice of aggregating to monthly mean values is motivated by the timescale of RuBisCO turnover, which limits the rate at which photosynthetic parameters can acclimate to changing environmental conditions (McNevin et al., 2006).
Predicted monthly LUE (m ) is multiplied by daily varying I abs , and response functions ϕ 0 (T ) and β(θ ), driven by daily varying temperature and soil moisture. This choice is motivated by the known rapid response in stomatal conductance to drying soils (represented by β(θ )) and the instantaneous temperature response of the quantum yield efficiency (ϕ 0 (T )). Simulating GPP as the product of LUE and daily varying PPFD would not be consistent with the non-linear instantaneous response of A to light (Eq. 10) given the acclimation timescale of photosynthesis (Suzuki et al., 2001;Maire et al., 2012). We therefore evaluate simulated GPP averaged over 8 d periods. The choice of appropriate model prediction and evaluation timescales is further discussed in Sect. 5.

fAPAR
For site-scale simulations, we used three alternative datasets as model forcing for fAPAR (MODIS FPAR splined, MODIS FPAR linearly interpolated, and MODIS EVI, splined; see Table 1). MODIS FPAR data are from the MCD15A3H Collection 6 dataset (Myneni et al., 2015), given at a resolution of 500 m and 4 d. The data were filtered to remove data points where clouds were present, values equal to 1.00, and outliers (more than 3 times the interquartile range). Filtered values were replaced by the mean value for the respective day of year. To obtain daily varying I abs (Eq. 18), two alternatives were explored. For the first (used in all setups except FULL_FPARitp), values were derived using a cubic smoothing spline (function smooth.spline() with parameter spar=0.01 in R; R Core Team, 2016). For the FULL_ FPARitp setup, daily fAPAR values were linearly interpolated to each day. MODIS EVI data are from the MOD13Q1 Collection 6 dataset (Didan, 2015), given at a resolution of 250 m and 8 d. These data were filtered based on the summary quality control flag, removing "cloudy" pixels. Gaps were filled and data were splined to daily values. All fAPAR data were downloaded from the Google Earth Engine using the google_earth_engine_subsets library (Hufkens, 2017).
For global-scale simulations, we used two alternative fAPAR datasets. "MODIS FPAR" is from globally gridded MODIS FPAR data at 0.5 • resolution derived at the Integrated Climate Data Center (ICDC; https://icdc.cen. uni-hamburg.de/1/daten/land/modis-lai-fpar.html, last access: 5 March 2020), based on the MOD15A2H MODIS Terra leaf area index/FPAR 8 d L4 global 500 m SIN grid V006 dataset (Myneni et al., 2020). For the present application, 8 d data are aggregated (mean) to monthly data. "fA-PAR3g" is based on Advanced Very High Resolution Radiometer (AVHRR) Global Inventory Modeling and Mapping Studies (GIMMS) FPAR3g v2 data (Zhu et al., 2013), 15 d, 1/12 • resolution and aggregated for the present application to 0.5 • and monthly data. For all global P-model simulations, fAPAR is held constant for each day in respective months. Simulations cover the years 2000-2016. Due to limited temporal coverage, January 2000 data are taken as February 2000 for simulations driven by MODIS FPAR.

Meteorological data
For site-scale simulations, the meteorological forcing data are derived from the FLUXNET 2015 Tier 1 dataset (daily), Geosci. Model Dev., 13, 1545-1581, 2020 www.geosci-model-dev.net/13/1545/2020/ Column ϕ 0 (T ) specifies whether the temperature dependence of intrinsic quantum yield is included. Column β(θ) specifies whether soil moisture stress is included. Columns ϕ 0 , c L , a θ , and b θ provide the calibrated parameter values in each simulation set. which provides data from measurements taken and processed along with the CO 2 flux measurements. The PPFD (mol m −2 d −1 ) is derived from shortwave downwelling radiation as PPFD = 60 × 60 × 24 × 10 −6 k EC R SW , where k EC = 2.04 µmol J −1 (Meek et al., 1984), and R SW is incoming shortwave radiation from daily FLUXNET 2015 data (variable name SW_IN_F, given in W m −2 ). The factor k EC accounts for the energy content of R SW and the fraction of photosynthetically active radiation in total shortwave radiation. Daytime vapour pressure deficit (VPD, or D in Sect. 2) is calculated from half-hourly FLUXNET 2015 data (variable name VPD_F) by averaging over time steps with positive insolation (SW_IN_F). Daytime air temperature is taken directly from the FLUXNET 2015 dataset (variable name T_F_DAY). This is a simplification, as we are not calculating leaf temperature or VPD at the leaf surface, which are more directly relevant for photosynthesis. For global-scale simulations, we use daily, 0.5 • meteorological forcing from WATCH-WFDEI (Weedon et al., 2014). We use mean daily 2 m air temperature; daily snow and rainfall; shortwave downwelling radiation converted to mol photons m −2 d −1 by multiplication with k EC ; and daily 2 m specific humidity (q air ), converted to VPD (D) as described in Appendix E. We used daily minimum and maximum air temperatures for each month from Climate Research Unit (CRU) time series (TS) 4.01 data (Harris et al., 2014) to calculate a respective VPD and use their mean as input to Pmodel simulations in order to reduce effects of the non-linear dependence of D on T (D = (D(T min ) + D(T max ) /2). All processes that depend on atmospheric pressure use Eq. (B10) and the 0.5 • resolution elevation map from WATCH-WFDEI (Weedon et al., 2014) to calculate a temporally constant atmospheric pressure for each grid cell.

Data for site-scale simulations
We used data from 59 sites for model calibration and 126 sites for evaluation ( Fig. 1 and Table A1). The number of valid daily GPP data points used for the calibration set was 162 158 and 241 084 for the evaluation set. The calibration sites were selected based on the apparent reliability of relationships between CO 2 fluxes, co-located greenness data, measured soil moisture, and meteorological variables, emerging from a previous analysis . For the evaluation, we used all sites except those classified as croplands or wetlands, and seven sites where C 4 vegetation is mentioned in the site description available through the FLUXNET2015 dataset (AU-How, DE-Kli, FR-Gri, IT-BCi, US-Ne1, US-Ne2, and US-Ne3).
GPP predictions by the P-model are compared to GPP estimates from the FLUXNET 2015 Tier 1 dataset (downloaded on 13 November 2016). We used GPP based on the nighttime partitioning method (Reichstein et al., 2005) (GPP_NT_VUT_REF) and removed data for which more than 50 % of the half-hourly data are gap filled and for which the daytime and nighttime partitioning methods (GPP_DT_VUT_REF and GPP_ NT_ VUT_ REF, respectively) are inconsistent, i.e. the upper and lower 2.5 % quan-tiles of the difference between GPP values quantified by each method. For additional simulation sets, model calibration and evaluation was performed using GPP data based on the daytime partitioning method (GPP_ DT_VUT_REF) (Lasslop et al., 2010) with analogous filtering steps, and GPP data based on an alternative method that fits a constant ecosystem respiration rate as the net ecosystem exchange under conditions where PPFD tends to zero (FULL_Ty; Wang et al., 2017a). For all calibration and evaluation, we removed data points before the "MODIS era" (before 18 February 2000).

Data for global-scale simulations
We compare the simulated spatial distribution of GPP from global-scale simulations against seven different remotesensing-data-driven GPP estimates with global coverage and two Sun-induced fluorescence (SiF) data products. The global GPP estimates are from the following models: MTE (Jung et al., 2011), FLUXCOM ("RS+METEO" setup) (Tramontana et al., 2016), MODIS GPP (MOD17A2H Collections 55 and 6) (Running et al., 2004;Zhao et al., 2005;Running et al., 2015), BESS (Jiang and Ryu, 2016), BEPS (He et al., 2018;Chen et al., 2016), and VPM (Zhang et al., 2017). A more detailed description of these models and aggregation to a common grid of 0.5 • and monthly resolution can be found in Luo et al. (2018). For SiF, we use data from Global Ozone Monitoring Experiment-2A (GOME-2A) and GOME-2B, based on v.2 (V27) 740 nm terrestrial chlorophyll fluorescence data from the MetOp-A and MetOp-B satellites (Joiner et al., 2013(Joiner et al., , 2016. Data were aggregated to monthly and 0.5 • resolution by mean, as further described in Luo et al. (2018).

Evaluation methods
We evaluated both simulated LUE and GPP. The P-model (Sect. 2) predicts variations in LUE across sites (space) and months (monthly LUE = m ), while simulated GPP is affected by the PPFD and fAPAR data used as model forcing (Eq. 19 and Sect. 3.4). Conversely, "observed" LUE is calculated as LUE obs = GPP obs /(fAPAR · PPFD), and the evaluation is thus also affected by the PPFD and fAPAR data. The evaluation of LUE tests the added explanatory power of the P-model compared to models that rely on fixed prescribed LUE values. Evaluating GPP facilitates the comparison of the model performance to similar models of terrestrial GPP. Model performance for GPP is benchmarked against a null model (NULL), which assumes a temporally constant and spatially uniform LUE. The LUE for the NULL model is fitted to observed GPP using linearly interpolated MODIS FPAR and GPP data from the NT method; see Table 1. Thus, while LUE is constant, the NULL model preserves the spatial and temporal patterns in APAR (= fAPAR · PPFD).  Table A1. The colour key of the map represents aridity, quantified as the ratio of precipitation over potential evapotranspiration from Greve et al. (2014).

Components of variability
For LUE, we separately analysed spatial (mean annual values by site) and monthly means only for the FULL setup. For GPP, we analysed spatial, annual, seasonal (mean by day of year), 8 d, and the variability in daily anomalies from the mean seasonal cycle for all setups. The seasonal variability was determined for different Köppen-Geiger climatic zones (see Table 2). Information about the association of sites with climatic zones was extracted from Falge et al. (2017). Evaluations were made only for climatic zones with at least three sites. For each component of variability, we calculated the adjusted coefficient of determination (R 2 adj ; hereafter referred to as R 2 ) and the RMSE. Figures showing correlations between simulated and observed values additionally present the mean bias, the slope of the linear regression model, and the number of data points (N).

Drought response
The bias in GPP (modelled minus observed) was calculated for 20 d before and 80 d after the onset of a drought event as identified by Stocker et al. (2018) for 36 sites. Drought events (fLUE droughts) are periods of consecutive days where soil moisture, separated from other drivers using neural networks, reduces LUE below a given threshold. The data specifying the timing and duration of drought events were downloaded from Zenodo (Stocker, 2018). We then rearranged the data to align all drought events at all sites, normalised data to their median value during the 10 d before the onset of droughts (normalisation by subtracting median), and computed quantiles per day, where "day" is defined with respect to the onset of each drought event.

Calibration results
The calibration of model parameters, done with data from all calibration sites simultaneously, yielded values that closely matched the means across parameter values derived from the out-of-sample calibrations (Fig. 2). This confirms the robustness of the calibration and that the model is not overfitted. Similarly for the evaluation metrics, the R 2 and RMSE values reported from evaluations against data from all evaluation sites pooled yielded values that closely match the means across the out-of-sample evaluations (each calculated with data from the single left-out site). This analysis also shows that the distribution of the evaluation metrics is skewed, with evaluations against a few sites indicating relatively poor performance (R 2 below 0.5 for ZM-Mon, AR-Vir, and FR-Pue), while the most frequent values indicate very good model performance (evaluations at 21 sites giving R 2 values of above 0.8). Because the out-of-sample calibrations are computationally very expensive, we performed this analysis only for one setup (FULL) and report evaluation metrics done with pooled data from all evaluation sites for the remainder of the analysis.

GPP variability across scales
Tables 3 and 4 provide an overview of model performance (R 2 and RMSE) in simulating GPP at different scales. The ORG setup captures 70 % of the variance in observed GPP with data aggregated to 8 d means (33 604 data points). Model performance both with respect to explained variance (R 2 ) and the RMSE is improved by including the effects of temperature on quantum yield efficiency in the BRC model setup (R 2 = 72 %), and by including the effects of soil moisture stress in the FULL model setup (R 2 = 75 %; Fig. 3). Both the BRC and FULL model setups outperform the NULL model.

Seasonal variations
Seasonal variations are generally reliably simulated (R 2 : 0.69-0.73 for P-model setups, and R 2 : 0.71 for the NULL model; Fig. 4). Also the NULL model captures most of the seasonal variability, especially in climate zones Dfb and Dfc, and Cfb and Cfa. This indicates that seasonal GPP variations are largely driven by seasonal changes in insolation (PPFD) and vegetation greenness (fAPAR). Accounting for temperature effects on the quantum yield efficiency reduces the overestimation of GPP in spring, except in the case of climate zone Dfb. Observed GPP increases are lagged compared to vegetation greenness, with a delay of up to 2 months at some sites. This lag is clearly visible at almost all sites in Dfb. The early-season high bias is largely absent for sites in climate zone Cfb, where observed GPP starts increasing early and the simulations match the observations except at sites CZ-wet, DE-Hai, and FR-Fon, where the start of season is simulated too early.
GPP is overestimated during the dry season in climate zones with a marked dry season (Aw, Bsk, Csa, and Csb) in model setups that do not account for soil moisture stress (ORG, BRC, NULL). The NULL model has the largest bias. High VPD during dry periods reduces simulated LUE and leads to lower GPP values and a smaller bias in all the Pmodel setups (ORG, BRC, FULL). The empirical soil moisture stress function applied in the FULL setup eliminates the dryness-related bias in zones Aw, Csa, and Csb and substantially reduces this bias for sites in zone Bsk. Observations suggest that GPP declines to values around zero during dry periods at sites in zone Bsk (mostly savannah vegetation and grasslands; see Table A1). The remaining bias in the FULL model, which includes the soil moisture stress function, is related to the fact that prescribed fAPAR remains relatively high and that the soil moisture stress function does not decline to zero.
The ORG and BRC models tend to underestimate peakseason GPP more strongly compared to the FULL model. This is a direct consequence of the calibration which balances errors across all data points. Across-site average peakseason maximum GPP is accurately captured by the FULL model in most zones (Fig. 4), except for an underestimation of GPP in zones Aw, Cfa, and Cfb, and an overestimation in zone Csa. Site-level evaluations suggests no clear relationship between peak-season underestimation and vegetation type in zone Cfb. The overestimation of peak-season GPP in zone Csa is caused by a high bias at sites with evergreen broadleaf vegetation (FR-Pue, IT-Cp2, IT-Cpz); sites with other vegetation types show no consistent peak-season bias.

Spatial and annual variations
The R 2 for simulated GPP, aggregated to annual totals, ranges from 0.60 (ORG) to 0.69 (FULL). The NULL model achieves an R 2 of 0.58. Most of the explanatory power of the different models for annual total GPP stems from their power in predicting between-site ("spatial") variations (Fig. 5). The R 2 for spatial variations ranges from 0.63 (ORG) to 0.69 (FULL), and 0.65 for the NULL model. In contrast, interannual variations at a site are poorly simulated (R 2 : 0.05-0.09 for P-model setups, and 0.03 for the NULL model). Interannual variations are generally much smaller than betweensite (spatial) variations or seasonal variations. Thus, capturing them is challenging. Interannual GPP variations are generally better simulated at sites where the variability is high and in particular at dry sites.

Drought response
The P-model setups that do not include the soil moisture stress function (ORG and BRC) systematically overestimate GPP during droughts (Fig. 6). This bias increases Table 2. Description of Köppen-Geiger climate zones and number of sites for which data are available per climate zone and hemisphere. Sites are classified based on Falge et al. (2017) and Beck et al. (2018). Only zones with data from more than three sites are shown.  sharply at the onset of drought events and continues to increase throughout the drought period. The bias is strongly reduced by applying the empirical soil moisture stress function (Eq. 21) in the FULL model. A small bias remains also in the FULL model. This stems from overestimated values at a few sites (in particular AU-DaP, US-Cop, US-SRG, US-SRM, US-Var, US-Whs, US-Wkg), mostly grasslands and sites in seasonally dry climate zones (Aw, Bsk, and Csa; see Fig. 4), where flux measurements indicate an almost complete shutdown of photosynthetic activity during the dry season. In contrast, the fAPAR data (MODIS FPAR) suggest values substantially greater than zero at these sites during these periods. This suggests either contributions to PAR absorption by photosynthetically inactive tissue, underestimation of LUE sensitivity to dry soils at these sites, or an over-estimation of the rooting zone moisture availability by the SPLASH model.

Uncertainty from fAPAR input data
Tests of the sensitivity of model performance to alternative fAPAR forcing datasets show that the difference between splined and linearly interpolated MODIS FPAR is small, with slightly better performance using splined fAPAR data. Model performance is generally better using MODIS FPAR compared to simulations using MODIS EVI. Spatial variations, in particular, are better captured using MODIS FPAR (Fig. 5  period in zone Bsk is larger when using MODIS EVI than when using MODIS FPAR (Fig. 7, right). The positive spring bias in simulated GPP in zone Dfb is present irrespective of the source of the fAPAR forcing (Fig. 7, left), as is the peakseason bias of GPP in zones Bsk, Cfb, and Csb (not shown). Differences between the EVI-and FPAR-forced simulations depend on vegetation type. The EVI-forced simulation tends to be biased low in evergreen needleleaf vegetation and has generally lower values in all evergreen vegetation types compared to the FPAR-forced simulation. However, there is no general difference in model bias between simulations made with the two forcings in other vegetation types.

GPP target data
The different flux decomposition methods make fundamentally different assumptions regarding the sensitivity of ecosystem respiration to diurnal changes in temperature. This should lead to systematic differences in derived observational GPP values and should affect model-data disagreement.
Model predictions compare better to GPP data based on the flux decomposition method Ty (Wang et al., 2017a) than for GPP data based on the DT and NT methods. For GPP 8 d means, the model achieves an R 2 of 0.68 when com-pared to GPP Ty (FULL_Ty model setup), as opposed to 0.64 and 0.66 compared to the DT and NT methods, respectively (FULL_DT and FULL_NTsub; Table 3, Fig. 8). Spatial and annual correlations are not evaluated for GPP Ty due to missing data. Evaluations presented here rely on dates for which neither the NT, DT, nor Ty methods have missing values and thus contain an equal number of data points. Metrics from the NT evaluation, repeated here, are not identical to the ones above and are referred to as "NTsub" in Tables 3 and 4.
We found a systematic low bias of simulated GPP in the peak-season in the climatic zone Cfb (warm temperate, fully humid, warm summer). However, as shown in Fig. 8, this bias does not seem to be affected by the choice of GPP evaluation data.

LUE
The FULL version of the P-model captures 32 % of the variability in mean annual LUE across all sites and across the full range of observed mean annual LUE values and vegetation types (Fig. 9). Overall, 48 % of the observed LUE variation within vegetation types is captured by the model through the relationships with climate, without the need to specify parameters for specific vegetation types.   Table 2). Only climate zones are shown here for which data from at least five sites were available.
Geosci. Model Dev., 13, 1545-1581, 2020 www.geosci-model-dev.net/13/1545/2020/ Overall, 31 % of the variability in monthly mean LUE is captured by the model, with data from all sites and years pooled (Fig. 9). The model overestimates monthly LUE values and underestimates LUE at the lowest and highest ends of the LUE range, respectively. The low-end overestimation is reflected by the overestimation of GPP in the spring at winter-cold sites (Sect. 4.2.1) and during soil moisture droughts (Sect. 4.3). The underestimation of high monthly values is not clearly linked to any particular vegetation type.

Global GPP
Simulated global total GPP is 106 Pg C yr −1 when using MODIS FPAR and 122 Pg C yr −1 when using fAPAR3g forcing data (mean over the years 2001-2011, FULL setup). The spatial pattern of simulated GPP differs substantially between simulations forced by MODIS FPAR and fAPAR3g (Fig. 10). This is most evident in their latitudinal distribution (Fig. 11). The global spatial pattern of fAPAR3g-based GPP simulated by the P-model generally matches the global distribution of the mean across other remote-sensing-based GPP models and lies within the range of their estimates for the latitudinal distribution. The MODIS FPAR-forced P-model simulation suggests lower values in the tropics that differ from the fAPAR3g-based estimates by a factor of ∼ 2 around the Equator. The moderate tropical GPP of the MODIS FPAR-based P-model simulation agrees well with the latitudinal distribution of SiF from GOME-2A and GOME-2B.

Discussion
The performance of the P-model can be compared to results obtained from other remote-sensing-driven GPP models (RS models). In its FULL setup, the P-model achieves an R 2 of 0.75 and a RMSE of 1.96 g C m −2 d −1 , in simulating . Unfortunately, we cannot present a direct comparison between these models based on data from identical dates and sites. A targeted model intercomparison may address this. While seasonal and spatial variations in GPP are reliably simulated by the P-model, the model's performance in simulating interannual GPP variations is weaker. Similar results regarding relatively poor model performance in explaining interannual variations have been found from previous studies in both empirical (Richardson et al., 2007;Urbanski et al., 2007b) and process model-based (Keenan et al., 2012;Biederman et al., 2016) analyses. This is likely due to lagged effects of climate anomalies expressed through biotic responses (Richardson et al., 2007;Keenan et al., 2012).
The P-model-based estimates of global GPP (106 Pg C yr −1 when using MODIS FPAR and 122 Pg C yr −1 when using fAPAR3g forcing data, mean over 2001-2011, FULL setup) are within the range of other estimates of global GPP (also means over [2001][2002][2003][2004][2005][2006][2007][2008][2009][2010][2011]: 133 Pg C yr −1 for MTE (Jung et al., 2011), 130 Pg C yr −1 for FLUXCOM (Tramontana et al., 2016), 112 Pg C yr −1 for MODIS-55 GPP and 105 Pg C yr −1 for MODIS-6 GPP (Running et al., 2004;Zhao et al., 2005), 133 Pg C yr −1 for BESS (Jiang and Ryu, 2016;Ryu et al., 2011), 121 Pg C yr −1 for BEPS Chen et al., 2016), and 135 Pg C yr −1 for VPM (Zhang et al., 2017). The P-model results presented here are based on simulations that embody relatively strong simplifying assumptions. In particular, we assumed all vegetation to follow the C 3 photosynthetic pathway and we made no distinction between croplands and other vegetation, although crops are often more productive (Guanter et al., 2014). Due to the short period for which forcing data and outputs from comparable models are available, we did not analyse temporal trends in global GPP here. Analyses not shown here indicate that the introduction of the J max cost factor (not included, e.g. in Keenan et al., 2016) increases the sensitivity of modelled GPP to CO 2 . Further evaluation of model behaviour against data from CO 2 manipulation experiments will be necessary before applying the model to simulate CO 2 -related trends.
The large spread of tropical GPP estimates is striking. The highest estimate among the other GPP models we used for evaluation here -coming from BESS -is more than 50 % higher than MODIS GPP from Collection 6. The fAPAR3gbased P-model tropical GPP estimate falls within the range of other GPP models, while the MODIS FPAR-based estimate is lower than all other models. However, the latter's comparably low tropical GPP agrees well with the latitudinal distribution of SiF (Fig. 11). However, large changes in leaf area index across latitudes, combined with a dependency of the SiF signal on vegetation structure (Zeng et al., 2019) may undermine the validity of SiF as a benchmark for the latitudinal GPP distribution. A lack of evaluation data from eddy covariance measurements in dense tropical forests precludes us from drawing conclusions on the accurateness of these diverging tropical GPP estimates.
With a particular focus on soil moisture effects, Stocker et al. (2019) presented global GPP based on the P-model, corresponding to a setup with the soil moisture stress function but without the temperature dependence of the quantum yield efficiency. They also used a different parameterisation with ϕ 0 = 0.0579, a θ = 0.107, and b θ = 0.478 for their intermediate model version. Their estimate for global GPP was around 130 Pg C yr −1 for recent years.
The coefficients of determination (R 2 ) of simulated versus observed values are lower for LUE (0.32 for the spatial correlation in the FULL setup, Fig. 9b) than for GPP (0.69 for the spatial correlation in the FULL setup). This is because GPP variations are strongly driven by variations in absorbed light (PPFD·fAPAR), which are observed and used for modelling. In contrast, variations in LUE cannot be observed directly. Using remotely sensed information for estimating LUE variations, e.g. based on SiF (Frankenberg et al., 2018;Li et al., 2018;Ryu et al., 2019) Table 2). Climate zones shown here are illustrative examples.  Other remote-sensing-based GPP models rely on vegetationtype-specific model parameters for LUE (Zhang et al., 2017;Running et al., 2004;Jiang and Ryu, 2016). The P-model in its FULL setup explains 48 % of the variations in LUE across sites aggregated to vegetation types without relying on vegetation or biome-type specific parameterisations. In its ORG setup, it explains 12 % of the variations (not shown) and 51 % of the variations when excluding sites classified as "open shrublands", which tend have a substantially lower LUE than simulated by the P-model (not shown). In spite of this substantial portion of explained variability, the NULL model with its temporally constant and spatially uniform LUE achieves higher R 2 values for GPP than the ORG Pmodel setup at the spatial, annual, and seasonal scales (Table 3). This indicates that the spatial and temporal variations in absorbed light are the main drivers of GPP in LUE-type models and underlines the importance of evaluation against a NULL model benchmark. Taken together, these findings demonstrate that the P-model offers a simple but powerful method for simulating terrestrial GPP using readily available input datasets and a very small number of free (calibratable) parameters. Here, three parameters are calibrated (for the FULL setup). Other model parameters are derived from independent field and laboratory measurements. Accounting for the temperature dependence of the quantum yield efficiency (ϕ 0 ) clearly improves model predictions.
The parameter ϕ 0 is commonly treated as a constant in global vegetation models (Rogers et al., 2017). Our results indicate potential for improving such models' photosynthesis routines by accounting for the temperature dependence of ϕ 0 . ϕ 0 appears as a linear scalar in the LUE model. However, the magnitude of this scalar is uncertain and depends on whether incomplete light absorption by the leaf is included in the definition of ϕ 0 or in fAPAR data. We have used MODIS FPAR and MODIS EVI data to define fAPAR in different model setups. While the two are well correlated, their absolute values differ. Hence, we have calibrated an apparent quantum yield efficiency ( ϕ 0 ) to GPP data separately for different fAPAR datasets, thereby implicitly distinguishing what components of light absorption factors are contained in the fAPAR data. The leaf absorptance, a L , which is typically taken to be around 0.8 in global vegetation models (Rogers et al., 2017), is similar to the ratio of fitted ϕ 0 values for simulation FULL and FULL_EVI, here calculated as 0.67 ( Table 1). An improvement in model performance is obtained by accounting for soil moisture stress using an empirical function. However, the use of an empirical function masks underlying processes. Furthermore, the use of an empirical function is not consistent with the optimality approach that underlies the P-model. The bias reduction associated with using an empirical soil moisture stress function hints at missing factors in the theoretical approach which rests on an assumed constancy of the unit costs of transpiration (a in Eq. 3). Prentice et al. (2014) provide a definition of a that is explicit in terms of plant hydraulic traits and physical properties that deter-Geosci. Model Dev., 13, 1545-1581, 2020 www.geosci-model-dev.net/13/1545/2020/  (Running et al., 2004;Zhao et al., 2005), BESS (Jiang and Ryu, 2016), BEPS (He et al., 2018;Chen et al., 2016), and VPM (Zhang et al., 2017). P-model results are from simulations with the FULL setup and calibrated parameters as given in Table 1. mine water transport along the plant-soil-atmosphere continuum. In particular, a ∝ ( k s ) −1 , where is the maximum daytime difference in leaf-to-soil water potential and k s is the sapwood area-specific permeability. However, large variations in stomatal conductance are known to occur in response to relatively fast soil dry-downs (timescale of days) (Keenan et al., 2010;Egea et al., 2011;Stocker et al., 2018). This suggests a potential to improve the P-model by allowing the unit cost of transpiration to be a function of rooting-zone moisture availability and by coupling stomatal conductance with the soil water balance.
Observational uncertainty could affect both parameter calibration and model evaluation. Keenan et al. (2019) found a systematic bias in GPP estimates based on the nighttime partitioning method due to inhibition of leaf respiration in the light (Kok, 1949;Wehr et al., 2016), which affects fluxes unevenly throughout the season and across vegetation types. However, we found no clear difference in model-data agreement, nor in fitted parameters, in comparisons of three alternative GPP datasets that use different approaches to decompose net CO 2 exchange fluxes from eddy covariance measurements into ecosystem respiration and GPP terms.
We have found a consistent early-season high bias in simulated GPP for numerous sites in regions with deciduous broadleaved vegetation in temperate and cold climates (in particular US-MMS, IT-Col, US-WCr, US-UMd, US-UMB, and US-Ha1), and also in mixed and needleleaf stands (in particular US-Syv, US-NR1, FI-Hyy, CA-Qfo, and CA-Man). The temperature dependence of the intrinsic quantum yield, as introduced in the BRC and FULL setups, did not resolve this bias. Additional analyses (not shown) suggested that this bias is not related to soil temperatures. The P-model, as applied here, uses daily air temperature for simulating temperature stress on the intrinsic quantum yield in the BRC and FULL setups. A reduction in the quantum yield efficiency arises from several mechanisms, includ- ing increased non-photochemical quenching, a reduction in chlorophyll, and absorption by screening pigments (Huner et al., 1993;Oquist and Huner, 2003;Ensminger et al., 2004;Adams et al., 2004;Verhoeven, 2014). These adaptations serve to limit oxidative damage under high light and low temperature conditions, where an imbalance between electron supply and demand exists, arising from an imbalance between temperature-insensitive photochemical rates and temperature-sensitive biochemical rates. The reversion of these adaptations and resumption of the intrinsic quantum yield efficiency and photosynthesis requires sustained temperatures above a certain critical threshold (Tanja et al., 2003) and exhibits a delay with respect to instantaneous air temperatures (Pelkonen and Hari, 1980;Mäkelä et al., 2004). Approaches accounting for a delayed resumption of photosynthesis after cold periods offer scope for further improvement of the P-model and may be included in global vegetation and Earth system models where this effect is currently not accounted for (Tanja et al., 2003;Rogers et al., 2017).
There is a positive bias in simulated GPP during the dry season at a number of sites where the vegetation phenology is influenced by drought. The positive bias is related to the combination of using prescribed fAPAR data, which shows substantial absorption by non-green vegetation, and insufficient sensitivity of simulated LUE to soil drying. However, GPP is accurately simulated at other sites affected by seasonally recurring water stress. The modelled sensitivity to dry soils is determined by the soil moisture stress function, which depends on the mean aridity of the site as estimated using a fixed depth soil moisture "bucket". Accounting for variability in rooting zone depth, which may also be influenced by local topographical factors and access to groundwater (Fan et al., 2013 may help to minimise model biases in drought-prone areas. The current implementation of the P-model involves some simplifications in terms of climate drivers by using average daily meteorological conditions, measured above the canopy, as input. Optimality in balancing carbon and water costs for average daily conditions is not necessarily equivalent to optimality in balancing integrated water and carbon costs over the diurnal cycle. Large variations in ambient conditions over a diurnal cycle, combined with a non-linear dependence of costs on these conditions suggest that the approach of taking average daily conditions may be an oversimplification. Nevertheless, prior evaluations have shown robust and accurate predictions of optimal χ across a range conditions (Wang et al., 2017a). Using above-canopy VPD values instead of VPD at the leaf surface for scaling water losses implicitly assumes a perfectly coupled atmospheric boundary layer. Using above-canopy air temperature instead of leaf temperatures introduces a bias when the two become decoupled (Michaletz et al., 2015). The impact of these simplifications may be minor but should be evaluated.
A further simplification is that investment in electron transport capacity (expressed by J max ) and investments in the carboxylation capacity (expressed by V cmax ) are coordinated so that for conditions with which the model is forced (here, monthly means of daily averages), photosynthesis operates at the co-limitation point of the light-and RuBisCO-limited assimilation rates, and an effective linear relationship between absorbed light and mean assimilation emerges. This assumption follows from the coordination hypothesis, which itself can be understood as an optimality principle (Haxeltine and Prentice, 1996;Maire et al., 2012) and is well supported by observations (Maire et al., 2012). However, this coordination is contingent on the timescale at which photosynthetic acclimation occurs, which is not known precisely (Smith and Dukes, 2013;Way and Yamori, 2014). By simulating χ using monthly mean meteorological variables, we assume a Geosci. Model Dev., 13, 1545-1581, 2020 www.geosci-model-dev.net/13/1545/2020/ monthly timescale of acclimation. This is probably a conservative estimate (Smith and Dukes, 2017;Veres and Williams III, 1984). Considering the concave relationship of assimilation rates and absorbed light that follows from the FvCB model for a given J max , linearly scaling a given monthly LUE term with daily varying absorbed light levels should lead to an overestimation of assimilation rates at high light levels. This overestimation should disappear as the timescale over which light levels are averaged is increased. However, our results do not confirm these expectations (Fig. 12). The fact that the model did not exhibit a systematic error in simulating GPP variations when applied at the daily timescale is probably due to the fact that day-to-day variability in light levels is relatively small compared to the within-day variability and the non-linearity between A and daily varying light levels does not play an important role.

Conclusions
The P-model provides a simple, parameter-sparse but powerful method to predict photosynthetic capacity and light use efficiency across a wide range of climatic conditions and vegetation types. It provides a basis for a terrestrial light use efficiency model driven by remotely sensed vegetation greenness. Using optimality principles for the formulation of the P-model reduces its dependence on uncertain or vegetation-type-specific parameters and enables robust predictions of GPP and its variations through the seasons, between years, and across space. Further work is required to develop a distinct treatment of C 4 vegetation for global applications and additional evaluations are needed to examine the P-model's sensitivity to increasing CO 2 . We have shown that accounting for the effects of low soil moisture and the reduction in the quantum yield efficiency under low temperatures improves model performance. There is potential to include below-ground water limitation effects in the mechanistic optimality framework of the P-model.
Code and data availability. The P-model is implemented as an R package (rpmodel) and available through CRAN and Zenodo (Stocker, 2019a). Results shown here correspond to rpmodel version v1.0.4. A documentation of the R package is available under https://stineb.github.io/rpmodel/ (last access: 5 February 2020). Both site-scale and global simulations shown here are done with the Fortran implementation of the P-model within the SOFUN modelling framework (version v1.2.0), available on Zenodo (Stocker, 2019b). Site-scale forcing data ingest and filtering, model calibration, and evaluation were done using the R package rsofun (version v1.0.wrap_sofun), available on Zenodo (Stocker, 2020b). Scripts that implement the workflow (repository eval_pmodel version v2) are available on Zenodo (Stocker, 2020a). Model outputs are available on Zenodo (Stocker, 2019c). Table A1 provides metadata information and references for each site from the FLUXNET2015 Tier 1 dataset, used for model calibration and evaluation in the present study.     The temperature-and pressure-dependent photorespiratory compensation point in absence of dark respiration * (T , p) is calculated from its value at standard temperature (T 0 = 25 • C) and atmospheric pressure (p 0 = 101 325 Pa), referred to as * 25,p 0 . It is modified by temperature following an Arrhenius-type temperature response function f Arrh (T K , H * ) with activation energy H * , and is corrected for atmospheric pressure p(z) at elevation z.

Appendix A: Site information
Values of H * and * 25,p 0 are taken from Bernacchi et al. (2001). The latter is converted to Pa and standardised to p 0 by multiplication with p 0 ( * 25,p 0 = 42.75 µmol mol −1 × 10 −6 × 101 325 Pa = 4.332 Pa). H * is 37 830 J mol −1 . All parameter values are summarised in Table A2. The function p(z) is defined in Sect. B4. Note that T K indicates that the respective temperature value is given in Kelvin and T K,0 = 298.15 K.
To correct for effects by temperature following the Arrhenius equation with its form x(T K ) = exp(c − H a /(T K R)), the temperature-correction function f Arrh (T K , H a ), used in Eq. (B1) and further equations below, is given by where H is the respective activation energy (e.g. H * in Eq. B1), and R is the universal gas constant (8.3145 J mol −1 K −1 ).

B2 Deriving *
The temperature and pressure dependency of * follows from the temperature dependencies of K c , K o , V c,max , and V o,max and the pressure dependency of pO 2 (p): pO 2 (p) is the partial pressure of atmospheric oxygen (Pa) and scales linearly with p(z). K c is the Michaelis-Menten constant for carboxylation (Pa); K o is the Michaelis-Menten constant for oxygenation (Pa); V cmax is maximum rate of carboxylation (µmol m −2 s −1 ); and V omax is the maximum rate of oxygenation (µmol m −2 s −1 ). The temperaturedependency equations for these four terms are given in Table 1 of Bernacchi et al. (2001) with respective scaling con-stants c and activation energies H a as K c (T K ) = exp(38.05 − 79.43/(T K R)) (B4a) K o (T K ) = 1000 × exp(20.30 − 36.38/(T K R)) (B4b) V o,max (T K ) = exp(22.98 − 60.11/(T K R)) (B4c) V c,max (T K ) = exp(26.35 − 65.33/(T K R)). (B4d) By substituting the temperature-dependency equations for each term in Eq. (B3) and rearranging terms, * can be written as * (T K , z) = pO 2 (z) exp(6.779 − 37.83/(T K R)) .
With pO 2 (p) at standard atmospheric pressure (101 325 Pa) taken to be 21 000 Pa, and assuming a constant mixing ratio across the troposphere, its pressure dependence can be expressed as hence, * (T K , p) = p(z) exp(5.205 − 37.83/(T K R)).
We can use this to calculate * at standard temperature (T K = 298.15 K) and pressure (p(z) = 101 325 Pa) as * 25,p 0 = 4.332 Pa. Note that to convert Eq. (B5) to the form corresponding to the one given by Bernacchi et al. (2001), the partial pressure of oxygen (pO 2 ) has to be assumed at standard conditions. pO 2 is approximately 21 000 Pa and with the standard atmospheric pressure of 101 325 Pa, pO 2 can be converted from Pascals to parts per million (ppm) as 21 000/10 1325 × 10 6 = 207 254 ppm = exp(12.24) ppm. This can be combined with the exponent in Eq. (B5) to exp(12.24) · exp(6.779) = exp(19.02). This corresponds to the parameter values determining the temperature dependence of * given by Bernacchi et al. (2001) as * = exp(19.02 − 37.83/(T K R)).

B3 Michaelis-Menten coefficient of photosynthesis
The effective Michaelis-Menten coefficient K (Pa) of RuBisCO-limited photosynthesis (Eq. 6) is determined by the Michaelis-Menten constants for the carboxylation and oxygenation reactions (Farquhar et al., 1980): where K c is the Michaelis-Menten constant for CO 2 (Pa), K o is the Michaelis-Menten constant for the carboxylation and oxygenation reaction, respectively, and pO 2 is the partial pressure of oxygen (Pa). K c and K o follow a temperature dependence, given by the Arrhenius equation analogously to the temperature dependence of * (Eq. B1):  Table A2). The latter two have been converted from µmol mol −1 in Bernacchi et al. (2001) to units of Pa by multiplication with the standard atmosphere (101 325 Pa). Note that K c25 and K o25 are rate constants and are independent of atmospheric pressure. Pressure dependence of K is solely in pO 2 (p) (see Eq. B6).

B4 Atmospheric pressure
The elevation dependence of atmospheric pressure is computed by assuming a linear decrease in temperature with elevation and a mean adiabatic lapse rate (Berberan-Santos et al., 1997): where z is the elevation above mean sea level (m), g is the gravitational constant (9.80665 m s −2 ), p 0 is the standard atmospheric pressure at 0 m a.s.l. (101 325 Pa), L is the mean adiabatic lapse rate (0.0065 K m −2 ), M a is the molecular weight for dry air (0.028963 kg mol −1 ), and R is the universal gas constant (8.3145 J mol −1 K −1 ). All parameter values that are held fixed in the model (not calibrated) are summarised in Table A2.
Appendix C: Corollary of the χ prediction

C1 Stomatal conductance
Stomatal conductance g s (mol C Pa −1 ) follows from the prediction of χ given by Eq. (8) and g s = A/(c a (1 − χ )) (from Eq. 5). Stomatal conductance can thus be written as This has a similar form as the solution for g s derived from a different optimality principle by Medlyn et al. (2011) (their Eq. 11). Differences are that an additional term g 0 is missing here and that * does not appear in Medlyn et al. (2011). The theory presented by Prentice et al. (2014) provides a theoretical interpretation for the parameter g 1 in Medlyn et al. (2011): it is given by ξ (Eq. 9) and can thus be predicted from the environment. However, it is notable that the underlying optimality criterion used by Medlyn et al. (2011), as proposed by Cowan and Farquhar (1977), is one that maintains a constant marginal water cost of carbon gain λ = ∂E/∂A. It thus describes an instantaneous g s adjustment, e.g. to diurnal variations in D and has been adopted into DVMs and ESMs for respective predictions (with a given V cmax ). In contrast, the theory presented here and underlying the P-model predicts χ which is jointly controlled by g s and V cmax . In other words, it predicts a g s that is coordinated with V cmax and thus acclimates at a similar timescale (which is on the order of days to weeks). This χ can be understood as a "set point" for an average χ with actual χ varying around it at a daily to subdaily timescale.

C2 Intrinsic water use efficiency
The intrinsic water use efficiency (iWUE, in Pa) has been defined as the ratio of assimilation over stomatal conductance (to water) (Beer et al., 2009) as iWUE = A/(1.6g s ). The factor 1.6 accounts for the difference in diffusivity between CO 2 and H 2 O. Using Fick's law (Eq. 5), this is simply or, using the prediction of optimal χ given by Eq. (8), this can be expressed as iWUE = 1

C3 Maximum carboxylation capacity
With A J = A C , V cmax can directly be derived as c i is given by c a χ. The second part of the equation follows from the definitions of m (Eq. 11) and m C (Eq. 7). Normalising V cmax to standard temperature (25 • C) following a modified Arrhenius function based on Kattge and Knorr (2007) gives V cmax25 as where H V is the activation energy (71 513 J mol −1 ), H d is the deactivation energy (200 000 J mol −1 ), and S is an entropy term (J mol −1 K −1 ) calculated using a linear relationship with T from Kattge and Knorr (2007), with a slope of b S = 1.07 J mol −1 K −2 and intercept of a S = 668.39 J mol −1 K −1 : Note that T is in units of • C in the above equation. Equation (C6) describes the instantaneous response to temperature and is not the same as the optimality-driven acclimation to temperature predicted by the P-model.

C4 Dark respiration R d
Dark respiration at standard temperature R d25 is calculated as being proportional to V cmax25 : where b 0 = 0.015 (Atkin et al., 2015). Dark respiration follows a slightly different instantaneous temperature sensitivity than V cmax following Heskel et al. (2016): By combining Eqs. (C6), (C8), and (C9), R d at growth temperature T can directly be calculated from V cmax as Appendix D: Soil water holding capacity The soil water balance is solved following the SPLASH model but with the total soil water holding capacity per unit ground area (θ WHC , in mm) calculated as a function of the soil texture. Precipitation in the form of rain (P rain ) and snow (P snow ) are taken from WATCH-WFDEI (Weedon et al., 2014) and are summed and converted from kg m −2 s −1 to mm d −1 by multiplication of (60 × 60 × 24) s d −1 . To obtain θ WHC , we use soil depth to bedrock and texture data from SoilGrids (Hengl et al., 2014), extracted around the FLUXNET sites. We assumed that the plant-available WHC is determined by the WHC down to a maximum depth of 2 m and is limited by the depth to bedrock. The water holding capacity (w WHC , in mm) was defined as the difference in volumetric soil water storage at field capacity (W FC , in m 3 m −3 ) and the permanent wilting point (W PWP , in m 3 m −3 ): · min(z bedrock , z max ). (D1) f gravel is the gravel fraction, z bedrock is the depth to bedrock (in mm), and z max is 2000 mm. The volumetric soil water storage at field capacity and wilting point were derived from texture and organic matter content data through pedotransfer functions, as described by Saxton and Rawls (2006). W FC is calculated as where k FC = −0.251 · f sand + 0.195 · f clay + 0.011 · f OM (D3) f sand , f clay , f OM are the sand, clay, and organic matter contents in percent by weight. W PWP is calculated as W PWP = k PWP + (0.14 · k PWP − 0.02) , where k PWP = −0.024 · f sand + 0.487 · f clay + 0.006 · f OM (D9) and p(z) is atmospheric pressure, taken here as a constant function of elevation z (Sect. B4); w air is the mass mixing ratio of water vapour to dry air (dimensionless) and derived from specific humidity as w air = q air /(1−q air ). R d and R v are the specific gas constants of dry air and water vapour, respectively, and are given by R/M d and R/M v , respectively, where R is the universal gas constant (8.314 J mol −1 K −1 ) and M d (28.963 g mol −1 ) and M v (18.02 g mol −1 ) are the molecular mass of dry air of water vapour, respectively. T is air temperature in • C.
Appendix F: Extended theory F1 Deriving χ Using Eqs. (4) and (5), the term on the left-hand side of Eq.
(3) can thus be written as Using Eq. (6) and the simplification * = 0, the derivative term on the right-hand side of Eq. (3) can be written as Equation (3) can thus be written as Geosci. Model Dev., 13, 1545-1581, 2020 www.geosci-model-dev.net/13/1545/2020/ and solved for χ: where b/a = β/η * . The exact solution, without the simplification * = 0, and following analogous steps, is This can also be written as F2 Deriving the J max limitation factor By taking the derivative of A J with respect to J max , Eq. (14) can be expressed as With this, we can write 1 1 + k −2 = u .
This can be rearranged to The right-hand term now corresponds to the J max limitation factor L in Eq. (13), and we get Eq. (15).
To sum up, the P-model calculates GPP as GPP = I abs ϕ 0 (T ) β(θ ) m M C , . (F17) I abs is the absorbed light (taken as fAPAR×PPFD, mol m −2 ), ϕ 0 (T ) is the temperature-dependent intrinsic quantum yield, β(θ ) is the soil moisture stress factor, and M C is the molar mass of carbon (g mol −1 ).

F3 An alternative method for introducing the J max limitation
Section 2.2 introduced the effect of a finite J max leading to a saturating relationship between absorbed light and the lightlimited assimilation rate, A J . An alternative method was presented by Smith et al. (2019) and is implemented in rpmodel as an optional method (argument method_jmaxlim = "smith19"). Following their approach, the light-limited assimilation rate is described as m is the CO 2 limitation factor (Eq. 11), and J is a saturating function of absorbed light, approaching J max for high light levels, following Farquhar et al. (1980): θ J 2 − (ϕ 0 I abs + J max ) J + ϕ 0 I abs J max = 0 .
θ is a unitless parameter determining the curvature of the response of J to I abs , here taken as 0.85, based on Smith et al. ϕ 0 I abs + J max ± (ϕ 0 I abs + J max ) 2 − 4θ ϕ 0 I abs J max 2θ , (F20) from which the smaller root is used to derive A J . Similar as in the method used by Wang et al. (2017a) and outlined in Sect. 2.2, a proportionality between A J and J max is assumed (∂A/∂J max = c; Eq. 14). Taking the derivative of Eq. (F20) with respect to J max and setting equal to c leads to Using this, A J can be written analogously to Eq. (16) but with and The cost parameter c was assumed to be non-varying. Under standard conditions of 25 • C, 101 325 Pa atmospheric pressure, 1000 Pa vapour pressure deficit, and 360 ppm CO 2 , at which the ratio of J max to V cmax was assumed to be 2.07 (Smith and Dukes, 2017), c was derived as 0.053 (Smith et al., 2019).
Using the definition of V cmax from Eq. (C4), m can be replaced by m from Eq. (F23) to calculate an "intermediate rate of V cmax " (Smith et al., 2019) as Appendix G: The rpmodel() function of the rpmodel R package The rpmodel R package provides an implementation of the Pmodel as described here. The main function is rpmodel(), which returns a list of variables that are mutually consistent within the theory of the P-model (Sect. 2) and based on calculations defined in this paper. References for the returned list of variables are given in Table A3.
Author contributions. BDS designed the study, wrote the model code, conducted the analysis, and wrote the paper. HW developed the model and wrote the initial version of the model description. NGS developed the model and implemented model code. SPH contributed to designing the study and writing the manuscript. TK contributed to the study design, model implementation, and manuscript writing. DS implemented the water holding capacity model. TD wrote an initial version of the model code and model documentation. ICP developed the model and contributed to designing the study.
Competing interests. The authors declare that they have no conflict of interest.
Acknowledgements. This work contributes to the AXA Chair Programme in Biosphere and Climate Impacts and the Imperial College initiative on Grand Challenges in Ecosystems and the Environment. Review statement. This paper was edited by Jatin Kala and reviewed by two anonymous referees.