ORCHIDEE-MICT-BIOENERGY: an attempt to represent the production of lignocellulosic crops for bioenergy in a global vegetation model

. Bioenergy crop cultivation for lignocellulosic biomass is increasingly important for future climate mitigation, and it is assumed on large scales in Integrated Assessment Models (IAMs) that develop future land use change scenarios consistent with the dual constraint of sufficient food production and deep de-carbonization for low climate warming targets. In most global vegetation models, there is no specific representation of crops producing lignocellulosic biomass, resulting in simulation biases of biomass yields and other carbon outputs, and in turn of future bioenergy production. Here, we 15 introduced four new plant functional types (PFTs) to represent four major lignocellulosic bioenergy crops, eucalypt, poplar and willow, Miscanthu s, and switchgrass, in the global process-based vegetation model, ORCHIDEE. New parameterizations of photosynthesis, carbon allocation and phenology are proposed based on a compilation of field measurements. A specific harvest module is further added to the model to simulate the rotation of bioenergy tree PFTs based on their age dynamics. The resulting ORCHIDEE-MICT-BIOENERGY model is applied at 296 locations where field 20 measurements of harvested biomass are available for different bioenergy crops. The new model can generally reproduce the global bioenergy crop yield observations. Biases of the model results related to grid-based simulations versus the point-scale measurements and the lack of fertilization and fertilization management practices in the model are discussed.

separation of different forest age cohorts allows a proper bookkeeping of different ages of rotation forests and tracking individually their carbon stock dynamics and areal cohorts. In addition, we aimed to introduce additional herbaceous bioenergy crops like Miscanthus and switchgrass as well as woody crops like eucalypt, willow and poplar in a more systematic way on the global scale (not only poplar for Europe as in ORCHIDEE-SRC). We will add sentences to explain the relationship with the version by De Groote et al. (2015) on P3L28: "There is another ORCHIDEE version including short rotation coppice poplar plantations (ORCHIDEE-SRC, De Groote et al., 2015) based on the forest management module (Bellassen et al., 2010), but ORCHIDEE-SRC is more designed for studying specific coppicing processes and is evaluated using only two coppicing sites in Belgium. Although detailed forest management processes are not included in ORCHIDEE-MICT, this version includes explicit gross land use changes, i.e., the rotational transitions from other vegetation types to woody bioenergy crops and periodic clear-cut harvest of forests. These features are important to study the carbon emissions from bioenergy crop when their areas expand by converting other land use types in future BECCS scenarios. In addition, ORCHIDEE-MICT contains a bookkeeping system to track different forest age classes as separate land cohorts at a sub-grid scale (Yue et al., 2018). This functionality allows simulating the woody harvest based on rotation length tracking individually the carbon stock dynamics of different age classes of forests. In addition to the poplar plantation in Europe in ORCHIDEE-SRC (De Groote et al., 2015), we aimed to include herbaceous bioenergy crops like Miscanthus and switchgrass as well as other woody crops like eucalypt and willow in a more systematic way on the global scale." in the model. Therefore, not accounting for the stump growth does not necessarily lead to a deceleration of biomass production. Otherwise, we may not be able to validate the yields.
We are fully aware that the fate of belowground biomass after harvest is important to derive the full carbon cycle. We have a reserve carbon pool for leaf onset in ORCHIDEE (Krinner et al., 2005), and a simple approach to account for the stump is to leave some carbon in this reserve pool after harvest. We will implement this feature and evaluate the belowground and soil carbon in the next step of development, also after collecting new observation data for belowground and soil carbon of different bioenergy crops.

Comment #11
page 8, line 21: What is the reason for harvesting in winter at lower biomass harvest? Is that really nutrient recycling? I would assume that you can add nutrients in a managed system.

Response #11
Yes, it is due to the nutrient recycling and drying. It could be harvest at maximum yield, and then nutrients need to be added as the reviewer pointed out. However, fertilization increases cost, leaching and N2O emissions and is neither cost-effective nor environmentally beneficial. In fact, it is recommended to harvest between January and March in the "Planting and Growing Miscanthus -Best Practice Guidelines" by the UK ministry of agriculture (DEFRA, 2007).
We will revise the sentence here to make it more clear: "In practice, harvesting of Miscanthus and switchgrass is usually performed in winter and early next spring after drying and nutrient recycling through leaf falling off Zub and Brancourt-Hulmel, 2010) which leads to a lower biomass at harvest but enhances nutrient conservation. For example, 18%-46% of the nitrogen in Miscanthus can be recycled through leaf falling to soil and translocation from shoots to rhizomes (Cadoux et al., 2012). Similar seasonal nitrogen dynamics were also observed for switchgrass . In fact, Miscanthus is recommended to be harvested between January and March in practice guidelines (DEFRA, 2007). Otherwise, fertilizers have to be applied to amend the nutrient removal from harvest, which is neither cost-effective nor environment-friendly." Cadoux, S., Riche, A. B., Yates, N. E. and Machet, J.-M.: Nutrient requirements of Miscanthus x giganteus: conclusions from a review of published studies, Biomass andBioenergy, 38, 14-22, 2012. DEFRA: Planting andGrowing Miscanthus., 2007. Heaton, E. A., Dohleman, F. G. andLong, S. P.: Seasonal nitrogen dynamics of Miscanthus × giganteus andPanicum virgatum, GCB Bioenergy, 1(4), 297-307, doi:10.1111/j.1757-1707.01022.x, 2009: The development and current status of perennial rhizomatous grasses as energy crops in the US and Europe, Biomass and bioenergy, 25(4), 335-361, 2003. Zub, H. W. and Brancourt-Hulmel, M.: Agronomic and physiological performances of different species of Miscanthus, a major energy crop. A review, Agron. Sustain. Dev., 30(2), 201-214, 2010.

Comment #12
page 9, line 4: But isn't it less practical to harvest in nearly each age class? The harvester could harm other trees. I would assume that plantations consist of homogeneous age classes and are harvested at a certain age. But maybe I do not understand which practise you assume here.

Response #12
Yes, plantations in the model are assumed to be homogeneous cohorts in as different patches of a model grid cell and harvested at certain age of maturity. Here, the "boundary" refers to the threshold of biomass to define age classes or cohorts in the model, not the physical boundary between different patches in reality. To avoid misleading, we will revise this sentence as: "Namely, harvesting starts from the second youngest age class, thus the age in the second youngest forest age cohort should be set up as same as the rotation length." Comment #13 page 9, line 15: Are there "real" plantations" already or are that more experimental sites?

Response #13
We will add a sentence to clarify it here: "Most of the measurements (>90%) are based experimental trials, especially for Miscanthus and switchgrass." Comment #14 page9, line 23: "Note that this dataset does not distinguish the utilization .." -But that makes a big difference.

Response #14
First, this is not a problem for Miscanthus and switchgrass because they are both designed for bioenergy purpose in the experimental trials. It may influence the woody crops like poplar, willow and eucalypt, but there are not a great number of studies on woody plantation for bioenergy use. Although some plantations are for timber or pulpwood, they can still provide the specific growth information for this woody crop type. We would think it is justified to use these biomass production observations to evaluate the model, considering maybe more uncertainties induced by species and genotype differences and management practices. Comment #15 page 10, line 21: It might be better to count the harvest events.

Response #15
Because we artificially harvest 1% of the grid cell each year and re-plant immediately, after the first 5 years (the rotation length), there is always a fraction that is ready for harvest. For example, regrowth of 1 st year harvest patches will reach the rotation length in the 5 th year, and the 2 nd year harvest patches will reach a full rotation in the 6 th year… Therefore, we used the last 10 years harvested biomass, representing 10 harvested events but probably from different patches.
We will revise the sentence here to clarify it: "The harvested biomass for the last 10 years was used to calculate the median and range of the simulated yields. Note that we artificially harvest 1% of the grid cells each year, and the harvested patches will be planted immediately. After the first 5 years (one rotation length), there is always a fraction reaching a full rotation and ready for harvest. The harvest in the last 10 years thus represents 10 harvest events." Comment #16 page 10, line 25: But this is of enormous importance if you like to estimate biomass potentials for BECCS. It is essential to balance the harvest and the soil carbon losses and the carbon needed for the establishment of a biomass plantation.

Response #16
Yes, we agree that soil carbon should be evaluated before using this model to study the full carbon cycle for BECCS. The biomass productivity is relatively isolated from other carbon pools like soil carbon in the model, so the implementation and parameterizations in this study are sufficient to simulate the biomass yields only. The soil carbon evaluation will be conducted in the next step, but it will take time and efforts to collect soil carbon data from observations for different crops. As we stated in the title and introduction, we only aimed to capture the biomass yield observations on global scale in this paper.

Comment #17
page 13, line 19: But it seems also that the model underestimates yields in dry regions. Blue rectangle tend to be more left sided for PFT15, 16, and 17.

Response #17
As suggested, we will add sentences here to point out this finding: "The strong underestimation (darker blue color) seems more aligned to the drier regions, especially for poplar and willow (PFT15, Fig. 10b)." Comment #18 page 14, line 14: "... different carbon dynamics in litter and soil and water and energy balance can be expected." That's why you need to take for the soil carbon balance as well. This is one of the main issue I have on that manuscript.

Comment #1
Overall, this manuscript is a straightforward evaluation of a PFT parameterization in a well-established global biogeochemical model. The authors are adding parameterization of specific plants that are used in lignocellulosic biomass for biofuels. The study is motivated by need to connect a global land biogeochemical model, which typically do not have specific parameterization of biofuel crops, to Integrative Assessment Models that include extensive uses of biofuels in many scenarios for energy development.
I appreciate the authors documenting this model developing through a relatively short publication and that the parameters presented are commonly used across other global biogeochemical models. This will allow the manuscript serve as a resource for other modeling groups that add these bioenergy crops to their simulations.

Response #1
We thank the reviewer for the comments and suggestions. Please see the detailed point-by-point responses below.

Comment #2
My main critique of the manuscript is that it needs more analysis and discussion of causes of the modeldata mismatch, specifically the role of management in the parameterization and the observation datasets.
The authors mention that there is considerable variation many of the parameters (e.g., Page 5,line 24). Is that variation related to management? Could there be a parameterization for high intensity management (nutrient additions, irrigation, advanced genetics) and a parameterization for lower intensity management? In general, it would be useful to provide more information about the drivers of variation in the parameters for each species.

Response #2
As suggested, we will add sentences to discuss the variations of parameters related to managements on P14L26: "We adjusted some key parameters (e.g. Vcmax, Jmax and SLA) related to productivity of bioenergy crops based on a collection of field measurements. We only took the medians and the ranges to validate the parameter values in the model but didn't explicitly consider the impacts of management (e.g. fertilization, species) on these parameters, neither in the model nor in the measurements. Here, we summarized some management effects on these parameters for different bioenergy crops based on measurements as follows. 1) Miscanthus: Wang et al. (2012) found that biomass yield of Miscanthus increased under nitrogen addition through elevated SLA, but fertilization didn't affect Vcmax, stomatal conductance (gs) or the extinction coefficient (k). Yan et al. (2015) measured photosynthesis variables of three Miscanthus species in two experimental fields and found significantly higher gs, Jmax and Vcmax of Miscanthus lutarioriparius than M. sacchariflorus and M. sinensis.
2) Switchgrass: SLA differed significantly among nine cultivars of switchgrass but didn't respond significantly to water stress or nitrogen application for individual cultivar (Byrd and May II, 2000). Trócsá nyi et al. (2009) reported a lower SLA of switchgrass from the early harvest than from the late harvest. Hui et al. (2018) investigated leaf physiology of switchgrass under five precipitation treatments and found significantly higher photosynthesis rate and gs under elevated precipitation but no significant difference under reduced precipitation compared to control plots.
3) Eucalypt: Lin et al. (2013) measured photosynthesis response of six Eucalyptus species to temperature and found significantly different Jmax25 and Vcmax25 among species but non-significant differences in their ratios (Jmax25 / Vcmax25) and in the temperature response of Jmax and Vcmax. With extra nitrogen supply, Jmax and Vcmax of Eucalyptus grandis increased significantly, mainly associated with elevated leaf nitrogen content (Grassi et al., 2002). Sharwood et al. (2017) also found that Jmax and Vcmax of Eucalyptus globulus were correlated with leaf nitrogen content and the ratio of Jmax/Vcmax was constant under elevated CO2 or elevated temperature, but SLA is influenced by different CO2 and temperature treatments.

Comment #3
The manuscript focuses on a global analysis, rather than comparing directly to individual field studies. By averaging the studies within a grid-cell, there is considerable variation in the observations within a grid-cell ( Figure 3). I assume that much of this variation can be attributed to differences in management of the bioenergy crop. For example, there are likely different levels of nutrient fertilization, irrigation, and use of specific genotypes within a grid-cell. I recommend exploring this variation more. Do the simulations compare better to yields from specific types of management? Addressing this question will help set a path for future model development that includes management practices. For example, if the simulations compare better to the nutrient fertilization treatment trials, then including nutrient limitation will potentially help improve the simulations of the biofuel. I realize that the paragraph on page 11, line 9 address this issue but I found paragraph to be weak. Can the studies not be roughly categorized by management intensity? Furthermore, the final sentence "implying the model is able to capture at least some of the observations in these grid cells" does not give much confidence that the new parameterization is actually an improvement.

Response #3
As suggested, we further categorized the observations with different managements (i.e. fertilization, irrigation and species) and added three figures and two tables (reproduced below) to show the modelobservation comparison. We also fully discussed the management effects on biomass yields for each bioenergy crop based on evidence from reviews or meta-analyses (Heaton et al., 2004;Cadoux et al., 2012;Kauter et al., 2003;De Moraes Gonç alves et al., 2004;Wang et al., 2010;Fabio et al., 2018. See details below). We will also add sentences in the revised manuscript to incorporate these aspects: "Management like fertilization, irrigation and species plays an important role in the biomass yields. In ORCHIDEE-MICT-BIOENERGY, nutrient limitations and management by irrigation and fertilization are not explicitly implemented. Instead, we used parameter values in the range that favors a higher productivity (Section 2.3, Fig. 1) and compared the simulated yields with the median values of all observations regardless the management (Fig. 3). We further categorized the observations into three groups (fertilization, non-fertilization or non-reported) and compared with simulated yields (Fig. S5).
There is no systematic bias between simulated yields and yields at fertilized sites for all PFTs (orange dots in Fig. S5). The model seems to overestimate the yields of eucalypt at sites with non-reported information of fertilization (most gray dots above 1:1 line in Fig. S5a, Table S4) and overestimate yields of poplar and willow at sites without fertilization (green dots in Fig. S5b, Table S4). Yields at sites with non-reported fertilization information are underestimated by the model for Miscanthus (gray dots in Fig. S5c, Table S4) but overestimated for switchgrass (gray dots in Fig. S5d, Table S4). We didn't group the observations based on different fertilization rates because there are large variations in the biomass response to fertilization rates. For example, in a quantitative review by Heaton et al. (2004), the relationship between yields of Miscanthus and nitrogen application rates were not significant. Cadoux et al. (2012) reviewed 11 studies that measured Miscanthus yields under fertilization, and the biomass response to nitrogen fertilization was positive in 6 of the studies but no response in the others. Similarly, some studies showed positive biomass response of poplar to nitrogen fertilization, but others didn't (Kauter et al., 2003). Eucalypt also showed variable response to fertilization while the general response was positive (De Moraes Gonç alves et al., 2004). In quantitative reviews of fertilization effects on yields of switchgrass  and willow (Fabio et al., 2018), the relationship between biomass yields and nitrogen fertilization rates was significantly positive but the coefficient of determination (r 2 ) was very low. In summary, biomass response to fertilization varied largely, and evidence from field measurements is not conclusive. More importantly, the basic soil characteristics should be taken into account in addition to the fertilization rates but unfortunately, we didn't have information of soil nutrient contents nor types, nutrient stoichiometry, rates and timing of applied fertilizers for each site from observations. We also separated the observations based on irrigation information (irrigation, non-irrigation and nonreported) in comparison with modeled yields (Fig. S6). Both underestimation and overestimation were found for sites with different irrigation management for different PFTs. The yields of eucalypt were underestimated at sites with irrigation (blue dots in Fig. S6a, Table S4) but overestimated at sites with non-reported irrigation information (gray dots in Fig. S6a, Table S4). Compared to fertilization, not many sites reported irrigation information and the quantification of irrigation rates is more difficult. For example, some studies reported irrigation amount per year while some others only reported descriptive information like "soil moisture maintained to field capacity" or "irregular irrigation when necessary". Comparison between simulated yields and observations for the main species of bioenergy crops is shown in Fig. S7. The model overestimated yields of Eucalyptus urophylla × E. grandis, E. globulus and E. nitens (Fig. S7a, Table S5). For poplar and willow, the model generally overestimated yields of Populus deltoides × P. nigra, P. deltoides but underestimated yields of P. trichocarpa and Salix schwerinii × S. viminalis (Fig. S7b, Table S5). There is underestimation of yields for Miscanthus × giganteus but overestimation for Miscanthus sinensis. In fact, the observed yields of the former are significantly higher than yields of the latter (t-test, p<0.01). Only four sites reported yields for Panicum pretense, and they were overestimated by the model (Fig. S7d, Table S5). " We also revised the final sentence as: "In addition, the error bars for most sites (67%, 73%, 74% and 64% for PFT14 to PFT17 respectively) reach the 1:1 line ( Fig. 3 left panel), implying that at least some observations in these grid cells can be represented by the model.". Here we only stated that although the medians are not on the 1:1 line, some observations can be captured by the model. We didn't imply the improvement after new parameterizations here, because the improvement from the previous model version be clearly seen from Fig. 11 and discussed in Section 4.2.     Biomass and Bioenergy, 38, 14-22, 2012. De Moraes Gonç alves, J. L., Stape, J. L., Laclau, J. P., Smethurst, P. and Gava, J. L.: Silvicultural effects on the productivity and wood quality of eucalypt plantations, Forest Ecol. Manag.,, 45-61, doi:10.1016/j.foreco.2004.01.022, 2004 Effects of nitrogen fertilization in shrub willow short rotation coppice production -a quantitative review, GCB Bioenergy, doi:10.1111/gcbb.12507, 2018. Heaton, E., Voigt, T. and A quantitative review comparing the yields of two candidate C4 perennial biomass crops in relation to nitrogen, temperature and water, Biomass and Bioenergy, 27 (1), 21-30, doi:10.1016/J.BIOMBIOE.2003.10.005, 2004

Comment #4
Also, these is an issue for the editors to provide input on, but the paper leans heavily on a data paper that is submitted to another unnamed journal. Therefore, a reviewer of this paper is unable to comment on the quality and applicability of the observational dataset. Should this paper be allowed to be published before that data paper is available?

Response #4
As shown on P9L15-24, we briefly reported information on the dataset related to this study. We already submitted the revised version of the dataset after peer-review in a data journal. If the dataset paper is accepted before this GMD manuscript, we will provide the detail reference information. The dataset will be eventually available to public and free to access (hopefully soon).

Comment #5
The spatial mapping of the model-bias is useful but it opened the question whether there are spatial differences in management that could explain the spatial variation in the mismatch.

Response #5
We agree that management would contribute to the spatial mismatch between model and observation. However, it is difficult to isolate individual management factor (e.g., species, irrigation and fertilization) or systematically evaluate the role of all these factors in driving model-observation mismatch. In addition, if we separate the spatial maps of sites with a specific management, the number of sites is limited in most cases and consequently, no spatial patterns can be observed. We thus discussed the management effects on the biases between model and observation globally as suggested by the reviewer (see Response #3) but didn't analyze further the regional management contributions to the spatial patterns of mismatch here.

Comment #6
The model evaluation and discussion sections blur together a bit at the edges (section 4.1 seems like a continuation of section 3). I recommend making the separation more clear.

Response #6
We will move section 4.1 from Discussion to Model evaluation section.

Comment #7
Section 3.3 says that the model-observations results generally lie around the 1:1 ratio line but doesn't provide any statistics on the fit. What is the slope and intercept from the 1:1 fit?

Response #7
We will add sentences here to report some statistics: "Although the regression between modeled and observed medians is not significant with a low r 2 value because of the overestimation and underestimation at some sites ( Fig. 3 left panel), the difference between the two samples of modelled and observed yields is not significant (t-test, p>0.17) and the percent bias (PBIAS, defined as sum of biases divided by sum of observed values, Moriasi et al., 2007) ranges from 2% to 8% for all PFTs, implying that the global distributions of modeled and observed yields are consistent ( Fig. 3

right panel).
In addition, the error bars for most sites (67%, 73%, 74% and 64% for PFT14 to PFT17 respectively) reach the 1:1 line ( Fig. 3 left panel), implying that at least some observations in these grid cells can be represented by the model." Comment #8 Figure 6. It is hard to see the gridcells in the subboxes. For example, box 2 in Figure 6 has lower points that are impossible to see. Can the subboxes be bigger. I also recommend adding a histogram inset that summarizing the data across grid-cells for all the similar figures ( Figure 6-9)

Response #8
We will enlarge box 2 in Figure 6 and add a histogram inset in Figure 6-9 as suggested. Comment #9 Figure 10 stated that there is a 1:1 line that is not present in the figure

Response #9
We will delete this sentence in Figure 10 caption.

Response #10
We will revise it accordingly.

Response #11
We will revise it accordingly.

Response #12
We will revise it accordingly.

Response #13
We will revise it accordingly.

Comment #14
Page 12 Line 13:It is unclear what is meant by "because of the large spacing of plantation the trial experiment which results in . . .". Perhaps what was intended was something like: "because of the large spacing of the planting in the trial at that experimental site, which results in . . .".

Response #14
We will revise it accordingly.

Comment #15
Page 13 Line 9: Change "US" to "the US

Response #15
We will revise it accordingly. Biomass-derived fuels serve as an alternative energy source to substitute fossil fuel and are used by many countries to meet renewable energy and climate target (Karp and Shield, 2008;Meier et al., 2015;Robertson et al., 2017). Expanding bioenergy crop plantation is considered in future scenarios for energy security and climate change mitigation (Karp and Shield, 2008;Robertson et al., 2017;Smith et al., 2016). For bioenergy production to provide economic and climate benefits, 5 cultivated plants must have a high productivity and a high yield of harvestable biomass (Karp and Shield, 2008;Robertson et al., 2017;Whitaker et al., 2010). The first generation of bioenergy crops usually refers to grain and high-sugar crops like maize and sugarcane (Karp and Shield, 2008). These crops have high nutrient requirements which demand fertilizer additions causing high N2O emissions to the atmosphere to achieve a high productivity (Melillo et al., 2009;Searchinger et al., 2008). These grain and high-sugar crops are unlikely to be planted in large-scale for the purpose of bioenergy production 10 because of the food demand pressure for fertile land and fertilizer (Alexandratos and Bruinsma, 2012;Gerland et al., 2014;United Nations, 2017). Compared to the first generation, the second generation bioenergy crops, known as lignocellulosic energy crops like giant miscanthus, swithgrass and short-rotation trees, are adapted to a wider range of climatic and soil conditions and require less nitrogen fertilizer (Cadoux et al., 2012;Miguez et al., 2008). Those second generation bioenergy crops have potentials to be deployed on marginal lands to avoid direct and indirect land use change (LUC) carbon emissions 15 and damage of ecosystem services (Robertson et al., 2017). They also appear to have less greenhouse gas (GHG) emissions and higher energy efficiency than the first generation bioenergy crops (Whitaker et al., 2010).
Bioenergy with carbon capture and storage (BECCS) is the main class of future negative emission technologies expected to result in net removal of atmospheric CO2 (Smith et al., 2016). BECCS has been extensively assumed in Integrated Assessment Models (IAMs) to develop land-based mitigation scenarios for low warming levels (Fuss et al., 2014;Popp et al., 20 2014). In most IAMs like IMAGE (Bouwman et al., 2006;Stehfest et al., 2014) and MAgPIE (Klein et al., 2014;Popp et al., 2011), second generation bioenergy crops are used as primary energy carriers . One output from IAMs is future land use maps based on different environmental, socioeconomic and policy constraints. These land use maps, after being translated into plant functional type (PFT) maps, can be used in grid-based dynamic global vegetation models (DGVMs) to simulate the terrestrial carbon dynamics, biogeochemical (e.g. LUC carbon emissions) and biophysical (e.g.

25
albedo and transpiration changes) effects of land use processes (Brovkin et al., 2013;Wilkenskjeld et al., 2014). Global vegetation models can provide in return to IAMs some valuable information like spatially explicit biomass density, crop yield and water availability (Bonsch et al., 2015(Bonsch et al., , 2016Stehfest et al., 2014). For example, dedicated bioenergy crop modelling has been implemented in a global vegetation model (LPJml) Heck et al., 2016), to simulate biophysical yields and water availability as input data for MAgPIE (Bonsch et al., 2016).

30
In most global grid-based vegetation models, there is no dedicated PFTs to represent second-generation bioenergy crops.
Instead, these plants are often represented by a generic crop PFT. Biases in simulated biomass production and resulting 3 carbon and energy balance thus arise when ignoring differences in carbon assimilation and phenology between generic crops and lignocellulosic bioenergy crops. Moreover, lignocellulosic woody bioenergy crops like eucalypt, poplar and willow cannot be properly represented by an herbaceous crop PFT. For example, eucalypt has a high maximum rate of carboxylation (Vcmax) but relatively low leaf area index (LAI) Whitehead and Beadle, 2004). Miscanthus on the contrary, has a relatively lower Vcmax (Wang et al., 2012;Yan et al., 2015) but a higher LAI (Heaton et al., 2008;Zub and Brancourt-5 Hulmel, 2010) than eucalypt (Whitehead and Beadle, 2004). Even if both Miscanthus and switchgrass are C4 crops, Miscanthus can achieve a significantly higher yield than switchgrass because of a higher efficiency of converting intercepted radiation into aboveground biomass than switchgrass (Heaton et al., 2008). The water, nitrogen and light use efficiencies are also higher for Miscanthus than for switchgrass, resulting in a higher rate of leaf photosynthesis in the former . All these important differences between lignocellulosic bioenergy crops need to be considered, which calls for 10 having a dedicated new model PFT for each species.
Similarly, the way that harvest is implemented for generic crops in global models (usually removing a fixed fraction of biomass, typically on the order of 50%) cannot be used for bioenergy crops. The harvest index (HI, harvested biomass as a fraction of aboveground biomass) is very different for grain crops and herbaceous bioenergy crops. In addition, most vegetation models currently do not account for realistic rotations of ligneous bioenergy plants (e.g. poplar and eucalypt).

15
Modeling the harvest of woody bioenergy crops should be based on rotation practices of typically a few years rather than on assuming annual full harvest like for herbaceous crops. This requires simulating forest age dynamics (Yue et al., 2018) to accurately represent the ligneous biomass harvest.
In this study, we aim to model biomass yields of four major lignocellulosic bioenergy crops in the global dynamic vegetation model ORCHIDEE. We introduce the new bioenergy crop PFTs, adjust the parameters relevant to physiology, phenology 20 and harvest process of bioenergy crops based on observations, and evaluate the simulated biomass yields using a new global dataset of field measurements.  (Krinner et al., 2005). The principal processes related to carbon cycling comprise photosynthesis, vegetation carbon allocation, autotrophic and heterotrophic respiration, plant phenology (e.g. leaf 4 onset and senescence) and litter and soil carbon dynamics (Krinner et al., 2005). ORCHIDEE-MICT further includes highlatitude related processes with new parameterizations of soil carbon vertical discretization, snow processes, and the SPITFIRE fire module (Guimberteau et al., 2018). Importantly, the representation of forest age dynamics in this version (Yue et al., 2018) allows us to simulate wood harvest based on rotation length practices, a prerequisite for simulating the woody yields. Originally, there are 13 plant functional types (PFTs) in ORCHIDEE (Table 1) (Krinner et al., 2005). In order to represent the bioenergy crops, we introduced four new PFTs ( specifically for their corresponding bioenergy crops based on field experiment or measurement data in Section 2.3.

Bioenergy biomass harvest module
The new module represents the periodical harvest of bioenergy crops, consisting of two sub-routines differentiating woody and herbaceous crops. For woody types, harvest is based on simulated forest age classes (see details in Yue et al., 2018).
Briefly, each woody PFT is sub-divided into six cohort functional types (CFTs) corresponding to different age classes. The 30 boundary of age classes is set as being PFT specific and defined based on maximum woody biomass (total of the sapwood and heartwood biomass). When the biomass of a young woody CFT reaches the upper boundary defining its age class, it is and put harvested biomass into a separate bioenergy harvest pool rather than mixing it with the modeled wood product pools 5 existing for forest management harvest (Yue et al., 2018) or with an agricultural product pool for the two crop PFTs (PFT13 and 14,

Parameterization of bioenergy crops
Most parameters in ORCHIDEE are PFT specific (Krinner et al., 2005). Since we aim to improve the biomass production performance of the four bioenergy crop PFTs, we adjusted parameters controlling carbon assimilation (Section 2.3.1 and Kattge and Knorr (2007) in order to account for the acclimation of Vcmax and Jmax to temperature. In ORCHIDEE,

25
Vcmax25 (Vcmax at 25 °C) is prescribed for each PFT, and Jmax is calculated from the ratio (rJV) between Jmax and Vcmax: rJV is a function of growth temperature (Tgrowth) (Kattge and Knorr, 2007): where arJV and brJV is the acclimation parameters derived by fitting data from 36 plant species (Kattge and Knorr, 2007). For C4 plants, no acclimation is considered for Vcmax and Jmax, and thus brJV = 0 and arJV is a fixed value (   (Table 2) for each bioenergy crop PFT using a value close to the median value in the observation dataset (Fig. 1a,c, within a range of 10% of the median values). We also verified that Jmax25 from Equation (1) is also in the range of independent Jmax25 observations (Fig. 1b). Importantly, the observation-based estimates of Vcmax25 and 15 Jmax25 for Miscanthus are significantly larger than for switchgrass (p = 0.02 and 0.09 respectively, Fig. 1a,b). Note that the ranges shown in Fig. 1 could be influenced by the sample size and number of studies.
We also adjusted other parameters including θ (the convexity factor of the response of rate of electron transport to irradiance), and α(LL) (conversion efficiency of absorbed light into e-transport rate at strictly limiting light) and g0 (residual stomatal conductance when irradiance approaches zero) in the leaf-level photosynthesis equations of ORCHIDEE to match 20 higher productivity based on field measurements or empirical data ( Table 2). The detailed effects of these parameters on photosynthesis in the FvCB model can be found in Yin and Struik (2009). In brief, θ and α(LL) are used in the calculation of J (photosynthesis rate limited by electron transport): Where I is the photon flux density absorbed by leaf photosynthetic pigments. g0 is an intercept related to the estimation of gs 25 (stomatal conductance): Where A is the net photosynthesis rate, Rd is the day respiration, and Ci and Ci* are the intercellular CO2 partial pressure and Ci -based CO2 compensation point in the absence of Rd, respectively. fVPD is factor of the effect of leaf-to-air vapor pressure difference (Yin and Struik, 2009).
Morphological plant traits are also of key importance to the canopy-level productivity (Chang et al., 2015). The specific leaf 10 area (SLA) in ORCHIDEE is a PFT-specific constant (Krinner et al., 2005). SLA for different bioenergy crops from our data compilation (164 entries in Table S1) is shown in Fig. 1d. A factor of 2 is used to convert the SLA unit from m 2 g -1 dry matter to m 2 g -1 C. Observation derived SLA for eucalypt is lower than for the other bioenergy crops, and SLA for switchgrass is relatively larger. SLA is set to the median value of observations for PFT16 (Miscanthus) and PFT17 (switchgrass), and close to the 75 th percentile value of the data we compiled for PFT14 (eucalypt) and PFT15 (poplar and 15 willow) ( Fig. 1d and Table 2).
Another important plant trait for photosynthesis is the leaf orientation, which determines the radiation extinction in the canopy. Although LAI of eucalypts is generally moderate (Anderson, 1981;Stape et al., 2004;Whitehead and Beadle, 2004), leaf angles are nearly close to vertical in mature eucalypts forest (Anderson, 1981;King, 1997), leading to a good distribution of radiation to the lower canopy layers. The light extinction coefficient (k) for PFT14 (eucalypt) is therefore set 20 to 0.36 (Table 2) according to the measurement-based estimate by Stape et al. (2004). Similarly, a field study shows the seasonal average k ranging from 0.23 to 0.37 for poplars (Ceulemans et al., 1992;Heilman et al., 1996), and a median value of 0.3 was used for PFT15 (Table 2).
For woody PFTs in ORCHIDEE, the partitioning between aboveground and belowground sapwood biomass is a function of 30 forest age (Krinner et al., 2005): 8 fab,t = fab,min + (fab,maxfab,min) × (1e -t/τ ) Where fab,t is the fraction of sapwood allocated to aboveground at age t; fab,min and fab,max are the minimum and maximum fraction allocated to aboveground (0.2 and 0.8 respectively); and τ is an empirical parameter. This equation implies that more biomass is allocated to belowground sapwood to develop coarse roots in younger forests. The partition between aboveground and belowground biomass is influenced by resource supply like water and nutrient availability (Litton et al., 2007). For 5 example, belowground carbon allocation in eucalypt is observed to be strongly reduced by irrigation (Barton and Montagu, 2006;Ryan et al., 2010;Stape et al., 2008). Fertilized poplars also showed greater shoot growth than control plots (Coleman et al., 2004). We assumed that bioenergy trees should usually be under intensive management (e.g. irrigation and fertilization) especially in the establishment year (Caslin et al., 2015;Isebrands and Richardson, 2014;Jacobs, 1981). A higher water and nutrient availability then implies a lower investment of biomass on roots for bioenergy trees. In the ORCHIDEE version used 10 here, as there is no specific fertilization or irrigation practice included, the idealized approach chosen to partially account for these managements operations was to reduce τ in Equation (5) from 5 to 2 years (Table 2) to give a maximum allocation of sapwood biomass to aboveground faster than in the standard version. The difference of these two values is illustrated in Fig.   S1S2. Also because the rotation length for bioenergy trees is usually of several years only (Karp and Shield, 2008), it is reasonable to assume that these plants allocate more biomass aboveground in the first few years. However, trees like poplar 15 and willow can sprout from the remaining stem or root (Isebrands and Richardson, 2014), which is not accounted for in the model. Last, we also adjusted the factor (β, Table 2) in the exponential function to calculate the soil water stress in ORCHIDEE (Krinner et al., 2005;McMurtrie et al., 1990) to reduce the soil moisture stress on bioenergy trees (Fig. S2S3).

Phenology parameters
An adjustment of parameters related to phenology was performed for the two herbaceous bioenergy PFTs (PFT16 and 20 PFT17, Table 2) to derive the total biomass production for the whole growing season. Lewandowski et al. (2003) and Zub and Brancourt-Hulmel (2010) reviewed growth temperature and growing season length of Miscanthus and switchgrass, and found that these two crops have higher cold tolerance and a longer growing season than grasses. Compared to maize, Miscanthus has an earlier leaf onset and later leaf fall, and thus its growing season length is 59% longer ). Some Miscanthus genotypes need fewer cumulative degree-days for shoot emergence (60 to 118 degree days) 25 and a high frost tolerance (-9 to -6 °C) (Farrell et al., 2006). To account for this frost tolerance and longer growing season, we decreased the growing degree days for leaf onset in the model (GDDonset) from 700 (standard value for C4 crop PFT) to 320 degree days (same as the default value for C4 grass PFT in ORCHIDEE) and the critical temperature for leaf senescence (Tsenescence) from 10 to 0 °C for PFT16 and PFT17 ( Table 2). Note that we did not set Tsenescence to be -9 to -6 °C, because frost tolerance was only documented for certain Miscanthus genotypes, so we used a conservative value of 0 °C for Miscanthus 30 and switchgrass PFTs. In addition, we increased the critical leaf age beyond which leaves enter senescence (tleaf) and the 9 minimum leaf age to allow leaf senescence (tleaf,min) to be the same as the default values for C4 grass PFT (PFT11 in Table 1) in ORCHIDEE (Table 2).

Biomass harvest
The harvest index (HI) determines how much aboveground biomass is harvested. Theoretically, all the aboveground biomass of a lingocellulosic crop can be used for energy production. Some IAMs (e.g. GCAM3.0, Kyle et al., 2011) indeed assume a 5 HI of 1 for switchgrass for instance. In practice, harvesting of Miscanthus and switchgrass is usually performed in winter and early spring after drying and nutrient recycling through leaf falling offsenescence Zub and Brancourt-Hulmel, 2010) which leads to a lower biomass at harvest but enhances nutrient conservation. For example, 18%-46% of the nitrogen in Miscanthus can be recycled through leaf falling to soil and translocation from shoots to rhizomes (Cadoux et al., 2012). Similar seasonal nitrogen dynamics were also observed for switchgrass ). In fact,

10
Miscanthus is recommended to be harvested between January and March in practice guidelines (DEFRA, 2007). Otherwise, fertilizers have to be applied to amend the nutrient removal from harvest, which is neither cost-effective nor environment-  (Table 2). However, for simulations using future land use maps generated from IAMs, we would recommend to setsetting the HI same as in IAMs to be consistent.
The rotation length for eucalypt, poplar and willow varies among different tree types, species, locations and plantation purposes (Caslin et al., 2015;Isebrands and Richardson, 2014;Karp and Shield, 2008;Keoleian and Volk, 2005;Mead et al., 2001). For example, eucalypt and poplar for sawlog and veneer utilization are often on rotations of 8-20 years, depending on 20 regions (Isebrands and Richardson, 2014;Mead et al., 2001). But short rotation coppice bioenergy plantation of poplar and willow have shorter cutting cycles of 3-5 years (Caslin et al., 2015;Isebrands and Richardson, 2014;Karp and Shield, 2008;Keoleian and Volk, 2005). A rotation length of 8 years was used in LPJml model for bioenergy trees .
In ORCHIDEE, the rotation length for bioenergy tree PFTs is associated with the setting of age classes (see Section 2.2).
Namely, harvesting starts from the second youngest age class, thus the biomass class boundary forage in the second youngest 25 forest age cohort is assumed to defineshould be set up as same as the rotation length. For idealized simulations presented below, we used a rotation length of 4-6 years based on the harvest age and rotation length in the evaluation dataset ( Fig. 2; Section 3.2). Here, the harvest age (Fig. 2) represents the age when the biomass of bioenergy trees was harvested or estimated. It is directly reported by the original literature and corresponding corresponds to the reported yield. Rotation length (Fig. 2) is the management practice reported in the original literature, and it is the same as harvest age in most studies.

30
In other studies, however, some trees may be harvested earlier or later than the regular rotation length, e.g. for a comparison purpose. In addition, not all literature reported both harvest age and rotation length (see the number of observations in Fig. 2).

Simulation set-up
The set-up for the site-scale simulations in ORCHIDEE-MICT-BIOENERGY is as follows. The model is forced with 30 min time step climatic forcing data, CRU-NCEP v7 (Viovy, 2017) recycling the period of 1990-2000. The CRU-NCEP forcing data is a merged product of CRU TS climate dataset (Harris et al., 2014) and NCEP reanalysis data (Kalnay et al., 1996).
Some observation sites have reported mean annual temperature (MAT) and precipitation (MAP), and we verified that these 25 data are consistent with the MAT and MAP from the CRU-NCEP v7 climate forcing data we used (Fig. S3S4). Thus no bias correction was applied to the CRU-NCEP v7 climate forcing. The soil texture map used in the model is based on the twelve USDA texture classes from Reynolds et al. (2000).
We assumed a homogenous coverage (100%) of one single bioenergy crop PFT in a grid cell covered by the same PFT type as the site observations. We set an annual harvest fraction of 1% of the grid cell each year. The 1% annual harvest fraction is 30 just an artificial value to make sure that there is always forest in second youngest age class available for harvest every year Formatted: Font: Italic after a stable rotation is established. We compared the annual harvested biomass in bioenergy harvest pool in per area unit, so the harvest area has no influence on our model evaluation. For the bioenergy trees (PFT14 and PFT15), a spin-up of 100 years without harvest was run first to derive biomass evolution in time to define the respective biomass boundaries for age classes in each grid cell (see Yue et al., 2018). The biomass boundaries are grid-cell specific because of the different vegetation growth rates in different grid cells. The six age classes from youngest to oldest are thus set to be corresponding to 5 0-4, 4-6, 6-10, 10-30, 30-50, and >50 years. We set the second youngest age class that is used in priority for bioenergy harvest (Section 2.2) to be 4-6 years ( Fig. 2) based on harvest age and rotation length reported by the original publications in the evaluation dataset (Li et al., submitted). After spin-up, the simulations for PFT14 (eucalypt) and PFT15 (poplar and willow) were run with bioenergy harvest process for 50 years because we only harvested the second age class (4-6 yr) and 50 years is long enough to establish a stable rotation. The harvested biomass amount for the last 10 years was used to calculate 10 the median and range of the simulated yields. Note that we artificially harvest 1% of the grid cells each year, and the harvested patches will be planted immediately. After the first 5 years (one rotation length), there is always a fraction reaching a full rotation and ready for harvest. The harvest in the last 10 years thus represents 10 harvest events. We divided the harvested biomass by 5 years (4-6 years in the second youngest age class) to obtain the annual mean yields of PFT14 (eucalypt) and PFT15 (poplar and willow). A carbon-to-dry-matter ratio of 0.5 was used to convert the unit of yields into ton 15 DM ha -1 yr -1 .
For the bioenergy grasses (PFT16 and PFT17) simulations were performed directly (without spin-up) with harvest for 50 years, and similarly, the yields of the last 10 years were used for comparison with site observed values. Note that we aim to assess the performance of simulated biomass yields rather than the state of the carbon pools including litter and soil organic matter, that depend on site history. As litter and soil carbon pools do not influence vegetation productivity in the model, we 20 did not perform a full long spin-up of carbon pools to their equilibrium values.

Simulated bioenergy yields at global level
The simulated bioenergy biomass yields in comparison with field observations for the four bioenergy crops are shown in Fig.   3. The model-observation results generally lie around the 1:1 ratio line (Fig. 3 left panel). Although the regression between modeled and observed medians is not significant with a low r 2 value because of the overestimation and underestimation at 25 some sites (Fig. 3 left panel), the difference between the two samples of modelled and observed yields is not significant (ttest, p>0.17) and the percent bias (PBIAS, defined as sum of biases divided by sum of observed values, Moriasi et al., 2007) ranges from 2% to 8% for all PFTs, implying that the global distributions of modeled and observed yields are consistent.
Although there are some large overestimates and underestimates especially for PFT14 (eucalypt) and PFT16 (Miscanthus), reported like "irrigating when necessary". The fertilization rates are also difficult to synthesize between different studies because they applied different types of fertilizers, some annually but some in random years. Third, each observation is associated with different managements / treatments, and there is no uniform standard to weight all these different managements. Last, global vegetation models usually run at a half-degree resolution, which may not fully represent the site level climate variations and soil properties. However, the error bars for most sites (67%, 73%, 74% and 64% for PFT14 to 20 PFT17 respectively) reach the 1:1 line (Fig. 3 left panel), implying that the model is able to capture at least some observations in these grid cells.

Biomass-age relationship at site level
A good representation of biomass-age curves for bioenergy trees in the model is crucial to reproduce the yields, especially in the first several years after plantation planting (≤ rotation length). However, most of the observations in the global evaluation 25 dataset were only mean annual yield (Li et al., submitted). This precludes a detailed analysis of biomass dynamics over time for bioenergy trees. We thus selected 22 studies (Table S2) from the evaluation dataset that reported biomass amount of multiple ages (at least two years) and at the same site for eucalypt, poplar or willow. We went through the original articles to derive the biomass-age curves and compared them with the same curves from the model simulations ( Fig. 4 and Fig. 5).
There is a good agreement on biomass-age relationship of eucalypt between model and observation in some sites in Australia 30 and China (Site #2, #8-12 in Fig. 4). But the model underestimates the biomass evolution of eucalypt at Site #13 in New Zealand and overestimates it at Site #5-6 in China (Fig. 4). For poplar and willow, there are two long-term (>10 yr) 13 consecutive observation sites in Wisconsin, USA (Site #13 and #15 in Fig. 5), where the model captures the biomass-age relationship well. In some other sites (Site #2, #3, #7, #14 and #17 in Fig. 5), however, the model results only agree with observations for the first few years and then deviate from the observations afterwards. The model generally coarsely underestimates the biomass of poplar and willow at all ages in the sites in eastern (Site #1 in Fig. 5) and western (Site #5 and #6 in Fig. 5) coastal region of US, in UK (Site #8 to #11 in Fig. 5) and in Sweden (Site #16 in Fig. 5), but overestimates in 5 India (Site #12 in Fig. 5) and at one site in China (Site #18 in Fig. 5).
Possible reasons for the model-observation differences at each site using the information reported in the original studies (see details in Table S2) include the different varieties of species (e.g. genotypes) and management (e.g. fertilization, irrigation or spacing) in the field, which were not explicitly considered in the model. For example, the model overestimates biomass at Site #4 in Fig. 4 because of the large spacing of plantation in the trial at that experimental site (Han et al., 2010), which 10 results in lower biomass yield when converting the unit of ton DM plant -1 yr -1 to ton DM ha -1 yr -1 . Site #13, #14 and #15 in Fig. 5 are from the same study (Strong and Hansen, 1993), and the model reproduces at Site #13 and #15 but underestimates it at Site #14. It is because the biomass-age curves at Site #13 and #15 are from the average of several genotypes (some have higher yields and some lower), but only one genotype with relatively high yield was planted at Site #14 (Strong and Hansen, 1993), causing an model underestimation at Site #14. In addition, our model seems to systematically underestimate biomass 15 production of willow for the sites in UK (Site #8-11 in Fig. 5). These observed biomass production in UK was based on a range of willow varieties in trial experiments, and the authors (Lindegaard et al., 2011) claimed that the trial experiments generate higher yields than large-scale commercial plantations because of the differences in land quality and practice guidelines (e.g. cutting, harvest index). Despite of some model-observation differences, we emphasized that the modeled biomass-age curves are consistent with observations for most sites within the rotation length. 20

Maps of differences between simulated and observed yields
The spatial distributions of relative differences between simulated and observed biomass yields are shown in Fig. 6 to Fig. 9 for each PFT. The observations for eucalypt mainly distribute in Brazil, tropical Africa, South Asia and Australia (Fig. 6).
ORCHIDEE-MICT-BIOENERGY slightly underestimates biomass yield for PFT14 (eucalypt) in Brazil and overestimates some grid cells in southern China and Australia. Some biomass observations of eucalypts in Australia are obtained from 25 native forests (Li et al., submitted), which may partly explain the overestimation by model.
Poplar and willow are mainly planted in temperate regions like the United States, Europe, and Central and East Asia (Fig. 7).
ORCHIDEE-MICT-BIOENERGY underestimates the biomass yields for PFT15 (poplar and willow) in western US but overestimates the yields in eastern US. There is no distinct pattern for the differences between observations and model results in Europe with both underestimation and overestimation across grid cells. But it seems that the simulated biomass 30 yields are lower than observations in Sweden. In Central and East Asia, biomass yields in the inland grid cells are generally underestimated but those in the coastal areas are overestimated.

Field Code Changed
Most of the observations for Miscanthus are from Europe although some trial tests are also available in eastern US and a few in China (Fig. 8). In the US, very slight underestimation of yield was found in the inland areas while overestimation was more close to the ocean. The model underestimates biomass yields for PFT16 (Miscanthus) in UK and South Europe, and slightly overestimates it in other areas in Europe. There are only three grid cells with Miscanthus yield observations in China, and they are all largely overestimated in the simulations.

5
Switchgrass is a native perennial grass in North America  and thus mainly grows in the US (

Model-observation difference in different climate bins
The differences between simulated and observed biomass yields for bioenergy crop PFTs in different MAT and MAP intervals are shown in Fig. 10. There is no systematical bias of simulated biomass yields in the climate space except in the climate zones with relatively high MAT and MAP (upper-right grids in Fig. 10b,c) for PFT15 (poplar and willow) to PFT17 15 (switchgrass). For these PFTs, iIt seems ORCHIDEE overestimated the yields with MAT > 15 C and MAP > 1000 mm yr -1 .
The strong underestimation (darker blue color) seems more aligned to the drier regions, especially for poplar and willow (PFT15, Fig. 10b).
The distribution patterns in Fig. 10 also reflect the different climate conditions of growth for these bioenergy crops.
Consistent with their physiological characteristics, eucalypts grow in tropical regions ( Fig. 6) with MAT > 10 C and MAP > 20 500 mm yr -1 (Fig. 10). By contrast, poplars and willows grow in temperate regions (Fig. 7) and some under low MAT and MAP (Fig. 10). Miscanthus and switchgrass are usually planted in Europe and US ( Fig. 8 and Fig. 9) with moderate MAT and MAP.
We further investigated whether other climate forcing variables in the model impact the model-observation differences using the multiple linear regression method (Table S3) and the regression tree method (Breiman et al., 1984;Pedregosa and 25 Varoquaux, 2011) (Fig. S4S5). In these two methods, PFT types and nine climate forcing variables (Table S3) were used as independent variables and the relative model-observation difference as dependent variable. The multiple linear regression is non-significant (p = 0.28) with a very low r 2 (0.01), suggesting that the variations in the relative model-observation differences is mostly explained by other factors rather than the climate forcing biases used in the model. In the regression tree (Fig. S4S5), the first discriminator is short-wave radiation but it only split very few samples. Although north wind speed 30 Formatted: Not Highlight separates a relatively large proportion of samples (Fig. S4S5), it has little to do with the biomass production in the model. Therefore, results from these two regression methods suggest the model-observation biases are unlikely caused by the model simulation.

1 Model performance before and after bioenergy crop implementation
In this study, we added four new PFTs to represent the main lignocellulosic bioenergy crops and implemented new parameterizations for each new PFT. As a first step, we evaluated the biomass production from bioenergy crops in ORCHIDEE using a global field measurement dataset. We compared the biomass yields simulated by the new ORCHIDEE-MICT-BIOENERGY with the yields from previous ORCHIDEE version (Fig. 11). In the previous version, bioenergy crops 10 were all taken as herbaceous C4 crop (PFT13), and thus severe overestimation (overestimating 60% on average) occurs for tropical bioenergy trees (i.e. eucalypts, gray squares in Fig. 11a). Although using herbaceous C4 crop generally reproduce the observed biomass yields of poplars and willows (grey squares in Fig. 11b), different carbon dynamics in litter and soil and water and energy balance can be expected.
Using the right tree PFTs for bioenergy trees and right herbaceous PFTs for bioenergy grasses but without new 15 parameterizations also results in significant biases in the simulated yields compared to observations (blue triangles in Fig.   11). Specifically, using the default parameters of previous version is found to largely underestimate biomass yields of all bioenergy trees (blue triangles in Fig. 11a,b). For bioenergy grasses, slight underestimation was found for Miscanthus (blue triangles in Fig. 11c) while large overestimation was found for switchgrass (blue triangles in Fig. 11d) with previous default parameters. The large biomass yields of C4 crops in previous ORCHIDEE version (blue triangles in Fig. 11c,d) mainly result 20 from the high Vcmax25 (Table 2), which is not the reason for the high yields of Miscanthus and switchgrass (Fig. 1). We emphasize again that different bioenergy crops achieve high productivities through different pathways based on their plant traits (Section 2.3) and it is important to specifically consider these traits by proper parameterizations in the global vegetation models.

25
We adjusted some key parameters (e.g. Vcmax, Jmax and SLA) related to productivity of bioenergy crops based on a collection of field measurements. We only took the medians and the ranges to validate the parameter values in the model but didn't explicitly consider the impacts of management (e.g. fertilization, species) on these parameters, neither in the model nor in the measurements. Here, we summarized some management effects on these parameters for different bioenergy crops based on measurements as follows.
1) Miscanthus: (Wang et al., (2012) found that biomass yield of Miscanthus increased under nitrogen addition through elevated SLA, but fertilization didn't affect Vcmax, stomatal conductance (gs) or the extinction coefficient (k). (Yan et al., (2015)measured photosynthesis variables of three Miscanthus species in two experimental fields and found significantly 5 higher gs, Jmax and Vcmax of Miscanthus lutarioriparius than M. sacchariflorus and M. sinensis.
2) Switchgrass: SLA differed significantly among nine cultivars of switchgrass but didn't respond significantly to water stress or nitrogen application for individual cultivar (Byrd and May II, 2000). (Trócsá nyi et al., (2009) reported a lower SLA of switchgrass from the early harvest than from the late harvest. (Hui et al., (2018) investigated leaf physiology of switchgrass under five precipitation treatments and found significantly higher photosynthesis rate and gs under elevated 10 precipitation but no significant difference under reduced precipitation compared to control plots.
3) Eucalypt: (Lin et al., (2013) measured photosynthesis response of six Eucalyptus species to temperature and found significantly different Jmax25 and Vcmax25 among species but non-significant differences in their ratios (Jmax25 / Vcmax25) and in the temperature response of Jmax and Vcmax. With extra nitrogen supply, Jmax and Vcmax of Eucalyptus grandis increased significantly, mainly associated with elevated leaf nitrogen content (Grassi et al., 2002). (Sharwood et al., (2017)  and Vcmax of the former species were significantly higher than the latter hybrid despite some clonal variations (Dowell et al., 2009). (Wullschleger, (1993) summarized the species-specific estimates of Jmax and Vcmax, and the five Populus species 20 displayed large variations. In a poplar free-air CO2 enrichment (PopFACE) experiment, P. alba, P. nigra and P. × euramericana showed significant difference of gs but non-significant difference of Jmax and Vcmax among species, while the elevated CO2 significantly decreased Jmax and Vcmax but had no influence on gs species (Bernacchi et al., 2003). SLA was also found to differ significantly between P. deltoides × P. nigra family and P. deltoides × P. trichocarpa family (Marron et al., 2007). For willows, SLA increased significantly under fertilization and irrigation, but the magnitude of response varied 25 among six varieties of Salix species (Weih and Rönnberg-Wä stjung, 2007). Similarly, the response of SLA and gs to nitrogen fertilization differed among three willow clones, but no significant difference of Vcmax was found between fertilization and control plots for all clones (Merilo et al., 2006). requires substantial computational resource, and more importantly, there may be not enough measured parameters of each species for all the processes implemented in the models. At this stage, therefore, using the medians and ranges across a grea t number of observations is a more justified and practical way to tune the parameters in the models. But more field 5 measurements and quantitative reviews of relationships between individual parameter and individual management as well as interactions between different parameters and managements are highly needed in future research.

Management impacts on yields
Management like fertilization, irrigation and species plays an important role in the biomass yields. In ORCHIDEE -MICT-BIOENERGY, nutrient limitations and management by irrigation and fertilization are not explicitly implemented. Instead, 10 we used parameter values in the range that favors a higher productivity (Section 2.3, Fig. 1) and compared the simulated yields with the median values of all observations regardless the management (Fig. 3). We further categorized the observations into three groups (fertilization, non-fertilization or non-reported) and compared with simulated yields (Fig. S6).
There is no systematic bias between simulated yields and yields at fertilized sites for all PFTs (orange dots in Fig. S6). The model seems to overestimate the yields of eucalypt at sites with non-reported information of fertilization (most gray dots 15 above 1:1 line in Fig. S6a, Table S4) and overestimate yields of poplar and willow at sites without fertilization (green dots in Fig. S6b, Table S4). Yields at sites with non-reported fertilization information are underestimated by the model for Miscanthus (gray dots in Fig. S6c, Table S4) but overestimated for switchgrass (gray dots in Fig. S6d, Table S4).
We didn't group the observations based on different fertilization rates because there are large variations in the biomass response to fertilization rates. For example, in a quantitative review by (Heaton et al., (2004), the relationship between yields 20 of Miscanthus and nitrogen application rates were not significant. (Cadoux et al., (2012) reviewed 11 studies that measured Miscanthus yields under fertilization, and the biomass response to nitrogen fertilization was positive in 6 of the studies but no response in the others. Similarly, some studies showed positive biomass response of poplar to nitrogen fertilization, but others didn't (Kauter et al., 2003). Eucalypt also showed variable response to fertilization while the general response was positive (De Moraes Gonç alves et al., 2004). In quantitative reviews of fertilization effects on yields of switchgrass (Wang et 25 al., 2010) and willow (Fabio and Smart, 2018), the relationship between biomass yields and nitrogen fertilization rates was significantly positive but the coefficient of determination (r 2 ) was very low. In summary, biomass response to fertilization varied largely, and evidence from field measurements is not conclusive. More importantly, the basic soil characteristics should be taken into account in addition to the fertilization rates but unfortunately, we didn't have information of soil nutrient contents nor types, nutrient stoichiometry, rates and timing of applied fertilizers for each site from observations.

30
We also separated the observations based on irrigation information (irrigation, non-irrigation and non-reported) in comparison with modeled yields (Fig. S7). Both underestimation and overestimation were found for sites with different  Table S4) but overestimated at sites with non-reported irrigation information (gray dots in Fig. S7a, Table S4).
Compared to fertilization, not many sites reported irrigation information and the quantification of irrigation rates is more difficult. For example, some studies reported irrigation amount per year while some others only reported descriptive information like "soil moisture maintained to field capacity" or "irregular irrigation when necessary".

5
Comparison between simulated yields and observations for the main species of bioenergy crops is shown in Fig. S8 Table S5). There is underestimation of yields for Miscanthus × giganteus but overestimation for Miscanthus sinensis. In fact, the observed yields of the former are significantly higher 10 than yields of the latter (t-test, p<0.01). Only four sites reported yields for Panicum pretense, and they were overestimated by the model (Fig. S8d, Table S5).

4 Future development
Although the model can generally reproduce the bioenergy crop yields on a global scale,In ORCHIDEE-MICT-BIOENERGY, nutrient limitations and management like irrigation and fertilization are not explicitly implemented. Instead,

15
we used parameter values in the range that favors a higher productivity (Section 2.3, Fig. 1). However, there are still some regional biases of biomass yields for different bioenergy crops. For example, ORCHIDEE-MICT-BIOENERGY underestimates the biomass yields of Miscanthus in UK by 43% (Fig. 8) and overestimates the yields of switchgrass in eastern US by 18% (Fig. 9). Thus, for a regional use of modelled results, slight modifications of related parameters would be needed.

20
In addition to the yields from aboveground biomass, the allocation of belowground biomass also needs to be modified, and the resulting soil carbon stocks need to be evaluated. In the current version, the non-harvested parts of biomass go to the litter pool after each harvest. In reality, however, stumps and coarse roots remain alive in coppicing practices of tree spec ies like eucalypt, poplar and willow, and new shoots grow out of these stumps in the next growing season. Beside the biogeochemical processes, it is also critical to further parameterize and evaluate biophysical processes, especially in the coupled simulations of global vegetation models with climate models to calculate the biophysical feed-backs. Field measurements on e.g. leaf traits, heat exchange and transpiration of bioenergy crops extend our knowledge of these 5 biophysical processes and need to be integrated adequately in the global vegetation models.

Conclusions
Bioenergy crop has been extensively assumed in IAMs and is an important type of future land use. However, most global vegetation models do not have specific representations of these bioenergy crops. It is important to accurately represent the physiology, phenology and carbon allocation of these crops because it fundamentally impacts the hydrology dynamics, 10 energy balance and carbon cycle. Especially for woody bioenergy crops like eucalypts, poplars and willows, not only the biomass yields but also the seasonal variations and biophysical effects, and carbon turnover are impacted by new parameterizations.
In this study, we demonstrated the importance of proper representative of bioenergy crops in a global vegetation model to reproduce the observation-based biomass yields. We introduced new bioenergy crop PFTs based on their plant 15 characteristics, modified the parameters relevant to productivity based on field measurements and empirical evidence, and added the dedicated harvest process to simulate bioenergy biomass yields. The bioenergy crop simulations in ORCHIDEE-MICT-BIOENERGY generally reproduced the observation-based biomass yields for bioenergy crops at global level.
However, it is still difficult to match observations site-by-site due to the uncertainties in the observation dataset and the lack of explicit managements in the model. Evaluations on soil carbon dynamics and biophysical variables are further needed.

20
Our work improves the performance of ORCHIDEE on bioenergy crops modelling, and the parameters used in ORCHIDEE-MICT-BIOENERGY also provide guidance for other vegetation models on incorporating dedicated bioenergy crops.

Code availability
This model development is based on ORCHIDEE-MICT version (Guimberteau et al., 2018) with gross land use changes and forest age dynamics (Yue et al., 2018). The code availability can be found in these two publications. The newly implemented 25 parameterization can be found in Table 2

Data availability
The compiled Vcmax and Jmax data from observations can be found in supplementary Table S1. The evaluation dataset used in this study i.e. the global yield dataset for major lignocellulosic bioenergy crops based on field measurements, has been 5 submitted to a data description journal and will be freely accessed after the acceptance of the data description paper.   climate. PFT 14 is tropical bioenergy tree, eucalypt; PFT15 is temperate bioenergy tree, poplar and willow; PFT16 is C4 bioenergy grass, Miscanthus; PFT17 is C4 bioenergy grass, switchgrass. The red line indicates the 1:1 ratio line.

Figure 4
Biomass-age curves at different sites for PFT14 (tropical bioenergy tree, eucalypt). Site number, coordinates, and country for each site are also shown. Biomass at most sites refers to aboveground biomass, except Site #5, #11 and #12 (labeled "total", i.e. the sum aboveground and belowground biomass; the same total biomass from model is used for these sites). The detailed site information is shown in Table S2.
32 Figure 5 Biomass-age curves at different sites for PFT15 (temperate bioenergy tree, poplar and willow). Site number, coordinates, and country for each site are also shown. Biomass at most sites refers to aboveground biomass, except Site #17 (labeled "total", i.e. the sum aboveground and belowground biomass; the same total biomass from model is used for this site.). The detailed site 5 information is shown in Table S2. Figure 6 The map of relative difference between simulated and observed biomass yields for PFT14 (tropical bioenergy tree, eucalypt). The inset plot shows the frequency of the relative difference between model and observation. Figure 7 The map of relative difference between simulated and observed biomass yields for PFT15 (temperate bioenergy tree, poplar and willow). The inset plot shows the frequency of the relative difference between model and observation. Figure 8 The map of relative difference between simulated and observed biomass yields for PFT16 (C4 bioenergy grass, Miscanthus). The inset plot shows the frequency of the relative difference between model and observation.

35
36 Figure 9 The map of relative difference between simulated and observed biomass yields for PFT17 (C4 bioenergy grass, switchgrass). The inset plot shows the frequency of the relative difference between model and observation.