Interactive comment on “ The biophysics , ecology , and biogeochemistry of functionally diverse , vertically-and horizontally-heterogeneous ecosystems : the Ecosystem Demography Model , version 2 . 2 – Part 1 : Model description

changes since ED-2.0, as the technical description of ED-2.0.12 and ED-2.1 were not published and they are part of a continuous model development effort. In addition, while ED-2.0 solved the energy and water cycles at sub-daily time scales, the paper by Medvigy et al. (2009) does not describe the implementation of these cycles in the ED framework. Our manuscript describes, for the first time, the fundamental equations that govern the energy, water, and CO2 dynamics in ED-2 (Section 3), and how we obtain each flux that is accounted for in the fundamental equations (Section 4).


Introduction
The dynamics of the terrestrial biosphere play an integral role in the earth's carbon, water and energy cycles (Betts and Silva Dias, 2010;Santanello Jr et al., 2018;Le Quéré et al., 2018), and consequently, how the earth's climate system is expected to change over the coming decades due to the increasing levels of atmospheric carbon dioxide arising from anthropogenic activities (IPCC, 2014;Le Quéré et al., 2018). Models for the dynamics of the terrestrial biosphere and its bi-directional 5 interaction with the atmosphere have evolved considerably over the past decades (Levis, 2010;Fisher et al., 2014. As described by Sellers et al. (1997), the first generation of land surface models (LSMs) were limited to provide boundary conditions to atmospheric models, and only solved a simplified energy and water budget, and accounted for the effects of surface on frictional effects on near-surface winds (e.g. Manabe et al., 1965;Somerville et al., 1974). These models, however, did not account for the active role of vegetation. The second generation of LSMs considered the active role of vegetation 10 and represented the spectral properties of the canopy, the changes in roughness of vegetated surfaces, and the biophysical controls on evaporation and transpiration (Sellers et al., 1997); examples of these models include NCAR/BATS (Dickinson et al., 1986) and SiB . The increasing recognition of the role of vegetation in mediating the exchanges of carbon, water and energy between the land and the atmosphere led to the third generation of LSMs, which incorporated explicit representations of plant photosynthesis, and resulting dynamics of terrestrial carbon uptake, turnover and release within 15 terrestrial ecosystems (Sellers et al., 1997); examples of such models included LSM (Bonan, 1995) and SiB2 (Sellers et al., 1996). While the fluxes of carbon, water and energy predicted by these models would change in response to changes in their climate forcing, the biophysical and biogeochemical properties of the ecosystem within each climatological grid cell was prescribed, and thus did not change over time.
Subsequently, building upon previous work (Prentice et al., 1992;Neilson, 1995;Haxeltine and Prentice, 1996), Foley et al. 20 (1996) adopted an approach to calculate the productivity of a series of plant functional types (PFTs), based on a leaf-level model of photosynthesis. The abundance of each PFT within each grid cell was dynamic, with the abundance changes being determined by the relative productivity of the PFTs. This allowed the fast-timescale exchanges of carbon, water, and energy within the plant canopy to be explictly linked with the long-term dynamics of the ecosystem. This approach followed the concept of dynamic global vegetation model (DGVM), originally coined by Prentice et al. (1989) to describe this kind of terrestrial 25 biosphere model in which changes in climate could drive changes in ecosystem composition, structure and functioning, and which when run coupled to atmospheric models would then feedback onto climate. The subsequent generation of terrestrial biosphere-based DGVMs (i.e. DGVMs incorporating couple carbon, water, and energy fluxes) such as LPJ (Sitch et al., 2003), CLM-DGVM (Levis et al., 2004) and TRIFFID/JULES (Hughes et al., 2004;Clark et al., 2011;Mangeon et al., 2016) have included additional mechanisms such as disturbance through fires and multiple types of mortality. 30 Analyses have shown that most terrestrial biosphere models are capable of reproducing the current distribution of global biomes (e.g. Sitch et al., 2003;Blyth et al., 2011) and their carbon stocks and fluxes (Piao et al., 2013). However, they diverge markedly in their predictions of how terrestrial ecosystems will respond to future climate change (Friedlingstein et al., 2014).
In fully-coupled Earth System Model simulations, some of these differing predictions arise from divergent predictions about the direction and magnitude of regional climate change. However, off-line analyses, in which the models are forced with prescribed climatological forcing, have shown that there is also substantial disagreement between the models about how terrestrial ecosystems will respond to any shift in climate (e.g. Sitch et al., 2008;Zhang et al., 2015). In addition, the transitions between biome types, for example, the transition that occurs between closed-canopy tropical forests and grass-and shrub-dominated savannahs in South America, are generally far more abrupt in typical DGVM results than in observations (Good et al., 2011;5 Levine et al., 2016).
One important limitation of most DGVMs is that they cannot represent within-ecosystem diversity and heterogeneity. The representation of plant functional diversity within terrestrial biosphere models is normally coarse, with broadly-defined PFTs defined from a combination of morphological and leaf physiological attributes . In addition, there is limited variation in the resource conditions (light, water, nutrient levels) experienced by individual plants within the cli-10 matological grid cells of traditional DGVMs. Some models such as CLM (Oleson et al., 2013) have vertical above-ground heterogeneity in the form of a multi-layer plant canopy that allows for sun and shade leaves, and/or differences in rooting depth between PFTs; however, resource conditions are assumed to be horizontally homogeneous, meaning that there is no horizontal spatial variation in resource conditions experienced by individuals. The lack of significant variability in resource conditions limits the range of environmental niches within the climatological grid cells of terrestrial biosphere, and makes the coexistence 15 between PFTs difficult. Consequently, models would often predict ecosystems comprised of single homogeneous vegetation types (Moorcroft, 2003(Moorcroft, , 2006. Field-and laboratory-based studies conducted over the past thirty years indicate that plant functional diversity significantly affects ecosystem functioning (Loreau and Hector, 2001;Tilman et al., 2014, and references therein), and variations in trait expression are strongly driven by disturbances and local heterogeneity of abiotic factors such as soil characteristics (Bruelheide 20 et al., 2018;Both et al., 2019). In many cases, biodiversity increases ecosystem productivity and ecosystem stability (e.g., Tilman and Downing, 1994;Naeem and Li, 1997;Cardinale et al., 2007;García-Palacios et al., 2018), and biodiversity has also been shown to contribute to enhanced ecosystem functionality in highly stressed environments (e.g. Jucker and Coomes, 2012).
Other studies have also established correlations between tropical forest diversity and carbon storage and primary productivity (Cavanaugh et al., 2014;Poorter et al., 2015;Liang et al., 2016;Huang et al., 2018). 25 In addition to the absence of within-ecosystem diversity in conventional terrestrial biosphere models, plants of each PFT are also assumed to be homogeneous in size while, in contrast, most terrestrial ecosystems, particularly forests and woodlands, exhibit marked size-structure of individuals within plant canopies (Hutchings, 1997). This size-related heterogeneity is important because plant size strongly affects the amount of light, water, and nutrients individual plants within the canopy can access, which, in turn, affects their performance, dynamics and responses to climatological stress. It also allows representing the dy- 30 namics of pervasive human-driven degradation of forest ecosystems (Lewis et al., 2015;Haddad et al., 2015), which affects carbon stocks, structure and composition of forests that cannot be easily represented in highly aggregated models (Longo and Keller, 2019).
An alternative approach to simulating the dynamics of terrestrial ecosystems has been individual based vegetation models (Friend et al., 1997;Bugmann, 2001;Sato et al., 2007;Fischer et al., 2016;Maréchaux and Chave, 2017). Also known as forest 35 gap models, due to the importance of canopy gaps for the dynamics of closed canopy forests, these models simulate the birth, growth, and death of individual plants, thereby incorporating diversity and heterogeneity of the plant canopy mechanistically. In forest gap models, the ecosystem properties such as total carbon stocks, and net ecosystem productivity are emergent properties resulting from competition of limiting resources and the differential ability of plants to survive and be productive under a variety of micro-environments (e.g. gaps or the understory of a densely populated patch of old-growth forest). This approach 5 has two main advantages. First, gap models represent the dynamic changes in the ecosystem structure caused by disturbances such as tree fall, selective logging, and fires. These disturbances create new micro-environments that are significantly different from old-growth vegetation areas, and allow plants with different life strategies (for example, shade-intolerants) to co-exist in the landscape. Second, because individual trees are represented in the model, the results can be directly compared with field measurements. Gap models have various degrees of complexity, with some models being able to represent the interactions 10 between climate variability and gross primary productivity (Friend et al., 1997;Sato et al., 2007), as well as the impact of climate change in the ecosystem carbon balance (Fischer et al., 2016, and references therein). However, because the birth and death of individuals within a plant canopy are stochastic processes, multiple realizations of given model formulation are required to determine the long-term, large-scale dynamics of these models, which limits their applicability over large regions or global scales, and has precluded their use in Earth System Modeling studies. 15 The need to represent vegetation structure in terrestrial biosphere models, without the computational burden required to simulate every tree at regional and global scales, led to the development of cohort-based models . In the cohort-based approach, individual trees are grouped according to their size (e.g. height or diameter at breast height); functional groups, which can be defined along trait axes (e.g. Reich et al., 1997;Wright et al., 2004;Fortunel et al., 2012), and microenvironment conditions (e.g. whether plants are living in a gap, recently burned fragment, or in a patch of old-growth forest). Over 20 the past two decades, many cohort-based models have emerged, including the Ecosystem Demography Model (ED, Moorcroft et al., 2001;Hurtt et al., 2002;Albani et al., 2006;Medvigy et al., 2009); the Lund-Potsdam-Jena General Ecosystem Simulator (LPJ-GUESS, Smith et al., 2001;Ahlström et al., 2012;Lindeskog et al., 2013); and the Land Model version 3 with Perfect Plasticity Approximation (LM3-PPA, Weng et al., 2015), and the Functionally-Assembled Terrestrial Ecosystem Simulator (FATES, Fisher et al., 2015;Huang et al., 2019). Because these models also represent functional diversity and heterogeneity 25 of micro-environments, the ecosystem's structure, diversity and functioning also emerge from the interactions between plants with different life strategies under different resource availability, albeit at a lesser extent than individual-based models .
The Ecosystem Demography Model (ED Moorcroft et al., 2001) is a cohort-based model. Through this approach, it addresses the need to incorporate heterogeneity into models of the long-term, large scale response of terrestrial ecosystems to changes 30 in climate and other environmental forcings within a deterministic modeling framework. The size and age-structured partial differential equations that describe the plant community are derived from individual-level properties, but are properly scaled to account for the spatially-localized nature of interactions within plant canopies. The model was later extended by Hurtt et al. (2002) and Albani et al. (2006) to incorporate multiple forms of disturbance including land-clearing, land-abandonment, and forest harvesting. An important difference between ED and most DGVMs is that in ED, PFTs are defined not simply based 35 on their biogeographic ranges, but also represent diversity in plant life-history strategies within any given ecosystem. These different PFTs represent a suite of physiological, morphological, and life-history traits that mechanistically represent the ways different kinds of plants utilize resources (Fisher et al., 2010). The original ED model formulation was an off-line ecosystem model describing the coupled carbon and water fluxes of a heterogeneous tropical forest ecosystem (Moorcroft et al., 2001). Subsequently Medvigy et al. (2009) applied a similar 5 approach to develop the Ecosystem Demography model version 2 (ED-2) that describes coupled carbon, water and energy fluxes of the land surface. Since then, the ED-2 model has been continuously developed to improve several aspects of the model (see Supplement S1 for further information): (1) the conservation and thermodynamic representation of energy, water, and carbon cycles of the ecosystems; (2) the representation of several components of the energy, water, and carbon cycles, including the canopy radiative transfer, aerodynamic conductances and eddy fluxes, and leaf physiology (photosynthesis); (3) 10 the structure of the code, including efficient data storage, code parallelization, and version control and code availability. ED-2 has been used in many studies including offline simulations (e.g. Medvigy et al., 2009;Antonarakis et al., 2011;Kim et al., 2012;Zhang et al., 2015;Castanho et al., 2016;Levine et al., 2016), or interactively with a regional atmospheric model (e.g. Knox et al., 2015;Swann et al., 2015).
In this paper, we describe in detail the biophysical, physiological, ecological and biogeochemical formulation of the most 15 recent version of the ED-2 model (ED-2.2), focusing in particular on the model's formulation of the fast time-scale dynamics of the heterogeneous plant canopy that occur at sub-daily timescales. While many parameterizations and sub-models in ED-2.2 are based on approaches that are also used in other DVGMs, their implementation in ED-2.2 has some critical differences from other ecosystem models and also previous versions of ED: (1) In ED-2.2, the fundamental budget equations use energy and total mass as the main prognostic variables; because we use equations that directly track the time changes of the properties 20 we seek to conserve, we can assess the model conservation of such properties with fewer assumptions . (2) In ED-2.2, all thermodynamic properties are scalable with mass, and the model is constructed such that when individual biomass changes due to growth and turnover, the thermodynamic properties are also updated to reflect changes in heat and water holding capacity.
(3) The water and energy budget equations for vegetation are solved at the individual level and the corresponding equations for environments shared by plants such as soils and canopy air space are solved for each micro-environment in the landscape, 25 and thus ecosystem-scale fluxes are emerging properties of the plant community. This approach allows the model to represent both the horizontal and vertical heterogeneity of environments of the plant communities. It also links the individual's ability to access resources such as light and water and accumulate carbon under a variety of micro-environments, which ultimately drives the long-term dynamics of growth, reproduction, and survivorship. In ED-2.2 the terrestrial ecosystem within a given region of interest is represented through a hierarchy of structures to capture the physical and biological heterogeneity in the ecosystem's properties (Fig. 1).

Static levels
Dynamic levels Static levels (grid, polygons, and sites) are assigned during the model initialization and remain constant throughout the simulation. Dynamic levels (patches and cohorts) may change during the simulation according to the dynamics of the ecosystem.
Physical Heterogeneity: The domain of interest is geographically divided into polygons. Within each polygon, the timevarying meteorological forcing above the plant canopy is assumed to be uniform. For example, a single polygon may be used to simulate the dynamics of an ecosystem in the neighborhood of an eddy flux tower, or alternatively, a polygon may represent the lower boundary condition within one horizontal grid-cell in an atmospheric model. Each polygon is sub-divided into one or more sites that are designed to represent landscape-scale variation in other abiotic properties, such as soil texture, 5 depth, elevation, slope, aspect, and topographic moisture index. Each site is defined as a fractional area within the polygon and represents all regions within the polygon that share similar time-invariant physical (abiotic) properties. Both polygons and sites are defined at the beginning of the simulation and are fixed in time, and no geographic information exists below the level of the polygon.
Biotic Heterogeneity: Within each site, horizontal, disturbance-related heterogeneity in the ecosystem at any given time t is characterized through a series of patches that are defined by the time elapsed since last disturbance (i.e. age, a) and the type of disturbance that generated them. Like sites, patches are not physically contiguous: each patch represents the collection 5 of canopy gap-sized (∼ 10 m) areas within the site that have a similar disturbance history, defined in terms of the type of disturbance experienced (represented by subscript q, q ∈ 1, 2, . . . , N Q ; list of indices available at Tab. 1) and time since the disturbance event occurred. The disturbance types accounted for in ED-2.2, and the possible transitions between different disturbance types, are shown in Fig. S1. The collection of gaps within each given site belonging to a polygon follows a probability distribution function α, which can be also thought of as the relative area within a site, that satisfies: Similarly, the plant community population is characterized by the number of plants per unit area (hereafter number density, n), and is further classified according to their plant functional type (PFT), represented by subscript f (f ∈ 1, 2, . . . , N F ; Tab. 1) and the type of gap (q). The number density distribution depends on the individuals' biomass characteristics (size, C), the age since last disturbance (a) and the time (t), and is expressed as n f q (C, a, t). Size is defined as a vector C = n −1 f q (C l ; C r ; C σ ; C h ; C n ) 15 (units: kg C plant −1 ) corresponding to biomass of leaves, fine roots, sapwood, heartwood, and non-structural storage (starch and sugars), respectively.
Following Moorcroft et al. (2001), Albani et al. (2006), and Medvigy and Moorcroft (2012), the fundamental partial differential equations that describe the dynamics of demographic density and probability distribution within each site in the size-and-age structured model are defined as (dependencies omitted in the equations for clarity): where m f is mortality rate, which may depend on the PFT, size, and the individual carbon balance; g f is the vector of the net growth rates for each carbon pool, which also may depend on the PFT, size, and carbon balance; ∇ C · is the divergent operator for the size vector; and λ q q is the transition matrix from gaps generated by previous disturbance q affected by new disturbance 25 of type q, which may depend on environmental conditions. Boundary conditions are shown in Supplement S2.
Equation (2) and Eq. (3) cannot be solved analytically except for the most trivial cases; therefore the age distribution is discretized into patches (subscript u, u ∈ 1, 2, . . . , N P ; Tab. 1) of similar age and same disturbance type, and the population size structure living in any given patch u is discretized into cohorts (subscript k, k ∈ 1, 2, . . . , N T ; Tab. 1) of similar size and Table 1. List of subscripts associated with ED-2.2 hierarchical levels. NT is the total number of cohorts, NG is the total number of soil (ground) layers, NS is the total number of temporary surface water/snowpack layers, and NC is the total number of canopy air space layers, currently only used to obtain properties related to canopy conductance. The complete list of subscripts is available at Tab. S1.

Subscript Description
Xc Canopy air space (single layer) same PFT (Fig. 1). Unlike polygons and sites, patches and cohorts are dynamic levels: changes in distribution (fractional area) of patches are driven by aging and disturbance rates, whereas changes in the distribution of cohorts in each patch are driven by growth, mortality, and recruitment (Fig. S2).
The environment perceived by each plant (e.g. incident light, temperature, vapor pressure deficit) varies across large scales as a consequence of changes in climate (macro-environment), but also varies at small scales (within the landscape; micro-5 environment) because of the horizontal and vertical position of each individual relative to other individuals in the plant community (e.g. Bazzaz, 1979) and the position of the local community in landscapes with complex terrains. Both macro-and micro-environmental conditions drive the net primary productivity of each individual, and ultimately determine growth, mortality, and recruitment rates for each individual. Likewise, they can also affect the disturbance rates: for example, during drought conditions (macro-environment) open canopy patches (micro-environment) may experience faster ground desiccation and con-10 sequently increase local fire risk. To account for the variability in micro-environments within the landscape and within local plant communities, in ED-2.2 the energy, water, and carbon dioxide cycles are solved separately for each patch, and within each patch, fluxes and storage associated with individual plants are solved for each cohort.
The ED-2.2 model represents processes that have inherently different time scales, therefore the model also has a hierarchy of time steps, in order to attain maximum computational efficiency (Table 2). Processes associated with the short-term dynamics 15 are presented in this manuscript. A summary of the phenological processes and those associated with longer term dynamics is presented in Supplement S3 (see also Moorcroft et al., 2001;Albani et al., 2006;Medvigy et al., 2009). Table 2. Time steps associated with processes resolved by ED-2.2. The thermodynamic sub-step is dynamic and it depends on the error evaluation of the integrator, but it cannot be longer than the biophysics step, which is defined by the user. Other steps are fixed as of ED-2.2.
Processes marked with are presented in the main paper. Other processes are briefly described in Supplements S3 and S4.  (The HDF Group, 2016). In addition, the Message Passing Interface (MPI) is highly recommended for regional simulations and is required for simulations coupled with the Brazilian Improvements on Regional Atmospheric Modeling System (BRAMS) atmospheric model (Knox 5 et al., 2015;Swann et al., 2015;Freitas et al., 2017). The source code can be also compiled with Shared Memory Processing (SMP) libraries, which enable parallel processing of thermodynamics and biophysics steps at the patch level, and thus allowing shorter simulation time.
Code design and parallel structure. ED-2.2 has been designed to be run in three different configurations; (1) As a stand-alone land-surface model over a small list of specified locations (sites), or (2) as a stand-alone land-surface model distributed over a regional grid, or (3) as coupled with the BRAMS atmospheric model as distributed over a regional grid (ED-BRAMS, Knox 5 et al., 2015;Swann et al., 2015). For regional stand-alone grids, the model partitions the grid into spatially contiguous tiles of polygons, which access the initial and boundary conditions and are integrated independently of each other, but write the results to a unified output file using collective input/output functions from HDF5. In the case of simulations dynamically coupled with the BRAMS model, polygons are defined to match each atmospheric grid cell.
Memory allocation. The code uses dynamic allocation of variables and extensive use of pointers to efficiently reduce the 10 amount of data transferred between routines. To reduce the output file size, polygon-, site-, patch-, and cohort-level variables are always written as long vectors, and auxiliary index vectors are used to map variables from higher hierarchical levels to lower hierarchical levels (for example, to which patch a cohort-level variable belongs).

Model inputs
Every ED-2.2 simulation requires an initial state for forest structure and composition (initial state), a description of soil char-15 acteristics (edaphic conditions), and a time-varying list of meteorological drivers (atmospheric conditions).
Initial state. To initialize a plant community from inventory data, one must have either the diameter at breast height of every individual or the stem density of different diameter size classes, along with plant functional type identification and location; in addition necromass from the litter layer, woody debris and soil organic carbon are needed. Alternatively, initial conditions can be obtained from airborne LiDAR measurements (Antonarakis et al., 2011(Antonarakis et al., , 2014 or a prescribed near bare ground condition 20 may be used for long-term spin up simulations. Previous simulations can be used as initial conditions as well. Edaphic conditions. The user must also provide soil characteristics such as total soil depth, total number of soil layers, the thickness of each layer, as well as soil texture, color and the bottom soil boundary condition (bedrock, reduced drainage, free drainage, or permanent water table). This flexibility allows the user to easily adjust the soil characteristics according to their regions of interest. Soil texture can be read from standard data sets (e.g. Tempel et al., 1996;Hengl et al., 2017) or provided 25 directly by the user. Soil layers, soil color and bottom boundary condition must be provided directly by the user as of ED-2.2.
In addition, simulations with multiple sites per polygon also need to provide the fractional areas of each site and the mean soil texture class, slope, aspect, elevation, and topographic moisture index of each site.
Atmospheric conditions. Meteorological conditions needed to drive ED-2.2 include temperature, specific humidity, CO 2 molar fraction, pressure of the air above the canopy, precipitation rate, incoming solar (shortwave) irradiance (radiation flux) 30 and incoming thermal (longwave) irradiance (Table 3), at a reference height that is at least a few meters above the canopy.
Sub-daily measurements (0.5-6 hours) are highly recommended so the model can properly simulate the diurnal cycle and interdiurnal variability. Meteorological drivers can be either at a single location (e.g. eddy covariance towers), or gridded meteorological drivers such as reanalysis (e.g. Dee et al., 2011;Gelaro et al., 2017) or bias-corrected products based on Table 3. Atmospheric boundary conditions driving the ED-2.2 model. Variable names and subscript follow a standard notation throughout the manuscript (Tables S1 and S2). Flux variables between two thermodynamic systems are defined by a dot and two indices separated by a comma, and they are positive when the net flux goes from the thermodynamic system represented by the first index to the one represented by the second index.
Downward near infrared irradiance, diffuse W m −2 reanalysis (e.g. Sheffield et al., 2006;Weedon et al., 2014). Whenever available, CO 2 must be provided at comparable temporal and spatial resolution as other meteorological drivers; otherwise, it is possible to provide spatially homogeneous, time-variant CO 2 , or constant CO 2 , although this may increase uncertainties in the model predictions (e.g. Wang et al., 2007). Alternatively, the meteorological forcing (including CO 2 ) may be provided directly by BRAMS Swann et al., 2015).
Plant functional types. The user must specify which plant functional types (PFTs) are allowed to occur in any given sim-5 ulation. ED-2.2 has a list of default PFTs, with parameters described in Tables S5-S6. Alternatively, the user can modify the parameters of existing PFTs or define new PFTs through an extensible markup language (XML) file, which is read during the model initialization.
3 Overview of enthalpy, water, and carbon dioxide cycles Here we present the fundamental equations that describe the biogeophysical and biogeochemical cycles. Because the environ-10 mental conditions are a function of the local plant community and resources are shared by the individuals, these cycles must be described at the patch level, and the response of the plant community can be aggregated to the polygon level once the cycles are resolved for each patch. In ED-2.2, patches do not exchange enthalpy, water, and carbon dioxide with other patches; thus patches are treated as independent systems. Throughout this section, we will only refer to the patch-and cohort-levels, and indices associated with patches, sites and polygons will be omitted for clarity.

Definition of the thermodynamic state
Each patch is defined by a thermodynamic envelope (Fig. 2), comprised of multiple thermodynamic systems: each soil layer (total number of layers N G ), each temporary surface water or snow layer (total number of layers N S ), leaves and branchwood 5 portion of each cohort (total number of cohorts N T ), and the canopy air space. For simplicity, roots are assumed to be in thermal equilibrium with the soil layers and have negligible heat capacity compared to the soil layers. Although patches do not exchange heat and mass with other patches, they are allowed to exchange heat and mass with the free air (i.e. the atmosphere above and outside of the air-space control-volume we deem as within canopy) and lose water and associated energy through surface and sub-surface runoff. We also assume that intensive variables such as pressure and temperature are uniform within 10 each thermodynamic system. Note that free air is not considered a thermodynamic system in ED-2 because the thermodynamic state is determined directly from the boundary conditions, and thus external to the model.
where V is the volume of the thermodynamic system and p is the ambient pressure. The components on the right-hand side of Eq. (4) and Eq. (5) depend on the thermodynamic system, and will be presented in detail in the following sections. Net heat fluxes (Q) represent changes in enthalpy that are not associated with mass exchange (radiative and sensible heat fluxes), whereas the remaining enthalpy fluxes (Ḣ) correspond to changes in heat capacity due to addition or removal of mass from 20 each thermodynamic system.
The merit of solving the changes in enthalpy over internal energy is that changes in enthalpy are equivalent to the net energy flux when pressure is constant (Eq. 4). Pressure is commonly included in atmospheric measurements, making it easy to track changes in enthalpy not related to energy fluxes. In reality, the only thermodynamic system where the distinction between internal energy and enthalpy matters is the canopy air space. Work associated with thermal expansion of solids and 25 liquids is several orders of magnitude smaller than heat (Dufour and van Mieghem, 1975), and changes in pressure contribute significantly less to enthalpy because the specific volume of solids and liquids are comparatively small. Likewise, enthalpy fluxes that do not involve gas phase (e.g. canopy dripping and runoff) are nearly indistinguishable from internal energy flux, whereas differences between enthalpy and internal energy fluxes are significant when gas phase is involved (e.g. transpiration and eddy flux). For simplicity, from this point on we will use the term enthalpy whenever internal energy is indistinguishable 30 from enthalpy. The complete list of state variables in ED-2.2 is shown in Table 4. Table 4. List of state variables solved in ED-2.2. Unless otherwise noted, the reference equation is the ordinary differential equation that defines the rate of change of the thermodynamic state. The list of fluxes that describe the thermodynamic state is presented in Table 5. For a complete list of subscripts and variables used in this manuscript, refer to Tables S1-S2.

State variable Description Units Budget equation
cc CO2 mixing ratio -canopy air space µmolC mol −1 (23) hc Specific enthalpy -canopy air space Hs j Enthalpy -temporary surface water layer j J m −2 (4) wc Specific humidity -canopy air space Ws j Water mass -TSW layer j kgW m −2 (5) a Budget fluxes are in units of area, and the state variable is updated following the conversion described in Section 3.2.1.
b Canopy air space pressure is not solved using ordinary differential equations, but based on the atmospheric pressure from the meteorological forcing. c Canopy air space depth is determined from vegetation characteristics, not from an ordinary differential equation.
Variations in enthalpy are more important than their actual values, but they must be consistently defined relative to a predetermined and known thermodynamic state, at which we define enthalpy to be zero. For any material other than water, enthalpy is defined as zero when the material temperature is 0K; for water, enthalpy is defined as zero when water is at 0K and completely frozen. The general definitions of enthalpy and internal energy states used in all thermodynamic systems in ED-2.2 are described in Supplement S5. In ED-2.2, enthalpy is used as the prognostic variable because these are directly 5 and linearly related to the governing ordinary differential equation (Eq. 4). Temperature is diagnostically obtained based on the heat capacity of each thermodynamic system, and the heat capacities of different thermodynamic systems are defined in Supplement S6.

Heat (Q), water (Ẇ ), and enthalpy (Ḣ) fluxes
The enthalpy and water cycles for each patch in ED-2.2 are summarized in Fig. 2, and these cycles are solved every thermody-10 namic sub-step (∆t Thermo ), using a fourth-order Runge-Kutta integrator with dynamic time steps to maintain the error within prescribed tolerance. For all fluxes and variables, we follow the subscript notation described in Table S1, and denote flux variables with a dot and two indices separated by a comma, denoting the systems impacted by the flux. For any variable X with that has flux between a system m and a system n, we assume thatẊ m,n > 0 when the net flux goes from system m to system n, and thatẊ m,n = −Ẋ n,m . Arrows in Fig. 2 Figure 2. Schematic of the fluxes that are solved in ED-2.2 for a single patch (thermodynamic envelope). In this example, the patch has NT cohorts, NG soil layers and NS = 1 temporary surface water. Both NG and the maximum NS are specified by the user; NT is dynamically defined by ED-2.2.. Letters near the arrows are the subscripts associated with fluxes, although the flux variable has been omitted here for clarity. Solid red arrows represent heat flux with no exchange of mass, and dashed yellow arrows represent exchange of mass and associated enthalpy. Arrows that point to a single direction represent fluxes that can only go in one (non-negative) direction, and arrows pointing to both directions represent fluxes that can be positive, negative, or zero.
is provided in Table 5, and a complete list of variables is provided in In ED-2.2, the soil characteristics (number of soil layers, thickness of each soil layer and total soil depth, soil texture, soil color) are defined by the user, and assumed constant throughout the simulation. Within each patch, each soil layer (comprised by soil matrix and soil water in each layer) is considered a separate thermodynamic system, with the main size dimension being the layer thickness ∆ z gj , with j = 1 being the deepest soil layer, and j = N G being the topmost soil layer. Typically, the top Table 5. List of energy, water, and carbon dioxide fluxes that define the thermodynamic state in ED-2.2, along with sections and equation that define them. Fluxes are denoted by a dotted letter, and two subscripts separated with a comma:Ẋm,n. Positive fluxes go from thermodynamic system m to thermodynamic system n; negative fluxes go in the opposite direction. Acronyms in the description column: canopy air space (CAS); temporary surface water (TSW). The complete list of subscripts and variables used in this manuscript is available in Tables S1-S2, and the list of state variables is shown in layer thickness is set to ∆z g N G = 0.02 m, which is a compromise between computational efficiency and ability to represent the stronger gradients near the surface, and layers with increasing thickness (∆ z gj ) are added for the entire rooting zone.
The thermodynamic state is defined in terms of the soil volume: the bulk specific enthalpy H gj (J m −3 ) and volumetric soil water content ϑ gj (m 3 W m −3 ), which can be related to Eq. (4)-(5) by defining H gj = H gj ∆z gj and W gj = ρ ·ϑ gj ·∆z gj , where ρ is the density of liquid water (Table S3). Soil net fluxes for any layer j are defined as: Net sensible heat flux between consecutive layers (4.1) Ground-CAS sensible heat (4.5.2 and 4.5.3) , H gj Net enthalpy flux due to water flux Water percolation between consecutive layers (4.1) Gnd. Evaporation Water percolation between consecutive layers (4.1) Gnd. Evaporation , where δ gj g j is the Kronecker delta for comparing two soil layers g j and g j (1 if g j = g j ; 0 otherwise), CAS is the canopy air space, and subscript o denotes the loss through runoff. References in parentheses underneath the terms correspond to the sections in which each term is presented in detail. In the equations above, we assumeQ g0,g1 to be zero (bottom boundary condition in thermal equilibrium), and (Ḣ g1,g0 = −Ḣ g0,g1 ;Ẇ g1,g0 = −Ẇ g0,g1 ) to be sub-surface runoff fluxes (see section 10 4.1). In addition, (Q g N G ,g N G +1 ;Ḣ g N G ,g N G +1 ;Ẇ g N G ,g N G +1 ) are equivalent to (Q g N G ,s1 ;Ḣ g N G ,s1 ;Ẇ g N G ,s1 ), which are the fluxes between the topmost soil layer and the bottommost temporary surface water layer (see also section 3.2.2).

Temporary surface water (TSW)
Temporary surface water (TSW) exists whenever water falls to the ground, or dew or frost develops on the ground. The layer will be maintained only when the amount of water that reaches the ground exceeds the water holding capacity of the top soil 15 layer (a function of the soil porosity), or when precipitation falls as snow. The maximum number of temporary surface water layers N max S is defined by the user, but the actual number of layers N S and the thickness of each layer depends on the total mass and the water phase, following Walko et al. (2000). When the layer is in liquid phase, only one layer (N S = 1) is maintained. If a snowpack develops, the temporary surface water can be divided into several layers (subscript j, with j = 1 being the deepest soil layer, and j = N S being the topmost TSW layer). Net TSW fluxes are defined as: Net sensible heat flux between consecutive layers (4.1) +Q a,sj Ḣ sj Net enthalpy flux due to water flux =Ḣ sj−1,sj −Ḣ sj ,sj+1 Water percolation between consecutive layers (4.1) Surface water evaporation (4. 5.2 and 4.5.3) , Water percolation between consecutive layers (4.1) Surface water evaporation (4. 5.2 and 4.5.3) , where δ sj s j is the Kronecker delta for comparing two TSW layers s j and s j (1 if s j = s j ; 0 otherwise), CAS is the canopy air space, and subscript o denotes loss from the thermodynamic envelope through runoff. Terms are described in detail in the sections shown underneath each term. Similarly to the soil fluxes (Section 3.2.1), we assume that (Q s0,s1 ;Ḣ s0,s1 ;Ẇ s0,s1 ) is , the fluxes between the topmost soil layer and the bottommost TSW layer. When solving Eq. (9)-(11) for layer s N S , we assume the termsQ sj ,sj+1 ,Ḣ sj ,sj+1 andẆ sj ,sj+1 to be all zero, as layer N S + 1 does not exist.
In the case of liquid TSW, the layer thickness of the single layer is defined as ∆z s1 = ρ −1 W s1 , where ρ is the density of liquid water (Table S3). In the case of snowpack development, the snow density and the layer thickness of the TSW are solved 10 as described in Supplement S7. The thickness of each layer of snow (∆z sj ) is defined using the same algorithm as LEAF-2 (Walko et al., 2000) and described in Supplement S7.

Vegetation
In ED-2.2, vegetation is solved as an independent thermodynamic system only if the cohort is sufficiently large. The minimum size is an adjustable parameter and the typical minimum heat capacity solved by ED-2.2 is on the order of 10 J m −2 K −1 and 15 total area index of 0.005 m 2 leaf+wood m −2 . Cohorts smaller than this are excluded from all energy and water cycle calculations and assumed to be in thermal equilibrium with canopy air space. The net fluxes of heat, enthalpy and water each cohort k that can be resolved are: Each term is described in detail in the sections shown underneath each term on the right-hand side of Eq. (12)-(14).

Canopy air space (CAS)
The canopy air space is a gas, therefore extensive properties akin to the other thermodynamic systems are not intuitive because total mass and total volume cannot be directly compared to observations. Therefore, all prognostic and diagnostic variables are 5 solved in the intensive form. Total enthalpy H c and total water mass W c of the canopy air space can be written in terms of air density ρ c and the equivalent depth of the canopy air space z c as: 10 where BA t k (cm 2 ) and z t k (m) are the basal area and the height of cohort k, respectively; and N T (canopy) is the number of cohorts that are in the canopy, and we assume that cohorts are ordered from tallest to shortest. In case the canopy is open, N T (canopy) is the total number of cohorts, and a minimum value of 5m is imposed when vegetation is absent or too short, to prevent numerical instabilities. Because the equivalent canopy depth depends only on the cohort size, z c is updated at the cohort dynamics step (∆t CD , Table 2). If we substitute Eq. (15) and Eq. (16) into Eq. (4) and Eq. (5), respectively, and assume 15 that changes in density over short time steps are much smaller than changes in enthalpy or humidity, and then we obtain the following equations for the canopy air space budget: where   .
Unlike in the other thermodynamic systems (soil, temporary surface water, and vegetation), the net enthalpy flux of the canopy air space is not exclusively due to associated water flux: the eddy flux between the free air and the canopy air space (Ḣ a,c ) includes both water transport and flux associated with mixing of air with different temperatures, and thus enthalpy, 5 between canopy air space and free air.
In addition, we must also track the canopy-air-space pressure p c . In ED-2.2, CAS pressure is not solved through a differential equation: instead p c is updated whenever the meteorological forcing is updated, using the ideal gas law and hydrostatic equilibrium following the method described in Supplement S8. The rate of change of canopy air pressure is then applied in Eq. (18). Likewise, CAS density (ρ c ) is updated at the end of each thermodynamic step to ensure that the CAS conforms to the 10 ideal gas law.

Carbon dioxide cycle
In ED-2.2, the carbon dioxide cycle is a subset of the full carbon cycle, which is shown in Fig. 3. The canopy air space is the only thermodynamic system with CO 2 storage that is solved by ED-2.2; nonetheless, we assume that the contribution of CO 2 to density and heat capacity of the canopy air space is negligible, hence only the molar CO 2 mixing ratio c c (mol C mol −1 ) is 15 traced.
The change in CO 2 storage in the canopy air space is determined by the following differential equation: Growth and maintenance Respiration (4.7) where M d and M C are the molar masses of dry air and carbon, respectively, used to convert mass to molar fraction (1 mol C = 20 1 mol CO2 ). The terms on the right-hand side of Eq. (24) are described in detail in the sections displayed underneath each term. The net leaf-CAS flux (Ċ l k ,c ) for any cohort k is positive when leaf respiration exceeds photosynthetic assimilation.
The heterotrophic respiration is based on a simplified implementation of the CENTURY model (Bolker et al., 1998) that In addition to canopy air space, we also define a virtual cohort pool of carbon corresponding to the accumulated carbon balance (C ∆ k ). The accumulated carbon balance links short-term carbon cycle components such as photosynthesis and respi-5 ration with long-term dynamics that depend on carbon balance such as such as carbon allocation to growth and reproduction, and mortality (Long-term dynamics described in Supplement S3). The accumulated carbon balance is defined by the following equation: Storage turnover Respiration (4.7) −Ċ ∆ k ,c Growth and maintenance Respiration (4.7) −Ċ t k ,e1 Turnover of non-lignified litter (S4) Turnover of lignified litter (S4) , whereĊ t k ,e1 andĊ t k ,e2 are the individual carbon losses caused by leaf shedding and turnover of living tissues that become part of the litter (Ċ t k ,e1 ) and structural debris (Ċ t k ,e2 ). The transfer of carbon from plants to the soil carbon pools and between the soil carbon pools do not directly impact the carbon dioxide budget, but contribute to the long-term ecosystem carbon stock distribution and carbon balance. These components have been discussed in previous ED and ED2 publications (Moorcroft et al., 2001;Albani et al., 2006;Medvigy, 2006;Medvigy et al., 2009) and are summarized in Supplement S4.
4 Sub-models and parameterizations of terms of the general equations

Hydrology sub-model and ground energy exchange
The ground model encompasses heat, enthalpy, and water fluxes between adjacent layers of soil and temporary surface water, 5 as well as losses of water and enthalpy due to surface runoff and drainage. Fluxes between adjacent layers are positive when they are upwards, and runoff and drainage fluxes are positive or zero.
Sensible heat flux between two adjacent soil or temporary surface water layers j − 1 and j are determined based on thermal conductivity Υ Q and temperature gradient (Bonan, 2008), with an additional term for temporary surface water to scale the flux when the temporary surface water covers only a fraction f TSW of the ground: where the operator is the log-linear interpolation from the mid-point height of layers j − 1 and j to the height at the interface. The bottom boundary condition of Eq. (26) is ∂T ∂z g0,g1 ≡ 0. The interface between the top soil layer and the first temporary surface water (Q g N G ,s1 ) is found by applying Eq. (27) with T s0 ; Υ Qs 0 ; ∆z s0 = T g N G ; Υ Qg N G ; ∆z g N G . Soil 15 thermal conductivity depends on soil moisture and texture properties, and the parameterization is described in Supplement S9.
Both the fraction of ground covered by the temporary surface water and the thermal conductivity of the temporary surface water are described in Supplement S10.
Ground water exchange between layers occurs only if water is in liquid phase. The water flux between soil layers g j−1 and g j , j ∈ {2, 3, . . . , N G } is determined from Darcy's law (Bonan, 2008): where Ψ is the soil matric potential and Υ Ψ is the hydraulic conductivity, both defined after Brooks and Corey (1964), with an additional correction term applied to hydraulic conductivity to reduce conductivity in case the soil is partially or completely frozen (Supplement S9). The bottom boundary condition for soil matric potential gradient is ∂Ψ ∂z g0,g1 ≡ 0. The term dzg dz in Eq. (28) is the flux due to gravity, and it is 1 for all layers except the bottom boundary condition, which 25 depends on the sub-surface drainage. Sub-surface drainage at the bottom boundary depends on the type of drainage, and is determined using a slight modification of Eq. (28). Let ð be an angle-like parameter that controls the drainage beneath the lowest level. Because we assume zero gradient in soil matric potential between the lowest layer and the boundary condition, the sub-surface drainage flux (Ẇ g1,g0 ) becomes: Special cases of Eq. (29) are the zero-flow conditions (ð = 0) and free drainage (ð = π 2 ). For the temporary surface water, water flux between layers through percolation is calculated similarly to LEAF-2 (Walko et al., 2000). Liquid water in excess of 10 % is in principle free to percolate to the layer below, although the maximum percolation of the first surface water layer is limited by the amount of pore space available at the top ground layer: Surface runoff of liquid water is simulated using a simple extinction function, applied only at the top most temporary surface water layer: where t Runoff is a user-defined e-folding decay time, usually on the order of a few minutes to a few hours (Table S4).
In addition to the water fluxes due to sub-surface drainage, surface runoff and the transport of water between layers, we must account for the associated enthalpy fluxes. Enthalpy fluxes due to sub-surface drainage and surface runoff are defined based on the water flux and the temperature of the layers where water is lost, by applying the definition of enthalpy (Supplement S5): 15 where q is the specific heat of liquid water (Table S3), and T 0 is defined in Eq. (S53). The enthalpy flux between two adjacent layers is solved similarly, but it must account for the sign of the flux in order to determine the water temperature of the donor layer: where the subscript x j represents either soil (g j ) or temporary surface water (s j ).

Precipitation and vegetation dripping
In ED-2.2, precipitating water from rain and snow increases the water storage of the thermodynamic systems, as rainfall can be intercepted by the canopy, or reach the ground. This influx of water also affects the enthalpy storage due to the enthalpy associated with precipitation, although no heat exchange is directly associated with precipitation.
To determine the partitioning of total incoming precipitation (Ẇ ∞,a ) into interception by each cohort (Ẇ a,t k ) and direct 25 interception by the ground (throughfall,Ẇ a,s N S ), we use the fraction of open canopy (O) and the total plant area index of each cohort (Φ t k ): where Φ t k = Λ t k + Ω t k is the total plant area index, Λ t k and Ω t k being the leaf and wood area indices, both defined from PFTdependent allometric relations (Supplement S18); X t k is the crown area index of each cohort, also defined in Supplement S18.
Throughfall precipitation is always placed on the topmost temporary surface water layer. In case no temporary surface water 5 layer exists, a new layer is created, although it may be extinct in case all water is able to percolate down to the top soil layer.
Precipitation is a mass flux, but it also has an associated enthalpy flux (Ḣ ∞,a ) that must be partitioned and incorporated to the cohorts and temporary surface water. Similar to the water exchange between soil layers, the enthalpy flux associated with rainfall uses the definition of enthalpy (Supplement S5). Because precipitation temperature is seldom available in meteorological drivers (towers or gridded meteorological forcing data sets), we assume that precipitation temperature is closely associated 10 with the free-air temperature (T a ), and we use T a to determine whether the precipitation falls as rain, snow, or a mix of both.
Importantly, the use of free-air temperature partly accounts for the thermal difference between precipitation temperature and the temperature of intercepted surfaces. Rain is only allowed when T a is above the water triple point (T 3 = 273.16 K); in this case, the rain temperature is always assumed to be at T a . Pure snow occurs when the free-air temperature is below T 3 , and likewise snow temperature is assumed to be T a . When free air temperature is only slightly above T 3 , a mix of rain and snow 15 occurs, with the rain temperature assumed to be T a and snow temperature assumed to be T 3 : where (q i ; q ) are the specific heats of ice and liquid, respectively, and T 0 is temperature at which supercooled water would have enthalpy equal to zero (Eq. S53). The fraction of precipitation that falls as rain a is based on the Jin et al. (1999) parameterization, slightly modified to make the function continuous: The enthalpy flux associated with precipitation is then partitioned into canopy interception (Ḣ a,t k ) and throughfall (Ḣ a,s N S ) using the same scaling factor as in Eq. (37) and Eq. (36): 25 Leaves and branches can accumulate only a finite amount of water on their surfaces, proportional to their total area. When incoming precipitation rates are too high (or more rarely when dew or frost formation is excessive), any water amount that exceeds the holding capacity is lost to the ground as canopy dripping. Similarly to incoming precipitation, the excess water lost through dripping also has an associated enthalpy that must be taken into account, although dripping has no associated heat flux. The canopy dripping fluxes of water (Ẇ t k ,s N S ) and the associated enthalpy (Ḣ t k ,s N S ) are defined such that the leaves and branches lose the excess water within one time step: where t k is the liquid fraction of surface water on top of cohort k andŵ max is the cohort holding capacity, which is an adjustable parameter (Table S4) but typically is of the order of 0.05 − 0.40 kg W m −2 Leaf+Wood (Wohlfahrt et al., 2006).

Radiation model
The radiation budget is solved using a multi-layer version of the two-stream model (Sellers, 1985;Liou, 2002;Medvigy, 2006) 10 applied to three broad spectral bands: photosynthetically active radiation (PAR, wave lengths between 0.4 and 0.7 µm), near infrared radiation (NIR, wave lengths between 0.7 and 3.0 µm) and thermal infrared radiation (TIR, wave lengths between 3.0 and 15 µm).

Canopy radiation profile
For each spectral band m, the canopy radiation scheme assumes that each cohort corresponds to one layer of vegetation within 15 the canopy, and within each layer the optical and thermal properties are assumed constant. For all bands, the top boundary condition for each band is provided by the meteorological forcing (Table 3). In the cases of PAR (m = 1) and NIR (m = 2), the downward irradiance is comprised of a beam (direct) and isotropic (diffuse) components, whereas TIR (m = 3) is assumed to be all diffuse. Direct irradiance that is intercepted by the cohorts can be either back-scattered or forward-scattered as diffuse radiation, and direct radiation reflected by the ground is assumed to be entirely diffuse. 20 Following Sellers (1985), the extinction of downward direct irradiance and the two-stream model for hemispheric diffuse irradiance for each of the spectral bands (m = 1, 2, 3) is given by:    (Table 3); Z is the Sun's zenith angle; Q mk ,Q ⇓ mk ,Q ⇑ mk are the downward direct, downward diffuse, and upward diffuse irradiances (Eq. 45-47);Φ k is the effective plant area index (Eq. S100); (ς mk ; β mk ) are the scattering coefficients and the backscattering fraction for diffuse irradiance (Eq. S101,S102); ς mk ; β mk are the scattering coefficients and the backscattering fraction for direct irradiance (Eq. S104,S105);Q mk is the black-body irradiance (Eq. 48); andQ a,t k is the net absorbed irradiance (Eq. 49).
where index k ∈ {1, 2, . . . , N T } corresponds to each cohort k or its lower interface (Fig. 4); interface N T + 1 is immediately above the tallest cohort;Q mk is the downward direct irradiance incident at interface k; (Q ⇓ mk andQ ⇑ mk ) are the downward and upward (hemispheric) diffuse irradiances incident at interface k; ς mk is the scattering coefficient, and thus (1 − ς mk ) is the absorptivity; β mk and β mk are the backscattered fraction of scattered direct and diffuse irradiances, respectively;Φ is the effective cumulative plant area index, assumed zero at the top of each layer, and increasing downwards (Φ k is the total for 5 layer k); µ k and µ k are the inverse of the optical depth per unit of effective plant area index for direct and diffuse radiation, respectively; andQ mk is the irradiance emitted by a black body at the same temperature as the cohort (T t k ).
Equations (45)-(47) simplify for each spectral band. First,Q m=3 k ≡ 0, because we assume that all incoming TIR irradiance is diffuse. Likewise, the black-body emissionQ mk =Q 1k = 0 for the PAR (m = 1) and NIR (m = 2) bands, because thermal emission is negligible at these wave lengths. The black-body emission for the TIR band is defined as where σ SB is the Stefan-Boltzmann constant (Table S3). Note that for emission of TIR radiation (45-47), we assume that emissivity is the same as absorptivity (Kirchhoff's law; Liou, 2002), hence the 1 − ς term.
The effective plant area indexΦ k is the total area (leaves and branches) that is corrected to account for that leaves are not uniformly distributed in the layer. It is defined asΦ k = Ω k + f Clump k Λ k , where f Clump k is the PFT-dependent clumping index (Chen and Black, 1992, default values in Tables S5-S6), Λ t k is the leaf area index and Ω t k is the wood area index.Φ is assumed zero at the top of each layer, increasing downwards.
The optical properties of the leaf layers -optical depth and scattering parameters for direct and diffuse radiation for each 5 of the three spectral bands -are assumed constant within each layer. These properties are determined from PFT-dependent characteristics such as mean orientation factor, spectral band-dependent reflectivity, transmissivity, and emissivity (Supplement S11). Because the properties are constant within each layer, it is possible to analytically solve the full profile of both direct and diffuse radiation, using the solver described in Supplement S12.
Once the profiles ofQ mk ,Q ⇓ mk andQ ⇑ mk are determined, we obtain the irradiance that is absorbed by each cohortQ a,t k : This term is then used in the enthalpy budget of each cohort (Eq. 4 and Eq. 12).

Ground radiation
The ground radiation sub-model determines the irradiance emitted by the ground surface, and the profile of irradiance through the temporary surface water layers and top soil layer. Note that the ground radiation and the canopy radiation model are inter-15 dependent: the incoming radiation at the top ground layer is determined from the canopy radiation model, and the ground scattering coefficient (ς m0 , see Supplement S11) is needed for the canopy-radiation bottom boundary condition (Supplement S12).
However, since the scattering coefficient does not depend on the total incoming radiation, the irradiance profile can be solved for a standardized amount of incoming radiation, and once the downward radiation at the bottom of the canopy has been calculated, the absorbed irradiance for each layer can be scaled appropriately. 20 Black-body emission from the ground (Q m0 ) is calculated as an area-weighted average of the emissivities of exposed soil and temporary surface water: where (1 − ς 3g ) and (1 − ς 3s ), are, respectively, the thermal-infrared emissivities of the top soil layer and the temporary surface water (Table S4), and f TSW is the fraction of ground covered by temporary surface water. In ED-2.2, the soil and snow scattering 25 coefficients for the TIR band are assumed constant, following Walko et al. (2000).
Once the irradiance profile for the canopy is determined from Eq. (45)-(47), the irradiance absorbed by each temporary surface water layer (j ∈ {1, 2, . . . , N S }) is calculated by integrating the transmissivity profile for each layer, starting from the top layer: where µ s is the inverse of the optical depth of temporary surface water.
The irradiance absorbed by the ground is a combination of irradiance of exposed soil and irradiance that is transmitted through all temporary surface water layers, and the net absorption of longwave radiation:

Surface Layer Model
The surface layer model determines the fluxes of enthalpy, water, and carbon dioxide between the canopy air space and the free air above. It is based on the Monin-Obukhov similarity theory (Monin and Obukhov, 1954;Foken, 2006), which has been widely used by biosphere-atmosphere models representing a variety of biomes (e.g. Walko et al., 2000;Best et al., 2011;10 Oleson et al., 2013), although this is often an extrapolation of the theory that was not originally developed for heterogeneous vegetation, or tall vegetation (Foken, 2006).
In order to obtain the fluxes, we assume that the eddy diffusivity of buoyancy is the same as the diffusivity of enthalpy, water vapor, and CO 2 . This assumption allows us to define a single canopy conductance G c for the three variables, following the algorithm described in Supplement S14.1. We then obtain the following equations for fluxes between canopy air space and the 15 free atmosphere: whereh a is the equivalent enthalpy of air at reference height z a when the air is adiabatically moved to the top of the canopy 20 air space, using the definition of potential temperature: where p 0 is the reference pressure, R is the universal gas constant, q pd is the specific heat of dry air at constant pressure, and M d is the molar mass of dry air (Table S3). 25 Sensible heat flux between the free atmosphere and canopy air space (Q a,c ) can be derived from the definition of enthalpy and enthalpy flux (Eq. S50 and Eq. 54), although it is not directly applied to the energy balance in the canopy air space (Ḣ a,c is used instead).
4.5 Heat and water exchange between surfaces and canopy air space

Leaves and branches
Fluxes of sensible heat (Q t k ,c ) and water vapor (Ẇ t k ,c ) between the leaf surface and wood surface and the canopy air space follow the same principle of conductance and gradient that define the eddy fluxes between the free atmosphere and canopy air 10 space (Eq. 53;54). Throughout this section, we use subscripts λ k and β k to denote leaf and wood boundary layers of cohort k, respectively; the different subscripts are needed to differentiate fluxes coming from the leaves' intercellular space (e.g. transpiration, see also Section 4.6). Let G Qλ k (m s −1 ) and G W λ k (m s −1 ) be the conductances of heat and water between the leaf boundary layer of cohort k and the canopy air space, and G Qβ k and G W β k be the wood boundary layer counterparts. The surface sensible heat and surface water vapor fluxes are: where (q λ k ,c ;q β k ,c ;ẇ λ k ,c ;ẇ β k ,c ) are the leaf-surface and branch-surface heat and water fluxes by unit of leaf and branch area, respectively; the factors 2 and π in Eq. (60) means that sensible heat is exchanged on both sides of the leaves, and on the longitudinal area of the branches, which are assumed cylindrical. Intercepted water and dew and frost formation is allowed only on one side of the leaves, and an area equivalent to a one-sided flat plate for branches, and therefore only the leaf and wood area indices are used in Eq. (61). Canopy air space temperature, specific humidity, density, and specific heat, leaf temperature, and wood temperature are determined diagnostically. We also assume that surface temperature of leaves and branches to be the same as their internal temperatures (i.e. T Sfc l k ≡ T l k and T Sfc b k ≡ T b k ). Specific humidity at the leaf surface w Sfc l k = w Sat T Sfc l k , p c and branch surface w Sfc β k = w Sat T Sfc b k , p c are assumed to be the saturation specific humidity w Sat (Supplement S15).
Heat conductance for leaves and branches are based on the convective heat transfer, as described in Supplement S14.2.
Further description of the theory can be found in Monteith and Unsworth (2008, Section 10.1).

Temporary surface water and soil
Sensible heat and water fluxes between the temporary surface water and soil and the canopy air space are calculated similarly to leaves and branches. Surface conductance G Sfc is assumed to be the same for both heat and water, and also the same for soil 5 and temporary surface water: 10 Specific humidity for temporary surface water is computed exactly as leaves and branches, w s N S = w Sat T s N S , p c (Supplement S15). For soils the specific humidity also accounts for the soil moisture and the sign of the flux, using a method similar to Avissar and Mahrer (1988): 15 where g is the gravity acceleration, M w is the water molar mass, and R is the universal gas constant (Table S3); T g N G , ϑ g N G and Ψ g N G are the temperature, soil moisture and soil matric potential of the topmost soil layer, respectively; and ϑ Fc and ϑ Re are the soil moisture at field capacity and the residual soil moisture, respectively. The exponential term in Eq. 70 corresponds to the soil pore relative humidity derived from the Kelvin equation (Philip, 1957), and s g is the soil wetness function, which takes a similar functional form as the relative humidity term from Noilhan and Planton (1989) and the β term from Lee and 20 Pielke (1992). The total resistance between the surface and the canopy air space is a combination of the resistance if the surface was bare, plus the resistance due to the vegetation, as described in Supplement S14.3.

Enthalpy flux due to evaporation and condensation
Dew and frost are formed when water in the canopy air space condenses or freezes on any surface (leaves, branches, or ground); likewise, water that evaporates and ice that sublimates from these surfaces immediately become part of the canopy air space. 25 In terms of energy transfer, two processes occur, the phase change and the mass exchange, and both must be accounted for the enthalpy flux. Phase change depends on the specific latent heat of vaporization (l v ) and sublimation (l iv ), which are linear functions of temperature, based on Eq. (S48) and Eq. (S49): where l v3 and l iv3 are the specific latent heats of vaporization and sublimation at the water triple point (T 3 ), q pv is the specific heat of water vapor at constant pressure, and q i and q are the specific heats of ice and liquid water, respectively (Table S3). 5 The temperature for phase change must be the surface temperature because this is where the phase change occurs. In the most generic case, if a surface x at temperature T x with a liquid water fraction x , the total enthalpy flux between the surface and canopy air spaceḢ x,c associated with the water flux W x,c is: Enthalpy flux due to mass exchange Enthalpy flux due to phase change By using the definitions from Eq. (S54), Eq. (74) can be further simplified to: which is consistent with the exchange of pure water vapor and enthalpy between the thermodynamic systems. Eq. (75) is used

Leaf physiology
In ED-2.2, leaf physiology is modeled following Farquhar et al. (1980) and Collatz et al. (1991) for C 3 plants; and Collatz 15 et al. (1992) for C 4 plants, and the Leuning (1995) model for stomatal conductance. This sub-model ultimately determines the net leaf-level CO 2 uptake rate of each cohort k (Ȧ k , mol C m −2 Leaf s −1 ), controlled exclusively by the leaf environment, and the corresponding water loss through transpiration (Ė k , mol W m −2 Leaf s −1 ). The exchange of water and CO 2 between the leaf intercellular space and the canopy air space is mediated by the stomata and the leaf boundary layer, which imposes an additional resistance to fluxes of these substances. For simplicity, we assume that 20 the leaf boundary layer air has low storage capacity, and thus the fluxes of any substance (water or CO 2 ) entering and exiting the boundary layer must be the same. Fluxes of water and carbon between the leaf intercellular space and the canopy air space must overcome both the stomatal resistance and the boundary layer resistance, whereas sensible heat flux and water flux from leaf surface water must overcome the boundary layer resistance only (Fig. 5). The potential fluxes of CO 2 and water can be written as:  Cohort index k is omitted from the figure for clarity.

Leaf boundary layer
where G Xλ k and G Xl k (units m s −1 ) are the leaf boundary layer and stomatal conductances for element X (either water W or carbon C), respectively; c λ k and w λ k are the CO 2 mixing ratio and the specific humidity of the leaf boundary layer, 5 respectively; and c l k and w l k are the CO 2 and specific humidity of the leaf intercellular space, respectively. As stated in Eq. (78), we assume the leaf intercellular space to be at water vapor saturation. The leaf boundary-layer conductances are obtained following the algorithm shown in Supplement S14.2. The net CO 2 assimilation flux and stomatal conductances are described below.
From Farquhar et al. (1980), the net CO 2 assimilation flux is defined as: Oxygenation releases 0.5 mol CO2 for every mol O2 , hence the half multiplier, and it is related to carboxylation by means of the CO 2 compensation point Γ k (Lambers et al., 2008): 5 where c l k is the CO 2 mixing ratio in the leaf intercellular space. The CO 2 compensation point is determined after Collatz et al. (1991Collatz et al. ( , 1992: where o ⊕ is the reference O 2 mixing ratio (Table S3), and represents the ratio between the rates of carboxylase to oxygenase and is a function of temperature. The general form of the function describing the metabolic dependence upon temperature for 10 any variable x (including ) is: where x 15 is the value of variable x at temperature T 15 = 288.15K, and Q 10x is the parameter which describes temperature dependence (Table S4).

15
Carboxylase Oxygenase), photorespiration is nearly nonexistent in C 4 plants (Lambers et al., 2008), hence the assumption that Γ k is zero. For C 4 plants, the carboxylation rate under Ribulose-1,5-Biphosphate (RuBP) saturated conditions becomes the maximum capacity of Rubisco to perform the carboxylase function (V C k =V max C k ). For C 3 , this rate is unattainable even under RuBP-saturated conditions because carboxylation and oxygenation are mutually inhibitive reactions (Lambers et al., 2008). Therefore, the maximum attainable carboxylation (V C k =V RuBP C k ) is expressed by a modified Michaelis-Menten kinetics 20 equation: where is the effective Michaelis constant, and K C k and K O k are the Michaelis constants for carboxylation and oxygenation, respectively. Both K C k and K O k are dependent on temperature, following Eq. (84) (default parameters in Table S4), whereasV max C k follows a modified temperature-dependent function to account for the fast decline of 25 both productivity and respiration at low and high temperatures (Sellers et al., 1996;Moorcroft et al., 2001): where f Cold , f Hot , T Cold and T Hot are PFT-dependent, phenomenological parameters to reduce the function value at low and high temperatures (Tables S5-S6).
The original expression for the initial slope of the carboxylation rate under near-zero CO 2 (V InSl C k ) for C 4 plants by Collatz 5 et al. (1992) has been modified later (e.g. Foley et al., 1996) to explicitly includeV max C k ; this is the same expression used in ED-2.2: where k PEP represents the initial slope of the response curve to increasing CO 2 ; the default value in ED-2.2 (Table S4) is the same value used by Collatz et al. (1992). 10 From the total photosynthetically active irradiance absorbed by the cohortQ PAR:a,t k (Eq. 49), we define the photon flux that is absorbed by the leaf (q PAR k , mol m −2 Leaf s −1 ): where Ein is the average photon-specific energy in the PAR band (0.4−0.7 µm; Table S3). Even though a high fraction k of the absorbed irradiance is used to transport electrons needed by the light reactions of photosynthesis (Lambers et al., 2008), only 15 a fraction of the irradiance absorbed by the leaf is absorbed by the chlorophyll; in addition, the number electrons needed by each carboxylation and oxygenation reaction poses an additional restriction to the total carboxylation rate. The product of these three factors is combined into a single scaling factor for total absorbed PAR, the quantum yield ( k ), which is a PFT-dependent property in ED-2.2 (Tables S5-S6). The maximum carboxylation rate under light limitationV PAR C k is: 20 Carboxylation may also be limited by the export rate of starch and sucrose that is synthesized by triose phosphate, especially when CO 2 concentration is high combined with high irradiance, at low temperatures, or O 2 concentration is low (von Caemmerer, 2000;Lombardozzi et al., 2018). This limitation was not included in ED-2.2.
Day respiration comprises all leaf respiration terms that are not dependent on photosynthesis, and it is mostly due to mitochondrial respiration; it is currently represented as a function of the maximum carboxylation rate, following Foley et al. (1996): where f R is a PFT-dependent parameter (Tables S5-S6).
Stomatal conductance is controlled by plants and is a result of a trade-off between the amount of carbon that leaves can uptake and the amount of water that plants may lose. Leuning (1995) proposed a semi-empirical stomatal conductance expression for water based on these trade-offs: whereĜ ∅ W l k is the residual conductance when stomata are closed, M k is the slope of the stomatal conductance function, and 5 ∆w k is an empirical coefficient controlling conductance under severe leaf-level water deficit; all of them are PFT-dependent parameters (Tables S5-S6). From Cowan and Troughton (1971), stomatal conductance of CO 2 is estimated by the ratio f Gl between the diffusivities of water and CO 2 in the air (Table S4): Variables w l k ,V max C k ,Ṙ k , k , K O k , K C k , Γ k , and K ME k are functions of leaf temperature and canopy air space pressure, and 10 thus can be determined directly. In constrast, nine variables are unknown for each limitation case as well as for the case when the stomata are closed: The remaining unknown variables are determined numerically, following the algorithm described in Supplement S16. The stomatal conductance model by Leuning (1995) (Eq. 91) is regulated by leaf vapor pressure deficit, however, Eq. (76) and Eq. (77) do not account for soil moisture limitation on photosynthesis. To represent this additional effect, we define a 15 soil-moisture dependent scaling factor (f W l k , Supplement S17) to reduce productivity and transpiration as soil available water decreases. Because stomatal conductance cannot be zero, the scaling factor f W l k interpolates between the fully closed case and the solution without soil moisture limitation, yielding to the actual fluxes of CO 2 (Ċ l k ,c , kg C m −2 s −1 ) and water (Ẇ l k ,c , kg W m −2 s −1 ): where þ k is 1 if the PFT is hypostomatous or 2 if the PFT is amphistomatous or needleleaf. Alternatively, Xu et al. (2016) implemented a process-based plant hydraulics scheme that solves the soil-stem-leaf water flow in ED-2.2; details of this implementation are available in the referred paper.
For simplicity, we assume that the water content in the leaf intercellular space and the plant vascular system are constant, 25 therefore the amount of water lost by the intercellular space through transpiration always matches the amount of water absorbed by roots. Plants may extract water from all layers to which they have access, and the amount of water extracted from each layer is proportional to the available water in the layer relative to the total available water (W gj ): where W gj is defined following Supplement S17 and W g(N G +1) ≡ 0. The net water flux in the leaf intercellular space due to transpiration is assumed to be zero, however the associated net energy flux cannot be zero. Water enters the leaf intercellular 5 space as liquid water at the soil temperature, reaches thermal equilibrium with leaves, and is lost to the canopy air space as water vapor at the leaf temperature. Therefore, the enthalpy flux between the soil layers and the cohort is calculated similarly to Eq. (35), whereas the enthalpy flux between the leaf intercellular space and the canopy air space is solved similarly to Eq. (75):

Non-leaf autotrophic respiration
Respiration from fine roots is defined using a phenomenological function of temperature that has the same functional form as leaf respiration (Moorcroft et al., 2001). Because roots are allowed in multiple layers, and in ED-2.2 roots have a uniform distribution of mass throughout the profile, the total respiration (Ċ r k ,c : kg C m −2 s −1 ) is the integral of the contribution from each soil layer, weighted by the layer thickness: where r r k (s −1 ) is the PFT-dependent factor that describes the relative metabolic activity of fine roots at the reference temperature (15 • C (Tables S5-S6), and T is the same temperature-dependent function from Eq. (86); default parameters are listed in Tables S5-S6. Total storage respiration is a combination of two terms: a phenomenological term that represents the long-term turnover rate 20 of the accumulated storage pool (individual-basedṘ n k or flux-basedĊ n k ,c ), assumed constant (Medvigy et al., 2009), and a term related to the losses associated with the assimilated carbon for growth and maintenance of the living tissues (individual-basedṘ ∆ k or flux-basedĊ ∆ k ,c , Amthor, 1984). The latter is a strong function of the plant metabolic rate, which has strong daily variability hence is a function of the daily carbon balance: where (τ n k , τ ∆ k ) are the PFT-dependent decay rates associated with storage turnover and consumption for growth, respectively (Tables S5-S6); and C ∆ k (kg C m −2 ) is the total accumulated carbon from the previous day as defined in Eq. (25). The transport from non-structural storage and the accumulated carbon for maintenance, growth and, storage is summarized in Supplement S3.

Heterotrophic respiration
Heterotrophic respiration comes from the decomposition of carbon in the three soil/litter carbon pools. For each carbon pool e j ; j ∈ (1, 2, 3), we determine the maximum carbon loss based on the characteristic decay rate, which corresponds to the typical half-life for metabolic litter (e 1 ); structural litter (e 2 ); and slow soil organic matter (e 3 ) determined from Bolker et al. (1998): 5 where f he is the fraction of decay that is lost through respiration (Table S4), and by definition f he3 must be always one (slow soil carbon can only be lost through heterotrophic respiration); B e are the decay rates at optimal conditions, based on Bolker et al. (1998) (Table S4); T g20 and ϑ 20 are the average temperature and relative soil moisture of the top 0.2 m of soil; the relative soil moisture for each layer is defined as: 10 and E T (T g20 ) and E ϑ (ϑ 20 ) are functions that reduces the decomposition rate due to temperature or soil moisture under extreme conditions: where (f Cold ;T gCold ), (f Hot ;T gHot ), (f Dry ;ϑ Dry ) and (f Wet ;ϑ Wet ) are phenomenological parameters to decrease decomposition rates 15 at low and high temperatures, and dry and saturated soils, respectively (Table S4). The decay fraction from fast and structural soil carbon that is not lost through heterotrophic respiration is transported to the slow soil carbon (Supplement S4).

Conservation of energy, water, and carbon dioxide
The ED-2.2 simulations show a high degree of conservation of the total energy, water, and carbon (Fig. 6). In the example 20 simulation for one patch at Paracou, French Guiana (GYF), a tropical forest site, the accumulated deviation from perfect closure (residual) of the energy budget over 50 years (2,629,800 time steps) was 0.1% of the total enthalpy storage -sum of enthalpy stored at the canopy air space, cohorts, temporary surface water and soil layers (Fig. 6a) and 0.002% of the accumulated losses through eddy flux, the largest cumulative flux of enthalpy. Results for the water budget were even better, with maximum accumulated residuals of 0.04% of the total water stored in the ED-2.2 thermodynamic systems, or 0.0006% of 25 the total water input by precipitation (Fig. 6b), and the accumulated residual of carbon was 0.008% of the total carbon storage or 0.017% the total accumulated loss through eddy flux. The average absolute residual errors by time step, relative to the total   Figure 6. Example of (a) enthalpy, (b) water, and (c) carbon conservation assessment in ED-2.2, for a single-patch simulation at GYF for 50 years. Terms are presented as the cumulative contribution to the change storage. Total storage is the combination of canopy air space, cohorts, temporary surface water and soil layers in the case enthalpy and water, and canopy air space, cohorts, seed bank, and soil carbon pools in the case of carbon. Positive (negative) values mean accumulation (loss) by the combined storage pool over the time. Pressure change accounts for changes in enthalpy when pressure from the meteorological forcing is updated, and density change accounts for changes in mass to ensure the ideal gas law. Canopy air space (CAS) change and vegetation heat capacity (Veg Hcap) change reflect the addition/subtraction of carbon, water, and enthalpy due to the vegetation dynamics modifying the canopy air space depth and the total heat capacity of the vegetation due to biomass accumulation or loss. Storage change is the net gain or loss of total storage, and residual corresponds to the deviation from the perfect closure. Note that we present the y axis in cube root scale to improve visualization of the smallest terms. storage, ranged from 3.6 · 10 −11 (carbon) to 3.8 · 10 −10 (energy), and thus orders of magnitude less than the truncation error of single-precision numbers (1.2 · 10 −7 ) and the model tolerance for each time step (1.2 · 10 −5 ).
The conservation of energy and water of ED-2.2 also represents a substantial improvement from previous versions of the model. We carried out additional decadal-long simulations with ED-2.2 and two former versions of the model (ED-2.0.12 and ED-2.1) and the most similar configuration possible among versions, and found that cumulative residual of enthalpy relative 5 to eddy flux loss decreased from 15.2% (ED-2.0.12) or 5.7% (ED-2.1) to 6.1 · 10 −5 % (ED-2.2) (Fig. S3a-c). Similarly, the cumulative violation of perfect water budget closure, relative to total precipitation input, decreased from 3.4% (ED-2.0.12) or 1.1% (ED-2.1) to 1.2 · 10 −4 % (ED-2.2) (Fig. S3d-f).

Simulated ecosystem heterogeneity
Because ED-2.2 accounts for the vertical distribution of the plant community and the local heterogeneity of ecosystems, it is by the Princeton Global Meteorological Forcing (Sheffield et al., 2006, ;1969−2008, and with active fires (Supplement S3.4).
For the last 100 years, we also prescribed land use changes derived from Hurtt et al. (2006) and Soares-Filho et al. (2006).
The distribution of basal area binned by diameter at breast height (DBH) classes show high variability across the domain, and even within biome boundaries (Fig. 7). For example, larger trees (DBH ≥ 50 cm) are nearly absent outside the Amazon biome, with the exception of more humid regions such as the Atlantic Forest along the Brazilian coast, western Colombia, and 5 Panama (Fig. 7d,e). In contrast, in seasonally dry areas as the Brazilian cerrado, intermediate-sized trees (10 ≤ DBH < 50 cm) contribute the most to the basal area (e.g. areas near site BSB, Fig. 7b,c). Even within the Amazon ecoregion, basal area shows variability in the contribution of trees with different sizes, including the areas outside the arc of deforestation along the southern and eastern edges of the biome (Fig. 7). Similarly, the abundance of different plant functional groups shows great variability across the region, with dominance of grasses and early-successional tropical trees in deforested regions and in drier areas in 10 the Brazilian Cerrado, whereas late-succesional tropical trees dominating the tropical forests, albeit with lower dominance in parts of Central Amazonia (Fig. S4).
The variability of forest structural and functional composition observed in regional simulations emerge from both the competition among cohorts in the local microenvironment and the environmental controls on the disturbance regime. In Fig. 8 we present the impact of different disturbance regimes modulating the predicted ecosystem structure and composition for two vations, following the methodology described in Longo et al. (2018); we allowed fires to occur but for simplicity we did not prescribe land use change. After 500 years of simulation, the structure at the two sites are completely different, with large, late-successional trees dominating the canopy at GYF (Fig. 8a) and open areas with shorter, mostly early-successional trees dominating the landscape at BSB (Fig. 8b). For GYF, the structural and functional composition is achieved only after 200 years of simulation, whereas in BSB a dynamic steady state caused by the strong fire regime is achieved in about 100 years (Fig. S5).

5
At both sites, early sucessional trees dominate the canopy at recently disturbed areas (Fig. 8c,d) with late-successional (GYF) or mid-successional trees (BSB) increasing in size only at the older patches (> 30 years, Fig. 8c,d), and the variation of basal area as a function of age since last disturbance show great similarity at both sites (Fig. 8e). However, the disturbance regimes are markedly different: at GYF, fires never occurred and disturbance was driven exclusively by tree fall (prescribed at 1.11%, yr −1 ), whereas fires substantially increase the disturbance rates at BSB (average fire return interval of 19.3 years). 10 Consequently, old-growth patches (older than 100 years) are nonexistent at BSB and abundant at GYF (Fig. S5f). In addition, the high disturbance regime at BSB meant that large trees and late-sucessional trees (slow growers) failed to establish, but succeeded and maintained a stable population at GYF (Fig. S5).
The impacts of simulating structurally and functionally diverse ecosystems are also observed in the fluxes of energy, water, carbon, and momentum. For example, in Fig. 9 we show the monthly average fluxes from the last 40 years of simulation at GYF, along with the interannual variability of the fluxes aggregated to the polygon-level (hereafter polygon variability, error bars) and the interannual variability of the fluxes accounting for the patch probability (hereafter patch variability, colors in the background). The polygon-level variability can be thought as the variability attributable exclusively to climate variability, 5 whereas the patch variability also incorporates the impact of the structural heterogeneity in the variability. Most highly aggregated ("big-leaf") models characterize the polygon-level variability, but not the patch variability. However, in all cases, the patch variability far exceeded the polygon variability, indicating that structural variability is as important as the interannual variability in complex ecosystems. In the case of sensible heat, polygon variable was between 39 and 64% of the patch variability ( Fig. 9a). The polygon-to-patch variability ratio was similar for both friction velocity (19 − 39%) and water fluxes (17 − 44%) 10 ( Fig. 9b,c). In the case of gross primary productivity, the relevance of patch variability was even higher, with polygon-to-patch variability ratio ranging from 3.7% during the dry season to 17% during the wet season (Fig. 9d). Importantly, the broader range of fluxes across patches in the site can be entirely attributed to structural and functional diversity, because all patches were driven by the same meteorological forcing.

Conservation of biophysical and biogeochemical properties
As demonstrated in Section 5.1, it is possible to represent the long-term, large-scale dynamics of heterogeneous and functionally diverse plant canopy while still accurately conserving the fluxes of carbon, water and energy fluxes that occur the ecosystem. ED-2.2 exhibits excellent conservation of energy, water, and carbon dioxide even in multi-decadal scales. After 50 years of simulation, the accumulated residuals from perfect closure never exceeded 0.1% of the total energy, water, and carbon 20 stored in the pools resolved by the model (Fig. 6), which is significantly less than the error accepted in each time step (1%).
The model's excellent conservation of these three key properties is possible because the ordinary differential equations are written directly in terms of the variables that we sought to conserve, thus reducing the effects of non-linearities. A key feature that facilitates the model's high level of energy conservation is the use of enthalpy as the primary state variables within the model. This contrasts with most terrestrial biosphere models, which use temperature as their energy state variable (e.g. Best 25 et al., 2011;Oleson et al., 2013). By using enthalpy, the model can seamlessly incorporate energy storage changes caused by rapid changes in water content and consequently heat capacity. It also reduces errors near phase changes (freezing or melting), when changes in energy may not correspond to changes in temperature. Nonetheless, the residual errors in ED-2.2 are larger than the error of each time steps after integrating the model over multiple decades (Fig. 6), which suggests that the errors may have a systematic component that deserves further investigation. The main contribution to the remaining residual errors 30 in carbon, water, and energy fluxes comes from the linearization of the prognostic equations due to changes in density in the canopy air space (Eq. 18-19;23). The magnitude of these residuals would likely be further reduced by using the bulk enthalpy,  water content, and carbon dioxide content in the canopy air space as the state variables instead of the specific enthalpy, specific humidity and CO 2 mixing ratio.
Unlike most existing terrestrial biosphere models (but see SiB2, e.g. Baker et al., 2003;Vidale and Stöckli, 2005), in ED-2.2 we explicitly include the dynamic storage of energy, water, and carbon dioxide in the canopy air space. Canopy air space storage is particularly important in tall, dense tropical forests; accounting for this storage term, as well as the energy storage of 5 vegetation allows a more realistic representation of the fluxes between the ecosystem and the air above (see also Haverd et al., 2007). In addition, the separation of the ecosystem fluxes in the model into eddy fluxes and change in canopy air space storage allows a thorough evaluation of the model's ability to represent both the total exchange and the ventilation of water, energy and carbon in and out of the ecosystem with eddy covariance towers, as shown in the companion paper .

Heterogeneity of ecosystems
It has been long advocated that terrestrial biosphere models must incorporate demographic processes and ecosystem heterogeneity to improve their predictive ability in a changing world (Moorcroft, 2006;Evans, 2012;. In ED-2.2, we aggregate individuals and forest communities according to similar characteristics (Fig. 1). For example, individuals are only aggregated into cohorts if they are of similar size, same functional group, and live in compara-5 ble micro-environments. Likewise, local plant communities are aggregated only if their disturbance history and their vertical structure are similar. The level of aggregation of ED-2.2 still allows mechanistic representation of ecological processes such as how individuals' access to and competition for resources vary depending on their size, adaptation, and presence of other individuals. This approach allows representing a broad range of structure and composition of ecosystems (Fig. 7,S4), as opposed to simplified biome classification. In this manuscript, we presented the functional diversity using only the default tropical plant 10 functional types (PFTs), which describe the functional diversity along a single functional trait axis of broadleaf tropical trees.
However, the ED-2.2 framework allows users to easily modify the traits and trade-offs of existing PFTs, or include new functional groups; previous studies using ED-2.2 have leveraged this feature of the code to define PFTs according to the research question both in the tropics (e.g. Xu et al., 2016;Trugman et al., 2018;Feng et al., 2018) and in the extra-tropics (e.g. Raczka et al., 2018;Bogan et al., 2019). 15 Previous analysis by Levine et al. (2016) has shown that the dynamic, fine-scale heterogeneity and functional diversity of the plant canopy in ED-2.2 is essential for capturing macro-scale patterns in tropical forest properties. Specifically, Levine et al. (2016) found that ED-2.1 was able to characterize the smoother observed transition in tropical forest biomass across a dry-season length gradient in the Amazon, whereas a highly aggregated (big-leaf like) version of ED-2.1 predicted abrupt shifts in biomass, which is commonly observed in many dynamic global vegetation models (e.g. Good et al., 2011). Results from 20 two related studies have shown that the incorporation of sub-grid scale heterogeneity and diversity within ED-2 also improves its ability to correctly capture the responses of terrestrial ecosystems to environmental perturbation. First, in an assessment of the ability of four terrestrial biosphere models to capture the impact of rainfall changes on biomass in Amazon forests (Powell et al., 2013), ED-2.1 was the only model that captured the timing and average magnitude of above-ground biomass loss that was observed in two experimental drought treatments while all three big-leaf model formulations predicted minimal impacts of the 25 drought experiment. Second, a recent analysis by Longo et al. (2018) on the impact of recurrent droughts in the Amazon found that drought-induced carbon losses in ED-2.2 arose mostly from the death of canopy trees, a characteristic that is consistent with field and remote sensing observations of drought impacts in the region (Phillips et al., 2010;Yang et al., 2018).
Importantly, since its inception, the ED model accounts for the disturbance-driven horizontal heterogeneity of ecosystems (Moorcroft et al., 2001). As demonstrated in Moorcroft et al. (2001), the continuous development of treefall gaps is fundamental 30 to explaining the long-term trajectory of biomass accumulation in tropical forests; for example, by representing both recently disturbed and old-growth fragments of forests, it is possible to simulate micro-environments where either shade-intolerant plants thrive or slow-growing, shade-tolerant individuals dominate the canopy (Fig. 8a,c). Moreover, ED-2.2 can also represent dynamic and diverse disturbance regimes, which ultimately mediate the regional variation of ecosystem properties. For example, tropical forests and woody savannas may share similarities in local communities with similar age since disturbance (Fig. 8e); however, because fire disturbances frequently affect large areas in the savannas, fragments of old-growth vegetation are nearly absent in these regions (Fig. 8f), which creates an environment dominated mostly by smaller trees (Fig. S5c).
Furthermore, the heterogeneity of ecosystems in ED-2.2 is integrated across all time scales, because we solve the biophysical and biogeochemical cycles for each cohort and each patch separately (Fig. 2-3). While solving the cycles at sub-grid scale 5 adds complexity, it also improves the characterization of heterogeneity of available water and energy for plants of different sizes, even within the same polygon: for example, the light profile and soil water availability are not only determined by meteorological conditions, but also by the number of individuals, their height and their rooting depth, and their traits and trade-offs that determine their ability to extract soil moisture or assimilate carbon. As a result, the variability in ecosystem functioning represented by ED-2.2 is significantly increased relative to the variability that a highly aggregated model based on 10 the average ecosystem structure would be able to capture (Fig. 9).

Current and Future Developments
In this manuscript, we focused on describing the biophysical and biogeochemical core of the ED-2.2 model, and appraising its ability to represent both short-term (intra-annual and interannual) and long-term (decades to century) processes. However, the ED-2.2 community is continuously developing and improving the model. In this section we summarize some of the recent and 15 ongoing model developments being built on top of the ED-2.2 dynamic core.
Terrestrial biosphere models still show significant uncertainties in representing photosynthesis due to missing processes and inconsistencies in parameter estimations (Rogers et al., 2017). We are currently implementing the carboxylation limitation by the maximum electron transport rate and by the triose phosphate utilization (von Caemmerer, 2000;Lombardozzi et al., 2018), and constrained by observations (Norby et al., 2017), and incorporating nitrogen and phosphorus limitation. In addition, 20 the model has also been recently updated to mechanistically represent plant hydraulics, and first results indicate a significant improvement of the the model's prediction of water use efficiency and water stress in tropical forests in Central America (Xu et al., 2016). Also, to better represent the dynamics of soil carbon in ED-2.2, we are implementing and optimizing a more detailed version of the CENTURY model (Bolker et al., 1998).
To improve the representation of surface and soil water dynamics, the model has been coupled with a hydrological routine 25 model that accounts for lateral flux of water as a function of terrain characteristics and simulates river discharge (Pereira et al., 2017;Arias et al., 2018). Moreover, an integrated approach of hydraulic routing based on TOPMODEL (Walko et al., 2000;Beven and Freer, 2001), which allows exchange of water and internal energy exchange between different sites as a function of topographic characteristics, is being implemented in ED-2.2.
The ED-2.2 model framework is designed to simulate functionally diverse ecosystems but trait values within each functional 30 group are fixed. To account for the observed plasticity in many leaf traits, a new parameterization of leaf trait variation as function of the light level, based on the parameterization by Lloyd et al. (2010) and  is being implemented.
In addition, the ED-2.2 model has also been recently updated to represent the light competition and parasite-host relationships between lianas and trees (di Porcia e Brugnera et al., 2019), and it is currently being extended to incorporate plant functional types from different biogeographic regions, such as temperate semi-arid shrublands (Pandit et al., 2018), as well as boreal ecosystems, building on previous works using ED-1 (Ise et al., 2008).
Anthropogenic forest degradation is pervasive throughout the tropics (Lewis et al., 2015). To improve the model's ability to represent damage and recovery from degradation, we are implementing a selective logging module that represents the direct impact of felling of marketable individuals, and accounts the damage associated with skid trails, roads and decks, which are 5 modulated by logging intensity and logging techniques (Pereira Jr. et al., 2002;Feldpausch et al., 2005). In addition, the original fire model has been recently improved to account for size-and bark-thickness-dependent survivorship , and is being developed to account for natural and anthropogenic drivers of ignition, fire intensity, fire spread and fire duration (Thonicke et al., 2010;Le Page et al., 2015).
The complexity and sophistication of ED-2.2 also creates important scientific challenges. For example, the multiple pro-  Importantly, the need to incorporate terrestrial ecosystem heterogeneity in Earth System Models has been long advocated (e.g. Moorcroft, 2006;Evans, 2012), but only recently global models have been incorporating ecological mechanisms that allow representing functionally diverse and heterogeneous biomes at global scale without relying on artificial climate envelopes. One example is the Functionally Assembled Terrestrial Ecosystem Simulation (FATES; Fisher et al., 2015), which incorporated the patch and cohort structure of ED-2.2 into the Community Land Model (CLM; Oleson et al., 2013) 20 framework.

Conclusions
ED-2.2 represents a significant advance in how to integrate a variety of processes ranging across multiple time scales in heterogeneous landscapes: it retains all the detailed representation of the long-term dynamics of functionally diverse, spatially heterogeneous landscapes and long-term dynamics from the original ED ecosystem model (Moorcroft et al., 2001;Hurtt et al., 25 2002; Albani et al., 2006), but also solves for the associated energy, water, and CO 2 fluxes of plants living in horizontally and vertically stratified micro-environments within the plant canopy, which was initially implemented by Medvigy et al. (2009) (ED-2) by adapting the big-leaf land surface model LEAF-3 (Walko et al., 2000) to the cohort-based structure of ED-2.
The results presented in the model description demonstrated that ED-2.2 has a high degree of conservation of carbon, energy, and water, even over multi-decadal scales (Fig. 6). Importantly, the current formulation of the model allows us to represent 30 functional and structural diversity both at local and regional scales ( Fig. 7-8;S4-S5), and the effect of the heterogeneity on energy, water, carbon, and momentum fluxes (Fig. 9). In the companion paper, we use data from eddy covariance towers, forest inventory, bottom-up estimates of carbon cycles and remote sensing products to assess the strengths and limitations of the current model implementation .
This manuscript focused on the milestone updates in the energy, water, and carbon cycle within the ED-2.2 framework, but the model continues to be actively developed. Some of the further developments include implementing more mechanisms that influence photosynthesis and water cycle such as plant hydraulics, nutrient cycling, expanding the plant functional diversity 5 including trait plasticity and lianas, as well as expanding the types of natural and anthropogenic disturbances. ED-2.2 is a collaborative, open-source model that is readily available from its repository, and the scientific community is encouraged to use the model and contribute with new model developments.