Robust Ecosystem Demography (RED version 1.0): a parsimonious approach to modelling vegetation dynamics in Earth system models

. A signiﬁcant proportion of the uncertainty in climate projections arises from uncertainty in the representation of land carbon uptake. Dynamic global vegetation models (DGVMs) vary in their representations of regrowth and competition for resources, which results in differing responses to changes in atmospheric CO 2 and climate. More advanced cohort-based patch models are now becoming established in the latest DGVMs. These models typically attempt to simulate the size distribution of trees as a function of both tree size (mass or trunk diameter) and age (time since disturbance). This approach can capture the overall impact of stochas-tic disturbance events on the forest structure and biomass – but at the cost of increasing the number of parameters and ambiguity when updating the probability density function (pdf) in two dimensions. Here we present the Robust Ecosystem Demography (RED), in which the pdf is collapsed onto the single dimension of tree mass. RED is designed to re-tain the ability of more complex cohort DGVMs to represent forest demography, while also being parameter sparse and analytically solvable for the steady state. The population of each plant functional type (PFT) is partitioned into mass classes with a ﬁxed baseline mortality along with an assumed power-law scaling of growth rate with mass. The analytical equilibrium solutions of RED allow the model to be calibrated against observed forest cover using a single parameter – the ratio of mortality to growth for a tree of a reference mass ( µ 0 ). We show that RED can thus be calibrated to the ESA LC_CCI (European Space Agency Land Cover Climate Change Initiative) coverage dataset for nine PFTs. Using net primary productivity and litter outputs from the UK Earth System Model (UKESM), we are able to diagnose the spatially varying disturbance rates consistent with this observed vegetation map. The analytical form for RED circumnavi-gates the need to spin up the numerical model, making it attractive for application in Earth system models (ESMs). This is especially so given that the model is also highly parameter sparse.

Abstract. A significant proportion of the uncertainty in climate projections arises from uncertainty in the representation of land carbon uptake. Dynamic global vegetation models (DGVMs) vary in their representations of regrowth and competition for resources, which results in differing responses to changes in atmospheric CO 2 and climate. More advanced cohort-based patch models are now becoming established in the latest DGVMs. These models typically attempt to simulate the size distribution of trees as a function of both tree size (mass or trunk diameter) and age (time since disturbance). This approach can capture the overall impact of stochastic disturbance events on the forest structure and biomassbut at the cost of increasing the number of parameters and ambiguity when updating the probability density function (pdf) in two dimensions. Here we present the Robust Ecosystem Demography (RED), in which the pdf is collapsed onto the single dimension of tree mass. RED is designed to retain the ability of more complex cohort DGVMs to represent forest demography, while also being parameter sparse and analytically solvable for the steady state. The population of each plant functional type (PFT) is partitioned into mass classes with a fixed baseline mortality along with an assumed power-law scaling of growth rate with mass. The analytical equilibrium solutions of RED allow the model to be calibrated against observed forest cover using a single parameter -the ratio of mortality to growth for a tree of a reference mass (µ 0 ). We show that RED can thus be calibrated to the ESA LC_CCI (European Space Agency Land Cover Climate Change Initiative) coverage dataset for nine PFTs. Using net primary productivity and litter outputs from the UK Earth System Model (UKESM), we are able to diagnose the spatially varying disturbance rates consistent with this observed vegetation map. The analytical form for RED circumnavigates the need to spin up the numerical model, making it attractive for application in Earth system models (ESMs). This is especially so given that the model is also highly parameter sparse.

Introduction
A key requirement of Earth system science is to estimate how much carbon the land surface will take up in the decades ahead (Ciais et al., 2014). This is an important component of the total carbon budget consistent with avoiding global warming thresholds, such as 2 • C (Schleussner et al., 2016). Unfortunately, projections of future land carbon storage still span a wide range (Brovkin et al., 2013;Friedlingstein et al., 2014;Arora et al., 2019). Beyond the CO 2 and nutrient fertilization effects and land-use change, significant uncertainty also arises from the representation of vegetation demographics such as recruitment, competition, and mortality (Brovkin et al., 2013;Ahlström et al., 2015). The representation of plant communities within Earth system models (ESMs) is achieved through the use of dynamic global vegetation models (DGVMs). DGVMs employ a variety of biophysical, biogeographical, and biochemical processes to simulate growth, competition, and recruitment of vegetation. The variety in Published by Copernicus Publications on behalf of the European Geosciences Union. 4068 A. P. K. Argles et al.: Robust Ecosystem Demography the number and resolution of the processes contributes to the differences found at the Earth system level.
Within the context of modelling vegetation at a global level, there is a trade-off between the complexity of ecological process representation and the necessity of parsimony at scale (Fisher et al., 2018). DGVMs range from the simplistic, older, top-down approaches to that of complex individual-based DGVMs. For example, in the first instance the TRIFFID model (Cox, 2001) simulates the fractional area of each plant functional type (PFT) using phenomenological Lotka-Volterra equations. The benefit of the TRIFFID approach is its simplicity and robustness. However, the model suffers from the lack of size representation and other processes, which results in the overestimation of regrowth time (Burton et al., 2019). In the second instance, individual-based models can explicitly represent a multitude of biological and ecosystem processes at an individual plant level (Smith, 2001;Sato et al., 2007). The benefit of this is that size-dependent physiology and spatial heterogeneity can be explicitly represented. However, multiple ensemble members are often needed to construct meaningful forest statistics, which makes such models computationally expensive to run at large scales. Compromises between the complexity of individual-based and top-down DGVMs exist as a class of tree cohort models. In the ED model (Moorcroft et al., 2001;Medvigy et al., 2009) the tree population is partitioned between patch disturbance and biomass classes allowing for the scaling of process to be represented in both age and size. ED2 can realistically model forests around the world (boreal, rainforest, and temperate) (Medvigy et al., 2009;Fisher et al., 2018). However, parameterization of competition within cohort DGVMs can result in a wide spread of outcomes when simulating climate change (Fisher et al., 2010;Scheiter et al., 2013).
In a similar vein other models have limited the number of cohort dimensions. The POP model (Haverd et al., 2014) uses stand-age cohorts as the dimension for population dynamics, every time step applying crowding and resource limited mortality rates. Another example is the ORCHIDEE-MICT (Yue et al., 2018), which disaggregates the populations of a PFT into patch cohort functional types, with transitions between cohorts diagnosed when the average basal diameter passes a threshold. This paper presents a simplified cohort model -Robust Ecosystem Demography (RED) -which updates the number of trees in each mass class but does not separately track tree age or patch age. RED assumes that the tree-size distribution of a forest is determined by how the rates of tree growth and mortality vary with tree size (Kohyama et al., 2003;Coomes et al., 2003;Muller-Landau et al., 2006;Lima et al., 2016). We follow many other studies in assuming that tree-growth rates vary with the three-quarter power of tree mass (m 3/4 ), as suggested by metabolic scaling theory (West et al., 1997). Where tree mortality rate can also be assumed to be approximately independent of tree mass, the demographic equation yields equilibrium tree-size distributions which follow a Weibull distribution. This is sometimes termed demographic equilibrium theory (DET) (see Appendix B). These simplifications significantly reduce the number of free parameters in RED but still enable it to fit forest inventory data in North America (Moore et al., 2018) and South America (Moore et al., 2020).
2 Description of the model A full list of variables, parameters, and units are given in Table 1.

Theory
The underlying theoretical model for RED is a continuity equation, for each PFT and spatial location, which describes the time evolution of the number density n of plants per unit area per unit mass m: Here g is the growth rate, and γ is the mortality rate of a plant of mass m. In general, g and γ could be any reasonable function of tree size. For large-scale applications we make simplifying assumptions for these functions consistent with observed n from forest inventory data (Moore et al., 2018(Moore et al., , 2020. By default we assume that γ is independent of plant mass and that g follows a power law of plant mass as follows: Here g 0 is the growth rate of a plant with the reference mass, m 0 . A value of φ g = 0.75 is assumed by default, consistent with the analysis of field-based measurements by Niklas and Spatz (2004). We also follow Niklas and Spatz (2004) in assuming the scaling of plant canopy area a with plant mass as follows: where φ a = 0.5 by default. Solutions for n can be integrated over mass to derive the total plant number, N = ∞ 0 ndm, the total growth rate, G = ∞ 0 gndm, the total biomass, M = ∞ 0 mndm, and the fractional area covered ν = ∞ 0 andm.

Discrete mass classes
We wish to produce a model of vegetation demography that can be updated numerically and which explicitly conserves vegetation carbon, providing a constraint on the number of plants moving between mass classes in the discrete form. In order to do this we integrate Eq. (1) over finite mass ranges: Lowest/sapling mass boundary kg C g Structural growth of an individual at a given mass and time kg C yr −1 g 0 Structural growth of an individual at the lowest mass boundary at a specific time kg C yr −1 a Crown area of an individual at a given mass m 2 a 0 Crown area of an individual at the lowest mass boundary m 2 φ g Constant describing the power law scaling of structural growth across mass φ a Constant describing the power law scaling of crown area across mass α The fraction of total growth going into seedling recruitment -Cohort n Number density across mass space, the derivative of N with respect to mass The fractional coverage γ Mortality rate, the summation of the baseline and additional mortalities across mass yr −1 γ b Baseline mortality rate, the fraction of population dying over a year due to non-explicitly yr −1 modelled reasons s The fraction of space available for seedlings -F The flux of population density over time m −2 yr −1 d Demographic litter, the loss of carbon due to competition and mortality kg C yr −1 m −2 M Biomass density kg C m −2 c k,l Competition coefficient, the fraction of a PFT, k, that is shaded by the canopy of PFT l -Equilibrium µ 0 The boundary turnover parameter -the ratio of mass lost to gained due to growth in the boundary mass class λ i The proportional population of the ith class to the (i − 1)th class at equilibrium eq Subscript denoting a variable in equilibrium -Numerical k, l Indices representing the PFT number i, j Indices representing mass class number -I The largest mass class -(k) The current time step ξ The size scaling coefficient, where mass classes are defined as m j = ξ m j −1 , with ξ > 1 where i denotes the ith mass class; F i is the flux of plants growing out of the ith mass class and into the (i + 1)th mass class; F i−1 is the flux of plants growing out of the (i − 1)th mass class and into the ith mass class; and N i is the number of plants per unit area in the ith mass class. For clarity, Eq. (4) is deliberately presented as continuous in time at this stage, as the focus in this subsection is on discretization of the mass profile. The fully numerical version of RED, which includes discretization of time, is described in Sects. 2.4 and 2.5. In order to explicitly conserve carbon, the flux F i must take the form (see Appendix A) where m i is the mean mass of a plant in the ith mass class, and g i is the growth rate per plant of the ith mass class (kg C yr −1 plant −1 ).

Seedling production and gap competition
To solve Eq. (4) we also require a lower boundary condition, which represents the rate at which seedlings of mass m 0 are introduced into the cohort. Here we assume that a fixed fraction, α, of the total assimilate available to a PFT (P ), is devoted to producing new seedlings, with the remainder G = (1 − α)P being allocated to the growth of existing plants. Spreading is homogeneous across the entirety of the grid box, but only seedlings established within "unoccupied" space will survive to join the plant cohort. The net incoming flux of seedlings of mass m 0 is therefore where s is the fractional gap area available for seedlings. The definition of s is assumed to differ by PFT to reflect an underlying tree-shrub-grass dominance hierarchy, as shown schematically in Fig. 1. Therefore, the rate of recruitment F 0 is the ratio of a fraction of the carbon assimilate allocated to reproduction, αP , and m 0 , multiplied by the gap area s. The space available to the seedlings of the kth PFT is calculated from the area fractions of the PFTs to which it is subdominant as follows: where ν l is the area fraction of the lth PFT, and c kl is the competition coefficient for the impact of PFT l on PFT k. If PFT l is within the same plant functional group (trees, shrubs or grasses) as PFT k or dominant over it, c kl = 1. If PFT k is dominant over PFT l, c kl = 0 ( Fig. 1). This "gap" boundary condition results in there being no equilibrium solution where the amount of coverage exceeds 1. Doing so would halt the recruitment flux such that mortality processes would bring the fractional coverage back below unity. This is a similar competition regime to the Lotka-inspired TRIFFID model (Cox, 2001) and allows for the coexistence between inter-functional groups (trees, shrubs, and grasses) of PFTs. For instance, a PFT such as broadleaf deciduous tree can coexist with a deciduous shrub and C3 grass. The hierarchy also enables the simulation of succession during regrowth. Faster-growing species of grasses will not be able to expand into space occupied by trees and shrubs, unless there is space created by disturbance. A summary of the competition coefficients is given in Table 2.
2.4 Coupling to Earth system models RED updates plant size distributions, biomass, and fractional areal coverage for an arbitrary number of PFTs at each spatial location and can be driven by variables provided by a land carbon cycle model, an Earth system model, or observations (see Fig. 2). For each PFT, the minimum required input is a time series of net carbon assimilate (P ), defined as the difference between net primary productivity ( N ) and local litter production due to turnover of leaves, stems, and roots ( l ): We apply the m 3/4 scaling to P . We therefore implicitly assume the same scaling for both gross primary productivity and plant respiration. This is consistent with observations suggesting that plant production also scales approximately as m 3/4 (Enquist et al., 1998;Niklas and Enquist, 2001). Where available, additional mortality due to disturbance events such as droughts, fires, and anthropogenic deforestation (γ d ) can be added to the baseline mortality rates (γ b ), for each PFT as follows: Disturbance rates γ d can in principle be both PFT dependent and mass dependent (e.g. to capture forestry practices). Figure 2. Schematic of RED coupled to an ESM or land carbon cycle model. RED is driven by a time series of net carbon assimilate, P , which is then split between seedling production, αP , and the growth of existing plants, G = (1 − α)P . The seedling flux is limited by the available free space, s. Additional mortality rates diagnosed from disturbance models, γ d , can be added onto an assumed baseline mortality, γ b , as a function of both PFT and mass class. Table 3. List of PFT names and assumed allometric scaling parameters (m 0 , a 0 , h 0 ), seedling fraction (α), and competition coefficient (c pft,j ). The growth allometry of trees and shrubs across size is assumed to follow Niklas and Spatz (2004) The competition coefficients given describe which PFT functional group shades the current PFT; if c pft,j = 1, the PFT is shaded; otherwise it is not (Table 2). The input values of net assimilate for each PFT (P ) define the total structural growth rate, G = (1 − α)P , and the seedling flux F 0 (via Eq. 6), using PFT-specific values of the parameter α (see Table 3). The definition of the total structural growth rate at a given time step is which can be combined with the growth scaling given by Eq.
(2) to derive the reference growth rate, g 0 , from the net assimilate, P , which is a driving input: 4072 A. P. K. Argles et al.: Robust Ecosystem Demography This in turn enables the growth rate of each mass class to be calculated using Eq. (2). For each PFT, the number of plants in a mass class (N i ) is updated using a discretized form of Eq. (4): where t is the RED time step (typically 1 month), and the superscript (j ) denotes the j th time step. Our results are robust to changes in model time step so long as the time step remains small compared to the characteristic timescales associated with regrowth (m 0 /g 0 ∼ 4 years) and plant mortality (1/γ ∼ 20 years). The lower-boundary seedling flux is calculated from Eq. (6) using Eq. (7). We impose a zero-flux condition out of the upper mass class, under the assumption that there will be enough mass classes to ensure that this flux is negligible. However, to ensure carbon conservation on the land we add any plants that grow out of the upper mass class into a demographic litterfall term for each PFT, which is a RED output. This demographic litterfall term, d , keeps track of the carbon lost from the vegetation due to competition, mortality, and the carbon in any such plants that grow out of the largest resolved mass class (class I ) as follows: The first term on the righthand side of this equation represents carbon loss due to the shading of seedlings; the second term represents mortality of the resolved mass classes (which may include disturbance events); and the third term, which is normally very small, is the loss of vegetation carbon due to plants growing beyond the modelled mass classes. In order to initiate regrowth from bare soil, RED also assumes a minimum effective fractional area of each PFT. Where the net assimilate would be sufficiently negative to take the vegetation fraction below this minimum, the minimum value is maintained by subtraction from the demographic litter. The demographic litterfall term therefore represents the net addition litter production consistent with the prescribed net assimilate flux, the disturbance rate, and the change in vegetation carbon modelled by RED. When coupling to an ESM or land carbon model, the demographic litterfall term ( d ) should be added to the input local litterfall ( l ) (as used in Eq. 8) to calculate the total litterfall flux into the soil and/or litter system.

Steady state
The steady state of the continuum model defined by Eqs. (1) and (2) can be solved analytically for each PFT (Moore et al., 2018(Moore et al., , 2020. The continuum analytical solutions for the equilibrium mass distribution n eq (m), the total plant number (N eq ), biomass (M eq ), growth rate (G eq ), and fractional area (ν eq ) are summarized in Appendix B. The shape of the mass distribution and each of these parameters depend on the ratio of plant mortality to growth, which we choose to define for the reference mass class m 0 as follows: In order to initialize the numerical RED model in a driftfree initial state, we also derive the steady state of the discrete model (of Eq. 12), which will differ slightly from the continuum model for a finite number of mass classes. The equilibrium solution of Eq. (12) is derived in Appendix B2, based on the balance between seedling recruitment and total cohort mortality that defines the equilibrium state. The discretized version of RED thus yields formulae for the coverage (Eq. B28) and biomass densities (Eq. B30) which depend on the lowest mass class through the value of µ 0 . Similarly, analytical expressions can be derived for total plant number and total growth rate of each PFT at equilibrium.
1. The total equilibrium stand density, N eq : 2. The total equilibrium structural growth, G eq : 3. The total equilibrium coverage, ν eq : 4. The total equilibrium carbon mass, M eq : Here X N , X G , X ν , and X M are functions of µ 0 (see Appendix B2). This equilibrium state is derived by setting N in Eq. (B17), such that the flux entering into a mass class is equal to the flux leaving that class, due to growth out of the class and the loss of plants due to mortality.
The equations above therefore define the equilibrium state of the discrete system for given values of N 0 and µ 0 . The value of µ 0 can be estimated from forest demographic data where this is available (Moore et al., 2018(Moore et al., , 2020. However, for global applications we rarely have more observations than the fractional coverage of each PFT. Starting from the derived forms for N eq (Eq. 15) and G eq (Eq. 16) and requiring that the recruitment flux (α/(1 − α)G eq s) is equal to that of the total population dying (γ N eq ), we can derive an equation for the total equilibrium coverage (full details in Appendix B2):  Table 3. As the left-hand side of this equation depends only on prescribed constants and µ 0 , Eq. (19) can be inverted (by numerical iteration) to estimate µ 0 for observed values of the PFT fractions (ν k , ν l ) and an assumed value of α (see Table 3). Once the value of µ 0 has been derived in this manner, it can be used to calculate X ν and, therefore, N 0 by inversion of Eq. (B28): Equations (19) and (20) therefore allow us to define an initial equilibrium state (N i ) which is consistent with observed area fractions of each PFT. Furthermore, when paired with an estimate of the net carbon assimilate (from a model or observations), the µ 0 estimate can be converted into a map of the implied mortality (γ ) by PFT. We demonstrate this capability globally in the next section.

Modelling results
For these runs, the numerical RED model is set up to use the nine PFTs which are currently used in JULES (Harper et al., 2018). This enables us to directly use driving data -time series of the rate of net assimilation (P ) -from a previous UKESM model simulation that includes JULES (Sellar et al., 2019). RED is integrated forward using a 1-month time step and successive mass classes that differ by a multiplicative constant ξ , so that m i = ξ m i−1 . The value of ξ was chosen to optimally fit the analytical equilibrium solutions assuming 10 mass classes for trees, 8 mass classes for shrubs, and 1 mass class for grasses, assuming µ 0 = 0.25 (see Appendix B3).
Other PFT-specific parameters are assumed as summarized in Table 3.

Global: diagnosed plant mortality rates
Here we use the analytical forms for the equilibrium state (Sect. 2.5) and observations of global vegetation cover to diagnose the corresponding map of PFT-specific mortality rates. These mortality rates are therefore consistent with the current observed vegetation state and rates of net assimilation (P ) provided from UKESM (Sellar et al., 2019). The UKESM simulation provides net primary productivity (NPP) and local litterfall per unit area of each PFT. We multiply by PFT fraction to get the grid-box mean values required to drive RED (using ESA land-cover data, as explained below). The observed maps of PFTs are provided by the ESA LC_CCI dataset for 2008-2012 (Poulter et al., 2015), projected onto the nine JULES PFTs (Fig. 3). Maps of the prescribed annual mean values of the rate of net assimilation (P ) are shown in Fig. 4. We use the procedure outlined in Sect. 2.5 to estimate spatially varying values of µ 0 for each PFT, using Eq. (B32) and then Eq. (B34) to estimate N 0 . This method successfully reproduces the ESA map of dominant PFT to good accuracy, as shown in Fig. 5 and Table 4. The fit of the RED equilibrium vegetation coverage to the ESA observations is generally very good (Table 4). However, it is imperfect in some areas (e.g. Central Asia, Sahel) where the driving net assimilate from UKESM is zero or negative. Also, areas where the observational dataset indicates coexisting PFTs within the same vegetation class (e.g. broadleaf trees and needleleaf trees) are not well simulated by this first version of RED, which leads to competitive exclusion in the equilibrium state (see Discussion). Since we now have diagnosed values of µ 0 and N 0 , along with prescribed values of P , we can also diagnose the mean plant mortality rate γ , for each location and for each PFT, from Eq. (14) as follows: where g 0 is given by Eq. (11) combined with Eqs. (B18) and (B20). Maps of γ values, derived in this way, are shown in Fig. 6. The mortality rate derived is dependent on the assumed areal coverage and the total assimilate. A high coverage with a low growth rate will result in a compensating low diagnosed mortality rate (and vice versa). Furthermore, the choice of α (Eq. 11) and m 0 also influence the diagnosed value of γ . An analysis of the sensitivity of the inferred value of γ to these factors is presented in Appendix C. Assuming ±20 % uncertainty on assimilate, α, and m 0 and ±5 % on the coverage gives an uncertainty bound of ±35 % on γ . Under the assumption that high coverages are indicative of the baseline mortality for a given PFT, we take a subsample of the grid boxes that are within the top quartile of nonzero coverages (ν eq > 0.01) ( Table 5). The median µ 0 value di- Figure 6. Diagnosed maps of mortality rates γ for each PFT, as required for consistency with the ESA observations and the UKESM growth rates. White areas correspond with zero coverage and/or zero growth. Site-level assessments of the rates of stand mortality within pantropical forests conclude a range of background rates (Lugo and Scatena, 1996;Phillips, 1996;Phillips et al., 2004). Phillips (1996) estimates mortality rates collected across 40 pantropical sites for tree sizes greater than 10-25 cm dbh. Later work by Phillips et al. (2004) used the demographic data from the RAINFOR dataset of trees ≥ 10 cm dbh. Using these site assessments, we can make a comparison to BET-Tr equilibrium mortality rates by looking at the values of γ in areas where we would expect to see old-growth forests. We use the top 25 % of coverages of the BET-Tr PFT to represent plausible areas of undisturbed forest. Figure 7 shows that the diagnosed baseline mortality rates are in reasonable agreement with these observational estimates for Amazonia.
There is a need to better understand the influence of mortality arising from disturbance events such as droughts and fire in order to constrain model projections (Pugh et al., 2020). Here we investigate if the equilibrium mortality rates implicitly capture areas of disturbances, by comparing the mean tree mortality rate to fire and land-use surveys (the mean mortality is defined here by weighting grid-box γ values by grid-box fractional coverages). There are a number of surveys relating stand mortality in regions prone to wildfires (Swaine, 1992;Kinnaird and O'Brien, 1998;Peterson and Reich, 2001;Van Nieuwstadt and Sheil, 2005;Prior et al., 2009;Staver et al., 2009;Brando et al., 2014). In a broad sense, post-fire mortality rates can range from 0.06 yr −1 to catastrophic rates around 0.8 yr −1 and can vary quite considerably depending on tree species, fire frequency, and drought severity. The drought-fire interaction is responsible for significantly increasing mortality post fire and can be a driving cause of regional dieback (Allen et al., 2010;Brando et al., 2014). Using the ESA FIRE_CCI dataset (Chuvieco et al., 2019) we can estimate the burnt vegetation fraction per year. Taking the average burnt vegetation fraction for the months between 2000 and 2010 and converting into annual burn rate, we gain an estimate of fire severity. Another key issue is anthropogenic land use and land-use change (Nepstad et al., 2008;Haddad et al., 2015). Fragmentation of natural forests is understood to raise the mortality of the remaining forest and to decrease the overall resilience of the ecosystem (Esseen, 1994;Laurance et al., 1998;Jönsson et al., 2007). In order to maintain a near-constant agricultural fraction, regular disruption such as grazing is needed to prevent recolonization and secondary succession (Dorrough and Moxham, 2005;Van Uytvanck et al., 2008;Chaturvedi et al., 2012). We carry out a comparison with land use using the 2000 ESA LC_CCI inferred crop coverages (Li et al., 2019).
In Fig. 9, we see the derived observations for burn area (panel a) and crop fraction (panel b), along with the derived mean γ for the tree PFTs (panel c). From Fig. 9d, we see that there are areas of large mortality (γ > 0.075 yr −1 ) that do correspond to areas where we see large fire activity (burn rate > 0.1 yr −1 ) and increased crop fraction (> 0.25). However, large burn rates are seen to overlap in parts of central Brazil around the Cernado region, southern Africa, and north Western Australia where fires are understood to play a significant part within the ecosystem (Coutinho, 1990;Medeiros and Miranda, 2008;Prior et al., 2009;Staver et al., 2009). There are also some areas of agriculture which correspond to deforestation, such as in the Atlantic forests of Brazil and in Indonesia (Higuchi et al., 2008;Curran et al., 2004). Areas of increased disturbances result in grasses and shrubs dominating (Fig. 3).
Analysis of the RED equilibrium is an indirect approach to estimating tree mortality based on simple yet mechanistic principles of demography and relying on few inputs (vege- Figure 8. Comparison of observation-based estimates of tropical tree mortality (Phillips, 1996;Phillips et al., 2004) to γ values diagnosed from RED for the BET-Tr PFT (for the top 25 % of fractions for this PFT). tation cover and assimilate). It is however conditional on the assumed estimates of vegetation coverage and net rates of assimilation.

Local: simulating succession
In this subsection we demonstrate the vegetation successional dynamics simulated by RED in an idealized spin-up from bare soil, for a grid box at the edge of the Amazonian rainforest (Fig. 10). Under these circumstances, the diagnosed initial state is indeed the long-term equilibrium state, as evidenced by the horizontal dashed lines in panels (a) and (b) of Fig. 10.
Faster-growing grass PFTs dominate the grid box within the first 12 years, before being replaced by evergreen shrubs which shade the grass seedlings. Eventually, broadleaf evergreen tropical trees replace much of the shrub and grass, on a timescale determined in large part by the parameter α and the reference mass class m 0 . With the parameters used here, the vegetation fraction reaches close to its equilibrium value after about 20 years (panel a), but full spin-up of the biomass takes around 150 years (panel b). The modelled evolution of number density versus mass distribution for each PFT is shown in panel (c) (after 6 years), panel (d) (after 13 years), and panel (e) (after 100 years), with the eventual demographic equilibrium profiles shown by the dashed lines. It is clear that grass PFTs are close to their demographic equilibrium after only 6 years, but tree PFTs need more than 100 years to reach equilibrium.
The dashed lines in Fig. 10 represent a dynamical RED simulation from the diagnosed demographic equilibrium state. This state is derived using the methodology described in Sect. 2.5, with one significant change. The competition rules given by Eq. (7) and Table 2 result ultimately in equi-libria which have a single dominate PFT in each class of co-competing types (trees, shrubs, grasses). To avoid drifts associated with the competitive exclusion of the subdominant PFTs in each vegetation class, we choose to initialize the dominant PFT to have the total area fraction of all the PFTs in that vegetation class.

Global: spin-up from bare soil
Transient simulations of global vegetation will be the subject of a future paper, but in the final subsection of this paper we wish to demonstrate the utility of the semi-analytical equilibrium for initialization of global model runs.  The ultimate steady state is determined by the balance between recruitment and mortality (Eq. 6). Intra-and inter-PFT occurs here through the shading of seedlings, which implies that just a fraction of the grid box (s, gap fraction) is available to grow seedlings (Eq. 7).
shows the time evolution of global mean PFT fractions and biomass from a global run driven by net assimilation rates from the UKESM model. Once again, two RED simulations are shown, one started from bare soil (solid lines) and the other from the semi-analytical equilibrium state (dashed lines). Using a constant assimilate rate (Fig. 4) and the mortality distribution (Fig. 7), we see convergence of these two runs but only after more than 1000 years of simulated time. The ability to diagnose the equilibrium state therefore has the potential to reduce model spin-up time hugely, especially for Earth system model (ESM) applications.

Discussion
The response of the land surface to climate change is a key uncertainty in climate projections. Ambitious climate targets also rely on land management practices such as reforestation and afforestation to increase the storage of carbon on land. First-generation dynamic global vegetation models (DGVMs) attempted to model the land surface in terms of bulk properties such as mean vegetation cover, vegetation carbon and leaf area index. These models lack information about the plant size distribution, which compromised their ability to represent recovery from disturbance and the impact of land management. Providing useful guidance on these issues requires improved DGVMs, which can rep-resent changes in tree-size distributions within forests (so called "demography"). A number of much more sophisticated second-generation DGVMs are now under development. These models often explicitly simulate the number of plants within different size or mass classes and on different patches of land, which are defined by the time since a disturbance event. Such second-generation models are therefore in principle able to simulate variations in plant number density as both a function of patch age and plant size. However, this completeness is at the expense of much computational and parameter complexity.
Our previous work in evaluating demographic equilibrium theory for regional forest inventory datasets in North America (Moore et al., 2018) and using RAINFOR sites for South America (Moore et al., 2020) has provided the theoretical basis for the development of RED. In those studies we found that the tree-size distributions observed at a large-scale in forests can be satisfactorily understood in terms of demographic equilibrium in the size dimension alone. This is a reduction in complexity compared to other cohort models which are based on patch age, and yet it is an improvement in ecological fidelity compared to older phenomenological DGVMs such as TRIFFID (Cox, 2001). The modular design of RED allows for easy coupling to land-surface schemes, merely requiring the per unit grid-box total carbon assimilate rate and any additional mortality disturbance rates as inputs for each grid box (Fig. 2). In principle, RED allows scope for more complex tree-size-dependent processes, although in this first study we chose to assume size-independent (but spatially varying) mortality rates for each PFT. Our previous work suggests that this is a good first-order assumption (Moore et al., 2018(Moore et al., , 2020. Internally within the model we make a number of simplifications. Firstly, the number density for each PFT is treated as a function of plant mass alone. This immediately eliminates the need to explicitly represent patches and, therefore, removes age as an independent dimension. This is a distinct approach relative to cohort DGVMs which are based on patches defined by time since disturbance, such as the POP or ORCHIDEE-MICT models (Haverd et al., 2014;Yue et al., 2018). Secondly, we assume that plant growth rates vary as a power of plant mass. By default we assume a power of φ g = 3/4, which is consistent with metabolic scaling theory (Enquist et al., 1998) and the empirically determined allometric relationships of Niklas and Spatz (2004).
Finally, we assume that competition is only significant for the lowest "seedling" mass class. This enables us to represent gap dynamics among plants and resultant stages in succession. This represents a significant simplification compared to other approaches involving the perfect plasticity assumption (PPA), as used within DGVMs such as LM3-PPA or CLM(ED) (Fisher et al., 2015;Weng et al., 2015), where canopies are assumed to perfectly fill gaps through photomorphism (Strigul et al., 2008). In LM3-PPA the radiative flux is limited by the available gap fraction in a given crown layer. PPA parallels our gap boundary condition at the lowest mass class (Eq. 6), but in RED the growth of a cohort is given by the disaggregation of total growth via metabolic scaling (Eq. 11).
These simplifications allow RED to be solved analytically for the steady-state vegetation cover given information on the mortality and growth rates per unit area for each PFT. Such analytical steady-state solutions mean that RED can be easily initialized in drift-free preindustrial states, which is vital to avoid spurious sources and sinks in climate-carbon cycle projections. The analytical solutions also enable RED to be calibrated to the observed vegetation cover, via a single parameter (µ 0 ) which represents the ratio of mortality to growth for a tree of an arbitrary reference mass. The existence of analytical steady-state solutions for RED also opens up other promising research avenues. For example, these solutions imply relationships between the fractional coverage of each PFT, total plant biomass, and the ratio of mortality to growth. This in turn allows RED to be calibrated using observations of any two of these quantities. The analytical solutions also allow optimality hypotheses to be explored (e.g. the hypothesis that the fraction of net assimilate allocated to seed production maximizes stand density and/or biomass).
Aside from the existence of analytical steady-state solutions, RED is attractive for large-scale applications because it is both parameter sparse ("parsimonious") and requires very few driving variables. The main driving variable is the timevarying net plant growth rate for each PFT, which is defined as net primary production minus the local litterfall. These driving data can be provided by a land-surface scheme, as we do in this study, or from observations. The only other driving variable for RED is the mortality rate, which we treat in this study as a geographically varying PFT-specific constant that is independent of mass. However, in principle RED could utilize mortality rates that depend on plant mass and time to represent individual disturbance events (e.g. forest fires, disease outbreaks). Despite its simplicity, the RED model is able to fit the global distribution of vegetation types (Fig. 5) and simulate successional dynamics, including changes in forest demography (Fig. 10).
There are inevitably weaknesses with any particular modelling approach. For RED, a current limitation is for competition to lead to a single PFT at each location within each co-competing vegetation class (i.e. tree, shrub, grass). The PFT with the highest equilibrium fraction will end up excluding subdominant PFTs within the same vegetation class. It was necessary for us to account for this eventual competitive exclusion to derive zero-drift steady states for the global runs presented in Sect. 3.2.1. Such competitive exclusion is a common problem in DGVMs (Fisher et al., 2018). Currently, RED would therefore not be the most appropriate DGVM to answer important questions regarding the role of biodiversity in ecosystem function (Pavlick et al., 2013;Levine et al., 2016). More sophisticated DGVMs are required to simulate plant diversity, such as individual-based models  and DGVMs specifically designed to capture sub-grid-scale patch dynamics (Longo et al., 2019a, b). Adapting our gap boundary condition (Eq. 7) appears to be a promising way to allow greater PFT diversity in RED, without unduly increasing model complexity. We see this as a key priority for future research.
RED is currently being coupled to the JULES land-surface model, replacing TRIFFID as the default DGVM within that framework. In parallel, significant improvements are being made to the representation of physiological processes in JULES, most notably through the representation of nonstructural carbohydrate ("SUGAR"; , and through the inclusion of a coupled model of stomatal conductance and hydraulic failure under drought stress ("SOX"; Eller et al., 2018Eller et al., , 2020. Plans are also being made to derive the mortality rates for RED from the INFERNO forestfire model (Burton et al., 2019). These developments will allow us to simulate the effects of size-dependent tree mortality rates within the near future.

Conclusions
In this paper we have presented a new intermediate complexity second-generation dynamic global vegetation model (DGVM), which captures important changes in forest demography. The Robust Ecosystem Demography (RED) model makes a number of important simplifications to achieve this. These simplifications are based on theoretical concepts (e.g. metabolic scaling theory to estimate how plant growth rate varies with plant mass and minimum crown overlap) and also comparison to observed forest demography (Moore et al., 2018(Moore et al., , 2020. As a result, RED is parameter sparse and can be driven with time series of net plant growth rate (and optionally disturbance rates) for each plant functional type (PFT). We have demonstrated that RED can be calibrated effectively to observed global vegetation maps, using a single fitting parameter (representing the ratio of mortality to growth for a plant of an arbitrary reference mass). The next stage will be to use RED in coupled climate-carbon cycle projections so to assess how changes in vegetation demography impact future CO 2 and climate. We have made the prototype RED code publicly available, and we hope that Earth system and land-surface modellers will make good use of this framework to further their own research.

Appendix A: Functional form of flux F i in discretized RED
For large-scale application in ESMs, a primary concern is to ensure that the total vegetation carbon obeys carbon balance (i.e. only changes due to the net impact of total growth minus total mortality). Here we use that requirement to derive the functional form for F i as given in Eq. (5).
The total vegetation carbon in each mass class is M i = m i N i . The updated equation for M i is therefore Eq. (4) multiplied by m i : The total carbon in the vegetation, M, is the sum of the carbon in each of the mass classes: Thus the update equation for the total carbon is which can be rewritten as Now substituting Eq. (5) into Eq. (A4) gives The first term on the righthand side of this equation is the total carbon uptake due to growth, and the second term represents the total carbon loss due to mortality, which is the required carbon conservation equation.

Appendix B: Continuum solutions and demographic equilibrium theory
Equation (1) can be solved for the steady state if we assume metabolic scaling of growth using Eq.
(2) and a sizeindependent mortality (Moore et al., 2018(Moore et al., , 2020 as follows: where n 0 is a boundary condition that describes the number density at the mass m 0 . The parameter µ 0 is the ratio of the rate biomass loss due to mortality to the rate of biomass gain Figure B1. The quasi-Weibull number density solution to DET (Eq. B1), assuming the same initial n 0 and growth scaling φ g = 0.75 but different µ 0 values. due to growth, for the reference mass class m 0 . Similar analytical solutions can be derived for other measures of tree size, such as basal diameter or height (Moore et al., 2018(Moore et al., , 2020. Integrating Eq. (B1) from m 0 to ∞ gives the total number density: Other cohort integrals can be derived by integrating over the number density distribution, such as total growth rate ( gndm): total biomass ( mndm): and total vegetation cover ( andm): where (a, b) is the incomplete upper gamma function. As we assume the allometric exponents presented in Niklas 4082 A. P. K. Argles et al.: Robust Ecosystem Demography and Spatz (2004) (φ g = 3/4, φ a = 1/3), these functional forms simplify to Finally, to convert a µ 0 found using biomass (µ 0,tdm ) to one based on carbon mass, we use the formula assuming that biomass is twice the carbon mass.

B1 Closed continuous form
The lowest population flux, n 0 g 0 , is equal to the seedling boundary condition, F 0 , in Eq. (6): Substituting the total number density, N eq (Eq. B2), into the left-hand side and total growth, G eq (Eq. B6), into the righthand side yields a solution for the equilibrium coverage, assuming s = 1 − ν eq , as follows: which simplifies to Using Eq. (B8) we can write the total number density at equilibrium in terms of ν eq : This enables Eq. (B6) to be rewritten as This equation in turn defines the total assimilate: Finally the total biomass can be written in closed form as

B2 Discrete steady state
To solve for the discrete model equilibrium, we start from the flow equation from Eq. (4) with the term ∂N/∂t → 0: Considering the population flux (Eq. 5), we find N i in relation to the lower mass class, N i−1 : Assuming no population grows out of the top class, λ I is given as λ i can be simplified to depend only on µ 0 , by using µ 0 = (γ m 0 /g 0 ) (Eq. 14) and applying the mass scaling of growth rates g i = g 0 (m i /m 0 ) φ g . We can show that λ i and λ I are An expression for the total stand density at equilibrium, N eq , can be derived. Using Eq. (B18), we can represent any population of mass class i in terms of the lowest mass class N 0 as follows: Therefore, when finding the total number of stands relative to N 0 we get where X N describes the sum of the all mass classes as a proportion of N 0 . We can describe the total class growth rate in relation to N 0 as By using the allometric relationship (Eq. 2) we describe the total class growth rate in relation to the lowest class growth rate, N 0 g 0 . Like N eq , we can show the total growths across all classes is therefore We can repeat the same process for coverage and using allometric relationship (Eq. 3) This gives the total coverage, ν eq , as Finally, for the total carbon mass within the class, we get with the total carbon density equalling In equilibrium, the rate of the recruitment of seedlings (Eq. 6) must balance the rate of loss of plants due to total mortality (γ sN eq ): Substituting in Eq. (B22), Eq. (B25) yields a balance equation for the kth PFT: We can get the equilibrium fraction of a PFT, k, by rearranging the above equation, assuming c kk = 1 as follows: Once the value of µ 0 has been derived in this manner, we can find N 0 by inversion of Eq. (B28) as follows: Substituting Eq. (B33) into Eq. (B34) allows us to determine N 0 and hence most other total densities in terms of purely µ 0 and prescribed constants.

B3 Continuous-discrete convergence
Inevitably discretized models will not exactly reproduce exact continuum analytical solutions, as a result of numerical inaccuracies that arise from using a finite number of mass classes. However, where exact analytical solutions exist they can be used to benchmark numerical models and optimize discretization schemes, which is what we set out to do in this appendix. We compare the continuum analytical solution for the equilibrium coverage (Eq. B12) to results from RED with differing numbers of mass classes m i and a geometric mass class scaling, m i+1 = ξ m i . Figure B2a shows how the relationship between ν eq varies with µ 0 for the exact continuum solution (black line) and variants of the numerical version of RED with different numbers of mass classes (coloured lines). As hoped, results from the discretized model converge on the exact solution as the number of mass classes increases. The numerical versions of RED shown in Fig. B2a each use a value of ξ that is near optimum for the number of mass classes, as shown in panels (b) and (c) of Fig. B2. Optimum ξ values reduce from about 2.3 for 10 mass classes to 1.1 for 100 mass classes. This variation results from a trade-off. For a given number of mass classes, small values of ξ give greater numerical accuracy but explicitly model less of the mass range, and the opposite is true of large ξ values. As a result, optimum values of ξ an be defined for each number of mass classes as outlined below.
For geometric scaling any mass can be expressed in terms of m 0 , by writing m i = m 0 (ξ ) i . Therefore, by using m i+1 − m i = m 0 (ξ ) i (ξ − 1), we find that our equilibrium form of λ i is reduced to From Fig. B2c, we see that there is an optimum value for ξ , the geometric scaling for a given number of classes, which minimizes the difference between the continuous and discrete forms. This can be found by taking the difference in the continuous and discrete coverages and differentiating with respect to ξ to find the minima. It should be noted that as the continuous form is not dependent on ξ , we get ∂ ∂ξ ν eq,continuous − ν eq = − ∂ ∂ξ ν eq , where ν eq corresponds with the discrete equilibrium, i.e. Eq. (B32), with ν eq = (1−s). Setting Eq. (B36) equal to zero we reduce the relationship to only a dependence on X N and X G : Finding the partial derivative of X N , using the geometric form of Eq. (B18), we get and for X G we get the following: Finding λ i we get and for the top class, λ I we get the following: To numerically solve for the minimum, we must differentiate Eq. (B37), with respect to ξ . Through the product rule we get ∂ 2 ∂ξ 2 X N X G = X G X N − X G X N .
Differentiating equation (Eq. B38) and simplifying gives and doing the same for Eq. (B39) gives λ i is given by For the double differential of λ i we get We now possess the identities needed to numerically find the optimum bin scaling for a given number of classes. In Fig. B2c, the optimum scaling, ξ , is shown as the solid black line.
Appendix C: Sensitivity of diagnosed mortality rates to model parameters The diagnosed mortality rates in Fig. 6 are sensitive to variation in model inputs and parameters. The mortality rate, γ , can be found for the continuous solutions by rearranging the boundary condition equation (Eq. 6) and substituting in Eqs. (B2) and (B13) as follows: γ = αP eq a 0 m 0 1 − ν eq ν eq 1 + 1 2µ 0 + 1 8µ 2 0 .
The key external inputs to this equation are the observed PFT fraction ν eq and the net assimilate P eq . In addition, our estimates of γ are dependent on the internal model parameters, α and m 0 . Figure C1. The sensitivity of the mortality rate to assumed input variables: (a) coverage, ν eq ; (b) P eq carbon assimilate rate; and model parameters (c) reseed fraction, α, and (d) boundary mass, m 0 . The solid black line indicates the fixed values with corresponding (b-d) ±20 % or (a) ±5 % variation (dotted black lines).
The red lines in Fig. C1 demonstrate how the estimate of γ depends on these four inputs. The black dashed lines in Fig. C1 indicate how uncertainties in each input relate to uncertainties in γ , for "true" values typical of a tree PFT. We estimate uncertainties in the observed PFT fraction (e.g. from remote sensing) to be ±5 % and uncertainties in P (e.g. from JULES) to be ±20 %, leading to errors of ±17 % and ±20 % respectively. Likewise, ±20 % uncertainties in the internal parameters α and m 0 lead to ±12 % and ± 20 % uncertainties in γ . Combining these sources of uncertainty leads to an overall uncertainty in our inferred estimate of γ of about ±35 %. Code availability. The RED model Python code is archived at https://doi.org/10.5281/zenodo.3548678 (Argles, 2019). Furthermore, RED is currently being coupled into JULES, where a basic integration currently exists as branch (vn5.8_red_jules_couple) -this requires registration for the JULES repository (https://code. metoffice.gov.uk/trac/home, last access: 2 September 2020).
Author contributions. Originally the model framework was in JRM's thesis (Moore, 2016) under the supervision of PMC and CH. The description of PFT competition, the numerical model, and the equilibrium solutions has been further developed by APKA, JRM, ABH, and PMC. Currently RED is being integrated into JULES with the supervision of AJW and CJ. AJW also provided and processed the UKESM growth rates needed to drive RED globally within this paper.