A protocol for an intercomparison of biodiversity and ecosystem services models using harmonized land-use and climate scenarios

Abstract. To support the assessments of the Intergovernmental Science-Policy Platform
on Biodiversity and Ecosystem Services (IPBES), the IPBES Expert Group on
Scenarios and Models is carrying out an intercomparison of biodiversity and
ecosystem services models using harmonized scenarios (BES-SIM). The goals of
BES-SIM are (1) to project the global impacts of land-use and climate change
on biodiversity and ecosystem services (i.e., nature's contributions to
people) over the coming decades, compared to the 20th century, using a set of
common metrics at multiple scales, and (2) to identify model uncertainties
and research gaps through the comparisons of projected biodiversity and
ecosystem services across models. BES-SIM uses three scenarios combining
specific Shared Socio-economic Pathways (SSPs) and Representative
Concentration Pathways (RCPs) – SSP1xRCP2.6, SSP3xRCP6.0, SSP5xRCP8.6 – to
explore a wide range of land-use change and climate change futures. This
paper describes the rationale for scenario selection, the process of
harmonizing input data for land use, based on the second phase of the Land
Use Harmonization Project (LUH2), and climate, the biodiversity and ecosystem
services models used, the core simulations carried out, the harmonization of
the model output metrics, and the treatment of uncertainty. The results of
this collaborative modeling project will support the ongoing global
assessment of IPBES, strengthen ties between IPBES and the Intergovernmental
Panel on Climate Change (IPCC) scenarios and modeling processes, advise the
Convention on Biological Diversity (CBD) on its development of a post-2020
strategic plans and conservation goals, and inform the development of a new
generation of nature-centred scenarios.



Introduction
Understanding how anthropogenic activities impact biodiversity and human societies is essential for nature conservation and sustainable development. Land-use and climate change are widely recognized as two of the main drivers of future biodiversity change (Hirsch and CBD, 2010;Maxwell et al., 2016;Sala, 2000; Secretariat of the CBD and UNEP, 2014), with potentially severe impacts on ecosystem services and ultimately human well-being (Cardinale et al., 2012;MEA, 2005). Habitat and land-use changes, resulting from past, present, and future human activities, as well as climate change, have both immediate and long-term impacts on biodiversity and ecosystem services (Graham et al., 2017;Lehsten et al., 2015;Welbergen et al., 2008). Therefore, current and future land-use projections are essential elements for assessing biodiversity and ecosystem change (Titeux et al., , 2017. Climate change has already been observed to have direct and indirect impacts on biodiversity and ecosystems, which are projected to intensify by the end of the century, with potentially severe consequences for species and habitats, and, therefore, also for ecosystem functions and services (Pecl et al., 2017;Settele et al., 2015).
Global environmental assessments, such as the Millennium Ecosystem Assessment (MEA, 2005), the Global Biodiversity Outlooks (GBO), the multiple iterations of the Global Environmental Outlook (GEO), the Intergovernmental Panel on Climate Change (IPCC), and other studies have used scenarios to assess the impact of socio-economic development pathways on land use and climate and their consequences for biodiversity and ecosystem services (Jantz et al., 2015;Pereira et al., 2010). Models are used to quantify the biodiversity and ecosystem services impacts of different scenarios, based on climate and land-use projections from general circulation models (GCMs) and integrated assessment models (IAMs) (Pereira et al., 2010). These models include empirical dose-response models, species-area relationship models, species distribution models and more mech-anistic models such as trophic ecosystem models (Pereira et al., 2010;Akçakaya et al., 2015). So far, each of these scenario exercises has been based on a single model or a small number of biodiversity and ecosystem services models, and intermodel comparison and uncertainty analysis have been limited (IPBES, 2016;Leadley et al., 2014). The Expert Group on Scenarios and Models of the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES) is addressing this gap by carrying out a biodiversity and ecosystem services model intercomparison with harmonized scenarios, for which this paper lays out the protocol.
Over the past 2 decades, IPCC has fostered the development of global scenarios to inform climate mitigation and adaptation policies. The Representative Concentration Pathways (RCPs) describe different climate futures based on greenhouse gas emissions throughout the 21st century . These emissions pathways have been converted into climate projections in the most recent Climate Model Inter-comparison Project (CMIP5). In parallel, the climate research community also developed the Shared Socioeconomic Pathways (SSPs), which consist of trajectories of future human development with different socio-economic conditions and associated land-use projections Riahi et al., 2017). The SSPs can be combined with RCP-based climate projections to explore a range of futures for climate change and land-use change, and they are being used in a wide range of impact modeling intercomparisons (Rosenzweig et al., 2017;. Therefore, the use of the SSP-RCP framework for modeling the impacts on biodiversity and ecosystem services provides an outstanding opportunity to build bridges between the climate, biodiversity and ecosystem services communities; it has been explicitly recommended as a research priority in the IPBES assessment on scenarios and models (IPBES, 2016).
Model intercomparisons bring together different communities of practice for comparable and complementary modeling, in order to improve the comprehensiveness of the subject modeled, and to estimate uncertainties associated with scenarios and models (Frieler et al., 2015). In the last decades, various model intercomparison projects (MIPs) have been initiated to assess the magnitude and uncertainty of climate change impacts. For instance, the Inter-Sectoral Impact Model Intercomparison Project (ISI-MIP) was initiated in 2012 to quantify and synthesize climate change impacts across sectors and scales (Rosenzweig et al., 2017;Warszawski et al., 2014). The ISI-MIP aims to bridge sectors such as agriculture, forestry, fisheries, water, energy, and health with global circulation models, Earth system models (ESMs), and integrated assessment models for more integrated and impact-driven modeling and assessment .
Here, we present the methodology used to carry out a BES-SIM in both terrestrial and freshwater ecosystems. The BES-SIM project addresses the following questions.
(1) What are the projected magnitudes and spatial distribu-tion of biodiversity and ecosystem services under a range of land-use and climate future scenarios? (2) What is the magnitude of the uncertainties associated with the projections obtained from different scenarios and models? Although independent of the ISI-MIP, the BES-SIM has been inspired by ISI-MIP and other intercomparison projects and was initiated to address the needs of the global assessment of IPBES. We brought together 10 biodiversity models and six ecosystem functions and services models to assess impacts of land-use and climate change scenarios in the coming decades (up to 2070) and to hindcast changes to the last century (to 1900). The modeling approaches differ in several respects concerning how they treat biodiversity and ecosystem services responses to land-use and climate changes, including the use of correlative, deductive, and process-based approaches, and in how they treat spatial-scale and temporal dynamics. We assessed different classes of essential biodiversity variables (EBVs), including species populations, community composition, and ecosystem function, as well as a range of measures on ecosystem services such as food production, pollination, water quantity and quality, climate regulation, soil protection, and pest control (Pereira et al., 2010;Akçakaya et al., 2015). This paper provides an overview of the scenarios, models and metrics used in this intercomparison, and thus a roadmap for further analyses that is envisaged to be integrated into the first global assessment of the IPBES (Fig. 1).

Scenario selection
All the models included in BES-SIM used the same set of scenarios with particular combinations of SSPs and RCPs. In the selection of the scenarios, we applied the following criteria: (1) data on projections should be readily available, and (2) the total set should cover a broad range of landuse change and climate change projections. The first criterion entailed the selection of SSP-RCP combinations that are included in the ScenarioMIP protocol as part of CMIP6 (O'Neill et al., 2016), as harmonized data were available for these runs and they form the basis of the CMIP climate simulations. The second criterion implied a selection of scenarios with low and high degrees of climate change and different land-use scenarios within the ScenarioMIP set. Our final selection was SSP1 with RCP2.6 (moderate land-use pressure and low level of climate change) , SSP3 with RCP6.0 (high land-use pressure and moderately high level of climate change) , and SSP5 with RCP8.5 (medium land-use pressure and very high level of climate change) , thus allowing us to assess a broad range of plausible futures (Table 1). Further, by combining projections of low and high anthropogenic pressure on land use with low and high levels of climate change, we can test these drivers' individual and synergistic impacts on biodiversity and ecosystem services. The first scenario (SSP1xRCP2.6) is characterized by a relatively "environmentally friendly world" with low population growth, high urbanization, relatively low demand for animal products, and high agricultural productivity. These factors together lead to a decrease in the land use of around 700 Mha globally over time (mostly pastures). This scenario is also characterized by low air pollution, as policies are introduced to limit the increase in greenhouse gases in the atmosphere, leading to an additional forcing of 2.6 W m −2 before 2100. The second scenario (SSP3xRCP6.0) is characterized by "regional rivalry", with high population growth, slow economic development, material-intensive consumption, and low food demand per capita. Agricultural land intensification is low, especially due to the very limited transfer of new agricultural technologies to developing countries. This scenario has minimal land-use change regulation, with a large land conversion for human-dominated uses, and a relatively high level of climate change with a radiative forcing of 6.0 W m −2 by 2100. The third scenario (SSP5xRCP8.5) is a world characterized by "strong economic growth" fuelled by fossil fuels, with low population growth, high urbanization, and high food demand per capita but also high agricultural productivity. As a result, there is a modest increase in land use. Air pollution policies are stringent, motivated by local health concerns. This scenario leads to a very high level of climate change with a radiative forcing of 8.5 W m −2 by 2100. Full descriptions of each SSP scenario are provided in  and Riahi et al. (2017). The SSP scenarios excluded elements that have interaction effects with climate change except for SSP1, which focuses on environmental sustainability. Thus, SSPs describe futures where biodiversity is not affected by climate change to allow for the important estimation of the climate change impact on biodiversity .

Input data
A consistent set of land-use and climate data was implemented across the models to the extent possible. All models in BES-SIM used the newly released Land Use Harmonization version 2 dataset (LUH2, Hurtt et al., 2018). For the models that require climate data, we selected the climate projections of the past, present, and future from CMIP5/ISIMIP2a (McSweeney and Jones, 2016) and its downscaled version from the WorldClim (Fick and Hijmans, 2017), as well as MAGICC 6.0 (Meinshausen et al., 2011a, b) from the IMAGE model for GLOBIO models (Table 2). A complete list of input datasets and variables used by the models is documented in Table S1 of the Supplement.
(a) SSP scenarios SSP1 Sustainability nomic and land-use modules for the translation of narratives into consistent quantitative projections across scenarios . It is important to note that the used land-use scenarios, although driven mostly by the SSP storylines, were projected to be consistent with the paired RCPs and include biofuel deployment to mitigate climate change.
The SSP3 is associated with RCP7.0 (SSP3xRCP7.0); however, climate projections (i.e., time series of precipitation and temperature) are currently not available for RCP7.0. Therefore, we chose the closest RCP available, which was RCP6.0, for the standalone use of climate projections, and chose SSP3xRCP6.0 for the land-use projections from the LUH2. In this paper, we refer to this scenario as SSP3xRCP6.0. The land-use projections from each of the IAMs were harmonized using the LUH2 methodology. LUH2 was de-veloped for CMIP6 and provides a global gridded land-use dataset comprising estimates of historical land-use change  and future projections (2015-2100), obtained by integrating and harmonizing land-use history with future projections of different IAMs (Jungclaus et al., 2017;Lawrence et al., 2016;O'Neill et al., 2016). Compared to the first version of the LUH , LUH2 (Hurtt et al., 2018) is driven by the latest SSPs, has a higher spatial resolution (0.25 vs 0.50 • ), more detailed land-use transitions (12 versus 5 possible land-use states), and increased data-driven constraints (Heinimann et al., 2017;Monfreda et al., 2008). LUH2 provides over 100 possible transitions per grid cell per year (e.g., crop rotations, shifting cultivation, agricultural changes, wood harvest) and various agricultural management layers (e.g., irrigation, synthetic nitrogen fertilizer, biofuel Table 2. Improvements made in the Land Use Harmonization v2 (LUH2) from LUH v1 (sources: Hurtt et al., 2011Hurtt et al., , 2018 crops), all with annual time steps. The 12 land states include the separation of primary and secondary natural vegetation into forest and non-forest sub-types, pasture into managed pasture and rangeland, and cropland into multiple crop functional types (C 3 annual, C 3 perennial, C 4 annual, C 4 perennial, and N-fixing crops) ( Table 2). For biodiversity and ecosystem services models that rely on discrete, high-resolution land-use data (i.e., the GLOBIO model for terrestrial biodiversity and the InVEST model), the fractional LUH2 data were downscaled to discrete land-use grids (10 arcsec resolution; ∼ 300 m) with the land-use allocation routine of the GLOBIO4 model. To that end, urban, cropland, pasture, rangeland, and forestry areas from LUH2 were first aggregated across the LUH2 grid cells to the regional level of the IMAGE model, with forestry consisting of the wood harvest from forested cells and nonforested cells with primary vegetation. Next, the totals per region were allocated to 300 m cells with the GLOBIO4 land allocation routine, with specific suitability layers for urban, cropland, pasture, rangeland, and forestry areas. After allo-cation, cropland was reclassified into three intensity classes (low, medium, high) based on the amount of fertilizer used per grid cell. More details on the downscaling procedure are provided in Supplementary Methods in the Supplement.

Climate data
GCMs are based on fundamental physical processes (e.g., conservation of energy, mass, and momentum and their interaction with the climate system) and simulate climate patterns of temperature, precipitation, and extreme events on a large scale (Frischknecht et al., 2016). Some GCMs now incorporate elements of Earth's climate system (e.g., atmospheric chemistry, soil and vegetation, land and sea ice, carbon cycle) in Earth system models (GCMs with an interactive carbon cycle), and have dynamically downscaled models with higher-resolution data in regional climate models (RCMs).
A large number of climate datasets are available today from multiple GCMs, but not all GCMs provide projections for all RCPs. In BES-SIM, some models require continuous time-series data. In order to harmonize the climate data to be used across biodiversity and ecosystem services models, we chose the bias-corrected climate projections from CMIP5, which were also adopted by ISIMIP2a (Hempel et al., 2013) or their downscaled versions available from WorldClim (Fick and Hijmans, 2017). Most analyses were carried out using a single GCM, the IPSL-CM5A-LR (Dufresne et al., 2013), since it provides mid-range projections across the five GCMs (HadGEM2-ESGFDL-ESM2M, IPSL-CM5A-LR, MIROC-ESM-CHEM, and NorESM1-M) in ISIMIP2a (Warszawski et al., 2014). The ISIMIP2a output from the IPSL-CM5A-LR provides 12 climate variables on daily time steps from the pre-industrial period 1951 to 2099 at 0.5 • resolution (Mc-Sweeney and Jones, 2016), of which only a subset was used in this exercise (Table S1). The WorldClim downscaled dataset has 19 bioclimatic variables derived from monthly temperature and rainfall from 1960 to 1990 with multi-year averages for specific points in time (e.g., 2050, 2070) up to 2070. Six models in BES-SIM used the ISIMIP2a dataset and three models used the WorldClim dataset. An exception was made for the GLOBIO models, which used MAGICC 6.0 climate data (Meinshausen et al., 2011a, b) in the IMAGE model framework , to which GLOBIO is tightly connected ( Table 3). The variables used from the climate dataset in each model are listed in Table S1.

Other input data
In addition to the land-use and climate data, most models use additional input data to run their future and past simulations to estimate changes in biodiversity and ecosystem services. For instance, species occurrence data are an integral part of modeling in 6 of 10 biodiversity models, while 2 models rely on estimates of habitat affinity coefficients (e.g., reductions in species richness in a modified habitat relative to the pristine habitat) from the PREDICTS model Purvis et al., 2018). In three dynamic global vegetation models (DGVMs), atmospheric CO 2 concentrations, irrigated fraction, and wood harvest estimates are commonly used, while two ecosystem services models rely on topography and soil-type data for soil erosion measures. A full list of model-specific input data is given in Table S1.

Models in BES-SIM
Biodiversity and ecosystem services models at the global scale have increased in number and improved considerably over the last decade, especially with the availability of biodiversity data and advancement in statistical modeling tools and methods (IPBES, 2016). In order for a model to be included in BES-SIM, it had either to be published in a peer-reviewed journal or adopt published methodologies, with modifications made to modeling sufficiently docu-mented and accessible for review (Table S2). Sixteen models were included in BES-SIM (Appendix A, details on modeling methods in Table S2). These models were mainly grouped into four classes: species-based, community-based, and ecosystem-based models of biodiversity, and models of ecosystem functions and services. The methodological approaches, the taxonomic or functional groups, the spatial resolution and the output metrics differ across models (Appendix A). All 16 models are spatially explicit, with 15 of them using land-use data as an input and 13 of them requiring climate data. We also used one model, BIOMOD2 (Thuiller, 2004;Thuiller et al., 2009), to assess the uncertainty of climate range projections without the use of land-use data.

Species-based models of biodiversity
Species-based models aim to predict historical, current, and future potential distribution and abundance of individual species. These can be developed using correlative methods based on species observation and environmental data (Aguirre-Gutiérrez et al., 2013;Guisan and Thuiller, 2005;Guisan and Zimmermann, 2000) as well as expert-based solutions where data limitations exist (Rondinini et al., 2011). Depending on the methodologies employed and the ecological aspects modeled, they can be known as species distribution models, ecological niche models, bioclimatic envelope models, and habitat suitability models (Elith and Leathwick, 2009). Such species-based models have been used to forecast environmental impacts on species distribution and status.
In BES-SIM, four species-based models were included: AIM-biodiversity (Ohashi et al., 2018), InSiGHTS (Rondinini et al., 2011;Visconti et al., 2016), MOL (Jetz et al., 2007;Merow et al., 2013), and BIOMOD2 (Appendix A, Table S2). The first three models project individual species distributions across a large number of species by combining projections of climate impacts on species ranges with projections of land-use impacts on species ranges. AIMbiodiversity uses Global Biodiversity Information Facility (GBIF) species occurrence data on 9025 species across five taxonomic groups (amphibians, birds, mammals, plants, reptiles) to train statistical models for current land use and climate to project future species distributions. InSiGHTS uses species' presence records from regular sampling within species' ranges and pseudo-absence records from regular sampling outside of species' ranges on 2827 species of mammals. MOL uses species land-cover preference information and species presence and absence predictions on 20 833 species of amphibians, birds, and mammals. In-SiGHTS and MOL rely on IUCN's range maps as a baseline, which are developed based on expert knowledge of the species habitat preferences and areas of non-occurrence (Fourcade, 2016). Both models use a hierarchical approach with two steps: first, a statistical model trained on current species ranges is used to assess future climate suitability within species ranges; second, a model detailing associations between species and habitat types based on expert opinion is used to assess the impacts of land use in the climate-suitable portion of the species range. BIOMOD2 is an R modeling package that runs up to nine different algorithms (e.g., random forests, logistic regression) of species distribution models using the same data and the same framework. BIOMOD2 included three taxonomic groups (amphibians, birds, mammals) (see Sect. 7 "Uncertainties").

Community-based models of biodiversity
Community-based models predict the assemblage of species using environmental data and assess changes in community composition through species presence and abundance (D'Amen et al., 2017). Output variables of community-based models include assemblage-level metrics, such as the proportion of species persisting in a landscape, mean species abundances (number of individuals per species), and compositional similarity (pairwise comparison at the species level) relative to a baseline (typically corresponding to a pristine landscape).
Three models in BES-SIM -cSAR-iDiv (Martins and Pereira, 2017), cSAR-IIASA-ETH (Chaudhary et al., 2015), and BILBI (Hoskins et al., 2018;Ferrier et al., 2004Ferrier et al., , 2007 -rely on versions of the species-area relationship (SAR) to estimate the proportion of species persisting in humanmodified habitats relative to native habitat (i.e., the number of species in the modified landscape divided by the number of species in the native habitat). In its classical form, the SAR describes the relationship between the area of native habitat and the number of species found within that area. The countryside SAR (cSAR) builds on the classic SAR but accounts for the differential use of both human-modified and native habitats by different functional species groups. Both the cSAR-iDiv and cSAR-IIASA-ETH models use habitat affinities (proportion of area of a habitat type that can be effectively used by a species group) to weight the areas of the different habitats in a landscape. The habitat affinities are calibrated from field studies by calculating the change in species richness in a modified habitat relative to the native habitat. The habitat affinities of the cSAR-iDiv model are estimated from the PREDICTS dataset (Hudson et al., 2017 while the habitat affinities of cSAR-IIASA-ETH come from a previously published database of studies (Chaudhary et al., 2015). The cSAR-iDiv model considers 9853 species for one taxonomic group (birds) in two functional groups (forest species and non-forest species) while cSAR-IIASA-ETH considers a total of 1 911 583 species for five taxonomic groups (amphibians, birds, mammals, plants, reptiles) by ecoregions (these are, however, not 1 911 583 unique species as a species present in two ecoregions will be counted twice). BILBI couples application of the species-area relationship with correlative statistical modeling of continuous spatial turnover patterns in the species composition of communities as a function of environmental variation. Through space-Geosci. Model Dev., 11, 4537-4562, 2018 www.geosci-model-dev.net/11/4537/2018/ for-time projection of compositional turnover (i.e., change in species), this coupled model enables the effects of both climate change and habitat modification to be considered in estimating the proportion of species persisting for 254 145 vascular plant species globally. Three community-based models -PREDICTS, GLOBIO Aquatic (Alkemade et al., 2009;Janse et al., 2015), and GLOBIO Terrestrial (Alkemade et al., 2009;Schipper et al., 2016) -estimate a range of assemblage-level metrics based on empirical dose-response relationships between pressure variables (e.g., land-use change and climate change) and biodiversity variables (e.g., species richness or mean species abundance) (Appendix A). PREDICTS uses a hierarchical mixed-effects model to assess how a range of site-level biodiversity metrics respond to land use and related pressures, using a global database of 767 studies, including over 32 000 sites and 51 000 species from a wide range of taxonomic groups (Hudson et al., 2017. GLOBIO is an integrative modeling framework for aquatic and terrestrial biodiversity that builds upon correlative relationships between biodiversity intactness and pressure variables, established with meta-analyses of biodiversity data retrieved from the literature on a wide range of taxonomic groups.

Ecosystem-based model of biodiversity
The Madingley model (Harfoot et al., 2014b) is a mechanistic individual-based model of ecosystem structure and function. It encodes a set of fundamental ecological principles to model how individual heterotrophic organisms with a body size greater than 10 µg that feed on other living organisms interact with each other and with their environment. The model is general in the sense that it applies the same set of principles for any ecosystem to which it is applied, and is applicable across scales from local to global. To capture the ecology of all organisms, the model adopts a functional trait-based approach with organisms characterized by a set of categorical traits (feeding mode, metabolic pathway, reproductive strategy, and movement ability), as well as continuous traits (juvenile, adult, and current body mass). Properties of ecological communities emerge from the interactions between organisms, influenced by their environment. The functional diversity of these ecological communities can be calculated, as well as the dissimilarity over space or time between communities (Table S2). Madingley uses three functional groups (trophic levels, metabolic pathways, and reproductive strategies).

Models of ecosystem functions and services
In order to measure ecosystem functions and services, three DGVM models -LPJ-GUESS (Lindeskog et al., 2013;Olin et al., 2015;Smith et al., 2014), LPJ (Poulter et al., 2011;Sitch et al., 2003), and CABLE (Haverd et al., 2018) -and three ecosystem services models -InVEST (Sharp et al., 2016), GLOBIO (Alkemade et al., 2009Schulp et al., 2012), and GLOSP (Guerra et al., 2016) -were engaged in this model intercomparison. The DGVMs are process-based models that simulate responses of potential natural vegetation and associated biogeochemical and hydrological cycles to changes in climate and atmospheric CO 2 and disturbance regimes (Prentice et al., 2007). Processes in anthropogenically managed land (cropland, pastures, and managed forests) are also increasingly being accounted for . DGVMs can project changes in future ecosystem states (e.g., type of plant functional trait (PFT), relative distribution of each PFT, biomass, height, leaf area index, water stress), ecosystem functioning (e.g., moderation of climate, processing/filtering of waste and toxicants, provision of food and medicines, modulation of productivity, decomposition, biogeochemical and nutrient flows, energy, matter, water), and habitat structure (i.e., amount, composition, and arrangement of physical matter that describe an ecosystem within a defined location and time); however, DGVMs are limited in capturing species-level biodiversity change because vegetation is represented by a small number of plant functional types (PFTs) (Bellard et al., 2012;Thuiller et al., 2013).
The InVEST suite includes 18 models that map and measure the flow and value of ecosystem goods and services across a landscape or a seascape. They are based on biophysical processes of the structure and function of ecosystems, and they account for both supply and demand. The GLO-BIO model estimates ecosystem services based on outputs from the IMAGE model , the PCRaster Global Water Balance global hydrological model (PCR-GLOBWB, van Beek et al., 2011), and the Global Nutrient Model (Beusen et al., 2015). It is based on correlative relationships between ecosystem functions and services, and particular environmental variables (mainly land use), quantified based on literature data. Finally, GLOSP is a 2-D model that estimates the level of global and local soil erosion, and protection using the Universal Soil Loss Equation.

Output metrics
Given the diversity of modeling approaches, a wide range of biodiversity and ecosystem services metrics can be produced by the model set (Table S2). For the biodiversity model intercomparison analysis, three main categories of common output metrics were reported over time: extinctions as absolute change in species richness (N, number of species) or as proportional species richness change (P , % species), abundance-based intactness (I , % intactness), and mean proportional change in suitable habitat extent across species (H , % suitable habitat) (Table 4). These metrics were calculated at two scales: local or grid cell (α scale, i.e., the value of the metric within the smallest spatial unit of BES-SIM which is the grid cell) and regional or global scale (γ scale, i.e., the value of the metric for a set of grid cells comprising a Table 4. Selected output indicators for intercomparison of biodiversity and ecosystems models. For species diversity change, both proportional changes in species richness (P ) and absolute changes (N) are reported. Some models project the α metrics at the level of the grid cell (e.g., species-based and SAR based community models) while others average the local values of the metrics across the grid cell weighted by the area of the different habitats in the cell (e.g., PREDICTS, GLOBIO).  (Table 4). For the models that can project γ metrics, both regional-γ for each IPBES regions ( Table 1 in Brooks et al., 2016;UNEP-WCMC, 2015) and a global-γ were reported. The species diversity change metrics measured as absolute number or percentage change in species richness show species persistence and extinction in a given time and place. Absolute changes in species richness and proportional species richness change are interrelated and may be calculated from reporting species richness over time, as N t = S t − S t0 and P = N t /S t0 , where S t is the number of species at time t. Most models reported one or both types of species richness metrics (Table 4). The abundance-based intactness (I ) measures the mean species abundance in the current community relative to the abundances in a pristine community. This metric is available only for two community-based models: GLOBIO (where intactness is estimated as the arithmetic mean of the abundance ratios of the individual species, whereby ratios > 1 are set to 1) and PREDICTS (where intactness is estimated as the ratios of the sum of species abundances). The habitat change (H ) measures cell-wise changes in available habitat for the species. It represents the changes in the suitable habitat extent of each species relative to a baseline, i.e., (E i,t − E i,t0 )/E i,t0 , where E i,t is the suitable habitat extent of species i at time t within the unit of analysis. It is reported by averaging across species occurring in each unit of analysis (grid cell, region, or globe), and is provided by the species-level models (i.e., AIM-biodiversity, In-SiGHTS, MOL) ( Table 4). The baseline year, t 0 , used to calculate changes for the extinction and habitat extent metrics, was the first year of the simulation (in most cases t 0 = 1900; see Table 5).

BES-SIM
For ecosystem functions and services, each model's output metrics were mapped onto the new classification of Nature's Contributions to People (NCP) published by the IPBES scientific community (Díaz et al., 2018). Among the 18 possible NCPs, the combination of models participating in BES-SIM was able to provide measures for 10 NCPs, including regulating metrics on pollination (e.g., proportion of agricultural lands whose pollination needs are met, % agricultural area), climate (e.g., vegetation carbon, total carbon uptake and loss, MgC), water quantity (e.g., monthly runoff, Pg month −1 ), water quality (e.g., nitrogen and phosphorus leaching, PgN s −1 ), soil protection (e.g., erosion protection, 0-100 index), hazards (e.g., costal vulnerability, unitless score; flood risk, number of people affected) and detrimental organisms (e.g., fraction of cropland potentially protected by the natural pest relative to all available cropland, km 2 ), and material metrics on bioenergy (e.g., bioenergy-crop production, PgC yr −1 ), food and feed (e.g., total crop production, 10 9 KCal) and materials (e.g., wood harvest, KgC) ( Table 6). Some of these metrics require careful interpretation in the context of NCPs (e.g., an increase in flood risk can be caused by climate change and/or by a reduction of the capacity of ecosystems to reduce flood risk) and additional translation of increasing or declining measures of ecosystem functions and services (e.g., food and feed, water quantity) into contextually relevant information (i.e., positive or negative impacts) Geosci. Model Dev., 11, 4537-4562, 2018 www.geosci-model-dev.net/11/4537/2018/ on human well-being and quality of life. Given the disparity of metrics across models within each NCP category, names of the metrics are listed in Table 6, and units, definitions, and methods are provided in Table S3.

Core simulations
The simulations for BES-SIM required a minimum of two outputs from the modeling teams: present (2015) and future (2050). Additionally, a past projection (1900) and a further future projection (2070) were also provided by several modeling teams. Some models projected further into the past and also at multiple time points from the past to the future (Appendix A). Models that simulated a continuous time series of climate change impacts provided 20-year averages around these mid-points to account for inter-annual variability. The models ran simulations at their original spatial resolutions (Appendix A), and upscaled results to 1 • grid cells using arithmetic means. In order to provide global or regional averages of the α or grid cell metrics, the arithmetic mean values across the cells of the globe or a certain region were calculated, as well as percentiles of those metrics. Both 1 • rasters and a table with values for each IPBES region and the globe were provided by each modeling team for each output metric.
To measure the individual and synergistic impacts of landuse and climate change on biodiversity and ecosystem services, models accounting for both types of drivers were run three times: with land-use change only, with climate change only, and with both drivers combined. For instance, to measure the impact of land use alone, the projections into 2050 were obtained while retaining climate data constant from the present (2015) to the future (2050). Similarly, to measure the impact of climate change alone, the climate projections into 2050 (or 2070) were obtained while retaining the land-use data constant from the present (2015) to the future (2050). Finally, to measure the impact of land-use and climate change combined, models were run using projections of both land-use and climate change into 2050 (or 2070). When models required continuous climate time-series data to hindcast to 1900, data from years in the time period 1951 to 1960 were randomly selected to fill the data missing for years 1901 to 1950 from the ISIMIP 2a IPSL dataset. Models that used multi-decadal climate averages from WorldClim (i.e., InSiGHTS, BILBI) assumed no climate impacts for 1900.

Uncertainties
Reporting uncertainty is a critical component of model intercomparison exercises (IPBES, 2016). Within BES-SIM, un- Table 6. Selected output indicators for inter-comparison of ecosystem functions and services models, categorized based on the classification of Nature's Contributions to People (Díaz et al., 2018 www.geosci-model-dev.net/11/4537/2018/ certainties were explored by each model reporting the mean values of its metrics, and where possible the 25th, 50th, and 75th percentiles based on the parameterization set specific to each model, which can be found in each model's key manuscripts describing the modeling methods. When combining the data provided by the different models, the average and the standard deviations of the common metrics were calculated (e.g., intermodel average and standard deviation of P γ ). In a parallel exercise to inform BES-SIM, the BIOMOD2 model was used in assessing the uncertainty in modeling changes in species ranges arising from using different RCP scenarios, different GCMs, a suite of species distribution modeling algorithms (e.g., random forest, logistic regression), and different species dispersal hypotheses.

Conclusions
The existing SSP and RCP scenarios provide a consistent set of past and future projections of two major drivers of terrestrial and freshwater biodiversity change -land use and climate. However, we acknowledge that these projections have certain limitations. These include limited consideration of biodiversity-specific policies in the storylines (only the SSP1 baseline emphasizes additional biodiversity policies) (O'Neill et al., 2016;Rosa et al., 2017), coarse spatial resolution, and land-use classes that are not sufficiently detailed to fully capture the response of biodiversity to land-use change (Harfoot et al., 2014a;Titeux et al., , 2017. The heterogeneity of models and their methodological approaches, as well as additional harmonization of metrics of ecosystem functions and services (Tables 6, S3), are areas for further work. In the future, it will also be important to capture the uncertainties associated with input data, with a focus on uncertainty in land-use and climate projections resulting from differences among IAMs and GCMs on each scenario . The gaps identified through BES-SIM and future directions for research and modeling will be published separately, as well as analyses of the results on the model intercomparison and on individual models.
As a long-term perspective, BES-SIM is expected to provide critical foundation and insights for the ongoing development of nature-centred, multiscale Nature Futures scenarios (Rosa et al., 2017). Catalyzed by the IPBES Expert Group on Scenarios and Models, this new scenario and modeling framework will shift traditional ways of forecasting impacts of society on nature to more integrative, biodiversity-centred visions and pathways of socio-economic and ecological systems. A future round of BES-SIM could use these biodiversity-centred storylines to project dynamics of biodiversity and ecosystem services and associated consequences for socio-economic development and human well-being. This will help policymakers and practitioners to collectively identify pathways for sustainable futures based on alternative biodiversity management approaches and assist researchers in incorporating the role of biodiversity into socio-economic scenarios.
Code and data availability. The output data from this model intercomparison will be downloadable from the website of the IPBES Expert Group on Scenarios and Models in the future (https: //www.ipbes.net/deliverables/3c-scenarios-and-modelling, last access: 8 November 2018). The LUH2 land-use data used for model runs are available at http://luh.umd.edu/data.shtml . The climate datasets used in BES-SIM can be downloaded from the respective websites (https://www.isimip.org/outputdata/ (Inter-sectoral Impact Model Intercomparison Project Output Data, 2017), http://worldclim.org/version1, Hijmans et al., 2017). www.geosci-model-dev.net/11/4537/2018/ Geosci. Model Dev., 11, 4537-4562, 2018 Appendix A A species distribution model that estimates biodiversityloss-based projected shift of species range under the conditions of land-use and climate change.
Distribution of suitable habitat (land) estimated from climate and land-use data using a statistical model on species presence and climate and land-use classifications, calibrated by historical data.
Please see Table S2  An expert map-based species distribution model that projects potential losses in species occurrences and geographic range sizes given changes in suitable conditions of climate and land-cover change.
Expert maps for terrestrial amphibians, birds and mammals as a baseline for projections, combined with downscaled layers for current climate. A penalized point process model estimated individual species niche boundaries, which were projected into 2050 and 2070 to estimate range loss. Species habitat preferenceinformed land-cover associations were used to refine the proportion of suitable habitat in climatically suitable cells with present and future land-cover-based projections.
Inductive species distribution modeling was built using point process models to delineate niche boundaries. Binary maps of climatically suitable cells were rescaled (to [0,1]) based on the proportion of the cell within a species land-cover preference. An R package that allows one to run up to nine different algorithms of species distribution models using the same data and the same framework. An ensemble could then be produced allowing a full treatment of uncertainties given the data, algorithms, climate models, and climate scenarios.
BIOMOD2 is based on species distribution models that link observed or known presenceabsence data to environmental variables (e.g., climate). Each model is cross-validated several times (a random subset of 70 % of the data are used for model calibration, while 30 % are held out for model evaluation Proportional species richness of each species group is a power function of the sum of the areas of each habitat in a landscape, weighted by the affinity of each species group with each habitat type. Species richness is calculated by multiplying the proportional species richness by the number of species known to occur in the area. The total number of species in a landscape is the sum of the number of species for each species group. Two functional groups of bird species: (1) forest birds; (2) non-forest birds. Habitat affinities retrieved from the PREDICTS database. A countryside species area relationship model that estimates the impact of time series of spatially explicit land-use and land-cover changes on communitylevel measures of terrestrial biodiversity.
Extends concept of the SAR to mainland environment where the habitat size depends not only on the extent of the original pristine habitat, but also on the extent and taxon-specific affinity of the other non-pristine land uses and land covers (LULC) of conversion. Affinities derived from field records. Produces the average habitat suitability, regional species richness, and loss of threatened and endemic species for five taxonomic groups. A modeling framework that couples application of the species-area relationship with correlative generalized dissimilarity modeling (GDM)-based modeling of continuous patterns of spatial and temporal turnover in the species composition of communities (applied in this study to vascular plant species globally).
The potential effects of climate scenarios on beta-diversity patterns are estimated through space-for-time projection of compositional-turnover models fitted to present-day biological and environmental data. These projections are then combined with downscaled land-use scenarios to estimate the proportion of species expected to persist within any given region. This employs an extension of species-area modeling designed to work with biologically scaled environments varying continuously across space and time.
Please see Table S3 for detailed methodology. The hierarchical mixedeffects model that estimates how four measures of sitelevel terrestrial biodiversity overall abundance, within-sample species richness, abundance-based compositional similarity and richness-based compositional similarityrespond to land use and related pressures.
Models employ data from the PREDICTS database encompassing 767 studies from over 32 000 sites on over 51 000 species. Models assess how alpha diversity is affected by land use, land-use intensity, and human population density. Model coefficients are combined with past, present and future maps of the pressure data to make global projections of response variables, which are combined to yield the variants of the Biodiversity Intactness Index (an indicator first proposed by Scholes and Biggs, 2005

GLOBIO -Terrestrial
A modeling framework that quantifies the impacts of multiple anthropogenic pressures on local biodiversity (MSA).
Based on a set of correlative relationships between biodiversity (MSA) on the one hand and anthropogenic pressures on the other, quantified based on metaanalyses of biodiversity data reported in the literature. Georeferenced layers of the pressure variables are then combined with the response relationships to quantify changes in biodiversity.
Improved land-use allocation routine, improved response relationships for encroachment ( Vegetation dynamics result from growth and competition for light, space, and soil resources among woody plant individuals and herbaceous understorey. A suite of simulated patches per grid cell represents stochastic processes of growth and mortality (succession). Individuals for woody PFTs are identical within an age cohort. Processes such as photosynthesis, respiration, and stomatal conductance are simulated daily. Net primary production (NPP) accrued at the end of each simulation year is allocated to leaves, fine roots, and, for woody PFTs, sapwood, resulting in height, diameter and biomass growth.
The model version used here has some updates to the fire model compared to Knorr et al. (2016); see also Rabin et al. (2017). Simulations also accounted for wood harvest, using the modeled recommendations from LUH2. 0.5 • 1920, 1950, 1970, 2015, 2050, 2070   A big leaf model that simulates the coupled dynamics of biogeography, biogeochemistry and hydrology under varying climate, atmospheric CO 2 concentrations, and land-use landcover change practices to represent demography of grasses and trees in a scale from individuals to landscapes.
Hierarchical representation of the land surface -tiles represent land use with various plant or crop functional types. Implements establishment, mortality, fire, carbon allocation, and land-cover change on annual time steps, and calculates photosynthesis, autotrophic respiration, and heterotrophic respiration on daily time steps. Fully prognostic, meaning that PFT distributions and phenology are simulated based on physical principles within a numerical framework.
LPJ represents the full set of states and transitions represented in LUHv2 and improved estimate of carbon fluxes from land-cover change.
0.5 • 1920, 1950, 1970, 2015, 2050, 2070 Poulter et al. (2011) Combines biophysics (coupled photosynthesis, stomatal conductance, canopy energy balance) with daily biogeochemical cycling of carbon and nitrogen (CASA-CNP) and annual patch-based representation of vegetation structural dynamics (POP). Accounts for gross landuse transitions and wood harvest, including effects on patch age distribution in secondary forest. Simulates co-ordination of ratelimiting processes in C 3 photosynthesis, as an outcome of fitness maximization.
Quantifies a range of provisioning services (e.g., crop production, grass and fodder production, wild food), regulating services (e.g., pest control, pollination, erosion risk reduction, carbon sequestration), and culture services (e.g., naturebased tourism) and other measures (e.g., water availability, food risk reduction, harmful algal blooms). Derived from various models, including the Integrated Model to Assess the Global Environment (IM-AGE) model and PCRaster Global Water Balance (PCR-GLOBWB), and from empirical studies using meta-analysis.
Relationships between land use and the presence of pollinators and predators updated through additional peer review papers.  A suite of geographic information system (GIS) based spatially explicit models used to map and value the ecosystem goods and services in biophysical or economic terms.
18 models for distinct ecosystem services designed for terrestrial, freshwater, marine and coastal ecosystems. Based on production functions that define how changes in an ecosystem's structure and function are likely to affect the flows and values of ecosystem services across a landscape or a seascape. Accounts for both service supply and the location and activities of demand. Modular and selectable.
The crop-production model was simplified from 175 crops to the 5 crop types reported in LUH2. Other models have minor simplifications; see Tables S2 and S3 for more detail. Protected soil (Ps) is defined as the amount of soil that is prevented from being eroded (water erosion) by the mitigating effect of available vegetation. Ps is calculated from the difference between soil erosion (Se) and potential soil erosion (Pse) based on the integration of the joint effect of slope length, rainfall erosivity, and soil erodibility. Soil protection is given by the value of fractional vegetation cover calculated as a function of land use, altitude, precipitation, and soil properties.
Please see Table S3  Author contributions. All the authors co-designed the study and provided scientific input and technical details on models, scenarios and data necessary to carry out the intermodel comparison and synthesis. HMP, RA, PL, and IMDR led the development of the protocol, and HK led the writing of the manuscript with model-specific text contributions and review comments from all co-authors.
Competing interests. The authors declare that they have no conflict of interest.