A biome is a major regional ecological community characterized by distinctive life forms and principal plants. Many empirical schemes such as the Holdridge life zone (HLZ) system have been proposed and implemented to predict the global distribution of terrestrial biomes. Knowledge of physiological climatic limits has been employed to predict biomes, resulting in more precise simulation; however, this requires different sets of physiological limits for different vegetation classification schemes. Here, we demonstrate an accurate and practical method to construct empirical models for biome mapping: a convolutional neural network (CNN) was trained on an observation-based biome map, together with images depicting air temperature and precipitation. Unlike previous approaches, which require assumptions about the environmental constraints on each biome, this method automatically extracts non-linear seasonal patterns of climatic variables that are relevant in biome classification. The trained model accurately simulated a global map of current terrestrial biome distribution. The trained model was then applied to climate scenarios toward the end of the 21st century, predicting a significant shift in global biome distribution with rapid warming trends. Our results demonstrate that the proposed CNN approach can provide an efficient and objective method to generate preliminary estimations of the impact of climate change on biome distribution. Moreover, we anticipate that our approach could provide a basis for more general implementations to build empirical models of other climate-driven categorical phenomena.
Terrestrial biomes and climate are among the earliest known ecological concerns, and many empirical schemes have been proposed to characterize their relationship (Prentice and Leemans, 1990). One of the best known of these schemes is the Holdridge life zone (HLZ) system (Holdridge, 1947), which classifies vegetation distribution using only two independent variables: the annual mean precipitation and the bio-temperature (i.e. mean of above-freezing air temperature). Due to its simplicity, this scheme has been extensively implemented in numerous studies (Emanuel et al., 1985; Henderson-Sellers, 1991; Lugo et al., 1999; Monserud and Leemans, 1992; Prentice, 1990). For example, Elsen et al. (2021) applied historical climatologies and climate projections to the HLZ system for determining potential changes in global life zone distributions under changing climates.
Despite its relative simplicity, the HLZ scheme accounts well for ecophysiological constraints. This scheme is based on bio-temperatures, given that plant productivity becomes negligible at temperatures below 0 °C.
Efforts have been made to develop biome-mapping schemes that incorporate these environmental constraints. These implementations are considered to have clear physiological bases (Prentice et al., 1992; Woodward and Williams, 1987), and their predictions simulate present-day distributions of vegetation more accurately than the HLZ scheme. However, an important drawback of this type of approach is that it requires absolute physiological limits for each vegetation type or plant functional type (PFT), for which there is still insufficient comprehensive information, as this cannot be estimated from the geographical distribution of the vegetation (Lavorel et al., 2007). Making matters more difficult, researchers do not share the same classification criteria for terrestrial biomes, and the number of vegetation types or PFTs varies widely from five (Henderson-Sellers, 1991) to almost 100 (Box, 1981), depending on the research purpose and the geographical scale studied. By contrast, empirical approaches like the HLZ scheme do not require detailed physiological data and thus have the advantage of being easily applicable to any given vegetation classification criteria. Recently, empirical models for biome mapping using various types of environmental data have been developed by employing multinomial logistic regression (Levavasseur et al., 2012, 2013) and machine learning algorithms (Hengl et al., 2018).
Convolutional neural networks (CNNs) have been successfully adapted for use in species distribution modelling at regional scales (Benkendorf and Hawkins, 2020; Botella et al., 2018); however, they have not been used to develop global biome models. A CNN is a machine learning algorithm in which a model learns to conduct classification tasks directly from training data. Training a CNN is based on finding patterns in the spatial organization of the training data (typically images) that discriminate well between classes. Unlike conventional machine learning algorithms, a CNN learns directly from training data without requiring manual feature extraction.
Indeed, Botella et al. (2018) empirically demonstrated that a CNN model performed better at reconstructing species distributions than the popular species distribution modelling method, MAXENT (Phillips et al., 2006). This higher performance was attributed to CNN's efficient use of spatial patterns in environmental variables, which often control species distribution. MAXENT ignores these spatial patterns. A second explanation for the improved performance is that CNN can treat high-order interaction effects between input variables, whereas MAXENT, like the majority of other methods, only represents interactions between environmental variables by the products of variable pairs.
Using a CNN approach, we demonstrate an accurate and practical method to construct empirical models for operational global biome mapping. After evaluating the accuracy of the biome map reconstructed by this method, we applied the trained CNN to climatic scenarios toward the end of the 21st century to demonstrate a possible application of the model: predicting the shift in the global biome map under a changing climate. To the best of our knowledge, this is the first application of a CNN to reconstruct a global biome map. We employed only a small number of climatic variables as input to examine how a CNN improves the reconstruction accuracy compared to the classical HLZ scheme.
We follow Ise and Oba (2019) and Ise and Oba (2020) in how input variables are presented to the CNN for training. This method represents climatic conditions as graphical images and employs them as training data for CNN models. To account for seasonal variability, previous correlative climate–vegetation models needed to pre-define representative variables. For example, Levavasseur et al. (2013) divided each climatic variable into four “seasonal” predictors by averaging data over 3-month periods (i.e. DJF for winter, MAM for spring, JJA for summer and SON for fall). By contrast, the method we employed can automatically extract non-linear seasonal patterns of climatic variables that are relevant in biome classification. In other words, it enables CNNs to learn the seasonal pattern of multiple climatic variables without any index-based expression, which would reduce the amount of information and introduce a source of arbitrariness.
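For contrast with the image-based input, the pre-defined seasonal predictors used in earlier correlative models can be illustrated with a minimal sketch. It averages 12 monthly values into DJF, MAM, JJA and SON means; the function name and the example temperatures are hypothetical and are not taken from any of the cited studies.

```python
from statistics import mean

# Month indices (0 = January) for the four conventional 3-month seasons.
SEASONS = {
    "DJF": (11, 0, 1),
    "MAM": (2, 3, 4),
    "JJA": (5, 6, 7),
    "SON": (8, 9, 10),
}


def seasonal_predictors(monthly_values):
    """Collapse 12 monthly values into four seasonal means."""
    if len(monthly_values) != 12:
        raise ValueError("expected 12 monthly values")
    return {
        name: mean(monthly_values[m] for m in months)
        for name, months in SEASONS.items()
    }


# Hypothetical monthly mean air temperatures (deg C) for one grid cell.
temps = [-5.0, -3.0, 2.0, 8.0, 14.0, 19.0, 22.0, 21.0, 16.0, 9.0, 3.0, -2.0]
print(seasonal_predictors(temps))
```

Twelve values are reduced to four here, which is the loss of information that the image-based representation is intended to avoid.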
Comparison of global biome distributions used to evaluate the
training accuracies of the convolutional neural network (CNN) model.
For training the CNN model, we employed potential land cover types and the
monthly climate information from the ISLSCP2 Potential Natural Vegetation
Cover (Ramankutty and Foley, 2010) and CRU TS4.00 (Harris and Jones, 2017)
datasets, respectively. Both datasets have a 0.5
In machine learning experiments, the available data are
typically divided randomly into two subsets, of which one is used for model
training, and the other is then used to validate the trained model. This
study used the CRU TS4.00 climate data as training data, which was generated
by interpolating data from weather stations, meaning that values in each
grid are not independent of those in nearby grids. Under these
circumstances, validation using the typical procedures described above would
risk overfitting (i.e. training the model too closely or exactly to a
particular set of data, thereby creating a model that may fail to fit
additional data or reliably predict future observations) (Leinweber, 2007).
Therefore, other climate datasets were used for validating the trained
model: NCEP/NCAR reanalysis (Kalnay et al., 1996) and the HadGEM2-ES
(Collins et al., 2011) and MIROC-ESM datasets (Watanabe et al., 2011).
Notably, the nature of these three datasets is different from that of the
CRU TS4.00; the NCEP/NCAR consists of reanalysis data that incorporates
observed and weather model output data, while the other two datasets were
derived only from climate models. Details of these climate datasets are
available in Table S1. To be consistent with the training data, the spatial
resolutions of the validation data were linearly interpolated to a
0.5
In this study, the accuracy obtained when the model was applied to the training climate dataset (i.e. the CRU dataset) is referred to as the “training accuracy”, which shows how well the model was trained to extract common features of each category from images. The accuracy for the validation climate datasets (i.e. the NCEP/NCAR reanalysis, HadGEM2-ES and MIROC-ESM datasets) is referred to as the “test accuracy”, which shows how robust the model is to independent input data.
We graphically represented the standardized air temperature and
precipitation data on a grid using R statistical computing software version 3.3.3 (R-Core-Team, 2018). These images will be referred to hereafter as
visualized climatic environments (VCEs). For efficient machine learning,
climate data were standardized prior to visualization. The
In the VCE of the RGB colour tile, up to three climate variables can be represented by RGB channels. To find the optimal combination of climatic variables, we systematically evaluated the model performance of 14 combinations of climatic variable experiments for both annual and monthly means (Tables S3 and S4, respectively). Downward shortwave radiation and humidity were added for this evaluation, as all of the climate datasets contain these. Generally, training accuracy increases with the number of climatic variables; however, the test accuracy does not increase further after two climatic variables. This suggests that models with three climatic variables are at risk of overfitting. Amongst the models of annual and monthly means of climatic variables, the model with monthly mean air temperature and monthly precipitation had the highest test accuracy. Therefore, models that combined air temperature (bio-temperature for the model of annual mean climate) and precipitation were employed for the entire study.
We also evaluated the influence of different transformations of climatic variables (Table S5) and assignment patterns of air temperature and precipitation to RGB colour channels of the VCE (Table S6) on the resulting accuracy. Based on these evaluations, we settled on models with a combination of air temperature (bio-temperature for the model of annual mean climate) and precipitation, both of which are log transformed, and assigned to the blue and red channels, respectively, of the colour tile VCE representation. Examples of VCEs of annual mean climate and monthly mean climate are shown in Figs. S1 and S2, respectively.
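The channel assignment described above can be sketched as follows. This is only an illustration under stated assumptions: the scaling bounds, the shifted log transform and the layout of one pixel per month are hypothetical choices made for this example and do not reproduce the standardization or the tile layout actually used for the VCEs.

```python
import math

# Hypothetical bounds used for scaling; the actual standardization
# constants of the study are not reproduced here.
TEMP_RANGE = (-50.0, 40.0)     # monthly mean air temperature, deg C
PRECIP_RANGE = (0.0, 1000.0)   # monthly precipitation, mm


def scaled_log(value, lower, upper):
    """Clip to the bounds, shift to non-negative, log-transform, scale to 0..1."""
    clipped = min(max(value, lower), upper)
    return math.log1p(clipped - lower) / math.log1p(upper - lower)


def monthly_vce(temps, precips):
    """Return 12 (R, G, B) pixels: precipitation -> red, temperature -> blue."""
    if len(temps) != 12 or len(precips) != 12:
        raise ValueError("expected 12 monthly values per variable")
    pixels = []
    for t, p in zip(temps, precips):
        red = round(255 * scaled_log(p, *PRECIP_RANGE))
        blue = round(255 * scaled_log(t, *TEMP_RANGE))
        pixels.append((red, 0, blue))  # green channel left unused
    return pixels


# Hypothetical monthly climate for one grid cell.
example = monthly_vce([10.0] * 12, [80.0] * 12)
print(example[0])
```

Leaving the green channel empty mirrors the two-variable setting; a third variable could occupy it, which is what the three-variable experiments evaluated.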
LeNet (LeCun et al., 1998), one of the earliest CNN architectures, was employed
for this study. The computer used to execute the learning ran Ubuntu
16.04 LTS as the operating system and was equipped with an Intel
Core i7-8700 CPU, 16 GB of RAM and an NVIDIA GeForce GTX1080Ti graphics
card, which accelerates the learning procedure. On the computer, the NVIDIA
DIGITS 6.0.0 software (Caffe version 0.15.14) served as the basis for CNN
execution, and LeNet was employed to train the CNN via the TensorFlow
library. To see how DIGITS actually implements the CNN, its internal code
can be viewed using the DIGITS menu (on the “New image model” screen, click
the “Custom Network tab” and select “TensorFlow”). A description of the CNN
model and its parameter settings are available in the Supplement, Sect. S1. To train the CNN model, 10 VCEs corresponding to years 1971–1980 were
generated for each grid using the CRU data, resulting in 572 640 VCEs (i.e.
To validate the trained CNN model, a VCE of the average climate conditions from 1971 to 1980 was obtained for each grid and each validation climate dataset. These VCEs were applied to the trained CNN model and were classified by their most plausible biome. It took roughly 8 min to complete the VCE classification (i.e. 57 264 in total) for each climate dataset. Then, the computed biome distributions were validated by quantitative comparison with the observation-based biome map of ISLSCP2.
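The step of assigning each VCE to its most plausible biome can be sketched as below, assuming that the network ends in a softmax layer that converts raw class scores into probabilities. That assumption, the label names and the scores are hypothetical and serve only to illustrate the idea.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def most_plausible(scores, labels):
    """Return the label with the highest probability and that probability in %."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], 100.0 * probs[best]


# Hypothetical labels and scores for a single VCE.
labels = ["tundra", "boreal evergreen forest", "savanna"]
biome, certainty = most_plausible([0.2, 2.5, -1.0], labels)
print(biome, round(certainty, 1))
```

A low maximum probability indicates that several biomes are nearly equally plausible for the given climate.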
For comparing the differences and similarities between two biome maps, cross-tabulation matrices were obtained for each comparison. Tables S7 and S8 show cross-tabulation matrices of training accuracies as examples. Using these matrices, the differences between the two biome maps were separated into two components: quantity disagreement and allocation disagreement (Pontius and Millones, 2011). Here, a quantity disagreement indicates a discrepancy between the proportions of the categories (i.e. the biome), while an allocation disagreement indicates a discrepancy in the spatial allocation of the categories under a given set of category proportions in the reference and comparison maps.
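As an illustration of this decomposition, the following sketch computes the overall agreement, quantity disagreement and allocation disagreement from a cross-tabulation matrix, following the definitions of Pontius and Millones (2011). The function name and the example counts are hypothetical.

```python
def disagreement_components(matrix):
    """Split map disagreement into quantity and allocation components.

    matrix[i][j] counts grid cells assigned to category i in the comparison
    map and to category j in the reference map.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    p = [[cell / total for cell in row] for row in matrix]
    quantity = 0.0
    allocation = 0.0
    for g in range(n):
        comparison_share = sum(p[g])                      # row total
        reference_share = sum(p[i][g] for i in range(n))  # column total
        commission = comparison_share - p[g][g]
        omission = reference_share - p[g][g]
        quantity += abs(reference_share - comparison_share)
        allocation += 2 * min(commission, omission)
    agreement = sum(p[g][g] for g in range(n))
    # Each misclassified cell is counted under two categories, hence the halving.
    return agreement, quantity / 2, allocation / 2


# Hypothetical cross-tabulation of three categories (cell counts).
counts = [[40, 5, 5], [10, 20, 0], [0, 5, 15]]
agreement, quantity, allocation = disagreement_components(counts)
print(agreement, quantity, allocation)
```

By construction, the two components sum to the total disagreement, i.e. one minus the fraction of agreement.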
The use of one particular climatic dataset for training and three different climatic datasets for validation introduces a source of arbitrary error. To examine how reconstruction performance depends on the climatic datasets used for training and reconstruction, an experiment was performed wherein training and reconstruction of the same biome map were conducted using all combinations of the four historical climatic datasets, and the reconstruction accuracies were then compared.
Overall, 10 years of climate data may be insufficient to accurately train the model. We therefore conducted a sensitivity test in which performance was compared among models trained on monthly climate data averaged over 10-year (1971–1980; control), 20-year (1961–1980) and 30-year (1951–1980) periods. Validation datasets for each model were averaged over the same periods as the training data.
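The averaging step in this sensitivity test can be sketched as follows. The helper name and the synthetic records are hypothetical; the sketch only shows how monthly values would be averaged month by month over 10-, 20- and 30-year windows ending in 1980.

```python
def monthly_climatology(yearly_series):
    """Average a list of 12-value yearly series month by month."""
    if not yearly_series:
        raise ValueError("need at least one year of data")
    if any(len(year) != 12 for year in yearly_series):
        raise ValueError("each year must hold 12 monthly values")
    n_years = len(yearly_series)
    return [sum(year[m] for year in yearly_series) / n_years for m in range(12)]


# Synthetic monthly records for one grid cell, keyed by year (illustration only).
records = {
    year: [float(year - 1950 + m) for m in range(12)]
    for year in range(1951, 1981)
}

# Averaging windows: 10-year (control), 20-year and 30-year periods.
windows = {10: range(1971, 1981), 20: range(1961, 1981), 30: range(1951, 1981)}
climatologies = {
    length: monthly_climatology([records[y] for y in years])
    for length, years in windows.items()
}
print(climatologies[10][0], climatologies[30][0])
```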
We used different climate datasets for training and validating the models to
avoid overfitting that may be caused by dependencies in values among nearby
grids in the training data (CRU TS4.00). To assess the effects of
overfitting, we compared performance among four models that differed with
respect to the grain size of training data. Nearby grid cells
(0.5
Finally, we conducted an additional experiment to compare the accuracy of
potential natural vegetation (PNV) map reconstruction between the HLZ scheme and our method using a common
training dataset. We developed a look-up table of the most common PNV for
each combination of annual mean bio-temperature class and annual
precipitation class, consistent with the HLZ scheme. The bin sizes of the
HLZ scheme are six for the annual mean bio-temperature class and eight for
the annual precipitation class. As these coarse-grained bin sizes would
potentially depress the accuracy of the PNV simulation, we also developed
look-up tables of
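A look-up table of this kind can be sketched as follows. The class boundaries are hypothetical values on a doubling scale, chosen only so that they yield six bio-temperature classes and eight precipitation classes; they are not the boundaries used in this study, and the sample labels are likewise illustrative.

```python
from bisect import bisect_right
from collections import Counter, defaultdict

# Hypothetical class boundaries (doubling scale): six bio-temperature
# classes and eight annual precipitation classes.
BIOTEMP_EDGES = [1.5, 3.0, 6.0, 12.0, 24.0]                           # deg C
PRECIP_EDGES = [125.0, 250.0, 500.0, 1000.0, 2000.0, 4000.0, 8000.0]  # mm/yr


def climate_class(biotemp, precip):
    """Map annual climate values to a (temperature bin, precipitation bin) pair."""
    return bisect_right(BIOTEMP_EDGES, biotemp), bisect_right(PRECIP_EDGES, precip)


def build_lookup(samples):
    """Build the table from (biotemp, precip, pnv_label) training grid cells."""
    counts = defaultdict(Counter)
    for biotemp, precip, label in samples:
        counts[climate_class(biotemp, precip)][label] += 1
    # Keep the most common label observed in each occupied bin.
    return {key: counter.most_common(1)[0][0] for key, counter in counts.items()}


def predict(lookup, biotemp, precip, default=None):
    """Return the stored label for the bin, or default if the bin was never seen."""
    return lookup.get(climate_class(biotemp, precip), default)


# Hypothetical training cells.
samples = [
    (2.0, 200.0, "tundra"),
    (2.5, 180.0, "tundra"),
    (2.2, 220.0, "boreal forest"),
    (25.0, 3000.0, "tropical evergreen forest"),
]
lookup = build_lookup(samples)
print(predict(lookup, 2.9, 130.0))
```

Finer tables follow from supplying more closely spaced edges; bins that contain no training cell then become more frequent, which is why a default value is returned for unseen bins.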
Following validation, the CNN model trained with monthly mean climate data
was used to predict future biome distribution maps by applying climate
scenarios for the 21st century. These predictions were made for all
combinations of two general circulation models (GCMs; i.e. MIROC-ESM and HadGEM2-ES) and two
Representative Concentration Pathways (RCPs; i.e. RCP2.6 and RCP8.5). These
RCPs represent the atmospheric greenhouse gas (GHG) concentration trajectories
adopted by the IPCC for its Fifth Assessment Report (AR5) in 2014. RCP2.6
assumes that global annual GHG emissions will peak between 2010 and 2020 and
decline substantially afterwards. By contrast, RCP8.5 assumes that emissions
will continue to rise throughout the 21st century. The scenarios RCP2.6 and
RCP8.5, respectively, project that atmospheric CO
Global biome compositions of the observation-based map
Test accuracies representing how the trained CNN models simulate a
biome map with climatic conditions spanning from 1971 to 1980.
A comparison of the training accuracies between the annual climate model and the monthly climate model demonstrated that simulation of some biomes largely depended on climate seasonality (Figs. 1 and 2). Besides the most plausible biome, the CNN outputs its certainty, which is the probability (in %) that the CNN assigns to the classification. The geographical distribution of the certainty clearly showed that considering seasonality improves the certainty except in the northern parts of the South American and African continents, where no apparent seasonality exists (Fig. S3). These results are consistent with Prentice et al. (1992), demonstrating that global biome distribution is substantially controlled by climatic tolerance and by the occurrence and extent of drought seasons. In fact, seasonality significantly improved the average training accuracies from 3.5 % to 61.9 % for tropical deciduous forests, 0.4 % to 54.8 % for temperate broadleaf evergreen forests and 24.5 % to 79.0 % for boreal deciduous forests (Tables S7 and S8). The same pattern can be observed in test accuracy comparisons (Figs. 2, 3 and S3), although temperate broadleaf evergreen and boreal deciduous forests were largely absent from HadGEM2-ES and MIROC-ESM, respectively (Fig. 2). These absences are likely due to differences in the reconstructed current climate among datasets (Fig. S4). Overall, for all climatic datasets examined, better training and test accuracies were consistently obtained with CNN models trained with monthly mean climate data than with those trained with annual mean climate data (Fig. 4). Thus, the CNN model trained with monthly mean climate data was used for analysis with the climate scenarios in the 21st century.
Fractions of agreement and disagreement between the observation-based biome map and biome maps simulated by CNN models trained with monthly mean climate or annual mean climate from CRU climate data spanning from 1971 to 1980. These CNN models were applied to each of the four climatic datasets (CRU, NCEP, HadGEM2-ES and MIROC-ESM) spanning the same period as the training data. The fraction of agreement for CRU corresponds to the training accuracy, while that for the other climate datasets corresponds to the test accuracy.
For all combinations of CNN models and climatic data, the allocation disagreement was much larger than the quantity disagreement: while the allocation disagreement ranged from 0.227 to 0.392, the quantity disagreement varied from 0.037 to 0.200 (Fig. 4). The larger allocation disagreement can be explained by the tendency of observation-based biome distributions to be fragmented over areas with similar climatic conditions (Fig. 1a), while model-reconstructed biome distributions had more continuous structures (Figs. 1b–c and 3) (for example, the Australian continent). The probability of the most plausible biome tended to be lower for these fragmented regions (Fig. S3), suggesting these regions have climatic conditions suitable for multiple potential biomes. The lower quantity disagreement demonstrated that the CNN model reconstructed the fraction of the global biome composition under the current climatic conditions well. As the main purpose of this research is to develop an empirical model of climatic controls on biome distribution, this would indicate that the reconstructions of biome maps with the CNN models are actually much more accurate for their particular purpose than implied by the accuracies found from the simple map comparison.
CNN model accuracies for biome distribution simulations. These accuracies were obtained using the model trained with the climate dataset in the row, with the climate dataset in the column as input for reconstruction. Therefore, the italic values show the accuracy when the climate datasets for training and reconstruction were identical. For each climate dataset, the monthly mean temperature and monthly precipitation during 1971 to 1980 were standardized and log transformed, then used for drawing the RGB colour tile VCEs.
Table 1 shows how reconstruction accuracy depends on the combination of climate datasets used for training and testing. Accuracies were higher and less variable when the climate datasets for training and testing were identical (0.701–0.734) than when they were different (0.394–0.559). These results suggest that uncertainty in historical climate reconstruction and overfitting are more significant sources of failure in reconstructing biome distribution than the dependency of training on a particular climate dataset.
No major trends were observed in test accuracies in the sensitivity test, which compared performance among models trained using monthly climate averaged over 10-, 20- and 30-year periods (Table S9). This indicates that climate data averaged over a 10-year period are sufficient for model training. However, long-term climatic conditions are important in controlling biome distribution via extreme climates, which may cause complete reorganization of systems and communities and may provide important opportunities for, and constraints to, plant recruitment. For example, in response to an anomalous drought during 2002–2003, regional-scale die-off of overstorey woody plants was observed across southwestern North American woodlands (Breshears et al., 2005). Considering the effects of extreme climates in the model would be an interesting topic for future study.
Grain size of the training and validation data did not result in noticeable
differences in training and test accuracies, with the exception of the CRU
dataset (Table S10), demonstrating that the influence of grain size on
training efficiency is negligible. In contrast, test accuracies of the CRU
dataset were lower at coarser grain sizes, at 80.4 %, 78.2 %, 76.1 %
and 72.2 % for the
Accuracies of the CNN model and HLZ models for biome distribution simulations. The CNN
model corresponds to the top-row model of Table S2 (an RGB colour tile). The four
HLZ models have different bin sizes for climate classifications
(bio-temperature class
Accuracies of PNV reconstructions using the HLZ look-up tables for each
climate dataset increase with the resolution of bin sizes for climate
classifications (Table 2). They reach quasi-equilibrium at 24
bio-temperature classes
Predicted biome maps under climatic scenarios from 2091 to 2100. Monthly means of four sets of forecasted climatic conditions were derived from combinations of two climate models (i.e. HadGEM2-ES and MIROC-ESM) and two RCP scenarios (i.e. RCP2.6 and RCP8.5). These means were applied to the CNN model that was trained with the current biome distribution map and the present climatic conditions derived from the CRU dataset. Colour definitions are available in Fig. 1.
The applications of the CNN model to the climate scenarios predicted a significant shift in global biome distributions (Fig. 5) and area coverage (Fig. S5) under rapid warming trends (Figs. S6 and S7). For both GCM outputs, more intense biome shifts were predicted for RCP8.5 than for RCP2.6, but the shift trends remained consistent. The most visible change was the expansion of temperate forests over boreal forests in both North America and Eurasia. Boreal and cold vegetation shrank and its composition changed; tundra areas gave way to boreal forests, while boreal evergreen forests became confined to a narrow strip at higher latitudes. Tropical vegetation remained relatively unchanged, but nearly all tropical deciduous forests in the Southern Hemisphere were substituted by savanna, which coincided with a reduction in annual precipitation (Figs. S6 and S7).
Given the uncertainty of the climatic predictions derived from the Earth system models (ESMs) and RCP scenarios, our analysis of the climate change effect only indicates the potential for considerable changes in biome distribution at the end of the 21st century. Moreover, changes in the expected biome, which is an equilibrium state of vegetation coverage, are not always accompanied by immediate changes in actual vegetation. In fact, these time lags can be very long (i.e. decades to millennia) because the adjustment of vegetation to new climate conditions entails a series of plant population dynamics processes, such as seed dispersal, establishment, competition against other existing plants and reproduction (Sato and Ise, 2012). Even present-day plant species distributions are considered not to be in equilibrium with present-day climates (e.g. Woodward, 1990). Our study cannot infer such transient changes in vegetation; however, current process-based approaches are also not a reliable option for reconstructing plant population dynamic processes at the global scale, as biome map predictions under common climate change scenarios differ significantly among state-of-the-art dynamic global vegetation models (DGVMs) (Pugh et al., 2020). Hence, empirical and top-down approaches, like our simulation, should still have an important role to play in approximate mapping of biomes under changing climatic conditions.
There are two types of approach to mapping biomes: the correlative climate–vegetation approach and process-based approach (Notaro et al., 2012; Yates et al., 2009). We employed the former, which has advantages and disadvantages compared to the latter. An advantage of the correlative approach is that it is relatively straightforward and may be rapidly applied to different climate change scenarios. Indeed, models using the correlative approach are a common tool for predicting the impacts of climate change on biodiversity for conservation planning, because they can be easily used to simultaneously assess large numbers of species (e.g. Thomas et al., 2004).
An important disadvantage of the correlative method is that extrapolating current correlations between climate and biome distributions into the future may lead to seriously biased predictions; strong performance in the present climate does not guarantee similar performance under a new set of climatic conditions that may occur in the future. However, neither HadGEM2-ES (Figs. S3f and S8a–b) nor MIROC-ESM (Figs. S3h and S8c–d) showed apparent expansions of biome uncertainty under projected climatic conditions at the end of the 21st century. This may suggest that extrapolation outside the environmental space of the training data is not conspicuous at the global scale. Methodological uncertainty might also be quantified by comparing performance between correlative and process-based models in “unsuitable” conditions outside the environmental space of the training data (Yates et al., 2009).
A second disadvantage of the correlative approach is that it cannot infer
impacts of elevated atmospheric CO
DGVMs, which use process-based approaches, may facilitate the identification
of areas where elevated CO
We must also keep in mind that the correlative climate–vegetation approach ignores feedbacks between vegetation and climate, which are known to influence vegetation distribution at equilibrium (Pitman, 2003). Both HadGEM2-ES and MIROC-ESM explicitly consider climate–vegetation interactions, including dynamic adjustment of biome distribution, and hence their projected climates are the outcomes of such interactions. However, due to the differences in projected biome distributions among models, some regions should have mismatched reconstructions of the interactions. Coupling the CNN model with Earth system models to dynamically adjust biome distribution to the simulated climate would address this issue.
The CNN model was trained with an observation-based biome map, which is composed of natural vegetation only. However, the impact of human activity on ecosystems is now so prevalent that predicting ecosystem changes without explicit consideration of socioeconomic systems would be challenging (Ellis, 2015). Therefore, future research might address how current patterns of human activity interact with projected biome changes to reveal regions where these interactive agents align and amplify one another.
This study only considers biome distribution at the 0.5
Our study adopted the LeNet architecture, which has six hidden layers, to create CNN models. Botella et al. (2018) found that a deep network (six hidden layers) outperformed a shallow network (one hidden layer) for building species distribution models; however, Benkendorf and Hawkins (2020) found that using more than two hidden layers was of no benefit and argued that the usefulness of deeper networks depends on the size of the training dataset. Therefore, carefully selecting an architecture of appropriate complexity may improve model accuracy. We compared the performance of models trained with four different types of VCE representation of annual precipitation and average annual bio-temperature, and all models performed almost equally (Table S2). This result might indicate that LeNet reliably extracts at least two variables irrespective of how they are visualized. Lastly, the default parameters in NVIDIA DIGITS 6.0 were left largely unchanged. Our approach was kept relatively simple to demonstrate the robustness of our concept; however, further improvements to the scheme could be explored by selecting other architectures and systematically testing the effect of parameter modulation.
Regardless of the limitations discussed above, this study provides an efficient and practical method for generating preliminary estimations of the potentially dramatic impact of climate change on biome distributions. Since this method is simply an application of image classification AI, it demands relatively little technical skill and few computer resources. Reconstruction of global biome distribution substantially improved when climate seasonality was taken into consideration, demonstrating that the method successfully extracted seasonal patterns of climatic variables that are relevant in biome classification. This method could also be applied to building empirical models of other climate-driven phenomena, such as cropping systems and the spread of vector-borne diseases, and hence has the potential to become a de facto standard for building empirical models across a range of research and application fields.
All data required to reproduce the analyses described herein are publicly
available at the following URL/DOI:
The supplement related to this article is available online at:
HS conceived and conducted the experiments. HS and TI analysed the results. HS wrote the manuscript. HS and TI reviewed the manuscript.
At least one of the (co-)authors is a member of the editorial board of
Anonymous reviewers and Tobias Gerken provided valuable comments on previous versions of the paper. Shuntaro Watanabe and Yurika Oba of Kyoto University offered technical support regarding issues of deep learning, including the installation of the pertinent computer environments. Tomohiro Hajima, of the Japan Agency for Marine-Earth Science and Technology, converted the climate data of the MIROC-ESM. Tomomichi Kato, the topical editor, handled the review process. This work was funded by (1) a Japan Society for the Promotion of Science KAKENHI (grant nos. 18H03357 and 17H01477) and (2) the Arctic Challenge for Sustainability II (ArCS II) (programme grant no. JPMXD1420318865).
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This research has been supported by the Japan Society for the Promotion of Science (grant nos. 18H03357 and 17H01477) and the National Institute of Polar Research (grant no. Arctic Challenge for Sustainability II (ArCS II)).
This paper was edited by Tomomichi Kato and reviewed by Tobias Gerken and two anonymous referees.