Geosci. Model Dev., 15, 1899–1911, 2022
https://doi.org/10.5194/gmd-15-1899-2022

Model description paper 08 Mar 2022


Building a machine learning surrogate model for wildfire activities within a global Earth system model

Qing Zhu1, Fa Li1,2, William J. Riley1, Li Xu3, Lei Zhao4, Kunxiaojia Yuan1,2, Huayi Wu2, Jianya Gong5, and James Randerson3
  • 1Climate and Ecosystem Sciences Division, Climate Sciences Department, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  • 2State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, China
  • 3Department of Earth System Science, University of California Irvine, Irvine, CA, USA
  • 4Department of Civil and Environmental Engineering, University of Illinois Urbana-Champaign, Champaign, IL, USA
  • 5School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, China

Correspondence: Qing Zhu (qzhu@lbl.gov)

Abstract

Wildfire is an important ecosystem process, influencing land biogeophysical and biogeochemical dynamics and atmospheric composition. Fire-driven loss of vegetation cover, for example, directly modifies the surface energy budget as a consequence of changing albedo, surface roughness, and partitioning of sensible and latent heat fluxes. Carbon dioxide and methane emitted by fires contribute to a positive atmospheric forcing, whereas emissions of carbonaceous aerosols may contribute to surface cooling. Process-based modeling of wildfires in Earth system land models is challenging due to limited understanding of human, climate, and ecosystem controls on fire counts, fire size, and burned area. Integration of mechanistic wildfire models within Earth system models requires careful parameter calibration, which is computationally expensive and subject to equifinality. To explore alternative approaches, we present a deep neural network (DNN) scheme that surrogates the process-based wildfire model with the Energy Exascale Earth System Model (E3SM) interface. The DNN wildfire model accurately simulates observed burned area with over 90 % higher accuracy with a large reduction in parameterization time compared with the current process-based wildfire model. The surrogate wildfire model successfully captured the observed monthly regional burned area during validation period 2011 to 2015 (coefficient of determination, R2=0.93). Since the DNN wildfire model has the same input and output requirements as the E3SM process-based wildfire model, our results demonstrate the applicability of machine learning for high accuracy and efficient large-scale land model development and predictions.

1 Introduction

Wildfires burn ∼500 million hectares of vegetated land surface each year, which significantly modifies the physical properties and biogeochemical cycles of terrestrial ecosystems (Bond-Lamberty et al., 2007; Randerson et al., 2006; Pellegrini et al., 2018; Andela et al., 2017). Living vegetation biomass, surface litter, and coarse woody debris are directly combusted and removed by wildfire (Walker et al., 2019; Harden et al., 2006). It has been suggested that global forest cover would double if fire were eliminated (Bond et al., 2005). Fire has multiple important consequences for the climate system, including directly releasing greenhouse gases (e.g., CO2, CH4) (Ross et al., 2013; Kasischke and Bruhwiler, 2002) and aerosols (Jiang et al., 2020); changing land surface albedo and energy budgets (French et al., 2016; Rother and De Sales, 2020) and land–atmosphere exchanges of heat, mass, and momentum (Chambers and Chapin, 2002); limiting plant transpiration and regional water recycling (Holden et al., 2018; Brando et al., 2020); and reshaping forest composition (Mekonnen et al., 2019). In addition, biomass burning emits a large amount of fine particulate matter that contributes to about 30 % of cloud condensation nuclei globally (Day, 2004). Soil organic matter decomposition, nitrogen mineralization, and the richness and diversity of soil fungal communities (Oliver et al., 2015) could also be influenced by wildfire through modifying litter substrate supply and degraded enzymatic activities (Pellegrini et al., 2020; 2018; Bowd et al., 2019; Holden et al., 2018).

Climate change and land use activities have jointly affected fire spatial distribution, frequency, and intensity (Xu et al., 2020; Kelley et al., 2019; Andela et al., 2017) since the pre-industrial era. For example, warmer and drier climate conditions enhance fuel aridity and favor fire occurrence in forest ecosystems where fuels have built up over a period of decades and centuries (Abatzoglou and Williams, 2016; Williams et al., 2019). Even if annual precipitation does not decline, redistribution of precipitation towards extreme wet season rainfall events could contribute to longer dry periods and thus more severe fire activity (Xu et al., 2020). Human activities often shape wildfire activity through regulating patterns of ignition and fire occurrence (e.g., power line ignition) (Keeley and Syphard, 2018) and suppressing wildfire activity by means of land fragmentation, fire management, and livestock grazing (Andela et al., 2017). In California, fire density is highly associated with population density and the distance to the wildland–urban interface (WUI) (Syphard et al., 2007). At the global scale, along gradients of increasing population density, fire frequency initially increases by up to 20 % and then gradually declines in more densely populated areas (Knorr et al., 2014).

Although global wildfire burned area has declined over the recent two decades (Andela et al., 2017), many vulnerable ecosystems and geographic regions have experienced significant increases in wildfire activity (Abatzoglou and Williams, 2016; Walker et al., 2019) resulting in large losses of natural resources and economic assets (Stephenson et al., 2013; Papakosta et al., 2017). In western US forests, wildfire has dramatically increased, costing billions of dollars each year and gaining widespread public attention. This regional wildfire increase is mainly driven by concurrent increases in spring temperature and declining snowpacks (Westerling et al., 2006), mid-summer increases in vapor pressure deficit (Williams et al., 2019), and increases in drought stress during fall (Goss et al., 2020). The enhancement of wet and dry oscillations favors initial vegetation growth and subsequent wildfire activity (Heyerdahl et al., 2002; Saha et al., 2019).

Wildfire models have played an important role in many aspects of wildfire research, including monitoring fire spread (Finney, 1998; Radke et al., 2019), analyzing controllers of wildfire short-term and long-term variability (Kelley et al., 2019), predicting severity of the upcoming fire seasons (Preisler and Westerling, 2007) and climate-scale fire variability (Girardin and Mudelsee, 2008; Yue et al., 2013), and understanding the complex climate–wildfire–ecosystem feedbacks (Clark et al., 2004; Zou et al., 2020; Mekonnen et al., 2019). Two types of wildfire models are widely used: process-based models and data-driven statistical models (Hantson et al., 2016). Process-based wildfire models consider detailed processes related to natural fire ignition (Prentice and Mackerras, 1977), anthropogenic ignition (Venevsky et al., 2002), fire spread and duration (Thonicke et al., 2010), fire suppression (Lenihan and Bachelet, 2015), and fire mass and heat fluxes (Li et al., 2012). Process-based wildfire models have been widely used in dynamic vegetation models and coupled Earth system models (ESMs) with various complexities of parameterization (Li et al., 2019; Rabin et al., 2017). As more and more detailed fire processes are considered and parameterized, structural and parametric uncertainties may increase due to incomplete representation of individual processes and imperfect mathematical formulation (Riley and Thompson, 2017). Historically, data-driven models were often used for fire behavior modeling and aim to track the ignition, spread, duration, and extinction of individual fires (Finney, 1998; Radke et al., 2019) at fine spatial and temporal scales. This group of models are more relevant to operational fire research. 
In contrast, process-based wildfire models embedded in global vegetation models or Earth system land models focus on grid-cell-aggregated burned area dynamics, which are more relevant to research on large-scale patterns and climate-scale predictions (Li et al., 2019; Rabin et al., 2017). This study focuses on this second category of wildfire models.

Although explicit processes are simulated, the accuracy of process-based wildfire models is highly dependent on parameterization, which is computationally expensive (Zhu and Zhuang, 2014; Teckentrup et al., 2018; Xu et al., 2021). Data-driven models, however, directly link the driving variables (e.g., climate factors) to the fire activity using simple statistical models or more sophisticated machine learning techniques, ignoring the explicit processes and feedbacks associated with wildfire (Radke et al., 2019; Tonini et al., 2020; Ganapathi Subramanian and Crowley, 2018). Through training and validation, these models learn statistical representations of wildfire dynamics. Data-driven wildfire models are diverse in terms of driving variables and model structure. For example, many current machine learning wildfire models rely on remote oceanic dynamics (e.g., sea surface temperature variability) and atmospheric teleconnections to simulate land surface fire activities (Yu et al., 2020; Chen et al., 2011, 2020). Another group of data-driven wildfire models draws more heavily upon regional climate, plant functional type, and human infrastructure driver variables (Coffield et al., 2019; Sayad et al., 2019).

In this study, we develop a machine learning wildfire model using the process representation of wildfire in the Energy Exascale Earth System Model (E3SM) land model (ELMv1) (Zhu et al., 2019), five observationally inferred burned area products (Andela et al., 2019; Giglio et al., 2018; Lizundia-Loiola et al., 2020, 2018; Van Der Werf et al., 2017), and a deep neural network approach (Goodfellow et al., 2016). We implemented a deep learning model that can better capture the complex and non-linear interactions between controlling factors and wildfire activity. The objectives of this study are to surrogate the wildfire parameterization in ELMv1 with the deep neural network and improve the model-simulated wildfire burned area across various fire regions (Giglio et al., 2013).

2 Methodology

2.1 ELMv1 wildfire model

The process-based wildfire model in ELMv1 originates from the Community Land Model (CLM4.5) (Li et al., 2012); we take this wildfire model as the baseline (hereafter referred to as BASE-Fire) without modification on process representation. BASE-Fire combines information regarding ignition, fuel conditions, surface climate, and anthropogenic suppression to simulate total burned area based on the fire counts and spread area of each fire (Fig. 1). The fire count in BASE-Fire is modeled as the sum of anthropogenic ignition and natural ignition, where the latter is proportional to lightning density (Prentice and Mackerras, 1977) and the former is determined by population density (Venevsky et al., 2002). Human activity may also intentionally suppress wildfire occurrence if the fire is detected at an early stage. For example, developed regions with high population density and gross domestic product are less likely to use fire to remove surface biomass. On the other hand, developed regions are more likely to suppress fire given more effective fire management policy and suppression capability. Fire count is also affected by surface fuel availability (aboveground biomass) and fuel combustibility (relative humidity, topsoil temperature, and moisture). The fire spread area in BASE-Fire is modeled as an elliptical-shaped region controlled by wind speed and fuel wetness (Rothermel, 1972) (using topsoil (0–15 cm) moisture as a proxy). The fire duration is set to be one day based on a study that reported the mean global fire persistence of years 2001–2004 (Giglio et al., 2006a). BASE-Fire also does not explicitly consider roads, rivers, and firefighting activity (Arora and Boer, 2005).

Figure 1. Schematic representation of the ELMv1 process-based BASE-Fire model and the components to be surrogated with the deep neural network (DNN) model (dark grey).

2.2 Deep neural network wildfire surrogate model

We developed the new fire model in two steps: (1) surrogating BASE-Fire with a deep neural network (DNN) approach and (2) improving that surrogate model using five observationally inferred burned area products (Table S1 in the Supplement). First, we surrogated BASE-Fire with a DNN approach (hereafter referred to as DNN-Fire) that uses the same input and output variables as BASE-Fire but treats the explicit intermediate processes (e.g., ignition, fire spread) as latent variables encoded by hidden layers in the DNN (Fig. 1). DNN-Fire was developed with five hidden layers and five neurons in each layer for burned area simulation. The DNN approach uses a fully connected feed-forward neural network (Schmidhuber, 2015) that comprises input, hidden, and output layers:

(1) $h_1 = f_1(W_1 I + b_1)$,
(2) $h_2 = f_2(W_2 h_1 + b_2)$,
(3) $h_3 = f_3(W_3 h_2 + b_3)$,
(4) $h_4 = f_4(W_4 h_3 + b_4)$,
(5) $h_5 = f_5(W_5 h_4 + b_5)$,
(6) $O = f_6(W_6 h_5 + b_6)$,

where I denotes the input layer (e.g., climate factors) with 11 neurons, each corresponding to an input variable listed in Table 1. h1, h2, h3, h4, and h5 are five hidden vectors, each calculated in two steps. The first step is a linear combination of the previous layer's output vector (I or h) with the trainable weight matrices W1, W2, W3, W4, W5, and W6, plus the biases b1, b2, b3, b4, b5, and b6. The second step applies the nonlinear activation functions f1, f2, f3, f4, f5, and f6 to the result of the first step. In this study we used softplus as the activation function (Zheng et al., 2015), which is a non-linear transformation of input signals. O denotes the output layer, which summarizes the latent variables from the last hidden layer (h5) and calculates burned area.
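The layer-by-layer computation in Eqs. (1)–(6) can be illustrated with a minimal NumPy sketch. It assumes softplus is applied at every layer including the output (the text names softplus as the activation but does not distinguish f6), and the function names, random initialization, and input values are hypothetical choices for illustration, not the configuration used in ELMv1.

```python
import numpy as np

def softplus(x):
    # Numerically stable softplus: log(1 + exp(x))
    return np.logaddexp(0.0, x)

def init_params(n_in=11, n_hidden=5, n_layers=5, n_out=1, seed=0):
    # Layer sizes 11 -> 5 -> 5 -> 5 -> 5 -> 5 -> 1, i.e. W1..W6 and b1..b6.
    # The normal(0, 0.5) initialization is an assumption for illustration.
    rng = np.random.default_rng(seed)
    sizes = [n_in] + [n_hidden] * n_layers + [n_out]
    return [(rng.normal(0.0, 0.5, size=(m, n)), np.zeros(m))
            for n, m in zip(sizes[:-1], sizes[1:])]

def forward(params, I):
    # Apply h_k = f_k(W_k h_{k-1} + b_k) for k = 1..6, with h_0 = I
    h = I
    for W, b in params:
        h = softplus(W @ h + b)
    return h

params = init_params()
I = np.full(11, 0.5)      # 11 normalized input variables (placeholder values)
O = forward(params, I)    # burned-area output, shape (1,)
```

Because softplus is strictly positive, the sketch always returns a positive burned-area value, which is one plausible reason to prefer it over an unbounded linear output.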

Table 1. Input and output variables of ELMv1 BASE-Fire and surrogate DNN-Fire models.

Second, we improved the surrogate DNN-Fire by fine-tuning the weight parameters using observations (hereafter referred to as DNN-Fire-OBS). For the period 2001 to 2010, we initialized DNN-Fire-OBS's weight parameters (W1, W2, W3, W4, W5, and W6) using results from DNN-Fire; replaced the BASE-Fire burned area with the ensemble mean of five observationally inferred burned area products, namely GFEDv4s (Van Der Werf et al., 2017), Fire_CCI51 (Lizundia-Loiola et al., 2020), Fire_CCILT11 (Lizundia-Loiola et al., 2018), MODIS MCD64 (Giglio et al., 2018), and Fire_Atlas (Andela et al., 2019) (Table S1 in the Supplement); and adjusted weight parameters until the model best reproduced the observed burned area. This two-step approach also allows rapid parameterization of the fire model as new fire data and baseline fire model results become available. DNN-Fire-OBS can be more easily generalized since BASE-Fire provides explicit physical guidance and a larger-than-observation input and output feature space for development of the machine learning fire model. One limitation is the large discrepancy among the five burned area products. Tuning DNN-Fire towards the ensemble mean of the five products is a compromise among these differences; future work is needed to improve the burned area data quality and consistency.
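The two-step idea (fit a surrogate to baseline model output, then continue training from those weights against an observational ensemble mean) can be sketched as follows. This is a toy illustration only: a single softplus layer stands in for the five-layer DNN, all data are synthetic, the five "products" are hypothetical scaled copies of one signal, and the hand-written Adam loop and its hyperparameters are assumptions rather than the authors' training setup.

```python
import numpy as np

def softplus(z):
    return np.logaddexp(0.0, z)             # stable log(1 + exp(z))

def sigmoid(z):
    return 0.5 * (1.0 + np.tanh(0.5 * z))   # derivative of softplus

def predict(theta, X):
    # theta holds 11 weights followed by one bias
    return softplus(X @ theta[:-1] + theta[-1])

def mse(theta, X, y):
    return np.mean((predict(theta, X) - y) ** 2)

def grad_mse(theta, X, y):
    z = X @ theta[:-1] + theta[-1]
    dz = 2.0 * (softplus(z) - y) * sigmoid(z) / y.size
    return np.append(X.T @ dz, dz.sum())

def adam(theta, X, y, steps, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    # Adaptive moment estimation with bias-corrected first and second moments
    theta = theta.copy()
    m = np.zeros_like(theta)
    v = np.zeros_like(theta)
    for t in range(1, steps + 1):
        g = grad_mse(theta, X, y)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g ** 2
        theta -= lr * (m / (1 - b1 ** t)) / (np.sqrt(v / (1 - b2 ** t)) + eps)
    return theta

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(500, 11))        # normalized inputs (synthetic)
y_base = softplus(X @ rng.normal(0.0, 1.0, 11))  # stand-in for BASE-Fire output
# Five hypothetical "products" that disagree with each other; target is their mean
products = np.stack([s * 0.6 * y_base for s in (0.8, 0.9, 1.0, 1.1, 1.2)])
y_obs = products.mean(axis=0)

theta_dnn = adam(np.zeros(12), X, y_base, steps=2000)  # step 1: surrogate
theta_obs = adam(theta_dnn, X, y_obs, steps=500)       # step 2: fine-tune
```

Starting step 2 from `theta_dnn` rather than from a fresh initialization is what makes this a fine-tuning step: the optimizer only has to correct the surrogate's bias relative to the observational target.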

2.3 Model setup and simulation protocol

We ran ELMv1 with BASE-Fire at 1.9° × 2.5° spatial resolution (Zhu et al., 2016, 2020) to generate training and testing datasets for the DNN wildfire model. BASE-Fire was first spun up for 600 years with accelerated soil decomposition, followed by a 200-year spinup with regular soil decomposition (Koven et al., 2013). The spinup simulations were forced with constant atmospheric CO2 concentration (285 ppmv) and 1901–1920 repeated climate forcing from GSWP3 (Global Soil Wetness Project) (Dirmeyer et al., 2006). The purpose of the spinup was to initialize ecosystem carbon pools and stabilize plant and soil carbon and water fluxes. Restarting from the “spun-up” conditions, a transient simulation was then conducted from 1901 to 2015 with GSWP3 transient climate forcing, atmospheric CO2 concentrations, and nitrogen and phosphorus deposition (Lamarque et al., 2005; Mahowald et al., 2008). Wildfire-associated variables were selected for output with a monthly temporal resolution (Table 1).

BASE-Fire output from years 1981 to 2010 was used to train, test, and fine-tune DNN-Fire. We developed 14 region-specific models, corresponding to 14 widely used GFED regions. For each region, all land grid cells (comprising no fire history, infrequent fire, and repeated fire) were concatenated into one data matrix (where rows correspond to samples and columns to variables). A total of 80 % of the data matrix was randomly sampled for the training dataset, and the remaining 20 % of the data were reserved for testing. Furthermore, the random sampling was stratified in order to reduce the risk of sampling, e.g., adjacent high fire grid cells. All grid cells were first divided into three “strata” based on the magnitude of the burn: low burn (0–33 percentile), medium burn (33–66 percentile), and high burn (67–100 percentile). The stratified random sample assured that the grid cells sampled for training and testing had the same ratios of low, medium, and high burn, thus eliminating the sampling bias from spatial autocorrelation (Wang et al., 2012). In addition to random sampling, we also investigated the impacts of data choice on the model performance by sampling the testing datasets within specific years (e.g., 2001–2002, 2003–2004, 2005–2006, 2007–2008, 2009–2010) and using the rest of the years for training. We found negligible differences among the models (Fig. S1 in the Supplement), indicating that the choice of training/testing data years was not impactful. Therefore, we will discuss the results of the stratified random sampling approach as the major results throughout the paper.
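The stratified 80/20 split described above can be sketched in a few lines of NumPy. The tercile boundaries, the function name `stratified_split`, and the synthetic burned-area values are assumptions made for illustration; the original sampling code is not given in the text.

```python
import numpy as np

def stratified_split(burn, train_frac=0.8, seed=0):
    """Split sample indices into train/test, sampling each burn stratum at the same rate."""
    rng = np.random.default_rng(seed)
    lo, hi = np.quantile(burn, [1 / 3, 2 / 3])
    strata = np.digitize(burn, [lo, hi])   # 0 = low, 1 = medium, 2 = high burn
    train, test = [], []
    for s in range(3):
        idx = rng.permutation(np.flatnonzero(strata == s))
        n_train = int(round(train_frac * idx.size))
        train.append(idx[:n_train])
        test.append(idx[n_train:])
    return np.concatenate(train), np.concatenate(test), strata

# Synthetic, strongly skewed burned-area values for 300 hypothetical grid cells
burn = np.random.default_rng(1).gamma(0.5, 2.0, size=300)
train_idx, test_idx, strata = stratified_split(burn)
```

Sampling within each stratum guarantees that the test set contains low-, medium-, and high-burn cells in the same proportions as the training set, which a plain random draw over a skewed distribution does not.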

All training and testing datasets were normalized to the range [0, 1] with the following scaler:

(7) $X = \dfrac{X - X_{\min}}{X_{\max} - X_{\min}}$,

where X is the variable vector of interest, and Xmin and Xmax are minimum and maximum values of X, respectively. During the training stage, we randomly initialized the weighting parameters (Eq. 1–6) and optimized them using the adaptive moment estimation method (Kingma and Ba, 2014), which is a variant of the gradient descent optimization method but considers adaptive learning rate and momentum-like exponentially decaying gradients. The parameter optimization aimed to minimize a mean squared error cost function:

(8) $J = \dfrac{1}{n} \sum_{i=1}^{n} \left( y_i^{\mathrm{DNN}} - y_i^{\mathrm{BASE}} \right)^2$,

where $y_i^{\mathrm{DNN}}$ and $y_i^{\mathrm{BASE}}$ are the DNN-Fire- and BASE-Fire-generated burned area, respectively, for grid cell i, and n is the number of grid cells. Cost function J summarizes the overall magnitude of the error between the surrogate DNN-Fire and BASE-Fire. We then evaluated model performance using metrics of mean absolute error (Eq. 9), Pearson correlation (Eq. 10), and coefficient of determination (Eq. 11).

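(9) $\mathrm{MAE} = \dfrac{1}{n} \sum_{i=1}^{n} \left| y_i^{\mathrm{DNN}} - y_i^{\mathrm{BASE}} \right|$,
(10) $p = \dfrac{\mathrm{covariance}\left(y^{\mathrm{DNN}}, y^{\mathrm{BASE}}\right)}{\sqrt{\mathrm{variance}\left(y^{\mathrm{DNN}}\right)\,\mathrm{variance}\left(y^{\mathrm{BASE}}\right)}}$,
(11) $R^2 = 1 - \dfrac{\sum_{i=1}^{n} \left( y_i^{\mathrm{DNN}} - y_i^{\mathrm{BASE}} \right)^2}{\sum_{i=1}^{n} \left( y_i^{\mathrm{BASE}} - y_{\mathrm{mean}}^{\mathrm{BASE}} \right)^2}$.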
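The min–max scaler (Eq. 7) and the three evaluation metrics (Eqs. 9–11) can be written directly in NumPy. The sketch below uses hypothetical function names and a four-element toy example; it is meant to show how the metrics differ, not to reproduce any value from the paper.

```python
import numpy as np

def min_max_scale(x):
    # Eq. (7): rescale a variable vector to the range [0, 1]
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

def mae(y_dnn, y_base):
    # Eq. (9): mean absolute error
    return np.mean(np.abs(y_dnn - y_base))

def pearson(y_dnn, y_base):
    # Eq. (10): covariance divided by the product of standard deviations
    cov = np.mean((y_dnn - y_dnn.mean()) * (y_base - y_base.mean()))
    return cov / np.sqrt(y_dnn.var() * y_base.var())

def r_squared(y_dnn, y_base):
    # Eq. (11): one minus residual sum of squares over total sum of squares
    ss_res = np.sum((y_dnn - y_base) ** 2)
    ss_tot = np.sum((y_base - y_base.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

y_base = np.array([1.0, 2.0, 3.0, 4.0])
y_dnn = np.array([1.5, 2.5, 3.5, 4.5])   # perfectly correlated, biased by +0.5

print(mae(y_dnn, y_base))        # 0.5
print(pearson(y_dnn, y_base))    # 1.0 (up to floating-point rounding)
print(r_squared(y_dnn, y_base))  # 0.8
```

In this toy case the correlation is perfect while R2 is only 0.8, because R2 as defined in Eq. (11) penalizes a constant offset and the Pearson correlation does not. This is why the two metrics can differ for the same pair of series.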
Figure 2. ELMv1 process-based model (BASE-Fire) simulated burned area and five observationally inferred burned area products (Table S1) at (a) the global scale and in (b) tropical (23.5° S–23.5° N), (c) temperate (23.5–67.5° N), and (d) boreal (north of 67.5° N) regions.

Figure 3. A comparison of wildfire burned area between estimates from the ELMv1 process-based model (BASE-Fire), deep neural network wildfire model (DNN-Fire), deep neural network wildfire model fine-tuned with observed burned area (DNN-Fire-OBS), and observations over 14 GFED fire regions.

Figure 4. The performance of the deep neural network wildfire model (DNN-Fire), compared with the original ELMv1 process-based wildfire model (BASE-Fire) over 14 GFED regions between years 2001 and 2010.

Figure 5. Inter-annual variation of burned area from years 2001 to 2010 (a) and the averaged seasonal cycle (b) of burned area estimated by the ELMv1 process-based wildfire model (BASE-Fire), deep neural network wildfire model (DNN-Fire), deep neural network wildfire model fine-tuned with observations (DNN-Fire-OBS), and observations.

Figure 6. Prognostic simulation of annual wildfire burned area with the deep neural network wildfire model fine-tuned with observations (DNN-Fire-OBS) compared with five burned area products (Table S1) over 2011–2015 for 14 GFED regions.

Figure 7. Prognostic simulation of wildfire burned area (2011–2015) with the deep neural network wildfire model fine-tuned with observations (DNN-Fire-OBS) compared with observations and outputs from nine FireMIP models.

3 Results and discussion

3.1 Evaluation of wildfire surrogate model

BASE-Fire performed reasonably well for total global burned area (508±53 Mha yr−1 (millions of hectares per year) between years 2001 and 2010 compared with the observational long-term average of 424–484 Mha yr−1; Fig. 2, Table S1). BASE-Fire also captured the global declining trend of wildfire burned area over this time period, attributed to a decrease in tropical fires (Andela et al., 2017). At the regional scale, however, BASE-Fire underestimated tropical (23.5° S–23.5° N) burned area and overestimated temperate (23.5–67.5° N) and boreal (north of 67.5° N) burned area (Fig. 2). BASE-Fire regional biases were spatially heterogeneous. For example, over tropical GFED regions, BASE-Fire overestimated wildfire burned area over Southern Hemisphere South America (SHSA) but underestimated wildfire burned area over both Southern and Northern Hemisphere Africa regions (SHAF and NHAF), despite an overall underestimation over the tropical region (Fig. 3). In contrast, consistent overestimation occurred over all temperate GFED regions. For example, wildfire burning was overestimated by about a factor of 16 (∼1 versus 16 Mha yr−1) over the Europe GFED region (EURO) (Fig. 3). Although there is room to improve BASE-Fire performance, the parameterization would involve large ensemble simulations and computational resources. Instead, we first used BASE-Fire-generated data as training and testing datasets to parameterize DNN-Fire, and then we fine-tuned the DNN-Fire model against observed burned area.

Next we parameterized DNN-Fire and compared it with BASE-Fire outputs. Using BASE-Fire-generated 1.9° × 2.5° resolution datasets of surface fuel conditions (fuel load (vegetation biomass), fuel temperature (topsoil temperature), and fuel wetness (topsoil moisture)) with gridded climate forcing (GSWP3) (Dirmeyer et al., 2006), land use (LUH2 dataset) (Hurtt et al., 2020), and socio-economic (Van Vuuren et al., 2007; Dobson et al., 2000) factors, DNN-Fire captured the spatial pattern of BASE-Fire-predicted wildfire activity (Fig. 4, S2). Across all GFED regions, mean absolute error of DNN-Fire was 4.4 Mha yr−1 (<1 % of total burn area), with median and maximum errors of 1.8 and 13.0 Mha yr−1, respectively (Fig. 3). Equatorial Asia (EQAS), Northern Hemisphere South America (NHSA), Central America (CEAS), and Europe (EURO) regions had the lowest DNN-Fire errors (<1.0 Mha yr−1), while Southern Hemisphere Africa (SHAF) and Boreal Asia (BOAS) had the largest errors (10–13 Mha yr−1). Overall, the correlation coefficient between BASE-Fire and DNN-Fire simulated burned area was 0.91 (p value <0.01), and the coefficient of determination (R2) was 0.79. Across seasons, DNN-Fire also reasonably captured the BASE-Fire peak fire months (June to October), which were dominated by Southern Hemisphere Africa and Southern Hemisphere South America (Fig. 5).

By surrogating BASE-Fire, DNN-Fire is expected to have similar biases and uncertainties. The deficiency of the BASE-Fire model will propagate to DNN-Fire. In our future work we will overcome such limitations by training multiple DNN-Fire models with ensemble simulations of BASE-Fire models that differ in critical parameters and vary in model structures.

3.2 Calibrating the wildfire surrogate model using observations

Although the global pattern was reasonably captured, BASE-Fire had relatively large biases in several GFED regions, as discussed above. Since DNN-Fire was trained and validated only with BASE-Fire-generated inputs (e.g., fuel conditions) and outputs (burned area), we expect that, at best, DNN-Fire would have biases comparable to those of BASE-Fire. Starting from DNN-Fire, we further calibrated the model weighting parameters to create DNN-Fire-OBS and validated DNN-Fire-OBS performance using observed burned area from five existing burned area products (Table S1) between years 2001 and 2010. The calibration took several minutes on an Intel Xeon Phi Processor 7250.

Dramatic improvements were found in most of the 14 regions simulated by DNN-Fire-OBS (Fig. 3). Overall, DNN-Fire-OBS simulated global long-term average burned area was 458 Mha yr−1 (compared with observational average 467 Mha yr−1). Averaged across 14 regions, 73 % reduction of mean absolute error was achieved by DNN-Fire-OBS, compared with the BASE-Fire model. The Pearson correlation coefficient between the DNN-Fire-OBS simulated and observational burned area was 0.98 (p value <0.001) with an R2 of 0.97. Bias reduction was disproportionally distributed across the GFED regions (Fig. 3). For example, severely burned regions, including Southern and Northern Hemisphere Africa (SHAF and NHAF) and Southern Hemisphere South America (SHSA) greatly benefited from the tuning, and their regional biases were reduced by 88, 65, and 51 Mha yr−1 (or 88 %, 89 %, 98 % reduction), respectively. Although Temperate North America (TENA) and Europe (EURO) wildfire burned area is relatively small (1–3 Mha yr−1), the impacts of wildfire activity were significant due to their high population densities. DNN-Fire tended to overestimate the burned area in TENA and EURO by 47 and 13 Mha yr−1, while DNN-Fire-OBS significantly reduced biases in both regions to less than 0.3 Mha yr−1 (a 97 %–98 % reduction).

BASE-Fire tended to overestimate inter-annual variability (IAV) and had opposite burned area anomalies between years 2001 and 2005. DNN-Fire dampened BASE-Fire's IAV but systematically overestimated burned area. DNN-Fire-OBS agreed well with the observed IAV between years 2001 and 2010 (Fig. 5a). The seasonal cycle was also improved in DNN-Fire-OBS in terms of reducing BASE-Fire's overestimation of burned area during peak fire seasons (Fig. 5b, S3), although we note that DNN-Fire-OBS is biased high during low fire seasons (March and April).

3.3 Prognostic simulation and limitations

We next evaluated the DNN-Fire-OBS model against observations for the period 2011 to 2015, using data which were not used to train and validate the model. Overall, DNN-Fire-OBS simulated 469–514 Mha yr−1 global burned area, compared with observations of 349–509 Mha yr−1. Note that the large observational ranges were mainly due to the differences among the five burned area products rather than the inter-annual variability (Fig. 6). Regionally, DNN-Fire-OBS overestimated NHAF, SHAF, and SHSA annual burned area by 8, 6, and 2 Mha yr−1, respectively (Fig. 6) compared with the observational mean. Averaged latitudinal distribution of simulated burned area during this period showed that global wildfire activity peaked around 10–15° S and 5–10° N, together accounting for burning 12 %–16 % of the land surface (Fig. 7). These two peaks were dominated by large burned area over Southern (SHAF) and Northern Hemisphere Africa (NHAF) fire regions. Compared with observations, DNN-Fire-OBS simulated reasonable burned area latitudinal distributions (Fig. 7). We also compared the nine FireMIP models (Rabin et al., 2017; Teckentrup et al., 2018) and found diverse latitudinal distribution of burned area. The across-model differences were much larger than the inter-annual variation simulated by each individual model, which indicated large model structural uncertainties. Validation was also conducted for the historical period 1981–2000, when most of the satellite-based burned area data were not available. Compared with charcoal-index-inferred burned area during 1981–2000 (Fig. S4), the DNN-Fire-OBS model reasonably captured the decrease in burned area from ∼530 to 490 Mha yr−1. In summary, the DNN-Fire-OBS simulation is reasonably accurate: (1) it improved the simulated wildfire spatial and temporal distributions in ELMv1, and (2) it enabled effective and efficient parameterization of fires at regional scale.

This study focuses on design, development, and parameterization of the DNN fire model within the E3SM model interface. In this way the DNN model can be readily coupled in the future and iteratively simulate climate, ecosystem fuel conditions, and fire dynamics. Although no feedbacks exist between biomass and tree cover and burned area under the current offline mode, this study is an important step towards fully coupling E3SM and the DNN-Fire models in the future. We acknowledge several challenges and limitations in our modeling framework. First, the DNN model uncertainty was subject to the accuracy of climate forcings as well as other physical driving variables simulated by the physical wildfire model (ELMv1). For example, in this work ELM simulation of soil temperature, soil moisture, fuel load and so on is subject to the uncertainty of GSWP3 forcings. Furthermore, those simulated variables served as inputs for the DNN model and would result in burned area prediction uncertainty. It was challenging to eliminate the forcing uncertainties in this work, but we could at least evaluate the magnitude of these uncertainties. We ran the DNN-Fire-OBS model with alternative forcings of CRU-JRA, NCEP-DOE2, and CDAS soil moisture from 2001 to 2010 and compared the results with DNN-Fire-OBS driven by default inputs (Fig. S5). The results showed relatively larger uncertainties from climate forcing than that from soil moisture forcing, particularly over the major fire regions (e.g., SHSA, SHAF, and NHAF). For fuel load, although no transient dataset of global living biomass existed yet, we directly compared the ELM model simulated biomass with the global estimate (GEOCARBON ∼455 Pg C). We found that the modeled present-day biomass continuously increased from 425 to 470 Pg C and compared reasonably well with the global benchmark (Fig. S6). Future work will focus on evaluating the uncertainties from dead fuel load and fuel temperature variables.

Second, the original ELMv1 wildfire model has a unified mathematical representation of how fuel, climate, and socio-economic conditions control wildfire burned area (Li et al., 2012). However, training one single DNN wildfire model across the globe will produce a model dominated by grid cells that have a high burned area (e.g., Africa). The performance of the trained DNN model, therefore, will likely have larger biases over the low-fire grid cells, although the globally aggregated burned area could be reasonable. We partly overcame this challenge by applying the widely used 14 GFED fire regions that assume unique and relatively uniform dynamics over each region (Giglio et al., 2006b) and employed a stratified random sampling method for training and testing datasets. Although the regionally specific wildfire model introduces additional complexity, it better represents distinct characteristics of wildfire activity over different climate regimes and biomes (Zou et al., 2019; Zhu and Zhuang, 2013) and allows for future analyses of how the relevant controllers vary across the globe.

Third, the cost function and the training of the DNN model relied on the normality assumption of burned area data. Therefore, the DNN model error might be dominated by highly burned grid cells. A potential solution is to apply a log transformation to the non-normal data or to the resultant cost function (Kelley et al., 2021). Finally, our GFED region-based parameterization strategy relied on the combination of climate and biome types, while an alternative parameterization strategy for the DNN-Fire model could be based on plant functional type distributions. Based on our analysis, the plant-functional-type-based DNN-Fire model performed similarly to the GFED-based model (Figs. S7, S8). Since the GFED regions were defined by present-day climate and fire regimes, our GFED-based models may not fully capture future changes in fire dynamics caused by longer-timescale climate and fire regime changes.
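The effect of a log transformation on the cost function can be illustrated with two grid cells, one with high and one with low burned fraction. The sketch below compares per-cell squared errors with and without the transformation. It is an assumption-laden illustration, not the cost function of the DNN-Fire model: the function names, the small offset `eps` (added so that zero burned area stays finite), and the example values are hypothetical.

```python
import math


def squared_errors(pred, obs):
    """Per-cell squared error on the raw burned fraction."""
    return [(p - o) ** 2 for p, o in zip(pred, obs)]


def log_squared_errors(pred, obs, eps=1e-6):
    """Per-cell squared error after a log transform; eps keeps log(0) finite."""
    return [(math.log(p + eps) - math.log(o + eps)) ** 2 for p, o in zip(pred, obs)]


# Hypothetical burned fractions: a high-fire cell and a low-fire cell.
obs = [0.50, 0.001]
pred = [0.45, 0.002]  # 10 % low in the high-fire cell, a factor of 2 high in the low-fire cell

raw = squared_errors(pred, obs)
logged = log_squared_errors(pred, obs)

# Share of the total cost contributed by the high-fire cell under each cost function.
raw_share = raw[0] / sum(raw)
log_share = logged[0] / sum(logged)
print(raw_share, log_share)
```

On the raw scale the high-fire cell contributes almost all of the cost even though its relative error is small, whereas on the log scale the factor-of-2 error in the low-fire cell dominates. The transformed cost therefore penalizes relative rather than absolute errors.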

4 Conclusions

In this study, we first surrogated the baseline ELMv1 wildfire model with a deep neural network (DNN) approach (Pearson correlation coefficient =0.91, p value <0.01, R2=0.79). The development was based on inputs and outputs from the baseline ELMv1 wildfire simulation, which is process-based and reasonably simulates global burned area, although regional biases existed. We then calibrated the neural network weights using the observationally inferred burned area from the years 2001–2010. The final calibrated DNN wildfire model (DNN-Fire-OBS) was shown to be more accurate over the 14 GFED regions. For example, reductions in absolute error over Africa, South America, and Europe were ∼90 %. More importantly, the DNN-Fire-OBS model parameters could be calibrated within minutes, compared with traditional ELMv1 parameterization ensemble simulations that consume a large amount of computational time. The improved DNN-Fire-OBS model also accurately prognosed global and regional burned area in the 5-year period following the training period (2011–2015; modeled 469–514 Mha yr−1). We conclude that the improved surrogate wildfire model (DNN-Fire-OBS) developed in this study can serve as an effective alternative to the process-based fire model currently used in ELMv1. More broadly, we conclude that machine learning techniques can facilitate Earth system model development, parameterization, and uncertainty reduction with high efficiency and accuracy.
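For readers comparing the two skill metrics quoted above, the following sketch computes a Pearson correlation coefficient and a coefficient of determination for a pair of series. It assumes the common definition R2 = 1 − SS_res/SS_tot; the text does not state which definition was used, so this is an assumption, and the function names and toy data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def r_squared(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Toy example: predictions track the observations perfectly but with a constant bias.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [2.0, 3.0, 4.0, 5.0]

r = pearson_r(obs, pred)
r2 = r_squared(obs, pred)
print(r, r2)
```

Under this definition the two metrics answer different questions: the correlation ignores a constant bias or a scale error, while R2 penalizes both, so R2 need not equal the square of the correlation coefficient.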

Code availability

The code used in this study is available at https://doi.org/10.5281/zenodo.5508795 (Zhu, 2021).

Data availability

GFEDv4s, Fire_CCI51, Fire_CCILT11, MCD64, and Fire_Atlas are five global wildfire burned area datasets that are used to train and validate the deep neural network model in this study.

GFEDv4s data can be accessed at https://daac.ornl.gov/VEGETATION/guides/fire_emissions_v4.html (Randerson et al., 2018).

Fire_CCI51 data can be accessed at https://geogra.uah.es/fire_cci/firecci51.php (ESA, 2021a).

Fire_CCILT11 data can be accessed at https://geogra.uah.es/fire_cci/fireccilt11.php (ESA, 2021b).

MCD64 data can be accessed at https://modis-fire.umd.edu/files/MODIS_C6_Fire_User_Guide_C.pdf (Giglio et al., 2020).

Fire_Atlas data can be accessed at https://www.globalfiredata.org/fireatlas.html (FireAtlas, 2019).

FireMIP model outputs can be accessed at https://doi.org/10.5281/zenodo.3555562 (Hantson et al., 2019) and are used to compare with this study in terms of latitudinal distribution of burned area.

Supplement

The supplement related to this article is available online at: https://doi.org/10.5194/gmd-15-1899-2022-supplement.

Author contributions

QZ and WJR designed the study. QZ, WJR, LX, and JTR designed the model experiments. QZ and FL wrote the code and ran the experiments. LZ, KY, HW, and JG all contributed to the interpretation of the results and writing of the paper.

Competing interests

The contact author has declared that neither they nor their co-authors have any competing interests.

Disclaimer

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Acknowledgements

This research was supported by the Energy Exascale Earth System Modeling (E3SM, https://e3sm.org/, last access: 15 September 2021) Project and the Reducing Uncertainties in Biogeochemical Interactions through Synthesis and Computation (RUBISCO) Scientific Focus Area, Office of Biological and Environmental Research of the U.S. Department of Energy Office of Science. Lawrence Berkeley National Laboratory (LBNL) is managed by the University of California for the U.S. Department of Energy under contract DE-AC02-05CH11231.

Financial support

This research has been supported by the U.S. Department of Energy (Energy Exascale Earth System Modeling (E3SM, https://e3sm.org/, last access: 15 September 2021) Project and Reducing Uncertainties in Biogeochemical Interactions through Synthesis and Computation (RUBISCO) Scientific Focus Area).

Review statement

This paper was edited by Gerd A. Folberth and reviewed by Joe Melton, Matthias Forkel, and D. I. Kelley.

References

Abatzoglou, J. T. and Williams, A. P.: Impact of anthropogenic climate change on wildfire across western US forests, P. Natl. Acad. Sci., 113, 11770–11775, 2016. 

Andela, N., Morton, D., Giglio, L., Chen, Y., Van Der Werf, G., Kasibhatla, P., DeFries, R., Collatz, G., Hantson, S., and Kloster, S.: A human-driven decline in global burned area, Science, 356, 1356–1362, 2017. 

Andela, N., Morton, D. C., Giglio, L., Paugam, R., Chen, Y., Hantson, S., van der Werf, G. R., and Randerson, J. T.: The Global Fire Atlas of individual fire size, duration, speed and direction, Earth Syst. Sci. Data, 11, 529–552, https://doi.org/10.5194/essd-11-529-2019, 2019. 

Arora, V. K. and Boer, G. J.: Fire as an interactive component of dynamic vegetation models, J. Geophys. Res.-Biogeo., 110, G02008, https://doi.org/10.1029/2005JG000042, 2005. 

Bond, W. J., Woodward, F. I., and Midgley, G. F.: The global distribution of ecosystems in a world without fire, New Phytol., 165, 525–538, 2005. 

Bond-Lamberty, B., Peckham, S. D., Ahl, D. E., and Gower, S. T.: Fire as the dominant driver of central Canadian boreal forest carbon balance, Nature, 450, 89–92, 2007. 

Bowd, E. J., Banks, S. C., Strong, C. L., and Lindenmayer, D. B.: Long-term impacts of wildfire and logging on forest soils, Nat. Geosci., 12, 113–118, 2019. 

Brando, P., Soares-Filho, B., Rodrigues, L., Assunção, A., Morton, D., Tuchschneider, D., Fernandes, E., Macedo, M., Oliveira, U., and Coe, M.: The gathering firestorm in southern Amazonia, Sci. Adv., 6, eaay1632, https://doi.org/10.1126/sciadv.aay1632, 2020. 

Cecil, D. J., Buechler, D. E., and Blakeslee, R. J.: Gridded lightning climatology from TRMM-LIS and OTD: Dataset description, Atmos. Res., 135, 404–414, 2014. 

Chambers, S. and Chapin, F.: Fire effects on surface-atmosphere energy exchange in Alaskan black spruce ecosystems: Implications for feedbacks to regional climate, J. Geophys. Res.-Atmos., 107, 148–227, 2002. 

Chen, Y., Randerson, J. T., Morton, D. C., DeFries, R. S., Collatz, G. J., Kasibhatla, P. S., Giglio, L., Jin, Y., and Marlier, M. E.: Forecasting fire season severity in South America using sea surface temperature anomalies, Science, 334, 787–791, 2011. 

Chen, Y., Randerson, J. T., Coffield, S. R., Foufoula-Georgiou, E., Smyth, P., Graff, C. A., Morton, D. C., Andela, N., van der Werf, G. R., and Giglio, L.: Forecasting global fire emissions on subseasonal to seasonal (S2S) time scales, J. Adv. Model. Earth Sy., 12, e2019MS001955, https://doi.org/10.1029/2019MS001955, 2020. 

Clark, T. L., Coen, J., and Latham, D.: Description of a coupled atmosphere-fire model, Int. J. Wildland Fire, 13, 49–63, 2004. 

Coffield, S. R., Graff, C. A., Chen, Y., Smyth, P., Foufoula-Georgiou, E., and Randerson, J. T.: Machine learning to predict final fire size at the time of ignition, Int. J. Wildland Fire, 28, 861–873, https://doi.org/10.1071/WF19023, 2019. 

Day, C.: Smoke from burning vegetation changes the coverage and behavior of clouds, Phys. Today, 57, 24, https://doi.org/10.1063/1.1768664, 2004. 

Dirmeyer, P. A., Gao, X., Zhao, M., Guo, Z., Oki, T., and Hanasaki, N.: GSWP-2: Multimodel analysis and implications for our perception of the land surface, Bull. Am. Meteorol. Soc., 87, 1381–1398, 2006. 

Dobson, J. E., Bright, E. A., Coleman, P. R., Durfee, R. C., and Worley, B. A.: LandScan: a global population database for estimating populations at risk, Photogram. Eng. Rem. S., 66, 849–857, 2000. 

ESA: Fire_cci Burned Area dataset, Fire_CCI51, ESA [data set], https://geogra.uah.es/fire_cci/firecci51.php, last access: 15 September 2021a. 

ESA: Fire_cci long-term Burned Area dataset, Fire_CCILT11, ESA [data set], https://geogra.uah.es/fire_cci/fireccilt11.php, last access: 15 September 2021b. 

Finney, M. A.: FARSITE, Fire Area Simulator – model development and evaluation, Res. Pap. RMRS-RP-4, Ogden, UT: US Department of Agriculture, Forest Service, Rocky Mountain Research Station, 47 p., 1998. 

FireAtlas: Global Fire Atlas, FireAtlas [data set], https://www.globalfiredata.org/fireatlas.html (last access: 15 September 2021), 2019. 

French, N. H., Whitley, M. A., and Jenkins, L. K.: Fire disturbance effects on land surface albedo in Alaskan tundra, J. Geophys. Res.-Biogeo., 121, 841–854, 2016. 

Ganapathi Subramanian, S. and Crowley, M.: Using spatial reinforcement learning to build forest wildfire dynamics models from satellite images, Front. ICT, 5, 6, https://doi.org/10.3389/fict.2018.00006, 2018. 

Giglio, L., Csiszar, I., and Justice, C. O.: Global distribution and seasonality of active fires as observed with the Terra and Aqua Moderate Resolution Imaging Spectroradiometer (MODIS) sensors, J. Geophys. Res.-Biogeo., 111, G02016, https://doi.org/10.1029/2005JG000142, 2006a. 

Giglio, L., van der Werf, G. R., Randerson, J. T., Collatz, G. J., and Kasibhatla, P.: Global estimation of burned area using MODIS active fire observations, Atmos. Chem. Phys., 6, 957–974, https://doi.org/10.5194/acp-6-957-2006, 2006b. 

Giglio, L., Randerson, J. T., and Van Der Werf, G. R.: Analysis of daily, monthly, and annual burned area using the fourth-generation global fire emissions database (GFED4), J. Geophys. Res.-Biogeo., 118, 317–328, 2013. 

Giglio, L., Boschetti, L., Roy, D. P., Humber, M. L., and Justice, C. O.: The Collection 6 MODIS burned area mapping algorithm and product, Remote Sens. Environ., 217, 72–85, 2018. 

Giglio, L., Schroeder, W., Hall, J. V., and Justice, C. O.: MODIS Collection 6 Active Fire Product User’s Guide Revision C, NASA [data set], https://modis-fire.umd.edu/files/MODIS_C6_Fire_User_Guide_C.pdf (last access: 15 September 2021), 2020. 

Girardin, M. P. and Mudelsee, M.: Past and future changes in Canadian boreal wildfire activity, Ecol. Appl., 18, 391–406, 2008. 

Goodfellow, I., Bengio, Y., and Courville, A.: Deep learning, MIT Press, Cambridge, http://www.deeplearningbook.org (last access: 15 September 2021), 2016. 

Goss, M., Swain, D. L., Abatzoglou, J. T., Sarhadi, A., Kolden, C. A., Williams, A. P., and Diffenbaugh, N. S.: Climate change is increasing the likelihood of extreme autumn wildfire conditions across California, Environ. Res. Lett., 15, 094016, https://doi.org/10.1088/1748-9326/ab83a7, 2020. 

Hantson, S., Arneth, A., Harrison, S. P., Kelley, D. I., Prentice, I. C., Rabin, S. S., Archibald, S., Mouillot, F., Arnold, S. R., Artaxo, P., Bachelet, D., Ciais, P., Forrest, M., Friedlingstein, P., Hickler, T., Kaplan, J. O., Kloster, S., Knorr, W., Lasslop, G., Li, F., Mangeon, S., Melton, J. R., Meyn, A., Sitch, S., Spessa, A., van der Werf, G. R., Voulgarakis, A., and Yue, C.: The status and challenge of global fire modelling, Biogeosciences, 13, 3359–3375, https://doi.org/10.5194/bg-13-3359-2016, 2016. 

Hantson, S., Rabin, S., Kelley, D. I., Arneth, A., Harrison, S. P., Archibald, S., Bachelet, D., Forrest, M., Hickler, T., Kloster, S., Lasslop, G., Li, F., Mangeon, S., Melton, J. R., Nieradzik, L., Prentice, I. C., Sheehan, T., Sitch, S., Teckentrup, L., Voulgarakis, A., Yue, C.: Model outputs: Quantitative assessment of fire and vegetation properties in historical simulations with fire-enabled vegetation models from the Fire Model Intercomparison Project, Zenodo [data set], https://doi.org/10.5281/zenodo.3555562, 2019. 

Harden, J. W., Manies, K. L., Turetsky, M. R., and Neff, J. C.: Effects of wildfire and permafrost on soil organic matter and soil climate in interior Alaska, Glob. Change Biol., 12, 2391–2403, 2006. 

Heyerdahl, E. K., Brubaker, L. B., and Agee, J. K.: Annual and decadal climate forcing of historical fire regimes in the interior Pacific Northwest, USA, The Holocene, 12, 597–604, 2002. 

Holden, Z. A., Swanson, A., Luce, C. H., Jolly, W. M., Maneta, M., Oyler, J. W., Warren, D. A., Parsons, R., and Affleck, D.: Decreasing fire season precipitation increased recent western US forest wildfire activity, P. Natl. Acad. Sci., 115, E8349–E8357, 2018. 

Hurtt, G. C., Chini, L., Sahajpal, R., Frolking, S., Bodirsky, B. L., Calvin, K., Doelman, J. C., Fisk, J., Fujimori, S., Klein Goldewijk, K., Hasegawa, T., Havlik, P., Heinimann, A., Humpenöder, F., Jungclaus, J., Kaplan, J. O., Kennedy, J., Krisztin, T., Lawrence, D., Lawrence, P., Ma, L., Mertz, O., Pongratz, J., Popp, A., Poulter, B., Riahi, K., Shevliakova, E., Stehfest, E., Thornton, P., Tubiello, F. N., van Vuuren, D. P., and Zhang, X.: Harmonization of global land use change and management for the period 850–2100 (LUH2) for CMIP6, Geosci. Model Dev., 13, 5425–5464, https://doi.org/10.5194/gmd-13-5425-2020, 2020. 

Jiang, Y., Yang, X.-Q., Liu, X., Qian, Y., Zhang, K., Wang, M., Li, F., Wang, Y., and Lu, Z.: Impacts of wildfire aerosols on global energy budget and climate: The role of climate feedbacks, J. Climate, 33, 3351–3366, 2020. 

Kasischke, E. S. and Bruhwiler, L. P.: Emissions of carbon dioxide, carbon monoxide, and methane from boreal forest fires in 1998, J. Geophys. Res.-Atmos., 107, 148–227, https://doi.org/10.1029/2001JD000461, 2002. 

Kelley, D. I., Bistinas, I., Whitley, R., Burton, C., Marthews, T. R., and Dong, N.: How contemporary bioclimatic and human controls change global fire regimes, Nat. Clim. Change, 9, 690–696, 2019. 

Kelley, D. I., Burton, C., Huntingford, C., Brown, M. A. J., Whitley, R., and Dong, N.: Technical note: Low meteorological influence found in 2019 Amazonia fires, Biogeosciences, 18, 787–804, https://doi.org/10.5194/bg-18-787-2021, 2021. 

Keeley, J. E. and Syphard, A. D.: Historical patterns of wildfire ignition sources in California ecosystems, Int. J. Wildland Fire, 27, 781–799, 2018. 

Kingma, D. P. and Ba, J.: Adam: A method for stochastic optimization, arXiv:1412.6980, 2014. 

Knorr, W., Kaminski, T., Arneth, A., and Weber, U.: Impact of human population density on fire frequency at the global scale, Biogeosciences, 11, 1085–1102, 2014. 

Koven, C. D., Riley, W. J., Subin, Z. M., Tang, J. Y., Torn, M. S., Collins, W. D., Bonan, G. B., Lawrence, D. M., and Swenson, S. C.: The effect of vertically resolved soil biogeochemistry and alternate soil C and N models on C dynamics of CLM4, Biogeosciences, 10, 7109–7131, https://doi.org/10.5194/bg-10-7109-2013, 2013. 

Lamarque, J. F., Kiehl, J. T., Brasseur, G. P., Butler, T., Cameron-Smith, P., Collins, W. D., Collins, W. J., Granier, C., Hauglustaine, D., and Hess, P. G.: Assessing future nitrogen deposition and carbon cycle feedback using a multimodel approach: Analysis of nitrogen deposition, J. Geophys. Res.-Atmos., 110, D19303, https://doi.org/10.1029/2005JD005825, 2005. 

Lenihan, J. M. and Bachelet, D.: Historical climate and suppression effects on simulated fire and carbon dynamics in the conterminous United States, Global Vegetation Dynamics: Concepts and Applications in the MC1 Model, edited by: Bachelet, D. and Turner, D., AGU Geophys. Monog., 214, 17–30, 2015. 

Li, F., Zeng, X. D., and Levis, S.: A process-based fire parameterization of intermediate complexity in a Dynamic Global Vegetation Model, Biogeosciences, 9, 2761–2780, https://doi.org/10.5194/bg-9-2761-2012, 2012. 

Li, F., Val Martin, M., Andreae, M. O., Arneth, A., Hantson, S., Kaiser, J. W., Lasslop, G., Yue, C., Bachelet, D., Forrest, M., Kluzek, E., Liu, X., Mangeon, S., Melton, J. R., Ward, D. S., Darmenov, A., Hickler, T., Ichoku, C., Magi, B. I., Sitch, S., van der Werf, G. R., Wiedinmyer, C., and Rabin, S. S.: Historical (1700–2012) global multi-model estimates of the fire emissions from the Fire Modeling Intercomparison Project (FireMIP), Atmos. Chem. Phys., 19, 12545–12567, https://doi.org/10.5194/acp-19-12545-2019, 2019. 

Lizundia-Loiola, J., Pettinari, M., Chuvieco, E., Storm, T., and Gómez-Dans, J.: ESA CCI ECV Fire Disturbance: Algorithm Theoretical Basis Document-MODIS, version 2.0, https://climate.esa.int/media/documents/Fire_cci_D2.1.3_ATBD-MODIS_v2.0.pdf (last access: 15 September 2021), 2018. 

Lizundia-Loiola, J., Otón, G., Ramo, R., and Chuvieco, E.: A spatio-temporal active-fire clustering approach for global burned area mapping at 250 m from MODIS data, Remote Sens. Environ., 236, 111493, https://doi.org/10.1016/j.rse.2019.111493, 2020. 

Mahowald, N., Jickells, T. D., Baker, A. R., Artaxo, P., Benitez-Nelson, C. R., Bergametti, G., Bond, T. C., Chen, Y., Cohen, D. D., and Herut, B.: Global distribution of atmospheric phosphorus sources, concentrations and deposition rates, and anthropogenic impacts, Global Biogeochem. Cy., 22, GB4026, https://doi.org/10.1029/2008GB003240, 2008. 

Mekonnen, Z. A., Riley, W. J., Randerson, J. T., Grant, R. F., and Rogers, B. M.: Expansion of high-latitude deciduous forests driven by interactions between climate warming and fire, Nat. Plants, 5, 952–958, 2019. 

Oliver, A. K., Callaham Jr., M. A., and Jumpponen, A.: Soil fungal communities respond compositionally to recurring frequent prescribed burning in a managed southeastern US forest ecosystem, Forest Ecol. Manag., 345, 1–9, 2015. 

Papakosta, P., Xanthopoulos, G., and Straub, D.: Probabilistic prediction of wildfire economic losses to housing in Cyprus using Bayesian network analysis, Int. J. Wildland Fire, 26, 10–23, 2017. 

Pellegrini, A. F., Ahlström, A., Hobbie, S. E., Reich, P. B., Nieradzik, L. P., Staver, A. C., Scharenbroch, B. C., Jumpponen, A., Anderegg, W. R., and Randerson, J. T.: Fire frequency drives decadal changes in soil carbon and nitrogen and ecosystem productivity, Nature, 553, 194–198, 2018. 

Pellegrini, A. F., Hobbie, S. E., Reich, P. B., Jumpponen, A., Brookshire, E. J., Caprio, A. C., Coetsee, C., and Jackson, R. B.: Repeated fire shifts carbon and nitrogen cycling by changing plant inputs and soil decomposition across ecosystems, Ecol. Monogr., 90, e01409, https://doi.org/10.1002/ecm.1409, 2020. 

Preisler, H. K. and Westerling, A. L.: Statistical model for forecasting monthly large wildfire events in western United States, J. Appl. Meteorol. Clim., 46, 1020–1030, 2007. 

Prentice, S. and Mackerras, D.: The ratio of cloud to cloud-ground lightning flashes in thunderstorms, J. Appl. Meteorol., 16, 545–550, 1977. 

Rabin, S. S., Melton, J. R., Lasslop, G., Bachelet, D., Forrest, M., Hantson, S., Kaplan, J. O., Li, F., Mangeon, S., Ward, D. S., Yue, C., Arora, V. K., Hickler, T., Kloster, S., Knorr, W., Nieradzik, L., Spessa, A., Folberth, G. A., Sheehan, T., Voulgarakis, A., Kelley, D. I., Prentice, I. C., Sitch, S., Harrison, S., and Arneth, A.: The Fire Modeling Intercomparison Project (FireMIP), phase 1: experimental and analytical protocols with detailed model descriptions, Geosci. Model Dev., 10, 1175–1197, https://doi.org/10.5194/gmd-10-1175-2017, 2017. 

Radke, D., Hessler, A., and Ellsworth, D.: FireCast: Leveraging Deep Learning to Predict Wildfire Spread, Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence Main track, IJCAI 2019 Macao, Int. Joint Conf. Artif., 4575–4581, https://doi.org/10.24963/ijcai.2019/636, 2019. 

Randerson, J. T., Liu, H., Flanner, M. G., Chambers, S. D., Jin, Y., Hess, P. G., Pfister, G., Mack, M., Treseder, K., and Welp, L.: The impact of boreal forest fire on climate warming, Science, 314, 1130–1132, 2006. 

Randerson, J. T., van der Werf, G. R., Giglio, L., Collatz, G. J., and Kasibhatla, P. S.: Global Fire Emissions Database, Version 4, (GFEDv4), ORNL DAAC, Oak Ridge, Tennessee, USA, https://doi.org/10.3334/ORNLDAAC/1293, data available at: https://daac.ornl.gov/VEGETATION/guides/fire_emissions_v4.html (last access: 15 September 2021), 2018. 

Riley, K. and Thompson, M.: An uncertainty analysis of wildfire modeling, Natural hazard uncertainty assessment: modeling and decision support, Monograph, 223, 193–213, 2017. 

Ross, A. N., Wooster, M. J., Boesch, H., and Parker, R.: First satellite measurements of carbon dioxide and methane emission ratios in wildfire plumes, Geophys. Res. Lett., 40, 4098–4102, 2013. 

Rother, D. and De Sales, F.: Impact of Wildfire on the Surface Energy Balance in Six California Case Studies, Bound.-Lay. Meteorol., 178, 143–166, 2020. 

Rothermel, R. C.: A mathematical model for predicting fire spread in wildland fuels, Intermountain Forest & Range Experiment Station, Forest Service, US Department of Agriculture, Ogden, UT, USA, Res. Pap. INT-115, 40p., 1972. 

Saha, M. V., Scanlon, T. M., and D'Odorico, P.: Climate seasonality as an essential predictor of global fire activity, Global Ecol. Biogeogr., 28, 198–210, 2019. 

Sayad, Y. O., Mousannif, H., and Al Moatassime, H.: Predictive modeling of wildfires: A new dataset and machine learning approach, Fire Safety J., 104, 130–146, 2019. 

Schmidhuber, J.: Deep learning in neural networks: An overview, Neural Networks, 61, 85–117, 2015. 

Stephenson, C., Handmer, J., and Betts, R.: Estimating the economic, social and environmental impacts of wildfires in Australia, Environ. Hazards, 12, 93–111, 2013. 

Syphard, A. D., Radeloff, V. C., Keeley, J. E., Hawbaker, T. J., Clayton, M. K., Stewart, S. I., and Hammer, R. B.: Human influence on California fire regimes, Ecol. Appl., 17, 1388–1402, 2007. 

Teckentrup, L., Lasslop, G., Bachelet, D., Forrest, M., Hantson, S., Li, F., Melton, J. R., Yue, C., Arneth, A., Harrison, S. P., and Sitch, S.: Simulations of historical burned area: A comparison of global fire models in FireMIP, EGUGA, 17537, https://ui.adsabs.harvard.edu/abs/2018EGUGA..2017537T, 2018. 

Thonicke, K., Spessa, A., Prentice, I. C., Harrison, S. P., Dong, L., and Carmona-Moreno, C.: The influence of vegetation, fire spread and fire behaviour on biomass burning and trace gas emissions: results from a process-based model, Biogeosciences, 7, 1991–2011, https://doi.org/10.5194/bg-7-1991-2010, 2010. 

Tonini, M., D'Andrea, M., Biondi, G., Degli Esposti, S., Trucchia, A., and Fiorucci, P.: A Machine Learning-Based Approach for Wildfire Susceptibility Mapping, The Case Study of the Liguria Region in Italy, Geosciences, 10, 105, https://doi.org/10.3390/geosciences10030105, 2020. 

van der Werf, G. R., Randerson, J. T., Giglio, L., van Leeuwen, T. T., Chen, Y., Rogers, B. M., Mu, M., van Marle, M. J. E., Morton, D. C., Collatz, G. J., Yokelson, R. J., and Kasibhatla, P. S.: Global fire emissions estimates during 1997–2016, Earth Syst. Sci. Data, 9, 697–720, https://doi.org/10.5194/essd-9-697-2017, 2017. 

van Vuuren, D. P., Lucas, P. L., and Hilderink, H.: Downscaling drivers of global environmental change: Enabling use of global SRES scenarios at the national and grid levels, Glob. Environ. Change, 17, 114–130, 2007. 

Venevsky, S., Thonicke, K., Sitch, S., and Cramer, W.: Simulating fire regimes in human-dominated ecosystems: Iberian Peninsula case study, Glob. Change Biol., 8, 984–998, 2002. 

Walker, X. J., Baltzer, J. L., Cumming, S. G., Day, N. J., Ebert, C., Goetz, S., Johnstone, J. F., Potter, S., Rogers, B. M., and Schuur, E. A.: Increasing wildfires threaten historic carbon sink of boreal forest soils, Nature, 572, 520–523, 2019. 

Wang, J.-F., Stein, A., Gao, B.-B., and Ge, Y.: A review of spatial sampling, Spat. Stat., 2, 1–14, 2012. 

Westerling, A. L., Hidalgo, H. G., Cayan, D. R., and Swetnam, T. W.: Warming and earlier spring increase western US forest wildfire activity, Science, 313, 940–943, 2006. 

Williams, A. P., Abatzoglou, J. T., Gershunov, A., Guzman-Morales, J., Bishop, D. A., Balch, J. K., and Lettenmaier, D. P.: Observed impacts of anthropogenic climate change on wildfire in California, Earths Future, 7, 892–910, 2019. 

Xu, L., Zhu, Q., Riley, W. J., Chen, Y., Wang, H., Ma, P.-L., and Randerson, J. T.: The influence of fire aerosols on surface climate and gross primary production in the Energy Exascale Earth System Model (E3SM), J. Climate, 34, 7219–7238, 2021. 

Xu, X., Jia, G., Zhang, X., Riley, W. J., and Xue, Y.: Climate regime shift and forest loss amplify fire in Amazonian forests, Glob. Change Biol., 26, 5874–5885, 2020. 

Yu, Y., Mao, J., Thornton, P. E., Notaro, M., Wullschleger, S. D., Shi, X., Hoffman, F. M., and Wang, Y.: Quantifying the drivers and predictability of seasonal changes in African fire, Nature Commun., 11, 1–8, 2020.  

Yue, X., Mickley, L. J., Logan, J. A., and Kaplan, J. O.: Ensemble projections of wildfire activity and carbonaceous aerosol concentrations over the western United States in the mid-21st century, Atmos. Environ., 77, 767–780, 2013. 

Zheng, H., Yang, Z., Liu, W., Liang, J., and Li, Y.: Improving deep neural networks using softplus units, 2015 International Joint Conference on Neural Networks (IJCNN), 1–4, https://doi.org/10.1109/IJCNN.2015.7280459, 2015. 

Zhu, Q.: Building a machine learning surrogate model for wildfire activities within a global earth system model, Zenodo [code], https://doi.org/10.5281/zenodo.5508795, 2021. 

Zhu, Q. and Riley, W. J.: Improved modelling of soil nitrogen losses, Nat. Clim. Change, 5, 705–706, 2015. 

Zhu, Q. and Zhuang, Q.: Improving the quantification of terrestrial ecosystem carbon dynamics over the United States using an adjoint method, Ecosphere, 4, art118, https://doi.org/10.1890/ES13-00058.1, 2013. 

Zhu, Q. and Zhuang, Q.: Parameterization and sensitivity analysis of a process-based terrestrial ecosystem model using adjoint method, J. Adv. Model. Ea. Sy., 6, 315–331, https://doi.org/10.1002/2013MS000241, 2014. 

Zhu, Q., Riley, W. J., Tang, J., and Koven, C. D.: Multiple soil nutrient competition between plants, microbes, and mineral surfaces: model development, parameterization, and example applications in several tropical forests, Biogeosciences, 13, 341–363, https://doi.org/10.5194/bg-13-341-2016, 2016. 

Zhu, Q., Riley, W. J., Tang, J., Collier, N., Hoffman, F. M., Yang, X., and Bisht, G.: Representing nitrogen, phosphorus, and carbon interactions in the E3SM Land Model: Development and global benchmarking, J. Adv. Model. Ea. Sy., 11, 2238–2258, https://doi.org/10.1029/2018MS001571, 2019. 

Zhu, Q., Riley, W. J., Iversen, C. M., and Kattge, J.: Assessing impacts of plant stoichiometric traits on terrestrial ecosystem carbon accumulation using the E3SM land model, J. Adv. Model. Ea. Sy., 12, e2019MS001841, https://doi.org/10.1029/2019MS001841, 2020. 

Zou, Y., Wang, Y., Ke, Z., Tian, H., Yang, J., and Liu, Y.: Development of a REgion-specific ecosystem feedback fire (RESFire) model in the Community Earth System Model, J. Adv. Model. Ea. Sy., 11, 417–445, 2019. 

Zou, Y., Wang, Y., Qian, Y., Tian, H., Yang, J., and Alvarado, E.: Using CESM-RESFire to understand climate-fire-ecosystem interactions and the implications for decadal climate variability, Atmos. Chem. Phys., 20, 995–1020, https://doi.org/10.5194/acp-20-995-2020, 2020. 

Short summary
Wildfire is a devastating Earth system process that burns about 500 million hectares of land each year. It wipes out vegetation including trees, shrubs, and grasses and causes large losses of economic assets. However, modeling the spatial distribution and temporal changes of wildfire activities at a global scale is challenging. This study built a machine-learning-based wildfire surrogate model within an existing Earth system model and achieved high accuracy.