Spatial parameter optimization of a terrestrial biosphere model for improving estimation of carbon fluxes for deciduous forests in the eastern United States: an efficient model-data fusion method
- ¹School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
- ²Earth Systems Research Center, Institute for the Study of Earth, Oceans, and Space, University of New Hampshire, Durham, NH 03824, USA
- ³Department of Geographical Sciences, University of Maryland, College Park, MD 20742, USA
- ⁴College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100049, China
- ⁵Guangdong Province Key Laboratory for Climate Change and Natural Disaster Studies, School of Atmospheric Sciences, Sun Yat-sen University, Guangzhou, China
Abstract. Inaccurate parameter estimation is a significant source of uncertainty in complex terrestrial biosphere models. Model parameters may have large spatial variability, even within a vegetation type. Model uncertainty arising from parameter uncertainty can be significantly reduced by model-data fusion (MDF), which, however, is difficult to implement over a large region with traditional methods because of the high computational cost. This study proposed a hybrid modeling approach that couples a terrestrial biosphere model with a data-driven machine learning method, which uses satellite information and considers the physical mechanisms. We developed a two-step framework to estimate the essential parameters of the revised Integrated Biosphere Simulator (IBIS) pixel by pixel using the satellite-derived leaf area index (LAI) and gross primary productivity (GPP) products as “true values.” The first step was to estimate the optimal parameters for each sample using a modified adaptive surrogate modeling algorithm (MASM). We applied the Gaussian Process Regression algorithm (GPR) as a surrogate model to learn the relationship between model parameters and errors. In the second step, we built an eXtreme Gradient Boosting (XGBoost) model between the optimized parameters and local environmental variables. The trained XGBoost model was then used to predict optimal parameters spatially across the deciduous forests in the eastern United States. The results showed that the parameters were highly variable spatially and quite different from the default values over forests, and that the simulation errors of GPP and LAI could be markedly reduced with the optimized parameters. The effectiveness of the optimized model in estimating GPP, ecosystem respiration (ER) and net ecosystem exchange (NEE) was also tested through site validation. The optimized model reduced the root-mean-squared error (RMSE) from 7.03 to 6.22 gC m⁻² 8 d⁻¹ for GPP, from 2.65 to 2.11 gC m⁻² 8 d⁻¹ for ER and from 4.45 to 4.38 gC m⁻² 8 d⁻¹ for NEE. The mean annual GPP, ER and NEE of the region from 2000 to 2019 were 5.79, 4.60 and -1.19 Pg year⁻¹, respectively. The strategy used in this study requires only a few hundred model runs to calibrate regional parameters and is readily applicable to other complex terrestrial biosphere models with different spatial resolutions. Our study also emphasizes the necessity of pixel-level parameter calibration and the value of remote sensing products for per-pixel parameter optimization.
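The two-step idea described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not the authors' code: `toy_model_error` is a hypothetical stand-in for one IBIS run scored against satellite LAI/GPP, the two parameters and two environmental variables are synthetic, the surrogate loop is a plain "refit and sample the predicted minimum" scheme rather than the MASM algorithm itself, and scikit-learn's `GradientBoostingRegressor` is swapped in for XGBoost to keep the example dependency-light.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)
DEFAULT_PARAMS = np.array([0.5, 0.5])  # hypothetical "default" parameter set


def toy_model_error(params, env):
    """Hypothetical stand-in for one model run scored against satellite data.

    The error is smallest at a parameter optimum that depends on the local
    environment, which is the situation the two-step framework targets.
    """
    optimum = np.array([0.2 + 0.6 * env[0], 0.8 - 0.5 * env[1]])
    return float(np.sum((np.asarray(params) - optimum) ** 2))


def calibrate_sample(env, n_init=12, n_iter=10, n_cand=500):
    """Step 1 (simplified): surrogate-assisted calibration for one sample."""
    # Initial design: random parameter sets evaluated with the "model".
    X = rng.uniform(0.0, 1.0, size=(n_init, 2))
    y = np.array([toy_model_error(x, env) for x in X])
    for _ in range(n_iter):
        # Fit a GPR surrogate mapping parameters to model error.
        gpr = GaussianProcessRegressor(
            kernel=Matern(nu=2.5), alpha=1e-4, normalize_y=True
        )
        gpr.fit(X, y)
        # Search the cheap surrogate, then spend one real run on its minimum.
        cand = rng.uniform(0.0, 1.0, size=(n_cand, 2))
        best = cand[np.argmin(gpr.predict(cand))]
        X = np.vstack([X, best])
        y = np.append(y, toy_model_error(best, env))
    return X[np.argmin(y)]


# Step 1: calibrate a small set of sample pixels.
env_samples = rng.uniform(0.0, 1.0, size=(20, 2))
opt_params = np.array([calibrate_sample(env) for env in env_samples])

# Step 2: learn environment -> optimal parameter, one regressor per parameter.
regressors = [
    GradientBoostingRegressor(random_state=0).fit(env_samples, opt_params[:, j])
    for j in range(opt_params.shape[1])
]

# Predict parameter maps for every pixel without further model runs.
env_grid = rng.uniform(0.0, 1.0, size=(1000, 2))
param_maps = np.column_stack([m.predict(env_grid) for m in regressors])

# Compare calibrated and default parameters on the sample pixels.
err_calibrated = np.mean(
    [toy_model_error(p, e) for p, e in zip(opt_params, env_samples)]
)
err_default = np.mean([toy_model_error(DEFAULT_PARAMS, e) for e in env_samples])
print(f"mean error: default={err_default:.4f}, calibrated={err_calibrated:.4f}")
```

In this sketch each sample costs 22 evaluations of the toy model, and the expensive step is never repeated for the 1000 grid pixels; that division of labour is the reason a regional calibration might need only a few hundred model runs.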
Rui Ma et al.
Status: open (until 13 Jul 2022)
- RC1: 'Comment on gmd-2022-96', Anonymous Referee #1, 01 Jun 2022
Accurate parameter estimation has drawn much attention as a way to reduce uncertainty in model projections. The idea of combining a terrestrial biosphere model with a data-driven machine learning method to invert spatial parameters is interesting. The two-step framework proposed by the authors effectively optimized the spatial parameters, which are quite different from the default values, and markedly reduced the simulation errors of GPP and LAI. This manuscript provides a good case study of applying remote sensing products to pixel-level parameter optimization to reduce uncertainty in the current model framework. I think the paper has reached the level required for publication.
- AC1: 'Reply on RC1', Rui Ma, 08 Jun 2022
Thank you for your comment. Please feel free to contact us if you have any other questions.
- RC2: 'Comment on gmd-2022-96', Anonymous Referee #2, 05 Jun 2022
This paper proposed a two-step framework to estimate the essential parameters of the revised Integrated Biosphere Simulator (IBIS) at the pixel level. The paper was well prepared, well organized and clearly presented. However, there are still some issues that the authors should address before the paper is considered for publication.
1) This study used global LAI and GPP products from the Global Land Surface Satellite (GLASS) suite as “observations” (“true values”) for parameter calibration on a spatial scale. If the GLASS products are good enough to serve as “true values”, what is the significance of correcting the model parameters? Please clarify this in the Introduction.
2) The paper mainly focuses on deciduous forests in the eastern United States. How accurate are other products in the study area? If you want to take the GLASS products as reference values, please elaborate on their accuracy in the study area.
3) In Line 228, for which indicators (GPP, LAI, ER or NEE) are the ten sensitive parameters derived in the study sensitive?
4) The ranges of parameters are very important because they directly limit the boundaries of parameter optimization and the range of parameter space variation. Therefore, how did the authors determine the ranges of the prior parameters in this study?
5) Lines 337-338: as you mentioned, for the testing set, the estimated errors (RMSE and DISO) using XGBoost were slightly less correlated with the corresponding accuracy indexes of the MASM approach. However, the target values in the XGBoost training set are the parameter values obtained by MASM. So is the training performance of the XGBoost model unsatisfactory?
6) Both MASM and XGBoost can be used to obtain the spatial distribution of sensitive parameters. Introducing XGBoost to predict other spatial parameters from some partially corrected parameters may cause greater uncertainty. So why did the authors choose these two methods? The necessity of the two-step correction is not clear in the article.
7) Parameter screening and optimization mainly targeted LAI and GPP, while ER and NEE were added to the final carbon flux prediction. NEE is GPP minus ecosystem respiration and disturbance. I wonder whether the parameters associated with these two processes were corrected. In addition, I suggest analyzing the uncertainties of these two carbon flux results.
- CEC1: 'Comment on gmd-2022-96', Juan Antonio Añel, 15 Jun 2022
Dear authors,
After checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
You have archived your code on GitHub. However, GitHub is not a suitable repository. GitHub itself instructs authors to use other alternatives for long-term archival and publishing, such as Zenodo. Admittedly, the Topical Editor made a mistake in previous communication with you by suggesting GitHub as an option. Our apologies for this mistake. Therefore, please publish your code in one of the appropriate repositories, and include the relevant primary input/output data. You must then include, in any revised version of your manuscript, the modified 'Code and Data Availability' section with the DOI of the code (and another DOI for the dataset if necessary).
Also, both the README.txt in the GitHub repository and the "Code and Data policy" section in your manuscript continue to suggest other repositories beyond the requested one. The role of such a section and repository is not to promote your preferred webpage or the website of a project, or to link to the newest version of a model, but to ensure the replicability of your work. All information other than the link and DOI to the repository with the exact code and data used in your work only adds confusion, making it harder for editors, reviewers and readers to reach the right assets. Therefore, please remove, both in the final repository and in the "Code and Data Availability" section, the links to servers that are not the permanent ones that our policy requests (oml.gov and ed.ac.uk).
Additionally, there is no license listed in the GitHub repository. If you do not include a license, then despite what you state in the README file or the policy on other servers, the code is not free to be used by a third party; it continues to be your property. Therefore, when uploading the model's code to Zenodo, you may want to choose a free software/open-source (FLOSS) license. We recommend the GPLv3. You only need to include the file 'https://www.gnu.org/licenses/gpl-3.0.txt' as LICENSE.txt with your code. You can also choose other options that Zenodo provides: GPLv2, Apache License, MIT License, etc.
Juan A. Añel
Geosci. Model Dev. Exec. Editor
Viewed
- HTML: 296
- PDF: 80
- XML: 12
- Total: 388
- Supplement: 25
- BibTeX: 3
- EndNote: 4