the Creative Commons Attribution 4.0 License.
FLAML version 2.3.3 model-based assessment of gross primary productivity at forest, grassland, and cropland ecosystem sites
Abstract. Accurately estimating Gross Primary Productivity (GPP) in terrestrial ecosystems is essential for understanding the global carbon cycle. Satellite-based Light Use Efficiency (LUE) models are commonly employed for simulating GPP. However, the variables and algorithms related to environmental limiting factors differ significantly across various LUE models. In this work, we developed a series of FLAML-LUE models tailored for different ecosystems. These models apply the Fast Lightweight Automated Machine Learning (FLAML) framework to the variables of LUE models in order to investigate the potential of estimating site-scale GPP. Incorporating meteorological data, eddy covariance measurements, and remote sensing indices, we employed FLAML-LUE models to assess the impact of various variable combinations on GPP across different temporal scales, including daily, 8-day, 16-day, and monthly intervals. Cross-validation analyses indicated that the effectiveness of FLAML-LUE models for forest ecosystems varied significantly across different sites, with R² values ranging from 0.56 to 0.94. For grassland ecosystems, R² values ranged from 0.62 to 0.87, and for cropland ecosystems, R² values ranged from 0.78 to 0.88. Extending the time scale of input data could significantly enhance the accuracy of model simulations. Specifically, the average R² increased from 0.82 to 0.92 for forest ecosystems, 0.79 to 0.83 for grassland ecosystems, and 0.84 to 0.87 for cropland ecosystems. Additionally, the importance ranking method indicated that vegetation index and temperature were the most important variables for GPP estimation in forest, grassland, and cropland ecosystems, while the importance of the moisture index was relatively low. This study offers an approach to estimate GPP fluxes and evaluate the impact of variables on GPP estimation. It has the potential to be applied in predicting GPP for different vegetation types at a regional scale.
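As a minimal sketch of two quantities the abstract relies on, the snippet below aggregates a daily series into fixed-length windows (for example 8-day means) and computes R² as 1 - SS_res / SS_tot. The function names and this particular R² definition are assumptions made for illustration; the manuscript may compute R² differently (for example as a squared correlation), and this is not the authors' FLAML-LUE code.

```python
from statistics import mean


def block_means(values, window):
    """Average consecutive, non-overlapping windows of a daily series.

    A trailing partial window is averaged as well (an assumption; a real
    workflow might drop or flag incomplete windows instead).
    """
    if window < 1:
        raise ValueError("window must be a positive integer")
    return [mean(values[i:i + window]) for i in range(0, len(values), window)]


def r_squared(observed, predicted):
    """Coefficient of determination, defined here as 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical daily values, for illustration only (not data from the study).
daily_observed = [2.1, 2.4, 2.0, 2.8, 3.1, 2.9, 3.3, 3.0, 3.6, 3.4, 3.9, 4.1, 3.8, 4.4, 4.0, 4.6]
daily_predicted = [2.4, 2.1, 2.3, 2.5, 3.3, 2.7, 3.0, 3.2, 3.3, 3.7, 3.6, 4.3, 4.0, 4.1, 4.3, 4.4]

r2_daily = r_squared(daily_observed, daily_predicted)
r2_8day = r_squared(block_means(daily_observed, 8), block_means(daily_predicted, 8))
```

Averaging over longer windows smooths day-to-day noise in both series, which is one plausible reason scores can rise at coarser time scales; the toy numbers above only demonstrate the mechanics.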
Status: open (until 12 Apr 2025)
-
CEC1: 'Comment on gmd-2024-169 - No compliance with the policy of the journal', Juan Antonio Añel, 12 Feb 2025
reply
Dear authors,
Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
You provide a GitHub link for the FLAML code, which you use in your manuscript. However, GitHub is not a suitable repository for scientific publication. GitHub itself instructs authors to use other long-term archival and publishing alternatives, such as Zenodo. Moreover, to get access to part of the data that you use in your study you provide a generic link to a website (https://www.scidb.cn/en/), which 1) is not an acceptable long-term repository for scientific publication, and 2) does not provide direct access to the data you have used, but is simply a link to a main portal.
Therefore, we have to ask you to store the FLAML code and all the data that you use to produce your manuscript in one of the acceptable repositories according to our policy. Also, reply to this comment with the relevant information (link and a permanent identifier for the repositories (e.g. DOI)) as soon as possible.
Please remember to include a modified 'Code and Data Availability' section in a potentially reviewed manuscript, containing the new information.
Finally, I have to note that if you do not fix this problem, we will have to reject your manuscript for publication in our journal. In this regard, I advise the Topical Editor of this manuscript to stall the review process for your manuscript until these issues have been solved, as we cannot accept manuscripts in Discussions that do not comply with our policy.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/gmd-2024-169-CEC1
-
AC1: 'Reply on CEC1', Jie Lai, 13 Feb 2025
reply
Dear Editor,
We hope this message finds you well.
Thank you very much for your valuable feedback. It seems there may have been a misunderstanding regarding this section. As per the journal’s policy, we have stored both our code and data on Zenodo (DOI: 10.5281/zenodo.14542880). The GitHub link provided in our manuscript refers to the FLAML source code, which is maintained by its developers and is included solely to facilitate access to the library for other researchers. The site-level flux data were obtained from https://www.scidb.cn/en/. However, the actual dataset used in our study has been processed and is also available on Zenodo.
If you believe our Code and Data Availability section still does not fully comply with the journal’s guidelines, we would be happy to revise it accordingly. We look forward to your feedback.
Once again, thank you for your time and support.
Best regards,
Jie Lai,
Shenyang Institute of Applied Ecology, Chinese Academy of Sciences,
laijie21@mails.ucas.ac.cn
-
CEC2: 'Reply on AC1', Juan Antonio Añel, 13 Feb 2025
reply
Dear authors,
Thanks for your reply. Regarding the data in your manuscript, if all the data that you use is in the Zenodo repository, then we can consider the data part in compliance with our policy. However, the FLAML library is necessary to perform the work that you present in your manuscript, and its current GitHub repository does not guarantee permanent storage or future availability. This library is under the MIT license, which allows redistribution. Therefore, as the authors of the manuscript, you are responsible for ensuring the replicability of the presented work: you must obtain all the necessary software (everything contained in the FLAML Git repository) and store it in a permanent repository. Please do so.
In the future, to avoid misunderstandings, please, be aware that the Code and Data Availability section is to present the actual assets necessary to replicate your work, not to promote generic websites or projects. Therefore, it would be good if you restrict the information in this section to the actual links and permanent handles (e.g. DOI) of the assets necessary to replicate your work. This is something you could consider in potentially reviewed versions of your manuscript.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/gmd-2024-169-CEC2
-
AC2: 'Reply on CEC2', Jie Lai, 15 Feb 2025
reply
Dear editor,
We hope this message finds you well.
Thank you very much for your valuable feedback. We have carefully considered your suggestions and made the necessary revisions. The FLAML source code, in a zip file, has been uploaded to Zenodo. With the modifications outlined below, we believe it now fully complies with the requirements of the Code and Data Availability section.
“A Fast Library for Automated Machine Learning & Tuning (FLAML) is a Python library, and detailed documentation about FLAML can be found on GitHub. We have uploaded the related source code and documentation to Zenodo (https://doi.org/10.5281/zenodo.14874754, Laijie, 2025). The flux observation data and the Python source code of the FLAML-LUE used in this paper are also archived on Zenodo (https://doi.org/10.5281/zenodo.14542880, Laijie, 2024).”
Once again, thank you for your time and support.
Best regards,
Jie Lai,
Shenyang Institute of Applied Ecology, Chinese Academy of Sciences,
laijie21@mails.ucas.ac.cn
-
CEC3: 'Reply on AC2', Juan Antonio Añel, 15 Feb 2025
reply
Dear authors,
Many thanks, we can consider your manuscript in compliance with the policy of the journal.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/gmd-2024-169-CEC3
-
CC1: 'Comment on gmd-2024-169', Xiangyong Lei, 02 Mar 2025
reply
General Comments:
The manuscript titled "FLAML version 2.3.3 model-based assessment of gross primary productivity at forest, grassland, and cropland ecosystem sites" introduces a new model, FLAML-LUE, created by combining the FLAML model with LUE-based models; the latter provide the key vegetation-growth variables for modeling.
The manuscript is well structured, has clear research objectives, and is well discussed, although some minor improvements could still be made. I have listed them in the specific comments below.
Specific comments:
- Line 67: There is a period before the citation and another one after it. Please correct this punctuation error.
- Line 143: The authors should consider including the geographical coordinates (longitude and latitude) of the monitoring stations in Table 1. Providing this spatial reference would significantly enhance the study's reproducibility and facilitate comparative analyses with other datasets.
- Line 147 and 150: It is recommended to change the last third of the sentence to:
"The last third of the long-term series data from the ALF, CBF, and QYF stations were used for forest model validation. Similarly, the last third of the data from DLG, DXG, and HBG stations were used for grassland model validation, and the last third of the data from JZA and YCA stations were used for cropland model validation."
- Line 156: It is recommended to change “However, some sites have no ER data” to “some sites lacked ER data.”
- Line 216 and 219: (C. Wang et al., 2021) should be (Wang et al., 2021).
- Line 241: The table title does not have a period at the end.
- Line 256-259: All equations should be centered in the text, with equation numbers placed on the right margin enclosed in parentheses.
- Line 272: The manuscript exhibits inconsistent punctuation formatting patterns that require standardization.
- In Figures 4, 8, and 12: The y-axis labels appear to be incorrect. Shouldn't they be 8-day, 16-day, and monthly, respectively?
- Line 430, 452, and 545: The 2 in R² is not formatted as a superscript.
- Why are separate models constructed for forest, grassland, and cropland ecosystems instead of developing a single unified model?
Citation: https://doi.org/10.5194/gmd-2024-169-CC1
-
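The wording suggested for lines 147 and 150 describes holding out the last third of each site's long-term series for validation. A minimal sketch of such a chronological split follows, assuming the records are already sorted by time; the function name, the rounding rule, and the toy records are hypothetical and are not taken from the authors' code.

```python
def chronological_split(records, validation_fraction=1 / 3):
    """Split time-ordered records into (training, validation) parts.

    The final `validation_fraction` of the series is held out, so the
    validation data always come after the training data in time.
    """
    if not 0 < validation_fraction < 1:
        raise ValueError("validation_fraction must be between 0 and 1")
    n_validation = round(len(records) * validation_fraction)
    split_at = len(records) - n_validation
    return records[:split_at], records[split_at:]


# Hypothetical (year, GPP) pairs for one site, for illustration only.
site_series = [(2003, 5.1), (2004, 5.4), (2005, 4.9), (2006, 5.8), (2007, 6.0), (2008, 5.6)]
training, validation = chronological_split(site_series)
```

Unlike a random split, this keeps the validation years strictly later than the training years, which avoids training on observations that follow the period being validated.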
AC3: 'Reply on CC1', Jie Lai, 03 Mar 2025
reply
Dear Dr. Lei,
We hope this email finds you well.
Thank you very much for your valuable feedback. We have carefully reviewed your comments and made the necessary revisions accordingly.
A detailed point-by-point response to each of your comments is provided in the attached document. Additionally, we have corrected the formatting and other minor errors in the manuscript as suggested. We sincerely appreciate your insightful suggestions, which have helped improve the quality of our work.
Please let us know if you need any further information or have additional questions. We would greatly appreciate any further suggestions that could enhance the manuscript.
Once again, thank you for your time and support.
Best regards,
Jie Lai
Shenyang Institute of Applied Ecology, Chinese Academy of Sciences
laijie21@mails.ucas.ac.cn
-
RC1: 'Comment on gmd-2024-169', Anonymous Referee #1, 03 Mar 2025
reply
Dear Authors,
Thank you for this interesting study on developing a Machine Learning (ML) based Light-Use Efficiency (LUE) model to estimate Gross Primary Productivity (GPP). The study integrated remote sensing and eddy flux data into a Fast Lightweight Automated Machine Learning (FLAML) framework and achieved improved statistical performance scores in relation to previous methods. The choice of a ML framework is novel and represents an important step in the development of LUE models.
The study’s methods are sound and the statistical analysis is very well developed, including the tests and the choice of training and validation data. However, I have identified some points which could lead to a minor or major revision depending on the editor’s opinion. First, the choice of simple vegetation indices as dependent variables for the model seems dated to me, especially given the current availability of Solar Induced Fluorescence (SIF) products, which are better suited as proxies of photosynthesis than EVI, NDVI, etc. Although the authors mention the possible future use of SIF, I would like further details on why it was not used in this study, or an extra analysis in which SIF is included. Second, the resolution of the remote sensing products used (500 meters) does not seem to be compatible with the eddy flux data. At this scale, microclimatic or topographic factors may cause significant divergences within a 500 m pixel and lead to inconsistencies. I suggest that, if possible, data with higher resolution be used (LANDSAT or SENTINEL-2), or that arguments be given for the use of the lower resolution product. Finally, I would be very interested in the production of a GPP map of China using the FLAML framework, and in how it compares with other GPP maps. I think this would greatly increase the manuscript’s appeal.
Specific comments:
L90 - I would not say ML is "fundamentally different" from regression models, but rather that it offers advantages over them.
L94 - I would also point out limitations of ML techniques, such as dependence on large training datasets and not being able to link results to real-world processes.
L96 - ...Which is an advantage when the focus is solely on spatial predictions
Fig. 1 - The mini-map on the bottom right corner does not include any sites, or any extra information, maybe remove it? Otherwise, I believe the editors should label these areas in the South China Sea as “under dispute”, as stated in the “maps and aerials” section of the submission guidelines.
Table 2 – In contrast to other vegetation indexes, LAI satellite data is based on empirical models, such as previous GPP estimating methods. It would be interesting to check if field LAI data from the sites are available to see if direct LAI measurements improve the ML model.
L686 - I would argue then that in the future hyperspectral data + ML would provide much better estimates too, this could be discussed with references.
Citation: https://doi.org/10.5194/gmd-2024-169-RC1