the Creative Commons Attribution 4.0 License.
Data-Informed Inversion Model (DIIM): a framework to retrieve marine optical constituents in the BOUSSOLE site using a three-stream irradiance model
Abstract. Within the New Copernicus Capability for Trophic Ocean Networks (NECCTON) project, we aim to improve the current data assimilation system by developing a method for accurately estimating marine optical constituents from satellite-derived Remote Sensing Reflectance. We developed and compared two frameworks, both of which implicitly invert a semi-analytical expression derived from the classical Radiative Transfer Equation. First, we used Bayesian estimation, which provided retrievals of the optical constituents along with their uncertainties. Moreover, by using historical in-situ measurements together with a Markov Chain Monte Carlo (MCMC) algorithm to adjust the model parameters, we reduced the root mean square error (RMSE) between the retrieved data and in-situ observations. Second, we employed the Stochastic Gradient Variational Bayes (SGVB) framework to efficiently approximate the Maximum a Posteriori (MAP) estimates of the optical constituents while simultaneously finding the Maximum Likelihood Estimate (MLE) of the model parameters. This approach computed the optical constituents faster than the Bayesian estimation, with equivalent RMSE values between the retrieved data and in-situ observations. We showed that both the MCMC-based and the SGVB-based algorithms were able to find sets of optimal parameters, which, due to correlations between them, are not unique. We conclude that both methods are consistent with the Radiative Transfer Equation. The first method provides reliable uncertainty estimates, while the second offers a faster alternative to standard inversion techniques, making it suitable for inversion and model optimization problems where MCMC algorithms are intractable.
Status: open (until 22 Feb 2025)
CEC1: 'Comment on gmd-2024-174 - No compliance with the policy of the journal', Juan Antonio Añel, 26 Dec 2024
Dear authors,
Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
In your Code and Data Availability section you provide a handful of links to sites containing information related to your manuscript. However, such sites are not acceptable for scientific publication. They include a GitHub site and Google Drive. You have included the DOI for a Zenodo repository containing part of the information, after an initial request from the Topical Editor; however, the Zenodo repository that you provide does not contain the information necessary to replicate the work that you present in your manuscript. For example, when trying to run the Python notebook on model availability, it tries to connect to the external GitHub site (moreover, under certain circumstances/configurations it fails to do so).
I have to be clear here. All the models, code and data necessary to replicate your work must be contained in the Zenodo repository, which must be standalone; that is, it must work without any additional download of software or data from third-party sites (excluding common libraries such as pandas, numpy, etc., which should be listed with their versions). Also, the Code and Data Availability section in your manuscript contains several references to sites that do not comply with our policy. This section is not there to advertise the "last version" of a software or the work of the authors, but to provide the specific information that assures compliance with the principle of scientific reproducibility. Including such unhelpful information only introduces unnecessary noise and complexity in the assessment of the compliance of your submitted manuscript. Therefore, please remove from this section all the information about sites that do not comply with the policy for permanent archival of code and data.
Therefore, the current situation with your manuscript is irregular. Please publish your code in one of the appropriate repositories and reply to this comment with the relevant information (a link and a permanent identifier for it, e.g. a DOI) as soon as possible, as we cannot accept manuscripts in Discussions that do not comply with our policy. Also, please include the relevant primary input/output data. Also, you must include a modified 'Code and Data Availability' section in a potentially reviewed manuscript, containing the DOI of the new repositories.
I note that if you do not fix this problem, we will have to reject your manuscript for publication in our journal.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/gmd-2024-174-CEC1
AC1: 'Reply on CEC1', Carlos Enmanuel Soto Lopez, 08 Jan 2025
Dear Prof. Juan A. Añel
Geosci. Model Dev. Executive Editor
After carefully reviewing the code and data availability, we have uploaded to Zenodo a standalone version of the code used to reproduce the results of the article, together with the data it needs. This new upload makes no reference to other sites; its only external dependencies are the following standard Python 3 libraries:
- networkx>=2.8.8
- torch
- torchvision
- torchaudio
- ConfigSpace>=1.1.1
- constrained-linear-regression>=0.0.4
- matplotlib>=3.7
- matplotlib-inline>=0.1.7
- mdurl>=0.1.2
- multiprocess>=0.70.16
- numpy>=1.24
- pandas>=2
- ray>=2.1
- scipy>=1.1
- seaborn>=0.13.2
- sympy>=1.12
- pvlib>=0.11.0
- tqdm>=4.66.5
- pyarrow
- tensorboardX
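As an illustration of how a reader might confirm that an environment satisfies minimum versions like those listed above, here is a minimal sketch. It is not part of the Zenodo archive: the function names (`parse_requirement`, `version_tuple`, `meets_minimum`, `check`) are hypothetical, only a subset of the listed packages is shown, and the comparison is deliberately simplified (it looks only at the leading numeric components and ignores pre-release tags).

```python
import re
from importlib.metadata import PackageNotFoundError, version

# A subset of the requirement list above; extend as needed.
REQUIREMENTS = ["numpy>=1.24", "pandas>=2", "scipy>=1.1", "torch"]


def parse_requirement(spec):
    """Split 'name>=1.2.3' into ('name', (1, 2, 3)); bare names get an empty tuple."""
    name, _, minimum = spec.partition(">=")
    parts = tuple(int(p) for p in minimum.split(".")) if minimum else ()
    return name.strip(), parts


def version_tuple(text):
    """Keep the leading numeric components, e.g. '2.1.0+cu118' -> (2, 1, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    return tuple(int(p) for p in match.group(0).split(".")) if match else ()


def meets_minimum(installed, minimum):
    """Compare two version tuples after padding the shorter one with zeros."""
    width = max(len(installed), len(minimum))
    pad = lambda t: t + (0,) * (width - len(t))
    return pad(installed) >= pad(minimum)


def check(requirements):
    """Return a {package: status} report for the given requirement strings."""
    report = {}
    for spec in requirements:
        name, minimum = parse_requirement(spec)
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = meets_minimum(version_tuple(installed), minimum)
        report[name] = f"{installed} ({'ok' if ok else 'too old'})"
    return report


if __name__ == "__main__":
    for name, status in check(REQUIREMENTS).items():
        print(f"{name}: {status}")
```

Packages listed without a version bound (such as torch) are only checked for presence.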
We have also prepared a README with instructions on how to install the required external packages with a Makefile. We also included a Python script (reproduce.py). By launching reproduce.py, the user is guided to reproduce the calculations and figures in the manuscript, with the option to run all calculations (-a), only the Bayesian computations (-b), only the sensitivity analysis and MCMC analysis (-m), or only the calculations related to training the neural network (-n).
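For readers unfamiliar with this kind of option-driven script, the following sketch shows one way such a stage selector could be structured with `argparse`. It is not the actual reproduce.py: only the four flags (-a, -b, -m, -n) come from the description above, while the stage functions, their return values, and the assumption that the options are mutually exclusive are hypothetical.

```python
import argparse


# Hypothetical placeholder stages standing in for the real computations.
def run_bayesian():
    return "bayesian"


def run_sensitivity_mcmc():
    return "sensitivity+mcmc"


def run_neural_network():
    return "neural-network"


def build_parser():
    parser = argparse.ArgumentParser(
        description="Sketch of a stage selector for reproducing results.")
    # Assumption: at most one option is chosen per run.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-a", action="store_true", help="run all calculations")
    group.add_argument("-b", action="store_true", help="only the Bayesian computations")
    group.add_argument("-m", action="store_true", help="only the sensitivity and MCMC analysis")
    group.add_argument("-n", action="store_true", help="only the neural network training")
    return parser


def select_stages(args):
    """Map the parsed flags to the list of stage functions to run."""
    if args.a:
        return [run_bayesian, run_sensitivity_mcmc, run_neural_network]
    if args.b:
        return [run_bayesian]
    if args.m:
        return [run_sensitivity_mcmc]
    if args.n:
        return [run_neural_network]
    return []


def main(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    stages = select_stages(args)
    if not stages:
        # No option given: show the available choices instead of guessing.
        parser.print_help()
    return [stage() for stage in stages]
```

Calling `main(["-b"])` runs only the placeholder Bayesian stage; calling `main()` under the usual `if __name__ == "__main__":` guard would read the options from the command line instead.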
I will send, via email, a version of the manuscript with a revised 'Code and Data Availability' section and with the DOI updated in the Citations section.
DOI of the new code and data archive: 10.5281/zenodo.14609747
Link to the new code and data archive: https://doi.org/10.5281/zenodo.14609747
Best regards,
Carlos Enmanuel Soto Lopez
Citation: https://doi.org/10.5194/gmd-2024-174-AC1
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
119 | 35 | 9 | 163 | 5 | 5 |