Submitted as: development and technical paper | 22 Oct 2020
Status: this preprint was under review for the journal GMD but the revision was not accepted.
Strengths and weaknesses of three Machine Learning methods for pCO2 interpolation
Jake Stamell, Rea R. Rustagi, Lucas Gloege, and Galen A. McKinley
Abstract. Using the Large Ensemble Testbed, a collection of 100 members from four independent Earth system models, we test three general-purpose Machine Learning (ML) approaches to understand their strengths and weaknesses in statistically reconstructing full-coverage surface ocean pCO2 from sparse in situ data. To apply the Testbed, we sample the full-field model pCO2 at the times and locations where real-world pCO2 was collected from 1982–2016 for each ensemble member. We then use the ML approaches to reconstruct the full field and compare with the original model full-field pCO2 to assess reconstruction skill. We use feed-forward neural network (NN), XGBoost (XGB), and random forest (RF) approaches to perform the reconstructions. Our baseline is the NN, since this approach has previously been shown to be a successful method for pCO2 reconstruction. The XGB and RF allow us to test tree-based approaches. We perform comparisons to a test set, which consists of 20% of the real-world-sampled data that are withheld from training. Statistical comparisons with this test set are equivalent to those that could be derived using real-world data. Unique to the Testbed is that it allows for comparison to all the "unseen" points to which the ML algorithms extrapolate. When compared to the test set, XGB and RF both perform better than NN based on a suite of regression metrics. However, when compared to the unseen data, the degradation of performance is large with XGB and even larger with RF. Degradation is comparatively small with NN, indicating a greater ability to generalize. Despite its larger degradation, in the final comparison to unseen data, XGB slightly outperforms NN and greatly outperforms RF, with the lowest mean bias and more consistent performance across Testbed members. All three approaches perform best in the open ocean and for seasonal variability, but performance drops off at longer timescales and in regions of low sampling, such as the Southern Ocean and coastal zones. For decadal variability, all methods overestimate the amplitude of variability and have moderate skill in reconstructing phase. For this timescale, the greater ability of the NN to generalize allows it to slightly outperform XGB. Taking all comparisons into account, we find XGB to be best able to reconstruct surface ocean pCO2 from the limited available data.
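To make the workflow in the abstract concrete, the following is a minimal sketch (not the authors' code) of the 80/20 split of the pseudo-observations, training of the three ML methods, and the dual evaluation against the withheld test set and the "unseen" full-field points. It uses scikit-learn and xgboost; the predictor names and the synthetic stand-in data are purely illustrative assumptions.

```python
# Sketch of the train/test/unseen evaluation described in the abstract.
# "sampled" stands in for model pCO2 subsampled at real-world coverage;
# "full_field" stands in for every grid point of the original model member.
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor
from xgboost import XGBRegressor  # assumes the xgboost package is installed

features = ["sst", "sss", "chl", "mld", "xco2"]  # illustrative predictors only
target = "pco2"
rng = np.random.default_rng(0)

def make_frame(n):
    """Placeholder data: random predictors with a simple pCO2 relationship."""
    df = pd.DataFrame(rng.normal(size=(n, len(features))), columns=features)
    df[target] = 350 + 10 * df["sst"] + rng.normal(scale=5, size=n)
    return df

sampled = make_frame(5_000)      # sparse pseudo-observations
full_field = make_frame(50_000)  # all grid points (known only in the Testbed)

# 80/20 split: the 20% test set mimics what real-world data alone could assess.
X_train, X_test, y_train, y_test = train_test_split(
    sampled[features], sampled[target], test_size=0.2, random_state=0
)

models = {
    "NN": MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500),
    "XGB": XGBRegressor(n_estimators=500, learning_rate=0.05),
    "RF": RandomForestRegressor(n_estimators=500, n_jobs=-1),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    # Skill on the withheld test set ...
    rmse_test = mean_squared_error(y_test, model.predict(X_test)) ** 0.5
    # ... versus skill on all "unseen" points, which only the Testbed allows.
    unseen_pred = model.predict(full_field[features])
    rmse_unseen = mean_squared_error(full_field[target], unseen_pred) ** 0.5
    print(f"{name}: test RMSE={rmse_test:.2f}, unseen RMSE={rmse_unseen:.2f}, "
          f"unseen R2={r2_score(full_field[target], unseen_pred):.3f}")
```

The second evaluation is the step the Testbed uniquely enables: comparing predictions at every grid point rather than only at withheld observations, which is how the generalization gap between the tree-based methods and the NN is quantified.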
Received: 16 Sep 2020 – Discussion started: 22 Oct 2020
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.
Using simulated surface ocean pCO2 from Earth System Models, we test three Machine Learning methods (neural network, XGBoost, random forest) to discern their ability to reconstruct global coverage from sparse observations. Because the data are synthetic, we can train on real-world sampling patterns and then evaluate against the known full-coverage fields of the original simulation. The ML approaches perform best in the open ocean but struggle in regions of low sampling. XGBoost performs best overall.