the Creative Commons Attribution 4.0 License.
Using Deep Learning to integrate paleoclimate and global biogeochemistry over Phanerozoic time
Abstract. Databases of 3D paleoclimate model simulations are increasingly used within global biogeochemical models for the Phanerozoic Eon. This improves the accuracy of the surface processes within the biogeochemical models, but the approach is limited by the availability of large numbers of paleoclimate simulations at different pCO2 levels and for different continental configurations. In this paper we apply the Frame Interpolation for Large Motion (FILM) Deep Learning method to a set of paleoclimate model simulations to upscale their time resolution from one model run every ~25 million years to one model run every 1 million years (Myr).
Testing the method on a 5 Myr time-resolution set of continental configurations confirms the accuracy of our approach when reconstructing intermediate frames from configurations separated by up to 40 Myrs. We then apply the method to upscale the paleoclimate datastructure in the SCION climate-biogeochemical model and demonstrate that upscaled outputs for global distributions of surface temperature and runoff follow a logical progression between the original keyframes.
When updated to use the high-time-resolution climate datastructure, the SCION model predicts climate shifts that were not present in the original model outputs because of its previous use of widely spaced datasets and simple linear interpolation. We conclude that a time resolution of ~10 Myr in paleoclimate simulations is likely sufficient for investigating the long-term carbon cycle, and that Deep Learning methods may be critical in attaining this time resolution at a reasonable computational expense, as well as for developing new fully continuous methods in which 3D continental processes, such as species distribution, are able to translate over a moving continental surface in deep time. Nonetheless, Deep Learning methods are less effective at interpolating runoff data than paleogeography and temperature, because runoff is heterogeneously distributed. Consequently, interpolated climates should be confirmed by running a paleoclimate model before any sound scientific conclusions are drawn.
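For orientation, the "simple linear interpolation" between widely spaced keyframes that the abstract refers to can be sketched as follows. This is only an illustrative baseline, not the SCION code or the FILM method: the function name `interpolate_frame`, the tiny 2x2 grids, and the 25 Myr spacing are all assumptions made for the example, and plain Python lists stand in for gridded model output.

```python
from bisect import bisect_right


def interpolate_frame(key_ages, key_frames, age):
    """Linearly interpolate a 2D grid at `age` from keyframes sorted by ascending age."""
    if not key_ages[0] <= age <= key_ages[-1]:
        raise ValueError("age lies outside the keyframe range")
    # Index of the keyframe at or just before `age`, clamped so that a later keyframe exists.
    lo = min(bisect_right(key_ages, age) - 1, len(key_ages) - 2)
    hi = lo + 1
    # Weight of the later keyframe: 0 at the earlier keyframe, 1 at the later one.
    w = (age - key_ages[lo]) / (key_ages[hi] - key_ages[lo])
    return [
        [(1.0 - w) * a + w * b for a, b in zip(row_lo, row_hi)]
        for row_lo, row_hi in zip(key_frames[lo], key_frames[hi])
    ]


# Hypothetical keyframes: two 2x2 temperature grids (deg C) spaced 25 Myr apart.
key_ages = [0.0, 25.0]
key_frames = [
    [[10.0, 20.0], [30.0, 40.0]],
    [[20.0, 20.0], [10.0, 50.0]],
]

# One frame per 1 Myr, including both keyframes.
frames_1myr = [interpolate_frame(key_ages, key_frames, float(t)) for t in range(26)]
```

Each grid cell is blended independently here, so features can only fade in and out between keyframes; they cannot move across the grid as they do in a motion-aware frame interpolator such as FILM.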
Status: final response (author comments only)
RC1: 'Comment on gmd-2023-230', Anonymous Referee #1, 13 Mar 2024
Review of the article “Using Deep Learning to integrate paleoclimate and global geochemistry over Phanerozoic time.”
(added in pdf version in the "supplement")
General comments:
Spatially resolved modeling has gained popularity in recent years and has proven to be a very effective tool for assessing paleoclimates. Due to the sparse data (modeled or from proxies) used in assessing fluxes through time, these models are limited. Building on SCION's already important development in this domain, this contribution develops a method that makes the model even more reliable. At least two reasons make this study very valuable and interesting:
- It provides a framework to improve the accuracy of surface processes used in biogeochemical models and demonstrates that a 10 Myr resolution is sufficient, which gives valuable information to the community for future studies.
- This method reduces the global computing costs of running models over geological timescales, which are currently a major limiting factor in many research projects.
It was a pleasure to read this manuscript, which is of high quality. The scientific significance is excellent, as the sparse data available is a limiting factor in the paleoclimate modeling domain. The scientific quality is very good, as the reliability of the method is tested on different timescales and targets (paleogeography vs runoff) and discussed in some detail. The scientific reproducibility is good, as the method is explained thoroughly and the models are available to download. The presentation quality is good, as the figures are very relevant to the text and illustrate it well.
However, the PaleoDEM validation revealed some serious issues with interpolating over 40 Myrs, yet the authors used it on GEOCLIM/SCION with intervals up to 55 Myrs. Additionally, Figure 4 shows artificial landmasses created with a 10 Myr timestep that may pose a problem for climate modeling, mostly for models including oceanic circulation. Therefore, the text should emphasize more clearly that this method must be used with caution.
Specific comments: (the numbering applies to the preprint version of the manuscript available online)
Lines
Comment
title
Phanerozoic time sounds a bit odd to me, why not use the normal word for it: “eon”?
26-27
“Species distribution”: you don’t mention this point in the main text. Maybe it would be better placed in the conclusion as an opening?
38
Goddéris et al. (2023) is quoted but can’t be found in the reference list.
44
The text mentions 22 continental configurations. However, in both GEOCLIM and SCION, only 21 frames are cited (not assigned to any reference though). What and when is this 22nd?
47-49
From what I understood the ITCZ is more or less forced by the model, is that what you mean?
However, the ITCZ's shape and location will be determined by the paleogeographical configuration and be modeled by the GCM, along with the areas of extreme weathering. In most cases, it is not the other way around. Would it be possible to reformulate this sentence?
60 - 63
For me, the heart of this study is the method and validation parts, yet the latter is not mentioned at all in the introduction.
68
4.5° × 7.5° might be too coarse to display important paleogeographic features such as island arcs that might have a great impact on climate over some periods of the Earth’s history (see Ribeiro et al., 2022 or Marcilly et al., 2022).
68-69
“Roughly evenly spaced”: you then mention later in the text (l. 137) that some spacing is 55 Myrs. Moreover, the text never mentions which model is used for these continental reconstructions. If it is indeed the one used in the original GEOCLIM (Goddéris et al., 2014), then is it the maps from Blakey (2007), as in Nardin et al. (2011)? I think this study would gain from referencing the paleogeographic models in a better way, because they are the base layer for climate modeling and are therefore extremely important.
71
“Original FOAM”: which version of the model are we talking about?
75
“Wide spacing in time between …datasets”: the time spacing is mentioned but still no numbers are given. Maybe it would be nice to give some numbers so the reader knows which scale of spacing we are talking about?
95-96
You mention shifts in FOAM are due to the reorganization of landmasses, yet no real plate tectonic/paleogeography studies are cited here. It would be nice to cite the study presenting the reconstructions behind FOAM.
142
Maybe “the” should be “a” PaleoDEM dataset. PaleoDEM is a quite widely used term and not restricted to the work of Scotese.
154
The problem with downscaling is that it often results in an overestimation of the exposed land ratio which will mean more area available for weathering for the biogeochemical model. I know it’s difficult to run GCM with a finer grid.
(It was just to raise awareness as this is not strictly related to the subject of your study. I understand the point here is to demonstrate the reliability of the method which I think is very well done in this study)
227-228
I’m not sure what you are trying to say here: that the synthetic and real maps (Scotese and Wright, 2018) have a better fit together than Scotese & Wright (2018) and Marcilly et al. (2021)? If so, it might need some reformulation.
What period of comparison are we talking about here?
Can you give an estimate of this discrepancy, in % error for example?
274 - 275
Having issues with small landmasses is quite serious because they often display high runoff (Goddéris et al., 2014) and therefore host high weathering.
294-296+
§ 4
This is interesting because other models such as GEOCARBSULF (Marcilly et al., 2021) and GEOCLIM (Goddéris & Donnadieu, 2017; Goddéris et al, 2014) have this spike which is attributed to a change in climate sensitivity in GEOCARBSULF for example.
I’m confused about how the frames are now interpolated. In this section, are we back to the first part, where you interpolate following the spacing of the maps, which are “roughly evenly spaced”? How much time is there between two frames? It is a bit confusing after the validation part with the 3 different spacings.
So, you demonstrate that the accuracy of this method is reduced for spacings greater than 10 Myrs, and yet you use it with intervals up to 55 Myrs? Is that not a problem? I don’t think you should draw any conclusions for intervals over 10 Myrs, which, if the FOAM runs are indeed the ones in Goddéris et al. (2014), actually cover the majority of the Phanerozoic.
This is where it becomes complicated for me to understand, because the extreme warmth of the Permian-Triassic extinction is probably shorter than 10 Myrs, so can you actually see this signal? In Cao et al. (2022) the interval considered is 253-247 Ma, and roughly 254-250 Ma in Yang et al. (2019), for example.
How can you see short signals such as the P/T warming but not the Ordovician (Hirnantian) cooling, for example, which many have attributed to changes in paleogeography (e.g., Nardin et al., 2011)?
Why cite Wu et al. (2023)? The article in the reference list is not about the P/T boundary.
301
The fate of South China will also depend on the chosen reconstruction and downscaling process: for the lower Triassic in Fig. 9, South China seems well emerged, but in the reconstructions of Marcilly et al. (2021) the land area available for weathering is very small.
305 - 307
You should also mention that the timestep between two “base” reconstructions is greater than 10 Myrs and the accuracy is therefore reduced.
323
“20 Myrs apart”: are they though? From Goddéris et al. (2014) they seem to be more like 30 to 40 Myrs apart for the majority.
“Can be applied” vs “should aim to run climate models at least every 10 Myrs” (l. 333):
the recommendation made for further studies is therefore not respected in this very study, which sends a bit of a mixed message. Maybe this sentence should be rewritten?
334
The conclusion is very well structured and easy to read. Maybe it would also be worth mentioning here that the method fails to reconstruct short-lived events (longer than 1 Myr though) such as the Hirnantian glacial event, even though they have large climatic consequences.
Comment related to figures:
Figure
Comment
2
Concerning the graphs presenting SSIM and 2D correlation:
Whatever the frame interval considered, it seems that there are two periods of increasingly low performance. It’s difficult to read the ages with certainty, but I would say between 430-420 Ma and 250-210 Ma. What can cause this?
Table 1
Title: Missing the a in “evaluation”
4
The synthetic maps at 105 Ma (yellow arrows) worry me a bit. They are acceptable (though not ideal) for running GCM simulations with FOAM, because it doesn’t have a proper oceanic circulation module, but with other GCMs such artifacts will be an issue. Can you comment on that?
You don’t mention it in the text, but this creation of land over South China (orange arrow) will create a big issue for the assessment of weathering fluxes, as it is well known that small, isolated landmasses host a lot of runoff and therefore weathering (mostly at the equator). It would be nice to highlight this point even if you already mentioned that the 40 Myrs step is less accurate. It would actually illustrate this point.
5
In both the runoff and temperature graphs, deeper-time runs seem to have a better CO2-temperature and CO2-runoff correlation. How do you explain that?
6-7
Those two figures are quite crowded, is it possible to select the most representative graphs and put the other ones in the appendix?
8
It would be nice to see an estimate of the “accuracy” of the method on this figure. Maybe highlight the periods where base maps are closer to each other and so lead to more accuracy. This way the reader can directly see which periods are more reliable than the others.
In the text you mention that, with this method, the timestep is reduced to 1 Myr, so we should see the signal of more short-lived events. However, here the Hirnantian cooling is totally hidden, and instead there is even an increase in CO2 and temperature. Can you comment on this?
9
The figure highlights the increase in runoff and weathering in mid-latitude central Pangea during the Late Permian/Early Triassic, but this is debated, and evidence such as the large extent of evaporite deposits suggests quite arid conditions (Scotese maps below (DOI: 10.13140/2.1.2757.8567) or Cui & Cao (2021), https://doi.org/10.1002/gj.4123).
Arid conditions are unlikely to result in intense weathering. Can you comment on that as well?
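As a point of reference for the Figure 2 comment above, a 2D correlation score between a model map and a synthetic map can be sketched as below. This is the standard Pearson correlation over all grid cells; whether the manuscript computes its "2D correlation" in exactly this way is an assumption, SSIM is not reproduced here, and the names `corr2`, `model`, and `synthetic` as well as the 2x2 grids are purely illustrative.

```python
from math import sqrt


def corr2(a, b):
    """Pearson correlation coefficient between two equally shaped 2D grids."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    if len(xs) != len(ys):
        raise ValueError("grids must have the same number of cells")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations from the grid means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant grid")
    return cov / sqrt(var_x * var_y)


# Hypothetical elevation grids: a reference map and an interpolated (synthetic) map.
model = [[1.0, 2.0], [3.0, 4.0]]
synthetic = [[1.1, 1.9], [3.2, 3.9]]
score = corr2(model, synthetic)
```

A score near 1 indicates that the two maps share the same spatial pattern, but the metric is insensitive to a uniform offset or scaling, which is one reason it is usually reported alongside a second measure.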
AC1: 'Reply on RC1', Dongyu Zheng, 26 Apr 2024
We thank the reviewers for their insightful comments and have addressed all the points raised. We agree with the reviewer’s comments and thank them for the positive assessment. Indeed, although the FILM method can provide continuous interpolated frames that can be used for models like SCION, caution is absolutely warranted if using these in place of GCM simulations to draw conclusions. We have incorporated several important considerations into the discussion and conclusion sections for readers who may wish to use their own datasets for interpolation.
RC2: 'Comment on gmd-2023-230', Anonymous Referee #2, 21 Mar 2024
Summary: The study presents an application of an interpolation algorithm, originally developed to interpolate video frames, to deep-time paleoclimate simulations. Using this application, steady-state snap-shot climate simulations widely separated in time can be interpolated to produce paleoclimate maps with a higher time resolution.
The manuscript uses interpolated maps of palaeogeography, temperature, and runoff covering the last 500 million years with a time resolution of 1 million years (from an original time resolution of about 25 million years) to drive a biogeochemical climate model.
Recommendation: I liked the study's original idea, but I am not totally convinced that the authors have demonstrated that the method can work properly. My background is in climate dynamics, statistical climatology, and machine learning, but I am not an expert in deep-time paleoclimate. Thus, I will not comment on those more specific questions and hope that other reviewers can evaluate those aspects more thoroughly.
I explain my main concern below. I recommend that the manuscript be revised, perhaps not major revisions, but I would like to see the revised manuscript.
Main point:
1) The video frame interpolating algorithm FILM has been used as pre-trained without any further fine-tuning with paleoclimate data. Thus, it solely relies on ‘video dynamics’ that can be found, I assume, in usual video clips and films. These video dynamics are most likely dominated by ‘advection’ of visual features: static or moving objects, gradual changes in colour, shades, perspectives, etc.
The authors validate the application of FILM in a paleoclimate setting by looking at (1) spatially resolved palaeogeography and (2) globally averaged temperatures and run-off. This is where my concern arises. The spatially resolved palaeogeography does resemble a ‘video clip’. The movement of continental and ocean plates is indeed an ‘advection’ feature, and therefore we can expect that a video frame interpolating algorithm can cope with the interpolation in time of palaeogeography, especially at large continental scales. However, I am not convinced it also works for spatially resolved temperatures. The temperature field is not advected; it does not ‘move’ like an object in space. Many factors, including land-sea distribution and external forcing, including latitude, CO2 concentrations, water vapour concentrations, precipitation, etc., control it. It is a large leap of faith to consider the evolution of the temperature field as a set of moving objects in space. This might be more correct at very short time scales, say hours or days, for which temperature may be more strongly controlled by advection of air masses, but this is not true for longer time scales.
The question would be then how to validate the interpolated temperature field. The authors present a validation of the global mean temperature, but as they argue in the manuscript, global mean temperature at these long time scales is strongly controlled by greenhouse gas concentration, and thus, any simple interpolation algorithm would probably achieve satisfactory results without the requirement of a skilful spatially resolved reconstruction. If only the global mean temperature were important, this would be a more or less acceptable validation, but then the FILM setup would not be necessary - a simple time interpolation of the global mean temperature would be sufficient.
I know that a spatially resolved validation is not easy, but why should the FILM output be trusted without that step? One possibility is to use a ‘perfect model’ approach using GCM simulations. Here, the ground truth is assumed to be a long GCM simulation, which can then be subsampled, interpolated and compared with the ‘truth’. I know there are no GCM simulations over such long periods, but some cover the Holocene. Here, the FILM setup can be tested. Alternatively, simulations with intermediate complexity models, like CLIMBER or similar, over longer periods (~100k years) might be used for this test.
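The ‘perfect model’ test described above (subsample, interpolate, compare with the ‘truth’) can be sketched roughly as follows. Several things here are assumptions made for illustration only: the "truth" is a synthetic series rather than a GCM simulation, each frame is a flat list of grid values, linear interpolation stands in for FILM, and the names `rmse` and `perfect_model_test` are hypothetical.

```python
from math import sin, sqrt


def rmse(a, b):
    """Root-mean-square error between two equally long lists of grid values."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def perfect_model_test(truth, step):
    """Keep every `step`-th frame of `truth`, rebuild the withheld frames by
    linear interpolation, and return {frame index: RMSE against the truth}."""
    errors = {}
    # Trailing frames beyond the last complete pair of kept frames are ignored.
    for start in range(0, len(truth) - step, step):
        lo, hi = truth[start], truth[start + step]
        for k in range(1, step):
            w = k / step  # weight of the later kept frame
            estimate = [(1.0 - w) * x + w * y for x, y in zip(lo, hi)]
            errors[start + k] = rmse(estimate, truth[start + k])
    return errors


# Synthetic "truth": 17 frames of 8 grid values that vary nonlinearly in time.
truth = [[sin(0.3 * t + 0.5 * i) for i in range(8)] for t in range(17)]

errors_fine = perfect_model_test(truth, step=2)    # keep every 2nd frame
errors_coarse = perfect_model_test(truth, step=8)  # keep every 8th frame
```

Because the error is computed per withheld frame on the full grid, the same loop gives a spatially resolved score; swapping the linear blend for another interpolator would let the two be compared on identical withheld frames.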
Particular points
2) ‘We then apply the method to upscale the paleoclimate data structure in the SCION climate-biogeochemical model and demonstrate that upscaled outputs for global distributions of surface temperature and runoff follow a logical progression between the original keyframes.’
This sentence is a bit convoluted and not easy to understand. Does it mean that the interpolation produces reasonable or plausible fields?
3) ‘This coarse time resolution likely has impacted the accuracy of the biogeochemical model results’
has likely
4) ‘Deep Learning models are complex neural networks with typically >106 parameters’
I guess 106 is a typo. Do you mean 100? Why precisely 106?
5) ‘The model emulates the learning process of humans by updating the parameters in the neural networks to produce optimal predictions’
I would not use the term predictions, as the application in this study is not prediction but interpolation. Also, neural networks can generally be used in many other non-predictive settings.
6) ‘This convolutional operation yields a higher-level representation of the original images’
The word higher level will not be clear to many readers if they are not experts in machine learning. Can you be more specific?
7) Table 1. The caption is too cryptic and should not refer the reader to search the text for an explanation of the table's contents. At the very least, it should point to a specific position in the text.
8) The only indication of the time span covered in this paper is the title (Phanerozoic). I think it would be helpful to include a more specific time frame in the abstract and the introduction.
Citation: https://doi.org/10.5194/gmd-2023-230-RC2
AC2: 'Reply on RC2', Dongyu Zheng, 26 Apr 2024
We thank the reviewers for their insightful comments and have addressed all the points raised. We have conducted a numerical evaluation using a high-resolution temperature dataset based on GCM simulations. This evaluation further demonstrates that the Deep Learning method can indeed produce reliable interpolations of climatic variables.