the Creative Commons Attribution 4.0 License.
Spy4Cast v1.0: a Python Tool for statistical seasonal forecast based on Maximum Covariance Analysis
Abstract. Maximum Covariance Analysis (MCA) is a well-known discriminant analysis technique used for finding coupled patterns in climate data. It is a powerful tool that has been applied to the study of teleconnections, reducing all possible relationships between a predictor and a predictand field to a few modes of covariability. MCA can be used to provide statistical forecasts, which can complement predictions performed with dynamical models. Nevertheless, the power of this tool relies on applying it in a productive and easy way to the huge climate data-sets available. Spy4Cast is an open-source interface (API), implemented in Python, that contains an MCA-based statistical model to be used for seasonal forecasting. Its main goal is to increase automation and productivity. Spy4Cast enables large data-set manipulation and also performs basic tasks like region slicing and plotting. The methodology consists of an initial configuration (data-set reading and slicing) and a preprocessing step that prepares the data to be fed into MCA, cross-validation and validation. It acts upon any kind of predictor and predictand variables, which can come from any source of data. Spy4Cast analyses the model sensitivity to particular years, including a diagnosis of the stability of the obtained modes to particular outliers. Finally, the spatial and temporal skill, in terms of the anomaly correlation coefficient, is obtained and a hindcast is provided. The software is easily accessible through a Python package and is well documented for beginners and experienced programmers. Only a small number of third-party libraries are needed, all of them widely used in data science and physics.
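As a rough illustration of the technique the abstract describes, MCA can be sketched as a singular value decomposition of the cross-covariance matrix between the predictor and predictand anomaly fields. The snippet below is a minimal NumPy sketch under that assumption. It does not use the Spy4Cast API, and the function name `mca_sketch` and its arguments are hypothetical names chosen for illustration.

```python
import numpy as np


def mca_sketch(predictor, predictand, n_modes=3):
    """Illustrative MCA: SVD of the cross-covariance matrix (hypothetical helper).

    predictor:  array of shape (n_time, n_space_y)
    predictand: array of shape (n_time, n_space_z)
    """
    # Remove the time mean at every grid point to obtain anomalies.
    y = predictor - predictor.mean(axis=0)
    z = predictand - predictand.mean(axis=0)
    n_time = y.shape[0]

    # Cross-covariance matrix between the two fields: (n_space_y, n_space_z).
    cov = y.T @ z / (n_time - 1)

    # Left/right singular vectors are the coupled spatial patterns.
    u, s, vt = np.linalg.svd(cov, full_matrices=False)

    # Squared covariance fraction of each mode, computed from all modes
    # before truncation.
    scf = s**2 / np.sum(s**2)

    u, v = u[:, :n_modes], vt[:n_modes].T

    # Expansion coefficients: projection of each field onto its patterns.
    us = y @ u
    vs = z @ v
    return u, v, us, vs, scf[:n_modes]
```

In this sketch each field is assumed to be already flattened to a (time, space) matrix with missing values removed; a leading mode with a squared covariance fraction close to 1 indicates that a single coupled pattern dominates the covariability.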
Status: final response (author comments only)
-
RC1: 'Comment on gmd-2024-164', Anonymous Referee #1, 29 Jan 2025
The comment was uploaded in the form of a supplement: https://gmd.copernicus.org/preprints/gmd-2024-164/gmd-2024-164-RC1-supplement.pdf
-
AC1: 'Reply on RC1', Pablo Duran-Fonseca, 01 Feb 2025
Thank you for your comments.
We will address them and get back to you as soon as possible with the corrected manuscript.
Kind regards,
Pablo Duran
Citation: https://doi.org/10.5194/gmd-2024-164-AC1
-
AC3: 'Reply on RC1', Pablo Duran-Fonseca, 26 Mar 2025
-
RC2: 'Comment on gmd-2024-164', Anonymous Referee #2, 15 Mar 2025
Synopsis
As the authors note (L34-45), MCA is a well-developed and widely used method in geoclimate studies. A key strength of this work is that it provides accessible software for researchers to implement MCA in its full capacity. The model is useful for certain climate applications, such as the SST prediction demonstrated in the paper. However, its broader applicability appears limited—for instance, the current software only supports monthly data (L149). While I appreciate the effort in developing this tool for the community, I am uncertain about its potential for widespread adoption. Ultimately, the decision on its suitability for publication rests with the editor.
Major comments
Given the current scope of the software, I strongly recommend providing thorough documentation and a user manual. Additional examples of its application would further enhance its value to the community.
Furthermore, as a comprehensive tool for statistical seasonal forecasting, the software would benefit from referencing influential prior work that incorporates statistical analysis and cross-validation. This would provide a more complete methodological context. A quick Google search yields references such as 'Cross-Validation in Statistical Climate Forecast Models' by Barnston and van den Dool (1987). I would encourage the authors to conduct a more thorough search to identify additional relevant studies.
Since this software provides a ready-to-use implementation of an existing method rather than introducing a new approach, what are its future prospects? Do you have plans to expand its functionality or broaden its applicability?
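The cross-validation mentioned in these comments can be illustrated with a leave-one-year-out hindcast: each year is withheld in turn, the model is fitted on the remaining years, and the withheld year is predicted. The snippet below is a minimal NumPy sketch under the assumption that the predictand is regressed on the predictor expansion coefficients of the leading MCA modes. It is not necessarily the procedure Spy4Cast implements, and all function names are hypothetical.

```python
import numpy as np


def mca_regression(y_train, z_train, n_modes):
    """Fit MCA on training anomalies; return a matrix mapping y to predicted z."""
    cov = y_train.T @ z_train / (y_train.shape[0] - 1)
    u, _, _ = np.linalg.svd(cov, full_matrices=False)
    u = u[:, :n_modes]
    us = y_train @ u  # predictor expansion coefficients, (n_train, n_modes)
    # Least-squares regression of the predictand field on the coefficients.
    coef, *_ = np.linalg.lstsq(us, z_train, rcond=None)
    return u @ coef  # shape (n_space_y, n_space_z)


def leave_one_out_hindcast(y, z, n_modes=1):
    """Predict each time step of z from y using a model fitted without it."""
    n_time = y.shape[0]
    z_hat = np.empty_like(z, dtype=float)
    for i in range(n_time):
        train = np.arange(n_time) != i
        # Climatology comes from the training years only, so the withheld
        # year does not leak into the fit.
        y_mean = y[train].mean(axis=0)
        z_mean = z[train].mean(axis=0)
        operator = mca_regression(y[train] - y_mean, z[train] - z_mean, n_modes)
        z_hat[i] = (y[i] - y_mean) @ operator + z_mean
    return z_hat


def pearson(a, b, axis):
    """Pearson correlation between a and b along the given axis."""
    a = a - a.mean(axis=axis, keepdims=True)
    b = b - b.mean(axis=axis, keepdims=True)
    return (a * b).sum(axis=axis) / np.sqrt(
        (a**2).sum(axis=axis) * (b**2).sum(axis=axis)
    )
```

With `z_hat = leave_one_out_hindcast(y, z)`, `pearson(z_hat, z, axis=0)` gives a temporal skill value per grid point, and `pearson(z_hat - z.mean(axis=0), z - z.mean(axis=0), axis=1)` gives a spatial anomaly correlation per year.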
Detail comments:
Line 1: Did you mean a dimension reduction technique?
Line 10: How do you test model sensitivity to particular years? Did you say this in the manuscript?
Line 11: How would you test modes to particular outliers? Did you mention this? Or is this implied by testing different batches of years?
Line 14: Is the software fully documented? As it stands, it would benefit from additional work to develop a more comprehensive manual.
Line 20-25: SST and SLP patterns which are highly ‘correlated’
You need to clarify an important distinction. Your description of the coupling between SST and SLP refers to only one phase of the Southern Oscillation or ENSO. However, it applies to both El Niño and La Niña, not just El Niño. The current wording suggests that only El Niño is being considered.
Line 28: a baseline for seasonal forecasts? Why?
Line 30-33: Machine learning methods are rapidly evolving, and you should reference more recent studies to support this statement. For example, Toride et al. (2024, https://arxiv.org/abs/2404.15419) demonstrate the use of neural networks to identify physical relationships and find predictability.
Line 41-42: Instead of using the phrase 'a new paradigm,' it would be more accurate to reference earlier studies identifying the connection between ENSO and other tropical basins. For example, the connection between ENSO and the Indian Ocean Dipole (IOD) was first identified by Saji et al. (1999): A Dipole Mode in the Tropical Indian Ocean (Nature). This study demonstrated how the IOD can influence ENSO dynamics and has been foundational in the field.
Line 46-47: reliable? The citation at the end of the sentence is incomplete.
Line 56: ‘… not designed to assess stationarity’ seems to contradict Line 250: ‘Spy4Cast is able to perform a validation methodology to look for non-stationary relations.’ Line 260 as well.
Line 59: fix the reference
Line 69: unit tests?
Line 76: Section 4 is an example of using the Atlantic to predict Pacific SST. However, if you have the Sahelian rainfall example, I think it would be useful to include it in your manual/documentation and showcase different settings and functionalities of your work.
Line 149: only being able to take monthly data seems a limited capability to me.
Line 190: How do you determine the sample size? Monthly data are likely highly correlated, i.e., each month is not an independent point, you need to use the effective sample size when you do statistical analysis.
Line 195, 197 and more: What table or listing are you referring to? In general, when referencing your previous paper on which this software is built, I suggest specifying the relevant tables, listings, or figures. This would make it easier for users to trace the code development and better understand the overall concept.
Line 223: It says 2010 in the listing.
Line 224: You mention non-stationarity multiple times in the manuscript, but it is unclear how you determine it. Since there are various methods to assess non-stationarity, I recommend specifying the approach you used to ensure clarity.
Line 225: What does ‘a hot spot’ in climate variability studies mean?
Line 241: I think ‘can be represented’ is a more accurate phrasing.
Line 246: Us should be in math form?
Line 249: The rest of modes… this statement is misleading and not accurate. Fig 5 seems to say 68% instead of 76%?
Line 254: can you say what years? 94 and 91 for example?
Line 260: I am not sure you have explained how to use your software to determine stationarity
Line 265: This is not a ‘new’ approach but rather a ready-to-use software implementation of a well-established method for seasonal forecasting.
Line 270: What is OFF project again?
Figure 3 caption: You need to label what year this is. Is it 1997 based on Listing 4?
Listing 6: Do you need to ‘import Preprocess’ first in this script?
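The effective-sample-size concern raised for Line 190 can be illustrated with a commonly used lag-1 autocorrelation adjustment for the correlation between two series, N_eff = N (1 - r1 r2) / (1 + r1 r2), where r1 and r2 are the lag-1 autocorrelations of each series. The snippet below is a minimal sketch of that adjustment only. It is not taken from Spy4Cast or the manuscript, and the function names are hypothetical.

```python
import numpy as np


def lag1_autocorr(x):
    """Lag-1 autocorrelation of a 1-D series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.sum(x[:-1] * x[1:]) / np.sum(x**2))


def effective_sample_size(a, b):
    """Effective sample size for correlating two autocorrelated series."""
    n = len(a)
    r1, r2 = lag1_autocorr(a), lag1_autocorr(b)
    n_eff = n * (1 - r1 * r2) / (1 + r1 * r2)
    # Cap at n: a negative r1 * r2 would otherwise report more independent
    # samples than there are data points.
    return float(min(n, n_eff))
```

The resulting value would replace the nominal sample size when deriving degrees of freedom for a significance test, so strongly persistent monthly series are credited with far fewer independent samples than data points.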
Citation: https://doi.org/10.5194/gmd-2024-164-RC2
-
AC2: 'Reply on RC2', Pablo Duran-Fonseca, 17 Mar 2025
Thank you for your comments.
We will address them appropriately and respond as soon as possible. We will integrate your observations with the ones provided by Anonymous Referee #1.
Kind regards,
Pablo Duran
Citation: https://doi.org/10.5194/gmd-2024-164-AC2
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
224 | 49 | 13 | 286 | 13 | 12 |