A learning-based method for efficient large-scale sensitivity analysis and tuning of single column atmosphere model (SCAM)
Jiaxu Guo
Yidan Xu
Haohuan Fu
Wei Xue
Lanning Wang
Lin Gan
Xianwei Wu
Liang Hu
Gaochao Xu
Xilong Che
Abstract. The Single Column Atmospheric Model (SCAM) is an essential tool for analyzing and improving the physics schemes of the Community Atmosphere Model (CAM). Although it already reduces the computational cost substantially compared with the complete CAM, the exponentially growing parameter space makes a combined analysis or tuning of multiple parameters difficult. In this paper, we propose a hybrid framework that combines parallel execution with a learning-based surrogate model to support large-scale sensitivity analysis (SA) and tuning of combinations of multiple parameters. We start with a workflow (with modifications to the original SCAM) that supports the execution and assembly of a large number of sampling, sensitivity analysis, and tuning tasks. By reusing the 3,840 instances generated by varying 11 parameters, we train a neural network (NN) based surrogate model that achieves both accuracy and efficiency (reducing the computational cost by several orders of magnitude). The improved balance between cost and accuracy enables us to integrate NN-based grid search into traditional optimization methods, achieving better optimization results with fewer compute cycles. Using this hybrid framework, we explore the joint sensitivity of multi-parameter combinations across multiple cases using sets of three parameters, identify the most sensitive three-parameter combination out of the eleven parameters, and perform a tuning process that reduces the precipitation error by 5 % to 15 % in different cases.
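To illustrate the surrogate-plus-grid-search idea sketched in the abstract, here is a minimal Python example (using scikit-learn). The toy error function, network size, and unit parameter ranges are placeholder assumptions for illustration, not the authors' actual setup:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import MinMaxScaler

# Stand-ins for the paper's setup: 3,840 sampled 11-parameter vectors (X)
# and a scalar error metric per SCAM run (y). Here y is a synthetic bowl.
rng = np.random.default_rng(0)
X = rng.uniform(size=(3840, 11))
y = ((X - 0.3) ** 2).sum(axis=1)

# Train a small MLP surrogate on the (parameters -> error) mapping.
scaler = MinMaxScaler().fit(X)
model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000,
                     random_state=0).fit(scaler.transform(X), y)

# Dense grid search on the cheap surrogate over 3 of the 11 parameters,
# holding the others at mid-range -- far cheaper than running SCAM itself.
grid_1d = np.linspace(0.0, 1.0, 21)
g0, g1, g2 = np.meshgrid(grid_1d, grid_1d, grid_1d, indexing="ij")
cand = np.full((g0.size, 11), 0.5)
cand[:, 0], cand[:, 1], cand[:, 2] = g0.ravel(), g1.ravel(), g2.ravel()
best = cand[np.argmin(model.predict(scaler.transform(cand)))]
print("surrogate-optimal values for the 3 searched parameters:", best[:3])
```

The surrogate minimum found this way would then be verified with actual SCAM runs, since the surrogate is only an approximation.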
Status: final response (author comments only)
- RC1: 'Comment on gmd-2022-264', Anonymous Referee #1, 14 Jan 2023
  - AC1: 'Reply on RC1', Jiaxu Guo, 18 Apr 2023
- RC2: 'Comment on gmd-2022-264', Anonymous Referee #2, 17 Mar 2023
Guo et al. propose a machine-learning surrogate model to accelerate parameter sensitivity analysis (SA) and tuning for SCAM. Considering the high computational cost of real models, especially global models, machine-learning surrogates can tackle this challenge, but this requires high accuracy from the surrogate, especially in the small neighborhood around the local optimum; that accuracy depends on the response relationships between the different parameters and metrics, the number of samples, and the deep-learning method. In this paper, although the authors evaluate the accuracy of the NN model in terms of precipitation, inconsistencies between the ML surrogate and the real model probably exist, and these are not highlighted in the paper. The authors argue that, because of the high computational cost of the GCM, SCAM can serve as an alternative model for parameter SA and tuning. In reality, the optimal parameters tuned in SCAM may not be suitable for the GCM, owing to the global domain and the more complex large-scale circulation. Throughout the manuscript, the organization and writing are unclear and should be better structured, and there are a very large number of language errors; the English writing should be greatly improved. Therefore, considering the motivation, the logic, and the writing quality, I would like to reject this paper.
Other major issues:
- The authors tune the parameters in SCAM separately for each site and obtain different sensitive parameters and different optimal values. It is difficult to transfer this information to a GCM. If the authors could perform multi-objective tuning across these sites with the same parameters, it would be helpful for global model tuning, because these SCAM sites indeed represent different regimes.
- The workflow should include a "metrics" component, because it is very important for tuning. Whether for SCAM or a GCM, the tuning metric is typically a cost function between model simulations and observations, and different designs can affect the optimization. In terms of metrics, one could consider 1) different statistical errors between simulation and observation, such as RMSE or a performance score as in Yang et al. (2013), and 2) whether to use one objective or multiple objectives, and how to handle multiple objectives.
- Line 35: The statement that the Morris SA method cannot capture interaction sensitivity could be wrong. Actually, the standard deviation of the MOAT elementary effects can represent the interaction of one parameter with the others (Morris, 1991); see the sketch after this list.
- Line 45: As part of the introduction, the authors should explain the challenges of the SA methods: why Morris and Sobol were chosen, the computational cost issue, and the problems of machine-learning surrogates. If there are previous works, what is your contribution?
- Line 55: The authors should do comprehensive literature research. Even for GCMs, there is a large body of work on tuning, such as Yang et al. (2013) and Zou et al. (2014). In addition, NN surrogate models have been used for tuning as well, but the authors do not introduce the previous work and the remaining challenges. The introduction section should be clearer.
- Line 75: Actually, there are existing SA and tuning workflows used in climate models, such as PSUADE and DAKOTA, but the authors do not compare their workflow with these packages. It is not new for the community.
- Line 88: The authors do not explain the 30 % improvement in tuning error and computational cost. Where do these numbers come from?
- Table 2 is wrong. Each IOP file includes many variables, not just these four. Therefore, the statement that you choose precipitation is wrong.
- Line 120: This issue is unlikely to be a significant challenge; some simple scripts can collect the data.
- The statement about Morris is wrong; see the third major issue above.
- In Section 3.1: sampling is not the same as SA, yet in this section the authors introduce the SA methods. You should consider changing the structure or the title.
- Line 177: It would be better to compare the NN with other surrogate models, such as XGBoost or ResNet.
- Line 184: The 768 samples seem insufficient for training an NN. Did the authors evaluate the performance of the NN? In Figure 4, how is the accuracy defined?
- Line 185: Did the authors perform hyper-parameter tuning for the NN?
- Line 195: Owing to the uncertainties of each method, ensembling cannot guarantee a reduction of the error.
- Equation (1): The left-hand side of this equation is not the number of processes; it should be the number of simulations. The number of simulations depends on the sampling method; for Morris, such a number of samples may not be required. The description is unclear.
- A SCAM simulation usually requires 10-20 minutes; why do you require more than one hour?
- It is confusing that you reuse the samples from SA to train the surrogate model for tuning, yet in Section 3.3 you mention that the surrogate model can also be used for SA.
- Line 233: How do you reach this conclusion? It is not convincing.
- Section 3.4 is very confusing and difficult to follow. You should consider re-organizing this section.
- Line 270: How many samples do you have for the correlation? Did you perform a p-value test?
- Figure 7: It is difficult to evaluate the tuning performance from Figure 7. It would be better to use metrics such as those of Yang et al. (2013).
- Figures 4 and 5 present results but appear in Section 3 (methods). This could be reorganized.
- Line 313: pz2 (c0_ocn) should have a high influence on the ocean case, but in Figure 8 it does not have a strong effect on PRECT at TWP. Could you explain the reason?
- Line 317: The explanation for the difference between ARM95 and ARM97 is not convincing.
- Line 345: Are 16 iterations enough for convergence? Why not use a general optimization method, such as the PSO or GA mentioned in the introduction?
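To make the Morris point above concrete, here is a minimal sketch using the SALib package; the three-parameter toy model, names, and bounds are placeholders, not taken from the manuscript. The sigma output is exactly the interaction/nonlinearity indicator that Morris (1991) describes:

```python
import numpy as np
from SALib.sample.morris import sample as morris_sample
from SALib.analyze.morris import analyze as morris_analyze

# Placeholder 3-parameter problem definition.
problem = {
    "num_vars": 3,
    "names": ["p1", "p2", "p3"],
    "bounds": [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]],
}

# MOAT trajectories: each trajectory perturbs one parameter at a time.
X = morris_sample(problem, N=100, num_levels=4)
# Toy model with a deliberate p1-p2 interaction term.
Y = X[:, 0] + X[:, 1] + 5.0 * X[:, 0] * X[:, 1]

res = morris_analyze(problem, X, Y, num_levels=4)
# mu_star: mean |elementary effect| (overall influence).
# sigma: std of elementary effects -- large sigma relative to mu_star
# signals interactions with other parameters and/or nonlinearity.
for name, mu, sg in zip(res["names"], res["mu_star"], res["sigma"]):
    print(f"{name}: mu* = {mu:.3f}, sigma = {sg:.3f}")
```

Running this, p1 and p2 show a clearly larger sigma than p3, which is how the MOAT standard deviation flags interacting parameters without a full Sobol analysis.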
Minor issues:
- In Fig. 1, should there be an arrow pointing from "SA methods" to "testing of combinations"?
- Caption of Figure 1: The sentence "SCAM launcher, the data collector and the jobs therein represent the batch execution of the SCAM algorithm" should be rephrased. What are the "jobs" and the "batch execution"? It is not clear.
- Line 25: Please explain "ne30".
- Line 38: The reference for the Sobol method is wrong; please cite the original paper.
- Line 43: QMC and LHS are sampling methods, not SA methods.
- In Table 3, how did you select these parameters, and how did you define the range of each individual parameter?
- Line 94: There should be a reference for SCAM5.
- Line 102: All sites belong to ARM.
- Line 108: Change "in the code" to "in the model".
- Line 116: Change "is an important issue" to "are important issues".
- Line 130: This is only suitable for SCAM; for a GCM, it is impossible.
- Line 165: Please explain "second-order sensitivity" (the standard definition is sketched after this list).
- Table 4: The reference for Sobol is wrong.
- Line 174: Please consider the correct position of this sentence.
- Line 193: The phrase "not a direct evaluation" is unclear.
- Line 195: Add "have" before "its".
- Line 215: How do you arrive at the number of several thousand?
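For reference, the "second-order sensitivity" queried at Line 165 has a standard definition in the Sobol framework (this is the textbook formulation, not quoted from the manuscript):

```latex
% First-order index: fraction of Var(Y) explained by parameter X_i alone
S_i = \frac{\operatorname{Var}\left(\mathbb{E}[Y \mid X_i]\right)}{\operatorname{Var}(Y)}

% Second-order index: additional variance explained jointly by the pair
% (X_i, X_j), beyond their individual first-order contributions
S_{ij} = \frac{\operatorname{Var}\left(\mathbb{E}[Y \mid X_i, X_j]\right)
             - \operatorname{Var}\left(\mathbb{E}[Y \mid X_i]\right)
             - \operatorname{Var}\left(\mathbb{E}[Y \mid X_j]\right)}
             {\operatorname{Var}(Y)}
```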
Citation: https://doi.org/10.5194/gmd-2022-264-RC2
  - AC2: 'Reply on RC2', Jiaxu Guo, 18 Apr 2023
- AC3: 'Final author comment on gmd-2022-264', Jiaxu Guo, 18 Apr 2023
Dear Editor and Referees,
Thank you very much for your valuable time and constructive comments! We have carefully replied to all the comments and organized our responses into documents. These documents have been uploaded as attachments separately to the post of each referee.
Best regards,
Jiaxu Guo on behalf of all authors
Citation: https://doi.org/10.5194/gmd-2022-264-AC3
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote
---|---|---|---|---|---
484 | 76 | 16 | 576 | 4 | 3