the Creative Commons Attribution 4.0 License.
Surrogate model-based precipitation tuning for CAM5
Xianwei Wu
Liang Hu
Lanning Wang
Haitian Lu
Juepeng Zheng
Abstract. Uncertainty in physical parameters is a major cause of poor precipitation simulation in Earth system models (ESMs), especially over the tropical and Pacific regions. Although tuning the related parameters can reduce this uncertainty, repeated ESM runs incur large computational costs. Surrogate models can reduce these costs in many tuning scenarios, but building an effective surrogate for the Community Atmosphere Model (CAM), a complex integration of many processes, remains an unresolved challenge because of its strongly nonlinear behavior. In this study, we present a surrogate model-based parameter tuning framework for the CAM and apply it to improve CAM5 precipitation performance. We propose a multilevel surrogate model-based optimization method. First, a global-level surrogate model is constructed with a gradient boosting regression tree (GBRT), which cross-validation experiments show to perform better than other methods. The candidate point approach (CAND) is then applied to balance exploration and exploitation and to obtain better values for establishing a local-level surrogate model. The local-level surrogate model is constructed from a much smaller number of chosen points. We design a trust region approach to adjust the sampling region during tuning. The proposed method converges faster and reaches higher accuracy during the tuning process. We also attempt a region-based optimization method to improve the CAM simulation results over some areas with large errors. The results show that the surrogate model-based optimization method can significantly improve the simulation performance of the CAM. The average improvement over the selected regions is 19 %.
To integrate the optimization results of these regions, we design a nonuniform parameter parameterization scheme and combine the parameters using a parameter smoothing scheme; the experimental results improve in four regions. These results demonstrate that the proposed method improves the precipitation simulation of the CAM.
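For readers unfamiliar with the candidate point approach, the selection step mentioned in the abstract can be sketched as follows. This is a minimal illustration of a commonly used weighted-score formulation, not the authors' implementation: the function name `cand_select`, the Gaussian perturbation of the best point, the min-max scaling, and all default values are assumptions made for this example.

```python
import math
import random


def cand_select(surrogate, best_x, evaluated, bounds, weight=0.5,
                n_candidates=200, sigma=0.1, seed=0):
    """Return the candidate with the lowest weighted surrogate/distance score.

    All names and defaults here are hypothetical and chosen for illustration.
    """
    rng = random.Random(seed)

    # Generate candidates by perturbing the current best point, clipped to bounds.
    candidates = []
    for _ in range(n_candidates):
        point = []
        for xi, (lo, hi) in zip(best_x, bounds):
            step = rng.gauss(0.0, sigma * (hi - lo))
            point.append(min(hi, max(lo, xi + step)))
        candidates.append(point)

    # Surrogate prediction and distance to the nearest evaluated point.
    s = [surrogate(c) for c in candidates]
    d = [min(math.dist(c, e) for e in evaluated) for c in candidates]
    s_min, s_max = min(s), max(s)
    d_min, d_max = min(d), max(d)

    def scaled(value, lo, hi):
        # Min-max scaling to [0, 1]; a constant criterion scores 1 everywhere.
        return 1.0 if hi == lo else (value - lo) / (hi - lo)

    scores = []
    for si, di in zip(s, d):
        v_s = scaled(si, s_min, s_max)        # low predicted value is good
        v_d = 1.0 - scaled(di, d_min, d_max)  # large distance is good
        scores.append(weight * v_s + (1.0 - weight) * v_d)

    return candidates[scores.index(min(scores))]
```

A weight near 1 favors candidates with a low predicted value (exploitation), while a weight near 0 favors candidates far from points that have already been evaluated (exploration).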
Status: open (until 11 Oct 2023)
RC1: 'Comment on gmd-2023-164', Anonymous Referee #1, 26 Sep 2023
This study proposes an online surrogate updating strategy for tuning parameters in climate models. By constructing a higher-accuracy surrogate model in a sub-domain of the parameter space, the method effectively reduces computational costs. The idea is particularly novel because it overcomes the low efficiency of the offline surrogate method, which cannot ensure optimization for the real model. Despite the significance of this work, the manuscript requires major revision to address the following issues.
Major issues:
- The manuscript structure, particularly the method section, needs to be reorganized to make it more compact. Several areas require clarification. For instance, Algorithm 1 calculates the RMSE, but its definition is found in section 3.2.2; it would be more appropriate to move the definition to section 2. Additionally, in Line 4 of Algorithm 2, it is unclear whether the new parameters are obtained using CAND. Furthermore, it is not explained why the local-level surrogate uses a Gaussian process. The manuscript should also describe the difference between the algorithm used in this work and ASMO. Typically, optimization algorithms require hundreds of steps to converge, but in this work only around 20 steps of local optimization are performed, so it is hard to say that the algorithm has converged. It appears that the ASMO method can achieve local optimization more quickly. The conclusion is not convincing. The description of CAND is difficult to follow, particularly the calculation of vs and vd, which lacks detail. The description of the cross-validation could be moved from the results section to the method section.
- The manuscript lacks a thorough analysis of the mechanisms by which the parameters affect precipitation on global and regional scales. While section 4 presents optimization results, it lacks organization and falls short of providing a detailed understanding of the underlying mechanisms. To strengthen the manuscript, a deeper analysis is recommended. Investigating the cause-effect relationships between parameters and precipitation patterns would yield physical insights that could improve the parameterization scheme.
- In equations 10-11, the numerator could be very large and the denominator very small. This implies that sigma could exceed 0.75 even when the fit is poor. If the fit is good, sigma would be close to 1 rather than merely greater than 0.75.
- The motivation for the nonuniform parameter parameterization scheme should be stated more clearly.
- Line 55, while previous methods involved running the climate model, it is important to note that this work also requires running the climate model in each iteration. However, the manuscript does not provide a direct comparison of the efficiency of this method with other approaches. To enhance the evaluation of the proposed method, it would be beneficial to include an assessment of the computational cost compared to existing methods. This evaluation can provide valuable insights into the efficiency and computational advantages of the proposed approach, strengthening the manuscript's contribution in terms of computational performance.
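The concern about equations 10-11 raised in the major issues can be illustrated with a small numerical sketch. It assumes that the quantity in those equations is the usual trust-region ratio of actual to predicted reduction; the manuscript's exact definition is not reproduced here and may differ. The function names and the tolerance of 0.25 are hypothetical, and only the 0.75 threshold comes from the comment.

```python
def reduction_ratio(f_old, f_new, m_old, m_new):
    """Actual reduction of the model error divided by the surrogate's predicted reduction.

    Assumed form of the ratio; f_* are real-model values, m_* are surrogate values.
    """
    predicted = m_old - m_new
    if predicted == 0.0:
        raise ValueError("surrogate predicts no reduction; ratio is undefined")
    return (f_old - f_new) / predicted


def passes_one_sided(ratio, threshold=0.75):
    # One-sided test: any ratio above the threshold counts as a good fit.
    return ratio > threshold


def passes_two_sided(ratio, tolerance=0.25):
    # Two-sided test (hypothetical alternative): the ratio must be close to 1.
    return abs(ratio - 1.0) <= tolerance


# The surrogate predicts a reduction of 0.25, but the real model improves by 2.0.
poor = reduction_ratio(5.0, 3.0, 5.0, 4.75)
# The surrogate predicts exactly the reduction that the real model achieves.
good = reduction_ratio(5.0, 4.0, 5.0, 4.0)

print(poor, passes_one_sided(poor), passes_two_sided(poor))  # 8.0 True False
print(good, passes_one_sided(good), passes_two_sided(good))  # 1.0 True True
```

In the first case the surrogate underpredicts the improvement by a factor of eight, yet the one-sided test still treats it as a good fit, which is the situation the comment describes.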
Minor issues:
- The title uses CAM5, but the text uses CAM. They should be consistent.
- Line 11: “selected points..” to “selected points.”
- Line 29: traditional tuning methods in climate modeling have certain limitations. However, they remain highly useful. The majority of climate models employ traditional tuning approaches due to their reliance on well-established physics knowledge. In fact, automatic tuning methods require a solid understanding of physics to enhance their efficiency.
- Line 35, The statement that "WRF physics process is simple" is not accurate. In fact, it is known to be complex and intricate.
- Line 37, The statement that "MVFSA may become infeasible for CAM tuning" may require further consideration. Fast simulated annealing, which is utilized in MVFSA, actually requires only one population to search for the next optimal parameters. The MVFSA requires thousands of steps to get a stable solution. But CAM requires a lot of computational cost for each optimization iteration. The authors should thoroughly discuss the challenges associated with MVFSA to provide a comprehensive understanding of its feasibility for CAM tuning.
- Line 51, When the optimization process reaches convergence, further iterations do not lead to any improvement. Similarly, once the optimization algorithm has obtained a local solution, additional iterations do not result in further enhancements. The effectiveness of the algorithm is also a determining factor in this regard.
- Line 58, It is confusing that ‘the mathematical expression is complex and time-consuming’. Could you explain it?
- Line 59. Revise the sentence “Wang et al. … ; a SCM-SMA hydrologic model”
- Line 85, the authors should carefully analyze the challenges of using ASMO in an atmospheric model. The method has been used successfully in WRF and CLM. What is the real challenge for an atmospheric model?
- Line 91, the preceding sentences discuss the tuning algorithms, whereas the sentence “The precipitation process …” talks about the metrics. It would be beneficial to separate these statements into individual paragraphs.
- Line 110, it is hard to say the nonlinearity and complexity of CAM5 are much higher than WRF.
- Section 2.1, give more details of CAM5, such as the horizontal resolution, the vertical levels, how long CAM5 is run, and whether the SST and sea ice are prescribed from a seasonal climatology.
- Line 138: define the six main regions, giving a table including the range of latitude and longitude.
- Line 143, why is ERA5 used to estimate precipitation rather than GPCP?
- Line 147, “Makes” to “makes”
- Line 163, is the “sampling method” Latin hypercube sampling? How many samples are used?
- Line 280, use the correct ref for GP.
- Line 308, it is confusing that the surrogate model is built as a quadratic function. Does it use GP?
- For Fig. 2, what is the y-axis? Is it the relative error? How is it calculated?
- Line 343, ‘lower’ to ‘lowest’.
- Section 4.1.3 should be merged into section 4.1.2.
- In figure 4, it should include the obs pattern, or the difference between opt/default and observation.
- Line 366, the sentence “Therefore, we need to further …” is confusing.
Citation: https://doi.org/10.5194/gmd-2023-164-RC1