Snowfall prediction is important in winter and early spring because snowy conditions cause enormous economic damage. However, few previous studies have dealt with snow prediction, especially using land surface models (LSMs). Numerical weather prediction models directly interpret the snowfall events, whereas LSMs evaluate the snow cover, snow albedo, and snow depth through interaction with atmospheric conditions. Most LSMs include parameters based on empirical relations, resulting in uncertainties in model solutions. When the initially developed empirical parameters are local or inadequate, we need to optimize the parameter sets for a certain region. In this study, we seek the optimal parameter values in the snow-related processes – snow cover, snow albedo, and snow depth – of the Noah LSM, for South Korea, using the micro-genetic algorithm together with in situ surface observations and remotely sensed satellite data. Snow data from observation stations representing five land cover types – deciduous broadleaf forest, mixed forest, woody savanna, cropland, and urban and built-up lands – are used to optimize five snow-related parameters that calculate the fractional snow cover, maximum snow albedo of fresh snow, and fresh snow density associated with the snow depth. Another parameter, reflecting the dependence of fractional snow cover on the land cover types, is also optimized. Optimization of these six snow-related parameters has improved the root mean squared errors by 17.0 %, 6.2 %, and 3.3 % in snow depth, snow albedo, and fractional snow cover, respectively. In terms of the mean bias, the underestimation of snow depth and the overestimation of snow albedo have been alleviated by about 44.2 % and 31.0 %, respectively, through optimization of the parameters used to calculate fresh snow.

Land surface models (LSMs) act as the lower boundary conditions for regional numerical weather prediction (NWP) and climate models, to which they provide the surface fluxes

Intense snowfall events often occur on the Korean Peninsula during winter and early spring. In South Korea (SK), heavy snowfalls are the third-most serious source of natural disasters, following typhoons and heavy rainfalls

Uncertainties in parameterized physical processes have been observed and quantified in various numerical models
(e.g.,

Most snow processes in the LSMs are parameterized based on observations in specific local regions; hence, they may not adequately represent the conditions in SK and may be a source of uncertainty for numerical snow prediction over SK. We aim to obtain the optimal parameter values of the snow-related processes – snow cover, snow albedo, and snow depth – in an LSM using the micro-GA, leading to better LSM performance over SK. This study represents the first attempt to develop a coupled system of the micro-GA and Noah LSM for parameter estimation of the snow processes. Section 2 describes the methodology, including the snow processes of the LSM and the micro-GA optimization tool. Section 3 explains the experiment design. The results and the conclusions and outlook are provided in Sects. 4 and 5, respectively.

In this study, we employ the Noah LSM

The current Noah LSM (version 3.4.1) uses a single-layer representation of the snow processes considering a bulk snow–soil canopy layer

The FSC (

Figure

Responses of the snow variables to the variations in the snow-related parameters for given ranges.

The SWE threshold,
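As a rough illustration of how a distribution shape parameter and an SWE threshold can jointly control the FSC, the sketch below uses a depletion-curve form written from memory of the Noah source code. The exact equation is not reproduced in the text here, so the functional form, the function name, and all numerical values should be read as assumptions rather than as the formulation used in this study.

```python
import math

def fractional_snow_cover(swe, swe_threshold, shape):
    """Fractional snow cover (0-1) from an assumed depletion-curve form.

    swe           : snow water equivalent (same unit as swe_threshold)
    swe_threshold : SWE at which the grid box is taken as fully covered
    shape         : distribution shape parameter (dimensionless)
    """
    if swe <= 0.0:
        return 0.0
    if swe >= swe_threshold:
        return 1.0
    r = swe / swe_threshold
    return 1.0 - (math.exp(-shape * r) - r * math.exp(-shape))

# Hypothetical values, not the default or optimized values of this study.
for shape in (2.0, 4.0):
    print(shape, round(fractional_snow_cover(0.02, 0.04, shape), 3))
```

With this form, a larger shape parameter or a smaller SWE threshold yields a larger FSC for the same SWE, which is the direction of sensitivity discussed for these two parameters.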

SA is defined as the fraction of incident radiation reflected by the snowpack and is crucial for evaluating surface-energy balance, particularly during snowmelt

Surface albedo generally increases over snow, but it may react differently over a shallow snowpack: when accumulation starts by snowfall or diminution occurs by snowmelt, patchy areas can be generated and corresponding model grid boxes may not be covered by snow

Spatial variation in SA is taken into consideration in

In the Noah LSM, SD is evaluated as the ratio of SWE (
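The ratio relation can be made concrete with a short worked example. The sketch assumes that the denominator is the bulk snow density expressed as a fraction of the density of water, so that SWE in metres of water gives SD in metres; the function name and the numbers are hypothetical.

```python
def snow_depth(swe_m, snow_density):
    """Snow depth (m) from SWE (m of water) and snow density.

    snow_density is assumed to be dimensionless (fraction of water density).
    """
    if snow_density <= 0.0:
        raise ValueError("snow density must be positive")
    return swe_m / snow_density

swe = 0.010  # 10 mm of water, hypothetical
for density in (0.10, 0.05):
    print(density, snow_depth(swe, density))
```

For a fixed SWE, halving the density doubles the depth, which is why a lower fresh snow density acts against an underestimation of SD.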

The genetic algorithm (GA) is a global optimization algorithm developed by John Holland in the 1970s (e.g.,

The micro-GA is an advanced and simplified GA with a smaller population size in each generation, thus requiring less computational time than the conventional GA

Figure

Although the micro-GA is computationally more efficient than the conventional GA, it still demands substantial computing time because the model is executed serially for each individual. Therefore, we have developed a parallel processing system in the micro-GA–Noah LSM coupled system. Instead of sequentially running the model for each individual and calculating the fitness within a generation, we run the model simultaneously for all individuals in the population to obtain the fitness and select the best individual when all tasks are finished (see the dashed box in Fig.

A flowchart of parameter optimization from the micro-GA–Noah LSM coupled system. The dashed box depicts the parallel system for the Noah LSM, running for each individual.
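The parallel evaluation step can be sketched as follows. This is a minimal illustration, not the implementation used in the coupled system: `run_model` is a hypothetical stand-in for one Noah LSM run that returns a fitness value, and a thread pool from the Python standard library is used only for brevity (real model runs would more plausibly be dispatched as separate processes or batch jobs, which the text does not specify).

```python
from concurrent.futures import ThreadPoolExecutor

def run_model(params):
    """Hypothetical stand-in for one Noah LSM run; returns a fitness value."""
    a, b = params
    return -((a - 1.0) ** 2 + (b + 2.0) ** 2)

def evaluate_generation(population, max_workers=4):
    """Evaluate all individuals concurrently, then pick the best one."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves the input order, so fitness[i] belongs to population[i]
        fitness = list(pool.map(run_model, population))
    best_index = max(range(len(population)), key=fitness.__getitem__)
    return population[best_index], fitness

population = [(0.0, 0.0), (1.0, -2.0), (2.0, -1.0), (0.5, -2.5), (1.5, -1.5)]
best, fitness = evaluate_generation(population)
print(best, fitness)
```

The selection happens only after every task has returned, mirroring the "select the best individual when all tasks are finished" step; the wall-clock time of a generation then approaches that of the slowest single run rather than the sum of all runs.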

The fitness function is a performance index to evaluate how well potential solutions fit the objective. In the GA optimization, the fitness function should be carefully defined because it is used for all generations and individuals. The root mean squared error (RMSE) is a widely used indicator for evaluating the performance of a model (e.g.,

We then obtain the improvement ratio,

We finally define the normalized fitness function,
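The chain from RMSE to improvement ratio to a normalized fitness can be sketched as below. The exact definitions are not reproduced in the text here, so two forms are assumed for illustration: the improvement ratio is taken as the relative RMSE reduction with respect to the control run, and the normalized fitness as the mean of these ratios over the snow variables. All names and numbers are hypothetical.

```python
import math

def rmse(model, obs):
    """Root mean squared error between paired model and observed values."""
    if len(model) != len(obs) or not obs:
        raise ValueError("model and obs must be non-empty and equally long")
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))

def improvement_ratio(rmse_control, rmse_experiment):
    """Assumed form: relative RMSE reduction with respect to the control."""
    return (rmse_control - rmse_experiment) / rmse_control

def normalized_fitness(control, experiment, obs):
    """Assumed form: mean improvement ratio over all snow variables.

    Each argument maps a variable name (e.g. "SD") to a list of values.
    """
    ratios = [
        improvement_ratio(rmse(control[v], obs[v]), rmse(experiment[v], obs[v]))
        for v in obs
    ]
    return sum(ratios) / len(ratios)

# Hypothetical numbers, for illustration only.
obs = {"SD": [5.0, 10.0, 0.0], "SA": [0.6, 0.7, 0.5]}
control = {"SD": [3.0, 6.0, 0.0], "SA": [0.8, 0.8, 0.7]}
experiment = {"SD": [4.0, 8.0, 0.0], "SA": [0.7, 0.75, 0.6]}
print(normalized_fitness(control, experiment, obs))
```

Working with ratios rather than raw RMSEs keeps variables with different units and magnitudes (centimetres for SD, fractions for SA and FSC) on a comparable scale, so that no single variable dominates the fitness.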

The land surface processes were forced by six meteorological fields from ASOS (

The snow observations (i.e., SD, FSC, and SA) are used for the model verification and the fitness function calculation. For SD, the hourly model outputs are evaluated using the hourly ASOS data. To confirm the snow season, we have excluded the SD observations lower than 0.1 cm. For FSC and SA, we have no ASOS observations over SK; thus, we have used the MODIS/Terra Snow Cover Daily L3 Global 500 m SIN Grid radiance data

For the optimization experiment, we have selected some stations that represent different land covers in SK, aiming at having a representative combination of snow-related parameters over SK. We have defined a representative set of LCTs within a 2.5 km radius from the ASOS observations, excluding the water body. The LCTs have been taken from the MODIS (on board Terra and Aqua) Land Cover Type Yearly Climate Modeling Grid (CMG) Version 6

Five representative LCTs over SK, following the IGBP classification – DBF, MF, WS, CL, and UB. For each LCT, five selected stations are shown with the station name (abbreviation in parentheses), location in latitude (

We have designed the following two GA optimization experiments: (1) OPT_5 that optimizes five snow parameters (

Stations used for experiments

For the micro-GA optimization, we have prespecified the following input parameters: (1) the population size, i.e., the number of individuals in the collection; (2) the number of parameters to be used for optimization; (3) the number of chromosomes expressing an arbitrary solution; (4) the maximum number of generations to iterate the optimization; (5) the type of crossover operator that creates a new structure of chromosomes through the exchange of the chromosome; and (6) the elitism to decide whether the most suitable individual would be preserved for the next generation. The micro-GA–Noah LSM coupled system has been run repeatedly to find an optimal parameter combination within the specified generations.
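To show how the listed inputs interact, the following is a minimal micro-GA sketch on a toy problem. It is not the code used in this study: the binary encoding, binary tournament selection, uniform crossover, the 5 % convergence threshold for re-initialization, and all default values are assumptions chosen for illustration.

```python
import random

def decode(bits, lo, hi):
    """Map a list of 0/1 genes to a real value in [lo, hi]."""
    value = int("".join(map(str, bits)), 2)
    return lo + (hi - lo) * value / (2 ** len(bits) - 1)

def micro_ga(fitness, bounds, bits_per_param=8, pop_size=5,
             max_generations=200, seed=0):
    """Maximize fitness(params) over the given (lo, hi) bounds."""
    rng = random.Random(seed)
    n_bits = bits_per_param * len(bounds)

    def random_individual():
        return [rng.randint(0, 1) for _ in range(n_bits)]

    def params(ind):
        return [decode(ind[i * bits_per_param:(i + 1) * bits_per_param], lo, hi)
                for i, (lo, hi) in enumerate(bounds)]

    def score(ind):
        return fitness(params(ind))

    population = [random_individual() for _ in range(pop_size)]
    for _ in range(max_generations):
        elite = max(population, key=score)
        # Count genes that differ from the elite across the population.
        differing = sum(a != b for ind in population for a, b in zip(ind, elite))
        if differing < 0.05 * n_bits * (pop_size - 1):
            # Converged: keep the elite and re-initialize the other individuals.
            population = [elite] + [random_individual()
                                    for _ in range(pop_size - 1)]
            continue
        children = [elite]  # elitism: the best individual always survives
        while len(children) < pop_size:
            p1 = max(rng.sample(population, 2), key=score)  # binary tournament
            p2 = max(rng.sample(population, 2), key=score)
            # Uniform crossover; a micro-GA typically uses no mutation.
            children.append([a if rng.random() < 0.5 else b
                             for a, b in zip(p1, p2)])
        population = children
    best = max(population, key=score)
    return params(best), score(best)

def toy_fitness(p):
    """Hypothetical objective with its maximum at (0.3, 0.7)."""
    return -((p[0] - 0.3) ** 2 + (p[1] - 0.7) ** 2)

toy_bounds = [(0.0, 1.0), (0.0, 1.0)]
print(micro_ga(toy_fitness, toy_bounds))
```

Because the elite is carried over unchanged in both the crossover and the re-initialization branches, the best fitness never decreases from one generation to the next, and diversity is restored by restarts rather than by mutation.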

Table

The input parameters for micro-GA in experiments OPT_5 and OPT_W.

In this study, we have conducted the optimization experiments from 00:00 UTC 1 May 2009 to 23:00 UTC 30 April 2018. Snow observations were continuously available throughout this 9-year period. Data from the first 5 months (May–October in 2009) were utilized for model initialization and spinup, and thus they were not considered for the verification. Cross-validation has been conducted using the 1-year data from 00:00 UTC 1 May 2018 to 23:00 UTC 30 April 2019. Since the cross-validation results showed similar behavior, we only discuss the results for the optimization period, which has sufficient samples.

Numerical prediction models generally require spinup to reach a statistical equilibrium state where the initial conditions under a forcing are adjusted to the model's own physics/dynamics and numerics

First, the Noah LSM has been repeatedly executed using the atmospheric forcing for 9 years. This recursive simulation has been conducted from 1 May 2009 to 30 April 2018, with the number of repetitions set to 0, 300, 600, and 1000, to see whether the model was able to reach an equilibrium. Our results indicated no significant differences; thus, we concluded that repetition was not required. Second, we have performed sensitivity tests to identify the spinup period required after changes in the initial conditions by adding biases (

To optimize the snow parameters specifically for SK, we have employed the micro-GA–Noah LSM coupled system using the observations over SK. Figure

The fitness function for generations during

As a result, we have obtained the optimized six snow parameters over SK (Table

Summary of optimized snow parameters related to snow variables. Minimum (min), default, and maximum (max) are the ranges used in the optimization process. Default is the empirical value used in the Noah LSM.

We have investigated the mean bias (MB) using box plots that show the quartiles and the distribution of extreme values; by comparing the model with the observations, they show how much the bias of the CNTL is reduced in the optimization experiments. Before optimization, the CNTL showed underestimated FSC and SD and overestimated SA (

Box plots of

Scatter plots of observations (OBS) and model results (LSM) for snow variables FSC (left panels), SA (middle panels), and SD (cm; right panels) from the verification experiments – CNTL (black dots), VRF_5 (blue dots), and VRF_6 (orange dots), which are evaluated over different LCTs.

The performance has been evaluated using the improvement ratio, which indicates how much the RMSE, MB, and coefficient of determination (
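The two additional statistics can be sketched as below. The MB is computed as the mean of model minus observation, so that a negative value indicates underestimation; the coefficient of determination is assumed here to be the squared Pearson correlation, which is one common convention and is not confirmed by the text. Function names and data are hypothetical.

```python
def mean_bias(model, obs):
    """Mean of (model - observation); negative means underestimation."""
    return sum(m - o for m, o in zip(model, obs)) / len(obs)

def r_squared(model, obs):
    """Coefficient of determination, assumed as squared Pearson correlation."""
    n = len(obs)
    mean_m, mean_o = sum(model) / n, sum(obs) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, obs))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov * cov / (var_m * var_o)

# Hypothetical snow depths (cm): the model is uniformly 0.5 cm too low.
obs_sd = [1.0, 2.0, 3.0, 4.0]
model_sd = [0.5, 1.5, 2.5, 3.5]
print(mean_bias(model_sd, obs_sd), r_squared(model_sd, obs_sd))
```

The example also shows why the statistics are reported together: a constant offset leaves the correlation perfect while the MB is nonzero, so a parameter change can reduce the bias without affecting the coefficient of determination.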

The RMSE, MB, and

To supplement insufficient improvement in the FSC, we have additionally optimized

Time series of the snow variables for DBF (e.g., UL) from May 2009 to April 2018:

Finally, all six parameters related to the snow variables have been verified in VRF_6 with the same 25 stations used in CNTL. When the optimized five parameters are used except

To understand more details of the improvements due to the optimization, we analyzed the scatter plots that compare the observations and the model results in Fig.

Statistics of model performance using non-optimized parameters (CNTL) and optimized parameters (VRF_5 and VRF_6) over different LCTs represented by different stations – DBF represented by UL, MF by GM, WS by NG, CL by BR, and UB by SL. The RMSEs and

Figure

Lastly, we have investigated how the optimized snow parameters can affect the other variables in the LSM. Figure

Time series of the difference between CNTL and VRF_6 for the UL in DBF from May 2009 to April 2018:

The Noah land surface model (Noah LSM) generally underestimates snow amount during the peak winter and shows earlier snowmelt in spring, whereas it overestimates snow albedo (SA) over Eurasia, mainly due to uncertain parameterization processes

The coupled system of the micro-GA and Noah LSM automatically estimates the optimal snow-related parameters by objectively comparing observations and model solutions through the fitness function. Compared with trial-and-error procedures, it substantially reduces the computational time. The original micro-GA reduces the computational time by applying elitism and re-initialization to a small number of individuals. In this study, we have additionally developed a parallel system on the coupled system to further improve the computational efficiency; it enables us to simultaneously execute multiple individuals in one generation and multiple Noah LSM runs in one individual.

Six parameters included in the snow processes in the Noah LSM have been optimized by using a micro-GA during the period 2009–2018 in South Korea (SK). The first parameter is the distribution shape parameter that participates in the FSC calculation and shows a positive correlation with the FSC: the optimized value is expected to increase the FSC, but it is not sufficient to alleviate its underestimation problems. The second parameter is the snow water equivalent threshold value that implies 100 % snow cover and is also used in the FSC calculation depending on the land cover type: its optimized value improves the FSC in terms of RMSE and mean bias over some stations. The third parameter is the maximum SA coefficient: its optimized (decreased) value improves the RMSE by reducing the overestimation of SA. The fourth parameter is the coefficient in the maximum albedo of fresh snow, and its optimized value was similar to the default one. The other two parameters are related to the fresh snow density used for the SD calculation. In particular, the sixth parameter depends on air temperature, and its optimization brings about the largest improvement in SD: the optimized (reduced) value remarkably reduces the RMSE, which ameliorates the underestimation problem of SD. This significant improvement in SD is due to the high spatial and temporal resolutions of observations.

The best combinations of snow parameters optimized for SK can be used to improve the snowfall prediction. Our results showed improvement in all snow variables in terms of RMSE by 3.3 %, 6.2 %, and 17.0 % for FSC, SA, and SD, respectively. Furthermore, SD increased after optimization, which led to increases in both soil temperature and sensible heat flux via an insulating response; soil moisture also increased due to increased SD in previous years. This implies that the optimized snow parameters not only bring the model solutions closer to the observations but also act in a physically consistent manner. Satellite observations proved to be effective in the optimization; however, their coarse resolution and the insufficient number of stations used for optimization often restrict improvement in the snow variables, as shown in some discouraging statistics including the mean bias and the coefficient of determination (

Based on the encouraging optimization results in the offline Noah LSM, we plan to optimize the Noah LSM in a coupled land–atmosphere prediction system. The online Noah LSM can produce a spatial distribution of model variables over the land surface, which allows a two-dimensional assessment of model performance and a three-dimensional extension through various interactions between the land surface and the atmosphere. We anticipate that the optimized snow parameters can lead to positive effects on the atmospheric variables through the changes in heat fluxes as well as snow variables in the Noah LSM. As a result, we can identify how the optimal parameters perform over SK in terms of both horizontal and vertical distributions. Furthermore, the micro-GA–Noah LSM coupled system can be utilized to optimize other parameters in the Noah LSM, including the ones that indirectly affect the snow processes.

The current version of the Noah LSM provided by the National Center for Atmospheric Research (NCAR) is available from the website:

The 1-hourly forcing data (i.e., ASOS) for the Noah LSM are obtained from the Open MET Data Portal, which is available at

SuL, SKP, HJG, WYL, YHL, and CC contributed to the conceptualization. SuL, SKP, and CC designed the experiments, and SuL carried them out with the investigation. SuL, HJG, and EL developed the model code, and EL and SeL contributed to the validation. SuL prepared the manuscript with contributions from all the co-authors.

The contact author has declared that none of the authors has any competing interests.

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

We are grateful to the managing editor and three anonymous referees for their valuable comments.

This work is supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (grant no. 2018R1A6A1A08025520) and Development of Numerical Weather Prediction and Data Application Techniques (grant no. NTIS-1365003222) funded by the Korea Meteorological Administration. It is partly supported by an NRF grant funded by the Korean government (MSIT) (grant no. NRF-2021R1A2C1095535).

This paper was edited by Yuefei Zeng and reviewed by three anonymous referees.