Determining the sensitive parameters of WRF model for the prediction of tropical cyclones in the Bay of Bengal using Global Sensitivity Analysis and Machine Learning

The present study focuses on identifying the parameters from the Weather Research and Forecasting (WRF) model that strongly influence the prediction of tropical cyclones over the Bay of Bengal (BoB) region. Three global sensitivity analysis (SA) methods, namely the Morris One-at-A-Time (MOAT), Multivariate Adaptive Regression Splines (MARS), and surrogate-based Sobol' methods, are employed to identify the most sensitive parameters out of 24 tunable parameters corresponding to seven parameterization schemes of the WRF model. Ten tropical cyclones across different categories, such as cyclonic storms, severe cyclonic storms, and very severe cyclonic storms over the BoB between 2011 and 2018, are selected in this study. The sensitivity scores of the 24 parameters are evaluated for eight meteorological variables. The parameter sensitivity results are consistent across the three SA methods for all the variables, and 8 out of the 24 parameters contribute 80%-90% of the overall sensitivity scores. It is found that the Sobol' method with Gaussian process regression as a surrogate model can produce reliable sensitivity results when the available samples exceed 200. The parameters with which the model simulations have the least RMSE values when compared with the observations are considered as the optimal parameters. Comparing observations and model simulations with the default and optimal parameters shows that predictions with the optimal set of parameters yield a 16.74% improvement in the 10 m wind speed, 3.13% in surface air temperature, 0.73% in surface air pressure, and 9.18% in precipitation predictions compared to the default set of parameters.

speeds of ESCSs have shown an increasing trend, and the landfalling category was very severe (Singh et al., 2021a). Considering climate change, Singh et al. (2019) showed that the present warming climate affects the formation and severity of tropical cyclones over the BoB region. This ultimately affects the densely populated coastal cities adjacent to the BoB, such as Chennai, Visakhapatnam, Bhubaneswar, and Kolkata (Singh et al., 2019). Reddy et al. (2021) showed that projecting the present global warming conditions and climate changes into the near future leads to the intensification of tropical cyclones in the ESCS and VSCS categories. Consequently, accurate predictions of cyclone track, landfall, wind, and precipitation are critical in minimizing the damage caused by tropical cyclones, which are increasing in number and intensity.
The Weather Research and Forecasting (WRF) model (Skamarock et al., 2008) is a community-based Numerical Weather Prediction (NWP) system, which has been widely used to predict cyclones to date. The accuracy of the WRF model depends on (i) the specification of initial and lateral boundary conditions, (ii) the representation of model physics schemes, and (iii) the specification of parameters. With the availability of vast computational resources and observations, the accuracy in the specification of initial and lateral boundary conditions has improved to a great extent (Mohanty et al., 2010; Singh et al., 2021b). Many researchers have studied the sensitivity of physics schemes in simulating tropical cyclones over the BoB and reported the performance of different combinations of physics schemes by comparing the tracks and intensities of cyclones (Pattanayak et al., 2012; Osuri et al., 2012; Rambabu et al., 2013; Kanase and Salvekar, 2015; Chandrasekar and Balaji, 2016; Sandeep et al., 2018; Venkata Rao et al., 2020; Mahala et al., 2021; Singh et al., 2021b; Messmer et al., 2021).
However, systematic studies of parameter sensitivity to determine optimal parameter values are yet to be carried out for tropical cyclones over the BoB region.
Model parameters are the constants or exponents written in physics equations set up by the scheme developers, either through observations or theoretical calculations. In some cases, the default parameters are selected based on trial-and-error methods.
This implies that the parameter values may vary depending on the climatological conditions (Hong et al., 2004; Knutti et al., 2002). The WRF model consists of a bundle of physics schemes, and there exist as many as a hundred tunable parameters (Quan et al., 2016). Calibrating all the parameters to reduce the model prediction error is highly challenging and faces several obstacles. First, a vast number of model simulations are required to perform parameter optimization, and the number exceeds the order of $10^4$ as the parameter dimension increases. Second, the WRF model can simulate various meteorological variables, and each parameter may influence more than one variable. Thus, the parameter optimization needs to consider several variables at once, which increases the computational cost even further (Chinta and Balaji, 2020). Given the currently available computational resources, performing thousands of numerical simulations of long-duration events such as tropical cyclones is extremely expensive. The best remedy is to use sensitivity analysis to identify the parameters that significantly impact the model simulation, thereby reducing the parameter dimension.
Sensitivity analysis is the method of estimating the uncertainty in model outputs contributed by variations in model inputs (Saltelli, 2002). Several researchers (Green and Zhang, 2014; Quan et al., 2016; Di et al., 2017; Ji et al., 2018; Wang et al., 2020; Chinta et al., 2021) have conducted sensitivity analyses of a number of parameters in the WRF model using various methods. Green and Zhang (2014) studied the sensitivity of the intensity and structure of Hurricane Katrina to four parameters in the WRF model, using single and multi-parameter designs, and reported that two parameters significantly affect the intensity [...] the MOAT method with ten repetitions, and the results showed that 9 out of 23 parameters have a considerable impact on the model outputs. These studies show that hundreds of numerical simulations are required to perform sensitivity analysis. Thus, while selecting the sensitivity analysis methods and the number of parameters, the computational cost is a critical factor to consider. Razavi and Gupta (2015) extensively studied numerous sensitivity analysis methods and reported that each method works based on a different set of ground-level definitions, so the results from these methods do not always coincide. They proposed that while selecting a global sensitivity analysis method, one needs to consider four important characteristics, namely (i) local sensitivities, (ii) the global distribution of local sensitivities, (iii) the global distribution of model responses, and (iv) the structural organization of the response surface. They also reported that relying on only one sensitivity analysis method may not yield feasible results, since a single method may not be able to bring out all the characteristics fully. From these studies, one can infer that more than one SA method needs to be explored to improve confidence in the results obtained from sensitivity studies.
The objective of the present study is to assess the sensitivity of the WRF model parameters to various meteorological variables such as surface pressure, temperature, wind speed, precipitation, and model variables such as radiation fluxes and boundary layer height, for the simulations of tropical cyclones over the BoB region, using three different global sensitivity analysis methods.

This paper is organized as follows: a brief description of sensitivity analysis methods is presented in section 2. Section 3 presents the design of numerical experiments and sensitivity experimental setup. Section 4 shows the results of the three sensitivity analysis methods and a comparison between simulations and observations, and section 5 gives the summary and conclusions.

Sensitivity analysis is the assessment of uncertainties in model outputs that are attributed to variations in input factors (Saltelli et al., 2008). The sensitivity analysis proceeds as follows: (1) selecting the right model and corresponding best set of physics schemes, (2) identifying the adjustable input parameters and corresponding ranges, (3) choosing the sensitivity analysis methods, (4) running the design of experiments to generate the sample set of input parameters and running the model using these parameter sets, and (5) analyzing the model outputs obtained with the different parameter samples and quantifying the sensitivity of the selected parameters.
Sensitivity analysis methods are classified as derivative-based, response-surface-based, and variance-based approaches (Wang et al., 2020). In mathematical terms, the change in an output with respect to the change in an input is referred to as the sensitivity of that input, which is the principle of derivative-based SA. The Morris One-at-A-Time (MOAT) method is a derivative-based SA method (see subsection 2.1). The response-surface-based approach works on the differences between the responses of a mathematical model built with all the input factors and one built with all but a particular input factor. The Multivariate Adaptive Regression Splines (MARS) method comes under this category (see subsection 2.2). For the variance-based approaches, the sensitivity of an input variable is defined as the contribution of the variance caused by the variable in question to the total variance of the model output. In mathematical terms, if the model output variance is decomposed into the contributions of each individual factor and their combined interactions, then the highly sensitive factors will have a more significant variance contribution.

The Sobol' sensitivity analysis comes under the variance-based approach (see subsection 2.3). The MOAT method requires a uniform space-filling design, whereas the MARS and Sobol' methods require random space-filling designs. The MOAT and MARS methods give a more qualitative analysis, whereas the Sobol' method gives a quantitative analysis (Wang et al., 2020).
As noted by Razavi and Gupta (2015), sensitivity analysis methods that suit all applications are scarce in the literature. Furthermore, they observed that using several different SA methods could improve the confidence in sensitivity results by compensating for the drawbacks of the individual SA methods. Thus, in the present study, three widely used SA methods that differ in methodology are selected to study the parameters to which the numerical model is sensitive. One can then extract those parameters which turn out to be significant in all the methods under consideration, thereby bolstering the argument. These are the most influential parameters that need to be worked out to improve the forecast skill.

The MOAT method
The MOAT is a derivative-based sensitivity analysis method, also known as the elementary effects method, which evaluates parameter sensitivity according to the elementary effects of individual parameters (Morris, 1991). Consider a model with n input parameters $X = (x_1, x_2, \ldots, x_n)$ with variability in their ranges. The parameters are normalized to lie between 0 and 1, and each range is discretized into the levels $\{0, 1/(p-1), 2/(p-1), \ldots, 1\}$. Here p is a user-defined integer. An initial vector of input parameters $X^1 = (x_1^1, x_2^1, \ldots, x_n^1)$ is randomly created by taking values from the defined parameter space. Following the One-at-A-Time method, one parameter is selected and perturbed by $\Delta$, i.e., $X_m^1 = (x_1^1, x_2^1, \ldots, x_m^1 \pm \Delta, \ldots, x_n^1)$. Here $\Delta$ is a randomly selected multiple of $1/(p-1)$. The model is run using these initial and perturbed vectors, and the elementary effect of that parameter is calculated as:

$$EE_m^1 = \frac{f(X_m^1) - f(X^1)}{\Delta}$$

The subscript m implies that the m-th parameter is perturbed, and the superscript 1 indicates the first MOAT trajectory. In a single trajectory, this process is repeated for all parameters to compute the elementary effect of every parameter. The entire trajectory is replicated r times from random starting points to obtain reliable sensitivity results. At the end of the process, a total of $r \times (n+1)$ model simulations are evaluated to complete the MOAT sensitivity analysis. A modified mean of $|EE_m|$, $\mu_m$, and the standard deviation of $|EE_m|$, $\sigma_m$, are constructed as the sensitivity indices of input parameter $x_m$, as given below:

$$\mu_m = \frac{1}{r} \sum_{i=1}^{r} \left| EE_m^i \right|$$

$$\sigma_m = \sqrt{\frac{1}{r} \sum_{i=1}^{r} \left( \left| EE_m^i \right| - \mu_m \right)^2}$$

A high value of $\mu_m$ implies that the parameter $x_m$ has a more significant impact on the model output. In contrast, a high value of $\sigma_m$ indicates the nonlinearity of $x_m$ or strong interactions with other parameters.
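The MOAT procedure described above can be illustrated with a minimal sketch. It assumes a hypothetical `toy_model` in place of a WRF run, parameters already normalized to [0, 1], and the population form of the standard deviation; the function name `morris_indices` and its arguments are illustrative and not part of UQ-PyL.

```python
import random
import statistics


def morris_indices(model, n_params, p=4, r=10, seed=0):
    """Estimate MOAT indices (mu, sigma) on the unit hypercube [0, 1]^n."""
    rng = random.Random(seed)
    levels = [k / (p - 1) for k in range(p)]      # grid {0, 1/(p-1), ..., 1}
    abs_effects = [[] for _ in range(n_params)]   # |EE_m| collected per trajectory

    for _ in range(r):
        x = [rng.choice(levels) for _ in range(n_params)]   # random base point
        y = model(x)
        # Perturb every parameter once, in random order: n + 1 runs per trajectory.
        for m in rng.sample(range(n_params), n_params):
            # Moving to another grid level keeps the point inside [0, 1] and
            # makes delta a nonzero multiple of 1/(p-1).
            new_value = rng.choice([v for v in levels if v != x[m]])
            delta = new_value - x[m]
            x_next = list(x)
            x_next[m] = new_value
            y_next = model(x_next)
            abs_effects[m].append(abs((y_next - y) / delta))
            x, y = x_next, y_next                 # continue from the perturbed point

    mu = [statistics.fmean(e) for e in abs_effects]
    sigma = [statistics.pstdev(e) for e in abs_effects]
    return mu, sigma


def toy_model(x):
    # Hypothetical stand-in for a model run: strong linear effect of x[0],
    # weak effect of x[1], and no dependence on x[2].
    return 5.0 * x[0] + 0.5 * x[1]


mu, sigma = morris_indices(toy_model, n_params=3)
print(mu, sigma)
```

For this linear toy function the modified means recover the coefficients and the standard deviations are essentially zero, which is the signature of a parameter with no nonlinearity or interaction.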

The MARS method
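MARS is an extension of the recursive partitioning regression model that yields continuous models with continuous derivatives (Friedman, 1991). [...] decomposed as:

$$f(x) = a_0 + \sum_{m=1}^{M} a_m B_m(x)$$

The basis functions can be a constant ($B_0$), a hinge function ($B_m(x_i)$), or a product of two or more hinge functions ($B_m(x_i, x_j)$).
The coefficients $(a_0, a_1, \ldots, a_M)$ are determined by linear regression in every partition. The Generalized Cross Validation (GCV) score of every model during the backward pass is calculated as:

$$GCV(M) = \frac{\frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{f}_M(x_i) \right)^2}{\left( 1 - \frac{C(M)}{N} \right)^2} \tag{5}$$

where N is the number of samples, $y_i$ is the target value, $\hat{f}_M(x_i)$ is the prediction of the model with M basis functions, and $C(M)$ is the effective number of parameters of that model. [...] parameter.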
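As a rough illustration of how hinge basis functions and the GCV score fit together, the following sketch builds a one-hinge model by hand and compares its GCV with that of a constant-only model. The data, the knot location, and the choice to measure importance as the increase in GCV when a parameter's basis functions are dropped are assumptions made for illustration; the truncated text above does not confirm the exact importance measure used in the study.

```python
def hinge(x, knot, sign=1):
    """Hinge basis function: max(0, x - knot), or max(0, knot - x) for sign=-1."""
    return max(0.0, sign * (x - knot))


def gcv(y_true, y_pred, effective_params):
    """GCV score: mean squared residual inflated by a model-complexity penalty."""
    n = len(y_true)
    mse = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / n
    return mse / (1.0 - effective_params / n) ** 2


# Hypothetical one-dimensional data with a kink at x = 0.5.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [1.0, 1.0, 1.0, 1.5, 2.0]

# Hand-built model f(x) = a0 + a1 * max(0, x - 0.5) with a0 = 1 and a1 = 2.
pred_full = [1.0 + 2.0 * hinge(x, 0.5) for x in xs]
# Reduced model with the hinge on x removed: only the constant term remains.
pred_reduced = [sum(ys) / len(ys)] * len(xs)

gcv_full = gcv(ys, pred_full, effective_params=2)
gcv_reduced = gcv(ys, pred_reduced, effective_params=1)
importance = gcv_reduced - gcv_full   # assumed importance measure for x
print(gcv_full, gcv_reduced, importance)
```

A parameter whose removal barely changes the GCV would receive a score near zero under this assumed measure, which matches the qualitative reading of the normalized GCV heatmap.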

The Sobol' method
The Sobol' sensitivity analysis works on the basis of variance decomposition (Sobol, 2001). Consider a response function f(x) of a random vector x. The ANalysis Of VAriance (ANOVA) decomposition of f(x) is written as:

$$f(x) = f_0 + \sum_{i=1}^{n} f_i(x_i) + \sum_{1 \le i < j \le n} f_{ij}(x_i, x_j) + \cdots + f_{12 \ldots n}(x_1, x_2, \ldots, x_n) \tag{6}$$

The variance of f(x) can be expressed as the sum of the variance contributions of each term in equation (6), i.e.,

$$D = \sum_{i=1}^{n} D_i + \sum_{1 \le i < j \le n} D_{ij} + \cdots + D_{12 \ldots n}$$

where n is the total number of parameters, D is the total variance of the output response function, $D_i$ is the variance contribution of $x_i$, $D_{ij}$ is the variance contribution of the interaction between $x_i$ and $x_j$, and $D_{12 \ldots n}$ is the variance contribution of the interaction among all parameters. The Sobol' sensitivity indices are defined as the ratios of the individual variances to the total variance, and these can be written as:

$$S_i = \frac{D_i}{D}, \qquad S_{ij} = \frac{D_{ij}}{D}, \qquad S_{12 \ldots n} = \frac{D_{12 \ldots n}}{D}$$

These indices describe the first-order effects, the second-order interactions, and the interaction among all n parameters, respectively. From the variance decomposition and the definition of the indices, it is evident that the sum of all the indices is equal to 1. Finally, the total-order sensitivity index of the i-th parameter can be calculated as the sum of all the terms involving that parameter, i.e.,

$$S_{T_i} = S_i + \sum_{j \ne i} S_{ij} + \cdots + S_{12 \ldots n}$$

Generally, while the computation of first- and second-order effects is rather straightforward, the calculation of higher-order effects is very expensive because the number of higher-order terms is very large. To solve this problem, Homma and Saltelli (1996) introduced a new total sensitivity index as

$$S_{T_i} = 1 - \frac{D_{-i}}{D}$$

where $D_{-i}$ indicates the total variance of the response function without the consideration of the effects of the i-th parameter. A higher total-order sensitivity index implies higher importance of that parameter.
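To make the first-order and total-order indices concrete, the sketch below estimates them by Monte Carlo for a cheap toy function. It is only an illustration under stated assumptions: plain pseudo-random sampling is used instead of a Sobol' sequence, the hypothetical `toy_model` stands in for the trained surrogate, and the estimators are the commonly used pick-freeze forms (Saltelli et al., 2010, for the first-order index; Jansen, 1999, for the total-order index), which are not necessarily the ones implemented in the study.

```python
import random
import statistics


def sobol_indices(model, n_params, n_samples=20000, seed=0):
    """Pick-freeze Monte Carlo estimates of first- and total-order Sobol' indices."""
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(n_params)] for _ in range(n_samples)]
    b = [[rng.random() for _ in range(n_params)] for _ in range(n_samples)]
    f_a = [model(row) for row in a]
    f_b = [model(row) for row in b]
    f0 = statistics.fmean(f_a + f_b)                     # mean response
    total_var = statistics.pvariance(f_a + f_b, mu=f0)   # total variance D

    first, total = [], []
    for i in range(n_params):
        # AB_i: rows of A with only column i replaced by the value from B.
        f_ab = [model(ra[:i] + [rb[i]] + ra[i + 1:]) for ra, rb in zip(a, b)]
        # First-order variance D_i; centering f_b on f0 only reduces noise.
        d_i = statistics.fmean(
            (fb - f0) * (fab - fa) for fa, fb, fab in zip(f_a, f_b, f_ab)
        )
        # Total-order variance D - D_{-i}: what changes when only x_i is resampled.
        d_total_i = 0.5 * statistics.fmean(
            (fa - fab) ** 2 for fa, fab in zip(f_a, f_ab)
        )
        first.append(d_i / total_var)
        total.append(d_total_i / total_var)
    return first, total


def toy_model(x):
    # Hypothetical additive response on [0, 1]^3; x[2] has no influence.
    # Analytic first-order indices are 0.2, 0.8, and 0.
    return x[0] + 2.0 * x[1]


first, total = sobol_indices(toy_model, n_params=3)
print(first, total)
```

Because the toy function has no interaction terms, the first-order and total-order estimates agree up to sampling noise; a gap between the two would indicate interaction effects.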
3 Design of numerical experiments

WRF model configuration and adjustable parameters
In the present study, the Advanced Research WRF (WRF-ARW) model version 3.9 (Skamarock et al., 2008) is [...] rapid radiative transfer model (Mlawer et al., 1997) for longwave radiation, Dudhia shortwave scheme (Dudhia, 1989) for shortwave radiation, revised MM5 scheme (Jiménez et al., 2012) for surface layer physics, Unified Noah land surface model (Tewari et al., 2004) for land surface physics, Yonsei University Scheme (YSU) for planetary boundary layer physics, Kain-Fritsch (Kain, 2004) for cumulus physics, and WRF Single-Moment 6-class (WSM6) scheme (Hong and Lim, 2006) for microphysics. A total of 24 tunable parameters are selected based on guidance from the literature (Di et al., 2015; Quan et al., 2016). The list of parameters and corresponding ranges is presented in Table 1. Though the selected parameters may not cover all the existing parameters, the availability of computational resources limits the experimental design. The experimental design is therefore based on the most critical parameters, which are most likely to significantly influence the model output.

Simulation events, WRF model output variables, and observational data
In the present study, ten tropical cyclones that originated [...] Figure 2 illustrates the IMD observed tracks of the selected cyclones, with a clear indication of their category. Table 2 presents the category, landfall time, and simulation duration of the cyclones selected in the present study. Each cyclone is simulated for 108 hours, including 12 hours of spin-up time, 72 hours of simulation before landfall, and 24 hours of simulation after landfall. The sensitivity analysis is conducted for different meteorological variables: wind speed 10 meters above ground (WS10), temperature 2 meters above ground (SAT), surface pressure (SAP), total precipitation (RAIN), planetary boundary layer height (PBLH), outgoing longwave radiation flux (OLR), downward shortwave radiation flux (DSWRF), and downward longwave radiation flux (DLWRF). The WRF simulations of these variables are stored at 6-hour intervals.

The simulations are validated against the Indian Monsoon Data Assimilation and Analysis (IMDAA) data (Ashrit et al., 2020) and the Integrated Multi-satellitE Retrievals for GPM (IMERG) dataset (Huffman and Savtchenko, 2019). The IMDAA data is available at 0.12° × 0.12° resolution with a six-hour latency, and the IMERG data is available at 0.1° × 0.1° resolution with a thirty-minute latency. Since the model resolution is close to the validation data resolution, regridding results in very little or no loss of data. The accumulated precipitation data for validation is taken from the IMERG data, while the remaining variables are taken from the IMDAA data. Apart from this data, the maximum sustained wind speed (MSW) observations at the storm center for every cyclone, provided by the IMD at 3-hour intervals, are also used for validation.

Experimental setup
The sensitivity analysis requires a large set of parameter values to be assigned to the WRF model, following which simulations are performed. Uncertainty Quantification Python Laboratory (UQ-PyL) is an uncertainty quantification platform, designed by [...] In contrast, the MARS and Sobol' methods require a different set of samples compared to the MOAT method. Based on previous studies (Ji et al., 2018; Wang et al., 2020), the quasi-Monte Carlo (QMC) Sobol' sequence design (Sobol', 1967) is employed, using the UQ-PyL package, to create 250 parameter samples for each event. Similar to the MOAT method, these parameter samples are assigned to the WRF model, and another 2500 simulations are performed for the cyclones under consideration.

The output variables are extracted and stored at 6-hour intervals. The evaluation of sensitivities using the MOAT method requires only the simulations from the WRF model. In contrast, the MARS and Sobol' methods require a skill score metric between the simulations and observations. In the present study, the RMSE between simulation and observation is employed as the skill score metric, which is formulated as

$$\mathrm{RMSE} = \sqrt{\frac{1}{I J K L} \sum_{l=1}^{L} \sum_{k=1}^{K} \sum_{j=1}^{J} \sum_{i=1}^{I} \left( \mathrm{sim}_{ijkl} - \mathrm{obs}_{ijkl} \right)^2} \tag{12}$$

where I and J are the numbers of grid points in the latitudinal and longitudinal directions, K is the number of output times, L is the number of cyclones, sim is the simulated value, and obs is the observed value. Since the same parameter set is employed for all the cyclones, equation (12) is employed to get one RMSE value corresponding to one parameter sample. The parameter set and RMSE are given as inputs and targets to the MARS solver, and the MARS sensitivity indices are computed following the GCV equation (5).
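The pooling of all grid points, output times, and cyclones into a single RMSE value per parameter sample can be sketched as follows. The nested-list layout `[cyclone][time][lat][lon]` and the toy fields are assumptions made for illustration; real model output would first be regridded to the observation grid.

```python
import math


def rmse_4d(sim, obs):
    """RMSE over nested lists indexed as [cyclone][time][lat][lon]."""
    sq_sum, count = 0.0, 0
    for sim_c, obs_c in zip(sim, obs):                  # L cyclones
        for sim_t, obs_t in zip(sim_c, obs_c):          # K output times
            for sim_row, obs_row in zip(sim_t, obs_t):  # I latitude rows
                for s, o in zip(sim_row, obs_row):      # J longitude columns
                    sq_sum += (s - o) ** 2
                    count += 1
    return math.sqrt(sq_sum / count)


# Hypothetical toy fields: 2 cyclones, 2 output times, 2 x 2 grid.
obs = [[[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)] for _ in range(2)]
sim = [[[[1.0, -1.0], [1.0, -1.0]] for _ in range(2)] for _ in range(2)]
score = rmse_4d(sim, obs)
print(score)
```

Each parameter sample therefore maps to one scalar score per variable, which is the target that the MARS solver and the surrogate models are trained on.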

The Sobol' method, as a quantitative sensitivity analysis method, gives more accurate and robust results, albeit at a much higher computational cost. The Sobol' method may require $[10^3 \sim 10^4 \times (n + 1)]$ model runs to get accurate results, where n is the number of parameters. This is exceedingly challenging even if supercomputing facilities are available. To circumvent this difficulty, one can use surrogate models instead of running the WRF model for more simulations. Surrogate models are powerful machine learning tools that can capture the empirical relations between the inputs (i.e., the parameter set) and the targets (i.e., the RMSE matrix). In the present study, five different surrogate models, namely Gaussian Process Regression (GPR) (Schulz et al., 2018), Support Vector Machines (SVM) (Radhika and Shashi, 2009), Random Forest (RF) (Segal, 2005), Regression Tree (RT) (Razi and Athappilly, 2005), and K-Nearest Neighbors (KNN) (Rajagopalan and Lall, 1999), are selected for evaluation. The surrogate models are provided with the parameter set as inputs and the RMSE as the target, and the models are trained on this data. The goodness of fit is considered as the accuracy metric, which is calculated as

$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{N} \left( y_i - \bar{y} \right)^2}$$

where N is the total number of samples, $y_i$ is the true value, $\hat{y}_i$ is the predicted value, and $\bar{y}$ is the mean of the true values.
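The goodness-of-fit metric can be written in a few lines; the sketch below uses made-up values only to show the three regimes of the score, including the negative values that arise when a surrogate predicts worse than the mean of the targets.

```python
def r2_score(y_true, y_pred):
    """Goodness of fit: 1 - (residual sum of squares) / (total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_true) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical targets and three illustrative sets of predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
perfect = r2_score(y_true, y_true)               # exact predictions
mean_only = r2_score(y_true, [2.5] * 4)          # always predicts the mean
worse = r2_score(y_true, [4.0, 3.0, 2.0, 1.0])   # anti-correlated predictions
print(perfect, mean_only, worse)
```

A score of 1 means the surrogate reproduces the targets exactly, 0 means it does no better than the mean, and a negative score means it does worse than the mean.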
The accuracy of the surrogate models is examined by applying ten-fold cross-validation, which is implemented as follows.
The entire data is divided into ten equally sized subsets (folds). The data in the k-th fold is kept as the test set, whereas the data from the remaining folds is taken as the training set. [...] Figure 3, with a darker shade indicating the highest sensitivity and a lighter shade indicating the least sensitivity. Figure 3 shows that parameter P14 has the highest sensitivity to most of the variables, followed by parameter P6. The parameters P3, P4, P10, P15, P17, P21, and P22 also show high sensitivity to at least one of the variables. In contrast, the parameters P1, P8, P11, P13, P16, P18, and P20 appear insensitive to all of the variables, and the remaining parameters have a minimal contribution. A close observation of Figure 3 reveals that the variables OLR and DSWRF have the highest sensitivity to just one parameter each, whereas the remaining variables exhibit the highest sensitivity to at least two parameters.
The uncertainties that lie in the sensitive parameters are examined by observing the distribution of the parameters. Since the available data points are limited to only ten samples, a resampling method can be employed to procure more samples without further numerical model runs. Bootstrap resampling (Efron and Tibshirani, 1993) is an efficient way to generate the same number of samples as the original dataset, with replacement allowed. In the present study, the bootstrap method is applied 100 times, each time generating ten samples with replacement. In this way, a new dataset of size (100 × 10) is created for one parameter corresponding to one variable. The distribution of each parameter is illustrated as a boxplot in [...] shows that the most sensitive parameters exhibit either a higher variance or a higher median value (Wang et al., 2020). For the variable OLR, Figure 4(f) shows that the parameter P10 has the highest median value with large variance, whereas the parameters P6 and P12 have the least median value with large variances. Figure 4(g) shows that parameter P14 has the highest sensitivity to the variable DSWRF with very little variance, whereas the sensitivity of the remaining parameters is comparably minimal. Figures 4(c,e,h) show that the variables SAP, PBLH, and DLWRF have more than three sensitive parameters. The results show that except for DSWRF and OLR, all the variables have at least two highly sensitive parameters.
The boxplot results reinforce the heatmap results.
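The bootstrap step can be sketched as below. The ten input values are hypothetical sensitivity scores for one parameter and one variable; the helper name `bootstrap` and the pooled summary statistics are illustrative choices, not the study's actual post-processing.

```python
import random
import statistics


def bootstrap(values, n_replicates=100, seed=0):
    """Draw n_replicates resamples of len(values) items each, with replacement."""
    rng = random.Random(seed)
    n = len(values)
    return [[rng.choice(values) for _ in range(n)] for _ in range(n_replicates)]


# Hypothetical sensitivity scores of one parameter (ten original samples).
scores = [0.42, 0.38, 0.51, 0.47, 0.40, 0.55, 0.36, 0.44, 0.49, 0.41]

replicates = bootstrap(scores)                     # 100 x 10 resampled dataset
pooled = [v for rep in replicates for v in rep]    # values summarized by a boxplot
q1, median, q3 = statistics.quantiles(pooled, n=4)
print(len(replicates), len(pooled), q1, median, q3)
```

The quartiles of the pooled values correspond to the box edges and median line of one box in a boxplot, so a wide box signals a large spread in the sensitivity of that parameter.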

MARS sensitivity analysis
The GCV scores of the 24 parameters corresponding to the selected variables are calculated based on the MARS method. Figure 5 illustrates the heatmap of normalized GCV scores, with 1 indicating the highest sensitivity and 0 indicating the least sensitivity.
The intensity signatures of Figure 5 are very consistent with those of Figure 3. The results show that most of the variables are sensitive to P14, followed by parameter P6. In addition, the parameters P3, P4, P10, P15, P17, and P22 are seen to affect at least one of the dependent variables. The results also reveal that P1, P2, P8, P11, P13, P16, P18, P19, P20, P21, and P24 do not significantly influence any of the variables. A close observation of Figure 5 [...] before analyzing the sensitivity of the parameters. Figure 7 shows the distribution of $R^2$ scores of different surrogate models for the selected meteorological variables, obtained by applying bootstrap resampling. In Figure 7, each subplot corresponds to one meteorological variable, and the horizontal and vertical axes indicate the surrogate models and the goodness of fit ($R^2$) value, respectively. For every meteorological variable, the GPR model has the highest $R^2$ value, which is close to 1, and the variance is also minimal. This implies that the GPR model can accurately capture the empirical relations between inputs and outputs. In contrast, the remaining surrogate models show high variance for at least one of the variables. Figure 7(c) shows that the regression tree has the highest variance with the least $R^2$ value, and the minimum whisker lies below zero, which indicates the inability of the RT to capture the correlations. In every subfigure, the $R^2$ value of KNN is close to 0.5, which implies that the model can explain only 50% of the total variance around its mean. The surrogate models SVM and RF have very close accuracy except for the variable OLR, for which the SVM shows high variance with an $R^2$ value close to 0.5. These results indicate that all models except the GPR model are inconsistent in their accuracy. Figure 8 shows a scatter plot of the WRF model output against the GPR fit output for the eight variables under consideration.
In Figure 8, each subplot corresponds to one meteorological variable, and the horizontal and vertical axes indicate the output of the WRF model and the GPR fit, respectively. From the $R^2$ values shown in the plots, it is clear that the GPR model can explain 95% of the variability of the output data around its mean, except for the variable surface pressure, for which the $R^2$ value is 0.88 (as shown in Figure 8(c)). In view of the above, the GPR is chosen as the best surrogate model for the sensitivity studies with Sobol'.

Results of surrogate-based Sobol' method
The GPR model, which is built upon 250 samples, is used to predict the outputs of 50000 samples generated by the Sobol' sequence, and these outputs are used to estimate the Sobol' sensitivity indices corresponding to each variable. Figure 10 [...] Figures 11(a-h) show a detailed illustration of the sensitivity indices of each meteorological variable. In each subfigure, the blue bar shows the first-order (primary) effects, the red bar shows the higher-order (interaction) effects, and the sum of these two shows the total-order effects. The advantage of the Sobol' method is that it can quantify interaction effects. Figures 11(c,e,f) show that SAP, PBLH, and OLR have considerable higher-order effects, which indicates that interactions are predominant in these variables. Figures 11(b,g) show that the variables SAT and DSWRF have only one sensitive parameter each, while Figures 11(c,e,h) show that the variables SAP, PBLH, and DLWRF are influenced by more than three parameters. These results reinforce the inferences drawn from the heatmap.
The results from the Sobol' method indicate that only a few parameters contribute much to the sensitivity of the output variables. [...] P22, are responsible for more than 80% of the total sensitivity of every variable. Unlike the MOAT or MARS methods, the deterministic nature of the Sobol' method gives more accurate results, as they are free of any uncertainties.

Physical interpretation of parameter sensitivity
The results obtained by the three sensitivity analysis methods suggest that only a few parameters strongly influence the meteorological variables. [...] Figure 13 in descending order, which indicates the ranks of the parameters. Figure 13 shows that eight parameters, P3, P4, P6, P10, P14, P15, P17, and P22, strongly influence the variables combined. Additionally, there is near-exact agreement among the three sensitivity methods, with little variation in the ranks.

The results show that the parameter P14 is the most sensitive parameter among all. This represents the scattering tuning parameter used in the shortwave radiation scheme proposed by Dudhia (1989). This parameter is used in the downward component of the solar flux equation (Montornès et al., 2015). It is the main constant associated with the scattering attenuation and directly affects the solar radiation reaching the ground in the form of DSWRF. When a cloud is present in the atmosphere, it attenuates the downward solar radiation; simultaneously, it contributes to the downward longwave radiation by means of multiple scattering. Since the Dudhia (1989) scheme does not have a representation of the multi-scattering process, parameter P14 attenuates the downward radiation without any contribution to the heating rate (Montornès et al., 2015). This leads to changes in the DLWRF. The land surface model transforms the solar radiation into other kinds of energy, such as latent heat (LH) and sensible heat (SH) near the surface. This implies that changes in the downward radiation will also affect the LH and SH. The planetary boundary layer is governed by the LH and SH. Therefore, changes in the DSWRF will ultimately affect the PBLH (Montornès et al., 2015). A higher value of P14 leads to a decrease in downward solar radiation and surface-level heating, which ultimately reduces the surface air temperature (SAT). Quan et al. (2016) show that changes in SAT lead to variations in relative humidity. Due to the correlation between SAT, humidity, and SAP, variations in SAT and humidity lead to variations in the SAP.
The parameter P6 is the multiplier for entrainment mass flux rate in the Kain-Fritsch cumulus physics scheme. This parameter determines the amount of ambient air entraining into the updraft flux, which further dilutes the updraft parcel. A high value of P6 indicates a high amount of ambient air entrainment into the air parcel, and the atmosphere becomes more stable. This suppresses the formation of deep convection, which leads to a decrease in convective precipitation. Since the shallow convection removes a large amount of the instabilities, this leads to more stable stratiform clouds, ultimately resulting in high precipitation.
The occurrence of precipitation decreases the SAT and increases the relative humidity, leading to a change in the SAP. This also affects PBLH. Evaporation is the main constituent of cloud formation. Since the parameter P17 affects evaporation, the DLWRF, which depends on clouds, will also be affected by P17.
The parameter P10 is the scaling factor applied to the icefall velocity used in the microphysics scheme. This parameter controls the ice terminal fall velocity, which governs the sedimentation of ice crystals. The cloud constituents such as cloud water and cloud ice are affected by the sedimentation of ice crystals. Since cloud water and cloud ice reflect radiation into outer space, any change in the parameter P10 causes variations in the OLR (Quan et al., 2016; Di et al., 2017; Ji et al., 2018). The parameter P4 is the von Kármán constant used in the surface layer scheme (Jiménez et al., 2012) and the PBL scheme. This parameter relates the flow speed profile in a wall-normal shear flow to the stress at the boundary. It directly influences the bulk transfer coefficients of momentum, heat, and moisture, and the diffusivity coefficient of momentum. This implies that changes in P4 will bring implicit variations in surface pressure and moisture, which lead to changes in the precipitation (Wang et al., 2020).
The parameter P22 is the profile shape exponent for calculating the momentum diffusivity coefficient used in the PBL scheme. This parameter is directly related to P4, since both enter the diffusivity coefficient of the momentum equation. It regulates the mixing intensity of turbulence in the boundary layer and thereby affects the planetary boundary layer height (PBLH) (Quan et al., 2016; Di et al., 2017; Wang et al., 2020). The parameter P15 is the diffusivity angle for cloud optical depth computation used in the longwave radiation scheme, proposed by Mlawer et al. (1997). The longwave radiation returning to the Earth's surface is attenuated by the diffusivity factor (the inverse of the cosine of the diffusivity angle) multiplied by the optical depth. Thus, changes in P15 directly cause variations in DLWRF (Quan et al., 2016; Di et al., 2017; Iacono et al., 2000; Viúdez-Mora et al., 2015). The parameter P3 is the scaling factor for surface roughness used in the surface layer scheme (Jiménez et al., 2012). A smooth surface lets the flow remain laminar, whereas a rough surface exerts drag on the flow, thereby affecting the near-surface wind speed (Nelli et al., 2020). Parameter P3 is therefore directly related to the wind speed, and any change in P3 also affects the surface wind speed (Wang et al., 2020).
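The role of the diffusivity angle described above can be made concrete with a short sketch. It assumes the standard diffusivity approximation, in which the flux transmittance of a layer is exp(-D τ) with D = 1/cos(θ); the exponential form, the function names, and the sample angles and optical depth are assumptions for illustration and are not taken from the WRF longwave scheme code.

```python
import math


def diffusivity_factor(angle_deg):
    """Diffusivity factor D = 1 / cos(theta) for a diffusivity angle given in degrees."""
    return 1.0 / math.cos(math.radians(angle_deg))


def flux_transmittance(optical_depth, angle_deg):
    """Flux transmittance exp(-D * tau) under the diffusivity approximation (assumed form)."""
    return math.exp(-diffusivity_factor(angle_deg) * optical_depth)


if __name__ == "__main__":
    tau = 0.5                                  # hypothetical cloud optical depth
    for angle in (50.0, 52.96, 56.0):          # hypothetical perturbations of the angle
        d = diffusivity_factor(angle)
        t = flux_transmittance(tau, angle)
        print(f"angle={angle:6.2f} deg  D={d:.3f}  transmittance={t:.3f}")
```

A larger angle gives a larger diffusivity factor and hence stronger attenuation for the same optical depth, which is the pathway by which P15 modifies the downward longwave flux.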

A comparison between simulations with the default and optimal parameters
The objective of the present work is to identify the most important parameters, that is, those which most strongly influence the model output variables. In the present study, the parameters for which the model simulations show the lowest RMSE with respect to the observations are selected as the optimal parameters. However, these parameters can be further optimized by the procedure followed by Chinta and Balaji (2020) to improve the model predictions of the output variables that are most affected by the parameters.
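The selection rule described above, picking the sampled parameter set whose simulation has the lowest RMSE against observations, can be sketched as follows. This is a simplified single-variable illustration; how the errors of several variables are combined is not specified here, and the function names and toy arrays are hypothetical.

```python
import numpy as np


def rmse(simulated, observed):
    """Root-mean-square error between a simulated and an observed field."""
    simulated = np.asarray(simulated, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((simulated - observed) ** 2)))


def select_optimal_sample(simulations, observed):
    """Return the index of the sample with the lowest RMSE and all RMSE scores."""
    scores = np.array([rmse(sim, observed) for sim in simulations])
    return int(np.argmin(scores)), scores


if __name__ == "__main__":
    observed = np.array([1.0, 2.0, 3.0])           # toy observations
    simulations = np.array([[2.0, 3.0, 4.0],       # toy output, one row per parameter sample
                            [1.0, 2.0, 3.1],
                            [0.0, 0.0, 0.0]])
    best, scores = select_optimal_sample(simulations, observed)
    print("RMSE per sample:", scores)
    print("index of optimal sample:", best)
```

Because the choice is restricted to the sampled parameter sets, the selected set is the best among the samples rather than a true optimum, which is why a dedicated optimization step can improve on it.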
To illustrate whether parameter optimization can improve model prediction, WRF simulations with the default and optimal parameters were compared for precipitation, surface temperature, surface pressure, and wind speed. The RMSE values of WS10, SAT, SAP, and precipitation for the default and optimal simulations are shown in Table 3. The optimal simulations have a smaller RMSE for surface wind (2.11 m/s) than the default simulations (2.53 m/s). The percentage improvement is calculated as the reduction in RMSE relative to the default simulation, expressed as a percentage. An improvement of 16.74% is achieved by using the optimal parameters over the default parameters for the simulations of surface wind speed. Similarly, the optimal parameters yield improvements of 3.13% for surface temperature, 0.73% for surface pressure, and 9.18% for precipitation over the default parameters.
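The improvement figures can be checked approximately from the RMSE values, assuming the improvement is defined as the relative reduction in RMSE with respect to the default simulation. The function name is hypothetical, and the definition is an assumption consistent with the numbers quoted in the text.

```python
def percentage_improvement(rmse_default, rmse_optimal):
    """Relative RMSE reduction of the optimal run over the default run, in percent."""
    if rmse_default <= 0:
        raise ValueError("rmse_default must be positive")
    return 100.0 * (rmse_default - rmse_optimal) / rmse_default


if __name__ == "__main__":
    # Rounded surface wind RMSE values quoted in the text (m/s).
    improvement = percentage_improvement(2.53, 2.11)
    print(f"surface wind improvement: {improvement:.2f}%")
```

With the rounded RMSE values this gives about 16.6%, close to the 16.74% reported in the abstract; the small difference is plausibly due to rounding of the quoted RMSE values.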
Taylor statistics (Taylor, 2001) are used to evaluate the accuracy of the model forecasts of WS10, SAT, SAP, and precipitation simulated with the default and optimal parameters.

Figures 15(c,d) show that the default and optimal simulations have similar spatial distributions of temperature bias over the entire domain, with only minimal differences over the northwest, southwest, and Bangladesh regions. Over these regions, the optimal simulations show slightly less bias than the default simulations. For surface pressure, Figures 15(e,f) show that the default and optimal simulations have similar spatial structures of bias over the entire domain, with no apparent differences. For precipitation, Figures 15(g,h) show that the default simulations have larger spatial structures with higher bias than the optimal simulations over the north BoB, Bangladesh coast, south-east BoB, and central BoB regions. These results indicate that optimizing the sensitive parameters yields the largest improvements for wind speed and precipitation.
The WRF model runs with optimal parameters improved the simulations of meteorological variables at the surface level. However, the optimal parameters also affect the upper-atmospheric variables, and their performance at these levels should be satisfactory if they are to be used in the future. For this purpose, the wind fields at 500 hPa of VSCS Thane and cyclone Phailin, simulated with the default and optimal parameters, are compared with observations, as shown in Figures 16 and 17. For cyclone Thane, at the end of day 1, Figures 16(a1,b1,c1) show that the default and optimal parameters simulated similar cyclonic circulations and traces of anticyclonic circulations that match the observations well. At the end of day 2, Figures 16(a2,b2,c2) show that the optimal parameters simulated a well-structured cyclonic circulation, whereas the default parameters simulated irregularities around the cyclonic circulation that were not observed. Both parameter sets simulated an anticyclonic circulation that is spatially displaced from the observed one. At the end of day 3, Figures 16(a3,b3,c3) show that the default parameters simulated an anticyclonic circulation but failed to simulate a cyclonic circulation. In contrast, the optimal parameters simulated an anticyclonic circulation and a well-structured cyclonic circulation, the latter with a spatial deviation. For cyclone Phailin, at the end of day 1, Figures 17(a1,b1,c1) show that the default and optimal parameters overestimated the intensity of the cyclonic circulation; however, the optimal simulations show relatively lower intensity than the default simulations. At the end of day 2, Figures 17(a2,b2,c2) show that the default and optimal simulations have similarly intense cyclonic circulations at the observed location, both overestimating the observed intensity. At the end of day 3, Figures 17(a3,b3,c3) show that the intensity of the optimal simulations is closer to the observations than that of the default simulations. These results show that the optimal parameters simulated the velocity field at 500 hPa with lower intensity and closer to the observations than the default parameters.
The Maximum Sustained Wind speed (MSW) is one of the primary measures of the intensity of a cyclone, and predicting an accurate MSW is of prime importance for early warnings. In addition to the spatial distributions of variables, MSW is also compared for the default and optimal simulations, as shown in Figure 18. From the WRF simulations using QMC samples, MSW values of the ten cyclones are extracted at 6-hour intervals from the 18th hour to the 84th hour, and the data thus obtained are averaged over all the cyclones. A boxplot generated from these data shows that uncertainties in the parameters significantly affect the MSW simulations. The simulated MSW values with the default and optimal parameters are plotted along with the observed IMD MSW values, which shows that the optimal simulations match the observations better than the default simulations. These results indicate that optimizing the parameters improves the model predictions.
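The aggregation behind such a boxplot can be sketched as follows: MSW is arranged by parameter sample, cyclone, and forecast hour, averaged over cyclones, and then summarized per forecast hour across samples. The array layout, the number of samples, and the synthetic values are all hypothetical; only the 6-hour spacing from hour 18 to hour 84 and the ten cyclones come from the text.

```python
import numpy as np

# Forecast hours 18, 24, ..., 84 at 6-hour intervals, as described in the text.
HOURS = np.arange(18, 85, 6)


def cyclone_mean_msw(msw):
    """Average MSW over cyclones; msw has shape (n_samples, n_cyclones, n_hours)."""
    return np.asarray(msw, dtype=float).mean(axis=1)


def box_statistics(mean_msw):
    """Quartiles across parameter samples for each forecast hour; returns shape (3, n_hours)."""
    return np.percentile(mean_msw, [25, 50, 75], axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_samples, n_cyclones = 250, 10      # sample count is a hypothetical placeholder
    # Synthetic MSW values (m/s) standing in for values extracted from WRF output.
    msw = rng.normal(loc=30.0, scale=5.0, size=(n_samples, n_cyclones, HOURS.size))
    q25, q50, q75 = box_statistics(cyclone_mean_msw(msw))
    for hour, lo, med, hi in zip(HOURS, q25, q50, q75):
        print(f"hour {hour:2d}: Q1={lo:.2f}  median={med:.2f}  Q3={hi:.2f}")
```

The spread between the quartiles at each forecast hour is what expresses the effect of parameter uncertainty on the simulated intensity.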
The present study evaluated the sensitivity of eight meteorological variables, namely surface wind speed, surface air temperature, surface air pressure, precipitation, planetary boundary layer height, downward shortwave radiation flux, downward longwave radiation flux, and outgoing longwave radiation flux, to 24 tunable parameters for the simulations of ten tropical cyclones over the BoB region. The tunable parameters were selected from seven physics schemes of the WRF model. Ten trop- P17 - multiplier for the saturated soil water content, P10 - scaling factor applied to icefall velocity, P4 - Von Kármán constant, P22 - profile shape exponent for calculating the momentum diffusivity coefficient, P3 - scaling related to surface roughness, and P15 - diffusivity angle for cloud optical depth) were found to contribute 80% to 90% of the total sensitivity metric. A comparison of the WRF simulations with the default and optimal parameters against observations showed a 19.65% improvement in the surface wind prediction, a 6.5% improvement in the surface temperature prediction, and a 13.3% improvement in the precipitation prediction when the optimal set of parameters is used instead of the default set.
These results indicate that the optimization of model parameters using advanced optimization techniques can further improve the prediction of tropical cyclones in the Bay of Bengal.